Current Data Inventory
Everything this model consumes lives in Supabase project sstmupvotyzrjygqoany (ap-northeast-1) and is served through the MCP gateway (Railway, 84 tools, 19 datasets).
Core tables
| Table | Rows | Coverage | Refresh |
|---|---|---|---|
| ohlcv_1d | ~23 908 | 81 symbols | Daily 06:00 UTC |
| ohlcv_1h | — | 40 symbols | Hourly (`0 0-21 * * *`) |
| ohlcv_15m | — | 40 symbols | Hourly |
| ohlcv_5m | — | 40 symbols | Hourly |
| tickers | 116 | 15-min quotes | `*/15 * * * *` |
| economic_calendar | 18 | High-impact events | Daily |
| economic_indicators | ~25 768 | 85 FRED series | Daily |
| central_bank_rates | — | G10 rates | Daily |
| cot_reports | ~1 420 | Commitment of Traders | Weekly |
| options_chains | ~2 492 | Select equities + crypto | Daily |
| equity_fundamentals | — | S&P 500 subset | Quarterly |
| earnings_estimates | ~360 | S&P 500 subset | Quarterly |
| news_articles | ~13 855 | 6 RSS feeds · VADER | `*/30 * * * *` |
| ml_predictions | — | Kronos outputs | Every cron tick |
| signal_evaluations | — | Prediction labels | Hourly / daily |
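
The cron expressions in the Refresh column can be read off mechanically. The sketch below is illustrative only: `expand_field` and `runs_per_day` are hypothetical helpers, not part of the pipeline. It handles just the field forms that appear in the table and assumes the day, month and weekday fields are all `*`.

```python
def expand_field(field: str, lo: int, hi: int) -> list[int]:
    """Expand one cron field into the sorted values it matches.

    Handles only the forms used in the table: '*', '*/n', 'a-b' and a
    single integer. Comma lists and names (e.g. 'MON') are not supported.
    """
    if field == "*":
        return list(range(lo, hi + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(lo, hi + 1, step))
    if "-" in field:
        start, end = (int(part) for part in field.split("-"))
        return list(range(start, end + 1))
    return [int(field)]


def runs_per_day(expr: str) -> int:
    """Count daily firings from the minute and hour fields only.

    Assumption: the remaining three fields are '*', as in the table.
    """
    minute, hour, *_ = expr.split()
    return len(expand_field(minute, 0, 59)) * len(expand_field(hour, 0, 23))


if __name__ == "__main__":
    for expr in ("0 0-21 * * *", "*/15 * * * *", "*/30 * * * *"):
        print(expr, "->", runs_per_day(expr), "runs per day")
```

Read this way, `0 0-21 * * *` fires at minute 0 of hours 0 through 21, so the hourly job skips hours 22 and 23 in whatever timezone the scheduler uses.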
Gateway datasets
Per project memory, the gateway has 19 registered datasets. The model is directly powered by:
- `ohlcv_daily`, `ohlcv_intraday`: candle source for Kronos input
- `economic_calendar`: event flags
- `economic_indicators`: macro context
- `predictions`: cached Kronos outputs served to UI
What the model consumes today
```mermaid
flowchart LR
  subgraph Existing
    OHLCV[(ohlcv_1h<br/>ohlcv_1d)] --> BATCH[kronos-batch-predict.py]
    CAL[(economic_calendar<br/>10-ch placeholder)] --> BATCH
  end
  BATCH --> ML[(ml_predictions)]
  ML --> GW[Gateway<br/>/showcase/ml-prediction]
  GW --> UI[prediction.datfxlabs.com]
```

What the model will need (Phase 1 · 20 channels)
See Data Gaps and Backfill Plan, both populated from the data-gap researcher report.
The short version:
- Have: `economic_calendar` with `actual`/`forecast` fields for primary events.
- Partial: surprise z-scores (need to compute rolling-20 std per event type at query time or materialise a view).
- Missing: FOMC hawkish score (requires NLP over statement text not currently stored), cross-asset leader OHLCV for DXY + VIX at 1 h resolution (verify coverage), FOMC statement text archive.
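
The "Partial" item could be computed at query time roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: it reads "rolling-20" as the 20 most recent prior releases of the same event type, defines the surprise as `actual - forecast`, and scales it by the sample standard deviation of those prior surprises without subtracting their mean. The `event_type` key and the function name are hypothetical. Only `actual` and `forecast` come from the inventory above.

```python
from collections import defaultdict, deque
from statistics import stdev

WINDOW = 20  # assumption: "rolling-20" means 20 prior releases per event type


def surprise_zscores(events, window=WINDOW):
    """Yield (event, z) pairs for events sorted oldest first.

    Each event is a dict with 'event_type', 'actual' and 'forecast'.
    z is None until at least two prior surprises exist for that event
    type, or when their standard deviation is zero.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    for event in events:
        surprise = event["actual"] - event["forecast"]
        past = history[event["event_type"]]
        z = None
        if len(past) >= 2:
            sd = stdev(past)  # sample std of prior surprises only
            if sd > 0:
                z = surprise / sd
        # Record the current surprise after scoring it, so a release
        # never contributes to its own scaling factor.
        past.append(surprise)
        yield event, z


if __name__ == "__main__":
    sample = [
        {"event_type": "CPI", "actual": 3.0, "forecast": 2.0},
        {"event_type": "CPI", "actual": 2.0, "forecast": 3.0},
        {"event_type": "CPI", "actual": 4.0, "forecast": 2.0},
    ]
    for event, z in surprise_zscores(sample):
        print(event["event_type"], z)
```

Excluding the current release from its own window avoids look-ahead in the feature. A materialised view would need the same ordering guarantee.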
Data-quality guarantees
- No fake data: enforced by `CLAUDE.md` Data Integrity rules plus DB triggers blocking sources like `'fake'`, `'mock'`, `'simulated'`.
- Source whitelist: only Yahoo Finance, FRED, Alpha Vantage, CoinGecko, Binance, Bybit, Finnhub, FMP, Marketaux.
- External verification: cross-checks against live market prices on data entry.
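
The first two guarantees can be mirrored as an application-side pre-check before rows reach the database. The sketch below is an assumption-laden illustration: the real enforcement is the DB triggers described above, the exact source labels and casing stored in the tables are not given here, and `is_allowed_source` is a hypothetical helper. The blocked set contains only the three examples named in the text and may be incomplete.

```python
# Source names copied from the whitelist above; the stored spelling is an assumption.
ALLOWED_SOURCES = frozenset({
    "Yahoo Finance", "FRED", "Alpha Vantage", "CoinGecko", "Binance",
    "Bybit", "Finnhub", "FMP", "Marketaux",
})

# Only the examples named in the text; the triggers may block more.
BLOCKED_SOURCES = frozenset({"fake", "mock", "simulated"})

# Case-insensitive lookup table built once at import time.
_ALLOWED_FOLDED = frozenset(name.casefold() for name in ALLOWED_SOURCES)


def is_allowed_source(source: str) -> bool:
    """Return True only for whitelisted, non-blocked source labels."""
    normalized = source.strip().casefold()
    if normalized in BLOCKED_SOURCES:
        return False
    return normalized in _ALLOWED_FOLDED


if __name__ == "__main__":
    for label in ("FRED", "mock", "Reuters"):
        print(label, "->", is_allowed_source(label))
```

Checking the blocked set first is redundant given the whitelist, but it keeps the two rules separately visible, matching how the guarantees are stated.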