# Prediction Pipeline
The prediction pipeline is a cache-first architecture. Live Kronos inference is expensive (~50 ms cached, ~800 ms cold, single-concurrency due to the RoPE cache), so hourly/daily batch jobs generate all predictions in advance and the gateway simply serves the most recent row from Postgres.
## Flow diagram

```mermaid
sequenceDiagram
    participant Cron as Railway Cron
    participant Scripts as scripts/kronos-batch-predict.py
    participant DB as Supabase Postgres
    participant K as Kronos FastAPI<br/>(RTX 4060)
    participant GW as MCP Gateway<br/>(Railway)
    participant UI as prediction.datfxlabs.com

    Cron->>Scripts: 55 * * * * (hourly)
    Scripts->>DB: SELECT ohlcv_1h WHERE symbol IN (23 instruments)
    Scripts->>DB: SELECT economic_calendar WHERE date >= now() - 30d
    Scripts->>Scripts: EventEncoder.encode(...) → (T, 20)
    Scripts->>K: POST /predict {ohlcv, events, pred_len=120}
    K-->>Scripts: {p10[], p50[], p90[], samples[]}
    Scripts->>DB: INSERT INTO ml_predictions (...)
    UI->>GW: GET /showcase/ml-prediction/BTCUSDT
    GW->>DB: SELECT latest from ml_predictions
    DB-->>GW: row
    GW-->>UI: {p10, p50, p90, generated_at}
```

## Cron schedule

Currently live (base Kronos, 10-channel placeholder events):
| Cron | Schedule | Purpose |
|---|---|---|
| `kronos-batch-1h` | `55 * * * *` | Hourly 1 h predictions, 23 instruments |
| `kronos-batch-1d` | `45 5 * * *` | Daily 1 d predictions |
| `score-signals-1h` | `5 * * * *` | Score predictions older than 1 h |
| `score-signals-1d` | `30 6 * * *` | Score predictions older than 1 d |
Phase 0–6 additions (planned):
| Cron | Schedule | Purpose |
|---|---|---|
| `chronos2-batch-1h` | `55 * * * *` | Parallel Chronos-2 predictions |
| `chronos2-batch-1d` | `45 5 * * *` | Parallel Chronos-2 daily |
| `kronos-rolling-finetune` | `0 6 1 * *` | Monthly per-asset-class LoRA retrain |
## Components

### Kronos FastAPI service
Section titled “Kronos FastAPI service”- Lives on your local RTX 4060 at port 8200
- Exposed to Railway/gateway via cloudflared tunnel (no public IP)
- Semaphore pinned to 1 to avoid RoPE numerical drift under concurrency
- Models loaded once at startup; event encoder + LoRA adapter loaded alongside
### Batch script: `scripts/kronos-batch-predict.py`

Responsibilities per run:
- Fetch OHLCV for each of the 23 instruments from Supabase
- Fetch `economic_calendar` plus cross-asset leader OHLCV (BTC / SPY / DXY / VIX)
- Build the 20-channel event tensor via `EventEncoder`
- POST to the Kronos FastAPI service
- Insert `{p10, p50, p90, samples, event_context, model_name}` into `ml_predictions`
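The request body the batch script sends matches the flow diagram's `POST /predict {ohlcv, events, pred_len=120}`. A minimal sketch of the payload assembly (the helper name and candle field names are assumptions, not the script's actual code):

```python
def build_predict_payload(ohlcv: list[dict], events: list[list[float]],
                          pred_len: int = 120) -> dict:
    """Assemble the POST /predict body shown in the flow diagram.

    `ohlcv` is a list of candle dicts; `events` is the (T, 20) event
    tensor from EventEncoder, serialised as nested lists.
    """
    # EventEncoder emits 20 channels per timestep; fail fast otherwise.
    assert all(len(row) == 20 for row in events), "expected (T, 20) event tensor"
    return {"ohlcv": ohlcv, "events": events, "pred_len": pred_len}

payload = build_predict_payload(
    ohlcv=[{"open": 1.0, "high": 1.2, "low": 0.9, "close": 1.1, "volume": 10.0}],
    events=[[0.0] * 20],
)
```

The script would then POST `payload` to the tunnelled Kronos endpoint and write the returned quantiles to `ml_predictions`.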
### Gateway: `/showcase/ml-prediction/:symbol`

- No auth (`SHOWCASE_API_KEY` bypass for showcase routes)
- Reads the latest row from `ml_predictions`, filtered by `symbol` and optional `model_name`
- 30-minute LRU cache upstream
- Serves prediction.datfxlabs.com, the Finkit UI, and this docs site
## Signal scoring pipeline

After the prediction horizon elapses, `score_signals.py` evaluates each prediction:
- Did the p50 direction match the realized direction?
- Did the actual close land inside the p10–p90 envelope?
- Per-instrument, per-horizon Sharpe and hit-rate stats are written to the `signal_performance` materialised view
This closed loop powers the Evaluation page and feeds training labels to the Phase 5 ensemble meta-learner.
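The two per-prediction checks above reduce to a few comparisons. A minimal sketch (function and field names are hypothetical, not lifted from `score_signals.py`):

```python
def score_prediction(p10: float, p50: float, p90: float,
                     last_close: float, realized_close: float) -> dict:
    """Score one expired prediction against the realized close.

    - direction_hit: did the p50 forecast direction match the realized move?
    - envelope_hit: did the realized close land inside the p10-p90 band?
    """
    pred_dir = p50 - last_close        # forecast move from the last known close
    real_dir = realized_close - last_close
    return {
        "direction_hit": (pred_dir >= 0) == (real_dir >= 0),
        "envelope_hit": p10 <= realized_close <= p90,
    }

score = score_prediction(p10=98.0, p50=103.0, p90=108.0,
                         last_close=100.0, realized_close=105.0)
```

Aggregating these booleans per instrument and horizon yields the hit-rate stats that land in `signal_performance`.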
## Why cache-first?

Three reasons:
- Model concurrency = 1 — Kronos can’t safely serve multiple live requests anyway.
- Latency SLO — user-facing endpoints need <200 ms; cold inference is 4× that.
- Resilience — if the RTX 4060 is unreachable (power, network, tunnel restart), cached predictions keep serving until next cron.
The tradeoff: predictions are stale for up to 1 h (intraday) or 24 h (daily). Acceptable for the use case; real-time trading would need a different stack.