Implementation Timeline

Five-week execution, three parallel tracks. Kickoff day = W1D1.

gantt
title Kronos Improvement · 5-week plan
dateFormat YYYY-MM-DD
axisFormat W%V
section Track A · Ensemble
Phase 0 · Chronos-2 service :a0, 2026-04-28, 5d
Phase 0 · Schema + cron :a0b, after a0, 2d
section Track B · Event-conditioned
Phase 1 · Event encoder 20 ch :b1, 2026-04-28, 5d
Phase 2 · Model modification :b2, after b1, 3d
Phase 3 · LoRA training :b3, after b2, 7d
Phase 4 · API + batch wire-up :b4, after b3, 3d
Phase 5 · Ensemble evaluation :b5, after a0b b4, 5d
section Track C · Continuity
Phase 6 · Rolling fine-tune :c1, after b3, 5d
Phase 6 · First monthly run :c2, after c1, 2d
  • Track A: Scaffold Chronos-2 service, Supabase model_name migration, seed chronos-2 predictions into ml_predictions.
  • Track B: Build EventEncoder with all 20 channels (events + continuous surprise-z + sinusoidal days-until + 4 cross-asset leaders). Unit tests covering leakage guard.
  • Track A: Chronos-2 cron live, dashboard toggle between models.
  • Track B: Modify Kronos.forward() and auto_regressive_inference() to accept events. Backward-compat regression check: events=None produces identical output to base model. Build training dataset (OHLCV + calendar + leader OHLCV joins from Supabase).
  • Track B: LoRA rank 8 on q_proj / v_proj, 15 epochs, event oversampling 3×. Train time target <4 h on RTX 4060. Save LoRA adapter + event-embedding state dict (~50 MB total).
  • Track B: /predict accepts events. kronos-batch-predict.py fetches calendar + leaders and builds the tensor per run. event_context JSONB column added to ml_predictions.
  • Track C kick-off: scaffold scripts/kronos-rolling-finetune.py, decide asset-class classifier.
  • Track B: Run envelope calibration, directional accuracy, non-event regression, walk-forward backtest on 2025 test period.
  • All tracks merge: LightGBM meta-learner on signal_evaluations — per-asset-class weighting of Kronos-base / Kronos-event / Chronos-2.
  • Track C: first Phase 6 rolling fine-tune run. Warm-start from the Phase 3 adapter, rolling 2-year window per asset class, LR halved, 5 epochs. The regression guard requires val-MAE ≤ the pooled base model's val-MAE before the new adapter is marked active.
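The Phase 6 regression guard can be sketched as a simple comparison of validation MAE between the fine-tuned candidate and the pooled base model. This is a minimal illustration only: the function names `mean_absolute_error` and `passes_regression_guard`, and the sample numbers, are hypothetical and not taken from `scripts/kronos-rolling-finetune.py`.

```python
from statistics import fmean
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error over paired observations."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def passes_regression_guard(
    y_val: Sequence[float],
    candidate_pred: Sequence[float],
    base_pred: Sequence[float],
) -> bool:
    """True when the candidate's val-MAE is no worse than the pooled base's."""
    candidate_mae = mean_absolute_error(y_val, candidate_pred)
    base_mae = mean_absolute_error(y_val, base_pred)
    return candidate_mae <= base_mae


# Made-up validation targets and predictions, for illustration only.
y_val = [100.0, 102.0, 101.0]
base_pred = [101.0, 101.0, 103.0]       # absolute errors 1, 1, 2
candidate_pred = [100.5, 102.5, 101.0]  # absolute errors 0.5, 0.5, 0

activate = passes_regression_guard(y_val, candidate_pred, base_pred)
```

In this sketch an adapter that fails the check would simply not be marked active, leaving the previous adapter in place.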

Phases 1 → 2 → 3 → 4 → 5 form the critical path and set the end-to-end date. Phase 0 is independent and can slip without blocking Phases 1 to 4, but its data is needed for the ensemble evaluation in Phase 5. Phase 6 is gated on Phase 3, though its scaffolding can start earlier.
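To make the dependency reasoning concrete, the sketch below recomputes finish days from the durations and `after` links in the Gantt chart. It counts plain chart days from kickoff and ignores weekends and holidays, which is an assumption; the helper names are illustrative.

```python
from functools import lru_cache

# task id -> (duration in days, prerequisite task ids), copied from the chart.
TASKS = {
    "a0": (5, ()),
    "a0b": (2, ("a0",)),
    "b1": (5, ()),
    "b2": (3, ("b1",)),
    "b3": (7, ("b2",)),
    "b4": (3, ("b3",)),
    "b5": (5, ("a0b", "b4")),
    "c1": (5, ("b3",)),
    "c2": (2, ("c1",)),
}


def start_day(task: str) -> int:
    """Earliest start: the latest finish among prerequisites (0 at kickoff)."""
    _, deps = TASKS[task]
    return max((finish_day(d) for d in deps), default=0)


@lru_cache(maxsize=None)
def finish_day(task: str) -> int:
    """Earliest finish, in days elapsed since kickoff."""
    duration, _ = TASKS[task]
    return start_day(task) + duration


def slack_before(dep: str, dependent: str) -> int:
    """Days `dep` can slip before it delays the start of `dependent`."""
    return start_day(dependent) - finish_day(dep)


print(finish_day("b5"))           # 23: Phase 5 ends the plan
print(finish_day("c2"))           # 22: first monthly run
print(slack_before("a0b", "b5"))  # 11: days Phase 0 can slip
```

Under these assumptions Track B sets the end date at day 23, and Phase 0 has 11 days of slack before it would delay the Phase 5 evaluation.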

| Risk | Mitigation |
| --- | --- |
| Training diverges on event-day oversampling | Ablation: rank 4 vs 8, oversample 3× vs 5× |
| Cross-asset leader data missing for dates | Backfill via Yahoo before Phase 3 (see data gaps) |
| RTX 4060 VRAM pressure with both models | Chronos-2 falls back to CPU if needed |
| FOMC hawkish score NLP not ready | Use rate-decision sign as proxy; retrofit later |
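The rate-decision proxy in the last row could look like the sketch below. The function name `hawkish_proxy`, the basis-point input, and the convention that a hike maps to +1 are all assumptions for illustration; the plan only specifies that the sign of the rate decision stands in for the NLP score.

```python
def hawkish_proxy(rate_change_bps: float) -> int:
    """Sign of the rate decision: +1 for a hike, -1 for a cut, 0 for a hold."""
    if rate_change_bps > 0:
        return 1
    if rate_change_bps < 0:
        return -1
    return 0


# Hypothetical decisions in basis points: hike, hold, cut.
proxy_scores = [hawkish_proxy(bps) for bps in (25, 0, -50)]
```

Because the proxy is a three-level value, a later NLP-based hawkish score can replace it in the same channel without changing the tensor shape.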