Phase 0 · Chronos-2 Ensemble
Priority: High · Status: Pending · Depends on: None (parallel with Phases 1–4)
Context
- Kronos runs on RTX 4060:8200 via cloudflared tunnel
- 23 instruments, hourly + daily batch via Railway cron
- `ml_predictions` table stores cached predictions
- Gateway serves predictions via `/showcase/ml-prediction`
Overview
Deploy Amazon Chronos-2 (Apache 2.0, 120M params, multivariate + covariate ICL) as a second predictor running alongside Kronos. Zero modifications to the existing Kronos pipeline. Provides a regime-shift robustness baseline and enables the ensemble in Phase 5.
Requirements
Functional
- Load `amazon/chronos-2` from HuggingFace
- Batch predict the same 23 instruments as Kronos on the same schedule
- Store predictions in `ml_predictions` with `model_name='chronos-2'`
- Expose via existing gateway endpoint with optional `?model=` param
Non-Functional
- Co-locate on same RTX 4060 (share GPU, not 2× hardware)
- Fall back to CPU inference under GPU memory contention
- Daily inference <5 min for all 23 symbols
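The co-location and CPU-fallback requirements could be met with a small device-selection helper. The following is a minimal sketch, not the service's actual logic: `choose_device`, the `required_mb` estimate, and the `headroom_mb` margin are hypothetical names and values. In the real service the free-memory figure would come from the GPU runtime (for example `torch.cuda.mem_get_info`).

```python
def choose_device(cuda_available: bool, free_vram_mb: float,
                  required_mb: float, headroom_mb: float = 512.0) -> str:
    """Return "cuda" when the model fits with headroom, otherwise "cpu"."""
    if not cuda_available:
        return "cpu"
    # Keep a safety margin so Kronos is not starved while both models are resident.
    if free_vram_mb >= required_mb + headroom_mb:
        return "cuda"
    return "cpu"


# Example values are illustrative only, not measurements.
print(choose_device(True, free_vram_mb=6000, required_mb=1500))   # cuda
print(choose_device(True, free_vram_mb=1200, required_mb=1500))   # cpu
print(choose_device(False, free_vram_mb=8000, required_mb=1500))  # cpu
```

Keeping the decision in a pure function makes the fallback rule easy to unit-test without a GPU present.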
Architecture
Service Layout

    kronos-service/
    ├── main.py                 # existing FastAPI, add /predict-chronos2 endpoint
    ├── predictor.py            # existing Kronos wrapper
    ├── chronos2_predictor.py   # NEW: Chronos-2 wrapper
    └── ...

Shared service (not separate microservice) because:
- Single cloudflared tunnel simplifies infra
- One GPU, coordinated memory via semaphore
- Gateway already speaks to this service
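The "coordinated memory via semaphore" point can be illustrated with a single-slot `asyncio.Semaphore` shared by both predictors. This is a sketch under assumptions: `run_inference` and the `state` bookkeeping are hypothetical stand-ins, and the sleep replaces the real model call.

```python
import asyncio


async def run_inference(name: str, gpu_slot: asyncio.Semaphore, state: dict) -> str:
    """Stand-in for a Kronos or Chronos-2 forward pass guarded by the shared slot."""
    async with gpu_slot:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # placeholder for the real model call
        state["active"] -= 1
    return f"{name}: done"


async def main() -> dict:
    gpu_slot = asyncio.Semaphore(1)  # one slot: a single model on the GPU at a time
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        run_inference("kronos", gpu_slot, state),
        run_inference("chronos-2", gpu_slot, state),
    )
    state["results"] = results
    return state


state = asyncio.run(main())
print(state["peak"])  # 1: the two requests never overlapped
```

In a FastAPI service a blocking forward pass would typically also be moved off the event loop (for example with `asyncio.to_thread`) while the semaphore is held, so that other requests keep being served.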
Schema Extension

    ALTER TABLE ml_predictions
      ADD COLUMN IF NOT EXISTS model_name TEXT NOT NULL DEFAULT 'kronos-base';
    ALTER TABLE ml_predictions
      ADD COLUMN IF NOT EXISTS model_version TEXT NULL;

    CREATE INDEX IF NOT EXISTS idx_ml_pred_symbol_model_ts
      ON ml_predictions(symbol, model_name, created_at DESC);

Chronos-2 Input Format

    # Pass as multivariate group for regime context
    group = {
        "target": close_prices,           # (T,)
        "covariates": {
            "volume": volumes,            # (T,)
            "high_low_spread": hl_diff,   # (T,)
        },
    }
    # Output: p10, p50, p90 quantiles at each horizon step

Cron Addition
- `chronos2-batch-1h`: `55 * * * *` (same slot as `kronos-batch-1h`, runs in parallel)
- `chronos2-batch-1d`: `45 5 * * *`
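Returning to the Chronos-2 input format: the group could be assembled from raw OHLCV bars roughly as follows. This is a sketch under assumptions. `build_group` and the bar field names are hypothetical, `hl_diff` is taken to mean high minus low, and the dictionary keys simply mirror the snippet in this document; the key names the `chronos` library actually expects should be checked against its documentation.

```python
from typing import Dict, List, Sequence


def build_group(bars: Sequence[Dict[str, float]]) -> dict:
    """Turn OHLCV bars (oldest first) into the target/covariates layout."""
    if not bars:
        raise ValueError("at least one bar is required")
    close_prices: List[float] = [b["close"] for b in bars]
    volumes: List[float] = [b["volume"] for b in bars]
    # Assumption: the spread covariate is the bar's high minus its low.
    hl_diff: List[float] = [b["high"] - b["low"] for b in bars]
    return {
        "target": close_prices,
        "covariates": {"volume": volumes, "high_low_spread": hl_diff},
    }


# Illustrative bars, not real market data.
bars = [
    {"high": 101.0, "low": 99.0, "close": 100.0, "volume": 1200.0},
    {"high": 103.0, "low": 100.5, "close": 102.0, "volume": 1500.0},
]
group = build_group(bars)
print(group["covariates"]["high_low_spread"])  # [2.0, 2.5]
```

All three series share the same length T, which matches the `(T,)` shape annotations in the input format.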
Implementation Steps
- Add `chronos` package (published on PyPI as `chronos-forecasting`) to `kronos-service/requirements.txt`
- Create `kronos-service/chronos2_predictor.py`: load model, predict method
- Add `/predict-chronos2` and `/predict-batch-chronos2` to `main.py`
- Share inference semaphore with Kronos (avoid GPU contention)
- Create `scripts/chronos2-batch-predict.py` (mirror `kronos-batch-predict.py`)
- Apply Supabase migration for `model_name`, `model_version` columns
- Update gateway handler to filter by `model_name` (default behavior unchanged)
- Add Railway cron services: `chronos2-batch-1h`, `chronos2-batch-1d`
- Test: verify both Kronos and Chronos-2 predictions in `ml_predictions` for same symbol/timestamp
- Dashboard: add model toggle to prediction view
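The gateway step ("filter by `model_name`, default behavior unchanged") comes down to a small selection rule. The sketch below expresses that rule in Python for illustration only; the real handler lives in the TypeScript gateway. `latest_prediction`, `KNOWN_MODELS`, and the sample rows are hypothetical.

```python
from typing import Dict, List, Optional

DEFAULT_MODEL = "kronos-base"  # matches the column default, so existing callers see no change
KNOWN_MODELS = {"kronos-base", "chronos-2"}


def latest_prediction(rows: List[Dict], symbol: str,
                      model: Optional[str] = None) -> Optional[Dict]:
    """Return the newest row for the symbol and model, or None when nothing matches."""
    model_name = model or DEFAULT_MODEL
    if model_name not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model_name}")
    matches = [r for r in rows
               if r["symbol"] == symbol and r["model_name"] == model_name]
    # ISO-8601 UTC timestamps in one fixed format sort correctly as strings.
    return max(matches, key=lambda r: r["created_at"], default=None)


# Placeholder symbol and timestamps, for illustration only.
rows = [
    {"symbol": "AAA", "model_name": "kronos-base", "created_at": "2026-04-23T04:55:00Z"},
    {"symbol": "AAA", "model_name": "kronos-base", "created_at": "2026-04-23T05:55:00Z"},
    {"symbol": "AAA", "model_name": "chronos-2", "created_at": "2026-04-23T05:55:00Z"},
]
print(latest_prediction(rows, "AAA")["model_name"])               # kronos-base
print(latest_prediction(rows, "AAA", "chronos-2")["model_name"])  # chronos-2
```

Omitting the `model` argument corresponds to a request without `?model=`, which keeps returning Kronos rows.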
Key Files
- Create: `kronos-service/chronos2_predictor.py`
- Create: `scripts/chronos2-batch-predict.py`
- Create: `supabase/migrations/20260423_ml_predictions_model_name.sql`
- Modify: `kronos-service/main.py`
- Modify: `kronos-service/requirements.txt`
- Modify: `mcp-servers/gateway/src/handlers/ml-handlers.ts` (optional `?model=` filter)
| Risk | Likelihood | Mitigation |
|---|---|---|
| Chronos-2 VRAM conflicts with Kronos | Medium | Start on CPU; GPU only if latency fails SLO (<5s/predict) |
| Package conflicts with kronos_lib | Medium | Install chronos in isolated path, separate loader |
| Zero-shot accuracy poor on financial data | Medium | Still useful as a regime-shift canary; Phase 5 ensemble weighting can down-weight it |
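The "start on CPU; GPU only if latency fails SLO" mitigation needs a concrete pass/fail rule. One possible rule is sketched below. Using the 95th percentile is an assumption, since the plan does not say which statistic the SLO applies to, and `p95` and `needs_gpu` are hypothetical helpers.

```python
from typing import Sequence

SLO_SECONDS = 5.0       # per-predict SLO from the risk table
BATCH_BUDGET_S = 300.0  # daily batch budget (<5 min)
N_SYMBOLS = 23


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def needs_gpu(cpu_latencies_s: Sequence[float]) -> bool:
    """True when measured CPU latency misses the per-predict SLO."""
    return p95(cpu_latencies_s) >= SLO_SECONDS


# Illustrative latencies in seconds, not measurements.
print(needs_gpu([1.8, 2.1, 2.4, 2.0, 2.2]))       # False
print(N_SYMBOLS * SLO_SECONDS <= BATCH_BUDGET_S)  # True
```

The last line checks that the two targets are compatible: 23 predictions at the 5 s ceiling take 115 s when run sequentially, well inside the 5-minute daily budget.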
Success Criteria
- Chronos-2 produces predictions for all 23 instruments
- `ml_predictions` contains both `kronos-base` and `chronos-2` rows
- Gateway serves both models (default = `kronos-base` for back-compat)
- Daily batch completes in <5 min
- No regression on existing Kronos latency
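The first two criteria can be checked mechanically once a batch has run. The sketch below assumes rows have already been fetched from `ml_predictions` as dictionaries; `missing_coverage` and the placeholder symbols are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def missing_coverage(rows: Iterable[Dict], symbols: Sequence[str],
                     models: Sequence[str] = ("kronos-base", "chronos-2"),
                     ) -> List[Tuple[str, str]]:
    """List every (symbol, model_name) pair that has no prediction row."""
    seen = {(r["symbol"], r["model_name"]) for r in rows}
    return [(s, m) for s in symbols for m in models if (s, m) not in seen]


# Placeholder data: "BBB" has no Chronos-2 row yet.
rows = [
    {"symbol": "AAA", "model_name": "kronos-base"},
    {"symbol": "AAA", "model_name": "chronos-2"},
    {"symbol": "BBB", "model_name": "kronos-base"},
]
print(missing_coverage(rows, ["AAA", "BBB"]))  # [('BBB', 'chronos-2')]
```

An empty result over the full list of 23 instruments would indicate that both models produced rows for every symbol.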