Future Improvements
Ideas evaluated during brainstorming but deliberately excluded from the current 5-week plan. Logged here so they don’t get lost.
Roadmap beyond Phase 6
```mermaid
timeline
    title Kronos roadmap · beyond 2026-04
    Phase 1 (current) : Event-conditioned Kronos
                      : Chronos-2 ensemble
                      : Rolling fine-tune
    Phase 2 (Q3 2026) : Cross-asset token model
                      : Shared BSQ + group attention
                      : iTransformer-style variate attention
    Phase 3 (Q4 2026) : Multi-modal fusion
                      : Price + macro + news tokens
                      : Two-tower + cross-attention
    Phase 4 (2027) : Regime-specific adapters
                   : Per-regime LoRA swap at inference
                   : Meta-learner picks adapter
```
Evaluated, deferred, or rejected
Cross-asset token model — DEFERRED to Phase 2
Pass multiple assets into a shared tokenizer simultaneously. Each asset tokenized independently, then cross-attention across assets. Moirai-2 does this natively but its CC-BY-NC licence blocks commercial use. Chronos-2’s group attention is the pragmatic substitute.
For now, Phase 1 approximates this with 4 cross-asset leader channels. True cross-sectional attention (all 23 instruments attending to each other) is a bigger architecture lift.
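To make the “bigger architecture lift” concrete, here is a minimal NumPy sketch of cross-sectional attention, where all instruments attend to each other at every time step. The shapes, the single-head form, and the function name are illustrative assumptions, not the eventual design:

```python
import numpy as np

def cross_asset_attention(tokens: np.ndarray) -> np.ndarray:
    """Single-head attention across the asset axis at each time step.

    tokens: (n_assets, seq_len, d) per-asset token embeddings
    (each asset tokenized independently, as described above).
    Returns the same shape, with every asset's embedding replaced by
    a weighted mix of all assets' embeddings at that time step.
    """
    n_assets, seq_len, d = tokens.shape
    x = tokens.transpose(1, 0, 2)                       # (seq_len, A, d)
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(d)      # (seq_len, A, A)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over assets
    mixed = weights @ x                                 # (seq_len, A, d)
    return mixed.transpose(1, 0, 2)

rng = np.random.default_rng(0)
out = cross_asset_attention(rng.normal(size=(23, 64, 32)))
print(out.shape)  # (23, 64, 32) — all 23 instruments mixed per step
```

The 4 leader channels in Phase 1 are a fixed, hand-picked subset of this: full cross-sectional attention learns the mixing weights instead.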
Multi-modal fusion — DEFERRED to Phase 3
Combine:
- Price token stream (Kronos / Chronos-2)
- Macro indicator stream (FRED series embedded)
- News text stream (FinBERT embeddings)
Research gap: no production model fuses all three natively. State of the art is two-tower + fusion (e.g. FinMem, FinTral). Engineering investment: 6–12 months. Revisit when the Phase 1 baseline is solid and we have reason to believe each modality adds material alpha.
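A minimal sketch of the two-tower + cross-attention shape, with random matrices standing in for learned projections. The dimensions, `fuse_towers`, and `d_fused` are all hypothetical:

```python
import numpy as np

def fuse_towers(price_emb, macro_emb, news_emb, d_fused=64, seed=0):
    """Toy fusion: price tokens cross-attend to the concatenated
    macro + news tokens, then mean-pool into one fused vector.
    Random projections stand in for learned weights."""
    rng = np.random.default_rng(seed)
    context = np.vstack([macro_emb, news_emb])           # (n_ctx, d)
    d = price_emb.shape[1]
    Wq = rng.normal(size=(d, d_fused)) / np.sqrt(d)
    Wk = rng.normal(size=(d, d_fused)) / np.sqrt(d)
    Wv = rng.normal(size=(d, d_fused)) / np.sqrt(d)
    q, k, v = price_emb @ Wq, context @ Wk, context @ Wv
    scores = q @ k.T / np.sqrt(d_fused)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                   # softmax over context
    return (w @ v).mean(axis=0)                          # pooled fused vector

price = np.random.default_rng(1).normal(size=(128, 32))  # price token stream
macro = np.random.default_rng(2).normal(size=(12, 32))   # macro indicator stream
news  = np.random.default_rng(3).normal(size=(20, 32))   # news text stream
print(fuse_towers(price, macro, news).shape)  # (64,)
```

The point of the sketch is the data flow, not the weights: each tower keeps its own encoder, and fusion happens only at the cross-attention step.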
Regime-specific LoRA adapters — DEFERRED
Train one LoRA per regime (trending / ranging / volatile / quiet). Meta-learner picks the adapter at inference from current regime classification. Complementary to Phase 6 (which does per-asset-class rolling retrain). Combining both = one adapter per (asset_class × regime) = 12 adapters. Probably overkill until event-conditioned baseline plateaus.
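The routing idea fits in a few lines. The class/regime names, file paths, and `pick_adapter` lookup below are all hypothetical; the real meta-learner would be a trained classifier, not a dict lookup:

```python
from itertools import product

# Illustrative labels only — 3 asset classes × 4 regimes = the 12
# adapters estimated above.
ASSET_CLASSES = ["fx", "equity", "crypto"]
REGIMES = ["trending", "ranging", "volatile", "quiet"]

# One LoRA checkpoint per (asset_class, regime) pair (paths hypothetical).
registry = {key: f"lora/{key[0]}_{key[1]}.safetensors"
            for key in product(ASSET_CLASSES, REGIMES)}

def pick_adapter(asset_class: str, regime: str) -> str:
    """Meta-learner stand-in: route on the current regime classification."""
    return registry[(asset_class, regime)]

print(len(registry))                   # 12
print(pick_adapter("fx", "volatile"))  # lora/fx_volatile.safetensors
```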
INT8 / FP8 quantization — REJECTED for v1
Kronos-base inference is already 50 ms from cache. Quantization would shave 20–30% but introduces quality risk. Not worth the engineering time until inference becomes the bottleneck — e.g. if we serve live (non-cached) predictions or scale to 100+ instruments.
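The quality risk is easy to see with a symmetric per-tensor INT8 round-trip. This is a generic sketch, not Kronos’s actual weights or a specific quantization library:

```python
import numpy as np

def int8_roundtrip(w: np.ndarray):
    """Symmetric per-tensor INT8 quantize then dequantize.
    Every dequantized weight differs from the original by up to
    half a quantization step (scale / 2)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale, scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(256, 256)).astype(np.float32)
w_hat, scale = int8_roundtrip(w)
max_err = np.abs(w - w_hat).max()
print(max_err <= scale / 2 + 1e-8)  # True: error bounded by half a step
```

Per-tensor error like this compounds across layers, which is exactly the quality risk that isn’t worth taking for a 20–30% saving on an already-cached 50 ms path.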
ONNX / TensorRT export — REJECTED
Same rationale. Additional issue: Kronos uses dynamic token sampling, which doesn’t export cleanly to an ONNX graph. It would require model surgery that defeats the time saving.
Retrain Kronos from scratch — REJECTED (hard)
- Needs a pre-training corpus of >10 M candles
- 3+ months GPU time
- No evidence of upside vs LoRA fine-tuning on 80 K candles
- Zero forks with event conditioning exist — LoRA is the right path
Switch to Moirai-2 — REJECTED (licence)
Best any-variate design, smallest fast model. But CC-BY-NC-4.0 means no commercial deployment.
Switch to TimesFM 2.5 — EVALUATED
200 M params, Apache 2.0, 16 K context, XReg covariate support. Strong candidate for a second ensemble member if Chronos-2 underperforms. Keep in back pocket.
Instrumentation / ops wish-list
- Confidence-weighted scoring — signal quality should be a function of envelope width, not just p50 direction. Partially exists; needs exposure on the prediction site.
- Prediction explainability — SHAP-style contributions of each event channel to the p50 shift. Would turn the model into a narrative tool, not just a number.
- Automatic drift alerts — when Chronos-2 and Kronos disagree >N standard deviations for K consecutive runs, alert and trigger Phase 6 retrain.
- A/B serving — route a fraction of prediction-site traffic to the ensemble model vs baseline, log conversion/engagement, validate real-user impact not just offline metrics.
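The drift-alert rule above can be sketched as follows. The `n_sigma`/`k_runs` defaults and the choice to measure sigma on the pre-drift baseline window are assumptions:

```python
import numpy as np

def drift_alert(chronos_p50, kronos_p50, n_sigma=2.0, k_runs=3):
    """Alert when the last k_runs Chronos-2 vs Kronos disagreements all
    exceed n_sigma standard deviations of the earlier (baseline)
    disagreement history."""
    diff = np.asarray(chronos_p50, float) - np.asarray(kronos_p50, float)
    if len(diff) <= k_runs:
        return False
    baseline, recent = diff[:-k_runs], diff[-k_runs:]
    sigma = baseline.std()
    if sigma == 0:
        return False
    return bool(np.all(np.abs(recent) > n_sigma * sigma))

kronos = np.zeros(9)
quiet = np.array([0.1, -0.2, 0.0, 0.1, -0.1, 0.05, 0.08, -0.05, 0.1])
drifting = np.array([0.1, -0.2, 0.0, 0.1, -0.1, 0.05, 2.5, 2.8, 3.1])
print(drift_alert(quiet, kronos), drift_alert(drifting, kronos))  # False True
```

A `True` here would trigger the Phase 6 retrain; measuring sigma on the baseline window keeps the drift itself from inflating the threshold.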
What we won’t do
- Train a private financial LLM — orders of magnitude more data + compute; BloombergGPT / FinGPT territory.
- Real-time (sub-second) predictions — requires a completely different architecture; not the use case.
- Individual retail trading signals — this is a research / documentation site, not a signal provider. Prediction outputs are educational.