Our trading infrastructure has 3 ML systems. Here's where each one stands:
HMM Regime Terminal status: FULLY TRAINED. This is the only system that can train immediately, because it learns from price data (17,000+ observations) rather than trade outcomes.
Alpha Engine ML status: HEURISTIC MODE. Needs 50 closed trade outcomes to begin experimental training (2 of 50 so far). Currently uses rule-based scoring, which is a standard cold-start approach.
KIMI ML Ranker status: HEURISTIC MODE. Zero closed picks. The KIMI tournament system generates picks but does not yet close and record them for ML training.
These estimates are based on academic research (Lopez de Prado 2018, BMC Medical Informatics 2021) and current pick accumulation rates. This is an honest assessment, not hype.
| Phase | Picks Needed | Estimated Date | What Happens |
|---|---|---|---|
| NOW: Heuristic Mode | 0-50 picks | Feb-Apr 2026 | Rule-based scoring (Sharpe + win rate + tier bonuses). This is a standard cold-start approach while too few outcomes exist to train on. |
| Experimental ML | 50-150 picks | ~May-Jun 2026 | RF starts training with reduced feature set (5-6 features only). Runs alongside heuristic — does NOT replace it. High overfitting risk. |
| ML Becomes Useful | 150-300 picks | ~Aug-Oct 2026 | Walk-forward validation begins. Model blended 30/70 with heuristic. Feature count increases as data supports it. |
| ML Primary Ranker | 300+ picks | Late 2026+ | Full 14-18 feature model. Purged cross-validation. Model confidence high enough to serve as primary signal ranker. |
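The phase boundaries in the table can be sketched as a small lookup plus a blending rule. This is a minimal illustration, not the production logic: the function names are hypothetical, boundary values (50, 150, 300) are assigned to the later phase, and "30/70" is assumed to mean 30% ML and 70% heuristic.

```python
def phase_for(closed_picks: int) -> str:
    """Map a closed-pick count to a roadmap phase (boundaries go to the later phase)."""
    if closed_picks < 50:
        return "heuristic"
    if closed_picks < 150:
        return "experimental_ml"
    if closed_picks < 300:
        return "ml_useful"
    return "ml_primary"


def ranking_score(heuristic: float, ml: float, closed_picks: int,
                  ml_weight: float = 0.30) -> float:
    """Score used for ranking in each phase.

    Assumption: the 30/70 blend means 30% ML and 70% heuristic.
    """
    phase = phase_for(closed_picks)
    if phase in ("heuristic", "experimental_ml"):
        # The experimental model runs alongside the heuristic but does not replace it.
        return heuristic
    if phase == "ml_useful":
        return ml_weight * ml + (1.0 - ml_weight) * heuristic
    return ml  # ML is the primary ranker


print(phase_for(120), ranking_score(0.8, 0.4, 200))
```

Keeping the heuristic score as the only output during the experimental phase means the early, overfit-prone model can be logged and compared without affecting live rankings.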
Why so long? A common rule of thumb calls for roughly 10-20 observations per feature, so a Random Forest with 14-18 features needs about 140-360 picks to keep overfitting risk manageable. Our 50-pick threshold is for experimental training only, not production-quality predictions. The HMM bypasses this entirely by learning from price data (17,000+ observations), which is why it's operational now.
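The arithmetic behind those numbers is simple enough to check directly. The helper names below are hypothetical, and the observations-per-feature ratio is a rule of thumb rather than a guarantee.

```python
def picks_needed(n_features: int, obs_per_feature: int) -> int:
    """Minimum picks implied by an observations-per-feature rule of thumb."""
    return n_features * obs_per_feature


def max_features(closed_picks: int, obs_per_feature: int = 10) -> int:
    """Largest feature count the same rule allows for a given number of picks."""
    return closed_picks // obs_per_feature


low = picks_needed(14, 10)    # smallest model, most lenient ratio: 140
high = picks_needed(18, 20)   # largest model, strictest ratio: 360
print(low, high, max_features(50))
```

At the lenient 10-per-feature ratio, 50 picks support about 5 features, which lines up with the reduced 5-6 feature set planned for the experimental phase.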
Imagine the stock market has different "moods" — sometimes it's excited and prices go up (bull), sometimes it's scared and prices crash (bear), sometimes it's confused and goes sideways (neutral). These moods are hidden — you can't directly see them. You can only see the results: price changes, volume, and volatility.
An HMM (hidden Markov model) is a statistical model that works backwards from the results to estimate which mood the market is probably in. It's like being a detective: you can't see the criminal, but you can see the evidence and deduce what happened.
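To make "working backwards from the results" concrete, here is a minimal sketch of the forward (filtering) step of a Gaussian HMM in plain Python. It uses only the three moods from the analogy and a single feature (daily return). Every number in it is made up for illustration; a trained model would learn these parameters from data and, in our case, use more states and features.

```python
import math

STATES = ["bull", "neutral", "bear"]

# Hypothetical parameters for illustration only.
START = [1 / 3, 1 / 3, 1 / 3]          # initial belief over states
TRANS = [                               # TRANS[i][j] = P(next = j | current = i)
    [0.90, 0.08, 0.02],
    [0.10, 0.80, 0.10],
    [0.02, 0.08, 0.90],
]
MEANS = [0.010, 0.000, -0.015]          # mean daily return per state
STDS = [0.010, 0.005, 0.025]            # return volatility per state


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def filter_states(returns):
    """Forward algorithm: P(state at the last step | all returns seen so far)."""
    n = len(STATES)
    belief = START[:]
    for t, r in enumerate(returns):
        if t > 0:
            # Predict: push yesterday's belief through the transition matrix.
            belief = [sum(belief[i] * TRANS[i][j] for i in range(n)) for j in range(n)]
        # Update: weight each state by how well it explains today's return.
        weighted = [belief[j] * gaussian_pdf(r, MEANS[j], STDS[j]) for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
    return dict(zip(STATES, belief))


posterior = filter_states([-0.04, -0.05, -0.03])
print(max(posterior, key=posterior.get), posterior)
```

The output is a probability for each mood rather than a single label, which is what lets downstream logic treat a 55% "bear" reading differently from a 99% one.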
The learning process has 3 steps:
After training, the AI sorts the 7 states by their average return:
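The sorting step matters because the state numbers an HMM assigns during training are arbitrary: state 3 in one run may be state 5 in the next. Ranking states by their average return gives each one a stable meaning. A minimal sketch, with a hypothetical function name and toy data using three states instead of seven:

```python
def rank_states_by_return(state_ids, returns):
    """Return (state ids ordered from lowest to highest mean return, mean per state)."""
    totals, counts = {}, {}
    for state, ret in zip(state_ids, returns):
        totals[state] = totals.get(state, 0.0) + ret
        counts[state] = counts.get(state, 0) + 1
    means = {state: totals[state] / counts[state] for state in totals}
    return sorted(means, key=means.get), means


# Toy example: decoded state per day and that day's return.
order, means = rank_states_by_return(
    [0, 1, 2, 0, 1, 2],
    [0.01, -0.02, 0.0, 0.03, -0.04, 0.0],
)
print(order)  # most bearish state first, most bullish last
```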
Even when the HMM says "this is a bull market," we don't blindly trust it. We require at least 7 of 8 technical indicators to agree before generating a signal:
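The agreement check can be sketched as a simple vote count. The indicator names below are placeholders chosen for illustration, not the actual list, and the function name is hypothetical.

```python
def indicators_agree(votes: dict, required: int = 7) -> bool:
    """True when at least `required` indicators confirm the HMM's regime call."""
    agreeing = sum(1 for confirms in votes.values() if confirms)
    return agreeing >= required


# Hypothetical indicator names; each value says whether that indicator
# agrees with the regime the HMM reported.
votes = {
    "indicator_1": True, "indicator_2": True, "indicator_3": True,
    "indicator_4": True, "indicator_5": True, "indicator_6": True,
    "indicator_7": True, "indicator_8": False,
}
print(indicators_agree(votes))
```

With the threshold at 7 of 8, a single dissenting indicator still allows a signal, but two dissenters block it.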
Most trading algorithms use fixed rules like "if RSI < 30, buy." The problem? Markets change. A rule that works in a calm market fails in a crash.
The HMM adapts because it first figures out what type of market we're in, then applies the right strategy. It's like a doctor who first diagnoses the disease before prescribing medicine, instead of giving everyone the same pill.
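Hidden Markov models have a long association with Renaissance Technologies: Leonard Baum, co-developer of the Baum-Welch algorithm used to train HMMs, was an early partner of the firm's founder, Jim Simons. The firm's Medallion Fund reportedly averaged about 66% annual returns before fees from 1988 to 2018.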
Real-time status of all automated systems. Green = healthy, orange = needs attention, red = broken.
| Feature | HMM Regime Terminal | KIMI ML Ranker | Alpha Engine ML |
|---|---|---|---|
| Algorithm | Gaussian HMM (7 states) | Random Forest (200 trees) | Random Forest (200 trees) |
| Training Data | 17,000+ price observations | 0/50 trade outcomes | 2/50 trade outcomes |
| Can Train Now? | YES — trains on market data | NO — chicken-and-egg | NO — insufficient picks |
| Approach | Probabilistic regime detection | Post-hoc signal scoring | Post-hoc signal scoring |
| Features | 5 (return, volatility, volume, momentum ×2) | 14 (mixed categories) | 18 (mixed indicators) |
| Adaptation | Retrains every scan on latest data | Retrains every 25 picks | Retrains every 25 picks |
| Regime Awareness | Core feature (7 states) | None (single ADX check) | None (single ADX check) |
| Confidence | Posterior state probability (Gaussian emissions) | RF class probability | RF class probability |
| Walk-Forward | Built-in (365d train / 30d test) | No (5-fold CV only) | No (5-fold CV only) |
| Transaction Costs | Modeled (10bps + 5bps slippage) | Not modeled | Not modeled |
| Markets | Crypto + Meme + Forex + Stocks + Penny | Crypto + Forex (limited) | Crypto + Forex + Equity |
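Two rows of the comparison, walk-forward validation and transaction costs, are easier to follow with a small example. The sketch below assumes a rolling (not expanding) 365-day training window that steps forward by one 30-day test window, and assumes the 10 bps fee and 5 bps slippage are charged on both entry and exit. Both are assumptions for illustration, as are the function names.

```python
def walk_forward_windows(n_days: int, train_days: int = 365, test_days: int = 30):
    """Yield (train_start, train_end, test_end) day indices; end indices are exclusive."""
    start = 0
    while start + train_days + test_days <= n_days:
        train_end = start + train_days
        yield start, train_end, train_end + test_days
        start += test_days  # roll forward by one test window


def net_return(gross: float, fee_bps: float = 10.0, slippage_bps: float = 5.0,
               sides: int = 2) -> float:
    """Gross trade return minus costs. Assumption: costs apply per side (entry and exit)."""
    return gross - sides * (fee_bps + slippage_bps) / 10_000.0


windows = list(walk_forward_windows(425))
print(windows)            # two windows fit into 425 days of history
print(net_return(0.01))   # a 1% gross gain after round-trip costs
```

Because each test window starts exactly where its training window ends, the model is always scored on data it has not seen, which a shuffled 5-fold split does not guarantee for time series.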