Historical Backtest Engine
The TRADEOS.tech backtest engine evaluates strategy performance against real historical market data. It reuses the same signal pipeline, feasibility gates, and execution simulator that run in paper and live trading — so backtested results reflect how the system actually behaves, not how a separate simulation model thinks it behaves.
What makes a backtest trustworthy
Most backtesting frameworks use simplified models that diverge from the real trading path: separate signal generation code, simplified risk checks, optimistic fill assumptions. These produce results that look good in testing but underperform in live trading.
TRADEOS.tech avoids this by running the identical signal pipeline in both backtest and live trading. The same Python classes that generate signals in production process the historical OHLCV bars. The same feasibility gates apply the same risk checks. The same sandbox executor simulates fills. The only differences are:
- Clock source: backtest uses a `ReplayClock` that steps through historical timestamps; live uses the real wall clock
- Data source: backtest feeds stored OHLCV bars from the database; live receives market data from exchange WebSocket feeds
- Fill path: backtest uses ATR-scaled slippage simulation; live routes to real venue APIs
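The clock difference can be sketched as a small interface that the rest of the pipeline is agnostic to. This is an illustrative sketch: only the `ReplayClock` name comes from this document; `WallClock` and the `advance` method are assumptions about the shape of the abstraction.

```python
from datetime import datetime, timezone

class WallClock:
    """Live mode: reads the real wall clock."""
    def now(self) -> datetime:
        return datetime.now(timezone.utc)

class ReplayClock:
    """Backtest mode: steps through stored bar timestamps.

    Hypothetical sketch; the engine's actual ReplayClock interface
    may differ.
    """
    def __init__(self, timestamps):
        self._timestamps = iter(timestamps)
        self._current = next(self._timestamps)

    def now(self) -> datetime:
        # The pipeline sees only now(), so it cannot tell replay from live.
        return self._current

    def advance(self) -> bool:
        """Move to the next historical timestamp; False when exhausted."""
        try:
            self._current = next(self._timestamps)
            return True
        except StopIteration:
            return False
```

Because signal code only ever calls `now()`, the same classes run unchanged in both modes.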
Data sourcing
Historical OHLCV data is sourced from exchange REST APIs (Kraken primary, Coinbase for tail instruments). The data manager is idempotent and gap-filling — running a backtest for any time window automatically fetches and stores any missing bars before the backtest begins.
Each backtest result includes a `data_coverage_pct` metric showing what fraction of the requested time window had available data. Results with insufficient coverage display a warning on the dashboard.
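A coverage metric like this reduces to counting the bars actually stored against the bars the window implies. The helper below is a hypothetical illustration of the idea; the engine's real computation is not documented here.

```python
from datetime import datetime, timedelta

def data_coverage_pct(bar_timestamps, start, end,
                      timeframe=timedelta(hours=1)):
    """Percentage of expected bars in [start, end) actually present.

    Illustrative sketch of the data_coverage_pct metric, assuming one
    bar per timeframe step.
    """
    expected = int((end - start) / timeframe)
    if expected == 0:
        return 100.0
    # Count stored bars that fall inside the requested window.
    present = sum(1 for ts in bar_timestamps if start <= ts < end)
    return 100.0 * min(present, expected) / expected
```

A dashboard warning threshold would then be a simple comparison against this percentage.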
Fill modeling
Backtest fills use ATR-scaled slippage: the fill price deviates from the bar's close price by an amount proportional to the instrument's Average True Range at the time of the fill. In volatile markets, slippage is larger. Venue taker fees are applied to every fill.
This produces more conservative backtest P&L than naive next-bar-open or last-traded-price fills — which is the right direction for a trustworthy simulation.
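The fill rule described above can be written down in a few lines. This is a sketch under stated assumptions: the slippage coefficient `slippage_k` and the taker-fee default are illustrative values, not the engine's actual parameters.

```python
def simulate_fill(close, atr, side, slippage_k=0.1, taker_fee=0.0026):
    """ATR-scaled fill sketch: price worsens by slippage_k * ATR in the
    direction of the trade, then a taker fee is charged on the fill.

    slippage_k and taker_fee are assumed illustrative defaults.
    """
    slip = slippage_k * atr
    # Buys fill above the close, sells below it: slippage always hurts.
    fill_price = close + slip if side == "buy" else close - slip
    fee = fill_price * taker_fee
    return fill_price, fee
```

Note that because ATR rises in volatile markets, simulated slippage widens exactly when real slippage would.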
Performance
The backtest engine runs in parallel across instruments and streams bars from the database one at a time (no full dataset preload). Multi-instrument, multi-month backtests at hourly resolution complete in seconds — not minutes — due to the parallel streaming architecture.
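The shape of that architecture, one streaming cursor per instrument fanned out across workers, can be sketched as follows. The `stream_bars` generator stands in for a database cursor; all names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def stream_bars(symbol):
    """Stand-in for a DB cursor yielding one bar at a time,
    so the full dataset is never held in memory."""
    for i in range(3):
        yield {"symbol": symbol, "close": 100.0 + i}

def run_symbol(symbol):
    # Process bars as they stream; only the current bar is held.
    last = None
    for bar in stream_bars(symbol):
        last = bar["close"]
    return symbol, last

def run_backtest(symbols):
    # One worker per instrument: instruments backtest in parallel.
    with ThreadPoolExecutor(max_workers=len(symbols)) as pool:
        return dict(pool.map(run_symbol, symbols))
```

Streaming keeps memory flat regardless of window length, while the per-instrument fan-out is what turns multi-month runs into seconds.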
Launching a backtest
Backtests can be launched from the dashboard Backtest tab or via the REST API:
```bash
# Start a backtest job
curl -X POST http://localhost:8085/backtest/run \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-09-01",
    "end_date": "2026-03-01",
    "symbols": ["BTCUSDT", "ETHUSDT"],
    "profile": "balanced",
    "timeframe": "1h"
  }'

# Poll for status (returns progress %)
curl http://localhost:8085/backtest/{job_id}

# Get full results when complete
curl http://localhost:8085/backtest/{job_id}/results
```
Results are cached — submitting the same configuration twice returns the stored result immediately.
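One common way to implement such a cache is to key stored results by a canonical hash of the job configuration; the sketch below illustrates that pattern, though the engine's actual key scheme is not documented here.

```python
import hashlib
import json

def backtest_cache_key(config: dict) -> str:
    """Illustrative cache key: a SHA-256 of the canonicalized config,
    so identical submissions (regardless of key order) map to the
    same stored result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Sorting keys before hashing means `{"a": 1, "b": 2}` and `{"b": 2, "a": 1}` hit the same cache entry.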
Metrics reported
| Metric | Description |
|---|---|
| Total return | Net P&L over the backtest period |
| Sharpe ratio | Annualized risk-adjusted return (risk-free rate 0%) |
| Sortino ratio | Downside-only risk-adjusted return |
| Calmar ratio | Return divided by maximum drawdown |
| Max drawdown | Largest peak-to-trough decline |
| Max drawdown duration | Longest continuous drawdown period |
| Win rate | Fraction of closed trades with positive P&L |
| Profit factor | Gross profits divided by gross losses |
| Per-signal breakdown | Win rate, P&L, and trade count for each signal type |
| Regime attribution | Performance breakdown by regime (trending, mean-reverting, etc.) |
| Data coverage | Fraction of the backtest window with available market data |
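Two of the metrics above, annualized Sharpe and max drawdown, can be computed from the per-bar return and equity series as follows. This is a standard-formula sketch (0% risk-free rate, as the table states); the annualization factor assumes hourly bars and is an illustrative choice.

```python
import math

def sharpe(returns, periods_per_year=8760):
    """Annualized Sharpe at 0% risk-free rate.
    8760 periods/year assumes hourly bars (24 * 365)."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    std = math.sqrt(var)
    if std == 0:
        return 0.0
    return (mean / std) * math.sqrt(periods_per_year)

def max_drawdown(equity):
    """Largest peak-to-trough decline as a fraction of the peak."""
    peak, worst = equity[0], 0.0
    for x in equity:
        peak = max(peak, x)
        worst = max(worst, (peak - x) / peak)
    return worst
```

Sortino follows the same shape as Sharpe but divides by downside deviation only, and Calmar divides annualized return by the `max_drawdown` value.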
Comparison mode
The dashboard supports comparing up to 5 backtest runs side-by-side. This is used for strategy variant evaluation — the autonomous agent runs parallel backtests on proposed parameter changes and promotes variants that show statistically significant improvement.
Relationship to the autonomous agent
The autonomous agent uses backtesting as the validation step before promoting any strategy variant to live paper trading. The agent generates a variant, runs a backtest over a meaningful historical window, and compares the results against the current baseline. A variant is only promoted if it shows improvement in Sharpe and Sortino with statistical significance.
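One way to operationalize "improvement with statistical significance" is a bootstrap test on the return difference between variant and baseline. The sketch below illustrates that approach; the agent's actual significance test is not specified in this document.

```python
import random

def bootstrap_improvement_pvalue(baseline_rets, variant_rets,
                                 n_boot=2000, seed=0):
    """Illustrative promotion check: resample both return series and
    estimate the probability that the variant's mean return is NOT
    better than the baseline's (a one-sided bootstrap p-value)."""
    rng = random.Random(seed)
    not_better = 0
    for _ in range(n_boot):
        b = [rng.choice(baseline_rets) for _ in baseline_rets]
        v = [rng.choice(variant_rets) for _ in variant_rets]
        if sum(v) / len(v) <= sum(b) / len(b):
            not_better += 1
    return not_better / n_boot
```

A variant would then be promoted only when this p-value clears a threshold (e.g. 0.05) and the Sharpe and Sortino comparisons both favor it, per the criteria above.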
Backtest results are also used to pre-populate historical signal outcome data, which the autonomous agent uses for initial calibration before sufficient live paper trade data has accumulated.