Historical Backtest Engine

The TRADEOS.tech backtest engine evaluates strategy performance against real historical market data. It reuses the same signal pipeline, feasibility gates, and execution simulator that run in paper and live trading — so backtested results reflect how the system actually behaves, not how a separate simulation model thinks it behaves.

What makes a backtest trustworthy

Most backtesting frameworks use simplified models that diverge from the real trading path: separate signal-generation code, weaker risk checks, optimistic fill assumptions. These produce results that look good in testing but underperform in live trading.

TRADEOS.tech avoids this by running the identical signal pipeline in both backtest and live trading. The same Python classes that generate signals in production process the historical OHLCV bars. The same feasibility gates apply the same risk checks. The same sandbox executor simulates fills. The only differences are:

  1. Clock source — backtest uses a ReplayClock that steps through historical timestamps; live uses the real wall clock
  2. Data source — backtest feeds stored OHLCV bars from the database; live receives live market data from exchange WebSocket feeds
  3. Fill path — backtest uses ATR-scaled slippage simulation; live routes to real venue APIs
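The clock-source difference above can be sketched as a small interface that both paths satisfy. This is an illustrative sketch, not the actual TRADEOS.tech classes; the names `Clock`, `WallClock`, and `ReplayClock` are assumptions based on the description.

```python
import time
from dataclasses import dataclass
from typing import Protocol


class Clock(Protocol):
    """Minimal clock interface shared by the backtest and live paths."""
    def now(self) -> float: ...


class WallClock:
    """Live trading: real wall-clock time."""
    def now(self) -> float:
        return time.time()


@dataclass
class ReplayClock:
    """Backtest: steps through stored bar timestamps in order."""
    timestamps: list[float]
    _index: int = 0

    def now(self) -> float:
        return self.timestamps[self._index]

    def advance(self) -> None:
        """Move to the next historical bar (no-op at the end of the window)."""
        if self._index < len(self.timestamps) - 1:
            self._index += 1
```

Because the signal pipeline only ever asks the clock for `now()`, the same pipeline code runs unchanged against either implementation.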

Data sourcing

Historical OHLCV data is sourced from exchange REST APIs (Kraken primary, Coinbase for tail instruments). The data manager is idempotent and gap-filling — running a backtest for any time window automatically fetches and stores any missing bars before the backtest begins.

Each backtest result includes a data_coverage_pct metric showing what fraction of the requested time window had available data. Results with insufficient coverage display a warning on the dashboard.
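A coverage metric like this can be computed by comparing the bars actually present against the bar count the window implies. A minimal sketch, assuming hourly bars and epoch-second timestamps; the function name and signature are illustrative, not the engine's actual API.

```python
def data_coverage_pct(bar_timestamps, window_start, window_end, bar_seconds=3600):
    """Fraction of expected bars in [window_start, window_end) that are present.

    bar_timestamps: epoch seconds of each stored bar.
    bar_seconds: bar duration (3600 for the 1h timeframe).
    """
    expected = int((window_end - window_start) // bar_seconds)
    if expected <= 0:
        return 0.0
    present = sum(1 for t in bar_timestamps if window_start <= t < window_end)
    return min(1.0, present / expected)
```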

Fill modeling

Backtest fills use ATR-scaled slippage: the fill price deviates from the bar's close price by an amount proportional to the instrument's Average True Range at the time of the fill. In volatile markets, slippage is larger. Venue taker fees are applied to every fill.

This produces more conservative backtest P&L than naive next-bar-open or last-traded-price fills — which is the right direction for a trustworthy simulation.
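The ATR-scaled fill model described above can be sketched as follows. The slippage coefficient and the taker fee rate are placeholder assumptions (the source does not state the actual values), and `simulate_fill` is an illustrative name, not the sandbox executor's real interface.

```python
def simulate_fill(side, close_price, atr, qty,
                  slippage_coeff=0.1, taker_fee=0.0026):
    """ATR-scaled slippage fill: buys fill above the bar close, sells below.

    slippage_coeff and taker_fee are assumed values for illustration only.
    Higher ATR (more volatile market) -> larger deviation from close.
    """
    slip = slippage_coeff * atr
    fill_price = close_price + slip if side == "buy" else close_price - slip
    fee = fill_price * qty * taker_fee  # venue taker fee on every fill
    return fill_price, fee
```

Since slippage always moves the fill against the trader, backtest P&L under this model is strictly no better than a last-traded-price fill, which is the conservative bias the section describes.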

Performance

The backtest engine runs in parallel across instruments and streams bars from the database one at a time (no full dataset preload). Multi-instrument, multi-month backtests at hourly resolution complete in seconds — not minutes — due to the parallel streaming architecture.
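The parallel streaming shape can be sketched with one worker per instrument, each consuming bars from a generator rather than a preloaded array. This is a structural sketch under stated assumptions: `stream_bars` stands in for a database cursor, and the empty pipeline step is where signal generation, feasibility gates, and fill simulation would run.

```python
from concurrent.futures import ThreadPoolExecutor


def stream_bars(symbol):
    """Illustrative stand-in for a DB cursor: yields bars one at a time,
    so the full dataset is never held in memory."""
    for i in range(3):
        yield {"symbol": symbol, "close": 100.0 + i}


def backtest_symbol(symbol):
    """Run the pipeline over one instrument's bar stream."""
    pnl = 0.0
    for bar in stream_bars(symbol):
        pnl += 0.0  # signal pipeline, gates, and fill simulation go here
    return symbol, pnl


def run_parallel(symbols):
    """One worker per instrument; results collected as {symbol: pnl}."""
    with ThreadPoolExecutor(max_workers=len(symbols)) as pool:
        return dict(pool.map(backtest_symbol, symbols))
```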

Launching a backtest

Backtests can be launched from the dashboard Backtest tab or via the REST API:

# Start a backtest job
curl -X POST http://localhost:8085/backtest/run \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-09-01",
    "end_date": "2026-03-01",
    "symbols": ["BTCUSDT", "ETHUSDT"],
    "profile": "balanced",
    "timeframe": "1h"
  }'

# Poll for status (returns progress %)
curl http://localhost:8085/backtest/{job_id}

# Get full results when complete
curl http://localhost:8085/backtest/{job_id}/results

Results are cached — submitting the same configuration twice returns the stored result immediately.
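Configuration-keyed caching like this is typically built on a canonical serialization of the request, so that key order and whitespace do not defeat the cache. A minimal sketch; the function name is hypothetical and the source does not specify the actual cache-key scheme.

```python
import hashlib
import json


def backtest_cache_key(config: dict) -> str:
    """Deterministic key: the same configuration (regardless of JSON key
    order) always hashes to the same value, so a repeat submission can
    return the stored result immediately."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```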

Metrics reported

Total return: Net P&L over the backtest period
Sharpe ratio: Annualized risk-adjusted return (risk-free rate 0%)
Sortino ratio: Downside-only risk-adjusted return
Calmar ratio: Return divided by maximum drawdown
Max drawdown: Largest peak-to-trough decline
Max drawdown duration: Longest continuous drawdown period
Win rate: Fraction of closed trades with positive P&L
Profit factor: Gross profits divided by gross losses
Per-signal breakdown: Win rate, P&L, and trade count for each signal type
Regime attribution: Performance breakdown by regime (trending, mean-reverting, etc.)
Data coverage: Fraction of the backtest window with available market data
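Two of the headline metrics can be made concrete from their standard definitions. A minimal sketch assuming hourly per-bar returns and a 0% risk-free rate, as stated for the Sharpe ratio; the annualization factor of 8760 (24 x 365 hourly periods) is an assumption for the 1h timeframe.

```python
import math


def sharpe(returns, periods_per_year=24 * 365):
    """Annualized Sharpe ratio with a 0% risk-free rate.
    returns: per-bar fractional returns (hourly bars -> 8760 periods/year)."""
    n = len(returns)
    mean = sum(returns) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in returns) / n)
    if std == 0:
        return 0.0
    return (mean / std) * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for x in equity:
        peak = max(peak, x)
        worst = max(worst, (peak - x) / peak)
    return worst
```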

Comparison mode

The dashboard supports comparing up to 5 backtest runs side-by-side. This is used for strategy variant evaluation — the autonomous agent runs parallel backtests on proposed parameter changes and promotes variants that show statistically significant improvement.

Relationship to the autonomous agent

The autonomous agent uses backtesting as the validation step before promoting any strategy variant to live paper trading. The agent generates a variant, runs a backtest over a meaningful historical window, and compares the results against the current baseline. A variant is only promoted if it shows improvement in Sharpe and Sortino with statistical significance.
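The promotion decision above can be sketched as a gate on both ratios. This is a simplified illustration: the margin value is an assumption, and the source's statistical-significance test is elided here (a real check would, for example, bootstrap the metric difference rather than apply a fixed margin).

```python
def should_promote(variant, baseline, min_improvement=0.05):
    """Promote a variant only if both Sharpe and Sortino beat the baseline
    by a margin. min_improvement is a placeholder for the actual
    significance test, which the source does not specify."""
    return (variant["sharpe"] >= baseline["sharpe"] * (1 + min_improvement)
            and variant["sortino"] >= baseline["sortino"] * (1 + min_improvement))
```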

Backtest results are also used to pre-populate historical signal outcome data, which the autonomous agent uses for initial calibration before sufficient live paper trade data has accumulated.