Historical Backtest Engine

The TRADEOS.tech backtest engine evaluates strategy performance against real historical market data. It reuses the same signal pipeline, feasibility gates, and execution simulator that run in paper and live trading — so backtested results reflect how the system actually behaves, not how a separate simulation model thinks it behaves.

What makes a backtest trustworthy

Most backtesting frameworks use simplified models that diverge from the real trading path: separate signal-generation code, weaker risk checks, optimistic fill assumptions. These produce results that look good in testing but underperform in live trading.

TRADEOS.tech avoids this by running the identical signal pipeline in both backtest and live trading. The same Python classes that generate signals in production process the historical OHLCV bars. The same feasibility gates apply the same risk checks. The same sandbox executor simulates fills. The only differences are:

  1. Clock source — backtest uses a ReplayClock that steps through historical timestamps; live uses the real wall clock
  2. Data source — backtest feeds stored OHLCV bars from the database; live receives live market data from exchange WebSocket feeds
  3. Fill path — backtest uses ATR-scaled slippage simulation; live routes to real venue APIs
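The clock-source difference above can be sketched as a small interface that both paths satisfy. This is an illustrative sketch, not the actual TRADEOS.tech classes; the names `Clock`, `WallClock`, and `ReplayClock` are assumptions based on the description.

```python
import time
from dataclasses import dataclass
from typing import Protocol


class Clock(Protocol):
    """Minimal clock interface shared by the backtest and live paths."""
    def now(self) -> float: ...


class WallClock:
    """Live trading: real wall-clock time."""
    def now(self) -> float:
        return time.time()


@dataclass
class ReplayClock:
    """Backtest: steps through stored bar timestamps in order."""
    timestamps: list[float]
    _index: int = 0

    def now(self) -> float:
        return self.timestamps[self._index]

    def advance(self) -> None:
        """Move to the next historical bar (no-op at the end of the window)."""
        if self._index < len(self.timestamps) - 1:
            self._index += 1
```

Because the signal pipeline only ever asks the clock for `now()`, the same pipeline code runs unchanged against either implementation.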

Data sourcing

Historical OHLCV data is sourced from exchange REST APIs (Kraken primary, Coinbase for tail instruments). The data manager is idempotent and gap-filling — running a backtest for any time window automatically fetches and stores any missing bars before the backtest begins.

Each backtest result includes a data_coverage_pct metric showing what fraction of the requested time window had available data. Results with insufficient coverage display a warning on the dashboard.
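A coverage metric like this can be computed by comparing the bars actually present against the bar count the window implies. A minimal sketch, assuming hourly bars and epoch-second timestamps; the function name and signature are illustrative, not the engine's actual API.

```python
def data_coverage_pct(bar_timestamps, window_start, window_end, bar_seconds=3600):
    """Fraction of expected bars in [window_start, window_end) that are present.

    bar_timestamps: epoch seconds of each stored bar.
    bar_seconds: bar duration (3600 for the 1h timeframe).
    """
    expected = int((window_end - window_start) // bar_seconds)
    if expected <= 0:
        return 0.0
    present = sum(1 for t in bar_timestamps if window_start <= t < window_end)
    return min(1.0, present / expected)
```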

Fill modeling

Backtest fills use ATR-scaled slippage: the fill price deviates from the bar's close price by an amount proportional to the instrument's Average True Range at the time of the fill. In volatile markets, slippage is larger. Venue taker fees are applied to every fill.

This produces more conservative backtest P&L than naive next-bar-open or last-traded-price fills — which is the right direction for a trustworthy simulation.
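The ATR-scaled fill model described above can be sketched as follows. The slippage coefficient and the taker fee rate are placeholder assumptions (the source does not state the actual values), and `simulate_fill` is an illustrative name, not the sandbox executor's real interface.

```python
def simulate_fill(side, close_price, atr, qty,
                  slippage_coeff=0.1, taker_fee=0.0026):
    """ATR-scaled slippage fill: buys fill above the bar close, sells below.

    slippage_coeff and taker_fee are assumed values for illustration only.
    Higher ATR (more volatile market) -> larger deviation from close.
    """
    slip = slippage_coeff * atr
    fill_price = close_price + slip if side == "buy" else close_price - slip
    fee = fill_price * qty * taker_fee  # venue taker fee on every fill
    return fill_price, fee
```

Since slippage always moves the fill against the trader, backtest P&L under this model is strictly no better than a last-traded-price fill, which is the conservative bias the section describes.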

Performance

The backtest engine runs in parallel across instruments and streams bars from the database one at a time (no full dataset preload). Multi-instrument, multi-month backtests at hourly resolution complete in seconds — not minutes — due to the parallel streaming architecture.
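The parallel streaming shape can be sketched with one worker per instrument, each consuming bars from a generator rather than a preloaded array. This is a structural sketch under stated assumptions: `stream_bars` stands in for a database cursor, and the empty pipeline step is where signal generation, feasibility gates, and fill simulation would run.

```python
from concurrent.futures import ThreadPoolExecutor


def stream_bars(symbol):
    """Illustrative stand-in for a DB cursor: yields bars one at a time,
    so the full dataset is never held in memory."""
    for i in range(3):
        yield {"symbol": symbol, "close": 100.0 + i}


def backtest_symbol(symbol):
    """Run the pipeline over one instrument's bar stream."""
    pnl = 0.0
    for bar in stream_bars(symbol):
        pnl += 0.0  # signal pipeline, gates, and fill simulation go here
    return symbol, pnl


def run_parallel(symbols):
    """One worker per instrument; results collected as {symbol: pnl}."""
    with ThreadPoolExecutor(max_workers=len(symbols)) as pool:
        return dict(pool.map(backtest_symbol, symbols))
```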

Launching a backtest

Backtests can be launched from the dashboard Backtest tab or via the REST API:

# Start a backtest job
curl -X POST http://localhost:8085/backtest/run \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-09-01",
    "end_date": "2026-03-01",
    "symbols": ["BTCUSDT", "ETHUSDT"],
    "profile": "balanced",
    "timeframe": "1h"
  }'

# Poll for status (returns progress %)
curl http://localhost:8085/backtest/{job_id}

# Get full results when complete
curl http://localhost:8085/backtest/{job_id}/results

Results are cached — submitting the same configuration twice returns the stored result immediately.
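Configuration-keyed caching like this is typically built on a canonical serialization of the request, so that key order and whitespace do not defeat the cache. A minimal sketch; the function name is hypothetical and the source does not specify the actual cache-key scheme.

```python
import hashlib
import json


def backtest_cache_key(config: dict) -> str:
    """Deterministic key: the same configuration (regardless of JSON key
    order) always hashes to the same value, so a repeat submission can
    return the stored result immediately."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```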

Metrics reported

Total return: Net P&L over the backtest period
Sharpe ratio: Annualized risk-adjusted return (risk-free rate 0%)
Sortino ratio: Downside-only risk-adjusted return
Calmar ratio: Return divided by maximum drawdown
Max drawdown: Largest peak-to-trough decline
Max drawdown duration: Longest continuous drawdown period
Win rate: Fraction of closed trades with positive P&L
Profit factor: Gross profits divided by gross losses
Per-signal breakdown: Win rate, P&L, and trade count for each signal type
Regime attribution: Performance breakdown by regime (trending, mean-reverting, etc.)
Data coverage: Fraction of the backtest window with available market data
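Two of the headline metrics can be made concrete from their standard definitions. A minimal sketch assuming hourly per-bar returns and a 0% risk-free rate, as stated for the Sharpe ratio; the annualization factor of 8760 (24 x 365 hourly periods) is an assumption for the 1h timeframe.

```python
import math


def sharpe(returns, periods_per_year=24 * 365):
    """Annualized Sharpe ratio with a 0% risk-free rate.
    returns: per-bar fractional returns (hourly bars -> 8760 periods/year)."""
    n = len(returns)
    mean = sum(returns) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in returns) / n)
    if std == 0:
        return 0.0
    return (mean / std) * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for x in equity:
        peak = max(peak, x)
        worst = max(worst, (peak - x) / peak)
    return worst
```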

Comparison mode

The dashboard supports comparing up to 5 backtest runs side-by-side. This is used for strategy variant evaluation — the autonomous agent runs parallel backtests on proposed parameter changes and promotes variants that show statistically significant improvement.

Relationship to the autonomous agent

The autonomous agent uses backtesting as the validation step before promoting any strategy variant to live paper trading. The agent generates a variant, runs a backtest over a meaningful historical window, and compares the results against the current baseline. A variant is only promoted if it shows improvement in Sharpe and Sortino with statistical significance.
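The promotion decision above can be sketched as a gate on both ratios. This is a simplified illustration: the margin value is an assumption, and the source's statistical-significance test is elided here (a real check would, for example, bootstrap the metric difference rather than apply a fixed margin).

```python
def should_promote(variant, baseline, min_improvement=0.05):
    """Promote a variant only if both Sharpe and Sortino beat the baseline
    by a margin. min_improvement is a placeholder for the actual
    significance test, which the source does not specify."""
    return (variant["sharpe"] >= baseline["sharpe"] * (1 + min_improvement)
            and variant["sortino"] >= baseline["sortino"] * (1 + min_improvement))
```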

Backtest results are also used to pre-populate historical signal outcome data, which the autonomous agent uses for initial calibration before sufficient live paper trade data has accumulated.