Phase 10 — Backtest Reporting

Summary

Phase 10 turns the backtest from a single-shot smoke run into a serious reporting tool. After this phase, a Backtest.run() call produces a structured result with the equity curve over time (configurable cadence, default candle-close), per-strategy attribution, and the metrics every quant report carries: profit factor, Sharpe, Calmar, win/loss stats. The data is available in memory for programmatic comparison and is written to disk as JSON + CSV for charting in any external tool.

The phase is deliberately scoped to measurement, not iteration. Parameter sweeps and walk-forward analysis are larger features that build on what shipped here; both are deferred to a later phase.

What's new

  • com.qkt.backtest package — Backtest, BacktestResult, TradeRecord moved here from com.qkt.app.
  • SampleCadence enum — TICK, CANDLE_CLOSE, FILL. New cadence parameter on Backtest. Default resolves to CANDLE_CLOSE when candleWindow is set, else TICK.
  • EquitySample(timestamp, equity) — single point on an equity curve.
  • EquityCurveCollector — subscribes to the bus at the chosen cadence, exposes global and per-strategy curves.
  • PerformanceReport — full metric bundle: realized/unrealized/total P&L, trade count, win rate, fractional max drawdown, profit factor, avg/largest win+loss, max consecutive losses, Sharpe ratio, Calmar ratio, equity curve.
  • BacktestResult.global: PerformanceReport and BacktestResult.perStrategy: Map<String, PerformanceReport> — replaces the old flat fields.
  • com.qkt.backtest.metrics — pure-function metrics: profitFactor, winLossStats, sharpe, calmar.
  • BacktestReportWriter(dir) — emits result.json, equity_global.csv, equity_<strategyId>.csv, trades.csv, rejections.csv.
  • TradingCalendar.tradingPeriodsPerYear(window) — calendar-aware annualization factor for Sharpe; crypto impl provided.
  • DrawdownTracker.fromCurve(samples) — pure drawdown computation, used by both backtest and any future curve-based caller.
  • TradeRecord.strategyId — every trade now carries its originating strategy id.
  • TradingPipeline.onFilled — callback signature now (Trade, BigDecimal, String) -> Unit, where the third arg is the strategyId.
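
To make the win/loss fields of PerformanceReport concrete, here is a minimal sketch of how such statistics could be derived from a list of per-fill realized P&L values. It is not the shipped winLossStats: the names WinLossSketch and winLossSketch, the 8-digit rounding scale, and the choice to count zero-P&L fills in the win-rate denominator are all assumptions made for illustration.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Hypothetical result holder; the real winLossStats return type may differ.
data class WinLossSketch(
    val wins: Int,
    val losses: Int,
    val winRate: BigDecimal?,      // null when there are no trades
    val averageWin: BigDecimal?,   // null when there are no wins
    val averageLoss: BigDecimal?,  // null when there are no losses
    val largestWin: BigDecimal?,
    val largestLoss: BigDecimal?,  // most negative realized value
    val maxConsecutiveLosses: Int,
)

fun winLossSketch(realizeds: List<BigDecimal>): WinLossSketch {
    val wins = realizeds.filter { it.signum() > 0 }
    val losses = realizeds.filter { it.signum() < 0 }

    // Arithmetic mean at an assumed scale of 8 digits.
    fun mean(xs: List<BigDecimal>): BigDecimal? =
        if (xs.isEmpty()) null
        else xs.fold(BigDecimal.ZERO) { acc, x -> acc + x }
            .divide(BigDecimal(xs.size), 8, RoundingMode.HALF_EVEN)

    // Longest run of strictly negative fills; a zero or positive fill resets the run.
    var streak = 0
    var worstStreak = 0
    for (r in realizeds) {
        streak = if (r.signum() < 0) streak + 1 else 0
        worstStreak = maxOf(worstStreak, streak)
    }

    val winRate =
        if (realizeds.isEmpty()) null
        else BigDecimal(wins.size).divide(BigDecimal(realizeds.size), 8, RoundingMode.HALF_EVEN)

    return WinLossSketch(
        wins = wins.size,
        losses = losses.size,
        winRate = winRate,
        averageWin = mean(wins),
        averageLoss = mean(losses),
        largestWin = wins.maxOrNull(),
        largestLoss = losses.minOrNull(),
        maxConsecutiveLosses = worstStreak,
    )
}
```

Nullable fields keep "no data" distinct from a genuine zero, mirroring the way profitFactor is documented to return null when there are no losses.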

Migration from previous phase

| Before | After |
| --- | --- |
| import com.qkt.app.Backtest | import com.qkt.backtest.Backtest |
| import com.qkt.app.BacktestResult | import com.qkt.backtest.BacktestResult |
| import com.qkt.app.TradeRecord | import com.qkt.backtest.TradeRecord |
| result.totalPnL | result.global.totalPnL |
| result.realizedTotal | result.global.realizedTotal |
| result.unrealizedTotal | result.global.unrealizedTotal |
| result.tradeCount | result.global.tradeCount |
| result.winRate | result.global.winRate |
| result.maxDrawdown (absolute money) | result.global.maxDrawdown (FRACTIONAL, Phase 9 convention) |
| TradeRecord(trade, realized) | TradeRecord(trade, realized, strategyId) |
| onFilled = { trade, realized -> ... } | onFilled = { trade, realized, strategyId -> ... } |

The biggest semantic change is that drawdown is now fractional. Tests that asserted absolute-money drawdown values must update both the assertion and the test setup so that the equity curve has a positive peak before the dip: fractional drawdown is undefined when the peak is non-positive, and the computation returns zero in that case.
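
The fractional convention can be illustrated with a small sketch: track the running peak and record the largest (peak - equity) / peak seen so far, skipping samples for which no positive peak exists yet. This is an illustration only, not the body of DrawdownTracker.fromCurve; the function name and the 8-digit rounding scale are assumptions.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Largest peak-to-trough decline expressed as a fraction of the peak.
// Starting the peak at zero means samples seen before any positive peak
// contribute nothing, so a curve that never rises above zero yields 0.
fun fractionalMaxDrawdownSketch(equity: List<BigDecimal>): BigDecimal {
    var peak = BigDecimal.ZERO
    var worst = BigDecimal.ZERO
    for (e in equity) {
        if (e > peak) peak = e
        if (peak.signum() > 0) {
            val drawdown = (peak - e).divide(peak, 8, RoundingMode.HALF_EVEN)
            if (drawdown > worst) worst = drawdown
        }
    }
    return worst
}
```

For an equity curve of 0, 20, 10 the peak is 20 and the dip is 10, giving (20 - 10) / 20 = 0.5. A curve of 0, -10 never establishes a positive peak, so the sketch returns 0.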

Usage cookbook

Default backtest (candle-close cadence)

import com.qkt.backtest.Backtest
import com.qkt.candles.TimeWindow

val backtest = Backtest(
    strategies = listOf("ema-cross" to MyStrategy()),
    ticks = historicalTicks,
    candleWindow = TimeWindow.ONE_MINUTE,
    // cadence defaults to CANDLE_CLOSE because candleWindow is set
)
val result = backtest.run()
println("Total P&L: ${result.global.totalPnL}")
println("Sharpe: ${result.global.sharpeRatio}")
println("Max drawdown: ${result.global.maxDrawdown}")

Tick-cadence backtest (diagnostic resolution)

import com.qkt.backtest.SampleCadence

val backtest = Backtest(
    strategies = listOf("scalper" to MyScalper()),
    ticks = historicalTicks,
    cadence = SampleCadence.TICK,
)

If you omit candleWindow and don't pass cadence, the default resolves to TICK automatically.
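
The default rule reads naturally as a small function. The sketch below uses stand-in enums so that it compiles on its own; SampleCadenceSketch, TimeWindowSketch, and resolveCadenceSketch are hypothetical names, and the assumption that an explicitly passed cadence always takes precedence is mine rather than something stated above.

```kotlin
// Stand-ins for com.qkt.backtest.SampleCadence and com.qkt.candles.TimeWindow.
enum class SampleCadenceSketch { TICK, CANDLE_CLOSE, FILL }
enum class TimeWindowSketch { ONE_MINUTE }

// An explicit cadence is assumed to win; otherwise the default depends on
// whether a candle window was configured.
fun resolveCadenceSketch(
    explicit: SampleCadenceSketch?,
    candleWindow: TimeWindowSketch?,
): SampleCadenceSketch =
    explicit
        ?: if (candleWindow != null) SampleCadenceSketch.CANDLE_CLOSE
        else SampleCadenceSketch.TICK
```

The sketch does not cover what happens when CANDLE_CLOSE is requested without a candleWindow, because the behaviour for that combination is not documented here.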

Per-strategy comparison

val result = Backtest(
    strategies = listOf(
        "trend" to TrendStrategy(),
        "meanrev" to MeanReversionStrategy(),
    ),
    ticks = historicalTicks,
    candleWindow = TimeWindow.ONE_MINUTE,
).run()

for ((id, report) in result.perStrategy) {
    println("$id: PnL=${report.totalPnL}, Sharpe=${report.sharpeRatio}, drawdown=${report.maxDrawdown}")
}
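
Per-strategy attribution is possible because every TradeRecord carries its strategyId. As a rough illustration of the idea (not the engine's actual aggregation), realized P&L could be bucketed like this; TradeRecordSketch is a hypothetical stand-in reduced to the two fields the example needs.

```kotlin
import java.math.BigDecimal

// Hypothetical stand-in for TradeRecord; the real class also carries the trade.
data class TradeRecordSketch(val strategyId: String, val realized: BigDecimal)

// Sum realized P&L per originating strategy.
fun realizedByStrategySketch(trades: List<TradeRecordSketch>): Map<String, BigDecimal> =
    trades
        .groupBy { it.strategyId }
        .mapValues { (_, records) ->
            records.fold(BigDecimal.ZERO) { acc, record -> acc + record.realized }
        }
```

A strategy that produced no fills does not appear in the resulting map, so callers comparing strategies may want to treat a missing key as zero.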

Writing reports to disk

import com.qkt.backtest.report.BacktestReportWriter
import java.nio.file.Files
import java.nio.file.Paths

val dir = Paths.get("./reports/run-2026-05-07")
Files.createDirectories(dir)
BacktestReportWriter(dir).write(result)
// Files: result.json, equity_global.csv, equity_<strategyId>.csv, trades.csv, rejections.csv
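
For orientation, the equity CSVs can be pictured as two-column files. The pandas recipe that follows reads columns named timestamp (epoch milliseconds) and equity, so the sketch below writes that shape. It is an assumption-laden illustration rather than the writer's real code: EquitySampleSketch, equityCsvLines, and writeEquityCsvSketch are hypothetical names, and the exact number formatting of the shipped writer is not specified here.

```kotlin
import java.math.BigDecimal
import java.nio.file.Files
import java.nio.file.Path

// Stand-in for EquitySample(timestamp, equity); timestamp assumed to be epoch ms.
data class EquitySampleSketch(val timestamp: Long, val equity: BigDecimal)

// Header row followed by one "timestamp,equity" row per sample.
// toPlainString avoids scientific notation in the output.
fun equityCsvLines(samples: List<EquitySampleSketch>): List<String> =
    listOf("timestamp,equity") +
        samples.map { "${it.timestamp},${it.equity.toPlainString()}" }

// Writes the lines as UTF-8; an IOException propagates to the caller.
fun writeEquityCsvSketch(file: Path, samples: List<EquitySampleSketch>) {
    Files.write(file, equityCsvLines(samples))
}
```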

Charting an equity curve in pandas

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("./reports/run-2026-05-07/equity_global.csv")
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="ms")
df.plot(x="timestamp", y="equity", title="Equity curve")
plt.show()

Inspecting metrics directly

import com.qkt.backtest.metrics.profitFactor
import com.qkt.backtest.metrics.sharpe
import com.qkt.candles.TimeWindow
import com.qkt.common.TradingCalendar

val realizeds = result.trades.map { it.realized }
val pf = profitFactor(realizeds)  // BigDecimal? — null when no losses
val annualization = TradingCalendar.crypto().tradingPeriodsPerYear(TimeWindow.ONE_MINUTE)
val sharpeRatio = sharpe(result.global.equityCurve.map { it.equity }, annualization)
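
As a rough guide to what the two calls above compute, the sketch below pairs a 24/7 annualization factor with a textbook Sharpe calculation. Several things here are assumptions rather than documented behaviour: a 365-day year, a zero risk-free rate, per-period returns taken as absolute equity changes (the engine has no initial-capital concept, so percentage returns are not obviously defined), sample standard deviation, and Double arithmetic in place of BigDecimal. The function names are hypothetical.

```kotlin
import kotlin.math.sqrt

// Sampling periods per year for a market that never closes,
// assuming a 365-day year.
fun cryptoPeriodsPerYearSketch(windowMinutes: Long): Double =
    365.0 * 24 * 60 / windowMinutes

// Annualized Sharpe on period-over-period equity changes, risk-free rate zero.
// Returns null when fewer than two changes exist or the changes have no variance.
fun sharpeSketch(equity: List<Double>, periodsPerYear: Double): Double? {
    val changes = equity.zipWithNext { previous, current -> current - previous }
    if (changes.size < 2) return null
    val mean = changes.average()
    val variance = changes.sumOf { (it - mean) * (it - mean) } / (changes.size - 1)
    if (variance == 0.0) return null
    return mean / sqrt(variance) * sqrt(periodsPerYear)
}
```

With one-minute candles the factor is 365 × 24 × 60 = 525,600 periods per year, so the per-period ratio is scaled by the square root of that number.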

Testing patterns

The metrics are pure functions — test them with literal BigDecimal inputs:

@Test
fun `profitFactor on mixed list`() {
    val realizeds = listOf(BigDecimal("10"), BigDecimal("-5"), BigDecimal("20"))
    assertThat(profitFactor(realizeds)).isEqualByComparingTo(BigDecimal("6.0"))
}
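
The expected value in that test follows from the usual definition of profit factor, gross profit divided by gross loss: (10 + 20) / 5 = 6. A minimal sketch consistent with that definition and with the documented null-when-no-losses behaviour is shown below; the function name and the 8-digit rounding scale are assumptions, and the shipped profitFactor may differ in detail.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Gross profit divided by gross loss; null when there are no losing fills,
// because the ratio is then undefined.
fun profitFactorSketch(realizeds: List<BigDecimal>): BigDecimal? {
    val grossProfit = realizeds
        .filter { it.signum() > 0 }
        .fold(BigDecimal.ZERO) { acc, x -> acc + x }
    val grossLoss = realizeds
        .filter { it.signum() < 0 }
        .fold(BigDecimal.ZERO) { acc, x -> acc + x.abs() }
    if (grossLoss.signum() == 0) return null
    return grossProfit.divide(grossLoss, 8, RoundingMode.HALF_EVEN)
}
```

Comparing with isEqualByComparingTo (or compareTo) rather than equals matters here, since BigDecimal equality is scale-sensitive and 6.00000000 is not equal to 6.0 under equals.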

End-to-end tests use real Backtest runs with deterministic fixture ticks:

val result = Backtest(
    strategies = listOf("s1" to fixtureStrategy),
    ticks = listOf(Tick("X", Money.of("100"), 1_000L), ...),
    candleWindow = TimeWindow.ONE_MINUTE,
).run()
assertThat(result.global.equityCurve).hasSize(expectedCandleCount)
assertThat(result.perStrategy["s1"]!!.equityCurve).isNotEmpty()

Tests that assert drawdown must construct an equity curve with a positive peak before the dip, since fractional drawdown returns 0 when no positive peak exists:

@Test
fun `drawdown captures unrealized swings on open positions`() {
    // Buy at 100; price rises to 120 (equity peaks at +20), then drops to 110 (equity falls to +10)
    // fractional drawdown = (peak - trough) / peak = (20 - 10) / 20 = 0.5
    val result = Backtest(...).run()
    assertThat(result.global.maxDrawdown).isEqualByComparingTo(BigDecimal("0.5"))
}

The report writer test uses JUnit 5's @TempDir:

@Test
fun `writer emits expected files`(@TempDir dir: Path) {
    // `result` is a BacktestResult produced by a fixture run, as in the end-to-end test above
    BacktestReportWriter(dir).write(result)
    assertThat(dir.resolve("result.json")).exists()
}

Known limitations

  • No parameter sweep / grid search. Deferred to a future phase.
  • No walk-forward analysis. Also deferred to a future phase.
  • No HTML report. JSON + CSV only; HTML belongs to a presentation phase after the DSL.
  • No "total return %" or CAGR. Both require an initial-capital concept the engine doesn't have.
  • No round-trip / hold-time metrics. Inferring "completed trades" from a fill stream is ambiguous with scale-in/out; per-fill realized P&L is used as the proxy.
  • TICK / FILL Sharpe is approximate. Annualization for irregular sample spacing uses the run-average interval; the result.cadence field tells consumers which mode produced the curve.
  • Sortino, Ulcer, recovery factor — not shipped. Add only with a concrete demand.
  • No transactional writer. If a CSV write fails after the JSON wrote, the directory contains a partial result. Caller decides how to handle IOException.
  • JSON serializer is hand-rolled. No Jackson / kotlinx.serialization dependency added. Adequate for BigDecimal + ASCII identifiers; not stressed against arbitrary string content.
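
To show why TICK / FILL Sharpe is only approximate, here is one way a run-average annualization factor could be computed from irregular sample timestamps. This is a sketch under stated assumptions (epoch-millisecond timestamps, a 365-day year, a hypothetical function name), not the engine's code.

```kotlin
// Approximate sampling periods per year from irregularly spaced timestamps:
// divide the length of a year by the average gap between consecutive samples.
// Returns null when fewer than two samples exist or the run has no duration.
fun runAveragePeriodsPerYearSketch(timestampsMs: List<Long>): Double? {
    // Milliseconds in a 365-day year; the day count is an assumption.
    val msPerYear = 365.0 * 24 * 60 * 60 * 1000
    if (timestampsMs.size < 2) return null
    val spanMs = timestampsMs.last() - timestampsMs.first()
    if (spanMs <= 0L) return null
    val averageIntervalMs = spanMs.toDouble() / (timestampsMs.size - 1)
    return msPerYear / averageIntervalMs
}
```

Because only the first and last timestamps and the sample count enter the calculation, a burst of fills followed by a long quiet stretch yields the same factor as evenly spaced samples, which is the source of the approximation.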

References

  • Spec: docs/superpowers/specs/2026-05-07-trading-engine-phase10-design.md
  • Plan: docs/superpowers/plans/2026-05-07-trading-engine-phase10.md
  • Merge commit: 634b2e3