Benchmark Guide

Generating Benchmark Data

Use the CLI to generate a comprehensive benchmark suite:

otava-gen generate --output-dir ./benchmark --lengths 50 500 --seed 42

This creates:

  • CSV files for each test case

  • manifest.json with metadata about each file

  • summary.json with overall statistics

Running Otava

# Example Otava invocation; the subcommand and flag shown are illustrative, so adjust them to match Otava's actual CLI
otava analyze --input ./benchmark/0001_step_function_L500.csv

Comparing Algorithms

The manifest.json file records the ground-truth change points for each test case, which you can compare against each algorithm's detections:

import json

with open("benchmark/manifest.json") as f:
    manifest = json.load(f)

for entry in manifest:
    print(f"{entry['filename']}: {entry['n_change_points']} change points")
    print(f"  Expected indices: {entry['change_point_indices']}")
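
Before scoring any algorithm against the manifest, it can help to confirm that the ground truth is internally consistent. The following is a minimal sketch, not part of the generator: check_manifest is a hypothetical helper, and it assumes that n_change_points should equal the length of change_point_indices and that the indices are stored in ascending order without duplicates. Only the three field names come from the loop above.

```python
from typing import Dict, List


def check_manifest(manifest: List[Dict]) -> List[str]:
    """Return a list of human-readable problems found in the manifest.

    Hypothetical helper. Assumes each entry has the fields used above:
    'filename', 'n_change_points', and 'change_point_indices'.
    """
    problems = []
    for entry in manifest:
        name = entry["filename"]
        indices = list(entry["change_point_indices"])

        # Assumption: the count field mirrors the length of the index list.
        if entry["n_change_points"] != len(indices):
            problems.append(
                f"{name}: n_change_points={entry['n_change_points']} "
                f"but {len(indices)} indices listed"
            )

        # Assumption: indices are ascending and unique.
        if indices != sorted(set(indices)):
            problems.append(f"{name}: indices are not sorted and unique")
    return problems
```

Calling check_manifest(manifest) on the list loaded above returns an empty list when every entry passes both checks.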

Metrics

When comparing algorithms, consider:

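  1. True Positive Rate: the percentage of actual change points that the algorithm detects

  2. False Positive Rate: the percentage of points that are not change points but are flagged as such

  3. Location Accuracy: how far, in data points, each detected change point lies from the actual change point it corresponds to

  4. Latency: how many data points pass after a change before the algorithm detects it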
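
The four metrics can be computed from the ground-truth indices and an algorithm's detected indices. The sketch below is one possible formulation under stated assumptions, not a definition taken from Otava or the generator: the function names, the tolerance parameter (a detection counts as a hit if it lies within that many points of a true change point), and the greedy nearest-match pairing are all illustrative choices. Rates are returned as fractions rather than percentages, and latency is taken as the mean delay over matched detections that occur at or after the true change.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def match_change_points(
    true_idx: Sequence[int], detected_idx: Sequence[int], tolerance: int = 5
) -> List[Tuple[int, int]]:
    """Pair each true change point with the nearest unused detection
    that lies within `tolerance` points (greedy, in index order)."""
    used = set()
    matches = []
    for t in sorted(set(true_idx)):
        best = None
        for d in sorted(set(detected_idx)):
            if d in used or abs(d - t) > tolerance:
                continue
            if best is None or abs(d - t) < abs(best - t):
                best = d
        if best is not None:
            used.add(best)
            matches.append((t, best))
    return matches


def evaluate(
    true_idx: Sequence[int],
    detected_idx: Sequence[int],
    n_points: int,
    tolerance: int = 5,
) -> Dict[str, Optional[float]]:
    """Compute the four comparison metrics for one test case.

    A metric is None when it is undefined (for example, no matches).
    """
    matches = match_change_points(true_idx, detected_idx, tolerance)
    n_true = len(set(true_idx))
    tp = len(matches)
    fp = len(set(detected_idx)) - tp       # detections with no true match
    negatives = n_points - n_true          # points that are not change points
    offsets = [d - t for t, d in matches]  # signed detection offsets
    delays = [o for o in offsets if o >= 0]
    return {
        "true_positive_rate": tp / n_true if n_true else None,
        "false_positive_rate": fp / negatives if negatives > 0 else None,
        "mean_abs_location_error": (
            sum(abs(o) for o in offsets) / len(offsets) if offsets else None
        ),
        "mean_latency": sum(delays) / len(delays) if delays else None,
    }
```

For a manifest entry, true_idx would be entry['change_point_indices'] and n_points the number of rows in the corresponding CSV file. The tolerance should be chosen relative to the series length, since a window that is too wide turns false positives into hits.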