# Benchmark Guide

## Generating Benchmark Data

Use the CLI to generate a comprehensive benchmark suite:

```bash
otava-gen generate --output-dir ./benchmark --lengths 50 500 --seed 42
```

This creates:

- CSV files for each test case
- `manifest.json` with metadata about each file
- `summary.json` with overall statistics

## Running Otava

```bash
# Example Otava invocation (adjust based on Otava's actual CLI)
otava analyze --input ./benchmark/0001_step_function_L500.csv
```

## Comparing Algorithms

The `manifest.json` file contains the ground truth for each test case:

```python
import json

# Load the ground-truth metadata written by the generator.
with open("benchmark/manifest.json") as f:
    manifest = json.load(f)

for entry in manifest:
    print(f"{entry['filename']}: {entry['n_change_points']} change points")
    print(f"  Expected indices: {entry['change_point_indices']}")
```

## Metrics

When comparing algorithms, consider:

1. **True Positive Rate**: the percentage of actual change points that are detected
2. **False Positive Rate**: the percentage of non-change-points that are flagged as change points
3. **Location Accuracy**: how close the detected points are to the actual change points
4. **Latency**: how many points pass after a change before it is detected
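
The four metrics above can be sketched in a few lines of Python. This is a minimal illustration, not part of the generator or of Otava: the function name `score_detections`, the `tolerance` parameter (the maximum index distance at which a detection still counts as a hit), and the greedy nearest-match rule are all assumptions made for the example. Latency is approximated here as the offset of matched detections that fall at or after the true change point.

```python
def score_detections(true_cps, detected_cps, n_points, tolerance=5):
    """Score detected change points against ground truth.

    `tolerance` is a hypothetical parameter: the maximum index distance
    at which a detection still counts as a hit for a true change point.
    """
    unmatched = sorted(detected_cps)
    offsets = []  # signed offsets (detected - true) for matched pairs
    for cp in sorted(true_cps):
        # Detections close enough to this true change point.
        candidates = [d for d in unmatched if abs(d - cp) <= tolerance]
        if not candidates:
            continue
        best = min(candidates, key=lambda d: abs(d - cp))
        unmatched.remove(best)  # each detection matches at most one true point
        offsets.append(best - cp)

    n_true = len(true_cps)
    n_negative = n_points - n_true  # positions that are not change points
    late = [o for o in offsets if o >= 0]  # detections at or after the change
    return {
        "true_positive_rate": len(offsets) / n_true if n_true else None,
        "false_positive_rate": len(unmatched) / n_negative if n_negative > 0 else None,
        "mean_location_error": sum(abs(o) for o in offsets) / len(offsets) if offsets else None,
        "mean_latency": sum(late) / len(late) if late else None,
    }


# Made-up example: two true change points in a 500-point series.
scores = score_detections([100, 300], [102, 250, 299], n_points=500)
print(scores)
```

In practice, `true_cps` would come from an entry's `change_point_indices` in `manifest.json`, `n_points` from the number of rows in the corresponding CSV file, and `detected_cps` from whatever the algorithm under test reports. The right `tolerance` depends on the series length and on how much localization error is acceptable for the comparison.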