Replay System
Understand how deterministic replay works and how to use it for testing and debugging AI agents.
The RunLog AI replay system captures all the inputs, randomness, and external API responses from your agent runs. This allows you to replay any historical run with identical results, making debugging and testing much more reliable.
Replay is deterministic: running the same replay multiple times will always produce identical results, even if the original run involved randomness or external API calls.
How Replay Works
1. Capture Phase
During original execution, RunLog captures all inputs, random seeds, timestamps, and external API responses.
2. Isolation
Replay runs in an isolated environment that prevents any side effects or external API calls.
3. Deterministic Execution
The agent re-executes using the captured inputs and responses, producing identical results.
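The three phases above can be sketched with a toy recorder. This is a hypothetical illustration of the capture/replay mechanism, not the RunLog implementation: external calls are logged during capture, and during replay the logged responses are returned instead of making real calls, so even random values come back identical.

```python
import random

class Recorder:
    """Toy record/replay recorder (illustrative, not the RunLog API)."""

    def __init__(self):
        self.log = []        # captured external responses, in call order
        self.cursor = 0
        self.replaying = False

    def external_call(self, fn):
        """Route an external call through the recorder.

        Capture mode: run the real call and log its response.
        Replay mode: return the logged response; the real call never runs.
        """
        if self.replaying:
            response = self.log[self.cursor]
            self.cursor += 1
            return response
        response = fn()
        self.log.append(response)
        return response

    def start_replay(self):
        self.replaying = True
        self.cursor = 0

def agent_step(rec):
    # Stand-in for a nondeterministic dependency (e.g. an LLM or API call).
    return rec.external_call(lambda: random.random())

rec = Recorder()
original = [agent_step(rec) for _ in range(3)]  # 1. capture phase

rec.start_replay()                              # 2. isolated replay
replayed = [agent_step(rec) for _ in range(3)]  # 3. deterministic execution

assert replayed == original  # identical results, no real external calls
```

The key design point is that replay consumes the log in the same order the calls were captured, which is why the original run's randomness and API responses reproduce exactly.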
Basic Replay
```python
# Replay a specific run
rl.replay(run_id="run_abc123")

# Replay with modified policies
rl.replay(
    run_id="run_abc123",
    policies=["new_policy.yaml"]
)
```
Batch Replay
```python
# Replay multiple runs
results = rl.replay_batch(
    date_range="2024-01-01:2024-01-07",
    filters={"status": "failed"},
    policies=["updated_policy.yaml"]
)
print(f"Replayed {len(results)} runs")
print(f"Success rate: {results.success_rate}")
```
Policy Impact
See how many actions would be blocked, approved, or modified by new policies.
Cost Analysis
Calculate potential cost savings or increases from policy changes.
Behavior Changes
Identify how agent behavior would change with different configurations.
Success Rates
Compare success rates between original runs and replayed versions.
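One way to quantify policy impact is to diff each original action against its replayed counterpart. The sketch below is hypothetical (the function and the action encoding are illustrative, not part of the RunLog API): it counts actions the new policy would block, modify, or leave unchanged.

```python
# Hypothetical sketch: measure policy impact by diffing original vs replayed actions.
# Convention (illustrative): None means the new policy blocked the action.
def policy_impact(original_actions, replayed_actions):
    """Count actions whose outcome changed under the new policy."""
    blocked = modified = unchanged = 0
    for orig, new in zip(original_actions, replayed_actions):
        if new is None:
            blocked += 1
        elif new != orig:
            modified += 1
        else:
            unchanged += 1
    return {"blocked": blocked, "modified": modified, "unchanged": unchanged}

original = ["send_email", "delete_file", "query_db"]
replayed = ["send_email", None, "query_db_readonly"]  # policy blocks one, rewrites one
print(policy_impact(original, replayed))
# {'blocked': 1, 'modified': 1, 'unchanged': 1}
```

The same pairwise comparison extends naturally to cost deltas and success-rate comparisons by aggregating per-action metrics instead of counts.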
- Always test policy changes with replay before deploying to production
- Use representative date ranges that include various scenarios and edge cases
- Pay attention to both blocked actions and unintended side effects
- Replay failed runs when debugging to understand root causes
- Use batch replay to validate changes across large datasets
- Monitor replay performance to ensure changes don't impact latency
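The first practice, testing policy changes before deploying, can be automated as a pre-deploy gate. This is a hedged sketch with illustrative names and thresholds: it compares the original success rate with the rate observed when replaying the same runs under the candidate policy, and blocks deployment on a regression.

```python
# Hypothetical pre-deploy gate: block the rollout if replaying recent runs
# under the new policy lowers the success rate beyond a small tolerance.
# The 2% tolerance is an illustrative choice, not a RunLog default.
def gate_policy_change(original_success_rate, replayed_success_rate, tolerance=0.02):
    """Return True if the policy change is safe to deploy."""
    return replayed_success_rate >= original_success_rate - tolerance

assert gate_policy_change(0.95, 0.94) is True    # within tolerance: deploy
assert gate_policy_change(0.95, 0.90) is False   # regression: block deploy
```

Wired into CI, such a gate turns replay from a manual debugging aid into a standing regression check on every policy change.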