Replay System
Understand how deterministic replay works and how to use it for testing and debugging AI agents.
The RunLog AI replay system captures all the inputs, randomness, and external API responses from your agent runs. This allows you to replay any historical run with identical results, making debugging and testing much more reliable.
Replay is deterministic: running the same replay multiple times will always produce identical results, even if the original run involved randomness or external API calls.
How Replay Works
1. Capture Phase
During original execution, RunLog captures all inputs, random seeds, timestamps, and external API responses.
2. Isolation
Replay runs in an isolated environment that prevents any side effects or external API calls.
3. Deterministic Execution
The agent re-executes using the captured inputs and responses, producing identical results.
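The three phases above can be sketched with a toy recorder. This is a hypothetical illustration of the capture/replay mechanism, not the RunLog implementation: external calls are logged during capture, and during replay the logged responses are returned instead of making real calls, so even random values come back identical.

```python
import random

class Recorder:
    """Toy record/replay recorder (illustrative, not the RunLog API)."""

    def __init__(self):
        self.log = []        # captured external responses, in call order
        self.cursor = 0
        self.replaying = False

    def external_call(self, fn):
        """Route an external call through the recorder.

        Capture mode: run the real call and log its response.
        Replay mode: return the logged response; the real call never runs.
        """
        if self.replaying:
            response = self.log[self.cursor]
            self.cursor += 1
            return response
        response = fn()
        self.log.append(response)
        return response

    def start_replay(self):
        self.replaying = True
        self.cursor = 0

def agent_step(rec):
    # Stand-in for a nondeterministic dependency (e.g. an LLM or API call).
    return rec.external_call(lambda: random.random())

rec = Recorder()
original = [agent_step(rec) for _ in range(3)]  # 1. capture phase

rec.start_replay()                              # 2. isolated replay
replayed = [agent_step(rec) for _ in range(3)]  # 3. deterministic execution

assert replayed == original  # identical results, no real external calls
```

The key design point is that replay consumes the log in the same order the calls were captured, which is why the original run's randomness and API responses reproduce exactly.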
Basic Replay
```python
# Replay a specific run
rl.replay(run_id="run_abc123")

# Replay with modified policies
rl.replay(
    run_id="run_abc123",
    policies=["new_policy.yaml"]
)
```
Batch Replay
```python
# Replay multiple runs
results = rl.replay_batch(
    date_range="2024-01-01:2024-01-07",
    filters={"status": "failed"},
    policies=["updated_policy.yaml"]
)
print(f"Replayed {len(results)} runs")
print(f"Success rate: {results.success_rate}")
```
Policy Impact
See how many actions would be blocked, approved, or modified by new policies.
Cost Analysis
Calculate potential cost savings or increases from policy changes.
Behavior Changes
Identify how agent behavior would change with different configurations.
Success Rates
Compare success rates between original runs and replayed versions.
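One way to quantify policy impact is to diff each original action against its replayed counterpart. The sketch below is hypothetical (the function and the action encoding are illustrative, not part of the RunLog API): it counts actions the new policy would block, modify, or leave unchanged.

```python
# Hypothetical sketch: measure policy impact by diffing original vs replayed actions.
# Convention (illustrative): None means the new policy blocked the action.
def policy_impact(original_actions, replayed_actions):
    """Count actions whose outcome changed under the new policy."""
    blocked = modified = unchanged = 0
    for orig, new in zip(original_actions, replayed_actions):
        if new is None:
            blocked += 1
        elif new != orig:
            modified += 1
        else:
            unchanged += 1
    return {"blocked": blocked, "modified": modified, "unchanged": unchanged}

original = ["send_email", "delete_file", "query_db"]
replayed = ["send_email", None, "query_db_readonly"]  # policy blocks one, rewrites one
print(policy_impact(original, replayed))
# {'blocked': 1, 'modified': 1, 'unchanged': 1}
```

The same pairwise comparison extends naturally to cost deltas and success-rate comparisons by aggregating per-action metrics instead of counts.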
- Always test policy changes with replay before deploying to production
- Use representative date ranges that include various scenarios and edge cases
- Pay attention to both blocked actions and unintended side effects
- Replay failed runs when debugging to understand root causes
- Use batch replay to validate changes across large datasets
- Monitor replay performance to ensure changes don't impact latency
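The first practice, testing policy changes before deploying, can be automated as a pre-deploy gate. This is a hedged sketch with illustrative names and thresholds: it compares the original success rate with the rate observed when replaying the same runs under the candidate policy, and blocks deployment on a regression.

```python
# Hypothetical pre-deploy gate: block the rollout if replaying recent runs
# under the new policy lowers the success rate beyond a small tolerance.
# The 2% tolerance is an illustrative choice, not a RunLog default.
def gate_policy_change(original_success_rate, replayed_success_rate, tolerance=0.02):
    """Return True if the policy change is safe to deploy."""
    return replayed_success_rate >= original_success_rate - tolerance

assert gate_policy_change(0.95, 0.94) is True    # within tolerance: deploy
assert gate_policy_change(0.95, 0.90) is False   # regression: block deploy
```

Wired into CI, such a gate turns replay from a manual debugging aid into a standing regression check on every policy change.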