State Backends Guide¶

Overview¶

State backends provide pluggable storage for execution state, supporting both file-based and database storage with atomic transactions and locking.

Available Backends¶

File Backend (Default)¶

Path: target/run_state.json
Use case: Single-node deployments, development
Pros: Simple, no dependencies
Cons: Not suitable for distributed execution

Database Backend¶

Path: target/state.db (SQLite/Turso)
Use case: Production, distributed execution
Pros: Concurrent access, transactional integrity
Cons: Requires database setup

Configuration¶

Set backend in seeknal_project.yml:

name: my_project
version: 1.0.0
state_backend: file  # Options: file, database

Or via environment variable:

export SEEKNAL_STATE_BACKEND=database
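The two settings can coexist, so it helps to be explicit about which one wins. The following is a minimal sketch of one possible resolution order. It assumes the environment variable overrides the project file and that `file` is the fallback. That precedence is an assumption and is not confirmed by this guide, and `resolve_state_backend` is a hypothetical helper, not part of seeknal.

```python
import os

VALID_BACKENDS = {"file", "database"}

def resolve_state_backend(project_config, environ=None):
    """Pick the backend name: env var first (assumed), then config, then 'file'."""
    environ = os.environ if environ is None else environ
    name = (
        environ.get("SEEKNAL_STATE_BACKEND")
        or project_config.get("state_backend")
        or "file"
    )
    if name not in VALID_BACKENDS:
        raise ValueError(f"Unknown state backend: {name!r}")
    return name

# Example: the (assumed) override order in action
print(resolve_state_backend({"state_backend": "file"},
                            {"SEEKNAL_STATE_BACKEND": "database"}))
```

Rejecting unknown names early keeps a typo in either place from silently falling back to the default.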

File Backend¶

Default storage using JSON files:

{
  "run_id": "run_20240110_120000",
  "started_at": "2024-01-10T12:00:00Z",
  "nodes": {
    "my_node": {
      "run_id": "run_20240110_120000",
      "node_id": "my_node",
      "status": "success",
      "duration_ms": 1500,
      "row_count": 1000,
      "fingerprint": "abc123..."
    }
  }
}
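Because the state file is plain JSON, it can be inspected with the standard library alone. The sketch below parses the sample shown above and summarizes each node. `summarize_state` is a hypothetical helper written for illustration, and it assumes every node carries `status` and `duration_ms` as in the sample.

```python
import json

SAMPLE_STATE = """
{
  "run_id": "run_20240110_120000",
  "started_at": "2024-01-10T12:00:00Z",
  "nodes": {
    "my_node": {
      "run_id": "run_20240110_120000",
      "node_id": "my_node",
      "status": "success",
      "duration_ms": 1500,
      "row_count": 1000,
      "fingerprint": "abc123..."
    }
  }
}
"""

def summarize_state(state):
    """Return (node_id, status, seconds) tuples for every node in a run."""
    return [
        (node_id, node["status"], node["duration_ms"] / 1000)
        for node_id, node in state["nodes"].items()
    ]

state = json.loads(SAMPLE_STATE)
for node_id, status, seconds in summarize_state(state):
    print(f"{node_id}: {status} in {seconds:.1f}s")
```

To inspect a real project, replace `SAMPLE_STATE` with the contents of target/run_state.json.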

Database Backend¶

SQLite or Turso-based storage with tables:

runs: Run metadata
node_states: Node execution status
completed_intervals: Interval tracking
partitions: Partition tracking
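One way to confirm these tables exist is to query SQLite's catalog directly. The sketch below uses an in-memory stand-in database. Only the four table names come from this guide. The single `id` column is a placeholder, since the real schema is not documented here, and `list_tables` is a hypothetical helper.

```python
import sqlite3

def list_tables(conn):
    """Return the names of all user tables in a SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

# Stand-in database: table names are from this guide, columns are placeholders.
conn = sqlite3.connect(":memory:")
for table in ("runs", "node_states", "completed_intervals", "partitions"):
    conn.execute(f"CREATE TABLE {table} (id TEXT PRIMARY KEY)")

print(list_tables(conn))
```

Against a real project, connect to target/state.db instead of `:memory:` and skip the `CREATE TABLE` loop.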

Using SQLite¶

state_backend: database
# Automatically uses target/state.db

Using Turso (Remote)¶

state_backend: database
database_url: "libsql://token@database-name.turso.io"

Migration¶

Migrate File to Database¶

# Preview migration
seeknal migrate-state --backend database

# Execute migration
seeknal migrate-state --backend database --no-dry-run

This preserves:

✅ All run history
✅ Node execution states
✅ Completed intervals
✅ Fingerprints
✅ Row counts

Rollback¶

A backup of the file state is created automatically at target/run_state.json.bak.

To rollback:

# Restore from backup
cp target/run_state.json.bak target/run_state.json
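A plain `cp` will happily restore a truncated or corrupt backup. As a hedged alternative, the sketch below checks that the backup parses as JSON before copying it over the state file. `restore_backup` is a hypothetical helper, not a seeknal command, and the demo runs in a temporary directory rather than a real target/ folder.

```python
import json
import shutil
import tempfile
from pathlib import Path

def restore_backup(state_path):
    """Copy <state_path>.bak over state_path, but only if the backup parses as JSON."""
    state_path = Path(state_path)
    backup_path = state_path.with_name(state_path.name + ".bak")
    json.loads(backup_path.read_text())  # raises ValueError if the backup is corrupt
    shutil.copy2(backup_path, state_path)
    return state_path

# Demo in a throwaway directory instead of a real target/ folder
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "run_state.json"
    state_file.write_text("not valid json")  # simulate a damaged state file
    Path(str(state_file) + ".bak").write_text('{"run_id": "run_1", "nodes": {}}')
    restore_backup(state_file)
    restored = json.loads(state_file.read_text())

print(restored["run_id"])
```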

API Usage¶

Using the Backend Protocol¶

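from datetime import datetime, timedelta, timezone

from seeknal.state.backend import create_state_backend

# Create file backend
file_backend = create_state_backend("file", base_path="target")
file_backend.initialize()

# Create database backend
db_backend = create_state_backend("database", db_path="target/state.db")
db_backend.initialize()

# Use unified interface (the same calls work on either backend)
backend = db_backend

# Example interval bounds; datetime values are assumed here
start = datetime(2024, 1, 10, tzinfo=timezone.utc)
end = start + timedelta(days=1)

backend.create_run("run_1", status="running", started_at=datetime.now())
backend.set_node_state("run_1", "node1", status="success")
backend.add_completed_interval("run_1", "node1", start, end)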

Transactions¶

# Atomic state updates
with backend.transaction() as tx:
    backend.set_node_state("run_1", "node1", status="success")
    backend.set_node_state("run_1", "node2", status="success")
    # Both succeed or both fail
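To make the "both succeed or both fail" guarantee concrete, here is a minimal sketch using plain `sqlite3` rather than seeknal's backend. The table columns, the set of valid statuses, and the `set_states` helper are all assumptions made for illustration. The real `transaction()` implementation is not shown in this guide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal table; the real node_states schema is not documented here.
conn.execute("CREATE TABLE node_states (run_id TEXT, node_id TEXT, status TEXT)")

VALID_STATUSES = {"running", "success", "failed"}  # assumed set of statuses

def set_states(conn, run_id, states):
    """Write all node states in one transaction; roll back everything on error."""
    with conn:  # commits on success, rolls back if the block raises
        for node_id, status in states:
            if status not in VALID_STATUSES:
                raise ValueError(f"invalid status: {status!r}")
            conn.execute(
                "INSERT INTO node_states VALUES (?, ?, ?)",
                (run_id, node_id, status),
            )

# First batch is valid and commits both rows.
set_states(conn, "run_1", [("node1", "success"), ("node2", "success")])

# Second batch fails on node2, so node1's row for run_2 is rolled back too.
try:
    set_states(conn, "run_2", [("node1", "success"), ("node2", "bogus")])
except ValueError:
    pass

count = conn.execute("SELECT COUNT(*) FROM node_states").fetchone()[0]
run_2_rows = conn.execute(
    "SELECT COUNT(*) FROM node_states WHERE run_id = 'run_2'"
).fetchone()[0]
```

After both calls only the two rows from `run_1` remain, which is the all-or-nothing behavior the context manager is meant to provide.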

Locking¶

# Acquire lock for concurrent execution
acquired = backend.acquire_lock("run_1", timeout_ms=5000)
if acquired:
    # Execute with lock held; the lock is released automatically
    backend.set_node_state("run_1", "node1", status="running")
else:
    # Lock not acquired within timeout_ms; another process may hold it
    print("Could not acquire lock for run_1")
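The `timeout_ms` behavior can be illustrated with a simple lock file. This is a sketch of a generic technique (exclusive file creation with polling), not seeknal's locking implementation, which this guide does not describe. `acquire_file_lock`, `release_file_lock`, and the `poll_ms` parameter are hypothetical names.

```python
import os
import tempfile
import time

def acquire_file_lock(lock_path, timeout_ms=5000, poll_ms=50):
    """Try to create lock_path exclusively until timeout_ms elapses."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_ms / 1000)

def release_file_lock(lock_path):
    """Remove the lock file so other processes can acquire it."""
    os.remove(lock_path)

with tempfile.TemporaryDirectory() as tmp:
    lock = os.path.join(tmp, "run_1.lock")
    first = acquire_file_lock(lock, timeout_ms=100)   # lock is free
    second = acquire_file_lock(lock, timeout_ms=100)  # times out: lock is held
    release_file_lock(lock)
    third = acquire_file_lock(lock, timeout_ms=100)   # free again after release
    release_file_lock(lock)
```

Unlike the backend call above, this sketch requires an explicit release, so a crashed process would leave a stale lock file behind.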

Best Practices¶

1. Use Database for Production¶

# Production
state_backend: database

2. Use File for Development¶

# Development
state_backend: file

3. Backup Before Migration¶

# Always backup first
cp target/run_state.json target/run_state.json.pre-migration.bak

# Then migrate (without --no-dry-run the command only previews)
seeknal migrate-state --backend database --no-dry-run

4. Test Migration in Dry Run¶

# Preview first
seeknal migrate-state --backend database

# Check output, then execute
seeknal migrate-state --backend database --no-dry-run

Troubleshooting¶

Migration Failed¶

# Check backup
ls -la target/*.bak

# Restore backup
cp target/run_state.json.bak target/run_state.json

# Investigate error
seeknal migrate-state --backend database --dry-run

Database Locked¶

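# SQLite: find processes that have the database file open
lsof target/state.db

# PRAGMA lock_status only reports locks in SQLite builds compiled with SQLITE_DEBUG
sqlite3 target/state.db "PRAGMA lock_status"

# Close other processes using the database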

Performance Issues¶

For large state files:

Migrate to database - Better concurrent access
Clean old runs - seeknal state cleanup --days 30
Use partitions - Enable partition tracking
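The idea behind a `--days 30` cleanup can be sketched as a cutoff filter on each run's `started_at` timestamp. This illustrates the concept only. It is not how `seeknal state cleanup` is implemented, and `prune_old_runs` and the shape of the `runs` dict are assumptions.

```python
from datetime import datetime, timedelta, timezone

def prune_old_runs(runs, days, now=None):
    """Return only the runs whose started_at falls within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    kept = {}
    for run_id, run in runs.items():
        # Convert the trailing "Z" so fromisoformat accepts it on older Pythons
        started = datetime.fromisoformat(run["started_at"].replace("Z", "+00:00"))
        if started >= cutoff:
            kept[run_id] = run
    return kept

runs = {
    "run_old": {"started_at": "2024-01-10T12:00:00Z"},
    "run_new": {"started_at": "2024-02-01T12:00:00Z"},
}
fixed_now = datetime(2024, 2, 15, tzinfo=timezone.utc)
kept = prune_old_runs(runs, days=30, now=fixed_now)
print(sorted(kept))
```

Passing `now` explicitly keeps the function deterministic, which makes retention rules easy to test.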

Advanced¶

Custom Backend¶

Implement the StateBackend protocol:

from seeknal.state.backend import StateBackend

class CustomBackend(StateBackend):
    def initialize(self):
        # Setup
        pass

    def create_run(self, run_id, **kwargs):
        # Create run
        pass

    # Implement all abstract methods...

Register and use:

from seeknal.state.backend import StateBackendFactory, create_state_backend

# Register the class under a name, then create it like a built-in backend
StateBackendFactory.register("custom", CustomBackend)
backend = create_state_backend("custom", option="value")
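For a fuller picture of what a custom backend might contain, here is a self-contained in-memory sketch. It does not import seeknal or subclass `StateBackend`, so it runs on its own. The method names come from the reference list in this guide, but the signatures are inferred from the usage examples and should be treated as assumptions.

```python
from datetime import datetime, timezone

class InMemoryBackend:
    """Toy backend keeping all state in dicts; useful as a test double."""

    def initialize(self):
        self.runs = {}
        self.node_states = {}

    def create_run(self, run_id, **kwargs):
        if run_id in self.runs:
            raise ValueError(f"run {run_id!r} already exists")
        self.runs[run_id] = dict(kwargs)

    def update_run(self, run_id, **kwargs):
        self.runs[run_id].update(kwargs)

    def get_run(self, run_id):
        return self.runs.get(run_id)

    def set_node_state(self, run_id, node_id, **kwargs):
        # Merge new fields into any existing state for this node
        self.node_states.setdefault((run_id, node_id), {}).update(kwargs)

    def get_node_state(self, run_id, node_id):
        return self.node_states.get((run_id, node_id))

backend = InMemoryBackend()
backend.initialize()
backend.create_run("run_1", status="running", started_at=datetime.now(timezone.utc))
backend.set_node_state("run_1", "node1", status="success")
backend.update_run("run_1", status="success")
```

A real implementation would also need `add_completed_interval()`, `transaction()`, and `acquire_lock()`, which are omitted here for brevity.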

Reference¶

Backend Protocol Methods¶

initialize() - Setup backend storage
create_run() - Create new run record
update_run() - Update run status
get_run() - Retrieve run info
set_node_state() - Update node execution state
get_node_state() - Retrieve node state
add_completed_interval() - Track processed interval
transaction() - Begin transaction context
acquire_lock() - Acquire optimistic lock