Parallel Execution¶
Seeknal automatically executes independent pipeline tasks in parallel for faster performance.
Overview¶
When pipeline tasks don't depend on each other, Seeknal runs them concurrently. This happens automatically, so no configuration is required.
How It Works¶
Dependency Analysis¶
Seeknal analyzes your pipeline to identify dependencies:
# These can run in parallel (no dependencies)
sources:
  - name: sales
  - name: customers
  - name: products

# This waits for sources
transforms:
  - name: enriched_sales
    depends_on: [sales, customers, products]
Execution:
1. sales, customers, products run in parallel
2. enriched_sales runs after all three complete
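The grouping above can be sketched in a few lines of Python. This is an illustrative sketch only, not Seeknal's actual scheduler: the function name `execution_levels` and the plain-dict representation of the pipeline are assumptions made for the example.

```python
def execution_levels(deps):
    """Group tasks into waves; each wave depends only on earlier waves.

    `deps` maps a task name to the list of task names it depends on.
    Hypothetical helper for illustration, not part of Seeknal.
    """
    remaining = {task: set(parents) for task, parents in deps.items()}
    done = set()
    levels = []
    while remaining:
        # A task is ready once every one of its dependencies has completed.
        ready = sorted(t for t, parents in remaining.items() if parents <= done)
        if not ready:
            raise ValueError(f"cycle or unknown dependency among: {sorted(remaining)}")
        levels.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return levels


# The pipeline from the YAML example, written as task -> dependencies.
pipeline = {
    "sales": [],
    "customers": [],
    "products": [],
    "enriched_sales": ["sales", "customers", "products"],
}

levels = execution_levels(pipeline)
for number, tasks in enumerate(levels, start=1):
    print(f"{number}. {', '.join(tasks)}")
```

Tasks that land in the same wave have no dependencies on each other, which is what makes them safe to run concurrently.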
Automatic Parallelization¶
# All independent tasks run in parallel
transforms:
  - name: clean_sales
    depends_on: [raw_sales]
  - name: clean_customers
    depends_on: [raw_customers]
  - name: clean_products
    depends_on: [raw_products]

# This waits for all transforms
aggregations:
  - name: daily_metrics
    depends_on: [clean_sales, clean_customers, clean_products]
Execution:
1. clean_sales, clean_customers, clean_products run in parallel
2. daily_metrics runs after all three complete
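To make the execution order concrete, here is a minimal sketch of a dependency-aware runner built on Python's standard `graphlib` and `concurrent.futures` modules. It is not Seeknal's implementation: `run_pipeline`, `run_task`, and the `max_workers` parameter are hypothetical names, and the `raw_*` inputs are modeled as ordinary tasks with no dependencies.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_pipeline(deps, run_task, max_workers=4):
    """Run each task as soon as all of its dependencies have finished."""
    sorter = TopologicalSorter(deps)  # deps: task -> tasks it depends on
    sorter.prepare()
    finished = []   # completion order, appended only from this thread
    in_flight = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies are all done.
            for task in sorter.get_ready():
                in_flight[pool.submit(run_task, task)] = task
            completed, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in completed:
                task = in_flight.pop(future)
                future.result()    # re-raise any task failure
                finished.append(task)
                sorter.done(task)  # unblocks dependents
    return finished


def run_task(name):
    time.sleep(0.01)  # stand-in for real work


pipeline = {
    "clean_sales": ["raw_sales"],
    "clean_customers": ["raw_customers"],
    "clean_products": ["raw_products"],
    "daily_metrics": ["clean_sales", "clean_customers", "clean_products"],
}

finished = run_pipeline(pipeline, run_task)
print(finished)
```

The exact completion order of the independent tasks varies between runs, but `daily_metrics` always finishes last because it cannot start until all three `clean_*` tasks are done.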
Performance Benefits¶
Sequential Execution¶
Task 1: 5 minutes
Task 2: 5 minutes
Task 3: 5 minutes
Total: 15 minutes
Parallel Execution¶
Task 1: 5 minutes ┐
Task 2: 5 minutes ├─→ 5 minutes (all run together)
Task 3: 5 minutes ┘
Total: 5 minutes
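Speedup: 3x for three independent tasks of equal duration (15 minutes sequentially versus 5 minutes in parallel), assuming enough resources to run all three at once.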
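The arithmetic behind this figure generalizes: sequential time is the sum of all task durations, while the best-case parallel time is the length of the longest dependency chain. The sketch below assumes unlimited workers and uses hypothetical task names and helper functions; it is a way to reason about expected speedup, not a Seeknal API.

```python
def sequential_time(durations):
    """Total time when tasks run one after another."""
    return sum(durations.values())


def parallel_time(durations, deps):
    """Longest dependency chain, assuming every ready task starts at once."""
    finish = {}

    def finish_time(task):
        if task not in finish:
            # A task starts when its slowest dependency has finished.
            start = max((finish_time(p) for p in deps.get(task, [])), default=0)
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_time(task) for task in durations)


# Three independent 5-minute tasks, as in the diagram above.
durations = {"task_1": 5, "task_2": 5, "task_3": 5}
deps = {}
speedup = sequential_time(durations) / parallel_time(durations, deps)

# Adding a 5-minute task that depends on all three lowers the speedup.
durations_chain = {**durations, "task_4": 5}
deps_chain = {"task_4": ["task_1", "task_2", "task_3"]}
speedup_chain = sequential_time(durations_chain) / parallel_time(durations_chain, deps_chain)

print(speedup, speedup_chain)
```

For the three independent tasks this gives 15 minutes sequentially, 5 minutes in parallel, and a speedup of 3.0. With the dependent fourth task the totals become 20 and 10 minutes, so the speedup drops to 2.0: the gain depends on the shape of the dependency graph, not only on the number of tasks.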
Configuring Parallelism¶
Default Behavior¶
Seeknal automatically detects and parallelizes independent tasks.
Manual Control¶
For advanced use cases, you can control parallelism, for example by lowering max_parallel (see Performance Issues under Troubleshooting).
Best Practices¶
- Design for independence when possible
- Let Seeknal handle parallelization automatically
- Monitor performance for bottlenecks
- Use environments to test parallel execution
Troubleshooting¶
Tasks Not Running in Parallel¶
Issue: Tasks that should run in parallel are running sequentially.
Check:
- Are there hidden dependencies?
- Is there a resource limitation?
- Are tasks sharing the same output?
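The last check, tasks sharing the same output, can be done mechanically once you have a list of tasks and their output targets. The sketch below assumes each task is a dict with `name` and `output` keys; that shape, the table names, and the `shared_outputs` helper are all hypothetical and may not match how Seeknal describes tasks.

```python
from collections import defaultdict


def shared_outputs(tasks):
    """Return {output: [task names]} for every output written by 2+ tasks."""
    writers = defaultdict(list)
    for task in tasks:
        writers[task["output"]].append(task["name"])
    return {out: names for out, names in writers.items() if len(names) > 1}


# Hypothetical task list: two tasks accidentally write the same table.
tasks = [
    {"name": "clean_sales", "output": "warehouse.clean_sales"},
    {"name": "clean_customers", "output": "warehouse.clean_customers"},
    {"name": "clean_sales_v2", "output": "warehouse.clean_sales"},
]

conflicts = shared_outputs(tasks)
print(conflicts)
```

Any entry in the result is a pair (or more) of tasks that cannot safely run at the same time, which is one way an undeclared dependency can appear.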
Performance Issues¶
Issue: Parallel execution is slower than expected.
Solutions:
- Reduce max_parallel if resource-constrained
- Check for I/O bottlenecks
- Profile task execution times
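As a rough illustration of the first and last suggestions, the sketch below runs a set of stand-in tasks under a fixed concurrency cap and reports how long each one took, slowest first. The `max_parallel` argument here is only a local parameter named after the setting mentioned above; how Seeknal itself applies that setting is not shown in this page, and `profile_tasks` and `timed` are hypothetical helpers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(name, fn):
    """Run fn and return (name, elapsed seconds)."""
    start = time.perf_counter()
    fn()
    return name, time.perf_counter() - start


def profile_tasks(tasks, max_parallel):
    """Run tasks with at most `max_parallel` at once; return timings, slowest first."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(timed, name, fn) for name, fn in tasks.items()]
        timings = dict(future.result() for future in futures)
    return dict(sorted(timings.items(), key=lambda item: item[1], reverse=True))


# Stand-in tasks; real tasks would read, transform, and write data.
tasks = {
    "fast_task": lambda: time.sleep(0.01),
    "slow_task": lambda: time.sleep(0.05),
}

report = profile_tasks(tasks, max_parallel=1)
for name, seconds in report.items():
    print(f"{name}: {seconds:.3f}s")
```

Comparing reports at different caps shows whether extra concurrency actually helps. If per-task times grow as the cap rises, the tasks are likely competing for the same resource, such as disk or network I/O.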
Related Topics¶
- Pipeline Builder Workflow - Core workflow
- Change Categorization - Understanding dependencies
- Building Blocks: Tasks - Task configuration