Welcome to Seeknal

Build data pipelines and ML features in minutes, not days.

Seeknal is an all-in-one platform for data and AI/ML engineering. Define transformations in YAML or Python, run them with DuckDB or Spark, and deploy features to production with a single CLI.


What Can You Build With Seeknal?

For ML Engineers                               | For Data Engineers                        | For Analytics Engineers
-----------------------------------------------+-------------------------------------------+----------------------------------------
Feature stores with point-in-time correctness  | ELT pipelines with incremental execution  | Semantic layers with consistent metrics
Training datasets from raw events              | Data transformations with SQL             | Business metrics with change tracking
Online serving for real-time inference         | Multi-engine workflows (DuckDB + Spark)   | Self-serve analytics for stakeholders

Common use cases: Recommendation systems, churn prediction, customer segmentation, real-time dashboards, A/B test analysis, fraud detection.


Get Started in 10 Minutes

→ Quick Start Guide

  1. Install Seeknal (pip install seeknal or from GitHub Releases)
  2. Load your data (CSV, Parquet, database)
  3. Transform with SQL
  4. Run your first pipeline

No infrastructure required. Works on your laptop.


How Seeknal Works: The Pipeline Builder

Seeknal's workflow is inspired by infrastructure tools such as Terraform and kubectl:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     init        │ →   │     draft       │ →   │     apply       │ →   │   run --env     │
│  (setup project)│     │  (write YAML)   │     │  (save changes) │     │  (execute)      │
└─────────────────┘     └─────────────────┘     └─────────────────┘     └─────────────────┘

Step-by-Step

  1. seeknal init - Create a new project
  2. seeknal draft - Generate YAML templates for sources, transforms, feature groups
  3. seeknal apply - Save your pipeline definition (like git commit)
  4. seeknal run --env prod - Execute in production with safety checks

Key benefits: Dry-run validation, change detection, rollback support, multi-environment support, and multi-target materialization (PostgreSQL + Iceberg).


Choose Your Learning Path

Seeknal supports different workflows depending on your role and goals.

🆕 New to Seeknal?

Start here if you're evaluating or just getting started:

→ Quick Start Guide (10 min)

Install, load data, create features, and run your first pipeline.


🏗️ Data Engineer Path

→ Start Data Engineer Path

Goal: Build reliable ELT pipelines with incremental execution and production safety.

Start with: YAML Pipeline Tutorial (75 min)

Then learn:

  - Environment Management: Safe development with isolated environments
  - Incremental Models: Efficient incremental processing
  - Change Categorization: Understand breaking vs. non-breaking changes

Typical use case: "I need to transform raw data into analytics-ready tables, incrementally, with production safety."
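
To make "incremental execution" concrete, here is a minimal sketch of the general idea in plain Python: keep a high-water mark and process only the rows that arrived after it. This is not Seeknal's implementation or API; the function name `run_incremental`, the `state` dictionary, and the `event_time` field are all hypothetical names chosen for illustration.

```python
from datetime import datetime


def run_incremental(rows, state):
    """Return only rows newer than the stored watermark, then advance it.

    `state` is a hypothetical stand-in for wherever a pipeline would
    persist its progress between runs.
    """
    watermark = state.get("watermark")
    new_rows = [
        r for r in rows
        if watermark is None or r["event_time"] > watermark
    ]
    if new_rows:
        # Advance the watermark to the newest row we just processed.
        state["watermark"] = max(r["event_time"] for r in new_rows)
    return new_rows


raw_events = [
    {"id": 1, "event_time": datetime(2024, 1, 1)},
    {"id": 2, "event_time": datetime(2024, 1, 2)},
]
state = {}

# First run: no watermark yet, so every row is processed.
first_run = run_incremental(raw_events, state)

# A new event arrives; the second run picks up only that row.
raw_events.append({"id": 3, "event_time": datetime(2024, 1, 3)})
second_run = run_incremental(raw_events, state)
```

The second run touches one row instead of three, which is the saving incremental processing is after on large tables.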


📊 Analytics Engineer Path

→ Start Analytics Engineer Path

Goal: Define metrics and build a semantic layer for self-serve analytics.

Start with: YAML Pipeline Tutorial (75 min)

Then learn:

  - Semantic Layer & Metrics: Define and query consistent metrics
  - Change Categorization: Track metric changes over time
  - Testing & Audits: Validate data quality

Typical use case: "I need consistent metrics across dashboards and tools, with change tracking."
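
As a rough illustration of what categorizing a change can mean, the sketch below compares two column schemas and labels the difference. The rules used here (a removed or retyped column is breaking, an added column is non-breaking) are a common convention and an assumption on our part, not Seeknal's documented behavior; `categorize_change` is a hypothetical helper.

```python
def categorize_change(old_schema, new_schema):
    """Classify a schema change using assumed, illustrative rules.

    Schemas are plain dicts mapping column name -> type name.
    """
    old_cols, new_cols = set(old_schema), set(new_schema)
    removed = sorted(old_cols - new_cols)
    added = sorted(new_cols - old_cols)
    retyped = sorted(
        c for c in old_cols & new_cols if old_schema[c] != new_schema[c]
    )

    if removed or retyped:
        category = "breaking"       # downstream consumers may fail
    elif added:
        category = "non-breaking"   # existing consumers are unaffected
    else:
        category = "no-change"
    return {
        "category": category,
        "added": added,
        "removed": removed,
        "retyped": retyped,
    }


v1 = {"order_id": "int", "amount": "float"}
v2 = {"order_id": "int", "amount": "float", "currency": "str"}
v3 = {"order_id": "int"}

added_column = categorize_change(v1, v2)    # adds "currency"
dropped_column = categorize_change(v1, v3)  # drops "amount"
```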


🤖 ML Engineer Path

→ Start ML Engineer Path

Goal: Build feature stores with point-in-time joins for ML models.

Start with: Getting Started (30 min)

Then learn:

  - Python Pipelines: Feature engineering with Python
  - Training to Serving: End-to-end ML workflow
  - Parallel Execution: Speed up large pipelines

Typical use case: "I need features for training that prevent data leakage, with online serving."
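
The point-in-time join mentioned in this path can be sketched with the standard library alone. For each training label, the join attaches the most recent feature value observed at or before the label's timestamp, so values from the future never leak into training data. This is a conceptual sketch, not Seeknal's API; `point_in_time_join` and the field names `user_id`, `ts`, and `value` are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(labels, feature_rows):
    """Attach to each label the latest feature value at or before its time."""
    # Group feature rows per entity, sorted by timestamp.
    by_entity = {}
    for row in sorted(feature_rows, key=lambda r: r["ts"]):
        by_entity.setdefault(row["user_id"], []).append(row)

    joined = []
    for label in labels:
        history = by_entity.get(label["user_id"], [])
        times = [r["ts"] for r in history]
        # Index of the first feature row strictly after the label time.
        idx = bisect_right(times, label["ts"])
        value = history[idx - 1]["value"] if idx else None
        joined.append({**label, "feature": value})
    return joined


features = [
    {"user_id": 1, "ts": datetime(2024, 1, 1), "value": 10},
    {"user_id": 1, "ts": datetime(2024, 1, 10), "value": 20},
]
labels = [
    {"user_id": 1, "ts": datetime(2024, 1, 5), "churned": 0},
    {"user_id": 1, "ts": datetime(2024, 1, 15), "churned": 1},
    {"user_id": 2, "ts": datetime(2024, 1, 5), "churned": 0},
]

training_rows = point_in_time_join(labels, features)
```

The label dated January 5 receives the January 1 value rather than the later January 10 value, and a user with no feature history receives `None` instead of a borrowed value.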


Concepts

Learn the mental model behind Seeknal.

Guides

Task-oriented walkthroughs for specific workflows.

Reference

Lookup documentation for commands, schemas, and configuration.

Tutorials

Step-by-step learning paths with copy-pasteable code.

Additional Resources