Aggregations

Aggregations compute metrics and rollups across entities and time periods.

Overview

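Seeknal supports two kinds of aggregation. First-order aggregations compute metrics for an entity with a standard GROUP BY. Second-order aggregations take the results of a first-order aggregation and roll them up to a higher-level entity (hierarchical rollups), for example from customers to regions.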

First-Order Aggregations

Basic Aggregations

name: customer_metrics
kind: aggregation
entity: customer
time_grain: day
metrics:
  - name: total_revenue
    expression: SUM(amount)
  - name: transaction_count
    expression: COUNT(*)
  - name: avg_transaction_value
    expression: AVG(amount)
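
To make the semantics of this definition concrete, the sketch below reproduces the three metrics in plain Python, producing one result per customer per day. It only illustrates the GROUP BY behaviour and is not how Seeknal executes the definition. The `transactions` rows, the field names `customer_id`, `ts`, and `amount`, and the helper `aggregate_by_customer_day` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical input rows; the field names are assumptions, not Seeknal's schema.
transactions = [
    {"customer_id": "c1", "ts": datetime(2024, 1, 1, 9, 0), "amount": 10.0},
    {"customer_id": "c1", "ts": datetime(2024, 1, 1, 17, 30), "amount": 30.0},
    {"customer_id": "c1", "ts": datetime(2024, 1, 2, 8, 15), "amount": 5.0},
    {"customer_id": "c2", "ts": datetime(2024, 1, 1, 12, 0), "amount": 7.0},
]


def aggregate_by_customer_day(rows):
    """Group rows by (customer, calendar day) and compute SUM, COUNT, AVG."""
    groups = defaultdict(list)
    for row in rows:
        # time_grain: day -> truncate each timestamp to its date
        groups[(row["customer_id"], row["ts"].date())].append(row["amount"])

    result = {}
    for key, amounts in groups.items():
        result[key] = {
            "total_revenue": sum(amounts),                          # SUM(amount)
            "transaction_count": len(amounts),                      # COUNT(*)
            "avg_transaction_value": sum(amounts) / len(amounts),   # AVG(amount)
        }
    return result


customer_metrics = aggregate_by_customer_day(transactions)
```

Each key of `customer_metrics` is a `(customer_id, date)` pair, which mirrors the combination of `entity: customer` and `time_grain: day` in the definition.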

Window Functions

name: rolling_metrics
kind: aggregation
entity: customer
window: 7d
metrics:
  - name: rolling_7day_revenue
    expression: SUM(amount)
  - name: rolling_7day_transactions
    expression: COUNT(*)
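
As a rough illustration of what a `7d` window means, the sketch below sums and counts each customer's transactions in a trailing seven-day window. It assumes the window covers the evaluation date and the six days before it; the exact boundary rule Seeknal applies is not stated here, so treat that as an assumption. The `events` data and the `rolling_metrics` helper are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical (customer_id, day, amount) rows.
events = [
    ("c1", date(2024, 1, 1), 10.0),
    ("c1", date(2024, 1, 5), 20.0),
    ("c1", date(2024, 1, 9), 5.0),
    ("c2", date(2024, 1, 5), 7.0),
]


def rolling_metrics(rows, as_of, days=7):
    """Sum and count each customer's rows in the window (as_of - days, as_of]."""
    start = as_of - timedelta(days=days)  # exclusive lower bound
    out = {}
    for customer_id, day, amount in rows:
        if start < day <= as_of:
            metrics = out.setdefault(
                customer_id,
                {"rolling_7day_revenue": 0.0, "rolling_7day_transactions": 0},
            )
            metrics["rolling_7day_revenue"] += amount    # SUM(amount)
            metrics["rolling_7day_transactions"] += 1    # COUNT(*)
    return out


jan_9 = rolling_metrics(events, as_of=date(2024, 1, 9))
```

Evaluated on 9 January, the window spans 3 to 9 January, so the 1 January transaction of `c1` falls outside it and does not contribute.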

Second-Order Aggregations

Multi-Level Rollups

name: regional_customer_metrics
kind: second_order_aggregation
base_entity: customer
aggregate_entity: region
metrics:
  - name: regional_total_revenue
    expression: SUM(customer.total_revenue)
  - name: regional_avg_customer_revenue
    expression: AVG(customer.total_revenue)
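
The two-stage nature of a rollup can be sketched in plain Python: the input is the per-customer output of a first-order aggregation, and the second stage groups those customer-level values by region. The sample values, the `customer_region` mapping, and the `rollup_to_region` helper are assumptions for illustration, not part of Seeknal.

```python
from collections import defaultdict

# Output of a first-order aggregation: one total_revenue per customer (hypothetical values).
customer_total_revenue = {"c1": 100.0, "c2": 300.0, "c3": 50.0}

# Hypothetical mapping from the base entity (customer) to the aggregate entity (region).
customer_region = {"c1": "north", "c2": "north", "c3": "south"}


def rollup_to_region(totals, regions):
    """Aggregate customer-level totals up to the region level."""
    by_region = defaultdict(list)
    for customer_id, total in totals.items():
        by_region[regions[customer_id]].append(total)

    return {
        region: {
            # SUM(customer.total_revenue)
            "regional_total_revenue": sum(values),
            # AVG(customer.total_revenue): an average over customers, not transactions
            "regional_avg_customer_revenue": sum(values) / len(values),
        }
        for region, values in by_region.items()
    }


regional_customer_metrics = rollup_to_region(customer_total_revenue, customer_region)
```

Note that `regional_avg_customer_revenue` averages one value per customer, so a customer with many small transactions carries the same weight as a customer with a single large one.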

Conditional Aggregations

name: customer_segment_metrics
kind: second_order_aggregation
base_entity: customer
aggregate_entity: segment
condition: "signup_date >= '2024-01-01'"
metrics:
  - name: new_customer_revenue
    expression: SUM(total_revenue)
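
A minimal sketch of the intended effect, assuming `condition` filters the base rows before they are aggregated (much like a SQL WHERE clause): only customers who signed up on or after 2024-01-01 contribute to their segment's total. The `customers` rows and the helper name are hypothetical.

```python
from datetime import date

# Hypothetical customer-level rows carrying a first-order total_revenue.
customers = [
    {"customer_id": "c1", "segment": "retail", "signup_date": date(2023, 11, 3), "total_revenue": 100.0},
    {"customer_id": "c2", "segment": "retail", "signup_date": date(2024, 2, 10), "total_revenue": 300.0},
    {"customer_id": "c3", "segment": "enterprise", "signup_date": date(2024, 1, 1), "total_revenue": 50.0},
]


def new_customer_revenue_by_segment(rows, cutoff=date(2024, 1, 1)):
    """Sum total_revenue per segment over rows that satisfy the condition."""
    out = {}
    for row in rows:
        # condition: signup_date >= '2024-01-01' (applied before aggregating)
        if row["signup_date"] >= cutoff:
            out[row["segment"]] = out.get(row["segment"], 0.0) + row["total_revenue"]
    return out


segment_metrics = new_customer_revenue_by_segment(customers)
```

Customer `c1` signed up in 2023 and is excluded, while `c3` signed up exactly on the cutoff date and is included because the comparison is `>=`.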

Python Aggregations

Define aggregations using Python:

from seeknal.workflow.decorators import aggregation

@aggregation(
    name="python_aggregation",
    entity="customer",
    output="customer_features"
)
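def compute_features(df):
    # Per-customer sum, count, and mean of `amount`
    # (groupby/agg usage assumes df is a pandas DataFrame).
    # Passing a list of functions yields MultiIndex columns such as ("amount", "sum").
    return df.groupby("customer_id").agg({
        "amount": ["sum", "count", "mean"]
    })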
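
To see what the body of `compute_features` returns, the sketch below runs the same `groupby(...).agg(...)` call on a small pandas DataFrame, without the Seeknal decorator. It assumes `df` is a pandas DataFrame; the sample data and the column-flattening step are additions for illustration, and whether Seeknal expects flat column names is not confirmed here.

```python
import pandas as pd

# Hypothetical transaction data.
df = pd.DataFrame({
    "customer_id": ["c1", "c1", "c2"],
    "amount": [10.0, 30.0, 7.0],
})

# Same aggregation as in compute_features.
raw = df.groupby("customer_id").agg({"amount": ["sum", "count", "mean"]})

# The result has two-level (MultiIndex) columns: ("amount", "sum"), ("amount", "count"), ...
# Join the levels into flat names and turn customer_id back into a regular column.
features = raw.copy()
features.columns = ["_".join(col) for col in features.columns]
features = features.reset_index()
```

After flattening, `features` has the columns `customer_id`, `amount_sum`, `amount_count`, and `amount_mean`, which are easier to reference downstream than tuple-valued column labels.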

Aggregation Configuration

Common Options

Option	Type	Description	Required
`name`	string	Unique identifier	Required
`entity`	string	Entity to aggregate over	Required
`time_grain`	string	Time granularity (day, week, month)	Optional
`metrics`	list	Metric definitions	Required
`window`	string	Rolling window size	Optional
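
The table can be read as a small schema. The sketch below encodes it as a Python dictionary and checks a parsed definition against it; `OPTION_SPECS` and `check_config` are hypothetical helpers written for this example and are not part of Seeknal, which performs its own validation.

```python
# Option name -> (expected Python type, required?), taken from the table above.
OPTION_SPECS = {
    "name": (str, True),
    "entity": (str, True),
    "time_grain": (str, False),
    "metrics": (list, True),
    "window": (str, False),
}


def check_config(config):
    """Return a list of problems found in a parsed aggregation definition."""
    problems = []
    for option, (expected_type, required) in OPTION_SPECS.items():
        if option not in config:
            if required:
                problems.append(f"missing required option: {option}")
        elif not isinstance(config[option], expected_type):
            problems.append(f"{option} should be a {expected_type.__name__}")
    return problems


valid = check_config({
    "name": "customer_metrics",
    "entity": "customer",
    "time_grain": "day",
    "metrics": [{"name": "total_revenue", "expression": "SUM(amount)"}],
})
invalid = check_config({"name": "rolling_metrics", "window": 7})
```

Here `valid` is an empty list, while `invalid` reports the missing `entity` and `metrics` options and the non-string `window` value.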

Best Practices

Define clear entities for aggregation
Use time grains for temporal aggregations
Document metric logic in descriptions
Test with sample data first
Consider performance for large datasets

Second-Order Aggregations - Conceptual overview
Feature Groups - Aggregations for ML features
Transforms - Data preparation

Next: Learn about Feature Groups or return to Building Blocks