JAF - Just Another Flow¶
JAF (Just Another Flow) is a powerful streaming data processing system for JSON/JSONL data with a focus on lazy evaluation, composability, and a fluent API. It provides both a command-line interface and a Python API for filtering, transforming, and analyzing JSON data streams.
Key Features¶
- Streaming Architecture: Process large datasets efficiently without loading everything into memory
- Lazy Evaluation: Operations are only executed when results are needed
- Fluent API: Chain operations together in a readable, intuitive way
- Composable Operations: Combine filters, maps, and other transformations
- Boolean Algebra: Perform AND, OR, NOT operations on filtered streams
- Multiple Data Sources: Files, directories, stdin, memory, gzip, and infinite streams
- Unix Philosophy: Individual commands can be piped together with other tools
Quick Start¶
Installation¶
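Assuming the package is published on PyPI under the name jaf (an assumption; check the project's release page), installation typically looks like:
# Install from PyPI (package name assumed)
pip install jaf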
Command Line¶
# Filter JSON data (outputs a stream descriptor by default)
jaf filter users.jsonl '["gt?", "@age", 25]'
# Evaluate immediately with --eval
jaf filter users.jsonl '["gt?", "@age", 25]' --eval
# Pipe operations together
jaf filter users.jsonl '["eq?", "@status", "active"]' | \
jaf map - "@name" | \
jaf eval -
# Use the stream command for eager evaluation
jaf stream users.jsonl --filter '["gt?", "@age", 25]' --map "@name"
# Combine with other Unix tools
jaf filter users.jsonl '["eq?", "@role", "admin"]' --eval | \
ja groupby department
Python API¶
from jaf import stream
# Create a stream and build a pipeline
pipeline = stream("users.jsonl") \
.filter(["gt?", "@age", 25]) \
.map("@name") \
.take(10)
# Execute the pipeline
for name in pipeline.evaluate():
print(name)
Core Concepts¶
Streaming Architecture¶
JAF uses a lazy streaming architecture where operations build pipelines rather than immediately processing data:
# This doesn't process any data yet
pipeline = stream("large_file.jsonl") \
.filter(["eq?", "@type", "error"]) \
.map(["dict", "time", "@timestamp", "msg", "@message"])
# Data is processed only when we evaluate
for error in pipeline.evaluate():
print(error)
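Because nothing runs until evaluate() is called, a pipeline can bound how much input it actually reads. A minimal sketch reusing take() from the Quick Start (the file and field names are illustrative):
from jaf import stream
# Stops reading large_file.jsonl once 5 matching records are found
first_errors = stream("large_file.jsonl") \
    .filter(["eq?", "@type", "error"]) \
    .take(5)
for error in first_errors.evaluate():
    print(error)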
Query Language¶
JAF uses an S-expression syntax for queries and expressions:
# Simple equality check
["eq?", "@name", "Alice"]
# Complex boolean logic
["and",
["gt?", "@age", 18],
["or",
["eq?", "@status", "active"],
["in?", "@role", ["admin", "moderator"]]
]
]
# Path navigation with @
["eq?", "@user.profile.verified", true]
The @ Notation¶
The @ prefix provides concise path navigation:
- "@name" → Get the "name" field
- "@user.email" → Navigate nested objects
- "@items.*.price" → Use wildcards
- "@**.error" → Recursive search
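The same paths work in transformations such as map(). A small sketch, assuming map() accepts wildcard paths just as it accepts plain ones (orders.jsonl and its fields are illustrative):
from jaf import stream
# For each order, pull out the price of every item via the wildcard path
for prices in stream("orders.jsonl").map("@items.*.price").evaluate():
    print(prices)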
What JAF Does Well¶
JAF focuses on:
- Filtering: Complex boolean queries on JSON data
- Transformation: Mapping and reshaping data
- Stream Processing: Efficient handling of large datasets
- Composition: Building complex pipelines from simple operations
For operations like grouping, joining, or aggregation, JAF works great with specialized tools:
- Use jsonl-algebra for relational operations
- Use jq for complex JSON transformations
- Use pandas or duckdb for analytical queries
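For example, a filtered stream can be handed straight to jq for reshaping (a sketch; the log file and field names are illustrative):
# Emit matching records as JSON, then extract one field with jq
jaf filter logs.jsonl '["eq?", "@level", "error"]' --eval | \
  jq -r '.message'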
Learn More¶
- Getting Started: Detailed installation and first steps
- Fluent API Guide: Complete guide to the Python API
- Query Language: JAF query syntax and operators
- CLI Reference: Command-line interface documentation
- Cookbook: Common patterns and recipes
- Advanced Topics: Performance, infinite streams, and more
Philosophy¶
JAF is designed around several key principles:
- Do One Thing Well: Focus on filtering and transforming JSON streams
- Lazy by Default: Build complex pipelines without immediate execution
- Composability: All operations can be freely combined
- Streaming First: Handle large datasets that don't fit in memory
- Unix Philosophy: Work well with other tools in a pipeline
Next Steps¶
- Read the Getting Started Guide for a comprehensive introduction
- Explore the Fluent API Guide to master the Python interface
- Check out the Cookbook for practical examples
- Reference the Query Language documentation for all available operators