Dotsuite vs JAF: Pedagogy vs Production

This document compares dotsuite with JAF (Just Another Flow) to help you choose the right tool for your needs.

Overview

  • Dotsuite: A pedagogical ecosystem teaching data manipulation concepts through simple, composable tools
  • JAF: A production-ready streaming data processing system implementing similar concepts at scale

Philosophical Differences

Dotsuite - "Learn by Building"

  • Goal: Teach concepts through simplicity
  • Philosophy: Unix philosophy - one tool, one purpose
  • Audience: Learners, educators, prototypers
  • Code Style: "Steal this code" - simple enough to copy and understand

JAF - "Battle-Tested Solution"

  • Goal: Production-ready data processing
  • Philosophy: Feature-complete streaming framework
  • Audience: Production systems, data engineers
  • Code Style: Robust, optimized, extensive error handling

Feature Comparison

Feature Dotsuite JAF
Query Language Simple strings: "equals role admin" S-expressions: ["eq?", "@role", "admin"]
Path System Dot notation: "user.name" Advanced AST with regex, fuzzy, wildcards
Evaluation Eager (with lazy QuerySet option) Lazy by default with streaming
Data Sources In-memory collections Files, directories, stdin, compressed, infinite
Set Operations Basic (union, intersection, difference) Index-preserving with collection tracking
Pipeline Composition Separate tools (dotpipe) Built-in method chaining
Memory Usage Loads data into memory Streaming generators
Error Handling Basic Comprehensive with custom exceptions
Performance Educational focus Optimized for production
Dependencies Minimal Optional advanced libraries

Conceptual Alignment

Both projects share core concepts but implement them differently:

Filtering as Boolean Algebra

  • Dotsuite: dotfilter with QuerySet for lazy evaluation
  • JAF: FilteredStream with comprehensive boolean operations

Pipeline Composition

  • Dotsuite: dotpipe for transformation chains
  • JAF: Built-in .filter().map().take() chaining

Path Navigation

  • Dotsuite: dotget with simple dot notation
  • JAF: Complex path AST with multiple navigation strategies

When to Use Which?

Use Dotsuite When:

  • Learning data manipulation concepts
  • Teaching programming or data structures
  • Building prototypes
  • You need simple, understandable code
  • You want to modify/extend the implementation
  • Working with small to medium datasets

Use JAF When:

  • Building production systems
  • Processing large datasets or streams
  • Need advanced features (regex paths, fuzzy matching)
  • Require robust error handling
  • Working with multiple data sources
  • Need memory-efficient streaming

Learning Path

  1. Start with Dotsuite to understand concepts:
  2. How path navigation works (dotget)
  3. Boolean filtering (dotfilter)
  4. Pipeline composition (dotpipe)
  5. Set operations on collections

  6. Graduate to JAF for production:

  7. Apply learned concepts at scale
  8. Leverage advanced features
  9. Handle real-world edge cases

Code Examples

Simple Filter Operation

Dotsuite:

from collections.dotfilter import filter_docs

users = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
adults = filter_docs(users, "greater_than age 25")

JAF:

from jaf import stream

adults = stream("users.jsonl") \
    .filter(["gt?", "@age", 25]) \
    .evaluate()

Pipeline Composition

Dotsuite:

from shape.dotpipe import pipe
from collections.dotfilter import filter_docs

result = pipe(
    data,
    lambda d: filter_docs(d, "equals status active"),
    lambda d: [dotget(doc, "email") for doc in d]
)

JAF:

result = stream("users.jsonl") \
    .filter(["eq?", "@status", "active"]) \
    .map("@email") \
    .evaluate()

Integration

Both projects can work together:

  1. Learn with Dotsuite: Understand the concepts
  2. Prototype with Dotsuite: Quick implementations
  3. Production with JAF: Scale when ready
  4. Contribute back: Share learnings with both communities

Summary

Dotsuite and JAF are complementary projects: - Dotsuite teaches the "why" and "how" through simplicity - JAF provides the "what" for production use

Together, they form a complete ecosystem from learning to production.