Fluent API Guide

Overview

The Complex Network RAG system now features a fluent, Pythonic API designed for simplicity, composability, and discoverability. This guide shows how to use the new API and how to migrate from the old one.

Design Philosophy

The fluent API follows these principles:

  1. Simplicity - Common tasks should be trivial (one-liners when possible)
  2. Composability - Components combine naturally via method chaining
  3. Discoverability - Method names reveal intent, IDE autocomplete works well
  4. Pythonic - Uses context managers, properties, and Python conventions
  5. Backward Compatible - Old API still works, no breaking changes
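
To make the composability principle concrete, here is a minimal, self-contained sketch of how a chainable query object can work: configuration methods store a setting and return `self`, and a terminal method runs the query. `SketchQuery` and its word-overlap scoring are hypothetical stand-ins for illustration, not the library's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SketchQuery:
    """Hypothetical chainable query; not the library's actual class."""
    docs: dict                      # id -> (text, metadata dict)
    text: str
    filters: dict = field(default_factory=dict)

    def filter(self, **kwargs):
        # Store the constraint and return self so calls can be chained.
        self.filters.update(kwargs)
        return self

    def top(self, n):
        # Terminal step: apply filters, score by shared words, keep n best.
        query_words = set(self.text.lower().split())
        scored = []
        for doc_id, (content, meta) in self.docs.items():
            if any(meta.get(k) != v for k, v in self.filters.items()):
                continue
            overlap = len(query_words & set(content.lower().split()))
            if overlap:
                scored.append((overlap, doc_id))
        # Highest overlap first; ties broken by id for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored[:n]]


docs = {
    "py1": ("intro to python programming", {"level": "beginner"}),
    "py2": ("advanced python programming", {"level": "advanced"}),
    "js1": ("javascript basics", {"level": "beginner"}),
}
chained = SketchQuery(docs, "python programming").filter(level="beginner").top(5)
print(chained)  # ['py1']
```

Because nothing executes until `top()` is called, filters can be added in any order without running intermediate searches.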

Quick Start

Simple Usage (Most Common Case)

from src import NetworkRAG

# Create a RAG system with one line
rag = NetworkRAG.create("my_knowledge.db")

# Add content
rag.add("Python is a programming language")
rag.add("Machine learning uses Python", tags=["ml"])

# Search with fluent interface
results = rag.search("programming").top(5)
for result in results:
    print(result.content, result.score)

Factory Methods

Quick Setup Patterns

# In-memory (for testing)
rag = NetworkRAG.in_memory()

# With TF-IDF embeddings
rag = NetworkRAG.with_tfidf("knowledge.db", max_features=512)

# With Ollama embeddings
rag = NetworkRAG.with_ollama("knowledge.db", model="nomic-embed-text")

# Defaults (no path given: in-memory storage with TF-IDF embeddings)
rag = NetworkRAG.create()

Builder Pattern

Custom Configuration

# Fluent builder for complex setups
rag = (NetworkRAG.builder()
    .with_storage("my_knowledge.db")
    .with_tfidf_embeddings(max_features=512)
    .with_similarity_threshold(0.7, 0.8)
    .build())

Adding Content

Simple Add

# Auto-generated ID
node_id = rag.add("Content here")

# Explicit ID
rag.add("Content", id="doc1")

# With metadata
rag.add("Content", id="doc1", tags=["important"], year=2024)

Batch Operations

# Efficient bulk adds
with rag.batch() as batch:
    for doc in documents:
        batch.add(doc.content, id=doc.id, **doc.metadata)
# Network rebuilds automatically after batch
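
The reason a batch is cheaper can be sketched with a small stand-in class: each individual add triggers a rebuild, while adds inside the context manager defer it until the block exits. `SketchIndex` and its rebuild counter are hypothetical; the library's internals may differ.

```python
from contextlib import contextmanager


class SketchIndex:
    """Hypothetical store that counts how often its network is rebuilt."""

    def __init__(self):
        self.items = {}
        self.rebuilds = 0
        self._deferred = False

    def add(self, content, id):
        self.items[id] = content
        if not self._deferred:
            self._rebuild()

    def _rebuild(self):
        # Stand-in for an expensive similarity-network rebuild.
        self.rebuilds += 1

    @contextmanager
    def batch(self):
        self._deferred = True
        try:
            yield self
        finally:
            self._deferred = False
        # Reached only when the block finished without raising.
        self._rebuild()


one_by_one = SketchIndex()
for i in range(100):
    one_by_one.add(f"document {i}", id=f"doc{i}")

batched = SketchIndex()
with batched.batch() as batch:
    for i in range(100):
        batch.add(f"document {i}", id=f"doc{i}")

print(one_by_one.rebuilds, batched.rebuilds)  # 100 1
```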

Searching

Basic Search

# Simple search
results = rag.search("query").top(10)

# With strategy
results = rag.search("query").with_strategy("hybrid").top(10)

Advanced Queries

# Filter by metadata
results = (rag.search("machine learning")
    .filter(tags=["tutorial"])
    .filter(year=2024)
    .top(5))

# Search in specific community
results = (rag.search("query")
    .in_community(2)
    .top(10))

# Expand to neighbors
results = (rag.search("query")
    .expand_neighbors(hops=2)
    .top(10))

# Prioritize network structure
results = (rag.search("query")
    .prioritize_hubs()
    .prioritize_bridges()
    .top(10))

# Chain multiple filters
results = (rag.search("deep learning")
    .with_strategy("hybrid")
    .filter(tags=["ai"])
    .expand_neighbors(hops=1)
    .prioritize_hubs()
    .top(10))
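
Neighbor expansion can be pictured as a breadth-first walk outward from the initial matches, limited to a fixed number of hops. The sketch below assumes the network is available as a plain adjacency dict; the function name and data layout are illustrative, not taken from the library.

```python
from collections import deque


def expand_by_hops(adjacency, seeds, hops):
    """Return the seeds plus every node within `hops` edges of a seed."""
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, dist = frontier.popleft()
        if dist == hops:
            continue  # do not walk past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return seen


# A simple chain: a - b - c - d
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
within_two = expand_by_hops(adjacency, ["a"], hops=2)
print(sorted(within_two))  # ['a', 'b', 'c']
```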

Working with Results

Rich Result Objects

results = rag.search("query").top(10)

# Access by index
first = results[0]
print(first.id, first.content, first.score, first.metadata)

# Slicing
top_three = results[:3]

# Iteration
for result in results:
    print(result)

# Conversion
ids = results.ids()           # List[str]
scores = results.scores()     # List[float]
contents = results.contents() # List[str]

# Filtering
high_score = results.filter_by_score(0.7)
tagged = results.filter_by_metadata(tag="important")
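
Index access, slicing, and iteration all come from Python's sequence protocol. The following sketch shows one way a result collection could support them; `SketchResult` and `SketchResultSet` are hypothetical names, and treating the score threshold as an inclusive minimum is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SketchResult:
    id: str
    content: str
    score: float
    metadata: dict = field(default_factory=dict)


class SketchResultSet:
    """Hypothetical result collection; not the library's ResultSet."""

    def __init__(self, results):
        self._results = list(results)

    def __len__(self):
        return len(self._results)

    def __iter__(self):
        return iter(self._results)

    def __getitem__(self, key):
        # A slice returns a new collection; an int returns one result.
        if isinstance(key, slice):
            return SketchResultSet(self._results[key])
        return self._results[key]

    def ids(self):
        return [r.id for r in self._results]

    def filter_by_score(self, min_score):
        # Assumption: the threshold is an inclusive minimum.
        return SketchResultSet(r for r in self._results if r.score >= min_score)


demo = SketchResultSet([
    SketchResult("a", "alpha", 0.9),
    SketchResult("b", "beta", 0.75),
    SketchResult("c", "gamma", 0.4),
])
print(demo[0].id, demo[:2].ids(), demo.filter_by_score(0.7).ids())
```

Returning a new collection from slices and filters keeps the helper methods available on every intermediate result.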

Node Management

CRUD Operations

# Create
node_id = rag.add("Content", id="doc1", tag="important")

# Read
node = rag.get("doc1")
print(node.content)
print(node.metadata)
print(node.neighbors)
print(node.community)
print(node.degree)
print(node.is_hub(min_degree=5))

# Update
rag.update("doc1", tag="very-important", year=2024)

# Delete
rag.delete("doc1")

Network Analysis

Properties

# Simple property access
print(f"Nodes: {rag.node_count}")
print(f"Edges: {rag.edge_count}")
print(f"Communities: {rag.community_count}")
print(f"Density: {rag.density}")
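
For reference, these statistics can be computed from a plain edge list. The sketch below assumes an undirected network without self-loops, where density is the number of edges divided by the number of possible node pairs, 2E / (N(N - 1)); the guide does not state which definition the library uses, so treat this as illustrative.

```python
def graph_stats(edges):
    """Count nodes and edges of an undirected graph and compute its density."""
    nodes = set()
    unique_edges = set()
    for a, b in edges:
        nodes.update((a, b))
        unique_edges.add(frozenset((a, b)))  # (a, b) and (b, a) are one edge
    n, e = len(nodes), len(unique_edges)
    # Nodes without any edge are not visible in an edge list.
    density = 2 * e / (n * (n - 1)) if n > 1 else 0.0
    return {"nodes": n, "edges": e, "density": density}


stats = graph_stats([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])
print(stats)  # 4 nodes, 4 edges, density 2/3
```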

Hub and Bridge Nodes

# Get hub nodes
hubs = rag.hubs(min_degree=5)

# Get bridge nodes
bridges = rag.bridges(min_betweenness=0.01)
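
As a rough illustration of what a degree-based hub filter does, the sketch below counts edges per node and keeps nodes at or above the threshold. It works on a plain edge list, the function name is hypothetical, and the inclusive comparison is an assumption; bridge detection by betweenness is more involved and is not sketched here.

```python
from collections import Counter


def find_hubs(edges, min_degree):
    """Return nodes whose degree is at least `min_degree`, sorted by id."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(node for node, d in degree.items() if d >= min_degree)


edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
hubs = find_hubs(edges, min_degree=3)
print(hubs)  # ['hub']
```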

Community Exploration

# Get community object
comm = rag.get_community(0)
print(f"Size: {comm.size}")
print(f"Tags: {comm.tags}")
print(f"Nodes: {comm.nodes}")

# Iterate communities
for comm_id in range(rag.community_count):
    comm = rag.get_community(comm_id)
    print(f"Community {comm_id}: {comm.size} nodes")

Visualization

# Simple visualization
rag.visualize("network.html")

# With options
rag.visualize("network.html",
    highlight_communities=True,
    show_hubs=True)

# Custom visualization
viz = rag.visualization()
viz.highlight_nodes(["doc1", "doc2"])
viz.save("custom.html")

Migration Guide

Old API → New API

Setup

# Old
from src import NetworkRAG, SQLiteStorage, TFIDFEmbedding
storage = SQLiteStorage("my.db")
embedder = TFIDFEmbedding(max_features=256)
rag = NetworkRAG(storage, embedder, min_similarity=0.7)

# New
from src import NetworkRAG
rag = NetworkRAG.with_tfidf("my.db", max_features=256)

Adding Nodes

# Old
rag.add_node("doc1", "content", {"tag": "foo"})

# New
rag.add("content", id="doc1", tag="foo")

Searching

# Old
results = rag.find_similar("query", n=10, strategy="hybrid")
# returns: List[str] (node IDs)

# New
results = rag.search("query").with_strategy("hybrid").top(10)
# returns: ResultSet (rich results)

# Convert to IDs if needed
ids = results.ids()

Network Building

# Old
rag.build_network()
communities = rag.detect_communities()

# New (same, but also available via properties)
rag.build_network()
communities = rag.detect_communities()
# OR
count = rag.community_count  # Auto-builds if needed
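
The "auto-builds if needed" behavior can be sketched with a property that computes its result on first access and invalidates it when content changes. `LazyStats` is a hypothetical stand-in, and grouping by a topic label replaces real community detection purely to keep the example small.

```python
class LazyStats:
    """Hypothetical holder that builds its communities on first access."""

    def __init__(self, topics):
        self._topics = dict(topics)   # node id -> topic label
        self._communities = None      # None means "not built yet"
        self.build_calls = 0

    def add(self, node_id, topic):
        self._topics[node_id] = topic
        self._communities = None      # invalidate the cached result

    def _build(self):
        # Stand-in for network building plus community detection.
        self.build_calls += 1
        groups = {}
        for node_id, topic in self._topics.items():
            groups.setdefault(topic, []).append(node_id)
        self._communities = list(groups.values())

    @property
    def community_count(self):
        if self._communities is None:
            self._build()
        return len(self._communities)


lazy = LazyStats({"py1": "python", "py2": "python", "js1": "javascript"})
first = lazy.community_count    # triggers the build
second = lazy.community_count   # reuses the cached result
print(first, second, lazy.build_calls)  # 2 2 1
```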

Getting Node Info

# Old
node_data = rag.storage.get_node("doc1")
content = node_data['content_text']
metadata = node_data['metadata']

# New
node = rag.get("doc1")
content = node.content
metadata = node.metadata
neighbors = node.neighbors
community = node.community

Compatibility

The old API still works! You can:

  1. Use old API exclusively - No changes needed
  2. Use new API exclusively - Cleaner, more intuitive
  3. Mix both APIs - Transition gradually

from src import NetworkRAG, SQLiteStorage, TFIDFEmbedding

# Old API construction
storage = SQLiteStorage(":memory:")
embedder = TFIDFEmbedding()
rag = NetworkRAG(storage, embedder)

# Mix with new API methods
rag.add("Content 1", id="doc1")  # New
rag.add_node("doc2", "Content 2")  # Old
results = rag.search("query").top(5)  # New
ids = rag.find_similar("query", n=5)  # Old

Complete Examples

Example 1: Simple Knowledge Base

from src import NetworkRAG

# Setup
rag = NetworkRAG.create("kb.db")

# Add documents
docs = [
    "Python is great for data science",
    "JavaScript is used for web development",
    "SQL manages relational databases"
]
for doc in docs:
    rag.add(doc)

# Search
results = rag.search("programming languages").top(3)
for r in results:
    print(f"{r.score:.2f}: {r.content}")

Example 2: Metadata Filtering

# Add with metadata
rag.add("Intro to Python", id="py1", level="beginner", topic="python")
rag.add("Advanced Python", id="py2", level="advanced", topic="python")
rag.add("JavaScript Basics", id="js1", level="beginner", topic="javascript")

# Search with filters
results = (rag.search("programming tutorial")
    .filter(level="beginner")
    .filter(topic="python")
    .top(5))

Example 3: Community Exploration

# Build network
rag.build_network()
rag.detect_communities()

# Explore communities
for i in range(rag.community_count):
    comm = rag.get_community(i)
    print(f"Community {i}: {comm.size} nodes")
    print(f"  Tags: {comm.tags[:3]}")

# Search within community
results = (rag.search("query")
    .in_community(0)
    .top(10))

Example 4: Bulk Import

# Efficient bulk loading
import json

with rag.batch() as batch:
    with open("data.jsonl", encoding="utf-8") as f:
        for line in f:
            doc = json.loads(line)
            batch.add(
                doc["content"],
                id=doc["id"],
                **doc["metadata"]
            )
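
The import loop above expects one JSON object per line with `content`, `id`, and `metadata` keys. The sketch below writes a small sample file in that shape and reads it back, so the format can be checked before loading real data; the record values and the temporary path are made up for illustration.

```python
import json
import os
import tempfile

# Sample records using the keys the import loop reads.
records = [
    {"id": "doc1", "content": "Python is great for data science",
     "metadata": {"topic": "python"}},
    {"id": "doc2", "content": "SQL manages relational databases",
     "metadata": {"topic": "sql"}},
]

path = os.path.join(tempfile.mkdtemp(), "data.jsonl")

# JSON Lines: one serialized object per line.
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Read the file back, skipping blank lines.
loaded = []
with open(path, encoding="utf-8") as f:
    for line in f:
        if line.strip():
            loaded.append(json.loads(line))

print(len(loaded), loaded[0]["metadata"])
```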

API Reference Summary

Factory Methods

  • NetworkRAG.create(db_path) - Quick start with defaults
  • NetworkRAG.in_memory() - In-memory for testing
  • NetworkRAG.with_tfidf(db_path, max_features) - TF-IDF embeddings
  • NetworkRAG.with_ollama(db_path, model, host) - Ollama embeddings
  • NetworkRAG.builder() - Builder pattern

Content Management

  • add(content, id, **metadata) - Add content
  • get(node_id) - Get node object
  • update(node_id, **metadata) - Update metadata
  • delete(node_id) - Delete node
  • batch() - Batch context manager

Searching

  • search(query) - Create query builder
  • .with_strategy(strategy) - Set strategy
  • .filter(**kwargs) - Filter by metadata
  • .in_community(id) - Search in community
  • .expand_neighbors(hops) - Include neighbors
  • .prioritize_hubs() - Boost hub nodes
  • .prioritize_bridges() - Boost bridge nodes
  • .top(n) - Execute and get results

Properties

  • node_count - Total nodes
  • edge_count - Total edges
  • community_count - Number of communities
  • density - Network density

Analysis

  • hubs(min_degree) - Get hub nodes
  • bridges(min_betweenness) - Get bridge nodes
  • get_community(id) - Get community object
  • visualize(path, **options) - Visualize network
  • visualization() - Get visualizer object

Result Objects

  • ResultSet - Collection of results
      • [index] - Index access
      • [start:end] - Slicing
      • .ids() - Extract IDs
      • .scores() - Extract scores
      • .contents() - Extract contents
      • .filter_by_score(min) - Filter by score
      • .filter_by_metadata(**kwargs) - Filter by metadata

  • SearchResult - Single result
      • .id - Node ID
      • .content - Text content
      • .score - Similarity score
      • .metadata - Metadata dict
      • .community_id - Community ID

  • Node - Rich node object
      • .id - Node ID
      • .content - Text content
      • .metadata - Metadata dict
      • .neighbors - Neighbor IDs
      • .community - Community ID
      • .degree - Node degree
      • .is_hub(min_degree) - Check if hub

  • Community - Rich community object
      • .id - Community ID
      • .nodes - Node IDs
      • .size - Number of nodes
      • .tags - Auto-generated tags
Best Practices

  1. Use factory methods for common patterns

    rag = NetworkRAG.with_tfidf("kb.db")  # Not: builder + build
    

  2. Use batch context for bulk operations

    with rag.batch() as batch:
        for doc in docs:
            batch.add(doc.content)
    

  3. Chain query operations for complex searches

    results = (rag.search("query")
        .filter(category="tech")
        .prioritize_hubs()
        .top(10))
    

  4. Use properties for simple stats

    print(rag.node_count)  # Not: rag.get_stats()['nodes']
    

  5. Use rich objects for exploration

    node = rag.get("doc1")
    print(node.neighbors, node.degree)  # Not: manual graph queries
    

Conclusion

The fluent API makes Complex Network RAG:

  • Easier to learn - Intuitive method names
  • Faster to write - One-liners for common tasks
  • More powerful - Method chaining for complex operations
  • Better documented - Rich objects with clear properties
  • Fully compatible - Old API still works

Start with the factory methods, use method chaining for queries, and leverage rich objects for exploration!