Fluent API Guide¶

Overview¶

The Complex Network RAG system provides a fluent, Pythonic API designed for simplicity, composability, and discoverability. This guide shows how to use the new API and how to migrate to it from the old one.

Design Philosophy¶

The fluent API follows these principles:

Simplicity - Common tasks should be trivial (one-liners when possible)
Composability - Components combine naturally via method chaining
Discoverability - Method names reveal intent, IDE autocomplete works well
Pythonic - Uses context managers, properties, and Python conventions
Backward Compatible - Old API still works, no breaking changes

Quick Start¶

Simple Usage (Most Common Case)¶

from src import NetworkRAG

# Create a RAG system with one line
rag = NetworkRAG.create("my_knowledge.db")

# Add content
rag.add("Python is a programming language")
rag.add("Machine learning uses Python", tags=["ml"])

# Search with fluent interface
results = rag.search("programming").top(5)
for result in results:
    print(result.content, result.score)

Factory Methods¶

Quick Setup Patterns¶

# In-memory (for testing)
rag = NetworkRAG.in_memory()

# With TF-IDF embeddings
rag = NetworkRAG.with_tfidf("knowledge.db", max_features=512)

# With Ollama embeddings
rag = NetworkRAG.with_ollama("knowledge.db", model="nomic-embed-text")

# Default (in-memory with TF-IDF)
rag = NetworkRAG.create()

Builder Pattern¶

Custom Configuration¶

# Fluent builder for complex setups
rag = (NetworkRAG.builder()
    .with_storage("my_knowledge.db")
    .with_tfidf_embeddings(max_features=512)
    .with_similarity_threshold(0.7, 0.8)
    .build())
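
For readers curious how a chained builder like this can be structured, here is a minimal, self-contained sketch. It is not the library's implementation: `RagConfig` and `RagConfigBuilder` are hypothetical stand-ins, and the sketch stores the threshold values as given without assuming what each one means.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RagConfig:
    """Hypothetical immutable settings object produced by the builder."""
    storage_path: str = ":memory:"
    embedding: str = "tfidf"
    max_features: int = 512
    thresholds: Optional[Tuple[float, ...]] = None


class RagConfigBuilder:
    """Hypothetical builder: each with_* method records a setting and returns self."""

    def __init__(self) -> None:
        self._settings = {}

    def with_storage(self, path: str) -> "RagConfigBuilder":
        self._settings["storage_path"] = path
        return self  # returning self is what makes chaining possible

    def with_tfidf_embeddings(self, max_features: int = 512) -> "RagConfigBuilder":
        self._settings["embedding"] = "tfidf"
        self._settings["max_features"] = max_features
        return self

    def with_similarity_threshold(self, *values: float) -> "RagConfigBuilder":
        # Stored as given; this sketch does not interpret the individual values.
        self._settings["thresholds"] = tuple(values)
        return self

    def build(self) -> RagConfig:
        # Settings that were never supplied fall back to the dataclass defaults.
        return RagConfig(**self._settings)


config = (RagConfigBuilder()
    .with_storage("my_knowledge.db")
    .with_tfidf_embeddings(max_features=512)
    .with_similarity_threshold(0.7, 0.8)
    .build())
print(config)
```

Collecting settings first and constructing the object only in `build()` lets the final object stay immutable while the configuration steps remain order-independent.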

Adding Content¶

Simple Add¶

# Auto-generated ID
node_id = rag.add("Content here")

# Explicit ID
rag.add("Content", id="doc1")

# With metadata
rag.add("Content", id="doc2", tags=["important"], year=2024)

Batch Operations¶

# Efficient bulk adds; `documents` is any iterable of objects with
# .content, .id, and .metadata (a dict) attributes
with rag.batch() as batch:
    for doc in documents:
        batch.add(doc.content, id=doc.id, **doc.metadata)
# Network rebuilds automatically after batch
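
To illustrate why batching helps, the following sketch shows one way a batch context manager can defer an expensive rebuild until the block exits. `TinyIndex` and `TinyBatch` are hypothetical classes written for this example, and skipping the rebuild when the block raises is an assumption of the sketch, not documented library behavior.

```python
class TinyIndex:
    """Hypothetical stand-in for a store whose index is costly to rebuild."""

    def __init__(self) -> None:
        self.items = {}
        self.rebuild_count = 0
        self._in_batch = False

    def add(self, content: str, id: str) -> str:
        self.items[id] = content
        if not self._in_batch:
            self._rebuild()  # outside a batch, every add pays for a rebuild
        return id

    def _rebuild(self) -> None:
        self.rebuild_count += 1

    def batch(self) -> "TinyBatch":
        return TinyBatch(self)


class TinyBatch:
    """Context manager that suppresses rebuilds until the block exits."""

    def __init__(self, index: TinyIndex) -> None:
        self._index = index

    def __enter__(self) -> "TinyBatch":
        self._index._in_batch = True
        return self

    def add(self, content: str, id: str) -> str:
        return self._index.add(content, id=id)

    def __exit__(self, exc_type, exc, tb) -> bool:
        self._index._in_batch = False
        if exc_type is None:
            self._index._rebuild()  # one rebuild for the whole batch
        return False  # never swallow exceptions


index = TinyIndex()
with index.batch() as batch:
    for n in range(100):
        batch.add(f"document {n}", id=f"doc{n}")
print(index.rebuild_count)  # 1
```

In this sketch, one hundred adds inside the batch trigger a single rebuild instead of one hundred.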

Searching¶

Basic Search¶

# Simple search
results = rag.search("query").top(10)

# With strategy
results = rag.search("query").with_strategy("hybrid").top(10)

Advanced Queries¶

# Filter by metadata
results = (rag.search("machine learning")
    .filter(tags=["tutorial"])
    .filter(year=2024)
    .top(5))

# Search in specific community
results = (rag.search("query")
    .in_community(2)
    .top(10))

# Expand to neighbors
results = (rag.search("query")
    .expand_neighbors(hops=2)
    .top(10))

# Prioritize network structure
results = (rag.search("query")
    .prioritize_hubs()
    .prioritize_bridges()
    .top(10))

# Chain multiple filters
results = (rag.search("deep learning")
    .with_strategy("hybrid")
    .filter(tags=["ai"])
    .expand_neighbors(hops=1)
    .prioritize_hubs()
    .top(10))
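
The chained calls above only describe a query; nothing needs to run until `.top(n)` is called. The sketch below shows that deferred-execution pattern on a toy corpus. `ToyQuery`, `DOCS`, and the word-overlap score are hypothetical, and the filter semantics (list values must all be present, other values must match exactly) are an assumption rather than the library's documented behavior.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy corpus: id -> (content, metadata)
DOCS: Dict[str, Tuple[str, dict]] = {
    "a": ("deep learning with python", {"tags": ["ai"], "year": 2024}),
    "b": ("deep sea diving", {"tags": ["sports"], "year": 2024}),
    "c": ("learning to cook", {"tags": ["food"], "year": 2023}),
}


def word_overlap(query: str, text: str) -> float:
    """Toy score: fraction of query words that appear in the text."""
    words = query.lower().split()
    text_words = set(text.lower().split())
    return sum(w in text_words for w in words) / len(words) if words else 0.0


class ToyQuery:
    """Collects filters as it is chained; runs only when top() is called."""

    def __init__(self, query: str) -> None:
        self._query = query
        self._predicates: List[Callable[[dict], bool]] = []

    def filter(self, **criteria) -> "ToyQuery":
        def matches(meta: dict) -> bool:
            for key, wanted in criteria.items():
                have = meta.get(key)
                if isinstance(wanted, list):
                    # Assumed semantics: every wanted value must be present.
                    if not isinstance(have, list) or not set(wanted) <= set(have):
                        return False
                elif have != wanted:
                    return False
            return True

        self._predicates.append(matches)
        return self

    def top(self, n: int) -> List[Tuple[str, float]]:
        """Terminal step: apply filters, score, sort, and truncate."""
        scored = [
            (doc_id, word_overlap(self._query, content))
            for doc_id, (content, meta) in DOCS.items()
            if all(pred(meta) for pred in self._predicates)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:n]


hits = ToyQuery("deep learning").filter(tags=["ai"]).filter(year=2024).top(5)
print(hits)  # [('a', 1.0)]
```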

Working with Results¶

Rich Result Objects¶

results = rag.search("query").top(10)

# Access by index
first = results[0]
print(first.id, first.content, first.score, first.metadata)

# Slicing
top_three = results[:3]

# Iteration
for result in results:
    print(result)

# Conversion
ids = results.ids()           # List[str]
scores = results.scores()     # List[float]
contents = results.contents() # List[str]

# Filtering
high_score = results.filter_by_score(0.7)
tagged = results.filter_by_metadata(tag="important")
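
A container with this behavior can be built on Python's sequence protocol. The sketch below is a hypothetical `ToyResultSet`, not the library's `ResultSet`; in particular, returning a new result set from a slice and treating the score threshold as inclusive are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class ToyResult:
    id: str
    content: str
    score: float
    metadata: dict = field(default_factory=dict)


class ToyResultSet:
    """Hypothetical list-like container with a few convenience methods."""

    def __init__(self, results: List[ToyResult]) -> None:
        self._results = list(results)

    def __len__(self) -> int:
        return len(self._results)

    def __iter__(self) -> Iterator[ToyResult]:
        return iter(self._results)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return ToyResultSet(self._results[key])  # slices keep the helpers
        return self._results[key]

    def ids(self) -> List[str]:
        return [r.id for r in self._results]

    def scores(self) -> List[float]:
        return [r.score for r in self._results]

    def filter_by_score(self, min_score: float) -> "ToyResultSet":
        return ToyResultSet([r for r in self._results if r.score >= min_score])

    def filter_by_metadata(self, **criteria) -> "ToyResultSet":
        return ToyResultSet([
            r for r in self._results
            if all(r.metadata.get(k) == v for k, v in criteria.items())
        ])


toy_results = ToyResultSet([
    ToyResult("doc1", "Python basics", 0.91, {"tag": "important"}),
    ToyResult("doc2", "Python packaging", 0.74),
    ToyResult("doc3", "Gardening tips", 0.12, {"tag": "important"}),
])
print(toy_results[0].id)                                      # doc1
print(toy_results[:2].ids())                                  # ['doc1', 'doc2']
print(toy_results.filter_by_score(0.7).ids())                 # ['doc1', 'doc2']
print(toy_results.filter_by_metadata(tag="important").ids())  # ['doc1', 'doc3']
```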

Node Management¶

CRUD Operations¶

# Create
node_id = rag.add("Content", id="doc1", tag="important")

# Read
node = rag.get("doc1")
print(node.content)
print(node.metadata)
print(node.neighbors)
print(node.community)
print(node.degree)
print(node.is_hub(min_degree=5))

# Update
rag.update("doc1", tag="very-important", year=2024)

# Delete
rag.delete("doc1")

Network Analysis¶

Properties¶

# Simple property access
print(f"Nodes: {rag.node_count}")
print(f"Edges: {rag.edge_count}")
print(f"Communities: {rag.community_count}")
print(f"Density: {rag.density}")
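
For reference, the usual definition of density for an undirected graph without self-loops is the number of edges divided by the number of possible edges. The helper below sketches that formula; whether `rag.density` uses exactly this definition is an assumption, and `undirected_density` is a hypothetical name.

```python
def undirected_density(node_count: int, edge_count: int) -> float:
    """Edges present divided by edges possible, n * (n - 1) / 2."""
    if node_count < 2:
        return 0.0  # no edges are possible with fewer than two nodes
    return 2 * edge_count / (node_count * (node_count - 1))


print(undirected_density(4, 3))  # 0.5
```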

Hub and Bridge Nodes¶

# Get hub nodes
hubs = rag.hubs(min_degree=5)

# Get bridge nodes
bridges = rag.bridges(min_betweenness=0.01)
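
The parameter names suggest that hubs are selected by degree and bridges by betweenness centrality. The sketch below computes both on a small undirected graph using only the standard library; betweenness follows Brandes' algorithm with the common normalization for undirected graphs. The function names and the sample graph are hypothetical, and the library's own computation may differ.

```python
from collections import deque
from typing import Dict, List

Graph = Dict[str, List[str]]  # undirected adjacency lists


def hubs(graph: Graph, min_degree: int) -> List[str]:
    return [node for node, nbrs in graph.items() if len(nbrs) >= min_degree]


def betweenness(graph: Graph) -> Dict[str, float]:
    """Normalized betweenness centrality (Brandes' algorithm, unweighted)."""
    raw = {v: 0.0 for v in graph}
    for source in graph:
        order = []                        # nodes in order of discovery
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        sigma = {v: 0 for v in graph}     # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):         # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                raw[w] += delta[w]
    n = len(graph)
    # Each unordered pair was counted twice; dividing by (n-1)(n-2) both
    # halves the count and normalizes scores to the range [0, 1].
    scale = (n - 1) * (n - 2) if n > 2 else 1
    return {v: score / scale for v, score in raw.items()}


def bridges(graph: Graph, min_betweenness: float) -> List[str]:
    return [v for v, s in betweenness(graph).items() if s >= min_betweenness]


# Two triangles (a, b, c) and (d, e, f) joined by the single edge c-d.
GRAPH: Graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
print(hubs(GRAPH, min_degree=3))             # ['c', 'd']
print(bridges(GRAPH, min_betweenness=0.5))   # ['c', 'd']
```

In this graph, every shortest path between the two triangles passes through `c` and `d`, which is why they score highest on betweenness.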

Community Exploration¶

# Get community object
comm = rag.get_community(0)
print(f"Size: {comm.size}")
print(f"Tags: {comm.tags}")
print(f"Nodes: {comm.nodes}")

# Iterate communities
for comm_id in range(rag.community_count):
    comm = rag.get_community(comm_id)
    print(f"Community {comm_id}: {comm.size} nodes")

Visualization¶

# Simple visualization
rag.visualize("network.html")

# With options
rag.visualize("network.html",
    highlight_communities=True,
    show_hubs=True)

# Custom visualization
viz = rag.visualization()
viz.highlight_nodes(["doc1", "doc2"])
viz.save("custom.html")

Migration Guide¶

Old API → New API¶

Setup¶

# Old
from src import NetworkRAG, SQLiteStorage, TFIDFEmbedding
storage = SQLiteStorage("my.db")
embedder = TFIDFEmbedding(max_features=256)
rag = NetworkRAG(storage, embedder, min_similarity=0.7)

# New
from src import NetworkRAG
rag = NetworkRAG.with_tfidf("my.db", max_features=256)

Adding Nodes¶

# Old
rag.add_node("doc1", "content", {"tag": "foo"})

# New
rag.add("content", id="doc1", tag="foo")

Searching¶

# Old
results = rag.find_similar("query", n=10, strategy="hybrid")
# returns: List[str] (node IDs)

# New
results = rag.search("query").with_strategy("hybrid").top(10)
# returns: ResultSet (rich results)

# Convert to IDs if needed
ids = results.ids()

Network Building¶

# Old
rag.build_network()
communities = rag.detect_communities()

# New (same, but also available via properties)
rag.build_network()
communities = rag.detect_communities()
# OR
count = rag.community_count  # Auto-builds if needed
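
An "auto-builds if needed" property is typically a cached computation that is invalidated whenever the data changes. The sketch below shows that pattern with a hypothetical `LazyGraph`; it uses connected components as a simple stand-in for real community detection, which the library presumably performs with a more sophisticated algorithm.

```python
from typing import Dict, List, Optional, Set


class LazyGraph:
    """Hypothetical graph whose grouping is computed on first access."""

    def __init__(self) -> None:
        self._adj: Dict[str, Set[str]] = {}
        self._groups: Optional[List[Set[str]]] = None  # None means "stale"
        self.detect_calls = 0

    def add_edge(self, a: str, b: str) -> None:
        self._adj.setdefault(a, set()).add(b)
        self._adj.setdefault(b, set()).add(a)
        self._groups = None  # any change invalidates the cached grouping

    def _detect_groups(self) -> List[Set[str]]:
        # Stand-in for community detection: plain connected components.
        self.detect_calls += 1
        seen: Set[str] = set()
        groups: List[Set[str]] = []
        for start in self._adj:
            if start in seen:
                continue
            group: Set[str] = set()
            stack = [start]
            while stack:
                node = stack.pop()
                if node in group:
                    continue
                group.add(node)
                stack.extend(self._adj[node] - group)
            seen |= group
            groups.append(group)
        return groups

    @property
    def community_count(self) -> int:
        if self._groups is None:  # build only when the cache is stale
            self._groups = self._detect_groups()
        return len(self._groups)


g = LazyGraph()
g.add_edge("a", "b")
g.add_edge("c", "d")
print(g.community_count, g.community_count)  # 2 2
print(g.detect_calls)                        # 1
```

Reading the property twice runs the detection once; adding another edge marks the cache stale so the next read recomputes it.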

Getting Node Info¶

# Old
node_data = rag.storage.get_node("doc1")
content = node_data['content_text']
metadata = node_data['metadata']

# New
node = rag.get("doc1")
content = node.content
metadata = node.metadata
neighbors = node.neighbors
community = node.community

Compatibility¶

The old API still works! You can:

Use old API exclusively - No changes needed
Use new API exclusively - Cleaner, more intuitive
Mix both APIs - Transition gradually

# Old API construction
from src import NetworkRAG, SQLiteStorage, TFIDFEmbedding
storage = SQLiteStorage(":memory:")
embedder = TFIDFEmbedding()
rag = NetworkRAG(storage, embedder)

# Mix with new API methods
rag.add("Content 1", id="doc1")  # New
rag.add_node("doc2", "Content 2")  # Old
results = rag.search("query").top(5)  # New
ids = rag.find_similar("query", n=5)  # Old

Complete Examples¶

Example 1: Simple Knowledge Base¶

from src import NetworkRAG

# Setup
rag = NetworkRAG.create("kb.db")

# Add documents
docs = [
    "Python is great for data science",
    "JavaScript is used for web development",
    "SQL manages relational databases"
]
for doc in docs:
    rag.add(doc)

# Search
results = rag.search("programming languages").top(3)
for r in results:
    print(f"{r.score:.2f}: {r.content}")

Example 2: Metadata Filtering¶

# Add with metadata (assumes `rag` was created as in Example 1)
rag.add("Intro to Python", id="py1", level="beginner", topic="python")
rag.add("Advanced Python", id="py2", level="advanced", topic="python")
rag.add("JavaScript Basics", id="js1", level="beginner", topic="javascript")

# Search with filters
results = (rag.search("programming tutorial")
    .filter(level="beginner")
    .filter(topic="python")
    .top(5))

Example 3: Community-Aware Search¶

# Build network (assumes `rag` already contains documents)
rag.build_network()
rag.detect_communities()

# Explore communities
for i in range(rag.community_count):
    comm = rag.get_community(i)
    print(f"Community {i}: {comm.size} nodes")
    print(f"  Tags: {comm.tags[:3]}")

# Search within community
results = (rag.search("query")
    .in_community(0)
    .top(10))

Example 4: Bulk Import¶

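# Efficient bulk loading from a JSON Lines file (one JSON object per line)
import json

with rag.batch() as batch:
    with open("data.jsonl", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            doc = json.loads(line)
            batch.add(
                doc["content"],
                id=doc["id"],
                **doc.get("metadata", {})  # metadata is optional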
            )
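
The loop above expects one JSON object per line with `content`, `id`, and `metadata` keys. As a sketch of that file format, the helpers below write and re-read such a file using only the standard library; `write_jsonl`, `read_jsonl`, and the sample records are hypothetical and independent of the RAG library.

```python
import json
import os
import tempfile
from typing import Iterator, List


def write_jsonl(path: str, rows: List[dict]) -> None:
    """Write one compact JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one dict per non-blank line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


records = [
    {"id": "doc1", "content": "Python is a programming language",
     "metadata": {"topic": "python"}},
    {"id": "doc2", "content": "SQL manages relational databases",
     "metadata": {}},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.jsonl")
    write_jsonl(path, records)
    loaded = list(read_jsonl(path))

print(loaded == records)  # True
```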

API Reference Summary¶

Factory Methods¶

NetworkRAG.create(db_path) - Quick start with defaults (in-memory with TF-IDF when db_path is omitted)
NetworkRAG.in_memory() - In-memory for testing
NetworkRAG.with_tfidf(db_path, max_features) - TF-IDF embeddings
NetworkRAG.with_ollama(db_path, model, host) - Ollama embeddings
NetworkRAG.builder() - Builder pattern

Content Management¶

add(content, id, **metadata) - Add content
get(node_id) - Get node object
update(node_id, **metadata) - Update metadata
delete(node_id) - Delete node
batch() - Batch context manager

Search¶

search(query) - Create query builder
    .with_strategy(strategy) - Set strategy
    .filter(**kwargs) - Filter by metadata
    .in_community(id) - Search in community
    .expand_neighbors(hops) - Include neighbors
    .prioritize_hubs() - Boost hub nodes
    .prioritize_bridges() - Boost bridge nodes
    .top(n) - Execute and get results

Properties¶

node_count - Total nodes
edge_count - Total edges
community_count - Number of communities
density - Network density

Analysis¶

hubs(min_degree) - Get hub nodes
bridges(min_betweenness) - Get bridge nodes
get_community(id) - Get community object
visualize(path, **options) - Visualize network
visualization() - Get visualizer object

Result Objects¶

ResultSet - Collection of results
    [index] - Index access
    [start:end] - Slicing
    .ids() - Extract IDs
    .scores() - Extract scores
    .contents() - Extract contents
    .filter_by_score(min) - Filter by score
    .filter_by_metadata(**kwargs) - Filter by metadata
SearchResult - Single result
    .id - Node ID
    .content - Text content
    .score - Similarity score
    .metadata - Metadata dict
    .community_id - Community ID
Node - Rich node object
    .id - Node ID
    .content - Text content
    .metadata - Metadata dict
    .neighbors - Neighbor IDs
    .community - Community ID
    .degree - Node degree
    .is_hub(min_degree) - Check if hub
Community - Rich community object
    .id - Community ID
    .nodes - Node IDs
    .size - Number of nodes
    .tags - Auto-generated tags

Best Practices¶

Use factory methods for common patterns

rag = NetworkRAG.with_tfidf("kb.db")  # Not: builder + build

Use batch context for bulk operations

with rag.batch() as batch:
    for doc in docs:
        batch.add(doc.content)

Chain query operations for complex searches

results = (rag.search("query")
    .filter(category="tech")
    .prioritize_hubs()
    .top(10))

Use properties for simple stats

print(rag.node_count)  # Not: rag.get_stats()['nodes']

Use rich objects for exploration

node = rag.get("doc1")
print(node.neighbors, node.degree)  # Not: manual graph queries

Conclusion¶

The fluent API makes Complex Network RAG:

Easier to learn - Intuitive method names
Faster to write - One-liners for common tasks
More powerful - Method chaining for complex operations
Better documented - Rich objects with clear properties
Fully compatible - Old API still works

Start with the factory methods, use method chaining for queries, and leverage rich objects for exploration!