Fluent API Guide¶
Overview¶
The Complex Network RAG system now features a fluent, Pythonic API designed for simplicity, composability, and discoverability. This guide shows how to use the new API and migrate from the old API.
Design Philosophy¶
The fluent API follows these principles:
- Simplicity - Common tasks should be trivial (one-liners when possible)
- Composability - Components combine naturally via method chaining
- Discoverability - Method names reveal intent, IDE autocomplete works well
- Pythonic - Uses context managers, properties, and Python conventions
- Backward Compatible - Old API still works, no breaking changes
Quick Start¶
Simple Usage (Most Common Case)¶
from src import NetworkRAG
# Create a RAG system with one line
rag = NetworkRAG.create("my_knowledge.db")
# Add content
rag.add("Python is a programming language")
rag.add("Machine learning uses Python", tags=["ml"])
# Search with fluent interface
results = rag.search("programming").top(5)
for result in results:
    print(result.content, result.score)
Factory Methods¶
Quick Setup Patterns¶
# In-memory (for testing)
rag = NetworkRAG.in_memory()
# With TF-IDF embeddings
rag = NetworkRAG.with_tfidf("knowledge.db", max_features=512)
# With Ollama embeddings
rag = NetworkRAG.with_ollama("knowledge.db", model="nomic-embed-text")
# Default (in-memory with TF-IDF)
rag = NetworkRAG.create()
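Factory methods of this kind are commonly written as classmethods that choose sensible arguments and forward them to the constructor. The following is a minimal sketch of that pattern, not the library's implementation: `TinyRAG` and its fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TinyRAG:
    """Hypothetical stand-in class; not the real NetworkRAG."""

    db_path: str = ":memory:"
    embedder: str = "tfidf"
    options: dict = field(default_factory=dict)

    @classmethod
    def in_memory(cls) -> "TinyRAG":
        # No file on disk, which is convenient for tests.
        return cls(db_path=":memory:")

    @classmethod
    def with_tfidf(cls, db_path: str, max_features: int = 512) -> "TinyRAG":
        # Named constructor: the method name documents the embedding choice.
        return cls(db_path=db_path, embedder="tfidf",
                   options={"max_features": max_features})

    @classmethod
    def create(cls, db_path: str = ":memory:") -> "TinyRAG":
        # Default setup: TF-IDF, in-memory unless a path is given.
        return cls.with_tfidf(db_path)


rag = TinyRAG.create()
print(rag.db_path, rag.embedder)
```

Named constructors keep the common cases to one line while leaving the full constructor available for unusual setups.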
Builder Pattern¶
Custom Configuration¶
# Fluent builder for complex setups
rag = (NetworkRAG.builder()
       .with_storage("my_knowledge.db")
       .with_tfidf_embeddings(max_features=512)
       .with_similarity_threshold(0.7, 0.8)
       .build())
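A builder like the one above typically works by having every `with_*` method record a setting and return the builder itself, with `build()` producing the final object. The sketch below illustrates that mechanism under stated assumptions: `TinyBuilder` is a hypothetical class, and its `build()` returns a plain configuration dict rather than a real RAG system.

```python
class TinyBuilder:
    """Hypothetical builder; each with_* method stores a setting and returns self."""

    def __init__(self):
        self._config = {"storage": ":memory:", "embedder": "tfidf"}

    def with_storage(self, path):
        self._config["storage"] = path
        return self  # returning self is what makes chaining possible

    def with_tfidf_embeddings(self, max_features=512):
        self._config["embedder"] = "tfidf"
        self._config["max_features"] = max_features
        return self

    def build(self):
        # A real builder would construct the system here; this sketch
        # only returns a copy of the collected configuration.
        return dict(self._config)


config = (TinyBuilder()
          .with_storage("my_knowledge.db")
          .with_tfidf_embeddings(max_features=512)
          .build())
print(config)
```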
Adding Content¶
Simple Add¶
# Auto-generated ID
node_id = rag.add("Content here")
# Explicit ID
rag.add("Content", id="doc1")
# With metadata
rag.add("Content", id="doc1", tags=["important"], year=2024)
Batch Operations¶
# Efficient bulk adds
with rag.batch() as batch:
    for doc in documents:
        batch.add(doc.content, id=doc.id, **doc.metadata)
# Network rebuilds automatically after batch
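The benefit of a batch context is that the expensive rebuild runs once on exit instead of once per document. Here is a rough sketch of how such a context manager could be structured. `TinyIndex` and `_Batch` are hypothetical, and discarding the buffered documents when an exception occurs is a choice made for this sketch, not documented library behavior.

```python
class TinyIndex:
    """Hypothetical index that counts how often it rebuilds."""

    def __init__(self):
        self.docs = {}
        self.rebuilds = 0

    def rebuild(self):
        self.rebuilds += 1

    def add(self, content, id):
        self.docs[id] = content
        self.rebuild()  # a single add triggers one rebuild

    def batch(self):
        return _Batch(self)


class _Batch:
    def __init__(self, index):
        self._index = index
        self._pending = []

    def add(self, content, id):
        self._pending.append((id, content))  # buffered, no rebuild yet

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit everything, then rebuild exactly once.
            self._index.docs.update(self._pending)
            self._index.rebuild()
        return False  # never swallow exceptions


index = TinyIndex()
with index.batch() as batch:
    for i in range(100):
        batch.add(f"document {i}", id=f"doc{i}")
print(len(index.docs), index.rebuilds)
```

In this sketch, 100 buffered adds cost one rebuild, whereas 100 calls to `index.add` would cost 100.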
Searching¶
Basic Search¶
# Simple search
results = rag.search("query").top(10)
# With strategy
results = rag.search("query").with_strategy("hybrid").top(10)
Advanced Queries¶
# Filter by metadata
results = (rag.search("machine learning")
           .filter(tags=["tutorial"])
           .filter(year=2024)
           .top(5))
# Search in specific community
results = (rag.search("query")
           .in_community(2)
           .top(10))
# Expand to neighbors
results = (rag.search("query")
           .expand_neighbors(hops=2)
           .top(10))
# Prioritize network structure
results = (rag.search("query")
           .prioritize_hubs()
           .prioritize_bridges()
           .top(10))
# Chain multiple filters
results = (rag.search("deep learning")
           .with_strategy("hybrid")
           .filter(tags=["ai"])
           .expand_neighbors(hops=1)
           .prioritize_hubs()
           .top(10))
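Chained queries like these are usually lazy: each chained method only records a step, and nothing runs until the terminal `top(n)` call. The sketch below shows that idea with a hypothetical `TinyQuery` class. Its word-overlap score and exact-equality metadata matching are simplifications assumed for illustration and say nothing about how the library actually scores or filters.

```python
class TinyQuery:
    """Hypothetical lazy query: chained methods record steps, top() runs them."""

    def __init__(self, docs, text):
        self._docs = docs  # list of {"id", "content", "metadata"} dicts
        self._words = set(text.lower().split())
        self._filters = {}

    def filter(self, **criteria):
        self._filters.update(criteria)  # repeated calls combine with AND
        return self

    def top(self, n):
        if not self._words:
            return []
        scored = []
        for doc in self._docs:
            meta = doc["metadata"]
            if any(meta.get(k) != v for k, v in self._filters.items()):
                continue  # fails at least one metadata filter
            # Toy score: fraction of query words present in the content.
            overlap = self._words & set(doc["content"].lower().split())
            scored.append((len(overlap) / len(self._words), doc["id"]))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:n]


docs = [
    {"id": "a", "content": "machine learning tutorial",
     "metadata": {"year": 2024, "level": "beginner"}},
    {"id": "b", "content": "machine learning theory",
     "metadata": {"year": 2023, "level": "advanced"}},
    {"id": "c", "content": "cooking tutorial",
     "metadata": {"year": 2024, "level": "beginner"}},
]
hits = (TinyQuery(docs, "machine learning")
        .filter(year=2024)
        .filter(level="beginner")
        .top(5))
print(hits)
```

Because execution is deferred, the order of the intermediate calls does not matter in this sketch; only `top(n)` touches the data.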
Working with Results¶
Rich Result Objects¶
results = rag.search("query").top(10)
# Access by index
first = results[0]
print(first.id, first.content, first.score, first.metadata)
# Slicing
top_three = results[:3]
# Iteration
for result in results:
    print(result)
# Conversion
ids = results.ids() # List[str]
scores = results.scores() # List[float]
contents = results.contents() # List[str]
# Filtering
high_score = results.filter_by_score(0.7)
tagged = results.filter_by_metadata(tag="important")
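Index access, slicing, and iteration on a result collection all come from a handful of Python special methods. The following sketch shows one way a collection with this behavior could be written. `Hit` and `Hits` are hypothetical names, and treating the score threshold as inclusive is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    id: str
    content: str
    score: float
    metadata: dict = field(default_factory=dict)


class Hits:
    """Hypothetical result collection supporting indexing, slicing, iteration."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return Hits(self._items[key])  # slices stay rich collections
        return self._items[key]

    def ids(self):
        return [h.id for h in self._items]

    def filter_by_score(self, min_score):
        # Inclusive threshold (an assumption of this sketch).
        return Hits(h for h in self._items if h.score >= min_score)


hits = Hits([Hit("a", "alpha", 0.9), Hit("b", "beta", 0.6), Hit("c", "gamma", 0.3)])
print(hits[0].id, hits[:2].ids(), hits.filter_by_score(0.5).ids())
```

Returning the same collection type from slices and filters is what lets calls such as `hits[:2].ids()` keep chaining.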
Node Management¶
CRUD Operations¶
# Create
node_id = rag.add("Content", id="doc1", tag="important")
# Read
node = rag.get("doc1")
print(node.content)
print(node.metadata)
print(node.neighbors)
print(node.community)
print(node.degree)
print(node.is_hub(min_degree=5))
# Update
rag.update("doc1", tag="very-important", year=2024)
# Delete
rag.delete("doc1")
Network Analysis¶
Properties¶
# Simple property access
print(f"Nodes: {rag.node_count}")
print(f"Edges: {rag.edge_count}")
print(f"Communities: {rag.community_count}")
print(f"Density: {rag.density}")
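For reference, the density of an undirected graph without self-loops is the number of edges divided by the number of possible node pairs, 2E / (N(N - 1)). The guide does not state which definition `rag.density` uses, so the helper below is only a sketch of the standard undirected formula; the function name is hypothetical.

```python
def undirected_density(node_count: int, edge_count: int) -> float:
    """Standard density of a simple undirected graph: 2E / (N * (N - 1))."""
    if node_count < 2:
        return 0.0  # no possible edges with fewer than two nodes
    return 2 * edge_count / (node_count * (node_count - 1))


print(undirected_density(4, 3))  # 0.5: three of the six possible edges exist
```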
Hub and Bridge Nodes¶
# Get hub nodes
hubs = rag.hubs(min_degree=5)
# Get bridge nodes
bridges = rag.bridges(min_betweenness=0.01)
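A hub, in the sense of the `min_degree` parameter, is a node with many connections. As an illustration of the idea only, the sketch below counts degrees from an edge list and keeps nodes at or above the threshold. `find_hubs` is a hypothetical helper, and the inclusive comparison is an assumption. Bridge detection relies on betweenness centrality and is not sketched here.

```python
from collections import defaultdict


def find_hubs(edges, min_degree=5):
    """Return nodes whose degree is at least min_degree, sorted by name."""
    degree = defaultdict(int)
    for a, b in edges:
        # Each undirected edge adds one to the degree of both endpoints.
        degree[a] += 1
        degree[b] += 1
    return sorted(node for node, d in degree.items() if d >= min_degree)


# One central node linked to five leaves, plus one leaf-to-leaf edge.
edges = [("hub", f"leaf{i}") for i in range(5)] + [("leaf0", "leaf1")]
print(find_hubs(edges, min_degree=5))
```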
Community Exploration¶
# Get community object
comm = rag.get_community(0)
print(f"Size: {comm.size}")
print(f"Tags: {comm.tags}")
print(f"Nodes: {comm.nodes}")
# Iterate communities
for comm_id in range(rag.community_count):
    comm = rag.get_community(comm_id)
    print(f"Community {comm_id}: {comm.size} nodes")
Visualization¶
# Simple visualization
rag.visualize("network.html")
# With options
rag.visualize("network.html",
              highlight_communities=True,
              show_hubs=True)
# Custom visualization
viz = rag.visualization()
viz.highlight_nodes(["doc1", "doc2"])
viz.save("custom.html")
Migration Guide¶
Old API → New API¶
Setup¶
# Old
from src import NetworkRAG, SQLiteStorage, TFIDFEmbedding
storage = SQLiteStorage("my.db")
embedder = TFIDFEmbedding(max_features=256)
rag = NetworkRAG(storage, embedder, min_similarity=0.7)
# New
from src import NetworkRAG
rag = NetworkRAG.with_tfidf("my.db", max_features=256)
Adding Nodes¶
# Old
rag.add_node("doc1", "content", {"tag": "foo"})
# New
rag.add("content", id="doc1", tag="foo")
Searching¶
# Old
results = rag.find_similar("query", n=10, strategy="hybrid")
# returns: List[str] (node IDs)
# New
results = rag.search("query").with_strategy("hybrid").top(10)
# returns: ResultSet (rich results)
# Convert to IDs if needed
ids = results.ids()
Network Building¶
# Old
rag.build_network()
communities = rag.detect_communities()
# New (same, but also available via properties)
rag.build_network()
communities = rag.detect_communities()
# OR
count = rag.community_count # Auto-builds if needed
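A property that "auto-builds if needed" is typically a cached lazy computation: the first access triggers the build, and later accesses reuse the result. The sketch below shows that pattern with a hypothetical `LazyStats` class. The fake detection step simply copies pre-grouped input and does not represent the library's community detection.

```python
class LazyStats:
    """Hypothetical class whose community_count builds on first access."""

    def __init__(self, groups):
        self._groups = groups      # pretend input to community detection
        self._communities = None   # None means "not built yet"
        self.build_calls = 0

    def detect_communities(self):
        self.build_calls += 1
        self._communities = [sorted(g) for g in self._groups]
        return self._communities

    @property
    def community_count(self):
        if self._communities is None:  # build only when needed
            self.detect_communities()
        return len(self._communities)


stats = LazyStats([{"a", "b"}, {"c"}])
print(stats.community_count, stats.community_count, stats.build_calls)
```

A real implementation would also need to reset the cache whenever nodes are added or removed. That step is omitted here.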
Getting Node Info¶
# Old
node_data = rag.storage.get_node("doc1")
content = node_data['content_text']
metadata = node_data['metadata']
# New
node = rag.get("doc1")
content = node.content
metadata = node.metadata
neighbors = node.neighbors
community = node.community
Compatibility¶
The old API still works! You can:
- Use old API exclusively - No changes needed
- Use new API exclusively - Cleaner, more intuitive
- Mix both APIs - Transition gradually
from src import NetworkRAG, SQLiteStorage, TFIDFEmbedding
# Old API construction
storage = SQLiteStorage(":memory:")
embedder = TFIDFEmbedding()
rag = NetworkRAG(storage, embedder)
# Mix with new API methods
rag.add("Content 1", id="doc1") # New
rag.add_node("doc2", "Content 2") # Old
results = rag.search("query").top(5) # New
ids = rag.find_similar("query", n=5) # Old
Complete Examples¶
Example 1: Simple Knowledge Base¶
from src import NetworkRAG
# Setup
rag = NetworkRAG.create("kb.db")
# Add documents
docs = [
    "Python is great for data science",
    "JavaScript is used for web development",
    "SQL manages relational databases"
]
for doc in docs:
    rag.add(doc)
# Search
results = rag.search("programming languages").top(3)
for r in results:
    print(f"{r.score:.2f}: {r.content}")
Example 2: Metadata Filtering¶
# Add with metadata
rag.add("Intro to Python", id="py1", level="beginner", topic="python")
rag.add("Advanced Python", id="py2", level="advanced", topic="python")
rag.add("JavaScript Basics", id="js1", level="beginner", topic="javascript")
# Search with filters
results = (rag.search("programming tutorial")
           .filter(level="beginner")
           .filter(topic="python")
           .top(5))
Example 3: Community-Aware Search¶
# Build network
rag.build_network()
rag.detect_communities()
# Explore communities
for i in range(rag.community_count):
    comm = rag.get_community(i)
    print(f"Community {i}: {comm.size} nodes")
    print(f" Tags: {comm.tags[:3]}")
# Search within community
results = (rag.search("query")
           .in_community(0)
           .top(10))
Example 4: Bulk Import¶
# Efficient bulk loading
import json
with rag.batch() as batch:
    with open("data.jsonl") as f:
        for line in f:
            doc = json.loads(line)
            batch.add(
                doc["content"],
                id=doc["id"],
                **doc["metadata"]
            )
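Real JSONL files often contain blank lines or records without a `metadata` key, either of which would make the loop above raise an error. One possible way to guard against that is a small parsing helper that runs before the documents reach the batch. `iter_jsonl_docs` is a hypothetical helper and not part of the library.

```python
import io
import json


def iter_jsonl_docs(lines):
    """Yield (content, id, metadata) tuples from an iterable of JSONL lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        doc = json.loads(line)
        # Missing metadata becomes an empty dict instead of a KeyError.
        yield doc["content"], doc["id"], doc.get("metadata", {})


sample = io.StringIO(
    '{"id": "d1", "content": "First", "metadata": {"topic": "python"}}\n'
    '\n'
    '{"id": "d2", "content": "Second"}\n'
)
rows = list(iter_jsonl_docs(sample))
print(rows)
```

Each yielded tuple could then be passed along as `batch.add(content, id=doc_id, **metadata)` inside the batch context.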
API Reference Summary¶
Factory Methods¶
- NetworkRAG.create(db_path) - Quick start with defaults
- NetworkRAG.in_memory() - In-memory for testing
- NetworkRAG.with_tfidf(db_path, max_features) - TF-IDF embeddings
- NetworkRAG.with_ollama(db_path, model, host) - Ollama embeddings
- NetworkRAG.builder() - Builder pattern
Content Management¶
- add(content, id, **metadata) - Add content
- get(node_id) - Get node object
- update(node_id, **metadata) - Update metadata
- delete(node_id) - Delete node
- batch() - Batch context manager
Search¶
- search(query) - Create query builder
- .with_strategy(strategy) - Set strategy
- .filter(**kwargs) - Filter by metadata
- .in_community(id) - Search in community
- .expand_neighbors(hops) - Include neighbors
- .prioritize_hubs() - Boost hub nodes
- .prioritize_bridges() - Boost bridge nodes
- .top(n) - Execute and get results
Properties¶
- node_count - Total nodes
- edge_count - Total edges
- community_count - Number of communities
- density - Network density
Analysis¶
- hubs(min_degree) - Get hub nodes
- bridges(min_betweenness) - Get bridge nodes
- get_community(id) - Get community object
- visualize(path, **options) - Visualize network
- visualization() - Get visualizer object
Result Objects¶
- ResultSet - Collection of results
  - [index] - Index access
  - [start:end] - Slicing
  - .ids() - Extract IDs
  - .scores() - Extract scores
  - .contents() - Extract contents
  - .filter_by_score(min) - Filter by score
  - .filter_by_metadata(**kwargs) - Filter by metadata
- SearchResult - Single result
  - .id - Node ID
  - .content - Text content
  - .score - Similarity score
  - .metadata - Metadata dict
  - .community_id - Community ID
- Node - Rich node object
  - .id - Node ID
  - .content - Text content
  - .metadata - Metadata dict
  - .neighbors - Neighbor IDs
  - .community - Community ID
  - .degree - Node degree
  - .is_hub(min_degree) - Check if hub
- Community - Rich community object
  - .id - Community ID
  - .nodes - Node IDs
  - .size - Number of nodes
  - .tags - Auto-generated tags
Best Practices¶
- Use factory methods for common patterns
- Use batch context for bulk operations
- Chain query operations for complex searches
- Use properties for simple stats
- Use rich objects for exploration
Conclusion¶
The fluent API makes Complex Network RAG:
- Easier to learn - Intuitive method names
- Faster to write - One-liners for common tasks
- More powerful - Method chaining for complex operations
- Better documented - Rich objects with clear properties
- Fully compatible - Old API still works
Start with the factory methods, use method chaining for queries, and leverage rich objects for exploration!