
Fluent API Design for Complex Network RAG

Philosophy

This API design follows these principles:

  1. Simplicity - Common tasks should be trivial
  2. Composability - Components should combine naturally
  3. Discoverability - Method names reveal intent
  4. Pythonic - Use context managers, properties, and conventions
  5. Type hints - Clear interfaces for IDE support

Design Overview

Three-Layer Architecture

  1. Configuration Layer - Builder pattern for setup
  2. Operation Layer - Fluent interface for queries
  3. Result Layer - Rich result objects

Key Improvements

  1. Builder Pattern for configuration
  2. Fluent Queries with method chaining
  3. Context Managers for transactions
  4. Rich Results instead of raw lists
  5. Property-based Access for simple attributes
  6. Factory Methods for common patterns

API Examples

1. Simple Setup (Most Common Use Case)

from complex_network_rag import NetworkRAG

# One-liner for quick start
rag = NetworkRAG.create("my_knowledge.db")

# Add content
rag.add("Python is a programming language")
rag.add("Machine learning uses Python", tags=["ml", "ai"])

# Query
results = rag.search("programming").top(5)
for result in results:
    print(result.content, result.score)

2. Builder Pattern for Custom Setup

from complex_network_rag import NetworkRAG

# Fluent configuration
rag = (NetworkRAG.builder()
    .with_storage("my_knowledge.db")
    .with_tfidf_embeddings(max_features=512)
    .with_similarity_threshold(0.7)
    .build())
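
One way the builder could work internally is sketched below. It is not the library's implementation: `RagConfig`, the default values, and the range check on the threshold are assumptions made for illustration, and a real `build()` would presumably return a `NetworkRAG` rather than a settings object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RagConfig:
    """Hypothetical immutable settings object produced by the builder."""
    storage_path: str
    embedding_kind: str
    max_features: Optional[int]
    similarity_threshold: float


class NetworkBuilder:
    """Illustrative builder: every with_* method returns self so calls chain."""

    def __init__(self) -> None:
        # Placeholder defaults (assumptions, not documented values)
        self._storage_path = ":memory:"
        self._embedding_kind = "tfidf"
        self._max_features: Optional[int] = None
        self._similarity_threshold = 0.7

    def with_storage(self, path: str) -> "NetworkBuilder":
        self._storage_path = path
        return self

    def with_tfidf_embeddings(self, max_features: int = 512) -> "NetworkBuilder":
        self._embedding_kind = "tfidf"
        self._max_features = max_features
        return self

    def with_similarity_threshold(self, threshold: float) -> "NetworkBuilder":
        # Validating early keeps bad settings from reaching build()
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        self._similarity_threshold = threshold
        return self

    def build(self) -> RagConfig:
        return RagConfig(
            storage_path=self._storage_path,
            embedding_kind=self._embedding_kind,
            max_features=self._max_features,
            similarity_threshold=self._similarity_threshold,
        )
```

Returning `self` from each setter is what makes the parenthesized chain possible; `build()` is the only call that produces a finished object.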

3. Fluent Query Interface

# Method chaining for complex queries
results = (rag.search("machine learning")
    .in_community(2)
    .with_strategy("hybrid")
    .top(10))

# Filter by metadata
results = (rag.search("python")
    .filter(tags=["tutorial"])
    .filter(year=2024)
    .top(5))

# Explore topology
results = (rag.search("databases")
    .expand_neighbors(hops=2)
    .prioritize_hubs()
    .top(10))
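
The chained calls above suggest a lazy query object: intermediate methods only record criteria, and `top()` runs the query. A minimal sketch of that idea follows, assuming a hypothetical in-memory `Doc` list and a toy word-overlap score in place of real embedding similarity. The rule that list-valued filters require every listed value to be present is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Hypothetical document record used only for this sketch."""
    id: str
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class QueryBuilder:
    """Illustrative lazy query: filter() records criteria, top() executes."""

    def __init__(self, docs: List[Doc], query: str) -> None:
        self._docs = docs
        self._query = query
        self._filters: Dict[str, Any] = {}

    def filter(self, **criteria: Any) -> "QueryBuilder":
        self._filters.update(criteria)
        return self  # returning self enables chaining

    def _matches(self, doc: Doc) -> bool:
        for key, wanted in self._filters.items():
            actual = doc.metadata.get(key)
            if isinstance(wanted, list):
                # Assumed rule: all wanted values must appear in the doc's list
                if not isinstance(actual, list) or not set(wanted) <= set(actual):
                    return False
            elif actual != wanted:
                return False
        return True

    def _score(self, doc: Doc) -> float:
        # Toy score: fraction of query words found in the content
        q = set(self._query.lower().split())
        d = set(doc.content.lower().split())
        return len(q & d) / len(q) if q else 0.0

    def top(self, n: int) -> List[Tuple[str, float]]:
        scored = [(d.id, self._score(d)) for d in self._docs if self._matches(d)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:n]


docs = [
    Doc("a", "python tutorial for beginners", {"tags": ["tutorial"], "year": 2024}),
    Doc("b", "python internals", {"tags": ["deep-dive"], "year": 2024}),
    Doc("c", "python tutorial archive", {"tags": ["tutorial"], "year": 2019}),
]
hits = QueryBuilder(docs, "python").filter(tags=["tutorial"]).filter(year=2024).top(5)
print(hits)  # [('a', 1.0)]
```

Deferring execution to the terminal call means filters can be added in any order without running the search more than once.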

4. Batch Operations with Context Manager

# Transaction-like batch adds
with rag.batch() as batch:
    for doc in documents:
        batch.add(doc.id, doc.content, metadata=doc.meta)
# Network automatically rebuilds after context exits
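
The batch behavior can be sketched with a small context manager that queues adds and triggers a single rebuild on exit. `TinyRAG` is a hypothetical stand-in for `NetworkRAG`, and discarding the queued adds when the block raises is an assumption about what "transaction-like" means here.

```python
from typing import Any, Dict, List, Optional, Tuple


class TinyRAG:
    """Hypothetical stand-in for NetworkRAG, just enough to show batching."""

    def __init__(self) -> None:
        self.nodes: Dict[str, str] = {}
        self.rebuild_count = 0

    def _rebuild_network(self) -> None:
        # A real implementation would recompute edges and communities here
        self.rebuild_count += 1

    def batch(self) -> "BatchContext":
        return BatchContext(self)


class BatchContext:
    """Queues adds and applies them together when the with-block exits."""

    def __init__(self, rag: TinyRAG) -> None:
        self._rag = rag
        self._pending: List[Tuple[str, str]] = []

    def add(self, node_id: str, content: str,
            metadata: Optional[Dict[str, Any]] = None) -> None:
        # Metadata is accepted but ignored in this sketch
        self._pending.append((node_id, content))

    def __enter__(self) -> "BatchContext":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:
            for node_id, content in self._pending:
                self._rag.nodes[node_id] = content
            self._rag._rebuild_network()  # one rebuild for the whole batch
        self._pending.clear()  # on error, queued adds are dropped (assumed)
        return False  # never swallow exceptions from the block
```

With this shape, adding a thousand documents costs one rebuild instead of a thousand, which is the main reason to offer a batch API at all.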

5. Network Analysis

# Property-based access for simple attributes
print(f"Nodes: {rag.node_count}")
print(f"Edges: {rag.edge_count}")
print(f"Communities: {rag.community_count}")

# Computed values: `communities` is a property; `hubs` and `bridges` are methods that take thresholds
communities = rag.communities  # Dict[str, int]
hubs = rag.hubs(min_degree=5)  # List of node IDs
bridges = rag.bridges(min_betweenness=0.01)

# Community exploration
community = rag.get_community(0)
print(community.size)
print(community.tags)
for node_id in community.nodes:
    print(node_id)
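
The split between properties and methods can be illustrated over a plain adjacency map. This is a sketch only: `TinyNetwork` is hypothetical, the graph is assumed to be undirected, and a hub is assumed to mean a node whose degree is at least `min_degree`.

```python
from typing import Dict, List, Set


class TinyNetwork:
    """Hypothetical undirected graph stored as node -> set of neighbor IDs."""

    def __init__(self, adjacency: Dict[str, Set[str]]) -> None:
        self._adj = adjacency

    @property
    def node_count(self) -> int:
        # Cheap, parameter-free values read naturally as properties
        return len(self._adj)

    @property
    def edge_count(self) -> int:
        # Each undirected edge appears in two adjacency sets
        return sum(len(neighbors) for neighbors in self._adj.values()) // 2

    def hubs(self, min_degree: int = 5) -> List[str]:
        # Parameterized queries stay methods so the threshold is explicit
        return sorted(
            node for node, neighbors in self._adj.items()
            if len(neighbors) >= min_degree
        )


star = TinyNetwork({
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a"},
})
print(star.node_count, star.edge_count, star.hubs(min_degree=3))  # 4 3 ['a']
```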

6. Rich Result Objects

# Results are iterable and indexable
results = rag.search("query").top(10)

# Access by index
first = results[0]
print(first.id, first.content, first.score)

# Iteration
for result in results:
    print(result)

# Conversion
node_ids = results.ids()  # List[str]
scores = results.scores()  # List[float]
contents = results.contents()  # List[str]

# Slicing works
top_three = results[:3]
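
A result container with these behaviors needs little more than the sequence protocol. The sketch below assumes a hypothetical `Result` record with `id`, `content`, and `score` fields, and assumes that slicing returns another `ResultSet` rather than a bare list.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence


@dataclass(frozen=True)
class Result:
    """Hypothetical single search hit."""
    id: str
    content: str
    score: float


class ResultSet:
    """Illustrative container: iterable, indexable, sliceable."""

    def __init__(self, results: Sequence[Result]) -> None:
        self._results = list(results)

    def __len__(self) -> int:
        return len(self._results)

    def __iter__(self) -> Iterator[Result]:
        return iter(self._results)

    def __getitem__(self, index):
        # Slices keep the rich type; integer indexes return one Result
        if isinstance(index, slice):
            return ResultSet(self._results[index])
        return self._results[index]

    def ids(self) -> List[str]:
        return [r.id for r in self._results]

    def scores(self) -> List[float]:
        return [r.score for r in self._results]

    def contents(self) -> List[str]:
        return [r.content for r in self._results]
```

Keeping slices as `ResultSet` objects means `results[:3].ids()` works the same way as `results.ids()`.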

7. Node Management

# Add with automatic ID generation
node_id = rag.add("Content here")

# Add with explicit ID
rag.add("Content", id="doc1", tags=["important"])

# Get node details
node = rag.get("doc1")
print(node.content)
print(node.metadata)
print(node.neighbors)

# Update metadata
rag.update("doc1", tags=["very-important"])

# Delete
rag.delete("doc1")

8. Factory Methods for Common Patterns

# Quick start with Ollama
rag = NetworkRAG.with_ollama("knowledge.db", model="nomic-embed-text")

# Quick start with TF-IDF
rag = NetworkRAG.with_tfidf("knowledge.db", max_features=512)

# In-memory for testing
rag = NetworkRAG.in_memory()
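
Factory methods like these are usually thin classmethods over the main constructor. In the sketch below, `TinyRAG` is a hypothetical stand-in that only records its configuration; the constructor signature and the use of `":memory:"` (SQLite's name for an in-memory database) are assumptions.

```python
from typing import Any, Dict


class TinyRAG:
    """Hypothetical stand-in for NetworkRAG; it only records its settings."""

    def __init__(self, storage_path: str, embedder: str, **options: Any) -> None:
        self.storage_path = storage_path
        self.embedder = embedder
        self.options: Dict[str, Any] = options

    @classmethod
    def with_tfidf(cls, path: str, max_features: int = 512) -> "TinyRAG":
        return cls(path, "tfidf", max_features=max_features)

    @classmethod
    def with_ollama(cls, path: str, model: str = "nomic-embed-text") -> "TinyRAG":
        return cls(path, "ollama", model=model)

    @classmethod
    def in_memory(cls) -> "TinyRAG":
        # Assumed: in-memory storage paired with a default TF-IDF embedder
        return cls(":memory:", "tfidf")
```

Using `cls` instead of the class name lets subclasses inherit the factories and get instances of their own type.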

9. Visualization Integration

# Simple visualization
rag.visualize("network.html")

# With options
rag.visualize("network.html",
    highlight_communities=True,
    show_hubs=True,
    interactive=True)

# Get visualization object for customization
viz = rag.visualization()
viz.highlight_nodes(["doc1", "doc2"])
viz.save("custom.html")

10. Advanced: Custom Embedding Providers

from complex_network_rag import EmbeddingProvider

class MyEmbeddings(EmbeddingProvider):
    def embed(self, texts):
        # Your implementation: return one embedding vector per input text
        raise NotImplementedError

    def get_dimension(self):
        return 768

    def get_model_name(self):
        return "my-model"

    def get_model_config(self):
        return {}

rag = NetworkRAG.builder().with_embeddings(MyEmbeddings()).build()
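
For a concrete picture of a provider, here is a toy implementation built on the hashing trick. The `EmbeddingProvider` base class is redefined locally as a stand-in with the four methods shown above, so the example runs on its own. The return type of `embed` (one list of floats per text) and the L2 normalization are assumptions, and `HashingEmbeddings` is a hypothetical name.

```python
import math
import zlib
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class EmbeddingProvider(ABC):
    """Local stand-in for the library's base class (assumed interface)."""

    @abstractmethod
    def embed(self, texts: List[str]) -> List[List[float]]: ...

    @abstractmethod
    def get_dimension(self) -> int: ...

    @abstractmethod
    def get_model_name(self) -> str: ...

    @abstractmethod
    def get_model_config(self) -> Dict[str, Any]: ...


class HashingEmbeddings(EmbeddingProvider):
    """Toy provider: bucket token counts by CRC32, then L2-normalize."""

    def __init__(self, dimension: int = 64) -> None:
        self._dimension = dimension

    def embed(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self._dimension
            for token in text.lower().split():
                # crc32 is deterministic across runs, unlike built-in hash()
                vec[zlib.crc32(token.encode("utf-8")) % self._dimension] += 1.0
            norm = math.sqrt(sum(v * v for v in vec))
            vectors.append([v / norm for v in vec] if norm else vec)
        return vectors

    def get_dimension(self) -> int:
        return self._dimension

    def get_model_name(self) -> str:
        return "hashing-toy"

    def get_model_config(self) -> Dict[str, Any]:
        return {"dimension": self._dimension}
```

A provider like this is useful mainly in tests, where deterministic vectors without a model download matter more than embedding quality.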

Backward Compatibility

All existing APIs remain functional:

# Old API still works
storage = SQLiteStorage("my.db")
embedder = TFIDFEmbedding(max_features=256)
rag = NetworkRAG(storage, embedder, min_similarity=0.7)
rag.add_node("doc1", "content...", metadata={"tag": "foo"})
results = rag.find_similar("query", n=10, strategy="hybrid")
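
One way to keep the old entry points working is to turn them into thin wrappers over the new ones. The sketch below assumes a hypothetical `TinyRAG` with a toy substring search. The argument mapping mirrors the calls shown in this document, but passing legacy metadata through as keyword arguments is an assumption.

```python
import uuid
from typing import Any, Dict, List, Optional


class _Query:
    """Toy query object: matches documents whose content contains the text."""

    def __init__(self, docs: Dict[str, Dict[str, Any]], text: str) -> None:
        self._docs = docs
        self._text = text.lower()

    def top(self, n: int) -> List[str]:
        matches = [
            node_id for node_id, doc in self._docs.items()
            if self._text in doc["content"].lower()
        ]
        return matches[:n]


class TinyRAG:
    """Hypothetical stand-in showing legacy methods delegating to new ones."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    # New-style API
    def add(self, content: str, id: Optional[str] = None, **metadata: Any) -> str:
        node_id = id if id is not None else uuid.uuid4().hex
        self._docs[node_id] = {"content": content, "metadata": metadata}
        return node_id

    def search(self, query: str) -> _Query:
        return _Query(self._docs, query)

    # Legacy API, kept as wrappers so existing callers do not break
    def add_node(self, node_id: str, content: str,
                 metadata: Optional[Dict[str, Any]] = None) -> None:
        self.add(content, id=node_id, **(metadata or {}))

    def find_similar(self, query: str, n: int = 10) -> List[str]:
        return self.search(query).top(n)
```

Because both paths share one implementation, behavior cannot drift between the old and new APIs.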

Migration Path

Old code:

storage = SQLiteStorage("db.db")
embedder = TFIDFEmbedding()
rag = NetworkRAG(storage, embedder)
rag.add_node("id1", "content", {"tag": "foo"})
results = rag.find_similar("query", n=5)

New code:

rag = NetworkRAG.with_tfidf("db.db")
rag.add("content", id="id1", tags=["foo"])
results = rag.search("query").top(5)

Implementation Strategy

  1. Create QueryBuilder class for fluent queries
  2. Create ResultSet class for rich results
  3. Create NetworkBuilder class for configuration
  4. Create BatchContext for batch operations
  5. Create Node and Community classes for rich objects
  6. Add factory methods to NetworkRAG
  7. Keep existing methods for backward compatibility