Skip to content

Immutability

Immutability is the cornerstone of AlgoGraph's design. This document explains why we chose immutability and how it benefits your code.

What is Immutability?

Immutable objects cannot be changed after creation. Instead of modifying an object, you create a new one with the desired changes.

from AlgoGraph import Vertex

# Create vertex
v1 = Vertex('A', attrs={'value': 10})

# "Modify" it - actually creates new vertex
v2 = v1.with_attrs(value=20)

# Original unchanged
print(v1.get('value'))  # 10
print(v2.get('value'))  # 20

Why Immutability?

1. No Unexpected Side Effects

With mutable objects, functions can modify data unexpectedly:

# Mutable approach (NOT how AlgoGraph works)
def bad_add_edge(graph, edge):
    graph.edges.add(edge)  # Modifies original!
    return graph

g1 = create_graph()
g2 = bad_add_edge(g1, edge)
# g1 was modified - surprise!

With immutability, the original is always safe:

# Immutable approach (how AlgoGraph works)
g1 = Graph(...)
g2 = g1.add_edge(edge)  # Returns new graph
# g1 is unchanged - guaranteed!

2. Thread Safety

Immutable objects are inherently thread-safe:

from concurrent.futures import ThreadPoolExecutor
from AlgoGraph.algorithms import dijkstra, bfs

graph = load_graph('network.json')

# Safe to use same graph in multiple threads
with ThreadPoolExecutor() as executor:
    future1 = executor.submit(dijkstra, graph, 'A')
    future2 = executor.submit(bfs, graph, 'B')
    future3 = executor.submit(connected_components, graph)

# No locks needed - graph can't be modified
results = [f.result() for f in [future1, future2, future3]]

3. Easier Debugging

With immutability, data doesn't change unexpectedly:

# Create graph
g1 = create_initial_graph()
print(f"Initial: {g1.vertex_count} vertices")  # 5

# Transform it
g2 = add_some_vertices(g1)
print(f"After adding: {g2.vertex_count} vertices")  # 8

# Original still intact for debugging
print(f"Original still: {g1.vertex_count} vertices")  # 5

# Can compare states
print(f"Added {g2.vertex_count - g1.vertex_count} vertices")

4. Time Travel / Undo

Keep history of all states:

history = []

# Build graph step by step
g = Graph()
history.append(g)

g = g.add_vertex(Vertex('A'))
history.append(g)

g = g.add_vertex(Vertex('B'))
history.append(g)

g = g.add_edge(Edge('A', 'B'))
history.append(g)

# "Undo" by going back in history
previous_state = history[-2]
print(f"Before last operation: {previous_state.edge_count} edges")

5. Referential Transparency

Same inputs always produce same outputs:

from AlgoGraph.algorithms import dijkstra

# Pure function - no side effects
distances1, _ = dijkstra(graph, 'A')
distances2, _ = dijkstra(graph, 'A')

# Results are always identical
assert distances1 == distances2

# Graph is unchanged
assert graph == graph  # Always true

Implementation Techniques

Frozen Dataclasses

AlgoGraph uses Python's frozen dataclasses:

from dataclasses import dataclass

@dataclass(frozen=True)
class Vertex:
    id: str
    attrs: dict

# Attempts to modify raise errors
v = Vertex('A', {'x': 1})
v.id = 'B'  # FrozenInstanceError!

Immutable Collections

Sets and dicts are copied, not shared:

@dataclass(frozen=True)
class Graph:
    vertices: Set[Vertex]
    edges: Set[Edge]

    def add_vertex(self, vertex):
        # Create new set (doesn't modify original)
        new_vertices = self.vertices | {vertex}
        return Graph(new_vertices, self.edges)

Copy-on-Write

Only copy what changes:

def add_edge(self, edge):
    # Only vertices and edges change
    # All other graph properties remain shared
    new_edges = self.edges | {edge}
    return Graph(self.vertices, new_edges)

Performance Considerations

Memory Overhead

Immutability can use more memory:

# Creating many versions
g1 = Graph(...)
g2 = g1.add_vertex(v1)  # New graph
g3 = g2.add_vertex(v2)  # Another new graph
g4 = g3.add_vertex(v3)  # Yet another new graph

# All four graphs exist in memory

Mitigation: Python's garbage collector reclaims unused graphs automatically.

Structural Sharing

AlgoGraph uses structural sharing to reduce overhead:

g1 = Graph(vertices_set, edges_set)
g2 = g1.add_vertex(new_vertex)

# g2 shares edges with g1 (sets are immutable)
# Only vertices set is new

When to Be Careful

Large batch operations can be expensive:

# Inefficient - creates many intermediate graphs
g = Graph()
for i in range(1000):
    g = g.add_vertex(Vertex(str(i)))

Better approach:

# Efficient - create all vertices at once
vertices = {Vertex(str(i)) for i in range(1000)}
g = Graph(vertices)

Working with Immutability

Building Graphs

When building large graphs, create collections first:

# Good - create sets then build graph
vertices = {Vertex(name) for name in names}
edges = {Edge(src, tgt, weight=w) for src, tgt, w in edge_data}
graph = Graph(vertices, edges)

# Less efficient - many intermediate graphs
graph = Graph()
for name in names:
    graph = graph.add_vertex(Vertex(name))
for src, tgt, w in edge_data:
    graph = graph.add_edge(Edge(src, tgt, weight=w))

Transforming Graphs

Collect changes then apply:

# Collect updates
updates = {}
for v in graph.vertices:
    if needs_update(v):
        updates[v.id] = compute_new_attrs(v)

# Apply all at once
new_graph = graph
for vid, new_attrs in updates.items():
    v = graph.get_vertex(vid)
    updated = v.with_attrs(**new_attrs)
    new_graph = new_graph.update_vertex(updated)

Algorithms

Algorithms work with immutable graphs but can use mutable internal state:

def dijkstra(graph, source):
    # Graph is immutable
    # But algorithm uses mutable data structures internally
    distances = {}  # Mutable dict (internal)
    visited = set()  # Mutable set (internal)

    # ... algorithm logic ...

    # Returns immutable result
    return distances, predecessors

Comparison with Mutable Approaches

NetworkX (Mutable)

import networkx as nx

# Mutable graph
G = nx.Graph()
G.add_node('A', value=10)
G.add_edge('A', 'B')

# Modifies in place
G.nodes['A']['value'] = 20  # Original changed!

# Not thread-safe
# Need locks for concurrent access

AlgoGraph (Immutable)

from AlgoGraph import Graph, Vertex, Edge

# Immutable graph
g1 = Graph({Vertex('A', attrs={'value': 10})}, {Edge('A', 'B')})

# Returns new graph
v = g1.get_vertex('A')
updated_v = v.with_attrs(value=20)
g2 = g1.update_vertex(updated_v)

# Original unchanged
print(g1.get_vertex('A').get('value'))  # 10
print(g2.get_vertex('A').get('value'))  # 20

# Thread-safe by default
# No locks needed

Best Practices

1. Embrace Immutability

Don't fight it - work with it:

# Don't try to "modify" in loops
# BAD
g = Graph()
for item in items:
    g = g.add_vertex(Vertex(item))  # Many intermediate graphs

# Good - build collections first
vertices = {Vertex(item) for item in items}
g = Graph(vertices)

2. Use Chaining

Chain operations for clarity:

result = (graph
    .add_vertex(Vertex('A'))
    .add_vertex(Vertex('B'))
    .add_edge(Edge('A', 'B'))
)

3. Keep References Minimal

Don't keep unnecessary graph versions:

# BAD - keeps all intermediate states
states = []
g = Graph()
for i in range(1000):
    g = g.add_vertex(Vertex(str(i)))
    states.append(g)  # Lots of memory!

# Good - keep only final state
g = Graph({Vertex(str(i)) for i in range(1000)})

4. Leverage Structural Sharing

Take advantage of shared structure:

base_graph = create_large_graph()  # Expensive

# These are cheap - share most structure with base
variant1 = base_graph.add_vertex(Vertex('X'))
variant2 = base_graph.add_vertex(Vertex('Y'))
variant3 = base_graph.add_edge(Edge('A', 'B'))

Conclusion

Immutability makes AlgoGraph:

  • Safer: No unexpected modifications
  • Simpler: Easier to reason about
  • More Testable: Predictable behavior
  • Thread-Safe: Use in concurrent code without locks
  • Debuggable: Keep history of all states

While there's a learning curve and some performance considerations, the benefits far outweigh the costs for most applications.

Further Reading