Tree Transformations
AlgoTree provides two families of transformation functions, inspired by dotsuite’s Shape pillar:
- Closed transformations (dotmod family): Tree → Tree. The result is always another tree.
- Open transformations (dotpipe family): Tree → Any. The result can be any data structure.
Dot Notation and Escaping
All transformation functions use dot notation for path navigation:
- Dots (.) separate path components.
- Use \. to escape literal dots in node names.
- Wildcards: * matches any single node, ** matches any subtree.
- Attributes: [key=value] matches an attribute value, [key] checks that the attribute exists.
# Match nodes with dots in names (raw strings avoid Python's invalid-escape warning for \.)
dotmatch(tree, r"files.doc1\.txt") # Matches node named "doc1.txt"
# Match all .txt files
dotmatch(tree, r"files.*\.txt") # Wildcard with escaped dot
# Match nodes with attributes
dotmatch(tree, "**[type=file]") # All nodes with type="file"
dotmatch(tree, "**[size]") # All nodes that have a size attribute
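The escaping rule above can be illustrated with a small, self-contained sketch. This is not AlgoTree's parser; `split_dot_path` is a hypothetical helper that only shows how an unescaped dot ends a component while `\.` stays inside the node name.

```python
def split_dot_path(path: str) -> list[str]:
    """Split a dot path into components, treating backslash-dot as a literal dot.

    Hypothetical helper for illustration; AlgoTree's own parsing may differ.
    """
    parts: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(path):
        ch = path[i]
        if ch == "\\" and i + 1 < len(path) and path[i + 1] == ".":
            current.append(".")  # escaped dot stays inside the component
            i += 2
        elif ch == ".":
            parts.append("".join(current))  # unescaped dot ends the component
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


print(split_dot_path(r"files.doc1\.txt"))  # ['files', 'doc1.txt']
print(split_dot_path("app.config"))        # ['app', 'config']
```

Writing the path as a raw string (`r"..."`) keeps the backslash in the pattern without triggering Python's invalid-escape warning.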
Closed Transformations (dotmod family)
These functions transform trees while preserving their tree structure.
dotmod
Apply transformations to specific nodes using dot paths.
from AlgoTree import dotmod
# Update node payloads
tree = dotmod(tree, {
    "app.config": {"debug": False, "port": 9000},
    "app.database": {"host": "prod.db.com"}
})
# Rename nodes
tree = dotmod(tree, {"app.cache": "redis_cache"})
# Apply functions
tree = dotmod(tree, {
    "app.config": lambda n: {"port": n.payload.get("port", 0) * 2}
})
# Clear payload
tree = dotmod(tree, {"app.temp": None})
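To make the idea of a path-addressed update concrete, here is a minimal sketch on plain nested dictionaries. It does not use AlgoTree: the `{"name", "payload", "children"}` layout and the `update_at_path` helper are assumptions made for illustration, and the sketch assumes that updates are merged into the existing payload rather than replacing it.

```python
import copy


def update_at_path(tree: dict, path: str, updates: dict) -> dict:
    """Return a copy of `tree` with `updates` merged into the node at `path`.

    Hypothetical helper; the path starts at the root name, e.g. "app.config".
    """
    result = copy.deepcopy(tree)  # leave the caller's tree untouched
    names = path.split(".")
    if result["name"] != names[0]:
        raise KeyError(path)
    node = result
    for name in names[1:]:
        # Descend into the child whose name matches the next component
        node = next((c for c in node["children"] if c["name"] == name), None)
        if node is None:
            raise KeyError(path)
    node["payload"].update(updates)
    return result


tree = {
    "name": "app",
    "payload": {},
    "children": [
        {"name": "config", "payload": {"debug": True, "port": 8000}, "children": []},
    ],
}
updated = update_at_path(tree, "app.config", {"debug": False})
print(updated["children"][0]["payload"])  # {'debug': False, 'port': 8000}
```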
dotmap
Map transformations over nodes matching a pattern.
from AlgoTree import dotmap
# Transform all nodes
tree = dotmap(tree, lambda n: {"processed": True})
# Transform specific fields
tree = dotmap(tree, {
    "size": lambda v: v * 1024, # Convert to bytes
    "name": lambda v: v.upper()
})
# Apply to specific pattern
tree = dotmap(tree,
    lambda n: {"validated": True},
    dot_path="app.modules.*")
dotprune
Remove nodes from tree based on condition.
from AlgoTree import dotprune
# Remove by pattern
tree = dotprune(tree, "**.test_*")
# Remove by predicate
tree = dotprune(tree, lambda n: n.payload.get("deprecated", False))
# Keep structure but clear nodes
tree = dotprune(tree, "**.temp", keep_structure=True)
dotmerge
Merge two trees with various strategies.
from AlgoTree import Node, dotmerge
# Overlay (tree2 overrides tree1)
merged = dotmerge(tree1, tree2, "overlay")
# Underlay (tree1 takes precedence)
merged = dotmerge(tree1, tree2, "underlay")
# Combine (merge arrays and dicts)
merged = dotmerge(tree1, tree2, "combine")
# Custom resolution
def resolver(node1, node2):
    # Keep node2's name; node2's payload values win on key conflicts
    return Node(node2.name, **{**node1.payload, **node2.payload})
merged = dotmerge(tree1, tree2, "custom", conflict_resolver=resolver)
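The difference between the three built-in strategies is easiest to see on a single pair of payloads. The sketch below works on plain dictionaries and is not AlgoTree's implementation; `merge_payloads` is a hypothetical helper, and for "combine" it assumes that the second payload wins when the conflicting values are neither both lists nor both dicts.

```python
def merge_payloads(first: dict, second: dict, strategy: str = "overlay") -> dict:
    """Merge two payload dicts; an illustrative sketch, not AlgoTree code."""
    if strategy == "overlay":
        return {**first, **second}  # second overrides first
    if strategy == "underlay":
        return {**second, **first}  # first takes precedence
    if strategy == "combine":
        merged = dict(first)
        for key, value in second.items():
            existing = merged.get(key)
            if isinstance(existing, list) and isinstance(value, list):
                merged[key] = existing + value  # concatenate lists
            elif isinstance(existing, dict) and isinstance(value, dict):
                merged[key] = {**existing, **value}  # shallow-merge dicts
            else:
                merged[key] = value  # assumption: second wins otherwise
        return merged
    raise ValueError(f"unknown strategy: {strategy}")


a = {"port": 80, "tags": ["web"]}
b = {"port": 8080, "tags": ["prod"], "host": "example"}
print(merge_payloads(a, b, "overlay"))
print(merge_payloads(a, b, "underlay"))
print(merge_payloads(a, b, "combine"))
```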
dotannotate
Add metadata annotations to nodes.
from AlgoTree import dotannotate
# Add computed annotations
tree = dotannotate(tree,
    lambda n: {
        "depth": n.level,
        "path": ".".join(p.name for p in n.get_path()),
        "has_children": len(n.children) > 0
    },
    annotation_key="_meta")
# Add static annotations to specific nodes
tree = dotannotate(tree,
    {"reviewed": True, "version": "1.0"},
    dot_path="**.critical_*")
dotvalidate
Validate nodes against constraints.
from AlgoTree import dotvalidate
# Validate with predicate (raises on failure)
dotvalidate(tree,
    lambda n: n.payload.get("size", 0) < 1000000,
    dot_path="**[type=file]")
# Get invalid nodes instead of raising
invalid = dotvalidate(tree,
    lambda n: len(n.name) <= 255,
    raise_on_invalid=False)
# Validate required attributes
dotvalidate(tree,
    {"type": "module", "enabled": True},
    dot_path="app.modules.*")
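The two validation modes (raise on the first problem set, or return the offending nodes) can be sketched as follows. `validate_nodes` is a hypothetical stand-in that operates on a flat list of dict nodes; the exception type and the return value in raising mode are assumptions, since the text above does not specify them for dotvalidate.

```python
def validate_nodes(nodes: list, predicate, raise_on_invalid: bool = True) -> list:
    """Check every node against `predicate`; illustrative sketch only."""
    invalid = [node for node in nodes if not predicate(node)]
    if invalid and raise_on_invalid:
        names = ", ".join(node["name"] for node in invalid)
        raise ValueError(f"validation failed for: {names}")
    return invalid  # empty list when everything passes


files = [
    {"name": "small.txt", "payload": {"size": 10}},
    {"name": "huge.bin", "payload": {"size": 5_000_000}},
]


def is_small(node: dict) -> bool:
    return node["payload"].get("size", 0) < 1_000_000


bad = validate_nodes(files, is_small, raise_on_invalid=False)
print([node["name"] for node in bad])  # ['huge.bin']
```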
Additional Closed Transformations
from AlgoTree import (
    dotgraft, # Graft subtree at specific points
    dotsplit, # Split tree extracting subtrees
    dotflatten, # Flatten to list of nodes
    dotreduce, # Reduce tree to single value
    dotnormalize # Normalize node names
)
# Graft a subtree
tree = dotgraft(tree, "app.modules", new_module_tree)
# Split tree
tree, extracted = dotsplit(tree, "app.deprecated")
# Flatten tree
all_nodes = dotflatten(tree, "**[type=file]")
# Reduce to aggregate value
total_size = dotreduce(tree,
    lambda acc, n: acc + n.payload.get("size", 0),
    initial=0)
# Normalize names
tree = dotnormalize(tree) # my-node -> my_node
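Reducing a tree to a single value is a fold over its nodes. The sketch below shows one way to write such a fold on nested dictionaries, visiting nodes in pre-order; `reduce_tree` and the dict layout are hypothetical, and the traversal order AlgoTree actually uses is not stated above.

```python
def reduce_tree(node: dict, fn, initial):
    """Fold `fn(acc, node)` over a nested-dict tree in pre-order (a sketch)."""
    acc = fn(initial, node)  # visit the node itself first
    for child in node.get("children", []):
        acc = reduce_tree(child, fn, acc)  # thread the accumulator through
    return acc


tree = {
    "name": "root",
    "payload": {},
    "children": [
        {"name": "a", "payload": {"size": 10}, "children": []},
        {"name": "b", "payload": {"size": 20}, "children": []},
    ],
}
total_size = reduce_tree(tree, lambda acc, n: acc + n["payload"].get("size", 0), 0)
print(total_size)  # 30
```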
Open Transformations (dotpipe family)
These functions transform trees into arbitrary data structures.
dotpipe
The main pipeline function for chaining transformations.
from AlgoTree import dotpipe, to_dict
# Extract all names
names = dotpipe(tree,
    lambda t: [n.name for n in t.traverse_preorder()])
# Multi-stage pipeline
result = dotpipe(tree,
    ("**[type=file]", lambda n: n.payload), # Extract payloads
    lambda payloads: [p["size"] for p in payloads], # Get sizes
    sum) # Total size
# Convert to different format
json_data = dotpipe(tree, to_dict)
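Conceptually, a pipeline feeds the output of each stage into the next. A minimal sketch of that composition, using only the standard library, might look like this. The `pipe` helper is hypothetical and handles only callable stages; it does not model the `(pattern, function)` tuple stages shown above.

```python
from functools import reduce


def pipe(value, *stages):
    """Pass `value` through each callable stage in order (illustrative only)."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


payloads = [{"size": 100}, {"size": 250}, {"size": 50}]
total = pipe(
    payloads,
    lambda ps: [p["size"] for p in ps],  # extract sizes
    sum,                                 # total size
)
print(total)  # 400
```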
Conversion Functions
Convert trees to common data structures.
from AlgoTree import (
    to_dict, # Nested dictionary
    to_list, # Flat list
    to_paths, # Path strings
    to_adjacency_list, # Graph adjacency list
    to_edge_list, # Edge pairs
    to_nested_lists, # S-expressions
    to_table # Tabular/DataFrame format
)
# Convert to dictionary
data = to_dict(tree)
# {"name": "root", "children": [...], ...}
# Get all paths
paths = to_paths(tree)
# ["root", "root.child1", "root.child2", ...]
# With payloads
path_data = to_paths(tree, include_payload=True)
# {"root": {...}, "root.child1": {...}, ...}
# For graph algorithms
adj = to_adjacency_list(tree)
# {"root": ["child1", "child2"], ...}
edges = to_edge_list(tree)
# [("root", "child1"), ("root", "child2"), ...]
# For DataFrames
rows = to_table(tree, columns=["type", "size"])
# df = pd.DataFrame(rows)
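To show what the path and edge representations contain, here is a small sketch that derives both from a nested-dict tree. The helpers `paths` and `edges` are hypothetical and independent of AlgoTree; they assume pre-order traversal, which matches the ordering in the example comments above but is otherwise an assumption.

```python
def paths(node: dict, prefix: str = "") -> list[str]:
    """List dot paths for every node, parents before children (a sketch)."""
    path = f"{prefix}.{node['name']}" if prefix else node["name"]
    result = [path]
    for child in node.get("children", []):
        result.extend(paths(child, path))
    return result


def edges(node: dict) -> list[tuple[str, str]]:
    """List (parent, child) name pairs for every edge (a sketch)."""
    result = []
    for child in node.get("children", []):
        result.append((node["name"], child["name"]))
        result.extend(edges(child))
    return result


tree = {
    "name": "root",
    "children": [
        {"name": "child1", "children": [{"name": "leaf", "children": []}]},
        {"name": "child2", "children": []},
    ],
}
print(paths(tree))
print(edges(tree))
```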
Data Extraction
Extract and collect data from trees.
from AlgoTree import (
    dotextract, # Extract with custom function
    dotcollect, # Collect/aggregate data
    dotgroup, # Group nodes by key
    dotpartition, # Split into two groups
    dotproject # SQL-like projection
)
# Extract specific data
sizes = dotextract(tree,
    lambda n: n.payload.get("size"),
    dot_path="**[type=file]")
# Collect statistics
stats = dotcollect(tree,
    lambda n, acc: {
        "count": acc["count"] + 1,
        "total": acc["total"] + n.payload.get("size", 0)
    },
    initial={"count": 0, "total": 0})
# Group by attribute
by_type = dotgroup(tree, "type")
# {"file": [node1, node2], "dir": [node3], ...}
# Partition nodes
large, small = dotpartition(tree,
    lambda n: n.payload.get("size", 0) > 1000)
# Project specific fields
data = dotproject(tree, ["name", "size", "type"])
# [{"name": "...", "size": ..., "type": "..."}, ...]
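Grouping and partitioning can both be expressed as a single pass over the nodes. The following sketch works on nested dictionaries and returns node names rather than node objects to keep the output readable; `iter_nodes`, `group_by`, and `partition` are hypothetical names, not AlgoTree functions.

```python
from collections import defaultdict


def iter_nodes(node: dict):
    """Yield every node of a nested-dict tree in pre-order."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def group_by(tree: dict, key: str) -> dict:
    """Group node names by the payload value stored under `key`."""
    groups = defaultdict(list)
    for node in iter_nodes(tree):
        if key in node["payload"]:
            groups[node["payload"][key]].append(node["name"])
    return dict(groups)


def partition(tree: dict, predicate) -> tuple[list, list]:
    """Split node names into (matching, non-matching) lists."""
    matching, rest = [], []
    for node in iter_nodes(tree):
        (matching if predicate(node) else rest).append(node["name"])
    return matching, rest


tree = {
    "name": "root",
    "payload": {"type": "dir"},
    "children": [
        {"name": "a.txt", "payload": {"type": "file", "size": 2000}, "children": []},
        {"name": "b.txt", "payload": {"type": "file", "size": 10}, "children": []},
    ],
}
by_type = group_by(tree, "type")
large, small = partition(tree, lambda n: n["payload"].get("size", 0) > 1000)
print(by_type)
print(large, small)
```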
Specialized Conversions
from AlgoTree import to_graphviz_data, to_json_schema
# For visualization
viz_data = to_graphviz_data(tree)
# {"nodes": [...], "edges": [...]}
# Convert to JSON Schema
schema = to_json_schema(tree)
In-Place vs Copy Operations
Most transformation functions support an in_place parameter:
# Create a copy (default)
new_tree = dotmod(tree, {"app.config": {"debug": False}})
# Original tree unchanged
# Modify in place
dotmod(tree, {"app.config": {"debug": False}}, in_place=True)
# Original tree modified
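The copy-by-default convention can be sketched in a few lines. This is an illustration on a plain dictionary, not AlgoTree's code; `set_payload` is a hypothetical helper, and whether AlgoTree returns the same object when `in_place=True` is an assumption here.

```python
import copy


def set_payload(tree: dict, key: str, value, in_place: bool = False) -> dict:
    """Set a payload key, on a deep copy unless `in_place` is requested."""
    target = tree if in_place else copy.deepcopy(tree)
    target["payload"][key] = value
    return target


original = {"name": "config", "payload": {"debug": True}, "children": []}

copied = set_payload(original, "debug", False)
print(original["payload"]["debug"])  # True: the original is unchanged

same = set_payload(original, "debug", False, in_place=True)
print(original["payload"]["debug"])  # False: the original was modified
```

Copying by default keeps transformations free of side effects, which makes them safe to chain; the in-place mode avoids the cost of a deep copy on large trees.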
Chaining Transformations
Transformations can be chained for complex operations:
from AlgoTree import dotpipe, dotmod, dotnormalize, dotprune, to_dict
result = dotpipe(tree,
    # First normalize names
    lambda t: dotnormalize(t),
    # Remove deprecated nodes
    lambda t: dotprune(t, lambda n: n.payload.get("deprecated")),
    # Update configurations
    lambda t: dotmod(t, {"app.config": {"env": "production"}}),
    # Convert to dict
    to_dict
)
Pattern Matching Reference
Patterns support various matching strategies:
- name: Exact name match
- *: Single wildcard
- **: Deep wildcard (any subtree)
- *.txt: Wildcard with suffix
- [attr=value]: Attribute match
- [attr]: Attribute existence
- ~regex: Regex pattern
- %fuzzy: Fuzzy matching
- [?(@.size > 100)]: Predicate expressions
- [0], [1:3]: Array indexing/slicing
Examples:
# Various pattern examples
dotmatch(tree, "app.config") # Exact path
dotmatch(tree, "app.*.settings") # Wildcard
dotmatch(tree, "app.**") # All descendants
dotmatch(tree, "**[type=file]") # Attribute filter
dotmatch(tree, "**[size]") # Has attribute
dotmatch(tree, r"files.*\.txt") # Escaped dot (raw string keeps the backslash)
dotmatch(tree, "**[?(@.size > 1000)]") # Size predicate
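The behaviour of the two wildcards can be sketched with a short recursive matcher over path components. This is not AlgoTree's matcher: `matches` is a hypothetical helper that covers only names, `*`, and `**`, uses the standard library's `fnmatchcase` for wildcards inside a name, and assumes that `**` may match zero or more components.

```python
from fnmatch import fnmatchcase


def matches(pattern: list[str], path: list[str]) -> bool:
    """Match a list of pattern components against a list of node names."""
    if not pattern:
        return not path  # both exhausted means a full match
    head, rest = pattern[0], pattern[1:]
    if head == "**":
        # "**" may absorb zero or more path components (an assumption)
        return any(matches(rest, path[i:]) for i in range(len(path) + 1))
    if not path:
        return False
    # A single component: "*" and suffix patterns such as "*.txt" match one name
    return fnmatchcase(path[0], head) and matches(rest, path[1:])


print(matches(["app", "*", "settings"], ["app", "db", "settings"]))       # True
print(matches(["app", "*", "settings"], ["app", "db", "x", "settings"]))  # False
print(matches(["app", "**"], ["app", "modules", "auth"]))                 # True
print(matches(["files", "*.txt"], ["files", "doc1.txt"]))                 # True
```

The pattern is given as a list of components so that a name such as `doc1.txt` can contain a literal dot without any escaping at this stage.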