FlatForest¶
The FlatForest class represents a forest (set of tree-like objects) using a flat dictionary structure where each node has a unique key and an optional ‘parent’ key to reference its parent node. This class provides a view adapter for dict/JSON data of a particular format.
Tree Data Format¶
A FlatForest is represented using a dictionary, where each key is a unique node identifier, and the value is another dictionary containing node data and an optional ‘parent’ key indicating the parent node.
{
"<node_key>": {
"parent": "<parent_node_key>", # Parent node key (optional)
"<key>": "<value>", # Node payload (optional key-value pairs)
"...": "...",
"<key>": "<value>"
},
"...": "...",
"<node_key>": {
"parent": "<parent_node_key>",
"<key>": "<value>",
"...": "...",
"<key>": "<value>"
}
# ... more node key-value pairs
}
Example Forest Data:
{
"node1": {
"data": "Some data for node1"
},
"node2": {
"data": "Some data for node2"
},
"node3": {
"parent": "node1",
"data": "Some data for node3"
},
"node4": {
"parent": "node3",
"data": "Some data for node4"
},
"node5": {
"parent": "node3",
"data": "Some data for node5"
}
}
Theoretical Background¶
Trees are hierarchical data structures consisting of nodes, where each node has a parent and potentially many children. Trees are used in various domains such as databases, file systems, and network routing. They are particularly useful for representing data with a nested or hierarchical nature.
Tree Terminology
Node: A structure that contains data and references to its parent. A tree is a collection of nodes related by parent-child relationships.
Root: The top node of a tree.
Leaf: A node with no children.
Proxy Objects and Views¶
In computer science, a proxy object is an object that acts as an intermediary for another object. The proxy can control access to the original object, providing additional functionality such as validation, lazy loading, or caching. This is a common design pattern used to create a level of indirection.
A view in this context is an abstraction that provides a different perspective or representation of the underlying data. For example, a view can present a flat dictionary as a hierarchical tree structure.
FlatForestNode Proxies¶
The FlatForestNode is a proxy class for providing a node-centric view of FlatForest objects. It allows you to treat nodes as first-class objects while maintaining the underlying flat dictionary structure. You do not even need to be aware of FlatForest objects, since you can create and manipulate nodes directly, but these operations are reflected in the underlying FlatForest, which may be accessed if needed using the forest attribute.
Key Features:
Encapsulation: Provides methods to manipulate individual nodes.
Abstraction: Hides the complexity of the flat dictionary structure, presenting a more intuitive tree-like interface.
Flexibility: Allows you to work with sub-trees and individual nodes seamlessly.
Root Node¶
In FlatForest, there can be multiple roots (multiple trees). These roots are the nodes that have no parent. They can be accessed with the roots and root_names attributes.
FlatForest also exposes itself as a tree-like structure, where the default behavior is to treat the first root node found as the tree. This may be overridden by changing the preferred_root attribute.
We also provide an as_tree method to merge all of the trees in the forest under a new root node, which can be useful if a tree-like structure is needed for all nodes in the forest.
FlatForest Class¶
The FlatForest class provides a flexible way to work with tree structures using a flat dictionary format. It offers various methods for manipulating and visualizing trees.
Initializing a FlatTree¶
You can initialize a FlatForest with a dictionary representing the tree data.
import AlgoTree
tree_data = {
"node1": {
"data": "Some data for node1"
},
"node2": {
"parent": "node1",
"data": "Some data for node2"
},
"node3": {
"parent": "node1",
"data": "Some data for node3"
},
"node4": {
"parent": "node3",
"data": "Some data for node4"
},
"node5": {
"parent": "node3",
"data": "Some data for node5"
}
}
tree = AlgoTree.FlatForest(tree_data)
print(json.dumps(tree, indent=2))
Expected Output:
{
"node1": {
"data": "Some data for node1"
},
"node2": {
"parent": "node1",
"data": "Some data for node2"
},
"node3": {
"parent": "node1",
"data": "Some data for node3"
},
"node4": {
"parent": "node3",
"data": "Some data for node4"
},
"node5": {
"parent": "node3",
"data": "Some data for node5"
}
}
Visualizing the Tree¶
You can visualize the tree using the PrettyTree class.
Text Visualization¶
from AlgoTree.pretty_print import pretty_print
print(pretty_print(tree))
Expected Output:
node1
├── node3
│ ├── node4
│ └── node5
└── node2
Manipulating the Tree¶
Adding a Child Node¶
child = tree.root.add_child(name="node36", data="Some data for node36")
print(child)
Expected Output:
FlatForestNode(name=node36, parent=node1, data="Some data for node36"})
Viewing Sub-Trees¶
You can work with sub-trees rooted at any node.
print(pretty_tree(tree.node("node3")))
Expected Output:
node3
├── node4
└── node5
Validating the Tree¶
Ensures that all keys are unique and that parent references are valid.
tree.check_valid()
Detaching and Purging Nodes¶
You can detach nodes, which sets their parent to a special key indicating they are detached, and purge detached nodes to remove them from the underlying dictionary.
Purging Detached Nodes¶
tree.purge()
Handling Errors¶
Invalid Parent Reference¶
Attempting to create a tree with an invalid parent reference will raise an error.
try:
invalid_tree = AlgoTree.FlatForest({
"node1": {
"parent": "non_existent_parent",
"data": "Some data for node1"
}})
invalid_tree.check_valid()
except KeyError as e:
print(e)
Expected Output:
Parent node non-existent: 'non_existent_parent'
Cycle Detection¶
The FlatForest class checks for cycles in the forest and raises an error if a cycle is detected.
try:
cycle_tree_data = {
"node0": { "data": "Some data for node0"},
"node1": {"parent": "node2", "data": "Some data for node1"},
"node2": {"parent": "node3", "data": "Some data for node2"},
"node3": {"parent": "node1", "data": "Some data for node3"},
"node4": {"parent": "node0", "data": "Some data for node4"}
}
cycle_tree = AlgoTree.FlatForest(cycle_tree_data)
cycle_tree.check_valid()
except ValueError as e:
print(e)
Expected Output:
Cycle detected: {'node2', 'node3', 'node1'}
Tree Conversions¶
You can convert between different tree representations, as long as they expose an API like children property or parent. We provide a TreeConverter class to facilitate these conversions.
Converting to TreeNode¶
import AlgoTree.tree_converter as tc
new_tree = tc.TreeConverter.convert(tree, target_type=AlgoTree.TreeNode)
print(type(new_tree))
Expected Output:
<class 'AlgoTree.treenode.TreeNode'>
Conclusion¶
The FlatForest class provides a flexible and powerful way to represent and manipulate tree structures using a flat dictionary format. With methods for adding, detaching, pruning, and visualizing nodes, FlatForest can handle various tree-related tasks efficiently. This tutorial has covered the basic and advanced usage of the class, demonstrating its capabilities and versatility.
For more detailed information and code implementation, refer to the [GitHub repository](https://github.com/queelius/AlgoTree).