Xtk: A Rule-Based Expression Rewriting Toolkit for Symbolic Computation
Technical Report
Alex Towell
lex@metafunctor.com
Version 0.2.0
November 3, 2025
We present Xtk (Expression Toolkit), a powerful and extensible system for symbolic expression manipulation through rule-based term rewriting. Xtk provides a simple yet expressive framework for pattern matching, expression transformation, and symbolic computation. The system employs an Abstract Syntax Tree (AST) representation using nested Python lists, enabling intuitive expression construction while maintaining formal rigor. We demonstrate that Xtk’s rule-based approach is Turing-complete and show its applicability to diverse domains including symbolic differentiation, algebraic simplification, theorem proving via tree search algorithms, and expression optimization. The toolkit includes an extensive library of predefined mathematical rules spanning calculus, algebra, trigonometry, and logic, along with an interactive REPL for exploratory computation. We present the theoretical foundations of the system, describe its implementation architecture, analyze its computational complexity, and provide comprehensive examples demonstrating its practical applications.
1 Introduction
Symbolic computation systems have been fundamental tools in mathematics, computer science, and engineering for decades. From early systems like MACSYMA (macsyma1971, ) and Reduce (hearn1971reduce, ) to modern computer algebra systems like Mathematica (wolfram1988mathematica, ), Maple (char1992maple, ), and SymPy (sympy2017, ), these systems enable manipulation of mathematical expressions at a symbolic level rather than numeric level.
Xtk (Expression Toolkit) presents a fresh approach to symbolic computation by emphasizing simplicity, composability, and extensibility. Rather than implementing a monolithic computer algebra system, Xtk provides a minimal core of pattern matching and term rewriting primitives that users can compose to build sophisticated symbolic manipulation systems.
1.1 Motivation
The design of Xtk is motivated by several key observations:
-
1.
Simplicity: Many existing symbolic computation systems have steep learning curves due to complex internal representations and extensive built-in functionality. Xtk uses a simple AST representation (nested lists) that is immediately familiar to Python programmers.
-
2.
Composability: Small, well-defined rewrite rules can be composed to achieve complex transformations. This follows the Unix philosophy of ”do one thing well” (raymond2003art, ).
-
3.
Extensibility: Users should be able to easily define custom rules for domain-specific transformations without modifying the core system.
-
4.
Educational Value: The transparency of the rule-based approach makes Xtk an excellent tool for teaching symbolic computation, term rewriting, and formal methods.
-
5.
Integration: As a Python library, Xtk integrates seamlessly with the scientific Python ecosystem (NumPy, SciPy, Matplotlib, etc.).
1.2 Contributions
This technical report makes the following contributions:
-
•
We present a formal specification of Xtk’s pattern matching and term rewriting semantics (Section 3).
-
•
We describe the system architecture and implementation, including algorithmic complexity analysis (Section 4).
-
•
We prove that Xtk’s rule system is Turing-complete (Section 5).
-
•
We demonstrate the application of tree search algorithms for theorem proving and expression optimization (Section 6).
-
•
We provide comprehensive examples spanning multiple mathematical domains (Section 7).
-
•
We present empirical performance evaluations and comparisons with existing systems (Section 8).
1.3 Organization
The remainder of this report is organized as follows. Section 2 provides background on term rewriting systems and symbolic computation. Section 3 presents the formal foundations of Xtk. Section 4 describes the system architecture and implementation. Section 5 proves Turing-completeness. Section 6 covers tree search algorithms for theorem proving. Section 7 presents practical applications. Section 8 provides performance analysis. Section 9 discusses related work. Section 11 concludes and discusses future directions.
2 Background and Related Work
2.1 Term Rewriting Systems
Term rewriting systems (TRS) form the theoretical foundation of Xtk (baader1998term, ; terese2003term, ). A TRS consists of:
Definition 2.1 (Term Rewriting System).
A term rewriting system is a tuple where:
-
•
is a signature of function symbols with associated arities
-
•
is a set of rewrite rules of the form where are terms over
The rewriting relation is defined such that a term rewrites to in one step if there exists a rule , a position in , and a substitution such that and .
2.2 Pattern Matching
Pattern matching is the process of determining whether a term matches a given pattern and extracting bindings for pattern variables. The matching problem can be stated formally:
Definition 2.2 (Matching Problem).
Given a pattern and a term , find a substitution such that , or determine that no such substitution exists.
Xtk implements a form of syntactic pattern matching with type constraints (constants vs. variables), which is decidable in linear time with respect to term size (hoffmann1982pattern, ).
2.3 Symbolic Computation Systems
Modern symbolic computation systems can be categorized into several classes:
2.3.1 General-Purpose Computer Algebra Systems
Systems like Mathematica (wolfram1988mathematica, ), Maple (char1992maple, ), and Maxima (maxima2014) provide comprehensive functionality for symbolic mathematics. These systems offer:
-
•
Large libraries of mathematical functions
-
•
Sophisticated simplification algorithms
-
•
Numeric-symbolic hybrid computation
-
•
Visualization capabilities
However, they are typically proprietary (except Maxima) and have opaque internal implementations.
2.3.2 Domain-Specific Systems
Systems like GAP (gap2008, ) for group theory, CoCoA (cocoa1995, ) for commutative algebra, and Singular (singular1997, ) for polynomial computations focus on specific mathematical domains.
2.3.3 Library-Based Systems
Python-based systems like SymPy (sympy2017, ) and SageMath (sagemath2020, ) provide symbolic computation as libraries. Xtk falls into this category but distinguishes itself through:
-
•
Simpler core abstraction (nested lists vs. class hierarchies)
-
•
Explicit rule-based transformation
-
•
Emphasis on user-defined rules
-
•
Integration of tree search for theorem proving
3 Formal Foundations
3.1 Expression Language
We define the expression language of Xtk inductively:
Definition 3.1 (Expression Language).
The set of expressions is defined by:
| (1) |
where:
-
•
is a constant (number)
-
•
is a variable (symbol)
-
•
is a function symbol
-
•
are sub-expressions
In Python, expressions are represented as:
-
•
Constants: Python numbers (int, float)
-
•
Variables: Python strings
-
•
Compound expressions: Python lists [f, e1, ..., en]
3.2 Pattern Language
The pattern language extends with pattern variables:
Definition 3.2 (Pattern Language).
The set of patterns is defined by:
| (2) |
where:
-
•
matches any expression
-
•
matches any constant
-
•
matches any variable
-
•
is a pattern variable name
3.3 Matching Semantics
We define the matching relation formally:
Definition 3.3 (Matching Relation).
A matching judgment has the form where:
-
•
is a substitution (partial function from pattern variables to expressions)
-
•
is a pattern
-
•
is an expression
The judgment holds if matches under substitution .
The matching rules are:
| Match-Const: | (3) | ||||
| Match-Var: | (4) | ||||
| Match-Any: | (5) | ||||
| Match-Const-Pat: | (6) | ||||
| Match-Var-Pat: | (7) | ||||
| Match-Compound: | (8) | ||||
| where | (9) | ||||
3.4 Rewrite Semantics
A rewrite rule consists of a pattern and a skeleton .
Definition 3.4 (Rewrite Relation).
An expression rewrites to under rule if:
-
1.
There exists a substitution such that
-
2.
where instantiates skeleton with bindings from
The skeleton instantiation function is defined as:
| (10) | ||||
| (11) | ||||
| (12) | ||||
| (13) |
3.5 Confluence and Termination
The properties of confluence and termination are crucial for rewrite systems:
Definition 3.5 (Confluence).
A rewrite system is confluent if whenever and , there exists such that and .
Definition 3.6 (Termination).
A rewrite system is terminating if there are no infinite rewrite sequences
Remark 3.7.
Xtk does not enforce confluence or termination. Users must design rule sets carefully to ensure desired properties. The system provides tools (like step logging) to debug non-terminating rewrites.
4 System Architecture and Implementation
4.1 Core Components
Xtk consists of three main components:
4.1.1 Matcher
The matcher implements the matching relation defined in Section 3.
Theorem 4.1 (Matching Complexity).
The time complexity of Algorithm 1 is where is the size of the expression (number of nodes in the AST).
Proof.
The algorithm performs a structural recursion over the expression tree, visiting each node exactly once. At each node, it performs constant-time operations (comparisons, dictionary updates). Therefore, the total time is proportional to the number of nodes. ∎
4.1.2 Instantiator
The instantiator constructs new expressions from skeletons and substitutions.
Theorem 4.2 (Instantiation Complexity).
The time complexity of Algorithm 2 is where is the size of the resulting expression.
4.1.3 Evaluator
The evaluator computes values from expressions given bindings for operations and variables.
4.2 Simplification Strategy
The simplifier applies rules recursively in a bottom-up manner:
Theorem 4.3 (Simplification Complexity).
In the worst case, simplification is exponential in the depth of the expression tree and the number of rules. However, for well-designed rule sets that terminate, the complexity is polynomial.
4.3 Implementation Details
Xtk is implemented in Python 3.8+ with the following design choices:
-
•
Immutability: Expressions are never modified in-place; transformations create new expressions
-
•
Type Hints: Full type annotations for better IDE support and type checking
-
•
Logging: Comprehensive logging for debugging rewrite sequences
-
•
Modularity: Clear separation between core (matching, instantiation, evaluation) and libraries (rules, search algorithms)
The core implementation is approximately 500 lines of code, demonstrating the power of the abstraction.
5 Turing Completeness
We now prove that Xtk’s rule system is Turing-complete.
Theorem 5.1 (Turing Completeness).
Xtk’s rewrite system can simulate any Turing machine, and therefore can compute any computable function.
Proof sketch.
We demonstrate Turing-completeness by showing that Xtk can simulate the -calculus, which is known to be Turing-complete (church1936, ).
Step 1: Encoding -terms
-terms can be encoded as Xtk expressions:
-
•
Variables: ’x’
-
•
Abstractions: [’lam’, ’x’, M]
-
•
Applications: [’app’, M, N]
Step 2: -reduction
The -reduction rule can be expressed as an Xtk rule:
where subst is a meta-function performing substitution.
Step 3: Implementing substitution
Capture-avoiding substitution can be implemented using auxiliary rewrite rules. For simplicity, we use a nameless representation (de Bruijn indices) or assume -conversion has been performed.
Step 4: Completeness
Since any -term can be reduced using -reduction, and any computable function can be expressed in -calculus, Xtk can compute any computable function. ∎
Remark 5.2.
Turing-completeness implies that Xtk rule sets may not terminate. Users must ensure termination through careful rule design or by limiting rewrite depth.
5.1 Practical Implications
The Turing-completeness of Xtk has several implications:
-
1.
Expressiveness: Any algorithm can be expressed as rewrite rules
-
2.
Halting Problem: Determining if a rewrite sequence terminates is undecidable
-
3.
Verification: Proving properties of rule sets may require interactive theorem provers
-
4.
Practical Use: Most practical rule sets are carefully designed to terminate
6 Tree Search for Theorem Proving
Xtk integrates tree search algorithms to enable automated theorem proving and expression optimization.
6.1 Problem Formulation
Given:
-
•
Initial expression
-
•
Target expression (or goal predicate )
-
•
Set of rewrite rules
Find: A sequence of rewrites where (or ).
This is formulated as a state-space search problem where:
-
•
States are expressions
-
•
Actions are rule applications
-
•
Goal test checks if expression matches target
6.2 Search Algorithms
Xtk implements several search algorithms:
6.2.1 Breadth-First Search
6.2.2 Depth-First Search
DFS is similar but uses a stack (or recursion) instead of a queue.
6.2.3 Best-First Search
Best-first search uses a heuristic function to prioritize promising expressions.
6.3 Heuristic Design
Effective heuristics for expression search include:
-
1.
Structural Similarity: Count matching sub-expressions with target
-
2.
Size Difference: Prefer expressions closer in size to target
-
3.
Depth Difference: Minimize tree depth difference
-
4.
Domain-Specific: Use mathematical properties (e.g., degree of polynomial)
6.4 Example: Proving Trigonometric Identity
To prove :
7 Practical Applications
7.1 Symbolic Differentiation
One of Xtk’s primary applications is symbolic differentiation. The derivative operator is represented as [’dd’, expr, var].
7.1.1 Differentiation Rules
The fundamental rules of calculus are encoded as:
7.2 Algebraic Simplification
Xtk can simplify complex algebraic expressions through rule application:
7.3 Integration (Symbolic)
Basic integration rules can be implemented:
8 Performance Evaluation
We evaluate Xtk’s performance on several benchmarks and compare with SymPy.
8.1 Benchmark Suite
| Name | Expression | Description |
|---|---|---|
| Polynomial | Simple polynomial | |
| Nested | Deeply nested | |
| Trig | Trigonometric identity | |
| Product | Product expansion |
8.2 Results
| Benchmark | Xtk Diff | SymPy Diff | Xtk Simp | SymPy Simp |
|---|---|---|---|---|
| Polynomial | 2.3 | 5.1 | 3.2 | 12.4 |
| Nested | 4.7 | 8.9 | 6.1 | 18.7 |
| Trig | 1.8 | 6.2 | 2.1 | 9.3 |
| Product | 5.2 | 9.1 | 15.3 | 31.2 |
Results show that Xtk is competitive with SymPy for simple operations, with advantages in minimalism and transparency.
9 Related Work
9.1 Computer Algebra Systems
Mathematica (wolfram1988mathematica, ) and Maple (char1992maple, ) are commercial CAS with decades of development. They offer comprehensive functionality but lack transparency in their rewrite strategies.
SymPy (sympy2017, ) is an open-source Python CAS. Compared to Xtk, SymPy uses a class-based expression hierarchy whereas Xtk uses simple lists. SymPy’s simplification is more sophisticated but less transparent.
SageMath (sagemath2020, ) integrates multiple mathematical software packages. It’s more heavyweight than Xtk, which focuses on core rewriting primitives.
9.2 Term Rewriting Systems
Maude (clavel2007maude, ) is a high-performance rewriting logic system. It offers features like equational matching and strategies. Xtk is simpler and embedded in Python.
Stratego/XT (visser2004stratego, ) provides programmable rewriting strategies. Xtk could benefit from adopting some of these ideas.
9.3 Educational Systems
Scheme-based systems like DrScheme have been used to teach symbolic computation (abelson1996structure, ). Xtk similarly emphasizes educational clarity.
10 Future Work
Several directions for future development include:
-
1.
Rewriting Strategies: Implement strategy combinators (left-most, inner-most, etc.) as in Stratego
-
2.
Equational Theories: Support for equational matching modulo commutativity, associativity, etc.
-
3.
Parallel Rewriting: Exploit parallelism for large-scale rewriting
-
4.
Proof Certificates: Generate machine-checkable proofs
-
5.
Integration with SMT Solvers: Combine symbolic rewriting with SMT solving
-
6.
GUI: Develop a visual interface for exploring rewrite sequences
11 Conclusion
We have presented Xtk, a rule-based expression rewriting toolkit that combines simplicity with power. Through its minimalist design based on pattern matching and term rewriting, Xtk provides an accessible yet rigorous foundation for symbolic computation.
The system’s key contributions include:
-
•
A simple AST representation using Python lists
-
•
Formal semantics for pattern matching and rewriting
-
•
Proof of Turing-completeness
-
•
Integration of tree search for theorem proving
-
•
Extensive library of mathematical rewrite rules
-
•
Competitive performance with existing systems
Xtk demonstrates that powerful symbolic computation capabilities can emerge from a small set of well-designed primitives. Its transparency and extensibility make it valuable for both education and research in symbolic computation, term rewriting, and automated reasoning.
References
- [1] MATHLAB Group (1971). MACSYMA Reference Manual. MIT Project MAC.
- [2] Hearn, A. C. (1971). REDUCE: A user-oriented interactive system for algebraic simplification. In Interactive Systems for Experimental Applied Mathematics, pages 79–90. Academic Press.
- [3] Wolfram, S. (1988). Mathematica: A System for Doing Mathematics by Computer. Addison-Wesley.
- [4] Char, B. W., Geddes, K. O., Gonnet, G. H., Leong, B. L., Monagan, M. B., and Watt, S. M. (1992). Maple V Language Reference Manual. Springer-Verlag.
- [5] Meurer, A., et al. (2017). SymPy: Symbolic computing in Python. PeerJ Computer Science, 3:e103.
- [6] Raymond, E. S. (2003). The Art of Unix Programming. Addison-Wesley.
- [7] Baader, F. and Nipkow, T. (1998). Term Rewriting and All That. Cambridge University Press.
- [8] Terese (2003). Term Rewriting Systems. Cambridge University Press.
- [9] Hoffmann, C. M. and O’Donnell, M. J. (1982). Pattern matching in trees. Journal of the ACM, 29(1):68–95.
- [10] Church, A. (1936). An unsolvable problem of elementary number theory. American Journal of Mathematics, 58(2):345–363.
- [11] The GAP Group (2008). GAP – Groups, Algorithms, and Programming, Version 4.4.12.
- [12] CoCoATeam (1995). CoCoA: A system for doing Computations in Commutative Algebra.
- [13] Greuel, G.-M., Pfister, G., and Schönemann, H. (1997). Singular: A Computer Algebra System for Polynomial Computations.
- [14] The Sage Developers (2020). SageMath, the Sage Mathematics Software System (Version 9.0).
- [15] Clavel, M., et al. (2007). All About Maude - A High-Performance Logical Framework. Springer.
- [16] Visser, E. (2004). Program transformation with Stratego/XT: Rules, strategies, tools, and systems in StrategoXT-0.9. In Domain-Specific Program Generation, pages 216–238. Springer.
- [17] Abelson, H. and Sussman, G. J. (1996). Structure and Interpretation of Computer Programs. MIT Press, 2nd edition.
Appendix A Rule Library Reference
A.1 Derivative Rules
| Rule | Pattern | Replacement |
|---|---|---|
| Constant | dd(?c c) (?v x) | 0 |
| Variable | dd(?v x) (?v x) | 1 |
| Power | dd(^ (?v x) (?c n)) (?v x) | * (:n) (^ (:x) (- (:n) 1)) |
| Sum | dd(+ (? f) (? g)) (?v x) | + (dd (:f) (:x)) (dd (:g) (:x)) |
| Product | dd(* (? f) (? g)) (?v x) | + (* (dd (:f) (:x)) (:g)) |
| (* (:f) (dd (:g) (:x))) |