Term Rewriting as Metalinguistic Abstraction
When a problem domain involves transforming structured expressions, you have two choices. You can write recursive traversals by hand, matching cases and rebuilding trees. Or you can build a language for expressing transformations and let the engine do the traversal.
The first approach works fine until you have thirty rules and need to change the traversal strategy. The second approach is what SICP Chapter 4 calls metalinguistic abstraction: building a language suited to the problem.
Rerum is a Python library for pattern matching and term rewriting. The core idea: transformation rules are data, not code. You write them in a DSL, load them from files, combine them with composition operators. The engine applies them.
The Rule DSL¶
A rewrite rule has a pattern (what to match) and a skeleton (what to replace it with):
@add-zero[100] "x + 0 = x": (+ ?x 0) => :x
@mul-one[100]: (* ?x 1) => :x
@mul-zero[100]: (* ?x 0) => 0
Each rule has a name (@add-zero) for tracing, an optional priority ([100]) for conflict resolution, and an optional description. The pattern syntax uses ?x to bind variables and :x to reference them in the skeleton:
| Syntax | Meaning |
|---|---|
?x |
Match anything, bind to x |
?x:const |
Match only numbers |
?x:var |
Match only symbols |
?x:free(v) |
Match expressions not containing v |
?x... |
Variadic, capture remaining arguments |
This is a small language. It has primitives (patterns, skeletons), means of combination (rules, rulesets), and means of abstraction (named rules, engine composition). Those are SICP's three requirements for a language.
Symbolic Differentiation¶
Here's what the DSL looks like for a real problem. Fifteen rules that compute symbolic derivatives:
[basic-derivatives]
@dd-const[100]: (dd ?c:const ?v:var) => 0
@dd-var-same[100]: (dd ?x:var ?x) => 1
@dd-var-diff[90]: (dd ?y:var ?x:var) => 0
[rules]
@dd-sum: (dd (+ ?f ?g) ?v:var) => (+ (dd :f :v) (dd :g :v))
@dd-product: (dd (* ?f ?g) ?v:var) => (+ (* (dd :f :v) :g) (* :f (dd :g :v)))
@dd-power: (dd (^ ?f ?n:const) ?v:var) => (* :n (* (^ :f (- :n 1)) (dd :f :v)))
@dd-exp: (dd (exp ?f) ?v:var) => (* (exp :f) (dd :f :v))
@dd-log: (dd (ln ?f) ?v:var) => (/ (dd :f :v) :f)
@dd-sin: (dd (sin ?f) ?v:var) => (* (cos :f) (dd :f :v))
@dd-cos: (dd (cos ?f) ?v:var) => (* (- (sin :f)) (dd :f :v))
from rerum import RuleEngine, E
engine = RuleEngine.from_file("calculus.rules")
engine(E("(dd (^ x 2) x)")) # => (* 2 (* (^ x 1) 1))
The result needs simplification (another ruleset handles that), but the differentiation itself is entirely declarative. Each rule is the chain rule, product rule, or power rule stated once. The engine handles traversal and substitution.
Rules vs. Code: The Security Boundary¶
A key architectural decision: rules can reference operations but can't execute arbitrary code. The (! op args...) compute form invokes operations that the host explicitly provides through a prelude:
from rerum import RuleEngine, ARITHMETIC_PRELUDE
engine = (RuleEngine()
.with_prelude(ARITHMETIC_PRELUDE)
.load_dsl('''
@fold-add: (+ ?a ?b) => (! + :a :b)
when (! and (! const? :a) (! const? :b))
'''))
engine(E("(+ 1 2)")) # => 3 (computed by prelude)
engine(E("(+ x 2)")) # => (+ x 2) (no change, x isn't const)
This means you can load rules from untrusted sources safely. A rule file can only invoke operations you've explicitly allowed. No eval, no exec, no filesystem access. The rule language is deliberately not Turing-complete in its compute form.
This separation between untrusted transformation logic (rules) and trusted computation (preludes) is worth the design effort. Most symbolic systems don't make this distinction, which means either rules can do anything (unsafe) or rules can't compute at all (too restrictive).
Engine Composition¶
Engines combine with two operators:
expand = RuleEngine.from_dsl("@square: (square ?x) => (* :x :x)")
simplify = RuleEngine.from_file("algebra.rules")
# Sequence: expand first, then simplify
normalize = expand >> simplify
normalize(E("(square 3)")) # => 9
# Union: combine all rules into one engine
combined = algebra | calculus
The closure property holds: combining engines yields an engine. >> is associative. This means you can build complex transformation pipelines from simple, testable pieces.
Different problems also need different traversal strategies. exhaustive applies rules until fixpoint. bottomup simplifies children before parents. topdown tries the root first. once applies at most one rule. The strategy is separate from the rules, which means the same ruleset can be applied in different ways depending on the problem.
Why "Rerum"?¶
The name comes from Lucretius's De Rerum Natura ("On the Nature of Things"), a poem about how complex phenomena arise from simple atomic transformations. That's the idea here too: complex symbolic computations as compositions of simple rewrite rules.
This post is part of a series on SICP, exploring how the ideas from Structure and Interpretation of Computer Programs appear in modern programming practice.