Mar 17, 2023

Term rewriting is one of the most fundamental techniques in
programming languages. It is used to define program semantics, to
optimize programs, and to check program equivalences. An issue with
using term rewriting to optimize program is that, in a non-confluent
term rewriting system, it is usually not clear which rule should be
applied first, among all the possible rules. Equality saturation (EqSat)
is a variant of term rewriting that mitigates this so-called
Phase-Ordering Problem. In EqSat, all the rules are applied at the same
time, and the resulting program space is stored compactly in a data
structure called E-graph^{1}.

EqSat has been shown to be very successful for program optimizations and program equivalence checking, even when the given set of rewrite rules are not terminating or even when the theory is not decidable in general. However, despite its success in practice, there are no formal guarantees about EqSat: for example, when does EqSat terminate, and if it does not, how does one make it terminate. The first problem is known in the term rewriting literature as the termination problem, and the second is known as the completion problem. Both problems are very hard, and there are a ton of literatures on both problems. In the setting of EqSat, these problems are not only theoretically interesting, both also have practical implications. For example, in program optimization, we may want to get the most “optimized” term with regard to a given set of rules, so making sure EqSat terminate is important to such optimality guarantees. Or, some theories are decidable but deciding them is slow, so one may want to speed up the reasoning by using EqSat, but there is no point in “speeding up” the decision procedure if it simply does not terminate. In this post, we will focus on the termination problem of EqSat. We don’t attempt to solve this problem entirely, but rather use this blog post as a first step and to draw community’s attention to this problem. In fact, we found many interesting results about the termination problem with EqSat.

This post will show (1) how the innocent-looking associativity rule can cause non-termination, (2) why a terminating, and even canonical, term rewriting system does not necessarily terminate in EqSat, (3) how to fix the above problem by “weakening” EqSat’s merge operation, and (4) two potentially promising approaches to ensure the termination of EqSat. One fascinating thing I found during this journey is that, researchers working on tree automata indeed developed a technique almost identical to EqSat, known as Tree Automata (TA) completion. Different from EqSat, TA Completion does not have the problem in (2) and is exactly the algorithm we will show in (3). Moreover, there is a beautiful connection between EqSat and TA completion: TA completion is the “preorder” version of EqSat.

Before understanding why associativity can cause non-termination, let us first briefly review some relevant backgrounds on ground theories and congruence closure.

A ground equational theory is an equational theory induced by a finite set of ground identities of the form \(s\approx t\), where both \(s\) and \(t\) are ground terms (i.e., no variables). For example, below is an example of a ground theory over signature \(\Sigma={a,b,c,f,g}\): \[\begin{align*} a&\approx f(b)\\ b&\approx g(c)\\ f(g(c))&\approx f(a)\\ \end{align*}\] All the equations that can be implied by the three identities are true in this equational theory. For example, we have \(a\approx f(a)\) because \(a\approx f(b)\approx f(g(c))\approx f(a)\). Here, \(f(b)\approx f(g(c))\) is implied by \(b\approx g(c)\). In equational theory, a function by definition maps equivalent inputs to equivalent outputs.

A classic result in term rewriting and logic is that the word problem of ground equational theory is decidable. A word problem asks whether two ground terms \(s\) and \(t\) are equivalent in a given theory. In general, this problem is undecidable. However, if the theory is ground, several algorithms exist that decide its word problem. One of the most well-known algorithm is the \(O(n \log n)\) congruence closure algorithm of Downey, Sethi, and Tarjan. One way to view the congruence closure algorithm is that it produces a canonical term rewriting system for each input set of ground identities: For theory \(E\), it builds an E-graph of the theory. Every E-graph corresponds to a canonical term rewriting system, which gives a way to decide \(E\). For example, the congruence closure algorithm will produce the following E-graph for the theory above: \[\begin{align*} c_a&=\{a, f(c_a), f(c_b)\}\\ c_b&=\{b, g(c_c)\}\\ c_c&=\{c\} \end{align*}\] where \(c_a, c_b, c_c\) denote E-classes of the E-graph, and \(a,b,c,f(c_a), f(c_b), g(c_c)\) denote E-nodes. This E-graph naturally gives the following canonical term rewriting system \(G\), which rewrite terms to their e-classes: \[\begin{align*} a&\rightarrow_G c_a\\ f(c_a)&\rightarrow_G c_a\\ f(c_b)&\rightarrow_G c_a\\ b&\rightarrow_G c_b\\ g(c_c)&\rightarrow_G c_b\\ c&\rightarrow_G c_c\\ \end{align*}\] Now, checking \(s\approx t\) can be simply done by checking if there exists some normal form \(u\) such that \(s\rightarrow^*_G u \leftarrow^*_Gt\) holds. For example, \(g(f(a))\approx g(f(g(c)))\) because \[\begin{align*} g(f(a))&\rightarrow_Gg(f(c_a))\\ &\rightarrow_Gg(c_a)\\ &\leftarrow_G g(f(c_b)) \\ &\leftarrow_Gg(f(g(c_c))\\ &\leftarrow_Gg(f(g(c)))) \end{align*}\]

This is sound and always terminates, because the term rewriting system produced by an E-graph is canonical—meaning every term will have exactly one normal form and term rewriting always terminates.

Associativity is a fundamental axiom to many algebraic structures
like semigroups, monoids, and groups. It has the following form: \[x\cdot (y\cdot z)\approx (x\cdot y)\cdot
z.\] This rule can be oriented as \(x\cdot (y\cdot z)\rightarrow (x\cdot y)\cdot
z\) (or \(x\cdot (y\cdot z)\rightarrow
(x\cdot y)\cdot z\)). This rule must be terminating, you may
think, so we can just apply the rule until saturation in EqSat, which
will decide ground theories with associativity! Unfortunately, ground
associative theories are not decidable in general. *Term Rewriting
and All That* gives an example (we write \(xy\) for \(x\cdot
y\) and \(x\cdots x\) for \(x^n\) for brevity and associativity allows
us to drop brackets): \[\begin{align*}
(xy)z&\approx x(yz)\\
aba^2b^2&\approx b^2a^2ba\\
a^2bab^2a&\approx b^2a^3ba\\
aba^3b^2&\approx ab^2aba^2\\
b^3a^2b^2a^2ba&\approx b^3a^2b^2a^4\\
a^4b^2a^2ba&\approx b^2a^4
\end{align*}\] There is another way to state this proposition
that appeals to math-minded persons: the word problem for finitely
presented semigroups are not decidable.

Because of this, associative rules do not terminate in EqSat in general. To show this, suppose otherwise it is terminating, given a set of ground identities \(E\), we run the congruence closure algorithm over \(E\) to get E-graph, and run both directions of the associativity. When it reaches the fixed point and terminates, this gives us a way to decide ground associative theories, which is a contradiction to the fact that such theories are not decidable.

To better understand why associativity does not terminate in EqSat, here is an example: suppose \(\cdot\) is associative and satisfy the ground identity \(0\cdot a\approx 0\) for constants \(0,a\). Now suppose we orient this identity into rewrite rule \(0\cdot a\rightarrow 0\) while having the associative rule \((x\cdot y)\cdot z\rightarrow x\cdot(y\cdot z)\). This is a terminating term-rewriting system (although not confluent, because the term \((0\cdot a)\cdot a\) has two normal forms \(0\) and \(0\cdot(a\cdot a)\)).

However, this ruleset causes problems in EqSat: Starting with the initial term \(0\cdot a\), EqSat will apply the rewrite rule \(0\cdot a\rightarrow 0\) and merge \(0\cdot a\) and \(0\) into the same E-class. The E-graph will look like this:

Notice that because of the existence of cycles in this E-graph, it
represents not only the two terms \(0\)
and \(0\cdot a\) but indeed an infinite
set of terms. For example,
\((0\cdot a)\cdot a\) is *explicitly*
represented by E-class \(q_0\) because
\[(0\cdot a)\cdot a\rightarrow^* (q_0\cdot
q_a)\cdot q_a\rightarrow q_0\cdot q_a\rightarrow q_0.\] In fact,
\(q_0\) represents the infinite set of
terms \[0\cdot a\approx(0\cdot a)\cdot
a\approx ((0\cdot a)\cdot a)\cdot a\approx\cdots.\] For any such
term \((0\cdot a)\cdot\cdots\), it can
be rewritten to a term of the form \(0\cdot
(a\cdot \cdots)\). Now, for associativity to terminate, the
output E-graph need to at least represent the set of terms \(a,a\cdot a,(a\cdot a)\cdot a,\cdots\),
where any two terms are not equal. This requires infinitely many
E-classes, each represents some \(a^n\), while a finite E-graph will have
only a finite number of E-classes. Therefore, EqSat will not terminate
in this case.

In our last example, the term rewriting system \(R=\{0\cdot a\rightarrow 0,(x\cdot y)\cdot z\rightarrow x\cdot (y\cdot z)\}\) is terminating, but it is not confluent. Confluence means that every term will have at most one normal form, and associativity is usually not confluent. One may be tempted to think that maybe non-confluence is what causes EqSat to not terminate. But it is not the case; there are canonical (i.e., terminating + confluent) TRSs that are non-terminating in EqSat. Here we give such an example: Let the TRS \(R\) be \[\begin{align*} h(f(x), y) &\rightarrow h(x, g(y))\\ h(x, g(y)) &\rightarrow h(x, y)\\ f(x) &\rightarrow x \end{align*}\]

This is a terminating term rewriting system, where every term of the form \(h(f^n(a), g^m(b))\) will have the normal form \(h(a, b)\), no matter the order of rule application. However, this is not terminating in EqSat: consider the initial term \(h(f(a), b)\). Running the rule \(f(x)\rightarrow x\) over the initial E-graph will union \(f(a)\) and \(a\) together, creating an infinite (but regular) set of terms \(h(f^*(a), b)\). See figure.

Now, by rule \(h(f(x), y) \rightarrow h(x, g(y))\), each \(h(f^n(a), b)\) will be rewritten into \(h(a, g^n(b))\), so the output E-graph must contain \(g^n(b)\) for \(n\in \mathbb{N}\). But notice that the rule set will not rewrite any \(g^n(b)\) to \(g^m(b)\) for \(n\neq m\), which means that we have an infinite set of inequivalent terms \(b\not\approx g(b)\not\approx g^2(b)\not\approx \ldots\). Again, the existence of infinitely many e-classes, one for each \(g^n(b)\), implies that EqSat will not terminate.

I hope, just like me, you will find the above observations somewhat
surprising. Intuitively, one will think that EqSat is just a more
powerful way of doing term rewriting. And for a terminating TRS, the set
of reachable terms is always finite^{2}, so it is natural to
think that running EqSat with a terminating TRS starting with some
initial term will eventually terminate. But this is not true, as has
been shown in the last two sections. The issue is because EqSat is not
exactly term rewriting: the equivalence in EqSat is bidirectional. For
example, in our last example, the rewrite from \(f(a)\) to \(a\) does not only make the E-graph
represent these two terms, but also \(f(f(a))\) and \(f(f(f(a)))\) and so on.

Before going further, let us first formally define problem in which we expect terminating TRSs to enjoy the termination property. For a TRS \(R\), we define the set of reachable terms \(R^*(s)=\{t\mid s\rightarrow_R^* t\}\). If \(R\) is terminating, \(R^*(s)\) is finite for any term \(s\). It can also be shown that EqSat always computes a superset of \(R^*(s)\). A natural idea is that if our EqSat procedure can compute exactly \(R^*(s)\), it is likely to terminate for terminating TRSs. And in fact it is capable of handling TRSs beyond terminating ones: E-graphs can represent many infinite sets of terms. This problem itself is interesting to study and has applications in areas like program verification.

It turns out, term rewriting researchers have developed a technique
that computes exactly \(R^*(s)\),
represented as a tree automaton. The technique is known as **tree
automata completion**, which is the main technique I hope to
introduce in this blog post.
Tree automata completion proceeds as follows: build an initial tree
automaton and run term rewriting over this tree automaton until
saturation. Specifically, it searches for left-hand sides of rewrite
rules, build and insert right-hand sides, and merge the left-hand sides
with right-hand sides. Does this sound familiar? Yes, this is EqSat! It
is striking to me that the program optimization and term rewriting
communities independently come up with essentially the same
technique.

But wait a second, didn’t we just say EqSat does not necessarily compute \(R^*(s)\) exactly? This is correct. There is a single tweak that distinguishes tree automata completion from EqSat. In tree automata completion, merging is performed directionally. For example, suppose the left-hand side is in E-class \(q_l\) and right-hand side in E-class \(q_r\), EqSat will basically rename every occurrence of \(q_l\) with \(q_r\) (or vice versa). As a result the two E-classes are not distinguishable after the merging.

Tree automata completion, on the other hand, performs the merging by adding a new \(\epsilon\)-transition \(q_r\rightarrow q_l\) (remember the TRS view of an E-graph). This allows tree automata completion to add only terms from the right-hand side to the E-graph.

To better see the difference, consider the E-graph that represents terms \(\{f(a), g(b)\}\) \[\begin{align*} a&\rightarrow q_a\\ f(q_a)&\rightarrow q_f\\ b&\rightarrow q_b\\ g(q_b)&\rightarrow q_g \end{align*}\] and the rewrite \(a\rightarrow b\). EqSat will rename \(q_b\) with \(q_a\) (or \(q_b\) with \(q_a\)), so every E-node that points to child \(a\) (resp. \(b\)) now also points to \(b\) (resp. \(a\)). The E-graph after the merging will now contain \(\{f(a), f(b), g(a), g(b)\}\). Note that among these terms, \(g(a)\) is not reachable by the TRS; the rewrite rule \(a\rightarrow b\) can only rewrite \(f(a)\) to \(f(b)\), but not \(g(b)\) to \(g(a)\). In contrast, tree automata completion will add the transition \(q_b\rightarrow q_a\). Recall that we say a term \(t\) is represented by an E-class \(q\) in an E-graph \(G\) if \(t\rightarrow_G^* q\). With the above transition, we have every term represented by \(q_a\) is now represented by \(q_b\), but not the other way around. As a consequence, \(f(b)\) is represented by the E-graph, since \[f(b)\rightarrow f(q_b)\rightarrow f(q_a)\rightarrow q_f,\] while \(g(a)\) is not in the E-graph.

This difference guarantees that tree automata completion will only contain terms that can is reachable by the TRS at any moment. Moreover, if tree automata completion terminates, it will compute exactly \(R^*(s)\). The actual tree automata completion is slightly more general than this: instead of considering the set of reachable terms of a single initial term, it considers the set of reachable terms of an initial tree automaton, which may contain an infinite set of terms. It turns out, although the set of reachable terms of a single term \(R^*(s)\) is always finite (and thus regular) if \(R\) is terminating, it is undecidable if the set of reachable terms is regular or not given an initial tree automaton even when \(R\) is terminating and confluent. To ensure the termination of tree automata completion even when the reachable set is not regular, researchers have proposed approximation algorithms for tree automata completion, which are useful for applications like program verification.

*Equivalence and pre-order*. One interesting way of viewing
tree automata completion is that *it generalizes the equivalence
relation in EqSat to a preorder*: EqSat maintains an equivalence
relation \(\approx\) between terms and
asserts \(l\sigma \approx r\sigma\) for
every left-hand side \(l\sigma\) and
right-hand side \(r\sigma\). EqSat also
guarantees that if \(t[a]\) is in the
E-graph and \(a\approx b\), then \(t[b]\) is also in the E-graph and \(t[a]\approx t[b]\)^{3}.
Tree automata completion, instead, maintains a *preorder*
relation \(\lesssim\) and asserts \(l\sigma\lesssim r\sigma\) for every
left-hand side \(l\sigma\) and
right-hand side \(r\sigma\). \(l\sigma\lesssim r\sigma\) and \(l\sigma\gtrsim r\sigma\) in tree automata
completion is equivalent to \(l\sigma \approx
r\sigma\) in equality saturation, and in such cases \(l\sigma\) and \(r\sigma\) can be viewed as one state.
Moreover, tree automata completion guarantees that if \(t[a]\) is in the tree automaton and \(a\lesssim b\), then \(t[b]\) is also in the tree automaton and
\(t[a]\lesssim t[b]\).

*Implementation of tree automata completion*. I have not
implemented tree automata completion, but it would be interesting to see
how to implement tree automata completion in an EqSat framework like
egg. It seems we only need to make two modifications: First, during
rewrite, instead of merging left-hand side and right-hand side, adding
an edge from the left-hand side to the right-hand side (or equivalently,
an \(\epsilon\)-edge from the
right-hand side to the left-hand side). As an optimization, we can merge
two states together if they are in the same strongly connected
component. Second, modify the matching procedure so that it will also
“follow” these \(\epsilon\)-transitions. The new matching
procedure can no longer be expressed as a conjunctive query, as opposed
to EqSat, and is more expensive to compute. In general, though, tree
automata completion has a higher time complexity than equality
saturation, since dealing with DAGs / SCCs are more difficult than
dealing with equivalences.

*The termination problem of TA completion*. We have shown
above that given a terminating TRS \(R\) and an initial term \(t\), tree automata completion is always
terminating but EqSat may not terminate, which shows that the
termination of tree automata completion does not imply the termination
of EqSat. But is the other direction true? Indeed, the termination of
EqSat does not imply the termination of tree automata completion as
well! To see this, consider \[\begin{align*}
f(x)&\rightarrow_R g(f(h(x)))\\
h(x)&\rightarrow_R b
\end{align*}\] For tree automata completion to terminate, the set
of reachable terms must be regular. However, for initial term \(f(a)\), the set of reachable terms is \[\{g^n(f(h^n(a)))\mid n\in \mathbb{N}\}\cup
\{g^n(f(h^m(b)))\mid n>m\},\] which is not regular. In EqSat,
because equivalence is bidirectional, all the \(h(x)\) are in the same E-class as \(b\), so the first rewrite rule can be
effectively viewed as \(f(x)\rightarrow_R
g(f(b))\), where the right-hand side is a ground term. As a
result, there are only a finite number of equivalence classes in the
theory defined by the rewrite rules, which implies the termination of
equality saturation.

The reader should treat E-graphs and tree automata as two interchangeable terms. An E-graph is just a deterministic finite tree automaton with no \(\epsilon\) transitions and no unreachable states. Moreover, all tree automata in this post contain no unreachable states. ↩︎

This can be shown via König’s lemma for trees. Notice that TRSs are always finitely branching and rewriting in terminating TRSs will not contain cycles.↩︎

Relations with these properties are known as partial strong congruences.↩︎