Ensuring the Termination of EqSat over a Terminating
Term Rewriting System
Yihong Zhang
Mar 17, 2023
Term rewriting is one of the most fundamental techniques in
programming languages. It is used to define program semantics, to
optimize programs, and to check program equivalences. An issue with
using term rewriting to optimize program is that, in a non-confluent
term rewriting system, it is usually not clear which rule should be
applied first, among all the possible rules. Equality saturation (EqSat)
is a variant of term rewriting that mitigates this so-called
Phase-Ordering Problem. In EqSat, all the rules are applied at the same
time, and the resulting program space is stored compactly in a data
structure called E-graph1.
EqSat has been shown to be very successful for program optimizations
and program equivalence checking, even when the given set of rewrite
rules are not terminating or even when the theory is not decidable in
general. However, despite its success in practice, there are no formal
guarantees about EqSat: for example, when does EqSat terminate, and if
it does not, how does one make it terminate. The first problem is known
in the term rewriting literature as the termination problem, and the
second is known as the completion problem. Both problems are very hard,
and there are a ton of literatures on both problems. In the setting of
EqSat, these problems are not only theoretically interesting, but also
have practical implications. For example, in program optimization, we
may want to get the most “optimized” term with regard to a given set of
rules, so making sure EqSat terminate is important to such optimality
guarantees. Or, some theories are decidable but deciding them is slow,
so one may want to speed up the reasoning by using EqSat, but there is
no point in “speeding up” the decision procedure if it simply does not
terminate. In this post, we will focus on the termination problem of
EqSat. We don’t attempt to solve this problem entirely, but rather have
this blog post as a first step and to draw community’s attention to this
problem. In fact, we found many interesting results about this
termination problem.
This post will show (1) how the innocent-looking associativity rule
can cause non-termination, (2) why a terminating, and even canonical,
term rewriting system does not necessarily terminate in EqSat, (3) how
to fix the above problem by “weakening” EqSat’s merge operation, and (4)
two potentially promising approaches to ensure the termination of EqSat.
One fascinating thing I found during this journey is that, researchers
working on tree automata indeed developed a technique almost identical
to EqSat, known as Tree Automata (TA) completion. Different from EqSat,
TA completion does not have the problem in (2) and is exactly the
algorithm we will show in (3). Moreover, there is a beautiful connection
between EqSat and TA completion: TA completion is the “preorder” version
of EqSat.
Term
rewriting 101: Ground theories are decidable via congruence closure
Before understanding why associativity can cause non-termination, let
us first briefly review some relevant backgrounds on ground theories and
congruence closure.
A ground equational theory is an equational theory induced by a
finite set of ground identities of the form , where both and are ground terms (i.e., no variables).
For example, below is an example of a ground theory over signature : All the equations that can be deduced from these
three identities hold in this equational theory. For example, we have
because .
Here, is
implied by . In
equational theory, a function by definition maps equivalent inputs to
equivalent outputs.
A classic result in term rewriting is that the word problem of ground
equational theory is decidable. A word problem asks whether two ground
terms and are equivalent in a given theory. In
general, this problem is undecidable. However, if the theory is ground,
several algorithms exist that decide its word problem. One of the most
well-known algorithm is the congruence closure algorithm of Downey, Sethi, and Tarjan.
One way to view it is that the congruence closure algorithm produces a
canonical term rewriting system for each input set of ground identities:
For theory , it builds an E-graph
of the theory. Every E-graph corresponds to a canonical term rewriting
system, which gives a way to decide . For example, the congruence closure
algorithm will produce the following E-graph for the theory above: where denote E-classes of the E-graph, and denote
E-nodes. This E-graph naturally gives the following canonical term
rewriting system , which rewrite
equivalent terms to the same e-class: Now, checking can be simply done by checking if there exists some normal
form such that holds.
For example, because
This is sound and always terminates, because the term rewriting
system produced by an E-graph is canonical—meaning every term will have
exactly one normal form and term rewriting always terminates.
Ground
associative theory does not terminate in EqSat
Associativity is a fundamental axiom to many algebraic structures
like semigroups, monoids, and groups. It has the following form: This rule can be oriented as (or ). It must be terminating, you may think, so
we can just apply the associativity until saturation in EqSat, which
will decide theories with associativity! Unfortunately, ground
associative theories are not decidable in general. Term Rewriting
and All That gives an example of undecidable associative theory (we
write for and for for brevity and associativity allows
us to drop brackets): There is another way to state this proposition
that appeals to math-minded folks: the word problem for finitely
presented semigroups are not decidable.
Because of this, associative rules do not terminate in EqSat in
general. Otherwise, given a set of ground identities , we can run the congruence closure
algorithm over to get an E-graph,
and run EqSat with associativity rules. When it reaches the fixed point
and terminates, this gives us a way to decide ground associative
theories, which is a contradiction to the fact that such theories are
not decidable.
To better understand why associativity does not terminate in EqSat,
here is an example: suppose
is associative and satisfy the ground identity for constants . Now suppose we orient this identity
into rewrite rule while having the associative rule . This is a terminating term-rewriting system (although not
confluent, because the term has two normal forms
and ).
However, this ruleset causes problems in EqSat: Starting with the
initial term , EqSat will
apply the rewrite rule and merge and
into the same E-class. The
E-graph will look like this:
Notice that because of the existence of cycles in this E-graph, it
represents not only the two terms
and but indeed an infinite
set of terms. For example, is explicitly represented by E-class because In fact,
represents the infinite set of
terms For any such
term , it can
be rewritten to a term of the form . Now, for associativity to terminate, the
output E-graph need to at least represent the set of terms ,
where any two terms are not equal. This requires infinitely many
E-classes, each represents some , while a finite E-graph will have
only a finite number of E-classes. Therefore, EqSat will not terminate
in this case.
Canonical
TRSs do not necessarily terminate in EqSat as well
In our last example, the term rewriting system is terminating, but is not confluent. Confluence means that
every term will have at most one normal form, and associativity is
usually not confluent. At first I thought maybe non-confluence is what
causes EqSat to not terminate. But this is not the case; there are
canonical (i.e., terminating + confluent) TRSs that are non-terminating
in EqSat. Here we give such an example: Let the TRS be
This is a terminating term rewriting system, where every term of the
form will have
the normal form , no matter
the order of rule application. However, this is not terminating in
EqSat: consider the initial term . Running the rule over the initial
E-graph will union and together, creating an infinite (but
regular) set of terms .
See figure.
An E-graph that represents
Now, by rule , each
will be rewritten into , so the output E-graph must contain for . But notice that the rule
set will not rewrite any to
for , which means that we have an
infinite set of inequivalent terms . Again, the existence of infinitely many e-classes, one
for each , implies that EqSat
will not terminate.
Tree Automata Completion
to the Rescue
For a terminating TRS, the set of reachable terms is always finite2. Intuitively, one will think that
EqSat is just a more powerful way of doing term rewriting. So it is
natural to think that running EqSat with a terminating TRS (with some
initial term ) will eventually
terminate. But this is not true, as has been shown in the last two
sections. The issue is because EqSat is not exactly term rewriting: the
equivalence in EqSat is bidirectional. For example, in our last example,
the rewrite from to does not only make the E-graph
represent these two terms, but also and and so on.
Before going further, let us first formally define the problem. For a
TRS , we define the set of
reachable terms . If
is terminating, is finite
for any term . It can also be
shown that EqSat always computes a superset of . A natural idea is that if our
EqSat procedure computes exactly , it should terminate for
terminating . And in fact it may
also be capable of handling some non-terminating TRSs: E-graphs can
represent many infinite sets of terms.
It turns out, term rewriting researchers have developed a technique
that computes exactly ,
represented as a tree automaton. The technique is known as tree
automata completion, which is the main technique I hope to
introduce in this blog post.
TA completion proceeds as follows: build an initial tree automaton and
run term rewriting over this tree automaton until saturation.
Specifically, it searches for left-hand sides of rewrite rules, build
and insert right-hand sides, and merge the left-hand sides with
right-hand sides. Does this sound familiar? Yes, this is EqSat! It is
striking that the program optimization and term rewriting communities
independently come up with essentially the same technique.
But wait a second, didn’t we just say EqSat does not necessarily
compute exactly? This is
correct. There is a single tweak that distinguishes tree automata
completion from EqSat. In tree automata completion, merging is performed
directionally. For example, suppose the left-hand side is in E-class
and right-hand side in E-class
, EqSat will basically rename
every occurrence of with (or vice versa). As a result the two
E-classes are not distinguishable after the merging. Tree automata
completion, on the other hand, performs the merging by adding a new
(-)transition (remember the TRS view
of an E-graph).
To better see the difference, consider the E-graph that represents
terms and the rewrite rule . EqSat will rename
with (or with ), so every E-node that points to
child (resp. ) now also points to (resp. ). The E-graph after the merging will
now contain . Note that among these terms, is not reachable by ; the rewrite rule can only rewrite to , but not to . In contrast, tree automata
completion will add the transition . Recall that we say a
term is represented by an E-class
in an E-graph if . With the new
transition , we
have every term represented by
is now represented by , but not
the other way around. As a consequence, is represented by the E-graph, since
while is not represented.
This difference guarantees that TA completion will only contain terms
that are reachable by the TRS. Moreover, if TA completion terminates, it
will compute exactly . The
actual TA completion is slightly more general than this: instead of
considering the set of reachable terms of a single initial term, it
considers the set of reachable terms of an initial tree automaton, which
may contain an infinite (but regular) set of terms. It turns out,
although the set of reachable terms is always finite (and thus
regular) for initial term if is
terminating, it is undecidable
if the set of reachable terms is regular or not for an initial tree
automaton even when is
terminating and confluent. To ensure the termination of TA completion
even when the reachable set is not regular, researchers have proposed
approximation algorithms for TA completion, which are useful for
applications like program verification.
Discussions on tree
automata completion
Equivalence and preorder. One interesting
way of viewing TA completion is that it generalizes the equivalence
relation in EqSat to a preorder: EqSat maintains an equivalence
relation between terms and
asserts for
every left-hand side and
right-hand side . EqSat also
guarantees that if is in the
E-graph and , then is also in the E-graph and 3. TA
completion, instead, maintains a preorder relation and asserts for every
left-hand side and
right-hand side . and in TA completion
is equivalent to in equality saturation, and in such cases and can be viewed as one state.
Moreover, TA completion guarantees that if is in the tree automaton and , then is also in the tree automaton and
.
Implementation of tree automata completion.
I have not implemented TA completion, but it would be interesting to see
how to implement TA completion in an EqSat framework like egg. It seems
we only need to make two modifications: First, during rewrite, instead
of merging left-hand side and right-hand side, add an edge from the
left-hand side to the right-hand side (or equivalently, an -transition from the right-hand
side to the left-hand side). As an optimization, we can merge two states
together if they are in the same strongly connected component. Second,
modify the matching procedure so that it will also “follow” these -transitions. The new matching
procedure can no longer be expressed as a conjunctive query, as opposed
to EqSat, and is more expensive to compute. In general, though, TA
completion has a higher time complexity than equality saturation, since
dealing with DAGs / SCCs are more difficult than dealing with
equivalences.
The termination problem of tree automata
completion. We have shown above that given a terminating
TRS and an initial term , tree automata completion is always
terminating but EqSat may not terminate, which shows that the
termination of tree automata completion does not imply the termination
of EqSat. But is the other direction true? Indeed, the termination of
EqSat does not imply the termination of tree automata completion as
well! To see this, consider For tree automata completion to terminate, the set
of reachable terms must be regular. However, for initial term , the set of reachable terms is which is not regular. In EqSat,
because equivalence is bidirectional, all the are in the same E-class as , so the first rewrite rule can be
effectively viewed as , where the right-hand side is a ground term. As a
result, there are only a finite number of equivalence classes in the
theory defined by these rewrite rules, which implies the termination of
equality saturation.
Practical approaches to
termination
So far we have shown that TA completion is a variant of EqSat that is
terminating for terminating TRS. But besides this we still have not
shown anything positive about the termination of EqSat itself. In
particular, although there have been research on when term rewriting
terminates and when TA completion terminates, neither of them implies
the termination of EqSat (and vice versa). My collaborators and I have
been thinking about the termination problem for a while, and we have yet
to come up with some non-trivial criteria4.
Despite this, in practice there are many tricks people can use to stop
EqSat early and still get relatively “complete” e-graphs. I will briefly
mention two of them below.
Depth-bounded equality saturation. Let us
define the depth of an E-class to be the smallest
depth possible among terms represented by the E-graph, namely . This is well-defined as we require all
E-classes to represent some terms. Now, given a limit on depth , depth-bounded equality saturation
maintains for
each E-class during equality saturation, and only apply a rewrite rule
when any of the created E-classes does not have a depth greater than
. Because there’s only finite
number of E-graphs with bounded depth5, depth-bounded EqSat
always terminates for any given .
There is something nice about depth-bounded EqSat. If two terms can
be proved equivalent without using any term with depth , depth-bounded EqSat can eventually
show their equivalence. This is also useful in program optimization,
where the optimal term is unlikely to be, say, 10 larger than the original term.
However, as I prototyped depth-bounded equality saturation a while ago,
I found depth-bounded EqSat still took a very long time to terminate
even for a reasonable . This
somehow makes sense, since the number of trees with bounded depths grows
rapidly.
Merge-only equality saturation. This idea
has been around for a while and I think was first came up with by Remy. It is also very natural: We only
apply the subset of rewrite rules if both the left-hand side and the
right-hand side are already present in the E-graph. These rewrite rules
essentially only merge E-classes together without creating any new
E-nodes and are obviously terminating. They are useful when you have run
EqSat for several iterations, want to stop there, but still want some
relatively complete result. Merge-only EqSat provides the guarantee that
if two terms can be proven equivalent using only terms in an E-graph
, they can be proven equivalent by
running merge-only EqSat over .
The reader should treat E-graphs and
tree automata as two interchangeable terms. An E-graph is just a
deterministic finite tree automaton with no transitions and no unreachable
states. Moreover, all tree automata in this post contain no unreachable
states.
↩︎
This can be shown via König’s lemma
for trees. Notice that TRSs are always finitely branching and
rewriting in terminating TRSs will not contain cycles.↩︎
Relations with these properties are
known as partial strong congruences.↩︎
There are some simple syntactic
criteria that we can borrow from the ones for TA completion. For
example, if all rules have right-hand sides with depth , equality saturation will always
terminate because applying rules won’t create new E-classes. Similarly,
if the right-hand sides are ground terms only, equality saturation will
also terminate. The two criteria can be further combined: if the
variables of the right-hand side terms only occur at depth , equality saturation will always
terminate. ↩︎
Every distinct e-class contains (at
least) one distinct term of depth . There are only finitely many
depth- terms, so finitely many
E-classes. Finally, there are finitely many ways to connect finitely
many E-classes.↩︎