Expansion Trees with Cut


Federico Aschieri (funded by FWF Lise Meitner Grant M 1930–N35 and START project Y544–N23)
Institute of Discrete Mathematics and Geometry
Vienna University of Technology
Vienna, Austria
Stefan Hetzl (funded by the WWTF Vienna Research Group (VRG) 12-004)
Institute of Discrete Mathematics and Geometry
Vienna University of Technology
Vienna, Austria
Daniel Weller
Institute of Discrete Mathematics and Geometry
Vienna University of Technology
Vienna, Austria

Herbrand’s theorem is one of the most fundamental insights in logic. From the syntactic point of view, it suggests a compact representation of proofs in classical first- and higher-order logic by recording the information of which instances have been chosen for which quantifiers.
This compact representation is known in the literature as Miller’s expansion tree proof. It is inherently analytic and hence corresponds to a cut-free sequent calculus proof. Recently several extensions of such proof representations to proofs with cuts have been proposed. These extensions are based on graphical formalisms similar to proof nets and are limited to prenex formulas.
In this paper we present a new syntactic approach that directly extends Miller’s expansion trees by cuts and covers also non-prenex formulas. We describe a cut-elimination procedure for our expansion trees with cut that is based on the natural reduction steps and show that it is weakly normalizing.

1 Introduction

Herbrand’s theorem [14, 7], one of the most fundamental insights of logic, characterizes the validity of a formula in classical first-order logic by the existence of a propositional tautology composed of instances of that formula.

From the syntactic point of view this theorem induces a way of describing proofs: by recording which instances have been picked for which quantifiers we obtain a description of a proof up to its propositional part, a part we often want to abstract from. An example of a formalism that carries out this abstraction is Herbrand proofs [7]. This idea generalizes nicely to most classical systems with quantifiers, in particular to simple type theory, as in Miller’s expansion tree proofs [20]. Such formalisms are compact and useful proof certificates in many situations; they are for example produced naturally by methods of automated deduction such as instantiation-based reasoning [18], and they play a central role in many proof transformations in the GAPT-system [11].

These formalisms consider only instances of the formula that has been proved and hence are analytic proof formalisms, corresponding to cut-free proofs in the sequent calculus. Considering an expansion tree to be a compact representation of a proof, it is thus natural to ask about the possibility of extending this kind of representation to non-analytic proofs, corresponding to proofs with cut in the sequent calculus.

In addition to enlarging the scope of instance-based proof representations, the addition of cuts to expansion trees promises to shed more light on the computational content of classical logic. This is a central topic of proof theory and has therefore attracted considerable attention, see  [22, 2, 10, 9], [24, 25], or [6], for different investigations in this direction.

Two proof formalisms manipulating only formula instances and incorporating a notion of cut have recently been proposed: proof forests [13] and Herbrand nets [19]. While some definitions in the setting of proof forests are motivated by the game semantics for classical arithmetic [8], Herbrand nets are based on methods for proof nets [12]. These two formalisms share a number of properties: both of them work in a graphical notation for proofs, both deal with prenex formulas only, for both weak but no strong normalization results are known.

In this paper we present a purely syntactic approach to the topic. We start from expansion tree proofs, add cuts and define cut-reduction rules, naturally extending the existing literature in this tradition. The result is a rewriting theory of expansion trees with cut. The hallmark of a good rewriting theory is that the syntax should be simple and the reduction rules as few and as elementary as possible. When a rewriting system falls short of these requirements, reasoning about its combinatorial properties easily becomes unwieldy; when it satisfies them, that is always a good sign. Indeed, expansion trees are by design compact strings of symbols, expansion proofs are just lists of such trees, and the reduction rules that we shall present are straightforward manipulations of those lists. This is a novel technical achievement. Graph-based formalisms like proof forests and Herbrand nets allow rather simple mathematical definitions of the forests and their transformations, but as soon as one tries to write them down syntactically, their rewriting complexity becomes evident. A simple rewriting theory may help to solve the intricate combinatorial problems that arise, such as strong normalization.

With respect to proof forests, the main related work, we offer several technical novelties.

Miller’s correctness criterion. Expansion trees are just simple collections of witnesses for quantifiers, so not every tree makes logical and semantical sense. Miller’s correctness criterion [20] is the most direct known way to express that an expansion tree (list) is sound: it requires a certain ordering of the tree nodes to be acyclic, maps the tree to a propositional formula and asks that formula to be a tautology. Syntactically, the definition of Miller’s criterion follows the tree’s shape in a straightforward way, and the resulting propositional formula matches exactly the tree’s number of leaves. Semantically, Miller’s criterion states that a list of expansion trees represents a winning strategy in Coquand’s backtracking games [8]. Though Heijltjes’ correctness criterion was also motivated by Coquand’s game semantics, it represents a different way of extracting a propositional formula from a list of expansion trees: the formula is constructed from a case distinction on all cuts, rendering its size exponential with respect to the number of cuts. On our side, we managed to keep Miller’s correctness criterion unchanged. As a by-product, we also effortlessly obtain a treatment of non-prenex formulas. This avoids not only the distortion of the intuitive meaning of a formula by prenexification, but also the non-elementary increase in complexity that prenexification can cause [5]. It also seems possible to extend proof forests and Herbrand nets to non-prenex formulas, but this has not been done in [13].

Cut-Reductions. Eliminating cuts from expansion proofs resembles a Coquand game between expansion trees, when they are interpreted as strategies. Following this game-semantics analogy, one would thus expect, during cut-elimination, to encounter only new trees whose branchings are isomorphic to sub-trees of the original expansion proof. This, however, does not happen in the theory of proof forests and Herbrand nets: the restructuring performed during cut-elimination is significant, and trees eventually become much bigger than the original ones, due to an operation of copying and glueing them together. Though we are not pursuing the game-semantics analogy here, we define cut-reduction steps that do satisfy the mentioned condition. The gain lies entirely in the rewriting theory of expansion proofs: cut-reduction only involves the operations of copying, decomposing, substituting terms and renaming variables, applied to subtrees of the original ones.

Pruning and Bridges. In proof forests an unexpected technical issue arises. Cut-reductions create some unwanted “bridges” that cause non-termination of cut-elimination. Therefore, additional restructuring of the forest is needed, this time in terms of scissors, cutting those bridges. Here we show that bridges are not an issue at all and our cut-elimination terminates, regardless of bridges. Again, in this way we avoid an additional layer of rewriting complexity.

1.1 Plan of the Paper

In Section 2, we modify Miller’s concept of expansion proof in order to also include special pairs of expansion trees, which represent logical cuts. In Section 3, we show that our expansion proofs are sound and complete with respect to first-order classical validity. In Section 4, we define a cut-elimination procedure which transforms any expansion proof with cuts into a cut-free one.

2 Expansion Trees

In this entire paper we work with classical first-order logic. Formulas and terms are defined as usual. In order to simplify the exposition, we restrict our attention to formulas in negation normal form (NNF) and without vacuous quantifiers. Mutatis mutandis all notions and results of this paper generalize to arbitrary formulas. We write for the de Morgan dual of a formula . A literal is an atom or a negated atom . We start by defining Miller’s concept of expansion tree [20].

Definition 1 (Expansion Trees).

Expansion trees and a function (for shallow) that maps an expansion tree to a propositional formula are defined inductively as follows:

  1. A literal is an expansion tree with .

  2. If and are expansion trees and , then is an expansion tree with .

  3. If is a sequence of terms and are expansion trees with for , then is an expansion tree with .

  4. If is an expansion tree with , then is an expansion tree with .

The of point 3 are called -expansions and the of point 4 are called -expansions; both - and -expansions are called expansions. The variable of a -expansion is called the eigenvariable of this expansion. We say that dominates all the expansions in . Similarly, dominates all the expansions in . We also say that is an expansion tree of . If is a sequence of expansion trees, we define .
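To make Definition 1 concrete, the following is a minimal Python sketch in an ad-hoc encoding of our own (the constructor names, tuple layout, and string rendering are assumptions, not the paper's notation): an expansion tree is a nested tuple, and the shallow function reads off the formula the tree expands.

```python
# Ad-hoc encoding (ours, not the paper's):
#   ("lit", l)                                -- clause 1, a literal string l
#   ("and", E1, E2) / ("or", E1, E2)          -- clause 2
#   ("ex", x, A, [(t1, E1), ..., (tn, En)])   -- clause 3, term/subtree pairs
#   ("all", x, A, alpha, E)                   -- clause 4, eigenvariable alpha

def sh(E):
    """Shallow formula of an expansion tree (sketch of Definition 1)."""
    tag = E[0]
    if tag == "lit":
        return E[1]
    if tag in ("and", "or"):
        return f"({sh(E[1])} {tag} {sh(E[2])})"
    if tag == "ex":
        _, x, A, _ = E
        return f"(exists {x}. {A})"
    _, x, A, _, _ = E          # tag == "all"
    return f"(forall {x}. {A})"

# The tree expanding "exists x. P(x)" with the two witness terms a and b:
T = ("ex", "x", "P(x)", [("a", ("lit", "P(a)")), ("b", ("lit", "P(b)"))])
print(sh(T))   # (exists x. P(x))
```

Note that the shallow formula ignores the expansions themselves: it only records which formula the tree is an expansion tree *of*.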

We recall now the definition of the propositional formula , which is used to state Miller’s correctness criterion for an expansion tree .

Definition 2.

We define the function (for deep) that maps an expansion tree to a propositional formula as follows:

If is a sequence of expansion trees, we define .
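Under the same ad-hoc tuple encoding as above (our assumption, not the paper's notation), the deep function can be sketched as follows: the instances of an existential expansion are disjoined, universal expansions are traversed into their single child, and propositional structure is preserved.

```python
# Ad-hoc encoding (ours): ("lit", l), ("and"/"or", E1, E2),
# ("ex", x, A, [(t1, E1), ..., (tn, En)]), ("all", x, A, alpha, E).

def dp(E):
    """Deep formula of an expansion tree (sketch of Definition 2)."""
    tag = E[0]
    if tag == "lit":
        return E[1]
    if tag in ("and", "or"):
        return f"({dp(E[1])} {tag} {dp(E[2])})"
    if tag == "ex":
        # disjoin the deep formulas of all instance subtrees
        return "(" + " or ".join(dp(Ei) for _, Ei in E[3]) + ")"
    return dp(E[4])            # "all": the single child below the eigenvariable

T = ("ex", "x", "P(x)", [("a", ("lit", "P(a)")), ("b", ("lit", "P(b)"))])
print(dp(T))   # (P(a) or P(b))
```

In contrast to the shallow formula, the deep formula is quantifier-free: it is the propositional formula on which the tautology part of the correctness criterion is stated.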

Cuts are simply defined as pairs of expansion trees whose shallow formulas are each other’s involutive negation.

Definition 3 (Cut).

A cut is a set of two expansion trees s.t. . A formula is called positive if its top connective is or or a positive literal. An expansion tree is called positive if is positive. It will sometimes be useful to consider a cut as an ordered pair: to that aim we will write a cut as with parentheses instead of curly braces with the convention that is the positive expansion tree. For a cut , we define which is also called cut-formula of . We define . If is a sequence of cuts, we define and .

For each expansion tree we now define the set of finite formulas and number sequences, representing all formulas that one encounters and all branch choices one makes in any complete path from the tree’s root to one of its leaves. This concept will soon be needed for defining correctness of expansion proofs.

Definition 4 (Formula Branch).

We define a function (for branch) that maps an expansion tree with merges to a finite set , where each is some list made of formulas and the integers or .

For every cut we define .

A very simple property that we shall use without further mentioning is the following.

Lemma 1.

Let be an expansion tree and a sub-tree of . Then there is a formula sequence such that for every , it holds that .


By a straightforward induction on . ∎

Expansion proofs will be defined as sequences of expansion trees and cuts satisfying a number of properties. The correctness criterion of expansion tree proofs [20], as well as those of proof forests [13] and Herbrand nets [19], have two main components: 1. a tautology condition on one or more quantifier-free formulas and 2. an acyclicity condition on one or more orderings. These conditions can be interpreted logically as ensuring that expansion proofs represent logical proofs, and semantically as defining winning strategies in Coquand games with backtracking [13, 1]. While the tautology condition of [20] generalizes to the setting of cuts in a straightforward way, the acyclicity condition needs a bit more work: in the setting of cut-free expansion trees it is enough to require the acyclicity of an order on the -expansions. In our setting, which includes cuts, we also have to speak about the order of cuts (w.r.t. each other and w.r.t. -expansions). To simplify our treatment of this order we also include -expansions. Together this leads to the following inference ordering constraints.

Definition 5 (Dependency Relation).

Let , where is a sequence of cuts and a sequence of expansion trees. We will define the dependency relation , which is a binary relation on the set of expansions and cut occurrences in . First, we define the binary relation (writing if is clear from the context) as the least relation satisfying:

  1. if is an -expansion in whose term contains the eigenvariable of the -expansion

  2. if is an expansion in that dominates the expansion

  3. if is an expansion of the cut in

  4. if is a cut and contains the eigenvariable of the -expansion

is then defined to be the transitive closure of . Again, we write for if is clear from the context.

As clauses 1–4 never relate two cuts, there is no -cycle containing cuts only; thus is cyclic iff for an expansion . We will make use of this property without further mention.
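Once the immediate dependencies of clauses 1–4 have been collected as edges, checking whether the transitive closure is acyclic is a routine graph computation. The sketch below (representing expansions and cuts by plain labels is our simplification) tests whether any node reaches itself through one or more edges.

```python
from collections import defaultdict

def is_acyclic(edges):
    """edges: iterable of (a, b) pairs, meaning a is immediately below b
    in the dependency relation.  Returns True iff the transitive closure
    relates no node to itself."""
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)

    def reaches_itself(start):
        # depth-first search for a path of length >= 1 back to start
        seen, stack = set(), [start]
        while stack:
            n = stack.pop()
            for m in succ[n]:
                if m == start:
                    return True
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return False

    return not any(reaches_itself(n) for n in list(succ))

print(is_acyclic([("a", "b"), ("b", "c")]))   # True
print(is_acyclic([("a", "b"), ("b", "a")]))   # False: a < b < a is a cycle
```

Since no two cuts are directly related, any cycle must pass through an expansion, which matches the observation above.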

We now define the concept of expansion proof. In the following, lists of expansion trees and cuts will be identified modulo permutation of their elements.

Definition 6 (Expansion Proofs).

Let be a sequence of cuts and let be a sequence of expansion trees. Let . We define , which corresponds to the end-sequent of a sequent calculus proof, and , which is a sequent of quantifier-free formulas, and . Then is called an expansion proof whenever:

  1. (weak regularity) for every and in , if and , then , and are both trees or both cuts.

  2. (acyclicity) is acyclic, that is, holds for no .

  3. (validity) is a tautology, that is, a valid sequent.

  4. (eigenvariable condition) For every -expansion in , the variable does not occur in .

An important difference between expansion proofs and other formalisms, such as proof forests [13] and Herbrand nets [19], is that the same -expansion can occur multiple times. This phenomenon is very natural, as soon as one realizes that the weak regularity condition that we have imposed corresponds to an interpretation of eigenvariables as Skolem functions. Namely, weak regularity ensures that the same witness is only used for the same formula with the same parameters, and that an expansion proof can always be transformed into one satisfying the usual regularity condition that every -expansion occurs exactly once [20]. This last property, which we shall not prove, guarantees that we are still working with the familiar objects. Our weak regularity condition offers, however, a great technical advantage: the definition of cut-reduction becomes much easier, as it avoids the heavy restructuring of the expansion trees which would be needed to prevent duplication of -expansions.

Conditions 2 and 3 embody Miller’s correctness criterion. Condition 4 could be formulated as asking that does not contain free variables. The real trouble is that if is such that contains free variables, then the eigenvariable of some -expansion may be contained in , so that would not represent a proof of . This issue will become transparent in Section 3, where we show that the notion of expansion proof indeed yields a sound and complete proof system with respect to classical first-order validity. Moreover, again because the eigenvariable of some -expansion could be contained in , without condition 4 the notion of expansion proof would not be closed under the cut-reduction that we shall provide in Section 4.
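The validity condition is a purely propositional check, so in principle it can be decided by truth tables. The sketch below uses a toy encoding of quantifier-free formulas of our own (the tuple tags are assumptions, not the paper's notation) and evaluates a formula under every assignment to its atoms.

```python
from itertools import product

# Toy encoding of quantifier-free formulas (ours):
# ("atom", name), ("not", F), ("and", F, G), ("or", F, G)

def atoms(F):
    """Collect the atom names occurring in F."""
    if F[0] == "atom":
        return {F[1]}
    return set().union(*(atoms(G) for G in F[1:]))

def ev(F, v):
    """Evaluate F under the assignment v: atom name -> bool."""
    tag = F[0]
    if tag == "atom":
        return v[F[1]]
    if tag == "not":
        return not ev(F[1], v)
    if tag == "and":
        return ev(F[1], v) and ev(F[2], v)
    return ev(F[1], v) or ev(F[2], v)

def is_tautology(F):
    names = sorted(atoms(F))
    return all(ev(F, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

# Excluded middle is a tautology; a bare atom is not.
print(is_tautology(("or", ("atom", "P"), ("not", ("atom", "P")))))   # True
print(is_tautology(("atom", "P")))                                   # False
```

In practice one would use a SAT solver rather than truth tables, but the exponential check suffices to make the condition precise.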

Example 1.

Consider the straightforward proof of from via a cut on . In negation normal form these formulas are , , and . The proof will be represented by the expansion proof where

We have and

Note that is a tautology (of the form ). Let us now consider the dependency relation induced by : in each term belongs to at most one - and at most one -expansion. In such a situation we can uniformly write all expansions as for some term and . The expansions of are then written as , , , , , , and . Furthermore, contains a single cut . Then is exactly:

  1. , , ,

  2. , , ,

  3. , , , ,

  4. there is no as the cut formula of is variable-free.

As the reader is invited to verify, is acyclic.

3 Expansion Proofs and Sequent Calculus

In this section we will clarify the relationship between our expansion proofs and the sequent calculus. The concrete version of the sequent calculus is of no significance to the results presented here; they hold mutatis mutandis for every version that is common in the literature. For technical convenience, we treat sequents as multisets of formulas in Section 3.1 and as sets of formulas in Section 3.2.

Definition 7.

The calculus is defined as follows: initial sequents are of the form for an atom . The inference rules are

with the usual side conditions: must not appear in and must not contain a variable which is bound in .

An -proof is called regular if any two -inferences have different eigenvariables, which are moreover different from the free variables in the conclusion of the proof. From now on we assume w.l.o.g. that all -proofs are regular.

3.1 From Sequent Calculus to Expansion Proofs

In this section we describe how to read off expansion trees from -proofs (with sequents as multisets), thus obtaining a completeness theorem for expansion proofs. For representing a formula that is introduced by (implicit) weakening we use the natural coercion of into an expansion tree, denoted by : , ( fresh), , for atomic. For a sequent we define .

Definition 8.

For an -proof define the expansion proof by induction on :

  1. If is an initial sequent , thus with atomic, then

  2. If with and where and , then .

  3. If with where and , then .

  4. If with where , then .

  5. If with where and , then .

  6. If for positive with and where and , then .

Theorem 2 (Completeness).

If is an -proof of a sequent , then is an expansion proof such that . If is cut-free, then so is .


That satisfies weak regularity and the eigenvariable condition follows directly from the definitions, as we are dealing with regular -proofs only; thus we are constructing regular expansion proofs as well. By a straightforward induction on one shows that is a tautology. Acyclicity is also shown inductively by observing that if is a free variable in the end-sequent of , then is not an eigenvariable in . This implies that if is the new expansion introduced in the construction of , and is an old expansion in , then , which in turn yields acyclicity. ∎

3.2 From Expansion Proofs to Sequent Calculus

In this section we show how to construct an -proof (with sequents as sets) from a given expansion proof. To this aim we introduce a calculus , generalizing the treatment in [20], that works on sequences of expansion trees and cuts instead of sequents of formulas.

Definition 9.

The axioms of are of the form for an atom . The inference rules are

with the following side conditions: for the cut rule; for the second rule; the eigenvariable condition for the rule: must not occur in .

The reader is invited to note that does not include the cut formulas of , though they may – and indeed often have to – contain the eigenvariable . An important feature of the above calculus, which is easily verified, is that if is an -proof, then – defined as the result of replacing in each sequence of expansion trees and cuts with – is a -proof. In the following proof we describe how to transform expansion proofs to -proofs.

Theorem 3 (Soundness).

If is an expansion proof of a sequent , then there is an -proof of . If is cut-free, then so is the -proof.


It is enough to construct an -proof of , as then is a proof of . The construction will be carried out by induction on the number of nodes in . The inductive statement we are going to prove is: if is an expansion proof, then there is a -proof of .

If contains only literals, the thesis is obvious.

If for some , and , then is a strictly smaller expansion proof. By the induction hypothesis we obtain an -proof of , from which a proof of is obtained by an -inference. For , we proceed analogously.

So assume there are no top-level conjunctions or disjunctions. We observe that for any non-top-level quantifier expansion there is some top-level quantifier expansion that dominates it and is smaller than it according to . Thus the -minimal quantifier expansions are all top-level. By the acyclicity of there must be a -minimal quantifier expansion or cut. If the -minimal element is a quantifier expansion, then it is a top-level one.

For the case of cut, we proceed as follows: let be a cut minimal with respect to (if does not begin with a quantifier the argument is easier). Then , which, by weak regularity of , forces every element of containing to be a cut of the shape . Then, we can write

where does not occur in . Now,


are tautologies. To prove weak regularity of and , we observe that and are contained in ; the only issue is when a branch belongs to . In this case, in order to show weak regularity, we have to show that for every branch , we have that is empty and the branch is the branch of a tree. By weak regularity of , the same branch was in a branch of a cut. Thus, it is enough to observe that belongs to some cut , for some , otherwise would belong to some cut in , impossible by construction of . Furthermore, the orderings of the expansions and cuts of and are suborderings of , hence also acyclic. Last, and contain the same free variables of plus those of ; now, no -expansion of can have occur in , otherwise , contradicting the minimality assumption on , so we have that the eigenvariable condition holds. Then, by the induction hypothesis we obtain -proofs of and respectively, from which a proof of is obtained by a cut.

For the case of the minimal node being an -expansion, let be an expansion tree of such that is minimal with respect to . As we said, occurs at top level. We move all top-level to the end of the lists of -expansions relative to the corresponding top-level formula. In this way, we can rewrite as

in such a way that: for every expansion tree in ; there are such that and implies . Let

Then , so they are both tautologies. To prove weak regularity of , we first observe that every is either already contained in or or , with and and in . Thus the only problematic case is when belongs to but not to the branches of the new trees of , while, for instance, , with . We show that it cannot be the case that and : assume for the sake of contradiction that it is. Since , we have , by weak regularity of . Therefore, , so without loss of generality, say . But and since by construction, we have a contradiction. Furthermore, the orderings of the expansions and cuts of are suborderings of , hence also acyclic. Last, no -expansion of can have occur in , otherwise either occurs in or already occurs in , against the assumption on the minimality of in the first case, and against the assumption that satisfies the eigenvariable condition in the second case. Then, by the induction hypothesis we obtain a -proof of , from which a proof of is obtained by the second rule and a number of applications of the first rule, taking care of the rewriting of that we made.

For the case of the minimal node being a -expansion, let be an expansion tree of such that is minimal with respect to . As we said, occurs at top-level. Then , which, by weak regularity of , forces every element of containing to be an expansion tree of the shape . Then, we can write

where does not occur in . Now, , so they are both tautologies. To prove weak regularity of , it is enough to note that every is either already contained in or , for and is in . Thus the only problematic case is when belongs to but not to the branches of the new trees of , while, for instance, , with . We show that it cannot be the case that and : assume for the sake of contradiction that it is. We first notice that . Moreover, since , we have , by weak regularity of . Therefore, and we have a contradiction. Furthermore, the orderings of the expansions and cuts are suborderings of , hence also acyclic. Last, no -expansion of can have occur in , otherwise either or already occurs in , against the assumption that does not occur in or against the weak regularity of in the first case, and against the assumption that satisfies the eigenvariable condition in the second case. Then, by the induction hypothesis we obtain a -proof of , from which a proof of is obtained by the rule, because does not occur in . ∎

4 Cut-Elimination

Given any expansion proof there is always a cut-free expansion proof of : by the soundness Theorem 3, transform into an -proof, then perform Gentzen cut-elimination to obtain a cut-free proof; finally, map it back to a cut-free expansion proof by the completeness Theorem 2. Nevertheless, the interest of expansion proofs is that they make it possible to investigate the combinatorics and the computational meaning of cut-elimination, with the additional advantage of factoring out tedious structural rules such as cut-permutations. In this section we define a natural reduction system for expansion proofs whose normal forms are cut-free expansion proofs. We prove weak normalization and discuss the status of other properties, such as strong normalization and confluence, in comparison to other systems from the literature.

4.1 Cut-Reduction Steps

In the following, by a substitution we mean, as usual, a finite map from variables to first-order terms; if is any syntactic expression, denotes the expression resulting from after simultaneous replacement of each variable in the domain of with its image. To make sure that the application of a substitution transforms expansion trees into expansion trees, we restrict the set of permitted substitutions: a substitution can only be applied to an expansion tree if it acts on the eigenvariables of as a renaming; more precisely, if is a -expansion of , then must be a variable. Otherwise would destroy the -expansions.
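The restriction on permitted substitutions amounts to a simple check. In the sketch below (the function name and the identifier-based variable test are our assumptions: terms are plain strings, and a variable is a bare identifier) a substitution is given as a dict from variables to term strings and is applicable only if it sends every eigenvariable to a variable.

```python
def is_permitted(sigma, eigenvars):
    """sigma: dict mapping variables to term strings; eigenvars: the set of
    eigenvariables of the tree.  Permitted iff sigma acts on eigenvariables
    as a renaming, i.e. maps each of them to a bare variable."""
    def is_variable(term):
        # crude assumption: variables are bare identifiers,
        # compound terms like "f(c)" are not
        return term.isidentifier()
    return all(is_variable(sigma.get(a, a)) for a in eigenvars)

# Renaming eigenvariable a to b is allowed, even while x gets a compound term:
print(is_permitted({"a": "b", "x": "f(c)"}, {"a"}))   # True
# Mapping eigenvariable a to the compound term f(c) is forbidden:
print(is_permitted({"a": "f(c)"}, {"a"}))             # False
```

Variables outside the domain of the substitution are left unchanged (hence `sigma.get(a, a)`), which is always a renaming.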

While presenting our cut-reduction steps, we have to take into account weak regularity: cut-reduction will duplicate sub-proofs, making it necessary to discuss the renaming of variables, as in the case of the sequent calculus. We will carefully indicate, in the case of a duplication, which subtrees should be subjected to a variable renaming and which variables are to be renamed.

Definition 10 (Cut-Reduction Steps).

The cut-reduction steps, relating expansion proofs and written , are:

Quantifier Step

where: ; the -expansions do not occur in ; no cut in has shallow formula ; are renamings to fresh variables of the eigenvariables of such that for some and occurrence of we have .

Propositional Step

where and for every in , .

Atomic Step

These reduction rules are very natural: atomic cuts are simply removed and propositional cuts are decomposed. The reduction of a quantified cut is intuitively appealing when thinking about cut-elimination in the sequent calculus: existential cuts are replaced by cuts on a disjunction of the instances. The fact that at least one of these rules can be applied to any expansion proof containing cuts will be proved in Theorem 6.
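Using the tuple encoding of trees from earlier (our assumption, not the paper's notation), the atomic and propositional steps can be sketched directly: an atomic cut leaves no residual cuts, and a cut between a conjunction and its dual disjunction splits into two component cuts, pairing each conjunct with its dual disjunct.

```python
def reduce_atomic(cut):
    """Atomic step sketch: a cut between dual literals is simply removed."""
    pos, neg = cut
    assert pos[0] == "lit" and neg[0] == "lit"
    return []   # no residual cuts

def reduce_prop(cut):
    """Propositional step sketch: a cut between ("and", E1, E2) and the
    dual ("or", F1, F2) decomposes into the cuts {E1, F1} and {E2, F2}."""
    pos, neg = cut
    assert pos[0] == "and" and neg[0] == "or"
    return [(pos[1], neg[1]), (pos[2], neg[2])]

c = (("and", ("lit", "P"), ("lit", "Q")),
     ("or", ("lit", "~P"), ("lit", "~Q")))
print(len(reduce_prop(c)))   # 2: one cut per conjunct/disjunct pair
```

The quantifier step is not shown here: it additionally involves substitution and the renaming of eigenvariables described in the rule, which depend on the global structure of the proof.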

One may think that the quantifier reduction rule already incorporates a reduction strategy, because several cuts are reduced in parallel. However, a strategy implies a choice and there is no real choice here: when the main eigenvariable occurs in other cuts, all these cuts have to be regarded as linked together, otherwise reducing one of them would destroy the soundness of the others. Moreover, all the cuts with the same shallow formula must be reduced, otherwise weak regularity would not be preserved.

The reason why only the eigenvariables greater than some are renamed is that these are the variables indirectly affected by the substitutions . Semantically, the witnesses that these variables represent are influenced by the substitutions, so for each of them a new collection of eigenvariables is created.

One surprising aspect of the quantifier-reduction rule is the presence of , without a substitution applied, on the right-hand side of the rule: in general, will contain , and one would expect that occurrences of are redundant (since is “eliminated” by the rule). The reason why this occurrence of must be present is that is not, in fact, eliminated since some might contain it. This situation occurs, for example, when translating from a regular -proof where an -quantifier may be instantiated by any term, and we happen to choose an eigenvariable from a different branch of the proof. In the sequent calculus, this situation can in principle be avoided by using a different witness for the -quantifier, but realizing such a renaming in expansion proofs is technically non-trivial due to the global nature of eigenvariables. For simplicity of exposition, we therefore allow this somewhat unnatural situation and leave a more detailed analysis for future work. Note that this -st copy is reminiscent of the duplication behavior of the -calculus [17], see [21] for a contemporary exposition in English.

Remark 1 (On Bridges).

We note that this phenomenon also occurs in the proof forests of [13], where it is an example of bridge. Bridges, when ignored, can generate cycles in the dependency relation . In [13], they are addressed with a pruning reduction that eliminates them and the weak normalization proof of that system depends on this pruning. In our setting, we do not need additional machinery for proving weak normalization (see Section 4.3). The reason is our renaming policy: while in [13] every occurrence of every variable above in the dependency relation is renamed, in our case only some of those occurrences are renamed, namely those which are not in or in . In such a way, bridges are broken by our cut-reduction step, so that cycles in the dependency relation cannot be generated. Furthermore, the counterexample to strong normalization from [13] also contains a bridge; we investigate (a translation of) this counterexample in Section 4.4 and find that it is not a counterexample in our setting for the reasons explained.

Example 2.

We now consider an example of cut reduction steps, in particular when an eigenvariable occurs more than once.