# Properties and Extensions of Alternating Path Relevance - I

###### Abstract

When proving theorems from large sets of logical assertions, it can be helpful to restrict the search for a proof to those assertions that are relevant, that is, closely related to the theorem in some sense. For example, in the Watson system, a large knowledge base must rapidly be searched for relevant facts. It is possible to define formal concepts of relevance for propositional and first-order logic. Various concepts of relevance have been defined for this, and some have yielded good results on large problems. We consider here in particular a concept based on alternating paths. We present efficient graph-based methods for computing alternating path relevance and give some results indicating its effectiveness. We also propose an alternating path based extension of this relevance method to DPLL with an improved time bound, and give other extensions to alternating path relevance intended to improve its performance.

###### Keywords:

Theorem Proving, Resolution, Relevance, Satisfiability, DPLL

## 1 Introduction

In some applications, there may be knowledge bases containing thousands or even millions of assertions. Selecting relevant assertions has the potential to significantly reduce the cost of finding a proof of a desired conclusion from such large knowledge bases and even from smaller ones. Relevance can be defined in many ways. In first-order clausal theorem proving, a concept of relevance based on "alternating" paths between clauses [PY03] permits relevant facts to be chosen automatically. Here we present efficient graph-based methods for computing alternating path relevance. We give some theoretical and practical results indicating its effectiveness. We also present a relationship between alternating path relevance and the set of support strategy [WRC65] in theorem proving. We incorporate this approach to relevance into the DPLL method. Finally, we present some extensions to alternating path relevance with a view to improving its effectiveness.

### 1.1 Related Work

Meng and Paulson [MP09] describe a relevance approach in which clauses are relevant if they share enough symbols with clauses that have previously been found to be relevant. They give clauses a "pass mark," a number between 0 and 1. This is used as a filter, and the test becomes increasingly strict as the distance from the conjecture increases. Their method makes use of two parameters, whose values they finally chose based on experiments. This approach significantly improves the performance of the Sledgehammer theorem prover [MQP06] used with Isabelle [WPN08]. They also tried the alternating path relevance approach [PY03], but apparently did not set a bound on relevance distance, but rather included all of the relevant clauses.

Pudlak [Pud07] defines relevance in terms of finite models. The idea is to find a set of clauses such that every finite model constructed so far that satisfies this set also satisfies the theorem. Such a set of clauses is a candidate for a sufficient set for proving the theorem. Clauses are added to the set by constructing models of it that do not satisfy the theorem, and finding clauses that contradict such models. This approach is attractive because humans seem to use semantics when proving theorems. It had good results on the bushy division of MPTP [Urb04], and was extended in the SRASS system [SP07] where it was among the most successful systems in the MPTP division. The SRASS system uses a syntactic similarity measure in addition to semantics, as an aid in ordering the axioms. However, the basic system [Pud07] appears to be inefficient in the presence of a large number of clauses. SRASS is also resource intensive for large theories.

The MaLARea system [Urb07] uses machine learning from previous proof attempts to guide the proof of new problems. It uses a Bayesian learning system to choose a relevance ordering of the axioms based on the symbols that appear in the conjecture. In each proof attempt, the $k$ most relevant clauses are used for various values of $k$, and various time limits are tried. It has given good performance on the "chainy" division of MPTP, on which it solved more problems than E, SPASS, and SRASS. This system has been modified [USPV08] to take into account semantic features of the axioms in choosing the relevance ordering, similar to SRASS, and the combined system has outperformed both MaLARea and SRASS on the MPTP problems. The basic MaLARea system has been combined with neural network learning and has performed very well on the large theory batch division in the CASC competitions [Sut16] in 2017 and 2018. The Divvy system [RPS09] also orders the axioms, but does so solely on the basis of a syntactic similarity measure. For each proof attempt, a subset of the most relevant axioms is used. This system has obtained good results on the MPTP challenge problems.

The Sine Qua Non system [HV11] evaluates how closely two clauses are related by how many symbols they share, and which symbols they share. It also considers the length of paths between clauses as a measure of how closely they are related. The d-relevance idea of Sine Qua Non is straightforward, but in order to reduce the number of relevant axioms, a triggering technique is used. However, using the least common symbol as a trigger doesn't work well, so they add a tolerance, a depth limit, and a generality threshold. Experiments revealed that some of these parameters don't make much of a difference, but the depth limit at least turned out to be important. Tolerance was less important for the problems tried. This relevance measure can be computed very fast even for large sets of clauses. This approach has performed very well in the large theory division of CASC, and has been used by Vampire [RV99] and by some competing systems as well. However, sometimes clauses may be closely related in the Sine Qua Non system but not in the alternating path approach; some pairs of literals are closely related in the former system but unrelated in the latter.

These systems at least show that relevance can be effective in aiding theorem provers on large knowledge bases. In particular, relevance measures that can be computed quickly may lead to spectacular increases in the effectiveness of theorem provers on very large knowledge bases. Even if the original knowledge base is small, large numbers of clauses may be generated during a proof attempt, and relevance techniques may help to select derived clauses leading to a proof.

## 2 Terminology

### 2.1 Connectedness of two clauses

This section contains basic definitions related to alternating paths. Some of the formalism is new. Standard definitions of terms, clauses, sets of clauses, substitutions, resolution, resolution proof, and satisfiability in first-order logic are assumed. If $A$ is an atom, then $A$ and $\neg A$ are literals, and each is the complement of the other. If $L$ is $\neg A$ then $\neg L$ will generally denote $A$.

The number of clauses in a set $S$ of first-order clauses is denoted by $|S|$.

The relation $\equiv$ on literals denotes syntactic identity. That is, it is the smallest relation on literals such that for all atoms $A$, $A \equiv A$, $\neg A \equiv \neg A$, $\neg\neg A \equiv A$, and $A \equiv \neg\neg A$. This treatment of negation permits one to say that $L$ and $M$ are complementary literals iff $L \equiv \neg M$.

A pair $L$ and $M$ of literals are complementary unifiable if there are substitutions $\alpha$ and $\beta$ such that $L\alpha \equiv \neg M\beta$.

An alternating path from to in a set of clauses is a sequence , where for all , is a pair of literals, , , and are complementary unifiable, and for all . Frequently the are omitted. Such a path is called alternating because it alternates between connecting literals in possibly different clauses and switching to a different literal in the same clause. An example of such a path for propositional calculus is the sequence , . The sequence , is not an alternating path.

Why this particular definition of alternating path is chosen will become clear later, as its properties are presented.
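For the propositional case, the defining conditions of an alternating path can be checked mechanically. The sketch below encodes one reading of the definition: consecutive clauses are linked by a pair of complementary literals, and a clause may not be left by the same literal through which it was entered. The DIMACS-style integer encoding of literals and the function name are assumptions made for this example.

```python
from typing import FrozenSet, List, Tuple

Clause = FrozenSet[int]  # assumed encoding: 3 means p3, -3 means "not p3"

def is_alternating_path(clauses: List[Clause], links: List[Tuple[int, int]]) -> bool:
    """clauses = [C1, ..., Cn]; links[i] = (L, M) with L in clauses[i]
    and M in clauses[i + 1]."""
    if len(links) != len(clauses) - 1:
        return False
    for i, (out_lit, in_lit) in enumerate(links):
        if out_lit not in clauses[i] or in_lit not in clauses[i + 1]:
            return False
        # In propositional logic, complementary unifiable means complementary.
        if out_lit != -in_lit:
            return False
        # The next link must leave clause i + 1 by a different literal
        # than the one through which the clause was entered.
        if i + 1 < len(links) and links[i + 1][0] == in_lit:
            return False
    return True
```

For instance, with clauses `{p1, p2}`, `{¬p2, p3}`, `{¬p3, p4}` the links `(p2, ¬p2)`, `(p3, ¬p3)` form an alternating path, whereas entering `{¬p2, p3}` through `¬p2` and leaving through `¬p2` again does not.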

The length of an alternating path () is , counting only the clauses.

The relevance distance of and in is the length of the shortest alternating path in from to . This is a measure of how closely related to each other two clauses are. If there is no alternating path in from to then is .

If is a subset of set of clauses then is . This is called the relevance distance of from . Also, . This is frequently written as . Clauses in are said to be relevant at distance n from . If then we say is relevant (for and ).

Two clauses are alternating connected or relevance connected in if there is an alternating path in between them.

Note that alternating connectedness is not transitive. Example: , and . and are alternating connected in as are and , but and are not.

is the set of ground instances of clauses in . If has no constant symbols then one such symbol is added for purposes of making non-empty.

Two clauses and are ground connected in if they have ground instances and that are relevance connected in .

The ground relevance distance of and is .

Sometimes can be larger than , for example, consider and where , and .

A set of clauses is relevance connected if between any two clauses , in there is an alternating path.

## 3 Properties of alternating path relevance

###### Theorem 3.1

If is a relevance connected set and then between any two clauses in there is an alternating path of length at most .

###### Proof

This proof is essentially from Plaisted and Yahya [PY03], with a slightly different notation. The idea is that if there is a relevance path between two clauses, then there is one in which each clause appears at most twice. Also, the clauses at the ends of the path clearly only need to appear once, or else there is a shorter path.

### 3.1 Minimal unsatisfiable sets of clauses

This section relates alternating paths to unsatisfiability of sets of clauses. This section and section 3.2 are basically reviews of known results, with some new formalism. First we have the following result [PY03] :

###### Theorem 3.2

If is a minimal unsatisfiable set of clauses then is relevance connected.

Using Theorem 3.1 above, we obtain the following:

###### Theorem 3.3

If is a set of clauses, is a minimal unsatisfiable subset of , and then between any two clauses in there is an alternating path in of length at most .

Now, short proofs imply small minimal unsatisfiable sets of clauses, which in turn implies that if there is a short refutation from then there is a minimal unsatisfiable subset of in which any two clauses are connected by a short alternating path.

###### Theorem 3.4

If there is a resolution refutation of length from then there is a minimal unsatisfiable subset of in which any two clauses are connected by an alternating path of length at most .

###### Proof

Suppose there is a resolution refutation of length from . Let be the set of input clauses (clauses in ) used in the refutation; then . Also, is unsatisfiable so it has a minimal unsatisfiable subset with . Then any two clauses in are connected by an alternating path of length at most .

We are not aware of any such result that applies to other relevance measures. If one wants to add a small proof restriction to other relevance measures, then one way to do this is to combine them with alternating path relevance.

The following result is also easily shown, but without a bound on the length of the path:

###### Theorem 3.5

If is a minimal unsatisfiable set of clauses then any two clauses in are ground connected.

###### Proof

The idea is that if is unsatisfiable then it has a finite unsatisfiable set of ground instances which therefore has a relevance connected subset including at least one instance of each clause in .

### 3.2 Completeness and Set of Support

Using relevance, one can filter the potentially large set of input clauses (clauses in ) to obtain a smaller set of relevant clauses, and then one can search for a proof from instead . This can be done using the concept of a set of support.

###### Definition 1

If is an unsatisfiable set of clauses, then a support set for is a subset of such that any unsatisfiable subset of has non-empty intersection with .

Such support sets are easily constructed from interpretations of the input clauses in many cases. In particular, if is an interpretation of the set of clauses, then is a set of support for . If it is decidable whether for clauses , then such a set of support can be effectively constructed. This is true, for example, for finite models of .
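In the propositional case, the construction of a support set from an interpretation is a one-line filter. The sketch below assumes a total interpretation given as a dictionary from variable numbers to truth values and DIMACS-style integer literals; both choices, and the function name, are illustrative only.

```python
def falsified_clauses(clauses, interpretation):
    """Return the clauses that the interpretation does not satisfy.

    clauses: iterable of frozensets of nonzero ints (negative = negated).
    interpretation: dict mapping each variable number to True or False.
    """
    def is_true(lit):
        return interpretation[abs(lit)] == (lit > 0)

    return [c for c in clauses if not any(is_true(lit) for lit in c)]
```

Every clause outside the returned list is satisfied by the interpretation, so the remaining clauses are jointly satisfiable, which is what makes the returned list usable as a support set. With the all-false interpretation the result is exactly the set of all-positive clauses.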

###### Theorem 3.6

If is unsatisfiable, is a support set for , there is a length refutation from and then is unsatisfiable.

This leads to the following theorem proving method :

Choose , compute , and test for satisfiability.

If is much smaller than , then can be much faster than looking for a refutation directly from . However, because one does not know which to try, one can perform , , et cetera, interleaving the computations because even for a fixed , may not terminate. This leads to the following theorem proving method:

for step 1 until infinity do in parallel od;

An example of a support set for mentioned above is the set of clauses contradicting an interpretation of . This provides a way to use semantics () in theorem proving, which humans also commonly use. Also, if is a satisfiable subset of , then is a support set for . For example, if is a collection of axioms from some satisfiable theory such as number theory then is a support set for . Typically one attempts to prove a theorem from some collection of general, satisfiable axioms. Then one converts to clause form, obtaining set of clauses where are the clauses coming from and are the clauses coming from . Then is a support set for . Support sets are often specified with a theorem.

###### Definition 2

If is unsatisfiable and is a support set for , then the support radius for , , is the minimal such that is unsatisfiable. If is satisfiable, then is .

The support neighborhood for , , is then . A support neighborhood clause for is a clause in and a support neighborhood literal for is a literal appearing in a support neighborhood clause for . The support diameter for is the maximum relevance distance between two clauses within the support neighborhood for .

## 4 Time Bound to Compute Relevance

Relevance neighborhoods can be computed within a reasonable time bound, which makes it feasible to use relevance techniques in theorem proving applications involving large knowledge bases. The computation methods presented here are new. This computation requires finding all pairs of complementary unifiable literals in clauses of , constructing a graph from them, and then applying breadth-first search to compute the relevance distances from a support set to all clauses in .

### 4.1 Pairwise unification

To compute the time for the unifications, let be the length of in characters when written out as a string of symbols, and similarly and for clauses and literals, respectively. Suppose , are the literals in . Unification can be done in linear time [PW78], so testing all pairs of literals for unifiability takes time proportional to . In practice, term indexing [RSV01] permits this to be done much faster. However, with some algebra, which is quadratic in .

### 4.2 The graph

Let be a graph obtained from for purposes of computing relevance distances. The nodes of are triples and where and . There are two kinds of edges in : Type 1 edges from to for all nodes and in such that and are complementary unifiable. There are also type 2 edges from nodes to where and are distinct literals of . Type 1 edges encode the links between clauses in an alternating path and type 2 edges encode switching from one literal of a clause to another literal in an alternating path. The number of edges can be quadratic in .

A path in a graph is a sequence in which there is an edge from to for all . The length of this path is . Then there is a direct correspondence between alternating paths in and paths in . Suppose is an alternating path in and is the pair of literals. The corresponding path in is , , , , , . The length of the alternating path is but the length of the path in is if .

### 4.3 Relevance neighborhoods

Suppose one wants to find where is a subset of . This can be done using the well-known linear time breadth-first search algorithm [CLRS09], which outputs the distances of all nodes from the starting node (and can be easily modified to have more than one starting node). First one sets the distances of all nodes and for to 1, and all other nodes have distances of infinity. Then one applies breadth-first search which computes the length of all shortest paths from the nodes for to all other nodes. From this, is obtained as the clauses appearing in nodes whose distance is less than or equal to . Because this graph has a size that is quadratic in , the overall method is quadratic. However, to compute , it is only necessary to construct and search the portion of the graph consisting of nodes at distance or less, which can result in a considerable savings of time, especially for very large knowledge bases and small .
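A minimal sketch of this breadth-first search for propositional clauses follows. It does not build the graph explicitly; instead each search state is a pair (clause, literal of entry), which plays the role of the "in" nodes. It assumes that path length counts clauses, so that clauses of the support set are at distance 1, and it uses DIMACS-style integer literals. The function name and data layout are assumptions for this example.

```python
from collections import defaultdict, deque

def relevance_distances(clauses, support):
    """clauses: list of frozensets of nonzero ints (negative = negated).
    support: indices of the clauses in the support set.
    Returns {clause index: relevance distance} for all reachable clauses."""
    occurs = defaultdict(list)          # literal -> indices of clauses containing it
    for idx, clause in enumerate(clauses):
        for lit in clause:
            occurs[lit].append(idx)

    state_dist = {}                     # (clause index, entry literal) -> distance
    queue = deque()
    for idx in support:
        state_dist[(idx, None)] = 1     # support clauses start a path
        queue.append((idx, None))

    clause_dist = {}
    while queue:
        idx, entry = queue.popleft()
        d = state_dist[(idx, entry)]
        clause_dist.setdefault(idx, d)  # BFS order: the first visit is the shortest
        for lit in clauses[idx]:
            if lit == entry:            # may not leave by the literal of entry
                continue
            for nxt in occurs[-lit]:
                state = (nxt, -lit)
                if state not in state_dist:
                    state_dist[state] = d + 1
                    queue.append(state)
    return clause_dist
```

The neighborhood at distance `n` is then `{i for i, d in dist.items() if d <= n}`. To exploit a small bound, the loop can simply stop expanding states whose distance already equals `n`.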

### 4.4 The propositional case

If is purely propositional then it is possible to find relevance neighborhoods much faster, as follows: The edges of type 1 are replaced by a smaller number of edges. Suppose are all the clauses in containing a positive literal and are all the clauses in containing . Then the type 1 edges from to are replaced by edges from to a new node and edges from to . Also, similar edges are added with the sign of reversed: the type 1 edges from to are replaced by edges from to a new node and edges from to . This means that the path in corresponding to a path in becomes a little longer, but the number of edges in is reduced from quadratic to linear, making the whole algorithm linear time. Then an alternating path of length in corresponds to a path of length in if . For small relevance distances, one need not construct all of , as before.
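The effect of the extra nodes can be imitated without materializing them: all clauses that are left through the same literal lead to the same set of successor states, so the occurrence list of the complementary literal needs to be scanned only once. The sketch below is one way to realize this idea, under the same assumptions as before (integer literals, path length counting clauses, support clauses at distance 1); the set `hub_done` stands in for the new nodes and is a name chosen for this example.

```python
from collections import defaultdict, deque

def relevance_distances_hub(clauses, support):
    """Propositional relevance distances, expanding each literal "hub" once.

    clauses: list of frozensets of nonzero ints (negative = negated).
    support: indices of the support clauses, without repetitions."""
    occurs = defaultdict(list)          # literal -> indices of clauses containing it
    for idx, clause in enumerate(clauses):
        for lit in clause:
            occurs[lit].append(idx)

    clause_dist = {}
    hub_done = set()                    # literals whose hub was already expanded
    queue = deque((idx, None, 1) for idx in support)
    while queue:
        idx, entry, d = queue.popleft()
        clause_dist.setdefault(idx, d)  # BFS order: the first visit is the shortest
        for lit in clauses[idx]:
            if lit == entry or lit in hub_done:
                continue
            hub_done.add(lit)           # first arrival at this hub is the earliest
            for nxt in occurs.get(-lit, ()):
                queue.append((nxt, -lit, d + 1))
    return clause_dist
```

Since the search is breadth-first, the first time a hub is reached is also the earliest, so skipping later arrivals does not change any distance.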

## 5 Branching Factor Argument

Now we give new evidence that relevance can help to find proofs faster.

Suppose each (first-order) clause has at most literals and each predicate appears with a given sign in at most clauses in a set of clauses. Suppose a clause is in an alternating path; how many clauses can appear after it in alternating paths in ? If is the first clause in the path then any one of its up to literals can connect to at most other clauses, so there can be up to clauses after in various alternating paths in . If is not the first clause in the path, then it cannot exit by the same literal it entered by, so the number of clauses that can appear after it in various alternating paths in is at most . Thus there is a branching factor of at most at each level except at the first level.

Suppose is a support set for . Then there can be clauses that are the first clauses in some alternating path starting in . There can be up to clauses after in alternating paths in and for each clause for there can be at most clauses after it in various alternating paths in . So in paths starting in with various clauses , the number of clauses in all such paths is at most . All clauses in must appear in some such path. The total number of clauses appearing in such paths is then bounded by . If this is bounded by . If this quantity is much smaller than and is unsatisfiable then the effort to find a proof from can be much less than the effort to find a proof from all of .
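The counting argument can be turned into a small calculator. The sketch below sums the level-by-level bounds described above. The parameter names are labels chosen for this example: `max_literals` is the bound on literals per clause, `max_occurrences` is the bound on clauses in which a predicate appears with a given sign, and `distance` counts clauses along a path, so that the support set itself is level 1.

```python
def neighborhood_size_bound(support_size, max_literals, max_occurrences, distance):
    """Upper bound on the number of clauses appearing on alternating paths
    of at most `distance` clauses that start in the support set."""
    k, b = max_literals, max_occurrences
    total = 0
    level = support_size                # clauses that can start a path
    for depth in range(1, distance + 1):
        total += level
        # A first clause can exit by any of its k literals; a later clause
        # cannot exit by the literal through which it was entered.
        factor = k * b if depth == 1 else (k - 1) * b
        level *= factor
    return total
```

For example, with a single support clause, at most 3 literals per clause, and at most 2 occurrences per signed predicate, the bound is 7 clauses within distance 2 and 31 within distance 3, independent of the size of the knowledge base.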

The alternating path approach is intended for proofs with small relevance bounds, so that the exponent should be small. Also, if clause splitting is used to break unifications, then the effective value of may be reduced. The value is typically small in first-order clause sets.

## 6 Experimental evidence with some knowledge bases

There is further evidence that relevance can help to reduce the size of the clause set that one must consider. This is based on an implementation of relevance [JP84] in which a first-order situation calculus knowledge base KB1 of about 200 clauses and a first-order knowledge base KB2 of about 3000 clauses expressing a map of a portion of the United States were considered and relevance methods were applied. For these examples, was computed for representing various queries and with increasing until a non-empty set of relevant clauses was found. This approach used additional pruning techniques to reduce the size of the relevant set such as a purity test in which clauses were deleted from if they had literals that did not complement unify with any other literals in ; additional details can be found in the paper.

| Know. base | Query | Dist. bound | No. of clauses found | No. needed for proof | Note |
|---|---|---|---|---|---|
| 1 | 1 | 5 | 10 | 7 | |
| 1 | 2 | 6 | 24 | 18 | |
| 2 | 3 | 4 | 22 | 6 | |
| 2 | 3 | 4 | 7 | 6 | Different strategy |
| 2 | 4 | 3 | 4 | 4 | |
| 2 | 5 | 3 | 4 | 4 | |

For query 5, the method instantiated the query itself and found four instances of the query, all needed for a refutation. In all cases a small set of unsatisfiable clauses was found. These results were published previously [JP84] but are not widely known.

## 7 Large Knowledge Bases

For some large theories, the actual refutations tend to be small; this also suggests that relevance techniques can be helpful. The following quote [PSST10] concerns the SUMO knowledge base:

"The Suggested Upper Merged Ontology (SUMO) has provided the TPTP problem library with problems that have large numbers of axioms, of which typically only a few are needed to prove any given conjecture."

However, in this case, one is frequently testing to see if the axioms themselves are satisfiable. Because relevance techniques presented depend on knowing that a large subset of the axioms is satisfiable, in order to use relevance one would have to find such a subset even without knowing that all the axioms were satisfiable. The following quotation [RRG05] concerns another large knowledge base:

"…the knowledge in Cyc's KB is common-sense knowledge. Common-sense knowledge is more often used in relatively shallow, 'needle in a haystack' types of proofs than in deep mathematics style proofs."

Concerning the Sine Qua Non approach [HV11], the authors write:

"Problems of this kind usually come either from knowledge-base reasoning over large ontologies (such as SUMO and CYC) or from reasoning over large mathematical libraries (such as MIZAR). Solving these problems usually involves reasoning in theories that contain thousands to millions of axioms, of which only a few are going to be used in proofs we are looking for."

Of course, the knowledge base used for Watson [Fer12] was huge, but the proofs (if one can call them proofs) had to be found quickly, so they had to be relatively small.

## 8 Relationship to the Set of Support Strategy in Resolution

Further evidence that relevance can help comes from the usefulness in many cases of the set of support strategy [WRC65] for first-order theorem proving. This strategy is included as one of the standard options in many first-order theorem provers. This is because experience has shown that the set of support strategy often helps to find proofs faster. Some experimental evidence that first-order theorem proving strategies using set of support techniques outperform others for large theories has been obtained [RS98].

Now we show formally that the set of support strategy restricts attention to relevant clauses, and in fact, uses the most relevant clauses first. This result is new. It is surprising that set of support should correspond to relevance defined in terms of alternating paths in this way, because the definition of alternating paths is non-intuitive. Because the set of support strategy often helps to find proofs, this is evidence that relevance is also helpful for proof finding.

###### Definition 3

If is a set of first-order clauses then a resolution sequence from is a sequence of clauses where each is either in or is a resolvent of and for . There is a parent function that for any returns such a pair , but mostly this will be left implicit.

###### Definition 4

Suppose is a set of first-order clauses and is a subset of . Then a clause in a resolution sequence from is -supported if it is either in or at least one of its parents is -supported in the sequence. The set of support strategy for with support set is the set of resolution sequences from in which each non-input clause is -supported in the resolution sequence.

The set of support strategy is complete in the sense that if is an unsatisfiable set of first-order clauses and is a support set for then there is a set of support refutation from , that is, a resolution sequence from according to the set of support strategy for and in which is the empty clause, denoting false.
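For propositional clauses, the set of support restriction can be illustrated with a small saturation loop in which every resolution step has at least one supported parent. This is only a sketch for experimentation, with integer literals and helper names chosen for this example; it makes no attempt at the indexing or subsumption that a real prover would use.

```python
def resolvents(c1, c2):
    """All propositional resolvents of two clauses (frozensets of ints)."""
    return [frozenset((c1 - {lit}) | (c2 - {-lit})) for lit in c1 if -lit in c2]

def sos_refutable(clauses, support):
    """True if the empty clause is derivable when every resolution step
    has at least one parent that is supported."""
    supported = set(support)
    others = set(clauses) - supported
    agenda = list(supported)            # supported clauses not yet processed
    while agenda:
        given = agenda.pop()
        # Resolve the given supported clause against everything known so far.
        for partner in list(supported | others):
            for resolvent in resolvents(given, partner):
                if not resolvent:
                    return True         # empty clause: refutation found
                if resolvent not in supported:
                    supported.add(resolvent)
                    agenda.append(resolvent)
    return False
```

The loop terminates because only finitely many clauses exist over a finite set of propositional literals. Clauses in `others` are never resolved with each other, which is exactly the restriction imposed by the strategy.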

###### Definition 5

Suppose is the alternating path and , for and . Let be for . Let be and let be . Then is . is considered to be a clause denoting the disjunction of its literals. If then .

###### Definition 6

A collection of alternating paths covers a first-order clause if for every literal there is an alternating path and a literal such that is an instance of .

###### Theorem 8.1

Suppose covers first-order clause , and is a resolvent of and for some clause . Let and be literals of resolution in the respective clauses. Let be a path in and be a literal in such that is an instance of . Let be a prefix of the path such that the last clause in contains . Let be the path , that is, is with and added to the end. Then is an alternating path and covers .

###### Proof

The fact that is an alternating path follows directly from the definitions. Let be . Now, covers ; all literals in that are instances of literals in are covered because is the last clause in , and the literal which is possibly not in has been removed from by the resolution operation. The literals in that come from literals in are covered because the literals in were already covered by and the literals in have only been instantiated in the resolution operation.

Suppose is a set of first-order clauses and is a support set for .

###### Theorem 8.2

Suppose is a resolution sequence in the set of support strategy for with support set . Then for every literal in every derived (that is, non-input) clause there is an alternating path from of length at most such that is an instance of a literal in ; thus is an instance of a literal in a clause in at relevance distance at most . Also, for every input clause in the proof there is an alternating path from of length at most ending in .

This theorem is saying that there is an alternating path of length 1 from to , an alternating path of length at most 2 from to if is an input clause, an alternating path of length at most 3 from to if is an input clause, and so on.

###### Proof

This follows from Theorem 8.1. By induction on , showing this for = 1 and showing that if it’s true for it’s also true for , there is a set of alternating paths of length at most starting in and covering if is a derived clause. This implies all the conclusions of the theorem.

The implication of Theorem 8.2 for the set of support strategy is that all input clauses in the proof are relevant at distance at most .

This shows that the set of support strategy restricts attention to relevant clauses, so that the effectiveness of this strategy is evidence that relevance is helpful. In fact, one can easily show that if is a clause in at relevance distance then appears in a set of support proof of length at most . Thus the set of support strategy uses exactly the relevant clauses in .

### 8.1 Limitations of the Set of Support Strategy

The question now arises, if the set of support strategy is in some sense equivalent to relevance, then why not just use it all the time and dispense with relevance altogether?

First, there are some extensions to relevance that are not naturally incorporated into the set of support strategy; these involve a purity test and the use of multiple sets of support. These techniques were presented in two early papers [Pla80, JP84].

Also, for propositional clause sets, DPLL with CDCL [SLM09] is generally much better than resolution, so relevance techniques may have an advantage over resolution for such clause sets even with the set of support strategy.

In addition, for a Horn set, that is, a set of first-order Horn clauses, with an all-negative set of support, the set of support strategy is the same as input resolution, which requires every resolution to have one parent that is an input clause. Define the depth of a proof so that the depth of a (trivial) proof of an input clause is zero, and if is proved by resolving and , then the depth of the proof of is one plus the maximum depths of the proofs of and . Then the depth of a proof from a Horn set with an all-negative set of support is at least as large as the number of input clauses used, and could be larger if more than one instance of an input clause is used. This implies that it is possible to have a clause set with a small support radius but which requires a large number of input clauses and therefore a very deep set of support proof. Thinking in terms of Prolog style subgoal trees, the relevance distance corresponds roughly to the depth of the tree but the length of a set of support proof corresponds to the number of nodes in the tree. For such clause sets, it may be better to use hyper-resolution, which basically resolves away all negative literals of a clause at once, because the proof depth corresponds only to the depth of the tree.

As an example, consider the propositional clause set together with the unit clauses . Suppose is the set of support. Then the set of support strategy resolves with to produce . This then resolves with to produce . Two more resolutions produce and six more resolutions with unit clauses produce a contradiction, for a total of ten resolutions. However, all clauses are within a relevance distance of three of the set of support. Also, hyper-resolution can find a refutation in fewer levels of resolution. First, the unit clauses hyper-resolve with to produce the units and these hyper-resolve with to produce the unit which then resolves with to produce a contradiction. If the subgoal tree is larger the difference can be more dramatic; the number of levels of resolution by the set of support strategy can be exponentially larger than the number of levels of hyper-resolution required. If there are many clauses in the clause set, a deep proof can easily get lost in the huge search space that is generated. The point is, although set of support restricts attention to relevant clauses, it does not always process them in the most efficient way, so it may be better to separate relevance detection from inference in some cases. The set of support strategy intermingles a relevance restriction with resolution inference.

## 9 DPLL

This section presents an extension of alternating path relevance to the DPLL method. The DPLL method for testing satisfiability of propositional clause sets is now of major importance in many areas of computer science, especially with the extension to CDCL [SLM09]. For example, an open conjecture concerning Pythagorean triples was recently solved in this way [HKM16]. Therefore any improvement in DPLL is of major importance. Relevance can decrease the work required for DPLL if the number of support neighborhood literals is small for a given support set. This decrease in work requires a modification to DPLL.

The DPLL method, without unit rules or CDCL, can be expressed this way, for a set $S$ of propositional clauses:

    DPLL(S):
      if the empty clause, representing false, is in S then unsat
      else
        if S is empty then sat else
          choose a literal L that appears in S;
          if DPLL(S_L) = unsat then DPLL(S_¬L) else sat

Here $S_L$ is $S$ with all clauses containing $L$ deleted and $\neg L$ removed from all clauses containing $\neg L$. Also, $S_{\neg L}$ is defined similarly with the signs of the literals reversed. We say that $L$ is the split literal or the literal chosen for splitting, in this case.

DPLL also has unit rules. Essentially, if the clause set contains a positive unit clause, then its atom is used to simplify the clause set by replacing all occurrences of the atom by true and simplifying, and replacing all occurrences of its negation by false and simplifying. There is a similar rule if the clause set contains a negative unit clause, with signs reversed.
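The splitting recursion and the unit rule described above can be sketched in Python as follows. This is a minimal illustration rather than an efficient solver. The integer encoding of literals (a nonzero integer, with `-x` the complement of `x`) and the names `simplify`, `unit_propagate`, and `dpll` are assumptions made here for illustration and do not come from the paper.

```python
from typing import FrozenSet, Set

Clause = FrozenSet[int]  # assumed encoding: literal = nonzero int, -x = complement of x


def simplify(clauses: Set[Clause], lit: int) -> Set[Clause]:
    """Make `lit` true: delete clauses containing it, remove its complement elsewhere."""
    return {c - {-lit} for c in clauses if lit not in c}


def unit_propagate(clauses: Set[Clause]) -> Set[Clause]:
    """Apply the unit rule until no unit clause is left or the empty clause appears."""
    while frozenset() not in clauses:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        (lit,) = unit
        clauses = simplify(clauses, lit)
    return clauses


def dpll(clauses: Set[Clause], use_units: bool = False) -> str:
    """Return "sat" or "unsat" following the splitting recursion in the text."""
    if use_units:
        clauses = unit_propagate(clauses)
    if frozenset() in clauses:          # the empty clause represents false
        return "unsat"
    if not clauses:                     # no clauses left to satisfy
        return "sat"
    lit = next(iter(next(iter(clauses))))   # any literal that appears in the set
    if dpll(simplify(clauses, lit), use_units) == "unsat":
        return dpll(simplify(clauses, -lit), use_units)
    return "sat"
```

The choice of split literal here is arbitrary; any DPLL heuristic could be substituted at that point without changing the structure of the recursion.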

### 9.1 Relevant DPLL

The material in this section is new. Consider a support set for a set of propositional clauses. Define the relevance distance of a literal from the support set to be the minimal distance such that the literal or its negation appears in a clause at that relevance distance from the support set.

###### Definition 7

If is a set of clauses then is , that is, the set of literals in clauses of .

###### Definition 8

The stepping sequence for a set of propositional clauses and a support set for is a sequence where is the set of literals of at relevance distance from , and is the maximum relevance distance from of any literal in . A stepping remainder sequence for and is a sequence such that for the stepping sequence for and , for all . A leading literal of a stepping remainder sequence is an element of or the complement of such an element, where is minimal such that is non-empty. Also, where is a set of clauses and denotes the stepping remainder sequence in which for all . If is the stepping remainder sequence then is .

Note that support sets are easy to obtain for propositional clause sets. For example, the set of all-positive clauses and the set of all-negative clauses are both support sets for all propositional clause sets.
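As a small illustration of these definitions, the following Python sketch extracts the all-positive and all-negative support sets and groups literals by relevance distance. It assumes that the relevance distance of every clause has already been computed by the graph-based methods of the earlier sections and is passed in as a dictionary of nonnegative integers; that input format, the integer encoding of literals, and all function names are assumptions made here for illustration.

```python
from typing import Dict, FrozenSet, List, Set

Clause = FrozenSet[int]  # assumed encoding: literal = nonzero int, -x = complement of x


def all_positive(clauses: Set[Clause]) -> Set[Clause]:
    """Clauses whose literals are all positive (one possible support set)."""
    return {c for c in clauses if all(lit > 0 for lit in c)}


def all_negative(clauses: Set[Clause]) -> Set[Clause]:
    """Clauses whose literals are all negative (another possible support set)."""
    return {c for c in clauses if all(lit < 0 for lit in c)}


def literal_distances(clause_dist: Dict[Clause, int]) -> Dict[int, int]:
    """Distance of a literal: least distance of a clause containing it or its negation."""
    atom_dist: Dict[int, int] = {}
    for clause, d in clause_dist.items():
        for lit in clause:
            atom = abs(lit)
            atom_dist[atom] = min(d, atom_dist.get(atom, d))
    lits = {lit for clause in clause_dist for lit in clause}
    return {lit: atom_dist[abs(lit)] for lit in lits}


def stepping_sequence(clause_dist: Dict[Clause, int]) -> List[Set[int]]:
    """Entry i holds the literals of the clause set at relevance distance i."""
    dist = literal_distances(clause_dist)
    if not dist:
        return []
    seq: List[Set[int]] = [set() for _ in range(max(dist.values()) + 1)]
    for lit, d in dist.items():
        seq[d].add(lit)
    return seq
```

Because a literal and its negation share one relevance distance, both signs of an atom land in the same entry of the sequence whenever both occur in the clause set.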

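The relevant DPLL method can be expressed this way, when applied to a set of propositional clauses and a stepping remainder sequence for that clause set:

1. if the empty clause, representing false, is in the clause set then return unsat, else
2. if every set in the stepping remainder sequence is empty then return sat, else
3. choose a leading literal from the stepping remainder sequence;
4. if relevant DPLL, applied to the clause set simplified as in DPLL by the chosen literal and to the stepping remainder sequence restricted to the literals of that simplified clause set, returns unsat then return the result of relevant DPLL applied to the clause set simplified by the complement of the chosen literal and to the correspondingly restricted stepping remainder sequence,
5. else return sat.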

This procedure is called at the top level with the full clause set and with the stepping sequence for the clause set and a support set for it. If there is a choice of leading literals, then any DPLL heuristic can be used to choose among them. The recursive calls to relevant DPLL have stepping remainder sequences with fewer literals; that is, the sequence becomes smaller with each level of recursion. This enables proofs of properties of relevant DPLL by induction. Also, in the recursive calls, the stepping remainder sequence always includes all support neighborhood literals of the original clause set that remain in the current clause set, so that if every set in the sequence is empty then the current clause set contains no support neighborhood literals. If at this point the current clause set does not contain the empty clause, then relevant DPLL has essentially found a model of the relevant clauses of the original clause set. By Theorem 3.2, the original clause set is then satisfiable.

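These considerations justify returning "sat" in line 2 of relevant DPLL when every set in the stepping remainder sequence is empty. A model of the relevant clauses can then be returned without even exploring the remaining literals. However, this depends on the support set being valid for the clause set, that is, on the clauses outside the support set being satisfiable. If there is some doubt about this, then line 2 of relevant DPLL should be replaced by the following:

2. if every set in the stepping remainder sequence is empty then return the result of ordinary DPLL on the current clause set, else

This means that ordinary DPLL is called in this case to explore the remaining literals.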
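The relevant DPLL recursion can be sketched in Python along the following lines. This is one reading of the procedure, not a definitive implementation. The integer encoding of literals, the helper names `simplify` and `restrict`, and the optional `fallback` parameter (standing for the variant of line 2 that hands the remaining clauses to ordinary DPLL) are all assumptions introduced here. The stepping remainder sequence is taken as an input list of literal sets and is not computed by this code.

```python
from typing import Callable, FrozenSet, List, Optional, Set

Clause = FrozenSet[int]  # assumed encoding: literal = nonzero int, -x = complement of x


def simplify(clauses: Set[Clause], lit: int) -> Set[Clause]:
    """Make `lit` true: delete clauses containing it, remove its complement elsewhere."""
    return {c - {-lit} for c in clauses if lit not in c}


def restrict(seq: List[Set[int]], clauses: Set[Clause]) -> List[Set[int]]:
    """Keep in each entry only the literals that still occur in `clauses`."""
    present = {lit for c in clauses for lit in c}
    return [level & present for level in seq]


def relevant_dpll(
    clauses: Set[Clause],
    seq: List[Set[int]],
    fallback: Optional[Callable[[Set[Clause]], str]] = None,
) -> str:
    if frozenset() in clauses:                      # line 1
        return "unsat"
    level = next((lv for lv in seq if lv), None)    # first non-empty entry
    if level is None:                               # line 2: no relevant literal left
        return fallback(clauses) if fallback else "sat"
    lit = next(iter(level))                         # line 3: a leading literal
    pos = simplify(clauses, lit)
    if relevant_dpll(pos, restrict(seq, pos), fallback) == "unsat":   # line 4
        neg = simplify(clauses, -lit)
        return relevant_dpll(neg, restrict(seq, neg), fallback)
    return "sat"                                    # line 5
```

Each recursive call removes at least the split literal from the sequence, so the recursion depth is limited by the number of literals in the sequence rather than by the number of literals in the whole clause set.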

###### Theorem 9.1

If is an unsatisfiable set of propositional clauses, is the stepping sequence for and a support set for , and - is called at the top level, then the number of recursive calls to - for various stepping remainder sequences is bounded by where is the number of literals in the support neighborhood for .

###### Proof

Let be the support radius for . If is a stepping remainder sequence for , let be . By induction on for the recursive calls, one shows that for every call to -, the set of clauses in over the support neighborhood literals in is unsatisfiable. The proof makes use of the lemma that if is unsatisfiable so are and for any literal . If then in the recursive call must contain the empty clause. Each recursive call to - reduces the value of by one or more, and there are at most two recursive calls to -, whence the bound follows.

Note that can be much smaller than the set of all literals in relevant clauses in . This result is not trivial, because some clauses in are deleted in and and even clauses that have literals that are not in can contribute to such deletions by a succession of unit deletions in . Therefore one needs to show that the clauses over the literals in are still unsatisfiable in and .

This bound can be much smaller than the worst case bound on ordinary DPLL, which is exponential in the number of atoms (predicate symbols) in the clause set. It is not necessary to know the support neighborhood in order to apply this method. This method can be extended to CDCL, as before. Of course, in practice the number of calls is likely to be much smaller than the bound. There are already heuristics for DPLL, but to our knowledge none of them decreases the worst case time bound as this approach does.

If the unit rules are used freely, units that are not in the support neighborhood may be processed by the unit rules even if the split literals are handled as in relevant DPLL. There are two ways to deal with this: allow all units to be used in the unit rules, or restrict the unit rules to units that are relevant in some sense.

## 10 Other extensions of alternating path relevance

### 10.1 Splitting clauses

Another idea that can help to make relevance more effective is to split a clause into several clauses that together have the same ground instances. For example, a clause containing an occurrence of a variable can be split into one clause for each function symbol appearing in the clause set, where each such clause is obtained by replacing the variable throughout the clause with that function symbol applied to a sequence of new variables. If the clause set has no constant symbols, then one such symbol has to be allowed in addition. This idea extends in the obvious way to clauses containing more than one variable. It can be especially helpful with general axioms such as the equality axioms. For example, a single equality axiom can join many clauses together, causing all clauses to be at small relevance distances from one another. This can hinder the application of relevance in systems involving equality. Splitting clauses that have literals unifying with the complements of literals in many other clauses can help considerably in such cases. Also, it can be helpful to choose which variable to split so that the largest number of complement unifications from literals in the clause is broken. Splitting clauses was one of the techniques used earlier [JP84]; the technique used there was to split a clause into two clauses, one restricting the chosen variable to be instantiated with a function symbol from an initial part of the list of function symbols and the other restricting it to a function symbol from the rest of the list.
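The basic form of clause splitting, with one clause per function symbol, can be sketched in Python as follows. The term representation (a string for a variable, a tuple `(symbol, arg1, ...)` for a function term, a one-element tuple for a constant), the literal representation, and the naming scheme for new variables are assumptions chosen here for illustration. The new variable names are assumed not to clash with names already in the clause.

```python
from typing import Dict, List, Tuple, Union

Term = Union[str, tuple]                      # "X" = variable, ("f", t1, ...) = function term
Literal = Tuple[bool, str, Tuple[Term, ...]]  # (sign, predicate symbol, arguments)
Clause = Tuple[Literal, ...]


def substitute(term: Term, var: str, repl: Term) -> Term:
    """Replace every occurrence of variable `var` in `term` by `repl`."""
    if isinstance(term, str):
        return repl if term == var else term
    return (term[0],) + tuple(substitute(arg, var, repl) for arg in term[1:])


def split_clause(clause: Clause, var: str, arities: Dict[str, int]) -> List[Clause]:
    """Return one copy of `clause` per function symbol, with `var` replaced by
    that symbol applied to new variables (constants have arity 0)."""
    result: List[Clause] = []
    for symbol, arity in arities.items():
        new_vars = tuple(f"{var}_{symbol}_{i}" for i in range(1, arity + 1))
        repl: Term = (symbol,) + new_vars
        result.append(tuple(
            (sign, pred, tuple(substitute(arg, var, repl) for arg in args))
            for sign, pred, args in clause
        ))
    return result
```

The `arities` argument is expected to list every function symbol of the clause set, including at least one constant, so that the resulting clauses together keep the ground instances of the original clause.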

### 10.2 A fine type theory

Also, a very fine type theory can help with equality and with relevance in general. Types can be incorporated into the unification algorithm, so that, for example, a variable of type "person" would not unify with a variable of type "building". This idea can increase the relevance distance between clauses and make relevance more effective.
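A minimal sketch of how types might be built into unification is given below. It assumes a flat set of types with no subtyping, with every variable and every function term carrying a type label; the classes `Var` and `Fn` and the function `unify` are hypothetical names introduced here and are not taken from any particular prover.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str
    sort: str                      # type label, e.g. "person"


@dataclass(frozen=True)
class Fn:
    name: str
    sort: str                      # type of the value the term denotes
    args: Tuple["Term", ...] = ()


Term = Union[Var, Fn]
Subst = Dict[Var, Term]


def walk(t: Term, s: Subst) -> Term:
    """Follow variable bindings in `s` until an unbound term is reached."""
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t


def occurs(v: Var, t: Term, s: Subst) -> bool:
    """Occurs check: does variable `v` appear in `t` under substitution `s`?"""
    t = walk(t, s)
    if isinstance(t, Var):
        return t == v
    return any(occurs(v, a, s) for a in t.args)


def unify(a: Term, b: Term, s: Optional[Subst] = None) -> Optional[Subst]:
    """Return a unifier extending `s`, or None if the terms do not unify."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if a.sort != b.sort:           # terms of different types never unify
        return None
    if isinstance(a, Var):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, Var):
        return None if occurs(b, a, s) else {**s, b: a}
    if a.name != b.name or len(a.args) != len(b.args):
        return None
    for x, y in zip(a.args, b.args):
        s = unify(x, y, s)
        if s is None:
            return None
    return s
```

With such a check in place, literals whose arguments have different types are no longer complement unifiable, which removes links between clauses and can increase relevance distances.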

### 10.3 Multiple sets of support

If one has a finite collection of interpretations of a set of clauses, then let be . Then the are sets of support for all such . Let be . Then if is unsatisfiable, it is easy to see that will also be unsatisfiable for some . The method of Jefferson and Plaisted [JP84] was basically to compute for various and apply a purity test until a non-empty set was obtained. Also, clause splitting was used. These techniques proved to be effective for the problems tried.

## 11 Equality

Equality can cause a problem for relevance if equality literals or their negations appear in many clauses. It may be acceptable to use the equality axioms without any special modification for equality. However, there are also other possibilities. In general, more thought needs to be devoted to how to integrate equality and relevance, possibly using some kind of completion procedure [BN98] combined with relevance.

Ordered paramodulation is a theorem proving technique that is effective for clause form resolution with equality. It basically uses equations as rewrite rules, replacing the large (complex) side of the equation by the small (simple) side. However, it is not compatible with the set of support strategy. For example, consider this set: , , . If is the set of support, then a refutation requires using the equations in the wrong direction for ordered paramodulation and even paramodulating from variables. This kind of paramodulation can be highly inefficient compared to ordered paramodulation. This is further evidence that just applying the set of support strategy is not always the best way to handle relevance.

For equality and relevance there are at least three choices:

1. Find the relevant clauses by some method, then use a strategy such as ordered paramodulation and hyper-resolution to find the proof.
2. Use Brand's modification method [Bra75] on the set of clauses, find the relevant set of clauses, and then apply some inference method to find the proof.
3. Modify relevance to take equality into account, possibly by using a set of equations to rewrite terms before computing relevance or by incorporating E-unification [BS01] into the unification algorithm.

## 12 Discussion

The features of alternating path relevance have been reviewed, and some extensions, including an extension to DPLL, have been presented. Graph-based methods for computing this relevance measure have been presented. Alternating path relevance has an unexpected relationship with the set of support strategy. Some previous successes with it, as well as with the set of support strategy, argue that this method can be effective. A couple of theoretical results indicating the effectiveness of alternating path relevance have also been presented. Evidence has been given that several large knowledge bases frequently permit small proofs, suggesting that alternating path relevance and other relevance methods can be effective for them. An open problem for many of the relevance techniques is to give a theoretical justification for their effectiveness.

## References

- [BN98] Franz Baader and Tobias Nipkow. Term Rewriting and All That. Cambridge University Press, Cambridge, England, 1998.
- [Bra75] D. Brand. Proving theorems with the modification method. SIAM J. Comput., 4:412–430, 1975.
- [BS01] Franz Baader and Wayne Snyder. Unification theory. In John Alan Robinson and Andrei Voronkov, editors, Handbook of Automated Reasoning, pages 445–532. Elsevier and MIT Press, 2001.
- [CLRS09] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Breadth-first search, pages 531–539. The MIT Press, 3rd edition, 2009.
- [Fer12] D. A. Ferrucci. Introduction to "This is Watson". IBM J. Res. Dev., 56(3):235–249, May 2012.
- [HKM16] Marijn J. H. Heule, Oliver Kullmann, and Victor W. Marek. Solving and verifying the boolean pythagorean triples problem via cube-and-conquer. In Nadia Creignou and Daniel Le Berre, editors, Theory and Applications of Satisfiability Testing – SAT 2016, pages 228–245, Cham, 2016. Springer International Publishing.
- [HV11] Krystof Hoder and Andrei Voronkov. Sine qua non for large theory reasoning. In CADE, volume 6803 of Lecture Notes in Computer Science, pages 299–314. Springer, 2011.
- [JP84] S. Jefferson and D. Plaisted. Implementation of an improved relevance criterion. In First Conference on Artificial Intelligence Applications, pages 476–482, 1984.
- [MP09] Jia Meng and Lawrence C. Paulson. Lightweight relevance filtering for machine-generated resolution problems. J. Applied Logic, 7(1):41–57, 2009.
- [MQP06] Jia Meng, Claire Quigley, and Lawrence C. Paulson. Automation for interactive proof: First prototype. Inf. Comput., 204(10):1575–1596, 2006.
- [Pla80] D. Plaisted. An efficient relevance criterion for mechanical theorem proving. In Proceedings of the First Annual National Conference on Artificial Intelligence, pages 79–83, 1980.
- [PSST10] Adam Pease, Geoff Sutcliffe, Nick Siegel, and Steven Trac. Large theory reasoning with sumo at casc. AI Communications, 23(2-3):137–144, 2010.
- [Pud07] Petr Pudlak. Semantic selection of premisses for automated theorem proving. In Sutcliffe et al. [SUS07].
- [PW78] M.S. Paterson and M.N. Wegman. Linear unification. Journal of Computer and System Sciences, 16(2):158 – 167, 1978.
- [PY03] David A. Plaisted and Adnan H. Yahya. A relevance restriction strategy for automated deduction. Artif. Intell., 144(1-2):59–93, 2003.
- [RPS09] Alex Roederer, Yury Puzis, and Geoff Sutcliffe. Divvy: An ATP meta-system based on axiom relevance ordering. In CADE, volume 5663 of Lecture Notes in Computer Science, pages 157–162. Springer, 2009.
- [RRG05] Deepak Ramachandran, Pace Reagan, and Keith Goolsbey. First-orderized researchcyc: Expressivity and efficiency in a common-sense ontology. In Papers from the AAAI Workshop on Contexts and Ontologies: Theory, Practice and Applications, 2005.
- [RS98] Wolfgang Reif and Gerhard Schellhorn. Theorem Proving in Large Theories, pages 225–241. Springer Netherlands, Dordrecht, 1998.
- [RSV01] I.V. Ramakrishnan, R. Sekar, and A. Voronkov. Term indexing. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume II, chapter 26, pages 1853–1964. Elsevier Science, 2001.
- [RV99] Alexandre Riazanov and Andrei Voronkov. Vampire. In Automated Deduction — CADE-16, pages 292–296, Berlin, Heidelberg, 1999. Springer Berlin Heidelberg.
- [SLM09] João P. Marques Silva, Inês Lynce, and Sharad Malik. Conflict-driven clause learning SAT solvers. In Handbook of Satisfiability, volume 185 of Frontiers in Artificial Intelligence and Applications, pages 131–153. IOS Press, 2009.
- [SP07] Geoff Sutcliffe and Yury Puzis. SRASS - A semantic relevance axiom selection system. In CADE, volume 4603 of Lecture Notes in Computer Science, pages 295–310. Springer, 2007.
- [SUS07] Geoff Sutcliffe, Josef Urban, and Stephan Schulz, editors. Proceedings of the CADE-21 Workshop on Empirically Successful Automated Reasoning in Large Theories, Bremen, Germany, 17th July 2007, volume 257 of CEUR Workshop Proceedings. CEUR-WS.org, 2007.
- [Sut16] G. Sutcliffe. The CADE ATP System Competition - CASC. AI Magazine, 37(2):99–101, 2016.
- [Urb04] Josef Urban. MPTP - motivation, implementation, first experiments. J. Autom. Reasoning, 33(3-4):319–339, 2004.
- [Urb07] Josef Urban. Malarea: a metasystem for automated reasoning in large theories. In Sutcliffe et al. [SUS07].
- [USPV08] Josef Urban, Geoff Sutcliffe, Petr Pudlák, and Jirí Vyskocil. Malarea SG1- machine learner for automated reasoning with semantic guidance. In Alessandro Armando, Peter Baumgartner, and Gilles Dowek, editors, Automated Reasoning, 4th International Joint Conference, IJCAR 2008, Sydney, Australia, August 12-15, 2008, Proceedings, volume 5195 of Lecture Notes in Computer Science, pages 441–456. Springer, 2008.
- [WPN08] Makarius Wenzel, Lawrence C. Paulson, and Tobias Nipkow. The isabelle framework. In Otmane Aït Mohamed, César A. Muñoz, and Sofiène Tahar, editors, Theorem Proving in Higher Order Logics, 21st International Conference, TPHOLs 2008, Montreal, Canada, August 18-21, 2008. Proceedings, volume 5170 of Lecture Notes in Computer Science, pages 33–38. Springer, 2008.
- [WRC65] L. Wos, G. Robinson, and D. Carson. Efficiency and completeness of the set of support strategy in theorem proving. Journal of the Association for Computing Machinery, 12:536–541, 1965.