Rules with parameters in modal logic I
We study admissibility of inference rules and unification with parameters in transitive modal logics (extensions of K4); in particular, we generalize various results on parameter-free admissibility and unification to the setting with parameters.
Specifically, we give a characterization of projective formulas generalizing Ghilardi’s characterization in the parameter-free case, leading to new proofs of Rybakov’s results that admissibility with parameters is decidable and unification is finitary for logics satisfying suitable frame extension properties (called cluster-extensible logics in this paper). We construct explicit bases of admissible rules with parameters for cluster-extensible logics, and give their semantic description. We show that in the case of finitely many parameters, these logics have independent bases of admissible rules, and determine which logics have finite bases.
As a sideline, we show that cluster-extensible logics have various nice properties: in particular, they are finitely axiomatizable, and have an exponential-size model property. We also give a rather general characterization of logics with directed (filtering) unification.
In the sequel, we will use the same machinery to investigate the computational complexity of admissibility and unification with parameters in cluster-extensible logics, and we will adapt the results to logics with unique top cluster (e.g., ) and superintuitionistic logics.
Admissibility of inference rules is among the fundamental properties of nonclassical propositional logic: a rule is admissible if the set of tautologies of the logic is closed under the rule, or equivalently, if adjunction of the rule to the logic does not lead to derivation of new tautologies. Admissible rules of basic transitive modal logics (, , , , , …) are fairly well understood. Rybakov proved that admissibility in a large class of modal logics is decidable and provided a semantic description of their admissible rules; see [ryb:bk] for a detailed treatment. Ghilardi [ghil] gave a characterization of projective formulas in terms of extension properties of their models, and proved the existence of finite projective approximations. This led to an alternative proof of some of Rybakov’s results, and it was utilized by Jeřábek [ejadm, ej:indep] to construct explicit bases of admissible rules, and to determine the computational complexity of admissibility [ej:admcomp]. A sequent calculus for admissible rules was developed by Iemhoff and Metcalfe [iem-met]. Methods used for transitive modal logics were paralleled by a similar treatment of intuitionistic and intermediate logics, see e.g. [ryb:bk, ghilil, iem:aripc, iem:imed2].
Admissibility is closely related to unification [baa-sny, baa-ghi]: for equational theories corresponding to algebraizable propositional logics, -unification can be stated purely in terms of logic, namely a unifier of a formula is a substitution which makes it a tautology. Thus, a rule is admissible iff every unifier of the premises of the rule also unifies its conclusion, and conversely the unifiability of a formula can be expressed as nonadmissibility of a rule with inconsistent conclusion. In fact, the primary purpose of Ghilardi [ghil] was to prove that unification in the modal logics in question is finitary.
In unification theory, it is customary to work in a more general setting that allows for extension of the basic equational theory by free constants. In logical terms, formulas may include atoms (variously called parameters, constants, coefficients, or metavariables) that behave like ordinary propositional variables for most purposes, but are required to be left fixed by substitutions. Some of the above-mentioned results on admissibility in transitive modal logics also apply to admissibility and unification with parameters, in particular Rybakov [ryb:s4con, ryb:provlog, ryb:grz, ryb:bk] proved the decidability of admissibility with parameters in basic transitive logics, and he has recently extended his method to show that unification with parameters is finitary in these logics [ryb:modunifcoef, ryb:intunifcoef2]. Nevertheless, a significant part of the theory only deals with parameter-free rules and unifiers.
The purpose of this paper is to (at least partially) remedy this situation by extending some of the results on admissibility in transitive modal logics to the setup with parameters. Our basic methodology is similar to the parameter-free case; however, the presence of parameters brings in new phenomena leading to nontrivial technical difficulties that we have to deal with.
For a more detailed overview of the content of the paper, after reviewing basic concepts and notation (Section 2) and establishing some elementary background on multiple-conclusion consequence relations with parameters (Section 2.1), we start in Section 3 with a parametric version of Ghilardi’s characterization of projective formulas in transitive modal logics with the finite model property in terms of a suitable model extension property on finite models. In Section 4, we introduce the class of cluster-extensible (clx) logics (and more generally, -extensible logics for the case when the set of allowed parameters is finite), and we use the results from Section 3 to show that in clx (or -extensible) logics, all formulas have projective approximations. As a corollary, this reproves results of Rybakov [ryb:bk, ryb:modunifcoef] that such logics have finitary unification type, and if is decidable, then admissibility in is also decidable, and one can compute a finite complete set of unifiers of a given formula. In order to determine which of these logics have unitary unification, we include in Section 4.2 a simple syntactic criterion for directed (filtering) unification, vastly generalizing the result of Ghilardi and Sacchetti [ghi-sacc]. In Section 4.3, we look more closely at semantic and structural properties of clx logics: we show that every clx logic is finitely axiomatizable, decidable, -definable on finite frames, and has an exponential-size model property. Moreover, the class of clx logics is closed under joins in the lattice of normal extensions of . (These results mostly do not have good analogues in the parameter-free case; they exploit the fact that the extension conditions designed to make the other results on admissibility and unification work need to be more restrictive when parameters are considered.)
In Section 5, we introduce (multiple-conclusion) rules corresponding to the existence of a parametric version of tight predecessors, generalizing the parameter-free rules considered in [ejadm, ej:indep]. We investigate their semantic properties, and as the main result of this section, we show that these extension rules form bases of admissible rules for clx or -extensible logics. We present single-conclusion variants of these bases in Section 5.1. Finally, in Section 5.2, we modify the extension rules further to provide independent bases of admissible rules with finitely many parameters for -extensible logics, and we show that finite bases exist if and only if the logic has bounded branching.
As the name suggests, this paper is to be continued by a sequel, where we will address the computational complexity of admissibility and unification with parameters in clx logics, and modifications of our results to related classes of logics: modal logics whose finite rooted frames have a single top cluster (such as and ), and intuitionistic and intermediate logics.
2 Preliminaries and notation
The purpose of this section is to review basic definitions and standard facts we are going to use in order to fix our terminology and notation. For more detailed information, we refer the reader to [cha-zax] (modal logic), [ryb:bk] (admissible rules), [baa-sny, baa-ghi] (unification), [jans-sep] (propositional consequence relations), [sh-sm] (multiple-conclusion consequence relations).
We will work with propositional languages consisting of formulas freely built from a (usually countably infinite) set of atoms using a fixed set of finitary connectives. (We distinguish two types of atoms: variables and parameters. We will elaborate on this in Section 2.1.) We will usually denote formulas by lowercase Greek letters . We write if is a subformula of . denotes the set of all subformulas of a formula , and the length (i.e., the number of symbols) of . Finite sets of formulas will be usually denoted by uppercase Greek letters .
Let us fix a propositional language . An atomic substitution is a mapping that commutes with connectives, hence it is uniquely determined by its values on atoms. (We reserve the term “substitution” for parameter-preserving substitutions, see below.) A (propositional) logic is an atomic single-conclusion consequence relation: a binary relation between finite sets of formulas and formulas (written in infix notation as ), satisfying
(weakening) implies ,
(cut) and implies ,
(substitution) implies for every atomic substitution .
Here we employ common conventions for sets of formulas: denotes , can stand for , and . We also write instead of ; such formulas are called -tautologies. A logic is inconsistent if all formulas are tautologies, otherwise it is consistent. Note that our consequence relations are by definition finitary, or more precisely, they are finitary fragments of consequence relations under a more conventional definition; we consider our choice more convenient for the purpose of investigation of admissible rules and unification, as these only concern the finitary fragment of a given logic.
Being a binary relation, a logic is a set of pairs , where is a finite set of formulas, and a formula. Such pairs are called single-conclusion rules, and we write them as . In this context, a formula can be identified with the rule (an axiom). If is a logic and a set of single-conclusion rules (or formulas), denotes the smallest logic including and . (We reserve for parametric consequence relations, see below.)
In this paper, we will mostly work with normal modal logics. The basic modal language is generated by the connectives ; other common connectives () are defined as abbreviations in the usual way. We also put , . is the smallest logic in the basic modal language that includes classical propositional tautologies, and the axioms and rules
A transitive modal logic is an axiomatic extension of , i.e., a logic of the form , where is a set of formulas. (Under our definition, normal modal logics are identified with their global consequence relations, whereas in most modal literature they are identified with their sets of tautologies. Nevertheless, we will abuse the notation and write as a short-hand for “ is a transitive modal logic”. Local consequence relations or non-normal modal logics do not appear in this paper, hence our usage of agrees with its standard meaning.)
The set of all transitive modal logics ordered by inclusion is a complete lattice, denoted . The meet of a family of logics is just their intersection, and we will write it as such. We will write joins with , though in the case of binary joins we also have .
A (transitive) Kripke frame is a pair , where is a transitive binary relation on a (possibly empty) set . We will generally use the same symbol to denote both the frame and its underlying set. We write for , for , and for . We read as “ is accessible from ”, or for short, “ sees ”. A point is reflexive if , and irreflexive otherwise. If , we define , . The cluster of is . If , then is the frame . If for some , is called a rooted frame, any such is its root, and its root cluster. If , an is called a maximal (or -maximal) point of , if for every . A cluster is final if its points are maximal in , and inner otherwise. An is an antichain if for any such that .
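For finite frames, the notions just introduced are directly computable. The following Python sketch is only an illustration under assumptions that are not part of the text: a frame is encoded as a list of points together with a set of pairs (u, v) read as "u sees v", maximality of u in a subset is read as "every point of the subset seen by u sees u back", and the names `clusters`, `maximal_points`, `is_antichain`, and the example frame are hypothetical.

```python
# Hypothetical example frame: the root 0 sees a two-point reflexive
# cluster {1, 2} and an irreflexive point 3. The relation is transitive.
POINTS = [0, 1, 2, 3]
REL = {(0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (2, 1), (2, 2)}


def is_reflexive(u, rel):
    """A point is reflexive if it sees itself."""
    return (u, u) in rel


def clusters(points, rel):
    """Partition the points of a finite transitive frame into clusters.

    The cluster of u consists of u and all v such that u sees v and v sees u.
    """
    seen, result = set(), []
    for u in points:
        if u in seen:
            continue
        c = frozenset({u} | {v for v in points
                             if (u, v) in rel and (v, u) in rel})
        seen |= c
        result.append(c)
    return result


def maximal_points(subset, rel):
    """Points of subset that see nothing in subset outside their own cluster."""
    return {u for u in subset
            if all((v, u) in rel for v in subset if (u, v) in rel)}


def is_antichain(subset, rel):
    """No point of subset sees a different point of subset."""
    return all((u, v) not in rel
               for u in subset for v in subset if u != v)
```

In this example, the final clusters are {1, 2} and {3}, while the root cluster {0} is inner.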
A (Kripke) model is a triple , where is a Kripke frame, and the valuation is a relation between elements of and formulas, satisfying the usual conditions for compound formulas. Again, we often use the same symbol for a model and its underlying set, and we write instead of when we need to stress which model the belongs to. We write if for every .
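The "usual conditions for compound formulas" can be made concrete on a finite model. The sketch below is an illustration only: the encoding of formulas as nested tuples, the representation of a valuation as a dictionary from points to sets of true atoms, and the names `holds` and `valid_in_model` are assumptions introduced here, not notation from the paper.

```python
# Hypothetical finite model: root 0 sees a reflexive cluster {1, 2}
# and an irreflexive point 3; the atom x is true exactly in the cluster.
POINTS = [0, 1, 2, 3]
REL = {(0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (2, 1), (2, 2)}
VAL = {0: set(), 1: {"x"}, 2: {"x"}, 3: set()}


def holds(rel, val, u, phi):
    """Satisfaction of a formula at the point u.

    Formulas are tuples: ("atom", name), ("not", f), ("and", f, g), ("box", f).
    """
    op = phi[0]
    if op == "atom":
        return phi[1] in val[u]
    if op == "not":
        return not holds(rel, val, u, phi[1])
    if op == "and":
        return holds(rel, val, u, phi[1]) and holds(rel, val, u, phi[2])
    if op == "box":
        # box f holds at u iff f holds at every point that u sees
        return all(holds(rel, val, v, phi[1]) for (w, v) in rel if w == u)
    raise ValueError(f"unknown connective: {op!r}")


def valid_in_model(points, rel, val, phi):
    """The formula holds at every point of the model."""
    return all(holds(rel, val, u, phi) for u in points)


BOX_X = ("box", ("atom", "x"))
```

Here `BOX_X` fails at the root (which sees the point 3 where x is false), holds in the cluster, and holds vacuously at the irreflexive point 3, which sees nothing.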
If , a Kripke -frame is a Kripke frame such that for every model and every -tautology . An -model is a model such that is an -frame. denotes the set of all finite rooted -models. For a formula , we put . Notice that . has the finite model property (fmp) if implies for every formula .
If is a model and a substitution, we define to be the model based on the same frame such that iff for every formula and . Notice that .
Let be a finite model. The depth of is the maximal length of a chain in . The branching of is the maximal number of immediate successor clusters of any node . The width of is the maximal size of an antichain in any rooted subframe of .
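As a rough illustration of depth and branching, the following Python sketch computes both for a finite transitive frame. It rests on assumptions made here for the sake of the example: chains are counted in the strict order "u sees v but v does not see u" (so points of one cluster contribute once), an immediate successor cluster of u is a cluster strictly above u with nothing strictly between, and the function names and the example frame are hypothetical.

```python
# Hypothetical example frame: root 0 below a reflexive cluster {1, 2}
# and an irreflexive point 3.
POINTS = [0, 1, 2, 3]
REL = {(0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (2, 1), (2, 2)}


def strictly_above(u, points, rel):
    """Points seen by u that do not see u back."""
    return {v for v in points if (u, v) in rel and (v, u) not in rel}


def depth(points, rel):
    """Maximal length of a strict chain; 0 for the empty frame."""
    def d(u):
        above = strictly_above(u, points, rel)
        return 1 + max((d(v) for v in above), default=0)
    return max((d(u) for u in points), default=0)


def branching(points, rel):
    """Maximal number of immediate successor clusters of a point."""
    best = 0
    for u in points:
        up = strictly_above(u, points, rel)
        # keep the points of up that are not strictly above another point of up
        immediate = {v for v in up
                     if not any(v in strictly_above(w, points, rel) for w in up)}
        # count clusters rather than individual points
        cls = {frozenset(w for w in immediate
                         if w == v or ((v, w) in rel and (w, v) in rel))
               for v in immediate}
        best = max(best, len(cls))
    return best
```

The recursion in `depth` terminates because the strict order on a finite transitive frame has no infinite ascending chains.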
For any formula , we put , . If is a set of formulas, denotes the set of all assignments , where . If is finite and , we put . (Here and elsewhere, the empty conjunction is defined as , and the empty disjunction as .) Conversely, if is a Kripke model and , (shortened to if is understood from the context) denotes the assignment such that iff . If is a model and a formula, denotes .
A general frame is , where is a Kripke frame, and is a Boolean algebra of sets closed under the operation , or equivalently, under . Sets from are called admissible (or definable), and their arbitrary intersections are closed sets. A Kripke frame can be identified with the general frame . We will sometimes write just frame instead of general frame, however finite frames are always assumed to be Kripke frames. An admissible valuation in is a valuation in satisfying for every .
If is a cardinal number, a general frame is -generated if is generated as a modal algebra by a subset of size at most . (Note that this notion is unrelated to the similarly named generated subframes.)
A general frame is refined if for every ,
(In other words, all sets of the form or are closed.) A family of sets has the finite intersection property (fip) if every finite subfamily of it has a nonempty intersection. A frame is compact if every family of admissible (or closed) sets with fip has a nonempty intersection. Compact refined frames are called descriptive.
If is a transitive modal logic, a (general) -frame is a frame such that for every admissible valuation and -tautology . Every is complete with respect to descriptive -frames.
We will use the following well-known property:
If is a descriptive frame, is closed, and , then there exists a -maximal such that .
Proof: If is a nonempty chain, the set has fip. Since each is closed, has a nonempty intersection, and any is an element of majorizing . Thus, satisfies the assumptions of Zorn’s lemma, and the result follows.
Let and be general frames. is a generated subframe of if , (which implies ), and . A p-morphism from to is a mapping such that
iff there is such that ,
for every , , and . The disjoint union of frames , , is the frame whose underlying set is the disjoint union , , and . Generated submodels, and p-morphisms and disjoint unions of models, are defined similarly.
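For finite Kripke frames (where every set of points is admissible), the p-morphism condition reduces to a finite check. The sketch below is illustrative only and makes assumptions of its own: the map is given as a dictionary, the condition is read as "f(x) sees u iff x sees some y with f(y) = u", and the name `is_p_morphism` and both example frames are hypothetical.

```python
# Hypothetical source frame: root 0, reflexive cluster {1, 2}, irreflexive 3.
POINTS1 = [0, 1, 2, 3]
REL1 = {(0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (2, 1), (2, 2)}

# Hypothetical target frame: the cluster {1, 2} collapsed to one reflexive point.
POINTS2 = ["r", "c", "t"]
REL2 = {("r", "c"), ("r", "t"), ("c", "c")}


def is_p_morphism(f, points1, rel1, points2, rel2):
    """Check: f(x) sees u  iff  x sees some y with f(y) = u."""
    for x in points1:
        for u in points2:
            has_preimage = any((x, y) in rel1 and f[y] == u for y in points1)
            if ((f[x], u) in rel2) != has_preimage:
                return False
    return True


COLLAPSE = {0: "r", 1: "c", 2: "c", 3: "t"}   # identifies the points of a cluster
BAD_MAP = {0: "r", 1: "c", 2: "c", 3: "c"}    # sends an irreflexive end point to "c"
```

`COLLAPSE` passes the check, whereas `BAD_MAP` fails it: the image of 3 sees "c", but 3 itself sees nothing.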
We will usually index sequences of formulas, frames, points, and other objects by nonnegative integers, whose set is denoted . In particular, if , then (without further qualification such as ) means .
2.1 Parametric consequence relations
As already mentioned, we consider atoms of two kinds: variables and parameters (in unification literature, the latter are usually called constants). The set of all variables is denoted , and we assume it is countably infinite. We can enumerate , but for ease of reading we will also use letters for variables. The set of all parameters is denoted , and we will use letters such as for parameters. We assume that is at most countable, but we allow it to be infinite or finite (or even empty, so that our results subsume the parameter-free case). If and , denotes the set of modal formulas in parameters and variables .
A substitution is an atomic substitution such that for every parameter . A single-conclusion consequence relation is a relation between finite sets of formulas and formulas (or equivalently, a set of single-conclusion rules) which satisfies conditions (i)–(iii) from the definition of a logic, as well as
implies for every substitution .
More generally, a multiple-conclusion consequence relation (or just consequence relation for short) is a binary relation between finite sets of formulas, satisfying
(weakening) implies ,
(cut) and implies ,
(substitution) implies for every substitution .
This definition implies that the following more general form of the cut property holds for all finite sets of formulas :
(general cut) If for every partition , then .
(We consider here only finitary consequence relations, however we remark that if we allowed with infinite, the proper definition of consequence relations would need to include (iii’) for arbitrary sets in place of the weaker condition (iii); see [sh-sm] for details.)
A consequence relation is thus a set of pairs of finite sets of formulas. We will call such pairs multiple-conclusion rules, or just rules, and we will write them as .
For every consequence relation , the set of single-conclusion rules such that is a single-conclusion consequence relation, the single-conclusion fragment of . Conversely, for every single-conclusion consequence relation , there is a smallest consequence relation whose single-conclusion fragment is , namely iff for some . In particular, if is a logic, a rule is -derivable if it belongs to the smallest consequence relation extending (which we identify with itself).
If is a consequence relation and a set of rules, denotes the smallest consequence relation containing and .
Let be a logic. An -unifier of a set of formulas is a substitution such that for every . A rule is -admissible if every unifier of also unifies some . The set of all -admissible rules forms a consequence relation which we denote . A basis of -admissible rules is a set of rules such that .
A logic is (finitely) equivalential if there is a finite set of formulas such that
for every formula , possibly involving other variables not shown. Modal logics are equivalential with . Substitutions are equivalent, written , if for every variable and . A substitution is more general than , written , if for some substitution . A complete set of unifiers of a set of formulas is a set of unifiers of such that every unifier of is less general than some . A complete set of unifiers is minimal if no is complete, or equivalently, if consists of pairwise incomparable -maximal unifiers. A most general unifier (mgu) of is a unifier of more general than any other unifier of . If has a minimal complete set of unifiers , its cardinality is an invariant of . We say that is of
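type 1 (unary), if its minimal complete set of unifiers has exactly one element (i.e., it has an mgu),
type ω (finitary), if its minimal complete set of unifiers is finite with more than one element,
type ∞ (infinitary), if its minimal complete set of unifiers is infinite,
type 0 (nullary), if it has no minimal complete set of unifiers.
The unification type of is the maximum of types of unifiable finite sets of formulas , where the types are ordered as 1 < ω < ∞ < 0. has at most finitary unification, if its unification type is unary or finitary.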
A parametric Kripke frame is , where is a Kripke frame, and . Similarly, a parametric (general) frame is , where is a general frame, and satisfies for every . An admissible valuation in is an admissible valuation in such that .
A rule is satisfied in a model if for some , or for some . A rule is valid in a parametric frame , written , if is satisfied in any model based on an admissible valuation in .
Generated subframes and disjoint unions of parametric frames are defined in the obvious way. P-morphisms of parametric frames are required to preserve the valuation of parameters in both directions. Validity of rules in parametric frames is preserved by p-morphic images. Single-conclusion rules are also preserved under disjoint unions, and premise-free rules under generated subframes.
Note that parametric frames (and other parametrified notions in this subsection) are not technically any more demanding than usual frames or models. The purpose they serve is to avoid endless repetition that various frames come with a predefined valuation of parameters, which is supposed to be preserved by constructs such as p-morphisms. They are also conceptually important in that they provide adequate semantics for modal logic in signature expanded with free constants, just as usual frames give an adequate semantics for modal logic in its basic signature.
Let , , . The canonical frame is the descriptive parametric frame , where is the collection of maximal -consistent subsets of ; for , we put iff iff ; consists of sets of the form , where ; and we put iff for and .
We have . On the other hand, if , where and , then . The following standard lemma follows easily.
Let be a rule whose parameters are included in .
If , then for every .
If , there is such that whenever .
If is a class of parametric frames, the set of all rules valid in is easily seen to be a consequence relation extending . On the other hand, every consequence relation is complete with respect to a class of (finitely generated) descriptive frames. In particular, if , where , then the general cut property (iii’) and Zorn’s lemma imply that there exists a partition such that , , and for any finite , . Then is a closed (hence descriptive) generated subframe of , and one checks easily that and (see e.g. [ej:canrules, Thm. 2.2] for the parameter-free case).
On a related note, descriptive parametric frames can be embedded in canonical frames. The lemma below holds for arbitrary cardinals if we allow uncountable sets of variables and parameters, but we will only need it for finite (hence finitely generated) frames.
Let , , and be a parametric -generated descriptive -frame. If is a set of variables such that , there is a general frame isomorphism from onto a closed generated subframe of , preserving the valuation of parameters from .
Proof: Let be the free -algebra generated by , and a homomorphism from to the algebra of admissible sets of , mapping onto a set of generators of , and each to the element of given by the valuation of in . Since is onto, the dual p-morphism from to is injective, and it has the required properties.
A nonstructural consequence relation is a binary relation between finite sets of formulas satisfying conditions (i), (ii), (iii) from the definition of multiple-conclusion consequence relations. We will not refer to nonstructural consequence relations directly, but we will extend the notation to rules as follows. Let be a (structural) consequence relation, and a set of rules. We write if is in the least nonstructural consequence relation containing and .
Note that if the rules in and are just axioms, iff the same holds for the corresponding formulas under the original consequence relation , thus this overloading of the symbol does not lead to conflicts. Also, if , then iff .
( defines a sort of a single-conclusion consequence relation operating with multiple-conclusion rules instead of formulas, but we will not use this terminology in order to avoid unnecessary confusion.)
If extends , and is a frame validating , then implies that every admissible valuation in that satisfies all rules from also satisfies . One can in fact show easily that is complete with respect to this semantics, but we will not need this. Rather, we will use the following lemma which follows from [ejadm, L. 2.3, 2.4], but we include a direct proof for completeness.
Let have fmp, and be a finite set of rules. If , there is a finite -model such that , and .
Proof: Write , and let be the set of all formulas occurring in . By the general cut property (iii’), there is a partition such that , , and . In particular, if , then or , and for every , . The latter implies that there are models with roots such that , and . Let be the disjoint union of all , . Then for every , and for every . In particular, for every , and .
3 Projective formulas
Let us fix a logic with the finite model property. Recall that a formula is projective if it has a projective unifier, which is an -unifier of such that
for every variable (which implies for every formula ). A projective unifier of a formula is also its most general unifier.
Ghilardi [ghil] described projective formulas in the parameter-free case: they are exactly the formulas whose finite -models have a certain extension property, and moreover, one can explicitly define for any formula a substitution satisfying (1) which turns out to be a projective unifier whenever the formula is projective. The goal of this section is to generalize this result to projectivity with parameters. Let us start by defining the relevant extension properties and substitutions.
Models are variants of each other if they are based on the same frame, have the same valuation of parameters, and their valuations of variables can differ only in . A set of models has the model extension property if every all of whose rooted generated proper submodels belong to has a variant in . A formula has the model extension property if this holds for .
Let , where and are finite sets of parameters and variables, respectively. Let , where each is a Boolean function of the parameters. We define the Löwenheim substitution by
for every , where is identified with any Boolean formula representing it. Notice that sequences as above can be equivalently represented as assignments or . Let be the composition of all substitutions of the form , in an arbitrary order. We will also write and when is clear from the context.
Notice that in the case , can be identified with a variable assignment , and is equivalent to the substitution
considered by Ghilardi.
The rest of this section is devoted to the proof of the following characterization.
Let have the finite model property, and be a formula in finitely many parameters and variables . Then the following are equivalent.
has the model extension property.
is a unifier of , where , .
In the parameter-free case, we obtain . This is a considerable improvement over Ghilardi’s original proof, which gives nonelementary (a tower of exponentials whose height is the modal degree of ).
For the next few lemmas, let us fix finite sets of parameters and variables , a formula with the model extension property, and . We aim to show that , which in view of the fmp of amounts to being true in every finite rooted -model .
The basic idea behind the substitutions is that their successive application leaves unchanged the part of where already holds, while we are making progress on the rest of the model: specifically, a maximal cluster where fails can be made to satisfy by applying for a suitably chosen .
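The "leave unchanged where the formula already holds" idea is easiest to see in the classical propositional analogue, which is not the modal, parametric construction of this paper. The Python sketch below is an illustration under that assumption: it uses the classical Löwenheim substitution x ↦ (φ ∧ x) ∨ (¬φ ∧ a(x)), described by its semantic effect on valuations, and the names `loewenheim` and `is_unifier` as well as the sample formula are hypothetical.

```python
from itertools import product


def loewenheim(phi, a):
    """Semantic action of the classical Loewenheim substitution for phi and a.

    phi is a function from valuations (dicts) to bool, a is a fixed assignment.
    Evaluating the substituted formula at v is the same as evaluating the
    original formula at v itself if phi holds at v, and at a otherwise.
    """
    def transformed(v):
        return v if phi(v) else a
    return transformed


def is_unifier(phi, theta, variables):
    """Classically, theta unifies phi iff the substituted formula is a tautology."""
    return all(phi(theta(dict(zip(variables, bits))))
               for bits in product([False, True], repeat=len(variables)))


def PHI(v):
    """Sample formula x or y."""
    return v["x"] or v["y"]


GOOD = {"x": True, "y": False}    # an assignment satisfying PHI
BAD = {"x": False, "y": False}    # an assignment falsifying PHI
```

With the satisfying assignment `GOOD` the substitution unifies `PHI`, while with `BAD` it does not. In the modal setting discussed here, a single such step only repairs part of a model, which is why the substitutions are composed and iterated.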
Let and .
If , then .
If , then .
If , where is the variant of such that for every , then .
If , then .
(iv): By the model extension property, has a variant such that . We may assume that points in with the same valuation of parameters have the same valuation of variables: we can first collapse all with the same to a single point by a p-morphism, apply the extension property, and lift back the valuation of variables to the original frame. Let be such that for each , and write . By repeated use of (i), we have , hence by (iii), and using (i) again.
Lemma 3.4 implies that for any of depth at most . However, in order to show that some power of is a unifier of , we need a uniform bound on independent of . As in Ghilardi’s proof, we will achieve this by defining a rank function on models whose number of possible values depends only on , and showing that sufficiently many applications of will strictly decrease the rank or make the model satisfy .
Ghilardi’s rank is based on Fine’s -equivalence [fine1]. Matters become more delicate when we need to deal with the valuation of parameters, hence we need a notion of rank better adapted to our particular situation in order to make the arguments go through. We will use a rank function based on the satisfaction of some formulas related to . As a side effect, this leads to much smaller bounds, as already mentioned in Remark 3.3. We will also find it helpful to consider ranks to be the actual sets of formulas rather than just their cardinality.
If , we put
Notice that is a proper subset of , specifically for any . The crude rank of is , and its rank is . The rank of a point is defined as . Ranks are ordered lexicographically: if and , we put
The numerical rank of is , where . Notice that implies .
If , , then .
The argument for decreasing rank will be different depending on whether maximal clusters where fails are reflexive or irreflexive. We treat the reflexive case first.
Let be such that all points of have the same rank, , and has a reflexive -maximal cluster. Then .
Proof: Put , . If and , where , are compositions of ’s, we have
by Lemma 3.6, hence .
Fix in a reflexive maximal cluster of . We define as follows. Let . If , we pick an arbitrary . Otherwise , hence there exists such that
We define . Notice that
We can write . We claim , which implies . Assume for contradiction ; we may also assume without loss of generality that for every . Since and , we have
for every using Lemma 3.4. Notice that . We will show
for every and by induction on the complexity of . If , or with , then (6) follows immediately from (4) and (5). The steps for Boolean connectives are trivial. If , , we have by (4). On the other hand, for every by (5), and for every by the induction hypothesis, since . Thus, , irrespective of the reflexivity or irreflexivity of .
If , , we define
If , we put .
If is such that , and all -maximal clusters of are reflexive, then .
Let be such that all points of have the same crude rank, has an irreflexive -maximal point, and , where . Then .
Proof: Put , and fix an irreflexive maximal point . For any , we can change the valuation of parameters in the root of to match , and apply the model extension property to obtain its variant such that . Let be the valuation of variables in the root of . Since , , and valuation of boxed formulas in is unaffected by a change of variables or parameters in , we have
We can write , where and are compositions of some of the . Put and for , so that . Notice that
for any and , by the same argument as in (2).
For any , all -maximal clusters of are reflexive.
By (7), and satisfy the same Boolean combinations of atoms and boxed subformulas of . In particular, they agree on the satisfaction of itself. However, this contradicts and .