Hybrid Type-Logical Grammars,
First-Order Linear Logic and
the Descriptive Inadequacy of Lambda Grammars
Hybrid type-logical grammars [?, ?, ?] are a relatively new framework in
computational linguistics, which combines insights from the Lambek
calculus [?] and lambda grammars
[?, ?, ?] — lambda grammars are also called, depending on the authors,
abstract categorial grammars [?] and linear grammars [?], though with somewhat
different notational conventions.
The goal of this paper is to prove that hybrid type-logical grammars are a fragment of first-order linear logic. This embedding result has several important consequences. First, it provides a simple new proof theory for the calculus, thereby clarifying the proof-theoretic foundations of hybrid type-logical grammars; since the translation is simple and direct, it also provides several new parsing strategies for hybrid type-logical grammars. Second, NP-completeness of hybrid type-logical grammars follows immediately.
The main embedding result also sheds new light on problems with lambda grammars, which are a subsystem of hybrid type-logical grammars and hence a special case of the translation into first-order linear logic. Abstract categorial grammars are attractive both because of their simplicity — they use the simply typed lambda calculus, one of the most widely used tools in formal semantics, to compute surface structure (strings) as well as to compute logical form (meanings) — and because of the fact that they provide a natural account of quantifier scope and extraction; for both, the analysis is superior to the Lambek calculus analysis. So it is easy to get the impression that lambda grammars are an unequivocal improvement over Lambek grammars.
In reality, the picture is much more nuanced: while lambda grammars have some often discussed advantages over Lambek grammars, there are several cases — notably coordination, but we will see in Section 7 that this is true for any analysis where the Lambek calculus uses non-atomic arguments — where the Lambek grammar analysis is clearly superior. Many key examples illustrating the elegance of categorial grammars with respect to the syntax-semantics interface fail to have a satisfactory treatment in abstract categorial grammars.
However, whether or not lambda grammars are an improvement over the Lambek calculus is ultimately not the most important question. Since there is a large number of formal systems which improve upon the Lambek calculus, it makes much more sense to compare lambda grammars to these extensions, which include, among many others, Hybrid Type-Logical Grammars, the Displacement calculus [?] and multimodal type-logical grammars [?, ?]. These extended Lambek calculi all keep the things that worked in the Lambek calculus but improve on the analysis in ways which allow the treatment of more complex phenomena in syntax and especially in the syntax-semantics interface. Compared to these systems, the inadequacies of lambda grammars are evident: even for the things lambda grammars do right (quantifier scope and extraction), there are phenomena, such as reflexives and gapping, which are handled by the same mechanisms as quantifier scope and extraction in alternative theories, yet which cannot be adequately handled by lambda grammars. The abstract categorial grammar treatment suffers from problems of overgeneration and problems at the syntax-semantics interface unlike any other categorial grammar. I will discuss some possible solutions for lambda grammars, but it is clear that a major redesign of the theory is necessary. The most painless solution seems to be a move either to hybrid type-logical grammars or directly to first-order linear logic: both are simple, conservative extensions which solve the many problems of lambda grammars while staying close to the spirit of lambda grammars.
This paper is structured as follows. Section 2 will introduce first-order linear logic and Section 3 will provide some background about the simply typed lambda calculus. These two introductory sections can be skimmed by people familiar with first-order linear logic and the simply typed lambda calculus respectively. Section 4 will introduce hybrid type-logical grammars and in Section 5 we will give a translation of hybrid type-logical grammars into first-order linear logic and prove its correctness. Section 6 will then compare the Lambek calculus and several of its extensions through their translations in first-order linear logic. This comparison points to a number of potential problems for lambda grammars. We will discuss these problems, as well as some potential solutions in Section 7. Finally, the last section will contain some concluding remarks.
2 First-order Linear Logic
Linear logic was introduced by \citeasnounGirard as a logic which restricts the structural rules which apply freely in classical logic. The multiplicative, intuitionistic fragment of first-order linear logic (which in the following, I will call either MILL1 or simply first-order linear logic) can be seen as a resource-conscious version of first-order intuitionistic logic. Linear implication, written $A \multimap B$, is a variant of intuitionistic implication with the additional constraint that the argument formula $A$ is used exactly once. So, looking at linear logic from the context of grammatical analysis, we would assign an intransitive verb the formula $np \multimap s$, indicating it is a formula which combines with a single $np$ (a noun phrase) to form an $s$ (a sentence).
Linear logic is a commutative logic. In the context of language modelling, this means our languages are closed under permutations of the input string, which does not make for a good linguistic principle (at least not a good universal principle, and one which is debatable even in languages which allow relatively free word order). We need some way to restrict or control commutativity. The Lambek calculus [?] has the simplest such restriction: we drop the structural rule of commutativity altogether. This means linear implication splits into two implications: $A \backslash B$, which looks for an $A$ to its left to form a $B$, and $B / A$, which looks for an $A$ to its right to form a $B$. In the Lambek calculus, we would therefore refine the assignment to intransitive verbs from $np \multimap s$ to $np \backslash s$, indicating the intransitive verb is looking for the subject to its left.
In first-order linear logic, we can choose a more versatile solution, namely using first-order variables to encode word order. We assign atomic formulas a pair of string positions: $np$ becomes $np(0,1)$, meaning it is a noun phrase spanning position 0 (its leftmost position) to 1 (its rightmost position). Using pairs of (integer) variables to represent strings is standard in parsing algorithms. The addition of quantifiers makes things more interesting. For example, we can assign the formula $\forall x.\, n(3,x) \multimap np(2,x)$ to a determiner “the” which spans positions $2,3$. This means it is looking for a noun which starts at its right (that is, the leftmost position of this noun is the rightmost position of the determiner, 3) but ends at any position $x$ to produce a noun phrase which starts at position 2 (the leftmost position of the determiner) and ends at position $x$ (the rightmost position of the noun). Combined with a noun $n(3,4)$, this would allow us to instantiate $x$ to 4 and produce $np(2,4)$. In other words, the formula given to the determiner indicates it is looking for a noun to its right in order to produce a noun phrase, using a form of “concatenation by instantiation of variables” which should be familiar to anyone who has done some logic programming or who has a basic familiarity with parsing in general [?, ?]. Similarly, we can assign an intransitive verb at position $1,2$ the formula $\forall x.\, np(x,1) \multimap s(x,2)$ to indicate it is looking for a noun phrase to its left to form a sentence, as the Lambek calculus formula $np \backslash s$ for intransitive verbs does — this correspondence between first-order linear logic and the Lambek calculus is fully general; it is discussed in detail in [?] and briefly in the next section.
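This “concatenation by instantiation of variables” can be made concrete in a few lines of code. The sketch below is an illustration, not part of the formalism; the encoding of atomic formulas as tuples is our own. It unifies the argument of the determiner formula $\forall x.\, n(3,x) \multimap np(2,x)$ with the noun $n(3,4)$ and instantiates the result.

```python
def unify_atom(a, b, subst):
    """Unify two atomic formulas encoded as (predicate, term, ...).
    Terms are integers (string positions) or strings (variables).
    Returns an extended substitution, or None on failure."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    subst = dict(subst)
    for s, t in zip(a[1:], b[1:]):
        s = subst.get(s, s)        # dereference bound variables
        t = subst.get(t, t)
        if s == t:
            continue
        if isinstance(s, str):     # s is an uninstantiated variable
            subst[s] = t
        elif isinstance(t, str):   # t is an uninstantiated variable
            subst[t] = s
        else:                      # two distinct position constants
            return None
    return subst

# determiner: forall X. n(3,X) -o np(2,X), applied to the noun n(3,4)
argument, result = ('n', 3, 'X'), ('np', 2, 'X')
subst = unify_atom(argument, ('n', 3, 4), {})
conclusion = tuple(subst.get(t, t) for t in result)
print(conclusion)   # ('np', 2, 4)
```

Instantiating the quantified variable to 4 is exactly what “concatenates” the determiner span $2,3$ with the noun span $3,4$ into the noun phrase span $2,4$.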
After this informal introduction to first-order linear logic, it is
time to be a bit more precise. We will not need function symbols in
the current paper, so
terms are either variables, denoted $x, y, z, \ldots$ (we assume a
countably infinite number of them), or constants, for which I
will normally use the integers $0, \ldots, n$, giving an $n$-word string
$n+1$ string positions, from $0$ to $n$. The atomic formulas are of the form $p(t_1, \ldots, t_m)$, with each $t_i$ a term, $p$ a predicate symbol (we only need a
finite, typically small, number of predicate symbols, often only the
following four: $n$ for noun,
$np$ for noun phrase, $s$ for sentence, $pp$ for predicate phrase) and $m$ its
arity. Our language does not contain the identity relation symbol “=”. Given this set of atomic formulas and the set of variables, the set of formulas is
defined as follows
We treat formulas as syntactically equivalent up to renaming of bound variables, so substituting $\forall y.\, A[x := y]$ (where $A$ does not contain $y$ before this substitution is made) for $\forall x.\, A$ inside a formula will produce an equivalent formula; for example, $\forall x.\, np(x,1)$ is equivalent to $\forall y.\, np(y,1)$.
Table 1 shows the natural deduction rules for
first-order linear logic. The variable $y$ in the $\forall I$ and
$\exists E$ rules is called the eigenvariable of the
rule. These rules have the condition that the variable
which is replaced by the eigenvariable $y$
does not occur in undischarged hypotheses of the proof and that $y$
does not occur in the formula before the substitution is made.
As shown in [?], we can translate Lambek calculus sequents and formulas into first-order linear logic as follows.
The integers 0 to $n$ represent the positions of the formulas in the sequent and the translations for complex formulas introduce universally quantified variables. The translation for $A/B$ states that if we have a formula $A/B$ at positions $x,y$, then for any $z$, if we find a formula $B$ at positions $y,z$ (that is, to the immediate right of our formula) then we have an $A$ at positions $x,z$, starting at the left position of the $A/B$ formula and ending at the right position of the argument. In other words, a formula $A/B$ is something which combines with a $B$ to its right to form an $A$, just like its Lambek calculus counterpart.
Using this translation, we can see that the first-order linear logic formulas used for the determiner and the intransitive verb in the previous section correspond to the translations of $np/n$ at positions $2,3$ and $np \backslash s$ at positions $1,2$ respectively.
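The translation just described is easy to implement. The sketch below uses a hypothetical encoding of our own: Lambek atoms are strings, complex formulas are triples, and the output uses tagged tuples for atoms, quantifiers and implications.

```python
import itertools

_fresh = itertools.count()

def translate(f, left, right):
    """Translate a Lambek formula between string positions left/right
    into a first-order linear logic formula. Lambek formulas: atoms
    are strings, ('/', A, B) encodes A/B and ('\\', A, B) encodes A\\B.
    Output: ('at', p, l, r), ('forall', var, body), ('-o', arg, result)."""
    if isinstance(f, str):                 # atom p spanning (left, right)
        return ('at', f, left, right)
    op, a, b = f
    z = f"Z{next(_fresh)}"                 # fresh position variable
    if op == '/':   # A/B: combines with a B to its right to form an A
        return ('forall', z, ('-o', translate(b, right, z),
                                    translate(a, left, z)))
    else:           # A\B: combines with an A to its left to form a B
        return ('forall', z, ('-o', translate(a, z, left),
                                    translate(b, z, right)))

# the intransitive verb np\s at positions 1,2: forall z. np(z,1) -o s(z,2)
print(translate(('\\', 'np', 's'), 1, 2))
```

Note how the two directional implications differ only in which side of the span the fresh position variable is placed on, exactly as in the prose description above.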
To give a simple example of a first-order linear logic proof, we show a derivation of “every student ran”, corresponding to the Lambek calculus sequent.
We first translate the sequent into first-order linear logic.
We then translate the formulas as follows.
We can then show that “every student ran” is a grammatical sentence under these formula assignments as follows.
The application of the final rule is valid, since its side condition is satisfied.
Definition 2.1 (Universal closure)
If $A$ is a formula, we denote the set of free variables of $A$ by $FV(A)$.
For an antecedent $\Gamma = A_1, \ldots, A_n$, $FV(\Gamma) = FV(A_1) \cup \ldots \cup FV(A_n)$.
The universal closure of a formula $A$ with $FV(A) = \{x_1, \ldots, x_n\}$, denoted $Cl(A)$, is the formula $\forall x_1 \ldots \forall x_n.\, A$.
The universal closure of a formula $A$ modulo antecedent $\Gamma$, written $Cl_\Gamma(A)$, is defined by universally quantifying over the free variables of $A$ which do not occur in $\Gamma$. If $FV(A) \setminus FV(\Gamma) = \{x_1, \ldots, x_n\}$, then $Cl_\Gamma(A) = \forall x_1 \ldots \forall x_n.\, A$.
Proof If the closure of $A$ modulo $\Gamma$ prefixes $n$ universal quantifiers to $A$, we can go from $\Gamma \vdash A$ to $\Gamma \vdash Cl_\Gamma(A)$ by using the $\forall I$ rule $n$ times (the quantified variables added for the closure have been chosen to respect the condition on the $\forall I$ rule) and in the opposite direction by using the $\forall E$ rule $n$ times.
2.2 MILL1 with focusing and unification
The $\forall E$ rule, as formulated in the previous section, has the disadvantage that it requires us to choose a term $t$ with which to replace the quantified variable, and making the right choice for $t$ requires some insight into how the resulting formula will be used in the rest of the proof. In the example of the preceding section we need to make two such “educated guesses”: we instantiate one quantified variable to 2 to allow the elimination rule with its minor premiss, and another to 3 to produce the desired conclusion.
The standard solution to automate this process in first-order logic theorem proving is to change the $\forall E$ rule: instead of directly replacing the quantified variable by the “right” choice, we replace it by a meta-variable (I will use a Prolog-like notation with capitalized names for these variables, distinguishing them from both ordinary variables and the letters used for arbitrary formulas). These meta-variables will represent our current knowledge about the term with which we will replace a given quantified variable. The MGU we compute for the endsequent will correspond to the most general instantiations of these variables in the given proof (that is, all other instantiations can be obtained from this final MGU by means of additional substitutions).
The $\multimap E$ rule unifies the formulas of the argument and minor premiss of the rule (so the two occurrences of the argument formula need only be unifiable instead of identical). Remember that the unification of two atomic formulas $p(t_1, \ldots, t_m)$ and $q(u_1, \ldots, u_k)$ is only defined when $p = q$ and $m = k$, and that unification tries to find the most general instantiation of all free variables such that $t_i = u_i$ (for all $i$) and fails if no such instantiation exists. The presence of an explicit quantifier presents a complication, but only a minor one: bound variables are treated just like constants which, in addition, must respect the variable condition.
More precisely, the unification of two formulas is defined as follows.
The quantifier case assumes there are no free occurrences of the bound
variable before the substitution is made. It is defined in such a way that it is
independent of the actual variable names used for the quantifier (as
mentioned, we use a different variable for each occurrence of a
quantifier), and bound occurrences of variables are treated as
constants in this clause, subject to the following condition:
if we compute a substitution $[x := t]$ for a
formula $A$ and $t$ is not free
for $x$ in $A$, then unification fails. In other words, the substitution cannot introduce new
bound variables, so two formulas fail to unify whenever the required
substitution would result in an “accidental capture”, creating a new
bound occurrence of a variable.
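For the quantifier-free core, the unification described above is ordinary first-order unification with an occurs check; eigenvariables, which behave like constants, can simply be represented as constants. A minimal sketch, with a term encoding of our own (meta-variables are capitalized strings, compound terms are tuples, anything else is a constant):

```python
def walk(t, s):
    """Dereference variable t through the substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """Occurs check: does variable v appear inside term t?"""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(t1, t2, s):
    """Most general unifier of two terms or atomic formulas.
    Meta-variables are strings starting with an uppercase letter;
    compound terms are tuples (functor, arg1, ...). Returns an
    extended substitution, or None on failure."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if isinstance(t1, str) and t1[0].isupper():
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if isinstance(t2, str) and t2[0].isupper():
        return None if occurs(t2, t1, s) else {**s, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) \
            and t1[0] == t2[0] and len(t1) == len(t2):
        for a, b in zip(t1[1:], t2[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

print(unify(('np', 'X', 1), ('np', 0, 1), {}))   # {'X': 0}
```

Treating an eigenvariable as a fresh constant makes any attempt to instantiate a meta-variable to a term containing it visible as an ordinary unification failure.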
A second problem with natural deduction proof search is that we can have subproofs like the following.
In both cases, we introduce a connective and then immediately eliminate it. A natural deduction proof is called normal if it does not contain any subproofs of the forms shown above. One of the classic results for natural deduction is normalization, which states that we can eliminate such detours [?, ?]. In the case of linear logic, removing such detours is even guaranteed to decrease the size of the proof.
We use a form of focalized natural deduction [?, ?], which is a syntactic variant of natural deduction guaranteed to generate only normal natural deduction proofs. We use two turnstiles: the negative $\vdash_n$ and the positive $\vdash_p$ (for the reader familiar with focused proofs, the negative turnstile corresponds to sequents with a formula in focus and the positive turnstile to sequents without).
We will call a sequent $\Gamma \vdash_p C$ a positive sequent (and $C$ a positive formula) and a sequent $\Gamma \vdash_n C$ a negative sequent (and $C$ a negative formula).
Table 2 shows the rules of first-order linear logic in this format. For the lexicon rule, we require that the formula is closed. The formula of the hypothesis rule can contain free variables.
For the $\forall I$ rule, the eigenvariable is either a variable or a meta-variable which has no free occurrences in any undischarged hypothesis.
For the $\multimap E$ rule, $\sigma$ is the most general unifier of the two occurrences of the argument formula. That is, we unify the two occurrences of $A$ in their respective contexts, using unification for complex formulas as defined above. The resulting most general unifier is then applied to the two contexts and to $B$ (replacing, if necessary, any variables shared between $A$ and $B$).
We can see from the rules that axioms start negative and stay negative as long as they are the major premiss of a $\multimap E$ rule or the premiss of a $\forall E$ rule. We must switch to positive sequents to use the introduction rules or to use the sequent as the minor premiss of a $\multimap E$ rule.
The “detour” subproofs we have seen above cannot receive a consistent labeling: the formula is the conclusion of an introduction rule and must therefore be on the right-hand side of a positive sequent; however, it is also the major premiss of an elimination rule and must therefore be on the right-hand side of a negative sequent (it is easily verified that there is no way to transform a positive sequent into a negative sequent; however, the point is that the original detour receives an inconsistent labeling).
A principal branch is a sequence of negative sequents which starts at a hypothesis, then follows all elimination rules from (major) premiss to conclusion, ending at a focus shift rule (this corresponds to the normal notion of principal branch from e.g. [?]; a sequence of negative sequents can only pass through the major premiss of a $\multimap E$ rule and through the single premiss of a $\forall E$ rule).
A track is a path of negative sequents followed by a focus shift followed by a path of positive sequents. A track ends either in the conclusion of the proof or in the minor premiss of a $\multimap E$ rule.
The main track of a proof is the track which ends in its conclusion (these definitions correspond to the standard notions of track and main track in normal proofs, see e.g. [?]).
This suggests a relation between focused proofs and normal natural deduction proofs, which is made explicit in the following two propositions.
For every natural deduction proof of $\Gamma \vdash C$, there is a focused natural deduction proof with unification of $\Gamma \vdash_p C$.
Proof We first transform the natural deduction proof of $\Gamma \vdash C$ into a normal natural deduction proof, then proceed by induction on the depth $d$ of this normal proof, showing that we can create both a focused proof and a substitution.
If $d = 1$, we have an axiom or hypothesis rule, which we translate as follows.
If $d > 1$, we proceed by case analysis on the last rule.
The only case which requires some attention is the $\multimap E$ case. Given that the proof is normal, we have a normal (sub)proof which ends in a $\multimap E$ rule. We are therefore on the principal branch of this subproof and we know that a principal branch starts with an axiom/lexicon rule and then passes only through $\multimap E$ rules and $\forall E$ rules via their major premiss. Hence, the last rule producing the major premiss in the original proof must either have been an axiom/lexicon rule or an elimination rule for $\multimap$ or $\forall$.
Now the induction hypothesis gives us a proof of the major premiss and a proof of the minor premiss. However, given that the last rule of the proof which produces the major premiss was either an axiom/lexicon rule, the $\multimap E$ rule or the $\forall E$ rule — all of which have negative sequents as their conclusion — the last rule of the corresponding focused proof must have been the focus shift rule. Removing this focus shift rule produces a valid negative proof, which we can combine with the proof of the minor premiss as follows.
Note that this is again a proof which ends with a focus shift rule.
Since the original proof uses the stricter notion of identity (instead of unifiability) for the formulas, we need not change the substitution we have computed so far and therefore leave it unchanged.
For the $\forall E$ rule, the induction hypothesis gives us a focused proof of the premiss; by reasoning similar to the case for $\multimap E$, we know the last rule of this proof was a focus shift rule, which we can remove, and then extend the proof as follows.
We add the substitution $[X := t]$ to the unifier, where $t$ is the term used for the quantified variable in the original $\forall E$ rule and $X$ the meta-variable which replaces it.
The cases for $\multimap I$ and $\forall I$ are trivial, since we can extend the proof with the same rule.
For every focused natural deduction proof, there is a natural deduction proof.
Proof If we remove the focus shift rule and replace both $\vdash_n$ and $\vdash_p$ by $\vdash$, then we only need to give specific instantiations for the $\forall E$ rules. The most general unifier computed for the complete proof gives us such values for each (negatively) quantified variable (if desired, remaining meta-variables can be replaced by free variables).
The following is a standard property of normal natural deduction proofs (and therefore of focused natural deduction proofs).
Focused proofs satisfy the subformula property. That is, any formula occurring in a proof of $\Gamma \vdash_p C$ (or $\Gamma \vdash_n C$) is a subformula either of $\Gamma$ or of $C$.
The following proposition is easily verified by induction on the depth of the proof, using the correspondence between natural deduction proofs and lambda terms.
We can restrict the focus shift rule to atomic formulas. When we do so, we only produce long normal form proofs (which correspond to beta-normal, eta-long lambda terms).
The proof from the previous section looks as follows in the unification-based version of first-order linear logic, though we use a form with implicit antecedents to economize on horizontal space and to make comparison with the proof of the previous section easier. This proof computes a most general unifier corresponding to the explicit instantiations for the quantified variables at the $\forall E$ rules in the previous proof.
Restricting focus shift to atomic formulas produces the following proof in long normal form. Remark that the hypothesis in this proof is not identical to the argument formula of the implication, but unifies with it at the $\multimap E$ rule immediately below it.
2.3 Proof Nets
Proof nets are an elegant alternative to natural deduction and an important research topic in their own right; for reasons of space we provide only an informal introduction — the reader interested in more detail is referred to [?] for an introduction, to [?, ?] for detailed proofs in the context of linear logic and to [?, ?, ?] for introductions in the context of categorial grammars and the Lambek calculus. Though proof nets shine especially for the $\otimes$ and $\exists$ rules (where the natural deduction formulation requires commutative conversions to decide proof equivalence), they are a useful alternative in the $\multimap$ and $\forall$ case as well, since they provide an easy combinatorial way to do proof search; they therefore simplify arguments about the non-derivability of statements and serve to count the number of readings.
[?] shows that the proof nets of multiplicative linear logic [?, ?] have a simple extension to the first-order case. Essentially, a proof net is a graph labeled with (polarized occurrences of) the (sub)formulas of a sequent, subject to some conditions we will discuss below. Obviously, not all graphs labeled with formulas correspond to derivable statements. However, we can characterize the proof nets among the larger class of proof structures (graphs labeled with formulas which, contrary to proof nets, do not necessarily correspond to proofs) by means of simple graph-theoretic properties.
The basic building blocks of proof structures are links, as shown in Figure 1. We will call the formulas displayed below a link its conclusions and the formulas displayed above it its premisses. The axiom link (top left) has no premisses and two conclusions, the cut link has no conclusions and two premisses, the binary logical links have two premisses and one conclusion, and the unary logical links have one premiss and one conclusion. We will call the variable of a quantifier link the eigenvariable of the link and require that all links use distinct variables.
Given a statement to prove, we can unfold its formulas using the logical links of the figure, using the negative links for the antecedent formulas and the positive links for the succedent formula. Since there is only one type of link for each combination of connective/polarity, this unfolding is completely determined by the statement; the result is called a proof frame.
We turn this proof frame into a proof structure by connecting atomic formulas of opposite polarity in such a way that there is a perfect matching between the positive and negative atoms. This step can already fail, for example if the number of positive and negative occurrences of an atomic formula differ, but also because of incompatible atomic formulas, such as two atoms which could only be matched by instantiating the eigenvariable of a link. More generally, it can be the case that there is no coherent substitution which allows us to perform a complete matching of the atomic formulas using axiom links. These restrictions on the instantiations of variables are a powerful tool for proof search [?, ?].
Proof structures are essentially graphs in which some of the links are drawn with dashed lines; the binary dashed links are paired, as indicated by the connecting arc. We will call the dashed logical links the positive links and the solid logical links the negative links. The terms positive and negative only apply to the logical links; the axiom and cut links are neither positive nor negative. A proof structure containing only negative logical links is just a graph labeled with polarized formulas.
Figure 2 shows the proof net which corresponds to the natural deduction proof of Section 2.1. To save space, we have noted only the main connective at each link; the full formula can be obtained unambiguously from the context. We have also been free in the way we ordered the premisses of the links, which allows us to give a planar presentation of the axiom links, much like Lambek calculus proof nets. However, there is no planarity requirement in the proof net calculus; the first-order variables offer more flexibility than simple planarity. For the quantifier links, we have annotated the substitutions next to the link. If we use a unification-based presentation, as we did for natural deduction in Section 2.2, we can “read off” these substitutions from the most general unifier computed for the axioms (as opposed to natural deduction, it is the axioms, and not the $\multimap E$ rule, which are responsible for the unification of variables).
A proof structure is a proof net if the corresponding statement is derivable; that is, given the proof of Section 2.1, we know the proof structure of Figure 2 is a proof net. However, this definition is not very useful, since it depends on finding a proof in some other proof system; we would like to use the proof structure itself to directly decide whether or not the statement is derivable. Fortunately, it is possible to distinguish the proof nets from the other proof structures by simple graph-theoretic properties. To do so, we first introduce some auxiliary notions which turn the graph-like proof structures into standard graphs. Since axiom, cut and the negative links already produce normal graphs (a binary negative link corresponds to two edges, all other links to a single edge in the graph), we only need a way to remove the positive links.
A switching is a choice for each positive link, as follows.
For each positive $\multimap$ link, we choose one of its premisses.
For each positive $\forall$ link, we choose either its premiss or any of the formulas in the proof structure containing a free occurrence of the eigenvariable of the link.
Given a switching $s$, a correction graph is a proof structure where we replace all dashed links by an edge from the conclusion of the link to the formula chosen by the switching $s$.
[?] A proof structure is a proof net iff all its correction graphs are acyclic and connected.
Defined like this, it would seem that deciding whether or not a proof structure is a proof net is rather complicated: there are potentially many correction graphs — we have two independent possibilities for each positive $\multimap$ link and generally at least two subformulas containing the eigenvariable of each positive $\forall$ link, giving a number of correction graphs exponential in the number of positive links — and we need to verify all of them. Fortunately, there are very efficient alternatives: linear time in the quantifier-free case [?, ?] and at most squared time, though possibly better, in the case with quantifiers [?].
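For small examples, the brute-force version of the criterion is nevertheless easy to implement. The sketch below uses a hypothetical graph encoding of our own: a set of nodes, a list of solid edges, and for each positive link its conclusion together with the attachment points its switchings allow. It relies on the fact that a graph is acyclic and connected iff it is connected with exactly $|V| - 1$ edges.

```python
from itertools import product

def is_proof_net(nodes, solid_edges, positive_links):
    """Brute-force correctness check: every correction graph must be
    acyclic and connected. positive_links is a list of (conclusion,
    options) pairs; each option is a node the switching may connect
    the conclusion of the positive link to."""
    def is_tree(edges):
        # connected + exactly |V|-1 edges  <=>  acyclic and connected
        if len(edges) != len(nodes) - 1:
            return False
        adj = {v: [] for v in nodes}
        for a, b in edges:
            adj[a].append(b)
            adj[b].append(a)
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:                      # depth-first traversal
            for w in adj[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == len(nodes)

    # try every switching: one choice per positive link
    return all(
        is_tree(solid_edges + [(c, t)
                               for (c, _), t in zip(positive_links, sw)])
        for sw in product(*(opts for _, opts in positive_links)))
```

A proof structure with no positive links is checked in a single pass; each additional positive link multiplies the number of correction graphs by the number of its switching options, which is exactly the combinatorial explosion mentioned above.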
Going back to the example shown in Figure 2, we can see that there are two positive links and twelve correction graphs: there are six free occurrences of the eigenvariable of the $\forall$ link — four in atomic formulas and two additional occurrences in the conclusions which combine these atomic formulas — times the two independent possibilities for switching the $\multimap$ link left or right. We can verify that all twelve possibilities produce acyclic, connected graphs. Removing the positive links splits the graph into three connected components: a single conclusion node, a component containing the intransitive verb, and a final component containing the rest of the graph, ending at the conclusion of the positive $\multimap$ link (which has been disconnected from its premiss). Now, any switching for the $\multimap$ link will connect its isolated conclusion node to the component containing the intransitive verb (via one or the other of its premisses), leaving two connected components. Finally, all free occurrences of the eigenvariable occur in this newly created component, therefore any choice for a switching of the $\forall$ link will join these disconnected components into a single, connected component. Since each choice connected two disjoint components, we have not generated any cycles.
We can also show that this is the only possible proof structure for the given logical statement: there is only one choice of linking for two of the atomic predicates, though there are two choices for the third. However, the alternative proof structure fails, because it would require instantiating the eigenvariable of a $\forall$ link to 0, which is not allowed.
As a second example, let us show how we can use correction graphs to show underivability. Though it is clear that the switching for the universal quantifier must refer to free occurrences of its eigenvariable somewhere (as do its counterparts in natural deduction and sequent calculus), it is not so easy to find a small example in this fragment where the condition is necessary to show underivability, since finding a global instantiation of the variables is already a powerful constraint on proof structures. However, the existential quantifier and the universal quantifier differ only in the labeling of formulas for the links, and we need the formula labeling only for determining the free variables.
A proof structure of the underivable sequent is shown in Figure 3. It is easy to verify that this is the unique proof structure corresponding to this sequent. The sequent is used for computing the prenex normal form of a formula in classical logic, but it is invalid in intuitionistic logic and linear logic, since it depends on the structural rule of right contraction.
In order to show the sequent is invalid in linear logic, it suffices to find a switching such that the corresponding correction graph either contains a cycle or is disconnected. Figure 4 shows a correction graph for the proof structure of Figure 3 which is both cyclic and disconnected: one axiom is not connected to the rest of the structure, and the connection chosen by the switching produces a cycle, since there is a second path between the two formulas it connects, through the other axiom.
This concludes our brief introduction to proof nets for first-order linear logic. We refer the reader to Appendix A of [?] for discussion about the relation between proof nets and natural deduction.
3 Basic Properties of the Simply Typed Lambda Calculus
Before introducing hybrid type-logical grammars, we will first review some basic properties of the simply typed lambda calculus which will prove useful in what follows. This section is not intended as a general introduction to the simply typed lambda calculus: we will assume the reader has at least some basic knowledge such as can be found in Chapter 3 of [?] or other textbooks and some knowledge about substitution and most general unifiers. For more detail, and for proofs of the lemmas and propositions of this section, the reader is referred to [?].
A remark on notation: we will use $\rightarrow$ exclusively as a type constructor (also when we know we are using it to type a linear lambda term) and $\multimap$ exclusively as a logical connective.
A lambda term $M$ is a linear lambda term iff
for every subterm $\lambda x.\, N$ of $M$, $x$ has exactly one occurrence in $N$ (in other words, each abstraction binds exactly one variable occurrence), and
all free variables of $M$ occur exactly once.
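This definition translates directly into a checker. The sketch below uses a hypothetical term encoding of our own (variables are strings, `('lam', x, body)` for abstraction, `('app', fun, arg)` for application) and computes the free variables of a term while verifying both clauses of the definition above.

```python
def free_linear(term):
    """Return the set of free variables of `term` if it is linear,
    and None otherwise. Variables are strings, abstractions are
    ('lam', x, body), applications are ('app', fun, arg)."""
    if isinstance(term, str):
        return {term}
    if term[0] == 'lam':
        _, x, body = term
        fv = free_linear(body)
        if fv is None or x not in fv:
            return None      # the abstraction must bind an occurrence
        return fv - {x}
    _, fun, arg = term
    fv1, fv2 = free_linear(fun), free_linear(arg)
    if fv1 is None or fv2 is None or fv1 & fv2:
        return None          # a shared variable would occur twice
    return fv1 | fv2

def is_linear(term):
    """A term is linear iff free_linear succeeds."""
    return free_linear(term) is not None
```

Because a variable occurring twice is rejected at the application case, the `x not in fv` test at the abstraction case suffices to guarantee exactly one bound occurrence.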
Table 3 lists the Curry-style typing rules for the linear lambda calculus. For the application rule, the two contexts cannot share term variables; for the abstraction rule, the context cannot already contain the abstracted variable (i.e. the extended context must be a valid context).
For linear lambda terms, we have the following:
When $M$ is a linear lambda term and $\mathcal{D}$ a deduction of $\Gamma \vdash M : A$, then the term variables occurring in $\Gamma$ are exactly the free variables of $M$.
If $M$ and $N$ are linear lambda terms which do not share free variables, then $(M\ N)$ is a linear lambda term.
If $M$ is a linear lambda term with a free occurrence of $x$, then $\lambda x.\, M$ is a linear lambda term.
If $M$ is a linear lambda term and $M$ beta-reduces to $N$, then $N$ is a linear lambda term.
Lemma 3.3 (Substitution)
If $\Gamma, x{:}A \vdash M : B$ and $\Delta \vdash N : A$, and $\Gamma$ and $\Delta$ are compatible (ie. there are no conflicting variable assignments and therefore $\Gamma, \Delta$ is a valid context), then $\Gamma, \Delta \vdash M[x := N] : B$.
The following two results are rather standard, we can find them in [?] as Lemmas 2C1 and 2C2.
Lemma 3.4 (Subject Reduction)
Let $\Gamma \vdash M{:}\alpha$ and $M \rightarrow_\beta N$; then $\Gamma \vdash N{:}\alpha$.
Lemma 3.5 (Subject Expansion)
Let $\Gamma \vdash N{:}\alpha$, with $M$ a linear lambda term such that $M \rightarrow_\beta N$; then $\Gamma \vdash M{:}\alpha$.
3.1 Principal types
The main notions from Chapter 3 of [?] are the following.
Definition 3.6 (Principal type)
A principal type of a term $M$ is a type $\alpha$ such that
for some context $\Gamma$ we have $\Gamma \vdash M{:}\alpha$
if $\Gamma' \vdash M{:}\beta$, then there is a substitution $s$ such that $\beta = s(\alpha)$.
Definition 3.7 (Principal pair)
A principal pair for a term $M$ is a pair $\langle \Gamma, \alpha \rangle$ such that $\Gamma \vdash M{:}\alpha$ and for all $\Gamma'$, $\beta$ such that $\Gamma' \vdash M{:}\beta$ there is a substitution $s$ with $s(\langle \Gamma, \alpha \rangle) = \langle \Gamma', \beta \rangle$
Definition 3.8 (Principal deduction)
A principal deduction $D$ for a term $M$ is a derivation of a statement $\Gamma \vdash M{:}\alpha$ such that every other derivation with term $M$ is an instance of $D$ (ie. obtained by globally applying a substitution $s$ to all types in the proof).
From the definitions above, it is clear that if $D$ is a principal deduction for $\Gamma \vdash M{:}\alpha$, then $\langle \Gamma, \alpha \rangle$ is a principal pair and $\alpha$ a principal type of $M$.
If $M$ contains free variables $x_1, \ldots, x_n$, we can compute the principal type of the closed term $\lambda x_1 \ldots \lambda x_n.M$, which determines the principal pair for $M$.
3.2 The principal type algorithm
The principal type algorithm of Hindley [?] is defined as follows. It is slightly more general than strictly necessary, in that it computes principal deductions. It takes as input a lambda term $M$ and outputs either its principal type or fails in case $M$ is untypable. We closely follow Hindley’s presentation, keeping his numbering but restricting ourselves to linear lambda terms; we omit his correctness proof of the algorithm.
We proceed by induction on the construction of $M$.
If $M$ is a variable, say $x$, then we take an unused type variable $\alpha$ and return $x{:}\alpha \vdash x{:}\alpha$ as principal deduction.
If $M$ is of the form $\lambda x.N$ and $x$ occurs in $N$, then we look at the principal deduction $D$ of $N$ given by the induction hypothesis: if we fail to compute a principal deduction for $N$, then there is no principal deduction for $\lambda x.N$ either. If such a deduction does exist, then we can extend it as follows.
$M$ is of the form $\lambda x.N$ and $x$ does not occur in $N$; this case cannot occur, since it violates the condition on linear lambda terms (we must bind exactly one occurrence of $x$ in $N$), so we fail.
$M$ is of the form $(N\,P)$. If the algorithm fails for either $N$ or $P$, then $M$ is untypable and we fail. If not, the induction hypothesis gives us a principal proof $D_1$ of $\Gamma \vdash N{:}\alpha$ and a principal proof $D_2$ of $\Delta \vdash P{:}\beta$. If necessary, we rename type variables in $D_1$ and $D_2$ such that $D_1$ and $D_2$ have no type variables in common. Since $M$ is linear, $\Gamma$ and $\Delta$ cannot share term variables.
If $\alpha$ is of the form $\alpha' \rightarrow \alpha''$, then we compute the most general unifier $s$ of $\alpha'$ and $\beta$. If this fails, the term is untypable; if not, we combine the proofs as follows.
If $\alpha$ is a type variable, then we compute the most general unifier $s$ of $\alpha$ and $\beta \rightarrow \gamma$ (with $\gamma$ a fresh type variable). If this succeeds and the term is typable, we can produce its principal proof as follows.
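To make the case analysis above concrete, here is a compact sketch of the algorithm in Python, under a toy encoding of our own (not the paper's): term variables are strings, applications are 2-tuples, abstractions are ('lam', x, body) triples; type variables are integers, arrow types are ('->', a, b) tuples. It assumes its input is a linear lambda term, which is what lets it omit the occurs check during unification.

```python
# A compact sketch of Hindley's algorithm restricted to linear lambda
# terms. Toy encoding of our own (not the paper's): term variables are
# strings, applications are 2-tuples, abstractions are ('lam', x, body);
# type variables are ints, arrow types are ('->', a, b).
# Assumes its input is linear, which is why no occurs check is needed.

from itertools import count

def subst(s, t):
    """Apply substitution s (a dict from type variables to types) to t."""
    if isinstance(t, int):
        return subst(s, s[t]) if t in s else t
    return ('->', subst(s, t[1]), subst(s, t[2]))

def unify(t1, t2, s):
    """Extend s to a most general unifier of t1 and t2 (no occurs check)."""
    t1, t2 = subst(s, t1), subst(s, t2)
    if t1 == t2:
        return s
    if isinstance(t1, int):
        s[t1] = t2
        return s
    if isinstance(t2, int):
        s[t2] = t1
        return s
    s = unify(t1[1], t2[1], s)
    return unify(t1[2], t2[2], s)

def principal(term, fresh=None):
    """Return (context, principal type, substitution) for a linear term."""
    fresh = fresh or count()
    if isinstance(term, str):                 # a variable
        a = next(fresh)
        return {term: a}, a, {}
    if term[0] == 'lam':                      # an abstraction
        _, x, body = term
        ctx, b, s = principal(body, fresh)    # x must occur in body
        return {y: t for y, t in ctx.items() if y != x}, ('->', ctx[x], b), s
    fun, arg = term                           # an application
    ctx1, a, s1 = principal(fun, fresh)
    ctx2, b, s2 = principal(arg, fresh)       # fresh vars keep types disjoint
    c = next(fresh)
    s = unify(a, ('->', b, c), {**s1, **s2})
    ctx = {**ctx1, **ctx2}                    # linearity: contexts disjoint
    return {y: subst(s, t) for y, t in ctx.items()}, subst(s, c), s

# Principal type of λx.λy.(y x): a -> ((a -> b) -> b),
# with type variables numbered 1 and 2.
ctx, ty, _ = principal(('lam', 'x', ('lam', 'y', ('y', 'x'))))
assert ctx == {} and ty == ('->', 1, ('->', ('->', 1, 2), 2))
```

Keeping the three inference cases (variable, abstraction, application) in one function mirrors the induction on the construction of the term used in the text.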
The main utility of principal types in the current paper is given by the coherence theorem.
Theorem 3.9 (Coherence)
Suppose $\Gamma \vdash M{:}\alpha$ and let $\alpha$ be a principal type of $M$; then for any $N$ such that $\Gamma \vdash N{:}\alpha$ we have $M =_{\beta\eta} N$.
The coherence theorem states that a principal type determines a lambda term uniquely (up to equivalence). Since we work in a linear system, where weakening is not allowed, we only need the special case where the context is empty. This special case of Theorem 3.9 is the following: if $\vdash M{:}\alpha$ with $\alpha$ a principal type of $M$, then for any $N$ such that $\vdash N{:}\alpha$ we have that $M =_{\beta\eta} N$.
In brief, the principal type algorithm allows us to compute the principal type of a given typable lambda term, whereas the coherence theorem allows us to reconstruct a lambda term (up to equivalence) from a principal type.
We say a sequent $\Gamma \vdash M{:}\alpha$ is balanced if all atomic types occurring in the sequent occur exactly twice.
The following lemmas are easy consequences of 1) the Curry-Howard isomorphism between linear lambda terms and Intuitionistic Linear Logic (ILL), which allows us to interpret the linear type constructor “$\rightarrow$” as the logical connective “$\multimap$”, 2) the correspondence between (normal) natural deduction proofs and (cut-free) proof nets and 3) the fact that renaming the conclusions of the axiom links in a proof net gives another proof net.
If $M$ is a linear lambda term with free variables $x_1, \ldots, x_n$, then the principal type of $\lambda x_1 \ldots \lambda x_n.M$ is balanced. Hence the principal pair of $M$ is balanced.
Proof Compute the natural deduction proof of $\lambda x_1 \ldots \lambda x_n.M$ and convert it to an ILL proof net. By subject reduction (Lemma 3.4), normalization/cut elimination keeps the type invariant. Let $\Pi$ be the cut-free proof net which corresponds to the natural deduction proof of the normal form of $\lambda x_1 \ldots \lambda x_n.M$ and which has the same type. We obtain a balanced proof net by using a different atomic formula for all axiom links. From this proof net, we can obtain all other types of the term by renaming the axiom links (allowing for non-atomic axiom links); hence it is a principal type and it is balanced by construction.
If $M$ is a beta-normal lambda term with free variables $x_1, \ldots, x_n$ and if $\lambda x_1 \ldots \lambda x_n.M$ has a balanced typing, then $M$ is linear.
Proof If $\lambda x_1 \ldots \lambda x_n.M$ has a balanced typing, then from this typing we can construct a unique cut-free ILL proof net of $\lambda x_1 \ldots \lambda x_n.M$. Since it is an ILL proof net, this lambda term must be linear, and therefore $M$ as well.
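The notion of a balanced sequent used in the two lemmas above is easy to check mechanically. The following minimal sketch (toy encoding of our own, not the paper's: atomic types as strings, arrow types as ('->', a, b) tuples) counts atomic type occurrences in a sequent.

```python
# A balance check for sequents. Toy encoding of our own: atomic types
# are strings, arrow types are ('->', a, b) tuples; a sequent is a list
# of context types plus a succedent type.

from collections import Counter

def atoms(t):
    """All atomic type occurrences in a type, left to right."""
    if isinstance(t, str):
        return [t]
    return atoms(t[1]) + atoms(t[2])

def is_balanced(context_types, succedent_type):
    """A sequent is balanced iff every atomic type occurs exactly twice."""
    occurrences = Counter()
    for t in list(context_types) + [succedent_type]:
        occurrences.update(atoms(t))
    return all(n == 2 for n in occurrences.values())

# The principal type a -> ((a -> b) -> b) of λx.λy.(y x) is balanced;
# a -> b is not.
assert is_balanced([], ('->', 'a', ('->', ('->', 'a', 'b'), 'b')))
assert not is_balanced([], ('->', 'a', 'b'))
```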
To illustrate the principal type algorithm, we give two examples in this section.
As a first example, we compute the following principal proof.
The substitutions for the topmost rule and for the bottom rule have been left implicit in the proof.
As a second example, the following is a principal proof.
The substitutions of the three rules, from top to bottom, have again been left implicit.
4 Hybrid Type-Logical Grammars
Hybrid type-logical grammars have been introduced in [?] as an extension of lambda grammars which incorporates insights from the Lambek calculus into lambda grammars. Depending on authors, lambda grammars [?] are also called abstract categorial grammars [?] or linear grammars [?].
Formulas of hybrid type-logical grammars are defined as follows, with the formulas of hybrid type-logical grammars defined in terms of the formulas of Lambek grammars. The atomic formulas of the Lambek calculus — we will call these formulas simple atomic formulas, since their denotations are strings — are distinguished from complex atomic formulas, whose denotations are not simple strings, but string tuples.
As is clear from the recursive definition of formulas above, hybrid type-logical grammars are a sort of layered or fibred logic. Such logics have been studied before as extensions of the Lambek calculus obtained by replacing the atomic formulas by feature logic formulas [?, ?].
Lambek grammars are obtained by not allowing the lambda grammar connective or complex atoms in the formula language. From hybrid type-logical grammars, we obtain lambda grammars by not allowing the Lambek connectives. Inversely, we can see hybrid type-logical grammars as lambda grammars where simple atomic formulas have been replaced by Lambek formulas.
Before presenting the rules of hybrid type-logical grammars, we’ll introduce some notational conventions: $A$, $B$ and $C$ range over arbitrary formulas; $\alpha$, $\beta$ and $\gamma$ denote type variables or type constants; $p$, $q$ and $r$ denote type constants corresponding to string positions; $\sigma$ and $\tau$ denote arbitrary types. Types are written as superscripts to the terms; $x$, $y$ and $z$ denote term variables; $M$ and $N$ denote arbitrary terms.
Table 4 shows the rules of Hybrid Type-Logical Grammars. The rules are presented in such a way that they compute principal types in addition to the terms. We obtain the Church-typed version — equivalent to the calculus presented in [?] — by replacing all type variables and constants by a single fixed type constant. For the principal types, we use the Curry-typed version, though for readability, we often write the types of subterms as superscripts as well.
Logical rules – Lambek
Logical rules – lambda grammars
The subsystem containing only the rules for the lambda grammar connective is simply lambda grammar. The subsystem containing only the rules for the two Lambek implications is a notational variant of the Lambek calculus.
For the Lexicon rule, we require a principal pair for the lexical entry: that is, a $\beta$-normal $\eta$-long linear lambda term together with its principal type. For the Axiom/Hypothesis rule, the term assigned to the hypothesis is the eta-expansion of the hypothesis variable.
For the Lambek calculus elimination rules, the substitution is the most general unifier of the two string position types being identified (this generally just replaces one type variable by the other, but takes care of the cases where one or both are constants as well). The concatenation operation of the Lambek calculus corresponds to function composition on terms and to unification of string positions on types (much like we have seen in Section 2).
For the Lambek calculus introduction rules, the substitution is the most general unifier of the two string positions of the withdrawn hypothesis (ie. we simply identify the two positions and replace the hypothesis by the identity function on string positions — the empty string).
In the remaining elimination rule, the substitution is again the most general unifier of the argument type and the type expected by the function.
For convenience, we will often tacitly apply the following rule.
Though the above rule is not strictly necessary, we use it to simplify the lambda terms we compute, performing on-the-fly normalization (ie. we replace a term $M$ by its beta-normal, or beta-normal-eta-long, form $N$). Since we have both subject reduction and subject expansion, $M$ and $N$ are guaranteed to have the same type.
Apart from the types, the system presented in Table 4 is a notational variant of hybrid type-logical grammars as presented by Kubota and Levine [?, ?]. We have replaced strings as basic types by string positions. This is a standard strategy in abstract categorial grammars, akin to the difference lists in Prolog, which allows us to do without an explicit concatenation operation: concatenation is simply treated as function composition, as can be seen from the term assignments for the Lambek elimination rules. The introduction rules for the Lambek connectives are presented somewhat differently from the Kubota and Levine version, whose rules require (in our notation) premisses with specific term assignments. The present formulation has the advantage that it is more robust, in the sense that it does not require us to test that a computed term is equivalent to the given terms. Though it may appear a bit strange that these introduction rules require the identity of a type variable between premiss and conclusion, it is clear that this follows from the intended interpretation, which requires the string variable to occur at the beginning (resp. end) of the string denoted by the term, and this solution seems preferable to interleaving normalization and pattern matching in our rules.
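To make the difference-list analogy concrete, here is a minimal Python sketch with illustrative names of our own (the paper itself works with typed lambda terms over string positions): a string is encoded as a function, and concatenation is function composition, so no explicit concatenation operation is needed.

```python
# A concrete sketch of the difference-list idea mentioned above, with
# illustrative names of our own: a string is encoded as a function over
# string positions (here, over suffix strings), and concatenation is
# function composition, so no explicit concatenation operation is needed.

def word(w):
    """Encode word w as the function appending w in front of a suffix."""
    return lambda suffix: w + ' ' + suffix if suffix else w

def concat(f, g):
    """Concatenating two encoded strings is composing the functions."""
    return lambda suffix: f(g(suffix))

everyone, sleeps = word('everyone'), word('sleeps')
sentence = concat(everyone, sleeps)
assert sentence('') == 'everyone sleeps'
```

Because `concat` is just composition, it is automatically associative, which is exactly the property the string-position encoding buys over explicit concatenation.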
The types, at least for the lambda grammar rules, are exactly those computed using the principal type algorithm of Hindley [?] discussed in Section 3.2. We will see how the types for the Lambek connectives and the lexicon rule correspond to principal type computations in the next section.
4.1 Justification of the principal types for the new rules
For the two Lambek elimination rules, the principal types are justified as follows. One substitution is the most general unifier of two string position types; since one of these is a type variable not occurring elsewhere, we can assume without loss of generality that the substitution simply replaces it by the other. A second substitution is the most general unifier of the remaining pair of position types. The important type unification is the one identifying the meeting point of the two strings (the other unification affects only a discharged axiom).
At the level of the types, the two rules are the same: both correspond to concatenation.
Choosing the fresh type variables appropriately, which is possible since the relevant sets of type variables are disjoint, gives us the following proof.
Since the substitution only replaced one string position variable by another, and this variable no longer appears in the conclusion of the proof (the corresponding hypothesis has been withdrawn), we can treat the composed substitution as the most general unifier of the remaining pair of string position types.
We compute the principal type for the first Lambek introduction rule as follows.
And symmetrically for the other introduction rule.
From the point of view of the principal type computation, we identify the two string position variables, essentially replacing the withdrawn hypothesis by the empty string.
The proof rules for Hybrid type-logical grammars of Table 4 compute principal types for the lambda terms corresponding to their proofs.
Proof We essentially use the same algorithm as Hindley [?], which is somewhat simplified by the restriction to linear lambda terms which are eta-long.
The principal types for the Lambek elimination and introduction rules are justified as shown above.
The lexicon rule is justified by the Substitution Lemma (Lemma 3.3): given a principal type for a lexical entry, we replace a hypothesis assigning an atomic type to the word by a hypothesis assigning the lexical lambda term its principal type, where we know this second sequent has a linear proof.
Given a principal type derived by the rules of hybrid type-logical grammar shown above, we can compute the corresponding lambda term up to equivalence.
Proof Since the principal types computed are balanced by Lemma 3.11, by the Coherence theorem (Theorem 3.9), we can compute the corresponding lambda term up to equivalence. An easy way to do so is to construct the proof net corresponding to the principal type (which is unique because of balance) and to compute its lambda term; this lambda term is the unique beta-normal eta-long term corresponding to the principal type.
As an example of how to compute the principal derivation corresponding to a hybrid derivation, we look at the following hybrid derivation.
The corresponding principal derivation looks as follows. For reasons of vertical space, one lexical entry has not been eta-expanded as it should be to obtain the given principal type: though either type will end up being instantiated to the same result type, the eta-expanded principal type has the important advantage that it can be obtained without instantiating type variables to complex types. Similarly, some other entries appear in eta-short form.
The two Lambek introduction rules correspond to three rules each in this principal derivation; in each case, the last of the three satisfies the constraint for the application of the corresponding introduction rule, with the withdrawn string position appearing at the last position.
In principle, the computation of the principal type can fail because of the constants (even though there might be a proof using variables). However, this failure would mean the final term fails to respect the word order of the input string. Principal types using distinct variables for string positions would seem a useful tool for computing all possible word orders for a given set of lexical entries, though.
One of the attractive points of categorial grammars is that we have a very simple and elegant syntax-semantics interface by means of the Curry-Howard isomorphism between intuitionistic proofs and lambda terms (or, in our case, between linear intuitionistic proofs and linear lambda terms). By interpreting the logical connectives for the implications as the type constructor “$\rightarrow$” — the formulas-as-types interpretation — our derivations in the Lambek calculus, in lambda grammars, in hybrid type-logical grammars and in first-order linear logic (where we treat the quantifier as being semantically inert, that is, quantifier rules are “invisible” to the meaning) correspond to lambda terms — the proofs-as-terms interpretation. Using the Curry-Howard isomorphism, we can obtain semantics in the tradition of Montague simply by giving lexical substitutions in the lexicon, using essentially the rules of Table 3 (though we typically use the Church-style typing) to assign a derivational meaning to a proof.
The semantic version of the proof from the previous section looks as follows.
Though syntactically the Lambek elimination rule corresponds to function composition (concatenation), semantically it corresponds to simple application, and the introduction rule to abstraction. Given the standard Montagovian semantics for “everyone” as $\lambda P.\forall x.(P\,x)$ (the set of properties $P$ such that all $x$ have this property), the previous proof actually produces an equivalent term as its semantics, so the generalized quantifier can function as a Lambek calculus subject quantifier while keeping the same semantics.
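As a routine check (ours, not the paper's), $\beta$-reducing this generalized quantifier term against an intransitive verb meaning indeed yields the expected first-order formula:

\[ (\lambda P.\forall x.(P\,x))\,\mathit{sleep} \;\rightarrow_\beta\; \forall x.(\mathit{sleep}\,x) \]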
More detail about the syntax-semantics interface in categorial grammars can be found in [?, ?].
5 Equivalence
For the main result, we only need to show that a hybrid principal type proof corresponds to a MILL1 proof, since we can reconstruct the lambda term from the principal type.
The basic idea which makes the correspondence work is that there is a 1-1 mapping between the atomic terms of a predicate in MILL1 and the principal type which is assigned to the corresponding term in a hybrid derivation. So from the term assigned to a hybrid derivation, we compute the principal type using the principal type algorithm (PTA), and this gives us the first-order variables; from the first-order variables of a MILL1 derivation, we obtain the principal type and a hybrid lambda term thanks to the coherence theorem, as shown schematically below.
5.1 String positions, types and formulas
We need an auxiliary function $f$ (for flatten) which reduces a complex type to a list of atomic types. Following Kanazawa [?], we compute this list by first taking the yield of the type tree and then reversing it, which is convenient for induction since it gives $f(\alpha \rightarrow \beta) = f(\beta) \mathbin{+\!+} f(\alpha)$ (“$+\!+$” denotes list concatenation, $[x]$ the singleton list containing element $x$, and $[x_n, \ldots, x_1]$ the $n$-element list with $i$th element $x_i$).
Let $\sigma$ be a type; the list $f(\sigma)$ is defined as follows.
For example, we have the following.
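As a small mechanical illustration (a sketch under a toy encoding of our own, not the paper's: atomic types as strings, arrow types as ('->', a, b) tuples), the yield-then-reverse computation and the inductive equation it validates can be checked as follows.

```python
# A sketch of the flatten function under a toy encoding of our own:
# atomic types are strings, arrow types are ('->', a, b) tuples. We take
# the yield of the type tree and reverse it; this yields the equation
# f(a -> b) = f(b) ++ f(a) used in the inductive cases.

def type_yield(t):
    """The left-to-right yield of the type tree."""
    if isinstance(t, str):
        return [t]
    return type_yield(t[1]) + type_yield(t[2])

def flatten(t):
    """Reduce a type to the reversed list of its atomic types."""
    return list(reversed(type_yield(t)))

# f((a -> b) -> c) = [c, b, a]
assert flatten(('->', ('->', 'a', 'b'), 'c')) == ['c', 'b', 'a']
# f(a -> b) = f(b) ++ f(a)
t = ('->', 'a', ('->', 'b', 'c'))
assert flatten(t) == flatten(('->', 'b', 'c')) + flatten('a')
```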
Let $A$ be a formula in Hybrid Type-Logical Grammar, $\sigma$ its principal type and $[x_n, \ldots, x_1]$ the flattened list of atomic types obtained from $\sigma$ according to Definition 5.1. The translation of $A$ into first-order linear logic is defined as follows.
We can obtain a closed formula by universally quantifying over all variables in the list of arguments, replacing all of them with quantified variables using the universal closure operation (Definition 2.1).
Let $F$ be a formula in first-order linear logic and $A$ a formula in hybrid type-logical grammar with principal type $\sigma$, such that $F$ is the translation of $A$. The free meta-variables of $F$ are exactly the type variables of $\sigma$ (and of $A$).
Proof Immediate by induction on $A$ using the translation. All new variables introduced during the translation are bound.
Let $F_1$ and $F_2$ be first-order linear logic formulas obtained by the translation function from Hybrid Type-Logical Grammar formulas $A_1$ and $A_2$ with $\sigma_1$ and $\sigma_2$ as their respective principal types. In other words, $F_1$ and $F_2$ are the translations of $A_1$ under $\sigma_1$ and of $A_2$ under $\sigma_2$.
$F_1$ unifies with $F_2$ with MGU $s$ if and only if $A_1 = A_2$ and $\sigma_1$ unifies with $\sigma_2$ with this same MGU $s$.
Proof Suppose $F_1$ and $F_2$ unify with MGU $s$. We must show that $A_1 = A_2$ and that $s$ is an MGU for $\sigma_1$ and $\sigma_2$. Showing $A_1 = A_2$ is an easy induction (exploiting the fact that the translation of an atomic formula does not have a quantifier prefix and therefore cannot unify with the translation of a Lambek connective, and that the translations of the two Lambek connectives cannot unify with each other because of the condition preventing accidental capture of variables). Given that $A_1$ and $A_2$ are identical, we know that $F_1$ and $F_2$ differ only in the free variables (the bound variables are equivalent up to renaming) and that the free variables of $F_1$ and $F_2$ are exactly the type variables of $\sigma_1$ and $\sigma_2$ (by Proposition 5.3). Therefore any substitution that makes $F_1$ and $F_2$ equal (up to renaming of bound variables) makes $\sigma_1$ and $\sigma_2$ equal.
For the other direction, suppose that $A_1 = A_2$ and that $s$ is the MGU of $\sigma_1$ and $\sigma_2$. Since $s$ is an MGU, and given that the translation function uses identical hybrid formulas and identical principal types, we have that $s(F_1) = s(F_2)$.
It is insightful to compare the translation of a Lambek formula with its principal type to that of the corresponding lambda grammar formula with its principal type. Though the two end results are formulas which are equivalent to each other (after universal closure of the meta-variables), there is a difference in the string position list for the non-atomic subformulas: the Lambek formula only ever has a pair of string positions, whereas the linear formula starts with a full list of string positions which decreases at each step. In other words, for the Lambek formula, we compute the string positions step-by-step, whereas the lambda grammar version of the same formula precomputes all string positions and then divides them among the subformulas.