Translating Specifications in a Dependently Typed
Lambda Calculus into a Predicate Logic Form

Abstract

Dependently typed lambda calculi such as the Edinburgh Logical Framework (LF) are a popular means for encoding rule-based specifications concerning formal syntactic objects. In these frameworks, relations over terms representing formal objects are naturally captured by making use of the dependent structure of types. We consider here the meaning-preserving translation of specifications written in this style into a predicate logic over simply typed λ-terms. Such a translation can provide the basis for efficient implementation and sophisticated capabilities for reasoning about specifications. We start with a previously described translation of LF specifications into the logic of higher-order hereditary Harrop (hohh) formulas. We show how this translation can be improved by recognizing and eliminating redundant type-checking information contained in it. This improves the clarity of the translated formulas and also reduces the effort that must be spent on type checking during execution. To allow this translation to be used to execute LF specifications, we describe an inverse transformation from hohh terms to LF expressions; computations can thus be carried out using the translated form and the results can then be exported back into LF. Execution based on LF specifications may also involve some forms of type reconstruction. We discuss the possibility of supporting such a capability using the translation under some reasonable restrictions on the structure of specifications.


Mary Southern and Gopalan Nadathur
Computer Science and Engineering
University of Minnesota
Minneapolis, MN 55455

1 Introduction

The Edinburgh Logical Framework (LF) has proven to be a useful device for specifying formal systems such as logics and programming languages. At its core, LF is a dependently typed lambda calculus. By exploiting the abstraction operator that is part of the syntax of LF, it is possible to succinctly encode formal objects whose structure embodies binding notions. Because types can be indexed by terms, we can use them to express relations between the formal objects encoded in terms. If we view such types as formulas, terms that have a given type can be interpreted as proofs of the formula that type represents. Thus, LF specifications can be given a logic programming interpretation using a notion of proof search that corresponds to determining inhabitation of given types. The Twelf system is an implementation of LF that is based on such an interpretation.

An alternative approach to specifying formal systems is to use a predicate logic. Objects treated by the formal systems can be represented by the terms of this logic and relations between them can be expressed through predicates over these terms. If the terms include a notion of abstraction (e.g., if they encompass simply typed lambda terms), they provide a convenient means for representing binding notions. While an unrestricted predicate logic would be capable of describing relations adequately, it is preferable to limit the permitted formulas so that the desired interpretation of rule-based specifications can be modeled via a constrained proof search behavior. The logic of higher-order hereditary Harrop formulas (hohh) has been designed with these ideas in mind and many experiments have shown this logic to be a useful specification device (e.g., see Miller and Nadathur [2012]). This logic has also been given a computational interpretation in the language λProlog Nadathur and Miller [1988]. Moreover, an efficient implementation of λProlog has been developed in the Teyjus system Qi et al. [2008].

There are obvious similarities between the two approaches to specification, making it interesting to explore the connections between them more formally. In early work, Felty and Miller showed that LF derivations could be encoded in hohh derivations by describing a translation from the former to the latter Felty and Miller [1990]. This translation demonstrated the expressive power of hohh, but was not directly usable in relating proof search behavior. To rectify this situation, Snow et al. showed how to translate LF specifications into hohh formulas in such a way that the processes of constructing derivations could be related Snow et al. [2010]. This work provided the basis for an alternative implementation of Twelf. The translation also has the potential to be useful in bringing the power of the Abella prover Gacek [2009] to bear on reasoning about Twelf specifications.

This paper continues the work described in Snow et al. [2010]. There are four specific contributions it makes in this setting:

  1. An important part of the translation is the recognition and elimination of redundant typing information in specifications. We describe an improvement to the criterion presented in Snow et al. [2010] for this purpose.

  2. In contrast to Snow et al. [2010], we show how to modularize the proof of redundancy of typing information, establishing a result concerning LF first and then lifting this result to the translation. This enables us to present results that also apply directly to LF.

  3. If we are to use the translation as a means for implementing proof search in Twelf, we also need a way to return to Twelf expressions after completing execution in λProlog. We describe such an inverse transformation.

  4. Logic programming in Twelf includes a process of type reconstruction. We begin an analysis of the translation towards understanding whether type reconstruction on the translated expression will agree with Twelf’s behavior. This analysis is incomplete, but we believe the approach to be sound and the remaining work to be mainly that of elaborating the details.

The next two sections describe LF and the hohh logic respectively and discuss their computational interpretations. Section 4 then presents a simple translation of LF specifications into hohh ones. The following section takes up the task of improving this translation. In particular, it characterizes certain bound variable occurrences in types using a notion called strictness and uses this characterization to identify redundancy in typing; we are then able to eliminate such redundancy in the translation. Section 6 describes the inverse translation from hohh terms found via proof search to LF expressions in the originating context for the translation. Section 7 contains a discussion of the treatment of type reconstruction in Twelf proof search. We end the paper with a discussion of future directions for this work in Section 8.

2 Logical Framework

[Figure 1, comprising the rules null-sig, kind-sig, type-sig, null-ctx, type-ctx, type-kind, pi-kind, var-fam, pi-fam, app-fam, var-obj, abs-obj, and app-obj, appears here.]

Figure 1: Rules for Inferring LF Assertions

This section introduces dependently typed λ-calculi as a means for specifying formal systems. A distinctive aspect of these calculi is that they let us define types which are indexed by terms. This can be a more intuitive method of encoding relationships between terms and types within a specification than using predicates, as is done in λProlog. Taking a computational view, we interpret types as formulas, and proving such formulas then reduces to checking that a certain type is inhabited. The particular dependently typed λ-calculus we shall use in this paper is the Edinburgh Logical Framework or LF. We describe this calculus below, then exhibit its use in specifying relations, and finally explain how it can be given an executable interpretation.

2.1 The Edinburgh Logical Framework

There are three categories of LF expressions: kinds, type families which are classified by kinds, and objects which are classified by types. Below, $x$ denotes an object variable, $c$ an object constant, and $a$ a type constant. Letting $K$ range over kinds, $A$ and $B$ over types, and $M$ and $N$ over objects, the syntax of these expressions is as follows:
\[
\begin{array}{rcl}
K & ::= & \mathit{Type} \mid \Pi x{:}A.\, K \\
A, B & ::= & a \mid \Pi x{:}A.\, B \mid A\ M \\
M, N & ::= & c \mid x \mid \lambda x{:}A.\, M \mid M\ N
\end{array}
\]
Both $\Pi$ and $\lambda$ are binders which assign a type to a variable over the term. The shorthand $A \rightarrow B$ is used for $\Pi x{:}A.\, B$ when $x$ does not appear free in $B$. Terms differing only in bound variable names are identified. We use $U$ and $V$ below to stand ambiguously for types and object expressions. We write $U[M/x]$ to denote the capture-avoiding substitution of $M$ for free occurrences of $x$ in $U$.

LF type family and object expressions are formed starting from a signature that identifies constants together with their kinds or types. In addition, in determining whether or not an expression is well-formed, we will need to consider contexts, denoted by $\Gamma$, that assign types to variables. The syntax for signatures and contexts is as follows:
\[
\Sigma \;::=\; \cdot \mid \Sigma, a{:}K \mid \Sigma, c{:}A
\qquad\qquad
\Gamma \;::=\; \cdot \mid \Gamma, x{:}A
\]
In what follows, the signature, which is user-defined, will not change over the course of a computation. Given this, for simplicity, we will leave the signature implicit in our discussions.

Not all the LF expressions identified by the syntax rules above are considered to be well-formed. The following five forms of judgments are relevant to deciding the ones that are:
\[
\vdash \Sigma\ \mathit{sig} \qquad\qquad \vdash \Gamma\ \mathit{ctx}
\]
\[
\Gamma \vdash K\ \mathit{kind} \qquad\quad \Gamma \vdash A : K \qquad\quad \Gamma \vdash M : A
\]

The judgments on the first line assert, respectively, that $\Sigma$ is a valid signature and that $\Gamma$ is a valid context, implicitly in the ambient signature. The judgments on the second line assert that $K$ is a valid kind in the context $\Gamma$, that $A$ is a valid type of (valid) kind $K$ in $\Gamma$, and that $M$ is a valid object of (valid) type $A$ in $\Gamma$; all these judgments also verify that the context and the implicit signature are both valid. In stating the rules for deriving these judgments, we shall make use of an equality notion for expressions that is based on $\beta$-conversion, i.e., the reflexive and transitive closure of a relation that equates two expressions that differ only in that a subexpression of the form $(\lambda x{:}A.\, M)\ N$ in one is replaced by $M[N/x]$ in the other. We shall also make use of the $\beta$-normal form of an expression, i.e., an expression that is equal to it and that does not contain any subexpressions of the form $(\lambda x{:}A.\, M)\ N$. The rules for deriving the five different LF judgments are presented in Figure 1. Notice that we allow for the derivation of judgments of the form $\Gamma \vdash A : K$ and $\Gamma \vdash M : A$ only when the classifying expressions $K$ and $A$, respectively, are in $\beta$-normal form. We also observe that such forms are not guaranteed to exist for all LF expressions. However, they do exist for well-formed LF expressions Harper et al. [1993], a property that is ensured to hold for each relevant LF expression by the premises of every rule whose conclusion requires the $\beta$-normal form of that expression.

The notion of equality that we use for LF terms also includes $\eta$-conversion, i.e., the congruence generated by the relation that equates $\lambda x{:}A.\, (M\ x)$ and $M$ if $x$ does not appear free in $M$. Observe that $\beta$-normal forms for the different categories of expressions have the following structure:
\[
\begin{array}{rcl}
K & = & \Pi x_1{:}A_1.\ldots \Pi x_n{:}A_n.\, \mathit{Type} \\
A & = & \Pi x_1{:}A_1.\ldots \Pi x_n{:}A_n.\, a\ M_1 \ldots M_m \\
M & = & \lambda x_1{:}A_1.\ldots \lambda x_n{:}A_n.\, u\ M_1 \ldots M_m
\end{array}
\]
where $u$ is an object constant or variable and where the subterms and subtypes appearing in the expression recursively have the same form. We refer to the part denoted by $a\ M_1 \ldots M_m$ in a type expression in such a form as its target type and to $A_1, \ldots, A_n$ as its argument types. Let $u$ be a variable or constant which appears in the well-formed term $U$ and let the number of $\Pi$s that appear in the prefix of its type or kind be $n$. We say $u$ is fully applied if every occurrence of $u$ in $U$ has the form $u\ M_1 \ldots M_n$. A type of the form $a\ M_1 \ldots M_n$ in which $a$ is fully applied is a base type. We also say that $U$ is canonical if it is in $\beta$-normal form and every occurrence of a variable or constant in it is fully applied. It is a known fact that every well-formed LF expression is equal to one in canonical form by virtue of $\beta\eta$-conversion Harper et al. [1993].
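For instance, assuming illustrative declarations $\mathit{nat} : \mathit{Type}$, $s : \mathit{nat} \rightarrow \mathit{nat}$, and $c : (\mathit{nat} \rightarrow \mathit{nat}) \rightarrow \mathit{nat}$ (these names are ours and do not come from the paper's examples), the term $(c\ s)$ is in $\beta$-normal form but is not canonical: the occurrence of $s$ is not fully applied. Its canonical form, obtained by $\eta$-expansion, is $c\ (\lambda x{:}\mathit{nat}.\, s\ x)$.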

2.2 Specifying Relations in LF

LF can be used to formalize different kinds of rule-based systems by describing a signature corresponding to the system, as we now illustrate. In presenting particular signatures, we will use a more machine-oriented syntax for LF expressions: we write {x:A} B for $\Pi x{:}A.\, B$ and [x:A] M for $\lambda x{:}A.\, M$.

The first example we consider is that of the natural number system. To formalize this system we must, first of all, provide a representation for the numbers. This is easy to do: we pick a type corresponding to these numbers and then provide an encoding for zero and the successor constructor. The first three items in the signature shown in Figure 2 suffice for this purpose. The next thing to do is to specify operations on natural numbers. In LF we think of doing this through relations: thus, addition would be specified as a relation between three numbers. To describe relations we use dependent types. For example, the addition relation might be encoded as a type constant that takes three natural number objects as arguments. The real interest is in determining when such a relation holds. In rule based specifications this is typically done through inference rules. Thus, using the LF notation that we have just described, addition might be defined by the rules

\[
\frac{}{\mathit{plus}\ \mathit{z}\ N\ N}
\qquad\qquad
\frac{\mathit{plus}\ N\ M\ P}{\mathit{plus}\ (\mathit{s}\ N)\ M\ (\mathit{s}\ P)}
\]

in which tokens represented by uppercase letters constitute schema variables. In an LF specification, such rules correspond to object constants whose target type is the representation of the rule’s conclusion and whose argument types are the types of the schema variables and the representations of the premises. As a concrete example, the object constants declared for this purpose in Figure 2 represent the two addition rules shown. The question of whether a relation denoted by a type holds now becomes that of whether we can use the constants representing the rules to construct an object expression of that type. Thus, types function as formulas in an LF-style specification and the provability of a formula becomes the question of type inhabitation.

We illustrate these ideas once more by using the example of lists of natural numbers. To represent such lists, we use a list type and the object constants for the empty list and for adding a natural number to the front of a list defined in Figure 2. Now consider the append relation on these lists. This relation is represented by a type constant append that takes three object-level expressions of the list type as arguments. The rules for proving this relation are the following

\[
\frac{}{\mathit{append}\ \mathit{null}\ L\ L}
\qquad\qquad
\frac{\mathit{append}\ L_1\ L_2\ L_3}{\mathit{append}\ (\mathit{cons}\ X\ L_1)\ L_2\ (\mathit{cons}\ X\ L_3)}
\]

Following the structure described earlier, the corresponding object constants shown in Figure 2 represent these rules.

Figure 2: An example of specifications in LF

2.3 Logic Programming

The Twelf system gives LF specifications a logic programming interpretation. Computation is initiated in Twelf by presenting it with a type. Such a type, as we have explained earlier, corresponds to a formula and the task is to find a proof for it or, more precisely, to find an inhabitant for the provided type.

The search problem is actually better viewed as that of checking if a given object expression $M$ has a given type $A$; this formulation subsumes the case where only the type is given because we allow $M$ to contain variables that may become instantiated as the search progresses. In the simple case, $A$ is a base type. Here, computation proceeds by looking for an object declaration
\[
c : \Pi x_1{:}A_1.\ldots \Pi x_n{:}A_n.\, B
\]
in the signature at hand and checking if there are object expressions $M_1, \ldots, M_n$ such that $B[M_1/x_1, \ldots, M_n/x_n]$ is equal to $A$. If this is the case and if it is also the case that $M$ and $(c\ M_1 \ldots M_n)$ can be unified, then the task reduces, recursively, to checking if $M_i$ has the type $A_i[M_1/x_1, \ldots, M_{i-1}/x_{i-1}]$ for $1 \leq i \leq n$. In this model of computation, the types associated with object constants in a signature are often referred to as clauses and the process of picking an object declaration and trying to use it to solve the inhabitation question is referred to as backchaining on a clause.

In the more general case, $A$ may not be a base type, i.e., it may actually have the structure $\Pi x_1{:}A_1.\ldots \Pi x_n{:}A_n.\, B$ where $B$ is a base type. In this case, we first transform the task to trying to show that the object expression $(M\ c_1 \ldots c_n)$ has the type $B[c_1/x_1, \ldots, c_n/x_n]$, where we treat $c_1, \ldots, c_n$ as new constants of types $A_1, \ldots, A_n$, respectively, that are dynamically added to the signature.

For a concrete example of this behavior, let our signature be the specification of append from Figure 2 and let our goal be to construct a term such that

is derivable. We can match this type with the target type of and we are then left with finding a term such that

is derivable. Notice that this step also results in being instantiated to . The type in the new goal of course matches that of , resulting in being instantiated to and correspondingly being instantiated with the expression

We have, at this point, determined that this object expression inhabits the type in question.
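To make the preceding steps concrete under assumed names (which may well differ from those actually used in Figure 2), suppose the signature contains z : nat, null : list, cons : nat -> list -> list, appNil : {L:list} append null L L, and appCons : {X:nat} {L1:list} {L2:list} {L3:list} append L1 L2 L3 -> append (cons X L1) L2 (cons X L3). A query asking for an inhabitant of the type append (cons z null) null L would backchain on appCons, instantiating X to z, L1 and L2 to null, and L to (cons z L3), and leaving the subgoal append null null L3; backchaining on appNil then instantiates L3 to null. The inhabitant found in this hypothetical rendering is

    appCons z null null null (appNil null)

and L is correspondingly instantiated to (cons z null).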

3 Specifications in Predicate Logic

[Figure 3, comprising the goal-reduction rules for the logical constants, the decide and init rules, and the rules for decomposing a selected clause, appears here.]

Figure 3: Derivation rules for the hohh logic

Another approach to specification uses a predicate logic, where relations are encoded as predicates rather than in types. Executing such specifications then corresponds to constructing proofs for chosen formulas in the relevant logic. To yield a sensible notion of computation, the specifications must also be able to convey information about how a search for a proof should be conducted. Not all logics are suitable from this perspective. Here we describe the logic of higher-order hereditary Harrop formulas (hohh), which does have an associated computational interpretation and which, in fact, is the basis for the programming language λProlog Nadathur and Miller [1988]. We present the syntax of the formulas in this logic in the first subsection below and then explain their computational interpretation. The hohh logic will be the target for the translation of Twelf specifications that is the focus of the rest of the paper.

3.1 Higher-order hereditary Harrop formulas

The hohh logic is based on Church’s Simple Theory of Types Church [1940]. The expressions of this logic are those of a simply typed λ-calculus. The types are constructed from the atomic type $o$ of propositions and a finite set of other atomic types by using the function type constructor $\rightarrow$. There are assumed to be two sets of atomic expressions, one corresponding to variables and the other to constants, in which each member is assumed to have been given a type. All typed terms can be constructed from these typed sets of constants and variables by application and λ-abstraction. As in LF, terms differing only in bound variable names are identified. The notion of equality between terms is further enriched by β- and η-conversion. When we orient these rules and think of them as reductions, we are assured in the simply typed setting of the existence of a unique normal form for every well-formed term under these reductions. Thus, equality between two terms becomes the same as the identity of their normal forms. For simplicity, in the remainder of this paper we will assume that all terms have been converted to normal form. We use $F[t_1/x_1, \ldots, t_n/x_n]$ to denote the capture-avoiding substitution of the terms $t_1, \ldots, t_n$ for free occurrences of $x_1, \ldots, x_n$ in $F$.

Further qualifications are required to introduce logic into this setting. First, the constants mentioned above are divided into the categories of logical and non-logical constants. Next, we restrict the constants so that only the logical constants can have argument types containing the type $o$. Finally, we limit the logical constants to the following: $\top$ of type $o$, $\supset$ of type $o \rightarrow o \rightarrow o$, and $\Pi$ of type $(\tau \rightarrow o) \rightarrow o$ for each valid type $\tau$. The constant $\Pi$ denotes universal quantification, and the shorthand $\forall x\, F$ is used for $\Pi\, (\lambda x\, F)$.

The set of non-logical constants is typically called the signature and, as mentioned, the type $o$ cannot appear in the type of any argument of these constants. However, $o$ is allowed as the target type for non-logical constants. Constants with target type $o$ are called predicates; those with any other target type are called constructors.

For a non-logical constant $p$ of type $\tau_1 \rightarrow \cdots \rightarrow \tau_n \rightarrow o$ and terms $t_1, \ldots, t_n$ of types $\tau_1, \ldots, \tau_n$, we call the term $p\ t_1 \ldots t_n$ of type $o$ an atomic formula. Using the set of logical constants, we construct the sets of $G$- and $D$-formulas from the set of atomic formulas. The syntax of these two sets is the following:
\[
\begin{array}{rcl}
G & ::= & \top \mid A \mid D \supset G \mid \forall x\, G \\
D & ::= & A \mid G \supset D \mid \forall x\, D
\end{array}
\]
where $A$ denotes an atomic formula.

The formulas described above are also called higher-order hereditary Harrop formulas. A specification in this setting consists of a set of such formulas. To illustrate how such specifications may be constructed in practice, let us consider the encoding of the append relation on lists of natural numbers. The first step in formalizing this relation is to describe a representation for the data objects in its domain. Towards this end, we introduce two atomic types, one for natural numbers and one for lists of natural numbers. Our signature should then identify the obvious constructors with each of these types: a constant representing zero, a constructor for the successor of a number, a constant representing the empty list, and a constructor that adds a natural number to the front of a list. As a concrete example, a list with two elements would then be represented by a term that applies the list constructor twice, ending in the constant for the empty list.

The append relation will now be encoded via a predicate constant, i.e., a non-logical constant that has $o$ as its target type. In particular, we might use a constant append whose three argument types are all the list type introduced above and whose target type is $o$. To define the relation itself, we might use two $D$-formulas: a universally closed atomic formula covering the case where the first list is empty, and a universally closed implication covering the case where it is not.

These formulas, which are also often referred to as the clauses of a specification or program, can be visualized as defining the append relation by recursion on the structure of the list that is its first argument. The first formula treats the base case, when this list is empty. The second formula treats the recursive case; the conclusion of the implication is conditioned by the relation holding in the case where the first argument is a list of smaller size. This pattern, of using universal quantifications over atomic formulas to treat the base cases of a relation and such quantifications over formulas that have an implication structure to treat the recursive cases, is characteristic of relational specifications in the hohh logic.
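Under assumed constructor names—say null for the empty list and cons for adding an element to the front of a list, which may differ from the identifiers used in the actual development—these two clauses could be written as
\[
\forall L\, (\mathit{append}\ \mathit{null}\ L\ L)
\qquad\mbox{and}\qquad
\forall X\, \forall L_1\, \forall L_2\, \forall L_3\, (\mathit{append}\ L_1\ L_2\ L_3 \supset \mathit{append}\ (\mathit{cons}\ X\ L_1)\ L_2\ (\mathit{cons}\ X\ L_3)).
\]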

3.2 Logic Programming

The computational interpretation of the hohh logic consists of thinking of a collection of $D$-formulas as a program against which we can solve a $G$-formula. More formally, computation in this setting amounts to attempting to construct a derivation for a sequent of the form $\Sigma; \Delta \longrightarrow G$, where $\Sigma$ is a signature, $\Delta$ is a set of program clauses, and $G$ a goal formula. The computation that results from such a sequent consists of first decomposing the goal in a manner determined by the logical constants that appear in it and then, once $G$ has been broken up into its atomic components, picking a formula from $\Delta$ and using it to solve the resulting goals.

The precise derivation rules for the hohh logic are given in Figure 3. These rules can be understood as follows. In a sequent of the form $\Sigma; \Delta \longrightarrow G$, if $G$ is not an atomic formula, then it must have one of the forms $\top$, $D \supset G'$, or $\forall x\, G'$. The first kind of goal has an immediate solution. In the second case, we extend the logic program with $D$ and continue the search with $G'$ as the new goal formula. In the last case, i.e., when $G$ is of the form $\forall x\, G'$, we expand $\Sigma$ with a new constant $c$ and the new goal becomes $G'[c/x]$. Once we have arrived at an atomic formula $A$, we pick a clause from $\Delta$ whose head eventually “matches” $A$, spawning off new goals to solve in the process. The exact manner in which this kind of simplification of atomic goals takes place is determined by the last four rules in Figure 3.
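As a small, self-contained illustration of this operational reading in λProlog concrete syntax (the declarations below are ours and are not drawn from the paper's examples):

    kind i     type.
    type p     i -> o.
    type q     i -> o.
    type test  o.

    % q holds of anything that p holds of
    q X :- p X.

    % solving test introduces a fresh constant, assumes p of it,
    % and then establishes q of it by backchaining on the clause for q
    test :- pi x\ (p x => q x).

Posing the query test succeeds: the universal goal extends the signature with a fresh constant, the implicational goal extends the program with the corresponding instance of p, and the resulting atomic goal is solved by backchaining.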

A special case for treating atomic goals arises when the clause selected from the program has the structure
\[
\forall x_1 \ldots \forall x_n\, (G_1 \supset \cdots \supset G_m \supset A)
\]
and where it is the case that, for terms $t_1, \ldots, t_n$ of the correct types, $A[t_1/x_1, \ldots, t_n/x_n]$ is the atomic goal $A'$ to be solved. The effect of the sequence of rule applications that results in this case is reflected in the following derived rule
\[
\frac{\Sigma; \Delta \longrightarrow G_1' \qquad \cdots \qquad \Sigma; \Delta \longrightarrow G_m'}
     {\Sigma; \Delta \longrightarrow A'}\ \textsc{backchain}
\]
in which $G_i'$ is $G_i[t_1/x_1, \ldots, t_n/x_n]$ for $1 \leq i \leq m$. We shall find this rule, which we have labeled backchain for obvious reasons, useful in the analyses that appear in later sections.

Figure 4: An example of a λProlog program

The λProlog language can be viewed as a programming rendition of the hohh logic that we have discussed here. In λProlog, the user can introduce new atomic types through declarations that begin with the keyword kind and new constructors by using declarations that begin with the keyword type. Examples of such declarations appear in Figure 4. A complete program consists not only of such declarations that identify the signature, but also of $D$-formulas that define relations. In the concrete syntax of λProlog, abstraction is written using the infix symbol \, i.e., the expression $\lambda x\, F$ is rendered as x\ F. Moreover, the logical constants $\top$, $\Pi$, and $\supset$ are written as true, pi and => respectively. Another option for expressing G => D is the notation D :- G. Several of these aspects of λProlog syntax are illustrated in Figure 4 through the presentation of clauses defining the append relation.
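For reference, a program along the lines of Figure 4 might look as follows in λProlog concrete syntax; the identifiers nat, z, s, natlist, null, cons, and append are our guesses and may differ from those in the actual figure.

    kind nat      type.
    kind natlist  type.

    type z     nat.
    type s     nat -> nat.
    type null  natlist.
    type cons  nat -> natlist -> natlist.

    type append  natlist -> natlist -> natlist -> o.

    % appending any list to the empty list yields that list
    append null L L.
    % otherwise, recurse on the tail of the first list
    append (cons X L1) L2 (cons X L3) :- append L1 L2 L3.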

The λProlog language has been given an efficient compilation-based implementation in the Teyjus system. One of the goals of our work is to leverage this implementation in also providing an efficient treatment of Twelf programs.

4 A Naive Translation

Figure 5: Encoding of types and objects and translation of LF judgments into hohh

We present in this section a simple translation of LF specifications into hohh specifications. This translation is taken from Snow et al. [2010], which builds on earlier work due to Felty and Miller [1990]. After presenting the translation, we present a correspondence between its source and target. This property ensures that reasoning based on the translation will correctly follow reasoning based on the original specification: constructing an hohh proof of a translated judgment is equivalent to finding a derivation in LF. Unfortunately, the simple translation produces hohh formulas that contain a lot of redundant type-checking information, which can result in quite inefficient proof search behavior. We highlight this issue as motivation for developing a better translation in the next section.

4.1 The Translation

We have previously seen two methods for specifying append: in Section 2 a dependently typed calculus was used, while in Section 3 we utilized a relational style. Similarities between these two styles should have become apparent from this simple example. The signatures we defined consist of expressions which are essentially the same between LF and the simply typed calculus. Differences appear when defining dependencies between objects and types. In LF these relations are defined in the types of the object constants that encode the rules. The hohh logic is simply typed, and so relations are encoded using predicates, and $D$-formulas are constructed to define exactly when a relation holds. There is, then, a clear connection between the dependent types in LF and the program clauses in hohh. The closeness of these two approaches is important in determining a translation from LF to hohh specifications.

As we have seen in Section 2, the goal of proof search in Twelf is to determine if an object of a particular type can be constructed. We will mimic this situation in λProlog by examining if we can construct a proof for an hohh formula that is obtained from the LF type. The translation presented by Felty relies on having in hand both the LF type and the LF object, but this is obviously too much to expect if proof search is the intended mode of use. To overcome this difficulty, Snow et al. adapted Felty’s translation so that it is based solely on the type Snow et al. [2010]; the LF object is then uncovered incrementally by proof search in the hohh logic from the corresponding specification.

This translation, which is presented in Figure 5, uses a two-step process. In the first step, a coarse mapping is described that takes both LF types and objects to hohh terms. More specifically, hohh terms of the atomic types lf-type and lf-obj are used to represent LF types of base kind and LF objects of base type, respectively. The mapping then identifies an hohh type with each arbitrary LF kind and type: an LF object is encoded by an hohh term whose type is the image of the LF type of that object under this mapping, and, similarly, an LF type is encoded by an hohh term whose type is the image of its kind. This simple mapping clearly loses much of the dependency information available in the original LF types and kinds. In the second pass, we recover the lost information by making use of an hohh predicate over the types lf-obj and lf-type: the atomic formula formed from this predicate is to hold exactly when its first argument is the encoding of some LF object of a base LF type whose encoding is the second argument. In more detail, using this predicate, we translate each LF type into an hohh predicate term that is intended to take the encodings of LF objects as arguments. Interpreting the application of this predicate term to the encoding of an object as the statement that the object inhabits the type, and using this idea to describe also the translations of LF contexts and signatures, we expect our translations to be such that, for a suitable hohh signature, the goal produced from an LF typing judgment is derivable from the translated context and signature in the hohh logic just in the case that the judgment is valid in LF. (To translate LF signatures in their entirety, we also have to describe a translation of kinds. However, these translations will not be used in the derivations in hohh and so we do not make them explicit here.)

Figure 6: Translation of the LF specification for append

Figure 6 illustrates the translation of an LF signature into an hohh program using the example LF signature for append shown in Figure 2. We would like to use the hohh (λProlog) program that results from such a translation as the basis for responding to inhabitation questions raised relative to Twelf specifications. The ambient hohh program and signature in the hohh sequents that we have to consider in this setting arise from LF signatures that we are already leaving implicit. We will therefore also elide these parts of the hohh sequent, writing it more simply as just the goal formula, mentioning explicitly at most those parts of the hohh signature that result from the use of the rule for $\forall$ goals during proof search.
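To convey the flavor of the translated clauses, a sketch along the lines of Figure 6 is given below in λProlog concrete syntax; the predicate name hastype, the proof-term constructors appNil and appCons, and the remaining identifiers are our assumptions (we write the encoding types as lfobj and lftype) and need not match those used in the paper.

    kind lfobj   type.
    kind lftype  type.

    type hastype  lfobj -> lftype -> o.

    % encodings of the constants from the LF signature for append
    type nat      lftype.
    type z        lfobj.
    type s        lfobj -> lfobj.
    type list     lftype.
    type null     lfobj.
    type cons     lfobj -> lfobj -> lfobj.
    type append   lfobj -> lfobj -> lfobj -> lftype.
    type appNil   lfobj -> lfobj.
    type appCons  lfobj -> lfobj -> lfobj -> lfobj -> lfobj -> lfobj.

    % clauses produced by the naive translation
    hastype z nat.
    hastype (s N) nat :- hastype N nat.
    hastype null list.
    hastype (cons X L) list :- hastype X nat, hastype L list.
    hastype (appNil L) (append null L L) :- hastype L list.
    hastype (appCons X L1 L2 L3 D) (append (cons X L1) L2 (cons X L3)) :-
        hastype X nat, hastype L1 list, hastype L2 list, hastype L3 list,
        hastype D (append L1 L2 L3).

The last clause makes the redundancy discussed below visible: the typing subgoals for X, L1, L2, and L3 repeat information that is already guaranteed whenever the type being inhabited is known to be valid.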

The following theorem makes precise our informal description of the property of our translation and also provides the basis for using hohh proof search in answering LF queries.

Theorem 1.

Let $\Gamma$ be a well-formed canonical LF context and let $A$ be a canonical LF type such that $\Gamma \vdash A : \mathit{Type}$ has a derivation. If $\Gamma \vdash M : A$ has a derivation for a canonical object $M$, then there is a derivation of the hohh goal obtained by translating this judgment. Conversely, if there is a derivation of that goal with an arbitrary hohh term in place of the encoding of $M$, then there is a canonical LF object $M'$ such that the term in question is the encoding of $M'$ and $\Gamma \vdash M' : A$ has a derivation.

The proof of this theorem can be found in Snow et al. [2010]. To summarize, the completeness argument proceeds by induction on the derivation of the LF typing judgment to show how to construct a derivation of the corresponding hohh goal. Similarly, for soundness it uses induction on the derivation of the hohh goal to extract from it an LF object of the required type.

4.2 Some Issues With the Translation

The translation described here has been shown correct. However, because LF expressions contain a lot of redundant information, and because of the context in which we want to use the translation, it is possible to produce a version that is better suited to proof search. A key fact to bear in mind is that when we consider judgments of the form $\Gamma \vdash M : A$ in the setting of logic programming, we would already have verified that $A$ is a valid type. This knowledge gives us additional typing-related information. For example, suppose that $A$ is a type formed from the append type constant of Figure 2. If we know that this is a valid type, then clearly its arguments must be of the list type. In fact, looking at the app-fam rule tells us that a derivation of the validity of $A$ must contain derivations of exactly these facts. Thus, in deriving the hohh goal obtained by translating the judgment, it is unnecessary to show that the corresponding typing formulas hold, as the translation of the type of the relevant rule constant shown in Figure 6 requires us to do.

Removing tests like those above that arise from binders in LF types would certainly simplify the hohh specification and would thereby allow for more efficient proof search. However, not all such binders can be ignored in the translation: some of them also play a role in addressing inhabitation questions and are not relevant only to type checking. For example, consider a well-formed type one of whose binders stands for the derivation of a premise: to form an object of this type based on the relevant constructor, we need to have in hand an object of the type assigned by that binder. The translation of the type in which this binder occurs must therefore preserve the subgoal corresponding to finding such an object. Clearly, then, we need some method of determining which tests are redundant, and so can be correctly removed, and which must be preserved.

5 Improving the Translation

[Figure 7, defining strict occurrences via the rules APP, PI, and CTX for types and INIT, APP, and ABS for objects, appears here.]

Figure 7: Strictly occurring variables in types and objects

The redundancy issue highlighted in the previous section can be rephrased as follows. We are interested in translating an LF type of the form $\Pi x_1{:}A_1.\ldots \Pi x_n{:}A_n.\, B$ into an hohh clause that can be used to determine if a type can be viewed as an instance of the target type $B$. This task also requires us to show that the terms instantiating $x_1, \ldots, x_n$ are inhabitants of the corresponding instances of the types $A_1, \ldots, A_n$; in the naive translation, this job is done by formulas pertaining to the $x_i$ and $A_i$ that appear in the body of the hohh clause produced for the overall type. However, a particular $x_i$ may occur in $B$ in a manner that already makes it clear that the term that replaces it in any instance of $B$ must possess such a property. What we want to do, then, is characterize such occurrences of the $x_i$ so that we can avoid having to include an inhabitation check for them in the hohh clause.

In this section, we define a strictness condition for variable occurrences and, hence, for variables, that identifies when they possess this kind of property. By using this condition, we can simplify the translation of a type into an hohh clause without losing accuracy. In addition to being more efficient, such a translation also produces a result that bears a much closer resemblance to the LF type from which it originates. The correctness of this new translation is shown using lemmas about the strictness condition.
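Continuing with the hypothetical clauses sketched at the end of Section 4.1 (with the assumed names hastype, appCons, and so on), all four of the quantified variables in the type of appCons occur strictly in its target type, so a translation informed by strictness can drop their typing subgoals, producing a clause of roughly the following form:

    % clause produced once strictly occurring variables are recognized
    hastype (appCons X L1 L2 L3 D) (append (cons X L1) L2 (cons X L3)) :-
        hastype D (append L1 L2 L3).

The subgoal for D is kept because D stands for an inhabitant of the premise of the rule, which must still be found during proof search.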

5.1 The Strictness Property and Redundancies in Types

Figure 8: An example motivating the strictness condition

To understand the intuition underlying the strictness condition on variable occurrences and its relevance to type checking, take as an example the signature in Figure 8. The main focus in this example is on one particular constant and its type; the other declarations are included because they are needed in constructing that type. Substituting terms for the variables bound in this type provides an instance of its target type. Suppose we know that this instance is a valid type. Then we would already know that the term substituted for one of the bound variables has a particular type and hence would not need to check this explicitly. That the term has this type follows from looking at its occurrence in the type known to be valid and noting that the checking of the type of the constant has already established that any instance of the relevant argument position must have as its type the corresponding instance of the type assigned to the bound variable. Analyzing this more closely, we see that the critical contributing factors to the variable occurring in this way in the type are that the path down to its occurrence is rigid, i.e., it cannot be modified by substitution, and that the variable is not applied to arguments in a way that could change the structure of the expression substituted for it. These properties were formalized in a notion of strictness in Snow et al. [2010], there inappropriately referred to as rigidity.

The criterion described in Snow et al. [2010] actually fails to recognize some further cases in which dynamic type checking can be avoided. To understand this, consider a bound variable that occurs in the target type applied to an argument; such an application could end up “hiding” the actual structure of any instantiation of the variable. We see this concretely in the instance considered earlier: we know something about the type of the term resulting from the application, but cannot conclude anything about the type of the instantiating term itself from this. Thus, this occurrence is correctly excluded by the strictness condition presented in Snow et al. [2010].

Observe, however, that the same variable has another occurrence in the type of one of the other bound variables, in particular, one whose own occurrence in the target type is strict. Because that argument occurs strictly in the instantiated target type, we would have statically checked its validity. Looking at its type, we would know that it is well formed and therefore that the term instantiating the variable of interest is an inhabitant of the expected type.

In summary, it seems possible to extend the strictness condition recursively while preserving its utility in recognizing redundancy in type checking. We consider occurrences of bound variables to be strict in the overall type if they are strict in the types of other bound variables that occur strictly in the target type. The relation defined in Figure 7 formalizes this idea. Specifically, we say that a bound variable occurs strictly in a type if the corresponding strictness judgment of Figure 7 is derivable for it.

In the lemmas that follow, we formally prove the relationship between this notion and redundancy in type checking that we have discussed above.

Lemma 2.

Let be LF objects and , , and be LF contexts where . Further, let be an LF object, an LF type and . Finally, suppose for some there are derivations of

  1. ,

  2. , and

  3. .

Then there is a derivation of .

Proof.

By induction on the derivation of . The argument proceeds by considering the cases for the last rule used in the derivation.

The last rule is INIT. In this case, is for some distinct and . Then must in fact be . From (2), it follows that , the type of , must be . Note that none of the variables in can appear in (and hence ) or in for and, further, that cannot contain if . We then get from (3) that