
# On the strength of proof-irrelevant type theories

Benjamin Werner INRIA Saclay – Île-de-France and LIX, Ecole Polytechnique, 91128 PALAISEAU cedex, France
###### Abstract.

We present a type theory with some proof-irrelevance built into the conversion rule. We argue that this feature is useful when type theory is used as the logical formalism underlying a theorem prover. We also show a close relation with the subset types of the theory of PVS. We show that in these theories, because of the additional extensionality, the axiom of choice implies the decidability of equality, that is, almost classical logic. Finally we describe a simple set-theoretic semantics.

###### Key words and phrases:
logic, proofs, coq, lambda-calculus, types

## Introduction

A formal proof system, or proof assistant, implements a formalism in much the same way as a compiler implements a programming language. Among existing formalisms, dependent type systems are quite widespread. This can be related to various pleasant features, among them:

1. Proofs are objects of the formalism. The syntax is therefore smoothly uniform, and proofs can be rechecked at will. Also, only the correctness of the type-checker, a relatively small and well-identified piece of software, is critical for the reliability of the system (the “de Bruijn principle”).

2. The objects of the formalism are programs (typed $\lambda$-terms) and are identified modulo computation ($\beta$-conversion). This makes the formalism well-adapted for problems dealing with program correctness. The conversion rule also allows the computation steps not to appear in the proof; for instance $2+2=4$ is simply proved by one reflexivity step, since this proposition is identified with $4=4$ by conversion. In some cases this can lead to a dramatic space gain, using the result of certified computations inside a proof; spectacular recent applications include the formal proof of the four-color theorem [G4] or formal primality proofs [GrThWe].

3. Finally, type theories are naturally constructive. This makes stating decidability results much easier. Furthermore, combining this remark with the two points above, one comes to program extraction: taking a proof of a proposition $\forall x:A.\exists y:B.P(x,y)$, one can erase pieces of the $\lambda$-term in order to obtain a functional program of type $A\to B$, whose input and result are certified to be related by $P$. Up to now however, program extraction has been more of an external feature of implemented proof systems (with the exception of NuPRL; see related work): programs certified by extraction are no longer objects of the formalism and cannot be used to assert facts as in point 2 above.

Some related formalisms only build on some of the points above. For example PVS implements a theory whose objects are functional programs, but where proofs are not objects of the formalism.

An important remark about (2) is that the more terms are identified by the conversion rule, the more powerful this rule is. In order to identify more terms it is thus tempting to combine points (2) and (3) by integrating program extraction into the formalism, so that the conversion rule does not require the computationally irrelevant parts of terms to be convertible.

In what follows, we present and argue in favor of a type theory along this line. More precisely, we claim that such a feature is useful in at least two respects. For one, it gives a more comfortable type theory, especially in the way it handles equality. Furthermore, it is a good starting point to build a platform for programming with dependent types, that is, to use the theorem prover also as a programming environment. Finally, on a more theoretical level, we will also see that by making the theory more extensional, proof-irrelevance brings type theory closer to set theory regarding the consequences of the axiom of choice.

The central idea of this work is certainly simple enough to be adjusted to various kinds of type theories, whether they are predicative or not, with various kinds of inductive types, more refined mechanisms to distinguish the computational parts of the proofs, etc. In what follows we illustrate it by using a marking of the computational content which is as simple as possible. The extraction function we use is quite close to Letouzey’s [Letouzey1, Letouzey2], except that we discard the inclusion rule $\mathsf{Prop}\subset\mathsf{Type}(i)$, which would complicate the definition of the type theory and the semantics (see [MiqWer] for the last point).

Related work. Almost surprisingly, proof-irrelevant type theories do not seem to enjoy wide use yet. In the literature, they are often not studied for themselves, but as a means of proving properties of other systems. This is the case for the work of Altenkirch [Altenkirch] and Barthe [Barthe]. One very interesting work is Pfenning’s modal type theory, which involves proof-irrelevance and a sophisticated way to pinpoint which definitional equality is to be used for each part of a term; in comparison, we stick here to a much simpler extraction mechanism. The NuPRL approach using a squash type [Caldwell] is very close to ours, but the extensional setting gives somewhat different results. Finally, let us mention recent work [Bruno2] by Barras and Bernardo, who present a type theory with implicit arguments. This interesting proposal can be understood as a theory with proof-irrelevance, where the computational fragment is precisely Miquel’s calculus [miquel]. It is similar to ours, but with a more sophisticated way to mark what is computational and what is not.

## The Theory

### The $\lambda$-terms

The core of our theory is a Pure Type System (PTS) extended with $\Sigma$-types and some inductive type definitions. In PTSs, the types of types are sorts; the set of sorts is

$$\mathcal{S} \equiv \{\mathsf{Prop}\} \cup \{\mathsf{Type}(i) \mid i\in\mathbb{N}\}$$

As one can see, we keep the sort names of Coq. As usual, Prop is the impredicative sort and the sorts $\mathsf{Type}(i)$ give the hierarchy of predicative universes. It comes as no surprise that the system contains the usual syntactic constructs of PTSs; however, it is convenient, both for defining the conversion rule and for constructing a model, to tag the variables to indicate whether they correspond to a computational piece of code or not; in our case this means whether they live in the impredicative or a predicative level (i.e. whether the type of their type is Prop or a $\mathsf{Type}(i)$). A similar tagging is done on the projections of $\Sigma$-types. Except for this detail, the backbone of the theory considered hereafter is essentially Luo’s Extended Calculus of Constructions (ECC) [Luo].

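The syntax of the ECC fragment is therefore

$$
\begin{array}{rcl}
s & ::= & \mathsf{Prop} \mid \mathsf{Type}(i)\\
\mathsf{s} & ::= & * \mid \diamond\\
t & ::= & s \mid x_{\mathsf{s}} \mid \lambda x_{\mathsf{s}}:t.t \mid (t\ t) \mid \Pi x_{\mathsf{s}}:t.t \mid \Sigma^{\mathsf{s}} x_{\mathsf{s}}:t.t \mid \langle t,t\rangle^{\mathsf{s}}_{\Sigma x:t.t} \mid \pi_1(t) \mid \pi_2^{\mathsf{s}}(t)\\
\Gamma & ::= & [] \mid \Gamma(x:t).
\end{array}
$$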

We sometimes call these terms raw terms, when we want to stress that they are considered independently of typing issues. The tagging of $\Sigma^{\mathsf{s}}$ is there to indicate whether the second component of the pair is computational or not (the first component will always be). For the same technical reason, we also tag the second projection $\pi_2^{\mathsf{s}}$.

We will sometimes omit the tag $\mathsf{s}$ on variables, binders and second projections when it is not relevant or can be inferred from the context.

The binding of variables is as usual. We write $t[x\setminus u]$ for the substitution of the free occurrences of variable $x$ in $t$ by $u$. As has become customary, we will not deal with $\alpha$-conversion here, and leave open the choice between named variables and de Bruijn indices.

We also use the common practice of writing $A\to B$ (resp. $A\times B$) for $\Pi x:A.B$ (resp. $\Sigma x:A.B$) when $x$ does not appear free in $B$. We also write (resp. ) for (resp. ).

### Relaxed conversion

The aim of this work is the study of a relaxed conversion rule. While the idea is to identify terms with respect to typing information, the tagging of impredicative vs. predicative variables is sufficient to define such a conversion in a simple syntactic way. A variable or a second projection is computationally irrelevant when tagged with the $*$ mark. This leads to the following definition.

###### Definition 0.1 (Extraction).

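We can simply define the extraction relation $\rhd_\varepsilon$ as the contextual closure of the following rewriting equations

$$x_{*} \rhd_\varepsilon \varepsilon \qquad \lambda x:A.\varepsilon \rhd_\varepsilon \varepsilon \qquad (\varepsilon\ t) \rhd_\varepsilon \varepsilon \qquad \pi_2^{*}(t) \rhd_\varepsilon \varepsilon.$$

We write $\rhd^*_\varepsilon$ for the reflexive-transitive closure of $\rhd_\varepsilon$. We say that a term $t$ is of tag $*$ if $t \rhd^*_\varepsilon \varepsilon$ and of tag $\diamond$ if not. We write $s(t)$ for the tag of $t$.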
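The four extraction rules can be read as a bottom-up erasure procedure. The following is a minimal Python sketch of that reading, assuming raw terms are encoded as nested tuples; the encoding, the constructor names (`"var"`, `"lam"`, `"app"`, `"proj2"`) and the helper names `extract` and `tag` are illustrative choices, not notation from the paper.

```python
# Tags: "*" marks non-computational (Prop-level) terms, "<>" computational ones.
STAR, DIAMOND = "*", "<>"
EPS = ("eps",)  # the constant that replaces erased sub-terms

# Hypothetical tuple encoding of raw terms (an assumption of this sketch):
#   ("var", name, tag)              a tagged variable
#   ("lam", name, tag, type, body)  an abstraction
#   ("app", fun, arg)               an application
#   ("proj2", tag, term)            a tagged second projection
# Any other constructor (sorts, products, pairs, first projections) is only traversed.

def extract(t):
    """Return the extraction normal form of a raw term."""
    # Contextual closure: normalise every sub-term first.
    t = tuple(extract(c) if isinstance(c, tuple) else c for c in t)
    kind = t[0]
    if kind == "var" and t[2] == STAR:    # a *-tagged variable is erased
        return EPS
    if kind == "lam" and t[4] == EPS:     # an abstraction with erased body is erased
        return EPS
    if kind == "app" and t[1] == EPS:     # an erased term applied to anything is erased
        return EPS
    if kind == "proj2" and t[1] == STAR:  # a *-tagged second projection is erased
        return EPS
    return t

def tag(t):
    """Tag of a term: "*" if it extracts to the erased constant, "<>" otherwise."""
    return STAR if extract(t) == EPS else DIAMOND

# A proof variable applied to a number is erased entirely ...
proof_app = ("app", ("var", "h", STAR), ("var", "n", DIAMOND))
# ... while a computational function keeps its shape and only loses the proof argument.
guarded = ("app", ("var", "f", DIAMOND), ("var", "h", STAR))
print(extract(proof_app))  # ('eps',)
print(extract(guarded))    # ('app', ('var', 'f', '<>'), ('eps',))
```

Normalising sub-terms before testing the root rules is one convenient strategy; since each rule only shrinks the term, any other order terminates as well.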

###### Definition 0.2 (Reduction).

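The $\beta$-reduction $\rhd_\beta$ is defined as the contextual closure of the following equations

$$
\begin{array}{rcll}
(\lambda x_{\mathsf{s}}:A.t\ u) & \rhd_\beta & t[x_{\mathsf{s}}\setminus u] & \text{if } s(u)=\mathsf{s}\\
\pi_1(\langle a,b\rangle_{\Sigma x:A.B}) & \rhd_\beta & a & \text{if } s(a)=\diamond\\
\pi_2^{\mathsf{s}}(\langle a,b\rangle_{\Sigma x:A.B}) & \rhd_\beta & b & \text{if } s(b)=\mathsf{s}.
\end{array}
$$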

The restrictions on the right-hand side are there in order to ensure that the tag is preserved by reduction. Without them, a term such as $(\lambda x_{*}:A.x_{*}\ \mathsf{Prop})$ can reduce either to $\varepsilon$ or to Prop, which would falsify the Church-Rosser property. Actually we will see that these restrictions are always satisfied on well-typed terms, but they are necessary in order to assert the meta-theoretic properties below. While these restrictions are specific to our way of marking computational terms, other methods will probably yield similar technical difficulties.
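To make the role of the side conditions concrete, here is a small Python sketch, again under an assumed tuple encoding of raw terms. It builds an application of a `*`-tagged identity abstraction to `Prop` and compares the unrestricted root contraction with extraction. The helper names (`extract`, `tag`, `subst`, `beta_root`) and the `restricted` flag are hypothetical, and substitution ignores variable capture.

```python
STAR, DIAMOND = "*", "<>"          # tags of non-computational / computational terms
EPS, PROP = ("eps",), ("sort", "Prop")

def extract(t):
    """Extraction normal form of a raw term encoded as nested tuples."""
    t = tuple(extract(c) if isinstance(c, tuple) else c for c in t)
    erased = (
        (t[0] == "var" and t[2] == STAR)        # *-tagged variable
        or (t[0] == "lam" and t[4] == EPS)      # abstraction with erased body
        or (t[0] == "app" and t[1] == EPS)      # erased head of an application
        or (t[0] == "proj2" and t[1] == STAR)   # *-tagged second projection
    )
    return EPS if erased else t

def tag(t):
    """Tag of a term: "*" if it extracts to the erased constant, "<>" otherwise."""
    return STAR if extract(t) == EPS else DIAMOND

def subst(t, x, u):
    """Substitute u for variable x in t; bound names are assumed distinct."""
    if t[0] == "var":
        return u if t[1] == x else t
    return tuple(subst(c, x, u) if isinstance(c, tuple) else c for c in t)

def beta_root(t, restricted=True):
    """Contract a redex at the root of an application; None when no step applies."""
    if t[0] != "app" or t[1][0] != "lam":
        return None
    _, x, s, _, body = t[1]
    if restricted and tag(t[2]) != s:   # side condition: tag of argument = tag of binder
        return None
    return subst(body, x, t[2])

# The binder is tagged *, but the argument Prop is computational.
term = ("app", ("lam", "x", STAR, ("var", "A", DIAMOND), ("var", "x", STAR)), PROP)

print(beta_root(term, restricted=False))  # ('sort', 'Prop')
print(extract(term))                      # ('eps',)
print(beta_root(term))                    # None
```

Without the side condition the same term has two distinct normal forms, `Prop` and the erased constant; with it, the contraction is simply blocked and only extraction applies.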

The relaxed reduction $\rhd_{\beta\varepsilon}$ is the union of $\rhd_\beta$ and $\rhd_\varepsilon$. We write $=_{\beta\varepsilon}$ for the reflexive, symmetric and transitive closure of $\rhd_{\beta\varepsilon}$ and $\rhd^*_{\beta\varepsilon}$ (resp. $\rhd^*_\beta$) for the transitive-reflexive closure of $\rhd_{\beta\varepsilon}$ (resp. $\rhd_\beta$).

It is convenient to have the predicative universes embedded in each other. It has been observed (Pollack, McKinna, Barras…) that a smooth way to present this is to define a syntactic subtyping relation which combines this with $=_\beta$ (or here $=_{\beta\varepsilon}$). Note that this notion of subtyping should not be confused with, for instance, subtyping of subset types in the style of PVS.

###### Definition 0.3 (Syntactic subtyping).

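The subtyping relation $\le$ is defined on raw terms as the transitive closure of the following equations

$$\mathsf{Type}(i) \le \mathsf{Type}(i+1) \qquad T =_{\beta\varepsilon} T' \Rightarrow T \le T' \qquad B \le B' \Rightarrow \Pi x:A.B \le \Pi x:A.B'.$$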

### Functional fragment typing rules

The typing rules for the kernel of our theory are given in PTS-style [Barendregt] and correspond to Luo’s ECC. The differences are the use of subtyping in the conversion rule and the tagging of variables when they are “pushed” into the context.

The rules are given in figure 1. In the rule Prod, $\max(s_1,s_2)$ is the maximum of two sorts for the order

### Treatment of propositional equality

Propositional equality is a first example whose treatment changes when switching to a proof-irrelevant type theory. The definition itself is unchanged; two objects $a$ and $b$ of a given type $A$ are equal if and only if they enjoy the same properties

$$a =_A b \;\equiv\; \Pi P:A\to\mathsf{Prop}.(P\ a)\to(P\ b)$$

It is well-known that reflexivity, symmetry and transitivity of equality can easily be proved. When seen as an inductive definition, the definition of “$=$” is viewed as its own elimination principle.

Let us write refl for the canonical proof of reflexivity

$$\mathsf{refl} \equiv \lambda A:\mathsf{Type}(i).\lambda x:A.\lambda P:A\to\mathsf{Prop}.\lambda p:(P\ x).p$$
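The impredicative definition and its basic properties can be replayed in a proof assistant. The following Lean 4 sketch is only an illustration: `Leib` and the lemma names are hypothetical, a single universe `Type` stands in for $\mathsf{Type}(i)$, and Lean's own type theory differs from the one presented here.

```lean
-- Leibniz equality as an impredicative definition (illustrative names).
def Leib {A : Type} (a b : A) : Prop :=
  ∀ P : A → Prop, P a → P b

-- The canonical proof of reflexivity: the identity on proofs of `P a`.
theorem Leib.refl {A : Type} (a : A) : Leib a a :=
  fun _ p => p

-- Symmetry: instantiate the predicate with `fun x => Leib x a`.
theorem Leib.symm {A : Type} {a b : A} (h : Leib a b) : Leib b a :=
  h (fun x => Leib x a) (Leib.refl a)

-- Transitivity: compose the two transports.
theorem Leib.trans {A : Type} {a b c : A}
    (h₁ : Leib a b) (h₂ : Leib b c) : Leib a c :=
  fun P pa => h₂ P (h₁ P pa)
```

Each proof only applies the definition to a well-chosen predicate, which is the sense in which the definition is its own elimination principle.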

In many cases, it is useful to extend this elimination over the computational levels

$$\mathsf{Eq\_rec}_i : \Pi A:\mathsf{Type}(i).\Pi P:A\to\mathsf{Type}(i).\Pi a,b:A.(P\ a)\to a=_A b\to(P\ b)$$

There is however a peculiarity to $\mathsf{Eq\_rec}$: in Coq, it is defined by case analysis and therefore comes with a computation rule. The term $(\mathsf{Eq\_rec}\ A\ P\ a\ b\ p\ e)$ of type $(P\ b)$ reduces to $p$ in the case where $e$ is a canonical proof by reflexivity; in this case, $a$ and $b$ are convertible and thus coherence and normalization of the type theory are preserved.

As shown in the next section, such a reduction rule is useful, especially when programming with dependent types. In our proof-irrelevant theory however, we cannot rely on the information given by the equality proof $e$, since all equality proofs are treated as convertible. Furthermore, allowing, for any $e$, the reduction rule $(\mathsf{Eq\_rec}\ A\ P\ a\ b\ p\ e)\rhd p$ is too permissive, since it easily breaks the subject reduction property in incoherent contexts.

We therefore put the burden of checking convertibility between $a$ and $b$ on the reduction rule of $\mathsf{Eq\_rec}$ by extending reduction with the following conditional rule

$$(\mathsf{Eq\_rec}\ A\ P\ a\ b\ p\ e) \rhd p \quad \text{if } a =_{\varepsilon} b$$

To be precise, this means that the reduction relation and the relation used in its side condition are actually two mutually inductive definitions.

An alternative would be the non-linear rule

$$(\mathsf{Eq\_rec}\ A\ P\ a\ a\ p\ e) \rhd p$$

but this allows an encoding of Klop’s counter-example [Klop] and thus breaks the Church-Rosser property (for untyped terms). We thus develop the metatheory for the first version.

### Generalization

In Coq, computational eliminations are provided for more inductive definitions than just propositional equality. The condition is that

1. The definition has at most one constructor,

2. the arguments of this constructor are all, themselves, non-computational.

It appears that it is reasonably straightforward to extend our type theory, by generalizing the $\mathsf{Eq\_rec}$ feature, in order to capture this Coq behavior in the case where the inductive definition is non-recursive. We briefly indicate how, but without precise justification. The remainder of this paragraph is thus not considered in the meta-theoretical justifications; it is also not necessary for the rest of the article.

We write for and for .

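Consider an inductive definition $I$ with a unique constructor $c : \Pi\vec{y}:\vec{B}.(I\ \vec{u})$. The non-computational elimination scheme is

$$I\_ind : \Pi P:(\Pi\vec{x}:\vec{A}.\mathsf{Prop}).(\Pi\vec{y}:\vec{B}.P\ \vec{u})\to\Pi\vec{x}:\vec{A}.I\ \vec{x}\to P\ \vec{x}$$

with the reduction rule

$$(I\_ind\ X\ p\ \vec{a}\ (c\ \vec{b})) \rhd (p\ \vec{b})$$

We can then provide a computational elimination

$$I\_rec : \Pi X:(\Pi\vec{x}:\vec{A}.\mathsf{Type}).(\Pi\vec{y}:\vec{B}.X\ \vec{u})\to\Pi\vec{x}:\vec{A}.I\ \vec{x}\to X\ \vec{x}$$

with the following reduction rule

$$(I\_rec\ X\ p\ \vec{a}\ i) \rhd (p\ \vec{\varepsilon}) \quad \text{if } \vec{u} =_{\beta\varepsilon} \vec{a}$$

To understand the last condition, one should note that although the variables $\vec{y}$ are free in $\vec{u}$, they do not interfere with the conversion since their types ensure they are all tagged by $*$.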

### Data Types

In order to be practical, the theory needs to be extended by inductive definitions in the style of Coq, Lego and others. We do not detail the typing rules and liberally use integers, booleans, usual functions and predicates ranging over them. We refer to the Coq documentation [Coq, Gim]; for a possibly more modern presentation, see [Blanqui].

Let us just mention that data types live in Type. That is, for instance, the sort of $\mathsf{nat}$ is of the form $\mathsf{Type}(i)$; thus, their elements are of tag $\diamond$.

## Basic metatheory

We sketch the basic meta-theory of the calculus defined up to here. The proof techniques are relatively traditional, even if one has to take care of the more delicate behavior of relaxed reduction for the first lemmas (similarly to [MiqWer]).

###### Lemma 0.4.

If , then . Thus, the same is true if .

###### Proof.

By a straightforward case analysis of the form of .

###### Lemma 0.5 (β-postponement).

If , then there exists such that and .

###### Proof.

One first shows that if , then there exists such that and either or . This is done by checking how the two redexes are located with respect to each other. The proof of the lemma then easily follows.

###### Lemma 0.6 (Church-Rosser).

For a raw term, if and , then there exists such that and .

###### Proof.

By a quite straightforward adaptation of the usual Tait and Martin-Löf method. The delicate point was to choose the right formulation of the reduction rule specific to the elimination of propositional equality, as mentioned in the section on the treatment of propositional equality.

An immediate but very important consequence is the following.

###### Corollary 0.7.

If , then and .

###### Corollary 0.8.

For any ,

1. if and then .

Furthermore, $\rhd_\varepsilon$ is obviously strongly normalizing. One can therefore "pre-cook" all terms by $\rhd_\varepsilon$ when checking relaxed convertibility.

###### Lemma 0.9 (pre-cooking of terms).

Let and be raw terms. Let and be their respective $\varepsilon$-normal forms. Then, if and only if .

While this property is important for implementation, its converse is also true and semantically understandable. Computationally relevant $\beta$-reductions are never blocked by not-yet-performed $\varepsilon$-reductions.

###### Lemma 0.10.

Let be any raw term. Suppose . Then there exists such that .

###### Proof.

It is easy to see that $\rhd_\varepsilon$ cannot create new $\beta$-redexes, nor does it duplicate existing ones.

###### Lemma 0.11.

If , for any term and variable , one has . Thus, if then .

###### Proof.

By straightforward induction over the structure of . One uses the fact that, since and have the same syntactic sort, the terms and also have the same syntactic sort.

###### Lemma 0.12 (Substitution).

If and are derivable, if and have the same (syntactic) sort, then is derivable.

###### Proof.

By induction over the structure of the first derivation, like in the usual proof. The condition over the syntactic sorts is necessary for the case of the conversion rule, in order to apply the previous lemma.

###### Lemma 0.13 (Inversion or Stripping).

If is derivable, then so are and for some sort . Furthermore, the following clauses hold.

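If $\Gamma\vdash x:T$ is derivable, then:
 ∙  $(x,U)\in\Gamma$, ∙  $\Gamma\vdash T:s$ is derivable, ∙  $U\le T$.

If $\Gamma\vdash (t\ u):V$ is derivable, then:
 ∙  $\Gamma\vdash t:\Pi x:U.W$, ∙  $\Gamma\vdash u:U$, ∙  $W[x\setminus u]\le V$.

If $\Gamma\vdash \lambda x:U.t : W$ is derivable, then:
 ∙  $\Gamma(x:U)\vdash t:T$, ∙  $\Pi x:U.T\le W$.

If $\Gamma\vdash \Pi x:A.B : T$ is derivable, then:
 ∙  $\Gamma\vdash A:s_1$, ∙  $\Gamma(x:A)\vdash B:s_2$, ∙  either $s_2=\mathsf{Prop}$ and $\mathsf{Prop}\le T$ or $\max(s_1,s_2)\le T$.

If $\Gamma\vdash \Sigma^{*}x:A.B : T$ is derivable, then:
 ∙  $\Gamma\vdash A:\mathsf{Type}(i)$, ∙  $\Gamma(x:A)\vdash B:\mathsf{Prop}$, ∙  $\mathsf{Type}(i)\le T$.

If $\Gamma\vdash \Sigma^{\diamond}x:A.B : T$ is derivable, then:
 ∙  $\Gamma\vdash A:\mathsf{Type}(i)$, ∙  $\Gamma(x:A)\vdash B:\mathsf{Type}(j)$, ∙  $\mathsf{Type}(\max(i,j))\le T$.

is not derivable.

If $\Gamma\vdash \langle t,u\rangle_{\Sigma x:T.U} : V$ is derivable, then:
 ∙  $\Sigma x_{\diamond}:T.U\le V$, ∙  $\Gamma\vdash t:T$, ∙  $\Gamma\vdash u:U[x_{\diamond}\setminus t]$, ∙  $s(t)=\diamond$.

If $\Gamma\vdash \pi_1(t):T$ is derivable, then:
 ∙  $\Gamma\vdash t:\Sigma x_{\diamond}:A.B$, ∙  $A\le T$.

If $\Gamma\vdash \pi_2^{\mathsf{s}}(t):T$ is derivable, then:
 ∙  $\Gamma\vdash t:\Sigma^{\mathsf{s}}x_{\diamond}:A.B$, ∙  $B[x_{\diamond}\setminus\pi_1(t)]\le T$.

If is derivable, then If , then

###### Proof.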

Simultaneously by induction over the derivation.

###### Corollary 0.14 (Principal type).

If $\Gamma\vdash t:T$, then there exists $U$ such that $\Gamma\vdash t:U$ and for all $V$, if $\Gamma\vdash t:V$, then $U\le V$.

###### Proof.

By induction over the structure of $t$, using the previous lemma and corollaries 0.7 and 0.8.

Of course, subject reduction holds only for $\beta$-reduction, since $\varepsilon$ is not meant to be typable.

###### Lemma 0.15 (Subject reduction).

If is derivable, if (resp. , ), then (resp. ).

###### Proof.

By induction over the structure of . Depending upon the position of the redex, one uses either the substitution or the stripping lemmas above. We only detail the case where a $\beta$-reduction occurs at the root of the term.

If , and , we know that , and . Thus we can apply lemma 0.12 to deduce

$$\Gamma\vdash v[x_{\mathsf{s}}\setminus u] : V[x_{\mathsf{s}}\setminus u]$$

and

$$\Gamma\vdash V[x_{\mathsf{s}}\setminus u] : s$$

where is the sort such that . The result then follows through one application of the conversion rule.

###### Lemma 0.16.

If $\Gamma\vdash t:T$ is derivable, then there exists a sort $s$ such that $\Gamma\vdash T:s$; furthermore $s=\mathsf{Prop}$ if and only if $t$ is of tag $*$.

###### Proof.

By induction over the structure of . The Church-Rosser property ensures that Prop and $\mathsf{Type}(i)$ are not convertible.

A most important property is of course normalization. We do not claim any proof here, although we very strongly conjecture it. A smooth way to prove it is probably to build on top of the simple set-theoretical model using an interpretation of types as saturated $\Lambda$-sets as first proposed by Altenkirch [Alti, MelWer].

###### Conjecture 0.17 (Strong Normalization).

If $\Gamma\vdash t:T$ is derivable, then $t$ is strongly normalizing.

Stating strong normalization is important in the practice of proof-checking, since it entails decidability of type-checking and type-inference.

###### Corollary 0.18.

Given $\Gamma$, it is decidable whether $\Gamma$ is well-formed. Given $\Gamma$ and a raw term $t$, it is decidable whether there exists $T$ such that $\Gamma\vdash t:T$ holds.

###### Proof.

By induction over the structure of $t$, using the stripping lemma. Normalization ensures that the relation $\le$ is decidable for well-formed types.

The other usual side-product of normalization is a syntactic assessment of constructivity.

###### Corollary 0.19.

If , then with and .

###### Proof.

By case analysis over the normal form of , using the stripping lemma.

## Programming with dependent types

We now list some applications of the relaxed conversion rule, which all follow the slogan that proof-irrelevance makes programming with dependent types more convenient and efficient.

From now on, we will write $\{x:A|P\}$ for $\Sigma^{*}x:A.P$, that is for a $\Sigma$-type whose second component is non-computational.

### Dependent equality

Programming with dependent types means that terms occur in the type of computational objects (i.e. not only in propositions). The way equality is handled over such families of types is thus a crucial point which is often problematic in intensional type theories.

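Let us take a simple example. Suppose we have defined a data-type $\mathsf{tab}$ of arrays over some type $A$. If $n$ is a natural number, $(\mathsf{tab}\ n)$ is the type of arrays of size $n$. That is, $\mathsf{tab}:\mathsf{nat}\to\mathsf{Type}$. Furthermore, let us assume we have a function $\mathsf{acc}:\Pi n:\mathsf{nat}.(\mathsf{tab}\ n)\to\mathsf{nat}\to A$ modeling access to the array.

Commutativity of addition can be proved in the theory: $\mathsf{com}:\Pi m,p:\mathsf{nat}.(m+p)=_{\mathsf{nat}}(p+m)$. Yet $\mathsf{tab}\ (m+p)$ and $\mathsf{tab}\ (p+m)$ are two distinct types with distinct inhabitants. For instance, if we have an array $t:\mathsf{tab}\ (m+p)$, we can use the operator $\mathsf{Eq\_rec}$ described above to transform it into an array of size $p+m$

$$t' \equiv \mathsf{Eq\_rec}\ \mathsf{nat}\ \mathsf{tab}\ (m+p)\ (p+m)\ t\ (\mathsf{com}\ m\ p) : \mathsf{tab}\ (p+m)$$

Of course, $t$ and $t'$ should have the same elements, and we would like to prove

$$\Pi i:\mathsf{nat}.\ \mathsf{acc}\ (m+p)\ t\ i =_A \mathsf{acc}\ (p+m)\ t'\ i$$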

It is known [HofStr, McBride] that in order to do so, one needs the reduction rule for $\mathsf{Eq\_rec}$ together with a proof that equality proofs are unique. The latter property is generally established by a variant of what Streicher calls the “K axiom”

$$K : \Pi A:\mathsf{Type}.\Pi a:A.\Pi P:a=_A a\to\mathsf{Prop}.(P\ (\mathsf{refl}\ a))\to\Pi e:a=_A a.(P\ e)$$

where refl stands for the canonical proof by reflexivity.

Here, since equality proofs are also irrelevant to conversion, this axiom becomes trivial. Actually, since $(P\ (\mathsf{refl}\ a))$ and $(P\ e)$ are convertible, this statement does not even need to be mentioned anymore, and the associated reduction rule becomes superfluous.
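The same phenomenon can be observed in Lean 4, whose conversion also identifies all proofs of a proposition. The sketch below is only an analogy: Lean's equality is an inductive family rather than the impredicative encoding used here, its proof-irrelevance is typed rather than based on tags, and `tab` is an abstract placeholder for the array family.

```lean
-- Transporting an array along commutativity of addition.
example (tab : Nat → Type) (m p : Nat) (t : tab (m + p)) : tab (p + m) :=
  cast (congrArg tab (Nat.add_comm m p)) t

-- Streicher's K: with proof-irrelevant conversion, `h` itself is the proof,
-- because `P rfl` and `P e` are convertible.
example {A : Type} (a : A) (P : a = a → Prop)
    (h : P rfl) (e : a = a) : P e :=
  h

-- Transport along any proof of `a = a` computes to the identity,
-- whatever the shape of the proof `e`.
example {A : Type} (P : A → Type) (a : A) (x : P a) (e : a = a) :
    cast (congrArg P e) x = x :=
  rfl
```

The last example plays the role of the conditional reduction rule for the eliminator: the step is licensed by the convertibility of the two endpoints, not by inspecting the equality proof.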

In general, it should be interesting to transpose work like McBride’s [McBride] in the framework of proof-irrelevant theories.

### Partial functions and equality over subset types

In the literature of type theory, subset types come in many flavors; they designate the restriction of a type to the elements verifying a certain predicate. The type $\{x:A|P\}$ can be viewed as the constructive statement "there exists an element of $A$ verifying $P$", but also as the data-type $A$ restricted to elements verifying $P$. In most current intensional type theories, the latter approach is not very practical since equality is defined over it in too narrow a way. We have $\langle a,p\rangle =_\beta \langle a',p'\rangle$ only if $a=_\beta a'$ and $p=_\beta p'$; the problem is that one would like to get rid of the second condition. The same is true for propositional Leibniz equality and one can establish

$$\langle a,p\rangle =_{\{x:A|P\}} \langle a,p'\rangle \to p =_{P[x\setminus a]} p'$$

In general however, one is only interested in the validity of the assertion $P[x\setminus a]$, not the way it is proved. A program awaiting an argument of type $\{x:A|P\}$ will behave identically if fed with $\langle a,p\rangle$ or $\langle a,p'\rangle$.

Therefore, each time a construct $\{x:A|P\}$ is indeed used as a data-type, one cannot use Leibniz equality in practice. Instead, one has to define a less restrictive equivalence relation which simply states that the two first components of the pairs are equal

$$\langle a,p\rangle \simeq_{A,P} \langle a',p'\rangle \quad\equiv\quad a =_A a'$$

But using $\simeq_{A,P}$ instead of $=_{\{x:A|P\}}$ quickly becomes very tedious; typically, for every function $f:\{x:A|P\}\to B$ one has to prove

$$\Pi c,c':\{x:A|P\}.\ c\simeq_{A,P}c' \to (f\ c)=_B(f\ c')$$

and even more specific statements if $B$ is itself a subset type.

In our theory, one can prove without difficulties that $\simeq_{A,P}$ and $=_{\{x:A|P\}}$ are equivalent, and there is indeed no need anymore for defining $\simeq_{A,P}$. Furthermore, one has $\langle a,p\rangle =_{\beta\varepsilon} \langle a,p'\rangle$, so the two terms are computationally identified, which is stronger than Leibniz equality, avoids the use of the deductive level and makes proofs and developments more concise.
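As an illustration, the following Lean 4 sketch states the same two facts for Lean's built-in subtype `{ x : A // P x }`, which plays the role of the subset type here. Lean's proof-irrelevant conversion is an assumption of the analogy, not the theory of this paper.

```lean
-- Two inhabitants of a subset type with the same first component are convertible:
-- the equation holds by reflexivity, without any reasoning about the proofs.
example {A : Type} {P : A → Prop} (a : A) (p p' : P a) :
    (⟨a, p⟩ : { x : A // P x }) = ⟨a, p'⟩ :=
  rfl

-- Hence equality of first components is enough, and every function respects it.
example {A B : Type} {P : A → Prop} (f : { x : A // P x } → B)
    (c c' : { x : A // P x }) (h : c.val = c'.val) : f c = f c' :=
  congrArg f (Subtype.eq h)
```

The second statement is the compatibility condition that would otherwise have to be proved separately for each function.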

#### Array bounds

The same can be observed when partial functions are curried. Let us take again the example of arrays, but suppose this time the access function awaits a proof that the index is within the bounds of the array.

$$
\begin{array}{rcl}
\mathsf{tab} & : & \mathsf{nat}\to\mathsf{Type}(i)\\
\mathsf{acc} & : & \Pi n:\mathsf{nat}.\ \mathsf{tab}\ n\to\Pi i:\mathsf{nat}.\ i<n\to A
\end{array}
$$

So given an array $t$ of size $n$, its corresponding access function is

$$a \equiv \mathsf{acc}\ n\ t : \Pi i:\mathsf{nat}.\ i<n\to A$$

In traditional type theory, this definition is cumbersome to use, since one has to state explicitly that the values $(a\ i\ p_i)$, where $p_i : i<n$, do not depend upon $p_i$. The type above is therefore not sufficient to describe an array; instead one needs the additional condition

$$T_{irr} : \Pi i:\mathsf{nat}.\Pi p_i,p'_i:i<n.\ (a\ i\ p_i) =_A (a\ i\ p'_i)$$

where $=_A$ stands for the propositional Leibniz equality.

This is again verbose and cumbersome since $T_{irr}$ has to be invoked repeatedly. In our theory, not only does the condition $T_{irr}$ become trivial, since for any $p_i$ and $p'_i$ one has $(a\ i\ p_i) =_{\beta\varepsilon} (a\ i\ p'_i)$, but this conversion is stronger than propositional equality: there is no need anymore to have recourse to the deductive level and prove this equality. The proof terms are therefore clearer and smaller.
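A one-line Lean 4 sketch of the same observation, assuming Lean's proof-irrelevant conversion as a stand-in for the relaxed conversion rule; the access function `a` is an abstract hypothesis.

```lean
-- The value of a bounds-checked access does not depend on the bound proof:
-- the two applications are convertible, so reflexivity suffices.
example {A : Type} {n : Nat} (a : (i : Nat) → i < n → A)
    (i : Nat) (p p' : i < n) : a i p = a i p' :=
  rfl
```

No irrelevance lemma has to be stated or invoked; the conversion check discharges it.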

### On-the-fly extraction

An important point, which we only briefly mention here, is the consequence for the implementation when switching to a proof-irrelevant theory. In a proof-checker, the environment consists of a sequence of definitions or lemmas which have been type-checked. If the proof-checker implements a proof-irrelevant theory, it is reasonable to keep two versions of each constant: the full proof-term, which can be printed or re-checked, and the extracted one (that is, $\varepsilon$-normalized), which is used for conversion checks. This would be even more natural when building on recent Coq implementations, which already use a dual storing of constants, the second representation being non-printable compiled code used precisely for fast conversion checks.

In other words, a proof-system built upon a theory such as the one presented here would allow the user to efficiently exploit the computational behavior of a constructive proof in order to prove new facts. This makes the benefits of program extraction technology available inside the system and helps transform proof-systems into viable programming environments.

## Relating to PVS

Subset types also form the core of PVS. In this formalism the objects of type $\{x:A|P\}$ are also of type $A$, and objects of type $A$ can be of type $\{x:A|P\}$. This makes type checking undecidable and is thus impossible in our setting. But we show that it is possible to build explicit coercions between the corresponding types of our theory which basically behave like the identity.

What is presented in this section is strongly related to the work of Sozeau [sozeau], which describes a way to provide a PVS style input mode for Coq.

The following lemma states that the construction and destruction operations of our subset types can actually be omitted when checking conversion.

###### Lemma 0.20 (Singleton simplification).

The typing relation of our theory remains unchanged if we extend the reduction of our theory by the two rules below. (To make the second clause rigorous, a solution is to modify slightly the theory by adding a tag to the first projection, $\pi_1^{*}$ and $\pi_1^{\diamond}$. This does not significantly change the metatheory.)

$$\langle a,p\rangle_{\Sigma^{*}x:A.P} \rhd_\varepsilon a \qquad \pi_1(c) \rhd_\varepsilon c \quad \text{when } c:\Sigma^{*}x:A.B$$

The following definition is directly transposed from PVS [pvs]. (A difference is that in PVS, propositions and booleans are identified; but this point is independent of this study. It is however possible to do the same in our theory by assuming a computational version of excluded-middle.) We do not treat dependent types in full generality (see chapter 3 of [pvs]).

###### Definition 0.21 (Maximal super-type).

The maximal super-type is a partial function $\mu$ from terms to terms, recursively defined by the following equations. In all these equations, and are of type in a given context.

###### Definition 0.22 (η-reduction).

The generalized $\eta$-reduction, written $\rhd_\eta$, is the contextual closure of

$$\lambda x:A.(t\ x) \rhd_\eta t \quad \text{if } x \text{ is not free in } t \qquad\qquad \langle \pi_1(t),\pi_2(t)\rangle \rhd_\eta t$$
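For comparison, both rules hold definitionally in Lean 4, which can be checked with the following sketch. This is only an illustration of what the two $\eta$ rules identify; Lean's treatment of $\eta$ for pairs is typed, whereas the rules above are stated on raw terms.

```lean
-- Function η: an abstraction that merely applies `t` is convertible with `t`.
example {A B : Type} (t : A → B) : (fun x : A => t x) = t := rfl

-- Surjective pairing: rebuilding a pair from its projections gives the pair back.
example {A B : Type} (t : A × B) : (t.1, t.2) = t := rfl
```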

We can now construct the coercion function from $A$ to $\mu(A)$.

###### Lemma 0.23.

If and $\mu(A)$ is defined, then

1. ,

2. there exists a function $\bar\mu(A)$ which is of type $A\to\mu(A)$ in $\Gamma$,

3. furthermore, when applying the singleton simplification to $\bar\mu(A)$ one obtains an $\eta$-expansion of the identity function; to be precise,

###### Proof.

It is almost trivial to check that . The two other clauses are proved by induction over the structure of $A$.

1. If $A$ is of the form $\{x:B|P\}$ with , then

$$\bar\mu(A) \equiv \lambda x:\{x:B|P\}.(\bar\mu(B)\ \pi_1(x)) : \{x:B|P\}\to\mu(B)$$

Furthermore, since , is here simplified to , and by induction hypothesis we know that reduces to . We can conclude that .

2. If $A$ is of the form $C\to B$ with , then

$$\bar\mu(A) \equiv \lambda h:C\to B.\lambda x:C.\bar\mu(B)\ (h\ x) : C\to\mu(B)$$

Since , we have .

3. If $A$ is of the form $B\times C$, then

$$\bar\mu(A) \equiv \lambda x:B\times C.\langle(\bar\mu(B)\ \pi_1(x)),(\bar\mu(C)\ \pi_2(x))\rangle_{\mu(B)\times\mu(C)}$$

Again, the induction hypotheses assure that .

The opposite operation, going from $\mu(A)$ to $A$, can only be performed when some conditions are verified (type-checking conditions, or TCCs in PVS terminology). We can also transpose this to our theory, still keeping the simple computational behavior of the coercion function. This time however, our typing being less flexible than that of PVS, we have to define the coercion function and its type simultaneously; furthermore, in general, this operation is well-typed only if the type theory supports generalized $\eta$-reduction. (It should be mentioned that adding $\eta$-reduction to such a type system yields non-trivial technical difficulties, which are mostly independent of the question of proof-irrelevance.)

This unfortunate restriction is typical when defining transformations over programs with dependent types. It should however not be taken too seriously, and we believe this cosmetic imperfection can generally be tackled in practice. (For one, in practical cases, $\eta$ does not seem to be necessary very often, only with some nested existentials. And even then, it should be possible to tackle the problem by proving the corresponding equality on the deductive level.)

###### Lemma 0.24 (subtype constraint).

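Given , if $\mu(A)$ is defined, then one can define $\pi(A)$ and $\bar\pi(A)$ such that, in the theory where conversion is extended with $\rhd_\eta$, one has

$$\Gamma\vdash\pi(A):\mu(A)\to\mathsf{Prop} \quad\text{and}\quad \Gamma\vdash\bar\pi(A):\Pi x:\mu(A).(\pi(A)\ x)\to A$$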

Furthermore, -normalizes to .

###### Proof.

By straightforward induction. We only provide detail for the case where . Then and .

## A more extensional theory

Especially during the 1970s and 1980s, there was an intense debate about the respective advantages of intensional versus extensional type theories. The latter denomination seems to cover various features like replacing conversion by propositional equality in the conversion rule or adding primitive quotient types. In general, these features provide a more comfortable construction of some mathematical concepts and are closer to set-theoretical practice. But they break other desirable properties, like decidability of type-checking and strong normalization.

The theory presented here should therefore be considered as belonging to the intensional family. However, we retrieve some features usually understood as extensional.

### The axiom of choice

Consider the usual form of the (typed) axiom of choice (AC)

$$(\forall x:A.\exists y:B.R(x,y)) \Rightarrow \exists f:A\to B.\forall x:A.R(x,f\ x)$$

When we transpose it into our type theory, we can choose to translate the existential quantifier either by a $\Sigma$-type, or by the existential quantifier defined in Prop

$$\exists x:A.P \;\equiv\; \Pi Q:\mathsf{Prop}.(\Pi x:A.P\to Q)\to Q \;:\; \mathsf{Prop}$$

If we use a $\Sigma$-type, we get a type which is obviously inhabited, using the projections $\pi_1$ and $\pi_2$. However, if we read the existential quantifiers of AC as defined above, we obtain a (non-computational) proposition which is not provable in type theory.
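The contrast between the two readings can be sketched in Lean 4. This is a simplified illustration: subset types `{ y // … }` are used for the computational existential, and `choiceSigma`, `Ex` and `Ex.intro` are hypothetical names.

```lean
-- Computational reading: the choice function is obtained by projecting
-- the witness out of each dependent pair.
def choiceSigma {A B : Type} {R : A → B → Prop}
    (h : (x : A) → { y : B // R x y }) :
    { f : A → B // ∀ x : A, R x (f x) } :=
  ⟨fun x => (h x).val, fun x => (h x).property⟩

-- Impredicative existential living in Prop, as defined in the text.
def Ex (A : Type) (P : A → Prop) : Prop :=
  ∀ Q : Prop, (∀ x : A, P x → Q) → Q

-- Its introduction rule; there is no projection returning the witness in `A`.
theorem Ex.intro {A : Type} {P : A → Prop} (a : A) (p : P a) : Ex A P :=
  fun _ k => k a p
```

With `Ex`, a witness can only be used to prove another proposition `Q`, which is why the version of AC stated with this quantifier cannot be derived by projection.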

Schematically, this propositions states that if is provable, then the corresponding function from to exists “in the model”. This assumption is strong and allows to encode IZF set theory into type theory (see [Werner]).

What is new is that our proof-irrelevant type theory is extensional enough to perform the first part of Goodman and Myhill’s proof based on Diaconescu’s observation. Assuming AC, we can prove the decidability of equality. Consider any type and two objects and of type . We define a type corresponding to the unordered pair

$$\{a,b\} \;\equiv\; \{x:A \mid x =_A a \vee x =_A b\}$$

Let us write (resp. ) for the element of corresponding to (resp. ); so and . It is then easy to prove that

$$\Pi z:\{a,b\}.\,\exists e:\mathsf{bool}.\,(e =_{\mathsf{bool}} \mathsf{true} \wedge \pi_1(z) =_A a) \vee (e =_{\mathsf{bool}} \mathsf{false} \wedge \pi_1(z) =_A b)$$

and from the axiom of choice we deduce

$$\exists f:\{a,b\}\to\mathsf{bool}.\,\Pi z:\{a,b\}.\,(f\,z =_{\mathsf{bool}} \mathsf{true} \wedge \pi_1(z) =_A a) \vee (f\,z =_{\mathsf{bool}} \mathsf{false} \wedge \pi_1(z) =_A b)$$

Finally given such a function , one can compare and , since both are booleans over which equality is decidable.

The key point is then that, thanks to proof-irrelevance, the equivalence between and is provable in the theory. Therefore, if and are different, so are and . On the other hand, if then and so . In the same way, entails .

We thus deduce and by generalizing with respect to and we obtain

$$\Pi A:\mathsf{Type}(i).\,\Pi a,b:A.\; a =_A b \vee a \neq_A b$$

which is a quite classical statement. We have formalized this proof in Coq, assuming proof-irrelevance as an axiom.
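The argument can be replayed as a sketch in Lean 4, whose kernel also identifies any two proofs of the same proposition. This is not the Coq development mentioned above: the names `ACProp`, `Pair`, `pair_eq` and `eq_or_ne_of_ac` are hypothetical, AC is taken as a hypothesis in its Prop form, and Lean's subtype plays the role of the subset type.

```lean
-- Sketch for Lean 4 (core library only); all names are illustrative.

/-- AC with the existential in Prop, assumed rather than proved. -/
def ACProp : Prop :=
  ∀ (A B : Type) (R : A → B → Prop),
    (∀ x : A, ∃ y : B, R x y) → ∃ f : A → B, ∀ x : A, R x (f x)

/-- The unordered pair {a,b} as a subset type of A. -/
abbrev Pair {A : Type} (a b : A) : Type := { x : A // x = a ∨ x = b }

/-- Key step: if a = b, the two canonical elements of the pair coincide.
    After substitution they differ only in their proof component,
    which is identified by definitional proof irrelevance. -/
theorem pair_eq {A : Type} {a b : A} (h : a = b) :
    (⟨a, Or.inl rfl⟩ : Pair a b) = ⟨b, Or.inr rfl⟩ := by
  subst h
  rfl

theorem eq_or_ne_of_ac (ac : ACProp) {A : Type} (a b : A) :
    a = b ∨ a ≠ b := by
  -- Every element of the pair can be tagged by a boolean.
  have total : ∀ z : Pair a b, ∃ e : Bool,
      (e = true ∧ z.val = a) ∨ (e = false ∧ z.val = b) := by
    intro z
    cases z.property with
    | inl h => exact ⟨true, Or.inl ⟨rfl, h⟩⟩
    | inr h => exact ⟨false, Or.inr ⟨rfl, h⟩⟩
  -- AC turns the tagging relation into a function f.
  cases ac (Pair a b) Bool _ total with
  | intro f hf =>
    cases hf ⟨a, Or.inl rfl⟩ with
    | inr ha => exact Or.inl ha.2
    | inl ha =>
      cases hf ⟨b, Or.inr rfl⟩ with
      | inl hb => exact Or.inl hb.2.symm
      | inr hb =>
        -- f separates the two elements, so a = b would give true = false.
        apply Or.inr
        intro h
        have e : true = false :=
          ha.1.symm.trans ((congrArg f (pair_eq h)).trans hb.1)
        cases e
```

The only place where proof irrelevance is used is `pair_eq`; the rest is ordinary case analysis on the two boolean values.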

Note of course that this “decidability” is restricted to a disjunction in Prop and that it is not possible to build an actual generic decision function. Indeed, the constructivity of results in the predicative fragment of the theory is preserved, even when assuming the excluded middle in Prop.

### Other classical non-computational axioms

At present, we have not been able to deduce the excluded middle (EM) from the statement above. (Footnote 6: in set theory, decidability of equality entails the excluded middle, since is equal to if and only if holds.) We leave this theoretical question to future investigations, but it seems quite clear that in most cases, when admitting AC one will also be willing to admit EM. In fact both axioms are validated by the simple set-theoretical model and give a setting where the ’s are inhabited by computational types (i.e. from we can compute of type ) and Prop allows classical reasoning about those programs.

Another practical statement which is validated by the set-theoretical model is the axiom that point-wise equal functions are equal

$$(\mathrm{EXT})\qquad \Pi A,B:\mathsf{Type}(i).\,\Pi f,g:A\to B.\,(\Pi x:A.\, f\,x =_B g\,x) \to f =_{A\to B} g$$

Note that combining this axiom with AC (and thus decidability of equality) is already enough to prove (in Prop) the existence of a function deciding whether a Turing machine halts.
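For comparison, here is how the EXT statement reads in Lean 4, where it is available as the library theorem `funext` (derived there from quotient types rather than postulated). The names `ext_stmt` and `zero_add_fun` are ours and purely illustrative.

```lean
-- Sketch for Lean 4 (core library only); all names are illustrative.

/-- EXT restated: point-wise equal functions are equal. -/
theorem ext_stmt {A B : Type} (f g : A → B)
    (h : ∀ x : A, f x = g x) : f = g :=
  funext h

/-- A use case: these two functions are not convertible,
    since `0 + n = n` needs an induction, yet they are equal by EXT. -/
theorem zero_add_fun : (fun n : Nat => 0 + n) = (fun n : Nat => n) :=
  funext Nat.zero_add
```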

### Quotients and normalized types

Quotient sets are a typically extensional concept whose adaptation to type theory has always been problematic. Again, one has to choose between “effective” quotients and decidability of type-checking. Searching for a possible compromise, Courtieu [Courtieu] ended up with an interesting notion of normalized type. (Footnote 7: a similar notion has been developed for NuPRL [NK06].) The idea is remarkably simple: given a function $f : A \to B$, we can define $\{f(x) \mid x:A\}$, which is the subtype of $B$ corresponding to the range of $f$. His rules are straightforwardly translated into our theory by simply taking

$$\{f(x) \mid x:A\} \;\equiv\; \{y:B \mid \exists x:A.\, y =_B f\,x\}$$

Courtieu also gives the typing rules for functions going from to , and back in the case where is actually of type .

The relation with quotients is that, in the case where $f : A \to A$, we can understand $\{f(x) \mid x:A\}$ as the type $A$ quotiented by the relation

$$x \mathrel{R} y \iff f\,x =_A f\,y$$

In practice this appears to be often the case, and Courtieu describes several applications.
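As a rough illustration of why the range subtype behaves like a quotient, the following Lean 4 sketch defines it and checks that two arguments with the same image are sent to the same element. The names `Range`, `toRange` and `toRange_eq` are hypothetical and do not come from Courtieu's rules; the equality relies on the proof component of a subtype element being irrelevant.

```lean
-- Sketch for Lean 4 (core library only); all names are illustrative.

/-- The normalized type {f(x) | x : A}: the part of B reached by f. -/
abbrev Range {A B : Type} (f : A → B) : Type :=
  { y : B // ∃ x : A, y = f x }

/-- Every x : A yields an element of the range of f. -/
def toRange {A B : Type} (f : A → B) (x : A) : Range f :=
  ⟨f x, ⟨x, rfl⟩⟩

/-- Elements related by `f x = f y` are identified in `Range f`:
    only the first component matters, the existence proof does not. -/
theorem toRange_eq {A B : Type} (f : A → B) {x y : A}
    (h : f x = f y) : toRange f x = toRange f y :=
  Subtype.eq h
```

With `f : A → A` a normalization function, `toRange f` thus plays the role of the canonical projection onto the quotient.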

## Simple semantics

When justifying the correctness of a program extraction mechanism, one can use either semantics or syntax. In the first case, one builds a model and verifies that it validates extraction [Berardi]. In the latter case, at least in the framework of type theories, this mainly means building a realizability interpretation on top of the strong normalization property [Paulin]. This second approach is difficult here, since our theory is itself built using the erasure of non-computational terms. Furthermore, for complex theories, it appears easier to prove strong normalization using an already defined model [Alti, MelWer, CoqSpi].

For this reason alone, it is worth treating the topic of semantics here. Furthermore, we believe it is a good point for a theory meant to be used in a proof-system to bear simple semantics, in order to justify easily the validity of additional axioms like the ones mentioned in the previous section or extensions like the useful reduction rule for (par. On the strength of proof-irrelevant type theories) which is difficult to treat by purely syntactic means.

Set-theoretical interpretations are the most straightforward way to provide semantics for typed $\lambda$-calculi. Such an interpretation consists, given an interpretation of the free variables, of interpreting a type by a set , and terms by elements of . Furthermore, $\lambda$-abstractions are interpreted by their set-theoretical counterparts: is the function mapping to . While these interpretations are not interesting for studying the dynamics of proof-normalization, they have the virtue of simplicity.

Since Reynolds [Reynolds], it is well-known that impredicative or polymorphic types, such as the inhabitants of Prop, bear only a trivial set-theoretical interpretation: if , then is either the empty set or a singleton. In other words, all proofs of a proposition have the same interpretation. Since our theory precisely identifies all the elements of at the computational level, the set-theoretical setting is, for its simplicity, the most appealing for our goal.
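The syntactic counterpart of this "empty set or singleton" reading can be observed in any system with proof irrelevance in its conversion rule. As a small sketch, assuming Lean 4 (whose kernel has this feature) and with the illustrative name `proofs_agree`:

```lean
-- Sketch for Lean 4 (core library only); the name is illustrative.

/-- Two proofs of the same proposition are convertible, so reflexivity
    suffices: a proposition has at most one inhabitant up to conversion. -/
theorem proofs_agree (P : Prop) (p q : P) : p = q := rfl
```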

Although the set-theoretical model construction is not as simple as it might seem [MiqWer], the setting is not new; we try to give a reasonably precise description here.

### Notations

Peter Aczel’s way to encode set-theoretic functions provides a tempting framework for a model construction, and a previous version of this section relied on it. However, because of technical difficulties appearing when proving the subject reduction property for the semantic interpretation, we finally favor the traditional set-theoretic view of functions, where the application is only defined when belongs to the domain of the function .

If is a mapping from variables to sets and is a set, we write for the function mapping to and identical to elsewhere.

The interpretation of the hierarchy goes beyond ZFC set theory and relies on the existence of inaccessible cardinals. This means, we postulate, for every natural number the existence of a set such that

1. ,

2. is closed under all set-theoretical operations.

As usual, we write for the empty set. We write for the canonical singleton . If is a set and a family of sets indexed over , we use the set-theoretical dependent products and sums

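$$\Pi_{a\in A} B_a \;\equiv\; \{\, f \in A \to \bigcup_{a\in A} B_a \mid \forall a\in A.\ f(a)\in B_a \,\}$$

$$\Sigma_{a\in A} B_a \;\equiv\; \{\, (a,b) \in A \times \bigcup_{a\in A} B_a \mid a\in A \wedge b\in B_a \,\}$$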

Finally we write for the set-theoretical function construction and, of course, for set-theoretical function application.

### Construction

Over the ECC fragment of the type theory, the interpretation is constructed straightforwardly. The fact that non-computational terms are syntactically tagged makes the definition easier. We define

###### Definition 0.25.

For any mapping from variables to , we define a mapping associating a set to a term and a context . This function is defined by induction over the size of by the equations of figure 2; we can restrict ourselves to the case where for some .