Morita Equivalence

The authors can be reached at firstname.lastname@example.org and email@example.com. We would like to thank Dimitris Tsementzis, Jim Weatherall, and JB Manchak for helpful comments and discussion.
Logicians and philosophers of science have proposed various formal criteria for theoretical equivalence. In this paper, we examine two such proposals: definitional equivalence and categorical equivalence. In order to show precisely how these two well-known criteria are related to one another, we investigate an intermediate criterion called Morita equivalence.
Many theories admit different formulations, and these formulations often bear interesting relationships to one another. One relationship that has received significant attention from logicians and philosophers of science is theoretical equivalence.[1] In this paper we will examine two formal criteria for theoretical equivalence. The first criterion, called definitional equivalence, has been known to logicians since the middle of the twentieth century.[2] It was introduced into philosophy of science by Glymour (1970, 1977, 1980). The second criterion is called categorical equivalence. It was first described by Eilenberg and Mac Lane (1942, 1945), but was only recently introduced into philosophy of science by Halvorson (2012, 2015) and Weatherall (2015a).

[1] See Quine (1975), Sklar (1982), Halvorson (2012, 2013, 2015), Glymour (2013), van Fraassen (2014), and Coffey (2014) for discussion of theoretical equivalence in philosophy of science.

[2] Artigue et al. (1978) and de Bouvére (1965) attribute the concept of definitional equivalence to Montague (1957). Definitional equivalence was certainly familiar to logicians by the late 1960s, as is evident from the work of de Bouvére (1965), Shoenfield (1967), and Kanger (1968).

In order to illustrate the relationship between these two criteria, we will consider a third criterion for theoretical equivalence called Morita equivalence. We will show that these three criteria form the following hierarchy, where the arrows in the figure mean “implies.”

[Figure: Definitional equivalence $\Rightarrow$ Morita equivalence $\Rightarrow$ Categorical equivalence]
Our discussion will allow us to evaluate definitional equivalence against categorical equivalence. Indeed, it will demonstrate a precise sense in which definitional equivalence is too strict a criterion for theoretical equivalence, while categorical equivalence is too liberal. There are theories that are not definitionally equivalent that one nonetheless has good reason to consider equivalent. And on the other hand, there are theories that are categorically equivalent that one has good reason to consider inequivalent.
2 Many-sorted logic
All of these criteria for theoretical equivalence are most naturally understood in the framework of first-order many-sorted logic. We begin with some preliminaries about this framework.[3]

[3] Our notation follows Hodges (2008). We present the more general case of many-sorted logic, however, while Hodges only presents single-sorted logic.

A signature $\Sigma$ is a set of sort symbols, predicate symbols, function symbols, and constant symbols. $\Sigma$ must have at least one sort symbol. Each predicate symbol $p \in \Sigma$ has an arity $\sigma_1 \times \ldots \times \sigma_n$, where $\sigma_1, \ldots, \sigma_n \in \Sigma$ are (not necessarily distinct) sort symbols. Likewise, each function symbol $f \in \Sigma$ has an arity $\sigma_1 \times \ldots \times \sigma_n \rightarrow \sigma$, where $\sigma_1, \ldots, \sigma_n, \sigma \in \Sigma$ are again (not necessarily distinct) sort symbols. Lastly, each constant symbol $c \in \Sigma$ is assigned a sort $\sigma \in \Sigma$. In addition to the elements of $\Sigma$ we also have a stock of variables. We use the letters $x$, $y$, and $z$ to denote these variables, adding subscripts when necessary. Each variable has a sort $\sigma \in \Sigma$.
A $\Sigma$-term can be thought of as a “naming expression” in the signature $\Sigma$. Each $\Sigma$-term has a sort $\sigma \in \Sigma$. The $\Sigma$-terms of sort $\sigma$ are recursively defined as follows. Every variable of sort $\sigma$ is a $\Sigma$-term of sort $\sigma$, and every constant symbol $c \in \Sigma$ of sort $\sigma$ is also a $\Sigma$-term of sort $\sigma$. Furthermore, if $f \in \Sigma$ is a function symbol with arity $\sigma_1 \times \ldots \times \sigma_n \rightarrow \sigma$ and $t_1, \ldots, t_n$ are $\Sigma$-terms of sorts $\sigma_1, \ldots, \sigma_n$, then $f(t_1, \ldots, t_n)$ is a $\Sigma$-term of sort $\sigma$. We will use the notation $t(x_1, \ldots, x_n)$ to denote a $\Sigma$-term in which all of the variables that appear in $t$ are in the sequence $x_1, \ldots, x_n$, but we leave open the possibility that some of the $x_i$ do not appear in the term $t$.
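A $\Sigma$-atom is an expression either of the form $s(x_1, \ldots, x_n) = t(x_1, \ldots, x_n)$, where $s$ and $t$ are $\Sigma$-terms of the same sort $\sigma \in \Sigma$, or of the form $p(t_1, \ldots, t_n)$, where $t_1, \ldots, t_n$ are $\Sigma$-terms of sorts $\sigma_1, \ldots, \sigma_n$ and $p \in \Sigma$ is a predicate of arity $\sigma_1 \times \ldots \times \sigma_n$. The $\Sigma$-formulas are then defined recursively as follows.
Every $\Sigma$-atom is a $\Sigma$-formula.
If $\varphi$ is a $\Sigma$-formula, then $\neg\varphi$ is a $\Sigma$-formula.
If $\varphi$ and $\psi$ are $\Sigma$-formulas, then $\varphi \rightarrow \psi$, $\varphi \wedge \psi$, $\varphi \vee \psi$, and $\varphi \leftrightarrow \psi$ are $\Sigma$-formulas.
If $\varphi$ is a $\Sigma$-formula and $x$ is a variable of sort $\sigma \in \Sigma$, then $\forall_{\sigma} x\, \varphi$ and $\exists_{\sigma} x\, \varphi$ are $\Sigma$-formulas.
In addition to the above formulas, we will use the notation $\exists_{\sigma=1} y\, \varphi(x_1, \ldots, x_n, y)$ to abbreviate the formula $\exists_{\sigma} y\, (\varphi(x_1, \ldots, x_n, y) \wedge \forall_{\sigma} z\, (\varphi(x_1, \ldots, x_n, z) \rightarrow y = z))$. As above, the notation $\varphi(x_1, \ldots, x_n)$ will denote a $\Sigma$-formula $\varphi$ in which all of the free variables appearing in $\varphi$ are in the sequence $x_1, \ldots, x_n$, but we again leave open the possibility that some of the $x_i$ do not appear as free variables in $\varphi$. A $\Sigma$-sentence is a $\Sigma$-formula that has no free variables.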
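A $\Sigma$-structure $A$ is an “interpretation” of the symbols in $\Sigma$. In particular, $A$ satisfies the following conditions.
Every sort symbol $\sigma \in \Sigma$ is assigned a nonempty set $A_{\sigma}$. The sets $A_{\sigma}$ are required to be pairwise disjoint.
Every predicate symbol $p \in \Sigma$ of arity $\sigma_1 \times \ldots \times \sigma_n$ is interpreted as a subset $p^A \subseteq A_{\sigma_1} \times \ldots \times A_{\sigma_n}$.
Every function symbol $f \in \Sigma$ of arity $\sigma_1 \times \ldots \times \sigma_n \rightarrow \sigma$ is interpreted as a function $f^A : A_{\sigma_1} \times \ldots \times A_{\sigma_n} \rightarrow A_{\sigma}$.
Every constant symbol $c \in \Sigma$ of sort $\sigma \in \Sigma$ is interpreted as an element $c^A \in A_{\sigma}$.
Given a $\Sigma$-structure $A$, we will often refer to an element $a \in A_{\sigma}$ as “an element of sort $\sigma$.”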
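Let $A$ be a $\Sigma$-structure with $a_1, \ldots, a_n$ elements of sorts $\sigma_1, \ldots, \sigma_n$. We let $t(x_1, \ldots, x_n)$ be a $\Sigma$-term of sort $\sigma$, with $x_i$ variables of sorts $\sigma_i$, and we recursively define the element $t^A[a_1, \ldots, a_n] \in A_{\sigma}$. If $t$ is the variable $x_i$, then $t^A[a_1, \ldots, a_n] = a_i$, and if $t$ is the constant symbol $c \in \Sigma$, then $t^A[a_1, \ldots, a_n] = c^A$. Furthermore, if $t$ is of the form $f(t_1, \ldots, t_m)$ where each $t_i$ is a $\Sigma$-term of sort $\tau_i$ and $f \in \Sigma$ is a function symbol of arity $\tau_1 \times \ldots \times \tau_m \rightarrow \sigma$, then
$$t^A[a_1, \ldots, a_n] = f^A\big(t_1^A[a_1, \ldots, a_n], \ldots, t_m^A[a_1, \ldots, a_n]\big).$$
One can think of the element $t^A[a_1, \ldots, a_n]$ as the element of the $\Sigma$-structure $A$ that is denoted by the $\Sigma$-term $t(x_1, \ldots, x_n)$ when $a_1, \ldots, a_n$ are substituted for the variables $x_1, \ldots, x_n$.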
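Our next aim is to define when a sequence of elements $a_1, \ldots, a_n$ satisfies a $\Sigma$-formula $\varphi(x_1, \ldots, x_n)$ in the $\Sigma$-structure $A$. When this is the case we write $A \vDash \varphi[a_1, \ldots, a_n]$. We begin by considering $\Sigma$-atoms. Let $\varphi(x_1, \ldots, x_n)$ be a $\Sigma$-atom with $x_i$ variables of sorts $\sigma_i$ and let $a_1, \ldots, a_n$ be elements of sorts $\sigma_1, \ldots, \sigma_n$. There are two cases to consider. First, if $\varphi(x_1, \ldots, x_n)$ is the formula $s(x_1, \ldots, x_n) = t(x_1, \ldots, x_n)$, where $s$ and $t$ are $\Sigma$-terms of sort $\sigma$, then $A \vDash \varphi[a_1, \ldots, a_n]$ if and only if
$$s^A[a_1, \ldots, a_n] = t^A[a_1, \ldots, a_n].$$
Second, if $\varphi(x_1, \ldots, x_n)$ is the formula $p(t_1, \ldots, t_m)$, where each $t_i$ is a $\Sigma$-term of sort $\tau_i$ and $p \in \Sigma$ is a predicate symbol of arity $\tau_1 \times \ldots \times \tau_m$, then $A \vDash \varphi[a_1, \ldots, a_n]$ if and only if
$$\big(t_1^A[a_1, \ldots, a_n], \ldots, t_m^A[a_1, \ldots, a_n]\big) \in p^A.$$
This definition is extended to all $\Sigma$-formulas in the following standard way.
$A \vDash \neg\varphi[a_1, \ldots, a_n]$ if and only if it is not the case that $A \vDash \varphi[a_1, \ldots, a_n]$.
$A \vDash (\varphi \wedge \psi)[a_1, \ldots, a_n]$ if and only if $A \vDash \varphi[a_1, \ldots, a_n]$ and $A \vDash \psi[a_1, \ldots, a_n]$. The cases of $\vee$, $\rightarrow$, and $\leftrightarrow$ are defined analogously.
Suppose that $\varphi(x_1, \ldots, x_n)$ is $\forall_{\sigma} x\, \psi(x_1, \ldots, x_n, x)$, where $\sigma \in \Sigma$ is a sort symbol. Then $A \vDash \varphi[a_1, \ldots, a_n]$ if and only if $A \vDash \psi[a_1, \ldots, a_n, a]$ for every element $a \in A_{\sigma}$. The case of $\exists_{\sigma}$ is defined analogously.
If $\varphi$ is a $\Sigma$-sentence, then $A \vDash \varphi$ just in case $A \vDash \varphi[\,]$, i.e. the empty sequence satisfies $\varphi$ in $A$.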
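For finite structures, the recursive clauses for term denotation and satisfaction can be run directly. The following is a minimal sketch, not part of the original presentation: the tuple encoding of terms and formulas, the dictionary layout of a structure, and the names `eval_term`, `satisfies`, and `geometry` are all assumptions made for illustration.

```python
# Terms and formulas are nested tuples; a structure is a dict with the keys
# "sorts", "preds", "funcs", and "consts". All names here are illustrative.

def eval_term(term, structure, env):
    """Return the element denoted by `term` under the variable assignment `env`."""
    kind = term[0]
    if kind == "var":                      # ("var", name)
        return env[term[1]]
    if kind == "const":                    # ("const", name)
        return structure["consts"][term[1]]
    if kind == "app":                      # ("app", f, [t_1, ..., t_m])
        args = tuple(eval_term(t, structure, env) for t in term[2])
        return structure["funcs"][term[1]][args]
    raise ValueError(f"unknown term kind: {kind}")


def satisfies(formula, structure, env):
    """Decide whether the assignment `env` satisfies `formula` in `structure`."""
    kind = formula[0]
    if kind == "eq":                       # ("eq", s, t)
        return eval_term(formula[1], structure, env) == eval_term(formula[2], structure, env)
    if kind == "pred":                     # ("pred", p, [t_1, ..., t_m])
        args = tuple(eval_term(t, structure, env) for t in formula[2])
        return args in structure["preds"][formula[1]]
    if kind == "not":
        return not satisfies(formula[1], structure, env)
    if kind in ("and", "or", "implies", "iff"):
        left = satisfies(formula[1], structure, env)
        right = satisfies(formula[2], structure, env)
        return {"and": left and right, "or": left or right,
                "implies": (not left) or right, "iff": left == right}[kind]
    if kind in ("forall", "exists"):       # (quantifier, sort, variable, body)
        _, sort, var, body = formula
        results = (satisfies(body, structure, {**env, var: a})
                   for a in structure["sorts"][sort])
        return all(results) if kind == "forall" else any(results)
    raise ValueError(f"unknown formula kind: {kind}")


# A hypothetical two-sorted structure with disjoint sorts "point" and "line".
geometry = {
    "sorts": {"point": {"p0", "p1"}, "line": {"l0"}},
    "preds": {"on": {("p0", "l0")}},            # arity point x line
    "funcs": {"base": {("l0",): "p0"}},         # arity line -> point
    "consts": {"origin": "p0"},                 # sort point
}
x, y = ("var", "x"), ("var", "y")
every_point_on_a_line = ("forall", "point", "x",
                         ("exists", "line", "y", ("pred", "on", [x, y])))
base_is_origin = ("forall", "line", "y",
                  ("eq", ("app", "base", [y]), ("const", "origin")))

print(satisfies(every_point_on_a_line, geometry, {}))  # False: p1 lies on no line
print(satisfies(base_is_origin, geometry, {}))         # True
```

Quantifiers range only over the set assigned to their sort, which is the one place where the many-sorted clauses differ from the single-sorted ones.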
2.3 Relationships between structures
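There are different relationships that $\Sigma$-structures can bear to one another. An isomorphism $h : A \rightarrow B$ between $\Sigma$-structures $A$ and $B$ is a family of bijections $h_{\sigma} : A_{\sigma} \rightarrow B_{\sigma}$ for each sort symbol $\sigma \in \Sigma$ that satisfies the following conditions.
For every predicate symbol $p \in \Sigma$ of arity $\sigma_1 \times \ldots \times \sigma_n$ and all elements $a_1, \ldots, a_n$ of sorts $\sigma_1, \ldots, \sigma_n$, $(a_1, \ldots, a_n) \in p^A$ if and only if $(h_{\sigma_1}(a_1), \ldots, h_{\sigma_n}(a_n)) \in p^B$.
For every function symbol $f \in \Sigma$ of arity $\sigma_1 \times \ldots \times \sigma_n \rightarrow \sigma$ and all elements $a_1, \ldots, a_n$ of sorts $\sigma_1, \ldots, \sigma_n$,
$$h_{\sigma}\big(f^A(a_1, \ldots, a_n)\big) = f^B\big(h_{\sigma_1}(a_1), \ldots, h_{\sigma_n}(a_n)\big).$$
For every constant symbol $c \in \Sigma$ of sort $\sigma$, $h_{\sigma}(c^A) = c^B$.
When there is an isomorphism $h : A \rightarrow B$ one says that $A$ and $B$ are isomorphic and writes $A \cong B$.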
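There is another important relationship that $\Sigma$-structures can bear to one another. An elementary embedding $h : A \rightarrow B$ between $\Sigma$-structures $A$ and $B$ is a family of maps $h_{\sigma} : A_{\sigma} \rightarrow B_{\sigma}$ for each sort symbol $\sigma \in \Sigma$ that satisfies
$$A \vDash \varphi[a_1, \ldots, a_n] \quad \text{if and only if} \quad B \vDash \varphi[h_{\sigma_1}(a_1), \ldots, h_{\sigma_n}(a_n)]$$
for all $\Sigma$-formulas $\varphi(x_1, \ldots, x_n)$ and elements $a_1, \ldots, a_n$ of sorts $\sigma_1, \ldots, \sigma_n$. Given an isomorphism or elementary embedding $h$, we will often use the notation $h(a_1, \ldots, a_n)$ to denote the sequence of elements $h_{\sigma_1}(a_1), \ldots, h_{\sigma_n}(a_n)$. Every isomorphism is an elementary embedding, but in general the converse does not hold.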
There is an important relationship that can hold between structures of different signatures. Let $\Sigma \subset \Sigma^+$ be signatures and suppose that $A$ is a $\Sigma^+$-structure. One obtains a $\Sigma$-structure $A|_{\Sigma}$ by “forgetting” the interpretations of symbols in $\Sigma^+ - \Sigma$. We call $A|_{\Sigma}$ the reduct of $A$ to the signature $\Sigma$, and we call $A$ an expansion of $A|_{\Sigma}$ to the signature $\Sigma^+$. Note that in general a $\Sigma$-structure will have more than one expansion to the signature $\Sigma^+$.
We can now discuss first-order theories in many-sorted logic. A $\Sigma$-theory $T$ is a set of $\Sigma$-sentences. The sentences $\varphi \in T$ are called the axioms of $T$. A $\Sigma$-structure $M$ is a model of a $\Sigma$-theory $T$ if $M \vDash \varphi$ for all $\varphi \in T$. A theory $T$ entails a sentence $\varphi$, written $T \vDash \varphi$, if $M \vDash \varphi$ for every model $M$ of $T$.
We begin our discussion of theoretical equivalence with the following preliminary criterion.
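Definition. Theories $T_1$ and $T_2$ are logically equivalent if they have the same class of models.
One can easily verify that $T_1$ and $T_2$ are logically equivalent if and only if $\{\varphi : T_1 \vDash \varphi\} = \{\varphi : T_2 \vDash \varphi\}$.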
3 Definitional equivalence
Logical equivalence is a particularly strict criterion for theoretical equivalence. Indeed, theories can only be logically equivalent if they are formulated in the same signature. There are many cases, however, of theories in different signatures that are nonetheless intuitively equivalent. For example, the theory of groups can be formulated in a signature with a binary operation $\cdot$ and a constant symbol $e$, or it can be formulated in a signature with a binary operation $\cdot$ and a unary function ${}^{-1}$ (Barrett and Halvorson, 2015). Similarly, the theory of linear orders can be formulated in a signature with the binary relation $<$, or it can be formulated in a signature with the binary relation $\leq$. Since logical equivalence does not capture any sense in which these theories are equivalent, logicians and philosophers of science have proposed more general criteria for theoretical equivalence.
One such criterion is definitional equivalence. This criterion is well known among logicians, and many results about it have been proven.[4] The basic idea behind definitional equivalence is simple. Theories $T_1$ and $T_2$ are definitionally equivalent if $T_1$ can define all of the symbols that $T_2$ uses, and in a compatible way, $T_2$ can define all of the symbols that $T_1$ uses. In order to state this criterion precisely, we need to do some work.

[4] For example, see de Bouvére (1965), Kanger (1968), Pinter (1978), Pelletier and Urquhart (2003), Andréka et al. (2005), Friedman and Visser (2014), and Barrett and Halvorson (2015).
3.1 Definitional extensions
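We first need to formalize the concept of a definition. Let $\Sigma \subset \Sigma^+$ be signatures and let $p \in \Sigma^+ - \Sigma$ be a predicate symbol of arity $\sigma_1 \times \ldots \times \sigma_n$. An explicit definition of $p$ in terms of $\Sigma$ is a $\Sigma^+$-sentence of the form
$$\forall_{\sigma_1} x_1 \ldots \forall_{\sigma_n} x_n \big(p(x_1, \ldots, x_n) \leftrightarrow \varphi(x_1, \ldots, x_n)\big)$$
where $\varphi(x_1, \ldots, x_n)$ is a $\Sigma$-formula. Note that an explicit definition of $p$ in terms of $\Sigma$ can only exist if $\sigma_1, \ldots, \sigma_n \in \Sigma$. An explicit definition of a function symbol $f \in \Sigma^+ - \Sigma$ of arity $\sigma_1 \times \ldots \times \sigma_n \rightarrow \sigma$ is a $\Sigma^+$-sentence of the form
$$\forall_{\sigma_1} x_1 \ldots \forall_{\sigma_n} x_n \forall_{\sigma} y \big(f(x_1, \ldots, x_n) = y \leftrightarrow \varphi(x_1, \ldots, x_n, y)\big) \tag{1}$$
and an explicit definition of a constant symbol $c \in \Sigma^+ - \Sigma$ of sort $\sigma$ is a $\Sigma^+$-sentence of the form
$$\forall_{\sigma} x \big(x = c \leftrightarrow \psi(x)\big) \tag{2}$$
where $\varphi(x_1, \ldots, x_n, y)$ and $\psi(x)$ are both $\Sigma$-formulas. Note again that these explicit definitions of $f$ and $c$ can only exist if $\sigma_1, \ldots, \sigma_n, \sigma \in \Sigma$.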
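A definitional extension of a $\Sigma$-theory $T$ to the signature $\Sigma^+$ is a theory
$$T^+ = T \cup \{\delta_s : s \in \Sigma^+ - \Sigma\}$$
that satisfies the following two conditions. First, for each symbol $s \in \Sigma^+ - \Sigma$ the sentence $\delta_s$ is an explicit definition of $s$ in terms of $\Sigma$, and second, if $s$ is a constant symbol or a function symbol and $\alpha_s$ is the admissibility condition for $\delta_s$, then $T \vDash \alpha_s$. Here the admissibility condition is the $\Sigma$-sentence, entailed by the explicit definition, which says that the defining formula picks out exactly one value: $\forall_{\sigma_1} x_1 \ldots \forall_{\sigma_n} x_n \exists_{\sigma=1} y\, \varphi(x_1, \ldots, x_n, y)$ for a function symbol with defining formula $\varphi(x_1, \ldots, x_n, y)$, and $\exists_{\sigma=1} x\, \psi(x)$ for a constant symbol with defining formula $\psi(x)$.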
3.2 Three results
A definitional extension of a theory “says no more” than the original theory. There are a number of ways to make this idea precise. Of particular interest to us will be the following three. The reader is encouraged to consult Hodges (2008, pp. 58–62) for proofs of these results.
The first result captures a sense in which the models of a definitional extension $T^+$ are “determined” by the models of the original theory $T$. In order to specify a model of $T^+$, one needs to interpret all of the symbols in $\Sigma^+$. The interpretation of the symbols in $\Sigma^+ - \Sigma$, however, “comes for free” given an interpretation of the symbols in $\Sigma$.
Theorem 3.1. Let $\Sigma \subset \Sigma^+$ be signatures and $T$ a $\Sigma$-theory. If $T^+$ is a definitional extension of $T$ to $\Sigma^+$, then every model $M$ of $T$ has a unique expansion $M^+$ that is a model of $T^+$.
Theorem 3.1 provides a semantic sense in which a definitional extension $T^+$ “says no more” than the original theory $T$. The models of $T^+$ are completely determined by the models of $T$.
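On a finite model, the unique expansion promised by Theorem 3.1 can be computed by evaluating the defining formula at every tuple. The sketch below is an illustration under stated assumptions: the helper `define_predicate` is hypothetical, and the defining $\Sigma$-formula is represented by a Python callable rather than by syntax. It uses the linear-order example, defining $\leq$ from $<$ and then recovering $<$ from $\leq$.

```python
from itertools import product


def define_predicate(domain, arity, defining_formula):
    """Interpret a new predicate as the set of tuples satisfying its definition.

    `defining_formula` is a callable standing in for the Sigma-formula that
    appears on the right-hand side of the explicit definition.
    """
    return {args for args in product(domain, repeat=arity)
            if defining_formula(*args)}


# A three-element model of the theory of strict linear orders.
domain = {0, 1, 2}
less = {(0, 1), (0, 2), (1, 2)}

# Explicit definition:  x <= y  <->  (x < y or x = y)
leq = define_predicate(domain, 2, lambda x, y: (x, y) in less or x == y)

# Explicit definition in the other direction:  x < y  <->  (x <= y and x != y)
less_again = define_predicate(domain, 2, lambda x, y: (x, y) in leq and x != y)

print(sorted(leq))          # [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
print(less_again == less)   # True
```

Because the interpretation of the new predicate is read off from the defining formula, there is no freedom in choosing it, which is the content of the uniqueness claim in this finite case.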
In order to state the second result, we need to introduce some terminology. Let $\Sigma \subset \Sigma^+$ be signatures. A $\Sigma^+$-theory $T^+$ is an extension of a $\Sigma$-theory $T$ if $T \vDash \varphi$ implies that $T^+ \vDash \varphi$ for every $\Sigma$-sentence $\varphi$. A $\Sigma^+$-theory $T^+$ is a conservative extension of a $\Sigma$-theory $T$ if $T \vDash \varphi$ if and only if $T^+ \vDash \varphi$ for every $\Sigma$-sentence $\varphi$. All conservative extensions are extensions, but in general the converse does not hold. We have the following simple result about definitional extensions.
Theorem 3.2. If $T^+$ is a definitional extension of $T$, then $T^+$ is a conservative extension of $T$.
If $T^+$ is a conservative extension of $T$, then $T^+$ entails precisely the same $\Sigma$-sentences as $T$. Theorem 3.2 therefore shows that a definitional extension $T^+$ “says no more” in the signature $\Sigma$ than the original theory $T$ does.
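The third result shows something stronger. If $T^+$ is a definitional extension of $T$, then every $\Sigma^+$-formula $\varphi$ can be “translated” into an equivalent $\Sigma$-formula $\varphi^*$. The theory $T^+$ might use some new language that $T$ did not use, but everything that $T^+$ says with this new language can be “translated” back into the old language of $T$. This result captures another robust sense in which the theory $T^+$ “says no more” than the theory $T$.
Theorem 3.3. Let $\Sigma \subset \Sigma^+$ be signatures and $T$ a $\Sigma$-theory. If $T^+$ is a definitional extension of $T$ to $\Sigma^+$ then for every $\Sigma^+$-formula $\varphi(x_1, \ldots, x_n)$ there is a $\Sigma$-formula $\varphi^*(x_1, \ldots, x_n)$ such that $T^+ \vDash \forall_{\sigma_1} x_1 \ldots \forall_{\sigma_n} x_n \big(\varphi(x_1, \ldots, x_n) \leftrightarrow \varphi^*(x_1, \ldots, x_n)\big)$.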
These results capture three different senses in which a definitional extension has the same expressive power as the original theory. With this in mind, we have the resources necessary to state definitional equivalence.
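Definition. Let $T_1$ be a $\Sigma_1$-theory and $T_2$ be a $\Sigma_2$-theory. $T_1$ and $T_2$ are definitionally equivalent if there are theories $T_1^+$ and $T_2^+$ that satisfy the following three conditions:
$T_1^+$ is a definitional extension of $T_1$,
$T_2^+$ is a definitional extension of $T_2$,
$T_1^+$ and $T_2^+$ are logically equivalent $\Sigma_1 \cup \Sigma_2$-theories.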
One often says that $T_1$ and $T_2$ are definitionally equivalent if they have a “common definitional extension.” Theorems 3.1, 3.2, and 3.3 demonstrate a robust sense in which theories with a common definitional extension “say the same thing,” even though they might be formulated in different signatures.
One trivially sees that if two theories are logically equivalent, then they are definitionally equivalent. But there are many examples of theories that are definitionally equivalent and not logically equivalent. The theory of groups formulated in the signature $\{\cdot, e\}$ is definitionally equivalent to the theory of groups formulated in the signature $\{\cdot, {}^{-1}\}$. And likewise, the theory of linear orders formulated in the signature $\{<\}$ is definitionally equivalent to the theory of linear orders formulated in the signature $\{\leq\}$. Definitional equivalence is therefore a weaker criterion for theoretical equivalence than logical equivalence. It is capable of capturing a sense in which theories formulated in different signatures might nonetheless be equivalent.
4 Morita equivalence
Definitional equivalence, however, is incapable of capturing any sense in which theories formulated with different sorts might be equivalent. We have provided no way of defining new sort symbols. One can therefore easily verify that if $T_1$ and $T_2$ are definitionally equivalent, then it must be that $\Sigma_1$ and $\Sigma_2$ have the same sort symbols. There are many theories with different sort symbols, however, that one has good reason to consider equivalent.
One particularly famous example of this is Euclidean geometry. It can be formulated with only a sort of “points” (Tarski, 1959), with only a sort of “lines” (Schwabhäuser and Szczerba, 1975), or with both a sort of “points” and a sort of “lines” (Hilbert, 1930).[5] Category theory can also be formulated using different sorts. The standard formulation uses both a sort of “objects” and a sort of “arrows” (Eilenberg and Mac Lane, 1942, 1945). But it is well known that category theory can instead be formulated using only a sort of “arrows” (Mac Lane, 1948).[6] Since these formulations use different sort symbols, definitional equivalence does not capture any sense in which they are equivalent.

[5] Szczerba (1977) and Schwabhäuser et al. (1983, Proposition 4.59, Proposition 4.89) discuss the relationships between these formulations.

[6] Freyd (1964, p. 5) and Mac Lane (1971, p. 9) also describe this alternative formulation.
In addition to these two famous examples, we have the following simple example.
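Example 1. Let $\Sigma_1 = \{\sigma_1, p, q\}$ and $\Sigma_2 = \{\sigma_2, \sigma_3\}$ be signatures with $\sigma_1$, $\sigma_2$, and $\sigma_3$ sort symbols and $p$ and $q$ predicate symbols of arity $\sigma_1$. Consider the $\Sigma_1$-theory
$$T_1 = \big\{\exists_{\sigma_1} x\, p(x),\ \exists_{\sigma_1} x\, q(x),\ \forall_{\sigma_1} x\, (p(x) \leftrightarrow \neg q(x))\big\}$$
and the $\Sigma_2$-theory $T_2 = \emptyset$. Since the signatures $\Sigma_1$ and $\Sigma_2$ have different sort symbols, $T_1$ and $T_2$ are not definitionally equivalent.
Even though $T_1$ and $T_2$ are not definitionally equivalent, one still has good reason to consider them equivalent. The theory $T_1$ partitions everything into the things that are $p$ and the things that are $q$. Similarly, the theory $T_2$ partitions everything into the things of sort $\sigma_2$ and the things of sort $\sigma_3$. Both $T_1$ and $T_2$ say “there are two kinds of things.” The only difference between them is that $T_1$ uses predicates to say this, while $T_2$ uses sorts.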
These examples all show that definitional equivalence does not capture the sense in which some theories are equivalent. If one wants to capture this sense, one needs a more general criterion for theoretical equivalence than definitional equivalence. Our aim here is to introduce one such criterion. We will call it Morita equivalence.[7] This criterion is a natural generalization of definitional equivalence. In fact, Morita equivalence is essentially the same as definitional equivalence, except that it allows one to define new sort symbols in addition to new predicate symbols, function symbols, and constant symbols. In order to state the criterion precisely, we again need to do some work. We begin by defining the concept of a Morita extension. We then make precise the sense in which Morita equivalence is a natural generalization of definitional equivalence by proving analogues of Theorems 3.1, 3.2 and 3.3.

[7] This criterion is already familiar in certain circles of logicians. See Andréka et al. (2008). The name “Morita equivalence” descends from Kiiti Morita’s work on rings with equivalent categories of modules. Two rings $R$ and $S$ are called Morita equivalent just in case there is an equivalence between their categories of modules. The notion was generalized from rings to algebraic theories by Dukarm (1988). See also Adámek et al. (2006). More recently, topos theorists have defined theories to be Morita equivalent just in case their classifying toposes are equivalent (Johnstone, 2003). See Tsementzis (2015) for a comparison of the topos-theoretic notion of Morita equivalence with ours.
4.1 Morita extensions
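As we did for predicates, functions, and constants, we need to say how to define new sorts. Let $\Sigma \subset \Sigma^+$ be signatures and consider a sort symbol $\sigma \in \Sigma^+ - \Sigma$. One can define the sort $\sigma$ as a product sort, a coproduct sort, a subsort, or a quotient sort. In each case, one defines $\sigma$ using old sorts in $\Sigma$ and new function symbols in $\Sigma^+ - \Sigma$. These new function symbols specify how the new sort $\sigma$ is related to the old sorts in $\Sigma$. We describe these four cases in detail.
In order to define $\sigma$ as a product sort, one needs two function symbols $\pi_1, \pi_2 \in \Sigma^+ - \Sigma$ with $\pi_1$ of arity $\sigma \rightarrow \sigma_1$, $\pi_2$ of arity $\sigma \rightarrow \sigma_2$, and $\sigma_1, \sigma_2 \in \Sigma$. The function symbols $\pi_1$ and $\pi_2$ serve as the “canonical projections” associated with the product sort $\sigma$. An explicit definition of the symbols $\sigma$, $\pi_1$, and $\pi_2$ as a product sort in terms of $\Sigma$ is a $\Sigma^+$-sentence of the form
$$\forall_{\sigma_1} x\, \forall_{\sigma_2} y\, \exists_{\sigma=1} z\, \big(\pi_1(z) = x \wedge \pi_2(z) = y\big).$$
One should think of a product sort $\sigma$ as the sort whose elements are ordered pairs, where the first element of each pair is of sort $\sigma_1$ and the second is of sort $\sigma_2$.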
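One can also define $\sigma$ as a coproduct sort. One again needs two function symbols $\rho_1, \rho_2 \in \Sigma^+ - \Sigma$ with $\rho_1$ of arity $\sigma_1 \rightarrow \sigma$, $\rho_2$ of arity $\sigma_2 \rightarrow \sigma$, and $\sigma_1, \sigma_2 \in \Sigma$. The function symbols $\rho_1$ and $\rho_2$ are the “canonical injections” associated with the coproduct sort $\sigma$. An explicit definition of the symbols $\sigma$, $\rho_1$, and $\rho_2$ as a coproduct sort in terms of $\Sigma$ is a $\Sigma^+$-sentence of the form
$$\forall_{\sigma} z\, \big(\exists_{\sigma_1=1} x\, (\rho_1(x) = z) \vee \exists_{\sigma_2=1} y\, (\rho_2(y) = z)\big) \wedge \forall_{\sigma_1} x\, \forall_{\sigma_2} y\, \neg\big(\rho_1(x) = \rho_2(y)\big).$$
One should think of a coproduct sort $\sigma$ as the disjoint union of the elements of sorts $\sigma_1$ and $\sigma_2$.
When defining a new sort $\sigma$ as a product sort or a coproduct sort, one uses two sort symbols in $\Sigma$ and two function symbols in $\Sigma^+ - \Sigma$. The next two ways of defining a new sort only require one sort symbol in $\Sigma$ and one function symbol in $\Sigma^+ - \Sigma$.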
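In order to define $\sigma$ as a subsort, one needs a function symbol $i \in \Sigma^+ - \Sigma$ of arity $\sigma \rightarrow \sigma_1$ with $\sigma_1 \in \Sigma$. The function symbol $i$ is the “canonical inclusion” associated with the subsort $\sigma$. An explicit definition of the symbols $\sigma$ and $i$ as a subsort in terms of $\Sigma$ is a $\Sigma^+$-sentence of the form
$$\forall_{\sigma_1} x\, \big(\varphi(x) \leftrightarrow \exists_{\sigma} z\, (i(z) = x)\big) \wedge \forall_{\sigma} z_1\, \forall_{\sigma} z_2\, \big(i(z_1) = i(z_2) \rightarrow z_1 = z_2\big) \tag{3}$$
where $\varphi(x)$ is a $\Sigma$-formula. One can think of the subsort $\sigma$ as consisting of “the elements of sort $\sigma_1$ that are $\varphi$.” The sentence (3) entails the $\Sigma$-sentence $\exists_{\sigma_1} x\, \varphi(x)$. As before, we will call this $\Sigma$-sentence the admissibility condition for the definition (3).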
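Lastly, in order to define $\sigma$ as a quotient sort one needs a function symbol $\epsilon \in \Sigma^+ - \Sigma$ of arity $\sigma_1 \rightarrow \sigma$ with $\sigma_1 \in \Sigma$. An explicit definition of the symbols $\sigma$ and $\epsilon$ as a quotient sort in terms of $\Sigma$ is a $\Sigma^+$-sentence of the form
$$\forall_{\sigma_1} x_1\, \forall_{\sigma_1} x_2\, \big(\epsilon(x_1) = \epsilon(x_2) \leftrightarrow \varphi(x_1, x_2)\big) \wedge \forall_{\sigma} z\, \exists_{\sigma_1} x\, \big(\epsilon(x) = z\big) \tag{4}$$
where $\varphi(x_1, x_2)$ is a $\Sigma$-formula. This sentence defines $\sigma$ as a quotient sort that is obtained by “quotienting out” the sort $\sigma_1$ with respect to the formula $\varphi(x_1, x_2)$. The sort $\sigma$ should be thought of as the set of “equivalence classes of elements of $\sigma_1$ with respect to the relation $\varphi(x_1, x_2)$.” The function symbol $\epsilon$ is the “canonical projection” that maps an element to its equivalence class. One can verify that the sentence (4) implies that $\varphi(x_1, x_2)$ is an equivalence relation. In particular, it entails the following $\Sigma$-sentences:
$$\forall_{\sigma_1} x\, \varphi(x, x)$$
$$\forall_{\sigma_1} x_1\, \forall_{\sigma_1} x_2\, \big(\varphi(x_1, x_2) \rightarrow \varphi(x_2, x_1)\big)$$
$$\forall_{\sigma_1} x_1\, \forall_{\sigma_1} x_2\, \forall_{\sigma_1} x_3\, \big((\varphi(x_1, x_2) \wedge \varphi(x_2, x_3)) \rightarrow \varphi(x_1, x_3)\big)$$
These $\Sigma$-sentences are the admissibility conditions for the definition (4).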
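The quotient-sort construction can be made concrete on a finite carrier. The following sketch is illustrative only: the function names `admissibility_holds` and `quotient_sort` are hypothetical, and the defining formula is again represented by a Python callable. It first checks the three admissibility conditions and then interprets the new sort as the set of equivalence classes, with the canonical projection sending each element to its class.

```python
def admissibility_holds(carrier, related):
    """Check reflexivity, symmetry, and transitivity of `related` on `carrier`."""
    elems = list(carrier)
    reflexive = all(related(a, a) for a in elems)
    symmetric = all(related(b, a)
                    for a in elems for b in elems if related(a, b))
    transitive = all(related(a, c)
                     for a in elems for b in elems for c in elems
                     if related(a, b) and related(b, c))
    return reflexive and symmetric and transitive


def quotient_sort(carrier, related):
    """Return (classes, epsilon): the new sort and its canonical projection."""
    elems = list(carrier)
    if not admissibility_holds(elems, related):
        raise ValueError("the defining formula is not an equivalence relation")
    # epsilon maps each element to its equivalence class (a frozenset).
    epsilon = {a: frozenset(b for b in elems if related(a, b)) for a in elems}
    return set(epsilon.values()), epsilon


# Hypothetical example: integers 0..5 identified when congruent modulo 3.
carrier = range(6)
same_residue = lambda a, b: a % 3 == b % 3
classes, epsilon = quotient_sort(carrier, same_residue)

print(len(classes))                # 3
print(epsilon[1] == epsilon[4])    # True
print(epsilon[0] == epsilon[1])    # False
```

By construction, two elements receive the same value under `epsilon` exactly when the defining relation holds between them, and every class is the image of some element, which mirrors the two conjuncts of the definition.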
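Now that we have presented the four ways of defining new sort symbols, we can define the concept of a Morita extension. A Morita extension is a natural generalization of a definitional extension. The only difference is that now one is allowed to define new sort symbols. Let $\Sigma \subset \Sigma^+$ be signatures and $T$ a $\Sigma$-theory. A Morita extension of $T$ to the signature $\Sigma^+$ is a $\Sigma^+$-theory
$$T^+ = T \cup \{\delta_s : s \in \Sigma^+ - \Sigma\}$$
that satisfies the following conditions. First, for each symbol $s \in \Sigma^+ - \Sigma$ the sentence $\delta_s$ is an explicit definition of $s$ in terms of $\Sigma$. Second, if $\sigma \in \Sigma^+ - \Sigma$ is a sort symbol and $f \in \Sigma^+ - \Sigma$ is a function symbol that is used in the explicit definition of $\sigma$, then $\delta_f = \delta_{\sigma}$. (For example, if $\sigma$ is defined as a product sort with projections $\pi_1$ and $\pi_2$, then $\delta_{\sigma} = \delta_{\pi_1} = \delta_{\pi_2}$.) And third, if $\alpha_s$ is an admissibility condition for a definition $\delta_s$, then $T \vDash \alpha_s$.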
Note that unlike a definitional extension of a theory, a Morita extension can have more sort symbols than the original theory.[8] The following is a particularly simple example of a Morita extension.

[8] Also note that if $T^+$ is a Morita extension of $T$ to $\Sigma^+$, then there are restrictions on the arities of predicates, functions, and constants in $\Sigma^+ - \Sigma$. If $p \in \Sigma^+ - \Sigma$ is a predicate symbol of arity $\sigma_1 \times \ldots \times \sigma_n$, we immediately see that $\sigma_1, \ldots, \sigma_n \in \Sigma$. Taking a single Morita extension does not allow one to define predicate symbols that apply to sorts that are not in $\Sigma$. One must take multiple Morita extensions to do this. Likewise, any constant symbol $c \in \Sigma^+ - \Sigma$ must be of sort $\sigma \in \Sigma$. And a function symbol $f \in \Sigma^+ - \Sigma$ must either have arity $\sigma_1 \times \ldots \times \sigma_n \rightarrow \sigma$ with $\sigma_1, \ldots, \sigma_n, \sigma \in \Sigma$, or $f$ must be one of the function symbols that appears in the definition of a new sort symbol $\sigma \in \Sigma^+ - \Sigma$.
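Example 2. Let $\Sigma = \{\sigma, p\}$ and $\Sigma^+ = \{\sigma, \sigma^+, p, i\}$ be signatures with $\sigma$ and $\sigma^+$ sort symbols, $p$ a predicate symbol of arity $\sigma$, and $i$ a function symbol of arity $\sigma^+ \rightarrow \sigma$. Consider the $\Sigma$-theory $T = \{\exists_{\sigma} x\, p(x)\}$. The following $\Sigma^+$-sentence defines the sort symbol $\sigma^+$ as the subsort consisting of “the elements that are $p$.”
$$\forall_{\sigma} x\, \big(p(x) \leftrightarrow \exists_{\sigma^+} z\, (i(z) = x)\big) \wedge \forall_{\sigma^+} z_1\, \forall_{\sigma^+} z_2\, \big(i(z_1) = i(z_2) \rightarrow z_1 = z_2\big) \tag{$\delta_{\sigma^+}$}$$
The $\Sigma^+$-theory $T^+ = T \cup \{\delta_{\sigma^+}\}$ is a Morita extension of $T$ to the signature $\Sigma^+$. The theory $T^+$ adds to the theory $T$ the ability to quantify over the set of “things that are $p$.”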
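A finite model makes the subsort example tangible. The sketch below is an illustration under stated assumptions, with hypothetical names throughout: it takes a model of the base theory in which some element satisfies the predicate, builds the new sort together with its inclusion, and checks both conjuncts of the subsort definition. Elements of the new sort are tagged copies of old elements; this tagging is a design choice made here so that the sets assigned to distinct sorts stay pairwise disjoint, as the definition of a structure requires.

```python
def expand_with_subsort(old_sort, satisfies_phi, tag="sub"):
    """Return (new_sort, inclusion) for the subsort of elements satisfying phi."""
    if not any(satisfies_phi(a) for a in old_sort):
        # The admissibility condition (some element satisfies phi) fails.
        raise ValueError("admissibility condition fails")
    # Tagged copies keep the new sort disjoint from the old one.
    new_sort = {(tag, a) for a in old_sort if satisfies_phi(a)}
    inclusion = {z: z[1] for z in new_sort}
    return new_sort, inclusion


def satisfies_subsort_definition(old_sort, satisfies_phi, new_sort, inclusion):
    """Check the two conjuncts of the explicit subsort definition."""
    image = set(inclusion.values())
    # phi(x) holds exactly when x is in the image of the inclusion.
    hits_phi = all(satisfies_phi(a) == (a in image) for a in old_sort)
    # The inclusion is injective when no two new elements share a value.
    injective = len(image) == len(new_sort)
    return hits_phi and injective


# A hypothetical model of the base theory: three elements, two of which are p.
M_sigma = {"a", "b", "c"}
p_M = {"a", "b"}
is_p = lambda a: a in p_M

M_sigma_plus, i_M = expand_with_subsort(M_sigma, is_p)

print(sorted(M_sigma_plus))   # [('sub', 'a'), ('sub', 'b')]
print(satisfies_subsort_definition(M_sigma, is_p, M_sigma_plus, i_M))  # True
```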
4.2 Three results
As with a definitional extension, a Morita extension “says no more” than the original theory. We will make this idea precise by proving analogues of Theorems 3.1, 3.2, and 3.3. These three results also demonstrate how closely related the concept of a Morita extension is to that of a definitional extension.
Theorem 3.1 generalizes in a perfectly natural way. When $T^+$ is a Morita extension of $T$, the models of $T^+$ are “determined” by the models of $T$.
Theorem 4.1. Let $\Sigma \subset \Sigma^+$ be signatures and $T$ a $\Sigma$-theory. If $T^+$ is a Morita extension of $T$ to $\Sigma^+$, then every model $M$ of $T$ has a unique expansion $M^+$ (up to isomorphism) that is a model of $T^+$.
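Before proving Theorem 4.1, we introduce some notation and prove a lemma. Suppose that a $\Sigma^+$-theory $T^+$ is a Morita extension of a $\Sigma$-theory $T$. Let $M$ and $N$ be models of $T^+$ with $h : M|_{\Sigma} \rightarrow N|_{\Sigma}$ an elementary embedding between the $\Sigma$-structures $M|_{\Sigma}$ and $N|_{\Sigma}$. The elementary embedding $h$ naturally induces a map $h^+ : M \rightarrow N$ between the $\Sigma^+$-structures $M$ and $N$.
We know that $h$ is a family of maps $h_{\sigma} : M_{\sigma} \rightarrow N_{\sigma}$ for each sort $\sigma \in \Sigma$. In order to describe $h^+$ we need to describe the map $h^+_{\sigma} : M_{\sigma} \rightarrow N_{\sigma}$ for each sort $\sigma \in \Sigma^+$. If $\sigma \in \Sigma$, we simply let $h^+_{\sigma} = h_{\sigma}$. On the other hand, when $\sigma \in \Sigma^+ - \Sigma$, there are four cases to consider. We describe $h^+_{\sigma}$ in the cases where the theory $T^+$ defines $\sigma$ as a product sort or a subsort. The coproduct and quotient sort cases are described analogously.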
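First, suppose that $T^+$ defines $\sigma$ as a product sort. Let $\pi_1, \pi_2 \in \Sigma^+$ be the projections of arity $\sigma \rightarrow \sigma_1$ and $\sigma \rightarrow \sigma_2$ with $\sigma_1, \sigma_2 \in \Sigma$. The definition of the function $h^+_{\sigma}$ is suggested by the following diagram.
[Diagram: commutative squares expressing $\pi_1^N \circ h^+_{\sigma} = h_{\sigma_1} \circ \pi_1^M$ and $\pi_2^N \circ h^+_{\sigma} = h_{\sigma_2} \circ \pi_2^M$]
Let $m \in M_{\sigma}$. We define $h^+_{\sigma}(m)$ to be the unique $n \in N_{\sigma}$ that satisfies both $\pi_1^N(n) = h_{\sigma_1} \circ \pi_1^M(m)$ and $\pi_2^N(n) = h_{\sigma_2} \circ \pi_2^M(m)$. We know that such an $n$ exists and is unique because $N$ is a model of $T^+$ and $T^+$ defines the symbols $\sigma$, $\pi_1$, and $\pi_2$ to be a product sort. One can verify that this definition of $h^+_{\sigma}$ makes the above diagram commute.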
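For finite models, the induced map on a product sort can be computed by searching for the unique element with the prescribed projections. The following sketch is only an illustration: the function `induced_map_on_product` and the element names are hypothetical, and maps are represented as dictionaries.

```python
def induced_map_on_product(M_sigma, pi1_M, pi2_M, N_sigma, pi1_N, pi2_N, h1, h2):
    """Extend the maps h1, h2 on the old sorts to the product sort.

    Each m is sent to the unique n whose projections in N are the images,
    under h1 and h2, of the projections of m in M.
    """
    h_plus = {}
    for m in M_sigma:
        target = (h1[pi1_M[m]], h2[pi2_M[m]])
        candidates = [n for n in N_sigma if (pi1_N[n], pi2_N[n]) == target]
        if len(candidates) != 1:
            raise ValueError("N does not interpret the sort as a product")
        h_plus[m] = candidates[0]
    return h_plus


# Hypothetical model M: old sorts {a} and {x, y}; product sort {m0, m1}.
pi1_M = {"m0": "a", "m1": "a"}
pi2_M = {"m0": "x", "m1": "y"}

# Hypothetical model N: old sorts {A} and {X, Y}; product sort {n0, n1}.
pi1_N = {"n0": "A", "n1": "A"}
pi2_N = {"n0": "Y", "n1": "X"}

# A map between the reducts, given sort by sort.
h1 = {"a": "A"}
h2 = {"x": "X", "y": "Y"}

h_plus = induced_map_on_product({"m0", "m1"}, pi1_M, pi2_M,
                                {"n0", "n1"}, pi1_N, pi2_N, h1, h2)
print(h_plus == {"m0": "n1", "m1": "n0"})   # True

# Commutativity of both squares.
print(all(pi1_N[h_plus[m]] == h1[pi1_M[m]] and
          pi2_N[h_plus[m]] == h2[pi2_M[m]] for m in h_plus))   # True
```

The search raises an error when no unique candidate exists, which corresponds to the target structure failing to satisfy the product-sort definition.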
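Suppose, on the other hand, that $T^+$ defines $\sigma$ as the subsort of “elements of sort $\sigma_1$ that are $\varphi$.” Let $i \in \Sigma^+$ be the inclusion map of arity $\sigma \rightarrow \sigma_1$ with $\sigma_1 \in \Sigma$. As above, the definition of $h^+_{\sigma}$ is suggested by the following diagram.
[Diagram: commutative square expressing $i^N \circ h^+_{\sigma} = h_{\sigma_1} \circ i^M$]
Let $m \in M_{\sigma}$. We see that the following implications hold:
$$M \vDash \varphi[i^M(m)] \;\Longrightarrow\; M|_{\Sigma} \vDash \varphi[i^M(m)] \;\Longrightarrow\; N|_{\Sigma} \vDash \varphi[h_{\sigma_1}(i^M(m))] \;\Longrightarrow\; N \vDash \varphi[h_{\sigma_1}(i^M(m))]$$
The first and third implications hold since $\varphi(x)$ is a $\Sigma$-formula, and the second holds because $i^M(m) \in M_{\sigma_1}$ and $h$ is an elementary embedding. $T^+$ defines the symbols $i$ and $\sigma$ as a subsort and $M$ is a model of $T^+$, so it must be that $M \vDash \varphi[i^M(m)]$. By the above implications, we see that $N \vDash \varphi[h_{\sigma_1}(i^M(m))]$. Since $N$ is also a model of $T^+$, there is a unique $n \in N_{\sigma}$ that satisfies $i^N(n) = h_{\sigma_1}(i^M(m))$. We define $h^+_{\sigma}(m) = n$. This definition of $h^+_{\sigma}$ again makes the above diagram commute.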
When $T^+$ defines $\sigma$ as a coproduct sort or a quotient sort one describes the map $h^+_{\sigma}$ analogously. For the purposes of proving Theorem 4.1, we need the following simple lemma about this map $h^+$.
Lemma 4.1. If $h : M|_{\Sigma} \rightarrow N|_{\Sigma}$ is an isomorphism, then $h^+ : M \rightarrow N$ is an isomorphism.
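Proof. We know that $h_{\sigma} : M_{\sigma} \rightarrow N_{\sigma}$ is a bijection for each $\sigma \in \Sigma$. Using this fact and the definition of $h^+$, one can verify that $h^+_{\sigma} : M_{\sigma} \rightarrow N_{\sigma}$ is a bijection for each sort $\sigma \in \Sigma^+$. So $h^+$ is a family of bijections. And furthermore, the commutativity of the above diagrams implies that $h^+$ preserves any function symbols that are used to define new sorts.
It only remains to check that $h^+$ preserves predicates, functions, and constants that have arities and sorts in $\Sigma$. Since $h : M|_{\Sigma} \rightarrow N|_{\Sigma}$ is an isomorphism, we know that $h^+$ preserves the symbols in $\Sigma$. So let $p \in \Sigma^+ - \Sigma$ be a predicate symbol of arity $\sigma_1 \times \ldots \times \sigma_n$ with $\sigma_1, \ldots, \sigma_n \in \Sigma$. There must be a $\Sigma$-formula $\varphi(x_1, \ldots, x_n)$ such that $T^+ \vDash \forall_{\sigma_1} x_1 \ldots \forall_{\sigma_n} x_n \big(p(x_1, \ldots, x_n) \leftrightarrow \varphi(x_1, \ldots, x_n)\big)$. We know that $h$ is an elementary embedding, so in particular it preserves the formula $\varphi(x_1, \ldots, x_n)$. This implies that $(m_1, \ldots, m_n) \in p^M$ if and only if $(h_{\sigma_1}(m_1), \ldots, h_{\sigma_n}(m_n)) \in p^N$. Since $h^+_{\sigma_i} = h_{\sigma_i}$ for each $i = 1, \ldots, n$, it must be that $h^+$ also preserves the predicate $p$. An analogous argument demonstrates that $h^+$ preserves functions and constants. ∎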
We now turn to the proof of Theorem 4.1.
Proof of Theorem 4.1.
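Let $M$ be a model of $T$. First note that if $M^+$ exists, then it is unique up to isomorphism. For if $N$ is a model of $T^+$ with $N|_{\Sigma} = M$, then by letting $h$ be the identity map (which is an isomorphism) Lemma 4.1 implies that $M^+ \cong N$. We need only define the $\Sigma^+$-structure $M^+$. To guarantee that $M^+$ is an expansion of $M$ we interpret every symbol in $\Sigma$ the same way that $M$ does. We need to say how the symbols in $\Sigma^+ - \Sigma$ are interpreted. There are a number of cases to consider.
Suppose that $p \in \Sigma^+ - \Sigma$ is a predicate symbol of arity $\sigma_1 \times \ldots \times \sigma_n$ with $\sigma_1, \ldots, \sigma_n \in \Sigma$. There must be a $\Sigma$-formula $\varphi(x_1, \ldots, x_n)$ such that $\delta_p \in T^+$ is the sentence $\forall_{\sigma_1} x_1 \ldots \forall_{\sigma_n} x_n \big(p(x_1, \ldots, x_n) \leftrightarrow \varphi(x_1, \ldots, x_n)\big)$. We define the interpretation of the symbol $p$ in $M^+$ by letting $(a_1, \ldots, a_n) \in p^{M^+}$ if and only if $M \vDash \varphi[a_1, \ldots, a_n]$. It is easy to see that this definition of $p^{M^+}$ implies that $M^+ \vDash \delta_p$. The cases of function and constant symbols are handled similarly.
Let $\sigma \in \Sigma^+ - \Sigma$ be a sort symbol. We describe the cases where $T^+$ defines $\sigma$ as a product sort or a subsort. The coproduct and quotient sort cases follow analogously. Suppose first that $\sigma$ is defined as a product sort with $\pi_1$ and $\pi_2$ the projections of arity $\sigma \rightarrow \sigma_1$ and $\sigma \rightarrow \sigma_2$, respectively. We define $M^+_{\sigma} = M^+_{\sigma_1} \times M^+_{\sigma_2}$ with $\pi_1^{M^+} : M^+_{\sigma} \rightarrow M^+_{\sigma_1}$ and $\pi_2^{M^+} : M^+_{\sigma} \rightarrow M^+_{\sigma_2}$ the canonical projections. One can easily verify that $M^+ \vDash \delta_{\sigma}$. On the other hand, suppose that $\sigma$ is defined as a subsort with defining $\Sigma$-formula $\varphi(x)$ and inclusion $i$ of arity $\sigma \rightarrow \sigma_1$. We define $M^+_{\sigma} = \{a \in M_{\sigma_1} : M \vDash \varphi[a]\}$ with $i^{M^+} : M^+_{\sigma} \rightarrow M^+_{\sigma_1}$ the inclusion map. One can again verify that $M^+ \vDash \delta_{\sigma}$. ∎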
We have shown that the exact analogue of Theorem 3.1 holds for Morita extensions. Theorem 3.2 also generalizes in a perfectly natural way. Indeed, the generalization follows as a simple corollary to Theorem 4.1.
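Theorem 4.2. If $T^+$ is a Morita extension of $T$, then $T^+$ is a conservative extension of $T$.
Proof. Suppose that $T^+$ is not a conservative extension of $T$. One can easily see that $T \vDash \varphi$ implies that $T^+ \vDash \varphi$ for every $\Sigma$-sentence $\varphi$. So there must be some $\Sigma$-sentence $\varphi$ such that $T^+ \vDash \varphi$, but $T \nvDash \varphi$. This implies that there is a model $M$ of $T$ such that $M \vDash \neg\varphi$. This model has no expansion that is a model of $T^+$ since $T^+ \vDash \varphi$, contradicting Theorem 4.1. ∎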
Theorems 3.1 and 3.2 therefore generalize naturally from definitional extensions to Morita extensions. In order to generalize Theorem 3.3, however, we need to do some work. Theorem 3.3 said that if $T^+$ is a definitional extension of $T$ to $\Sigma^+$, then for every $\Sigma^+$-formula $\varphi$ there is a corresponding $\Sigma$-formula $\varphi^*$ that is equivalent to $\varphi$ according to the theory $T^+$. The following example demonstrates that this result does not generalize to the case of Morita extensions in a perfectly straightforward manner.
Recall the theories $T$ and $T^+$ from Example 2 and consider the $\Sigma^+$-formula