On uniform relationships between combinatorial problems
The enterprise of comparing mathematical theorems according to their logical strength is an active area in mathematical logic, with one of the most common frameworks for doing so being reverse mathematics. In this setting, one investigates which theorems provably imply which others in a weak formal theory roughly corresponding to computable mathematics. Since the proofs of such implications take place in classical logic, they may in principle involve appeals to multiple applications of a particular theorem, or to non-uniform decisions about how to proceed in a given construction. In practice, however, if a theorem $\mathsf{Q}$ implies a theorem $\mathsf{P}$, it is usually because there is a direct uniform translation of the problems represented by $\mathsf{P}$ into the problems represented by $\mathsf{Q}$, in a precise sense formalized by Weihrauch reducibility. We study this notion of uniform reducibility in the context of several natural combinatorial problems, and compare and contrast it with the traditional notion of implication in reverse mathematics. We show, for instance, that for all $n, j, k \geq 1$, if $j < k$ then Ramsey’s theorem for $n$-tuples and $k$ many colors is not uniformly, or Weihrauch, reducible to Ramsey’s theorem for $n$-tuples and $j$ many colors. The two theorems are classically equivalent, so our analysis gives a genuinely finer metric by which to gauge the relative strength of mathematical propositions. We also study Weak König’s Lemma, the Thin Set Theorem, and the Rainbow Ramsey’s Theorem, along with a number of their variants investigated in the literature. Weihrauch reducibility turns out to be connected with sequential forms of mathematical principles, where one wishes to solve infinitely many instances of a particular problem simultaneously. We exploit this connection to uncover new points of difference between combinatorial problems previously thought to be more closely related.
The idea of reducing, or translating, one mathematical problem to another, with the aim of using solutions to the latter to obtain solutions to the former, is a basic and natural one in all areas of mathematics. For instance, the convolution of two functions can be reduced to a pointwise product via the Fourier transform; the study of a linear operator over a complex vector space can be reduced to the study of a matrix in Jordan normal form, via a change of basis; etc. In general, the precise forms of such reductions vary greatly with the particular problems, but they tend to be most useful when they are constructive or uniform in some appropriate sense. Typically, such reductions preserve various fundamental properties and yield more information, and they are usually easier to implement. These ideas have materialized in many areas such as category theory, complexity theory, proof theory, and set theory (see ). In this article, we investigate similar uniform reductions between various combinatorial problems in the setting of computability theory, reverse mathematics and computable analysis.
The program of reverse mathematics provides a unified and elegant way to compare the strengths of many mathematical theorems. Its setting is second-order arithmetic, which is a system strong enough to encompass most of classical mathematics. The formalism permits talking about natural numbers and about sets of natural numbers, and hence readily accommodates countable analogues of mathematical propositions. The fundamental idea is to calibrate the proof-theoretical strength of such propositions by classifying which set-existence axioms are needed to establish the structures needed in their proofs. In practice, we work with fragments, or subsystems, of second-order arithmetic, first finding the weakest one that suffices to prove a given theorem, and then obtaining sharpness by showing that the theorem is in fact equivalent to it. Each of the subsystems corresponds to a natural closure point under logical, and more specifically, computability-theoretic, operations. Thus, the base system, Recursive Comprehension Axiom ($\mathsf{RCA}_0$), roughly corresponds to computable or constructive mathematics; the system Weak König’s Lemma ($\mathsf{WKL}_0$) corresponds to closure under taking infinite paths through infinite binary trees; and the Arithmetical Comprehension Axiom ($\mathsf{ACA}_0$) corresponds to closure under arithmetical definability, or equivalently, under applications of the Turing jump. Other common subsystems, $\mathsf{ATR}_0$ and $\Pi^1_1$-$\mathsf{CA}_0$, which we shall not consider in this article, admit similar characterizations. The point is that there is a rich interaction between proof systems on the one hand, and computability on the other.
We refer the reader to Simpson  for background on reverse mathematics, to Soare  for background on computability theory, and to Weihrauch  for background in computable analysis. For background on algorithmic randomness, to which some of our results in Sections 4 and 6 will pertain, we refer to Downey and Hirschfeldt .
In the context of reverse mathematics, we can say that a theorem $\mathsf{P}$ “reduces” to a theorem $\mathsf{Q}$ if there is a proof of $\mathsf{P}$ assuming $\mathsf{Q}$ over $\mathsf{RCA}_0$. Since these proofs are carried out in a formal system, such a proof of $\mathsf{P}$ from $\mathsf{Q}$ may use $\mathsf{Q}$ several times to obtain $\mathsf{P}$, or may involve non-uniform decisions about which sets to use in a construction. However, in many natural cases, a proof of $\mathsf{P}$ from $\mathsf{Q}$ uses direct, computable, and uniform translations of problems represented by $\mathsf{P}$ into problems represented by $\mathsf{Q}$.
To describe these types of arguments more precisely, we restrict our focus to statements in the language of second-order arithmetic, i.e., statements of the form
where is arithmetical. Each such principle has associated to it a natural class of instances, and for each instance, a natural class of solutions to that instance. The following are a few important examples.
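Statement 1.1 ($\mathsf{WKL}$).
Every infinite subtree of $2^{<\omega}$ has an infinite path.
Statement 1.2 ($\mathsf{WWKL}$).
Every subtree $T$ of $2^{<\omega}$ such that
$$\frac{|\{\sigma \in T : |\sigma| = n\}|}{2^n}$$
is uniformly bounded away from zero for all $n$ has an infinite path.
Statement 1.3 (Ramsey’s Theorem).
Fix $n, k \geq 1$. $\mathsf{RT}^n_k$ is the statement that for every $f : [\omega]^n \to k$, there exists an infinite set $H$ (called homogeneous for $f$) such that $f$ is constant on $[H]^n$.
Statement 1.4 ($\mathsf{COH}$).
For every sequence of sets $\langle A_i : i \in \omega \rangle$, there exists an infinite set $C$ such that for all $i$, either $C \cap A_i$ is finite or $C \cap \overline{A_i}$ is finite.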
The idea of uniform direct translations alluded to above was made precise by Weihrauch [33, 34] in the realm of computable analysis and has been widely studied ever since (see [5, Section 1] for a partial bibliography). In this context, a statement as above is viewed as a function specification; a partial realizer of such a specification is a function such that holds for all in the domain of . For this reason, it is traditional in this context to understand the relation as a partial multi-valued function where holds exactly when is one of the possible values of the function at . Weihrauch then introduced a notion of computable reducibility between partial multi-valued functions whereby there are computable processes that serve to uniformly translate realizers of one partial multi-valued function into realizers of another partial multi-valued function. We shall use here the following equivalent definition, which may appear more familiar from perspectives outside of computable analysis, particularly reverse mathematics. However, with a view towards encouraging more collaboration between these two similarly motivated but thus far largely separate approaches, we include a proof of the equivalence of the definitions in Appendix A.
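Let $\mathsf{P}$ and $\mathsf{Q}$ be statements of second-order arithmetic. We say that
(1) $\mathsf{P}$ is Weihrauch reducible to $\mathsf{Q}$, and write $\mathsf{P} \leq_{\mathrm{W}} \mathsf{Q}$, if there exist Turing reductions $\Phi$ and $\Psi$ such that whenever $A$ is an instance of $\mathsf{P}$ then $B = \Phi(A)$ is an instance of $\mathsf{Q}$, and whenever $T$ is a solution to $B$ then $S = \Psi(A \oplus T)$ is a solution to $A$.
(2) $\mathsf{P}$ is strongly Weihrauch reducible to $\mathsf{Q}$, and write $\mathsf{P} \leq_{\mathrm{sW}} \mathsf{Q}$, if there exist Turing reductions $\Phi$ and $\Psi$ such that whenever $A$ is an instance of $\mathsf{P}$ then $B = \Phi(A)$ is an instance of $\mathsf{Q}$, and whenever $T$ is a solution to $B$ then $S = \Psi(T)$ is a solution to $A$.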
In other words, Weihrauch reducibility differs from strong Weihrauch reducibility only in that the “backwards” reduction takes as oracle not only the solution to the instance of , but also the original instance, , of . The two notions thus agree on computable instances of problems, but not in general. (See also [14, Section 1] for a discussion of the distinction between these approaches in the non-uniform case; and [17, Section 2.2] for further discussion of (strong) Weihrauch and related reducibilities in the context of computable combinatorics. We note that the notation in these sources differs from ours, with and being used in place of and , respectively.) For most of our results below (with the notable exception of Theorem 3.1) it will not matter which of the two reducibility notions we are working with, so to present the strongest possible results, we shall prove reductions for , and non-reductions for .
It is straightforward to see each of these reducibilities is reflexive and transitive and thus defines a degree structure on statements.
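One simple example of a strong Weihrauch reduction is that $\mathsf{RT}^n_j \leq_{\mathrm{sW}} \mathsf{RT}^n_k$ whenever $j < k$, because given $f : [\omega]^n \to j$, we may view $f$ as a function $g : [\omega]^n \to k$ (by ignoring the additional colors), and then every set homogeneous for $g$ is homogeneous for $f$. (Thus, here $\Phi$ and $\Psi$ can both be taken to be the identity reduction.) A slightly more interesting example is that $\mathsf{RT}^m_k \leq_{\mathrm{sW}} \mathsf{RT}^n_k$ whenever $m < n$. To see this, given $f : [\omega]^m \to k$, define $g : [\omega]^n \to k$ by letting $g(x_1, \ldots, x_n) = f(x_1, \ldots, x_m)$, and notice that $g$ is uniformly obtained from $f$ via a Turing functional, and that every set homogeneous for $g$ is homogeneous for $f$.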
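The second reduction above can be illustrated on finite data. The following is only a minimal sketch, assuming the case of one extra coordinate and a hypothetical example coloring `f` (parity of the sum); the helper names `lift` and `is_homogeneous` are ours, not the paper's. On a finite set the check is only a truncated analogue of the infinite statement, since the largest element of the set never occurs among the first coordinates of a longer tuple.

```python
from itertools import combinations

def lift(f, n):
    """Lift a coloring f of increasing n-tuples to a coloring of
    increasing (n+1)-tuples by ignoring the last (largest) entry."""
    def g(tup):
        assert len(tup) == n + 1
        return f(tup[:n])
    return g

def is_homogeneous(coloring, exponent, H):
    """True if `coloring` is constant on all increasing
    `exponent`-tuples drawn from the finite set H."""
    colors = {coloring(t) for t in combinations(sorted(H), exponent)}
    return len(colors) <= 1

def f(pair):
    # Hypothetical example coloring of pairs with 2 colors: parity of the sum.
    x, y = pair
    return (x + y) % 2

g = lift(f, 2)          # a 2-coloring of triples, computed uniformly from f
H = [0, 2, 4, 6, 8]     # finite stand-in for an infinite homogeneous set

print(is_homogeneous(g, 3, H), is_homogeneous(f, 2, H))
```

Here the forward map is `lift` and the backward map is the identity on `H`, which is what makes the reduction strong in this toy setting.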
There are also many examples of such reductions using more complicated Turing functionals. Friedman, Simpson, and Smith  showed that if is the statement that every commutative ring with identity has a prime ideal, then . Adapting the proof of this result, one can show that it is possible to uniformly computably convert a commutative ring into an infinite tree such that every path of is a prime ideal of , and hence that . For another example, Cholak, Jockusch, and Slaman [9, Theorem 12.5] exhibit a strong Weihrauch reduction of to via a non-trivial .¹

¹ The same is not true if is replaced by the closely related principle , introduced in [9, Statement 7.8]. This asserts that if is stable, i.e., if for each the limit of as tends to infinity exists, then there is an infinite set consisting either entirely of numbers for which this limit is , or entirely of numbers for which this limit is . A recent result by Chong, Slaman, and Yang [10, Theorem 2.7] resolves a longstanding open problem by showing that . However, the model they construct to witness the separation is a non-standard one, and so leaves open the question of whether every -model of is also a model of . A typical reason for this being the case would be if were uniformly reducible to . However, by results of Dzhafarov [14, Theorem 1.5 and Corollary 1.10] it follows that , and more recently, Lerman, Solomon, and Towsner (unpublished) have shown even that .
Despite the fact that many natural implications in reverse mathematics correspond to Weihrauch reductions (even strong Weihrauch reductions), there are certainly examples where an implication holds in reverse mathematics but no Weihrauch reduction exists. For example, building on work of Jockusch in , it is known that whenever and , and in particular that . However, because every computable instance of has a -computable solution, but there is a computable instance of with no -computable solution (see [21, Theorems 5.1 and 5.6]). The underlying reason why this implication holds in reverse mathematics is that can be coded into a computable instance of , and by relativizing and iterating this result (i.e., by using multiple nested applications of ), one can obtain the several jumps necessary to compute solutions to instances of .
There are also more subtle instances where no Weihrauch reduction exists despite the fact that degrees of solutions to the problems correspond. For example, Jockusch [22, Theorem 6] showed that for any , the degrees of DNR functions (i.e., functions such that for all ) are the same as the degrees of DNR functions, but there is no Weihrauch reduction witnessing this. More precisely, he showed that given , there is no Turing functional such that for all . If we let be the statement “for every , there exists a function relative to ”, then Jockusch’s theorem shows that .
A motivating question for this article is what happens when one varies the number of colors in Ramsey’s Theorem. It is well known that if and , then . For example, to see that , we can argue as follows. Suppose that . Define by letting
By , we may fix a set such that is homogeneous for . Now if , then is homogeneous for . Otherwise, the function is a -coloring of , so we may apply a second time to conclude that there is an infinite such that is homogeneous for . Notice that this proof requires two nested applications of to obtain a solution to . However, there are no known degree-theoretic differences between homogeneous sets of computable instances of and homogeneous sets of computable instances of , so it is unclear whether there is a proof of using one uniform application of . We prove below in Theorem 3.1 that when .
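The nested structure of this argument can be mimicked on a finite universe. The sketch below is an illustration under stated assumptions: the exponent is fixed at 2, the three colors are 0, 1, 2, the merged coloring `g` is assumed to identify colors 1 and 2, and the brute-force search `largest_homogeneous` merely stands in for one application of the two-color principle. None of these names come from the paper.

```python
from itertools import combinations

def largest_homogeneous(coloring, universe):
    """Brute-force stand-in for one application of Ramsey's theorem for
    pairs: return a largest subset of `universe` (a sorted list) on which
    `coloring` is constant."""
    for size in range(len(universe), 1, -1):
        for H in combinations(universe, size):
            if len({coloring(p) for p in combinations(H, 2)}) == 1:
                return list(H)
    return list(universe[:1])

def solve_three_colors(f, universe):
    # Assumed merging: colors 1 and 2 are identified, giving a 2-coloring g.
    def g(p):
        return 0 if f(p) == 0 else 1
    H = largest_homogeneous(g, universe)       # first application
    if len(H) < 2 or g(tuple(H[:2])) == 0:
        return H                               # f is constantly 0 on pairs of H
    # On pairs of H, f only takes the colors 1 and 2, so it is a 2-coloring there.
    return largest_homogeneous(f, H)           # second, nested application

def f(p):
    # Hypothetical 3-coloring of pairs: sum modulo 3.
    x, y = p
    return (x + y) % 3

result = solve_three_colors(f, list(range(9)))
print(result)
```

The second call takes the output of the first as its domain, which is exactly the non-uniformity the text points to: the instance for the second application is not computed from `f` alone.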
Although the same basic idea of (strong) Weihrauch reducibility is used in the contexts of computable combinatorics, computable analysis, and reverse mathematics, there are important differences beyond terminology that the reader should keep in mind when translating back and forth.
In , (strong) Weihrauch reduction is defined not only for partial multi-valued functions but also for abstract collections of partial functions on Baire space. This more general idea has no equivalent formulation in second-order arithmetic and, moreover, only definable relations make sense in the latter context. Thus, reverse mathematics has a limited view of the (strong) Weihrauch degrees considered in computable analysis. In practice, this limitation only surfaces when considering the general structure of (strong) Weihrauch degrees since these degrees, when considered for their own sake, generally correspond to definable relations. In this paper, we will only consider arithmetically-definable relations, which therefore make sense in all contexts.
Computable combinatorics and computable analysis work exclusively with the standard natural numbers whereas reverse mathematics also considers non-standard models. Since the base system only postulates induction for formulas, issues related to induction often occur in translation and it is not the case that every reduction translates into a proof that . For example, a direct analysis of the reduction of Cholak, Jockusch, and Slaman showing that alluded to above appears to use -induction in order to verify that homogeneous sets for the transformed coloring are indeed cohesive for the given instance, and some additional work is required to verify that the proof goes through (see Mileti [27, Appendix A]).
The typical use of oracles varies in the three contexts. In computable combinatorics, results are usually stated without any use of oracles and issues of relativization are discussed where necessary. In computable analysis, both the (strong) Weihrauch reduction above and its continuous analogue, which permits the use of any oracle, are considered. In reverse mathematics, it is customary to allow any oracle that exists in the model under consideration. So the case of (strong) Weihrauch reduction corresponds to the minimal standard model of where the only sets are the computable ones, and the continuous analogue corresponds to the case of the full standard model of second-order arithmetic where all sets are present.
For these reasons, most of our results will be stated in a way that includes all relevant translations, though our proofs will generally focus only on one point of view, with the others left to the reader.
We use standard notations and conventions from computability theory and reverse mathematics. We identify subsets of with their characteristic functions, and we identify each with its set of predecessors. Lower-case letters such as denote elements of . Given a set , we let denote the set of all subsets of of size . We use to denote finite subsets of , which we identify with the corresponding tuple listing the elements in increasing order. We write if . Given a Turing functional , we assume that if , then for all . We say that a Turing functional is total if is a total function for every . Given sets and , we write in place of .
2. The Squashing Theorem and sequential forms
We can naturally combine two principles and into one as follows. We define the parallel product to be the principle whose instances are pairs such that is an instance of and is an instance of , and the solutions to this instance are pairs such that is a solution to and is a solution to . Obviously, this can be generalized to combine any number of principles, even an infinite number. In particular, one of our interests will be in cases when and are the same principle. For , we let applications of , or , refer to the principle whose instances are sequences such that each is an instance of , and the solutions to this instance are sequences such that each is a solution to . The infinite case is sometimes known as the parallelization of and is also denoted .
Notice that we trivially have because given two sequences of sets and , we can uniformly computably interleave them to form the sequence where and , so that any set cohesive for is cohesive for each of and . In fact, using a pairing function, it is easy to see that . For another example, we have that as follows. Given two infinite trees , form a new tree by letting if the sequence of even bits from is an element of and the sequence of odd bits from is an element of . It is straightforward to check that is an infinite tree uniformly computably obtained from , and that given a path through , the even bits form a path through , and the odd bits form a path through . Moreover, using a pairing function again, we can interleave a sequence of infinite trees together to form one infinite tree such that from any path we can uniformly computably obtain paths through each of the original trees, and hence . (This fact is also a consequence of Theorem 8.2 of , which shows that is strong Weihrauch equivalent to ; see also Lemma 5 of Hirst  for a formalized version in reverse mathematics.)
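The tree-interleaving construction for two trees can be sketched as follows. This is a minimal illustration, assuming trees are given as membership predicates on binary strings; the example trees `in_T0` and `in_T1` (strings avoiding "11" and "00", respectively) and all function names are hypothetical choices of ours.

```python
def interleave(in_T0, in_T1):
    """Membership test for the interleaved tree: sigma is in T exactly when
    its even-position bits lie in T0 and its odd-position bits lie in T1."""
    def in_T(sigma):
        return in_T0(sigma[0::2]) and in_T1(sigma[1::2])
    return in_T

def split_path(prefix):
    """Recover prefixes of paths through T0 and T1 from a prefix of a path
    through the interleaved tree."""
    return prefix[0::2], prefix[1::2]

def in_T0(s):
    # Hypothetical infinite tree: binary strings with no two consecutive 1s.
    return "11" not in s

def in_T1(s):
    # Hypothetical infinite tree: binary strings with no two consecutive 0s.
    return "00" not in s

in_T = interleave(in_T0, in_T1)
print(in_T("0110"), split_path("0110"))
```

Both directions are plain slicing, so the forward map (building the tree) and the backward map (splitting a path) are uniform, and the backward map does not consult the original trees.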
We have the following important example using distinct principles.
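Proposition 2.1. If $n, j, k \geq 1$, then $\mathsf{RT}^n_j \times \mathsf{RT}^n_k \leq_{\mathrm{sW}} \mathsf{RT}^n_{jk}$.
Proof. Given $(f, g)$ where $f : [\omega]^n \to j$ and $g : [\omega]^n \to k$, define $h : [\omega]^n \to jk$ by $h(\vec{x}) = \langle f(\vec{x}), g(\vec{x}) \rangle$ for all $\vec{x} \in [\omega]^n$. Then $h$ is uniformly computable from $(f, g)$, and any infinite homogeneous set for $h$ is also homogeneous for both $f$ and $g$. ∎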
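As a finite illustration of this product construction, the sketch below codes a pair of colors as a single color. The coding `f(t) * k + g(t)`, the example colorings, and the helper names are assumptions made for the example; any computable pairing of the two colors would serve equally well.

```python
from itertools import combinations

def product_coloring(f, g, k):
    """Combine a coloring f and a k-coloring g into a single coloring h
    whose color codes the pair (f(t), g(t))."""
    def h(tup):
        return f(tup) * k + g(tup)
    return h

def is_homogeneous(coloring, exponent, H):
    """True if `coloring` is constant on all increasing
    `exponent`-tuples drawn from the finite set H."""
    colors = {coloring(t) for t in combinations(sorted(H), exponent)}
    return len(colors) <= 1

def f(p):
    # Hypothetical 2-coloring of pairs.
    return (p[0] + p[1]) % 2

def g(p):
    # Hypothetical 3-coloring of pairs.
    return (p[0] * p[1]) % 3

h = product_coloring(f, g, 3)   # uses at most 2 * 3 = 6 colors
H = [0, 6, 12, 18]

print(is_homogeneous(h, 2, H), is_homogeneous(f, 2, H), is_homogeneous(g, 2, H))
```

Since the pair of colors can be decoded from the single color, a set on which `h` is constant is one on which both `f` and `g` are constant, and the same set serves as the solution to both instances.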
Given a principle , if , then it is straightforward to see (by repeatedly applying the reduction procedures) that for each fixed . For example, if and we are given where each is an instance of , then
is an instance of uniformly obtained from , and from any solution to this instance we can repeatedly apply to uniformly obtain a sequence such that each is a solution to . (The same is true if is replaced by .) It is not at all clear, however, whether this process can be continued into the infinite, i.e., does necessarily imply that ? Given a sequence where each is an instance of , the natural idea is to consider
Of course, this process clearly fails to converge and so does not actually define an instance of . In fact, we will see later that does not always imply that .
However, if and is reasonably well-behaved, we will prove that such a “squashing” of infinitely many applications of into one application of is indeed possible. For example, consider . The idea is to force some convergence in the above computation by approximating the second coordinate of as follows. When attempting to simulate , we approximate the unknown result of by guessing that it starts as the all-zero coloring. By assuming this and hence that the second argument looks like a string of zeros, we eventually force convergence of on , at the cost of introducing some finite initial error in the true “computation” of . Since removing finitely many elements from an infinite homogeneous set results in an infinite homogeneous set, these finitely many errors we have introduced into the coloring will not be a problem.
More precisely, we will define a sequence of instances of (where intuitively beyond some finite error introduced to force convergence), along with a uniformly computable sequence of numbers , such that
Now since we no longer have (due to the finite error), the Turing functional may not convert a solution of into a pair of solutions to and . In order to deal effectively with these finite errors, to ensure that our are actually instances of , and to ensure that the sequence is uniformly computable (and hence can be used as markers for cut-off points), we need to make some assumptions about .
Let be a principle (or, more generally, any multi-valued function with domain ).
is total if every element of is (or codes) an instance of .
has finite tolerance if there exists a Turing functional such that whenever and are instances of with for all , and is a solution to , then is a solution to .
For each , the principle is total and has finite tolerance.
We can view every element of as a valid -coloring through simple coding. Define as follows. Given , compute the largest element of any tuple of coded by a natural number less than , and let . Now if and are colorings of using colors such that for all , and is an infinite set homogeneous for , then is also an infinite set and it is homogeneous for . ∎
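The finite-tolerance argument can be checked on a small example. This is a simplified sketch: instead of coding tuples by natural numbers as in the proof, it assumes the two colorings disagree only on pairs all of whose entries are at most a known bound, and it trims the homogeneous set above that bound. The names `trim`, `f`, `g`, and the bound 5 are hypothetical.

```python
from itertools import combinations

def trim(H, bound):
    """Discard every element of H that is at most `bound`."""
    return [x for x in H if x > bound]

def is_homogeneous(coloring, exponent, H):
    """True if `coloring` is constant on all increasing
    `exponent`-tuples drawn from the finite set H."""
    colors = {coloring(t) for t in combinations(sorted(H), exponent)}
    return len(colors) <= 1

def f(p):
    # Hypothetical 2-coloring of pairs.
    return (p[0] + p[1]) % 2

def g(p):
    # A finite modification of f: it differs only on pairs
    # whose entries are all at most 5.
    return 1 if max(p) <= 5 else f(p)

H = [0, 2, 4, 6, 8, 10, 12]     # homogeneous for f
H_trimmed = trim(H, 5)          # homogeneous for g as well

print(is_homogeneous(f, 2, H), is_homogeneous(g, 2, H), is_homogeneous(g, 2, H_trimmed))
```

The trimming step depends only on the bound, not on either coloring, which mirrors the requirement that the tolerance functional act on the solution and the cut-off alone.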
Another simple example of a total principle with finite tolerance is , where in fact we may take (because anything cohesive for a given family of sets is also cohesive for any finite modification of that family).
Although we are certainly interested in the case where , i.e., when , we will need a slightly more general formulation below. As above, when , it is straightforward to see that for each fixed . When passing to the infinite case, however, our “squashing” never reaches the initial instance of , but in good cases we can conclude that . Notice that if , this reduces to the case discussed above.
As a rule, all results in this section about principles could be formulated more generally for any multi-valued function with domain , as in Definition 2.2. For brevity, we shall omit repeatedly stating this.
Theorem 2.5 (Squashing Theorem).
Let and be statements, and assume that both are total and that has finite tolerance.
If then .
If then .
We prove (1), the proof of (2) being virtually the same (in fact, the argument can be made somewhat simpler because the oracle has access to the original problem). Throughout, if , we write for the concatenation of by , and for the continuation of by , meaning
for all . For , we similarly define .
Fix functionals and witnessing the fact that . Since is total, we may fix a computable instance of (one could take to be the sequence of all s, but for some particular problems it might be more convenient or natural to use a different ). Given a sequence of instances of , we uniformly define a sequence of instances of together with a uniformly computable sequence of numbers so that
for all . In other words, we will have for all , and for all . We will then use the instance of as our transformed version of and show how given a solution of , we can uniformly transform into a sequence of solutions to . One subtle but very important point here is that our sequence of cut-off positions will need to be uniformly computable independent of the instances , so that we can use them to unravel a solution of without knowledge of the initial instance.
Thus, our first goal is to define the uniformly computable sequence . We proceed in stages, initially letting . At stage , we define . The goal is to choose large enough to ensure that all potential for will be defined on . Intuitively, by placing enough of down in column (i.e., at the beginning of ), we must eventually see convergence on previous columns through the cascade effect of the nested . Since we do not have access to the sequence , we make essential use of compactness and the fact that is total to handle all potential inputs at once.
To this end, assume has been defined for each . First we claim there exists an such that for all ,
and for general ,
Observe that the set of all such is closed under successor. Thus, once the claim is proved, we can define to be the least such that is greater than for all and also greater than (to ensure that will be defined on as well). This observation also implies that to prove the claim, it suffices to fix , and prove that we can effectively find an such that (1) holds for all .
To this end, let be the set of all tuples of binary strings with such that
Since each of the computations here has a finite string as an oracle, is a computable set. Furthermore, if is an initial segment of under component-wise extension, that is if , then belongs to if does. Thus, is a subtree in under component-wise extension.
Now if is infinite, then it must have an infinite path , where and for all . Then by definition of ,
As and are both total, each of are instances of , and each of the second components of any above are instances of . In particular,
is an instance of , as is . But then cannot be undefined. We conclude that is finite, whence its height can clearly serve as the desired . To complete the proof, we note that an index for as a computable tree can be found uniformly computably from and , and therefore so can .
We now define our reduction procedures witnessing that . Let be an instance of . From this sequence, we uniformly computably define a sequence of instances of as follows. Again, we proceed by stages, doing nothing at stage . At stage , we define for each and define on . If , we let . Otherwise, we let
the right-hand side of which we know to be convergent by definition of . That is, we have defined
and so forth. (Each of the in the computations above could also be replaced by .) We also define for all . Since, from the next stage on, will be defined so that , it is not difficult to see that we do indeed succeed in arranging , as desired. Furthermore, is defined uniformly computably from , and each is an instance of because is total. In particular, and there is a Turing functional that produces from .
Let be a Turing functional witnessing that has finite tolerance. We claim that from any solution to the instance of , we can uniformly computably obtain a sequence of solutions to . So suppose is any such solution to . The idea is to repeatedly apply the reduction to deal with the finite errors, followed up by to convert individual solutions to pairs of solutions. Indeed, since for all , we have that is a solution to . Thus, is such that is a solution to , and is a solution to . The first of these, , can serve as the first member of our sequence of solutions. Since for all , we have that is a solution to . Thus, is such that is a solution to , and is a solution to . Continuing in this way, we build an entire sequence of solutions to , and since is uniformly computable, we do this uniformly computably from alone. The proof is complete. ∎
The utility of the Squashing Theorem for our purposes, as we shall see in subsequent sections, is that in many cases it allows us to deduce that multiple applications of a given principle cannot be uniformly reduced to one. This is because there is no (strong) Weihrauch reduction of infinitely many instances of that principle to one, and in general, showing this tends to be easier.
Let be a principle that is total and has finite tolerance.
If , then .
If , then .
Apply Theorem 2.5 with . ∎
Let and be principles.
If both and are total, then is total.
If both and have finite tolerance, then has finite tolerance.
Let and be statements, assume that both are total and that has finite tolerance, and let be given.
If then .
If then .
Repeatedly applying Lemma 2.7, we see that is total and has finite tolerance. The result follows from the Squashing Theorem. ∎
Let be a principle that is total and has finite tolerance, and let be given.
If , then .
If , then .
Since , we know that , so the result follows from the previous corollary. ∎
For the remainder of this article, we employ the following short-hand to avoid excessive exponents and to give a more evocative name.
For any principle , we denote applications of , or , by . We call the sequential version of .
So, for instance, Corollary 2.6 says that if is total and has finite tolerance, then implies that . With this terminology, we have the following simple result.
Let and be principles.
If , then .
If , then .
For (1), fix and witnessing the reduction . Given an instance of , we have that is an instance of uniformly computably obtained from it. Also, if is a solution to , then is a solution to . For (2), the proof is the same, except we must take as the solution. ∎
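The componentwise lifting in this proof can be sketched with a toy strong reduction. Everything concrete here is a hypothetical stand-in: the problem "find the index of a 1 in a bit list" reduces to "find the index of a 0" by flipping bits, and `solve_Q` plays the role of an arbitrary solver for the target problem. Generators are used to suggest that the same maps apply to infinite sequences.

```python
def sequential(phi, psi):
    """Lift a strong reduction (phi, psi) between two problems to their
    sequential versions by applying each map componentwise."""
    def phi_seq(instances):
        return (phi(a) for a in instances)
    def psi_seq(solutions):
        return (psi(t) for t in solutions)
    return phi_seq, psi_seq

def phi(bits):
    # Forward map of the toy reduction: flip every bit.
    return [1 - b for b in bits]

def psi(index):
    # Backward map of the toy reduction: the index is unchanged.
    return index

def solve_Q(bits):
    # Stand-in solver for the target problem: index of the first 0.
    return bits.index(0)

phi_seq, psi_seq = sequential(phi, psi)
instances = [[0, 1], [1, 0, 0]]
solutions = list(psi_seq(solve_Q(b) for b in phi_seq(instances)))
print(solutions)
```

For a non-strong reduction the backward map would additionally take the original instance as an argument, so `psi_seq` would need to zip the solutions with the instances.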
3. Ramsey’s theorem for different numbers of colors
Throughout this section, let be fixed. Our goal is to work up towards a proof of the following theorem.
For all with , we have .
As pointed out above, we have that , but the obvious proof uses multiple nested applications of . Theorem 3.1 says that it is impossible to give a uniform proof of this implication using just one application of .
The key ingredients of the proof are Proposition 2.1, the Squashing Theorem, and the fact that it is possible to code more into than into alone. To illustrate the last of these, consider . Notice that every computable instance of trivially has a computable solution because either there are infinitely many $0$s or there are infinitely many $1$s (and each of these sets is computable), but there is one non-uniform bit of information used to determine which of these two statements is true. However, it is a straightforward matter to build a computable instance of such that every solution computes . The idea is to use each column to code one bit of by exploiting this one non-uniform decision. In fact, for higher exponents this result can be made sharper, as we now prove. (See also [24, Proposition 47] for a related result in the context of proof mining and program extraction.)
There is a computable instance of every solution to which computes .
We prove the result for being odd; the case where is even is analogous. Fix a computable predicate such that
Define a computable sequence of colorings by
for all .
Let be any sequence of infinite homogeneous sets for the . We claim that for all , and hence that . To see this, suppose first that . Let be Skolem functions for membership in , so that
Now define an increasing sequence of elements as follows. Start by letting be the least that is greater than . Then, given with , suppose we have defined for all . If is odd, let be the least that is greater than . If is even, let be the least that is greater than , and also greater than for all sequences with for each odd .
The sequence of so constructed now clearly satisfies
So by definition of , we have that . And since the all belong to , it follows that , as desired.
Now suppose that . We can similarly construct a sequence of elements of witnessing that . Let be Skolem functions for non-membership in , so that
Let be the least element of , and suppose we are given a with such that has been defined for all . If is even, let be the least that is greater than . If is odd, let be the least that is greater than , and also greater than for all sequences with for each even .
This sequence of satisfies the negation of (2) above, so by definition. Since all the belong to , the claim follows. ∎
After relativization and translation into the language of strong Weihrauch reductions, we obtain from the above that (See the discussion following Corollary 5.21 for a definition of the iterated Turing jump, .)
For all and , we have .
Suppose instead that