Definable sets in a hyperbolic group
Abstract
We give a description of definable sets in a free nonabelian group $F$ and in a torsion-free nonelementary hyperbolic group $\Gamma$. As a corollary we show that proper noncyclic subgroups of $F$ and $\Gamma$ are not definable. This answers Malcev’s question posed in 1965 for $F$.
1 Introduction
We denote by $F$ a free group with finite basis and by $\Gamma$ a torsion-free nonelementary hyperbolic group, and consider formulas in the language that contains generators of $F$ (or $\Gamma$) as constants. In this paper we give a description of subsets of $F^m$ definable in $F$ (Theorem 13) that follows from \@cite?, and a similar description for $\Gamma$ (Theorem 24) that uses \@cite?. A subset $P \subseteq H^m$ is definable in a group $H$ if there exists a first-order formula $\varphi(p_1, \ldots, p_m)$ in this language such that $P$ is precisely the set of tuples in $H^m$ where $\varphi$ holds: $P = \{(g_1, \ldots, g_m) \in H^m \mid H \models \varphi(g_1, \ldots, g_m)\}$.
Our description implies that definable subsets in $F$ are either negligible or conegligible (Bestvina and Feighn’s result), and that they are also either negligible or generic in the sense of complexity theory. We will obtain the following corollary.
Theorem 1
Proper noncyclic subgroups of $F$ and $\Gamma$ are not definable.
These results solve Malcev’s problem 1.19 from \@cite? posed in 1965. Malcev asked the following:
1) Describe definable sets in $F$;
2) Describe definable subgroups in $F$;
3) Is the derived subgroup of $F$ definable in $F$?
The main result, Theorem 13, will be proved in Section 4; Theorem 1 for $F$ will be proved in Section 5. In Section 5 we will also prove
Theorem 2
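The set of primitive elements of $F$ is not definable if $\mathrm{rank}(F) > 2$.
Notice that the set of all bases in $F_2 = F(a,b)$ is definable. This is based on Nielsen’s theorem: elements $g, h$ form a basis of $F(a,b)$ if and only if the commutator $[g,h]$ is conjugate either to $[a,b]$ or to $[b,a]$. Hence the set of bases in $F_2$ is defined by the following formula
$$\varphi(X_1, X_2) = \exists z\, \bigl([X_1,X_2] = z^{-1}[a,b]z \;\vee\; [X_1,X_2] = z^{-1}[b,a]z\bigr).$$
The set of all primitive elements in $F_2$ is defined by the following formula
$$\psi(X_1) = \exists X_2\, \varphi(X_1, X_2).$$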
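As an illustration of the Nielsen criterion stated above, the following Python sketch tests whether two words form a basis of the free group of rank 2. The encoding (lowercase letters for generators, uppercase letters for their inverses) and all function names are assumptions made for this illustration and are not taken from the paper. Conjugacy is decided in the standard way, by comparing cyclically reduced forms up to a cyclic shift.

```python
def inv(w):
    """Formal inverse of a word: reverse it and invert every letter."""
    return w[::-1].swapcase()

def free_reduce(w):
    """Cancel adjacent pairs x x^{-1} until the word is freely reduced."""
    out = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()
        else:
            out.append(c)
    return "".join(out)

def cyclic_reduce(w):
    """Freely reduce, then strip cancelling first/last letters."""
    w = free_reduce(w)
    while len(w) >= 2 and w[0] == w[-1].swapcase():
        w = w[1:-1]
    return w

def conjugate(u, v):
    """Two elements of a free group are conjugate iff their cyclically
    reduced forms are cyclic shifts of each other."""
    u, v = cyclic_reduce(u), cyclic_reduce(v)
    return len(u) == len(v) and v in u + u

def commutator(x, y):
    return free_reduce(x + y + inv(x) + inv(y))

def is_basis_F2(x, y):
    """Nielsen criterion in F(a, b): [x, y] is conjugate to [a, b] or [b, a]."""
    c = commutator(x, y)
    return conjugate(c, "abAB") or conjugate(c, "baBA")

if __name__ == "__main__":
    for pair in [("a", "b"), ("ab", "b"), ("ab", "aab"), ("a", "bb")]:
        print(pair, is_basis_F2(*pair))
```

Note that this only decides membership of a concrete pair in the set of bases; it says nothing about definability, which is a statement about first-order formulas.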
The results on hyperbolic groups will be presented in the last section.
2 Conjunctive formulas
The first step in analyzing the structure of definable sets is to reduce it to the study of the structure of $\forall\exists$-definable sets.
Theorem 3
\@cite?,\@cite? Every formula in the theory of $F$ is equivalent to a boolean combination of $\forall\exists$-formulas.
Furthermore, a more precise result holds.
Theorem 4
Every set definable in $F$ is defined by some boolean combination of formulas
(1) $\exists X\, \forall Y\, \bigl(U(P,X) = 1 \wedge V(P,X,Y) \neq 1\bigr)$,
where $X, Y, P$ are tuples of variables.
We call formulas of the form (1), where $U$ and $V$ are either equations or finite systems of equations, conjunctive formulas. Notice that in the language with constants every finite system of equations in $F$ and in $\Gamma$ is equivalent to one equation. For $F$ this is Malcev’s result, see also Lemma 3 in \@cite?; for $\Gamma$ this is Lemma 28. Therefore we can assume that $U$ and $V$ are equations (although this is not essential for the proof of the main results). Every finite disjunction of equations in $F$ and $\Gamma$ is equivalent to one equation. This is attributed to Gurevich for $F$, see also Lemma 4 in \@cite?, and the same proof works for $\Gamma$.
Proof. By Theorem 3, every definable set is defined by a boolean combination of $\forall\exists$-formulas. By Lemma 10 \@cite?, every formula is equivalent to
where are tuples of variables.
This formula has form (25) in \@cite?, namely
(2) 
The proof of Theorem 39 in Section 12.7 of \@cite? shows that the formula is false for a value of the variables if and only if the conjunction of disjunctions of formulas of the two types given below is true for . We will write these formulas in the same form as they appear in Section 12.7 of \@cite?. Notice that instead of the union of variables in these formulas we take variables .
(3) 
(4) 
3 Diophantine sets and sets
Definition 5
A piece of a word $u \in F$ is a nontrivial subword $v$ that appears in $u$ in at least two different ways (possibly the second time as $v^{-1}$, possibly with overlapping).
A piece of a tuple of reduced words $(u_1, \ldots, u_k)$ is a nontrivial subword that appears in at least two different ways as a subword of some words of the tuple.
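Definition 5 can be made concrete with a small brute-force sketch. The Python code below lists all pieces of a reduced word and of a tuple of reduced words. The encoding (uppercase letters as inverses of lowercase generators) and the function names are assumptions made for this illustration. For tuples the code also counts occurrences of the inverse subword, by analogy with the single-word case; this reading of the definition is an assumption.

```python
def inv(w):
    """Formal inverse of a word: reverse it and invert every letter."""
    return w[::-1].swapcase()

def occurrences(u, w):
    """Start positions of u in w; overlapping occurrences are counted."""
    return [i for i in range(len(w) - len(u) + 1) if w.startswith(u, i)]

def pieces_of_tuple(words):
    """Nontrivial subwords occurring at least twice in the tuple,
    where an occurrence of the inverse subword also counts."""
    found = set()
    for w in words:
        for i in range(len(w)):
            for j in range(i + 1, len(w) + 1):
                u = w[i:j]
                total = sum(len(occurrences(u, v)) + len(occurrences(inv(u), v))
                            for v in words)
                if total >= 2:
                    found.add(u)
    return found

def pieces(w):
    """Pieces of a single reduced word (Definition 5, first part)."""
    return pieces_of_tuple((w,))

if __name__ == "__main__":
    print(sorted(pieces("abab")))   # subwords repeated directly
    print(sorted(pieces("abA")))    # a letter repeated as its inverse
    print(sorted(pieces("aaa")))    # overlapping occurrences
```

In a reduced word no nontrivial subword equals its own inverse, so direct and inverse occurrences are never double counted.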
Definition 6
A proper subset of admits parametrization if it is a set of all words that satisfy a given system of equations (with coefficients) without cancellations in the form
(5) 
where for all , , each appears at least twice in the system and each variable in is a piece of .
A proper subset of admits parametrization if after permutation of indices it is a product set of and a set of all tuples of words that satisfy a given system of equations (with coefficients) without cancellations in the form
(6) 
where for all , , each appears at least twice in the system and each variable in each is a piece of the tuple.
The empty set and one-element subsets of $F$ admit parametrization.
A finite union of sets admitting parametrization will be called a multipattern. A subset of a multipattern will be called a submultipattern.
In this section we will give a description of Diophantine sets and sets in .
Theorem 7
Suppose a Diophantine set defined by the formula
where is a system of equations, is not the whole group . Then is a multipattern.
We will prove this result using the notion of a cut equation introduced in \@cite?, Section 5.7.
Definition 8
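A cut equation $\Pi = (E, M, X, f_M, f_X)$ consists of a set of intervals $E$, a set of variables $M$, a set of parameters $X$, and two labeling functions $f_X : E \to F[X]$, $f_M : E \to F[M]$.
For an interval $\sigma \in E$ the image $f_M(\sigma)$ is a reduced word in variables $M^{\pm 1}$ and constants from $F$; we call it a partition of $f_X(\sigma)$.
Sometimes we write $\Pi = (E, f_X, f_M)$, omitting $M$ and $X$.
Definition 9
A solution of a cut equation $\Pi$ with respect to an $F$-homomorphism $\beta : F[X] \to F$ is an $F$-homomorphism $\alpha : F[M] \to F$ such that: 1) for every $\mu \in M$, $\alpha(\mu)$ is a reduced nonempty word; 2) for every reduced word $f_M(\sigma)$ ($\sigma \in E$) the replacement $\mu \to \alpha(\mu)$ ($\mu \in M$) results in a word $f_M(\sigma)^{\alpha}$ which is a reduced word as written and such that $f_M(\sigma)^{\alpha}$ is graphically equal to the reduced form of $f_X(\sigma)^{\beta}$; in particular, the following diagram is commutative.
If $\alpha$ is a solution of a cut equation $\Pi$ with respect to an $F$-homomorphism $\beta$, then we write $(\Pi, \beta, \alpha)$ and refer to $\alpha$ as a solution of $\Pi$ modulo $\beta$. In this event, for a given $\sigma \in E$ we say that $f_M(\sigma)^{\alpha}$ is a partition of $f_X(\sigma)^{\beta}$. Sometimes we also consider homomorphisms $\alpha : F[M] \to F$ for which the diagram above is still commutative, but cancellation may occur in the words $f_M(\sigma)^{\alpha}$. In this event we refer to $\alpha$ as a group solution of $\Pi$ with respect to $\beta$.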
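The difference between a solution and a group solution of a cut equation can be seen in a toy example. The Python sketch below is only an illustration under stated assumptions: a cut equation is modelled as a list of pairs (label, partition), one pair per interval; homomorphisms are dictionaries sending symbol names to words; names not present in a dictionary are treated as constants of the free group. The names `alpha`, `beta`, `m1`, `m2`, `x`, `y` are hypothetical and chosen for this example only.

```python
def inv(w):
    """Formal inverse of a word (uppercase letters are inverse generators)."""
    return w[::-1].swapcase()

def free_reduce(w):
    out = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()
        else:
            out.append(c)
    return "".join(out)

def is_reduced(w):
    return all(w[i + 1] != w[i].swapcase() for i in range(len(w) - 1))

def image(tokens, hom):
    """Apply a homomorphism to a word given as a list of tokens.
    A token is a name, or a pair (name, -1) for an inverse."""
    parts = []
    for t in tokens:
        name, exp = t if isinstance(t, tuple) else (t, 1)
        w = hom.get(name, name)          # unknown names are constants
        parts.append(w if exp == 1 else inv(w))
    return "".join(parts)                # concatenation "as written"

def is_solution(cut_eq, beta, alpha):
    """Graphical solution: no cancellation is allowed in the partitions."""
    if any(not w or not is_reduced(w) for w in alpha.values()):
        return False
    return all(is_reduced(image(part, alpha)) and
               image(part, alpha) == free_reduce(image(label, beta))
               for label, part in cut_eq)

def is_group_solution(cut_eq, beta, alpha):
    """Group solution: equality holds only after free reduction."""
    return all(free_reduce(image(part, alpha)) == free_reduce(image(label, beta))
               for label, part in cut_eq)

if __name__ == "__main__":
    cut_eq = [(["x"], ["m1", "m2"]), (["y"], ["m2", "a", "m1"])]
    beta = {"x": "abba", "y": "baaab"}
    print(is_solution(cut_eq, beta, {"m1": "ab", "m2": "ba"}))
    print(is_group_solution(cut_eq[:1], beta, {"m1": "abA", "m2": "aba"}))
```

In the second call the product of the images of `m1` and `m2` cancels in the middle, so the assignment is a group solution of the one-interval equation but not a solution in the sense of Definition 9.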
The definition of a generalized equation can be found in \@cite? (Definition 8, \@cite?). It is one of the principal objects in our work on equations in groups. The following result states that every generalized equation is equivalent to a certain cut equation.
Lemma 10 (Lemma 34,\@cite?)
For a generalized equation one can effectively construct a cut equation such that the following conditions hold:
(1) is a partition of the whole interval into disjoint closed subintervals;
(2) contains the set of variables ;
(3) for any solution of the cut equation has a solution modulo the canonical homomorphism ( where are, correspondingly, the left and the right endpoints of the interval );
(4) for any solution of the cut equation the restriction of on gives a solution of the generalized equation .
The proof given below is verbatim the one cited; it is included here to make the paper more self-contained and because some features of the construction are implicitly used in the proof of Theorem 7. All undefined notions used in the proof can be found in \@cite?.
Proof. We begin with defining the sets and . Recall that a closed interval of is a union of closed sections of . Let be an arbitrary partition of the whole interval into closed subintervals (i.e., any two intervals in are disjoint and the union of is the whole interval ).
Let be a set of representatives of dual bases of , i.e., for every base of either or belongs to , but not both. Put .
Now let . We denote by the set of all bases over and by the set of all items in . Put For let be the interval , where are the endpoints of . A sequence of elements from is called a partition of if and for . Let be the set of all partitions of . Now put
Then for every there exists one and only one such that . Denote this by . The map is a well-defined function from into .
Each partition gives rise to a word as follows. If then . If then . If and then . The map is a well-defined function from into .
Now set . It is not hard to see from the construction that the cut equation satisfies all the required properties. Indeed, (1) and (2) follow directly from the construction.
To verify (3), let’s consider a solution of . To define corresponding functions and , observe that the function (see above) is defined for every . Now for put , where , and for put , where . Clearly, is a solution of modulo .
To verify (4) observe that if is a solution of modulo , then the restriction of onto the subset gives a solution of the generalized equation . This follows from the construction of the words and the fact that the words are reduced as written (see definition of a solution of a cut equation). Indeed, if a base occurs in a partition , then there is a partition which is obtained from by replacing by the sequence . Since there is no cancellation in words and , this implies that . This shows that is a solution of .
We will give an example. Let be a generalized equation in Fig 1.
For the cut equation we will have , , Further, This gives five partitions for .
Lemma 11 (Theorem 8, \@cite?)
Let be a system of equations over . Then one can effectively construct a finite set of cut equations
and a finite set of tuples of words such that:
(1) for every equation , one has and ;
(2) for any solution of in , there exists a number and a tuple of words such that the cut equation has a solution with respect to the homomorphism which is induced by the map . Moreover, , the word is reduced as written, and ;
(3) for any there exists a tuple of words such that for any solution (group solution) of the pair where and is a solution of in .
Proof of Theorem 7. We think about as a system of equations with coefficients and two sets of variables: and . It follows from Lemma 11 (where the role of is taken by the system ) that one can effectively construct a finite set of cut equations satisfying the conditions of the statement. Consider one of the cut equations. We can assume that this cut equation has a solution. It follows from the construction that there is at least one interval labeled by that is completely covered by variables from and constants (=coefficients). We can assume that each variable from appears at least twice in the cut equation; otherwise we can remove the partition containing and express in terms of other variables and . After removing the partition, solutions satisfying this cut equation will not be graphical solutions of the cut equation, but they will be group solutions, and every group solution of the cut equation still provides a solution of the initial equation $S(X,Y,A)=1$ (see item 3 in Lemma 11). (Let us consider the example given above. We have
We remove because is contained only once, then remove , because now is contained only once, then remove , then and, finally, Therefore can be arbitrary element of .)
By assumption, formula does not define the whole group . If formula defines a finite or empty set, then the set is trivially a multipattern. Hence, we further assume that does not define a finite or empty set. So at least one can be represented in several ways as a product without cancellation in the form (6). Variables from correspond to pieces of a reduced word corresponding to . Suppose there exists such that for each partition where in (6) there is a variable that appears only as a matching variable. Let be a variable in the partition that appears only as a matching variable. Cutting , if necessary, we can assume it does not intersect any boundaries. If it is covered by some variable in some other partition of , and has a nonmatching occurrence, then is a piece (see Figure 2).
If all variables that cover in other partitions of only have matching occurrences, we represent each such as . To find a group solution of the cut equation we can now remove all occurrences of , solve the remaining system of equations for the remaining variables, and then express the value of the removed variable in terms of the remaining variables and (see Figure 3).
We can take any in such a group solution provided the remaining variables satisfy the system of equations obtained by removing the matching variable (we denote this system by ). This system has a solution, because we assumed that the cut equation has a solution. Therefore, can be an arbitrary element of . If and this contradicts the assumption of the theorem, therefore in this case at least in one partition of , , each variable is a piece. This proves for that the words give a parametrization of the set satisfying properties of Definition 6 and hence is a multipattern. Therefore this proves the theorem for . If , we can assume that , and can be an arbitrary element of . The existence of a group solution of the cut equation we began with for is equivalent to the existence of a solution of the system that consists of (6) for and . We can now use induction on . Since defines neither nor a finite set nor an empty set, it follows by induction that for words give a parametrization of the set satisfying properties of Definition 6 and hence is a multipattern. Theorem 7 is proved.
Theorem 12
Suppose an set is defined by the formula
If the positive formula does not define the whole group , then is a submultipattern. Otherwise, either or is a submultipattern.
Proof. If the positive formula does not define the whole group , then by Theorem 7 it defines a multipattern, and is a subset of this multipattern, therefore, a submultipattern.
Suppose now that defines the whole group . Then the set of parameters defined by contains the set of parameters defined by Let us prove this. Parameters satisfying are described by disjunction of cut equations, and one of the cut equations defines the whole group (denote it ), therefore this cut equation can be described by the formula . This follows from \@cite?, Lemma 8 and also can be seen directly. Indeed, if has only matching occurrences we remove it from all partitions of , and we have a system of equations for the remaining variables that does not contain . Every tuple from corresponding to the cut equation can be represented as a function The system of equations consisting of for all such functions , by the Noetherian property, is equivalent to one equation . If is false, then satisfies the formula.
Since there exists such that such tuples either constitute the whole or form a disjunction of cut equations, therefore form a multipattern. If defines a multipattern, then defines a submultipattern. If for any we have , then defines an empty set and we cannot use solutions of the cut equation to obtain solutions of . Then we consider the next cut equation corresponding to the equation and defining the whole group (if exists) and construct the same way as we constructed it for . If one of the formulas defines a multipattern, then is a submultipattern, otherwise is a submultipattern.
4 Main Theorem
In this section we will prove the main theorem.
Theorem 13
For every definable set in a free group , either or its complement is a submultipattern.
Proof. Suppose a set is defined by the formula (1). If the set defined by $\exists X\,(U(P,X)=1)$ is not the whole group $F$, then the set defined by the formula (1) is a submultipattern.
Suppose now that the set defined by $\exists X\,(U(P,X)=1)$ is the whole group. Then, as in the proof of Theorem 12, a subset of parameters satisfying formula (1) is a union of a submultipattern and another subset that is defined by
Suppose this formula does not define the empty set. Then the negation is
and it does not define .
Lemma 14
Formula
in in the language is equivalent to the condition that formula holds in each of the finite number of NTQ groups .
Proof. We can assume that the equation is irreducible. By \@cite? each solution of this equation factors through one of the finite number of systems. By \@cite?, Theorem 12, formula holds in one of the corrective normalizing extensions of one of these NTQ systems. For any value of variables that makes each such formula true in each of the NTQ groups , this value also makes true.
Since , this lemma implies that a formula holds in , therefore by Theorem 7 must be a multipattern.
5 Negligible sets
Definition 15
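\@cite? A subset $P$ of $F$ is negligible if there exists $\epsilon > 0$ such that all but finitely many $p \in P$ have a piece such that
$$\frac{\mathrm{length}(\mathrm{piece})}{\mathrm{length}(p)} \geq \epsilon.$$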
A complement of a negligible subset is conegligible.
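To get a feeling for Definition 15, one can compute, for a concrete reduced word, the ratio of the length of its longest piece to the length of the word. The Python sketch below does this by brute force. The encoding and function names are assumptions made for this illustration, and the check in `sample_is_consistent_with_negligible` inspects only a finite sample, so it can suggest but never establish negligibility, which is a statement about all but finitely many elements of an infinite set.

```python
from fractions import Fraction

def inv(w):
    """Formal inverse of a word (uppercase letters are inverse generators)."""
    return w[::-1].swapcase()

def count_occurrences(u, w):
    """Number of (possibly overlapping) occurrences of u in w."""
    return sum(1 for i in range(len(w) - len(u) + 1) if w.startswith(u, i))

def longest_piece_length(w):
    """Length of the longest subword occurring twice (directly or inverted)."""
    for length in range(len(w) - 1, 0, -1):
        for i in range(len(w) - length + 1):
            u = w[i:i + length]
            if count_occurrences(u, w) + count_occurrences(inv(u), w) >= 2:
                return length
    return 0

def piece_ratio(w):
    """length(longest piece) / length(w) for a nonempty reduced word w."""
    return Fraction(longest_piece_length(w), len(w))

def sample_is_consistent_with_negligible(words, eps):
    """True if every word in the finite sample has a piece of relative
    length at least eps."""
    return all(piece_ratio(w) >= eps for w in words)

if __name__ == "__main__":
    powers = ["ab" * n for n in range(2, 7)]      # (ab)^2, ..., (ab)^6
    print([piece_ratio(w) for w in powers])
    print(sample_is_consistent_with_negligible(powers, Fraction(1, 2)))
```

For the family of powers of a fixed word the ratio stays bounded away from zero, which matches the intuition that such a family is negligible.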
Bestvina and Feighn \@cite? stated that in the language without constants every definable subset of $F$ is either negligible or conegligible. They also proved
Proposition 16
\@cite? 1) Subsets of negligible sets are negligible.
2) Finite sets are negligible.
3) A subset containing a coset of a noncyclic subgroup of $F$ cannot be negligible.
4) A proper noncyclic subgroup of $F$ is neither negligible nor conegligible.
5) The set of primitive elements of $F$ is neither negligible nor conegligible if $\mathrm{rank}(F) > 2$.
Proof. Statements 1) and 2) immediately follow from the definition. 3) If and , then the infinite set is not negligible . Statement 4) follows from 3). 5) Let be three elements in the basis of and denote The set of primitive elements contains and the complement contains
Lemma 17
A submultipattern is negligible.
Proof. It is enough to show that a set that admits parametrization is negligible. Let be the length of word (as a word in variables ’s and constants). The set is negligible with .
Corollary 18
Every definable subset of in the language with constants (and, therefore, in the language without constants) is either negligible or conegligible.
Definition 19
Recall that in complexity theory a subset $T \subseteq F$ is called generic if
$$\frac{|T \cap B_n|}{|B_n|} \to 1 \quad \text{as } n \to \infty,$$
where $B_n$ is the ball of radius $n$ in the Cayley graph of $F$. The term “negligible” is usually used for the complement of a generic set. In this paper we will call such a set CT-negligible.
Proposition 20
Negligible sets in Definition 15 are CT-negligible.
Proof. Let . Then Fix Let be the set of words that have a piece such that
If , then We now count the number of reduced words of length that have a piece of length (if they have a piece longer than , then they also have a piece of length ). There are choices for positions of the first letters of the pieces. Suppose these positions are fixed, one piece begins at position and the other at position . Then there are two cases :
1) If the pieces of length do not overlap, then up to a constant there are at most such words;
2) If the pieces of length overlap, and , then up to a constant there are at most such words.
Therefore up to a constant there are at most reduced words of length that have a piece of length . Let be the set of reduced words in that have a piece of length , is at most , where is a constant.
It is known that
Using these formulas with , we obtain that where is a constant.
Then Therefore , as
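The counting in this proof can be checked experimentally for small lengths. The Python sketch below enumerates reduced words of a given length in a free group of rank $m$, compares the count with the standard closed formula $2m(2m-1)^{n-1}$ for the sphere of radius $n \geq 1$, and computes the exact fraction of words of length $n$ that have a piece of length $k$. The encoding, the function names and the use of this particular sphere formula are assumptions made for this illustration; the enumeration is exponential in $n$ and is meant only for small parameters.

```python
from fractions import Fraction

def inv(w):
    """Formal inverse of a word (uppercase letters are inverse generators)."""
    return w[::-1].swapcase()

def sphere_size(m, n):
    """Number of reduced words of length n in a free group of rank m."""
    return 1 if n == 0 else 2 * m * (2 * m - 1) ** (n - 1)

def ball_size(m, n):
    """Number of reduced words of length at most n."""
    return sum(sphere_size(m, k) for k in range(n + 1))

def reduced_words(m, n):
    """Generate all reduced words of length n on m generators a, b, c, ..."""
    letters = [chr(ord("a") + i) for i in range(m)]
    alphabet = letters + [c.upper() for c in letters]

    def extend(prefix):
        if len(prefix) == n:
            yield prefix
            return
        for c in alphabet:
            if not prefix or prefix[-1] != c.swapcase():
                yield from extend(prefix + c)

    yield from extend("")

def has_piece_of_length(w, k):
    """True if some subword of length k occurs twice (directly or inverted)."""
    def count(u):
        return sum(1 for i in range(len(w) - k + 1) if w.startswith(u, i))
    return any(count(w[i:i + k]) + count(inv(w[i:i + k])) >= 2
               for i in range(len(w) - k + 1))

def fraction_with_piece(m, n, k):
    """Exact fraction of reduced words of length n having a piece of length k."""
    hits = sum(1 for w in reduced_words(m, n) if has_piece_of_length(w, k))
    return Fraction(hits, sphere_size(m, n))

if __name__ == "__main__":
    print(sphere_size(2, 3), ball_size(2, 3))
    for n in (4, 6, 8):
        print(n, fraction_with_piece(2, n, n // 2))
```

Running the last loop for growing $n$ with $k$ proportional to $n$ gives a numerical impression of how rare long pieces are, which is the content of the estimate above.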
6 Torsion free hyperbolic groups
In this section is a nonelementary torsion free hyperbolic group.
Theorem 21
Let be a nonelementary torsion free hyperbolic group. Every set definable in is defined by some boolean combination of conjunctive formulas.
Proof. Similarly to Theorem 6.5 \@cite? one can prove that (we use Sela’s notation) is in the boolean algebra of conjunctive sets. Indeed where is the depth of the tree of stratified sets and is the set of specializations of the defining parameters for which there exists a valid PS statement for some proof system of depth . Lemma 6.2 in \@cite? deals with . The proof that is a conjunctive set is identical to the free group case. Theorem 6.3 \@cite? deals with . As in the free group case, Proposition 3.7 \@cite? (the proof of this proposition is not given there but it is stated that it is identical to the proof of Proposition 1.34 \@cite?) reduces the analysis of the set to the set of specializations of the defining parameters for which there exists a test sequence of valid statements that factors through the various resolutions . By construction, if then there must exist a valid PS statement of the form
that factors through one of the PS resolutions PSHGHRes constructed with respect to all proof systems of depth 2. (Notice that the notion of a resolution corresponds to the notion of a fundamental sequence in our work.) The sets associated with various PS resolutions , i.e. the sets of specializations of defining parameters for which there exists a test sequence (test sequence corresponds to a generic family) that factors through any of the PS resolutions and restricts to valid PS statements are in the Boolean algebra of sets (Proposition 1.34, \@cite?). Moreover, it follows from the proof of Proposition 1.34 that for any specialization of the defining parameters, there are finitely many combinations for the collections of ungraded resolutions covered by a PS resolution , and the collections of ungraded resolutions covered by the other (auxiliary) graded resolutions associated with . These finitely many possibilities for the collections of ungraded resolutions form a stratification on the set of specializations of the defining parameters, obtained from the bases of all the graded resolutions that have been constructed. Therefore if and only if it belongs to certain strata in the combined stratification, and not to the complement of these strata, but it depends only on the stratum, not on the particular specialization. A stratum in the stratification is the set of specializations for which there exists a given combination of rigid and strictly solid families of specializations (= Maxclasses) of finitely many rigid or solid limit groups (= groups without sufficient splitting). These sets of specializations can be defined by a Boolean combination of conjunctive formulas.
By Theorem 1.33 in \@cite?, if there exists a valid PS statement that factors through a PS resolution , then either there exists a test sequence that factors through this resolution and restricts to valid PS statements, or there must exist a combined specialization that factors through a resolution of lower complexity, and we can continue with this resolution. The definition of a valid PS statement is given in Definition 1.23 \@cite?. The fact that for a specialization there exists a valid PS statement that factors through a PS resolution can be expressed by a Boolean combination of conjunctive formulas, because conditions (i)(iv) in this definition can be expressed by such formulas.
Each set is in the boolean algebra of conjunctive sets, by applying the same sieve procedure that is used to analyze the set .
Definition 22
Let be a torsionfree hyperbolic group generated by a set and the canonical projection.

A proper subset of admits parametrization if is the image under of a set in that admits parametrization in and there exist constants and such that for each there is a preimage such that the path corresponding to in the Cayley graph of is quasigeodesic in neighborhood of the geodesic path for .

A finite union of sets admitting parametrization is called a multipattern. A subset of a multipattern is a submultipattern.
A similar definition can be given for sets of tuples of elements of .
Definition 23
Let be a torsionfree hyperbolic group generated by a set and the canonical projection.

A proper subset of admits parametrization if is the image of the set in that admits parametrization in and there exist constants and such that for each there is a preimage such that the path corresponding to each in the Cayley graph of is quasigeodesic in neighborhood of the geodesic .

A finite union of sets admitting parametrization is called a multipattern. A subset of a multipattern is a submultipattern.
Theorem 24
For every definable subset $P$ of a nonelementary torsion-free hyperbolic group $\Gamma$, either $P$ or its complement is a submultipattern.
Proof. We will first show the set defined by the positive existential formula
where elements in are constants, is a submultipattern if it is not the whole group.
In \@cite?, the problem of deciding whether or not a system of equations over a torsionfree hyperbolic group has a solution was solved by constructing quasigeodesics called canonical representatives for certain elements of . This construction reduced the problem to deciding the existence of solutions in finitely many systems of equations over free groups. The reduction may also be used to describe parameters as shown below.
Lemma 25
\@cite? Let be a torsionfree hyperbolic group and the canonical epimorphism. There is an algorithm that, given a system of equations over , produces finitely many systems of equations
(7) 
over and homomorphisms for such that

for every homomorphism , the induced map is a homomorphism, and

for every homomorphism there is an integer and an homomorphism such that .
Moreover, the algorithm also gives positive numbers and such that for each solution of the corresponding words in represent quasigeodesics in the neighborhood of corresponding elements in .
By this lemma, the set of parameters defined by this positive existential formula consists of the images of the sets of parameters satisfying the corresponding formulas for the finitely many systems of equations (7) over the free group. This proves that this set is a submultipattern.
Now the proof of the theorem is the same as in the free group case. Suppose a set is defined by the formula (1). If the set defined by $\exists X\,(U(P,X)=1)$ is not the whole group $\Gamma$, then the set defined by the formula (1) is a submultipattern.
Suppose now that the set defined by $\exists X\,(U(P,X)=1)$ is the whole group. Then, as in the proof of Theorem 13, a subset of parameters satisfying formula (1) is a union of a submultipattern and another subset that is defined by
Suppose this formula does not define the empty set. Then the negation is
and it does not define .
Lemma 26
(\@cite?, Theorem 2.3) Formula
in in the language is equivalent to the condition that there exists a formal solution of the system in the covering closure (which corresponds to a finite number of NTQ groups ).
Since , this lemma implies that a formula holds in , therefore must be a multipattern.
Theorem 27
Proper noncyclic subgroups in a nonelementary torsion free hyperbolic group are not definable.
Proof. We will prove the theorem by contradiction. Let be a definable noncyclic subgroup in . Let be two noncommuting elements in such that they generate a free subgroup, we can assume that is cyclically minimal. Such elements exist by \@cite?, Lemma 1.14. Let where are sufficiently large numbers so that the set of elements consists of quasigeodesics in the neighborhood of corresponding geodesics. Such numbers exist by \@cite?, Lemma 2.3. We can assume that and .
Elements can be represented by quasigeodesics that have pieces of length greater than . We have one of the following two cases.
1) There is a number and a subsequence of indices such that the nonoverlapping part of and in is greater than .
2) There is a subsequence of indices such that have periodic subwords of length greater than , with period such that .
In the second case, since quasigeodesics and are close to each other, such long periodic subwords must appear in the elements . Indeed, the number of different geodesics joining phase vertices of the subpath labeled by of the path labeled by to the nearest vertices of the path labeled by in the generators and is bounded by Therefore, the same geodesic (labeled, say, by ) keeps repeating, and a relatively large part of (greater than ) is a product of commuting elements, each of them being equal to for some . This contradicts the form of the elements .
In the first case, since quasigeodesics and are close to each other, two pieces in of length greater than are close to each other. Then, similarly to the previous case, we can show that there exists an element conjugating a power of to a power of and a power of to a power of , and, therefore, commuting with both and . Therefore, and two pieces in of length greater than for some coincide. This again contradicts the form of the elements . The theorem is proved.
We complete this section with the following lemma which shows that a finite system of equations in is equivalent to one equation.
Lemma 28
Let be a torsionfree hyperbolic group. Then
1) There exists a constant such that equation implies
2) A finite system of equations in is equivalent to one equation.
Proof. The first statement follows from \@cite?, 5.3B and \@cite?, Corollary 6. Indeed, one has to take to be the maximum of and the numbers determined as in \@cite?, Corollary 6 for all triples of elements in of length not more than . It can be also obtained from the proof of Theorem 1.4, \@cite?. The theorem, in particular, states that in a nonelementary hyperbolic group, for any finite set of elements there exists an integer such that the normal closure of is free. The number in the proof depends only on the number of elements and the minimum of their translation lengths. Since hyperbolic groups are translationally discrete, and , this number depends only on the group and .
To prove the second statement we fix , and elements that do not commute, and show that the equation