# Property Testing for Cyclic Groups and Beyond

François Le Gall¹
Yuichi Yoshida²,³
###### Abstract

This paper studies the problem of testing if an input $(\Gamma,\circ)$, where $\Gamma$ is a finite set of unknown size and $\circ$ is a binary operation over $\Gamma$ given as an oracle, is close to a specified class of groups. Friedl et al. [Efficient testing of groups, STOC’05] have constructed an efficient tester using $\mathrm{poly}(\log|\Gamma|)$ queries for the class of abelian groups. We focus in this paper on subclasses of abelian groups, and show that these problems are much harder: $\Omega(|\Gamma|^{1/6})$ queries are necessary to test if the input is close to a cyclic group, and $\Omega(|\Gamma|^{c})$ queries for some constant $c$ are necessary to test more generally if the input is close to an abelian group generated by $k$ elements, for any fixed integer $k\ge 2$. We also show that knowledge of the size of the ground set $\Gamma$ helps only for $k=1$, in which case we construct an efficient tester using $\mathrm{poly}(\log|\Gamma|)$ queries; for any other value $k\ge 2$ the query complexity remains $\Omega(|\Gamma|^{c})$. All our upper and lower bounds hold for both the edit distance and the Hamming distance. These are, to the best of our knowledge, the first nontrivial lower bounds for such group-theoretic problems in the property testing model and, in particular, they imply the first exponential separations between the classical and quantum query complexities of testing closeness to classes of groups.

¹ Department of Computer Science, The University of Tokyo, legall@is.s.u-tokyo.ac.jp

² School of Informatics, Kyoto University, yyoshida@kuis.kyoto-u.ac.jp

³ Preferred Infrastructure, Inc.

## 1 Introduction

Background: Property testing is concerned with the task of deciding whether an object given as an oracle has (or is close to having) some expected property. Many properties including algebraic function properties, graph properties, computational geometry properties and regular languages have been proved to be efficiently testable. We refer to, for example, Refs. [8, 15, 17] for surveys on property testing. In this paper, we focus on property testing of group-theoretic properties. An example is testing whether a function $f\colon G\to H$, where $G$ and $H$ are groups, is a homomorphism. It is well known that such a test can be done efficiently [4, 5, 18].

Another kind of group-theoretic problem deals with the case where the input consists of both a finite set $\Gamma$ and a binary operation $\circ$ over it given as an oracle. An algorithm testing associativity of the oracle in time $O(|\Gamma|^2)$ has been constructed by Rajagopalan and Schulman [16], improving the straightforward $O(|\Gamma|^3)$-time algorithm. They also showed that $\Omega(|\Gamma|^2)$ queries are necessary for this task. Ergün et al. [9] have proposed an algorithm using queries testing if $\circ$ is close to associative, and an algorithm using queries testing if $(\Gamma,\circ)$ is close to being both associative and cancellative (i.e., close to the operation of a group). They also showed how these results can be used to check whether the input $(\Gamma,\circ)$ is close to an abelian group with queries. The notion of closeness discussed in Ergün et al.’s work refers to the Hamming distance of multiplication tables, i.e., the number of entries in the multiplication table of $(\Gamma,\circ)$ that have to be modified to obtain a binary operation satisfying the prescribed property.

Friedl et al. [10] have shown that, when considering closeness with respect to the edit distance of multiplication tables instead of the Hamming distance (i.e., by allowing deletion and insertion of rows and columns), there exists an algorithm with query and time complexities polynomial in $\log|\Gamma|$ that tests whether $(\Gamma,\circ)$ is close to an abelian group. An open question is to understand for which other classes of groups such a test can be done efficiently and, on the other hand, whether nontrivial lower bounds can be proved for specific classes of groups.

Notice that the algorithm in Ref. [10] has been obtained by first constructing a simple quantum algorithm that tests in time if an input is close to an abelian group (based on a quantum algorithm by Cheung and Mosca [6] computing efficiently the decomposition of a black-box abelian group on a quantum computer), and then replacing the quantum part by clever classical tests. One can find this surprising since, classically, computing the decomposition of a black-box abelian group is known to be hard [2]. This indicates that, in some cases, new ideas in classical property testing can be derived from a study of quantum testers. One can naturally wonder if all efficient quantum algorithms testing closeness to a given class of groups can be converted into efficient classical testers in a similar way. This question is especially motivated by the fact that Inui and Le Gall [11] have constructed a quantum algorithm with query complexity polynomial in that tests whether is close to a solvable group (note that the class of solvable groups includes all abelian groups), and that their techniques can also be used to test efficiently closeness to several subclasses of abelian groups on a quantum computer, as discussed later.

Our contributions: In this paper we investigate these questions by focusing on subclasses of abelian groups. We show lower and upper bounds on the randomized (i.e., non-quantum) query complexity of testing if the input is close to a cyclic group, and more generally on the randomized query complexity of testing if the input is close to an abelian group generated by elements (i.e., the class of groups of the form where and are positive integers), for any fixed and for both the edit distance and the Hamming distance. We prove in particular that their complexities vary dramatically according to the value of and according to the assumption that the size of is known or not. Table 1 gives an overview of our results.

Our results show that, with respect to the edit distance, testing closeness to subclasses of abelian groups generally requires exponentially more queries than testing closeness to the whole class of abelian groups. We believe that this puts in perspective Friedl et al.’s work [10] and indicates both the strength and the limitations of their results.

The lower bounds we give in Theorems 1 and 2 also prove the first exponential separations between the quantum and randomized query complexities of testing closeness to a class of groups. Indeed, the same arguments as in Ref. [11] easily show that, when the edit distance is considered, testing if the input is close to an abelian group generated by elements can be done using queries on a quantum computer, for any value of and even if is unknown. While this refutes the possibility that all efficient quantum algorithms testing closeness to a given class of groups can be converted into efficient classical testers, this also exhibits a new set of computational problems for which quantum computation can be shown to be strictly more efficient than classical computation.

Relation with other works: While Ivanyos [12] gave heuristic arguments indicating that testing closeness to a group may be hard in general, we are not aware of any (nontrivial) proven lower bounds on the query complexity of testing closeness to a group-theoretic property prior to the present work. Notice that a few strong lower bounds are known for related computational problems, but in different settings. Babai [1] and Babai and Szemerédi [2] showed that computing the order of an elementary abelian group in the black-box setting requires exponential time — this task is indeed one of the sometimes called “abelian obstacles” to efficient computation in black-box groups. Cleve [7] also showed strong lower bounds on the query complexity of order finding (in a model based on hidden permutations rather than on an explicit group-theoretic structure). These results are deeply connected to the subject of the present paper and inspired some of our investigations, but do not give bounds in the property testing setting. The proof techniques we introduce in the present paper are indeed especially tailored for this setting.

Organization of the paper and short description of our techniques: Section 3 deals with the case where is unknown. Our lower bound on the complexity of testing closeness to a cyclic group (Theorem 1) is proven in a way that can informally be described as follows. We introduce two distributions of inputs: one consisting of cyclic groups of the form , and another consisting of groups of the form , where is an unknown prime number chosen in a large enough set of primes. We observe that each group in the latter distribution is far with respect to the edit distance (and thus with respect to the Hamming distance too) from any cyclic group. We then prove that a deterministic algorithm with queries cannot distinguish those distributions with high probability.

Section 4 focuses on testing closeness to the class of groups generated by elements, and proves Theorem 2 in a similar way. For example, when is a fixed odd integer, we introduce two distributions consisting of groups isomorphic to and to , respectively. Notice that and have the same size. While is generated by elements, we observe that is far from any group generated by elements. We then show that any deterministic algorithm with queries cannot distinguish those distributions with high probability, even if (and thus ) is known.

Section 5 is devoted to constructing an efficient tester for testing closeness to cyclic groups when the size of the ground set is known. The idea behind the tester we propose is that, when the size of the ground set is given, we know that if is a cyclic group, then it is isomorphic to the group . We then take a random element of and define the map by for any (here the powers are defined carefully to take into consideration the case where the operation is not associative). If is a cyclic group, then is a generating element with non negligible probability, in which case the map will be a group isomorphism. Our algorithm will first test if the map is close to a homomorphism, and then perform additional tests to check that behaves correctly on any proper subgroup of .

## 2 Definitions

Let $\Gamma$ be a finite set and $\circ\colon\Gamma\times\Gamma\to\Gamma$ be a binary operation on it. Such a couple $(\Gamma,\circ)$ is called a magma. We first define the Hamming distance between two magmas over the same ground set.

###### Definition 1.

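Let $(\Gamma,\circ)$ and $(\Gamma,\ast)$ be two magmas over the same ground set $\Gamma$. The Hamming distance between $\circ$ and $\ast$, denoted $\mathrm{Ham}_\Gamma(\circ,\ast)$, is

$$\mathrm{Ham}_\Gamma(\circ,\ast)=\bigl|\{(x,y)\in\Gamma\times\Gamma \,:\, x\circ y\neq x\ast y\}\bigr|.$$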
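As a small illustration of Definition 1, the following sketch counts the entries on which two operations over the same ground set disagree. The function name `hamming_distance` and the two example operations are chosen here for illustration only and do not come from the paper.

```python
from itertools import product


def hamming_distance(ground, op1, op2):
    """Number of pairs (x, y) on which the two binary operations differ."""
    elements = list(ground)
    return sum(1 for x, y in product(elements, repeat=2) if op1(x, y) != op2(x, y))


def add_mod4(x, y):
    # Addition in the cyclic group Z_4.
    return (x + y) % 4


def xor(x, y):
    # Bitwise XOR, i.e. the group Z_2 x Z_2 on the same four labels.
    return x ^ y


distance = hamming_distance(range(4), add_mod4, xor)
```

The two tables differ exactly on the pairs where both arguments are odd, so the count is small compared with the 16 entries of the table; being far from a class in the sense used later means that this count is at least a constant fraction of $|\Gamma|^2$ for every group in the class.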

We now define the edit distance between tables. A table of size $n$ is a function $T\colon\Pi\times\Pi\to\Pi$ where $\Pi$ is an arbitrary subset of $\mathbb{N}$ (the set of natural numbers) of size $n$. We consider three operations to transform a table into another. An exchange operation replaces, for two elements $a,b\in\Pi$, the value $T(a,b)$ by an arbitrary element of $\Pi$. Its cost is one. An insert operation on $T$ adds a new element $a\in\mathbb{N}\setminus\Pi$: the new table is the extension of $T$ to the domain $(\Pi\cup\{a\})\times(\Pi\cup\{a\})$, giving a table of size $n+1$ where the new values of the function are set arbitrarily. Its cost is $2n+1$. A delete operation on $T$ removes an element $a\in\Pi$: the new table is the restriction of $T$ to the domain $(\Pi\setminus\{a\})\times(\Pi\setminus\{a\})$, giving a table of size $n-1$. Its cost is $2n-1$. The edit distance between two tables $T$ and $T'$ is the minimum cost needed to transform $T$ into $T'$ by the above exchange, insert and delete operations.

A multiplication table for a magma $(\Gamma,\circ)$ is a table $T\colon\Pi\times\Pi\to\Pi$ of size $|\Gamma|$ for which the values are in one-to-one correspondence with elements in $\Gamma$, i.e., there exists a bijection $\sigma\colon\Pi\to\Gamma$ such that $T(a,b)=\sigma^{-1}(\sigma(a)\circ\sigma(b))$ for any $a,b\in\Pi$. We now define the edit distance between two magmas, which will enable us to compare magmas with distinct ground sets, and especially magmas with ground sets of different sizes. This is the same definition as the one used in Ref. [10].

###### Definition 2.

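The edit distance between two magmas $(\Gamma,\circ)$ and $(\Gamma',\ast)$, denoted $\mathrm{edit}((\Gamma,\circ),(\Gamma',\ast))$, is the minimum edit distance between $T$ and $T'$ where $T$ (resp. $T'$) runs over all tables corresponding to a multiplication table for $(\Gamma,\circ)$ (resp. $(\Gamma',\ast)$).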

We now explain the concept of distance to a class of groups.

###### Definition 3.

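Let $\mathcal{C}$ be a class of groups and $(\Gamma,\circ)$ be a magma. We say that $(\Gamma,\circ)$ is $\delta$-far from $\mathcal{C}$ with respect to the Hamming distance if

$$\min_{\substack{\ast\colon\Gamma\times\Gamma\to\Gamma\\ (\Gamma,\ast)\text{ is a group in }\mathcal{C}}} \mathrm{Ham}_\Gamma(\circ,\ast)\ \ge\ \delta|\Gamma|^{2}.$$

We say that $(\Gamma,\circ)$ is $\delta$-far from $\mathcal{C}$ with respect to the edit distance if

$$\min_{\substack{(\Gamma',\ast)\\ (\Gamma',\ast)\text{ is a group in }\mathcal{C}}} \mathrm{edit}\bigl((\Gamma,\circ),(\Gamma',\ast)\bigr)\ \ge\ \delta|\Gamma|^{2}.$$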

Notice that if a magma $(\Gamma,\circ)$ is $\delta$-far from a class of groups $\mathcal{C}$ with respect to the edit distance, then $(\Gamma,\circ)$ is $\delta$-far from $\mathcal{C}$ with respect to the Hamming distance. The converse is obviously false in general.

Since some of our results assume that the size of is not known, we cannot suppose that the set is given explicitly. Instead we suppose that an upper bound of the size of is given, and that each element in is represented uniquely by a binary string of length . One oracle is available that generates a string representing a random element of , and another oracle is available that computes a string representing the product of two elements of . We call this representation a binary structure for . This is essentially the same model as the one used in Ref. [10, 11] and in the black-box group literature (see, e.g., Ref. [2]). The formal definition follows.

###### Definition 4.

A binary structure for a magma is a triple such that is an integer satisfying , and are two oracles satisfying the following conditions:

• there exists an injective map from to ;

• the oracle chooses an element uniformly at random and outputs the (unique) string such that .

• on two strings in the set , the oracle takes the (unique) element such that and outputs . (The action of on strings in is arbitrary.)
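The oracle model of Definition 4 can be mimicked in a few lines. The sketch below is only a toy model under stated assumptions: the class name `BinaryStructure`, the method names, the parameter `q` (taken here to be an upper bound on the size of the ground set) and the choice of string length are all illustrative and not taken from the paper.

```python
import random


class BinaryStructure:
    """Toy binary structure: two oracles hiding a magma behind bit strings."""

    def __init__(self, elements, op, q, seed=None):
        self._rng = random.Random(seed)
        self._elements = list(elements)
        if len(self._elements) > q:
            raise ValueError("q must be an upper bound on the ground set size")
        # Assumed string length: enough bits for q distinct labels (at least 1).
        self.length = max(1, (q - 1).bit_length())
        # Hidden injective map from elements to bit strings, chosen at random.
        codes = self._rng.sample(range(2 ** self.length), len(self._elements))
        self._encode = {
            e: format(c, "0{}b".format(self.length))
            for e, c in zip(self._elements, codes)
        }
        self._decode = {s: e for e, s in self._encode.items()}
        self._op = op

    def random_element(self):
        """First oracle: the string of a uniformly random element."""
        return self._encode[self._rng.choice(self._elements)]

    def multiply(self, s1, s2):
        """Second oracle: the string of the product of the two encoded elements."""
        if s1 not in self._decode or s2 not in self._decode:
            # Behaviour on strings outside the image is arbitrary; return zeros.
            return "0" * self.length
        return self._encode[self._op(self._decode[s1], self._decode[s2])]


# Example: the cyclic group Z_5 hidden behind 3-bit strings.
z5 = BinaryStructure(range(5), lambda x, y: (x + y) % 5, q=8, seed=1)
s = z5.random_element()
t = z5.multiply(s, s)
```

A tester only ever sees strings such as `s` and `t`; the dictionary behind the oracles plays the role of the hidden injective map.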

We now give the formal definition of an -tester.

###### Definition 5.

Let be a class of groups and let be any value such that . An -tester with respect to the edit distance (resp., to the Hamming distance) for is a randomized algorithm such that, on any binary structure for a magma ,

• outputs “PASS” with probability at least if satisfies property ;

• outputs “FAIL” with probability at least if is -far from with respect to the edit distance (resp., to the Hamming distance).

## 3 A Lower Bound for Testing Cyclic Groups

Suppose that we only know that an input instance satisfies , where is an integer known beforehand. In this section, we show that any randomized algorithm then requires queries to test whether is close to the class of cyclic groups. More precisely, we prove the following result.

###### Theorem 1.

Suppose that the size of the ground set is unknown and suppose that . Then the query complexity of any -tester for the class of cyclic groups, with respect to the Hamming distance or the edit distance, is .

Theorem 1 is proved using Yao’s minimax principle. Specifically, we introduce two distributions of instances and such that every instance in is a cyclic group and every instance in is far from the class of cyclic groups. Then we construct the input distribution as the distribution that takes an instance from with probability and from with probability . If we can show that any deterministic algorithm, given as an input distribution, requires queries to correctly decide whether an input instance is generated by or with high probability under the input distribution, we conclude that any randomized algorithm also requires queries to test whether an input is close to a cyclic group.

We now explain in details the construction of the distribution . Define and let be the set of primes in . From the prime number theorem, we have . We define as the distribution over binary structures for where the prime is chosen uniformly at random from and the injective map hidden behind the oracles is also chosen uniformly at random. We define as a distribution over binary structures for in the same manner. Indeed, the order of any instance generated by those distributions is at most . Every instance in is a cyclic group. From Lemma 1 below, we know that every instance in is -far (with respect to the edit distance, and thus with respect to the Hamming distance too) from the class of cyclic groups. Its proof is included in Appendix.

###### Lemma 1.

Let and be two nonisomorphic groups. Then

In order to complete the proof of Theorem 1, it only remains to show that distinguishing the two distributions and is hard. This is the purpose of the following proposition.

###### Proposition 1.

Any deterministic algorithm that decides with probability larger than whether the input is from the distribution or from the distribution must use queries.

Let us first give a very brief overview of the proof of Proposition 1. We begin by showing how the distributions and described above can equivalently be created by first taking a random sequence of strings, and then using some constructions and , respectively, which are much easier to deal with. In particular, the map in the constructions and is created “on the fly” during the computation using the concept of a reduced decision tree. We then show (in Lemma 2) a -query lower bound for distinguishing and .

###### Proof of Proposition 1.

Let be a deterministic algorithm with query complexity . We suppose that , otherwise there is nothing to do. The algorithm  can be seen as a decision tree of depth . Each internal node in the decision tree corresponds to a query to either or , and each edge from such a node corresponds to an answer for it. The queries to are labelled as , for elements and in . Each answer of a query is a binary string in . Each leaf of the decision tree represents a YES or NO decision (deciding whether the input is from or from , respectively).

Since we want to prove a lower bound on the query complexity of , we can make freely a modification that gives a higher success probability on all inputs (and thus makes the algorithm more powerful). We then suppose that, when goes through an edge corresponding to a string already seen during the computation, then immediately stops and outputs the correct answer. With this modification, reaches a leaf if and only if it did not see the same string twice. We refer to Figure 1(a) for an illustration.

We first consider the slightly simpler case where the algorithm only uses strings obtained from previous oracle calls as the argument of a query to . In other words, we suppose that, whenever an internal node labelled by is reached, then both and necessarily label some edge in the path from the root of the tree to (notice that this is the case for the algorithm of Figure 1(a)). We will discuss at the end of the proof how to deal with the general case where can also query on strings created by itself (e.g., on the all zero string or on strings taken randomly in ).

Let us fix a sequence of distinct strings in . Starting from the root of the decision tree (located at level ), for each internal node located at level , we only keep the outgoing branches labelled by strings , and we call the edge corresponding to an unseen edge (remember that ). This construction gives a subtree of the decision tree rooted at that we call the reduced decision tree associated with . Note that this subtree has exactly one leaf. See Figure 1(b) for an illustration.

Let us fix and let be either or with the group operation denoted additively. We now describe a process, invisible to the algorithm , which constructs, using the sequence , a map defining a binary structure for . The map is constructed “on the fly” during the computation. The algorithm starts from the root and follows the computation through the reduced decision tree associated with . On a node corresponding to a call to , the oracle chooses a random element of the group. If this element has not already appeared, then is fixed to the string of the unseen edge of this node. The oracle outputs this string to the algorithm , while is kept invisible to . If the element has already appeared, then the process immediately stops — this is coherent with our convention that stops whenever the same string is seen twice. On a node corresponding to a call to , the elements and such that and have necessarily been already obtained at a previous step from our assumption. If the element  has not already appeared, then is fixed to the string of the unseen edge of this node. Otherwise the process stops. By repeating this, the part of the map related to the computation (i.e., the correspondence between elements and strings for all the elements appearing in the computation) is completely defined by and by the elements chosen by the oracle . If necessary, the map can then be completed. On the example of Figure 1(b), if the input is and chooses the element 3, then the path followed is the path starting from the root labelled by which defines , , and .
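The "on the fly" labelling can be pictured with the following sketch. It is a simplification under stated assumptions: the hidden group is taken to be the additive group of integers modulo `order`, queries are given as a fixed list rather than as a decision tree, and the function name `run_on_the_fly` and the query encoding are hypothetical.

```python
import random


def run_on_the_fly(order, queries, sequence, seed=None):
    """Answer queries on Z_order while assigning strings to elements lazily.

    queries is a list of ("rand",) or ("mult", i, j), where i and j index
    the elements revealed by earlier queries. Returns (labels, stopped_early).
    """
    rng = random.Random(seed)
    fresh = iter(sequence)   # strings handed out in the fixed order
    label = {}               # element -> string, filled in as elements appear
    seen = []                # element revealed at each step
    for query in queries:
        if query[0] == "rand":
            element = rng.randrange(order)
        else:
            _, i, j = query
            element = (seen[i] + seen[j]) % order
        if element in label:
            # The same string would be seen twice: the process stops here.
            return label, True
        label[element] = next(fresh)
        seen.append(element)
    return label, False


labels, stopped = run_on_the_fly(
    7, [("rand",), ("mult", 0, 0), ("mult", 0, 1)], ["a", "b", "c"], seed=0
)
```

Only the part of the map touched by the computation is ever defined, which is why the strings themselves carry no information until a repetition occurs.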

For a fixed sequence , let (resp. ) be the “on the fly” construction for (resp. ) obtained by first choosing uniformly at random from , and then defining while running the algorithm, as detailed above. The distribution (resp. ) coincides with the distribution that takes a sequence of strings in uniformly at random without repetition and then create binary structures using (resp. ). Thus, to prove Proposition 1, it suffices to use the following lemma.

###### Lemma 2.

Let be any fixed sequence of distinct strings in . If decides correctly with probability larger than whether the input has been created using or using , then .

###### Proof of Lemma 2.

Let be the set of nodes in the reduced decision tree associated with , and let (resp., ) be the set of indexes such that is a query to (resp., to ). Notice that . For each index , we set as a random variable representing the element chosen by at node . Here, when generates , and when generates . Since only additions are allowed as operations on the set , the output to a query for can be expressed as where is a linear combination of the variables in . Here all coefficients are non-negative and at least one coefficient must be positive.

We define the function for every . Without loss of generality, we assume that each is a nonzero polynomial (i.e., there exists at least one index such that ). This is because, otherwise, the element (and the string) appearing at node is always the same as the element (and the string) appearing at node , and thus one of the two nodes and can be removed from the decision tree. For any positive integer , we say that is constantly zero modulo if divides for all indexes . We say that a prime is good if there exist such that the function is constantly zero modulo . We say that is bad if, for all , the function is not constantly zero modulo (as shown later, when is bad, it is difficult to distinguish if the input is or ). We denote by the set of good primes.
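To make the notion of "constantly zero modulo" concrete, the sketch below represents each answer as a tuple of integer coefficients over the random variables, forms the pairwise differences, and lists the pairs whose difference has all coefficients divisible by a given modulus. The helper names and the sample coefficient vectors are illustrative assumptions.

```python
from itertools import combinations


def difference(a, b):
    """Coefficient vector of the difference of two linear combinations."""
    return tuple(x - y for x, y in zip(a, b))


def constantly_zero_mod(form, n):
    """True if n divides every coefficient of the linear form."""
    return all(c % n == 0 for c in form)


def collapsing_pairs(answers, n):
    """Pairs (i, j), i < j, whose answers always coincide modulo n."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(answers), 2)
        if constantly_zero_mod(difference(a, b), n)
    ]


# Four answers expressed over two random variables.
answers = [(1, 0), (0, 1), (6, 0), (1, 5)]
pairs_mod_5 = collapsing_pairs(answers, 5)
pairs_mod_7 = collapsing_pairs(answers, 7)
```

With these sample vectors, modulus 5 makes several pairs of answers coincide identically while modulus 7 makes none coincide, which is the distinction the good/bad terminology captures.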

We first suppose that . Let denote the value . Assume the existence of a subset of size such that there exist for which is constantly zero modulo for every . Since all are primes, and is not the zero-polynomial, must have a nonzero coefficient divisible by . To create such a coefficient, we must have Now assume that there exists no such subset . Then, for each , at most primes have the property that is constantly zero modulo . This implies that . Since , it follows that . Thus, for both cases, we have .

Hereafter we suppose that . Assume that the leaf of the reduced decision tree corresponds to a YES decision. Recall that, if the computation does not reach the leaf, always outputs the correct answer. From these observations, we give the following upper bound on the overall success probability:

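$$\frac{r+(1-r)\bigl(\rho_\ell^{Y}\cdot 1+(1-\rho_\ell^{Y})\cdot 1\bigr)}{2}+\frac{r+(1-r)\bigl(\rho_\ell^{N}\cdot 1+(1-\rho_\ell^{N})\cdot 0\bigr)}{2}=\frac{1+r+(1-r)\rho_\ell^{N}}{2},$$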

where is the probability of being good, and (resp., ) is the probability that does not reach the leaf conditioned on the event that the instance is from (resp., from ) and is a bad prime. Since , the above success probability has upper bound . When the leaf of the reduced decision tree corresponds to a NO decision, a similar calculation gives that the overall success probability is at most .

We now give an upper bound on and . Let us fix . Since is bad, each for is not constantly zero modulo . When generates , the probability that becomes after substituting values into is then exactly (since the values of each uniformly distribute over and there is a unique solution in to the equation once all but one values are fixed). By the union bound, the probability thus satisfies . Similarly, when generates , the probability that becomes after substituting values into is also exactly . Thus, the probability also satisfies .
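The counting fact behind this step can be checked by brute force in the simplest setting. The sketch below assumes, for illustration only, that the variables are uniform over the integers modulo a prime `p`; it then confirms that a linear form that is not constantly zero modulo `p` vanishes with probability exactly `1/p`. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def zero_probability(coeffs, p):
    """Exact probability that sum(c_j * x_j) = 0 mod p for uniform x in Z_p^n."""
    hits = sum(
        1
        for xs in product(range(p), repeat=len(coeffs))
        if sum(c * x for c, x in zip(coeffs, xs)) % p == 0
    )
    return Fraction(hits, p ** len(coeffs))


prob_nonzero_form = zero_probability((2, 3), 5)    # not constantly zero mod 5
prob_zero_form = zero_probability((5, 10), 5)      # constantly zero mod 5
```

Summing this probability over all pairs of nodes is the union bound used in the text.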

To achieve overall success probability at least , we must have either or , and thus . ∎

Finally, we briefly explain how to deal with the general case where can make binary strings by itself and use them as arguments to . The difference is that now a string not seen before can appear as an argument to . Basically, what we need to change is the following two points: First, in the “on the fly” construction of from , if such a query appears then an element is taken uniformly at random from the set of elements of the input group not already labelled, and the identification is done. Second, in the proof of Lemma 2, another random variable is introduced to represent the element associated with . With these modifications the same lower bound holds.

This concludes the proof of Proposition 1. ∎

## 4 A Lower Bound for Testing the Number of Generators in a Group

In this section we show that, even if the size of the ground set $\Gamma$ is known, it is hard to test whether $(\Gamma,\circ)$ is close to an abelian group generated by $k$ elements for any value $k\ge 2$. We prove the following theorem using a method similar to the proof of Theorem 1. See the Appendix for details.

###### Theorem 2.

Let be an integer and suppose that . Then the query complexity of any -tester for the class of abelian groups generated by elements is

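$$\begin{cases}\Omega\Bigl(|\Gamma|^{\frac{1}{6}-\frac{2}{6(3k+2)}}\Bigr) & \text{if } k \text{ is even},\\[4pt] \Omega\Bigl(|\Gamma|^{\frac{1}{6}-\frac{4}{6(3k+1)}}\Bigr) & \text{if } k \text{ is odd}.\end{cases}$$

Moreover, these bounds hold with respect to either the Hamming distance or the edit distance, and even when $|\Gamma|$ is known.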

## 5 Testing if the Input is Cyclic when |Γ| is Known

In this section we study the problem of testing, when is known, if the input is a cyclic group or is far from the class of cyclic groups. Let us denote , and suppose that we also know its factorization where the ’s are distinct primes. Let be the cyclic group of integers modulo and, for any , denote by its subgroup of order . The group operation in is denoted additively.

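For any $\gamma\in\Gamma$, we now define a map $f_\gamma\colon\{0,1,\ldots,m-1\}\to\Gamma$ (with $m=|\Gamma|$) such that $f_\gamma(a)$ represents the $a$-th power of $\gamma$. Since the case where $\circ$ is not associative has to be taken into consideration and since we want to evaluate $f_\gamma$ efficiently, this map is defined using the following rules.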

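$$\begin{cases} f_\gamma(1)=\gamma & \\ f_\gamma(a)=\gamma\circ f_\gamma(a-1) & \text{if } 2\le a\le m-1 \text{ and } a \text{ is odd}\\ f_\gamma(a)=f_\gamma(a/2)\circ f_\gamma(a/2) & \text{if } 2\le a\le m-1 \text{ and } a \text{ is even}\\ f_\gamma(0)=\gamma\circ f_\gamma(m-1) & \end{cases}$$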

The value of $f_\gamma(a)$ can then be computed with $O(\log m)$ uses of the operation $\circ$. Notice that if $(\Gamma,\circ)$ is a group, then $f_\gamma(a)=\gamma^{a}$ for any $a\in\{0,1,\ldots,m-1\}$.
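The recursive rules for the power map translate directly into code. The sketch below is a minimal illustration, assuming the operation is given as a Python function `op`, that `m` is the size of the ground set with `m >= 2`, and that arguments lie in `0..m-1`; the name `make_power_map` is hypothetical. Because the bracketing is fixed by the rules, the value is well defined even when `op` is not associative.

```python
from functools import lru_cache


def make_power_map(op, gamma, m):
    """Return f with f(a) defined by the fixed bracketing rules, for 0 <= a < m."""

    @lru_cache(maxsize=None)
    def f(a):
        if a == 1:
            return gamma
        if a == 0:
            # The "m-th power": gamma composed with f(m - 1).
            return op(gamma, f(m - 1))
        if a % 2 == 1:
            return op(gamma, f(a - 1))
        half = f(a // 2)
        return op(half, half)

    return f


# Example in the cyclic group Z_12 written additively, with gamma = 5.
m = 12
f = make_power_map(lambda x, y: (x + y) % m, 5, m)
values = [f(a) for a in range(m)]
```

In a group the result agrees with the usual power (here, the multiple `5 * a` modulo 12), and each evaluation follows a chain of halvings and decrements of logarithmic length.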

For any , our -tester for cyclic groups is denoted and is described in Figure 2. The input is given as a binary structure with . In the description of Figure 2, operations in , such as taking a random element or computing the product of two elements, are implicitly performed by using the oracles and . The correctness of this algorithm and upper bounds on its complexity are shown in the following theorem. A proof is given in Appendix.

###### Theorem 3.

For any value , Algorithm is an -tester for cyclic groups with respect to both the edit distance and the Hamming distance. Its query and time complexities are

## Acknowledgments

The authors are grateful to Gábor Ivanyos for communicating to them Lemma 1 and an outline of its proof. Part of this work was conducted while YY was visiting Rutgers University. FLG acknowledges support from the JSPS, under the grant-in-aid for research activity start-up No. 22800006.

## References

• [1] Babai, L.: Local expansion of vertex-transitive graphs and random generation in finite groups. In: Proc. of STOC 1991. pp. 164–174 (1991)
• [2] Babai, L., Szemerédi, E.: On the complexity of matrix group problems I. In: Proc. of FOCS 1984. pp. 229–240 (1984)
• [3] Bach, E., Shallit, J.: Algorithmic Number Theory, Vol. 1: Efficient Algorithms. The MIT Press (1996)
• [4] Ben-Or, M., Coppersmith, D., Luby, M., Rubinfeld, R.: Non-abelian homomorphism testing, and distributions close to their self-convolutions. In: Proc. of APPROX-RANDOM 2004. LNCS, vol. 3122, pp. 273–285. Springer (2004)
• [5] Blum, M., Luby, M., Rubinfeld, R.: Self-testing/correcting with applications to numerical problems. J. Comput. Syst. Sci. 47(3), 549–595 (1993)
• [6] Cheung, K., Mosca, M.: Decomposing finite abelian groups. Quantum Information and Computation 1(3), 26–32 (2001)
• [7] Cleve, R.: The query complexity of order-finding. Inf. Comput. 192(2), 162–171 (2004)
• [8] Czumaj, A., Sohler, C.: Survey on sublinear-time algorithms. Bulletin of the EATCS 89, 23–47 (2006)
• [9] Ergün, F., Kannan, S., Kumar, R., Rubinfeld, R., Viswanathan, M.: Spot-checkers. J. Comput. Syst. Sci. 60(3), 717–751 (2000)
• [10] Friedl, K., Ivanyos, G., Santha, M.: Efficient testing of groups. In: Proc. of STOC 2005. pp. 157–166 (2005)
• [11] Inui, Y., Le Gall, F.: Quantum property testing of group solvability. Algorithmica 59(1), 35–47 (2011)
• [12] Ivanyos, G.: Classical and quantum algorithms for algebraic problems. Thesis for the degree “Doctor of the Hungarian Academy of Sciences” (2007)
• [13] Ivanyos, G.: Personal communication (2010)
• [14] Ivanyos, G., Le Gall, F., Yoshida, Y.: On the distance between non-isomorphic groups. Preprint available at http://arxiv.org/abs/1107.0133 (2011)
• [15] Kiwi, M.A., Magniez, F., Santha, M.: Exact and approximate testing/correcting of algebraic functions: A survey. In: Proc. of STACS 2002. LNCS, vol. 2292, pp. 30–83 (2002)
• [16] Rajagopalan, S., Schulman, L.J.: Verification of identities. SIAM J. Comput. 29(4), 1155–1163 (2000)
• [17] Ron, D.: Property testing. In: Handbook of Randomized Computing, pp. 597–649. Kluwer Academic Publishers (2001)
• [18] Shpilka, A., Wigderson, A.: Derandomizing homomorphism testing in general groups. SIAM J. Comput. 36(4), 1215–1230 (2006)

## Appendix

### A. Proof of Lemma 1

The idea of this proof has been communicated to us by Ivanyos [13]. Work on other aspects of the distance between non-isomorphic groups has subsequently been the subject of a joint paper [14].

We will use the following lemma, which is a weak version of Corollary 1 in Ref. [14].

###### Lemma 3.

Let and be two groups such that . If is not isomorphic to a subgroup of , then

$$\Pr_{x,y\in G}\bigl[\gamma(x\circ y)=\gamma(x)\ast\gamma(y)\bigr]\le\frac{7}{9}|G|^{2}$$

for any injective map .

We now present our proof of Lemma 1.

###### Proof of Lemma 1.

We assume without loss of generality that and prove the lemma by contraposition. Namely, we show that and are isomorphic if .

Suppose that , where . Let and be multiplication tables of and , respectively, such that the edit distance between and is at most . Here, and are subsets of of size and , respectively. Let and be the bijections associated with and , respectively.

First notice that . Otherwise, at least elements should be added to to obtain the table , which would cost at least

$$\sum_{i=1}^{\delta|H|}\bigl(2|H|-2i+1\bigr)=2\delta|H|^{2}-\delta|H|\bigl(\delta|H|+1\bigr)+\delta|H|=\delta(2-\delta)|H|^{2}>\delta|H|^{2}$$

operations.
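For readers who want the intermediate step, the closed form is just the sum of an arithmetic progression. Writing $n=\delta|H|$ and $N=|H|$ (shorthand introduced here only for readability, and assuming $0<\delta<1$):

```latex
\sum_{i=1}^{n}\bigl(2N-2i+1\bigr)
  = 2Nn - n(n+1) + n
  = n(2N-n),
\qquad\text{and with } n=\delta N:\quad
n(2N-n) = \delta(2-\delta)N^{2} > \delta N^{2}.
```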

We now consider the transition from to through the process of computing the edit distance. Observe that the number of removed elements through the transition is at most , otherwise it would cost more than

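$$\sum_{i=1}^{\delta|G|}\bigl(2|G|-2i+1\bigr)=2\delta|G|^{2}-\delta|G|\bigl(\delta|G|+1\bigr)+\delta|G|=\delta(2-\delta)|G|^{2}\ge\delta(2-\delta)(1-\delta)^{2}|H|^{2}>\delta|H|^{2}$$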

operations. Let be the set of elements that are not removed in the transition and define . From the argument above, we have .

We define a map as follows. For , . For , we choose so that becomes an injective map (this is possible since ). Suppose that, for two elements , the element is in . Also, suppose that the value was not modified in the transition, i.e., . In this case,

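$$\sigma_H^{-1}\bigl(f(x)\ast f(y)\bigr)=T_H\bigl(\sigma_H^{-1}(f(x)),\sigma_H^{-1}(f(y))\bigr)=T_H\bigl(\sigma_G^{-1}(x),\sigma_G^{-1}(y)\bigr)=T_G\bigl(\sigma_G^{-1}(x),\sigma_G^{-1}(y)\bigr)=\sigma_G^{-1}(x\circ y).$$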

Thus, we have . Since the number of exchange operations done to the table is at most , by the union bound we obtain

$$\Pr_{x,y\in G}\bigl[f(x\circ y)=f(x)\ast f(y)\bigr]\ge 1-3\delta-\frac{\delta}{(1-\delta)^{2}}\ge 1-5\delta.$$

Thus, since , Lemma 3 implies that the group is isomorphic to a subgroup of . If is isomorphic to a proper subgroup of , then , which contradicts the fact that . Thus, is indeed isomorphic to . ∎

### B. Proof of Theorem 2

To show the lower bound, we use Yao’s minimax principle as in the proof of Theorem 1. We introduce two distributions and such that every instance in is generated by elements while every instance in is far from abelian groups generated by elements. Moreover, all instances in and have the same order. Then we construct the input distribution as the distribution that takes an instance from with probability and from with probability . By showing that any deterministic algorithm requires many queries to distinguish them, we obtain the desired result.

We first consider the case where is even. Let be a fixed integer and denote . For any fixed (and known) prime , we define as the distribution over binary structures for the group where the injective map hidden behind the group oracles is chosen uniformly at random. We define as the uniform distribution over binary structures for in the same manner. The order of every instance in and is . Every instance in has generators while every instance in needs at least elements to be generated. Moreover, from Lemma 1, every instance in is -far from groups of generators. The part of Theorem 2 for even then follows from the following proposition.

###### Proposition 2.

Any deterministic algorithm that decides with probability larger than whether the input is from the distribution or from the distribution must use queries.

###### Proof.

Let us consider the decision tree associated with a deterministic algorithm using queries. As in Section 3, we rely on the fact that the distribution of instances generated by can be created through a more convenient “on the fly” construction of using a random sequence of strings. We suppose hereafter that is fixed and denote by (resp., ) the associated construction of positive (resp., negative) instances. We assume again that, when goes through an edge corresponding to a string already seen during the computation, then immediately stops and outputs the correct answer (this modification only improves the ability of ).

We denote again by the set of nodes in the reduced decision tree associated with , and by (resp., ) the set of indexes  such that is a query to (resp., ). Notice that . For each , we set as a random variable representing the element obtained by performing a query to . The answer to a query for can be expressed as where is a linear combination of the variables . We define the function for every . Remember that, for any positive integer , we say that is constantly zero modulo if divides for all indexes . Note that we can suppose without loss of generality that for all indexes the function is not constantly zero modulo (otherwise it would give no useful information since for any element in an instance created by or ).

Suppose that the leaf of the reduced decision tree associated with corresponds to a YES decision. The success probability of the algorithm for this fixed sequence is at most

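$$\frac{1}{2}\bigl(\rho_\ell^{Y}\cdot 1+(1-\rho_\ell^{Y})\cdot 1\bigr)+\frac{1}{2}\bigl(\rho_\ell^{N}\cdot 1+(1-\rho_\ell^{N})\cdot 0\bigr)=\frac{1}{2}\bigl(1+\rho_\ell^{N}\bigr),$$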

where (resp., ) is the probability that does not reach the leaf conditioned on the event that the instance is from (resp., from ). When the leaf of the reduced decision tree corresponds to a NO decision, a similar calculation gives that the success probability is at most . Notice that and are the probabilities that the same string is seen twice during the computation. We will now show that, when the instance is created by either or , the inequality

$$\Pr_{\{\alpha_j\}_{j\in T}}\Bigl[\exists\, i\neq i'\in S \text{ such that } \sum_{j\in T}k^{ii'}_{j}\alpha_j=0\Bigr]\le\frac{t(t-1)}{2\cdot p^{\,r-1}}$$

holds. This implies that and then the algorithm cannot distinguish from with probability at least 2/3 unless