Generation modulo the action of a permutation group

# Generating tuples of integers modulo the action of a permutation group and applications

## Abstract.

Originally motivated by algebraic invariant theory, we present an algorithm to enumerate integer vectors modulo the action of a permutation group. This problem generalizes the generation of unlabeled graphs up to isomorphism. In this paper, we present the full development of a generation engine by describing the related theory, establishing its theoretical and practical complexity, and presenting some benchmarks. We then show two applications, to effective invariant theory and to effective Galois theory.

Initially motivated by algebraic invariant theory, we present an algorithmic strategy to enumerate integer vectors modulo the action of a permutation group. This problem generalizes the enumeration of unlabeled graphs. In this article, we develop a complete enumeration engine by explaining the underlying theory, we establish practical and theoretical complexity bounds, and we present some benchmarks. We then detail two theoretical applications, in effective invariant theory and in effective Galois theory.

###### Key words and phrases:
Generation up to an Isomorphism, Enumerative Combinatorics, Computational Invariant Theory, Effective Galois Theory

## 1. Introduction

Let $G$ be a group of permutations, that is, a subgroup of some symmetric group $S_n$. Several problems in effective Galois theory (see [Gir87, Abd00]), computational commutative algebra (see [FR09, BT11, Bor11]) and the generation of unlabeled species of structures with repetitions rely on the following computational building block.

Let $\mathbb{N}$ be the set of non-negative integers. An integer vector of length $n$ is an element of $\mathbb{N}^n$. The symmetric group $S_n$ acts by position on integer vectors in $\mathbb{N}^n$: for a permutation $\sigma \in S_n$ and an integer vector $(v_1, \dots, v_n)$,

$$\sigma.(v_1, \dots, v_n) := (v_{\sigma^{-1}(1)}, \dots, v_{\sigma^{-1}(n)}).$$

This action coincides with the usual action of $S_n$ on monomials in the multivariate polynomial ring $K[x_1, \dots, x_n]$, with $K$ a field and $x_1, \dots, x_n$ indeterminates.

###### Problem 1.1.

Let $G \subset S_n$ be a permutation group. Enumerate the integer vectors of length $n$ modulo the action of $G$.

Note that there are infinitely many such vectors; in practice one usually wants to enumerate the vectors with a given sum or content.

For example, Problem 1.1 includes listing the non-negative integer matrices with fixed sum up to permutations of rows or columns, which appear in the theory of multisymmetric functions [Ges87, Mac04] and in the more recent investigations of multidiagonal coinvariants [Ber09, BBT11].

Define the following equivalence relation over elements of $\mathbb{N}^n$: two vectors $u = (a_1, \dots, a_n)$ and $v = (b_1, \dots, b_n)$ are equivalent if there exists a permutation $\sigma \in G$ such that

$$\sigma \cdot u = (a_{\sigma^{-1}(1)}, \dots, a_{\sigma^{-1}(n)}) = (b_1, \dots, b_n) = v.$$

Problem 1.1 consists in enumerating all equivalence classes.

This problem is not well solved in the literature. Some applications use a greedy strategy, searching for and deleting all pairs of vectors such that the second can be obtained from the first. The most famous sub-problem is unlabeled graph generation, which consists in enumerating tuples over $0$ and $1$ of length $\binom{n}{2}$ up to the action of the symmetric group acting on pairs of nodes. This example has a very efficient implementation in Nauty, which is able to enumerate all graphs over a small number of nodes.

The algorithms presented in this paper have been implemented, optimized, and intensively tested in Sage [S09]; most features are integrated in Sage since release 4.7 (2011-05-26, ticket #6812, 1303 lines of code including documentation).

## 2. Orderly generation and tree structure over integer vectors

The orderly strategy consists in setting a total order on objects before quotienting by the equivalence relation. This allows us to define a single representative per orbit. Using the lexicographic order on integer vectors, we call a vector $v$ canonical under the action of $G$, or just canonical, if $v$ is maximal in its orbit under $G$ for the lexicographic order:

$$v \text{ is canonical} \iff v = \max\nolimits_{\mathrm{lex}} \{ \sigma \cdot v \mid \sigma \in G \}.$$
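
The definition can be checked directly on small examples. The following is a minimal brute-force sketch in plain Python, not the Sage implementation: a group is assumed to be given as an explicit list of permutations, each written as a 0-based tuple of images, and the names `act` and `is_canonical_bruteforce` are illustrative only.

```python
def act(sigma, v):
    """Return sigma.v, where (sigma.v)[sigma(i)] = v[i] (0-based positions)."""
    w = [None] * len(v)
    for i, image in enumerate(sigma):
        w[image] = v[i]          # the entry at position i moves to position sigma(i)
    return tuple(w)


def is_canonical_bruteforce(group, v):
    """True when v is the lexicographic maximum of its orbit under `group`."""
    v = tuple(v)
    return all(act(sigma, v) <= v for sigma in group)


# Illustrative group: the cyclic group of order 3 acting on 3 positions.
C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

print(is_canonical_bruteforce(C3, (2, 1, 0)))
print(is_canonical_bruteforce(C3, (1, 2, 0)))
```

This test visits every element of the group, which is precisely the cost that the tree structure and the strong generating system discussed in this section aim to reduce.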

Since the goal is to avoid systematically testing whether vectors are canonical, we use a tree structure on the objects, in which we obtain properties relating canonical vectors. Any result relating fathers, sons and the property of being canonical in the tree may allow us to skip some canonical tests.

### 2.1. Tree Structure over integer vectors

Let $(0, \dots, 0)$ be the vector called the root. We build a tree with the following function father.

###### Definition 2.1.

Let $(a_1, \dots, a_n)$ be a tuple of integers of length $n$ which is not the root. Let $i$ be the position of the last non-zero entry of $(a_1, \dots, a_n)$. We define the father of $(a_1, \dots, a_n)$ by

$$\operatorname{father}(a_1, a_2, \dots, a_i, 0, 0, \dots, 0) := (a_1, a_2, \dots, a_i - 1, 0, 0, \dots, 0).$$

For any integer vector $v = (v_1, \dots, v_n)$, we can go back to the root in $v_1 + \dots + v_n$ steps, since each application of father decreases the sum of the entries by one. The corresponding map giving the children of an integer vector is thus:

###### Definition 2.2.

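Let $(a_1, \dots, a_n)$ be a tuple of integers of length $n$. Let $i$ be the position of the last non-zero entry of $(a_1, \dots, a_n)$ ($i = 1$ if all entries are zero). The set of children of $(a_1, \dots, a_n)$ is obtained as:

$$\operatorname{children}(a_1, \dots, a_i, 0, \dots, 0) := \{ (a_1, \dots, a_i, 0, \dots, 0) + e_j \mid i \leqslant j \leqslant n \},$$

where $e_j$ denotes the vector whose only non-zero entry is a $1$ in position $j$.

###### Proposition 2.3.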

For any permutation group $G \subset S_n$ and any integer vector $v$, if $v$ is not canonical under $G$, then no child of $v$ is canonical. Therefore, the canonical vectors form a "prefix tree" in the tree of all integer vectors.

Sketch of proof: When a father is not canonical, there exists a permutation such that the permuted vector is greater. Applying the same permutation to any child shows that the child cannot be canonical either.
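
The tree structure and Proposition 2.3 can be illustrated on a small case. The sketch below is plain Python with 0-based positions; the helper names and the choice of the cyclic group of order 3 are assumptions made for illustration, and the canonical test is a brute-force one.

```python
from itertools import product


def father(v):
    """Decrease the last non-zero entry by one (not defined for the root)."""
    v = list(v)
    i = max(k for k, a in enumerate(v) if a != 0)   # last non-zero position
    v[i] -= 1
    return tuple(v)


def children(v):
    """Increase by one any single entry at or after the last non-zero position."""
    v = tuple(v)
    nonzero = [k for k, a in enumerate(v) if a != 0]
    i = nonzero[-1] if nonzero else 0               # the root starts at position 0
    return [v[:j] + (v[j] + 1,) + v[j + 1:] for j in range(i, len(v))]


def is_canonical(group, v):
    """Brute-force test: v is the lexicographic maximum of its orbit."""
    n = len(v)
    return all(tuple(v[g[j]] for j in range(n)) <= tuple(v) for g in group)


C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # illustrative cyclic group of order 3

# Check on a small box of vectors that a non-canonical father has no canonical child.
violations = [(v, c)
              for v in product(range(3), repeat=3)
              for c in children(v)
              if not is_canonical(C3, v) and is_canonical(C3, c)]
print(violations)
```

Each child has a unique father, so walking the tree from the root never produces the same vector twice.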

Figure 1 displays integer vectors of length whose sum is at most and shows the tree relations between them. Choosing the cyclic group of order and using the generation strategy, underlined integer vectors are tested but are recognized to be not canonical. Using Proposition 2.3, crossed-out integer vectors are not tested as they cannot be canonical as children of non canonical vectors.

Our strategy now consists in making a breadth-first search over the sub-tree of canonical vectors. This is done lazily using Python iterators.
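
A lazy breadth-first traversal of this kind can be sketched with a Python generator. This is a simplified illustration under stated assumptions, not the code integrated in Sage: the canonical test is brute force, the group is an explicit list of 0-based permutation tuples, and the function names are hypothetical.

```python
from itertools import islice


def is_canonical(group, v):
    """Brute-force test: v is the lexicographic maximum of its orbit."""
    n = len(v)
    return all(tuple(v[g[j]] for j in range(n)) <= v for g in group)


def children(v):
    """Increase by one any single entry at or after the last non-zero position."""
    nonzero = [k for k, a in enumerate(v) if a != 0]
    i = nonzero[-1] if nonzero else 0
    return [v[:j] + (v[j] + 1,) + v[j + 1:] for j in range(i, len(v))]


def canonical_vectors(group, n):
    """Lazily yield the canonical vectors of length n, degree by degree."""
    level = [(0,) * n]                    # the root is always canonical
    while True:
        yield from level
        # Only children of canonical vectors are ever tested.
        level = [c for v in level for c in children(v) if is_canonical(group, c)]


C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # illustrative cyclic group of order 3
first = list(islice(canonical_vectors(C3, 3), 8))
print(first)
```

Because the generator is infinite, the caller decides how much to consume, for instance with `islice` or by stopping at a given degree.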

### 2.2. Testing whether an integer vector is canonical

As we have seen, the fundamental operation for orderly generation is to test whether an integer vector is canonical; it is thus vital to optimize this operation. To this end, we use the workhorse of computational group theory for permutation groups: stabilizer chains and strong generating sets.

As required by applications, we want to test massively whether vectors are canonical. For this reason, we use a strong generating system of the group $G$, which can be computed in almost linear time [Ser03] using GAP [GAP97].

Let $n$ be a positive integer and $G \subset S_n$ a permutation group. Recall that its stabilizer chain is $G = G_0 \supseteq G_1 \supseteq \dots \supseteq G_n = \{e\}$, where

$$\forall i,\ 1 \leqslant i \leqslant n : \quad G_i := \{ g \in G \mid \forall j \leqslant i : g(j) = j \}.$$

From this chain, we build a strong generating system $T = \{T_1, \dots, T_n\}$, where $T_i$ is a transversal of $G_{i-1}/G_i$. This set of strong generators is particularly well adapted to the partial lexicographic order, as stabilizers are defined with positions from left to right.
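
For very small groups, the stabilizer chain and one transversal per level can be computed naively by listing all elements. The sketch below does this in plain Python; it is only an illustration (the paper relies on GAP, which uses far more efficient algorithms), and the function names are hypothetical.

```python
def generate_group(generators):
    """Close a list of permutations (0-based tuples of images) under composition."""
    n = len(generators[0])
    identity = tuple(range(n))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = tuple(s[g[i]] for i in range(n))     # h = s o g
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def transversals(elements):
    """Return one list of coset representatives per level of the stabilizer chain."""
    n = len(next(iter(elements)))
    stabilizer, chain = set(elements), []
    for i in range(n):
        # Inside the current stabilizer, a left coset of the next stabilizer
        # is determined by the image of position i.
        reps = {}
        for g in sorted(stabilizer):
            reps.setdefault(g[i], g)
        chain.append(list(reps.values()))
        stabilizer = {g for g in stabilizer if g[i] == i}
    return chain


# Illustrative group: the dihedral group of the square, from a rotation and a reflection.
D4 = generate_group([(1, 2, 3, 0), (0, 3, 2, 1)])
print(len(D4), [len(t) for t in transversals(D4)])
```

The product of the transversal sizes equals the order of the group, which gives a quick sanity check of the chain.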

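Let $i$ and $n$ be two positive integers such that $i \leqslant n$. For $v$ and $w$ two integer vectors of length $n$, let us define the following binary relations

$$v <_i w \iff (v_1, \dots, v_i) <_{\mathrm{lex}} (w_1, \dots, w_i), \qquad v \leqslant_i w \iff (v_1, \dots, v_i) \leqslant_{\mathrm{lex}} (w_1, \dots, w_i),$$

where $<_{\mathrm{lex}}$ and $\leqslant_{\mathrm{lex}}$ represent the regular strict and non-strict lexicographic comparisons.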

Algorithm 1 is a natural extension of McKay’s canonical graph labeling algorithm as it is explained in [HR09].

Algorithm 1 takes advantage of partial lexicographic orders and of the strong generating system of the group $G$. It tries to explore only a small part of the orbit of the vector $v$; the worst-case complexity of this step is bounded by the size of the orbit, and not by $|G|$. In this sense, it does take into account the automorphism group of the vector $v$.

###### Proposition 2.4.

Let $n$ be a positive integer and $G$ a subgroup of $S_n$. Let $v$ be an integer vector of length $n$. Algorithm 1 returns True if $v$ is canonical under the action of $G$ and returns False otherwise.

Sketch of proof: It is based on the properties of a strong generating system.
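
Algorithm 1 itself is not reproduced in this text. As an illustration of how a chain of transversals allows the test to stop early, here is one possible sketch in plain Python; it should be read as an assumption about the general technique rather than as the author's exact algorithm. Transversals are computed naively from an explicit list of group elements, and all names are hypothetical.

```python
def dihedral_group(n):
    """Rotations i -> i + k and reflections i -> k - i (mod n), as 0-based tuples."""
    rotations = [tuple((i + k) % n for i in range(n)) for k in range(n)]
    reflections = [tuple((k - i) % n for i in range(n)) for k in range(n)]
    return rotations + reflections


def transversals(elements):
    """One list of coset representatives per level of the stabilizer chain."""
    n = len(elements[0])
    stabilizer, chain = list(elements), []
    for i in range(n):
        reps = {}
        for g in stabilizer:
            reps.setdefault(g[i], g)       # a coset is determined by the image of i
        chain.append(list(reps.values()))
        stabilizer = [g for g in stabilizer if g[i] == i]
    return chain


def is_canonical_sgs(chain, v):
    """Return False as soon as an image of v beats v on a prefix, True otherwise."""
    v = tuple(v)
    to_analyse = {v}                       # images agreeing with v on the first i entries
    for i, transversal in enumerate(chain):
        survivors = set()
        for w in to_analyse:
            for s in transversal:
                # s fixes positions 0..i-1, so only position i needs comparing.
                child = tuple(w[s[j]] for j in range(len(v)))
                if child[i] > v[i]:
                    return False
                if child[i] == v[i]:
                    survivors.add(child)
        to_analyse = survivors
    return True


D4 = dihedral_group(4)
chain = transversals(D4)
print(is_canonical_sgs(chain, (1, 1, 0, 0)), is_canonical_sgs(chain, (1, 0, 0, 1)))
```

Images that are already smaller than the tested vector on a prefix are dropped, so the working set only contains the part of the orbit that can still decide the answer.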

## 3. Complexity

### 3.1. Theoretical complexity

#### Efficiency of the tree structure

Let $n$ be a positive integer and $G \subset S_n$ a permutation group. For any non-negative integer $d$, let $C(d)$ (resp. $\overline{C}(d)$) be the number of canonical (resp. non-canonical) integer vectors of degree $d$. Based on the tree structure presented in Section 2.1, let $T(d)$ (resp. $\overline{T}(d)$) be the number of tested (resp. non-tested) integer vectors of degree $d$.

###### Proposition 3.1.

Generating all canonical integer vectors up to degree $d$ using the generation strategy presented in Section 2 presents an absolute error bounded by $\overline{C}(d)$. Equivalently, in terms of the series, we have

$$\sum_{i=0}^{d} T(i) - \sum_{i=0}^{d} C(i) \leqslant \overline{C}(d).$$

Sketch of proof: Using Proposition 2.3, we get this bound by noticing that two tested but non-canonical vectors cannot be in a paternity relation.

This absolute error is not very explicit (that is, not directly usable), but it can be used to get a relative error at the price of a rough approximation.

###### Corollary 3.2.

Let and be two positive integers and a permutation group. Generating all canonical monomials under the action of up to degree using the generation strategy presented in Section 2 presents a relative error bounded by .

Sketch of proof: We use the previous proposition with the fact that any integer vector has at least one child but no more than children (the generation root is the only one having children).

The bound is optimal for trivial groups (), and seems to be better as the permutation group is of small cardinality. This relative error becomes better as we go up along the degree and tends to become optimal when the degree goes to infinity.

#### Complexity of testing if a vector is canonical

We now investigate the complexity of Algorithm 1. We need first to select a reasonable statistic to collect, which will define the complexity of this algorithm.

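The explosion appearing in the algorithm is conditioned by the size of the set of vectors that remain to be analysed. For an integer vector $v$ and a strong generating system of a permutation group $G$, when $i$ runs over $\{1, \dots, n\}$ in the main loop, this set contains at step $i$ the images of $v$, under products of elements of the first $i$ transversals, that still have to be compared with $v$.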

The right statistic to record is the size of the union of these sets over all steps at which the algorithm is still running: this corresponds to the part of the orbit explored by the algorithm. This statistic appears to be very difficult to evaluate theoretically. However, collecting it with a computer is a simple task.

#### Parallelization and memory complexity

Let us note that this generation engine is trivially amenable to parallelism: one can devote the study of each branch to a different processor. Our implementation uses a small framework, SearchForest, co-developed by the author, for exploration trees and map-reduce operations on them. To get a parallel implementation, it is sufficient to use the drop-in parallel replacement for SearchForest under development by Jean-Baptiste Priez and Florent Hivert.

The memory complexity of the generation engine is reasonable, bounded by the size of the answer. Indeed, we keep in the cache only the Canonical vectors of degree when we search for those in degree . In case one wants to only iterate through the elements of a given degree , then this can be achieved with memory complexity .

### 3.2. Benchmarks design

To benchmark our implementation, we chose the following problem as test-case.

###### Problem 3.3.

Let be a positive integer and a permutation group. Iterate through all the canonical integer vectors under the staircase (i.e. ).

A vector of length is said to be under the staircase when it satisfies .

This problem contains essentially all the difficulties that can appear. The family of integer vectors under the staircase contains vectors with trivial automorphism group as well as vectors with a lot of symmetries. Applications also need to deal with this problem, as the corresponding family of monomials plays a crucial role in algebra.

#### Benchmarks for transitive permutation groups

We now need a good family of permutation groups, representative of the practical use cases. We chose to use the database of all transitive groups of degree  [Hul05] available in Sage through the system GAP [GAP97].

The benchmarks have been run on an off-the-shelf 2.40 GHz dual core Mac Book laptop running Ubuntu 12.4 and Sage version 5.3.

### 3.3. Benchmarks

#### Tree Structure over integer vectors

This first benchmark investigates the efficiency of the tree structure presented in Section 2.1. As we do not test children of non-canonical integer vectors, we want to measure the proportion of tested non-canonical vectors (which corresponds to the useless part of the computation). For that, we solve Problem 3.3 for each group of the database and we collect the following information.

This table displays the statistics for transitive groups of degree . Database Id. is the integer indexing the group, and Index in are respectively the cardinality and the index of the group in the symmetric group . Canonicals denotes the number of canonical vectors under the staircase and number of tests is the number of times the algorithm testing if an integer vector is canonical is called.

From this information, we define the quantity Err as follows:

$$\mathrm{Err} := \frac{\text{number of tests} - \text{Canonicals}}{\text{Canonicals}}.$$

The following figure shows depending on the index . The figure contains crosses, one for each transitive group over at most variables. We use a logarithmic scale on the x axis.

#### Empirical complexity of testing if a vector is canonical

Algorithm 1 needs to explore a part of the orbit of the tested integer vectors. The following table displays for each transitive group over variables, the number of elements of all orbits of tested vectors solving Problem 3.3 compared to the total number of integer vectors explored.

Now we define Ratio to be the average part of the orbit that needs to be explored to decide whether an integer vector is canonical:

$$\mathrm{Ratio} := \frac{\text{total explored}}{\text{total orbits}}.$$

The following figure plots in terms of for transitive groups on at most variables.

#### Overall empirical complexity of the generation engine

We now evaluate the overall complexity by comparing the ratio between the computations and the size of the output. We define the measure Complexity as follows:

$$\mathrm{Complexity} := \frac{\text{total explored}}{\text{Canonicals}}.$$

The following graph displays Complexity in terms of the size of the group for transitive Groups on up to variables (and excluding the alternate and symmetric group of degree ).

The dashed line has as equation . Therefore, we get the following empirical overall complexity:

$$\text{Computations} = O\big(\ln(|G|) \times \text{Output size}\big).$$

#### Tests around the unlabeled graph generation problem

Although the generation engine is not optimized for the unlabeled graph generation problem, we can apply our strategy on it.

Fix $n$, and consider the set $E$ of pairs of elements of $\{1, \dots, n\}$. The symmetric group $S_n$ acts on pairs by $\sigma \cdot \{i, j\} := \{\sigma(i), \sigma(j)\}$ for $\sigma \in S_n$ and $\{i, j\} \in E$. Let $G$ be the induced group of permutations of $E$. A labeled graph can be identified with an integer vector indexed by $E$ with parts in $\{0, 1\}$. Then, two graphs are isomorphic if and only if the corresponding vectors are in the same $G$-orbit.

Now, one just needs to know these permutation groups acting on pairs of integers. In the following example, we retrieve the number of graphs on $n$ unlabeled nodes, which for small values of $n$ is given by: 1, 1, 2, 4, 11, 34, 156, 1044, 12346, 274668, 12005168, …

```
sage: L = [TransitiveGroup(1,1), TransitiveGroup(3,2),
....:      TransitiveGroup(6,6), TransitiveGroup(10,12), TransitiveGroup(15,28),
....:      TransitiveGroup(21,38), TransitiveGroup(28,502)]
sage: [IntegerVectorsModPermutationGroup(G,max_part=1).cardinality() for G in L]
[2, 4, 11, 34, 156, 1044, 12346]
```
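
The same counts can be cross-checked for very small numbers of nodes without the transitive groups database. The sketch below, in plain Python, builds the permutations of pairs induced by every permutation of the nodes and counts vectors that are maximal in their orbit by brute force; the function names are illustrative, and this approach is only practical for a handful of nodes.

```python
from itertools import combinations, permutations, product


def induced_group_on_pairs(n):
    """Permutations of the pairs {i, j} induced by every permutation of n nodes."""
    pairs = list(combinations(range(n), 2))
    index = {p: k for k, p in enumerate(pairs)}
    return [tuple(index[tuple(sorted((s[i], s[j])))] for i, j in pairs)
            for s in permutations(range(n))]


def count_orbits(group, length, max_part):
    """Count vectors with entries in 0..max_part that are maximal in their orbit."""
    count = 0
    for v in product(range(max_part + 1), repeat=length):
        if all(tuple(v[g[k]] for k in range(length)) <= v for g in group):
            count += 1
    return count


for n in range(2, 5):
    group = induced_group_on_pairs(n)
    print(n, count_orbits(group, n * (n - 1) // 2, max_part=1))
```

Raising `max_part` in this sketch bounds the multiplicity of each edge instead of forbidding multiple edges.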

Notice that our generation engine generalizes the graph generation problem in two directions. Removing the option max_part, one enumerates multigraphs (graphs with multiple edges between nodes). On the other hand, graphs correspond to special cases of permutation groups. From an algebraic point of view, we view graphs as monomials whose exponents are $0$ or $1$, canonical for the action of the symmetric group on pairs of nodes.

## 4. Computing the invariants ring of a permutation group

Let us explain how the generation engine from Section 2 is plugged into effective invariant theory (see [DK02] and [Kin07]).

A well-known tool to build an invariant polynomial under the action of a permutation group $G$ is the Reynolds operator $R$. From any polynomial $P$ in $n$ variables $x = (x_1, \dots, x_n)$, the invariant $R(P)$ is

$$R(P) := \frac{1}{|G|} \sum_{\sigma \in G} \sigma \cdot P,$$

where $\sigma \cdot P$ is the polynomial built from $P$ in which $\sigma$ has permuted by position the tuple of variables $(x_1, \dots, x_n)$. Formally, for any $\sigma \in G$,

$$(\sigma \cdot P)(x_1, x_2, \dots, x_n) := P(x_{\sigma^{-1}(1)}, x_{\sigma^{-1}(2)}, \dots, x_{\sigma^{-1}(n)}).$$

For large groups, the Reynolds operator is not very convenient to build invariant polynomials. If $x^v = x_1^{v_1} \cdots x_n^{v_n}$ is a monomial, where $v = (v_1, \dots, v_n)$, the minimal invariant one can build in number of terms is the orbit sum of $x^v$.
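
To make the difference between the two constructions concrete, here is a small sketch in plain Python. Polynomials are represented as dictionaries mapping exponent vectors to coefficients, permutations are 0-based tuples of images, and the function names are illustrative assumptions rather than any library's API.

```python
from fractions import Fraction


def act(sigma, P):
    """sigma . P: the variable in position i is replaced by x_{sigma^{-1}(i)},
    so the exponent of x_j in the image of a monomial x^v is v[sigma(j)]."""
    n = len(sigma)
    return {tuple(v[sigma[j]] for j in range(n)): c for v, c in P.items()}


def reynolds(group, P):
    """Average of sigma . P over all sigma in the group."""
    result = {}
    for sigma in group:
        for v, c in act(sigma, P).items():
            result[v] = result.get(v, 0) + Fraction(c) / len(group)
    return {v: c for v, c in result.items() if c != 0}


def orbit_sum(group, v):
    """Sum of the distinct monomials in the orbit of x^v, each with coefficient 1."""
    return {w: 1 for sigma in group for w in act(sigma, {tuple(v): 1})}


C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # illustrative cyclic group of order 3
print(reynolds(C3, {(2, 1, 0): 1}))
print(orbit_sum(C3, (2, 1, 0)))
```

The Reynolds operator always loops over the whole group, whereas the orbit sum only needs the distinct images of the monomial.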

Let $K$ be a field. We denote by $K[x]^G$ the ring formed by all polynomials invariant under the action of $G$:

$$K[x]^G := \{ P \in K[x] \mid \forall \sigma \in G : \sigma \cdot P = P \}.$$

For any subgroup $G$ of $S_n$ and $K$ a field of characteristic $0$, a result due to Hilbert and Noether states that the ring of invariants $K[x]^G$ is a free module of rank $n!/|G|$ over the symmetric polynomials in the variables $x$. Computing the invariant ring consists essentially in algorithmically building an explicit family (called secondary invariant polynomials) of generators of this free module.

Searching for the secondary invariant polynomials among orbit sums of monomials whose vector of exponents is canonical (instead of all monomials) produces a complexity gain of $|G|$ if we assume that all orbits are of cardinality $|G|$. This assumption is obviously false; however, in practice, it seems to hold on average and up to a constant factor [Bor11].

In [BT11], the authors calculate the secondary invariants of the transitive group over variables whose cardinality is . Using the canonicals monomials, they managed to build a family of irreducible secondary invariants deploying a set of secondary invariants. This computation is unreachable by Gröbner basis techniques.

## 5. Computing primitive invariants for a permutation group

### 5.1. Introduction

We now apply our generation strategy to this problem concerning effective Galois theory.

###### Problem 5.1.

Let $n$ be a positive integer and $G$ a permutation group, subgroup of $S_n$. Let $K$ be a field and $x = (x_1, \dots, x_n)$ be formal variables. Find a polynomial $P \in K[x]$ such that

$$\{ \sigma \in S_n \mid \sigma \cdot P = P \} = G.$$

Such a polynomial is called a primitive invariant for $G$.
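
For a small number of variables, the defining condition can be verified by brute force over the whole symmetric group. The following sketch is only a checker for a given candidate, not a method to find primitive invariants; polynomials are dictionaries from exponent vectors to coefficients, and the function names are hypothetical.

```python
from itertools import permutations


def permute_polynomial(sigma, P):
    """Image of P (dict {exponent tuple: coefficient}) under a permutation of positions."""
    n = len(sigma)
    return {tuple(v[sigma[j]] for j in range(n)): c for v, c in P.items()}


def stabilizer_in_symmetric_group(P, n):
    """All permutations of n positions (0-based tuples) leaving P unchanged."""
    return {s for s in permutations(range(n)) if permute_polynomial(s, P) == P}


def is_primitive_invariant(group, P, n):
    """True when the stabilizer of P in the symmetric group is exactly `group`."""
    return stabilizer_in_symmetric_group(P, n) == set(group)


C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # illustrative cyclic group of order 3

# x1^2 x2 + x2^2 x3 + x3^2 x1 is fixed by the rotations but not by transpositions.
P = {(2, 1, 0): 1, (0, 2, 1): 1, (1, 0, 2): 1}
# x1 + x2 + x3 is invariant under C3 but also under every other permutation.
Q = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}

print(is_primitive_invariant(C3, P, 3), is_primitive_invariant(C3, Q, 3))
```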

Problem 5.1 (exposed in [Gir87] and [Abd00]) consists in finding an invariant under the action of $G$ such that its stabilizer in $S_n$ is equal to $G$ itself. Solving this problem becomes difficult when we want to construct a primitive invariant of minimal degree or a primitive invariant with a minimal number of terms.

### 5.3. Benchmarks

Algorithm 2 terminates in less than an hour for any subgroup of . Even, it can calculate some primitive invariants for a lot of subgroups with degree between and while the literature only provides examples up to degree or . Using the same computer, this benchmark just collects the average time in seconds of execution of Algorithm 2 by executing systematically the algorithm on transitive groups of degree .

We would like to thank Nicolas M. Thiéry, Simon A. King, Karl-Dieter Crisman and Dmitri V. Pasechnik for useful comments about implementation details, review of code and Cython optimizations.

This research was driven by computer exploration using the open-source mathematical software Sage [S09]. In particular, we perused its algebraic combinatorics features developed by the Sage-Combinat community [SCc08], as well as its group theoretical features provided by GAP [GAP97].

### References

1. Ines Abdeljaouad. Théorie des Invariants et Applications à la Théorie de Galois effective. PhD thesis, Université Paris 6, 2000.
2. François Bergeron, Nicolas Borie, and Nicolas M. Thiéry. Deformed diagonal harmonic polynomials for complex reflection groups. In 23rd International Conference on Formal Power Series and Algebraic Combinatorics (FPSAC 2011). 2011.
3. François Bergeron. Algebraic combinatorics and coinvariant spaces. CMS Treatises in Mathematics. Canadian Mathematical Society, Ottawa, ON, 2009.
4. Nicolas Borie. Calcul des invariants des groupes de permutations par transformée de Fourier. PhD thesis, Laboratoire de Mathématiques, Université Paris Sud, 2011.
5. Nicolas Borie and Nicolas M. Thiéry. An evaluation approach to computing invariants rings of permutation groups. In Proceedings of MEGA 2011, March 2011.
6. Harm Derksen and Gregor Kemper. Computational invariant theory. Springer-Verlag, Berlin, 2002.
7. J.C. Faugère and S. Rahmany. Solving systems of polynomial equations with symmetries using SAGBI-Gröbner bases. In ISSAC 2009, 2009.
8. The GAP Group, Lehrstuhl D für Mathematik, RWTH Aachen, Germany and SMCS, U. St. Andrews, Scotland. GAP – Groups, Algorithms, and Programming, 1997.
9. Ira M. Gessel. Enumerative applications of symmetric functions. In Proceedings of the 17-th Séminaire Lotharingien, Publ. I.R.M.A. Strasbourg, page 17, 1987.
10. Kurt Girstmair. On invariant polynomials and their application in field theory. Math. Comp., 48(178):781–797, 1987.
11. Stephen G. Hartke and A. J. Radcliffe. McKay’s canonical graph labeling algorithm. In Communicating mathematics, volume 479, pages 99–111. 2009.
12. Alexander Hulpke. Constructing transitive permutation groups. J. Symbolic Comput., 39(1):1–30, 2005.
13. S.A. King. Fast Computation of Secondary Invariants. Arxiv math/0701270, 2007.
14. Percy A. MacMahon. Combinatory analysis. Vol. I, II. 2004. Reprint of Combinatory analysis. Vol. I, II (1915, 1916).
15. W. A. Stein et al. Sage Mathematics Software (Version 3.3). The Sage Development Team, 2009. http://www.sagemath.org.
16. The Sage-Combinat community. Sage-Combinat: enhancing Sage as a toolbox for computer exploration in algebraic combinatorics, 2008.
17. Ákos Seress. Permutation group algorithms, volume 152 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 2003.