Definability of linear equation systems over groups and rings


Anuj Dawar and Bjarki Holm, University of Cambridge, Computer Laboratory, {anuj.dawar, bjarki.holm}@cl.cam.ac.uk
Erich Grädel and Wied Pakusa, RWTH Aachen University, Mathematical Foundations of Computer Science, {graedel, pakusa}@logic.rwth-aachen.de
Eryk Kopczynski, University of Warsaw, Institute of Informatics, erykk@mimuw.edu.pl
Abstract.

Motivated by the quest for a logic for PTIME and by recent insights that the descriptive complexity of problems from linear algebra is a crucial aspect of this question, we study the solvability of linear equation systems over finite groups and rings from the viewpoint of logical (inter-)definability. All problems that we consider are decidable in polynomial time, but not expressible in fixed-point logic with counting. They also provide natural candidates for a separation of polynomial time from rank logics, which extend fixed-point logics by operators for determining the rank of definable matrices and which are sufficient for solvability problems over fields.

Based on the structure theory of finite rings, we establish logical reductions among various solvability problems. Our results indicate that all solvability problems for linear equation systems that separate fixed-point logic with counting from PTIME can be reduced to solvability over commutative rings. Moreover, we prove closure properties for classes of queries that reduce to solvability over rings, which provides normal forms for logics extended with solvability operators.

We conclude by studying the extent to which fixed-point logic with counting can express problems in linear algebra over finite commutative rings, generalising known results from [12, 20, 8] on the logical definability of linear-algebraic problems over finite fields.

Key words and phrases:
finite model theory, logics with algebraic operators
 The first and third authors were supported by EPSRC grant EP/H026835/1 and the fourth and fifth authors were supported by ESF Research Networking Programme GAMES. The fourth author was also partially supported by the Polish Ministry of Science grant N N206 567840.

1. Introduction

The quest for a logic for PTIME [14, 17] is one of the central open problems in both finite model theory and database theory. Specifically, it asks whether there is a logic in which a class of finite structures is expressible if, and only if, membership in the class is decidable in deterministic polynomial time.

Much of the research in this area has focused on the logic FPC, the extension of inflationary fixed-point logic by counting terms. In fact, FPC has been shown to capture PTIME on many natural classes of structures, including planar graphs and structures of bounded tree-width [16, 17, 19]. Recently, it was shown by Grohe [18] that FPC captures polynomial time on all classes of graphs with excluded minors, a result that generalises most of the previous capturing results. More recently, it has been shown that FPC can express important algorithmic techniques, such as the ellipsoid method for solving linear programs [1].

On the other hand, already in 1992, Cai, Fürer and Immerman [9] constructed a graph query that can be decided in PTIME, but which is not definable in FPC. But while this CFI query, as it is now called, is very elegant and has led to new insights in many different areas, it can hardly be called a natural problem in polynomial time. Therefore, it was often remarked that possibly all natural polynomial-time properties of finite structures could be expressed in FPC. However, this hope was eventually refuted in a strong sense by Atserias, Bulatov and Dawar [4], who proved that the important problem of solvability of linear equation systems (over any finite Abelian group) is not definable in FPC and that, indeed, the CFI query reduces to this problem. This motivates the study of the relationship between finite model theory and linear algebra, and suggests that operators from linear algebra could be a source of new extensions to fixed-point logic, in an attempt to find a logical characterisation of PTIME. In [12], Dawar et al. pursued this direction of study by adding operators for expressing the rank of definable matrices over finite fields to first-order logic and fixed-point logic. They showed that fixed-point logic with rank operators (FPR) can define not only the solvability of linear equation systems over finite fields, but also the CFI query and essentially all other properties that were known to separate FPC from PTIME. However, although FPR is strictly more expressive than FPC, it seems rather unlikely that FPR suffices to capture PTIME on the class of all finite structures.

A natural class of problems that might witness such a separation arises from linear equation systems over finite domains other than fields. Indeed, the results of Atserias, Bulatov and Dawar [4] imply that FPC fails to express the solvability of linear equation systems over any finite ring. On the other hand, it is known that linear equation systems over finite rings can be solved in polynomial time [2], but it is unclear whether any notion of matrix rank is helpful for this purpose. We remark in this context that there are several non-equivalent notions of matrix rank over rings, and for these both the computability in polynomial time and the relationship to linear equation systems remain unclear. Thus, rather than matrix rank, the solvability of linear equation systems could be used directly as a source of operators (in the form of generalised quantifiers) for extending fixed-point logics.
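Since the discussion turns on the fact that solvability over finite rings is decidable in polynomial time, it may help to see one concrete and entirely standard way to decide the prototypical case of systems over $\mathbb{Z}_m$: the system $A \cdot x \equiv b \pmod m$ is solvable exactly when $b$ lies in the integer column lattice of $[A \mid m I]$, which can be tested by gcd-based column elimination. The sketch below only illustrates this fact; it is not the algorithm of [2] nor the reduction studied in this paper, and all function names are ours.

```python
def ext_gcd(a, b):
    # returns (g, s, t) with s*a + t*b == g == gcd(a, b) >= 0
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, s, t = ext_gcd(b, a % b)
    return (g, t, s - (a // b) * t)

def solvable_mod(A, b, m):
    """Decide whether A*x = b (mod m) has a solution, in polynomial time.
    A is a list of integer rows, b a list of integers, m >= 1.
    Idea: the system is solvable iff b lies in the integer column lattice
    of [A | m*I]; bring that matrix into column echelon form by unimodular
    column operations and check b by forward substitution."""
    n = len(A)
    k = len(A[0]) if n else 0
    cols = [[A[i][j] for i in range(n)] for j in range(k)]              # columns of A
    cols += [[m if i == r else 0 for i in range(n)] for r in range(n)]  # columns of m*I
    b = list(b)
    c = 0                                        # position of the next pivot column
    for i in range(n):
        for j in range(c + 1, len(cols)):        # clear row i to the right of column c
            if cols[j][i] != 0:
                a1, a2 = cols[c][i], cols[j][i]
                g, s, t = ext_gcd(a1, a2)
                u, v = a1 // g, a2 // g          # the 2x2 column transform has determinant 1
                new_c = [s * cols[c][r] + t * cols[j][r] for r in range(n)]
                new_j = [-v * cols[c][r] + u * cols[j][r] for r in range(n)]
                cols[c], cols[j] = new_c, new_j
        if cols[c][i] != 0:                      # pivot found: its coefficient is forced
            if b[i] % cols[c][i] != 0:
                return False
            q = b[i] // cols[c][i]
            b = [b[r] - q * cols[c][r] for r in range(n)]
            c += 1
        elif b[i] != 0:                          # no remaining generator touches row i
            return False
    return True

assert solvable_mod([[2, 4]], [2], 8)            # 2x + 4y = 2 (mod 8) is solvable
assert not solvable_mod([[2]], [1], 4)           # 2x = 1 (mod 4) is not
```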

Instead of introducing a host of new logics, with operators for various solvability problems, we set out here to investigate whether these problems are inter-definable. In other words, are they reducible to each other within FPC? Clearly, if they are, then any logic that generalises FPC and can define one of them can also define the others. We thus study relations between solvability problems over (finite) rings, fields and Abelian groups in the context of logical many-to-one and Turing reductions, i.e., interpretations and generalised quantifiers. In this way, we show that solvability both over Abelian groups and over arbitrary (possibly non-commutative) rings reduces to solvability over commutative rings. These results indicate that all solvability problems for linear equation systems that separate FPC from PTIME can be reduced to solvability over commutative rings. We also show that solvability over commutative rings reduces to solvability over local rings, which are the basic building blocks of finite commutative rings. Finally, in the other direction, we show that solvability over rings with a linear order, as well as solvability over local rings whose maximal ideal is generated by a fixed number $k$ of elements, reduce to solvability over cyclic groups. Further, we prove closure properties for classes of queries that reduce to solvability over rings, and establish normal forms for first-order logic extended with operators for solvability over finite fields.

While it is known that solvability of linear equation systems over finite domains is not expressible in fixed-point logic with counting, it has also been observed that this logic can define many other natural problems from linear algebra. For instance, it is known that over finite fields, the inverse of a non-singular matrix and the characteristic polynomial of a square matrix can be defined in FPC [8, 12]. We conclude this paper by studying the extent to which these results can be generalised to finite commutative rings. Specifically, we use the structure theory of finite commutative rings to show that common basic problems in linear algebra over rings reduce to the respective problems over local rings. Furthermore, we show that over rings that split into a direct sum of $k$-generated local rings, the matrix inverse can be defined in FPC. Finally, we show that over the class of Galois rings, which are finite rings that generalise both finite fields and rings of the form $\mathbb{Z}_{p^n}$, there is a formula of FPC which defines the coefficients of the characteristic polynomial of any square matrix. In particular, this shows that the matrix determinant is definable in FPC over such rings.
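As a purely computational aside (and not the definability argument developed in this paper), the fact that matrix inversion over $\mathbb{Z}_{p^k}$ reduces to inversion over the residue field $\mathbb{Z}_p$ can be seen via Newton-Hensel lifting: an inverse modulo $p$ can be lifted to an inverse modulo $p^k$ by the iteration $X \leftarrow X(2I - AX)$. The following sketch, with names of our own choosing, illustrates this.

```python
def mat_mul(A, B, m):
    n, k, l = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) % m for j in range(l)]
            for i in range(n)]

def inverse_mod_prime(A, p):
    """Invert A over the field Z_p by Gauss-Jordan elimination
    (returns None if A is singular mod p)."""
    n = len(A)
    M = [[A[i][j] % p for j in range(n)] + [1 if i == j else 0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] % p), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], p - 2, p)          # field inverse via Fermat's little theorem
        M[col] = [x * inv % p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(M[r][j] - f * M[col][j]) % p for j in range(2 * n)]
    return [row[n:] for row in M]

def inverse_mod_prime_power(A, p, k):
    """Lift an inverse of A mod p to an inverse mod p^k by Newton iteration
    X <- X(2I - AX); A is invertible over Z_{p^k} iff it is invertible mod p."""
    X = inverse_mod_prime(A, p)
    if X is None:
        return None
    n, m = len(A), p
    while m < p ** k:
        m = min(m * m, p ** k)                    # the precision doubles each round
        AX = mat_mul(A, X, m)
        two_I_minus_AX = [[((2 if i == j else 0) - AX[i][j]) % m for j in range(n)]
                          for i in range(n)]
        X = mat_mul(X, two_I_minus_AX, m)
    return X

X = inverse_mod_prime_power([[1, 2], [3, 5]], 2, 3)
assert mat_mul([[1, 2], [3, 5]], X, 8) == [[1, 0], [0, 1]]   # inverse modulo 2^3
```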


2. Background on logic and algebra

Throughout this paper, all structures (and in particular, all algebraic structures such as groups, rings and fields) are assumed to be finite. Furthermore, it is assumed that all groups are Abelian, unless otherwise noted.


2.1. Logic and structures

The logics we consider in this paper include first-order logic (FO) and inflationary fixed-point logic (FP) as well as their extensions by counting terms, which we denote by FOC and FPC, respectively. We also consider the extension of first-order logic with operators for deterministic transitive closure, which we denote by DTC. For details see [13, 14].

A vocabulary $\tau$ is a finite sequence of relation and constant symbols in which every relation symbol $R$ has an arity $\mathrm{ar}(R) \geq 1$. A $\tau$-structure $\mathfrak{A}$ consists of a non-empty set $A$, called the domain of $\mathfrak{A}$, together with relations $R^{\mathfrak{A}} \subseteq A^{\mathrm{ar}(R)}$ and constants $c^{\mathfrak{A}} \in A$ for each relation symbol $R$ and constant symbol $c$ in $\tau$. Given a logic $L$ and a vocabulary $\tau$, we write $L[\tau]$ to denote the set of $\tau$-formulas of $L$. A $\tau$-formula $\varphi(x_1, \ldots, x_k)$ with free variables $x_1, \ldots, x_k$ defines a $k$-ary query that takes any $\tau$-structure $\mathfrak{A}$ to the set $\{\bar a \in A^k : \mathfrak{A} \models \varphi(\bar a)\}$. To evaluate formulas of counting logics like FOC and FPC we associate with each $\tau$-structure $\mathfrak{A}$ the two-sorted extension of $\mathfrak{A}$ obtained by adding as a second sort the standard model of arithmetic $(\mathbb{N}, +, \cdot, 0, 1, \leq)$. We assume that in such logics all variables (including the fixed-point variables) are typed, and we require that quantification over the second sort is bounded by numerical terms in order to guarantee a polynomially bounded range of all quantifiers. To relate the original structure with the second sort we use counting terms of the form $\#x\,\varphi$, which take as value the number of different elements $a \in A$ such that $\mathfrak{A} \models \varphi(a)$. For details see [14, 12].

Interpretations and logical reductions. Consider vocabularies $\sigma$ and $\tau$ and a logic $L$. An $m$-ary $L$-interpretation of $\tau$ in $\sigma$ is a sequence of formulas of $L$ in vocabulary $\sigma$ consisting of: (i) a formula $\delta(\bar x)$; (ii) a formula $\varepsilon(\bar x, \bar y)$; (iii) for each relation symbol $R \in \tau$ of arity $r$, a formula $\varphi_R(\bar x_1, \ldots, \bar x_r)$; and (iv) for each constant symbol $c \in \tau$, a formula $\gamma_c(\bar x)$, where each $\bar x$, $\bar y$ or $\bar x_i$ is an $m$-tuple of free variables. We call $m$ the width of the interpretation. We say that an interpretation $\mathcal{I}$ associates a $\tau$-structure $\mathfrak{B}$ to a $\sigma$-structure $\mathfrak{A}$ if there is a surjective map $h$ from the set of $m$-tuples $\{\bar a \in A^m : \mathfrak{A} \models \delta(\bar a)\}$ to $\mathfrak{B}$ such that:

  • $h(\bar a_1) = h(\bar a_2)$ if, and only if, $\mathfrak{A} \models \varepsilon(\bar a_1, \bar a_2)$;

  • $R^{\mathfrak{B}}(h(\bar a_1), \ldots, h(\bar a_r))$ if, and only if, $\mathfrak{A} \models \varphi_R(\bar a_1, \ldots, \bar a_r)$; and

  • $h(\bar a) = c^{\mathfrak{B}}$ if, and only if, $\mathfrak{A} \models \gamma_c(\bar a)$.
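To make the definition concrete, the following sketch (with invented names; it is not part of the paper) evaluates an interpretation on a small finite structure: the formulas $\delta$, $\varepsilon$ and $\varphi_R$ are supplied as Python predicates, and the interpreted structure is obtained by selecting tuples, factoring by the definable equivalence and evaluating the relation formulas on representatives.

```python
from itertools import product

def apply_interpretation(universe, width, delta, eps, rel_formulas):
    """Evaluate an interpretation on a finite structure.
    delta(t) selects the width-tuples that represent elements, eps(t1, t2) is a
    definable equivalence on them, and rel_formulas maps each new relation
    symbol to (arity, predicate on tuples of width-tuples).  Returns the new
    universe (one representative per class) and the new relations."""
    tuples = [t for t in product(universe, repeat=width) if delta(t)]
    classes = []                                  # quotient by the equivalence eps
    for t in tuples:
        for cls in classes:
            if eps(t, cls[0]):
                cls.append(t)
                break
        else:
            classes.append([t])
    reps = [cls[0] for cls in classes]
    new_rels = {R: {ts for ts in product(reps, repeat=arity) if phi(*ts)}
                for R, (arity, phi) in rel_formulas.items()}
    return reps, new_rels

# Example: interpret the complement graph (width 1, trivial equivalence).
V = {1, 2, 3, 4}
E = {(1, 2), (2, 1), (2, 3), (3, 2)}
reps, rels = apply_interpretation(
    V, 1,
    delta=lambda t: True,
    eps=lambda s, t: s == t,
    rel_formulas={'E': (2, lambda s, t: s != t and (s[0], t[0]) not in E)})
print(sorted(rels['E']))   # edges of the complement graph, as pairs of 1-tuples
```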

Lindström quantifiers and extensions. Let $\sigma = \{R_1, \ldots, R_k\}$ be a vocabulary where each relation symbol $R_i$ has arity $r_i$, and consider a class $\mathcal{K}$ of $\sigma$-structures that is closed under isomorphism.

With $\sigma$ and $\mathcal{K}$ we associate a Lindström quantifier $Q_{\mathcal{K}}$ whose type is the tuple $(r_1, \ldots, r_k)$. For a logic $L$, we define the extension $L(Q_{\mathcal{K}})$ by adding rules for constructing formulas of the kind $\psi = Q_{\mathcal{K}}\, \bar x_1 \ldots \bar x_k\, (\varphi_1, \ldots, \varphi_k)$, where $\varphi_1, \ldots, \varphi_k$ are $L$-formulas and each $\bar x_i$ is a tuple of variables of length $\ell \cdot r_i$ for some fixed $\ell \geq 1$. To define the semantics of this new quantifier we associate with $\psi$ the interpretation $\mathcal{I}_\psi = (\varphi_1, \ldots, \varphi_k)$ of signature $\sigma$ of width $\ell$, and we let $\mathfrak{A} \models \psi$ if $\mathcal{I}_\psi(\mathfrak{A})$ is defined and belongs to $\mathcal{K}$ as a $\sigma$-structure (see [23, 26]). Similarly we can consider the extension $L(\mathcal{Q})$ of $L$ by a collection $\mathcal{Q}$ of Lindström quantifiers: the logic is defined by adding a rule for constructing formulas with $Q_{\mathcal{K}}$, for each $Q_{\mathcal{K}} \in \mathcal{Q}$, and the semantics is defined by considering the semantics for each quantifier $Q_{\mathcal{K}}$ as above. Finally, we write $Q_{\mathcal{K}}^*$ to denote the vectorised sequence of Lindström quantifiers associated with $\mathcal{K}$ (see [11]).

Definition 0.1 (Logical reductions).

Let $\mathcal{C}$ be a class of $\sigma$-structures and $\mathcal{D}$ a class of $\tau$-structures, both closed under isomorphism.

  • $\mathcal{C}$ is said to be $L$-many-to-one reducible to $\mathcal{D}$ ($\mathcal{C} \leq_L \mathcal{D}$) if there is an $L$-interpretation $\mathcal{I}$ of $\tau$ in $\sigma$ such that for every $\sigma$-structure $\mathfrak{A}$ it holds that $\mathfrak{A} \in \mathcal{C}$ if, and only if, $\mathcal{I}(\mathfrak{A}) \in \mathcal{D}$.

  • $\mathcal{C}$ is said to be $L$-Turing reducible to $\mathcal{D}$ ($\mathcal{C} \leq_L^T \mathcal{D}$) if $\mathcal{C}$ is definable in $L(Q_{\mathcal{D}}^*)$.

Note that, as in the case of the usual many-to-one and Turing reductions, whenever a class $\mathcal{C}$ is $L$-many-to-one reducible to a class $\mathcal{D}$, then $\mathcal{C}$ is also $L$-Turing reducible to $\mathcal{D}$.


2.2. Rings and systems of linear equations

We recall some definitions from commutative and linear algebra, assuming that the reader is familiar with basic algebra and group theory (for further details see Atiyah et al. [3]). For $n \geq 1$, we write $\mathbb{Z}_n$ to denote the ring of integers modulo $n$.

Commutative rings. Let $R$ be a commutative ring (with identity $1$). An element $r \in R$ is a unit if $rs = 1$ for some $s \in R$, and we denote by $R^*$ the set of all units. Moreover, we say that $r$ divides $s$ (written $r \mid s$) if $s = rt$ for some $t \in R$. An element $r$ is nilpotent if $r^n = 0$ for some $n \geq 1$, and we call the least such $n$ the nilpotency of $r$. The element $r$ is idempotent if $r^2 = r$. Clearly $0$ and $1$ are idempotent elements, and we say that an idempotent $e$ is non-trivial if $e \notin \{0, 1\}$. Two elements $r, s$ are orthogonal if $rs = 0$.

We say that $R$ is a principal ideal ring if every ideal of $R$ is generated by a single element. An ideal $\mathfrak{m}$ is called maximal if $\mathfrak{m} \neq R$ and there is no ideal $I$ with $\mathfrak{m} \subsetneq I \subsetneq R$. A commutative ring $R$ is local if it contains a unique maximal ideal $\mathfrak{m}$. Rings that are both local and principal ideal rings are called chain rings. For example, all rings $\mathbb{Z}_{p^n}$ for a prime $p$ are chain rings, and so too are all finite fields. More generally, a $k$-generated local ring is a local ring for which the maximal ideal is generated by $k$ elements. See McDonald [24] for further background.

Remark 0.2.

When we speak of a "commutative ring with a linear order", the linear order is in general not assumed to be compatible with the ring operations (in contrast to the notion of ordered rings from algebra).

Systems of linear equations. We consider systems of linear equations over groups and rings whose equations and variables are indexed by arbitrary sets, not necessarily ordered. In the following, if $I$, $J$ and $S$ are finite and non-empty sets, then an $I \times J$ matrix over $S$ is a function $A : I \times J \to S$. An $I$-vector over $S$ is defined similarly as a function $b : I \to S$.

A system of linear equations over an Abelian group $\Gamma = (G, +)$ is a pair $(A, b)$ with $A : I \times J \to \mathbb{Z}$ and $b : I \to G$. By viewing $\Gamma$ as a $\mathbb{Z}$-module (i.e. by defining the natural multiplication between integers and group elements, so that $0 \cdot g = 0$, $1 \cdot g = g$ and $(n+1) \cdot g = n \cdot g + g$), we write $(A, b)$ as a matrix equation $A \cdot x = b$, where $x$ is a $J$-vector of variables that range over $G$. The system is said to be solvable if there exists a solution vector $c : J \to G$ such that $A \cdot c = b$, where we define multiplication of unordered matrices and vectors in the usual way by $(A \cdot c)(i) = \sum_{j \in J} A(i, j) \cdot c(j)$ for all $i \in I$. We represent linear equation systems over groups as finite structures over the vocabulary $\tau_{\mathrm{grp}} \cup \{V, A, b\}$, where $\tau_{\mathrm{grp}}$ denotes the language of groups, $V$ is a unary relation symbol (identifying the elements of the group) and $A$, $b$ are two binary relation symbols.
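The following small sketch (our own illustration, not part of the paper) spells out these semantics: matrices and vectors over unordered index sets are just dictionaries, integer coefficients act on the group by repeated addition, and solvability can, for tiny instances, be checked by brute force.

```python
from itertools import product

def scalar_mul(n, g, add, zero):
    """n·g in an additive Abelian group, for an integer n >= 0."""
    acc = zero
    for _ in range(n):
        acc = add(acc, g)
    return acc

def is_solution(A, b, x, add, zero):
    """Check A·x = b, where A : I×J -> Z and b : I -> G are given as dicts
    over arbitrary (unordered) index sets I and J, and x : J -> G."""
    I = {i for (i, _) in A}
    J = {j for (_, j) in A}
    for i in I:
        lhs = zero
        for j in J:
            lhs = add(lhs, scalar_mul(A.get((i, j), 0), x[j], add, zero))
        if lhs != b.get(i, zero):
            return False
    return True

def solvable_over_group(A, b, elements, add, zero):
    """Brute-force solvability over a (small) finite Abelian group; this only
    illustrates the semantics and is not a polynomial-time procedure."""
    J = sorted({j for (_, j) in A})
    return any(is_solution(A, b, dict(zip(J, vals)), add, zero)
               for vals in product(elements, repeat=len(J)))

# Example over the Klein four-group Z_2 x Z_2:
G = [(0, 0), (0, 1), (1, 0), (1, 1)]
add = lambda g, h: ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)
A = {('e1', 'x'): 1, ('e1', 'y'): 1, ('e2', 'y'): 1}   # x + y = (1,1),  y = (0,1)
b = {'e1': (1, 1), 'e2': (0, 1)}
assert solvable_over_group(A, b, G, add, (0, 0))
```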

Similarly, a system of linear equations over a commutative ring $R$ is a pair $(A, b)$ where $A$ is an $I \times J$ matrix with entries in $R$ and $b$ is an $I$-vector over $R$. As before, we usually write $(A, b)$ as a matrix equation $A \cdot x = b$ and say that the system is solvable if there is a solution vector $c : J \to R$ such that $A \cdot c = b$. In the case that the ring is not commutative, we represent linear systems in the form $A \cdot x + x \cdot B = d$, where $A$ is an $I \times J$ matrix over $R$ and $B$ is a $J \times I$ matrix over $R$, respectively.

We consider three different ways to represent linear systems over rings as relational structures. For simplicity, we just explain the case of linear systems over commutative rings here; the encoding of linear systems over non-commutative rings is analogous. Firstly, we consider the case where the ring is part of the structure. Let $\tau_{\mathrm{les}} = \tau_{\mathrm{ring}} \cup \{V, A, b\}$, where $\tau_{\mathrm{ring}}$ is the language of rings, $V$ is a unary relation symbol (identifying the ring elements), and $A$ and $b$ are ternary and binary relation symbols, respectively. Then a finite $\tau_{\mathrm{les}}$-structure describes a linear equation system $A \cdot x = b$ over the ring $R$ identified by $V$. Secondly, we consider a similar encoding but with the additional assumption that the elements of the ring (but not the equations or variables of the equation system) are linearly ordered. Such systems can be seen as finite structures over the vocabulary $\tau_{\mathrm{les}} \cup \{\leq\}$. Finally, we consider linear equation systems over a fixed ring encoded in the vocabulary: for every ring $R$, we define the vocabulary $\tau_R = \{A_r, b_r : r \in R\}$, where for each $r \in R$ the symbols $A_r$ and $b_r$ are binary and unary, respectively. A finite $\tau_R$-structure describes the linear equation system $A \cdot x = b$ over $R$ where $A(i, j) = r$ if, and only if, $A_r(i, j)$ holds, and similarly for $b$ (assuming that the relations $A_r$ form a partition of the pairs of equation and variable indices and that the relations $b_r$ form a partition of the equation indices).

Finally, we say that two linear equation systems are equivalent if either both systems are solvable or neither of them is.


3. Solvability problems over different algebraic domains

It follows from the work of Atserias, Bulatov and Dawar [4] that fixed-point logic with counting cannot express the solvability of linear equation systems ('solvability problems') over any class of (finite) groups or rings. In this section we study solvability problems over such different algebraic domains in terms of logical reductions. Our main results here show that the solvability problem over Abelian groups (SlvAG) reduces to the corresponding problem over commutative rings (SlvCR) and that the solvability problem over commutative rings which are equipped with a linear order ($\mathrm{SlvCR}^{\leq}$) reduces to the solvability problem over cyclic groups (SlvCycG). Note that over any non-Abelian group, the solvability problem is already NP-complete [15].

Figure 1. Logical reductions between solvability problems over cyclic groups (SlvCycG), Abelian groups (SlvAG), commutative rings (SlvCR), arbitrary rings (SlvR), finite fields (SlvF) and local rings (SlvLR), together with their linearly ordered and $k$-generated variants. Curved arrows denote inclusion of one class in another.

Our methods can be further adapted to show that solvability over arbitrary (that is, not necessarily commutative) rings (SlvR) reduces to SlvCR. We then consider the solvability problem restricted to special classes of commutative rings: local rings (SlvLR) and $k$-generated local rings, which generalise solvability over finite fields (SlvF). The reductions that we establish are illustrated in Figure 1.

In the remainder of this section we describe three of the outlined reductions: from commutative rings equipped with a linear order to cyclic groups, from groups to commutative rings, and finally from general rings to commutative rings. To give the remaining reductions, from commutative rings to local rings and from $k$-generated local rings to commutative linearly ordered rings, we need to delve further into the theory of finite commutative rings, which is the subject of §4.

Let us start by considering the solvability problem over commutative rings that come with a linear order. We want to construct a reduction, definable in fixed-point logic, that translates linear systems over such rings into equivalent linear equation systems over cyclic groups. Hence, if the ring is linearly ordered (and in particular if the ring is fixed), this shows that, up to FP-definability, it suffices to analyse the solvability problem over cyclic groups.

The main idea of the reduction, which is given in full detail in the proof of Theorem 0.3, is as follows: for a ring $R$, we consider a decomposition of the additive group $(R, +)$ into a direct sum of cyclic groups $\langle a_1 \rangle \oplus \cdots \oplus \langle a_k \rangle$ for appropriate elements $a_1, \ldots, a_k \in R$. Then every element of $R$ can uniquely be identified with a $k$-tuple in $\mathbb{Z}_{q_1} \times \cdots \times \mathbb{Z}_{q_k}$, where $q_i$ denotes the order of $a_i$ in $(R, +)$, and furthermore, the addition in $R$ translates to component-wise addition (modulo $q_i$) in $\mathbb{Z}_{q_1} \times \cdots \times \mathbb{Z}_{q_k}$.

Having such a group decomposition at hand, it is natural to treat linear equations component-wise, i.e. to let variables range over the cyclic summands and to split each equation into a set of equations accordingly. In general, however, in contrast to the ring addition, the ring multiplication will not be compatible with such a decomposition of the group $(R, +)$. Moreover, an expression of the ring elements with respect to a decomposition of $(R, +)$ has to be definable in fixed-point logic. To guarantee this last point, we make use of the given linear ordering.

Theorem 0.3.

$\mathrm{SlvCR}^{\leq} \leq_{\mathrm{FP}} \mathrm{SlvCycG}$; that is, the solvability problem over linearly ordered commutative rings FP-reduces to the solvability problem over cyclic groups.

Proof.

Consider a system of linear equations $A \cdot x = b$ over a commutative ring $R$ of characteristic $q$ and let $\leq$ be a linear order on $R$. In the following we describe a mapping that translates the system into a system of equations over the cyclic group $\mathbb{Z}_q$ which is solvable if, and only if, $A \cdot x = b$ has a solution over $R$. Observe that the group $\mathbb{Z}_q$ can easily be interpreted in the ring $R$ in fixed-point logic, for instance as the subgroup of $(R, +)$ generated by the multiplicative identity. Indeed, for the purpose of the following construction we could also identify $\mathbb{Z}_q$ with the cyclic group generated by any element which has maximal order in $(R, +)$.

Let $a_1, \ldots, a_k \in R$ be a (minimal) generating set for the additive group $(R, +)$ and let $q_i$ denote the order of $a_i$ in $(R, +)$. Moreover, let us choose the set of generators such that $(R, +) = \langle a_1 \rangle \oplus \cdots \oplus \langle a_k \rangle$. From now on, we identify the group $\langle a_i \rangle$ generated by $a_i$ with the group $\mathbb{Z}_{q_i}$ and thus have $(R, +) \cong \mathbb{Z}_{q_1} \oplus \cdots \oplus \mathbb{Z}_{q_k}$. In this way we obtain a unique representation for each element $r \in R$ as $r = r_1 a_1 + \cdots + r_k a_k$ where $r_i \in \mathbb{Z}_{q_i}$. Similarly, we can identify a variable $x$ ranging over $R$ with a tuple $(x_1, \ldots, x_k)$ where $x_i$ ranges over $\mathbb{Z}_{q_i}$.

To translate a linear equation over $R$ into an equivalent set of equations over $\mathbb{Z}_q$, the crucial step is to consider the multiplication of a coefficient $c \in R$ with a variable $x$ with respect to the chosen representation, i.e. the formal expression $c \cdot x = (\sum_i c_i a_i) \cdot (\sum_j x_j a_j)$. We observe that the ring multiplication is uniquely determined by the products of all pairs of generators, so we let $a_i \cdot a_j = \sum_{\ell} m_{ij}^{\ell}\, a_\ell$, where $m_{ij}^{\ell} \in \mathbb{Z}_{q_\ell}$ for $1 \leq i, j, \ell \leq k$.

Now, let us reconsider the formal expression $c \cdot x$ from above, where $c = \sum_i c_i a_i$ with $c_i \in \mathbb{Z}_{q_i}$. Then we have

$c \cdot x = \big(\sum_i c_i a_i\big) \cdot \big(\sum_j x_j a_j\big) = \sum_{i,j} c_i x_j\, (a_i \cdot a_j) = \sum_{\ell} \big(\sum_{i,j} c_i x_j\, m_{ij}^{\ell}\big)\, a_\ell.$

Here, the coefficient of the generator $a_\ell$ in the last expression is an element of $\mathbb{Z}_{q_\ell}$, which in turn means that we have to reduce all summands modulo $q_\ell$. To see that this transformation is sound, choose arbitrary integers $\hat c_i$ and $\hat x_j$ with $\hat c_i \equiv c_i \pmod{q_i}$ and $\hat x_j \equiv x_j \pmod{q_j}$. Since $q_i a_i = 0$ it holds that $q_i (a_i \cdot a_j) = 0$ for all $j$, and by the uniqueness of the representation we conclude that $q_i\, m_{ij}^{\ell} \equiv 0 \pmod{q_\ell}$ for all $j, \ell$ (and, symmetrically, $q_j\, m_{ij}^{\ell} \equiv 0 \pmod{q_\ell}$). Hence the value of $\sum_{i,j} \hat c_i \hat x_j\, m_{ij}^{\ell}$ modulo $q_\ell$ does not depend on the choice of the representatives. Finally, since $q_\ell$ divides $q$ for all $\ell$, we can uniformly consider all terms as taking values in $\mathbb{Z}_q$ first, and then reduce the results modulo $q_\ell$ afterwards.

For notational convenience, let us set $d_j^{\ell} := \sum_i c_i\, m_{ij}^{\ell}$; then we can write $c \cdot x = \sum_{\ell} \big(\sum_j d_j^{\ell}\, x_j\big)\, a_\ell$. Note that the remaining multiplications between variables and coefficients are just multiplications in $\mathbb{Z}_q$.

However, for our translation we face a problem, since we cannot express the requirement that $x_j$ ranges over $\mathbb{Z}_{q_j}$ as a linear equation over $\mathbb{Z}_q$. To overcome this obstacle, let us first drop this requirement completely, i.e. let us consider the multiplication $c \cdot x$ in the form given above, lifted to the group $\mathbb{Z}_q$, so that every $x_j$ ranges over $\mathbb{Z}_q$. For every $\ell$, let $\pi_\ell : \mathbb{Z}_q \to \mathbb{Z}_{q_\ell}$ denote the natural group epimorphism, i.e. reduction modulo $q_\ell$. We claim that for all $j, \ell$ and all $z, z' \in \mathbb{Z}_q$ with $\pi_j(z) = \pi_j(z')$ we have $d_j^{\ell}\, z \equiv d_j^{\ell}\, z' \pmod{q_\ell}$. Together with the fact that each $\pi_\ell$ is a group homomorphism from $\mathbb{Z}_q$ to $\mathbb{Z}_{q_\ell}$, this justifies doing all calculations in $\mathbb{Z}_q$ first and reducing the result to $\mathbb{Z}_{q_\ell}$ via $\pi_\ell$ afterwards: a solution with $x_j \in \mathbb{Z}_{q_j}$ yields a solution with $x_j \in \mathbb{Z}_q$ by choosing representatives, and conversely a solution with $x_j \in \mathbb{Z}_q$ yields a solution with $x_j \in \mathbb{Z}_{q_j}$ by applying $\pi_j$. To prove the claim, it suffices to show that $q_j\, d_j^{\ell} \equiv 0 \pmod{q_\ell}$ for all $j, \ell$. Since $q_j a_j = 0$ we have $q_j (a_i \cdot a_j) = 0$, and hence, by the uniqueness of the representation with respect to $a_1, \ldots, a_k$, it follows that $q_j\, m_{ij}^{\ell} \equiv 0 \pmod{q_\ell}$ for all $i, \ell$. As $d_j^{\ell} = \sum_i c_i\, m_{ij}^{\ell}$, this yields $q_j\, d_j^{\ell} \equiv 0 \pmod{q_\ell}$, which proves the claim.

We are now prepared to give the final reduction. In a first step we substitute each variable $x$ by a tuple of variables $(x_1, \ldots, x_k)$ where each $x_j$ takes values in $\mathbb{Z}_q$. We then translate all terms in the equations of the original system according to the above explanations and split each equation into a set of $k$ equations according to the decomposition of $(R, +)$. We finally have to address the issue that the $\ell$-th of these equations is an equation modulo $q_\ell$, whereas we really want to introduce a set of equations over $\mathbb{Z}_q$. However, this problem can be solved easily, as for every $\ell$ a linear equation $t \equiv s \pmod{q_\ell}$ over $\mathbb{Z}_{q_\ell}$ is equivalent to the linear equation $(q / q_\ell)\, t \equiv (q / q_\ell)\, s \pmod{q}$ over $\mathbb{Z}_q$.

Hence, altogether we obtain a system of linear equations over $\mathbb{Z}_q$ which is solvable if, and only if, the original system has a solution over $R$.

We proceed to explain that the mapping can be expressed in fixed-point logic. Here, we crucially rely on the given order on $R$ to fix a set of generators. More specifically, as we can compute a suitable set of generators in time polynomial in $|R|$, it follows from the Immerman-Vardi theorem [21, 28] that there is an FP-formula which defines, in every linearly ordered commutative ring $R$, a set of generators $a_1, \ldots, a_k$ with $(R, +) = \langle a_1 \rangle \oplus \cdots \oplus \langle a_k \rangle$. Having fixed a set of generators, it is obvious that the map taking each $r \in R$ to its representation $(r_1, \ldots, r_k)$ is FP-definable. Furthermore, the map yielding the coefficients $m_{ij}^{\ell}$ can easily be formalised in FP, since these coefficients are obtained by performing a polynomially bounded number of ring operations. Splitting the original system of equations component-wise into $k$ systems of linear equations, multiplying them with a coefficient and combining them again into a single system over $\mathbb{Z}_q$ is straightforward.

Finally, we note that a linear system over the ring $\mathbb{Z}_q$ can be reduced to an equivalent system over the cyclic group $\mathbb{Z}_q$ by rewriting each term $c \cdot x$ with $c \in \mathbb{Z}_q$ as the $c$-fold sum $x + \cdots + x$.
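As a sanity check of the construction (ours, not part of the original text), the following sketch carries out the component-wise translation for the four-element ring $\mathbb{Z}_2[t]/(t^2)$, whose additive group decomposes as $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ with basis $1, t$. Since all additive orders coincide with the characteristic here, the final rescaling by $q/q_\ell$ is trivial; the sketch then verifies by brute force that an equation $c \cdot x = b$ is solvable over the ring exactly when its translation is solvable over $\mathbb{Z}_2$.

```python
from itertools import product

# The ring R = Z_2[t]/(t^2): an element (r1, r2) stands for r1 + r2*t,
# with basis a1 = 1, a2 = t for the additive group, both of order 2.
ELEMS = [(r1, r2) for r1 in range(2) for r2 in range(2)]
Q = [2, 2]                                   # additive orders q_1, q_2 of the basis
q = 2                                        # characteristic of R

def mul(r, s):                               # (r1 + r2 t)(s1 + s2 t) with t^2 = 0
    return ((r[0] * s[0]) % 2, (r[0] * s[1] + r[1] * s[0]) % 2)

# structure constants: a_i * a_j = sum_l m[i][j][l] * a_l
BASIS = [(1, 0), (0, 1)]
m = [[mul(a, b) for b in BASIS] for a in BASIS]

def translate(c):
    """Coefficients d[l][j] of the Z_q-equations replacing the term c*x."""
    return [[sum(c[i] * m[i][j][l] for i in range(2)) % Q[l] for j in range(2)]
            for l in range(2)]

def solvable_R(c, b):                        # c*x = b over R, by brute force
    return any(mul(c, x) == b for x in ELEMS)

def solvable_Zq(c, b):                       # the translated system over Z_q
    d = translate(c)
    return any(all(sum(d[l][j] * x[j] for j in range(2)) % Q[l] == b[l]
                   for l in range(2))
               for x in product(range(q), repeat=2))

# the two solvability predicates agree for every coefficient and right-hand side
assert all(solvable_R(c, b) == solvable_Zq(c, b) for c in ELEMS for b in ELEMS)
```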

Note that in the proof we crucially rely on the fact that we are given a linear order on the ring $R$, in order to be able to fix a set of generators of the Abelian group $(R, +)$.

So far, we have shown that the solvability problem over linearly ordered commutative rings can be reduced to the solvability problem over groups. This raises the question whether a translation in the other direction is also possible; that is, whether we can reduce the solvability problem over groups to the solvability problem over commutative rings. Essentially, such a reduction requires a logical interpretation of a commutative ring in a group, which is what we describe in the proof of the following theorem.

Theorem 0.4.

$\mathrm{SlvAG} \leq_{\mathrm{DTC}} \mathrm{SlvCR}$; that is, the solvability problem over Abelian groups DTC-reduces to the solvability problem over commutative rings.

Proof.

Let $A \cdot x = b$ be a system of linear equations over an Abelian group $\Gamma = (G, +)$, where $A : I \times J \to \mathbb{Z}$ and $b : I \to G$. For the reduction, we first construct a commutative ring $R$ from $\Gamma$ and then lift $A \cdot x = b$ to a system of equations over $R$ which is solvable over $R$ if, and only if, $A \cdot x = b$ is solvable over $\Gamma$.

We consider $\Gamma$ as a $\mathbb{Z}$-module in the usual way and write $n \cdot g$ for the multiplication of group elements by integers. Let $d$ be the least common multiple of the orders of all group elements. Then $\mathrm{ord}(g)$ divides $d$ for all $g \in G$, where $\mathrm{ord}(g)$ denotes the order of $g$. This allows us to obtain from the $\mathbb{Z}$-module structure a well-defined multiplication of group elements by elements of $\mathbb{Z}_d$ which commutes with the group addition. We write $\oplus$ and $\odot$ for addition and multiplication in $\mathbb{Z}_d$, where $0$ and $1$ denote the additive and multiplicative identities, respectively. We now consider the set $R := \mathbb{Z}_d \times G$ as a group, with component-wise addition defined by $(m, g) + (n, h) := (m \oplus n, g + h)$ for all $m, n \in \mathbb{Z}_d$ and $g, h \in G$, and identity element $(0, 0)$. We endow $R$ with a multiplication which is defined as $(m, g) \cdot (n, h) := (m \odot n,\ n \cdot g + m \cdot h)$.

It is easily verified that this multiplication is associative, commutative and distributive over the addition. It follows that $R$ is a commutative ring with identity $(1, 0)$. For $m \in \mathbb{Z}_d$ and $g \in G$ we set $\hat m := (m, 0)$ and $\hat g := (0, g)$. Extending this notation to matrices and vectors in the obvious way, we write $\hat A$ for the $I \times J$ matrix over $R$ with $\hat A(i, j) := \widehat{A(i, j) \bmod d}$ and $\hat b$ for the $I$-vector over $R$ with $\hat b(i) := \widehat{b(i)}$.

Claim. The system $\hat A \cdot x = \hat b$ is solvable over $R$ if, and only if, $A \cdot x = b$ is solvable over $\Gamma$.

Proof of claim. In one direction, observe that a solution $c$ to $A \cdot x = b$ over $\Gamma$ gives the solution $\hat c$ (defined by $\hat c(j) := \widehat{c(j)}$) to $\hat A \cdot x = \hat b$ over $R$. For the other direction, suppose that $v$ is a $J$-vector over $R$ such that $\hat A \cdot v = \hat b$. Since each element of $R$ can be written uniquely as $\hat m + \hat g$ with $m \in \mathbb{Z}_d$ and $g \in G$, we write $v = v_{\mathbb{Z}} + v_G$, where every entry of $v_{\mathbb{Z}}$ is of the form $\hat m$ and every entry of $v_G$ is of the form $\hat g$. Observe that we have $\hat m \cdot \hat n = \widehat{m \odot n}$ and $\hat m \cdot \hat g = \widehat{m \cdot g}$ for all $m, n \in \mathbb{Z}_d$ and $g \in G$. Hence, it follows that every entry of $\hat A \cdot v_{\mathbb{Z}}$ is of the form $\hat m$ and every entry of $\hat A \cdot v_G$ is of the form $\hat g$. Now, since every entry of $\hat b$ is of the form $\hat g$, we have $\hat A \cdot v_G = \hat b$. Hence, the $G$-part of $v$, i.e. the vector $c$ with $\hat c = v_G$, gives a solution to $A \cdot x = b$, as required.

All that remains is to show that our reduction can be formalised as a logical interpretation. Essentially, this comes down to showing that the ring $R$ can be interpreted in $\Gamma$. By elementary group theory, we know that for elements $g \in G$ of maximal order we have $\mathrm{ord}(g) = d$. It is not hard to see that the set of group elements of maximal order can be defined in DTC; for example, for any fixed $g \in G$ the set of elements of the form $n \cdot g$ with $n \in \mathbb{N}$ is DTC-definable as a reachability query in the deterministic directed graph $(G, \{(x, x + g) : x \in G\})$. Hence, we can interpret $\mathbb{Z}_d$ in $\Gamma$, as the cyclic group generated by an element of maximal order together with the linear order induced by iterated addition of the generator, and as DTC on ordered domains expresses all LOGSPACE-computable queries (see e.g. [14]), the multiplication of $\mathbb{Z}_d$ is also DTC-definable, which completes the proof.
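The construction in the proof can be checked mechanically on small instances. The sketch below (an illustration of ours, not part of the paper) builds the ring $\mathbb{Z}_d \times G$ for $G = \mathbb{Z}_2 \times \mathbb{Z}_4$ (so $d = 4$), verifies the ring axioms by brute force, and confirms on one small system that solvability over $G$ and over the constructed ring coincide.

```python
from itertools import product

# G = Z_2 x Z_4, written additively; d = lcm of the element orders = 4.
G = [(g1, g2) for g1 in range(2) for g2 in range(4)]
d = 4
g_add  = lambda g, h: ((g[0] + h[0]) % 2, (g[1] + h[1]) % 4)
g_zero = (0, 0)
def g_smul(n, g):                      # n·g for an integer n >= 0
    return ((n * g[0]) % 2, (n * g[1]) % 4)

# The commutative ring R = Z_d x G with the multiplication from the proof:
R = [(n, g) for n in range(d) for g in G]
r_add = lambda r, s: ((r[0] + s[0]) % d, g_add(r[1], s[1]))
r_mul = lambda r, s: ((r[0] * s[0]) % d, g_add(g_smul(s[0], r[1]), g_smul(r[0], s[1])))
one = (1, g_zero)

# brute-force check of the ring axioms used in the proof
assert all(r_mul(r, s) == r_mul(s, r) for r in R for s in R)
assert all(r_mul(r, one) == r for r in R)
assert all(r_mul(r, r_mul(s, t)) == r_mul(r_mul(r, s), t) for r in R for s in R for t in R)
assert all(r_mul(r, r_add(s, t)) == r_add(r_mul(r, s), r_mul(r, t))
           for r in R for s in R for t in R)

# A tiny system A·x = b over G (integer coefficients, unknowns in G) and its
# lift over R: coefficients become (a mod d, 0), right-hand sides become (0, b).
A = [[2, 1], [0, 3]]
b = [(1, 2), (1, 1)]
def solves_G(x):
    return all(g_add(g_smul(A[i][0], x[0]), g_smul(A[i][1], x[1])) == b[i] for i in range(2))
def solves_R(x):
    Ahat = [[(a % d, g_zero) for a in row] for row in A]
    bhat = [(0, bi) for bi in b]
    return all(r_add(r_mul(Ahat[i][0], x[0]), r_mul(Ahat[i][1], x[1])) == bhat[i]
               for i in range(2))
assert any(solves_G(x) for x in product(G, repeat=2)) == \
       any(solves_R(x) for x in product(R, repeat=2))
```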

We conclude this section by discussing the solvability problem over general (i.e. not necessarily commutative) rings $R$. Over such rings, linear equation systems have a representation of the form $A \cdot x + x \cdot B = d$, where $A$ and $B$ are two coefficient matrices over $R$. This representation takes into account the difference between left and right multiplication of variables with coefficients from the ring.

First of all, if the ring comes with a linear ordering, then it is easy to adapt the proof of Theorem 0.3 to the case of non-commutative rings. Hence, in this case we again obtain an FP-reduction to the solvability problem over cyclic groups. Moreover, in what follows we establish a DTC-reduction from the solvability problem over general rings to the solvability problem over commutative rings. These results indicate that, from the viewpoint of logical definability, the solvability problem does not become harder when considered over arbitrary (i.e. possibly non-commutative) rings.

As a technical preparation, we first give a first-order interpretation that transforms a linear equation system over $R$ into an equivalent system with the following property: the new system is solvable if, and only if, it has a numerical solution, i.e. a solution in which all variables take integer values (acting on $R$ via the $\mathbb{Z}$-module structure of $(R, +)$).

Lemma 0.5.

There is a first-order interpretation $\mathcal{I}$ such that for every linear equation system $S$ over a ring $R$ (encoded as described in §2.2), $\mathcal{I}(S)$ describes a linear equation system $S'$ over the $\mathbb{Z}$-module $(R, +)$ such that $S$ is solvable over $R$ if, and only if, $S'$ has a solution over $\mathbb{Z}$.

Proof.

Let $S$ be given by the equations $\sum_{j \in J} a_{ij}\, x_j + \sum_{j \in J} x_j\, b_{ij} = d_i$ for $i \in I$, with coefficients $a_{ij}, b_{ij} \in R$. By duplicating each variable we can assume that for every $j \in J$ either $a_{ij} = 0$ for all $i$, or $b_{ij} = 0$ for all $i$, i.e. we assume that each variable occurs only with left-hand or only with right-hand coefficients. For the new system $S'$, we introduce for each variable $x_j$ ($j \in J$) and each element $r \in R$ a new variable $y_{j,r}$, i.e. the index set for the variables of $S'$ is $J \times R$. Finally, we replace all terms of the form $a \cdot x_j$ by $\sum_{r \in R} y_{j,r} \cdot (a\,r)$, and similarly, terms of the form $x_j \cdot b$ by $\sum_{r \in R} y_{j,r} \cdot (r\,b)$. If we let the new variables take values in $\mathbb{Z}$, then we obtain a new linear equation system of the desired form over the $\mathbb{Z}$-module $(R, +)$. It is easy to see that this transformation can be formalised by an FO-interpretation.

Finally, we observe that the newly constructed linear equation system $S'$ is equivalent to the original system $S$ in the sense of the lemma. To see this, assume first that $(c_j)_{j \in J}$ is a solution of the original system. By setting $y_{j,r} := 1$ if $c_j = r$ and $y_{j,r} := 0$ otherwise, we obtain an integer solution of the system $S'$. For the other direction, assume that $(z_{j,r})$ is an integer solution of $S'$. Then we set $c_j := \sum_{r \in R} z_{j,r} \cdot r$ to obtain a solution of the system $S$.
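The rewriting of Lemma 0.5 can again be checked by brute force on a small instance. In the sketch below (ours, for illustration) we take the non-commutative ring of upper triangular 2x2 matrices over $\mathbb{Z}_2$ and a single equation $L \cdot x + x \cdot Rc = D$; since there is only one unknown, we keep a single family of integer variables $y_r$ rather than duplicating the variable as in the proof, which does not affect the argument. Integer values can be taken modulo 2, the exponent of the additive group.

```python
from itertools import product

# A small non-commutative ring: upper triangular 2x2 matrices over Z_2,
# an element (a, b, c) stands for [[a, b], [0, c]].
RING = [(a, b, c) for a in range(2) for b in range(2) for c in range(2)]
add = lambda r, s: tuple((x + y) % 2 for x, y in zip(r, s))
mul = lambda r, s: ((r[0] * s[0]) % 2,
                    (r[0] * s[1] + r[1] * s[2]) % 2,
                    (r[2] * s[2]) % 2)
ZERO = (0, 0, 0)
def smul(n, r):                          # integer scalar action (additive exponent 2)
    return r if n % 2 else ZERO

def orig_solvable(L, Rc, D):
    """One equation L·x + x·Rc = D with a ring unknown x."""
    return any(add(mul(L, x), mul(x, Rc)) == D for x in RING)

def rewritten_solvable(L, Rc, D):
    """The rewritten equation: integer unknowns y_r, one per ring element,
    with group coefficients L·r + r·Rc, as in the proof of Lemma 0.5."""
    coeff = [add(mul(L, r), mul(r, Rc)) for r in RING]
    for y in product(range(2), repeat=len(RING)):
        total = ZERO
        for yr, g in zip(y, coeff):
            total = add(total, smul(yr, g))
        if total == D:
            return True
    return False

# the two solvability predicates coincide for all coefficients and constants
assert all(orig_solvable(L, Rc, D) == rewritten_solvable(L, Rc, D)
           for L in RING for Rc in RING for D in RING)
```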

By Lemma 0.5, we can restrict our attention to linear equation systems over the $\mathbb{Z}$-module $(R, +)$ in which the variables take values in $\mathbb{Z}$. However, since $\mathbb{Z}$ is an infinite domain, we let $d$ denote the maximal order of elements in the group $(R, +)$. Then we can also treat such a system as an equivalent linear equation system over the $\mathbb{Z}_d$-module $(R, +)$, with variables taking values in $\mathbb{Z}_d$.

At this point, we reuse our construction from Theorem 0.4 to obtain a linear system over the commutative ring $\mathbb{Z}_d \times G$, where $G = (R, +)$ and $d$ is as above: every coefficient $g \in G$ is replaced by $\hat g$ and every right-hand side is lifted accordingly. We claim that the resulting system is solvable over $\mathbb{Z}_d \times G$ if, and only if, the given system has a solution over $\mathbb{Z}_d$. For the non-trivial direction, suppose $v$ is a solution over $\mathbb{Z}_d \times G$ and decompose $v = v_{\mathbb{Z}} + v_G$ into number elements and group elements, as explained in the proof of Theorem 0.4. Recalling that $\hat g \cdot \hat h = 0$ for all $g, h \in G$, and that all coefficients of the lifted system are of the form $\hat g$, it follows that $v_{\mathbb{Z}}$ is itself a solution of the lifted system. Hence, there is a solution that consists only of number elements, as claimed. Thus we obtain:

Theorem 0.6.

$\mathrm{SlvR} \leq_{\mathrm{DTC}} \mathrm{SlvCR}$.


4. The structure of finite commutative rings

In this section we study structural properties of (finite) commutative rings and present the remaining reductions for solvability outlined in §3: from commutative rings to local rings, and from $k$-generated local rings to commutative rings with a linear order. Recall that a commutative ring $R$ is local if it contains a unique maximal ideal $\mathfrak{m}$. The importance of the notion of local rings comes from the fact that they are the basic building blocks of finite commutative rings. We start by summarising some of their useful properties.

Proposition 0.7 (Properties of (finite) local rings).

Let $R$ be a finite commutative ring.

  • If $R$ is local, then the unique maximal ideal is the set $R \setminus R^*$ of non-units.

  • $R$ is local if, and only if, all idempotent elements in $R$ are trivial.

  • If $e \in R$ is idempotent, then $R = eR \oplus (1 - e)R$ as a direct sum of rings.

  • If $R$ is local, then its cardinality (and in particular its characteristic) is a prime power.

Proof.

The first claim follows directly from the uniqueness of the maximal ideal $\mathfrak{m}$: every non-unit generates a proper ideal and is therefore contained in some maximal ideal, hence in $\mathfrak{m}$, while no unit lies in a proper ideal, so $\mathfrak{m} = R \setminus R^*$. For the second part, assume $R$ is local but contains a non-trivial idempotent element $e$, i.e. $e^2 = e$ but $e \notin \{0, 1\}$. In this case $e$ and $1 - e$ are two non-units distinct from $0$, hence both lie in the unique maximal ideal $\mathfrak{m} = R \setminus R^*$. But then $1 = e + (1 - e) \in \mathfrak{m}$, which yields the contradiction. On the other hand, if $R$ only contains trivial idempotents, then we claim that every non-unit in $R$ is nilpotent: assume that $r$ is a non-unit which is not nilpotent. Because $R$ is finite, there are $n, k \geq 1$ such that $r^n = r^{n + ik}$ for all $i \geq 0$. In particular, choosing a multiple $m$ of $k$ with $m \geq n$, the element $r^m$ is idempotent, since $(r^m)^2 = r^{2m} = r^m$. Since $r$ is not a unit we have $r^m \neq 1$, so $r^m = 0$, which is a contradiction to our assumption that $r$ is not nilpotent. Hence we have that $r$ is a non-unit if, and only if, $r$ is nilpotent. Knowing this, it is easy to verify that also sums of non-units are non-units, which implies that the set of non-units forms a unique maximal ideal in $R$.

For the third part, assume $e$ is idempotent. Then $1 - e$ is also idempotent, since $(1 - e)^2 = 1 - 2e + e^2 = 1 - e$. Furthermore, as $e(1 - e) = e - e^2 = 0$, we see that $e$ and $1 - e$ are orthogonal, and since $1 = e + (1 - e)$, any element $r \in R$ can be expressed as $r = er + (1 - e)r$, so we conclude that $R = eR \oplus (1 - e)R$.

Finally, let $R$ be local and suppose $|R| = s \cdot t$ where $s, t > 1$ are coprime. We want to show that this is impossible, so that $|R|$ must be a prime power. Consider the sets $I_s := \{r \in R : s \cdot r = 0\}$ and $I_t := \{r \in R : t \cdot r = 0\}$. Both are ideals: they are closed under addition, and for $x \in R$ and $r \in I_s$ we have $s \cdot (xr) = x(s \cdot r) = 0$, so $xr \in I_s$, and similarly for $I_t$. By the primary decomposition of the Abelian group $(R, +)$ we have $(R, +) = I_s \oplus I_t$ with $|I_s| = s$ and $|I_t| = t$, so both ideals are proper and $I_s + I_t = R$. Now choose maximal ideals $\mathfrak{m}_s \supseteq I_s$ and $\mathfrak{m}_t \supseteq I_t$. If $\mathfrak{m}_s = \mathfrak{m}_t$, then this ideal would contain $I_s + I_t = R$, which is impossible. This shows that $R$ does not contain a unique maximal ideal, and so $R$ was not local. Finally, the characteristic of $R$ divides $|R|$ and is therefore also a prime power.
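The idempotent criterion in the second item gives a simple mechanical test for locality; the sketch below (ours) applies it to rings of the form $\mathbb{Z}_n$, for which locality is equivalent to $n$ being a prime power.

```python
from math import gcd

def idempotents(n):
    return [e for e in range(n) if (e * e) % n == e]

def is_local_Zn(n):
    """Z_n is local iff its only idempotents are 0 and 1 (Proposition 0.7),
    which for Z_n happens exactly when n is a prime power."""
    return set(idempotents(n)) <= {0, 1 % n}

def non_units(n):
    return {r for r in range(n) if gcd(r, n) != 1}

assert is_local_Zn(8) and is_local_Zn(9) and is_local_Zn(49)
assert not is_local_Zn(6) and not is_local_Zn(12)
# for a local Z_n the non-units form the unique maximal ideal (here 2·Z_8)
assert non_units(8) == {0, 2, 4, 6}
```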

By this proposition we know that finite commutative rings can be decomposed into local summands that are principal ideals generated by pairwise orthogonal idempotent elements. Indeed, this decomposition is unique (for more details, see e.g. [6]).

Proposition 0.8 (Decomposition into local rings).

Let $R$ be a (finite) commutative ring. Then there is a unique set $E = \{e_1, \ldots, e_k\} \subseteq R$ of pairwise orthogonal idempotent elements for which it holds that (i) $e_i R$ is a local ring for each $i$; (ii) $e_1 + \cdots + e_k = 1$; and (iii) $R = e_1 R \oplus \cdots \oplus e_k R$.
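As a worked example (ours, anticipating the characterisation established in Lemma 0.9 below), the decomposition of $\mathbb{Z}_{12}$ can be computed by brute force: its primitive idempotents are $4$ and $9$, and the corresponding summands $4 \cdot \mathbb{Z}_{12} \cong \mathbb{Z}_3$ and $9 \cdot \mathbb{Z}_{12} \cong \mathbb{Z}_4$ are local.

```python
def idempotents(n):
    return [e for e in range(n) if e * e % n == e]

def primitive_idempotents(n):
    """The base E of the decomposition: non-trivial idempotents that are not
    a sum of two orthogonal non-trivial idempotents (cf. Lemma 0.9);
    for a prime power n the base is just {1}."""
    idem = [e for e in idempotents(n) if e not in (0, 1 % n)]
    if not idem:
        return [1 % n]
    return [e for e in idem
            if not any((f1 + f2) % n == e and f1 * f2 % n == 0
                       for f1 in idem for f2 in idem)]

def summand(e, n):                       # the principal ideal e·Z_n
    return sorted({e * r % n for r in range(n)})

E = primitive_idempotents(12)
assert sorted(E) == [4, 9]
assert sum(E) % 12 == 1 and (E[0] * E[1]) % 12 == 0
# Z_12 = 4·Z_12 (+) 9·Z_12: a local ring of size 3 and a local ring of size 4
assert len(summand(4, 12)) == 3 and len(summand(9, 12)) == 4
```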

We next show that this ring decomposition is definable. As a first step, we note that the set $E$ (the base of the decomposition) is first-order definable over $R$.

Lemma 0.9.

There is an FO-formula $\varphi_E(x)$ which defines the set $E$ in every (finite) commutative ring $R$.

Proof.

We claim that $E$ consists precisely of those non-trivial idempotent elements of $R$ which cannot be expressed as the sum of two orthogonal non-trivial idempotent elements; this characterisation is clearly expressible in first-order logic. To establish this claim, consider an element $e \in E$ and suppose that $e = f_1 + f_2$ where $f_1$ and $f_2$ are orthogonal non-trivial idempotents. It follows that $e$ is different from both $f_1$ and $f_2$, since if $e = f_1$, say, then $f_2 = e - f_1 = 0$, and similarly when $e = f_2$. Now $e f_1 = (f_1 + f_2) f_1 = f_1^2 + f_2 f_1 = f_1$ and, similarly, $e f_2 = f_2$.