Graphs with degree constraints
Given a set D of nonnegative integers, we derive the asymptotic number of graphs with a given number of vertices and edges such that the degree of every vertex is in D. This generalizes existing results, such as the enumeration of graphs with a given minimum degree, and establishes new ones, such as the enumeration of Euler graphs, i.e. graphs where all vertices have an even degree. These results are derived using analytic combinatorics.
1.1 Related works
The asymptotics of several families of simple graphs with degree constraints have been derived. Regular graphs, where all vertices have the same degree, have been enumerated by BC78, and graphs with a prescribed lower bound on the minimum degree by PW03. An Euler graph, or even graph, is a graph where all vertices have an even degree. An exact formula for the number of such graphs, for a given number of vertices and without consideration of the number of edges, has been derived by RWR69 and MS75. In the present work, we generalize those results and derive the asymptotic number of graphs with degrees in any given set.
A similar problem has been addressed with probabilistic tools by the configuration model, introduced independently by B80 and Wo78. This model takes as input a distribution on the degrees, and outputs a random multigraph where the degree of each vertex follows this distribution. The main difference from the model analyzed in this article is that the number of edges in the configuration model is a random variable. The link between both models is discussed in Section 4.1. For more information on the configuration model, we recommend the book of Ho14.
Other related problems include the enumeration of graphs with a given degree sequence (BC78), the enumeration of symmetric matrices with nonnegative coefficients and constant row sum (CMS05), and the enumeration of graphs with degree parities, investigated by RR82.
1.2 Model and notations
A multiset is an unordered collection of objects, where repetitions are allowed. Sets are then multisets without repetitions. A sequence is an ordered multiset. We use the parenthesis notation for sequences, and the brace notation for sets and multisets. Open real intervals are denoted by outward-facing square brackets, as in ]a, b[.
A simple graph is a set of labelled vertices and a set of edges, where each edge is an unordered pair of distinct vertices. In a multigraph, the edges form a multiset and the vertices in an edge need not be distinct. An edge is a loop if its two vertices are equal, a multiple edge if it has at least two occurrences in the multiset of edges, and a simple edge otherwise. Thus, the simple graphs are the multigraphs that contain neither loops nor multiple edges, i.e. that contain only simple edges. The set of multigraphs with vertices and edges is denoted by , and the subset of simple graphs by .
The degree of a vertex is defined as its number of occurrences in . In particular, a loop increases its degree by 2. The set of multigraphs from where each vertex has its degree in a set is denoted by . The subset of simple graphs is . The set may be finite or infinite. We denote its generating function by
For any natural number , denotes the set . In particular, observe that . We also define the valuation and periodicity of the set (by convention, the periodicity is infinite when ).
2 Main Theorem and applications
Our main result is an asymptotic expression for the number of graphs in , when the number of edges grows linearly with the number of vertices.
Assume contains at least two integers, has valuation and periodicity . Let , denote two integers tending to infinity, such that stays in a fixed compact interval of and divides , then the number of simple graphs in is
where , is the unique positive solution of , and . If does not divide , if or if , then is empty.
When , the degrees are not constrained, so . Using Stirling's formula, it can indeed be checked that , the total number of simple graphs with vertices and edges, is asymptotically equal to the result of Theorem 1.
PW03 have derived the asymptotics of simple graphs with minimum degree at least . They used elementary probabilistic and analytic tools in a sophisticated way. In the present paper, we have addressed the enumeration of a broader family of graphs with degree constraints, using more powerful tools (analytic combinatorics). For graphs with minimum degree at least , the asymptotics derived in Theorem 1, for , match their result.
Euler graphs are simple graphs where each vertex has an even degree. An exact, but complicated, formula for the number of such graphs, for given number of vertices and without consideration of the number of edges, has been derived by RWR69 and MS75. Applying Theorem 1, we are now able to derive the asymptotic number of Euler graphs with vertices and edges, when stays in a fixed compact interval of
where and .
3 Proof of the result
In this section we provide a proof of Theorem 1. The proofs of all lemmas and theorems are deferred to the appendix.
3.1.1 Multigraph model
The main model of random multigraphs with vertices and edges is the multigraph process, analyzed by FKP89 and JKLP93. It samples uniformly and independently vertices in , and outputs a multigraph with set of vertices and set of edges
Given a simple graph or a multigraph, one can order the set of edges and the vertices in each edge. The result is a sequence of ordered pairs of vertices, which we call an ordering of . Let denote the number of such orderings. For example, the multigraph on vertices with edges has orderings, amongst them . For simple graphs with m edges, the number of orderings is equal to 2^m m!, because each edge has two possible orientations and all edges can be permuted. For non-simple multigraphs, is smaller. FKP89 and JKLP93 introduced the compensation factor of a multigraph with m edges, defined as the ratio of its number of orderings to 2^m m!.
The compensation factor of a multigraph is 1 if and only if it is simple.
Observe that in the random distribution induced by the multigraph process, each multigraph receives a probability proportional to its compensation factor. Therefore, when the output of the multigraph process is constrained to be a simple graph, the sampling becomes uniform on . The total weight of a family of multigraphs is the sum of their compensation factors. For example, the total weight of is equal to . When contains only simple graphs, its total weight is equal to its cardinality.
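The compensation factor can be checked on small examples. The sketch below assumes that it is the number of orderings divided by 2^m m! for a multigraph with m edges, and encodes a multigraph as a list of vertex pairs; the function names are ours. The closed form, one over the product of the factorials of the edge multiplicities and of 2 for each loop occurrence, is the standard consequence of this definition and is stated here as an assumption, since it is not written out in the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def count_orderings(edges):
    """Count distinct sequences of ordered pairs obtained by permuting
    the edges and choosing an orientation for each (brute force)."""
    seen = set()
    for perm in permutations(edges):
        for flips in product((False, True), repeat=len(perm)):
            seen.add(tuple((v, u) if flip else (u, v)
                           for (u, v), flip in zip(perm, flips)))
    return len(seen)


def compensation_factor(edges):
    """Number of orderings divided by 2^m m! (assumed definition)."""
    m = len(edges)
    return Fraction(count_orderings(edges), 2 ** m * factorial(m))


def compensation_factor_closed_form(edges):
    """1 / prod(multiplicity!) / 2^(number of loop occurrences)."""
    multiplicity = Counter(tuple(sorted(e)) for e in edges)
    denominator = 1
    for (u, v), k in multiplicity.items():
        denominator *= factorial(k)
        if u == v:
            denominator *= 2 ** k
    return Fraction(1, denominator)


triangle = [(1, 2), (2, 3), (1, 3)]            # simple graph
loop_and_double = [(1, 1), (2, 3), (2, 3)]     # one loop, one double edge
print(compensation_factor(triangle), compensation_factor(loop_and_double))
```

The brute-force count is only practical for a handful of edges; it is meant as a check of the closed form, not as an algorithm.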
3.1.2 Analytic tools
Our tool for the analysis of graphs with degree constraints is analytic combinatorics, as presented by FS09. Its principle is to associate to the combinatorial family studied its generating function. The asymptotics of the family is then linked to the analytic behavior of this function.
In the analysis of a family of graphs with analytic combinatorics, the main difficulty is the fast growth of its cardinality, which often implies a zero radius of convergence for the corresponding generating function
This feature drastically reduces the number of tools from complex analysis that can be applied. Graphs with degree constraints are no exception, but our approach completely avoids this classic issue. In fact, the only analytic tool we use is the following lemma, a variant of (FS09, Theorem VIII.8).
Consider a non-monomial series with nonnegative coefficients, analytic on , with valuation and periodicity . Let denote the function , and a compact interval of the open interval . Let , denote two integers tending to infinity while stays in , and let denote the unique positive solution of . Finally, consider a compact and a function , on , such that for all in , the function is analytic on and is nonzero. Then we have, uniformly for in and in ,
3.2 Multigraphs with degree constraints
The work of FKP89 and JKLP93 demonstrates that multigraphs are more suitable to the analytic combinatorics approach than simple graphs. Moreover, the results on multigraphs can usually be extended to simple graphs. Following this observation, multigraphs are analyzed in this section, before turning to simple graphs in Section 3.3.
3.2.1 Exact and asymptotic enumeration
The total weight of all multigraphs in is
The proof of this theorem is elementary by the definition of the compensation factor. Now applying Lemma 2 to the exact expression, we derive the asymptotics of multigraphs with degree constraints. Let us first eliminate three simple cases.
When contains only one integer , is the set of -regular multigraphs. The total weight of is then if , and otherwise.
The sum of the degrees of the vertices is equal to , so is empty when or .
The periodicity of is equal to . For each vertex of a multigraph from , it follows that divides . By summation over all vertices, we conclude that if does not divide , then the set is empty.
The last two points obviously hold for .
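The emptiness conditions above can be tested mechanically. The following sketch assumes a finite set D of allowed degrees, takes the valuation to be min(D) and the periodicity to be the gcd of the differences of elements of D (our reading of the definitions, consistent with the divisibility argument above), and reads the degree-sum condition as 2m < n min(D) or 2m > n max(D). The function names are ours.

```python
from functools import reduce
from math import gcd


def valuation(D):
    """Smallest allowed degree."""
    return min(D)


def periodicity(D):
    """gcd of the differences d - min(D); returns 0 when D has a single
    element, which stands for the 'infinite' convention of the text."""
    r = min(D)
    return reduce(gcd, (d - r for d in D), 0)


def obviously_empty(n, m, D):
    """True when no multigraph with n vertices, m edges and degrees in D
    can exist because of the degree sum or the periodicity."""
    r, p = valuation(D), periodicity(D)
    if 2 * m < n * r or 2 * m > n * max(D):
        return True           # the degrees cannot sum to 2m
    if p == 0:
        return 2 * m != r * n  # regular case: the sum must be exactly r * n
    return (2 * m - r * n) % p != 0


print(periodicity({0, 2, 4}), obviously_empty(3, 2, {1, 3}))
```

A result of False only means that none of the three obstructions applies; it does not by itself prove that the set is nonempty.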
Consider a set of size at least . Let denote its valuation and its periodicity. Let , denote two integers tending to infinity, such that stays in a fixed compact interval of the open interval , and divides , then the total weight of is equal to
where and is the unique positive solution of .
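The saddle point in this statement has no closed form in general, but it is easy to compute numerically. The sketch below rests on two assumptions that are not spelled out in the text as it stands: the generating function of a finite set D is the sum of x^d/d! over d in D, and the saddle-point equation reads x Δ'(x)/Δ(x) = 2m/n, as in the usual large-powers setting. The left-hand side is increasing in x and ranges over the open interval between min(D) and max(D), so bisection applies. All names are ours.

```python
from math import factorial


def mean_degree(x, D):
    """x * Delta'(x) / Delta(x) for Delta(x) = sum_{d in D} x**d / d!."""
    weights = [(d, x ** d / factorial(d)) for d in D]
    total = sum(w for _, w in weights)
    return sum(d * w for d, w in weights) / total


def saddle_point(D, target, tol=1e-12):
    """Positive x with mean_degree(x, D) == target, found by bisection.
    Requires a finite D with min(D) < target < max(D)."""
    if not (min(D) < target < max(D)):
        raise ValueError("target must lie strictly between min(D) and max(D)")
    lo, hi = 1e-9, 1.0
    while mean_degree(hi, D) < target:   # grow the bracket
        hi *= 2
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if mean_degree(mid, D) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Degrees in {1, 3} with 2m/n = 2: the equation solves to x = sqrt(6).
chi = saddle_point({1, 3}, 2.0)
print(chi)
```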
3.2.2 Typical multigraphs with degree constraints
Let us recall that an edge is simple if it is neither a loop nor a multiple edge. Before turning to the enumeration of simple graphs with degree constraints, we first describe the behavior of non-simple edges in a typical multigraph from . No proofs are given here, as stronger results will be derived later.
Using random sampling, we observe that in most of the multigraphs from , all non-simple edges have low multiplicity and are well separated. This motivates the following definition. A multigraph from is in if all its non-simple edges are loops or double edges, and each vertex belongs to at most one loop or (exclusive) one double edge. Let denote the number of occurrences of the element in the multiset . Formally, is characterized as the set of multigraphs from such that for all vertices , we have
The complementary set, , is denoted by , and illustrated in Figure 1.
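The definition above translates into a short membership test. The sketch reads the definition as follows (our interpretation): no edge has multiplicity three or more, no loop is repeated, and no vertex lies in both a loop and a double edge or in two double edges. Multigraphs are encoded as lists of vertex pairs, and the function name is ours.

```python
from collections import Counter


def is_well_separated(edges):
    """True if every non-simple edge is a single loop or a double edge,
    and each vertex lies in at most one of them."""
    multiplicity = Counter(tuple(sorted(e)) for e in edges)
    non_simple_at = Counter()  # per vertex: loops plus double edges
    for (u, v), k in multiplicity.items():
        if u == v:
            non_simple_at[u] += k      # each loop occurrence counts
        elif k == 2:
            non_simple_at[u] += 1
            non_simple_at[v] += 1
        elif k > 2:
            return False               # triple edge or worse
    return all(count <= 1 for count in non_simple_at.values())


print(is_well_separated([(1, 1), (2, 3), (2, 3), (3, 4)]))
```

The four ways of failing this test (a triple edge, a repeated loop, a loop meeting a double edge, two double edges sharing a vertex) are what we understand the configurations of Figure 1 to be.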
3.3 Simple graphs with degree constraints
We introduce the notation for the set of simple graphs with vertices, edges and all degrees in , i.e. multigraphs from that contain neither loops nor multiple edges. The enumeration of simple graphs with degree constraints is derived in Theorem 1. First, in Section 3.3.2, we describe an inclusion-exclusion process that outputs when applied to . In Section 3.3.3, this process is then applied to , and the error introduced is proven to be negligible in Section 3.3.4.
In order to forbid loops and multiple edges in multigraphs from , we introduce the notion of marked multigraphs.
3.3.1 Marked multigraphs
A marked multigraph is a triplet , where denotes the set of vertices, the multiset of normal edges, and the multiset of marked edges, where both normal and marked edges are unordered pairs of vertices. We say that a marked multigraph belongs to a family of (unmarked) multigraphs if the unmarked multigraph is in .
We now extend to marked multigraphs the definitions of degree, orderings and compensation factors, introduced for multigraphs in Section 3.1. The degree of a vertex from a marked multigraph is equal to its number of occurrences in the multiset . An ordering of a marked multigraph with edges is a sequence
from such that the multiset is equal to , and the multiset is equal to . The number of orderings of a given marked multigraph is denoted by , and its compensation factor is
For example, consider the marked multigraph with
Its number of orderings is , and therefore its compensation factor is whereas it is for without the marks,
In the following, we will consider families of marked multigraphs where the marked edges are loops or multiple edges. Given a marked multigraph , then denotes the number of loops in , and the number of distinct edges from that are not loops. The generating function of a family of marked multigraphs is
3.3.2 Inclusion-exclusion process
In this section, we build an operator that inputs a family of multigraphs and outputs a family of marked multigraphs. It is designed so that the asymptotics of its generating function is linked to the asymptotics of . In order to justify the construction, we first introduce the operators and .
If we could mark all loops and multiple edges from , the enumeration of simple graphs with degree constraints would be easy. Indeed, given a family of multigraphs, let denote the marked multigraphs from with all loops and multiple edges marked. Since the simple graphs are the multigraphs that have neither loops nor multiple edges, we have
which is equal to , because simple graphs have a compensation factor equal to 1. Unfortunately, we do not have a description of this family in the symbolic method formalism.
The inclusion-exclusion principle advises us to mark some of the non-simple edges. Let denote the set of marked multigraphs from such that each edge from is either a loop, or has multiplicity at least in and does not belong to . This construction implies the relation
The natural idea to build a marked multigraph from is to first choose some loops and multiple edges to put in , then complete with unmarked edges, which may well form other loops and multiple edges, in a way that ensures . However, the description of the set of marked edges is complicated, because of the numerous possible intersection patterns.
We have seen in Section 3.2.2 that in most of the multigraphs from , non-simple edges do not intersect. This motivates the following definition. Given a set of multigraphs, let denote the set of marked multigraphs from such that each vertex is in exactly one of the following cases:
the vertex belongs to no marked edge,
the vertex belongs to one marked loop and no other marked edge,
the vertex belongs to two identical marked edges and no other marked edge.
Therefore, each marked edge is a loop of multiplicity 1 or a double edge. This marking process links the multigraphs from , defined in Section 3.2.2, to the simple graphs with degree constraints.
The value is equal to the number of simple graphs in .
3.3.3 Application of the inclusion-exclusion process to all multigraphs with degree constraints
We have the formal equality
where when is greater than , otherwise
The proof is constructive: it considers all the disjoint sets of vertices where we can put a loop or a double edge. We observe that when is fixed while tends to infinity, then tends to . The double sum can then be approximated by an exponential, and it is tempting to conclude
The next lemma formalizes this intuition. A multivariate generating function is said to dominate coefficient-wise another series if for all ,
When stays in a fixed compact interval of , there is an entire bivariate analytic function such that, for large enough, dominates coefficient-wise
We can now derive the asymptotics of . As observed in the discussion preceding Theorem 4, the result is trivial when contains only one integer, when is outside and when does not divide .
Assume has size at least , valuation and periodicity . Let , denote two integers tending to infinity, such that stays in a fixed compact interval of and divides . When , stay in a fixed compact, then
where , and .
3.3.4 Negligible marked multigraphs
Recall that denotes the set . In Lemma 10, we prove that is negligible. To do so, we first bound for a family of marked multigraphs from with mandatory edges.
Let denote edges on the set of vertices , and the set of multigraphs from that contain those edges, with multiplicities (i.e. an edge with occurrences in the list has at least occurrences in the multiset of edges of the multigraph)
Assume contains at least two integers and has valuation . Let , denote two integers tending to infinity, such that stays in a fixed compact interval of , then
Figure 1 displays four multigraphs from . Actually, any multigraph from contains one of those four graphs as a subgraph, and this property can be described in terms of mandatory edges. In the following lemma, we use this fact to bound .
Assume contains at least two integers, has valuation and periodicity . Let , denote two integers tending to infinity, such that stays in a fixed compact interval of , and divides , then
The intuition supporting this proof is that a multigraph belongs to if and only if it contains a vertex that is in one of the four configurations depicted in Figure 1. According to Lemma 9, multigraphs from that contain those subgraphs have a negligible total weight. Now we have all the ingredients to prove Theorem 1.
4 Random generation
In order to keep a combinatorial interpretation, we focused on generating functions with coefficients in . Our results hold more generally for any generating function with nonnegative coefficients and large enough radius of convergence (so that the saddle-point from Lemma 2 is well defined). Multigraphs are then counted with a weight that depends on the degrees of their vertices
The present work has been guided by experiments on large random graphs with degree constraints. We used exact and Boltzmann sampling (DFLS04). Observe that to build a random simple graph from , one can sample multigraphs from and reject until the multigraph is simple. As a consequence of Theorem 1, the expected number of rejections is (using the notations of the theorem).
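The rejection scheme can be illustrated with a deliberately naive sampler. In the sketch below, the multigraphs are drawn with the multigraph process of Section 3.1.1, assumed here to draw 2m vertices uniformly and independently in {1, ..., n} and to pair consecutive draws; outputs are rejected until all degrees lie in the allowed set D and the multigraph is simple. Drawing the constrained multigraphs by rejection as well is our simplification for the sake of a self-contained example, and it is far less efficient than the samplers described in Sections 4.1 and 4.2. All names are ours.

```python
import random
from collections import Counter


def multigraph_process(n, m, rng):
    """Draw 2m vertices uniformly in 1..n and pair consecutive draws."""
    draws = [rng.randint(1, n) for _ in range(2 * m)]
    return list(zip(draws[0::2], draws[1::2]))


def degrees_in(edges, n, D):
    """True if every vertex 1..n has its degree in D (a loop counts twice)."""
    degree = Counter(v for edge in edges for v in edge)
    return all(degree[v] in D for v in range(1, n + 1))


def is_simple(edges):
    """True if there is neither a loop nor a repeated edge."""
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True


def sample_simple_graph(n, m, D, rng, max_tries=100_000):
    """Reject until the multigraph is simple with all degrees in D."""
    for _ in range(max_tries):
        edges = multigraph_process(n, m, rng)
        if degrees_in(edges, n, D) and is_simple(edges):
            return edges
    raise RuntimeError("no sample found; the target set may be empty")


graph = sample_simple_graph(5, 5, {1, 2, 3}, random.Random(1))
print(graph)
```

Since the multigraph process gives each multigraph a probability proportional to its compensation factor, conditioning on simplicity yields a uniform simple graph among those with degrees in D.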
4.1 Boltzmann sampling
The construction of the Boltzmann algorithm is straightforward from Theorem 3. To build a random multigraph with degrees in , vertices and approximately edges, the algorithm first computes a positive value , according to the number of edges targeted. It then draws independently integers , following the law
with . If their sum is odd, a new sequence is drawn. Otherwise, the algorithm outputs a random multigraph with sequence of degrees . To do so, as in the configuration model (B80, Wo78), each vertex receives half-edges, and a random pairing on the half-edges is drawn uniformly.
Therefore, the random distribution induced on multigraphs by the Boltzmann sampling algorithm is identical to the configuration model. Conversely, given a probability distribution on , one can choose so that the distribution is equal to the one described by Equation (3). Thus, we expect random multigraphs from the configuration model and multigraphs with degree constraints to share many statistical properties.
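A minimal sketch of this sampler is given below. It assumes that D is finite and that the law of each degree is the usual Boltzmann distribution attached to the series summing x^d/d! over d in D, that is, the probability of d is proportional to x^d/d!; the displayed law is not reproduced here, so this is our reading of it. The parameter x is taken as an input rather than tuned to a target number of edges, and all names are ours.

```python
import random
from math import factorial


def boltzmann_multigraph(n, D, x, rng, max_tries=10_000):
    """Draw n degrees with probability proportional to x**d / d! for d in D,
    redraw while their sum is odd, then pair the half-edges uniformly."""
    support = sorted(D)
    weights = [x ** d / factorial(d) for d in support]
    for _ in range(max_tries):
        degrees = rng.choices(support, weights=weights, k=n)
        if sum(degrees) % 2 == 0:
            break
    else:
        # e.g. n odd and every element of D odd: the sum is never even
        raise RuntimeError("could not draw a degree sequence with even sum")
    # Vertex v receives degrees[v - 1] half-edges; a uniform shuffle followed
    # by pairing consecutive entries gives a uniform random pairing.
    half_edges = [v for v, d in enumerate(degrees, start=1) for _ in range(d)]
    rng.shuffle(half_edges)
    edges = list(zip(half_edges[0::2], half_edges[1::2]))
    return degrees, edges


degrees, edges = boltzmann_multigraph(6, {1, 2, 3}, 1.5, random.Random(0))
print(degrees, edges)
```

The number of edges of the output is random, which is the point of contact with the configuration model mentioned above.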
4.2 Recursive method
For the sampling of a multigraph in , the generator first draws a sequence of degrees, and then performs a random pairing of half-edges, as in the configuration model and the Boltzmann sampler. Each sequence from is drawn with weight . In the first step, we use dynamic programming to precompute the values , sums of the weights of all the sequences of degrees that sum to
using the initial conditions and the recursive expression
After this precomputation, we generate the sequence of degrees as follows: first we sample the last degree of the sequence according to the distribution
then we recursively generate the remaining sequence , which must sum to . Once the sequence of degrees is computed, we generate a random pairing on the corresponding half-edges.
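The precomputation and the backward sampling can be sketched as follows. We assume that a sequence of degrees is weighted by the product of the inverse factorials of its entries, that the table entry S[i][j] collects the sequences of length i summing to j, and that the initial condition is S[0][0] = 1 with all other S[0][j] equal to 0; these are our readings of the formulas, which are not reproduced in the text. The set D is assumed finite and all names are ours.

```python
import random
from fractions import Fraction
from math import factorial


def degree_sequence_weights(n, total, D):
    """S[i][j]: sum over (d_1..d_i) in D^i with d_1+...+d_i = j
    of the product of 1/d_k! (exact rational arithmetic)."""
    S = [[Fraction(0)] * (total + 1) for _ in range(n + 1)]
    S[0][0] = Fraction(1)
    for i in range(1, n + 1):
        for j in range(total + 1):
            S[i][j] = sum((S[i - 1][j - d] / factorial(d)
                           for d in D if d <= j), Fraction(0))
    return S


def sample_degree_sequence(n, total, D, rng):
    """Draw a sequence of n degrees in D summing to total, with probability
    proportional to its weight, choosing the last degree first."""
    S = degree_sequence_weights(n, total, D)
    if S[n][total] == 0:
        raise ValueError("no degree sequence with these parameters")
    degrees, remaining = [], total
    for i in range(n, 0, -1):
        options = [d for d in sorted(D)
                   if d <= remaining and S[i - 1][remaining - d] > 0]
        weights = [float(S[i - 1][remaining - d] / factorial(d))
                   for d in options]
        d = rng.choices(options, weights=weights, k=1)[0]
        degrees.append(d)
        remaining -= d
    degrees.reverse()
    return degrees


table = degree_sequence_weights(3, 4, {1, 2})
sequence = sample_degree_sequence(3, 4, {1, 2}, random.Random(0))
print(table[3][4], sequence)
```

The table has (n + 1)(2m + 1) entries, so the precomputation is quadratic in the size of the graph when 2m grows linearly with n; the pairing of half-edges that follows is the same as in the Boltzmann sampler.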
5 Forthcoming research
The results presented can be extended in several ways. The case where tends to or could be considered. For example, PW03 have derived, using elementary tools, the asymptotics of graphs with a lower bound on the minimum degree when . This extension would only require adjusting the saddle-point method from Lemma 2.
We have also derived results on the enumeration of graphs where the degree sets vary with the vertices. The model inputs an infinite sequence of sets and outputs graphs where each vertex has its degree in . The techniques presented in this paper can be extended to this case, if some technical conditions are satisfied, such as the convergence of the series . This extension will be part of a longer version of the paper. Two examples of such families are graphs with degree parities (RR82), and graphs with a given degree sequence (BC78).
We believe that a complete asymptotic expansion can be derived for graphs with degree constraints. This would require applying a more general version of Lemma 2, such as the one presented in Chapter by PW13, and considering more complex families than .
The asymptotics of connected graphs from when tends to infinity was first derived by BCM90. Since then, two new proofs were given, one by PW05, the other by HS06. The proof of Pittel and Wormald relies on a link between connected graphs and graphs from a particular family of graphs with degree constraints (graphs with degrees at least ). In ElieThesis, following the same approach, but using analytic combinatorics, we obtained a short proof for the asymptotics of connected multigraphs from when tends to infinity. We now plan to extend this result to simple graphs, and to derive a complete asymptotic expansion.
In this paper, we have focused on the enumeration of graphs with degree constraints. We can now start the investigation of the typical structure of random instances of such graphs. An application would be the enumeration of Eulerian graphs, i.e. connected Euler graphs.
Finally, the inclusion-exclusion technique we used to remove loops and double edges can be extended to forbid any family of subgraphs.
Appendix A Proofs
In this appendix, we include the proofs of the lemmas and theorems.
Proof of Theorem 3.
By definition of the compensation factor, the number of multigraphs of the theorem is equal to
Let us consider an ordering
of a multigraph from . For all , let denote the set of positions of the vertex in this ordering. Since the vertices have their degrees in , each has size in . This implies a bijection between
the orderings of multigraphs in ,
the sequences of sets , where the size of each set is in , and is a partition of (i.e. the sets are disjoint and ).
We now interpret as a sequence of sets that contain labelled objects and apply the Symbolic Method (see FS09). The exponential generating function of sets of size in is . The bijection then implies
and the theorem follows, after division by . ∎
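The bijection can be verified by brute force on small cases. The sketch below assumes that the statement being proved reads: the number of orderings is (2m)! times the coefficient of x^(2m) in the n-th power of the series summing x^d/d! over d in D, and the total weight is this number divided by 2^m m!. This is our reconstruction from the proof, and all names are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial


def orderings_brute_force(n, m, D):
    """Count sequences of 2m vertices in 1..n in which every vertex
    occurs a number of times that belongs to D."""
    count = 0
    for sequence in product(range(1, n + 1), repeat=2 * m):
        occurrences = Counter(sequence)
        if all(occurrences[v] in D for v in range(1, n + 1)):
            count += 1
    return count


def coefficient_of_power(n, total, D):
    """[x^total] (sum_{d in D} x^d / d!)^n, by repeated multiplication
    of truncated polynomials with exact rational coefficients."""
    poly = [Fraction(1)] + [Fraction(0)] * total
    for _ in range(n):
        new = [Fraction(0)] * (total + 1)
        for j, c in enumerate(poly):
            if c:
                for d in D:
                    if j + d <= total:
                        new[j + d] += c / factorial(d)
        poly = new
    return poly[total]


def total_weight(n, m, D):
    """(2m)! / (2^m m!) times the coefficient above (assumed statement)."""
    return (Fraction(factorial(2 * m), 2 ** m * factorial(m))
            * coefficient_of_power(n, 2 * m, D))


n, m, D = 3, 2, {1, 2}
print(orderings_brute_force(n, m, D), total_weight(n, m, D))
```

For n = 3, m = 2 and D = {1, 2}, one vertex occurs twice and the two others once, which gives 3 · 4!/2! = 36 orderings and a total weight of 36/8 = 9/2.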
Proof of Lemma 5.
As explained in the paragraphs First marking and Second marking of Section 3.3.2, the following relations hold
Furthermore, by construction of , we have
so . ∎
Proof of Lemma 6.
To build an ordering of a multigraph from with vertices in marked double edges and vertices in marked loops, we perform the following steps:
choose the labels of the vertices that appear in the marked double edges, and the vertices that appear in the marked loops. There are such choices.
choose the distinct edges of distinct vertices, among the chosen vertices, that will become the marked double edges. There are such choices.
order the marked double edges and the vertices in each of them. There are ways to order them.
order the loops. There are ways to do so.
choose among the edges of the final ordering which ones receive marked loops and which ones receive marked double edges. There are choices.
to fill the rest of the final ordering, build an ordering of length where the vertices that belong to marked double edges and the vertices that appear in marked loops have degree in , while the other vertices have degree in . The number of such orderings is .
This bijective construction implies the following enumerative result
After simplification, this last expression can be rewritten
Proof of Lemma 7.
Developing the exponential as a double sum
the result can be rewritten
for all , . We prove that when is large enough, we have
for all . Since the right-hand sides are the coefficients of a function analytic on , this will conclude the proof.
Let denote the value