A probabilistic heuristic for counting components of functional graphs of polynomials over finite fields
Abstract.
In 2014, Flynn and Garton [FG14] bounded the average number of components of the functional graphs of polynomials of fixed degree over a finite field. When the fixed degree was large (relative to the size of the finite field), their lower bound matched Kruskal’s asymptotic for random functional graphs. However, when the fixed degree was small, they were unable to match Kruskal’s bound, since they could not (Lagrange) interpolate cycles in functional graphs of length greater than the fixed degree. In our work, we introduce a heuristic for approximating the average number of such cycles of any length. This heuristic is, roughly, that for sets of edges in a functional graph, the quality of being a cycle and the quality of being interpolable are “uncorrelated enough”. We prove that this heuristic implies that the average number of components of the functional graphs of polynomials of fixed degree over a finite field is within a bounded constant of Kruskal’s bound. We also analyze some numerical data comparing implications of this heuristic to some component counts of functional graphs of polynomials over finite fields.
1. Introduction
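A (discrete) dynamical system is a pair $(S, f)$ consisting of a set $S$ and a map $f \colon S \to S$. Given such a system, an element $s \in S$ is a periodic point of the system if there exists some $n \in \mathbb{Z}^{>0}$ such that $f^n(s) = s$; the smallest $n$ with this property is called the period of $s$. The functional graph of such a system, which we denote by $\Gamma(S, f)$, is the directed graph whose vertex set is $S$ and whose edges are given by the relation $s \to t$ if and only if $f(s) = t$. A component of such a graph is a component of the underlying undirected graph. For any $n \in \mathbb{Z}^{>0}$, let $K(n)$ denote the average number of components of a random functional graph on a set of size $n$; that is, choose any set $S$ with $|S| = n$ and let
$$K(n) = \frac{1}{n^n} \sum_{f \colon S \to S} \left|\{\text{components of } \Gamma(S, f)\}\right|.$$
Kruskal (see [Kru54]) proved that
$$K(n) = \frac{1}{2} \log n + \frac{\log 2 + C}{2} + o(1),$$
where $C = 0.5772\ldots$ is Euler’s constant.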
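As a quick sanity check of Kruskal's asymptotic, the following Python sketch compares three quantities: a brute-force average over all maps on a small set, the exact finite sum obtained by counting cycles, and the asymptotic expression. The function names are illustrative and do not come from [Kru54]; the exact sum uses the standard fact that a fixed set of $k$ points carries $(k-1)!$ cyclic orders, each of which extends to $n^{n-k}$ maps.

```python
import math
from itertools import product

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def count_cycles(f):
    """Number of cycles (equivalently, components) of the map v -> f[v]."""
    n = len(f)
    state = [0] * n  # 0 = unvisited, 1 = on the current walk, 2 = finished
    cycles = 0
    for start in range(n):
        v = start
        while state[v] == 0:
            state[v] = 1
            v = f[v]
        if state[v] == 1:  # the walk closed up on itself: a new cycle
            cycles += 1
        v = start
        while state[v] == 1:  # mark the whole walk as finished
            state[v] = 2
            v = f[v]
    return cycles

def brute_force_average(n):
    """Average number of components over all n**n maps (tiny n only)."""
    total = sum(count_cycles(f) for f in product(range(n), repeat=n))
    return total / n**n

def exact_average_components(n):
    """Sum over k of n! / ((n - k)! * n**k * k), computed incrementally."""
    total = 0.0
    term = 1.0  # running value of n! / ((n - k)! * n**k)
    for k in range(1, n + 1):
        term *= (n - k + 1) / n
        total += term / k
    return total

def kruskal_asymptotic(n):
    """Leading terms of Kruskal's asymptotic, without the o(1) term."""
    return 0.5 * math.log(n) + (math.log(2) + EULER_GAMMA) / 2

if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print(n, exact_average_components(n), kruskal_asymptotic(n))
```

The brute-force routine is only feasible for very small $n$; it is included to confirm that the finite sum counts what it claims to count.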
Recently, researchers have begun studying the analogous situation for polynomials and rational maps over finite fields. More precisely, if is a prime power, define if and if . (If there is no ambiguity, we will frequently write for .) Then we can ask the question: for , what is the average number of components of , for ranging over all polynomials (or rational maps) over of a fixed degree? For example, if we define
then we can ask:
Question 1.1.
For a prime power and , how does compare to ?
In this paper, we recast these questions in probabilistic terms. Specifically, in Section 2, we define two families of random variables whose interaction determines the answer to Question 1.1. (Briefly, both families of random variables have as sample space a certain collection of subsets of ; one random variable determines whether a collection is a cycle, and the other returns how many polynomials of a given degree pass through every point in a collection.)
Our main result, Theorem 3.3, states that if these two families of random variables satisfy a certain “noncorrelation hypothesis”, then
(See Heuristic 3.1 for an exact formulation of this hypothesis.) In Section 2, we define and study these random variables; in particular, we compute their expected values. In Section 3, we use the results from Section 2 to prove Theorem 3.3. In Section 4, we provide numerical evidence in support of Heuristic 3.1. Finally, in Section 5, we state the analogous results for rational functions, to which our arguments carry over easily.
Previous work of Flynn and the second author (see [FG14]) provided a partial answer to the question under discussion. In particular, they proved that if , then the average number of components of functional graphs of polynomials (or rational maps) of degree over is bounded below by
(this is Corollary 2.3 and Theorem 3.6 from [FG14]).
To describe their method, which is the starting point for this paper, we require a definition and an observation. If a map $f$ has a periodic point $s$ of period $k$, with orbit $\{s, f(s), \ldots, f^{k-1}(s)\}$, then we refer to this orbit as a cycle (cycles of length $k$ are called $k$-cycles). (See [VS04] for more exposition and illustrations of the cycle structure of functional graphs.) This definition is especially useful since it allows for the following observation.
Observation 1.2.
Components of are in one-to-one correspondence with the cycles of .
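Observation 1.2 can be checked exhaustively on small sets. The sketch below (illustrative names, not taken from the paper) counts components with a union-find structure on the underlying undirected graph and counts cycles by locating periodic points, then compares the two counts for every map on a four-element set.

```python
from itertools import product

def count_components(f):
    """Components of the undirected graph with an edge {v, f[v]} for each v."""
    n = len(f)
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v in range(n):
        root_target = find(f[v])
        parent[find(v)] = root_target
    return len({find(v) for v in range(n)})

def count_cycles(f):
    """Cycles of the map v -> f[v], found via its periodic points."""
    n = len(f)
    # After n steps every vertex has landed on a cycle, so the image of
    # the n-th iterate is exactly the set of periodic points.
    periodic = set()
    for v in range(n):
        for _ in range(n):
            v = f[v]
        periodic.add(v)
    cycles = 0
    for v in periodic:
        # Count each cycle once, at its smallest element.
        w, is_smallest = f[v], True
        while w != v:
            if w < v:
                is_smallest = False
            w = f[w]
        cycles += is_smallest
    return cycles

def observation_holds(n):
    """True if components and cycles agree for every map on n points."""
    return all(
        count_components(f) == count_cycles(f)
        for f in product(range(n), repeat=n)
    )
```

Calling `observation_holds(4)` runs through all 256 maps on four points.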
To obtain their results, Flynn and the second author used Lagrange interpolation to interpolate all the cycles of length smaller than the degree of the maps in question. Since they could not interpolate longer cycles,

they obtained only a lower bound for , and

their result required that be at least .
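To make the interpolation step concrete, here is a minimal Python sketch of Lagrange interpolation of a prescribed cycle. It assumes a prime field (arithmetic modulo a prime $p$) rather than a general finite field, and the helper names are hypothetical. It returns the unique polynomial of degree less than the cycle length that maps each point of the cycle to the next; it is not the counting argument of [FG14] itself.

```python
def cycle_to_points(cycle):
    """Turn a cycle (a0, a1, ..., a_{k-1}) into points (a_i, a_{i+1})."""
    k = len(cycle)
    return [(cycle[i], cycle[(i + 1) % k]) for i in range(k)]

def lagrange_interpolate(points, p):
    """Coefficients (lowest degree first) of the unique polynomial of degree
    < len(points) over the integers mod p through the given points.
    Assumes p is prime and the abscissas are distinct mod p."""
    k = len(points)
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(points):
        basis = [1]  # product of (x - xj) over j != i
        denom = 1    # product of (xi - xj) over j != i
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            new = [0] * (len(basis) + 1)
            for t, c in enumerate(basis):
                new[t] = (new[t] - xj * c) % p
                new[t + 1] = (new[t + 1] + c) % p
            basis = new
            denom = denom * (xi - xj) % p
        scale = yi * pow(denom, -1, p) % p  # modular inverse (Python 3.8+)
        for t, c in enumerate(basis):
            coeffs[t] = (coeffs[t] + scale * c) % p
    return coeffs

def eval_poly(coeffs, x, p):
    """Evaluate a polynomial mod p by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc
```

For instance, interpolating the cycle $1 \to 2 \to 4 \to 1$ modulo 7 recovers the map $x \mapsto 2x$. A cycle of length $k$ imposes $k$ conditions, so polynomials of degree less than $k$ generally have too few free coefficients to realize it; this is the obstruction described above.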
See Remark 2.5 for a discussion of the relationship between the results of this paper and the results of [FG14]; for example, they proved that the random variables mentioned above are indeed uncorrelated in certain cases.
The cycle structure of functional graphs of polynomials over finite fields has been studied extensively in certain cases. Vasiga and Shallit [VS04] studied the cycle structure of the functional graphs of $f$ over prime fields for the cases $f = x^2$ and $f = x^2 - 2$, as did Rogers [Rog96] for $f = x^2$. For any $m \in \mathbb{Z}^{>0}$, the squaring function is also defined over $\mathbb{Z}/m\mathbb{Z}$; Carlip and Mincheva [CM08] addressed this situation for certain $m$. Similarly, Chou and Shparlinski [CS04] studied the cycle structure of repeated exponentiation over finite fields of prime size. In the context of Pollard’s rho algorithm for factoring integers (see [Pol75]), researchers have provided copious data and heuristic arguments supporting the claim that quadratic polynomials produce as many “collisions” as random functions, but very little has been proven (see [Pol75] and [Bac91]). For many other aspects of functional graphs besides their cycle structure, see [FO90] for a study of about twenty characteristic parameters of random mappings in various settings.
More recently, Burnette and Schmutz [BS15] used the probabilistic point of view to study a similar question to the one we address here. If is a polynomial (or rational function) over , define the ultimate period of to be the least common multiple of the cycle lengths of . They found a lower bound for the average ultimate period of polynomials (and rational functions) of fixed degree, whenever the degree of the maps in question, and the size of the finite field, were large enough.
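The ultimate period defined above is straightforward to compute for a map given as a lookup table. The following sketch (the name `ultimate_period` is ours, and the table representation is an assumption made for illustration) walks every vertex onto its cycle and takes the least common multiple of the cycle lengths encountered.

```python
from math import gcd

def ultimate_period(table):
    """Least common multiple of the cycle lengths of the map v -> table[v]."""
    n = len(table)
    period = 1
    for v in range(n):
        # After n steps the walk from v is guaranteed to sit on a cycle.
        for _ in range(n):
            v = table[v]
        # Measure the length of that cycle.
        length, w = 1, table[v]
        while w != v:
            w = table[w]
            length += 1
        period = period * length // gcd(period, length)
    return period
```

This quadratic-time version favors brevity over speed; every vertex in a component reaches the same cycle, so each cycle length is simply folded into the running least common multiple several times.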
2. Two families of random variables
In this section, we define two families of random variables and compute their expected values. The interaction of these random variables determines the answer to Question 1.1; see Remark 2.4 and the remarks that follow for details about this interaction. For the remainder of the section, fix a prime power and positive integer . Now, for any set and , we say that is consistent if and only if it has the following property: if , then . Next, for any , define
Any element of defines a directed graph with vertex set and edge set ; let be the binary random variable that detects whether or not an element of defines a graph that happens to be a cycle. If and , we say that satisfies if for all . Next, we let be the random variable defined by
Before computing the expected values of and , we first mention the size of their sample space.
Remark 2.1.
If , then
Proof.
Since the elements of are consistent, there are possible choices for the sets of abscissas for any choice of ordinates. Since the ordinates of elements of are unrestricted, we conclude that . ∎
Remark 2.2.
If , then
Proof.
Since
we only need to count the number of elements in that are cycles. Since there are
cycles of length , we conclude by Remark 2.1. ∎
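The counts in Remarks 2.1 and 2.2 can be confirmed by brute force for very small parameters. In the sketch below, the symbols `q` (size of the ground set) and `k` (number of pairs) are our own labels, and "is a cycle" is interpreted as: the pairs, read as directed edges, form a single directed cycle on $k$ vertices. Under that reading, the enumeration should find $\binom{q}{k} q^k$ consistent sets, of which $\binom{q}{k}(k-1)!$ are cycles.

```python
from itertools import combinations, product
from math import comb, factorial

def is_consistent(pairs):
    """No abscissa appears twice: (a, b), (a, c) in the set forces b == c."""
    xs = [a for a, _ in pairs]
    return len(set(xs)) == len(xs)

def is_cycle(pairs):
    """True if the edges a -> b form one directed cycle through all k tails."""
    edges = dict(pairs)
    k = len(edges)
    start = next(iter(edges))
    v = start
    for step in range(1, k + 1):
        v = edges.get(v)
        if v is None:       # walked out of the set of abscissas
            return False
        if v == start:      # closed up: must happen exactly at step k
            return step == k
    return False

def census(q, k):
    """(number of consistent k-sets, number of those that are cycles)."""
    all_pairs = list(product(range(q), repeat=2))
    consistent = [c for c in combinations(all_pairs, k) if is_consistent(c)]
    cycles = sum(1 for c in consistent if is_cycle(c))
    return len(consistent), cycles

if __name__ == "__main__":
    q, k = 4, 3
    print(census(q, k), (comb(q, k) * q**k, comb(q, k) * factorial(k - 1)))
```

Dividing the second count by the first gives $(k-1)!/q^k$ as the probability that a uniformly chosen consistent set is a cycle.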
Proposition 2.3.
If , then
Proof.
Remark 2.4.
If we assume that are uncorrelated for all , then .
Proof.
Remark 2.5.
Unfortunately, we must face up to the fact that the random variables are not uncorrelated for all . Indeed, if they were, then the computations from Remark 2.4 would show that
But, if , then the quantity on the left is an integer, and the quantity on the right is not! In Section 3, we propose a heuristic that is more reasonable than the assumption that these two random variables are uncorrelated.
On the other hand, we should note that the variables are indeed uncorrelated whenever ; this is the content of Lemma 2.1 in [FG14].
3. The heuristic assumption and its implications
As mentioned in Remark 2.5, the variables are not uncorrelated for all . In this section, we propose a weaker heuristic for these variables, one which nevertheless implies .
Heuristic 3.1.
For any and any ,
Here, the implied constant depends only on .
In fact, Heuristic 3.1 implies more than ; we state the stronger implication here as a conjecture after one more definition. If and any , let
Conjecture 3.2.
For any and any ,
where the implied constant depends only on . In particular, .
Theorem 3.3.
If Heuristic 3.1 is true, then Conjecture 3.2 is true.
Proof.
As in the proof of Remark 2.4, Heuristic 3.1 immediately implies that
Next, we can apply Remarks 2.1 and 2.2, along with Proposition 2.3, to see that
To conclude, note that
∎
Remark 3.4.
The available numerical data suggests that the implied constants in Heuristic 3.1 could be quite small. For example, the constant for seems as if it could be as small as 60. (See Section 4 for more details on the available data.)
4. Numerical evidence
In constructing numerical evidence for Conjecture 3.2, we computed the number of cycles of every length for all polynomials in

of degree 2, up to , and

of degree 3 up to .
For the remainder of the section, we will address only the quadratic case; a similar analysis works for the cubic case.
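A computation of this kind can be sketched in a few lines of Python. The version below is a small-scale illustration only: it assumes a prime field (arithmetic modulo a prime $p$), enumerates every polynomial of exact degree $d$, and tallies cycles by length. It is not the code used to produce the data in this section, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

def eval_poly(coeffs, x, p):
    """Evaluate a polynomial (lowest-degree coefficient first) mod p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def cycle_lengths(table):
    """Lengths of all cycles of the map v -> table[v]."""
    n = len(table)
    state = [0] * n  # 0 = unvisited, 1 = on the current walk, 2 = finished
    lengths = []
    for start in range(n):
        v = start
        while state[v] == 0:
            state[v] = 1
            v = table[v]
        if state[v] == 1:  # new cycle through v; measure it
            length, w = 1, table[v]
            while w != v:
                length += 1
                w = table[w]
            lengths.append(length)
        v = start
        while state[v] == 1:
            state[v] = 2
            v = table[v]
    return lengths

def cycle_census(p, d):
    """Counter mapping cycle length -> number of such cycles, summed over
    all polynomials of exact degree d over the integers mod p (p prime)."""
    census = Counter()
    for coeffs in product(range(p), repeat=d + 1):
        if coeffs[-1] == 0:  # leading coefficient must be nonzero
            continue
        table = [eval_poly(coeffs, x, p) for x in range(p)]
        census.update(cycle_lengths(table))
    return census
```

The enumeration visits $(p-1)p^d$ polynomials, so it is practical only for small $p$. For $p = 3$ and $d = 2$ it finds no 3-cycles at all, because the only 3-cycles on a three-element field are the translations $x + 1$ and $x + 2$, which are linear rather than quadratic.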
Of course, if we let , then for any , there is certainly a constant—let’s call it —for which
There are two obvious questions to ask about these constants, which we will address in turn:

For any particular , how plausible is it that for all prime powers ?

Even if for all , does it seem likely that the implied constants are bounded, as asserted by Conjecture 3.2?
To answer the former question, we could plot, for various ,
But, as these numbers quickly become minuscule, it is convenient to let
that is, is the number of cycles appearing in functional graphs of polynomials in of degree . Conjecture 3.2 predicts that this quantity is about
which we will denote by . By the definition of , we know that for all and ,
As two examples of the data we have compiled, we include plots of and for , where and . These graphs are typical for .
4.1. Plots of and for
4.2. A plot of
To address the second question mentioned above, we plot the various values of in the hopes that they appear to be bounded.
This graph is below.
We should point out that the small values of in the above graph are a result of the fact that in our data, we simply found no cycles at all for all . So from onward, the above graph is simply plotting
This raises two questions:

As cycles of larger length arise for larger values of , will the size of increase?

Conversely, if these cycles do not arise promptly, will this increase the size of ?
Of course, we cannot answer these questions, but note that for the particular value of , the quadratic polynomials we tested yielded exactly cycles (all appearing when ), whereas for , they yielded exactly zero. That is, this is an example of a cycle of larger length arising without affecting the maximum of the s.
As for the second question, the lack of cycles will not cause to rise above 60 as long as the first cycle appears in a graph for a finite field of size less than . For example, the smallest for which 62-cycles appear is (which is well under ). The smallest cycle length that does not appear for is 43; if a 43-cycle does not appear by the time , then will rise above 60. It is unfortunately beyond our abilities to determine whether a 43-cycle appears by this time.
5. Rational functions
In this section, we briefly mention the results for rational functions, which are analogous to those for polynomials. For any prime power and , let
If , we can define in exactly the same way as .
To define our new families of random variables, for any prime power and , let
and be the binary random variable that detects whether or not an element of is a cycle. If , let be the random variable defined by
The rational function analogs of Remark 2.1, Remark 2.2, and Proposition 2.3 are proved as above, leading to the following conjecture, which again follows from the heuristic that the random variables are “uncorrelated enough”.
Conjecture 5.1.
For any and any ,
where the implied constant depends only on . In particular, .
Heuristic 5.2.
If , and , then
Here, the implied constant depends only on .
Theorem 5.3.
If Heuristic 5.2 is true, then Conjecture 5.1 is true.
Proof.
Similar to the proof of Theorem 3.3. ∎
Acknowledgments
The authors would like to thank Ian Dinwoodie, Rafe Jones, and Christopher Kramer for their help and advice.
References
 [Bac91] Eric Bach, Toward a theory of Pollard’s rho method, Inform. and Comput. 90 (1991), no. 2, 139–155. MR 1094034
 [BS15] C. Burnette and E. Schmutz, Periods of iterated rational functions over a finite field, ArXiv e-prints (2015).
 [CM08] Walter Carlip and Martina Mincheva, Symmetry of iteration graphs, Czechoslovak Math. J. 58(133) (2008), no. 1, 131–145. MR 2402530
 [CS04] Wun-Seng Chou and Igor E. Shparlinski, On the cycle structure of repeated exponentiation modulo a prime, J. Number Theory 107 (2004), no. 2, 345–356. MR 2072394
 [FG14] Ryan Flynn and Derek Garton, Graph components and dynamics over finite fields, Int. J. Number Theory 10 (2014), no. 3, 779–792. MR 3190008
 [FO90] Philippe Flajolet and Andrew M. Odlyzko, Random mapping statistics, Advances in cryptology—EUROCRYPT ’89 (Houthalen, 1989), Lecture Notes in Comput. Sci., vol. 434, Springer, Berlin, 1990, pp. 329–354. MR 1083961
 [Kru54] Martin D. Kruskal, The expected number of components under a random mapping function, Amer. Math. Monthly 61 (1954), 392–397. MR 0062973 (16,52b)
 [Pol75] J. M. Pollard, A Monte Carlo method for factorization, Nordisk Tidskr. Informationsbehandling (BIT) 15 (1975), no. 3, 331–334. MR 0392798 (52 #13611)
 [Rog96] Thomas D. Rogers, The graph of the square mapping on the prime fields, Discrete Math. 148 (1996), no. 1-3, 317–324. MR 1368298
 [VS04] Troy Vasiga and Jeffrey Shallit, On the iteration of certain quadratic maps over GF(p), Discrete Math. 277 (2004), no. 1-3, 219–240. MR 2033734 (2004k:05104)