Protection Number in Plane Trees

Clemens Heuberger Institut für Mathematik, Alpen-Adria-Universität Klagenfurt, Universitätsstraße 65–67, 9020 Klagenfurt, Austria  and  Helmut Prodinger Department of Mathematical Sciences, Stellenbosch University, 7602 Stellenbosch, South Africa
Abstract.

The protection number of a plane tree is the minimal distance of the root to a leaf; this definition carries over to an arbitrary node in a plane tree by considering the maximal subtree having this node as a root. We study the protection number of a uniformly chosen random tree of size $n$ and also the protection number of a uniformly chosen node in a uniformly chosen random tree of size $n$. The method is to apply singularity analysis to appropriate generating functions. Additional results are provided as well.

Key words and phrases:
Protected node, protection number, singularity analysis, plane tree
1991 Mathematics Subject Classification:
05C05; 05A15, 05A16
C. Heuberger is supported by the Austrian Science Fund (FWF): P 24644-N26. This paper has been written while he was a visitor at Stellenbosch University.
H. Prodinger is supported by an incentive grant of the National Research Foundation of South Africa.

1. Introduction

Cheon and Shapiro [2] started the study of 2-protected nodes in trees. A node enjoys this property if its distance to any leaf is at least 2. After this pioneering paper, a large number of papers has been published [13, 6, 1, 10, 5, 12].

In this paper we study the protection number of the root of a (rooted, plane) tree (in the older literature often called ordered tree): It is the minimal distance of the root to any leaf. Further, the protection number of any node is defined by taking the (maximal) subtree that has this node as the root.

Preliminary results on the subject have appeared in the recent paper [3], but we show that, thanks to a rigorous use of methods outlined in the book Analytic Combinatorics [9], we can go much further. We are able to solve a basic recurrence explicitly, which allows us to apply singularity analysis to the generating functions and to obtain, at least in principle, as many terms as one wants in the asymptotic expansions of interest. Further, one can derive explicit expressions for the probabilities in question.

Some curious observations related to the constants that appear are also made; they are linked to identities due to Dedekind, Ramanujan and others and are part of the toolkit of the modern analysis of algorithms.

2. Results

In a rooted plane tree $T$, a vertex $v$ is said to be $k$-protected if its minimum distance from a leaf is at least $k$. The tree $T$ is said to be $k$-protected if its root is $k$-protected.

We denote the maximal $k$ such that a vertex $v$ is $k$-protected by $\pi(v)$ and call it the protection number of $v$. The protection number of the root is called the protection number of the tree, $\pi(T)$. This means that a tree $T$ is $k$-protected if and only if $\pi(T)\ge k$.

If a vertex $v$ is a leaf, then $\pi(v)=0$ by definition. Otherwise, if it has children $w_1$, …, $w_\ell$, then

\[ \pi(v)=1+\min\{\pi(w_1),\ldots,\pi(w_\ell)\}. \]
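The recursive definition translates directly into code. The following is a minimal sketch for illustration only: it assumes that a plane tree is encoded as the tuple of its subtrees (so the empty tuple is a leaf), an encoding chosen here for convenience and not taken from the paper, and the function names are hypothetical.

```python
def protection_number(tree):
    """Protection number of the root of a plane tree.

    Assumed encoding (not from the paper): a tree is the tuple of its
    subtrees, listed left to right, so the empty tuple () is a leaf.
    """
    if not tree:                      # a leaf has protection number 0
        return 0
    # otherwise: 1 + the minimum over all children
    return 1 + min(protection_number(child) for child in tree)


def all_protection_numbers(tree):
    """Protection numbers of all vertices, listed in preorder."""
    result = [protection_number(tree)]
    for child in tree:
        result.extend(all_protection_numbers(child))
    return result


leaf = ()
cherry = (leaf, leaf)                 # root with two leaf children
example = ((leaf,), cherry)           # both children of the root are internal
print(protection_number(example))       # 2
print(all_protection_numbers(example))  # [2, 1, 0, 1, 0, 0]
```

Recomputing the protection number at every vertex is quadratic in the worst case; for the small examples considered here that is of no concern.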

We are interested in the random variables

• $X_n$, the protection number of a uniformly chosen random tree with $n$ vertices,

• $Y_n$, the protection number of a uniformly chosen vertex in a uniformly chosen random tree with $n$ vertices.

We will prove the following results.

Theorem 1.

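For $n\to\infty$, the protection number $X_n$ of a tree with $n$ vertices tends to a discrete limit distribution:

\begin{align*}
P(X_n=k)&=\frac{27\cdot 4^k(4^{2k}-1)}{(4^k+2)^2(2\cdot 4^k+1)^2}\\
&\quad+\frac{81\cdot 4^k\bigl(4(k-3)4^{6k}+36\cdot 4^{5k}-(45k-72)4^{4k}-80k\,4^{3k}-(45k+72)4^{2k}-36\cdot 4^k+4(k+3)\bigr)}{2(4^k+2)^4(2\cdot 4^k+1)^4}\,\frac1n\\
&\quad+O\Bigl(\frac{k^2}{3^k n^{3/2}}\Bigr).
\end{align*}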

Setting

\begin{align*}
c_0 &=\sum_{k\ge 1}\frac{9\cdot 4^k}{(4^k+2)^2}
 =1.622971384715353049514658203184345989635513668984063539407825\ldots,\\
c_1 &=\sum_{k\ge 1}\frac{9\cdot 4^k\bigl((3k-8)4^{2k}+28\cdot 4^k-(12k+20)\bigr)}{2(4^k+2)^4}
 =0.1311873689494231825244485810366733833577429413531428274982796\ldots,\\
c_2 &=\sum_{k\ge 1}\frac{9(2k-1)4^k}{(4^k+2)^2}-c_0^2
 =0.71569507178333266731548919868273628601066118785422617431075\ldots,\\
c_3 &=\sum_{k\ge 1}\frac{9(2k-1)4^k\bigl((3k-8)4^{2k}+28\cdot 4^k-(12k+20)\bigr)}{2(4^k+2)^4}-2c_0c_1
 =0.75206\ldots,
\end{align*}

its expectation and variance can be written as

\begin{align*}
E(X_n) &=c_0+c_1\frac1n+O\Bigl(\frac{1}{n^{3/2}}\Bigr), &
V(X_n) &=c_2+c_3\frac1n+O\Bigl(\frac{1}{n^{3/2}}\Bigr).
\end{align*}
Theorem 2.

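For $n\to\infty$, the protection number $Y_n$ of a random vertex of a random tree with $n$ vertices tends to a discrete limit distribution:

\begin{align*}
P(Y_n=k)&=\frac{9\cdot 4^k}{(4\cdot 4^k+2)(4^k+2)}\\
&\quad+\frac{3\cdot 4^k(4^k-1)\bigl((6k-22)4^{3k}+(21k+30)4^{2k}+(21k+96)4^k+(6k+58)\bigr)}{2(4^k+2)^3(2\cdot 4^k+1)^3}\,\frac1n
+O\Bigl(\frac{k^2}{3^k n^{2}}\Bigr).
\end{align*}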

Setting

\begin{align*}
d_0 &=\sum_{k\ge 1}\frac{3}{4^k+2}
 =0.727649276913726097531184400482145348863515722775042276537008\ldots,\\
d_1 &=\sum_{k\ge 1}\frac{(3k-10)4^{2k}+(6k+26)4^k-16}{2(4^k+2)^3}
 =-0.0311837125986222774945246489936100437425899128713521725307175\ldots,\\
d_2 &=\sum_{k\ge 1}\frac{3(2k-1)}{4^k+2}-d_0^2
 =0.81689937948362892278879205623322983539562628691031631640757\ldots,\\
d_3 &=\sum_{k\ge 1}\frac{(2k-1)\bigl((3k-10)4^{2k}+(6k+26)4^k-16\bigr)}{2(4^k+2)^3}-2d_0d_1
 =-0.008069\ldots,
\end{align*}

its expectation and variance can be written as

\begin{align*}
E(Y_n) &=d_0+d_1\frac1n+O\Bigl(\frac{1}{n^{3/2}}\Bigr), &
V(Y_n) &=d_2+d_3\frac1n+O\Bigl(\frac{1}{n^{3/2}}\Bigr).
\end{align*}
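The constants in Theorems 1 and 2 are rapidly converging series, so they can be evaluated by simple truncation. The sketch below does this with exact rational arithmetic; it assumes that 60 terms suffice for double precision (the terms decay like $k/4^k$), and the helper names `tail_constants`, `x_main`, `x_corr`, `y_main`, `y_corr` are hypothetical. The two functions per random variable are the constant term and the $1/n$ term of the tail probabilities $P(X_n\ge k)$ and $P(Y_n\ge k)$ from Propositions 3.2 and 4.2.

```python
from fractions import Fraction


def tail_constants(main, corr, terms=60):
    """Return (e0, e1, v0, v1) for tail probabilities main(k) + corr(k)/n.

    e0 + e1/n approximates the expectation, v0 + v1/n the variance.
    """
    ks = range(1, terms + 1)
    e0 = sum(main(k) for k in ks)
    e1 = sum(corr(k) for k in ks)
    v0 = sum((2*k - 1) * main(k) for k in ks) - e0**2
    v1 = sum((2*k - 1) * corr(k) for k in ks) - 2*e0*e1
    return tuple(float(c) for c in (e0, e1, v0, v1))


def x_main(k):
    x = 4**k
    return Fraction(9*x, (x + 2)**2)


def x_corr(k):
    x = 4**k
    return Fraction(9*x*((3*k - 8)*x**2 + 28*x - (12*k + 20)), 2*(x + 2)**4)


def y_main(k):
    x = 4**k
    return Fraction(3, x + 2)


def y_corr(k):
    x = 4**k
    return Fraction((3*k - 10)*x**2 + (6*k + 26)*x - 16, 2*(x + 2)**3)


c = tail_constants(x_main, x_corr)
d = tail_constants(y_main, y_corr)
print("c0..c3:", c)
print("d0..d3:", d)
```

Exact fractions are used only to rule out rounding effects in the subtractions; plain floating-point summation would most likely be accurate enough as well.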

3. Number of k-Protected Trees

In this section, we investigate the auxiliary quantity $r_{nk}$, the number of $k$-protected trees with $n$ vertices.

Let $\mathcal{R}_{\ge k}$ denote the set of all rooted plane trees with protection number $\ge k$. Its generating function is denoted by $R_{\ge k}(z)$, where $z$ labels the number of vertices of a tree, i.e.,

\[ R_{\ge k}(z)=\sum_{n\ge 1}r_{nk}z^n. \]
Lemma 3.1.

We have

\[ R_{\ge 0}(z)=\frac{1-\sqrt{1-4z}}{2}, \tag{1} \]

and

\[ R_{\ge k}(z)=\frac{(1-z)z^{k-2}(R_{\ge 0}(z))^3}{1+z^{k-2}(R_{\ge 0}(z))^3} \tag{2} \]

for $k\ge 0$.

Proof.

It is clear that $R_{\ge 0}(z)$ is the generating function of all rooted plane trees, which is well known to be given by (1), cf. for instance [9, § I.5.1].

For $k\ge 1$, the root of a tree is $k$-protected if and only if it has at least one child and all of its children are $(k-1)$-protected. Thus a tree is $k$-protected if and only if it consists of a root and a non-empty sequence of branches whose roots are $(k-1)$-protected. This translates into the symbolic equation shown in Figure 1 and thus into the equation

\[ R_{\ge k}(z)=\frac{zR_{\ge k-1}(z)}{1-R_{\ge k-1}(z)} \tag{3} \]

for $k\ge 1$.

It would now be easy to prove (2) by induction; however, we intend to derive (2).

We rewrite the recurrence (3) in the form

\[ R_{\ge k}(z)=-z+\frac{z}{1-R_{\ge k-1}(z)} \]

such that $R_{\ge k-1}(z)$ occurs only once.

Using the abbreviation $F_k=R_{\ge k}(z)+z$, this leads to

\[ F_k=\frac{z}{1+z-F_{k-1}}. \]

This is reminiscent of continued fractions. We use the Ansatz $F_k=z\,a_k/a_{k+1}$, resulting in the equation

\[ \frac{za_k}{a_{k+1}}=\frac{za_k}{(1+z)a_k-za_{k-1}}. \]

It is sufficient to require

\[ a_{k+1}=(1+z)a_k-za_{k-1} \]

for $k\ge 1$.

This is a linear recurrence whose characteristic equation has roots $1$ and $z$, so its solution has the form $a_k=A+Bz^k$. As common factors between $a_k$ and $a_{k+1}$ do not matter, we may choose $A=z^3$, which leads to $B=(R_{\ge 0}(z))^3$ by the initial value $F_0=R_{\ge 0}(z)+z$.

Thus

\[ R_{\ge k}(z)=\frac{z\bigl(z^3+z^k(R_{\ge 0}(z))^3\bigr)}{z^3+z^{k+1}(R_{\ge 0}(z))^3}-z, \]

which results in (2). ∎
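As a quick plausibility check of Lemma 3.1, one can iterate the recurrence (3) numerically and compare with the closed form (2). The sketch below does this in floating-point arithmetic at a few sample points $0<z<\frac14$; the sample points and the function names are arbitrary choices made for this illustration.

```python
import math


def r0(z):
    """Generating function of all plane trees, equation (1)."""
    return (1 - math.sqrt(1 - 4*z)) / 2


def by_recurrence(k, z):
    """Iterate recurrence (3) k times, starting from equation (1)."""
    value = r0(z)
    for _ in range(k):
        value = z * value / (1 - value)
    return value


def by_closed_form(k, z):
    """Evaluate the closed form (2)."""
    t = z**(k - 2) * r0(z)**3
    return (1 - z) * t / (1 + t)


z = 0.2   # arbitrary sample point with 0 < z < 1/4
for k in range(8):
    print(k, by_recurrence(k, z), by_closed_form(k, z))
```

The two columns should agree up to rounding; this is of course no substitute for the proof, only a guard against transcription errors in (2).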

Proposition 3.2.

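The probability that a tree with $n$ vertices is $k$-protected is

\begin{align*}
P(X_n\ge k) &=\frac{9\cdot 4^k}{(4^k+2)^2}
+\frac{9\cdot 4^k\bigl((3k-8)4^{2k}+28\cdot 4^k-(12k+20)\bigr)}{2(4^k+2)^4}\,\frac1n
+O\Bigl(\frac{k^2}{3^k n^{3/2}}\Bigr). \tag{4}
\end{align*}
Proof.

We intend to use singularity analysis ([8], [9, Chapter VI]). Let $z$ be in some $\Delta$-domain at $\frac14$ (see [9, Definition VI.1]) with $|z|<\frac13$. We have

\begin{align*}
z^{k-2} &=\Bigl(\frac14-\Bigl(\frac14-z\Bigr)\Bigr)^{k-2}
=\frac{16}{4^k}-\frac{16(k-2)}{4^k}(1-4z)+O\Bigl(\frac{k^2}{3^k}(1-4z)^2\Bigr).
\end{align*}

Inserting this into (2), we get

\begin{align*}
R_{\ge k}(z) &=\frac{3}{2\cdot 4^k+4}+\frac{-9\cdot 4^k}{2\cdot 4^{2k}+8\cdot 4^k+8}(1-4z)^{1/2}\\
&\quad+\frac{-3k\cdot 4^{2k}-6k\,4^k+16\cdot 4^{2k}-20\cdot 4^k+4}{2\cdot 4^{3k}+12\cdot 4^{2k}+24\cdot 4^k+16}(1-4z)\\
&\quad+\frac{9k\,4^{3k}-24\cdot 4^{3k}-36k\,4^k+84\cdot 4^{2k}-60\cdot 4^k}{2\cdot 4^{4k}+16\cdot 4^{3k}+48\cdot 4^{2k}+64\cdot 4^k+32}(1-4z)^{3/2}\\
&\quad+O\Bigl(\frac{k^2}{3^k}(1-4z)^2\Bigr). \tag{5}
\end{align*}

By singularity analysis, we obtain

\begin{align*}
r_{nk}&=\frac{9\cdot 4^k}{4\sqrt{\pi}\,(4^{2k}+4\cdot 4^k+4)}\,4^n n^{-3/2}\\
&\quad+\frac{9\bigl(12\cdot 4^{3k}k-29\cdot 4^{3k}+124\cdot 4^{2k}-48\cdot 4^k k-68\cdot 4^k\bigr)}{32\sqrt{\pi}\,(4^{4k}+8\cdot 4^{3k}+24\cdot 4^{2k}+32\cdot 4^k+16)}\,4^n n^{-5/2}
+O\Bigl(\frac{k^2 4^n}{3^k n^3}\Bigr).
\end{align*}

Dividing by the number $r_{n0}$ (the $(n-1)$st Catalan number) of rooted plane trees with $n$ vertices (this corresponds to setting $k=0$ above) yields (4).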
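The expansion (4) can be compared with exact values for moderately large $n$. The sketch below is an illustration under stated assumptions: it takes the exact counts $r_{nk}$ from the explicit binomial sum of Proposition 5.1, and the choice $n=1000$ as well as all function names are arbitrary and not from the paper.

```python
from fractions import Fraction
from math import comb


def binom(a, b):
    """Binomial coefficient that is 0 outside 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0


def r(n, k):
    """Exact number of k-protected trees with n vertices (Proposition 5.1)."""
    total, j = 0, 1
    while n - (k + 1)*j >= 0:
        top = 2*n - 3 - (2*k - 1)*j
        low = n - (k + 1)*j
        total += (-1)**(j - 1) * (binom(top, low) - binom(top, low - 3))
        j += 1
    return total


def exact_probability(n, k):
    """P(X_n >= k) as an exact fraction; r(n, 0) counts all trees."""
    return Fraction(r(n, k), r(n, 0))


def two_term_approximation(n, k):
    """Constant term and 1/n term of expansion (4)."""
    x = 4**k
    main = 9*x / (x + 2)**2
    corr = 9*x*((3*k - 8)*x**2 + 28*x - (12*k + 20)) / (2*(x + 2)**4)
    return main + corr/n


n = 1000
for k in range(1, 7):
    print(k, float(exact_probability(n, k)), two_term_approximation(n, k))
```

For $k=1$ both columns equal $1$ for every $n\ge 2$, because every tree with at least two vertices has a root that is not a leaf and the $1/n$ coefficient in (4) vanishes at $k=1$.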

Proof of Theorem 1.

We use $P(X_n=k)=P(X_n\ge k)-P(X_n\ge k+1)$ and Proposition 3.2 to prove the limit theorem.

The expectation follows from the well-known formula

\[ E(X_n)=\sum_{k\ge 1}P(X_n\ge k), \]

which is valid for all random variables with non-negative integer values.

Similarly, the variance follows from $V(X_n)=E(X_n^2)-E(X_n)^2$ and

\[ E(X_n^2)=\sum_{k\ge 1}k^2P(X_n=k)=\sum_{k\ge 1}(2k-1)P(X_n\ge k). \]

4. Protection Numbers of all Vertices

We now turn to the protection numbers of arbitrary vertices. We fix some $k$ and consider the number $s_{nk}$ of $k$-protected vertices summed over all trees of size $n$. The corresponding generating function is denoted by

\[ S_{\ge k}(z)=\sum_{n\ge 0}s_{nk}z^n, \]

where again $z$ labels the number of vertices.

Lemma 4.1.

We have

\[ S_{\ge k}(z)=\frac12 R_{\ge k}(z)\Bigl(1+\frac{1}{\sqrt{1-4z}}\Bigr). \tag{6} \]
Proof.

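In the language of the symbolic method, the generating function $S_{\ge k}(z)$ corresponds to the class $\Theta_k\mathcal{R}_{\ge 0}$ of rooted plane trees with a distinguished $k$-protected vertex, where $\Theta_k$ denotes pointing to a $k$-protected vertex, cf. [9, Definition I.14].

A tree $T$ and a $k$-protected vertex $v$ in this tree bijectively correspond to a $k$-protected tree $T_1$ whose root is merged with a leaf of another tree $T_2$, cf. Figure 2.

Thus $\mathcal{Z}\times\Theta_k\mathcal{R}_{\ge 0}$ corresponds bijectively to $\mathcal{R}_{\ge k}\times\Theta_{\mathrm{leaf}}\mathcal{R}_{\ge 0}$, where $\Theta_{\mathrm{leaf}}$ denotes pointing at a leaf and the factor $\mathcal{Z}$ on the left-hand side denotes one single vertex, which compensates for the fact that merging the root of one tree with a leaf of the other tree reduces the number of vertices by $1$.

Thus $zS_{\ge k}(z)=R_{\ge k}(z)L(z)$, where $L(z)$ denotes the generating function of $\Theta_{\mathrm{leaf}}\mathcal{R}_{\ge 0}$ with respect to the number of vertices. Note that the pointing is with respect to leaves, but $L(z)$ is a generating function with respect to all vertices.

Let

\[ T(v,z)=\sum_{n\ge 0}\sum_{\ell\ge 0}N_{n-1,\ell}v^{\ell}z^n \]

denote the generating function of all rooted plane trees, where $z$ marks the number of vertices and $v$ marks the number of leaves. Here, $N_{n-1,\ell}$ denotes the Narayana number counting the number of trees with $n$ vertices and $\ell$ leaves.

It is a well-known consequence of the symbolic method that

\[ T(v,z)=zv+\frac{zT(v,z)}{1-T(v,z)}, \]

cf. [9, Example III.13]. This yields the explicit expression

\[ T(v,z)=\frac{1-z+vz-\sqrt{(v-1)^2z^2-2(v+1)z+1}}{2}. \]

Pointing corresponds to applying $v\frac{\partial}{\partial v}$, cf. [9, Theorem I.4]. Setting $v=1$ then leads to

\[ L(z)=\Bigl(v\frac{\partial T(v,z)}{\partial v}\Bigr)\Big|_{v=1}=\frac{z}{2}\Bigl(1+\frac{1}{\sqrt{1-4z}}\Bigr). \]

This yields (6). ∎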
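Lemma 4.1 can be checked by brute force for small trees. The sketch below enumerates all plane trees with up to seven vertices, counts $k$-protected trees and $k$-protected vertices directly, and compares the latter with the coefficients of (6), using $\frac{1}{\sqrt{1-4z}}=\sum_{m\ge 0}\binom{2m}{m}z^m$. The tuple encoding of trees and all function names are assumptions made for this illustration.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def forests(m):
    """All sequences of plane trees with m vertices in total (nested tuples)."""
    if m == 0:
        return ((),)
    return tuple((tree,) + rest
                 for first in range(1, m + 1)
                 for tree in plane_trees(first)
                 for rest in forests(m - first))


def plane_trees(n):
    """All plane trees with n vertices; a tree is the tuple of its subtrees."""
    return forests(n - 1)


def protection_numbers(tree):
    """Return (protection number of the root, list for all vertices)."""
    if not tree:
        return 0, [0]
    roots, everything = [], []
    for child in tree:
        p, sub = protection_numbers(child)
        roots.append(p)
        everything.extend(sub)
    here = 1 + min(roots)
    return here, [here] + everything


def counts(n, k):
    """(r_nk, s_nk) by brute force over all trees with n vertices."""
    r = s = 0
    for tree in plane_trees(n):
        root, everything = protection_numbers(tree)
        r += root >= k
        s += sum(p >= k for p in everything)
    return r, s


def s_from_lemma(n, k):
    """Coefficient of z^n in R_k(z) (1 + 1/sqrt(1-4z)) / 2."""
    conv = sum(counts(m, k)[0] * comb(2*(n - m), n - m) for m in range(1, n + 1))
    return (counts(n, k)[0] + conv) // 2


for k in range(3):
    print(k, [counts(n, k)[1] for n in range(1, 8)],
          [s_from_lemma(n, k) for n in range(1, 8)])
```

The enumeration grows like the Catalan numbers, so this check is only practical for very small $n$.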

Proposition 4.2.

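The probability that a random vertex of a random tree with $n$ vertices has protection number at least $k$ is

\[ P(Y_n\ge k)=\frac{3}{4^k+2}+\frac{(3k-10)4^{2k}+(6k+26)4^k-16}{2(4^k+2)^3}\,\frac1n+O\Bigl(\frac{k^2}{3^k n^{2}}\Bigr). \]
Proof.

By (6) and (5), we get

\begin{align*}
S_{\ge k}(z) &=\frac{3}{(4\cdot 4^k+8)(1-4z)^{1/2}}-\frac{3\cdot 4^k-3}{2\cdot 4^{2k}+8\cdot 4^k+8}\\
&\quad-\frac{(3k-7)4^{2k}+(6k+38)4^k-4}{4\cdot 4^{3k}+24\cdot 4^{2k}+48\cdot 4^k+32}(1-4z)^{1/2}\\
&\quad+\frac{(3k-4)4^{3k}-6(k-8)4^{2k}-24(k+2)4^k+4}{2\cdot 4^{4k}+16\cdot 4^{3k}+48\cdot 4^{2k}+64\cdot 4^k+32}(1-4z)\\
&\quad+O\Bigl(\frac{k^2}{3^k}(1-4z)^{3/2}\Bigr).
\end{align*}

By singularity analysis, we get

\[ s_{nk}=\frac{3}{4\sqrt{\pi}\,(4^k+2)}\,4^n n^{-1/2}+\frac{(12k-31)4^{2k}+(24k+140)4^k-28}{32\sqrt{\pi}\,(4^{3k}+6\cdot 4^{2k}+12\cdot 4^k+8)}\,4^n n^{-3/2}+O\Bigl(\frac{k^2 4^n}{3^k n^{5/2}}\Bigr). \]

Dividing by the number $r_{n0}$ of all trees and by the number $n$ of vertices yields the expansion stated in the proposition. ∎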
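The same kind of comparison is possible for Proposition 4.2. The sketch below computes $r_{nk}$ from the recurrence (3) by power series arithmetic, obtains $s_{nk}$ from (6), and divides by $n\,r_{n0}=\binom{2n-2}{n-1}$. The value $n=400$ and the function names are arbitrary choices for this illustration.

```python
from math import comb


def protected_tree_counts(k, N):
    """Coefficients r[n] = r_nk for 0 <= n <= N, via recurrence (3)."""
    # start with the Catalan numbers: r[n] = number of plane trees with n vertices
    r = [0] + [comb(2*n - 2, n - 1) // n for n in range(1, N + 1)]
    for _ in range(k):
        q = [0] * (N + 1)              # q = r / (1 - r), i.e. q = r + r*q
        for n in range(1, N + 1):
            q[n] = r[n] + sum(r[i] * q[n - i] for i in range(1, n))
        r = [0] + q[:N]                # multiply by z
    return r


def protected_vertex_count(k, n):
    """s_nk from Lemma 4.1: coefficient of z^n in R_k(z) (1 + 1/sqrt(1-4z)) / 2."""
    r = protected_tree_counts(k, n)
    conv = sum(r[m] * comb(2*(n - m), n - m) for m in range(1, n + 1))
    return (r[n] + conv) // 2


def exact(n, k):
    """P(Y_n >= k): divide by (number of trees) * (number of vertices)."""
    return protected_vertex_count(k, n) / comb(2*n - 2, n - 1)


def approx(n, k):
    """Constant term and 1/n term of Proposition 4.2."""
    x = 4**k
    return 3/(x + 2) + ((3*k - 10)*x**2 + (6*k + 26)*x - 16) / (2*(x + 2)**3 * n)


n = 400
for k in range(1, 5):
    print(k, exact(n, k), approx(n, k))
```

For $k=1$ both values are $\frac12$: on average exactly half of the vertices of a plane tree with at least two vertices are leaves.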

Proof of Theorem 2.

Theorem 2 follows from Proposition 4.2 in the same way as Theorem 1 follows from Proposition 3.2. ∎

5. Explicit formula for the number of ≥k-protected trees

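Our goal is to read off the coefficient of $z^n$ in formula (2) in explicit form.

Proposition 5.1.

The number of $k$-protected trees with $n$ vertices is

\[ r_{nk}=\sum_{j\ge 1}(-1)^{j-1}\biggl[\binom{2n-3-(2k-1)j}{n-(k+1)j}-\binom{2n-3-(2k-1)j}{n-3-(k+1)j}\biggr]. \]
Proof.

We use the substitution $z=\frac{u}{(1+u)^2}$, which was introduced in [4], and rewrite (2) as

\begin{align*}
R_{\ge k}\Bigl(\frac{u}{(1+u)^2}\Bigr)
&=\frac{1-u^3}{(1-u)(1+u)^2}\,\frac{\frac{u^{k+1}}{(1+u)^{2k-1}}}{1+\frac{u^{k+1}}{(1+u)^{2k-1}}}\\
&=\frac{1-u^3}{(1-u)(1+u)^2}\sum_{j\ge 1}(-1)^{j-1}\frac{u^{(k+1)j}}{(1+u)^{(2k-1)j}}.
\end{align*}

Extracting coefficients is now done with the Cauchy integral formula:

\begin{align*}
[z^n]\,\frac{1-u^3}{(1-u)(1+u)^2}&\sum_{j\ge 1}(-1)^{j-1}\frac{u^{(k+1)j}}{(1+u)^{(2k-1)j}}\\
&=\frac{1}{2\pi i}\oint\frac{dz}{z^{n+1}}\,\frac{1-u^3}{(1-u)(1+u)^2}\sum_{j\ge 1}(-1)^{j-1}\frac{u^{(k+1)j}}{(1+u)^{(2k-1)j}}\\
&=\frac{1}{2\pi i}\oint\frac{du}{u^{n+1}}\,(1+u)^{2n-3}(1-u^3)\sum_{j\ge 1}(-1)^{j-1}\frac{u^{(k+1)j}}{(1+u)^{(2k-1)j}}\\
&=\sum_{j\ge 1}(-1)^{j-1}[u^{n-(k+1)j}]\,(1-u^3)(1+u)^{2n-3-(2k-1)j}.
\end{align*}

Using the binomial theorem yields the assertion. ∎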
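The explicit formula is easy to cross-check against the recurrence (3). The sketch below is an illustration with hypothetical function names; it treats binomial coefficients with a negative lower index as zero, which is the convention implicit in the coefficient extraction above.

```python
from math import comb


def binom(a, b):
    """Binomial coefficient that is 0 outside 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0


def r_explicit(n, k):
    """Number of k-protected trees with n vertices via Proposition 5.1."""
    total, j = 0, 1
    while n - (k + 1)*j >= 0:          # later terms vanish
        top = 2*n - 3 - (2*k - 1)*j
        low = n - (k + 1)*j
        total += (-1)**(j - 1) * (binom(top, low) - binom(top, low - 3))
        j += 1
    return total


def r_by_recurrence(k, N):
    """Coefficients r[n] = r_nk for 0 <= n <= N via recurrence (3)."""
    r = [0] + [comb(2*n - 2, n - 1) // n for n in range(1, N + 1)]
    for _ in range(k):
        q = [0] * (N + 1)              # q = r / (1 - r), i.e. q = r + r*q
        for n in range(1, N + 1):
            q[n] = r[n] + sum(r[i] * q[n - i] for i in range(1, n))
        r = [0] + q[:N]                # multiply by z
    return r


print([r_explicit(n, 2) for n in range(3, 11)])
# [1, 2, 6, 18, 57, 186, 622, 2120]
print(r_by_recurrence(2, 10)[3:])
# [1, 2, 6, 18, 57, 186, 622, 2120]
```

The explicit sum needs only about $n/(k+1)$ binomial coefficients per value, whereas the series recurrence costs quadratic time per level.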

6. Functional equations for the constants

Two of the constants satisfy attractive and non-trivial functional equations. This phenomenon is not uncommon in the analysis of algorithms; we point out the paper [11] where it was first observed and the survey [14] which contains many references to earlier papers.

The first example is the constant

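\[ c_0=\frac92\sum_{k\ge 1}\frac{2^{2k-1}}{(2^{2k-1}+1)^2}=\frac92F(\log 2), \]

with

\[ F(x):=\sum_{k\ge 1}\frac{e^{(2k-1)x}}{(e^{(2k-1)x}+1)^2}=\sum_{k\ge 1}\frac{e^{-(2k-1)x}}{(1+e^{-(2k-1)x})^2}=\sum_{k,j\ge 1}(-1)^{j-1}j\,e^{-(2k-1)jx}. \]
Proposition 6.1.

We have the functional equation

\[ F(x)=\frac{1}{4x}-\frac{\pi^2}{x^2}F\Bigl(\frac{\pi^2}{x}\Bigr). \]

Since $F(\pi^2/\log 2)$ is very small, we have the near-identity $c_0\approx\frac{9}{8\log 2}$.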

Proof.

We compute the Mellin transform of $F$ (see [7]), which exists (at least) in the fundamental strip $\langle 2,\infty\rangle$:

\[ F^*(s)=\Gamma(s)\sum_{k,j\ge 1}(-1)^{j-1}j\,(2k-1)^{-s}j^{-s}=\Gamma(s)\zeta(s-1)\zeta(s)(1-2^{2-s})(1-2^{-s}). \]

The inversion formula for the Mellin transform gives the original function back (integration is along vertical lines). We shift the line of integration to the left and collect residues:

\begin{align*}
F(x)&=\frac{1}{2\pi i}\int_{(\frac52)}\Gamma(s)\zeta(s-1)\zeta(s)(1-2^{2-s})(1-2^{-s})x^{-s}\,ds\\
&=\frac{1}{4x}+\frac{1}{2\pi i}\int_{(-\frac52)}\Gamma(s)\zeta(s-1)\zeta(s)h(s)x^{-s}\,ds
\end{align*}

for

\[ h(s)=(1-2^{2-s})(1-2^{-s}). \]

In the remainder of this proof, the relation

\[ h(2-s)=2^{2s-2}h(s) \tag{7} \]

will be the only property of $h$ that we will use.

We now use the duplication formula for the Gamma function and the functional equation for the Riemann zeta function, a substitution and then again a shift of the line of integration.

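Our second example relates to a sum that appears within the constant $d_2$:

\[ S=\frac32\sum_{k\ge 1}\frac{2k-1}{2^{2k-1}+1}=\frac32\sum_{k\ge 1}\frac{(2k-1)2^{-2k+1}}{1+2^{-2k+1}}=\frac32\sum_{k,j\ge 1}(-1)^{j-1}(2k-1)2^{-j(2k-1)}=\frac32G(\log 2) \]

with

\[ G(x)=\sum_{k,j\ge 1}(-1)^{j-1}(2k-1)e^{-j(2k-1)x}. \]
Proposition 6.2.

We have the functional equation

\[ G(x)=\frac{\pi^2}{24x^2}+\frac{1}{24}-\frac{\pi^2}{x^2}G\Bigl(\frac{\pi^2}{x}\Bigr). \]

Since $G(\pi^2/\log 2)$ is very small, we have the near-identity $S\approx\frac{\pi^2}{16\log^2 2}+\frac{1}{16}$.

Proof.

The Mellin transform of $G$ is

\[ \Gamma(s)\sum_{k,j\ge 1}(-1)^{j-1}j^{-s}(2k-1)^{-s+1}=\Gamma(s)\zeta(s)\zeta(s-1)(1-2^{1-s})^2. \]

The proof of Proposition 6.1 applies with $h(s)$ replaced by $(1-2^{1-s})^2$, which again has the property (7). ∎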
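Both functional equations are easy to test numerically. The sketch below evaluates $F$ and $G$ through their single sums over $k$ (the sum over $j$ is a geometric-type series that has been summed in closed form) and compares both sides at $x=\log 2$. The truncation at 400 terms and the function names are assumptions made for this illustration.

```python
import math


def F(x, terms=400):
    """F(x) = sum over k of y/(1+y)^2 with y = exp(-(2k-1)x)."""
    total = 0.0
    for k in range(1, terms + 1):
        y = math.exp(-(2*k - 1) * x)
        total += y / (1 + y)**2
    return total


def G(x, terms=400):
    """G(x) = sum over k of (2k-1) y/(1+y) with y = exp(-(2k-1)x)."""
    total = 0.0
    for k in range(1, terms + 1):
        y = math.exp(-(2*k - 1) * x)
        total += (2*k - 1) * y / (1 + y)
    return total


def F_rhs(x):
    """Right-hand side of the functional equation of Proposition 6.1."""
    return 1/(4*x) - (math.pi/x)**2 * F(math.pi**2 / x)


def G_rhs(x):
    """Right-hand side of the functional equation of Proposition 6.2."""
    return math.pi**2/(24*x**2) + 1/24 - (math.pi/x)**2 * G(math.pi**2 / x)


x = math.log(2)
print("F:", F(x), F_rhs(x))
print("G:", G(x), G_rhs(x))
print("c0 and its near-identity:", 4.5*F(x), 9/(8*x))
```

At $x=\pi$ the equations reduce to $F(\pi)=\frac{1}{8\pi}$ and $G(\pi)=\frac{1}{24}$, which gives a particularly simple sanity check.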

References
