A combinatorial proof for Cayley’s identity
Abstract.
In [3], Caracciolo, Sokal and Sportiello presented, inter alia, an algebraic/combinatorial proof for Cayley’s identity. The purpose of the present paper is to give a “purely combinatorial” proof for this identity; i.e., a proof involving only combinatorial arguments together with a generalization of Laplace’s Theorem [6, section 148], for which a “purely combinatorial” proof is given in [4, proof of Theorem 6].
1. Introduction
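For $n \in \mathbb{N}$, denote by $[n]$ the set $\{1, 2, \dots, n\}$ and let $X = (x_{i,j})_{i,j \in [n]}$ be an $n \times n$ matrix of indeterminates. For $I \subseteq [n]$ and $J \subseteq [n]$, we denote

the minor of $X$ corresponding to the rows $i \in I$ and the columns $j \in J$ by $\det X_{I,J}$,

the cominor of $\det X_{I,J}$ (which corresponds to the rows $i \notin I$ and the columns $j \notin J$) by $\det X_{\bar{I},\bar{J}}$.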
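Let $A$ be a finite ordered set, and let $B$ be a subset of $A$. We define
$$\Sigma(B \subseteq A) := \sum_{b \in B} (\text{position of } b \text{ in } A),$$
so that $\Sigma(B \subseteq [n]) = \sum_{b \in B} b$.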
As pointed out in [3, Section 2.6], the following identity is conventionally but erroneously attributed to Cayley. (Muir [5, vol. 4, p. 479] attributes this identity to Vivanti [7].)
Theorem 1 (Cayley’s Identity).
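Consider $X = (x_{i,j})_{i,j \in [n]}$, and let $\partial = \left(\frac{\partial}{\partial x_{i,j}}\right)_{i,j \in [n]}$ be the corresponding matrix of partial derivatives ($\det \partial$ is also known as Cayley’s $\Omega$-process). Let $I, J \subseteq [n]$ with $|I| = |J| = k$. Then we have for $s \in \mathbb{N}$:
(1) $\det(\partial_{I,J})\,(\det X)^s = s(s+1)\cdots(s+k-1)\,(\det X)^{s-1}\,(-1)^{\sum_{i \in I} i + \sum_{j \in J} j}\,\det X_{\bar{I},\bar{J}}.$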
By the alternating property of the determinant, Cayley’s Identity is in fact equivalent to the following special case of (1).
Corollary 1 (Vivanti’s Theorem).
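Specialize $I = J = [k]$ for some $k \le n$ in Theorem 1. Then we have for $s \in \mathbb{N}$:
(2) $\det(\partial_{[k],[k]})\,(\det X)^s = s(s+1)\cdots(s+k-1)\,(\det X)^{s-1}\,\det X_{\overline{[k]},\overline{[k]}}.$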
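As a sanity check, identity (2) can be verified mechanically for small cases. The following is a minimal sketch, assuming polynomials in the entries $x_{i,j}$ are stored as dictionaries from exponent tuples to integer coefficients; all function names (`minor`, `apply_det_of_partials`, `vivanti_holds`) are hypothetical helpers introduced only for this illustration, and indices are 0-based.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation (tuple of 0-based images), via its inversion count."""
    inv = sum(1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inv % 2 else 1

# A polynomial in the n*n variables x_{i,j} is a dict {exponent tuple: coefficient};
# the variable x_{i,j} (0-based i, j) sits at position i*n + j of the exponent tuple.

def poly_mul(f, g):
    h = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            h[m] = h.get(m, 0) + c1 * c2
    return {m: c for m, c in h.items() if c != 0}

def poly_diff(f, v):
    """Partial derivative with respect to the variable at position v."""
    return {m[:v] + (m[v] - 1,) + m[v + 1:]: c * m[v] for m, c in f.items() if m[v] > 0}

def minor(n, rows, cols):
    """Determinant of the submatrix of X on the given rows and columns."""
    f = {}
    for p in permutations(range(len(rows))):
        m = [0] * (n * n)
        for a, b in enumerate(p):
            m[rows[a] * n + cols[b]] = 1
        f[tuple(m)] = sign(p)  # distinct permutations give distinct monomials
    return f

def apply_det_of_partials(f, n, k):
    """Apply det(d/dx_{i,j}) for i, j < k, expanded over permutations of k elements."""
    result = {}
    for p in permutations(range(k)):
        g = f
        for a in range(k):
            g = poly_diff(g, a * n + p[a])
        for m, c in g.items():
            result[m] = result.get(m, 0) + sign(p) * c
    return {m: c for m, c in result.items() if c != 0}

def vivanti_holds(n, k, s):
    """Compare both sides of (2) for an n x n matrix, the first k rows/columns, power s."""
    det_x = minor(n, range(n), range(n))
    power = {(0,) * (n * n): 1}          # (det X)^0
    for _ in range(s - 1):
        power = poly_mul(power, det_x)   # ends as (det X)^(s-1)
    lhs = apply_det_of_partials(poly_mul(power, det_x), n, k)
    factor = 1
    for t in range(k):
        factor *= s + t                  # s(s+1)...(s+k-1)
    cominor = minor(n, range(k, n), range(k, n))
    rhs = {m: factor * c for m, c in poly_mul(power, cominor).items()}
    return lhs == rhs
```

Such a check is of course no substitute for the proof; it only confirms the statement for the chosen values of $n$, $k$ and $s$.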
2. Combinatorial proof of Vivanti’s Theorem
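We may view the determinant of $X$ as the generating function of all permutations $\pi$ in $\mathfrak{S}_n$, where the (signed) weight of a permutation is given as $\operatorname{sgn}(\pi) \prod_{i=1}^{n} x_{i,\pi(i)}$:
$$\det X = \sum_{\pi \in \mathfrak{S}_n} \operatorname{sgn}(\pi) \prod_{i=1}^{n} x_{i,\pi(i)}.$$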
2.1. View permutations as perfect matchings
For our considerations, it is convenient to view a permutation $\pi \in \mathfrak{S}_n$ as a perfect matching of the complete bipartite graph $K_{n,n}$, where the vertices consist of two copies of $[n]$ which are arranged in their natural order; see Figure 1 for an illustration of this simple idea. It is easy to see that the edges of such a perfect matching can be drawn in a way such that all intersections are of precisely two (and not more) edges, and that the number of these intersections equals the number $\operatorname{inv}(\pi)$ of inversions of $\pi$, whence the sign of $\pi$ is
$$\operatorname{sgn}(\pi) = (-1)^{\operatorname{inv}(\pi)}.$$
This simple visualization of permutations and their inversions is already used in [1, §15, p. 32]: We call it the permutation diagram. So assigning weight $x_{i,j}$ to the edge pointing from $i$ to $j$ and defining the weight $\omega(\pi)$ of the permutation diagram to be the product of the weights of the edges belonging to $\pi$, we may write
$$\det X = \sum_{\pi \in \mathfrak{S}_n} \operatorname{sgn}(\pi)\,\omega(\pi).$$
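The permutation-diagram view can be tried out directly. The sketch below is only an illustration under the assumption that a permutation is stored as a tuple of 0-based images; the names `crossings`, `sign_by_cycles` and `det_by_diagrams` are hypothetical. It counts crossing pairs of edges, compares the resulting sign with the sign obtained independently from the cycle decomposition, and evaluates a determinant as a signed sum of diagram weights.

```python
from itertools import permutations

def crossings(perm):
    """Number of pairs of edges (i, perm[i]), (j, perm[j]) with i < j that cross."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])

def sign_by_cycles(perm):
    """Sign computed independently as (-1)^(n - number of cycles)."""
    n, seen, cycles = len(perm), [False] * len(perm), 0
    for start in range(n):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return -1 if (n - cycles) % 2 else 1

def det_by_diagrams(matrix):
    """Determinant as the signed sum of the weights of all permutation diagrams."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        weight = 1
        for i in range(n):
            weight *= matrix[i][perm[i]]  # weight of the edge from i to perm[i]
        total += (-1) ** crossings(perm) * weight
    return total
```

For instance, `det_by_diagrams([[1, 2], [3, 4]])` sums the weight $1 \cdot 4$ of the crossing-free diagram and the weight $2 \cdot 3$ of the diagram with one crossing, with opposite signs.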
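Given this view, the combinatorial interpretation of the $s$-th power of the determinant is obvious: It is the generating function of all $s$-tuples $(\pi_1, \dots, \pi_s)$ of permutation diagrams, where the (signed) weight of such a tuple is given as
$$\prod_{j=1}^{s} \operatorname{sgn}(\pi_j) \prod_{i=1}^{n} x_{i,\pi_j(i)}.$$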
(See Figure 2 for an illustration.)
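The interpretation of the power of a determinant as a sum over tuples of diagrams can be checked numerically. The following is a small sketch under the assumption that the matrix has integer entries and that permutations are 0-based tuples; the function names are hypothetical.

```python
from itertools import permutations, product

def signed_weight(matrix, perm):
    """Signed weight of one permutation diagram: sign times product of edge weights."""
    n = len(perm)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    weight = -1 if inv % 2 else 1
    for i, j in enumerate(perm):
        weight *= matrix[i][j]
    return weight

def det(matrix):
    return sum(signed_weight(matrix, p) for p in permutations(range(len(matrix))))

def power_by_tuples(matrix, s):
    """Sum of the signed weights of all s-tuples of permutation diagrams."""
    perms = list(permutations(range(len(matrix))))
    total = 0
    for diagrams in product(perms, repeat=s):
        term = 1
        for p in diagrams:
            term *= signed_weight(matrix, p)
        total += term
    return total
```

For any integer matrix `M`, `power_by_tuples(M, s)` should agree with `det(M) ** s`, which is just the distributive law applied to the product of `s` copies of the determinant.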
2.2. Action of the determinant of partial derivatives
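Next we need to describe combinatorially the action of the determinant of partial derivatives. Let $T = (\sigma_1, \dots, \sigma_s)$ be an $s$-tuple of permutation diagrams counted in the generating function $(\det X)^s$, and let $\pi \in \mathfrak{S}_k$: Then the summand
$$\operatorname{sgn}(\pi)\,\partial_\pi := \operatorname{sgn}(\pi) \prod_{i=1}^{k} \frac{\partial}{\partial x_{i,\pi(i)}}$$
of $\det(\partial_{[k],[k]})$, applied to the weight $\omega(T) := \prod_{j=1}^{s} \prod_{i=1}^{n} x_{i,\sigma_j(i)}$ of $T$, yields
$$\operatorname{sgn}(\pi)\, c(\pi, T)\, \frac{\omega(T)}{\prod_{i=1}^{k} x_{i,\pi(i)}},$$
where $c(\pi, T)$ is the number of ways to choose the set of edges of $\pi$ from all the edges in $T$ (this number, of course, might be zero, in which case the result is zero). We may visualize the action of $\partial_\pi$ as “erasing the edges constituting $\pi$ in $T$”; see Figure 3 for an illustration.
Hence we have:
(3) $\det(\partial_{[k],[k]})\,(\det X)^s = \sum_{T = (\sigma_1, \dots, \sigma_s)} \Big( \prod_{j=1}^{s} \operatorname{sgn}(\sigma_j) \Big) \sum_{\pi \in \mathfrak{S}_k} \operatorname{sgn}(\pi)\, c(\pi, T)\, \frac{\omega(T)}{\prod_{i=1}^{k} x_{i,\pi(i)}}.$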
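The number of ways to choose the edges of a permutation from a tuple of permutation diagrams can be computed in two ways: by counting choices edge by edge, or by differentiating the monomial of the tuple. The sketch below compares both; it assumes 0-based tuples, and the names `ways_to_choose` and `derivative_coefficient` are hypothetical.

```python
from collections import Counter

def ways_to_choose(pi, diagrams):
    """Number of ways to pick, for each i, one diagram that contains the edge (i, pi[i])."""
    ways = 1
    for i, target in enumerate(pi):
        ways *= sum(1 for sigma in diagrams if sigma[i] == target)
    return ways

def derivative_coefficient(pi, diagrams):
    """Coefficient produced by applying d/dx_{i,pi[i]} for all i to the tuple's monomial."""
    # exponents[(i, j)] is the exponent of x_{i,j} in the product of all edge weights
    exponents = Counter((i, sigma[i]) for sigma in diagrams for i in range(len(sigma)))
    coeff = 1
    for i, target in enumerate(pi):
        coeff *= exponents[(i, target)]  # differentiating x^m contributes the factor m
        exponents[(i, target)] -= 1
    return coeff
```

Both functions agree because the variables $x_{i,\pi(i)}$ belonging to different rows $i$ are distinct, so each derivative acts on its own variable.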
2.3. Double counting
For our purposes, it is convenient to interchange the summation in (3). This application of double counting amounts here to a simple change of view: Instead of counting the ways to choose the set of edges corresponding to $\pi$ from all the edges corresponding to some fixed $s$-tuple $T$ of permutation diagrams, we fix $\pi$ and consider the set of $s$-tuples $T$ from which the edges of $\pi$ might be chosen. This will involve two considerations:

In how many ways can the edges corresponding to $\pi$ be distributed on $s$ copies of the bipartite graph $K_{n,n}$?

For each such distribution, what is the set of compatible $s$-tuples of permutation diagrams?
For example, if and (as in Figure 3), there clearly

is way to distribute the three edges on a single copy of the bipartite graphs (see the fourth row of pictures in Figure 3), and there are ways to choose such single copy,

are ways to distribute the three edges on precisely two copies of the bipartite graphs (see the second and third row of pictures in Figure 3), and there are ways to choose such pair of copies (whose order is relevant),

is way to distribute the three edges on precisely three copies of the bipartite graphs (see the first row of pictures in Figure 3), and there are ways to choose such triple of copies (whose order is relevant).
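The case distinction above can be reproduced by brute force. The following sketch assumes that the example concerns three edges and that a distribution is simply an assignment of each edge to one of $s$ copies; the function name is hypothetical. It groups all assignments by the number of copies actually used.

```python
from itertools import product
from collections import Counter

def distributions_by_copies_used(k, s):
    """Count assignments of k labelled edges to s copies, grouped by number of copies used."""
    counts = Counter()
    for assignment in product(range(s), repeat=k):
        counts[len(set(assignment))] += 1
    return dict(counts)
```

Under these assumptions, for three edges one gets $1 \cdot s$ assignments using a single copy, $3 \cdot s(s-1)$ using exactly two copies, and $1 \cdot s(s-1)(s-2)$ using exactly three copies, which add up to $s^3$.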
2.4. Partitioned permutations
A distribution of the edges corresponding to $\pi$ on $s$ copies of the bipartite graph $K_{n,n}$ may be viewed (see Figure 3)

as an $s$-tuple of partial matchings (some of which may be empty) of $K_{k,k}$

such that the union of these partial matchings gives the perfect matching $\pi$ of $K_{k,k}$.
Clearly, to each such partial matching there corresponds a partial permutation $\tau_j$, which we may write in two-line notation as follows:

the lower line shows the domain of $\tau_j$ in its natural order,

the upper line shows the image of $\tau_j$,

the ordering of the upper line represents the permutation $\tau_j$.
We say that each of these $\tau_j$ is a partial permutation of $\pi$, and that $(\tau_1, \dots, \tau_s)$ is a partitioned permutation. We write in short:
For example, the rows of pictures in Figure 3 correspond to the partitioned permutations (written in the aforementioned two-line notation)

for the first row,

for the second row,

for the third row,

for the fourth row.
2.5. Equivalence relation for partitioned permutations
For any partitioned permutation , consider the tuple of the upper rows (in the aforementioned two-line notation) only: We call this tuple of permutation words the partition scheme of and denote it by . We say that complies with its partition scheme and denote this by .
Now consider the following equivalence relation on the set of partitioned permutations:
By definition, the corresponding equivalence classes are indexed by a partition scheme, and belongs to the equivalence class of iff . (For , a partitioned permutation is not uniquely determined by .)
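It is straightforward to compute the number of these equivalence classes: In the language of combinatorial species (see, for instance, [2]) the $s$-tuples of permutation words indexing these classes correspond bijectively to the (labelled) species $\mathsf{Permutations}^s$, and since the exponential generating function of $\mathsf{Permutations}$ is
$$\sum_{k \ge 0} k!\, \frac{z^k}{k!} = \frac{1}{1-z},$$
the exponential generating function of $\mathsf{Permutations}^s$ is simply
$$\left( \frac{1}{1-z} \right)^{s} = \sum_{k \ge 0} s(s+1)\cdots(s+k-1)\, \frac{z^k}{k!}.$$
So the number of these equivalence classes is $s(s+1)\cdots(s+k-1)$, which is precisely the factor in (2). Our proof will be complete if we manage to show that the generating functions of each of these equivalence classes are the same, namely
$$(\det X)^{s-1}\, \det X_{\overline{[k]},\overline{[k]}}.$$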
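The count of equivalence classes can be confirmed by enumeration. The following sketch assumes that a partition scheme is an $s$-tuple of words (some possibly empty) whose concatenation is a permutation word of $\{1, \dots, k\}$; the function names are hypothetical.

```python
from itertools import permutations, combinations_with_replacement

def partition_schemes(k, s):
    """All s-tuples of words whose concatenation is a permutation word of 1..k."""
    schemes = set()
    for word in permutations(range(1, k + 1)):
        # choose s-1 cut positions (weakly increasing) to split the word into s pieces
        for cuts in combinations_with_replacement(range(k + 1), s - 1):
            bounds = (0,) + cuts + (k,)
            schemes.add(tuple(word[bounds[j]:bounds[j + 1]] for j in range(s)))
    return schemes

def rising_factorial(s, k):
    """s(s+1)...(s+k-1)."""
    result = 1
    for t in range(k):
        result *= s + t
    return result
```

Each scheme arises from exactly one pair of a permutation word and a choice of cut positions, so the count is $k! \binom{k+s-1}{s-1} = s(s+1)\cdots(s+k-1)$, in agreement with the coefficient extracted from the generating function.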
2.6. Accounting for the signs
A necessary first step for this task is to investigate how the sign of a permutation $\pi$ is changed by removing a given partial permutation $\tau$: We view this as erasing all the edges belonging to $\tau$’s permutation diagram from $\pi$’s permutation diagram; see again Figure 3.
Lemma 1.
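Let $\pi \in \mathfrak{S}_n$ be a permutation, and let $\pi'$ be the permutation corresponding to the permutation diagram of $\pi$ with edge $(i, \pi(i))$ removed (after relabelling the remaining vertices in their natural order). Then we have
$$\operatorname{sgn}(\pi) = (-1)^{i + \pi(i)}\, \operatorname{sgn}(\pi').$$
Proof.
We count the number of intersections with edge $(i, \pi(i))$ in the permutation diagram of $\pi$: Let $j = \pi(i)$, $L = \{1, \dots, i-1\}$, and $R = \{i+1, \dots, n\}$ (see Figure 4 for an illustration).
Assume that exactly $a$ elements of $L$ are mapped by $\pi$ to $\{j+1, \dots, n\}$: Then edge $(i, j)$ clearly intersects the $a$ edges joining vertices from $L$ to vertices from $\{j+1, \dots, n\}$ (see again Figure 4).
The only other intersections with $(i, j)$ come from edges joining vertices from $R$ to vertices from $\{1, \dots, j-1\}$: Since $\pi$ is a bijection, exactly $j-1$ edges end in $\{1, \dots, j-1\}$, and $(i-1) - a$ of them start in $L$, whence $(j-1) - (i-1-a) = j - i + a$ of them start in $R$.
Altogether, the removal of edge $(i, j)$ removes $2a + j - i$ intersections of edges, and $(-1)^{2a + j - i} = (-1)^{i + j}$. ∎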
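Lemma 1 is easy to test exhaustively for small $n$. The sketch below assumes permutations stored as tuples of 0-based images; since shifting both the position and the image by one does not change the parity of their sum, the sign factor is the same as with 1-based labels. The helper names are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """Sign via the number of inversions (crossings in the permutation diagram)."""
    n = len(perm)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if inv % 2 else 1

def remove_edge(perm, i):
    """Erase the edge (i, perm[i]) and relabel the remaining images in natural order."""
    j = perm[i]
    return tuple((v - 1 if v > j else v) for pos, v in enumerate(perm) if pos != i)

def lemma_holds(perm, i):
    """Check sgn(perm) == (-1)^(i + perm[i]) * sgn(perm with the edge at i removed)."""
    return sign(perm) == (-1) ** (i + perm[i]) * sign(remove_edge(perm, i))
```

Running `lemma_holds` over all permutations of up to five elements and all positions exercises every configuration of crossings that the proof distinguishes.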
Corollary 2.
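Let $\pi \in \mathfrak{S}_n$ be a permutation which is partitioned into two partial permutations $\tau$ and $\pi \setminus \tau$, where $\tau$ is the partial permutation
$$\tau = \begin{pmatrix} \pi(i_1) & \pi(i_2) & \cdots & \pi(i_k) \\ i_1 & i_2 & \cdots & i_k \end{pmatrix}$$
(with $i_1 < i_2 < \cdots < i_k$). Clearly, the matching $\pi$ with the edges of $\tau$ erased corresponds to a permutation, which we also denote by $\pi \setminus \tau$. Then we have
$$\operatorname{sgn}(\pi) = \operatorname{sgn}(\pi \setminus \tau) \cdot \operatorname{sgn}(\tau) \cdot (-1)^{\sum_{l=1}^{k} (i_l + \pi(i_l))},$$
where $\operatorname{sgn}(\tau)$ is determined by the number of inversions of the upper line of $\tau$. If we denote $I = \{i_1, \dots, i_k\}$ and $J = \{\pi(i_1), \dots, \pi(i_k)\}$, we may rewrite this as
(4) $\operatorname{sgn}(\pi) = \operatorname{sgn}(\pi \setminus \tau) \cdot \operatorname{sgn}(\tau) \cdot (-1)^{\sum_{i \in I} i + \sum_{j \in J} j}.$
Proof.
We proceed by induction on $k$: The case $k = 1$ simply amounts to the statement of Lemma 1.
For $k > 1$, let $i_m$ be the preimage of the maximum of the image of $\tau$. Let $d$ be the number of elements in the domain of $\tau$ which are greater than $i_m$:
$$d = k - m.$$
See Figure 5 for an illustration. Removing the edge $(i_m, \pi(i_m))$ leaves (the diagram of) a permutation $\pi'$ and a partial permutation $\tau'$ therein of length $k - 1$. The image of $\tau'$ is $J \setminus \{\pi(i_m)\}$, and its domain is obtained from $I \setminus \{i_m\}$ by decreasing the $d$ elements greater than $i_m$ by one. By induction, we have
$$\operatorname{sgn}(\pi') = \operatorname{sgn}(\pi' \setminus \tau') \cdot \operatorname{sgn}(\tau') \cdot (-1)^{\sum_{i \in I} i + \sum_{j \in J} j - i_m - \pi(i_m) - d}.$$
Since we have
$$\operatorname{sgn}(\pi) = (-1)^{i_m + \pi(i_m)}\, \operatorname{sgn}(\pi') \text{ (by Lemma 1)}, \quad \operatorname{sgn}(\tau) = (-1)^{d}\, \operatorname{sgn}(\tau'), \quad \pi' \setminus \tau' = \pi \setminus \tau,$$
the assertion follows. ∎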
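The sign relation of Corollary 2 can likewise be tested by brute force. The sketch below assumes permutations stored as tuples of 1-based images and takes the sign of a partial permutation to be the sign given by the inversions of its upper line; relabelling the remaining edges in natural order does not change their inversions. The function names are hypothetical.

```python
from itertools import combinations, permutations

def sgn(word):
    """Sign determined by the number of inversions of a word of distinct integers."""
    n = len(word)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if word[a] > word[b])
    return -1 if inv % 2 else 1

def corollary_holds(perm, domain):
    """perm: tuple of 1-based images; domain: increasing tuple of 1-based positions."""
    tau = [perm[i - 1] for i in domain]                      # upper line of tau
    rest = [perm[i - 1] for i in range(1, len(perm) + 1) if i not in domain]
    exponent = sum(domain) + sum(tau)                        # sum over I plus sum over J
    return sgn(perm) == sgn(rest) * sgn(tau) * (-1) ** exponent
```

The check covers the empty and the full domain as well, where the relation holds trivially.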
2.7. Sums of (signed) products of minors
Now consider a fixed equivalence class in the sense of Section 2.5, which is indexed by a partition scheme
We want to compute the generating function of this equivalence class: Clearly, we may concentrate on the nonempty partial permutations; so w.l.o.g. we have to consider the partition scheme
which consists only of nonempty partial permutations for . For any with , such partition scheme corresponds to a unique ordered partition of the image of :
and any specification of a compatible ordered partition , i.e.,
uniquely determines such , which we denote by .
Equation (4) gives the sign change caused by erasing the edges corresponding to (with respect to any permutation in which contains as a partial permutation), whence we can write the generating function as
where the sum is over all compatible partitions . (The factor comes from the determinant of partial derivatives.) Clearly,
so it remains to show
(5) 
This, of course, is true for . We proceed by induction on .
For any ordered partition , we introduce the shorthand notation
Moreover, write for short. Then the left-hand side of (5) may be written as the iterated sum
(6) 
where and .
Assume , and . Then the special choice (i.e., with respect to the relative ordering, “ is the same subset as ”) and determines uniquely a partial permutation
According to (4), by construction we have
(7) 
Now consider in the innermost sum of (6): Erasing the edges corresponding to and from and replacing them by the edges corresponding to yields a permutation (which, of course, complies with the partition scheme ). Since by (4) together with (7) we have
and (clearly)
we also have (again by (4))
Hence the innermost sum of (6) can be written as
If we can show that this last sum equals , then (5) follows by induction, since the iterated sum in (6) thus reduces to a sum with one summation fewer, which corresponds to the partition scheme .
2.8. (A generalization of) Laplace’s theorem
Luckily, a generalization (see [6, section 148]) of Laplace’s Theorem completes our argument:
Theorem 2.
Let be an matrix, and let and be (the indices of) fixed rows and fixed columns of . Denote the set of these (indices of) rows and columns by and , respectively. Consider some fixed set . Then we have:
(8) 
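For orientation, the classical Laplace expansion along a fixed set of rows, which Theorem 2 generalizes, can be checked numerically. The sketch below covers only this classical special case, not the generalization itself; it assumes integer matrices as nested lists and 0-based, increasing row indices (shifting all indices by one does not change the parity of the sign exponent). The function names are hypothetical.

```python
from itertools import combinations, permutations

def det(matrix):
    """Determinant by the signed sum over permutations; the empty matrix has determinant 1."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= matrix[i][perm[i]]
        total += term
    return total

def submatrix(matrix, rows, cols):
    return [[matrix[i][j] for j in cols] for i in rows]

def laplace_expansion(matrix, rows):
    """Sum over column sets of signed products of a minor and its cominor."""
    n = len(matrix)
    other_rows = [i for i in range(n) if i not in rows]
    total = 0
    for cols in combinations(range(n), len(rows)):
        other_cols = [j for j in range(n) if j not in cols]
        sign = (-1) ** (sum(rows) + sum(cols))
        total += (sign * det(submatrix(matrix, rows, cols))
                  * det(submatrix(matrix, other_rows, other_cols)))
    return total
```

For any square integer matrix `A` and any increasing tuple `rows`, `laplace_expansion(A, rows)` should equal `det(A)`.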
References
 [1] A.C. Aitken. Determinants and Matrices. Oliver & Boyd, Ltd., Edinburgh, 9th edition, 1956.
 [2] F. Bergeron, G. Labelle, and P. Leroux. Combinatorial Species and Tree-like Structures, volume 67 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, 1998.
 [3] S. Caracciolo, A.D. Sokal, and A. Sportiello. Algebraic/combinatorial proofs of Cayley-type identities for derivatives of determinants and pfaffians. Advances in Applied Mathematics, 50(4):474–594, 2013.
 [4] M. Fulmek. Viewing determinants as nonintersecting lattice paths yields classical determinantal identities bijectively. Electron. J. Combin., 19(3):P21, 2012.
 [5] T. Muir. The Theory of Determinants in the Historical Order of Development, 4 volumes. MacMillan and Co., Limited, London, 1906–1923.
 [6] T. Muir. A Treatise on the Theory of Determinants. Longmans, Green and Co., London, 1933.
 [7] G. Vivanti. Alcune formole relative all’operazione $\Omega$. Rend. Circ. Mat. Palermo, 1890.