The fast intersection transform with applications to counting paths
Andreas Björklund
Thore Husfeldt
Petteri Kaski
Mikko Koivisto
Abstract
We present an algorithm for evaluating a linear “intersection transform” of a function defined on the lattice of subsets of an $n$-element set. In particular, the algorithm constructs an arithmetic circuit for evaluating the transform in “downclosure time” relative to the support of the function and the evaluation domain. As an application, we develop an algorithm that, given as input a digraph with $n$ vertices and bounded integer weights at the edges, counts paths by weight and given length $\ell$ in time $O^*(\exp(n \cdot H(\ell/(2n))))$, where $H(p) = -p \log p - (1-p) \log(1-p)$, and the notation $O^*(\cdot)$ suppresses a factor polynomial in $n$.
1 Introduction
Efficient algorithms for linear transformations, such as the fast Fourier transform of Cooley and Tukey [10] and Yates’ algorithm [28], are fundamental tools both in computing theory and in practical applications. Therefore it is surprising that some arguably elementary transformations have apparently not been investigated from an algorithmic perspective.
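This paper contributes by studying an “intersection transform” of functions defined on subsets of a ground set. In precise terms, let $U$ be a finite set with $n$ elements (the ground set), let $R$ be a ring, and denote by $2^U$ the set of all subsets of $U$. The intersection transform maps a function $f : 2^U \to R$ to the function $f\iota : \{0,1,\ldots,n\} \times 2^U \to R$, defined for all $j = 0,1,\ldots,n$ and $Y \subseteq U$ by
(1) $f\iota_j(Y) = \sum_{X \subseteq U,\ |X \cap Y| = j} f(X).$
Our interest here is in particular to restrict (or “trim”) the domains of the input and the output from $2^U$ to given subsets of $2^U$.
For a subset $\mathcal{F} \subseteq 2^U$, denote by $\downarrow\mathcal{F}$ the downclosure of $\mathcal{F}$, that is, the family of sets consisting of all the sets in $\mathcal{F}$ and their subsets. The notation $O^*(\cdot)$ in what follows suppresses a factor polynomial in $n$. The following theorem states our main result.
Theorem 1.
There exists an algorithm that, given $\mathcal{F} \subseteq 2^U$ and $\mathcal{G} \subseteq 2^U$ as input, in time $O^*(|\downarrow\mathcal{F}| + |\downarrow\mathcal{G}|)$ constructs an arithmetic circuit with input gates for $f : \mathcal{F} \to R$ and output gates that evaluate to $f\iota : \{0,1,\ldots,n\} \times \mathcal{G} \to R$.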
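To make the definition of the intersection transform concrete, the following Python sketch evaluates it by direct summation: for each evaluation set $Y$ and each $j$, it adds up the values $f(X)$ over all sets $X$ in the support with $|X \cap Y| = j$. This is only a brute-force reference that iterates over all pairs, not the circuit construction of Theorem 1. The bitmask encoding of sets and the names `intersection_transform` and `downclosure` are illustrative assumptions.

```python
def intersection_transform(f, G, n):
    """Brute-force intersection transform restricted to the family G.

    f maps bitmask-encoded subsets X of {0, ..., n-1} to ring values;
    sets absent from f are treated as having value 0.
    Returns out[Y][j] = sum of f[X] over X in f with |X & Y| == j.
    """
    out = {}
    for Y in G:
        row = [0] * (n + 1)
        for X, value in f.items():
            row[bin(X & Y).count("1")] += value
        out[Y] = row
    return out


def downclosure(family):
    """All subsets of all sets in the family (bitmask encoding)."""
    closed = set()
    for X in family:
        sub = X
        while True:  # standard enumeration of the submasks of X
            closed.add(sub)
            if sub == 0:
                break
            sub = (sub - 1) & X
    return closed


# Ground set {0, 1, 2}; f is supported on {0,1}, {1,2}, and {2}.
f = {0b011: 5, 0b110: 7, 0b100: 1}
G = [0b001, 0b110]
result = intersection_transform(f, G, n=3)
print(result)
print(len(downclosure(f)))
```

The size of the downclosure, rather than the number of pairs, is the quantity that the time bound of Theorem 1 is stated in.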
This result supplies yet another tool aimed at the resolution of a longstanding open problem, namely that of improving upon the classical (early 1960s) dynamic programming algorithm for the Travelling Salesman Problem (TSP). With an $O^*(2^n)$ running time for an instance with $n$ cities, the classical algorithm, due to Bellman [3, 4], and, independently, Held and Karp [15], remains the fastest known exact algorithm for the TSP. Moreover, progress has been equally stuck at $O^*(2^n)$ even if one considers the more restricted Hamiltonian Path (HP) and the Hamiltonian Cycle (HC) problems.
Armed with Theorem 1, we show that the $O^*(2^n)$ bound can be broken in a counting context, assuming one cares only for long paths or cycles, as opposed to the spanning paths or cycles required by the TSP/HP/HC. (See §1.1 for a contrast with earlier work.)
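Denote by $H$ the binary entropy function
(2) $H(p) = -p \log p - (1-p) \log(1-p), \qquad 0 \leq p \leq 1,$
with $\log$ denoting the natural logarithm.
Theorem 2.
There exists an algorithm that, given as input
1. a directed graph $D$ with $n$ vertices and bounded integer weights at the edges,
2. two vertices, $s$ and $t$, and
3. a length $\ell$,
counts, by total weight, the number of paths of length $\ell$ from $s$ to $t$ in $D$ in time
(3) $O^*\bigl(\exp\bigl(n \cdot H(\ell/(2n))\bigr)\bigr).$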
For example, Theorem 2 implies that we can count in time $O^*(1.7548^n)$ with length $\ell = n/2$, because $\exp(H(1/4)) < 1.7548$. For length $\ell = n-1$ the bound reduces to the classical bound $O^*(2^n)$.
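As a numerical illustration, the sketch below assumes that the bound (3) has the form $\exp(n \cdot H(\ell/(2n)))$ with natural logarithms, and computes the base $c$ for which the bound reads $c^n$ at a given ratio $\ell/n$. It also checks, on one small instance, the standard entropy estimate that an $n$-element set has at most $\exp(n \cdot H(k/n))$ subsets of size at most $k \leq n/2$. The function names are hypothetical.

```python
import math


def binary_entropy(p):
    """H(p) = -p ln p - (1 - p) ln(1 - p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def bound_base(length_ratio):
    """Base c with exp(n * H(l / (2n))) == c**n, where length_ratio = l / n."""
    return math.exp(binary_entropy(length_ratio / 2.0))


def small_subsets(n, length):
    """Number of subsets of an n-element set with at most length // 2 elements."""
    return sum(math.comb(n, i) for i in range(length // 2 + 1))


print(bound_base(0.5))   # base of the bound for l = n/2
print(bound_base(1.0))   # limiting base as l approaches n
print(small_subsets(20, 10), math.exp(20 * binary_entropy(0.25)))
```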
We observe that counting implies, by selfreducibility, that we can construct examples of the paths within the same time bound. Similarly, we can count cycles of a given length within the same bound. However, the efficient listing (in the form of vertex supports, weights, and ends ) of all the paths for any length appears not to be possible with present tools in time for independent of . Indeed, if it were possible, we would obtain the breakthrough algorithm for generic TSP by starting the classical algorithm from the output of the listing algorithm.
We expect Theorem 1 to have applications beyond Theorem 2; for example, in the context of subset query problems discussed by Charikar, Indyk, and Panigrahy [8].
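Given families $\mathcal{F}$ and $\mathcal{G}$ of subsets of the ground set $U$ as input, we can count in time $O^*(|\downarrow\mathcal{F}| + |\downarrow\mathcal{G}|)$ for each $Y \in \mathcal{G}$ the number of $X \in \mathcal{F}$ that intersect $Y$ in a given number of points; in particular, for each $Y \in \mathcal{G}$ we can count the number of disjoint $X \in \mathcal{F}$.
By duality of disjointness and set inclusion, we can thus count in time $O^*(|\downarrow\mathcal{F}| + |\uparrow\mathcal{G}|)$ for each $Y \in \mathcal{G}$ the number of $X \in \mathcal{F}$ with $X \subseteq Y$. Here $\uparrow\mathcal{G}$ denotes the upclosure of $\mathcal{G}$, that is, the family of sets consisting of all the sets in $\mathcal{G}$ and their supersets in $2^U$.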
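The duality used here is that a set $X$ is contained in $Y$ exactly when $X$ is disjoint from the complement of $Y$. A minimal Python sketch of this reduction, with bitmask-encoded sets and hypothetical function names, and with plain iteration standing in for the fast transform:

```python
def count_disjoint(F, Y):
    """Number of sets X in F with X and Y disjoint."""
    return sum(1 for X in F if X & Y == 0)


def count_contained(F, Y, n):
    """Number of sets X in F with X a subset of Y, via the complement of Y."""
    full = (1 << n) - 1
    return count_disjoint(F, full & ~Y)


F = [0b001, 0b011, 0b100, 0b110]
print(count_contained(F, 0b011, n=3))
```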
1.1 Further remarks and earlier work
Theorem 1 has its roots in Yates’ algorithm [28] for evaluating the product of a vector with the Kronecker power of a matrix. While Yates’ algorithm is essentially optimal, running in ring operations given an input vector with entries, in certain cases the evaluation can be “trimmed”, assuming one requires only sporadic entries of the output vector. In particular, the present authors have observed [6] that the zeta and Moebius transforms on are amenable to trimming (see Lemma 3 below for a precise statement).
The proof of Theorem 1 relies on a trimmed concatenation of two “dual” zeta transforms, one that depends on supersets of a set (the “up” transform), and one that depends on subsets of a set (the “down” transform). To provide a rough intuition, we first use the upzeta transform to drive information about on “down” to . Then we use a “ranked” [5] downzeta transform to assemble information “up” from to . Finally, we extract the intersection transform from the information gathered at each . This essentially amounts to solving a fixed system of linear equations at each .
This proof strategy yet again highlights a basic theme: the use of fast linear transformations to distribute and assemble information across a domain (e.g. time, frequency, subset lattice) so that “local” computations in the domain (e.g. pointwise multiplication, solving local systems of linear equations) alternated with transforms enable the extraction of a desired result (e.g. convolution, intersection transform). Compared with earlier works such as [5, 6, 19], the present approach establishes the serendipity of the up/down dual transforms and introduces the “linear equation trick” into the toolbox of local computations.
Once Theorem 1 is available, Theorem 2 stems from the observation that a path can be decomposed into two paths, each having half the length of the original path, with exactly one vertex in common. Theorem 1 then enables us to “glue halves” in and , where and consist of sets of size at most . This prompts the observation that Theorem 1 is useful only when the bound improves upon the trivial bound obtained by a direct iteration over all pairs .
We know at least one alternative way of proving Theorem 2, without using Theorem 1. Indeed, assuming knowledge of trimming [6], one can use an algorithm of Kennes [19] to evaluate a sum for given and in ring operations (take the trimmed upzeta transform of and , take pointwise product of transforms, take the trimmed upMoebius transform, and sum over all subsets in ). This enables one to evaluate the righthand side of (20) below in time (3), thus giving an alternative proof of Theorem 2.
To contrast Kennes’ algorithm with Theorem 1, Kennes’ algorithm computes for each $Z$ the sum over pairs $(X, Y)$ with $X \cap Y = Z$, whereas (1) computes, for each $Y$ the sum over $X$ with $|X \cap Y| = j$. Thus, Kennes’ algorithm provides control over the intersection but lacks control over the pairs $(X, Y)$, whereas (1) provides control over $Y$ but lacks control over the intersection (except for size).
As regards the TSP/HP/HC, earlier work on exact exponential-time algorithms can be divided roughly into three lines of study. (For a broader treatment of TSP/HP/HC and exact exponential-time algorithms, we refer to [2, 14, 23], and [27], respectively.)
One line of study has been to restrict the input graph, whereby a natural restriction is to place an upper bound on the degrees of the vertices. Eppstein [11] has developed an algorithm that runs in time for and in time for . Iwama and Nakashima [16] have improved the case to , and Gebauer [12] the case to . The present authors established [7] an bound for all , with depending on but not on .
A second line of study has been to ease the space requirements of the algorithms from exponential to polynomial in . Karp [18] and, independently, Kohn, Gottlieb, and Kohn [20] have shown that TSP with bounded integer weights can be solved in time and space polynomial in . Combined with restrictions on the graph, one can arrive at running times and polynomial space [7, 11, 16].
A third line of study relaxes the requirement on spanning paths/cycles to “long” paths/cycles. In this setting, a simple backtrack algorithm finds a path of length in time . Monien [24] observed that this can be expedited to time by a dynamic programming approach. Alon, Yuster, and Zwick [1] introduced a seminal colourcoding procedure and improved the running time to expected and deterministic time, a large constant. Subsequently, combining colourcoding ideas with a divideandconquer approach, Chen, Lu, Sze, and Zhang [9], and, independently, Kneis, Mölle, Richter, and Rossmanith [22], developed algorithms with expected and deterministic time. A completely different approach was taken by Koutis [21], who presented an expected time algorithm relying on a randomised technique for detecting whether a given variate polynomial, represented as an arithmetic circuit with only sum and product gates, has a squarefree monomial of degree with an odd coefficient. Recently, Williams [26] extended Koutis’ technique and obtained an expected time algorithm.
To contrast with Theorem 2, while the bound of the Koutis–Williams [21, 26] algorithm is superior to the bound (3) in Theorem 2, it is not immediate whether the Koutis–Williams approach extends to counting problems. Furthermore, it appears challenging to derandomise the Koutis–Williams algorithm without increasing the running time (see [26, p. 6]), whereas the algorithm in Theorem 2 is deterministic.
2 The fast intersection transform
2.1 Preliminaries
For a logical proposition $P$, we use Iverson’s bracket notation $[P]$ to denote a $1$ if $P$ is true, and a $0$ if $P$ is false.
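Let $\mathcal{F} \subseteq 2^U$ be a family of subsets of the ground set $U$ and let $f : \mathcal{F} \to R$ take values in the ring $R$.
Define the upzeta transform $f\zeta^\uparrow$ for all $Y \subseteq U$ by
(4) $f\zeta^\uparrow(Y) = \sum_{X \in \mathcal{F},\ Y \subseteq X} f(X).$
Define the downzeta transform $f\zeta^\downarrow$ for all $Y \subseteq U$ by
(5) $f\zeta^\downarrow(Y) = \sum_{X \in \mathcal{F},\ X \subseteq Y} f(X).$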
The following lemma condenses the essential properties of the “trimmed” fast zeta transform [6].
Lemma 3.
There exist algorithms that construct, given and as input, an arithmetic circuit with input gates for and output gates that evaluate to

, with construction time ;

, with construction time ;

, with construction time ; and

, with construction time .
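For orientation, the untrimmed versions of the two zeta transforms can be computed on the whole subset lattice with the usual Yates-style sweep over one ground-set element at a time. The Python sketch below assumes that the upzeta transform sums over supersets and the downzeta transform over subsets, as described in §1.1; it does not implement the trimming of Lemma 3, and the function names are hypothetical.

```python
def up_zeta(f, n):
    """g[Y] = sum of f[X] over all supersets X of Y (sets as bitmasks)."""
    g = list(f)
    for i in range(n):
        bit = 1 << i
        for Y in range(1 << n):
            if not Y & bit:
                g[Y] += g[Y | bit]  # g[Y | bit] is not modified in this sweep
    return g


def down_zeta(f, n):
    """g[Y] = sum of f[X] over all subsets X of Y (sets as bitmasks)."""
    g = list(f)
    for i in range(n):
        bit = 1 << i
        for Y in range(1 << n):
            if Y & bit:
                g[Y] += g[Y ^ bit]  # g[Y ^ bit] is not modified in this sweep
    return g


n = 3
f = [1, 2, 3, 4, 5, 6, 7, 8]  # f[X] for X = 0b000, ..., 0b111
up = up_zeta(f, n)
down = down_zeta(f, n)
print(up, down)
```

Each transform uses $n 2^n$ additions here; the point of trimming is to avoid touching sets outside the relevant downclosure or upclosure.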
2.2 The inverse of truncated Pascal’s triangle
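We work with the standard extension of the binomial coefficients to arbitrary integers (see Graham, Knuth, and Patashnik [13]). For integers $p$ and $q$, we let
(6) $\binom{p}{q} = \begin{cases} \prod_{k=1}^{q} \frac{p-k+1}{k} & \text{if } q > 0; \\ 1 & \text{if } q = 0; \\ 0 & \text{if } q < 0. \end{cases}$
The following lemma is folklore, but we recall a proof here for convenience of exposition.
Lemma 4.
The integer matrices $A$ and $B$ with entries
(7) $a_{ij} = \binom{j}{i}, \qquad b_{ij} = (-1)^{i+j} \binom{j}{i}, \qquad i, j = 0, 1, \ldots, n,$
are mutual inverses.
Proof.
Let us first consider the $(i,j)$-entry of $AB$:
$$\sum_{k=0}^{n} a_{ik} b_{kj} = \sum_{k=0}^{n} (-1)^{k+j} \binom{k}{i} \binom{j}{k} = \sum_{k=i}^{j} (-1)^{k+j} \binom{j}{k} \binom{k}{i} = \binom{j}{i} \sum_{k=i}^{j} (-1)^{k+j} \binom{j-i}{k-i} = [i = j].$$
Here the second equality follows by observing that $k < i$ implies $\binom{k}{i} = 0$ for all $k \geq 0$; similarly, $k > j$ implies $\binom{j}{k} = 0$ for all $j \geq 0$. The third equality follows from an application of the identity $\binom{r}{m}\binom{m}{k} = \binom{r}{k}\binom{r-k}{m-k}$, valid for all integers $r, m, k$ (see [13, Equation 5.21]). The last equality follows from an application of the Binomial Theorem, because the remaining sum equals $(-1)^{i+j} (1-1)^{j-i}$ when $i \leq j$ and is empty when $i > j$.
The analysis for the $(i,j)$-entry of $BA$ is similar:
$$\sum_{k=0}^{n} b_{ik} a_{kj} = \sum_{k=0}^{n} (-1)^{i+k} \binom{k}{i} \binom{j}{k} = \sum_{k=i}^{j} (-1)^{i+k} \binom{j}{k} \binom{k}{i} = \binom{j}{i} \sum_{k=i}^{j} (-1)^{i+k} \binom{j-i}{k-i} = [i = j].$$
∎
It follows from Lemma 4 that the matrices $A$ and $B$ are mutual inverses over an arbitrary ring $R$, where the entries of the matrices are understood to be embedded into $R$ via the natural ring homomorphism $m \mapsto m \cdot 1_R$, where $1_R$ is the multiplicative identity element of $R$, and $m$ is an integer.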
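Lemma 4 is easy to check numerically for small sizes. The Python sketch below assumes that the two matrices are the truncated Pascal triangle with entries $\binom{j}{i}$ and its signed counterpart with entries $(-1)^{i+j}\binom{j}{i}$, for $i, j = 0, 1, \ldots, n$; the helper names are hypothetical.

```python
from math import comb


def pascal_pair(n):
    """Truncated Pascal matrix A and its signed counterpart B, of size n + 1."""
    idx = range(n + 1)
    A = [[comb(j, i) for j in idx] for i in idx]  # comb(j, i) == 0 when i > j
    B = [[(-1) ** (i + j) * comb(j, i) for j in idx] for i in idx]
    return A, B


def matmul(P, Q):
    """Plain integer matrix product of two square matrices."""
    size = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


A, B = pascal_pair(5)
identity = [[int(i == j) for j in range(6)] for i in range(6)]
print(matmul(A, B) == identity, matmul(B, A) == identity)
```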
2.3 Proof of Theorem 1
We first describe the algorithm and then prove its correctness. All arithmetic in the evaluations, and all derivations in subsequent proofs, are carried out in the ring .
Let and be given as input to the algorithm. The circuit is a sequence of three “modules” starting at the input gates for .
. Uptransform. Evaluate the upzeta transform
(8) 
with a circuit of size using Lemma 3(1). Observe that (4) implies that all nonzero values of are in .
. Downtransform by rank. For each , evaluate , the component of with rank , on ; that is, for all , set
(9) 
Then, for each , evaluate
(10) 
with a circuit of size using Lemma 3(3).
. Recover the intersection transform. Let be the matrix in Lemma 4 with entries embedded to . Associate with each the column vector
For each , evaluate the column vector
as the matrix–vector product
(11) 
Because the matrix is fixed, this can be implemented with fixed arithmetic gates.
The circuit thus consists of arithmetic gates. It remains to show that the circuit actually evaluates the intersection transform of .
Lemma 5.
For all and it holds that .
Proof.
Let and . Consider the following derivation:
(12)  
Here the first equality expands the definitions (10), (5), (9), (8), and (4). The second equality follows by changing the order of summation and observing that if and only if both and . The fourth equality follows by collecting the terms with together. The last equality follows from (7) and (1).
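The three modules can be sketched end to end on the full subset lattice. The Python code below is an untrimmed illustration under stated assumptions: the upzeta transform sums over supersets, the rank-$i$ component keeps only the values at sets of size $i$, the downzeta transform sums over subsets, and the final step multiplies by the signed Pascal matrix with entries $(-1)^{i+j}\binom{j}{i}$. It runs over all $2^n$ sets instead of restricting to downclosures as Theorem 1 does, and the function name is hypothetical.

```python
from math import comb


def intersection_transform_fast(f, n):
    """Return v with v[j][Y] = sum of f[X] over X with |X & Y| == j."""
    size = 1 << n
    # Module 1: upzeta transform, up[Z] = sum of f[X] over supersets X of Z.
    up = list(f)
    for k in range(n):
        bit = 1 << k
        for Z in range(size):
            if not Z & bit:
                up[Z] += up[Z | bit]
    # Module 2: for each rank i, keep the sets of size i and take the
    # downzeta transform, u[i][Y] = sum of up[Z] over Z subset of Y, |Z| == i.
    u = []
    for i in range(n + 1):
        g = [up[Z] if bin(Z).count("1") == i else 0 for Z in range(size)]
        for k in range(n):
            bit = 1 << k
            for Y in range(size):
                if Y & bit:
                    g[Y] += g[Y ^ bit]
        u.append(g)
    # Module 3: u[i][Y] = sum_j C(j, i) * v[j][Y], so invert the Pascal matrix:
    # v[j][Y] = sum_i (-1)**(i + j) * C(i, j) * u[i][Y].
    return [[sum((-1) ** (i + j) * comb(i, j) * u[i][Y] for i in range(n + 1))
             for Y in range(size)]
            for j in range(n + 1)]


n = 4
f = [(3 * X * X + 1) % 7 for X in range(1 << n)]
v = intersection_transform_fast(f, n)
print(v[1][0b0101])
```

The identity behind Module 3 is that a set $X$ contributes to the rank-$i$ sum at $Y$ once for every $i$-element subset of $X \cap Y$, that is, $\binom{|X \cap Y|}{i}$ times.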
3 Counting paths
3.1 Preliminaries
We require some preliminaries before proceeding with the proof of Theorem 2. For basic graph-theoretic terminology we refer to West [25].
Let be an vertex digraph with vertex set and edge set , possibly with loops and parallel edges. (However, to avoid further technicalities in the bound (3), we assume that the number of edges in is bounded from above by a polynomial in .) Associated with each edge is a weight . For an edge , denote by (respectively, ) the start vertex (respectively, the end vertex) of .
It is convenient to work with the terminology of walks instead of paths. A walk of length in is a tuple such that , , and, for each , it holds that and . The walk is said to be from to .
A walk is simple if are distinct vertices. The set of distinct vertices occurring in a walk is the support of the walk. We denote the support of a walk by . The weight of a walk is the sum of the weights of the edges in the walk; a walk with no edges has zero weight. We write for the weight of .
For and we denote by the set of all simple walks from to with support . Observe that is empty unless both and .
Let be a polynomial indeterminate, and define an associated polynomial generating function by
(13) 
Put otherwise, the coefficient of each monomial of enumerates the simple walks from to with support and weight .
For , denote by the set of all subsets of .
For , define a polynomial generating function by
(14) 
Put otherwise, the coefficient of each monomial of enumerates the simple walks from to with length and weight .
3.2 Proof of Theorem 2
Let be fixed. Let be a digraph with vertices and edge weights for all . Let . Let .
With the objective of eventually applying Theorem 1, let and let be the univariate polynomial ring over with integer coefficients.
To compute , proceed as follows. First observe that the generating polynomials (13) can be computed by the following recursion on subsets of . The singleton sets , , form the base case of the recursion:
(15) 
The recursive step is defined for all and , , by
(16) 
Now, using (15) and (16), evaluate
(17) 
for each . Then, using (15) and (16) again, evaluate
(18) 
Next, using the algorithm in Theorem 1 with and , evaluate
(19) 
Finally, evaluate the right-hand side of
(20) 
by direct summation.
The entire evaluation can thus be carried out with an arithmetic circuit of size
(21) 
that can be constructed in similar time.
To justify the equality in (20), consider the following derivation:
Here the first two equalities expand (17), (19), (1), (18), and (13). The third equality follows by observing that and are both nonempty only if and . Thus, implies that only terms with appear in the sum. The fourth equality is justified as follows. First observe that an arbitrary walk of length from to has the property that there exists a with if and only if the walk is simple. Moreover, a simple walk of length from to has a bijective decomposition into two simple subwalks, and , with for some . Indeed, is the length prefix of from to some , and is the length suffix of from to . Conversely, prepend to , deleting one occurrence of in the process, to get . The fifth equality follows from (14) and (13).
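The decomposition into two halves can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not in the text: the digraph is unweighted (so the generating polynomials reduce to integer counts), it is given as adjacency lists, the first half takes $\lceil \ell/2 \rceil$ edges, and the halves are glued by direct summation over pairs of supports rather than by the circuit of Theorem 1. All function names are hypothetical.

```python
def walks_by_support(adj, start, length):
    """Map (support bitmask, end vertex) -> number of simple walks from
    `start` with exactly `length` edges, computed by recursion on supports."""
    layer = {(1 << start, start): 1}
    for _ in range(length):
        nxt = {}
        for (S, u), count in layer.items():
            for w in adj[u]:
                if not S & (1 << w):  # keep the walk simple
                    key = (S | (1 << w), w)
                    nxt[key] = nxt.get(key, 0) + count
        layer = nxt
    return layer


def count_paths_by_halves(adj, s, t, length):
    """Count simple walks with `length` edges from s to t by gluing halves."""
    a = (length + 1) // 2  # edges in the first half
    b = length - a         # edges in the second half
    first = walks_by_support(adj, s, a)
    total = 0
    for mid in range(len(adj)):
        left = [(S, c) for (S, end), c in first.items() if end == mid]
        if not left:
            continue
        second = walks_by_support(adj, mid, b)
        right = [(T, c) for (T, end), c in second.items() if end == t]
        # Both supports contain `mid`, so the halves glue to a simple walk
        # exactly when the supports intersect in one vertex.
        for T, ct in right:
            total += ct * sum(cs for S, cs in left
                              if bin(S & T).count("1") == 1)
    return total


def count_paths_brute(adj, s, t, length):
    """Reference count by exhaustive extension of simple walks."""
    def extend(u, visited, remaining):
        if remaining == 0:
            return 1 if u == t else 0
        return sum(extend(w, visited | (1 << w), remaining - 1)
                   for w in adj[u] if not visited & (1 << w))
    return extend(s, 1 << s, length)


K4 = [[w for w in range(4) if w != u] for u in range(4)]
print(count_paths_by_halves(K4, 0, 3, 3), count_paths_brute(K4, 0, 3, 3))
```

The inner sum over `left` is the intersection transform at intersection size one, evaluated at the support `T`; this is the step that Theorem 1 is meant to accelerate.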
It remains to analyse the total running time of constructing and evaluating the circuit in terms of and .
Because is fixed, all the ring operations are carried out on polynomials of degree at most . Moreover, denoting by the number of edges in , the coefficients in the polynomials are integers bounded in absolute value by , where is an upper bound for the coefficients in (13) and (14), and is an upper bound for the expansion in intermediate values in the transforms. (Both bounds are far from tight.) Recalling that we assume that is bounded from above by a polynomial in , we have that the coefficients can be represented using a number of bits that is bounded from above by a polynomial in . It follows that each ring operation runs in time bounded from above by a polynomial in .
Footnotes
This research was supported in part by the Academy of Finland, Grants 117499 (P.K.) and 109101 (M.K.), and by the Swedish Research Council, project “Exact Algorithms” (A.B. and T.H.).
References
[1] N. Alon, R. Yuster, U. Zwick, Color-coding, J. Assoc. Comput. Mach. 42 (1995), 844–856.
[2] D. L. Applegate, R. E. Bixby, V. Chvátal, W. J. Cook, The Traveling Salesman Problem: A Computational Study, Princeton University Press, 2006.
[3] R. Bellman, Combinatorial processes and dynamic programming, Combinatorial Analysis, Proceedings of Symposia in Applied Mathematics 10, American Mathematical Society, 1960, pp. 217–249.
[4] R. Bellman, Dynamic programming treatment of the travelling salesman problem, J. Assoc. Comput. Mach. 9 (1962), 61–63.
[5] A. Björklund, T. Husfeldt, P. Kaski, M. Koivisto, Fourier meets Möbius: fast subset convolution, 39th Annual ACM Symposium on Theory of Computing (STOC 2007), ACM, 2007, pp. 67–74.
[6] A. Björklund, T. Husfeldt, P. Kaski, M. Koivisto, Trimmed Moebius inversion and graphs of bounded degree, 25th International Symposium on Theoretical Aspects of Computer Science (STACS 2008), Dagstuhl Seminar Proceedings 08001, IBFI Schloss Dagstuhl, 2008, pp. 85–96.
[7] A. Björklund, T. Husfeldt, P. Kaski, M. Koivisto, The travelling salesman problem in bounded degree graphs, 35th International Colloquium on Automata, Languages and Programming (ICALP 2008), Part I, LNCS 5125, Springer, 2008, pp. 198–209.
[8] M. Charikar, P. Indyk, R. Panigrahy, New algorithms for subset query, partial match, orthogonal range searching, and related problems, 29th International Colloquium on Automata, Languages and Programming (ICALP 2002), Part I, LNCS 2380, Springer, 2002, pp. 451–462.
[9] J. Chen, S. Lu, S. Sze, F. Zhang, Improved algorithms for path, matching, and packing problems, 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2007), SIAM, 2007, pp. 298–307.
[10] J. W. Cooley, J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comp. 19 (1965), 297–301.
[11] D. Eppstein, The traveling salesman problem for cubic graphs, J. Graph Algorithms Appl. 11 (2007), 61–81.
[12] H. Gebauer, On the number of Hamilton cycles in bounded degree graphs, 4th Workshop on Analytic Algorithms and Combinatorics (ANALCO 2008), SIAM, 2008.
[13] R. L. Graham, D. E. Knuth, O. Patashnik, Concrete Mathematics, 2nd ed., Addison–Wesley, 1994.
[14] G. Gutin, A. P. Punnen (Eds.), The Traveling Salesman Problem and its Variations, Kluwer, 2002.
[15] M. Held, R. M. Karp, A dynamic programming approach to sequencing problems, J. Soc. Indust. Appl. Math. 10 (1962), 196–210.
[16] K. Iwama, T. Nakashima, An improved exact algorithm for cubic graph TSP, 13th Annual International Conference on Computing and Combinatorics (COCOON 2007), LNCS 4598, Springer, 2007, pp. 108–117.
[17] S. Jukna, Extremal Combinatorics, Springer, 2001.
[18] R. M. Karp, Dynamic programming meets the principle of inclusion and exclusion, Oper. Res. Lett. 1 (1982), 49–51.
[19] R. Kennes, Computational aspects of the Moebius transform of a graph, IEEE Transactions on Systems, Man, and Cybernetics 22 (1991), 201–223.
[20] S. Kohn, A. Gottlieb, M. Kohn, A generating function approach to the traveling salesman problem, ACM Annual Conference (ACM 1977), ACM Press, 1977, pp. 294–300.
[21] I. Koutis, Faster algebraic algorithms for path and packing problems, 35th International Colloquium on Automata, Languages and Programming (ICALP 2008), Part I, LNCS 5125, Springer, 2008, pp. 575–586.
[22] J. Kneis, D. Mölle, S. Richter, P. Rossmanith, Divide-and-color, 32nd International Workshop on Graph-Theoretic Concepts in Computer Science (WG 2006), LNCS 4271, Springer, 2006, pp. 58–67.
[23] E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, D. B. Shmoys (Eds.), The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization, Wiley, 1985.
[24] B. Monien, How to find long paths efficiently, Ann. Discrete Math. 25 (1985), 239–254.
[25] D. B. West, Introduction to Graph Theory, 2nd ed., Prentice–Hall, 2001.
[26] R. Williams, Finding paths of length $k$ in $O^*(2^k)$ time, arXiv:0807.3026, July 2008.
[27] G. J. Woeginger, Exact algorithms for NP-hard problems: A survey, Combinatorial Optimization – Eureka, You Shrink!, LNCS 2570, Springer, 2003, pp. 185–207.
[28] F. Yates, The Design and Analysis of Factorial Experiments, Technical Communication 35, Commonwealth Bureau of Soils, Harpenden, U.K., 1937.