Linear-Time Compression of Bounded-Genus Graphs into Information-Theoretically Optimal Number of Bits† (†Accepted to SIAM Journal on Computing. A preliminary version appeared in SODA [65].)
Abstract
A compression scheme for a class of graphs consists of an encoding algorithm that computes a binary string for any given graph in the class and a decoding algorithm that recovers the graph from that string. A compression scheme for the class is optimal if both algorithms run in linear time and the number of bits of the string for any n-node graph in the class is information-theoretically optimal to within lower-order terms. Trees and plane triangulations were the only known nontrivial graph classes that admit optimal compression schemes. Based upon Goodrich’s separator decomposition for planar graphs and Djidjev and Venkatesan’s planarizers for bounded-genus graphs, we give an optimal compression scheme for any hereditary (i.e., closed under taking subgraphs) class under the premise that any n-node graph of the class to be encoded comes with a genus embedding. By Mohar’s linear-time algorithm that embeds a bounded-genus graph on a genus surface, our result implies that any hereditary class of genus graphs admits an optimal compression scheme. For instance, our result yields the first-known optimal compression schemes for planar graphs, plane graphs, graphs embedded on genus surfaces, graphs with genus or less, colorable directed plane graphs, outerplanar graphs, and forests with degree at most . For non-hereditary graph classes, we also give a methodology for obtaining optimal compression schemes. From this methodology, we give the first-known optimal compression schemes for triangulations of genus surfaces and floorplans.
1 Introduction
Compact representations of graphs are fundamentally important and useful in many applications, including representing the meshes in finite-element analysis, terrain models of GIS, and 3D models of graphics [80, 82, 81, 92, 85, 64, 89, 48], VLSI design [84, 56], designing compact routing tables of computer networks [94, 37, 66, 77, 35, 95, 1, 16, 3, 36], and compressing the link structure of the Internet [15, 2, 88, 7, 5, 21]. Consider a class of graphs and the number of distinct n-node graphs in the class. The information-theoretically optimal number of bits to encode an n-node graph in the class is the ceiling of the logarithm of that number.^{1} (Footnote 1: All logarithms throughout the paper are to the base of two.) For instance, if is the class of rooted trees, then and ; if is the class of plane triangulations, then [97]. A compression scheme for the class consists of an encoding algorithm that computes a binary string for any given graph in the class and a decoding algorithm that recovers the graph from its binary string. A compression scheme for a graph class with is optimal if the following three conditions hold.

C1. The running time of the encoding algorithm is linear in the size of the input graph.

C2. The running time of the decoding algorithm is linear in the bit count of the binary string.

C3. For all positive constants with , the bit count of the binary string for an n-node graph in the class is no more than .
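As a numeric illustration of the information-theoretic bound discussed above, the following minimal sketch computes the least number of bits needed to distinguish all members of a finite family. The choice of ordered rooted trees counted by Catalan numbers, and the names `catalan` and `optimal_bits`, are assumptions made for this example only and are not taken from the paper.

```python
import math


def catalan(n: int) -> int:
    """n-th Catalan number: counts ordered rooted trees with n + 1 nodes."""
    return math.comb(2 * n, n) // (n + 1)


def optimal_bits(count: int) -> int:
    """Least number of bits that can distinguish `count` distinct objects.

    Equals ceil(log2(count)) for count >= 1, computed exactly on integers.
    """
    return (count - 1).bit_length()


if __name__ == "__main__":
    # Compare the optimal bit count with the leading term 2n (assumed family).
    for n in (10, 100, 1000):
        print(n, optimal_bits(catalan(n)), 2 * n)
```

The integer expression `(count - 1).bit_length()` is used instead of a floating-point logarithm so that the result stays exact for very large counts.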
Condition C3 basically says that the bit count of the binary string is information-theoretically optimal to within lower-order terms. Although there has been considerable work on compression schemes, trees (see e.g., [72, 50, 67, 11]) and plane triangulations [79] were the only known nontrivial graph classes that admit optimal compression schemes. A graph class is hereditary if it is closed under taking subgraphs. Below is the main result of the paper.
Theorem 1.1.
Any hereditary class of graphs with admits an optimal compression scheme, as long as each input n-node graph in the class to be encoded comes with a genus embedding.
By Theorem 1.1 and Mohar’s linear-time genus embedding algorithm for genus graphs [70, 54] (see Lemma 2.5), any hereditary class of genus graphs admits an optimal compression scheme. For instance, our result yields the first-known optimal compression schemes for planar graphs, plane graphs, graphs embedded on genus surfaces, graphs with genus or less, colorable directed plane graphs, outerplanar graphs, and forests with degree at most . For non-hereditary graph classes, we also give an extension (see Corollary 5.1) of Theorem 1.1. As summarized in the following theorem, we show two classes of genus graphs whose optimal compression schemes are obtainable via this extension, where the class of floorplans is defined in related work below.
Theorem 1.2.
The following classes of graphs admit optimal compression schemes:

1. Triangulations of a genus surface for any integral constant .

2. Floorplans.
Technical overview
The kernel of the proof of Theorem 1.1 is a linear-time disjoint partition of an n-node graph embedded on a genus surface.^{2} (Footnote 2: Precisely, the disjoint partition of the edges of the embedded graph in the proof of Theorem 1.1 is , where is both (i) a separation of an arbitrary triangulation of and (ii) a refinement of the separation of .) Let denote . Based upon Goodrich’s separator decomposition of planar graphs [40] and Djidjev and Venkatesan’s planarizer [26], partition satisfies the following conditions, where is the number of nodes of and is the number of times that the nodes of are duplicated in some with :^{3} (Footnote 3: As a matter of fact, in our construction, all duplicated nodes of with belong to .) (a) , (b) holds for each , (c) , and (d) . By Condition (a), can be encoded in bits. By Conditions (b) and (c), the information required to recover from can be encoded into bits (see Lemma 4.1). By Condition (d), we have . Therefore, the disjoint partition reduces the problem of encoding an n-node graph in to the problem of encoding a node graph in . Applying such a reduction for one more level, it remains to encode a node graph in into an information-theoretically optimal number of bits, which can be resolved by the standard technique (see, e.g., [47, 72, 78]) of precomputation tables (see Lemma 2.3).
Related work
The compression scheme of Turán [96] encodes an n-node plane graph that may have self-loops into bits.^{4} (Footnote 4: For brevity, we omit all lower-order terms of bit counts in our discussion of related work.) Keeler and Westbrook [55] improved this bit count to . They also gave compression schemes for several families of plane graphs. In particular, they used bits for plane triangulations, and bits for connected plane graphs free of self-loops and degree-one nodes. For plane triangulations, He et al. [46] improved the bit count to . For triconnected plane graphs, He et al. [46] also improved the bit count to at most bits. This bit count was later reduced to at most by Chuang et al. [20]. For any given n-node graph embedded on a genus surface, Deo and Litow [25] showed an bit encoding for . These compression schemes all take linear time for encoding and decoding, but Condition C3 does not hold for them. The compression schemes of He et al. [47] (respectively, Blelloch et al. [14]) for planar graphs, plane graphs, and plane triangulations (respectively, separable graphs) satisfy Condition C3, but their encoding algorithms require time on n-node graphs.
Floorplanning is a fundamental issue in circuit layout [106, 43, 69, 62, 51, 108, 32, 8, 17, 58, 24, 57, 91, 68, 84, 4]. Motivated by VLSI physical design, various representations of floorplans were proposed [110, 109, 33]. Designing a floorplan to meet a certain criterion is NP-complete in general [87, 44, 100], so heuristic techniques such as simulated annealing [102, 101, 17] are practically useful. The length of the encoding affects the size of the search space. A floorplan, also known as a rectangular drawing, is a division of a rectangle into rectangular faces using horizontal and vertical line segments. Two floorplans are equivalent if they have the same adjacency relations and relative positions among the nodes. For instance, Figure 1 shows three floorplans: Floorplans (a) and (b) are equivalent. Floorplans (b) and (c) are not equivalent. Let be the input n-node floorplan. Under the conventional assumption that each node of , other than the four corner nodes, has exactly three neighbors (see, e.g., [45, 107]), one can verify that has faces and edges. Yamanaka and Nakano [103] showed how to encode into bits. Chuang [19] reduced the bit count to . Takahashi et al. [90] further reduced the bit count to . All these compression schemes for floorplans satisfy Conditions C1 and C2, but not Condition C3. Takahashi et al. [90] also showed that the number of distinct n-node floorplans is no more than . Therefore, our Theorem 1.2(2) encodes an n-node floorplan into at most bits.
For applications that require query support, Jacobson [50] gave a bit encoding for a connected and simple planar graph that supports traversal in time per node visited. Munro and Raman [71] improved this result and gave schemes to encode binary trees, rooted ordered trees, and planar graphs. For a general node edge planar graph , they used bits while supporting adjacency and degree queries in time. Chuang et al. [20] reduced this bit count to for any constant with the same query support. The bit count can be further reduced if only time adjacency queries are supported, or if is simple, triconnected or triangulated [20]. Chiang et al. [18] reduced the number of bits to . Yamanaka and Nakano [105] showed a bit encoding for plane triangulations with query support. The succinct encodings of Blandford et al. [13] and Blelloch et al. [14] for separable graphs support queries. Yamanaka et al. [104] also gave a compression scheme for floorplans with query support. For labeled planar graphs, Itai and Rodeh [49] gave an encoding of bits. For unlabeled general graphs, Naor [74] gave an encoding of bits. For certain graph families, Kannan et al. [52] gave schemes that encode each node with bits and support time testing of adjacency between two nodes. Galperin and Wigderson [34] and Papadimitriou and Yannakakis [75] investigated complexity issues arising from encoding a graph by a small circuit that computes its adjacency matrix. Related work on various versions of succinct graph representations can be found in [73, 6, 31, 42, 38, 76, 83, 30, 29, 28, 9, 53] and the references therein.
Outline
The rest of the paper is organized as follows. Section 2 gives the preliminaries. Section 3 shows our algorithm for computing graph separations. Section 4 gives our optimal compression scheme for hereditary graph classes. Section 5 shows a methodology for obtaining optimal compression schemes for non-hereditary graph classes and applies this methodology to triangulations of genus graphs and floorplans. Section 6 concludes the paper with a couple of open questions.
2 Preliminaries
2.1 Segmentation prefix
Let denote the number of bits of binary string . A binary string is a segmentation prefix of binary strings if (a) it takes time to compute from and (b) given the concatenation of , it takes time to recover all with .
Lemma 2.2.
Any binary strings have an bit segmentation prefix, where .
Proof.
Let be the concatenation of . If , let be the bit binary string with exactly copies of bits such that the th bit of is if and only if holds for some . Otherwise, let store the bit numbers for all . Let be the segmentation prefix of and as ensured by Lemma 2.1. The concatenation of and is a segmentation prefix of with bits. The lemma is proved. ∎
For the rest of the paper, let denote the concatenation of , where is the segmentation prefix of as ensured by Lemma 2.2.
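To make the role of a segmentation prefix concrete, here is a minimal sketch of a self-delimiting concatenation. It is not the construction of Lemma 2.2 and does not attempt to match its bit bound: as a stand-in technique, it records the number of strings and each length with Elias gamma codes. Binary strings are modeled as Python strings of `'0'` and `'1'`, and all function names are hypothetical.

```python
from typing import List, Tuple


def gamma_encode(value: int) -> str:
    """Elias gamma code of a positive integer: leading zeros, then binary digits."""
    binary = bin(value)[2:]
    return "0" * (len(binary) - 1) + binary


def gamma_decode(bits: str, pos: int) -> Tuple[int, int]:
    """Decode one gamma code starting at `pos`; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2), end


def concat_with_prefix(strings: List[str]) -> str:
    """Concatenate binary strings behind a prefix that records their lengths."""
    # Each stored number is shifted by one so that zero can be represented.
    header = gamma_encode(len(strings) + 1)
    header += "".join(gamma_encode(len(s) + 1) for s in strings)
    return header + "".join(strings)


def split_with_prefix(bits: str) -> List[str]:
    """Recover the original list of strings from `concat_with_prefix` output."""
    count, pos = gamma_decode(bits, 0)
    lengths = []
    for _ in range(count - 1):
        length, pos = gamma_decode(bits, pos)
        lengths.append(length - 1)
    pieces = []
    for length in lengths:
        pieces.append(bits[pos:pos + length])
        pos += length
    return pieces
```

Decoding reads the prefix first and then slices the payload, so the pieces are recovered in a single left-to-right pass over the concatenation.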
2.2 Precomputation table
Unless clearly stated otherwise, all graphs throughout the paper are simple, i.e., having no multiple edges or self-loops. Let denote the cardinality of set . Let consist of the nodes in graph and let . For any subset of , let denote the subgraph of induced by and let denote the subgraph of obtained by deleting and their incident edges. Two disjoint subsets and of are adjacent in if there is an edge of with and . For any subset of , let consist of the nodes in that are adjacent to in and let . A connected component of graph is a maximal subset of such that is connected.
Lemma 2.3.
Let be a graph class satisfying . Given positive integers and with , it takes overall time to compute (i) a labeling and a bit binary string for each distinct graph with at most nodes and (ii) an bit string such that the following statements hold.

Given any graph with , it takes time to obtain and from .

Given for any graph with , it takes time to obtain and from .
Proof.
Straightforward by . ∎
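The idea behind a precomputation table can be sketched as follows, under strong simplifying assumptions: the class is taken to be all simple graphs on a fixed small number of nodes, canonical forms are found by brute force over all relabelings, and each isomorphism class receives a fixed-width code whose length is the ceiling of the logarithm of the number of classes. This sketch ignores the time bounds of Lemma 2.3, and the names `canonical_form`, `build_table`, and `encode_graph` are hypothetical.

```python
from itertools import combinations, permutations


def canonical_form(n, edges):
    """Lexicographically smallest sorted edge tuple over all relabelings."""
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(
            sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
        )
        if best is None or relabeled < best:
            best = relabeled
    return best


def build_table(n):
    """Assign a fixed-width code to every isomorphism class of n-node graphs.

    Intended for small n >= 2 only, since the enumeration is exponential.
    """
    pairs = list(combinations(range(n), 2))
    classes = set()
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if (mask >> i) & 1]
        classes.add(canonical_form(n, edges))
    ordered = sorted(classes)
    width = (len(ordered) - 1).bit_length()  # ceil(log2(number of classes))
    encode = {form: format(i, "b").zfill(width) for i, form in enumerate(ordered)}
    decode = {code: form for form, code in encode.items()}
    return encode, decode


def encode_graph(encode, n, edges):
    """Look up the code of a graph given by its edge list on nodes 0..n-1."""
    return encode[canonical_form(n, edges)]
```

Once the table exists, encoding and decoding a tiny graph are dictionary lookups, which is the property the compression scheme relies on for its smallest subproblems.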
2.3 Separator decomposition of planar graphs
Sets form a disjoint partition of set if are pairwise disjoint and . A subset of is a separator of graph with respect to and if (1) , , and form a disjoint partition of , (2) and are not adjacent in , (3) , and (4) . A separator decomposition [12] of is a rooted binary tree on a disjoint partition of such that the following two statements hold, where “nodes” specify elements of and “vertices” specify elements of . Statement 1: Each leaf vertex of consists of a single node of . Statement 2: Each internal vertex of is a separator of with respect to and , where and are the child vertices of in and (respectively, and ) is the union of all the vertices in the subtree of rooted at (respectively, and ). See Figure 2 for an illustration.
Lemma 2.4 (Goodrich [40]).
It takes time to compute a separator decomposition for any given node planar graph.
2.4 Planarizers for nonplanar graphs
The genus of a graph G is the smallest integer g such that G can be embedded on an orientable surface with g handles without edge crossings [41]. For example, the genus of a planar graph is zero. By Euler’s formula (see, e.g., [39]), an n-node genus-g graph has O(n + g) edges. Determining the genus of a general graph is NP-complete [93], but Mohar [70] showed that it takes linear time to determine whether a graph is of genus g for any g = O(1). Mohar’s algorithm was simplified by Kawarabayashi et al. [54].
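The edge bound obtained from Euler’s formula can be derived in a few lines. The derivation below is a standard argument and not a quotation from the paper; it assumes a connected simple graph with at least three nodes and a cellular embedding on an orientable surface of genus g, and the symbols n, m, and f (numbers of nodes, edges, and faces) are names chosen for this illustration.

```latex
\begin{align*}
  n - m + f &= 2 - 2g
    && \text{(Euler's formula)} \\
  3f &\le 2m
    && \text{(each face has at least three edge sides; each edge has two sides)} \\
  m &= n + f - 2 + 2g \;\le\; n + \tfrac{2}{3}m - 2 + 2g \\
  m &\le 3n - 6 + 6g
\end{align*}
```

For g = 0 this reduces to the familiar planar bound of at most 3n - 6 edges.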
Lemma 2.5 (Mohar et al. [70, 54]).
It takes time to compute a genus embedding for any given node genus graph.
Gilbert et al. [39] gave an time algorithm to compute an node separator of an node genus graph, generalizing Lipton and Tarjan’s classic separator theorem for planar graphs [63]. Our result relies on the following planarization algorithm.
Lemma 2.6 (Djidjev and Venkatesan [26]).
Given an node graph embedded on a genus surface, it takes time to compute a subset of with such that is planar.
3 Separation and refinement
We say that with is a separation of graph if the following properties hold.

S1. form a disjoint partition of .

S2. Any two and with are not adjacent in .
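The two properties above can be checked mechanically. The following minimal sketch assumes that the graph is given as an adjacency dictionary and that the part at index 0 is the only one exempt from the non-adjacency requirement. The index range of the second property is not legible in this text, so that exemption is an assumption, as is the function name `is_separation`.

```python
def is_separation(adjacency, parts):
    """Check the two separation properties for a list of node sets.

    adjacency: dict mapping each node to the set of its neighbors.
    parts: list of node sets; parts[0] is assumed exempt from the
           non-adjacency requirement (an assumption of this sketch).
    """
    # First property: the parts form a disjoint partition of the node set.
    seen = set()
    for part in parts:
        if seen & part:
            return False
        seen |= part
    if seen != set(adjacency):
        return False

    # Second property: no edge joins two distinct parts of positive index.
    owner = {node: i for i, part in enumerate(parts) for node in part}
    for u, neighbors in adjacency.items():
        for v in neighbors:
            i, j = owner[u], owner[v]
            if i != j and i != 0 and j != 0:
                return False
    return True
```

For a three-node path a, b, c, placing b in the exempt part and a and c in separate parts passes the check, whereas grouping b with c and leaving a alone does not.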
For instance, Figure 3(a) shows a separation of graph and Figure 4(a) shows another separation of . For any subset of , let be the subgraph of induced by excluding the edges of . If is a separation of , then form a disjoint partition of the edges of . See Figures 3(b) and 4(b) for illustrations. Let . For any positive integer , let . For notational brevity, for any nonnegative integer , let
For a nonnegative integer , separation of an node graph is a separation of if the following three properties hold.

S3. and .

S4. holds for each .

S5. .
One can verify that is a separation of .^{5} (Footnote 5: The “” in Property S3 is redundant for . However, we need it so that is a separation of , since .)
Let and be two separations of graph . We say that is a refinement of if the following three properties hold.

R1. .

R2. For each index , there is an index with and .

R3. For any indices , , with , if , then .
For instance, in Figure 4(a), is a refinement of . Below is the main lemma of the section.
Lemma 3.1.
Let be a positive integer. Let be an node connected graph embedded on a genus surface. Given a separation of , it takes time to compute a separation of that is a refinement of .
Lemma 3.2.
Let be a positive integer. Given an node graph embedded on a genus surface, it takes time to compute an node subset of such that each node of has degree at most in and each connected component of has at most nodes.
Proof.
We first apply Lemma 2.6 to compute in time an node subset of such that is planar. We then apply Lemma 2.4 to compute in time a separator decomposition of . For each vertex of , let denote the union of all the vertices in the subtree of rooted at and let . Let . Let consist of the nodes of with degree more than in . Let be the union of all the vertices of with . Let . By and the definition of , each connected component of has at most nodes. By , each node of has degree at most in . Since has edges, . It remains to show . For each index , let consist of the vertices of with . By and , each is an internal vertex of . By definition of , we know that and are disjoint for any two distinct elements and of , implying that holds. Since holds for each , we have . Since each is an internal vertex of , is a separator of . Therefore, holds for each vertex in . We have . The lemma is proved. ∎
Proof of Lemma 3.1.
Suppose that is the given separation . Let be the time computable subset of ensured by Lemma 3.2. We have . Let . Let consist of the connected components of . By , each element of has at most nodes. By and Properties S1 and S2 of , each element of is contained in some with . For each , let consist of the elements of with . We run Algorithm 1 to obtain (a) a disjoint partition of and (b) nodes of , which may not be distinct. Let . Since is connected, each element of is adjacent to . The first statement of the outer repeat-loop is well defined. Since each element of has at most nodes, the first statement of the inner repeat-loop is well defined. See Figure 5 for an illustration: Suppose that all nodes are in . All nodes are initially unmarked. Let consist of the nine unlabeled nodes, including the three gray nodes. For each , let consist of the nodes with label . That is, are the six connected components of . Suppose that and the first two iterations of the outer repeat-loop obtain and . In the third iteration of the outer repeat-loop, are the unmarked elements of that are adjacent to in clockwise order around . By , the two iterations of the inner repeat-loop obtain and .
By definition of Algorithm 1, one can verify that Properties R1, R2, and R3 hold for and (that is, is a refinement of ) and Properties S1 and S2 hold for . By Property S3 of , we have . By , we have . Let consist of the indices with and . Let consist of the indices with and . We show as follows. By Property S1 of , we have . To show , we categorize the indices in with into the following types, where is the index with :
Type 1: and . The number of such indices is at most .
Type 2: and .
Type 2a: . The number of such indices is at most .
Type 2b:
Type 2c:
We have . Property S3 holds for . By definition of Algorithm 1, holds for each . By , each node of has degree at most . Property S4 holds for .
To see Property S5 of , we obtain a contracted graph from by performing the following two steps for each .^{6} (Footnote 6: The contraction procedure is only for proving Property S5 of and is not needed for computing .) Step 1: Let be the elements of with in clockwise order around in . Split into two adjacent nodes and and let take over the neighbors of in clockwise order around from the first neighbor of in to the first neighbor of in . Step 2: Contract all nodes of into node and delete multiple edges and self-loops. See Figure 6 for an illustration: For each , let consist of the nodes with labels in Figure 6(a). Suppose that , , and . The unlabeled circle nodes belong to . The square nodes are two previously contracted nodes and from and for some indices and with . Figure 6(b) shows the result of Step 1. Figure 6(c) shows the result of Step 2. Observe that each node that is adjacent to becomes a neighbor of after applying Steps 1 and 2. Also, each neighbor of that is not in either remains a neighbor of or becomes a neighbor of after applying Steps 1 and 2. Therefore, for each and each node , there is either an edge or an edge for some index with and . Thus, is no more than the number of edges in the resulting contracted simple graph, which has nodes. Observe that Step 1 does not increase the genus of the embedding. Since the subgraph induced by is connected, Step 2 does not increase the genus of the embedding, either. The number of edges in the resulting contracted simple genus graph is . Property S5 holds for . The lemma is proved. ∎
4 Our compression scheme
This section proves Theorem 1.1.
4.1 Recovery string
A labeling of graph is a one-to-one mapping from to . For instance, Figure 7(a) shows a labeling for graph . Let be a graph embedded on a surface. We say that a graph embedded on the same surface is a triangulation of if is a subgraph of with such that each face of has three nodes. The following lemma shows an bit string with which the larger embedded labeled subgraphs of can be recovered from smaller embedded labeled subgraphs of in time.
Lemma 4.1.
Let be a positive integer. Let be an node graph embedded on a genus surface. Let be a triangulation of . Let be a given separation of and be a given separation of such that is a refinement of . For any given labeling of for each , the following statements hold.

It takes overall time to compute a labeling of subgraph for each .

Given the above labelings of subgraphs with , it takes time to compute an bit string such that and for all can be recovered in overall time from and and for all .