SPLZ: An Efficient Algorithm for Single Source Shortest Path Problem Using Compression Method
Abstract
Efficient solution of the single source shortest path (SSSP) problem on road networks is an important requirement for numerous real-world applications. This paper introduces an algorithm for the SSSP problem based on a compression method. Owing to precomputing and storing the all-pairs shortest paths (APSP), solving an SSSP instance reduces to looking up a small amount of data in the precomputed APSP and decompressing it. APSP without compression needs at least 1 TB of memory for a road network with one million vertices. Our algorithm can compress such an APSP into several GB while ensuring good decompression performance. In our experiment on a dataset covering Northwest USA (with 1.2 million vertices), our method runs about three orders of magnitude faster than Dijkstra's algorithm based on a binary heap.
Keywords:
Shortest path, Compression, Road network
1 Introduction
The single source shortest path (SSSP) problem is a classic algorithmic problem, and is also a model for numerous real-world applications, such as navigation, facility location, and logistics planning. Generally, given a graph G = (V, E) and a source vertex s in V, the goal of the SSSP problem is to find the shortest paths from s to all other vertices in the graph.
Effective precomputation plays an important role in many efficient algorithms for the SSSP problem. Such algorithms contain two phases: precomputing some supporting information offline, and computing the final results online. A straightforward precomputation method is to precompute the all-pairs shortest paths (APSP) and store them in memory. The online time complexity of the SSSP problem is then O(n) for a simple lookup and output. However, the raw APSP data need at least O(n^2) space, so there is not sufficient memory to run this algorithm on an ordinary machine for a large-scale graph. For example, a graph with one million vertices needs at least 1 TB of memory, even in the extreme case where we record only the shortest path trees: an APSP consists of n trees, and each tree needs n bytes to record the parent of every vertex. In road networks, the degree of a vertex is lower than 255, so recording the parent of a vertex needs only 1 byte. Even so, the APSP of one million vertices takes about 1 TB.
In this paper, we propose a compression method that reduces the space cost of precomputation effectively while ensuring linear time complexity for decompression. We call our method shortest path with Lempel-Ziv (SPLZ), as it is a modification of the LZ77 algorithm (Ziv and Lempel, 1977). However, the offline time complexity is at least quadratic in the number of vertices, so SPLZ is not well suited to continental-sized graphs, for example the Europe road network with 18 million vertices. The road networks in our experiments have between 300,000 and 1,200,000 vertices. Though they are not continental-sized, they are still representative of large-scale road networks in real-world applications, and SPLZ works well at this scale.
In one of our experiments, on the road network of Northwest USA (about 1.2 million vertices), SPLZ compresses the APSP of this graph (about 1.4 TB) into several GB. This is affordable for a high-end PC or an ordinary workstation. If memory is insufficient, we can store the compressed APSP in external memory. Our experiments show that decompression performance remains good even with external memory.
There are three main contributions of our paper:

We design an effective compression scheme for storing APSP data. With this method, we can take full advantage of the information generated by precomputation. When memory is not sufficient, we can store the compressed APSP in external memory and still keep good decompression performance.

We develop a fast algorithm named SPLZ to solve the SSSP problem. On a single core, our algorithm runs about three orders of magnitude faster than Dijkstra's algorithm based on a binary heap. Its time cost is only about two to three times that of copying an array of length n using the standard C library function memcpy().

SPLZ is simple to implement. It does not use complex data structures or elaborate tricks. In the offline phase, SPLZ uses Dijkstra (or PHAST) and LZ77 with a small modification. In the online phase, the operation of SPLZ for solving an SSSP instance amounts to copying an array of length n.
The remainder of this paper is organized as follows: Section 2 describes related work. Section 3 introduces the basic idea of SPLZ. Section 4 details the implementation of SPLZ. Section 5 reports the experimental results. Section 6 concludes.
2 Related works
The SSSP problem has been widely studied. Dijkstra's algorithm (Dijkstra, 1959) is the most classic method for the SSSP problem. To improve its performance, researchers have adopted numerous types of priority queues, leading to a series of variants such as DIKB (Dial, 1969), DIKBD (Cherkassky et al., 1996), DIKH (Cormen et al., 2001), and DIKR (Ahuja et al., 1990). These are also called label-setting algorithms. Another classic algorithm for the SSSP problem is Bellman-Ford (Bellman, 1956), which belongs to another family called label-correcting algorithms. Besides Bellman-Ford, label-correcting algorithms include many others, such as PAPE (Pape, 1974), TWO-Q (Pallottino, 1984), THRESH (Glover et al., 1985), and SLF (Bertsekas, 1993), based on different label-correcting strategies. Both label-setting and label-correcting algorithms can be described by a unified framework (Gallo and Pallottino, 1986). These algorithms are designed for general purposes, so they have no preprocessing phase or other optimizations for road networks, and they usually do not perform well for the SSSP problem on large-scale road networks.
Some algorithms accelerate the SSSP problem with parallelism. Classic parallel algorithms for the SSSP problem include the parallel asynchronous label-correcting method (Bertsekas et al., 1996) and Delta-stepping (Meyer and Sanders, 2003). Goldberg et al. (Cherkassky et al., 2009) pointed out that traditional parallel algorithms for the SSSP problem usually do not take full advantage of modern CPU architecture features such as multi-core and SSE. Delta-stepping shows little acceleration on large-scale road networks (Madduri et al., 2006). In 2011, Delling, Goldberg et al. developed PHAST (Delling et al., 2013a), which is the fastest algorithm at present. PHAST makes full use of SSE and multi-core, and is elaborately designed to obtain a low cache-miss rate. Its GPU variant on large-scale road networks is up to three orders of magnitude faster than Dijkstra on a high-end CPU. Using PHAST instead of Dijkstra can reduce the time cost of the offline phase of SPLZ. Parallelism is also used in SPLZ to accelerate the offline precomputation phase.
Precomputation methods trade offline time and space consumption for online time. This is an efficient approach for solving shortest path problems on large-scale graphs. Many high-performance algorithms for the point-to-point shortest path problem have a precomputation phase, including Highway Hierarchies (Sanders and Schultes, 2005), Transit Node Routing (Bast et al., 2007), and ALT (Goldberg and Harrelson, 2005), among others. In 2008, Geisberger proposed the Contraction Hierarchies algorithm (Geisberger et al., 2008). It is not only a good algorithm for computing shortest paths, but also an efficient precomputation method; many outstanding algorithms, such as Transit Node Routing (Arz et al., 2013), PHAST, and hub-based labeling (Abraham et al., 2011, 2012), adopt it for precomputation. Hub-based labeling is the state-of-the-art point-to-point shortest path algorithm; it takes more space than the others to obtain the fastest online performance. Similarly, SPLZ precomputes much more information (the entire APSP) than these algorithms, using more offline time and space, and can thus achieve high online performance.
Some algorithms have good online performance, but the space consumption of their precomputation is too large to fit in memory. Compression methods are a practical way to reduce this space consumption. SILC (Sankaranarayanan et al., 2005), PCPD (Sankaranarayanan et al., 2009), and CPD (Botea et al., 2013) use compression to reduce the space complexity of storing APSP. These methods solve the point-to-point shortest path problem on spatial networks (graphs where each node is labeled with coordinates). They store the APSP in the form of a "first move table", which can achieve a high compression ratio by taking advantage of path coherence (Sankaranarayanan et al., 2005). Graph partitioning is usually used to reduce online space in algorithms such as Arc-flag (Hilger et al., 2009), PCD (Delling et al., 2013b), and CRP (Maue et al., 2010), but in SILC and its successors, graph partitioning is used to cluster similar data in the APSP. SPLZ has a similar preprocessing step, but SPLZ targets the SSSP problem rather than point-to-point queries, so it adopts different compression and decompression methods. The detailed differences are described in Section 3.1.
3 Basic SPLZ
3.1 Main idea
The main idea of SPLZ is to precompute the APSP and then compress it for online lookup. In SPLZ, the APSP is stored in the form of shortest path trees (SPTs). Here an SPT is an array of length n, which records the last move of the shortest path from a source vertex to every other vertex. For example, if we want to calculate the shortest paths from vertex s to all other vertices, we present the result as SPT(s), which is a tree rooted at s. If SPT(s)[v] = u, the predecessor of vertex v along the shortest path from s to v is vertex u. The edge (u, v) is the last move from s to v, so an SPT is also called a last move table. Traditional algorithms for the SSSP problem, such as Dijkstra, usually present their results in the form of an SPT too. If we store the APSP straightforwardly, the space complexity is O(n^2), since we need to store n SPTs.
Path coherence, described in Sankaranarayanan et al (2005), reveals that “vertices contained in a coherent region share the first segment of their shortest path from a fixed vertex”. For last moves, path coherence still holds with a slight change of description: vertices contained in a coherent region share the last segment of their shortest paths to a fixed vertex. Strictly, path coherence only holds when the fixed vertex is sufficiently far away. In a large-scale road network there are always numerous vertices that are sufficiently far away from each other, so path coherence holds in most cases; an experiment in Section 4.1 shows how frequently it holds. Path coherence implies that the data of multiple SPTs contain a large number of repeated sequences. This feature leads to a high compression ratio with LZ-family algorithms.
SILC (Sankaranarayanan et al., 2005), PCPD (Sankaranarayanan et al., 2009), and CPD (Botea et al., 2013) adopt the first move table. The first move table has similar properties to the last move table, but it is suited to the point-to-point shortest path problem: when querying the shortest path from vertex s to vertex t, the first move table can iteratively give the next vertex of the shortest path beginning from s. Our method is designed for the SSSP problem, so the result is a tree, not a single path. A vertex in a tree may have several successors, so the first move table, which is an array and can store only one successor per vertex, cannot meet our requirement. If we insisted on using the first move table to record the tree, it would have to be implemented as a tree data structure, which is more expensive than an array. A vertex in a tree has only one predecessor, so the last move table can be stored as an array. This is also why many classic algorithms for the SSSP problem, like Dijkstra, use the last move table to record the shortest path tree. Therefore, SPLZ adopts the last move table as the storage format of its results.
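As a small illustration of why the last move table suffices for SSSP output, the sketch below (identifiers are ours, not from the paper) stores SPT(s) as a plain parent array and recovers the path from s to any t by walking last moves backwards:

```cpp
#include <cassert>
#include <vector>

// SPT(s) as a parent array: spt[v] is the predecessor of v on the shortest
// path from s to v, with spt[s] == s marking the root. One byte per entry
// would suffice on road networks; we use int here for clarity.
std::vector<int> path_from_spt(const std::vector<int>& spt, int s, int t) {
    std::vector<int> rev;
    for (int v = t; v != s; v = spt[v])   // follow last moves back to s
        rev.push_back(v);
    rev.push_back(s);
    return std::vector<int>(rev.rbegin(), rev.rend());  // s, ..., t
}
```

A first move table stored as a flat array could not serve this query pattern for a whole tree, since a vertex may have several successors.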
Generally, SPLZ contains three parts: (1) calculating the APSP, (2) compressing the APSP offline, and (3) decompressing the APSP online. We adopt the PHAST algorithm to calculate the APSP; other existing methods, such as Dijkstra or Bellman-Ford-Moore, are also feasible. Next we focus on the compression and decompression method of SPLZ, which is a variant of LZ77.
3.2 LZ77 algorithm
Let S be a string to be compressed, and suppose p bytes have been compressed; let w be the size of the dictionary. The compression procedure of LZ77 is shown as Algorithm 1. LZ77 keeps the dictionary, S[p - w .. p - 1], sliding as p increases. FindLongestMatch looks for the longest common subsequence between the dictionary (which begins at S[p - w] and may extend into the uncompressed data) and the prefix of the uncompressed data, and returns the location and length of that common subsequence. The compressed data is finally an array of (pos, len) pairs.
Let C be an array of (pos, len) pairs to be decompressed, and assume p bytes have been decompressed. The decompression procedure of LZ77 is shown as Algorithm 2. Decompression is much simpler than compression: it successively loads every (pos, len) pair in the compressed data, then looks up and outputs the corresponding subsequence in the dictionary. The dictionary still needs to slide as p increases.
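The decoding loop can be sketched as follows (a minimal decoder under our own conventions: a pair (pos, len) copies len bytes starting at absolute offset pos of the output produced so far, and len == 0 means pos carries one literal byte; classic LZ77 uses backward offsets, but the dependency structure is the same):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// LZ77 decoding sketch. The sliding dictionary is simply the already
// decoded output, so pair i cannot be decoded before pairs 0..i-1.
std::string lz77_decode(const std::vector<std::pair<size_t, size_t>>& pairs) {
    std::string out;
    for (auto [pos, len] : pairs) {
        if (len == 0) {                       // literal byte stored in pos
            out.push_back(static_cast<char>(pos));
            continue;
        }
        for (size_t i = 0; i < len; ++i)      // byte by byte: the match may
            out.push_back(out[pos + i]);      // overlap the bytes being written
    }
    return out;
}
```

This dependence of every pair on all earlier output is exactly what makes random access into an LZ77 stream expensive.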
LZ77 is representative of general compression methods without any domain knowledge. The decompression speed of LZ-family algorithms is fast, but decompressing a particular piece of data from the compressed stream depends on all previous data because of the sliding dictionary. Other common LZ-family methods, such as those used in gzip, zlib, and 7z, have no advantage in retrieval speed over LZ77, because they must not only look up the dictionary to recover the original data and slide the dictionary, but also perform some complex coding. Suppose the compressed data contain N SPTs, from which we hope to decompress one particular SPT: we then need to decompress N/2 SPTs on average. This process results in a large number of redundant operations.
3.3 Fixeddictionary compression
To avoid redundant operations, we fix the dictionary. This means the dictionary is a fixed-size and fixed-location sequence at the front of the raw data. Data in the dictionary are not compressed, to achieve faster decompression. These modifications result in a lower compression ratio; we should point out that SPLZ is not a general-purpose compression algorithm, but an algorithm for solving the SSSP problem.
Let T be a set of SPTs, and let the first SPT in T be the dictionary. Let index[i] be the starting position of the i-th SPT in the compressed data; in other words, index[i + 1] - index[i] is equal to the compressed length of the i-th SPT, and the last entry of index is the total size of the compressed data. Algorithm 3 describes the compression process. First, SPLZ loads the first SPT in T as the dictionary and outputs it without compression. Then SPLZ compresses the remaining SPTs successively: for each SPT, SPLZ repeatedly finds the longest common subsequence between the dictionary and the prefix of this SPT, then compresses, outputs, and deletes the matched prefix, until the SPT is empty. There are many methods for finding the longest match (Bell and Kulp, 1993); we use a simple implementation based on a modification of KMP (Knuth et al., 1977). Algorithm 4 is the process of decompressing the i-th SPT. The dictionary is not compressed, so it can be loaded directly. Next, SPLZ locates the compressed i-th SPT in the compressed data stream using index[i] and index[i + 1]. For each (pos, len) pair, we look up the corresponding subsequence in the dictionary to output the original data. The first line of Algorithm 3 and Algorithm 4 is just an assignment to a pointer, without copying any real data. In the decompression process, the total size of the data output in line 4 is n.
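The matching step of the compression loop can be sketched naively as follows (the paper uses a KMP-based matcher; this brute-force version only illustrates the greedy loop, and the convention that len == 0 marks a literal byte stored in pos is ours):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Longest prefix of data[start..] that occurs somewhere in dict.
// Returns (pos, len); len == 0 means data[start] is absent from dict.
std::pair<size_t, size_t> longest_match(const std::vector<uint8_t>& dict,
                                        const std::vector<uint8_t>& data,
                                        size_t start) {
    size_t best_pos = 0, best_len = 0;
    for (size_t p = 0; p < dict.size(); ++p) {
        size_t len = 0;
        while (p + len < dict.size() && start + len < data.size() &&
               dict[p + len] == data[start + len])
            ++len;
        if (len > best_len) { best_len = len; best_pos = p; }
    }
    return {best_pos, best_len};
}

// Greedy loop for one SPT: emit matches against the fixed dictionary;
// a byte absent from the dictionary becomes the literal pair (byte, 0).
std::vector<std::pair<size_t, size_t>> compress_spt(
        const std::vector<uint8_t>& dict, const std::vector<uint8_t>& spt) {
    std::vector<std::pair<size_t, size_t>> out;
    for (size_t i = 0; i < spt.size();) {
        auto [pos, len] = longest_match(dict, spt, i);
        if (len == 0) { out.push_back({spt[i], 0}); ++i; }
        else          { out.push_back({pos, len});  i += len; }
    }
    return out;
}
```

The brute-force matcher costs O(dictionary size) per emitted pair, which is why a KMP-style matcher matters for the offline phase.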
3.4 An example of SPLZ compression and decompression
Assume that there is a graph G as shown in Figure 1. After calculating the shortest path tree of each vertex, we get the six SPTs shown in Table 2. If SPT(u)[v] = w, the predecessor of v on the shortest path from u to v is w.
Table 1: the adjacent vertices of each vertex v of G.
Table 2.
SPT     [0]  [1]  [2]  [3]  [4]  [5]
SPT(0)   –    2    0    2    2    3
SPT(1)   2    –    1    2    2    3
SPT(2)   2    2    –    2    2    3
SPT(3)   2    2    3    –    2    3
SPT(4)   2    2    4    2    –    4
SPT(5)   2    2    4    5    5    –
To obtain effective compression, we convert Table 2 to another form. In Table 3, SPT(u)[v] = i means that the predecessor of v on the shortest path from u to v is the i-th adjacent vertex of v. We define SPT(u)[u] = 0.
Then, for example, we select SPT(2) as the dictionary. Every SPT is compressed into an array of 2-tuples (pos, len). Note that sometimes a number in an SPT might not exist in the dictionary. For example, in Table 3, SPT(1)[2] = 1, but “1” does not exist in the dictionary. In this situation, we set len = 0 and pos = the number absent from the dictionary.
Table 3.
SPT     [0]  [1]  [2]  [3]  [4]  [5]   After compressing
SPT(0)   0    0    0    0    0    0    (0,6)
SPT(1)   0    0    1    0    0    0    (0,2) (1,0) (0,3)
SPT(2)   0    0    0    0    0    0    dictionary
SPT(3)   0    0    2    0    0    0    (0,2) (2,0) (0,3)
SPT(4)   0    0    3    0    0    1    (0,2) (3,0) (0,2) (1,0)
SPT(5)   0    0    2    1    1    0    (0,2) (2,0) (1,0) (1,0) (0,1)
To make a simple illustration, here we assume that each 2-tuple needs two bytes. Therefore, after compressing, the index array is: index = (0, 2, 8, 14, 20, 28, 38).
The total length of the output is 44 bytes: the dictionary takes 6 bytes and the compressed data take 38 bytes. The effectiveness of compression seems poor here only because the graph in our example is so small.
When solving an SSSP instance online, for example calculating SPT(1), the steps are:

Find out that index[1] = 2 and index[2] = 8. In other words, the length of the compressed SPT(1) is 6 bytes and its start location in the whole compressed data is 2.

Convert every 2-tuple to original data by looking up the dictionary. For example, when handling the 2-tuple (0, 2), we take the subsequence of the dictionary that begins at 0 and has length 2.
The full conversion is: (0, 2) → 0 0; (1, 0) → 1; (0, 3) → 0 0 0.
This step can be easily parallelized. 
Concatenate these subsequences into one array: (0 0 1 0 0 0). This array is SPT(1).
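The lookup-and-concatenate steps above can be sketched as one decoding routine (same toy conventions as the example: (pos, len) copies from the fixed dictionary, and len == 0 emits the literal stored in pos):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Decompressing one SPT: every pair references only the fixed, uncompressed
// dictionary, so no other SPT has to be decoded first.
std::vector<uint8_t> decode_spt(
        const std::vector<uint8_t>& dict,
        const std::vector<std::pair<size_t, size_t>>& pairs) {
    std::vector<uint8_t> spt;
    for (auto [pos, len] : pairs) {
        if (len == 0) { spt.push_back(static_cast<uint8_t>(pos)); continue; }
        spt.insert(spt.end(), dict.begin() + pos, dict.begin() + pos + len);
    }
    return spt;
}
```

Once the output offset of each pair is known, the loop body is independent per pair, which is why the conversion step parallelizes easily.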
4 Details of implementation
4.1 The key factor affecting the compression ratio
When we use fixed-dictionary compression, the compression ratio is mainly decided by the similarity between the dictionary and the data to be compressed. Let pathlen(u, v) be the number of edges along the shortest path between vertices u and v. The similarity between two SPTs has a negative correlation with the pathlen between the source vertices of the two SPTs. We use the proportion of common edges between two SPTs to measure their similarity.
Figure 2 presents an experimental result, based on the Northwest USA road network with about 1.2 million vertices. The smaller the pathlen between two vertices u and v, the higher the similarity between SPT(u) and SPT(v). This result is reasonable in the real world. For example, assume there are three locations A, B, and C, and both the distance from A to C and from B to C is 10 km. If the distance between A and B is one meter, we can guess that the shortest paths from A to C and from B to C are almost the same. When we choose SPT(u) as the dictionary and compress SPT(v), the impact of pathlen(u, v) on the compression ratio is shown in Figure 3.
In addition, Figure 2 also shows how frequently path coherence holds. When pathlen(u, v) is less than 100 (vertices contained in a coherent region), SPT(u) and SPT(v) share over 93% of their elements. An element of SPT(u) records the last move of the path from u to a vertex. This experiment validates that path coherence holds in most cases on a large-scale road network.
The result shows that the compression ratio decreases quickly with increasing pathlen. To reduce space consumption, it is necessary to limit the pathlen between the dictionary and the SPT to be compressed.
4.2 Regions partition
If we choose only one SPT as the dictionary in a large-scale graph, there are always many vertices far away from the dictionary. By partitioning the graph into a series of smaller regions, we can choose one SPT as the dictionary and compress the remaining SPTs independently within every region. We choose the SPT of the vertex closest to the geometrical center as the dictionary, and call this vertex the root of the region. Partitioning ensures that, within a region, the pathlen between the dictionary's vertex and the other vertices does not exceed the diameter of the region.
When partitioning the graph into numerous disjoint regions, we should keep the vertices within a region as close together as possible. This can be handled as a clustering problem. We use k-means, a simple but effective clustering method, to partition the graph. The simplest attribute for clustering vertices in a road network is the coordinate, and we adopt it. Actually, there are numerous methods to partition a graph without coordinates; coordinates are not an essential requirement of SPLZ.
Because data in the dictionaries are not compressed and are output in raw form, if the number of regions is excessive, uncompressed data would occupy a large proportion of the final output. If the number of regions is too small, we cannot ensure that the diameter of a region is significantly less than the diameter of the whole graph. Intuitively, to reach both a small number of regions and a small size for every region, we assume that the optimal number of regions has the form α√n, and choose a proper value of α by experiment. Section 5.2 compares the impact of different values of the parameter α.
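A minimal version of this clustering step, assuming plain 2-D vertex coordinates and Lloyd's k-means (the seeding and iteration count are simplified; a production partitioner would seed more carefully):

```cpp
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Plain Lloyd's k-means over vertex coordinates. Returns the region id of
// every vertex. Initial centers are just the first k points.
std::vector<int> kmeans_regions(const std::vector<Pt>& pts, int k, int iters = 20) {
    std::vector<Pt> c(pts.begin(), pts.begin() + k);
    std::vector<int> label(pts.size(), 0);
    while (iters--) {
        for (std::size_t i = 0; i < pts.size(); ++i) {     // assignment step
            double best = 1e300;
            for (int j = 0; j < k; ++j) {
                double dx = pts[i].x - c[j].x, dy = pts[i].y - c[j].y;
                double d = dx * dx + dy * dy;
                if (d < best) { best = d; label[i] = j; }
            }
        }
        std::vector<Pt> sum(k, {0, 0});                     // update step
        std::vector<int> cnt(k, 0);
        for (std::size_t i = 0; i < pts.size(); ++i) {
            sum[label[i]].x += pts[i].x;
            sum[label[i]].y += pts[i].y;
            ++cnt[label[i]];
        }
        for (int j = 0; j < k; ++j)
            if (cnt[j]) c[j] = {sum[j].x / cnt[j], sum[j].y / cnt[j]};
    }
    return label;
}
```

Any partitioner that keeps regions geographically compact would serve the same purpose; k-means is simply the easiest choice when coordinates are available.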
4.3 Multistep compression
Assume that r is the root of a region, and the part of SPT(r) inside this region is as Figure 4 shows.
In the region shown in Figure 4, every SPT except SPT(r) is compressed with dictionary SPT(r). We call this process one-step compression. The pathlen between most SPTs in this region and SPT(r) is 3 or 4.
We can reduce this pathlen by multi-step compression. For example, let the grandparent of each vertex supply its dictionary: for a vertex w whose grandparent is v, and whose grandparent in turn has grandparent r, SPT(w) is compressed with dictionary SPT(v), and SPT(v) is compressed with dictionary SPT(r). When we decompress SPT(w), we must decompress SPT(v) first. This is the so-called two-step compression for vertex w. By applying a similar operation to all vertices, their pathlen to the dictionary is decreased to 1 or 2.
We call the pathlen between an SPT and its dictionary len_to_dic. Figure 3 tells us that a shorter pathlen leads to a higher compression ratio, but the reduction in space cost brings a higher time cost: the smaller len_to_dic is, the longer the chain of dictionaries that must be decompressed before the requested SPT, multiplying the decompression time relative to one-step compression. By controlling len_to_dic, we can adjust the balance between the space cost of compression and the online time cost of decompression.
We define len_to_dic = ∞ when we use one-step compression.
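The resulting chained decompression can be sketched as follows (structure and names are ours): each non-root SPT records which SPT served as its dictionary, so decoding walks the chain up to the raw region root, and the chain length is what len_to_dic bounds:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One entry per SPT. The region root stores itself raw and has dict_id
// equal to its own id; every other SPT stores (pos, len) pairs against
// the SPT named by dict_id, with len == 0 marking a literal byte in pos.
struct Node {
    int dict_id;
    std::vector<uint8_t> raw;                      // root only
    std::vector<std::pair<size_t, size_t>> pairs;  // compressed form
};

std::vector<uint8_t> decode(const std::vector<Node>& t, int v) {
    if (t[v].dict_id == v) return t[v].raw;        // region root: stored raw
    std::vector<uint8_t> dict = decode(t, t[v].dict_id);  // decode the chain
    std::vector<uint8_t> out;
    for (auto [pos, len] : t[v].pairs) {
        if (len == 0) out.push_back(static_cast<uint8_t>(pos));
        else out.insert(out.end(), dict.begin() + pos, dict.begin() + pos + len);
    }
    return out;
}
```

The recursion depth is the number of compression steps, so a small len_to_dic trades a longer decode chain for better matches against a nearby dictionary.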
4.4 Code of compressed data
The compressed data are arrays of (pos, len) pairs. If pos and len were stored as fixed-length integers, the data compressed by the method of Section 3.3 could be compressed once more by entropy coding; however, entropy coding has poor decompression speed. We instead adopt a variable-length coding for pos and len. Though our coding cannot reach a compression ratio as high as entropy coding, it has almost no negative effect on decompression speed. The code is also a prefix code, so there is no ambiguity when we decode it.
In the compressed data stream, pos is represented by differential coding: we record only the difference between each pos and its predecessor (except the first one), because the difference between two adjacent pos values is usually smaller than their real values, and differential coding then results in shorter codes. The value of len does not have such a feature, so we record its real value.
Table 4.
range of value (in hex)        code length (bytes)  code format (in binary)
[0x00, 0x0F] (uncompressed)    1                    1111xxxx
[0x00, 0x7F]                   1                    0xxxxxxx
[0x0080, 0x3FFF]               2                    10xxxxxx xxxxxxxx
[0x004000, 0x1FFFFF]           3                    110xxxxx xxxxxxxx xxxxxxxx
[0x00200000, 0x0FFFFFFF]       4                    1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx
Table 5.
range of value (in hex)        code length (bytes)  code format (in binary)
[0x00, 0x3F]                   1                    00xxxxxx
[0x00, 0x3F]                   1                    01xxxxxx
[0x0040, 0x1FFF]               2                    100xxxxx xxxxxxxx
[0x0040, 0x1FFF]               2                    101xxxxx xxxxxxxx
[0x002000, 0x0FFFFF]           3                    1100xxxx xxxxxxxx xxxxxxxx
[0x002000, 0x0FFFFF]           3                    1101xxxx xxxxxxxx xxxxxxxx
[0x00100000, 0x07FFFFFF]       4                    11100xxx xxxxxxxx xxxxxxxx xxxxxxxx
[0x00100000, 0x07FFFFFF]       4                    11101xxx xxxxxxxx xxxxxxxx xxxxxxxx
The encoding methods for the two fields are detailed in Table 4 and Table 5, respectively. In the first line of Table 4, “uncompressed” means that some byte does not appear in the dictionary, so it cannot be compressed. In our method, the value of such a byte must be no more than 15. This is reasonable for real-world road networks, since the number of branches of a real-world road is usually smaller than 15. In case the degree of a vertex u is more than 15, we can add a virtual vertex u' to the graph, let the distance between u and u' be zero, and assign the excess edges of u to u'.
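As an illustration, the byte layout of Table 4 can be implemented as below (function names are ours; a coder for the signed differential field of Table 5 would follow the same prefix pattern):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encoder/decoder for the Table 4 prefix code. The leading bits of the
// first byte determine the length, so decoding never scans ahead;
// 1111xxxx carries an uncompressed literal in [0, 15].
void put_pos(std::vector<uint8_t>& out, uint32_t v) {
    if (v <= 0x7F) {
        out.push_back(v);
    } else if (v <= 0x3FFF) {
        out.push_back(0x80 | (v >> 8));  out.push_back(v & 0xFF);
    } else if (v <= 0x1FFFFF) {
        out.push_back(0xC0 | (v >> 16)); out.push_back((v >> 8) & 0xFF);
        out.push_back(v & 0xFF);
    } else {                                   // v <= 0x0FFFFFFF
        out.push_back(0xE0 | (v >> 24)); out.push_back((v >> 16) & 0xFF);
        out.push_back((v >> 8) & 0xFF);  out.push_back(v & 0xFF);
    }
}

void put_literal(std::vector<uint8_t>& out, uint8_t v) {   // requires v <= 15
    out.push_back(0xF0 | v);
}

// Decodes one value, advancing i; `literal` reports the 1111xxxx case.
uint32_t get_pos(const std::vector<uint8_t>& in, size_t& i, bool& literal) {
    uint8_t b = in[i++];
    literal = (b & 0xF0) == 0xF0;
    if (literal)             return b & 0x0F;
    if ((b & 0x80) == 0)     return b;
    if ((b & 0xC0) == 0x80)  return ((b & 0x3F) << 8) | in[i++];
    if ((b & 0xE0) == 0xC0) {
        uint32_t v = b & 0x1F;
        v = (v << 8) | in[i++]; v = (v << 8) | in[i++];
        return v;
    }
    uint32_t v = b & 0x0F;
    v = (v << 8) | in[i++]; v = (v << 8) | in[i++]; v = (v << 8) | in[i++];
    return v;
}
```

Unlike entropy coding, every branch here is a shift-and-mask on byte boundaries, which is why the scheme costs almost nothing at decompression time.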
5 Experiment
5.1 Experiment setup
Our experiment code is written in C++ and compiled with VC++ 2010. The program includes two parts: offline precomputation and online SSSP calculation. The experiments run on a PC with a 3.4 GHz Intel i7-4770 (4 cores) and 24 GB RAM. External memories include a 2 TB mechanical disk and a 256 GB SSD. For parallel precomputation, we use OpenMP. The graph data are downloaded from http://www.dis.uniroma1.it/challenge9, the benchmarks of the 9th DIMACS Implementation Challenge (Demetrescu et al., 2009). The data set we use is “Northwest USA” (NW), with 1,207,945 vertices and 2,840,208 edges; the graph type is “Distance graph”.
The source code of our experiments is released at https://github.com/asds25810/SPLZ.
5.2 Precomputing
Precomputation consists of computing the APSP and compressing it. The target of compression is to reduce the space consumption of the APSP, so the compression ratio is an important measure of the effectiveness of precomputation. The number of regions has an impact on the compression ratio. We choose different settings of the parameter α, using α√n as the number of regions. For every parameter setting, the running time of precomputation is about 13 hours. Table 6 shows the effect of the number of regions on the compression ratio.
The number of regions determines the average size of each region, and the size of a region affects the pathlen between vertices in the region. The smaller the pathlen between vertices, the higher the compression ratio, so it seems that more regions should lead to a higher compression ratio. However, Table 6 demonstrates that when the number of regions exceeds a certain value, the compression ratio falls. The reason is that as the number of regions increases, the proportion of dictionary data increases. We select a representative vertex for each region, and the SPT of the representative vertex is the dictionary of that region. To ensure that the dictionary is available immediately at decompression time, dictionaries are not compressed. Although more regions lead to a higher compression ratio for a single SPT, the total size of the final data increases because of the increased size of the dictionaries.
If the dictionaries occupy a high proportion of the final output, we could consider compressing the dictionaries too, but this would bring two problems. One is the extra time cost of decompressing the dictionary when decompressing data. The other is that it is difficult to find a proper “dictionary” for compressing dictionaries, which intrinsically have less data redundancy; in other words, the compression ratio for compressing dictionaries is much lower than for compressing SPTs.
We set √n (α = 1) as the number of regions for the following experiments.







Table 6.
α    APSP size (GB)  compressed size (GB)  compression ratio  dictionaries (GB)  dictionary proportion
0.5  1459            16.65                 88                 0.66               4.0%
1    1459            13.29                 110                1.33               10.0%
2    1459            11.75                 124                2.65               22.5%
4    1459            12.21                 119                5.31               43.5%
8    1459            15.87                 92                 10.62              66.9%
5.3 Multistep compression
Table 6 shows the results of one-step compression with different numbers of regions. Multi-step compression can achieve a higher compression ratio, as shown in Figure 5. SPLZ can adjust the compression ratio by controlling the parameter len_to_dic, described in Section 4.3; this makes SPLZ adaptable to different memory capacities. As len_to_dic decreases, the compression ratio increases, because the similarity between the SPT to be compressed and its dictionary is higher. SPLZ achieves the highest compression ratio when len_to_dic = 1: the APSP of about 1459 GB is compressed to 2.87 GB, which is affordable for an ordinary PC.
5.4 Online performance
After precomputation, we test the time cost of solving the SSSP problem for a particular vertex. We load the compressed APSP into memory and randomly generate a series of queries. Each query inputs a vertex v and requests SPT(v) as output. Table 7 shows the average cost of handling a query. The parameter len_to_dic evidently affects the performance: a lower len_to_dic gives lower space consumption, but the reduced space is repaid by increased time cost. len_to_dic should be chosen according to the bottleneck of the application.
If the memory capacity is not enough to store the compressed APSP, we can store it in external memory. When SPLZ handles a query, it looks up the compressed SPT in external memory, then decompresses it and returns the result. Compared with RAM, external memory is cheaper and usually has higher capacity. The time costs on external memory in Table 7 are the stable performance after handling a large number of random queries. The latency of accessing a mechanical disk is significantly higher than that of memory; when len_to_dic is not ∞, the results on the mechanical disk differ noticeably between experiment executions and show no regular pattern, so we only record the mechanical-disk result for len_to_dic = ∞. The SSD performs better than the mechanical disk: the time cost on SSD is three to four times that on memory. Note that the space costs only cover the compressed APSP; other consumption is much smaller.
If the parent pointers need to be converted to the global ids of the vertices in the results, another 887 µs is needed. In our opinion, this conversion is not always necessary: to represent a path tree in parent-pointer form, the local ids (represented as in Table 3) may be enough if the graph is stored as an adjacency list.
Table 7.
Method                    len_to_dic  Time (µs)  memory (GB)  external memory (GB)
SPLZ on memory            ∞           219        13.29        –
                          16          323        9.37         –
                          8           464        6.32         –
                          4           733        4.44         –
                          2           1265       3.41         –
                          1           2287       2.87         –
SPLZ on SSD               ∞           740        1.33         11.96
                          16          1176       1.33         8.04
                          8           1709       1.33         4.99
                          4           2556       1.33         3.11
                          2           4468       1.33         2.08
                          1           8002       1.33         1.54
SPLZ on mechanical disk   ∞           7085       1.33         11.96
Dijkstra                  –           217291     –            –
Although SPLZ needs about 13 GB of memory when len_to_dic = ∞, storing the compressed APSP on disk or selecting a lower len_to_dic makes the space consumption more practicable at the expense of slower querying.
The average time cost of Dijkstra's algorithm based on a binary heap is about 217 ms on our experiment graph. With len_to_dic = ∞, the performance of SPLZ is thus almost three orders of magnitude faster than Dijkstra based on a binary heap. We also implemented PHAST, which needs 27.4 ms in our experiment. These comparisons may be influenced by the details of how one implements these algorithms. To show the online performance of SPLZ fairly, we try to find a lower bound for the running time of the SSSP problem and compare SPLZ against this lower bound.
5.5 Lower bound of SSSP problem
The time cost of solving the SSSP problem has a natural lower bound: whatever method we use to calculate the shortest paths from a vertex to all vertices in a graph, we must fill the result into an array of length n as output. So the natural lower bound is the time cost of copying an array of length n. We compare the time costs of SPLZ and array copying in Table 8, testing two copying methods: memcpy() and for-loop assignment. memcpy() is a standard C library function and is fully optimized. Considering that many algorithms output their results successively in a loop, we also test copying an array with a for-loop (assigning the elements one by one). The results in Table 8 show that the performance of SPLZ is close to the lower bound.
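The lower-bound measurement can be reproduced with a small harness like this (our own sketch; the absolute numbers depend entirely on hardware, compiler, and flags, and a real run should repeat and average):

```cpp
#include <chrono>
#include <cstring>
#include <utility>
#include <vector>

// Times one memcpy and one for-loop copy of an n-byte array, returning
// (memcpy_us, forloop_us). Here n plays the role of the vertex count.
std::pair<long long, long long> copy_bench(std::size_t n) {
    std::vector<unsigned char> src(n, 1), dst(n, 0);
    auto us = [](auto a, auto b) {
        return std::chrono::duration_cast<std::chrono::microseconds>(b - a).count();
    };
    auto t0 = std::chrono::steady_clock::now();
    std::memcpy(dst.data(), src.data(), n);
    auto t1 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];  // for-loop assignment
    auto t2 = std::chrono::steady_clock::now();
    return {us(t0, t1), us(t1, t2)};
}
```

Note that an optimizing compiler may vectorize the for-loop, so the comparison is only meaningful under the same build settings as the SPLZ binary.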
Table 8.
Method               Time (µs)
SPLZ                 219
memcpy               84
for-loop assignment  448
5.6 Experiments on other road networks
So far, we have tested SPLZ on only one graph, the Northwest USA road network. Here we show the results of experiments on some other graph data; Table 9 lists the related information. Like the Northwest USA data, these datasets are also downloaded from http://www.dis.uniroma1.it/challenge9. These experiments show the performance of SPLZ on different data. Here we set α = 1 and len_to_dic = ∞; the other hardware and software configurations are the same as in Section 5.1.
Table 10 shows the preprocessing results of SPLZ on these datasets. The scale of the FLA dataset is close to Northwest USA, so the performance of SPLZ is similar to what we have shown above. SPLZ achieves a lower compression ratio on smaller graphs, because path coherence takes effect when vertices in a coherent region are sufficiently far away from a vertex, while the average distance between vertices in a smaller graph is smaller. Table 11 gives the decompression performance on these datasets; the lower-bound costs are also tested. Table 11 validates that SPLZ also works well on small-scale road networks.
Name  Description  Number of vertices  Number of edges
FLA  Florida  1,070,376  2,712,798
COL  Colorado  435,666  1,057,066
BAY  San Francisco Bay Area  321,270  800,172
Name  Raw APSP size (GB)  Compressed size (GB)  Compression ratio  Preprocessing time (h:mm)
FLA  1146  9.97  115  10:56
COL  190  2.72  70  2:15
BAY  103  1.42  73  1:07
Graph  Methods  Time (μs)  Memory (GB)  External memory (GB)
FLA  SPLZ on memory  172  9.97  -
FLA  SPLZ on mechanical disk  6920  1.11  8.86
FLA  SPLZ on SSD  657  1.11  8.86
FLA  memcpy  76  -  -
FLA  for-loop assignment  459  -  -
COL  SPLZ on memory  98  2.72  -
COL  SPLZ on mechanical disk  6120  0.29  2.43
COL  SPLZ on SSD  449  0.29  2.43
COL  memcpy  29  -  -
COL  for-loop assignment  192  -  -
BAY  SPLZ on memory  71  1.42  -
BAY  SPLZ on mechanical disk  5767  0.18  1.24
BAY  SPLZ on SSD  371  0.18  1.24
BAY  memcpy  21  -  -
BAY  for-loop assignment  140  -  -
6 Conclusion
In this paper, we presented SPLZ, an algorithm for solving the single source shortest path problem on road networks. SPLZ is about three orders of magnitude faster than Dijkstra based on a binary heap. Compared with the time cost of array copying, which is a natural lower bound for the SSSP problem, SPLZ shows significantly high online performance. Although SPLZ still consumes much memory, this problem can be alleviated by storing the compressed data in external memory or by adjusting the parameter.
Future research will focus on developing a more efficient preprocessing method. In our experiments, SPLZ can solve the SSSP problem on a road network with about 1.2 million vertices, which is enough for many applications. But we should admit that SPLZ still cannot deal with larger-scale road networks, because of the huge time cost of precomputation. We can make efforts on two points. One is to adopt an algorithm faster than sequential PHAST to calculate the APSP; the other is to use a more efficient method to find the longest match while compressing the APSP.
Acknowledgement
We would like to thank the reviewers for their valuable suggestions, and Shiyan Zhan for the fruitful discussions. This work is supported by Natural Science Foundation of China (No. 61033009 and No. 61303047) and Anhui Provincial Natural Science Foundation (No. 1208085QF106).
References
 Abraham et al (2011) Abraham I, Delling D, Goldberg AV, Werneck RF (2011) A hub-based labeling algorithm for shortest paths in road networks. In: Experimental Algorithms, Springer, pp 230–241
 Abraham et al (2012) Abraham I, Delling D, Fiat A, Goldberg AV, Werneck RF (2012) HLDB: Location-based services in databases. In: Proceedings of the 20th International Conference on Advances in Geographic Information Systems, ACM, New York, NY, USA, SIGSPATIAL ’12, pp 339–348, DOI 10.1145/2424321.2424365, URL http://doi.acm.org/10.1145/2424321.2424365
 Ahuja et al (1990) Ahuja RK, Mehlhorn K, Orlin J, Tarjan RE (1990) Faster algorithms for the shortest path problem. Journal of the ACM (JACM) 37(2):213–223
 Arz et al (2013) Arz J, Luxen D, Sanders P (2013) Transit node routing reconsidered. In: Experimental Algorithms, Springer, pp 55–66
 Bast et al (2007) Bast H, Funke S, Sanders P, Schultes D (2007) Fast routing in road networks with transit nodes. Science 316(5824):566–566
 Bell and Kulp (1993) Bell T, Kulp D (1993) Longest-match string searching for Ziv-Lempel compression. Software: Practice and Experience 23(7):757–771
 Bellman (1956) Bellman R (1956) On a routing problem. Tech. rep., DTIC Document
 Bertsekas (1993) Bertsekas DP (1993) A simple and fast label correcting algorithm for shortest paths. Networks 23(8):703–709
 Bertsekas et al (1996) Bertsekas DP, Guerriero F, Musmanno R (1996) Parallel asynchronous label-correcting methods for shortest paths. Journal of Optimization Theory and Applications 88(2):297–320
 Botea et al (2013) Botea A, Baier JA, Harabor D, Hernández C (2013) Moving target search with compressed path databases. Proceedings of ICAPS13
 Cherkassky et al (1996) Cherkassky BV, Goldberg AV, Radzik T (1996) Shortest paths algorithms: Theory and experimental evaluation. Mathematical programming 73(2):129–174
 Cherkassky et al (2009) Cherkassky BV, Georgiadis L, Goldberg AV, Tarjan RE, Werneck RF (2009) Shortest-path feasibility algorithms: An experimental evaluation. Journal of Experimental Algorithmics (JEA) 14:7
 Cormen et al (2001) Cormen TH, Leiserson CE, Rivest RL, Stein C, et al (2001) Introduction to algorithms, vol 2. MIT press Cambridge
 Delling et al (2013a) Delling D, Goldberg AV, Nowatzyk A, Werneck RF (2013a) PHAST: Hardware-accelerated shortest path trees. Journal of Parallel and Distributed Computing 73(7):940–952
 Delling et al (2013b) Delling D, Goldberg AV, Pajor T, Werneck RF (2013b) Customizable route planning in road networks. In: Sixth Annual Symposium on Combinatorial Search
 Demetrescu et al (2009) Demetrescu C, Goldberg AV, Johnson DS (2009) The Shortest Path Problem: Ninth DIMACS Implementation Challenge, vol 74. American Mathematical Soc.
 Dial (1969) Dial RB (1969) Algorithm 360: Shortest-path forest with topological ordering [H]. Communications of the ACM 12(11):632–633
 Dijkstra (1959) Dijkstra EW (1959) A note on two problems in connexion with graphs. Numerische mathematik 1(1):269–271
 Gallo and Pallottino (1986) Gallo G, Pallottino S (1986) Shortest path methods: A unifying approach. Netflow at Pisa pp 38–64
 Geisberger et al (2008) Geisberger R, Sanders P, Schultes D, Delling D (2008) Contraction hierarchies: Faster and simpler hierarchical routing in road networks. In: Experimental Algorithms, Springer, pp 319–333
 Glover et al (1985) Glover F, Klingman D, Phillips N (1985) A new polynomially bounded shortest path algorithm. Operations Research 33(1):65–73
 Goldberg and Harrelson (2005) Goldberg AV, Harrelson C (2005) Computing the shortest path: A* search meets graph theory. In: Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, pp 156–165
 Hilger et al (2009) Hilger M, Köhler E, Möhring RH, Schilling H (2009) Fast point-to-point shortest path computations with arc-flags. The Shortest Path Problem: Ninth DIMACS Implementation Challenge 74:41–72
 Knuth et al (1977) Knuth DE, Morris JH Jr, Pratt VR (1977) Fast pattern matching in strings. SIAM journal on computing 6(2):323–350
 Madduri et al (2006) Madduri K, Bader DA, Berry JW, Crobak JR (2006) Parallel shortest path algorithms for solving large-scale instances
 Maue et al (2010) Maue J, Sanders P, Matijevic D (2010) Goal-directed shortest-path queries using precomputed cluster distances. J Exp Algorithmics 14:2:3.2–2:3.27, DOI 10.1145/1498698.1564502, URL http://doi.acm.org/10.1145/1498698.1564502
 Meyer and Sanders (2003) Meyer U, Sanders P (2003) Δ-stepping: a parallelizable shortest path algorithm. Journal of Algorithms 49(1):114–152
 Pallottino (1984) Pallottino S (1984) Shortestpath methods: Complexity, interrelations and new propositions. Networks 14(2):257–267
 Pape (1974) Pape U (1974) Implementation and efficiency of Moore algorithms for the shortest route problem. Mathematical Programming 7(1):212–222
 Sanders and Schultes (2005) Sanders P, Schultes D (2005) Highway hierarchies hasten exact shortest path queries. In: Algorithms–Esa 2005, Springer, pp 568–579
 Sankaranarayanan et al (2005) Sankaranarayanan J, Alborzi H, Samet H (2005) Efficient query processing on spatial networks. In: Proceedings of the 13th annual ACM international workshop on Geographic information systems, ACM, pp 200–209
 Sankaranarayanan et al (2009) Sankaranarayanan J, Samet H, Alborzi H (2009) Path oracles for spatial networks. Proceedings of the VLDB Endowment 2(1):1210–1221
 Ziv and Lempel (1977) Ziv J, Lempel A (1977) A universal algorithm for sequential data compression. IEEE Transactions on information theory 23(3):337–343