Temporal Graph Traversals: Definitions, Algorithms, and Applications

Silu Huang, James Cheng, Huanhuan Wu
Department of Computer Science and Engineering
The Chinese University of Hong Kong
{slhuang, jcheng, hhwu}@cse.cuhk.edu.hk
Abstract

A temporal graph is a graph in which connections between vertices are active at specific times, and such temporal information leads to completely new patterns and knowledge that are not present in a non-temporal graph. In this paper, we study traversal problems in a temporal graph. Graph traversals, such as DFS and BFS, are basic operations for processing and studying a graph. While both DFS and BFS are well-known simple concepts, it is non-trivial to adapt the same notions from a non-temporal graph to a temporal graph. We analyze the difficulties of defining temporal graph traversals and propose new definitions of DFS and BFS for a temporal graph. We investigate the properties of temporal DFS and BFS, and propose efficient algorithms with optimal complexity. We also study important applications of temporal DFS and BFS. We verify the efficiency and importance of our graph traversal algorithms on real-world temporal graphs.

1 Introduction

Graph traversals, such as depth-first search (DFS) and breadth-first search (BFS), are the most fundamental graph operations. Both DFS and BFS are not only themselves essential in studying and understanding graphs, but they are also building blocks of numerous more advanced graph algorithms [7]. Their importance to graph theory and applications is beyond question.

Surprisingly, however, such basic graph traversal operations as DFS and BFS have not been defined or studied in any depth for an important source of graph data, namely temporal graphs. Although both DFS and BFS are simple for a non-temporal graph, we shall show that the concepts of DFS and BFS are non-trivial for temporal graphs, and that they reveal many important properties useful for understanding temporal graphs and lead to new applications.

Temporal graphs are graphs in which vertices and edges are temporal, i.e., they exist or are active at specific time instances. Formally, the temporal graph we study is a graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges. Each edge $(u, v) \in E$ is associated with a list of time instances at which $(u, v)$ is active or $u$ is communicating to $v$. A vertex is considered active whenever it is involved in an active edge communication. Figure 1 depicts a Short Message Service (SMS) network modeled as a temporal graph. In the graph, we can see that sends a message to at time and ; while sends a message to at time .

Although research on graph data has been mainly focused on non-temporal graphs, temporal graphs are in fact ubiquitous in real life. For example, SMS networks, phone call networks, email networks, online social posting networks, stock exchange networks, flight scheduling or travel planning graphs, etc., are all temporal graphs as the objects (i.e., vertices) communicate/connect to each other at different time instances. A long list of different types of temporal graphs is described in [12].

Although temporal graphs exist in a wide spectrum of application domains, research on them is seriously inadequate, which we believe is mainly due to the common practice of representing a temporal graph as a non-temporal graph for easier data analysis and algorithm design. Figure 2(a) shows the non-temporal graph representation of the temporal graph in Figure 1, where the temporal information is discarded and multiple edges are combined into one (e.g., the two edges (a,b) at time 1 and 6 are combined into a single edge (a,b) in the non-temporal graph). Unfortunately, it has been largely overlooked that a non-temporal graph representation actually loses critical information in the temporal graph, which we explain as follows.

Both DFS and BFS are closely related to graph reachability [1, 24], as any path from an ancestor to a descendant in the DFS/BFS tree indicates that can reach . However, a DFS/BFS tree of the non-temporal graph representation does not imply the reachability between the same vertices in the corresponding temporal graph. For example, the DFS and BFS trees, with vertex as the root, of the non-temporal graph in Figure 2(a) are the same and are given in Figure 2(b). The path in the DFS/BFS tree indicates that can reach in ; however, cannot reach in the temporal graph because in , reaches at and but reaches only before , i.e., . In fact, is not a proper path in since information cannot be transmitted from to and then from to following a chronological time sequence.

The above discussion not only shows that the non-temporal graph representation is not a good tool for studying temporal graphs, but also motivates the need to define basic graph traversals for temporal graphs. To this end, we conduct the first study on traversal problems in temporal graphs and propose definitions, algorithms, and applications of both DFS and BFS in temporal graphs. We will show that these simple traversal operations on non-temporal graphs become unexpectedly complicated in temporal graphs, as the presence of the edges is governed by chronological time sequences.

Note that a temporal graph can be viewed as a sequence of snapshots, where each snapshot is a non-temporal graph in which all edges are active at the same time $t$. A naive approach to temporal graph traversal is to conduct a DFS or BFS in each snapshot. However, such an approach is unrealistic since the number of snapshots in a temporal graph can be very large, e.g., the wiki dataset used in our experiment has snapshots. Dividing a temporal graph into so many snapshots is not suitable for analysis, as it is neither easy nor efficient to relate the information of one snapshot to that of the next one.

The main contributions of our work are summarized as follows.

• We show that critical temporal information is missing in the non-temporal graph representation of a temporal graph, hence the motivation to study temporal graphs by preserving temporal information.

• We show the challenges in defining meaningful DFS and BFS for temporal graphs. We then formally define DFS and BFS in temporal graphs, and design efficient algorithms for their computation.

• We believe that basic traversals such as DFS and BFS are the keys to studying temporal graphs, and hence the significance of our work in contributing to future research on temporal graphs that has not been given enough attention so far. As a first step along this direction, we study various graph properties that can be obtained by a DFS or BFS traversal of a temporal graph, and then we identify a set of important applications for both DFS and BFS in temporal graphs.

• We conduct extensive experiments on a range of real world temporal graphs. We first evaluate the efficiency of our algorithms. We study the properties of temporal DFS and BFS, and demonstrate their importance by comparing with results obtained from non-temporal graphs. We also show temporal graph traversals are useful in applications.

The rest of the paper is organized as follows. In Section 2 we give the notations. Then, in Sections 3 and 4 we present the details of temporal DFS and BFS. We discuss applications in Section 5. We conduct experimental studies in Section 6. Finally, we discuss related work in Section 7 and give our conclusions in Section 8.

2 Notions and Notations

We define notions and notations related to temporal graphs in this section. We first define two types of closely related edges.

• Temporal edge: a temporal edge is represented by a triplet, $(u, v, t)$, where $u$ is the start point or start vertex, $v$ is the end point or end vertex, and $t$ is the time when $u$ sends a message to $v$ or when the edge is active. We call $u$ the in-neighbor of $v$ and $v$ the out-neighbor of $u$.

• Non-temporal edge: a non-temporal edge is simply the conventional edge representation, given by a pair $(u, v)$, where $u$ is the start vertex, and $v$ is the end vertex.

Based on the two types of edges, we define temporal graph and non-temporal graph as follows.

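Temporal graph: Let $G = (V, E)$ be a temporal graph, where $V$ is the set of vertices and $E$ is the set of edges in $G$.

• Each edge $e = (u, v, t) \in E$ is a temporal edge from a vertex $u$ to another vertex $v$ at time $t$. For any two temporal edges $(u, v, t_1)$ and $(u, v, t_2)$, $t_1 \neq t_2$.

• Each vertex $v \in V$ is active when there is a temporal edge that starts or ends at $v$.

• $\pi(u, v)$: the number of temporal edges from $u$ to $v$ in $G$.

• $\Pi(u, v)$: the set of temporal edges from $u$ to $v$ in $G$, i.e., $\Pi(u, v) = \{(u, v, t) : (u, v, t) \in E\}$.

• $N_{out}(v)$ or $N_{in}(v)$: the set of out-neighbors or in-neighbors of $v$ in $G$, i.e., $N_{out}(v) = \{u : (v, u, t) \in E\}$ and $N_{in}(v) = \{u : (u, v, t) \in E\}$.

• $d_{out}(v)$ or $d_{in}(v)$: the temporal out-degree or in-degree of $v$ in $G$, defined as $d_{out}(v) = \sum_{u \in N_{out}(v)} \pi(v, u)$ and $d_{in}(v) = \sum_{u \in N_{in}(v)} \pi(u, v)$.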

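Now given a temporal graph $G$, we define the corresponding non-temporal graph of $G$ as follows.

Non-temporal graph: Given a temporal graph $G = (V, E)$, we construct a non-temporal graph $G_s = (V_s, E_s)$ from $G$ as follows:

• $V_s = V$.

• $E_s = \{(u, v) : (u, v, t) \in E\}$, i.e., we create a non-temporal edge $(u, v)$ in $G_s$ for every non-empty set of temporal edges from $u$ to $v$ in $G$.

• $N_{out}(v, G_s)$ or $N_{in}(v, G_s)$: the set of out-neighbors or in-neighbors of $v$ in $G_s$, i.e., $N_{out}(v, G_s) = \{u : (v, u) \in E_s\}$ and $N_{in}(v, G_s) = \{u : (u, v) \in E_s\}$.

• $d_{out}(v, G_s)$ or $d_{in}(v, G_s)$: the out-degree or in-degree of $v$ in $G_s$, defined as $d_{out}(v, G_s) = |N_{out}(v, G_s)|$ and $d_{in}(v, G_s) = |N_{in}(v, G_s)|$.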

Figures 1 and 2(a) show a temporal graph and its corresponding non-temporal graph . We have as , and thus , while .
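The definitions of this section can be sketched in a few lines of Python. The edge list below is hypothetical (it is not the graph of Figure 1), and the function name `summarize` is only illustrative. The sketch counts the temporal edges between each ordered vertex pair, collapses them into non-temporal edges, and contrasts the temporal out-degree with the non-temporal out-degree.

```python
from collections import defaultdict

def summarize(temporal_edges):
    """Return (multiplicity, static_edges, temporal_out_degree, static_out_degree)."""
    multiplicity = defaultdict(int)      # (u, v) -> number of temporal edges from u to v
    for u, v, t in temporal_edges:
        multiplicity[(u, v)] += 1
    static_edges = set(multiplicity)     # one non-temporal edge per connected ordered pair
    temporal_out = defaultdict(int)      # counts every temporal edge leaving u
    static_out = defaultdict(int)        # counts distinct out-neighbors of u
    for (u, v), count in multiplicity.items():
        temporal_out[u] += count
        static_out[u] += 1
    return dict(multiplicity), static_edges, dict(temporal_out), dict(static_out)

# Hypothetical temporal edges (start vertex, end vertex, time).
edges = [("a", "b", 1), ("a", "b", 6), ("b", "c", 4), ("c", "a", 5)]
mult, static_edges, t_out, s_out = summarize(edges)
print(mult[("a", "b")])         # 2 temporal edges collapse into ...
print(("a", "b") in static_edges)  # ... a single non-temporal edge
print(t_out["a"], s_out["a"])   # 2 1
```

The two degrees differ exactly when some pair of vertices is connected by more than one temporal edge.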

Remarks: We focus our discussions on directed graphs, but our definitions and algorithms can be trivially applied to undirected graphs. For simplicity, we do not consider self-loops, which can also be easily handled. We also remark that our method can be easily extended to handle temporal edges with a time duration.

3 Depth-First Search

In this section, we propose two definitions of depth-first search (DFS) for temporal graphs, and discuss why two definitions are needed. We investigate properties of DFS in temporal graphs and then present efficient algorithms for DFS in temporal graphs.

3.1 Challenges of DFS

DFS in a non-temporal graph is rather simple: it starts from a chosen source vertex and traverses as deep as possible along each path before backtracking. The DFS constructs a tree rooted at the source vertex. However, in a temporal graph, even such a simple graph traversal problem becomes very complicated due to the presence of temporal information on the edges and the existence of multiple edges between two vertices.

In Section 1, we have shown that if we ignore the temporal information, the DFS tree obtained will present incorrect information about the temporal graph. Thus, a DFS in a temporal graph must follow the chronological order carried by the temporal edges.

We can impose a time constraint when traversing a temporal graph. Naturally, the following time constraint should be imposed: when we traverse as deep as possible along a path in the DFS tree, for any two consecutive edges $(u, v, t_1)$ and $(v, w, t_2)$ on any root-to-leaf path, we have $t_1 \le t_2$. This constraint is meaningful because if $t_1 > t_2$, then the edge $(u, v, t_1)$ exists after $(v, w, t_2)$ and at time $t_1$ when we traverse $(u, v, t_1)$, the edge $(v, w, t_2)$ no longer exists (as it existed in the past at time $t_2$). For example, in the graph in Figure 1, first sent a message to at time 1 and then forwarded it to at time 4, which naturally gives two chronologically ordered edges followed by . On the contrary, if we first have , then it should not be followed by as this order does not give the correct chronological development of events and may lead to a chaotic time sequence especially when the path grows longer.

Imposing the above-mentioned time constraint during temporal graph traversal would address the problem if there were only a single temporal edge going from one vertex to another vertex. However, the existence of multiple temporal edges between two vertices complicates the problem. Consider again the graph in Figure 1: there are two temporal edges from to , and the question is how DFS traverses the two edges. Are they treated as tree edges or forward edges? The situation is further complicated as there are also multiple temporal edges connecting their neighbors, leading to a combinatorial effect. Such tricky cases do not occur in a non-temporal graph, and thus careful investigation is needed to define meaningful and useful DFS in temporal graphs.

3.2 Definitions of DFS

We first formally define the time constraint on a temporal graph traversal (including both DFS and BFS) as follows.

Definition 1 (Time Constraint on Traversal)

Let $u$ be the current vertex during a traversal in a temporal graph, and $\sigma(u)$ be the time when $u$ is visited, i.e., the traversal either starts from $u$ as the source vertex at time $\sigma(u)$, or visits $u$ via a temporal edge $(u', u, \sigma(u))$. Given an edge $e = (u, v, t)$, we traverse $e$ only if $\sigma(u) \le t$.

The above time constraint was proposed to define temporal paths in [13], and the rationale for setting this time constraint has been explained in Section 3.1.
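As a minimal sketch of the time constraint, the following Python function checks whether a given sequence of temporal edges can be traversed in order. The function name and the sample edges are hypothetical. The comparison is taken to be non-strict (an edge active exactly at the visit time may still be traversed), which is an assumption based on the paper's wording that edges "at or after" the starting time are considered.

```python
def respects_time_constraint(path, t_start):
    """path: list of temporal edges (u, v, t) forming a head-to-tail walk."""
    visit_time = t_start                  # time at which the current vertex was visited
    for i, (u, v, t) in enumerate(path):
        if i > 0 and path[i - 1][1] != u:
            raise ValueError("edges do not form a path")
        if t < visit_time:                # edge was only active before we got to u
            return False
        visit_time = t                    # v is visited at the time of the edge
    return True

print(respects_time_constraint([("a", "b", 1), ("b", "c", 4)], 0))  # True
print(respects_time_constraint([("a", "b", 6), ("b", "c", 4)], 0))  # False
```

The second call fails because the only edge leaving the middle vertex was active before that vertex was reached.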

To address the problem of multiple temporal edges between two vertices, we allow multiple occurrences of a vertex in a DFS tree, in contrast to a DFS tree in a non-temporal graph in which each vertex appears exactly once. This is reasonable because each vertex is actually active at multiple times when the (multiple) edges are active. We give our first version of DFS in Definition 2.

Definition 2 (Temporal DFS-v1)

Given a temporal graph $G = (V, E)$ and a starting time $t_s$, a DFS in $G$ starting at $t_s$, named as DFS-v1, is defined as follows:

1. Initialize $\sigma(v) = \infty$ for all $v \in V$, and select a source vertex $s$.

2. Visit $s$ and set $\sigma(s) = t_s$, and go to Step 2(a):

(a) After visiting a vertex $u$: Let $E_{u,v}$ be the set of temporal edges going from $u$ to $v$, where each edge $e = (u, v, t) \in E_{u,v}$ has not been traversed before and $\sigma(u) \le t$.

If there exists an out-neighbor $v$ of $u$ such that $E_{u,v} \neq \emptyset$, then choose the edge $e = (u, v, t) \in E_{u,v}$, where $t = \min\{t' : (u, v, t') \in E_{u,v}\}$, and traverse $e$ and go to Step 2(b).

If there is no out-neighbor $v$ of $u$ such that $E_{u,v} \neq \emptyset$, then we backtrack to $u$’s predecessor $u'$ (i.e., we have just visited $u$ via the temporal edge $(u', u, \sigma(u))$) and repeat Step 2(a); or if $u$ is the source vertex, then terminate the DFS.

(b) After traversing a temporal edge $e = (u, v, t)$: If $\sigma(v) > t$, we visit the vertex $v$ and set $\sigma(v) = t$, and go to Step 2(a). Else, repeat Step 2(a).
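The steps of DFS-v1 can be sketched directly, without any of the optimizations discussed later. This is only an illustration under stated assumptions: the edge list is hypothetical, the time comparisons follow the reading that an edge is usable when its time is at or after the visit time of its start vertex, and the out-neighbor is chosen by always taking the earliest usable untraversed edge (the definition leaves the choice of out-neighbor open).

```python
import math
from collections import defaultdict

def temporal_dfs_v1(edges, source, t_start):
    """Return the tree edges of a DFS-v1 run, in the order they are created."""
    out = defaultdict(list)
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        out[u].append((v, t))                     # out-edges kept in ascending time
    visit_time = defaultdict(lambda: math.inf)    # latest recorded visit time per vertex
    visit_time[source] = t_start
    traversed = set()                             # (u, index into out[u]) already used
    tree = []
    stack = [source]                              # current root-to-vertex path
    while stack:
        u = stack[-1]
        pick = None
        for i, (v, t) in enumerate(out[u]):       # earliest usable edge first
            if (u, i) not in traversed and t >= visit_time[u]:
                pick = (i, v, t)
                break
        if pick is None:                          # nothing left: backtrack
            stack.pop()
            continue
        i, v, t = pick
        traversed.add((u, i))
        if visit_time[v] > t:                     # new occurrence of v in the tree
            visit_time[v] = t
            tree.append((u, v, t))
            stack.append(v)
    return tree

edges = [("a", "b", 1), ("a", "b", 6), ("b", "c", 4), ("c", "a", 5)]
print(temporal_dfs_v1(edges, "a", 0))   # [('a', 'b', 1), ('b', 'c', 4)]
```

In this run the later edge from `a` to `b` at time 6 is traversed but creates no new occurrence of `b`, because `b` was already visited at time 1.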

Definition 2 allows a vertex to be visited multiple times. The condition “$\sigma(v) > t$” in Step 2(b) is necessary to prevent a vertex being both the ancestor and descendant of itself in a DFS tree. For example, if we set “$\sigma(v) \ge t$”, then a DFS of the graph in Figure 1 following the edges creates a loop . Thus, setting $\sigma(v) = \infty$ for all $v \in V$ indicates that initially “$\sigma(v) > t$” is satisfied and $v$ can be visited, while by setting $\sigma(s) = t_s$ we do not allow $s$ to be visited from any other vertices and hence restrict $s$ to be the root of a DFS tree only.

However, setting the condition “$\sigma(v) > t$” alone is not sufficient as there are multiple out-edges we can choose to traverse. Naturally we specify the order of the edges to be traversed to follow the ascending order of the time at which they are active. In addition, since some applications may favor more recent information, we also allow users to specify a starting time $t_s$ to capture temporal information only at or after $t_s$, while the information before $t_s$ is considered obsolete.

Figure 3(a) gives the DFS tree obtained by executing DFS-v1, starting at , on the graph in Figure 1. Note that all temporal edges before are neither tree edges nor non-tree edges, as they are considered obsolete. However, it is observed that more recent edges such as are not considered as equally as older edges such as . Thus, if we consider that all temporal edges after a user-specified starting time should receive equal treatments, we will need a new definition of DFS, which we present as follows.

Definition 3 (Temporal DFS-v2)

Given a temporal graph $G = (V, E)$ and a starting time $t_s$, a DFS in $G$ starting at $t_s$, named as DFS-v2, is defined as follows:

1. Initialize $\sigma(v) = \infty$ for all $v \in V$, and select a source vertex $s$.

2. Visit $s$ and set $\sigma(s) = t_s$, and go to Step 2(a):

(a) After visiting a vertex $u$: Let $E_u$ be the set of temporal edges outgoing from $u$, where each edge $e = (u, v, t) \in E_u$ has not been traversed before and $\sigma(u) \le t$.

If $E_u \neq \emptyset$, we choose the edge $e = (u, v, t) \in E_u$, where $t = \max\{t' : (u, v', t') \in E_u\}$, and traverse $e$ and go to Step 2(b).

Else (i.e., $E_u = \emptyset$), we backtrack to $u$’s predecessor $u'$ (i.e., we have just visited $u$ via the temporal edge $(u', u, \sigma(u))$) and repeat Step 2(a); or if $u$ is the source vertex, then terminate the DFS.

(b) After traversing a temporal edge $e = (u, v, t)$: If $\sigma(v) > t$, we visit the vertex $v$ and set $\sigma(v) = t$, and go to Step 2(a). Else, repeat Step 2(a).
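A direct, unoptimized sketch of DFS-v2 differs from DFS-v1 only in which edge is picked: the latest usable untraversed out-edge instead of the earliest. The edge list is hypothetical, and the non-strict comparison between visit time and edge time is an assumption, as is the tie-breaking among edges with equal times.

```python
import math
from collections import defaultdict

def temporal_dfs_v2(edges, source, t_start):
    """Return the tree edges of a DFS-v2 run, in the order they are created."""
    out = defaultdict(list)
    for u, v, t in edges:
        out[u].append((v, t))
    visit_time = defaultdict(lambda: math.inf)
    visit_time[source] = t_start
    traversed = set()                              # (u, index into out[u]) already used
    tree = []
    stack = [source]
    while stack:
        u = stack[-1]
        usable = [(t, i) for i, (v, t) in enumerate(out[u])
                  if (u, i) not in traversed and t >= visit_time[u]]
        if not usable:                             # nothing left: backtrack
            stack.pop()
            continue
        t, i = max(usable)                         # reverse chronological choice
        v = out[u][i][0]
        traversed.add((u, i))
        if visit_time[v] > t:                      # new (earlier) occurrence of v
            visit_time[v] = t
            tree.append((u, v, t))
            stack.append(v)
    return tree

edges = [("a", "b", 1), ("a", "b", 6), ("b", "c", 4), ("c", "a", 5)]
print(temporal_dfs_v2(edges, "a", 0))
# [('a', 'b', 6), ('a', 'b', 1), ('b', 'c', 4)]
```

On this input both temporal edges from `a` to `b` become tree edges, so `b` occurs twice in the tree, whereas the earliest-first order would keep only the edge at time 1.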

The main difference between Definition 3 and Definition 2 is that among the set of available outgoing temporal edges from a vertex in Step 2(a), Definition 3 chooses the edges to traverse in reverse chronological order. This may look counterintuitive, but we will show that this definition of DFS is meaningful. In particular, in Section 5 we show how it allows us to answer important path and “distance” queries.

Figure 3(b) presents the DFS tree obtained by executing DFS-v2, starting at , on the graph in Figure 1. We can see the multiple temporal edges between vertices are equally considered in the DFS and presented in the DFS tree.

3.3 Notions and Properties Related to DFS

Now we present a number of notions related to DFS and some good properties of DFS for temporal graphs.

We first formally define tree edges in a DFS as follows.

Definition 4 (DFS tree and Tree Edges)

In a DFS of a temporal graph $G$, an edge $e = (u, v, t)$ is a tree edge if $e$ is traversed in the DFS and $\sigma(v) > t$ when we traverse $e$ (and then we visit $v$ via $e$ and set $\sigma(v) = t$).

The DFS constructs a DFS tree, $T$, which is rooted at the source vertex $s$, where the set of vertices in $T$ is the set of vertices visited in the DFS, and the set of edges in $T$ is the set of all tree edges in the DFS. Since some $v \in V$ may be visited multiple times in the DFS, there may be multiple occurrences of $v$ in $T$.

Let $E' \subseteq E$ be the set of temporal edges in a temporal graph $G$ that are active at or after $t_s$, so that $|E'|$ is the number of such edges. The following lemmas analyze the bound on the size of the DFS tree. Lemma 1 will also be used to analyze the complexity of our algorithms in Section 3.4.

Lemma 1

In a DFS of $G$ (either by Definition 2 or 3), only edges in $E'$ are traversed and each edge in $E'$ is traversed at most once.

Lemma 2

Let $T$ be the DFS tree of $G$. Then, $T$ has at most $|E'| + 1$ vertices and $|E'|$ edges.

The proof of Lemma 1 follows directly from Definition 2 or 3, while the proof of Lemma 2 follows directly from Definition 4 and Lemma 1, and the fact that we visit a vertex in the DFS only when we traverse an edge. We next define non-tree edges traversed in a DFS as follows.

Definition 5 (Non-Tree Edges)

Given a DFS tree $T$ of $G$, an edge $e = (u, v, t) \in E$ is either a tree edge in $T$, or a non-tree edge belonging to one of the following four types:

• Forward edge: $e$ is a forward edge if at the time when the DFS traverses $e$, $u$ is already an ancestor of $v$ in $T$.

• Backward edge: $e$ is a backward edge if at the time when the DFS traverses $e$, $u$ is already a descendant of $v$ in $T$.

• Cross edge: $e$ is a cross edge if at the time when the DFS traverses $e$, $u$ is neither an ancestor nor a descendant of $v$ in $T$.

• Non-DFS edge: $e$ is a non-DFS edge if $e$ is not traversed in the DFS.
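The five edge categories can be illustrated with a small sketch that labels every edge during an earliest-first DFS (the DFS-v1 order). Everything here is an assumption made for illustration: the edge list is hypothetical, the time comparisons are non-strict at the start vertex, and ancestor or descendant tests are done with occurrence counters rather than an explicit tree. An edge into a vertex on the current root path is labeled backward; an edge into a vertex that gained an occurrence after the current occurrence of the start vertex was created is labeled forward.

```python
import math
from collections import defaultdict

def classify_edges_v1(edges, source, t_start):
    out = defaultdict(list)
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        out[u].append((v, t))
    visit_time = defaultdict(lambda: math.inf)
    visit_time[source] = t_start
    last_occ = {source: 0}        # vertex -> id of its most recent occurrence
    next_id = 1
    stack = [(source, 0)]         # (vertex, occurrence id) on the current root path
    on_path = {source}
    traversed = set()
    label = {}
    while stack:
        u, u_id = stack[-1]
        pick = next(((i, v, t) for i, (v, t) in enumerate(out[u])
                     if (u, i) not in traversed and t >= visit_time[u]), None)
        if pick is None:                      # backtrack
            stack.pop()
            on_path.discard(u)
            continue
        i, v, t = pick
        traversed.add((u, i))
        if visit_time[v] > t:                 # creates a new occurrence of v
            label[(u, v, t)] = "tree"
            visit_time[v] = t
            last_occ[v] = next_id
            stack.append((v, next_id))
            on_path.add(v)
            next_id += 1
        elif v in on_path:                    # v is an ancestor of this occurrence of u
            label[(u, v, t)] = "backward"
        elif last_occ[v] > u_id:              # v already occurs below this occurrence of u
            label[(u, v, t)] = "forward"
        else:
            label[(u, v, t)] = "cross"
    for e in edges:                           # everything never traversed
        label.setdefault(tuple(e), "non-DFS")
    return label

edges = [("a", "b", 1), ("a", "b", 6), ("b", "c", 4), ("c", "a", 5),
         ("a", "d", 7), ("d", "c", 8), ("c", "d", 2)]
for edge, kind in sorted(classify_edges_v1(edges, "a", 0).items(), key=lambda x: x[0][2]):
    print(edge, kind)
```

Each edge receives exactly one label, which mirrors the statement that a DFS partitions the edge set into five disjoint subsets.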

The following lemmas and corollary present some important properties of DFS in a temporal graph.

Lemma 3

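Given a DFS tree $T$ of $G$, if a vertex $u$ is an ancestor of another vertex $v$ in $T$ (or $v$ is a descendant of $u$), then $\sigma(u) \le \sigma(v)$. Here $u$ and $v$ refer to a specific occurrence of vertex $u$ and $v$ in $T$, respectively.

Proof.

If $u$ is an ancestor of $v$, then there exists a path $\langle u = v_1, v_2, \ldots, v_k = v \rangle$ such that for each edge $(v_i, v_{i+1}, t_i)$ on the path for $1 \le i < k$, we have $\sigma(v_i) \le t_i = \sigma(v_{i+1})$ by Steps 2(a) and 2(b) of Definitions 2 and 3 since $(v_i, v_{i+1}, t_i)$ is a tree edge. The proof follows as $\sigma(v_i) \le \sigma(v_{i+1})$, for $1 \le i < k$, implies that $\sigma(u) \le \sigma(v)$.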

Lemma 4

Given a DFS tree $T$ of $G$, a vertex $u$ cannot be both an ancestor and a descendant of another vertex $v$ along the same root-to-leaf path.

Proof.

If $u$ is both an ancestor and a descendant of $v$ along the same root-to-leaf path, there exists a path $\langle u, \ldots, v, \ldots, w, u \rangle$ such that each edge on the path is a tree edge. Consider the last edge on the path, i.e., $(w, u, t)$. At the time when we traverse from $w$ to $u$, we have $\sigma(u) \le \sigma(w) \le t$ by Lemma 3 and Step 2(a) of Definitions 2 and 3. Since $(w, u, t)$ is a tree edge, we require $\sigma(u) > t$ when we traverse $(w, u, t)$ (right before we visit $u$ via $(w, u, t)$), which contradicts $\sigma(u) \le t$.

Corollary 1

Given a temporal graph $G = (V, E)$, a DFS of $G$ partitions $E$ into five disjoint subsets.

Proof.

The proof follows directly from Definitions 4 and 5, and Lemma 4.

We illustrate the concepts by an example. The solid lines in Figure 3 are all tree edges; in Figure 3(a) is a forward edge; is a backward edge and is a cross edge in Figures 3(a)-(b); while , and are non-DFS edges in Figures 3(a)-(b). Some temporal edges cannot be traversed due to time sequential constraint, e.g., is non-DFS edge even if it is active after . It is reasonable not to include such temporal edges since we only keep all useful information concerning DFS starting from . Also note that a temporal edge may belong to different categories for different versions of DFS, e.g., is a forward edge in Figure 3(a) but a tree edge in Figure 3(b).

We next define the notion of cycle in a temporal graph. Similar to the time constraint on temporal graph traversal given in Definition 1, cycles in a temporal graph also follow sequential time constraint as defined below. For simplicity, in the following discussion on cycles, whenever we mention a vertex in a DFS tree , we refer to a specific occurrence of in .

Definition 6 (Temporal Cycle)

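Given a temporal graph $G$, a cycle in $G$ is given by a sequence of temporal edges $C = \langle (v_1, v_2, t_1), (v_2, v_3, t_2), \ldots, (v_k, v_1, t_k) \rangle$, where $v_1$ is the start and end vertex of $C$, and $t_1 \le t_2 \le \cdots \le t_k$.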

Different from cycles in a non-temporal graph, cycles in a temporal graph have a start vertex in order to satisfy the sequential time constraint. Note that we cannot pick another vertex in the cycle without violating the sequential time constraint. For example, if we choose in to be the start vertex, without considering the temporal information is still a cycle, but with the temporal information we have .
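The dependence on the start vertex can be checked with a short sketch. The function and the sample edges are hypothetical, the ordering of times is assumed to be non-strict, and the check does not verify that intermediate vertices are distinct.

```python
def is_temporal_cycle(cycle_edges):
    """cycle_edges: list of temporal edges (u, v, t) in traversal order."""
    if not cycle_edges:
        return False
    pairs = list(zip(cycle_edges, cycle_edges[1:]))
    closes = cycle_edges[-1][1] == cycle_edges[0][0]        # returns to the start vertex
    chained = all(prev[1] == nxt[0] for prev, nxt in pairs)  # head-to-tail
    ordered = all(prev[2] <= nxt[2] for prev, nxt in pairs)  # chronological
    return closes and chained and ordered

cycle = [("a", "b", 1), ("b", "c", 4), ("c", "a", 5)]
rotated = cycle[1:] + cycle[:1]          # same edges, but starting from b
print(is_temporal_cycle(cycle))          # True
print(is_temporal_cycle(rotated))        # False
```

Rotating the sequence keeps it a cycle in the non-temporal sense, but the edge times are no longer in chronological order, so only the original start vertex yields a temporal cycle.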

A temporal cycle, e.g., , corresponding to in Figure 3(b), indicates that delivers information at while it gets information feedback at w.r.t. information fusion. In another application such as flight scheduling, the temporal cycle indicates a person leaving at and returning at , where the difference between and is referred as round trip time.

The following lemma is useful for detecting temporal cycles.

Lemma 5

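In a DFS of a temporal graph $G$, if a temporal edge $e = (u, v, t)$ is a backward edge, then $C = \langle (v, v_1, t_1), (v_1, v_2, t_2), \ldots, (v_{k-1}, u, t_k), (u, v, t) \rangle$ is a cycle in $G$, where $v$ is the start and end vertex of $C$.

Proof.

If $e$ is a backward edge, by Definition 5 $u$ is a descendant of $v$, which means that there is a path $\langle v = v_0, v_1, \ldots, v_k = u \rangle$ and each $(v_{i-1}, v_i, t_i)$ is a tree edge for $1 \le i \le k$. Thus, by Steps 2(a) and 2(b) of Definitions 2 and 3, we have $\sigma(v_{i-1}) \le t_i = \sigma(v_i)$ and hence $t_1 \le t_2 \le \cdots \le t_k = \sigma(u)$. Since $e$ is traversed in the DFS, we have $\sigma(u) \le t$. Thus, $t_1 \le t_2 \le \cdots \le t_k \le t$ and $C$ is a cycle in $G$.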

The following definition and lemma are related to reachability in a temporal graph.

Definition 7 (Temporal Graph Reachability)

Let $G$ be a temporal graph. A vertex $u$ can reach another vertex $v$ (or $v$ is reachable from $u$) in $G$ if there is a path from $u$ to $v$ in $G$ such that traversing the path starting from $u$ to $v$ follows the time constraint defined in Definition 1.

Lemma 6

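Let $V_T$ be the set of distinct vertices (i.e., multiple occurrences of a vertex $v$ are considered as a single $v$) in the DFS tree $T$ of $G$ (by DFS-v1 or DFS-v2), rooted at a source $s$. Let $V_R$ be the set of vertices in $G$ that are reachable from $s$. Then, $V_T = V_R$.

Proof.

First, $V_T \subseteq V_R$ because the simple path in $T$ from $s$ to each $v \in V_T$ is a path in $G$ that satisfies the definition of reachability from $s$ to $v$ as given in Definition 7. Next, we prove $V_R \subseteq V_T$, i.e., if there exists a path $\langle (s, v_1, t_1), (v_1, v_2, t_2), \ldots, (v_{k-1}, v, t_k) \rangle$ in $G$ from $s$ to each $v \in V_R$, then $v \in V_T$. When visiting $s$ in the DFS, the edge $(s, v_1, t_1)$ must be traversed according to Step 2(a) of DFS-v1 or DFS-v2 since $\sigma(s) = t_s \le t_1$, which implies that $v_1$ must occur in $T$ with $\sigma(v_1) \le t_1$ ($v_1$ could be visited via another edge $(u', v_1, t')$ where $t' \le t_1$). Then, $(v_1, v_2, t_2)$ must be traversed since $\sigma(v_1) \le t_1 \le t_2$, and thus $v_2$ must occur in $T$ with $\sigma(v_2) \le t_2$. By recursive analysis we conclude that $v$ must occur in $T$ with $\sigma(v) \le t_k$, i.e., $v \in V_T$. Thus, $V_T = V_R$.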

In Figure 1, if , then the set of reachable vertices from is the whole vertex set except , which is exactly the set of vertices in the DFS trees in Figures 3(a)-(b).
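The set of reachable vertices can also be computed without any traversal tree, which gives an independent check of Lemma 6. The sketch below is not the paper's algorithm: it repeatedly relaxes earliest arrival times until nothing changes, and compares the result with plain reachability in the non-temporal graph. The edge list is hypothetical and the time constraint is assumed to be non-strict.

```python
import math
from collections import defaultdict

def temporal_reachable(edges, source, t_start):
    """Vertices reachable from source along time-respecting paths."""
    earliest = defaultdict(lambda: math.inf)   # earliest known arrival time
    earliest[source] = t_start
    changed = True
    while changed:                             # iterate to a fixed point
        changed = False
        for u, v, t in edges:
            if earliest[u] <= t < earliest[v]: # usable edge that improves v
                earliest[v] = t
                changed = True
    return {v for v, t in earliest.items() if t < math.inf}

def static_reachable(edges, source):
    """Reachability when all time labels are ignored."""
    adj = defaultdict(set)
    for u, v, _ in edges:
        adj[u].add(v)
    seen, todo = {source}, [source]
    while todo:
        u = todo.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                todo.append(v)
    return seen

edges = [("a", "b", 6), ("b", "c", 4), ("a", "d", 2)]
print(sorted(temporal_reachable(edges, "a", 0)))  # ['a', 'b', 'd']
print(sorted(static_reachable(edges, "a")))       # ['a', 'b', 'c', 'd']
print(sorted(temporal_reachable(edges, "a", 3)))  # ['a', 'b']
```

The last call shows that the reachable set also depends on the starting time, since edges active before it are ignored.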

3.4 Algorithms and Complexity of DFS

Before presenting the algorithms for DFS, we first describe the data format for an input temporal graph $G$. We assume that edges in $G$ are active at time instances $t_1, t_2, \ldots, t_n$, where $t_i < t_{i+1}$ for $1 \le i < n$. Let $N_{out}(u, t_i)$ be the set of out-neighbors of a vertex $u$ at $t_i$. We assume that each $N_{out}(u, t_{i+1})$ is collected after $N_{out}(u, t_i)$ as time proceeds, and simply concatenated to it. Thus, for each $u$, the set of temporal out-edges from $u$ in $G$ is stored as the sequence $N_{out}(u, t_1), N_{out}(u, t_2), \ldots, N_{out}(u, t_n)$. For example, the out-edges of in Figure 1, i.e., , are stored as , which is ordered chronologically.

In the discussions of all our algorithms, we assume the above-described data format for the input temporal graph.
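A minimal sketch of this data format, assuming the edges arrive as a chronological stream of (start vertex, end vertex, time) triples: each vertex keeps its out-edges in arrival order, so every per-vertex list is already sorted by time. The names `build_out_lists` and `out_edges_from` are hypothetical helpers, and the binary search for the first edge at or after a starting time uses the standard `bisect` module.

```python
import bisect
from collections import defaultdict

def build_out_lists(edge_stream):
    """edge_stream yields (u, v, t) in non-decreasing order of t."""
    times = defaultdict(list)    # u -> times of its out-edges, ascending
    heads = defaultdict(list)    # u -> end vertices, aligned with times[u]
    for u, v, t in edge_stream:
        if times[u] and t < times[u][-1]:
            raise ValueError("stream is not chronological")
        times[u].append(t)       # appending preserves chronological order
        heads[u].append(v)
    return times, heads

def out_edges_from(times, heads, u, t_start):
    """Out-edges of u that are active at or after t_start, oldest first."""
    first = bisect.bisect_left(times[u], t_start)
    return list(zip(heads[u][first:], times[u][first:]))

stream = [("a", "b", 1), ("b", "c", 4), ("c", "a", 5), ("a", "b", 6)]
times, heads = build_out_lists(stream)
print(out_edges_from(times, heads, "a", 0))  # [('b', 1), ('b', 6)]
print(out_edges_from(times, heads, "a", 2))  # [('b', 6)]
```

Keeping the times in a separate list is a design choice made here so that the binary search can run on plain numbers.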

We now present the algorithms for DFS in a temporal graph. The algorithm for DFS-v1 is in fact rather straightforward following the description of Definition 2. To analyze the complexity and reduce the cost of some operations in the DFS, we first give the following analysis directly based on Definition 2.

If every individual operation in DFS-v1 uses time, then by Lemma 1 we only use time. However, checking all out-neighbors of such that in Step 2(a) takes on the set of temporal out-edges of , for each time is visited. Let be the number of times a vertex is visited by DFS-v1. Then, DFS-v1 takes time, since by Lemma 1. We can reduce the time complexity to by using priority queues to select neighbors to be traversed in Step 2(a). In addition, we can do a binary search to choose the edge , where . Since each temporal edge is traversed at most once, the whole binary search costs at most . Hence the total time complexity is

Next, we propose a linear-time algorithm for DFS-v2, which is in fact optimal. Importantly, we find that the same algorithm can also be applied to solve DFS-v1 to achieve the same linear time complexity.

To begin with, we first present the following important lemma.

Lemma 7

In a DFS of by Definition 3, for any vertex and for any two temporal edges and , where , if is traversed before in the DFS, then we have .

Proof.

Let be the DFS tree. Since may have multiple occurrences in , we have the following two cases when is traversed before in the DFS. If and are traversed when visiting the same occurrence of , then because Definition 3 chooses the edges to traverse in reverse chronological order in Step 2(a). Else, let and be two occurrences of , and assume that and are traversed when visiting and , respectively. Since is traversed before , must occur before in . Thus, , because should be traversed when visiting if . In both cases, we have .

Let be the set of all temporal edges going out from . Lemma 7 essentially implies that we can first order in descending order of the time at which edges in are active, and then scan the edges in that order to traverse them during an execution of DFS-v2. This descending order of is simply the reverse order how the set of temporal out-edges from each in is stored, as described at the beginning of Section 3.4. Apparently, since each edge is traversed at most once during a DFS by Lemma 1 and now we do not need to search the out-neighbors of in Step 2(a), the time complexity of DFS-v2 is .
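The scanning idea can be sketched as follows, assuming each vertex's out-edges are stored in chronological order. One pointer per vertex moves from the latest out-edge towards the earliest and is never reset, even when the vertex is visited again with an earlier visit time, so every edge is examined at most once. The edge list is hypothetical, the comparisons are assumed non-strict at the start vertex, and the sort in the first lines is only there because the demo input is not pre-sorted.

```python
import math
from collections import defaultdict

def temporal_dfs_v2_scan(edges, source, t_start):
    """DFS-v2 via one backwards-moving pointer per vertex."""
    out = defaultdict(list)
    for u, v, t in sorted(edges, key=lambda e: e[2]):   # stands in for pre-sorted input
        out[u].append((v, t))
    pointer = {u: len(lst) - 1 for u, lst in out.items()}  # next edge to examine
    visit_time = defaultdict(lambda: math.inf)
    visit_time[source] = t_start
    tree = []
    stack = [source]
    while stack:
        u = stack[-1]
        i = pointer.get(u, -1)
        if i >= 0 and out[u][i][1] >= visit_time[u]:
            v, t = out[u][i]
            pointer[u] = i - 1                 # this edge is now traversed for good
            if visit_time[v] > t:              # tree edge: new occurrence of v
                visit_time[v] = t
                tree.append((u, v, t))
                stack.append(v)
        else:
            stack.pop()                        # remaining edges are too early for now
    return tree

edges = [("a", "b", 1), ("a", "b", 6), ("b", "c", 4), ("c", "a", 5)]
print(temporal_dfs_v2_scan(edges, "a", 0))
# [('a', 'b', 6), ('a', 'b', 1), ('b', 'c', 4)]
```

In the run above, `b` is first visited at time 6 and its edge at time 4 is left untouched; when `b` is visited again at time 1, the pointer resumes at that same edge rather than starting over.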

Finally, we remark that DFS-v1 can also be processed by scanning in reverse order as for DFS-v2, and the resultant DFS tree does not violate Definition 2 and hence any related properties/notions presented in Section 3.3.

The following theorem states the complexity of DFS in a temporal graph (the proof follows directly from the discussion above).

Theorem 1

Given a temporal graph , DFS-v2 (or DFS-v1) in uses time and space.

Note that both the time and space complexity given in Theorem 1 are the lower bound because it is easy to give a temporal graph for which edges are traversed and vertices are visited, and we need space to keep the graph in memory for random vertex/edge access.

4 Breadth-First Search

In this section, we define breadth-first search (BFS) for temporal graphs. We also discuss properties of BFS in temporal graphs and present an efficient algorithm for temporal BFS.

4.1 Challenges of BFS

BFS in a non-temporal graph starts from a chosen source vertex $s$ and visits all of $s$’s neighbors, and then from each neighbor $v$ visits the unvisited neighbors of $v$, and so on until all reachable vertices are visited. However, BFS in a temporal graph is much more complicated due to the presence of temporal information.

BFS and DFS in a temporal graph share many similar challenges that are discussed in Section 3.1. They both need to follow the time constraint stated in Definition 1 and require multiple occurrences of a vertex in the traversal tree in order to retain critical information. In addition, temporal BFS also poses its own challenges.

One issue unique to temporal BFS is that the path that gives the smallest number of hops may not be the path that reaches from the source to the target at the earliest time, i.e., a short path may take longer time to traverse. For example, in Figure 4(a), the shortest path from to has only 1 hop but reaches at time , while a longer path has 2 hops but reaches at an earlier time (such information is important for travel planning).

Figure 4(a) also reveals another problem that has been significantly complicated with the addition of temporal information. When we start from and finish the first level of BFS (i.e., we have visited , and ), we have at the second level that can be visited from and . If we do not visit again because has been visited at the first level, then we cannot reach since visits at time and so it cannot go from to after since the edge is active at . If we visit again, then we can only consider to visit from because the time to reach from is at time , which is also after when the edge is active.

4.2 Definitions of BFS

To define meaningful and useful BFS for temporal graphs, we need to consider all the issues identified in Section 4.1.

Definition 8 (Temporal BFS and BFS Tree)

Given a temporal graph and a starting time , a BFS in starting at is defined as follows:

1. Initialize , , and for all , where denotes the number of hops (or the path distance for un-weighted graphs) from to and denotes the predecessor of during the BFS; and initialize an empty queue .

2. Select a source vertex , set , , , and push into .

3. While is not empty, do:

(a) Pop from ;

(b) Let be the set of temporal edges going from to , where each edge has not been traversed before and .

For each out-neighbor of , where , do:

i. Let be the edge in where .

ii. If is not in (whether has been visited or not): traverse , and if , then visit and push into .

iii. Else:

A. If there exists in such that : traverse , and if , then visit and update and in .

B. Else (i.e., ): traverse , and if , then visit and push into .

The BFS constructs a BFS tree, , which is rooted at , where the set of vertices in is defined as is visited in the BFS and , and the set of edges in is defined as in and .

Definition 8 addresses the issues identified in Section 4.1. In addition, we also address the problem of multiple occurrences of a vertex at the same level of the BFS tree; for example, as discussed in Section 4.1, at the second level can be visited from and , and thus two occurrences of will be created at the second level. For BFS, such multiple occurrences at the same level are of little practical use and may keep much duplicated information.

Figure 4(b) shows the BFS tree of the graph in Figure 4(a). Note that there are still multiple occurrences of a vertex in the BFS tree, but these occurrences are necessary to keep essential information. For example, at level 1 in the BFS tree in Figure 4(b) keeps the shortest path from to , while at level 2 is necessary to report the shortest path from to and also keep the earliest time can reach (i.e., at time ).

Thus, in Definition 8, we keep updated during the BFS process so that its final value records the earliest time from to reach this particular occurrence of in hops. It is important to record this earliest time because a later time may miss some paths that start at an earlier time, and hence we may miss the shortest path. For example, goes to through at time instead of via at level 1.

When traversing an edge , if is in , then either was visited at the same level as but later than (i.e., ) or was visited at the current level from another vertex at the same level as (i.e., ). If is not in , then either was visited at a level earlier than or was visited at the same level as but earlier than or has never been visited, in either case we need to create another occurrence of in the BFS tree if since it may be crucial to some paths.

We can verify from the BFS tree that the distance of every vertex from the source vertex is correctly recorded and that the time constraint is respected along all the paths.

4.3 Notions and Properties Related to BFS

Next we present some important properties and notions related to temporal BFS.

Let be the number of temporal edges in a temporal graph that are active at or after . The following lemmas analyze the bound on the size of the BFS tree.

Lemma 8

A BFS of traverses only edges in and each edge in is traversed at most once.

Lemma 9

Let be the BFS tree of . Then, has at most vertices and edges.

The proof of Lemma 8 follows directly from Definition 8, while the proof of Lemma 9 follows directly from Definition 8 and Lemma 8 and the fact that we assign for a vertex only when we traverse an edge.

The following lemma states that for each in , there is at most one occurrence of at each level of the BFS tree. This property will be used to prove some other lemmas.

Lemma 10

Let be the BFS tree of , and be the set of all occurrences of in . For any , .

The following lemma is related to temporal graph reachability.

Lemma 11

Let be the set of distinct vertices (i.e., multiple occurrences of a vertex are considered as a single ) in the BFS tree, rooted at , of a temporal graph . Let be the set of vertices in that are reachable from . Then, .

The proof of Lemma 10 follows directly from Definition 8, while the proof of Lemma 11 is similar to that of Lemma 6.

The following definition and lemma are related to the distance between two vertices in a temporal graph.

Definition 9 (Temporal Graph Distance)

Let be a temporal graph. The shortest temporal path distance in from a vertex to another vertex , denoted by , is the minimum number of hops in a path from to in such that traversing the path starting from to follows the time constraint defined in Definition 1.
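The distance in Definition 9 can be illustrated with a small sketch. It is not the BFS algorithm of Definition 8; it is a plain level-by-level relaxation that keeps, for each hop count, the earliest time at which each vertex can be reached. The sketch assumes that a temporal edge is a triple `(u, v, t)` and that the time constraint of Definition 1 requires edge times along a path to be non-decreasing and not earlier than the starting time. Both assumptions, as well as the name `temporal_distances`, are illustrative and not taken from the text.

```python
from math import inf


def temporal_distances(edges, source, t_start):
    """Minimum hop count from `source` to every temporally reachable vertex.

    Assumptions (not from the paper): each edge is a (u, v, t) triple, and a
    path is valid if its edge times are non-decreasing and >= t_start.
    """
    vertices = {source}
    for u, v, _ in edges:
        vertices.update((u, v))

    dist = {source: 0}
    # arrival[v] = earliest time v is reached using exactly `hops - 1` edges
    arrival = {source: t_start}

    # A shortest temporal path never needs to revisit a vertex, so at most
    # len(vertices) - 1 levels have to be examined.
    for hops in range(1, len(vertices)):
        nxt = {}
        for u, v, t in edges:
            if u in arrival and t >= arrival[u] and t < nxt.get(v, inf):
                nxt[v] = t
        if not nxt:
            break
        for v in nxt:
            dist.setdefault(v, hops)  # keep the first (smallest) level only
        arrival = nxt
    return dist


edges = [("s", "a", 5), ("s", "b", 1), ("b", "a", 2), ("a", "c", 3)]
print(temporal_distances(edges, "s", 0))
```

In this toy graph, `c` is three hops away even though `a` is a direct neighbor of `s`: the edge into `c` is active at time 3, which can only be used if `a` is reached via `b` at time 2 rather than directly at time 5. This mirrors why a later occurrence of a vertex with an earlier time has to be kept.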

Lemma 12

Let be the BFS tree, rooted at , of . Let be the first occurrence of in . Then, .

Proof.

BFS is executed level by level, where is actually the level number of in . Assume that and is the corresponding path in ; we want to prove that is at level of . First, cannot occur before level with , since otherwise there would be another path with a shorter distance than that of , which contradicts our assumption. Thus, cannot occur before level and we need to prove that occurs at level , i.e., is at level . According to Definition 8, must be at level 1 and cannot be at level 1 with (otherwise ); hence must be at level 2 after traversing . Applying the same argument recursively, we can conclude that the first occurrence of is at level . Thus, we have .

The following lemma specifies the relationship between temporal information and distance information captured in a BFS of a temporal graph.

Lemma 13

Let be the BFS tree, rooted at , of . Let be the set of all occurrences of in . Then, for each , is the earliest time that can reach in hops in starting from time .

Proof.

Suppose is the earliest time that can reach in hops starting from time , via the path in , where , and is the shortest path by which can reach at time (starting from ). Let be the latest occurrence of in such that . We want to prove . First, (otherwise is not the earliest time that can reach in hops). Next, we prove . Similar to the proof of Lemma 12, we have at level 1 and cannot be at level 1 with , hence is at level 2 with . A recursive analysis shows that occurs at level with , and this is the latest occurrence of within hops since there is at most one occurrence of at level by Lemma 10. Thus, is the earliest time that can reach in hops.

4.4 Algorithms and Complexity of BFS

The algorithm for temporal BFS follows directly from the description in Definition 8. We first prove a few lemmas, which also serve to analyze the algorithm and its complexity.

Lemma 14

In a BFS of , at most records are pushed into the queue in Definition 8.

Proof.

Since a record is pushed into only if an edge is traversed, the proof follows from Lemma 8.

Lemma 15

There are at most two records involving the same vertex in at any particular time during a BFS of .

Proof.

According to the principle of BFS in Definition 8, at any time the vertices in belong to at most two levels, i.e., we have either or for any in , for some positive integer . When considering whether to push a new record of into in Step 3(b)iii, we have two cases that compare with of an existing record of in : (1) , in which case we only update the existing record in ; or (2) , in which case we push a new record for into with . If Case (1) is executed, no new record is created. If Case (2) is executed, let the existing record in be and the new record be . If is visited again, then there now exists and hence Case (1) will be executed. Thus, no new record of at level will be pushed into , and when we process level , must have been popped from according to the principle of BFS.

Lemmas 8 and 9 show that at most edges are traversed in the BFS and at most vertices and edges are created. Then, Lemma 14 shows that at most records are pushed into and popped from , while Lemma 15 shows that updating a record in takes time since we check at most two records for each (and there are at most updates according to Lemma 14).

There is one more operation that we have not considered, namely computing for each and finding from to traverse next, which can be costly if implemented directly. To avoid computing and searching for , we can use the same sorted described in Section 3.4, where is the set of all temporal edges going out from . Since Definition 8 does not specify an order by which the out-neighbors of should be traversed, we can scan the edges in in the same order as for DFS-v1 to select the next edge to traverse. Since each edge is traversed at most once according to Lemma 8, traversing edges in the sorted does not violate the definition of BFS. Thus, we have the linear overall complexity as stated by the following theorem.

Theorem 2

Given a temporal graph , processing BFS in uses time and space.

Similar to DFS, both the time and space complexity given in Theorem 2 match the lower bound for temporal BFS.

5 Temporal Graph Traversals for Answering Path Queries

Both DFS and BFS in a non-temporal graph have important applications [7]. As the first study (to the best of our knowledge) on temporal graph traversals, we would like to show that our definitions of temporal DFS and BFS also have important applications.

Some of the notions and properties related to temporal DFS and BFS presented in Sections 3.3 and 4.3 can be readily used in applications such as the detection of cycles and answering temporal graph reachability queries. Furthermore, these applications are themselves fundamental concepts/tools for studying graphs, and therefore each of them has many other applications. For example, temporal graph reachability can be naturally applied to study connected components in a temporal graph.

Due to space limitations, we focus on an important set of applications: temporal path queries, such as foremost paths, fastest paths, and shortest temporal paths. We also emphasize that these paths, like shortest paths in a non-temporal graph, can in turn be applied to develop many other useful applications (e.g., temporal graph clustering, temporal centrality computation, etc.).

We first define some common notations used in the definitions of the various types of temporal paths. Given a temporal graph , two vertices and , and time , let be the set of all paths in from to and each path in implies that can reach in starting at time from . Formally, let , then is in if , , , and can reach in . We define , , and .
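To make this kind of path notation concrete, the following sketch checks whether a sequence of edges is a valid temporal path from a given starting time and, if so, reports three per-path measures. The edge representation `(u, v, t)`, the non-decreasing time constraint, the function name `path_measures`, and the particular measures (time of the last edge, time elapsed between the first and last edge, number of hops) are all assumptions made for illustration; they are not a reconstruction of the definitions above.

```python
def path_measures(path, t_start):
    """Return (end_time, duration, hops) for a temporal path, or None if invalid.

    Assumptions (not from the paper): `path` is a list of (u, v, t) edges,
    consecutive edges must share an endpoint, and edge times must be
    non-decreasing and not earlier than t_start.
    """
    if not path:
        return None
    prev_v, prev_t = path[0][0], t_start
    for u, v, t in path:
        # Each edge must continue from the previous vertex and must not be
        # active before the time at which that vertex was reached.
        if u != prev_v or t < prev_t:
            return None
        prev_v, prev_t = v, t
    first_t, last_t = path[0][2], path[-1][2]
    return last_t, last_t - first_t, len(path)


print(path_measures([("s", "b", 1), ("b", "a", 2), ("a", "c", 3)], 0))
print(path_measures([("s", "a", 5), ("a", "c", 3)], 0))
```

The second call returns `None` because the edge into `c` is active at time 3, before `a` is reached at time 5.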

5.1 Foremost Paths

We first define foremost path.

Definition 10 (Foremost Path)

A path is a foremost path if for every path , . The problem of single-source foremost paths is to find the foremost path from a source vertex , starting from time , to every other vertex in .

Intuitively, a foremost path is the path from , starting from time , that reaches at the earliest possible time. Applications of foremost paths include travel planning for which one wants to know the earliest time to reach a destination if departing at time . Figure 1 shows two paths from to starting at , of which is a foremost path while is not.
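As a point of comparison for this intuition, earliest arrival times can also be computed without a traversal tree. The sketch below is a Dijkstra-style label-setting procedure with a priority queue, not the DFS/BFS-based method developed in this paper. It assumes that edges are `(u, v, t)` triples and that a path is valid when its edge times are non-decreasing and not earlier than the starting time; the function name `foremost_arrival_times` and the convention that the source "arrives" at the starting time are likewise illustrative.

```python
import heapq
from collections import defaultdict


def foremost_arrival_times(edges, source, t_start):
    """Earliest arrival time at every vertex reachable from `source`.

    Assumptions (not from the paper): each edge is a (u, v, t) triple, and an
    edge can be taken if its time is >= the arrival time at its tail vertex.
    """
    out = defaultdict(list)
    for u, v, t in edges:
        out[u].append((t, v))

    best = {source: t_start}       # best known arrival time per vertex
    heap = [(t_start, source)]
    while heap:
        arrived, u = heapq.heappop(heap)
        if arrived > best[u]:
            continue               # stale queue entry, a better time is known
        for t, v in out[u]:
            # Taking edge (u, v, t) means arriving at v at time t.
            if t >= arrived and t < best.get(v, float("inf")):
                best[v] = t
                heapq.heappush(heap, (t, v))
    return best


edges = [("s", "a", 5), ("s", "b", 1), ("b", "a", 2), ("a", "c", 3)]
print(foremost_arrival_times(edges, "s", 0))
```

Because an edge never yields an arrival time earlier than the time at its tail vertex, the usual label-setting argument applies: once a vertex is popped with its best time, that time is final. In the toy graph, `a` is reached at time 2 via `b` rather than at time 5 directly, which is what makes `c` reachable at all.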

Next, we show how temporal DFS and BFS can be applied to compute foremost paths.

Theorem 3

Let be the DFS tree (constructed by DFS-v1) or the BFS tree, rooted at a source vertex , of a temporal graph . Let be the set of all occurrences of a vertex in , and let be the occurrence of such that . For each in , if is in , then the path from to in is a foremost path from to in ; if is not in , then the foremost path from to does not exist in .

Proof.

If is not in , then the foremost path from to