# Taming the Expressiveness and Programmability of Graph Analytical Queries

## Abstract

Graph databases have enjoyed a boom in the last decade, and graph queries have accordingly gained much attention from both academia and industry. We focus on analytical queries in this paper. While analyzing existing domain-specific languages (DSLs) for analytical queries from the perspectives of completeness, expressiveness and programmability, we find that no existing work achieves a satisfactory coverage of all three. Motivated by this, we propose the FLASH DSL, which is named after its three primitive operators Filter, LocAl and PuSH. We prove that FLASH is Turing complete (completeness), and show that it achieves both good expressiveness and programmability for analytical queries. We provide an implementation of FLASH based on code generation, and compare it with native C++ code and existing DSLs using representative queries. The experimental results demonstrate FLASH's expressiveness and its capability of programming complex algorithms that achieve satisfactory runtime.

Keywords: Graph Queries, Domain-Specific Language, Graph Database

## 1 Introduction

The last decade has witnessed the proliferation of graph databases, in which entities are modelled as vertices and the relationships among them are modelled as edges. Graph databases emerge with the growing need to express and analyze the interconnections of entities. Examples of graph databases are social networks, where people are vertices and their friendships form edges; web graphs, where web pages are vertices and hyperlinks serve as edges; and protein-protein interaction networks, where proteins are vertices and their interactions become edges, to name just a few.

Correspondingly, graph queries have gained much attention with the growth of graph databases. In [7], the authors surveyed two categories of graph queries, namely pattern matching queries and navigational queries. These two kinds of queries have reached a mature stage, backed by many theoretical efforts [10, 60] and industrial products including SPARQL [52], Cypher [21], PGQL [57], GCore [6] and Gremlin [43]. However, there exists a third category that has been recognized but not yet fully studied [7], namely analytical queries, and it is the main focus of this work. Analytical queries are related to conducting machine learning/data mining (MLDM) tasks over graphs. Examples of analytical queries are connected components (CC), single-source shortest path (SSSP), PageRank (PR), Core Decomposition (CD), Triangle Listing (TL) and Graph Coloring (GC).

State of the Art. It is as yet unclear how to abstract the core features of analytical queries [7]. A typical solution is to define a graph computation model, for which graph engines provide users with programming interfaces to implement the query. Examples of such practice include Pregel’s vertex-centric model [37], PowerGraph’s GAS model [22], and Gremlin’s graph computation step [43]. However, this requires mastering a lot of domain knowledge, which is often challenging for non-expert users due to the complexity of graph analysis. To address this issue, two directions of study have emerged in the development of domain-specific languages (DSLs) for graph analytical queries. The first direction follows high-level programming languages in general, while encapsulating some graph-specific operations such as BFS/DFS to ease the programming. The DSL is then translated back to the high-level programming language using code generation techniques. Green-Marl [28] is a representative work of this direction. The other direction attempts to extend declarative database query languages in order to leverage their existing user bases. TigerGraph’s GSQL [17] is such a work that extends SQL for graph analysis.

Important Perspectives. In this paper, we consider three perspectives to evaluate a DSL for graph analysis, namely Completeness, Expressiveness and Programmability. We explain these perspectives and argue that all of them are important for a DSL in the following.

• Completeness: More specifically, Turing completeness. A language (computing system) is Turing complete if it can simulate a single-head Turing machine, and if it can do so, it can be programmed to solve any computable problem [56]. In this sense, completeness is seemingly the most significant feature of a DSL.

• Expressiveness: Expressiveness refers to the succinctness of the language when expressing a query, which can be measured using the Logical Lines of Code (LLoCs) [39]. Expressiveness is important for non-expert users to quickly get started with graph queries. Another reason to value expressiveness is that system-level optimization can often be leveraged to translate the code into a relatively efficient runtime.

• Programmability: Programmability stands for the capability of programming a certain query in different ways, typically for performance considerations. Programmability is important for expert users to tune the implementation of a certain query. Programmability complements system-level optimization by further offering query-specific tuning.

Motivations. We now evaluate the state-of-the-art DSLs, Green-Marl [28] and GSQL [17]. Both languages demonstrate satisfying programmability. However, in terms of completeness, neither of them has a proven completeness guarantee, to the best of our knowledge. In terms of expressiveness, Green-Marl in principle follows C-style programming, which may not be intuitive for non-expert users. In addition, the authors of [31] are also concerned that writing such code for every graph analysis could quickly become very messy for end users. GSQL is a SQL-like language, and borrowing the SQL syntax is a double-edged sword. On the one hand, it facilitates programming for users who are already familiar with SQL. On the other hand, it must follow the SQL programming structure, which adds extra expressions. For example, the official GSQL code for CC occupies 26 LLoCs, among which only around 12 lines are effective code. In comparison, Green-Marl’s CC code [9] also takes 26 LLoCs, while a majority of it is effective.

In summary, there is no DSL for graph analytical queries that achieves a satisfactory coverage of the perspectives of completeness, expressiveness and programmability.

Our Contributions. In response to this gap, we propose the FLASH DSL (named after the three primitive operators Filter, LocAl and PuSH) in this paper. We formally prove that FLASH is Turing complete (Section 3.4). We argue in the following that FLASH achieves the best-so-far performance for both expressiveness and programmability. Regarding expressiveness, FLASH can express many widely-used analytical queries much more succinctly than Green-Marl and GSQL (Table 2). As an example, the FLASH code for connected components in Listing 1 uses only 10 LLoCs, compared to 26 LLoCs for both Green-Marl and GSQL. Regarding programmability, we show that FLASH’s programmability is mainly a result of its flexible operator chaining and the semantics of implicit edges, which allow each vertex to communicate with all vertices in the graph. As evidence, we manage to implement the optimized connected components (CC-opt) algorithm proposed in [47, 41] using FLASH, and its correctness is carefully verified. We have further used FLASH to express over 50 advanced graph algorithms [9] in the literature, including iteration-optimized minimum spanning tree [41], triangle listing [15], $k$-core decomposition [34], butterfly counting [58] and graph coloring [63], to name just a few.

We design FLASH following a BFS style to favor a parallel/distributed implementation for large graph processing. Listing 1 gives a taste of the functional programming style of FLASH. We adopt the functional programming design as it can be more naturally incorporated into distributed dataflow architectures (e.g. [1, 4, 33, 38, 64]).

In this paper, we mainly focus on the semantics of FLASH, and leave the details of its syntax and distributed implementation to future work (Section 8). In summary, we make the following contributions:

1. We propose the FLASH query language for analytical queries over graph data. We define the FLASH operators and the control flow, which together form the FLASH machine. We prove that the FLASH machine is Turing complete.

2. We simulate the GAS model (a symbolic graph programming model for analytical queries) using FLASH, which immediately indicates that FLASH can express many widely used analytical queries. We exemplify FLASH’s code on some representative analytical queries to show its expressiveness.

3. We demonstrate FLASH’s strong programmability by formally validating its implementation of the optimized connected components algorithm [41].

4. We implement FLASH based on code generation that supports parallelism via OpenMP. We conduct extensive experiments to demonstrate the potential of FLASH compared to existing works and native C++ code.

## 2 Preliminary

Runtime Property Graph Model. FLASH is defined over the runtime property graph model, an extension of the property graph model in [42], denoted as $G$, which consists of the following components:

• $V$ and $E$ denote the ordered sets of vertices and edges, respectively.

• A total function maps each edge $e \in E$ to its source vertex and target vertex. We use $e[0]$ and $e[1]$ to denote the source and destination vertices of $e$. We support both directed and undirected edges. If $e$ is a directed edge, it goes from $e[0]$ to $e[1]$; otherwise, the source and destination vertices are relative concepts, which means that $e[1]$ is the destination relative to $e[0]$ as the source, and vice versa. We assume directed graphs by default unless otherwise specified. Note that we allow multiple edges between each pair of vertices, as in [21].

• A total function maps each vertex and each edge to a label drawn from a finite set of labels.

• $\kappa$ is a partial function that specifies properties for the vertices and edges, where the key of a property is a non-empty string and the value can be of any accepted data type. We further divide the properties into static properties and runtime properties, and require that the key of a runtime property start with the prefix “@”.

###### Remark 2.1.

A static property is a property natively carried by the graph, which is read-only during the execution of a FLASH program. The data type of a static property is also declared in advance. One can thus interpret all static properties of the vertices and edges as the schema of the graph. A runtime property is one that is created by a FLASH program at runtime and is invalidated after the execution. Users are required to explicitly declare the runtime properties with their data types in FLASH code.

Throughout this paper, we use a lower-case character to represent a value, a capital character to represent a collection, and a capital character with a hat for a collection of collections. Given a vertex $v$, we use $N^+_v$ (resp. $E^+_v$) and $N^-_v$ (resp. $E^-_v$) to denote its outgoing and incoming neighbors (resp. edges). Note that we have $N^+_v = N^-_v$ (resp. $E^+_v = E^-_v$) for undirected graphs, and we often simply write $N_v$ and $E_v$. Given a set of vertices $A$, we denote $N^+_A = \bigcup_{v \in A} N^+_v$, and $N^-_A$, $E^+_A$ and $E^-_A$ are analogously defined. We can also apply a filter function to each of the above neighbor notations to keep only the neighbors (or edges) that satisfy it.

FLASH Programming. In FLASH code, we use the popular dot-notation syntax of modern object-oriented programming languages. For example, the property $\kappa(v, age)$ is programmed as “v.age”. Table 1 lists the reserved properties for vertices/edges to be used in FLASH code throughout this paper.

FLASH follows the syntax of functional programming, and we use the expression

 |x, y, ...| ...

in code to denote a lambda function that takes x, y, etc. as input variables and returns the value of the last expression of the function body. When the context is clear and there is only one input variable in the lambda function, we can further omit the prefix “|x|” and replace each occurrence of “x” in the function body with the placeholder “_”. For example, we write “|v| v.age < 25” to denote a lambda function that accepts a vertex as input and returns a boolean value (such a lambda function is often called a predicate). When the context is clear, we can write “_.age < 25” for short.

## 3 The FLASH Machine and Query Language

In this section, we first define the FLASH operators and control flow. Then we define the FLASH machine, followed by a proof of its Turing completeness.

### 3.1 The Abstraction of FLASH Operators

A FLASH operator defines what operation to conduct on the graph data; it takes one vertex set as input and returns one vertex set. Conceptually, such a vertex set is a subset of the graph vertices $V$, and we only use the term when it serves as the input or output of a FLASH operator. In this paper, we always use $V$ to denote the vertex set that contains all vertices of the graph. In general, a FLASH operator can be abstracted as

$$O : V_{in} \mapsto V_{out}, \quad \text{s.t. } \delta : V_{out} \mapsto @\Psi_{V_{out}},$$

where $V_{in}$ and $V_{out}$ are the input and output vertex sets, and $\delta$ denotes a side effect upon a collection of runtime properties $@\Psi$ over $V_{out}$, which means that the operator may either create or update the property $@\psi$ for $\forall v \in V_{out}$ and $\forall @\psi \in @\Psi$. In particular, there is no side effect if $@\Psi = \emptyset$. We denote the elements of an operator $O$ as $O.V_{in}$, $O.V_{out}$, $O.\delta$ and $O.@\Psi$. The prefix “$O.$” can be omitted if the context is clear.

Given an ordered sequence of FLASH operators $\{O_1, O_2, \ldots, O_n\}$, we highlight two features of the FLASH operators, namely procedural side effects and operator chaining.

Procedural side effects. The side effects of FLASH’s operators are finalized in a procedural way. Specifically, the side effect $O_i.\delta$, $\forall 1 \le i < n$, must be finalized before proceeding to $O_{i+1}$, and $O_j$ for $\forall i < j \le n$ must be aware of the changes caused by $O_i.\delta$. Given $O_i$ and $O_j$ where $i < j$, if $O_i.@\Psi \cap O_j.@\Psi \neq \emptyset$, then for each $@\psi \in O_i.@\Psi \cap O_j.@\Psi$, the side effect conducted by $O_j$ overwrites that of $O_i$. The feature of procedural side effects may make FLASH less pure as a functional programming language [40]. However, it is not new to consider side effects in functional programming [32], and it is also a common practice in graph programming [17, 43].

Operator Chaining. If we further have $O_i.V_{out} = O_{i+1}.V_{in}$, $\forall 1 \le i < n$, the operators can be chained as a composite operator

$$O := \bigodot_i O_i, \quad \text{s.t. } \delta : \Big(\bigcup_i O_i.V_{out}\Big) \mapsto \Big(\bigcup_i O_i.@\Psi\Big)_{(\bigcup_i O_i.V_{out})},$$

where $O.V_{in} = O_1.V_{in}$ and $O.V_{out} = O_n.V_{out}$, and the side effect of the composite operator is subject to the side effects of all individual operators. As implied by procedural side effects, a FLASH program always finalizes the latest side effect among multiple ones conducted on the same runtime properties in operator chaining.

### 3.2 The Primitive Operators

We now define the three most primitive operators of FLASH, namely Filter, Local and Push, after which FLASH is named. We also show examples of how to program with them. Each operator extends the abstraction defined in Section 3.1, and thus implicitly carries the elements $V_{in}$ and $V_{out}$.

Filter Operator. Given the input $A$, we define the Filter operator to filter $A$ and form the output $B$ as

$$\text{Filter}(V_{in} : A,\ V_{out} : B,\ \delta : \aleph,\ @\Psi : \emptyset,\ f : V_{in} \mapsto \{\top, \bot\}), \tag{1}$$

where $f$ is a predicate that defines the filter criteria, and we have $B = \{v \mid v \in A \wedge f(v) = \top\}$. In the following, we will omit the items of an operator that are not important or are clear from the context. For the Filter operator, we often just keep the filter predicate $f$.

###### Example 3.1.

Given an input $A$, we want to select the vertices that are married and under the age of 25 (suppose the “age” and “married” properties are present), and assign the output to $B$. In code, with an explicit lambda function, the Filter operator reads “B = A.Filter(|v| v.age < 25 && v.married);”.

In the above example, A is the input and B holds the output. For each operator in the following, as it is clear that we are accessing all vertices of the input $V_{in}$, we will omit the “$\forall v \in V_{in}$” term in the mathematical expression. The code can be shortened using the placeholder “_” as

B = A.Filter(_.age < 25 && _.married);
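To make the semantics of Filter concrete, the following is a minimal Python sketch rather than FLASH code. The vertex table `vertices` and the helper `filter_op` are hypothetical names introduced only for illustration; the sketch assumes that a vertex set can be modelled as a Python set of vertex ids.

```python
# Toy vertex table: vertex id -> static properties (hypothetical data).
vertices = {
    1: {"age": 23, "married": True},
    2: {"age": 31, "married": True},
    3: {"age": 19, "married": False},
}

def filter_op(v_in, pred):
    """Return the subset of v_in whose vertices satisfy pred (no side effect)."""
    return {v for v in v_in if pred(vertices[v])}

A = set(vertices)
# Counterpart of: B = A.Filter(_.age < 25 && _.married);
B = filter_op(A, lambda p: p["age"] < 25 and p["married"])
```

Note that the input `A` is left untouched, which mirrors the absence of a side effect in the Filter operator.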

Local Operator. The Local operator is used to conduct a side effect on the input $V_{in}$, and is defined as

$$\text{Local}(\delta : V_{in} \mapsto @\Psi_{V_{in}}). \tag{2}$$

The Local operator directly returns the input after applying the side effect $\delta$, namely $V_{out} = V_{in}$. Given $@\psi \in @\Psi$, we typically write the side effect as $\kappa(v, @\psi) \leftarrow \lambda(v)$ for each $v \in V_{in}$, where the runtime property $@\psi$ will be created if not present, and then be assigned/updated with the value computed by some function $\lambda$.

###### Example 3.2.

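When computing single-source shortest paths (SSSP), it is routine to initialize the distance value of all vertices with some large value. We use the Local operator for this purpose; in code, with a runtime property “@d” and the constant INT_MAX as the large value, this reads “V.Local(_.@d = INT_MAX);”.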
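The behaviour of Local, including the rule that a later side effect on the same runtime property overwrites an earlier one, can be sketched in Python as follows. This is an illustration under assumptions, not the FLASH implementation: the `runtime` table and the helper `local_op` are hypothetical, and infinity stands in for the "large value".

```python
INF = float("inf")

# Runtime properties are kept apart from static data (hypothetical layout).
runtime = {v: {} for v in (1, 2, 3)}

def local_op(v_in, key, fn):
    """Create or update runtime property `key` on every input vertex."""
    for v in v_in:
        runtime[v][key] = fn(v)
    return v_in  # Local returns its input unchanged

V = set(runtime)
out = local_op(V, "@d", lambda v: INF)   # initialise all distances
# A later operator on the same property overwrites the earlier side effect.
local_op({1}, "@d", lambda v: 0)
```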

Push Operator. Graph computation often needs to navigate from the current vertices to other vertices, typically via the outgoing and incoming edges. Data may be exchanged along with the process. The Push operator is defined for this purpose as

$$\text{Push}(\gamma : V_{in} \mapsto \overrightarrow{E}_{V_{in}},\ \delta : \|\gamma(v) \mapsto \oplus(@\Psi)_{\gamma(v)}\|). \tag{3}$$

Here $\gamma$ is called the route function of the Push operator, which maps each input vertex $v$ to a collection of “edges” that are bounded by $v$. The edges can be either explicit or implicit. Explicit edges are actual edges in the graph, namely $E^+_v$, $E^-_v$ and $E^+_v \cup E^-_v$. Given an arbitrary vertex subset $S \subseteq V$, the semantics of implicit edges allow the Push operator to interpret $(v, v')$ for $\forall v' \in S$ as an “edge”, regardless of whether or not it is an actual edge. As a result, we have the following options for $\gamma(v)$: (1) a collection of edges, namely $E^+_v$, $E^-_v$ and $E^+_v \cup E^-_v$; (2) a collection of vertices, namely $N^+_v$, $N^-_v$, $N^+_v \cup N^-_v$, and a property that maintains a collection of vertices. Let $R_v$ denote the next vertices to navigate to, which is determined by $\gamma$ as

$$R_v = \begin{cases} N^+_v, & \text{if } \gamma(v) = E^+_v, \\ N^-_v, & \text{if } \gamma(v) = E^-_v, \\ N^+_v \cup N^-_v, & \text{if } \gamma(v) = E^+_v \cup E^-_v, \\ \gamma(v), & \text{otherwise.} \end{cases}$$

We then have $V_{out} = \bigcup_{v \in V_{in}} R_v$.

We next discuss how the Push operator performs data delivery via the side effect $\delta$. Given an input vertex $v$ and the edges $\gamma(v)$ as the communication channels, there are two locations to maintain the delivered data: (1) the edge; (2) the other end of the edge. For case (1), the edge must be an explicit edge, where the data can be maintained as a runtime property of the edge. For case (2), the data from $v$ is expected to be delivered to every vertex of $R_v$. It is possible that one vertex belongs to $R_v$ of multiple vertices $v$. We configure two tools in the Push operator to cope with this issue. Firstly, a commutative and associative aggregate function $\oplus$ is offered to aggregate multiple side effects that will be finalized on the same vertex. Typical aggregate functions include collecting a list/set, retrieving the minimum/maximum value (when the data can be partially ordered) using min/max, and computing a summation using sum. Secondly, a barrier is placed to block the procedure until all vertices in $V_{in}$ have finalized their side effects.

###### Remark 3.1.

Graph systems [22, 37, 43] typically only allow exchanging data via explicit edges. The capability of delivering data via implicit edges greatly enhances FLASH’s programmability when implementing complex algorithms, which will be further explored in Section 4.2.

###### Example 3.3.

In the task of single-source shortest path (SSSP), each edge has a weight property “w” and each vertex holds a runtime property “@d” that records its current distance to the source vertex. In each step, every vertex sends a new distance value to its outgoing neighbors using the Push operator; in code, this reads “B = A.Push(_.outE(|e| e.dst.@d = min(_.@d + e.w)));”.

Here, the lambda function specified after outE in the code indicates a “foreach” operation that applies the side effect to each element in the collection.

In the SSSP algorithm, if all edges have unit weight, we can simply write

B = A.Push(_.out(|v| v.@d = min(_.@d + 1)));

The outgoing neighbors instead of edges are specified in the route function according to the implicit-edge semantic.

We can simply use Push to navigate among the vertices without performing data exchange as

B = A.Push(_.out);
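The interplay of the route function, the aggregate function and the barrier can be sketched in Python as below. This is a hedged illustration only: `out_edges`, `dist` and `push_min` are hypothetical names, the aggregate is fixed to min, and the sketch assumes that the aggregated value is merged with the target's current value by min as well.

```python
INF = float("inf")

# Hypothetical weighted digraph: source -> list of (target, weight).
out_edges = {1: [(2, 4), (3, 1)], 2: [(4, 1)], 3: [(2, 2)], 4: []}
dist = {1: 0, 2: INF, 3: INF, 4: INF}   # plays the role of runtime property @d

def push_min(v_in, route, value):
    """Send value(v, edge) along each routed edge and aggregate with min."""
    inbox = {}                            # target -> aggregated message
    for v in v_in:
        for edge in route(v):
            target = edge[0]
            msg = value(v, edge)
            inbox[target] = min(inbox.get(target, INF), msg)
    # Barrier: side effects are applied only after all messages are aggregated.
    for target, msg in inbox.items():
        dist[target] = min(dist[target], msg)
    return set(inbox)                     # output: the vertices reached

B = push_min({1}, lambda v: out_edges[v], lambda v, e: dist[v] + e[1])
C = push_min(B, lambda v: out_edges[v], lambda v, e: dist[v] + e[1])
```

Because min is commutative and associative, the order in which the input vertices are visited does not affect the result, which is what allows a parallel implementation to process them independently before the barrier.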

In the following, we will give definitions and prove theorems in mathematical notation for rigor, while giving examples in code for ease of reading.

### 3.3 The Control Flow

The graph database community has long observed the necessity of incorporating control flow into graph processing to better support iterative graph algorithms, and such practice has been widely adopted [22, 23, 37, 43]. This motivates us to define the control flow for FLASH.

We consider a set of FLASH operators (their order does not matter here) and a set of predicates $\beta_i$, each of which returns a true/false value from the runtime context. The control flow is a finite set $F$ of transitions between operators. A transition $(O_p, (\beta, \top), O_s) \in F$ means that the successor $O_s$ will be scheduled after $O_p$ if $\beta$ returns true, and analogously for $(O_p, (\beta, \bot), O_s)$ if $\beta$ returns false. A transition $(O_p, \aleph, O_s) \in F$ indicates the unconditional chaining of $O_p$ and $O_s$.

To complete the control flow, we further define five special operators: Input, Fin, Switch, LoopS and LoopE. Input($A$) specifies the input $A$ to the FLASH machine, and Fin indicates the termination of a FLASH program. As a result, both Input and Fin must appear in the control flow. We now detail Switch for branching control, and LoopS and LoopE for loop control.

Branching Control. We define the Switch operator to control the branching of a program. The transitions of a Switch operator occur in pairs in the control flow. That is, if $(\text{Switch}, (\beta, \top), O_t) \in F$, then there must be an $O_f$ such that $(\text{Switch}, (\beta, \bot), O_f) \in F$. In other words, the Switch operator schedules two branches of the program, one via $O_t$ and the other via $O_f$, depending on whether $\beta$ returns true or false.

Loop Control. Loop control is critical for graph algorithms. Two operators, LoopS and LoopE, are defined to enclose a loop context. We say an operator $O$ is enclosed by the loop context if there exists a cycle in the control flow from $O$ to itself. There must exist a pair of conditional transitions that leave the same operator under the same predicate, one whose successor is enclosed by the loop context and one whose successor is not. We call the former the Feedback and the latter the Exit. Within the loop context, the FLASH runtime maintains a loop counter that records how many times the current loop has been executed.

###### Example 3.4.

Given the operators $O_1, \ldots, O_6$ and the predicates $\beta_1$ and $\beta_2$, we now construct a control flow as

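$$\begin{aligned} F = \{ & (\text{Input}(V), \aleph, O_1),\ (O_1, \aleph, \text{Switch}),\ (\text{Switch}, (\beta_1, \top), O_2),\ (\text{Switch}, (\beta_1, \bot), O_3), \\ & (O_2, \aleph, O_4),\ (O_3, \aleph, O_4),\ (O_4, \aleph, \text{LoopS}),\ (\text{LoopS}, \aleph, O_5),\ (O_5, \aleph, O_6), \\ & (O_6, \aleph, \text{LoopE}),\ (\text{LoopE}, (\beta_2, \top), \text{LoopS}),\ (\text{LoopE}, (\beta_2, \bot), \text{Fin}) \} \end{aligned}$$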

Figure 1 visualizes this control flow, where the conditions are omitted for clarity. Clearly, $O_5$ and $O_6$ are the operators enclosed by the loop context. The Feedback and Exit are marked with a dashed arrow and a thick arrow, respectively.
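The control flow of this example can be read as a transition table. The Python sketch below walks such a table from Input to Fin; it is an illustration under assumptions, not part of FLASH. The dictionary encoding, the use of `None` for the unconditional marker, the predicate names `b1` and `b2`, and the choice to increment the loop counter at LoopS are all hypothetical.

```python
# Transitions of the example: (operator, condition) -> successor.
# None plays the role of the unconditional marker in the text.
F = {
    ("Input", None): "O1",
    ("O1", None): "Switch",
    ("Switch", ("b1", True)): "O2",
    ("Switch", ("b1", False)): "O3",
    ("O2", None): "O4",
    ("O3", None): "O4",
    ("O4", None): "LoopS",
    ("LoopS", None): "O5",
    ("O5", None): "O6",
    ("O6", None): "LoopE",
    ("LoopE", ("b2", True)): "LoopS",
    ("LoopE", ("b2", False)): "Fin",
}

def run(predicates):
    """Walk the control flow from Input to Fin and return the visited operators."""
    state = {"loop": 0}
    trace, op = [], "Input"
    while op != "Fin":
        trace.append(op)
        if op == "LoopS":
            state["loop"] += 1            # loop counter kept by the runtime
        if (op, None) in F:
            op = F[(op, None)]
        else:
            # Conditional operator: one predicate guards its two successors.
            name = next(c[0] for (o, c) in F if o == op and c is not None)
            op = F[(op, (name, predicates[name](state)))]
    return trace

# b1 always holds; b2 lets the loop body run twice.
trace = run({"b1": lambda s: True, "b2": lambda s: s["loop"] < 2})
```

With these predicates the Switch takes the branch through O2, and the Feedback transition is taken once before the Exit transition leads to Fin.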

### 3.4 The FLASH Machine and its Turing Completeness

We are now ready to define the FLASH machine. Given a runtime property graph $G$, a set of FLASH operators (both primitive operators and control operators), a set of boolean expressions, and the control flow $F$, we define a FLASH machine as the 4-tuple of these components. We show that a FLASH machine is Turing complete.

Turing Completeness. A Turing machine [29] consists of a finite-state controller, a tape, and a scanning head. The tape is divided into an infinite number of cells, each of which holds one symbol drawn from a finite symbol set. The scanning head, which points to one cell at any time, can move leftwards or rightwards by one cell on the tape, based on the state of the machine and the symbol of the current cell. While moving the head, a write operation may occur to overwrite the symbol of the current cell. The head can move infinitely far to both the left and the right.

Formally, a Turing machine is a 5-tuple consisting of:

• a finite set of states of the machine;

• the alphabet accepted by the machine;

• the initial state, and a finite set $F$ of finishing states;

• the state transition function $\lambda$, which maps the current state of the machine and the symbol of the current cell to a new state, a symbol to write and a direction $D \in \{L, R\}$, where $L$ and $R$ denote the left and right directions for moving the head. For instance, if the returned direction is $L$, the machine transits to the returned state, and the head moves leftwards on the tape after writing the returned symbol in the current cell.

A system of data-manipulation rules is considered to be Turing complete if it can simulate the above Turing machine. If a system is Turing complete, it can be programmed to solve any computable problem [56].

###### Theorem 3.1.

The FLASH machine is Turing complete.

###### Proof.

Consider an arbitrary Turing machine. We first construct a directed graph $G$ with an infinite number of vertices, where each vertex is identified by an integer ranging from $-\infty$ to $\infty$. The edges of the graph only connect vertices with consecutive ids, namely $(v_i, v_j) \in E$ iff $j = i + 1$. The graph thus simulates the tape, where each vertex represents a cell. Each vertex further holds two runtime properties, $@symbol$ and $@state$, to maintain the allowed symbols and states of the Turing machine. Initially, only one vertex, say $v_0$ without loss of generality, holds the initial state of the Turing machine, while all other vertices hold a finishing state in $F$.

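We then configure the following operators:

$$\begin{aligned}
O_1 &: \text{Filter}(f : \kappa(v, @state) \notin F), \\
O_2 &: \text{Local}(\delta : \text{With } \lambda((\kappa(v, @symbol), \kappa(v, @state))) = (q, s, D),\ \kappa(v, @symbol) \leftarrow q;\ \kappa(v, @state) \leftarrow s;\ \kappa(v, @to) \leftarrow \begin{cases} N^+_v, & \text{if } D = R, \\ N^-_v, & \text{if } D = L \end{cases}), \\
O_3 &: \text{Push}(\gamma : V_{in} \mapsto \{\kappa(v, @to) \mid \forall v \in V_{in}\},\ \delta : \|\kappa(v', @state) \leftarrow \min(\kappa(v, @state))\ \forall v' \in \gamma(v)\|).
\end{aligned}$$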

In the Local operator $O_2$, we create the $@to$ property to control how the Push operator navigates, in order to simulate the movement of the Turing machine. Specifically, $@to$ is assigned as $N^+_v$ (resp. $N^-_v$) to simulate the Turing machine moving the head rightwards (resp. leftwards).

We let the operator set consist of $O_1$, $O_2$ and $O_3$ together with the control operators, and define a set of predicates containing $\beta_1$. Finally, we set up the control flow as:

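$$\begin{aligned} F = \{ & (\text{Input}(V), \aleph, O_1),\ (O_1, \aleph, \text{LoopS}),\ (\text{LoopS}, \aleph, O_2),\ (O_2, \aleph, O_3),\ (O_3, \aleph, O_1), \\ & (O_1, (\beta_1, \top), \text{LoopS}),\ (O_1, (\beta_1, \bot), \text{Fin}) \}, \end{aligned}$$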

which is visualized in Figure 2.

We thus construct a FLASH machine that simulates the given Turing machine. In this way, given any Turing machine, we can always simulate it using a FLASH machine, which completes the proof. ∎
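The construction in the proof can be tried out on a small machine. The Python sketch below is an illustration under assumptions and not part of the proof: the transition table `delta`, the state names and the tape content are hypothetical, the table is keyed by (state, symbol), and the infinite graph is emulated by creating blank cells lazily. Each loop iteration mirrors the Filter, Local and Push steps described above.

```python
from collections import defaultdict

HALT = "halt"                      # the only finishing state of this toy machine
# Hypothetical transition table: (state, symbol) -> (state, symbol, direction).
# It flips bits while moving right and halts on the first blank.
delta = {
    ("flip", "0"): ("flip", "1", "R"),
    ("flip", "1"): ("flip", "0", "R"),
    ("flip", " "): (HALT, " ", "R"),
}

# Each vertex is a tape cell; unseen cells are blank and hold a finishing state.
cells = defaultdict(lambda: {"symbol": " ", "state": HALT})
for i, ch in enumerate("1011"):
    cells[i]["symbol"] = ch
cells[0]["state"] = "flip"         # vertex 0 holds the initial state

active = {0}
while True:
    # Filter: keep the vertices whose state is not a finishing state.
    active = {v for v in active if cells[v]["state"] != HALT}
    if not active:
        break
    for v in active:
        # Local: apply the transition and record where the head moves to.
        state, symbol, d = delta[(cells[v]["state"], cells[v]["symbol"])]
        cells[v]["symbol"] = symbol
        cells[v]["state"] = state
        cells[v]["to"] = v + 1 if d == "R" else v - 1
    # Push: hand the new state over to the neighbouring cell and navigate there.
    nxt = set()
    for v in active:
        to = cells[v]["to"]
        cells[to]["state"] = cells[v]["state"]
        nxt.add(to)
    active = nxt

tape = "".join(cells[i]["symbol"] for i in range(4))
```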

### 3.5 The Auxiliary Operators

We have shown that FLASH is Turing complete using the three most primitive operators Filter, Local and Push, while sometimes it can be awkward to program with only these operators. We thus introduce some auxiliary operators to further facilitate FLASH programming.

Pull Operator. The Pull operator is defined analogously to the Push operator as

 (4)

Similarly, we allow both explicit edges and implicit edges for data delivery. Instead of sending data out from the input $V_{in}$, Pull is used to retrieve data from either the edge or the other end of the edge. The output of the Pull operator remains $V_{in}$, and the side effect also applies to $V_{in}$.

The Pull operator complements the Push operator in two aspects. Firstly, sometimes we may want to gather data from other vertices without navigation. This functionality can be realized by chaining two Push operators, which is clearly more redundant and can be less efficient than using one Pull operator. Secondly, it has been observed that a Pull-style communication instead of Push can sometimes improve the performance of graph processing [25, 24].
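The contrast with Push can be sketched in Python as follows. This is only an illustration of the described behaviour under assumptions: `in_nbrs`, `val`, `gathered` and `pull_sum` are hypothetical names, and the aggregate is fixed to a summation.

```python
# Hypothetical digraph given by in-neighbour lists, plus one value per vertex.
in_nbrs = {1: [], 2: [1, 3], 3: [1], 4: [2]}
val = {1: 10, 2: 20, 3: 30, 4: 40}
gathered = {}                      # runtime property written by the side effect

def pull_sum(v_in, route, value):
    """Each input vertex gathers value(u) from its routed vertices and sums them."""
    for v in v_in:
        gathered[v] = sum(value(u) for u in route(v))
    return v_in                    # the output remains the input; no navigation

out = pull_sum({2, 4}, lambda v: in_nbrs[v], lambda u: val[u])
```

The side effect lands on the input vertices themselves, so a single operator suffices where a Push-based formulation would need to navigate away and back.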

Other Operators. Other auxiliary operators used in this paper include: the Group operator for grouping the input according to given properties; the Order operator for ordering the input by given properties; and the Output operator for materializing the given properties of the input to an output device (by default the standard output).

###### Example 3.5.

We use the following example to give a first impression of FLASH’s programming for graph analysis. We process the DBLP dataset [16] to contain the vertices (properties) of “Author(name)” and “Paper(name, venue, year)”, and edges from “Authors” to “Papers” with label “write” and from “Papers” to “Authors” with label “written_by”.

Consider the query: what is the venue (conference + year) in which a given author published the most papers? We express the query using FLASH as

1 Pair<string, int> @venue;
2 int @cnt;
3 V.Filter(_.name == query_author)
4  .Push(_.outE[|e| e.label == "write"])
5  .Local(_.@venue = (_.venue, _.year))
6  .Group(@venue, |v| v.@cnt = sum(1))
7  .Order(@cnt(DESC), 1)
8  .Output(@venue, @cnt);

We begin the code by declaring two runtime properties, “@venue” and “@cnt”, with their data types (lines 1-2). The Filter operator returns the given author’s vertex (line 3), from which Push is used to navigate to the papers the author has written via the “write” edges (line 4). Note that we write “_.outE[filter]” to apply the edge filter. Then we use the Local operator to assign “@venue” as the pair of the conference’s name and the year it was hosted (line 5), which is used as the property to group the vertices (line 6).

For interface consistency, we let the Group operator create and return runtime vertices, which only exist during the lifetime of the FLASH program. Each runtime vertex holds the data of a group (and is thus also called a grouped vertex). In this query, each grouped vertex maintains two properties, namely “@venue” and “@cnt”, where “@venue” is inherited from the grouping operation and “@cnt” is specified using the side effect in the Group operator that computes the number of vertices being grouped (line 6). To obtain the venue in which the given author published the most papers, we use the Order operator to rank these grouped vertices by the “@cnt” property in descending order and return the very first result (line 7), which is finally printed to the screen using the Output operator (line 8).

From the above example, one may find it intuitive to read and write FLASH code. Certainly, it is up to the reader to judge, and we have prepared more examples of FLASH code in a Jupyter Notebook [9] for the readers to play with FLASH.
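For readers who want to check the logic of this query outside FLASH, here is a minimal Python sketch of the same pipeline on a hypothetical miniature of the DBLP graph. The data, the names `papers`, `writes` and `top_venue`, and the use of `collections.Counter` for the grouping step are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical miniature of the DBLP graph described above.
papers = {
    "p1": {"venue": "VLDB", "year": 2019},
    "p2": {"venue": "VLDB", "year": 2019},
    "p3": {"venue": "SIGMOD", "year": 2020},
}
writes = {"alice": ["p1", "p2", "p3"], "bob": ["p3"]}   # "write" edges

def top_venue(author):
    """Return ((venue, year), count) for the venue where author published most."""
    reached = writes.get(author, [])                    # Filter + Push
    keys = [(papers[p]["venue"], papers[p]["year"]) for p in reached]  # Local
    counts = Counter(keys)                              # Group with sum(1)
    return counts.most_common(1)[0] if counts else None  # Order(DESC), top 1

result = top_venue("alice")
```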

## 4 Graph Queries

In this section, we first show that FLASH is capable of simulating the GAS model in [22]. Then we demonstrate FLASH’s expressiveness by giving examples of widely used queries. Finally, we exhibit FLASH’s programmability by optimizing representative queries using FLASH, including a complete verification of the correctness of FLASH’s implementation of an optimized connected components algorithm [41].

### 4.1 Basic Analytical Queries

GAS Model. In [22], the authors proposed the GAS model to abstract the common structure of analytical queries on graphs, which consists of three conceptual phases, namely gather, apply and scatter. The GAS model allows three user-defined functions $\lambda_g$, $\lambda_a$ and $\lambda_s$ in the above three phases to accommodate different queries.

In the gather phase, each vertex $v$ collects the data from all its incoming edges and vertices, which is formally defined as:

$$A_v \leftarrow \bigoplus_{e \in E^-_v} \big(\lambda_g(D_{e[0]}, D_e, D_{e[1]})\big),$$

where $\bigoplus$ is a commutative and associative aggregate function, and $D_x$ represents the data of the corresponding vertex or edge $x$.

After gathering all required data from the neighbors, each vertex uses the gathered data and its own data to compute some new data to update the old one. The apply phase is defined as:

$$D^*_v \leftarrow \lambda_a(A_v, D_v),$$

where $D_v$ and $D^*_v$ represent the old and new vertex data.

The scatter phase then uses the newly computed data to update the value of the outgoing edges as:

$$\forall e \in E^+_v,\quad D_e \leftarrow \lambda_s(D_{e[0]}, D_e, D_{e[1]}).$$

MLDM graph algorithms typically run the GAS phases iteratively until convergence or until a maximum number of iterations is reached.
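The three phases can be put together in a short Python sketch. This is a hedged illustration of the model as described above, not PowerGraph's API: the function `gas`, its parameters and the synchronous update order are assumptions, and the toy instance simply propagates the maximum value along the edges.

```python
def gas(vertices, edges, D, lam_g, combine, lam_a, lam_s, max_iter=20):
    """Run gather-apply-scatter rounds until no vertex data changes."""
    De = {e: None for e in edges}                 # edge data
    for _ in range(max_iter):
        new_D = {}
        for v in vertices:
            # Gather: aggregate over incoming edges e = (src, dst) with dst == v.
            acc = None
            for e in edges:
                if e[1] == v:
                    g = lam_g(D[e[0]], De[e], D[e[1]])
                    acc = g if acc is None else combine(acc, g)
            # Apply: compute the new vertex data from the gathered and old data.
            new_D[v] = lam_a(acc, D[v])
        if new_D == D:                            # converged
            break
        D = new_D
        # Scatter: refresh the data on the edges from the new vertex data.
        for e in edges:
            De[e] = lam_s(D[e[0]], De[e], D[e[1]])
    return D

edges = [(1, 2), (2, 3), (3, 1), (4, 3)]
result = gas(
    vertices=[1, 2, 3, 4],
    edges=edges,
    D={1: 5, 2: 1, 3: 7, 4: 9},
    lam_g=lambda d_src, d_e, d_dst: d_src,        # read the source's data
    combine=max,                                  # commutative and associative
    lam_a=lambda acc, old: old if acc is None else max(acc, old),
    lam_s=lambda d_src, d_e, d_dst: d_src,
)
```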

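GAS via FLASH. We now simulate the GAS model using FLASH. Observe that the Local operator corresponds to the apply phase, while the Pull and Push operators can represent the gather and scatter phases, respectively. We define:

$$\begin{aligned}
O_1 &: \text{Local}(\delta : \kappa(v, @D_v) \leftarrow \lambda_i(v)), \\
O_2 &: \text{Pull}(\gamma : V_{in} \mapsto \overrightarrow{E^-}_{V_{in}},\ \delta : \|\kappa(v, @A_v) \leftarrow \oplus(\lambda_g(\kappa(e[0], @D_v), \kappa(e, @D_e), \kappa(e[1], @D_v)))\ \forall e \in \gamma(v)\|), \\
O_3 &: \text{Local}(\delta : \kappa(v, @D^*_v) \leftarrow \lambda_a(\kappa(v, @A_v), \kappa(v, @D_v))), \\
O_4 &: \text{Filter}(f : \lambda_c(\kappa(v, @D^*_v), \kappa(v, @D_v))), \\
O_5 &: \text{Local}(\delta : \kappa(v, @D_v) \leftarrow \kappa(v, @D^*_v)), \\
O_6 &: \text{Push}(\gamma : V_{in} \mapsto \overrightarrow{E^+}_{V_{in}},\ \delta : \|\kappa(e, @D_e) \leftarrow \lambda_s(\kappa(e[0], @D_v), \kappa(e, @D_e), \kappa(e[1], @D_v))\ \forall e \in \gamma(v)\|),
\end{aligned}$$

where $\lambda_i$ is used to initialize the runtime property $@D_v$ for each vertex $v$, $\lambda_c$ helps check whether the algorithm converges by feeding the old and new data of the vertex to it in $O_4$, and $O_2$, $O_3$ and $O_6$ correspond to the gather, apply and scatter phases, respectively. Note that $O_6$ uses Push to place data on the adjacent edges, where there is no need to apply aggregation.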

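Let the operator set consist of $O_1, \ldots, O_6$ together with the control operators. Define the predicate $\beta_1$, which depends on the loop counter and the maximum number of iterations and decides whether the loop continues. Let the control flow be:

$$F = \{(\text{Input}(V), \aleph, O_1),\ (O_1, \aleph, \text{LoopS}),\ (\text{LoopS}, \aleph, O),\ (O, (\beta_1, \top), O'),\ (O, (\beta_1, \bot), \text{Fin}),\ (O', \aleph, \text{LoopS})\},$$

where $O = O_2 \odot O_3 \odot O_4$ and $O' = O_5 \odot O_6$. The resulting FLASH machine simulates the GAS model.

Expressiveness of FLASH. Showing that the FLASH machine can simulate the GAS model immediately indicates that FLASH is capable of expressing many widely-used analytical queries. In the following, we demonstrate FLASH’s expressiveness by giving examples of representative analytical queries.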

###### Example 4.1.

Weakly Connected Component (WCC). A weakly connected component is defined as a subset of vertices in which every pair of vertices is connected by an undirected path (a path regardless of the directions of the edges) in $G$. The WCC problem aims at computing a set of vertex subsets such that each subset forms a maximal weakly connected component and the subsets together cover $V$. The FLASH code for WCC is given in the following, which is intuitive to read.

    int @cc;
    int @precc;
    int @cnt;
    A = V.Local(_.@cc = _.id);
    while (A.size() > 0) {
        A = A.Push(_.@both(|v| v.@precc = min(_.@cc)))
             .Filter(_.@precc < _.@cc)
             .Local(_.@cc = _.@precc);
    }
    V.Group(@cc, |v| v.@cnt = sum(1));
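To make the dataflow of the WCC listing concrete, the following is a minimal Python sketch that mimics the Local, Push, Filter and Group steps with dictionaries. It is an illustration only: the function name `wcc` and the modelling of Push as "return the set of vertices that received a message" are assumptions made for this example, not FLASH's actual runtime.

```python
def wcc(vertices, edges):
    """Label propagation for weakly connected components."""
    # Undirected adjacency, mirroring the @both direction in the listing.
    both = {v: set() for v in vertices}
    for s, t in edges:
        both[s].add(t)
        both[t].add(s)

    cc = {v: v for v in vertices}  # Local: @cc = id
    active = set(vertices)
    while active:
        # Push: active vertices send @cc to neighbours, aggregated with min.
        precc = {}
        for u in active:
            for v in both[u]:
                precc[v] = min(precc.get(v, cc[u]), cc[u])
        # Filter: keep only receivers whose label would decrease.
        active = {v for v, c in precc.items() if c < cc[v]}
        # Local: commit the smaller label.
        for v in active:
            cc[v] = precc[v]

    # Group: count the vertices that share each component label.
    sizes = {}
    for v in vertices:
        sizes[cc[v]] = sizes.get(cc[v], 0) + 1
    return cc, sizes


components, component_sizes = wcc([0, 1, 2, 3, 4], [(0, 1), (2, 1), (3, 4)])
```

Only vertices whose label shrank stay active, so the frontier empties once every vertex holds the smallest id of its component.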

Single-Source Shortest Path (SSSP). Consider a weighted graph where every edge has a static weight property that is a non-negative integer. Given two vertices that are connected by a path in the graph, we define the length of the path as the sum of the weights of the edges the path goes through. Given a source vertex, the SSSP problem is to compute the shortest path from the source to all other vertices. FLASH's code for SSSP (without materializing the path) is shown below.

    int @dist;
    int @min_d;
    A = V.Local(_.@dist = INT_MAX)
         .Filter(_.id == src_id)
         .Local(_.@dist = 0);
    while (A.size() > 0) {
        A = A.Push(_.outE(|e| e.dst.@min_d = min(_.@dist + e.weight)))
             .Filter(_.@dist > _.@min_d)
             .Local(_.@dist = _.@min_d);
    }
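The same frontier-based relaxation can be sketched in Python. This is a hedged illustration rather than FLASH's implementation: the name `sssp`, the triple format `(src, dst, weight)` for edges, and the use of `math.inf` in place of `INT_MAX` are assumptions made for the example.

```python
import math


def sssp(vertices, weighted_edges, src):
    """Frontier-based shortest distances from src over non-negative weights."""
    out = {v: [] for v in vertices}
    for s, t, w in weighted_edges:
        out[s].append((t, w))

    dist = {v: math.inf for v in vertices}  # stands in for INT_MAX
    dist[src] = 0
    active = {src}
    while active:
        # Push: each active vertex offers dist + weight to its out-neighbours,
        # and each receiver keeps the minimum offer.
        min_d = {}
        for u in active:
            for v, w in out[u]:
                cand = dist[u] + w
                if cand < min_d.get(v, math.inf):
                    min_d[v] = cand
        # Filter: keep receivers whose distance improves.
        active = {v for v, d in min_d.items() if dist[v] > d}
        # Local: commit the improved distance.
        for v in active:
            dist[v] = min_d[v]
    return dist


distances = sssp(
    [0, 1, 2, 3, 4],
    [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)],
    src=0,
)
```

Vertex 4 is unreachable in this example and therefore keeps its initial infinite distance.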

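PageRank (PR). PR was invented by Sergey Brin and Larry Page [11] to measure the authority of a webpage, and it is the most important ranking metric used by the Google search engine. The algorithm runs iteratively: in each round, the PageRank value of a page is computed as $(1-d)/|V|$ plus $d$ times the sum, over the pages that link to it, of each such page's previous-round PageRank value divided by its number of outgoing links, where $d$ is a damping factor (0.85 in the code below). We give one implementation of PR using FLASH in the following.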

    float @pr;
    float @tmp;
    float @new;
    A = V.Local(_.@pr = 1 / V.size());
    while (A.size() > 0) {
        A = V.Local(_.@tmp = _.@pr / _.out.size())
             .Push(_.out(|v| v.@tmp = sum(_.@tmp)))
             .Local(_.@new = 0.15 / V.size() + 0.85 * _.@tmp)
             .Filter(abs(_.@new - _.@pr) > 1e-10)
             .Local(_.@pr = _.@new);
    }
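As a cross-check of the update rule, here is a small Python sketch of the same iteration. It is an illustration under assumptions, not the FLASH runtime: the name `pagerank`, the `max_iters` safety cap, and the requirement that every vertex has at least one outgoing edge (otherwise the division by the out-degree fails) are choices made for this example. Unlike the listing, it recomputes the value of every vertex in each round.

```python
def pagerank(vertices, edges, damping=0.85, tol=1e-10, max_iters=1000):
    """Iterate PageRank until no value changes by more than tol.

    Assumes every vertex has at least one outgoing edge.
    """
    n = len(vertices)
    out = {v: [] for v in vertices}
    for s, t in edges:
        out[s].append(t)

    pr = {v: 1.0 / n for v in vertices}
    for _ in range(max_iters):
        # Each vertex splits its current value evenly among its out-neighbours.
        acc = {v: 0.0 for v in vertices}
        for u in vertices:
            share = pr[u] / len(out[u])
            for v in out[u]:
                acc[v] += share
        new = {v: (1.0 - damping) / n + damping * acc[v] for v in vertices}
        # Commit only the values that moved by more than the tolerance.
        changed = [v for v in vertices if abs(new[v] - pr[v]) > tol]
        for v in changed:
            pr[v] = new[v]
        if not changed:
            break
    return pr


cycle_ranks = pagerank([0, 1, 2], [(0, 1), (1, 2), (2, 0)])
hub_ranks = pagerank([0, 1, 2], [(0, 1), (0, 2), (1, 0), (2, 0)])
```

On the directed 3-cycle all pages are symmetric, so the initial uniform values are already a fixed point; in the second graph, vertex 0 receives links from both other pages and ends up with the largest value.
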
###### Remark 4.1.

The basic implementations of the representative analytical queries in Example 4.1 use only 8-11 LLoCs, which is evidently more succinct than both Green-Marl and GSQL (Table 2). We are aware that fewer lines of code are not always a good thing, at least not with regard to readability. However, as shown in Example 4.1, we believe that FLASH's code is quite intuitive (often self-explanatory) to read.

### 4.2 Optimized Analytical Queries

We show FLASH's strong programmability using optimized implementations of the analytical queries. For ease of presentation, we use the undirected graph model in this subsection, and we refer to the outgoing neighbors (resp. outgoing edges) as the adjacent neighbors (resp. adjacent edges) in the undirected graph.

The reasons for FLASH's strong programmability are mainly twofold. The first reason is that FLASH's operators can be flexibly chained, which makes it easy to combine the operators in different ways to realise the same algorithm. As an example, we show in the following how we can flexibly combine and operators to improve the CC (WCC becomes CC in an undirected graph) algorithm in Example 4.1.

The observation behind the implementation in Listing 2 is that when few vertices need to update their CC (line 5), a operation can often incur less communication than . Such an optimization is more often applied at the system level [25, 24], but it is nearly impossible to tune the system to achieve the best configuration between and for all queries. In comparison, FLASH allows tuning the best tradeoff for each query directly in the code, which is often easier because only a single query has to be considered.

The second reason is that the semantics of implicit edges allows the vertices to communicate with any vertices in the graph. To show this, we continue to optimize the CC algorithm using the CC-opt algorithm proposed in [41, 47]. We are interested in this algorithm because: (1) it is an algorithm with a theoretical guarantee, proposed to handle big graphs in the parallel/distributed context, which fits FLASH's design principle; (2) it involves data exchange via implicit edges, which reflects the programmability of FLASH.

CC-opt via FLASH. The CC-opt algorithm in [41] utilizes a parent pointer for each vertex to maintain a tree (forest) structure. Each rooted tree represents a connected component of the graph. The algorithm runs iteratively. In each iteration, the algorithm uses StarDetection to identify stars (trees of depth one), in which every vertex points to one common rooted vertex (the rooted vertex is self-pointing); then it merges stars that are connected via some edges using two StarHooking operations; finally it applies PointerJumping, which redirects each vertex's parent pointer to its grandparent. When the algorithm terminates, there must be isolated stars, each standing for one connected component with the rooted vertex as its id. We show CC-opt's implementation in Listing 3. The algorithm is divided into 5 components, namely ForestInit, StarDetection, conditional StarHooking, unconditional StarHooking and PointerJumping. Based on the expected output of each component, we explain the implementation and show its correctness in the following.
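Two of these components, StarDetection and PointerJumping, can be illustrated with a short Python sketch over a parent-pointer dictionary. This follows the common textbook formulation of the two steps and is not taken from Listing 3; the function names and the dictionary representation are assumptions made for the example, and the two StarHooking steps are omitted.

```python
def pointer_jumping(parent):
    """One synchronous round: every vertex adopts its grandparent as parent."""
    return {v: parent[parent[v]] for v in parent}


def star_detection(parent):
    """Return, for every vertex, whether it belongs to a tree of depth one."""
    star = {v: True for v in parent}
    for v in parent:
        grandparent = parent[parent[v]]
        if parent[v] != grandparent:
            # v sits at depth two or more, so neither v nor the
            # grandparent it reaches can be part of a star.
            star[v] = False
            star[grandparent] = False
    # A vertex is in a star only if both it and its parent survived the marking.
    return {v: star[v] and star[parent[v]] for v in parent}


# Vertex 0 roots a tree of depth two (2 -> 1 -> 0); vertex 3 roots a star (4 -> 3).
forest = {0: 0, 1: 0, 2: 1, 3: 3, 4: 3}
before = star_detection(forest)
jumped = pointer_jumping(forest)
after = star_detection(jumped)
```

After one round of pointer jumping, vertex 2 points directly at root 0, so both trees are stars and each root can serve as the id of its component.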