A Survey of Parallel Sequential Pattern Mining
Abstract
With the growing popularity of shared resources, large volumes of complex data of different types are collected automatically. Traditional data mining algorithms generally face problems and challenges including huge memory cost, low processing speed, and inadequate hard disk space. Sequential pattern mining (SPM) is used in a wide variety of real-life applications. However, it is more complex and challenging than frequent itemset mining, and it also suffers from the above challenges when handling large-scale data. To solve these problems, mining sequential patterns in a parallel computing environment has emerged as an important issue with many applications. In this paper, an in-depth survey of the current status of parallel sequential pattern mining (PSPM) is provided, including a detailed categorization of traditional serial SPM approaches and state-of-the-art parallel SPM. We review the related work of PSPM in detail, including partition-based algorithms for PSPM, Apriori-based PSPM, pattern-growth-based PSPM, and hybrid algorithms for PSPM, and provide a deep description (i.e., characteristics, advantages, and disadvantages) of each parallel approach to PSPM. Some advanced topics for PSPM and the related open-source software are further reviewed in detail. Finally, we summarize some challenges and opportunities of PSPM in the big data era.
1 Introduction
With the rapid development of information technology and data collection, Knowledge Discovery in Databases (KDD), which is also called data mining, provides a powerful capability to discover meaningful and useful information from different types of complex data [1, 2, 3, 4, 5]. KDD has numerous real-life applications and is crucial to some of the most fundamental tasks, such as frequent itemset and association rule mining [3, 4, 6], sequential pattern mining [7, 5, 8], clustering [9, 10], classification [11], and outlier detection [12].
Most traditional data mining algorithms are designed to run on a single computer (node), and are thus called single-node techniques. They can discover various kinds of patterns from various types of databases. At the same time, in recent decades, data mining has been studied extensively and applied widely [1, 3, 9, 13, 14, 4, 15]. These techniques perform well on small datasets; however, due to the limited memory capacity and computation capability of a single node, they become inefficient over big data. The memory requirements for handling the complete set of desired results increase quickly, and the computational cost can be expensive on a single machine. All the aforementioned methods are serial. When handling large-scale data, they are fundamentally inappropriate for many reasons, including the huge amount of data, bandwidth limitations, the fact that larger inputs demand parallel processing, and privacy concerns. Unfortunately, parallelization of the mining process is a difficult task. Developing more adaptable and flexible mining frameworks is therefore an important issue.
Parallel data mining (PDM) [16, 17] is a type of computing architecture in which several processors execute or process an application. Research and development work in the area of parallel data mining concerns the study and definition of parallel mining architectures, methods, and tools for the extraction of novel, useful, and implicit patterns from databases using a high-performance architecture. In some cases, PDM distributes the mining computation, e.g., over multiple cores or over more than one node. When data mining tools are implemented on high-performance parallel computers, they can analyze massive databases in parallel within a reasonable time. In general, parallel computation allows for solving larger problems and executing applications that are parallel and distributed in nature. Parallel computing is becoming increasingly common and is used to accelerate the processing of massive amounts of data. So far, some parallelized Apriori-like algorithms have been implemented with the MapReduce framework [18] and have achieved a certain speedup compared with single-node methods. However, some previous studies [18, 19, 20] show that the MapReduce framework is not suitable for frequent itemset mining algorithms like the Apriori algorithm, which involves intensive iterated computation.
tid  time  Customer ID  Event (products)
t1  03-06-2017 10:05:30  1  milk, bread
t2  03-06-2017 10:09:12  2  oatmeal
t3  03-06-2017 10:21:45  3  milk, bread, butter
t4  03-06-2017 11:40:00  1  milk, cereal, cheese
t5  03-06-2017 12:55:30  3  cereal, oatmeal
t6  03-06-2017 14:38:58  2  bread, milk, cheese, butter, cereal
…  …  …  …
t7  05-06-2017 15:30:00  1  bread, oatmeal, butter
In the field of data mining, pattern mining has become an important task for a wide range of real-world applications. Pattern mining consists of discovering interesting, useful, and unexpected patterns in databases. This field of research emerged in the 1990s with the Apriori algorithm [1], which was proposed by Agrawal and Srikant. It is designed for finding frequent itemsets and then extracting the association rules. Note that frequent itemsets are groups of items (symbols) frequently appearing together in a database of customer transactions. For example, the pattern/products {bread, wine, cheese} can be used to characterize the shopping behavior of customers for market basket analysis. Some pattern mining techniques, such as frequent itemset mining (FIM) [1, 4] and association rule mining (ARM) [1], are aimed at analyzing data where the sequential ordering of events is not taken into account. However, sequence-based databases, which contain embedded timestamp information of events, are commonly seen in many real-world applications. A sequence in a sequence database is an ordered list of items, and sequences are everywhere in our daily life. Typical examples include consumers' shopping behavior, Web access logs, DNA sequences in bioinformatics, and so on. We illustrate sequential data with one case of market basket analysis in detail below. For example, Table I is a simple retail store's database which contains customers' shopping records, including transaction ID (tid), occurrence time, customer ID, and event (w.r.t. purchased products). For each customer, his/her total purchase behavior can be considered as a sequence consisting of events that happened at different times. As shown in Table II, which is translated from Table I, the customer with sequence ID SID = 1 has three records that occurred sequentially.
SID  tid  Event (products)

1  t1  milk, bread
1  t4  milk, cereal, cheese
1  t7  bread, oatmeal, butter
2  t2  oatmeal
2  t6  bread, milk, cheese, butter, cereal
3  t3  milk, bread, butter
3  t5  cereal, oatmeal
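The translation from Table I to Table II (grouping each customer's transactions and ordering them by time) can be sketched as follows. This is a minimal illustration; the record literals and the function name `build_sequence_db` are our own illustrative choices, not from the original paper.

```python
from collections import defaultdict

# Hypothetical transaction records mirroring Table I:
# (customer_id, timestamp, itemset). Timestamps are illustrative only.
transactions = [
    (1, "2017-06-03 10:05:30", {"milk", "bread"}),
    (2, "2017-06-03 10:09:12", {"oatmeal"}),
    (3, "2017-06-03 10:21:45", {"milk", "bread", "butter"}),
    (1, "2017-06-03 11:40:00", {"milk", "cereal", "cheese"}),
    (3, "2017-06-03 12:55:30", {"cereal", "oatmeal"}),
    (2, "2017-06-03 14:38:58", {"bread", "milk", "cheese", "butter", "cereal"}),
    (1, "2017-06-05 15:30:00", {"bread", "oatmeal", "butter"}),
]

def build_sequence_db(records):
    """Group transactions by customer and order them by time,
    yielding one sequence (an ordered list of itemsets) per customer."""
    by_customer = defaultdict(list)
    for cid, ts, items in records:
        by_customer[cid].append((ts, items))
    return {cid: [items for _, items in sorted(events, key=lambda e: e[0])]
            for cid, events in by_customer.items()}

sdb = build_sequence_db(transactions)
```

Each value of `sdb` is one row of Table II: an ordered list of the customer's purchase events.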
To address this issue, another concept of pattern mining named sequential pattern mining (SPM) was first introduced by Agrawal and Srikant in 1995 [2]. SPM aims at discovering and analyzing statistically relevant subsequences from sequences of events or items with time constraints. More precisely, it consists of discovering interesting subsequences in a set of sequences, where the interestingness of a subsequence can be measured in terms of various criteria, such as its occurrence frequency, length, and profit. Given a user-specified threshold, termed the minimum support (denoted minsup), a sequence is said to be a frequent sequence or a sequential pattern if it occurs no less than minsup times in the processed database. For instance, consider the sequence database of Table III, and assume that the user sets minsup = 3 to find all subsequences appearing in at least three sequences; two of the discovered patterns are frequent, with supports of 4 and 3 sequences, respectively. As a fundamental task of pattern mining, sequential pattern mining is used in a wide variety of real-life applications, such as market basket analysis [2], Web mining [21], bioinformatics [22], classification [23], and finding copy-paste bugs in large-scale software code [24].
In the past two decades, pattern mining (i.e., FIM, ARM, and SPM) has been extensively studied and successfully applied in many fields [6, 8]. Meanwhile, to meet the demand for large-scale and high-performance computing, as mentioned before, parallel data mining has received considerable attention over the past decades [16, 17, 25, 26], including parallel frequent itemset mining (PFIM) [27, 19, 28], parallel association rule mining (PARM) [20, 29], parallel sequential pattern mining (PSPM) [30, 31], parallel clustering [32, 33], and so on. Among them, the sequence-based task, PSPM, is crucial in a wide range of real-world applications. For example, DNA sequence analysis in bioinformatics [22] requires truly parallel computation over massive large-scale DNA sequences. On one hand, serial sequential pattern mining is computationally intensive. Although a significant amount of developments have been reported, there is still much room for improvement in its parallel implementation. On the other hand, many applications are time-critical and involve huge volumes of sequential data. Such applications demand more mining power than serial algorithms can provide [34]. Thus, scalable solutions to the sequential pattern mining task are quite important. Up to now, the problem of parallel sequential pattern mining has attracted a lot of attention [30, 31]. Fig. 1 shows the number of published papers from 2007 to 2017, where Group 1 denotes the search keywords "Sequential Pattern Mining" / "Sequence Mining", and Group 2 denotes the search keywords "Parallel Sequential Pattern Mining" / "Parallel Sequence Mining" / MapReduce "sequential pattern". Fig. 1 outlines a rapid surge of interest in SPM and PSPM in recent years. These results clearly show the trend that mining sequential patterns in a parallel computing environment has emerged as an important issue.
Up to now, several related surveys of serial sequential pattern mining (abbreviated as SPM) have been published [35, 36, 37]. Recently, Fournier-Viger et al. published an up-to-date survey of the current status of serial SPM [8]. There is, unfortunately, no survey of parallel sequential pattern mining methods yet. In 2014, Anastasiu et al. published an article on big data frequent pattern mining [38], which describes some serial and parallel approaches to scalable frequent itemset mining, sequence mining, and frequent graph mining. However, that article is not specific to parallel sequential pattern mining; only 12 algorithms for PSPM are presented in one section. Many concepts, technologies, big data platforms and tools, domain applications, and state-of-the-art works of PSPM are not considered or reviewed there. Yet, after more than ten years of theoretical development of big data, a significant number of new technologies and applications have appeared in this area. Therefore, we believe that now is a good time to summarize the current status (i.e., the big data platforms and tools, the new technologies, applications, and advanced topics) of PSPM.
The question then posed is: how can one best summarize the related studies on the various types of parallel sequential pattern mining in parallel computing environments and build a general taxonomy of them? The methods summarized in this article are not only suited to parallel computing [39, 21, 40], but are also good references for other related work, such as data mining [13], big data technologies [41, 42, 25, 26, 43], distributed systems [24, 44], and database management [45]. Thus, this paper aims at reviewing the current status of parallel sequential pattern mining (PSPM) and providing in-depth descriptions of a number of representative algorithms. The main contributions of this paper are described below.

We review some related works on serial sequential pattern mining in several categories, including Apriori-based techniques for SPM, pattern-growth techniques for SPM, algorithms for SPM with early-pruning, and constraint-based algorithms for SPM.

We review the state-of-the-art works on parallel sequential pattern mining in distributed environments in recent years. This is a high-level survey of parallel techniques for sequential pattern mining in several aspects, including partition-based algorithms for PSPM, Apriori-based PSPM, pattern-growth-based PSPM, and hybrid algorithms for PSPM. Not only the representative algorithms but also the latest advances are reviewed comprehensively. We also point out the key idea, advantages, and disadvantages of each PSPM approach.

Some advanced topics for PSPM are reviewed, including PSPM with quantitative / weighted / utility constraints, PSPM from uncertain data and stream data, and hardware accelerators for PSPM (i.e., CPU and GPU approaches for parallel sequential pattern mining).

We further review some well-known open-source software in the fields of serial SPM and parallel SPM, hoping that these resources may reduce barriers for future research.

Finally, some challenges and opportunities of parallel sequential pattern mining task for future research are briefly summarized and discussed.
The rest of this paper is organized as follows. Section 2 first introduces the key characteristics of some parallelism methods, then reviews some important features of parallel computing platforms and distributed systems, and summarizes some important parallel computing platforms and tools. The definitions of major concepts used in the literature are introduced in Section 3. Besides, Section 3 also briefly reviews the state-of-the-art research efforts on SPM. Section 4 highlights and discusses the state-of-the-art research efforts on different approaches, under four taxonomies, for parallel sequential pattern mining. Some advanced topics for PSPM and the related open-source software are further reviewed in Section 5 and Section 6, respectively. Section 7 briefly summarizes some challenges and opportunities in parallel sequential pattern mining. Finally, some conclusions are briefly given in Section 8.
2 Parallel Computing Framework
2.1 Methods of Parallelism
Zaki has pointed out several reasons for designing parallel algorithms for the data mining task [16, 46]. First, a single processor suffers from memory and CPU speed limitations, while multiple processors can alleviate them. Second, large amounts of data are often already available in parallel sub-databases, or they are stored/distributed at multiple sites [16, 46]. Therefore, it is prohibitively expensive either to collect all data into one site or to perform the serial mining process on a single computer. For these reasons, tools that can help with parallelization are urgently needed. According to the previous studies [16, 46], there are some well-known methods of parallelism: task parallelism (divide-and-conquer strategy, task queue), data parallelism (record-based and attribute-based), and hybrid data/task parallelism. Characteristics of these parallelism methods are described below.
(1) Task parallelism The computation is divided into a set of tasks (e.g., via a divide-and-conquer strategy or a shared task queue), and different tasks are assigned to different processors, each of which may operate on the entire dataset.
(2) Data parallelism The database is partitioned among the processors (record-based or attribute-based partitioning), and each processor runs the same computation on its local partition of the data.
(3) Hybrid data/task parallelism It is a parallel pipeline of tasks, where each task might be data-parallel, and the output of one task is the input to the next one. This means each stage of the pipeline can run in parallel, with overall throughput limited by the longest-latency stage in the pipeline.
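The pipeline style of hybrid parallelism can be sketched with Python's standard `threading` and `queue` modules. The two stages, their functions, and the `None` end-of-stream sentinel below are our own illustrative choices: each stage runs in its own thread, so stage 2 can process item i while stage 1 is already working on item i+1.

```python
import threading
import queue

def pipeline_stage(fn, inq, outq):
    """Consume items from inq, apply fn, and push results to outq.
    A None item signals end-of-stream and is propagated downstream."""
    while True:
        item = inq.get()
        if item is None:
            outq.put(None)
            break
        outq.put(fn(item))

# Two-stage pipeline: stage 1 squares each item, stage 2 adds one.
q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
t1 = threading.Thread(target=pipeline_stage, args=(lambda x: x * x, q0, q1))
t2 = threading.Thread(target=pipeline_stage, args=(lambda x: x + 1, q1, q2))
t1.start()
t2.start()
for x in [1, 2, 3]:
    q0.put(x)
q0.put(None)          # signal end of input
t1.join()
t2.join()

results = []
while True:
    r = q2.get()
    if r is None:
        break
    results.append(r)
```

FIFO queues and one thread per stage preserve item order, so `results` holds the pipeline output in input order.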
Nowadays, parallel computing is key to improving the performance of computer programs. Currently, data mining approaches that achieve parallelization and distribution can be classified in terms of five main components [16, 46]:

Distributed versus shared memory systems;

Data versus task parallelism;

Static versus dynamic load balancing;

Complete versus heuristic candidate generation;

Horizontal versus vertical data layout.
2.2 Difference between Parallel Computing Platform and Distributed System
Unlike traditional centralized systems, a distributed system is defined as one in which components of networked computers communicate and coordinate their actions only by passing messages [24, 44]. In other words, a distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system. According to [24, 44], there are two defining aspects of distributed systems: (1) independent computing elements and (2) a single system w.r.t. middleware. There are some important features of distributed systems, including: 1) concurrency, where multiple processes and threads execute concurrently and share resources; 2) no global clock, where program coordination depends on message passing; and 3) independent failure, wherein the failure of some processes cannot be known by other processes [24, 44]. There are many types of distributed systems, such as grids [47, 48], peer-to-peer (P2P) systems [49], ad-hoc networks [50], cloud computing systems [51], and online social network systems [52]. According to a study by Almasi et al., parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently.
Both distributed mining and parallel mining can speed up the data mining process. However, they have several differences. In 1999, Zaki pointed out that there are some differences between parallel and distributed association rule mining [16]. For achieving the parallelism of a traditional mining method, the main challenges include: i) synchronization and communication minimization, ii) workload balancing, iii) finding good data layout and data decomposition, and iv) disk I/O minimization (which is especially important for association rule mining) [16]. In general, the parallel paradigm uses hardware or software distributed shared-memory systems; thus, a parallel method can be distributed or shared-memory. In distributed data mining (DDM), the same or different mining algorithms are often applied to local data. A DDM algorithm communicates among multiple processing units and then combines the local patterns into the desired global patterns. In contrast, a parallel data mining (PDM) algorithm applies a global parallel algorithm to the entire dataset to discover the desired knowledge, which is the same as that found by the traditional serial data mining algorithm. The accuracy or efficiency of DDM depends on data partitioning, task scheduling, and global synthesizing, and is thus somewhat difficult to predict [54]. PDM accuracy may be more easily guaranteed than that of DDM.
2.3 Parallel Data Mining Platforms and Tools
In order to achieve high-performance data mining, some parallel data mining platforms and tools have been developed in recent years. Many researchers provide different techniques to work in parallel or distributed environments, such as multi-core computing [55], grid computing [47, 48], graphics processing units (GPUs) [39, 21], cloud computing [51], Hadoop, MapReduce [18], and Spark [61]. Some representative platforms and tools are described below.
(1) Multi-core computing. A multi-core processor is a single computing component with two or more independent processing units (cores), which can execute multiple instruction streams in parallel [55]. Multi-threaded data mining algorithms can exploit this shared-memory parallelism to speed up the mining process.
(2) Grid computing. Generally speaking, grid computing is the coordinated use of a large number of widely distributed, loosely coupled computing resources to achieve a common goal [47, 48]. A grid can be viewed as a distributed system in which nodes are often heterogeneous and geographically dispersed, and mining tasks are scheduled across them.
(3) Reconfigurable computing with FPGA. A field-programmable gate array (FPGA) is, in essence, a computer chip that can rewire itself for a given task [57]. A new computing paradigm, reconfigurable computing, has received wide attention and is the focus of extensive research. Reconfigurable computing uses an FPGA as a co-processor to a general-purpose computer. Thus, high-performance computing can be achieved by using an FPGA. Reconfigurable computing with FPGAs combines efficiency with flexibility.
(4) Graphics processing units (GPUs). A GPU is a specialized many-core processor originally designed for graphics rendering. By running thousands of lightweight threads in parallel, GPUs can also be used for general-purpose computation (GPGPU) and can substantially accelerate data-intensive mining tasks [39, 21].
(5) MapReduce [18]. The MapReduce model was originally proposed by Google. It is a popular parallel programming model that not only simplifies the programming complexity of parallel or distributed computing, but also achieves better processing and analytics for massive datasets. The MapReduce model [18] stores data as ⟨key, value⟩ pairs and then runs in rounds, which are composed of three consecutive phases: map, shuffle, and reduce. The input and output formats of MapReduce can be expressed as follows: i) Mapper: ⟨k1, v1⟩ → list⟨k2, v2⟩; ii) Reducer: ⟨k2, list(v2)⟩ → list⟨k3, v3⟩. By using the MapReduce programming model, the mining process can be effectively enhanced. Moreover, the I/O cost caused by scanning massive and high-dimensional datasets can be significantly reduced. Up to now, there have been many implementations of MapReduce, and it has become a common, powerful tool for parallel or distributed computing under various environments.
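The three phases can be mimicked in a few lines of plain Python. This is a single-process sketch of the programming model only, not a distributed implementation; the function names and the item-counting example are ours.

```python
from collections import defaultdict
from itertools import chain

def map_phase(transactions):
    """Mapper: each transaction emits one (item, 1) pair per item."""
    return chain.from_iterable(((item, 1) for item in t) for t in transactions)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reduce_phase(groups):
    """Reducer: sum the counts for each item."""
    return {k: sum(vs) for k, vs in groups.items()}

# Count item occurrences across transactions, MapReduce-style.
transactions = [["a", "b"], ["a", "c"], ["a", "b"]]
counts = reduce_phase(shuffle_phase(map_phase(transactions)))
```

In a real MapReduce job, the shuffle is performed by the framework across machines; here it is simulated by a dictionary of lists.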
(6) Spark [61]. Spark is a distributed computing framework developed by Berkeley AMPLab. It is a well-known open-source cluster computing system that provides a flexible interface for programming entire clusters with implicit data parallelism and fault tolerance. Due to its in-memory parallel execution model, all data are loaded into memory in Spark. This fundamental feature means Spark can significantly speed up computation for iterative algorithms such as the Apriori algorithm and some other machine learning algorithms. In general, Spark [61] can be one to two orders of magnitude faster than MapReduce [18].
(7) PerfExplorer [62]. PerfExplorer is a computing framework for large-scale parallel data mining. It consists of two main components, the client and the server. PerfExplorer has a flexible interface and provides a common, reusable foundation for multiple analysis tasks, including clustering, dimensionality reduction, correlation analysis, and comparative analysis [62].
3 State-of-the-art Serial Sequential Pattern Mining
Most of the current parallel sequential pattern mining algorithms are extended from the traditional serial SPM algorithms. Therefore, in this section, some preliminaries and the problem statement related to frequent itemset mining (FIM) and sequential pattern mining (SPM) are first introduced. Then, some related works of sequential pattern mining are reviewed in several categories. Finally, the status of SPM is described and briefly summarized.
3.1 Frequent Itemset Mining vs. Sequential Pattern Mining
First, let us introduce some preliminaries and the problem statement related to frequent itemset mining (FIM) from transactional databases. Let I = {i1, i2, …, im} be a set of items; an itemset X = {i1, i2, …, ik} ⊆ I is a subset of I with k items. The length or size of X, denoted |X|, is the number of items in X. Given a transactional database D = {T1, T2, …, Tn}, where each transaction Tq is generally identified by a transaction identifier (TID), |D| denotes the total number of transactions in D. The support of X in D, denoted sup(X), is the proportion of transactions that contain X, i.e., sup(X) = |{T ∈ D : X ⊆ T}| / |D|. An itemset X is said to be a frequent itemset (FI) in a database if its support is no less than the user-defined minimum support threshold (minsup). Therefore, the problem of frequent itemset mining is to discover all itemsets whose support is no less than the defined minimum support threshold, that is, FI = {X ⊆ I : sup(X) ≥ minsup} [1]. As one of the most important mining tasks for a wide range of real-world applications, frequent itemset mining (FIM) or association rule mining (ARM) has attracted a lot of attention [1, 3, 13, 14, 4, 6, 17].
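The definition above can be sketched by brute force: enumerate every candidate itemset over the database's items and keep those whose support reaches minsup. This is illustrative only (the enumeration is exponential and feasible only for tiny databases), and the data and names are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(db, minsup):
    """Return {itemset: support} for all itemsets whose support
    (fraction of transactions containing them) is >= minsup."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            sup = sum(1 for t in db if set(cand) <= t) / len(db)
            if sup >= minsup:
                result[cand] = sup
    return result

# A toy transactional database D with four transactions.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
fis = frequent_itemsets(db, minsup=0.5)
```

With minsup = 0.5, every single item (support 3/4) and every pair (support 2/4) is frequent, while {a, b, c} (support 1/4) is not. Practical FIM algorithms such as Apriori avoid this exhaustive enumeration by pruning.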
In 1995, Srikant and Agrawal first extended the concept of the frequent itemset mining model to handle sequences [2]. They first introduced and formulated the sequential pattern mining (SPM) problem. Different from FIM, SPM aims at discovering frequent subsequences as interesting patterns in a sequence database, which contains embedded timestamp information of events. Following the definitions of Mannila et al. [63], the associated notations and descriptions of SPM are given below. The input is a sequence database SDB = {(SID1, s1), (SID2, s2), …, (SIDn, sn)}, where SID is a unique sequence identifier (also called sequence-id) and s is an input sequence. Thus, SDB is a set of tuples (SID, s), and each sequence has the following fields: SID, event-time, and the items present in the event. For example, consider the first sequence in Table III; it represents five transactions made by a customer at a retail store. Each single letter represents an item, and items between curly brackets represent an itemset. Simply speaking, a sequence is a list of temporally ordered itemsets (also called events).
SID  Sequence 

1  
2  
3  
4  
5 
A sequence s = ⟨I1, I2, …, Ip⟩ is called a k-sequence if it contains k items, that is, k = |I1| + |I2| + … + |Ip|. For example, a sequence such as ⟨{a}, {b, c}, {d, e, f}, {g}⟩, which contains seven items in total, is a 7-sequence. A sequence sa = ⟨A1, A2, …, An⟩ is called a subsequence of another sequence sb = ⟨B1, B2, …, Bm⟩, and sb is called a super-sequence of sa, if there exist integers 1 ≤ j1 < j2 < … < jn ≤ m such that A1 ⊆ Bj1, A2 ⊆ Bj2, …, An ⊆ Bjn; this is denoted as sa ⊑ sb. A tuple (SID, s) is said to contain a sequence sa if s is a super-sequence of sa, i.e., sa ⊑ s. The support of a sequence sa in a sequence database SDB is the number of tuples that contain sa, and it is denoted as sup(sa). The size of a sequence database SDB, denoted |SDB|, corresponds to the total number of sequences (i.e., the number of customers).
In general, sequential pattern mining is more complex and challenging than frequent itemset mining, for several important reasons. Firstly, since SPM must account for the temporal ordering of events, which is absent in FIM, SPM has a potentially huge set of candidate sequences [64]: the total number of possible sequential patterns is much larger than in FIM. If all sequences contain exactly one event, SPM reduces to FIM. For example, considering only 100 distinct items for the SPM problem, there are 100 sequences of length 1, while the number of sequences of length 2 is 100 × 100 + (100 × 99) / 2 = 14,950, and the number of sequences of length 3 is larger still, growing rapidly with the length. Secondly, there are further difficulties in mining longer sequential patterns [64], as required, for instance, in DNA sequence analysis or stock sequence analytics/prediction.
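The containment and support definitions above can be sketched directly, assuming sequences are represented as Python lists of sets; a greedy earliest-match scan suffices to decide containment, and the toy database is our own illustration.

```python
def contains(seq, sub):
    """Return True if sub is a subsequence of seq: there must exist
    indices j1 < j2 < ... such that sub[i] is a subset of seq[ji].
    Greedily matching each element of sub at its earliest possible
    position is sufficient to decide containment."""
    j = 0
    for itemset in seq:
        if j < len(sub) and sub[j] <= itemset:
            j += 1
    return j == len(sub)

def support(sdb, sub):
    """Number of sequences in the sequence database sdb that contain sub."""
    return sum(1 for seq in sdb if contains(seq, sub))

# A toy sequence database with three sequences.
sdb = [
    [{"a"}, {"a", "b", "c"}, {"a", "c"}],
    [{"a", "d"}, {"c"}, {"b", "c"}],
    [{"e", "f"}, {"a", "b"}, {"d", "f"}],
]
sup_ab = support(sdb, [{"a"}, {"b"}])
```

Here ⟨{a}, {b}⟩ is contained in the first two sequences but not the third (no {b} occurs after the itemset matching {a}), so its support is 2.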
3.2 State-of-the-art Serial Sequential Pattern Mining
Through 20 years of study and development, many techniques and approaches have been proposed for mining sequential patterns in a wide range of real-world applications [8], such as Web mining [21], classification [23], and mining motifs from biological sequences [22]. Some well-known serial algorithms for sequential pattern mining, such as AprioriAll [2], GSP [5], BIDE [65], CloSpan [66], FreeSpan [67], PrefixSpan [64], and SPADE [68], have been proposed. Since many parallel SPM algorithms extend traditional serial SPM approaches, it is quite necessary to understand the systematic characteristics of traditional serial sequential pattern mining approaches. Thus, the state of the art of serial SPM is reviewed below. Based on their mining mechanisms, some well-known serial SPM approaches are divided into several categories, including Apriori-based techniques, pattern-growth techniques, algorithms with early-pruning, and constraint-based algorithms. Details are summarized in Tables IV, V, VI, and VII, respectively.
Name  Description  Year 

Apriori (All, Some, Dynamic Some) [2]  The first algorithm for sequential pattern mining.  1995 
GSP [5]  Generalized Sequential Patterns (Max/Min Gap, Window, and Taxonomies).  1996 
PSP [69]  Retrieval optimizations and more efficient than GSP.  1998 
SPADE [68]  Sequential PAttern Discovery using Equivalence classes.  2001 
SPAM [23]  Sequential PAttern Mining with Bitmap representation.  2002 
LAPIN [70]  SPM with LAst Position INduction, which is categorized into two classes, LAPIN-LCI and LAPIN-Suffix.  2004
LAPIN-SPAM [71]  LAst Position INduction Sequential PAttern Mining.  2005
Name  Description  Year 

FreeSpan [67]  FREquEnt pattern-projected Sequential Pattern mining.  2000
WAP-mine [72]  SPM with suffix growth.  2000
PrefixSpan [64]  PREFIX-projected Sequential PAtterN mining.  2001
LPMiner [73]  Sequential pattern mining with Length-decreasing suPport.  2001
SLPMiner [74]  Sequential pattern mining with Length-decreasing suPport.  2002
FSMiner [75]  SPM with suffix growth.  2004
LAPIN-Suffix [70]  SPM with suffix growth.  2004
PLWAP [76]  SPM with prefix growth.  2005
Apriori-Based Techniques. In 1995, Agrawal and Srikant first introduced a new approach named AprioriAll to discover sequential patterns from a set of sequences with embedded timestamp information [2]. The AprioriAll algorithm is based on Apriori and relies on the Apriori property, that is, "all nonempty subsets of a frequent itemset must also be frequent" [1]. Therefore, AprioriAll holds the downward closure property (also called anti-monotonicity) and utilizes it to prune the search space. In general, two properties are often utilized in sequential pattern mining to speed up computation:

Subsequence infrequency-based pruning: any super-sequence of an infrequent sequence cannot be frequent, and thus can be pruned from the set of candidates [2]. If sa is a subsequence of sb, then the support of sb is at most as large as that of sa. Monotonicity: if sa is not frequent, then no super-sequence of sa can be frequent. Pruning: if we know that sa is not frequent, we do not have to evaluate any super-sequence of sa. For example, if a sequence occurs only 10 times, then any of its super-sequences can occur at most 10 times.

Super-sequence frequency-based pruning: any subsequence of a frequent sequence is also frequent, so such subsequences need not be evaluated separately and can be safely pruned from the set of candidates [77]. For example, if a sequence is frequent and occurs 8 times, then each of its subsequences is frequent and occurs at least 8 times.
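A sketch of how the first (anti-monotone) property prunes candidates during Apriori-style candidate generation: a (k+1)-candidate can survive only if every k-subsequence obtained by deleting one item is frequent. For simplicity, each element of a sequence here is a single item, and all names and data are illustrative.

```python
def subsequences_one_shorter(seq):
    """All subsequences obtained by deleting exactly one item from seq."""
    return [seq[:i] + seq[i + 1:] for i in range(len(seq))]

def prune_candidates(candidates, frequent_k):
    """Apriori-style pruning: keep a (k+1)-candidate only if all of its
    k-subsequences are known to be frequent (anti-monotonicity)."""
    fk = set(frequent_k)
    return [c for c in candidates
            if all(s in fk for s in subsequences_one_shorter(c))]

# Suppose the frequent 2-sequences found so far are:
frequent_2 = [("a", "b"), ("a", "c"), ("b", "c")]
# Two generated 3-candidates; the second contains the infrequent ("c", "b").
candidates_3 = [("a", "b", "c"), ("a", "c", "b")]
pruned = prune_candidates(candidates_3, frequent_2)
```

Here ("a", "c", "b") is discarded without any database scan, because its subsequence ("c", "b") is not in the frequent 2-sequence set.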
Srikant and Agrawal then proposed GSP, which is similar to AprioriAll in its execution process but adopts several technologies, including time constraints, sliding time windows, and taxonomies [5]. GSP uses a multiple-pass, candidate generation-and-test method to find sequential patterns. It adopts the Apriori property that all sub-patterns of a frequent pattern must be frequent. Thus, GSP can greatly improve performance over AprioriAll. PSP [69] resumes the general principles of GSP but utilizes a different intermediary data structure, making it more efficient than GSP. AprioriAll [2], AprioriSome [2], DynamicSome [2], and GSP [5] all use breadth-first search (BFS) to mine sequential patterns with a hierarchical structure. The SPADE algorithm, which uses equivalence classes to discover sequential patterns, is an Apriori-based hybrid miner; it can search either breadth-first or depth-first [68]. SPADE exploits sequential patterns by utilizing a vertical id-list database format and a vertical lattice structure. SPAM is similar to SPADE, but SPAM uses a bitmap representation and bitwise operations rather than regular and temporal joins. Yang et al. then proposed a novel algorithm named LAPIN for SPM with last position induction (LAPIN-LCI and LAPIN-Suffix) [70]. They further developed the LAPIN-SPAM [71] algorithm by combining LAPIN and SPAM. Note that SPAM, LAPIN, LAPIN-SPAM, and SPADE use the property that the support of a super-pattern is always less than or equal to the support of its sub-patterns, which is different from the Apriori property used in GSP. Most of the above Apriori-based algorithms share the following features.

Breadth-first search: Apriori-based algorithms commonly use a breadth-first (i.e., level-wise) search manner. They construct all candidate k-sequences together in the k-th iteration while exploring the search space.

Multiple database scans: these algorithms need to scan the original database many times to determine whether longer generated sequences are frequent. They suffer from the drawbacks of the candidate generation-and-test paradigm; in other words, they may generate a huge number of candidate sequences that never appear in the input database and need to scan the original database many times.
Name  Description  Year 

HVSM [78]  Bitwise operation and position induction.  2005
LAPIN-SPAM [71]  Bitwise operation.  2005
PLWAP [76]  Position induction.  2005
LAPIN [70]  Position induction.  2007
DISC-all [22]  Position induction and prefix growth.  2007
UDDAG [79]  Up-Down Directed Acyclic Graph for sequential pattern mining.  2010
Pattern-Growth Techniques. In 2000, a pattern-projected sequential pattern mining algorithm named FreeSpan was introduced by Han et al. [67]. By using a frequent item list (f-list) and an S-Matrix, it obtains all frequent sequences based on so-called projected pattern growth. Pattern-growth SPM algorithms can avoid recursively scanning the database to find frequent patterns, because they only consider the patterns actually appearing in the database. To reduce time-consuming database scans, pattern-growth SPM algorithms introduce a novel concept called the projected database [67, 39]. The projected database can significantly reduce the size of the databases explored, as frequent patterns are considered in a depth-first search.
The WAP-mine algorithm, which utilizes a WAP-tree, was proposed by Pei et al. [72]. In the tree, each node is assigned a binary code to determine which sequences are the suffix sequences of the last event, and then to find the next prefix for a discovered suffix; thus, intermediate WAP-trees do not have to be reconstructed. Han et al. then proposed the most representative algorithm, PrefixSpan [64]. It tests only the prefix subsequences, and then projects their corresponding postfix subsequences into projected sub-databases (a projected database is the set of suffixes w.r.t. a given prefix sequence). By recursively exploring only locally frequent sequences, sequential patterns can be grown in each projected sub-database. Thus, PrefixSpan exhibits better performance than both GSP [5] and FreeSpan [67]. However, when dealing with large dense databases that have large itemsets, its performance is worse than that of SPADE [68]. As shown in Table V, some other pattern-growth algorithms for sequential pattern mining have been developed, such as LPMiner [73], SLPMiner [74], WAP-mine [72], and FSMiner [75]. The LPMiner [73] and SLPMiner [74] algorithms take length-decreasing support into consideration. In 2005, Ezeife et al. proposed a pattern-growth and early-pruning hybrid method named PLWAP [76]. PLWAP utilizes a binary code assignment method to construct a pre-ordered, linked, position-coded Web access pattern tree (WAP-tree for short). Compared to the WAP-mine algorithm, PLWAP can significantly reduce runtime and provides a position-code mechanism. Inspired by the FP-tree [4] and the projection technique for quickly mining sequential patterns [39], a tree-based algorithm named FSMiner (Frequent Sequence Miner) was developed [75]. FSMiner resembles WAP-mine [72]; moreover, it supports incremental mining and interactive mining [80].
Name  Description  Constraint  Character  Year 

CloSpan [66]  The first algorithm for mining closed sequential patterns.  Closed  Based on the PrefixSpan algorithm.  2003 
BIDE [65]  Mining of frequent closed sequences by using BIDirectional Extension, which is a sequence closure checking scheme.  Closed  It executes some checking steps in the original database that avoid maintaining the sequence tree in memory.  2004 
MSPX [81]  MSPX mines maximal sequential patterns with a bottomup search and uses multiple samples.  Maximal  The sampling technique is used at each pass to distinguish potentially infrequent candidates.  2005 
SQUIRE [82]  Sequential pattern mining with quantities: Apriori-QSP and PrefixSpan-QSP.  Pattern with quantities  Apriori-based and PrefixSpan-based.  2007 
ClaSP [83]  Mining frequent closed sequential patterns by using several search space pruning methods together with a vertical database layout.  Closed  Inspired by the SPADE and CloSpan algorithms.  2013 
MaxSP [56]  Maximal Sequential Pattern miner without candidate maintenance.  Maximal  It neither produces redundant candidates nor stores intermediate candidates in main memory.  2013 
VMSP [84]  The first algorithm for Vertical Mining of Maximal Sequential Patterns.  Maximal  Uses a vertical representation to do a depth-first exploration of the search space.  2014 
CloFAST [85]  A fast algorithm for mining closed sequential patterns using id-lists.  Closed  It combines a new data representation of the dataset, based on sparse and vertical id-lists.  2016 
Early-Pruning Techniques. Although the projection-based approaches (i.e., FreeSpan, PrefixSpan) achieve a significant improvement over Apriori-based approaches, the projection mechanism still suffers from some drawbacks; the major cost of PrefixSpan is caused by constructing projected databases. Therefore, some hybrid algorithms with early-pruning strategies have been developed, as shown in Table VI. Following in the footsteps of SPAM, a first-Horizontal-last-Vertical scanning database Sequential pattern Mining algorithm (HVSM for short) [78] was developed using a bitmap representation. Instead of using candidate generation-and-test or tree projection, the LAPIN algorithm [70] uses an item-last-position list and a position set of prefix borders. Based on the anti-monotone property, which can prune infrequent sequences, Direct Sequence Comparison (DISC-all) uses other sequences of the same length to prune infrequent sequences [22]. DISC-all employs different orderings (lexicographical ordering and temporal ordering) to compare sequences of the same length. With an up-to-down directed acyclic graph, the UDDAG algorithm [79] uses prefixes and suffixes to detect patterns. It achieves faster pattern growth because of fewer levels of database projection compared to other approaches, and allows bidirectional pattern growth along both ends of detected patterns.
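The last-position idea behind LAPIN-style pruning can be illustrated with a small sketch (our simplification, not the published algorithm: `borders[i]` is assumed to hold the position where the current prefix ends in sequence i, or `len(seq)` when the prefix is absent; the real algorithms store these tables far more compactly). A sequence supports extending the prefix with an item iff the item's last position lies strictly after the prefix border, so no suffix scan is needed once last positions are precomputed:

```python
def support_by_last_position(db, borders, item):
    """Count sequences that can extend the current prefix with `item`,
    using only the precomputed last position of `item` per sequence."""
    last = [max((i for i, x in enumerate(seq) if x == item), default=-1)
            for seq in db]
    # the extension is supported iff item's last occurrence is after
    # the prefix border in that sequence
    return sum(1 for lp, b in zip(last, borders) if lp > b)
```

In a real implementation the `last` table is built once per item during a single scan, which is the source of the speedup over repeatedly scanning projected suffixes.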
Constraint-based Techniques. Different from the traditional SPM approaches, the interesting issue of constraint-based SPM has been widely studied, including quantitative sequences, maximal sequential patterns, closed sequential patterns, sequential patterns with gap constraints, top-k sequential patterns, etc. Some constraint-based SPM algorithms are described in Table VII. To obtain a condensed representation of the explosive number of frequent sequential patterns, the problem of mining maximal sequential patterns was developed first. MSPX mines maximal sequential patterns with a bottom-up search and uses multiple samples [81]. Unlike the traditional single-sample techniques used in FIM algorithms, MSPX uses multiple samples at each pass to distinguish potentially infrequent candidates. Then, an efficient algorithm called MaxSP (Maximal Sequential Pattern miner without candidate maintenance) was proposed [56]. Since the maximal representation may lose support information, another condensed representation, named closed patterns, was introduced [66]. Up to now, several algorithms for mining closed sequential patterns have been proposed, such as CloSpan [66], BIDE [65], ClaSP [83], and CloFAST [85]. CloSpan [66] adopts the candidate maintenance-and-test method to prune the search space, and it may perform poorly when the database contains long sequences or the support threshold is very low. A novel closure checking scheme, called bidirectional extension, was developed in BIDE [65]. BIDE mines closed sequential patterns without candidate maintenance; it is more efficient than CloSpan. Gomariz et al. then introduced the ClaSP algorithm, which uses several efficient search space pruning methods together with a vertical database layout [83]. Recently, a faster algorithm called CloFAST was proposed for mining closed sequential patterns using sparse and vertical id-lists [85]. CloFAST does not build pseudo-projected databases and does not need to scan them. It is more efficient than the previous approaches, including CloSpan, BIDE, and ClaSP.
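The closure condition that these algorithms enforce can be stated as a naive post-filter (for illustration only; CloSpan, BIDE, and CloFAST are designed precisely to avoid this pairwise check): a pattern is closed iff no strict super-pattern has the same support.

```python
def is_subseq(pat, seq):
    """True if pat is a (non-contiguous) subsequence of seq."""
    it = iter(seq)
    return all(any(x == y for y in it) for x in pat)

def closed_patterns(patterns):
    """Naive closure post-filter over a {pattern: support} dict: keep a
    pattern only if no longer super-pattern has the same support."""
    return {p: s for p, s in patterns.items()
            if not any(len(q) > len(p) and s2 == s and is_subseq(p, q)
                       for q, s2 in patterns.items())}
```

Running this over all frequent patterns is quadratic in their number, which is why closure-checking schemes such as BIDE's bidirectional extension matter in practice.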
Discussions. Each serial sequential pattern mining algorithm has its own advantages and disadvantages. Besides, there are many different category definitions for SPM algorithms. A more comprehensive discussion of these methods is given in [35, 36, 37], and an up-to-date survey of the current status of sequential pattern mining can be found in [8].
4 Stateoftheart Parallel Sequential Pattern Mining
In this section, we first briefly overview the current status of parallel frequent itemset mining. We then cover several categories of parallel sequential pattern mining algorithms: partition-based algorithms for PSPM, Apriori-based algorithms for PSPM, pattern-growth algorithms for PSPM, and hybrid algorithms for PSPM. Note that most PSPM algorithms are inherently hybrid.
4.1 Parallel Frequent Itemset Mining
As mentioned previously, sequential pattern mining is highly related to frequent itemset mining, and SPM is more complicated than FIM. Since many developments in SPM are inspired by technologies from FIM, it is necessary to overview the current development of parallel frequent itemset mining (PFIM). The problem of frequent itemset mining in parallel environments has been extensively studied so far, and a number of approaches have been explored to address it, such as PEAR [86], PPAR [86], PDM [87], ParEclat [29], PaMPa-HD [27], and PHIKS [28]. In 1995, Mueller first proposed two parallel algorithms, Parallel Efficient Association Rules (PEAR) [86] and Parallel Partition Association Rules (PPAR) [86]. Cheung et al. also proposed an algorithm named Parallel Data Mining (PDM) for parallel mining of association rules [87], and the Fast Distributed Mining (FDM) algorithm for distributed databases [88]. An adaptation of the FP-Growth algorithm to MapReduce [18], called PFP (parallel FP-tree), is presented in [19]. PFP is a parallel form of the classical FP-Growth; it splits a large-scale mining task into independent, parallel tasks. By extending the vertical mining approach Eclat [89], the parallel Eclat (ParEclat) [29] and the distributed Eclat (Dist-Eclat) [90] were developed, respectively. Research efforts have already been made to improve Apriori-based and traditional FIM algorithms and to convert them into parallel versions, mostly under the MapReduce [18] or Spark [61] environment. Some results of these efforts are PARMA [20], a parallel frequent itemset mining algorithm with Spark (R-Apriori) [91], PaMPa-HD [27], and PHIKS [28]. PARMA [20] is a parallel algorithm for the MapReduce framework, which mines approximate association rules instead of exact ones.
Discussions. The above parallel algorithms are used in frequent itemset mining. Up to now, parallel frequent itemset mining has been studied far more extensively than parallel sequential pattern mining. The development of PFIM is still an active research area, and up-to-date advances in PFIM can be found in [38, 17]. As mentioned before, efficient mining of frequent sequences is more challenging than FIM. Currently, a large number of algorithms have been developed for SPM, while few of them are sufficiently scalable to handle large-scale datasets.
4.2 Partitionbased Algorithms for PSPM
In addition to parallel frequent itemset mining, a variety of mining algorithms have been developed for parallel sequential pattern mining (PSPM). Although PSPM is comparatively more complicated and challenging than PFIM, research in the area of PSPM has been active for some time. Recently, to improve performance, effectiveness, and scalability, parallel sequential pattern mining has received much attention. As shown in Table VIII, some partition-based parallel methods for SPM have been developed.
A comprehensive working example of a partition-based sequential pattern mining algorithm with task parallelism is illustrated below, to show how the problem of parallel sequential pattern mining is solved. It helps to convey the general idea of parallel sequential pattern mining.
Note that Fig. 2 is an illustrative example of the SPSPM algorithm (Simply Partitioned Sequential Pattern Mining with task parallelism) [92]. Assume there are 3 nodes in the mining system, denoted as Node A, Node B, and Node C, respectively. Each node reads the sequence database from its local disk, and the large/frequent items are identified. Since SPSPM partitions/distributes the candidate sequences in a round-robin manner, the candidate sequences are assigned as shown in Fig. 2. Suppose Node A reads a sequence from its local disk. Node A then broadcasts this sequence to all the other nodes and increases the support counts of its locally assigned candidates. Node B and Node C each receive the sequence from Node A and likewise increase the support counts of the candidates assigned to them.
NPSPM, SPSPM and HPSPM. In 1998, Shintani and Kitsuregawa first partitioned the input sequences for SPM, assuming that the entire candidate set could be replicated and would fit in the overall memory (random access memory and hard drive) of a process [92]. Based on the classical GSP approach [5], they proposed three parallel approaches [92]: NPSPM (Non-Partitioned Sequential Pattern Mining), SPSPM (Simply Partitioned Sequential Pattern Mining), and HPSPM (Hash Partitioned Sequential Pattern Mining). These three variants follow the classical level-wise candidate generation-and-test style. HPSPM is similar to SPSPM but utilizes a more intelligent strategy: it uses a hash mechanism to assign the input sequences and candidate sequences to specific processes [92]. Although HPSPM has the best performance among the three variants, it nevertheless suffers from several limitations. For example, when the object timelines are very large, or when the counting method for generalized SPM is used, HPSPM is very time-consuming and memory-intensive.
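The contrast between SPSPM's round-robin assignment and HPSPM's hash assignment can be sketched as follows (our illustrative Python, not the original implementation): with hashing, each candidate is owned by exactly one node, so candidate sets are disjoint and any node can locate a candidate's owner without broadcasting candidate sets.

```python
def round_robin_partition(candidates, n_nodes):
    """SPSPM-style assignment: deal candidates out to nodes in turn,
    so candidate sets have almost the same size."""
    return [candidates[i::n_nodes] for i in range(n_nodes)]

def hash_partition(candidates, n_nodes):
    """HPSPM-style assignment: a hash of the candidate decides its
    owning node, so every node can compute the owner locally."""
    parts = [[] for _ in range(n_nodes)]
    for cand in candidates:
        parts[hash(cand) % n_nodes].append(cand)
    return parts
```

In both schemes the partitions are disjoint and jointly cover all candidates; the difference is that the hash version makes ownership computable everywhere, which is what reduces communication in HPSPM.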
EVE & EVECAN. Similar assumptions to NPSPM were made in EVE [93] and EVECAN [93]. EVECAN (EVEnt and CANdidate distribution) partitions the input sequential data similarly to EVE (EVEnt distribution), and also partitions the candidate sequences. In EVECAN, both the candidate generation phase and the counting phase are parallelized. The former uses a distributed hash table mechanism, whereas the latter adopts an approach similar to Intelligent Data Distribution (IDD), which is usually used for non-sequential associations. EVECAN shards both input sequences and candidate sequences, and rotates the smaller of the two sequence sets among the processes in a round-robin fashion. Note that the above parallel formulations for SPM target only the shared-memory architecture.
DPF, STPF and DTPF. Guralnik and Karypis developed three tree-projection-based parallel sequence mining algorithms [94], named data parallel formulation (DPF), static task-parallel formulation (STPF), and dynamic task-parallel formulation (DTPF), for distributed-memory architectures. These projection algorithms are parallelized using the data parallel formulation (DPF) and the task parallel formulation (TPF). They are intrinsically similar to the PrefixSpan algorithm [64], and grow the lexicographic sequence tree in a bi-level breadth-first manner. Two partitioning methods were proposed: TPF-BP (TPF based on a bin-packing algorithm) and TPF-GP (TPF based on a minimum-cut bipartite graph partitioning algorithm); both can achieve balanced workloads. It was shown that these three algorithms are able to achieve good speedups as the number of processors increases. However, STPF may suffer from load imbalance as the number of processes increases, and DPF not only has to partition the tree, but also needs to perform multiple scans of the local database at each iteration. The added I/O overhead makes DPF impractical for big data analytics. In addition, DPF utilizes a dynamic load-balancing strategy to allow an idle processor to join the busy processors. However, this dynamic strategy requires much more inter-processor communication than the selective sampling approach used in the ParCSP algorithm [34]. Moreover, the interruption of busy processors may cause even more overhead during the mining process.
Summary of merits and limitations. In this subsection, we have described the key details of the partition-based algorithms for PSPM proposed so far. We further highlight the key differences, advantages, and disadvantages of these approaches in Table VIII.
Name  Description  Pros.  Cons.  Year 

HPSPM [92]  Hash Partitioned Sequential Pattern Mining, task parallelism.  It can considerably reduce the broadcasting of sequences.  It is based on GSP and has a level-wise candidate generation-and-test style.  1998 
SPSPM [92]  Simply Partitioned Sequential Pattern Mining, task parallelism.  The candidate sets assigned to the nodes have almost the same size and are disjoint from each other.  It requires a very large communication overhead.  1998 
NPSPM [92]  Non-Partitioned Sequential Pattern Mining, data parallelism.  Mining sequential patterns without partitioning.  NPSPM is based on GSP and is performed in a bottom-up, level-wise manner with multiple data scans.  1998 
EVE [93]  EVEnt distribution.  Assume that the entire candidate sequences can be replicated and fit in the overall memory of a process.  The I/O overhead is impractical in the big data context.  1999 
EVECAN [93]  EVEnt and CANdidate distribution.  Rotates the smaller of two sets among the processes with a roundrobin fashion.  The I/O overhead is impractical in the Big Data context.  1999 
DPF, STPF and DTPF [94]  Data Parallel Formulation, Static Task-Parallel Formulation, and Dynamic Task-Parallel Formulation.  They can achieve balanced workloads using a bi-level breadth-first manner in the lexicographic sequence tree.  At each iteration, DPF has to partition the tree and perform multiple scans of the local database; this added I/O overhead is impractical when mining big data.  2004 
4.3 Aprioribased Algorithms for PSPM
Unlike in the above partition-based algorithms for PSPM, input partitioning is not inherently necessary for shared-memory systems or MapReduce-based distributed systems. In this subsection, some typical Apriori-based algorithms for PSPM are introduced. Moreover, the key differences, advantages, and disadvantages of these approaches are highlighted in Table IX.
pSPADE and webSPADE. Zaki first extended the efficient SPADE algorithm to the shared-memory parallel architecture, yielding pSPADE [31]. In the pSPADE framework, the input sequence data is assumed to reside on shared hard drive space and is stored in a lattice (a vertical data structure). pSPADE utilizes an optimized task-parallel formulation approach (Recursive Dynamic Load Balancing, RDLB), and uses two data-parallel formulations (single vs. level-wise id-list parallelism, and join parallelism) to partition the input space. Processes are either assigned id-lists for a subset of sequences, or portions of all id-lists associated with a range of sequences in the database. After that, processes collaborate to expand each node in the itemset lattice. However, these formulations lead to poor performance due to high synchronization and memory overheads. Zaki then proposed two task distribution schemes, which are able to avoid read/write conflicts through independent assignment of search-space sub-lattices. Besides, a web-based parallel approach called webSPADE was proposed to analyze web logs [95]. webSPADE is also a SPADE-based algorithm designed to run on shared-memory parallel computers.
DGSP and DPSP. A distributed GSP algorithm based on the MapReduce [18] framework, named DGSP, was introduced in [96]. The "two-jobs" structure used in DGSP can effectively reduce the communication overhead of mining sequential patterns in parallel. DGSP optimizes the workload balance, partitions the database, and assigns the fragments to Map workers. All in all, DGSP is a straightforward extension of GSP to the distributed environment; it performs poorly because of repeated scans of the input data. Huang et al. developed Map/Reduce jobs in DPSP to delete obsolete itemsets, update current candidate sequential patterns, and report up-to-date frequent sequential patterns within each period of interest (POI) [97]. One major drawback of DPSP is that it needs to proceed through numerous rounds of MapReduce jobs, which easily causes load unbalancing and incurs a high cost of MapReduce initialization. Another drawback is that it focuses on the POI concept, where the database is divided into many POI windows; thus, its ability to handle longer sequential patterns has not been demonstrated.
PartSpan and GridGSP. After that, some distributed and parallel mining methods, such as parallel sequence mining of trajectory patterns (PartSpan) [98] and GridGSP [99], were proposed to find trajectory patterns by extending the traditional GSP algorithm [5]. PartSpan, which is based on GSP and HPSPM, adopts prefix-projection and the parallel formulation [98]. Thus, it can efficiently reduce the I/O cost by using a candidate pruning strategy and reasonable data distribution [98]. GridGSP is a distributed and parallel GSP algorithm in a grid computing environment; it provides a flexible and efficient platform for SPM [99]. In GridGSP, the input sequence data is divided into a number of progressive windows, and candidate generation is then performed independently on multiple machines. A reducer then uses support-assembling jobs to count the relative occurrence frequencies of candidate sequences. Besides, GridGSP uses the Count Distribution (CD) technique for counting length-1 candidate sequences, followed by projection-based task partitioning for solving the remainder of the PSPM problem. Similar strategies were used in PLUTE [100] and MGFSM [101]. Both PartSpan and GridGSP focus on efficiently mining frequent patterns under different domain-specific constraints or on different types of data in parallel environments.
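The support-assembling step shared by these GSP-style distributed approaches can be simulated on a single machine as follows (a hypothetical sketch of the map/reduce contract, not DGSP's or GridGSP's actual code): each map worker emits (candidate, 1) pairs for its local database fragment, and a reducer sums the partial counts and keeps the globally frequent candidates.

```python
from collections import Counter

def is_subseq(pat, seq):
    """True if pat is a (non-contiguous) subsequence of seq."""
    it = iter(seq)
    return all(any(x == y for y in it) for x in pat)

def map_phase(fragment, candidates):
    """Map worker: emit (candidate, 1) for each candidate supported by
    a sequence in the local database fragment."""
    for seq in fragment:
        for cand in candidates:
            if is_subseq(cand, seq):
                yield cand, 1

def reduce_phase(pairs, minsup):
    """Reducer: sum the partial counts per candidate and keep the
    globally frequent ones."""
    counts = Counter()
    for cand, one in pairs:
        counts[cand] += one
    return {c: n for c, n in counts.items() if n >= minsup}
```

Because support is a simple sum over disjoint fragments, the reduce step yields exact global counts regardless of how the database was split, which is what makes this counting job embarrassingly parallel.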
SPAMC & SPAMC-UDLT. Recently, Huang et al. extended the classic SPAM algorithm to a MapReduce version, named SPAM on the Cloud (SPAMC for short). Cloud-based SPAMC utilizes iterative MapReduce to quickly generate and prune candidate sequences while constructing the lexical sequence tree [102]. The core technology of SPAMC is that it mines a sequence database using parallel processing, and all subtasks can be simultaneously distributed and executed on many machines. However, SPAMC addresses neither the iterative mining problem nor the load balancing problem; thus, reloading data in SPAMC incurs additional costs. Although SPAMC utilizes a global bitmap, it is not scalable enough for handling extremely large-scale databases. To remedy these problems, Chen et al. further devised SPAMC-UDLT to discover sequential patterns in the cloud using a uniform distributed lexical sequence tree, exploiting MapReduce and streaming processes [103]. SPAMC-UDLT dramatically improves overall performance without launching multiple MapReduce rounds, and provides perfect load balancing across machines in the cloud [103]. The results [103] showed that SPAMC-UDLT can remarkably reduce execution time compared to SPAMC, achieve high scalability, and provide much better load balancing than existing algorithms in the cloud.
Summary of merits and limitations. All the above-described algorithms, which are shown in Table IX, are Apriori-based algorithms for parallel sequential pattern mining. In general, all these algorithms may also suffer from the drawbacks of serial Apriori-based SPM algorithms; thus, they do not scale well and cannot efficiently address high time consumption and memory cost.
Name  Description  Pros.  Cons.  Year 

pSPADE [31]  Parallel SPADE.  RDLB exploits the dynamic load balancing at both interclass and intraclass granularities.  It mines patterns with levelwise candidate generationandtest.  2001 
webSPADE [95]  A parallel-algorithm variation of SPADE developed for web logs.  It requires only one scan of the input data on multiprocessor servers; data and task parallelism are achieved easily by multi-threaded programming.  As an Apriori-based, vertical-formatting method similar to SPADE, it needs several partial scans of the database.  2002 
DGSP [96]  A distributed GSP algorithm based on MapReduce.  DGSP optimizes the workload balance, partitions the database, and assigns the fragments to Map workers.  It may perform poorly because of the repeated scans of the input sequence data.  2005 
PartSpan [98]  Parallel sequence mining of trajectory patterns.  It can reduce the I/O cost by using the candidate pruning strategy and reasonable data distribution.  It does not scale well and can not efficiently solve the problem of timeconsuming and memory cost.  2008 
DPSP [97]  An algorithm for mining Distributed Progressive Sequential Patterns on the cloud.  It performs a progressive mining process in a dynamic database and can discover up-to-date frequent sequential patterns.  It may cause the load-unbalancing problem because of the numerous required rounds of MapReduce jobs, and may easily incur a high cost of MapReduce initialization.  2010 
GridGSP [99]  A distributed and parallel GSP in a grid computing environment.  It utilizes the divide-and-conquer strategy; several monitoring mechanisms are developed to help manage the SPM process.  It suffers from some drawbacks of the GSP algorithm, such as high time consumption and memory cost.  2012 
SPAMC [102]  SPAM algorithm on the Cloud.  Cloud-based SPAMC utilizes an iterative MapReduce framework to increase scalability.  It is not an iterative mining approach and may incur additional costs for reloading data; it does not study the load balancing problem and is still not scalable enough.  2013 
SPAMC-UDLT [103]  SPAM algorithm in the cloud with a uniform distributed lexical sequence tree.  It is an iterative mining approach that studies the load balancing problem, exploiting MapReduce and streaming processes.  It does not consider the gap constraint, and does not study condensed representations (i.e., closed, maximal).  2013 
Name  Description  Pros.  Cons.  Year 

ParASP [104]  Parallel PrefixSpan to mine all sequential patterns (ASP).  It uses a sampling method to accomplish static task partition.  The serial sampling component limits the speedup potential when the number of processes increases.  2005 
ParCSP [34]  Parallel mining of Closed Sequential Patterns with selective sampling.  The first parallel algorithm for mining closed sequential patterns; it can reduce processor idle time using a dynamic scheduling scheme and a load-balancing scheme.  The serial sampling component limits the speedup potential when the number of processes increases.  2005 
PLUTE [100]  Parallel sequential pattern mining.  It focuses on massive trajectory data, and outperforms PartSpan in mining massive trajectory data.  Its prefix-projection technology may consume huge memory and much scan time.  2010 
BIDE-MR [105]  A BIDE-based parallel algorithm on MapReduce.  The tasks of closure checking and pruning can be iteratively assigned to different nodes in a cluster; it runs on a Hadoop cluster with high speedups and can mine closed sequential patterns.  Even mining frequent closed sequences does not fully resolve the problem of pattern explosion.  2012 
DFSP [106]  A Depth-First SPelling algorithm for mining sequences from biological data.  It proposes a three-dimensional list for mining DNA sequences and addresses biological data mining, such as protein DNA analysis.  It focuses on PSPM under different domain-specific constraints, not the general case without constraints.  2013 
Sequence-Growth [107]  A MapReduce-based algorithm for mining sequences.  It uses a lexicographical order to generate candidate sequences, avoiding expensive scanning processes, and provides an extension for trajectory pattern mining.  It does not consider the gap constraint and support hierarchies.  2015 
4.4 Pattern Growth Algorithms for PSPM
As mentioned before, many pattern-growth algorithms for serial sequential pattern mining have been extensively studied. Based on their theories and techniques, some of these algorithms have been extended to parallel versions, as shown in Table X, which summarizes the key characteristics, advantages, and disadvantages of the pattern-growth approaches. The PrefixSpan algorithm also has some parallelized versions, such as ParASP [104] and Sequence-Growth [107].
ParFP, ParASP and ParCSP. In order to balance mining tasks, Cong et al. designed several pattern-growth models, ParFP (Parallel FP-growth) [104], ParASP (Parallel PrefixSpan to mine all sequential patterns) [104], and ParCSP (Parallel mining of Closed Sequential Patterns) [34], to accomplish static task partitioning. In particular, they use a sampling technique, named selective sampling, that requires the entire input data to be available at each process. ParASP extends the classical PrefixSpan algorithm [64], and selective sampling in ParASP is performed in four steps: sample tree construction, sample tree mining, task partition, and task scheduling. After collecting length-1 frequent sequence counts, ParASP selects a sample of frequent prefixes of sequences in the database. It then uses one process to handle the sample via a pattern-growth algorithm, recording the execution times for the found frequent subsequences. Correspondingly, mining each frequent subsequence can be treated as a parallel task. A series of projected sub-databases for these frequent subsequences are constructed while estimating their execution times. After that, task distribution is done in the same way as in the ParFP algorithm [104]. However, it has been found that the serial sampling operations of these algorithms limit the speedup potential when the number of processes increases.
As described in the above subsection, closed-pattern SPM algorithms have been extensively studied. However, for a long time no parallel method for closed sequential pattern mining had been proposed. In 2005, Cong et al. proposed the ParCSP (Parallel Closed Sequential Pattern mining) algorithm [34] based on the BIDE algorithm; ParCSP [34] is the first parallel algorithm aimed at mining closed sequential patterns. The process in ParCSP is divided into independent tasks, which minimizes inter-processor communication. ParCSP utilizes a dynamic scheduling strategy to reduce processor idle time and is performed on a distributed-memory system, unlike pSPADE, which uses a shared-memory system. ParCSP also uses a load-balancing scheme to speed up the process. However, applying such a dynamic load-balancing scheme in a distributed-memory system is too expensive to be practical, even for big data. In contrast to mining closed sequences, a parallel version of the MSPX [81] algorithm, named PMSPX, was also proposed; it mines maximal frequent sequential patterns by using multiple samples [108].
In addition, sequential mining can be applied to biological sequence mining, such as protein DNA analysis. Recently, the 2PDF-Index [109], 2PDF-Compression [109], and DFSP [106] algorithms were proposed for scalable mining of sequential patterns in biological sequences. DFSP [106] uses a three-dimensional list for mining DNA sequences. It adopts a direct access strategy and binary search to index the list structure for enumerating candidate patterns. However, these three works focus on efficiently mining frequent patterns under different domain-specific constraints or on different types of data (e.g., medical databases, DNA sequences), whereas SPAMC-UDLT [103] focuses on the general case without any constraints.
Summary of merits and limitations. According to the above analysis, it can be concluded that pattern-growth approaches usually perform better than Apriori-based algorithms for PSPM. The key differences, advantages, and disadvantages of these approaches are highlighted in Table X.
Name  Description  Pros.  Cons.  Year 

MGFSM [101]  A large-scale algorithm for Frequent Sequence Mining on MapReduce with gap constraints.  It uses optimization techniques to minimize partition size, computational cost, and communication cost; it is more efficient and scalable than alternative approaches; any existing SPM method can be used to mine one of its partitions; it can handle temporal gaps and maximal and closed sequences, which improves its usability.  MGFSM does not support hierarchies, where some items have hierarchies and their frequencies can differ.  2013 
DFSP [106]  A Depth-First SPelling algorithm for mining sequences from biological data.  It proposes a three-dimensional list for mining DNA sequences and addresses biological data mining, such as protein DNA analysis.  It focuses on PSPM under different domain-specific constraints, not the general case without constraints.  2013 
IMRSPM [110]  Uncertain SPM algorithm in an iterative MapReduce framework.  An iterative MapReduce framework for mining uncertain sequential patterns; the first work to solve the large-scale uncertain SPM problem.  It extends the Apriori-like SPM framework to MapReduce, and thus suffers from some drawbacks of Apriori-like SPM algorithms.  2015 
MGFSM+ [58]  An enhanced version of MGFSM.  A more scalable distributed (i.e., sharednothing) sequential pattern mining algorithm.
A more suitable SPM approach to directly integrate the maximality constraint. 
Similar to MGFSM, MGFSM+ does not support hierarchies either.  2015 
LASH [30]  A LArgescale Sequence mining algorithm with Hierarchies.  It is the first parallel algorithm for mining frequent sequences with consideration of hierarchies.
It proposes the pivot sequence miner (PSM) method for mining each partition. LASH can search better than MGFSM because of PSM. LASH does not need a global postprocessing. 
It cannot handle stream data and uncertain data.  2015 
Distributed SPM [111]  A distributed SPM approach with dynamic programming.  A memoryefficient approach uses distribute dynamic programming schema.
Uses an extended prefixtree to save intermediate results. Its time cost is linear. 
It does not consider the gapconstraint and support hierarchies.  2016 
Interesting Sequence Miner (ISM) [112]  A novel subsequence interleaving parallel model based on a probabilistic model of the sequence database.  It takes a probabilistic machine learning approach (an encoding scheme) to the SPM problem.
All operations on the sequence database are trivially parallelizable. 
It cannot handle stream data and uncertain data.  2015 
4.5 Hybrid Algorithms for PSPM
Some hybrid parallel SPM algorithms have been developed by combining the different technologies described above, as shown in Table XI. Details of some important hybrid algorithms are described below.
MG-FSM and MG-FSM+. MG-FSM [101] and its enhanced version MG-FSM+ [58] are scalable distributed (i.e., shared-nothing) algorithms for gap-constrained SPM on MapReduce [18]. They introduce the notion of w-equivalency w.r.t. the "projected database" concept used by many SPM algorithms. Moreover, several optimization techniques are developed to minimize partition size, computational cost, and communication cost. Therefore, MG-FSM and MG-FSM+ are more efficient and scalable than previous alternative approaches. In particular, any existing serial FSM method can be applied within MG-FSM to handle each of its partitions. MG-FSM is divided into three key phases: i) a preprocessing phase, ii) a partitioning phase, and iii) a mining phase; all three phases are fully parallelized. Both MG-FSM [101] and MG-FSM+ [58] can handle temporal gaps and maximal and closed sequences, which makes them more adaptable and flexible and improves their usability.
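To make the partition-and-mine idea concrete, the following sketch simulates the map and reduce phases with plain Python dictionaries. It is an illustration rather than the authors' implementation: the local miner is a toy that counts only contiguous subsequences (i.e., gap 0), the w-equivalent rewriting step is omitted, and the function names are ours. Only the pivot rule is kept, under which a partition reports a pattern only if the pattern's largest item equals the partition's pivot, so no pattern is reported twice.

```python
from collections import Counter

def mine_partition(sequences, min_support, max_length=3):
    # Toy stand-in for "any existing serial FSM method": counts
    # contiguous subsequences (gap 0) of length 1..max_length.
    counts = Counter()
    for seq in sequences:
        seen = set()
        for length in range(1, max_length + 1):
            for i in range(len(seq) - length + 1):
                seen.add(tuple(seq[i:i + length]))
        counts.update(seen)
    return {p for p, c in counts.items() if c >= min_support}

def partition_and_mine(db, min_support):
    # "Map" phase: route each sequence to the partition of every item it contains.
    partitions = {}
    for seq in db:
        for pivot in set(seq):
            partitions.setdefault(pivot, []).append(seq)
    # "Reduce" phase: mine each partition independently (in parallel in MG-FSM).
    # A partition keeps only patterns whose largest item equals its pivot.
    result = set()
    for pivot, part in partitions.items():
        for pattern in mine_partition(part, min_support):
            if max(pattern) == pivot:
                result.add(pattern)
    return result

db = [[1, 2, 3], [1, 2], [2, 3]]
print(partition_and_mine(db, min_support=2))
# -> {(1,), (2,), (3,), (1, 2), (2, 3)}
```

Because every sequence containing a pattern also contains the pattern's pivot item, a pattern's support inside its pivot partition equals its global support, which is the correctness argument behind this style of partitioning.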
MG-FSM vs. LASH. In some real-world applications, items have hierarchies and their frequencies can differ, while MG-FSM does not support hierarchies. To overcome this drawback, LASH was recently designed for large-scale sequence datasets; it is the first parallel algorithm to discover frequent sequences while considering hierarchies [30]. Inspired by MG-FSM, LASH first partitions the input data and then handles each partition independently and in parallel. Both are distributed algorithms for mining sequential patterns with maximum-gap and maximum-length constraints using MapReduce [18], and both are more adaptable and flexible than other PSPM algorithms. Notice that LASH does not require a global post-processing step: it divides each sequence by pivot item and performs local mining (PSM). Therefore, LASH can achieve better search performance than MG-FSM and MG-FSM+.
ACME. For applications in bioinformatics, Sahli et al. proposed ACME [113], which applies a tree structure to extract long motif sequences on supercomputers. ACME is a parallel combinatorial method for extracting motifs repeated in a single long sequence (e.g., DNA) instead of multiple long sequences. It proposes two novel models, CAST (cache-aware search space traversal) and FAST (fine-grained adaptive subtasks), to effectively utilize the memory caches and processing power of multi-core shared-memory machines. ACME also introduces an automatic tuning mechanism that suggests the appropriate number of CPUs to utilize. In particular, it targets large-scale shared-nothing systems with tens of thousands of processors, which are typical in cloud computing. It has been shown that ACME improves serial execution time by almost an order of magnitude and supports longer sequences (e.g., the DNA of the entire human genome).
Summary of merits and limitations. Most PSPM algorithms are inherently hybrid. Although the hybrid methods for PSPM incorporate different technologies to effectively reduce communication cost, memory usage, and execution time, they too have their own advantages and disadvantages. Details of the characteristics, advantages, and disadvantages of these hybrid methods for PSPM are described in Table XI.
5 Advanced Topics in PSPM
In this section, some advanced topics of parallel sequential pattern mining are provided and discussed in detail, including parallel quantitative/weighted/utility sequential pattern mining, parallel sequential pattern mining from uncertain data and stream data, and hardware accelerators for PSPM.
5.1 Parallel Quantitative / Weighted / Utility Sequential Pattern Mining
Generally speaking, most PSPM algorithms focus on improving efficiency. In many cases, however, effectiveness can be far more important than efficiency. Sequence data commonly carries quantities, weights, and utilities attached to different items/objects [82, 114]. For example, consider some transactions from shopping baskets: as shown in Table III, the quantity and price of each item are not considered in this database. In general, such useful information (i.e., quantity, weight, and utility) is simply ignored in the standard formulation of sequential pattern mining. To date, several approaches have been proposed to handle quantity, weight, or utility constraints in sequential pattern mining, including quantitative sequential pattern mining [82, 115], weighted sequential pattern mining [116], and high-utility sequential pattern mining [114]. All of these efforts rely on serial algorithms and are not able to scale to larger datasets. How to parallelize these methods is an open issue and challenge.
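To illustrate what such constraint-aware mining computes, here is a minimal sketch of one common weighted-support definition: the raw support of a pattern scaled by the mean weight of its items. The item weights, the example data, and the exact formula are illustrative assumptions, since each of the papers cited above uses its own formulation.

```python
def is_subsequence(pattern, sequence):
    # True if pattern occurs in sequence with arbitrary gaps allowed.
    it = iter(sequence)
    return all(item in it for item in pattern)

def weighted_support(pattern, db, weights):
    # Raw support scaled by the mean weight of the pattern's items --
    # one of several weighted-support definitions in the literature.
    support = sum(is_subsequence(pattern, seq) for seq in db)
    mean_weight = sum(weights[i] for i in pattern) / len(pattern)
    return support * mean_weight

db = [['a', 'b', 'c'], ['a', 'c'], ['b']]
weights = {'a': 1.0, 'b': 0.5, 'c': 0.8}  # illustrative item weights
print(weighted_support(('a', 'c'), db, weights))  # 2 occurrences * 0.9 = 1.8
```

A pattern is then reported only if its weighted support reaches a user-given threshold, so low-weight but frequent patterns can be pruned.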
Recently, methods for mining the above useful sequential patterns have been extended to achieve parallelism, such as the PESMiner (Parallel Event Space Miner) framework [117] for parallel and quantitative mining of sequential patterns at scale, and the Spark-based BigHUSP algorithm [118] for mining high-utility sequential patterns. BigHUSP, first proposed by Zihayat et al., is designed on the Apache Spark platform [61] and takes advantage of several of Spark's merits, such as distributed in-memory processing, fault recovery, and high scalability [118]. On one hand, both PESMiner [117] and BigHUSP [118] can identify informative sequential patterns with respect to a business objective, such as interval-based quantitative patterns and profitable patterns; thus, they may be more effective than methods that simply ignore this information. On the other hand, these frameworks integrate domain knowledge, computational power, and the MapReduce [18] or Apache Spark [61] techniques to achieve better feasibility, quality, and scalability.
5.2 PSPM in Uncertain Data
In many real-life applications, uncertain data is commonly seen, and uncertain sequence data is widely used to model inaccurate or imprecise timestamped data [119], while traditional data mining (i.e., FIM, ARM, and SPM) algorithms are inapplicable to uncertain data. Other related algorithms for parallel sequential pattern mining are still being developed, such as PSPM of large-scale databases using a probabilistic model [112]. Recently, Ge et al. proposed an iterative MapReduce framework for mining uncertain sequential patterns, whose iterative MapReduce-based, Apriori-like uncertain SPM framework is quite different from traditional SPM algorithms [110]. An uncertain sequence database SD is abstracted as a Resilient Distributed Dataset (RDD) [61] in Spark. Uncertain sequences in the RDD are allocated to a cluster of machines and can be processed in parallel. A sequential pattern α in SD is called a probabilistic frequent sequential pattern (p-FSP) if and only if its probability of being frequent is at least τ, where τ is the user-defined minimum probability threshold. The frequentness of α in an uncertain database is thus probabilistic rather than deterministic.
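Assuming each uncertain sequence contains a given pattern independently with a known probability, the pattern's support count follows a Poisson-binomial distribution, and its frequentness probability can be computed with a simple dynamic program. The following is a generic sketch of the p-FSP test under that independence assumption, not the exact procedure of [110]; the function names and the cap-at-minsup trick are ours.

```python
def frequentness_probability(probs, minsup):
    # probs[i]: probability that uncertain sequence i contains the pattern.
    # dp[k]: probability that exactly k sequences processed so far contain it,
    # with k capped at minsup (dp[minsup] absorbs "minsup or more").
    dp = [1.0] + [0.0] * minsup
    for p in probs:
        dp[minsup] += dp[minsup - 1] * p
        for k in range(minsup - 1, 0, -1):
            dp[k] = dp[k] * (1 - p) + dp[k - 1] * p
        dp[0] *= 1 - p
    return dp[minsup]

def is_probabilistic_frequent(probs, minsup, tau):
    # The p-FSP test: frequent with probability at least tau.
    return frequentness_probability(probs, minsup) >= tau

print(frequentness_probability([0.5, 0.5], minsup=2))      # 0.25
print(is_probabilistic_frequent([0.9, 0.9, 0.9], 2, 0.9))  # True (0.972)
```

The DP runs in O(n * minsup) time per pattern, which is why a memory-efficient distributed formulation matters when the database is large.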
Applying SPM to uncertain databases [119] is challenging in terms of efficiency and scalability. Ge and Xia further developed a Distributed Sequential Pattern (DSP) mining algorithm for large-scale uncertain databases on Spark, which relies on a memory-efficient distributed dynamic programming (DP) approach [111]. Although MapReduce [18] is widely used for processing big data in parallel, it does not support the iterative computing model, so its basic framework cannot be used directly for SPM. Directly applying the DP method to Spark is also impractical, because its memory-consuming characteristic may cause heavy JVM garbage collection overhead in Spark [111]. For bioinformatics, a new Spark-based framework was proposed to mine sequential patterns from uncertain big deoxyribonucleic acid (DNA) data [120].
5.3 PSPM in Stream Data
Most of the above PSPM algorithms are developed to handle data stored in a traditional database management system (DBMS), rather than continuous, unbounded stream data.
5.4 Hardware Accelerator for PSPM
Generally, many efforts have been made to speed up SPM via both software and hardware. On the one hand, parallel implementation is the general software approach; on the other hand, hardware accelerators allow a single node to achieve orders-of-magnitude improvements in performance and energy efficiency. As mentioned before, graphics processing units (GPUs) [39, 21] have attracted much attention due to their cost-effectiveness and enormous power for massively data-parallel computing. General-purpose GPUs leverage high parallelism, but the GPU's single instruction multiple data (SIMD), lock-step organization means that the parallel tasks must generally be similar. Lin et al. developed a novel parallel algorithm on GPUs for accelerating pattern matching [60]. Based on the traditional Aho-Corasick (AC) algorithm, they implemented three CPU versions and one GPU version: AC-CPU, DPAC-OMP, PFAC-OMP, and PFAC-GPU [60].
Memeti and Pllana then introduced an approach for accelerating large-scale DNA sequence analysis using SIMD parallelism on many-core architectures (such as GPUs and the Intel Xeon Phi coprocessor) [122]. Recently, Wang et al. proposed hardware-accelerated solutions, AP-SPM and GPU-GSP, for sequential pattern mining (SPM) [123]. The major ideas are as follows: they use Micron's Automata Processor (AP), which provides a native hardware implementation of non-deterministic finite automata. Based on the well-known GSP approach, AP-SPM derives a compact automaton design for matching and counting frequent sequences. Experiments against single-threaded CPU, multi-core CPU, and GPU GSP implementations [123] have shown that the AP-accelerated solution outperforms PrefixSpan [64] and SPADE [68] on multi-core CPUs and scales well for larger datasets.
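The operation that such accelerators offload is GSP's support-counting step: testing every candidate against every sequence and tallying the matches. A minimal software rendering of that step (gaps allowed, single-item elements only, no timing constraints; the function names are ours) looks like this:

```python
def contains_subsequence(sequence, pattern):
    # True if pattern occurs in sequence as a subsequence (gaps allowed);
    # the per-candidate matching test of GSP's counting phase.
    it = iter(sequence)
    return all(item in it for item in pattern)

def count_supports(db, candidates):
    # One pass over the database, matching every candidate against every
    # sequence -- the work offloaded to automata states or GPU threads.
    return {c: sum(contains_subsequence(seq, c) for seq in db)
            for c in candidates}

db = [['a', 'b', 'c'], ['a', 'c'], ['c', 'a']]
print(count_supports(db, [('a', 'c'), ('c', 'a'), ('b',)]))
# -> {('a', 'c'): 2, ('c', 'a'): 1, ('b',): 1}
```

Since every candidate/sequence pair is independent, this step parallelizes trivially, which is exactly what makes it a good fit for SIMD and automata hardware.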
6 Open-Source Software
Although the problem of sequential pattern mining has been studied for more than two decades, and the advanced topic of parallel sequential pattern mining has been extended to many research fields, few implementations or source code releases of these algorithms are available. This creates barriers for other researchers, who need to re-implement algorithms in order to use them or to compare performance with newly proposed algorithms. To make matters worse, it may cause unfairness in experimental comparisons, since the performance of pattern mining algorithms (i.e., FIM, ARM, and SPM) commonly depends on the compiler and machine architecture used. We list some open-source software for SPM and PSPM in Table XII.
Name | Contributors | Website
Sequence Mining | Zaki et al. | http://www.cs.rpi.edu/~zaki/wwwnew/pmwiki.php/Software
SPAM [23] | Gehrke et al. | http://himalayatools.sourceforge.net/Spam
SPMF [5] | Fournier-Viger et al. | http://www.philippefournierviger.com/spmf
MG-FSM & MG-FSM+ [69] | Miliaraki et al. | https://github.com/umapi1/mgfsm
LASH [30] | Beedkar et al. | http://umapi1.github.io/lash
Sequence mining. Zaki releases many implementations and source code of data mining algorithms on his website, including itemset mining and association rules, sequence mining, itemset and sequence utilities, tree mining, graph mining and indexing, and clustering. He provides C code for three sequence mining algorithms.
SPAM. A special website (shown in Table XII), named "Himalaya Data Mining Tools", is built for the classical SPAM algorithm [23], which aims at finding all frequent sequences. It provides an overview, an illustrative example, source code and documentation, as well as the presentation slides of SPAM.
SPMF. SPMF [80] is a well-known open-source data mining library, which implements many algorithms and has been cited in more than 600 research papers since 2010. SPMF is written in Java and provides implementations of 120 data mining algorithms, specialized in sequential pattern mining; it has the largest collection of implementations of sequential pattern mining algorithms. It provides Java code for AprioriAll [2], GSP [5], PrefixSpan [64], SPADE [68], SPAM [23], and many others. Its only limitation is that SPMF does not provide any parallel algorithms.
MG-FSM & MG-FSM+. MG-FSM [101] is a scalable, general-purpose frequent sequence mining algorithm built for MapReduce; it supports general mining as well as maximum-gap and maximum-length constraints. The Java implementation and command-line options of MG-FSM are provided on GitHub.
LASH. LASH [30] is a highly scalable, MapReduce-based distributed sequence mining algorithm that mines sequential patterns while considering item hierarchies. The Java source code of LASH is provided on GitHub, including prerequisites for building LASH and running instructions.
7 Challenges and Opportunities in Parallel Sequential Pattern Mining
7.1 Challenges in PSPM
In recent decades, many models and algorithms have been developed to efficiently discover desired knowledge from various types of databases. There are still, however, many challenges in big data mining. According to Labrinidis and Jagadish's study, the major challenges in big data analysis include data inconsistency, data incompleteness, scalability, timeliness, and security [42]. Although useful developments have been made in parallel computing platforms, some technical challenges remain [24, 44]. Despite advances in parallel data mining, especially parallel frequent itemset mining and parallel sequential pattern mining, in both academia and industry, significant challenges still remain in parallel sequential pattern mining. Some challenges that need to be addressed to realize parallel sequential pattern mining are explored below.
(1) Complex types of sequence data. Sequence data commonly occurs in many fields, and current developments in parallel sequential pattern mining have successfully addressed some types of it, including general sequences, sequence data with quantities/weights/utilities, uncertain sequence data, stream data, and so on; the related algorithms and extensions were mentioned previously. However, there are still more complex types of sequence data in a wide range of applications. For future research, it is a major challenge to develop more adaptable and flexible PSPM frameworks for dealing with more complex types of sequence data in a parallel environment.
(2) Multi-modal data. In the big data era, a key property of multi-modality is complementarity: each modality brings to the whole some added value that cannot be deduced or obtained from any of the other modalities in the setup. In mathematical terms, this added value is called diversity. Thus, multi-modality and data fusion are a potential challenge to address before applying parallel sequential pattern mining algorithms to big data.
(3) Dynamic sequence data. In a wide range of applications, the processed data is commonly dynamic rather than static [121], and dynamic data is more complicated and difficult to handle than static data. Although some approaches have been developed for handling dynamic sequence data, such as incremental mining [124, 75, 125], online progressive mining [126], and stream data processing [121, 103], there are still many challenges in parallelizing these mining methods, especially dynamic sequential pattern mining algorithms.
(4) Scalability. In general, parallelizing SPM algorithms for big data is more complicated and difficult than for small data, and it easily raises challenging problems such as availability, accuracy, and scalability. When dealing with big data, the main disadvantages of current data mining techniques are inadequate scalability and parallelism, and scalability is one of the core technologies needed to meet these challenges. Thus, it is a major challenge to realize the needed scalability of a PSPM algorithm with demonstrated elasticity and parallelism capacities.
(5) Privacy. Data has inestimable value (i.e., hidden knowledge and valuable insights), and data mining and analytics offer many useful tools that play an important role in many fields. However, a major risk in data management and data mining is data leakage, which seriously threatens privacy; public concern regarding privacy is rising and is a major issue in data mining, especially big data mining. On one hand, policies that cover all user privacy concerns should be developed; on the other hand, privacy-preserving data mining (PPDM) [127] is also needed. To date, many technologies and algorithms have been developed for PPDM [127], but few of them relate to parallel sequential pattern mining. As SPM is increasingly applied to personal data, it is both necessary and challenging to address privacy preservation for parallel sequential pattern mining.
7.2 Opportunities in PSPM
There are a variety of traditional, distributed, and parallel data mining algorithms in numerous real-life applications. To meet the demand for large-scale, high-performance computing, the problem of parallel/distributed data mining has received considerable attention over the past decade. There are many types of parallel and distributed systems, such as multi-core computing [55], grids [47, 48], peer-to-peer (P2P) systems [49], ad-hoc networks [50], cloud computing systems [51], and the MapReduce framework [18]. All these developments can provide theoretical and technical support for parallel sequential pattern mining. Some opportunities are summarized below.
(1) New applications. According to our review of the current status, most PSPM algorithms focus on improving mining efficiency. Sequence data commonly occurs in many fields, such as basket analysis, sensor networks, bioinformatics, social network analysis, and the Internet of Things, and there are important research opportunities for PSPM in these fields. Instead of focusing on faster general PSPM algorithms, the trend clearly goes toward extending parallel sequential pattern mining to new applications, or to new ways of serving existing applications [8].
(2) Advanced parallel computing environments. To overcome the scalability challenge of big data, several attempts have been made to exploit massively parallel processing architectures, e.g., the cloud [51], MapReduce [18], and Spark [61]. All these computing infrastructures provide new opportunities for the design of parallel data mining algorithms, especially parallel sequential pattern mining algorithms, and to some degree supply the necessary theoretical and technical support for them.
(3) Developments in hardware and software. As mentioned before, one trend in parallel computing is GPU-based parallel techniques. Architecturally, a GPU is composed of hundreds of cores that can simultaneously handle thousands of threads/tasks [39, 21], and more and more supercomputers include GPU components alongside traditional CPUs. Therefore, we can expect enhancements to parallel data mining with the progress of powerful computational capabilities. For future research, developments in hardware and software provide strong opportunities for sequence mining in parallel environments.
(4) Keeping patterns short and simple. Although there have been numerous studies on pattern mining (e.g., FIM, ARM, and SPM), the study of interesting pattern mining is still limited. Most approaches only discover the complete set of desired patterns, and a huge set of mining results may still confuse users and lead to inefficient decisions. What is worse, they may easily produce redundant results when mining long patterns or when using a low support threshold. Returning a condensed or more constrained set is an interesting issue. Some methods of condensed SPM (i.e., CloSpan [66], BIDE [65], MSPX [81], and ClaSP [83], as shown in Table VII) or summarizing SPM (i.e., summarizing event sequences [128], SQUISH [129]) have been proposed; instead of returning all patterns that occur more often than a certain threshold, these methods return only a condensed or more constrained set. How to parallelize these condensed or summarizing methods is an open issue and challenge.
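The effect of a condensed representation such as the closed patterns of CloSpan can be illustrated with a small post-filter that keeps only patterns having no strictly longer super-sequence of equal support. Real closed-pattern miners prune during the search rather than filtering afterwards; this sketch, with names of our choosing, only demonstrates the output contract.

```python
def is_subsequence(pattern, other):
    # True if pattern occurs in other as a subsequence (gaps allowed).
    it = iter(other)
    return all(x in it for x in pattern)

def closed_patterns(pattern_supports):
    # Keep a pattern only if no strictly longer super-sequence in the
    # mined set has exactly the same support.
    return {p: s for p, s in pattern_supports.items()
            if not any(len(q) > len(p) and s == t and is_subsequence(p, q)
                       for q, t in pattern_supports.items())}

mined = {('a',): 2, ('b',): 3, ('a', 'b'): 2}
print(closed_patterns(mined))  # -> {('b',): 3, ('a', 'b'): 2}
```

Here ('a',) is dropped because its super-sequence ('a', 'b') has the same support, so no information is lost: every frequent pattern and its support can be recovered from the closed set.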
(5) Utilizing deep learning technology. Deep learning [130] is a hot topic in current research. Some works focus on deep learning for sequence data (e.g., text, EEG data, and DNA), such as recurrent neural networks (RNNs) [131] and long short-term memory (LSTM) [132]. Unlike a standard neural network, an RNN can operate over sequences of vectors [131] and uses internal memory to process arbitrary sequences of inputs; LSTM is particularly well suited to sequential data [132]. Many deep learning algorithms have been proposed to model sequential data toward different learning goals. Despite their flexibility and power, how to parallelize them for sequential pattern mining raises challenges and opens many new research directions.
(6) Other important issues. Beyond the above points, there are other important issues that can reasonably be considered and further developed, such as visualization techniques and the privacy issue for parallel sequential pattern mining in the big data era. How to design efficient and more flexible PSPM approaches to support iterative and interactive mining is also an interesting issue.
8 Conclusion
Typically, sequential pattern mining algorithms aim at discovering the desired frequent sequential patterns. Since traditional data mining algorithms generally encounter problems and challenges when processing large-scale data, parallel data mining has emerged as an important research topic. However, few studies have summarized the related developments in parallel sequential pattern mining in parallel computing environments, or organized them into a taxonomy. In this paper, we first reviewed the key characteristics of some parallelism methods, and then highlighted the differences and challenges between parallel computing platforms and distributed systems. We then discussed related work on parallel sequential pattern mining in several categories. Our main contributions are an investigation of recent advances in parallel sequential pattern mining and a detailed account of the status of the field, including sequential pattern mining (SPM), parallel frequent itemset mining (PFIM), and parallel sequential pattern mining (PSPM). Both basic and advanced algorithms for parallel sequential pattern mining were reviewed in several categories, and the key ideas, advantages, and disadvantages of each approach were pointed out in detail. We further provided some related open-source software for PSPM, which may reduce barriers to research and algorithm implementation. Finally, we briefly pointed out some challenges and opportunities of parallel sequential pattern mining for future research.
Wensheng Gan is currently a joint Ph.D. candidate at the Department of Computer Science, University of Illinois at Chicago, USA; and a Ph.D. candidate at the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Guangdong, China. He received B.S. and M.S. degrees in Computer Science from South China Normal University and Harbin Institute of Technology (Shenzhen), Guangdong, China, in 2013 and 2015, respectively. His research interests include data mining, utility mining, pattern discovery, and Big Data technologies. He has published more than 50 research papers in peerreviewed journals and international conferences. 
Jerry Chun-Wei Lin received his Ph.D. in Computer Science and Information Engineering from National Cheng Kung University, Tainan, Taiwan, in 2010. He is now working as an associate professor at the Harbin Institute of Technology (Shenzhen), Shenzhen, China. His research interests include data mining, privacy-preserving and security, Big Data analytics, and social networks. He has published more than 200 research papers in peer-reviewed international conferences and journals, which have received more than 1900 citations. He is the co-leader of the popular SPMF open-source data mining library and the Editor-in-Chief of Data Science and Pattern Recognition (DSPR), Associate Editor of the Journal of Internet Technology, and a member of the Editorial Board of Intelligent Data Analysis.
Philippe Fournier-Viger is a full professor and Youth 1000 scholar at the Harbin Institute of Technology (Shenzhen), Shenzhen, China. He received a Ph.D. in Computer Science from the University of Quebec, Montreal, in 2010. His research interests include data mining, pattern mining, sequence analysis and prediction, text mining, e-learning, and social network mining. He has published more than 200 research papers in peer-reviewed international conferences and journals, which have received more than 2200 citations. He is the founder of the popular SPMF open-source data mining library, which has been cited in more than 700 research papers since 2010. He is also Editor-in-Chief of Data Science and Pattern Recognition (DSPR).
Han-Chieh Chao has been the president of National Dong Hwa University since February 2016. He received M.S. and Ph.D. degrees in Electrical Engineering from Purdue University in 1989 and 1993, respectively. His research interests include high-speed networks, wireless networks, IPv6-based networks, and artificial intelligence. He has published nearly 350 peer-reviewed professional research papers. He is the Editor-in-Chief (EiC) of IET Networks and of the Journal of Internet Technology. He is the founding EiC of the International Journal of Internet Protocol Technology and the International Journal of Ad Hoc and Ubiquitous Computing. Dr. Chao has served as a guest editor for ACM MONET, IEEE JSAC, IEEE Communications Magazine, IEEE Systems Journal, Computer Communications, IEEE Proceedings Communications, Wireless Personal Communications, and Wireless Communications & Mobile Computing. He is an IEEE Senior Member and a fellow of IET.
Philip S. Yu received a B.S. degree in electrical engineering from National Taiwan University, M.S. and Ph.D. degrees in electrical engineering from Stanford University, and an MBA from New York University. He is a distinguished professor of computer science with the University of Illinois at Chicago (UIC) and also holds the Wexler Chair in Information Technology at UIC. Before joining UIC, he was with IBM, where he was manager of the Software Tools and Techniques Department at the Thomas J. Watson Research Center. His research interests include data mining, data streams, databases, and privacy. He has published more than 1,000 papers in peer-reviewed journals (i.e., TKDE, TKDD, VLDB J., and ACM TIST) and conferences (KDD, ICDE, WWW, AAAI, SIGIR, ICML, CIKM, etc.). He holds or has applied for more than 300 U.S. patents. Dr. Yu is the Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data. He received the ACM SIGKDD 2016 Innovation Award for his influential research and scientific contributions on mining, fusion, and anonymization of Big Data, and the IEEE Computer Society 2013 Technical Achievement Award for pioneering and fundamentally innovative contributions to the scalable indexing, querying, searching, mining, and anonymization of Big Data. Dr. Yu is a fellow of the ACM and the IEEE.
Footnotes
 http://en.wikipedia.org/wiki/Task_parallelism
 http://en.wikipedia.org/wiki/Data_parallelism
 https://en.wikipedia.org/wiki/Parallel_computing
 http://hadoop.apache.org
 https://en.wikipedia.org/wiki/Multicore_processor
 https://en.wikipedia.org/wiki/Grid_computing
 https://en.wikipedia.org/wiki/Graphics_processing_unit
 https://en.wikipedia.org/wiki/Database
References
 R. Agrawal, R. Srikant et al., “Fast algorithms for mining association rules,” in Proc. 20th int. conf. very large data bases, VLDB, vol. 1215, 1994, pp. 487–499.
 R. Agrawal and R. Srikant, “Mining sequential patterns,” in Proceedings of the Eleventh International Conference on Data Engineering. IEEE, 1995, pp. 3–14.
 R. Agrawal, T. Imieliński, and A. Swami, “Mining association rules between sets of items in large databases,” in ACM SIGMOD Record, vol. 22, no. 2. ACM, 1993, pp. 207–216.
 J. Han, J. Pei, Y. Yin, and R. Mao, “Mining frequent patterns without candidate generation: A frequentpattern tree approach,” Data mining and knowledge discovery, vol. 8, no. 1, pp. 53–87, 2004.
 R. Srikant and R. Agrawal, “Mining sequential patterns: Generalizations and performance improvements,” in International Conference on Extending Database Technology. Springer, 1996, pp. 1–17.
 P. FournierViger, J. C. W. Lin, B. Vo, T. T. Chi, J. Zhang, and H. B. Le, “A survey of itemset mining,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 7, no. 4, 2017.
 J. Pei, J. Han, B. MortazaviAsl, J. Wang, H. Pinto, Q. Chen, U. Dayal, and M. C. Hsu, “Mining sequential patterns by patterngrowth: The PrefixSpan approach,” IEEE Transactions on knowledge and data engineering, vol. 16, no. 11, pp. 1424–1440, 2004.
 P. FournierViger, J. C. W. Lin, R. U. Kiran, Y. S. Koh, and R. Thomas, “A survey of sequential pattern mining,” Data Science and Pattern Recognition, vol. 1, no. 1, pp. 54–77, 2017.
 P. Berkhin, “A survey of clustering data mining techniques,” in Grouping multidimensional data. Springer, 2006, pp. 25–71.
 R. A. Jarvis and E. A. Patrick, “Clustering using a similarity measure based on shared near neighbors,” IEEE Transactions on computers, vol. 100, no. 11, pp. 1025–1034, 1973.
 J. R. Quinlan, C4.5: Programs for Machine Learning. Elsevier, 2014.
 W. Lee, S. J. Stolfo, and K. W. Mok, “Adaptive intrusion detection: A data mining approach,” Artificial Intelligence Review, vol. 14, no. 6, pp. 533–567, 2000.
 M. S. Chen, J. Han, and P. S. Yu, “Data mining: an overview from a database perspective,” IEEE Transactions on Knowledge and data Engineering, vol. 8, no. 6, pp. 866–883, 1996.
 L. Geng and H. J. Hamilton, “Interestingness measures for data mining: A survey,” ACM Computing Surveys (CSUR), vol. 38, no. 3, p. 9, 2006.
 X. Yan and J. Han, “gSpan: Graph-based substructure pattern mining,” in Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM). IEEE, 2002, pp. 721–724.
 M. J. Zaki, “Parallel and distributed association mining: A survey,” IEEE concurrency, vol. 7, no. 4, pp. 14–25, 1999.
 W. Gan, J. C. W. Lin, H. C. Chao, and J. Zhan, “Data mining in distributed environment: a survey,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 7, no. 6, 2017.
 J. Dean and S. Ghemawat, “MapReduce: a flexible data processing tool,” Communications of the ACM, vol. 53, no. 1, pp. 72–77, 2010.
 H. Li, Y. Wang, D. Zhang, M. Zhang, and E. Y. Chang, “PFP: parallel FP-growth for query recommendation,” in Proceedings of the 2008 ACM Conference on Recommender Systems. ACM, 2008, pp. 107–114.
 M. Riondato, J. A. DeBrabant, R. Fonseca, and E. Upfal, “PARMA: a parallel randomized algorithm for approximate association rules mining in MapReduce,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management. ACM, 2012, pp. 85–94.
 S. Boggan and D. M. Pressel, “GPUs: an emerging platform for general-purpose computation,” Army Research Laboratory, Aberdeen Proving Ground, MD, Tech. Rep., 2007.
 D. Y. Chiu, Y. H. Wu, and A. L. Chen, “An efficient algorithm for mining frequent sequences by a new strategy without support counting,” in Proceedings of the 20th International Conference on Data Engineering (ICDE). IEEE, 2004, pp. 375–386.
 J. Ayres, J. Flannick, J. Gehrke, and T. Yiu, “Sequential pattern mining using a bitmap representation,” in Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2002, pp. 429–435.
 F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, “Bigtable: A distributed storage system for structured data,” ACM Transactions on Computer Systems (TOCS), vol. 26, no. 2, p. 4, 2008.
 C. W. Tsai, C. F. Lai, M. C. Chiang, L. T. Yang et al., “Data mining for internet of things: A survey.” IEEE Communications Surveys and Tutorials, vol. 16, no. 1, pp. 77–97, 2014.
 C. W. Tsai, C. F. Lai, H. C. Chao, and A. V. Vasilakos, “Big data analytics: a survey,” Journal of Big Data, vol. 2, no. 1, p. 21, 2015.
 D. Apiletti, E. Baralis, T. Cerquitelli, P. Garza, P. Michiardi, and F. Pulvirenti, “PaMPa-HD: a parallel MapReduce-based frequent pattern miner for high-dimensional data,” in Proceedings of the 2015 IEEE International Conference on Data Mining Workshop (ICDMW). IEEE, 2015, pp. 839–846.
 S. Salah, R. Akbarinia, and F. Masseglia, “A highly scalable parallel algorithm for maximally informative k-itemset mining,” Knowledge and Information Systems, vol. 50, no. 1, pp. 1–26, 2017.
 M. J. Zaki, S. Parthasarathy, M. Ogihara, W. Li, P. Stolorz, and R. Musick, “Parallel algorithms for discovery of association rules,” in Scalable High Performance Computing for Knowledge Discovery and Data Mining. Springer, 1997, pp. 5–35.
 K. Beedkar and R. Gemulla, “LASH: Large-scale sequence mining with hierarchies,” in Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM, 2015, pp. 491–503.
 M. J. Zaki, “Parallel sequence mining on shared-memory machines,” Journal of Parallel and Distributed Computing, vol. 61, no. 3, pp. 401–426, 2001.
 Y. Xu, W. Qu, Z. Li, G. Min, K. Li, and Z. Liu, “Efficient k-means++ approximation with MapReduce,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 12, pp. 3135–3144, 2014.
 W. Zhao, H. Ma, and Q. He, “Parallel k-means clustering based on MapReduce,” in IEEE International Conference on Cloud Computing. Springer, 2009, pp. 674–679.
 S. Cong, J. Han, and D. Padua, “Parallel mining of closed sequential patterns,” in Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, 2005, pp. 562–567.
 N. R. Mabroukeh and C. I. Ezeife, “A taxonomy of sequential pattern mining algorithms,” ACM Computing Surveys (CSUR), vol. 43, no. 1, p. 3, 2010.
 C. H. Mooney and J. F. Roddick, “Sequential pattern mining–approaches and algorithms,” ACM Computing Surveys (CSUR), vol. 45, no. 2, p. 19, 2013.
 Q. Zhao and S. S. Bhowmick, “Sequential pattern mining: A survey,” Technical Report, CAIS, Nanyang Technological University, Singapore, vol. 1, p. 26, 2003.
 D. C. Anastasiu, J. Iverson, S. Smith, and G. Karypis, “Big data frequent pattern mining,” in Frequent Pattern Mining. Springer, 2014, pp. 225–259.
 J. A. Anderson, C. D. Lorenz, and A. Travesset, “General purpose molecular dynamics simulations fully implemented on graphics processing units,” Journal of Computational Physics, vol. 227, no. 10, pp. 5342–5359, 2008.
 M. Sousa, M. Mattoso, and N. Ebecken, “Data mining on parallel database systems,” in Proc. Intl. Conf. on PDPTA: Special Session on Parallel Data Warehousing, CSREA Press, Las Vegas, 1998.
 N. Khan, I. Yaqoob, I. A. T. Hashem, Z. Inayat, M. Ali, W. Kamaleldin, M. Alam, M. Shiraz, and A. Gani, “Big data: survey, technologies, opportunities, and challenges,” The Scientific World Journal, vol. 2014, 2014.
 A. Labrinidis and H. V. Jagadish, “Challenges and opportunities with big data,” Proceedings of the VLDB Endowment, vol. 5, no. 12, pp. 2032–2033, 2012.
 X. Wu, X. Zhu, G. Q. Wu, and W. Ding, “Data mining with big data,” IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.
 A. S. Tanenbaum and M. Van Steen, Distributed Systems: Principles and Paradigms. Prentice-Hall, 2007.
 F. Li, B. C. Ooi, M. T. Özsu, and S. Wu, “Distributed data management using MapReduce,” ACM Computing Surveys (CSUR), vol. 46, no. 3, p. 31, 2014.
 M. J. Zaki, “Parallel and distributed data mining: An introduction,” in Large-scale Parallel Data Mining. Springer, 2000, pp. 1–23.
 J. Liu, X. Jin, and Y. Wang, “Agentbased load balancing on homogeneous minigrids: Macroscopic modeling and characterization,” IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 7, pp. 586–598, 2005.
 P. Luo, K. Lü, Z. Shi, and Q. He, “Distributed data mining in grid computing environments,” Future Generation Computer Systems, vol. 23, no. 1, pp. 84–91, 2007.
 W. Rao, L. Chen, A. W. C. Fu, and G. Wang, “Optimal resource placement in structured peer-to-peer networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 7, pp. 1011–1026, 2010.
 Y. Xue, B. Li, and K. Nahrstedt, “Optimal resource allocation in wireless ad hoc networks: A pricebased approach,” IEEE Transactions on Mobile Computing, vol. 5, no. 4, pp. 347–364, 2006.
 L. Gkatzikis and I. Koutsopoulos, “Migrate or not? exploiting dynamic task migration in mobile cloud computing systems,” IEEE Wireless Communications, vol. 20, no. 3, pp. 24–32, 2013.
 Y. Jiang and J. Jiang, “Understanding social networks from a multiagent perspective,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 10, pp. 2743–2759, 2014.
 G. S. Almasi and A. Gottlieb, Highly Parallel Computing. Menlo Park, CA, USA, 1988.
 L. Zeng, L. Li, L. Duan, K. Lu, Z. Shi, M. Wang, W. Wu, and P. Luo, “Distributed data mining: a survey,” Information Technology and Management, vol. 13, no. 4, pp. 403–409, 2012.
 R. Dolbeau, S. Bihan, and F. Bodin, “HMPP: A hybrid multicore parallel programming environment,” in Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007), vol. 28, 2007.
 P. Fournier-Viger, C. W. Wu, and V. S. Tseng, “Mining maximal sequential patterns without candidate maintenance,” in International Conference on Advanced Data Mining and Applications. Springer, 2013, pp. 169–180.
 S. Hauck and A. DeHon, Reconfigurable Computing: The Theory and Practice of FPGA-based Computation. Elsevier, 2010, vol. 1.
 K. Beedkar, K. Berberich, R. Gemulla, and I. Miliaraki, “Closing the gap: Sequence mining at scale,” ACM Transactions on Database Systems (TODS), vol. 40, no. 2, p. 8, 2015.
 S. Hong, T. Oguntebi, and K. Olukotun, “Efficient parallel graph exploration on multicore CPU and GPU,” in Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on. IEEE, 2011, pp. 78–88.
 C. H. Lin, C. H. Liu, L. S. Chien, and S. C. Chang, “Accelerating pattern matching using a novel parallel algorithm on GPUs,” IEEE Transactions on Computers, vol. 62, no. 10, pp. 1906–1916, 2013.
 M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica, “Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. USENIX Association, 2012, pp. 2–2.
 K. A. Huck and A. D. Malony, “PerfExplorer: A performance data mining framework for large-scale parallel computing,” in Proceedings of the 2005 ACM/IEEE Conference on Supercomputing. IEEE Computer Society, 2005, p. 41.
 H. Mannila, H. Toivonen, and A. I. Verkamo, “Discovery of frequent episodes in event sequences,” Data mining and knowledge discovery, vol. 1, no. 3, pp. 259–289, 1997.
 J. Han, J. Pei, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M. Hsu, “PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth,” in Proceedings of the 17th International Conference on Data Engineering (ICDE), 2001, pp. 215–224.
 J. Wang and J. Han, “BIDE: Efficient mining of frequent closed sequences,” in Proceedings of the 20th International Conference on Data Engineering (ICDE). IEEE, 2004, pp. 79–90.
 X. Yan, J. Han, and R. Afshar, “CloSpan: Mining closed sequential patterns in large datasets,” in Proceedings of the 2003 SIAM International Conference on Data Mining. SIAM, 2003, pp. 166–177.
 J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M. C. Hsu, “FreeSpan: frequent pattern-projected sequential pattern mining,” in Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2000, pp. 355–359.
 M. J. Zaki, “SPADE: An efficient algorithm for mining frequent sequences,” Machine Learning, vol. 42, no. 1–2, pp. 31–60, 2001.
 F. Masseglia, P. Poncelet, and R. Cicchetti, “An efficient algorithm for web usage mining,” Networking and Information Systems Journal, vol. 2, no. 5/6, pp. 571–604, 2000.
 Z. Yang, Y. Wang, and M. Kitsuregawa, “LAPIN: effective sequential pattern mining algorithms by last position induction for dense databases,” in International Conference on Database systems for advanced applications. Springer, 2007, pp. 1020–1023.
 Z. Yang and M. Kitsuregawa, “LAPIN-SPAM: An improved algorithm for mining sequential pattern,” in Proceedings of the 21st International Conference on Data Engineering Workshops. IEEE, 2005, pp. 1222–1222.
 J. Pei, J. Han, B. Mortazavi-Asl, and H. Zhu, “Mining access patterns efficiently from web logs,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2000, pp. 396–407.
 M. Seno and G. Karypis, “LPMiner: An algorithm for finding frequent itemsets using length-decreasing support constraint,” in Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM). IEEE, 2001, pp. 505–512.
 ——, “SLPMiner: An algorithm for finding frequent sequential patterns using length-decreasing support constraint,” in Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM). IEEE, 2002, pp. 418–425.
 M. El-Sayed, C. Ruiz, and E. A. Rundensteiner, “FS-Miner: efficient and incremental mining of frequent sequence patterns in web logs,” in Proceedings of the 6th Annual ACM International Workshop on Web Information and Data Management. ACM, 2004, pp. 128–135.
 C. Ezeife, Y. Lu, and Y. Liu, “PLWAP sequential mining: open source code,” in Proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations. ACM, 2005, pp. 26–35.
 R. J. Bayardo Jr, “Efficiently mining long patterns from databases,” ACM Sigmod Record, vol. 27, no. 2, pp. 85–93, 1998.
 S. Song, H. Hu, and S. Jin, “HVSM: a new sequential pattern mining algorithm using bitmap representation,” in International Conference on Advanced Data Mining and Applications. Springer, 2005, pp. 455–463.
 J. Chen, “An up-down directed acyclic graph approach for sequential pattern mining,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 7, pp. 913–928, 2010.
 P. Fournier-Viger, J. C. W. Lin, A. Gomariz, T. Gueniche, A. Soltani, Z. Deng, and H. T. Lam, “The SPMF open-source data mining library version 2,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2016, pp. 36–40.
 C. Luo and S. M. Chung, “Efficient mining of maximal sequential patterns using multiple samples,” in Proceedings of the 2005 SIAM International Conference on Data Mining. SIAM, 2005, pp. 415–426.
 C. Kim, J. H. Lim, R. T. Ng, and K. Shim, “Squire: Sequential pattern mining with quantities,” Journal of Systems and Software, vol. 80, no. 10, pp. 1726–1745, 2007.
 A. Gomariz, M. Campos, R. Marin, and B. Goethals, “ClaSP: an efficient algorithm for mining frequent closed sequences,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2013, pp. 50–61.
 P. Fournier-Viger, C. W. Wu, A. Gomariz, and V. S. Tseng, “VMSP: Efficient vertical mining of maximal sequential patterns,” in Canadian Conference on Artificial Intelligence. Springer, 2014, pp. 83–94.
 F. Fumarola, P. F. Lanotte, M. Ceci, and D. Malerba, “CloFAST: closed sequential pattern mining using sparse and vertical id-lists,” Knowledge and Information Systems, vol. 48, no. 2, pp. 429–463, 2016.
 A. Mueller, “Fast sequential and parallel algorithms for association rule mining: A comparison,” Tech. Rep., 1998.
 J. S. Park, M. S. Chen, and P. S. Yu, “Efficient parallel data mining for association rules,” in Proceedings of the fourth international conference on Information and knowledge management. ACM, 1995, pp. 31–36.
 D. W. Cheung, J. Han, V. T. Ng, A. W. Fu, and Y. Fu, “A fast distributed algorithm for mining association rules,” in Proceedings of the Fourth International Conference on Parallel and Distributed Information Systems. IEEE, 1996, pp. 31–42.
 M. J. Zaki, S. Parthasarathy, M. Ogihara, W. Li et al., “New algorithms for fast discovery of association rules.” in KDD, vol. 97, 1997, pp. 283–286.
 S. Moens, E. Aksehirli, and B. Goethals, “Frequent itemset mining for big data,” in Proceedings of the 2013 IEEE International Conference on Big Data. IEEE, 2013, pp. 111–118.
 S. Rathee, M. Kaul, and A. Kashyap, “R-Apriori: an efficient Apriori based algorithm on Spark,” in Proceedings of the 8th Workshop on Ph.D. Workshop in Information and Knowledge Management. ACM, 2015, pp. 27–34.
 T. Shintani and M. Kitsuregawa, “Mining algorithms for sequential patterns in parallel: Hash based approach,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 1998, pp. 283–294.
 M. V. Joshi, G. Karypis, and V. Kumar, “Parallel algorithms for mining sequential associations: Issues and challenges,” Technical Report under preparation, Department of Computer Science, University of Minnesota, Minneapolis, Tech. Rep., 1999.
 V. Guralnik and G. Karypis, “Parallel tree-projection-based sequence mining algorithms,” Parallel Computing, vol. 30, no. 4, pp. 443–472, 2004.
 A. Demiriz, “webSPADE: A parallel sequence mining algorithm to analyze web log data,” in Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM). IEEE, 2002, pp. 755–758.
 X. Yu, J. Liu, X. Liu, C. Ma, and B. Li, “A MapReduce reinforced distributed sequential pattern mining algorithm,” in International Conference on Algorithms and Architectures for Parallel Processing. Springer, 2015, pp. 183–197.
 J. W. Huang, S. C. Lin, and M. S. Chen, “DPSP: Distributed progressive sequential pattern mining on the cloud,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2010, pp. 27–34.
 S. Qiao, C. Tang, S. Dai, M. Zhu, J. Peng, H. Li, and Y. Ku, “PartSpan: Parallel sequence mining of trajectory patterns,” in Proceedings of the Fifth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD’08), vol. 5. IEEE, 2008, pp. 363–367.
 C. H. Wu, C. C. Lai, and Y. C. Lo, “An empirical study on mining sequential patterns in a grid computing environment,” Expert Systems with Applications, vol. 39, no. 5, pp. 5748–5757, 2012.
 S. Qiao, T. Li, J. Peng, and J. Qiu, “Parallel sequential pattern mining of massive trajectory data,” International Journal of Computational Intelligence Systems, vol. 3, no. 3, pp. 343–356, 2010.
 I. Miliaraki, K. Berberich, R. Gemulla, and S. Zoupanos, “Mind the gap: Large-scale frequent sequence mining,” in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM, 2013, pp. 797–808.
 C. C. Chen, C. Y. Tseng, and M. S. Chen, “Highly scalable sequential pattern mining based on MapReduce model on the cloud,” in Proceedings of the 2013 IEEE International Congress on Big Data. IEEE, 2013, pp. 310–317.
 C. C. Chen, H. H. Shuai, and M. S. Chen, “Distributed and scalable sequential pattern mining through stream processing,” Knowledge and Information Systems, vol. 53, no. 2, pp. 365–390, 2017.
 S. Cong, J. Han, J. Hoeflinger, and D. Padua, “A sampling-based framework for parallel data mining,” in Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, 2005, pp. 255–265.
 D. Yu, W. Wu, S. Zheng, and Z. Zhu, “BIDE-based parallel mining of frequent closed sequences with MapReduce,” in International Conference on Algorithms and Architectures for Parallel Processing. Springer, 2012, pp. 177–186.
 V. C. C. Liao and M. S. Chen, “DFSP: a depth-first spelling algorithm for sequential pattern mining of biological sequences,” Knowledge and Information Systems, vol. 38, no. 3, pp. 623–639, 2014.
 Y. H. Liang and S. Y. Wu, “Sequence-growth: A scalable and effective frequent itemset mining algorithm for big data based on MapReduce framework,” in Proceedings of the 2015 IEEE International Congress on Big Data. IEEE, 2015, pp. 393–400.
 C. Luo and S. M. Chung, “Parallel mining of maximal sequential patterns using multiple samples,” The Journal of Supercomputing, vol. 59, no. 2, pp. 852–881, 2012.
 K. Wang, Y. Xu, and J. X. Yu, “Scalable sequential pattern mining for biological sequences,” in Proceedings of the thirteenth ACM international conference on Information and knowledge management. ACM, 2004, pp. 178–187.
 J. Ge, Y. Xia, and J. Wang, “Mining uncertain sequential patterns in iterative MapReduce,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2015, pp. 243–254.
 J. Ge and Y. Xia, “Distributed sequential pattern mining in large scale uncertain databases,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2016, pp. 17–29.
 J. Fowkes and C. Sutton, “A subsequence interleaving model for sequential pattern mining,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016, pp. 835–844.
 M. Sahli, E. Mansour, and P. Kalnis, “Parallel motif extraction from very long sequences,” in Proceedings of the 22nd ACM international conference on Information & Knowledge Management. ACM, 2013, pp. 549–558.
 O. K. Alkan and P. Karagoz, “CRoM and HuspExt: Improving efficiency of high utility sequential pattern extraction,” IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 10, pp. 2645–2657, 2015.
 T. P. Hong, C. S. Kuo, and S. C. Chi, “Mining fuzzy sequential patterns from quantitative data,” in Proceedings of the 1999 IEEE International Conference on Systems, Man, and Cybernetics (SMC’99), vol. 3. IEEE, 1999, pp. 962–966.
 G. C. Lan, T. P. Hong, and H. Y. Lee, “An efficient approach for finding weighted sequential patterns from sequence databases,” Applied Intelligence, vol. 41, no. 2, pp. 439–452, 2014.
 G. Ruan, H. Zhang, and B. Plale, “Parallel and quantitative sequential pattern mining for large-scale interval-based temporal data,” in Proceedings of the 2014 IEEE International Conference on Big Data. IEEE, 2014, pp. 32–39.
 M. Zihayat, Z. Z. Hu, A. An, and Y. Hu, “Distributed and parallel high utility sequential pattern mining,” in Proceedings of the 2016 IEEE International Conference on Big Data. IEEE, 2016, pp. 853–862.
 M. Muzammal and R. Raman, “Mining sequential patterns from probabilistic databases,” Knowledge and Information Systems, vol. 44, no. 2, pp. 325–358, 2015.
 M. Hassani and T. Seidl, “Towards a mobile health context prediction: Sequential pattern mining in multiple streams,” in Mobile Data Management (MDM), 2011 12th IEEE International Conference on, vol. 2. IEEE, 2011, pp. 55–57.
 M. B. Harries, C. Sammut, and K. Horn, “Extracting hidden context,” Machine learning, vol. 32, no. 2, pp. 101–126, 1998.
 S. Memeti and S. Pllana, “Accelerating DNA sequence analysis using Intel Xeon Phi,” in Proceedings of the 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 3. IEEE, 2015, pp. 222–227.
 K. Wang, E. Sadredini, and K. Skadron, “Sequential pattern mining with the micron automata processor,” in Proceedings of the ACM International Conference on Computing Frontiers. ACM, 2016, pp. 135–144.
 H. Cheng, X. Yan, and J. Han, “IncSpan: incremental mining of sequential patterns in large database,” in Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004, pp. 527–532.
 B. Zhang, C. W. Lin, W. Gan, and T. P. Hong, “Maintaining the discovered sequential patterns for sequence insertion in dynamic databases,” Engineering Applications of Artificial Intelligence, vol. 35, pp. 131–142, 2014.
 J. W. Huang, C. Y. Tseng, J. C. Ou, and M. S. Chen, “A general model for sequential pattern mining with a progressive database,” IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 9, pp. 1153–1167, 2008.
 T. Zhu, G. Li, W. Zhou, and S. Y. Philip, “Differentially private data publishing and analysis: a survey,” IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 8, pp. 1619–1638, 2017.
 N. Tatti and J. Vreeken, “The long and the short of it: summarising event sequences with serial episodes,” in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012, pp. 462–470.
 A. Bhattacharyya and J. Vreeken, “Efficiently summarising event sequences with rich interleaving patterns,” in Proceedings of the 2017 SIAM International Conference on Data Mining. SIAM, 2017, pp. 795–803.
 Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
 R. J. Williams and D. Zipser, “A learning algorithm for continually running fully recurrent neural networks,” Neural computation, vol. 1, no. 2, pp. 270–280, 1989.
 S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.