Personalized Next Point-of-Interest Recommendation via Latent Behavior Patterns Inference
Abstract
In this paper, we address the problem of personalized next point-of-interest (POI) recommendation, which has become an important and very challenging task for location-based social networks (LBSNs) but has not yet been well studied. Based on the conjecture that humans exhibit distinct mobility patterns under different contextual scenarios, we jointly model next POI recommendation under the influence of users' latent behavior patterns. We propose to adopt a third-rank tensor to model successive check-in behaviors. By integrating categorical influence into mobility patterns and aggregating a user's spatial preference on a POI, the proposed model handles the next new POI recommendation problem by nature. By incorporating a softmax function to fuse the personalized Markov chain with the latent patterns, we furnish a Bayesian Personalized Ranking (BPR) approach and derive the optimization criterion accordingly. Expectation Maximization (EM) is then used to estimate the model parameters. We further develop a personalized model that takes into account personalized mobility patterns under each contextual scenario to improve the recommendation performance. Extensive experiments on two large-scale LBSN datasets demonstrate significant improvements of our model over several state-of-the-art methods.
I Introduction
Online social networks allow hundreds of millions of Internet users worldwide to access vast amounts of information on an unprecedented scale. In recent years, there has been an increased emphasis on developing location-based social networks (LBSNs), such as Foursquare (https://foursquare.com), Gowalla, Facebook Places (https://www.facebook.com/places), and GeoLife, where users can check in at venues online and share their experiences of points of interest (POIs) in the physical world via their mobile devices. For example, as of February 2017, Foursquare had recorded 10 billion check-ins at 93 million POIs, contributed by more than 50 million Foursquare users worldwide (https://foursquare.com/about). This so-called check-in behavior has become a new part of modern life and can be used to study the life patterns of millions of LBSN users. POI recommendation, which provides users with recommendations of places, is one of the most important tasks in LBSNs. It has attracted much attention because it not only improves user stickiness for LBSN service providers but also gives advertising agencies an effective way of launching advertisements that target potential clients. In a broader context, accurate POI recommendation is essential for urban computing [1], behavior informatics [2], and the control of the spread of infectious diseases [3].
POI recommendation has become a popular research topic and has attracted much effort from both academia and industry [4, 5]. Yet achieving accurate personalized POI recommendation is challenging because the data available for each user is highly sparse. The sparsity arises because users check in on a voluntary basis: “diligent” users who check in on an LBSN at every venue they visit in the physical world are in fact rare. Collaborative filtering (CF) is widely adopted in recommender systems, and many CF models have been proposed in the literature to learn users' preferences for POIs from check-in data. CF methods can be divided into two categories, namely memory-based CF and model-based CF. Memory-based CF can be further divided into two subcategories, namely user-based CF and item-based CF. Memory-based CF methods suffer considerably from the data sparsity problem, since the user-user or item-item similarities must be calculated from common check-ins. Ye et al. [6] adopt linear interpolation to incorporate both social and geographical influences into the user-based CF framework for POI recommendation. Their experimental results show that user-based CF outperforms item-based CF for POI recommendation, and that incorporating geographical influence into the user-based CF model significantly improves recommendation performance, while social influence has little impact. Model-based CF builds models on user ratings using data mining techniques, such as matrix factorization (MF), to perform recommendations. Based on the observation that an individual's check-in locations usually cluster around several centers, Cheng et al. [7] introduce a multi-center Gaussian model to infer the geographical influence and then combine it heuristically with MF for POI recommendation.
However, the aforementioned works overlook the sequential information in check-ins, which is very important for POI recommendation because human movements often exhibit sequential patterns. Moreover, a good POI recommender should be able to provide prompt recommendations with respect to a user's current location.
Hence, unlike the aforementioned works, some researchers have considered the task of next POI recommendation (also called successive POI recommendation) in LBSNs. This is an even harder task, since it requires accurately predicting a user's very next move among tens of thousands of candidate locations. The challenge of next POI recommendation stems from the following two reasons. First, a check-in can be considered a type of implicit feedback, which differs from conventional 5-star rating data that explicitly denote “like” or “dislike” of an item through different rating scores. Check-ins offer only positive examples of what a user likes, and the POIs without check-ins are a mixture of real negative feedback (the location is unattractive to the user) and missing values (the user might visit the location in the future). Thus, the recommender system has to infer user preferences from implicit feedback data, which makes next POI recommendation very tough. Second, the check-in data for personalized successive interactions is very sparse. Users' visiting history is often transformed into a user-POI check-in matrix whose sparsity is dramatically higher than that of the user-item rating matrix in the Netflix data [8]. Moreover, because we adopt a third-rank tensor to model successive check-in behaviors, the user-POI check-in matrix needs to be separated and represented as a user-current POI-next POI tensor. This makes the data even sparser: the check-in tensors of the Foursquare datasets for Los Angeles and New York City and of the Gowalla dataset are all extremely sparse, which makes the next POI recommendation task more difficult.
Human mobility is well known for its periodic property [9, 10, 11]. For next POI recommendation, we focus more on the transition periodicity of location categories. For example, people may regularly stop by coffee shops, e.g., Starbucks stores, to grab a cup of coffee on their way to work in the morning, which can be explained as a periodic transition pattern from coffee shop to workplace on weekday mornings. The next POI is also highly likely to be related to the previous POI. For example, after taking part in intense outdoor activities, e.g., hiking or running, some users may prefer to have high-protein meals in restaurants such as a Steakhouse rather than a Juice Bar. Fig.1(a) and Fig.1(b) plot the check-in probabilities of the top-4 most popular location categories over time of day (hours) and day of week, respectively, based on the LA check-in data collected from Foursquare (the data are described in detail in Section III). The categorical mobility periodicity is very obvious, e.g., workplaces are often checked in at on weekdays. Another interesting observation is that check-ins at nightspots occur most often on Friday and least often on Sunday. Fig.2 plots the transition probabilities between categories by day of week. We observe that the transition preference shown in Fig.2(a) is significantly different from that of Fig.2(g) but somewhat similar to that of Fig.2(b), which indicates that there exist several latent transition patterns and that such patterns may play a key role in next POI recommendation. However, such patterns are learned from all users' visit histories (i.e., they are global patterns) and thus lack personalization, as users may exhibit distinct latent transition patterns under the same contextual scenario. To support personalized latent behavior patterns, we extend the proposed global model to accommodate a personalized pattern distribution.
In addition, the personalized model provides much more flexibility and interpretability than the global model, given sufficient check-in history and rich contextual features. For example, as shown in Fig.2, the transition preference of LA users is different from that of NYC users.
In addition to providing personalized recommendations of next POIs, our proposed model also recommends new POIs that users may be interested in but have not visited before. More specifically, the next new POI recommendation problem is to recommend, based on the user's historical check-ins and given the user's current location, new POIs to be visited next. This task has recently become increasingly popular and useful, since it not only helps users explore interesting new places in the city but also creates opportunities for businesses to increase their revenues by discovering and attracting potential customers. Thus, we further consider the task of next new POI recommendation in LBSNs, which is much harder than standard successive POI recommendation, as it is challenging to infer user preference for potential new POIs from unobserved transitions based on sparse historical data.
In fact, we observe that a large fraction of the POIs visited in both datasets are new (see Section III for more details), which implies that the task of offering new POIs to users is important. Meanwhile, by exploring the proportion of new POIs for the top-4 most popular location categories, as shown by the statistics in Fig.2, we find that users are more likely to visit new places at the early stage in all categories and that the ratio of new POIs differs across categories. We also observe that the Nightlife Spot category has the lowest ratio of new POIs, which indicates that if a user visits a nightspot for the first time, there is a higher chance that she will return and check in again. These observations suggest that users follow several latent behavior patterns and that these patterns are important for next new POI recommendation. However, traditional recommender systems cannot deal with the next new POI recommendation problem, because they only provide the user's routinely visited locations as candidates for her next movement. In contrast, our proposed model can handle next new POI recommendation by nature, since it integrates categorical influence into the pattern distribution and derives the spatial preference of users for new POIs to predict the users' transition probabilities to those POIs.
In this paper, we jointly model next POI recommendation under the influence of users' latent behavior patterns. We also observe that users often visit new POIs that they have not visited before, and the proposed model is able to recommend new POIs to be visited next given a user's current location. We propose to adopt a third-rank tensor to model successive check-in behaviors. By incorporating a softmax function to fuse the personalized Markov chain with the influence of the aforementioned latent patterns, we furnish a Bayesian Personalized Ranking (BPR) [12] approach and derive the optimization criterion accordingly. In the model learning phase, Expectation Maximization (EM) [13] is used to estimate the model parameters.
The main contributions of this paper can be summarized as follows:

We propose a unified tensor-based latent model that fuses the observed successive check-in behavior with the latent behavior preference of each user to address the personalized next POI recommendation problem. The corresponding optimization criterion and the learning steps and tricks have been carefully studied.

We evaluate the proposed model through detailed experiments on two large-scale LBSN datasets and demonstrate that our method outperforms other state-of-the-art POI recommendation approaches by a large margin.
II Related Work
Location recommendation has received intensive attention recently due to its wide range of potential applications. It has been studied on GPS trajectory logs of hundreds of monitored users [14]. With easy access to users' check-in data in LBSNs, many recent studies have addressed POI recommendation; they can be roughly classified into four categories:
1) Time-aware POI recommendation, which mainly leverages the temporal influence on POIs to enhance recommendation performance. Yuan et al. assume that users tend to visit different locations at different times and propose a time-aware POI recommendation algorithm. Specifically, their approach extends user-based POI recommendation by leveraging the time factor when computing the similarity between two users and by considering the historical check-ins at a given time, rather than at all times, to make POI recommendations [15]. Gao et al. investigated the temporal cyclic patterns of user check-ins in terms of temporal non-uniformness and temporal consecutiveness [16]. Yin et al. proposed a temporal recommender system and modeled user behavior based on intrinsic interest as well as the temporal context [17]. Zhao et al. proposed a spatial-temporal latent ranking method that recommends the most probable successive POIs by designing a time indexing scheme to smoothly encode time stamps into particular time ids and then incorporating the time ids into the proposed model [18].
2) Geographical-influence-enhanced POI recommendation, which exploits the “geographical clustering phenomenon” of check-in activities to improve POI recommendation [19]. Liu et al. proposed a geographical probabilistic factor analysis framework for POI recommendation by combining geographical influence with Bayesian non-negative matrix factorization (BNMF). Specifically, they used a Gaussian distribution to represent a POI over a sampled region and BNMF to capture user preference from check-in data. Ye et al. delved into POI recommendation by investigating the geographical influences among locations and proposed a system that combines user preferences, social influence, and geographical influence [6].
3) Content-aware POI recommendation, which detects users' current locations by analyzing their published tweets, or ranks POIs by analyzing users' comments on them, to alleviate the problem of data sparsity. Chen et al. build a detection model to mine user interest from short text and establish the mapping between location function and user interest [20]. Gao et al. incorporated both POI-associated content and user sentiment information (e.g., user comments) into POI recommendation and reported good performance [4]. However, semantic analysis remains very challenging, as most comments in LBSNs are short and contextually ambiguous.
4) Social-influence-enhanced POI recommendation, which is inspired by the intuition that friends in LBSNs tend to have more common interests. By inferring social relations, the quality of recommendation can be enhanced. However, opinions on leveraging social influence differ in the literature: previous studies report that a large number of friends share no POIs in common [21], while E. Cho et al. find that long-distance travel is more influenced by social relations [11].
Some very recent works have incorporated group behaviors into recommender systems to enhance performance. T. Yuan et al. proposed a Group-Sparse Matrix Factorization (GSMF) approach that factorizes the rating matrices for multiple behaviors into a user and item latent factor space [5]. H. Wang et al. proposed a group-based algorithm for POI recommendation [22] that groups users of similar interests based on the category hierarchy of their frequently visited locations. Chen et al. proposed a novel two-step approach for personalized successive POI recommendation: first, group-based category recommendation, which uses a group-based tensor model to predict the location category preference; then, category-based location recommendation, which uses a distance-weighted HITS algorithm to rank the locations under a selected location category [23].
Recently, researchers have started to pay attention to exploiting deep networks for recommender systems. One line of research integrates visual signals into personalized recommendation by utilizing visual features extracted from images with a (pre-trained) deep convolutional neural network (Deep CNN) [24]. Another line of work employs recurrent neural networks (RNNs) for location recommendation, modeling spatial and temporal contexts in each layer with time-specific and distance-specific transition matrices [25].
Next POI recommendation is a newly emerging and even more challenging task. Only a few works exist in the literature, and the sequential influence between successive check-ins is not yet well studied. S. Feng et al. proposed a personalized ranking metric embedding method (PRME) to model personalized check-in sequences for next new POI recommendation [26]. C. Cheng et al. proposed a tensor-based FPMC-LR model that considers the order relationship between visits [27]. However, the periodicity of check-in data and the categorical influence are not well studied. Moreover, to deal with data sparsity, the candidate set of POIs is filtered by simply removing the venues far from the previously checked-in POI. The resulting smaller set lowers the computation cost, but at the expense of neglecting users whose check-in behavior patterns differ from those of the majority and of failing to predict faraway POIs.
III Data Description and Characteristics of Check-ins
Before introducing the proposed approach, we first introduce the real-world LBSN datasets used in this paper and then conduct an empirical analysis of them to explore the spatial influence, check-in counts, temporal influence, and exploration of new locations in users' successive check-in behaviors.
III-A Datasets
We choose three large-scale datasets from two real-world LBSNs, Foursquare and Gowalla, to conduct the experiments. The Foursquare check-in data cover Los Angeles and New York City and are provided by [28], while the Gowalla dataset is from [7] and is a complete snapshot. For all datasets, we removed the users who checked in on the LBSN fewer than 10 times (note that the categorical information of POIs is not included in the Gowalla dataset). We split each dataset into two non-overlapping sets: for each user, the earliest 80% of check-ins form the training set, and the remaining 20% form the test set used to evaluate the performance of different algorithms. The transition tensors of Foursquare-LA, Foursquare-NYC, and Gowalla are all extremely sparse. The statistics of the datasets are listed in Table I.
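The per-user chronological split and the notion of transition-tensor density described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the function names, the `(user, poi, timestamp)` record layout, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def chronological_split(checkins, train_frac=0.8):
    """Split (user, poi, timestamp) records per user: earliest part -> training."""
    by_user = defaultdict(list)
    for user, poi, ts in checkins:
        by_user[user].append((ts, poi))
    train, test = {}, {}
    for user, events in by_user.items():
        events.sort()  # chronological order
        cut = int(len(events) * train_frac)
        train[user] = [poi for _, poi in events[:cut]]
        test[user] = [poi for _, poi in events[cut:]]
    return train, test


def transition_tensor_density(sequences, n_pois):
    """Share of (user, current POI, next POI) cells with an observed transition."""
    observed = set()
    for user, seq in sequences.items():
        for cur, nxt in zip(seq, seq[1:]):
            observed.add((user, cur, nxt))
    total = len(sequences) * n_pois * n_pois
    return len(observed) / total if total else 0.0


# Toy records, invented purely for illustration.
toy = [
    ("u1", "a", 1), ("u1", "b", 2), ("u1", "c", 3), ("u1", "a", 4), ("u1", "b", 5),
    ("u2", "c", 1), ("u2", "a", 2), ("u2", "c", 3), ("u2", "b", 4), ("u2", "a", 5),
]
train, test = chronological_split(toy)
density = transition_tensor_density(train, n_pois=3)
```

Even in this tiny example only a third of the tensor cells are observed; with tens of thousands of POIs the number of cells grows quadratically per user, which is why the real tensors are so sparse.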
III-B Spatial Influence
We compute the geographical distance between two consecutive check-ins and plot its cumulative distribution function (CDF) in Fig.4, which shows that users' movement is restricted by spatial influence. More specifically, about 60% of Foursquare-LA successive check-ins, over 65% of Foursquare-NYC, and over 70% of Gowalla happened within 10 km of the last check-in. When the distance increases to 100 km, the successive check-ins account for over 80% for Foursquare-LA and Foursquare-NYC and over 90% for Gowalla. The CDF curve increases quickly when the distance is small, which suggests that users' movements mostly occur within a localized region. This observation is reasonable, since most users generally move periodically within a bounded region and only occasionally travel long distances. That is, successively checked-in POIs are generally spatially correlated, and nearby POIs have stronger geographical correlations than POIs that are far from each other. Thus, a user's preference for a POI decreases as the geographic distance increases.
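The distance analysis above can be reproduced along the following lines. The sketch assumes the great-circle (haversine) distance between geocoded check-ins, which the text does not specify, and uses hypothetical helper names and an invented trajectory.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def successive_distances(trajectory):
    """Distances between consecutive (lat, lon) check-ins of one user."""
    return [
        haversine_km(a[0], a[1], b[0], b[1])
        for a, b in zip(trajectory, trajectory[1:])
    ]


def fraction_within(distances, threshold_km):
    """Empirical CDF value: share of distances not exceeding the threshold."""
    if not distances:
        return 0.0
    return sum(d <= threshold_km for d in distances) / len(distances)


# Invented trajectory: a short hop followed by a long trip.
trajectory = [(34.05, -118.25), (34.06, -118.25), (35.06, -118.25)]
dists = successive_distances(trajectory)
share_within_10km = fraction_within(dists, 10.0)
```

Evaluating `fraction_within` over a grid of thresholds yields the kind of CDF curve discussed in the text.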
III-C Check-in Counts
Fig.5 shows the check-in counts for each POI on the three datasets. Apart from a few frequently visited POIs such as home and office, most POIs are visited fewer than 4 times; these account for 90%, 90%, and 80% of all visited POIs for Foursquare-LA, Foursquare-NYC, and Gowalla, respectively. We also observe that over 23% of Foursquare POIs and 35% of Gowalla POIs are checked in at more than once, which suggests that users' check-in activity exhibits a periodic pattern. This observation motivates us to exploit transition periodicity for POI recommendation. From the observations in Fig.4 and Fig.5, we find that most POIs are visited occasionally and within a short distance of the previous check-in, which indicates that users' next movements are influenced by their current locations. Hence, the proposed model treats POI recommendation as time-critical and recommends next POIs based on users' current locations.
III-D Temporal Influence
Fig.4 shows the cumulative distribution function (CDF) of the time interval between two sequential check-ins. More than 60% of successive check-ins occur within 16 hours in both datasets. Fewer than 20% of successive Foursquare check-ins and 10% of successive Gowalla check-ins are more than 64 hours apart, which indicates that successive check-ins within a shorter time interval are more strongly correlated. By further studying the categories of a user's successive POIs within a short interval, we find that there is a strong correlation between them. As shown in Fig.3, Food is typically visited after Shop, as users tend to have dinner after shopping. These observations show that successive check-ins exhibit a personalized Markov chain property, so we utilize the transition probability to solve the task of personalized successive POI recommendation.
III-E Exploration for New Locations
Fig.5 shows the ratio of new POIs over all users on the three datasets over time. For example, the ratio at the 20% time scale is the proportion of POIs visited after the latest 20% of check-ins that have not been visited in the previous days. The ratio of new POIs is high on both datasets (most of the ratios are above 0.3), which means that there is an over 30% chance that a user will explore new POIs. More specifically, about 55% of Foursquare check-ins and about 35% of Gowalla check-ins are at new POIs at the 10% time scale, and when the time scale increases to 90%, new POIs account for about 80% of Foursquare check-ins and about 60% of Gowalla check-ins. This observation suggests that Foursquare users are more inclined to explore new POIs than Gowalla users. Note that users not only like to explore new POIs but also commute among a few routinely visited locations, as shown in Fig.5. Hence, a good POI recommender should be capable of both predicting the periodicity of mobility and meeting users' expectation to explore new POIs.
IV Problem Definition
We consider a set of LBSN users and a set of locations, also called POIs, where each location is geocoded by {longitude, latitude}. For each user, we keep the set of POIs visited by that user before a given time. A contextual scenario c is described by a contextual feature vector. The contextual features include the previous location, time of day, day of week, the previous location's category, etc., and F denotes the number of features. Assuming that there is a fixed number of latent behavior patterns determined by contextual scenarios, the pattern distribution is a vector of probabilities that sum to one, where each entry denotes the probability that the contextual scenario c belongs to the corresponding latent pattern. Under the conjecture that check-in behaviors are governed by pattern-level preferences, the probability distribution over next POIs is the mixture of the pattern-level preferences for those POIs. Our goal is to estimate the pattern distribution and the pattern-level preferences, so as to recommend the top-N venues to the user for his or her next move by combining the obtained pattern-level preferences.
V Proposed Method
Our proposed model recommends personalized next POIs by ranking the probabilities that a user will move from the current location to each candidate next location. Based on the first-order Markov chain property, this probability is given as:
(1) 
where c denotes the contextual scenario. Thus, each user is associated with a specific transition matrix, and together these matrices form a transition tensor in which each entry represents the observed transition record of a user from one location to the next. To further boost the recommendation performance, we study both personal preference and spatial preference.
Personal Preference. A general linear factorization model for estimating the transition tensor is the Tucker Decomposition (TD):
(2) 
where the decomposition consists of a core tensor and three feature matrices: one for the users, one for the locations in the last transition, and one for the next locations. As the transitions are only partially observed, we adopt a low-rank factorization model, a special case of Canonical Decomposition that models the pairwise interactions between all three modes of the tensor (i.e., user, current location, and next location), to fill in the missing information, given as:
(3) 
where each pairwise term is the inner product of two latent factor vectors, for example those for users and next locations; the other notations are defined similarly. The term that models the interaction between the user and the current location can be removed, since it is independent of the next location and does not affect the ranking result, as shown in [29]. This yields a more compact expression:
(4) 
An advantage of this model over TD is that its prediction and learning complexity is much lower. Furthermore, even though TD subsumes the pairwise interaction model, standard regularized estimation procedures have difficulty identifying such a model [30].
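The pairwise interaction idea can be illustrated with a small sketch in the spirit of the factorized Markov chain model of [29]. The score below keeps only the user/next-location and current-location/next-location inner products; the function names, the factor layout, and the toy vectors are assumptions for illustration and are not taken from the paper. The example also checks that adding any quantity that does not depend on the next location leaves the ranking unchanged, which is the argument used above for dropping the user/current-location term.

```python
def dot(a, b):
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def pairwise_score(user_vec, next_vec_user_side, cur_vec, next_vec_cur_side):
    """User/next-location interaction plus current/next-location interaction."""
    return dot(user_vec, next_vec_user_side) + dot(cur_vec, next_vec_cur_side)


def rank_next(user_vec, cur_vec, cand_user_side, cand_cur_side, offset=0.0):
    """Rank candidate next POIs by score; `offset` mimics a term independent of the candidate."""
    scores = {
        poi: pairwise_score(user_vec, cand_user_side[poi], cur_vec, cand_cur_side[poi]) + offset
        for poi in cand_user_side
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy latent factors (hypothetical values, two dimensions).
user_vec = [1.0, 0.0]
cur_vec = [0.0, 1.0]
cand_user_side = {"cafe": [0.9, 0.1], "gym": [0.2, 0.3], "bar": [0.5, 0.5]}
cand_cur_side = {"cafe": [0.0, 0.1], "gym": [0.4, 0.9], "bar": [0.1, 0.2]}

ranking = rank_next(user_vec, cur_vec, cand_user_side, cand_cur_side)
ranking_with_offset = rank_next(user_vec, cur_vec, cand_user_side, cand_cur_side, offset=3.7)
```

Each score costs a number of multiplications linear in the factor dimension, whereas a Tucker-style score would also sum over the entries of a core tensor.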
Spatial Preference. As observed in [11], human mobility is constrained geographically by the distance one can travel within a day, and a user's preference for visiting a location decreases as the geographic distance increases. Moreover, most POIs are likely to be explored near users' residences, workplaces, and frequently visited POIs. Fig.6 shows the relation (in log scale) between the check-in counts and the distance between two successive checked-in locations for the Foursquare data and the Gowalla data, respectively. The relation follows a power law distribution, and the venues that a user has checked in at are geographically dense. Unlike existing works, which simply remove locations from the candidate list based on a predefined distance threshold, we exploit the distance constraint by defining a power law distribution as the spatial preference of a user to visit a POI that is a given number of kilometers away, as follows:
(5) 
where and are parameters of the power law distribution.
After taking the logarithm of both sides of Eq.(5), the resulting linear function can be easily learned by least-squares regression.
(6) 
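As a rough sketch of this fitting step, the snippet below fits a power law of the assumed form count = scale * distance^exponent by ordinary least squares on the log-log values. The parameterization, the function name, and the synthetic data are assumptions; the paper's own symbols and sign convention for the two parameters are not reproduced here.

```python
import math


def fit_power_law(distances, counts):
    """Least-squares fit of log(count) = log(scale) + exponent * log(distance)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx            # slope of the fitted line
    intercept = mean_y - exponent * mean_x
    return math.exp(intercept), exponent


# Synthetic, noise-free data generated from count = 100 * distance ** -1.5.
toy_distances = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
toy_counts = [100.0 * d ** -1.5 for d in toy_distances]
scale, exponent = fit_power_law(toy_distances, toy_counts)
```

On real check-in data the distances would first be binned and the counts per bin used as the response, with zero-count bins excluded before taking logarithms.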
Note that, in learning the two parameters, check-ins with a distance larger than 50 km are not considered, as these check-ins represent fewer than 29.5%, 26%, and 9.8% of the total number of check-ins in the Foursquare-LA, Foursquare-NYC, and Gowalla datasets, respectively. As a result, one parameter is estimated as 10.5 for Foursquare-LA, 11.0 for Foursquare-NYC, and 11.5 for Gowalla, and the other as 1.25 for Foursquare-LA, 1.45 for Foursquare-NYC, and 1.37 for Gowalla. Empirically, we set the former to 11 and the latter to 1, so that the spatial preference is represented as:
(7) 
Combining these two types of preference linearly, we have an updated transition probability estimation, given as:
(8) 
where the trade-off parameter fuses the two preferences. The parameter a of the power law distribution is absorbed into the trade-off parameter, since the optimal setting of the latter is learned during the model inference phase. Thus, even locations far away from the previously checked-in location have a chance of being recommended when personal preference dominates, so that some occasional long journeys can be predicted.
V-A Incorporating Pattern-Level Preference
Under the assumption that user mobility can be classified into several latent behavior patterns, each pattern has a distinct impact on a user's transition preference, which indicates that users' transition probabilities are pattern-sensitive. Here, we propose a novel model that introduces an intermediate layer of latent patterns to capture the pattern-level preference in POI recommendation. A latent variable indicates the pattern-level influence. The joint probability of the transition and the latent variable is represented as:
(9) 
where the mixing coefficient is the probability that the contextual scenario c belongs to the corresponding latent pattern. The pattern-level preference can be defined as:
(10)  
By marginalizing out the latent variable, the corresponding transition probability can be written as follows:
(11) 
Fig.7 gives a graphical illustration of our proposed model. The upper tensor contains the historical check-in data, i.e., the transition tensor, where an entry is labeled “1” if we observe a transition between the two locations for a user, and “?” otherwise. Each user, however, may have a distinct pattern-level preference under each pattern, and each entry of the lower tensors denotes a pattern-level transition probability. Note that the transition tensor is a mixture of the pattern-level transition tensors, weighted by the mixing coefficients. Our goal is then to infer the pattern-level transition probabilities and the pattern distribution, so as to recover the unobserved transition preferences by fitting the model.
We adopt a softmax function to infer the mixing coefficients of the multiple patterns. Each feature has a weight for each latent pattern, and a normalization factor scales the exponential function so that the result is a proper probability distribution. In this representation, the contextual scenario c is denoted by a bag of features, where F is the number of features. By plugging the softmax function into Eq.(11), the transition probability is rewritten as:
(12) 
Because the learned pattern distribution is identical for all users, regardless of personal differences, this model is denoted as the global pattern distribution model (GPDM) in this paper.
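The softmax-weighted mixture described in this subsection can be sketched as follows. The snippet assumes a linear score per pattern (feature weights times feature values) and, for simplicity, pattern-level preferences that are already normalized probabilities over candidate POIs; both are illustrative assumptions, and all names and numbers are hypothetical.

```python
import math


def pattern_distribution(features, weights):
    """Softmax over per-pattern linear scores; weights[k][f] is the weight of feature f for pattern k."""
    logits = [sum(w * x for w, x in zip(w_k, features)) for w_k in weights]
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_transition(mixing, pattern_probs):
    """Mix pattern-level distributions over candidate POIs using the mixing coefficients."""
    n_candidates = len(pattern_probs[0])
    return [
        sum(pi_k * probs[j] for pi_k, probs in zip(mixing, pattern_probs))
        for j in range(n_candidates)
    ]


# Toy contextual scenario with F = 3 binary features and K = 2 patterns.
features = [1.0, 0.0, 1.0]
weights = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
mixing = pattern_distribution(features, weights)

# Toy pattern-level preferences over two candidate POIs.
pattern_probs = [[0.8, 0.2], [0.2, 0.8]]
transition = mixture_transition(mixing, pattern_probs)
```

Because the feature weights are shared by all users, two users in the same contextual scenario receive the same mixing coefficients, which is the defining property of the global model.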
V-B Optimization Criterion
The task of next POI recommendation is to recommend the top-N POIs to users, so we learn the parameters by learning the ranking order of successive check-in probabilities. Because we care more about the ranking order of the candidate POIs than about the actual values of the check-in probabilities, we model the task as a ranking over locations, based on a personalized ranking score for the transition from the current location to a candidate next location for a given user under a given pattern.
(13) 
Eq.(13) indicates that, under a given pattern, the user prefers one candidate next location to another.
Next, we derive the sequential Bayesian Personalized Ranking (S-BPR) optimization criterion, which is similar to the general BPR approach [12]. For a user influenced by a pattern-level preference, the best ranking can be modeled as:
(14) 
where is the set of model parameters, i.e. .
Then we estimate the model by maximizing the posterior, under the assumption that users and their check-in histories are independent:
(15) 
The ranking probability can be further expressed as:
(16) 
Similar to [29], we use the logistic function to approximate the likelihood that a user prefers one candidate location over another:
(17) 
By assuming that the prior of the model parameters follows a Gaussian distribution, the MAP estimate is given as:
(18) 
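As a rough illustration of the resulting objective, the sketch below evaluates a BPR-style MAP criterion for a handful of score pairs. It assumes a zero-mean Gaussian prior, which turns into an L2 penalty, and it takes the ranking scores as given numbers; the function names, the `reg` coefficient and the example values are hypothetical and are not taken from the paper.

```python
import math

def log_sigmoid(x):
    """Numerically stable log of the logistic function."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def bpr_map_objective(score_pairs, params, reg):
    """Log-posterior up to a constant; larger is better.

    score_pairs: (preferred_score, other_score) tuples, one per observed
        preference of one location over another.
    params: flat list of model parameters (assumed zero-mean Gaussian prior).
    reg: assumed regularization coefficient derived from the prior variance.
    """
    log_likelihood = sum(log_sigmoid(p - o) for p, o in score_pairs)
    penalty = reg * sum(t * t for t in params)
    return log_likelihood - penalty

good = bpr_map_objective([(2.0, 0.0)], [0.1, -0.2], reg=0.03)
bad = bpr_map_objective([(0.0, 2.0)], [0.1, -0.2], reg=0.03)
tie = bpr_map_objective([(1.0, 1.0)], [], reg=0.0)
```

A correctly ordered pair (`good`) yields a larger objective than a reversed pair (`bad`), which is what drives the parameters toward the observed ranking.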
V-C Model Inference
Furthermore, can also be estimated by maximizing the following logscale objective function:
(19) 
Here, we adopt the Expectation Maximization (EM) algorithm [31] to estimate the model parameters.
In the E-step, the posterior distribution of is given as:
(20) 
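For intuition, a generic mixture-model E-step can be sketched as follows. This is the textbook form (posterior proportional to prior times likelihood) and is not claimed to be the exact expression of Eq.(20); the function name `e_step` and the example numbers are hypothetical.

```python
def e_step(pattern_prior, pattern_likelihood):
    """Posterior over latent patterns for one observed preference.

    pattern_prior[k]: probability of pattern k given the context
        (e.g. the softmax output).
    pattern_likelihood[k]: likelihood of the observed preference
        under pattern k (e.g. a logistic function of a score difference).
    """
    joint = [p * l for p, l in zip(pattern_prior, pattern_likelihood)]
    total = sum(joint)
    if total == 0.0:
        # No pattern explains the observation; fall back to the prior.
        return list(pattern_prior)
    return [j / total for j in joint]

posterior = e_step([0.5, 0.3, 0.2], [0.9, 0.5, 0.1])
```

The M-step would then re-estimate the parameters with each observation weighted by these posterior values.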
V-D Personalized Pattern Distribution Model
In the global model introduced in Subsection A, a fixed pattern distribution is learned from the global perspective to optimize the overall performance for all users. However, the best pattern distribution for a given user is not always the best for others. Users of LBSNs are extremely diverse in properties such as age, gender, home city and occupation, so different users may have different pattern distributions under the same contextual scenario.
Instead of inferring a fixed pattern distribution for every user, personalized pattern distribution model (PPDM) infers pattern distributions for each user, i.e. , as shown in Fig.8. It is noted that the mixing coefficients, i.e. , are personalized. By inferring as the personalized weight associated with the feature for latent pattern , the corresponding transition probability for PPDM can be rewritten as follows:
(23) 
and the optimization function of in Eq.(VC) is rewritten as:
(24) 
The formulations for updating the parameters are the same as in Eq.(VC) and the parameter updating rule of in Line 16 of Algorithm 1 is rewritten as:
(25) 
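The difference from the global model can be illustrated with per-user feature weights. The sketch below is a hypothetical example: the dictionary `user_weights`, the user names and the weight values are made up, and the linear-plus-softmax form is an assumption carried over from the global model.

```python
import math

def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

def personalized_pattern_distribution(user_weights, user, features):
    """Pattern distribution for one user under one contextual scenario.

    user_weights[user][k][f] is an assumed personalized weight of
    feature f for latent pattern k.
    """
    rows = user_weights[user]
    return softmax([sum(w * x for w, x in zip(row, features)) for row in rows])

user_weights = {
    "alice": [[1.0, 0.0], [0.0, 1.0]],
    "bob": [[0.0, 1.0], [1.0, 0.0]],
}
context = [1.0, 0.0]  # the same contextual scenario for both users
alice = personalized_pattern_distribution(user_weights, "alice", context)
bob = personalized_pattern_distribution(user_weights, "bob", context)
```

Here the two users face an identical context yet favor different latent patterns, which a single shared weight table could not express.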
VI Experiments
In this section, we evaluate the following: (1) how do the proposed approaches compare with other state-of-the-art recommendation techniques? (2) how does the number of latent classes affect the model accuracy? (3) how do the features perform in the POI recommendation task? (4) how well do the proposed models recommend new POIs? (5) how do GPDM and PPDM differ in performance?
VI-A Evaluation Metrics
Given a top-N recommendation list sorted in descending order of the prediction values to user , we adopt a precision metric to evaluate the performance of our proposed next POI recommendation, given as:
(26) 
where are the visited locations of user and denotes the number of the users, N is the size of the next POI candidate list.
We evaluate the performance of next new POI recommendation by defining precision as:
(27) 
where denotes the locations that a user has not visited before and visits next.
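The two metrics can be sketched as below. This assumes one common form of precision, namely hits in the top-N list divided by N and averaged over users; the exact normalization of Eq.(26) and Eq.(27) may differ, and all function names and example data are hypothetical.

```python
def precision_at_n(recommended, visited, n):
    """Average over users of (hits in the top-n list) / n."""
    total = 0.0
    for user, ranked in recommended.items():
        hits = len(set(ranked[:n]) & visited.get(user, set()))
        total += hits / n
    return total / len(recommended)

def new_poi_precision_at_n(recommended, visited, history, n):
    """Same metric, counting only locations absent from the user's history."""
    new_visits = {
        user: locs - history.get(user, set()) for user, locs in visited.items()
    }
    return precision_at_n(recommended, new_visits, n)

recommended = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
visited = {"u1": {"a", "x"}, "u2": {"e", "f"}}  # ground-truth next visits
history = {"u1": {"a"}, "u2": set()}            # previously visited locations

p_all = precision_at_n(recommended, visited, n=2)
p_new = new_poi_precision_at_n(recommended, visited, history, n=2)
```

With these toy inputs, `u1` scores a hit only on a previously visited location, so the new-POI precision is lower than the overall precision.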
VI-B Evaluated Methods and Parameter Settings
We compare the proposed model with the following methods:

MF: matrix factorization [32] is widely used in conventional recommender systems, which factorizes the useritem preference matrix.

PMF: probabilistic matrix factorization is a well-known method for modeling time-evolving relational data [33]. It is widely used in traditional recommender systems.

FPMCLR: this method, proposed in [27], is the state-of-the-art personalized successive POI recommendation method.

PRMEG: this approach utilizes two Euclidean distances in the latent space: one is the distance between the current location and the next location, the other is the distance between the user and the next location. It then combines the two distances for prediction [26].
In the experiments, we use the three datasets introduced in Section III. Table II, Table III and Table IV report the comparison results between our models and the baseline methods. We set to be 1 for both FPMCLR and our proposed model, and the number of latent dimensions to 60 for all the compared models. The time window size is set to 6 hours for both FPMCLR and PRMEG. We set regularization term = 0.03 and component weight = 0.2 for PRMEG following [26]. The number of latent behavior patterns is set empirically to 4 for the Gowalla dataset and 6 for the Foursquare datasets. We tune the remaining parameters on the training sets and then use the resulting values on the test sets.
VI-C Comparison of Next POI Recommendation
The left parts of the three tables compare the recommendation accuracy of the evaluated methods on next POI recommendation. The results show that:

FPMCLR, PRMEG and the proposed models all outperform MF and PMF significantly, which indicates that conventional POI recommendation algorithms are not effective for successive POI recommendation. One possible explanation is that MF and PMF mainly exploit user preference rather than sequential information. More specifically, our proposed models achieve a relative improvement of at least 91% over MF and 81% over PMF, while FPMCLR and PRMEG also improve on MF and PMF. This demonstrates that spatial influence plays an important role in next POI recommendation.

Both GPDM and PPDM consistently outperform FPMCLR, by around 35% on the Foursquare dataset and 45% on the Gowalla dataset. Comparing the proposed models with PRMEG leads to similar observations: PPDM is again around 35% better on the Foursquare dataset and 45% better on the Gowalla dataset. This illustrates that inferring user latent behavior patterns better captures user mobility preference in LBSNs and therefore helps us recommend POIs to users more accurately.
VI-D Comparison of Next New POI Recommendation
The right parts of the three tables contrast the recommendation accuracy of the evaluated methods on next new POI recommendation, which indicates that our model is capable of predicting the periodicity of user mobility as well as providing new POIs to users. From the accuracy results, we make the following three observations:

MF and PMF aim to tune the latent factor vectors of users and locations to explain observed checkins and recover unobserved checkins. However, it is challenging for conventional methods to recommend new POIs to users without any extra information, because new POIs have received few preferences from users and are assigned lower credit by conventional methods than visited POIs. As a result, they report the lowest recommendation precision.

Both FPMCLR and PRMEG achieve higher precision by taking geographical influence into account. Moreover, PRMEG is better than FPMCLR, since PRMEG has been customized to predict the next new POI by representing each POI as one point in latent space rather than two independent vectors.

The proposed models always achieve the highest recommendation precision by a large margin, which implies that inferring user latent behavior patterns plays an important role when performing next new POI recommendation.
Another interesting observation is that the proposed models reach lower precision than FPMCLR and PRMEG in terms of P@1 on the Gowalla dataset. Note that the categorical information of POIs is not included in the Gowalla dataset, so the proposed models cannot integrate it to infer the latent behavior patterns. Intuitively, we expect categorical information to be useful for modeling the specific preference of a user when recommending new POIs.
To quantify this importance, Fig.9 further depicts the fraction of new POIs over all accurately predicted POIs. We observe that GPDM with the contextual feature of the previous location’s category achieves much better performance in recommending new POIs than GPDM without it, which suggests that the predictive ability of the proposed models on next new POI recommendation can be improved by incorporating categorical influence into the inference of latent behavior patterns. Our explanation is that the categories of the POIs visited by a user implicitly indicate the activities of the user in those POIs. In reality, people have different biases on the categories of POIs: a foodie often visits restaurants to taste a variety of food, and a tourism enthusiast usually travels to tourist attractions all over the world. Accordingly, the proposed model can deduce the relevance score of a user to an unvisited POI based on the categories of the user’s visited POIs and of the unvisited POIs.
Metrics  Next POI Recommendation  Next New POI Recommendation
MF  PMF  FPMCLR  PRMEG  GPDM  PPDM  MF  PMF  FPMCLR  PRMEG  GPDM  PPDM
*improved by GPDM
Metrics  Next POI Recommendation  Next New POI Recommendation
MF  PMF  FPMCLR  PRMEG  GPDM  PPDM  MF  PMF  FPMCLR  PRMEG  GPDM  PPDM
*improved by GPDM
Metrics  Next POI Recommendation  Next New POI Recommendation
MF  PMF  FPMCLR  PRMEG  GPDM  PPDM  MF  PMF  FPMCLR  PRMEG  GPDM  PPDM
*improved by GPDM
User ID  Current venue  Checkin time  Next venue  Checkin time  Distance (km)  Time interval (h)
2282  Arches National Park Visitor Center  14:24, Fri  Patagonia Outlet  10:26, Sun  304  44
1598  Silvia’s Hair Design  13:09, Fri  Don Carlos  12:19, Sat  0.5577  23.16
192  [missing]  06:01, Sun  [missing]  08:50, Sun  82.82  2.81
1121  [missing]  22:22, Fri  John Ascuaga’s Nugget Casino Resort  22:49, Sat  0  24.45
2446  Blue Bayou Restaurant  18:11, Sun  Pirates of the Caribbean  20:15, Sun  0.0268  2.06
VI-E GPDM vs. PPDM
Both GPDM and PPDM are much better than the other algorithms, which demonstrates the effectiveness of the latent behavior pattern assumption. GPDM performs better on all evaluation metrics on the FoursquareLA and Gowalla datasets, while PPDM outperforms GPDM on the FoursquareNY dataset. The reason is twofold. (1) Fig.12(a) summarizes the distribution of users’ checkin frequency over different ranges in the Foursquare datasets. From Fig.12(b) to Fig.12(f), we observe that GPDM outperforms PPDM when users’ checkin frequency is small, whereas PPDM performs better than GPDM when the checkin frequency becomes larger. This is reasonable: when a user has few checkins, it is difficult to learn that user’s latent behavior patterns accurately, which lowers the performance of PPDM, while GPDM can still recommend POIs based on the shared distribution of latent behavior patterns. Recommending POIs to users with very little checkin history is one of the general challenges of POI recommendation. (2) For the Gowalla dataset, the contextual features only include time of day and day of week, which indicates that most users share similar latent behavior patterns and that the low diversity of contextual scenarios can be learned even for users with few checkins.
VI-F Quantitative Evaluation of Accumulated Precision
Fig.13 shows the prediction ability vs. distance and time. The results over distance (see Fig. 13(b)) show that our model is capable of predicting transitions within a localized region as well as occasional long-distance journeys. The results over time (see Fig. 13(a)) imply that our model suits both periodic mobility over long time intervals and random movement within short intervals.
VI-G Impact of the Contextual Features
Here, we discuss the recommendation effects of different types of contextual information, i.e. Previous Location’s Category, Time of Day and Day of Week. Fig.10 depicts the experimental results for different combinations of contextual information. In general, the model accuracy increases as more contextual information is added. Most importantly, the proposed model improves the performance significantly by integrating all contextual information to infer latent behavior patterns. This indicates that finer latent patterns are obtained, which better capture user preference.
VI-H Impact of the Number of Latent Patterns
Fig.11 shows the experimental results with different settings of the number of latent patterns. For both datasets, the model accuracy increases with the number of patterns. Once the number of latent patterns reaches 6 for Foursquare and 4 for Gowalla, the returns diminish sharply: the gain from adding one more latent pattern is minor compared with the gains obtained below these values. For example, P@10 on the Gowalla dataset is 0.184 using 3 latent patterns, whereas the four-latent-pattern model has a P@10 of 0.293, which is a 59.2% relative improvement. Using a five-latent-pattern model only increases performance by another 1.4%. Considering also the additional computation cost of inferring preference for each pattern, we conclude that 6 latent patterns for Foursquare and 4 latent patterns for Gowalla are rich enough to complete the task of personalized next POI recommendation.
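The relative improvement quoted above can be checked with a short calculation. The helper name `relative_improvement` is hypothetical; the two P@10 values are the ones reported in the text for the Gowalla dataset.

```python
def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline

# P@10 on Gowalla with 3 and 4 latent patterns, as reported in the text.
gain = relative_improvement(0.184, 0.293)
print(f"{gain:.1%}")  # prints 59.2%
```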
VI-I Case Study
The case-study table lists 5 representative successful predictions for the Foursquare data. For each user, we show the user ID, the current venue, the checkin time at the current venue, the next venue, the checkin time at the next venue, and the distance and time interval between the successive checkins. The distance between two successive POIs varies from 0.5km to 304km and the time interval varies from half an hour to 44 hours, which shows that our model is capable of predicting transitions within a localized region as well as occasional long-distance journeys.
VII Conclusion and Future Work
To address the personalized next POI recommendation problem, in this paper we propose a unified tensor-based latent model that captures successive checkin behavior by exploring the latent pattern-level preference of each user. We derive a BPR-like optimization criterion accordingly and then use Expectation Maximization (EM) to estimate the model parameters. Performance evaluation conducted on two large-scale real-world LBSNs datasets shows that our proposed approach improves the recommendation accuracy significantly compared with other state-of-the-art methods. More specifically, our proposed method is capable of predicting long-distance journeys and consecutive checkins that span a long period of time. For future work, we will evaluate our proposed model’s ability for next new POI recommendation by redefining the transition tensor in a categorical dimension.
References
 [1] Y. Zheng, L. Capra, O. Wolfson, and H. Yang, “Urban computing: concepts, methodologies, and applications,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 5, no. 3, p. 38, 2014.
 [2] L. Cao and S. Yu, Behavior computing. Springer, 2012.
 [3] S. Eubank, H. Guclu, V. A. Kumar, M. V. Marathe, A. Srinivasan, Z. Toroczkai, and N. Wang, “Modelling disease outbreaks in realistic urban social networks,” Nature, vol. 429, no. 6988, pp. 180–184, 2004.
 [4] H. Gao, J. Tang, X. Hu, and H. Liu, “Content-aware point of interest recommendation on location-based social networks,” in Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA, 2015, pp. 1721–1727.
 [5] T. Yuan, J. Cheng, X. Zhang, S. Qiu, and H. Lu, “Recommendation by mining multiple user behaviors with group sparsity,” in Proceedings of the TwentyEighth AAAI Conference on Artificial Intelligence, Québec City, Québec, Canada., 2014, pp. 222–228.
 [6] M. Ye, P. Yin, W.C. Lee, and D.L. Lee, “Exploiting geographical influence for collaborative pointofinterest recommendation,” in Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. ACM, 2011, pp. 325–334.
 [7] C. Cheng, H. Yang, I. King, and M. Lyu, “Fused matrix factorization with geographical and social influence in locationbased social networks,” in TwentySixth AAAI Conference on Artificial Intelligence, 2012.
 [8] Y. Yu and X. Chen, “A survey of pointofinterest recommendation in locationbased social networks,” in Workshops at the TwentyNinth AAAI Conference on Artificial Intelligence, vol. 130, 2015.
 [9] N. Eagle and A. Pentland, “Eigenbehaviors: Identifying structure in routine,” in Behavioral Ecology and Soc., 2009.
 [10] Z. Li, B. Ding, J. Han, R. Kays, and P. Nye, “Mining periodic behaviors for moving objects,” in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’10. New York, NY, USA: ACM, 2010, pp. 1099–1108.
 [11] E. Cho, S. A. Myers, and J. Leskovec, “Friendship and mobility: User movement in locationbased social networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’11. New York, NY, USA: ACM, 2011, pp. 1082–1090.
 [12] S. Rendle, C. Freudenthaler, Z. Gantner, and L. SchmidtThieme, “Bpr: Bayesian personalized ranking from implicit feedback,” in Proceedings of the TwentyFifth Conference on Uncertainty in Artificial Intelligence, 2009, pp. 452–461.
 [13] R. M. Neal and G. E. Hinton, “A view of the em algorithm that justifies incremental, sparse, and other variants,” in Learning in graphical models. Springer, 1998, pp. 355–368.
 [14] Y. Zheng, L. Zhang, X. Xie, and W.Y. Ma, “Mining interesting locations and travel sequences from gps trajectories,” in Proceedings of the 18th international conference on World Wide Web, WWW’09. ACM, 2009, pp. 791–800.
 [15] Q. Yuan, G. Cong, Z. Ma, A. Sun, and N. M. Thalmann, “Timeaware pointofinterest recommendation,” in Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. ACM, 2013, pp. 363–372.
 [16] H. Gao, J. Tang, X. Hu, and H. Liu, “Exploring temporal effects for location recommendation on locationbased social networks,” in Proceedings of the 7th ACM conference on Recommender systems. ACM, 2013, pp. 93–100.
 [17] C. Chen, H. Yin, J. Yao, and B. Cui, “Terec: A temporal recommender system over tweet stream,” Proceedings of the VLDB Endowment, vol. 6, no. 12, pp. 1254–1257, 2013.
 [18] S. Zhao, T. Zhao, H. Yang, M. R. Lyu, and I. King, “Stellar: Spatialtemporal latent ranking for successive pointofinterest recommendation,” in Thirtieth AAAI Conference on Artificial Intelligence, 2016.
 [19] B. Liu, Y. Fu, Z. Yao, and H. Xiong, “Learning geographical preferences for pointofinterest recommendation,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2013, pp. 1043–1051.
 [20] Y. Chen, J. Zhao, X. Hu, X. Zhang, Z. Li, and T. Chua, “From interest to function: Location estimation in social media,” in Proceedings of the TwentySeventh AAAI Conference on Artificial Intelligence, Bellevue, Washington, USA., 2013, pp. 180–186.
 [21] M. Ye, P. Yin, and W.C. Lee, “Location recommendation for locationbased social networks,” in Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 2010, pp. 458–461.
 [22] H. Wang, G. Li, and J. Feng, “Group-based personalized location recommendation on social networks,” in Proceedings of the 16th Asia-Pacific Web Conference, APWeb’14, Changsha, China, 2014, pp. 68–80.
 [23] J. Chen, X. Li, W. K. Cheung, and K. Li, “Effective successive poi recommendation inferred with individual behavior and group preference,” Neurocomputing, 2016.
 [24] R. He and J. McAuley, “Vbpr: Visual bayesian personalized ranking from implicit feedback,” in Thirtieth AAAI Conference on Artificial Intelligence, 2016.
 [25] Q. Liu, S. Wu, L. Wang, and T. Tan, “Predicting the next location: A recurrent model with spatial and temporal contexts,” in Thirtieth AAAI Conference on Artificial Intelligence, 2016.
 [26] S. Feng, X. Li, Y. Zeng, G. Cong, Y. M. Chee, and Q. Yuan, “Personalized ranking metric embedding for next new poi recommendation,” in Proceedings of the 24th International Conference on Artificial Intelligence. AAAI Press, 2015, pp. 2069–2075.
 [27] C. Cheng, H. Yang, M. R. Lyu, and I. King, “Where you like to go next: Successive pointofinterest recommendation,” in Proceedings of the TwentyThird international joint conference on Artificial Intelligence. AAAI Press, 2013, pp. 2605–2611.
 [28] J. Bao, Y. Zheng, and M. F. Mokbel, “Locationbased and preferenceaware recommendation using sparse geosocial networking data,” in Proceedings of the 20th International Conference on Advances in Geographic Information Systems. ACM, 2012, pp. 199–208.
 [29] S. Rendle, C. Freudenthaler, and L. SchmidtThieme, “Factorizing personalized markov chains for nextbasket recommendation,” in Proceedings of the 19th international conference on World wide web. ACM, 2010, pp. 811–820.
 [30] S. Rendle and L. SchmidtThieme, “Pairwise interaction tensor factorization for personalized tag recommendation,” in Proceedings of the third ACM international conference on Web search and data mining. ACM, 2010, pp. 81–90.
 [31] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the em algorithm,” Journal of the royal statistical society. Series B (methodological), pp. 1–38, 1977.
 [32] Y. Koren, R. M. Bell, and C. Volinsky, “Matrix factorization techniques for recommender systems,” Computer, vol. 42, no. 8, pp. 30–37, Aug. 2009.
 [33] A. Mnih and R. Salakhutdinov, “Probabilistic matrix factorization,” in Advances in neural information processing systems, 2007, pp. 1257–1264.