NPE: Neural Personalized Embedding for Collaborative Filtering


ThaiBinh Nguyen, Atsuhiro Takasu
SOKENDAI (The Graduate University for Advanced Studies), Japan
National Institute of Informatics, Japan
{binh,takasu}@nii.ac.jp
Abstract

Matrix factorization is one of the most efficient approaches in recommender systems. However, such algorithms, which rely on the interactions between users and items, perform poorly for “cold-users” (users with little history of such interactions) and poorly capture the relationships between closely related items. To address these problems, we propose a neural personalized embedding (NPE) model, which improves the recommendation performance for cold-users and can learn effective representations of items. It models a user’s click on an item in terms of two factors: the personal preference of the user for the item, and the relationships between this item and other items clicked by the user. We show that NPE outperforms competing methods for top-N recommendations, especially for cold-user recommendations. We also performed a qualitative analysis that shows the effectiveness of the representations learned by the model.


1 Introduction

In recent years, recommender systems have become a core component of online services. Given the “historical activities” of a particular user (e.g., product purchases, movie watching, and Web page views), a recommender system suggests other items that may be of interest to that user. Current domains for recommender systems include movie recommendation (Netflix and Hulu), product recommendation (Amazon), and application recommendation (Google Play and Apple Store).

The historical activities of users are often expressed in terms of a user-item preference matrix whose entries are either explicit feedback (e.g., ratings or like/dislike) or implicit feedback (e.g., clicks or purchases). Typically, only a small part of the potential user-item matrix is available, with the remaining entries not having been recorded. Predicting user preferences can be interpreted as filling in the missing entries of the user-item matrix. In this setting, matrix factorization (MF) is one of the most efficient approaches for finding the latent representations of users and items [???]. To address the sparseness of the user-item matrix, additional data are integrated into MF as “side information.” This might include textual information for article recommendations [Wang and Blei, 2011; Wang et al., 2015], product images in e-commerce [He and McAuley, 2016], or music signals for song recommendations [Oord et al., 2013]. However, there are two major issues with these MF-based algorithms. First, these models are poor at modeling cold-users (i.e., users who have only a short history of relevant activities). Second, because these models consider only user-item interactions, the item representations poorly capture the relationships among closely related items [?].

One approach to cold-user recommendation is to exploit user profiles. Models of this kind [Tang and Liu, 2017; Li et al., 2015b] can learn user representations from profiles (e.g., gender and age) and can therefore make recommendations to new users who have no historical activities, provided their profiles are available. However, user profiles are often very noisy, and in many cases they are simply not available. Another approach is item-similarity-based models [Sarwar et al., 2001; Linden et al., 2003], which recommend items based on item–item similarity. The main issue with this approach is that it considers only the most recent click when making a recommendation, ignoring previous clicks. In addition, these models are not personalized.

In item representation learning, Item2Vec [Barkan and Koenigstein, 2016] is an efficient model that borrows the idea behind word-embedding techniques [Mikolov et al., 2013]. However, the main goal of Item2Vec is to learn item representations; it cannot be used directly for predicting missing entries in a user-item matrix. Furthermore, in making recommendations, Item2Vec is not personalized: it recommends items based on the similarities between items, computed from the item representations, while ignoring users’ historical activities.

To address these problems, this paper proposes a neural personalized embedding (NPE) model that exploits item relationships to learn effective item representations while also improving recommendation quality for cold-users. NPE models a user’s click on an item by assuming that two signals drive the click: the personal preference of the user for the item, and the relationships between this item and other items that the user has clicked.

To model the personal preference term, we adopt the same approach as MF, which views the preference of a user for an item as the inner product of the corresponding factor vectors. To model the relationships among items, we propose an item-embedding model that generalizes the idea behind word-embedding techniques to click data. Our item-embedding model differs from word-embedding models in that the latter can only learn word representations, whereas ours learns item representations and fills in the user-item matrix simultaneously.

2 Related Work

Matrix Factorization

MF [??] is one of the most efficient approaches to collaborative filtering. An MF-based algorithm associates each user with a latent feature vector of preferences and each item with a latent feature vector of attributes. Given users’ prior ratings of items, MF learns the latent feature vectors of users and items and uses these vectors to predict missing ratings. To address sparseness in the user-item matrix, additional data about items/users are also used [???].

Recently, the CoFactor [Liang et al., 2016] and CEMF [Nguyen et al., 2017] models have been proposed. These models integrate item embedding into the MF model by simultaneously decomposing the preference matrix and the SPPMI matrix (an item–item matrix constructed from co-click information) in a shared latent space. However, in contrast to our proposed method, CoFactor and CEMF use co-click information to regularize the factorization of the user-item matrix, whereas NPE exploits co-click information for learning effective representations of items. In [Nguyen and Takasu, 2017], co-click information is used to address the data sparsity issue in rating prediction.

For cold-user recommendations, [Tang and Liu, 2017] and [Li et al., 2015b] proposed models that learn user representations from user profiles. In [Tang and Liu, 2017], the user representations are learned from user profiles via a deep convolutional neural network for event recommendation, whereas in [Li et al., 2015b] the user representations are learned by an auto-encoder. Although these models are very useful for new-user recommendations, the main issue remains that user profiles are not always available. Furthermore, many user profiles may be very noisy (e.g., users may not want to publish their real gender, age, or location), which leads to inaccurate representations of users.

Embedding Models

Word-embedding techniques [Mikolov et al., 2013; Li et al., 2015a] have been applied successfully to many tasks in natural language processing. The goal of word embedding is to learn vector representations of words that capture the relationships with surrounding words. The assumption behind word-embedding techniques is that words occurring in the same context are similar. To capture such similarities, words are embedded into a low-dimensional continuous space.

If an item is viewed as a word, and the list of items clicked by a user as a context window, we can map word embedding to recommender systems. Item2Vec [Barkan and Koenigstein, 2016] was introduced as a neural network-based item-embedding model. However, Item2Vec cannot predict missing entries in a user-item matrix directly. Furthermore, in its recommendations, Item2Vec relies only on the last item clicked, ignoring the previous items that a user has clicked.

Exponential Family Embeddings (EFE) [Rudolph et al., 2016] is a probabilistic embedding model that generalizes the spirit of word embedding to other kinds of data and can be used for modeling clicks and learning item representations. However, EFE does not support side information such as items’ rich content. In addition, EFE is not personalized.

Item-based Models

In item-based collaborative filtering [Sarwar et al., 2001; Linden et al., 2003], an item is recommended to a user based on the similarity between this item and the items that the user clicked in the past. In [Sarwar et al., 2001], an item–similarity matrix is constructed and used directly to calculate item similarities for recommendations. Previous work shows that the performance of this method is highly sensitive to the choice of similarity metric and data normalization [Herlocker et al., 2002].

SLIM [Ning and Karypis, 2011] is a more recent model that learns a sparse item-similarity matrix from the preference matrix. However, the disadvantage of SLIM is that it can only capture relations between items that are co-clicked by at least one user, which limits the model’s capability on extremely sparse datasets. Furthermore, SLIM can only be used to predict the missing entries of the user-item matrix; it cannot be used for learning effective representations of items.

3 NPE: Neural Personalized Embedding

We propose NPE, a factor model that explains users’ clicks by capturing the preferences of users for items and the relationships between closely related items. We will describe the model and how to learn the model parameters.

3.1 Problem Formulation

Each entry $y_{ui}$ of the user-item preference matrix $\mathbf{Y} \in \{0,1\}^{N \times M}$ takes one of two values, such that $y_{ui} = 1$ if user $u$ has clicked item $i$ and $y_{ui} = 0$ otherwise. We assume that $y_{ui} = 1$ indicates that user $u$ prefers item $i$, whereas $y_{ui} = 0$ indicates that this entry is non-observed (i.e., a missing entry).

Given a user $u$ and the set of items that $u$ previously interacted with, our goal is to predict a list of items that $u$ may find interesting (top-N recommendations).

The notations used in this paper are defined in Table 1.

Notation: Meaning
$N, M$: the numbers of users and items, respectively
$\mathbf{Y}$: the user-item matrix (e.g., click matrix)
$\mathbf{y}_u$: the observation data for user $u$ (i.e., the row of $\mathbf{Y}$ corresponding to user $u$)
$d$: the dimensionality of the embedding space
$n$: the dimensionality of the user input vector
$m$: the dimensionality of the item input vector
$\mathbf{x}_u$: the input vector of user $u$, $\mathbf{x}_u \in \mathbb{R}^n$
$\mathbf{z}_i$: the input vector of item $i$, $\mathbf{z}_i \in \mathbb{R}^m$
$\mathbf{W}$: the user embedding matrix, $\mathbf{W} \in \mathbb{R}^{n \times d}$
$\mathbf{H}$: the item-embedding matrix, $\mathbf{H} \in \mathbb{R}^{m \times d}$
$\mathbf{H}'$: the item context matrix, $\mathbf{H}' \in \mathbb{R}^{m \times d}$
$\mathbf{h}_u$: the embedding vector of user $u$
$\mathbf{h}_i$: the embedding vector of item $i$
$\mathbf{h}'_i$: the context vector of item $i$
$\theta$: the set of all model parameters
$\mathcal{R}(\theta)$: the regularization term
$\mathcal{C}_u^i$: the set of items that user $u$ clicked, excluding $i$ (the context items)
$\mathcal{S}^+$: the set of positive examples, $\mathcal{S}^+ = \{(u, i) : y_{ui} = 1\}$
$\mathcal{S}^-$: the set of negative examples, obtained by sampling from the zero entries of $\mathbf{Y}$
Table 1: The notation used throughout the paper.

3.2 Model Formulation

We denote the observations for user $u$ as:

$$\mathbf{y}_u = (y_{u1}, y_{u2}, \ldots, y_{uM}). \tag{1}$$

NPE models the probability of each observation $y_{ui}$ conditioned on user $u$ and its context items as:

$$p(y_{ui} \mid u, \mathcal{C}_u^i). \tag{2}$$

This equation captures the intuition behind the model, namely that the conditional distribution of whether user $u$ clicks on item $i$ is governed by two factors: (1) the personal preference of user $u$ for item $i$, and (2) the set of items that $u$ has clicked (i.e., $\mathcal{C}_u^i$).

The likelihood function for the entire matrix $\mathbf{Y}$ is then formulated as:

$$\mathcal{L}(\theta) = \prod_{u=1}^{N} \prod_{i=1}^{M} p(y_{ui} \mid u, \mathcal{C}_u^i). \tag{3}$$

The conditional probability expressed in Eq. 2 is implemented by a neural network. This neural network connects the input vectors of user $u$, item $i$, and the context items to their hidden representations as:

$$\mathbf{h}_u = \phi(\mathbf{W}^\top \mathbf{x}_u), \tag{4}$$
$$\mathbf{h}_i = \phi(\mathbf{H}^\top \mathbf{z}_i), \tag{5}$$
$$\mathbf{h}'_j = \phi(\mathbf{H}'^\top \mathbf{z}_j), \quad j \in \mathcal{C}_u^i, \tag{6}$$

where $\phi(\cdot)$ is an activation function such as ReLU.

Note that there are two hidden representations associated with item $i$: the embedding vector $\mathbf{h}_i$ and the context vector $\mathbf{h}'_i$, which have different roles. Whereas $\mathbf{h}_i$ accounts for the attributes of item $i$ itself, $\mathbf{h}'_i$ accounts for specifying the items that appear in its context.

We can then define the conditional probability in Eq. 2 via the hidden representations as:

$$p(y_{ui} = 1 \mid u, \mathcal{C}_u^i) = \sigma\Big(\mathbf{h}_u^\top \mathbf{h}_i + \mathbf{h}_i^\top \sum_{j \in \mathcal{C}_u^i} \mathbf{h}'_j\Big), \tag{7}$$

where $\sigma(\cdot)$ is the sigmoid function.

Note that the function on the right side of Eq. 7 comprises two terms: the first term, $\mathbf{h}_u^\top \mathbf{h}_i$, accounts for how much user $u$ prefers item $i$, whereas the second term accounts for the compatibility between item $i$ and the items that $u$ has already clicked.

From Eq. 7, we can also obtain the probability that $y_{ui} = 0$ as:

$$p(y_{ui} = 0 \mid u, \mathcal{C}_u^i) = 1 - \sigma\Big(\mathbf{h}_u^\top \mathbf{h}_i + \mathbf{h}_i^\top \sum_{j \in \mathcal{C}_u^i} \mathbf{h}'_j\Big). \tag{8}$$

The conditional probability functions in Eqs. 7 and 8 can be summarized in a single conditional probability function as:

$$p(y_{ui} \mid u, \mathcal{C}_u^i) = \sigma(f_{ui})^{y_{ui}} \big[1 - \sigma(f_{ui})\big]^{1 - y_{ui}}, \tag{9}$$

where $f_{ui} = \mathbf{h}_u^\top \mathbf{h}_i + \mathbf{h}_i^\top \sum_{j \in \mathcal{C}_u^i} \mathbf{h}'_j$.
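For concreteness, here is a minimal NumPy sketch of the score $f_{ui}$ and the click probability of Eq. 9. The function and variable names are ours; this illustrates the formula rather than reproducing the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def npe_click_probability(h_u, h_i, h_ctx):
    """p(y_ui = 1 | u, C_u^i) from Eq. 7.

    h_u   : (d,)   embedding vector of user u
    h_i   : (d,)   embedding vector of the candidate item i
    h_ctx : (n, d) context vectors of the items in C_u^i
    """
    # preference term + item-relationship term
    f_ui = h_u @ h_i + h_i @ h_ctx.sum(axis=0)
    return sigmoid(f_ui)
```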

3.3 The Model Architecture

The architecture of NPE is shown in Fig. 1 as a multi-layer neural network. The first layer is the input layer, which specifies the input vectors of (1) a user $u$, (2) a candidate item $i$, and (3) the context items. Above this is the second layer (the embedding layer), which connects to the input layer via the connection matrices $\mathbf{W}$, $\mathbf{H}$, and $\mathbf{H}'$. Above the embedding layer, two terms are calculated: the personal preference of user $u$ for item $i$ and the relationship between $i$ and the context items. Finally, the model combines these two terms to compute the output, which is the probability that $u$ will click $i$.

Note that the input layer accepts a wide range of vectors that describe users and items, such as one-hot vectors or content feature vectors obtained from side information. With such generic input vectors, our method can address the cold-start problem by using content feature vectors as the inputs for users and items. Since this work focuses on the pure collaborative filtering setting, we use only the identities of users and items, in the form of one-hot vectors, as input vectors. Investigating the effectiveness of using content feature vectors is left for future work.
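To make the architecture concrete, the following is a minimal PyTorch sketch under the one-hot-input setting used in this paper, where the connection matrices $\mathbf{W}$, $\mathbf{H}$, $\mathbf{H}'$ reduce to embedding lookups. All names are ours, and the padded-context handling is one possible design, not the authors' code.

```python
import torch
import torch.nn as nn

class NPE(nn.Module):
    """Sketch of the NPE architecture of Fig. 1, assuming one-hot inputs."""

    def __init__(self, n_users: int, n_items: int, d: int, dropout: float = 0.2):
        super().__init__()
        self.W = nn.Embedding(n_users, d)    # user embedding matrix
        self.H = nn.Embedding(n_items, d)    # item-embedding matrix
        self.Hc = nn.Embedding(n_items, d)   # item context matrix (H')
        self.act = nn.ReLU()                 # the activation phi in Eqs. 4-6
        self.drop = nn.Dropout(dropout)      # dropout rate is tuned per dataset (Sec. 4.3)

    def forward(self, user, item, context, context_mask):
        # user: (B,), item: (B,), context: (B, L) padded item ids,
        # context_mask: (B, L) with 1 for real context items, 0 for padding
        h_u = self.drop(self.act(self.W(user)))                        # Eq. 4
        h_i = self.drop(self.act(self.H(item)))                        # Eq. 5
        h_c = self.act(self.Hc(context)) * context_mask.unsqueeze(-1)  # Eq. 6
        preference = (h_u * h_i).sum(-1)         # user-preference term
        relation = (h_i * h_c.sum(1)).sum(-1)    # item-relationship term
        return preference + relation             # logit of Eq. 7; apply sigmoid for p
```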

Figure 1: The architecture of NPE.

3.4 Objective Function

Given an observed matrix $\mathbf{Y}$, our goal is to learn the model parameters $\theta$ that maximize the likelihood function in Eq. 3. However, instead of modeling all zero entries, we model only a small, randomly picked subset of them (negative sampling). This gives:

$$\mathcal{L}(\theta) = \prod_{(u,i) \in \mathcal{S}^+ \cup \mathcal{S}^-} p(y_{ui} \mid u, \mathcal{C}_u^i). \tag{10}$$

Maximizing the likelihood in Eq. 10 is equivalent to minimizing the following loss function (its negative logarithm):

$$\ell(\theta) = -\sum_{(u,i) \in \mathcal{S}} \Big[ y_{ui} \log \sigma(f_{ui}) + (1 - y_{ui}) \log\big(1 - \sigma(f_{ui})\big) \Big], \tag{11}$$

where $\mathcal{S} = \mathcal{S}^+ \cup \mathcal{S}^-$.

This loss function is known as the binary cross-entropy.
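In code, Eq. 11 is the standard binary cross-entropy applied to the raw scores $f_{ui}$. A one-function sketch (names are ours):

```python
import torch
import torch.nn.functional as F

def npe_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy of Eq. 11 over S = S+ and the sampled S-.

    logits: raw scores f_ui from the model
    labels: 1.0 for positive examples, 0.0 for sampled negatives
    """
    return F.binary_cross_entropy_with_logits(logits, labels)
```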

3.5 Model Training

We adopt the Adam technique (a mini-batch stochastic gradient descent approach) [Kingma and Ba, 2014]. We do not perform negative sampling in advance, which would produce only a fixed set of negative samples. Instead, we perform negative sampling anew at each epoch, which enables diverse sets of negative examples to be used. The procedure is summarized in Algorithm 1.

Input :
  • $\mathbf{Y}$: user-item preference matrix

  • $k$: number of negative samples per positive example

Output : $\theta$
1 Initialization: sample $\theta$ from Gaussian distributions
2 for epoch = 1 … T do
3       Sample the set of negative examples $\mathcal{S}^-$ ($k$ negatives per positive example)
4       $\mathcal{S} \leftarrow \mathcal{S}^+ \cup \mathcal{S}^-$
5       Shuffle $\mathcal{S}$ and partition it into mini-batches
6       for t = 1 … # of mini-batches do
7             Compute the loss in Eq. 11 over mini-batch $t$
8             Backprop()
9       end for
10 end for
Algorithm 1 NPE($\mathbf{Y}, k$). Backprop is the back-propagation procedure for updating the network weights.
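A Python sketch of Algorithm 1 might look as follows. Here `build_context` (gathering the padded context items $\mathcal{C}_u^i$ and their mask) is a hypothetical helper, and the uniform resampling of negatives is a simplification that may occasionally draw a positive pair.

```python
import numpy as np
import torch
import torch.nn.functional as F

def train_npe(model, positives, n_items, k=4, epochs=20, batch_size=10_000, lr=1e-3):
    """Sketch of Algorithm 1: fresh negatives are drawn at every epoch.

    positives: array of shape (P, 2) holding the (user, item) pairs with y_ui = 1
    k: number of negative samples per positive example
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        # Resample S- each epoch so the model sees diverse zero entries.
        neg_u = np.repeat(positives[:, 0], k)
        neg_i = np.random.randint(n_items, size=len(neg_u))  # may rarely hit a positive
        users = np.concatenate([positives[:, 0], neg_u])
        items = np.concatenate([positives[:, 1], neg_i])
        labels = np.concatenate([np.ones(len(positives)), np.zeros(len(neg_u))])
        order = np.random.permutation(len(users))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            u = torch.as_tensor(users[idx], dtype=torch.long)
            i = torch.as_tensor(items[idx], dtype=torch.long)
            y = torch.as_tensor(labels[idx], dtype=torch.float32)
            ctx, mask = build_context(u, i)  # hypothetical helper: padded C_u^i + mask
            logits = model(u, i, ctx, mask)
            loss = F.binary_cross_entropy_with_logits(logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```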

3.6 Connections with Previous Models

NPE vs. MF

In the conditional probability in Eq. 7, the score is a combination of two terms: (1) user preference and (2) item relationships. If the second term is removed, NPE reduces to a standard MF model.

NPE vs. Word Embedding

Similarly, if we remove the first term in Eq. 7, NPE models only the relationships among items. If we view each item as a word, and the set of items that a user clicked as a sentence, the model becomes similar to a word-embedding model. However, our embedding model differs in that word-embedding techniques can only learn word (item) representations and cannot fill in the user-item matrix directly. In contrast, our embedding model can learn effective item representations while predicting the missing entries of the user-item matrix.

4 Empirical Study

We studied the effectiveness of NPE both quantitatively and qualitatively. In the quantitative analysis, we compared NPE with state-of-the-art methods on the top-N recommendation task, using real-world datasets. We also performed a qualitative analysis to show the effectiveness of the learned item representations.

4.1 Datasets

We used three real-world datasets from different domains, with sizes ranging from small to large-scale. First, Movielens 10M (ML-10m) is a dataset of user-movie ratings collected from MovieLens, an online film service. Next, Online Retail [Chen et al., 2012] is a dataset of online retail transactions, containing all transactions from Dec 1, 2010 to Dec 9, 2011 for an online retailer. Finally, TasteProfile is a dataset of counts of song plays by users, as collected by Echo Nest (http://the.echonest.com/).

ML-10m OnlineRetail TasteProfile
#users 58,059 3,705 211,830
#items 8,484 3,644 22,781
#clicks 3,502,733 235,472 10,054,204
% clicks 0.71% 1.74% 0.21%
Table 2: Statistical information about the datasets.

4.2 Experiment Setup

Data Preparation

For ML-10m, we binarized the ratings by thresholding at 4 or above; for TasteProfile and OnlineRetail, we binarized the data and interpreted them as implicit feedback. Statistical information about the datasets is given in Table 2.

We partitioned the data into three subsets, using 70% of the data as the training set, 10% as the validation set, and the remaining 20% as the test set (ground truth).
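A sketch of this preparation step, assuming a global random split of the click pairs (the paper does not specify whether the split is global or per-user):

```python
import numpy as np

def prepare(ratings, threshold=4.0, seed=0):
    """Binarize explicit ratings and split clicks 70/10/20 into train/val/test.

    ratings: array of (user, item, rating) rows; for implicit data, skip the threshold.
    """
    clicks = ratings[ratings[:, 2] >= threshold][:, :2].astype(int)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(clicks))
    n_tr, n_va = int(0.7 * len(clicks)), int(0.1 * len(clicks))
    train = clicks[order[:n_tr]]
    val = clicks[order[n_tr:n_tr + n_va]]
    test = clicks[order[n_tr + n_va:]]
    return train, val, test
```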

Evaluation Metrics

After training the models on the training set, we evaluated the accuracy of their top-N recommendations using the test set. We used the rank-based metrics Recall@N and nDCG@N, which are common metrics in information retrieval, to evaluate the accuracy of the top-N recommendations. (We did not use precision because it is difficult to evaluate: a zero entry can imply either that the user does not like the item or that the user does not know about the item.)
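For reference, one common way to compute these metrics per user is sketched below. Definitions of Recall@N vary slightly across papers, so this is an illustrative variant, not necessarily the exact one used here.

```python
import numpy as np

def recall_at_n(ranked, relevant, n):
    """Hits in the top-N list, normalized by min(N, #relevant held-out items)."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:n]) & relevant)
    return hits / min(n, len(relevant))

def ndcg_at_n(ranked, relevant, n):
    """Binary-relevance nDCG: discounted gain of the top-N list over the ideal."""
    dcg = sum(1.0 / np.log2(r + 2) for r, it in enumerate(ranked[:n]) if it in relevant)
    idcg = sum(1.0 / np.log2(r + 2) for r in range(min(n, len(relevant))))
    return dcg / idcg if idcg > 0 else 0.0
```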

Competing Methods

We compared NPE with the following competing methods:

  • Bayesian personalized ranking (BPR) [Rendle et al., 2009]: an algorithm that optimizes the MF model with a pair-wise ranking loss

  • Neural collaborative filtering (NeuCF) [He et al., 2017]: a generalization of MF in which the inner product of the user and item feature vectors is replaced by a deep neural network

  • Sparse linear model (SLIM) [Ning and Karypis, 2011]: a state-of-the-art method for top-N recommendations, based on the similarities between items.

4.3 Implementation Details

Since neural networks are prone to overfitting, we applied dropout after the hidden representation layer, tuning the dropout rate for each dataset. We used early stopping to terminate training if the loss on the validation set did not decrease for five epochs. The weights of the matrices $\mathbf{W}$, $\mathbf{H}$, and $\mathbf{H}'$ were initialized from normal distributions. The size of each mini-batch was 10,000.
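The early-stopping rule described above can be sketched as follows; `train_one_epoch` and `validation_loss` are hypothetical placeholders for one pass of Algorithm 1 and for Eq. 11 evaluated on the validation set.

```python
def fit(model, patience=5, max_epochs=200):
    best, wait = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch(model)        # hypothetical: one pass of Algorithm 1
        val = validation_loss(model)  # hypothetical: Eq. 11 on the validation set
        if val < best:
            best, wait = val, 0
        else:
            wait += 1
            if wait >= patience:      # stop after 5 epochs without improvement
                break
```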

4.4 Experimental Results

Top-N Recommendations

Table 3 summarizes Recall@20 and nDCG@20 for each model. NPE significantly outperforms the competing methods across all datasets for both Recall and nDCG. We emphasize that all methods used the same data; NPE benefits from capturing the compatibility between each item and the other items picked by the same users.

Methods ML-10m OnlineRetail TasteProfile
Re@20 nDCG@20 Re@20 nDCG@20 Re@20 nDCG@20
SLIM 0.1342 0.1289 0.2085 0.1015 0.1513 0.1422
BPR 0.1314 0.1253 0.2137 0.0943 0.1598 0.1398
NeuCF 0.1388 0.1337 0.2199 0.0911 0.1609 0.1471
NPE (our) 0.1497 0.1449 0.2296 0.1742 0.1788 0.1594
Table 3: Recall and nDCG for the three datasets, with embedding size $d = 64$ and negative sampling ratio $k = 4$.

In Table 4, we summarize Recall values for the four methods when different numbers of items were recommended. From these results, we can see that NPE consistently outperformed the other methods in all settings. The differences between NPE and the other methods are more pronounced for small numbers of recommended items. This is a desirable property because we often consider only a small number of top items (e.g., the top-5 or top-10).

Methods ML-10m OnlineRetail TasteProfile
Re@5 Re@10 Re@20 Re@5 Re@10 Re@20 Re@5 Re@10 Re@20
SLIM 0.1284 0.1298 0.1342 0.0952 0.1311 0.2085 0.1295 0.1304 0.1513
BPR 0.1254 0.1261 0.1314 0.0859 0.1222 0.2137 0.1307 0.1311 0.1598
NeuCF 0.1347 0.1363 0.1388 0.0871 0.1274 0.2199 0.1342 0.1356 0.1609
NPE (our) 0.1451 0.1487 0.1497 0.1392 0.1667 0.2296 0.1428 0.1523 0.1788
Table 4: Recall for different numbers of recommended items, with embedding size $d = 64$ and negative sampling ratio $k = 4$.

The Performance on Cold-Users

We studied the performance of the models for users who had few historical activities. To this end, we partitioned the test cases into three groups (Low, Medium, and High) according to the number of clicks each user had.

Fig. 2 shows the breakdown of Recall@20 by user activity in the training set for ML-10m and OnlineRetail. Although the details vary across datasets, NPE outperformed the other methods for all three groups of users. The differences between NPE and the other methods are much more pronounced for the users with the fewest clicks. This is to be expected because, for such users, NPE can still exploit item relationships when making recommendations.

(a) ML-10m
(b) OnlineRetail
Figure 2: Recall@20 for different groups of users.

Effectiveness of the Item Representations

We evaluated the effectiveness of the item representations by investigating how well they capture item similarity and identify items that are often purchased together.

Similar items: The similarity between two items is defined as the cosine similarity between their embedding vectors. Fig. 3 shows three examples of the top-5 most similar items to a given item in the OnlineRetail dataset. The items’ embedding vectors effectively capture item similarity. For example, in the first row, given a red alarm clock, four of its top-5 similar items are also alarm clocks.
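A sketch of this similar-item query, using the learned item-embedding matrix (row $i$ holding $\mathbf{h}_i$); the names are ours:

```python
import numpy as np

def top_similar(item_emb, i, n=5):
    """Top-n items ranked by cosine similarity to item i."""
    H = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    sims = H @ H[i]
    sims[i] = -np.inf  # exclude the query item itself
    return np.argsort(-sims)[:n]
```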

Figure 3: Top-5 similar items for a given item. In each row, the given item is at the left and the top-5 similar items are to its right.

Items that are often purchased together: NPE can also identify items that are often purchased together. To assess whether two items are often purchased together, we calculate the inner product of one item’s embedding vector $\mathbf{h}_i$ and the other item’s context vector $\mathbf{h}'_j$. A high value of this inner product indicates that the two items are often purchased together. Fig. 4 shows an example of items that tend to be purchased together with a given item. Here, we see that buying a knitting Nancy, a child’s toy, might accompany the purchase of other goods for children or for a household.
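The corresponding co-purchase query scores candidates $j$ by $\mathbf{h}_i^\top \mathbf{h}'_j$; again a sketch with our own names:

```python
import numpy as np

def top_bought_together(item_emb, ctx_emb, i, n=5):
    """Top-n items j ranked by the inner product of h_i and the context vectors h'_j."""
    scores = ctx_emb @ item_emb[i]
    scores[i] = -np.inf  # exclude the query item itself
    return np.argsort(-scores)[:n]
```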

Figure 4: Top-5 items that are likely to be bought together with a given item. The given item is at the left and its top-5 “bought together” items are to its right.

Sensitivity Analysis

We also studied the effect of the hyper-parameters on the models’ performance.

Impact of the embedding size: To evaluate the effect of the dimensionality of the embedding space on the top-N recommendations, we varied the embedding size while fixing the other parameters. Table 5 summarizes Recall@20 for NPE on the three datasets for embedding sizes $d \in \{8, 16, 32, 64, 128, 256\}$. Larger embedding sizes generally improve the performance of the model: the optimal embedding size is 64 for ML-10m and OnlineRetail, and 128 for TasteProfile.

ML-10m OnlineRetail TasteProfile
Re@20 Re@20 Re@20
8 0.1428 0.1187 0.0987
16 0.1451 0.1596 0.1142
32 0.1441 0.1950 0.1509
64 0.1497 0.2296 0.1788
128 0.1482 0.2284 0.1992
256 0.1459 0.2248 0.1985
Table 5: Recall@20 for various embedding sizes, with negative sampling ratio $k = 4$.

Impact of the negative sampling ratio: During the training of NPE, we sampled $k$ negative examples per positive example. We studied the effect of the negative sampling ratio on the performance of NPE by fixing the embedding size at $d = 32$ and evaluating Recall@20 for $k \in \{1, 2, 4, 5, 8, 12, 16, 20\}$. From Table 6, we note that as $k$ increases, the performance also increases up to a certain value of $k$. The optimal negative sampling ratios are $k = 5$ for OnlineRetail and $k = 8$ for ML-10m and TasteProfile. This is reasonable because ML-10m and TasteProfile, being larger than OnlineRetail, need more negative examples.

ML-10m OnlineRetail TasteProfile
Re@20 Re@20 Re@20
1 0.1392 0.1608 0.1243
2 0.1418 0.1795 0.1451
4 0.1441 0.1950 0.1509
5 0.1478 0.1952 0.1585
8 0.1563 0.1941 0.1621
12 0.1531 0.1937 0.1615
16 0.1524 0.1925 0.1603
20 0.1496 0.1908 0.1598
Table 6: Recall@20 for different negative sampling ratios, with a fixed embedding size $d = 32$.

5 Conclusions and Future Work

We proposed NPE, a neural personalized embedding model for collaborative filtering that is effective both in making recommendations to cold-users and in learning item representations. Our experiments showed that NPE outperforms competing methods on top-N recommendations in general, and for cold-users in particular. Our qualitative analysis also demonstrated that the learned item representations effectively capture different kinds of relationships between items.

One future direction is to study the effectiveness of the model when using available side information about items. We also aim to investigate different negative sampling methods for dealing with the zero values in the user-item matrix.

References

  • [Barkan and Koenigstein, 2016] Oren Barkan and Noam Koenigstein. Item2Vec: neural item embedding for collaborative filtering. In RecSys Posters, volume 1688 of CEUR Workshop Proceedings, 2016.
  • [Chen et al., 2012] Daqing Chen, Sai Laing Sain, and Kun Guo. Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining. Journal of Database Marketing & Customer Strategy Management, 19(3):197–208, Sep 2012.
  • [He and McAuley, 2016] Ruining He and Julian McAuley. VBPR: visual Bayesian personalized ranking from implicit feedback. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI’16, pages 144–150. AAAI Press, 2016.
  • [He et al., 2017] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, WWW ’17, pages 173–182, 2017.
  • [Herlocker et al., 2002] Jon Herlocker, Joseph A. Konstan, and John Riedl. An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf. Retr., 5(4):287–310, October 2002.
  • [Hu et al., 2008] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on, pages 263–272. IEEE, 2008.
  • [Kingma and Ba, 2014] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.
  • [Koren, 2008] Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In KDD, pages 426–434. ACM, 2008.
  • [Koren, 2010] Yehuda Koren. Collaborative filtering with temporal dynamics. Commun. ACM, 53(4):89–97, 2010.
  • [Li et al., 2015a] Shaohua Li, Jun Zhu, and Chunyan Miao. A generative word embedding model and its low rank positive semidefinite solution. In EMNLP, pages 1599–1609. The Association for Computational Linguistics, 2015.
  • [Li et al., 2015b] Sheng Li, Jaya Kawale, and Yun Fu. Deep collaborative filtering via marginalized denoising auto-encoder. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, CIKM ’15, pages 811–820, 2015.
  • [Liang et al., 2016] Dawen Liang, Jaan Altosaar, Laurent Charlin, and David M. Blei. Factorization meets the item embedding: Regularizing matrix factorization with item co-occurrence. In RecSys, pages 59–66, 2016.
  • [Linden et al., 2003] Greg Linden, Brent Smith, and Jeremy York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1):76–80, January 2003.
  • [Mikolov et al., 2013] Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. Distributed representations of words and phrases and their compositionality. In NIPS, pages 3111–3119, 2013.
  • [Nguyen and Takasu, 2017] ThaiBinh Nguyen and Atsuhiro Takasu. A probabilistic model for the cold-start problem in rating prediction using click data. In Neural Information Processing, ICONIP ’17, pages 196–205, 2017.
  • [Nguyen et al., 2017] ThaiBinh Nguyen, Kenro Aihara, and Atsuhiro Takasu. Collaborative item embedding model for implicit feedback data. In International Conference on Web Engineering, ICWE ’17, pages 336–348, 2017.
  • [Ning and Karypis, 2011] Xia Ning and George Karypis. SLIM: sparse linear methods for top-n recommender systems. In Proceedings of the 2011 IEEE 11th International Conference on Data Mining, ICDM ’11, pages 497–506. IEEE Computer Society, 2011.
  • [Ning and Karypis, 2012] Xia Ning and George Karypis. Sparse linear methods with side information for top-n recommendations. In Proceedings of the Sixth ACM Conference on Recommender Systems, RecSys ’12, pages 155–162. ACM, 2012.
  • [Oord et al., 2013] Aäron van den Oord, Sander Dieleman, and Benjamin Schrauwen. Deep content-based music recommendation. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS’13, pages 2643–2651. Curran Associates Inc., 2013.
  • [Rendle et al., 2009] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, 2009.
  • [Rudolph et al., 2016] Maja Rudolph, Francisco Ruiz, Stephan Mandt, and David Blei. Exponential family embeddings. In Advances in Neural Information Processing Systems 29, pages 478–486. 2016.
  • [Salakhutdinov and Mnih, 2008] Ruslan Salakhutdinov and Andriy Mnih. Probabilistic matrix factorization. In Advances in Neural Information Processing Systems, volume 20, 2008.
  • [Sarwar et al., 2001] Badrul M. Sarwar, George Karypis, Joseph A. Konstan, and John Reidl. Item-based collaborative filtering recommendation algorithms. In World Wide Web, pages 285–295, 2001.
  • [Tang and Liu, 2017] L. Tang and E. Y. Liu. Joint user-entity representation learning for event recommendation in social network. In 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pages 271–280, April 2017.
  • [Wang and Blei, 2011] Chong Wang and David M. Blei. Collaborative topic modeling for recommending scientific articles. In KDD, pages 448–456, 2011.
  • [Wang et al., 2015] Hao Wang, Naiyan Wang, and Dit-Yan Yeung. Collaborative deep learning for recommender systems. In KDD, pages 1235–1244, 2015.