Knowledge Graph Embedding with Entity Neighbors and Deep Memory Network

Kai Wang    Yu Liu    Xiujuan Xu    Dan Lin
School of Software, Dalian University of Technology
Dalian 116620, Liaoning, China
kai_wang@mail.dlut.edu.cn, {yuliu,xjxu}@dlut.edu.cn
lindan0823@mail.dlut.edu.cn

Abstract

Knowledge Graph Embedding (KGE) aims to represent the entities and relations of a knowledge graph in a low-dimensional continuous vector space. Recent works focus on incorporating structural knowledge with additional information, such as entity descriptions, relation paths and so on. However, commonly used additional information usually contains plenty of noise, which makes it hard to learn valuable representations. In this paper, we propose a new kind of additional information, called entity neighbors, which contains both semantic and topological features about a given entity. We then develop a deep memory network model to encode information from neighbors. Employing a gating mechanism, we integrate the structure and neighbor representations into a joint representation. The experimental results show that our model outperforms existing KGE methods utilizing entity descriptions and achieves state-of-the-art metrics on 4 datasets.


Introduction

With promising potential in artificial intelligence applications, knowledge graphs (KG) have attracted extensive interest (??). Knowledge facts are stored in a KG as triplets in the form of (head entity, relation, tail entity), e.g. (Apple Inc., Operating Systems Developed, Mac OS). Despite great progress, with millions or even billions of real-world facts recorded, the construction of large-scale knowledge graphs is still confronted with incompleteness and sparseness (?).

Knowledge graph embedding (KGE) methods have been proposed to overcome this challenge by representing the entities and relations in a low-dimensional continuous vector space (?). As one of the typical methods, TransE (?) regards every relation as a translation between head and tail entities. Benefiting from KGE methods, we can do reasoning and prediction over a KG through algebraic computations.

However, when processing entities with few facts, KGE methods may decline in performance, as they learn solely from fact triplets (?). Therefore, multiple methods have been proposed to incorporate structural knowledge with additional information, including entity descriptions, relation paths and so on. Fig. 1 shows two kinds of additional information for a triplet sampled from Freebase (?). Apart from fact triplets, this information can provide more semantic or topological features for entity representation.

Figure 1: Example of entity descriptions and relation paths for a triplet in Freebase. For relation paths, only paths with fewer than three intermediate entities are shown in the graph. Relation names in paths are omitted, and a double line means two different relations exist between the entity pair.

Although effective, commonly used additional information usually contains considerable noise, which makes it hard to extract valuable representations. In Fig. 1, there are 8 different relation paths from ‘Apple Inc.’ to ‘Mac OS’, and all of them pass through the ‘OS X’ entity as the last intermediate node. Intuitively, (‘Apple Inc.’, ‘OS X’, ‘Mac OS’) forms a direct 3-step path, so the other, longer paths become redundant. Similarly, many entity descriptions, written by volunteers around the world, are rich in content but not concise. In descriptions nearly 100 words long, key features of an entity are few and far between.

To solve the problem above, we design a new kind of additional information, called entity neighbors. As a man is known by the company he keeps, the representation of an entity can be inferred from its neighbor entities. For instance, given the three terms ‘Corporation’, ‘Mac OS’ and ‘Steve Jobs’, people can immediately deduce the entity ‘Apple Inc.’. Motivated by the goal of simplifying entity descriptions and relation paths, we define entity neighbors as containing two parts: (1) Semantic Neighbors, entities mentioned in the description of the specific entity, or entities whose descriptions mention it; (2) Topological Neighbors, the surrounding entities of the specific entity in the KG, each having at least one relation with it.

Note that, compared with the entity descriptions and neighbor context used in recent methods, the entity neighbors we propose have three advantages: (1) Semantic richness. Entity neighbors combine both structural and semantic features. (2) Simplicity. Entity neighbors retain only representative elements while removing a lot of noise. (3) Availability. Entity neighbors can be generated quickly and handle situations where the description text is missing.

In this paper, we propose a novel architecture named Neighborhood Knowledge Graph Embedding (NKGE). We first define entity neighbors consisting of semantic neighbors and topological neighbors. To generate a neighbor representation for each entity, we develop a neighbor encoder based on a deep memory network. To the best of our knowledge, this is the first work to utilize memory networks in KG embedding. Based on the TransE and ConvE (?) methods respectively, we design two kinds of NKGE architectures, combining the structure representation and the neighbor representation. To verify the effectiveness of the entity neighbors and the encoder model independently, we design a controlled trial on the link prediction task. Experimental results show that our TransE-based model outperforms existing TransE-based methods utilizing entity descriptions, and our ConvE-based model achieves state-of-the-art metrics on most of the experimental datasets.

The rest of our paper is structured as follows: We outline related work on KGE and deep memory networks in the next section. Section Methods gives a detailed description of the NKGE model, and section Experiments presents experiments to validate the effectiveness of the NKGE model. Finally, we summarize this work and future directions in section Conclusion.

Related Work

Knowledge Graph Embedding

Recent KGE methods can be broadly separated into two groups: translational distance models and semantic matching models (?). Represented by the TransE method (?), translational distance models compute the distance between two entities to measure the plausibility of a triplet. To address flaws in dealing with ‘1-to-N’, ‘N-to-1’, and ‘N-to-N’ relations (?), several variants of TransE such as TransH (?), TransR (?) and TransD (?) have been proposed. Semantic matching models learn representations by matching latent semantics of entities and relations embodied in their vector space representations. RESCAL (?) models pairwise interactions between latent semantic vectors and a latent relation matrix. DistMult (?) simplifies the relation matrices to diagonal matrices. ComplEx (?) extends DistMult to better model asymmetric relations. Some more recent models, such as ANALOGY (?) and ConvE (?), achieve strong performance. In our work, we focus on the impact of integrating entity neighbor information into KGE.

Incorporating Additional Information

Commonly used additional information, as a supplement to the structure representation of a KG, includes relation paths, entity descriptions and so on. In terms of integrating relation paths, Lin et al. (?) propose path-based TransE (PTransE) to model relation paths, using composition operations of addition, multiplication, and recurrent neural networks (RNN). Guu et al. (?) utilize KG embedding to answer path queries and build triplets using entity pairs connected with relation paths. Xiong et al. (?) introduce a reinforcement learning method to learn multi-hop relational paths based on KG embedding. In terms of integrating entity descriptions, Wang et al. (?) propose a joint model of entity and word embeddings using entity names or Wikipedia anchors. Zhong et al. (?) align entity and text representations in a continuous vector space using entity descriptions. Xie et al. (?) jointly learn KG embeddings by using a CNN model to encode the semantics of entity descriptions. Xu et al. (?) propose a gating mechanism to integrate structural and textual information into a unified architecture. Compared with path-based methods, our neighbor processing model is more similar to the joint models using entity descriptions.

Deep Memory Network

Recently, computational models based on attention mechanisms and explicit memory have achieved great success in many NLP tasks (??). Neural Turing Machines (?) extend deep neural networks with an external memory, which uses a continuous memory representation with both content- and address-based access. Weston et al. (?) propose a neural-network-based model called Memory Networks, which is designed with non-writable memories and builds a hierarchical memory representation. Originally designed for question answering tasks, End-to-End Memory Networks (MemN2N) (?) improve Memory Networks to support end-to-end training and operate via a memory selection mechanism in which relevant memory pieces are adaptively selected based on the input query. Dynamic Memory Networks (?) are equipped with an episodic memory and achieve promising results on question answering and sentiment analysis tasks.

Our model is inspired by the recent success of MemN2N. We aim to utilize the multilayered reasoning capabilities of memory networks for neighbor representation learning.

Methods

In this section, we first introduce the notations used in our model. Then, we define two types of entity neighbors and describe the extracting and filtering strategies in detail. After that, we dive into the mathematical and algorithmic details of the deep memory network encoder for neighbor representation. Finally, neighbor representation is utilized in two NKGE architectures, based on TransE and ConvE respectively.

Notations

For a given KG, we define the set of entities as $\mathcal{E}$, the set of relations as $\mathcal{R}$, and the set of fact triplets as $\mathcal{T}$. Each triplet in $\mathcal{T}$ is represented by $(h, r, t)$, where $h, t \in \mathcal{E}$ and $r \in \mathcal{R}$. Each entity $e$ has a set of neighbors $N(e)$. Therefore, an entity is represented by two embedding vectors: (1) a structure embedding, which describes the meaning of the entity, and (2) a neighbor embedding, which is used to construct the neighbor representation. Each relation is represented by a relation embedding vector. All entity, neighbor and relation embeddings take values in $\mathbb{R}^d$. Our goal is to learn these embedding vectors for all entities and relations.
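As a concrete illustration of these notations (not part of the original model), the following minimal Python sketch loads a KG from a tab-separated triplet file into the sets $\mathcal{E}$, $\mathcal{R}$ and $\mathcal{T}$; the file format and function names are assumptions made for illustration only.

from dataclasses import dataclass

@dataclass(frozen=True)
class Triplet:
    head: str      # h in E
    relation: str  # r in R
    tail: str      # t in E

def load_kg(path):
    """Read tab-separated (h, r, t) lines into the sets E, R, T."""
    entities, relations, triplets = set(), set(), set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            h, r, t = line.rstrip("\n").split("\t")
            entities.update((h, t))
            relations.add(r)
            triplets.add(Triplet(h, r, t))
    return entities, relations, triplets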

Entity Neighbors

As a new kind of additional information for KG embedding, entity neighbors refer to a set of entities that are closely related to the specific entity. We define two types of neighbors including topological neighbors and semantic neighbors.

Topological neighbors of an entity are its surrounding entities in the given KG. Each neighbor has at least one relation with the specific entity in the fact triplets. Specifically, given an entity $e$, the topological neighbor set of $e$ is $N_{top}(e) = \{e' \mid (e, r, e') \in \mathcal{T} \text{ or } (e', r, e) \in \mathcal{T},\ r \in \mathcal{R}\}$. For example, as shown in Figure 1, the topological neighbors of ‘Apple Inc.’ in the graph include ‘IOS’, ‘OS X’ and ‘Official Website’.

Semantic neighbors of an entity are extracted from description text, including entities mentioned in its description and entities whose descriptions mention this entity. Specifically, given an entity $e$, its name is represented by a word set $W_{name}(e)$ and its description text by a word set $W_{desc}(e)$. The semantic neighbor set of $e$ is $N_{sem}(e) = \{e' \mid W_{name}(e') \subseteq W_{desc}(e) \text{ or } W_{name}(e) \subseteq W_{desc}(e')\}$. For example, the semantic neighbors of ‘Apple Inc.’ in Figure 1 include ‘Corporation’, ‘Mac OS’, ‘OS X’, ‘Apple Software’ and so on.

Using the above extraction strategies, we obtain two neighbor sets for an entity. In some cases, an entity has up to hundreds of neighbors, such as ‘the United States’. Therefore, we develop a filter mechanism to select the top typical neighbors from the two sets. First, for each neighbor, we count its number of occurrences in the two kinds of neighbor sets respectively. We assume that a lower frequency indicates a more representative neighbor. Then, given an entity $e$, the neighbors appearing in both neighbor sets are selected first. The remaining places are filled by the neighbors with the smallest frequencies in the two sets.
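For illustration, a minimal Python sketch of the extraction and filtering strategy described above is given below. It assumes entity names and descriptions are plain strings, uses simple substring matching for semantic neighbors, and collapses the two per-set frequency counts into a single global count for brevity; all function names are hypothetical.

def topological_neighbors(entity, triplets):
    """Entities that share at least one fact triplet with `entity`."""
    return ({t for h, _, t in triplets if h == entity}
            | {h for h, _, t in triplets if t == entity})

def semantic_neighbors(entity, names, descriptions):
    """Entities mentioned in `entity`'s description, plus entities whose
    descriptions mention `entity` (simple name matching on raw text)."""
    mentioned = {e for e, name in names.items()
                 if e != entity and name in descriptions.get(entity, "")}
    mentioning = {e for e, desc in descriptions.items()
                  if e != entity and names[entity] in desc}
    return mentioned | mentioning

def filter_neighbors(topo, sem, frequency, k=20):
    """Select at most k typical neighbors: entities present in both sets
    come first; the rest are ordered by ascending frequency (a lower
    frequency is assumed to mark a more representative neighbor)."""
    selected = sorted(topo & sem, key=lambda e: frequency[e])[:k]
    rest = sorted((topo | sem) - set(selected), key=lambda e: frequency[e])
    return selected + rest[:k - len(selected)]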

Note that, compared with the entity descriptions and neighbor context used in recent methods, the entity neighbors we propose have three advantages: (1) Semantic richness. Entity neighbors combine both structural and semantic features. (2) Simplicity. Entity neighbors retain only representative elements while removing a lot of noise. (3) Availability. Entity neighbors can be generated quickly from the given text and handle situations where the description text is missing.

Deep Memory Network Encoder

After generating entity neighbors, we need to encode the neighbor representation for a given entity. Several kinds of neural models have been used for entity description encoding, such as continuous bag-of-words (CBOW), recurrent neural networks (RNN) and convolutional neural networks (CNN) (??). However, different from the continuous word sequence in description text, each entity neighbor is semantically independent and has potential relations with the others. In this paper, we propose a new encoder, the DMN encoder, based on MemN2N (?).

MemN2N is an RNN-like model that performs well on question answering tasks. Using sentences as external memory, MemN2N iteratively extracts information guided by a given query. An attention mechanism is utilized in each iteration to infer potential semantics. The final answer is then predicted by processing the outputs of the last iteration. Leveraging MemN2N’s reasoning capabilities, our DMN encoder adaptively extracts information from entity neighbors and integrates it into the entity’s neighbor representation. An illustration of the DMN encoder is shown in Fig. 2.

The input data of the DMN encoder consists of an input query and an external memory. Specifically, given an entity $e$, its neighbor set is $N(e) = \{n_1, n_2, \dots, n_K\}$, and all neighbors are converted into neighbor embedding vectors. Then $N(e)$ is used as the external memory of the DMN encoder, with each neighbor regarded as a memory cell. The input query of the encoder is a $d$-dimensional vector. Intuitively, an entity should have different neighbor representations under different relations. So given a triplet $(h, r, t)$, we use the relation embedding $\mathbf{r}$ as the input query when representing entity $h$ or $t$.

Figure 2: An illustration of the DMN encoder with three layers.
Figure 3: The general architecture of the NKGE model.

Single Layer DMN Encoder We first introduce the DMN encoder with only one layer (iteration), which is made up of two parts: an attention part and a review part.

In the attention part, the attention score $\alpha_i$ for each neighbor $n_i$ is computed as

$$g_i = \tanh\left(W_{att}\,[\mathbf{n}_i ; \mathbf{q}] + b_{att}\right), \qquad (1)$$
$$\alpha_i = \frac{\exp(g_i)}{\sum_{j=1}^{K}\exp(g_j)}, \qquad (2)$$

where $W_{att} \in \mathbb{R}^{1 \times 2d}$ and $b_{att} \in \mathbb{R}$. The score $\alpha_i$ of the $i$-th neighbor determines the contribution degree of this neighbor. Then we generate the neighbor encoding $\mathbf{o}$ by a weighted sum of the neighbors with attention scores $\alpha_i$:

$$\mathbf{o} = \sum_{i=1}^{K} \alpha_i \mathbf{n}_i. \qquad (3)$$

In the review part, we independently process the original information of the input query using a fully connected layer:

$$\mathbf{u} = W_{q}\,\mathbf{q} + b_{q}, \qquad (4)$$

where $W_{q} \in \mathbb{R}^{d \times d}$ and $b_{q} \in \mathbb{R}^{d}$. The review part’s output is added to the neighbor encoding as the final neighbor representation of the single-layer DMN encoder:

$$\mathbf{rep} = \mathbf{o} + \mathbf{u}. \qquad (5)$$

Multiple Layer DMN Encoder The single-layer DMN encoder is simple but not powerful enough, while multiple layers allow the deep memory network to learn representations with deeper levels of abstraction. In the iterative process, the neighbor representation of an entity is continuously refined by learning the potential semantics among its neighbors. The multi-layer encoder has several neural layers with the same structure as the single-layer model. The input query of the first layer is the same as in the single-layer encoder, and the output of each layer is used as the input query of the layer above. Finally, the output of the last layer is used as the neighbor representation of the entity.
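The following PyTorch-style sketch illustrates one possible implementation of the multi-layer DMN encoder described above. The attention parameterization (a linear layer with tanh over the concatenated neighbor and query, followed by a softmax, matching Eqs. (1)-(3)) and the sharing of weights across layers are assumptions; the original model may differ in these details.

import torch
import torch.nn as nn

class DMNEncoder(nn.Module):
    """Multi-layer deep memory network encoder (illustrative sketch).

    Each layer attends over the neighbor memory conditioned on the current
    query (attention part), adds a linear transform of the query (review
    part), and passes the result to the next layer as its query."""
    def __init__(self, dim, num_layers=3):
        super().__init__()
        self.num_layers = num_layers
        self.att = nn.Linear(2 * dim, 1)    # attention part, Eqs. (1)-(2)
        self.review = nn.Linear(dim, dim)   # review part, Eq. (4)

    def forward(self, neighbors, query):
        # neighbors: (batch, K, dim) neighbor embeddings used as external memory
        # query:     (batch, dim)    relation embedding used as the input query
        for _ in range(self.num_layers):              # weights shared across layers (assumption)
            q = query.unsqueeze(1).expand_as(neighbors)
            scores = torch.tanh(self.att(torch.cat([neighbors, q], dim=-1)))
            alpha = torch.softmax(scores, dim=1)      # attention over the K memory cells
            o = (alpha * neighbors).sum(dim=1)        # weighted sum, Eq. (3)
            query = o + self.review(query)            # Eq. (5); becomes the next layer's query
        return query  # neighbor representation of the entity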

TransE-based NKGE Architecture

Different from typical KGE techniques, we integrate structure and neighbor information into a joint representation. For a better comparison with recent works (??), we select the typical method TransE (?) to learn the structure representation. Given a triplet $(h, r, t)$, TransE’s score function is defined as

$$f_s(h, r, t) = \|\mathbf{h}_s + \mathbf{r} - \mathbf{t}_s\|_{L1/L2}, \qquad (6)$$

where $\mathbf{h}_s$, $\mathbf{t}_s$ are the structure embeddings of $h$ and $t$ respectively and satisfy $\|\mathbf{h}_s\|_2 = \|\mathbf{t}_s\|_2 = 1$, and $\mathbf{r}$ is the relation embedding of $r$.

To integrate the two kinds of entity representations, we use the gating mechanism proposed by (?). For each entity $e$, a $d$-dimensional gate vector $\tilde{g}_e$ is defined to assign a different weight to each dimension of the representation vector. To constrain each weight to [0, 1], a logistic sigmoid function is used to transform the gate vector. The joint representation $\mathbf{e}$ is computed as follows:

$$\mathbf{e} = \sigma(\tilde{g}_e) \odot \mathbf{e}_s + \left(1 - \sigma(\tilde{g}_e)\right) \odot \mathbf{e}_n, \qquad (7)$$

where $\mathbf{e}_s$, $\mathbf{e}_n$ are the structure and neighbor representations of $e$ respectively, $\sigma$ is the sigmoid function and $\odot$ denotes element-wise multiplication. Following TransE, our final score function is defined as

$$f(h, r, t) = \|\mathbf{h} + \mathbf{r} - \mathbf{t}\|_{L1/L2}, \qquad (8)$$

where $\mathbf{h}$ and $\mathbf{t}$ are the joint representations of $h$ and $t$.

The general architecture of our model is shown in Fig. 3.
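For concreteness, a minimal PyTorch-style sketch of the gating mechanism and the TransE-based score on joint representations (Eqs. (7)-(8)) follows; the class and function names are hypothetical.

import torch
import torch.nn as nn

class GatedJointRepresentation(nn.Module):
    """Per-dimension gate combining structure and neighbor embeddings, Eq. (7)."""
    def __init__(self, num_entities, dim):
        super().__init__()
        self.gate = nn.Embedding(num_entities, dim)  # one gate vector per entity

    def forward(self, entity_ids, structure_emb, neighbor_emb):
        g = torch.sigmoid(self.gate(entity_ids))     # constrain each weight to [0, 1]
        return g * structure_emb + (1.0 - g) * neighbor_emb

def transe_score(h_joint, r_emb, t_joint, p=1):
    """TransE-style distance on joint representations, Eq. (8); lower is better."""
    return torch.norm(h_joint + r_emb - t_joint, p=p, dim=-1)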

Following recent works, we minimize a contrastive max-margin criterion (??) as the objective to train our model. The main idea is that each triplet from the training set should receive a better (lower-distance) score than a randomly generated triplet. Given the set of fact triplets $\mathcal{T}$ as the positive sampling set, we generate the negative sampling set $\mathcal{T}'$:

$$\mathcal{T}' = \{(h', r, t) \mid h' \in \mathcal{E}\} \cup \{(h, r, t') \mid t' \in \mathcal{E}\}, \quad (h, r, t) \in \mathcal{T}, \qquad (9)$$

in which each negative sample is derived from a triplet in $\mathcal{T}$ by randomly replacing the head or the tail with another entity. We use the Bernoulli sampling strategy described in (?). Letting the set of all parameters be $\Theta$, we minimize the following objective:

$$L(\Theta) = \sum_{(h,r,t) \in \mathcal{T}} \sum_{(h',r,t') \in \mathcal{T}'} \max\left(0,\; \gamma + f(h, r, t) - f(h', r, t')\right) + \lambda \|\Theta\|_2^2, \qquad (10)$$

where $\gamma$ is a margin between positive and negative triplets. We use standard L2 regularization of all the parameters, weighted by the hyperparameter $\lambda$. The optimization is standard back-propagation using stochastic gradient descent (SGD).
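For concreteness, the sketch below shows the margin-based objective and the Bernoulli negative sampling described above (Eqs. (9)-(10)); the L2 regularization term can instead be added through the optimizer's weight decay. The per-relation statistics tph (average tails per head) and hpt (average heads per tail) are assumed to be precomputed, and all names are hypothetical.

import random
import torch

def bernoulli_corrupt(triplet, entities, tph, hpt):
    """Corrupt one side of a triplet, choosing the side with the Bernoulli
    strategy of Wang et al. (2014b): replace the head with probability
    tph[r] / (tph[r] + hpt[r]), otherwise replace the tail.
    `entities` is a list of candidate entity ids."""
    h, r, t = triplet
    if random.random() < tph[r] / (tph[r] + hpt[r]):
        return random.choice(entities), r, t
    return h, r, random.choice(entities)

def margin_loss(pos_scores, neg_scores, margin=1.0):
    """Contrastive max-margin objective of Eq. (10), without the L2 term."""
    return torch.clamp(margin + pos_scores - neg_scores, min=0.0).mean()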

ConvE-based NKGE Architecture

ConvE (?) uses a 2D convolutional neural network as the score function and achieves state-of-the-art results on several datasets. Given a triplet $(h, r, t)$, ConvE’s score function is defined as

$$f(h, r, t) = g\left(\mathrm{vec}\left(g\left([\bar{\mathbf{h}}; \bar{\mathbf{r}}] * \omega\right)\right) W\right)\, \mathbf{t}, \qquad (11)$$

where $\bar{\mathbf{h}}$, $\bar{\mathbf{r}}$ denote a 2D reshaping of $\mathbf{h}$ and $\mathbf{r}$ respectively, $g$ denotes a non-linear activation function, and the parameters of the convolutional filters and the linear layer are denoted as $\omega$ and $W$.

Similar to the TransE-based architecture, we replace the entity embeddings in ConvE with the joint entity representations. Following ConvE’s training settings, we apply the logistic sigmoid function to the scores:

$$p = \sigma\left(f(h, r, t)\right), \qquad (12)$$

and minimize the following binary cross-entropy loss:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\left(y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right), \qquad (13)$$

where the elements of the label vector $y$ are one for relationships that exist and zero otherwise. Furthermore, we use 1-N scoring, batch normalisation and dropout as in ConvE, and use Adam (?) as the optimiser.
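The following simplified PyTorch-style sketch illustrates ConvE-style 1-N scoring over joint entity representations (Eqs. (11)-(12)). It omits batch normalisation and dropout for brevity, and the filter size, channel count and reshaping shape are illustrative assumptions rather than the settings used in the paper; the loss in Eq. (13) can be applied to the returned probabilities with torch.nn.BCELoss.

import torch
import torch.nn as nn

class ConvEScorer(nn.Module):
    """ConvE-style 1-N scorer over joint entity representations (sketch)."""
    def __init__(self, dim=200, shape=(10, 20), channels=32):
        super().__init__()
        self.shape = shape                      # 2D reshaping of a dim-sized vector
        self.conv = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        conv_out = channels * 2 * shape[0] * shape[1]
        self.fc = nn.Linear(conv_out, dim)

    def forward(self, h_joint, r_emb, all_entities):
        # h_joint, r_emb: (batch, dim); all_entities: (num_entities, dim)
        b = h_joint.size(0)
        x = torch.cat([h_joint.view(b, 1, *self.shape),
                       r_emb.view(b, 1, *self.shape)], dim=2)   # stacked 2D input
        x = torch.relu(self.conv(x))
        x = torch.relu(self.fc(x.view(b, -1)))
        logits = x @ all_entities.t()                           # 1-N scoring, Eq. (11)
        return torch.sigmoid(logits)                            # probabilities, Eq. (12)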

Dataset    #Rel   #Ent    #Train   #Valid  #Test
FB15k      1,345  14,951  483,142  50,000  59,071
FB15k237   237    14,541  272,115  17,535  20,466
WN18       18     40,943  141,442  5,000   5,000
WN18RR     11     40,943  86,835   3,034   3,134
Table 1: Statistics of datasets used in experiments.

Experiments

We describe experimental settings and report empirical results in this section.

Metric FB15K FB15K237
MR MRR Hits@10 Hits@3 Hits@1 MR MRR Hits@10 Hits@3 Hits@1
TransE 0.39 47.0 24.8 27.6 16.6
TransD - - - - - -
NKGE(TransE) 50 0.54 78.5 62.5 39.7 193 33.4 21.7
DistMult - 73.3 54.6 26.3 15.5
ComplEx - 75.9 59.9 27.5 15.8
ANALOGY - 78.5 64.6 - - - - -
ConvE 51 72.3 55.8 35.6 23.7
NKGE(ConvE) 0.73 87.1 79.0 65.0 0.33 51.0 36.5 24.1
Table 2: Results of link prediction task on FB15k and FB15k237
Metric WN18 WN18RR
MR MRR Hits@10 Hits@3 Hits@1 MR MRR Hits@10 Hits@3 Hits@1
TransE - - - - - 43.2 - -
TransD - - - - - 42.8 - -
NKGE(TransE) 204 0.496 94.2 82.8 17.3 1511 34.4 2.2
DistMult 901 91.4 72.8 44 39
ComplEx - 93.6 93.6 46 41
ANALOGY - 94.4 93.9 - - - - -
ConvE 374 93.5 55.8 44 40
NKGE(ConvE) 0.947 95.7 94.9 94.2 0.45 52.6 46.5 42.1
Table 3: Results of link prediction task on WN18 and WN18RR

Datasets

In this paper, we use two popular datasets, FB15k (?) and WN18 (?). FB15k is extracted from Freebase, in which a large fraction of the content describes knowledge facts about movies, actors, awards, and sports. WN18 is a subset of the English lexical database WordNet (?). However, the drawback of these two datasets is that many test triplets can be obtained simply by inverting triplets in the training set. To solve this test leakage, FB15k237 (?) and WN18RR (?) were created by removing the inverse relations, respectively. Statistics of the four datasets are given in Table 1. The text descriptions of these datasets are publicly available. For each dataset, we build the entity neighbor data from the text description of each entity and the triplets in the training set.

Parameter Settings

For the TransE-based model, we select the margin, the embedding dimension, the learning rate, the maximum number of neighbors K and the number of layers L for the multi-layer DMN encoder by grid search, and the dissimilarity measure is set to either the L1 or L2 distance. To speed up convergence and avoid overfitting, the structure embeddings of entities and the relation embeddings are initialized with the pre-trained results of TransE; the neighbor embeddings and the remaining parameters are initialized by random sampling from a uniform distribution. The final optimal configuration uses the L1 distance. For the ConvE-based model, we set K = 20, d = 200 and L = 3, and the rest of the settings are the same as in the original ConvE.

Link Prediction Task

As a subtask of knowledge graph completion, link prediction aims to predict the missing entity when the other two parts of a triplet are given. In other words, we need to predict $t$ given $(h, r)$ or predict $h$ given $(r, t)$. Different from other prediction tasks that require a single best answer, this task focuses on the rank of the correct entity.

We utilize three evaluation metrics similar to (?): (1) Mean Rank (MR), the average rank of all correct entities, (2) Mean Reciprocal Rank (MRR), the average inverse rank of all correct entities, and (3) Hits@N, the proportion of correct entities ranked in the top N (N = 1, 3, 10). A good embedding model should achieve lower MR, higher MRR and higher Hits@N. We also follow the ‘Filter’ evaluation setting, which removes candidate triplets appearing in the train, valid and test sets before ranking.
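As a reference for how these metrics are typically computed, a small NumPy sketch of the 'Filter' ranking and the MR / MRR / Hits@N metrics is given below; the function names are hypothetical, and it assumes that higher scores indicate more plausible triplets.

import numpy as np

def filtered_rank(scores, correct_id, known_ids):
    """Rank of the correct entity after removing other known true triplets
    ('Filter' setting). `scores` holds one score per candidate entity."""
    mask = np.ones_like(scores, dtype=bool)
    mask[[i for i in known_ids if i != correct_id]] = False
    filtered = scores[mask]
    # rank = 1 + number of remaining candidates scoring above the correct one
    return 1 + int((filtered > scores[correct_id]).sum())

def filtered_metrics(ranks):
    """Compute MR, MRR and Hits@N from the filtered ranks of correct entities."""
    ranks = np.asarray(ranks, dtype=np.float64)
    return {
        "MR": ranks.mean(),
        "MRR": (1.0 / ranks).mean(),
        **{f"Hits@{n}": (ranks <= n).mean() for n in (1, 3, 10)},
    }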

Results on Four Datasets

The evaluation results on the four datasets are shown in Tables 2 and 3. We use ‘NKGE (TransE)’ and ‘NKGE (ConvE)’ to denote our models based on TransE and ConvE respectively. The baselines are TransE and the state-of-the-art model ConvE. To validate our model’s performance, we also choose several recent KGE methods, including TransR (?), TransD (?), DistMult (?), ComplEx (?) and ANALOGY (?). The results show that: (1) Our NKGE models outperform the baselines, TransE and ConvE, on all metrics respectively, which confirms the effectiveness of the neighbor representation. (2) NKGE (TransE) achieves state-of-the-art Mean Rank on the 4 datasets, and NKGE (ConvE) achieves state-of-the-art MRR and Hits@N across most datasets. (3) On the FB15k dataset, the original ConvE is weaker than ComplEx and ANALOGY, while NKGE (ConvE) is very close to ANALOGY and achieves a better Hits@10.

Comparison with entity descriptions

A controlled trial is designed to compare our model with description-based methods. We choose two representative works, DKRL (?) and the Jointly model (?), which integrate entity descriptions into structure embeddings. Because both of them are based on TransE, we use the TransE-based NKGE model in this trial.

To test the performance of entity neighbors and the DMN encoder independently, we design two derived models: (1) NKGE (CBOW + Nei), using entity neighbors and a CBOW encoder, which generates the representation by simply summing up all neighbor embeddings; (2) NKGE (DMN + Des), using entity descriptions and the multi-layer DMN encoder. The initialization of word embeddings is the same as in (?). The results are shown in Table 4.

Compared with the two description-based methods on FB15k, our original model NKGE (DMN + Nei) obtains the best scores on MR and Hits@10. NKGE (DMN + Des) outperforms Jointly (A_LSTM + Des) using the same entity descriptions, which verifies that our DMN encoder has a better capability for encoding additional information. In terms of entity neighbors, the derived model NKGE (CBOW + Nei) achieves better performance than Jointly (CBOW + Des) using the same CBOW encoder. This proves the validity of entity neighbors as additional information for KG embedding, which can replace entity descriptions to some extent.

Metric MR Hits@10
DKRL (CNN + Des)
Jointly (CBOW + Des)
Jointly (A_LSTM + Des)
NKGE (CBOW + Nei)
NKGE (DMN + Des)
NKGE (DMN + Nei) 50 78.5
Table 4: Comparison results of NKGE and description-based methods on FB15k.

Analysis of entity neighbors

Using two types of neighbors, topological and semantic, is motivated by simplifying relation paths and entity descriptions. We assume the overlap between the two parts is more valuable. To verify this hypothesis, we compare the performance of ConvE-based NKGE using different types of neighbors on FB15k237. As the results in Table 5 show, NKGE (T&S), which uses the whole set of entity neighbors, achieves better performance than the models with only one type of neighbors.

Metric MR Hits@10 Hits@3 Hits@1
NKGE (Top)
NKGE (Sem)
NKGE (T&S) 237 51.0 36.5 24.1
Table 5: Comparison results of different types of neighbors on FB15k237.

We also care about the availability of neighbors. As real KGs are usually incomplete and sparse, there are entities with few triplets or missing descriptions. In those cases, using only text or paths runs into trouble. Containing both types of neighbors, the entity neighbors we propose remain effective when one type is missing. Fig. 4 shows the quantity distribution of entities with different numbers of neighbors on FB15k237. With the maximum number of neighbors set to 20, ‘T&S’ yields the most entities with complete neighbor information. Note that some entities have no topological neighbors, which reflects exactly the absence situation described above.

Figure 4: The quantitative distribution of different types of neighbors on FB15k237.

Conclusion

In this paper, we propose the NKGE model for KG embedding. Instead of entity descriptions, we define entity neighbors as a new kind of additional information. We explore a deep memory network encoder to extract latent semantics from neighbors. Experimental results show that our model outperforms the baseline TransE and other recent KGE methods on the link prediction task. In comparison with description-based methods, both the entity neighbors and the DMN encoder show better performance.

We will explore the following research directions in future:

  • We select semantic neighbors from text by quickly matching entity names, which inevitably produces omissions. We may design a more effective matching mechanism in the future.

  • The gate mechanism we use estimates weights according to the entity only; we may incorporate relation and neighbor information to improve it.

  • Since entity neighbors are more suitable for sparse KG completion, we will further utilize the NKGE model in real KGs of some specific domains.

References

  • [Bollacker et al. 2008] Bollacker, K. D.; Evans, C.; Paritosh, P.; Sturge, T.; and Taylor, J. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, 1247–1250.
  • [Bordes et al. 2013] Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; and Yakhnenko, O. 2013. Translating embeddings for modeling multi-relational data. In Proceedings of Advances in Neural Information Processing Systems (NIPS 2013), 2787–2795.
  • [Bordes et al. 2014] Bordes, A.; Glorot, X.; Weston, J.; and Bengio, Y. 2014. A semantic matching energy function for learning with multi-relational data. Machine Learning 94:233–259.
  • [Das et al. 2017] Das, R.; Zaheer, M.; Reddy, S.; and McCallum, A. D. 2017. Question answering on knowledge bases and text using universal schema and memory networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Short Papers), 358–365.
  • [Dettmers et al. 2018] Dettmers, T.; Minervini, P.; Stenetorp, P.; and Riedel, S. 2018. Convolutional 2d knowledge graph embeddings. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 1811–1818.
  • [Graves, Wayne, and Danihelka 2014] Graves, A.; Wayne, G.; and Danihelka, I. 2014. Neural turing machines. Computing Research Repository arXiv:1410.5401. version 2.
  • [Guu, Miller, and Liang 2015] Guu, K.; Miller, J. G.; and Liang, P. 2015. Traversing knowledge graphs in vector space. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 318–327.
  • [Ji et al. 2015] Ji, G.; He, S.; Xu, L.; Liu, K.; and Zhao, J. 2015. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 687–696.
  • [Ji et al. 2016] Ji, G.; Liu, K.; He, S.; and Zhao, J. 2016. Knowledge graph completion with adaptive sparse transfer matrix. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, 985–991.
  • [Kingma and Ba 2015] Kingma, D. P., and Ba, J. L. 2015. Adam: A method for stochastic optimization. In Proceedings of International Conference on Learning Representations (ICLR 2015).
  • [Lin et al. 2015a] Lin, Y.; Liu, Z.; Luan, H.-B.; Sun, M.; Rao, S.; and Liu, S. 2015a. Modeling relation paths for representation learning of knowledge bases. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 705–714.
  • [Lin et al. 2015b] Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; and Zhu, X. 2015b. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2181–2187.
  • [Liu, Wu, and Yang 2017] Liu, H.; Wu, Y.; and Yang, Y. 2017. Analogical inference for multi-relational embeddings. In Proceedings of the 34th International Conference on Machine Learning, 2168–2178.
  • [Miller 1995] Miller, G. A. 1995. WordNet: A lexical database for English. Communications of the ACM 38:39–41.
  • [Nickel, Tresp, and Kriegel 2011] Nickel, M.; Tresp, V.; and Kriegel, H.-P. 2011. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning, 809–816.
  • [Pujara, Augustine, and Getoor 2017] Pujara, J.; Augustine, E.; and Getoor, L. 2017. Sparsity and noise: Where knowledge graph embeddings fall short. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 1751–1756.
  • [Socher et al. 2013] Socher, R.; Chen, D.; Manning, C. D.; and Ng, A. Y. 2013. Reasoning with neural tensor networks for knowledge base completion. In Proceedings of Advances in Neural Information Processing Systems (NIPS 2013), 926–934.
  • [Song, Wu, and Dong 2016] Song, Q. B.; Wu, Y.; and Dong, X. 2016. Mining summaries for knowledge graph search. 2016 IEEE 16th International Conference on Data Mining (ICDM) 1215–1220.
  • [Sukhbaatar et al. 2015] Sukhbaatar, S.; Szlam, A.; Weston, J.; and Fergus, R. 2015. End-to-end memory networks. In Proceedings of Advances in Neural Information Processing Systems (NIPS 2015), 2440–2448.
  • [Tang, Qin, and Liu 2016] Tang, D.; Qin, B.; and Liu, T. 2016. Aspect level sentiment classification with deep memory network. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 214–224.
  • [Toutanova and Chen 2015] Toutanova, K., and Chen, D. 2015. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality.
  • [Trouillon et al. 2016] Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; and Bouchard, G. 2016. Complex embeddings for simple link prediction. In Proceedings of the 33rd International Conference on Machine Learning, 2071–2080.
  • [Wang et al. 2014a] Wang, Z.; Zhang, J.; Feng, J.; and Chen, Z. 2014a. Knowledge graph and text jointly embedding. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 1591–1601.
  • [Wang et al. 2014b] Wang, Z.; Zhang, J.; Feng, J.; and Chen, Z. 2014b. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, 1112–1119.
  • [Wang et al. 2017] Wang, Q.; Mao, Z.; Wang, B.; and Guo, L. 2017. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering 29:2724–2743.
  • [Weston, Chopra, and Bordes 2015] Weston, J.; Chopra, S.; and Bordes, A. 2015. Memory networks. In Proceedings of International Conference on Learning Representations (ICLR 2015).
  • [Xie et al. 2016] Xie, R.; Liu, Z.; Jia, J.; Luan, H.; and Sun, M. 2016. Representation learning of knowledge graphs with entity descriptions. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, 2659–2665.
  • [Xiong, Hoang, and Wang 2017] Xiong, W.; Hoang, T.; and Wang, W. Y. 2017. Deeppath: A reinforcement learning method for knowledge graph reasoning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 575–584.
  • [Xiong, Merity, and Socher 2016] Xiong, C.; Merity, S.; and Socher, R. 2016. Dynamic memory networks for visual and textual question answering. In Proceedings of the 33rd International Conference on Machine Learning, 2397–2406.
  • [Xu et al. 2017] Xu, J.; Qiu, X.; Chen, K.; and Huang, X. 2017. Knowledge graph representation with jointly structural and textual encoding. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, 1318–1324.
  • [Yang et al. 2015] Yang, B.; tau Yih, W.; He, X.; Gao, J.; and Deng, L. 2015. Embedding entities and relations for learning and inference in knowledge bases. In Proceedings of International Conference on Learning Representations (ICLR 2015).
  • [Zeng et al. 2017] Zeng, K.-H.; Chen, T.-H.; Chuang, C.-Y.; Liao, Y.-H.; Niebles, J. C.; and Sun, M. 2017. Leveraging video descriptions to learn video question answering. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, 4334–4340.
  • [Zhong et al. 2015] Zhong, H.; Zhang, J.; Wang, Z.; Wan, H.; and Chen, Z. 2015. Aligning knowledge and text embeddings by entity descriptions. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 267–272.