Collective Embedding-based Entity Alignment via Adaptive Features
Entity alignment (EA) identifies entities that refer to the same real-world object but are located in different knowledge graphs (KGs), and has been harnessed for KG construction and integration. When generating EA results, current embedding-based solutions treat entities independently and fail to take into account the interdependence between entities. In addition, most embedding-based EA methods either fuse different features at representation-level and generate a unified entity embedding for alignment, which potentially causes information loss, or aggregate features at outcome-level with hand-tuned weights, which is not practical as the number of features increases.
To tackle these deficiencies, we propose a collective embedding-based EA framework with an adaptive feature fusion mechanism. We first employ three representative features, i.e., structural, semantic and string signals, for capturing different aspects of the similarity between entities in heterogeneous KGs. These features are then integrated at outcome-level, with dynamically assigned weights generated by our adaptive feature fusion strategy. Finally, in order to make collective EA decisions, we formulate EA as the classical stable matching problem between the entities to be aligned, with preference lists constructed from the fused feature matrix. The problem is then effectively solved by the deferred acceptance algorithm. Our proposal is evaluated on both cross-lingual and mono-lingual EA benchmarks against state-of-the-art solutions, and the empirical results verify its effectiveness and superiority. We also perform an ablation study to gain insights into the framework modules.
Knowledge graphs (KGs) are playing an increasingly important role in intelligent information services, e.g., information retrieval, automatic question answering and recommendation systems. Although a large number of KGs have been constructed over recent years, none of them can reach full coverage. These KGs, however, usually contain complementary contents, making it compelling to study the integration of heterogeneous KGs. To incorporate the knowledge from target KGs into the source KG, an indispensable step is entity alignment (EA).
EA aims to discover entities that have the same meaning but are located in different KGs. It is similar to the task of entity resolution (ER, also known as entity matching), which identifies entity records referring to the same real-world entity from different data sources. However, for typical ER approaches the data sources are mainly relational tables, and they tend to devise and aggregate various similarity measures between entity attributes to make ER decisions, whereas EA centers on graph-structured data, i.e., KGs, and most recent work utilizes graph-oriented representation learning techniques to capture entity similarity [5, 25]. The focus of this paper is embedding-based EA.
Most state-of-the-art embedding-based EA strategies [2, 19, 23, 31] assume that equivalent entities in different KGs have very similar neighbouring information. Consequently, they harness representation learning frameworks, e.g., TransE, graph convolutional network (GCN) and recurrent skipping networks (RSNs), to model the structural feature. Additionally, some propose to incorporate attribute information [22, 25, 24, 29] and entity descriptions to offer complementary views for alignment. When generating EA results for entities in the test set, these state-of-the-art solutions handle source entities separately and return a list of ranked target entities for each source entity with confidence (similarity) scores. The top-ranked target entity is aligned to the source entity. A pair of aligned entities is called a match or a correspondence.
Compared with conventional symbolic methods [21, 10], representation learning based solutions require less human involvement in feature engineering and can be easily scaled to large KGs. Nevertheless, we still identify the following limitations that restrain the effectiveness of embedding-based EA:
EA Decision Making. State-of-the-art solutions treat entities independently when determining EA results. However, there is often additional interdependence between different EA decisions, i.e., a target entity is less likely to be the match to a source entity if it is aligned to another source entity with higher confidence. Sufficiently modelling such collective signals would reduce mismatches and lead to higher alignment accuracy. Note that here we highlight that collective signals are overlooked during the decision-making process, while acknowledging the use of structural (collective) information as a useful feature during the feature generation process. We further exemplify the deficiency of independent EA decision making via Example 1.
In Figure 1, there are two KGs, where target entities in should be aligned to source entities in . We omit the process of generating and fusing features, and directly present the fused similarity matrix, where higher values represent higher similarities (entities in training set are excluded). Note that entities with the same indexes are equivalent entities (ground truth). For state-of-the-art EA methods, the alignment decisions are made independently. In concrete, regarding source entity , since target entity has the highest similarity score, is considered to be a match. Similarly, are also assumed to be equivalent by independent EA solutions. Compared with the ground truth, the latter two are incorrect.
Nevertheless, these mismatches can be avoided by exerting a simple collective aligning constraint, e.g., different source entities cannot match the same target entity and the source entity with higher similarity score can keep the match. In this case, will break the match with and choose , which forces to align . The resulting correspondences, , are all correct.
Feature Fusion. Merely using one type of features is unlikely to guarantee satisfactory alignment results. Some methods [29, 28, 26] propose to leverage representation learning techniques for fusing different alignment signals and generating unified feature representation, while some [25, 22] choose to aggregate alignment results and similarity scores generated within feature-specific spaces. Through empirical comparison, it is noted that the former representation-level feature fusion tends to yield worse results than the latter outcome-level feature fusion, as directly unifying feature representations inevitably causes the loss of feature-specific characteristics.
Nevertheless, current outcome-level feature fusion strategies [25, 22] usually hand-tune feature weights on the validation set, which becomes impractical as the number of features increases. Consequently, dynamically fusing various features with adaptive weights would be preferable and more practical.
In order to address the shortcomings of current EA solutions, we establish CEAFF, a collective entity alignment framework with an adaptive feature fusion strategy. CEAFF first exploits structural, semantic and string-level features for capturing different aspects of the similarity between entities in source and target KGs. These features are representative and generally available, and are effectively modelled by our proposed feature generation strategies. Then we compute the intermediate outcomes, i.e., similarity matrices, within feature-specific spaces, and devise an adaptive feature fusion strategy to fuse different features at outcome-level, which yields a fused similarity matrix that dynamically encodes multiple signals. Finally, to capture the interdependence between EA decisions, we formulate EA as the classical Stable Matching (Marriage) Problem (SMP), with entity preference lists constructed from the fused similarity matrix. The problem is then addressed by the deferred acceptance algorithm (DAA) with high effectiveness and efficiency.
Contribution. Our contribution can be summarized as:
We identify the shortcomings of existing embedding-based EA methods in fusing various features and making EA decisions. In this connection, we propose a novel solution, namely CEAFF, to boost the overall EA performance. This is done by (1) introducing an effective strategy to dynamically fuse different features with adaptive weight assignment, and (2) designing a collective embedding-based EA framework that takes into account the underlying interdependence between EA decisions for different entities. To the best of our knowledge, this is among the first attempts to tackle EA collectively.
We introduce string similarity as an important feature for alignment, which has been largely overlooked and rarely exploited by existing embedding-based EA frameworks. Dynamically combining it with structural and semantic signals offers a more comprehensive view for matching entities.
We empirically evaluate CEAFF on both cross-lingual and mono-lingual EA tasks against over 10 state-of-the-art methods, and the comparative results demonstrate the superiority of CEAFF.
Organization. Section II overviews related work. In Section III, we define EA task and present the overall framework of CEAFF. We then elaborate the components of CEAFF, namely adaptive feature fusion, collective EA and feature generation in Section V, Section VI and Section IV, respectively. Section VII introduces experimental settings, evaluation results and detailed analysis, followed by conclusion in Section VIII.
II Related Work
We discuss embedding-based EA related work from EA merely using structural information to EA using multiple sources of information.
EA using structural information. Most efforts on EA harness KG embedding techniques for alignment due to their simplicity, generality, and ability to deal with large-scale data. A shared pattern can be observed in these works. Initially, they utilize KG representation methods, e.g., TransE [5, 30, 4] and GCN, to encode KG structural information and embed input KGs into individual low-dimensional spaces. Then the embedding spaces are aligned using seed entity pairs. Some methods [22, 23, 13] directly project entities in different KGs into the same embedding space by fusing the training corpus in the first place. According to the distance in the unified embedding space, equivalent elements in different KGs can thus be aligned.
There are some very recent efforts aiming to improve structural representation by devising advanced representation-learning models. Zhu et al. propose a neighbourhood-aware attentional representation method to learn neighbour-level representation by aggregating neighbours' representations with a weighted combination. In addition, Cao et al. put forward a novel multi-channel graph neural network model to learn alignment-oriented KG embeddings by robustly encoding two KGs via multiple channels. Guo et al. offer an EA solution which integrates recurrent neural networks (RNNs) with residual learning to efficiently capture the long-term relational dependencies within and between KGs. By incorporating richer structural signals, these methods are more effective for matching entities.
EA using multiple sources of information. The aforementioned solutions mainly capture structural information. Additionally, some propose to incorporate attribute information to offer a complementary view. Some of these works utilize attribute types to generate attribute embeddings. Most recently, Trisedya et al. devise an attribute character embedding model which takes full advantage of attribute values. This line of work is built on the assumption that there is a large number of attribute triples. Nevertheless, it has been pointed out that between 69% and 99% of instances in popular KGs lack at least one attribute that other entities in the same class have. Similarly, although entity descriptions are utilized to provide textual information, the description information is often missing in many KGs. This could greatly restrain the effectiveness of these methods.
Instead of using attributes or entity descriptions, which tend to be lacking for many KGs, some propose to integrate the generally available entity names to complement structural information. Several approaches employ entity name information as the input to their overall framework for learning entity embeddings, so that the final entity representation encodes both structural and semantic signals. Specifically, Xu et al. propose to construct a topic graph for each entity, which can better integrate adjacent information into the structural representation of entities, and then convert the task into a graph matching problem between topic entity graphs. Wu et al. put forward a novel relation-aware dual-graph convolutional network so as to incorporate relation information via attentive interactions between a KG and its dual relation counterpart. A multi-view KG embedding framework has also been proposed, which learns entity embeddings from three views of KGs, i.e., the views of entity names, relations and attributes.
These methods unify multiple views of entities, e.g., entity name and KG structure, at representation-level, which inevitably causes information loss, e.g., two entities might be extremely similar in feature-specific embedding spaces, while placed distantly in the unified representation space that fails to keep the characteristic of each feature. Our proposed method takes another approach which aggregates features at outcome-level, i.e., similarity matrix calculated within each individual feature space and the significance of each feature is also dynamically adjusted by our adaptive feature fusion strategy.
III Problem Definition and Overall Framework
We first formally define the EA task, and then introduce the overall framework of our proposal.
Task Definition. A KG is a directed graph comprising a set of entities , relations , and triples . A triple represents head entity is connected to tail entity via relation .
Given two KGs, source KG and target KG , denote the seed entity pairs as , where represents equivalence. The EA task can be defined as discovering equivalent target entities for source entities in the test set.
Framework. CEAFF aims to collectively align entities based on adaptively fused features. As is shown in Figure 2, there are three main steps in our framework:
Feature Generation
This step aims at generating representative features for EA, i.e., structural embedding learned by GCN, semantic embedding generated by averaged word embedding and string similarity matrix calculated using Levenshtein distance. Structural and semantic representations are converted to corresponding similarity matrices via cosine similarity.
Adaptive Feature Fusion
This module is to fuse various features with adaptively generated weights. This is achieved by identifying, selecting, and allocating weights to confident EA pairs that can characterize the significance of features. Utilizing such feature fusion strategy, we first combine semantic and string matrices to generate textual similarity matrix, which is then dynamically aggregated with structural matrix to yield the eventual fused similarity matrix.
Collective EA
This step generates EA results by taking into account the interdependence between decisions for different entities. Concretely, EA is formulated as SMP with preference lists constructed using the fused similarity matrix. The classic DAA is harnessed to collectively produce correspondences in an effective and efficient manner.
Next, we first elaborate the representative features, then introduce the adaptive feature fusion strategy, and finally present the collective EA framework and its solution.
|Notation||Definition and description|
|Pre-aligned (seed) entity pairs|
|Initial input matrix for GCN|
|Adjacency matrix for GCN|
|Final feature matrix generated by GCN|
|Number of entities|
|Dimensionality of feature matrix in all GCN layers|
|Parameters in GCN layers|
|Structural similarity matrix|
|Entity name embedding matrix|
|Semantic similarity matrix|
|String similarity matrix|
|Levenshtein ratio, also string similarity score|
|Parameters in feature fusion|
IV Feature Generation
In this section, we elaborate the features that are employed for aligning entities.
IV-A Structural information
In this work, we harness GCN to encode the neighbourhood information of entities as real-valued vectors. We note that there are more advanced frameworks, e.g., dual-graph convolutional network and recurrent skipping network, for learning structural representation. They can also be easily plugged into our overall framework after parameter tuning. We will explore them in the future, as this is not the focus of this work. Next we briefly describe the configuration of GCN for the EA task, and leave out GCN fundamentals in the interest of space.
Model Configuration and Training.
In EA task, GCN is harnessed to generate structural representations of entities.
We build two 2-layer GCNs, and each GCN processes one KG to generate embeddings of its entities.
The initial feature matrix, , is sampled from truncated normal distribution with L2-normalization on rows.
The entity embeddings generated by two different GCNs are then aligned into the same embedding space using pre-aligned EA pairs . In specific, the training objective is to minimize the margin-based ranking loss function:
where , denotes the set of negative EA pairs obtained by corrupting , i.e., substituting or with a randomly sampled entity from its corresponding KG. and denotes the (structural) embedding of entity and . is a positive margin that separates positive and negative EA pairs. Stochastic gradient descent (SGD) is harnessed to minimize the loss function.
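A margin-based ranking loss matching the description above is commonly written in the following form. This is a sketch only: the symbols $S$ (seed pairs), $S'_{(e_1,e_2)}$ (negative pairs obtained by corrupting a seed pair), $\gamma$ (margin) and the distance $d(\cdot,\cdot)$ are notation assumed here for illustration, and the exact distance function used in this work is not recoverable from the text.

```latex
L \;=\; \sum_{(e_1,e_2)\in S}\;\sum_{(e_1',e_2')\in S'_{(e_1,e_2)}}
      \max\!\bigl(0,\; d(\mathbf{e}_1,\mathbf{e}_2) \;-\; d(\mathbf{e}_1',\mathbf{e}_2') \;+\; \gamma\bigr)
```

Each term is zero once a negative pair is at least $\gamma$ farther apart than its positive pair, so only violating pairs contribute to the gradient.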
Given a structural embedding matrix , two entities and , their similarity score can be formally defined as , where is the dimension of structural embeddings, and denotes the embedding vector for entity (i.e., , where is the one-hot encoding of entity ). From the perspective of KG structure, the entity with the highest similarity score to source entity is returned as alignment result. We denote the structural similarity matrix as , where rows represent source entities, columns denote target entities and each element in the matrix denote structural similarity score between a pair of source and target entities.
Complexity Analysis. The model complexity of CEAFF mainly lies in the parameters of GCN model, i.e., and in the two layers. Thus, the total number of parameters is .
IV-B Semantic information
An entity contains abundant textual information, ranging from its name, description to other attribute values in text form. This information can help match entities in different KGs since equivalent entities have the same semantic meaning. Among this information, entity name, which identifies an entity, is the most universal text form. Also, given two entities, comparing their names is the easiest approach to judge whether they are the same. Therefore, in this work we propose to utilize entity names as the source of textual information for EA.
Entity name can be exploited both from the semantic and string similarity level. We first introduce semantic similarity, as it can also work when the vocabularies of KGs differ, especially for the cross-lingual scenario. More specifically, we choose averaged word embeddings to capture the semantic meaning on account of its simplicity and generality. It does not require special training corpus, and can represent semantics in a concise and straightforward manner. Suppose the name of entity comprises words, . Then the name embedding can be calculated by , where is the word embedding of word . For a KG, the name embeddings of all entities can be denoted in matrix form as .
Like word embeddings, similar entity names will be placed adjacently in the entity name representation space. From the perspective of semantic information, given two entities and , their similarity score can be formally defined as , and the entity with the highest similarity score to source entity is returned as alignment result. We denote semantic similarity matrix as . This approach is also applicable for cross-lingual EA, where cross-lingual word embeddings would be harnessed to ensure that the entity name embeddings are in a unified embedding space.
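The name-averaging step and the conversion to a similarity matrix via cosine similarity can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the whitespace tokenization, the zero-vector fallback for names with no known words, and the toy vocabulary are all assumptions.

```python
import numpy as np

def name_embedding(name, word_vectors, dim):
    """Average the vectors of the words in an entity name (unknown words are skipped)."""
    vecs = [word_vectors[w] for w in name.lower().split() if w in word_vectors]
    # Assumption: a name with no known word falls back to the zero vector.
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def cosine_similarity_matrix(src, tgt, eps=1e-12):
    """Rows index source entities, columns index target entities."""
    src_unit = src / np.maximum(np.linalg.norm(src, axis=1, keepdims=True), eps)
    tgt_unit = tgt / np.maximum(np.linalg.norm(tgt, axis=1, keepdims=True), eps)
    return src_unit @ tgt_unit.T

# Toy 2-dimensional vocabulary, made up for illustration only.
word_vectors = {
    "new": np.array([1.0, 0.0]),
    "york": np.array([0.0, 1.0]),
    "city": np.array([1.0, 1.0]),
}
src = np.stack([name_embedding(n, word_vectors, 2) for n in ["New York", "York"]])
tgt = np.stack([name_embedding(n, word_vectors, 2) for n in ["City", "York"]])
semantic_sim = cosine_similarity_matrix(src, tgt)
```

The same `cosine_similarity_matrix` helper would apply to structural embeddings, since both representations are converted to similarity matrices in the same way.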
IV-C String information
Current methods mainly capture semantic information of entities, as semantics can be easily encoded as embeddings, facilitating the fusion of different feature representations [29, 28]. In addition, the semantic feature can also work well in cross-lingual scenarios by referring to external sources, e.g., pre-trained multilingual word embeddings.
In our work, we contend that string information, which has been largely overlooked by existing EA literature, is also a beneficial feature, since: (1) String similarity is especially useful in tackling mono-lingual EA task and cross-lingual EA task where KG pair is very close (e.g., English and German or English and French); (2) String similarity does not rely on external resources and is not affected by out-of-vocabulary problem which tends to restrain the effectiveness of semantic information. For instance, there might not be corresponding word embeddings for some rare words.
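In particular, we adopt Levenshtein distance, which is a string metric for measuring the difference between two sequences. Formally, the Levenshtein distance between two strings $a$ and $b$ (of length $|a|$ and $|b|$ respectively) is given by $\mathrm{lev}_{a,b}(|a|,|b|)$, where

$$\mathrm{lev}_{a,b}(i,j)=\begin{cases}\max(i,j) & \text{if } \min(i,j)=0,\\ \min\begin{cases}\mathrm{lev}_{a,b}(i-1,j)+1\\ \mathrm{lev}_{a,b}(i,j-1)+1\\ \mathrm{lev}_{a,b}(i-1,j-1)+1_{(a_i\neq b_j)}\end{cases} & \text{otherwise,}\end{cases} \tag{2}$$

where $1_{(a_i\neq b_j)}$ is the indicator function that equals 0 when $a_i=b_j$ and 1 otherwise. $\mathrm{lev}_{a,b}(i,j)$ denotes the distance between the first $i$ characters of $a$ and the first $j$ characters of $b$; $i$ and $j$ are 1-based indices. Note that the first element in the minimum corresponds to deletion (from $a$ to $b$), the second to insertion and the third to match or mismatch, depending on whether the respective symbols are the same.
Based on Levenshtein distance, we characterize the similarity between entity name strings by calculating the Levenshtein ratio $r_{a,b}=\frac{|a|+|b|-\mathrm{lev}^{*}_{a,b}(|a|,|b|)}{|a|+|b|}$. Note that in $\mathrm{lev}^{*}$, the substitution operation costs 2, instead of 1 as in $\mathrm{lev}$ of Equation 2. The motivation can be explained by a simple example: using $\mathrm{lev}$, the Levenshtein ratio between 'a' and 'c' is 0.5, while the ratio is 0 when utilizing $\mathrm{lev}^{*}$. Evidently the latter is more reasonable.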
We denote the string similarity matrix computed using Levenshtein ratio as . From the perspective of string similarity, the entity with the highest similarity score to source entity is returned as alignment result.
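The string feature described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the function names and the `sub_cost` parameter are hypothetical, and the ratio is taken to be (sum of lengths minus distance) divided by the sum of lengths, which reproduces the 'a' versus 'c' example from the text (0.5 with substitution cost 1, 0 with substitution cost 2).

```python
import numpy as np

def levenshtein(a, b, sub_cost=1):
    """Edit distance with unit insert/delete cost and a configurable substitution cost."""
    prev = list(range(len(b) + 1))          # distance from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                               # deletion
                cur[j - 1] + 1,                            # insertion
                prev[j - 1] + (0 if ca == cb else sub_cost),  # match / substitution
            ))
        prev = cur
    return prev[-1]

def levenshtein_ratio(a, b):
    """Similarity in [0, 1]; substitutions cost 2, as described in the text."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumption: two empty names are treated as identical
    return (total - levenshtein(a, b, sub_cost=2)) / total

def string_similarity_matrix(src_names, tgt_names):
    """Rows index source entity names, columns index target entity names."""
    return np.array([[levenshtein_ratio(s, t) for t in tgt_names] for s in src_names])
```

Computing the full matrix this way is quadratic in the number of entities, so a real pipeline would likely restrict or batch the comparisons.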
V Adaptive Feature Fusion
We point out the limitations of current fusion methods, and then elaborate our adaptive feature fusion strategy.
Some state-of-the-art methods unify multiple views of entities to learn embeddings for EA. They either embed entities based on various views, e.g., entity names, relations and attributes, and then present different strategies to combine multiple view-specific entity embeddings for alignment, or use certain features as inputs for learning representations of other features and achieve feature fusion during training [28, 26]. Nevertheless, directly fusing the representations of features would inevitably cause information loss, e.g., two entities might be extremely similar in individual feature-specific embedding spaces, yet be placed far apart in the unified representation space, which fails to maintain the characteristics of each feature.
Different from these representation-level feature aggregation techniques, we resort to outcome-level feature fusion that operates on the intermediate outcomes, i.e., similarity matrices, generated by individual feature spaces. Current outcome-level feature fusion works [22, 25, 18] hand-tune feature weights, which becomes inapplicable when the number of features increases or the importance of a certain feature varies greatly under different alignment settings. In this connection, for outcome-level feature fusion, the problem lies in how to dynamically determine the importance of each feature.
An intuitive solution is to use machine learning techniques to learn the weights. Nevertheless, regarding EA problem, for each source entity, the number of negative instances is much larger than positive ones. Besides, the amount of training data is very limited. Such restraints make it difficult to generate training corpus with high quality, which further hampers learning appropriate weights for different features. We will report the results of learning-based approach in Section VII.
We, on the other hand, devise an adaptive feature fusion strategy that can dynamically determine the weights of features without requiring training data. It consists of five stages:
Candidate Correspondence Generation. The inputs to this stage are features and their corresponding similarity matrices, . Note that in feature matrix , denotes the similarity between (from source KG) and (from target KG) measured by feature . If is the largest both along the row and the column, is considered to be a candidate confident correspondence obtained from feature . This constraint is very strong and the selected confident correspondences are very likely to be correct matches. As thus, the confident correspondences produced by a specific feature can reflect its importance, the concrete correlation of which is formulated later. As is shown in Figure 3, there are three feature matrices, producing six candidate confident correspondences with respective similarity scores.
Candidate Correspondence Filtering. Noteworthily, not all candidate confident correspondences are contributive to characterizing the significance of features. As thus, we devise several strategies to filter candidates and retain useful confident correspondences. We first take into account the potential conflicts among candidate correspondences generated by different features, and contend that, if different features detect conflicting candidate correspondences for the same source entity, these candidates will be filtered out. For instance, regarding source entity in Figure 3, both and produce correspondence , while produces correspondence , and consequently they are all pruned from the confident correspondences. This is evident, as conflicting correspondences cannot reflect the effectiveness of features.
In addition, if a candidate correspondence is shared by all features, it is also filtered out, as we consider this correspondence fails to characterize a feature. Eventually, we utilize the retained confident correspondences to determine feature weights.
Correspondence Weight Calculation. We first determine the weight of each confident correspondence. Instead of assigning equal weights, we assume that (1) a confident correspondence found by several features has less importance than other correspondences found within only one feature matrix, and; (2) correspondences with extremely high scores are actually less significant.
In specific, if a correspondence is shared by features, it is assigned with weight . In other words, the importance of a candidate correspondence is inversely proportional to the number of its appearances, as it is believed that the frequently occurring correspondence brings less new information in comparison with a correspondence that has the quality of being the highest in only one single feature matrix . In addition, for a correspondence with very large similarity score , we set its weight to a small value . This is to prevent a very unbalanced distribution of feature weights when a specific feature is very effective. We will further discuss the motivation and the setting of in experiment.
As is shown in Figure 3, as is only produced by , it is assigned with weight , while is detected by both and , resulting in the weight of . Notably, as the similarity score of produced by exceeds , it is set to .
Feature Weight Calculation. After obtaining the weight of each confident correspondence, the weighting score of feature is computed as the sum of weights of the confident correspondences it generates. The weight of this feature is the ratio between its weighting score and the sum of the weighting scores of all features.
Feature Fusion with Adaptive Weight. We further combine individual feature matrices and their corresponding adaptively determined weights to compute fused similarity matrix.
Specifically, in this work we utilize three representative features, i.e., structural, semantic and string-level signals, which are elaborated in Section IV. We first fuse semantic and string features to generate a textual similarity matrix, which is further aggregated with the structural matrix to produce the final fused similarity matrix. Compared with fusing all features simultaneously, our proposed two-stage fusion framework can better adjust weight assignment and generate a fused similarity matrix of higher quality.
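The five stages above can be sketched as follows. This is an illustration under explicit assumptions rather than the authors' implementation: a correspondence shared by k features is given weight 1/k (the text only states inverse proportionality), the threshold and the small replacement weight are exposed as hypothetical parameters `theta1` and `theta2` with placeholder defaults, the threshold is checked per feature, and uniform weights are used as a fallback when no confident correspondence survives filtering.

```python
import numpy as np

def confident_pairs(sim):
    """Stage 1: pairs (s, t) whose score is the maximum of both row s and column t."""
    row_best = sim.argmax(axis=1)
    col_best = sim.argmax(axis=0)
    return {s: int(t) for s, t in enumerate(row_best) if col_best[t] == s}

def adaptive_weights(mats, theta1=0.98, theta2=0.1):
    """Stages 2-4: filter candidates, weight them, and normalise per-feature scores."""
    cands = [confident_pairs(m) for m in mats]
    scores = np.zeros(len(mats))
    for s in set().union(*cands):
        targets = {c[s] for c in cands if s in c}
        owners = [k for k, c in enumerate(cands) if s in c]
        if len(targets) > 1:          # conflicting candidates for the same source
            continue
        if len(owners) == len(mats):  # shared by all features: not discriminative
            continue
        t = targets.pop()
        for k in owners:
            # Assumed weighting: 1/k, replaced by theta2 for very high scores.
            scores[k] += theta2 if mats[k][s, t] > theta1 else 1.0 / len(owners)
    if scores.sum() == 0:             # assumption: fall back to uniform weights
        return np.full(len(mats), 1.0 / len(mats))
    return scores / scores.sum()

def fuse(mats, **kwargs):
    """Stage 5: weighted sum of the feature-specific similarity matrices."""
    weights = adaptive_weights(mats, **kwargs)
    return sum(w * m for w, m in zip(weights, mats))

def two_stage_fusion(structural, semantic, string, **kwargs):
    """Fuse semantic and string matrices first, then combine with the structural one."""
    textual = fuse([semantic, string], **kwargs)
    return fuse([structural, textual], **kwargs)
```

Because the weights are derived only from the similarity matrices themselves, no labelled data is needed at this step, which is the property the section emphasises.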
VI Collective EA
We first formulate EA as stable matching problem, and then describe and discuss corresponding solution.
EA as Stable Matching Problem. After obtaining the fused matrix , we can determine EA results in an independent fashion that has been adopted by state-of-the-art methods. Specifically, for each source entity , we retrieve its corresponding row entry in , and rank the elements in a descending order. The top ranked target entity is considered as the match.
Nevertheless, this way of generating EA pairs fails to consider the interdependence between different EA decisions. To adequately model such coherence, we formulate EA as the stable matching problem (SMP). Concretely, it has been proved that for any two sets of members with the same size, each of whom provides a ranking of the members in the opposing set, there exists a bijection of the two sets such that no pair of two members from opposite sides would prefer to be matched to each other rather than to their assigned partners. Such a set of pairings is called a stable matching.
We then apply the key concepts of SMP, namely, preferences lists and blocking pairs, to our collective EA framework. Specifically, the preference list of an entity is the set of entities in opposing side ranked by similarity scores in a descending order. The blocking pair (BP) for a stable matching in EA is defined as a pair of source and target entities , where prefers to its currently matched entity , and prefers to its currently matched entity . Thus, will leave to be matched to and would prefer being matched to user than user .
Solution. To solve the SMP, we adopt the deferred acceptance algorithm (DAA) to generate a stable matching, since it can be easily implemented in a centralized manner with low time complexity, which is suitable for the EA task. Concretely, it works as follows:
In the first round, each source entity proposes to the target entity it prefers most, and then each target entity replies “maybe” to the suitor it most prefers and “no” to all other suitors. The target entity is then provisionally matched to the suitor it most prefers so far, and that suitor is likewise provisionally matched to the target entity.
In each subsequent round, each unmatched source entity proposes to the most-preferred target entity to which it has not yet proposed (regardless of whether the target entity is already matched), and then each target entity replies “maybe” if it is currently not matched or if it prefers the new proposer over its current matched entity (in this case, it rejects its provisional match which becomes unmatched). The provisional nature of match preserves the right of an already-matched target entity to “trade up”.
This process is repeated until every source entity is matched.
We further illustrate this process via Figure 4. This algorithm guarantees that every entity gets matched. At the end, there cannot be a source entity and a target entity both unmatched, as this source entity must have proposed to this target entity at some point (since a source entity will eventually propose to every target entity, if necessary) and, being proposed to, the target entity would necessarily be matched (to a source entity) thereafter.
Also, the “marriages” are stable, meaning that there are no blocking pairs. Consider a source entity and a target entity that are both matched, but not to each other. Upon completion of the algorithm, it is not possible for both of them to prefer each other over their current partners. If the source entity prefers the target entity to its current partner, it must have proposed to the target entity before it proposed to its current partner. If the target entity accepted this proposal, yet is not matched to the source entity at the end, it must have dumped the source entity for some entity it prefers more, and therefore does not prefer the source entity to its current partner. If the target entity rejected the proposal, it was already matched with some entity it preferred to the source entity.
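The stability argument can be checked mechanically on small inputs. The sketch below is illustrative only: the helper name `blocking_pairs` and the two-by-two matrix are assumptions, and preferences on both sides are assumed to be induced by the same similarity matrix.

```python
def blocking_pairs(sim, matching):
    """List every (source, target) pair that blocks `matching`.

    sim[i][j] is the similarity between source i and target j;
    `matching` maps each source index to its matched target index.
    A pair blocks the matching when both entities score each other
    higher than they score their current partners.
    """
    source_of = {t: s for s, t in matching.items()}
    found = []
    for s, t_cur in matching.items():
        for t, s_cur in source_of.items():
            if t == t_cur:
                continue  # already matched to each other
            source_prefers = sim[s][t] > sim[s][t_cur]
            target_prefers = sim[s][t] > sim[s_cur][t]
            if source_prefers and target_prefers:
                found.append((s, t))
    return found


sim = [[0.9, 0.8],
       [0.8, 0.1]]
print(blocking_pairs(sim, {0: 0, 1: 1}))  # []: this matching is stable
print(blocking_pairs(sim, {0: 1, 1: 0}))  # [(0, 0)]: source 0 and target 0 would defect
```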
Discussion. Stable matching is competitive in terms of both outcome and efficiency. Theoretically, however, EA can also be formulated as a maximum weighted bipartite matching problem, which requires the elements of two sets to be paired one-to-one such that the sum of pairwise utilities (similarity scores) is maximized. This maximum weighted perfect bipartite matching problem is a classical combinatorial optimization problem in computer science. It can be formulated and solved in polynomial time as a linear program, or with more specialized techniques such as the Hungarian algorithm.
Though it is tempting to optimize the matching over a certain utility function, a stable matching is more desirable, since it takes into account the diverse preferences of source and target entities, and produces a matching from which no participant has an incentive to deviate. DAA is also much more efficient.
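A tiny example shows how the two objectives can disagree. The sketch below is only an illustration: brute-force enumeration over permutations stands in for the Hungarian algorithm (it is feasible only for very small inputs), and the function name `max_weight_matching` and the matrix values are assumptions.

```python
from itertools import permutations


def max_weight_matching(sim):
    """One-to-one assignment with the largest total similarity.

    Brute force over all permutations; a stand-in for the Hungarian
    algorithm that is only practical for a handful of entities.
    """
    n = len(sim)
    best = max(permutations(range(n)),
               key=lambda p: sum(sim[i][p[i]] for i in range(n)))
    return dict(enumerate(best))


sim = [[0.9, 0.8],
       [0.8, 0.1]]
mw = max_weight_matching(sim)
total = sum(sim[s][t] for s, t in mw.items())
print(mw)  # {0: 1, 1: 0}
# The utility-optimal assignment separates source 0 and target 0, although
# each of them scores the other (0.9) above its assigned partner (0.8).
# They form a blocking pair, so this assignment is not stable.
```

Here the assignment {0: 0, 1: 1} has a lower total similarity (1.0 against 1.6) but contains no blocking pair.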
This section reports the experiments with in-depth analysis.
VII-A Experiment Setting
Datasets. Three datasets, including nine KG pairs, are used for evaluation:
DBP15K. The DBP15K dataset was originally introduced in . They extracted 15 thousand inter-language links (ILLs) in DBpedia with popular entities from English to Chinese, Japanese and French respectively, and considered them as reference alignment (i.e., gold standards).
DBP100K. The DBP100K dataset was originally introduced in . It comprises two mono-lingual EA datasets, DBP-WD and DBP-YG, which were extracted from DBpedia, Wikidata and YAGO3. Each dataset contains 100,000 entity pairs. The extraction method followed DBP15K, with ILLs replaced with references connecting these KGs.
SRPRS. Guo et al. pointed out that KGs in previous EA datasets, e.g., DBP15K and DBP100K, are too dense and that their degree distributions deviate from those of real-life KGs. Therefore, they established a new EA benchmark that follows real-life distributions by using ILLs in DBpedia. They first divided the entities in a KG into several groups by their degrees and then separately performed random PageRank sampling for each group. To guarantee that the distributions of the sampled datasets follow those of the original KGs, they used the Kolmogorov-Smirnov (K-S) test to control the difference. The final evaluation benchmark consists of both cross-lingual and mono-lingual datasets.
A concise summary of the datasets can be found in Table II. Note that 30% of the gold standards are used as seed alignment. On all datasets, the number of entity pairs is over 10,000, which is sufficient for evaluating the effectiveness of EA solutions.
For learning structural representation, is set to 300, is set to 3, the training epochs are 300. Five negative examples are generated for each positive pair.
Regarding entity name representation, we utilize the fastText embeddings as word embeddings, and the multilingual word embeddings are obtained from MUSE.
Evaluation Metric. We utilize the accuracy of alignment results as evaluation metric. It is defined as the number of correctly aligned source entities divided by the total number of source entities.
We notice that previous EA methods all use Hits@k (k=1, 10) and mean reciprocal rank (MRR) as evaluation metrics, since they make EA decisions independently and, for each source entity, rank target entities by similarity score in descending order. Hits@k reflects the percentage of source entities whose ground-truth counterpart appears among the top-k ranked target entities, while MRR characterizes the rank of the ground-truth entity. Nevertheless, in practical cases, the accuracy of the results, i.e., Hits@1, is of higher significance, whereas how close the results are to the ground truth is of less concern. Consequently, we adopt accuracy as the main evaluation metric.
Competitors. The following state-of-the-art EA methods are adopted for comparison. They can be divided into two groups: methods that merely use structural information, and methods that use information beyond structure. Specifically, the structure-based group consists of:
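For concreteness, the metrics discussed above can be computed as in the following sketch. The function names, the toy similarity matrix and the gold alignment are illustrative assumptions, not part of the original evaluation code.

```python
def accuracy(predicted, gold):
    """Fraction of source entities whose predicted target is correct."""
    correct = sum(predicted.get(s) == t for s, t in gold.items())
    return correct / len(gold)


def hits_at_k_and_mrr(sim, gold, k):
    """Ranking metrics computed from a similarity matrix.

    sim[s][j] is the similarity between source s and target j;
    gold maps each evaluated source index to its true target index.
    """
    hits, reciprocal_ranks = 0, 0.0
    for s, t in gold.items():
        ranking = sorted(range(len(sim[s])), key=lambda j: -sim[s][j])
        rank = ranking.index(t) + 1  # 1-based rank of the true target
        hits += rank <= k
        reciprocal_ranks += 1.0 / rank
    return hits / len(gold), reciprocal_ranks / len(gold)


sim = [
    [0.90, 0.50, 0.10],
    [0.80, 0.70, 0.20],
    [0.60, 0.65, 0.30],
]
gold = {0: 0, 1: 1, 2: 2}
# The true targets are ranked 1st, 2nd and 3rd, respectively.
hits1, mrr = hits_at_k_and_mrr(sim, gold, k=1)
hits2, _ = hits_at_k_and_mrr(sim, gold, k=2)
# Accuracy of an alignment that picks the top-ranked target for each source.
acc = accuracy({0: 0, 1: 0, 2: 1}, gold)
print(hits1, hits2, mrr, acc)
```

Accuracy only needs the final aligned pairs, which is why it remains applicable when the output is a matching rather than a ranked list.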
MTransE : This is the first method that proposes to utilize KG embedding, i.e., TransE, for EA.
IPTransE : An iterative training process is used to improve the alignment results.
BootEA : This work devises an alignment-oriented KG embedding framework and a bootstrapping strategy.
RSNs : This work integrates recurrent neural networks (RNNs) with residual learning to efficiently capture the long-term relational dependencies within and between KGs.
MuGNN : A novel multi-channel graph neural network model is put forward to learn alignment-oriented KG embeddings by robustly encoding two KGs via multiple channels.
NAEA : This work proposes a neighbourhood-aware attentional representation method to learn neighbour-level representation.
Methods using several features include:
JAPE : In this work, the attributes of entities are harnessed to refine the structural information.
GCN-Align : This work utilizes GCN to generate entity embeddings, which are combined with attribute embeddings to align entities in different KGs.
RDGCN : A relation-aware dual-graph convolutional network is proposed to incorporate relation information via attentive interactions between KG and its dual relation counterpart.
MultiKE : This work offers a novel framework that unifies the views of entity names, relations and attributes to learn embeddings. It can only cope with mono-lingual EA.
GM-Align : A local sub-graph of an entity is constructed to represent the entity. Entity name information is harnessed for initializing the overall framework.
VII-B Cross-lingual EA Results
The highest results are represented in bold, while the highest results within each group are underlined.
As can be observed from Table III, our proposed model CEAFF consistently outperforms all baselines on all datasets. More specifically, CEAFF achieves at least 0.07 higher accuracy on DBP15K and 0.3 higher accuracy on SRPRS than the second-best methods. We attribute the superiority of our model to three advantages: (1) source entities are aligned collectively, which avoids the frequently occurring situation where multiple source entities are aligned to the same target entity; (2) we leverage three representative sources of information, i.e., structural, semantic and string-level features, to offer more comprehensive signals for EA; (3) the features are dynamically fused with adaptively assigned weights, which fully takes into consideration the strengths of different signals.
Results in the first group. Solutions in the first group merely harness structural information for alignment. MTransE obtains the worst results, as it learns the embeddings in different vector spaces and loses information when modelling the transition between these spaces. IPTransE achieves better performance than MTransE, as it adopts relational paths for learning structural embeddings and utilizes an iterative framework to augment the training set. These two points are further exploited and advanced by RSNs and BootEA, respectively. Specifically, RSNs enhances the results by taking into account long-term relational dependencies between entities, which capture more structural signals for alignment, while BootEA devises a carefully designed alignment-oriented KG embedding framework with a one-to-one constrained bootstrapping strategy. MuGNN also outperforms IPTransE by at least 0.08, as it utilizes a multi-channel graph neural network that captures different levels of structural information. Nevertheless, its results are still inferior to those of BootEA and RSNs. On DBP15K, NAEA attains the highest results within this group, as its neighbourhood-aware attentional representation method makes better use of KG structures and learns more comprehensive structural representations. Notably, the overall performance on SRPRS is worse than on DBP15K, as the KGs in DBP15K are much denser than those in SRPRS and contain more structural information. On SRPRS, where the KGs have real-life degree distributions, RSNs achieves the best results, since the long-term relational dependencies it captures are less sensitive to entity degrees. This is verified by the fact that the results of RSNs exceed those of BootEA on SRPRS, in contrast to the results on DBP15K.
Results in the second group. Within the second group, taking advantage of attribute information, GCN-Align and JAPE outperform MTransE by around 0.1 on DBP15K, whereas on SRPRS, GCN-Align achieves worse results than MTransE and JAPE attains accuracy scores similar to those of MTransE. This reveals that attribute information is quite noisy and might not guarantee consistent performance. RDGCN and GM-Align outperform all the other methods except for our proposal, as they harness entity name information as the input to their overall frameworks for learning entity embeddings, and the final entity representations encode both structural and semantic signals, providing a more comprehensive view for alignment. Nevertheless, CEAFF outperforms RDGCN and GM-Align by a large margin, since we aggregate features at outcome-level instead of representation-level and employ a collective alignment strategy.
As pointed out before, since SRPRS is more similar to real-life KGs and of higher “difficulty”, existing methods achieve relatively worse results on SRPRS than on DBP15K. In contrast, the performance of CEAFF on these two datasets is close, validating the robustness of our proposal. Interestingly, approaches in the first group are language-agnostic, as they merely exploit structural information, whereas RDGCN, GM-Align and CEAFF achieve much better results on some KG pairs than on others, unveiling that textual information is largely affected by language barriers.
VII-C Mono-lingual EA Results
As can be observed from Table IV, the results on mono-lingual datasets resemble those on cross-lingual datasets. Within the first group, NAEA achieves the best results on DBP100K, while RSNs has a more consistent performance on both real-life and dense datasets. Among methods using multiple features, RDGCN yields the leading performance on SRPRS, whereas MultiKE attains the most competitive results on DBP100K, with accuracy exceeding 0.88, which can be ascribed to the multiple features (including entity name) it utilizes.
Notably, CEAFF advances the accuracy to 1 on all datasets. This is because entity names in DBpedia, YAGO and Wikidata are nearly identical, in which case the string-level feature is extremely effective. In contrast, although semantic information is also useful, not all words in entity names have corresponding entries in external word embeddings, which limits its effectiveness. To keep in line with previous works that merely exploit the semantic feature, we also present the performance of CEAFF with the string-level feature removed.
The fact that a simple string-level feature can reach perfect accuracy on current benchmarks also encourages us to build more challenging mono-lingual EA datasets, which is left for future work.
Missing Results. On SRPRS, the datasets lack the aligned relations that are required by MultiKE. We were unable to obtain results for GM-Align on DBP100K, as it takes several days to train, whereas our proposal requires less than 10 minutes.
VII-D Ablation Study
|w/o C, AFF||0.914||0.925||0.986||0.994||0.701|
|CEAFF w/o C||71.9||87.4||0.774||78.3||90.7||0.827||92.8||97.9||0.947|
We perform ablation study to gain insights into the components of CEAFF. The results on SRPRS and are presented in Table V, while the performance on the other datasets, which exhibits a similar trend, is left out in the interest of space. C represents collective alignment strategy, AFF denotes adaptive feature fusion strategy, represent structural, semantic and string-level features, respectively.
CEAFF vs. CEAFF w/o C. Without collective EA, the performance drops on all cross-lingual datasets, revealing the significance of considering the interdependence between EA decisions. Note that on mono-lingual datasets, the performance already reaches 1.0 after fusing features adaptively, hence removing or adding the collective alignment strategy does not affect the outcome.
CEAFF vs. CEAFF w/o AFF. We then examine the contribution made by the adaptive feature fusion strategy. Specifically, the dynamic weight assignment is replaced with fixed weights, i.e., an equal weight for each feature. As can be observed from Table V, using the adaptive feature fusion strategy consistently yields better results than fixed weighting across all datasets, validating the usefulness of this strategy.
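The fixed-weight baseline used in this comparison amounts to averaging the feature-specific similarity matrices. The following sketch illustrates it; the function name `fuse` and the matrix values are assumptions, and the adaptive weighting itself is not reproduced here.

```python
def fuse(matrices, weights=None):
    """Outcome-level fusion: weighted sum of similarity matrices.

    matrices: list of equally shaped similarity matrices, one per feature.
    weights:  one weight per feature; defaults to equal weights, which
              corresponds to the fixed-weight baseline.
    """
    if weights is None:
        weights = [1.0 / len(matrices)] * len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(cols)]
            for i in range(rows)]


# Hypothetical structural, semantic and string-level similarity matrices.
structural = [[0.2, 0.4], [0.6, 0.0]]
semantic = [[0.8, 0.2], [0.2, 0.6]]
string_level = [[0.5, 0.3], [0.1, 0.9]]

fused_equal = fuse([structural, semantic, string_level])
fused_custom = fuse([structural, semantic, string_level], weights=[0.2, 0.3, 0.5])
print(fused_equal)
print(fused_custom)
```

Any other weighting scheme, including an adaptive one, can be plugged in through the `weights` argument, and the fused matrix can then be passed to the alignment step.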
CEAFF w/o C vs. CEAFF w/o C, AFF. Although solely removing adaptive feature fusion does lower the overall accuracy, the decrease is modest. To further demonstrate its effectiveness, we eliminate the influence of collective alignment and directly compare CEAFF w/o C with CEAFF w/o C, AFF. The accuracy drops by over 0.01 on four KG pairs, revealing the superiority of adaptive feature fusion over fixed weight assignment.
CEAFF vs. CEAFF without individual features. Finally, we examine the usefulness of each proposed feature. On cross-lingual datasets, removing structural information consistently lowers the performance, showcasing its stable effectiveness across all language pairs. In comparison, semantic information plays a more important role on distantly-related language pairs, e.g., Chinese and English, whereas the string-level feature is significant for aligning closely-related language pairs, e.g., English and French. On mono-lingual datasets, removing structural or semantic information does not hurt the accuracy, while removing the string-level feature results in an accuracy drop of around 0.1. This unveils that the string-level feature is extremely useful on datasets where entity names are very similar.
Interestingly, over cross-lingual datasets, the performance decline caused by removing a component, i.e., one of the three features or AFF, from CEAFF tends to be smaller than the decline caused by removing the same component from CEAFF w/o C. This can be attributed to the collective EA strategy, which narrows the performance loss caused by removing a certain component.
Evaluation as ranking problem. For the comprehensiveness of the experiment, in Table VI, we also consider EA results in the form of ranked target entity lists and report corresponding Hits@1, 10 and MRR values. Note that Hits@10 and MRR are missing for CEAFF, since the output of collective EA framework is aligned entity pairs and does not contain ranked entity lists. We leave out the evaluation performance on DBP100K and SRPRS in the interest of space.
VII-E Further Experiments
Thresholds. As mentioned in Section V, for a correspondence with a very large similarity score, i.e., one exceeding a preset threshold, we set its weight to a small value. To examine the usefulness of this strategy, we report the results after removing this setting in Table V. Without this setting, the performance drops on all datasets, validating the effectiveness of this strategy. Theoretically speaking, by applying this strategy, features that are very effective are not assigned extremely large weights, and less effective features can always contribute to the final EA decisions.
Learning based weighting strategy. Our adaptive feature fusion strategy can dynamically determine the weights of features without the need for training data. Nevertheless, it might be of interest to see the effectiveness of a simple learning-based approach for generating feature weights. To this end, we devise a weighting method with learnable parameters to serve as a stronger baseline. Specifically, we consider EA as a classification problem, i.e., labelling correct EA pairs with 1s and false pairs with 0s, and adopt Logistic Regression to determine the feature weights. To construct the training set, for each positive pair (seed entity pair), we generate 10 negative instances by replacing the target entity with a random entity. We use the learned weights to combine the features and apply collective alignment on the fused similarity to obtain the eventual EA results (LR).
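The construction of this learning-based baseline can be sketched as follows. This is an illustrative approximation under stated assumptions: the helper names, the plain batch gradient-descent trainer (standing in for an off-the-shelf Logistic Regression implementation), the hyperparameters and the toy feature matrices are all hypothetical.

```python
import math
import random


def build_training_set(features, seeds, n_targets, n_neg=10, rng=None):
    """One positive and n_neg negatives per seed pair.

    features: list of similarity matrices, one per feature.
    seeds:    list of (source, target) seed pairs.
    A negative instance replaces the target with a random other target.
    """
    rng = rng or random.Random(0)
    X, y = [], []
    for s, t in seeds:
        X.append([m[s][t] for m in features])
        y.append(1)
        for _ in range(n_neg):
            t_neg = rng.randrange(n_targets)
            while t_neg == t:               # never sample the true target
                t_neg = rng.randrange(n_targets)
            X.append([m[s][t_neg] for m in features])
            y.append(0)
    return X, y


def fit_logistic_regression(X, y, lr=0.5, epochs=200):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


# Toy features: an informative one (1.0 for true pairs) and a constant one.
n = 4
informative = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
constant = [[0.5] * n for _ in range(n)]
X, y = build_training_set([informative, constant], seeds=[(0, 0), (1, 1)], n_targets=n)
weights, bias = fit_logistic_regression(X, y)
print(weights)  # learned per-feature weights used to combine the matrices
```

The learned coefficients then play the role of the feature weights in the weighted sum of similarity matrices, after which collective alignment is applied as usual.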
As shown in Table V, among CEAFF, CEAFF w/o AFF and LR, CEAFF achieves the best results on cross-lingual datasets, verifying the effectiveness of our proposed feature fusion method. The performance of CEAFF w/o AFF and LR is very close, whereas LR requires additional training data. This unveils that the learning-based weighting strategy is not necessarily an optimal choice for feature fusion.
When making EA decisions, current EA solutions treat entities separately and fail to consider the interdependence among entities. To model such coherence, we propose to formulate EA as a stable matching problem, which is then solved by the deferred acceptance algorithm. To construct the entity preference lists required by SMP, we devise an adaptive feature fusion strategy that generates a fused similarity matrix encoding multiple features. The features we utilize, comprising structural, semantic and string-level signals, are representative and generally available. Compared with state-of-the-art approaches, our proposal achieves consistently better results, and the ablation study also verifies the usefulness of each component. As for future research directions, we would like to explore other collective matching methods, as well as devise more challenging EA benchmarks.
- Other methods of initialization are also viable. We stick to the random initialization in order to capture “pure” structural signal.
- P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146, 2017.
- Y. Cao, Z. Liu, C. Li, Z. Liu, J. Li, and T. Chua. Multi-channel graph neural network for entity alignment. In ACL, pages 1452–1461, 2019.
- Y. Cao, X. Wang, X. He, Z. Hu, and T. Chua. Unifying knowledge graph learning and recommendation: Towards a better understanding of user preferences. In WWW, pages 151–161, 2019.
- M. Chen, Y. Tian, K. Chang, S. Skiena, and C. Zaniolo. Co-training embeddings of knowledge graphs and entity descriptions for cross-lingual entity alignment. In IJCAI, pages 3998–4004, 2018.
- M. Chen, Y. Tian, M. Yang, and C. Zaniolo. Multilingual knowledge graph embeddings for cross-lingual knowledge alignment. In IJCAI, pages 1511–1517, 2017.
- V. Christophides, V. Efthymiou, T. Palpanas, G. Papadakis, and K. Stefanidis. End-to-end entity resolution for big data: A survey. CoRR, abs/1905.06397, 2019.
- J. Doerner, D. Evans, and A. Shelat. Secure stable matching at scale. In SIGSAC Conference, pages 1602–1613, 2016.
- C. Fu, X. Han, L. Sun, B. Chen, W. Zhang, S. Wu, and H. Kong. End-to-end multi-perspective matching for entity resolution. In IJCAI, pages 4961–4967, 2019.
- L. Galárraga, S. Razniewski, A. Amarilli, and F. M. Suchanek. Predicting completeness in knowledge bases. In WSDM, pages 375–383, 2017.
- L. A. Galárraga, N. Preda, and F. M. Suchanek. Mining rules to align knowledge bases. In AKBC@CIKM, pages 43–48, 2013.
- D. Gale and L. S. Shapley. College admissions and the stability of marriage. The American Mathematical Monthly, 69(1):9–15, 1962.
- M. Gulic, B. Vrdoljak, and M. Banek. Cromatcher: An ontology matching system based on automated weighted aggregation and iterative final alignment. J. Web Semant., 41:50–71, 2016.
- L. Guo, Z. Sun, and W. Hu. Learning to exploit long-term relational dependencies in knowledge graphs. In ICML, pages 2505–2514, 2019.
- B. Hixon, P. Clark, and H. Hajishirzi. Learning knowledge graphs for question answering through conversational dialog. In NAACL-HLT, pages 851–861, 2015.
- T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. CoRR, abs/1609.02907, 2016.
- H. W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2(1-2):83–97, 1955.
- V. I. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet Physics Doklady, volume 10, pages 707–710, 1966.
- N. Pang, W. Zeng, J. Tang, Z. Tan, and X. Zhao. Iterative entity alignment with improved neural attribute embedding. In DL4KG@ESWC, pages 41–46, 2019.
- S. Pei, L. Yu, R. Hoehndorf, and X. Zhang. Semi-supervised entity alignment via knowledge graph embedding with awareness of degree difference. In WWW, pages 3130–3136, 2019.
- A. E. Roth. Deferred acceptance algorithms: history, theory, practice, and open questions. Int. J. Game Theory, 36(3-4):537–569, 2008.
- F. M. Suchanek, S. Abiteboul, and P. Senellart. PARIS: probabilistic alignment of relations, instances, and schema. PVLDB, 5(3):157–168, 2011.
- Z. Sun, W. Hu, and C. Li. Cross-lingual entity alignment via joint attribute-preserving embedding. In ISWC, pages 628–644, 2017.
- Z. Sun, W. Hu, Q. Zhang, and Y. Qu. Bootstrapping entity alignment with knowledge graph embedding. In IJCAI, pages 4396–4402, 2018.
- B. D. Trisedya, J. Qi, and R. Zhang. Entity alignment between knowledge graphs using attribute embeddings. In AAAI, pages 297–304, 2019.
- Z. Wang, Q. Lv, X. Lan, and Y. Zhang. Cross-lingual knowledge graph alignment via graph convolutional networks. In EMNLP, pages 349–357, 2018.
- Y. Wu, X. Liu, Y. Feng, Z. Wang, R. Yan, and D. Zhao. Relation-aware entity alignment for heterogeneous knowledge graphs. In IJCAI, pages 5278–5284, 2019.
- C. Xiong, R. Power, and J. Callan. Explicit semantic ranking for academic search via knowledge graph embedding. In WWW, pages 1271–1279, 2017.
- K. Xu, L. Wang, M. Yu, Y. Feng, Y. Song, Z. Wang, and D. Yu. Cross-lingual knowledge graph alignment via graph matching neural network. In ACL, pages 3156–3161, 2019.
- Q. Zhang, Z. Sun, W. Hu, M. Chen, L. Guo, and Y. Qu. Multi-view knowledge graph embedding for entity alignment. In IJCAI, pages 5429–5435, 2019.
- H. Zhu, R. Xie, Z. Liu, and M. Sun. Iterative entity alignment via joint knowledge embeddings. In IJCAI, pages 4258–4264, 2017.
- Q. Zhu, X. Zhou, J. Wu, J. Tan, and L. Guo. Neighborhood-aware attentional representation for multilingual knowledge graphs. In IJCAI, pages 1943–1949, 2019.