FAQ-based Question Answering via Knowledge Anchors


Ruobing Xie*, Yanan Lu*, Fen Lin, Leyu Lin
WeChat Group, Tencent Inc., Beijing, China.
Abstract

Question answering (QA) aims to understand user questions and find appropriate answers. In real-world QA systems, Frequently Asked Question (FAQ) based QA is usually a practical and effective solution, especially for complicated questions (e.g., How and Why). Recent years have witnessed the great success of knowledge graphs (KGs) in KBQA systems, while there are still few works focusing on making full use of KGs in FAQ-based QA. In this paper, we propose a novel Knowledge Anchor based Question Answering (KAQA) framework for FAQ-based QA to better understand questions and retrieve more appropriate answers. More specifically, KAQA mainly consists of three parts: knowledge graph construction, query anchoring and query-document matching. We consider entities and triples of KGs in texts as knowledge anchors to precisely capture the core semantics, which brings in higher precision and better interpretability. The multi-channel matching strategy also enables most sentence matching models to be flexibly plugged into our KAQA framework to fit different real-world computation costs. In experiments, we evaluate our models on a query-document matching task over a real-world FAQ-based QA dataset, with detailed analysis over different settings and cases. The results confirm the effectiveness and robustness of the KAQA framework in real-world FAQ-based QA.


Introduction

Question answering (QA) aims to find appropriate answers for user questions. According to the difficulty of questions and the source of answers, there are mainly two kinds of QA systems. For simple questions like “Who wrote Hamlet?”, users expect direct answers in the form of several entities or a short sentence. Such answers are usually selected from knowledge graphs (KGs) (e.g., the KBQA task [Cui, Xiao, and Wang2016]), or retrieved from articles in answer pools (e.g., the machine comprehension task [Seo et al.2016]). For other complicated questions like “How to cook a risotto?”, users usually seek detailed step-by-step instructions. In this case, FAQ-based QA is a more effective and practical solution, which attempts to understand user questions and retrieve the most related documents that may contain correct answers [Kothari et al.2009].

Figure 1: An example of triples in knowledge anchors. The red arcs indicate the core semantics in the query and title.

QA systems always pursue higher precision and better interpretability, since users of QA systems are much more critical than those of information retrieval or dialogue systems. Recent years have witnessed the thriving of knowledge graphs like Freebase [Bollacker et al.2008] and DBpedia [Auer et al.2007]. A typical knowledge graph consists of entities, relations and triples, which provide prior professional knowledge and are thus naturally capable of improving a QA system’s precision and interpretability. KBQA is a classical task that brings KGs into QA, directly answering with entities selected from KGs. However, there are still few works focusing on introducing KGs to FAQ-based QA systems for complicated questions.

It is challenging to precisely understand user questions, since they usually involve informal expressions, abbreviations, professional terms and specific logical constraints. To address these problems, we attempt to introduce KGs to FAQ-based QA systems. Differing from the usage of KGs in KBQA systems, we bring in KGs not to directly answer questions, but to better understand queries and titles. A query or title in real-world QA systems usually contains entities and triples (entity pairs in texts with their latent relations) that derive from KGs. We consider such KG knowledge in texts as knowledge anchors, which benefit FAQ-based QA systems in both NLU and answer retrieval. They bring in higher precision, better interpretability, and the capability of cold start and immediate manual intervention. Knowledge anchors capture the key information in query-title matching, which is hard for end-to-end QA systems to handle even with a hundred times more training instances. Fig. 1 demonstrates an example of knowledge anchors in a real-world FAQ-based QA system.

With the help of knowledge anchors, we propose a novel Knowledge Anchor based Question Answering (KAQA) framework for FAQ-based QA. Precisely, our KAQA framework mainly consists of three modules: (1) knowledge graph construction, which builds KGs in the target domain; (2) query anchoring, which extracts core semantics in queries and documents, relieving the confusion caused by the ambiguity and diversity of natural language; and (3) multi-channel query-document matching, which calculates the semantic similarities between queries and documents with token sequences and knowledge anchors.

In experiments, we construct a new dataset from a real-world Chinese FAQ-based QA system in the software customer service domain, and then evaluate KAQA on the query-document matching task. The results demonstrate that knowledge anchors are essential in NLU and sentence matching. We further analyze query anchoring and knowledge anchors with detailed cases to better interpret KAQA’s pros, cons and effective mechanisms. The main contributions are summarized as follows:

  • We propose a novel KAQA framework for real-world FAQ-based QA. Knowledge anchors bring in higher precision and better interpretability with fewer annotations, which are essential in practical systems. The multi-channel matching strategy also enables QA systems to adopt both simple and sophisticated sentence matching models for different real-world scenarios. To the best of our knowledge, KAQA is the first to explicitly use knowledge anchors for NLU and matching in FAQ-based QA.

  • We construct a large dataset for domain-specific FAQ-based QA which will be released in the future, and conduct extensive experiments on different scenarios with detailed analysis and cases. The experimental results confirm the effectiveness and robustness of KAQA.

Related Work

Figure 2: The overall architecture of the Knowledge Anchor based Question Answering (KAQA) framework. The query anchoring module first takes the query and title as inputs and extracts knowledge anchors. Next, the entities and triples of the knowledge anchors are fed to a multi-channel matching model together with the original query and title token sequences. An MLP and a softmax layer then measure the sentence similarity.

Question Answering

Question answering is an important task in artificial intelligence. There are various tasks, including KBQA [Cui, Xiao, and Wang2016] and machine comprehension [Seo et al.2016], focusing on directly answering user questions. The answers could be given by either extractive methods [Xiong, Merity, and Socher2016] or generative methods [Yin et al.2016].

In real-world QA systems, FAQ-based QA is usually a more practical method, especially for complicated questions (e.g., How or Why questions). Cavnar and Trenkle (1994) give a classical n-gram based text categorization method for FAQ-based QA. Rule-based methods and semantic parsers also help to improve model performance [Yih, He, and Meek, Bilotti et al.2007]. Since the performance of FAQ-based QA is strongly influenced by query-document matching, lots of efforts have been devoted to improving similarity calculation [Mhaisale, Patil, and Mahamuni2013, Zhou et al.2016]. Moreover, Jeon, Croft, and Lee (2005) and Riezler et al. (2007) propose methods such as query expansion to bridge lexical gaps between queries and documents. However, there are still few works focusing on bringing KGs into FAQ-based QA. To the best of our knowledge, KAQA is the first to use knowledge anchors for understanding and matching in FAQ-based QA systems.

Knowledge Graphs in QA

It is intuitive that knowledge graphs could help QA. Semantic parsers help to map user queries to logic forms that can be utilized to retrieve answers from knowledge bases [Liang, Jordan, and Klein2013, Kwiatkowski et al.2013]. Information extraction [Yao and Van Durme2014] and templates [Zheng et al.2018] are also powerful tools. Recently, with the thriving of deep learning, neural approaches have been successfully utilized in KBQA, such as CNNs [Dong et al.2015] and attention [Zhang et al.2016]. Yih et al. (2015) attempt to combine semantic parsing with deep neural models in KBQA via query graphs. Zhang et al. (2018) focus on multi-hop knowledge reasoning in KBQA, and Huang et al. (2019) explore knowledge embeddings for simple QA. However, models in FAQ-based QA usually ignore KGs or merely use entities as features for lexical weighting or matching [Beduè et al.2018]. In this paper, we propose a novel notion, the knowledge anchor, explicitly using both entities and triples from KGs to capture core semantics.

Sentence Matching

Measuring semantic similarities between questions and answers is essential in FAQ-based QA. Conventional methods usually rely on lexical similarity techniques like BM25 [Robertson, Zaragoza, and others2009]. Socher et al. (2011) use recursive autoencoders to generate semantic trees for matching. Inspired by Siamese networks, DSSM [Huang et al.2013], Arc-I [Hu et al.2014] and LSTM-RNN [Palangi et al.2016] extract high-order features and then calculate similarities in semantic spaces, while DeepMatch [Lu and Li2013], Arc-II [Hu et al.2014] and MatchPyramid [Pang et al.2016] extract features from the lexical interaction matrix. IWAN [Shen, Yang, and Deng2017] explores the orthogonal decomposition strategy for matching. Pair2vec [Joshi et al.2018] further considers compositional word-pair embeddings. Our multi-channel model enables most sentence matching models to be plugged into KAQA flexibly.

Methodology

We first introduce the notations used in this paper. For a knowledge graph $G = \{E, R, T\}$, $E$ and $R$ represent the entity and relation sets. We utilize $(h, r, t) \in T$ to represent a triple in the KG, in which $h$ and $t$ are the head and tail entities, while $r$ is the relation.

We consider the query $q = (q_1, q_2, \dots)$ and document $d = (d_1, d_2, \dots)$ as KAQA’s inputs, where $q_i$ and $d_i$ indicate the $i$-th token in the query and document respectively. In this paper, we simply use titles to represent the documents for convenience. Both queries and titles are labelled with knowledge anchors in the query anchoring module. A knowledge anchor set in a query usually contains two sequences, namely the entity sequence and the triple sequence, ordered by their positions. Each triple consists of two entities and their relation in the corresponding query. The knowledge anchor set of a document is defined in the same way as that of a query.
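To make these inputs concrete, the following is a minimal sketch of how a query-title pair with its knowledge anchors could be represented. The class names, fields and the illustrative title are our own assumptions for readability rather than structures taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

@dataclass
class KnowledgeAnchors:
    """Knowledge anchors extracted from one sentence (query or title)."""
    entities: List[str]      # entity sequence, ordered by position in the text
    triples: List[Triple]    # triple sequence, ordered by position in the text

@dataclass
class MatchingInstance:
    """One query-title pair fed to the query-document matching module."""
    query_tokens: List[str]
    title_tokens: List[str]
    query_anchors: KnowledgeAnchors
    title_anchors: KnowledgeAnchors
    label: int               # 0 = unrelated, 1 = related, 2 = similar (as in Table 5)

# Illustrative instance loosely based on the query in Table 1 (the title is invented).
example = MatchingInstance(
    query_tokens="Can I recover my WeChat friend if she has deleted me ?".split(),
    title_tokens="How to recover a deleted WeChat friend ?".split(),
    query_anchors=KnowledgeAnchors(
        entities=["recover", "WeChat", "friend", "delete"],
        triples=[("friend", "has_operation", "recover"),
                 ("friend", "component_of", "WeChat")],
    ),
    title_anchors=KnowledgeAnchors(
        entities=["recover", "delete", "WeChat", "friend"],
        triples=[("friend", "has_operation", "recover")],
    ),
    label=2,
)
```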

Overall Architecture

KAQA mainly consists of three modules, namely the knowledge graph construction module, the query anchoring module and the query-document matching module. Fig. 2 demonstrates the overall architecture of KAQA. The knowledge graph construction module is the fundamental step that learns prior knowledge for understanding and matching. Next, the query anchoring module analyzes queries and titles with the built KGs to extract knowledge anchors. Two disambiguation models are used to ensure that the extracted knowledge anchors are reliable. Finally, with the help of knowledge anchors, the query-document matching module measures the semantic similarity between queries and documents.

Knowledge Graph Construction

A good KG is the foundation of KAQA. Experiments show that the more precise and complete the KG is, the better our FAQ-based QA performs. In KAQA, KGs are mainly utilized for better NLU and matching, not for directly answering simple questions. Therefore, instead of directly using existing open-domain KGs like Freebase, we concentrate on generating triples that represent core semantics in specific target domains. For instance in Fig. 1, the core semantics could be captured by (friend, component_of, WeChat) and (friend, has_operation, recover).

In this paper, we concentrate on the domain of software customer service, where users usually seek detailed step-by-step instructions or explanations. To reduce the complexity of KG construction, we mainly concentrate on four types of relations to understand user intentions, namely has_operation, component_of, synonym and hypernym/hyponym. has_operation is responsible for understanding user intentions, component_of reveals important relatedness, while synonym and hypernym/hyponym are used for normalization and generalization.

query: Can I recover my WeChat friend if she has deleted me merely due to a late reply?
entities: delete; WeChat; friend; recover; reply
triple candidates: (WeChat, has_operation, delete); (WeChat, has_operation, recover); (friend, has_operation, delete); (friend, has_operation, recover); (friend, component_of, WeChat)
Table 1: An example of a query with its entities and triple candidates. The bold triples indicate the correct triples which should be selected by the query anchoring module.

In KG construction, we first find dozens of seed entities in the target domain, and then use pattern-based models together with conventional NER models like CRF to get the final entity set [Lafferty, McCallum, and Pereira2001]. Extracting useful entities from existing knowledge bases is also a good supplement in practice. Based on these entities, we combine several models to get triple candidates. (1) We first use pattern-based bootstrapping methods with the help of count-based features, lexical features (term weight and POS tag) and semantic parser results to generate high-frequency triples. (2) Next, based on a small amount of annotated instances and distant supervision, we implement some classical neural relation extraction models (e.g., CNN, PCNN and PCNN+ATT) for binary relation classification (has or has no relation) [Lin et al.2016]. (3) We jointly consider all models with a linear addition of their scores to rank all triple candidates, as sketched below. (4) Finally, to make the KG practical, we further conduct a human annotation to make sure the accuracy of the KG is above 95%. Approximately, a skilled annotator will take about day to finish instances. Table 1 gives a query instance with its entities and central triples.
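The linear combination in step (3) can be sketched as follows. This is only an illustrative Python sketch under our own assumptions: the score names, normalization and the equal default weights are not specified in the paper.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def rank_triple_candidates(
    candidates: List[Triple],
    pattern_scores: Dict[Triple, float],   # score from pattern-based bootstrapping, assumed in [0, 1]
    neural_scores: Dict[Triple, float],    # probability from the binary relation classifier (e.g., PCNN+ATT)
    weights: Tuple[float, float] = (0.5, 0.5),  # placeholder weights; tuned in practice
) -> List[Tuple[Triple, float]]:
    """Rank triple candidates by a linear addition of the extraction models' scores."""
    w_pattern, w_neural = weights
    scored = [
        (t, w_pattern * pattern_scores.get(t, 0.0) + w_neural * neural_scores.get(t, 0.0))
        for t in candidates
    ]
    # Highest-scored candidates are then checked by human annotators (step 4).
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```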

In real-world QA systems, customization of the KG is labor-intensive but worthwhile to assure high precision. Moreover, the KG construction pipeline can be quickly adapted to other domains, and the built KG also benefits many other modules including user intention recognition, recall and ranking.

Query Anchoring

The query anchoring module extracts core semantics via knowledge anchors based on KGs. Simply relying on string matching or semantic parsing is straightforward, but it also brings in noise and confusion, because queries in domain-specific FAQs usually contain informal expressions and abbreviations that are hard to analyze precisely. We use two disambiguation components to address this issue.

Entity Disambiguation

We directly use a simple string matching algorithm to retrieve all entity candidates. Considering the balance between effectiveness and efficiency, we implement a forward maximum matching algorithm for entity disambiguation [Xiaohua and Xiupei2011].
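As a concrete illustration, below is a minimal Python sketch of forward maximum matching over a KG entity vocabulary. The real system operates on Chinese text and the KG-SCS entity set; the window length and the English example are assumptions made for readability.

```python
from typing import List, Set

def forward_maximum_matching(text: str, entity_vocab: Set[str], max_len: int = 20) -> List[str]:
    """Greedy left-to-right scan that always keeps the longest entity starting at each position."""
    entities, i = [], 0
    while i < len(text):
        match = None
        # Try the longest window first, then shrink it.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in entity_vocab:
                match = text[i:j]
                break
        if match:
            entities.append(match)
            i += len(match)
        else:
            i += 1  # no entity starts here; move one character forward
    return entities

# e.g., forward_maximum_matching("recover WeChat friend", {"WeChat", "WeChat friend", "friend", "recover"})
# returns ["recover", "WeChat friend"]: the longer candidate "WeChat friend" wins over "WeChat".
```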

Figure 3: The neural triple disambiguation model.

Triple Disambiguation

Queries in FAQ-based QA usually involve multiple operations and target objects, and thus we need a triple disambiguation model to capture the core semantics. For instance in Fig. 1, there are four triple candidates of the relation has_operation that may express the correct user intention. The triple disambiguation model needs to decide whether the user wants to recover or delete, and whether WeChat or the WeChat friend is the target object of the operation.

We build an ensemble triple disambiguation model with (1) a rule-based model, (2) a simple knowledge reasoning model and (3) a neural model. The rule-based model (RB) considers simple syntactic rules and patterns, lexical features (e.g., token weights, POS tags, entity types), and triple-level features (e.g., relation types, entity pair distances). This model is simple and straightforward, and can be easily transferred and reused in different fields. The knowledge reasoning model (KR) enables some simple knowledge reasoning over KGs. For example, since friend is a component of WeChat, the target object of recover in Fig. 1 is more likely to be friend rather than WeChat.

As for the neural triple disambiguation model (NTD), we build a supervised model following the architecture of FastText [Joulin et al.2016] shown in Fig. 3, which takes a sentence with its target triple as the input and outputs a confidence score. We further introduce some features as follows: (1) The target triple, which indicates which triple candidate we focus on. (2) Position features, which show the distances from the current token to the two entities in the target triple; there are two position features for each token. (3) Conflict entity features: if (A, r, B) is a triple candidate while B is already in the target triple (C, r, B), then A-to-C is a conflict entity pair. (4) Conflict triple features: if a triple (other than the target triple itself) shares any entity with the target triple, then this triple is viewed as a conflict triple. All features are fed to the FastText model. The final triple confidence score is the addition of the rule-based and neural model scores with empirically set weights. The knowledge reasoning model works separately as a high-confidence filter.
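A minimal sketch of how the NTD input features and the final score combination could look is given below. The feature encodings, the weight values and the helper names are our assumptions; the paper only lists the feature types and states that the weights are set empirically.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def ntd_features(tokens: List[str], target: Triple, candidates: List[Triple]) -> List[str]:
    """Bag-of-features input for the FastText-style neural triple disambiguation (NTD) model."""
    head, rel, tail = target
    feats = [f"target_head={head}", f"target_rel={rel}", f"target_tail={tail}"]
    # (2) Position features: distance from every token to the two target entities.
    h_pos = tokens.index(head) if head in tokens else -1
    t_pos = tokens.index(tail) if tail in tokens else -1
    for i, tok in enumerate(tokens):
        feats.append(f"{tok}|dh={i - h_pos}|dt={i - t_pos}")
    for h2, r2, t2 in candidates:
        if (h2, r2, t2) == target:
            continue
        # (3) Conflict entity feature: same relation and tail entity, but a different head.
        if r2 == rel and t2 == tail and h2 != head:
            feats.append(f"conflict_entity={h2}->{head}")
        # (4) Conflict triple feature: any candidate sharing an entity with the target triple.
        if {h2, t2} & {head, tail}:
            feats.append(f"conflict_triple={h2}|{r2}|{t2}")
    return feats

def triple_confidence(rule_score: float, neural_score: float,
                      w_rule: float = 0.3, w_neural: float = 0.7) -> float:
    """Final confidence: weighted addition of rule-based and neural scores (placeholder weights)."""
    return w_rule * rule_score + w_neural * neural_score
```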

Query-document Matching

The query-document matching module takes document titles and queries with their knowledge anchors as inputs, and outputs the query-document similarities. The input for a query contains three sequences, namely the token sequence $q^w$, the entity sequence $q^e$ and the triple sequence $q^t$, and the same holds for document titles. The final similarity vector is formalized as follows:

$\mathbf{v} = \mathrm{MLP}(\mathbf{s})$,    (1)

in which $\mathrm{MLP}(\cdot)$ represents a multi-layer perceptron and $\mathbf{s}$ stands for the final hidden states of different sentence matching models, which are regarded as the query-document similarity features.

Multi-channel Matching

Intuitively, we conduct a multi-channel matching strategy to jointly encode the semantics of the different sequences into the final similarity features. We simply concatenate the three final hidden states of the different sequences to get the similarity features $\mathbf{s}$:

$\mathbf{s} = [\mathbf{s}_w; \mathbf{s}_e; \mathbf{s}_t]$,    (2)

where $\mathbf{s}_w$, $\mathbf{s}_e$ and $\mathbf{s}_t$ indicate the final hidden states of the token, entity and triple channels respectively.

The multi-channel matching strategy enables most sentence matching models to be used in KAQA for different demands of complexity and accuracy in practice. To show the flexibility and robustness of KAQA in various situations, we learn these query-document similarity features based on three representative sentence matching models, namely ARC-I, MatchPyramid and IWAN. It is not difficult for KAQA to use other sentence matching models.
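The following PyTorch sketch illustrates Eqs. (1)-(2): each channel is matched separately and the concatenated features go through an MLP and softmax. Sharing one base matcher across channels, the feature dimension and the MLP layout are simplifying assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class MultiChannelMatcher(nn.Module):
    """Multi-channel query-document matching (cf. Eqs. 1-2), with an arbitrary base matcher."""

    def __init__(self, base_matcher: nn.Module, feat_dim: int = 128, num_labels: int = 3):
        super().__init__()
        # base_matcher maps a (query, title) pair of embedded sequences to a feature vector.
        self.base_matcher = base_matcher      # shared across channels here for brevity
        self.mlp = nn.Sequential(
            nn.Linear(3 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_labels),
        )

    def forward(self, q_tok, d_tok, q_ent, d_ent, q_tri, d_tri):
        s_w = self.base_matcher(q_tok, d_tok)      # token channel
        s_e = self.base_matcher(q_ent, d_ent)      # entity channel
        s_t = self.base_matcher(q_tri, d_tri)      # triple channel
        s = torch.cat([s_w, s_e, s_t], dim=-1)     # Eq. (2): concatenate channel features
        return torch.softmax(self.mlp(s), dim=-1)  # Eq. (1) followed by the softmax layer
```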

Architecture-I (ARC-I)

ARC-I is a classical sentence matching model following the Siamese architecture [Hu et al.2014]. It first uses neural networks like CNNs to encode the query and the document into sentence representations separately, and then measures the similarity between these two representations. Here, $\mathbf{s}_w$ is the concatenation of the final query and title representations with token sequences as inputs, and similarly for $\mathbf{s}_e$ and $\mathbf{s}_t$.

MatchPyramid

Differing from ARC-I, MatchPyramid calculates the sentence similarity directly from the token-level interaction matrix rather than from sentence representations [Pang et al.2016]. We use the cosine similarity to build the interaction matrix. The similarity features are generated after the 2D convolution and pooling layers.
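As a small illustration of the interaction-matrix input, a cosine similarity matrix between query and title token embeddings can be built as below; the tensor shapes are assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

def cosine_interaction_matrix(q_emb: torch.Tensor, d_emb: torch.Tensor) -> torch.Tensor:
    """Token-level interaction matrix fed to MatchPyramid's 2D convolution and pooling layers.

    q_emb: (len_q, dim) query token embeddings; d_emb: (len_d, dim) title token embeddings.
    Returns a (len_q, len_d) matrix of cosine similarities.
    """
    q = F.normalize(q_emb, dim=-1)
    d = F.normalize(d_emb, dim=-1)
    return q @ d.t()
```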

Inter-Weighted Alignment Network (IWAN)

IWAN is an effective sentence matching model using the orthogonal decomposition strategy [Shen, Yang, and Deng2017]. It calculates the query-document similarity based on the orthogonal and parallel components of the sentence representations. For a query, IWAN first utilizes a Bi-LSTM layer to get the hidden states $\mathbf{h}^q_i$ (and correspondingly $\mathbf{h}^d_j$ for the document). Next, a query-document attention mechanism is used to generate the alignment representation $\mathbf{h}^{q'}_i$ of the query:

$\mathbf{h}^{q'}_i = \sum_{j} \frac{\exp(\mathbf{h}^q_i \cdot \mathbf{h}^d_j)}{\sum_{k} \exp(\mathbf{h}^q_i \cdot \mathbf{h}^d_k)} \mathbf{h}^d_j$.    (3)

Based on $\mathbf{h}^q_i$ and $\mathbf{h}^{q'}_i$, the parallel and orthogonal components are formalized as follows:

$\mathbf{h}^{q,p}_i = \frac{\mathbf{h}^q_i \cdot \mathbf{h}^{q'}_i}{\mathbf{h}^{q'}_i \cdot \mathbf{h}^{q'}_i} \mathbf{h}^{q'}_i, \qquad \mathbf{h}^{q,o}_i = \mathbf{h}^q_i - \mathbf{h}^{q,p}_i$,    (4)

in which $\mathbf{h}^{q,p}_i$ indicates the parallel component that captures the semantic parts shared with the document, while $\mathbf{h}^{q,o}_i$ indicates the orthogonal component that captures the conflicts between query and document. At last, both orthogonal and parallel components of the query and the document are concatenated to form the final query-document similarity features.
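The decomposition in Eqs. (3)-(4) can be written compactly as below. This sketch follows our reconstruction above (dot-product attention for alignment, projection for the parallel part); the exact scoring function used by IWAN may differ.

```python
import torch

def orthogonal_decomposition(h_q: torch.Tensor, h_d: torch.Tensor):
    """Parallel / orthogonal components of query states w.r.t. their aligned document states.

    h_q: (len_q, dim) Bi-LSTM states of the query; h_d: (len_d, dim) states of the title.
    """
    # Eq. (3): attention-based alignment representation for every query position.
    attn = torch.softmax(h_q @ h_d.t(), dim=-1)   # (len_q, len_d)
    h_q_align = attn @ h_d                        # (len_q, dim)

    # Eq. (4): project h_q onto its alignment (parallel part); the residual is orthogonal.
    scale = (h_q * h_q_align).sum(-1, keepdim=True) / \
            (h_q_align * h_q_align).sum(-1, keepdim=True).clamp_min(1e-8)
    parallel = scale * h_q_align
    orthogonal = h_q - parallel
    return parallel, orthogonal
```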

Implementation Details

The query-document matching module is trained as a classification task. We utilize a softmax layer which outputs three labels: similar, related and unrelated. We use the cross-entropy loss, which is formalized as follows:

$L = -\sum_{i=1}^{N} \sum_{j=1}^{3} y_{ij} \log \hat{y}_{ij}$,    (5)

where $N$ represents the number of training pair instances. $y_{ij}$ equals $1$ only if the human-annotated label of the $i$-th sentence pair is the $j$-th label, and otherwise equals $0$, while $\hat{y}_{ij}$ is the predicted probability of the $j$-th label.

In this paper, we conduct KAQA concentrating on the field of software customer service. In KG construction, we automatically extract entity and triple candidates via the classical models stated above. For practical use, we further conduct human annotations to make sure the accuracy of the KG is above 95%. In the query anchoring module, the synonym and hypernym/hyponym relations are directly utilized for entity and triple normalization and generalization, while component_of is mainly utilized for knowledge reasoning in triple disambiguation. In the query-document matching module, we empirically consider only the triples with has_operation as the triple part of knowledge anchors, since this relation type captures the core semantics most directly. It is not difficult to consider more relation types in our multi-channel matching framework. Further implementation details of knowledge graph construction and query anchoring are given in the appendix for reproduction and further exploration.

Experiments

In experiments, we evaluate KAQA on the query-document matching task with a dataset extracted from a real-world software customer service FAQ-based QA system. We also give detailed analysis on different settings and typical cases.

Dataset and knowledge graph

In this paper, we construct a new dataset FAQ-SCS for evaluation, since there are few large open-source FAQ datasets. In total, FAQ-SCS contains query-title pairs extracted from a real-world software customer service FAQ-based QA system. All query-title pairs are manually annotated with similar, related and unrelated labels. Similar indicates that the query and title share exactly the same intention, related indicates that they have related core semantics with minor differences, and unrelated indicates that the sentence pair has different intentions. For evaluation, we randomly split all instances into train, valid and test sets. The statistics of FAQ-SCS are listed in Table 2. FAQ-SCS will be released in the future for further exploration.

Dataset #train #valid #test
FAQ-SCS 23,304 2,913 2,917
Table 2: Statistics of FAQ-SCS.

We also construct a knowledge graph KG-SCS in the target software customer service domain for extracting knowledge anchors. Specifically, KG-SCS contains entities and relations. After entity normalization via alignments with synonym relations, there are totally entities and triples. triples have has_operation relation, triples have component_of relation, and triples have hypernym/hyponym relation. After query anchoring, there are entities and triples appeared in FAQ-SCS, queries and titles have at least one triple. The average entity numbers of query and title are and . The average triple candidate numbers of query and title are and , which are and after triple disambiguation.

Experimental Settings

In the KAQA framework, we implement three representative models for sentence matching in our multi-channel matching module, namely the Siamese architecture model ARC-I [Hu et al.2014], the lexical interaction model MatchPyramid [Pang et al.2016], and the orthogonal decomposition model IWAN [Shen, Yang, and Deng2017], with their original versions considered as baselines. It is not difficult to plug other matching models into KAQA, and we do not compare with KBQA models since they focus on a different task. All models share the same dimension of hidden states as . In training, the batch size is set to be while the learning rate is set to be . For ARC-I and MatchPyramid, the dimension of input embeddings is . The number of filters is and the window size is in the CNN encoder. For IWAN, the dimension of input embeddings is . All parameters are optimized on the valid set with grid search. For fair comparisons, all KAQA models and baselines follow the same experimental settings.

Analysis on Query-document Matching

Query-document matching task aims to predict whether a query-title pair is similar, related or unrelated, which is essential for the ranking modules in real-world systems.

Evaluation Protocol

We implement three different types of sentence matching models ARC-I, MatchPyramid and IWAN as our baselines to show the robustness and flexibility of our framework. All baselines are enhanced with knowledge anchors as their KAQA versions. We consider the evaluation as a classification task, where KAQA models and baselines should predict whether the query-document pair is unrelated, related or similar. As for evaluation metric, we report the average accuracies across runs for all models.

Model Accuracy
MatchPyramid [Pang et al.2016] 0.714
ARC-I [Hu et al.2014] 0.753
IWAN [Shen, Yang, and Deng2017] 0.778
KAQA (MatchPyramid) 0.747
KAQA (ARC-I) 0.773
KAQA (IWAN) 0.797
Table 3: Results of query-document matching.

Experimental Results

The results are demonstrated in Table 3, and we can observe that:

(1) The KAQA models consistently outperform their corresponding original models on FAQ-SCS, among which KAQA (IWAN) achieves the best accuracy. It indicates that knowledge anchors can capture core semantics precisely. The consistent improvements also verify the capability of our query anchoring and multi-channel matching modules in modeling the rich information in knowledge anchors. Moreover, KAQA achieves even larger improvements for queries or titles that have multiple triple candidates, which implies the capability of KAQA in handling informality and ambiguity in natural language.

(2) All KAQA models with different types of sentence matching models show improvements over their original versions without knowledge anchors. Specifically, we evaluate our KAQA framework with a Siamese architecture model (ARC-I), a lexical interaction model (MatchPyramid) and an orthogonal decomposition model (IWAN). The consistent improvements reconfirm the flexibility and robustness of KAQA in real-world FAQ-based QA systems, where the number of FAQ articles in the overall corpus varies from hundreds to millions. The KAQA framework enables systems to use both simple and sophisticated sentence matching algorithms according to the practical balance between effectiveness and efficiency.

To further confirm the power of the KAQA framework in real-world scenarios, we conduct an online A/B test on a widely-used FAQ-based QA system, implementing the KAQA framework alongside its corresponding baseline model. We conduct the online A/B test for days, with approximately million people influenced by our online models. The experimental results show that KAQA achieves improvements on click-through rate compared to the baseline model with significance level . With the help of knowledge anchors, KAQA also performs better in interpretability, cold start and immediate manual intervention. It has also been successfully used in other domains like medicine.

Analysis on Query Anchoring

The quality of knowledge anchors is essential in KAQA framework, which is strongly influenced by the triple disambiguation. In this subsection, we evaluate the effectiveness of different triple disambiguation models.

Model Accuracy AUC
RB 0.588 0.646
RB+KR 0.619 0.679
RB+KR+NTD 0.876 0.917
Table 4: Results of triple disambiguation.
Query Title ARC-I KAQA Label
How to delete my chat records in WeChat? Can WeChat recover those chat records which have already been deleted? 2 0 0
(chat record, has_operation, delete) (chat record, has_operation, recover)
How to not add pictures (when sending messages) in Moments? In Moments, can I only share textual messages without attaching figures? 0 2 2
(picture, has_operation, (not) add) (figure, has_operation, (without) attach)
How to log in WeChat with new account? Can I log in WeChat with two different accounts simultaneously ? 2 2 1
((new) account, has_operation, log in) ((two) account, has_operation, log in)
What should I do to set up administrators of the chatting group? How to change the administrator in my chatting group? 0 0 2
(administrator, has_operation, set up) (administrator, has_operation, change)
Table 5: Examples of query-title pairs, labels and predicted results of KAQA and baseline. Label 0/1/2 indicates unrelated/related/similar. For convenience, we only show the triples with core semantics in knowledge anchors.

Evaluation Protocol

We construct a new triple disambiguation dataset for query anchoring evaluation. Specifically, we randomly sample queries from a real-world software customer service system. To make this task more challenging, we only select complicated queries which have at least two triple candidates with the has_operation relation before triple disambiguation. At last, we sample queries with triples. After manual annotation, there are correct triples that represent the core semantics, while the rest are incorrect. We randomly select queries for evaluation.

There are mainly three triple disambiguation components in the final ensemble model. We use RB to indicate the basic rule-based model, KR to indicate the knowledge reasoning model, and NTD to indicate the neural triple disambiguation model. We conduct three combinations to demonstrate the contributions of different models in triple disambiguation, using accuracy and AUC as our evaluation metrics.

Experimental Results

From the triple disambiguation results in Table 4, we can observe that:

(1) The ensemble model RB+KR+NTD that combines all three disambiguation components achieves the best performances on both accuracy and AUC. Differing from titles in documents which are more formal, user queries usually struggle with abbreviations, informal representations and complex logical conditions. The results reconfirm that our triple disambiguation model is capable of capturing user intention precisely, even with the complicated queries containing multiple triple candidates. We will give detailed analysis on such complicated queries in case study.

(2) The neural triple disambiguation component brings in huge improvements compared to rule-based and knowledge reasoning models. It indicates that the supervised information and the generalization ability introduced by neural models are essential in triple disambiguation. Moreover, RB+KR model significantly outperforms RB model, which confirms the effectiveness of knowledge-based filters.

Analysis on Knowledge Anchors

In this subsection, we attempt to figure out which components of knowledge anchors exactly improve the performance. We evaluate two settings: the first only considers entities in knowledge anchors, while the second only considers triples. We report the accuracies of these two settings based on KAQA (ARC-I) in Table 6. We find that both settings show consistent improvements over the original model, which also implies that the qualities of the KG and knowledge anchors are significant. Moreover, triples seem to play a more critical role in knowledge anchors.

Model Accuracy
ARC-I 0.753
KAQA (ARC-I) (entity) 0.762
KAQA (ARC-I) (triple) 0.766
KAQA (ARC-I) (all) 0.773
Table 6: Results of different knowledge anchors.

Case Study

In Table 5, we give some representative examples to show the pros and cons of using knowledge anchors. In the first case, KAQA successfully finds the correct knowledge anchor (chat record, recover) in the title via the triple disambiguation model, avoiding the confusion caused by the candidate operation delete, while the original ARC-I model makes a mistake by judging only from tokens. In the second case, there is a semantic ellipsis (send messages) in the user query that confuses ARC-I, which often occurs in QA systems. However, KAQA successfully captures the core semantics (not add pictures) to get the right prediction; the alignment between “figure” and “picture” in the KG also helps. However, there are also some side effects caused by knowledge anchors. In the third case, the knowledge anchors merely concentrate on the core semantic operation log in WeChat account, paying less attention to the difference between “new” and “two”. Therefore, KAQA gives a wrong prediction of similar. In the last case, KAQA does extract the correct knowledge anchors. However, although set up and change have different meanings, set up/change administrator should indicate the same operation in such a scenario. Considering the synonym and hypernym/hyponym relationships between triples would partially solve this issue.

Conclusion and Future Work

In this paper, we propose a novel Knowledge Anchor based Question Answering (KAQA) framework for real-world FAQ-based QA systems. We consider entities and triples in texts as knowledge anchors to precisely capture core semantics for NLU and matching. The knowledge anchors and multi-channel matching strategy bring in better performance, interpretability and flexibility, and can be rapidly adapted to other domains in practical systems. Experimental results confirm the effectiveness and robustness of KAQA. The code and datasets will be released in the future.

We will explore the following research directions in the future: (1) We will consider more sophisticated methods to fuse knowledge anchors into the multi-channel matching module. (2) We will conduct query anchoring and query-document matching simultaneously with a joint learning algorithm. (3) We will explore the relatedness between entities and triples to better model knowledge anchor similarities.

References

  • [Auer et al.2007] Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; and Ives, Z. 2007. Dbpedia: A nucleus for a web of open data. The semantic web.
  • [Beduè et al.2018] Beduè, P.; Graef, R.; Klier, M.; and Zolitschka, J. F. 2018. A novel hybrid knowledge retrieval approach for online customer service platforms. In Proceedings of ECIS.
  • [Bilotti et al.2007] Bilotti, M. W.; Ogilvie, P.; Callan, J.; and Nyberg, E. 2007. Structured retrieval for question answering. In Proceedings of SIGIR.
  • [Bollacker et al.2008] Bollacker, K.; Evans, C.; Paritosh, P.; Sturge, T.; and Taylor, J. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of KDD.
  • [Cavnar, Trenkle, and others1994] Cavnar, W. B.; Trenkle, J. M.; et al. 1994. N-gram-based text categorization. In Proceedings of SDAIR.
  • [Cui, Xiao, and Wang2016] Cui, W.; Xiao, Y.; and Wang, W. 2016. Kbqa: An online template based question answering system over freebase. In Proceedings of IJCAI.
  • [Dong et al.2015] Dong, L.; Wei, F.; Zhou, M.; and Xu, K. 2015. Question answering over freebase with multi-column convolutional neural networks. In Proceedings of ACL.
  • [Hu et al.2014] Hu, B.; Lu, Z.; Li, H.; and Chen, Q. 2014. Convolutional neural network architectures for matching natural language sentences. In Proceedings of NIPS.
  • [Huang et al.2013] Huang, P.-S.; He, X.; Gao, J.; Deng, L.; Acero, A.; and Heck, L. 2013. Learning deep structured semantic models for web search using clickthrough data. In Proceedings of CIKM.
  • [Huang et al.2019] Huang, X.; Zhang, J.; Li, D.; and Li, P. 2019. Knowledge graph embedding based question answering. In Proceedings of WSDM.
  • [Jeon, Croft, and Lee2005] Jeon, J.; Croft, W. B.; and Lee, J. H. 2005. Finding similar questions in large question and answer archives. In CIKM.
  • [Joshi et al.2018] Joshi, M.; Choi, E.; Levy, O.; Weld, D. S.; and Zettlemoyer, L. 2018. pair2vec: Compositional word-pair embeddings for cross-sentence inference. arXiv preprint arXiv:1810.08854.
  • [Joulin et al.2016] Joulin, A.; Grave, E.; Bojanowski, P.; and Mikolov, T. 2016. Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.
  • [Kothari et al.2009] Kothari, G.; Negi, S.; Faruquie, T. A.; Chakaravarthy, V. T.; and Subramaniam, L. V. 2009. Sms based interface for faq retrieval. In Proceedings of ACL.
  • [Kwiatkowski et al.2013] Kwiatkowski, T.; Choi, E.; Artzi, Y.; and Zettlemoyer, L. 2013. Scaling semantic parsers with on-the-fly ontology matching. In Proceedings of EMNLP.
  • [Lafferty, McCallum, and Pereira2001] Lafferty, J.; McCallum, A.; and Pereira, F. C. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML.
  • [Liang, Jordan, and Klein2013] Liang, P.; Jordan, M. I.; and Klein, D. 2013. Learning dependency-based compositional semantics. Computational Linguistics.
  • [Lin et al.2016] Lin, Y.; Shen, S.; Liu, Z.; Luan, H.; and Sun, M. 2016. Neural relation extraction with selective attention over instances. In Proceedings of ACL.
  • [Lu and Li2013] Lu, Z., and Li, H. 2013. A deep architecture for matching short texts. In Proceedings of NIPS.
  • [Mhaisale, Patil, and Mahamuni2013] Mhaisale, S.; Patil, S.; and Mahamuni, K. 2013. Weighted edit distance based faq retrieval using noisy queries. In FIRE.
  • [Palangi et al.2016] Palangi, H.; Deng, L.; Shen, Y.; Gao, J.; He, X.; Chen, J.; Song, X.; and Ward, R. 2016. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval. TASLP.
  • [Pang et al.2016] Pang, L.; Lan, Y.; Guo, J.; Xu, J.; Wan, S.; and Cheng, X. 2016. Text matching as image recognition. In AAAI.
  • [Riezler et al.2007] Riezler, S.; Vasserman, A.; Tsochantaridis, I.; Mittal, V.; and Liu, Y. 2007. Statistical machine translation for query expansion in answer retrieval. In Proceedings of ACL.
  • [Robertson, Zaragoza, and others2009] Robertson, S.; Zaragoza, H.; et al. 2009. The probabilistic relevance framework: Bm25 and beyond. Foundations and Trends® in Information Retrieval.
  • [Seo et al.2016] Seo, M.; Kembhavi, A.; Farhadi, A.; and Hajishirzi, H. 2016. Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603.
  • [Shen, Yang, and Deng2017] Shen, G.; Yang, Y.; and Deng, Z.-H. 2017. Inter-weighted alignment network for sentence pair modeling. In Proceedings of EMNLP.
  • [Socher et al.2011] Socher, R.; Huang, E. H.; Pennin, J.; Manning, C. D.; and Ng, A. Y. 2011. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In Proceedings of NIPS.
  • [Xiaohua and Xiupei2011] Xiaohua, W. R. L. J. P., and Xiupei, L. 2011. An improved forward maximum matching algorithm for chinese word segmentation. Computer Applications and Software.
  • [Xiong, Merity, and Socher2016] Xiong, C.; Merity, S.; and Socher, R. 2016. Dynamic memory networks for visual and textual question answering. In Proceedings of ICML.
  • [Yao and Van Durme2014] Yao, X., and Van Durme, B. 2014. Information extraction over structured data: Question answering with freebase. In Proceedings of ACL.
  • [Yih et al.2015] Yih, S. W.-t.; Chang, M.-W.; He, X.; and Gao, J. 2015. Semantic parsing via staged query graph generation: Question answering with knowledge base. In Proceedings of ACL.
  • [Yih, He, and Meek] Yih, W.-t.; He, X.; and Meek, C. 2014. Semantic parsing for single-relation question answering. In Proceedings of ACL.
  • [Yin et al.2016] Yin, J.; Jiang, X.; Lu, Z.; Shang, L.; Li, H.; and Li, X. 2016. Neural generative question answering. In Proceedings of IJCAI.
  • [Zhang et al.2016] Zhang, Y.; Liu, K.; He, S.; Ji, G.; Liu, Z.; Wu, H.; and Zhao, J. 2016. Question answering over knowledge base with neural attention combining global knowledge information. arXiv preprint arXiv:1606.00979.
  • [Zhang et al.2018] Zhang, Y.; Dai, H.; Kozareva, Z.; Smola, A. J.; and Song, L. 2018. Variational reasoning for question answering with knowledge graph. In Proceedings of AAAI.
  • [Zheng et al.2018] Zheng, W.; Yu, J. X.; Zou, L.; and Cheng, H. 2018. Question answering over knowledge graphs: question understanding via template decomposition. Proceedings of VLDB.
  • [Zhou et al.2016] Zhou, G.; Zhou, Y.; He, T.; and Wu, W. 2016. Learning semantic representation with neural networks for community question answering retrieval. Knowledge-Based Systems.