Learning Latent Beliefs and Performing Epistemic Reasoning for Efficient and Meaningful Dialog Management

Learning Latent Beliefs and Performing Epistemic Reasoning for Efficient and Meaningful Dialog Management


Many dialogue management frameworks allow the system designer to directly define belief rules to implement an efficient dialog policy. Because these rules are directly defined, the components are said to be hand-crafted. As dialogues become more complex, the number of states, transitions, and policy decisions becomes very large. To facilitate the dialog policy design process, we propose an approach to automatically learn belief rules using a supervised machine learning approach. We validate our ideas in Student-Advisor conversation domain, where we extract latent beliefs like student is curious, confused and neutral, etc. Further, we also perform epistemic reasoning that helps to tailor the dialog according to student’s emotional state and hence improve the overall effectiveness of the dialog system. Our latent belief identification approach shows an accuracy of 87% and this results in efficient and meaningful dialog management.


Dialog based interaction between a customer and a bot may become tedious and irrelevant if the system’s beliefs are not consistent with the set of beliefs of the customer. Beliefs are cognitive representational states that represent the presumed context of the conversation perceived by each agent. A key limitation in today’s conversational interfaces is their lack of robustness when understanding the latent beliefs. The inherent difficulties of conversational systems (chatbots) are further increased by the conditions under which these systems typically operate: increasingly larger vocabularies, large and diverse user populations, spontaneous input, etc. Unless mediated by better belief identification and robust reasoning mechanisms, these errors propagate to subsequent stages of processing in a dialog system and exert a considerable negative impact on the quality and ultimately the success of the interaction.

Identifying latent beliefs is a challenging task since it depends on various factors like identifying and analyzing context. Other challenges that exist in identifying latent beliefs is the reference to multiple concepts and the need to extract a large amount of facts, commonsense knowledge, anaphora resolution, and logical reasoning. Consider the example, “I am looking for a heavy course", it is difficult to identify the latent belief if the student is looking for a course which is heavy in terms of the workload; class size and/or the nature/quality of assignments.

We propose a novel approach for identifying more accurate beliefs over concept values in conversational systems by integrating information across multiple turns in the conversation. Traditional machine learning approaches train a system with extremely large dialog corpus that covers a variety of scenarios. Another approach is to build a system with a complex set of hand-crafted rules that may address some specific instances. Both approaches may be impractical in many real-world domains. Our framework is based on a combination of machine-learning mechanism and logical reasoning to understand the latent beliefs. Our model then evaluates the beliefs to tailor the dialog and make it consistent with the set of beliefs of the user. This process then helps drive the conversation in a meaningful way.

Related Work

Deep learning based dialog systems [\citeauthoryearMiller et al.2016] use memory networks to learn the underlying dialog structure and carry out goal-oriented dialog. However, they do not factor in beliefs or trigger epistemic rules in modifying the conversation given the evolving context. In [\citeauthoryearWilliams, Raux, and Henderson2016] Williams et.al, describe the dialog state tracking challenge and mention “how the task of correctly inferring the state of the conversation - such as the user’s goal - given all of the dialog history up to that turn” is important. It is in this overall context, we propose that it is important to evaluate the probable beliefs held by the human and tailor the dialog system suitably to be consistent with the beliefs in order to hold a relevant conversation.

Although a number of attempts have been made to build dialog systems [\citeauthoryearHenderson, Thomson, and Young2014],  [\citeauthoryearWeston2016],  [\citeauthoryearWilliams, Raux, and Henderson2016], the use of epistemic rules in driving the dialog in a consistent way with the beliefs has not yet been tackled. Various approaches to dialog management have been proposed and these can be broadly classified into finite-state methods, probabilistic methods and deep learned methods.

There are recent motivating examples of works that make use of machine-learning to build intelligent dialog systems. Traditional dialog systems are specialized for a domain and rely on slot-filling driven by a knowledge base and a finite-state model [\citeauthoryearLemon et al.]. The finite-state model represents the dialog structure in the form of a state transition network in which the nodes represent the system’s interactions and the transitions between the nodes determine all the possible paths through the network.

Uckelman [\citeauthoryearSara2010] describes how in a formal dialog system, dynamic epistemic logic can be used in an Obligatio, where two agents, an Opponent and a Respondent, engage in an alternating-move dialog to establish the consistency of a proposition. Sadek et al. [\citeauthoryearSadek, Bretier, and Panaget1997] proposes a reasoning engine to build effective and generic communicating agents. Motivated by the above works, we propose to identify the beliefs, use these beliefs to trigger epistemic rules, and use the assertions of the rules to drive the conversation by tailoring the states in a finite-state machine dialog system.

Opinion mining methods have been in use for over long time. Recently, these methods have been applied to dialog systems. Roy et al. [\citeauthoryearRoy et al.2016] propose a novel approach to consider customer satisfaction to tailor the dialog. While general sentiment analysis methods are useful to understand a customer’s mental situation, they may need to be complemented with more domain specific information leading to richer fine grained classes of sentiments. For this reason, it is necessary to develop methods which take into account domain specific sentiments information.

Supervised Approach for Automatically Learning Latent Beliefs

A key issue in dialogue management is the design of the dialogue policy incorporating latent beliefs. While it is possible to design this by hand, such an approach has several shortcomings. Development is manually intensive and expensive, the systems are difficult to maintain and performance is often sub-optimal. In our previous work, we highlighted some of the issues in hand crafting belief rules [\citeauthoryearSangroya et al.2018].

An alternative to hand-crafting belief rules is to automatically learn them given a large annotated corpus of utterances and corresponding labeled beliefs. However, it is challenging to design a machine learning model as the performance of the model would be degraded if it is based on simple linguistic features such as bag-of-words (See Figure 1). Sometimes, the available training data consist of large number of closely related classes that makes the problem of belief identification more difficult.

Figure 1: Word Cloud of Dialog Utterances of a Curious Student

It is challenging task to build an automatic system that can understand the latent beliefs of humans (more specifically students). For example, there may be a student who is confused and directionless, might come up with something like “I have no inkling of where I want my life to go and am unable to determine what classes to take. I’m interested in machine learning. Can you help me decide what to do?” . There may be another student who has done his homework on what he wants but needs only little direction from the advisor. These two different categories of students needs different conversational flow and a different set of dialog policy needs to be tailored for each case.

We take the initial input from the student to advisory bot and categorize it into one of the tree categories based on there emotional belief. We consider 3 categories “curious″, “neutral″, “confused″ and manually annotate them to student’s utterances for each dialog. Most of the conversations seem to be similar in nature. Therefore, it becomes a challenging task to annotate the data correctly and pass it to machine learning model.

Figure 2 illustrates the proposed methodology for identifying and updating the latent beliefs. Its inputs include the utterances and a domain ontology.

Figure 2: Domain Specific Latent Belief Extraction Process

Step 1: Classifying Latent Beliefs

If an utterance belongs to more critical categories such as confused, it is assigned a higher weight as compared to category such as curious. This is intuitive that student with a more confused state of mind would need more attention and a specific dialog plan. The latent belief classification is done with the help of machine learning in an automatic fashion.

Step 2: Enriching Beliefs using Domain Ontology

We make use of knowledge mining derived using the domain ontology to update the latent beliefs. Our ontology consists of a large knowledge graph expressing the information about a domain. For example, this includes the course information, their easiness ratings, workload ratings and class size (See Figure 3). This information is available in a structured RDF graph. Use of ontology assists in enriching the beliefs further. It also helps to map the utterance which can be in any natural language form to a structured information that can be used to tailor the FSM based dialog manager.

Figure 3: Domain Knowledge for Belief Enrichment

Epistemic Reasoning over Latent Beliefs and Domain Knowledge

The extracted latent beliefs and the domain knowledge trigger the epistemic rules. For example “Belief (Student is confused) and Course_Load(High) => Knows-Agent (student is not confident), Knows-Agent (should advise light courses)” asserts facts about the current epistemic state of the agent. We use Prolog to define epistemic rules and carry out the reasoning against facts and domain knowledge.

Step 3: Learning the Model for Latent Belief Extraction

We now train a LSTM (Long Short Term Memory) network to build a model that can automatically categorize the beliefs based upon the information described in previous section (See Algorithm 1). Like many other studies of LSTM on text, words are first converted to low-dimensional dense word vectors via a word embedding layer. The first layer is therefore the embedding layer that uses 32 length vectors to represent each word. The next layer is the LSTM layer with 100 units. Finally, we use a dense output layer with 5 neurons (5 classes/labels) and a softmax activation function to make the predictions. We used Categorical_Cross_Entropy as the loss function (in Keras) alongwith ADAM optimizer. For regularization, we employ dropout to prevent co-adaptation. We run the experiment for 20 epochs with a batch size of 64.

0:  FSM for dialog
0:  User Input as Dialog Utterance
0:  Fine Grain Set of Latent Beliefs
0:  Meaningful dialog states
  for all Utterances in  do
     for all states of FSM do
        Remove Stopwords and Punctuations
        Convert to Lowercase
        Mark special name-entities
        for all sentence in  do
           Extract Latent Beliefs , etc.
           Using Domain Ontology , , etc., update latent beliefs
           Using Information Extraction, update
           Using facts and EPISTEMIC_rulebase, fill slots of Dialog manager Frames
           update weight of state using Latent Beleifs and Epistemic Reasoning
           Using threshold and weights, build meaningful states of FSM (ask_states, skip_states)
        end for
     end for
  end for
Algorithm 1 Algorithm for Updating Dialog States using Latent Belief Extraction

In our experiments we consider dialog dataset student advisor conversations from https://www.ibm.com/blogs/research/2018/07/sentence-selection-dstc7. We manually annotate latent beliefs across three categories such as Curious, Confused and Neutral. Total number of utterances after data processing were 3500 (Figure 4). The clean-up process involves converting text to lower case; tokenizing the sentences; and removal of punctuations and stopwords. Each input utterance as input is converted into a vector form. We identify the top 300 unique words and every word in this vocabulary is given a index. If the word is not present in vocabulary we consider it as 0. For Example: I am very disappointed today. The vector representation of this sentence is [10, 100, 23, 467, 0]. Next, we need to truncate and pad the input sequences so that they are all the same length. We take the max length of utterance to be 50. We divide the data into training set (75%) and test set (25%). We ensure that we do not have data sparsity issue i.e. We keep approximate equal proportion of data for each class.

Experimental Results

Figure 4: Number of Utterances in Training Data for each Category (Latent Beliefs)

We evaluate the proposed method on the seventh Dialog State Tracking Challenge (DSTC-7) dataset (“Flex Data: Student – Advisor dialogue” [\citeauthoryearFle2018]). We used 3500 conversations including list of paraphrases for each utterance as training data for belief classification. It becomes a little challenging to tag them to correct categories because of lexical similarity between these three classes. We got 84% classification accuracy for latent belief extraction. We also tried bag-of-words and naive bayes models that performed badly for this experiment. As we discussed earlier, this is possible due to lot of common words between three classes.

Example: Tailoring the Chatbot

We parse a dialog through Dependency parsers (such as Stanford-CoreNLP, GATE, MITIE etc.) and extract triples from the description by focusing on the dependencies identified among nouns and verbs. For example, for a description ”I prefer morning classes as I sleep early at night", triples such as (I, prefer, morning classes) are extracted. Once the triples are extracted, we use a hand-crafted fact-assertion rulebase to assert facts implied by the triples. This is done by evaluating the triples in the context of a student-advisor ontology, synonym-and-slang dictionary, information-extraction patterns that are relevant for the category of the belief, and by triggering the fact-assertion rules. The latent belief identification process helps to understand the student’s emotional beliefs and tailor the conversations accordingly.

We assume that we have a hand-crafted dialog-management finite-state-machine (FSM) to carry out the dialog with the student. The FSM operates on slots that are filled by extracting information from the input dialog and subsequent interaction. The probable beliefs of the student that were asserted as facts are then evaluated by the epistemic rules encoded in a knowledge base for the domain. The rules make assertions about the states in the FSM that need to be skipped and the states that need to be evaluated in order to be consistent with the beliefs of the student. The subsequent dialog is carried out and the next set of beliefs are then asserted. The cycle then continues. As shown in Table 1, we can observe that chatbot is able to skip some FSM states as a result of latent beliefs. The beliefs and epistemic rules helped tailor the dialog to the user’s expectations. In this work, we demonstrate the overall architecture where we use the output of RNN based belief identification model as an input to the epistemic rule engine. Our approach is generic and can be applied easily in any other domains.

Advisor Hi! I am your advisor. You can ask any doubts in selection of your courses for next semester.

I am a junior year student with interest in statistics and data analysis
Output (-) Latent Belief: Student(Curious)

Advisor (skipstate(ask_interest), skipstate(ask_semester))
Do you have any specific requirement about the workload of the course

I would prefer a class with lighter workload and higher helpfulness rating

Do you have any timing preferences?

I prefer morning classes as I sleep early at night.

Advisor (SkipState(ask_extra_details))
I would advise you STATS250 “Statistics and Data Analysis" which is an easy course.

Table 1: Sample Output of Dialog System


This paper presents a supervised approach to identify latent beliefs from dialog utterances and tailor the dialog. In future, we would like to investigate reinforcement learning based approaches for belief identification.


  1. 2018. Student Advising Data (Flex Data). https://www.ibm.com/blogs/research/2018/07/sentence-selection-dstc7/. [Online; accessed September-2018].
  2. Henderson, M.; Thomson, B.; and Young, S. 2014. Word-based dialog state tracking with recurrent neural networks. In Proceedings of the 15th SIGDIAL, 292–299.
  3. Lemon, O.; Liu, X.; Shapiro, D.; and Tollander, C. Hierarchical Reinforcement Learning of Dialogue Policies in a development environment for dialogue systems: REALL-DUDE. In BRANDIAL’06, 185–186.
  4. Miller, A.; Fisch, A.; Dodge, J.; Karimi, A.-H.; Bordes, A.; and Weston, J. 2016. Key-Value Memory Networks for Directly Reading Documents. arXiv preprint arXiv:1606.03126.
  5. Roy, S.; Mariappan, R.; Dandapat, S.; Srivastava, S.; Galhotra, S.; and Peddamuthu, B. 2016. Qa: A system for real-time holistic quality assurance for contact center dialogues. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 3768–3775.
  6. Sadek, M. D.; Bretier, P.; and Panaget, F. 1997. ARTIMIS: Natural Dialogue Meets Rational Agency. In Proceedings of the Fifteenth IJCAI - Volume 2, 1030–1035.
  7. Sangroya, A.; Anantaram, C.; Saini, P.; and Rawat, M. 2018. Automatic extraction of domain specific latent beliefs in customer complaints to help tailor chatbots. In AAAI Workshops, 731–734. AAAI Press.
  8. Sara, U. 2010. Obligationes as formal dialogue systems. In Stairs 2010: Proceedings of the Fifth Starting AI Researchers’ Symposium, volume 222, 341. IOS Press.
  9. Weston, J. 2016. Dialog-based Language Learning. CoRR abs/1604.06045.
  10. Williams, J.; Raux, A.; and Henderson, M. 2016. The Dialog State Tracking Challenge Series: A Review. Dialogue & Discourse.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description