Affective Recommendation System for Tourists
by Using Emotion Generating Calculations
©2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
An emotion orientated intelligent interface consists of Emotion Generating Calculations (EGC) and a Mental State Transition Network (MSTN). We have developed Android EGC application software in which an agent evaluates the feelings expressed in a conversation. In this paper, we develop a tourist information system that estimates the user’s feelings at a sightseeing spot and recommends sightseeing spots and local food corresponding to those feelings. The system calculates the recommendation list with an estimation function that combines Google search results, the importance of a term on the sightseeing website, and the emotion aroused according to EGC. To show its effectiveness, this paper describes experimental results for several situations during Hiroshima sightseeing.
Our research group proposed a method to estimate an agent’s emotion from the contents of utterances and to express the emotions aroused in the computer agent through synthesized facial expressions [1, 2, 3]. The Emotion Generating Calculations (EGC) method, based on the Emotion Eliciting Condition Theory [4], decides whether an event arouses pleasure or displeasure and quantifies the degree of pleasure under the event.
The emotion calculated by EGC changes according to the emotional state of the agent. Ren [5] describes the Mental State Transition Network (MSTN), a basic concept for approximating human psychological and mental responses. It assumes that human emotion is classified into several stable discrete states, called “mental states,” and that emotion varies through transitions from one state to another with an arbitrary probability. Mera and Ichimura [6, 7] developed a computer agent that transits from one mental state in the MSTN to another according to the kind of emotion generated by EGC. The type and strength of the emotion aroused by EGC trigger the transition of mental states. MSTN can thus measure the user’s feeling as a mood represented by the continuous transition of mental states.
We have developed Android EGC application software in which an agent evaluates the feelings expressed in a conversation. Because the smartphone interface is equipped with speech recognition, the user can not only obtain a variety of information but also converse with the agent in the smartphone. Our proposed techniques, EGC and MSTN, are expected to serve as an emotion orientated intelligent interface.
An application using Android EGC can estimate the user’s feelings. We developed a tourist information system that estimates the user’s feelings at a sightseeing spot and recommends sightseeing spots and local food corresponding to those feelings. Current recommendation systems nominate a great deal of information, including items that remind the user of a sad past occurrence, even when he/she feels unutterable solitude and desolation. Faced with such recommendations, a traveler may stop sightseeing partway.
Our system for Hiroshima tourists guides the user to spots, local food shops, and local gifts collected in the Hiroshima Tourist Map Android application software. Because the smartphone has a GPS device and an acceleration sensor, the application includes a navigation function. In this paper, the system decides the next candidate spots and foods according to the tourist’s feelings estimated by EGC, so that the tourist can enjoy the trip. The system calculates the recommendation list with an estimation function that combines the number of hits for a term retrieved by Google search, the importance of the term on the Hiroshima sightseeing website, and the type and strength of the emotion aroused according to EGC. To show its effectiveness, this paper describes experimental results for several situations during Hiroshima sightseeing.
The remainder of this paper is organized as follows. Section II briefly explains EGC. Section III describes the proposed recommendation method, which combines the following 3 measures: Google search results, the importance of a term on the sightseeing website, and the emotion aroused according to EGC. Section IV explains the recommendation system. Section V gives concluding discussions, including plans for practical experiments in cooperation with local government.
II Emotion Generating Calculations
II-A An Overview of Emotion Generating Process
Fig.1 shows the emotion generating process, in which the user’s utterance is transcribed into a case frame representation based on the results of morphological analysis and parsing. The agent determines the degree of pleasure/displeasure of the event in the case frame representation by using EGC. An EGC equation consists of 2 or 3 terms, such as subject, object, and predicate, each of which has a Favorite Value (FV), the strength of the feeling toward it, described in Section II-C.
Then, the agent divides this simple emotion (pleasure/displeasure) into 20 emotions based on Elliott’s “Emotion Eliciting Condition Theory.” Elliott’s theory requires judging conditions such as “feeling for another,” “prospect and confirmation,” and “approval/disapproval.” The details of this classification method are described in Section II-E.
II-B Case Frame Representation
The case frame structure is based on the predicate phrase and the syntactic dependency between it and the other case elements. Fillmore [8] developed a theory of linguistic analysis that describes the surface syntactic structure of a sentence by a combination of deep cases, i.e., semantic roles. Each verb selects a certain number of deep cases, which form its case frame, as shown in Fig.2.
To transcribe the user’s utterances into the case frame representation, we apply morphological analysis and parsing to the input sentence.
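As a minimal sketch of what a case frame might look like as a data structure, the following Python snippet stores a predicate together with optional subject and object slots. The class name, the field names, and the hand-built example frame are assumptions made for illustration; the morphological analysis and parsing steps themselves are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CaseFrame:
    """One event extracted from an utterance (field names are illustrative)."""
    predicate: str
    subject: Optional[str] = None
    obj: Optional[str] = None

    def elements(self) -> Dict[str, str]:
        """Return only the case elements that are actually present."""
        slots = {
            "subject": self.subject,
            "object": self.obj,
            "predicate": self.predicate,
        }
        return {name: word for name, word in slots.items() if word is not None}


# Hand-built frame for the utterance "I would like to eat a lunch."
# A real system would fill these slots from the parser output.
frame = CaseFrame(predicate="eat", subject="I", obj="lunch")
print(frame.elements())
```

Keeping the slots optional reflects the fact that an event may have only 2 of the 3 elements.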
II-C Favorite Value Database
Whether an event is pleasant or unpleasant is determined by using the Favorite Value (FV). An FV is a positive number for an object the user likes and a negative number for an object the user dislikes. FV is predefined as a real number in the range . There are two types of FVs, personal FV and initial FV. A personal FV is stored in a personal database for each person whom the agent knows well, and it shows the degree of like/dislike for an object from that person’s viewpoint. On the other hand, an initial FV shows the common degree of like/dislike for an object that the agent feels. Generally, it is generated from the agent’s own preference information according to the results of questionnaires. Both personal and initial FVs are stored in the user’s own database. The initial value of an FV is determined beforehand on the basis of a corpus of the applied field, and the FVs of the objects are obtained from a questionnaire. However, there are countless objects in the world. In this paper, we limit the objects that have an initial FV to the words that frequently appear in dialog during sightseeing.
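The two kinds of Favorite Value (FV) described above can be sketched as two dictionaries with a simple lookup. This is only an illustration: the words, the numeric values, the person name, and the rule of falling back from the personal FV to the initial FV are assumptions, not details taken from the system.

```python
from typing import Dict, Optional

# Hypothetical FVs; in the paper they come from questionnaires.
initial_fv: Dict[str, float] = {"lunch": 0.5, "rain": -0.4}
personal_fv: Dict[str, Dict[str, float]] = {"alice": {"rain": 0.2}}


def lookup_fv(word: str, person: Optional[str] = None) -> Optional[float]:
    """Return the personal FV if one is stored, otherwise the initial FV.

    Returns None when the word has no FV at all, which mirrors the
    restriction of FVs to frequently appearing sightseeing words.
    """
    if person is not None:
        value = personal_fv.get(person, {}).get(word)
        if value is not None:
            return value
    return initial_fv.get(word)


print(lookup_fv("rain", "alice"), lookup_fv("rain"), lookup_fv("castle"))
```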
II-D Equation of EGC
We assume that the emotional space is three-dimensional. We therefore present a method to distinguish pleasure/displeasure for an event by judging the existence of a ‘synthetic vector’.
Table I shows the correspondence between the case elements in the EGC equations and the axes in the three-dimensional model. In Table I, ‘V(S,*)’ is the type of event (verb) and ‘A(S,*)’ is the type of attribute (adjective). The variables in Table I denote the Favorite Values (FVs) of the Subject, Object-From, Object-Mutual, Object-Content, Object, Object-To, Object-Source, Predicate, and Instrument or tool.
Table II shows the relation between the sign on each axis and the pleasure/displeasure of the generated emotion. When the vector lies on an axis, the event does not arouse any emotion. When we calculate the synthetic vector of an event that lacks an element, we supply a dummy FV as the missing element. We tentatively defined the dummy FV as . Fig.3 is an example of the emotion space of event type . This event type has three elements, Subject, Object, and Predicate, and the orthogonal vectors given by the elements construct a rectangular solid.
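The construction of the synthetic vector can be sketched in Python as follows. This is an illustration under stated assumptions. The function and parameter names are hypothetical. The dummy value is left to the caller because it is not reproduced in the text. "On an axis" is read as "at most one nonzero component". Taking the vector length as the strength of the emotion is an assumption of this sketch. The sign rules of Table II are not modelled.

```python
import math
from typing import Optional, Tuple

Vector = Tuple[float, float, float]


def synthetic_vector(f_subject: Optional[float],
                     f_object: Optional[float],
                     f_predicate: Optional[float],
                     dummy_fv: float) -> Vector:
    """Build the 3-D vector, substituting the dummy FV for missing elements."""
    values = (f_subject, f_object, f_predicate)
    x, y, z = (dummy_fv if v is None else v for v in values)
    return (x, y, z)


def arouses_emotion(vector: Vector) -> bool:
    """False when the vector lies on a single axis or is the zero vector."""
    nonzero = sum(1 for component in vector if component != 0.0)
    return nonzero >= 2


def vector_length(vector: Vector) -> float:
    """Euclidean length, used here as an assumed measure of strength."""
    return math.sqrt(sum(c * c for c in vector))


# Hypothetical FVs; the object is missing, so the dummy FV is used for it.
v = synthetic_vector(0.5, None, 0.5, dummy_fv=0.3)
print(v, arouses_emotion(v), vector_length(v))
```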
II-E Complicated Emotion Eliciting Method
Based on the emotion values calculated by the EGC method and the situation, pleasure/displeasure is classified into 20 types of emotion. We consider only these 20 emotion types, which fall into six emotional groups: “joy” and “distress” in the “Well-Being” group; “happy-for,” “gloating,” “resentment,” and “sorry-for” in the “Fortunes-of-Others” group; “hope” and “fear” in the “Prospect-based” group; “satisfaction,” “relief,” “fears-confirmed,” and “disappointment” in the “Confirmation” group; “pride,” “admiration,” “shame,” and “disliking” in the “Attribution” group; and “gratitude,” “anger,” “gratification,” and “remorse” in the “Well-Being/Attribution” group. Fig.4 shows the dependency among the groups of emotion types.
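The grouping listed above can be written down as a small lookup table, which also makes it easy to check that the six groups add up to 20 emotion types. The variable and function names are illustrative. The judging conditions that select an emotion within a group are not modelled.

```python
from typing import Dict, List, Optional

# Six emotional groups and their emotion types, as listed in the text.
EMOTION_GROUPS: Dict[str, List[str]] = {
    "Well-Being": ["joy", "distress"],
    "Fortunes-of-Others": ["happy-for", "gloating", "resentment", "sorry-for"],
    "Prospect-based": ["hope", "fear"],
    "Confirmation": ["satisfaction", "relief", "fears-confirmed",
                     "disappointment"],
    "Attribution": ["pride", "admiration", "shame", "disliking"],
    "Well-Being/Attribution": ["gratitude", "anger", "gratification",
                               "remorse"],
}


def group_of(emotion: str) -> Optional[str]:
    """Return the group that contains the given emotion type, if any."""
    for group, emotions in EMOTION_GROUPS.items():
        if emotion in emotions:
            return group
    return None


total_types = sum(len(emotions) for emotions in EMOTION_GROUPS.values())
print(total_types, group_of("relief"))
```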
III Recommendation for Tourist
Our system for Hiroshima tourists guides the user to spots, local food shops, and local gifts collected in the Hiroshima Tourist Map Android application software [10, 11]. Because the smartphone has a GPS device and an acceleration sensor, the application includes a navigation function. In this paper, the system decides the candidate spots and foods according to the tourist’s feelings estimated by EGC, so that the tourist can enjoy the trip. The system builds the recommendation list with an estimation function that consists of information retrieval by Google search, TF-IDF for the Hiroshima tourism website, and the EGC results. The agreement values on these 3 scales are calculated for the words representing spots, foods, and gifts and are composed into one vector. In this paper, 2,284 words related to Hiroshima travel were extracted from the top 50 articles of July 4, 2013 on the blog site 4travel ( http://4travel.jp/ ).
III-A Information retrieval by Google search
In Google search, the best pages tend to be the ones that the most people link to. Moreover, the best description of a page is often derived from the anchor text associated with the links to that page. The technologies surrounding search engines are commonly referred to as information retrieval. The number shown at the top of the results page represents the number of retrieval results. Fig.5 illustrates the cumulative frequency distribution of the retrieval results for the 2,284 words. Based on this distribution, the estimation function for Google search is assumed as Eq.(1). In Eq.(1), means the number of retrieval results by Google search for the selected word. The horizontal and vertical axes of Fig.5 show the normalized number of retrieval results and the cumulative frequency, respectively.
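To illustrate how a cumulative frequency distribution of normalized hit counts can be tabulated, the sketch below uses a handful of made-up hit counts. It does not reproduce Eq.(1). Normalizing by the largest count is an assumption, and the function names and data are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def normalize(counts: Sequence[int]) -> List[float]:
    """Scale hit counts into (0, 1] by dividing by the largest count."""
    top = max(counts)
    return [count / top for count in counts]


def cumulative_frequency(values: Sequence[float], x: float) -> float:
    """Fraction of the values that are less than or equal to x."""
    ordered = sorted(values)
    return bisect_right(ordered, x) / len(ordered)


# Hypothetical hit counts for five words.
hits = [120, 4500, 980, 30000, 15]
norm = normalize(hits)
for value in sorted(norm):
    print(value, cumulative_frequency(norm, value))
```

A fitted estimation function such as the one the paper assumes would be chosen to follow the shape of this empirical curve.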
III-B TF-IDF for Hiroshima Tourism Web Information
Next, we investigated the TF-IDF values of words on Hiroshima tourism websites, as shown in Table III.
In the TF-IDF calculation, the term frequency is the occurrence count of a term in a document, the corpus size is the total number of documents in the corpus, and the document frequency is the number of documents in which the term appears.
In this paper, a document is an HTML file on the tourist website, and a term is a representative word in the user comments of the Android application.
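The quantities named above are the ingredients of the textbook TF-IDF weight, which the following sketch computes as the term count multiplied by the logarithm of the corpus size over the document frequency. The exact variant used in the paper is not recoverable from the text. The use of the natural logarithm, the function name, and the toy corpus are assumptions.

```python
import math
from typing import List, Sequence


def tf_idf(term: str, document: Sequence[str],
           corpus: Sequence[Sequence[str]]) -> float:
    """Textbook TF-IDF: raw term count times log(N / document frequency)."""
    term_count = document.count(term)
    doc_frequency = sum(1 for doc in corpus if term in doc)
    if doc_frequency == 0:
        return 0.0  # the term never appears, so it carries no weight
    return term_count * math.log(len(corpus) / doc_frequency)


# Toy corpus: each document is a list of already tokenized words.
corpus: List[List[str]] = [
    ["castle", "park", "castle"],
    ["park", "oyster"],
    ["park"],
]
print(tf_idf("castle", corpus[0], corpus))  # frequent here, rare elsewhere
print(tf_idf("park", corpus[0], corpus))    # appears everywhere, weight 0
```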
However, this simulation is designed to use only one TF-IDF value per sample. A user does not know how many words in an article represent tourism information and cannot determine the ‘best’ TF-IDF value. If a sample has two or more words with TF-IDF values, it is difficult to select the representative word of the sample. Because the tourists’ subjective data in our MPPS (Mobile Phone based Participatory Sensing System) [10, 11] relate to sightseeing spots in Hiroshima, the words used in comments are limited to familiar locations or gifts.
The collected samples include valuable information that is known to few people. Even when the word with the maximum TF-IDF value was taken as the representative of a sample, there was no appreciable difference from the other words, because most TF-IDF values are small. In this paper, we consider that each comment has at most words, which are divided into 2 groups: the existing words with higher TF-IDF values and the remaining words with lower TF-IDF values. The TF-IDF value for words is 0. The experiment describes TF-IDF values in the case of ‘,’ because there are not many words with high TF-IDF values. The value in the TF-IDF field of Table III then denotes the sum of the TF-IDF values of the words in a comment.
Fig.6 illustrates the cumulative frequency distribution of the TF-IDF values of the 2,284 words. Based on this distribution, the estimation function for TF-IDF is assumed as Eq.(3). In Eq.(3), means the TF-IDF value of the selected word. The horizontal and vertical axes of Fig.6 show the normalized TF-IDF value and the cumulative frequency, respectively.
III-C Estimation by EGC
EGC calculates the type and degree of the user’s emotion, as described in Section II-E. In the system, the spots and foods that arouse a pleasure emotion according to EGC are recommended. In contrast, if the user meets a situation that arouses a displeasure emotion, the user’s displeasure deepens. The words that arouse the user’s displeasure are therefore recorded in a taboo list, and the system avoids using them.
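The taboo list can be sketched as a set that grows whenever a word arouses displeasure and is then used to filter candidate words. The function names, the boolean pleasure flag, and the example words are assumptions for illustration. How the system actually stores the list is not described in the text.

```python
from typing import Iterable, List, Set


def record_reaction(taboo: Set[str], word: str, is_pleasure: bool) -> None:
    """Add the word to the taboo list when it aroused displeasure."""
    if not is_pleasure:
        taboo.add(word)


def filter_candidates(candidates: Iterable[str], taboo: Set[str]) -> List[str]:
    """Keep only the candidates that are not on the taboo list."""
    return [word for word in candidates if word not in taboo]


taboo_list: Set[str] = set()
record_reaction(taboo_list, "crowded museum", is_pleasure=False)
record_reaction(taboo_list, "okonomiyaki", is_pleasure=True)
remaining = filter_candidates(
    ["okonomiyaki", "crowded museum", "castle"], taboo_list)
print(remaining)
```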
III-D Total Estimation
where means the word in the sentence and means the emotion value for the sentence .
In this paper, the system recommends a place, food, or gift corresponding to a noun in the sentence of the user’s utterance. We previously proposed a concierge system for Hiroshima tourists using a Fuzzy Petri Net. The user’s utterances are classified into 2 main types according to their verbs: verbs related to actions toward a place, and verbs related to eating and purchasing.
If the user’s utterance includes a word representing a place, the system calculates the recommended value for the sentence by the total estimation of Section III-D and then selects the top 10 places from the list of recommended places near the place in the utterance. The recommended value of each place in the list is higher than that of the place in the user’s utterance, because the recommended places are enumerated in order of increasing evaluation value.
In contrast, if the user’s utterance includes a word representing a food or gift, the system selects the top 5 foods or gifts from the list in a similar way. The recommended value of each food or gift in the list is higher than that of the user’s utterance.
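The selection step described above can be sketched as a filter followed by a sort. The sketch keeps candidates whose recommended value is at least that of the utterance and returns the best k of them (10 for places, 5 for foods or gifts). The function name, the candidate scores, and the "greater than or equal" comparison are assumptions. The restriction to places near the one mentioned in the utterance is not modelled.

```python
from typing import Dict, List


def recommend(candidates: Dict[str, float],
              utterance_value: float, k: int) -> List[str]:
    """Return up to k candidates scoring at least the utterance value,
    ordered from the highest recommended value downward."""
    kept = [(name, value) for name, value in candidates.items()
            if value >= utterance_value]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept[:k]]


# Hypothetical recommended values for four candidate places.
scores = {"place A": 0.9, "place B": 0.5, "place C": 0.7, "place D": 0.2}
print(recommend(scores, utterance_value=0.5, k=10))
print(recommend(scores, utterance_value=0.5, k=2))
```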
IV Experimental Results
In this paper, we investigated what kind of recommendation was made for the following sentences in a Hiroshima sightseeing situation.
1) I am going to get to the Hiroshima castle.
2) I would like to eat a lunch.
First, the emotion value is calculated by EGC. For sentence 1), the 3 elements in the case frame representation are , , and , respectively. As a result, the degree of pleasure is . That is, the emotion value is and the type is ‘joy’ and ‘happy-for’. Second, as mentioned in Section III-D, the system recommended the sightseeing spots shown in Table IV. The recommended values of the places in the list are equal to or higher than that of the current utterance. The system displays the recommendation list as shown in Fig.7.
| 3 | The Self Defense Forces | 0.7639 | 2.9405E-4 | 0.5830 | 0.9610 |
| 8 | Rihga Royal Hotel Hiroshima | 0.1705 | 8.912E-5 | 0.7071 | 0.7274 |
| 9 | View the scarlet maple leaves | 0.1207 | 1.0161E-4 | 0.7071 | 0.7173 |
| 10 | Hiroshima Peace Memorial Park | 0.1136 | 0.0025 | 0.7071 | 0.7162 |
For sentence 2), the 3 elements in the case frame representation are , , and , respectively. As a result, the degree of pleasure is . That is, the emotion value is and the type is ‘joy’ and ‘happy-for’. Second, as mentioned in Section III-D, the system recommended the local food categories shown in Table V. The recommended values of the categories in the list are equal to or higher than that of the current utterance. The system displays the recommendation list as shown in Fig. 8. If the user clicks an item in the list, the system opens Google Maps at the user’s current location, as shown in Fig. 9. Next, the user can use the navigation application to get to the target restaurant.
| 2 | Fried Oysters Lunch | 0.2937 | 7.3239E-5 | 1.0488 | 1.0892 |
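The rows of Tables IV and V listed above are numerically consistent with the last column being the Euclidean norm of the three preceding values, which would match the statement in Section III that the three agreement values are composed into one vector. The sketch below checks this. It is an inference from the tabulated numbers rather than the authors' stated equation. Reading the three columns as the Google search, TF-IDF, and EGC components, in that order, is an assumption, and the function name is hypothetical.

```python
import math
from typing import List, Tuple


def total_estimation(google: float, tfidf: float, egc: float) -> float:
    """Euclidean norm of the three component scores (inferred form)."""
    return math.sqrt(google ** 2 + tfidf ** 2 + egc ** 2)


# (name, first value, second value, third value, listed total)
rows: List[Tuple[str, float, float, float, float]] = [
    ("The Self Defense Forces", 0.7639, 2.9405e-4, 0.5830, 0.9610),
    ("Rihga Royal Hotel Hiroshima", 0.1705, 8.912e-5, 0.7071, 0.7274),
    ("View the scarlet maple leaves", 0.1207, 1.0161e-4, 0.7071, 0.7173),
    ("Hiroshima Peace Memorial Park", 0.1136, 0.0025, 0.7071, 0.7162),
    ("Fried Oysters Lunch", 0.2937, 7.3239e-5, 1.0488, 1.0892),
]

# Largest gap between the recomputed norm and the listed total.
max_error = max(abs(total_estimation(g, t, e) - listed)
                for _, g, t, e, listed in rows)
print(max_error)
```

The recomputed values agree with the listed totals to within about 1e-4, which is compatible with the rounding of the tabulated inputs to four decimal places.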
Fig.10 shows the simulation result of guiding the user from Hiroshima station to the downtown area. The user starts from Hiroshima station, on the right side of Fig.10, and goes to the downtown area in the center of Fig.10. The user then reaches the first destination, Hiroshima Castle. Next, after the user talks with the system, the concierge recommends an ‘Okonomi-Yaki’ restaurant: the system recognizes the user’s emotion value and type by EGC and then recommends the favorite local food, ‘Okonomi-Yaki’. The user selects the category, and Google Maps shows the restaurants near the current place, as shown in Fig.10. The user went to the restaurant; however, it was crowded and the user could not enter. The system recommended another restaurant according to this situation. In this way, the user can enjoy comfortable sightseeing until the end of the day.
A smartphone can run various kinds of applications, such as a web browser, e-mail, and Google Maps. In particular, voice recognition is an outstanding function for extending the capability of the mobile phone, because conventional dialog systems require the user to type. For example, the concierge system uses the voice recognition function. However, even though the recognition rate has improved, the dialog is essentially limited to question and answer. To enable real conversation, the system evaluates the user’s emotion by using Android EGC. However, the recommendation system in this paper does not yet work with MSTN, which can represent the mood transition caused by aroused emotions. To give different responses to the same input under different moods, we will develop a response text database for each mood.
The Hiroshima tourist website provides a variety of information for foreigners. The Android application ‘Hiroshima Tourist map’ has been developed as a Mobile Phone based Participatory Sensing system, because not only the tourism association but also local citizens should offer innovative and attractive sightseeing information to visitors. We will embed the concierge system into our ‘Hiroshima Sightseeing map’ in the near future. The usability of the developed system will be investigated with a view to practical use, such as the development of special regional products, tourism resources, and markets in Etajima City.
This work was supported by JSPS KAKENHI Grant Number 25330366.
[1] T.Ichimura, T.Yamashita, K.Mera et al., ‘Emotion orientated intelligent systems’, In Internet-based Intelligent Information Processing Systems, R.J.Howlett, N.S.Ichalkaranje, L.C.Jain, G.Tonfoni Eds., pp.183-226, World Scientific Publishing Company (2003)
[2] K.Mera, T.Ichimura et al., Invoking Emotions in a Dialog System based on Word-Impressions, Journal of Japan Society of Artificial Intelligence, Vol.17, No.3, pp.186-195, 2002.
[3] K.Mera, Emotion Orientated Intelligent Interface, Doctoral Dissertation, Tokyo Metropolitan Institute of Technology, Graduate School of Engineering, 2003.
[4] C.Elliott, The Affective Reasoner: A process model of emotions in a multi-agent system, Ph.D thesis, Northwestern University, The Institute for the Learning Sciences, Technical Report No.32, 1992.
[5] F.Ren, Recognizing Human Emotion based on appearance information and Mental State Transition Network, IPSJ SIG Technical Report, pp.43-48, 2006.
[6] K.Mera, T.Ichimura, Y.Kurosawa, and T.Takezawa, ‘Mood Calculating Method for Speech Interface Agent by using Emotion Generating Calculation Method and Mental State Transition Network’, Journal of Japan Society for Fuzzy Theory and Intelligent Informatics, Vol.22, No.1, pp.10-24 (Japanese) (2010)
[7] T.Ichimura and K.Mera, Emotion Oriented Agent in Mental State Transition Learning Network, Intl. J. Computational Intelligence Studies (to appear in 2013).
[8] C.J.Fillmore, The Case for Case, In Bach and Harms (Ed.), Universals in Linguistic Theory, New York: Holt, Rinehart, and Winston, pp.1-88, 1968.
[9] T.Ichimura, K.Tanabe, and I.Tachibana, Tourist Navigation in Android Smartphone by using Emotion Generating Calculations and Mental State Transition Networks, Proc. of SCIS-ISIS 2012, pp.1578-1583, 2012.
[10] ITProducts, ‘Hiroshima Tourist Map’, https://market.android.com/details?id=jp.itproducts.KankouMap, [Online].
[11] T.Ichimura, S.Kamada, and K.Kato, Knowledge Discovery of Tourist Subjective Data in Smartphone Based Participatory Sensing System by Interactive Growing Hierarchical SOM and C4.5, Intl. J. Knowledge and Web Intelligence, Vol.3, No.2, pp.110-129, 2012.
[12] S.M.Chen, J.S.Ke, and J.F.Chang, Knowledge Representation using Fuzzy Petri Nets, IEEE Trans. on Knowledge and Data Eng., Vol.2, No.3, pp.311-319, 1991.