Providing Explanations for Recommendations in Reciprocal Environments
Automated platforms which support users in finding a mutually beneficial match, such as online dating and job recruitment sites, are becoming increasingly popular. These platforms often include recommender systems that assist users in finding a suitable match. While recommender systems which provide explanations for their recommendations have shown many benefits, explanation methods have yet to be adapted and tested in recommending suitable matches. In this paper, we introduce and extensively evaluate the use of “reciprocal explanations” – explanations which provide reasoning as to why both parties are expected to benefit from the match. Through an extensive empirical evaluation, in both simulated and real-world dating platforms with 287 human participants, we find that when the acceptance of a recommendation involves a significant cost (e.g., monetary or emotional), reciprocal explanations outperform standard explanation methods which consider the recommendation receiver alone. However, contrary to what one may expect, when the cost of accepting a recommendation is negligible, reciprocal explanations are shown to be less effective than the traditional explanation methods.
Automated platforms for assisting people in finding a suitable match, such as online-dating and job recruitment web-services, are rapidly gaining popularity. However, finding a suitable match in these platforms can be a difficult and time-consuming task for users, especially since both sides of a potential match have to agree to form a match. Specifically, a user who seeks to find a desirable counter-part (e.g., a spouse or a partner) needs to account for both her own preferences as well as her potential counter-part’s preferences in order to best utilize her time and effort. We refer to these platforms as Reciprocal Environments (REs). To assist users in finding a suitable match, REs often offer recommender systems, commonly known as Reciprocal Recommender Systems (RRSs) (Pizzato et al., 2010; Xia et al., 2015).
Previous work on RRSs found that considering the preferences of both sides of a potential match, i.e., the recommendation receiver and the recommended user, is better suited for REs than the traditional approach which considers the recommendation receiver alone (Pizzato et al., 2010; Tu et al., 2014; Xia et al., 2015). For example, say Alice and Bob are users in an online-dating platform. The traditional approach would generate Bob as a recommended match to Alice if it estimated that Alice would be interested in Bob. However, considering both Alice’s and Bob’s preferences in order to generate a recommendation was shown to outperform this approach. In tandem, the question of how an RRS should explain its recommendations to the recommendation receiver arises. Specifically, while the traditional explanation methods which consider the preferences of the recommendation receiver alone have been demonstrated to increase the user’s acceptance rate of the system’s recommendations, the user’s subjective satisfaction with the system and the user’s trust in the system for non-REs (e.g., (Herlocker et al., 2000; Cramer et al., 2008; Gedikli et al., 2014)), it remains unclear whether this approach is also suited for REs. To the best of our knowledge, previous work has not addressed this question in either simulation or the real world.
Continuing our example from before, a traditional explanation method would explain to Alice why she would be interested in Bob (e.g., “He is tall and an artist”). However, additional information as to why Bob is expected to be interested in Alice (e.g., “He is likely to be interested in you because you are a doctor and like to hike”) can be leveraged by an explanation method. To utilize this potentially useful information, in this paper, we introduce and extensively evaluate a novel explanation method based on the preferences of both the recommendation receiver and the recommended user, denoted reciprocal explanations.
We focus on the online-dating domain, which is perhaps today’s most popular RE online
2. Related Work and Background
Previous studies have designed and investigated different methods for generating recommendations in REs (e.g., (Pizzato et al., 2010; Yu et al., 2011; Tu et al., 2014; Xia et al., 2015)). These studies have found that methods that contemplate the presumed preferences of both sides of the recommendation outperform methods that consider one side alone. In practice, many popular online-dating sites and other REs include recommender systems that take into account the preferences of both sides, such as the popular Match
Explainable Artificial Intelligence (XAI) is an emerging field which aims to make automated systems understandable to humans in order to enhance their effectiveness (Gunning, 2017). This field of research was highly prioritized in the recent National Artificial Intelligence Research and Development Strategic Plan (National Science and Technology Council, 2016, p. 28). The need for explanations is also acknowledged by regulatory bodies. For example, the European Union passed a General Data Protection Regulation
A wide variety of methods for generating explanations for a given recommendation have been proposed and evaluated in the literature. Two practices are commonly applied in this realm: First, existing explanation methods focus on the recommendation receiver alone. To the best of our knowledge, none of the existing methods were developed or deployed in an RE. One exception to the above is Guy et al. (Guy et al., 2009), who presented a recommender system for an RE which is transparent (i.e., provides accurate reasoning as to how the recommendation was generated). Unfortunately, the authors did not compare the effects of their approach with other explanation methods, nor did they consider the unique characteristics of REs. Second, existing explanation methods are often tailored for specific applications or heavily dependent on the underlying algorithm for generating the recommendation, and therefore cannot be easily adapted or evaluated in different domains. In this work, we depart from both practices by designing and extensively evaluating two novel general-purpose explanation methods for REs.
Many studies have demonstrated the potential benefits of providing explanations to automated recommendations. For example, Herlocker et al. (Herlocker et al., 2000) found that adding explanations to recommendations can significantly improve the acceptance rate of the provided recommendation and the satisfaction of the users thereof. Sinha et al. (Sinha and Swearingen, 2002) further found that transparent recommendations can also increase the user’s trust in the system. These results were replicated under various domains and explanation methods (e.g., (Cramer et al., 2008; Sharma and Cosley, 2013; Gedikli et al., 2014)). The results of these works and others have combined to suggest two widely acknowledged guidelines for developing explanation methods: (1) Explanations which include specific features of the recommended item/user are highly effective, even if these features are not the actual reason the recommendation was generated (Gedikli et al., 2014; Herlocker et al., 2000; Pu and Chen, 2006); and (2) It is important to limit the length of the explanation in order to avoid information overload which can make explanations counterproductive (Pu and Chen, 2006; Gedikli et al., 2014). We follow these guidelines in our designed reciprocal explanation methods.
2.1. Recommendation Methods for Online-dating
In this work we focus on the domain of online-dating. An RRS in online-dating may provide a user $x$ with a list of recommendations for suitable matches where each recommendation consists of a single user $y$. Note that unlike the original formulation of economical matching markets (Gale and Shapley, 1962), an RRS in online-dating, as well as in many other REs, may recommend any user $y$ to more or less than a single user $x$.
In this study we focus on generating explanations. As such, we use two state-of-the-art recommendation methods developed and tested in online-dating: RECON and Two-sided collaborative filtering.
RECON (Pizzato et al., 2010) is an effective content-based algorithm which was empirically shown to be superior to baseline algorithms in online-dating sites. In the RECON algorithm, each user $x$ in the system is defined by two components:
A predefined list of personal attributes which the user fills out in his or her profile, denoted as follows:
$$A_x = \{ v_a : a \text{ is an attribute} \}$$
where $v_a$ is the user’s associated value with attribute $a$.
The preference of user $x$ over every attribute $a$ of potential counterparts, denoted $p_{x,a}$, which is represented by the user’s message history in the environment:
$$p_{x,a} = \{ (v, n) : n \text{ is the number of messages sent by } x \text{ to users whose value for } a \text{ is } v \}$$
That is, $p_{x,a}$ contains a list of pairs, each consisting of a possible (discretized) value $v$ for $a$ and the number $n$ of messages sent by $x$ to users characterized by $v$.
Example 2.1.
Bob is a male user who has sent messages to different female users. For simplicity, let us assume each user is only characterized by two attributes: smoking habits and body type. Bob sent messages to female users with smoking habits as follows: smokes regularly, smoke occasionally and never smoke. Regarding their body type: were slim, average and athletic. Bob’s preferences would be presented as follows:
The RECON algorithm derives the compatibility of each pair of users $x$ and $y$ using a heuristic function that reflects how much their respective preferences and attributes are aligned.
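The content-based idea above can be illustrated with a minimal Python sketch. It is not the published RECON implementation: the function names, the per-attribute "share of past messages" heuristic, and the harmonic-mean aggregation are all assumptions made for illustration, and the message history used to exercise it would be invented rather than taken from Example 2.1.

```python
from collections import Counter


def build_preferences(messaged_profiles):
    """Count, per attribute, how many messages went to users with each value.

    `messaged_profiles` is a list of attribute dictionaries, one per message sent.
    """
    prefs = {}
    for profile in messaged_profiles:
        for attribute, value in profile.items():
            prefs.setdefault(attribute, Counter())[value] += 1
    return prefs


def compatibility(prefs, profile):
    """Assumed heuristic: mean share of past messages that match the profile's values."""
    shares = []
    for attribute, counts in prefs.items():
        total = sum(counts.values())
        if total:
            # A Counter returns 0 for values (or missing attributes) never messaged.
            shares.append(counts[profile.get(attribute)] / total)
    return sum(shares) / len(shares) if shares else 0.0


def reciprocal_score(prefs_x, profile_x, prefs_y, profile_y):
    """Assumed aggregation: harmonic mean of the two one-sided compatibilities."""
    c_xy = compatibility(prefs_x, profile_y)  # how well y fits x's preferences
    c_yx = compatibility(prefs_y, profile_x)  # how well x fits y's preferences
    return 2 * c_xy * c_yx / (c_xy + c_yx) if c_xy + c_yx else 0.0
```

With a harmonic mean, a pair scores highly only when both one-sided compatibilities are high, which is one simple way to encode the reciprocity requirement.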
The second recommendation algorithm we use is the Two-sided collaborative filtering method (Xia et al., 2015), which was found to outperform RECON. The algorithm uses a collaborative filtering approach where the similarity between users is derived from their message history. Namely, two users will be considered similar if a large portion of their messages were sent to the same recipients. Given a recommendation receiver $x$ and a potential recommended user $y$, the method first calculates the presumed interest of $x$ in user $y$ by measuring the similarity of $x$ to users who sent messages to $y$. Later, the interest of $y$ in $x$ is calculated symmetrically. Finally, both measures are aggregated into a single measure, which models the mutual interest of the match.
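The following Python sketch illustrates the two-sided collaborative filtering idea just described. The Jaccard similarity, the averaging over senders, and the harmonic-mean aggregation are assumptions chosen for brevity and are not necessarily the measures used by Xia et al. (2015); `sent` is a hypothetical mapping from each user to the set of users he or she messaged.

```python
def jaccard(a, b):
    """Assumed similarity of two senders: overlap of their recipient sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def one_sided_interest(sender, target, sent):
    """Presumed interest of `sender` in `target`.

    Computed as the mean similarity of `sender` to the other users who
    already sent a message to `target`.
    """
    admirers = [u for u, recipients in sent.items()
                if target in recipients and u != sender]
    if not admirers:
        return 0.0  # nobody messaged `target` yet, so no evidence is available
    mine = sent.get(sender, set())
    return sum(jaccard(mine, sent[u]) for u in admirers) / len(admirers)


def mutual_interest(x, y, sent):
    """Aggregate both directions into one score (harmonic mean is an assumption)."""
    i_xy = one_sided_interest(x, y, sent)
    i_yx = one_sided_interest(y, x, sent)
    return 2 * i_xy * i_yx / (i_xy + i_yx) if i_xy + i_yx else 0.0
```

Note that in this sketch a user who has never received a message always obtains a score of zero, so such users cannot be ranked by message-based similarity alone.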
3. Generating Reciprocal Explanations
Let us assume an RRS has decided to recommend user $y$ to user $x$ based on one of the algorithms discussed above. The recommendation may be provided with or without an accompanying explanation. If the explanation only addresses the potential interest of user $x$ in user $y$ (and not vice versa), we refer to it as a one-sided explanation and denote it $e_{x,y}$. Similarly, if the explanation addresses the potential interest of user $x$ in user $y$ and vice versa, we refer to it as a reciprocal explanation. Naturally, a reciprocal explanation may be decomposed into a pair of one-sided explanations $e_{x,y}$ and $e_{y,x}$.
The generic framework for providing recommendations with reciprocal explanations is provided in Algorithm 1.
Providing a recommendation with a one-sided explanation is naturally derived from Algorithm 1 by omitting Row 5 and amending Row 6 accordingly.
To realize Algorithm 1, one needs to define both the recommendation method and the explanation method. Specifically, one would need to choose the underlying methods to be used in order to provide either one-sided or reciprocal explanations.
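The generic framework can be sketched in a few lines of Python. This is only an illustration of the structure described in the text, not a transcription of Algorithm 1: the function name, the `recommend` and `explain` callables, and the `reciprocal` flag are hypothetical.

```python
def recommend_with_explanations(x, candidates, recommend, explain, reciprocal=True):
    """Return (recommended user, explanations) pairs for receiver `x`.

    `recommend(x, candidates)` yields the users to recommend to `x`;
    `explain(a, b)` returns a one-sided explanation of why `a` is
    presumed to be interested in `b`. Both are supplied by the caller.
    """
    results = []
    for y in recommend(x, candidates):
        e_xy = explain(x, y)  # why x is expected to like y
        if reciprocal:
            e_yx = explain(y, x)  # why y is expected to like x
            results.append((y, (e_xy, e_yx)))
        else:
            results.append((y, (e_xy,)))  # one-sided variant
    return results
```

Keeping the recommendation and explanation methods as interchangeable arguments mirrors the point made above: the framework itself does not depend on which underlying methods are plugged in.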
4. Empirical Investigation
In order to evaluate and compare the one-sided and reciprocal explanation methods, we performed three experiments: two in a simulated online-dating environment developed specifically for this study and one in an operational online-dating platform. Each environment has its own benefits: Results from the operational online-dating platform naturally reflect the real-world impact of both explanation methods, whereas in the simulated environment one receives detailed and explicit feedback from the users, which otherwise would be impractical to gather in an active online-dating platform. We discuss these experiments below.
4.1. The MATCHMAKER Simulated Environment
We created a realistic simulated online-dating platform, which we call MATCHMAKER (MM for short). Using MM, users can view profiles of other users, interact with each other by sending messages and receive recommendations from the system for suitable matches. With the collaboration of experts in online-dating who do not co-author this paper, we designed MM’s features to reflect those of popular online-dating platforms. Figure 1 presents a snapshot of a recommendation in the MM platform.
MM is a web-based platform and can be accessed at
In order to develop an RRS for MM, it is necessary to obtain the attributes and preferences of both the participants in the experiment and the potential recommended users.
In order to create profiles in MM which would be as realistic as possible, we used the public attributes of profiles from real online-dating sites, such as www.date4dos.co.il. However, note that the data does not consist of the users’ message history or preferences, hence the designed RRS would be very limited. To overcome this challenge we performed the following data collection:
We recruited 121 participants, 63 males and 58 females ranging in age between 18 and 35 (average 23.3), all of whom are self-reportedly single and heterosexual. First, the participants entered MM and filled out a personal attributes questionnaire common in online-dating platforms (e.g., age, occupation). Later, the participants viewed the profiles obtained from the real online-dating sites as discussed above and sent fictitious messages to the profiles that they perceived as suitable matches.
Following the above data collection procedure, we obtained 118 participant profiles and preferences. We anonymized the participants’ profiles and preferences and used them as the initial profiles in MM for later investigation.
4.2. Choosing the Explanation Method
Before we turn our attention to the main point in question of this paper – the evaluation of one-sided and reciprocal explanations in REs – we performed a preliminary investigation in order to find the best-suited explanation method for online-dating, the domain we focus on throughout this paper.
For our investigation, we use an explanation method which returns a list of attributes of a user which can presumably best explain why the recommendation is suitable. This approach was shown to be very effective in prior work (Symeonidis et al., 2009; Gedikli et al., 2014). In order to avoid an information overload, we limited the number of attributes included in the explanation to three, as suggested in (Pu and Chen, 2007).
We investigate two methods which correspond with the suggested format above: 1) Transparent (Algorithm 2); and 2) Correlation-based (Algorithm 3).
The transparent method, which aims to reflect the actual reasoning for the recommendations provided by the RECON algorithm, works as follows: for explaining to user $x$ a recommendation of user $y$, the method returns the top-$k$ attributes of $y$ which are the most prominent among users who received a message from user $x$.
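A minimal Python sketch of this transparent method is given below. It is an illustration rather than a transcription of Algorithm 2: the function name and data layout (attribute dictionaries) are assumptions, and ties are broken simply by the order in which attributes appear in the recommended user's profile.

```python
def transparent_explanation(messaged_profiles, recommended_profile, k=3):
    """Return the k (attribute, value) pairs of the recommended user that are
    most frequent among the users the receiver has already messaged."""
    counts = {}
    for attribute, value in recommended_profile.items():
        counts[(attribute, value)] = sum(
            1 for profile in messaged_profiles if profile.get(attribute) == value
        )
    # sorted() is stable, so equally frequent attributes keep profile order.
    ranked = sorted(counts, key=lambda pair: counts[pair], reverse=True)
    return ranked[:k]
```

The default of `k=3` follows the limit of three attributes per explanation stated earlier in this section.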
The correlation-based method is inspired by the commonly used Correlation Feature Selection method from the field of Machine Learning (Hall, 1999). In our context, we would like to measure the correlation between the presence of attribute value $v$ in a user’s profile and the likelihood that $x$ will choose to send him/her a message. To that end, for each user $x$, we need to identify which users $x$ has viewed in the past and whether he or she chose to send them a message. Also, we need to identify which of the viewed users is characterized by each attribute value $v$.
Formally, for each user , we first identify the set of users that user has viewed in the past, and define
Using and we define the correlation-based method described in Algorithm 3.
The PEARSON function, used in line 6 of Algorithm 3, is the well known Pearson correlation coefficient for measuring correlation between two variables (Benesty et al., 2009).
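The correlation-based idea can be sketched in Python as follows. This is an illustration under stated assumptions rather than a transcription of Algorithm 3: the two 0/1 indicator vectors (whether each viewed user has the attribute value, and whether each viewed user was messaged), the function names, and the choice to treat an undefined correlation as zero are all assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant vector; treated as 0 in this sketch
    return cov / sqrt(var_x * var_y)


def correlation_explanation(viewed, recommended_profile, k=3):
    """Return the k (attribute, value) pairs of the recommended user whose
    presence correlates most strongly with the receiver sending a message.

    `viewed` is a list of (profile, messaged) pairs for the receiver, where
    `profile` is an attribute dictionary and `messaged` is a boolean.
    """
    messaged = [1 if was_messaged else 0 for _, was_messaged in viewed]
    scores = {}
    for attribute, value in recommended_profile.items():
        has_value = [1 if profile.get(attribute) == value else 0 for profile, _ in viewed]
        scores[(attribute, value)] = pearson(has_value, messaged)
    ranked = sorted(scores, key=lambda pair: scores[pair], reverse=True)
    return ranked[:k]
```

Unlike a pure frequency count, this ranking also uses the profiles that were viewed but not messaged, so an attribute that is common among all viewed users is not favored merely because it is common.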
To illustrate the difference between the explanation methods, we revisit Example 2.1. Assume an RRS has decided to recommend Alice, who never smokes and is slim, to Bob. Recall that Bob sent messages to users who never smoke and to slim users. For , the transparent explanation method would provide “never smoke” as an explanation because Bob sent more messages to users who never smoke than to users who are slim. Now say Bob viewed a total of users, of whom never smoke and were slim. In other words, Bob sent messages to only a third of the users he viewed who never smoke, and to all users he viewed who are slim. Thus, the correlation-based method would find a stronger correlation between the presence of “slim body” and Bob’s messaging behavior and hence “slim body” would be provided as an explanation.
In order to compare the two methods, we used the MM simulated system discussed above. We asked 59 of the 118 participants who took part in the data collection phase to reenter the MM platform where each participant then received a list of five personal recommendations generated by the RECON algorithm along with either transparent explanations (30 participants) or correlation-based explanations (29 participants). Participants were randomly assigned to one of the two conditions. Participants were asked to rate the relevance of each recommendation separately, on a five point Likert-scale from 1 (extremely irrelevant) to 5 (extremely relevant). Next, participants answered a short questionnaire (available in the appendix in section 7), debriefing them on their user experience. The questionnaire included questions which are commonly used for measuring the four prominent factors in user experience: user satisfaction from the recommendations, perceived competence of the system, perceived transparency of the system, and trust in the system (Knijnenburg et al., 2012; Cramer et al., 2008). In addition, the users were asked specifically about the explanation usefulness, namely, the extent to which the explanations were considered by users to be helpful. All questions were answered on a five point Likert-scale.
Note that we chose the RECON algorithm for the recommendations in MM since the collaborative filtering method described in (Xia et al., 2015) can only recommend users who have previously received messages. As described above, the recommended users in our experimental setup were created specifically for the recommendations, and were not viewed by any users prior to the recommendations.
All collected data was found to be approximately normally distributed according to the Anderson-Darling normality test (Razali et al., 2011). All reported results were compared using an unpaired t-test. The results show that participants in the correlation-based condition were significantly more satisfied than the transparent explanation condition (mean= ,s.d= 0.82 vs. mean= ,s.d= , ). Similarly, the perceived transparency was reported to be significantly higher in the correlation-based condition (mean= , s.d= 0.93 vs. mean=, s.d= 0.65, ), as was the perceived usefulness of the explanations (mean= , s.d= vs. , s.d= , ). In regards to the perceived competence of the system, the correlation-based condition was superior, but the difference was only marginally significant (mean= ,s.d= 0.68 vs. , s.d= , ). We did not find a significant difference in the way participants rated the relevance of the provided recommendations nor did we find a significant difference in the reported trust in the system.
Based on the above results, from this point onwards we adopt the correlation-based method as the explanation method for our investigation.
4.3. Evaluation in Simulated Online-dating Environment
One of the main challenges in designing a realistic online-dating environment is incorporating and modeling the costs and potential gains associated with accepting recommendations in the platform. Specifically, previous research has shown that different costs, especially an emotional cost such as fear of rejection, are prominent factors in determining the behavior of users in online dating platforms (Hitsch et al., 2010a; Xia et al., 2015). Since the costs and potential gains involved with the acceptance of a recommendation (i.e., sending a message to the recommended user) may vary significantly between users, we consider two models: First, a model in which no explicit cost is introduced. Specifically, users are asked to rate the relevance of the recommended profiles without encountering any explicit cost or gain, as in the preliminary investigation described in Section 4.2. Then, we consider a model in which explicit costs and potential gains are associated with accepting recommendations and users are incentivized to maximize their performance. The first model will assist us in understanding the effects of the explanation method when the cost is negligible, and the second when the cost is significant.
We asked the remaining 59 participants out of the 118 participants who participated in the data collection (but did not participate in the evaluation of the explanation method discussed above) to take part in this experiment. Each participant was randomly assigned to one of two conditions: 1) one-sided explanations (30 participants); and 2) reciprocal explanations (29 participants). The participants reentered the MM environment and received five recommendations with an explanation corresponding to their condition. Similar to the experimental design discussed in Section 4.2, participants were asked to rate the relevance of each recommendation separately, on a five point Likert scale from 1 (extremely irrelevant) to 5 (extremely relevant), followed by the user experience questionnaire (see Appendix).
Results: All data was found to be distributed normally according to the Anderson-Darling normality test. In contrast to what the authors initially expected, the one-sided explanation outperformed the reciprocal explanation in almost all tested measures. Specifically, using a two-tailed unpaired t-test, we found that the reported relevance (one-sided: mean=, s.d= 0.62 vs. reciprocal: mean=, s.d= 0.81 ), satisfaction (mean= s.d.= 0.84 vs. mean= , s.d=0.86 , ), perceived competence (mean= s.d=0.72 vs. mean=, s.d.=0.67 , ) and trust (mean=, s.d=0.58 vs. s.d.= 0.77, ) were all found to be superior for the one-sided explanations condition. In the explanation usefulness measure, we find the opposite to be true, where the reciprocal explanation condition outperformed the one-sided explanations condition (mean=, s.d.= 0.81). The results are presented in Figure 2.
For this experiment, we recruited 67 new participants who had not participated in this study thus far (35 male and 32 female) ranging in age from 18 to 35 (average= 24.8 s.d=4.74). Participants were then randomly assigned to one of the two conditions: one-sided explanations (33 participants) or reciprocal explanations (34 participants). As was the case in the original environment, participants created profiles, browsed profiles and sent messages to users they viewed as potential matches. However, in the recommendation phase, the participants were given an incentive to maximize an artificial score which was affected by costs and gains as follows: Upon receiving a recommendation, each participant had two options – either send a message to the recommended user or not. If the participant did not send a message, he or she did not gain or lose any points. If the participant did send a message, the recommended user returned a positive or negative reply according to a probability derived from the recommended user’s preferences. Specifically, we used the interest of the recommended user in the participant, as estimated by the RECON algorithm. Participants were informed that the probability is based on the preferences of the recommended user. If the recommended user replied positively, the participant gained points proportional to how well the recommended user fit the participant’s preferences, as estimated by RECON (between three and four points). If the recommended user replied negatively, the participant lost three points. This scoring scheme was chosen in order to encourage users to send messages to other users in whom they are interested while considering the probability of being rejected. Participants were paid proportionally to their score. Complete technical details about this scoring and payment methodology are available on the website. Each participant then received five recommendations accompanied by an explanation according to their assigned condition.
In this setup, we define the acceptance rate as the number of recommended users to which the participant chose to send messages. Later the participants filled out the user experience questionnaire as done in the previous setups.
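To make the trade-off created by this scoring scheme concrete, the short Python sketch below computes the expected score of sending a message. The gain of three to four points and the loss of three points come from the description above; the function names and the assumption of a risk-neutral participant who sends a message whenever the expected score is positive are illustrative only and were not part of the experiment.

```python
def expected_score(p_positive, gain, loss=3.0):
    """Expected points from sending a message.

    `p_positive` is the probability of a positive reply, `gain` the points
    won on a positive reply, and `loss` the points lost on a negative reply.
    """
    return p_positive * gain - (1.0 - p_positive) * loss


def should_send(p_positive, gain, loss=3.0):
    """Decision rule of a hypothetical risk-neutral participant."""
    return expected_score(p_positive, gain, loss) > 0.0


def break_even_probability(gain, loss=3.0):
    """Reply probability at which sending and not sending are equally good."""
    return loss / (gain + loss)
```

Under these assumptions, sending pays off in expectation only when the reply probability exceeds one half for a three-point gain, or three sevenths for a four-point gain, which is why information about the other side's presumed interest is valuable in this setup.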
Results: In contrast to the results of the previous experiment, the results show a significant benefit to the reciprocal explanations method compared to the one-sided explanations. Specifically, the acceptance rate under the reciprocal explanation condition was significantly higher than under the one-sided condition (one-sided: mean= s.d.= vs. reciprocal: mean= s.d.=, ). Also, participants’ trust in the system was found to be higher under the reciprocal explanation condition (one-sided: mean= s.d.= vs. reciprocal: mean= s.d.= , ). No statistically significant difference was found between the conditions for the remaining measures.
The results are presented in Figure 3.
4.4. Evaluation in an Active Online-dating Application
After completing both experiments in the MM environment, we contacted Doovdevan, an Israeli online-dating application, and received permission to conduct a similar experiment within their application, using active users as participants.
Doovdevan is a web and mobile application customized for the Android and iOS operating systems. Similar to other online-dating applications, users of this platform can create profiles, search for possible matches and interact with other users via messages. Doovdevan currently consists of about users and is growing rapidly. We chose to perform our experiment in Doovdevan since it is relatively new and none of the users had received recommendations from the system prior to the experiment. This was important since previous recommendations can affect the trust of the users in the system and subsequently affect their attitude towards new recommendations (Komiak and Benbasat, 2006; Cramer et al., 2008).
The recommendation algorithm that was implemented in the Doovdevan application was the two-sided collaborative filtering method described above in Section 2.1.
We randomly selected a group of 161 active users on the site (i.e., users who logged on to the platform at least once in the week prior to the experiment), 78 males and 83 females, ranging in age from 18 to 69 (mean= , s.d= ), and randomly assigned them to one of the two examined conditions: one-sided explanations (80 participants) or reciprocal explanations (81 participants). Due to privacy concerns, we were not permitted to reveal the recommended user’s preferences to the recommendation receiver. Therefore, the reciprocal explanation included two (asymmetrical) parts: First, an explanation of the presumed interest of the recommendation receiver in the recommended user, including specific attributes of the recommended user, as done in the simulated MM environment. Second, a statement that the system believes that the recommendation receiver fits the recommended user’s preferences, thus he/she is likely to reply positively.
The recommendations were sent to users’ inboxes, and the user received a notification on his or her smartphone. The recommendation has a unique tagging in the application that distinguishes it from other incoming messages. The recommendation includes a brief description of the recommended user: low-resolution photograph, name, age, location, marital status. The user may click on the recommendation and thereby receive a higher quality photograph of the recommended user and an explanation (Figure 4). At this stage the user may send a message to the recommended user.
As in the previous experiment, each participant received five recommendations. However, unlike previous experiments, in Doovdevan we sent one recommendation per day, based on the advice of the site owner, who suggested that users would find it odd to receive multiple recommendations in a single day after not receiving a single recommendation thus far. Unlike the MM environment, in Doovdevan we could not explicitly ask participants about their experience. Therefore, we measure the acceptance rate of the provided recommendations as the number of recommendations that resulted in the recommendation receiver sending a message to the recommended user, divided by the number of recommendations the recommendation receiver had viewed (clicked on).
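The acceptance-rate measure defined above amounts to a simple ratio per user, sketched below in Python. The function name, the use of sets of recommendation identifiers, and the convention of returning `None` for a user who viewed no recommendations are assumptions made for illustration.

```python
def acceptance_rate(viewed_ids, messaged_ids):
    """Share of viewed recommendations that led to a message.

    `viewed_ids` and `messaged_ids` are sets of recommendation identifiers
    for a single recommendation receiver.
    """
    if not viewed_ids:
        return None  # undefined for a user who never opened a recommendation
    # Only messages that follow a viewed recommendation are counted.
    return len(viewed_ids & messaged_ids) / len(viewed_ids)
```

Normalizing by viewed rather than sent recommendations means that users who never opened a recommendation do not pull the rate towards zero.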
All data was found to be distributed normally according to the Anderson-Darling normality test. We compared both conditions using a t-test. The results show that users who received reciprocal explanations presented significantly higher acceptance rates compared to users who received one-sided explanations. Specifically, on average, users who received reciprocal explanations sent messages to 53% of the recommended users they viewed while the same was true for only 36% of the recommended users under the one-sided explanations condition.
Interestingly, we find that reciprocal explanations outperform one-sided explanations for women while they do not show a statistically significant difference for men. Specifically, for women we find an average acceptance rate of 38% under the reciprocal explanation condition while only 24% under the one-sided explanations condition. For men, we find that the reciprocal explanation method achieves an average acceptance rate of 64% compared to 55% under the one-sided explanation method, but the difference is not statistically significant.
We further analyze the explanations’ effect on users who sent fewer or more messages than the median number of messages sent by users in the system. We found that for the group who sent fewer messages than the median, the reciprocal explanation significantly outperformed the one-sided explanation, averaging a 47% acceptance rate compared to 25% under the one-sided explanations condition. For the complementary group, the reciprocal explanation averaged approximately 60% compared to 57% in the one-sided explanation, without a significant difference between the two. The results are presented in Figure 5.
We also examined the number of log-ins of the participants in the week following the recommendation as an additional potential impact of the explanation method. The results show that the participants under the reciprocal explanations condition logged in significantly more often than those under the one-sided explanations, with an average of 56 log-ins compared to 23 log-ins under the one-sided explanations condition.
A summary of the results from all experimental setups is presented in Figure 6.
The results from both the synthetic and real-world investigations suggest that the choice of explanation method depends on the users’ cost for following the recommendations. Specifically, in environments where the cost of accepting a recommendation is high, reciprocal explanations compare favorably to one-sided explanations. We suggest that this is because the additional information in the reciprocal explanation makes the user feel more confident in the outcome of accepting the recommendation, which subsequently increases his or her willingness to take the risk.
The results are consistent with previous research which found that many users in online-dating platforms have an emotional cost for sending a message, mainly due to the fear of rejection (Hitsch et al., 2010a, b). Specifically, when the fear of rejection was removed, as in our first simulation, the one-sided explanation method was found to be superior.
Still, one may wonder why one-sided explanations were found to be superior to reciprocal explanations when negligible cost is introduced. We suggest two possible explanations:
Users often perceive their own attractiveness in a different manner than others (Eyal and Epley, 2010). Therefore, it is possible that the users will have a negative reaction to an explanation that describes reasons for their attractiveness which do not match their own perception.
In a short informal interview subsequent to the experiment in the simulated environment, some participants expressed discomfort with the component of the explanation that focused on the other side’s preferences. This strengthens the last suggested reason for the results.
We further find that not all users respond to explanations in the same way, possibly suggesting that a “one-size-fits-all” explanation method is not likely to be found. Specifically, the cost associated with accepting a recommendation may vary between users. Previous work in the online dating domain has revealed that men tend to focus more on their own preferences compared to women, who also take into account their own attractiveness to the other side of the match (Xia et al., 2015). We find support for these insights in our study as well. We further find that users who are more “choosy” in their messaging behavior tend to benefit more from reciprocal explanations compared to other users.
In this work we used a generalized explanation method, which did not differentiate between users. We intend to extend this research and build a fully-personalized user model (Rosenfeld and Kraus, 2018), which will model the user’s considerations in an RRS and provide explanations accordingly.
It is important to note that, since we focused on online dating, the above results do not immediately generalize to other reciprocal environments, such as job recruitment or roommate matching. Therefore, we intend to explore additional REs in future work and include an investigation of how to personalize the explanation method to each specific user. We also intend to investigate coalitional reciprocal environments, where a user seeks to form or join a group of partners with whom to form a coalition. One example is a system that recommends potential research collaborators to scholars. In these environments, users often have preferences over a group of partners, and therefore the explanations should be adapted accordingly.
In this paper we present a first-of-its-kind study which explores explanations for recommendations in REs. We introduce the use of reciprocal explanations, which include reasoning for the presumed interest of both sides of the recommendation in the match. We extensively evaluated the proposed approach, compared it to the traditional one-sided explanation method in both simulated and real-world online-dating platforms, and found that the choice of explanation method should depend on the users’ cost (e.g., emotional) for accepting recommendations. Specifically, in environments where accepting the recommendations has a high cost, reciprocal explanations should be adopted, while if the cost is negligible, one-sided explanations should be adopted.
Detailed information about the MM platform and the collected data is available on the MM website: www.biu-ai.com/Dating.
7. Appendix: Questionnaire for Evaluation of User Experience
Our questionnaire included 8 Likert-scale questions, with a scale ranging from 1 (“strongly disagree”) to 5 (“strongly agree”). These questions measured five prominent factors of user experience in recommender systems. We based the questions on previous questionnaires, such as (Cramer et al., 2008; Knijnenburg et al., 2012). The questions are presented in Table 1. Some measures were evaluated by two questions, and the scores were averaged to a single score. The third question, which is negatively worded, was reverse-scored (Hartley, 2014) so that it could be combined with question 2.
Table 1. Questionnaire items grouped by user-experience factor.
|Factor|Questions|
|---|---|
|Satisfaction|1) I like the profiles the system recommended to me.|
|System Perceived Competence|2) The provided recommendations fit my preferences. 3) The system is useless for me.|
|Trust|4) I trust the system to recommend all profiles that are of interest to me. 5) I trust the system not to recommend profiles that are not interesting to me.|
|Perceived Transparency|6) I understand why the system recommended the profiles it did.|
|Explanation Usefulness|7) The explanations that were provided along with the recommendation were good. 8) The explanations that were provided along with the recommendations helped me examine the relevance of the recommendations.|
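The scoring procedure described above (reverse-scoring question 3, then averaging the questions that share a factor) can be sketched as follows. This is an illustrative sketch rather than the authors' scoring script: the function name and the `FACTORS` mapping are hypothetical, the label "Trust" for questions 4 and 5 is an assumption inferred from their wording, and reverse-scoring is taken to mean the usual mapping of a response $x$ on a 1 to 5 scale to $6 - x$.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5

# Hypothetical mapping of factors to question numbers, following Table 1.
# "Trust" is an assumed label for questions 4 and 5.
FACTORS = {
    "Satisfaction": [1],
    "System Perceived Competence": [2, 3],
    "Trust": [4, 5],
    "Perceived Transparency": [6],
    "Explanation Usefulness": [7, 8],
}

# Question 3 is negatively worded, so its score is reversed before averaging.
REVERSED = {3}


def score_questionnaire(answers):
    """Return one score per factor from a dict {question number: response}."""
    adjusted = {}
    for question, response in answers.items():
        if not SCALE_MIN <= response <= SCALE_MAX:
            raise ValueError(f"response to question {question} is off the scale")
        if question in REVERSED:
            # On a 1..5 scale this maps 1 -> 5, 2 -> 4, ..., 5 -> 1.
            response = SCALE_MIN + SCALE_MAX - response
        adjusted[question] = response
    # Factors measured by two questions are averaged into a single score.
    return {factor: mean(adjusted[q] for q in questions)
            for factor, questions in FACTORS.items()}
```

With this convention, a participant who answers 5 to question 2 and 1 to question 3 receives the maximum competence score, since strongly disagreeing that the system is useless counts as a positive evaluation.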
- According to a recent survey, 74% of single people in the United States between the ages of 18 and 65 have signed up with one of the various online-dating sites (Brain, [n. d.]).
- Participants were aware that the profiles were simulated although based upon real data and that the messages were not actually sent to recipients. They were guided to send simulated messages to profiles they viewed as relevant matches for them.
- Jacob Benesty, Jingdong Chen, Yiteng Huang, and Israel Cohen. 2009. Pearson correlation coefficient. In Noise reduction in speech processing. Springer, 1–4.
- Statistics Brain. [n. d.]. ([n. d.]).
- Henriette Cramer, Vanessa Evers, Satyan Ramlal, Maarten Van Someren, Lloyd Rutledge, Natalia Stash, Lora Aroyo, and Bob Wielinga. 2008. The effects of transparency on trust in and acceptance of a content-based art recommender. User Modeling and User-Adapted Interaction 18, 5 (2008), 455–496.
- Tal Eyal and Nicholas Epley. 2010. How to seem telepathic: Enabling mind reading by matching construal. Psychological Science 21, 5 (2010), 700–705.
- David Gale and Lloyd S Shapley. 1962. College admissions and the stability of marriage. The American Mathematical Monthly 69, 1 (1962), 9–15.
- Fatih Gedikli, Dietmar Jannach, and Mouzhi Ge. 2014. How should I explain? A comparison of different explanation types for recommender systems. International Journal of Human-Computer Studies 72, 4 (2014), 367–382.
- Bryce Goodman and Seth Flaxman. 2016. European Union regulations on algorithmic decision-making and a “right to explanation”. Workshop on Human Interpretability in Machine Learning at the International Conference on Machine Learning (2016).
- David Gunning. 2017. Explainable artificial intelligence (xai). Defense Advanced Research Projects Agency (DARPA), nd Web (2017).
- Ido Guy, Inbal Ronen, and Eric Wilcox. 2009. Do you know?: recommending people to invite into your social network. In Proceedings of the 14th international conference on Intelligent user interfaces. ACM, 77–86.
- Mark Andrew Hall. 1999. Correlation-based feature selection for machine learning. Ph.D. Dissertation. University of Waikato Hamilton.
- James Hartley. 2014. Some thoughts on Likert-type scales. International Journal of Clinical and Health Psychology 14, 1 (2014), 83–86.
- Jonathan L Herlocker, Joseph A Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. In Proceedings of the 2000 ACM conference on Computer supported cooperative work. ACM, 241–250.
- Günter J Hitsch, Ali Hortaçsu, and Dan Ariely. 2010a. Matching and sorting in online dating. American Economic Review 100, 1 (2010), 130–63.
- Günter J Hitsch, Ali Hortaçsu, and Dan Ariely. 2010b. What makes you click? Mate preferences in online dating. Quantitative Marketing and Economics 8, 4 (2010), 393–427.
- Bart P Knijnenburg, Martijn C Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 441–504.
- Sherrie YX Komiak and Izak Benbasat. 2006. The effects of personalization and familiarity on trust and adoption of recommendation agents. MIS quarterly (2006), 941–960.
- National Science and Technology Council. 2016. The National Artificial Intelligence Research And Development Strategic Plan. (2016).
- Luiz Pizzato, Tomek Rej, Thomas Chung, Irena Koprinska, and Judy Kay. 2010. RECON: a reciprocal recommender for online dating. In Proceedings of the fourth ACM conference on Recommender systems. ACM, 207–214.
- Pearl Pu and Li Chen. 2006. Trust building with explanation interfaces. In Proceedings of the 11th international conference on Intelligent user interfaces. ACM, 93–100.
- Pearl Pu and Li Chen. 2007. Trust-inspiring explanation interfaces for recommender systems. Knowledge-Based Systems 20, 6 (2007), 542–556.
- Nornadiah Mohd Razali, Yap Bee Wah, et al. 2011. Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics 2, 1 (2011), 21–33.
- Ariel Rosenfeld and Sarit Kraus. 2018. Predicting Human Decision-Making: From Prediction to Action. Synthesis Lectures on Artificial Intelligence and Machine Learning 12, 1 (2018), 1–150.
- Amit Sharma and Dan Cosley. 2013. Do social explanations work?: studying and modeling the effects of social explanations in recommender systems. In Proceedings of the 22nd international conference on World Wide Web. ACM, 1133–1144.
- Rashmi Sinha and Kirsten Swearingen. 2002. The role of transparency in recommender systems. In CHI’02 extended abstracts on Human factors in computing systems. ACM, 830–831.
- Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2009. MoviExplain: a recommender system with explanations. In Proceedings of the third ACM conference on Recommender systems. ACM, 317–320.
- Kun Tu, Bruno Ribeiro, David Jensen, Don Towsley, Benyuan Liu, Hua Jiang, and Xiaodong Wang. 2014. Online dating recommendations: matching markets and learning preferences. In Proceedings of the 23rd International Conference on World Wide Web. ACM, 787–792.
- Peng Xia, Benyuan Liu, Yizhou Sun, and Cindy Chen. 2015. Reciprocal recommendation system for online dating. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015. ACM, 234–241.
- Hongtao Yu, Chaoran Liu, and Fuzhi Zhang. 2011. Reciprocal recommendation algorithm for the field of recruitment. Journal of Information and Computational Science 8, 16 (2011), 4061–4068.