Conceptualization and Validation of a Novel Protocol for Investigating the Uncanny Valley

Megan K. Strait
Tufts University
Medford, MA 02155, USA
megan.strait@tufts.edu
Abstract

Loosely based on principles of similarity-attraction, robots intended for social contexts are being designed with increasing human similarity to facilitate their reception by and communication with human interactants. However, the observation of an uncanny valley – the phenomenon in which certain humanlike entities provoke dislike instead of liking – has led some to caution against this practice. Substantial evidence supports both of these contrasting perspectives on the design of social technologies. Yet, owing to both empirical and theoretical inconsistencies, the relationship between anthropomorphic design and people’s liking of the technology remains poorly understood.

Here we present three studies which investigate people’s explicit ratings of and behavior towards a large sample of real-world robots. The results show a profound “valley effect” on people’s willingness to interact with humanlike robots, thus highlighting the formidable design challenge the uncanny valley poses for social robotics. In addition to advancing uncanny valley theory, Studies 2 and 3 contribute and validate a novel laboratory task for objectively measuring people’s perceptions of humanlike robots.

  AUTHOR BIO: Megan Strait is a PhD candidate in the Cognitive Science Program at Tufts University. She received her BA in computer science from Wellesley College in 2010 and her MSc from Tufts University in 2014. Her graduate research is centered in the areas of human-robot and human-computer interaction. She is especially interested in the uncanny valley and its implications for the design and use of humanlike robots. Her dissertation (Understanding the Uncanny: The Effects of Human Similarity on Aversion towards Humanlike Robots) advances uncanny valley theory and introduces a novel protocol for evaluating human-robot interactions.

Author Keywords

Uncanny valley theory; social robotics; human-robot interaction; emotion; emotion regulation

Introduction

Loosely based on similarity-attraction theory [EISENBERG1987], technologies intended for social contexts are being designed with increasing human similarity to facilitate their reception by and communication with people. This anthropomorphization of social technologies represents a particularly powerful mechanism for facilitating human-computer and human-robot interactions. Specifically, increasing the human similarity of a nonhuman entity can elicit more positive responding from its human interactants, which in turn, leads to positive social outcomes such as improved rapport in application domains such as education, collaboration, and therapy (e.g., [ANDRIST2012]).


For example, explicit humanlike cues such as a humanlike face presented on a computer screen (as compared with a text-based computer) lead people to respond more positively to and feel more relaxed with the computer (e.g., [SPROULL1996]), personified interfaces help users engage in tasks (e.g., [KODA1996]), and humanlike web content (socially-rich text and picture elements) increases perceptions of usefulness, trust, and enjoyment of shopping websites, leading to more favorable consumer attitudes (e.g., [HASSANEIN2007]). Similarly, equipping a robot with humanlike attributes facilitates the formation of rapport with, empathic responding towards, and positive appraisals of it (e.g., [DUFFY2003, RIEK2009, SAUPPE2015]).

The Anthropomorphization of Social Robots

Within the human-robot interaction community, this has led to a pervasive assumption that people’s liking of social robots is a monotonically-increasing function of human similarity (i.e., greater human similarity is always better). This assumption is reflected in everything from the sheer number of engineering efforts towards developing humanlike robots (see Table LABEL:tab:humanoids) to the instantiation of a new field of study (android science) devoted to this topic [ISHIGURO2007]. Consistent with similarity-attraction theory, it is expected that such robots offer more natural and effective interactions by capitalizing on traits which are more familiar and intuitive to people.

This perspective is not unfounded. In fact, it is rather well-supported by a large empirical base. Humanlike robots are perceived as more thoughtful (e.g., [ANDRIST2014]), intelligent (e.g., [BROADBENT2013]), and importantly, likeable (e.g., [DUFFY2003, RIEK2009]) relative to their less humanlike counterparts. People also report greater comfort in their presence (e.g., [SAUPPE2015]) and are more receptive to a robot interlocutor and compliant on collaborative tasks the greater the similarity of the robot (e.g., [ANDRIST2015, PARISE1999]).

The Uncanny Valley

Others have suggested a more nuanced relationship between human similarity and people’s liking of anthropomorphic entities. In particular, the emergence of increasingly humanlike robots and other artificial entities brought to light a competing phenomenon: the uncanny valley [MORI1970]. The uncanny valley refers to the observation of certain entities – often those with a highly humanlike appearance – provoking significant discomfort, instead of affinity as would be predicted by similarity-attraction theory.

Over the course of nearly five decades since Masahiro Mori’s formal introduction of the uncanny valley theory into scientific discourse, empirical inquiries have compiled substantial evidence of its existence (for a review, see [KATSYRI2015]). Specifically, people tend to rate highly humanlike entities as eerie and more unnerving than less humanlike instances (e.g., [MACDORMAN2006A]). Such dislike appears to also manifest in infants [LEWKOWICZ2012, MATSUDA2012] and even other primates [STECKENFINGER2009], suggesting the general phenomenon is relatively pervasive.

Yet, uncanny valley theory continues to be critically questioned due to various inconsistencies in and shortcomings of empirical probes (e.g., [BARTNECK2009, KATSYRI2015, ZLOTOWSKI2013B]). For example, people respond negatively towards some but not all instances of highly humanlike agents (e.g., [ROSENTHAL2014A]). Similarly for certain humanlike attributes, while some find their application provokes dislike (e.g., [PIWEK2014]), others find the exact opposite (e.g., [WANG2006]). It thus remains difficult to decide which perspective to take in the design of these social technologies.

At the root of the above issues lie several gaps in the literature. Most notably, the community lacks a consistent methodology for investigating the uncanny valley and its effects. Moreover, albeit due to practical limitations, the literature largely draws on subjective assessment and on the comparison of very few agents. Given the high variability of both subjective measures and the appearance of real-world robots, this reliance may explain at least some of the inconsistencies between studies using different robots.


Present Research

In the following sections, we present three experimental investigations of people’s perceptions of a large sample of real-world robots. (All procedures were approved by the Social, Behavioral, and Educational Research Institutional Review Board at Tufts University. All participants provided written, informed consent prior to participating and received either payment or course credit as compensation.) Here we showed participants a series of pictures depicting humans and robots of varying human similarity (low, moderate, and high). We collected participants’ subjective ratings of the agents’ appearances on two dimensions: human similarity, as a manipulation check, and eeriness, to determine whether the set of humanlike robots reflects an uncanny valley.

To move towards more objective assessment, Studies 2 and 3 additionally propose and validate a novel protocol for measuring people’s behavior towards humanlike robots. Specifically, Studies 2–3 investigate the link between the uncanny valley and avoidant behavior, using the process model of emotion regulation [GROSS1998] as theoretical grounding. The motivation for doing so follows from the literature on avoidant behavior, which defines it as a person’s unwillingness to experience negative emotions and their desire to change the form or frequency of situations giving rise to those experiences (e.g., [ELLIOT2001]) – an example of emotion regulation through situation selection.

The implications of avoidant behavior are particularly important to human-computer and human-robot interaction, given that a primary aim of design is to facilitate interaction. Thus, while increasing a robot’s human similarity can effect positive social outcomes, it remains crucial to understand when and why a design effects negative responding. Hence, the purpose of Studies 2–3 was to investigate whether the uncanny valley presents a serious consideration for human-robot interaction via more objective assessment of its impact on people’s behavior. That is, we wanted to determine whether highly humanlike robots can be so emotionally motivating that they evoke avoidant behavior.

In Studies 2–3, we again presented the series of pictures depicting humans and robots of varying human similarity, but with the addition of an option to press a button to remove the picture if the participant wished to stop looking at it. In addition to the subjective ratings of the agents’ appearances, we collected the percentage of button presses to measure the frequency of attempts to end encounters with the various agents as an index of avoidant behavior.

Study 1

Based on Mori’s uncanny valley theory, as well as supporting evidence (e.g., [MACDORMAN2005]), we hypothesized that people would rate highly humanlike robots as more eerie than less humanlike robots and humans (Hypothesis 1). To test this prediction, we conducted a fully within-subjects study in which we manipulated the shown agents’ human similarity.

Materials & Methods
Protocol. Participants viewed a set of color pictures, each depicting a distinct (robotic or human) agent. The pictures were obtained from various academic and internet sources, and were categorized into four levels of relative human similarity: robots of low, moderate, or high similarity, as well as human agents (see Figure LABEL:fig:agents). (For brevity, the results and contrasts regarding the category of robots with moderate human similarity are excluded from the remainder of the paper.) To generalize beyond the appearance of any one agent, we included instances per category for a total of pictures. Each picture was shown for 10 seconds, followed by two prompts for the participant to rate the depicted agent.
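To make the trial structure concrete, the following is a minimal sketch of one such trial. It assumes the PsychoPy library (its visual, core, and event modules), a 1–7 keypress rating scale, and hypothetical stimulus paths and prompt wording; it illustrates the protocol rather than reproducing the study’s actual implementation.

```python
# Minimal sketch of one Study 1 trial: a 10 s picture display followed by two
# rating prompts. Assumes PsychoPy; the image path, 1-7 scale, and prompt
# wording are hypothetical placeholders, not the study's actual values.
from psychopy import visual, core, event

win = visual.Window(size=(1024, 768), color="grey", units="pix")

def collect_rating(prompt):
    """Display a rating prompt and wait for a number-key (1-7) response."""
    visual.TextStim(win, text=prompt + "\n(press a key from 1 to 7)").draw()
    win.flip()
    key = event.waitKeys(keyList=[str(k) for k in range(1, 8)])[0]
    return int(key)

def run_trial(image_path):
    """Show one agent picture for 10 s, then collect the two ratings."""
    stim = visual.ImageStim(win, image=image_path)
    stim.draw()
    win.flip()
    core.wait(10.0)   # fixed 10 s viewing duration
    win.flip()        # clear the screen
    return {
        "image": image_path,
        "human_similarity": collect_rating("How humanlike is this agent?"),
        "eeriness": collect_rating("How eerie is this agent?"),
    }

# Example usage with a hypothetical stimulus file:
# trial_data = run_trial("stimuli/robot_high_01.png")
```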


Measures. We collected participants’ subjective ratings of the depicted agents’ appearances on two dimensions: human similarity and eeriness. As a fully within-subjects design was used, the ratings were averaged (by participant) across trials within each of the four agent categories.

Participants. Twenty Tufts University undergraduate and graduate students participated. Due to equipment failure, data were unavailable for two participants. Thus, participants ( male) with ages ranging from 18 to 30 years (, ) were included in our final sample.

Results
To confirm the assumptions of our study design and test our hypothesis, a repeated-measures ANOVA was conducted on each of the subjective ratings with agent category as the independent variable. For each ANOVA, the assumption of sphericity was assessed using Mauchly’s test; in cases of violation, the degrees of freedom and corresponding p-value reflect either a Greenhouse-Geisser or Huynh-Feldt adjustment as per [GIRDEN1992]. In addition, all post-hoc contrasts reflect a Bonferroni-Holm correction for multiple comparisons.
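As a sketch of this analysis pipeline, the following assumes the per-trial ratings are stored in a long-format pandas DataFrame (the file and column names are hypothetical) and uses the pingouin library, whose rm_anova can apply a sphericity correction and whose pairwise_tests supports Holm adjustment. It illustrates the general approach rather than the exact analysis code used in the study.

```python
# Sketch: repeated-measures ANOVA per rating dimension with a sphericity
# correction, followed by Holm-corrected pairwise comparisons.
# File and column names ('participant', 'agent_category', ...) are hypothetical.
import pandas as pd
import pingouin as pg

ratings = pd.read_csv("study1_ratings_long.csv")  # one row per participant x trial

# Average across trials within each agent category for each participant.
cell_means = (ratings
              .groupby(["participant", "agent_category"], as_index=False)
              [["human_similarity", "eeriness"]]
              .mean())

for dv in ["human_similarity", "eeriness"]:
    # correction=True reports sphericity-corrected (Greenhouse-Geisser) results
    # alongside the uncorrected ANOVA.
    anova = pg.rm_anova(data=cell_means, dv=dv, within="agent_category",
                        subject="participant", correction=True, detailed=True)
    posthoc = pg.pairwise_tests(data=cell_means, dv=dv, within="agent_category",
                                subject="participant", padjust="holm")
    print(anova, posthoc, sep="\n")
```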

Human Similarity. We first tested participants’ human similarity ratings to determine whether our four-level manipulation of agent appearance elicited different attributions. Specifically, we assumed the four levels would be perceived as having increasing human likeness from low (lowest) to human (highest). As expected, the ANOVA showed a main effect of agent category on human similarity ratings: , , . Furthermore all pairwise comparisons were significant (), confirming that participants’ perceptions of the agents’ human similarity were consistent with those assumed – increasing from robots categorized as low (, ) to high (, ) in human similarity to human (, ).

Eeriness. Based on Mori’s uncanny valley theory, we assumed the four categories would elicit differentially negative evaluations, with the greatest eeriness attributed to highly humanlike robots and least to humans. As expected, the ANOVA showed a main effect of agent category on eeriness ratings: , , . Again, all pairwise contrasts were significant and consistent with assumptions. Specifically, the highly humanlike robots were rated as most eerie (, ) and humans as least (, ). See Figure LABEL:fig:eeriness (left).

Study 2

The findings of Study 1 suggested the existence of an uncanny valley with real-world robots. To determine the impact of the valley on human-robot interactions, here in Study 2 we tested whether it provokes avoidant behavior in observers [STRAIT2015]. Specifically, via a slight modification of the picture-viewing protocol, we tested whether highly humanlike robots are so emotionally motivating that participants attempt to end their encounters more frequently than encounters with less humanlike and human agents (Hypothesis 2).


Materials & Methods
Protocol. To investigate whether the appearance of highly humanlike robots is so distressing that people avoid their encounters, we adapted the protocol by Vujović and colleagues for studying aversive responding towards negative stimuli [VUJOVIC2014]. Here, the images from the above set were each presented for up to . Participants were informed that, should they wish to do so, they could press the spacebar to remove the given image from the screen. If the participant did not press the spacebar, the image remained on display for a total viewing duration of . Following the viewing, participants were cued to select one of three reasons for either pressing (unnerved, bored, or other) or not pressing (interested, indifferent, or other) the spacebar. These options served to tease apart whether a stimulus created a negative situation for the participant (being unnerved) or whether a press instead reflected some other motivating factor (e.g., boredom). After a response was recorded, participants were then prompted to rate the appearance of the given agent as in Study 1.
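As an illustration, the modified viewing phase could be implemented roughly as follows. It again assumes PsychoPy; the maximum viewing duration and response keys are hypothetical placeholders (the durations elided above are not recoverable from this text), and the sketch is meant to clarify the trial logic rather than reproduce the original task.

```python
# Sketch of the Study 2/3 viewing phase: the picture stays on screen for up to
# MAX_VIEW seconds unless the participant presses the spacebar to remove it,
# after which a reason is collected. MAX_VIEW is a placeholder value.
from psychopy import visual, core, event

MAX_VIEW = 10.0  # hypothetical maximum viewing duration (seconds)

def view_with_escape(win, image_path):
    """Display the picture until the spacebar is pressed or MAX_VIEW elapses."""
    stim = visual.ImageStim(win, image=image_path)
    clock = core.Clock()
    event.clearEvents()
    pressed = False
    while clock.getTime() < MAX_VIEW:
        stim.draw()
        win.flip()
        if event.getKeys(keyList=["space"]):
            pressed = True
            break
    view_time = clock.getTime()
    win.flip()  # clear the screen

    # Reason prompt; the options depend on whether a press occurred.
    options = (["unnerved", "bored", "other"] if pressed
               else ["interested", "indifferent", "other"])
    label = "press" if pressed else "not press"
    text = ("Why did you %s the spacebar?\n1: %s   2: %s   3: %s"
            % (label, options[0], options[1], options[2]))
    visual.TextStim(win, text=text).draw()
    win.flip()
    key = event.waitKeys(keyList=["1", "2", "3"])[0]

    return {"image": image_path, "pressed": pressed,
            "view_time": view_time, "reason": options[int(key) - 1]}
```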

Measures. We again collected subjective ratings of the agents’ human similarity and eeriness. To index avoidant behavior, we recorded the percentage of button presses, measuring the frequency of attempts to end encounters with the various agents, as well as participants’ explicit reasons for pressing or not pressing the button. (Due to space constraints, only one measure of avoidant behavior – press frequency due to being unnerved – is discussed here; the other results are available in [STRAIT2015].)
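To turn the recorded presses into this index, the per-trial log could be aggregated per participant and agent category as sketched below; this is a minimal example assuming hypothetical file and column names.

```python
# Sketch: percentage of trials on which the button was pressed, computed per
# participant and agent category; file and column names are hypothetical.
import pandas as pd

trials = pd.read_csv("study2_trials.csv")  # one row per participant x trial

# Overall press frequency (%) per participant x agent category.
press_pct = (trials.groupby(["participant", "agent_category"])["pressed"]
             .mean()
             .mul(100)
             .reset_index(name="press_pct"))

# Press frequency (%) due specifically to being unnerved.
trials["unnerved_press"] = trials["pressed"] & (trials["reason"] == "unnerved")
unnerved_pct = (trials.groupby(["participant", "agent_category"])["unnerved_press"]
                .mean()
                .mul(100)
                .reset_index(name="unnerved_press_pct"))
```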

Participants. Sixty-two undergraduates participated. Due to equipment failure, data were unavailable for two subjects; thus, sixty subjects ( male) with ages ranging from 18 to 28 years (, ) were included in our final sample.

Results
As in Study 1, a repeated-measures ANOVA was conducted on each of the dependent measures with agent category as the independent variable. Similarly, corrections (e.g., Bonferroni-Holm) were applied as appropriate.

Human Similarity. As in Study 1, the ANOVA on human similarity ratings showed a main effect of agent category: , , . All pairwise comparisons were significant, confirming that participants’ perceptions of the agents’ human similarity were consistent with those assumed – increasing from robots categorized as low (, ) to high (, ) to human (, ).

Eeriness. Similarly, the ANOVA on eeriness ratings also showed a main effect of agent category:
, , . Again, all pairwise contrasts were significant and consistent with assumptions. Specifically, the highly humanlike robots were rated as most eerie (, ) and humans as least (, ). See Figure LABEL:fig:eeriness (middle).

Avoidant Behavior. The ANOVA on press frequency – the frequency at which participants terminated their encounters – also showed a main effect of agent category:
, , . Specifically, participants terminated their encounters with highly humanlike robots more frequently and due to being unnerved (, ) relative to less humanlike robots (, ) and humans (, ). See Figure LABEL:fig:presses (top).

Study 3

Study 2 made two theoretical contributions. First, it confirmed the existence of an uncanny valley in humanoid robots by replicating the findings of Study 1. Second, it established support for Mori’s speculation that highly humanlike entities can be so unnerving that they motivate avoidant behavior. However, there remains a key limitation – both within Study 2 and across the uncanny valley literature at large. Specifically, the question of a person’s exposure to and experience with social robots remains an unaddressed critique (e.g., [HANSON2005]). Thus, we developed Study 3 as a follow-up investigation. The primary goal was to determine whether controlled exposure to a humanoid would attenuate aversive responding towards robots in general, and whether it would extinguish the uncanny valley or any valley effects in particular. This follow-up study thus allowed us to conceptually replicate the findings in a context in which key limitations of Study 2 were resolved.

Materials & Methods
Protocol. Here, participants were preexposed to one of three agents (either a robot with low or moderate human similarity, or a human confederate) in a simple interactive task (developed in [STRAIT2014]) prior to completing the picture-viewing task employed in Study 2.

Participants. Seventy-one undergraduate and graduate students participated ( male), ranging from to years old (, ).

Results
Human Similarity. As in Studies 1–2, the ANOVA on human similarity ratings showed a main effect of agent category: , , . All pairwise comparisons were significant, confirming that participants’ perceptions of the agents’ human similarity were consistent with those assumed – increasing from robots categorized as low (, ) to high (, ) to human (, ).

Eeriness. Similarly, the ANOVA on eeriness ratings also showed a main effect of agent category:
, , . Again, all pairwise contrasts were significant and consistent with assumptions. Specifically, the highly humanlike robots were rated as most eerie (, ) and humans as least (, ). See Figure LABEL:fig:eeriness (right).

Avoidant Behavior. The ANOVA on press frequency – the frequency at which participants terminated their encounters – also showed a main effect of agent category:
, , . Specifically, participants terminated their encounters with highly humanlike robots more frequently and due to being unnerved (, ) relative to less humanlike robots (, ) and humans (, ). See Figure LABEL:fig:presses (top).

General Discussion

Summary of Present Research. Studies 1–3 present an experimental test of Mori’s uncanny valley theory, as it pertains to real-world robots of varying human similarity and humans. In Study 1, we measured people’s subjective ratings of the agents’ eeriness, which reflected a valley corresponding to robots that are highly humanlike in their appearance. Study 2 replicated this valley in eeriness ratings and demonstrated the use of a novel protocol for more objective measurement of valley effects. The results showed that not only do people rate highly humanlike robots as more eerie, but moreover, they exhibit greater avoidance of such encounters than those with less humanlike and human agents. In Study 3, despite preexposure to an embodied humanoid prior to the picture-viewing protocol, the valley in participants’ ratings of eeriness and their corresponding avoidance of highly humanlike robots persisted. Consistent with Mori’s original postulations, these findings robustly demonstrate that robots can be so unnerving that they motivate people to avoid them. Furthermore, they suggest that people’s aversion to highly humanlike robots is not sensitive to other social factors such as exposure.

Future Impact. The present work has clear theoretical and practical implications. Studies 1–3 both demonstrate and replicate an uncanny valley in the appearance of humanlike robots and people’s avoidance thereof. In doing so, they provide strong support for Mori’s original theory and, moreover, establish that robust valley effects pose a formidable design consideration for human-robot interaction. Should these findings replicate with other anthropomorphized technologies, such as user interfaces equipped with humanlike attributes, they would show that the uncanny valley extends beyond robotics to pose a broader design consideration for human-computer interaction at large.

The long-term impact of this work is three-fold. First, with the establishment of clear and robust effects, the present work allows the community to now focus on the causal mechanisms by which certain humanlike appearances contribute to people’s discomfort. It remains to be explained why people are particularly averse to highly humanlike robots; numerous plausible explanations have been proposed, but findings have been inconsistent due to the variable nature of subjective assessment. This highlights a second contribution: the protocol described here has the potential to address outstanding inconsistencies via more objective measurement of valley effects (observation of people’s behavior and the emotional experiences driving their actions). Last but not least, further application of knowledge gained from the present work and the demonstrated protocol may inform the design of future robots. For example, in iterative design practices, the protocol may be used as a quick and simple test of the efficacy of one design versus others.

Acknowledgements

I am grateful to my advisor, Heather Urry, for her tremendous support and significant advising towards the work presented here. I would also like to thank Victoria Floerke and Lara Vujović for their help and collaboration on Study 1, as well as for their support and advice during the Study 2 production. Lastly, I would like to thank the many undergraduate research assistants in the Emotion, Brain, and Behavior Laboratory, as well as Max Bennett, George Brown, Brendan Fleig-Goldstein, and Maretta Morovitz who contributed towards data collection, processing, and analysis.

References
