Perceiving the emotions of Pokémon

Ben J. Jennings¹

¹Centre for Cognitive Neuroscience, Brunel University London, London, U.K. E-mail: ben.jennings (at) brunel.ac (dot) uk

https://doi.org/10.5281/zenodo.8352076

The ability to reliably perceive the emotions of other people is vital for normal social functioning, and the human face is perhaps the strongest non-verbal cue that can be utilized when judging the emotional state of others (Ekman, 1965). The advantages of possessing this ability to recognise emotions, i.e., having emotional intelligence, include being able to respond to other people in an informed and appropriate manner, accurately predicting another individual’s future actions, and facilitating efficient interpersonal behavior (Ekman, 1982; Izard, 1972; McArthur & Baron, 1983). The current experiment investigates the consistency with which observers classify the emotions displayed by a human female face and by a Pokémon character.

General Methods

The current study employed 30 hand drawings of Pikachu, a first generation electric-type Pokémon character, depicting a range of emotions (images used with permission from the illustrator, bluekomadori [https://www.deviantart.com/bluekomadori]; based on the video game characters belonging to The Pokémon Company); see Fig. 1a for examples. Also, 30 photo-quality stimuli displaying a range of emotions, expressed by the same female model, were taken from the McGill Face Database (Schmidtmann et al., 2016); see Fig. 1b for examples. Ratings of arousal (i.e., the excitement level, ranging from high to low) and valence (i.e., pleasantness or unpleasantness) were obtained for each image using a similar method to Jennings et al. (2017). This method involved the participants viewing each image in turn in a random order (60 in total: 30 Pikachu and 30 of the human female from the McGill database). After each image was viewed (presentation time 500 ms) the participants’ task was to classify the emotion being displayed (i.e., not their internal emotional response elicited by the stimuli, but the emotion they perceived the figure to be displaying).
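The trial sequence described above can be sketched as follows. This is only an illustrative sketch: it assumes the 60 images were fully interleaved in a single shuffled sequence, and the labels "pikachu" and "human", the function name and the seed argument are hypothetical (the study itself used PsychToolbox, not Python).

```python
import random

def build_trial_order(n_per_class=30, seed=None):
    """Return a shuffled list of (stimulus_class, image_index) trials.

    Assumption: both stimulus classes are interleaved in one random
    sequence; the class labels are placeholders, not the study's names.
    """
    trials = [("pikachu", i) for i in range(n_per_class)]
    trials += [("human", i) for i in range(n_per_class)]
    # A dedicated Random instance keeps the shuffle reproducible per seed.
    random.Random(seed).shuffle(trials)
    return trials

if __name__ == "__main__":
    order = build_trial_order(seed=1)
    print(len(order), "trials; first three:", order[:3])
```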

The classification was made by “pointing-and-clicking”, with a computer mouse, on the corresponding location within the subsequently displayed 2-dimensional Arousal-Valence emotion space (Russell, 1980). The emotion space is depicted in Fig. 1c; note that the red words are for illustration only and were not visible during testing. They are supplied here so that the reader can obtain the gist of the types of emotion that different areas of the space represent. Data were collected from 20 observers (14 females), aged 23±5 years (Mean±SD), using a MacBook Pro (Apple Inc.). Stimulus presentation and response collection were controlled using the PsychToolbox software (Brainard, 1997).
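As a rough illustration of how a mouse click within the response grid might be converted into a point in the emotion space, consider the sketch below. It assumes a square grid, coordinates normalised to the range -1 to 1, valence on the horizontal axis, and high arousal at the top of the grid; the function name, the grid geometry and the normalisation are hypothetical and are not taken from the study's PsychToolbox code.

```python
def click_to_arousal_valence(x_px, y_px, left, top, size_px):
    """Map a click inside a square grid to (valence, arousal) in [-1, 1].

    Assumptions: (left, top) is the grid's top-left corner in screen
    pixels, size_px is its side length, valence runs left to right and
    high arousal is at the top of the grid.
    """
    inside_x = left <= x_px <= left + size_px
    inside_y = top <= y_px <= top + size_px
    if not (inside_x and inside_y):
        raise ValueError("click falls outside the response grid")
    valence = 2.0 * (x_px - left) / size_px - 1.0
    # Screen y grows downward, so flip the sign for the arousal axis.
    arousal = 1.0 - 2.0 * (y_px - top) / size_px
    return valence, arousal

if __name__ == "__main__":
    # A click at the top-centre of a 400-pixel grid placed at (100, 100).
    print(click_to_arousal_valence(300, 100, 100, 100, 400))
```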

Figure 1. Panels (a) and (b) illustrate 3 exemplars of the Pokémon and human stimuli, respectively. Panel (c) shows the response grid displayed on each trial for classifications to be made within (note: the red wording was not visible during testing). Panels (d) and (e) show locations of perceived emotion in the human and Pokémon stimuli, respectively. Error bars represent one standard error.

Results

The calculated standard errors (SEs) serve as a measure of the classification agreement between observers for a given stimulus and were determined in both the arousal (vertical) and valence (horizontal) directions for both the Pokémon and human stimuli. These are presented as the error bars in Fig. 1d and 1e. The SEs were compared between the two stimulus types using independent t-tests for both the arousal and valence directions; no significant differences were revealed (Arousal: t(58)=-0.97, p=.34; and Valence: t(58)=1.46, p=.15).
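The analysis described above can be sketched in a few lines. The sketch assumes the SE for a stimulus is the sample standard deviation of the observers' responses along one axis divided by the square root of the number of observers, and that the comparison is a pooled-variance (Student's) independent t-test; the function names are hypothetical and the exact procedure used in the study may differ.

```python
import math
from statistics import mean, stdev

def standard_error(values):
    """Standard error of the mean for one stimulus along one axis."""
    return stdev(values) / math.sqrt(len(values))

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    df = na + nb - 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

if __name__ == "__main__":
    # Toy numbers only; these are not the study's data.
    group_a = [1.0, 2.0, 3.0]
    group_b = [2.0, 3.0, 4.0]
    print(standard_error(group_a))
    print(independent_t(group_a, group_b))
```

With 30 SE values per stimulus class, this formulation gives 30 + 30 - 2 = 58 degrees of freedom, in line with the t(58) values reported above.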

Effect sizes, i.e., Cohen’s d, were also determined; Arousal: d=0.06, and Valence: d=0.32, i.e., effect sizes were within the very small to small, and small to medium ranges, respectively (Cohen, 1988; Sawilowsky, 2009), again indicating a high degree of similarity in precision between the two stimulus classes. It is important to note that the analysis relied on comparing the variation (SEs) for each classified image (reflecting the agreement between participants) and not the absolute (x, y) coordinates within the space.
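For readers unfamiliar with the measure, Cohen's d is the difference between two group means expressed in units of their pooled standard deviation. The sketch below shows the common pooled-SD formulation; whether the study used exactly this variant is an assumption, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev

def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd

if __name__ == "__main__":
    # Toy numbers only; these are not the study's data.
    print(cohens_d([2.0, 3.0, 4.0], [1.0, 2.0, 3.0]))
```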

Discussion

What could observers be utilizing in the images that produces such a high degree of agreement on each emotion expressed by each stimulus class? Is all the emotional information contained within the eyes? Levy et al. (2013) demonstrated that when observers make an eye movement to either a human with eyes located, as expected, within the face, or a non-human (i.e., a ‘monster’) that has eyes located somewhere other than the face (for example, the mythical Japanese Tenome, which has its eyes located on the palms of its hands; Sekien, 1776), the observers’ eye movements are nevertheless made in both cases towards the eyes, i.e., there is something special about the eyes that captures attention wherever they are positioned. Schmidtmann et al. (2016) additionally showed that accuracy for identifying an emotion was equal when either an entire face or a restricted stimulus showing just the eyes was employed. The eyes of the Pikachu stimuli are simply black circles with a white “pupil”; however, they can convey emotional information, for example, based on the position of the pupil, the orientation of the eyelid, and how much the eye is closed. It is hence plausible that arousal-valence ratings are made on the information extracted from only the eyes.

However, for the Pokémon stimuli Pikachu’s entire body is displayed on each trial, and it has previously been shown that when emotional information from the face and body is simultaneously available, the two can interact. This has the result of intensifying the emotion expressed by the face (de Gelder et al., 2015), as perceived facial emotions are biased towards the emotion expressed by the body (Meeren et al., 2005). It is therefore likely that holistic processing of the facial expression, coupled with signals from Pikachu’s body language, i.e., posture, provides an additional input into the observers’ final arousal-valence rating.

Conclusion

Whatever the internal processes responsible for perceiving emotional content, the data point to a mechanism that allows the emotional states of human faces to be classified with high precision across observers, consistent with previous emotion classification studies (e.g., Jennings et al., 2017). The data also reveal the possibility of a mechanism present in normal observers that can extract emotional information from the faces and/or bodies depicted in simple sketches, containing minimal fine detail, shading and colour variation, and use this information to facilitate the consistent classification of the emotional states expressed by characters from fantasy universes.

References

Brainard, D.H. (1997) The psychophysics toolbox. Spatial Vision 10: 433–436.

de Gelder, B.; de Borst, A.W.; Watson, R. (2015) The perception of emotion in body expressions. WIREs Cognitive Science 6: 149–158.

Ekman, P. (1965) Communication through nonverbal behavior: a source of information about an interpersonal relationship. In: Tomkins, S.S. & Izard, C.E. (Eds.) Affect, Cognition and Personality: Empirical Studies. Springer, Oxford. Pp. 390–442.

Ekman, P. (1982) Emotion in the Human Face. Second Edition. Cambridge University Press, Cambridge.

Izard, C.E. (1972) Patterns of Emotion: a new analysis of anxiety and depression. Academic Press, New York.

Jennings, B.J.; Yu, Y.; Kingdom, F.A.A. (2017) The role of spatial frequency in emotional face classification. Attention, Perception & Psychophysics 79(6): 1573–1577.

Levy, J.; Foulsham, T.; Kingstone, A. (2013) Monsters are people too. Biology Letters 9(1): 20120850.

McArthur, L.Z. & Baron, R.M. (1983) Toward an ecological theory of social perception. Psychological Review 90(3): 215–238.

Meeren, H.K.; van Heijnsbergen, C.C.; de Gelder, B. (2005) Rapid perceptual integration of facial expression and emotional body language. Proceedings of the National Academy of Sciences 102: 16518–16523.

Russell, J.A. (1980) A circumplex model of affect. Journal of Personality and Social Psychology 39(6): 1161–1178.

Schmidtmann, G.; Sleiman, D.; Pollack, J.; Gold, I. (2016) Reading the mind in the blink of an eye – a novel database for facial expressions. Perception 45: 238–239.

Sekien, T. (1776) 画図百鬼夜行 [Gazu Hyakki yagyō; The Illustrated Night Parade of a Hundred Demons]. Maekawa Yahei, Japan.

About the Author

Dr. Ben Jennings is a vision scientist. His research psychophysically and electrophysiologically investigates colour and spatial vision, object recognition, emotions, and brain injury. His favourite Pokémon is Beldum.
