Phoebe Ellsworth: On Appraisal Theory, and the Social Uses of Psychology


An Interview With Andrea Scarantino (August 2015)

Phoebe C. Ellsworth is the Frank Murphy Distinguished University Professor of Psychology and Law at the University of Michigan, Ann Arbor. She has been a Fellow of the American Academy of Arts and Sciences since 1992, and her recent awards include the SPSP Career Contribution Award and the Nalini Ambady Award for Mentoring Excellence in 2014 and the APS James McKeen Cattell Award in 2015. One line of her research involves the relation between cognition and emotion and, along with Klaus Scherer, she is one of the originators of appraisal theories of emotion. Her other line of research involves the application of Psychology to Law, with particular focus on jury decision making and the death penalty. She has been a member of ISRE since it was founded, and is Vice President of the board of the Death Penalty Information Center. Her most recent articles have appeared in the Alabama Law Review (Meador Lecture Series), Psychological Review, Emotion Review, and Psychological Science.


Where did you grow up? What are your memories of family life as a young woman? Did you have academic role models within your family?

Phoebe at age 5

Phoebe’s elementary school, New Canaan, Connecticut

I grew up in Connecticut in a family that was emphatically grounded in New England, with the exception of a sprinkling of Cubans and a few New Yorkers. The emotional climate was temperate. My father went back to graduate school when the War was over, soon after I was born, got his PhD in sociology, and was an assistant professor at Yale until he didn’t get tenure, after which he worked in Yale administration until he retired. My mother might have been an archaeologist, but she got married instead.

I began my education in a one-room schoolhouse with about 25 children in five grades. Miss Kelly gave us work that fit our abilities in each subject, so I usually did reading and writing and geography and history with the kids a couple of grades ahead of me, and arithmetic with my own grade. I was a fairly solitary child, not popular, but I don’t remember being particularly lonely, except in middle school. I was interested in almost everything, and spent my outdoor time in the woods observing plants and animals and my indoor time mostly reading and drawing.

My husband says that I was a natural born academic, but if so, I was certainly unaware of it. However, my parents’ friends were mostly academics, including some women, such as the anthropologist Bea Whiting, so at least I knew that being a woman professor was possible. But the general view at the time was that women should get college educations in order to be fit companions for intelligent husbands.

You studied social relations at Radcliffe College and got your PhD in social psychology at Stanford University in 1970. Were you always interested in the social dimension of human life? Did you ever consider studying something completely different? What classes and teachers had a deep influence on you at Stanford University? What topic did you write your PhD thesis on?

Phoebe on a motorcycle, circa 1965

Until the end of college I had no thoughts of a career in Psychology. For one thing, it was fairly rare for women to consider careers in anything, and for another, college students in the 60’s were far less career-oriented than they are now. We wanted to learn everything, we wanted to change the world, and we gave little thought to dull topics like paying jobs, vaguely figuring that mundane issues like that would somehow work themselves out in the future. I thought I’d like to be an artist or a writer.

I dropped out of college for a semester and was hired as “office help” by three Yale social psychologists (Barry Collins, Chuck Kiesler, and Norman Miller), but I was a failure as office help and so became a research assistant. My Yale employers and several Harvard professors told me that I had a real knack for research design. This did not strike me as a particularly glamorous skill, but I believed that people were most likely to make a contribution by doing things that they were good at, and since I had no other ideas about what to do next, I applied to graduate school.

At Stanford my advisor was Merrill Carlsmith, who was a brilliant statistician, an extremely competitive person, and not given to praise. He never told me that any of my ideas were good, but he pretty much let me do what I wanted. It was a good collaboration because Merrill never got interested in a question until the data were collected and he could start doing interesting statistical analyses, whereas once I had calculated the means, I was thinking about what the next study should be. He taught me how to really look at data and learn from them.

Phoebe’s drawings: 1

The course that made the biggest impression on me was Walter Mischel’s course on Personality, in which the main text was the draft of his revolutionary book, Personality and Assessment (Mischel, 1968). The central argument of this book was that personality traits do not predict people’s behavior very well. People respond to their immediate situation, and if you want to know what they will do, you’re much better off knowing what situation they’re in than what their score is on any personality test.

In every class we had passionate arguments, as most of us had always believed that personality was fundamentally important, but in the end the power of the situation had become a permanently available concept for me, in research as well as in life. I realized that one of the main reasons that people seemed consistent to me was that I always saw them in the same situation, and forever after when I see someone behaving badly I think “What circumstances are making them do that?” This was a genuinely new insight for me, and has made me not only a better scientist, but a more tolerant person.

The person I admired most at Stanford was Al Hastorf. Intellectually, Hastorf and Cantril (1954) showed me that different people perceive the same situation differently, which is an insight that informs appraisal theory, although I’m not sure I was consciously aware of the connection when I was first thinking about appraisal theory. Interpersonally, Al was able to talk to anyone in a way that made the person feel interesting and valuable, and was able to accomplish his goals as an administrator without making enemies. Lee Ross joined the faculty as a new assistant professor when I was still at Stanford, and was more available than other faculty for endless intense conversations that led to a lifelong friendship.

In my dissertation, I studied whether submission gestures (lowered gaze) forestalled aggression in humans as they do in other primates (Ellsworth & Carlsmith, 1973). The answer is “sometimes”. I did some follow-up research on nonverbal cues, but I eventually moved on – at the time the field of nonverbal communication seemed destined to stay on the periphery, and I didn’t want to become an aggression researcher because most of the aggression researchers I knew were pretty aggressive people, and I didn’t want them to be my closest colleagues.

When you were a graduate student at Stanford University, you spent the summers of 1967 and 1968 working in Paul Ekman’s lab at the Langley Porter Neuropsychiatric Institute in San Francisco. As you report in your paper “Basic Emotions and the Rocks of New Hampshire”, you helped him select the famous pictures he used during his field trip to Papua New Guinea to argue that facial expressions of basic emotions are universally recognized. Was the way in which the pictures were selected problematic to establish that conclusion? More generally, what is your view on the role of the cultural context in the recognition of facial expressions?

Phoebe with Paul Ekman and others, when Ekman was awarded the APA Distinguished Scientist Award

I was very interested in nonverbal communication and emotion at that time, but nobody at Stanford considered them important topics, which is why I was eager to work in Paul Ekman’s lab during the summer. I worked on designing a facial coding scheme based on pictures. At that time Paul and Wally Friesen were developing a verbal method for coding facial expressions, with terms like “slight squint”, “moderate squint”, “extreme squint”, “lips relaxed”, “lips pursed” and so on. I thought that words added an unnecessary complication and were likely to increase error, and I made a bunch of drawings to show the facial changes rather than describing them.

Paul thought that it would be much better to use photographs than drawings, so I spent a lot of time taking pictures of people and selecting eyes, brows, and mouths that were clear examples of various components of emotional expression. The coding system was called The Facial Action Scoring Technique (FAST; Ekman, Friesen, & Tomkins, 1971). It didn’t last long. Paul and Wally soon developed FACS, which was a much more detailed anatomically based system.

I also worked on the cross-cultural research. In those days a prevalent point of view was that facial expressions were culturally based and had no intrinsic meaning, and that emotions were based on people’s construction of the situation. Every introductory psychology textbook had a picture of someone showing an intense emotion like a tearful screaming face and asked readers to say what it was. The reader would say something like “agony” and then two pages later the face would be shown with its whole context – the screaming woman had just been told that she won the Miss America contest, and the text would say, “Ha, ha, see how wrong you were. Facial expressions are not a valid indicator of what a person is feeling.”

Phoebe’s drawings: 2

We thought that this was too extreme, and wanted to find out whether any facial expressions had the same meaning across cultures. I went through the stimuli used in all of the old facial expression studies and chose the pictures that had gotten the highest levels of agreement (mostly among Western subjects) and then pilot tested them again on Stanford students to make sure that they were still interpreted in the same way. I didn’t go to New Guinea.

Given that our goal was to discover whether there was any cross-cultural agreement in the recognition of facial expressions, we thought it was appropriate to choose the faces that were most likely to elicit agreement, and I still think that was the appropriate method. If the common hypothesis had been that facial expressions did have the same meaning across cultures, then it would have made sense to choose much more ambiguous faces as providing the best chance of showing cultural variability.

We never thought that all facial expressions would be interpreted in the same way across cultures, but that it would be very important to discover whether any did. And it turned out to be even more important than we had expected. It changed the way psychologists thought about emotion, and pictures from Ekman’s research replaced Miss America in the textbooks. Of course culture plays a role. It influences the situations that people see as emotionally significant, the kinds of emotional experience that are salient in people’s minds and therefore accessible and common, the emotions that are considered appropriate, and the ways that they should be expressed. And of course there is wide variation in emotional experience and expression within cultures (Ellsworth, 1994a).

One of the distinctive aspects of your remarkable academic profile is that you have interests and expertise both in emotion theory and in the intersection between psychology and law. When and how did you get interested in emotions on the one hand and in the intersection between psychology and law on the other hand?

Phoebe in graduate school, 1969

I was always interested in doing both basic research and research that would have more immediate relevance to social problems. I first became interested in law in a course on legal and criminal psychology taught by Hans Toch at Harvard, possibly the first of its kind. I thought that my law research was more like a “good deed”, like volunteer work, and had no idea that it would actually contribute to my career. I felt that if legislators and judges were going to make important decisions based on their assumptions about human behavior, they should have accurate scientific information about human behavior, rather than relying on their own intuitions.

My first psychology and law article was written with a law professor (Ellsworth & Levy, 1969) who was working on designing policies for child custody. I was surprised that he knew nothing about the relevant research in developmental psychology, and offered to review the literature for him. Policies about capital punishment and jury decision making were also being set without much knowledge of relevant research. For example, in 1970 the Supreme Court decided that there was no difference in the behavior of six-person and twelve-person juries (Williams v. Florida), and in 1972 they decided that whether or not there was a rule requiring unanimity would make no difference in jury behavior (Apodaca et al. v. Oregon).

Like many other psychologists, I was upset that decisions that had such major and permanent consequences could be made without any evidence. Further research by Michael Saks on jury size, Reid Hastie on unanimity, and many others showed that both size and unanimity do make a difference. Smaller juries are less representative and less predictable. Non-unanimous juries are less thorough in their deliberations and less likely to consider the views of jurors who disagree with the majority. I had already been working on legal issues before these decisions, but they strengthened my resolve. I believed that judges ought to consider relevant scientific research, but also that scientists had an obligation to produce research that was relevant.

You have argued in an influential paper entitled “William James and emotion: Is a century of fame worth a century of misunderstanding?” (1994) that William James’ views on emotions have been largely misunderstood. Your analysis has elicited several critical responses over the years. Now, 20 years after that contribution, do you still think James’ position is systematically misrepresented, and if so how?

William James (1842-1910)

I argued that the common view of James’s theory was an oversimplification, not that James’s position was “systematically misrepresented”, certainly not by scholars of his work. The common oversimplification was to say that James believed that emotions were nothing but the perception of one’s bodily sensations. What he actually said was that “the bodily changes follow directly the perception of the exciting fact” and that they are necessary for the experience of emotion, but “the perception of the exciting fact” was largely ignored in later discussions of the theory. And by “perception” he did not mean mere sensation; he meant a perception of the “total situation”, so a bear in the woods would elicit a very different emotional response than a bear in the zoo.

So I argued that James’s theory was consistent with appraisal theories of emotion. I think that none of us can definitively answer the question “What did William James really mean?” For one thing, although he could clearly describe the ideas he didn’t believe, when it came to what he did believe he was curious, speculative, complicated, and sometimes apparently contradictory. He was trying to focus the field on asking the right questions, and I don’t think he believed he had the final answers. Nor do we (Ellsworth, 1994b; Ellsworth, 2014).

Your work on emotional appraisal has been seminal. Could you reconstruct what got you interested in the appraisal dimension of emotions, and illustrate the core components of your current theory of appraisal, pointing to changes of your theory over time, if there have been any?

In the 1970’s Paul Ekman, Cal Izard and I were thinking of submitting a proposal to NASA on the utility of using facial expression as a means of regulating emotion. This idea was based on Silvan Tomkins’s ideas on the role of facial feedback in producing emotions. We thought we needed to include some preliminary research to indicate the value of this approach, and I was designated as the one to do it. Along with Roger Tourangeau, I designed and ran a study in which we showed people sad or scary films and posed their faces into sad or fearful expressions, so as to look at the relative influence of the situation and the facial feedback. We found that people’s emotions were powerfully influenced by the films rather than by their faces (Tourangeau & Ellsworth, 1979).

Phoebe with older daughter Sasha, 1988

So then the question became: What is it about the situation that leads to emotions? And clearly the driving force had to be people’s perceptions of the situation rather than the situation itself, given the obvious fact that different people can have very different emotional responses to the same situation. Schachter and Singer (1962) had proposed that people’s perceptions of the situation matter, which was a big contribution, but didn’t go further and ask what it is about people’s perception of the situation that matters.

Many theories had proposed that arousal and valence were crucial dimensions differentiating emotions, but I didn’t think that these two dimensions were enough to account for the richness of emotional experience. For example, fear, fury, and anguish are all negative and all intense. So I spent years, working with Ira Roseman and Craig Smith, who were graduate students, trying to figure out what other situational appraisals were fundamental in differentiating people’s emotional responses.

When we were just starting the first big empirical research project on the theory, Klaus Scherer gave a talk at my lab, and we were both stunned to discover that our theories were nearly identical! This convergence gave us confidence in the validity of the theory. I am quite proud of us for feeling pleasure and excitement about the similarity of our theories, not fighting over the credit or wasting our time addressing the small dissimilarities, but seeing it as the same basic idea and thinking about how we could test, develop, and extend the point of view, working as collaborators rather than competitors.

Over the years the theory has been clarified and extended and I expect that it will continue to be. It always annoys me when people assume that my ideas today are exactly what they were in 1985, when Craig Smith and I published our first article on appraisal theory (Smith & Ellsworth, 1985).

Some critics have argued that things other than appraisals can elicit emotions. Candidate alternative mechanisms have included facial feedback, chemical induction, brain manipulation and exposure to music. What is your answer to this critique? Are appraisals strictly necessary for emotion elicitation?

I don’t believe that all emotional experiences are elicited by appraisals, and in fact I’m not sure that “elicit” is the right word, if it is a synonym for cause. Brain stimulation can elicit all sorts of things (thoughts, images, behaviors), and I expect that chemical induction may too, since it operates through brain stimulation. Facial feedback may play a role in modulating emotions, but the research I did with Roger Tourangeau persuaded me that people’s perception of the situation is far more important, and then there’s always the vexing background question of what tells the face what to do – quite a few brain processes must intervene between the retina and the face.

Music is a more difficult challenge, as I pointed out in 1994 (Ellsworth, 1994c), and can evoke emotion in several ways: through memory (as when the music is associated with an emotionally-charged person or event), through the association of the rhythm with an action tendency (a strong loud rhythm may suggest striking, a descending scale may suggest falling, a soft, slow, even melody may suggest resting), through aesthetic appreciation, and through the sense of mastery that comes from familiarity. In adults, who have experienced emotional patterns of appraisal and their associated physiological responses and action tendencies many times, any component may elicit the others, so when we hear the loud strokes of the Dies Irae in Verdi’s Requiem our physiological responses and action tendencies evoke the appraisals and feelings of determination and anger. But Klaus Scherer is the real expert on this.

At this point, it is hard to find an emotion theorist who would deny that emotions are elicited by appraisals at least in standard cases, provided we allow appraisals to range from primitive to sophisticated forms of information processing. But I presume accepting this tenet is not sufficient for being an appraisal theorist, or we would have to count pretty much everyone as an appraisal theorist. So what ultimately distinguishes appraisal theory as a research program from research programs generally considered as competing with it, such as basic emotion theory, dimensional theories, psychological constructionism, and social constructionism?

Well, the fact that you’d count everyone as an appraisal theorist suggests the success and utility of the theory. You wouldn’t have said this in 1980. Most theorists now allow that a person’s perception of the situation is a major component of emotion. Schachter and Singer said this, but left it very vague. What appraisal theorists did was to specify the kinds of perceptions of the situation that were most crucial in differentiating emotion: novelty, valence, certainty, agency, goal conduciveness, control (or effort), and compatibility with norms.

These appraisals will distinguish most common emotional experiences, including the emotions postulated in Basic Emotion theories. But appraisal theory differs from Basic Emotion theories in that it allows for emotional experiences that do not fit neatly into categories such as Joy, Sorrow, Fear, Anger, Disgust, Surprise, and Contempt – or any other theorist’s list – while still being able to specify the appraisals that distinguish among these emotions. Appraisal theories see emotional experience as a dynamic process, constantly changing. As an appraisal changes, so does the emotional experience. A person can have an undifferentiated negative feeling, and further appraisals may transform that feeling into sorrow or anger or anxiety or any number of other states, even positive states. Very often we feel emotional but don’t fit into any of the categories common to Basic Emotion theories.

Phoebe’s drawings: 3

In some ways appraisal theories are dimensional theories, and in my first article on appraisal theory (Smith & Ellsworth, 1985) we reviewed dimensional theories and discussed appraisal theory in that context. Each type of appraisal is an appraisal dimension, so that a person may appraise a situation as anything from mildly positive to intensely positive, or see a situation as anything from totally under one’s control to totally out of control. There are more dimensions than in most other dimensional theories, but they are dimensions in the sense that a person can range continuously from very low to very high on any one of them, and they create a huge multidimensional space in which any point is theoretically possible, whether or not it corresponds to a labeled category or basic emotion.

Appraisal theories are also constructionist theories in that emotions correspond to the environment as appraised, and two people may appraise the same event in different ways depending on their goals, their experiences, and their cultural and personal values and beliefs, and feel different emotions. But appraisal theories are more specific than other constructionist theories in that they specify the constituent elements from which emotional experience is constructed. If you know how a person appraises a situation on the dimensions of novelty, valence, certainty, agency, goal conduciveness, control, and compatibility with norms, you have a pretty good idea of her emotional experience.

You have explored the role of culture in appraisal in your collaborative work with Batja Mesquita. It seems clear that there are cultural differences in the way some stimuli are appraised in different cultures (e.g. eating pork may be appraised as disgusting or as delightful depending on the culture), but also that there are universal antecedents in the elicitation of some emotions (e.g. sudden loss of support elicits fear in all cultures). How can we make sense of both aspects jointly?

It seems to me far more plausible to say that some situations are appraised very similarly across cultures and some are not than to say that either all situations are appraised in the same way (an extreme universalist point of view) or that no situations are appraised in the same way (an extreme relativist point of view).

Human beings belong to the same species. Our brains, bodies, autonomic nervous systems, hormones, and sense organs are similarly constructed, and our consciousness is shaped by the constraints and opportunities they provide. Human environments everywhere include novelty, hazard, opportunity, attack, success, failure, and loss, which people must perceive with some accuracy and respond to appropriately. These are the kinds of events that generate emotion, and many scholars believe that the primary function of emotion is to move the organism to appropriate action in circumstances consequential for its well-being.

However, it is also clear that cultures differ in their definitions of novelty, hazard, opportunity, attack, success, failure, and loss and in their definition of appropriate responses. They differ in their definitions of significant events and in their beliefs about the causes of significant events, and these differences affect their emotional responses. They differ in their beliefs about which emotions and which expressions of emotion are desirable or undesirable, and how they should be regulated.

I think the question is not “how can we make sense of both aspects jointly?”, but “how can we possibly understand emotions without taking account of both aspects?” What is needed is a framework that allows consideration of the general and the particular at the same time (Ellsworth, 1994a).

Amae, often defined as an emotional state of child-like attachment towards authority figures, is commonly presented as a uniquely Japanese emotion. Yet, in your work you argue that amae exists also in the US. How so? And do you think that there are any emotions that are culturally specific in the sense that you do not find them in any other culture?


Amae in Japanese

Amae is common in Japan. There is a word for it, and it is what Nico Frijda and Batja Mesquita called a focal emotion. Americans, who greatly value independence, have a hard time understanding the concept, which involves the enjoyment of dependence. Americans can understand it if they think of a child’s feelings towards its mother, but find it hard to understand as a positive feeling between adults. Nonetheless, if you ask Americans if they can remember a time when they were in a specific situation of the sort that elicits amae, they can often do it.

We asked people to describe a situation in which they asked a friend to do a fairly big favor for them and the friend agreed, or when a friend asked them for a favor. They put themselves in a position of dependence on the other person, and they tended to report positive emotions. Japanese people who read these descriptions saw them as amae situations (Niiya, Ellsworth, & Yamaguchi, 2006). Thus amae can be experienced by Americans, although amae situations might be quite rare in American culture, or they could be fairly common but go unnoticed because the concept is unnamed and unavailable.

In general, understanding the emotions of people in another culture involves appraising their situations in the same way that they do. Filmmakers and novelists portray events so that the audience can see them as the characters do, and thus share the characters’ emotions. Josh Wondra and I have also proposed that this is how empathy works: when we appraise another person’s situation in the same way that she does, we feel what she feels: We empathize (Wondra & Ellsworth, in press).

A wide range of diverse elicitors, ranging from maggots to hypocrisy, can cause disgust. This raises the question of whether disgust is just an umbrella term we use to refer to importantly different emotions, or whether disgust is a unified emotion. What are your views on the matter?

I am not a believer in “unified emotions”. The English term “disgust” is used for a range of aversive states, some closely related to anger – such as our reactions to hypocrisy or injustice, some more closely related to fear – such as our reactions to maggots, with many other shades and nuances possible. Various scholars have dealt with this by proposing a basic “real” disgust and considering usages that don’t fit their definition of real disgust as metaphorical extensions, or by proposing a “primary” disgust (typically physical disgust) and a “secondary” disgust (typically a more culturally variable moral disgust), and there is some evidence for overlapping but distinctive brain processes corresponding to these.

What is referred to as moral disgust generally involves a human agent, and thus closely resembles anger, which is a response to something bad caused by another person. It also involves a desire to punish, and a sense of dominance and righteousness, all of which are absent in cases where there is no human agent, which are often more similar to fear (Lee & Ellsworth, 2013) in that the perceiver feels vulnerable and is more concerned with escape and self-protection than with punishing anyone. As the appraisal changes, so does the emotion. I believe that there are many intermediate states and other subtle variations in the experience of disgust, and that the categorical distinctions (between physical and moral disgust, or between moral disgust and anger) are arbitrary and culturally determined, although once these distinctions exist in a person’s mind they are important and real in their consequences.

Let us transition to your groundbreaking work at the intersection of psychology and law. You have written on eyewitness testimony, child custody, jury bias, the death penalty and many other pivotal topics, trying to use the tools of social psychology to improve our legal system. In which of these areas do you think social psychology research has been more successful in bringing about positive changes? Which of these areas is still sorely in need of radical reform in light of what empirical research tells us?

I think the biggest changes have probably been in the area of child custody, where joint custody arrangements have largely replaced the old preference for giving custody to the mother. One reason is that although judges tend to regard themselves as experts on most topics, they recognize that they are not experts on children, and so are more open to information from psychologists. Legal changes are sometimes top-down, with the Supreme Court declaring a practice unconstitutional, as in Brown v. Board of Education, but often they are bottom-up, with a few, then many, then most jurisdictions changing their practices, sometimes in response to psychological research.

The Supreme Court of the United States

The fallibility of eyewitness testimony has become much more widely recognized than it used to be, largely due to the work of Beth Loftus and other psychological researchers, and many police departments are changing the ways in which they conduct lineups and document the identification procedure in order to reduce suggestion and increase accuracy. Likewise the practice of videotaping interrogations has become much more common due to the evidence that some innocent people have confessed to crimes, and the psychological research on false confessions.

Research on capital punishment has been much less influential. Although there are many empirical issues related to capital punishment, such as the deterrent effect of the death penalty, racial discrimination, and the fairness of juries that exclude opponents of the death penalty, it is often seen as fundamentally a moral issue, and so courts have sometimes considered empirical evidence to be irrelevant. I think that this is slowly changing: In the past 15 years the Supreme Court has banned capital punishment for intellectually disabled people (Atkins v Virginia, 2002) and for juveniles (Roper v Simmons, 2005) in part based on psychological research, and some states have abolished the death penalty. The continued use of the death penalty in the United States will of course strike most European ISRE members as bizarre.

You spent the 1970s as a psychology professor at Yale University, but started going back to Stanford to work with Samuel Gross, who was then challenging the constitutionality of death-qualified juries, and later became your husband. At the heart of the case for which you eventually became expert witness was the claim that empirical research shows that weeding out prospective jurors who are against the death penalty biases the trial against the defendant. What evidence did you rely on to support this claim? The case reached the US Supreme Court but ultimately lost. What did this experience teach you about the extent to which scientific facts matter to legal decisions at different levels of the hierarchy of courts? You still are a board member of the Death Penalty Information Center. What is the objective of the Center? And finally, how hard is it to write papers with one’s spouse?

Well, actually Anthony Amsterdam was the famous lead lawyer in that case, and Hans Zeisel was the famous lead expert witness. The practice in capital cases is to exclude anyone who is adamantly opposed to the death penalty from serving on the jury because those people wouldn’t be able to follow the law and vote for the death penalty if it were called for. I was at Stanford on sabbatical from Yale, and conducted four studies (all published in Law and Human Behavior, 1984) that showed that the people who were allowed to serve on capital juries (the “death-qualified” jurors) were more pro-prosecution and more likely to find the defendant guilty than a jury that reflected the views of the whole community. By that time there were also eleven other studies that had found that death-qualified jurors were more conviction-prone, and I testified about all 15 studies.

In 1986, when the case came to the Supreme Court (Lockhart v McCree), the Court was highly unlikely to accept any arguments that challenged capital punishment. We had hoped that they would see this as a case about jury fairness, more than as a case against the death penalty, but it was a pretty feeble hope. The Supreme Court opinion plodded through all the studies one by one, finding a flaw in each one and so eliminating them from consideration until only one was left, and then said that “surely a ‘per se constitutional rule’ as far reaching as the one McCree proposes should not be based on the results of [a] lone study…” (476 US 172). This process of elimination, of course, is totally at odds with the idea of convergent validity, where a collection of studies using many different methods (surveys, experiments, interviews with real jurors) all converge on the same conclusion, providing strong evidence even if individual studies have flaws.

For good measure (and perhaps in order to forestall future flawless studies) the Court decided that after all death qualification was not an empirical issue, so no future research would be relevant. A year later they similarly rejected very powerful evidence of racial discrimination in the application of capital punishment. This suggests that empirical data are likely to be unpersuasive when they challenge judges’ strongly held beliefs, in this case the belief in the acceptability of capital punishment.

The Death Penalty Information Center is an organization that collects and keeps up-to-date information on capital punishment – the number of people on death row, executions, and new death sentences in each state; racial issues; wrongful convictions; cost of the death penalty; new empirical research, and a host of others. It is where the media, the public, and students go to get accurate information. On issues that involve passionate attitudes, accurate information is hard to come by, and the goal of the Death Penalty Information Center is to provide it.

Although the issue of death qualification lost in the Supreme Court, various jurisdictions have used the research in attempts to reform procedures in capital cases. And although it was not much of a success in the policy realm, it was a stunning success in the personal realm: Sam Gross and I fell in love and got married, and so did two of the graduate students who worked on the research with me. Sam and I have continued to write together on the death penalty.

“How can you write articles with your husband and stay married?” is one of the most common questions I get about my work in this area, along with “How could you let a man on death row write to your four-year-old daughter?” I actually find it easier to write articles with Sam than with anyone else, and often when we reread our work we can’t remember who wrote which parts. This may be because we worked very intensively together for nine months before falling in love, so we already had a deep knowledge and admiration for each other as collaborators.

In one of your highly influential papers on racial bias in courts – “White Juror Bias”, co-authored with Samuel Sommers – we find the following poignant quote from To Kill a Mockingbird by Harper Lee: “in our courts, when it’s a white man’s word against a black man’s, the white man always wins…The one place where a man ought to get a square deal is a courtroom, be he any color of the rainbow, but people have a way of carrying their resentments right into a jury box”. We have come a long way from Atticus Finch’s Alabama, but in some of your work you have argued that there is still significant racial bias in juror decisions. What is the main evidence for this claim? And is there anything we can do to get rid of racial bias in juror decisions once and for all?

Racism in America is pervasive. In every city and town where both Blacks and Whites live, they live largely in segregated neighborhoods. Racial bias on juries is a tiny manifestation of a far more general problem. Even in the criminal justice system, discrimination begins long before the few cases that are tried by juries ever get to trial: There is discrimination in who is stopped, searched, arrested, and charged, in who can make bail, and what kind of lawyer is affordable. Racial minorities are hugely overrepresented in jails and prisons, but most of them were not convicted by juries. Most people plead guilty or are tried by judges. On juries, there are rules to prevent an attorney from striking all the Black jurors and instructions designed to reduce prejudice. These measures are far from completely effective, but it is probably easier to reduce prejudice on juries than in many other situations (Sommers & Ellsworth, 2001).

You have co-authored a book on Methods of Research in Social Psychology. You have also been Associate Editor of the Journal of Experimental Social Psychology (1977-80), currently are Associate Editor of Emotion Review (2014-) and sit on many editorial boards of top journals. You therefore seem to be ideally situated to say whether what gets published in social psychology journals tends to be as methodologically rigorous as it should be. More generally, should the quality of published scientific research in social psychology and/or emotion theory improve, and if so in what ways?

There is out-and-out fraud, and that of course is unacceptable. Then there is the much more complicated issue of data analysis, on which a great deal is being written by people with more expertise than mine. I have never believed that everything that is written in psychological journals, or any other scientific journals, is true. When we write an article we say, “This is what I think is true now, and this is why I think it.” Other scholars may think, “That’s interesting, and I’ll explore it further”, or they may think, “That’s dumb – she doesn’t realize that her treatment or her measure or her reasoning is flawed, and I will do the study that she should have done.”

I think that there is a value in publishing interesting, provocative ideas even if the data are weak. If other people are interested they will follow up on the ideas, sometimes trying to extend them and sometimes trying to disconfirm them. The ideas are discussed and criticized, and the field is invigorated by these discussions, even if the original ideas turn out to be wrong. I think it is unrealistic to expect a researcher to raise, examine, and settle a question in one article, and that if this can be done it usually means that the questions are pretty pedestrian. That is not how science works. Experiments are not just a series of true/false tests. For me the best research is research that gets us thinking and suggests ideas and questions we hadn’t thought of before.

Nonetheless, I do worry about the quality of the research. Over the course of my career, success has become increasingly judged by the number of studies run and the number of articles published, and this inevitably lowers the average quality of the articles, as in any speed-accuracy tradeoff. Young scholars know that getting a job and getting tenure depend on having a lot of publications, and so waiting to publish until the empirical support is unimpeachable is likely to result in fewer publications and lessen a person’s chances of success. Many schools even suggest that a certain number of publications is required for getting tenure, which puts scholars who are actually seeking the truth in an excruciating dilemma.

I think that two very good studies are worth more than ten mediocre ones, and I wish that the person who did the two studies would have better chances of promotion and success than the person who did the ten. When I was young I withdrew an already-accepted study from JPSP because I thought the findings needed more support, and the article that was finally published was much better. My career was not damaged by this choice. If anything, it was helped, as Bill McGuire, the editor of JPSP, who had never heard of me before, paid attention to my work and admired me from then on. Nowadays I fear that this sort of perfectionism is likely to hurt people’s careers.

I also worry that some of the recommendations seem to assume that there are methods that are intrinsically superior and methods that are intrinsically inferior. I believe that different questions require different methods (Ellsworth & Gonzalez, 2003). In my field, for example, some scholars have tried to define the best verbal measure of emotions, the best set of stimuli for testing perceptions of facial expressions, the best methods for eliciting different emotions, etc., all in the interests of creating standardization across the field. This enterprise is likely to fail.

First, it totally confounds the concept with the method. Second, the emotions I want to ask about depend on what I want to know – so if I want to find out about how feeling angry at oneself differs from feeling angry at someone else, I will want to ask a lot of fine-grained questions about varieties of anger, but I won’t need any of the items a Positive Psychologist would need to differentiate among positive emotions. A standard measure is unlikely to capture what I’m looking for. Or if it does, it will also include dozens of distracting, time-consuming questions that are irrelevant to my purpose.

Third, perceptions and customs change over time, more and more rapidly. Words fall out of use and new ones replace them, and even when the same words are used their meanings change over time. In the 21st century a verbal measure has a short shelf life. Stimuli that aroused fear 50 years ago (like the special effects in horror movies) may just be seen as ridiculous now, and stimuli that would have gone unnoticed (like a whole classroom taking turns drinking out of the same cup) may be regarded as disgusting. “Standardized” treatments and measures do not mean the same thing across time and place, and seeking such standardization is an effort doomed to failure.

I also believe that direct replication in social psychology is impossible. When people ask me to send them the stimuli I used 25 years ago so that they can conduct a “direct replication” I am reluctant. The times have changed, the people have changed, and if I wanted to replicate the findings myself I would be unlikely to think that the 25-year-old materials are the best ones to use now. I have heard that some of the groups that claim to do direct replication actually run subjects through five or ten of the experiments to be replicated in a row. This does not result in direct replications: If you run a study of mine right after a study of aggression I would not expect it to have the same results as it would if you ran it right after a study of compassion. If we have learned anything in our field, it is that context matters.

You have received many teaching awards in your career, including the Dean’s Award for Distinguished Teaching, Stanford University (1984), the American Psychological Association Distinguished Lecturer Award (1999), the APA Association of Graduate Students (APAGS) Raymond D. Fowler Award for outstanding contributions to students’ professional development (2011) and the SPSP Nalini Ambady Award for Mentoring Excellence (2014). What do you think makes you a successful teacher and mentor? Do you think university professors as a group should work harder to improve the quality of their teaching and mentoring? How could they do so?

PhD party for Eddie Tong, 2006

When I was in graduate school, Phil Zimbardo gave us all feedback on our teaching ability, and although he tried to be kind, he did not have high hopes for me as a teacher. And in fact I could never be a charismatic teacher like Phil because I can’t keep track of what I want to say, how the class is responding, and a dizzying array of audio-visual aids and special effects at the same time. I work hard and I care about students and I got better over time.

An important part of a professor’s job is training graduate students to become the next generation of psychological scientists. Treating them like research assistants that we have hired to work on our own projects is not enough; it’s also important to help the students to make their own research and their own ideas as good as they can possibly be. This may be easier for me than for some people, because I find solving methodological problems intrinsically interesting. And the students have been wonderfully generous about nominating me for awards.

What are your hobbies?

In addition to teaching and two entirely different lines of research, I have been a wife and a mother, so “hobbies” have not been on the agenda, except for functional ones like cooking and gardening. I also read something that’s not psychology every night before going to sleep, history or literature usually. I like to travel, and I like to draw. I draw Christmas cards every year, and occasionally other cards for people.

You now live in Ann Arbor, where you hold the position of Frank Murphy Distinguished University Professor of Psychology and Law at the University of Michigan. What do you like and what do you dislike about living in Ann Arbor? What are a handful of your favorite restaurants in town? Do you enjoy cooking, and if so do you have a favorite recipe to share?

Ann Arbor, Downtown

We moved to Ann Arbor from Stanford when our daughters were ages one and five. It is a great place to combine careers and family – safe, culturally rich, good public schools, and a world class university. It’s a little smug and Norman Rockwellish, but neither of these is a severe drawback. I’ve missed the edginess of New England and Sam misses Berkeley – and perhaps reflecting our backgrounds one of our daughters is now in Providence and the other in Berkeley – but on the whole it was a good choice.

Ann Arbor has Zingerman’s, one of the best gourmet delicatessens in the country, and for lunch on weekends we have their top-class cheeses, baguettes, and salami. Restaurants I like are Pacific Rim and Mani Osteria.

I’m from New England, which is not the gourmet capital of the world, but here’s a recipe:

Mussels with Roasted Potatoes (for 4)

4 large Yukon Gold or baking potatoes

6 T olive oil

½ t salt

⅓ C finely chopped shallot

3 minced garlic cloves

¼ t red pepper flakes

1 C white wine

¾ C water

2lb mussels

3T chopped parsley


Preheat oven to 450°F.

  1. Halve potatoes lengthwise and cut off a ¼ inch slice from round sides to make 8 slabs ½-¾ inch thick.
  2. Coat slabs with 2 T oil and sprinkle with salt.
  3. Put in one layer in a large shallow baking dish and roast 25-30 minutes, turning over halfway through, until golden brown.
  4. Meanwhile, over medium heat, cook garlic, shallot, and pepper flakes in 2 T oil in a deep 12-inch skillet, stirring, 3 minutes.
  5. Add wine and water, bring to a simmer, add mussels and cook, covered, over medium high heat until mussels open, about 3 minutes, transferring to a bowl as they open. Stop after 6 minutes.
  6. Whisk parsley and 2 T oil into mussel broth and season with salt and pepper.
  7. Divide potatoes among 4 shallow bowls and top with mussels and broth.

What are you working on these days?

I continue to work on the development and extension of appraisal theory, including new work on empathy and other vicarious emotions, and on empirical issues related to the death penalty. As I get older, I spend more of my time advising graduate students on their work, some of which is closely related to mine, some of which is not. They are, after all, the future of psychology.

Please list five articles or books that have had a deep influence on your thinking.

  1. D. T. Campbell & J. C. Stanley. Experimental and quasi-experimental designs for research. Rand McNally, 1966.
  2. E. M. Forster. Two cheers for democracy. Edward Arnold & Co. 1951.
  3. William James. The principles of psychology. Henry Holt. 1890.
  4. G. E. Hutchinson. Concluding remarks. Cold Spring Harbor Symposia on Quantitative Biology, 1957, 22, 415-427.
  5. Erving Goffman. The presentation of self in everyday life. Anchor Books, 1959.

I have been deeply influenced by works of poetry and fiction too numerous to list.

What do you think are the most pressing questions that future affective science should be focusing on?

I would love to see more research on the development of emotion in infancy and childhood, as it seems to me that this research would help us to make progress on many of our most significant theoretical questions. First, we could learn more about which aspects of emotion are innate and which are learned, and so begin to understand what the biological raw materials are and how they are affected by experience and learning, including the role of culture. Second, we could learn about how language affects emotion. Third, we could better understand the relation between cognition and emotion (I realize that these questions are related). Some of this research could be informed by cross-species comparison, some by cross-cultural comparison. Joe Campos has done wonderful research on emotional development, but there are far more scholars working with adults than with children, and among developmental psychologists, there have been far more researchers studying cognition than emotion.


Cowan, C., Thompson, W., and Ellsworth, P.C. (1984) The effects of death qualification on jurors’ predisposition to convict and on the quality of deliberation. Law and Human Behavior, 8, 53-79.

Ekman, P., Friesen, W. V., & Tomkins, S. (1971) Facial affect scoring technique: A first validity study. Semiotica, 3, 37-58.

Ellsworth, P.C. (1994a) Sense, culture, and sensibility. In S. Kitayama and H. Markus (Eds.) Emotion and Culture: Empirical Studies of Mutual Influence, Washington, D.C.: APA.

Ellsworth, P.C. (1994b) William James and emotion: Is a century of fame worth a century of misunderstanding? Psychological Review (Centennial Issue), 101, pp. 222-229.

Ellsworth, P.C. (1994c) Levels of thought and levels of emotion. In P. Ekman and R.J. Davidson (Eds.) The Nature of Emotion: Fundamental Questions. New York: Oxford, pp. 192-196.

Ellsworth, P. C. (2014). Basic emotions and the rocks of New Hampshire. Emotion Review, 6, 21-26.

Ellsworth, P.C., Bukaty, R., Cowan, C., and Thompson, W. (1984) The death-qualified jury and the defense of insanity. Law and Human Behavior, 8, 81-93.

Ellsworth, P.C., & Carlsmith, J.M. (1973) Eye contact and gaze aversion in an aggressive encounter. Journal of Personality and Social Psychology, 28, 280-292.

Ellsworth, P.C., & Gonzalez, R. (2003) Questions and Comparisons: Methods of research in social psychology. In M. Hogg and J. Cooper (Eds.) Sage Handbook of Social Psychology. London: Sage, pp. 24-42.

Ellsworth, P.C., & Levy, R.J. (1969) Legislative reform of child custody adjudication: An effort to rely on social science data in formulating legal policies. Law and Society Review, 4, 167-233.

Fitzgerald, R., and Ellsworth, P.C. (1984) Due process vs. crime control: The impact of death qualification on jury attitudes. Law and Human Behavior, 8, 31-52.

Hastorf, A. H., & Cantril, H. (1954). They saw a game: A case study. Journal of Abnormal and Social Psychology, 49, 129-134.

Lee, S. W. S., & Ellsworth, P. C. (2013) Maggots and morals: Physical disgust is to fear as moral disgust is to anger. In K. R. Scherer & J. R. J. Fontaine (Eds.) Components of Emotional Meaning: A Sourcebook. Oxford University Press.

Mischel, W. (1968). Personality and Assessment. New York: Wiley.

Niiya, Y., Ellsworth, P.C., & Yamaguchi, S. (2006) Amae in Japan and the U.S. Explorations of a “culturally unique” emotion. Emotion, 6, 279-295.

Schachter, S., & Singer, J. E. (1962). Cognitive, social, and physiological determinants of emotional state. Psychological Review, 69, 379-399.

Smith, C.A., and Ellsworth, P.C. (1985) Patterns of cognitive appraisal in emotion. Journal of Personality and Social Psychology, 48, 813-838.

Sommers, S., and Ellsworth, P.C. (2000) Race in the courtroom: Perceptions of guilt and dispositional attributions. Personality and Social Psychology Bulletin, 26, 1367-1379.

Thompson, W., Cowan, C., Ellsworth, P.C., and Harrington, J.C. (1984) Death penalty attitudes and conviction proneness: The translation of attitudes into verdicts. Law and Human Behavior, 8, 95-114.

Tourangeau, R., and Ellsworth, P.C. (1979) The role of facial response in the experience of emotion. Journal of Personality and Social Psychology, 37, 1519-1531.

Wondra, J., & Ellsworth, P. C. (in press) An appraisal theory of empathy and vicarious emotions. Psychological Review.


Apodaca et al. v. Oregon. 406 U.S. 404 (1972)

Atkins v. Virginia. 536 U.S. 304 (2002)

Brown v. Board of Education. 347 U.S. 483 (1954)

Lockhart v. McCree. 476 U.S. 162 (1986)

Roper v. Simmons. 543 U.S. 551 (2005)

Williams v. Florida. 399 U.S. 78 (1970)









