Tuesday, January 17, 2023

A brief intellectual history of implicit bias

Note: This essay is a lightly edited version of an email I sent to some colleagues of mine, who suggested I post it as a blog post for others to read. I hope you find it useful.

---

First of all, implicit bias is a very, very American theory. America was founded on the principle that all men were created equal, yet the Constitution had a provision that counted enslaved African Americans as only 3/5 of a person. American history is also characterized by both racist antipathy (Jim Crow, lynchings) and strivings toward equality (the Emancipation Proclamation, the Civil Rights Movement).

In many ways, the theory of implicit bias is a manifestation of both these tendencies, just turned inward toward the individual person: both the antipathy (automatic stereotypic associations) and the strivings (egalitarian values).

The history of the theory also contains the imprints of these two tendencies. After the US Civil Rights Movement, intergroup researchers in psychology were optimistic that normative and legal changes would make racism obsolete, and this opinion seemed to be vindicated by increasingly positive attitudes toward Black people in national opinion surveys. But in the 70s and 80s, researchers found that if you set up a lab experiment just right you could find evidence of discrimination. There's a famous review by Crosby, Bromley, and Saxe (1980) that summarizes these studies as follows: "Our own position has been to question the assumption that verbal reports reflect actual sentiments, and we inferred from the literature that whites today are, in fact, more prejudiced than they are wont to admit."

I think this position reflected the consensus view among intergroup scholars at the time -- that people are lying on public opinion surveys but would betray their "true" opinions when they could get away with it. The view fit with a general movement in social psychology that viewed self-reports with suspicion, favoring cognitive tasks instead. A famous study by Srull and Wyer (1979) embodies this trend: they asked participants to unscramble sentences that either were related to aggression or not, then, in a separate task, to judge whether a specific person was acting in a hostile way. The idea was that concepts related to hostility and aggression would "prime" or prepare participants to view other people as aggressive. This general line of thinking would greatly influence the development of implicit measures in the study of race going forward.

To a modern eye, the unobtrusive studies on which Crosby et al. (1980) based their conclusion are extremely underpowered and look p-hacked. If you want a sense of just how bad, check out Gaertner (1975), who conducted a 2x2x2 design ... with 40 participants (5 participants per condition!). These studies also predominantly used college students. So I'd say these studies (and the consensus that came from them) had a credibility problem.

Nevertheless, the consensus view was that people (even liberal college students) would behave in discriminatory ways despite reporting egalitarian attitudes. The usual conclusion researchers at the time drew was that people's self-reports couldn't be trusted. Devine (1989) disagreed, not because she questioned the assumption of widespread discrimination, but because she concluded that both the discrimination and the self-reports reflected genuine psychological processes that conflicted with each other. One of the processes was the automatic antipathy (the stereotypes, as she put it) that was acquired from repeated pairings of a social group with negative information, picked up from the social environment. The other reflected genuine beliefs (or values) and was more deliberative. Here we see America's historical conflict turned inward.

This view was immediately appealing. I think researchers liked it partly because it was an optimistic view (it's saying that people want to do the right thing, they just mess up along the way), partly because it provided a road toward intervention (reduce the influence of stereotypic associations), and partly because it had resonance: it reflected longstanding conflicts in American history. The problem was that Devine didn't offer any direct measure of the automatic associations -- she merely attempted to document their influence using the same "unobtrusive" paradigms I mentioned above (she used a race version of the Srull and Wyer paradigm for this, actually).

This changed with the introduction of implicit measures. First Fazio et al. (1995), then Greenwald et al. (1998), introduced what they claimed to be direct measures of automatic associations (what later came to be known as implicit bias). Greenwald et al. even made their materials and analysis scripts public in an impressive show of open science. This gave legs to the implicit bias movement.

In the excitement, however, standard psychometric principles fell by the wayside. In a weird reversal of the usual logic, boosters of implicit measures used low correlations with explicit measures as evidence for the validity of implicit measures (see, for example, Karpinski & Hilton, 2001); usually you'd want similar measures to exhibit relatively high correlations to show convergent validity.

In addition, there was evidence of p-hacking in some of the studies of predictive validity. The highest-profile example is probably McConnell and Leibold (2001), who tested the predictive validity of the race IAT with 42 White students; a re-analysis of this study suggested that the results were not robust (Blanton et al., 2009).

I'd say that one of the bigger errors in this area of scholarship was that the implicit measures -- especially the IAT -- were aggressively pitched to the public as revealing the "roots of unconscious prejudice" (see this piece of investigative journalism by Jesse Singal). It's this very public marketing of the IAT as revealing something deep and inherent in all of us that pushed these measures into the public consciousness and made people feel like they revealed something deep and true -- even if the evidence for this was lacking.

Moreover, the meaning of the IAT -- as a measure of unconscious prejudice -- was seldom questioned. Uhlmann and colleagues (2006) pointed out that associations between Black and bad could be caused by an awareness that Black people in the United States have been the victims of slavery, Jim Crow, and other forms of oppression -- all undoubtedly bad things, but not the kinds of "unconscious prejudice" that the makers of implicit measures had in mind. The relationships between implicit measures and criterion measures were also assumed to indicate that implicit bias causes behavior. Yet these relationships could indicate that attitudes (not just implicit, but all attitudes) cause behavior -- or they could indicate that people who discriminate against Black people come to hold negative attitudes (through cognitive dissonance, for example). In fact, if anything, the makers of the IAT aggressively boosted its interpretation as a measure of unconscious prejudice that was incredibly consequential (Greenwald et al., 2015).

Yet the enthusiasm for implicit bias research only accelerated. Despite the lack of good validity evidence, bias on implicit measures came to be viewed as a valuable target of intervention in its own right, leading to a host of studies attempting to change the measure. The most famous is probably Dasgupta and Greenwald (2001), a study that purported to show 24-hour reductions in implicit bias from just showing people pictures of admired Black people (and disliked White people). (This study was probably subject to publication bias or other selection processes; see this early replication of it by Joy-Gaba and Nosek, 2010.)

These studies set the stage for my meta-analysis with Calvin Lai. We set out to synthesize the studies of implicit bias change to find the best approaches. We both started out as believers that implicit bias was a valuable target of change. However, the meta-analysis revealed some real weaknesses: very few studies that measure behavior (speaking to the fact that researchers started to view implicit bias as important in its own right, effectively assuming the measure's validity), behavioral measures with poor validity, evidence of publication bias, very few longitudinal studies, very few intensive manipulations, and no evidence that changing implicit bias causes changes in behavior. 

To say that this was discouraging for me would be a severe understatement: it completely upended my opinion on implicit bias as a useful target of intervention. It also severely challenged my opinion that implicit bias plays any causal role in impacting behavior: we observed that implicit bias could be shifted, but these shifts did not provoke shifts in behavior. This is a pattern that is hard for someone who believes implicit bias causes behavior to explain.

Moreover, the issues with the validity of implicit measures never really went away; they merely became harder to ignore. The relationship between implicit measures and behavior is small (Oswald et al., 2013), and there's not a strong base of studies with valid measures of discrimination to test the important relationship between measured implicit bias and actual discrimination (Carlsson & Agerström, 2016). The issues with the interpretation of scores on cognitive tasks like the IAT that I brought up earlier also never really went away. The assumption about discrimination that underlies the theory of implicit bias -- that it's widespread and done by everyone -- has been questioned (Campbell & Brauer, 2021). There's also accumulating evidence from cognitive psychology that suggests that the developmental story behind implicit bias -- that it's an automatic attitude acquired over time without awareness -- cannot be true (Corneille & Mertens, 2021). Finally, it's even possible that the interpretation that Devine put forward -- that there are two processes that oppose each other -- is wrong, and that implicit and explicit measures assess the same thing after all (Schimmack, 2021).

From my view, the evidence is at this point solidly against the theory my former supervisor (Trish Devine) proposed in 1989.

2 comments:

  1. Nice review of the history and the broader cultural context. One more aspect that made the story of implicit bias appealing was that it made it easier to talk about prejudice. "It is not you, it is your unconscious, and we are not responsible for our unconscious" is just easier to tell somebody than to tell them that they are prejudiced or even racist. We still need to find a way to talk openly about our biases, but first we need a credible science of these biases.

    1. great point. when i used to give implicit bias trainings i found this tack useful, and many of the other trainers i've talked with have also found this angle strategic.

      even though this framing can be strategic, i think it is not without its downsides: framing bias as unintended absolves people of responsibility for working on these problems, especially if they are, in fact, complicit in treating black people badly.

      https://par.nsf.gov/servlets/purl/10200185
