The name problem

Names signal more than race

Most audits use names to signal the race of a fictional applicant. But a name can also signal social class, education, and citizenship. If a name reads as both Black and lower income, a callback gap conflates two treatments, and the causal claim quietly breaks.

This is the problem my coauthors and I tackled in Crabtree, Kim, Gaddis, Holbein, Guage, and Marx (2023), in Nature Scientific Data. We collected 44,170 evaluations from 4,026 US respondents for 600 names: 100 white, 300 Asian, 100 Black, and 100 Hispanic. For each name, the data record how respondents perceived the person's race, education, income, and citizenship. The point is to let researchers choose names whose perceived attributes match their design assumptions, rather than guessing.

There's a subtlety here that John and I pressed in our reply to Mitterer (2022, PNAS). Names are bundled treatments, but so is every other way of signaling race, and some bundle worse. A photograph carries age, attractiveness, and class on top of race. The one advantage of names is that we can measure the bundle, which is what the data below do. Partialling those perceptions back out isn't always the right move either, since it can strip away part of the construct you set out to study.

Two follow-ups put numbers on the bundle. In Crabtree, Gaddis, Holbein, and Larsen (2022, Sociological Science), we show that names which read as Black or Hispanic also read as lower in education and income; the class signal rides along with the racial one. And in Crabtree and Chykina (2018, Sociological Science), we show a name's racial signal isn't fixed: it shifts with the local demographics a reader has in mind, so a name that signals race clearly in one place can signal it weakly in another. That's a real limit on how far any one set of names travels.
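One way to see why local demographics matter is Bayes' rule: what a reader infers from a name depends on how common the name is within each group and on how large each group is in the population the reader has in mind. The sketch below is only an illustration of that logic, not the estimation used in the paper. The group labels and every number in it are made up.

```python
def posterior_group_given_name(name_rate_by_group, population_share_by_group):
    """Bayes' rule: P(group | name) is proportional to P(name | group) * P(group)."""
    joint = {
        group: name_rate_by_group[group] * population_share_by_group[group]
        for group in name_rate_by_group
    }
    total = sum(joint.values())
    return {group: value / total for group, value in joint.items()}

# Hypothetical rates: the name is five times as common in group_a as in group_b.
name_rate = {"group_a": 0.010, "group_b": 0.002}

# Two hypothetical places that differ only in their demographic mix.
place_one = posterior_group_given_name(name_rate, {"group_a": 0.50, "group_b": 0.50})
place_two = posterior_group_given_name(name_rate, {"group_a": 0.05, "group_b": 0.95})

print(round(place_one["group_a"], 2), round(place_two["group_a"], 2))
```

With these made-up inputs the same name points to group_a for most readers in the first place and for a minority in the second, even though nothing about the name itself has changed.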

The full dataset runs live in your browser. Pick a name, or two, and see what people actually inferred.

[Interactive name explorer: select a name to see its perception profile, and a second name to compare.]

Race shares are the proportion of respondents who categorized the name into each group. Education runs from 1 (high school or less) to 4 (graduate or professional degree). Income runs from 1 (low) to 3 (high). Citizenship is the share who perceived the person as a US citizen. Estimates pool all three surveys; n is the number of evaluations per name.
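To make those definitions concrete, here is a minimal sketch of how a per-name profile could be computed from raw evaluations. The dictionary keys are hypothetical, not the dataset's actual column names. It also assumes that the education and income figures are simple means over pooled evaluations. The rows are made up for illustration.

```python
from collections import Counter
from statistics import mean

def perception_profile(evaluations):
    """Summarize one name's evaluations.

    Each evaluation is a dict with hypothetical keys:
    race (str), education (1-4), income (1-3), citizen (bool).
    """
    n = len(evaluations)
    if n == 0:
        raise ValueError("no evaluations for this name")
    race_counts = Counter(e["race"] for e in evaluations)
    return {
        "n": n,
        # Share of respondents who placed the name in each group.
        "race_shares": {race: count / n for race, count in race_counts.items()},
        # Assumed here to be plain averages of the ordinal responses.
        "education": mean(e["education"] for e in evaluations),
        "income": mean(e["income"] for e in evaluations),
        # Share who perceived the person as a US citizen.
        "citizenship": sum(e["citizen"] for e in evaluations) / n,
    }

# Made-up rows, not values from the dataset.
rows = [
    {"race": "white", "education": 2, "income": 2, "citizen": True},
    {"race": "white", "education": 3, "income": 2, "citizen": True},
    {"race": "white", "education": 2, "income": 1, "citizen": True},
    {"race": "Hispanic", "education": 1, "income": 2, "citizen": False},
]
profile = perception_profile(rows)
print(profile)
```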

A useful default pairing is already loaded. David Hansen and Darnell Jackson have nearly identical perceived education, 2.09 against 2.05, which is exactly the point: the class confound is not a vague worry but a quantity you can read off and hold fixed.
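Holding a perceived attribute fixed can be sketched as a simple search for cross-arm pairs within a tolerance. The function name, the 0.05 tolerance, and the two placeholder names with their scores are assumptions for illustration. Only the 2.09 and 2.05 education scores come from the text above.

```python
def matched_pairs(arm_one, arm_two, attribute="education", tolerance=0.05):
    """Return (name_one, name_two, gap) for cross-arm pairs whose perceived
    attribute differs by at most `tolerance`, closest pairs first."""
    pairs = []
    for name_one, profile_one in arm_one.items():
        for name_two, profile_two in arm_two.items():
            gap = abs(profile_one[attribute] - profile_two[attribute])
            if gap <= tolerance:
                pairs.append((name_one, name_two, round(gap, 2)))
    return sorted(pairs, key=lambda pair: pair[2])

# "Placeholder A" and "Placeholder B" and their scores are made up.
arm_one = {"David Hansen": {"education": 2.09}, "Placeholder A": {"education": 2.60}}
arm_two = {"Darnell Jackson": {"education": 2.05}, "Placeholder B": {"education": 1.70}}

pairs = matched_pairs(arm_one, arm_two)
print(pairs)
```

A tighter tolerance trades sample size for comparability. Matching on several attributes at once would need a distance across all of them, which this sketch does not attempt.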

Walk through

Watch a confound contaminate a "race effect"
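The same contamination can be reproduced with a few lines of arithmetic. The sketch below assumes a hypothetical employer who responds only to perceived class and not at all to race. The baseline, slope, and class scores are invented for illustration and are not estimates from any study.

```python
def expected_callback(perceived_class, race_effect=0.0,
                      baseline=0.10, class_slope=0.04):
    """Hypothetical employer: callback probability rises with perceived class.
    race_effect is the true effect of the racial signal (zero by default)."""
    return baseline + class_slope * perceived_class + race_effect

def callback_gap(class_arm_one, class_arm_two, race_effect=0.0):
    """Expected callback gap, arm one minus arm two.
    race_effect is the true effect applied to arm two only."""
    return (expected_callback(class_arm_one)
            - expected_callback(class_arm_two, race_effect))

# Names in arm two read as lower class (made-up scores); true race effect is zero.
unmatched_gap = callback_gap(class_arm_one=2.5, class_arm_two=2.0)

# Same employer, but the two arms use names matched on perceived class.
matched_gap = callback_gap(class_arm_one=2.0, class_arm_two=2.0)

print(round(unmatched_gap, 3), round(matched_gap, 3))
```

In the unmatched design a gap of about two percentage points appears even though the racial signal does nothing. With matched names the gap disappears, because the only channel producing it has been held fixed.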

Who reads whom

Recognition runs one way more than the other

The same survey asked respondents to name each person's race. People read in-group names well, at 85% correct, and out-group names less well, at 71%. The gap also has a direction: non-white respondents read white names correctly 75% of the time, while white respondents read non-white names correctly 68% of the time. W.E.B. Du Bois described this asymmetry in 1903. A minority learns to read the dominant group closely; for the dominant group, reading back is optional.

Recognition rates from Crabtree, Kim, Gaddis, Holbein, Guage, and Marx (2023). Bars show the share of respondents who correctly identified a name's intended race, split by whether the name was in- or out-group and by the respondent's own race.
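For readers who want to reproduce this kind of split, a minimal sketch follows. The keys respondent_race, intended_race, and perceived_race are hypothetical names, and the six responses are made up, so the resulting rates are not the figures reported above.

```python
from collections import defaultdict

def recognition_rates(responses):
    """Share of correct race identifications, split by in-group vs out-group.

    Each response is a dict with hypothetical keys:
    respondent_race, intended_race, perceived_race.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in responses:
        # In-group means the respondent shares the name's intended race.
        key = "in-group" if r["respondent_race"] == r["intended_race"] else "out-group"
        total[key] += 1
        correct[key] += r["perceived_race"] == r["intended_race"]
    return {key: correct[key] / total[key] for key in total}

# Made-up responses for illustration only.
responses = [
    {"respondent_race": "white", "intended_race": "white", "perceived_race": "white"},
    {"respondent_race": "white", "intended_race": "Black", "perceived_race": "Black"},
    {"respondent_race": "white", "intended_race": "Hispanic", "perceived_race": "white"},
    {"respondent_race": "Black", "intended_race": "Black", "perceived_race": "Black"},
    {"respondent_race": "Black", "intended_race": "white", "perceived_race": "white"},
    {"respondent_race": "Black", "intended_race": "Asian", "perceived_race": "Hispanic"},
]
rates = recognition_rates(responses)
print(rates)
```

The directional comparison would need a further split of the out-group responses by the respondent's own race, which this sketch leaves out.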
