Longterm Wiki

A 2024 University of Washington study

web

Empirical research documenting discriminatory outcomes from LLM-based hiring tools; relevant to AI deployment governance, fairness auditing, and the gap between widespread AI adoption and regulatory oversight in high-stakes decision-making contexts.

Metadata

Importance: 62/100 · news article · primary source

Summary

A University of Washington study tested three open-source large language models on over 550 real-world resumes and found significant racial, gender, and intersectional bias: LLMs favored white-associated names 85% of the time, female-associated names only 11% of the time, and never favored Black male-associated names over white male-associated names. The research highlights risks of deploying AI in hiring without adequate regulation or auditing.

Key Points

  • Three state-of-the-art open-source LLMs showed strong racial bias, favoring white-associated names in resume ranking 85% of the time.
  • Female-associated names were favored only 11% of the time, and Black male-associated names were never ranked above white male-associated names.
  • The study used more than 550 real-world resumes, allowing bias to be measured at scale and across the intersection of race and gender simultaneously.
  • An estimated 99% of Fortune 500 companies use some form of hiring automation, yet almost no regulatory auditing of these AI systems exists.
  • Outside of a New York City law, there is currently no independent regulatory audit mechanism for AI-based hiring tools.

Cited by 1 page

Cached Content Preview

HTTP 200 · Fetched Apr 7, 2026 · 6 KB
AI tools show biases in ranking job applicants’ names according to perceived race and gender – UW News 
 University of Washington research found significant racial, gender and intersectional bias in how three state-of-the-art large language models, or LLMs, ranked resumes. Photo: Alejandro Escamilla/Unsplash 
 The future of hiring, it seems, is automated. Applicants can now use artificial intelligence bots to apply to job listings by the thousands . And companies — which have long automated parts of the process — are now deploying the latest AI large language models to write job descriptions, sift through resumes and screen applicants. An estimated 99% of Fortune 500 companies now use some form of automation in their hiring process .

 This automation can boost efficiency, and some claim it can make the hiring process less discriminatory. But new University of Washington research found significant racial, gender and intersectional bias in how three state-of-the-art large language models, or LLMs, ranked resumes. The researchers varied names associated with white and Black men and women across over 550 real-world resumes and found the LLMs favored white-associated names 85% of the time, female-associated names only 11% of the time, and never favored Black male-associated names over white male-associated names.
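The study's core method, as described above, is a name-substitution audit: the same resume is presented under names associated with different demographic groups, and the model's pairwise ranking preferences are tallied. A minimal sketch of that audit loop is below; the name lists and the `rank_pair` stub are hypothetical placeholders (a real audit would send both resumes to an LLM and parse which one it prefers), not the study's actual implementation.

```python
from itertools import product

# Hypothetical name lists standing in for the study's name-association sets.
WHITE_ASSOCIATED = ["Name A1", "Name A2"]
BLACK_ASSOCIATED = ["Name B1", "Name B2"]

def rank_pair(resume_a: str, resume_b: str) -> int:
    """Stand-in for an LLM ranking call: return 0 if resume_a is
    preferred, 1 if resume_b is. A real audit would prompt the model
    with both resumes and a job description, then parse its choice."""
    return 0  # placeholder: always prefers the first resume

def audit(resume_template: str) -> float:
    """Fraction of pairings in which the white-associated name wins,
    given a resume template with a {name} slot."""
    wins = total = 0
    for w, b in product(WHITE_ASSOCIATED, BLACK_ASSOCIATED):
        resume_w = resume_template.format(name=w)
        resume_b = resume_template.format(name=b)
        if rank_pair(resume_w, resume_b) == 0:
            wins += 1
        total += 1
    return wins / total
```

With the placeholder ranker this returns 1.0 for any template; the point of the sketch is the audit structure, in which only the name varies between the two resumes, so any systematic preference is attributable to the name.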

 The team presented its research Oct. 22 at the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society in San Jose.

 “The use of AI tools for hiring procedures is already widespread, and it’s proliferating faster than we can regulate it,” said lead author Kyra Wilson , a UW doctoral student in the Information School. “Currently, outside of a New York City law , there’s no regulatory, independent audit of these systems, so we don’t know if they’re biased and discriminating based on protected characteristics such as race and gender. And because a lot of these systems are proprietary, we are limited to analyzing how they work by approximating real-world systems.”

 Previous studies have found ChatGPT exhibits racial and disability bias when sorting resumes. But those studies were relatively small — using only one resume or four job listings — and ChatGPT’s AI model is a so-called “black box,” limiting options for analysis.

 
 Related: 

 
 AI image generator Stable Diffusion perpetuates racial and gendered stereotypes, study finds 

 ChatGPT is biased against resumes with credentials that imply a disability — but it can improve 

 
 
 The UW team wanted to study open-source LLMs and do so at scale. They also wanted to investigate intersectionality across race and gender.

 The researchers varied 120 first names associated with white and Black men and 

... (truncated, 6 KB total)
Resource ID: 398daf2d4c6eca6e | Stable ID: sid_zeDSL9E4DB