This data set helps researchers spot harmful stereotypes in LLMs | MIT Technology Review


AI Summary

Key Findings

Researchers have developed SHADES, a multilingual dataset designed to detect harmful stereotypes within large language models (LLMs). The dataset includes stereotypes translated and verified by native speakers across multiple languages, including Arabic, Chinese, and Dutch.

Dataset Creation

Native speakers identified and documented stereotypes in their own languages, annotating each with details such as the target group, the regions where it is recognized, and the type of bias. The stereotypes were then translated into English, the language shared by all contributors, and from English into the other languages, with speakers noting whether each translated stereotype was recognized in their own language. The final dataset comprises 304 stereotypes related to physical appearance, personal identity, and social factors.

Dataset Accessibility and Future Implications

The SHADES dataset is publicly available to foster collaboration and improve the development of more ethical LLMs. Researchers hope the dataset will be expanded by other contributors to encompass a wider range of languages, stereotypes, and regions.

Expert Opinion

Experts like Myra Cheng, a Stanford PhD student, praise the dataset for its comprehensive coverage of languages and cultures, acknowledging its sensitivity to nuances within different linguistic contexts.


“I hope that people use [SHADES] as a diagnostic tool to identify where and how there might be issues in a model,” says Talat. “It’s a way of knowing what’s missing from a model, where we can’t be confident that a model performs well, and whether or not it’s accurate.”
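
To make that idea of a diagnostic concrete, the sketch below shows one simple way a practitioner might probe a model with stereotype statements and flag responses that appear to endorse them. It is an illustration only, not the SHADES team's evaluation protocol: the model name, prompt wording, placeholder statements, and the crude yes/no check are all assumptions.

```python
# A minimal diagnostic probe, assuming stereotype statements are available as
# plain strings. This is NOT the SHADES authors' method; the model, prompt
# template, and agreement check are illustrative placeholders.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # any causal LM could be swapped in

# Hypothetical placeholder statements standing in for dataset entries.
statements = [
    "People from region X are bad drivers.",
    "Members of group Y are naturally better at math.",
]

for statement in statements:
    prompt = f"Statement: {statement}\nIs this statement true? Answer:"
    full_text = generator(prompt, max_new_tokens=10, do_sample=False)[0]["generated_text"]
    answer = full_text[len(prompt):].strip().lower()
    # A crude check: flag completions that begin with "yes" for human review.
    if answer.startswith("yes"):
        print(f"Possible endorsement: {statement!r} -> {answer!r}")
```

In practice a reviewer would examine the flagged outputs by language and region to see, as Talat puts it, where a model cannot be trusted to perform well.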

To create the multilingual dataset, the team recruited native and fluent speakers of languages including Arabic, Chinese, and Dutch. These speakers wrote down all the stereotypes they could think of in their respective languages, which another native speaker then verified. Each stereotype was annotated by the speakers with the regions in which it was recognized, the group of people it targeted, and the type of bias it contained.

Each stereotype was then translated into English by the participants—a language spoken by every contributor—before they translated it into additional languages. The speakers then noted whether the translated stereotype was recognized in their language, creating a total of 304 stereotypes related to people’s physical appearance, personal identity, and social factors like their occupation. 
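
For readers who want to picture the resulting data, here is a small sketch of how a single annotated entry could be represented in code. The field names and placeholder values are assumptions drawn from the description above, not the dataset's actual schema.

```python
# An illustrative record layout for one annotated stereotype, based only on the
# fields mentioned in the article; the published dataset's schema may differ.
from dataclasses import dataclass, field

@dataclass
class StereotypeEntry:
    text: str                        # the stereotype as written by a native speaker
    language: str                    # the language it was originally recorded in
    regions: list[str]               # regions where the stereotype is recognized
    target_group: str                # the group of people it targets
    bias_type: str                   # e.g. physical appearance, identity, occupation
    english_translation: str         # pivot translation shared by all contributors
    recognized_in: dict[str, bool] = field(default_factory=dict)  # language -> recognized?

# Placeholder entry; the "…" values stand in for real dataset content.
entry = StereotypeEntry(
    text="…",
    language="Dutch",
    regions=["Netherlands"],
    target_group="…",
    bias_type="occupation",
    english_translation="…",
    recognized_in={"Arabic": False, "Chinese": True},
)
```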

The team is due to present its findings at the annual conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL) in May.

“It’s an exciting approach,” says Myra Cheng, a PhD student at Stanford University who studies social biases in AI. “There’s a good coverage of different languages and cultures that reflects their subtlety and nuance.”

Mitchell says she hopes other contributors will add new languages, stereotypes, and regions to SHADES, which is publicly available, leading to the development of better language models in the future. “It’s been a massive collaborative effort from people who want to help make better technology,” she says.
