AI Chatbots Show Bias Against Vulnerable User Groups
- •MIT researchers find LLMs provide less accurate information to non-native English speakers and less-educated users.
- •Anthropic’s Claude 3 Opus exhibited condescending or patronizing language in 44% of refusals to certain demographics.
- •Geographic bias led models to refuse science queries for users from Iran despite knowing the answers.
Large language models like GPT-4 and Llama 3 are often marketed as tools to democratize global information access, yet new research from MIT’s Center for Constructive Communication suggests a troubling reality. The study indicates that these AI systems systematically underperform when interacting with users who have lower English proficiency or less formal education. By testing models against datasets like TruthfulQA and SciQ, researchers found that accuracy drops significantly when a user's profile suggests a vulnerable background.
The disparity isn't just about accuracy; it's about the nature of the interaction itself. For instance, Claude 3 Opus was found to adopt a patronizing or mocking tone toward less-educated users in over 40% of its refusal responses. In some instances, the model even mimicked "broken English," reflecting deep-seated sociocognitive biases often found in human society. This behavior suggests that the alignment process—the way researchers fine-tune models to be helpful and safe—may unintentionally incentivize withholding information from specific groups.
Geographic bias also played a significant role, with models frequently refusing to answer factual questions for users identified as being from Iran or Russia. These "targeted refusals" occurred even when the models provided correct answers to the exact same questions for Western users. As AI features like persistent memory become standard, these findings highlight a critical risk: instead of closing the knowledge gap, AI could inadvertently exacerbate existing societal inequities by providing subpar or filtered information to those who need it most.