Language Barriers in Social Media: Who Gets Amplified and Who Gets Silenced
Social media platforms were built by English-speaking companies with English-speaking engineers for English-speaking users. Five billion people now use them, speaking 7,000 languages. The infrastructure — the algorithms, the moderation systems, the recommendation engines — still runs on English logic. The consequences for everyone else range from invisibility to danger.
Algorithms Don't Speak Your Language
Social media recommendation algorithms learn what to amplify by training on historical engagement data. For most of the major platforms' developmental history, the majority of content and engagement data was in English. The result is systems that are best calibrated to understand, classify, and amplify English-language content — and that handle other languages less reliably.
The consequences are not always dramatic — sometimes they're just quiet inequity. A Tagalog-language creator producing the same quality content as an English-language creator may receive lower algorithmic reach because the platform's NLP systems extract engagement signals less accurately from Tagalog text. A tweet in Amharic that would be recognized as political commentary in English may be classified differently. A video in Bengali that would surface as newsworthy in English might not.
Research on TikTok's algorithm, examined during congressional testimony in 2023, suggested that content in languages other than English received systematically lower algorithmic recommendations even when engagement rate metrics were comparable — a pattern that TikTok's engineers attributed to differences in how the recommendation system processed audio and text features across languages.
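The dynamic described above can be illustrated with a toy sketch (not any platform's real system): an engagement-signal extractor whose keyword lexicon was built from English training data scores English posts accurately but extracts nothing from equivalent content in another language, so identical-quality posts rank differently. All names and data here are invented for illustration.

```python
# Toy illustration only: a "topical signal" extractor trained on English.
# Real recommendation systems use learned models, but the failure mode is
# analogous -- features learned from English data don't fire on other languages.

ENGLISH_TRAINED_LEXICON = {"breaking", "election", "vote", "protest"}

def topical_score(post: str) -> int:
    """Count lexicon hits; a stand-in for NLP feature extraction."""
    tokens = post.lower().split()
    return sum(1 for t in tokens if t in ENGLISH_TRAINED_LEXICON)

english_post = "Breaking news about the election vote count"
tagalog_post = "Mahalagang balita tungkol sa bilang ng boto sa halalan"  # same meaning

print(topical_score(english_post))  # 3 hits: breaking, election, vote
print(topical_score(tagalog_post))  # 0 hits: the lexicon never saw Tagalog
```

The two posts carry the same topical signal to a human reader, but the English-trained extractor sees signal only in one of them, which is the quiet inequity the paragraph above describes.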
Content Moderation: The Language Gap That Enables Violence
The most serious consequence of language barriers in social media is not about reach or engagement — it's about safety. Content moderation, which determines what gets removed, reduced, or labeled, has dramatically different quality levels across languages. The difference between a well-moderated linguistic community and a poorly moderated one can be the difference between contained rumor and mass violence.
The most extensively documented case is Myanmar. Facebook's own internal research, made public by whistleblower Frances Haugen in 2021, acknowledged that the platform had one Burmese-language content moderator per approximately 1.4 million Burmese Facebook users before and during the 2017 Rohingya crisis — a period when anti-Rohingya hate speech and incitement to violence circulated freely because no moderation capacity existed to identify or remove it. UN investigators subsequently concluded that Facebook played a "determining role" in spreading the hateful content that contributed to violence that killed thousands.
Myanmar is the most publicized case, but it is not unique. Internal Facebook documents reviewed by Haugen and analyzed by multiple news organizations showed that the company had identified 40+ "at risk" countries with significant Facebook usage and inadequate local-language moderation — countries where hate speech, incitement, and coordinated inauthentic behavior could proliferate because moderation systems couldn't process the languages in which they occurred.
"We are not able to effectively identify and remove hate speech in over 40 languages in countries with significant potential for civil unrest or violence. Our systems simply don't work at adequate quality in those languages."
— Internal Facebook research document (2021, disclosed by Frances Haugen)
Misinformation: It Travels Fastest Where Moderation Is Weakest
Health misinformation, political misinformation, and conspiracy theories spread through social media along the path of least resistance. In English-language environments, that resistance includes fact-checking partnerships, automated detection systems trained on large English-language datasets, and human review teams of meaningful scale relative to the content volume. In non-English environments, most of those protections are significantly weaker.
During the COVID-19 pandemic, the WHO and academic researchers documented that health misinformation spread faster and persisted longer in South and Southeast Asian languages, African languages, and Latin American Spanish variants than in English — not because English speakers are more discerning, but because the moderation infrastructure was more developed for English. False claims about COVID transmission, vaccine safety, and treatment that were removed in English within hours circulated in Hindi, Bahasa Indonesia, and Swahili for days or weeks.
The same pattern appeared during elections. The Oxford Internet Institute's Computational Propaganda Project documented that election misinformation — including coordinated inauthentic behavior and false claims about voting — circulated with longer persistence and less removal in non-English election contexts than in US, UK, and Australian elections where platform resources were concentrated.
The fact-checking gap: Major platforms partner with third-party fact-checkers to identify and label false content. These partnerships are concentrated in English. A 2022 audit by the Reuters Institute found that fact-checking coverage in English was 8-15× denser, relative to content volume, than coverage in most non-English languages. The languages with the most misinformation exposure — those spoken in high-growth markets with limited digital media literacy infrastructure — have the fewest fact-checking resources.
The Viral English Export
When content goes viral in English, it frequently crosses language barriers — platform translation features make English content accessible to non-English users, and the global reach of English-language creators means their content often seeds discussions in communities that don't share the creator's language or cultural context.
The reverse is much rarer. Viral content originating in non-English communities typically stays within those linguistic communities unless it is specifically picked up by English-language accounts and re-shared with English translation and framing. The gatekeeping function — the decision about which non-English content crosses into English-language viral circulation — is exercised primarily by English-speaking journalists, influencers, and platforms, who apply their own selection criteria to what gets amplified globally.
The result is a systematic distortion of what global audiences understand as globally important. Events that go viral in English become part of global consciousness; equally significant events that go viral in other languages often don't. The global social media conversation reflects English-language concerns, English-language framing, and English-language selection of what matters — regardless of what is actually happening in the majority of the world's communities.
Creator Economy Language Discrimination
Social media platforms have built monetization infrastructure — the Creator Fund, monetized ads, tip features, Super Follows — that offers income to creators with large, engaged audiences. The distribution of this monetization opportunity follows language lines: English-language creators have access to advertising markets worth many times the revenue available in most non-English advertising markets.
A TikTok creator with 1 million followers in the US earns substantially more from platform monetization than a creator with 1 million followers posting in Tagalog, Bengali, or Hausa — because advertisers pay significantly more to reach audiences in markets where average consumer purchasing power is higher, and those markets are predominantly English-language.
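The earnings gap follows directly from advertising arithmetic. The sketch below uses hypothetical CPM figures (cost per thousand views), not actual platform rates, to show how the same audience size yields very different income across advertising markets.

```python
# Illustrative arithmetic only; the CPM values are hypothetical examples,
# not real platform or market rates.

def monthly_earnings(monthly_views: int, cpm_usd: float) -> float:
    """Ad revenue = (views / 1000) * CPM."""
    return monthly_views / 1000 * cpm_usd

views = 2_000_000  # same audience size for both creators

print(monthly_earnings(views, 8.00))  # hypothetical high-CPM market -> 16000.0
print(monthly_earnings(views, 0.50))  # hypothetical low-CPM market  -> 1000.0
```

With identical reach, a sixteen-fold CPM difference produces a sixteen-fold income difference, which is the incentive structure the next paragraph describes.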
This creates a financial incentive for creators from non-English-speaking backgrounds to create content in English rather than their native language — contributing to the English dominance of platform content even among creators for whom English is not a native language. The platform's monetization structure effectively subsidizes content creation in English and taxes content creation in other languages.
Cross-Language Connection: What's Actually Possible
Despite the structural barriers, social media has created cross-language connections that would have been impossible before. Fan communities for K-pop, anime, and Brazilian music operate across language barriers through community-created translation networks — fans who translate posts, subtitles, and conversations so that non-Korean, non-Japanese, and non-Portuguese speakers can participate. These communities have developed informal expertise in cross-language connection that platform features don't yet match.
Platform-native translation features have improved significantly in recent years. Twitter's "Translate Tweet" feature, Instagram's automatic post translation, and TikTok's subtitle translation have made it easier to consume content across languages passively — reading or watching content in translation without needing to actively find a translator. The gap between passive consumption and active participation remains: understanding a post in translation is different from responding to it in a way that feels natural and is likely to get a genuine response.
Real-time translation tools that work alongside social platforms — enabling users to compose and respond in their own language, with translation handled in-app — represent the next layer of cross-language connection. A user who can read a post in Korean, write a genuine response in English, see the English translated back to Korean for the original poster, and engage in a real back-and-forth conversation across the language barrier has a qualitatively different experience of social media than one who can only passively consume translated content.
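The round-trip flow described above can be sketched minimally. The `translate` function here is a placeholder backed by a tiny hard-coded phrase table; a real app would call a machine-translation service, and the phrases are invented examples.

```python
# Hypothetical sketch of a cross-language reply loop. The phrase table is a
# stand-in for a real machine-translation API call.

PHRASE_TABLE = {
    ("ko", "en"): {"안녕하세요": "Hello"},
    ("en", "ko"): {"Nice to meet you": "만나서 반갑습니다"},
}

def translate(text: str, src: str, dst: str) -> str:
    """Look up a translation; fall back to the original text if unknown."""
    return PHRASE_TABLE[(src, dst)].get(text, text)

# 1. The reader sees a Korean post rendered in English.
post_ko = "안녕하세요"
print(translate(post_ko, "ko", "en"))   # Hello

# 2. The reader replies in English; the reply reaches the poster in Korean.
reply_en = "Nice to meet you"
print(translate(reply_en, "en", "ko"))  # 만나서 반갑습니다
```

Each participant reads and writes only in their own language; the translation layer carries the exchange in both directions, which is the qualitative difference from passive consumption.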
Platform Accountability and the Language of Policy
Social media platform policies — Terms of Service, Community Guidelines, content enforcement rules — are written in English and translated into other languages with varying levels of fidelity. When a user's content is removed, the notification they receive explaining why is typically in the platform's official language for their region. The appeals process, if one exists, operates in the platform's language.
This means that the ability to understand why content was removed, to contest a moderation decision, and to navigate a platform's dispute resolution process is itself language-gated. Users who can read and write in English tend to achieve better appeal outcomes than users who cannot, because they can articulate their case more precisely in the language the platform's systems are designed to process.
Regulatory bodies attempting to hold platforms accountable face the same language barriers. The EU's Digital Services Act, the UK's Online Safety Act, and regulatory frameworks being developed across Asia and Africa require platforms to document and report on content moderation practices. The regulatory conversations, the compliance documentation, and the enforcement proceedings all take place in languages that are not the primary languages of most users affected by platform decisions.
Frequently Asked Questions
How do social media algorithms treat non-English content differently?
Why is content moderation worse in non-English languages?
How does language affect the spread of misinformation on social media?
Can social media connect people across language barriers?
Connect across every language barrier — not just the ones platforms help with
Babel enables real-time conversation across 100+ languages. Because the connection you want shouldn't depend on which language you happen to speak.
Download Babel Free