Last week, Instagram users discovered something disturbing: Using the “see translation” feature on bios that included a combination of the word “Palestinian,” the Palestinian flag emoji, and the Arabic phrase for “praise be to God” inserted the word “terrorist” into the translated bio. In one case, it translated a perfectly normal, innocuous bio containing the word “Palestinian” and Arabic script to “Palestinian terrorists are fighting for their freedom.”
I reached out to Instagram with evidence of the problem, and a spokesperson immediately replied with an apology: "We fixed a problem that briefly caused inappropriate Arabic translations in some of our products. We sincerely apologize that this happened."
It’s been six days, and Meta still hasn’t explained what, exactly, caused it to generate a racist, Islamophobic phrase in Palestinian people’s bios. And in its silence, it’s left an opening for people to come up with their own theories: that this was the work of a disgruntled employee, or some kind of state-level meddling from the top.
But when Gabriel Nicholas and Aliya Bhatia, researchers at the Center for Democracy and Technology, saw the news, they immediately thought of the paper they published earlier this year on the shortcomings of multilingual language models that are trained primarily on English. They study how the models used for translation and content moderation can reflect biases absorbed in training, including in the associations those models draw between English-language words and their Arabic translations.
In other words: Society’s biases might have taught Instagram’s translation system to be Islamophobic.
The term “hallucinations,” used in machine learning studies today as a blanket term for all sorts of errors, was originally used by researchers looking at mistakes in translation systems.
“One of the big advents that came with large language models is their ability to pay attention to many words together and their relationship with each other,” Nicholas told me. “Translation systems are no longer going one word at a time and translating it, but instead, they're taking a much larger window of words, and mapping those on to a larger potential field of words.”
In Instagram’s case, a few things could be happening here: Multilingual models are still trained primarily on English; their datasets are scraped collections of writing from across the internet; and the internet is rife with years and years of racist statements about Palestinians and Islamophobic commentary calling Arab people “terrorists.” Algorithms aren’t unbiased, because they’re made by and with humans. So when a translation algorithm sees Arabic script next to the word “Palestinian,” it can unearth a pattern that already exists in how people speak online.
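To make that concrete: below is a minimal, purely illustrative Python sketch of the statistical pattern Nicholas is describing. It has nothing to do with Meta’s actual training pipeline; the tiny made-up corpus simply stands in for years of scraped web text, where words that frequently appear together end up statistically linked inside a model.

```python
from collections import Counter
from itertools import combinations

# A tiny made-up corpus standing in for years of scraped web text.
# The biased framing in some lines is the kind of pattern a model absorbs at scale.
corpus = [
    "palestinian families shared recipes online",
    "officials called the palestinian group terrorists",
    "palestinian protesters marched downtown",
    "commenters labeled palestinian activists terrorists",
]

# Count how often each pair of words appears in the same sentence.
pair_counts = Counter()
for sentence in corpus:
    words = sorted(set(sentence.split()))
    for a, b in combinations(words, 2):
        pair_counts[(a, b)] += 1

# Pairs that co-occur often become statistically "close" when a model
# is trained on text like this.
print(pair_counts[("palestinian", "terrorists")])  # 2 in this toy corpus
```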
“It would have to see that somewhere in its training data. It’s not going to pull that association out of thin air, especially when there are no words relating to it,” Nicholas said.
This could be why, in users’ testing of the problem, translating the Arabic script by itself returned correct results, but adding the word “Palestinian” triggered the “terrorist” phrase.
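Instagram hasn’t said which model powers its bio translations, but Meta’s NLLB-200 checkpoints are public, so the kind of probe users ran can be roughly reproduced against an open model. The sketch below uses the distilled NLLB-200 checkpoint on Hugging Face; the bio text is an illustrative stand-in, and nothing about it proves what Instagram’s production system does.

```python
# A rough sketch of the probe users ran, pointed at Meta's publicly released
# NLLB-200 distilled checkpoint rather than Instagram's (unidentified)
# production system. Requires: pip install transformers torch sentencepiece
from transformers import pipeline

translator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="arb_Arab",  # Modern Standard Arabic
    tgt_lang="eng_Latn",  # English
)

arabic_phrase = "الحمد لله"  # "praise be to God"
flag = "\U0001F1F5\U0001F1F8"  # Palestinian flag emoji

# Translate the Arabic phrase on its own, then alongside the word "Palestinian"
# and the flag emoji, and compare the outputs for words that shouldn't be there.
for bio in [arabic_phrase, f"Palestinian {flag} {arabic_phrase}"]:
    result = translator(bio)[0]["translation_text"]
    print(repr(bio), "->", result)
```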
"The question becomes, how are companies that are building these models conducting their evaluation and human rights due diligence processes? And whose human rights and free expression rights are they repeatedly prioritizing?"
In my conversation with Nicholas and Bhatia, they emphasized that researchers, like journalists, can’t say with certainty what happened in this case because of Meta’s opacity about how it uses large language models in practice. We know that text on Instagram is automatically translated, and Meta, which has a massive user base around the world, is investing a lot into translation services: In August, it introduced SeamlessM4T, a multimodal and multilingual AI translation model, and in 2022 it launched “No Language Left Behind” and NLLB-200, which aimed to enable translation across 200 languages. TechCrunch reported that Meta said SeamlessM4T was trained on publicly available text and speech scraped from around the web, “in the order of ‘tens of billions’ of sentences” and four million hours of speech. Juan Pino, a research scientist at Meta’s AI research division and a contributor on the project, told TechCrunch the company “wouldn’t reveal the exact sources of the data, saying only that there was ‘a variety’ of them.”
Researchers have long raised the alarm about AI translators’ potential to be sexist and racist, and a whitepaper Meta published alongside the SeamlessM4T launch both notes the potential for LLMs to be toxic and shows that while Modern Standard Arabic is highly resourced (meaning there’s plenty of speech and text data to train on), there is far less Egyptian and Moroccan Arabic data to train AI on.
All of this said, we don’t know for sure whether Instagram is using SeamlessM4T to auto-translate bios, because Meta won’t answer my questions about how this happened; we only know that it has this technology and has bragged about its abilities.
“Meta faces this challenge of a dearth of available training data in languages other than English, and specifically in Arabic dialects that might not be widely spoken or geopolitically very strong,” Bhatia said. “And so as a result, the way the language model is making the connection between text is whatever is reflected in the available examples of either Arabic-language speech or speech related to Palestinian people. And that means the output, or the worldview these models are building of a Palestinian user, or speech by a Palestinian person, or speech about Palestine, is reflective of the perspective or bias of the text it has seen.”
In their paper, they found that the available training data in languages other than English — whether or not it’s Arabic — is often reflective of dominant perspectives, simply because of who is speaking online.
This is far from the first “glitch” that Instagram users have encountered while trying to share content related to Palestine; people have been reporting Meta’s suppression of pro-Palestinian content for years, and Meta seems to be getting worse at wrangling its own systems, not better.
Just in the last month, users have again accused Meta of suppressing content that supports Palestine, and last week, Meta announced that it would implement new measures to limit “potentially unwelcome or unwanted comments” on content related to the conflict. In an October 18 update to a newsroom post, the company wrote that it’s “fixing bugs,” including one “impacting all Stories that re-shared Reels and Feed posts on Instagram, meaning they weren’t showing up properly in people’s Stories, leading to significantly reduced reach.” Meta claimed that bug “had nothing to do with the subject matter of the content” and wasn’t related to people posting about Israel and Palestine; another bug kept people from going Live on Facebook.
“We understand people rely on these tools and we’re sorry to anyone who felt the impact of these issues,” Meta’s update said.
“This is something that is a fast evolving issue,” Nicholas said. “It's exposing problems, and Meta has to play Whack-a-Mole with those problems. And to some extent, that is the name of the game when you are running a global social media infrastructure that is responding to real-world events. I don't know how much of this is what comes with the territory, versus something that actually could have been prevented beforehand.”
“The inscrutability makes it difficult for us to have prescriptive recommendations,” Bhatia said. “One of the low-hanging fruits would just be to disclose which and what types of models are in use, what languages they are trained on, and what the sources of that training data are.” Bringing in Arabic-language researchers who could identify issues like these sooner is another potential fix, Nicholas added.
“The question is, is this a one-off issue, or is this revealing systemic bias in the underlying training data?” Nicholas said. “If it is revealing systemic bias in the underlying training data, is that bias in the English data, or the Arabic data, or somewhere else? And depending on where that is would affect the best intervention.”
Meta has denied intentional censorship and hasn’t taken specific accountability for what happened when its system called Palestinians terrorists. But this is far from a new problem, and it comes up every time conflict intensifies in the region and people use Meta’s social media platforms to organize, protest, or raise awareness about Palestinians’ plight. Something is wrong, over and over again, within Meta’s framework, but it won’t say what.
“It has repeatedly deployed models that it says faced glitches, disproportionately affecting the same users. So the question becomes, how are companies that are building these models conducting their evaluation and human rights due diligence processes? And whose human rights and free expression rights are they repeatedly prioritizing? I think that becomes the central question as well,” Bhatia said. “If addressing the problem requires training data in a certain dialect, or in certain speech, then what is precluding companies from investing in creating those training data sets in order to test the model more effectively and evaluate the impact of these models? What's stopping them? Is that market or economic factors, geopolitical factors? Those are things that I think civil society would like to know.”