The AI behind the promising chatbot had to be trained not to spew sexual and violent content.
ChatGPT, an AI-powered chatbot that garnered widespread attention for its text generation, was partially built using the labor of poorly paid workers in Kenya, TIME magazine has reported. They were hired to read passages of explicit text and label them so the algorithm could learn to avoid this type of language.
According to an exposé published on Wednesday, developer OpenAI contracted a San Francisco-based firm called Sama, which specializes in annotating data for AI training. The work is done by humans in countries like Kenya and Uganda, where wages are lower than in developed nations.
Three Sama contracts with OpenAI, worth about $200,000, were signed in late 2021, TIME reported, citing billing documents. Around three dozen workers in Kenya were tasked with reading graphic descriptions of acts including sexual abuse of children, bestiality, murder, suicide, torture, self-harm, and incest.
Their output, labeled according to the nature of the content, was used by OpenAI to train its AI to police its own language. ChatGPT generates text after learning from billions of words written by humans and available online, including inappropriate material.
According to the report, workers in Kenya were paid a take-home wage of between $1.32 and $2 per hour, depending on their position and performance. Three of them told TIME they were expected to get through between 150 and 250 passages of text, ranging in length from 100 to 1,000 words, in a nine-hour shift.
Last year, TIME reported that Sama did a similar job for Facebook, assisting with the removal of content violating the platform’s rules. In both cases, the magazine said some people were left with “mental scars” after reading the toxic materials.
In early 2022, OpenAI hired Sama for another job, which involved labeling explicit images for a different project. Within weeks, however, the subcontractor pulled out of the deal, apparently because some of the images were illegal under US law. OpenAI blamed miscommunication.
Earlier this month, Sama announced it would no longer work with sensitive content and would instead focus on annotating data for AI computer vision solutions. TIME observed that, for all its glamor, “AI often relies on hidden human labor in the Global South that can often be damaging and exploitative.”