What AI companies really do with your most private conversations

Kirra Pendergast
3 hours ago

In 2014, Elon Musk took the stage at MIT and described the development of artificial general intelligence, AI so advanced it could compete with the most capable human minds, as roughly equivalent to summoning a demon. He wasn't speaking metaphorically. Musk had been watching Google's acquisition of the AI research lab DeepMind, and he was alarmed. The two men most likely to build AGI first, in his assessment, were Google's co-founder Larry Page and DeepMind's Demis Hassabis. He didn't trust either of them with that kind of power.

His answer was not to stop the technology. It was to own it. In December 2015, Musk co-founded OpenAI alongside Sam Altman and eight others from the Silicon Valley elite, initially committing $1 billion to the project. The stated mission was to develop artificial general intelligence for the benefit of humanity, as a nonprofit, insulated from the profit motives that Musk believed would corrupt it. It was, explicitly, a Manhattan Project framing — the world's greatest minds in one place, working not to build a weapon but, theoretically, to defuse one.

The dream lasted three years.

Musk left OpenAI's board in 2018. By 2019, the nonprofit had created a for-profit subsidiary, a structure it needed to raise the money required to compete with Google and Microsoft. Microsoft became OpenAI's first major outside investor at $1 billion, eventually growing its stake to over $13 billion. The original rationale for the nonprofit — keeping powerful AI out of the hands of for-profit corporations — had dissolved, quietly, into exactly the thing it was designed to prevent.

By 2025, OpenAI had restructured again into a for-profit public benefit corporation, completing the arc. Sam Altman described ChatGPT as "a rack of magic," a system so extraordinary that the scale of human effort required to build it was rendered invisible. That invisibility was always the point.

When you type into a chatbot, you are not talking to a machine. You are talking to a machine that was built by, trained on, and continuously refined by human beings, most of them invisible, most of them underpaid, almost none of them acknowledged.

Understanding how a chatbot learns to chat requires unpacking a phrase that AI companies have made deliberately opaque: human-in-the-loop. At the most basic level, a language model cannot learn to hold a conversation without being shown one. Human workers type into early models, demonstrating how dialogue flows, how one person responds, how tone shifts, how a conversation stays alive. But the training material goes much further than that. Companies buy transcripts from human-to-human messaging platforms. Yes, real conversations between real people about real things, scraped for patterns in how humans flirt, joke, de-escalate arguments, and manufacture warmth. The chatbot is, in this sense, a distillation of everything we have said to each other, packaged and sold back to us as intimacy.

One job title that sits inside this ecosystem is chat moderator. The name suggests oversight. The reality is considerably stranger. Chat moderators are employed by business process outsourcing companies, intermediaries that sell "human-in-the-loop outsourcing solutions" to AI developers, and their work involves assuming fabricated identities on digital platforms to generate the kinds of intimate human exchanges that AI companies need. A worker might manage dozens of personas simultaneously, with different names, different genders, and different backstories, all designed to keep users emotionally engaged and financially subscribed. A 2025 report by the Data Workers' Inquiry project documented workers who had to assume multiple false identities, memorise fabricated backstories, and read through previous chat histories to continue conversations mid-flow, seamlessly, so that users would not notice the person on the other end had changed. The objective, as described by one operator, was to keep people locked to the system, because every conversation meant revenue.

The emotional cost of this work has been documented and largely ignored. A 2025 AlgorithmWatch investigation found that 75 per cent of AI gig workers earn less than fifty euros a month, with median earnings between twenty and twenty-five euros. Research from Oxford's Internet Institute described the working conditions facing many data annotators and chat workers as "unfair," noting inadequate pay, no job security, and no mental health support for workers routinely exposed to graphic content. The World Bank estimates there are between 150 and 430 million data labourers globally whose work drives AI development. That is a workforce larger than the population of most countries, and it is almost entirely absent from conversations about AI's extraordinary economic returns.

Nairobi has become one of the densest concentrations of this kind of work anywhere in the world. Kenya's digital economy has grown to more than 2.4 million workers, many of them doing data labelling, content moderation, and AI training annotation for platforms based in the United States and Europe. The work is drawn there by cheap wages and a high proportion of educated English speakers, and the pipeline has had an unexpected linguistic consequence. In 2022, researchers examining the vocabulary of large language models noticed that certain words appeared with unusual frequency. "Delve" was among them, used by AI models at a rate far higher than the average Western English speaker but closely aligned with the average educated African English speaker, whose formal written English was shaped by British colonial educational frameworks. On the biomedical research site PubMed, the use of "delve" increased tenfold in the years after large language models were trained on African-annotated data. Kenyan writer Marcus Olang captured the irony perfectly in a 2025 essay that went viral after he was accused of writing like ChatGPT. His response was: "I don't write like ChatGPT. ChatGPT writes like me." The formal sentence structure, the precision, the avoidance of ambiguity. These were not the fingerprints of a machine. They were the fingerprints of a Kenyan education system, reproduced at scale by a model that had consumed millions of words written by people just like him.

The myth that these systems are highly autonomous collapses entirely under scrutiny. What AI companies have built is not magic. It is an extraordinary aggregation of human labour consisting of teaching, labelling, moderating, and emotional performance, most of it sourced from the Global South, almost none of it fairly compensated, and all of it foundational to the product you are using when you type your most private thoughts into a chat window.

Every major AI company employs human contractors to read user chats, because the models cannot improve without people grading what they produce. A 2025 Business Insider investigation spoke with contractors hired through Scale AI's Outlier platform and a service called Alignerr to review conversations with Meta's AI. They described routinely seeing full names, phone numbers, email addresses, selfies, and explicit photographs. One contractor estimated that personally identifiable information turned up in more than half the thousands of chats they reviewed each week.

The anonymisation that companies promise was, in practice, often absent. In one documented instance, a contractor was able to find a specific user's full name, city, hobbies, and travel interests in the third result of a single Google search. The chat had been handed to them by Meta alongside that user's personal data, because Meta builds user profiles to help its AI deliver more personalised responses. What the user understood as a private conversation was, in fact, a labelled data record that had been routed to a gig worker in another country.

A 2025 study from Stanford's Institute for Human-Centered AI examined all six major U.S. AI companies and found that every one of them processes user conversation data by default. Several retain chat logs indefinitely. Google's own Gemini interface tells users explicitly, in writing, not to enter anything they wouldn't want a human to review, and human-reviewed Gemini conversations are kept for up to three years, even after a user deletes their chat history. Meta AI, according to a 2026 privacy audit, collects data across 32 of 35 possible data categories, more than any other major chatbot on the market.

Google, Meta, and Microsoft all openly draw on data from their wider product ecosystems to refine their models. Amazon retains the right to store user interactions indefinitely, citing operational and legal requirements. The opt-outs exist, but they are buried three menus deep in settings most users never open, and even platforms that offer them continue to hold conversation data for thirty days or more for abuse monitoring. For most people, on most platforms, there is no meaningful way to unsay what has been said.

Then there is the case that should end any remaining comfort.

On 10 February 2026, eighteen-year-old Jesse Van Rootselaar killed eight people, five of them children aged twelve and thirteen, at a school and a private home in Tumbler Ridge, British Columbia. In the months preceding the attack, Van Rootselaar had used ChatGPT to discuss violent scenarios involving firearms. OpenAI's automated systems detected it. A combination of automated tools and human reviewers flagged the account for "furthering violent activities," and the account was banned in June 2025, eight months before the shooting. Approximately a dozen OpenAI employees debated internally whether to contact law enforcement. They decided that the content, at that moment, did not meet their threshold for "imminent and credible risk." They said nothing. Van Rootselaar opened a second account. OpenAI shared information about both accounts with law enforcement only after the massacre had already happened.

Seven families filed federal lawsuits in San Francisco in April 2026. They allege not only that OpenAI failed to contact authorities, but that ChatGPT itself is "a defective product," one that did not challenge the shooter, did not direct him toward real-world help, and effectively served as a rehearsal space for violence. In April 2026, Sam Altman issued a written apology to the Tumbler Ridge community, acknowledging that the company "should have alerted law enforcement." British Columbia's Premier, David Eby, called the apology "necessary" and "grossly insufficient." Both things are true.

Sit with the architecture of that for a second. The same infrastructure that reads a widow's grief and a teenager's secret crush is the infrastructure that read a young man rehearsing a massacre and decided it wasn't their problem. The intimacy and the failure run on identical rails.

Children sit at the worst intersection of all of this. They are the most emotionally susceptible, the least legally protected, and the demographic most aggressively targeted by the design features that make these systems compelling. When I ask a room full of students who has used a chatbot, most raise their hands.

A 2025 study from Toronto Metropolitan University found that the human-like interactions enabled by generative AI are specifically encouraging young people to trust chatbots and disclose personal information about their lives, behaviours, and relationships, data that is then at risk of collection, manipulation, and exploitation. The study identified what researchers call "dark patterns": design tactics that reward disclosure with increased intimacy, that unlock more interactive features in exchange for secrets, and that are particularly effective precisely because tweens and teens cannot yet reliably distinguish between affection and engineering.

The dominant sales pitch for AI companions is that they address loneliness. The research says otherwise.

A joint OpenAI and MIT Media Lab study found that voice interactions with ChatGPT reduced loneliness modestly at low doses, but that heavy daily use correlated with increased loneliness over time.

A Harvard Business School study found that an AI companion alleviated feelings of loneliness to a degree comparable to talking to a human. Researchers noted this as a finding. It should also be read as a warning. If a simulation of connection is indistinguishable from connection itself, the conditions are in place for millions of people to retreat from the difficulty of human relationships into something that feels equivalent but offers nothing real in return.

The EU AI Act came into force in 2024 and classifies chatbots as "limited risk" systems subject to transparency requirements. High-risk provisions are not scheduled to apply until August 2026, and there are active proposals to delay even that. The UK announced stricter chatbot regulations under online safety laws in February 2026, including minimum age requirements and restrictions on minors accessing AI companionship products. Canada was positioned to introduce meaningful oversight through Bill C-27, legislation that would have required companies to monitor and explain the risks their AI systems pose, but the bill died when Parliament was prorogued in early 2025, leaving no federal AI accountability framework in place at the moment a Canadian teenager used ChatGPT to rehearse a mass shooting. The United States still has no comprehensive federal AI regulation. However, U.S. Senate Commerce Committee Chairman Ted Cruz (R-Texas) and Senator Brian Schatz (D-Hawai‘i) introduced the CHATBOT Act in late April 2026, legislation that would put parents, not Big Tech, in charge of how children interact with AI chatbots and require stronger safety safeguards for young users. China's approach, imperfect and state-driven as it is, is the most interventionist: companies must explain how their systems work, warn users against excessive use, intervene when users show signs of addiction, and ban under-18s from interacting with "digital humans" outright. It is a framework shaped by authoritarian instincts, but some of its practical protections are ones democratic governments have not managed to implement.

The gap between what these systems collect, what they enable, and what accountability exists for the outcomes is not a gap that will close itself.

The loneliness underneath all of this is real. The grief is real. The desire for something that listens without judgment and never tires of you is real, and it deserves to be taken seriously rather than dismissed. But there is a meaningful difference between taking loneliness seriously and selling a simulation of its remedy to the people most desperate for relief, while harvesting their disclosures, eroding their capacity for human connection, and building no meaningful infrastructure for when someone on the other end of the chat starts talking about violence. Sam Altman called it a rack of magic. It is not magic. It is human labour, human grief, and human vulnerability, in an interface designed to make you forget that fact.

Safe on Social
