OpenAI has begun a project to train its next-generation models on data derived from real human work, in an effort to make its artificial intelligence systems more capable at practical tasks. The initiative, detailed in a report by Wired, aims to establish a benchmark for measuring AI performance against human proficiency in everyday professional tasks.
The Strategy: Mimicking Real-World Human Work
According to the report, OpenAI is working with the training data firm Handshake AI to gather this data. Third-party contractors supply examples of genuine work they have performed in past or present jobs. The focus is on capturing two components of each task: the initial request from a manager or colleague, and the final deliverable produced in response.
An internal presentation cited by Wired instructed contractors to upload concrete outputs of their on-the-job work. These can include actual files such as Word documents, PDFs, PowerPoint presentations, Excel sheets, images, or code repositories, rather than summaries of them. The approach is intended to expose the AI to authentic, complex work of the kind that often takes hours or days to complete.
Privacy Protocols and the "Superstar Scrubbing" Tool
Recognising the sensitivity of such data, OpenAI requires contractors to remove all proprietary information, personally identifiable details, and confidential material before uploading. To help with this, the Microsoft-backed startup has provided a specialised ChatGPT tool called "Superstar Scrubbing".
"Remove or anonymise any: personal information, proprietary or confidential data, material nonpublic information (e.g., internal strategy, unreleased product details)," stated an internal document seen by Wired. The step is meant to reduce the risk of AI models memorising and later leaking sensitive business or personal data.
The Bigger Picture: The Race for High-Quality Data and AGI
This data collection drive is not happening in isolation. Other AI giants such as Anthropic and Google are also enlisting large teams of contractors to generate premium training data. The goal is to develop more sophisticated AI models and agents capable of automating enterprise-level work.
The trend has fuelled the growth of a lucrative sub-industry of data contracting firms, including Surge, Mercor, and Scale AI, alongside Handshake AI. These companies manage networks of contractors who create high-quality AI training datasets.
However, the pursuit of Artificial General Intelligence (AGI), a hypothetical system that outperforms humans in most economically valuable tasks, has raised concerns. Several tech industry leaders have warned of a potential "white-collar bloodbath" in which AI automation takes over routine tasks and displaces entry-level professional roles. OpenAI's latest project, which directly benchmarks AI against human task performance, brings this prospect into sharper focus.
The report, dated January 11, 2026, captures a moment in AI development in which the gap between human and machine capability is being measured more precisely even as it narrows.