Growing scrutiny of the AI data-labeling industry is driving companies to explore new ways of collecting training data. Recent reports have highlighted difficult working conditions at data-labeling firms such as Mercor and Sama, raising concerns about worker welfare and the quality of the information used to train artificial intelligence models. In response, virtual world platform VLGE and data company Protege have announced a partnership that aims to generate AI training data from natural human behavior instead of traditional manual labeling.
A Wired report published two months ago described contractors at Mercor working under intense pressure, tight deadlines and disorganized conditions. Meanwhile, workers at the Kenyan company Sama reportedly earned as little as $1.50 per hour while reviewing content from Meta’s smart glasses, including sensitive images. These cases have intensified debate over whether current data-labeling practices produce reliable and unbiased datasets.
VLGE and Protege believe data collected from people naturally interacting in virtual environments can provide a richer foundation for AI. Their partnership will use behavioral signals generated inside VLGE’s digital worlds, including movement paths, hesitation loops, exploration patterns, object interactions, spatial decision-making and contextual shopping behavior. The companies argue that this information better reflects real human actions than conventional labeled datasets.
Grant Murphy-Herndon, General Manager of Protege, said traditional datasets are heavily influenced by the way tasks are assigned, making them inherently biased. He said VLGE’s approach captures people pursuing their own goals in different ways, producing data that is more representative of genuine human behavior and less affected by artificial labeling processes.
VLGE founder and CEO Evelyn Mora said future AI systems must understand more than spoken language. She said AI also needs to learn how people build, move, hesitate, explore, compare and make decisions within physical and virtual environments. According to Mora, scalable behavioral systems and spatially contextualized human interaction will play a central role in advancing human intelligence for AI.
As AI development expands from two-dimensional applications into three-dimensional environments, spatial behavioral data is expected to become increasingly valuable. While robotics has attracted much of the industry’s attention, the technology also has immediate commercial uses. Mora pointed to retail as one example, explaining that businesses can analyze why certain product displays outperform others by studying customer behavior in virtual spaces. Metrics such as a “hesitation score” can reveal decision-making patterns that are difficult to capture through traditional in-store cameras or customer surveys.
The growing collection of behavioral data also raises questions about privacy and consent. Mora said all VLGE users explicitly agree to their data being collected and shared, adding that the company is focused on protecting individual rights while supporting a human-centered and sustainable future for AI. She said the goal is to avoid repeating the social media model in which users exchange personal data for free services without receiving meaningful benefits.
The shift toward spatial AI could eventually extend beyond virtual worlds. Technologies such as smart glasses may collect information including walking patterns and eye movements to improve AI systems, product design and robotics. While today’s data-labeling workforce receives direct compensation, Mora suggested future contributors should also benefit, potentially through subsidies that make the required hardware more affordable.
Mora believes human-generated behavioral data will become the foundation of future AI development. As companies increasingly rely on real-world interactions to train advanced models, the debate will center not only on technological progress but also on ensuring the people generating that data are treated as respected participants rather than simply another resource.