Chennai: Home services startup Pronto's recent admission that it was piloting in-home video recordings to train physical AI systems has brought attention to a rapidly expanding and loosely regulated sector of AI data capture and labeling for the global robotics supply chain.
Pronto is not alone in this endeavor. Startups such as Human Archive, Humyn Labs, Egolab AI, and Neocambrian are collecting what is known as egocentric data: first-person video captured through wearables or head-mounted cameras. These companies collaborate with cloud kitchens, hotels, home services platforms, small textile and garment factories, and warehouse operators to record everyday tasks, ranging from cooking meals and washing dishes to stitching garments, assembling components, and sorting inventory. In some instances, startups have established dedicated 'data factories' equipped with motion-tracking rigs.
"Typical clients are robotics, vision-language-action model and world model companies," said Abhinav Kukreja, founder of Neocambrian AI, which raised funds from angels, including Dalmia Family Office Trust. "There is no equivalent repository of physical behavior on the internet. Robots need to learn from messy homes, crowded factories, small shops and repair stations, which India offers. When done right, it can become an additional source of paid work for many workers and households, and we compensate both environment owners and data collectors," he explained.
This data is used to train world models and physical AI systems, teaching robots to navigate and act in messy, unstructured environments, and to train smart glasses for object recognition. One industry insider indicated that there is significant demand from the defense industry, particularly for autonomous drone applications. However, the practice also raises questions about privacy, legality, and compensation, as in some cases videos are recorded without paying the workers or obtaining their consent. TOI learned that some factories have paused such pilots following the recent backlash.
Manish Agarwal, co-founder of Humyn Labs, which works with leading frontier labs, noted that demand is growing from robotics OEMs, software makers, and enterprises. "We collect and convert this into episodic strings for robot memory, which helps build low to mid-level agentic capabilities including physical action, voice, sight and mobility," he said. "We are using verified networks of workers across 16 countries as robots cannot be trained only in Indian environments. For European domestic robotics to navigate better, we need training data similar to that environment," he added.
Startups argue that this is India's entry into the global AI value chain, and that working with frontier labs could help the country train competitive models of its own in the future. But skeptics see a familiar cost-arbitrage play. Madhukar Yarra, CEO of Bengaluru-based BPO NextWealth, which annotates these videos, called it a flash in the pan. Much of the data is collected through unorganized gig work, he said.
Sangeeta Gupta, SVP at Nasscom, stated that physical AI data could diversify India's AI services beyond traditional data labeling. "But issues around informed consent, anonymization, worker awareness and ethical use will require continued industry responsibility and evolving safeguards," she said.



