Mechanize is building digital offices to train AI agents to fully automate computer work

San Francisco-based startup Mechanize wants to take AI office automation further than anyone else. The company’s goal: fully replace human computer work with AI, as quickly as possible. To get there, the team is building simulated digital workplaces designed specifically for training AI agents.
“Our goal is to fully automate work,” says Tamay Besiroglu, co-founder of Mechanize. He started the company with Ege Erdil and Matthew Barnett; all three previously worked at the research group Epoch AI. The team is focused on creating a future where AI doesn’t just assist with digital labor, but takes it over entirely. Their first target is software development.
Simulated Offices for AI Agents
Mechanize is betting on reinforcement learning. The company trains AI agents in virtual workspaces that imitate real digital offices: email inboxes, Slack, code editors, browsers, and more. Agents take on tasks, earn rewards for success and penalties for failure, and repeat the cycle. “It’s effectively like creating a very boring video game,” Besiroglu told the New York Times.
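To make the reward-and-repeat cycle concrete, here is a deliberately tiny Python sketch. It is not Mechanize's system: the task names, the tool list, the `run_episode` and `train` functions, and the reward values of +1 and -1 are all hypothetical, chosen only to illustrate the loop. The "office" is reduced to three tasks that each have one correct tool, and the agent learns by trial and error which tool to pick.

```python
import random

# Hypothetical toy "office": each task has exactly one correct tool.
TASKS = {
    "reply_to_email": "email",
    "fix_failing_test": "editor",
    "look_up_docs": "browser",
}
TOOLS = ["email", "editor", "browser"]


def run_episode(task, tool):
    """Return +1.0 (reward) for the right tool, -1.0 (penalty) otherwise."""
    return 1.0 if TASKS[task] == tool else -1.0


def train(episodes=2000, epsilon=0.1, lr=0.1, seed=0):
    """Epsilon-greedy trial and error over the toy tasks."""
    rng = random.Random(seed)
    # Estimated value of using each tool on each task, starting at zero.
    values = {task: {tool: 0.0 for tool in TOOLS} for task in TASKS}
    for _ in range(episodes):
        task = rng.choice(list(TASKS))
        if rng.random() < epsilon:
            tool = rng.choice(TOOLS)  # explore: try a random tool
        else:
            tool = max(TOOLS, key=lambda t: values[task][t])  # exploit
        reward = run_episode(task, tool)
        # Move the estimate a small step toward the observed reward.
        values[task][tool] += lr * (reward - values[task][tool])
    return values


def greedy_policy(values):
    """Pick the highest-valued tool for every task."""
    return {task: max(tools, key=tools.get) for task, tools in values.items()}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Real training environments differ in almost every respect: tasks span many steps, success is judged by verifiers rather than a lookup table, and the agent is a large model rather than a table of numbers. The loop of attempting a task, receiving a reward or penalty, and updating is the part that carries over.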
The founders believe these environments will eventually be enough to train agents capable of handling any computer-based job. By their own estimates, that reality is at least a decade away: Barnett says it could take 10 to 20 years, while Besiroglu and Erdil put the timeline at 20 to 30 years.
Mechanize’s ambitions go beyond code. The company wants AI agents to handle every digital task, from planning and communication to execution. “We’ll only truly know we’ve succeeded once we’ve created A.I. systems capable of taking on nearly every responsibility a human could carry out at a computer,” the founders write.
But when it comes to the broader social impact, Mechanize is less specific. The team says it envisions a future of radical abundance and supports ideas like universal basic income for workers displaced by AI. Still, there’s no concrete transition plan. Barnett argues the mission is ethically justified: if society becomes wealthier overall, he says, that outweighs the cost of job losses.
The “Bitter Lesson”: Why Reinforcement Learning Is Central to Mechanize’s Approach
In an accompanying essay, Mechanize’s founders point out that despite rapid progress, today’s AI systems still can’t fully replace human software engineers. Models excel at specific programming tasks, but remain far from true autonomy.
Mechanize connects this limitation to the “bitter lesson” of AI research: whenever hand-designed algorithms have competed with data- and compute-driven approaches, the latter have eventually won out at scale. In the founders’ view, the real breakthrough won’t come from clever engineering, but from massive-scale training inside rich, simulated environments.
This approach closely aligns with the latest thinking from Richard Sutton, the researcher who coined the term “bitter lesson.” He and David Silver recently outlined a vision for the next leap in AI: agents that learn by doing, not just by consuming human-written data. They argue that true AI progress depends on agents living in a continuous stream of experiences, learning through feedback and adaptation.
From Coding Assistants to Generalist AI
Mechanize sees the path forward as a combination of human demonstration data and reinforcement learning in simulated digital offices. Only this approach, they argue, will produce agents that act like real colleagues: delegating, planning, fixing mistakes, and understanding context. The goal is to automate the full role of a software developer, creating a “drop-in remote worker” that fits seamlessly into digital teams.
According to Mechanize, current RL environments aren’t up to the task: they lack internet access, multi-agent collaboration, and realistic software tools, which prevents agents from developing true generalization skills. The company wants to solve this by building richer, more realistic training spaces that mirror actual digital work.
This is their core business: creating reinforcement learning environments for agent training. Mechanize is competing with major AI labs here. Since the earliest reasoning models like o1, those labs have been working to generate the right kinds of data for RL, from raw logs to verifiers to fully simulated workspaces. As tasks become more complex, the training environments need to keep up.
Software Development: The First and Last Domino?
For Mechanize, software development is the logical starting point. The work can be broken into discrete tasks, and many tools already handle parts of the process. At the same time, engineering is complex enough to serve as the ultimate test for agentic AI. That means software development could be both one of the first and one of the last knowledge-work domains to be fully automated.
AI already handles tasks ranging from code completion to testing, but areas like architecture decisions, team coordination, and long-term maintenance remain out of reach. Mechanize believes that, as training environments approach the complexity of real digital offices, even these responsibilities could be automated.