AI agents autonomously negotiated and completed real commercial transactions between humans, with measurable financial outcomes that tracked the capability of the model behind each agent. Anthropic published results from an internal experiment called Project Deal on April 24, 2026, disclosing that Claude-powered agents had run a fully autonomous classified marketplace among company employees in December 2025.
How the Experiment Was Structured
For one week, Anthropic created a classified marketplace for employees at its San Francisco office, structured like a consumer classifieds platform, but with AI models conducting all deals on employees' behalf. In December 2025, Claude interviewed participants about which personal belongings they wanted to sell and what they might want to buy.
Anthropic recruited 69 employees, gave each a $100 budget paid out after the experiment as a gift card, adjusted for the value of whatever they bought or sold, and promised that exchanges agreed upon by their agents would actually be honoured.
The project ran through the company's Slack workspace, with the channel cycling through agents and allowing them to post listings, make offers, or finalize deals. There was no human intervention once the experiment began; agents did not seek approval before closing a deal or during a bidding war.
Four Parallel Markets, One Real Outcome
Anthropic ran four independent versions of the marketplace simultaneously: one "real" market, whose deals determined which goods were physically exchanged, and three additional runs for research purposes. In two versions, every agent used Claude Opus 4.5, Anthropic's then-frontier model. In the other two, each participant had a 50/50 chance of being assigned Claude Haiku 4.5, a less capable model, instead.
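The assignment scheme described above can be sketched in a few lines of Python. This is a minimal illustration and not Anthropic's actual procedure: the function name, the participant labels, and the use of a seeded random generator are all assumptions, and the model strings are plain labels, not API identifiers.

```python
import random

# Plain labels for illustration; these are not API model identifiers.
OPUS = "Claude Opus 4.5"
HAIKU = "Claude Haiku 4.5"


def assign_models(participants, mixed, seed=None):
    """Map each participant to a model label for one market run.

    In an all-Opus run everyone gets OPUS. In a mixed run each participant
    independently gets HAIKU with probability 0.5, otherwise OPUS.
    (Hypothetical helper: the source does not describe the mechanism.)
    """
    rng = random.Random(seed)
    assignment = {}
    for person in participants:
        if mixed and rng.random() < 0.5:
            assignment[person] = HAIKU
        else:
            assignment[person] = OPUS
    return assignment


# 69 placeholder participant names, matching the reported headcount.
participants = [f"employee_{i:02d}" for i in range(1, 70)]
all_opus_run = assign_models(participants, mixed=False)
mixed_run = assign_models(participants, mixed=True, seed=0)
```

Because each participant is assigned independently, a mixed run will not necessarily split the 69 agents evenly between the two models.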
In the real run, 69 agents struck 186 deals across more than 500 listed items, for a total transaction value of just over $4,000. These were not simple one-click exchanges. Agents had to identify potential matches, propose prices, field counteroffers, and reach an agreement, all in natural language, without a pre-built negotiation protocol.
Model Capability, Not Prompt Instructions, Determined Outcomes
The most consequential finding from the research runs concerned what actually drove negotiation performance. The initial instructions provided to AI agents had minimal impact on the likelihood of completing sales or on the final negotiated prices, suggesting that model capability played a more decisive role than prompt framing in transactional performance.
Opus-powered agents closed approximately two more deals on average than Haiku agents. When an Opus agent and a Haiku agent sold the same item, Opus pulled in $3.64 more on average. On the buying side, Opus-represented participants saved $2.45 more per item compared to those represented by Haiku.
Specific item comparisons illustrated the gap clearly. One broken folding bike sold for $38 when handled by a Haiku agent and $65 when Opus took over. A lab-grown ruby sold for $35 through Haiku and $65 through Opus.
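To put the two item comparisons on a common footing, the short calculation below expresses each gap in dollars and as a percentage of the Haiku sale price. It uses only the four prices quoted above; the helper name is an assumption made for illustration.

```python
# Sale prices in dollars as reported above: (Haiku price, Opus price).
examples = {
    "broken folding bike": (38, 65),
    "lab-grown ruby": (35, 65),
}


def price_gap(haiku_price, opus_price):
    """Return the dollar gap and the gap as a percentage of the Haiku price."""
    gap = opus_price - haiku_price
    return gap, 100 * gap / haiku_price


for item, (haiku_price, opus_price) in examples.items():
    gap, pct = price_gap(haiku_price, opus_price)
    print(f"{item}: ${gap} more with Opus ({pct:.0f}% above the Haiku price)")
```

The bike comes out $27 higher (about 71%) and the ruby $30 higher (about 86%). Both gaps are far larger than the $3.64 average seller-side difference, so they are best read as striking individual cases, not typical ones.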
The Perception Gap
Despite these measurable disparities, participants on the losing end had no awareness of their disadvantage. When surveyed, participants rated the fairness of individual deals on a scale from 1 to 7. Scores clustered around 4, the midpoint, and people reported being broadly satisfied with how their agents represented them.
Even when asked to rank their bundles of goods across different runs, 11 of the 28 people who experienced both models actually preferred the outcome where they had the weaker agent. The disadvantage was measurable but imperceptible to those on the wrong end of it.
Anthropic described this as an "uncomfortable implication" in its published findings. The company noted that when agents of different strengths meet in real markets, people could end up on the losing side without ever knowing it.
Regulatory and Legal Frameworks Absent
Anthropic did not characterize Project Deal as a product announcement. The company acknowledged the experiment was a pilot with a self-selected participant pool, while stating it suspects agent-to-agent commerce will emerge in the real world with real consequences before long.
The company also flagged security concerns, including jailbreaking and prompt injection, as risks specific to agents that act autonomously on a person's behalf. It stated that "the policy and legal frameworks around AI models that transact on our behalf simply don't exist yet," adding that "society will need to move quickly."
Despite these open questions, 46% of participants said they would be willing to pay for an agent-based marketplace service. Anthropic noted the potential for automated preference collection and deal execution to reduce friction in markets and increase the gains from trade.
What This Means for Digital Marketers and B2B Commerce
The Project Deal findings carry direct implications for how businesses think about commercial optimization. If AI agents begin functioning as autonomous purchasing parties in B2B or e-commerce settings, the factors that influence deal outcomes shift away from human-readable persuasion (product copy, visual design, and pricing psychology) and toward agent-readable signals: structured data, machine-accessible product attributes, and transparent pricing logic. Anthropic's published findings note that optimizing directly for AI agents' attention could become a powerful commercial tool. For digital marketing and procurement teams, these findings point toward a parallel optimization track: one designed not for a human buyer reviewing a landing page, but for an agent evaluating listings programmatically.
Anthropic acknowledged that the experiment was not designed to examine these dynamics in depth and that more research is needed.
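As a rough illustration of what "agent-readable signals" might look like in practice, the sketch below defines a machine-readable listing and a simple programmatic check a buying agent could run. Nothing here comes from Project Deal: the class, every field name, the filter function, and the sample values are hypothetical, chosen only to make the idea concrete.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Listing:
    """Hypothetical machine-readable listing; every field name is an assumption."""
    item_id: str
    title: str
    condition: str
    asking_price_usd: float
    negotiable: bool = True
    attributes: dict = field(default_factory=dict)

    def to_json(self):
        """Serialize to JSON with stable key order so agents can parse it reliably."""
        return json.dumps(asdict(self), sort_keys=True)


def within_budget(listing, budget_usd):
    """Return True if an agent with this budget could consider the listing.

    A negotiable listing is kept even when the asking price is above budget,
    since the agent may still be able to bargain the price down.
    """
    return listing.negotiable or listing.asking_price_usd <= budget_usd


# Illustrative values only; not an item or price from the experiment.
lamp = Listing(
    item_id="demo-001",
    title="Desk lamp",
    condition="used",
    asking_price_usd=20.0,
    negotiable=False,
    attributes={"color": "black", "bulb_included": True},
)

print(lamp.to_json())
print(within_budget(lamp, budget_usd=100.0))
```

The design choice worth noting is that price, condition, and attributes are separate typed fields instead of free text, which is the kind of structure an agent can filter on without interpreting marketing copy.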


