The harness is now the product. But the labs disagree, sharply and publicly, on how that product should be sold.
The artificial intelligence landscape is in the midst of a fierce debate over the fundamental business model for enterprise AI agents. In a span of just sixteen days, three major players (Sycamore, Anthropic, and OpenAI) each made significant moves, all betting that the "harness," the operational layer surrounding AI models, is the next critical product in the enterprise. Their strategies for packaging and selling that layer, however, are starkly divergent, signaling a strategic fissure within the industry.
The sequence began on March 30, when Sycamore announced a $65 million seed round to build what its founder describes as an operating system for autonomous enterprise AI. Just over a week later, on April 8, Anthropic launched Managed Agents in public beta: managed infrastructure for deploying and running AI agents, priced at eight cents per session hour. Then, on April 15, OpenAI responded with an update to its open-source Agents SDK that adds a model-native harness and a sandboxed execution environment, charging only for standard API and tool usage with no additional first-party runtime fees.
This concentrated flurry of activity underscores a shared observation: the harness, once an implicit component of AI development, has emerged as a distinct and valuable product. The disagreement lies in how it should be monetized and delivered. Anthropic has chosen a separately billed runtime hosted on its own infrastructure. Google and Microsoft have folded the layer into their cloud platforms, metering sessions, memory management, code execution, and tool use as platform primitives. OpenAI, in the move that has most ignited debate, has open-sourced its harness, essentially giving it away and relying on core model and tool API revenue. The AI agent category is coalescing with unprecedented speed, but its foundational business model remains a battleground.
What a Harness Is and Why It Became a Market
The term "harness" gained widespread traction in February following an engineering blog post by OpenAI. The post detailed how a small team managed to deploy a production system comprising a million lines of code with zero human-written code. This anecdote resonated because it gave a name to a burgeoning discipline that development teams had been practicing without a formal label. The concept was further solidified in early April by Martin Fowler, a renowned software development expert, in an extensive essay that defined harness engineering as encompassing all aspects surrounding an AI model, excluding the model itself.
At its core, a harness serves as the control layer around an AI agent, ensuring its reliable operation in production environments. This typically includes functionalities such as model invocation and context management, orchestration of external tools, secure sandboxed execution of code, persistent storage for session and execution states, granular permission controls, robust error recovery mechanisms, comprehensive observability features, and detailed tracing capabilities. In essence, it mirrors the role of production infrastructure for containers, providing the surrounding system that renders long-running agents safe, debuggable, and dependable, without being the AI model itself.
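The responsibilities listed above can be made concrete with a minimal sketch. This is a generic, hypothetical harness loop, not any vendor's actual API: every class and field name here is illustrative, and a production harness would add persistence, sandboxing, and far richer tracing.

```python
# A minimal, hypothetical sketch of the harness responsibilities described
# above: model invocation, tool orchestration, scoped permissions, error
# recovery, and observability. All names are illustrative, not any vendor's
# actual API.
import json
import time


class HarnessRun:
    """Tracks one agent run: its transcript, tool permissions, and trace."""

    def __init__(self, allowed_tools, max_steps=10):
        self.transcript = []                     # persisted session state
        self.allowed_tools = set(allowed_tools)  # scoped permissions
        self.max_steps = max_steps               # hard cap as a safety rail
        self.trace = []                          # observability records

    def execute(self, call_model, tools):
        """Drive the model/tool loop until the model returns a final answer."""
        for step_no in range(self.max_steps):
            action = call_model(self.transcript)
            self.trace.append({"step": step_no, "action": action["type"],
                               "ts": time.time()})
            if action["type"] == "final":
                return action["content"]
            if action["type"] == "tool":
                name = action["name"]
                if name not in self.allowed_tools:       # permission gate
                    self.transcript.append(
                        {"role": "error", "content": f"tool {name} denied"})
                    continue
                try:
                    result = tools[name](**action["args"])
                except Exception as exc:                 # recover, don't crash
                    result = f"tool failed: {exc}"
                self.transcript.append(
                    {"role": "tool", "name": name,
                     "content": json.dumps(result, default=str)})
        raise RuntimeError("run exceeded max steps")
```

The point of the sketch is the division of labor: the model decides what to do, while the harness enforces who may do it, records everything, and keeps a single tool failure from killing the run.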
For the past eighteen months, cloud providers and framework developers have offered fragmented managed components of this essential layer. However, the majority of teams responsible for deploying production-ready agents found themselves needing to piece together too many disparate elements. This gap spurred a wave of startups seeking to offer pre-packaged solutions, while many internal platform teams resorted to building their own from open-source building blocks. The harness has consequently emerged as a significant market because the available components, when considered individually, did not yet provide a seamless or complete solution.
Anthropic’s Managed Agents: A Bundled Runtime Approach
Anthropic’s Managed Agents is its response to this market gap, packaged as a beta API within its Claude Platform. Developers define their agents, the tools they can access, and the guardrails that govern their behavior; Anthropic runs the execution environment, supporting long-running sessions that span multiple hours, sandboxed code execution, scoped permissions, and end-to-end tracing. It also connects to third-party services via MCP-based integrations.
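The division of responsibility can be pictured as a declarative definition the developer hands over and the provider executes. The field names below are purely illustrative, not Anthropic's actual API surface; they only sketch the agent/tools/guardrails split described above.

```python
# Purely illustrative shape of a managed-agent definition: the developer
# declares the agent, its tools, and its guardrails, and the provider owns
# the runtime. None of these field names are Anthropic's actual API.
agent_definition = {
    "name": "bug-triage-agent",
    "model": "some-model-id",              # placeholder model identifier
    "tools": ["repo_read", "open_pull_request"],
    "guardrails": {
        "max_session_hours": 4,            # long-running but bounded
        "allowed_domains": ["github.com"],
        "require_approval_for": ["open_pull_request"],
    },
    "integrations": {"type": "mcp", "servers": ["sentry"]},
}
```

Everything below this definition (containers, state, retries, tracing) is the provider's problem, which is precisely what the bundled session-hour fee pays for.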
Anthropic’s launch customers lend significant weight to the offering. Notion uses Managed Agents to run dozens of parallel delegation tasks. Rakuten has deployed specialized agents across product management, sales, marketing, finance, and HR. Sentry built an agent that turns flagged bugs into open pull requests without manual intervention. Asana has integrated the service into its AI Teammates feature, and Atlassian has also signed on as a launch customer.
The pricing model for Managed Agents is simple to state. Standard Claude Platform token rates apply to all model inference, plus a usage-based fee of eight cents per session hour for as long as a session remains active. Some of the most advanced capabilities, including multi-agent orchestration, self-evaluating outcomes, and long-term memory, are gated behind a separate research-preview access request and are not yet broadly available. Anthropic also offers a Claude Agent SDK for programmatic development, so the managed-versus-open-source split runs through its own product line rather than between companies. Nevertheless, the Managed Agents offering, launched on April 8, runs exclusively on Anthropic’s own infrastructure.
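The session-hour meter is easy to reason about with back-of-the-envelope arithmetic. The fleet size below is an invented scenario, and the calculation covers the runtime fee only, not token charges:

```python
# Illustrative cost model for a bundled session-hour runtime fee. The fee
# is the published $0.08/session-hour; the workload numbers are invented
# for illustration, and token/inference charges are excluded.
SESSION_HOUR_FEE = 0.08  # dollars per active session hour


def monthly_runtime_cost(sessions_per_day, hours_per_session, days=30):
    """Runtime fee only, excluding token and inference charges."""
    return sessions_per_day * hours_per_session * SESSION_HOUR_FEE * days


# A hypothetical fleet running 200 three-hour sessions per day pays the
# meter regardless of how many tokens those sessions actually consume.
fee = monthly_runtime_cost(sessions_per_day=200, hours_per_session=3)
```

The key property is that the meter scales with wall-clock session time, not with token volume, so long-lived but mostly idle agents pay the same runtime fee as busy ones.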
OpenAI’s Open-Source Agents SDK: A Developer-Centric Model
Just seven days after Anthropic’s announcement, OpenAI presented a fundamentally different strategic bet with its updated open-source Agents SDK. This release introduces a model-native harness and integrated native sandbox execution capabilities. It features configurable memory management, sandbox-aware orchestration, Codex-style filesystem tools, and standardized MCP integrations. The primary target use case is long-horizon agents capable of operating across extended periods and executing numerous tool calls, mirroring the ambitious scope of Anthropic’s Managed Agents.
OpenAI’s delivery model is a direct inversion of Anthropic’s. Instead of hosting the compute infrastructure, OpenAI lets developers bring their own, via a Manifest abstraction that supports seven sandbox providers: Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. Storage is similarly flexible, with support for S3, GCS, Azure Blob, and Cloudflare R2. State is externalized, so runs persist even if a sandbox container is lost, and snapshotting allows a run to be rehydrated seamlessly in a new container. The harness coordinates these operations, but the underlying infrastructure is the developer’s responsibility.
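The externalized-state pattern is worth sketching, because it is what makes sandbox containers disposable. This is a hypothetical illustration of the idea, not OpenAI's SDK surface: a plain dict stands in for a durable blob store such as S3 or GCS, and all names are invented.

```python
# Hypothetical sketch of externalized run state: the harness keeps the
# canonical state in durable storage, so a lost sandbox container can be
# replaced and the run rehydrated from the last snapshot. The `store` dict
# stands in for S3/GCS/etc.; all names are illustrative, not any real SDK.
import json


class DurableRun:
    def __init__(self, run_id, store):
        self.run_id = run_id
        self.store = store            # any blob store keyed by string
        self.state = {"steps": [], "files": {}}

    def record_step(self, step):
        self.state["steps"].append(step)
        self.snapshot()               # persist after every step

    def snapshot(self):
        """Write the full run state to durable storage."""
        self.store[self.run_id] = json.dumps(self.state)

    @classmethod
    def rehydrate(cls, run_id, store):
        """Rebuild the run in a fresh container from the last snapshot."""
        run = cls(run_id, store)
        run.state = json.loads(store[run_id])
        return run
```

Because the container holds no canonical state, losing it costs at most the work since the last snapshot, which is what lets runs outlive any individual sandbox.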
The pricing structure is where the contrast with Anthropic becomes particularly sharp. OpenAI explicitly stated in its announcement that the new features leverage standard API pricing, based on token and tool usage. Crucially, there is no separate first-party runtime fee, nor is there a session-hour meter. The Agents SDK itself is freely available and open-source. While developers are still responsible for covering the costs of sandbox compute and storage with their chosen providers, OpenAI has deliberately abstained from introducing an additional runtime line item. The overall cost for a developer can fluctuate, potentially falling on either side of Anthropic’s bundled model, depending heavily on the specific workload characteristics.
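The "either side" claim is easy to see with a rough break-even comparison. The bring-your-own rates below are placeholders, since real sandbox and storage pricing varies by provider and instance size:

```python
# Rough break-even sketch: a bundled $0.08/session-hour runtime fee versus
# bring-your-own sandbox compute and storage. The BYO hourly rates here are
# invented placeholders; real provider pricing varies widely.
BUNDLED_PER_HOUR = 0.08  # dollars per active session hour


def byo_per_hour(sandbox_rate, storage_rate):
    """Hourly BYO cost under assumed sandbox and storage rates."""
    return sandbox_rate + storage_rate


# A small sandbox at an assumed $0.05/hr undercuts the bundled meter;
# a larger instance at an assumed $0.12/hr exceeds it.
cheap = byo_per_hour(sandbox_rate=0.05, storage_rate=0.002)
heavy = byo_per_hour(sandbox_rate=0.12, storage_rate=0.002)
```

Which side of the line a team lands on depends entirely on workload shape: lightweight, long-idle sessions favor cheap BYO compute, while heavyweight sandboxes can make the bundled meter look like a bargain.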
OpenAI has been forthright in articulating its rationale. Its announcement framed managed agent APIs as a simplification of deployment, but at the potential cost of restricting where agents can operate and how they access sensitive data. This represents a direct and public disagreement with the strategic direction pursued by Anthropic, Google, and Microsoft.
The Major Players Agree on Owning the Layer, Disagree on Billing
The divergence in business models extends beyond Anthropic and OpenAI. Google, for instance, lists its Vertex AI Agent Engine as a fully managed runtime solution. This offering includes distinct consumption-based billing for sessions, memory, code execution, and observability, rather than a single per-hour fee. Microsoft’s Foundry Agent Service operates on a similar consumption-based model, with billing applied across models and tools. It also features specific session metering for tools like Code Interpreter, but not for the platform as a whole. In February, AWS announced its intention to co-develop a Stateful Runtime Environment with OpenAI, slated for availability through Bedrock in the coming months, supplementing its existing Bedrock AgentCore, which provides the foundational runtime primitives.
Each of these major cloud providers presents a unique pricing configuration. Anthropic opts for a bundled approach, combining compute, state management, and orchestration into a session-hour fee. Google meters individual components separately, while Microsoft bases its charges on model and tool usage. AWS is set to introduce another managed option with the upcoming release of the OpenAI runtime. OpenAI’s own first-party offering conspicuously omits a runtime meter altogether.
Despite these varied approaches to pricing and delivery, there is a clear consensus among these five leading vendors: the harness layer matters, and each intends to control and profit from it. Their fundamental disagreement is over the method of monetization: a hosted service with its own distinct meter, a collection of individually priced foundational services, or an open-source offering supported by core model revenue. That disagreement is not a sign of stagnation but a deliberate strategic divergence.
The Middleware Arc: A Split Running Through It
The cloud infrastructure landscape has witnessed similar bifurcations in the past, and the outcomes were rarely a complete absorption of one model by another. For example, Terraform has maintained its open-source status while AWS CloudFormation offers a managed solution. Similarly, Kubernetes has become the de facto standard in container orchestration, even as AWS, Google, and Microsoft provide their own managed container services. In both these instances, open-source solutions did not eliminate managed services, nor did managed services render open-source obsolete. They coexisted by catering to distinct buyer profiles and preferences.
The historical precedent suggests that when one vendor offers a free, open-source product and others provide paid, managed alternatives, the market tends to divide based on infrastructure preferences rather than consolidating into a single dominant model. Organizations seeking the convenience of hosted solutions gravitate towards managed services, while those prioritizing control, portability, or multi-cloud flexibility opt for the open-source stack. Both approaches have proven to be sustainable and commercially viable within the cloud computing era.
What is changing, however, is the economics for independent companies selling horizontal versions of this layer. A free, model-native harness from OpenAI puts far more pricing pressure on independent frameworks than a paid managed service ever could. The established cloud pattern appears to be replicating in the AI agent space, but with compression from two directions at once: a free open-source harness from below and bundled managed runtimes from above.
Implications for Startups Filling the Gap
The risk profile for startups focused on AI harnesses has become significantly more nuanced. Sycamore’s value proposition, as evidenced by its funding from Coatue and Lightspeed, centers on trust, governance, and control within enterprise AI, with an emphasis on multi-model support. This approach is strategically defensible against both Anthropic’s managed offering and OpenAI’s open-source model. It targets a specific buyer segment that prioritizes independence from any single AI lab, positioning Sycamore as less vulnerable in the current market dynamics.
Conversely, the archetype most exposed by these recent launches is the horizontal orchestration framework. Companies like LangChain, CrewAI, and VoltAgent now face more direct competition from a free, model-native, and well-supported harness provided by the very labs whose models they often depend on. The long-standing argument for model-agnostic frameworks – that flexibility outweighs vendor lock-in – becomes more challenging to articulate when the vendor in question is offering an open-source harness that is intrinsically aligned with its cutting-edge models. Startups continuing to pitch a horizontal, model-agnostic orchestration layer to enterprise buyers will likely encounter more resistance and face more difficult conversations.
Startups that have chosen to offer paid managed platforms are now contending with increased competition from Anthropic, Google, and Microsoft. The strategic imperative for these companies appears to be a sharp differentiation. This could involve focusing on specialized areas such as governance, compliance, deep vertical expertise, or robust multi-model control. Alternatively, they may need to compete aggressively on price, navigating the pressure from OpenAI’s free offering on one side and the bundled solutions of the major cloud providers on the other.
Implications for Teams That Built Their Own
These launches have significantly altered the "build versus buy" calculus for enterprise AI infrastructure by introducing two new reference points. Organizations that prefer bundled infrastructure can now benchmark an internal harness against Anthropic’s Managed Agents, at eight cents per session hour plus token costs. Teams that have already built and maintain their own infrastructure can compare against OpenAI’s SDK, which carries no additional first-party runtime fee beyond their existing sandbox and storage bills. Different teams will find different benchmarks relevant, but neither comparison existed a month ago.
For teams still in the early stages of prototyping, the rationale for building AI scaffolding from scratch has substantially weakened: infrastructure work that was once differentiated engineering is now available as an API-driven service or a free, well-supported SDK. For teams already in production, internally built systems may still be better tuned to specific workloads, but the engineers maintaining them are now competing with a rapidly evolving category that four leading AI labs are actively investing in. That dynamic is likely to slow internal development, reduce its prestige, and make recruiting for those specialized roles harder.
Building in-house remains a valid strategic choice for some organizations. However, it now necessitates a more rigorous justification, requiring the internal solution to demonstrably outperform both the new benchmark offerings in terms of workload fit and, critically, team sustainability. The cost-benefit analysis has become considerably more complex.
What’s Next in the Harness Race
The harness was widely anticipated to be the "moat" – the key differentiator for companies building production AI agents. For eighteen months, the consensus among most teams deploying agents was that building or assembling their own harness provided that crucial competitive advantage. However, the leading AI labs have collectively signaled an unwillingness to solely provide model access and allow others to capture the value added above it. They have, however, diverged significantly on how they intend to capture this value themselves. Three of the major players are opting to charge for the runtime in various forms, while one is distributing its harness freely, betting on model loyalty and broader ecosystem adoption to drive revenue.
The critical question moving forward is which of these competing business models will prevail, or whether the market will sustain several. OpenAI’s strategy hinges on the belief that a free, open-source, model-native harness will drive more total model consumption than any paid managed runtime could, with its AWS partnership covering enterprises that prefer hosted deployment. Anthropic is betting on a fully managed paid service, while Google and Microsoft package priced primitives within their broader cloud platforms. It is unlikely that all of these models reach equivalent scale. Startups watching this landscape must decide which differentiation strategy aligns with market momentum before committing resources, because the definitions of the agent harness and the runtime are still moving, and will keep moving for some time.
