The rise of artificial intelligence is reshaping how businesses interact with their data, introducing query volumes that strain existing infrastructure. Anjan Kundavaram, Chief Product Officer at Fivetran, highlighted this shift during a recent discussion at Google Cloud Next, drawing a stark analogy: "It’s kind of like using a Lamborghini to mow the lawn all the time." The image captures a critical challenge in closed data ecosystems: the same, often expensive, compute paths get used for every AI-driven query, regardless of its complexity or cost profile.
Fivetran, a prominent player in data integration and replication, used the event to champion its vision of "Open Data Infrastructure." This initiative, coupled with the recent launch of its Open Data Infrastructure Data Access Benchmark, aims to shed light on vendor practices that inflate costs for AI workloads, whether inadvertently or by design. The timing is apt: The New Stack previously reported that much enterprise data infrastructure was never architected for "agent swarms," the parallel, high-volume queries characteristic of AI applications.
The Economic Disconnect: AI Agents vs. Human Analytics
Kundavaram’s central argument is that AI agents operate under fundamentally different economics than the human analysts who preceded them. Human analysts typically need near-instantaneous results to maintain workflow efficiency; agents do not. "An agent could go spend more time if the agent thinks you’re going to save 10x the cost," Kundavaram explained. An agent, in other words, can trade latency for savings, selecting among compute engines based on cost-effectiveness and the needs of each query.
In a truly open data infrastructure, an AI agent could intelligently route a computationally intensive analytical query to a high-performance, albeit more expensive, engine. Conversely, simpler or more routine queries could be directed to lighter, lower-cost options. This dynamic, however, is often stifled within closed ecosystems where a single, often costly, compute path is the default for all data access. This lack of flexibility represents a significant source of the AI cost squeeze that many organizations are beginning to experience.
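To make the routing idea concrete, here is a minimal sketch of what cost-aware dispatch could look like once multiple engines can read the same open tables. The engine names, prices, and route_query helper are hypothetical illustrations under assumed numbers, not Fivetran’s product or any vendor’s actual API:

```python
from dataclasses import dataclass

@dataclass
class Engine:
    name: str
    cost_per_tb: float   # hypothetical dollars per terabyte scanned
    latency_secs: float  # rough expected query latency

# Hypothetical catalog: a fast but pricey warehouse and a slower,
# cheaper engine reading the same open table format.
ENGINES = [
    Engine("warehouse", cost_per_tb=5.00, latency_secs=2),
    Engine("lake_engine", cost_per_tb=0.50, latency_secs=30),
]

def route_query(tb_scanned: float, interactive: bool) -> Engine:
    """Send latency-sensitive human queries to the fast path; let
    agents, which can afford to wait, take the cheapest engine."""
    if interactive:
        return min(ENGINES, key=lambda e: e.latency_secs)
    return min(ENGINES, key=lambda e: e.cost_per_tb * tb_scanned)

# An agent's 2 TB batch scan goes to the cheap engine; a human's
# small dashboard query takes the fast path.
print(route_query(2.0, interactive=False).name)   # lake_engine
print(route_query(0.001, interactive=True).name)  # warehouse
```

The point is the decision rule, not the numbers: once the data sits in an open format that more than one engine can read, latency tolerance becomes a routing signal rather than a fixed property of the platform.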
The "Triple Whammy" of Unconsolidated Data and Context
Beyond the inefficiencies of compute path routing, Kundavaram identified a second, equally critical factor contributing to escalating AI costs: the fragmentation of data and its associated context. When an organization’s critical information is scattered across numerous disparate systems, and the contextual relationships between these data points are not consolidated, the consequences for AI workloads can be severe.
"It’s going to be like a triple whammy," Kundavaram stated, outlining a scenario where the cumulative effects lead to a cascade of problems. Firstly, AI models trained on fragmented and contextually poor data are likely to produce suboptimal or inaccurate answers. Secondly, the inherent nature of AI agents to explore and iterate can lead to a dramatic increase in query volumes, each one hitting the same inefficient compute pathways, thus driving up costs exponentially. Finally, the effort required to feed these numerous queries with insufficient context results in significant waste, both in terms of computational resources and analytical effort.
Resisting the Lockdown Instinct: The Case for Openness
The instinctive reaction in many data organizations facing these rising costs is to impose stricter controls and limits on data access. Kundavaram argues this is precisely the wrong approach. "One of the data leaders told me at a very large company, hey, our analytics budgets, just queries, have gone up a lot," he said. The same concern surfaced inside Fivetran, where an analytics leader initially suggested implementing controls. Kundavaram’s response was emphatic: "No, no, don’t put controls. Let’s innovate."
His broader prescription for unlocking the full productivity potential of agentic analytics is for businesses to actively reject the instinct towards lockdown and instead invest strategically in open infrastructure and what he terms "semantic discipline." This involves creating an environment where data is not only accessible but also well-understood and interconnected, allowing AI agents to operate with greater efficiency and accuracy.
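What "semantic discipline" might look like in practice is a shared, governed definition of each metric that every agent query resolves against, rather than each agent re-deriving business logic on its own. The sketch below is a hypothetical illustration; the structure, table names, and compile_metric helper are assumptions, not a real Fivetran or SQLMesh interface:

```python
# A minimal, hypothetical semantic layer: metrics are defined once,
# with their source and valid grain, so agents don't guess at them.
SEMANTIC_MODEL = {
    "revenue": {
        "source": "analytics.orders",
        "expression": "SUM(amount)",
        "grain": ["order_date", "region"],
        "description": "Gross revenue from completed orders.",
    },
}

def compile_metric(metric: str, group_by: list[str]) -> str:
    """Expand a governed metric into SQL instead of letting each
    agent reinvent the definition (and burn queries getting it wrong)."""
    m = SEMANTIC_MODEL[metric]
    bad = [g for g in group_by if g not in m["grain"]]
    if bad:
        raise ValueError(f"{metric} is not defined at grain {bad}")
    cols = ", ".join(group_by)
    return (f"SELECT {cols}, {m['expression']} AS {metric} "
            f"FROM {m['source']} GROUP BY {cols}")

print(compile_metric("revenue", ["region"]))
# SELECT region, SUM(amount) AS revenue FROM analytics.orders GROUP BY region
```

However it is implemented, the discipline is the same: one governed definition per metric, so a hundred agent queries converge on the same answer instead of a hundred variants of it.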
Fivetran’s Strategic Moves in the Open Data Landscape
Fivetran’s commercial interests align squarely with open data infrastructure, and the company has been translating the philosophy into products and initiatives. Its recent work enabling data lake interoperability on Google Cloud, previously covered by The New Stack, chips away at data silos, and its contribution of SQLMesh to the Linux Foundation in March signals a commitment to open source tooling for data transformation and governance.
The SQLMesh project provides a framework for managing and transforming data within data warehouses and data lakes. By donating it to the Linux Foundation, Fivetran opens the project to broader community involvement and development, positioning SQLMesh as a potential industry standard for data modeling and transformation, in keeping with the open data infrastructure pitch.
The Evolving Data Landscape and Future Implications
The question now is whether enterprise buyers will grasp the economics of current data infrastructure limitations and act decisively. The cost curve for AI workloads is steepening, and the bills are beginning to arrive. The industry is at a juncture where architecture decisions will have profound, long-lasting effects on a company’s ability to use AI effectively and affordably.
The shift toward open data infrastructure is not merely a technical preference; it is a re-evaluation of how data should be managed and accessed in an AI-driven world. Organizations that embrace it are likely to be better positioned to harness AI, while those that cling to closed, proprietary systems may face escalating costs and diminishing returns. Whether Fivetran’s "Open Data Infrastructure" vision succeeds will depend on the industry’s willingness to move beyond legacy constraints, and the architecture choices made today will shape the competitive landscape for years to come.
