Databricks Unveils Lake Transactional/Analytical Processing (LTAP) Architecture to Unify Data Operations for AI Agents

Databricks, the data analytics and artificial intelligence company, has announced an architecture intended to bridge the long-standing divide between operational databases and analytical systems. At its annual Data + AI Summit in San Francisco, the company introduced Lake Transactional/Analytical Processing (LTAP), a new architecture designed to consolidate these two data environments, with AI agents as the primary intended users. The initiative builds on Databricks’ prior acquisitions and reflects its view that enterprise data infrastructure must adapt to the rise of non-human users.

LTAP grew out of Databricks’ strategic acquisitions, including the serverless PostgreSQL startup Neon and, more recently, Mooncake Labs. These moves reflect a core belief within the company: that enterprise data interaction will increasingly be dominated by AI agents, not human users. In Databricks’ view, the underlying infrastructure must therefore be re-engineered for the demands of those agents, rather than for the human workflows and analytical queries that traditional data architectures were optimized for.

A Transformative Vision: 40 Years in the Making

Ali Ghodsi, co-founder and CEO of Databricks, laid out the case for this architectural shift during the summit’s keynote address. "For decades, complicated data infrastructure was a tax that teams were forced to pay," Ghodsi stated. "Then agents arrived. In a matter of months, organizations effectively doubled their workforce, just not with humans. Agents write code, make calls, and run loops at a pace human teams never could. The infrastructure that powered the last era of computing is now the bottleneck that no one can afford. LTAP removes it." He went on to call LTAP "a breakthrough the industry has been working on for 40 years. We think we finally pulled it off."

Databricks Wants to Merge the Two Databases Every Company Runs

Historically, businesses have relied on two distinct types of database systems: Online Transactional Processing (OLTP) systems and Online Analytical Processing (OLAP) systems. OLTP databases, typically structured in row-based formats, are optimized for high-speed transactions, managing real-time operations such as order fulfillment, payment processing, and inventory management. In contrast, OLAP systems, often employing column-based formats, are designed for large-scale data analysis, reporting, and business intelligence, running complex queries over very large datasets. The two were kept separate for performance and reliability, which meant that Extract, Transform, Load (ETL) pipelines and data replication processes were needed to synchronize information between them. This approach works for human-driven operations, but it introduces significant latency and complexity when scaled to the demands of AI agents.
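The row-versus-column distinction described above can be illustrated with a minimal Python sketch. The table contents and function names here are hypothetical, chosen only for illustration; real engines add indexes, compression, and on-disk layouts that this omits.

```python
# Hypothetical orders table, used only to contrast the two layouts.
ROWS = [
    {"order_id": 1, "customer": "acme", "amount": 120.0},
    {"order_id": 2, "customer": "globex", "amount": 75.5},
    {"order_id": 3, "customer": "acme", "amount": 30.0},
]


def to_columns(rows):
    """Pivot row-oriented records into a column-oriented layout."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def fetch_order(rows, order_id):
    """OLTP-style point lookup: the whole record comes back together."""
    for row in rows:
        if row["order_id"] == order_id:
            return row
    return None


def total_amount(columns):
    """OLAP-style aggregate: only the 'amount' column is scanned."""
    return sum(columns["amount"])


COLUMNS = to_columns(ROWS)
order = fetch_order(ROWS, 2)
revenue = total_amount(COLUMNS)
```

The point lookup touches one complete record, which is the access pattern row storage favors, while the aggregate reads a single contiguous column and never looks at customer names or order IDs.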

Databricks contends that AI agents require a more integrated data environment. These agents need the ability to process live transactional data, reason over historical context, and act upon both simultaneously, a capability that current siloed architectures struggle to provide efficiently. Previous attempts to merge transactional and analytical processing, such as Hybrid Transactional/Analytical Processing (HTAP) systems, often incurred substantial costs and led to vendor lock-in. Similarly, "zero-ETL" solutions, while aiming to simplify data movement, often relied on hidden change data capture mechanisms, still resulting in data duplication and the persistent problem of data staleness.

The Architecture of LTAP: Unification and Efficiency

LTAP unifies transactional and analytical data within a single, governed storage layer. The data resides in open formats on cloud object storage, and separate compute engines, each optimized for its own workload, operate on it. This design builds directly upon Databricks’ Lakebase, an operational database introduced in June 2025. Databricks describes Lakebase as a new category of database, one that decouples compute from storage and places data in open formats within the data lake.

Databricks is now extending Lakebase to support business-critical workloads with enhanced capabilities. These include native vector and full-text search functionalities, real-time event ingestion via Zerobus (part of the Lakeflow Connect service), and Git-style branching. This branching feature is particularly significant for AI agents, allowing them to create isolated copies of databases for experimentation, testing, and development purposes, and then discard them without impacting the live production environment. Ghodsi highlighted this capability, stating, "Agents love to just branch out and experiment with the data, try something else, and they want to do it quickly. They don’t want to wait ten minutes on a database to come up." This agility is crucial for rapid AI development and deployment cycles.

Lakehouse//RT: Real-Time Analytics at Scale

Complementing the LTAP architecture is Lakehouse//RT, a real-time analytics engine designed to deliver millisecond-level query speeds directly on data residing in Delta and Iceberg tables within the lakehouse. Previously, organizations often had to deploy separate, specialized systems and duplicate data into a "serving layer" to achieve such low latency. Lakehouse//RT eliminates this need by enabling high-performance analytics without additional data copies, pipelines, or governance complexities.

The engine is powered by Reyden, a vectorized engine developed by Databricks. Mehrshad Setayesh, SVP of Engineering at PointClickCare, said that Lakehouse//RT "ran more than a third faster on average than our prior warehouse on our healthcare dataset, with 10x faster queries." He added that Lakehouse//RT has removed the need for a dedicated real-time system alongside the company’s existing lakehouse. Databricks is also pitching the engine’s high concurrency as critical for serving the simultaneous analytical demands of numerous AI agents and human users.

The Role of Mooncake and Neon

The ability of LTAP to operate on a single copy of data stored in open formats, eliminating complex data pipelines, is largely attributed to the integration of technologies from Databricks’ prior acquisitions. The Lakebase architecture, as previously described by the company, shares a single storage layer across transactional and analytical workloads without data duplication.

The analytical speed of Lakebase comes largely from Mooncake, the startup Databricks acquired to accelerate analytics on PostgreSQL data. Mooncake’s technology mirrors PostgreSQL changes into the lakehouse in real time, ensuring that both transactional operations and analytical queries access the same, up-to-date data. This real-time mirroring creates a columnar copy of the data, which is essential for fast analytical query performance. The advantage of this unified approach is that security, governance, auditing, and high availability only need to be implemented and managed once, on a single, open foundation, reducing operational overhead and complexity.
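The general idea of mirroring row-level changes into a columnar copy can be sketched as follows. This is a toy illustration under stated assumptions, not Mooncake’s implementation: the `ColumnarMirror` class and the event shape (`op`, `key`, `row`) are hypothetical, and a real system would read changes from the database’s replication stream and write to table files rather than Python lists.

```python
class ColumnarMirror:
    """Toy columnar copy kept in sync by applying row-level change events."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}
        self.position = {}  # primary key -> index into every column list

    def apply(self, event):
        op, key = event["op"], event["key"]
        if op == "insert":
            # Rows are stored densely, so the next free index is the row count.
            self.position[key] = len(self.position)
            for name in self.columns:
                self.columns[name].append(event["row"][name])
        elif op == "update":
            idx = self.position[key]
            # Only the changed columns are rewritten.
            for name, value in event["row"].items():
                self.columns[name][idx] = value
        elif op == "delete":
            idx = self.position.pop(key)
            for values in self.columns.values():
                values.pop(idx)
            # Rows stored after the deleted one shift down by one slot.
            self.position = {
                k: (i - 1 if i > idx else i) for k, i in self.position.items()
            }
        else:
            raise ValueError(f"unknown operation: {op}")


mirror = ColumnarMirror(["id", "status", "amount"])
events = [
    {"op": "insert", "key": 1, "row": {"id": 1, "status": "new", "amount": 10.0}},
    {"op": "insert", "key": 2, "row": {"id": 2, "status": "new", "amount": 20.0}},
    {"op": "update", "key": 1, "row": {"status": "paid"}},
    {"op": "delete", "key": 2},
    {"op": "insert", "key": 3, "row": {"id": 3, "status": "new", "amount": 5.0}},
]
for event in events:
    mirror.apply(event)
```

Because every change is applied as it arrives, an analytical scan over `mirror.columns["amount"]` reflects the latest transactional state without a batch ETL step in between.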

The branching feature, crucial for AI agent workflows, is a core component of Neon, the serverless PostgreSQL startup Databricks acquired. Because the data is stored on object storage, agents can fork an entire database in seconds, akin to Git branching, to test and experiment without risking the stability of the production environment. This is a stark contrast to traditional databases, where provisioning or cloning production instances can take hours and carries significant operational risks. Ghodsi reiterated the importance of robust database tools for AI agents, predicting, "In the next 12 months, we’re going to see more software written than ever in the history of mankind. All that software that your organizations are going to write using LLMs and coding tools need the database behind the scenes."
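Why a fork can be fast is easier to see with a copy-on-write sketch. The `Branch` class below is hypothetical and is not Neon’s implementation: it only shows that a branch can start as a pointer to its parent and record just the keys it changes, so creating one copies no data.

```python
class Branch:
    """Toy copy-on-write branch: reads fall through to the parent chain."""

    _DELETED = object()  # tombstone marking a key removed on this branch

    def __init__(self, parent=None):
        self.parent = parent
        self.local = {}  # only the keys written on this branch

    def fork(self):
        # Constant time: the child just points at its parent.
        return Branch(parent=self)

    def put(self, key, value):
        self.local[key] = value

    def delete(self, key):
        self.local[key] = Branch._DELETED

    def get(self, key, default=None):
        node = self
        while node is not None:
            if key in node.local:
                value = node.local[key]
                return default if value is Branch._DELETED else value
            node = node.parent
        return default


main = Branch()
main.put("inventory:widget", 100)
main.put("inventory:gadget", 40)

experiment = main.fork()
experiment.put("inventory:widget", 0)   # visible only on the branch
experiment.delete("inventory:gadget")   # production still sees the row
```

Discarding the experiment means dropping the `experiment` object; `main` is never touched. One simplification to note: in this toy, later writes to `main` would show through to the branch, whereas a production system would pin a branch to a point-in-time snapshot.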

Broader Ecosystem and Strategic Initiatives

LTAP was a central announcement at the Data + AI Summit, but Databricks also revealed several other initiatives designed to support the evolving AI landscape. In response to the growing concern over "agent sprawl" and associated costs, Databricks introduced the Unity AI Gateway. This unified control point aims to manage all models, agents, MCP servers, and skills within an organization, offering features such as spending dashboards, budget controls, rate limiting, and single sign-on.
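Budget controls and rate limiting of this kind are commonly built from a spending cap plus a token bucket. The sketch below is a generic illustration, not the Unity AI Gateway API: the `AgentPolicy` class, its parameters, and the reason strings are all hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
class AgentPolicy:
    """Toy per-agent policy: token-bucket rate limit plus a spending cap."""

    def __init__(self, rate_per_sec, burst, budget_usd):
        self.rate = rate_per_sec      # tokens refilled per second
        self.capacity = burst         # maximum tokens that can accumulate
        self.tokens = float(burst)
        self.budget_left = budget_usd
        self.last_seen = None

    def allow(self, now, cost_usd):
        """Return (allowed, reason) for a request arriving at time `now`."""
        if self.last_seen is not None:
            elapsed = max(0.0, now - self.last_seen)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_seen = now
        if self.tokens < 1.0:
            return False, "rate_limited"
        if cost_usd > self.budget_left:
            return False, "over_budget"
        # Charge the request only once both checks have passed.
        self.tokens -= 1.0
        self.budget_left -= cost_usd
        return True, "ok"


policy = AgentPolicy(rate_per_sec=1.0, burst=2, budget_usd=1.0)
decisions = [
    policy.allow(0.0, 0.25),  # first request in the burst
    policy.allow(0.0, 0.25),  # second request uses the last token
    policy.allow(0.0, 0.25),  # bucket is empty
    policy.allow(1.0, 0.25),  # one token has refilled
    policy.allow(2.0, 0.50),  # token available, but only 0.25 USD remains
]
```

A rejected request consumes neither a token nor budget, so an agent that backs off and retries is not penalized twice.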

The company also unveiled Genie One, a general-purpose AI agent designed for business teams. Genie One is powered by Genie Ontology, a new layer that constructs a ranked graph of a company’s data using an algorithm called OntoRank, inspired by Google’s PageRank. This aims to provide agents with a more intelligent and contextual understanding of organizational data.
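The article does not describe how OntoRank works, so the sketch below shows only plain PageRank, the algorithm it is said to be inspired by, applied to a hypothetical graph of tables. The table names and the meaning of an edge (here, "references") are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain PageRank by power iteration over a dict of node -> outgoing links."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical schema graph: an edge means "this table references that one".
table_links = {
    "orders": ["customers", "products"],
    "invoices": ["customers", "orders"],
    "returns": ["orders", "customers"],
    "customers": [],
    "products": [],
}
ranks = pagerank(table_links)
top_table = max(ranks, key=ranks.get)
```

In this toy graph `customers` is referenced by every other transactional table, so it ends up with the highest rank, which is the kind of signal an agent could use to decide which tables matter most.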

Further enhancing data and AI asset sharing, Databricks highlighted OpenSharing, a new protocol for sharing data, models, and agent skills across platforms. This initiative builds upon the foundation of Delta Sharing and is now a project under the Linux Foundation, signaling a commitment to open standards and interoperability.

In a move to cater to specific industry needs, Databricks launched CustomerLake, a customer data platform tailored for marketing teams. This product aims to leverage the data stored on Databricks to provide actionable insights for marketing campaigns. Additionally, the company announced its agreement to acquire Panther, a security company, to bolster its Lakewatch security information and event management service.

Databricks’ Differentiating Factor: The Data Layer

In an increasingly crowded enterprise AI market, where many vendors are introducing agent development and orchestration tools, Databricks differentiates itself through its expertise in the data layer and its data science heritage. While many SaaS providers treat their existing customer data and domain expertise as their competitive advantage, Databricks positions itself as a foundational utility layer. Industry-specific products such as CustomerLake suggest a strategic effort to build value-added layers on top of that infrastructure, which lets Databricks remain a relatively neutral platform provider while still offering specialized solutions. LTAP is a significant step in the same direction: it aims to simplify and accelerate the integration of operational and analytical data for AI-driven workloads.