The traditional hierarchy of enterprise information technology is undergoing a fundamental inversion as the rise of artificial intelligence forces a shift from application-centric models to a "data primacy" approach. This thesis, championed by Pure Storage CEO Charlie Giancarlo at the recent Accelerate conference in Las Vegas, posits that for fifty years, enterprises have prioritized applications as the core of their infrastructure, treating data as a secondary byproduct. However, the demands of modern AI and real-time decision-making now require data to sit at the center of the enterprise, with applications functioning as modular tools that move to where the data resides. While the theoretical benefits of this shift are significant, the practical implementation remains a complex challenge for global organizations. Recent research indicates that for many senior technology leaders, data remains inaccessible, fragmented, and fraught with regulatory risk. During the conference, executives from UGI Corporation and Sanofi provided contrasting yet complementary blueprints for navigating this transition, highlighting the distinct paths of data consolidation and federated compute.
The Shift Toward Data Primacy: Historical Context and the AI Catalyst
The concept of data primacy marks a departure from the architectural standards established during the mainframe and early client-server eras. In those models, data was often "trapped" within the silos of specific applications, such as Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM) systems, making cross-functional analysis difficult. As organizations moved toward the cloud, this fragmentation often persisted and was compounded by what industry analysts describe as "data gravity": as datasets grow, they become costlier and slower to move, so applications and services are pulled toward the data rather than the other way around.
IDC has projected that the global "datasphere" will reach 175 zettabytes by 2025. In this environment, the traditional method of pulling data from various sources into a centralized application for processing is becoming increasingly unsustainable. The emergence of Generative AI and agentic workflows has accelerated the need for a new architecture. AI models require vast amounts of high-quality, contextual data to provide value; without a data-centric foundation, these models risk producing "hallucinations" or irrelevant outputs. Pure Storage’s argument is that by inverting the stack, placing data at the core and pushing applications downstream, enterprises can achieve the agility required to compete in an AI-driven economy.
UGI Corporation: Overcoming Fragmentation in Critical Infrastructure
For UGI Corporation and its subsidiary AmeriGas, the journey toward data primacy begins with the foundational task of consolidation. As a major US energy holding company involved in both regulated gas and electric utilities and the retail distribution of propane, UGI operates in an environment where data is historically siloed across disparate business units and geographic regions. Eric Frost, Director of Infrastructure Services at UGI Corporation, identified data fragmentation as the primary hurdle to modernization.
The utility sector presents unique challenges because it bridges the gap between traditional Information Technology (IT) and Operational Technology (OT). OT data, generated by industrial sensors, meters, and grid infrastructure, often exists in proprietary formats on legacy systems that were never designed for interoperability. Frost noted that bringing this data together in an organized fashion is an ongoing challenge, requiring strategic partnerships with infrastructure providers like Pure Storage and Nutanix.
The stakes for UGI are higher than in many other industries. In the utility business, data availability is a matter of public safety. If a gas leak occurs, the systems responsible for monitoring and response must be online and responsive without exception. Consequently, UGI’s approach to AI and data primacy is grounded in practical, "unglamorous" work focused on operational efficiency and customer satisfaction.
Earlier this year, UGI initiated a deep partnership with Amazon Web Services (AWS) to leverage AI as a "force multiplier." Rather than pursuing high-level experimental projects, UGI focused on training program managers to develop small, impactful use cases. By moving away from "old-fashioned" manual data handling, the company aims to optimize customer service interactions, such as billing inquiries and propane reordering. The goal is to use AI to reduce friction for the customer while simultaneously cleaning and organizing the underlying data architecture.
Sanofi: From Centralization to Federated Compute and FAIR Principles
While UGI is focused on pulling scattered data together, the global biopharma giant Sanofi is exploring the opposite end of the maturity curve. Having spent a decade investing in centralized data management, Sanofi is now pivoting toward a federated approach to address "decision latency." Pradeep Bandaru, Head of Platforms & AI at Sanofi, explained that the company has long adhered to the FAIR principles—ensuring data is Findable, Accessible, Interoperable, and Reusable.
For ten years, Sanofi built extensive data and infrastructure engineering arms to support centralized platforms on-premises, in the cloud, and at the edge. This model was designed to handle data from clinical trials, academic partnerships, and corporate acquisitions. However, the "centralization tax"—the time and resources required to move data into a single repository—often conflicted with the need for rapid scientific outcomes.
In response, Sanofi is now pushing compute capabilities back to where the data is originally generated, whether that be in a laboratory, a manufacturing plant, or a regulated clinical site. This federated approach is driven by several factors:
- Regulatory Compliance: Data residency and sovereignty laws (such as GDPR in Europe) often restrict the movement of sensitive patient data across borders.
- Latency: Moving petabytes of R&D data to a central cloud for processing creates delays that can slow the pace of drug discovery.
- Governance: By keeping data at the source and bringing the compute to it, Sanofi can maintain a more rigorous audit trail of data lineage and provenance.
Bandaru emphasized that as AI agents begin to perform autonomous activities within the research process, the ability to track the history of a data point becomes essential for both scientific integrity and regulatory approval.
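The pattern described above, in which compute travels to the data and an audit trail stays behind, can be illustrated with a minimal Python sketch. It is not Sanofi's implementation: the Site class, the site names, the sample values, and the job name are all hypothetical. Each site runs a small job against its own records, returns only an aggregate, and logs that the job ran; the coordinator combines the aggregates without ever seeing raw records.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass
class Site:
    """Hypothetical data-holding location, e.g. a lab or clinical site."""
    name: str
    region: str
    records: List[float]  # raw measurements; run() never returns them
    audit_log: List[Dict[str, str]] = field(default_factory=list)

    def run(self, job_name: str,
            job: Callable[[List[float]], Dict[str, float]]) -> Dict[str, float]:
        """Execute a job next to the data and record that it ran."""
        result = job(self.records)
        self.audit_log.append({
            "job": job_name,
            "site": self.name,
            "region": self.region,
            "ran_at": datetime.now(timezone.utc).isoformat(),
        })
        return result


def partial_stats(records: List[float]) -> Dict[str, float]:
    """Runs at the site: reduce raw records to a count and a sum."""
    return {"n": float(len(records)), "total": sum(records)}


def federated_mean(sites: List[Site]) -> float:
    """Runs at the coordinator: combine per-site aggregates only."""
    partials = [site.run("mean_assay_value", partial_stats) for site in sites]
    n = sum(p["n"] for p in partials)
    total = sum(p["total"] for p in partials)
    return total / n


# Illustrative data only; the values carry no real-world meaning.
sites = [
    Site("lab-eu", "EU", [1.0, 2.0, 3.0]),
    Site("plant-us", "US", [5.0]),
]
mean = federated_mean(sites)
print(mean)
```

In a real deployment the job would be shipped over a network and the log would be tamper-evident, but the division of labor is the same: raw data stays in its region, and every computation leaves a record at the source.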
Context as the New Compute: The Role of Agentic AI
A significant takeaway from Sanofi’s experience is the commoditization of AI models. Bandaru argued that the specific large language model (LLM) or algorithm used is becoming less important than the data layer that feeds it. In his view, "context is effectively the new compute."
Sanofi currently runs agentic workflows (AI systems capable of independent action) against more than 20 petabytes of R&D data. These agents do not necessarily need more powerful models; they need better context to make accurate tool calls and decisions. Providing this context requires a predictable, low-latency data layer and a robust governance "harness" to ensure autonomous systems remain within defined safety parameters.
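One way to picture such a harness is as a thin layer that sits between an agent and its tools, checks each request against a policy, and records the decision. The sketch below is a hypothetical illustration, not a description of Sanofi's system: the ToolHarness class, the search_assays tool, the agent name, and the row limit are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List


class PolicyViolation(Exception):
    """Raised when an agent requests something outside its allowed scope."""


class ToolHarness:
    """Hypothetical gatekeeper: allowlisted tools, a row cap, and a trace."""

    def __init__(self, tools: Dict[str, Callable[..., Any]], max_rows: int) -> None:
        self.tools = tools
        self.max_rows = max_rows
        self.trace: List[Dict[str, Any]] = []

    def call(self, agent: str, tool: str, **kwargs: Any) -> Any:
        # Log every attempt, including the ones that are refused.
        entry = {"agent": agent, "tool": tool, "args": kwargs, "allowed": False}
        self.trace.append(entry)
        if tool not in self.tools:
            raise PolicyViolation(f"{agent} may not call {tool!r}")
        if kwargs.get("limit", 0) > self.max_rows:
            raise PolicyViolation(f"limit exceeds {self.max_rows} rows")
        entry["allowed"] = True
        return self.tools[tool](**kwargs)


def search_assays(compound: str, limit: int = 10) -> List[str]:
    """Stand-in for a real lookup against the data layer."""
    catalog = {"cmpd-1": ["assay-a", "assay-b", "assay-c"]}
    return catalog.get(compound, [])[:limit]


harness = ToolHarness({"search_assays": search_assays}, max_rows=100)
hits = harness.call("agent-7", "search_assays", compound="cmpd-1", limit=2)
print(hits)
```

The agent never holds a direct reference to the tools, so a request for an unlisted tool fails before anything runs, and the trace shows what was attempted either way.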
In the pharmaceutical industry, the objective is "faster time to science." By accelerating the lifecycle of data from creation to decision, Sanofi aims to shorten the years-long process of discovering molecules and testing hypotheses in clinical trials. This reflects a broader trend where the competitive advantage in AI is shifting from those who have the best models to those who have the best-organized, most accessible data.
Chronology of Infrastructure Evolution
The transition discussed at the Accelerate conference can be viewed through a four-stage chronological lens:
- The Application Era (1970s–2000s): Software was king. Data was stored in proprietary databases linked to specific applications. Interoperability was achieved through slow, manual ETL (Extract, Transform, Load) processes.
- The Cloud and Big Data Era (2010s): Organizations began moving data to central warehouses and "data lakes" to gain insights. This led to the "centralization tax" mentioned by Sanofi, where the volume of data began to outpace the efficiency of moving it.
- The AI Integration Era (2020–2023): The sudden explosion of GenAI forced companies to realize that their data was too fragmented or "dirty" to be used effectively by models. This triggered a rush to modernize storage and governance.
- The Data Primacy Era (2024 and Beyond): As exemplified by Pure Storage’s vision, the architecture is being flipped. Data is the permanent asset, and applications and models are transient tools that connect to it via federated networks.
Broader Implications and Market Impact
The shift toward data primacy has significant implications for the global technology market. For infrastructure providers, it necessitates a move toward "Evergreen" models where hardware can be upgraded without disrupting the data layer. For enterprises, it requires a cultural shift where data governance is no longer seen as a back-office compliance function but as a core driver of business value.
Industry analysts suggest that companies that fail to adopt a data-centric architecture may accumulate "AI technical debt." If an organization builds AI tools on top of fragmented, siloed data, the cost of correcting the resulting errors compounds as more tools come to depend on that data. Furthermore, the rise of "sovereign AI," in which nations and regions require data to be processed locally, will make the federated compute model used by Sanofi a necessity rather than an option.
The experiences of UGI Corporation and Sanofi demonstrate that there is no one-size-fits-all approach to data primacy. For some, the immediate priority is the "unglamorous" work of consolidating silos to improve customer satisfaction. For others, the goal is to dismantle centralization in favor of a nimble, federated system that accelerates scientific discovery. Regardless of the path, the consensus from the Accelerate conference is clear: the era of application-centric IT is ending, and the era of the data-driven enterprise has begun. Success in this new landscape will be defined not by the models a company uses, but by the speed, context, and integrity of the data that powers them.
