Physical Foundation Models: Fixed hardware implementations of large-scale neural networks

Researchers from Yale University, Cornell University, Boston University, and NTT Research have proposed Physical Foundation Models (PFMs): fixed hardware implementations in which a neural network's architecture and weights are embedded directly in the physical matter of the device, rather than run as software on general-purpose digital processors. The proposal, detailed in a technical paper released in April 2026, argues that as foundation models like GPT-5 and Gemini 3 reach a level of general-purpose utility and stability, the industry should transition to manufacturing specialized hardware that realizes these models through natural physical dynamics rather than digital simulation.

The Shift from Software-Defined to Hardware-Resident AI

For the past decade, artificial intelligence has run almost entirely on von Neumann-style digital hardware, in which general-purpose processors such as CPUs and GPUs fetch instructions and data from separate memory. While this flexibility allowed for rapid experimentation during the early years of deep learning, it has led to significant inefficiencies in energy consumption and data movement. The rise of foundation models, massive networks trained on trillions of tokens and capable of performing hundreds of diverse tasks, has changed the technological landscape.

According to the research team, led by Logan G. Wright, Tianyu Wang, Tatsuhiro Onodera, and Peter L. McMahon, the "foundation" nature of these models means they no longer need to be updated or retrained weekly to remain useful. Instead, major versions of these models are now released on a roughly annual cadence. This stability presents a unique window for hardware engineering: the ability to manufacture a chip or a physical medium that is the model itself. Rather than loading weights from memory into a processor, the weights are represented by physical properties—such as the refractive index of glass or the conductance of a nanoelectronic junction—and the computation occurs as a result of the laws of physics acting on the system.
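The conductance case can be illustrated with a resistive crossbar, a common way to picture weights stored as junction conductances. The sketch below is only an illustration under that assumption, not the authors' design: the function name, matrix values, and voltages are hypothetical. Each junction passes a current G * V (Ohm's law), and the currents on a shared output line add (Kirchhoff's current law), which is exactly a matrix-vector product.

```python
def crossbar_output_currents(conductances, voltages):
    """Return the current collected on each output line of a resistive crossbar.

    conductances[i][j] is the conductance (siemens) of the junction joining
    input line j to output line i; voltages[j] is the voltage (volts) applied
    to input line j. Ohm's law gives each junction current G * V, and
    Kirchhoff's current law sums those currents on the shared output line.
    """
    return [
        sum(g * v for g, v in zip(row, voltages))
        for row in conductances
    ]


# Hypothetical 2x3 weight matrix stored as junction conductances (siemens).
G = [
    [1e-6, 2e-6, 3e-6],
    [4e-6, 0.0, 1e-6],
]
V = [0.5, 1.0, 0.0]  # input activations encoded as voltages (volts)

currents = crossbar_output_currents(G, V)
print(currents)  # output currents in amperes, one per output line
```

Conductances cannot be negative, so a scheme like this would need some additional convention (for example, pairs of junctions whose currents are subtracted) to represent signed weights.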

Technical Mechanics: Beyond Digital Inference

The core of the PFM proposal lies in using the "natural physical dynamics" of a medium to perform the mathematical operations required by neural networks. In traditional digital hardware, even a simple matrix-vector multiplication requires thousands of transistors to switch on and off, generating heat and consuming electricity at every step. In a Physical Foundation Model, the same operations happen as the signal evolves through the medium, effectively "for free": no per-operation switching energy is spent, although preparing the input and reading the output still costs energy.

The researchers highlight a 3D nanostructured glass medium as a primary example of this technology. In this scenario, information is encoded into light waves. As these waves pass through a block of glass with precisely engineered internal structures, the light scatters and interferes in a way that mathematically corresponds to the layers of a neural network. By the time the light reaches the other side of the glass, the "computation" is complete. This optical approach could potentially handle models with 10^12 (one trillion) to 10^15 (one quadrillion) parameters with a fraction of the energy required by a modern GPU cluster.
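A toy numerical model may help make the optical picture concrete. The sketch below assumes the glass can be summarized by a fixed complex transmission matrix, a standard way to describe linear scattering media; random phases stand in for the engineered structure, and all function names and sizes are hypothetical. It is not a model of the device described in the paper.

```python
import cmath
import math
import random


def make_fixed_medium(n_out, n_in, seed=0):
    """Build a fixed complex transmission matrix standing in for the glass.

    Random phases are a placeholder for the engineered internal structure;
    once built, the matrix never changes, mirroring "frozen" weights.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_in)
    return [
        [scale * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
         for _ in range(n_in)]
        for _ in range(n_out)
    ]


def propagate(transmission, field_in):
    """Linear propagation: each output field is a weighted sum of the inputs."""
    return [sum(t * e for t, e in zip(row, field_in)) for row in transmission]


def detect(field_out):
    """Photodetectors measure intensity |E|^2, a nonlinear readout of the field."""
    return [abs(e) ** 2 for e in field_out]


T = make_fixed_medium(n_out=4, n_in=8)

# Encode an input vector as optical field amplitudes (here: light on channel 0 only).
x = [1.0] + [0.0] * 7
intensities = detect(propagate(T, x))
print(intensities)
```

In this picture the interference of scattered light plays the role of the multiply-accumulate step, and the only active components are the light source and the detectors.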

Chronology of the Hardware Evolution

The path toward Physical Foundation Models can be traced through several distinct eras of AI hardware development:

2010–2018: The GPU Era. General-purpose graphics processing units were repurposed for neural network training. This era focused on flexibility, allowing researchers to test thousands of different architectures (CNNs, RNNs, LSTMs).
2019–2023: The Accelerator Era. Specialized digital chips like Google’s TPU and NVIDIA’s H100 series were developed. These chips optimized the data paths for tensor operations but remained digital and memory-constrained.
2024–2025: The Foundation Model Consolidation. AI development converged on the Transformer architecture. Models became so large that the energy cost of moving data between memory and the processor (the "memory wall") became the primary bottleneck.
2026 and Beyond: The PFM Era. The proposal of Physical Foundation Models suggests a move toward "frozen" physical implementations. Instead of software running on a chip, the chip is the software. This would mark a transition from silicon-based digital logic to physics-based analog computation for large-scale inference.

Supporting Data: Energy and Scaling Advantages

The motivations for PFMs are rooted in the stark reality of current energy constraints. As of 2024, global data center energy consumption was estimated to be nearly 2% of the world’s total electricity. Projections suggest that if scaling continues on its current digital trajectory, AI training and inference could consume as much as 10% of global power by 2030.

The researchers provide back-of-the-envelope calculations illustrating the scaling potential of PFMs. While a trillion-parameter model currently requires a massive server rack and megawatts of power for real-time inference, a PFM implementation could potentially perform the same task using milliwatts.
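The form of such a back-of-the-envelope estimate can be sketched in a few lines. The per-operation energies and the serving rate below are placeholder values chosen for round numbers, not figures from the paper, and the model assumes one multiply-accumulate (MAC) per parameter per generated token.

```python
def inference_power_watts(num_parameters, energy_per_mac_joules, tokens_per_second):
    """Average power, assuming one multiply-accumulate per parameter per token."""
    energy_per_token = num_parameters * energy_per_mac_joules
    return energy_per_token * tokens_per_second


PARAMS = 1e12               # trillion-parameter model, as in the text
TOKENS_PER_SECOND = 1e3     # assumed serving rate (placeholder)
DIGITAL_J_PER_MAC = 1e-12   # assumed 1 pJ per MAC, data movement included (placeholder)
PHYSICAL_J_PER_MAC = 1e-18  # assumed 1 aJ per MAC in a physical medium (placeholder)

digital_w = inference_power_watts(PARAMS, DIGITAL_J_PER_MAC, TOKENS_PER_SECOND)
physical_w = inference_power_watts(PARAMS, PHYSICAL_J_PER_MAC, TOKENS_PER_SECOND)

print(f"digital:  {digital_w:.3g} W")
print(f"physical: {physical_w:.3g} W")
print(f"ratio:    {digital_w / physical_w:.3g}x")
```

Both figures scale linearly with the serving rate, so the ratio between them depends only on the assumed energy per operation.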

Furthermore, parameter density, the number of "neurons" and "synapses" that can be packed into a cubic centimeter, could be orders of magnitude higher in physical media. In a 3D nanostructured material, information is processed throughout the volume, whereas traditional chips are largely restricted to 2D planes. This allows for the theoretical possibility of 10^18-parameter models, well beyond the roughly 10^14 to 10^15 synapses commonly estimated for the human brain, residing in a device the size of a standard laptop.
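A quick calculation shows what this density implies for feature size. The sketch assumes a laptop-sized volume of about one litre (1000 cm^3), that the whole volume stores parameters, and that each parameter occupies one cubic voxel; all three are simplifying assumptions made here, not figures from the paper.

```python
def voxel_side_nm(volume_cm3, num_parameters):
    """Side length (nm) of the cube of material available per parameter."""
    nm3_per_cm3 = 1e21  # (1e7 nm per cm) cubed
    volume_per_param_nm3 = volume_cm3 * nm3_per_cm3 / num_parameters
    return volume_per_param_nm3 ** (1.0 / 3.0)


LAPTOP_VOLUME_CM3 = 1000.0  # assumed: roughly one litre

side = voxel_side_nm(LAPTOP_VOLUME_CM3, 1e18)
print(f"{side:.0f} nm per parameter")
```

Under these assumptions each parameter gets a cube roughly 100 nm on a side, which gives a sense of the fabrication resolution such a device would demand.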

Implications for Datacenters and Edge Computing

The adoption of Physical Foundation Models would likely bifurcate the AI industry into two distinct streams: training and inference.

Datacenter Impact:
For large-scale providers like Amazon Web Services, Microsoft Azure, and Google Cloud, PFMs represent a solution to the "power wall." By deploying fixed hardware for popular models (e.g., a physical implementation of GPT-5), these providers could increase their inference throughput by 100x without increasing their power footprint. This would drastically lower the cost of AI services and reduce the environmental impact of the industry.

Edge Computing and Consumer Devices:
Perhaps the most transformative impact would be at the "edge": in smartphones, medical devices, and autonomous vehicles. Currently, running a large language model on a smartphone drains the battery quickly and often relies on cloud processing. A PFM "inference module" could be integrated into consumer electronics, providing fast, capable local inference without a constant internet connection or heavy battery usage. This would enable "Always-On AI" that is both private and efficient.

Challenges and Research Hurdles

Despite the immense promise, the researchers acknowledge significant hurdles that must be overcome before PFMs become a commercial reality.

Fabrication Precision: Manufacturing 3D nanostructures at the scale required for a trillion-parameter model is a monumental task. It requires advancements in 3D printing at the nanoscale or new forms of lithography that can penetrate deep into a material.
The "Fixed" Problem: Because the weights are physical, they cannot be changed once the device is manufactured. This necessitates a "perfect" training phase. Any errors in the digital version of the model would be permanently etched into the physical version.
Input/Output Bottlenecks: While the internal computation might proceed at the speed of light and consume almost no energy, converting digital data into physical signals (like light or electrical pulses) and back again still incurs an energy cost.
Thermal Management: Even though PFMs are more efficient, processing 10^15 parameters at high speeds will still generate some heat that must be dissipated to prevent physical degradation of the medium.
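The fabrication-precision and "fixed" problems can be explored with a small simulation: perturb every weight of a fixed layer by a random fraction and measure how far the output drifts. This is a minimal sketch under simple assumptions (a single linear layer, independent Gaussian errors proportional to each weight); the function names, layer size, and tolerance values are hypothetical.

```python
import random


def matvec(weights, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relative_output_error(weights, x, tolerance, seed=0):
    """Relative L2 output error when each weight is off by a random fraction.

    Each weight w becomes w * (1 + tolerance * g), with g drawn from a
    standard normal distribution. The same seed reuses the same error
    pattern, so only its magnitude changes with `tolerance`.
    """
    rng = random.Random(seed)
    perturbed = [
        [w * (1.0 + tolerance * rng.gauss(0.0, 1.0)) for w in row]
        for row in weights
    ]
    ideal = matvec(weights, x)
    actual = matvec(perturbed, x)
    err = sum((a - b) ** 2 for a, b in zip(actual, ideal)) ** 0.5
    norm = sum(b ** 2 for b in ideal) ** 0.5
    return err / norm


rng = random.Random(1)
W = [[rng.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(8)]
x = [rng.uniform(-1.0, 1.0) for _ in range(16)]

for tol in (0.0, 0.01, 0.05):
    print(tol, relative_output_error(W, x, tol))
```

For a single linear layer the output error grows in proportion to the tolerance; a deep network would compound such errors across layers, which this sketch does not model.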

Official Responses and Industry Outlook

While the paper represents an academic and theoretical framework, early reactions from the semiconductor and AI industries have been cautiously optimistic. Industry analysts suggest that the "fixed hardware" approach aligns well with the existing manufacturing cycles of companies like TSMC and Intel. If a model version is expected to be the industry standard for 12 to 18 months, the capital expenditure required to build a specialized PFM fab could be justified by the massive savings in operational energy costs.
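The capital-expenditure argument reduces to a break-even comparison between up-front cost and energy saved over the model's service life. The sketch below shows only the shape of that comparison; every number in it is a hypothetical placeholder, and it ignores everything except electricity cost.

```python
HOURS_PER_MONTH = 730.0  # approximate average


def energy_savings(digital_kw, pfm_kw, lifetime_months, price_per_kwh):
    """Electricity cost saved over the service life (same currency as the price)."""
    hours = lifetime_months * HOURS_PER_MONTH
    return (digital_kw - pfm_kw) * hours * price_per_kwh


def capex_is_justified(capex, digital_kw, pfm_kw, lifetime_months, price_per_kwh):
    """True when the energy savings alone cover the up-front cost."""
    savings = energy_savings(digital_kw, pfm_kw, lifetime_months, price_per_kwh)
    return savings >= capex


# Hypothetical deployment: 1000 kW digital vs 10 kW PFM, 12-month model lifetime.
savings = energy_savings(digital_kw=1000.0, pfm_kw=10.0,
                         lifetime_months=12, price_per_kwh=0.10)
print(f"savings over lifetime: {savings:,.0f}")
print(capex_is_justified(500_000.0, 1000.0, 10.0, 12, 0.10))
print(capex_is_justified(1_000_000.0, 1000.0, 10.0, 12, 0.10))
```

A longer model lifetime or a larger deployment raises the savings linearly, which is why the 12-to-18-month stability window matters to the argument.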

"The philosophy of foundation models is to put effort into a single, large general-purpose model," the researchers conclude. "It now makes sense to build special-purpose, fixed hardware implementations… manufactured and released at the roughly 1-year cadence of major new foundation-model versions."

The proposal serves as a call to action for hardware engineers to move away from the safety of digital logic and embrace the complexity of physical dynamics. As the AI industry moves toward models that require more parameters than there are transistors on a modern chip, the shift to Physical Foundation Models may not just be an alternative—it may be the only viable path forward for the continued scaling of artificial intelligence.

In the coming years, the success of this approach will depend on the synergy between material science, optical engineering, and neural network theory. If the challenges of 3D fabrication can be met, the world may soon see the first "trillion-parameter glass" chips, bringing the full power of a datacenter into the palm of a hand.