Semiconductor Engineering Adds New Research on LLM Scaling, Chiplet Security, and GPU Power Management to Technical Library

The global semiconductor industry is currently navigating a period of rapid architectural transformation, driven primarily by the escalating demands of artificial intelligence (AI), the complexities of heterogeneous integration, and the critical need for power efficiency in hyperscale data centers. To address these evolving challenges, Semiconductor Engineering has expanded its technical library with a suite of new research papers. These publications, authored by leading academic institutions and industry giants such as Samsung, Google, IBM, and AMD, provide deep insights into the next generation of hardware design, from the physical foundations of neural networks to the security vulnerabilities of advanced 2.5D and 3D packaging.

The latest additions to the library highlight a shift toward specialized hardware-software co-design. As Moore’s Law slows, the industry is moving away from general-purpose compute toward domain-specific architectures (DSAs). This transition is reflected in the diverse range of topics covered in the new papers, which include large-scale neural network implementations, Large Language Model (LLM) inference scaling, silent data corruption detection, and the physics of NAND flash memory.

Advancing AI Hardware: From Fixed Implementations to LLM Scaling

As AI models grow in complexity, the hardware required to run them must evolve beyond traditional GPU-centric approaches. A collaborative effort between Yale University, Cornell University, Boston University, and NTT Research has resulted in a paper titled "Physical Foundation Models: Fixed HW implementations of large-scale neural networks." This research explores the potential of "fixed" hardware—circuits specifically designed for a single model or function—as a way to bypass the overhead associated with programmable logic. By mapping neural network architectures directly onto physical hardware substrates, researchers aim to achieve orders-of-magnitude improvements in energy efficiency and latency. This approach is particularly relevant for edge AI applications where power budgets are extremely constrained.

In parallel with hardware specialization, the industry is grappling with the sheer scale of LLM deployment. Micron Technology and Argonne National Laboratory have contributed a study titled "Understanding Inference Scaling for LLMs: Bottlenecks, Trade-offs, and Performance Principles." This research arrives at a critical time as companies transition from training models to serving them at scale. The paper identifies the primary bottlenecks in GPU-based inference, particularly for reasoning-centric LLMs that require high memory bandwidth and low-latency communication. By characterizing these performance principles, the researchers provide a roadmap for data center operators to optimize their infrastructure for the next wave of generative AI.
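
To make the memory-bandwidth bottleneck concrete, the following is a minimal back-of-the-envelope sketch, not the paper's model. It assumes that each decode step must stream every model weight from GPU memory once, and it ignores KV-cache traffic, compute time, and communication. The function name and all numbers (a 70-billion-parameter model, 2 bytes per weight, 2 TB/s of bandwidth) are hypothetical values chosen for illustration.

```python
def decode_tokens_per_second(param_count, bytes_per_param,
                             mem_bandwidth_bytes_per_s, batch_size=1):
    """Rough upper bound on decode throughput when weight reads dominate.

    Assumption: every decode step reads all weights from memory exactly once,
    and the whole batch shares that single pass over the weights.
    """
    bytes_per_step = param_count * bytes_per_param
    steps_per_second = mem_bandwidth_bytes_per_s / bytes_per_step
    return steps_per_second * batch_size


# Hypothetical configuration, for illustration only.
single_stream = decode_tokens_per_second(70e9, 2, 2e12)
batched = decode_tokens_per_second(70e9, 2, 2e12, batch_size=8)
print(f"batch 1: {single_stream:.1f} tokens/s, batch 8: {batched:.1f} tokens/s")
```

Under these assumptions the bound scales with memory bandwidth and batch size rather than with arithmetic throughput, which is one simple way to see why bandwidth, and not raw compute, tends to limit inference serving.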

Addressing the Crisis of Silent Data Corruption in Modern CPUs

As semiconductor manufacturing nodes shrink to 3nm and below, the reliability of individual transistors becomes increasingly difficult to guarantee. One of the most pressing issues for hyperscalers like Google and Meta is "Silent Data Corruption" (SDC)—errors that occur within a CPU without being flagged by traditional error-correction mechanisms. To combat this, Stanford University and Google have introduced "ITHICA: Intra-Thread Instruction Checking Approach for Defect-Induced Silent Data Corruptions."

The ITHICA approach focuses on detecting defects that manifest only under specific workloads or environmental conditions, which are often missed during factory testing. By implementing instruction-level checking within the execution thread, the researchers propose a method to identify "mercurial cores" before they can cause catastrophic data integrity issues in a cloud environment. This research is a direct response to the growing frequency of SDCs in modern data centers, where even a one-in-a-billion error rate can lead to significant service disruptions across millions of servers.
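
The general idea of checking results inside the executing thread can be illustrated with a toy example. This is a generic redundant-computation sketch, not ITHICA's actual mechanism, which the summary above does not detail. The defective multiplier, its trigger pattern, and the function names are all hypothetical.

```python
def faulty_multiply(a, b):
    """Hypothetical defective multiplier: wrong only for one operand pattern."""
    result = a * b
    if a & 0xFF == 0xA5:       # defect triggers on a specific low-byte pattern
        result ^= 1 << 7       # silently flip one bit of the result
    return result


def checked_multiply(a, b, multiply=faulty_multiply):
    """Multiply non-negative ints, then verify via an independent code path.

    The check recomputes the product with shift-and-add, so a defect in the
    primary multiply path is unlikely to corrupt both results the same way.
    """
    primary = multiply(a, b)
    check, x, y = 0, a, b
    while y:
        if y & 1:
            check += x
        x <<= 1
        y >>= 1
    if primary != check:
        raise RuntimeError(f"silent data corruption detected for {a} * {b}")
    return primary


print(checked_multiply(3, 4))          # operands that do not trigger the defect
try:
    checked_multiply(0xA5, 3)          # operands that trigger the defect
except RuntimeError as err:
    print(err)
```

The toy defect fires only for one operand pattern, mirroring how such faults can pass factory tests yet surface under particular workloads. The cost of this kind of checking is the extra work done for every verified operation.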

Innovations in Memory and Storage: NAND Flash and Heterogeneous SoCs

The storage sector continues to push the boundaries of density and reliability. The University of Seoul and Samsung Electronics have published a paper titled "Impact of Band-to-Band Tunneling in the Charge Trap Layer of NAND Flash Memory." This technical deep dive explores the physics of V-NAND (Vertical NAND) flash, specifically focusing on the charge trap layer (CTL). As layers are stacked higher to increase capacity, the impact of band-to-band tunneling (BTBT) becomes a critical factor in data retention and leakage. Samsung’s involvement underscores the importance of this research for the future of high-density SSDs used in enterprise and consumer electronics.

Furthermore, the complexity of designing modern Systems-on-Chip (SoCs) has necessitated new benchmarking methodologies. Columbia University and IBM Research have introduced "HSCO-Bench: An Agent-Driven End-to-End HW-SW Co-design Benchmark for Systems-on-Chip." This benchmark is designed to evaluate heterogeneous systems where hardware accelerators and software stacks must be tuned simultaneously. By using an agent-driven approach, HSCO-Bench allows designers to simulate complex workflows and identify performance bottlenecks early in the design cycle, potentially shaving months off the development of custom silicon.

Security Risks in the Era of Chiplets and 3D Integration

One of the most significant shifts in semiconductor manufacturing is the move toward chiplets and 2.5D/3D integrated systems. While these technologies allow for greater performance and modularity, they also introduce new security vectors. A paper from Grenoble INP – UGA, CNRS, and TIMA, titled "Spying Across Chiplets: Side-Channel Attacks in 2.5/3D Integrated Systems," warns of the risks inherent in advanced packaging.

The research demonstrates that the close proximity of chiplets, whether placed side by side on a silicon interposer or stacked and connected by through-silicon vias (TSVs), can facilitate side-channel attacks. Specifically, an adversary could potentially monitor electromagnetic emissions or power fluctuations from a "victim" chiplet to extract sensitive data, such as cryptographic keys. As the industry moves toward standardized chiplet interconnects like UCIe (Universal Chiplet Interconnect Express), this research highlights the urgent need for physical-layer security protocols to prevent cross-chiplet spying.
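
How power fluctuations can reveal a key is easiest to see in a simulation. The sketch below is a textbook-style correlation analysis on synthetic data, not the attack from the paper. It assumes a simplified leakage model in which the victim's power draw is proportional to the Hamming weight of (plaintext XOR key) plus Gaussian noise. The key value, trace count, and noise level are hypothetical.

```python
import random


def hamming_weight(x):
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_traces(secret_key, n_traces, noise_sigma, rng):
    """Synthetic 'power' samples under the assumed Hamming-weight model."""
    plaintexts = [rng.randrange(256) for _ in range(n_traces)]
    leakage = [hamming_weight(p ^ secret_key) + rng.gauss(0, noise_sigma)
               for p in plaintexts]
    return plaintexts, leakage


def recover_key_byte(plaintexts, leakage):
    """Return the key guess whose predicted leakage best matches the traces."""
    def score(guess):
        predicted = [hamming_weight(p ^ guess) for p in plaintexts]
        return pearson(predicted, leakage)
    return max(range(256), key=score)


rng = random.Random(0)                      # fixed seed for repeatability
plaintexts, leakage = simulate_traces(0x3C, 2000, 0.5, rng)
recovered = recover_key_byte(plaintexts, leakage)
print(f"recovered key byte: {recovered:#04x}")
```

The attacker in this toy model never reads the key directly. It only needs known inputs and a noisy physical measurement that depends on the secret, which is why isolating chiplets electrically and electromagnetically matters.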

Enhancing Energy Efficiency through Component-Level Power Management

In the high-performance computing (HPC) and gaming sectors, GPUs remain the primary workhorses, but their power consumption has reached levels that challenge thermal management and sustainability goals. AMD has contributed a paper titled "CompPow: A Case for Component-level GPU Power Management," which argues for a more granular approach to energy regulation.

Traditional power management techniques often treat the GPU as a single entity or a few large blocks. CompPow proposes managing power at the individual component level, allowing for more precise control over voltage and frequency based on real-time workload demands. AMD’s research indicates that this fine-grained control can significantly reduce energy waste without compromising performance, a vital advancement for both mobile devices and massive AI training clusters.
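
The potential saving from finer granularity can be sketched with the standard dynamic-power relation P = C * V^2 * f. This is an illustrative calculation, not CompPow's algorithm. The component names, capacitance values, required frequencies, and the linear voltage-frequency relation are all hypothetical assumptions, and the units are arbitrary.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Classic dynamic-power relation: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


def voltage_for(frequency_ghz, v_min=0.6, v_per_ghz=0.2):
    """Assumed linear voltage-frequency relation (hypothetical constants)."""
    return v_min + v_per_ghz * frequency_ghz


# Hypothetical components: (switched capacitance, frequency the workload needs)
components = {
    "compute_units": (10.0, 2.0),
    "memory_controller": (4.0, 1.0),
    "video_engine": (2.0, 0.5),
}

# Chip-wide policy: every component runs at the highest required frequency.
f_max = max(freq for _, freq in components.values())
global_power = sum(dynamic_power(cap, voltage_for(f_max), f_max)
                   for cap, _ in components.values())

# Component-level policy: each component runs only as fast as it needs to.
component_power = sum(dynamic_power(cap, voltage_for(freq), freq)
                      for cap, freq in components.values())

saving = 1 - component_power / global_power
print(f"chip-wide: {global_power:.2f}, per-component: {component_power:.2f}, "
      f"saving: {saving:.0%}")
```

Because voltage enters the relation squared, lightly loaded components that can drop both frequency and voltage account for most of the saving in this model.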

Chronology and Industry Context

The release of these papers follows a series of industry-wide discussions regarding the "Power Wall" and the "Memory Wall," two major hurdles in contemporary computing. Since late 2022, the semiconductor industry has seen:

Late 2022: The emergence of LLMs as a mainstream technology, leading to a massive spike in GPU demand.
Mid-2023: Increased reports from major cloud providers regarding the prevalence of "mercurial" CPU cores and the need for better SDC detection.
Early 2024: The formalization of chiplet standards and the first commercial deployments of 3D-stacked logic-on-logic chips.
Present: A concerted effort by research organizations to move beyond theoretical models into practical, hardware-verified implementations of AI and security solutions.

Broader Impact and Future Implications

The research aggregated by Semiconductor Engineering serves as a bellwether for the future of the industry. The move toward "Physical Foundation Models" and fixed hardware suggests that we may be entering an era of "Extreme Specialization," where the chips in our devices are no longer general-purpose processors but a collection of highly efficient, task-specific engines.

The focus on side-channel attacks in chiplets is also a timely warning. As the supply chain becomes more disaggregated, with chiplets from different vendors potentially sitting on the same package, trust and security will become the most valuable commodities in semiconductor manufacturing. Standards bodies will likely look to this research to develop more robust isolation techniques for heterogeneous integration.

Finally, the collaboration between academia and industry (e.g., Stanford and Google, Columbia and IBM) emphasizes that the challenges facing the semiconductor world are too large for any single entity to solve. The path forward requires a unified approach that combines theoretical physics, advanced materials science, and software engineering. These papers provide the foundational knowledge necessary to build the more reliable, secure, and efficient systems of the next decade.

For engineers, architects, and researchers, these additions to the Semiconductor Engineering library represent a vital resource for staying ahead of the technological curve in an industry that remains the bedrock of the global digital economy.