Professor in electrical and computer engineering aims to improve reliability of integrated circuits by developing effective transistor testing methods

Published: May 16, 2025 8:40 AM

By Joe McAdory

As smartphones, laptops, cloud servers and other next-generation technologies grow more power-efficient and complex, they’re becoming more vulnerable to hidden hardware flaws that traditional testing methods can’t detect. The result: data is corrupted, applications crash and previously saved information is lost.

Adit Singh, the Godbold Endowed Chair Professor in electrical and computer engineering, hopes to contribute to a solution.

Singh’s project, “Understanding Test Escapes and SDC Failures in ICs Caused by Transistors with Extreme Device Parameters from Random Manufacturing Variations,” will create new testing methods that can detect failures caused by subtle manufacturing variations in transistors by targeting their impact on circuit behavior. The aim is to develop effective test screens that reduce undetected failures in operation and thereby improve the reliability of integrated circuits (ICs).

Singh’s research, funded by the Open Compute Project — a global consortium of more than 400 companies that share data center designs and best practices to improve large-scale computing — was one of only five projects awarded worldwide in the field of resilient computing.

“A laptop might contain up to a hundred chips from different manufacturers, making it difficult to pinpoint the cause of any failure,” said Singh, who is working closely with major industry players such as Google, Nvidia and Intel. “If the computer fails, who’s responsible? There’s no system in place to track which chips fail or which manufacturers are providing low-quality components. In consumer electronics, we usually just discard the device without knowing what went wrong. Historically, we have rarely collected comprehensive failure statistics.

“Recently, hyperscale computing systems have emerged to support cloud computing and data storage. These massive computing platforms can be made up of as many as a million identical processor chips from a single manufacturer. In this environment, we can track the statistics of even rarely occurring failures because of the large number of integrated circuits being observed. It’s important to understand the common sources of failure in faulty chips so that the processors can be periodically tested for likely failures and faulty ones deactivated before they can cause any harm. This matters even more when storing data in the cloud. If a file is silently corrupted and that isn’t discovered until weeks later, that’s a serious problem.”

At the heart of Singh’s research is a growing concern in advanced integrated circuit manufacturing: failures caused by rare combinations of a few underperforming, slow transistors resulting from random manufacturing variations. These can cause rare timing errors that appear only under very specific operating conditions and are therefore difficult to detect. Such latent defects are invisible to the standard testing methods currently used in the semiconductor industry.

“Understanding and addressing these new failure modes is essential for future reliable computing infrastructure, especially as we push toward lower power and higher performance in data centers and artificial intelligence systems,” Singh said.

Traditional testing methods operate under the assumption that manufacturing defects are infrequent and isolated. This allows for simplified fault models that can efficiently detect issues like circuit shorts and opens, enabling even the most complex chips to be tested within minutes. As IC complexity and transistor counts scale into the billions, these assumptions no longer hold, Singh said. With so many transistors on a single chip, even extreme variations that appear highly unlikely for any one transistor are observed in large modern chips. These can push critical signals past their timing limits under rare operating conditions, resulting in computational errors that are difficult to detect.
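
The scaling argument can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not drawn from Singh's project. It assumes, purely for illustration, that a transistor parameter follows a normal distribution and varies independently from device to device. The function names, the count of 10 billion transistors and the six-sigma threshold are all hypothetical.

```python
import math


def tail_probability(k_sigma: float) -> float:
    """One-sided probability that a standard normal value exceeds k_sigma.

    Assumption: the transistor parameter is Gaussian, which real process
    data may not follow in the far tails.
    """
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))


def expected_outliers(n_transistors: float, k_sigma: float) -> float:
    """Expected number of transistors beyond k_sigma on one chip."""
    return n_transistors * tail_probability(k_sigma)


def prob_at_least_one(n_transistors: float, k_sigma: float) -> float:
    """Probability that a chip holds at least one such outlier.

    Computes 1 - (1 - p)^N with log1p/expm1 so that a tiny p is not
    lost to floating-point rounding. Assumes independent transistors.
    """
    p = tail_probability(k_sigma)
    return -math.expm1(n_transistors * math.log1p(-p))


if __name__ == "__main__":
    n = 1e10   # hypothetical transistor count for a large modern chip
    k = 6.0    # hypothetical "extreme" threshold, in standard deviations
    print(f"per-transistor tail probability: {tail_probability(k):.3e}")
    print(f"expected outliers per chip:      {expected_outliers(n, k):.2f}")
    print(f"P(at least one outlier):         {prob_at_least_one(n, k):.5f}")
```

Under these assumptions, an event with odds of roughly one in a billion for any single transistor is expected several times on every chip, which is the sense in which "highly unlikely" variations become routine at scale.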

To address this, Singh will develop new testing methodologies tailored to detect failures caused by random manufacturing variations. The emphasis will be on identifying transistors that perform significantly below nominal levels due to variability in parameters such as threshold voltage and drain-induced barrier lowering. These subtle defects often go undetected by current testing approaches.

The research will begin with extensive simulations, leveraging published industrial data to model how such outlier behaviors impact overall chip performance. In parallel, Singh will design custom silicon experiments in collaboration with industry partners to validate the findings under real-world conditions. The goal: create practical, high-coverage test screens that can be adopted broadly to reduce the incidence of test escapes and silent data corruption (SDC) in deployed systems.

“Current test strategies are based on defect models developed decades ago,” Singh said. “It’s time to rethink how we define and detect failures in light of what we now understand about transistor-level variation. Designers today are implicitly relying on test processes to screen out these marginal ICs. The reality is… our current testing methods were never designed to catch the distributed, statistical nature of these timing failures.

“Where universities can contribute is to look at problems in out-of-the-box ways. Companies often pursue low-risk linear solutions based on what they already know, but universities have the freedom to explore unconventional ideas. That’s what my research brings to the table.”

Media Contact: Joe McAdory, jem0040@auburn.edu, 334.844.3447
Adit Singh is the Godbold Endowed Chair Professor in the Department of Electrical and Computer Engineering.

