New Neural Network Slashes Sensor-Data Overload

Researchers say factories on Mars could benefit from the efficient approach


Matthew S. Smith is a contributing editor for IEEE Spectrum and the former lead reviews editor at Digital Trends.

Conceptual illustration showing a heap of numbers ascending through a funnel and coming out as organized binary code.
iStock

Modern technology collects vast amounts of data from sensors, with one estimate projecting global data from Internet of Things devices at about 73 zettabytes (or 73 trillion gigabytes) in 2025. And as more data are collected, the infrastructure required to store, transfer, and process that data grows along with it.

But what if, instead of collecting all possible data from a sensor, we could be more selective, collecting just enough data to accurately identify whatever we’re looking for? That’s the approach proposed by researchers at Pennsylvania State University and MIT. Their paper, recently published in Scientific Reports, demonstrates how a neural network can achieve an accuracy of more than 90 percent while sampling as little as 10 percent of the original sensor data.

“The way I see it, edge computing is going to take a different direction because of what we did—or not just edge, but also edge used alongside cloud computing,” says Soundar Kumara, an industrial engineering professor at Penn State and coauthor on the paper.

Drawing Inspiration from Human Senses

Since the late 1980s, Kumara has been investigating the use of artificial intelligence for industrial systems, with a focus on sensing and sensor data. Over the years, he has researched Fourier transforms, wavelets, and chaos theory, among other ideas.

Recently, Ankur Verma, who was finishing a Ph.D. at Penn State, came to Kumara with a new angle. “Humans can make sense of things with only a small amount of information,” says Verma. “The question we then asked is, can we make machines do the same thing?”

Attempts to tackle this problem face a hurdle in the Nyquist-Shannon sampling theorem, a mathematical result stating that a signal must be sampled at a rate of at least twice its bandwidth to avoid losing information. To accurately measure a 100-hertz sound wave, for example, it must be sampled at 200 Hz or more.
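To see the theorem in action, here is a toy Python sketch (not from the paper; the 100 Hz tone and sample rates are purely illustrative). It samples the same tone above and below the Nyquist rate and checks which frequency the sampled data appears to contain.

```python
# Minimal illustration of the Nyquist limit: a 100 Hz sine sampled at
# 250 Hz is recovered correctly, but at 120 Hz it aliases to 20 Hz.
import numpy as np

f_signal = 100.0  # Hz, the tone we want to measure

def dominant_frequency(fs, duration=1.0):
    """Sample the tone at rate fs and return the strongest frequency
    visible in the sampled data (via an FFT)."""
    t = np.arange(0, duration, 1.0 / fs)
    x = np.sin(2 * np.pi * f_signal * t)
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

print(dominant_frequency(fs=250.0))  # ~100 Hz: above the Nyquist rate, correct
print(dominant_frequency(fs=120.0))  # ~20 Hz: below the Nyquist rate, aliased
```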

This theorem implies that a large volume of sensing data must be collected and processed to obtain accurate results, contributing to the ever-growing amount of sensing data collected and processed by modern devices.

A Drastically Efficient Neural Network

The researchers addressed that problem with a “shift-invariant spectrally stable undersampled network,” or SIUN. It’s a neural network that uses “selective learning” to train on sensor data without using the entirety of the available data.

“We are sampling at Nyquist rates, but we are not collecting every data point at that resolution,” Verma explains, noting that SIUN relies on random seed-based sampling to collect only a portion of the data. “It turns out that you can do this while still preserving most of the data in your signal.” The researchers theorized this would be possible due to the redundancy often found in sensor datasets.
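The authors’ code isn’t reproduced here, but the sampling idea can be sketched in a few lines of Python (an illustration under my own assumptions, not the SIUN implementation): fix a random seed, keep only a small fraction of the points from a fully sampled signal, and pass just those points, along with their positions, to the model.

```python
# Hedged illustration of seed-based undersampling: keep a fixed random
# 10 percent of the points from a signal acquired at full resolution.
import numpy as np

def undersample(signal, keep_fraction=0.10, seed=42):
    """Return a random subset of samples plus the indices where they were
    taken. The fixed seed means the same positions are reused at training
    and inference time."""
    rng = np.random.default_rng(seed)
    n = len(signal)
    idx = np.sort(rng.choice(n, size=int(keep_fraction * n), replace=False))
    return signal[idx], idx

full_signal = np.sin(2 * np.pi * 100 * np.arange(0, 1, 1 / 2048))  # 2,048 points
samples, positions = undersample(full_signal)
print(len(samples))  # 204 points, about 10 percent of the original
```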

The researchers tested SIUN against several datasets used to assess fault detection, such as a Case Western Reserve University dataset that includes a variety of data from good and faulty ball bearings. The neural network was asked to correctly classify the bearings as normal or faulty and, if faulty, identify the type of fault.

The system was 96 percent accurate when just 30 percent of the raw data was sampled from that dataset. When tested against other datasets, SIUN was typically 80 to 90 percent accurate when less than 20 percent of the raw data was sampled.

For comparison, the paper pitted SIUN against a more traditional convolutional neural network (CNN) on the ball-bearing dataset. The CNN won on accuracy, classifying faults with 99.77 percent accuracy versus SIUN’s 96 percent. However, the CNN processes the entire dataset and is a much larger, more complex model: It contained over 3 million parameters, while the SIUN had fewer than 42,000.

Put simply: SIUN’s efficiency beat the CNN, and it wasn’t even close. The researchers found that SIUN “achieves a 435.01x reduction in the number of FLOPS required” to classify the bearings within the dataset. While that was the researchers’ best-case example, tests on other datasets also showed significant efficiency gains, with SIUN reducing compute requirements about 8-fold to 27-fold when compared to a CNN.

Applications for a New Approach to Sensor Data

To bring the point home, Verma brought up an inexpensive, readily available microcontroller. “We deployed our software on a Raspberry Pi Pico,” says Verma. The Pi Pico, priced at just US $4, has 264 kilobytes of RAM and a dual-core processor running at 133 megahertz. “It runs on a few milliwatts of power, but we are still able to run inference on that.”
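A rough back-of-the-envelope check (my own arithmetic, not a figure from the paper) suggests why a model of SIUN’s size fits comfortably on such a device while the comparison CNN would not, assuming 32-bit parameters:

```python
# Back-of-envelope memory estimate, assuming 4 bytes (32-bit float) per parameter.
params_siun = 42_000      # upper bound on SIUN parameters, per the paper
params_cnn = 3_000_000    # approximate CNN parameter count, per the paper
bytes_per_param = 4

print(params_siun * bytes_per_param / 1024)  # ~164 KB: fits within the Pico's 264 KB of RAM
print(params_cnn * bytes_per_param / 1024)   # ~11,719 KB: far too large for the Pico
```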


The researchers have filed several patent applications related to the technology, and Verma, alongside paper coauthor Ayush Goyal, cofounded a company called Lightscline to commercialize the approach. They believe the paper’s findings could be relevant to many practical sensing tasks, but their moon shot is literal: They want to take the idea to space.

“Imagine if we have settlements in space, or on Mars, and there are factories producing things on Mars,” says Verma. “We can’t just buy additional GPUs on Mars. We can’t just put up more cloud storage.”

While Martian factories may sound fantastic, the example represents real-world concerns. SpaceX’s Smallsat Rideshare Program pegs the cost to launch at $6,000 per kilogram. At that rate, putting a single Nvidia DGX H200 system into orbit would cost more than $750,000. The SIUN approach could help missions accomplish sensing tasks in space with lighter hardware that costs less to launch.
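The arithmetic behind that figure is straightforward, though the system mass of roughly 130 kilograms used below is my own assumption rather than a number from the article:

```python
# Rough launch-cost estimate; the ~130 kg DGX system mass is an assumption.
cost_per_kg = 6_000   # US dollars per kilogram, SpaceX Smallsat Rideshare pricing
dgx_mass_kg = 130     # assumed mass of an Nvidia DGX H200 system

print(cost_per_kg * dgx_mass_kg)  # 780,000 dollars, in line with "more than $750,000"
```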

Kumara, also a Lightscline cofounder, had a more down-to-earth example. He thinks SIUN could bring the advantages of AI sensing to rural areas with less-than-stellar access to AI hardware. “Imagine that even manufacturing sites in rural areas, at the edge, you can do this compute. They could come up with much deeper insights into their manufacturing and quality,” he says.

This story was updated on 28 January 2025 to add that Soundar Kumara is also a cofounder of Lightscline.
