Training algorithm breaks barriers to deep physical neural networks

AI-generated (DALL-E 3) conceptual image depicting light waves passing through a physical system. © LWE/EPFL

EPFL researchers have developed an algorithm to train an analog neural network just as accurately as a digital one, enabling the development of more efficient alternatives to power-hungry deep learning hardware.

With their ability to process vast amounts of data through algorithmic ‘learning’ rather than traditional programming, deep neural networks like ChatGPT often seem to have limitless potential. But as the scope and impact of these systems have grown, so have their size, complexity, and energy consumption, the last of which is significant enough to raise concerns about contributions to global carbon emissions.

And while we often think of technological advancement in terms of shifting from analog to digital, researchers are now looking for answers to this problem in physical alternatives to digital deep neural networks. One such researcher is Romain Fleury of EPFL’s Laboratory of Wave Engineering in the School of Engineering. In a paper published in Science, he and his colleagues describe an algorithm for training physical systems that shows improved speed, enhanced robustness, and reduced power consumption compared to other methods.

“We successfully tested our training algorithm on three wave-based physical systems that use sound waves, light waves, and microwaves to carry information, rather than electrons. But our versatile approach can be used to train any physical system,” says first author and LWE researcher Ali Momeni.

A “more biologically plausible” approach

Neural network training refers to helping a system learn the parameter values that work best for a task like image or speech recognition. It traditionally involves two steps: a forward pass, where data is sent through the network and an error function is calculated based on the output; and a backward pass (also known as backpropagation, or BP), where the gradient of the error function with respect to all network parameters is calculated.
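To make the two steps concrete, here is a minimal sketch of a forward pass and a backward pass on a tiny digital network, written in Python with NumPy. Everything in it is an assumption made for illustration: the toy data, the layer sizes, the tanh activation, the squared-error function and the names `forward`, `error`, `backward` and `train` do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up): 4 samples, 3 input features, 1 target value each.
x = rng.normal(size=(4, 3))
y = rng.normal(size=(4, 1))

# Parameters of a two-layer network (hypothetical sizes).
w1 = rng.normal(size=(3, 5)) * 0.5
w2 = rng.normal(size=(5, 1)) * 0.5


def forward(x, w1, w2):
    """Forward pass: send the data through the network."""
    h = np.tanh(x @ w1)   # hidden layer
    out = h @ w2          # linear output layer
    return h, out


def error(out, y):
    """Error function: half the mean squared error over the samples."""
    return 0.5 * np.mean(np.sum((out - y) ** 2, axis=1))


def backward(x, y, w1, w2):
    """Backward pass: gradient of the error with respect to all parameters."""
    h, out = forward(x, w1, w2)
    d_out = (out - y) / x.shape[0]          # dE/d(out)
    grad_w2 = h.T @ d_out                   # dE/d(w2)
    d_h = (d_out @ w2.T) * (1.0 - h ** 2)   # error sent back through tanh
    grad_w1 = x.T @ d_h                     # dE/d(w1)
    return grad_w1, grad_w2


def train(x, y, w1, w2, lr=0.05, steps=200):
    """Repeat forward pass, backward pass and parameter update."""
    errors = []
    for _ in range(steps):
        _, out = forward(x, w1, w2)
        errors.append(error(out, y))
        g1, g2 = backward(x, y, w1, w2)
        w1 = w1 - lr * g1
        w2 = w2 - lr * g2
    return w1, w2, errors


w1_trained, w2_trained, errors = train(x, y, w1, w2)
print("error before training:", errors[0])
print("error after training: ", errors[-1])
```

Note that `backward` needs an exact mathematical model of every layer (here, the derivative of tanh and the values of `w2`). For a physical system, that model is what a digital twin would have to supply.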

Over repeated iterations, the system updates itself based on these two calculations to return increasingly accurate values. The problem? In addition to being very energy-intensive, BP is poorly suited to physical systems. In fact, training physical systems usually requires a digital twin for the BP step, which is inefficient and carries the risk of a reality-simulation mismatch.

The scientists’ idea was to replace the BP step with a second forward pass through the physical system to update each network layer locally. In addition to decreasing power use and eliminating the need for a digital twin, this method better reflects human learning.

“The structure of neural networks is inspired by the brain, but it is unlikely that the brain learns via BP,” explains Momeni. “The idea here is that if we train each physical layer locally, we can use our actual physical system instead of first building a digital model of it. We have therefore developed an approach that is more biologically plausible.”
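The idea of swapping the backward pass for a second forward pass can be illustrated with a small digital toy. The sketch below is not the researchers' algorithm. It assumes a forward-forward-style scheme in which each layer sees one pass of data paired with the correct label and one pass paired with a wrong label, and adjusts its own weights using only its own inputs and outputs. A plain matrix multiplication followed by a ReLU stands in for the physical system, and the "goodness" measure, the threshold `theta`, the label embedding and all function names are hypothetical choices made for this example.

```python
import numpy as np


def goodness(h):
    """Per-sample 'goodness' (an assumed measure): mean squared activation."""
    return np.mean(h ** 2, axis=1)


def normalize(h, eps=1e-8):
    """Rescale each sample to unit RMS before handing it to the next layer."""
    return h / (np.sqrt(np.mean(h ** 2, axis=1, keepdims=True)) + eps)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def local_update(w, x_pos, x_neg, theta=1.0, lr=0.1):
    """Update one layer from its own inputs and outputs only (no backward pass)."""
    h_pos = np.maximum(x_pos @ w, 0.0)
    h_neg = np.maximum(x_neg @ w, 0.0)
    g_pos, g_neg = goodness(h_pos), goodness(h_neg)
    # Local loss: goodness should exceed theta on the first pass, stay below on the second.
    loss = np.mean(np.logaddexp(0.0, theta - g_pos) + np.logaddexp(0.0, g_neg - theta))
    # Derivative of the local loss with respect to each sample's goodness.
    d_pos = -sigmoid(theta - g_pos) / len(x_pos)
    d_neg = sigmoid(g_neg - theta) / len(x_neg)
    # d(goodness)/dW = (2 / width) * x^T h for ReLU units.
    scale = 2.0 / w.shape[1]
    grad = scale * (x_pos.T @ (h_pos * d_pos[:, None]) + x_neg.T @ (h_neg * d_neg[:, None]))
    return w - lr * grad, normalize(h_pos), normalize(h_neg), loss


def embed_label(x, labels, n_classes):
    """Append a one-hot label to each sample (a hypothetical encoding)."""
    return np.hstack([x, np.eye(n_classes)[labels]])


def train(x, y, widths=(16, 16), epochs=300, seed=0):
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    wrong = (y + rng.integers(1, n_classes, size=len(y))) % n_classes
    sizes = [x.shape[1] + n_classes, *widths]
    weights = [rng.normal(scale=0.3, size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
    history = []
    for _ in range(epochs):
        a_pos = embed_label(x, y, n_classes)      # first forward pass: correct labels
        a_neg = embed_label(x, wrong, n_classes)  # second forward pass: wrong labels
        losses = []
        for i, w in enumerate(weights):
            weights[i], a_pos, a_neg, loss = local_update(w, a_pos, a_neg)
            losses.append(loss)
        history.append(losses)
    return weights, np.array(history)


# Toy task (made up): the label says whether the first feature is positive.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 4))
y = (x[:, 0] > 0).astype(int)
weights, history = train(x, y)
print("local losses per layer, first epoch:", history[0])
print("local losses per layer, last epoch: ", history[-1])
```

In this toy, the gradient inside `local_update` is still computed digitally, which loosely mirrors the hybrid character of the approach described in the article. No error signal ever travels backward from one layer to the layer before it.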

The EPFL researchers, with Philipp del Hougne of CNRS IETR and Babak Rahmani of Microsoft Research, used their physical local learning algorithm (PhyLL) to train experimental acoustic and microwave systems and a modeled optical system to classify data like vowel sounds and images. As well as showing accuracy comparable to BP-based training, the method was more robust and adaptable than the state of the art, even in systems exposed to unpredictable external perturbations.

An analog future?

While the LWE’s approach is the first BP-free training of deep physical neural networks, some digital updates of the parameters are still required. “It’s a hybrid training approach, but our aim is to decrease digital computation as much as possible,” Momeni says.

The researchers now hope to implement their algorithm on a small-scale optical system, with the ultimate goal of increasing network scalability.

“In our experiments, we used neural networks with up to 10 layers, but would it still work with 100 layers with billions of parameters? This is the next step, and will require overcoming technical limitations of physical systems.”

References

Ali Momeni et al., Backpropagation-free training of deep physical neural networks. Science 0, eadi8474. DOI: 10.1126/science.adi8474


Author: Celia Luterbacher

Source: Laboratoire d'ingénierie des ondes

This content is distributed under a Creative Commons CC BY-SA 4.0 license. You may freely reproduce the text, videos and images it contains, provided that you indicate the author’s name and place no restrictions on the subsequent use of the content. If you would like to reproduce an illustration that does not contain the CC BY-SA notice, you must obtain approval from the author.