What makes Neuromorphic Computing Neuromorphic?

Neuromorphic computing is often introduced as a brain-inspired way of computing. This definition leaves a lot to the imagination, even for people who know a lot about brains. How far does this inspiration need to go for something to be considered neuromorphic? Many deep neural networks are inspired by pathways of the brain, but they are not considered neuromorphic. Here I want to explain the two critical features that make a computational system neuromorphic: local memory and asynchronous signal transmission. They are central to neuromorphic computing because they are sufficient to exploit the energy efficiency of spiking neural networks.

Local Memory

In biological brains, synapses store memories by changing their strength, and they perform computation by weighting signal transmission. This means that memory storage and computation happen in the same place. While the parameters of a deep neural network are sometimes thought of as synapses, such networks are not considered neuromorphic because their parameters are usually not stored locally. However, some specialized deep learning hardware does have local memory storage. Let's go through some examples.

If you train the parameters of a deep neural network (or download pretrained parameters) on your normal computer, they will most likely be stored in random-access memory (RAM). By default, the computation happens on the central processing unit (CPU), while the parameters and even intermediate computational results are stored in RAM. To do the computation, parameters need to be constantly transferred between RAM and CPU. This communication causes a large part of the computation's energy consumption. But how about graphics processing units (GPUs), where most deep neural networks are trained and deployed?
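To make the cost of this shuttling concrete, here is a minimal back-of-the-envelope sketch. The per-operation energy values are illustrative assumptions (roughly in line with commonly cited figures for DRAM accesses and arithmetic), not measurements of any specific system; the point is only that moving parameters can cost far more than the arithmetic performed on them.

# Toy energy model for one dense layer on a conventional machine.
# The energy constants below are illustrative assumptions, not measurements.
E_DRAM_ACCESS = 640e-12   # assumed energy per 32-bit DRAM access (joules)
E_MAC         = 3e-12     # assumed energy per multiply-accumulate (joules)

n_inputs, n_outputs = 1024, 1024
n_params = n_inputs * n_outputs   # weights that must be fetched from RAM
n_macs   = n_inputs * n_outputs   # multiply-accumulates to compute

energy_transfer = n_params * E_DRAM_ACCESS
energy_compute  = n_macs * E_MAC

print(f"transfer: {energy_transfer:.2e} J, compute: {energy_compute:.2e} J")
print(f"data movement / arithmetic ratio: {energy_transfer / energy_compute:.0f}x")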

GPUs come with their own RAM, which they can access relatively fast. The RAM being on the graphics card makes it somewhat local; however, this is not considered neuromorphic. For a system to be considered neuromorphic, it needs to have its memory at least on the same chip. Being on-chip does not reach the level of locality of actual neurons, which would require analog synapses at the actual signal transmission site. However, on-chip memory is much easier to realize, since it can be implemented with well-established digital chip design technologies, and it already results in improved energy efficiency.

An interesting corner case is Google's tensor processing unit (TPU). It is specifically designed for deep learning, and one of its features is on-chip memory. This makes it more energy efficient for deep learning, but TPUs are not considered neuromorphic hardware. That is because they do not implement the second feature required of a neuromorphic system: asynchronous communication.

Asynchronous Communication

The brain uses signals called action potentials to transmit information and to compute. Action potentials can be considered binary signals emitted at defined time points. The fact that both the action potential and the bit in a personal computer are binary sometimes causes confusion about the difference between them. They are similar in being binary, but they differ in that the action potential is transmitted asynchronously, while the bits in a personal computer are transmitted synchronously. In a synchronous system, the computation is synchronized by a common clock and all the data must be transmitted, even that of inactive cells. If you want to pass the activity of cells to the next layer, you also need to pass the zeros of the cells that are inactive at that time step.
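As a toy illustration (not a model of any particular hardware), here is a sketch of how a clocked, dense layer treats a mostly silent population: the zeros of the inactive cells are carried through the matrix multiplication just like the values of the active cells.

import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 1000, 100
weights = rng.normal(size=(n_post, n_pre))

# Activity of the presynaptic layer at one clock tick: mostly inactive cells.
activity = (rng.random(n_pre) < 0.02).astype(float)   # roughly 2% of cells spike

# A synchronous system computes the full product every time step,
# touching every weight regardless of how many cells were active.
output = weights @ activity
print(f"{int(activity.sum())} active cells, "
      f"{weights.size} multiply-accumulates performed")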

In an asynchronous system, on the other hand, there is no need to transmit anything unless an action potential occurs. The absence of an action potential does not need transmission, because it does not have any downstream consequences. Energy is therefore almost exclusively expended when an action potential is transmitted. This is an advantage of spiking neural networks (action potentials are often called spikes) that synchronous systems cannot utilize. In a neuromorphic system, the amount of energy needed is strongly related to the number of action potentials transmitted, while in a synchronous system the number of action potentials is nearly irrelevant; there, the number of neurons is what matters.
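Continuing the toy example from above, an event-driven sketch only does work when a spike arrives; the number of operations, and hence roughly the energy, scales with the number of spikes instead of the number of neurons.

import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 1000, 100
weights = rng.normal(size=(n_post, n_pre))
spikes = np.flatnonzero(rng.random(n_pre) < 0.02)   # indices of cells that spiked

# Event-driven update: for each spike, add that cell's weight column to the
# membrane potentials of its targets. Silent cells cost nothing.
potentials = np.zeros(n_post)
for cell in spikes:
    potentials += weights[:, cell]

print(f"{len(spikes)} spikes -> {len(spikes) * n_post} accumulate operations "
      f"(vs. {weights.size} in the dense version)")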

While nearly all of today's computers perform their computations synchronously, there is one area that has been asynchronous for a long time: the internet. On the internet, information is transmitted in packets that are sent and received asynchronously. Similarly, spikes can be transmitted as packets (a sketch of such a packet follows below). The well-established rules for asynchronous communication, as well as advances in digital circuit design, have led to a generation of neuromorphic hardware systems that feature both on-chip memory and asynchronous communication. This allows them to operate more energy efficiently than traditional systems. However, it has proven more difficult to train neural networks that operate with action potentials than to train deep neural networks, which confines neuromorphic systems to laboratories and some niche applications for the moment. Hopefully, novel methods to train spiking neural networks will soon yield applications.
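To make the packet picture concrete: neuromorphic chips typically package spikes in something like an address-event representation (AER), where a packet carries little more than the identity of the neuron that fired and a timestamp. The class below is a simplified, hypothetical sketch of such a packet, not the actual format used by any particular chip.

from dataclasses import dataclass

@dataclass
class SpikeEvent:
    """Hypothetical address-event-style packet: who fired, and when."""
    neuron_id: int     # address of the neuron that emitted the spike
    timestamp_us: int  # time of the spike in microseconds

# Only events are sent; silence between spikes costs no traffic.
packets = [SpikeEvent(neuron_id=17, timestamp_us=1200),
           SpikeEvent(neuron_id=403, timestamp_us=1207)]
for p in packets:
    print(f"neuron {p.neuron_id} fired at t={p.timestamp_us} us")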

If you are interested in learning more about neuromorphic computing, you might want to take a look at the currently existing chips. One of the best-known is Intel's Loihi. The makers of Loihi have some YouTube videos you can find here. Another chip is IBM's TrueNorth. Read more about it here. If you want to read a paper more generally about neuromorphic computing, I recommend this one to start.

Spiking neural networks for a low-energy future

Spiking neural networks (SNNs) have some disadvantages compared to artificial neural networks (ANNs), but they have the potential to run at a fraction of the energy cost. Whether SNNs will be able to replace ANNs, and how much energy they will use, depends on many engineering and neuroscience advances. Here I will go through some of the technical background of the SNN energy advantages and some of the current numbers.

Energy efficient SNN features

The energy efficiency of SNNs comes primarily from two features. Firstly, the spike is a discrete event, and energy is only used when a spike occurs. This is probably the most fundamental feature that distinguishes SNNs from ANNs. It means that the energy consumption of an SNN depends not only on the number of neurons but also on the number of spikes the model requires to perform its task. The second feature is local memory. At the heart of all models are parameters. On traditional hardware such as CPUs and GPUs, the part of the chip that performs the calculations is not the same part that stores the parameters. Loading the parameters onto the chip is much more energy intensive than the computation itself. Therefore, storing the parameters locally on the chip that computes with them yields efficiency advantages. This is not something unique to spiking neural networks; some tensor processing units (TPUs) also feature local memory, and they are specifically designed for ANNs. When most people speak about the energy advantages of SNNs, they assume local memory.
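One way these two features show up in theoretical estimates (of the kind cited further below) is through operation counts: a dense ANN layer pays for one multiply-accumulate per weight per inference, while an SNN layer pays roughly one accumulate per spike per target. The sketch below uses illustrative per-operation energies and a hypothetical spike rate; the numbers are assumptions chosen only to show the shape of such an estimate, not a reproduction of any published calculation.

# Illustrative per-operation energies (assumptions, not measurements).
E_MAC = 4.6e-12   # assumed joules per multiply-accumulate (ANN)
E_AC  = 0.9e-12   # assumed joules per accumulate (SNN, add-only)

n_pre, n_post = 1000, 100
spike_rate = 0.05   # hypothetical fraction of presynaptic cells spiking per step

ann_energy = n_pre * n_post * E_MAC                # every weight, every inference
snn_energy = spike_rate * n_pre * n_post * E_AC    # only weights of spiking cells

print(f"ANN: {ann_energy:.2e} J, SNN: {snn_energy:.2e} J, "
      f"ratio: {ann_energy / snn_energy:.0f}x")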

SNNs also require specialized hardware to run efficiently. That hardware is called neuromorphic, and it makes efficient use of the binary nature of spikes and of local memory. Neuromorphic hardware is so far only available for research purposes, and making it more widely available will be one of the challenges for SNN adoption. Next come some numbers on energy efficiency.

How efficient are we talking?

How much more efficient SNNs are depends on many factors of the comparison: the task, the model architectures, and the hardware. Making projections into the future is even harder, since machine learning advances are made quickly on both SNNs and ANNs. Projecting the absolute amount of energy that could be saved is harder still, because it requires predictions of AI demand, which can change non-linearly with technical advances. I would be interested in finding formal work on some of these uncertainties, or in working on some myself, but for now here are some numbers.

The Loihi processor from Intel Labs is a recent piece of neuromorphic hardware. Depending on the size of their example problem, the authors find that Loihi is 2.58x, 8.08x or 48.74x more energy efficient than a 1.67-GHz Atom CPU (Davies et al. 2018).

Yin et al. (2020) present a method to train SNNs (backpropagation with surrogate gradients). They calculate the theoretical energy consumption for a spiking recurrent network trained with this method and for some ANN architectures. Depending on the task, their SNN was 126.2x, 935x, 1602x, 1776x or 3353.3x more efficient than a Long Short-Term Memory network (LSTM; the exact factor also depends on some details of the LSTM implementation), and 41.3x more efficient than a recurrent ANN. Here is a talk from the last author, Sander Bohte, where he summarizes the findings as >100x more efficient than the best recurrent ANN and 1000x more efficient than the LSTM. All their calculations assume local memory.

Panda et al. (2019) tried several methods to generate SNNs for image classification and calculated their theoretical energy consumption. They estimate SNN efficiency gains of 6.52x, 7.7x, 10.6x, 74.9x, 81.3x or 104.8x, depending on model architecture and parameter space.

Merolla et al. (2014) present the TrueNorth neuromorphic architecture. They compare the synaptic operations per second (SOPS) of their architecture to the floating-point operations per second (FLOPS) of traditional chips. They report that TrueNorth can deliver 46 billion SOPS per watt, while the most energy-efficient supercomputer at the time of their writing delivered 4.5 billion FLOPS per watt.
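Taking these two reported figures at face value, and keeping in mind that a synaptic operation and a floating-point operation are not the same unit of work, the ratio works out to roughly a factor of ten:

truenorth_sops_per_watt = 46e9        # reported synaptic operations per second per watt
supercomputer_flops_per_watt = 4.5e9  # reported FLOPS per watt at the time

print(f"ratio: {truenorth_sops_per_watt / supercomputer_flops_per_watt:.1f}x")
# -> ratio: 10.2x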

These numbers highlight the potential for some massive energy savings, but benchmarks are always complicated. Making good comparisons is hard, especially since the units of computation being compared are fundamentally different. Either way, SNNs on neuromorphic hardware are extremely energy efficient, but to truly save energy they must become better at the tasks ANNs already solve.

References

Davies et al. 2018. Loihi: A Neuromorphic Manycore Processor with On-Chip Learning. IEEE Micro. 10.1109/MM.2018.112130359

Yin, Corradi & Bohte 2020. Effective and Efficient Computation with Multiple-timescale Spiking Recurrent Neural Networks. https://arxiv.org/abs/2005.11633

Panda, Aketi & Roy 2019. Towards Scalable, Efficient and Accurate Deep Spiking Neural Networks with Backward Residual Connections, Stochastic Softmax and Hybridization. https://arxiv.org/abs/1910.13931

Merolla et al. 2014. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science. https://science.sciencemag.org/content/345/6197/668