Neuromorphic computing is often introduced as a brain-inspired way of computing. This definition leaves a lot to the imagination, even for people who know a lot about brains. How far does this inspiration need to go for something to be considered neuromorphic? Many deep neural networks are inspired by pathways of the brain, yet they are not considered neuromorphic. Here I want to explain the two critical features that make a computational system neuromorphic: local memory and asynchronous signal transmission. They are central to neuromorphic computing because together they make it possible to exploit the energy efficiency of spiking neural networks.
Local Memory
In biological brains, synapses store memories by changing their strength, and they perform computation by weighting signal transmission. This means that memory storage and computation happen in the same place. While the parameters of a deep neural network are sometimes thought of as synapses, such networks are not considered neuromorphic because their parameters are usually not stored locally. However, some specialized deep learning hardware does have local memory storage. Let's go through some examples.
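To make the idea concrete, here is a deliberately simplified Python sketch (not how any real chip works) of a synapse that keeps its weight, its memory, at the same place where the weighting happens:

```python
class Synapse:
    """Toy synapse: the weight (memory) lives at the site of computation."""

    def __init__(self, weight: float):
        self.weight = weight  # stored locally, not fetched from a separate memory

    def transmit(self, presynaptic_signal: float) -> float:
        # Computation: weight the incoming signal with the locally stored weight.
        return self.weight * presynaptic_signal


synapse = Synapse(weight=0.8)
print(synapse.transmit(1.0))  # -> 0.8
```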
If you train the parameters of a deep neural network (or download pretrained parameters) on your normal computer, they will most likely be stored in the random-access memory (RAM). By default, the computation happens on the central processing unit (CPU), while the parameters and even intermediate computational results are stored in the RAM. To do the computation, parameters need to be constantly transferred between RAM and CPU. This communication causes a large part of the computation's energy consumption. But how about graphics processing units (GPUs), where most deep neural networks are trained and deployed?
GPUs come with their own RAM, which they can access relatively fast. The RAM being on the graphics card makes it somewhat local, but it is not considered neuromorphic. For a system to be considered neuromorphic, its memory needs to be at least on the same chip as the processing elements. Being on-chip does not reach the level of locality of actual neurons, which would require analog synapses at the actual signal transmission site. However, on-chip memory is much easier to realize, since it can be implemented with well-established digital chip design technologies, and it results in improved energy efficiency.
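As a rough illustration, the following sketch (assuming PyTorch and, optionally, a CUDA GPU; the layer size is arbitrary) shows where the parameters of a layer live in each case. Note that neither location counts as on-chip memory in the neuromorphic sense:

```python
import torch

# Parameters of a freshly created layer live in host RAM and are processed on the CPU.
layer = torch.nn.Linear(1024, 1024)
print(next(layer.parameters()).device)      # -> cpu

# Moving the layer to a GPU copies its parameters into the GPU's own RAM,
# which sits on the graphics card but still outside the compute chip itself.
if torch.cuda.is_available():
    layer = layer.to("cuda")
    print(next(layer.parameters()).device)  # -> cuda:0
```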
An interesting corner case is Google’s tensor processing unit (TPU). It is specifically designed for deep learning, and one of its features is on-chip memory. This makes it more energy efficient for deep learning, but TPUs are not considered neuromorphic hardware. That is because they do not implement the second feature required of a neuromorphic system: asynchronous communication.
Asynchronous Communication
The brain uses signals called action potentials to transmit information and to compute. Action potentials are essentially binary signals emitted at defined points in time. The fact that both the action potential and the bit in a personal computer are binary sometimes causes confusion about the difference between them. They are indeed both binary, but the action potential is transmitted asynchronously, while the bits in a personal computer are transmitted synchronously. In a synchronous system, the computation is paced by a common clock and all the data must be transmitted, even that of inactive cells. If you want to pass the activity of cells to the next layer, you also need to pass the zeros of the cells that are inactive at that time step.
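A toy example, with arbitrary layer sizes and firing probability, makes this cost visible: at every clock tick the whole activity vector is handed to the next layer, zeros included.

```python
import numpy as np

# Sketch of synchronous, clock-driven transmission: every tick, all values cross
# the wire, no matter how few cells are actually active.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 10))          # hypothetical 10 -> 4 layer

for t in range(5):                          # one iteration per clock tick
    activity = (rng.random(10) < 0.1).astype(float)  # mostly zeros
    values_on_the_wire = activity.size      # all 10 values are transmitted each tick
    next_layer_input = weights @ activity   # zeros are multiplied and summed anyway
    print(f"tick {t}: {int(activity.sum())} active cells, {values_on_the_wire} values transmitted")
```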
In an asynchronous system, on the other hand, there is no need to transmit anything unless an action potential occurs. The absence of an action potential does not need to be transmitted, because it has no downstream consequences. This means that in an asynchronous system, energy is almost exclusively expended when an action potential is transmitted. This is an advantage of spiking neural networks (action potentials are often called spikes) that synchronous systems cannot utilize. In a neuromorphic system, the amount of energy needed is therefore strongly related to the number of action potentials transmitted, while in a synchronous system the number of action potentials is nearly irrelevant; there, it is the number of neurons that matters.
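The same toy setup, rewritten in an event-driven style, only does work when a spike occurs, so a simple event counter serves as a rough proxy for energy:

```python
import numpy as np

# Sketch of asynchronous, event-driven transmission: only (time, neuron index)
# events are handled, so cost scales with the number of spikes, not neurons.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 10))
membrane = np.zeros(4)                       # downstream accumulators
events_sent = 0

for t in range(5):
    spiking = np.flatnonzero(rng.random(10) < 0.1)  # indices of spiking neurons only
    for j in spiking:                        # silent neurons cost nothing here
        membrane += weights[:, j]            # each event adds one weight column
        events_sent += 1

print("events transmitted (rough energy proxy):", events_sent)
```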
While nearly all of today's computers perform their computations synchronously, there is one area that has been asynchronous for a long time: the internet. On the internet, information is transmitted in packets that are sent and received asynchronously. Similarly, spikes can be transmitted as packets. The well-established rules for asynchronous communication, as well as advances in digital circuit design, have led to a generation of neuromorphic hardware systems that feature both on-chip memory and asynchronous communication. This allows them to operate more energy efficiently than traditional systems; however, it has proven more difficult to train neural networks that operate with action potentials than to train deep neural networks. This confines neuromorphic systems to laboratories and some niche applications for the moment. Hopefully, novel methods to train spiking neural networks will soon yield applications.
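To sketch what "spikes as packets" could look like, a spike can be serialized as a timestamp plus the address of the neuron that fired, loosely reminiscent of address-event representation. This is a toy encoding, not the protocol of any particular chip, and the field sizes are arbitrary assumptions:

```python
import struct

def pack_spike(timestamp_us: int, neuron_address: int) -> bytes:
    # 4-byte timestamp in microseconds plus a 2-byte neuron address.
    return struct.pack("<IH", timestamp_us, neuron_address)

packet = pack_spike(timestamp_us=1_250, neuron_address=42)
print(packet.hex(), f"({len(packet)} bytes)")
```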
If you are interested in learning more about neuromorphic computing, you might want to take a look at the currently existing chips. One of the best-known ones is Intel’s Loihi. The makers of Loihi have some YouTube videos you can find here. Another chip is IBM’s TrueNorth. Read more about it here. If you want to read a paper more generally about neuromorphic computing, I recommend this one to start.