Improving Neural Networks with Neuroscience

How taking inspiration from the brain can help us create better Neural Networks.

Devansh
8 min read · Jul 16, 2023

Data Scientists spend a lot of time and resources on improving the architecture of their AI Models. Finding the right architecture/configuration can be a very time-consuming process, but the high costs are justified by the higher performance. As such, there is a lot of research into different architectures and into how design decisions impact model performance. One of the most promising avenues for this kind of research is to learn from evolution and biological systems to improve the design of AI Models (after all, Neural Networks were originally inspired by our brains).

The following article has been written by Dr. William Lambos. Dr. Lambos and I had a very extensive conversation about how biological needs influence intelligence, misconceptions about our brain’s biology that have led to worse ANN design, and how we can use recent developments in neuroscience to improve neural networks. During our conversation, he shared an excellent paper- Deep Learning in a bilateral brain with hemispheric specialization- as a proof of concept for some of his ideas. This article discusses how bilateral, dual-sided Artificial Neural Networks might offer an improved approach to neural network design. This is part 1, where Dr. Lambos will walk you through bilaterality, the advantage it provides in biological brains, and how we might leverage it for our needs. In Part 2, I will do a comprehensive breakdown of the aforementioned paper, with suggestions on how its ideas can be extended further. I firmly believe the ideas discussed in this series might become the next frontier of Machine Learning and Neural Network research.

We propose a bilateral artificial neural network that imitates a lateralization observed in nature: that the left hemisphere specializes in specificity and the right in generalities. We used two ResNet-9 convolutional neural networks with different training objectives and tested it on an image classification task. Our analysis found that the hemispheres represent complementary features that are exploited by a network head which implements a type of weighted attention. The bilateral architecture outperformed a range of baselines of similar representational capacity that don’t exploit differential specialization.

-Why you should pay attention to the idea of bilaterality. Even the Pathways architecture used by Google for its AI is heavily inspired by biology.
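To make the quoted design concrete, here is a minimal PyTorch sketch of that kind of bilateral network. The small stand-in encoders (the paper uses two ResNet-9s), the module sizes, and the `gate` layer implementing the weighted attention are my illustrative assumptions, not the authors’ code:

```python
import torch
import torch.nn as nn

class Hemisphere(nn.Module):
    """A small convolutional encoder standing in for one ResNet-9 hemisphere."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, x):
        return self.net(x)

class BilateralNet(nn.Module):
    """Fuses left/right features with a learned per-stream attention weight."""
    def __init__(self, num_classes=10, feat_dim=128):
        super().__init__()
        self.left = Hemisphere(feat_dim)   # would be trained toward specifics
        self.right = Hemisphere(feat_dim)  # would be trained toward generalities
        self.gate = nn.Linear(2 * feat_dim, 2)   # one score per hemisphere
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        f_left, f_right = self.left(x), self.right(x)
        # Softmax over the two streams: a simple form of weighted attention.
        w = torch.softmax(self.gate(torch.cat([f_left, f_right], dim=1)), dim=1)
        fused = w[:, :1] * f_left + w[:, 1:] * f_right
        return self.classifier(fused)

model = BilateralNet()
print(model(torch.randn(4, 3, 32, 32)).shape)  # torch.Size([4, 10])
```

The key structural idea is simply that the two encoders are trained with different objectives, while a small head learns, per input, how much to trust each side.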

Here is a quick intro to Dr. Lambos and his background. If it is interesting to you, I would suggest reaching out.

About the Author: William A. Lambos, M.S., Ph.D. is an experienced computational neuroscientist. He did his doctoral thesis on animal vs. machine learning. He later underwent clinical retraining and became a Board Certified clinical neuropsychologist. He is a full-stack software engineer, a data scientist, and a practitioner for patients with brain insults. He is recognized as a world-leading expert in functional neuroimaging based on quantitative electroencephalography (qEEG). He is also Board Certified in qEEG-based biofeedback by the Biofeedback Certification International Alliance.

Currently, Dr. Lambos works in three capacities: He is the Founder and CEO of American Brain Forensics, LLC, in which he remains active as a clinical neuropsychologist and a forensic expert witness for legal cases involving brain injuries. He is also the Founder and CEO of Computational Neurosciences, Inc., a consulting firm for brain-inspired data science. Finally, Bill is working with a start-up in leading-edge preventative medicine and wellness (including stem-cell and epigenetic interventions). His passion is using Python to design and implement new brain-driven algorithms and models in machine learning and artificial intelligence.

Join 35K+ tech leaders and get insights on the most important ideas in AI straight to your inbox through my free newsletter- AI Made Simple

Part I: Bilaterality and Biological Brains

Artificial Neural Networks (ANNs) have contributed to some of the most salient advances in machine learning and AI models in the last 10 years. More recent advances in design and storage efficiency have led to a slew of newer models, such as Generative Adversarial Networks (“GANs”). These typically make use of advancements such as encoder-decoder modules and optimizations of node weight allocations in the internal (“hidden”) layers, known as “sparse matrices.”

But these models continue to be plagued by limitations not seen in biological brains. There are, of course, many such differences between ANNs and biological brains, and in truth, many of these will likely never lend themselves to computational ML models. But at least one fundamental “design” difference can be, and already has been, used to improve the performance of ANNs.

This aspect of brains is seen throughout most of the animal kingdom and is based on the specialized, bilateral asymmetry found in the brains of all (bilaterally symmetrical) living creatures. Such animal brains are divided into two hemispheres, the left and the right. The two sides of the brain are asymmetrical, both in their structure and in their functions. The advantages to adaptive functioning afforded by this specialization may also help us understand why it persisted in animal species above the level of echinoderms (e.g., starfish and sea urchins). If this “design feature” were implemented in ANNs, could it be the basis of a fundamental improvement? Recent work suggests it is possible.

Let us look first at the typical characterization of the differences between the hemispheres:

Figure 1. Common View of Human Brain Lateralization.

The characterization above, while having some truth to it, is no longer compatible with what has more recently been discovered about the subject. The above dichotomy is, in fact, a broad misrepresentation of hemispheric specialization. This misunderstanding is often seen among neuroscientists and others who study biological brains but have not focused on the nature of bilaterality.

A better characterization of hemispheric specialization is based on decades of clever and, in many ways, counterintuitive studies by Elkhonon Goldberg (along with those of his colleagues and graduate students). Goldberg et al.’s research demonstrates a more accurate (and, for data science, more useful) characterization of the specialized functions of each hemisphere. Some of the important differences in the functions of the two hemispheres are summarized in the following table:

Goldberg summarizes:

“To this end, let’s consider four factors: (1) structural morphological features of cortical organization, (2) pathway architecture, (3) synaptic mechanisms of long-term memory formation, and (4) two catecholamine neuromodulatory systems — dopaminergic and noradrenergic…If one were to summarize various morphometric differences between the two hemispheres, reported in a piecemeal fashion in different studies, the following picture emerges. The overall cortical space appears to be allocated slightly differently in the two hemispheres. The left hemisphere is characterized by a slight overrepresentation of modality-specific association cortices (including the superior temporal gyrus and the premotor cortex), at the expense of heteromodal association cortices. The opposite is true for the right hemisphere: there is a slight overrepresentation of heteromodal association cortices (prefrontal, inferotemporal, and inferoparietal) at the expense of modality-specific association cortices. Is it possible that the hemispheric differences in cortical space allocation cause information to be represented in slightly different ways in the two hemispheres and to confer differential advantages in processing novel and familiar information?”

(See Chapter 14 of “The New Executive Brain” by Elkhonon Goldberg, whom I consider the world’s foremost living neuropsychologist, and who is my former mentor and now a close colleague.)

The answer to his question is very likely “Yes.” Let’s look at what this would imply for the representation of memories stored in the left vs. the right hemisphere:

FIGURE 7.5 Deeply indented vs. shallow network connectivity. (A) Deeply indented connectivity is more articulated in the cortex of the left hemisphere. (B) Shallow connectivity is more articulated in the cortex of the right hemisphere. Reproduced from The New Executive Brain, Oxford University Press, 2009.

Goldberg calls his conceptualization of hemispheric specialization the “novelty-routinization hypothesis”. This label captures a central tenet of hemispheric asymmetry: the right hemisphere is better designed to handle novel information, while the left hemisphere is better at storing generalized, familiar information.

To summarize, the novelty-routinization hypothesis asserts that the right and left hemispheres have specialized functions, as described above. This specialization has undoubtedly conferred evolutionary advantages on animals that possess it. Most exciting to the data scientist, however, is an as-yet-undescribed advantage of this dual-sidedness: such a network architecture, if correctly implemented in ANNs, solves one of the most important limitations of current digital neural network models: new learning “overwrites” old learning, a phenomenon known in machine learning as catastrophic forgetting.

Updating a neural network originally trained on a previous set of labeled data (supervised learning) always degrades previous learning, because retraining or fine-tuning the network on new labeled data changes the weights in the hidden layers during backpropagation. Minimizing the loss (typically cross-entropy) on the new data set necessarily disturbs the node weights previously fitted to the original training data. And when we are talking about billions of parameters, this is especially destructive to the predictive accuracy of the original model. Other features of ANNs only make the problem worse: attention mechanisms are not selective about the recency or primacy of the training data, and sparse matrices broaden the problem, since with fewer active nodes, each overwrite further degrades the predictive performance of the model. This has been one of the largest problems in artificial neural networks, and the toy example below demonstrates it directly.
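Here is a minimal, self-contained PyTorch sketch of sequential training. The two synthetic “tasks” and the tiny MLP are my own illustrative assumptions; the point is only that fine-tuning on task B erases what the network knew about task A:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def train(model, x, y, steps=300):
    opt = torch.optim.Adam(model.parameters(), lr=0.05)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

# Task A: the class depends on the first input feature.
xa = torch.randn(500, 2)
ya = (xa[:, 0] > 0).long()
# Task B: the class depends on the second feature, so its gradients
# conflict with the weights that solved task A.
xb = torch.randn(500, 2)
yb = (xb[:, 1] > 0).long()

model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
train(model, xa, ya)
print("task A accuracy after training on A:", accuracy(model, xa, ya))
train(model, xb, yb)  # sequential fine-tuning on B, with no replay of A
print("task A accuracy after training on B:", accuracy(model, xa, ya))
```

On a typical run, task A accuracy is near 100% after the first phase and falls to roughly chance (about 50%) after fine-tuning on task B, even though task A’s data was never touched again.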

However, at least theoretically, an architecture that separates new training data from previous training data by dividing the ANN into separate halves solves this problem rather spectacularly. Animal nervous systems faced the very same problem and, speaking teleologically, “solved” it via bifurcation of the nervous system, as described above. A sketch of this remedy follows.
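Continuing the toy setting above, here is one hedged way such a split could work: freeze the “hemisphere” that holds the old learning, train a fresh hemisphere on the new data, and let a small gate weight the two per input. The region-separated tasks, the freezing strategy, and the gate are my illustrative assumptions, not the method of the paper covered in Part II:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def mlp():
    return nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

def train(model, x, y, steps=300):
    # Optimize only the parameters that are not frozen.
    opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=0.05)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

# Old and new tasks live in different input regions, so a gate can route them.
xa = torch.randn(500, 2) + torch.tensor([4.0, 0.0])   # old task region
ya = (xa[:, 1] > 0).long()
xb = torch.randn(500, 2) + torch.tensor([-4.0, 0.0])  # new task region
yb = (xb[:, 1] < 0).long()                            # opposite labeling rule

old_hemi = mlp()
train(old_hemi, xa, ya)
new_hemi = mlp()
train(new_hemi, xb, yb)        # new learning lives entirely in its own half
for hemi in (old_hemi, new_hemi):
    for p in hemi.parameters():
        p.requires_grad = False  # stored knowledge can no longer be overwritten

class Bilateral(nn.Module):
    """Weights the two hemispheres' outputs with a per-input softmax gate."""
    def __init__(self, left, right):
        super().__init__()
        self.left, self.right, self.gate = left, right, nn.Linear(2, 2)

    def forward(self, x):
        w = torch.softmax(self.gate(x), dim=1)
        return w[:, :1] * self.left(x) + w[:, 1:] * self.right(x)

combo = Bilateral(old_hemi, new_hemi)
# Only the gate is trainable, so tuning it on a mix of both tasks
# cannot damage the knowledge stored in either hemisphere.
train(combo, torch.cat([xa, xb]), torch.cat([ya, yb]))
print("old task accuracy:", accuracy(combo, xa, ya))
print("new task accuracy:", accuracy(combo, xb, yb))
```

On a typical run both accuracies stay high, whereas the sequentially fine-tuned single network from the previous sketch retained essentially nothing of the old task.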

Now, how might we leverage the same approach in artificial neural networks? As it turns out, at least one of the benefits of hemispheric specialization has already been implemented in a computational model, resulting in a notable improvement in processing different categories of data. In Part II, the most recent demonstration of the advantages of dual-sided ANNs based on Goldberg’s work is described.

Suggestions for taking the next steps to infer additional aspects of specialized asymmetry, resulting in yet more powerful NNs, will follow.

I (Devansh) hope you found Dr. Lambos’s ideas as interesting as I did. Keep your eyes peeled, because I will be releasing part 2 soon, where I will do a breakdown of the research paper.

That is it for this piece. I appreciate your time. As always, if you’re interested in working with me or checking out my other work, my links will be at the end of this email/post. If you like my writing, I would really appreciate an anonymous testimonial. You can drop it here. And if you found value in this write-up, I would appreciate you sharing it with more people. It is word-of-mouth referrals like yours that help me grow.

Reach out to me

Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.

Small Snippets about Tech, AI and Machine Learning over here

AI Newsletter- https://artificialintelligencemadesimple.substack.com/

My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/

Check out my other articles on Medium: https://rb.gy/zn1aiu

My YouTube: https://rb.gy/88iwdd

Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y

My Instagram: https://rb.gy/gmvuy9

My Twitter: https://twitter.com/Machine01776819
