Consisting of a large number of intricately connected neurons, the brain is one of the most sophisticated dynamical systems in nature. Understanding how the brain computes is at the forefront of current scientific research. Research in our group focuses on theoretical analysis of biological neural networks and computational models of neural systems. The modeling is done in close collaboration with experimental groups.
Computational model of birdsong syntax
Songbirds are accessible model systems for studying vocal communication. Male songbirds learn to sing from their fathers, and they sing to attract females. Birdsong consists of sequences of stereotypical syllables. The sequences can be largely fixed, as in the songs of the zebra finch, or variable, as in the songs of the Bengalese finch. The syllable sequences of variable songs obey probabilistic rules, known as birdsong syntax. The neural mechanism of birdsong syntax is not understood (PDF).
Single unit recordings in singing zebra finches revealed that projection neurons in HVC (proper name), the sensory-motor area in the song system, spike sequentially, each once, with precise timing relative to the song. A simple explanation is that the projection neurons form chain networks that support synfire chain activity. We examined this mechanism in detail with computational models, and discovered that such network activity is prone to instability unless single neurons have active dendritic processes that normalize the spiking activity between successive groups in the chain (PDF). This prediction was later confirmed by intracellular recordings of the projection neurons in singing birds. The experiment also strongly supported the chain network hypothesis (PDF).
We took the chain network model a step further and developed a network model that generates variable birdsong sequences. In the model, a chain network encodes a syllable. The end of a chain connects to the beginning of multiple chains. This branching connectivity allows the spike activity to flow probabilistically into one of the connected chains at the branching point, with the selection made by a winner-take-all mechanism mediated by inhibitory interneurons. The transition probability is determined by the strengths of the connections to the branches and by external inputs from the thalamic nucleus and auditory feedback (PDF).
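At the level of syllables, the branching chain network described above behaves like a probabilistic walk from one syllable to the next. The following is a minimal sketch of that abstraction only, not of the spiking network itself. The syllable labels and transition probabilities are hypothetical and are not taken from any recorded song.

```python
import random

# Hypothetical transition probabilities between syllables. In the network
# model these would be set by connection strengths at each branching point.
TRANSITIONS = {
    "start": {"a": 1.0},
    "a": {"b": 0.7, "c": 0.3},
    "b": {"a": 0.4, "c": 0.6},
    "c": {"a": 0.5, "end": 0.5},
}

def sing(rng, max_len=50):
    """Sample one syllable sequence by choosing a branch at each chain end."""
    song, state = [], "start"
    while len(song) < max_len:
        branches = TRANSITIONS[state]
        state = rng.choices(list(branches), weights=list(branches.values()))[0]
        if state == "end":
            break
        song.append(state)
    return song

rng = random.Random(0)
songs = [sing(rng) for _ in range(2000)]

# Estimate the a -> b transition probability back from the sampled songs.
pairs = [(x, y) for s in songs for x, y in zip(s, s[1:])]
after_a = [y for x, y in pairs if x == "a"]
p_ab = after_a.count("b") / len(after_a)
```

Counting syllable pairs in the sampled songs recovers the transition probabilities approximately, which is the same kind of analysis one would apply to recorded songs.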
The model predicted that syllable transitions follow Markovian statistics. We tested this prediction by analyzing the song syntax of the Bengalese finch. The analysis largely supported the prediction, but suggested two important modifications. First, variable birdsong contains long runs of repeated syllables, and these do not follow Markovian dynamics. Through computational analysis, we traced the source of the non-Markovian repeats to stimulus-specific adaptation of the auditory feedback to HVC. As the syllable repeats, the feedback weakens, reducing the repeat probability. The model predicted that deafening reduces syllable repetition, which was confirmed by experimental data (PDF). Second, multiple chains could encode the same syllable. There is indirect evidence supporting this idea, but direct experimental verification is yet to come. With these modifications, the resulting network model corresponds to a partially observable Markov model with adaptation, which accurately describes the song syntax of the Bengalese finch (PDF).
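The difference between a Markovian repeat and an adapting repeat can be illustrated with a small calculation. In this sketch the repeat probability decays geometrically with each rendition; that functional form and the numbers `p0` and `alpha` are assumptions chosen for illustration, not values fitted to data.

```python
def repeat_distribution(p0, alpha, max_n=200):
    """Return P(N = n), n = 1..max_n, for the number of consecutive renditions.

    After the k-th rendition the syllable repeats again with probability
    p0 * alpha**(k - 1). Setting alpha = 1 gives a plain Markov
    self-transition; alpha < 1 mimics weakening feedback (assumed form).
    """
    dist, survive = [], 1.0
    for k in range(1, max_n + 1):
        p_repeat = p0 * alpha ** (k - 1)
        dist.append(survive * (1.0 - p_repeat))  # stop after exactly k renditions
        survive *= p_repeat                      # otherwise keep repeating
    return dist

markov = repeat_distribution(p0=0.8, alpha=1.0)
adapting = repeat_distribution(p0=0.99, alpha=0.9)

# Most likely number of renditions under each rule.
markov_mode = markov.index(max(markov)) + 1
adapting_mode = adapting.index(max(adapting)) + 1
```

With a constant repeat probability the distribution is geometric, so a single rendition is always the most likely outcome. With an adapting repeat probability the distribution can peak at several renditions, which a first-order Markov model cannot produce.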
Our computational model of the birdsong syntax will help to understand the syntactic structures and neural mechanisms of other animal vocalizations.
Dynamics of spiking neural networks
Neurons interact through discrete spikes. In certain regimes, the network dynamics can be approximated by rate models, in which the interactions between the neurons are described in terms of their firing rates. The resulting network equations are continuous. The well-known Hopfield model is one example. However, rate models leave out the possibility that the discreteness of spiking interactions leads to unique network properties. We took up the challenge and analyzed the spiking dynamics of leaky integrate-and-fire neuron models in the pulse-coupled regime. We developed a novel nonlinear mapping technique to analyze such networks mathematically. We proved that, when the network is dominated by feedback inhibition and the neurons are driven by constant external inputs, the network dynamics flows into spike sequence attractors from any initial conditions and for arbitrary connectivity between the neurons, regardless of inhomogeneity in the neuron properties and the external drives. The attractors are characterized by precise spike timing. In small networks, the spike sequence attractors are periodic spiking patterns, and convergence to them requires only a few transient spikes. Our theory suggests that stable spike sequences are ubiquitous in spiking neural networks (PDF).
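The kind of network analyzed above can be explored numerically with an event-driven simulation, which jumps from spike to spike using the exact solution of the leaky integrator between events. The sketch below is a simplified illustration, assuming instantaneous inhibitory pulses, reset to zero, and no synaptic delays; the drives, weights, and initial potentials are hypothetical. It is not the mapping technique itself.

```python
import math

def simulate(drive, weights, v0, n_spikes, tau=1.0, theta=1.0):
    """Event-driven simulation of pulse-coupled leaky integrate-and-fire neurons.

    Between spikes each potential follows dV/dt = (drive - V) / tau.
    When neuron i reaches theta it resets to 0 and every other neuron j is
    lowered instantly by weights[i][j] (pure inhibition, an assumption).
    Requires drive[i] > theta so that each neuron can fire on its own.
    """
    v, t, spikes = list(v0), 0.0, []
    n = len(v)
    for _ in range(n_spikes):
        # Exact waiting time for each neuron to reach threshold.
        waits = [tau * math.log((drive[i] - v[i]) / (drive[i] - theta))
                 for i in range(n)]
        i = min(range(n), key=waits.__getitem__)
        decay = math.exp(-waits[i] / tau)
        v = [drive[j] + (v[j] - drive[j]) * decay for j in range(n)]
        t += waits[i]
        v = [0.0 if j == i else v[j] - weights[i][j] for j in range(n)]
        spikes.append((t, i))
    return spikes

# Three heterogeneous neurons with hypothetical parameters.
spikes = simulate(
    drive=[1.5, 1.6, 1.7],
    weights=[[0.0, 0.3, 0.25], [0.2, 0.0, 0.3], [0.35, 0.25, 0.0]],
    v0=[0.1, 0.5, 0.3],
    n_spikes=60,
)

# Sanity check: an isolated neuron fires with period tau * ln(I / (I - theta)).
single = simulate(drive=[2.0], weights=[[0.0]], v0=[0.0], n_spikes=3)
```

Printing the neuron indices in `spikes` is one way to inspect whether the spike order settles into a repeating pattern for a given choice of parameters.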
A special case of this theory is the winner-take-all dynamics between competing neurons. Our analysis showed that the winner-take-all dynamics requires very few transient spikes. Indeed, in certain regimes, whoever spikes first will be the winner, with no transient dynamics at all. The winner-take-all dynamics is one of the most important mechanisms for decision-making and object recognition. Although this dynamics exists in rate models, the transient dynamics there is often long, leading to objections that recurrent dynamics cannot explain phenomena such as fast object recognition in the visual system. Our analysis of spiking networks clarified these misconceptions (PDF).
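The "whoever spikes first wins" regime can be seen in a toy example with two identical, mutually inhibiting leaky integrate-and-fire neurons. This is a sketch under simplifying assumptions (instantaneous inhibition, reset to zero, identical drives), and the inhibition strengths are hypothetical. In this toy setting, an inhibitory pulse larger than the threshold keeps the second neuron from ever catching up.

```python
import math

def spike_order(v0, w, n_spikes=20, drive=1.5, theta=1.0, tau=1.0):
    """Return which of two mutually inhibiting neurons fires at each spike."""
    v, order = list(v0), []
    for _ in range(n_spikes):
        # Exact time for each neuron to reach threshold from its current state.
        waits = [tau * math.log((drive - x) / (drive - theta)) for x in v]
        i = 0 if waits[0] <= waits[1] else 1
        decay = math.exp(-waits[i] / tau)
        v = [drive + (x - drive) * decay for x in v]
        v[i] = 0.0       # the spiking neuron resets
        v[1 - i] -= w    # the other neuron receives an inhibitory pulse
        order.append(i)
    return order

strong_a = spike_order([0.5, 0.2], w=1.2)  # neuron 0 starts closer to threshold
strong_b = spike_order([0.2, 0.5], w=1.2)  # neuron 1 starts closer to threshold
weak = spike_order([0.5, 0.2], w=0.1)      # weak inhibition: both keep firing
```

With strong inhibition the neuron that starts closer to threshold produces every spike, from the first one onward. With weak inhibition both neurons continue to fire and there is no winner.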
With our mapping technique, we further proved the existence of winner-take-all competition between chain networks, which is the basis of our computational model of variable birdsong syntax (PDF). The technique also led to an efficient simulation method, with which we demonstrated the formation of chain networks through synaptic plasticity and spontaneous activity (PDF, PDF).
Auditory object recognition and robust speech recognition
Humans and animals can recognize auditory objects such as speech or conspecific vocalizations despite noise and other interfering sounds. The robustness of the auditory systems in humans and animals is unmatched by current artificial speech recognition algorithms, which usually fail in noisy conditions such as loud bars. We examined the possibility that sparse coding, often observed in the auditory system and other sensory modalities, contributes to the noise robustness. We developed an algorithm for training detectors that respond to features in the speech signal within small time windows. Driven by speech signals, the feature detectors produce sparse spatiotemporal spike responses. Speech can be recognized by matching these patterns with stored templates. We demonstrated that such a scheme outperforms state-of-the-art artificial speech recognition systems in the standard task of recognizing spoken digits in noisy conditions, especially when the noise level is comparable to that of the signal. Our results suggest that sparse spike coding can be crucial for the robustness of the auditory system (PDF).
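The template-matching step can be illustrated with a toy example in which a spike pattern is a set of (detector, time bin) pairs. The overlap measure used here (the Jaccard index) and all the spike patterns are assumptions made for illustration; the text above does not specify how the patterns are compared.

```python
def overlap(pattern, template):
    """Jaccard overlap between two sets of (detector_id, time_bin) spikes."""
    return len(pattern & template) / len(pattern | template)

def recognize(pattern, templates):
    """Return the label of the stored template that best matches the pattern."""
    return max(templates, key=lambda word: overlap(pattern, templates[word]))

# Hypothetical sparse templates for two spoken digits.
templates = {
    "one": {(0, 1), (3, 2), (5, 4), (7, 6)},
    "two": {(1, 1), (2, 3), (5, 5), (6, 6)},
}

# A noisy rendition of "one": one spike is missing and one is spurious.
noisy = {(0, 1), (3, 2), (7, 6), (4, 3)}
best = recognize(noisy, templates)
```

Because the code is sparse, a few missing or spurious spikes change the overlap with the correct template only moderately, while the overlap with other templates stays low.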
The spike sequences generated by the feature detectors can be recognized by a network of neurons with stable transient plateau potentials, or UP states, which are often observed in the dendrites of pyramidal neurons. The state of the network is defined by which neurons are in the UP state. Transitions between network states are driven by the inputs from the feature detectors and by the connectivity between the neurons. Different inputs drive the network into different states. Auditory objects can thus be recognized by identifying the network states reached in response to the auditory inputs (PDF, PDF).
Neural coding in the basal ganglia
The basal ganglia is a critical structure for motor control and learning, and is extensively connected with many areas of the brain. The striatum is the input station of the basal ganglia. Dopamine signals, which serve as reward signals for reinforcement learning of implicit motor skills and sensory-motor associations, target the striatum. It is thus believed that the striatum is a key structure for reinforcement learning. Temporal difference learning is a standard reinforcement learning mechanism. It explains how delayed rewards can be credited to the correct actions or sensory inputs that happen early and eventually lead to the rewards. The mechanism requires populations of neurons that fire sequentially between the actions or inputs and the rewards. But whether such dynamics exists in the striatum was unknown.
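The role of sequential firing in temporal difference learning can be sketched with a tabular TD(0) update, in which each time step between a cue and a reward is treated as a distinct state, standing in for a group of neurons active at that delay. The number of steps, discount factor, and learning rate below are hypothetical.

```python
def td_learn(n_steps=10, reward=1.0, gamma=0.9, lr=0.1, n_trials=2000):
    """Learn the value of each time step in a cue-to-reward interval with TD(0).

    State t stands for the neurons active t steps after the cue; the reward
    arrives on the last step. All parameters are illustrative assumptions.
    """
    values = [0.0] * n_steps
    for _ in range(n_trials):
        for t in range(n_steps):
            last = t == n_steps - 1
            r = reward if last else 0.0
            v_next = 0.0 if last else values[t + 1]
            delta = r + gamma * v_next - values[t]  # temporal difference error
            values[t] += lr * delta
    return values

values = td_learn()
```

After learning, the value of each step approaches the discounted reward, `gamma ** (n_steps - 1 - t)`, so credit for the delayed reward has propagated back to the earliest step. Without a distinct state for each intermediate delay there would be nothing to carry the credit backward in time.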
We analyzed thousands of neurons recorded in the striatum and the prefrontal cortex of monkeys during a simple visually guided saccade task. We applied a clustering technique to categorize the neuronal response profiles. We found that neurons in both structures encoded all aspects of the task, including the visual signals on the screen and the motor signals generated by the subjects. The timings of these neural responses are dispersed. Most interestingly, we found a subset of neurons in the striatum and the prefrontal cortex that responded with single peaks at different delays relative to the onset of the visual signals. These neurons thus formed a sequential firing pattern that filled the gaps between the visual inputs. With the populations of neurons in both structures, all time points during the task period can be precisely decoded. Our results suggest that time is encoded in the dispersed response profiles of populations of neurons in the prefrontal cortex and the striatum. Furthermore, the sequential firing of neurons conjectured by the temporal difference learning mechanism does exist in the striatum, further supporting the possibility that this mechanism guides reinforcement learning in the basal ganglia (PDF).
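How elapsed time can be read out from neurons with response peaks at different delays is sketched below. The Gaussian response profiles, peak times, noise level, and least-squares decoder are all assumptions made for illustration and do not reproduce the analysis applied to the recordings.

```python
import math
import random

PEAKS = [k / 10 for k in range(11)]          # hypothetical peak delays (s)
CANDIDATES = [k / 100 for k in range(101)]   # candidate times to decode (s)

def population_response(t, width=0.1):
    """Expected rates of neurons with Gaussian response peaks at PEAKS."""
    return [math.exp(-((t - p) ** 2) / (2 * width ** 2)) for p in PEAKS]

def decode_time(rates):
    """Pick the candidate time whose expected response best matches the rates."""
    def sq_error(t):
        expected = population_response(t)
        return sum((r - e) ** 2 for r, e in zip(rates, expected))
    return min(CANDIDATES, key=sq_error)

rng = random.Random(1)
true_t = 0.37
clean = population_response(true_t)
noisy = [r + rng.gauss(0.0, 0.05) for r in clean]

decoded_clean = decode_time(clean)
decoded_noisy = decode_time(noisy)
```

Because the response peaks tile the interval, at every moment some neurons are on the steep flank of their profile, which is what makes each time point distinguishable from its neighbors.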
Population coding in the visual cortex
The tilt aftereffect is a visual illusion. Long exposure to gratings or bars at one orientation makes bars at other orientations appear rotated away from the exposed orientation. The neural mechanism of such “repulsive” effects of visual adaptation was unknown. Single unit recordings in the cat primary visual cortex revealed that adaptation to a single orientation led to changes in the tuning properties of neurons. The preferred orientations moved away from the adapting orientation. The response magnitudes also decreased. At first glance, the repulsive shifts of the preferred orientations explain the tilt aftereffect. We analyzed a population-coding model of the visual cortex and showed that in fact the opposite is usually true. Repulsive shifts of the preferred orientations alone lead to an attractive shift in the perceived orientation, the opposite of the tilt aftereffect. Only when the suppression of neural responses near the adapting orientation is strong enough does the repulsive perception occur. We analyzed the amounts of shift in the preferred orientations and of suppression of the neural responses in the neurons recorded in the primary visual cortex. The combined effects quantitatively matched the amount of tilt aftereffect typically observed. Our analysis revealed the importance of the interplay between the shifts in preferred orientation and the suppression of neural responses, and suggested that these two effects tend to cancel each other to preserve the fidelity of perception in normal conditions. Prolonged exposure to a single orientation breaks this balance, leading to errors in perception that manifest as the tilt aftereffect (PDF).
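The opposing effects of tuning shifts and response suppression can be illustrated with a toy population-vector readout in which each neuron keeps its pre-adaptation label while its tuning curve changes. Everything in this sketch is an assumption chosen for illustration: the tuning shape and width (`kappa`), the shift and suppression profiles, their amplitudes, the test orientation, and the readout itself. The size and even the sign of the effects can change with these choices.

```python
import cmath
import math

# Decoder labels: pre-adaptation preferred orientations (deg); adapter at 0 deg.
PREFS = [float(d) for d in range(-90, 90)]

def shift(pref, s_max, sigma=20.0):
    """Repulsive shift of the preferred orientation away from the adapter."""
    return s_max * (pref / sigma) * math.exp(0.5 - pref ** 2 / (2 * sigma ** 2))

def gain(pref, g_max, sigma=20.0):
    """Response suppression, strongest for neurons tuned near the adapter."""
    return 1.0 - g_max * math.exp(-pref ** 2 / (2 * sigma ** 2))

def perceived(theta, s_max=0.0, g_max=0.0, kappa=8.0):
    """Population-vector estimate (deg) of a test orientation theta (deg)."""
    total = 0j
    for pref in PREFS:
        tuned_to = pref + shift(pref, s_max)  # post-adaptation tuning peak
        angle = math.radians(2 * (theta - tuned_to))
        rate = gain(pref, g_max) * math.exp(kappa * (math.cos(angle) - 1))
        total += rate * cmath.exp(1j * math.radians(2 * pref))
    return math.degrees(cmath.phase(total)) / 2

test = 20.0
baseline = perceived(test)
shift_only = perceived(test, s_max=5.0)      # tuning shifts, no suppression
suppress_only = perceived(test, g_max=0.5)   # suppression, no tuning shifts
combined = perceived(test, s_max=5.0, g_max=0.5)
```

With these illustrative parameters, the estimate with tuning shifts alone falls below the test orientation, toward the adapter, while the estimate with suppression alone falls above it, away from the adapter. Whether the combined estimate comes out repulsive depends on how strong the suppression is relative to the shift.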