<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2026-04-07T22:40:10+02:00</updated><id>/feed.xml</id><title type="html">Fabrizio Musacchio</title><subtitle>I want to understand how the brain works. My research interests lie at the intersection of neuroscience, behavioral science and computational neuroscience. I’m especially interested in how the brain learns and which processes drive learning.</subtitle><author><name>Fabrizio Musacchio</name></author><entry><title type="html">Urbanczik-Senn plasticity</title><link href="/blog/2026-02-22-urbanczik_senn_plasticity/" rel="alternate" type="text/html" title="Urbanczik-Senn plasticity" /><published>2026-02-22T10:56:47+01:00</published><updated>2026-02-22T10:56:47+01:00</updated><id>/blog/urbanczik_senn_plasticity</id><content type="html" xml:base="/blog/2026-02-22-urbanczik_senn_plasticity/"><![CDATA[<p>In 2014, <a href="https://doi.org/10.1016/j.neuron.2013.11.030">Urbanczik and Senn</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> proposed a novel learning rule for dendritic <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> in a simplified compartmental neuron model. 
This rule extends traditional <a href="/blog/2026-02-12-stdp/">spike-timing-dependent plasticity (STDP)</a> by incorporating the local dendritic potential as a crucial third factor, alongside <a href="/blog/2026-02-12-stdp/">pre- and postsynaptic spike timings</a>. In this post, we briefly introduce the Urbanczik-Senn <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> model and discuss its implications for neural computation and learning.</p>

<p class="align-caption"><img src="/assets/images/posts/nest/urbanczik_senn_plasticity_weight_adaption.png" alt="jpg" title="Urbanczik-Senn plasticity" class="align-center" />
Evolution of <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> according to the Urbanczik-Senn <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> model. We will discuss these results in detail in the results section below.</p>

<h2 id="core-concepts">Core concepts</h2>
<p>Unlike traditional <a href="/blog/2026-02-12-stdp/">STDP models</a>, which rely solely on the relative timing of <a href="/blog/2026-02-12-stdp/#synapse">pre- and postsynaptic</a> spikes, the Urbanczik-Senn <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> model introduces a third factor: the local dendritic potential. The neuron model therefore consists of a somatic and a dendritic compartment. The somatic compartment integrates inputs and generates spikes, while the dendritic compartment receives <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> inputs. The local dendritic potential $V_d(t)$, which serves as a prediction of the somatic activity, evolves according to:</p>

\[\begin{align}
C_m \frac{dV_d(t)}{dt} &amp;= -g_L (V_d(t) - E_L) + I_d(t)
\end{align}\]

<p>where $C_m$ is the membrane capacitance, $g_L$ is the leak conductance, $E_L$ is the resting membrane potential, and $I_d(t)$ is the dendritic input current.</p>

<p>The somatic potential $V_s(t)$ is driven by the dendritic potential $V_d$ and by <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> inputs that target the soma directly:</p>

\[\begin{align}
C_m \frac{dV_s(t)}{dt} =&amp; -g_L (V_s(t) - E_L) + I_s(t) \\
                        &amp; + g_D (V_d(t) - V_s(t)) \nonumber
\end{align}\]

<p>where $I_s(t)$ is the somatic input current and $g_D$ is the dendritic coupling conductance. The model aims to minimize the discrepancy between the actual somatic firing and the firing rate predicted from the dendritic compartment; the dendritic prediction $V_d^*$ of the somatic membrane potential is given by:</p>

\[\begin{align}
V_d^* &amp;= \frac{g_L \cdot E_L + g_D \cdot V_d}{g_L + g_D}
\end{align}\]

<p>The somatic firing rate $\phi(V_s)$ is given by:</p>

\[\begin{align}
\phi(V_s) &amp;= \frac{\phi_{\text{max}}}{1 + k \cdot e^{\beta \cdot (\theta - V_s)}}
\end{align}\]

<p>where $\phi_{\text{max}}$ is the maximum rate, $k$ is the rate slope, $\beta$ controls the steepness of the sigmoid, and $\theta$ is the threshold potential.</p>
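
<p>To make these dynamics concrete, here is a minimal NumPy sketch (a standalone toy illustration, not the NEST implementation used below) that integrates the two coupled compartments with forward Euler and evaluates the rate function; the sinusoidal input current and its amplitude are made up for demonstration, while the remaining parameter values roughly follow the NEST example further below:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

# illustrative parameters (roughly matching the NEST example below):
C_m, g_L, E_L = 300.0, 30.0, -70.0        # capacitance, leak conductance, resting potential
g_D = 600.0                               # dendro-somatic coupling conductance
phi_max, k, beta, theta = 0.15, 0.5, 1.0 / 3.0, -55.0

def rate(U):
    """somatic rate function phi(U)"""
    return phi_max / (1.0 + k * np.exp(beta * (theta - U)))

dt, T = 0.1, 200.0                        # time step and duration in ms
t = np.arange(0.0, T, dt)
V_d = np.full_like(t, E_L)                # dendritic membrane potential
V_s = np.full_like(t, E_L)                # somatic membrane potential
I_d = 50.0 * np.sin(2.0 * np.pi * t / T)  # toy dendritic input current

# forward-Euler integration of both compartments (no direct somatic input here):
for i in range(1, len(t)):
    dV_d = (-g_L * (V_d[i - 1] - E_L) + I_d[i - 1]) / C_m
    dV_s = (-g_L * (V_s[i - 1] - E_L) + g_D * (V_d[i - 1] - V_s[i - 1])) / C_m
    V_d[i] = V_d[i - 1] + dt * dV_d
    V_s[i] = V_s[i - 1] + dt * dV_s

firing_rate = rate(V_s)                   # instantaneous somatic firing rate
</code></pre></div></div>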

<p><a href="/blog/2026-02-12-stdp/#synapse">Synaptic weights</a> are adjusted based on a local dendritic prediction error, which is the difference between the actual somatic spikes and the predicted firing rate from the dendritic potential:</p>

\[\begin{align}
\Delta w_i &amp;= \eta \cdot [S(t) - \phi(V_d^*)] \cdot \frac{\partial V_d}{\partial w_i}
\end{align}\]

<p>Here, $\eta$ is the learning rate, $S(t)$ represents the somatic spike train, and $\frac{\partial V_d}{\partial w_i}$ is the contribution of <a href="/blog/2026-02-12-stdp/#synapse">synapse</a> $i$ to the dendritic potential.</p>

<p>This <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity rule</a> is versatile and can support various learning paradigms:</p>

<ul>
  <li><strong>Supervised learning</strong>: The somatic compartment receives a target signal that guides learning.</li>
  <li><strong>Unsupervised learning</strong>: The network generates its own teaching signals, promoting self-organization.</li>
  <li><strong>Reinforcement learning</strong>: The learning rate is modulated by a reward signal.</li>
</ul>

<p>The main advantage of this model is its ability to unify different learning paradigms under a single rule, driven by the dendritic prediction error. This rule is both biologically plausible and functionally powerful, offering a comprehensive framework for understanding and simulating <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a> in <a href="/blog/2026-02-04-neural_dynamics/">neural networks</a>. It therefore enables more nuanced and robust learning dynamics compared to traditional <a href="/blog/2026-02-12-stdp/">STDP models</a>:</p>

<ul>
  <li><strong>Predictive coding</strong>: The dendritic prediction error adjusts <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> to better match somatic activity, embodying a form of predictive coding.</li>
  <li><strong>Robust learning</strong>: By integrating dendritic potentials, the model captures more nuanced <a href="/blog/2026-02-12-stdp/#synapse">synaptic dynamics</a> compared to traditional <a href="/blog/2026-02-12-stdp/">STDP</a>, which only considers spike timings.</li>
  <li><strong>Versatility</strong>: The model’s applicability to supervised, unsupervised, and reinforcement learning highlights its robustness and broad relevance to various neural processing tasks.</li>
</ul>

<p>For further details, please refer to the original paper by <a href="https://doi.org/10.1016/j.neuron.2013.11.030">Urbanczik and Senn (2014)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</p>

<h2 id="simulation-in-nest">Simulation in NEST</h2>
<p>In the following, we replicate the NEST tutorial <a href="https://nest-simulator.readthedocs.io/en/stable/auto_examples/urbanczik_synapse_example.html">“Weight adaptation according to the Urbanczik-Senn plasticity”</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> with some minor modifications. The simulation uses the <a href="https://nest-simulator.readthedocs.io/en/stable/models/pp_cond_exp_mc_urbanczik.html"><code class="language-plaintext highlighter-rouge">pp_cond_exp_mc_urbanczik</code> model</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> implemented in the <a href="/blog/2024-06-09-nest_SNN_simulator/">NEST simulator</a>. It is a two-compartment spiking point-process neuron with conductance-based <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> that can connect via the plastic <a href="https://nest-simulator.readthedocs.io/en/stable/models/urbanczik_synapse.html"><code class="language-plaintext highlighter-rouge">urbanczik_synapse</code></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>. The code reproduces the simulation results shown in Figure 1B of <a href="https://doi.org/10.1016/j.neuron.2013.11.030">Urbanczik and Senn’s original work</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, and all simulation parameters are set accordingly, except for the units: the simulation uses standard units instead of the unitless quantities used in the paper.</p>

<p>Let’s begin with importing the necessary libraries:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="n">matplotlib.gridspec</span> <span class="k">as</span> <span class="n">gridspec</span>
<span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">nest</span>
<span class="kn">import</span> <span class="n">nest.raster_plot</span>
<span class="c1"># set the verbosity of the NEST simulator:
</span><span class="n">nest</span><span class="p">.</span><span class="nf">set_verbosity</span><span class="p">(</span><span class="sh">"</span><span class="s">M_WARNING</span><span class="sh">"</span><span class="p">)</span>

<span class="c1"># Set global properties for all plots
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">.</span><span class="nf">update</span><span class="p">({</span><span class="sh">'</span><span class="s">font.size</span><span class="sh">'</span><span class="p">:</span> <span class="mi">12</span><span class="p">})</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.top</span><span class="sh">"</span><span class="p">]</span>    <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.bottom</span><span class="sh">"</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.left</span><span class="sh">"</span><span class="p">]</span>   <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.right</span><span class="sh">"</span><span class="p">]</span>  <span class="o">=</span> <span class="bp">False</span>
</code></pre></div></div>

<p>Next, we need to define a couple of helper functions. First, we define two functions that generate inhibitory and excitatory inputs to the soma:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define a function for the inhibitory input:
</span><span class="k">def</span> <span class="nf">g_inh</span><span class="p">(</span><span class="n">amplitude</span><span class="p">,</span> <span class="n">t_start</span><span class="p">,</span> <span class="n">t_end</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    returns weights for the spike generator that drives the inhibitory
    somatic conductance.
    </span><span class="sh">"""</span>
    <span class="k">return</span> <span class="k">lambda</span> <span class="n">t</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="nf">piecewise</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="p">[(</span><span class="n">t</span> <span class="o">&gt;=</span> <span class="n">t_start</span><span class="p">)</span> <span class="o">&amp;</span> <span class="p">(</span><span class="n">t</span> <span class="o">&lt;</span> <span class="n">t_end</span><span class="p">)],</span> <span class="p">[</span><span class="n">amplitude</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">])</span>

<span class="c1"># define a function for the excitatory input:
</span><span class="k">def</span> <span class="nf">g_exc</span><span class="p">(</span><span class="n">amplitude</span><span class="p">,</span> <span class="n">freq</span><span class="p">,</span> <span class="n">offset</span><span class="p">,</span> <span class="n">t_start</span><span class="p">,</span> <span class="n">t_end</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    returns weights for the spike generator that drives the excitatory
    somatic conductance.
    </span><span class="sh">"""</span>
    <span class="k">return</span> <span class="k">lambda</span> <span class="n">t</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="nf">piecewise</span><span class="p">(</span>
        <span class="n">t</span><span class="p">,</span> <span class="p">[(</span><span class="n">t</span> <span class="o">&gt;=</span> <span class="n">t_start</span><span class="p">)</span> <span class="o">&amp;</span> <span class="p">(</span><span class="n">t</span> <span class="o">&lt;</span> <span class="n">t_end</span><span class="p">)],</span> <span class="p">[</span><span class="k">lambda</span> <span class="n">t</span><span class="p">:</span> <span class="n">amplitude</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">sin</span><span class="p">(</span><span class="n">freq</span> <span class="o">*</span> <span class="n">t</span><span class="p">)</span> <span class="o">+</span> <span class="n">offset</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">]</span>
    <span class="p">)</span>
</code></pre></div></div>
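
<p>Both helpers return a function of time that can be evaluated on an array of spike times to obtain the corresponding spike weights. For example (with illustrative arguments):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ts = np.arange(0.0, 1000.0, 0.1)           # sample times in ms
w_inh = g_inh(2.0, 100.0, 900.0)(ts)       # constant amplitude inside the time window
w_exc = g_exc(2.0, 2.0 * np.pi / 200.0, 3.0, 100.0, 900.0)(ts)  # sinusoid plus offset
</code></pre></div></div>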

<p>Then, we define a function that calculates the matching potential $U_M$ as a function of the somatic conductances $g_E$ and $g_I$:</p>

\[\begin{align}
U_M &amp;= \frac{g_E \cdot E_E + g_I \cdot E_I}{g_E + g_I}  
\end{align}\]

<p>This function computes the equilibrium potential at the soma, given the balance of excitatory and inhibitory <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> inputs:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define the matching potential:
</span><span class="k">def</span> <span class="nf">matching_potential</span><span class="p">(</span><span class="n">g_E</span><span class="p">,</span> <span class="n">g_I</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    returns the matching potential as a function of the somatic conductances.
    </span><span class="sh">"""</span>
    <span class="n">E_E</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">soma</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">E_ex</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">E_I</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">soma</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">E_in</span><span class="sh">"</span><span class="p">]</span>
    <span class="nf">return </span><span class="p">(</span><span class="n">g_E</span> <span class="o">*</span> <span class="n">E_E</span> <span class="o">+</span> <span class="n">g_I</span> <span class="o">*</span> <span class="n">E_I</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">g_E</span> <span class="o">+</span> <span class="n">g_I</span><span class="p">)</span>
</code></pre></div></div>
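
<p>As a quick sanity check (using the reversal potentials that appear in the neuron parameters further below, $E_E = 0\,$mV and $E_I = -75\,$mV), equal conductances place $U_M$ exactly at the midpoint between the two reversal potentials:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical minimal parameter dictionary, just for this check:
toy_params = {"soma": {"E_ex": 0.0, "E_in": -75.0}}
print(matching_potential(1.0, 1.0, toy_params))  # -37.5
</code></pre></div></div>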

<p>Next, we define the dendritic prediction $V_d^*$ of the somatic membrane potential:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define the dendritic prediction of the somatic membrane potential:
</span><span class="k">def</span> <span class="nf">V_w_star</span><span class="p">(</span><span class="n">V_w</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    returns the dendritic prediction of the somatic membrane potential.
    </span><span class="sh">"""</span>
    <span class="n">g_D</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">g_sp</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">g_L</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">soma</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">g_L</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">E_L</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">soma</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">E_L</span><span class="sh">"</span><span class="p">]</span>
    <span class="nf">return </span><span class="p">(</span><span class="n">g_L</span> <span class="o">*</span> <span class="n">E_L</span> <span class="o">+</span> <span class="n">g_D</span> <span class="o">*</span> <span class="n">V_w</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">g_L</span> <span class="o">+</span> <span class="n">g_D</span><span class="p">)</span>
</code></pre></div></div>

<p>We also need a function to calculate the rate function $\phi(U)$:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define the rate function phi:
</span><span class="k">def</span> <span class="nf">phi</span><span class="p">(</span><span class="n">U</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    rate function of the soma
    </span><span class="sh">"""</span>
    <span class="n">phi_max</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">phi_max</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">k</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">rate_slope</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">beta</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">beta</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">theta</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">theta</span><span class="sh">"</span><span class="p">]</span>
    <span class="k">return</span> <span class="n">phi_max</span> <span class="o">/</span> <span class="p">(</span><span class="mf">1.0</span> <span class="o">+</span> <span class="n">k</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">exp</span><span class="p">(</span><span class="n">beta</span> <span class="o">*</span> <span class="p">(</span><span class="n">theta</span> <span class="o">-</span> <span class="n">U</span><span class="p">)))</span>
</code></pre></div></div>

<p>and a function $h$ that is closely related to the derivative of the rate function $\phi$ (as defined in the NEST example), i.e.,</p>

\[\begin{align}
h(V_s) &amp;= \frac{15 \cdot \beta}{1 + \frac{e^{-\beta \cdot (\theta - V_s)}}{k}}
\end{align}\]

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define the derivative of the rate function phi:
</span><span class="k">def</span> <span class="nf">h</span><span class="p">(</span><span class="n">U</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    derivative of the rate function phi
    </span><span class="sh">"""</span>
    <span class="n">k</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">rate_slope</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">beta</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">beta</span><span class="sh">"</span><span class="p">]</span>
    <span class="n">theta</span> <span class="o">=</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">theta</span><span class="sh">"</span><span class="p">]</span>
    <span class="k">return</span> <span class="mf">15.0</span> <span class="o">*</span> <span class="n">beta</span> <span class="o">/</span> <span class="p">(</span><span class="mf">1.0</span> <span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="nf">exp</span><span class="p">(</span><span class="o">-</span><span class="n">beta</span> <span class="o">*</span> <span class="p">(</span><span class="n">theta</span> <span class="o">-</span> <span class="n">U</span><span class="p">))</span> <span class="o">/</span> <span class="n">k</span><span class="p">)</span>
</code></pre></div></div>

<p>This function enters the <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity rule</a>, where it scales the dendritic prediction error that drives the weight change of each <a href="/blog/2026-02-12-stdp/#synapse">synapse</a>.</p>
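
<p>Note that, with this definition, $h$ is proportional to the logarithmic derivative of the rate function, i.e., $\phi'(U) = \phi(U) \cdot h(U) / 15$. A quick numerical check (with a hypothetical parameter dictionary matching the neuron parameters defined below) confirms this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>toy_params = {"phi_max": 0.15, "rate_slope": 0.5, "beta": 1.0 / 3.0, "theta": -55.0}
U = np.linspace(-80.0, -30.0, 1001)
dphi_num = np.gradient(phi(U, toy_params), U)  # numerical derivative of phi
print(np.allclose(dphi_num, phi(U, toy_params) * h(U, toy_params) / 15.0, atol=1e-5))  # True
</code></pre></div></div>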

<p>Now, we can start setting up the simulation. We first reset the NEST kernel and set the simulation parameters:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nest</span><span class="p">.</span><span class="nc">ResetKernel</span><span class="p">()</span>

<span class="c1"># set simulation parameters:
</span><span class="n">n_pattern_rep</span>     <span class="o">=</span> <span class="mi">100</span>  <span class="c1"># number of repetitions of the spike pattern
</span><span class="n">pattern_duration</span>  <span class="o">=</span> <span class="mf">200.0</span>
<span class="n">t_start</span>           <span class="o">=</span> <span class="mf">2.0</span> <span class="o">*</span> <span class="n">pattern_duration</span>
<span class="n">t_end</span>             <span class="o">=</span> <span class="n">n_pattern_rep</span> <span class="o">*</span> <span class="n">pattern_duration</span> <span class="o">+</span> <span class="n">t_start</span>
<span class="n">simulation_time</span>   <span class="o">=</span> <span class="n">t_end</span> <span class="o">+</span> <span class="mf">2.0</span> <span class="o">*</span> <span class="n">pattern_duration</span>
<span class="n">n_rep_total</span>       <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">around</span><span class="p">(</span><span class="n">simulation_time</span> <span class="o">/</span> <span class="n">pattern_duration</span><span class="p">))</span>
<span class="n">resolution</span>        <span class="o">=</span> <span class="mf">0.1</span>
<span class="n">nest</span><span class="p">.</span><span class="n">resolution</span>   <span class="o">=</span> <span class="n">resolution</span>
</code></pre></div></div>
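
<p>With these values, the pattern is presented from $t_{\text{start}} = 400\,$ms to $t_{\text{end}} = 20400\,$ms, the total simulation time amounts to $20800\,$ms, and the simulation loop below runs over $104$ intervals of one pattern duration each.</p>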

<p>Next, we set the neuron parameters, synapse parameters, and input parameters:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># set neuron parameters:
</span><span class="n">nrn_model</span> <span class="o">=</span> <span class="sh">"</span><span class="s">pp_cond_exp_mc_urbanczik</span><span class="sh">"</span>
<span class="n">nrn_params</span> <span class="o">=</span> <span class="p">{</span>
    <span class="sh">"</span><span class="s">t_ref</span><span class="sh">"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>       <span class="c1"># refractory period
</span>    <span class="sh">"</span><span class="s">g_sp</span><span class="sh">"</span><span class="p">:</span> <span class="mf">600.0</span><span class="p">,</span>      <span class="c1"># soma-to-dendritic coupling conductance
</span>    <span class="sh">"</span><span class="s">soma</span><span class="sh">"</span><span class="p">:</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">V_m</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">70.0</span><span class="p">,</span>   <span class="c1"># initial value of V_m
</span>        <span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">:</span> <span class="mf">300.0</span><span class="p">,</span>   <span class="c1"># capacitance of membrane
</span>        <span class="sh">"</span><span class="s">E_L</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">70.0</span><span class="p">,</span>   <span class="c1"># resting potential
</span>        <span class="sh">"</span><span class="s">g_L</span><span class="sh">"</span><span class="p">:</span> <span class="mf">30.0</span><span class="p">,</span>    <span class="c1"># somatic leak conductance
</span>        <span class="sh">"</span><span class="s">E_ex</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span>    <span class="c1"># resting potential for exc input
</span>        <span class="sh">"</span><span class="s">E_in</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">75.0</span><span class="p">,</span>  <span class="c1"># resting potential for inh input
</span>        <span class="sh">"</span><span class="s">tau_syn_ex</span><span class="sh">"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>  <span class="c1"># time constant of exc conductance
</span>        <span class="sh">"</span><span class="s">tau_syn_in</span><span class="sh">"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>  <span class="c1"># time constant of inh conductance
</span>    <span class="p">},</span>
    <span class="sh">"</span><span class="s">dendritic</span><span class="sh">"</span><span class="p">:</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">V_m</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">70.0</span><span class="p">,</span>  <span class="c1"># initial value of V_m
</span>        <span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">:</span> <span class="mf">300.0</span><span class="p">,</span>  <span class="c1"># capacitance of membrane
</span>        <span class="sh">"</span><span class="s">E_L</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">70.0</span><span class="p">,</span>  <span class="c1"># resting potential
</span>        <span class="sh">"</span><span class="s">g_L</span><span class="sh">"</span><span class="p">:</span> <span class="mf">30.0</span><span class="p">,</span>   <span class="c1"># dendritic leak conductance
</span>        <span class="sh">"</span><span class="s">tau_syn_ex</span><span class="sh">"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>  <span class="c1"># time constant of exc input current
</span>        <span class="sh">"</span><span class="s">tau_syn_in</span><span class="sh">"</span><span class="p">:</span> <span class="mf">3.0</span><span class="p">,</span>  <span class="c1"># time constant of inh input current
</span>    <span class="p">},</span>
    <span class="c1"># set parameters of rate function:
</span>    <span class="sh">"</span><span class="s">phi_max</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.15</span><span class="p">,</span>    <span class="c1"># max rate
</span>    <span class="sh">"</span><span class="s">rate_slope</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.5</span><span class="p">,</span>  <span class="c1"># called 'k' in the paper
</span>    <span class="sh">"</span><span class="s">beta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span> <span class="o">/</span> <span class="mf">3.0</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">theta</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">55.0</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># set synapse params:
</span><span class="n">syns</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">GetDefaults</span><span class="p">(</span><span class="n">nrn_model</span><span class="p">)[</span><span class="sh">"</span><span class="s">receptor_types</span><span class="sh">"</span><span class="p">]</span>
<span class="n">init_w</span> <span class="o">=</span> <span class="mf">0.3</span> <span class="o">*</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">dendritic</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">]</span>
<span class="n">syn_params</span> <span class="o">=</span> <span class="p">{</span>
    <span class="sh">"</span><span class="s">synapse_model</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">urbanczik_synapse_wr</span><span class="sh">"</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">receptor_type</span><span class="sh">"</span><span class="p">:</span> <span class="n">syns</span><span class="p">[</span><span class="sh">"</span><span class="s">dendritic_exc</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">tau_Delta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">100.0</span><span class="p">,</span>  <span class="c1"># time constant of low pass filtering of the weight change
</span>    <span class="sh">"</span><span class="s">eta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.17</span><span class="p">,</span>  <span class="c1"># learning rate
</span>    <span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="n">init_w</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">Wmax</span><span class="sh">"</span><span class="p">:</span> <span class="mf">4.5</span> <span class="o">*</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">dendritic</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">resolution</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># set somatic input:
</span><span class="n">ampl_exc</span> <span class="o">=</span> <span class="mf">0.016</span> <span class="o">*</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">dendritic</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">]</span> <span class="c1"># amplitude of the excitatory input in nA
</span><span class="n">offset</span> <span class="o">=</span> <span class="mf">0.018</span> <span class="o">*</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">dendritic</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">]</span>   <span class="c1"># offset of the excitatory input in nA
</span><span class="n">ampl_inh</span> <span class="o">=</span> <span class="mf">0.06</span> <span class="o">*</span> <span class="n">nrn_params</span><span class="p">[</span><span class="sh">"</span><span class="s">dendritic</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">]</span>  <span class="c1"># amplitude of the inhibitory input in nA
</span><span class="n">freq</span> <span class="o">=</span> <span class="mf">2.0</span> <span class="o">/</span> <span class="n">pattern_duration</span>                     <span class="c1"># frequency of the excitatory input in Hz
</span><span class="n">soma_exc_inp</span> <span class="o">=</span> <span class="nf">g_exc</span><span class="p">(</span><span class="n">ampl_exc</span><span class="p">,</span> <span class="mf">2.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">freq</span><span class="p">,</span> <span class="n">offset</span><span class="p">,</span> <span class="n">t_start</span><span class="p">,</span> <span class="n">t_end</span><span class="p">)</span> <span class="c1"># excitatory input
</span><span class="n">soma_inh_inp</span> <span class="o">=</span> <span class="nf">g_inh</span><span class="p">(</span><span class="n">ampl_inh</span><span class="p">,</span> <span class="n">t_start</span><span class="p">,</span> <span class="n">t_end</span><span class="p">)</span>                              <span class="c1"># inhibitory input
</span></code></pre></div></div>
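
<p>With the dendritic $C_m$ of $300\,$pF from the neuron parameters above, this yields an excitatory amplitude of $4.8$, an offset of $5.4$, and an inhibitory amplitude of $18.0$; the sine argument $2\pi \cdot \text{freq}$ corresponds to two full cycles per pattern, i.e., a $10\,$Hz modulation for the $200\,$ms pattern.</p>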

<p>Then, we set the dendritic input by creating a spike pattern using Poisson generators: we simulate $n_{\text{pg}}$ Poisson generators for one pattern duration, record their spikes, and later feed the recorded spike times into spike generators:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># set dendritic input
</span><span class="n">n_pg</span> <span class="o">=</span> <span class="mi">200</span>      <span class="c1"># number of poisson generators
</span><span class="n">p_rate</span> <span class="o">=</span> <span class="mf">10.0</span>  <span class="c1"># rate in Hz
</span><span class="n">pgs</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">poisson_generator</span><span class="sh">"</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="n">n_pg</span><span class="p">,</span> <span class="n">params</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">rate</span><span class="sh">"</span><span class="p">:</span> <span class="n">p_rate</span><span class="p">})</span> <span class="c1"># poisson generators
</span><span class="n">prrt_nrns_pg</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">parrot_neuron</span><span class="sh">"</span><span class="p">,</span> <span class="n">n_pg</span><span class="p">)</span>                       <span class="c1"># parrot neurons (for technical reasons)
</span><span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">pgs</span><span class="p">,</span> <span class="n">prrt_nrns_pg</span><span class="p">,</span> <span class="p">{</span><span class="sh">"</span><span class="s">rule</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">one_to_one</span><span class="sh">"</span><span class="p">})</span>
<span class="n">spikerecorder</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">spike_recorder</span><span class="sh">"</span><span class="p">,</span> <span class="n">n_pg</span><span class="p">)</span>                <span class="c1"># create the spike recorder
</span><span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">prrt_nrns_pg</span><span class="p">,</span> <span class="n">spikerecorder</span><span class="p">,</span> <span class="p">{</span><span class="sh">"</span><span class="s">rule</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">one_to_one</span><span class="sh">"</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Simulate</span><span class="p">(</span><span class="n">pattern_duration</span><span class="p">)</span>
<span class="n">t_srs</span> <span class="o">=</span> <span class="p">[</span><span class="n">ssr</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">events</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">times</span><span class="sh">"</span><span class="p">)</span> <span class="k">for</span> <span class="n">ssr</span> <span class="ow">in</span> <span class="n">spikerecorder</span><span class="p">]</span>
</code></pre></div></div>
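
<p>As an optional sanity check, each generator should have produced on the order of $10\,\text{Hz} \times 0.2\,\text{s} = 2$ spikes within the recorded pattern:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>n_spikes = [len(spike_times) for spike_times in t_srs]
print(f"spikes per generator: mean = {np.mean(n_spikes):.2f}, total = {np.sum(n_spikes)}")
</code></pre></div></div>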

<p>After simulating the spike pattern, the spike times are stored in the variable <code class="language-plaintext highlighter-rouge">t_srs</code> and we need to reset the simulation kernel to start the actual simulation:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nest</span><span class="p">.</span><span class="nc">ResetKernel</span><span class="p">()</span>
<span class="n">nest</span><span class="p">.</span><span class="n">resolution</span> <span class="o">=</span> <span class="n">resolution</span>
</code></pre></div></div>

<p>Now, we can create the neuron and the corresponding spike generators:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define neuron:
</span><span class="n">nrn</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="n">nrn_model</span><span class="p">,</span> <span class="n">params</span><span class="o">=</span><span class="n">nrn_params</span><span class="p">)</span> <span class="c1"># create the Urbanczik neuron
</span>
<span class="c1"># poisson generators are connected to parrot neurons which are connected to the mc neuron:
</span><span class="n">prrt_nrns</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">parrot_neuron</span><span class="sh">"</span><span class="p">,</span> <span class="n">n_pg</span><span class="p">)</span>

<span class="c1"># create excitatory input to the soma:
</span><span class="n">spike_times_soma_inp</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="n">resolution</span><span class="p">,</span> <span class="n">simulation_time</span><span class="p">,</span> <span class="n">resolution</span><span class="p">)</span>
<span class="n">sg_soma_exc</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">spike_generator</span><span class="sh">"</span><span class="p">,</span> 
                          <span class="n">params</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">spike_times</span><span class="sh">"</span><span class="p">:</span> <span class="n">spike_times_soma_inp</span><span class="p">,</span> 
                                  <span class="sh">"</span><span class="s">spike_weights</span><span class="sh">"</span><span class="p">:</span> <span class="nf">soma_exc_inp</span><span class="p">(</span><span class="n">spike_times_soma_inp</span><span class="p">)})</span>
<span class="c1"># create inhibitory input to the soma:
</span><span class="n">sg_soma_inh</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">spike_generator</span><span class="sh">"</span><span class="p">,</span> 
                          <span class="n">params</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">spike_times</span><span class="sh">"</span><span class="p">:</span> <span class="n">spike_times_soma_inp</span><span class="p">,</span> 
                                  <span class="sh">"</span><span class="s">spike_weights</span><span class="sh">"</span><span class="p">:</span> <span class="nf">soma_inh_inp</span><span class="p">(</span><span class="n">spike_times_soma_inp</span><span class="p">)})</span>

<span class="c1"># create excitatory input to the dendrite:
</span><span class="n">sg_prox</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">spike_generator</span><span class="sh">"</span><span class="p">,</span> <span class="n">n</span><span class="o">=</span><span class="n">n_pg</span><span class="p">)</span>
</code></pre></div></div>

<p>We also create a multimeter for recording all parameters of the Urbanczik neuron, a weight recorder for recording the <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> of the Urbanczik synapses, and another spike recorder for recording the spiking of the soma:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create a multimeter for recording all parameters of the Urbanczik neuron:
</span><span class="n">rqs</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">GetDefaults</span><span class="p">(</span><span class="n">nrn_model</span><span class="p">)[</span><span class="sh">"</span><span class="s">recordables</span><span class="sh">"</span><span class="p">]</span>
<span class="n">multimeter</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">multimeter</span><span class="sh">"</span><span class="p">,</span> <span class="n">params</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">record_from</span><span class="sh">"</span><span class="p">:</span> <span class="n">rqs</span><span class="p">,</span> <span class="sh">"</span><span class="s">interval</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.1</span><span class="p">})</span>

<span class="c1"># create a weight_recorder for recoding the synaptic weights of the Urbanczik synapses:
</span><span class="n">wr</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">weight_recorder</span><span class="sh">"</span><span class="p">)</span>

<span class="c1"># create another spike recorder for recording the spiking of the soma:
</span><span class="n">spikerecorder_soma</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">spike_recorder</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>

<p>Finally, we connect all nodes:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># connect all nodes:
</span><span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">sg_prox</span><span class="p">,</span> <span class="n">prrt_nrns</span><span class="p">,</span> <span class="p">{</span><span class="sh">"</span><span class="s">rule</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">one_to_one</span><span class="sh">"</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">CopyModel</span><span class="p">(</span><span class="sh">"</span><span class="s">urbanczik_synapse</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">urbanczik_synapse_wr</span><span class="sh">"</span><span class="p">,</span> <span class="p">{</span><span class="sh">"</span><span class="s">weight_recorder</span><span class="sh">"</span><span class="p">:</span> <span class="n">wr</span><span class="p">[</span><span class="mi">0</span><span class="p">]})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">prrt_nrns</span><span class="p">,</span> <span class="n">nrn</span><span class="p">,</span> <span class="n">syn_spec</span><span class="o">=</span><span class="n">syn_params</span><span class="p">)</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">multimeter</span><span class="p">,</span> <span class="n">nrn</span><span class="p">,</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.1</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">sg_soma_exc</span><span class="p">,</span> <span class="n">nrn</span><span class="p">,</span> 
             <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">receptor_type</span><span class="sh">"</span><span class="p">:</span> <span class="n">syns</span><span class="p">[</span><span class="sh">"</span><span class="s">soma_exc</span><span class="sh">"</span><span class="p">],</span> 
                       <span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">10.0</span> <span class="o">*</span> <span class="n">resolution</span><span class="p">,</span> 
                       <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">resolution</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">sg_soma_inh</span><span class="p">,</span> <span class="n">nrn</span><span class="p">,</span> 
             <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">receptor_type</span><span class="sh">"</span><span class="p">:</span> <span class="n">syns</span><span class="p">[</span><span class="sh">"</span><span class="s">soma_inh</span><span class="sh">"</span><span class="p">],</span> 
                       <span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">10.0</span> <span class="o">*</span> <span class="n">resolution</span><span class="p">,</span> 
                       <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">resolution</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">nrn</span><span class="p">,</span> <span class="n">spikerecorder_soma</span><span class="p">)</span>
</code></pre></div></div>

<p>and start the simulation, which is divided into intervals of the pattern duration:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># start the simulation, which is divided into intervals of the pattern duration:
</span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="n">n_rep_total</span><span class="p">):</span>
    <span class="c1"># Set the spike times of the pattern for each spike generator
</span>    <span class="k">for</span> <span class="n">sg</span><span class="p">,</span> <span class="n">t_sp</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">sg_prox</span><span class="p">,</span> <span class="n">t_srs</span><span class="p">):</span>
        <span class="n">nest</span><span class="p">.</span><span class="nc">SetStatus</span><span class="p">(</span><span class="n">sg</span><span class="p">,</span> <span class="p">{</span><span class="sh">"</span><span class="s">spike_times</span><span class="sh">"</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">t_sp</span><span class="p">)</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="n">pattern_duration</span><span class="p">})</span>
    <span class="n">nest</span><span class="p">.</span><span class="nc">Simulate</span><span class="p">(</span><span class="n">pattern_duration</span><span class="p">)</span>
</code></pre></div></div>

<p>After the simulation is completed, we read out the recorded data for plotting:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># read out devices for plotting:
# multimeter:
</span><span class="n">mm_events</span> <span class="o">=</span> <span class="n">multimeter</span><span class="p">.</span><span class="n">events</span>
<span class="n">t</span> <span class="o">=</span> <span class="n">mm_events</span><span class="p">[</span><span class="sh">"</span><span class="s">times</span><span class="sh">"</span><span class="p">]</span>
<span class="n">V_s</span> <span class="o">=</span> <span class="n">mm_events</span><span class="p">[</span><span class="sh">"</span><span class="s">V_m.s</span><span class="sh">"</span><span class="p">]</span>
<span class="n">V_d</span> <span class="o">=</span> <span class="n">mm_events</span><span class="p">[</span><span class="sh">"</span><span class="s">V_m.p</span><span class="sh">"</span><span class="p">]</span>
<span class="n">V_d_star</span> <span class="o">=</span> <span class="nc">V_w_star</span><span class="p">(</span><span class="n">V_d</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">)</span>
<span class="n">g_in</span> <span class="o">=</span> <span class="n">mm_events</span><span class="p">[</span><span class="sh">"</span><span class="s">g_in.s</span><span class="sh">"</span><span class="p">]</span>
<span class="n">g_ex</span> <span class="o">=</span> <span class="n">mm_events</span><span class="p">[</span><span class="sh">"</span><span class="s">g_ex.s</span><span class="sh">"</span><span class="p">]</span>
<span class="n">I_ex</span> <span class="o">=</span> <span class="n">mm_events</span><span class="p">[</span><span class="sh">"</span><span class="s">I_ex.p</span><span class="sh">"</span><span class="p">]</span>
<span class="n">I_in</span> <span class="o">=</span> <span class="n">mm_events</span><span class="p">[</span><span class="sh">"</span><span class="s">I_in.p</span><span class="sh">"</span><span class="p">]</span>
<span class="n">U_M</span> <span class="o">=</span> <span class="nf">matching_potential</span><span class="p">(</span><span class="n">g_ex</span><span class="p">,</span> <span class="n">g_in</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">)</span>

<span class="c1"># weight recorder:
</span><span class="n">wr_events</span> <span class="o">=</span> <span class="n">wr</span><span class="p">.</span><span class="n">events</span>
<span class="n">senders</span> <span class="o">=</span> <span class="n">wr_events</span><span class="p">[</span><span class="sh">"</span><span class="s">senders</span><span class="sh">"</span><span class="p">]</span>
<span class="n">targets</span> <span class="o">=</span> <span class="n">wr_events</span><span class="p">[</span><span class="sh">"</span><span class="s">targets</span><span class="sh">"</span><span class="p">]</span>
<span class="n">weights</span> <span class="o">=</span> <span class="n">wr_events</span><span class="p">[</span><span class="sh">"</span><span class="s">weights</span><span class="sh">"</span><span class="p">]</span>
<span class="n">times</span> <span class="o">=</span> <span class="n">wr_events</span><span class="p">[</span><span class="sh">"</span><span class="s">times</span><span class="sh">"</span><span class="p">]</span>

<span class="c1"># spike recorder:
</span><span class="n">spike_times_soma</span> <span class="o">=</span> <span class="n">spikerecorder_soma</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">events</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">times</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>

<p>Here are the plotting commands for $V_s$ (membrane potential of the soma), $V_d$ (membrane potential of the dendrite), $V_d^*$ (dendritic prediction of the somatic membrane potential), $U_M$ (matching potential), the somatic conductances $g_I$ and $g_E$, the dendritic currents $I_{\text{in}}$ and $I_{\text{ex}}$, the rates $\phi(V_s)$, $\phi(V_d)$, and $\phi(V_d^\ast)$, and the rate derivative $h(V_d^\ast)$:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot the results:
</span><span class="n">lw</span> <span class="o">=</span> <span class="mf">1.0</span>
<span class="n">fig1</span><span class="p">,</span> <span class="p">(</span><span class="n">axA</span><span class="p">,</span> <span class="n">axB</span><span class="p">,</span> <span class="n">axC</span><span class="p">,</span> <span class="n">axD</span><span class="p">,</span> <span class="n">axE</span><span class="p">)</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">sharex</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">12</span><span class="p">))</span>

<span class="c1"># plot membrane potentials and matching potential:
</span><span class="n">axA</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">V_s</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$V_s$ (soma)</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">darkblue</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">V_d</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$V_d$ (dendrit)</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">deepskyblue</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">V_d_star</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$V_d^\ast$ (dendrit)</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">b</span><span class="sh">"</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">U_M</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$U_M$ (soma)</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">r</span><span class="sh">"</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="sh">"</span><span class="s">-</span><span class="sh">"</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">membrane pot [mV]</span><span class="sh">"</span><span class="p">,</span> <span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper right</span><span class="sh">"</span><span class="p">)</span>

<span class="c1"># plot somatic conductances:
</span><span class="n">axB</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">g_in</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$g_I$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">r</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axB</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">g_ex</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$g_E$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">magenta</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axB</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">somatic</span><span class="se">\n</span><span class="s">conductance [nS]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axB</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper right</span><span class="sh">"</span><span class="p">)</span>

<span class="c1"># plot dendritic currents:
</span><span class="n">axC</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">I_in</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$I_$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">r</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axC</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">I_ex</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$I_$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">magenta</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axC</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">dendritic</span><span class="se">\n</span><span class="s">current [nA]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axC</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper right</span><span class="sh">"</span><span class="p">)</span>

<span class="c1"># plot rates:
</span><span class="n">axD</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="nf">phi</span><span class="p">(</span><span class="n">V_s</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">),</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$\phi(V_s)$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">darkblue</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axD</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="nf">phi</span><span class="p">(</span><span class="n">V_d</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">),</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$\phi(V_d)$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">deepskyblue</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axD</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="nf">phi</span><span class="p">(</span><span class="n">V_d_star</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">),</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$\phi(V_d^\ast)$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">b</span><span class="sh">"</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axD</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="nf">phi</span><span class="p">(</span><span class="n">V_s</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">)</span> <span class="o">-</span> <span class="nf">phi</span><span class="p">(</span><span class="n">V_d_star</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">),</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> 
         <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$\phi(V_s) - \phi(V_d^\ast)$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">r</span><span class="sh">"</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="sh">"</span><span class="s">-</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axD</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">spike_times_soma</span><span class="p">,</span> <span class="mf">0.15</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="nf">len</span><span class="p">(</span><span class="n">spike_times_soma</span><span class="p">)),</span> <span class="sh">"</span><span class="s">.</span><span class="sh">"</span><span class="p">,</span> 
         <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">g</span><span class="sh">"</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">spike</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axD</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper right</span><span class="sh">"</span><span class="p">)</span>

<span class="n">axE</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="nf">h</span><span class="p">(</span><span class="n">V_d_star</span><span class="p">,</span> <span class="n">nrn_params</span><span class="p">),</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">r</span><span class="sh">"</span><span class="s">$h(V_d^\ast)$</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">g</span><span class="sh">"</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="sh">"</span><span class="s">-</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axE</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">rate derivative</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axE</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper right</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axE</span><span class="p">.</span><span class="nf">set_xlim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">5000</span><span class="p">])</span> <span class="c1"># we don't need to plot the whole simulation time
</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<p>And here are the plot commands for plotting the evolution of <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot synaptic weights:
</span><span class="n">fig2</span><span class="p">,</span> <span class="n">axA</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">10</span><span class="p">):</span>
    <span class="n">index</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">intersect1d</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="n">senders</span> <span class="o">==</span> <span class="n">i</span><span class="p">),</span> <span class="n">np</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="n">targets</span> <span class="o">==</span> <span class="mi">1</span><span class="p">))</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="nf">len</span><span class="p">(</span><span class="n">index</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">axA</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">times</span><span class="p">[</span><span class="n">index</span><span class="p">],</span> <span class="n">weights</span><span class="p">[</span><span class="n">index</span><span class="p">],</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">pg_{}</span><span class="sh">"</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="mi">2</span><span class="p">),</span> <span class="n">lw</span><span class="o">=</span><span class="n">lw</span><span class="p">)</span>

<span class="n">axA</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Synaptic weights of Urbanczik synapses</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">time [ms]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axA</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">fontsize</span><span class="o">=</span><span class="mi">7</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper right</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<h2 id="results">Results</h2>
<p>In the following, we will take a look at the major results of the simulation, panel by panel. Note that we limited the plots to the first 5000 ms (10000 ms for the synaptic weights) of the simulation to focus on the initial dynamics.</p>

<h3 id="membrane-potentials-and-matching-potential">Membrane potentials and matching potential</h3>

<p class="align-caption"><a href="/assets/images/posts/nest/urbanczik_senn_plasticity_membrane_potentials.png" title="Simulation results of the Urbanczik-Senn plasticity model."><img src="/assets/images/posts/nest/urbanczik_senn_plasticity_membrane_potentials.png" width="100%" alt="Simulation results of the Urbanczik-Senn plasticity model." /></a><br />
First panel: membrane potentials of the soma ($V_s$, blue) and dendrite ($V_d$, cyan), the predicted dendritic potential ($V_d^*$, dashed blue), and the matching potential ($U_M$, red).</p>

<p>The first panel shows the membrane potentials of the soma ($V_s$, blue) and dendrite ($V_d$, cyan), the predicted dendritic potential ($V_d^*$, dashed blue), and the matching potential ($U_M$, red). Let’s describe what we see here:</p>

<ul>
  <li><strong>Membrane potentials ($V_s$ and $V_d$)</strong>: The somatic membrane potential ($V_s$) and dendritic membrane potential ($V_d$) show oscillatory behavior. This oscillation is driven by the excitatory and inhibitory inputs defined in our simulation parameters. The somatic membrane potential oscillates around a certain value, reflecting the integration of dendritic input and other somatic inputs.</li>
  <li><strong>Predicted dendritic potential ($V_d^*$)</strong>: This potential is calculated based on the somatic potential and the coupling conductance between soma and dendrite. It closely follows the dendritic potential ($V_d$), indicating that the model’s prediction aligns well with the actual dendritic activity.</li>
  <li><strong>Matching potential ($U_M$)</strong>: This potential is derived from the somatic excitatory and inhibitory conductances. It provides a reference for the neuron’s overall excitatory and inhibitory state.</li>
</ul>

<p>Thus, the matching potential ($U_M$) aligns with the oscillations in the somatic and dendritic potentials, indicating that the neuron is balancing its excitatory and inhibitory inputs effectively. The predicted dendritic potential ($V_d^*$) closely tracking the actual dendritic potential ($V_d$) demonstrates the accuracy of the model’s prediction mechanism. This is crucial for the learning rule, as the <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> adjustments depend on minimizing the discrepancy between the predicted and actual potentials.  The somatic firing rate, which is not directly visible in this panel but is influenced by these potentials, will be regulated based on the predicted dendritic potential, ensuring that the neuron’s output remains consistent with its input patterns.</p>
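<p>As a quick recap of what the <code class="language-plaintext highlighter-rouge">matching_potential()</code> helper used in the recording code above computes, here is a minimal sketch following the convention of the NEST tutorial referenced below; the exact parameter names (<code class="language-plaintext highlighter-rouge">E_ex</code>, <code class="language-plaintext highlighter-rouge">E_in</code>) are assumptions based on that tutorial:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def matching_potential(g_ex, g_in, nrn_params):
    # conductance-weighted average of the somatic reversal potentials;
    # it marks the potential at which excitation and inhibition balance
    E_ex = nrn_params["soma"]["E_ex"]
    E_in = nrn_params["soma"]["E_in"]
    return (g_ex * E_ex + g_in * E_in) / (g_ex + g_in)
</code></pre></div></div>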

<h3 id="somatic-conductances-and-dendritic-currents">Somatic conductances and dendritic currents</h3>

<p class="align-caption"><a href="/assets/images/posts/nest/urbanczik_senn_plasticity_somatic_conductances.png" title="Simulation results of the Urbanczik-Senn plasticity model."><img src="/assets/images/posts/nest/urbanczik_senn_plasticity_somatic_conductances.png" width="100%" alt="Simulation results of the Urbanczik-Senn plasticity model." /></a><br />
Second panel:  somatic conductances ($g_I$, red and $g_E$, magenta).</p>

<p>The second panel shows the somatic conductances ($g_I$, red, and $g_E$, magenta). $g_I$ starts at zero and quickly rises to a constant value, indicating a steady inhibitory input throughout the simulation. The excitatory conductance $g_E$ shows a sinusoidal pattern, oscillating in sync with the sinusoidal excitatory input applied. The steady inhibitory conductance ($g_I$) serves to balance the excitatory input and control the overall excitability of the neuron. The oscillatory excitatory conductance ($g_E$) directly reflects the applied sinusoidal input, showing that the <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> inputs are effectively driving the somatic conductances as intended.</p>

<h3 id="dendritic-currents">Dendritic currents</h3>

<p class="align-caption"><a href="/assets/images/posts/nest/urbanczik_senn_plasticity_dendritic_currents.png" title="Simulation results of the Urbanczik-Senn plasticity model."><img src="/assets/images/posts/nest/urbanczik_senn_plasticity_dendritic_currents.png" width="100%" alt="Simulation results of the Urbanczik-Senn plasticity model." /></a><br />
Third panel: dendritic currents ($I_{\text{in}}$, red, and $I_{\text{ex}}$, magenta).</p>

<p>The third panel shows the dendritic currents ($I_{\text{in}}$, red, and $I_{\text{ex}}$, magenta). The inhibitory current $I_{\text{in}}$ remains at zero, indicating that no inhibitory dendritic current is applied during the simulation. This is consistent with the simulation settings, where only the excitatory input was varied. The excitatory current $I_{\text{ex}}$ follows the sinusoidal pattern of the excitatory input, reflecting the <a href="/blog/2026-02-12-stdp/#synapse">synaptic input dynamics</a>: it oscillates with varying amplitude, showing periodic fluctuations over time. These fluctuations drive the dendritic potential, which in turn affects the somatic potential and the overall activity of the neuron.</p>

<p>Overall, the oscillatory nature of the excitatory current reflects the sinusoidal excitatory input pattern defined in the simulation. The reduction in frequency over time may indicate a form of adaptation or a change in the neuron’s response to the input. The absence of inhibitory current suggests that the dynamics observed are primarily driven by excitatory inputs and their interaction with the neuron’s intrinsic properties.</p>

<h3 id="firing-rates">Firing rates</h3>

<p class="align-caption"><a href="/assets/images/posts/nest/urbanczik_senn_plasticity_rates.png" title="Simulation results of the Urbanczik-Senn plasticity model."><img src="/assets/images/posts/nest/urbanczik_senn_plasticity_rates.png" width="100%" alt="Simulation results of the Urbanczik-Senn plasticity model." /></a><br />
Fourth panel: firing rates $\phi(V_s)$ (blue), $\phi(V_d)$ (light blue), $\phi(V_d^*)$ (blue dashed), $\phi(V_s) - \phi(V_d^*)$ (red), and the spikes (green dots).</p>

<p>The fourth panel shows the different firing rates:</p>

<ul>
  <li><strong>$\phi(V_s)$ (blue)</strong>: The somatic firing rate calculated from the somatic membrane potential $V_s$. It oscillates due to the <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> inputs and the intrinsic dynamics of the neuron.</li>
  <li><strong>$\phi(V_d)$ (light blue)</strong>: The dendritic firing rate calculated from the dendritic membrane potential $V_d$. It follows a similar oscillatory pattern to the somatic firing rate but with some differences due to the separate dynamics of the dendritic compartment.</li>
  <li><strong>$\phi(V_d^*)$ (blue dashed)</strong>: The predicted dendritic firing rate, calculated from the predicted dendritic potential $V_d^*$. It also oscillates and aims to match the actual dendritic firing rate.</li>
  <li><strong>$\phi(V_s) - \phi(V_d^*)$ (red)</strong>: The difference between the somatic firing rate and the predicted dendritic firing rate. This discrepancy drives the <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a> according to the Urbanczik-Senn learning rule. Oscillations and deviations of this line from zero highlight periods where the model adjusts the <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> to minimize this error.</li>
  <li><strong>Spikes</strong> (green dots): The green dots along the top axis indicate the times at which the neuron fired action potentials (spikes). These spikes occur when the somatic membrane potential crosses a certain threshold, causing the neuron to emit an action potential.</li>
</ul>

<p>Overall, this panel illustrates the dynamic interplay between the actual somatic and dendritic firing rates, the predicted dendritic firing rate, and the resulting discrepancies that drive <a href="/blog/2026-02-12-stdp/#synapse">synaptic adjustments</a>. The goal of the learning rule is to minimize these discrepancies over time, leading to an adaptive and predictive neural response.</p>
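<p>To make the role of this discrepancy concrete, here is a schematic sketch of how the Urbanczik-Senn rule turns it into a weight change; this is an illustration of the idea, not NEST’s internal implementation. <code class="language-plaintext highlighter-rouge">phi</code>, <code class="language-plaintext highlighter-rouge">h</code>, <code class="language-plaintext highlighter-rouge">V_s</code>, and <code class="language-plaintext highlighter-rouge">V_d_star</code> are taken from the code above, while <code class="language-plaintext highlighter-rouge">eta</code>, <code class="language-plaintext highlighter-rouge">PSP_j</code>, <code class="language-plaintext highlighter-rouge">w_j</code>, and <code class="language-plaintext highlighter-rouge">dt</code> are hypothetical names for the learning rate, the low-pass filtered presynaptic trace of synapse $j$, its weight, and the simulation resolution:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># schematic Urbanczik-Senn update for a single synapse j:
# the instantaneous error is the mismatch between the somatic rate
# and the rate predicted from the dendritic potential
error = phi(V_s, nrn_params) - phi(V_d_star, nrn_params)

# the error is gated by the rate derivative at the predicted potential
# and by the presynaptic trace, then integrated into the weight
dw_dt = eta * error * h(V_d_star, nrn_params) * PSP_j
w_j = w_j + dw_dt * dt  # simple Euler step
</code></pre></div></div>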

<h3 id="rate-derivative">Rate derivative</h3>

<p class="align-caption"><a href="/assets/images/posts/nest/urbanczik_senn_plasticity_rate_derivative.png" title="Simulation results of the Urbanczik-Senn plasticity model."><img src="/assets/images/posts/nest/urbanczik_senn_plasticity_rate_derivative.png" width="100%" alt="Simulation results of the Urbanczik-Senn plasticity model." /></a><br />
Fifth panel: rate derivative $h(V_d^*)$.</p>

<p>The fifth panel shows the rate derivative $h(V_d^*)$ of the rate function $\phi(V_d^*)$. Initially, $h(V_d^*)$ starts at a high value around 5, indicating a steep rate of change in the firing probability. As time progresses, the value of $h(V_d^*)$ decreases, fluctuating between 1 and 4. This fluctuation shows the dynamic adjustment of the rate derivative in response to changes in the dendritic prediction potential. The overall trend shows an increase in fluctuations, indicating that the system is responding to varying <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> inputs and adjusting the firing rate accordingly. The initial high value could be due to the neuron adapting rapidly to the initial conditions, after which it stabilizes and responds to the ongoing synaptic inputs.</p>

<p>This rate derivative function plays a crucial role in the learning rule, as it affects the update of <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> based on the dendritic prediction error. The fluctuations in $h(V_d^*)$  indicate active learning and adaptation processes in the neuron model, as it continuously adjusts to match the predicted somatic activity with the actual somatic firing rate.</p>

<h3 id="synaptic-weights">Synaptic weights</h3>

<p class="align-caption"><a href="/assets/images/posts/nest/urbanczik_senn_plasticity_weight_adaption.png" title="Evolutions of synaptic weights of Urbanczik synapses."><img src="/assets/images/posts/nest/urbanczik_senn_plasticity_weight_adaption.png" width="80%" alt="Evolutions of synaptic weights of Urbanczik synapses." /></a><br />
Evolutions of synaptic weights of Urbanczik synapses.</p>

<p>This plot shows the evolution of <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> for several <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> over time. Each colored line represents the weight of a synapse from a particular parrot neuron to the Urbanczik neuron. The weights exhibit different patterns of change, with some increasing significantly while others decrease or stabilize. The diversity in weight dynamics demonstrates the model’s capacity to differentiate between inputs based on their timing and interaction with the dendritic potential, leading to a self-organized pattern of synaptic strengths.</p>

<h3 id="overall-interpretation">Overall interpretation</h3>
<p>The plots collectively illustrate the dynamics of the Urbanczik-Senn plasticity model. The membrane potentials, conductances, and currents reflect the input-driven activity of the neuron. The firing rates and their discrepancies indicate the model’s predictive coding capabilities, where the dendritic compartment predicts somatic activity. The <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> evolve based on the prediction errors, demonstrating the learning rule’s impact on <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a>.</p>

<h2 id="conclusion">Conclusion</h2>
<p>The Urbanczik-Senn plasticity model offers a comprehensive framework for understanding <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a> in <a href="/blog/2026-02-04-neural_dynamics/">neural networks</a>. By integrating dendritic prediction errors, the model unifies different learning paradigms under a single rule, enabling supervised, unsupervised, and reinforcement learning. The model’s predictive coding mechanism, robust learning dynamics, and versatility thus make it a powerful tool for simulating neural processing tasks and understanding the underlying mechanisms of <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> plasticity.</p>

<p>The complete code used in this blog post is available in this <a href="https://github.com/FabrizioMusacchio/neural_dynamics">GitHub repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (<code class="language-plaintext highlighter-rouge">urbanczik_senn_plasticity.py</code>). Feel free to modify and expand upon it, and share your insights.</p>

<h2 id="references">References</h2>
<ul>
  <li>Robert Urbanczik, Walter Senn, <em>Learning by the dendritic prediction of somatic spiking</em>, 2014, Neuron, Vol. 81, Issue 3, pages 521-528, doi: <a href="https://doi.org/10.1016/j.neuron.2013.11.030">10.1016/j.neuron.2013.11.030</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/auto_examples/urbanczik_synapse_example.html">NEST’s tutorial “Weight adaptation according to the Urbanczik-Senn plasticity”</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/models/pp_cond_exp_mc_urbanczik.html">NEST’s <code class="language-plaintext highlighter-rouge">pp_cond_exp_mc_urbanczik</code> model description</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/models/urbanczik_synapse.html">NEST’s <code class="language-plaintext highlighter-rouge">urbanczik_synapse</code> synapse description</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
</ul>

]]></content><author><name> </name></author><category term="Python" /><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[The Urbanczik-Senn plasticity model proposes a learning rule for dendritic synapses in a simplified compartmental neuron model. This rule extends traditional spike-timing-dependent plasticity (STDP) by incorporating the local dendritic potential as a crucial third factor, alongside pre- and postsynaptic spike timings. In this post, we briefly introduce the Urbanczik-Senn plasticity model and discuss its implications for neural computation and learning.]]></summary></entry><entry><title type="html">Implementing a minimal spiking neural network for MNIST pattern recognition using nervos</title><link href="/blog/2026-02-16-nervos_stdp_snn_simulation_on_mnist/" rel="alternate" type="text/html" title="Implementing a minimal spiking neural network for MNIST pattern recognition using nervos" /><published>2026-02-16T21:05:45+01:00</published><updated>2026-02-16T21:05:45+01:00</updated><id>/blog/nervos_stdp_snn_simulation_on_mnist</id><content type="html" xml:base="/blog/2026-02-16-nervos_stdp_snn_simulation_on_mnist/"><![CDATA[<p>I recently came across <a href="https://github.com/jsmaskeen/nervos"><em>nervos</em></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, an open source spiking neural network framework developed by <a href="http://jsmaskeen.github.io">Jaskirat Singh Maskeen</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> and <a href="https://iitgn.ac.in/faculty/ee/fac-sandip">Sandip Lashkare</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>. <em>nervos</em> aims to provide a unified and hardware aware platform to evaluate different <a href="/blog/2026-02-12-stdp/">spike timing dependent plasticity learning rules</a> together with different synapse models, ranging from idealized floating point synapses to finite state nonlinear memristor based models. The framework is described in <a href="https://arxiv.org/abs/2506.19377"><em>A Unified Platform to Evaluate STDP Learning Rule and Synapse Model using Pattern Recognition in a Spiking Neural Network</em></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</p>

<p class="align-caption"><a href="/assets/images/posts/nervos/nervos_weight_evolution_thumb.jpg" title="Weight Evolution of an exemplary neuron simulated using the nervos framework."><img src="/assets/images/posts/nervos/nervos_weight_evolution_thumb.jpg" width="100%" alt="Weight Evolution of an exemplary neuron simulated using the nervos framework." /></a><br />
In this post, we use <a href="https://github.com/jsmaskeen/nervos"><em>nervos</em></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (<a href="https://arxiv.org/abs/2506.19377">Maskeen &amp; Lashkare, 2025</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>) to implement a minimal two layer spiking neural network for pattern recognition on the MNIST dataset. We analyze how the network learns to classify digits through STDP and how the synaptic weights evolve during training. The figure shows the evolution of synaptic weights for a single output neuron over the course of training, illustrating how the network develops selectivity for certain input patterns corresponding to specific digit classes. We will further analyze the internal dynamics of the network, the learned receptive fields, and the classification performance on the test set. The code for this example is available in the GitHub repository mentioned at the end of this post.</p>

<p>I was curious and applied <em>nervos</em> to a classical pattern recognition task using MNIST digits. I basically replicated their original tutorial “<a href="https://nervos.readthedocs.io/en/latest/notebooks/mnist.html">MNIST Example</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>” and then extended it with additional analyses. The goal was to understand in detail how a minimal two layer <a href="/blog/2026-02-04-neural_dynamics/">spiking network</a> with purely local <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> can self organize into digit selective neurons, how classification emerges, and what exactly is stored internally for later analysis. In this post, I summarize the mathematical formulation of the network, the learning rules, and the results I obtained.</p>

<h2 id="mathematics-of-nervos">Mathematics of <em>nervos</em></h2>
<p>Let’s first have a look at the mathematical formulation of the network architecture. In the following, we will walk through the main components of the model, including the network architecture, input encoding, neuron and <a href="/blog/2026-02-12-stdp/#synapse">synapse</a> models, and the <a href="/blog/2026-02-12-stdp/">STDP learning rule</a>. This is the best way to understand what the model does and how it learns, before we go ahead to the actual implementation and results.</p>

<h3 id="network-architecture">Network architecture</h3>
<p>The architecture implemented in <em>nervos</em> is deliberately minimal. It consists of a two layer <a href="/blog/2026-02-04-neural_dynamics/">spiking neural network</a> without hidden layers:</p>

<ul>
  <li>Input layer: we will use 784 neurons corresponding to the 28×28 pixels of an <a href="https://en.wikipedia.org/wiki/MNIST_database">MNIST image</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</li>
  <li>Output layer: typically 60 or 80 neurons depending on the experiment (we will use 80 in our example).</li>
</ul>

<p>Every input neuron is <a href="/blog/2024-06-25-nest_connection_concepts/">connected</a> to every output neuron via an excitatory synapse. Thus, the <a href="/blog/2026-02-12-stdp/">weight matrix</a> is</p>

\[W \in \mathbb{R}^{N_{\text{out}} \times 784},\]

<p>with entries $w_{ij}$ representing the synaptic strength from input neuron $j$ to output neuron $i$, where $N_{\text{out}}$ is the number of output neurons.</p>

<p>Competition in the output layer is implemented algorithmically. At each time step, the neuron with the highest membrane potential is identified. If this neuron crosses its adaptive threshold, it emits a spike and all other neurons are inhibited by resetting their potentials to an inhibitory level and placing them into a refractory state. There is no explicit lateral inhibitory connectivity matrix; instead, Winner Takes All dynamics are enforced by this global inhibition rule.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/AE_MNIST_samples.png" title="MNIST dataset."><img src="/assets/images/posts/nest/AE_MNIST_samples.png" width="100%" alt="MNIST dataset." /></a>
Samples from the <a href="https://en.wikipedia.org/wiki/MNIST_database">MNIST dataset</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, which is used as input for the <em>nervos</em> SNN in our example below. Each image is 28x28 pixels, which corresponds to 784 input neurons in the network. The pixel intensities are converted to firing frequencies to generate spike trains for the input layer. The network learns to classify these images based on the spiking activity of the output layer neurons.</p>

<h3 id="input-encoding">Input encoding</h3>
<p>Each MNIST image is flattened into a vector $p \in [0,1]^{784}$. Pixel intensities are converted to firing frequencies according to</p>

\[f = p \cdot (f_{\max} - f_{\min}) + f_{\min},\]

<p>with typical values $f_{\max} = 70$ Hz and $f_{\min} = 5$ Hz.</p>

<p>For a presentation duration of $T$ discrete simulation steps, each input neuron generates a binary spike train</p>

\[M \in \{0,1\}^{784 \times T},\]

<p>where $M_{ij} = 1$ if neuron $i$ fired at time step $j$.</p>

<p>Thus, the network receives a temporally structured, rate encoded spike pattern.</p>
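<p>As a minimal sketch of this encoding (my own illustration, not the internal nervos code, and assuming that one discrete step corresponds to 1 ms so that a frequency $f$ in Hz maps to a per-step spike probability of $f/1000$):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def encode_image(p, T=100, f_min=5.0, f_max=70.0, dt_ms=1.0):
    """Convert a flattened image p (values in [0, 1]) into a binary
    spike train M of shape (len(p), T) via Bernoulli rate coding."""
    f = p * (f_max - f_min) + f_min       # firing frequency in Hz
    p_spike = f * dt_ms / 1000.0          # per-step spike probability
    M = (np.random.rand(p.size, T) &lt; p_spike[:, None]).astype(np.uint8)
    return M
</code></pre></div></div>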

<h3 id="neuron-model-discrete-integrate-and-fire-dynamics-with-adaptive-threshold">Neuron model: discrete integrate and fire dynamics with adaptive threshold</h3>
<p>In practice, <em>nervos</em> does not implement a continuous time <a href="/blog/2023-07-03-integrate_and_fire_model/">leaky integrate and fire</a> differential equation with an explicit leak term. Instead, the neuron state is updated in discrete time steps. For each output neuron $i$ at time step $t$, the membrane potential is incremented by the weighted sum of incoming spikes</p>

\[V_i(t) \leftarrow V_i(t) + \sum_{j=1}^{N_{\text{in}}} w_{ij}\,x_j(t),\]

<p>where $x_j(t)\in\{0,1\}$ is the presynaptic spike of input neuron $j$ at time step $t$.</p>

<p>After this synaptic integration step, <em>nervos</em> applies a discrete relaxation term whenever the neuron is above its resting potential $V_{\text{rest}}$:</p>

\[\text{if } V_i(t) &gt; V_{\text{rest}}:\; V_i(t)\leftarrow V_i(t) - \Delta_V,\]

<p>with $\Delta_V=\texttt{spike_drop_rate}$. This term plays a stabilizing role similar to a leak, but it is not an exponential decay and it is not derived from a biophysical conductance model.</p>

<p>Each neuron also has a refractory mechanism implemented as a hard time step lock. After a neuron fires or is inhibited at time step $t_0$, it is locked at rest until</p>

\[t_{\text{rest},i} = t_0 + \tau_{\text{ref}},\]

<p>where $\tau_{\text{ref}}=\texttt{refractory_time}$. While $t &lt; t_{\text{rest},i}$, the neuron does not integrate synaptic input.</p>

<p>The firing threshold is adaptive and implemented as an explicit state variable $\theta_i(t)$, initialized at a baseline value $\theta_0=\texttt{spike_threshold}$. Whenever a neuron fires, its adaptive threshold is increased additively</p>

\[\theta_i(t^+) \leftarrow \theta_i(t) + 1.\]

<p>Additionally, when a neuron is above the resting potential, the adaptive threshold relaxes linearly toward the baseline by a fixed amount per time step</p>

\[\text{if } V_i(t) &gt; V_{\text{rest}} \text{ and } \theta_i(t) &gt; \theta_0:\; \theta_i(t)\leftarrow \theta_i(t) - \Delta_\theta,\]

<p>with $\Delta_\theta=\texttt{threshold_drop_rate}$. Thus, the adaptive threshold dynamics are discrete and piecewise linear rather than exponential with a time constant.</p>

<p>Finally, note that inhibition is implemented as a hard reset of the membrane potential to an inhibitory potential $V_{\text{inh}}=\texttt{inhibitory_potential}$ together with the same refractory lockout. This is a compressed, algorithmic representation of inhibition rather than an explicit inhibitory synapse conductance model.</p>
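<p>Putting these pieces together, here is a compact sketch of one simulation step for the output layer. This is my paraphrase of the dynamics described above, using the parameter names from the text and hypothetical state variables; it is not the actual nervos source:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def step(V, theta, t_rest, W, x, t, prm):
    """One discrete update of all output neurons; x is the binary input
    spike vector of this time step. Returns the winner index or None."""
    active = t &gt;= t_rest                    # neurons out of refractoriness
    V[active] += W[active] @ x              # synaptic integration

    # discrete relaxation toward rest (plays the role of a leak):
    above = V &gt; prm["resting_potential"]
    V[above] -= prm["spike_drop_rate"]
    # adaptive thresholds relax linearly toward the baseline:
    relax = above &amp; (theta &gt; prm["spike_threshold"])
    theta[relax] -= prm["threshold_drop_rate"]

    k = int(np.argmax(V))                   # candidate winner
    if V[k] &gt;= theta[k]:
        theta[k] += 1.0                     # additive threshold increase
        V[:] = prm["inhibitory_potential"]  # global inhibition (WTA)
        V[k] = prm["resting_potential"]     # winner is reset to rest
        t_rest[:] = t + prm["refractory_time"]
        return k
    return None
</code></pre></div></div>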

<h3 id="synapse-models">Synapse models</h3>
<p><em>nervos</em> allows different synapse models, all normalized to</p>

\[w \in [w_{\min}, w_{\max}] = [10^{-3}, 1].\]

<p>Three major models are implemented:</p>

<ol>
  <li>Ideal synapse: Continuous weight updates without quantization.</li>
  <li>Linear finite state synapse: Uniform weight steps but with a finite number of discrete states.</li>
  <li>
    <p>Nonlinear memristor synapse: Based on experimental Pr$_{0.7}$Ca$_{0.3}$MnO$_3$ RRAM data. The $i$th weight state of an $n$ state synapse is modeled as</p>

\[\begin{aligned}
w_i =&amp; \quad w_{\max} - \frac{w_{\max} - w_{\min}}{1 - e^{-\nu}} \; \cdot \\
&amp; \cdot
\left[1 - \exp\left(-\nu\left(1 - \frac{i}{n}\right)\right)\right]
\end{aligned}\]

    <p>The parameter $\nu$ controls the curvature of the nonlinearity.</p>
  </li>
</ol>

<p>Initially, all synapses are set to $w = 1$ to facilitate early learning.</p>
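<p>Translated directly from the formula above, the full state ladder of such a synapse can be sketched as follows ($\nu$ and the number of states $n$ are free parameters):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def memristor_states(n=30, nu=3.0, w_min=1e-3, w_max=1.0):
    """Weight values of an n-state nonlinear memristive synapse;
    state 0 equals w_min and state n equals w_max."""
    i = np.arange(n + 1)
    scale = (w_max - w_min) / (1.0 - np.exp(-nu))
    return w_max - scale * (1.0 - np.exp(-nu * (1.0 - i / n)))
</code></pre></div></div>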

<h3 id="stdp-learning">STDP learning</h3>
<p>Weight changes are driven by an exponential <a href="/blog/2026-02-12-stdp/">STDP</a> kernel $F(\Delta t)$ with</p>

\[\Delta t = t_{\text{post}} - t_{\text{pre}},\]

<p>and</p>

\[F(\Delta t)=
\begin{cases}
A_{\text{up}} \exp(-\Delta t/\tau_{\text{up}}), &amp; \Delta t \ge 0,\\
A_{\text{down}} \exp(\Delta t/\tau_{\text{down}}), &amp; \Delta t &lt; 0.
\end{cases}\]

<p>Note that <em>nervos</em> also supports alternative STDP kernels such as cosine, sinusoidal, and Gaussian depression-only variants. In the present analysis, however, we focus exclusively on the conventional exponential kernel defined above.</p>
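<p>The exponential kernel is straightforward to write down; a sketch with placeholder amplitudes and time constants:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def F(dt, A_up=0.8, A_down=-0.3, tau_up=10.0, tau_down=10.0):
    """Exponential STDP kernel; dt = t_post - t_pre in time steps.
    The amplitudes and time constants here are placeholders."""
    if dt &gt;= 0:
        return A_up * np.exp(-dt / tau_up)    # potentiation branch
    return A_down * np.exp(dt / tau_down)     # depression branch
</code></pre></div></div>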

<p>Synaptic updates are applied in a bounded, weight dependent manner. Let $w$ be the current weight and $w_{\min},w_{\max}$ the configured bounds. Define</p>

\[d(w) =
\begin{cases}
w - w_{\min}, &amp; F(\Delta t) &lt; 0,\\
w_{\max} - w, &amp; F(\Delta t) &gt; 0.
\end{cases}\]

<p>Then the update is</p>

\[w \leftarrow w + \eta\,F(\Delta t)\,\text{sign}(d(w))\,|d(w)|^{\gamma},\]

<p>with learning rate $\eta=\texttt{eta}$ and fixed exponent $\gamma=0.9$ in the current implementation. This makes potentiation and depression naturally saturate near the bounds.</p>
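<p>Combined with the kernel above, the saturating update can be sketched as follows (again an illustration of the equations, not the nervos source):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def stdp_update(w, dt, eta=0.01, w_min=1e-3, w_max=1.0, gamma=0.9):
    """Apply one bounded, weight dependent STDP update to weight w,
    using the kernel F from the sketch above."""
    dF = F(dt)
    # distance to the bound that the update moves toward:
    d = (w - w_min) if dF &lt; 0 else (w_max - w)
    w = w + eta * dF * np.sign(d) * np.abs(d) ** gamma
    return float(np.clip(w, w_min, w_max))  # clip for numerical safety
</code></pre></div></div>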

<p><a href="/blog/2026-02-12-stdp/">STDP</a> is applied only to synapses projecting onto a selected output neuron. In the default Winner Takes All mode (<code class="language-plaintext highlighter-rouge">self.wta=True</code>), synaptic updates are applied only for the winner neuron $k(t)$ at each time step where a spike event occurs. This is a strong form of competitive learning.</p>

<p>A further implementation detail is that if no presynaptic spike was observed in the configured past window for a given synapse at that postsynaptic event, <em>nervos</em> still applies a depression update with a randomly chosen negative $\Delta t$ value from a restricted range. This enforces ongoing weakening of synapses that are not consistently supported by correlated pre and post activity, thereby strengthening competition and sparsifying receptive fields.</p>

<h3 id="training-and-emergence-of-classification">Training and emergence of classification</h3>
<p>During training, each image is presented as an input spike train for $T=\texttt{training_duration}$ discrete time steps. The network is simulated forward, and STDP updates are applied online whenever spiking events occur.</p>

<p>A key point is how <em>nervos</em> constructs the neuron to label association. In the current implementation, the neuron label map is updated online by directly assigning the true label of the current training sample to the neuron that was the maximally excited neuron when the last spike event occurred during that presentation. Denoting this neuron index by $k$, the update is</p>

\[\texttt{neuron_label_map}[k] \leftarrow y,\]

<p>where $y$ is the true class label of the presented sample.</p>

<p>Thus, the label map is not computed via a separate majority vote over winners across the full training set. Instead, it is formed incrementally through repeated overwriting during training, and it stabilizes in practice because neurons that consistently win for a given class will repeatedly reassign themselves to that class.</p>

<p>At test time, classification proceeds without further plasticity. For a test spike train, the network computes winner event counts $c_i$ over the presentation window and returns</p>

\[\hat{y} = \texttt{neuron_label_map}\left[\arg\max_i c_i\right].\]

<p>This readout is algorithmic and relies on the externally stored mapping from neurons to labels. It is therefore best interpreted as a minimal decision rule that extracts class predictions from the emergent winner selective dynamics of the trained output layer.</p>
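<p>In code, this readout amounts to very little; a toy sketch of both the train-time overwrite and the test-time decision (all values below are made up for illustration):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

n_out = 80
neuron_label_map = np.full(n_out, -1)       # -1 marks unassigned neurons

# train time: the maximally excited neuron k at the last spike event of
# a presentation is overwritten with the sample's true label y
k, y = 17, 3                                # toy values
neuron_label_map[k] = y

# test time: count winner events per neuron over the presentation
# window and look up the stored label of the most active neuron
winner_events = np.array([17, 17, 42, 17])  # toy winner indices
c = np.bincount(winner_events, minlength=n_out)
y_hat = neuron_label_map[int(np.argmax(c))] # yields 3 here
</code></pre></div></div>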

<h3 id="synapse-bounds-and-initialization">Synapse bounds and initialization</h3>
<p>All synaptic weights are bounded by the configured limits $w_{\min}=\texttt{min_weight}$ and $w_{\max}=\texttt{max_weight}$. In the example configuration used below, these were set explicitly in the parameter dictionary, and the STDP update rule implements saturating weight dependence relative to these bounds.</p>

<p>In the current <em>nervos</em> implementation, the input to output synapse matrix is initialized as</p>

\[W(0)=\mathbf{1},\]

<p>that is, all weights start at $w=1$. This choice accelerates early competition and receptive field formation, but it also means that early epochs can show very large weight norms and broad activation patterns before synaptic competition and bounded STDP drive weights toward more selective configurations.</p>

<h3 id="what-is-stored-internally">What is stored internally</h3>
<p>For detailed analysis, <em>nervos</em> stores (optionally):</p>

<ul>
  <li>full weight matrix snapshots $W$ after each sample or epoch.</li>
  <li>spike raster matrices for each layer:<br />
$M^{(\text{layer})}_{\text{epoch}, \text{sample}}$</li>
  <li>learned neuron to label mapping</li>
  <li>synapse state trajectories for finite state models</li>
</ul>

<p>From $W$, we can, e.g., compute receptive fields by reshaping</p>

\[w_i \in \mathbb{R}^{784}\]

<p>into $28 \times 28$ images, directly visualizing digit templates emerging in single neurons.</p>
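<p>For instance, a minimal sketch of plotting such a receptive field from a stored weight snapshot (assuming <code class="language-plaintext highlighter-rouge">W</code> has shape $(N_{\text{out}}, 784)$; the random matrix below is just a stand-in):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import matplotlib.pyplot as plt

W = np.random.rand(80, 784)   # stand-in; use a stored snapshot here
i = 0                         # output neuron to inspect
rf = W[i].reshape(28, 28)     # synaptic vector as 28x28 receptive field
plt.imshow(rf, cmap="gray")
plt.title(f"receptive field of output neuron {i}")
plt.colorbar(label="weight")
plt.show()
</code></pre></div></div>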

<p>From spike rasters, we can compute:</p>

<ul>
  <li>winner indices</li>
  <li>spike statistics</li>
  <li>L1 and L2 norms of synaptic vectors</li>
  <li>weight evolution curves</li>
</ul>

<p>Thus, the framework allows a full dynamical and structural analysis of learning.</p>
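<p>The weight based statistics from the list above reduce to one-liners on a snapshot; a sketch, again assuming <code class="language-plaintext highlighter-rouge">W</code> is a stored weight matrix of shape $(N_{\text{out}}, 784)$:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

W = np.random.rand(80, 784)          # stand-in for a stored snapshot
l1 = np.abs(W).sum(axis=1)           # per-neuron L1 norm
l2 = np.linalg.norm(W, axis=1)       # per-neuron L2 norm
</code></pre></div></div>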

<h2 id="what-nervos-snn-does-and-does-not-achieve">What nervos’ SNN does and does not achieve</h2>
<p>While <em>nervos</em> is in my view a powerful tool to study <a href="/blog/2026-02-12-stdp/">STDP</a> and synapse models, it is important to understand its strengths and limitations in the context of <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a> and neuromorphic engineering.</p>

<h3 id="strengths">Strengths</h3>
<p>The main strengths of the <em>nervos</em> SNN are:</p>

<ul>
  <li>fully local learning rule without global error backpropagation</li>
  <li>hardware aware synapse models including memristor nonlinearity</li>
  <li>very small architecture</li>
  <li>good performance under small training sets</li>
  <li>transparent internal dynamics</li>
</ul>

<p>In small five class MNIST tasks, the authors report over 90% accuracy with conventional STDP and ideal synapses. When extended to all ten classes, accuracy drops to around 76%, which is still impressive for such a minimal architecture and purely local learning rule.</p>

<h3 id="limitations">Limitations</h3>
<p>The main limitations of the <em>nervos</em> SNN are:</p>

<ul>
  <li>Two layer architecture only.</li>
  <li>Rate based input encoding, not true temporal coding.</li>
  <li>No deep hierarchical feature extraction.</li>
  <li>Classification relies on post hoc label assignment: an external label map is built during training by repeatedly assigning the current sample label to the winning neuron, and is then used at test time to map the $\arg\max$ spike count neuron to a class label.</li>
  <li>Accuracy drops significantly when synapse states are strongly quantized.</li>
</ul>

<p>Compared to fully biologically plausible cortical microcircuits, the network is extremely simplified:</p>

<ul>
  <li>no dendritic compartmentalization,</li>
  <li>no recurrent excitatory loops,</li>
  <li>no neuromodulation, and</li>
  <li>no reward signals.</li>
</ul>

<p>Nevertheless, it provides a clean minimal platform to study <a href="/blog/2026-02-12-stdp/">STDP</a> and hardware constraints.</p>

<h2 id="python-example-pattern-recognition-on-mnist">Python example: Pattern Recognition on MNIST</h2>
<p>Before we begin and for reproducibility, here is the environment setup that I have used for this example:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>conda create <span class="nt">-n</span> nervos <span class="nv">python</span><span class="o">=</span>3.12 mamba <span class="nt">-y</span>
conda activate nervos
mamba <span class="nb">install</span> <span class="nt">-y</span> numpy matplotlib ipykernel requests
pip <span class="nb">install </span>nervos
</code></pre></div></div>

<p>I used <em>nervos</em> version 0.0.5, which is the latest version at the time of writing. The code is structured in a way that should be compatible with future versions, but some adjustments may be needed if the API changes significantly.</p>

<p>So, let’s start with the imports and some global settings for plotting:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="n">nervos</span> <span class="k">as</span> <span class="n">nv</span>

<span class="c1"># set global properties for all plots:
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">.</span><span class="nf">update</span><span class="p">({</span><span class="sh">'</span><span class="s">font.size</span><span class="sh">'</span><span class="p">:</span> <span class="mi">12</span><span class="p">})</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.top</span><span class="sh">"</span><span class="p">]</span>    <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.bottom</span><span class="sh">"</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.left</span><span class="sh">"</span><span class="p">]</span>   <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.right</span><span class="sh">"</span><span class="p">]</span>  <span class="o">=</span> <span class="bp">False</span>
</code></pre></div></div>

<h3 id="parameter-setup">Parameter setup</h3>
<p>First, we define the parameters for the simulation. We will use 500 images of the MNIST training set for training and 150 images of the test set for testing. The training duration is set to 100 discrete time steps, which is sufficient for the network to process the input and generate spikes. We also set various parameters related to the neuron and synapse models, as well as the learning rates for STDP. We can also choose the number of classes we want to train on, e.g., 5 (MNIST subset) or 10 (full MNIST set). In this example, we will use 6 classes (digits 0 to 5) to keep the training time manageable while still demonstrating the learning capabilities of the network.</p>

<p>All parameters are stored in a <code class="language-plaintext highlighter-rouge">Parameters</code> object from the <em>nervos</em> library, which allows for easy access and modification throughout the code:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">RESULTS_PATH</span> <span class="o">=</span> <span class="sh">"</span><span class="s">figures</span><span class="sh">"</span>
<span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

<span class="c1"># choose classes here:
</span><span class="n">CLASSES</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="nf">range</span><span class="p">(</span><span class="mi">6</span><span class="p">))</span>   <span class="c1"># choose any value between 1 and 10
</span><span class="n">identifier_name</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="nf">len</span><span class="p">(</span><span class="n">CLASSES</span><span class="p">)</span><span class="si">}</span><span class="s">classmnist</span><span class="sh">"</span>

<span class="n">p</span> <span class="o">=</span> <span class="n">nv</span><span class="p">.</span><span class="nc">Parameters</span><span class="p">()</span>

<span class="n">parameters_dict</span> <span class="o">=</span> <span class="p">{</span>
    <span class="sh">"</span><span class="s">training_images_amount</span><span class="sh">"</span><span class="p">:</span> <span class="mi">500</span><span class="p">,</span> <span class="c1"># nervos is very memory hungry especially when m.get_spikeplots = True and m.get_weight_evolution = True (!); either use smaller numbers here, or set those to False below, or run it on a machine with high RAM
</span>    <span class="sh">"</span><span class="s">testing_images_amount</span><span class="sh">"</span><span class="p">:</span> <span class="mi">150</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">training_duration</span><span class="sh">"</span><span class="p">:</span> <span class="mi">100</span><span class="p">,</span> <span class="c1"># discrete simulation time units
</span>    <span class="sh">"</span><span class="s">past_window</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mi">10</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">epochs</span><span class="sh">"</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span>  <span class="c1"># note: after each epoch, the training set is not reshuffled, so the same images are presented in the same order.
</span>    <span class="sh">"</span><span class="s">image_size</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">resting_potential</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mi">70</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">input_layer_size</span><span class="sh">"</span><span class="p">:</span> <span class="mi">784</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">output_layer_size</span><span class="sh">"</span><span class="p">:</span> <span class="mi">80</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">inhibitory_potential</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mi">100</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">spike_threshold</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mi">55</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">reset_potential</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mi">90</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">spike_drop_rate</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.8</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">threshold_drop_rate</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.4</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">min_weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1e-05</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">max_weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">A_up</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.8</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">A_down</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">0.3</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">tau_up</span><span class="sh">"</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">tau_down</span><span class="sh">"</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">eta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.03</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">min_frequency</span><span class="sh">"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">max_frequency</span><span class="sh">"</span><span class="p">:</span> <span class="mi">50</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">refractory_time</span><span class="sh">"</span><span class="p">:</span> <span class="mi">15</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">tau_m</span><span class="sh">"</span><span class="p">:</span> <span class="mi">10</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">conductance</span><span class="sh">"</span><span class="p">:</span> <span class="mi">10</span>
<span class="p">}</span>
<span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">parameters_dict</span><span class="p">.</span><span class="nf">items</span><span class="p">():</span>
    <span class="nf">setattr</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span>
</code></pre></div></div>

<p>Note that <em>nervos</em> is very memory-hungry, especially when <code class="language-plaintext highlighter-rouge">m.get_spikeplots = True</code> and <code class="language-plaintext highlighter-rouge">m.get_weight_evolution = True</code> (see below). Either use small numbers of training and testing images (like ~500 and ~150, respectively), set those options to <code class="language-plaintext highlighter-rouge">False</code>, or run the code on a machine with enough RAM.</p>
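
<p>To get a feeling for why this matters, here is a back-of-the-envelope estimate of the raw size of the input spike trains alone: each image is encoded as a 784 × 100 binary matrix (pixels × time steps). Note that the dtype is my assumption and that this ignores <em>nervos’</em> internal buffers for spike plots and weight snapshots, so treat it as a lower bound:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>n_train, n_test = 500, 150   # images used above
n_pixels, T = 784, 100       # input neurons x simulation time steps

# input spike trains only; the actual storage dtype inside nervos may differ:
entries = (n_train + n_test) * n_pixels * T
print(f"entries:    {entries:,}")                 # 50,960,000
print(f"as uint8:   {entries / 1e6:.1f} MB")      # ~51.0 MB
print(f"as float64: {entries * 8 / 1e6:.1f} MB")  # ~407.7 MB
</code></pre></div></div>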

<h3 id="helper-functions-and-class-definitions">Helper functions and class definitions</h3>
<p>Next, we set up the helper functions we need for the implementation of the <code class="language-plaintext highlighter-rouge">MNIST_SNN</code> class and for the analysis of the results.</p>

<p>We begin with the <code class="language-plaintext highlighter-rouge">MNIST_SNN</code> class, a wrapper around the <em>nervos</em> SNN that handles data loading, training, and prediction. We also add a method to plot random samples from the dataset, which is useful for visualizing the input spike trains before training the model. Since the MNIST images are not stored as pixel values in the model but only as spike trains, the <code class="language-plaintext highlighter-rouge">plot_random_samples</code> method aggregates the spike trains over time to reconstruct the original images for visualization:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MNIST_SNN</span><span class="p">(</span><span class="n">nv</span><span class="p">.</span><span class="n">Module</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="n">self</span><span class="p">,</span> <span class="n">parameters</span><span class="p">,</span> <span class="n">identifier</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">train_size</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">test_size</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
        <span class="nf">super</span><span class="p">().</span><span class="nf">__init__</span><span class="p">(</span><span class="n">parameters</span><span class="p">,</span> <span class="n">identifier</span><span class="p">)</span>

        <span class="c1"># set default (5) if not provided
</span>        <span class="k">if</span> <span class="n">classes</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">classes</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="nf">range</span><span class="p">(</span><span class="mi">5</span><span class="p">))</span>
        <span class="n">self</span><span class="p">.</span><span class="n">classes</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="n">classes</span><span class="p">)</span>

        <span class="n">self</span><span class="p">.</span><span class="n">dataloader</span> <span class="o">=</span> <span class="n">nv</span><span class="p">.</span><span class="n">dataloader</span><span class="p">.</span><span class="nc">MNISTLoader</span><span class="p">(</span><span class="n">parameters</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="n">self</span><span class="p">.</span><span class="n">classes</span><span class="p">)</span>

        <span class="c1"># if you want the loader sizes to be controllable from outside:
</span>        <span class="k">if</span> <span class="n">train_size</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">train_size</span> <span class="o">=</span> <span class="nf">getattr</span><span class="p">(</span><span class="n">parameters</span><span class="p">,</span> <span class="sh">"</span><span class="s">training_images_amount</span><span class="sh">"</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">test_size</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">test_size</span> <span class="o">=</span> <span class="nf">getattr</span><span class="p">(</span><span class="n">parameters</span><span class="p">,</span> <span class="sh">"</span><span class="s">testing_images_amount</span><span class="sh">"</span><span class="p">,</span> <span class="mi">20</span><span class="p">)</span>

        <span class="n">self</span><span class="p">.</span><span class="n">X_train</span><span class="p">,</span> <span class="n">self</span><span class="p">.</span><span class="n">Y_train</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">dataloader</span><span class="p">.</span><span class="nf">dataloader</span><span class="p">(</span>
            <span class="n">preprocess</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">pca</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="nf">int</span><span class="p">(</span><span class="n">train_size</span><span class="p">),</span> <span class="n">seed</span><span class="o">=</span><span class="n">seed</span><span class="p">)</span>
        <span class="n">self</span><span class="p">.</span><span class="n">X_test</span><span class="p">,</span> <span class="n">self</span><span class="p">.</span><span class="n">Y_test</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">dataloader</span><span class="p">.</span><span class="nf">dataloader</span><span class="p">(</span>
            <span class="n">preprocess</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">train</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">pca</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="nf">int</span><span class="p">(</span><span class="n">test_size</span><span class="p">),</span> <span class="n">seed</span><span class="o">=</span><span class="n">seed</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="n">self</span><span class="p">,</span> <span class="n">un_processed_image</span><span class="p">,</span> <span class="n">model_location</span><span class="p">):</span>
        <span class="n">spike_train</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="n">dataloader</span><span class="p">.</span><span class="nf">img2spiketrain</span><span class="p">(</span><span class="n">un_processed_image</span><span class="p">))</span>
        <span class="n">synapses</span><span class="p">,</span> <span class="n">neuron_label_map</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="nf">load_model</span><span class="p">(</span><span class="n">model_location</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">self</span><span class="p">.</span><span class="nf">get_prediction</span><span class="p">(</span><span class="n">spike_train</span><span class="p">,</span> <span class="n">synapses</span><span class="p">,</span> <span class="n">neuron_label_map</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">plot_random_samples</span><span class="p">(</span><span class="n">self</span><span class="p">,</span> <span class="n">N</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">train</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">aggregate</span><span class="o">=</span><span class="sh">"</span><span class="s">sum</span><span class="sh">"</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">hot_r</span><span class="sh">"</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">)):</span>
        <span class="sh">"""</span><span class="s">
        Plot N random MNIST samples from train or test set.

        Parameters
        ----------
        N : int
            Number of samples to plot.
        train : bool
            If True: use training set, else test set.
        aggregate : str
            </span><span class="sh">"</span><span class="s">sum</span><span class="sh">"</span><span class="s"> or </span><span class="sh">"</span><span class="s">mean</span><span class="sh">"</span><span class="s"> over time to reconstruct image from spike train.
        seed : int or None
            Optional seed for reproducibility.
            
            
        Note:
        -----
        nervos</span><span class="sh">'</span><span class="s"> dataloader directly returns spike trains for the MNIST images, 
        which we can visualize here before training the model. This also means,
        the MNIST images are not stored as pixel values in the model, but only 
        as spike trains. We therefore need to aggregate the spike trains over 
        time to reconstruct the original image for visualization.
        </span><span class="sh">"""</span>

        <span class="k">if</span> <span class="n">seed</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>

        <span class="n">X</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">X_train</span> <span class="k">if</span> <span class="n">train</span> <span class="k">else</span> <span class="n">self</span><span class="p">.</span><span class="n">X_test</span>
        <span class="n">Y</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">Y_train</span> <span class="k">if</span> <span class="n">train</span> <span class="k">else</span> <span class="n">self</span><span class="p">.</span><span class="n">Y_test</span>

        <span class="n">indices</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">choice</span><span class="p">(</span><span class="nf">len</span><span class="p">(</span><span class="n">X</span><span class="p">),</span> <span class="n">size</span><span class="o">=</span><span class="nf">min</span><span class="p">(</span><span class="n">N</span><span class="p">,</span> <span class="nf">len</span><span class="p">(</span><span class="n">X</span><span class="p">)),</span> <span class="n">replace</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>

        <span class="n">cols</span> <span class="o">=</span> <span class="nf">min</span><span class="p">(</span><span class="n">N</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
        <span class="n">rows</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">ceil</span><span class="p">(</span><span class="n">N</span> <span class="o">/</span> <span class="n">cols</span><span class="p">))</span>

        <span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="n">figsize</span><span class="p">)</span>

        <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">idx</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">indices</span><span class="p">):</span>
            <span class="n">spike_train</span> <span class="o">=</span> <span class="n">X</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span>  <span class="c1"># shape: (784, T)
</span>
            <span class="k">if</span> <span class="n">aggregate</span> <span class="o">==</span> <span class="sh">"</span><span class="s">sum</span><span class="sh">"</span><span class="p">:</span>
                <span class="n">img_vec</span> <span class="o">=</span> <span class="n">spike_train</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
            <span class="k">elif</span> <span class="n">aggregate</span> <span class="o">==</span> <span class="sh">"</span><span class="s">mean</span><span class="sh">"</span><span class="p">:</span>
                <span class="n">img_vec</span> <span class="o">=</span> <span class="n">spike_train</span><span class="p">.</span><span class="nf">mean</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="k">raise</span> <span class="nc">ValueError</span><span class="p">(</span><span class="sh">"</span><span class="s">aggregate must be </span><span class="sh">'</span><span class="s">sum</span><span class="sh">'</span><span class="s"> or </span><span class="sh">'</span><span class="s">mean</span><span class="sh">'"</span><span class="p">)</span>

            <span class="n">img</span> <span class="o">=</span> <span class="n">img_vec</span><span class="p">.</span><span class="nf">reshape</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">)</span>

            <span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="n">rows</span><span class="p">,</span> <span class="n">cols</span><span class="p">,</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">imshow</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="sh">"</span><span class="s">nearest</span><span class="sh">"</span><span class="p">)</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">Label: </span><span class="si">{</span><span class="n">Y</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">axis</span><span class="p">(</span><span class="sh">"</span><span class="s">off</span><span class="sh">"</span><span class="p">)</span>

        <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
</code></pre></div></div>
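
<p>With the class in place, instantiating it and plotting a few input samples looks like this (the arguments are exactly those defined above; the seeds are only for a reproducible selection):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># instantiate the wrapper and visualize 10 random training samples:
m = MNIST_SNN(p, identifier=identifier_name, classes=CLASSES, seed=42)
m.plot_random_samples(N=10, train=True, aggregate="sum", seed=42)
plt.savefig(os.path.join(RESULTS_PATH, f"samples_{identifier_name}.png"), dpi=150)
</code></pre></div></div>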

<p>The next function is a direct replication of the original <code class="language-plaintext highlighter-rouge">visualize_synapse</code> function from the <em>nervos</em> tutorial, which visualizes the learned synaptic weights for each class. It aggregates the synaptic weights of all neurons assigned to the same class and reshapes them into 28×28 images, visualizing the receptive fields the network has learned for each digit class:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">visualize_synapse</span><span class="p">(</span><span class="n">synapses</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">hot_r</span><span class="sh">"</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">30</span><span class="p">),</span> <span class="n">ncols</span><span class="o">=</span><span class="mi">5</span><span class="p">):</span>
    <span class="n">kk</span> <span class="o">=</span> <span class="mi">28</span>
    <span class="n">labels</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">labels</span><span class="p">)</span>

    <span class="n">classes</span> <span class="o">=</span> <span class="p">{</span><span class="n">i</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="nf">zeros</span><span class="p">((</span><span class="n">kk</span><span class="p">,</span> <span class="n">kk</span><span class="p">))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">np</span><span class="p">.</span><span class="nf">unique</span><span class="p">(</span><span class="n">labels</span><span class="p">)}</span>
    <span class="k">for</span> <span class="n">idx</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="nf">len</span><span class="p">(</span><span class="n">synapses</span><span class="p">)):</span>
        <span class="n">classes</span><span class="p">[</span><span class="n">labels</span><span class="p">[</span><span class="n">idx</span><span class="p">]]</span> <span class="o">+=</span> <span class="n">synapses</span><span class="p">[</span><span class="n">idx</span><span class="p">].</span><span class="nf">reshape</span><span class="p">((</span><span class="n">kk</span><span class="p">,</span> <span class="n">kk</span><span class="p">))</span>

    <span class="n">class_keys</span> <span class="o">=</span> <span class="nf">sorted</span><span class="p">(</span><span class="n">classes</span><span class="p">.</span><span class="nf">keys</span><span class="p">())</span>
    <span class="n">n_classes</span> <span class="o">=</span> <span class="nf">len</span><span class="p">(</span><span class="n">class_keys</span><span class="p">)</span>
    <span class="n">ncols</span> <span class="o">=</span> <span class="nf">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nf">int</span><span class="p">(</span><span class="n">ncols</span><span class="p">))</span>
    <span class="n">nrows</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">ceil</span><span class="p">(</span><span class="n">n_classes</span> <span class="o">/</span> <span class="n">ncols</span><span class="p">))</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="n">figsize</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">k</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">class_keys</span><span class="p">,</span> <span class="n">start</span><span class="o">=</span><span class="mi">1</span><span class="p">):</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="n">nrows</span><span class="p">,</span> <span class="n">ncols</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">imshow</span><span class="p">(</span><span class="n">classes</span><span class="p">[</span><span class="n">k</span><span class="p">],</span> <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="sh">"</span><span class="s">nearest</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">k</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">axis</span><span class="p">(</span><span class="sh">"</span><span class="s">off</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
</code></pre></div></div>
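
<p>Since we have not trained the model at this point, we can exercise the function with synthetic stand-ins for the synaptic weights and the neuron-label assignments (random arrays, purely to check the plotting logic):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical stand-ins for trained weights and neuron labels:
rng = np.random.default_rng(0)
fake_synapses = rng.random((p.output_layer_size, p.input_layer_size))  # (80, 784)
fake_labels = rng.integers(0, len(CLASSES), size=p.output_layer_size)
visualize_synapse(fake_synapses, fake_labels, ncols=3, figsize=(10, 8))
</code></pre></div></div>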

<p>To evaluate the performance of the model, we need to compute the accuracy on the test set. For this, we again use a function provided in the <em>nervos</em> tutorial, called <code class="language-plaintext highlighter-rouge">accuracy</code>, which takes the trained model and the test classes as input, generates spike trains for the test images, and computes the predicted labels from the output layer activity. It then compares the predicted labels to the true labels to calculate the overall accuracy of the model:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">accuracy</span><span class="p">(</span><span class="n">m2</span><span class="p">,</span> <span class="n">classes</span><span class="p">,</span> <span class="n">parameters_dict</span><span class="p">):</span>
    <span class="n">loader</span> <span class="o">=</span> <span class="n">nv</span><span class="p">.</span><span class="n">dataloader</span><span class="p">.</span><span class="nc">MNISTLoader</span><span class="p">(</span><span class="n">m2</span><span class="p">.</span><span class="n">parameters</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="nf">list</span><span class="p">(</span><span class="n">classes</span><span class="p">))</span>
    <span class="n">spike_trains</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">loader</span><span class="p">.</span><span class="nf">dataloader</span><span class="p">(</span><span class="n">train</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">preprocess</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> 
      <span class="n">seed</span><span class="o">=</span><span class="mi">123</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">parameters_dict</span><span class="p">[</span><span class="sh">"</span><span class="s">testing_images_amount</span><span class="sh">"</span><span class="p">])</span>

    <span class="n">t</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="n">c</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="n">preds</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">Calculating Accuracy</span><span class="sh">"</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">st</span><span class="p">,</span> <span class="n">label</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">spike_trains</span><span class="p">,</span> <span class="n">labels</span><span class="p">):</span>
        <span class="n">pred</span> <span class="o">=</span> <span class="n">m2</span><span class="p">.</span><span class="nf">get_prediction</span><span class="p">(</span><span class="n">st</span><span class="p">)</span>
        <span class="n">preds</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">pred</span><span class="p">)</span>
        <span class="n">c</span> <span class="o">+=</span> <span class="nf">int</span><span class="p">(</span><span class="n">pred</span> <span class="o">==</span> <span class="n">label</span><span class="p">)</span>
        <span class="n">t</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="nf">print</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="se">\r</span><span class="s">Tested </span><span class="si">{</span><span class="n">t</span><span class="si">}</span><span class="s"> images</span><span class="sh">"</span><span class="p">,</span> <span class="n">end</span><span class="o">=</span><span class="sh">""</span><span class="p">)</span>
    <span class="nf">print</span><span class="p">()</span>
    <span class="nf">print</span><span class="p">(</span><span class="n">c</span> <span class="o">/</span> <span class="n">t</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">labels</span><span class="p">,</span> <span class="n">preds</span>
</code></pre></div></div>
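
<p>The call itself is then a one-liner; as a usage sketch (not runnable yet), with <code class="language-plaintext highlighter-rouge">m</code> being a trained <code class="language-plaintext highlighter-rouge">MNIST_SNN</code> instance:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># usage sketch; m must be a trained MNIST_SNN instance:
labels, preds = accuracy(m, CLASSES, parameters_dict)
</code></pre></div></div>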

<p>For a more detailed analysis of the model’s performance, we can compute the confusion matrix, which shows how often each true class was predicted as each possible class. The function <code class="language-plaintext highlighter-rouge">confusion_matrix_np</code> below computes the confusion matrix, which <code class="language-plaintext highlighter-rouge">plot_confusion_matrix</code> then visualizes. The matrix can be normalized to show row-wise fractions instead of raw counts, which helps with interpretation:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">confusion_matrix_np</span><span class="p">(</span><span class="n">y_true</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">,</span> <span class="n">labels</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    Calculate the confusion matrix.
    Rows: true labels
    Cols: predicted labels
    </span><span class="sh">"""</span>
    <span class="n">y_true</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">y_true</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>
    <span class="n">y_pred</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">y_pred</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>
    <span class="n">labels</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">labels</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>

    <span class="n">k</span> <span class="o">=</span> <span class="n">labels</span><span class="p">.</span><span class="n">size</span>
    <span class="n">idx</span> <span class="o">=</span> <span class="p">{</span><span class="n">lab</span><span class="p">:</span> <span class="n">i</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">lab</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">labels</span><span class="p">)}</span>
    <span class="n">C</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">zeros</span><span class="p">((</span><span class="n">k</span><span class="p">,</span> <span class="n">k</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>

    <span class="k">for</span> <span class="n">t</span><span class="p">,</span> <span class="n">p</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">y_true</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">):</span>
        <span class="nf">if </span><span class="p">(</span><span class="n">t</span> <span class="ow">in</span> <span class="n">idx</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="n">p</span> <span class="ow">in</span> <span class="n">idx</span><span class="p">):</span>
            <span class="n">C</span><span class="p">[</span><span class="n">idx</span><span class="p">[</span><span class="n">t</span><span class="p">],</span> <span class="n">idx</span><span class="p">[</span><span class="n">p</span><span class="p">]]</span> <span class="o">+=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">C</span>

<span class="k">def</span> <span class="nf">plot_confusion_matrix</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">normalize</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">Greys</span><span class="sh">"</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    Plot confusion matrix. If normalize=True: row-normalize 
    (per true label).
    </span><span class="sh">"""</span>
    <span class="n">C</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">C</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">normalize</span><span class="p">:</span>
        <span class="n">row_sums</span> <span class="o">=</span> <span class="n">C</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">keepdims</span><span class="o">=</span><span class="bp">True</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="nb">float</span><span class="p">)</span>
        <span class="n">row_sums</span><span class="p">[</span><span class="n">row_sums</span> <span class="o">==</span> <span class="mf">0.0</span><span class="p">]</span> <span class="o">=</span> <span class="mf">1.0</span>
        <span class="n">M</span> <span class="o">=</span> <span class="n">C</span> <span class="o">/</span> <span class="n">row_sums</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">M</span> <span class="o">=</span> <span class="n">C</span>

    <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
    <span class="n">im</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="nf">imshow</span><span class="p">(</span><span class="n">M</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="sh">"</span><span class="s">nearest</span><span class="sh">"</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">)</span>
    <span class="n">cbar</span> <span class="o">=</span> <span class="n">fig</span><span class="p">.</span><span class="nf">colorbar</span><span class="p">(</span><span class="n">im</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">fraction</span><span class="o">=</span><span class="mf">0.046</span><span class="p">,</span> <span class="n">pad</span><span class="o">=</span><span class="mf">0.04</span><span class="p">)</span>

    <span class="n">ax</span><span class="p">.</span><span class="nf">set</span><span class="p">(</span>
        <span class="n">xticks</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="nf">len</span><span class="p">(</span><span class="n">labels</span><span class="p">)),</span>
        <span class="n">yticks</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="nf">len</span><span class="p">(</span><span class="n">labels</span><span class="p">)),</span>
        <span class="n">xticklabels</span><span class="o">=</span><span class="p">[</span><span class="nf">str</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">labels</span><span class="p">],</span>
        <span class="n">yticklabels</span><span class="o">=</span><span class="p">[</span><span class="nf">str</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">labels</span><span class="p">],</span>
        <span class="n">xlabel</span><span class="o">=</span><span class="sh">"</span><span class="s">Predicted Labels</span><span class="sh">"</span><span class="p">,</span>
        <span class="n">ylabel</span><span class="o">=</span><span class="sh">"</span><span class="s">True Labels</span><span class="sh">"</span><span class="p">,</span>
        <span class="n">title</span><span class="o">=</span><span class="n">title</span> <span class="k">if</span> <span class="n">title</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="nf">else </span><span class="p">(</span><span class="sh">"</span><span class="s">Confusion matrix (normalized)</span><span class="sh">"</span> <span class="k">if</span> <span class="n">normalize</span> <span class="k">else</span> <span class="sh">"</span><span class="s">Confusion matrix</span><span class="sh">"</span><span class="p">))</span>

    <span class="c1"># annotate:
</span>    <span class="n">fmt</span> <span class="o">=</span> <span class="sh">"</span><span class="s">.2f</span><span class="sh">"</span> <span class="k">if</span> <span class="n">normalize</span> <span class="k">else</span> <span class="sh">"</span><span class="s">d</span><span class="sh">"</span>
    <span class="n">thresh</span> <span class="o">=</span> <span class="n">M</span><span class="p">.</span><span class="nf">max</span><span class="p">()</span> <span class="o">*</span> <span class="mf">0.6</span> <span class="k">if</span> <span class="n">M</span><span class="p">.</span><span class="n">size</span> <span class="k">else</span> <span class="mf">0.0</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span>
        <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]):</span>
            <span class="n">ax</span><span class="p">.</span><span class="nf">text</span><span class="p">(</span>
                <span class="n">j</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="nf">format</span><span class="p">(</span><span class="n">M</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">],</span> <span class="n">fmt</span><span class="p">),</span>
                <span class="n">ha</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">,</span>
                <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">white</span><span class="sh">"</span> <span class="k">if</span> <span class="n">M</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">&gt;</span> <span class="n">thresh</span> <span class="k">else</span> <span class="sh">"</span><span class="s">black</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
    <span class="k">return</span> <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span>
</code></pre></div></div>
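
<p>Both functions can be checked in isolation on a few toy labels, without any trained model:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>y_true = [0, 1, 2, 2, 1, 0, 3, 3]
y_pred = [0, 2, 2, 2, 1, 0, 3, 1]
C = confusion_matrix_np(y_true, y_pred, labels=list(range(4)))
print(C)
# [[2 0 0 0]
#  [0 1 1 0]
#  [0 0 2 0]
#  [0 1 0 1]]
plot_confusion_matrix(C, labels=list(range(4)), normalize=True)
</code></pre></div></div>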

<p>The next function recomputes the overall accuracy, this time directly from the confusion matrix: it calculates the accuracy as the trace (sum of diagonal elements) divided by the total number of samples. It also computes the recall for each class, i.e., the diagonal element divided by the sum of the corresponding row (true positives / all actual positives):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">accuracy_metrics</span><span class="p">(</span><span class="n">C</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    From confusion matrix C (rows=true, cols=pred),
    compute the overall accuracy and the per-class recall.
    </span><span class="sh">"""</span>
    <span class="n">C</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>
    <span class="n">total</span> <span class="o">=</span> <span class="n">C</span><span class="p">.</span><span class="nf">sum</span><span class="p">()</span>
    <span class="n">acc</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">trace</span><span class="p">(</span><span class="n">C</span><span class="p">)</span> <span class="o">/</span> <span class="n">total</span> <span class="k">if</span> <span class="n">total</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="k">else</span> <span class="n">np</span><span class="p">.</span><span class="n">nan</span>

    <span class="n">row_sums</span> <span class="o">=</span> <span class="n">C</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="k">with</span> <span class="n">np</span><span class="p">.</span><span class="nf">errstate</span><span class="p">(</span><span class="n">divide</span><span class="o">=</span><span class="sh">"</span><span class="s">ignore</span><span class="sh">"</span><span class="p">,</span> <span class="n">invalid</span><span class="o">=</span><span class="sh">"</span><span class="s">ignore</span><span class="sh">"</span><span class="p">):</span>
        <span class="n">recall</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">diag</span><span class="p">(</span><span class="n">C</span><span class="p">)</span> <span class="o">/</span> <span class="n">row_sums</span>

    <span class="k">return</span> <span class="n">acc</span><span class="p">,</span> <span class="n">recall</span>
</code></pre></div></div>
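
<p>Continuing the toy example from above: the accuracy is the trace divided by the total count, and the recall is computed row-wise:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>acc, recall = accuracy_metrics(C)
print(acc)     # (2 + 1 + 2 + 1) / 8 = 0.75
print(recall)  # [1.  0.5 1.  0.5]
</code></pre></div></div>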

<p>An important aspect of analyzing the model’s performance is to look at the spike trains of the output neurons. The following <code class="language-plaintext highlighter-rouge">rasterplot</code> function creates a raster plot for the binary spike matrix, where each dot represents a spike from a neuron at a specific time step. The function also allows highlighting specific neurons (e.g., the epoch winner and the final winner) with different colors and sizes to visually distinguish them from the rest of the neurons:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">rasterplot</span><span class="p">(</span>
    <span class="n">spike_train</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="n">ndarray</span><span class="p">,</span>
    <span class="n">title</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">raster</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">xlim</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="n">highlight_neuron_idx</span><span class="p">:</span> <span class="nb">int</span> <span class="o">|</span> <span class="bp">None</span> <span class="o">=</span> <span class="bp">None</span><span class="p">,</span>
    <span class="n">highlight_color</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">orange</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">highlight2_neuron_idx</span><span class="p">:</span> <span class="nb">int</span> <span class="o">|</span> <span class="bp">None</span> <span class="o">=</span> <span class="bp">None</span><span class="p">,</span>
    <span class="n">highlight2_color</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">magenta</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">base_color</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">0.35</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">s_base</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">2.0</span><span class="p">,</span>
    <span class="n">s_highlight</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">8.0</span><span class="p">,</span>
    <span class="n">s_highlight2</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">8.0</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    Raster plot for a binary spike matrix with up to two highlighted neurons.

    spike_train: shape (N, T), entries 0/1
    highlight_neuron_idx: epoch wise winner (orange)
    highlight2_neuron_idx: final winner (magenta, retroactive)
    If both indices are equal, only one overlay is drawn (orange), but the legend
    label indicates </span><span class="sh">"</span><span class="s">epoch winner = final winner</span><span class="sh">"</span><span class="s">.
    </span><span class="sh">"""</span>
    <span class="n">spike_train</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">spike_train</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">spike_train</span><span class="p">.</span><span class="n">ndim</span> <span class="o">!=</span> <span class="mi">2</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nc">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">spike_train must be 2D (N,T), got shape </span><span class="si">{</span><span class="n">spike_train</span><span class="p">.</span><span class="n">shape</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">N</span><span class="p">,</span> <span class="n">T</span> <span class="o">=</span> <span class="n">spike_train</span><span class="p">.</span><span class="n">shape</span>
    <span class="n">ys</span><span class="p">,</span> <span class="n">xs</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="n">spike_train</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">scatter</span><span class="p">(</span><span class="n">xs</span><span class="p">,</span> <span class="n">ys</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">s_base</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">base_color</span><span class="p">,</span> <span class="n">linewidths</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>

    <span class="n">handles</span> <span class="o">=</span> <span class="p">[]</span>

    <span class="k">def</span> <span class="nf">_overlay</span><span class="p">(</span><span class="n">idx</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">color</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">size</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span> <span class="n">label</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="p">(</span><span class="mi">0</span> <span class="o">&lt;=</span> <span class="n">idx</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">):</span>
            <span class="k">raise</span> <span class="nc">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">highlight idx=</span><span class="si">{</span><span class="n">idx</span><span class="si">}</span><span class="s"> out of bounds for N=</span><span class="si">{</span><span class="n">N</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">xs_h</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="n">spike_train</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
        <span class="k">if</span> <span class="n">xs_h</span><span class="p">.</span><span class="n">size</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">None</span>
        <span class="n">ys_h</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">full</span><span class="p">(</span><span class="n">xs_h</span><span class="p">.</span><span class="n">shape</span><span class="p">,</span> <span class="n">idx</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">int</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">plt</span><span class="p">.</span><span class="nf">scatter</span><span class="p">(</span><span class="n">xs_h</span><span class="p">,</span> <span class="n">ys_h</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">size</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">,</span> <span class="n">linewidths</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">)</span>

    <span class="n">j1</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">highlight_neuron_idx</span><span class="p">)</span> <span class="k">if</span> <span class="n">highlight_neuron_idx</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="k">else</span> <span class="bp">None</span>
    <span class="n">j2</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">highlight2_neuron_idx</span><span class="p">)</span> <span class="k">if</span> <span class="n">highlight2_neuron_idx</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="k">else</span> <span class="bp">None</span>
    <span class="n">same</span> <span class="o">=</span> <span class="p">(</span><span class="n">j1</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="n">j2</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="n">j1</span> <span class="o">==</span> <span class="n">j2</span><span class="p">)</span>

    <span class="c1"># epoch winner (always, if provided)
</span>    <span class="k">if</span> <span class="n">j1</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">label1</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="s">epoch winner = final winner idx </span><span class="si">{</span><span class="n">j1</span><span class="si">}</span><span class="sh">"</span> <span class="k">if</span> <span class="n">same</span> <span class="k">else</span> <span class="sa">f</span><span class="sh">"</span><span class="s">epoch winner idx </span><span class="si">{</span><span class="n">j1</span><span class="si">}</span><span class="sh">"</span>
        <span class="n">h1</span> <span class="o">=</span> <span class="nf">_overlay</span><span class="p">(</span><span class="n">j1</span><span class="p">,</span> <span class="n">highlight_color</span><span class="p">,</span> <span class="n">s_highlight</span><span class="p">,</span> <span class="n">label1</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">h1</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">handles</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">h1</span><span class="p">)</span>

    <span class="c1"># final winner (only if provided and different)
</span>    <span class="nf">if </span><span class="p">(</span><span class="n">j2</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="ow">not</span> <span class="n">same</span><span class="p">):</span>
        <span class="n">h2</span> <span class="o">=</span> <span class="nf">_overlay</span><span class="p">(</span><span class="n">j2</span><span class="p">,</span> <span class="n">highlight2_color</span><span class="p">,</span> <span class="n">s_highlight2</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">final winner idx </span><span class="si">{</span><span class="n">j2</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">h2</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">handles</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">h2</span><span class="p">)</span>

    <span class="k">if</span> <span class="n">handles</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">best</span><span class="sh">"</span><span class="p">)</span>

    <span class="k">if</span> <span class="n">xlim</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">xlim</span><span class="p">(</span><span class="n">xlim</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">time step</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">neuron index</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">grid</span><span class="p">(</span><span class="bp">True</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.6</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="n">title</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
</code></pre></div></div>

<p>To visualize the receptive fields of the output neurons, we define the following function <code class="language-plaintext highlighter-rouge">plot_rf_of_neuron</code>, which takes the synaptic weights of the first layer and a specific neuron index as input. It reshapes the weight vector of that neuron into a 28x28 image and visualizes it using a colormap. This allows us to see what kind of input pattern that neuron has learned to respond to:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_rf_of_neuron</span><span class="p">(</span>
    <span class="n">synapses_0</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="n">ndarray</span><span class="p">,</span>
    <span class="n">neuron_idx</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span>
    <span class="n">title</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">""</span><span class="p">,</span>
    <span class="n">cmap</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">3.0</span><span class="p">,</span> <span class="mf">3.0</span><span class="p">))</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
    <span class="sh">"""</span><span class="s">
    Plot receptive field (weights reshaped to 28x28) of one output neuron.

    synapses_0: shape (n_out, 784)
    </span><span class="sh">"""</span>
    <span class="n">w</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">synapses_0</span><span class="p">)[</span><span class="n">neuron_idx</span><span class="p">]</span>  <span class="c1"># (784,)
</span>    <span class="n">img</span> <span class="o">=</span> <span class="n">w</span><span class="p">.</span><span class="nf">reshape</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="n">figsize</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">imshow</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="sh">"</span><span class="s">nearest</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="n">title</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">axis</span><span class="p">(</span><span class="sh">"</span><span class="s">off</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
</code></pre></div></div>
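
<p>As a quick usage sketch (assuming the trained model <code class="language-plaintext highlighter-rouge">m</code> from the training section below, whose first-layer weights end up in <code class="language-plaintext highlighter-rouge">m.learned_synapses[0]</code>; the neuron index 3 and the file name are arbitrary), inspecting a single receptive field could look like this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># usage sketch: plot the receptive field of one output neuron
plot_rf_of_neuron(m.learned_synapses[0], neuron_idx=3,
                  title="RF of output neuron 3")
plt.savefig(os.path.join(RESULTS_PATH, "rf_neuron_3.png"), dpi=200)
plt.close()
</code></pre></div></div>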

<p>The function <code class="language-plaintext highlighter-rouge">plot_label_template</code> visualizes a class specific weight template derived from the trained network. It selects all output neurons whose entry in <code class="language-plaintext highlighter-rouge">neuron_label_map</code> equals the given label and then averages their input weight vectors (or sums them, depending on mode). The resulting 784 dimensional vector is reshaped to 28×28 and plotted. This is therefore not an average MNIST image, but an average of learned synaptic weight patterns for that class. We use this function to visualize the emergent digit templates learned by the network for each class, which can be compared to individual receptive fields of single winner neurons as well as to the aggregated input spike representations of real MNIST samples, in order to assess how closely the learned synaptic structure aligns with the underlying data distribution:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_label_template</span><span class="p">(</span>
    <span class="n">synapses_0</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="n">ndarray</span><span class="p">,</span>
    <span class="n">neuron_label_map</span><span class="p">:</span> <span class="n">np</span><span class="p">.</span><span class="n">ndarray</span><span class="p">,</span>
    <span class="n">label</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span>
    <span class="n">title</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">""</span><span class="p">,</span>
    <span class="n">cmap</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">mode</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="sh">"</span><span class="s">sum</span><span class="sh">"</span><span class="p">,</span>  <span class="c1"># "sum" or "mean"
</span>    <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">3.0</span><span class="p">,</span> <span class="mf">3.0</span><span class="p">))</span> <span class="o">-&gt;</span> <span class="bp">None</span><span class="p">:</span>
    <span class="sh">"""</span><span class="s">
    Plot label template: aggregate RFs of all neurons assigned to a given label.
    </span><span class="sh">"""</span>
    <span class="n">synapses_0</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">synapses_0</span><span class="p">)</span>
    <span class="n">nlm</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">neuron_label_map</span><span class="p">)</span>

    <span class="n">idx</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="n">nlm</span> <span class="o">==</span> <span class="nf">int</span><span class="p">(</span><span class="n">label</span><span class="p">))[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="n">figsize</span><span class="p">)</span>

    <span class="sh">"""</span><span class="s"> 
    idx.size indicates how many neurons are mapped to this label. 
    If idx.size == 0, it means no neuron is mapped to this label.
    </span><span class="sh">"""</span>

    <span class="k">if</span> <span class="n">idx</span><span class="p">.</span><span class="n">size</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">text</span><span class="p">(</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">No neurons mapped to label </span><span class="si">{</span><span class="n">label</span><span class="si">}</span><span class="sh">"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">axis</span><span class="p">(</span><span class="sh">"</span><span class="s">off</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="n">title</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
        <span class="k">return</span>

    <span class="n">W</span> <span class="o">=</span> <span class="n">synapses_0</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span>  <span class="c1"># (n_label_neurons, 784)
</span>    <span class="k">if</span> <span class="n">mode</span> <span class="o">==</span> <span class="sh">"</span><span class="s">sum</span><span class="sh">"</span><span class="p">:</span>
        <span class="n">img</span> <span class="o">=</span> <span class="n">W</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">).</span><span class="nf">reshape</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">mode</span> <span class="o">==</span> <span class="sh">"</span><span class="s">mean</span><span class="sh">"</span><span class="p">:</span>
        <span class="n">img</span> <span class="o">=</span> <span class="n">W</span><span class="p">.</span><span class="nf">mean</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">).</span><span class="nf">reshape</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nc">ValueError</span><span class="p">(</span><span class="sh">"</span><span class="s">mode must be </span><span class="sh">'</span><span class="s">sum</span><span class="sh">'</span><span class="s"> or </span><span class="sh">'</span><span class="s">mean</span><span class="sh">'"</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">imshow</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="sh">"</span><span class="s">nearest</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="n">title</span> <span class="o">+</span> <span class="sa">f</span><span class="sh">"</span><span class="s"> (neurons mapping: </span><span class="si">{</span><span class="n">idx</span><span class="p">.</span><span class="n">size</span><span class="si">}</span><span class="s">/</span><span class="si">{</span><span class="n">synapses_0</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">}</span><span class="s">)</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">axis</span><span class="p">(</span><span class="sh">"</span><span class="s">off</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
</code></pre></div></div>
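
<p>A usage sketch for this function (again assuming the trained model <code class="language-plaintext highlighter-rouge">m</code> from below, with its <code class="language-plaintext highlighter-rouge">learned_synapses</code> and <code class="language-plaintext highlighter-rouge">learned_neuron_label_map</code>; the file names are arbitrary) could render one template per trained class:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># usage sketch: one aggregated weight template per trained class
for c in CLASSES:
    plot_label_template(m.learned_synapses[0], m.learned_neuron_label_map,
                        label=c, title=f"Template for class {c}", mode="mean")
    plt.savefig(os.path.join(RESULTS_PATH, f"label_template_{c}.png"), dpi=200)
    plt.close()
</code></pre></div></div>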

<p>Next, we define two small utility functions: one extracts the winner neuron index from the output spike counts, and the other returns the last weight snapshot stored for a specific sample and epoch. Both will be useful for analyzing how the winner neuron’s receptive field evolves over epochs:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">get_winner_neuron_idx</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">epoch</span><span class="p">,</span> <span class="n">train_image_idx</span><span class="p">):</span>
    <span class="n">spk_out</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">spikeplots</span><span class="p">[</span><span class="n">epoch</span><span class="p">][</span><span class="n">train_image_idx</span><span class="p">][</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>  <span class="c1"># (n_out, T)
</span>    <span class="n">spike_counts</span> <span class="o">=</span> <span class="n">spk_out</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="k">return</span> <span class="nf">int</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">argmax</span><span class="p">(</span><span class="n">spike_counts</span><span class="p">)),</span> <span class="n">spike_counts</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">get_last_weight_snapshot_for_sample</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">epoch</span><span class="p">,</span> <span class="n">train_image_idx</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    weight_evolution[epoch][sample] is a list of snapshots.
    Each snapshot has shape (n_out, n_in).
    </span><span class="sh">"""</span>
    <span class="n">snapshots</span> <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="n">weight_evolution</span><span class="p">[</span><span class="n">epoch</span><span class="p">][</span><span class="n">train_image_idx</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">snapshots</span> <span class="ow">is</span> <span class="bp">None</span> <span class="ow">or</span> <span class="nf">len</span><span class="p">(</span><span class="n">snapshots</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nc">ValueError</span><span class="p">(</span><span class="sh">"</span><span class="s">No weight evolution stored. Set m.get_weight_evolution=True before training.</span><span class="sh">"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">snapshots</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>  <span class="c1"># (n_out, n_in)
</span></code></pre></div></div>
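
<p>Combined, the two helpers tell us, for a given epoch and training sample, which neuron won and what its weight row looked like at that moment. A minimal sketch, assuming <code class="language-plaintext highlighter-rouge">m</code> was trained with <code class="language-plaintext highlighter-rouge">get_spikeplots</code> and <code class="language-plaintext highlighter-rouge">get_weight_evolution</code> enabled (as we do below); epoch 0 and sample 0 are arbitrary:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: winner neuron and its receptive field for epoch 0, sample 0
winner_idx, spike_counts = get_winner_neuron_idx(m, epoch=0, train_image_idx=0)
W = get_last_weight_snapshot_for_sample(m, epoch=0, train_image_idx=0)
print(f"winner neuron {winner_idx} fired {int(spike_counts[winner_idx])} spikes")
plot_rf_of_neuron(W, winner_idx, title=f"RF of epoch-0 winner (idx {winner_idx})")
</code></pre></div></div>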

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_winner_rf_evolution_over_epochs</span><span class="p">(</span>
    <span class="n">m</span><span class="p">,</span>
    <span class="n">train_image_idx</span><span class="p">,</span>
    <span class="n">last_epoch</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">parameters_dict</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="n">nlm_final</span><span class="o">=</span><span class="bp">None</span><span class="p">,):</span>
    <span class="sh">"""</span><span class="s">
    For each epoch, pick the winner neuron by spike count
    (same definition as in the raster plots above), then plot its RF from the
    epoch-specific weight snapshot and compute summary metrics for that winner.

    Notes:
      - The </span><span class="sh">"</span><span class="s">winner</span><span class="sh">"</span><span class="s"> can change across epochs (that is the point).
      - If nlm_final is given, we also show map=... in the titles (final neuron label map).
    </span><span class="sh">"""</span>
    <span class="k">if</span> <span class="n">last_epoch</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">last_epoch</span> <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="n">parameters</span><span class="p">.</span><span class="n">epochs</span> <span class="o">-</span> <span class="mi">1</span>

    <span class="n">true_label</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">Y_train</span><span class="p">[</span><span class="n">train_image_idx</span><span class="p">])</span>

    <span class="n">rfs</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">norms_l1</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">norms_l2</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">means</span> <span class="o">=</span> <span class="p">[]</span>

    <span class="n">winner_idxs</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">winner_counts</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">winner_maps</span> <span class="o">=</span> <span class="p">[]</span>

    <span class="k">for</span> <span class="n">ep</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">parameters</span><span class="p">.</span><span class="n">epochs</span><span class="p">):</span>
        <span class="c1"># epoch specific winner from spikes (same as your raster loop)
</span>        <span class="n">spk_out</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">spikeplots</span><span class="p">[</span><span class="n">ep</span><span class="p">][</span><span class="n">train_image_idx</span><span class="p">][</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span>  <span class="c1"># (n_out, T)
</span>        <span class="n">spike_counts</span> <span class="o">=</span> <span class="n">spk_out</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
        <span class="n">winner_idx</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">argmax</span><span class="p">(</span><span class="n">spike_counts</span><span class="p">))</span>
        <span class="n">winner_count</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">spike_counts</span><span class="p">[</span><span class="n">winner_idx</span><span class="p">])</span>

        <span class="n">winner_map</span> <span class="o">=</span> <span class="bp">None</span>
        <span class="k">if</span> <span class="n">nlm_final</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="ow">and</span> <span class="n">winner_idx</span> <span class="o">&lt;</span> <span class="nf">len</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">):</span>
            <span class="n">winner_map</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">[</span><span class="n">winner_idx</span><span class="p">])</span>

        <span class="c1"># epoch specific weights (last snapshot for this sample at this epoch)
</span>        <span class="n">W</span> <span class="o">=</span> <span class="nf">get_last_weight_snapshot_for_sample</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">ep</span><span class="p">,</span> <span class="n">train_image_idx</span><span class="p">)</span>  <span class="c1"># (n_out, 784)
</span>        <span class="n">w</span> <span class="o">=</span> <span class="n">W</span><span class="p">[</span><span class="n">winner_idx</span><span class="p">]</span>  <span class="c1"># (784,)
</span>
        <span class="k">if</span> <span class="n">w</span><span class="p">.</span><span class="n">size</span> <span class="o">!=</span> <span class="mi">28</span> <span class="o">*</span> <span class="mi">28</span><span class="p">:</span>
            <span class="k">raise</span> <span class="nc">ValueError</span><span class="p">(</span>
                <span class="sa">f</span><span class="sh">"</span><span class="s">Expected 784 weights for RF, got </span><span class="si">{</span><span class="n">w</span><span class="p">.</span><span class="n">size</span><span class="si">}</span><span class="s">. W shape is </span><span class="si">{</span><span class="n">W</span><span class="p">.</span><span class="n">shape</span><span class="si">}</span><span class="s">.</span><span class="sh">"</span><span class="p">)</span>

        <span class="n">rfs</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">w</span><span class="p">.</span><span class="nf">reshape</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">))</span>
        <span class="n">norms_l1</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">abs</span><span class="p">(</span><span class="n">w</span><span class="p">)))</span>
        <span class="n">norms_l2</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">sqrt</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">w</span><span class="o">**</span><span class="mi">2</span><span class="p">)))</span>
        <span class="n">means</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">mean</span><span class="p">(</span><span class="n">w</span><span class="p">))</span>

        <span class="n">winner_idxs</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">winner_idx</span><span class="p">)</span>
        <span class="n">winner_counts</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">winner_count</span><span class="p">)</span>
        <span class="n">winner_maps</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">winner_map</span><span class="p">)</span>

    <span class="c1"># RF tiles
</span>    <span class="n">n</span> <span class="o">=</span> <span class="nf">len</span><span class="p">(</span><span class="n">rfs</span><span class="p">)</span>
    <span class="n">fig</span><span class="p">,</span> <span class="n">axes</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">n</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">3</span> <span class="o">*</span> <span class="n">n</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="n">squeeze</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">ep</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
        <span class="n">ax</span> <span class="o">=</span> <span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="n">ep</span><span class="p">]</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">imshow</span><span class="p">(</span><span class="n">rfs</span><span class="p">[</span><span class="n">ep</span><span class="p">],</span> <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">,</span> <span class="n">interpolation</span><span class="o">=</span><span class="sh">"</span><span class="s">nearest</span><span class="sh">"</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">winner_maps</span><span class="p">[</span><span class="n">ep</span><span class="p">]</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">Epoch </span><span class="si">{</span><span class="n">ep</span><span class="si">}</span><span class="se">\n</span><span class="s">idx=</span><span class="si">{</span><span class="n">winner_idxs</span><span class="p">[</span><span class="n">ep</span><span class="p">]</span><span class="si">}</span><span class="s">, spikes=</span><span class="si">{</span><span class="n">winner_counts</span><span class="p">[</span><span class="n">ep</span><span class="p">]</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span>
                <span class="sa">f</span><span class="sh">"</span><span class="s">Epoch </span><span class="si">{</span><span class="n">ep</span><span class="si">}</span><span class="se">\n</span><span class="s">idx=</span><span class="si">{</span><span class="n">winner_idxs</span><span class="p">[</span><span class="n">ep</span><span class="p">]</span><span class="si">}</span><span class="s">, spikes=</span><span class="si">{</span><span class="n">winner_counts</span><span class="p">[</span><span class="n">ep</span><span class="p">]</span><span class="si">}</span><span class="s">, map=</span><span class="si">{</span><span class="n">winner_maps</span><span class="p">[</span><span class="n">ep</span><span class="p">]</span><span class="si">}</span><span class="sh">"</span>
            <span class="p">)</span>

        <span class="n">ax</span><span class="p">.</span><span class="nf">axis</span><span class="p">(</span><span class="sh">"</span><span class="s">off</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">suptitle</span><span class="p">(</span>
        <span class="sa">f</span><span class="sh">"</span><span class="s">Winner RF evolution for sample </span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="se">\n</span><span class="s">true=</span><span class="si">{</span><span class="n">true_label</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">winner_rf_evolution_sample</span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">_tiles.png</span><span class="sh">"</span><span class="p">),</span><span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>


    <span class="c1"># summary metrics:
</span>    <span class="n">fig</span><span class="p">,</span> <span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>

    <span class="n">l1_line</span><span class="p">,</span> <span class="o">=</span> <span class="n">ax1</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">norms_l1</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="sh">"</span><span class="s">o</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">L1 norm</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">l2_line</span><span class="p">,</span> <span class="o">=</span> <span class="n">ax1</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">norms_l2</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="sh">"</span><span class="s">o</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">L2 norm</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">ax1</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">Epoch</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax1</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">L1 / L2 value</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax1</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">Weight summary for epoch wise winners (sample </span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">)</span><span class="sh">"</span><span class="p">)</span>
    <span class="c1"># annotate eg at the L2 dots, which winner idx they correspond to:
</span>    <span class="k">for</span> <span class="n">ep</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
        <span class="c1"># ax1.annotate(f"winner\nidx {winner_idxs[ep]}", (ep, norms_l2[ep]), 
</span>        <span class="c1">#              textcoords="offset points", xytext=(0,10), ha='center', fontsize=8)
</span>        <span class="n">ax1</span><span class="p">.</span><span class="nf">annotate</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">winner</span><span class="se">\n</span><span class="s">idx: </span><span class="si">{</span><span class="n">winner_idxs</span><span class="p">[</span><span class="n">ep</span><span class="p">]</span><span class="si">}</span><span class="sh">"</span><span class="p">,</span> <span class="p">(</span><span class="n">ep</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="nf">max</span><span class="p">(</span><span class="n">norms_l1</span><span class="p">)</span><span class="o">*</span><span class="mf">1.1</span><span class="p">),</span> 
                     <span class="n">textcoords</span><span class="o">=</span><span class="sh">"</span><span class="s">offset points</span><span class="sh">"</span><span class="p">,</span> <span class="n">xytext</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">10</span><span class="p">),</span> <span class="n">ha</span><span class="o">=</span><span class="sh">'</span><span class="s">center</span><span class="sh">'</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span>
                     <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">top</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax1</span><span class="p">.</span><span class="nf">set_ylim</span><span class="p">(</span><span class="n">bottom</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">top</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="nf">max</span><span class="p">(</span><span class="n">norms_l1</span><span class="p">)</span><span class="o">*</span><span class="mf">1.2</span><span class="p">)</span>

    <span class="c1"># secondary axis for mean weight:
</span>    <span class="n">ax2</span> <span class="o">=</span> <span class="n">ax1</span><span class="p">.</span><span class="nf">twinx</span><span class="p">()</span>
    <span class="n">mean_line</span><span class="p">,</span> <span class="o">=</span> <span class="n">ax2</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">means</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="sh">"</span><span class="s">o</span><span class="sh">"</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">gray</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">Mean weight</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax2</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">Mean weight</span><span class="sh">"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">gray</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax2</span><span class="p">.</span><span class="nf">tick_params</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">,</span> <span class="n">labelcolor</span><span class="o">=</span><span class="sh">"</span><span class="s">gray</span><span class="sh">"</span><span class="p">)</span>

    <span class="k">if</span> <span class="n">parameters_dict</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">ax2</span><span class="p">.</span><span class="nf">set_ylim</span><span class="p">(</span><span class="n">bottom</span><span class="o">=</span><span class="n">parameters_dict</span><span class="p">[</span><span class="sh">"</span><span class="s">min_weight</span><span class="sh">"</span><span class="p">],</span> <span class="n">top</span><span class="o">=</span><span class="n">parameters_dict</span><span class="p">[</span><span class="sh">"</span><span class="s">max_weight</span><span class="sh">"</span><span class="p">])</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">ax2</span><span class="p">.</span><span class="nf">set_ylim</span><span class="p">(</span><span class="n">bottom</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">top</span><span class="o">=</span><span class="mf">1.01</span><span class="p">)</span>

    <span class="c1"># unified legend
</span>    <span class="n">lines</span> <span class="o">=</span> <span class="p">[</span><span class="n">l1_line</span><span class="p">,</span> <span class="n">l2_line</span><span class="p">,</span> <span class="n">mean_line</span><span class="p">]</span>
    <span class="n">labels</span> <span class="o">=</span> <span class="p">[</span><span class="n">line</span><span class="p">.</span><span class="nf">get_label</span><span class="p">()</span> <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">lines</span><span class="p">]</span>
    <span class="n">ax1</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">lines</span><span class="p">,</span> <span class="n">labels</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">best</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">winner_rf_evolution_sample</span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">_summary.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>
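
<p>A call to this function could look like the following sketch (sample index 0 is arbitrary; <code class="language-plaintext highlighter-rouge">parameters_dict</code> is the same dictionary we later pass to the accuracy evaluation):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: track the winner RF of one training sample across epochs
plot_winner_rf_evolution_over_epochs(
    m,
    train_image_idx=0,
    parameters_dict=parameters_dict,
    nlm_final=m.learned_neuron_label_map,
)
</code></pre></div></div>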

<h3 id="loading-data-and-training-the-model">Loading data and training the model</h3>
<p>Having defined all the necessary classes and functions, we can now proceed to load the data. We use <em>nervos</em>’ own <code class="language-plaintext highlighter-rouge">dataloader.MNISTLoader</code> to load the MNIST dataset and initialize the layers of the SNN. We also plot some random samples from the training and test sets to visualize the input spike trains before training the model:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">m</span> <span class="o">=</span> <span class="nc">MNIST_SNN</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">identifier_name</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="n">CLASSES</span><span class="p">)</span>
<span class="n">m</span><span class="p">.</span><span class="nf">initialise_layers</span><span class="p">([</span><span class="mi">784</span><span class="p">,</span><span class="mi">80</span><span class="p">])</span>

<span class="n">m</span><span class="p">.</span><span class="nf">plot_random_samples</span><span class="p">(</span><span class="n">N</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span> <span class="n">train</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">aggregate</span><span class="o">=</span><span class="sh">"</span><span class="s">sum</span><span class="sh">"</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span><span class="mi">9</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">suptitle</span><span class="p">(</span><span class="sh">"</span><span class="s">Random samples from the training set (aggregated over time (sum))</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sh">"</span><span class="s">random_samples_train.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
<span class="n">m</span><span class="p">.</span><span class="nf">plot_random_samples</span><span class="p">(</span><span class="n">N</span><span class="o">=</span><span class="mi">25</span><span class="p">,</span> <span class="n">train</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">aggregate</span><span class="o">=</span><span class="sh">"</span><span class="s">sum</span><span class="sh">"</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span><span class="mi">9</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">suptitle</span><span class="p">(</span><span class="sh">"</span><span class="s">Random samples from the test set (aggregated over time (sum))</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sh">"</span><span class="s">random_samples_test.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/nervos/random_samples_train.png" title="Samples from the training set, visualized by aggregating the input spike trains over time (sum) to reconstruct the original images."><img src="/assets/images/posts/nervos/random_samples_train.png" width="100%" alt="Samples from the training set, visualized by aggregating the input spike trains over time (sum) to reconstruct the original images." /></a>
Samples from the training set, visualized by aggregating the input spike trains over time (sum) to reconstruct the original images. Each image is labeled with its true class. This gives us an intuition of what kind of input patterns the network will be trained on.</p>

<p>Finally, we run the training loop of the model and evaluate its performance. The training loop will save the learned synapses and neuron label map at the end of each epoch in a subdirectory named “Epoch_{epoch_number}-{accuracy}” for easy identification of the best epoch later on:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">m</span><span class="p">.</span><span class="n">get_spikeplots</span> <span class="o">=</span> <span class="bp">True</span>
<span class="n">m</span><span class="p">.</span><span class="n">get_weight_evolution</span> <span class="o">=</span> <span class="bp">True</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="nf">train</span><span class="p">()</span>
</code></pre></div></div>
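
<p>Since each subdirectory name encodes the accuracy reached in that epoch, we can locate the best checkpoint afterwards by parsing the directory names. The following is a minimal sketch; it assumes the subdirectories live under <code class="language-plaintext highlighter-rouge">RESULTS_PATH</code> and that the accuracy part of the name parses as a float, which may differ in your setup:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import glob

# sketch: find the epoch checkpoint with the highest recorded accuracy,
# assuming directory names of the form "Epoch_{epoch_number}-{accuracy}"
epoch_dirs = glob.glob(os.path.join(RESULTS_PATH, "Epoch_*-*"))

def parse_accuracy(d):
    # "Epoch_3-0.87" -&gt; 0.87
    return float(os.path.basename(d).split("-", 1)[1])

best_dir = max(epoch_dirs, key=parse_accuracy)
print("best epoch checkpoint:", best_dir)
</code></pre></div></div>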

<p>In the next section, we will analyze the results of the training by visualizing the learned synapses and evaluating the accuracy of the model on the test set.</p>

<h2 id="evaluation">Evaluation</h2>
<p>The first evaluation step is to visualize the learned synapses. We aggregate the synaptic weight vectors across output neurons that share the same label in the learned <code class="language-plaintext highlighter-rouge">neuron_label_map</code>, and visualize the resulting class-conditioned templates as 28x28 images. This allows us to see which input patterns the network has learned to associate with each digit class:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># evaluate the model by visualizing the learned synapses and calculating accuracy on test set:
</span><span class="nf">visualize_synapse</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">learned_synapses</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">m</span><span class="p">.</span><span class="n">learned_neuron_label_map</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mf">5.0</span><span class="p">),</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">suptitle</span><span class="p">(</span><span class="sh">"</span><span class="s">Learned synapses</span><span class="se">\n</span><span class="s">(summed over output neurons of the same predicted class)</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sh">"</span><span class="s">learned_synapses.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/nervos/learned_synapses.png" title="Learned synapses, visualized by summing over output neurons of the same predicted class."><img src="/assets/images/posts/nervos/learned_synapses.png" width="100%" alt="Learned synapses, visualized by summing over output neurons of the same predicted class." /></a>
Learned synapses, visualized by summing over output neurons of the same predicted class. The model was trained on the digit classes 0 to 5.</p>

<p>In our case, the learned synapses for each class form distinct patterns that resemble the corresponding digit shapes, indicating that the network has successfully learned to differentiate between the classes based on the input spike patterns.</p>

<h3 id="accuracy-evaluation-on-test-set">Accuracy evaluation on test set</h3>
<p>Next, we evaluate the accuracy of the model on the test set. We use the <code class="language-plaintext highlighter-rouge">accuracy</code> function to get the true and predicted labels for the test samples, then compute the confusion matrix and overall accuracy metrics. Finally, we plot the confusion matrix both in raw counts and normalized form to visualize how well the model is performing across different classes:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># evaluate accuracy on test set:
</span><span class="n">y_true</span><span class="p">,</span><span class="n">y_pred</span> <span class="o">=</span> <span class="nf">accuracy</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">classes</span><span class="o">=</span><span class="n">CLASSES</span><span class="p">,</span> <span class="n">parameters_dict</span><span class="o">=</span><span class="n">parameters_dict</span><span class="p">)</span>

<span class="c1"># calculate confusion matrix and metrics:
</span><span class="n">labels</span> <span class="o">=</span> <span class="n">CLASSES</span>  <span class="c1"># use the selected classes
</span><span class="n">C</span> <span class="o">=</span> <span class="nf">confusion_matrix_np</span><span class="p">(</span><span class="n">y_true</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">,</span> <span class="n">labels</span><span class="o">=</span><span class="n">labels</span><span class="p">)</span>
<span class="n">acc</span><span class="p">,</span> <span class="n">recall</span> <span class="o">=</span> <span class="nf">accuracy_metrics</span><span class="p">(</span><span class="n">C</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">Accuracy:</span><span class="sh">"</span><span class="p">,</span> <span class="n">acc</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">Recall:</span><span class="sh">"</span><span class="p">,</span> <span class="n">recall</span><span class="p">)</span>
<span class="c1"># plot confusion matrix:
</span><span class="nf">plot_confusion_matrix</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">labels</span><span class="o">=</span><span class="n">labels</span><span class="p">,</span> <span class="n">normalize</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="sh">"</span><span class="s">Confusion matrix (counts)</span><span class="sh">"</span><span class="p">,</span><span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">BuGn</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sh">"</span><span class="s">confusion_matrix_counts.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
<span class="nf">plot_confusion_matrix</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">labels</span><span class="o">=</span><span class="n">labels</span><span class="p">,</span> <span class="n">normalize</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="sh">"</span><span class="s">Confusion matrix (row-normalized)</span><span class="sh">"</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">BuGn</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sh">"</span><span class="s">confusion_matrix_normalized.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/nervos/confusion_matrix_normalized.jpg" title="Confusion matrix, row-normalized."><img src="/assets/images/posts/nervos/confusion_matrix_normalized.jpg" width="100%" alt="Confusion matrix, row-normalized." /></a>
Confusion matrix, row-normalized. The confusion matrix shows that the model has a high true positive rate for most classes, with the majority of predictions falling on the diagonal. There are only a few misclassifications (“3” and “5” seem to be confused sometimes), which indicates that the model has learned to differentiate well between the digit classes based on the input spike patterns.</p>

<p>With our model and training parameters defined above, we reach an accuracy of around 0.85 on the test set, which is already quite good. The confusion matrix also indicated that almost all classes are well recognized, i.e., most of the predictions are on the diagonal, with only a few misclassifications. This suggests that the model has successfully learned to differentiate between the digit classes based on the input spike patterns.</p>

<h3 id="spike-activity-and-winner-neuron-rf-evolution">Spike activity and winner neuron RF evolution</h3>
<p>Next, we will take a closer look at the spike activity of the output neurons and the evolution of the winner neuron’s receptive field over epochs for specific training samples. We will visualize the spike raster plots for the output layer at each epoch and also plot the receptive fields of the winner neurons to see how they evolve during training and how they relate to the learned synaptic weights and the true labels of the samples:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># spike rasterplots and winner RF evolution for specific training samples:
</span><span class="n">train_image_idx_list</span> <span class="o">=</span> <span class="p">[</span><span class="mi">41</span><span class="p">,</span> <span class="mi">61</span><span class="p">]</span>
<span class="n">synapses_final</span> <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="n">learned_synapses</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">nlm_final</span>      <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="n">learned_neuron_label_map</span>
<span class="k">for</span> <span class="n">train_image_idx</span> <span class="ow">in</span> <span class="n">train_image_idx_list</span><span class="p">:</span>
    <span class="n">true_label</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">Y_train</span><span class="p">[</span><span class="n">train_image_idx</span><span class="p">])</span>
    <span class="n">final_epoch</span> <span class="o">=</span> <span class="n">p</span><span class="p">.</span><span class="n">epochs</span> <span class="o">-</span> <span class="mi">1</span>
    <span class="n">final_winner_idx</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="nf">get_winner_neuron_idx</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">final_epoch</span><span class="p">,</span> <span class="n">train_image_idx</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">epoch</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">p</span><span class="p">.</span><span class="n">epochs</span><span class="p">):</span>
        
        <span class="c1"># pick winner from spikes (epoch-specific)
</span>        <span class="n">spk_in</span>  <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="n">spikeplots</span><span class="p">[</span><span class="n">epoch</span><span class="p">][</span><span class="n">train_image_idx</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span>    <span class="c1"># epoch, sample/train image, layer (0: Input, 1: Output)
</span>        <span class="n">spk_out</span> <span class="o">=</span> <span class="n">m</span><span class="p">.</span><span class="n">spikeplots</span><span class="p">[</span><span class="n">epoch</span><span class="p">][</span><span class="n">train_image_idx</span><span class="p">][</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> 
        <span class="n">spike_counts</span> <span class="o">=</span> <span class="n">spk_out</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
        <span class="n">winner_idx</span>   <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">argmax</span><span class="p">(</span><span class="n">spike_counts</span><span class="p">))</span>
        <span class="n">winner_count</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">spike_counts</span><span class="p">[</span><span class="n">winner_idx</span><span class="p">])</span>
        
        <span class="c1"># spike raster plot for the output layer at this epoch and this training image:
</span>        <span class="nf">rasterplot</span><span class="p">(</span><span class="n">spk_out</span><span class="p">,</span>
            <span class="n">title</span><span class="o">=</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">Output raster, epoch </span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s"> sample </span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s"> </span><span class="sh">"</span>
                   <span class="sa">f</span><span class="sh">"</span><span class="s">(true=</span><span class="si">{</span><span class="n">true_label</span><span class="si">}</span><span class="s">, current winner=</span><span class="si">{</span><span class="n">winner_idx</span><span class="si">}</span><span class="s">, final winner=</span><span class="si">{</span><span class="n">final_winner_idx</span><span class="si">}</span><span class="s">)</span><span class="sh">"</span><span class="p">),</span>
            <span class="n">xlim</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">spk_out</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]),</span>
            <span class="n">highlight_neuron_idx</span><span class="o">=</span><span class="n">winner_idx</span><span class="p">,</span>
            <span class="n">highlight_color</span><span class="o">=</span><span class="sh">"</span><span class="s">orange</span><span class="sh">"</span><span class="p">,</span>
            <span class="n">highlight2_neuron_idx</span><span class="o">=</span><span class="n">final_winner_idx</span><span class="p">,</span>
            <span class="n">highlight2_color</span><span class="o">=</span><span class="sh">"</span><span class="s">magenta</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">raster_output_neurons_epoch</span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">_sample</span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>

        <span class="c1"># predicted label according to the (final) neuron_label_map:
</span>        <span class="n">winner_label</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">[</span><span class="n">winner_idx</span><span class="p">])</span> <span class="k">if</span> <span class="n">winner_idx</span> <span class="o">&lt;</span> <span class="nf">len</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">)</span> <span class="k">else</span> <span class="o">-</span><span class="mi">1</span>

        
        <span class="c1"># epoch specific weights:
</span>        <span class="n">W_ep</span> <span class="o">=</span> <span class="nf">get_last_weight_snapshot_for_sample</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">epoch</span><span class="p">,</span> <span class="n">train_image_idx</span><span class="p">)</span>

        <span class="c1"># 1. RF of the CURRENT epoch winner (this is what you want additionally):
</span>        <span class="n">winner_label</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">[</span><span class="n">winner_idx</span><span class="p">])</span> <span class="k">if</span> <span class="n">winner_idx</span> <span class="o">&lt;</span> <span class="nf">len</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">)</span> <span class="k">else</span> <span class="o">-</span><span class="mi">1</span>
        <span class="nf">plot_rf_of_neuron</span><span class="p">(</span>
            <span class="n">W_ep</span><span class="p">,</span>
            <span class="n">winner_idx</span><span class="p">,</span>
            <span class="n">title</span><span class="o">=</span><span class="p">(</span>
                <span class="sa">f</span><span class="sh">"</span><span class="s">Epoch winner RF in epoch </span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="se">\n</span><span class="s"> on sample </span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">: neuron idx=</span><span class="si">{</span><span class="n">winner_idx</span><span class="si">}</span><span class="s">,</span><span class="se">\n</span><span class="sh">"</span>
                <span class="sa">f</span><span class="sh">"</span><span class="s">spikes=</span><span class="si">{</span><span class="n">winner_count</span><span class="si">}</span><span class="s">, map=</span><span class="si">{</span><span class="n">winner_label</span><span class="si">}</span><span class="s">, true=</span><span class="si">{</span><span class="n">true_label</span><span class="si">}</span><span class="sh">"</span><span class="p">),</span>
            <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span>
            <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">3.8</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">))</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">rf_epochWinner_epoch</span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">_sample</span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>

        <span class="c1"># 2. RF of the FINAL winner, but using CURRENT epoch weights (optional, if you also want this):
</span>        <span class="n">final_label</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">[</span><span class="n">final_winner_idx</span><span class="p">])</span> <span class="k">if</span> <span class="n">final_winner_idx</span> <span class="o">&lt;</span> <span class="nf">len</span><span class="p">(</span><span class="n">nlm_final</span><span class="p">)</span> <span class="k">else</span> <span class="o">-</span><span class="mi">1</span>
        <span class="n">final_count_this_epoch</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">spike_counts</span><span class="p">[</span><span class="n">final_winner_idx</span><span class="p">])</span>
        <span class="nf">plot_rf_of_neuron</span><span class="p">(</span>
            <span class="n">W_ep</span><span class="p">,</span>
            <span class="n">final_winner_idx</span><span class="p">,</span>
            <span class="n">title</span><span class="o">=</span><span class="p">(</span>
                <span class="sa">f</span><span class="sh">"</span><span class="s">Final winner RF in epoch </span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="se">\n</span><span class="s"> on sample </span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">: neuron idx=</span><span class="si">{</span><span class="n">final_winner_idx</span><span class="si">}</span><span class="s">,</span><span class="se">\n</span><span class="sh">"</span>
                <span class="sa">f</span><span class="sh">"</span><span class="s">spikes(ep)=</span><span class="si">{</span><span class="n">final_count_this_epoch</span><span class="si">}</span><span class="s">, map=</span><span class="si">{</span><span class="n">final_label</span><span class="si">}</span><span class="s">, true=</span><span class="si">{</span><span class="n">true_label</span><span class="si">}</span><span class="sh">"</span><span class="p">),</span>
            <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span>
            <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">3.8</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">))</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">rf_finalWinner_asOfEpoch</span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">_sample</span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
        
        <span class="c1"># now plot the label template:
</span>        <span class="nf">plot_label_template</span><span class="p">(</span>
            <span class="n">synapses_final</span><span class="p">,</span>
            <span class="n">nlm_final</span><span class="p">,</span>
            <span class="n">true_label</span><span class="p">,</span>
            <span class="n">title</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">Template sample </span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s"> in epoch </span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">:</span><span class="se">\n</span><span class="s">true=</span><span class="si">{</span><span class="n">true_label</span><span class="si">}</span><span class="sh">"</span><span class="p">,</span>
            <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span>
            <span class="n">mode</span><span class="o">=</span><span class="sh">"</span><span class="s">mean</span><span class="sh">"</span><span class="p">,</span>
            <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">3.8</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">))</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">rf_template_epoch</span><span class="si">{</span><span class="n">epoch</span><span class="si">}</span><span class="s">_sample</span><span class="si">{</span><span class="n">train_image_idx</span><span class="si">}</span><span class="s">.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/nervos/rf_template_epoch0_sample61.png" title="The template for the true label of the sample, which has been learned by the network as an aggregate of the synaptic weights of all output neurons that are mapped to that label."><img src="/assets/images/posts/nervos/rf_template_epoch0_sample61.png" width="49%" alt="The template for the true label of the sample, which has been learned by the network as an aggregate of the synaptic weights of all output neurons that are mapped to that label." /></a><br />
The template for the true label of the sample, which has been learned by the network as an aggregate of the synaptic weights of all output neurons that are mapped to that label. This template gives an intuition of the prototypical input pattern that the network has learned to associate with that class (here: the digit “2”), and can be compared to the RF of the winner neuron and to the original input image to see how closely they align.</p>
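
<p>Conceptually, <code class="language-plaintext highlighter-rouge">plot_label_template</code> does little more than averaging the weight vectors of all output neurons that the label map assigns to a given class. A minimal sketch of this aggregation step, assuming <code class="language-plaintext highlighter-rouge">synapses</code> is an array of shape (number of output neurons, 784) (the actual plotting helper may differ):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def label_template(synapses, neuron_label_map, label, mode="mean"):
    # select all output neurons that the label map assigns to the requested class:
    mask = np.asarray(neuron_label_map) == label
    # aggregate their 784-dimensional weight vectors:
    agg = synapses[mask].mean(axis=0) if mode == "mean" else synapses[mask].sum(axis=0)
    return agg.reshape(28, 28)  # reshape back to image space for plotting
</code></pre></div></div>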

<h4 id="epoch-0">Epoch 0</h4>

<p class="align-caption"><a href="/assets/images/posts/nervos/raster_output_neurons_epoch0_sample61.png" title="Raster plot of output neuron activity in epoch 0 for sample 61."><img src="/assets/images/posts/nervos/raster_output_neurons_epoch0_sample61.png" width="100%" alt="Raster plot of output neuron activity in epoch 0 for sample 61." /></a>
<a href="/assets/images/posts/nervos/rf_epochWinner_epoch0_sample61.png" title="Receptive field of the winner neuron in epoch 0 for sample 61."><img src="/assets/images/posts/nervos/rf_epochWinner_epoch0_sample61.png" width="49%" alt="Receptive field of the winner neuron in epoch 0 for sample 61." /></a>
<a href="/assets/images/posts/nervos/rf_finalWinner_asOfEpoch0_sample61.png" title="Receptive field of the final winner neuron as of epoch 0 for sample 61."><img src="/assets/images/posts/nervos/rf_finalWinner_asOfEpoch0_sample61.png" width="49%" alt="Receptive field of the final winner neuron as of epoch 0 for sample 61." /></a><br />
<strong>Top:</strong> Raster plot of output neuron activity in epoch 0 for sample 61. Each dot represents a spike from a neuron at a specific time step. The highlighted neurons indicate the epoch winner (orange) and the final winner (magenta; i.e., the neuron that wins in the final epoch 3). <strong>Bottom left:</strong> Receptive field of the winner neuron in epoch 0 for sample 61, visualized as a 28x28 image of its synaptic weights. <strong>Bottom right:</strong> Receptive field of the final winner neuron as of epoch 0 for sample 61, visualized using the same weights but highlighting the final winner neuron.</p>

<p>The raster plot shows the spiking activity of the output layer neurons over time for a single training image at epoch 0. Here, we have 80 output neurons (as defined in the parameters) and the x-axis represents the discrete time steps of the simulation (100 + 1). Each dot in the raster plot corresponds to a spike from a particular neuron at a specific time step. The plot shows an initial firing during the ongoing exposure to the input image, followed by a damping of activity due to adaptation or inhibition. The pattern of spiking  depends on the learned synaptic weights and the input. Later, we reach threshold and the output neurons fire again, which can be seen as a second wave of spiking. This behavior is overall controlled by</p>

<ul>
  <li>the refractory time,</li>
  <li>the spike drop rate, and</li>
  <li>the adaptive threshold.</li>
</ul>

<p>The timing and pattern of these spikes are crucial for the model’s predictions and learning process.</p>

<p>Note that <em>nervos</em> uses two related but not identical notions of a “winner”: during training, the label map is updated using the neuron with the highest membrane potential at the spike event (specifically the last such event in the presentation), whereas in our analysis we define the winner as the neuron with maximal spike count over the full presentation window.</p>

<p>However, as implemented in <em>nervos</em>, this is not a biologically detailed simulation with biophysically plausible temporal dynamics as we have</p>

<ul>
  <li>no synaptic delays,</li>
  <li>no continuous integration of membrane potential over time (instead, the potential is updated in discrete time steps based on incoming spikes and current synaptic weights),</li>
  <li>no real membrane potential dynamics/ODE (like leaky integration, conductance-based synapses, etc.), and</li>
  <li>no real WTA (winner-takes-all) inhibition between output neurons. Inhibition is implemented algorithmically: At each spike event, neurons that do not exceed threshold are forced to an inhibitory potential and placed in refractory lockout, rather than receiving explicit inhibitory synaptic conductances.</li>
</ul>

<p>However, the discrete time steps and the spike patterns still allow us to analyze the learning dynamics and the evolution of the synaptic weights in a way that is analogous to how we would analyze a more biologically detailed SNN, albeit with some caveats regarding the interpretation of the membrane potential traces and spike timings.</p>

<p>The two bottom plots show the receptive fields of the winner neuron in epoch 0 and the final winner neuron as of epoch 0, respectively. The RF is visualized as a 28x28 image of the synaptic weights from the input layer to that specific output neuron. The RF of the winner neuron in epoch 0 already shows the structure that resembles the input pattern “2”. However, the mapped label of that neuron shows “1”, which indicates that at this early stage of training, the neuron has not yet learned to correctly associate its RF with the true label of the sample.</p>

<p>The RF of the final winner neuron as of epoch 0 also shows a similar pattern, but it is important to note that the final winner neuron can change across epochs, and its RF will evolve during training as the synaptic weights are updated based on the STDP learning rule. However, already in this first epoch 0, it maps to the correct label “2”, which suggests that it has already started to learn the correct association, even though its RF is not yet fully developed.</p>

<h4 id="epoch-1">Epoch 1</h4>

<p class="align-caption"><a href="/assets/images/posts/nervos/raster_output_neurons_epoch1_sample61.png" title="Raster plot of output neuron activity in epoch 1 for sample 61."><img src="/assets/images/posts/nervos/raster_output_neurons_epoch1_sample61.png" width="100%" alt="Raster plot of output neuron activity in epoch 1 for sample 61." /></a>
<a href="/assets/images/posts/nervos/rf_epochWinner_epoch1_sample61.png" title="Receptive field of the winner neuron in epoch 1 for sample 61."><img src="/assets/images/posts/nervos/rf_epochWinner_epoch1_sample61.png" width="49%" alt="Receptive field of the winner neuron in epoch 1 for sample 61." /></a>
<a href="/assets/images/posts/nervos/rf_finalWinner_asOfEpoch1_sample61.png" title="Receptive field of the final winner neuron as of epoch 1 for sample 61."><img src="/assets/images/posts/nervos/rf_finalWinner_asOfEpoch1_sample61.png" width="49%" alt="Receptive field of the final winner neuron as of epoch 1 for sample 61." /></a><br />
<strong>Top:</strong> Raster plot of output neuron activity in epoch 1 for sample 61. <strong>Bottom left:</strong> Receptive field of the winner neuron in epoch 1 for sample 61. <strong>Bottom right:</strong> Receptive field of the final winner neuron as of epoch 1 for sample 61. Note, that in this epoch, the current winner neuron and the final winner neuron are the same.</p>

<p>In epoch 1, the winner neuron changes from index 62 to index 69, which is the index of the final winner neuron at the end of the training. Thus, the final neuron already dominates the spike activity during this training sample, reflecting a stabilization of the network’s internal representation for this sample.</p>

<p>The raster plot on the other hand shows a reduction in overall spiking activity compared to epoch 0, indicating increased selectivity and stronger competition among output neurons.</p>

<h4 id="epoch-2">Epoch 2</h4>

<p class="align-caption"><a href="/assets/images/posts/nervos/raster_output_neurons_epoch2_sample61.png" title="Raster plot of output neuron activity in epoch 2 for sample 61."><img src="/assets/images/posts/nervos/raster_output_neurons_epoch2_sample61.png" width="100%" alt="Raster plot of output neuron activity in epoch 2 for sample 61." /></a>
<a href="/assets/images/posts/nervos/rf_epochWinner_epoch2_sample61.png" title="Receptive field of the winner neuron in epoch 2 for sample 61."><img src="/assets/images/posts/nervos/rf_epochWinner_epoch2_sample61.png" width="49%" alt="Receptive field of the winner neuron in epoch 2 for sample 61." /></a>
<a href="/assets/images/posts/nervos/rf_finalWinner_asOfEpoch2_sample61.png" title="Receptive field of the final winner neuron as of epoch 2 for sample 61."><img src="/assets/images/posts/nervos/rf_finalWinner_asOfEpoch2_sample61.png" width="49%" alt="Receptive field of the final winner neuron as of epoch 2 for sample 61." /></a><br />
<strong>Top:</strong> Raster plot of output neuron activity in epoch 2 for sample 61. <strong>Bottom left:</strong> Receptive field of the winner neuron in epoch 2 for sample 61. <strong>Bottom right:</strong> Receptive field of the final winner neuron as of epoch 2 for sample 61. Note, that in this final epoch, the current winner neuron and the final winner neuron are the same.</p>

<p>In the final epoch 2, neuron 69 remains the winner and now consistently dominates the spike activity. The predicted label matches the true label, demonstrating successful specialization of this neuron for the digit “2”. The receptive field of this neuron 69 also appears more broadened, indicating that STDP has reinforced synaptic weights from a wider range of input pixels that are relevant for recognizing the digit, which is a sign of successful learning and generalization.</p>

<p>The raster plot shows a further refinement of activity, with reduced distributed firing across other neurons. This suggests that the network has converged toward a more selective representation for this digit.</p>

<h3 id="weight-evolution-of-the-winner-neuron">Weight evolution of the winner neuron</h3>
<p>Finally, we can also plot the evolution of the synaptic weights of the winner neuron across epochs to see how its receptive field develops during training. This will allow us to see how the synaptic weights are updated based on the STDP learning rule and how they converge to a stable pattern that corresponds to the learned representation for that sample:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># let's plot the evolution of the synaptic weights over epochs for the winner neuron of the last epoch:
</span><span class="k">for</span> <span class="n">train_image_idx</span> <span class="ow">in</span> <span class="n">train_image_idx_list</span><span class="p">:</span>
    <span class="nf">plot_winner_rf_evolution_over_epochs</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">train_image_idx</span><span class="o">=</span><span class="n">train_image_idx</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="sh">"</span><span class="s">viridis</span><span class="sh">"</span><span class="p">,</span>
                                         <span class="n">parameters_dict</span><span class="o">=</span><span class="n">parameters_dict</span><span class="p">,</span> <span class="n">nlm_final</span><span class="o">=</span><span class="n">nlm_final</span><span class="p">)</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/nervos/winner_rf_evolution_sample61_tiles.png" title="Raster plot of output neuron activity in epoch 2 for sample 61."><img src="/assets/images/posts/nervos/winner_rf_evolution_sample61_tiles.png" width="100%" alt="Raster plot of output neuron activity in epoch 2 for sample 61." /></a>
<a href="/assets/images/posts/nervos/winner_rf_evolution_sample61_summary.png" title="Receptive field of the winner neuron in epoch 2 for sample 61."><img src="/assets/images/posts/nervos/winner_rf_evolution_sample61_summary.png" width="100%" alt="Receptive field of the winner neuron in epoch 2 for sample 61." /></a>
<strong>Top:</strong> Evolution of the receptive field of the winner neuron across epochs for sample 61, visualized as tiles. Each tile shows the RF of the winner neuron at a specific epoch, allowing us to see how it evolves during training. The title of each tile indicates the epoch number, the index of the winner neuron, its spike count, its mapped label according to the final neuron label map, and the true label of the sample. <strong>Bottom:</strong> Summary plot of weight metrics (L1 norm, L2 norm, and mean weight) for the winner neurons across epochs for sample 61.</p>

<p>The top panel shows the receptive field of the epoch-wise winner neuron for sample 61 across training epochs. Importantly, the winner neuron is defined separately in each epoch as the neuron with the highest spike count for this specific input sample. Consequently, the identity of the winner can change between epochs.</p>

<p>In epoch 0, neuron 62 wins with 61 spikes and is mapped to label 1 according to the final neuron label map. Its receptive field already resembles the digit “2” in structure, but the class association is still incorrect. We have recognized this already in the previous section. The weight pattern is relatively diffuse, and the L1 and L2 norms are comparatively large, indicating a broadly distributed weight configuration.</p>

<p>In epoch 1, the winner switches to neuron 69. This neuron is mapped to label 2 and therefore aligns with the true class of the sample. The receptive field still resembles the digit “2”, but is now a bit more “smeared out”, which is not a bad thing, as it indicates that the neuron is integrating information from a broader set of input pixels that are relevant for recognizing the digit. The spike count of this winner neuron is 40, which is slightly lower than the previous winner. The weight norms decrease substantially compared to epoch 0. This reflects a redistribution and normalization of synaptic weights under the STDP dynamics and weight constraints.</p>

<p>In epoch 2, neuron 69 remains the winner. The receptive field changes only moderately compared to epoch 1, suggesting stabilization of the learned representation. The weight norms decrease slightly further, while the overall shape remains consistent. This indicates convergence toward a stable attractor-like weight configuration for this sample.</p>

<p>The bottom summary plot quantifies this evolution:</p>

<ul>
  <li>The L1 norm decreases strongly from epoch 0 to epoch 1 and slightly further to epoch 2.</li>
  <li>The L2 norm shows a similar but less pronounced decline.</li>
  <li>The mean weight also decreases across epochs.</li>
</ul>

<p>This reduction does not imply loss of information. Instead, it reflects synaptic competition and bounded, weight dependent saturation inherent to the implemented <a href="/blog/2026-02-12-stdp/">STDP</a> update, together with the adaptive threshold dynamics. Early in training, weights are more broadly distributed. As learning progresses, weights become more selective and concentrated on input pixels that consistently co activate with the winner neuron.</p>

<p>Crucially, because the winner neuron can change across epochs, the summary metrics track the weights of different neurons at different times. This is intentional: the plot characterizes the weight profile of whichever neuron currently dominates the representation of this sample (compare the raster plots above), rather than following a single fixed neuron.</p>

<p>Overall, the combined RF tiles and norm curves illustrate three key aspects of learning in <em>nervos</em>:</p>

<ol>
  <li>Early competition between output neurons for representing a pattern.</li>
  <li>Reassignment of dominance to a neuron whose label mapping matches the true class.</li>
  <li>Gradual stabilization and sharpening of the receptive field under repeated exposure.</li>
</ol>

<p>This provides a transparent view of how discrete-time STDP, adaptive thresholds, and weight constraints together drive the formation of class-specific receptive fields in the output layer.</p>

<h2 id="conclusion">Conclusion</h2>
<p>In my view, <em>nervos</em> provides a deliberately minimal, transparent framework for studying how local <a href="/blog/2026-02-12-stdp/">spike timing dependent plasticity</a> interacts with synapse models that range from ideal floating point weights to hardware constrained finite state and nonlinear memristor inspired devices. The core appeal is that essentially every relevant mechanism remains inspectable: Input encoding, spike generation, winner selection, weight updates, and the emergence of neuron selectivity can be analyzed directly from stored spike rasters and weight snapshots, without any implicit gradient based optimization.</p>

<p>In this post, we closely followed <em>nervos</em>’ official <a href="https://nervos.readthedocs.io/en/latest/notebooks/mnist.html">MNIST tutorial</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> and extended it with additional analyses of internal dynamics. Using a two layer network with 784 input neurons and 80 output neurons, and using only local <a href="/blog/2026-02-12-stdp/">STDP weight updates</a>, the model reached an accuracy of roughly 85% on a six class test set. The learned synapse visualizations and the class specific weight templates indicated that the network develops digit like weight patterns, consistent with the idea that the output population self organizes into feature selective units.</p>

<p>The winner based analyses highlight how this self organization plays out on single samples. For the inspected training example with true label “2”, the epoch wise winner changed from one output neuron to another between early and later epochs. The receptive fields showed an interpretable progression from an initial pattern with an incorrect final label mapping, toward a stable winner neuron whose mapping matched the true class. Tracking weight norms across epochs showed a strong reduction in L1 and L2 magnitudes when the winner switched, consistent with competitive redistribution under <a href="/blog/2026-02-12-stdp/">STDP</a> together with weight clipping and the adaptive threshold dynamics. In other words, the network does not merely accumulate weights monotonically. It reallocates representational dominance across output neurons until repeated wins make the label assignment stable in practice, because the same neuron is repeatedly reassigned to the same class.</p>

<p>At the same time, it is also important to be explicit about what kind of model this is and is not. <em>nervos</em> captures a few key computational motifs that are often discussed in theoretical accounts of unsupervised cortical learning: local <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a>, competition, and specialization of units. However, it is not a biologically detailed simulator of spiking circuits. The time axis is discrete, the <a href="/blog/2026-02-04-neural_dynamics/">neuronal dynamics</a> are simplified relative to continuous conductance based models, the inhibition is implemented in a simplified global inhibition scheme, and there is no anatomical or physiological structure beyond a fully connected feedforward projection with global competition. Most importantly, classification is not implemented by an intrinsic downstream readout population but by a post hoc interpretation step, namely the neuron label map constructed online by repeatedly assigning the current sample label to the winning neuron and then applied to $\arg\max$ spike counts. This is a legitimate algorithmic readout, but it is not the same as a biological circuit that must produce a decision through spikes alone.</p>

<p>These points also clarify where <em>nervos</em> sits relative to biologically plausible architectures. Compared to models that aim for cortical realism, such as recurrent networks with structured excitation and inhibition, synaptic delays, dendritic nonlinearities, and neuromodulatory or reward gated learning, <em>nervos</em> is far simpler and leaves out many mechanisms that matter for real <a href="/blog/2026-02-04-neural_dynamics/">neural computation</a>. Those richer models can express temporal codes, recurrent memory, context dependence, and credit assignment mechanisms beyond local <a href="/blog/2026-02-12-stdp/">STDP</a>. They also often avoid the need for an explicit label map by embedding a readout circuit into the model itself. The cost is that they become harder to analyze and harder to link to neuromorphic constraints in a clean way. And, also an important point, they often require more computational resources to simulate, which can limit the scope of systematic parameter sweeps and mechanistic analyses.</p>

<p>Overall, Maskeen and Lashkare did a fantastic job! I think the main strengths of <em>nervos</em> are conceptual clarity, transparence, and practical accessibility. The minimal architecture makes it easy to attribute observed behavior to specific design choices, and the framework is explicitly designed to compare learning rules and synapse implementations under controlled conditions. This is valuable both for neuromorphic engineering, where non ideal synapses are the rule rather than the exception, and for <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a>, where it can serve as a clean baseline for what purely local <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> and competition can achieve on a pattern recognition problem.</p>

<p>If you are interested in exploring the code and running the simulations yourself, I highly recommend checking out the official <a href="https://github.com/jsmaskeen/nervos"><em>nervos</em> GitHub repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> and read the according <a href="https://arxiv.org/abs/2506.19377">pre-print</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</p>

<p>The complete code used in this blog post is available in this <a href="https://github.com/FabrizioMusacchio/neural_dynamics">Github repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (<code class="language-plaintext highlighter-rouge">nervos_snn_mnist.py</code>). Feel free to modify and expand upon it, and share your insights.</p>

<h2 id="references-and-further-reading">References and further reading</h2>
<ul>
  <li>Maskeen, Jaskirat Singh; Lashkare, Sandip, <em>A Unified Platform to Evaluate STDP Learning Rule and Synapse Model using Pattern Recognition in a Spiking Neural Network</em>, 2025, arXiv:2506.19377, DOI: <a href="https://arxiv.org/abs/2506.19377">10.48550/arXiv.2506.19377</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Maskeen, Jaskirat Singh; Lashkare, Sandip, <em>A Unified Platform to Evaluate STDP Learning Rule and Synapse Model using Pattern Recognition in a Spiking Neural Network</em>, ICANN 2025, <a href="https://link.springer.com/chapter/10.1007/978-3-032-04558-4_41">Springer</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://github.com/jsmaskeen/nervos"><em>nervos</em> GitHub repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nervos.readthedocs.io/en/latest/"><em>nervos</em> documentation</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>LeCun, Yann; Bottou, Léon; Bengio, Yoshua; Haffner, Patrick, <em>Gradient-based learning applied to document recognition</em>, 1998, Proceedings of the IEEE, doi: <a href="https://doi.org/10.1109/5.726791">10.1109/5.726791</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Diehl, Peter U.; Cook, Matthew, <em>Unsupervised learning of digit recognition using spike-timing-dependent plasticity</em>, 2015, Frontiers in Computational Neuroscience, doi: <a href="https://doi.org/10.3389/fncom.2015.00099">10.3389/fncom.2015.00099</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>N. Caporale, &amp; Y. Dan, <em>Spike timing-dependent plasticity: a Hebbian learning rule</em>, 2008, Annu Rev Neurosci, Vol. 31, pages 25-46, doi: <a href="https://doi.org/10.1146/annurev.neuro.31.060407.125639">10.1146/annurev.neuro.31.060407.125639</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>G. Bi, M. Poo, <em>Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type</em>, 1998, Journal of neuroscience, doi: <a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">10.1523/JNEUROSCI.18-24-10464.1998</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Wulfram Gerstner, Werner M. Kistler, Richard Naud, and Liam Paninski, <em>Chapter 19 Synaptic Plasticity and Learning</em> in <em>Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition</em>, 2014, Cambridge University Press, ISBN: 978-1-107-06083-8, <a href="https://neuronaldynamics.epfl.ch/online/Ch19.html">free online version</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Robert C. Malenka, Mark F. Bear, <em>LTP and LTD</em>, 2004, Neuron, Vol. 44, Issue 1, pages 5-21, doi: <a href="https://doi.org/10.1016/j.neuron.2004.09.012">10.1016/j.neuron.2004.09.012</a></li>
  <li>Nicoll, <em>A Brief History of Long-Term Potentiation</em>, 2017, Neuron, Vol. 93, Issue 2, pages 281-290, doi: <a href="https://doi.org/10.1016/j.neuron.2016.12.015">10.1016/j.neuron.2016.12.015</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Jesper Sjöström, Wulfram Gerstner, <em>Spike-timing dependent plasticity</em>, 2010, Scholarpedia, 5(2):1362, doi: <a href="http://www.scholarpedia.org/article/Spike-timing_dependent_plasticity">10.4249/scholarpedia.1362</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
</ul>

<!-- 
Write a Mastodon post summarizing this article in an objective, academic tone. Don't write ABOUT the article, but about its content/topic. (max. 450 characters + URL, which follows this scheme: https://www.fabriziomusacchio.com/blog/[FILE-NAME_WITHOUT_FILE-EXTENSION]/):

Just came across an elegant new #SNN framework called #nervos by Maskeen and Lashkare, which implements a two layer #SpikingNeuralNetwork with local #STDP #learning to classify, e.g., #MNIST digits. Here is an example, where I apply it to a 6-class subset of MNIST. The model reaches around 85% accuracy on the test set, and the learned synapses show digit-like patterns. Quite impressive in my view, given the simplicity of the architecture and the local learning rule:

🌍 https://www.fabriziomusacchio.com/blog/nervos_stdp_snn_simulation_on_mnist/

#CompNeuro #Neuroscience #NeuralDynamics #NeuralPlasticity
-->]]></content><author><name> </name></author><category term="Python" /><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[In this post, we use the open source spiking neural network (SNN) framework *nervos* to implement a minimal two layer SNN for pattern recognition on the MNIST dataset. We analyze how the network learns to classify digits through spike timing dependent plasticity (STDP) and how the synaptic weights evolve during training.]]></summary></entry><entry><title type="html">Spike-timing-dependent plasticity (STDP)</title><link href="/blog/2026-02-12-stdp/" rel="alternate" type="text/html" title="Spike-timing-dependent plasticity (STDP)" /><published>2026-02-12T11:05:45+01:00</published><updated>2026-02-12T11:05:45+01:00</updated><id>/blog/stdp</id><content type="html" xml:base="/blog/2026-02-12-stdp/"><![CDATA[<p>Another frequently used term in <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a> is spike-timing-dependent plasticity (STDP). STDP is a form of <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a> in which the strength of a synaptic connection is modified as a function of the precise temporal relationship between presynaptic and postsynaptic spikes. Rather than depending on averaged firing rates or stimulation frequency alone, STDP operates on the millisecond timescale of individual action potentials.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/stdp.png" title="Spike-Timing-Dependent Plasticity (STDP)."><img src="/assets/images/posts/nest/stdp.png" width="80%" alt="Spike-Timing-Dependent Plasticity (STDP)." /></a><br />
The plot illustrates the relationship between the change in synaptic weight ($\Delta w_{ij} / w_{ij}$) and the time difference ($t_j^f - t_i^f$) between the firing of the presynaptic neuron ($t_i^f$) and the postsynaptic neuron ($t_j^f$; $f$ denotes the spike index). For positive time differences, the synaptic weight increases, leading to <a href="/blog/2024-09-15-ltp_and_ltd/">long-term potentiation (LTP)</a>. For negative time differences, the synaptic weight decreases, resulting in <a href="/blog/2024-09-15-ltp_and_ltd/">long-term depression (LTD)</a>. The magnitude of the change in synaptic weight is determined by the time constants $\tau_+$ and $\tau_-$ for potentiation and depression, respectively (see explanation below). Source: <a href="http://www.scholarpedia.org/article/Spike-timing_dependent_plasticity">scholarpedia.org</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (modified; after <a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">Bi and Poo (1998)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>)</p>

<h2 id="spike-timing-dependent-plasticity">Spike-timing-dependent plasticity</h2>
<p>Spike-timing-dependent plasticity is a specific form of <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a> that depends on the precise timing of spikes (action potentials) between pre- and postsynaptic neurons. The change in synaptic strength is directly influenced by the relative timing of these spikes.</p>

<p><a name="synapse"></a></p>

<div class="notice--info">
<h3 style="margin-top: 0.0em;padding-top: 0.0em;">What is a synapse?</h3>

<p>Before we continue, it is important to clarify what we mean by a “synapse” when we use this term.</p>

<p>First of all, a synapse is not a virtual construct or purely theoretical connection. It is a real anatomical structure that can be identified under the microscope. At the same time, in <a href="/blog/2026-02-04-neural_dynamics/">computational models</a> it is represented in a strongly simplified and abstract form. It is important to clearly distinguish these two levels.</p>

<h4 id="anatomical-synapse">Anatomical synapse</h4>
<p>In biological tissue, a synapse is a specialized contact structure between two neurons. It consists of three main components:</p>

<ul>
  <li>a <strong>presynaptic terminal</strong> containing synaptic vesicles filled with neurotransmitter,</li>
  <li>a narrow <strong>synaptic cleft</strong> (about 20–40 nm wide),</li>
  <li>a <strong>postsynaptic membrane</strong> equipped with specific receptors and a dense protein scaffold.</li>
</ul>

<p class="align-caption"><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/3/30/SynapseSchematic_en.svg/1280px-SynapseSchematic_en.svg.png" title="Schematic representation of an anatomical synapse."><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/30/SynapseSchematic_en.svg/1280px-SynapseSchematic_en.svg.png" width="100%" alt="Schematic representation of an anatomical synapse." /></a><br />
Schematic representation of an anatomical synapse: The presynaptic terminal contains synaptic vesicles filled with neurotransmitter, which are released into the synaptic cleft upon arrival of an action potential. The postsynaptic membrane contains receptors that bind the neurotransmitter and initiate a postsynaptic response. Source: <a href="https://w.wiki/Hphw">Wikimedia Commons</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY-SA 4.0)</p>

<p>This tripartite structure (not to be confused with the tripartite synapse, see below) is clearly visible, e.g., in electron microscopy and is, therefore, a real physical entity and not just a conceptual placeholder. However, please also note, that</p>

<p>Functionally, a synapse is the site where:</p>

<ul>
  <li>an action potential arrives presynaptically,</li>
  <li>neurotransmitter is released,</li>
  <li>postsynaptic receptors are activated,</li>
  <li>and a postsynaptic current or conductance change is generated.</li>
</ul>

<p>It is the elementary unit of signal transmission between neurons.</p>

<h4 id="synapses-in-computational-models">Synapses in computational models</h4>
<p>In <a href="/blog/2026-02-04-neural_dynamics/">theoretical and computational neuroscience</a>, a synapse is abstracted to:</p>

<ul>
  <li>a directed connection from neuron $j$ to neuron $i$,</li>
  <li>a single scalar weight $w_{ij}$,</li>
  <li>optionally additional internal state variables such as eligibility traces.</li>
</ul>

<p class="align-caption"><a href="/assets/images/posts/nest/bcm_scheme_transparent.png" title="Schematic representation of a synapse in computational models."><img src="/assets/images/posts/nest/bcm_scheme_transparent.png" width="45%" alt="Schematic representation of a synapse in computational models." /></a><br />
Schematic representation of a synapse in computational models: Presynaptic neurons $x_i$ project to a postsynaptic neuron $y$ with corresponding synaptic weights $w_i$. The postsynaptic activity $y$ is the sum of the products of presynaptic activities and synaptic weights.</p>

<p>This weight summarizes the effective coupling strength between two neurons. It compresses a highly complex molecular and structural system into a single number.</p>

<p>In reality, a synapse involves:</p>

<ul>
  <li>probabilistic vesicle release,</li>
  <li>receptor kinetics,</li>
  <li>nonlinear dynamics,</li>
  <li>structural plasticity,</li>
  <li>and many interacting proteins.</li>
</ul>

<p>The model does not reproduce this complexity. It captures only the effective transmission strength and its modification.</p>

<p>The anatomical synapse described above is referred to as a “chemical synapse”. There are also “electrical synapses” (gap junctions) that allow direct electrical coupling between neurons, but they are not the focus of this discussion. However, we discussed them already in our post on <a href="/blog/2025-08-15-gap_junctions/">gap junctions</a>.</p>

<h4 id="the-tripartite-synapse">The tripartite synapse</h4>
<p>The anatomical picture described above was substantially refined when it became clear that astrocytes, a type of glial cell, actively participate in synaptic signaling. <a href="https://doi.org/10.1016%2Fs0166-2236%2898%2901349-6">Araque et al. (1999)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> demonstrated that astrocytes are not merely passive support cells, but can detect synaptic activity, respond with intracellular calcium elevations, and in turn modulate synaptic transmission.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Glutamate_reuptake_via_EAAT2_(GLT1).jpg" title="Schematic representation of an anatomical synapse."><img src="/assets/images/posts/nest/Glutamate_reuptake_via_EAAT2_(GLT1).jpg" width="80%" alt="Schematic representation of an anatomical synapse." /></a><br />
Schematic representation of glutamate reuptake at an excitatory synapse. The presynaptic terminal releases glutamate, which binds to postsynaptic receptors and is subsequently cleared by astrocytic transporters (EAAT2/GLT1). This functional integration of presynaptic terminal, postsynaptic membrane, and surrounding astrocytic processes is referred to as the tripartite synapse. Source: <a href="https://w.wiki/Hpic">Wikimedia Commons</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY-SA 4.0)</p>

<p>Astrocytic processes closely enwrap many excitatory synapses. They express neurotransmitter receptors, particularly for glutamate, allowing them to sense synaptic release. Upon activation, astrocytes can regulate extracellular neurotransmitter concentrations through uptake mechanisms, thereby shaping synaptic efficacy and preventing excitotoxicity. Moreover, astrocytes have been reported to release so-called <em>gliotransmitters</em> such as glutamate, ATP, or <a href="/blog/2025-06-29-astrocyte_enhance_plasticity/">D-serine</a>, although the physiological relevance and mechanisms of such release remain an active area of debate.</p>

<p>This led to the concept of the <strong>tripartite synapse</strong>, consisting of:</p>

<ul>
  <li>the presynaptic terminal,</li>
  <li>the postsynaptic membrane,</li>
  <li>and the surrounding astrocytic process.</li>
</ul>

<p>In this framework, synaptic transmission is no longer viewed as a purely neuronal two-element interaction. Instead, it is embedded in a local neuron–glia microcircuit in which astrocytes dynamically regulate information flow and plasticity. While many computational models treat the synapse as a two-body interaction between pre- and postsynaptic neurons, the biological reality is more complex and includes glial modulation as an additional regulatory layer, which is an active area of ongoing research.</p>

<h4 id="one-more-subtle-point">One more subtle point</h4>
<p>Between two neurons, there can be multiple anatomical synapses. In most network models, these are represented as a single effective connection with one weight parameter.</p>

<p>Thus, when we speak of “a synapse” in numerical simulations, we refer to a simplified mathematical representation of a real, anatomically defined contact structure.</p>

<p>I think understanding this distinction prevents conceptual confusion when moving between biology and mathematical modeling.</p>
</div>

<p>In STDP, the direction and magnitude of synaptic changes depend on whether the presynaptic spike precedes the postsynaptic spike or vice versa. Empirically, STDP exhibits the following characteristic behavior:</p>

<ul>
  <li>If a presynaptic neuron fires shortly <em>before</em> a postsynaptic neuron, typically within a temporal window of 10 to 20 ms, the synapse is strengthened. This corresponds to <a href="/blog/2024-09-15-ltp_and_ltd/">long-term potentiation (LTP)</a>.</li>
  <li>If the presynaptic neuron fires shortly <em>after</em> the postsynaptic neuron, the synapse is weakened. This corresponds to <a href="/blog/2024-09-15-ltp_and_ltd/">long-term depression (LTD)</a>.</li>
</ul>

<p>In contrast to classical experimental protocols for <a href="/blog/2024-09-15-ltp_and_ltd/">long-term potentiation and long-term depression</a>, which are typically defined and induced using sustained patterns of stimulation such as prolonged high-frequency or low-frequency input, spike-timing-dependent plasticity emphasizes the fine temporal structure of neuronal activity. Rather than averaging activity over extended time windows, STDP operates on the millisecond timescale of individual action potentials and explicitly encodes the relative order of presynaptic and postsynaptic spikes.</p>

<p>This focus on spike timing introduces a causal interpretation of synaptic modification. Synapses are strengthened when presynaptic activity reliably precedes postsynaptic firing, indicating a predictive contribution to postsynaptic activation, and weakened when this temporal order is reversed. In this sense, STDP provides a temporally precise and biologically plausible description of how synaptic changes can arise directly from ongoing spiking activity in neural circuits, without requiring artificial stimulation protocols.</p>

<p>Although STDP and classical <a href="/blog/2024-09-15-ltp_and_ltd/">LTP and LTD</a> are often discussed as separate forms of plasticity, they should not be regarded as distinct mechanisms. Instead, STDP captures a specific temporal organization of synaptic modification that can give rise to LTP-like or LTD-like changes when considered over longer timescales. Depending on the statistics of spike timing, repeated pre-before-post pairings lead to net potentiation, while repeated post-before-pre pairings lead to net depression.</p>

<p>At the mechanistic level, STDP and classical <a href="/blog/2024-09-15-ltp_and_ltd/">LTP and LTD</a> share common intracellular substrates. Both involve NMDA receptor activation, calcium influx, and downstream signaling cascades that ultimately modify synaptic efficacy. The primary distinction between these descriptions therefore does not lie in the underlying biological machinery, but in the temporal resolution at which synaptic changes are formulated and analyzed. STDP provides a spike-based, temporally resolved framework that complements and refines the rate- and protocol-based descriptions traditionally used to characterize LTP and LTD.</p>

<h2 id="mathematical-formulation-of-stdp">Mathematical formulation of STDP</h2>
<p>STDP can be considered a form of <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian learning</a> that refines the concept by incorporating precise spike timing.</p>

<p>Let us consider a simple mathematical model for STDP. We can describe the change in synaptic weight $\Delta w_{ij}$ between two neurons $i$ and $j$ as a function of the time difference between their spikes ($t_j^n - t_i^f$), where $f=1, 2, 3, \ldots$ counts the spikes of the presynaptic neuron $i$ and $n=1, 2, 3, \ldots$ counts the spikes of the postsynaptic neuron $j$. The total change in synaptic weight can be expressed as follows:</p>

\[\begin{align}
\Delta w_{ij} = \sum_{f=1}^{N_i} \sum_{n=1}^{N_j} W(t_j^n - t_i^f)
\end{align}\]

<p>where $W(t_j^n - t_i^f)$ denotes the chosen STDP function, also called the learning window. It typically decays exponentially, so that the magnitude of the weight change falls off with the time difference between the spikes. A commonly used form of the learning window is an asymmetric exponential function:</p>

\[\begin{equation}
W(t_j^n - t_i^f) = 
\begin{cases} 
A_+ \exp\!\left(-\dfrac{t_j^n - t_i^f}{\tau_+}\right), &amp; \text{if } t_j^n - t_i^f &gt; 0 \\
-A_- \exp\!\left(\dfrac{t_j^n - t_i^f}{\tau_-}\right), &amp; \text{if } t_j^n - t_i^f &lt; 0
\end{cases}
\end{equation}\label{eq:stdp}\]

<p>Here, $A_+$ and $A_-$ are scaling factors, while $\tau_+$ and $\tau_-$ are time constants for potentiation and depression, respectively; the time constants are typically on the order of 10 ms. Key features of this model include (a minimal implementation sketch follows the list):</p>

<ul>
  <li><strong>when the time difference is positive ($t_j^n - t_i^f &gt; 0$)</strong>, i.e., the presynaptic neuron fired <em>before</em> the postsynaptic neuron, it typically leads to <a href="/blog/2024-09-15-ltp_and_ltd/">long-term potentiation (LTP)</a>, where the synaptic strength increases.</li>
  <li><strong>when the time difference is negative ($t_j^n - t_i^f &lt; 0$)</strong>, indicating that the postsynaptic neuron fired <em>before</em> the presynaptic neuron, it generally results in <a href="/blog/2024-09-15-ltp_and_ltd/">long-term depression (LTD)</a>, where the synaptic strength decreases.</li>
</ul>
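<p>A quick way to build intuition for these two regimes is to implement the learning window of Eq. $\eqref{eq:stdp}$ directly. The following minimal sketch uses illustrative parameter values; it is not the exact script behind the figure in the box below, which is available in the GitHub repository mentioned at the end of the post:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import matplotlib.pyplot as plt

# illustrative parameters (amplitudes and time constants in a plausible range):
A_plus, A_minus     = 1.0, 1.0    # LTP and LTD amplitudes
tau_plus, tau_minus = 20.0, 20.0  # time constants [ms]

def stdp_window(dt):
    """Asymmetric exponential STDP window W(dt), dt = t_j^n - t_i^f in ms."""
    return np.where(dt &gt; 0,
                    A_plus * np.exp(-dt / tau_plus),    # pre before post: LTP
                    -A_minus * np.exp(dt / tau_minus))  # post before pre: LTD

dt = np.linspace(-100, 100, 1001)
plt.plot(dt, stdp_window(dt))
plt.axhline(0, color='gray', lw=0.8)
plt.xlabel(r'$\Delta t$ [ms]')
plt.ylabel(r'$W(\Delta t)$')
plt.show()
</code></pre></div></div>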

<div class="notice--info">
<h3 style="margin-top: 0.0em;padding-top: 0.0em;" id="graphical-representation-of-stdp">Graphical representation of STDP</h3>

<p>Let’s plot the learning window function Eq. $\eqref{eq:stdp}$ to further understand these relationships:</p>

<p class="align-caption"><a href="/assets/images/posts/nest/stdp_window.png" title="Spike-Timing-Dependent Plasticity (STDP)."><img src="/assets/images/posts/nest/stdp_window.png" width="100%" alt="Spike-Timing-Dependent Plasticity (STDP)." /></a><br />
STDP learning window $W(\Delta t)$ as a function of the relative spike timing $\Delta t = t_j^n - t_i^f$. The lower panel shows the change in synaptic weight induced by a single pre–post spike pair, with potentiation (<a href="/blog/2024-09-15-ltp_and_ltd/">LTP</a>) for $\Delta t &gt; 0$ (i.e., $t_i^f &lt; t_j^n$) and depression (<a href="/blog/2024-09-15-ltp_and_ltd/">LTD</a>) for $\Delta t &lt; 0$ (i.e., $t_j^n &lt; t_i^f$). The upper panel provides a schematic illustration of individual pre–post spike pairs corresponding to specific points on the learning window. The Python code used to generate this figure is available in the GitHub repository mentioned at the end of the post.</p>

<p>Shown here is STDP for a <strong>single, directed synapse</strong>, connecting a presynaptic neuron $i$ to a postsynaptic neuron $j$, with synaptic weight $w_{ij}$.</p>

<p>The lower panel shows the STDP learning window $W(\Delta t)$ as a function of the relative spike timing</p>

\[\Delta t = t_j^n - t_i^f.\]

<p>Again, $t_i^f$ denotes the spike time of the presynaptic neuron and $t_j^n$ the spike time of the postsynaptic neuron. The vertical axis represents the change in synaptic weight induced by a single pre–post spike pair.</p>

<p>The learning window is asymmetric around $\Delta t = 0$, and as described before, two distinct regimes can be identified based on the sign of $\Delta t$:</p>

<ul>
  <li><strong>For $\Delta t &gt; 0$</strong>, the presynaptic neuron fires <em>before</em> the postsynaptic neuron (“pre fires before post”), i.e., $t_i^f &lt; t_j^n$. This is illustrated in the upper panel (right side, red marks), where the spike of the presynaptic neuron is closer to the $\Delta t = 0$ line, indicating its earlier firing, followed by the later spike of the postsynaptic neuron at a larger positive $\Delta t$. In this case, the spiking of the presynaptic neuron can be interpreted as contributing <em>causally</em> to the firing of the postsynaptic neuron. The synapse is therefore potentiated and the weight change is positive, corresponding to <a href="/blog/2024-09-15-ltp_and_ltd/">long-term potentiation (LTP)</a>.</li>
  <li><strong>For $\Delta t &lt; 0$</strong>, the postsynaptic neuron fires <em>before</em> the presynaptic neuron (“post fires before pre”), i.e., $t_j^n &lt; t_i^f$. This is illustrated in the upper panel (left side, green marks), where the spike of the postsynaptic neuron is closer to the $\Delta t = 0$ line, indicating its earlier firing, followed by the later spike of the presynaptic neuron at a larger negative $\Delta t$. In this case, the presynaptic spike is <em>less likely</em> to have contributed to the postsynaptic firing, and the synapse is depressed, leading to a negative weight change corresponding to <a href="/blog/2024-09-15-ltp_and_ltd/">long-term depression (LTD)</a>.</li>
</ul>

<p>The exponential decay of both branches reflects the decreasing influence of spike pairs as their temporal separation increases. Spike pairs with large absolute time differences contribute only weakly to synaptic modification.</p>

<p>The color coding (red and green) emphasizes the functional distinction between LTD and LTP regimes (and not the neuron identity).</p>

<p>Together, the two panels demonstrate how STDP maps the <strong><em>relative</em> timing of individual pre- and postsynaptic spikes</strong> onto systematic synaptic weakening or strengthening. Absolute spike times are irrelevant for the learning rule. Only the temporal order and separation of spikes determine the direction and magnitude of synaptic change, with larger temporal proximity leading to stronger modifications. This illustrates the core principle of STDP as a local, spike-based learning mechanism that encodes causal relationships between neuronal activity patterns at the level of individual synapses.</p>
</div>

<h3 id="incorporating-stdp-in-neuron-models">Incorporating STDP in neuron models</h3>
<p>To incorporate STDP in a neuron model, we need to make the following assumptions. Each presynaptic spike leaves a trace variable $x_i(t)$ that is incremented by an amount $a_+(x_i)$, and, similarly, each postsynaptic spike leaves a trace variable $y_j(t)$ that is incremented by an amount $a_-(y_j)$. In the absence of spikes, both traces decay exponentially with time constants $\tau_+$ and $\tau_-$:</p>

\[\begin{align}
\tau_+ \frac{dx_i}{dt} &amp;= -x_i + a_+(x_i)\sum_f \delta(t - t_i^f), \\
\tau_- \frac{dy_j}{dt} &amp;= -y_j + a_-(y_j)\sum_n \delta(t - t_j^n)
\end{align}\]

<p>where:</p>

<ul>
  <li>$x_i(t)$ is a trace variable for the presynaptic spikes,</li>
  <li>$y_j(t)$ is a trace variable for the postsynaptic spikes,</li>
  <li>$\tau_+$ and $\tau_-$ are the time constants for the trace variables,</li>
  <li>$a_+(x_i)$ and $a_-(y_j)$ are functions that describe how the trace variables are updated upon the occurrence of spikes, and</li>
  <li>$w_{ij}$ is the synaptic weight from presynaptic neuron $i$ to postsynaptic neuron $j$.</li>
</ul>

<p>In the simplest and most commonly used case, $a_+(x_i)$ and $a_-(y_j)$ are constants, so that each spike produces a fixed additive increment of the corresponding trace. More general choices allow state-dependent or saturating trace updates, which can be used to model nonlinear effects without changing the overall structure of the STDP rule.</p>

<p>The synaptic weight then changes according to:</p>

\[\begin{equation}\begin{aligned}
\frac{dw_{ij}}{dt}
= \quad &amp;A_+(w_{ij})\, x_i(t)\sum_n \delta(t - t_j^n) \\
 -&amp;A_-(w_{ij})\, y_j(t)\sum_f \delta(t - t_i^f)
\end{aligned}\end{equation}\]

<p>The synaptic weight $w_{ij}$ changes based on the trace variables and the occurrence of spikes. The first term, $A_+(w_{ij}) x_i(t) \sum_n \delta(t - t_j^n)$, represents potentiation and depends on the presynaptic trace variable and the occurrence of postsynaptic spikes. The second term, $A_-(w_{ij}) y_j(t) \sum_f \delta(t - t_i^f)$, represents depression and depends on the postsynaptic trace variable and the occurrence of presynaptic spikes.</p>
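<p>To make the structure of this rule explicit, here is a minimal event-driven sketch for a single synapse in plain Python. It assumes the simplest case discussed above, i.e., constant trace increments $a_+$ and $a_-$ and weight-independent amplitudes $A_+$ and $A_-$; all constants and spike times are illustrative:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

# minimal event-driven sketch of the trace-based STDP rule above, for a
# single synapse; all parameter values and spike times are illustrative
tau_plus, tau_minus = 20.0, 20.0   # trace time constants [ms]
a_plus, a_minus = 1.0, 1.0         # constant trace increments
A_plus, A_minus = 0.01, 0.0105     # learning amplitudes (LTD slightly stronger)

pre_spikes  = [10.0, 50.0, 80.0]   # presynaptic spike times [ms]
post_spikes = [12.0, 45.0, 90.0]   # postsynaptic spike times [ms]

events = sorted([(t, 'pre') for t in pre_spikes] +
                [(t, 'post') for t in post_spikes])

x = y = 0.0    # presynaptic trace x_i and postsynaptic trace y_j
w = 0.5        # synaptic weight
t_last = 0.0
for t, kind in events:
    # exponential decay of both traces since the last event:
    x *= np.exp(-(t - t_last) / tau_plus)
    y *= np.exp(-(t - t_last) / tau_minus)
    if kind == 'pre':
        w -= A_minus * y   # depression: read out the postsynaptic trace y_j
        x += a_plus        # then increment the presynaptic trace x_i
    else:
        w += A_plus * x    # potentiation: read out the presynaptic trace x_i
        y += a_minus       # then increment the postsynaptic trace y_j
    t_last = t
print(f"final weight w = {w:.4f}")
</code></pre></div></div>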

<h2 id="python-example">Python example</h2>
<p>To illustrate how to implement STDP in a simple neuron model, this time we use the <a href="https://brian2.readthedocs.io/en/stable/">brian2</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> simulator, which is a popular tool for simulating spiking neural networks. The code below defines a simple (postsynaptic) <a href="/blog/2023-07-03-integrate_and_fire_model/">leaky integrate-and-fire neuron</a> receiving input from a population of 1000 Poisson spike generators, with STDP implemented on the synapses. The parameters are chosen to be in a reasonable range for this type of model. The synaptic weights are constrained to be between 0 and a maximum value <code class="language-plaintext highlighter-rouge">gmax</code>. We also set up a monitor to record the weights of two example synapses over time. The code is available in the GitHub repository mentioned at the end of the post. It is originally based on the example provided in the brian2 documentation: <a href="https://brian2.readthedocs.io/en/stable/examples/stdp.html">Spike-timing-dependent plasticity (STDP)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">os</span>
<span class="kn">from</span> <span class="n">pdb</span> <span class="kn">import</span> <span class="n">run</span>
<span class="kn">from</span> <span class="n">turtle</span> <span class="kn">import</span> <span class="n">clear</span>
<span class="kn">import</span> <span class="n">brian2</span> <span class="k">as</span> <span class="n">b2</span>
<span class="kn">from</span> <span class="n">brian2</span> <span class="kn">import</span> <span class="n">ms</span><span class="p">,</span> <span class="n">mV</span><span class="p">,</span> <span class="n">nS</span><span class="p">,</span> <span class="n">Hz</span><span class="p">,</span> <span class="n">second</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>

<span class="c1"># set global properties for all plots:
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">.</span><span class="nf">update</span><span class="p">({</span><span class="sh">'</span><span class="s">font.size</span><span class="sh">'</span><span class="p">:</span> <span class="mi">12</span><span class="p">})</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.top</span><span class="sh">"</span><span class="p">]</span>    <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.bottom</span><span class="sh">"</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.left</span><span class="sh">"</span><span class="p">]</span>   <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.right</span><span class="sh">"</span><span class="p">]</span>  <span class="o">=</span> <span class="bp">False</span>

<span class="c1"># define parameters:
</span><span class="n">N</span> <span class="o">=</span> <span class="mi">1000</span>            <span class="c1"># number of presynaptic neurons
</span><span class="n">taum</span> <span class="o">=</span> <span class="mi">10</span><span class="o">*</span><span class="n">ms</span>        <span class="c1"># membrane time constant
</span><span class="n">taupre</span> <span class="o">=</span> <span class="mi">20</span><span class="o">*</span><span class="n">ms</span>      <span class="c1"># STDP time constant for presynaptic trace
</span><span class="n">taupost</span> <span class="o">=</span> <span class="n">taupre</span>    <span class="c1"># STDP time constant for postsynaptic trace (often set equal to taupre)
</span><span class="n">Ee</span> <span class="o">=</span> <span class="mi">0</span><span class="o">*</span><span class="n">mV</span>           <span class="c1"># excitatory reversal potential
</span><span class="n">vt</span> <span class="o">=</span> <span class="o">-</span><span class="mi">54</span><span class="o">*</span><span class="n">mV</span>         <span class="c1"># spike threshold
</span><span class="n">vr</span> <span class="o">=</span> <span class="o">-</span><span class="mi">60</span><span class="o">*</span><span class="n">mV</span>         <span class="c1"># reset potential
</span><span class="n">El</span> <span class="o">=</span> <span class="o">-</span><span class="mi">74</span><span class="o">*</span><span class="n">mV</span>         <span class="c1"># leak reversal potential
</span><span class="n">taue</span> <span class="o">=</span> <span class="mi">5</span><span class="o">*</span><span class="n">ms</span>         <span class="c1"># excitatory synaptic time constant
</span><span class="n">F</span> <span class="o">=</span> <span class="mi">15</span><span class="o">*</span><span class="n">Hz</span>           <span class="c1"># firing rate of Poisson input
</span><span class="n">gmax</span> <span class="o">=</span> <span class="p">.</span><span class="mi">01</span>          <span class="c1"># maximum synaptic weight
</span><span class="n">dApre</span> <span class="o">=</span> <span class="p">.</span><span class="mi">01</span>         <span class="c1"># increment applied to the presynaptic eligibility trace Apre on each presynaptic spike (sets the scale of potentiation via Apre)
</span><span class="n">dApost</span> <span class="o">=</span> <span class="o">-</span><span class="n">dApre</span> <span class="o">*</span> <span class="n">taupre</span> <span class="o">/</span> <span class="n">taupost</span> <span class="o">*</span> <span class="mf">1.05</span>  <span class="c1"># increment applied to the postsynaptic eligibility trace Apost on each postsynaptic spike (negative; slightly stronger magnitude as a stabilizing heuristic)
</span><span class="n">dApost</span> <span class="o">*=</span> <span class="n">gmax</span>      <span class="c1"># scale trace increments to the same order of magnitude as w (since Apre/Apost are added directly to w)
</span><span class="n">dApre</span> <span class="o">*=</span> <span class="n">gmax</span>       <span class="c1"># same scaling for Apre
</span>
<span class="n">RESULTS_PATH</span> <span class="o">=</span> <span class="sh">"</span><span class="s">figures</span><span class="sh">"</span>
<span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

<span class="c1"># define the brian2 model with STDP synapses
</span><span class="n">eqs_neurons</span> <span class="o">=</span> <span class="sh">'''</span><span class="s">
dv/dt = (ge * (Ee-v) + El - v) / taum : volt
dge/dt = -ge / taue : 1
</span><span class="sh">'''</span>

<span class="n">poisson_input</span> <span class="o">=</span> <span class="n">b2</span><span class="p">.</span><span class="nc">PoissonGroup</span><span class="p">(</span><span class="n">N</span><span class="p">,</span> <span class="n">rates</span><span class="o">=</span><span class="n">F</span><span class="p">)</span>
<span class="n">neurons</span> <span class="o">=</span> <span class="n">b2</span><span class="p">.</span><span class="nc">NeuronGroup</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">eqs_neurons</span><span class="p">,</span> <span class="n">threshold</span><span class="o">=</span><span class="sh">'</span><span class="s">v&gt;vt</span><span class="sh">'</span><span class="p">,</span> <span class="n">reset</span><span class="o">=</span><span class="sh">'</span><span class="s">v = vr</span><span class="sh">'</span><span class="p">,</span>
                      <span class="n">method</span><span class="o">=</span><span class="sh">'</span><span class="s">euler</span><span class="sh">'</span><span class="p">)</span>

<span class="n">S</span> <span class="o">=</span> <span class="n">b2</span><span class="p">.</span><span class="nc">Synapses</span><span class="p">(</span><span class="n">poisson_input</span><span class="p">,</span> <span class="n">neurons</span><span class="p">,</span>
             <span class="sh">'''</span><span class="s">w : 1
                dApre/dt = -Apre / taupre : 1 (event-driven)
                dApost/dt = -Apost / taupost : 1 (event-driven)</span><span class="sh">'''</span><span class="p">,</span>
             <span class="n">on_pre</span><span class="o">=</span><span class="sh">'''</span><span class="s">ge += w
                    Apre += dApre
                    w = clip(w + Apost, 0, gmax)</span><span class="sh">'''</span><span class="p">,</span>
             <span class="n">on_post</span><span class="o">=</span><span class="sh">'''</span><span class="s">Apost += dApost
                     w = clip(w + Apre, 0, gmax)</span><span class="sh">'''</span><span class="p">)</span>
<span class="n">S</span><span class="p">.</span><span class="nf">connect</span><span class="p">()</span> <span class="c1"># all-to-one connectivity from the Poisson input to the single postsynaptic neuron
</span><span class="n">S</span><span class="p">.</span><span class="n">w</span> <span class="o">=</span> <span class="sh">'</span><span class="s">rand() * gmax</span><span class="sh">'</span> <span class="c1"># random initialization of weights between 0 and gmax
</span><span class="n">mon</span> <span class="o">=</span> <span class="n">b2</span><span class="p">.</span><span class="nc">StateMonitor</span><span class="p">(</span><span class="n">S</span><span class="p">,</span> <span class="sh">'</span><span class="s">w</span><span class="sh">'</span><span class="p">,</span> <span class="n">record</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span> <span class="c1"># record the weights of two example synapses to see how they evolve over time
</span>
<span class="c1"># run the simulation:
</span><span class="n">b2</span><span class="p">.</span><span class="nf">run</span><span class="p">(</span><span class="mi">100</span><span class="o">*</span><span class="n">b2</span><span class="p">.</span><span class="n">second</span><span class="p">,</span> <span class="n">report</span><span class="o">=</span><span class="sh">'</span><span class="s">text</span><span class="sh">'</span><span class="p">)</span>

<span class="c1"># plots:
</span><span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">8</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">S</span><span class="p">.</span><span class="n">w</span> <span class="o">/</span> <span class="n">gmax</span><span class="p">,</span> <span class="sh">'</span><span class="s">.</span><span class="sh">'</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="sh">'</span><span class="s">mediumaquamarine</span><span class="sh">'</span><span class="p">,</span> <span class="n">markersize</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sh">'</span><span class="s">w/gmax</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">'</span><span class="s">synapse index</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">'</span><span class="s">STDP: weights after simulation</span><span class="sh">'</span><span class="p">)</span>

<span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">hist</span><span class="p">(</span><span class="n">S</span><span class="p">.</span><span class="n">w</span> <span class="o">/</span> <span class="n">gmax</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">'</span><span class="s">mediumaquamarine</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">'</span><span class="s">w/gmax</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">'</span><span class="s">STDP: weight distribution</span><span class="sh">'</span><span class="p">)</span>

<span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">mon</span><span class="p">.</span><span class="n">t</span><span class="o">/</span><span class="n">b2</span><span class="p">.</span><span class="n">second</span><span class="p">,</span> <span class="n">mon</span><span class="p">.</span><span class="n">w</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">/</span><span class="n">gmax</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">'</span><span class="s">Synapse 0</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">mon</span><span class="p">.</span><span class="n">t</span><span class="o">/</span><span class="n">b2</span><span class="p">.</span><span class="n">second</span><span class="p">,</span> <span class="n">mon</span><span class="p">.</span><span class="n">w</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">/</span><span class="n">gmax</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">'</span><span class="s">Synapse 1</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">'</span><span class="s">t [s]</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sh">'</span><span class="s">w/gmax</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylim</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span><span class="mf">1.1</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">legend</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">'</span><span class="s">STDP: example synapses vary over time</span><span class="sh">'</span><span class="p">)</span>

<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">RESULTS_PATH</span><span class="p">,</span> <span class="sh">"</span><span class="s">stdp_example_weights.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p>In order to compare the same neuron model with and without STDP, you can simply define a second set of synapses without the STDP rule (i.e., with fixed weights) and run the simulation again:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># weights that do not change over time:
</span><span class="n">S2</span> <span class="o">=</span> <span class="n">b2</span><span class="p">.</span><span class="nc">Synapses</span><span class="p">(</span>
    <span class="n">poisson_input</span><span class="p">,</span> <span class="n">neurons</span><span class="p">,</span>
    <span class="n">model</span><span class="o">=</span><span class="sh">'</span><span class="s">w : 1</span><span class="sh">'</span><span class="p">,</span>
    <span class="n">on_pre</span><span class="o">=</span><span class="sh">'</span><span class="s">ge += w</span><span class="sh">'</span><span class="p">)</span>
<span class="n">S2</span><span class="p">.</span><span class="nf">connect</span><span class="p">()</span>
</code></pre></div></div>

<p>This allows you to directly observe the effects of STDP on synaptic weight evolution and neural activity patterns.</p>

<p>After running this simulation, we end up with two plots:</p>

<p class="align-caption"><a href="/assets/images/posts/nest/stdp_example_weights.png" title="Spike-Timing-Dependent Plasticity (STDP)."><img src="/assets/images/posts/nest/stdp_example_weights.png" width="49%" alt="Spike-Timing-Dependent Plasticity (STDP)." /></a>
<a href="/assets/images/posts/nest/stdp_example_no_stdp_weights.png" title="Spike-Timing-Dependent Plasticity (STDP)."><img src="/assets/images/posts/nest/stdp_example_no_stdp_weights.png" width="49%" alt="Spike-Timing-Dependent Plasticity (STDP)." /></a><br />
Synaptic weight dynamics with and without spike-timing-dependent plasticity. <strong>Left:</strong> STDP-enabled network. Synaptic weights differentiate over time and converge toward a bimodal distribution. <strong>Right:</strong> Control simulation without STDP. Synaptic weights remain at their initial random values and show no dynamical reorganization. Top panels show the final synaptic weights, middle panels show the distribution of synaptic weights, and bottom panels show the time course of two example synapses.</p>

<p>Shown are the results of the simulation with STDP (left) and without STDP (right). For each condition, three panels are shown, from top to bottom:</p>

<ol>
  <li>The synaptic weights after the simulation as a function of synapse index.</li>
  <li>The histogram of synaptic weights.</li>
  <li>The time course of two example synapses.</li>
</ol>

<p>Importantly, the upper panel in each column is not a spike raster. It does not contain temporal information. Instead, it shows a single point per synapse representing the final synaptic weight after learning. The x-axis enumerates synapses, while the y-axis shows the normalized weight $w/g_{\max}$.</p>

<h3 id="network-with-stdp">Network with STDP</h3>
<p>Let’s first discuss the results with STDP (left column).</p>

<h4 id="final-weights-across-synapses-upper-panel">Final weights across synapses (upper panel)</h4>
<p>In the STDP condition, the upper panel immediately reveals a strong differentiation of synaptic weights. Although all presynaptic neurons fire statistically identical Poisson spike trains, the synapses do not remain equivalent. Instead, we observe:</p>

<ul>
  <li>a strong spread of final weights,</li>
  <li>many synapses clustered near 0,</li>
  <li>many synapses clustered near $g_{\max}$,</li>
  <li>relatively few synapses in the intermediate regime,</li>
  <li>no spatial structure (synapse index is meaningless).</li>
</ul>

<p>There is no spatial structure along the synapse index, i.e., the structure exists purely in the distribution of weights, not in their arrangement.</p>

<p>This is already a nontrivial result. The network began with homogeneous random weights and statistically identical inputs. Nevertheless, STDP has broken this symmetry and generated <strong>synaptic differentiation</strong>.</p>

<p>This plot answers exactly one question: What do the synaptic weights look like across all synapses after learning?</p>

<p>The answer is clear: they no longer reflect their random initialization. Instead, they exhibit a pronounced polarization toward the boundaries of the allowed range. Most synapses have either weakened substantially or strengthened close to the maximum value, while comparatively few remain in an intermediate state.</p>

<p>This is a hallmark of STDP dynamics under unstructured input. The learning rule amplifies small random differences in spike timing, leading to a competitive process that pushes synapses toward extreme values. The result is a bimodal distribution of synaptic weights, with many synapses effectively “winning” (potentiated) and many “losing” (depressed), while few remain in an intermediate state.</p>

<h4 id="weight-distribution-middle-panel">Weight distribution (middle panel)</h4>
<p>The histogram in the middle panel makes this effect quantitative. The distribution is clearly bimodal:</p>

<ul>
  <li>one peak close to 0,</li>
  <li>one peak close to $g_{\max}$,</li>
  <li>a depletion of weights in the middle.</li>
</ul>

<p>This is the classical signature of additive, pair-based STDP under unstructured input. Synapses that, by chance, participate slightly more often in causal pre-before-post pairings are reinforced. Synapses that experience slightly more post-before-pre pairings are weakened. Because the learning rule is additive and asymmetric, these small differences are amplified over time.</p>

<p>The process resembles unsupervised synaptic competition. It is not overfitting, nor is it a numerical artifact. It is the expected behavior of this learning rule in the absence of additional stabilizing mechanisms.</p>

<h4 id="time-course-of-individual-synapses-lower-panel">Time course of individual synapses (lower panel)</h4>
<p>The lower panel shows the evolution of two example synapses. Both exhibit stochastic fluctuations, yet their trajectories display a clear long-term drift. One synapse gradually decreases toward zero, the other drifts upward.</p>

<p>This illustrates several fundamental properties of STDP:</p>

<ul>
  <li>The dynamics are stochastic.</li>
  <li>The process is path-dependent.</li>
  <li>Early random fluctuations can bias long-term outcomes.</li>
  <li>The system tends toward stable boundary states.</li>
</ul>

<p>STDP here does not fine-tune weights toward a specific optimum. Instead, it acts as a selective amplification mechanism that pushes synapses toward extreme states.</p>

<h4 id="why-do-weights-accumulate-near-0-and-g_max">Why do weights accumulate near 0 and $g_{\max}$?</h4>
<p>So, why do the weights accumulate near the boundaries? Why do we see this bimodal distribution instead of a more uniform spread?</p>

<p>The bimodal outcome arises from the structure of the learning rule itself. Formally, the update mechanism in the simulation is:</p>

<ul>
  <li>on presynaptic spikes:
\(w \leftarrow \mathrm{clip}(w + A_{\text{post}}, 0, g_{\max})\)</li>
  <li>on postsynaptic spikes:
\(w \leftarrow \mathrm{clip}(w + A_{\text{pre}}, 0, g_{\max})\)</li>
</ul>

<p>with positive LTP contributions and slightly stronger LTD contributions.</p>

<p>This is not gradient descent on a global objective. It is a stochastic drift process. For independent Poisson input, causal and anti-causal spike pairings occur with approximately equal probability. However:</p>

<ul>
  <li>LTD is slightly stronger than LTP.</li>
  <li>Postsynaptic firing depends on the current synaptic weight.</li>
  <li>Weights are clipped at 0 and $g_{\max}$.</li>
</ul>

<p>As a consequence:</p>

<ul>
  <li>Intermediate weights are unstable.</li>
  <li>Small weights tend to drift further downward.</li>
  <li>Large weights tend to drift further upward.</li>
  <li>The boundaries at 0 and $g_{\max}$ act as stable attractors.</li>
</ul>

<p>The result is weight binarization. Synapses are pushed into an either-or regime. This phenomenon has long been known in theoretical studies of additive STDP and is often described as bimodal weight dynamics or winner-take-all behavior at the synaptic level.</p>
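<p>To make this intuition concrete, consider the following toy drift model. It is <em>not</em> the brian2 simulation from above, but a deliberately stripped-down caricature in which we simply <em>assume</em> that the probability of a causal (pre-before-post) pairing grows with the current weight, reflecting the fact that strong synapses are more likely to drive the postsynaptic spike. Even this crude assumption, combined with additive updates, slightly stronger LTD, and clipping, reproduces the bimodal outcome:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

n_syn, n_steps = 1000, 20000
gmax    = 0.01
A_plus  = 1e-4            # additive LTP increment per causal pairing
A_minus = 1.05 * A_plus   # additive LTD increment (slightly stronger)

w = rng.uniform(0, gmax, n_syn)   # random initial weights, as in the simulation

for _ in range(n_steps):
    # toy assumption: causal pairings become more likely for strong synapses
    p_causal = 0.45 + 0.1 * w / gmax
    causal = rng.random(n_syn) &lt; p_causal
    w = np.clip(np.where(causal, w + A_plus, w - A_minus), 0, gmax)

plt.hist(w / gmax, bins=20)   # bimodal: peaks near 0 and near 1
plt.xlabel('w/gmax')
plt.ylabel('count')
plt.show()
</code></pre></div></div>

<p>Weights below the unstable interior fixed point drift toward 0, weights above it toward $g_{\max}$; the clipping boundaries then act as the stable attractors described above.</p>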

<p>Biologically, such pure binarization is unrealistic. Real neural systems require additional mechanisms such as weight-dependent <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a>, homeostatic regulation, normalization, or inhibitory competition to prevent saturation. However, for didactic purposes, this simple setup is ideal. It isolates the intrinsic competitive character of STDP.</p>

<h3 id="control-experiment-without-stdp">Control experiment without STDP</h3>
<p>Now, let’s turn to the control simulation without STDP (right column). Here, the synaptic weights are fixed and do not change over time. This allows us to isolate the effects of spiking activity alone, without any plasticity.</p>

<h4 id="final-weights-upper-panel">Final weights (upper panel)</h4>
<p>Without <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a>, the final weights are indistinguishable from the initialization. The distribution across synapse indices remains random. No synapse is selected, no differentiation emerges.</p>

<h4 id="weight-distribution-middle-panel-1">Weight distribution (middle panel)</h4>
<p>The histogram remains approximately uniform over $[0, g_{\max}]$. Any deviations are due to finite sampling, not dynamics. There is no drift, no bimodality, no boundary accumulation. This means that the distribution reflects only the initial random assignment of weights, and that spiking activity alone does not modify synaptic strength. The system remains in a static state with no learning or reorganization.</p>

<h4 id="time-course-lower-panel">Time course (lower panel)</h4>
<p>The trajectories of the example synapses remain perfectly constant. Despite continuous presynaptic spiking, nothing changes. Spiking activity alone does not modify synaptic strength.</p>

<p>This control condition demonstrates a central point:<br />
<strong>Activity does not imply plasticity</strong>, i.e., the network does not structure itself just because neurons are spiking. Only when spike timing is coupled to weight updates does structural reorganization occur:</p>

<ul>
  <li>with selection of early vs. later inputs,</li>
  <li>with structure in the weight distribution, and</li>
  <li>with functional differentiation of synapses.</li>
</ul>

<p>Thus, we can summarize the functional role of STDP in this minimal model as follows: Without STDP, the network is a passive integrator of random inputs. With STDP, it becomes a system that learns temporal causality, or to be more precise, it becomes a system that reinforces temporal correlations and spike order relationships.</p>

<h3 id="what-is-actually-learned">What is actually learned?</h3>
<p>It is important to evaluate what this model does and does not achieve.</p>

<p>There is:</p>

<ul>
  <li>no structured input,</li>
  <li>no task,</li>
  <li>no supervision,</li>
  <li>no inhibition, and</li>
  <li>no explicit competition beyond STDP itself.</li>
</ul>

<p>Consequently, the network does not learn semantic structure, features, or representations. It does not classify, predict, or encode meaningful patterns.</p>

<p>What it does demonstrate is more fundamental. STDP alone acts as a <strong>self-organizing mechanism</strong> that <strong>breaks symmetry</strong> among statistically identical inputs. It induces <strong>synaptic competition</strong> and produces <strong>structured weight distributions</strong> even under pure noise.</p>

<p>The comparison between both simulations makes this transparent:</p>

<ul>
  <li>Without STDP: We end up with static random connectivity.</li>
  <li>With STDP: We get dynamic differentiation and weight binarization.</li>
  <li>With additional regulatory mechanisms: We would potentially even get meaningful learning.</li>
</ul>

<p>Thus, this minimal model illustrates how a simple, biologically plausible learning rule can generate nontrivial synaptic structure from unstructured activity. It serves as a foundation for understanding more complex learning dynamics in spiking neural networks.</p>

<h2 id="stdp-and-hebbian-learning-rules">STDP and Hebbian learning rules</h2>
<p>STDP can be viewed as a temporally refined form of <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian learning</a>. Classical Hebbian <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> is often summarized as a correlation-based rule in which synaptic strength increases when presynaptic and postsynaptic activity are correlated. In its simplest rate-based form, Hebbian learning can be written as</p>

\[\frac{d w_{ij}}{d t} \propto r_i r_j,\]

<p>where $r_i$ and $r_j$ are the firing rates of the pre- and postsynaptic neurons.</p>
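<p>A well-known consequence of this naive rule is runaway growth: because the postsynaptic rate itself increases with the synaptic weight, the update forms a positive feedback loop. A minimal toy sketch (with hypothetical numbers and a simple linear rate neuron) makes this explicit:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># naive rate-based Hebbian rule dw/dt ~ r_i * r_j with a toy linear neuron;
# all values are illustrative
eta, r_i = 0.01, 5.0   # learning rate and presynaptic rate
w = 0.1                # initial weight
for step in range(5):
    r_j = w * r_i              # postsynaptic rate grows with the weight
    w  += eta * r_i * r_j      # Hebbian update: positive feedback
    print(f"step {step}: w = {w:.4f}")
# the weight grows exponentially; without bounds or normalization,
# classical Hebbian learning is unstable
</code></pre></div></div>

<p>This instability is one reason why the implicit normalization property of STDP, discussed further below, is functionally important.</p>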

<p>STDP extends this principle by resolving correlations at the level of individual spikes rather than averaged firing rates. By distinguishing pre-before-post from post-before-pre spike pairings, STDP introduces a causal asymmetry that is absent in classical Hebbian rules. Synaptic strengthening occurs when presynaptic activity predicts postsynaptic firing, while synaptic weakening occurs when this temporal order is reversed.</p>

<p>In this sense, STDP implements a temporally precise notion of Hebbian causality rather than mere correlation.</p>

<h2 id="stdp-and-bienenstock-cooper-munro-bcm-rule">STDP and Bienenstock-Cooper-Munro (BCM) rule</h2>
<p>The <a href="/blog/2024-09-08-bcm_rule/">Bienenstock-Cooper-Munro (BCM) rule</a> is a rate-based <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> model in which synaptic changes depend nonlinearly on postsynaptic activity relative to a sliding threshold. In its classical form, the BCM rule can be written as</p>

\[\frac{d w_{ij}}{d t} = r_i \, \phi(r_j),\]

<p>where $\phi(r_j)$ is a nonlinear function that changes sign at a postsynaptic activity threshold $\theta_M$. This threshold itself depends on the long-term average of postsynaptic activity.</p>

<p>Although STDP is formulated at the spike level, it can give rise to BCM-like behavior when averaged over stochastic spike trains. In particular, when neurons fire as Poisson processes and when extended STDP rules such as <a href="http://www.scholarpedia.org/article/Spike-timing_dependent_plasticity#Triplet_rule_of_STDP">triplet-based STDP</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> are used, the expected synaptic change becomes a nonlinear function of presynaptic and postsynaptic firing rates.</p>

<p>Under these conditions, the average weight change takes the form</p>

\[\langle \Delta w_{ij} \rangle \propto r_i \, r_j (r_j - \theta),\]

<p>where $\theta$ depends on the parameters of the STDP rule and the statistics of postsynaptic firing. This expression mirrors the structure of the BCM rule, with a sliding threshold emerging from spike timing statistics rather than being imposed explicitly.</p>
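<p>The character of this expression is easy to visualize. The following sketch evaluates the expected weight change for a fixed, illustrative threshold $\theta$ (in the full BCM rule, $\theta$ would itself slide with the average postsynaptic activity):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import matplotlib.pyplot as plt

r_i   = 5.0    # presynaptic rate [Hz] (illustrative)
theta = 10.0   # fixed threshold [Hz]; in BCM it slides with activity

r_j = np.linspace(0, 25, 200)        # postsynaptic rate [Hz]
dw  = r_i * r_j * (r_j - theta)      # expected weight change, up to a constant

plt.plot(r_j, dw)
plt.axhline(0, color='gray', lw=0.8)
plt.axvline(theta, color='gray', ls='--', lw=0.8)  # sign change at theta
plt.xlabel(r'$r_j$ [Hz]')
plt.ylabel(r'$\langle \Delta w_{ij} \rangle$ (a.u.)')
plt.show()
</code></pre></div></div>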

<p>STDP can therefore be understood as a spike-based mechanism from which rate-based learning rules such as BCM emerge as effective descriptions.</p>

<h2 id="functional-consequences-of-stdp">Functional consequences of STDP</h2>
<p>STDP refines the functional properties known from <a href="/blog/2025-08-28-rate_models/">rate models</a> by incorporating spike timing, which can lead to better temporal coding, reduced latency in neural responses, and inherent normalization of synaptic strengths. These features make STDP a powerful mechanism for <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a> and learning in neural networks:</p>

<dl>
  <dt><strong>Spike-Spike correlations:</strong></dt>
  <dd>STDP incorporates the correlation of spikes between pre- and postsynaptic neurons on a millisecond timescale. This spike-spike correlation is crucial for learning in STDP models, unlike in standard <a href="/blog/2025-08-28-rate_models/">rate models</a>, which neglect these correlations. This feature of STDP enhances learning by leveraging the precise timing of spikes.</dd>
  <dt><strong>Reduced latency:</strong></dt>
  <dd>STDP can reduce the latency of postsynaptic neuron firing in response to sequential presynaptic spikes. If a postsynaptic neuron is connected to multiple presynaptic neurons firing in a specific sequence, STDP will strengthen synapses with pre-before-post timing and weaken those with post-before-pre timing. This results in the postsynaptic neuron firing earlier over repeated stimuli, thus reducing response latency.</dd>
  <dt><strong>Temporal coding:</strong></dt>
  <dd>Due to its sensitivity to spike timing, STDP is effective in temporal coding paradigms. It can fine-tune synaptic connections for tasks like sound source localization, learning spatiotemporal spike patterns, and time-order coding. These applications showcase the ability of STDP to process and learn from the precise timing of neural events.</dd>
  <dt><strong>Implicit rate normalization:</strong></dt>
  <dd>Unlike rate-based Hebbian learning, which can lead to unbounded growth of synaptic strengths and firing rates, STDP inherently normalizes synaptic changes without requiring explicit renormalization. This intrinsic stability allows neurons to detect weak correlations in inputs while maintaining controlled synaptic growth and firing rates.</dd>
</dl>

<h2 id="conclusion">Conclusion</h2>
<p>Spike-timing-dependent plasticity provides a biologically grounded and mathematically precise framework for understanding synaptic learning in spiking neural networks. By linking synaptic modification to the causal structure of spike timing, STDP refines classical <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian learning</a> and connects naturally to established rate-based rules such as the <a href="/blog/2024-09-08-bcm_rule/">BCM model</a>.</p>

<p>Its ability to capture temporal structure, reduce response latency, and stabilize synaptic dynamics makes STDP a central mechanism in <a href="/blog/2026-02-04-neural_dynamics/">computational models</a> of <a href="/blog/2026-02-02-neural_plasticity_and_learning/">learning and memory</a>. As such, it forms an essential building block for more advanced plasticity frameworks, including multi-factor learning rules and neuromodulator-gated synaptic adaptation.</p>

<p>The complete code used in this blog post is available in this <a href="https://github.com/FabrizioMusacchio/neural_dynamics">Github repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (<code class="language-plaintext highlighter-rouge">stdp_weight_plot.py</code> and <code class="language-plaintext highlighter-rouge">stdp_simple_network_example.py</code>). Feel free to modify and expand upon it, and share your insights.</p>

<p class="notice--info"><strong>Follow-up:</strong> In the next post, we explore how STDP can be used for pattern recognition and learning in spiking neural networks: <a href="/blog/2026-02-16-nervos_stdp_snn_simulation_on_mnist/">Implementing a minimal spiking neural network for MNIST pattern recognition using <em>nervos</em></a>. This will allow us to see how STDP can be applied in a more complex setting with structured input and a learning task, while using a simple and efficient simulation framework called <a href="https://github.com/jsmaskeen/nervos"><em>nervos</em></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</p>

<h2 id="references-and-further-reading">References and further reading</h2>
<ul>
  <li>N. Caporale, &amp; Y. Dan, <em>Spike timing-dependent plasticity: a Hebbian learning rule</em>, 2008, Annu Rev Neurosci, Vol. 31, pages 25-46, doi: <a href="https://doi.org/10.1146/annurev.neuro.31.060407.125639">10.1146/annurev.neuro.31.060407.125639</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>G. Bi, M. Poo, <em>Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type</em>, 1998, Journal of neuroscience, doi: <a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">10.1523/JNEUROSCI.18-24-10464.1998</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Wulfram Gerstner, Werner M. Kistler, Richard Naud, and Liam Paninski, <em>Chapter 19 Synaptic Plasticity and Learning</em> in <em>Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition</em>, 2014, Cambridge University Press, ISBN: 978-1-107-06083-8, <a href="https://neuronaldynamics.epfl.ch/online/Ch19.html">free online version</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Robert C. Malenka, Mark F. Bear, <em>LTP and LTD</em>, 2004, Neuron, Vol. 44, Issue 1, pages 5-21, doi: <a href="https://doi.org/10.1016/j.neuron.2004.09.012">10.1016/j.neuron.2004.09.012</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Nicoll, <em>A Brief History of Long-Term Potentiation</em>, 2017, Neuron, Vol. 93, Issue 2, pages 281-290, doi: <a href="https://doi.org/10.1016/j.neuron.2016.12.015">10.1016/j.neuron.2016.12.015</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Jesper Sjöström, Wulfram Gerstner, <em>Spike-timing dependent plasticity</em>, 2010, Scholarpedia, 5(2):1362, doi: <a href="http://www.scholarpedia.org/article/Spike-timing_dependent_plasticity">10.4249/scholarpedia.1362</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Alfonso Araque, Vladimir Parpura, Rita P. Sanzgiri, Philip G. Haydon, <em>Tripartite synapses: glia, the unacknowledged partner</em>, 1999, Trends in Neurosciences, 22 (5): 208–215, doi: <a href="https://doi.org/10.1016%2Fs0166-2236%2898%2901349-6">10.1016/s0166-2236(98)01349-6</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Squadrani, Wert-Carvajal, Müller-Komorowska, Bohmbach, Henneberger, Verzelli, Tchumatchenko, <em>Astrocytes enhance plasticity response during reversal learning</em>, 2024, Communications Biology, Vol. 7, Issue 1, doi: <a href="https://doi.org/10.1038/s42003-024-06540-8">10.1038/s42003-024-06540-8</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; we discussed this paper in this <a href="/blog/2025-06-29-astrocyte_enhance_plasticity/">blog post</a>.</li>
  <li><a href="https://brian2.readthedocs.io/en/stable/examples/synapses.STDP.html">brain2 example tutorial on STDP</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</li>
</ul>

]]></content><author><name> </name></author><category term="Python" /><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[Another frequently used term in computational neuroscience is spike-timing-dependent plasticity or STDP. STDP is a form of synaptic plasticity that adjusts the strength of synaptic connections between neurons based on the relative timing of pre- and postsynaptic spikes. In this post, we briefly explore the concept of STDP and how it is implemented in neural modeling.]]></summary></entry><entry><title type="html">Revisiting the Moore’s law of Neuroscience, 15 years later</title><link href="/blog/2026-02-05-moores_law_for_neural_recordings/" rel="alternate" type="text/html" title="Revisiting the Moore’s law of Neuroscience, 15 years later" /><published>2026-02-05T06:03:51+01:00</published><updated>2026-02-05T06:03:51+01:00</updated><id>/blog/moores_law_for_neural_recordings</id><content type="html" xml:base="/blog/2026-02-05-moores_law_for_neural_recordings/"><![CDATA[<p>I recently stumbled over a small but remarkably forward-looking paper from 2011 that I had not read before. Its central claim is that neuroscience has its own version of Moore’s law, at least when it comes to the number of neurons that can be recorded simultaneously.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Buccino_et_al_2025_fig3.jpg" title="Figure 3 (panel a) from Alessio P. Buccino et al. (2025) shows the output of the visualization and quality control stage in a modern large-scale spike sorting pipeline."><img src="/assets/images/posts/nest/Buccino_et_al_2025_fig3.jpg" width="100%" alt="Figure 3 (panel a) from Alessio P. Buccino et al. (2025) shows the output of the visualization and quality control stage in a modern large-scale spike sorting pipeline." /></a>
With today’s techniques, such as Neuropixels probes, it is possible to record from thousands of neurons simultaneously. Figure 3 (panel a) from <a href="https://doi.org/10.1101/2025.11.12.687966">Alessio P. Buccino et al. (2025)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> shows the output of the visualization and quality control stage in a modern large-scale spike sorting pipeline for such large data sets. The panel displays raw (left) and preprocessed (right) electrophysiological data recorded with Neuropixels 2.0 probes, highlighting the dense spatiotemporal structure of the recorded signals and the necessity of scalable preprocessing, inspection, and quality control before spike sorting and downstream analysis. The figure exemplifies the practical data volumes and organizational challenges that accompany contemporary high-density neural recordings. Stevenson and Kording predicted such developments over a decade ago by noting the exponential growth in simultaneously recorded neurons. Source: Figure 3 from Buccino et al., <em>Efficient and reproducible pipelines for spike sorting large-scale electrophysiology data</em>, 2025, bioRxiv 2025.11.12.687966, doi: <a href="https://doi.org/10.1101/2025.11.12.687966">10.1101/2025.11.12.687966</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0)

<p>This observation was articulated by Ian H. Stevenson and Konrad Kording in their <a href="https://doi.org/10.1038/nn.2731"><em>Nature Neuroscience</em></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> perspective <em>How advances in neural recording affect data analysis</em>. Based on a systematic survey of the electrophysiology literature, they argued that the number of simultaneously recorded neurons has been growing exponentially for decades, with a doubling time of roughly seven years.</p>

<p>At first glance, this sounds like a catchy analogy. But the paper makes a deeper point: If this scaling holds, it fundamentally reshapes what kinds of data analysis, models, and theories are viable in computational neuroscience.</p>

<h2 id="the-empirical-law">The empirical “law”</h2>
<p>Stevenson and Kording examined 56 studies spanning roughly five decades, starting with early single-electrode recordings in the 1950s and extending to then-modern multi-electrode arrays and optical techniques. When plotting the maximum number of simultaneously recorded neurons reported in each era, they found an approximately straight line on a logarithmic scale. The fitted doubling time was about 7.4 ± 0.4 years.</p>
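<p>To make this concrete, here is a minimal Python sketch of how such a doubling time can be estimated: fit a straight line to the base-2 logarithm of the reported neuron counts as a function of publication year, and invert the slope. Note that the data points below are purely illustrative placeholders that I made up to mimic the trend; they are not the values from the original survey:</p>

<pre><code class="language-python">import numpy as np

# hypothetical (year, simultaneously recorded neurons) data points,
# loosely mimicking the exponential trend; NOT the original dataset:
years   = np.array([1960, 1970, 1980, 1990, 2000, 2010])
neurons = np.array([1, 2, 8, 16, 60, 150])

# fit a straight line to log2(N) vs. year; the inverse slope is the doubling time:
slope, intercept = np.polyfit(years, np.log2(neurons), 1)
print(f"estimated doubling time: {1.0 / slope:.1f} years")

# extrapolate the fitted trend 15 years beyond the last data point:
print(f"extrapolated count for 2025: {2 ** (slope * 2025 + intercept):.0f} neurons")
</code></pre>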

<p class="align-caption"><a href="/assets/images/posts/nest/Urai_et_al_2021_fig2.png" title="Figure 2 from Anne E. Urai et al. (2021) summarizes the scaling of neural recording technologies across modalities and species."><img src="/assets/images/posts/nest/Urai_et_al_2021_fig2.png" width="100%" alt="Figure 2 from Anne E. Urai et al. (2021) summarizes the scaling of neural recording technologies across modalities and species." /></a>
A more recent study by <a href="https://doi.org/10.48550/arXiv.2103.14662">Anne E. Urai et al. (2021)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> confirms and extends the pattern identified by Stevenson and Kording. Here is their Figure 2, summarizing the scaling of neural recording technologies across modalities and species from the 1950s to the 2020s. Blue points indicate the number of simultaneously recorded neurons using electrophysiological methods, including the original dataset analyzed by <a href="https://doi.org/10.1038/nn.2731">Stevenson &amp; Kording (2011)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, shown as squares. Red points show population sizes achieved with optical imaging techniques such as <a href="/blog/2024-08-30-3photon_imaging_preprint/">two-photon</a> and light-sheet microscopy. The solid line represents an exponential fit to the original electrophysiology data, while dashed and dotted black lines extrapolate this trend to the present and near future. Gray reference lines mark the approximate number of neurons in the brains of commonly studied model organisms. Not only are Stevenson and Kording’s observations upheld, but the trend also extends to optical methods, even further accelerating the growth in recorded neuron counts. Source: Anne E. Urai, Brent Doiron, Andrew M. Leifer, Anne K. Churchland, <em>Large-scale neural recordings call for new insights to link brain and behavior</em>, arXiv:2103.14662, doi: 
<a href="https://doi.org/10.48550/arXiv.2103.14662">10.48550/arXiv.2103.14662</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, Figure available at <a href="https://github.com/anne-urai/largescale_recordings">github.com/anne-urai/largescale_recordings</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC-BY)</p>

<p>To be clear here: This is not presented as a law of nature. It is an empirical regularity, contingent on technological innovation. The authors explicitly compare it to Moore’s law not because it is guaranteed to persist forever, but because it is a useful abstraction for thinking about scaling. Just as Moore’s law shaped algorithm design in computer science, exponential growth in recording capacity should shape how we design neural data analysis methods.</p>

<p class="align-caption"><a href="https://upload.wikimedia.org/wikipedia/commons/6/62/Moore%27s_Law_over_120_Years.png" title="Moore's Law over 120 Years."><img src="https://upload.wikimedia.org/wikipedia/commons/6/62/Moore%27s_Law_over_120_Years.png" width="100%" alt="Moore's Law over 120 Years." /></a>
For comparison, Moore’s law states that the number of transistors on integrated circuits doubles approximately every two years. This empirical observation, made by Gordon Moore in 1965, has held remarkably well for several decades, driving exponential growth in computing power and efficiency. Source: <a href="https://w.wiki/HjVf">Wikimedia Commons</a> (license: CC BY-SA 2.0)</p>

<p>They also emphasize that the growth is driven by multiple, mutually reinforcing factors: Advances in electrode fabrication, automated silicon processing, wiring density, data acquisition hardware, storage, and transfer rates. Many of these improvements would have seemed implausible a few decades earlier.</p>

<h2 id="a-concrete-prediction">A concrete prediction</h2>
<p>One of the most memorable parts of the paper is its forward extrapolation. If the seven-year doubling trend continued, physiologists should be able to record from on the order of 1,000 neurons simultaneously within about 15 years. The authors explicitly state that this seemed feasible even in 2011, based on existing micro-wire arrays and the rapid development of <a href="/blog/2024-08-30-3photon_imaging_preprint/">two- and three-photon calcium imaging</a>.</p>

<div class="notice--info">
<h3 style="margin-top: 0.0em;padding-top: 0.0em;">How Neuropixels probes enable large-scale neural recordings</h3>

<p>Neuropixels probes are high-density silicon electrode arrays designed to record extracellular activity from large neural populations with single-neuron resolution. The figure below from <a href="https://doi.org/10.1101/2020.10.27.358291">Steinmetz et al. (2020)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> illustrates the key design principles and recording capabilities of Neuropixels 2.0.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Steinmetz_et_al_2020_fig1.jpg" title="Neuropixels 2.0 enables dense, large-scale extracellular recordings. Figure 1 from Steinmetz et al. (2020)."><img src="/assets/images/posts/nest/Steinmetz_et_al_2020_fig1.jpg" width="100%" alt="Neuropixels 2.0 enables dense, large-scale extracellular recordings. Figure 1 from Steinmetz et al. (2020)." /></a>
Neuropixels 2.0 enables dense, large-scale extracellular recordings. Shown is Figure 1 from <a href="https://doi.org/10.1101/2020.10.27.358291">Steinmetz et al. (2020)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> showing how Neuropixels probes support population-level recordings with single-neuron resolution.  The figure illustrates the miniaturized high-density probe architecture, example raw extracellular signals and spike waveforms, validation using auto- and cross-correlograms, and large-scale spiking activity recorded across thousands of sites spanning multiple brain regions. Source: Figure 1 from Steinmetz et al., <em>Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings</em>, 2020, bioRxiv 2020.10.27.358291, doi: <a href="https://doi.org/10.1101/2020.10.27.358291">10.1101/2020.10.27.358291</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0)</p>

<p><strong>Miniaturized, high-density probe design (panel A)</strong><br />
Neuropixels probes integrate thousands of recording sites onto one or multiple thin silicon shanks. The name “Neuropixels” reflects that each recording site acts as an individual spatial sampling element, analogous to a pixel in an image sensor, but measuring extracellular voltage instead of light intensity. Neuro-<em>pixels</em> are therefore not optical sensors such as miniature cameras, LEDs, or photodiodes, but dense arrays of microelectrodes for recording neural activity. Each site consists of a microscopic metal electrode connected to on-chip amplification, filtering, and multiplexing circuitry. Neuronal action potentials generate local extracellular voltage deflections that are detected simultaneously by multiple nearby sites, creating spatially distributed spike waveforms.</p>

<p>Neuropixels probes are inserted directly into the brain tissue along the shank axis (see panels E and F, schematics on the left), typically using micromanipulators that allow precise positioning across cortical and subcortical regions. Importantly, individual recording sites do not correspond one-to-one to individual neurons. Instead, each electrode samples the superposition of extracellular signals from multiple nearby neurons, with signal amplitude decaying with distance from the electrode. As a consequence, spikes from a single neuron are usually observed across several adjacent sites, while each site records contributions from multiple neurons. Post-processing algorithms exploit this spatial redundancy to isolate and identify individual neurons from the mixed signals.</p>

<p>Compared to Neuropixels 1.0, Neuropixels 2.0 increases site density while reducing the size and weight of the base and headstage, enabling chronic recordings in freely moving animals. Each probe contains several thousand electrodes distributed along the shank with micrometer spacing, allowing dense spatial sampling across cortical and subcortical structures. In post hoc analysis, this redundancy across channels is exploited to separate and localize individual neurons by clustering multi-channel spike waveforms, a process known as spike sorting.</p>

<p><strong>Extracellular signal acquisition (panels B and C)</strong><br />
Shown are the raw local extracellular voltage traces recorded by each electrode site (panel B) along with example spike waveforms from individual neurons (panel C). These voltage fluctuations are caused by nearby neuronal activity. Low-frequency components reflect local field potentials, while high-frequency components correspond to action potentials. Because spikes from a single neuron are detected on multiple nearby channels, characteristic spatial waveform patterns emerge, as shown by example waveforms spanning overlapping sites.</p>

<p><strong>Spike identity and timing validation (panel D)</strong><br />
Auto- and cross-correlograms are used to assess refractory periods and temporal relationships between units. The presence of a refractory period in autocorrelograms provides a biophysical constraint for identifying single neurons, while cross-correlograms reveal shared inputs or functional interactions between units.</p>

<p><strong>Large-scale, multi-region recordings (panel E)</strong><br />
Neuropixels probes can record from hundreds of channels simultaneously and access thousands of sites sequentially via on-chip switching. In the example shown, activity from more than 6,000 recording sites was obtained using two probes implanted in a single mouse. This design enables dense sampling across extended depths of the brain, spanning multiple regions within one experiment.</p>

<p><strong>Population-level dynamics on single trials (panel F)</strong><br />
Dense spatial coverage allows structured spiking patterns to be observed across large neuronal populations on individual trials. Reproducible spatiotemporal spike sequences, such as those observed in dorsal striatum during behavior, illustrate how Neuropixels recordings support population-level analyses that go beyond single-neuron tuning.</p>

<p>Neuropixels probes have become a central technology for large-scale electrophysiology. By dramatically increasing the number of simultaneously recorded neurons while maintaining signal quality and spatial resolution, they exemplify the technological scaling that motivates new analysis methods in modern computational neuroscience.</p>

<p>Currently, there is even an optogenetic version of Neuropixels probes under development (see <a href="https://doi.org/10.1101/2025.02.04.636286">Lakunina et al. (2025)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>), which would further expand their capabilities by enabling simultaneous recording and manipulation of neural activity with high spatial precision.</p>
</div>

<p>They also stress the limits of such extrapolation. Tissue displacement, toxicity, bleaching, spike sorting, and spatial constraints all impose hard biological and practical boundaries. The famous thought experiment of recording from all ~100 billion neurons in the human brain is acknowledged as absurd when extrapolated linearly, yet the authors deliberately remind us how often similar extrapolations in computing once sounded absurd as well.</p>

<p>The important message is not the endpoint, but the regime change that happens long before any extreme limit is reached.</p>

<h2 id="why-scaling-changes-the-analysis">Why scaling changes the analysis</h2>
<p>The second half of the paper turns from technology to computation. Stevenson and Kording ask a very specific question: How does spike prediction accuracy scale with the number of simultaneously recorded neurons, and how does this depend on the class of <a href="/blog/2026-02-04-neural_dynamics/">model</a> being used?</p>

<p>They contrast two dominant approaches.</p>

<p>The first treats neurons independently. Classical tuning curve and receptive field models fall into this category. Each neuron’s firing rate is <a href="/blog/2026-02-04-neural_dynamics/">modeled</a> as a function of external variables such as stimulus orientation or movement direction. Because neurons are fit independently, adding more neurons does not improve spike prediction accuracy for any single neuron. The accuracy remains essentially constant as population size grows.</p>

<p>The second class explicitly models interactions between neurons. Pairwise coupling models, implemented here as linear-nonlinear Poisson models with interaction terms, allow the activity of other recorded neurons to influence spike probability. In this case, prediction accuracy increases with the number of recorded neurons. In both motor and visual cortex datasets, the authors observed an approximately logarithmic scaling with population size.</p>
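<p>To illustrate this contrast, here is a small Python sketch (my own toy construction, not the authors’ actual analysis). It simulates a population driven by a measured stimulus plus a shared unobserved input, and then fits two Poisson regression models for one target neuron: a “tuning-curve-like” model that only sees the stimulus, and a “coupling” model that additionally sees the other recorded neurons. It assumes <code>scikit-learn</code> is installed; all parameters are arbitrary demonstration values:</p>

<pre><code class="language-python">import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(0)
T, N = 2000, 20  # time bins, number of recorded neurons (toy values)

# toy data: a measured stimulus plus a shared unobserved input drive the
# population, so other neurons carry information beyond the stimulus alone:
stimulus = rng.normal(size=T)
shared = rng.normal(size=T)  # unobserved common input
drive = (0.4 * stimulus[:, None] + 0.6 * shared[:, None]
         + 0.3 * rng.normal(size=(T, N)))
counts = rng.poisson(np.exp(drive))
target = counts[:, 0]  # the neuron whose spikes we try to predict

# "independent" model: stimulus only (classical tuning-curve style):
X_indep = stimulus[:, None]
indep = PoissonRegressor(alpha=1e-3).fit(X_indep, target)

# "coupling" model: stimulus plus the activity of all other neurons:
X_coup = np.column_stack([stimulus, counts[:, 1:]])
coup = PoissonRegressor(alpha=1e-3).fit(X_coup, target)

print("stimulus-only D^2:", round(indep.score(X_indep, target), 3))
print("with coupling D^2:", round(coup.score(X_coup, target), 3))
</code></pre>

<p>Because the simulated population shares an unobserved common input, the coupling model achieves the higher deviance-based score, qualitatively mirroring the benefit of interaction models reported in the paper.</p>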

<p>This result has an important caveat that the paper makes very clear. The observed log N scaling occurs in highly undersampled recordings. In more complete recordings, where most relevant inputs are observed, prediction accuracy is expected to saturate. The scaling also depends on spatial scale, correlation strength, and how neurons are distributed across the tissue.</p>

<p>Still, the qualitative conclusion is robust: Interaction models benefit from larger populations in a way that independent models do not.</p>

<h2 id="latent-state-spaces-as-a-way-out">Latent state spaces as a way out</h2>
<p>The paper also highlights a third approach that has since become central to population neuroscience: <a href="/blog/2026-02-02-neural_plasticity_and_learning/">Low-dimensional latent variable or state-space models</a>. Rather than modeling all pairwise interactions explicitly, these methods assume that population activity is driven by a small number of hidden factors. <a href="/blog/2024-10-24-dimensionality_reduction_in_neuroscience/">Dimensionality reduction</a> and regularization are framed not as optional conveniences, but as necessities imposed by scaling and the curse of dimensionality.</p>
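<p>As a minimal sketch of this idea, the following Python snippet simulates a population whose firing rates are driven by a few smooth latent factors and then checks, via plain PCA, how much variance a handful of principal components captures. Real analyses use more elaborate state-space methods such as Gaussian process factor analysis; all numbers below are illustrative assumptions:</p>

<pre><code class="language-python">import numpy as np

rng = np.random.default_rng(1)
T, N, D = 1000, 100, 3  # time bins, neurons, true latent dimensionality

# toy population activity driven by D smooth (random-walk) latent factors:
latents = np.cumsum(rng.normal(size=(T, D)), axis=0)
loading = rng.normal(size=(D, N))
rates = latents @ loading + 0.5 * rng.normal(size=(T, N))

# PCA via SVD of the mean-centered activity matrix:
X = rates - rates.mean(axis=0)
s = np.linalg.svd(X, compute_uv=False)
var_explained = s**2 / np.sum(s**2)

print("variance explained by the first 3 PCs:",
      round(var_explained[:3].sum(), 3))
</code></pre>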

<p class="align-caption"><a href="/assets/images/posts/nest/Komi_et_al_2025_fig3.jpg" title="Figure 3 (panels a–h) from S. Komi et al. (2025) demonstrates low-dimensional neural manifolds underlying locomotion and stopping."><img src="/assets/images/posts/nest/Komi_et_al_2025_fig3.jpg" width="100%" alt="Figure 3 (panels a–h) from S. Komi et al. (2025) demonstrates low-dimensional neural manifolds underlying locomotion and stopping." /></a>
Figure 3 (panels A–H) from <a href="https://doi.org/10.1101/2025.11.08.687367">S. Komi et al. (2025)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> demonstrates low-dimensional <a href="/blog/2026-02-02-neural_plasticity_and_learning/">neural manifolds</a> underlying locomotion and stopping. The figure illustrates how population activity in spinal motor circuits can be described by structured trajectories in a low-dimensional latent space. <strong>Panel A</strong> shows behavioral and neural data during walk-to-stop transitions, including pose metrics and phase-aligned spinal spike rasters. <strong>Panel B</strong> quantifies dimensionality reduction, showing that the first three principal components explain more than 80% of the variance in population firing rates, indicating strong low-dimensional structure. <strong>Panels C–E</strong> depict state-space trajectories during locomotion, forming a ring-shaped “locomotor manifold” with phase-dependent dynamics and a consistent flow direction, corresponding to a limit-cycle attractor. Persistent homology analysis confirms the ring topology. <strong>Panels F–H</strong> show that stopping behavior corresponds to a transition into a distinct postural manifold characterized by fixed-point attractor dynamics. Overall, this example shows how large-scale neural recordings can be reduced to interpretable latent dynamics that link population activity to behavior. Such manifold-based interpretations of neural activity are a direct response to the challenges of scaling and have become a central framework in modern <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a>. Source: Figure 3 from S. Komi, J. Kaur, A. Winther, M. C. Adamsson Bonfils, G. A. Houser, R. J. F. Sørensen, G. Li, K. Sobriel, R. W. Berg, <em>Neural manifolds that orchestrate walking and stopping</em>, 2025, bioRxiv 2025.11.08.687367, doi: <a href="https://doi.org/10.1101/2025.11.08.687367">10.1101/2025.11.08.687367</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0)

<p>In retrospect, this section reads almost like a roadmap for the following decade. Gaussian process factor analysis, latent <a href="/blog/2026-02-04-neural_dynamics/">dynamical systems</a>, <a href="/blog/2026-02-02-neural_plasticity_and_learning/">population trajectories</a>, and <a href="/blog/2026-02-02-neural_plasticity_and_learning/">manifold-based interpretations of neural activity</a> all fit squarely into the framework Stevenson and Kording sketched in 2011.</p>

<h2 id="fifteen-years-later">Fifteen years later</h2>
<p>From today’s perspective, roughly fifteen years after publication, the core prediction has largely held. Simultaneous recordings of hundreds to thousands of neurons are now routine with Neuropixels probes and <a href="/blog/2024-08-30-3photon_imaging_preprint/">advanced optical imaging</a>. Long-term, stable recordings across days or weeks are no longer exotic. What has not materialized in the same way is straightforward whole-brain spike recording, but that was never the real claim.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Pachitariu_et_al_2024_fig1.jpg" title="Figure 1 (panels a–j) from Marius Pachitariu et al. (2024) illustrates the spike detection and feature extraction pipeline used in Kilosort4, exemplifying the analytical complexity introduced by large-scale neural recordings."><img src="/assets/images/posts/nest/Pachitariu_et_al_2024_fig1.jpg" width="100%" alt="Figure 1 (panels a–j) from Marius Pachitariu et al. (2024) illustrates the spike detection and feature extraction pipeline used in Kilosort4, exemplifying the analytical complexity introduced by large-scale neural recordings." /></a>
Figure 1 (panels a–j) from <a href="https://doi.org/10.1038/s41592-024-02232-7">Marius Pachitariu et al. (2024)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> illustrates the spike detection and feature extraction pipeline used in Kilosort4, exemplifying the analytical complexity introduced by large-scale neural recordings today. The figure shows how dense, high-channel-count recordings require multi-stage processing to separate overlapping spikes and extract meaningful features at scale. <strong>Panel a</strong> outlines the pipeline from initial spike detection using simple templates to refined, background-corrected features suitable for clustering. <strong>Panel b</strong> shows raw preprocessed Neuropixels data with frequent temporal and spatial spike overlap. <strong>Panels c and d</strong> compare predefined simple templates with learned templates adapted to the data. <strong>Panels e and f</strong> demonstrate signal reconstruction and residuals, highlighting structured activity beyond noise. <strong>Panels g–i</strong> visualize how feature representations improve with learned templates and background subtraction. <strong>Panel j</strong> shows the spatial distribution of extracted spikes along the probe. Together, the figure illustrates how advances in recording density necessitate increasingly sophisticated analysis methods. Source: Pachitariu et al., <em>Spike sorting with Kilosort4</em>, 2024, Nature Methods, 914–921, doi: <a href="https://doi.org/10.1038/s41592-024-02232-7">10.1038/s41592-024-02232-7</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0)

<p>More importantly, the computational consequences the authors anticipated have fully arrived. Interaction models, latent-variable approaches, and <a href="/blog/2026-02-02-neural_plasticity_and_learning/">population-level dynamical analyses</a> now dominate much of systems and <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a>. At the same time, the challenges they emphasized (computational cost, statistical identifiability, and scaling behavior) remain central and unresolved in many contexts.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Pachitariu_et_al_2024_fig2.jpg" title="Figure 2 (panels a–i) from Marius Pachitariu et al. (2024) shows graph-based clustering strategies used in Kilosort4 to structure large-scale spike datasets."><img src="/assets/images/posts/nest/Pachitariu_et_al_2024_fig2.jpg" width="100%" alt="Figure 2 (panels a–i) from Marius Pachitariu et al. (2024) shows graph-based clustering strategies used in Kilosort4 to structure large-scale spike datasets." /></a>
Figure 2 (panels a–i) from <a href="https://doi.org/10.1038/s41592-024-02232-7">Marius Pachitariu et al. (2024)</a> shows graph-based clustering strategies used in Kilosort4 to structure large-scale spike datasets. The figure illustrates how dense, high-dimensional spike features are iteratively reassigned and merged to obtain stable clusters from large neural populations. <strong>Panel a</strong> sketches the neighbor-based reassignment process that progressively reduces an initially large set of clusters. <strong>Panel b</strong> shows an example clustering overlaid on a t-SNE embedding of spike features. <strong>Panel c</strong> presents the hierarchical merging tree used to decide which clusters should be combined based on a modularity cost. <strong>Panel d</strong> summarizes the criteria for accepting or rejecting merges, combining feature-space bimodality with refractory-period constraints derived from spike timing. <strong>Panels e and f</strong> show the final clustering result, highlighting units that exhibit refractory periods. <strong>Panels g and h</strong> characterize the resulting units using average waveforms, autocorrelograms, cross-correlograms, and regression projections. <strong>Panel i</strong> visualizes the spatial distribution of clustered spikes along the probe. Together, the figure exemplifies how modern spike sorting algorithms impose structure on massive datasets by combining graph methods, statistical criteria, and biophysical constraints. Source: Pachitariu et al., <em>Spike sorting with Kilosort4</em>, 2024, Nature Methods, 914–921, doi: <a href="https://doi.org/10.1038/s41592-024-02232-7">10.1038/s41592-024-02232-7</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0)</p>

<h2 id="a-personal-takeaway">A personal takeaway</h2>
<p>What strikes me most reading this paper today is not how bold it was, but how measured. Stevenson and Kording were not selling a technological fantasy. They were issuing a methodological warning. Data will keep growing. Models that ignore scaling will quietly fail. Models that exploit structure, regularization, and low-dimensionality stand a chance.</p>

<p>If neuroscience really does have its own Moore’s law, then the obligation for <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a> is clear. We cannot afford to treat population size as a secondary detail. It is a defining constraint that shapes which theories are even expressible, let alone testable.</p>

<h2 id="references-and-further-reading">References and further reading</h2>
<ul>
  <li>Ian H Stevenson, Konrad P Kording, <em>How advances in neural recording affect data analysis</em>, 2011, Nature Neuroscience, Vol. 14, Issue 2, pages 139-142, doi: <a href="https://doi.org/10.1038/nn.2731">10.1038/nn.2731 </a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Buccino et al., <em>Efficient and reproducible pipelines for spike sorting large-scale electrophysiology data</em>, 2025, bioRxiv 2025.11.12.687966, doi: <a href="https://doi.org/10.1101/2025.11.12.687966">10.1101/2025.11.12.687966</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Anne E. Urai, Brent Doiron, Andrew M. Leifer, Anne K. Churchland, <em>Large-scale neural recordings call for new insights to link brain and behavior</em>, arXiv:2103.14662, doi: 
<a href="https://doi.org/10.48550/arXiv.2103.14662">10.48550/arXiv.2103.14662</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Marius Pachitariu, Shashwat Sridhar, Jacob Pennington, Carsen Stringer, <em>Spike sorting with Kilosort4</em>, 2024, Nature Methods, Vol. 21, Issue 5, pages 914-921, doi: <a href="https://doi.org/10.1038/s41592-024-02232-7">10.1038/s41592-024-02232-7</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Marblestone AH, Zamft BM, Maguire YG, Shapiro MG, Cybulski TR, Glaser JI, Amodei D, Stranges PB, Kalhor R, Dalrymple DA, Seo D, Alon E, Maharbiz MM, Carmena JM, Rabaey JM, Boyden ES, Church GM and Kording KP, <em>Physical principles for scalable neural recording</em>, 2013, Front. Comput. Neurosci. 7:137. doi: <a href="https://doi.org/10.3389/fncom.2013.00137">10.3389/fncom.2013.00137</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Nicholas A. Steinmetz, Cagatay Aydin, Anna Lebedeva, Michael Okun, Marius Pachitariu, Marius Bauza, Maxime Beau, Jai Bhagat, Claudia Böhm, Martijn Broux, Susu Chen, Jennifer Colonell, Richard J. Gardner, Bill Karsh, Dimitar Kostadinov, Carolina Mora-Lopez, Junchol Park, Jan Putzeys, Britton Sauerbrei, Rik J. J. van Daal, Abraham Z. Vollan, Marleen Welkenhuysen, Zhiwen Ye, Joshua Dudman, Barundeb Dutta, Adam W. Hantman, Kenneth D. Harris, Albert K. Lee, Edvard I. Moser, John O’Keefe, Alfonso Renart, Karel Svoboda, Michael Häusser, Sebastian Haesler, Matteo Carandini, Timothy D. Harris, <em>Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings</em>, 2020, bioRxiv 2020.10.27.358291, doi: <a href="https://doi.org/10.1101/2020.10.27.358291">10.1101/2020.10.27.358291</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> / published in Science 372, eabf4588 (2021), doi: <a href="https://doi.org/10.1126/science.abf4588">10.1126/science.abf4588</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Anna Lakunina, Karolina Z Socha, Alexander Ladd, Anna J Bowen, Susu Chen, Jennifer Colonell, Anjal Doshi, Bill Karsh, Michael Krumin, Pavel Kulik, Anna Li, Pieter Neutens, John O’Callaghan, Meghan Olsen, Jan Putzeys, Charu Bai Reddy, Harrie AC Tilmans, Sara Vargas, Marleen Welkenhuysen, Zhiwen Ye, Michael Häusser, Christof Koch, Jonathan T. Ting, Neuropixels Opto Consortium, Barundeb Dutta, Timothy D Harris, Nicholas A Steinmetz, Karel Svoboda, Joshua H Siegle, Matteo Carandini, <em>Neuropixels Opto: Combining high-resolution electrophysiology and optogenetics</em>, 2025, bioRxiv 2025.02.04.636286, doi: <a href="https://doi.org/10.1101/2025.02.04.636286">10.1101/2025.02.04.636286</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>S. Komi, J. Kaur, A. Winther, M. C. Adamsson Bonfils, G. A. Houser, R. J. F. Sørensen, G. Li, K. Sobriel, R. W. Berg, <em>Neural manifolds that orchestrate walking and stopping</em>, 2025, bioRxiv 2025.11.08.687367, doi: <a href="https://doi.org/10.1101/2025.11.08.687367">10.1101/2025.11.08.687367</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
</ul>

<!-- 
Write a Mastodon post summarizing this article in an objective, academic tone. Don't write ABOUT the article, but about its content/topic. (max. 450 characters + URL, which follows this scheme: https://www.fabriziomusacchio.com/blog/[FILE-NAME_WITHOUT_FILE-EXTENSION]/):

Just figured out that #neuroscience has its own version of Moore's law: The number of simultaneously recorded #neurons doubles every ~7 years. This scaling has profound implications for #DataAnalysis and #modeling in #ComputationalNeuroscience. In this post, I review Stevenson & Kording's 2011 paper and reflect on its relevance today:

🌍 https://www.fabriziomusacchio.com/blog/2026-02-05-moores_law_for_neural_recordings/

#CompNeuro 
-->]]></content><author><name> </name></author><category term="Data Science" /><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[Just figured out that neuroscience appears to have its own version of Moore's law, at least when it comes to the number of neurons that can be recorded simultaneously. This empirical scaling has profound implications for data analysis, modeling, and theory in computational neuroscience. In this post, we briefly review the original 2011 paper by Stevenson and Kording and reflect on its relevance today.]]></summary></entry><entry><title type="html">Neural Dynamics: A definitional perspective</title><link href="/blog/2026-02-04-neural_dynamics/" rel="alternate" type="text/html" title="Neural Dynamics: A definitional perspective" /><published>2026-02-04T13:03:51+01:00</published><updated>2026-02-04T13:03:51+01:00</updated><id>/blog/neural_dynamics</id><content type="html" xml:base="/blog/2026-02-04-neural_dynamics/"><![CDATA[<p>I think it is finally time to define the term “neural dynamics” as I understand it and use it on this blog. The motivation for doing so is both practical and personal. On the practical side, the terms “neural dynamics” and “computational neuroscience” are often used interchangeably, which tends to obscure their respective scope and meaning. On the personal side, I have previously written similar definitional overview posts for other fields, such as <a href="/blog/2020-08-23-space_physics/">space plasma physics</a> and <a href="/blog/2021-03-04-hydrodynamics/">hydrodynamics</a>. These <a href="/blog/2022-10-06-feynman_method/">exercises</a> turned out to be useful, primarily because they forced me to make explicit how I mentally structure a field, which topics I consider central, and how different subareas relate to one another. For this reason, it seemed worthwhile to do the same for neural dynamics.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/fitzhugh_nagumo_model_z_-1.2017543859649122_-0.6271929824561404_mu_3.0_I_0.4.png" title="Phase plane (left) and time series (right) of an action potential generated by the FitzHugh–Nagumo model."><img src="/assets/images/posts/nest/fitzhugh_nagumo_model_z_-1.2017543859649122_-0.6271929824561404_mu_3.0_I_0.4.png" width="49%" alt="Phase plane (left) and time series (right) of an action potential generated by the FitzHugh–Nagumo model." /></a>
<a href="/assets/images/posts/nest/fitzhugh_nagumo_voltage_curve_-1.2017543859649122_-0.6271929824561404_mu_3.0_I_0.4.png" title="Phase plane (left) and time series (right) of an action potential generated by the FitzHugh–Nagumo model."><img src="/assets/images/posts/nest/fitzhugh_nagumo_voltage_curve_-1.2017543859649122_-0.6271929824561404_mu_3.0_I_0.4.png" width="49%" alt="Phase plane (left) and time series (right) of an action potential generated by the FitzHugh–Nagumo model." /></a><br />
<a href="/blog/2024-03-17-phase_plane_analysis/">Phase plane</a> (left) and time series (right) of an action potential generated by the <a href="/blog/2024-04-07-fitzhugh_nagumo_model/">FitzHugh–Nagumo model</a>. The left panel shows the nullclines (blue and orange lines) and the trajectory of the system in the phase plane (black line). The right panel shows the membrane potential (voltage) as a function of time, illustrating the rapid rise and fall characteristic of an action potential. Neural dynamics is largely concerned with understanding how such action potentials arise from the underlying biophysical and network dynamics. However, it also goes beyond and studies the dynamics of, e.g., neuronal populations, <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a>, and learning. In this post, we provide a definitional overview of the field of neural dynamics in order to situate it within the broader context of computational neuroscience and clarify some common misconceptions.</p>

<p>The overview provided in this post is not intended as a textbook chapter, nor as a canonical or exhaustive definition of the field. It is a personal attempt to structure themes, methods, and historical developments in a way that reflects my own focus and trajectory. And: Everything summarized here should be understood as provisional. Both the scientific field and my own understanding of it will continue to evolve, and this overview will almost certainly be extended, refined, or corrected over time. So, if you proceed to read this, please keep in mind that it is a living document rather than a definitive account.</p>

<p class="notice--info"><strong>Acknowledgments:</strong> My main knowledge of neural dynamics comes from the textbook <a href="https://neuronaldynamics.epfl.ch/online/index.html"><em>Neural Dynamics</em></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> by Wulfram Gerstner and colleagues (2014) (<a href="#references-and-further-reading">among others</a>). I can highly recommend this book to anyone interested in the topic. It provides a comprehensive and mathematically rigorous introduction to the field, covering single neuron dynamics, network models, <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity mechanisms</a>, and links to cognition. Many of the themes and structures outlined here are inspired by this work.</p>

<h2 id="what-is-neural-dynamics">What is neural dynamics?</h2>
<p>Neural dynamics is not identical with computational neuroscience, even though the two are closely related and often confused with each other. Computational neuroscience is best understood as a broad methodological and conceptual framework that uses mathematics, physics, and computational methods to study nervous systems. Within this framework we find a wide range of topics, including biophysical neuron models, network models, <a href="/blog/2026-02-02-neural_plasticity_and_learning/">learning and plasticity</a>, information coding and decoding, perception and decision making, statistical inference, and data analysis of neural recordings.</p>

<p class="align-caption"><a href="/assets/images/posts/integrate_and_fire_model/action_potential.gif" title="Animation of an  action potential."><img src="/assets/images/posts/integrate_and_fire_model/action_potential.gif" width="100%" alt="Animation of an  action potential." /></a><br />
Propagation of an action potential along an axon. An action potential travels along the axon as a spatiotemporal wave of membrane depolarization and repolarization. When the membrane potential reaches threshold, voltage-gated sodium (Na⁺) channels open, leading to rapid depolarization as Na⁺ ions flow into the axon. This is followed by repolarization, driven by the opening of potassium (K⁺) channels and outward K⁺ currents. The resulting change in membrane polarity propagates unidirectionally toward the axon terminal, where it can influence downstream neurons. Neural dynamics studies such processes as dynamical systems, describing how action potentials emerge from underlying biophysical mechanisms, how they propagate in space and time, and how similar principles extend to <a href="/blog/2026-02-12-stdp/#synapse">synaptic interactions</a>, neuronal populations, and network-level activity. Source: <a href="https://w.wiki/Hi8u">Wikimedia</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (CC BY-SA 3.0 license)</p>

<p class="align-caption"><a href="/assets/images/posts/hodgkin_huxley_model/hodgkin_huxley_model_APphase_Cm_1.0_gNa_120.0_gK_36.0_gL_0.3_UNa_120_UK_-77.0_UL_-54.387_Iext_45_t_6_8.png" title="Hodgkin–Huxley dynamics of action potential generation."><img src="/assets/images/posts/hodgkin_huxley_model/hodgkin_huxley_model_APphase_Cm_1.0_gNa_120.0_gK_36.0_gL_0.3_UNa_120_UK_-77.0_UL_-54.387_Iext_45_t_6_8.png" width="100%" alt="Hodgkin–Huxley dynamics of action potential generation." /></a>
<a href="/blog/2024-04-21-hodgkin_huxley_model/">Hodgkin–Huxley dynamics</a> of action potential generation. Shown are the membrane potential $U_m(t)$, the gating variables $m$, $h$, and $n$, the external input current $I_{\mathrm{ext}}(t)$, and corresponding <a href="/blog/2024-03-17-phase_plane_analysis/">phase-plane projections</a> during the generation of an action potential. A brief but strong external current pulse ($I_{\mathrm{ext}} = 45\,\mu\mathrm{A/cm}^2$ applied for 3 ms) drives the system across threshold, triggering a rapid excursion in state space that corresponds to spike initiation. The subsequent evolution is governed by the coupled nonlinear dynamics of sodium and potassium channel gating, leading to repolarization and afterhyperpolarization. In neural dynamics, the Hodgkin–Huxley model serves as a canonical example of how discrete events such as spikes emerge from continuous-time nonlinear dynamical systems, and how neuronal excitability can be understood geometrically in terms of trajectories, thresholds, and phase-space structure.</p>

<p>Neural dynamics refers more specifically to the study of time dependent neural activity and the mathematical structures that govern it. The focus lies on how <a href="/blog/2026-02-02-neural_plasticity_and_learning/">neural states</a> evolve in time, how stable or unstable activity patterns arise, how transitions between regimes occur, and how learning reshapes these dynamics. Typical objects of study include membrane potentials, spike trains, <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> variables, population activity, and low dimensional representations thereof.</p>

<p class="align-caption"><a href="/assets/images/posts/izhikevich_model/izhikevich_SNN_firings_Ne_regular spiking (RS)_Ni_low-threshold spiking (LTS).png" title="Spiking activity in a recurrent network of model neurons."><img src="/assets/images/posts/izhikevich_model/izhikevich_SNN_firings_Ne_regular spiking (RS)_Ni_low-threshold spiking (LTS).png" width="100%" alt="Spiking activity in a recurrent network of model neurons." /></a>
Spiking activity in a recurrent network of model neurons (<a href="/blog/2024-04-29-izhikevich_model/">Izhikevich model</a>). Shown are the spike times of all neurons in a recurrent spiking neural network as a function of time. The network consists of 800 excitatory neurons with regular spiking (RS) dynamics and 200 inhibitory neurons with low-threshold spiking (LTS) dynamics, separated by the horizontal line. Each vertical mark corresponds to an action potential (spike) emitted by a single neuron. In the context of neural dynamics, this representation illustrates how single-neuron events, such as the action potentials described above, combine to form structured, time-dependent activity patterns at the network level. Such spiking rasters provide a direct link between microscopic neuronal dynamics and emerging population activity, which can later be analyzed in terms of collective states, low-dimensional structure, and neural manifolds.</p>

<p>In this sense, neural dynamics forms a central but not exhaustive subfield of computational neuroscience. It provides the dynamical backbone on which many other questions rest, but it does not by itself encompass the full scope of computational approaches to brain function. Topics such as Bayesian decoding, normative theories of perception, or purely statistical models of neural data may rely on dynamical assumptions, yet are not primarily concerned with the dynamical systems themselves.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Pezon_et_al_2024_fig_1.png" title="Two complementary perspectives on population activity in neural dynamics."><img src="/assets/images/posts/nest/Pezon_et_al_2024_fig_1.png" width="100%" alt="Two complementary perspectives on population activity in neural dynamics." /></a>
Two complementary perspectives on population activity in neural dynamics. The figure contrasts a “circuit” perspective with a “<a href="/blog/2026-02-02-neural_plasticity_and_learning/">neural manifold</a>” perspective. In circuit models, neurons are organized in an abstract tuning space, where proximity reflects tuning similarity, and recurrent connectivity $W_{ij}$ together with external inputs generates time-dependent firing rates $r_i(t)$ (panels A–C). In the neural manifold view, the joint activity vector $r(t)\in\mathbb{R}^N$ of a recorded population evolves along low-dimensional trajectories embedded in a high-dimensional space (panels D–F). This is illustrated by ring-like manifolds for head-direction representations and by rotational trajectories in motor cortex, both of which can often be captured by a small number of latent variables $\kappa_1(t),\ldots,\kappa_D(t)$ with $D\ll N$. In the context of this overview post, I think the figure highlights very well why neural dynamics naturally connects mechanistic network modeling with <a href="/blog/2026-02-02-neural_plasticity_and_learning/">state-space descriptions</a> of population activity. These are not competing accounts, but complementary levels of description that emphasize different aspects of the same underlying dynamical system. Source: Figure 1 from Pezon, Schmutz, Gerstner, <em>Linking neural manifolds to circuit structure in recurrent networks</em>, 2024, bioRxiv 2024.02.28.582565, doi: <a href="https://doi.org/10.1101/2024.02.28.582565">10.1101/2024.02.28.582565</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC-BY-NC-ND 4.0)

<p>The focus I adopt here is explicitly on neural dynamics as it appears in spiking neuron models, recurrent networks, and <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> driven systems. Integrate and fire models, conductance based neurons, spiking neural networks, and learning rules such as spike timing dependent plasticity fall squarely into this domain. This decision reflects a weighting rather than an exclusion. It simply marks the region of computational neuroscience where dynamical systems theory and time continuous modeling play the most prominent role.</p>

<p>In short:</p>

<p style="padding-left:1.5em;">Computational neuroscience is the broad field that uses computational methods to study the brain, while neural dynamics is the subfield that focuses on the time dependent evolution of neural activity and the mathematical structures that govern it.</p>

<h2 id="a-mathematical-backbone-for-neural-dynamics">A mathematical backbone for neural dynamics</h2>
<p>What I have learned so far is that, unlike classical <a href="/blog/2021-03-04-hydrodynamics/">hydrodynamics</a> or <a href="/blog/2020-08-19-mhd/">magnetohydrodynamics</a>, neural dynamics does not possess a single, closed set of governing equations from which all models can be derived. The diversity of biological mechanisms and levels of description precludes such a unifying formulation. Nevertheless, there exists a common mathematical backbone that underlies most models used in neural dynamics.</p>

<p>At its core, neural dynamics studies systems of coupled, nonlinear, and often stochastic differential equations. A generic formulation can be written as</p>

\[\begin{align}
\frac{d\mathbf{x}}{dt} = \mathbf{F}(\mathbf{x}, \mathbf{I}(t), \boldsymbol{\theta}) + \boldsymbol{\eta}(t).
\end{align}\]

<p>Here, $\mathbf{x}(t)$ denotes the state vector of the system. Depending on the level of description, this vector may contain membrane potentials, gating variables, <a href="/blog/2026-02-12-stdp/#synapse">synaptic conductances</a>, adaptation currents, or abstract firing rate variables. The function $\mathbf{F}$ encodes the deterministic dynamics, including intrinsic neuronal properties, synaptic coupling, and nonlinear interactions. External inputs are represented by $\mathbf{I}(t)$, model parameters by $\boldsymbol{\theta}$, and $\boldsymbol{\eta}(t)$ denotes stochastic terms capturing intrinsic noise or unresolved microscopic processes.</p>
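<p>To make this backbone concrete, here is a minimal Python sketch (my own illustration, not a canonical implementation) that integrates the generic equation with a simple Euler–Maruyama scheme and plugs in the <a href="/blog/2024-04-07-fitzhugh_nagumo_model/">FitzHugh–Nagumo model</a> as one specific choice of $\mathbf{F}$; the parameter values follow the standard textbook formulation:</p>

<pre><code class="language-python">import numpy as np

def simulate(F, x0, I, dt=0.01, T=100.0, sigma=0.0, seed=0):
    """Euler-Maruyama integration of dx/dt = F(x, I(t)) + noise."""
    rng = np.random.default_rng(seed)
    steps = int(T / dt)
    x = np.array(x0, dtype=float)
    traj = np.empty((steps, x.size))
    for k in range(steps):
        noise = sigma * np.sqrt(dt) * rng.normal(size=x.size)
        x = x + dt * F(x, I(k * dt)) + noise
        traj[k] = x
    return traj

# one possible instance of F: the FitzHugh-Nagumo model with constant input
def fhn(x, I):
    v, w = x
    return np.array([v - v**3 / 3 - w + I,
                     0.08 * (v + 0.7 - 0.8 * w)])

traj = simulate(fhn, x0=[-1.0, 1.0], I=lambda t: 0.5, sigma=0.02)
print(traj[-1])  # final state (v, w) after the transient
</code></pre>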

<p>Specific neuron models correspond to particular choices of $\mathbf{F}$. For example, conductance based models yield systems of nonlinear ordinary differential equations with biophysically interpretable parameters. <a href="/blog/2023-07-03-integrate_and_fire_model/">Integrate and fire models</a> reduce this structure to a lower dimensional system with a threshold and reset condition, effectively introducing hybrid dynamics that combine continuous evolution with discrete events. Network models arise when many such units are coupled through <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> variables that themselves obey additional dynamical equations.</p>

<p class="align-caption"><a href="/assets/images/posts/integrate_and_fire_model/integrate_and_fire_model.png" title="Leaky Integrate-and-Fire model."><img src="/assets/images/posts/integrate_and_fire_model/integrate_and_fire_model.png" width="100%" alt="Leaky Integrate-and-Fire model." /></a><br />
<strong>Left</strong>: RC equivalent circuit of an <a href="/blog/2023-07-03-integrate_and_fire_model/">Integrate-and-Fire model neuron</a>. The “neuron” in this model is represented by the capacitor $C$ and the resistor $R$. The membrane potential $U(t)$ is the voltage across the capacitor $C$. The input current $I(t)$ is split into a resistive current $I_R$ and a capacitive current $I_C$ (not shown here). The resistive current is proportional to the voltage difference across the resistor $R$. The capacitive current is proportional to the rate of change of voltage across the capacitor. $U_\text{rest}$ is the resting potential of the neuron, which is the membrane potential when the neuron is not receiving any input. <strong>Right</strong>: The response $U(t)$ to current pulse inputs. When the neuron receives an input current, the membrane voltage changes. The <a href="/blog/2023-07-03-integrate_and_fire_model/">Integrate-and-Fire model</a> describes how the neuron integrates these incoming signals over time and fires an action potential once the membrane voltage exceeds a certain threshold (here depicted as $\vartheta$). Modified from this source: <a href="https://w.wiki/HgZo">Wikimedia</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (CC BY-SA 4.0 license)
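<p>The hybrid character of integrate-and-fire dynamics, continuous leaky integration interrupted by a discrete threshold and reset rule, can be captured in a few lines. The following sketch is a minimal illustration with arbitrary but plausible parameter values, not a reference implementation:</p>

<pre><code class="language-python">import numpy as np

# minimal leaky integrate-and-fire neuron with constant suprathreshold input:
dt, T = 0.1, 200.0                                       # ms
tau, u_rest, u_reset, theta = 10.0, -65.0, -70.0, -50.0  # ms, mV
R, I_ext = 10.0, 2.0                                     # MOhm, nA

u = u_rest
spike_times = []
for k in range(int(T / dt)):
    # continuous part: du/dt = (-(u - u_rest) + R * I_ext) / tau
    u += dt * (-(u - u_rest) + R * I_ext) / tau
    if u >= theta:               # discrete event: threshold crossing
        spike_times.append(k * dt)
        u = u_reset              # reset completes the hybrid dynamics

print(f"{len(spike_times)} spikes, first at {spike_times[0]:.1f} ms")
</code></pre>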

<p><a href="/blog/2026-02-02-neural_plasticity_and_learning/">Plasticity</a> introduces a further layer of dynamics. Synaptic weights become time dependent variables, often governed by equations of the form</p>

\[\begin{align}
\frac{dw_{ij}}{dt} = G(x_i, x_j, t),
\end{align}\]

<p>where $w_{ij}$ denotes the <a href="/blog/2026-02-12-stdp/#synapse">synaptic efficacy</a> from neuron $j$ to neuron $i$, and $G$ implements a learning rule such as spike timing dependent plasticity or a more general three factor rule. The full system then becomes a coupled dynamical system on multiple time scales, with fast neuronal dynamics and slower synaptic adaptation.</p>
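<p>As one concrete and deliberately simplified instance of such a rule $G$, the following sketch implements the classical pair-based STDP window, in which the sign and magnitude of the weight change depend exponentially on the timing difference between pre- and postsynaptic spikes; the amplitudes and time constants are illustrative choices:</p>

<pre><code class="language-python">import numpy as np

# pair-based STDP as a concrete choice of G: the weight change depends on
# the relative timing of pre- and postsynaptic spikes (illustrative values):
A_plus, A_minus = 0.01, 0.012      # potentiation / depression amplitudes
tau_plus, tau_minus = 20.0, 20.0   # time constants in ms

def stdp_dw(t_pre, t_post):
    """Weight change for a single pre/post spike pair."""
    dt = t_post - t_pre
    if dt > 0:   # pre before post: potentiation
        return A_plus * np.exp(-dt / tau_plus)
    else:        # post before (or with) pre: depression
        return -A_minus * np.exp(dt / tau_minus)

w = 0.5  # initial synaptic weight
for t_pre, t_post in [(10.0, 15.0), (40.0, 35.0), (60.0, 62.0)]:
    w += stdp_dw(t_pre, t_post)
print(f"final weight: {w:.4f}")
</code></pre>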

<p>From this perspective, neural dynamics is fundamentally the study of high dimensional nonlinear dynamical systems, their fixed points, limit cycles, attractors, bifurcations, and transient trajectories, as well as the ways in which learning reshapes the underlying phase space.</p>

<h2 id="thematic-overview">Thematic overview</h2>
<p>In this section, I try to maintain a thematic map of neural dynamics as I continue to explore the field. This section in particular is likely to evolve over time as I read more and refine my understanding. Therefore, please consider it a provisional outline rather than a definitive structure.</p>

<p>The map is largely inspired by the organization of Wulfram Gerstner’s <em>Neuronal Dynamics</em> (2014). It closely follows the conceptual progression of that book, while extending it in places to reflect later developments and my own focus. Also note: All topics listed below are understood as interconnected aspects of neural dynamics rather than as isolated modules.</p>

<h3 id="neurons-and-biophysical-foundations">Neurons and biophysical foundations</h3>
<ul>
  <li>The <a href="/blog/2024-04-21-hodgkin_huxley_model/">Hodgkin–Huxley model</a> of action potential generation</li>
  <li><a href="/blog/2024-03-17-phase_plane_analysis/">Phase plane analysis</a> as a tool for understanding dynamical systems</li>
  <li>Reduced neuron models: <a href="/blog/2024-04-07-fitzhugh_nagumo_model/">FitzHugh–Nagumo</a>, Morris–Lecar, <a href="/blog/2024-04-29-izhikevich_model/">Izhikevich</a> models</li>
  <li>Spike initiation dynamics and threshold phenomena</li>
  <li><a href="/blog/2026-02-12-stdp/#synapse">Synaptic dynamics</a>: Conductance based and current based synapses</li>
  <li>Dendritic processing and compartmental models</li>
</ul>

<h3 id="integrate-and-fire-neuron-models">Integrate-and-fire neuron models</h3>
<ul>
  <li>The <a href="/blog/2023-07-03-integrate_and_fire_model/">Leaky integrate-and-fire (LIF) model</a></li>
  <li><a href="/blog/2024-08-25-EIF_and_AdEx_model/">Exponential integrate-and-fire (EIF) and adaptive exponential integrate-and-fire (AdEx) models</a></li>
  <li>Generalized integrate-and-fire neuron models</li>
  <li>Nonlinear integrate-and-fire models</li>
  <li>Noisy input and output models</li>
</ul>

<h3 id="neuronal-populations-and-network-dynamics">Neuronal populations and network dynamics</h3>
<ul>
  <li>Neuronal populations</li>
  <li>Tuning curves and population coding</li>
  <li>Continuity equation and the Fokker–Planck approach</li>
  <li>Quasi-renewal theory and integral equation approaches</li>
  <li>Fast transients and <a href="/blog/2025-08-28-rate_models/">rate models</a></li>
  <li>Asynchronous irregular states in spiking networks (<a href="/blog/2024-07-21-brunel_network/">Brunel network</a>)</li>
  <li><a href="/blog/2024-10-24-dimensionality_reduction_in_neuroscience/">Neural state space and latent space representations</a></li>
</ul>

<h3 id="cognitive-and-systems-level-dynamics">Cognitive and systems-level dynamics</h3>
<ul>
  <li>Memory and attractor dynamics</li>
  <li>Cortical field models for perception</li>
  <li>Latest developments in dynamical theories of cognition (incomplete):
    <ul>
      <li>Representational drift in hippocampus and cortex</li>
    </ul>
  </li>
</ul>

<h3 id="synaptic-plasticity-and-learning">Synaptic plasticity and learning</h3>
<ul>
  <li><a href="/blog/2026-02-02-neural_plasticity_and_learning/">Synaptic plasticity and learning rules</a></li>
  <li>Three-factor learning rules</li>
  <li><a href="/blog/2024-09-08-bcm_rule/">Bienenstock–Cooper–Munro (BCM) rule</a></li>
  <li>Voltage-based plasticity rules (e.g. Clopath rule)</li>
  <li>Synaptic tagging and capture (STC)</li>
  <li><a href="/blog/2026-02-12-stdp/">Spike-timing-dependent plasticity (STDP)</a>
    <ul>
      <li>In <a href="/blog/2026-02-16-nervos_stdp_snn_simulation_on_mnist/">this post</a>, we apply STDP to a pattern recognition task as an example of how it can be used for learning in spiking neural networks.</li>
    </ul>
  </li>
  <li>Behavioral time-scale synaptic plasticity (BTSP)</li>
  <li><a href="/blog/2024-09-15-ltp_and_ltd/">Long-term potentiation (LTP) and long-term depression (LTD)</a></li>
  <li>Dendritic prediction and credit assignment (<a href="/blog/2026-02-22-urbanczik_senn_plasticity/">Urbanczik–Senn plasticity</a>)</li>
</ul>

<h3 id="dynamical-and-stochastic-phenomena-in-neural-systems">Dynamical (and stochastic) phenomena in neural systems</h3>
<ul>
  <li>Stability and instability of neural activity patterns</li>
  <li>Transitions between dynamical regimes and state changes</li>
  <li>Oscillatory activity and synchronization phenomena</li>
  <li>Phase oscillator models and synchronization (Kuramoto model)</li>
  <li>Irregular and chaotic dynamics in recurrent neural systems</li>
  <li>Noise-driven variability and stochastic effects in neural activity</li>
</ul>

<h3 id="neural-dynamics-and-signal-processing">Neural dynamics and signal processing</h3>
<ul>
  <li>Backpropagation through time (BPTT)</li>
  <li>Backpropagating action potentials (bAPs)</li>
  <li>Calcium dynamics and calcium waves</li>
</ul>

<h3 id="from-biological-to-artificial-networks-and-machine-learning">From biological to artificial networks and machine learning</h3>
<ul>
  <li>Eligibility traces and Eligibility propagation (e-prop)</li>
  <li>Trainable <a href="/blog/2023-07-03-integrate_and_fire_model/#spiking_neural_networks">spiking neural networks</a> for computation and learning</li>
  <li>Energy-based and variational formulations of neural dynamics</li>
</ul>

<h2 id="historical-overview">Historical overview</h2>
<p>Neural dynamics and computational neuroscience did not emerge fully formed, but developed gradually through contributions from physiology, physics, mathematics, and computer science. The following table highlights selected milestones that are particularly relevant for the dynamical perspective.</p>

<table>
  <thead>
    <tr>
      <th style="text-align: center">Year</th>
      <th style="text-align: center"> </th>
      <th>Development</th>
      <th>Significance</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: center"><strong>1891</strong></td>
      <td style="text-align: center">🧠</td>
      <td>Santiago Ramón y Cajal formulates the neuron doctrine</td>
      <td>Establishes neurons as discrete anatomical and functional units of the nervous system</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1907</strong></td>
      <td style="text-align: center">📝</td>
      <td>Lapicque introduces the <a href="/blog/2023-07-03-integrate_and_fire_model/">integrate and fire abstraction</a></td>
      <td>Early reduction of excitability to leaky integration with threshold, a prototype for later point neuron models</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1909</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-09-04-campbell_siegert_approximation/">Campbell’s theorem for shot noise</a></td>
      <td>Mathematical foundation for relating stochastic spike trains to mean rates and variances</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1926</strong></td>
      <td style="text-align: center">🔬</td>
      <td>First extracellular recordings of action potentials (Adrian &amp; Zotterman)</td>
      <td>Demonstrates that spikes can be recorded extracellularly and linked to sensory stimulation</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1931</strong></td>
      <td style="text-align: center">📝/🔬</td>
      <td>Maria Goeppert-Mayer predicts two-photon absorption</td>
      <td>Theoretical foundation of nonlinear optical excitation, later enabling two-photon laser scanning microscopy</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1943</strong></td>
      <td style="text-align: center">📝</td>
      <td>McCulloch and Pitts formalize threshold units as logical elements</td>
      <td>First widely cited mathematical idealization of neurons as binary threshold devices, linking networks to computation</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1949</strong></td>
      <td style="text-align: center">🔬</td>
      <td>First intracellular recordings with sharp electrodes (Ling &amp; Gerard)</td>
      <td>Enables direct measurement of membrane potentials in single neurons</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1949</strong></td>
      <td style="text-align: center">📝</td>
      <td>Hebb formulates cell assemblies and the <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian learning principle</a></td>
      <td>Puts <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> modification and association at the center of learning theory, motivating later <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity rules</a></td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1951</strong></td>
      <td style="text-align: center">🧠/🔬</td>
      <td>Eccles et al. describe local field potentials (LFP) in cerebral cortex</td>
      <td>Establishes mesoscopic population signals reflecting <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> and network activity</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1951</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-09-04-campbell_siegert_approximation/">Siegert approximation</a> for first passage times</td>
      <td>Enables analytical estimation of firing rates for threshold driven stochastic processes</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1952</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-04-21-hodgkin_huxley_model/">Hodgkin and Huxley conductance based membrane model</a></td>
      <td>Establishes mechanistic ODEs for spikes via voltage dependent gating, defining modern single neuron dynamics</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1957</strong></td>
      <td style="text-align: center">📝</td>
      <td>Rall emphasizes dendritic cable properties and compartmental thinking</td>
      <td>Provides the foundation for spatially extended neuron models and dendritic integration as a dynamical process</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1958</strong></td>
      <td style="text-align: center">📝</td>
      <td>Rosenblatt’s perceptron</td>
      <td>Early trainable neural network model, historically important for learning rules and the connectionist line of thought</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1961</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-04-07-fitzhugh_nagumo_model/">FitzHugh reduction</a> of Hodgkin and Huxley</td>
      <td>Introduces a planar excitable system, enabling phase plane analysis of spikes, nullclines, and excitability types</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1962</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-04-07-fitzhugh_nagumo_model/">Nagumo and colleagues’ circuit implementation of excitable dynamics</a></td>
      <td>Concrete electronic realization of excitable systems, helping to popularize reduced excitable models</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1963</strong></td>
      <td style="text-align: center">🏅</td>
      <td>Nobel Prize in Physiology or Medicine (<a href="/blog/2024-04-21-hodgkin_huxley_model/">Hodgkin &amp; Huxley</a>, shared with Eccles)</td>
      <td>Honors the quantitative, biophysical theory of action potential generation that laid the foundation for modern neural dynamics</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1965</strong></td>
      <td style="text-align: center">📝</td>
      <td>Stein’s leaky integrate and fire neuron with noise</td>
      <td>Establishes stochastic LIF as a canonical rate generating model</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1970s</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2025-08-28-rate_models/">Rate based neuron models</a> (early formalizations)</td>
      <td>Introduces continuous firing rate dynamics as an alternative to explicit spikes; first models by Wilson and Cowan (1972–1973)</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1970s</strong></td>
      <td style="text-align: center">🔬</td>
      <td>Neher &amp; Sakmann develop the <a href="/teaching/python_course_neuropractical/04_igor_patch_clamp_recordings">patch clamp technique</a></td>
      <td>Revolutionizes single neuron electrophysiology by enabling high resolution recordings of ionic currents</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1971</strong></td>
      <td style="text-align: center">📝</td>
      <td>Ricciardi formulation of diffusion approximations</td>
      <td>Formal mathematical framework for firing rate statistics in noisy neurons</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1971</strong></td>
      <td style="text-align: center">🧠</td>
      <td>Discovery of place cells in the hippocampus (O’Keefe &amp; Dostrovsky)</td>
      <td>Demonstrates location-specific firing of single neurons, establishing a neural basis for spatial representation</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1972</strong></td>
      <td style="text-align: center">📝</td>
      <td>Wilson and Cowan population activity equations</td>
      <td>Canonical mean field style dynamics for interacting excitatory and inhibitory populations, a workhorse for cortex scale dynamics</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1975</strong></td>
      <td style="text-align: center">📝</td>
      <td>Kuramoto phase oscillator model</td>
      <td>Canonical model for synchronization and collective dynamics in coupled oscillator systems, later applied to neural rhythms</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1975/1977</strong></td>
      <td style="text-align: center">📝</td>
      <td>Amari neural field equations</td>
      <td>Spatially continuous rate dynamics and pattern formation in cortex scale models</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1978</strong></td>
      <td style="text-align: center">🔬</td>
      <td>Voltage-sensitive dye imaging of action potentials demonstrated by Cohen &amp; Salzberg</td>
      <td>Voltage-sensitive dyes enable optical recording of fast membrane potential changes across neural populations</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1982</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hopfield networks</a> as dynamical associative memory</td>
      <td>Makes attractor dynamics central for memory and computation, with an explicit energy like Lyapunov function framework</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1982</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-09-08-bcm_rule/">Bienenstock, Cooper, Munro (BCM) synaptic modification theory</a></td>
      <td>Establishes a sliding threshold <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> principle, influential as a stability mechanism and a bridge between activity statistics and learning</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1984</strong></td>
      <td style="text-align: center">🧠</td>
      <td>Discovery of head direction cells (Ranck)</td>
      <td>Identifies neurons encoding the animal’s directional heading, introducing orientation as a dynamical neural variable</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1986</strong></td>
      <td style="text-align: center">📝</td>
      <td>Rumelhart, Hinton, Williams backpropagation</td>
      <td>Not biologically plausible, but historically central for learning in neural networks</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1988</strong></td>
      <td style="text-align: center">📝</td>
      <td>Sompolinsky, Crisanti, Sommers chaos in random recurrent networks</td>
      <td>Introduces a mathematically controlled route to chaos in high dimensional neural dynamics via random connectivity</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1990</strong></td>
      <td style="text-align: center">🔬</td>
      <td>Two-photon laser scanning microscopy (Denk, Strickler &amp; Webb)</td>
      <td>Allows deep-tissue optical imaging with cellular resolution in scattering brain tissue</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1991</strong></td>
      <td style="text-align: center">🏅</td>
      <td>Neher &amp; Sakmann receive Nobel Prize for <a href="/teaching/python_course_neuropractical/04_igor_patch_clamp_recordings">patch clamp technique</a></td>
      <td>Enables high resolution recordings of ionic currents and membrane potentials, revolutionizing single neuron physiology</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1996</strong></td>
      <td style="text-align: center">📝</td>
      <td>van Vreeswijk and Sompolinsky balanced excitation and inhibition in cortical circuits</td>
      <td>Formalizes the balanced state idea as a mechanism for irregular activity and fast responses in large networks</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1997</strong></td>
      <td style="text-align: center">🔬</td>
      <td>First genetically encoded calcium indicators (GECIs; here: Cameleon) by Miyawaki et al.</td>
      <td>Opens the door to long-term optical recording of neural population activity</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1997</strong></td>
      <td style="text-align: center">🔬</td>
      <td>First genetically encoded voltage indicators (GEVIs; here: FlaSh) by Siegel &amp; Isacoff</td>
      <td>Establishes genetically targetable optical reporters of membrane potential dynamics</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1997</strong></td>
      <td style="text-align: center">📝</td>
      <td>Spiking neurons as computational units (Maass)</td>
      <td>Formal proof that networks of spiking neurons constitute a distinct and powerful computational model</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1998</strong></td>
      <td style="text-align: center">📝</td>
      <td>van Vreeswijk and Sompolinsky chaotic balanced state (extended analysis)</td>
      <td>Detailed theory of balanced networks, linking microscopic chaos to stable macroscopic activity statistics</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>1998</strong></td>
      <td style="text-align: center">📝</td>
      <td>Bi and Poo spike timing dependent <a href="/blog/2026-02-12-stdp/#synapse">synaptic modification</a></td>
      <td>Establishes experimentally grounded timing windows for <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a>, pushing “Hebb” into a temporally precise rule</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2000</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-07-21-brunel_network/">Brunel dynamics of sparsely connected excitatory–inhibitory spiking networks</a></td>
      <td>Unifies asynchronous irregular states, synchrony, and oscillatory regimes within a tractable LIF network theory</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2002</strong></td>
      <td style="text-align: center">📝</td>
      <td>Real time computation with spiking networks (Maass)</td>
      <td>Demonstrates computation through transient dynamics rather than fixed point attractors</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2003</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2024-04-29-izhikevich_model/">Izhikevich simple spiking neuron model</a></td>
      <td>Compact two variable system reproducing diverse spiking regimes with low computational cost</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2004</strong></td>
      <td style="text-align: center">👨‍💻</td>
      <td>First release of the <a href="/blog/2024-06-09-nest_SNN_simulator/"><em>NEST simulator</em></a></td>
      <td>Large scale spiking network simulation with focus on biological realism and scalability</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2005</strong></td>
      <td style="text-align: center">📝</td>
      <td>Brette and Gerstner <a href="/blog/2024-08-25-EIF_and_AdEx_model/">adaptive exponential integrate and fire model</a></td>
      <td>Provides a compact two dimensional point neuron capturing spike initiation and adaptation, widely used for dynamical studies</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2005</strong></td>
      <td style="text-align: center">📝</td>
      <td>Toyoizumi and colleagues link <a href="/blog/2024-09-08-bcm_rule/">BCM</a> style principles to spiking and timing</td>
      <td>Illustrates how rate based stability ideas can be translated into spike based <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> frameworks</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2005</strong></td>
      <td style="text-align: center">🔬</td>
      <td>Optogenetics demonstrated for neural control (Boyden et al.)</td>
      <td>Introduces millisecond-precise, cell-type-specific optical manipulation of neural activity</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2005</strong></td>
      <td style="text-align: center">🧠</td>
      <td>Discovery of grid cells in entorhinal cortex (Moser &amp; Moser)</td>
      <td>Reveals a periodic spatial firing pattern forming a metric for navigation and path integration</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2006</strong></td>
      <td style="text-align: center">📝</td>
      <td>Izhikevich polychronization</td>
      <td>Highlights precise spike timing patterns as computational primitives</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2007</strong></td>
      <td style="text-align: center">👨‍💻</td>
      <td>First public release of <em>Brian simulator</em></td>
      <td>Flexible, equation oriented simulator emphasizing clarity and rapid prototyping</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2007</strong></td>
      <td style="text-align: center">🔬</td>
      <td>Chemogenetics (DREADDs) introduced by Armbruster et al.</td>
      <td>Enables selective, long-lasting modulation of neural activity via engineered receptors</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2010</strong></td>
      <td style="text-align: center">📝</td>
      <td>Clopath voltage based STDP rule</td>
      <td>Links <a href="/blog/2026-02-02-neural_plasticity_and_learning/">synaptic plasticity</a> to membrane potential dynamics and spike timing</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2011</strong></td>
      <td style="text-align: center">📝/🔬</td>
      <td><a href="/blog/2026-02-05-moores_law_for_neural_recordings/">“Moore’s law” of neuroscience</a></td>
      <td>Stevenson and Kording show that the number of simultaneously recorded neurons grows exponentially, doubling roughly every 7 years</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2013</strong></td>
      <td style="text-align: center">🔬</td>
      <td>Three-photon microscopy demonstrated for <a href="/research/2025_three_photon_imaging_in_mouse_and_drosophila">deep brain imaging</a> (Horton et al.)</td>
      <td>Extends functional imaging to deeper cortical and subcortical structures</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2014</strong></td>
      <td style="text-align: center">📝</td>
      <td><a href="/blog/2026-02-22-urbanczik_senn_plasticity/">Urbanczik–Senn</a> dendritic predictive <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a></td>
      <td>Introduces plasticity driven by mismatch between somatic output and dendritic prediction</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2014</strong></td>
      <td style="text-align: center">📝</td>
      <td>Systematic dynamical theory of spiking computation (Gerstner et al.)</td>
      <td>Establishes a unified dynamical framework linking spikes, networks, and computation</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2014</strong></td>
      <td style="text-align: center">🏅</td>
      <td>Nobel Prize in Physiology or Medicine awarded to O’Keefe and the Mosers</td>
      <td>Honors the discovery of the brain’s spatial positioning system based on place and grid cells</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2015/2017</strong></td>
      <td style="text-align: center">🧠</td>
      <td>Behavioral time scale synaptic plasticity (BTSP) experimentally described</td>
      <td>Reveals learning rules operating over seconds, beyond classical STDP windows</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2017</strong></td>
      <td style="text-align: center">🔬</td>
      <td><a href="/blog/2026-02-05-moores_law_for_neural_recordings/">Neuropixels probes</a> introduced</td>
      <td>Enables simultaneous recording from thousands of neurons across multiple brain regions</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2018</strong></td>
      <td style="text-align: center">📝</td>
      <td>Neural dynamics as variational inference (Isomura &amp; Friston)</td>
      <td>Explicitly links neuronal activity and <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> to variational free energy minimization</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2018</strong></td>
      <td style="text-align: center">📝</td>
      <td>Three factor learning rules unified in theoretical frameworks (Gerstner et al.)</td>
      <td>Generalizes Hebbian <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> by incorporating modulatory signals</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2018</strong></td>
      <td style="text-align: center">👨‍💻</td>
      <td><em>NEURON</em> version 8 with extended dynamical mechanisms</td>
      <td>Introduces new features for simulating complex neuronal dynamics, including dendritic compartments and <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a></td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2019</strong></td>
      <td style="text-align: center">👨‍💻</td>
      <td><em>Brian2</em> matures as a standard teaching and research tool</td>
      <td>Combines flexibility with code generation for efficient simulations</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2020</strong></td>
      <td style="text-align: center">📝</td>
      <td>Bellec and colleagues introduce e-prop</td>
      <td>Biologically motivated approximation to backpropagation through time for spiking networks</td>
    </tr>
    <tr>
      <td style="text-align: center"><strong>2024</strong></td>
      <td style="text-align: center">🏅</td>
      <td>Nobel Prize in Physics awarded to <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">John Hopfield</a> and Geoffrey Hinton</td>
      <td>Honors foundational energy-based concepts underlying modern machine learning and neural network theory</td>
    </tr>
  </tbody>
</table>

<p>Legend:</p>
<ul>
  <li>🧠 Neuroscience/biological discovery</li>
  <li>🔬 Experimental technique/method</li>
  <li>📝 Theoretical/mathematical development</li>
  <li>👨‍💻 Computational tool/simulator</li>
  <li>🏅 Nobel Prize awarded for relevant work</li>
</ul>

<p>This list is necessarily selective. It emphasizes developments that shaped how neural activity is modeled as a dynamical system, rather than cataloging all advances in computational neuroscience. However, if you think important milestones are missing, please let me know in the <a href="#comments">comments below</a>.</p>

<h2 id="closing-remarks">Closing remarks</h2>
<p>I see neural dynamics occupying a central position within the field of computational neuroscience. It provides the language in which time, change, and interaction are made explicit. While it does not define the entire field, it supplies the mathematical and conceptual tools needed to understand how neural systems evolve, stabilize, and learn.</p>

<p>The perspective outlined here is deliberately dynamical and model driven. It reflects an interest in equations, phase spaces, and mechanisms rather than in purely descriptive or statistical approaches. This is not meant as a value judgment, but as a clarification of scope. Computational neuroscience is broader than neural dynamics, and neural dynamics gains much of its relevance precisely because it interfaces with experiments, data analysis, and theories of computation.</p>

<p>This overview should therefore be read as a living document. Its purpose is to orient, not to prescribe, and to serve as a conceptual anchor for my past and future posts rather than as a definitive account of the field.</p>

<h2 id="references-and-further-reading">References and further reading</h2>
<ul>
  <li>Wulfram Gerstner, Werner M. Kistler, Richard Naud, and Liam Paninski, <em>Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition</em>, 2014, Cambridge University Press, ISBN: 978-1-107-06083-8, <a href="https://neuronaldynamics.epfl.ch/online/index.html">free online version</a></li>
  <li>Alan L. Hodgkin &amp; Andrew F. Huxley, <em>A quantitative description of membrane current and its application to conduction and excitation in nerve</em>, 1952, The Journal of Physiology, 117(4), 500–544, doi: <a href="https://doi.org/10.1113/jphysiol.1952.sp004764">10.1113/jphysiol.1952.sp004764</a></li>
  <li>Peter Dayan &amp; Laurence F. Abbott, <em>Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems</em>, 2001, MIT Press, ISBN: 0-262-04199-5</li>
  <li>Eugene M. Izhikevich, <em>Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting</em> (first MIT Press paperback edition), 2010, The MIT Press, ISBN: 978-0-262-51420-0</li>
  <li>G. Bard Ermentrout &amp; David H. Terman, <em>Mathematical Foundations of Neuroscience</em>, 2010, Springer Science &amp; Business Media, ISBN: 978-0-387-87708-2</li>
  <li>Christoph Börgers, <em>An Introduction to Modeling Neuronal Dynamics</em>, 2017, Vol. 66, Springer International Publishing, doi: 10.1007/978-3-319-51171-9</li>
  <li>Gerasimos G. Rigatos, <em>Advanced Models of Neural Networks: Nonlinear Dynamics and Stochasticity in Biological Neurons</em>, 2015, Springer-Verlag Berlin Heidelberg, doi: 10.1007/978-3-662-43764-3</li>
  <li>Paul Miller, <em>An Introductory Course in Computational Neuroscience</em>, 2018, The MIT Press, ISBN: 978-0-262-34756-3</li>
  <li>Pezon, Schmutz, Gerstner, <em>Linking neural manifolds to circuit structure in recurrent networks</em>, 2024, bioRxiv 2024.02.28.582565, doi: <a href="https://doi.org/10.1101/2024.02.28.582565">10.1101/2024.02.28.582565</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
</ul>

<p>The list above is by no means exhaustive. It summarizes the main sources with which I would start my own exploration of neural dynamics. I highly recommend starting with Gerstner et al. (2014) for a comprehensive and mathematically rigorous introduction to the field. Also, each post in this blog contains further references and links to original articles, reviews, and textbooks that can help deepen your understanding of specific topics.</p>

]]></content><author><name> </name></author><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[Neural dynamics is a subfield of computational neuroscience that focuses on the time dependent evolution of neural activity and the mathematical structures that govern it. This post provides a definitional overview of neural dynamics, situating it within the broader context of computational neuroscience and outlining its key themes, methods, and historical developments.]]></summary></entry><entry><title type="html">Neural plasticity and learning: A computational perspective</title><link href="/blog/2026-02-02-neural_plasticity_and_learning/" rel="alternate" type="text/html" title="Neural plasticity and learning: A computational perspective" /><published>2026-02-02T16:52:13+01:00</published><updated>2026-02-02T16:52:13+01:00</updated><id>/blog/neural_plasticity_and_learning</id><content type="html" xml:base="/blog/2026-02-02-neural_plasticity_and_learning/"><![CDATA[<p>After having discussed structural plasticity in some detail in the <a href="/blog/2026-02-01-structural_plasticity/">previous post</a>, I thought it would be useful to now take a broader look at neural plasticity and learning from a computational perspective. Both plasticity and learning are fundamental adaptive processes that enable the brain to modify its structure and function in response to experience. But what are the main forms of plasticity, how do they relate to learning, and how can we formalize these concepts in models of neural dynamics? In this post, I will explore these questions, based on my current understanding of the topic. As in all areas of science, these views are subject to ongoing revision as new data and theoretical frameworks emerge and as my own understanding develops. I will update this post accordingly over time. Please keep that in mind while reading. If you find something that is inaccurate, incomplete, or misleading, please let me know in the <a href="#comments">comments below</a>. For a more comprehensive treatment, I recommend consulting the references listed at the <a href="#references-and-further-reading">end of the post</a>.</p>

<p class="align-caption"><a href="https://upload.wikimedia.org/wikipedia/commons/e/e1/Brain_neuroplasticity_after_practice.png" title="Schematic illustration of brain plasticity after skill practice."><img src="https://upload.wikimedia.org/wikipedia/commons/e/e1/Brain_neuroplasticity_after_practice.png" width="70%" alt="Schematic illustration of brain plasticity after skill practice." /></a><br />
Neural plasticity and learning are fundamental adaptive processes that enable the brain to modify its structure and function in response to experience. Shown is a simplified schematic illustration of plastic changes induced by intensive practice of a specific skill at multiple organizational levels. Top (behavior): Repeated practice improves behavioral performance, illustrated here by increased accuracy and reduced variability when practicing a motor task such as throwing darts. Middle (cortex): At the cortical level, practice leads to a reorganization of functional representations, with task relevant neural populations becoming more strongly engaged and more precisely tuned. Bottom (neuron): At the cellular level, these changes are supported by <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> and structural plasticity, including modifications of synaptic strength and local connectivity at individual neurons. Together, these coordinated changes across scales illustrate how learning emerges from local plastic mechanisms to produce stable improvements in behavior. Source: <a href="https://w.wiki/HiMQ">Wikimedia Commons</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 3.0).</p>

<h2 id="overview-neuronal-plasticity-across-scales">Overview: Neuronal plasticity across scales</h2>
<p>So, what are we talking about when we refer to neuronal plasticity? Broadly speaking, neuronal plasticity denotes the capacity of the nervous system to change both its functional properties and its underlying biological substrate (e.g., <a href="/blog/2026-02-12-stdp/#synapse">synaptic strengths</a>, intrinsic excitability, structural connectivity) as a consequence of experience, ongoing activity, and environmental conditions (<a href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=xobgmhgAAAAJ&amp;citation_for_view=xobgmhgAAAAJ:u-x6o8ySG0sC">Bear et al., 2016</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These changes may affect how neurons respond to inputs, how strongly they influence one another, and how information is represented and processed at the level of circuits and networks. For this reason, neuronal plasticity is widely regarded as the biological substrate of learning, memory formation, and behavioral adaptation (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=xobgmhgAAAAJ&amp;citation_for_view=xobgmhgAAAAJ:u-x6o8ySG0sC">Bear et al., 2016</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p class="align-caption"><a href="/assets/images/posts/nest/synaptic_plasticity.jpg" title="Schematic illustration of neuronal plasticity across scales."><img src="/assets/images/posts/nest/synaptic_plasticity.jpg" width="100%" alt="Schematic illustration of neuronal plasticity across scales." /></a>
Schematic illustration of neuronal plasticity across scales. <strong>(a)</strong> Schematic representation of key cellular elements (neuronal and glial cells) involved in the neuroplasticity (NP) process (Neuronal and Glia plasticity) as well as subcellular compartments (Synaptic, Dendritic, Axonal, Neuromuscular Plasticity). <strong>(b)</strong> Schematic diagram representing how repetitive <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> stimulations and repeated LTPs are linked to molecular changes (dotted circle in (a)), and the generation of dendrite and spine remodeling. <strong>(c)</strong> Schematic illustration showing some of the changes in dendrite formations (dotted dendrite spines) following learning (synaptic re-wiring) and memory formation (synaptic stabilization). <strong>(d)</strong> Schematic illustration showing axonal mechanisms of neuroplasticity and repair (rerouting and sprouting) following brain injury. Abbreviations: LTP, long-term potentiation. Source: <a href="https://w.wiki/HhnS">Wikimedia</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY-SA 4.0; original figure from <a href="https://doi.org/10.31083/j.jin.2020.03.165">Gatto, 2020</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).

<p>Importantly, plasticity should not be understood as a single, uniform mechanism. Rather, it comprises a heterogeneous set of processes that operate on different spatial and temporal scales (<a href="https://doi.org/10.1016/j.neuron.2004.09.012">Malenka &amp; Bear, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1038/nrn1327">Turrigiano &amp; Nelson, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Some forms act locally and rapidly, such as activity dependent changes in <a href="/blog/2026-02-12-stdp/#synapse">synaptic efficacy</a>, while others unfold over longer time scales and involve structural remodeling of neurons or the reorganization of entire cortical representations (<a href="https://doi.org/10.1038/nrn2758">Holtmaat &amp; Svoboda, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Together, these processes allow neural systems to remain adaptable over short time scales while preserving stable function over the lifetime of an organism (<a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>To make this more concrete, it is useful to distinguish several major forms of plasticity, which differ in the level at which they act, the mechanisms they involve, and the time scales on which they operate (<a href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=xobgmhgAAAAJ&amp;citation_for_view=xobgmhgAAAAJ:u-x6o8ySG0sC">Bear et al., 2016</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>):</p>

<ul>
  <li><strong>synaptic plasticity</strong></li>
  <li><strong>intrinsic plasticity</strong></li>
  <li><a href="/blog/2026-02-01-structural_plasticity/"><strong>structural plasticity</strong></a></li>
  <li><strong>plasticity of cortical maps</strong></li>
</ul>

<p>At the most fine grained level, <strong>synaptic plasticity</strong> refers to changes in the efficacy of individual <a href="/blog/2026-02-12-stdp/#synapse">synapses</a>. <a href="/blog/2024-09-15-ltp_and_ltd/">Long term potentiation and long term depression</a> are the canonical examples and have been studied extensively as cellular correlates of learning and memory (<a href="https://doi.org/10.1113/jphysiol.1973.sp010273">Bliss &amp; Lomo, 1973</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.neuron.2004.09.012">Malenka &amp; Bear, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Beyond these classical forms, spike timing dependent plasticity and its extensions relate synaptic change to the precise temporal structure of neural activity (<a href="https://doi.org/10.1126/science.275.5297.213">Markram et al., 1997</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">Bi &amp; Poo, 1998</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1146/annurev.neuro.31.060407.125639">Caporale &amp; Dan, 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>
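
<p>To make the timing dependence concrete, here is a minimal sketch (my own illustration, not code from the cited papers) of the canonical pair-based STDP window, in which the sign and magnitude of the weight change depend on the interval between pre- and postsynaptic spikes. All parameter values are illustrative defaults.</p>

<pre><code class="language-python">import numpy as np

def stdp_window(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP window; dt = t_post - t_pre in ms.
    Pre-before-post pairings (positive dt) potentiate, post-before-pre depress."""
    return np.where(dt > 0,
                    a_plus * np.exp(-dt / tau_plus),
                    -a_minus * np.exp(dt / tau_minus))

dts = np.linspace(-100.0, 100.0, 201)  # spike-timing differences in ms
dw = stdp_window(dts)                  # weight change across the window
</code></pre>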

<p class="align-caption"><a href="https://upload.wikimedia.org/wikipedia/commons/4/41/Synaptic_Plasticity_Rule.png" title="Example of synaptic plasticity."><img src="https://upload.wikimedia.org/wikipedia/commons/4/41/Synaptic_Plasticity_Rule.png" width="70%" alt="Example of synaptic plasticity." /></a><br />
Example of synaptic plasticity. Shown is a synaptic plasticity rule for gradient estimation based on dynamic perturbations of <a href="/blog/2026-02-12-stdp/#synapse">synaptic conductances</a>. Neurons in the exploratory circuit (LMAN) introduce stochastic perturbations to the activity of neurons in RA, a motor area, via dedicated perturbation <a href="/blog/2026-02-12-stdp/#synapse">synapses</a>. A critic evaluates behavioral performance and broadcasts a global reinforcement signal to plastic synapses, in particular the corticocortical projections from HVC (a premotor area) to RA (the motor area). Each synapse computes the product of its local perturbation and the global reinforcement signal and integrates this quantity over time to estimate the gradient of expected reward with respect to its synaptic weight. Synaptic weights are then adjusted in the direction of this estimated gradient, enabling reward-based optimization of motor output. Source: <a href="https://w.wiki/HiMD">Wikimedia Commons</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 3.0).

<p><strong>Intrinsic plasticity</strong> captures changes in the excitability of individual neurons. By modifying ion channel expression, membrane time constants, or firing thresholds, neurons can adapt their input-output relationships without altering <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> (<a href="https://doi.org/10.1016/j.cell.2008.10.008">Turrigiano, 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These mechanisms regulate gain, responsiveness, and temporal filtering and thus shape how synaptic inputs are transformed into spikes (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p><a href="/blog/2026-02-01-structural_plasticity/"><strong>Structural plasticity</strong></a> describes morphological reorganization, such as the formation and elimination of dendritic spines, axonal boutons, and even larger scale dendritic and axonal remodeling (<a href="https://mitpress.mit.edu/9780262549004/dendritic-spines/">Yuste, 2010</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1038/nrn2699">Holtmaat &amp; Svoboda, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These changes alter the physical wiring diagram of the network and therefore its effective connectivity and representational capacity over longer time scales (<a href="https://doi.org/10.3389/fnana.2014.00123">Bernardinelli et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p class="notice--info"><strong>In computational models</strong>, such as those we have explored in our <a href="/blog/2026-02-01-structural_plasticity/">previous post</a>, structural plasticity is typically implemented at an abstract level, where the formation and elimination of <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> represents the functional outcome of underlying morphological processes rather than their detailed biophysical realization (<a href="https://doi.org/10.1371/journal.pcbi.1003259">Butz &amp; van Ooyen, 2013</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Bernardinelli_structural_plasticity_2014_2.jpg" title="Sketch illustrating astrocytic structural plasticity and synapse stabilization."><img src="/assets/images/posts/nest/Bernardinelli_structural_plasticity_2014_2.jpg" width="100%" alt="Sketch illustrating astrocytic structural plasticity and synapse stabilization." /></a>
Sketch illustrating astrocytic structural plasticity and <a href="/blog/2026-02-12-stdp/#synapse">synapse</a> stabilization. The sketch illustrates how activity dependent remodeling of astrocytic processes contributes to the long term maintenance of activated synapses. <strong>Left:</strong> Excitatory synapses are dynamically contacted by highly motile astrocytic processes, reflecting ongoing structural plasticity in the <a href="/blog/2026-02-12-stdp/">tripartite synapse</a>. <strong>Middle:</strong> Neurotransmitter release during neuronal activity activates metabotropic glutamate receptors on astrocytic processes, leading to intracellular calcium signaling and increased motility of astrocytic extensions. <strong>Right:</strong> Under learning related conditions that induce <a href="/blog/2024-09-15-ltp_and_ltd/">long term potentiation</a> (upper synapse), enhanced astrocytic motility results in increased and persistent coverage of the synapse, promoting its stabilization. In contrast, synapses that are not potentiated (lower synapse) fail to recruit stable astrocytic contacts and are preferentially eliminated. Adhesion molecules are thought to contribute to this stabilization process. Source: Bernardinelli, Y., Nikonenko, I., Muller, D., <em>Structural plasticity: mechanisms and contribution to developmental psychiatric disorders</em>, Frontiers in Neuroanatomy, 2014, 8:123, doi <a href="https://doi.org/10.3389/fnana.2014.00123">10.3389/fnana.2014.00123</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0).</p>

<p>Finally, <strong>plasticity of cortical maps</strong> refers to the reorganization of large scale representations, for example after sensory deprivation, focal lesions, or prolonged training. Such reorganization reflects the coordinated outcome of <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a>, intrinsic, and structural plasticity operating across many neurons (<a href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=xobgmhgAAAAJ&amp;citation_for_view=xobgmhgAAAAJ:u-x6o8ySG0sC">Bear et al., 2016</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.31083/j.jin.2020.03.165">Gatto, 2020</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>In <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a>, synaptic plasticity has traditionally taken center stage because it can be directly translated into mathematical learning rules (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Correlation based <a href="/blog/2025-08-28-rate_models/">rate models</a>, including the classical <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian rule</a> (<a href="https://doi.org/10.1016/s0361-9230(99)00182-3">Hebb, 1949</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>),</p>

\[\Delta w_{ij} \propto \text{pre}_i \times \text{post}_j,\]

<p>and its stabilized and extended variants such as Oja’s rule (<a href="https://doi.org/10.1007/bf00399528">Oja, 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>)</p>

\[\Delta w_{ij} \propto \text{pre}_i \times \text{post}_j - \alpha \, \text{post}_j^2 \, w_{ij},\]

<p>and the <a href="/blog/2024-09-08-bcm_rule/">BCM framework</a> (<a href="https://doi.org/10.1523/JNEUROSCI.02-01-00032.1982">Bienenstock et al., 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="http://www.scholarpedia.org/article/Models_of_synaptic_plasticity">Shouval, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>)</p>

\[\Delta w_{ij} \propto \text{pre}_i \times \text{post}_j \, (\text{post}_j - \theta_M),\]

<p>describe how <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> depend on averaged firing rates. Later, spike based formulations were introduced, most prominently <a href="/blog/2026-02-12-stdp/">spike timing dependent plasticity (STDP)</a>, which ties synaptic change to millisecond precise spike timing (<a href="https://doi.org/10.1126/science.275.5297.213">Markram et al., 1997</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1007/s00422-008-0233-1">Morrison et al., 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). More recent developments, including triplet and higher order STDP models, three factor learning rules, and reward modulated plasticity, explicitly link synaptic mechanisms to learning at the behavioral and systems level (<a href="https://doi.org/10.1523/JNEUROSCI.1425-06.2006">Pfister &amp; Gerstner, 2006</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>
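
<p>As a minimal numerical illustration of these rate-based rules, the sketch below implements single update steps for the Hebbian, Oja, and BCM rules quoted above, together with a slowly sliding BCM threshold. Variable names, the linear rate neuron, and all parameter values are illustrative assumptions, not prescriptions from the original papers.</p>

<pre><code class="language-python">import numpy as np

def hebb_update(w, pre, post, eta=1e-3):
    """Plain Hebbian rule: dw proportional to pre * post (unstable without bounds)."""
    return w + eta * np.outer(post, pre)

def oja_update(w, pre, post, eta=1e-3, alpha=1.0):
    """Oja's rule: Hebbian growth balanced by a decay proportional to post^2 * w."""
    return w + eta * (np.outer(post, pre) - alpha * (post**2)[:, None] * w)

def bcm_update(w, pre, post, theta_m, eta=1e-3):
    """BCM rule: potentiation above the sliding threshold theta_m, depression below."""
    return w + eta * np.outer(post * (post - theta_m), pre)

# toy usage: one postsynaptic rate neuron driven by 10 presynaptic rates
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(1, 10))
theta_m = 1.0                                  # sliding threshold, tracks E[post^2]
for _ in range(1000):
    pre = rng.random(10)                       # presynaptic firing rates
    post = w @ pre                             # linear rate neuron
    w = bcm_update(w, pre, post, theta_m)
    theta_m += 0.01 * (float(post[0])**2 - theta_m)  # slow threshold dynamics
</code></pre>

<p>Note that the plain Hebbian rule diverges without further constraints; the decay term in Oja’s rule and the sliding threshold in the BCM rule are precisely the stabilizing ingredients that the equations above add.</p>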

<p>For this reason, synaptic plasticity remains the best studied and most formalized form of neuronal plasticity. At the same time, it must be understood as part of a broader plasticity landscape in which multiple mechanisms interact (<a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<h2 id="what-is-learning-and-how-does-it-relate-to-plasticity">What is learning and how does it relate to plasticity?</h2>
<p><em>Plasticity</em> and <em>learning</em> are closely related but conceptually distinct notions. While plasticity refers to concrete biological mechanisms that change the properties of neurons and circuits, learning denotes the functional outcome of these changes at the level of behavior and internal representations (<a href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=xobgmhgAAAAJ&amp;citation_for_view=xobgmhgAAAAJ:u-x6o8ySG0sC">Bear et al., 2016</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Clarifying this distinction is essential, especially when moving between biological description and computational modeling.</p>

<p>In neuroscience, learning is commonly defined as any persistent change in behavior or internal representations that arises from experience, training, or interaction with the environment (<a href="https://doi.org/10.1016/s0361-9230(99)00182-3">Hebb, 1949</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=xobgmhgAAAAJ&amp;citation_for_view=xobgmhgAAAAJ:u-x6o8ySG0sC">Bear et al., 2016</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These changes are functional rather than structural per se: They manifest as improved performance, the acquisition of new skills, the formation of memories, or the refinement of decisions and actions. Learning is therefore a system-level phenomenon, expressed in the dynamics of neural populations and their ability to generate stable, context-appropriate activity patterns (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p class="align-caption"><a href="/assets/images/posts/nest/child_walking.jpg" title="Schematic illustration of neuronal plasticity across scales."><img src="/assets/images/posts/nest/child_walking.jpg" width="42.3%" alt="Schematic illustration of neuronal plasticity across scales." /></a>
<a href="/assets/images/posts/nest/apes.jpg" title="Schematic illustration of neuronal plasticity across scales."><img src="/assets/images/posts/nest/apes.jpg" width="55.7%" alt="Schematic illustration of neuronal plasticity across scales." /></a><br />
Left: Little child learns to take its first steps (Source: <a href="https://unsplash.com/de/@nate_dumlao">Nathan Dumlao</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> from <a href="https://unsplash.com/de/fotos/baby-in-weissem-strickpullover-und-schwarz-weissen-polka-dot-shorts-wQDysNUCKfw">Unspash</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, license: <a href="https://unsplash.com/de/lizenz">Unsplash License</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Right: A monkey mother teaches her baby how to climb (Source: <a href="https://unsplash.com/de/@anirudh_18">Anirudh Chaudhary</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> from <a href="https://unsplash.com/de/fotos/brauner-affe-tagsuber-auf-ast-e9pC3m1todY">Unspash</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, license: <a href="https://unsplash.com/de/lizenz">Unsplash License</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Learning is a fundamental adaptive process that enables organisms to acquire new skills and knowledge through experience, practice, and social interaction. In biological terms, learning is underpinned by various forms of neuronal plasticity that modify <a href="/blog/2026-02-12-stdp/#synapse">synaptic strengths</a>, intrinsic excitability, and network connectivity. These plastic changes reshape neural dynamics, allowing the brain to form stable representations, improve performance, and adapt behaviorally to changing environments.</p>

<p>Crucially, learning should not be identified with any single plasticity mechanism. No individual <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> modification, intrinsic adjustment, or structural change constitutes learning on its own (<a href="https://doi.org/10.1016/j.neuron.2004.09.012">Malenka &amp; Bear, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1038/nrn1327">Turrigiano &amp; Nelson, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Instead, learning emerges from the coordinated interaction of many plastic processes acting across different organizational levels. Through this coordination, neural networks adapt their dynamics such that certain patterns of activity become more reliable, more discriminable, or more easily reactivated (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>From this perspective, plasticity provides the mechanistic substrate of learning, while learning itself is the emergent reorganization of network function. <a href="/blog/2026-02-12-stdp/#synapse">Synaptic</a>, intrinsic, and structural plasticity shape how neurons interact, how activity propagates through circuits, and which population states become stable or transient (<a href="https://doi.org/10.1038/nrn2699">Holtmaat &amp; Svoboda, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://mitpress.mit.edu/9780262549004/dendritic-spines/">Yuste, 2010</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). The result is not merely a modified wiring diagram or altered excitability, but a transformed dynamical system capable of representing and using information in new ways (<a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>This distinction becomes particularly important in <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a>. Models typically implement learning as parameter adaptation, for example changes in <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> weights, thresholds, or connectivity (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These parameter changes correspond to specific forms of biological plasticity. Learning, however, is assessed at a different level: By changes in network dynamics, attractor structure, representational geometry, or behavioral output (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1101/214262">Gao et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). In this sense, plasticity is local and mechanistic, whereas learning is global and functional.</p>

<h2 id="learning-as-dynamical-reorganization-of-neural-systems">Learning as dynamical reorganization of neural systems</h2>
<p>From a computational perspective, learning is most naturally described within the framework of <a href="/blog/2026-02-04-neural_dynamics/">dynamical systems</a> (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Neural networks are not static input–output mappings, but evolving systems whose internal state changes continuously in time as a function of ongoing activity, external inputs, and slowly adapting parameters. Learning corresponds to persistent, experience dependent modifications of this dynamical system  (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>Formally, the internal state of a network evolves according to a set of coupled differential equations</p>

\[\dot{x}(t) = f\big(x(t), u(t); W, \theta, \ldots \big),\]

<p>where $x(t) \in \mathbb{R}^N$ denotes the vector of neural states, such as firing rates or membrane potentials of $N$ neurons, $u(t)$ represents external inputs, $W$ the <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> weight matrix, and $\theta$ other parameters including thresholds, gains, or time constants. Network outputs are given by a readout mapping</p>

\[y(t) = g\big(x(t)\big),\]

<p>which may correspond to motor commands, decisions, or downstream neural signals. The functions $f$ and $g$ jointly define the intrinsic dynamics and the observable behavior of the network (<a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>Within this formalism, learning can be defined as any persistent, experience dependent change in the network dynamics $f$ and/or in the parameter set $\Theta = \{W, \theta, \ldots\}$ that alters the mapping from inputs $u(t)$ to internal states $x(t)$ and outputs $y(t)$ in a functionally beneficial way. Such benefits may include improved robustness of representations, enhanced recall, better generalization, or more reliable goal directed behavior (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Plasticity provides the mechanistic substrate for these changes, while learning is the emergent consequence at the level of network dynamics.</p>

<p>A key structural feature of biological learning systems is the separation of time scales. Neural activity typically evolves on fast time scales, while parameters adapt much more slowly (<a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). This can be expressed explicitly as a coupled slow–fast system</p>

\[\begin{aligned}
\dot{x} &amp;= f(x, u; \Theta), \\
\dot{\Theta} &amp;= \varepsilon \, \mathcal{L}(x, u, y, r; \Theta),
\end{aligned}\]

<p>with $0 &lt; \varepsilon \ll 1$. Here, $\mathcal{L}$ denotes a learning rule that may depend on local <a href="/blog/2026-02-12-stdp/#synapse">pre- and postsynaptic activity</a>, global modulatory signals such as reinforcement or error feedback $r$, and additional contextual variables (<a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). On short time scales, the parameters can be treated as quasi-static, while on longer time scales their gradual evolution reshapes the dynamical landscape of the system.</p>
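
<p>The following sketch makes this slow–fast structure explicit: a rate network is integrated with a simple Euler scheme on the fast time scale, while its weights drift slowly under a Hebbian-style rule scaled by $\varepsilon$. The specific dynamics $f$, the learning rule $\mathcal{L}$, the readout, and all parameter values are illustrative assumptions chosen only to expose the separation of time scales.</p>

<pre><code class="language-python">import numpy as np

rng = np.random.default_rng(1)
N = 20          # number of neurons
dt = 1e-3       # Euler step (fast time scale), in seconds
tau = 20e-3     # rate time constant
eps = 1e-2      # time-scale separation, eps much smaller than 1

W = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))  # weights (part of Theta)
w_out = rng.normal(0.0, 1.0 / np.sqrt(N), size=N)   # readout weights for y = g(x)
x = np.zeros(N)                                     # neural state

def f(x, u, W):
    """Fast dynamics: leaky rate network, tau * dx/dt = -x + W tanh(x) + u."""
    return (-x + W @ np.tanh(x) + u) / tau

def learning_rule(x, W):
    """Slow drift of the parameters: Hebbian-style term with weight decay."""
    r = np.tanh(x)
    return np.outer(r, r) - 0.1 * W

for step in range(50000):
    u = 0.5 * np.sin(2.0 * np.pi * 2.0 * step * dt) * np.ones(N)  # external drive
    x = x + dt * f(x, u, W)                  # fast: parameters quasi-static
    y = w_out @ np.tanh(x)                   # readout: observable network output
    W = W + dt * eps * learning_rule(x, W)   # slow: parameters evolve at rate eps
</code></pre>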

<p>A useful geometric interpretation arises by viewing the activity of a population of $N$ neurons at any moment in time as a point $x \in \mathbb{R}^N$ in neural state space. As time evolves, network activity traces out trajectories through this space (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1101/214262">Gao et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Learning reshapes these trajectories by modifying the underlying vector field defined by $f$. In many cases, plasticity causes trajectories to concentrate onto low-dimensional manifolds embedded in the high-dimensional state space (<a href="https://doi.org/10.1101/214262">Gao et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.7554/eLife.85487">Song et al., 2023</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These manifolds encode task-relevant variables, categories, or internal states, while suppressing variability along irrelevant dimensions.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Ji_et_al_2023_visualization_neural_state_space.jpg" title="Neural state space representation of population activity."><img src="/assets/images/posts/nest/Ji_et_al_2023_visualization_neural_state_space.jpg" width="100%" alt="Neural state space representation of population activity." /></a>
Neural state space representation of population activity. <strong>(A)</strong> The activity of individual neurons is shown as time-dependent signals, for example firing rates or other measures of neural activation. Colored markers indicate the joint activity pattern across all recorded neurons at specific time points. <strong>(B)</strong> At each time point, the collective activity of a population of $N$ neurons defines a single point in an $N$-dimensional neural state space. As time evolves, neural dynamics correspond to trajectories through this space, providing a geometric description of population activity that underlies representation, computation, and learning. Source: Ji, X., Elmoznino, E., Deane, G., Constant, A., Dumas, G., Lajoie, G., Bengio, Y., <em>Sources of richness and ineffability for phenomenally conscious states</em>, 2023, preprint, arXiv, doi: <a href="https://ui.adsabs.harvard.edu/link_gateway/2023arXiv230206403J/doi:10.48550/arXiv.2302.06403">10.48550/arXiv.2302.06403</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; Figure available via <a href="https://www.researchgate.net/figure/sualization-of-neural-state-space-A-The-activity-trace-for-multiple-neurons-where_fig1_368474116">ResearchGate</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY-SA 4.0)</p>

<p>From this viewpoint, learning corresponds to a geometric reorganization of population activity. Distances between trajectories representing different stimuli or decisions may increase, improving separability, while variability within behaviorally equivalent states may contract, enhancing robustness and generalization (<a href="https://doi.org/10.1101/214262">Gao et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>
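
<p>This geometric picture can be probed directly in simulations or recordings by collecting population states over time and asking how much variance a few principal components capture. The sketch below does this for a hypothetical random rate network driven by a one-dimensional periodic input; the network and all parameters are illustrative assumptions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(1)
N, T, dt, tau = 100, 2000, 1e-3, 20e-3

W = rng.normal(0, 1.2/np.sqrt(N), (N, N))   # random recurrent weights
x = rng.normal(0, 0.1, N)                   # initial state
X = np.empty((T, N))                        # trajectory through state space

for t in range(T):
    u = np.sin(2*np.pi*3*t*dt)              # one-dimensional periodic drive
    x += dt/tau * (-x + W @ np.tanh(x) + u)
    X[t] = x                                # one point in R^N per time step

# PCA via SVD of the mean-centered trajectory
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
var_explained = s**2 / np.sum(s**2)
print("variance captured by the first 3 PCs:", var_explained[:3].sum())
</code></pre></div></div>

<p>A value close to one indicates that the trajectory is effectively confined to a low-dimensional manifold within the $N$-dimensional state space.</p>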

<p class="align-caption"><a href="/assets/images/posts/nest/Song_et_al_2023_neural_state_space.jpg" title="Latent state space of large-scale neural dynamics."><img src="/assets/images/posts/nest/Song_et_al_2023_neural_state_space.jpg" width="100%" alt="Latent state space of large-scale neural dynamics." /></a>
Latent state space of large-scale neural dynamics. The figure illustrates how high-dimensional neural activity can be described in terms of low-dimensional latent states that capture the dominant network dynamics. Shown on the left is a schematic illustration of hidden Markov model (HMM) inference applied to large-scale brain activity. Here, from observed multivariate fMRI time series, the model infers a sequence of discrete latent states that summarize recurrent patterns of population activity. On the right, neural activity can be visualized as trajectories through a high-dimensional state space, here defined by activity across multiple cortical parcels. The HMM identifies latent clusters within this space, where each state is characterized by its mean activity pattern and covariance structure. Transitions between states reflect the underlying network dynamics and provide a compact description of how neural activity evolves over time. Source: Figure 1A from Song, H., Shim, W. M., Rosenberg, M. D., Large-scale neural dynamics in a shared low-dimensional state space reflect cognitive and attentional dynamics, eLife, 2023, 12:e85487, doi: <a href="https://doi.org/10.7554/eLife.85487">10.7554/eLife.85487</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0; modified (cropped)); adapted from <a href="https://elifesciences.org/articles/85487#bib31">Cornblath et al., 2020</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Gao_et_al_2017_neural_manifolds.png" title="Neural population dynamics as an embedding of task structure into neural state space."><img src="/assets/images/posts/nest/Gao_et_al_2017_neural_manifolds.png" width="100%" alt="Neural population dynamics as an embedding of task structure into neural state space." /></a>
Neural population dynamics as an embedding of task structure into neural state space. <strong>(A)</strong> Behavioral paradigm illustrating a reaching movement to a single target. The monkey initiates a movement from a fixed starting position and reaches toward a specified goal. In this simple condition, the task is fully parameterized by time $t$ along the movement trajectory. <strong>(B)</strong> Trial-averaged population firing rates of many simultaneously recorded neurons during the reach. Each trace represents the activity of one neuron as a function of time, illustrating how complex, heterogeneous single-neuron responses unfold during a structured behavior. <strong>(C)</strong> The same population activity represented as a trajectory in neural state space. At each time point, the joint activity of all recorded neurons defines a single point in a high-dimensional space, with neural dynamics corresponding to a continuous trajectory through this space. Temporal evolution of behavior is thus mapped onto a geometric path in population activity space. <strong>(D)</strong> Extension of the task to reaching movements in multiple directions. In this case, behavior is no longer described by time alone but by two task variables: Time within the movement and reach angle. <strong>(E)</strong> The resulting task manifold, here shown schematically as a low-dimensional cylinder parameterized by time and reach direction. Each point on this manifold corresponds to a specific behavioral state of the task. <strong>(F)</strong> The neural data manifold obtained from population activity. Neural trajectories corresponding to different reach directions form a smooth, structured surface in neural state space, demonstrating that population activity provides a continuous embedding of the low-dimensional task manifold into a high-dimensional neural space. This illustrates how neural manifolds capture both task structure and dynamics, and how learning and experience shape the geometry of population activity. Source: Figure 3 from Gao, P., Trautmann, E., Yu, B., Santhanam, G., Ryu, S., Shenoy, K., Ganguli, S., A theory of multineuronal dimensionality, dynamics and measurement, 2017, bioRxiv 214262, doi: <a href="https://doi.org/10.1101/214262">10.1101/214262</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0).</p>

<p>Learned information often manifests as the stability of particular regions of state space, which motivates an attractor-based interpretation (<a href="https://doi.org/10.1073/pnas.79.8.2554">Hopfield, 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). An attractor $\mathcal{A}$ is a set toward which trajectories converge for a range of initial conditions. Local stability of a fixed-point attractor $x^*$, for example, can be characterized by the eigenvalues of the Jacobian</p>

\[J = \left. \frac{\partial f}{\partial x} \right|_{x = x^\ast},\]

<p>with stability requiring that all eigenvalues have negative real parts (<a href="https://www.routledge.com/Nonlinear-Dynamics-and-Chaos-With-Applications-to-Physics-Biology-Chemistry-and-Engineering/Strogatz/p/book/9780367026509">Strogatz, 1998</a>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Plasticity modifies $f$ and therefore alters both the location and stability of attractors. New attractors may emerge, existing attractors may shift or merge, and others may lose stability (<a href="https://arxiv.org/abs/2301.12638">Curto &amp; Morrison, 2023</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>
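
<p>In practice, this stability criterion can be checked numerically by approximating the Jacobian with finite differences at a candidate fixed point and inspecting its eigenvalue spectrum. The following minimal sketch does so for an arbitrary example network; the weights and gains are illustrative assumptions chosen so that the origin is a stable fixed point.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(2)
N = 30
W = rng.normal(0, 0.8/np.sqrt(N), (N, N))   # subcritical gain: x* = 0 is stable

def f(x):
    # autonomous rate dynamics: dx/dt = -x + W tanh(x)
    return -x + W @ np.tanh(x)

# locate a fixed point by relaxing the dynamics
x = rng.normal(0, 0.1, N)
for _ in range(20000):
    x += 1e-2 * f(x)

# finite-difference Jacobian J[i, j] = d f_i / d x_j at x*
h = 1e-6
J = np.empty((N, N))
for j in range(N):
    e = np.zeros(N)
    e[j] = h
    J[:, j] = (f(x + e) - f(x - e)) / (2*h)

eigs = np.linalg.eigvals(J)
print("fixed-point residual:", np.linalg.norm(f(x)))
print("max real part of eigenvalues:", eigs.real.max())
print("locally stable:", bool(eigs.real.max() &lt; 0))
</code></pre></div></div>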

<p class="align-caption"><a href="/assets/images/posts/nest/Attractor_neural_networks_Curto_Morrison_2023.png" title="Schematic illustration of neuronal plasticity across scales."><img src="/assets/images/posts/nest/Attractor_neural_networks_Curto_Morrison_2023.png" width="100%" alt="Schematic illustration of neuronal plasticity across scales." /></a>
<strong>(A)</strong> In recurrent networks with symmetric interactions, such as <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hopfield type</a> networks, neural activity evolves toward stable <a href="/blog/2024-03-17-phase_plane_analysis/">fixed point</a> attractors. Independent of initial conditions, trajectories converge into basins of attraction that correspond to stored or learned network states. The illustrated trajectories show how different initial states relax into distinct stable fixed points. <strong>(B)</strong> In networks with asymmetric connectivity, fixed point attractors can coexist with dynamic attractors, such as limit cycles or more complex recurrent trajectories. These dynamic attractors support sustained, time dependent activity patterns and illustrate how learning can give rise not only to static memory states but also to structured temporal dynamics. Source: Curto &amp; Morrison, <em>Graph rules for recurrent neural network dynamics: extended version</em>, 2023, preprint, arXiv, doi: <a href="https://arxiv.org/abs/2301.12638">10.48550/arXiv.2301.12638</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>, <a href="https://www.researchgate.net/figure/Attractor-neural-networks-A-For-symmetric-Hopfield-networks-and-symmetric-inhibitory_fig2_367557510">ResearchGate</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY-SA 4.0)</p>

<p>Different classes of attractors support different computational functions. Fixed-point attractors underlie associative memory and pattern completion (<a href="https://doi.org/10.1073/pnas.79.8.2554">Hopfield, 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>), continuous attractors represent continuous variables such as position or orientation, and dynamic attractors including limit cycles or more complex trajectories enable temporal and sequential computations (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Plasticity mechanisms such as <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian learning</a> (<a href="https://doi.org/10.1016/s0361-9230(99)00182-3">Hebb, 1949</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>), <a href="/blog/2024-09-08-bcm_rule/">BCM-type rules</a> (<a href="https://doi.org/10.1523/JNEUROSCI.02-01-00032.1982">Bienenstock et al., 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>), and <a href="/blog/2026-02-12-stdp/">spike-timing-dependent plasticity (STDP)</a> (<a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">Bi &amp; Poo, 1998</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1146/annurev.neuro.31.060407.125639">Caporale &amp; Dan, 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>) selectively stabilize patterns of coactivity, thereby shaping the attractor landscape of the network.</p>
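
<p>A classic and compact illustration of the fixed-point case is associative recall in a Hopfield-style network. The sketch below stores two random binary patterns with a Hebbian outer-product rule and then completes a corrupted cue by relaxing into the nearest attractor; the network size and noise level are arbitrary choices for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(3)
N = 200
patterns = rng.choice([-1, 1], size=(2, N))       # two stored binary patterns

# Hebbian outer-product weights, symmetric with zero diagonal
W = sum(np.outer(p, p) for p in patterns) / N
np.fill_diagonal(W, 0)

# corrupt 20% of the first pattern to obtain a recall cue
cue = patterns[0].copy()
flip = rng.choice(N, size=N // 5, replace=False)
cue[flip] *= -1

# asynchronous updates descend into the nearest fixed-point attractor
x = cue.copy()
for _ in range(10):
    for i in rng.permutation(N):
        x[i] = 1 if W[i] @ x &gt;= 0 else -1

print("overlap with stored pattern:", (x @ patterns[0]) / N)  # close to 1.0
</code></pre></div></div>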

<p class="align-caption"><a href="/assets/images/posts/nest/Ji_et_al_2023_attractor_dynamics_in_neural_networks.jpg" title="Attractor dynamics and memory representations in neural networks."><img src="/assets/images/posts/nest/Ji_et_al_2023_attractor_dynamics_in_neural_networks.jpg" width="100%" alt="Attractor dynamics and memory representations in neural networks." /></a>
Attractor dynamics and memory representations in neural networks. <strong>(A)</strong> Schematic illustration of attractors in a low-dimensional neural state space. Once network activity enters the basin of attraction of a <a href="/blog/2024-03-17-phase_plane_analysis/">fixed point</a>, trajectories converge toward that attractor and remain stable unless perturbed by external input or intrinsic noise. <strong>(B)</strong> Example dynamics from a recurrent neural network (RNN) trained to perform a working memory task, where the network must maintain and update binary information across multiple input channels. Task-relevant variables are encoded in stable patterns of population activity. <strong>(C)</strong> In the trained network, fixed-point attractors emerge as solutions to the task. Each attractor corresponds to a distinct memory state, here representing one of the possible input configurations. Neural trajectories evolve between these attractors as inputs change. For visualization, the high-dimensional state space of the network is projected onto its leading principal components. Source: Ji, X., Elmoznino, E., Deane, G., Constant, A., Dumas, G., Lajoie, G., Bengio, Y., <em>Sources of richness and ineffability for phenomenally conscious states</em>, 2023, preprint, arXiv, doi: <a href="https://ui.adsabs.harvard.edu/link_gateway/2023arXiv230206403J/doi:10.48550/arXiv.2302.06403">10.48550/arXiv.2302.06403</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; Figure available via <a href="https://www.researchgate.net/figure/Attractor-dynamics-in-neural-networks-A-Attractors-in-a-2D-state-space-When-a_fig3_368474116">ResearchGate</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY-SA 4.0)</p>

<p>Learning can thus be understood as a reorganization of the network’s attractor structure. Experiences become embedded as stable or metastable dynamical regimes that can be reliably reactivated by partial input or contextual cues (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). At the same time, learning systems must resolve the stability–plasticity dilemma. Plasticity must be sufficiently strong to allow adaptation, yet sufficiently regulated to prevent catastrophic interference with previously acquired knowledge (<a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Homeostatic mechanisms, normalization, and metaplasticity constrain parameter drift and help maintain global dynamical stability (<a href="https://doi.org/10.1038/nrn1327">Turrigiano &amp; Nelson, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.cell.2008.10.008">Turrigiano, 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>In summary, learning in neural systems is best understood not as a single mechanism or parameter change, but as the slow, experience-driven reconfiguration of a high-dimensional dynamical system. Plasticity acts locally and mechanistically at <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> and neurons, while learning emerges globally as a transformation of neural dynamics, representational geometry, and attractor structure.</p>

<h2 id="learning-paradigms-and-signals">Learning paradigms and signals</h2>
<p>The form of plasticity depends strongly on the learning context. Three major paradigms can be distinguished: unsupervised, reinforcement-based, and error-driven learning.</p>

<p>In unsupervised settings, <a href="/blog/2026-02-12-stdp/#synapse">synaptic changes</a> are driven solely by correlations within the neural activity itself. Classical <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian</a> (<a href="https://doi.org/10.1016/s0361-9230(99)00182-3">Hebb, 1949</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>), covariance, Oja (<a href="https://doi.org/10.1007/BF00275687">Oja, 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>), and <a href="/blog/2024-09-08-bcm_rule/">BCM</a> rules (<a href="https://doi.org/10.1523/JNEUROSCI.02-01-00032.1982">Bienenstock et al., 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="http://www.scholarpedia.org/article/Models_of_synaptic_plasticity">Shouval, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>) can be understood as local estimates of input statistics. In rate-based form, these rules typically depend on low-order moments of pre- and postsynaptic activity and lead to feature extraction, receptive field formation, and dimensionality reduction (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). From a dynamical perspective, such rules reshape the flow field of the network so that frequently co-active patterns become more stable or more strongly amplified (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>
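
<p>As a concrete instance of this statistical view, the following sketch implements Oja’s rule on synthetic two-dimensional inputs and compares the learned weight vector with the leading eigenvector of the input covariance. The covariance matrix, learning rate, and sample size are illustrative assumptions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(4)

# synthetic inputs with one dominant direction of variance
C = np.array([[3.0, 1.0], [1.0, 1.0]])            # input covariance
X = rng.multivariate_normal([0.0, 0.0], C, size=20000)

w = rng.normal(0, 0.1, 2)                         # initial synaptic weights
eta = 1e-3                                        # learning rate

for x in X:
    y = w @ x                                     # postsynaptic activity
    w += eta * y * (x - y * w)                    # Oja's rule

evals, evecs = np.linalg.eigh(C)
pc1 = evecs[:, np.argmax(evals)]                  # leading principal component
print("learned weights (normalized):", w / np.linalg.norm(w))
print("leading PC (up to sign):     ", pc1)
</code></pre></div></div>

<p>The normalization term $-y^2 w$ built into the rule keeps the weight vector bounded, so the synapse converges to the first principal component of its inputs without any explicit supervision.</p>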

<p>Reinforcement-based learning introduces an additional global signal that evaluates the outcome of neural activity rather than its detailed structure. Because rewards or punishments are often delayed relative to the neural events that caused them, <a href="/blog/2026-02-12-stdp/#synapse">synaptic updates</a> rely on eligibility traces that temporally bridge this gap (<a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). A generic formulation is</p>

\[\dot e_{ij}(t) = \phi(\text{pre}_i(t), \text{post}_j(t)) - \frac{e_{ij}(t)}{\tau_e},\]

<p>where $e_{ij}$ is a <a href="/blog/2026-02-12-stdp/#synapse">synapse-specific</a> eligibility trace and $\tau_e$ its decay time constant. Weight changes then take the form</p>

\[\dot w_{ij}(t) = \eta \, e_{ij}(t) \, r(t),\]

<p>with $r(t)$ denoting a global reinforcement or modulatory signal, often associated with dopaminergic input (<a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). In this view, reinforcement learning is not a distinct plasticity mechanism, but a particular factorization of the learning rule into local activity-dependent terms and a global scalar signal.</p>
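
<p>The following sketch integrates these two equations for a single synapse, using the product of pre- and postsynaptic rates as a simple stand-in for $\phi$ and a sparse, randomly timed scalar reward for $r(t)$. All values are illustrative assumptions rather than a calibrated model.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(5)

T, dt = 5000, 1e-3          # number of steps, step size (s)
tau_e = 0.5                 # eligibility trace time constant (s)
eta = 0.1                   # learning rate

w, e = 0.0, 0.0
for t in range(T):
    pre = 5.0                                    # presynaptic rate (Hz)
    post = 4.0 + np.sin(2*np.pi*t*dt)            # postsynaptic rate (Hz)
    e += dt * (pre * post - e / tau_e)           # de/dt = phi(pre, post) - e/tau_e
    r = 1.0 if rng.random() &lt; dt else 0.0     # sparse global reward (~1 Hz)
    w += dt * eta * e * r                        # dw/dt = eta * e * r

print("final weight:", w)
</code></pre></div></div>

<p>Because the trace $e_{ij}$ outlives the activity that created it, a reward arriving with some delay can still credit the responsible synapse; this is the temporal bridging role described above.</p>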

<p>Error-driven learning represents a special case in which explicit teaching or error signals are available. This is well established in cerebellar circuits, where climbing fiber input conveys performance errors (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). In cortical systems, however, direct error signals are rare, and many phenomena that appear error-driven at the behavioral level can be interpreted as reinforcement-like modulation of otherwise local plasticity rules (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Across paradigms, the common structure is that learning rules estimate how changes in <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> parameters influence future network dynamics and behavioral outcomes.</p>

<h2 id="time-and-spatial-scales">Time and spatial scales</h2>
<p>Plasticity and learning unfold across a hierarchy of time and spatial scales, and this hierarchy is essential for stable adaptation (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). At the fastest level, neural states $x(t)$ evolve on millisecond time scales according to the intrinsic dynamics of neurons and <a href="/blog/2026-02-12-stdp/#synapse">synapses</a>. Plastic changes act on slower variables, introducing a separation of time scales that can be expressed schematically as</p>

\[\dot x = f(x, u; W), \quad \dot W = \varepsilon \, \mathcal{L}(x, \ldots),\]

<p>with $\varepsilon \ll 1$.</p>

<p>On short time scales ranging from milliseconds to minutes, mechanisms such as short-term synaptic plasticity, spike-timing-dependent eligibility traces, and early phases of long-term potentiation and depression transiently tag relevant patterns of activity (<a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">Bi &amp; Poo, 1998</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1523/JNEUROSCI.1425-06.2006">Pfister &amp; Gerstner, 2006</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1146/annurev.neuro.31.060407.125639">Caporale &amp; Dan, 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These mechanisms bias subsequent plasticity without permanently altering network structure.</p>

<p>On intermediate time scales of hours to days, late-phase <a href="/blog/2024-09-15-ltp_and_ltd/">LTP and LTD</a>, spine stabilization, and local circuit reorganization consolidate these transient changes (<a href="https://doi.org/10.1113/jphysiol.1973.sp010273">Bliss &amp; Lomo, 1973</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.neuron.2004.09.012">Malenka &amp; Bear, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1038/nrn2699">Holtmaat &amp; Svoboda, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). At this level, learning becomes robust to noise and perturbations, and newly formed attractors or manifolds persist beyond the immediate learning episode (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>On long time scales of days to years, large-scale cortical reorganization, systems consolidation from hippocampus to neocortex, and the automatization of skills take place (<a href="https://doi.org/10.1038/nrn1327">Turrigiano &amp; Nelson, 2004</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.cell.2008.10.008">Turrigiano, 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These slow processes effectively reshape the parameter landscape of the network, constraining future learning and enabling lifelong accumulation of knowledge.</p>

<p>Importantly, learning alternates between online phases during active behavior and offline phases during sleep or rest. Offline replay and reactivation can be interpreted as additional trajectories through neural state space that reinforce or refine previously formed attractors while reducing interference between memories (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<h2 id="rates-versus-spike-timing">Rates versus spike timing</h2>
<p>Whether learning is best described in terms of firing rates or precise spike timing depends on the temporal resolution required to capture the relevant dynamics (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Spike-based models explicitly represent action potentials and can express learning rules that depend on millisecond-scale timing differences, as in <a href="/blog/2026-02-12-stdp/">STDP</a> (<a href="https://doi.org/10.1126/science.275.5297.213">Markram et al., 1997</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">Bi &amp; Poo, 1998</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1146/annurev.neuro.31.060407.125639">Caporale &amp; Dan, 2008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These mechanisms are essential for tasks involving temporal sequences, causality, or fine sensory discrimination (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p><a href="/blog/2025-08-28-rate_models/">Rate-based models</a>, in contrast, describe neural activity as temporally averaged variables (<a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">Dayan &amp; Abbott, 2001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). They are often sufficient for capturing the stabilization of attractors, working memory, decision-making processes, and other phenomena that depend primarily on slower collective dynamics (<a href="https://doi.org/10.1073/pnas.79.8.2554">Hopfield, 1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Mathematically, rate models can be viewed as coarse-grained descriptions of underlying spiking dynamics, obtained by averaging over fast fluctuations (<a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>Bridging models such as triplet <a href="/blog/2026-02-12-stdp/">STDP</a> demonstrate how spike-based rules reduce, under temporal averaging, to rate-based learning rules like <a href="/blog/2024-09-08-bcm_rule/">BCM</a> (<a href="https://doi.org/10.1523/JNEUROSCI.1425-06.2006">Pfister &amp; Gerstner, 2006</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="http://www.scholarpedia.org/article/Models_of_synaptic_plasticity">Shouval, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). In this sense, rate and spike formulations do not represent competing theories, but different projections of the same underlying dynamical system onto different time scales (<a href="https://doi.org/10.1016/j.conb.2017.03.015">Zenke et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). The appropriate description is determined by the question being asked rather than by biological realism alone.</p>
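
<p>To see why such a reduction holds, consider the minimal all-to-all triplet model and assume, as a simplifying idealization, statistically independent Poisson pre- and postsynaptic spike trains with rates $\rho_{\text{pre}}$ and $\rho_{\text{post}}$. Averaging the triplet updates over the spike statistics yields an expected drift of the form</p>

\[\langle \dot{w} \rangle = -A_2^- \tau_- \, \rho_{\text{pre}} \, \rho_{\text{post}} + A_3^+ \tau_+ \tau_y \, \rho_{\text{pre}} \, \rho_{\text{post}}^2,\]

<p>where $A_2^-$ and $A_3^+$ denote the pair-depression and triplet-potentiation amplitudes and $\tau_-$, $\tau_+$, $\tau_y$ the associated trace time constants. This expression has the BCM form $\rho_{\text{pre}} \, \rho_{\text{post}} (\rho_{\text{post}} - \theta)$ with an effective threshold $\theta \propto A_2^- \tau_- / (A_3^+ \tau_+ \tau_y)$; see <a href="https://doi.org/10.1523/JNEUROSCI.1425-06.2006">Pfister &amp; Gerstner, 2006</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> for the full derivation.</p>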

<h2 id="practical-implications-for-experiments-and-modeling">Practical implications for experiments and modeling</h2>
<p>Viewing learning as a dynamical reorganization of neural systems has direct implications for both experimental design and computational modeling (<a href="https://doi.org/10.1038/nrn2558">Buonomano &amp; Maass, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). At the behavioral level, learning manifests as increased accuracy, improved robustness to noise, and more reliable reward maximization. These behavioral changes correspond to increased stability and separability of neural representations (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<p>At the neurophysiological level, learning is reflected in changes to population activity rather than isolated neurons. Observable signatures include reduced dimensionality of task-relevant activity, reshaped manifolds in neural state space, and the emergence or stabilization of attractors (<a href="https://doi.org/10.1101/214262">Gao et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.7554/eLife.85487">Song et al., 2023</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These effects are often invisible at the level of single-neuron tuning curves but become apparent when analyzing population trajectories.</p>

<p>At the anatomical and biophysical level, learning leaves persistent traces in spine stability, receptor composition, intrinsic excitability, and connectivity patterns (<a href="https://doi.org/10.1038/nrn2699">Holtmaat &amp; Svoboda, 2009</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://mitpress.mit.edu/9780262549004/dendritic-spines/">Yuste, 2010</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). These changes constrain future dynamics and bias the system toward previously learned solutions.</p>

<p>From a theoretical perspective, learning corresponds to changes in the qualitative structure of the dynamical system. Fixed points may be created or displaced, eigenvalue spectra of the linearized dynamics may shift, and basins of attraction may expand or contract (<a href="https://neuronaldynamics.epfl.ch/online/index.html">Gerstner et al., 2014</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://arxiv.org/abs/2301.12638">Curto &amp; Morrison, 2023</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>). Improved generalization and robustness can often be traced back to increased margins between representations in state space and to the stabilization of task-relevant directions of activity (<a href="https://doi.org/10.1038/nn1859">Fusi &amp; Abbott, 2007</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>; <a href="https://doi.org/10.1101/214262">Gao et al., 2017</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<h2 id="a-compact-formal-summary">A compact formal summary</h2>
<p>Bringing these ideas together, we can condense them into a minimal formal description that captures the essential structure of learning in neural systems.</p>

<p>Network dynamics are described by</p>

\[\dot{x} = f(x, u; \Theta),\]

<p>where $x$ denotes the network state, $u$ external inputs, and $\Theta$ the set of parameters governing the dynamics.</p>

<p>Learning corresponds to experience-dependent changes of these parameters according to</p>

\[\dot{\Theta} = \mathcal{L}(x, u, y, r; \Theta),\]

<p>with $\mathcal{L}$ denoting a learning rule that may depend on local activity, global modulatory signals, or behavioral feedback.</p>

<p>Through this coupled evolution, learning reshapes the geometry of neural state space and the attractor structure of the dynamics, enabling stable representations, flexible computation, and adaptive behavior under biological constraints.</p>

<h2 id="concluding-perspective">Concluding perspective</h2>
<p>What we can take away from this discussion is that neuronal plasticity and learning describe different levels of the same adaptive process. Plasticity refers to the biological mechanisms by which neural systems change, while learning denotes the functional outcome of these changes at the level of network dynamics, representations, and behavior. Learning does not reside in individual <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> or neurons, but emerges from the coordinated interaction of multiple plastic processes acting across spatial and temporal scales.</p>

<p>From a <a href="/blog/2026-02-04-neural_dynamics/">computational viewpoint</a>, this distinction is crucial. Local plasticity mechanisms modify parameters or connectivity, whereas learning is expressed globally as a reorganization of neural state space and attractor structure. Changes in <a href="/blog/2026-02-12-stdp/#synapse">synaptic strength</a>, intrinsic excitability, or network topology reshape the effective dynamics, giving rise to stable yet flexible patterns of population activity.</p>

<p>I believe that this dynamical perspective provides a unifying framework for diverse modeling approaches in <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a>. <a href="/blog/2025-08-28-rate_models/">Rate-based</a> and spike-based descriptions, different learning paradigms, and geometric or attractor-based interpretations capture complementary aspects of how adaptive neural systems operate. Together, they emphasize that learning is best understood as a constrained, multiscale reconfiguration of network dynamics rather than as the outcome of any single plasticity rule.</p>

<p>As stated at the beginning of this post, the view presented here reflects my current understanding of the topic. It is intended as a conceptual reference point rather than a definitive account. As always, I welcome feedback, corrections, and suggestions for improvement in the <a href="#comments">comments below</a>. Any new insights I gain will be incorporated into future updates of this post.</p>

<h2 id="references-and-further-reading">References and further reading</h2>
<ul>
  <li>Marc F. Bear, Barry W. Connors, and Michael A. Paradiso, <em>Neuroscience: Exploring the Brain</em>, 2016, 4th edition, Wolters Kluwer, ISBN: <a href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=xobgmhgAAAAJ&amp;citation_for_view=xobgmhgAAAAJ:u-x6o8ySG0sC">978-0-7817-7817-6</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Bernardinelli, Y., Nikonenko, I., Muller, D., <em>Structural plasticity: mechanisms and contribution to developmental psychiatric disorders</em>, 2014, Frontiers in Neuroanatomy, 8:123, doi: <a href="https://doi.org/10.3389/fnana.2014.00123">10.3389/fnana.2014.00123</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>G. Bi, &amp; M. Poo, <em>Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type</em>, 1998, Journal of neuroscience, doi: <a href="https://doi.org/10.1523/JNEUROSCI.18-24-10464.1998">10.1523/JNEUROSCI.18-24-10464.1998</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>E. L. Bienenstock, L. N. Cooper, P. W. Munro, <em>Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex</em>, 1982, Journal of Neuroscience, doi: <a href="https://doi.org/10.1523/JNEUROSCI.02-01-00032.1982">10.1523/JNEUROSCI.02-01-00032.1982</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Bliss TV, Lomo T. <em>Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path</em>, 1973, J Physiol., 232(2):331-56. doi: <a href="https://doi.org/10.1113/jphysiol.1973.sp010273">10.1113/jphysiol.1973.sp010273</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Buonomano, Dean V., Maass, Wolfgang, <em>State-dependent computations: spatiotemporal processing in cortical networks</em>, Nature Reviews Neuroscience, 2009, 10(2), 113–125, doi: <a href="https://doi.org/10.1038/nrn2558">10.1038/nrn2558</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Markus Butz, Arjen van Ooyen, <em>A Simple Rule for Dendritic Spine and Axonal Bouton Formation Can Account for Cortical Reorganization after Focal Retinal Lesions</em>, 2013, PLoS Computational Biology, Vol. 9, Issue 10, pages e1003259, doi: <a href="https://doi.org/10.1371/journal.pcbi.1003259">10.1371/journal.pcbi.1003259</a></li>
  <li>Natalia Caporale, &amp; Yang Dan, <em>Spike timing-dependent plasticity: a Hebbian learning rule</em>, 2008, Annu Rev Neurosci, Vol. 31, pages 25-46, doi: <a href="https://doi.org/10.1146/annurev.neuro.31.060407.125639">10.1146/annurev.neuro.31.060407.125639</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Carina Curto, Katherine Morrison, <em>Graph rules for recurrent neural network dynamics: extended version</em>, 2023, preprint, arXiv, doi: <a href="https://arxiv.org/abs/2301.12638">10.48550/arXiv.2301.12638</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>P. Dayan, L. F. Abbott, <em>Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems</em>, 2001, MIT Press, ISBN: <a href="https://mitpress.mit.edu/9780262041997/theoretical-neuroscience/">0-262-04199-5</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Feldman, Daniel E., <em>The spike-timing dependence of plasticity</em>, 2012, Neuron;75(4):556-71. doi: <a href="https://doi.org/10.1016/j.neuron.2012.08.001">10.1016/j.neuron.2012.08.001</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Fusi, S., Abbott, L., <em>Limits on the memory storage capacity of bounded synapses</em>, 2007, Nat Neurosci 10, 485–493, doi: <a href="https://doi.org/10.1038/nn1859">10.1038/nn1859</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Rodolfo Gabriel Gatto, <em>Molecular and microstructural biomarkers of neuroplasticity in neurodegenerative disorders through preclinical and diffusion magnetic resonance imaging studies</em>, 2020, J. Integr. Neurosci., 19(3), 571–592. doi: <a href="https://doi.org/10.31083/j.jin.2020.03.165">10.31083/j.jin.2020.03.165</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Gao, P., Trautmann, E., Yu, B., Santhanam, G., Ryu, S., Shenoy, K., Ganguli, S., <em>A theory of multineuronal dimensionality, dynamics and measurement</em>, 2017, bioRxiv 214262, doi: <a href="https://doi.org/10.1101/214262">10.1101/214262</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Wulfram Gerstner, Werner M. Kistler, Richard Naud, and Liam Paninski, <em>Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition</em>, 2014, Cambridge University Press, ISBN: 978-1-107-06083-8, <a href="https://neuronaldynamics.epfl.ch/online/index.html">free online version</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Donald O. Hebb, <em>The Organization of Behavior</em>, 1949, Wiley: New York, doi: <a href="https://doi.org/10.1016/s0361-9230(99)00182-3">10.1016/s0361-9230(99)00182-3</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Holtmaat, A., Svoboda, K., <em>Experience-dependent structural synaptic plasticity in the mammalian brain</em>, 2009, Nat Rev Neurosci 10, 647–658, doi: <a href="https://doi.org/10.1038/nrn2699">10.1038/nrn2699</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Hopfield, John J., <em>Neural networks and physical systems with emergent collective computational abilities</em>, 1982, Proc Natl Acad Sci U S A, 79(8), 2554-2558. doi: <a href="https://doi.org/10.1073/pnas.79.8.2554">10.1073/pnas.79.8.2554</a></li>
  <li>Ji, X., Elmoznino, E., Deane, G., Constant, A., Dumas, G., Lajoie, G., Bengio, Y., <em>Sources of richness and ineffability for phenomenally conscious states</em>, 2023, preprint, arXiv, doi: <a href="https://doi.org/10.48550/arXiv.2302.06403">10.48550/arXiv.2302.06403</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Malenka, Robert C., Bear, Mark F., <em>LTP and LTD: An embarrassment of riches</em>, 2004, Neuron, Vol. 44, Issue 1, pages 5-21, doi: <a href="https://doi.org/10.1016/j.neuron.2004.09.012">10.1016/j.neuron.2004.09.012</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>H. Markram, J. Lübke, M. Frotscher, B. Sakmann, <em>Regulation of Synaptic Efficacy by Coincidence of Postsynaptic APs and EPSPs</em>, 1997, Science, Vol. 275, Issue 5297, pages 213-215, doi: <a href="https://doi.org/10.1126/science.275.5297.213">10.1126/science.275.5297.213</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Morrison, A, Diesmann, M, &amp; Gerstner, W, <em>Phenomenological models of synaptic plasticity based on spike timing</em>, 2008, Biol Cybern, 98(6), 459-478. doi: <a href="https://doi.org/10.1007/s00422-008-0233-1">10.1007/s00422-008-0233-1</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>E. Oja, <em>Simplified neuron model as a principal component analyzer</em>, 1982, Journal of Mathematical Biology, doi: <a href="https://doi.org/10.1007/BF00275687">10.1007/BF00275687</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>J. P. Pfister &amp; Wulfram Gerstner, <em>Triplets of spikes in a model of spike timing-dependent plasticity</em>, 2006, Journal of Neuroscience, doi: <a href="https://doi.org/10.1523/JNEUROSCI.1425-06.2006">10.1523/JNEUROSCI.1425-06.2006</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Pozo, Karen, Goda, Yukiko, <em>Unraveling mechanisms of homeostatic synaptic plasticity</em>, 2010, Neuron; 66(3):337-51. doi: <a href="https://doi.org/10.1016/j.neuron.2010.04.028">10.1016/j.neuron.2010.04.028</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Rumelhart, David E., McClelland, James L., PDP Research Group, <em>Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1</em>, 1986, MIT Press, ISBN: 978-0262680530.</li>
  <li>Harel Z. Shouval, <em>Models of synaptic plasticity</em>, 2007, Scholarpedia, 2(7):1605, doi: <a href="http://www.scholarpedia.org/article/Models_of_synaptic_plasticity">10.4249/scholarpedia.1605</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Song, H., Shim, W. M., Rosenberg, M. D., <em>Large-scale neural dynamics in a shared low-dimensional state space reflect cognitive and attentional dynamics</em>, eLife, 2023, 12:e85487, doi: <a href="https://doi.org/10.7554/eLife.85487">10.7554/eLife.85487</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Steven H. Strogatz, <em>Nonlinear Dynamics and Chaos. With Student Solutions Manual: With Applications to Physics, Biology, Chemistry, and Engineering</em>, 1998, Book, CRC Press, ISBN: <a href="https://www.routledge.com/Nonlinear-Dynamics-and-Chaos-With-Applications-to-Physics-Biology-Chemistry-and-Engineering/Strogatz/p/book/9780367026509">0429680155</a></li>
  <li>Turrigiano, Gina G., <em>The self-tuning neuron: synaptic scaling of excitatory synapses</em>, 2008, Cell; 135(3):422-35. doi: <a href="https://doi.org/10.1016/j.cell.2008.10.008">10.1016/j.cell.2008.10.008</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Turrigiano, Gina G., Nelson, Sacha B., <em>Homeostatic plasticity in the developing nervous system</em>, 2004, Nature Reviews Neuroscience, 5(2), 97–107, doi: <a href="https://doi.org/10.1038/nrn1327">10.1038/nrn1327</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Yuste, Rafael, <em>Dendritic Spines</em>, 2010, MIT Press, ISBN: <a href="https://mitpress.mit.edu/9780262549004/dendritic-spines/">978-0262013505</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Zenke, F, Gerstner, W, &amp; Ganguli, S, <em>The temporal paradox of Hebbian learning and homeostatic plasticity</em> (2017), Curr Opin Neurobiol, 43, 166-176. doi: <a href="https://doi.org/10.1016/j.conb.2017.03.015">10.1016/j.conb.2017.03.015</a></li>
</ul>

]]></content><author><name> </name></author><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[After discussing structural plasticity in the previous post, we now take a broader look at neural plasticity and learning from a computational perspective. What are the main forms of plasticity, how do they relate to learning, and how can we formalize these concepts in models of neural dynamics? In this post, we explore these questions and propose a unifying framework.]]></summary></entry><entry><title type="html">Incorporating structural plasticity in neural network models</title><link href="/blog/2026-02-01-structural_plasticity/" rel="alternate" type="text/html" title="Incorporating structural plasticity in neural network models" /><published>2026-02-01T14:52:13+01:00</published><updated>2026-02-01T14:52:13+01:00</updated><id>/blog/structural_plasticity</id><content type="html" xml:base="/blog/2026-02-01-structural_plasticity/"><![CDATA[<p>In standard spiking neural networks (SNNs), <a href="/blog/2026-02-12-stdp/#synapse">synaptic connections</a> between neurons are typically fixed or change only according to specific <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity rules</a>, such as <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/">Hebbian learning</a> or <a href="/blog/2026-02-12-stdp/">Spike-Timing Dependent Plasticity (STDP)</a>. However, the brain’s connectivity is not static: neurons can grow and retract <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> in response to activity levels and environmental conditions, a phenomenon known as structural plasticity. This process plays a crucial role in <a href="/blog/2026-02-02-neural_plasticity_and_learning/">learning and memory formation</a> in the brain. To illustrate how structural plasticity can be <a href="/blog/2026-02-04-neural_dynamics/">modeled</a> in spiking neural networks, in this post we will use the <a href="/blog/2024-06-09-nest_SNN_simulator/">NEST Simulator</a> and replicate the tutorial on <a href="https://nest-simulator.readthedocs.io/en/stable/auto_examples/structural_plasticity.html">“Structural Plasticity”</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</p>

<p class="align-caption"><a href="/assets/images/posts/nest/Bernardinelli_structural_plasticity_2014.jpg" title="Sketch illustrating activity-mediated structural plasticity."><img src="/assets/images/posts/nest/Bernardinelli_structural_plasticity_2014.jpg" width="100%" alt="Sketch illustrating activity-mediated structural plasticity." /></a>
Sketch illustrating structural <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> during <a href="/blog/2026-02-02-neural_plasticity_and_learning/">learning and memory formation</a>. The sketch illustrates the dynamic remodeling of <a href="/blog/2026-02-12-stdp/#synapse">synaptic connectivity</a> through dendritic spine turnover. <strong>Left:</strong> Under baseline conditions, synaptic networks exhibit continuous formation and elimination of dendritic spines, reflecting ongoing structural plasticity. <strong>Middle:</strong> During learning or learning-related activity, this baseline turnover is transiently increased, leading to enhanced formation and pruning of synaptic contacts. Newly formed spines preferentially emerge near previously activated <a href="/blog/2026-02-12-stdp/#synapse">synapses</a>, promoting the local clustering of synaptic inputs and enabling adaptive rewiring of circuits. <strong>Right:</strong> A subset of newly formed and activated synapses becomes selectively stabilized, providing a structural substrate for the long-term retention of behaviorally relevant connections and memory traces. Source: Bernardinelli, Y., Nikonenko, I., Muller, D., <em>Structural plasticity: mechanisms and contribution to developmental psychiatric disorders</em>, Frontiers in Neuroanatomy, 2014, 8:123, doi: <a href="https://doi.org/10.3389/fnana.2014.00123">10.3389/fnana.2014.00123</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (license: CC BY 4.0).</p>

<h2 id="what-is-structural-plasticity">What is structural plasticity?</h2>
<p>Structural plasticity refers to the ability of neurons to change their physical structure by forming new synaptic connections or eliminating existing ones. This <a href="/blog/2026-02-02-neural_plasticity_and_learning/">plasticity</a> is crucial for the brain’s ability to adapt to new experiences, learn, and recover from injuries. Structural plasticity involves several processes, including:</p>

<ol>
  <li><strong>Synaptogenesis:</strong> The formation of new <a href="/blog/2026-02-12-stdp/#synapse">synapses</a></li>
  <li><strong>Synaptic pruning:</strong> The elimination of less active or redundant <a href="/blog/2026-02-12-stdp/#synapse">synapses</a></li>
  <li><strong>Dendritic growth and retraction:</strong> Changes in the length and branching of dendrites</li>
  <li><strong>Axonal sprouting:</strong> The growth of new axonal branches to form new <a href="/blog/2026-02-12-stdp/#synapse">synapses</a></li>
</ol>

<h2 id="how-structural-plasticity-is-modeled-in-snns">How structural plasticity is modeled in SNNs</h2>
<p>To model structural plasticity in SNNs, we incorporate mechanisms that allow neurons to add or remove <a href="/blog/2026-02-12-stdp/#synapse">synaptic elements</a> based on specific rules. These synaptic elements include (presynaptic) axonal boutons and (postsynaptic) dendritic spines. The growth and pruning of these elements are governed by growth curves, i.e., functions that determine the rate of growth or retraction based on factors such as the calcium concentration in the neurons.</p>
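
<p>To get a feeling for what such a growth curve looks like, the sketch below plots a Gaussian-shaped rule in the spirit of Butz and van Ooyen (2013): the number of synaptic elements grows while the neuron’s calcium trace lies between two set points $\eta$ and $\epsilon$, and retracts outside this range. The parameter values are purely illustrative and are not the ones used in the simulation further below.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import matplotlib.pyplot as plt

# Gaussian growth curve (in the spirit of Butz &amp; van Ooyen, 2013):
# dz/dt = nu * (2 * exp(-((Ca - xi) / zeta)**2) - 1)
# growth is positive for eta &lt; Ca &lt; eps and negative outside this range
nu = 1e-4                  # growth rate, illustrative value
eta, eps = 0.1, 0.7        # lower and upper calcium set points, illustrative
xi = (eta + eps) / 2.0
zeta = (eps - eta) / (2.0 * np.sqrt(np.log(2.0)))

ca = np.linspace(0.0, 1.0, 500)   # calcium concentration (arbitrary units)
dz = nu * (2.0 * np.exp(-((ca - xi) / zeta)**2) - 1.0)

plt.plot(ca, dz)
plt.axhline(0.0, color="gray", lw=0.5)
plt.xlabel("calcium concentration (a.u.)")
plt.ylabel("growth rate of synaptic elements")
plt.show()
</code></pre></div></div>

<p>The two zero crossings of this curve act as homeostatic set points: neurons with too little activity (low calcium) retract elements, neurons within the target range grow them, and overly active neurons prune again.</p>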

<p>The main difference between standard SNNs and those incorporating structural plasticity lies in the dynamic nature of the network’s connectivity. In a standard SNN, <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> connections are often fixed or modified only by local plasticity rules like <a href="/blog/2026-02-12-stdp/">Spike-Timing Dependent Plasticity (STDP)</a>. In contrast, a structurally plastic network can create and remove connections based on global network activity, thereby simulating more realistic brain-like adaptability.</p>

<h2 id="nest-simulation">NEST simulation</h2>
<p>To illustrate structural plasticity in spiking neural networks, we will replicate the NEST tutorial <a href="https://nest-simulator.readthedocs.io/en/stable/auto_examples/structural_plasticity.html">“Structural Plasticity example”</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> including some minor modifications. We will use <a href="/blog/2024-06-09-nest_SNN_simulator/">NEST</a>’s <a href="https://nest-simulator.readthedocs.io/en/stable/models/iaf_psc_exp.html"><code class="language-plaintext highlighter-rouge">iaf_psc_exp</code></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> neuron model, which is a <a href="/blog/2023-07-03-integrate_and_fire_model/">leaky integrate-and-fire neuron (LIF) model</a> with exponentially decaying <a href="/blog/2026-02-12-stdp/#synapse">post-synaptic</a> currents, matching the model set up in the code below. The tutorial reproduces the results shown in <a href="https://doi.org/10.1371/journal.pcbi.1003259">Butz and van Ooyen (2013)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>. All parameters and growth curves are defined according to this publication.</p>

<p>The model consists of 800 excitatory and 200 inhibitory neurons. Initially, no connections exist between the neurons. The network is stimulated with a Poisson input, and the connectivity is updated based on homeostatic rules defined as growth curves for <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> elements. According to these rules, structural plasticity will create and delete <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> dynamically during the simulation until a desired level of activity is reached. The growth curves for axonal boutons and dendritic spines are defined based on the calcium concentration in the neurons. We also record the calcium concentration in the neurons and the connectivity over time to visualize the network’s structural changes.</p>

<p>Let’s begin with importing all necessary libraries:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">nest</span>
<span class="kn">import</span> <span class="n">nest.raster_plot</span>

<span class="c1"># Set global properties for all plots
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">.</span><span class="nf">update</span><span class="p">({</span><span class="sh">'</span><span class="s">font.size</span><span class="sh">'</span><span class="p">:</span> <span class="mi">12</span><span class="p">})</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.top</span><span class="sh">"</span><span class="p">]</span>    <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.bottom</span><span class="sh">"</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.left</span><span class="sh">"</span><span class="p">]</span>   <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.right</span><span class="sh">"</span><span class="p">]</span>  <span class="o">=</span> <span class="bp">False</span>
</code></pre></div></div>

<p>Next, we define the simulation parameters. Note that the implementation of structural plasticity in NEST cannot be used with multiple threads (so do not try to increase the number of local threads):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># set simulation parameters:
</span><span class="n">t_sim</span> <span class="o">=</span> <span class="mf">200000.0</span> <span class="c1"># simulation time in ms
</span><span class="n">dt</span> <span class="o">=</span> <span class="mf">0.1</span> <span class="c1"># simulation resolution in ms (is also the resolution 
</span>         <span class="c1"># of the update of the synaptic elements/structural plasticity)
</span><span class="n">number_excitatory_neurons</span> <span class="o">=</span> <span class="mi">800</span> <span class="c1"># number of excitatory neurons
</span><span class="n">number_inhibitory_neurons</span> <span class="o">=</span> <span class="mi">200</span> <span class="c1"># number of inhibitory neurons
</span><span class="n">update_interval</span> <span class="o">=</span> <span class="mf">10000.0</span> <span class="c1"># i.e., define how often the connectivity is updated inside the network
</span>                          <span class="c1"># synaptic elements and connections change on different time scales
</span><span class="n">record_interval</span> <span class="o">=</span> <span class="mf">1000.0</span>
<span class="n">bg_rate</span> <span class="o">=</span> <span class="mf">10000.0</span> <span class="c1"># background rate (i.e. rate of Poisson sources)
</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">ResetKernel</span><span class="p">()</span>
<span class="n">nest</span><span class="p">.</span><span class="nf">set_verbosity</span><span class="p">(</span><span class="sh">"</span><span class="s">M_ERROR</span><span class="sh">"</span><span class="p">)</span>
<span class="n">nest</span><span class="p">.</span><span class="n">resolution</span> <span class="o">=</span> <span class="n">dt</span>
<span class="n">nest</span><span class="p">.</span><span class="n">structural_plasticity_update_interval</span> <span class="o">=</span> <span class="n">update_interval</span>
</code></pre></div></div>

<p>For the postsynaptic currents, we define the following values:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># initialize variables for postsynaptic currents:
</span><span class="n">psc_e</span>   <span class="o">=</span> <span class="mf">585.0</span> <span class="c1"># excitatory postsynaptic current in pA
</span><span class="n">psc_i</span>   <span class="o">=</span> <span class="o">-</span><span class="mf">585.0</span> <span class="c1"># inhibitory postsynaptic current in pA
</span><span class="n">psc_ext</span> <span class="o">=</span> <span class="mf">6.2</span> <span class="c1"># external postsynaptic current in pA
</span></code></pre></div></div>

<p>Next, we define the neuron model parameters:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define neuron model parameters:
</span><span class="n">neuron_model</span> <span class="o">=</span> <span class="sh">"</span><span class="s">iaf_psc_exp</span><span class="sh">"</span>
<span class="n">model_params</span> <span class="o">=</span> <span class="p">{</span>
    <span class="sh">"</span><span class="s">tau_m</span><span class="sh">"</span><span class="p">:</span> <span class="mf">10.0</span><span class="p">,</span>      <span class="c1"># membrane time constant (ms)
</span>    <span class="sh">"</span><span class="s">tau_syn_ex</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.5</span><span class="p">,</span>  <span class="c1"># excitatory synaptic time constant (ms)
</span>    <span class="sh">"</span><span class="s">tau_syn_in</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.5</span><span class="p">,</span>  <span class="c1"># inhibitory synaptic time constant (ms)
</span>    <span class="sh">"</span><span class="s">t_ref</span><span class="sh">"</span><span class="p">:</span> <span class="mf">2.0</span><span class="p">,</span>       <span class="c1"># absolute refractory period (ms)
</span>    <span class="sh">"</span><span class="s">E_L</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">65.0</span><span class="p">,</span>       <span class="c1"># resting membrane potential (mV)
</span>    <span class="sh">"</span><span class="s">V_th</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">50.0</span><span class="p">,</span>      <span class="c1"># spike threshold (mV)
</span>    <span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">:</span> <span class="mf">250.0</span><span class="p">,</span>       <span class="c1"># membrane capacitance (pF)
</span>    <span class="sh">"</span><span class="s">V_reset</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mf">65.0</span>    <span class="c1"># reset potential (mV)
</span><span class="p">}</span>
</code></pre></div></div>

<p>Now, we define the structural plasticity properties and growth curves for the synaptic elements:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># copy synaptic models (will be the base for our structural plasticity synapses):
</span><span class="n">nest</span><span class="p">.</span><span class="nc">CopyModel</span><span class="p">(</span><span class="sh">"</span><span class="s">static_synapse</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">synapse_ex</span><span class="sh">"</span><span class="p">)</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">SetDefaults</span><span class="p">(</span><span class="sh">"</span><span class="s">synapse_ex</span><span class="sh">"</span><span class="p">,</span> <span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="n">psc_e</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">CopyModel</span><span class="p">(</span><span class="sh">"</span><span class="s">static_synapse</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">synapse_in</span><span class="sh">"</span><span class="p">)</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">SetDefaults</span><span class="p">(</span><span class="sh">"</span><span class="s">synapse_in</span><span class="sh">"</span><span class="p">,</span> <span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="n">psc_i</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">})</span>

<span class="c1"># define structural plasticity properties:
</span><span class="n">nest</span><span class="p">.</span><span class="n">structural_plasticity_synapses</span> <span class="o">=</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">synapse_ex</span><span class="sh">"</span><span class="p">:</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">synapse_model</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">synapse_ex</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">post_synaptic_element</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Den_ex</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">pre_synaptic_element</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Axon_ex</span><span class="sh">"</span><span class="p">},</span>
        <span class="sh">"</span><span class="s">synapse_in</span><span class="sh">"</span><span class="p">:</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">synapse_model</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">synapse_in</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">post_synaptic_element</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Den_in</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">pre_synaptic_element</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Axon_in</span><span class="sh">"</span><span class="p">}}</span>
</code></pre></div></div>

<p>The <a href="https://nest-simulator.readthedocs.io/en/stable/ref_material/pynest_api/nest.NestModule.html#nest.NestModule.structural_plasticity_synapses"><code class="language-plaintext highlighter-rouge">structural_plasticity_synapses</code></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> kernel property defines which <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> are subject to structural plasticity and specifies the corresponding synaptic models and elements. For each synapse type (excitatory and inhibitory), a separate dictionary is created with the <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> model, postsynaptic element, and presynaptic element. We created a copy of the static synapse model for the excitatory and inhibitory synapses and set their default weight and delay values. These models, <code class="language-plaintext highlighter-rouge">synapse_ex</code> and <code class="language-plaintext highlighter-rouge">synapse_in</code>, will be used as the base for the structural plasticity synapses. The postsynaptic and presynaptic elements are defined as dendritic (<code class="language-plaintext highlighter-rouge">Den_ex</code>, <code class="language-plaintext highlighter-rouge">Den_in</code>) and axonal (<code class="language-plaintext highlighter-rouge">Axon_ex</code>, <code class="language-plaintext highlighter-rouge">Axon_in</code>) elements, respectively. They will be associated with the growth curves for synaptic elements in the next step.</p>

<p>The growth curves define the rate of growth or retraction of <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> elements based on the calcium concentration in the neurons. The <code class="language-plaintext highlighter-rouge">growth_rate</code> parameter determines the speed of growth, while <code class="language-plaintext highlighter-rouge">eps</code> specifies the threshold calcium concentration at which growth occurs. The <code class="language-plaintext highlighter-rouge">continuous</code> parameter indicates whether growth is continuous or discrete. In this example, we use a Gaussian growth curve with a fixed growth rate and threshold:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define growth curves for synaptic elements:
</span><span class="n">growth_curve_e_e</span> <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">growth_curve</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">gaussian</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">growth_rate</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0001</span><span class="p">,</span> <span class="sh">"</span><span class="s">continuous</span><span class="sh">"</span><span class="p">:</span> <span class="bp">False</span><span class="p">,</span> <span class="sh">"</span><span class="s">eta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">eps</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.05</span><span class="p">}</span>
<span class="n">growth_curve_e_i</span> <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">growth_curve</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">gaussian</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">growth_rate</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0001</span><span class="p">,</span> <span class="sh">"</span><span class="s">continuous</span><span class="sh">"</span><span class="p">:</span> <span class="bp">False</span><span class="p">,</span> <span class="sh">"</span><span class="s">eta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">eps</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.05</span><span class="p">}</span>
<span class="n">growth_curve_i_e</span> <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">growth_curve</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">gaussian</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">growth_rate</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0004</span><span class="p">,</span> <span class="sh">"</span><span class="s">continuous</span><span class="sh">"</span><span class="p">:</span> <span class="bp">False</span><span class="p">,</span> <span class="sh">"</span><span class="s">eta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">eps</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.2</span><span class="p">}</span>
<span class="n">growth_curve_i_i</span> <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">growth_curve</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">gaussian</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">growth_rate</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0001</span><span class="p">,</span> <span class="sh">"</span><span class="s">continuous</span><span class="sh">"</span><span class="p">:</span> <span class="bp">False</span><span class="p">,</span> <span class="sh">"</span><span class="s">eta</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">eps</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.2</span><span class="p">}</span>

<span class="c1"># define synaptic elements:
</span><span class="n">synaptic_elements</span>   <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">Den_ex</span><span class="sh">"</span><span class="p">:</span> <span class="n">growth_curve_e_e</span><span class="p">,</span> 
                       <span class="sh">"</span><span class="s">Den_in</span><span class="sh">"</span><span class="p">:</span> <span class="n">growth_curve_e_i</span><span class="p">,</span> 
                       <span class="sh">"</span><span class="s">Axon_ex</span><span class="sh">"</span><span class="p">:</span> <span class="n">growth_curve_e_e</span><span class="p">}</span>
<span class="n">synaptic_elements_i</span> <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">Den_ex</span><span class="sh">"</span><span class="p">:</span> <span class="n">growth_curve_i_e</span><span class="p">,</span> 
                       <span class="sh">"</span><span class="s">Den_in</span><span class="sh">"</span><span class="p">:</span> <span class="n">growth_curve_i_i</span><span class="p">,</span> 
                       <span class="sh">"</span><span class="s">Axon_in</span><span class="sh">"</span><span class="p">:</span> <span class="n">growth_curve_i_i</span><span class="p">}</span>
</code></pre></div></div>

<p>Next, we create the nodes (neurons) using the neuron model and parameters defined above, and connect them to a Poisson generator, which drives the network with external input:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create_nodes:
</span><span class="n">nodes_e</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">iaf_psc_alpha</span><span class="sh">"</span><span class="p">,</span> <span class="n">number_excitatory_neurons</span><span class="p">,</span> 
                      <span class="n">params</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">synaptic_elements</span><span class="sh">"</span><span class="p">:</span> <span class="n">synaptic_elements</span><span class="p">})</span>
<span class="n">nodes_i</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">iaf_psc_alpha</span><span class="sh">"</span><span class="p">,</span> <span class="n">number_inhibitory_neurons</span><span class="p">,</span> 
                      <span class="n">params</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">synaptic_elements</span><span class="sh">"</span><span class="p">:</span> <span class="n">synaptic_elements_i</span><span class="p">})</span>

<span class="c1"># create a Poisson generator for external input and make connections:
</span><span class="n">noise</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">poisson_generator</span><span class="sh">"</span><span class="p">,</span> <span class="n">params</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">rate</span><span class="sh">"</span><span class="p">:</span> <span class="n">bg_rate</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">noise</span><span class="p">,</span> <span class="n">nodes_e</span><span class="p">,</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="n">psc_ext</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">})</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">noise</span><span class="p">,</span> <span class="n">nodes_i</span><span class="p">,</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="n">psc_ext</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">})</span>
</code></pre></div></div>

<p>We also define two helper functions to record the calcium concentration and connectivity in each population over time:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create some lists to store results:
</span><span class="n">mean_ca_e</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1"># mean calcium concentration of excitatory neurons
</span><span class="n">mean_ca_i</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1"># mean calcium concentration of inhibitory neurons
</span><span class="n">total_connections_e</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1"># total number of connections of excitatory neurons
</span><span class="n">total_connections_i</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1"># total number of connections of inhibitory neurons
</span>
<span class="c1"># define a function for recording the calcium concentration of all neurons:
</span><span class="k">def</span> <span class="nf">record_ca</span><span class="p">():</span>
    <span class="k">global</span> <span class="n">mean_ca_e</span><span class="p">,</span> <span class="n">mean_ca_i</span>
    <span class="n">ca_e</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">GetStatus</span><span class="p">(</span><span class="n">nodes_e</span><span class="p">,</span> <span class="sh">"</span><span class="s">Ca</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ca_i</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">GetStatus</span><span class="p">(</span><span class="n">nodes_i</span><span class="p">,</span> <span class="sh">"</span><span class="s">Ca</span><span class="sh">"</span><span class="p">)</span>
    <span class="c1"># we only record the mean calcium concentration of the neurons:
</span>    <span class="n">mean_ca_e</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">mean</span><span class="p">(</span><span class="n">ca_e</span><span class="p">))</span>
    <span class="n">mean_ca_i</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">mean</span><span class="p">(</span><span class="n">ca_i</span><span class="p">))</span>

<span class="c1"># define a function for recording the number of connections:
</span><span class="k">def</span> <span class="nf">record_connectivity</span><span class="p">():</span>
    <span class="sh">"""</span><span class="s"> 
    We retrieve the number of connected pre-synaptic elements of each neuron. 
    The total amount of excitatory connections is equal to the total amount of 
    connected excitatory pre-synaptic elements. The same applies for inhibitory 
    connections.
    </span><span class="sh">"""</span>
    <span class="k">global</span> <span class="n">total_connections_e</span><span class="p">,</span> <span class="n">total_connections_i</span>
    <span class="n">syn_elems_e</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">GetStatus</span><span class="p">(</span><span class="n">nodes_e</span><span class="p">,</span> <span class="sh">"</span><span class="s">synaptic_elements</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">syn_elems_i</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">GetStatus</span><span class="p">(</span><span class="n">nodes_i</span><span class="p">,</span> <span class="sh">"</span><span class="s">synaptic_elements</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">total_connections_e</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">sum</span><span class="p">(</span><span class="n">neuron</span><span class="p">[</span><span class="sh">"</span><span class="s">Axon_ex</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">z_connected</span><span class="sh">"</span><span class="p">]</span> <span class="k">for</span> <span class="n">neuron</span> <span class="ow">in</span> <span class="n">syn_elems_e</span><span class="p">))</span>
    <span class="n">total_connections_i</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="nf">sum</span><span class="p">(</span><span class="n">neuron</span><span class="p">[</span><span class="sh">"</span><span class="s">Axon_in</span><span class="sh">"</span><span class="p">][</span><span class="sh">"</span><span class="s">z_connected</span><span class="sh">"</span><span class="p">]</span> <span class="k">for</span> <span class="n">neuron</span> <span class="ow">in</span> <span class="n">syn_elems_i</span><span class="p">))</span>
</code></pre></div></div>

<p>Finally, we simulate the network and record the calcium concentration and connectivity over time. This will take some time to complete; a progress indicator is printed every 20 recording intervals (with 200 recording intervals in total, <code class="language-plaintext highlighter-rouge">i / 2</code> directly yields the progress in percent):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># simulate:
</span><span class="n">nest</span><span class="p">.</span><span class="nc">EnableStructuralPlasticity</span><span class="p">()</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">Starting simulation...</span><span class="sh">"</span><span class="p">)</span>
<span class="n">sim_steps</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">t_sim</span><span class="p">,</span> <span class="n">record_interval</span><span class="p">)</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">step</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">sim_steps</span><span class="p">):</span>
    <span class="n">nest</span><span class="p">.</span><span class="nc">Simulate</span><span class="p">(</span><span class="n">record_interval</span><span class="p">)</span>
    <span class="nf">record_ca</span><span class="p">()</span>
    <span class="nf">record_connectivity</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">i</span> <span class="o">%</span> <span class="mi">20</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="nf">print</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">  progress: </span><span class="si">{</span><span class="n">i</span> <span class="o">/</span> <span class="mi">2</span><span class="si">}</span><span class="s">%</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">...simulation finished.</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>

<p>For visualization, we plot the mean calcium concentration and connectivity over time:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plots:
</span><span class="n">fig</span><span class="p">,</span> <span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mf">4.5</span><span class="p">))</span>
<span class="n">ax1</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">mean_ca_e</span><span class="p">,</span> <span class="sh">"</span><span class="s">b</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">Ca concentration excitatory Neurons</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">mean_ca_i</span><span class="p">,</span> <span class="sh">"</span><span class="s">r</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">Ca concentration inhibitory Neurons</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="nf">axhline</span><span class="p">(</span><span class="n">growth_curve_i_e</span><span class="p">[</span><span class="sh">"</span><span class="s">eps</span><span class="sh">"</span><span class="p">],</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">4.0</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">#FF9999</span><span class="sh">"</span><span class="p">)</span> <span class="c1"># plot the growth curve for inhibitory neurons
</span><span class="n">ax1</span><span class="p">.</span><span class="nf">axhline</span><span class="p">(</span><span class="n">growth_curve_e_e</span><span class="p">[</span><span class="sh">"</span><span class="s">eps</span><span class="sh">"</span><span class="p">],</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">4.0</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">#9999FF</span><span class="sh">"</span><span class="p">)</span> <span class="c1"># plot the growth curve for excitatory neurons
</span><span class="n">ax1</span><span class="p">.</span><span class="nf">set_ylim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mf">0.28</span><span class="p">])</span>
<span class="n">ax1</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">time in [s]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">Ca concentration</span><span class="sh">"</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">upper right</span><span class="sh">'</span><span class="p">)</span>

<span class="n">ax2</span> <span class="o">=</span> <span class="n">ax1</span><span class="p">.</span><span class="nf">twinx</span><span class="p">()</span>
<span class="n">ax2</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">total_connections_e</span><span class="p">,</span> <span class="sh">"</span><span class="s">m</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">excitatory connections</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">total_connections_i</span><span class="p">,</span> <span class="sh">"</span><span class="s">k</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">inhibitory connections</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="nf">set_ylim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2500</span><span class="p">])</span>
<span class="n">ax2</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">Connections</span><span class="sh">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">lower right</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="sh">"</span><span class="s">figures/structural_plasticity.png</span><span class="sh">"</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<h2 id="results">Results</h2>
<p>Let’s have a look at the simulation results:</p>

<p class="align-caption"><a href="/assets/images/posts/nest/structural_plasticity.png" title="Simulation results of the structural plasticity model."><img src="/assets/images/posts/nest/structural_plasticity.png" width="100%" alt="Simulation results of the structural plasticity model." /></a>
Simulation results of the structural plasticity model. The plot shows the temporal evolution of the mean calcium concentration of excitatory and inhibitory neurons (blue and red lines, respectively) and the total number of connections of excitatory and inhibitory neurons (magenta and black dashed lines, respectively). The horizontal lines represent the target calcium levels (<code class="language-plaintext highlighter-rouge">eps</code>) of the growth curves for excitatory (blue) and inhibitory (red) neurons. The model demonstrates how the network’s connectivity changes over time based on the calcium concentration in the neurons. The growth curves determine the threshold at which <a href="/blog/2026-02-12-stdp/#synapse">synapses</a> are created or pruned.</p>

<p>The plot shows the evolution of calcium concentration and the number of <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> connections over time. The blue and red lines represent the mean calcium concentration of excitatory and inhibitory neurons, respectively. The magenta and black dashed lines show the total number of connections of excitatory and inhibitory neurons. The horizontal lines show the target calcium concentration for excitatory neurons (blue) and inhibitory neurons (red), taken from the growth curve parameters (<code class="language-plaintext highlighter-rouge">eps</code>).</p>

<p><strong>Calcium concentration</strong><br />
The calcium concentration of the excitatory neurons starts low and gradually increases, showing some fluctuations around the target level. This suggests that the excitatory neurons are adjusting their synaptic elements in response to the calcium dynamics to maintain homeostasis (i.e., the desired level of activity).</p>

<p>The calcium concentration in inhibitory neurons also starts low but increases more rapidly compared to excitatory neurons. This rapid increase indicates strong synaptic activity and adjustment in inhibitory neurons. However, the calcium concentration does not stabilize at the target level but continues to increase, suggesting a more dynamic regulation in inhibitory neurons.</p>

<p><strong>Number of connections</strong><br />
The number of excitatory connections increases steadily with some fluctuations, reflecting synaptic growth driven by structural plasticity. The growth slows down and turns into a decay phase after reaching a certain level toward the end of the simulation.</p>

<p>The number of inhibitory connections increases as well, but in a more balanced and regulated manner than the excitatory connections (see the interpretation below), reflecting the higher calcium target (<code class="language-plaintext highlighter-rouge">eps</code> = 0.2) assigned to the inhibitory neurons’ <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> elements.</p>

<p><strong>Interpretation:</strong></p>

<ul>
  <li><strong>Homeostatic plasticity</strong>: The calcium concentrations approaching their respective target levels (horizontal lines) illustrate the homeostatic regulation of <a href="/blog/2026-02-12-stdp/#synapse">synaptic elements</a>. This mechanism ensures that neurons maintain a stable level of activity over time.</li>
  <li><strong>Synaptic growth and retraction</strong>: The increase and subsequent decay in the number of connections (both excitatory and inhibitory) indicate synaptic growth and retraction driven by structural plasticity. The dynamic nature of these changes is crucial for maintaining calcium homeostasis and overall network stability.</li>
  <li><strong>Structural plasticity dynamics</strong>: The plot highlights the dynamics of structural plasticity, where neurons continuously adjust their synaptic connections to maintain calcium homeostasis. This adjustment is vital for network stability and functionality.</li>
  <li><strong>Excitatory vs. inhibitory dynamics</strong>: The difference in growth rates between excitatory and inhibitory connections, with excitatory connections showing a more pronounced increase and later decay, reflects the different roles and dynamics of these types of neurons in the network. Excitatory neurons tend to have more dynamic growth phases, while inhibitory neurons show a more balanced and regulated adjustment.</li>
</ul>

<h2 id="conclusion">Conclusion</h2>
<p>Incorporating structural plasticity in neural network models allows us to simulate the dynamic changes in <a href="/blog/2026-02-12-stdp/#synapse">synaptic connections</a> observed in the brain. By modeling the growth and pruning of synaptic elements based on calcium concentration and growth curves, we can capture the adaptive nature of neural networks and their ability to reorganize in response to activity levels. The simulation results demonstrate how structural plasticity drives the formation and elimination of <a href="/blog/2026-02-12-stdp/#synapse">synapses</a>, maintaining network stability and homeostasis. This approach provides insights into the mechanisms underlying <a href="/blog/2026-02-02-neural_plasticity_and_learning/">learning, memory formation, and network plasticity</a> in the brain.</p>

<p>The complete code used in this blog post is available in this <a href="https://github.com/FabrizioMusacchio/neural_dynamics">Github repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (<code class="language-plaintext highlighter-rouge">structural_plasticity.py</code>). Feel free to modify and expand upon it, and share your insights.</p>

<h2 id="references">References</h2>
<ul>
  <li>Markus Butz, Arjen van Ooyen, <em>A Simple Rule for Dendritic Spine and Axonal Bouton Formation Can Account for Cortical Reorganization after Focal Retinal Lesions</em>, 2013, PLoS Computational Biology, Vol. 9, Issue 10, pages e1003259, doi: <a href="https://doi.org/10.1371/journal.pcbi.1003259">10.1371/journal.pcbi.1003259</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/auto_examples/structural_plasticity.html">NEST’s tutorial “Structural Plasticity example”</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/models/iaf_psc_alpha.html">NEST’s <code class="language-plaintext highlighter-rouge">iaf_psc_alpha</code> model description</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/ref_material/pynest_api/nest.NestModule.html#nest.NestModule.structural_plasticity_synapses">NEST’s <code class="language-plaintext highlighter-rouge">structural_plasticity_synapses</code> method</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Bernardinelli, Y., Nikonenko, I., Muller, D., <em>Structural plasticity: mechanisms and contribution to developmental psychiatric disorders</em>, 2014, Frontiers in Neuroanatomy, 8:123, doi: <a href="https://doi.org/10.3389/fnana.2014.00123">10.3389/fnana.2014.00123</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
</ul>

<!-- 
Write a Mastodon post summarizing this article in an objective, academic tone. Don't write ABOUT the article, but about its content/topic. (max. 450 characters + URL, which follows this scheme: https://www.fabriziomusacchio.com/blog/[FILE-NAME_WITHOUT_FILE-EXTENSION]/):

Incorporating structural plasticity in #SpikingNeuralNetworks (#SNN) enables dynamic synaptic connectivity, reflecting the #brain's adaptability. By modeling #synaptic growth and pruning based on #calcium concentration, we can simulate #learning and #MemoryFormation processes. In this post, I reproduce the #NESTSimulator tutorial on structural plasticity, demonstrating its impact on network stability and #homeostasis:

🌍 https://www.fabriziomusacchio.com/blog/2026-02-01-structural_plasticity/

#CompNeuro #Neuroscience #NeuralNetworks #NEST
-->]]></content><author><name> </name></author><category term="Python" /><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[In standard spiking neural networks (SNN), synaptic connections between neurons are typically fixed or change only according to specific plasticity rules, such as Hebbian learning or Spike-Timing Dependent Plasticity (STDP). However, the brain's connectivity is not static. Neurons can grow and retract synapses in response to activity levels and environmental conditions. A phenomenon known as structural plasticity. This process plays a crucial role in learning and memory formation in the brain. To illustrate how structural plasticity can be modeled in spiking neural networks, in this post, we will use the NEST Simulator and replicate the tutorial on 'Structural Plasticity.]]></summary></entry><entry><title type="html">Linear mixed models in practice: When ANCOVA is enough and when you really need random effects</title><link href="/blog/2026-01-31-linear_mixed_models/" rel="alternate" type="text/html" title="Linear mixed models in practice: When ANCOVA is enough and when you really need random effects" /><published>2026-01-31T18:56:22+01:00</published><updated>2026-01-31T18:56:22+01:00</updated><id>/blog/linear_mixed_models</id><content type="html" xml:base="/blog/2026-01-31-linear_mixed_models/"><![CDATA[<p>In our lab this week, a recurring question came up again: When should we use linear mixed models, how do they differ from “classical” approaches such as ANOVA or ANCOVA, and how should the results be interpreted in practice? This post is a walkthrough that hopefully clarifies the conceptual differences and shows, why many experimental designs in neuroscience (and other fields) should be considered hierarchical and thus call for mixed models.</p>

<p class="align-caption"><a href="[/assets/images/posts/lmm/correlation.png](https://upload.wikimedia.org/wikipedia/commons/1/12/Mixedandfixedeffects.jpg)" title="Fixed, mixed, and random effects influence linear regression models."><img src="https://upload.wikimedia.org/wikipedia/commons/1/12/Mixedandfixedeffects.jpg" width="100%" alt="Fixed, mixed, and random effects influence linear regression models." /></a>
Fixed, mixed, and random effects influence linear regression models. Linear mixed models explicitly model correlations in hierarchical or grouped data via random effects. They extend standard linear regression by adding random intercepts and slopes to capture variability between groups or subjects. In cases, your data are clustered or correlated (e.g., repeated measures, nested designs), LMMs provide a flexible framework to account for these dependencies, improving inference and interpretability. Source: <a href="https://w.wiki/HfhM">Wikimedia Commons</a> (license: CC BY-SA 4.0)</p>

<p>Linear mixed models (LMMs), also called linear mixed effects models, extend classical linear regression and ANOVA by explicitly representing hierarchical dependence structures. They are the default tool when observations are <em>grouped</em> or <em>repeated</em>, and thus <em>not independent</em>. Typical examples are</p>

<ul>
  <li>repeated measurements per subject,</li>
  <li>multiple neurons per animal,</li>
  <li>multiple sessions per day, or</li>
  <li>trials nested within sessions nested within animals.</li>
</ul>

<p>In all these cases, “ordinary” regression treats correlated observations as if they were independent and can therefore deliver misleading standard errors, p values, and effect estimates.</p>
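
<p>The cost of ignoring this dependence can be made explicit. For $m$ observations per cluster with a within cluster correlation $\rho$, the variance of a sample mean is inflated by the design effect</p>

\[\begin{align}
\mathrm{DEFF} = 1 + (m-1)\rho,
\end{align}\]

<p>so the effective sample size is $n_{\mathrm{eff}} = n/\mathrm{DEFF}$ rather than $n$. With, say, $m=40$ trials per animal and $\rho=0.5$, each animal contributes roughly the information of two independent observations, not forty.</p>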

<p>The central idea of mixed models is simple: Separate the effect structure into a part that is assumed to be shared across the whole population and a part that captures systematic variation between groups.</p>

<h2 id="why-and-when-lmm-instead-of-a-t-test-anova-or-ancova">Why and when LMM instead of a t test, ANOVA, or ANCOVA</h2>
<p>Classical tests such as the t test or ANOVA assume independent observations, or they allow dependence only under very restrictive covariance assumptions and balanced designs. Real experiments, especially in neuroscience, rarely satisfy these assumptions. Typically, measurements within the same animal, neuron, day, or session are correlated, and the number of observations per unit is often unequal.</p>

<p>LMMs are preferable in these situations because they represent dependence via random effects rather than by forcing you into ad hoc aggregation, dropping data to balance designs, or ignoring hierarchy altogether. They can handle unbalanced designs, incorporate continuous covariates naturally, and model individual variability in effects through random slopes. Repeated measures ANOVA can be understood as a special case of an LMM with a strongly constrained covariance structure. The t test is another special case of a linear model in which the predictor is binary and there are no random effects.</p>

<p>A useful practical summary is: If your data are <em>clustered</em> or <em>repeated</em> and you care about <em>inference at the population level</em>, treat the grouping explicitly. Sometimes fixed effects ANCOVA is sufficient, sometimes it is not. The distinction becomes sharp when</p>

<ul>
  <li>the number of groups grows,</li>
  <li>the number of observations per group becomes small,</li>
  <li>the design becomes unbalanced, or</li>
  <li>you need to generalize beyond the observed groups.</li>
</ul>

<h2 id="the-core-model-in-equations">The core model in equations</h2>
<p>A linear mixed model consists of two components:</p>

<ul>
  <li><strong>fixed effects:</strong> Effects of interest that are systematically estimated (e.g., stimulus condition, time, group).</li>
  <li><strong>random effects:</strong> Random deviations of individual groups/individuals from the population (e.g., different baselines per person, different slopes per neuron).</li>
</ul>

<p>Let’s look at this mathematically. An LMM is typically written as</p>

\[\begin{align}
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{b} + \boldsymbol{\varepsilon}.
\end{align}\]

<p>Here, $\mathbf{y}\in\mathbb{R}^n$ is the vector of observed responses, $\mathbf{X}$ is the fixed effect design matrix, $\boldsymbol{\beta}$ are the fixed effect coefficients, $\mathbf{Z}$ is the random effect design matrix, $\mathbf{b}$ are the group specific random effect coefficients, and $\boldsymbol{\varepsilon}$ are residuals.</p>
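
<p>To make $\mathbf{Z}$ concrete, here is a minimal <code class="language-plaintext highlighter-rouge">numpy</code> sketch (with made-up dimensions) for a random intercept design: $\mathbf{Z}$ is simply an indicator matrix that maps each observation to its group:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

# 3 subjects with 2 observations each (n = 6 observations in total):
n_subjects, n_obs = 3, 2

# random effect design matrix Z for a random intercept model;
# each column indicates membership of one subject:
Z = np.kron(np.eye(n_subjects), np.ones((n_obs, 1)))

# with G = sigma0^2 * I, the induced covariance Z G Z^T is block
# diagonal: observations sharing a subject are correlated, all
# other pairs are not:
sigma0_sq = 2.0
print(Z @ (sigma0_sq * np.eye(n_subjects)) @ Z.T)
</code></pre></div></div>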

<p>The assumptions are</p>

\[\begin{align}
\mathbf{b}\sim\mathcal{N}(\mathbf{0},\mathbf{G}), \\
\boldsymbol{\varepsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{R}),
\end{align}\]

<p>and $\mathbf{b}$ and $\boldsymbol{\varepsilon}$ are independent. Therefore</p>

\[\begin{align}
\mathbf{y}\sim\mathcal{N}\!\left(\mathbf{X}\boldsymbol{\beta},\; \mathbf{Z}\mathbf{G}\mathbf{Z}^\top + \mathbf{R}\right).
\end{align}\]

<p>The term $\mathbf{Z}\mathbf{G}\mathbf{Z}^\top$ is where the hierarchy lives. It induces correlations between observations that share a group identity. This is exactly what ordinary regression cannot represent.</p>
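
<p>For the random intercept model introduced below, this covariance structure can be written out explicitly: two observations $j \neq k$ from the same subject share the subject’s intercept deviation, so</p>

\[\begin{align}
\mathrm{Cov}(y_{ij}, y_{ik}) = \sigma_0^2, \qquad \mathrm{Var}(y_{ij}) = \sigma_0^2 + \sigma^2, \qquad \mathrm{ICC} = \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2},
\end{align}\]

<p>where the intraclass correlation coefficient (ICC) quantifies how strongly observations within a subject are correlated; observations from different subjects remain uncorrelated.</p>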

<h3 id="random-intercept-model">Random intercept model</h3>
<p>Random intercepts allow each group or subject to have its own baseline level. This is useful when subjects differ in their average response, but the effect of predictors is assumed to be the same across subjects.</p>

<p>For subject $i$ and observation $j$, a random intercept model reads</p>

\[\begin{align}
y_{ij} &amp;= \beta_0 + \beta_1 x_{ij} + b_{0i} + \varepsilon_{ij},
\end{align}\]

<p>with $b_{0i}\sim\mathcal{N}(0,\sigma_0^2)$ and $\varepsilon_{ij}\sim\mathcal{N}(0,\sigma^2)$. Each subject has its own baseline shift $b_{0i}$ around the population intercept $\beta_0$.</p>
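
<p>As a preview of the syntax used in the numerical example below, such a random intercept model can be fitted in one line with <code class="language-plaintext highlighter-rouge">statsmodels</code>. The following is only a sketch with a hypothetical toy dataset (the column names <code class="language-plaintext highlighter-rouge">response</code>, <code class="language-plaintext highlighter-rouge">stimulus</code>, and <code class="language-plaintext highlighter-rouge">animal</code> are made up; the simulation parameters echo the ones we use below):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# tiny illustrative dataset: 6 animals x 40 observations,
# with an animal-specific baseline shift (random intercept):
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "animal": np.repeat([f"subj_{i}" for i in range(6)], 40),
    "stimulus": rng.integers(0, 11, size=240).astype(float)})
offsets = dict(zip(df["animal"].unique(), rng.normal(0.0, 2.5, size=6)))
df["response"] = (8.0 + 0.25 * df["stimulus"]
                  + df["animal"].map(offsets)
                  + rng.normal(0.0, 1.0, size=240))

# random intercept model: fixed effects for intercept and stimulus,
# plus one random intercept per animal (the grouping variable):
m_intercept = smf.mixedlm("response ~ stimulus", df, groups=df["animal"]).fit()
print(m_intercept.summary())
</code></pre></div></div>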

<h3 id="random-intercept-and-random-slope-model">Random intercept and random slope model</h3>
<p>A random intercept and random slope model allows each group or subject to have its own baseline and its own sensitivity to predictors. This is useful when subjects differ not only in their average response but also in how they respond to changes in predictors.</p>

<p>That is, if subjects differ both in their baseline and in their sensitivity to $x$, we write</p>

\[\begin{align}
y_{ij} &amp;= \beta_0 + \beta_1 x_{ij} + b_{0i} + b_{1i}x_{ij} + \varepsilon_{ij},
\end{align}\]

<p>with</p>

\[\begin{align}\begin{pmatrix} b_{0i} \\ b_{1i}\end{pmatrix}&amp;\sim\mathcal{N}\!\left(\begin{pmatrix}0\\0\end{pmatrix},\begin{pmatrix}\sigma_0^2 &amp; \rho\sigma_0\sigma_1 \\ \rho\sigma_0\sigma_1 &amp; \sigma_1^2\end{pmatrix}\right).
\end{align}\]

<p>The correlation parameter $\rho$ matters in practice. It captures whether higher baseline subjects also tend to have steeper or flatter slopes.</p>
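
<p>In <code class="language-plaintext highlighter-rouge">statsmodels</code>, the random slope is added via the <code class="language-plaintext highlighter-rouge">re_formula</code> argument. Again only a sketch, reusing the hypothetical toy dataset <code class="language-plaintext highlighter-rouge">df</code> from above (which contains no true slope variability, so the slope variance will be estimated near zero here):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># random intercept and random slope model: each animal gets its own
# baseline and its own sensitivity to the stimulus; the summary also
# reports the estimated intercept-slope covariance (related to rho):
m_slope = smf.mixedlm("response ~ stimulus", df,
                      groups=df["animal"], re_formula="~stimulus").fit()
print(m_slope.summary())
</code></pre></div></div>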

<h3 id="blups">BLUPs</h3>
<p>The group specific random effect estimates $\hat{\mathbf{b}}_i$ that are often printed in software output are conditional estimates given the data. In the classical terminology, they are called Best Linear Unbiased Predictors (BLUPs). They should not be interpreted as independent fixed parameters but as shrunken estimates under a hierarchical prior implied by $\mathbf{b}\sim\mathcal{N}(0,\mathbf{G})$.</p>
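
<p>In <code class="language-plaintext highlighter-rouge">statsmodels</code>, these conditional estimates are exposed on the fitted results object. Continuing the sketch from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># per-group conditional estimates (BLUPs), returned as a dict that
# maps each group label to its estimated random effect deviations:
for animal, blup in m_slope.random_effects.items():
    print(animal, blup.values)
</code></pre></div></div>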

<h2 id="general-interpretation-of-results">General interpretation of results</h2>
<p><strong>Fixed effects</strong> are <em>population level parameters</em>. They answer questions such as: Does stimulus strength increase the response on average, across animals? In typical mixed model output (e.g., <code class="language-plaintext highlighter-rouge">statsmodels</code> in Python, which we will use in our example below), these are the coefficients in the top part of the table, typically accompanied by a standard error and a Wald statistic. In a random intercept and slope model, the fixed effects define the population line, i.e., the average intercept $\beta_0$ and average slope $\beta_1$.</p>

<p><strong>Random effects</strong> describe <em>between group variability</em> and thereby the induced dependence structure. Their variances and covariances are variance components. They answer questions such as: How much do animals differ in baseline response, how much do they differ in stimulus sensitivity, and are those two differences correlated? The group specific estimates printed alongside them are the shrunken conditional estimates (BLUPs) discussed above and should not be read as independent fixed parameters.</p>

<p><strong>Model diagnostics</strong> should address whether residuals are compatible with the assumed Gaussian noise model, whether variance is roughly constant across fitted values, whether random effects are plausible, whether the optimizer converged, and whether the random effect structure is overparameterized. “Singular” fits or near zero variance components indicate that the model is too complex for the information content in the data. Typical tools to assess model fit and assumptions are (a sketch for the first two checks follows this list):</p>

<ul>
  <li>residuals vs. fitted values plots (the first check),</li>
  <li>QQ plots, both of the residuals and of the random effects,</li>
  <li>normality checks of the random effects (BLUPs), and</li>
  <li>checks for singular fits (variance components estimated at or near zero).</li>
</ul>
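
<p>A minimal diagnostic sketch, assuming a fitted <code class="language-plaintext highlighter-rouge">statsmodels</code> results object such as the hypothetical <code class="language-plaintext highlighter-rouge">m_slope</code> from the snippets above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import matplotlib.pyplot as plt
import scipy.stats as spstats

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

# residuals vs. fitted values (look for trends or funnel shapes):
ax1.scatter(m_slope.fittedvalues, m_slope.resid, alpha=0.5)
ax1.axhline(0.0, color="gray", linestyle="--")
ax1.set_xlabel("fitted values")
ax1.set_ylabel("residuals")

# QQ plot of the residuals against a normal distribution:
spstats.probplot(m_slope.resid, dist="norm", plot=ax2)

plt.tight_layout()
plt.show()
</code></pre></div></div>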

<h2 id="numerical-example-in-python">Numerical example in Python</h2>
<p>In order to illustrate the concepts above, we implement a small didactic pipeline in Python. The structure and the diagnostic visualizations are adapted from <a href="https://duchesnay.github.io">Edouard Duchesnay</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>’s excellent <a href="https://duchesnay.github.io/pystatsml/statistics/lmm/lmm.html">LMM tutorial</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>. The main change is that the example dataset is framed in neuroscience terms, i.e., we simulate some artificial neural responses with stimulus strength as a <em>continuous predictor</em> and neural response as the <em>dependent variable</em>. Many thanks to Edouard Duchesnay for making this material openly available and for setting such a high standard for didactic explanations!</p>

<p>We will proceed as follows. We first simulate grouped data, inspect raw structure, and then fit a sequence of models that incrementally represent the hierarchy:</p>

<ul>
  <li>global OLS,</li>
  <li>ANCOVA with group intercepts,</li>
  <li>aggregation and two stage approaches, and</li>
  <li>mixed models with random intercepts and random intercepts plus slopes.</li>
</ul>

<p>We then compare fitted lines and residual diagnostics and finally highlight one of the most important conceptual differences between ANCOVA with group specific slopes and LMMs: Shrinkage.</p>

<h3 id="toy-dataset-generation">Toy dataset generation</h3>
<p>As described above, we simulate a dataset with multiple animals, each contributing multiple observations at different stimulus strengths.</p>

<p>The simulator function defined below draws a set of animals and assigns each animal a random intercept and a random slope around population parameters $\beta_0$ and $\beta_1$. Observations are then generated at discrete stimulus strengths and corrupted by Gaussian noise. This produces exactly the kind of clustered structure one encounters in practice: Data points are not independent because points from the same animal share latent offsets.</p>

<p>Let’s implement this in code. We begin by importing the necessary libraries and setting some general plotting aesthetics for the entire script:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="n">seaborn</span> <span class="k">as</span> <span class="n">sns</span>
<span class="kn">import</span> <span class="n">itertools</span>
<span class="kn">import</span> <span class="n">scipy.stats</span> <span class="k">as</span> <span class="n">spstats</span>
<span class="kn">import</span> <span class="n">statsmodels.api</span> <span class="k">as</span> <span class="n">sm</span>
<span class="kn">import</span> <span class="n">statsmodels.formula.api</span> <span class="k">as</span> <span class="n">smf</span>

<span class="c1"># remove spines right and top for better aesthetics:
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">'</span><span class="s">axes.spines.right</span><span class="sh">'</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">'</span><span class="s">axes.spines.top</span><span class="sh">'</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">'</span><span class="s">axes.spines.left</span><span class="sh">'</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">'</span><span class="s">axes.spines.bottom</span><span class="sh">'</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">.</span><span class="nf">update</span><span class="p">({</span><span class="sh">'</span><span class="s">font.size</span><span class="sh">'</span><span class="p">:</span> <span class="mi">12</span><span class="p">})</span>
</code></pre></div></div>

<p>Here is the simulator function:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">simulate_animal_data</span><span class="p">(</span>
    <span class="n">seed</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
    <span class="n">n_animals</span><span class="o">=</span><span class="mi">6</span><span class="p">,</span>
    <span class="n">n_obs_per_animal</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span>
    <span class="n">beta0</span><span class="o">=</span><span class="mf">8.0</span><span class="p">,</span>             <span class="c1"># population intercept
</span>    <span class="n">beta1</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span>            <span class="c1"># population slope for x
</span>    <span class="n">sigma_animal_intercept</span><span class="o">=</span><span class="mf">2.5</span><span class="p">,</span>
    <span class="n">sigma_animal_slope</span><span class="o">=</span><span class="mf">0.12</span><span class="p">,</span>
    <span class="n">sigma_noise</span><span class="o">=</span><span class="mf">1.0</span><span class="p">):</span>
    <span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">default_rng</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>
    <span class="n">animals</span> <span class="o">=</span> <span class="p">[</span><span class="sa">f</span><span class="sh">"</span><span class="s">subj_</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="sh">"</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">n_animals</span><span class="p">)]</span>

    <span class="c1"># animal random intercepts and slopes:
</span>    <span class="n">b0</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">sigma_animal_intercept</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_animals</span><span class="p">)</span>
    <span class="n">b1</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">sigma_animal_slope</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_animals</span><span class="p">)</span>

    <span class="n">rows</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">a</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">animals</span><span class="p">):</span>
        <span class="c1"># choose a plausible x distribution:
</span>        <span class="c1"># e.g. stimulus strength 0..10
</span>        <span class="n">x</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="nf">integers</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_obs_per_animal</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="nb">float</span><span class="p">)</span>

        <span class="c1"># conditional mean with animal specific intercept and slope
</span>        <span class="n">mu</span> <span class="o">=</span> <span class="p">(</span><span class="n">beta0</span> <span class="o">+</span> <span class="n">b0</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">+</span> <span class="p">(</span><span class="n">beta1</span> <span class="o">+</span> <span class="n">b1</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">*</span> <span class="n">x</span>
        <span class="n">y</span> <span class="o">=</span> <span class="n">mu</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">sigma_noise</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_obs_per_animal</span><span class="p">)</span>

        <span class="k">for</span> <span class="n">xx</span><span class="p">,</span> <span class="n">yy</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
            <span class="n">rows</span><span class="p">.</span><span class="nf">append</span><span class="p">({</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">:</span> <span class="n">a</span><span class="p">,</span> <span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">:</span> <span class="n">xx</span><span class="p">,</span> <span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">:</span> <span class="n">yy</span><span class="p">})</span>

    <span class="k">return</span> <span class="n">pd</span><span class="p">.</span><span class="nc">DataFrame</span><span class="p">(</span><span class="n">rows</span><span class="p">)</span>
</code></pre></div></div>

<p>We generate a dataset with three animals, each contributing 40 observations at different stimulus strengths. The population intercept is set to 8.0, the population slope to 0.25, and we specify standard deviations for animal-specific intercepts and slopes as well as observation noise:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">outpath</span> <span class="o">=</span> <span class="sh">"</span><span class="s">llm_results</span><span class="sh">"</span>
<span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">df</span> <span class="o">=</span> <span class="nf">simulate_animal_data</span><span class="p">(</span><span class="n">seed</span><span class="o">=</span><span class="mi">41</span><span class="p">,</span>
                          <span class="n">n_animals</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
                          <span class="n">n_obs_per_animal</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span>
                          <span class="n">beta0</span><span class="o">=</span><span class="mf">8.0</span><span class="p">,</span>
                          <span class="n">beta1</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span>
                          <span class="n">sigma_animal_intercept</span><span class="o">=</span><span class="mf">2.5</span><span class="p">,</span>
                          <span class="n">sigma_animal_slope</span><span class="o">=</span><span class="mf">0.12</span><span class="p">,</span>
                          <span class="n">sigma_noise</span><span class="o">=</span><span class="mf">1.0</span><span class="p">)</span>

</code></pre></div></div>

<p>Let’s visualize the raw data, color coded by animal:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">6.5</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="k">for</span> <span class="n">a</span><span class="p">,</span> <span class="n">g</span> <span class="ow">in</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">):</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">scatter</span><span class="p">(</span><span class="n">g</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">],</span> <span class="n">g</span><span class="p">[</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">],</span> <span class="n">s</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.8</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">a</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">x: stimulus strength (=continuous predictor)</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">y: neural response (=dependent variable)</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">"</span><span class="s">Raw data, grouped by animal</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">upper left</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="sh">"</span><span class="s">raw_data_by_animal.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/lmm/raw_data_by_animal.png" title="Raw simulated dataset grouped by animal."><img src="/assets/images/posts/lmm/raw_data_by_animal.png" width="100%" alt="Raw simulated dataset grouped by animal." /></a>
Raw simulated dataset grouped by animal. Each point is a single observation with stimulus strength $x$ and response $y$. Colors indicate animals, illustrating that animals differ both in baseline response and in the apparent dependence on stimulus strength.</p>

<p>The key feature in our dataset is that points within an animal are more similar to each other than points across animals. This is a direct violation of the independence assumption of ordinary least squares. A global regression line cannot represent the fact that each animal occupies its own band in response space (as we will see in the next section). Any method that treats all points as independent risks overstating certainty because it counts within animal variation as independent evidence.</p>
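<p>A quick, informal way to see this clustering in numbers (a side check, not part of the modeling workflow below) is to compare per-animal summaries of the response. If the animal means spread out far more than the within-animal standard deviations suggest, the points are clearly not exchangeable across animals:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># per-animal mean, standard deviation, and count of the response:
print(df.groupby("animal")["y"].agg(["mean", "std", "count"]))
</code></pre></div></div>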

<h3 id="helper-functions-for-diagnostics">Helper functions for diagnostics</h3>
<p>Before we continue with model fitting, we define two helper functions for plotting diagnostic figures. The first creates QQ plots of residuals; the second plots residuals against fitted values and shows residual distributions, optionally grouped by animal.</p>

<p>The diagnostic visualizations serve two roles. The QQ plot checks whether residuals are compatible with a Gaussian distribution under the fitted model. The three panel residual plot provides a quick view of heteroscedasticity or systematic patterns in residuals, the overall residual distribution, and whether residual distributions differ across groups. These are not sufficient conditions for model correctness, but they are necessary checks that should be routine.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_qq_diagnostics</span><span class="p">(</span><span class="n">resid</span><span class="p">,</span> <span class="n">fitted</span><span class="p">,</span> <span class="n">groups</span><span class="p">,</span> <span class="n">title_prefix</span><span class="p">,</span> <span class="n">outpath</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
    <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
    <span class="n">sm</span><span class="p">.</span><span class="nf">qqplot</span><span class="p">(</span><span class="n">resid</span><span class="p">,</span> <span class="n">line</span><span class="o">=</span><span class="sh">"</span><span class="s">45</span><span class="sh">"</span><span class="p">,</span> <span class="n">fit</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">title_prefix</span><span class="si">}</span><span class="s">:</span><span class="se">\n</span><span class="s">QQ plot of residuals</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">outpath</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">title_prefix</span><span class="si">}</span><span class="s">_qqplot.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_residual_diagnostics</span><span class="p">(</span><span class="n">residual</span><span class="p">,</span> <span class="n">prediction</span><span class="p">,</span> <span class="n">group</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">group_boxplot</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
                      <span class="n">title_prefix</span><span class="o">=</span><span class="sh">""</span><span class="p">,</span> <span class="n">outpath</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">fname_prefix</span><span class="o">=</span><span class="sh">""</span><span class="p">):</span>
    <span class="n">diag_df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="nc">DataFrame</span><span class="p">(</span><span class="nf">dict</span><span class="p">(</span><span class="n">prediction</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">prediction</span><span class="p">),</span> <span class="n">residual</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">residual</span><span class="p">)))</span>
    <span class="k">if</span> <span class="n">group</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">diag_df</span><span class="p">[</span><span class="sh">"</span><span class="s">group</span><span class="sh">"</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">group</span><span class="p">)</span>

        <span class="n">fig</span><span class="p">,</span> <span class="n">axes</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mf">3.5</span><span class="p">),</span> <span class="n">sharey</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

        <span class="n">sns</span><span class="p">.</span><span class="nf">scatterplot</span><span class="p">(</span>
            <span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">prediction</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">residual</span><span class="sh">"</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="sh">"</span><span class="s">group</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">diag_df</span><span class="p">,</span>
            <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">s</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.85</span><span class="p">,</span> <span class="n">legend</span><span class="o">=</span><span class="bp">False</span>
        <span class="p">)</span>
        <span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nf">axhline</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">1.0</span><span class="p">)</span>
        <span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Residuals vs pred</span><span class="sh">"</span><span class="p">)</span>

        <span class="n">sns</span><span class="p">.</span><span class="nf">kdeplot</span><span class="p">(</span><span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">residual</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">diag_df</span><span class="p">,</span> <span class="n">fill</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
        <span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Residuals</span><span class="sh">"</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">group_boxplot</span><span class="p">:</span>
            <span class="n">sns</span><span class="p">.</span><span class="nf">boxplot</span><span class="p">(</span><span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">residual</span><span class="sh">"</span><span class="p">,</span> <span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">group</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">diag_df</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
            <span class="n">axes</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Residuals by group</span><span class="sh">"</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">sns</span><span class="p">.</span><span class="nf">kdeplot</span><span class="p">(</span><span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">residual</span><span class="sh">"</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="sh">"</span><span class="s">group</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">diag_df</span><span class="p">,</span> <span class="n">fill</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
            <span class="n">axes</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Residuals by group</span><span class="sh">"</span><span class="p">)</span>

        <span class="c1"># if group/animal size is &gt;4, don't show legend:
</span>        <span class="k">if</span> <span class="n">diag_df</span><span class="p">[</span><span class="sh">"</span><span class="s">group</span><span class="sh">"</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">4</span><span class="p">:</span>
            <span class="c1"># ensure legend is visible and tidy
</span>            <span class="n">leg</span> <span class="o">=</span> <span class="n">axes</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="nf">get_legend</span><span class="p">()</span>
            <span class="k">if</span> <span class="n">leg</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
                <span class="n">axes</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="nf">legend</span><span class="p">(</span><span class="n">title</span><span class="o">=</span><span class="sh">"</span><span class="s">group</span><span class="sh">"</span><span class="p">,</span> <span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">leg</span><span class="p">.</span><span class="nf">set_frame_on</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="c1"># remove legend for many groups
</span>            <span class="n">leg</span> <span class="o">=</span> <span class="n">axes</span><span class="p">[</span><span class="mi">2</span><span class="p">].</span><span class="nf">get_legend</span><span class="p">()</span>
            <span class="k">if</span> <span class="n">leg</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
                <span class="n">leg</span><span class="p">.</span><span class="nf">remove</span><span class="p">()</span>

        <span class="k">if</span> <span class="n">title_prefix</span><span class="p">:</span>
            <span class="n">fig</span><span class="p">.</span><span class="nf">suptitle</span><span class="p">(</span><span class="n">title_prefix</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>

        <span class="k">if</span> <span class="n">outpath</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
            <span class="n">fn</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">fname_prefix</span><span class="si">}</span><span class="s">_lm_diagnosis.png</span><span class="sh">"</span> <span class="k">if</span> <span class="n">fname_prefix</span> <span class="k">else</span> <span class="sh">"</span><span class="s">lm_diagnosis.png</span><span class="sh">"</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fn</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>

    <span class="k">else</span><span class="p">:</span>
        <span class="n">fig</span><span class="p">,</span> <span class="n">axes</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mf">3.5</span><span class="p">),</span> <span class="n">sharey</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
        <span class="n">sns</span><span class="p">.</span><span class="nf">scatterplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">prediction</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">residual</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">diag_df</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">s</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.85</span><span class="p">)</span>
        <span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nf">axhline</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">1.0</span><span class="p">)</span>
        <span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Residuals vs pred</span><span class="sh">"</span><span class="p">)</span>

        <span class="n">sns</span><span class="p">.</span><span class="nf">kdeplot</span><span class="p">(</span><span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">residual</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">diag_df</span><span class="p">,</span> <span class="n">fill</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
        <span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Residuals</span><span class="sh">"</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">title_prefix</span><span class="p">:</span>
            <span class="n">fig</span><span class="p">.</span><span class="nf">suptitle</span><span class="p">(</span><span class="n">title_prefix</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>

        <span class="k">if</span> <span class="n">outpath</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
            <span class="n">fn</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">fname_prefix</span><span class="si">}</span><span class="s">_lm_diagnosis.png</span><span class="sh">"</span> <span class="k">if</span> <span class="n">fname_prefix</span> <span class="k">else</span> <span class="sh">"</span><span class="s">lm_diagnosis.png</span><span class="sh">"</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fn</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<p>We will use these functions repeatedly below to assess model fit.</p>

<h3 id="global-ols-ignoring-grouping">Global OLS ignoring grouping</h3>
<p>Our first model is a global ordinary least squares (OLS). The global OLS slope is an average trend across all points, but it implicitly assumes that each point contributes independent information. In clustered data, this is false. Visually, the line cuts through the cloud (see plot below), but it does not represent the animal specific structure. Inferentially, the main danger is not that the slope estimate is always wrong. The more severe problem is that standard errors and p values can become overly optimistic because within animal correlation reduces the effective sample size. Global OLS therefore answers the wrong question: It tests whether a population trend exists if all points were independent, rather than whether a population trend exists given correlated repeated measurements.</p>
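<p>To get a feeling for how large this effect can be, a common back-of-the-envelope quantity is the Kish design effect, $\mathrm{DEFF} = 1 + (m - 1)\,\rho$, where $m$ is the number of observations per cluster and $\rho$ is the intraclass correlation. Ignoring the random slopes for a moment, our simulation implies $\rho \approx \sigma_{b_0}^2 / (\sigma_{b_0}^2 + \sigma_\varepsilon^2) = 6.25 / 7.25 \approx 0.86$, so with $m = 40$ the 120 data points carry roughly the information of $120 / \mathrm{DEFF} \approx 3.5$ independent observations, i.e., little more than one per animal.</p>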

<p>Let’s fit the global OLS model and inspect the results:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lm_global</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">ols</span><span class="p">(</span><span class="sh">"</span><span class="s">y ~ x</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">).</span><span class="nf">fit</span><span class="p">()</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">Global OLS: y ~ x</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">lm_global</span><span class="p">.</span><span class="nf">summary</span><span class="p">().</span><span class="n">tables</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div></div>

<p>Here is the printed output of the model summary:</p>

<pre><code class="language-commandline">Global OLS: y ~ x
==============================================================================
                 coef    std err          t      P&gt;|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      7.4405      0.310     24.035      0.000       6.827       8.054
x              0.1898      0.050      3.832      0.000       0.092       0.288
==============================================================================
</code></pre>

<p>The printout shows a typical OLS summary with coefficient estimates, standard errors, t statistics, and p values. The estimated slope is 0.1898 with a standard error of 0.050, leading to a t statistic of 3.832 and a highly significant p value. However, as we will see in the diagnostics below, these numbers are misleading because the model ignores the grouping structure.</p>
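<p>One way to gauge how optimistic these standard errors are is to refit the same mean model with cluster-robust standard errors, clustering on animal. This is only a quick side check, not a remedy: with just three clusters the robust variance estimate is itself very noisy.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># same mean model, but standard errors that account for within-animal correlation:
lm_global_cr = smf.ols("y ~ x", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["animal"]})
print(lm_global_cr.summary().tables[1])
</code></pre></div></div>

<p>The coefficient estimates are identical to the naive fit; only the reported uncertainty changes.</p>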

<p>Now let’s investigate the plots:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">plot_qq_diagnostics</span><span class="p">(</span>
    <span class="n">resid</span><span class="o">=</span><span class="n">lm_global</span><span class="p">.</span><span class="n">resid</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">fitted</span><span class="o">=</span><span class="n">lm_global</span><span class="p">.</span><span class="n">fittedvalues</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">groups</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">].</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">Global_OLS</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">)</span>

<span class="c1"># global fit plot:
</span><span class="n">legend_on</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">6</span>
<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">6.5</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="c1"># scatter by animal:
</span><span class="n">sns</span><span class="p">.</span><span class="nf">scatterplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.8</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
    <span class="n">legend</span><span class="o">=</span><span class="n">legend_on</span><span class="p">,</span><span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">)</span>
<span class="c1"># global OLS line
</span><span class="n">xline</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">linspace</span><span class="p">(</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">].</span><span class="nf">min</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">].</span><span class="nf">max</span><span class="p">(),</span> <span class="mi">200</span><span class="p">)</span>
<span class="n">yline</span> <span class="o">=</span> <span class="n">lm_global</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">Intercept</span><span class="sh">"</span><span class="p">]</span> <span class="o">+</span> <span class="n">lm_global</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">]</span> <span class="o">*</span> <span class="n">xline</span>
<span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">xline</span><span class="p">,</span> <span class="n">yline</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">k</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">x: stimulus strength</span><span class="sh">"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">y: neural response</span><span class="sh">"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Global OLS fit (ignores grouping)</span><span class="sh">"</span><span class="p">)</span>
<span class="c1"># remove left and bottom spines for aesthetics:
</span><span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">"</span><span class="s">bottom</span><span class="sh">"</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">"</span><span class="s">left</span><span class="sh">"</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
<span class="k">if</span> <span class="n">legend_on</span><span class="p">:</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.00</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper left</span><span class="sh">"</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">get_legend</span><span class="p">().</span><span class="nf">remove</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="sh">"</span><span class="s">Global_OLS_fit.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>

<span class="nf">plot_residual_diagnostics</span><span class="p">(</span>
    <span class="n">residual</span><span class="o">=</span><span class="n">lm_global</span><span class="p">.</span><span class="n">resid</span><span class="p">,</span>
    <span class="n">prediction</span><span class="o">=</span><span class="n">lm_global</span><span class="p">.</span><span class="nf">predict</span><span class="p">(</span><span class="n">df</span><span class="p">),</span>
    <span class="n">group</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">Global OLS</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span>
    <span class="n">fname_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">Global_OLS</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>

<p>The first plot shows the global OLS fit. As expected, the single regression line cuts through the cloud of points but does not represent the animal specific structure:</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/Global_OLS_fit.png" title="Global OLS fit that ignores grouping."><img src="/assets/images/posts/lmm/Global_OLS_fit.png" width="100%" alt="Global OLS fit that ignores grouping." /></a>
Global OLS fit that ignores grouping. The black line is the single regression line fitted to all observations. Colors denote animals, which are not represented in the model.</p>

<p>Here is the QQ plot of residuals:</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/Global_OLS_qqplot.png" title="Global OLS fit that ignores grouping."><img src="/assets/images/posts/lmm/Global_OLS_qqplot.png" width="70%" alt="Global OLS fit that ignores grouping." /></a><br />
QQ plot of global OLS residuals. Points close to the diagonal indicate approximate normality of residuals under the model, while systematic deviations indicate heavier tails or skewness.</p>

<p>It shows almost perfect agreement with the diagonal, indicating that residuals are approximately Gaussian under the model. This is expected because the data were generated from a Gaussian model. However, this does not mean that the model is valid for inference because it ignores the grouping structure. Independence violations are not diagnosed by a QQ plot. A model can have approximately Gaussian residuals and still be statistically invalid for inference because it misrepresents the covariance structure.</p>
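<p>A check that does speak to the grouping structure (again a side check for illustration, not a formal model test) is to ask whether the OLS residuals differ systematically across animals, which truly independent residuals should not:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># one-way ANOVA of the global OLS residuals across animals:
resid_by_animal = [g.to_numpy() for _, g in lm_global.resid.groupby(df["animal"])]
print(spstats.f_oneway(*resid_by_animal))
</code></pre></div></div>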

<p>The residual diagnostics point in the same direction:</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/Global_OLS_lm_diagnosis.png" title="Global OLS fit that ignores grouping."><img src="/assets/images/posts/lmm/Global_OLS_lm_diagnosis.png" width="100%" alt="Global OLS fit that ignores grouping." /></a>
QQ plot of global OLS residuals. Points close to the diagonal indicate approximate normality of residuals under the model, while systematic deviations indicate heavier tails or skewness.</p>

<p>The per animal residual densities indicate systematic shifts that the model cannot absorb. Conceptually, the model tries to explain between animal variation as residual noise. This inflates the residual variance and distorts standard errors. The plot shows why “looking only at the residual cloud” is insufficient. The fitted values already encode the wrong dependence structure.</p>
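<p>We can make these shifts explicit in two lines (the grouped residual means would all be close to zero if the model captured the animal structure):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># mean residual per animal: systematic offsets the global model cannot absorb
print(lm_global.resid.groupby(df["animal"]).mean())
</code></pre></div></div>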

<p>Before we continue: Since we will fit several models, we define a small utility function that tracks a handful of comparable summary quantities: RMSE as a crude measure of predictive error, the estimated coefficient for $x$, and its test statistic and p value as reported by statsmodels. The resulting table is not meant to be a formal model selection device. Its purpose is to show how effect estimates and their uncertainty change when we change the model’s representation of hierarchy:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">rmse_coef_tstat_pval</span><span class="p">(</span><span class="n">mod</span><span class="p">,</span> <span class="n">var</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span>
    <span class="n">resid</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">resid</span><span class="p">)</span>
    <span class="n">sse</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">sum</span><span class="p">(</span><span class="n">resid</span> <span class="o">**</span> <span class="mi">2</span><span class="p">))</span>
    <span class="n">df_resid</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="nf">getattr</span><span class="p">(</span><span class="n">mod</span><span class="p">,</span> <span class="sh">"</span><span class="s">df_resid</span><span class="sh">"</span><span class="p">,</span> <span class="nf">len</span><span class="p">(</span><span class="n">resid</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">))</span>

    <span class="n">rmse</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">sqrt</span><span class="p">(</span><span class="n">sse</span> <span class="o">/</span> <span class="n">df_resid</span><span class="p">)</span>

    <span class="n">coef</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">var</span><span class="p">])</span>

    <span class="c1"># OLS: tvalues, MixedLM: zvalues
</span>    <span class="k">if</span> <span class="nf">hasattr</span><span class="p">(</span><span class="n">mod</span><span class="p">,</span> <span class="sh">"</span><span class="s">tvalues</span><span class="sh">"</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="n">var</span> <span class="ow">in</span> <span class="nf">getattr</span><span class="p">(</span><span class="n">mod</span><span class="p">,</span> <span class="sh">"</span><span class="s">tvalues</span><span class="sh">"</span><span class="p">,</span> <span class="p">{})):</span>
        <span class="n">stat</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">tvalues</span><span class="p">[</span><span class="n">var</span><span class="p">])</span>
    <span class="k">elif</span> <span class="nf">hasattr</span><span class="p">(</span><span class="n">mod</span><span class="p">,</span> <span class="sh">"</span><span class="s">tvalues</span><span class="sh">"</span><span class="p">)</span> <span class="ow">and</span> <span class="nf">isinstance</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">tvalues</span><span class="p">,</span> <span class="p">(</span><span class="n">pd</span><span class="p">.</span><span class="n">Series</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="n">ndarray</span><span class="p">)):</span>
        <span class="n">stat</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">tvalues</span><span class="p">[</span><span class="n">var</span><span class="p">])</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="c1"># MixedLMResults: use z-values
</span>        <span class="n">stat</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">tvalues</span><span class="p">[</span><span class="n">var</span><span class="p">])</span> <span class="k">if</span> <span class="nf">hasattr</span><span class="p">(</span><span class="n">mod</span><span class="p">,</span> <span class="sh">"</span><span class="s">tvalues</span><span class="sh">"</span><span class="p">)</span> <span class="k">else</span> <span class="nf">float</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">var</span><span class="p">]</span> <span class="o">/</span> <span class="n">mod</span><span class="p">.</span><span class="n">bse</span><span class="p">[</span><span class="n">var</span><span class="p">])</span>

    <span class="n">pval</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">mod</span><span class="p">.</span><span class="n">pvalues</span><span class="p">[</span><span class="n">var</span><span class="p">])</span>

    <span class="k">return</span> <span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create and store first summary row:
</span><span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span> <span class="o">=</span> <span class="nf">rmse_coef_tstat_pval</span><span class="p">(</span><span class="n">lm_global</span><span class="p">,</span> <span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">)</span>
<span class="n">results</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="nc">DataFrame</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="sh">"</span><span class="s">Model</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">RMSE</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">Coef_x</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">Stat_x</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">Pval_x</span><span class="sh">"</span><span class="p">])</span>
<span class="n">results</span><span class="p">.</span><span class="n">loc</span><span class="p">[</span><span class="nf">len</span><span class="p">(</span><span class="n">results</span><span class="p">)]</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">OLS global (biased SE)</span><span class="sh">"</span><span class="p">,</span> <span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span><span class="p">]</span>
</code></pre></div></div>

<h3 id="ancova-with-animal-as-fixed-intercept-shifts">ANCOVA with animal as fixed intercept shifts</h3>
<p>Next, we fit an ANCOVA model with animal as a categorical fixed effect. This model allows each animal to have its own intercept shift, but it assumes a common slope across animals. This is equivalent to fitting separate intercepts per animal while constraining the slope to be the same.</p>
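<p>In statsmodels formula notation this is y ~ x + C(animal). Written out, the model is</p>

<p>$$y_{ij} = \beta_0 + \alpha_i + \beta_1\,x_{ij} + \varepsilon_{ij},$$</p>

<p>where the $\alpha_i$ are fixed, freely estimated intercept shifts (under the default treatment coding, the first animal serves as reference level with $\alpha_1 = 0$). The plotting helper defined next visualizes exactly this structure: one common slope and per-animal intercept shifts.</p>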

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_ancova_oneslope_grpintercept</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">group</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">model</span><span class="p">,</span> <span class="n">outpath</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">ancova_oneslope_grpintercept.png</span><span class="sh">"</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    ANCOVA: one common slope, group specific fixed intercept shifts.

    Model form: y ~ x + C(group)
    Plot: scatter by group + black lines with same slope, shifted intercept.
    </span><span class="sh">"""</span>
    <span class="n">legend_on</span> <span class="o">=</span> <span class="bp">True</span> <span class="k">if</span> <span class="n">df</span><span class="p">[</span><span class="n">group</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">6</span> <span class="k">else</span> <span class="bp">False</span>
    <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">6.5</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
    
    <span class="n">sns</span><span class="p">.</span><span class="nf">scatterplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">y</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="n">group</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.8</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
        <span class="n">legend</span><span class="o">=</span><span class="n">legend_on</span><span class="p">,</span><span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">)</span>

    <span class="n">palette</span> <span class="o">=</span> <span class="n">itertools</span><span class="p">.</span><span class="nf">cycle</span><span class="p">(</span><span class="n">sns</span><span class="p">.</span><span class="nf">color_palette</span><span class="p">())</span>
    <span class="n">x_jitter</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.2</span>

    <span class="n">base_intercept</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">Intercept</span><span class="sh">"</span><span class="p">])</span>
    <span class="n">slope</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">x</span><span class="p">])</span>

    <span class="k">for</span> <span class="n">group_lab</span><span class="p">,</span> <span class="n">group_df</span> <span class="ow">in</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="n">group</span><span class="p">):</span>
        <span class="n">x_</span> <span class="o">=</span> <span class="n">group_df</span><span class="p">[</span><span class="n">x</span><span class="p">].</span><span class="nf">to_numpy</span><span class="p">()</span>
        <span class="n">color</span> <span class="o">=</span> <span class="nf">next</span><span class="p">(</span><span class="n">palette</span><span class="p">)</span>

        <span class="c1"># fixed intercept shift for this group, reference group has 0 shift
</span>        <span class="n">key</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="s">C(</span><span class="si">{</span><span class="n">group</span><span class="si">}</span><span class="s">)[T.</span><span class="si">{</span><span class="n">group_lab</span><span class="si">}</span><span class="s">]</span><span class="sh">"</span>
        <span class="n">group_offset</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">key</span><span class="p">])</span> <span class="k">if</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">model</span><span class="p">.</span><span class="n">params</span> <span class="k">else</span> <span class="mf">0.0</span>

        <span class="n">y_pred</span> <span class="o">=</span> <span class="n">base_intercept</span> <span class="o">+</span> <span class="n">slope</span> <span class="o">*</span> <span class="n">x_</span> <span class="o">+</span> <span class="n">group_offset</span>

        <span class="c1"># draw the common-slope line for this group
</span>        <span class="n">order</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">argsort</span><span class="p">(</span><span class="n">x_</span><span class="p">)</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">x_</span><span class="p">[</span><span class="n">order</span><span class="p">],</span> <span class="n">y_pred</span><span class="p">[</span><span class="n">order</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">k</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">)</span>

        <span class="c1"># arrow indicating the intercept shift
</span>        <span class="n">ax</span><span class="p">.</span><span class="nf">arrow</span><span class="p">(</span><span class="mi">0</span> <span class="o">+</span> <span class="n">x_jitter</span><span class="p">,</span> <span class="n">base_intercept</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">group_offset</span><span class="p">,</span>
                 <span class="n">head_width</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span> <span class="n">length_includes_head</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">)</span>
        <span class="n">x_jitter</span> <span class="o">+=</span> <span class="mf">0.2</span>

    <span class="c1"># deactivate legend if too many groups:
</span>    <span class="k">if</span> <span class="n">df</span><span class="p">[</span><span class="n">group</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&gt;</span> <span class="mi">6</span><span class="p">:</span>
        <span class="n">leg</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="nf">get_legend</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">leg</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">leg</span><span class="p">.</span><span class="nf">remove</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.00</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">upper left</span><span class="sh">'</span><span class="p">)</span>

    <span class="c1"># deactivate left and bottom spines for aesthetics:
</span>    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">'</span><span class="s">left</span><span class="sh">'</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">'</span><span class="s">bottom</span><span class="sh">'</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>

    <span class="c1"># set proper x and y labels:
</span>    <span class="n">ax</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="n">x</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: stimulus strength</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="n">y</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: neural response</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">ANCOVA: one slope, group fixed intercepts</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>

    <span class="k">if</span> <span class="n">outpath</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ancova</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">ols</span><span class="p">(</span><span class="sh">"</span><span class="s">y ~ x + C(animal)</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">).</span><span class="nf">fit</span><span class="p">()</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">ANCOVA fixed animal intercepts: y ~ x + C(animal)</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">ancova</span><span class="p">.</span><span class="nf">summary</span><span class="p">().</span><span class="n">tables</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div></div>

<pre><code class="language-commandline">ANCOVA fixed animal intercepts: y ~ x + C(animal)
=======================================================================================
                          coef    std err          t      P&gt;|t|      [0.025      0.975]
---------------------------------------------------------------------------------------
Intercept               5.3778      0.233     23.129      0.000       4.917       5.838
C(animal)[T.subj_1]     2.3408      0.227     10.322      0.000       1.892       2.790
C(animal)[T.subj_2]     3.2313      0.228     14.156      0.000       2.779       3.683
x                       0.2278      0.030      7.608      0.000       0.169       0.287
=======================================================================================
</code></pre>

<p>The table shows the estimated intercept for the reference animal (subj_0), the intercept shifts for the other animals (subj_1 and subj_2), and the common slope for $x$. All coefficients are highly significant (column P&gt;|t|). The estimated slope is now 0.2278 with a standard error of 0.030, leading to a t statistic of 7.608. Compared to the global OLS slope of 0.1898 (standard error 0.050), the slope estimate has changed and the standard error has decreased. This reflects that the model now accounts for between animal differences in baseline.</p>
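
<p>As a quick sanity check, both slope estimates and their standard errors can be read directly from the fitted results. A minimal sketch, assuming the <code>lm_global</code> and <code>ancova</code> objects from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># compare the slope and its standard error across the two fits
# (assumes lm_global and ancova were fit as above):
for label, mod in [("OLS global", lm_global), ("ANCOVA", ancova)]:
    print(f"{label}: slope = {mod.params['x']:.4f}, SE = {mod.bse['x']:.3f}")
</code></pre></div></div>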

<p>Let’s turn to the diagnostic plots:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">plot_qq_diagnostics</span><span class="p">(</span>
    <span class="n">resid</span><span class="o">=</span><span class="n">ancova</span><span class="p">.</span><span class="n">resid</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">fitted</span><span class="o">=</span><span class="n">ancova</span><span class="p">.</span><span class="n">fittedvalues</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">groups</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">].</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA fixed intercepts</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">)</span>

<span class="c1"># ANCOVA one slope, fixed intercept shifts (tutorial style plot)
</span><span class="nf">plot_ancova_oneslope_grpintercept</span><span class="p">(</span>
    <span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">,</span> <span class="n">group</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">model</span><span class="o">=</span><span class="n">ancova</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA_oneslope_fixed_intercepts.png</span><span class="sh">"</span><span class="p">)</span>

<span class="nf">plot_residual_diagnostics</span><span class="p">(</span>
    <span class="n">residual</span><span class="o">=</span><span class="n">ancova</span><span class="p">.</span><span class="n">resid</span><span class="p">,</span>
    <span class="n">prediction</span><span class="o">=</span><span class="n">ancova</span><span class="p">.</span><span class="nf">predict</span><span class="p">(</span><span class="n">df</span><span class="p">),</span>
    <span class="n">group</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA fixed intercepts</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span>
    <span class="n">fname_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA_fixed_intercepts</span><span class="sh">"</span><span class="p">)</span>

<span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span> <span class="o">=</span> <span class="nf">rmse_coef_tstat_pval</span><span class="p">(</span><span class="n">ancova</span><span class="p">,</span> <span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">)</span>
<span class="n">results</span><span class="p">.</span><span class="n">loc</span><span class="p">[</span><span class="nf">len</span><span class="p">(</span><span class="n">results</span><span class="p">)]</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">ANCOVA fixed intercepts</span><span class="sh">"</span><span class="p">,</span> <span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span><span class="p">]</span>
</code></pre></div></div>

<p>Let’s discuss the results. Here is the ANCOVA fit with one common slope and group specific fixed intercepts:</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/ANCOVA_oneslope_fixed_intercepts.png" title="ANCOVA with one common slope and group specific fixed intercepts."><img src="/assets/images/posts/lmm/ANCOVA_oneslope_fixed_intercepts.png" width="100%" alt="ANCOVA with one common slope and group specific fixed intercepts." /></a>
ANCOVA with one common slope and group specific fixed intercepts. Black lines have identical slope but are shifted by animal specific intercept offsets. Colored arrows illustrate the intercept shift relative to the reference intercept.</p>

<p>Compared to global OLS, this model acknowledges that animals differ in baseline. It effectively adds a set of animal indicator variables. This absorbs a large part of the between animal variance and therefore reduces residual variance. In this dataset, the intercept shifts are large, so this is a major improvement. However, this is still a fixed effects treatment of animals. The model estimates one intercept parameter per animal without any pooling. This can be appropriate when the animals in the dataset are the only animals of interest. In neuroscience, that is rarely the inferential goal. Usually animals are considered a random sample from a population, and one wants to generalize beyond them.</p>
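
<p>To see the no pooling character explicitly, the per animal intercepts implied by the fit can be reconstructed from the parameters: Each one is the base intercept plus that animal's shift, estimated without any influence from the other animals. A minimal sketch, assuming the <code>ancova</code> fit from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># reconstruct the per-animal intercepts implied by the ANCOVA fit;
# each is estimated from that animal's data alone (no pooling):
base = float(ancova.params["Intercept"])
print(f"subj_0 (reference): {base:.3f}")
for name, shift in ancova.params.items():
    if name.startswith("C(animal)"):  # e.g. "C(animal)[T.subj_1]"
        print(f"{name}: {base + float(shift):.3f}")
</code></pre></div></div>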

<p class="align-caption"><a href="/assets/images/posts/lmm/ANCOVA fixed intercepts_qqplot.png" title="Q Q plot of ANCOVA fixed intercept residuals."><img src="/assets/images/posts/lmm/ANCOVA fixed intercepts_qqplot.png" width="70%" alt="Q Q plot of ANCOVA fixed intercept residuals." /></a><br />
QQ plot of ANCOVA fixed intercept residuals.</p>

<p>Looking at the QQ plot: Residual normality can improve because between animal shifts no longer appear as unexplained noise. Compared to the global OLS QQ plot, however, there is little change here. Again, this plot does not decide between ANCOVA and OLS (or an LMM); the key difference between these models is how uncertainty and generalization are treated.</p>

<p>Let’s inspect the residual diagnostics:</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/ANCOVA_fixed_intercepts_lm_diagnosis.png" title="Residual diagnostics for ANCOVA with fixed intercepts."><img src="/assets/images/posts/lmm/ANCOVA_fixed_intercepts_lm_diagnosis.png" width="100%" alt="Residual diagnostics for ANCOVA with fixed intercepts." /></a>
Residual diagnostics for ANCOVA with fixed intercepts.</p>

<p>Residual distributions across animals are now more aligned because the model explains systematic baseline differences. However, comparing this to the LMM residual diagnostics (shown later), similar looking residuals do not imply that the two models are equivalent, or that one is better than the other. The key conceptual difference is that ANCOVA treats animal intercepts as fixed parameters to be estimated, while an LMM treats them as random effects drawn from a common distribution with estimated variance. In an LMM, this implies shrinkage of the animal specific intercept estimates toward the population mean, while ANCOVA estimates each intercept independently. Residual diagnostics alone do not expose this central distinction; it lives in how group effects are treated: Fixed parameters versus draws from a distribution with estimated variance.</p>
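
<p>To make the notion of shrinkage concrete: In a random intercept model, given the variance components, the predicted intercept shift for animal $j$ (its BLUP) has the familiar form $\hat{b}_{0j} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2 / n_j}\left(\bar{y}_j - \hat{\beta}_0 - \hat{\beta}_1 \bar{x}_j\right)$, where $n_j$ is the number of observations for animal $j$, $\sigma_b^2$ the random intercept variance, and $\sigma^2$ the residual variance. The leading factor is always below one, so the estimated shift is pulled toward zero, i.e., toward the population intercept, and the pull is stronger for animals contributing fewer data points. The ANCOVA estimates have no such factor.</p>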

<h3 id="aggregation-and-why-it-can-fail">Aggregation and why it can fail</h3>
<p>Aggregation is sometimes suggested as a quick fix for dependence. It reduces the data to one point per animal and thereby restores independence. The price is severe: Information is thrown away and the slope becomes highly unstable because the number of data points is now the number of animals. In the toy example with only three animals, the regression on means is essentially meaningless. Even with more animals, aggregation changes the estimand: It targets the relationship between animal level means, not the within animal relationship between $x$ and $y$.</p>
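
<p>In model terms, aggregation replaces the observation level regression by a regression on animal means, $\bar{y}_j = \gamma_0 + \gamma_1 \bar{x}_j + u_j$, with one data point per animal. The between animal slope $\gamma_1$ describes how animal level means covary and need not coincide with the within animal slope of $y$ on $x$.</p>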

<p>To illustrate this, we implement two aggregation based approaches. The first approach fits OLS to animal means:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">agg</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">)[[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">]].</span><span class="nf">mean</span><span class="p">().</span><span class="nf">reset_index</span><span class="p">()</span>
<span class="n">lm_agg</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">ols</span><span class="p">(</span><span class="sh">"</span><span class="s">y ~ x</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">agg</span><span class="p">).</span><span class="nf">fit</span><span class="p">()</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">Aggregation: OLS on animal means</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">lm_agg</span><span class="p">.</span><span class="nf">summary</span><span class="p">().</span><span class="n">tables</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div></div>

<pre><code class="language-commandline">Aggregation: OLS on animal means
==============================================================================
                 coef    std err          t      P&gt;|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     15.4818     12.490      1.240      0.432    -143.218     174.181
x             -1.2970      2.300     -0.564      0.673     -30.519      27.925
==============================================================================
</code></pre>

<p>The fit output shows a highly unstable slope estimate of -1.2970 with a large standard error of 2.300, leading to a t statistic of -0.564 and a non-significant p value of 0.673. This is very different from the previous models because the aggregation approach answers a different question: It tests whether animals with higher mean stimulus strength have higher mean responses. This is not the same as testing whether within an animal, higher stimulus strength leads to higher response.</p>
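
<p>The instability is plain from the degrees of freedom. A minimal sketch, assuming the <code>lm_agg</code> fit from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># the aggregated fit has three observations (one mean per animal) and
# two parameters (intercept, slope), leaving one residual degree of freedom:
print(int(lm_agg.nobs))      # 3
print(int(lm_agg.df_resid))  # 1
</code></pre></div></div>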

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot aggregated points and regression
</span><span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">agg</span><span class="p">.</span><span class="nf">iterrows</span><span class="p">():</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">scatter</span><span class="p">(</span><span class="n">row</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">],</span> <span class="n">row</span><span class="p">[</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">],</span> <span class="n">s</span><span class="o">=</span><span class="mi">120</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.9</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">row</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">xline</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">linspace</span><span class="p">(</span><span class="n">agg</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">].</span><span class="nf">min</span><span class="p">(),</span> <span class="n">agg</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">].</span><span class="nf">max</span><span class="p">(),</span> <span class="mi">100</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">xline</span><span class="p">,</span> <span class="n">lm_agg</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">Intercept</span><span class="sh">"</span><span class="p">]</span> <span class="o">+</span> <span class="n">lm_agg</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">]</span> <span class="o">*</span> <span class="n">xline</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span> 
         <span class="n">c</span><span class="o">=</span><span class="sh">"</span><span class="s">k</span><span class="sh">"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">"</span><span class="s">OLS on means</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">mean x per animal</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">mean y per animal</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">"</span><span class="s">Aggregation level analysis</span><span class="sh">"</span><span class="p">)</span>
<span class="k">if</span> <span class="n">agg</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">6</span><span class="p">:</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">upper left</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="sh">"</span><span class="s">aggregation_level_analysis.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/lmm/aggregation_level_analysis.png" title="Aggregation level analysis."><img src="/assets/images/posts/lmm/aggregation_level_analysis.png" width="100%" alt="Aggregation level analysis." /></a>
Aggregation level analysis. Each point is the mean stimulus strength and mean response per animal. The black line is an OLS fit to these animal means.</p>

<p>As we can see, the slope estimate is highly unstable because there are only three data points. The p value is very high, indicating no evidence for a relationship at the animal mean level. This is not surprising given the small number of animals. However, this does not imply that there is no within animal relationship between stimulus strength and response. The aggregation approach answers a different question and loses power because it discards information.</p>

<p>The second approach is a hierarchical two-stage approach. In the first stage, we fit separate regressions within each animal and extract the slope estimates. In the second stage, we test whether the mean slope across animals is significantly different from zero. This approach retains more information than simple aggregation because it uses within animal slopes, but it still has limitations compared to LMMs.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lv1</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">animal_lab</span><span class="p">,</span> <span class="n">animal_df</span> <span class="ow">in</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">):</span>
    <span class="n">m</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">ols</span><span class="p">(</span><span class="sh">"</span><span class="s">y ~ x</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">animal_df</span><span class="p">).</span><span class="nf">fit</span><span class="p">()</span>
    <span class="n">lv1</span><span class="p">.</span><span class="nf">append</span><span class="p">([</span><span class="n">animal_lab</span><span class="p">,</span> <span class="nf">float</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">])])</span>

<span class="n">lv1</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="nc">DataFrame</span><span class="p">(</span><span class="n">lv1</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">beta_x</span><span class="sh">"</span><span class="p">])</span>
<span class="n">lm_level2</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">ols</span><span class="p">(</span><span class="sh">"</span><span class="s">beta_x ~ 1</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">lv1</span><span class="p">).</span><span class="nf">fit</span><span class="p">()</span>

<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">Hierarchical two-stage: test mean within-animal slopes</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">lm_level2</span><span class="p">.</span><span class="nf">summary</span><span class="p">().</span><span class="n">tables</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div></div>

<pre><code class="language-commandline">Hierarchical two-stage: test mean within-animal slopes
==============================================================================
                 coef    std err          t      P&gt;|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.2264      0.088      2.569      0.124      -0.153       0.606
==============================================================================
</code></pre>

<p>The table shows the estimated mean slope across animals as 0.2264 with a standard error of 0.088, leading to a t statistic of 2.569 and a p value of 0.124. This indicates some evidence for a positive relationship, but the p value is not below the conventional threshold of 0.05. This reflects the limited power of the two-stage approach, especially with only three animals.</p>
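
<p>The second stage is nothing more than a one sample t test of the per animal slopes against zero, so the table can be reproduced with scipy. A minimal sketch, assuming the <code>lv1</code> table from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from scipy import stats

# the intercept-only OLS on the slopes is the same test as a
# one-sample t test of the per-animal slopes against zero:
t_stat, p_val = stats.ttest_1samp(lv1["beta_x"], popmean=0.0)
print(f"t = {t_stat:.3f}, p = {p_val:.3f}")  # matches the intercept row above
</code></pre></div></div>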

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot: within-animal regressions + barplot of slopes:
</span><span class="n">fig</span><span class="p">,</span> <span class="n">axes</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="k">for</span> <span class="n">animal_lab</span><span class="p">,</span> <span class="n">animal_df</span> <span class="ow">in</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">):</span>
    <span class="n">sns</span><span class="p">.</span><span class="nf">regplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">animal_df</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">scatter</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">ci</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">scatter_kws</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">s</span><span class="sh">"</span><span class="p">:</span><span class="mi">35</span><span class="p">,</span> <span class="sh">"</span><span class="s">alpha</span><span class="sh">"</span><span class="p">:</span><span class="mf">0.8</span><span class="p">})</span>
<span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Level 1: regressions within animal</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">x (stimulus strength; per-animal data)</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axes</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">y (neural response)</span><span class="sh">"</span><span class="p">)</span>
<span class="n">sns</span><span class="p">.</span><span class="nf">barplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">beta_x</span><span class="sh">"</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">lv1</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">legend</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">axhline</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">Level 2: slopes by animal</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">)</span>
<span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">beta_x</span><span class="sh">"</span><span class="p">)</span>
<span class="c1"># rotate x-ticks if many animals
</span><span class="k">if</span> <span class="n">lv1</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&gt;</span> <span class="mi">5</span><span class="p">:</span>
    <span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">set_xticklabels</span><span class="p">(</span><span class="n">axes</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">get_xticklabels</span><span class="p">(),</span> <span class="n">rotation</span><span class="o">=</span><span class="mi">45</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="sh">"</span><span class="s">right</span><span class="sh">"</span><span class="p">,</span> <span class="n">fontsize</span><span class="o">=</span><span class="mi">6</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="sh">"</span><span class="s">hierarchical_two_stage.png</span><span class="sh">"</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/lmm/hierarchical_two_stage.png" title="Two stage approach."><img src="/assets/images/posts/lmm/hierarchical_two_stage.png" width="100%" alt="Two stage approach." /></a>
Two stage approach. Left, separate within animal regressions. Right, the estimated slope per animal and the group level mean slope test.</p>

<p>The two-stage approach is closer to the hierarchical idea. It estimates a slope per animal, then tests whether the mean slope differs from zero. It can be useful as a simple conceptual bridge to mixed models, but it is statistically inefficient and fragile: It ignores uncertainty in the first stage slopes, and it breaks down when per animal sample sizes are small or unbalanced. LMMs unify both stages into a single likelihood, yielding principled uncertainty propagation and shrinkage.</p>
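
<p>To see concretely what ignoring first stage uncertainty means: Each slope estimate comes with its own standard error, which the unweighted mean discards. A fixed effect, inverse variance weighted average would use them. The following is an illustrative sketch (not part of the original analysis), assuming <code>df</code>, <code>smf</code>, and <code>np</code> from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># inverse-variance weighted mean of the per-animal slopes, using the
# first-stage standard errors that the unweighted mean discards:
slopes, ses = [], []
for _, animal_df in df.groupby("animal"):
    m = smf.ols("y ~ x", data=animal_df).fit()
    slopes.append(float(m.params["x"]))
    ses.append(float(m.bse["x"]))

w = 1.0 / np.asarray(ses) ** 2
slope_w = np.sum(w * np.asarray(slopes)) / np.sum(w)
se_w = np.sqrt(1.0 / np.sum(w))
print(f"weighted mean slope: {slope_w:.4f} +/- {se_w:.4f}")
</code></pre></div></div>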

<p>Before we continue, let’s not forget to store the results from the two-stage approach:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span> <span class="o">=</span> <span class="nf">rmse_coef_tstat_pval</span><span class="p">(</span><span class="n">lm_level2</span><span class="p">,</span> <span class="sh">"</span><span class="s">Intercept</span><span class="sh">"</span><span class="p">)</span>
<span class="n">results</span><span class="p">.</span><span class="n">loc</span><span class="p">[</span><span class="nf">len</span><span class="p">(</span><span class="n">results</span><span class="p">)]</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">Two-stage hierarchical (mean slope)</span><span class="sh">"</span><span class="p">,</span> <span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span><span class="p">]</span>
</code></pre></div></div>

<h3 id="lmm-random-intercept">LMM random intercept</h3>
<p>Finally, we fit a linear mixed model (LMM) with one common slope and random intercepts per animal. This is a standard LMM formulation that captures the idea that animals differ in baseline response, while sharing a common relationship between stimulus strength and response. The random intercepts are modeled as draws from a common Gaussian distribution, allowing for shrinkage toward the population mean.</p>
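
<p>In equation form: $y_{ij} = \beta_0 + b_{0j} + \beta_1 x_{ij} + \varepsilon_{ij}$ with $b_{0j} \sim \mathcal{N}(0, \sigma_b^2)$ and $\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$. Unlike the ANCOVA shifts $\alpha_j$, the $b_{0j}$ are not free parameters: Only the variance $\sigma_b^2$ is estimated, and the animal specific intercepts are predicted afterwards as BLUPs.</p>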

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_lmm_oneslope_randintercept</span><span class="p">(</span>
    <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">group</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">model</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">lmm_oneslope_randintercept.png</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">add_text</span><span class="o">=</span><span class="bp">True</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    LMM: one common slope, random intercept per group.

    Model form: y ~ x + (1 | group)

    Arrows indicate the group-specific random intercept BLUP:
        beta0  -&gt;  beta0 + b0_g

    If add_text=True, annotate the population-level distribution assumption:
        (beta0 + b0_g) ~ N(beta0, Var(b0))
    </span><span class="sh">"""</span>

    <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">6.5</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
    <span class="n">legend_on</span> <span class="o">=</span> <span class="bp">True</span> <span class="k">if</span> <span class="n">df</span><span class="p">[</span><span class="n">group</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">6</span> <span class="k">else</span> <span class="bp">False</span>
    <span class="n">sns</span><span class="p">.</span><span class="nf">scatterplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">y</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="n">group</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.8</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
        <span class="n">legend</span><span class="o">=</span><span class="n">legend_on</span><span class="p">,</span><span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">)</span>

    <span class="n">palette</span> <span class="o">=</span> <span class="n">itertools</span><span class="p">.</span><span class="nf">cycle</span><span class="p">(</span><span class="n">sns</span><span class="p">.</span><span class="nf">color_palette</span><span class="p">())</span>
    <span class="n">x_jitter</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.2</span>

    <span class="n">base_intercept</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">Intercept</span><span class="sh">"</span><span class="p">])</span>
    <span class="n">slope</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">x</span><span class="p">])</span>

    <span class="c1"># Random-intercept variance estimate (sigma_b^2)
</span>    <span class="c1"># For random-intercept-only models, cov_re is 1x1
</span>    <span class="k">try</span><span class="p">:</span>
        <span class="n">var_b0</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">cov_re</span><span class="p">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">])</span>
    <span class="k">except</span> <span class="nb">Exception</span><span class="p">:</span>
        <span class="c1"># fallback: sometimes it is exposed as "Group Var" in params
</span>        <span class="n">var_b0</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">Group Var</span><span class="sh">"</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="n">nan</span><span class="p">))</span>

    <span class="k">for</span> <span class="n">group_lab</span><span class="p">,</span> <span class="n">group_df</span> <span class="ow">in</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="n">group</span><span class="p">):</span>
        <span class="n">x_</span> <span class="o">=</span> <span class="n">group_df</span><span class="p">[</span><span class="n">x</span><span class="p">].</span><span class="nf">to_numpy</span><span class="p">()</span>
        <span class="n">color</span> <span class="o">=</span> <span class="nf">next</span><span class="p">(</span><span class="n">palette</span><span class="p">)</span>

        <span class="n">re</span> <span class="o">=</span> <span class="n">model</span><span class="p">.</span><span class="n">random_effects</span><span class="p">[</span><span class="n">group_lab</span><span class="p">]</span>

        <span class="c1"># robustly get intercept BLUP
</span>        <span class="k">if</span> <span class="nf">isinstance</span><span class="p">(</span><span class="n">re</span><span class="p">,</span> <span class="n">pd</span><span class="p">.</span><span class="n">Series</span><span class="p">):</span>
            <span class="n">group_offset</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
        <span class="k">elif</span> <span class="nf">isinstance</span><span class="p">(</span><span class="n">re</span><span class="p">,</span> <span class="nb">dict</span><span class="p">):</span>
            <span class="n">group_offset</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="nf">list</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="nf">values</span><span class="p">())[</span><span class="mi">0</span><span class="p">])</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">group_offset</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>

        <span class="n">y_pred</span> <span class="o">=</span> <span class="n">base_intercept</span> <span class="o">+</span> <span class="n">slope</span> <span class="o">*</span> <span class="n">x_</span> <span class="o">+</span> <span class="n">group_offset</span>

        <span class="n">order</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">argsort</span><span class="p">(</span><span class="n">x_</span><span class="p">)</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">x_</span><span class="p">[</span><span class="n">order</span><span class="p">],</span> <span class="n">y_pred</span><span class="p">[</span><span class="n">order</span><span class="p">],</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">)</span>

        <span class="n">ax</span><span class="p">.</span><span class="nf">arrow</span><span class="p">(</span>
            <span class="mi">0</span> <span class="o">+</span> <span class="n">x_jitter</span><span class="p">,</span> <span class="n">base_intercept</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">group_offset</span><span class="p">,</span>
            <span class="n">head_width</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span> <span class="n">length_includes_head</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">add_text</span> <span class="ow">and</span> <span class="n">np</span><span class="p">.</span><span class="nf">isfinite</span><span class="p">(</span><span class="n">var_b0</span><span class="p">)</span> <span class="ow">and</span> <span class="n">legend_on</span><span class="p">:</span>
            <span class="n">ax</span><span class="p">.</span><span class="nf">text</span><span class="p">(</span>
                <span class="mf">0.15</span><span class="p">,</span> <span class="n">base_intercept</span> <span class="o">+</span> <span class="n">group_offset</span><span class="p">,</span>
                <span class="sa">f</span><span class="sh">"</span><span class="s">~N(</span><span class="si">{</span><span class="n">base_intercept</span><span class="si">:</span><span class="p">.</span><span class="mi">3</span><span class="n">f</span><span class="si">}</span><span class="s">, </span><span class="si">{</span><span class="n">var_b0</span><span class="si">:</span><span class="p">.</span><span class="mi">2</span><span class="n">f</span><span class="si">}</span><span class="s">)</span><span class="sh">"</span><span class="p">,</span>
                <span class="n">fontsize</span><span class="o">=</span><span class="mi">9</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">k</span><span class="sh">"</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">)</span>

        <span class="n">x_jitter</span> <span class="o">+=</span> <span class="mf">0.2</span>

    <span class="k">if</span> <span class="n">legend_on</span><span class="p">:</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.00</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">upper left</span><span class="sh">'</span><span class="p">)</span>

    <span class="c1"># set some proper labels:
</span>    <span class="n">ax</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="n">x</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: stimulus strength</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="n">y</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: neural response</span><span class="sh">"</span><span class="p">)</span>

    <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">LMM: one slope, random intercepts</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>

    <span class="k">if</span> <span class="n">outpath</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lmm_int</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">mixedlm</span><span class="p">(</span><span class="sh">"</span><span class="s">y ~ x</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">groups</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span> <span class="n">re_formula</span><span class="o">=</span><span class="sh">"</span><span class="s">1</span><span class="sh">"</span><span class="p">)</span>
<span class="n">lmm_int_fit</span> <span class="o">=</span> <span class="n">lmm_int</span><span class="p">.</span><span class="nf">fit</span><span class="p">(</span><span class="n">reml</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="sh">"</span><span class="s">lbfgs</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">LMM random intercept: y ~ x + (1 | animal)</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">lmm_int_fit</span><span class="p">.</span><span class="nf">summary</span><span class="p">())</span>
</code></pre></div></div>

<pre><code class="language-commandline">LMM random intercept: y ~ x + (1 | animal)
         Mixed Linear Model Regression Results
=======================================================
Model:            MixedLM Dependent Variable: y        
No. Observations: 120     Method:             REML     
No. Groups:       3       Scale:              1.0275   
Min. group size:  40      Log-Likelihood:     -179.6364
Max. group size:  40      Converged:          Yes      
Mean group size:  40.0                                 
--------------------------------------------------------
           Coef.  Std.Err.    z    P&gt;|z|  [0.025  0.975]
--------------------------------------------------------
Intercept  7.237     0.977  7.406  0.000   5.322   9.152
x          0.227     0.030  7.596  0.000   0.169   0.286
Group Var  2.760     2.773                              
=======================================================
</code></pre>

<p>This is a typical LMM summary output. It shows the estimated fixed effects (intercept and slope for $x$) along with their standard errors, z statistics, and p values. The estimated slope is 0.227 with a standard error of 0.030, leading to a z statistic of 7.596 and a highly significant p value. Additionally, the variance of the random intercepts (“Group Var”) is estimated to be 2.760.</p>
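<p>If you prefer to work with these quantities programmatically instead of reading them off the summary table, the fitted results object exposes them as attributes. A minimal sketch using the statsmodels result from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># fixed effect estimates, standard errors and p values:
print(lmm_int_fit.fe_params)   # intercept and slope estimates
print(lmm_int_fit.bse)         # standard errors (fixed effects and variance components)
print(lmm_int_fit.pvalues)     # p values

# variance components:
print(lmm_int_fit.cov_re)      # random intercept variance ("Group Var")
print(lmm_int_fit.scale)       # residual variance ("Scale")
</code></pre></div></div>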

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">plot_qq_diagnostics</span><span class="p">(</span>
    <span class="n">resid</span><span class="o">=</span><span class="n">lmm_int_fit</span><span class="p">.</span><span class="n">resid</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">fitted</span><span class="o">=</span><span class="n">lmm_int_fit</span><span class="p">.</span><span class="n">fittedvalues</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">groups</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">].</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM random intercept</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">)</span>

<span class="c1"># LMM random intercept plot:
</span><span class="nf">plot_lmm_oneslope_randintercept</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">,</span> <span class="n">group</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> 
                                <span class="n">model</span><span class="o">=</span><span class="n">lmm_int_fit</span><span class="p">,</span>
                                <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span> 
                                <span class="n">add_text</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
                                <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM_random_intercept_fit.png</span><span class="sh">"</span><span class="p">)</span>

<span class="nf">plot_residual_diagnostics</span><span class="p">(</span>
    <span class="n">residual</span><span class="o">=</span><span class="n">lmm_int_fit</span><span class="p">.</span><span class="n">resid</span><span class="p">,</span>
    <span class="n">prediction</span><span class="o">=</span><span class="n">lmm_int_fit</span><span class="p">.</span><span class="n">fittedvalues</span><span class="p">,</span>
    <span class="n">group</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM random intercept</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span>
    <span class="n">fname_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM_random_intercept</span><span class="sh">"</span><span class="p">)</span>

<span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span> <span class="o">=</span> <span class="nf">rmse_coef_tstat_pval</span><span class="p">(</span><span class="n">lmm_int_fit</span><span class="p">,</span> <span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">)</span>
<span class="n">results</span><span class="p">.</span><span class="n">loc</span><span class="p">[</span><span class="nf">len</span><span class="p">(</span><span class="n">results</span><span class="p">)]</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">LMM random intercept</span><span class="sh">"</span><span class="p">,</span> <span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span><span class="p">]</span>
</code></pre></div></div>

<p>Let’s discuss the results. Here is the LMM fit with one common slope and random intercepts per animal:</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/LMM_random_intercept_fit.png" title="Mixed model with one fixed slope and random intercepts per animal."><img src="/assets/images/posts/lmm/LMM_random_intercept_fit.png" width="100%" alt="Mixed model with one fixed slope and random intercepts per animal." /></a>
Mixed model with one fixed slope and random intercepts per animal. Colored lines show the animal specific fits. Colored arrows indicate the random intercept deviations relative to the population intercept.</p>

<p>This model has the same fixed effect structure as ANCOVA with fixed intercepts, but it treats animal intercepts as random draws from a Gaussian distribution with estimated variance. Conceptually, this is a generative statement about the population. Statistically, it induces partial pooling: Animal intercept estimates are shrunk toward the population intercept depending on how informative each animal’s data are. In balanced data with many observations per animal, fixed intercept ANCOVA and random intercept LMM can look extremely similar in fitted values and residual plots. The main difference is inferential: The LMM yields uncertainty estimates that are consistent with the hierarchical sampling model and supports generalization beyond the observed animals.</p>
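<p>The shrinkage itself is easy to inspect: Compare the per-animal intercepts from separate OLS fits (no pooling) with the population intercept plus each animal’s BLUP from the LMM (partial pooling). A minimal sketch, reusing df, smf, and lmm_int_fit from above; in this balanced toy dataset the differences are expectedly small:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># no pooling: a separate OLS fit per animal
pop_intercept = float(lmm_int_fit.fe_params["Intercept"])
for animal, sub in df.groupby("animal"):
    ols_intercept = float(smf.ols("y ~ x", data=sub).fit().params["Intercept"])
    # partial pooling: population intercept plus this animal's BLUP
    lmm_intercept = pop_intercept + float(lmm_int_fit.random_effects[animal].iloc[0])
    print(f"{animal}: OLS = {ols_intercept:.3f}, LMM = {lmm_intercept:.3f}")
</code></pre></div></div>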

<p class="align-caption"><a href="/assets/images/posts/lmm/LMM_random_intercept_qqplot.png" title="QQ plot of residuals for LMM random intercept model."><img src="/assets/images/posts/lmm/LMM_random_intercept_qqplot.png" width="70%" alt="QQ plot of residuals for LMM random intercept model." /></a><br />
QQ plot of residuals for the random intercept LMM.</p>

<p>Again, the QQ plot shows approximate normality of the residuals under the model and closely resembles the ANCOVA QQ plot. This similarity is expected because both models can capture baseline differences. The crucial point is that model adequacy is not judged solely by residual shapes: Mixed models encode a variance structure and thereby change inference on the fixed effects.</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/LMM_random_intercept_lm_diagnosis.png" title="Residual diagnostics for the random intercept LMM."><img src="/assets/images/posts/lmm/LMM_random_intercept_lm_diagnosis.png" width="100%" alt="Residual diagnostics for the random intercept LMM." /></a>
Residual diagnostics for the random intercept LMM.</p>

<p>In this toy dataset, the residual diagnostics plot can look nearly indistinguishable from the ANCOVA fixed intercept diagnostics. That is not a failure. It is a reminder that ANCOVA and LMM can represent similar mean structures in simple balanced settings. The reasons to prefer LMM are more visible when group sizes are small or unbalanced, when there are many groups, and when random slopes matter.</p>
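<p>To see this effect in the toy data, we can artificially unbalance the dataset and refit both models. A minimal sketch, reusing df, np, and smf from above; dropping all but 5 observations of subj_0 is an arbitrary illustrative choice:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># create an unbalanced version of the dataset (keep only 5 observations of subj_0):
rng = np.random.default_rng(0)
idx_subj0 = df.index[df["animal"] == "subj_0"]
drop_idx = rng.choice(idx_subj0, size=len(idx_subj0) - 5, replace=False)
df_unbal = df.drop(index=drop_idx)

# refit fixed intercept ANCOVA and random intercept LMM on the unbalanced data:
ancova_unbal = smf.ols("y ~ x + C(animal)", data=df_unbal).fit()
lmm_unbal = smf.mixedlm("y ~ x", data=df_unbal, groups=df_unbal["animal"],
                        re_formula="1").fit(reml=True, method="lbfgs")
print(ancova_unbal.params)       # fixed per-animal intercepts, no pooling
print(lmm_unbal.random_effects)  # BLUPs; the data-poor group is shrunk most
</code></pre></div></div>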

<h3 id="lmm-with-random-intercept-and-random-slope">LMM with random intercept and random slope</h3>
<p>As a direct follow-up, we fit a more complex LMM that allows both random intercepts and random slopes per animal. This model captures the idea that animals can differ not only in baseline response but also in how strongly their response changes with stimulus strength. The random effects are modeled as draws from a multivariate Gaussian distribution, allowing for covariance between intercepts and slopes.</p>
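<p>Written as a data-generating process, this amounts to a few lines of code: Draw per-animal pairs $(b_{0i}, b_{1i})$ from a bivariate Gaussian with covariance matrix $G$ and build each animal’s line from the population parameters plus these deviations. A minimal sketch of that generative model (all parameter values are arbitrary illustrations, not estimates from our data):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(42)
beta0, beta1 = 7.0, 0.25            # population intercept and slope
G = np.array([[4.0, -0.2],          # Var(b0),      Cov(b0, b1)
              [-0.2, 0.02]])        # Cov(b0, b1),  Var(b1)

# one (intercept deviation, slope deviation) pair per animal:
b = rng.multivariate_normal(mean=[0.0, 0.0], cov=G, size=3)
for i in range(3):
    x_i = rng.uniform(0, 10, size=40)
    y_i = (beta0 + b[i, 0]) + (beta1 + b[i, 1]) * x_i + rng.normal(0, 1.0, size=40)
</code></pre></div></div>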

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_lmm_randintercept_randslope</span><span class="p">(</span>
    <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">group</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">model</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">lmm_random_intercept_and_slope_fit.png</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">add_arrows</span><span class="o">=</span><span class="bp">True</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    LMM with random intercept and random slope:
        y ~ x + (1 + x | group)

    Plot:
      * points colored by group
      * dashed black: fixed effects only (population line)
      * solid colored: group-specific lines using BLUPs
    </span><span class="sh">"""</span>
    
    <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mf">6.5</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
    <span class="n">legend_on</span> <span class="o">=</span> <span class="bp">True</span> <span class="k">if</span> <span class="n">df</span><span class="p">[</span><span class="n">group</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">6</span> <span class="k">else</span> <span class="bp">False</span>
    <span class="n">sns</span><span class="p">.</span><span class="nf">scatterplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">y</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="n">group</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="mi">35</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.8</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
                    <span class="n">legend</span><span class="o">=</span><span class="n">legend_on</span><span class="p">,</span><span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">)</span>

    <span class="n">xline</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">linspace</span><span class="p">(</span><span class="n">df</span><span class="p">[</span><span class="n">x</span><span class="p">].</span><span class="nf">min</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="n">x</span><span class="p">].</span><span class="nf">max</span><span class="p">(),</span> <span class="mi">200</span><span class="p">)</span>

    <span class="n">beta0</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">Intercept</span><span class="sh">"</span><span class="p">])</span>
    <span class="n">beta1</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">x</span><span class="p">])</span>

    <span class="c1"># population line (fixed effects only)
</span>    <span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">xline</span><span class="p">,</span> <span class="n">beta0</span> <span class="o">+</span> <span class="n">beta1</span> <span class="o">*</span> <span class="n">xline</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">k</span><span class="sh">"</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">)</span>

    <span class="n">palette</span> <span class="o">=</span> <span class="n">itertools</span><span class="p">.</span><span class="nf">cycle</span><span class="p">(</span><span class="n">sns</span><span class="p">.</span><span class="nf">color_palette</span><span class="p">())</span>
    <span class="n">x_jitter</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.2</span>

    <span class="k">for</span> <span class="n">group_lab</span><span class="p">,</span> <span class="n">group_df</span> <span class="ow">in</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="n">group</span><span class="p">):</span>
        <span class="n">color</span> <span class="o">=</span> <span class="nf">next</span><span class="p">(</span><span class="n">palette</span><span class="p">)</span>

        <span class="n">re</span> <span class="o">=</span> <span class="n">model</span><span class="p">.</span><span class="n">random_effects</span><span class="p">[</span><span class="n">group_lab</span><span class="p">]</span>
        <span class="c1"># For re_formula="1 + x", statsmodels returns two entries (intercept, slope)
</span>        <span class="k">if</span> <span class="nf">isinstance</span><span class="p">(</span><span class="n">re</span><span class="p">,</span> <span class="n">pd</span><span class="p">.</span><span class="n">Series</span><span class="p">):</span>
            <span class="n">b0</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
            <span class="n">b1</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
        <span class="k">elif</span> <span class="nf">isinstance</span><span class="p">(</span><span class="n">re</span><span class="p">,</span> <span class="nb">dict</span><span class="p">):</span>
            <span class="n">vals</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="nf">values</span><span class="p">())</span>
            <span class="n">b0</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">vals</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
            <span class="n">b1</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">vals</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">b0</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
            <span class="n">b1</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>

        <span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">xline</span><span class="p">,</span> <span class="p">(</span><span class="n">beta0</span> <span class="o">+</span> <span class="n">b0</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="n">beta1</span> <span class="o">+</span> <span class="n">b1</span><span class="p">)</span> <span class="o">*</span> <span class="n">xline</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.5</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">add_arrows</span><span class="p">:</span>
            <span class="c1"># intercept arrow at x=0
</span>            <span class="n">ax</span><span class="p">.</span><span class="nf">arrow</span><span class="p">(</span>
                <span class="mi">0</span> <span class="o">+</span> <span class="n">x_jitter</span><span class="p">,</span> <span class="n">beta0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">b0</span><span class="p">,</span>
                <span class="n">head_width</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span> <span class="n">length_includes_head</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">)</span>
            <span class="c1"># small slope indication: show delta y over delta x=1 at x=0
</span>            <span class="n">ax</span><span class="p">.</span><span class="nf">arrow</span><span class="p">(</span>
                <span class="mi">0</span> <span class="o">+</span> <span class="n">x_jitter</span><span class="p">,</span> <span class="n">beta0</span> <span class="o">+</span> <span class="n">b0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="n">b1</span><span class="p">,</span>
                <span class="n">head_width</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span> <span class="n">length_includes_head</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">)</span>
            <span class="n">x_jitter</span> <span class="o">+=</span> <span class="mf">0.2</span>

    <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">LMM: random intercept and random slope</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="n">x</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: stimulus strength</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="n">y</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: neural response</span><span class="sh">"</span><span class="p">)</span>

    <span class="k">if</span> <span class="n">legend_on</span><span class="p">:</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.00</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">upper left</span><span class="sh">'</span><span class="p">)</span>

    <span class="c1"># deactivate left and bottom spines for aesthetics:
</span>    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">'</span><span class="s">left</span><span class="sh">'</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">'</span><span class="s">bottom</span><span class="sh">'</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">outpath</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lmm_slope</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">mixedlm</span><span class="p">(</span><span class="sh">"</span><span class="s">y ~ x</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">groups</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span> <span class="n">re_formula</span><span class="o">=</span><span class="sh">"</span><span class="s">1 + x</span><span class="sh">"</span><span class="p">)</span>
<span class="n">lmm_slope_fit</span> <span class="o">=</span> <span class="n">lmm_slope</span><span class="p">.</span><span class="nf">fit</span><span class="p">(</span><span class="n">reml</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">method</span><span class="o">=</span><span class="sh">"</span><span class="s">lbfgs</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">LMM random intercept and slope: y ~ x + (1 + x | animal)</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">lmm_slope_fit</span><span class="p">.</span><span class="nf">summary</span><span class="p">())</span>
</code></pre></div></div>

<pre><code class="language-commandline">LMM random intercept and slope: y ~ x + (1 + x | animal)
         Mixed Linear Model Regression Results
=======================================================
Model:            MixedLM Dependent Variable: y        
No. Observations: 120     Method:             REML     
No. Groups:       3       Scale:              0.8892   
Min. group size:  40      Log-Likelihood:     -173.4425
Max. group size:  40      Converged:          Yes      
Mean group size:  40.0                                 
-------------------------------------------------------
              Coef.  Std.Err.   z   P&gt;|z| [0.025 0.975]
-------------------------------------------------------
Intercept      7.276    1.246 5.841 0.000  4.834  9.717
x              0.226    0.088 2.570 0.010  0.054  0.398
Group Var      4.564    4.981                          
Group x x Cov -0.211    0.300                          
x Var          0.021    0.025                          
=======================================================
</code></pre>

<p>As before, the summary output shows the estimated fixed effects (intercept and slope for $x$) along with their standard errors, z statistics, and p values. The estimated slope is 0.226 with a standard error of 0.088, leading to a z statistic of 2.570 and a significant p value of 0.010. Additionally, the variance of the random intercepts (“Group Var”), the covariance between random intercepts and slopes (“Group x x Cov”), and the variance of the random slopes (“x Var”) are also estimated. What does this tell us? The positive random intercept variance indicates that animals differ in baseline response. The small negative covariance suggests that animals with higher intercepts tend to have slightly lower slopes, although this estimate is uncertain. The positive random slope variance indicates that animals differ in their sensitivity to stimulus strength.</p>
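<p>The covariance is easier to read as a correlation between random intercepts and slopes. It can be computed from the estimated random effects covariance matrix; a minimal sketch, assuming the lmm_slope_fit object from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

G_hat = np.asarray(lmm_slope_fit.cov_re)    # 2x2 random effects covariance matrix
var_b0, var_b1 = G_hat[0, 0], G_hat[1, 1]   # intercept and slope variances
cov_b0b1 = G_hat[0, 1]                      # intercept-slope covariance
corr = cov_b0b1 / np.sqrt(var_b0 * var_b1)
print(f"implied intercept-slope correlation: {corr:.2f}")  # about -0.68 here
</code></pre></div></div>

<p>Given the large standard errors on these variance components, this correlation should be read as a rough tendency rather than a precise estimate.</p>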

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">plot_qq_diagnostics</span><span class="p">(</span>
    <span class="n">resid</span><span class="o">=</span><span class="n">lmm_slope_fit</span><span class="p">.</span><span class="n">resid</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">fitted</span><span class="o">=</span><span class="n">lmm_slope_fit</span><span class="p">.</span><span class="n">fittedvalues</span><span class="p">.</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">groups</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">].</span><span class="nf">to_numpy</span><span class="p">(),</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM random intercept and</span><span class="se">\n</span><span class="s">slope</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">)</span>
    
<span class="nf">plot_lmm_randintercept_randslope</span><span class="p">(</span>
    <span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">,</span> <span class="n">group</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">model</span><span class="o">=</span><span class="n">lmm_slope_fit</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM_random_intercept_and_slope_fit.png</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">add_arrows</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

<span class="nf">plot_residual_diagnostics</span><span class="p">(</span>
    <span class="n">residual</span><span class="o">=</span><span class="n">lmm_slope_fit</span><span class="p">.</span><span class="n">resid</span><span class="p">,</span>
    <span class="n">prediction</span><span class="o">=</span><span class="n">lmm_slope_fit</span><span class="p">.</span><span class="n">fittedvalues</span><span class="p">,</span>
    <span class="n">group</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM random intercept and slope</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span>
    <span class="n">fname_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">LMM_random_intercept_slope</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># print random effects (BLUPs) per animal:
</span><span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">Random effects (BLUPs) from random slope model:</span><span class="sh">"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">k</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">lmm_slope_fit</span><span class="p">.</span><span class="n">random_effects</span><span class="p">.</span><span class="nf">items</span><span class="p">():</span>
    <span class="c1"># v is a Series with entries like "Group" (intercept) and "x"
</span>    <span class="nf">print</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="nf">dict</span><span class="p">(</span><span class="n">v</span><span class="p">))</span>
</code></pre></div></div>

<pre><code class="language-commandline">Random effects (BLUPs) from random slope model:
subj_0 {'Group': np.float64(-2.437507828143907), 'x': np.float64(0.10023661532026334)}
subj_1 {'Group': np.float64(1.3737534180704032), 'x': np.float64(-0.15613341169987927)}
subj_2 {'Group': np.float64(1.063754410072425), 'x': np.float64(0.055896796379645015)}
</code></pre>

<p>We additionally print the random effects (BLUPs) per animal. Each animal has a random intercept (“Group”) and a random slope (“x”). For example, subj_0 has a random intercept of approximately -2.44 and a random slope of approximately 0.10. These values indicate how much the animal’s intercept and slope deviate from the population values.</p>
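<p>Adding the fixed effects to these BLUPs yields the effective per-animal regression lines; a minimal sketch using the fitted object from above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>beta0 = float(lmm_slope_fit.fe_params["Intercept"])
beta1 = float(lmm_slope_fit.fe_params["x"])

# effective per-animal line: fixed effect plus random effect deviation
for animal, re_vals in lmm_slope_fit.random_effects.items():
    eff_intercept = beta0 + float(re_vals["Group"])
    eff_slope = beta1 + float(re_vals["x"])
    print(f"{animal}: intercept = {eff_intercept:.3f}, slope = {eff_slope:.3f}")
</code></pre></div></div>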

<p>Let’s now discuss the diagnostics plots. Here is the LMM fit with random intercepts and random slopes per animal:</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/LMM_random_intercept_and_slope_fit.png" title="LMM_random_intercept_and_slope_fit."><img src="/assets/images/posts/lmm/LMM_random_intercept_and_slope_fit.png" width="100%" alt="LMM_random_intercept_and_slope_fit." /></a>
Mixed model with random intercepts and random slopes per animal. The dashed black line is the population line (fixed effects only). Colored lines are animal specific fits using random effect BLUPs.</p>

<p>This model reflects a very common situation in neuroscience: Not only do baselines differ between animals, but stimulus sensitivity differs as well. The dashed line represents $\beta_0 + \beta_1 x$. Each animal has its own intercept shift $b_{0i}$ and slope shift $b_{1i}$, producing a family of lines. The vertical colored arrows indicate intercept deviations from the population line at $x=0$. The second set of colored arrows indicates slope deviations in the following sense: Each arrow is drawn with $\Delta x = 1$ and vertical component $\Delta y = b_{1i}$, visualizing how much steeper or flatter the animal’s line is compared to the population line, evaluated locally at the intercept. These arrows are not separate parameters; they are graphical cues for $b_{0i}$ and $b_{1i}$.</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/LMM_random_intercept_and_slope_qqplot.png" title="QQ plot of residuals for the random intercept and slope LMM."><img src="/assets/images/posts/lmm/LMM_random_intercept_and_slope_qqplot.png" width="70%" alt="QQ plot of residuals for the random intercept and slope LMM." /></a><br />
QQ plot of residuals for the random intercept and slope LMM.</p>

<p>Depending on the simulation and the number of animals, residual distributions can broaden or narrow compared to simpler models. A narrower residual distribution does not automatically imply a better model: When groups are treated as fixed and are numerous, part of what looks like “tight residuals” can come from overfitting group specific parameters. Mixed models instead allocate variance to random effect components, which is often the correct representation of how data were generated. Diagnostics should be read in conjunction with the model’s intended generalization target.</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/LMM_random_intercept_slope_lm_diagnosis.png" title="Residual diagnostics for the random intercept and slope LMM."><img src="/assets/images/posts/lmm/LMM_random_intercept_slope_lm_diagnosis.png" width="100%" alt="Residual diagnostics for the random intercept and slope LMM." /></a>
Residual diagnostics for the random intercept and slope LMM.</p>

<p>As in all previous models, the residual diagnostics can look similar between ANCOVA and LMM, and so it is here, because both model classes can accommodate group specific slopes (ANCOVA via interaction terms, the LMM via random slopes). The decisive difference is not the existence of group specific slopes but how they are estimated and regularized. LMMs shrink group slopes toward the population slope when group specific evidence is weak. This matters dramatically when groups are many and per group data are scarce.</p>
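<p>This shrinkage can be checked numerically by comparing per-animal OLS slopes (no pooling) with the effective LMM slopes (partial pooling). A minimal sketch, reusing df, smf, and lmm_slope_fit from above; in this balanced toy dataset the differences are small, but they grow as per-group data become scarce:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>beta1 = float(lmm_slope_fit.fe_params["x"])
for animal, sub in df.groupby("animal"):
    # no pooling: slope from a separate per-animal OLS fit
    ols_slope = float(smf.ols("y ~ x", data=sub).fit().params["x"])
    # partial pooling: population slope plus this animal's slope BLUP
    lmm_slope = beta1 + float(lmm_slope_fit.random_effects[animal]["x"])
    print(f"{animal}: OLS slope = {ols_slope:.3f}, LMM slope = {lmm_slope:.3f}")
</code></pre></div></div>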

<p>We will make this concrete with one more comparison, for which we implement a final model: ANCOVA with interaction.</p>

<p>Before that, let’s not forget to store the results from the random intercept and slope LMM:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span> <span class="o">=</span> <span class="nf">rmse_coef_tstat_pval</span><span class="p">(</span><span class="n">lmm_slope_fit</span><span class="p">,</span> <span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">)</span>
<span class="n">results</span><span class="p">.</span><span class="n">loc</span><span class="p">[</span><span class="nf">len</span><span class="p">(</span><span class="n">results</span><span class="p">)]</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">LMM random intercept + slope</span><span class="sh">"</span><span class="p">,</span> <span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span><span class="p">]</span>
</code></pre></div></div>

<h3 id="ancova-full-model-with-interaction">ANCOVA full model with interaction</h3>
<p>To complete the comparison, we fit an ANCOVA model with interaction between stimulus strength and animal. This model allows each animal to have its own intercept and slope, estimated as fixed parameters without pooling. This is the direct fixed effects analogue to the random intercept and slope LMM.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">plot_ancova_fullmodel</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">group</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">model</span><span class="p">,</span> <span class="n">outpath</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA_fullmodel.png</span><span class="sh">"</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    ANCOVA full model with interaction:
      y ~ x * C(group)
    Equivalent to:
      y ~ x + C(group) + x:C(group)

    Plot:
      * points per group
      * dashed black lines: </span><span class="sh">"</span><span class="s">no-interaction</span><span class="sh">"</span><span class="s"> reference with group intercept offsets
            y_ref_g(x) = Intercept + beta_x * x + offset_g
        where beta_x is the slope of the reference group (from the fitted interaction model)
        and offset_g is the group intercept shift (C(group)[T.g]).
      * colored lines: full interaction model predictions within each group
      * arrows: show offset_g at x=0
    </span><span class="sh">"""</span>
    <span class="n">legend_on</span> <span class="o">=</span> <span class="bp">True</span> <span class="k">if</span> <span class="n">df</span><span class="p">[</span><span class="n">group</span><span class="p">].</span><span class="nf">nunique</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">6</span> <span class="k">else</span> <span class="bp">False</span>
    <span class="n">g</span> <span class="o">=</span> <span class="n">sns</span><span class="p">.</span><span class="nf">lmplot</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="n">y</span><span class="p">,</span> <span class="n">hue</span><span class="o">=</span><span class="n">group</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">fit_reg</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">height</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="n">aspect</span><span class="o">=</span><span class="mf">1.6</span><span class="p">,</span>
                   <span class="n">legend</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">ax</span> <span class="o">=</span> <span class="n">g</span><span class="p">.</span><span class="n">ax</span>

    <span class="n">palette</span> <span class="o">=</span> <span class="n">itertools</span><span class="p">.</span><span class="nf">cycle</span><span class="p">(</span><span class="n">sns</span><span class="p">.</span><span class="nf">color_palette</span><span class="p">())</span>
    <span class="n">x_jitter</span> <span class="o">=</span> <span class="o">-</span><span class="mf">0.2</span>

    <span class="n">base_intercept</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="sh">"</span><span class="s">Intercept</span><span class="sh">"</span><span class="p">])</span>
    <span class="n">base_slope</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">x</span><span class="p">])</span>  <span class="c1"># slope for the reference group in the interaction model
</span>
    <span class="k">for</span> <span class="n">group_lab</span><span class="p">,</span> <span class="n">group_df</span> <span class="ow">in</span> <span class="n">df</span><span class="p">.</span><span class="nf">groupby</span><span class="p">(</span><span class="n">group</span><span class="p">):</span>
        <span class="n">color</span> <span class="o">=</span> <span class="nf">next</span><span class="p">(</span><span class="n">palette</span><span class="p">)</span>

        <span class="n">x_</span> <span class="o">=</span> <span class="n">group_df</span><span class="p">[</span><span class="n">x</span><span class="p">].</span><span class="nf">to_numpy</span><span class="p">()</span>
        <span class="n">order</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">argsort</span><span class="p">(</span><span class="n">x_</span><span class="p">)</span>
        <span class="n">x_sorted</span> <span class="o">=</span> <span class="n">x_</span><span class="p">[</span><span class="n">order</span><span class="p">]</span>

        <span class="c1"># --- fixed intercept offset for this group (reference group has 0) ---
</span>        <span class="n">key</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="s">C(</span><span class="si">{</span><span class="n">group</span><span class="si">}</span><span class="s">)[T.</span><span class="si">{</span><span class="n">group_lab</span><span class="si">}</span><span class="s">]</span><span class="sh">"</span>
        <span class="n">group_offset</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">key</span><span class="p">])</span> <span class="k">if</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">model</span><span class="p">.</span><span class="n">params</span> <span class="k">else</span> <span class="mf">0.0</span>

        <span class="c1"># --- dashed reference: same slope, different intercepts (NO interaction slopes) ---
</span>        <span class="n">y_ref</span> <span class="o">=</span> <span class="n">base_intercept</span> <span class="o">+</span> <span class="n">base_slope</span> <span class="o">*</span> <span class="n">x_sorted</span> <span class="o">+</span> <span class="n">group_offset</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">x_sorted</span><span class="p">,</span> <span class="n">y_ref</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">"</span><span class="s">k</span><span class="sh">"</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="sh">"</span><span class="s">--</span><span class="sh">"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">,</span> <span class="n">zorder</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>

        <span class="c1"># --- colored: full interaction predictions (group specific slope + intercept) ---
</span>        <span class="n">y_pred</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">asarray</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="nf">predict</span><span class="p">(</span><span class="n">group_df</span><span class="p">))[</span><span class="n">order</span><span class="p">]</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">x_sorted</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.5</span><span class="p">)</span>

        <span class="c1"># --- arrow indicates the intercept shift at x=0 ---
</span>        <span class="n">ax</span><span class="p">.</span><span class="nf">arrow</span><span class="p">(</span><span class="mi">0</span> <span class="o">+</span> <span class="n">x_jitter</span><span class="p">,</span> <span class="n">base_intercept</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">group_offset</span><span class="p">,</span>
                 <span class="n">head_width</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span> <span class="n">length_includes_head</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">color</span><span class="p">)</span>
        <span class="n">x_jitter</span> <span class="o">+=</span> <span class="mf">0.2</span>

    <span class="c1"># deactivate left and bottom spines for aesthetics:
</span>    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">'</span><span class="s">left</span><span class="sh">'</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="sh">'</span><span class="s">bottom</span><span class="sh">'</span><span class="p">].</span><span class="nf">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
    
    <span class="c1"># set proper x and y labels:
</span>    <span class="n">ax</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="n">x</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: stimulus strength</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="n">y</span> <span class="o">+</span> <span class="sh">"</span><span class="s">: neural response</span><span class="sh">"</span><span class="p">)</span>
    
    <span class="c1"># if legend_on, put the legend right outside the plot
</span>    <span class="k">if</span> <span class="n">legend_on</span><span class="p">:</span>
        <span class="n">ax</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.00</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="sh">'</span><span class="s">upper left</span><span class="sh">'</span><span class="p">)</span>

    <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="sh">"</span><span class="s">ANCOVA full: group specific intercepts</span><span class="se">\n</span><span class="s">and slopes(interaction)</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>

    <span class="k">if</span> <span class="n">outpath</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ancova_full</span> <span class="o">=</span> <span class="n">smf</span><span class="p">.</span><span class="nf">ols</span><span class="p">(</span><span class="sh">"</span><span class="s">y ~ x * C(animal)</span><span class="sh">"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">df</span><span class="p">).</span><span class="nf">fit</span><span class="p">()</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">ANCOVA full: y ~ x * C(animal)</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">ancova_full</span><span class="p">.</span><span class="nf">summary</span><span class="p">().</span><span class="n">tables</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div></div>

<pre><code class="language-commandline">ANCOVA full: y ~ x * C(animal)
=========================================================================================
                            coef    std err          t      P&gt;|t|      [0.025      0.975]
-----------------------------------------------------------------------------------------
Intercept                 4.7980      0.303     15.821      0.000       4.197       5.399
C(animal)[T.subj_1]       3.9579      0.442      8.956      0.000       3.082       4.833
C(animal)[T.subj_2]       3.4834      0.411      8.480      0.000       2.670       4.297
x                         0.3309      0.047      7.048      0.000       0.238       0.424
x:C(animal)[T.subj_1]    -0.2796      0.067     -4.144      0.000      -0.413      -0.146
x:C(animal)[T.subj_2]    -0.0337      0.068     -0.495      0.622      -0.169       0.101
=========================================================================================
</code></pre>

<p>From the summary output, we see the estimated coefficients for the intercept, the animal-specific intercept offsets, the slope for $x$, and the interaction terms representing animal-specific slope adjustments. Each animal has its own intercept and slope, estimated as fixed effects. Interpreting the coefficients, we find that the reference animal (subj_0) has an intercept of 4.7980 and a slope of 0.3309. Animal subj_1 has an intercept offset of 3.9579 and a slope adjustment of -0.2796, resulting in an effective intercept of 8.7559 and a slope of 0.0513. Animal subj_2 has an intercept offset of 3.4834 and a slope adjustment of -0.0337, leading to an effective intercept of 8.2814 and a slope of 0.2972. What does this indicate? It means that animal subj_1 has a much flatter response to stimulus strength compared to the reference animal, while animal subj_2 has a slope closer to the reference but still slightly reduced.</p>
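<p>These effective per-animal intercepts and slopes can also be computed directly from the fitted parameters instead of by hand; a minimal sketch, assuming the ancova_full fit from above (the parameter names follow the patsy naming visible in the summary table):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>params = ancova_full.params
base_intercept = float(params["Intercept"])
base_slope = float(params["x"])

# for the reference animal the offset keys are absent, so default to 0:
for animal in df["animal"].unique():
    off_key, slope_key = f"C(animal)[T.{animal}]", f"x:C(animal)[T.{animal}]"
    intercept = base_intercept + float(params.get(off_key, 0.0))
    slope = base_slope + float(params.get(slope_key, 0.0))
    print(f"{animal}: intercept = {intercept:.4f}, slope = {slope:.4f}")
</code></pre></div></div>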

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">plot_ancova_fullmodel</span><span class="p">(</span>
    <span class="n">x</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">,</span> <span class="n">y</span><span class="o">=</span><span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">,</span> <span class="n">group</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="n">df</span><span class="p">,</span> <span class="n">model</span><span class="o">=</span><span class="n">ancova_full</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA_full_interaction.png</span><span class="sh">"</span><span class="p">)</span>

<span class="nf">plot_residual_diagnostics</span><span class="p">(</span>
    <span class="n">residual</span><span class="o">=</span><span class="n">ancova_full</span><span class="p">.</span><span class="n">resid</span><span class="p">,</span>
    <span class="n">prediction</span><span class="o">=</span><span class="n">ancova_full</span><span class="p">.</span><span class="nf">predict</span><span class="p">(</span><span class="n">df</span><span class="p">),</span>
    <span class="n">group</span><span class="o">=</span><span class="n">df</span><span class="p">[</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">],</span>
    <span class="n">title_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA full (interaction)</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">,</span>
    <span class="n">fname_prefix</span><span class="o">=</span><span class="sh">"</span><span class="s">ANCOVA_full</span><span class="sh">"</span><span class="p">)</span>

<span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span> <span class="o">=</span> <span class="nf">rmse_coef_tstat_pval</span><span class="p">(</span><span class="n">ancova_full</span><span class="p">,</span> <span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">)</span>
<span class="n">results</span><span class="p">.</span><span class="n">loc</span><span class="p">[</span><span class="nf">len</span><span class="p">(</span><span class="n">results</span><span class="p">)]</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">ANCOVA full (interaction)</span><span class="sh">"</span><span class="p">,</span> <span class="n">rmse</span><span class="p">,</span> <span class="n">coef</span><span class="p">,</span> <span class="n">stat</span><span class="p">,</span> <span class="n">pval</span><span class="p">]</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/lmm/ANCOVA_full_interaction.png" title="ANCOVA with group specific intercepts and group specific slopes via interaction terms."><img src="/assets/images/posts/lmm/ANCOVA_full_interaction.png" width="100%" alt="ANCOVA with group specific intercepts and group specific slopes via interaction terms." /></a>
ANCOVA with group specific intercepts and group specific slopes via interaction terms. Dashed black lines show a no interaction reference in which groups differ only by intercept offsets but share the reference slope. Colored lines are full interaction fits. Arrows indicate intercept offsets.</p>

<p>This model assigns each group its own slope and intercept as fixed parameters. With few groups and many observations per group, this can be perfectly reasonable. However, it does not encode any population distribution over slopes and intercepts. Each group parameter is estimated independently, and the model therefore lacks shrinkage. This tends to exaggerate differences between groups when groups are numerous or poorly sampled. It also makes generalization beyond the observed groups conceptually unclear because the model treats group identities as fixed categories rather than as random draws.</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/ANCOVA_full_lm_diagnosis.png" title="Residual diagnostics for ANCOVA with interaction."><img src="/assets/images/posts/lmm/ANCOVA_full_lm_diagnosis.png" width="100%" alt="Residual diagnostics for ANCOVA with interaction." /></a>
Residual diagnostics for ANCOVA with interaction.</p>

<p>Again, the residual plots can look very similar to those of the random slope LMM. A slightly narrower residual distribution in ANCOVA does not imply superiority. With many groups, an interaction model can soak up variance by fitting a large number of slope parameters. That may improve in-sample fit, but it increases the variance of the group specific estimates and can harm generalization. Mixed models trade a small amount of in-sample fit for stability by regularizing group effects through a variance component.</p>

<h3 id="shrinkage-why-lmm-and-ancova-can-look-similar-yet-behave-differently">Shrinkage: Why LMM and ANCOVA can look similar yet behave differently</h3>
<p>Residual plots do not show shrinkage directly. Shrinkage is a property of the conditional group estimates. A direct visualization is to compare the group specific slopes estimated by the ANCOVA interaction model to the group specific slopes implied by the LMM BLUPs (best linear unbiased predictors):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">extract_group_slopes_from_ancova_full</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">group_col</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">x_col</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    For y ~ x * C(group), compute the slope per group:
      slope_group = beta_x + beta_{x:C(group)[T.group]}
    Reference group gets just beta_x.
    </span><span class="sh">"""</span>
    <span class="n">beta_x</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">x_col</span><span class="p">])</span>
    <span class="n">slopes</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="n">groups</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="n">group_col</span><span class="p">].</span><span class="nf">unique</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">g</span> <span class="ow">in</span> <span class="n">groups</span><span class="p">:</span>
        <span class="n">key</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">x_col</span><span class="si">}</span><span class="s">:C(</span><span class="si">{</span><span class="n">group_col</span><span class="si">}</span><span class="s">)[T.</span><span class="si">{</span><span class="n">g</span><span class="si">}</span><span class="s">]</span><span class="sh">"</span>
        <span class="n">slopes</span><span class="p">[</span><span class="n">g</span><span class="p">]</span> <span class="o">=</span> <span class="n">beta_x</span> <span class="o">+</span> <span class="p">(</span><span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">key</span><span class="p">])</span> <span class="k">if</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">model</span><span class="p">.</span><span class="n">params</span> <span class="k">else</span> <span class="mf">0.0</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">slopes</span>

<span class="k">def</span> <span class="nf">extract_group_slopes_from_lmm</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">group_col</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">x_col</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    For MixedLM with re_formula=</span><span class="sh">"</span><span class="s">1 + x</span><span class="sh">"</span><span class="s">, slope per group is:
      slope_group = beta_x + b1_group
    </span><span class="sh">"""</span>
    <span class="n">beta_x</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">model</span><span class="p">.</span><span class="n">params</span><span class="p">[</span><span class="n">x_col</span><span class="p">])</span>
    <span class="n">slopes</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="n">g</span> <span class="ow">in</span> <span class="n">df</span><span class="p">[</span><span class="n">group_col</span><span class="p">].</span><span class="nf">unique</span><span class="p">():</span>
        <span class="n">re</span> <span class="o">=</span> <span class="n">model</span><span class="p">.</span><span class="n">random_effects</span><span class="p">[</span><span class="n">g</span><span class="p">]</span>
        <span class="k">if</span> <span class="nf">isinstance</span><span class="p">(</span><span class="n">re</span><span class="p">,</span> <span class="n">pd</span><span class="p">.</span><span class="n">Series</span><span class="p">):</span>
            <span class="n">b1</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">b1</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">re</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
        <span class="n">slopes</span><span class="p">[</span><span class="n">g</span><span class="p">]</span> <span class="o">=</span> <span class="n">beta_x</span> <span class="o">+</span> <span class="n">b1</span>
    <span class="k">return</span> <span class="n">slopes</span>

<span class="k">def</span> <span class="nf">plot_slope_comparison</span><span class="p">(</span><span class="n">slopes_ancova</span><span class="p">,</span> <span class="n">slopes_lmm</span><span class="p">,</span> <span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="o">=</span><span class="sh">"</span><span class="s">slope_comparison.png</span><span class="sh">"</span><span class="p">):</span>
    <span class="n">keys</span> <span class="o">=</span> <span class="nf">sorted</span><span class="p">(</span><span class="nf">set</span><span class="p">(</span><span class="n">slopes_ancova</span><span class="p">.</span><span class="nf">keys</span><span class="p">())</span> <span class="o">&amp;</span> <span class="nf">set</span><span class="p">(</span><span class="n">slopes_lmm</span><span class="p">.</span><span class="nf">keys</span><span class="p">()))</span>
    <span class="n">y1</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">slopes_ancova</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">])</span>
    <span class="n">y2</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">slopes_lmm</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="n">keys</span><span class="p">])</span>

    <span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
    <span class="k">for</span> <span class="n">subject_i</span><span class="p">,</span> <span class="n">subject</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">keys</span><span class="p">):</span>
        <span class="n">plt</span><span class="p">.</span><span class="nf">scatter</span><span class="p">(</span><span class="n">y1</span><span class="p">[</span><span class="n">subject_i</span><span class="p">],</span> <span class="n">y2</span><span class="p">[</span><span class="n">subject_i</span><span class="p">],</span> <span class="n">s</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.8</span><span class="p">)</span>
    <span class="n">lo</span> <span class="o">=</span> <span class="nf">min</span><span class="p">(</span><span class="n">y1</span><span class="p">.</span><span class="nf">min</span><span class="p">(),</span> <span class="n">y2</span><span class="p">.</span><span class="nf">min</span><span class="p">())</span>
    <span class="n">hi</span> <span class="o">=</span> <span class="nf">max</span><span class="p">(</span><span class="n">y1</span><span class="p">.</span><span class="nf">max</span><span class="p">(),</span> <span class="n">y2</span><span class="p">.</span><span class="nf">max</span><span class="p">())</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">([</span><span class="n">lo</span><span class="p">,</span> <span class="n">hi</span><span class="p">],</span> <span class="p">[</span><span class="n">lo</span><span class="p">,</span> <span class="n">hi</span><span class="p">],</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">2.0</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">Group slope, ANCOVA full</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sh">"</span><span class="s">Group slope, LMM BLUP</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">"</span><span class="s">Group-specific slopes</span><span class="se">\n</span><span class="s">ANCOVA full versus LMM (shrinkage)</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">outpath</span><span class="p">,</span> <span class="n">fname</span><span class="p">),</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">slopes_ancova</span> <span class="o">=</span> <span class="nf">extract_group_slopes_from_ancova_full</span><span class="p">(</span><span class="n">ancova_full</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">group_col</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">x_col</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">)</span>
<span class="n">slopes_lmm</span> <span class="o">=</span> <span class="nf">extract_group_slopes_from_lmm</span><span class="p">(</span><span class="n">lmm_slope_fit</span><span class="p">,</span> <span class="n">df</span><span class="p">,</span> <span class="n">group_col</span><span class="o">=</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">,</span> <span class="n">x_col</span><span class="o">=</span><span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">plot_slope_comparison</span><span class="p">(</span><span class="n">slopes_ancova</span><span class="p">,</span> <span class="n">slopes_lmm</span><span class="p">,</span> <span class="n">outpath</span><span class="o">=</span><span class="n">outpath</span><span class="p">)</span>
</code></pre></div></div>

<p class="align-caption"><a href="/assets/images/posts/lmm/slope_comparison.png" title="Group specific slope comparison between ANCOVA full and LMM."><img src="/assets/images/posts/lmm/slope_comparison.png" width="100%" alt="Group specific slope comparison between ANCOVA full and LMM." /></a>
Group specific slope comparison between ANCOVA full and LMM. The horizontal axis shows group slope estimates from the ANCOVA interaction model. The vertical axis shows group slope estimates from the LMM via BLUPs. The diagonal indicates equality.</p>

<p>In this particular toy dataset, the three animals fall essentially on the diagonal, so the plot does not show a dramatic separation between ANCOVA and LMM. This is expected given how the dataset was generated and how much information each animal carries: There are only three animals, each with many observations, and the design is balanced. In that regime, the group specific slopes are well identified from each animal’s own data, so the mixed model has little reason to shrink them strongly toward the population slope. In other words, the conditional estimates under the LMM end up close to the corresponding group specific estimates one would obtain from a fixed interaction model. The visual message here is therefore not “shrinkage is large”, but rather “shrinkage can be negligible when each group is well sampled”.</p>

<p>This does not weaken the conceptual distinction. The distinction is that ANCOVA with interactions treats each group’s slope as an independent fixed parameter, while the LMM treats slopes as random draws from a population distribution with an estimated variance. The mixed model therefore induces partial pooling: The group specific slope estimate is pulled toward the population slope by an amount that depends on how informative that group is relative to the estimated between group variance and the residual noise level. If a group has little information, for example few trials, high noise, or limited spread in the predictor $x$, the LMM shrinks its slope toward the population value. If a group has a lot of information, shrinkage becomes small and the group estimate approaches what a group specific fit would deliver.</p>
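<p>To make the pooling weight concrete, consider the simplest case of a random intercept model with $n_i$ observations in group $i$. The conditional (BLUP) estimate of the group effect is a precision weighted compromise between the group’s own mean and the estimated population mean $\hat{\mu}$:</p>

\[\begin{align}
\hat{b}_i &amp;= \frac{\tau^2}{\tau^2 + \sigma^2 / n_i} \left( \bar{y}_i - \hat{\mu} \right),
\end{align}\]

<p>where $\tau^2$ is the between group variance and $\sigma^2$ the residual variance. As $n_i$ grows, the weight approaches one and shrinkage vanishes; for small or noisy groups, the weight approaches zero and the estimate is pulled toward the population mean. The same logic carries over to random slopes, with the weight determined by the slope variance component and how much information the group carries about its slope.</p>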

<p>The practical consequence is that ANCOVA and LMM can look nearly identical in residual plots and even in fitted lines in small, balanced, high information examples, yet they behave differently once the realistic regime is entered: many animals, few observations per animal, unbalanced sampling, or correlated random intercept and slope structure. In those regimes, ANCOVA interaction slopes can fluctuate widely because each slope is estimated without regularization, while LMM slopes remain stable because the model borrows strength across animals through the random effects variance structure. This is the setting in which shrinkage becomes visually obvious and methodologically decisive.</p>

<h3 id="results-summary-table">Results summary table</h3>
<p>Throughout the model fitting, we have collected RMSE, estimated $x$ coefficient, its test statistic, and p value for each model:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">print</span><span class="p">(</span><span class="n">results</span><span class="p">)</span>
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Model</th>
      <th style="text-align: right">RMSE</th>
      <th style="text-align: right">Coef_x</th>
      <th style="text-align: right">Stat_x</th>
      <th style="text-align: right">Pval_x</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left"><strong>OLS global</strong> (biased SE)</td>
      <td style="text-align: right">1.698294</td>
      <td style="text-align: right">0.189848</td>
      <td style="text-align: right">3.831865</td>
      <td style="text-align: right">2.053972e-04</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>ANCOVA fixed intercepts</strong></td>
      <td style="text-align: right">1.013649</td>
      <td style="text-align: right">0.227828</td>
      <td style="text-align: right">7.607805</td>
      <td style="text-align: right">8.031926e-12</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Two-stage hierarchical</strong> (mean slope)</td>
      <td style="text-align: right">0.152641</td>
      <td style="text-align: right">0.226438</td>
      <td style="text-align: right">2.569429</td>
      <td style="text-align: right">1.239321e-01</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>LMM random intercept</strong></td>
      <td style="text-align: right">1.005103</td>
      <td style="text-align: right">0.227469</td>
      <td style="text-align: right">7.596175</td>
      <td style="text-align: right">3.050128e-14</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>LMM random intercept + slope</strong></td>
      <td style="text-align: right">0.927902</td>
      <td style="text-align: right">0.225858</td>
      <td style="text-align: right">2.570196</td>
      <td style="text-align: right">1.016411e-02</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>ANCOVA full (interaction)</strong></td>
      <td style="text-align: right">0.942923</td>
      <td style="text-align: right">0.330894</td>
      <td style="text-align: right">7.048024</td>
      <td style="text-align: right">1.478715e-10</td>
    </tr>
  </tbody>
</table>

<p>Two clarifications are important before interpreting the numbers.  First, RMSE is an in-sample fit measure and not the goal of inferential modeling. Lower RMSE can be achieved simply by adding parameters and does not by itself imply a better inferential model.  Second, test statistics are not strictly comparable across models: OLS-based models report t statistics, whereas <code class="language-plaintext highlighter-rouge">statsmodels</code> reports Wald z statistics for mixed models.</p>

<p>With this in mind, several clear patterns emerge from the table.</p>

<p>The global OLS model performs worst. Its RMSE is highest, and the slope estimate is attenuated. This reflects a misspecification: Repeated measurements within animals are treated as independent, inflating residual variance and biasing standard errors.</p>

<p>Introducing animal-specific intercepts via ANCOVA substantially improves the fit and stabilizes the slope estimate. The corresponding LMM with random intercepts yields nearly identical fixed-effect estimates and RMSE in this balanced toy dataset. This similarity is expected. However, the interpretation differs: ANCOVA conditions on the observed animals, while the LMM treats animals as a random sample from a population and explicitly estimates between-animal variance.</p>

<p>The two-stage hierarchical approach illustrates a common pitfall. By collapsing each animal to a single slope estimate, the effective sample size for inference becomes the number of animals. As a result, statistical power is low, and uncertainty is underestimated in an ad hoc way. This approach discards information that mixed models retain naturally.</p>
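<p>To illustrate the mechanics, a two stage analysis of this kind can be sketched in a few lines. This is a hypothetical minimal version (the exact implementation used for the table above may differ):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from scipy import stats

# stage 1: collapse each animal to a single OLS slope estimate
per_animal_slopes = [
    smf.ols("y ~ x", data=sub).fit().params["x"]
    for _, sub in df.groupby("animal")
]

# stage 2: test the mean slope against zero; note that the effective
# sample size is now the number of animals, not the number of observations
t_stat, p_val = stats.ttest_1samp(per_animal_slopes, popmean=0.0)
print(t_stat, p_val)
</code></pre></div></div>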

<p>Allowing random slopes in the LMM changes the inferential question. The fixed effect now represents the population-average slope in the presence of genuine between-animal variability. Consequently, uncertainty increases compared to the random-intercept-only model, even though RMSE decreases slightly. This is not a weakness but a more honest representation of heterogeneity.</p>

<p>Finally, the full ANCOVA interaction model assigns a separate fixed slope to each animal. While its RMSE is comparable to the random-slope LMM, it lacks any pooling across animals. As a result, group-specific slopes can become unstable when data are sparse. The mixed model addresses this through shrinkage, yielding more conservative and robust group-level estimates.</p>

<p>Overall, the table shows that models can appear similar in terms of residuals or RMSE, yet differ fundamentally in how they represent population structure, uncertainty, and generalizability. This distinction is central when choosing between ANCOVA-style models and linear mixed models.</p>

<h3 id="a-simulator-designed-to-highlight-where-lmms-dominate-ancova">A simulator designed to highlight where LMMs dominate ANCOVA</h3>
<p>To make the difference between ANCOVA and LMM both visually and methodologically obvious, we now simulate data in a regime that is common in practice and challenging for fixed effects approaches. We use many animals, few observations per animal, an unbalanced design, correlated random intercepts and slopes, and group shifted $x$ distributions. This setting is problematic for ANCOVA interaction because it estimates a separate slope for each animal with little data support. The LMM, in contrast, pools information across animals through the random effect covariance structure and thereby yields stable group estimates.</p>

<p>Let’s write a new simulator for this scenario:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">simulate_animal_data_lmm_friendly</span><span class="p">(</span>
    <span class="n">seed</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
    <span class="n">n_animals</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span>
    <span class="n">n_obs_min</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
    <span class="n">n_obs_max</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span>
    <span class="n">beta0</span><span class="o">=</span><span class="mf">8.0</span><span class="p">,</span>
    <span class="n">beta1</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span>
    <span class="n">sigma_animal_intercept</span><span class="o">=</span><span class="mf">2.5</span><span class="p">,</span>
    <span class="n">sigma_animal_slope</span><span class="o">=</span><span class="mf">0.35</span><span class="p">,</span>
    <span class="n">rho_intercept_slope</span><span class="o">=-</span><span class="mf">0.6</span><span class="p">,</span>
    <span class="n">sigma_noise</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span>
    <span class="n">x_mode</span><span class="o">=</span><span class="sh">"</span><span class="s">group_shifted</span><span class="sh">"</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">
    LMM-friendly simulation.

    Key features:
    * many groups, few observations per group (and unbalanced)
    * random intercepts AND random slopes
    * correlated random effects (b0,b1)
    * optional group-shifted x distributions

    x_mode:
      </span><span class="sh">"</span><span class="s">shared</span><span class="sh">"</span><span class="s">        -&gt; all groups draw x from the same distribution
      </span><span class="sh">"</span><span class="s">group_shifted</span><span class="sh">"</span><span class="s"> -&gt; each group sees its own x-range, causing confounding risk
    </span><span class="sh">"""</span>
    <span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">default_rng</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>
    <span class="n">animals</span> <span class="o">=</span> <span class="p">[</span><span class="sa">f</span><span class="sh">"</span><span class="s">subj_</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="sh">"</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">n_animals</span><span class="p">)]</span>

    <span class="c1"># correlated random effects (b0,b1) per animal
</span>    <span class="n">cov</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span>
        <span class="p">[</span><span class="n">sigma_animal_intercept</span><span class="o">**</span><span class="mi">2</span><span class="p">,</span> <span class="n">rho_intercept_slope</span><span class="o">*</span><span class="n">sigma_animal_intercept</span><span class="o">*</span><span class="n">sigma_animal_slope</span><span class="p">],</span>
        <span class="p">[</span><span class="n">rho_intercept_slope</span><span class="o">*</span><span class="n">sigma_animal_intercept</span><span class="o">*</span><span class="n">sigma_animal_slope</span><span class="p">,</span> <span class="n">sigma_animal_slope</span><span class="o">**</span><span class="mi">2</span><span class="p">]</span>
    <span class="p">])</span>
    <span class="n">b</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="nf">multivariate_normal</span><span class="p">(</span><span class="n">mean</span><span class="o">=</span><span class="p">[</span><span class="mf">0.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">],</span> <span class="n">cov</span><span class="o">=</span><span class="n">cov</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_animals</span><span class="p">)</span>
    <span class="n">b0</span> <span class="o">=</span> <span class="n">b</span><span class="p">[:,</span> <span class="mi">0</span><span class="p">]</span>
    <span class="n">b1</span> <span class="o">=</span> <span class="n">b</span><span class="p">[:,</span> <span class="mi">1</span><span class="p">]</span>

    <span class="n">rows</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">a</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">animals</span><span class="p">):</span>
        <span class="n">n_obs</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">rng</span><span class="p">.</span><span class="nf">integers</span><span class="p">(</span><span class="n">n_obs_min</span><span class="p">,</span> <span class="n">n_obs_max</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span>

        <span class="k">if</span> <span class="n">x_mode</span> <span class="o">==</span> <span class="sh">"</span><span class="s">shared</span><span class="sh">"</span><span class="p">:</span>
            <span class="n">x</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="nf">integers</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_obs</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="nb">float</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="c1"># each animal sees a shifted x-window
</span>            <span class="n">center</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">rng</span><span class="p">.</span><span class="nf">integers</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">11</span><span class="p">))</span>
            <span class="n">x</span> <span class="o">=</span> <span class="n">center</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_obs</span><span class="p">)</span>
            <span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">clip</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">10.0</span><span class="p">)</span>

        <span class="n">mu</span> <span class="o">=</span> <span class="p">(</span><span class="n">beta0</span> <span class="o">+</span> <span class="n">b0</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">+</span> <span class="p">(</span><span class="n">beta1</span> <span class="o">+</span> <span class="n">b1</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">*</span> <span class="n">x</span>
        <span class="n">y</span> <span class="o">=</span> <span class="n">mu</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">sigma_noise</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_obs</span><span class="p">)</span>

        <span class="k">for</span> <span class="n">xx</span><span class="p">,</span> <span class="n">yy</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
            <span class="n">rows</span><span class="p">.</span><span class="nf">append</span><span class="p">({</span><span class="sh">"</span><span class="s">animal</span><span class="sh">"</span><span class="p">:</span> <span class="n">a</span><span class="p">,</span> <span class="sh">"</span><span class="s">x</span><span class="sh">"</span><span class="p">:</span> <span class="nf">float</span><span class="p">(</span><span class="n">xx</span><span class="p">),</span> <span class="sh">"</span><span class="s">y</span><span class="sh">"</span><span class="p">:</span> <span class="nf">float</span><span class="p">(</span><span class="n">yy</span><span class="p">)})</span>

    <span class="k">return</span> <span class="n">pd</span><span class="p">.</span><span class="nc">DataFrame</span><span class="p">(</span><span class="n">rows</span><span class="p">)</span>
</code></pre></div></div>

<p>and create a dataset with 40 animals, 3 to 8 observations per animal (i.e., unbalanced), correlated random effects, and group shifted $x$ distributions:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="nf">simulate_animal_data_lmm_friendly</span><span class="p">(</span>
        <span class="n">seed</span><span class="o">=</span><span class="mi">41</span><span class="p">,</span>
        <span class="n">n_animals</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span>
        <span class="n">n_obs_min</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span>
        <span class="n">n_obs_max</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span>
        <span class="n">beta0</span><span class="o">=</span><span class="mf">8.0</span><span class="p">,</span>
        <span class="n">beta1</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span>
        <span class="n">sigma_animal_intercept</span><span class="o">=</span><span class="mf">2.5</span><span class="p">,</span>
        <span class="n">sigma_animal_slope</span><span class="o">=</span><span class="mf">0.35</span><span class="p">,</span>
        <span class="n">rho_intercept_slope</span><span class="o">=-</span><span class="mf">0.6</span><span class="p">,</span>
        <span class="n">sigma_noise</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span>
        <span class="n">x_mode</span><span class="o">=</span><span class="sh">"</span><span class="s">group_shifted</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>

<p>Here is how the data looks (again color-coded by animal):</p>

<p class="align-caption"><a href="/assets/images/posts/lmm/raw_data_by_animal_dataset_2.png" title="Raw data by animal in an LMM friendly regime with many groups and few observations per group."><img src="/assets/images/posts/lmm/raw_data_by_animal_dataset_2.png" width="100%" alt="Raw data by animal in an LMM friendly regime with many groups and few observations per group." /></a>
Raw data by animal (color-coded) in an LMM friendly regime with many groups and few observations per group. Each animal has its own x-range, causing potential confounding between animal effects and stimulus strength.</p>

<p>For brevity, we do not repeat all plots for this run. Instead, we focus on the key differences that emerge in this more challenging regime. The key plot is again the slope comparison between the ANCOVA interaction model and the LMM with random slopes, introduced above:</p>
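<p>Refitting both models on the new dataset uses the same calls as before. A minimal sketch, assuming the helper functions defined above (the variable names here are ours):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># refit both models on the LMM-friendly dataset and redo the comparison
ancova_full_2 = smf.ols("y ~ x * C(animal)", data=df).fit()
lmm_slope_2 = smf.mixedlm("y ~ x", data=df, groups=df["animal"],
                          re_formula="1 + x").fit(reml=True)

slopes_ancova_2 = extract_group_slopes_from_ancova_full(ancova_full_2, df)
slopes_lmm_2 = extract_group_slopes_from_lmm(lmm_slope_2, df)
plot_slope_comparison(slopes_ancova_2, slopes_lmm_2, outpath=outpath,
                      fname="slope_comparison_dataset_2.png")
</code></pre></div></div>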

<p class="align-caption"><a href="/assets/images/posts/lmm/slope_comparison_dataset_2.png" title="Group specific slope comparison in an LMM friendly regime with many groups and few observations per group."><img src="/assets/images/posts/lmm/slope_comparison_dataset_2.png" width="100%" alt="Group specific slope comparison in an LMM friendly regime with many groups and few observations per group." /></a>
Group specific slope comparison in an LMM friendly regime with many groups and few observations per group. ANCOVA interaction slope estimates scatter widely, while LMM BLUP slopes are pulled toward the population mean.</p>

<p>Here we see that the ANCOVA interaction model exhibits high variance because it assigns one slope parameter per group without regularization. With few observations per group, these slope estimates fluctuate substantially due to noise. The LMM explicitly models that slopes are drawn from a population distribution and therefore shrinks noisy slopes. This stabilizes group specific estimates, improves out of sample behavior, and yields an interpretable variance component for slope heterogeneity. In unbalanced designs, this advantage grows because groups with fewer points are automatically pooled more strongly, which matches what one would do by hand if forced. Many groups with few observations per group is thus exactly the regime where mixed models are not merely aesthetically preferable but methodologically necessary.</p>

<p>This qualitative difference is also reflected quantitatively when we compare the model summaries obtained for this dataset. As before, we collected RMSE, the estimated population-level slope, and its associated test statistic for each model:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Model</th>
      <th style="text-align: right">RMSE</th>
      <th style="text-align: right">Coef_x</th>
      <th style="text-align: right">Stat_x</th>
      <th style="text-align: right">Pval_x</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left"><strong>OLS global</strong> (biased SE)</td>
      <td style="text-align: right">2.196989</td>
      <td style="text-align: right">0.109376</td>
      <td style="text-align: right">2.280728</td>
      <td style="text-align: right">0.023513</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>ANCOVA fixed intercepts</strong></td>
      <td style="text-align: right">1.087251</td>
      <td style="text-align: right">0.274975</td>
      <td style="text-align: right">3.200333</td>
      <td style="text-align: right">0.001618</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Two-stage hierarchical</strong> (mean slope)</td>
      <td style="text-align: right">0.993636</td>
      <td style="text-align: right">0.246683</td>
      <td style="text-align: right">1.570156</td>
      <td style="text-align: right">0.124458</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>LMM random intercept</strong></td>
      <td style="text-align: right">0.993749</td>
      <td style="text-align: right">0.228003</td>
      <td style="text-align: right">3.337680</td>
      <td style="text-align: right">0.000845</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>LMM random intercept + slope</strong></td>
      <td style="text-align: right">0.913017</td>
      <td style="text-align: right">0.233452</td>
      <td style="text-align: right">2.536810</td>
      <td style="text-align: right">0.011187</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>ANCOVA full (interaction)</strong></td>
      <td style="text-align: right">1.022376</td>
      <td style="text-align: right">1.412284</td>
      <td style="text-align: right">2.285398</td>
      <td style="text-align: right">0.023740</td>
    </tr>
  </tbody>
</table>

<p>Compared to the balanced toy dataset, the differences between model classes now become much more pronounced.</p>

<p>The global OLS model again performs poorly. Treating all observations as independent leads to an attenuated slope estimate and inflated residual variance. This behavior is consistent across both simulations and reflects a fundamental misspecification when repeated measures are present.</p>

<p>ANCOVA with fixed intercepts improves over global OLS, but its slope estimate shifts noticeably and its RMSE remains relatively high. In this regime, fixed intercept ANCOVA begins to absorb structure that actually belongs to slope heterogeneity, because differences between animals are no longer limited to baseline offsets.</p>

<p>The two-stage hierarchical approach deteriorates further. With many animals and only a handful of observations per animal, collapsing each animal to a single slope leaves too little information for reliable inference. The resulting estimate is noisy and statistically inconclusive, highlighting why this approach is generally discouraged outside of very simple settings.</p>

<p>The contrast between the ANCOVA interaction model and the LMM with random slopes is now stark. The ANCOVA interaction model produces a substantially higher slope estimate for the reference group. This is not a subtle effect: Assigning one fixed slope per animal without pooling leads to unstable estimates when data per group are sparse and $x$ ranges differ across animals. The inflated coefficient in the table is a direct numerical manifestation of this instability.</p>

<p>In contrast, the mixed model with random slopes remains well behaved. Its population-level slope stays close to the generative value (0.25), and its uncertainty reflects genuine between-animal variability rather than overfitted noise. It is important not to misinterpret the larger uncertainty of the LMM slope estimate as a weakness relative to, e.g., the two-stage approach or the OLS models. The higher uncertainty is a more honest representation of the inferential question being asked: What is the average slope in a population with heterogeneous slopes? The mixed model answers this question correctly, while the two-stage approach lacks power and OLS/ANCOVA misrepresent uncertainty due to ignored dependencies. The modest reduction in RMSE relative to the random-intercept model is secondary; the crucial point is that the meaning of the fixed effect remains stable across datasets with very different structures.</p>

<p>Taken together, the comparison of both result tables reinforces the central message of our little study. In balanced designs, ANCOVA and LMMs can yield superficially similar results. Once the data regime shifts to many groups, few observations per group, unbalanced sampling, and heterogeneous covariate ranges, fixed-effects approaches break down in characteristic ways. Linear mixed models do not merely fit better; they encode the correct statistical assumptions about how group-level effects arise and therefore remain interpretable and robust where ANCOVA does not.</p>

<h3 id="summary-incl-decision-table">Summary (incl. decision table)</h3>
<p>So, what do we gain from this exercise? When should we use ANCOVA, and when are linear mixed models the better choice?</p>

<p>The message is <strong>not</strong> that ANCOVA is fundamentally wrong or obsolete. In simple, balanced designs with a small number of groups and many observations per group, ANCOVA models can describe the data well and can be entirely appropriate, provided that inference is explicitly restricted to the observed groups.</p>

<p>The central distinction lies elsewhere. ANCOVA treats group effects as fixed and conditional on the specific groups in the dataset. Linear mixed models, in contrast, treat groups as a random sample from a population and explicitly model the resulting variance structure. This difference becomes crucial as soon as experiments move beyond small, balanced designs.</p>

<p>In practice, LMMs become the natural modeling choice when any of the following apply:</p>

<ul>
  <li>groups are sampled from a population,</li>
  <li>group sizes are unbalanced,</li>
  <li>the number of groups is large, or</li>
  <li>group-specific effects must be estimated robustly under limited data.</li>
</ul>

<p>In these settings, fixed-effects approaches either break down or answer a different inferential question than intended.</p>

<p>In the table below, I tried to summarize our findings and provide guidance on when to use which model type. It is not a formal proof, but it reflects practical experience and the simulations we have seen:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Data structure and goal</th>
      <th style="text-align: center">OLS or t test</th>
      <th style="text-align: center">ANOVA or ANCOVA fixed intercept</th>
      <th style="text-align: center">ANCOVA with interaction (fixed slopes per group)</th>
      <th style="text-align: center">LMM random intercept</th>
      <th style="text-align: center">LMM random intercept + slope</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left"><strong>Truly independent observations</strong></td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Repeated measures within subject or animal</strong></td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">sometimes</td>
      <td style="text-align: center">rarely</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Few groups, many observations per group, interest only in these groups</strong></td>
      <td style="text-align: center">sometimes</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Many groups, few observations per group</strong></td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Unbalanced group sizes</strong></td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">sometimes</td>
      <td style="text-align: center">unstable</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Group specific baselines only</strong></td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Group specific slopes, but groups are sampled from a population</strong></td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">conceptually awkward</td>
      <td style="text-align: center">sometimes</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Correlated random intercept and slope structure</strong></td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">partial</td>
      <td style="text-align: center">✅</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Need stable group estimates and generalization beyond observed groups</strong></td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">limited</td>
      <td style="text-align: center">❌</td>
      <td style="text-align: center">✅</td>
      <td style="text-align: center">✅</td>
    </tr>
  </tbody>
</table>

<p>The two core strengths of linear mixed models are <em>variance structure modeling</em> and <em>shrinkage</em>. Modeling the variance structure ensures that uncertainty in fixed effects correctly reflects dependence induced by grouping. Shrinkage stabilizes group-specific estimates by borrowing strength across groups, especially when per-group data are sparse.</p>

<p>These properties are not cosmetic. They directly address the realities of real world data, especially in neuroscience, where repeated measurements, nested designs, unequal sampling, and heterogeneous responses are the norm rather than the exception.</p>

<h2 id="outlook-generalized-linear-mixed-models-glmms">Outlook: Generalized linear mixed models (GLMMs)</h2>
<p>Linear mixed models assume Gaussian residuals and a linear mean structure. In many applications this is a reasonable approximation, but in others it is not. Binary outcomes, counts, proportions, and bounded response variables violate the Gaussian assumption and therefore require a different observation model. In such cases, the appropriate extension is a generalized linear mixed model (GLMM).</p>

<p>A GLMM combines the hierarchical structure of LMMs with the flexible likelihood framework of generalized linear models. The linear predictor is linked to the expected value of the response through a link function $g(\cdot) $, while the mixed effect structure is preserved. In this sense, GLMMs are not an alternative to LMMs, but their natural generalization to non Gaussian data.</p>

<p>A canonical formulation is</p>

\[\begin{align}
g(\mathbb{E}[y_{ij}]) &amp;= \mathbf{x}_{ij}^\top \boldsymbol{\beta} + \mathbf{z}_{ij}^\top \mathbf{b}_i,
\end{align}\]

<p>where $g$ is a link function such as the logit for binary responses or the log for count data. The fixed effects $\boldsymbol{\beta} $ and random effects $\mathbf{b}_i$ have the same interpretation as in LMMs. What changes is the observation model: The residuals are no longer Gaussian, and the likelihood is defined by the chosen distribution, for example Bernoulli or Poisson.</p>
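<p>For instance, for a binary outcome with a random intercept and a random slope per group, the logit link turns the general formulation above into a hierarchical logistic regression:</p>

\[\begin{align}
\operatorname{logit}\left(\Pr(y_{ij}=1)\right) &amp;= \log \frac{\Pr(y_{ij}=1)}{1-\Pr(y_{ij}=1)} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i}) \, x_{ij}.
\end{align}\]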

<p>Inference in GLMMs is more involved because closed form solutions are generally unavailable. Estimation therefore relies on approximations or sampling based methods. Nevertheless, the conceptual hierarchy remains identical to that of LMMs, and the same principles regarding grouping, variance components, and shrinkage apply.</p>

<p>In Python, support for GLMMs is available through several libraries. <code class="language-plaintext highlighter-rouge">statsmodels</code> provides limited frequentist functionality, while <code class="language-plaintext highlighter-rouge">PyMC</code> and <code class="language-plaintext highlighter-rouge">Bambi</code> offer Bayesian GLMMs with flexible model specification and formula based interfaces.</p>
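<p>As a pointer, a hierarchical logistic regression like the one above could be specified in <code class="language-plaintext highlighter-rouge">Bambi</code> roughly as follows. This is a hypothetical sketch that assumes a binary outcome column <code class="language-plaintext highlighter-rouge">success</code> in the data frame:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import bambi as bmb

# random intercept and slope per animal, Bernoulli likelihood, logit link
glmm = bmb.Model("success ~ x + (x | animal)", df, family="bernoulli")
idata = glmm.fit(draws=1000, chains=4)  # MCMC sampling via PyMC
</code></pre></div></div>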

<p>As with linear mixed models, careful model diagnostics are essential. Convergence must be checked, fitted values and residual like quantities should be inspected, and the choice of distribution and link function should be guided by the data generating process and the scientific question rather than by convenience.</p>

<h2 id="conclusion">Conclusion</h2>
<p>Linear mixed models are a powerful tool for analyzing hierarchical and grouped data, not only in neuroscience. They provide a principled way to account for dependencies, model variance structures, and stabilize group specific estimates through shrinkage. Compared to traditional ANCOVA approaches, LMMs offer greater flexibility and interpretability, especially in complex experimental designs with many groups and unbalanced data. By understanding when and how to apply LMMs, you  can indeed improve the validity and generalizability of your statistical inferences and better capture the underlying (biological) processes.</p>

<p>You can find the complete code for the simulations and analyses presented in this post in this <a href="https://github.com/FabrizioMusacchio/Python_Neuro_Practical/blob/master/additional_scripts/llm_neurons_example.py">GitHub repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>.</p>

<h2 id="references">References</h2>
<ul>
  <li><a href="https://duchesnay.github.io">Edouard Duchesnay</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>’s <a href="https://duchesnay.github.io/pystatsml/statistics/lmm/lmm.html">LMM tutorial</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> from his online course on <a href="https://duchesnay.github.io/pystatsml">Statistical Learning in Python</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Andrew Gelman and Jennifer Hill, <em>Data analysis using regression and multilevel hierarchical models</em>, 2006, Cambridge University Press, ISBN: 978-0521686891</li>
  <li>Brady T. West, Kathleen B. Welch, Andrzej T. Galecki, <em>Linear Mixed Models: A Practical Guide Using Statistical Software</em>, 2014, 2nd edition, ISBN: 978-1032019321, online available <a href="https://websites.umich.edu/~bwest/almmussp.html">here</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Andrzej Gałecki, Tomasz Burzykowski, <em>Linear Mixed-Effects Models Using R: A Step-by-Step Approach</em>, 2015, Springer, ISBN: 978-1489996671</li>
</ul>

]]></content><author><name> </name></author><category term="Python" /><category term="Data Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[Linear mixed models (LMMs) are a powerful statistical tool for analyzing hierarchical or grouped data, common in neuroscience experiments. This post provides a practical guide on when to use LMMs versus traditional ANCOVA approaches, highlighting the advantages of mixed models in handling dependencies, unbalanced designs, and stabilizing estimates through shrinkage. Through simulated examples, we illustrate the differences in model performance and interpretation, helping you to make informed decisions about your statistical analyses.]]></summary></entry><entry><title type="html">Distinguishing correlation from the coefficient of determination: Proper reporting of r and R²</title><link href="/blog/2025-09-13-r_vs_r_squared/" rel="alternate" type="text/html" title="Distinguishing correlation from the coefficient of determination: Proper reporting of r and R²" /><published>2025-09-13T10:47:45+02:00</published><updated>2025-09-13T10:47:45+02:00</updated><id>/blog/r_vs_r_squared</id><content type="html" xml:base="/blog/2025-09-13-r_vs_r_squared/"><![CDATA[<p>I noticed that people sometimes report $R^2$ – “R-squared” – instead of the Pearson correlation coefficient $r$ when discussing the correlation between two variables. In the special case of a simple linear relationship, where $R^2 = r^2$ holds numerically, this is not strictly wrong, yet presenting $R^2$ as if it were the correlation coefficient might wrongly give the impression they are the same thing. In this post, we will therefore unpack the difference between these two measures, explain their mathematical definitions and proper usage, and discuss best practices for when to use each in statistical reporting.</p>

<p class="align-caption"><img src="/assets/images/posts/correlation_thumb.jpg" width="50%" alt="When to report r vs R²?" /><br />
When to report $r$ vs $R^2$: $r$ quantifies the strength and direction of linear association between two variables, while $R^2$ measures the proportion of variance explained by a regression model. In this post, we clarify their definitions, relationship, and appropriate contexts for use.</p>

<h2 id="what-is-correlation">What is correlation?</h2>
<p>Correlation is a statistical measure used across many fields, from neuroscience to physics and economics, to quantify the linear association between two variables, or, put simply, how they change together. Imagine plotting paired measurements as points on a scatter plot, and suppose the points cluster along an imagined straight line. If this line points upward, the two measurements “correlate”, i.e., they vary together in a consistent way. If the line points downward, the measurements are anti‑correlated, i.e., they vary in opposite directions. If the points show no overall trend, whether they scatter around a roughly horizontal line or form a diffuse cloud, there is (almost) no correlation: the two measurements change (nearly) independently of each other. Mathematically, correlation is defined without requiring a fitted line; the line is just a visualization aid.</p>

<p class="align-caption"><a href="/assets/images/posts/correlation.jpg" title="Example scatterplots illustrating correlation strength."><img src="/assets/images/posts/correlation.jpg" width="100%" alt="Example scatterplots illustrating correlation strength." /></a>
Example scatterplots illustrating correlation strength: left, a strong positive linear association ($r \approx 1$); right, no linear association ($r \approx 0$).</p>

<p>To assess the degree of correlation, there are different metrics. The most common one is the Pearson correlation coefficient $r$, named after Karl Pearson, who introduced and rigorously formalized the concept of the correlation coefficient in the late 19th century. The value of $r$ is close to $+1$ when two measurements are strongly positively correlated. If $r$ is close to $−1$, the measurements are strongly negatively correlated (anti‑correlated). When $r$ is near $0$, the measurements show little or no linear association. These values capture both the strength (magnitude) and the direction (sign) of the relationship between two variables.</p>

<h2 id="mathematical-definition-of-the-pearson-correlation-coefficient">Mathematical definition of the Pearson correlation coefficient</h2>
<p>The Pearson correlation can be seen as a normalized measure of how two variables co-vary: the covariance $\operatorname{cov}(X,Y)$ describes their joint variability, and dividing by their standard deviations produces a unitless coefficient.</p>

<p>For random variables $X$ and $Y$ with means $\mu_X,\mu_Y$ and standard deviations $\sigma_X,\sigma_Y$, the <strong>population</strong> (Pearson) correlation is:</p>

\[\begin{align*}
\rho_{XY} =&amp; \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} \\
=&amp; \frac{\mathbb{E}\!\big[(X-\mu_X)(Y-\mu_Y)\big]}{\sigma_X \sigma_Y}
\end{align*}\]

<p>$\mathbb{E}[\cdot]$ denotes expectation.</p>

<p>For a sample $\{(x_i,y_i)\}_{i=1}^n$ with $x_i, y_i$ the paired measurements, sample means $\bar x,\bar y$ and sample standard deviations $s_X,s_Y$, the <strong>sample</strong> correlation is</p>

\[\begin{align*}
r =&amp; \frac{\sum_{i=1}^n (x_i-\bar x)(y_i-\bar y)}
{\sqrt{\sum_{i=1}^n (x_i-\bar x)^2}\,\sqrt{\sum_{i=1}^n (y_i-\bar y)^2}}\\
=&amp; \frac{\operatorname{cov}_{\text{sample}}(X,Y)}{s_X s_Y}
\end{align*}\]

<p>where</p>

\[\operatorname{cov}_{\text{sample}}(X,Y)=\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar x)(y_i-\bar y),\]

\[s_X^2=\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar x)^2,\]

<p>and</p>

\[s_Y^2=\frac{1}{n-1}\sum_{i=1}^n (y_i-\bar y)^2\]

<p>are the sample covariance and sample variances, respectively. The terms $(x_i-\bar x)$ and $(y_i-\bar y)$ are called <strong>mean-centered</strong> values because the sample means $\bar x$ and $\bar y$ have been subtracted from each data point. This centering ensures that the correlation measures how deviations from the mean in one variable relate to deviations from the mean in the other variable. The division by $s_X$ and $s_Y$ <strong>normalizes</strong> the covariance, making $r$ a dimensionless quantity that is independent of the units of measurement of $X$ and $Y$. This normalization allows for meaningful comparisons of correlation strength across different datasets.</p>

<p>Notice that the sample covariance and the sample variances each include a factor of $1/(n-1)$. When you substitute these definitions into the fraction for $r$, the $1/(n-1)$ in the numerator and the two $1/(n-1)$ factors under the square roots in the denominator cancel out. That is why the commonly used summation formula for $r$ does not explicitly contain $1/(n-1)$.</p>

<p>An equivalent expression in terms of standardized z-scores $z_{Xi}=(x_i-\bar x)/s_X$ and $z_{Yi}=(y_i-\bar y)/s_Y$ is</p>

\[r = \frac{1}{n-1}\sum_{i=1}^n z_{Xi}\,z_{Yi}\]

<p>This formulation makes clear that $r$ is essentially the average product of paired standardized values (with the same $1/(n-1)$ normalization as the sample covariance).</p>
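
<p>A quick numerical check of this z-score formulation (a sketch on arbitrary synthetic data, compared against <code class="language-plaintext highlighter-rouge">numpy.corrcoef</code>):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(scale=0.5, size=100)

# standardize with the sample standard deviation (ddof=1):
z_x = (x - x.mean()) / x.std(ddof=1)
z_y = (y - y.mean()) / y.std(ddof=1)

# average product of paired z-scores (1/(n-1) normalization):
r_zscore = np.sum(z_x * z_y) / (len(x) - 1)
print(np.isclose(r_zscore, np.corrcoef(x, y)[0, 1]))  # True
</code></pre></div></div>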

<p>From this definition, $r$ inherits several important properties: by the Cauchy–Schwarz inequality, $r$ always lies between −1 and +1, i.e., $r\in[-1,1]$. Furthermore, $r$ is dimensionless and remains unchanged if either variable is shifted by a constant or scaled by a positive factor (a negative scaling factor only flips its sign).</p>

<p>In many applied sciences, rough guidelines are sometimes used to describe correlation strength: values of about</p>

<ul>
  <li>$|r| \ge 0.7$ are often called <strong>strong</strong>,</li>
  <li>$|r| \approx 0.3$–$0.7$ <strong>moderate</strong>, and</li>
  <li>$|r| \le 0.3$ <strong>weak or negligible</strong>.</li>
</ul>

<p>The sign of $r$ indicates the <strong>direction</strong> of association (positive or negative). These thresholds are not universal: they vary by field, sample size, and context. Thus, they should be interpreted cautiously.</p>

<h3 id="alternatives-to-pearson-correlation">Alternatives to Pearson correlation</h3>
<p>When two variables change together in a consistent, i.e., monotonic way but their relationship is not linear, or when the data are ordinal or contain outliers, rank-based measures are often more appropriate:</p>

<p><strong>Spearman’s rank correlation ($\rho_s$)</strong><br />
Spearman replaces each raw value with its rank in the sorted data.</p>

<p>Let $R_i$ be the rank of the $i$-th value of the first variable and $S_i$ the rank of the corresponding value of the second variable (average ranks are used for ties), and let $\bar R$ and $\bar S$ be the average ranks (for data without ties, both equal $(n+1)/2$). Spearman’s $\rho_s$ is then computed just like a Pearson correlation, but on these ranks:</p>

\[\rho_s = \frac{\sum_{i=1}^n (R_i-\bar R)(S_i-\bar S)}{\sqrt{\sum_{i=1}^n (R_i-\bar R)^2}\,\sqrt{\sum_{i=1}^n (S_i-\bar S)^2}}.\]

<p>A perfectly monotonic relationship, whether always increasing or always decreasing, yields $\rho_s$ values of $+1$ or $−1$ even if the points do not line up linearly. Without ties, the common shortcut applies:</p>

\[\rho_s = 1 - \frac{6\sum d_i^2}{n(n^2-1)}, \quad d_i = R_i-S_i.\]

<p><strong>Kendall’s tau ($\tau$)</strong><br />
Kendall’s method works by examining all possible pairs of observations. For any two points $(x_i,y_i)$ and $(x_j,y_j)$, the pair is called <em>concordant</em> if the ordering of $x_i$ and $x_j$ agrees with the ordering of $y_i$ and $y_j$ (both increase or both decrease together). It is <em>discordant</em> if these orderings disagree (one increases while the other decreases).</p>

<p>Let $C$ be the total number of concordant pairs and $D$ the total number of discordant pairs among the $\binom{n}{2}$ possible pairs. Kendall’s tau is then defined as</p>

\[\tau = \frac{C-D}{\binom{n}{2}}\]

<p>This coefficient represents the difference between the proportions of concordant and discordant pairs, providing a robust measure of association for ordinal or non-normally distributed data and making it less sensitive to outliers than Pearson’s $r$.</p>

<p>These rank-based measures focus on the relative ordering of data rather than their actual values, making them robust to non-normal distributions and outliers. For relationships that are strong but non-monotonic (e.g., U-shaped), even these measures may be near zero; in such cases, distance correlation or mutual information can be considered.</p>
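
<p>A brief illustration of these alternatives with <code class="language-plaintext highlighter-rouge">SciPy</code>, using a synthetic relationship that is monotonic but clearly nonlinear (all values are arbitrary):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

rng = np.random.default_rng(1)
x = rng.uniform(0, 3, size=200)
y = np.exp(x) + rng.normal(scale=0.5, size=200)  # monotonic, but nonlinear

print(f"Pearson r:    {pearsonr(x, y)[0]:.3f}")   # reduced by the curvature
print(f"Spearman rho: {spearmanr(x, y)[0]:.3f}")  # close to 1 (monotonic)
print(f"Kendall tau:  {kendalltau(x, y)[0]:.3f}")  # close to 1 (monotonic)
</code></pre></div></div>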

<h3 id="significance-testing-and-confidence-intervals-for-r">Significance testing and confidence intervals for $r$</h3>
<p>To determine whether an observed Pearson correlation $r$ differs significantly from zero, one typically uses a t-test under the null hypothesis $H_0:\rho=0$ (no linear association in the population). Under the assumption of bivariate normality, the test statistic is</p>

\[t = r \sqrt{\frac{n-2}{1-r^2}}\]

<p>with <strong>degrees of freedom</strong> $\mathrm{df}=n-2$, where $n$ is the number of paired observations. This $t$ value is compared to a $t$-distribution with $n-2$ degrees of freedom to compute a two-sided or one-sided <strong>p-value</strong>:</p>

<ul>
  <li>A <strong>two-sided p-value</strong> answers: “Is the absolute value of $r$ unusually large if the true correlation is zero?”</li>
  <li>A <strong>one-sided p-value</strong> is used only if a specific direction of correlation (positive or negative) was pre-specified.</li>
</ul>

<p>As $|r|$ approaches 1, the term $1-r^2$ in the denominator shrinks and $t$ becomes large, indicating strong evidence against $H_0$. For small sample sizes, even a moderate $r$ may not reach significance because the factor $\sqrt{n-2}$ keeps $t$ small.</p>
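
<p>The following sketch on synthetic data verifies that this manual t-test matches the p-value reported by <code class="language-plaintext highlighter-rouge">scipy.stats.pearsonr</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 30
x = rng.normal(size=n)
y = 0.4 * x + rng.normal(size=n)

r, p_scipy = stats.pearsonr(x, y)

# manual t-test under H0: rho = 0, with df = n - 2:
t = r * np.sqrt((n - 2) / (1 - r**2))
p_manual = 2 * stats.t.sf(abs(t), df=n - 2)  # two-sided p-value

print(p_manual, p_scipy)  # the two values agree
</code></pre></div></div>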

<h3 id="confidence-intervals-for-r-fisher-z-transformation">Confidence intervals for $r$: Fisher z-transformation</h3>
<p>The sampling distribution of $r$ is skewed, especially for values near $\pm 1$, which makes direct confidence intervals for $r$ inaccurate. To address this, Fisher’s z-transformation is applied:</p>

\[z = \frac{1}{2}\ln\!\left(\frac{1+r}{1-r}\right) = \operatorname{arctanh}(r)\]

<p>which approximately normalizes the distribution of $r$ for moderate $n$. The transformed variable $z$ has an approximate standard error</p>

\[\mathrm{SE}_z = \frac{1}{\sqrt{n-3}}\]

<p>A $(1-\alpha)$ confidence interval for $z$ is</p>

\[z \pm z_{\alpha/2} \,\mathrm{SE}_z\]

<p>where $z_{\alpha/2}$ is the critical value from the standard normal distribution (e.g., 1.96 for a 95% CI). Transform the endpoints back to the $r$ scale using the inverse transformation</p>

\[r = \tanh(z) = \frac{e^{2z}-1}{e^{2z}+1}\]

<p>This yields a confidence interval for the population correlation $\rho$ that is nearly symmetric on the $z$-scale but properly accounts for the skewness on the $r$-scale.</p>
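
<p>For example, a 95% confidence interval for an observed correlation can be computed in a few lines (the values for $r$ and $n$ below are made up):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from scipy import stats

r, n = 0.62, 50  # example values

z = np.arctanh(r)               # Fisher z-transformation
se = 1 / np.sqrt(n - 3)         # approximate standard error of z
z_crit = stats.norm.ppf(0.975)  # ~1.96 for a 95% CI

lo, hi = np.tanh([z - z_crit * se, z + z_crit * se])  # back-transform
print(f"95% CI for rho: [{lo:.3f}, {hi:.3f}]")
</code></pre></div></div>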

<h2 id="introducing-r2-coefficient-of-determination">Introducing $R^2$ (coefficient of determination)</h2>
<p>In linear regression, we are often interested in how well a straight line fits a cloud of data points. Suppose we model a response variable $y$ as a linear function of a single predictor $x$ plus random noise:</p>

\[y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,\quad i=1,\dots,n\]

<p>Here, $\beta_0$ is the <strong>intercept</strong>, $\beta_1$ the <strong>slope</strong>, and $\varepsilon_i$ are residual errors assumed to have mean zero. The fitted values from least squares regression are</p>

\[\hat y_i = \hat \beta_0 + \hat \beta_1 x_i\]

<p>To evaluate the quality of this fit, we compare the variation in the data explained by the model to the total variation present in $y$. We decompose the total sum of squares (SST),</p>

\[\mathrm{SST} = \sum_{i=1}^n (y_i - \bar y)^2,\]

<p>into the <strong>regression sum of squares</strong> (SSR), representing the variability explained by the model,</p>

\[\mathrm{SSR} = \sum_{i=1}^n (\hat y_i - \bar y)^2\]

<p>and the <strong>residual sum of squares</strong> (SSE; note that some texts instead use the abbreviation SSR for this residual term),</p>

\[\mathrm{SSE} = \sum_{i=1}^n (y_i - \hat y_i)^2\]

<p>The <strong>coefficient of determination</strong>, $R^2$, is then defined as the proportion of the variance in $y$ explained by the regression:</p>

\[R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}} = \frac{\mathrm{SSR}}{\mathrm{SST}}.\]

<p>An $R^2$ of 1 indicates that all data points lie perfectly on the fitted line (the model explains all variation in $y$). An $R^2$ of 0 means the model explains none of the variability — using the mean $\bar y$ as a “model” is no worse than using $x$. Negative values of $R^2$ can occur if the chosen model fits worse than simply predicting $\bar y$ for all observations; this cannot happen for a least squares fit with an intercept evaluated on its own training data, but it can occur for regressions without an intercept, misspecified or externally fixed models, and out-of-sample evaluation.</p>

<p>Conceptually, $R^2$ quantifies <strong>explained variance</strong>: it measures how much of the observed variation in the response variable can be attributed to variation in the predictor variable, under a linear least squares model.</p>

<h2 id="how-r-and-r2-are-related">How $r$ and $R^2$ are related</h2>
<p>Consider simple linear regression with an intercept:</p>

\[y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\]

<p>and let $\bar x,\bar y$ be sample means. Define the centered sums:</p>

\[\begin{align*}
S_{xx}=&amp;\sum_{i=1}^n (x_i-\bar x)^2,\\
 S_{yy}=&amp;\sum_{i=1}^n (y_i-\bar y)^2,\\
 S_{xy}=&amp;\sum_{i=1}^n (x_i-\bar x)(y_i-\bar y)
\end{align*}.\]

<p>Using ordinary least squares (OLS), the standard method that minimizes the sum of squared residuals, the slope is</p>

\[\hat\beta_1=\frac{S_{xy}}{S_{xx}}\]

<p>and the Pearson correlation is</p>

\[r=\frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}.\]

<p>Hence</p>

\[\hat\beta_1 = r\,\frac{\sqrt{S_{yy}}}{\sqrt{S_{xx}}}.\]

<p>Fitted values satisfy $\hat y_i-\bar y=\hat\beta_1(x_i-\bar x)$, so the regression (explained) sum of squares is</p>

\[\mathrm{SSR}=\sum_{i=1}^n (\hat y_i-\bar y)^2=\hat\beta_1^{\,2}\,S_{xx} = r^2\,S_{yy}\]

<p>The total sum of squares is $\mathrm{SST}=S_{yy}$. Therefore</p>

\[R^2=\frac{\mathrm{SSR}}{\mathrm{SST}}=\frac{r^2\,S_{yy}}{S_{yy}}=r^2\]
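
<p>A quick numerical sanity check of this identity on synthetic data (a sketch; any simple linear dataset fit with an intercept will do):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = 1.5 * x + rng.normal(size=100)

# simple linear regression with intercept (np.polyfit returns slope first):
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

sse = np.sum((y - y_hat)**2)     # residual sum of squares
sst = np.sum((y - y.mean())**2)  # total sum of squares
R2 = 1 - sse / sst

r = np.corrcoef(x, y)[0, 1]
print(np.isclose(R2, r**2))  # True
</code></pre></div></div>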

<p><strong>Key conditions and caveats:</strong></p>
<ul>
  <li>The identity $R^2=r^2$ holds <strong>only</strong> for <strong>simple</strong> linear regression <strong>with an intercept</strong> (one predictor, least squares, standard definitions).</li>
  <li>In models with <strong>multiple predictors</strong> or <strong>nonlinear</strong> terms, $R^2$ equals the squared correlation between the observed response and its fitted values, \(R^2=\operatorname{corr}(Y,\hat Y)^2,\) but it is <strong>not</strong> equal to the square of any single pairwise correlation (e.g., $\operatorname{corr}(X_j,Y)^2$).</li>
  <li>If the intercept is <strong>omitted</strong> (regression through the origin), the decomposition $\mathrm{SST}=\mathrm{SSR}+\mathrm{SSE}$ changes and the result $R^2=r^2$ <strong>need not hold</strong>.</li>
  <li>$R^2$ discards the <strong>sign</strong> of association: $r$ and $-r$ yield the same $R^2$, so the direction of the relationship is lost. When reporting only $R^2$, one should therefore also report the slope sign or $r$ itself to retain directionality.</li>
</ul>

<h2 id="when-to-report-r-vs-r2">When to report $r$ vs $R^2$</h2>
<p>The Pearson correlation coefficient <strong>$r$</strong> and the coefficient of determination <strong>$R^2$</strong> serve different purposes and should be reported in contexts that match their meaning:</p>

<ul>
  <li><strong>For pure correlation questions, report $r$</strong><br />
When the goal is to describe how two variables co-vary without fitting a predictive model, $r$ is the appropriate measure. It conveys both the <strong>strength</strong> (magnitude) <strong>and direction</strong> (sign) of the linear relationship. Reporting only $R^2$ would discard the sign and obscure whether the association is positive or negative.</li>
  <li><strong>For model goodness-of-fit, report $R^2$</strong><br />
In linear regression or more complex predictive models, $R^2$ quantifies <strong>explained variance</strong>, i.e., the proportion of variability in the response variable accounted for by the model. It is a natural summary of model performance and is meaningful even with multiple predictors or non-linear terms (as the squared correlation between observed and fitted values).</li>
  <li><strong>If $R^2$ is used as a proxy for correlation, include directional information</strong><br />
In some neuroscience and biology papers, $R^2$ is shown on scatterplots as a stand-in for correlation strength. This is not strictly wrong in simple linear relationships but can be misleading. To avoid ambiguity, <strong>also report the slope sign or $r$ itself</strong> so that the direction of association is not lost.</li>
</ul>

<p>Using the right statistic in the right context prevents confusion: $r$ answers <em>“how strongly and in which direction do these two variables co-vary?”</em> whereas $R^2$ answers <em>“how much of the variance in the response is explained by this model?”</em></p>

<h2 id="why-the-misconception-persists">Why the misconception persists</h2>
<p>The confusion between <strong>$r$</strong> and <strong>$R^2$</strong> has multiple roots in practice and pedagogy:</p>

<ul>
  <li><strong>Teaching shortcuts</strong><br />
In teaching, $R^2$ and $r^2$ are sometimes presented side by side without making it clear enough that the equality $R^2 = r^2$ holds only under specific conditions (single predictor, intercept included). Over time, this can lead students to internalize the idea that “$R^2$ <em>is</em> the correlation”, overlooking the assumptions behind the equivalence.</li>
  <li><strong>Software defaults</strong><br />
Analysis tools such as GraphPad Prism, Excel, and some statistical packages automatically report $R^2$ in correlation analyses. This can prompt users to treat $R^2$ as <em>the</em> default measure of linear association, especially if they are not fully aware of the distinction.</li>
  <li><strong>Plotting conventions in applied sciences</strong><br />
It is common to “fit a line to guide the eye” and annotate the plot with $R^2$. Although this is acceptable for assessing fit quality, it reinforces the misconception that $R^2$ alone suffices to describe correlation strength and direction.</li>
  <li><strong>Simplicity and sign removal</strong><br />
Some practitioners prefer $R^2$ because it is always non-negative and superficially “cleaner” to report. However, this very feature hides the direction of the relationship and can obscure scientific interpretation.</li>
</ul>

<p>These teaching and software habits may explain why the misconception persists. Don’t get me wrong: this is not meant as a blanket criticism of anyone’s work, only a reminder that common shortcuts can sometimes obscure important distinctions.</p>

<h2 id="simple-python-examples">Simple Python examples</h2>
<p>To get a better impression of the difference between $r$ and $R^2$, let’s look at some synthetic datasets with varying degrees of correlation. In the following, we will create six different toy datasets, which contain</p>

<ul>
  <li>a strong positive correlation,</li>
  <li>a strong negative correlation,</li>
  <li>a moderate positive correlation,</li>
  <li>a moderate negative correlation,</li>
  <li>no correlation (data points cluster along a virtual horizontal line), and, again,</li>
  <li>no correlation (a random cloud of points).</li>
</ul>

<p>Each dataset is superimposed with some Gaussian noise to simulate real-world variability. We will compute both $r$ and $R^2$ for each dataset and visualize the results:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># imports:
</span><span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">from</span> <span class="n">scipy.stats</span> <span class="kn">import</span> <span class="n">pearsonr</span>
<span class="kn">from</span> <span class="n">sklearn.metrics</span> <span class="kn">import</span> <span class="n">r2_score</span>

<span class="c1"># set global properties for all plots:
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">.</span><span class="nf">update</span><span class="p">({</span><span class="sh">'</span><span class="s">font.size</span><span class="sh">'</span><span class="p">:</span> <span class="mi">14</span><span class="p">})</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.top</span><span class="sh">"</span><span class="p">]</span>    <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.bottom</span><span class="sh">"</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.left</span><span class="sh">"</span><span class="p">]</span>   <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.right</span><span class="sh">"</span><span class="p">]</span>  <span class="o">=</span> <span class="bp">False</span>

<span class="c1"># set seed for reproducibility:
</span><span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">default_rng</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>

<span class="c1"># generate synthetic datasets:
</span><span class="n">n</span> <span class="o">=</span> <span class="mi">50</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>

<span class="c1"># strong correlations
</span><span class="n">y_pos_strong</span> <span class="o">=</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="n">scale</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>
<span class="n">y_neg_strong</span> <span class="o">=</span> <span class="o">-</span><span class="mi">2</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="n">scale</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>

<span class="c1"># moderate correlations (more scatter)
</span><span class="n">y_pos_moderate</span> <span class="o">=</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="n">scale</span><span class="o">=</span><span class="mf">4.0</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>
<span class="n">y_neg_moderate</span> <span class="o">=</span> <span class="o">-</span><span class="mi">2</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="n">scale</span><span class="o">=</span><span class="mf">4.0</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>

<span class="c1"># no correlation cases
</span><span class="n">y_horiz</span> <span class="o">=</span> <span class="mi">0</span> <span class="o">+</span> <span class="n">rng</span><span class="p">.</span><span class="nf">normal</span><span class="p">(</span><span class="n">scale</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>
<span class="n">x_cloud</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="nf">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>
<span class="n">y_cloud</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="nf">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n</span><span class="p">)</span>  <span class="c1"># oder rng.normal(scale=3, size=n)
</span>
<span class="n">datasets</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">(</span><span class="sh">"</span><span class="s">Strong pos. correlation</span><span class="sh">"</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y_pos_strong</span><span class="p">),</span>
    <span class="p">(</span><span class="sh">"</span><span class="s">Strong neg. correlation</span><span class="sh">"</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y_neg_strong</span><span class="p">),</span>
    <span class="p">(</span><span class="sh">"</span><span class="s">Moderate pos. correlation</span><span class="sh">"</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y_pos_moderate</span><span class="p">),</span>
    <span class="p">(</span><span class="sh">"</span><span class="s">Moderate neg. correlation</span><span class="sh">"</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y_neg_moderate</span><span class="p">),</span>
    <span class="p">(</span><span class="sh">"</span><span class="s">No correlation (horizontal)</span><span class="sh">"</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y_horiz</span><span class="p">),</span>
    <span class="p">(</span><span class="sh">"</span><span class="s">No correlation (cloud)</span><span class="sh">"</span><span class="p">,</span> <span class="n">x_cloud</span><span class="p">,</span> <span class="n">y_cloud</span><span class="p">)</span>
<span class="p">]</span>

<span class="c1"># plots:
</span><span class="n">fig</span><span class="p">,</span> <span class="n">axes</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="nf">subplots</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">16</span><span class="p">))</span>
<span class="k">for</span> <span class="n">ax</span><span class="p">,</span> <span class="p">(</span><span class="n">title</span><span class="p">,</span> <span class="n">xdata</span><span class="p">,</span> <span class="n">ydata</span><span class="p">)</span> <span class="ow">in</span> <span class="nf">zip</span><span class="p">(</span><span class="n">axes</span><span class="p">.</span><span class="n">flat</span><span class="p">,</span> <span class="n">datasets</span><span class="p">):</span>
    <span class="n">r</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="nf">pearsonr</span><span class="p">(</span><span class="n">xdata</span><span class="p">,</span> <span class="n">ydata</span><span class="p">)</span>
    <span class="n">coeffs</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">polyfit</span><span class="p">(</span><span class="n">xdata</span><span class="p">,</span> <span class="n">ydata</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
    <span class="n">y_fit</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">polyval</span><span class="p">(</span><span class="n">coeffs</span><span class="p">,</span> <span class="n">xdata</span><span class="p">)</span>
    <span class="n">r2</span> <span class="o">=</span> <span class="nf">r2_score</span><span class="p">(</span><span class="n">ydata</span><span class="p">,</span> <span class="n">y_fit</span><span class="p">)</span>

    <span class="n">ax</span><span class="p">.</span><span class="nf">scatter</span><span class="p">(</span><span class="n">xdata</span><span class="p">,</span> <span class="n">ydata</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">sort</span><span class="p">(</span><span class="n">xdata</span><span class="p">),</span> <span class="n">np</span><span class="p">.</span><span class="nf">polyval</span><span class="p">(</span><span class="n">coeffs</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="nf">sort</span><span class="p">(</span><span class="n">xdata</span><span class="p">)),</span>
            <span class="n">color</span><span class="o">=</span><span class="sh">'</span><span class="s">red</span><span class="sh">'</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">'</span><span class="s">Linear fit</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_title</span><span class="p">(</span><span class="n">title</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_xlabel</span><span class="p">(</span><span class="sh">'</span><span class="s">x</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_ylabel</span><span class="p">(</span><span class="sh">'</span><span class="s">y</span><span class="sh">'</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">legend</span><span class="p">()</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">text</span><span class="p">(</span><span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.95</span><span class="p">,</span> <span class="sa">f</span><span class="sh">"</span><span class="s">r = </span><span class="si">{</span><span class="n">r</span><span class="si">:</span><span class="p">.</span><span class="mi">2</span><span class="n">f</span><span class="si">}</span><span class="se">\n</span><span class="s">R² = </span><span class="si">{</span><span class="n">r2</span><span class="si">:</span><span class="p">.</span><span class="mi">2</span><span class="n">f</span><span class="si">}</span><span class="sh">"</span><span class="p">,</span>
            <span class="n">transform</span><span class="o">=</span><span class="n">ax</span><span class="p">.</span><span class="n">transAxes</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">'</span><span class="s">top</span><span class="sh">'</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="sh">'</span><span class="s">left</span><span class="sh">'</span><span class="p">,</span>
            <span class="n">bbox</span><span class="o">=</span><span class="nf">dict</span><span class="p">(</span><span class="n">boxstyle</span><span class="o">=</span><span class="sh">'</span><span class="s">round</span><span class="sh">'</span><span class="p">,</span> <span class="n">facecolor</span><span class="o">=</span><span class="sh">'</span><span class="s">white</span><span class="sh">'</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">lw</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_xlim</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span><span class="mi">3</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="nf">set_ylim</span><span class="p">(</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span><span class="mi">10</span><span class="p">)</span>

<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="sh">'</span><span class="s">correlation_vs_rsquare.png</span><span class="sh">'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">300</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre></div></div>

<p>In the strong correlation plots, both $r$ and $R^2$ are close to 1. The left panel shows a positive correlation, whereas the right panel shows a negative one. While $r$ retains the sign, indicating whether the relationship is positive or negative, this information is lost in $R^2$, which is always non-negative.</p>

<p class="align-caption"><a href="/assets/images/posts/correlation_vs_rsquare_1.png" title="Correlation vs R²."><img src="/assets/images/posts/correlation_vs_rsquare_1.png" width="100%" alt="Correlation vs R²." /></a>
Comparison of correlation coefficient ($r$) and coefficient of determination ($R^2$) across different datasets. Here, strong positive (left) and strong negative (right) correlations are shown.</p>

<p>With the next set of toy data, we repeat the same experiment but add more noise. As a result, both $r$ and $R^2$ decrease, reflecting the weaker linear association. Again, $r$ indicates the direction of the relationship, while $R^2$ does not.</p>

<p class="align-caption"><a href="/assets/images/posts/correlation_vs_rsquare_2.png" title="Correlation vs R²."><img src="/assets/images/posts/correlation_vs_rsquare_2.png" width="100%" alt="Correlation vs R²." /></a>
Same plots as before, but now with moderate correlations due to increased noise. Left: moderate positive correlation; right: moderate negative correlation.</p>

<p>The last toy datasets illustrate cases with no correlation. In the first case (left panel), the data points cluster around a horizontal line. Here, $r$ is close to 0, indicating no linear association, and $R^2$ is also near 0, showing that the linear model explains almost none of the variance in $y$.</p>

<p class="align-caption"><a href="/assets/images/posts/correlation_vs_rsquare_3.png" title="Correlation vs R²."><img src="/assets/images/posts/correlation_vs_rsquare_3.png" width="100%" alt="Correlation vs R²." /></a>
Same plots as before, but now with no correlation. Left: no correlation (horizontal line); right: no correlation (random cloud of points).</p>

<p>In the second panel (right), we see a random cloud of points with no discernible trend. Again, both $r$ and $R^2$ are close to 0, indicating no linear relationship. However, note that while the absolute variability in $y$ is much larger here than in the horizontal case, $R^2$ does not reflect this difference — it only indicates that there is no linear relationship between $x$ and $y$. Even if the data points show very different amounts of scatter, $R^2$ quantifies only the <em>proportion</em> of variance in $y$ explained by a linear relationship with $x$. A value of $R^2 = 0$ does <strong>not</strong> distinguish between a case where the data are nearly constant (small absolute variance) and a case where they are widely dispersed (large absolute variance); it merely indicates that none of this variance is linearly associated with $x$. This underscores another potential misconception: <strong>$R^2$ is not a direct measure of absolute variability in the data.</strong></p>

<h2 id="conclusion">Conclusion</h2>
<p>Correlation coefficients and coefficients of determination answer <strong>different questions</strong>. The Pearson correlation $r$ measures the <strong>strength and direction</strong> of a linear association between two variables, while $R^2$ quantifies the <strong>proportion of variance in the response</strong> that a linear regression model explains. Their numerical equality in simple linear regression ($R^2 = r^2$) is a special case, not a universal identity.</p>

<p>In practice, it is easy to slip into treating $R^2$ as <em>the</em> correlation. To avoid this trap, I’d suggest: for pure correlation analysis, i.e., when you are simply checking whether and how two variables co-vary, report $r$, because it preserves both the strength and the sign of the relationship. For model evaluation and goodness-of-fit, report $R^2$, which summarizes how much of the variance is accounted for by your regression model. And if you annotate scatterplots with $R^2$ as a visual guide, add the slope sign or $r$ as well, so that the direction of the relationship is not lost.</p>

<p>Finally, keep in mind that $R^2$ is not a measure of absolute variability in the data. A value of $R^2=0$ does not distinguish between tightly clustered points and widely dispersed clouds. It only says that no linear relationship exists.</p>

<p>Being clear about what $r$ and $R^2$ each convey helps avoid confusion and keeps your reporting transparent.</p>

<h2 id="references">References</h2>
<ul>
  <li>David Freedman, <em>Statistics</em>, 2010, Viva Books, 4th edition, ISBN: 9788130915876</li>
  <li>Andrew Gelman, Jennifer Hill, Aki Vehtari, <em>Regression and Other Stories</em>, 2021, Cambridge University Press, ISBN: 9781107023987</li>
  <li>Philip R. Bevington, D. Keith Robinson, <em>Data Reduction and Error Analysis for the Physical Sciences</em>, 2003, 3rd edition, McGraw-Hill, ISBN: 0-07-247227-8</li>
</ul>

]]></content><author><name> </name></author><category term="Python" /><category term="Data Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[I noticed that people sometimes report R² ('R-squared') instead of the Pearson correlation coefficient r when discussing the correlation between two variables. In the special case of a simple linear relationship, where R² = r² holds numerically, this is not strictly wrong, yet presenting R² as if it were the correlation coefficient might wrongly give the impression they are the same thing. In this post, we will therefore unpack the difference between these two measures, explain their mathematical definitions and proper usage, and discuss best practices for when to use each in statistical reporting.]]></summary></entry><entry><title type="html">Rate models as a tool for studying collective neural activity</title><link href="/blog/2025-08-28-rate_models/" rel="alternate" type="text/html" title="Rate models as a tool for studying collective neural activity" /><published>2025-08-28T21:00:10+02:00</published><updated>2025-08-28T21:00:10+02:00</updated><id>/blog/rate_models</id><content type="html" xml:base="/blog/2025-08-28-rate_models/"><![CDATA[<p>Rate models provide simplified representations of neural activity in which the precise spike timing of individual neurons is replaced by their average firing rate. This abstraction makes it possible to study the collective behavior of large neuronal populations and to analyze network dynamics in a tractable way. I recently played around with some rate models in Python, and in this post I’d like to share what I have learned so far about their implementation and utility.</p>

<p class="align-caption"><img src="/assets/images/posts/nest/population_rate_model_thumb.jpg" width="80%" alt="Rate models, simulated with NEST simulator." /><br />
Results of a rate-model simulation (top) compared with a detailed spiking simulation (bottom). The rate model, which represents the average firing rate of a neuronal population rather than individual spikes, highlights overall dynamics and oscillatory behavior, while the spiking simulation reveals variability and single-neuron activity. In this post, we explore the implications of these differences in detail.</p>

<h2 id="introduction">Introduction</h2>
<p>Rate models emerged as a way to capture collective <a href="/blog/2026-02-04-neural_dynamics/">neural dynamics</a> without tracking the precise timing of each spike. From a biological perspective, they are grounded in the observation that many experimental readouts (such as <a href="/teaching/teaching_functional_image_data_analysis/">calcium imaging</a> or EEG) reflect average activity of populations rather than single spike events (Dayan &amp; Abbott, 2001). This makes rate-based descriptions particularly relevant when studying large neural systems where synchrony and fluctuations average out.</p>

<p>A central debate, however, is the role of <strong>rate vs. spike timing</strong>. In many brain areas, mean firing rates over tens of milliseconds correlate well with perceptual or motor variables, supporting rate coding as a useful abstraction (Gerstner et al., 2014). However, in cases where millisecond precision matters, such as auditory localization, rapid sensory transients, or precise sequence learning, spike timing conveys additional information that pure rate models cannot capture (Rieke et al., 1999). Thus, rate models are best seen as <strong>approximations</strong> valid when network activity is asynchronous and irregular, but they may fail in strongly synchronized or temporally precise regimes.</p>

<p>Historically, the roots of rate modeling trace back to <a href="/blog/2024-03-03-hebbian_learning_and_hopfield_networks/"><strong>Hebb’s postulate</strong></a> of cell assemblies, which emphasized average co-activity rather than individual spikes (Hebb, 1949). This inspired early mathematical formulations by <strong>Wilson and Cowan (1972)</strong>, who derived coupled differential equations for interacting excitatory and inhibitory populations. These so-called cortical field models demonstrated how simple rate dynamics could produce oscillations, bistability, and other network phenomena. Since then, rate models have become a central tool in theoretical and <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a>, bridging abstract neural field theories and large-scale spiking simulations.</p>

<h2 id="mathematical-foundations">Mathematical foundations</h2>
<p>Spiking neuron models, such as the <a href="/blog/2024-04-21-hodgkin_huxley_model/">Hodgkin-Huxley model</a> or the <a href="/blog/2023-07-03-integrate_and_fire_model/">Integrate-and-Fire model</a>, explicitly simulate the precise timing of individual spikes generated by neurons. These models capture the detailed biophysical properties of neurons and are useful for studying the dynamics of individual neurons and small networks. In contrast, rate models do not explicitly model the timing of individual spikes but instead focus on the average firing rate of the neurons. This simplification allows for more efficient simulations of large networks and provides insights into the collective behavior of neural populations, at the cost of biological detail.</p>

<p>The primary variable in rate models is the <strong>firing rate $r(t)$</strong>, which represents the average number of spikes per unit time for a neuron or a population of neurons. The firing rate of a neuron or a neural population is typically a function of its input. This relationship can be expressed as:</p>

\[\begin{equation}
r(t) = f(I(t))
\end{equation}\]

<p>where $f$ is a non-linear function representing the neuron’s response properties, and $I(t)$ is the input current or <a href="/blog/2026-02-12-stdp/#synapse">synaptic</a> input.</p>

<p>Rate models can be extended to networks where the firing rate of each neuron depends on the input from other neurons in the network. This can be described by a set of coupled differential equations:</p>

\[\begin{equation}
\tau \frac{dr_i(t)}{dt} = -r_i(t) + f\left( \sum_j w_{ij} r_j(t) + I_i(t) \right)
\end{equation}\]

<p>where $\tau$ is the time constant, $w_{ij}$ is the <a href="/blog/2026-02-12-stdp/#synapse">synaptic weight</a> from neuron $j$ to neuron $i$, and $I_i(t)$ is the external input to neuron $i$.</p>

<h3 id="population-activity">Population activity</h3>
<p><strong>Population activity $A_N$</strong> represents the average firing rate of a group of neurons over a given time period. It provides an aggregated measure of the neuronal output by counting spikes in discrete bins and normalizing the counts:</p>

\[\begin{equation}
A_N(t) = \frac{1}{N \cdot \Delta t} \sum_{i=1}^{N} \sum_{k} \delta(t - t_i^k)
\end{equation}\]

<p>where $N$ is the number of neurons, $\Delta t$ is the bin width, $\delta$ is the Dirac delta function, and $t_i^k$ are the spike times of neuron $i$. The fraction $1/(N \cdot \Delta t)$ normalizes the spike count by the number of neurons and the bin width to obtain the firing rate in Hz.</p>
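
<p>As a small sketch, $A_N(t)$ can be estimated from recorded spike times by simple binning; the pooled spike times below are synthetic stand-ins for recorded data:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

rng = np.random.default_rng(4)
N = 100          # number of neurons
T = 1000.0       # recording duration [ms]
bin_width = 5.0  # bin width Delta t [ms]

# synthetic spike times, pooled over all N neurons:
spike_times = rng.uniform(0, T, size=2000)

bins = np.arange(0, T + bin_width, bin_width)
counts, _ = np.histogram(spike_times, bins=bins)

# population activity in Hz (factor 1000 converts 1/ms to 1/s):
A_N = counts / (N * bin_width) * 1000.0
print(A_N.mean())  # ~20 Hz: 2000 spikes / (100 neurons * 1 s)
</code></pre></div></div>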

<h3 id="instantaneous-population-rates">Instantaneous population rates</h3>
<p><strong>Instantaneous population rates $\bar{A}$</strong> represent the immediate (real-time) firing rate of a population, typically smoothed over a short time window to capture rapid changes in activity:</p>

\[\begin{equation}
\bar{A}(t) = \frac{\text{mean activity}}{N \cdot \Delta t}
\end{equation}\]

<p>where $\text{mean activity}$ is the average firing rate recorded, e.g., by the multimeter or similar device that provides the average activity in a given time interval, $N$ is the number of neurons, and $\Delta t$ is the recording interval. The recorded mean activity is again normalized by the number of neurons and the recording interval to obtain the instantaneous firing rate in Hz.</p>

<p>Both population activity $A_N$ and instantaneous population rates $\bar{A}$ are essential for characterizing the dynamics of neural populations, especially in large-scale simulations where individual neuron activities need to be aggregated and analyzed collectively. They help in identifying patterns, synchronization, oscillations, and other dynamical states in the neural network.</p>

<h2 id="types-of-rate-models">Types of rate models</h2>

<h3 id="linear-rate-models">Linear rate models</h3>
<p>In the simplest case, the output firing rate is a linear function of the input:</p>

\[\begin{equation}
r(t) = kI(t)
\end{equation}\]

<p>where $k$ is a proportionality constant. Linear models are easy to analyze but often do not capture the non-linear properties of real neurons.</p>

<h3 id="non-linear-rate-models">Non-linear rate models</h3>
<p>More realistic models incorporate non-linear input-output functions, such as sigmoid functions, to better represent the saturation and threshold properties of neuronal firing:</p>

\[\begin{equation}
r(t) = \frac{r_{\text{max}}}{1 + \exp(-\beta(I(t) - \theta))}
\end{equation}\]

<p>where $r_{\text{max}}$ is the maximum firing rate, $\beta$ controls the steepness of the sigmoid, and $\theta$ is the threshold.</p>

<h3 id="wilson-cowan-model">Wilson-Cowan model</h3>
<p>This is a well-known rate model used to describe the dynamics of interacting excitatory and inhibitory populations:</p>

\[\begin{align}
\tau_E \frac{dE(t)}{dt} &amp;= -E(t) + f_E\left(w_{EE}E(t) - w_{EI}I(t) + I_E(t)\right),  \\
\tau_I \frac{dI(t)}{dt} &amp;= -I(t) + f_I\left(w_{IE}E(t) - w_{II}I(t) + I_I(t)\right)
\end{align}\]

<p>where $E(t)$ and $I(t)$ are the firing rates of the excitatory and inhibitory populations, respectively, and $w_{XY}$ are the <a href="/blog/2026-02-12-stdp/#synapse">synaptic weights</a> between the populations.</p>
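
<p>To get a feeling for these dynamics, here is a minimal forward-Euler integration sketch of the Wilson-Cowan equations; the sigmoidal gain function and all parameter values are illustrative choices, not taken from the original paper:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

def f(x):
    # sigmoidal gain function used for both populations:
    return 1.0 / (1.0 + np.exp(-x))

# illustrative parameters (chosen to produce oscillatory activity):
tau_E, tau_I = 10.0, 10.0  # time constants [ms]
wEE, wEI, wIE, wII = 16.0, 12.0, 15.0, 3.0
I_ext_E, I_ext_I = 1.0, -2.0  # constant external drives

dt, t_end = 0.1, 200.0  # Euler step and simulation time [ms]
steps = int(t_end / dt)
E = np.zeros(steps)
I = np.zeros(steps)

for k in range(steps - 1):
    dE = (-E[k] + f(wEE * E[k] - wEI * I[k] + I_ext_E)) / tau_E
    dI = (-I[k] + f(wIE * E[k] - wII * I[k] + I_ext_I)) / tau_I
    E[k + 1] = E[k] + dt * dE
    I[k + 1] = I[k] + dt * dI

print(E[-1], I[-1])  # final population rates (dimensionless activity)
</code></pre></div></div>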

<h3 id="applications-of-rate-models">Applications of rate models</h3>
<p>Rate models are used to study how neural networks transition between different activity states, such as oscillations, steady states, and chaos. They help in modeling the activity of specific brain regions and understanding how different parts of the brain interact. Rate models are also used to explore how neural networks perform computations and process information.</p>

<h2 id="python-simulation-with-nest">Python simulation with NEST</h2>
<p>NEST’s tutorial <a href="https://nest-simulator.readthedocs.io/en/stable/auto_examples/gif_pop_psc_exp.html">“Population rate model of generalized integrate-and-fire neurons”</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> provides a detailed example of simulating a population rate model using the generalized <a href="/blog/2023-07-03-integrate_and_fire_model/">integrate-and-fire</a> (GIF) neuron model. The tutorial replicates the main results from <a href="https://doi.org/10.1371/journal.pcbi.1005507">Schwalger et al. (2017)</a> and is a great resource for understanding how to implement and simulate rate models in a neural simulation environment such as the <a href="/blog/2024-06-09-nest_SNN_simulator/">NEST simulator</a>. It uses the effective stochastic population rate dynamics derived in Schwalger et al.’s work, which is implemented in NEST’s <a href="https://nest-simulator.readthedocs.io/en/stable/models/gif_pop_psc_exp.html"><code class="language-plaintext highlighter-rouge">gif_pop_psc_exp</code> model</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>. It is applied in a <a href="/blog/2024-07-21-brunel_network/">Brunel network</a> of two coupled populations, one excitatory and one inhibitory. We replicate this tutorial with some minor modifications.</p>

<p>We will first simulate the rate model on the so-called mesoscopic level, where the dynamics of the population activity are described by a set of ordinary differential equations (ODEs) for the average membrane potential and the average adaptation current of the populations – without simulating single neurons. In a second step, we will simulate the network on the microscopic level, where the dynamics of the individual neurons are described by the GIF model (<a href="https://nest-simulator.readthedocs.io/en/stable/models/gif_psc_exp.html"><code class="language-plaintext highlighter-rouge">gif_psc_exp</code></a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>).</p>

<h3 id="mesoscopic-simulation">Mesoscopic simulation</h3>
<p>Let’s begin with the mesoscopic simulation of the population rate model and import the necessary libraries:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="n">matplotlib.gridspec</span> <span class="k">as</span> <span class="n">gridspec</span>
<span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">nest</span>
<span class="kn">import</span> <span class="n">nest.raster_plot</span>

<span class="c1"># set global properties for all plots:
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">.</span><span class="nf">update</span><span class="p">({</span><span class="sh">'</span><span class="s">font.size</span><span class="sh">'</span><span class="p">:</span> <span class="mi">12</span><span class="p">})</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.top</span><span class="sh">"</span><span class="p">]</span>    <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.bottom</span><span class="sh">"</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.left</span><span class="sh">"</span><span class="p">]</span>   <span class="o">=</span> <span class="bp">False</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">"</span><span class="s">axes.spines.right</span><span class="sh">"</span><span class="p">]</span>  <span class="o">=</span> <span class="bp">False</span>

<span class="n">nest</span><span class="p">.</span><span class="nf">set_verbosity</span><span class="p">(</span><span class="sh">"</span><span class="s">M_WARNING</span><span class="sh">"</span><span class="p">)</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">ResetKernel</span><span class="p">()</span>
<span class="n">nest</span><span class="p">.</span><span class="n">local_num_threads</span> <span class="o">=</span> <span class="mi">100</span>
<span class="n">nest</span><span class="p">.</span><span class="n">rng_seed</span> <span class="o">=</span> <span class="mi">1</span>
</code></pre></div></div>

<p>We define the following simulation parameters:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define simulation resolutions:
</span><span class="n">dt</span> <span class="o">=</span> <span class="mf">0.5</span>       <span class="c1"># simulation resolution [ms]
</span><span class="n">dt_rec</span> <span class="o">=</span> <span class="mf">1.0</span>   <span class="c1"># resolution of the recordings [ms]
</span><span class="n">t_end</span> <span class="o">=</span> <span class="mf">2000.0</span> <span class="c1"># simulation time [ms]
</span>
<span class="n">nest</span><span class="p">.</span><span class="n">resolution</span> <span class="o">=</span> <span class="n">dt</span>
<span class="n">nest</span><span class="p">.</span><span class="n">print_time</span> <span class="o">=</span> <span class="bp">False</span> <span class="c1"># set to False if the code is not executed in a Jupyter notebook or VS Code's interactive window
</span><span class="n">t0</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="n">biological_time</span> <span class="c1"># biological time refers to the time of the NEST kernel
</span></code></pre></div></div>

<p>Next, we define the parameters of the population rate model. We will create two populations, one with 800 excitatory and one with 200 inhibitory neurons. The neuronal and connectivity parameters are set to replicate the GIF network model described in <a href="https://doi.org/10.1371/journal.pcbi.1005507">Schwalger et al. (2017)</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define the size of the population rate model:
</span><span class="n">size</span> <span class="o">=</span> <span class="mi">200</span>
<span class="n">N</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span> <span class="o">*</span> <span class="n">size</span> <span class="c1"># number of neurons in each population; here: 800 excitatory and 200 inhibitory neurons
</span><span class="n">M</span> <span class="o">=</span> <span class="nf">len</span><span class="p">(</span><span class="n">N</span><span class="p">)</span>  <span class="c1"># number of populations
</span>
<span class="c1"># neuronal parameters:
</span><span class="n">t_ref</span>       <span class="o">=</span> <span class="mf">4.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="n">M</span><span class="p">)</span>  <span class="c1"># absolute refractory period
</span><span class="n">tau_m</span>       <span class="o">=</span> <span class="mi">20</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="n">M</span><span class="p">)</span>   <span class="c1"># membrane time constant
</span><span class="n">mu</span>          <span class="o">=</span> <span class="mf">24.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="n">M</span><span class="p">)</span> <span class="c1"># constant base current mu=R*(I0+Vrest)
</span><span class="n">c</span>           <span class="o">=</span> <span class="mf">10.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="n">M</span><span class="p">)</span> <span class="c1"># base rate of exponential link function
</span><span class="n">Delta_u</span>     <span class="o">=</span> <span class="mf">2.5</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="n">M</span><span class="p">)</span>  <span class="c1"># softness of exponential link function
</span><span class="n">V_reset</span>     <span class="o">=</span> <span class="mf">0.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="n">M</span><span class="p">)</span>  <span class="c1"># Reset potential
</span><span class="n">V_th</span>        <span class="o">=</span> <span class="mf">15.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">(</span><span class="n">M</span><span class="p">)</span> <span class="c1"># baseline threshold (non-accumulating part)
</span><span class="n">tau_sfa_exc</span> <span class="o">=</span> <span class="p">[</span><span class="mf">100.0</span><span class="p">,</span> <span class="mf">1000.0</span><span class="p">]</span>   <span class="c1"># adaptation time constants of excitatory neurons
</span><span class="n">tau_sfa_inh</span> <span class="o">=</span> <span class="p">[</span><span class="mf">100.0</span><span class="p">,</span> <span class="mf">1000.0</span><span class="p">]</span>   <span class="c1"># adaptation time constants of inhibitory neurons
</span><span class="n">J_sfa_exc</span>   <span class="o">=</span> <span class="p">[</span><span class="mf">1000.0</span><span class="p">,</span> <span class="mf">1000.0</span><span class="p">]</span>  <span class="c1"># size of feedback kernel theta (= area under exponential) in mV*ms
</span><span class="n">J_sfa_inh</span>   <span class="o">=</span> <span class="p">[</span><span class="mf">1000.0</span><span class="p">,</span> <span class="mf">1000.0</span><span class="p">]</span>  <span class="c1"># in mV*ms
</span><span class="n">tau_theta</span>   <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">tau_sfa_exc</span><span class="p">,</span> <span class="n">tau_sfa_inh</span><span class="p">])</span>
<span class="n">J_theta</span>     <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">J_sfa_exc</span><span class="p">,</span> <span class="n">J_sfa_inh</span><span class="p">])</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define connectivity parameters:
</span><span class="n">J</span> <span class="o">=</span> <span class="mf">0.3</span>  <span class="c1"># excitatory synaptic weight in mV if number of input connections is C0 (see below)
</span><span class="n">g</span> <span class="o">=</span> <span class="mf">5.0</span>  <span class="c1"># inhibition-to-excitation ratio
</span><span class="n">pconn</span> <span class="o">=</span> <span class="mf">0.2</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">((</span><span class="n">M</span><span class="p">,</span> <span class="n">M</span><span class="p">))</span> <span class="c1"># connection probability
</span><span class="n">delay</span> <span class="o">=</span> <span class="mf">1.0</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">((</span><span class="n">M</span><span class="p">,</span> <span class="n">M</span><span class="p">))</span> <span class="c1"># synaptic delay in ms
</span>
<span class="n">C0</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([[</span><span class="mi">800</span><span class="p">,</span> <span class="mi">200</span><span class="p">],</span> <span class="p">[</span><span class="mi">800</span><span class="p">,</span> <span class="mi">200</span><span class="p">]])</span> <span class="o">*</span> <span class="mf">0.2</span>  <span class="c1"># constant reference matrix
</span><span class="n">C</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">vstack</span><span class="p">((</span><span class="n">N</span><span class="p">,</span> <span class="n">N</span><span class="p">))</span> <span class="o">*</span> <span class="n">pconn</span>                  <span class="c1"># numbers of input connections
</span>
<span class="c1"># final synaptic weights scaling as 1/C:
</span><span class="n">J_syn</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([[</span><span class="n">J</span><span class="p">,</span> <span class="o">-</span><span class="n">g</span> <span class="o">*</span> <span class="n">J</span><span class="p">],</span> <span class="p">[</span><span class="n">J</span><span class="p">,</span> <span class="o">-</span><span class="n">g</span> <span class="o">*</span> <span class="n">J</span><span class="p">]])</span> <span class="o">*</span> <span class="n">C0</span> <span class="o">/</span> <span class="n">C</span>

<span class="n">taus1_</span> <span class="o">=</span> <span class="p">[</span><span class="mf">3.0</span><span class="p">,</span> <span class="mf">6.0</span><span class="p">]</span>  <span class="c1"># time constants of exc./inh. postsynaptic currents (PSCs)
</span><span class="n">taus1</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">taus1_</span> <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">)])</span>
</code></pre></div></div>

<p>The synaptic weights <code class="language-plaintext highlighter-rouge">J_syn</code> are scaled by the number of input connections <code class="language-plaintext highlighter-rouge">C</code> to ensure that the total input to a neuron remains constant. The applied GIF model incorporates spike-frequency adaptation (SFA), which enables the neurons to adapt their firing rate in response to a constant input current. The adaptation is controlled by the adaptation time constants <code class="language-plaintext highlighter-rouge">tau_sfa_exc</code> and <code class="language-plaintext highlighter-rouge">tau_sfa_inh</code> and the feedback kernel <code class="language-plaintext highlighter-rouge">J_theta</code>. The model also incorporates a refractory period <code class="language-plaintext highlighter-rouge">t_ref</code> and a reset potential <code class="language-plaintext highlighter-rouge">V_reset</code> to mimic the behavior of real neurons. Postsynaptic currents (PSCs) are modeled as exponentials with time constants <code class="language-plaintext highlighter-rouge">taus1</code>.</p>
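
<p>To see what the adaptation parameters actually do, here is a small stand-alone sketch (our own illustration with made-up spike times): every spike adds a kernel of area <code class="language-plaintext highlighter-rouge">J_theta</code> (in mV*ms) to a moving threshold component <code class="language-plaintext highlighter-rouge">theta</code>, which then decays back with time constant <code class="language-plaintext highlighter-rouge">tau_theta</code>. The jump height per spike is <code class="language-plaintext highlighter-rouge">q_sfa = J_theta / tau_theta</code> in mV, which is exactly the conversion used when setting the NEST parameters below:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># stand-alone sketch of spike-triggered threshold adaptation (SFA):
# each spike increments theta by q_sfa = J_theta / tau_theta [mV];
# between spikes theta decays exponentially with time constant tau_theta.
tau_theta, J_theta = 1000.0, 1000.0  # [ms], [mV*ms]
q_sfa = J_theta / tau_theta          # jump height per spike [mV]

dt = 0.5
spike_times = {100.0, 150.0, 200.0}  # hypothetical spikes of one neuron [ms]
theta = 0.0
for step in range(int(500.0 / dt)):
    theta -= dt / tau_theta * theta  # exponential decay toward zero
    if step * dt in spike_times:     # at each spike the threshold jumps
        theta += q_sfa
print(f"adaptive threshold component after the spike train: {theta:.2f} mV")
</code></pre></div></div>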

<p>We will use a step current as input to the populations to drive the network dynamics. The step current raises the base input <code class="language-plaintext highlighter-rouge">mu</code> by 20 mV at time 1500 ms. The synaptic time constants of excitatory and inhibitory connections are set to 3 ms and 6 ms, respectively:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># step current input:
</span><span class="n">step</span> <span class="o">=</span> <span class="p">[[</span><span class="mf">20.0</span><span class="p">],</span> <span class="p">[</span><span class="mf">20.0</span><span class="p">]]</span>  <span class="c1"># jump size of mu in mV
</span><span class="n">tstep</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([[</span><span class="mf">1500.0</span><span class="p">],</span> <span class="p">[</span><span class="mf">1500.0</span><span class="p">]])</span>  <span class="c1"># times of jumps
</span>
<span class="c1"># synaptic time constants of excitatory and inhibitory connections:
</span><span class="n">tau_ex</span> <span class="o">=</span> <span class="mf">3.0</span>  <span class="c1"># in ms
</span><span class="n">tau_in</span> <span class="o">=</span> <span class="mf">6.0</span>  <span class="c1"># in ms
</span></code></pre></div></div>

<p>Next, we create the populations,</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># create the populations of GIF neurons:
</span><span class="n">nest_pops</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">gif_pop_psc_exp</span><span class="sh">"</span><span class="p">,</span> <span class="n">M</span><span class="p">)</span>

<span class="n">C_m</span> <span class="o">=</span> <span class="mf">250.0</span>  <span class="c1"># irrelevant value for membrane capacity, cancels out in simulation
</span><span class="n">g_L</span> <span class="o">=</span> <span class="n">C_m</span> <span class="o">/</span> <span class="n">tau_m</span>

<span class="n">params</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span><span class="sh">"</span><span class="s">C_m</span><span class="sh">"</span><span class="p">:</span> <span class="n">C_m</span><span class="p">,</span>
     <span class="sh">"</span><span class="s">I_e</span><span class="sh">"</span><span class="p">:</span> <span class="n">mu</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">*</span> <span class="n">g_L</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">lambda_0</span><span class="sh">"</span><span class="p">:</span> <span class="n">c</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>  <span class="c1"># in Hz!
</span>     <span class="sh">"</span><span class="s">Delta_V</span><span class="sh">"</span><span class="p">:</span> <span class="n">Delta_u</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">tau_m</span><span class="sh">"</span><span class="p">:</span> <span class="n">tau_m</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">tau_sfa</span><span class="sh">"</span><span class="p">:</span> <span class="n">tau_theta</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">q_sfa</span><span class="sh">"</span><span class="p">:</span> <span class="n">J_theta</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">tau_theta</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>  <span class="c1"># [J_theta]= mV*ms -&gt; [q_sfa]=mV
</span>     <span class="sh">"</span><span class="s">V_T_star</span><span class="sh">"</span><span class="p">:</span> <span class="n">V_th</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">V_reset</span><span class="sh">"</span><span class="p">:</span> <span class="n">V_reset</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">len_kernel</span><span class="sh">"</span><span class="p">:</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span>  <span class="c1"># -1 triggers automatic history size
</span>     <span class="sh">"</span><span class="s">N</span><span class="sh">"</span><span class="p">:</span> <span class="n">N</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">t_ref</span><span class="sh">"</span><span class="p">:</span> <span class="n">t_ref</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
     <span class="sh">"</span><span class="s">tau_syn_ex</span><span class="sh">"</span><span class="p">:</span> <span class="nf">max</span><span class="p">([</span><span class="n">tau_ex</span><span class="p">,</span> <span class="n">dt</span><span class="p">]),</span>
     <span class="sh">"</span><span class="s">tau_syn_in</span><span class="sh">"</span><span class="p">:</span> <span class="nf">max</span><span class="p">([</span><span class="n">tau_in</span><span class="p">,</span> <span class="n">dt</span><span class="p">]),</span>
     <span class="sh">"</span><span class="s">E_L</span><span class="sh">"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">}</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">)]</span>
<span class="n">nest_pops</span><span class="p">.</span><span class="nf">set</span><span class="p">(</span><span class="n">params</span><span class="p">)</span>
</code></pre></div></div>

<p>and further define the network connectivity:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># connect the populations:
</span><span class="n">g_syn</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones_like</span><span class="p">(</span><span class="n">J_syn</span><span class="p">)</span>  <span class="c1"># synaptic conductance
</span><span class="n">g_syn</span><span class="p">[:,</span> <span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">C_m</span> <span class="o">/</span> <span class="n">tau_ex</span>
<span class="n">g_syn</span><span class="p">[:,</span> <span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">C_m</span> <span class="o">/</span> <span class="n">tau_in</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">):</span>
    <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">):</span>
        <span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span>
            <span class="n">nest_pops</span><span class="p">[</span><span class="n">j</span><span class="p">],</span>
            <span class="n">nest_pops</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
            <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="n">J_syn</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">*</span> <span class="n">g_syn</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">*</span> <span class="n">pconn</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">],</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">delay</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]})</span>
</code></pre></div></div>

<p>To monitor the output of the network, we use a multimeter to record the mean activity of the populations and a spike recorder to record the spike times of the neurons:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># monitor the output using a multimeter (this only records with dt_rec!):
</span><span class="n">nest_mm</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">multimeter</span><span class="sh">"</span><span class="p">)</span>
<span class="n">nest_mm</span><span class="p">.</span><span class="nf">set</span><span class="p">(</span><span class="n">record_from</span><span class="o">=</span><span class="p">[</span><span class="sh">"</span><span class="s">n_events</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">mean</span><span class="sh">"</span><span class="p">],</span> <span class="n">interval</span><span class="o">=</span><span class="n">dt_rec</span><span class="p">)</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">nest_mm</span><span class="p">,</span> <span class="n">nest_pops</span><span class="p">)</span>

<span class="c1"># monitor the output using a spike recorder:
</span><span class="n">spikerecorder</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">):</span>
    <span class="n">spikerecorder</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">spike_recorder</span><span class="sh">"</span><span class="p">))</span>
    <span class="n">spikerecorder</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">time_in_steps</span> <span class="o">=</span> <span class="bp">True</span>
    <span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">nest_pops</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">spikerecorder</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">dt</span><span class="p">})</span>
</code></pre></div></div>

<p>We set the initial value of the step current generator to zero and create the step current devices:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># set initial value (at t0+dt) of step current generator to zero:
</span><span class="n">tstep</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">hstack</span><span class="p">((</span><span class="n">dt</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">((</span><span class="n">M</span><span class="p">,</span> <span class="mi">1</span><span class="p">)),</span> <span class="n">tstep</span><span class="p">))</span>
<span class="n">step</span>  <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">hstack</span><span class="p">((</span><span class="n">np</span><span class="p">.</span><span class="nf">zeros</span><span class="p">((</span><span class="n">M</span><span class="p">,</span> <span class="mi">1</span><span class="p">)),</span> <span class="n">step</span><span class="p">))</span>

<span class="c1"># create the step current devices:
</span><span class="n">nest_stepcurrent</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">step_current_generator</span><span class="sh">"</span><span class="p">,</span> <span class="n">M</span><span class="p">)</span>
<span class="c1"># set the parameters for the step currents
</span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">):</span>
    <span class="n">nest_stepcurrent</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span><span class="n">amplitude_times</span><span class="o">=</span><span class="n">tstep</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">t0</span><span class="p">,</span> <span class="n">amplitude_values</span><span class="o">=</span><span class="n">step</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">*</span> <span class="n">g_L</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">origin</span><span class="o">=</span><span class="n">t0</span><span class="p">,</span> <span class="n">stop</span><span class="o">=</span><span class="n">t_end</span><span class="p">)</span>
    <span class="n">pop_</span> <span class="o">=</span> <span class="n">nest_pops</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
    <span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">nest_stepcurrent</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">pop_</span><span class="p">,</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">dt</span><span class="p">})</span>
</code></pre></div></div>

<p>Finally, we simulate the network,</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># simulate the network:
</span><span class="n">t</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">t_end</span><span class="p">,</span> <span class="n">dt_rec</span><span class="p">)</span>
<span class="n">A_N</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">((</span><span class="n">t</span><span class="p">.</span><span class="n">size</span><span class="p">,</span> <span class="n">M</span><span class="p">))</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">nan</span>
<span class="n">Abar</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones_like</span><span class="p">(</span><span class="n">A_N</span><span class="p">)</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">nan</span>
<span class="n">nest</span><span class="p">.</span><span class="nc">Simulate</span><span class="p">(</span><span class="n">t_end</span> <span class="o">+</span> <span class="n">dt</span><span class="p">)</span> <span class="c1"># simulate 1 step longer to make sure all t are simulated:
</span></code></pre></div></div>

<p>and extract the data from the multimeter and the spike recorder for later visualization:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># extract the data from the multimeter and the spike recorder for later visualization:
</span><span class="n">data_mm</span> <span class="o">=</span> <span class="n">nest_mm</span><span class="p">.</span><span class="n">events</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">nest_i</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">nest_pops</span><span class="p">):</span>
    <span class="n">a_i</span> <span class="o">=</span> <span class="n">data_mm</span><span class="p">[</span><span class="sh">"</span><span class="s">mean</span><span class="sh">"</span><span class="p">][</span><span class="n">data_mm</span><span class="p">[</span><span class="sh">"</span><span class="s">senders</span><span class="sh">"</span><span class="p">]</span> <span class="o">==</span> <span class="n">nest_i</span><span class="p">.</span><span class="n">global_id</span><span class="p">]</span>
    <span class="n">a</span> <span class="o">=</span> <span class="n">a_i</span> <span class="o">/</span> <span class="n">N</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">dt</span>
    <span class="n">min_len</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">min</span><span class="p">([</span><span class="nf">len</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="nf">len</span><span class="p">(</span><span class="n">Abar</span><span class="p">)])</span>
    <span class="n">Abar</span><span class="p">[:</span><span class="n">min_len</span><span class="p">,</span> <span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">a</span><span class="p">[:</span><span class="n">min_len</span><span class="p">]</span>

    <span class="n">data_sr</span> <span class="o">=</span> <span class="n">spikerecorder</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">events</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">times</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">data_sr</span> <span class="o">=</span> <span class="n">data_sr</span> <span class="o">*</span> <span class="n">dt</span> <span class="o">-</span> <span class="n">t0</span>
    <span class="n">bins</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">((</span><span class="n">t</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">t</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">dt_rec</span><span class="p">])))</span>
    <span class="n">A</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">histogram</span><span class="p">(</span><span class="n">data_sr</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="n">bins</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> <span class="o">/</span> <span class="nf">float</span><span class="p">(</span><span class="n">N</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">/</span> <span class="n">dt_rec</span>
    <span class="n">A_N</span><span class="p">[:,</span> <span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span>
</code></pre></div></div>
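
<p>As a quick sanity check of this normalization (the numbers are purely illustrative): the histogram divides the spike count per bin by the population size and the bin width, yielding a rate per millisecond, which the plots below scale by a factor of 1000 to obtain Hz:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># worked example of the activity normalization (illustrative numbers):
spikes_in_bin = 8    # spikes of the excitatory population in one bin
N_pop = 800          # population size
dt_rec_ms = 1.0      # bin width [ms]
A = spikes_in_bin / N_pop / dt_rec_ms  # population activity in 1/ms
print(A * 1000)      # 10.0 Hz, matching the factor of 1000 in the plots
</code></pre></div></div>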

<h3 id="microscopic-simulation">Microscopic simulation</h3>
<p>In the microscopic simulation, we simulate the network dynamics on the level of individual neurons using the GIF model. We will create the same two populations of GIF neurons as before  and connect them with the same connectivity parameters as in the mesoscopic simulation:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">nest</span><span class="p">.</span><span class="nc">ResetKernel</span><span class="p">()</span>
<span class="n">nest</span><span class="p">.</span><span class="n">resolution</span> <span class="o">=</span> <span class="n">dt</span>
<span class="n">nest</span><span class="p">.</span><span class="n">print_time</span> <span class="o">=</span> <span class="bp">False</span>
<span class="n">nest</span><span class="p">.</span><span class="n">local_num_threads</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">nest</span><span class="p">.</span><span class="n">rng_seed</span> <span class="o">=</span> <span class="mi">1</span>

<span class="n">t0</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="n">biological_time</span>

<span class="c1"># create the 2 populations of GIF neurons (excitatory and inhibitory):
</span><span class="n">nest_pops</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">):</span>
    <span class="n">nest_pops</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">gif_psc_exp</span><span class="sh">"</span><span class="p">,</span> <span class="n">N</span><span class="p">[</span><span class="n">k</span><span class="p">]))</span>

<span class="c1"># set single neuron properties:
</span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">):</span>
    <span class="n">nest_pops</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span>
        <span class="n">C_m</span><span class="o">=</span><span class="n">C_m</span><span class="p">,</span>
        <span class="n">I_e</span><span class="o">=</span><span class="n">mu</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">*</span> <span class="n">g_L</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">lambda_0</span><span class="o">=</span><span class="n">c</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">Delta_V</span><span class="o">=</span><span class="n">Delta_u</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">g_L</span><span class="o">=</span><span class="n">g_L</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">tau_sfa</span><span class="o">=</span><span class="n">tau_theta</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">q_sfa</span><span class="o">=</span><span class="n">J_theta</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">tau_theta</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">V_T_star</span><span class="o">=</span><span class="n">V_th</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">V_reset</span><span class="o">=</span><span class="n">V_reset</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">t_ref</span><span class="o">=</span><span class="n">t_ref</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
        <span class="n">tau_syn_ex</span><span class="o">=</span><span class="nf">max</span><span class="p">([</span><span class="n">tau_ex</span><span class="p">,</span> <span class="n">dt</span><span class="p">]),</span>
        <span class="n">tau_syn_in</span><span class="o">=</span><span class="nf">max</span><span class="p">([</span><span class="n">tau_in</span><span class="p">,</span> <span class="n">dt</span><span class="p">]),</span>
        <span class="n">E_L</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span>
        <span class="n">V_m</span><span class="o">=</span><span class="mf">0.0</span><span class="p">)</span>

<span class="c1"># connect the populations:
</span><span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">nest_i</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">nest_pops</span><span class="p">):</span>
    <span class="k">for</span> <span class="n">j</span><span class="p">,</span> <span class="n">nest_j</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">nest_pops</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">np</span><span class="p">.</span><span class="nf">allclose</span><span class="p">(</span><span class="n">pconn</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">],</span> <span class="mf">1.0</span><span class="p">):</span>
            <span class="n">conn_spec</span> <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">rule</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">all_to_all</span><span class="sh">"</span><span class="p">}</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">conn_spec</span> <span class="o">=</span> <span class="p">{</span><span class="sh">"</span><span class="s">rule</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">fixed_indegree</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">indegree</span><span class="sh">"</span><span class="p">:</span> <span class="nf">int</span><span class="p">(</span><span class="n">pconn</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">*</span> <span class="n">N</span><span class="p">[</span><span class="n">j</span><span class="p">])}</span>

        <span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">nest_j</span><span class="p">,</span> <span class="n">nest_i</span><span class="p">,</span> <span class="n">conn_spec</span><span class="p">,</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="n">J_syn</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">*</span> <span class="n">g_syn</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">],</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">delay</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]})</span>

<span class="c1"># monitor the output using a multimeter and a spike recorder:
</span><span class="n">spikerecorder</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">nest_i</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">nest_pops</span><span class="p">):</span>
    <span class="n">spikerecorder</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">spike_recorder</span><span class="sh">"</span><span class="p">))</span>
    <span class="n">spikerecorder</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">time_in_steps</span> <span class="o">=</span> <span class="bp">True</span>

    <span class="c1"># record all spikes from population to compute population activity
</span>    <span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">nest_i</span><span class="p">,</span> <span class="n">spikerecorder</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">dt</span><span class="p">})</span>

<span class="c1"># record the membrane potential of the first Nrecord neurons of each population:
</span><span class="n">Nrecord</span> <span class="o">=</span> <span class="p">[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">0</span><span class="p">]</span>  <span class="c1"># for each population "i" the first Nrecord[i] neurons are recorded
</span><span class="n">multimeter</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">nest_i</span> <span class="ow">in</span> <span class="nf">enumerate</span><span class="p">(</span><span class="n">nest_pops</span><span class="p">):</span>
    <span class="n">multimeter</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">multimeter</span><span class="sh">"</span><span class="p">))</span>
    <span class="n">multimeter</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span><span class="n">record_from</span><span class="o">=</span><span class="p">[</span><span class="sh">"</span><span class="s">V_m</span><span class="sh">"</span><span class="p">],</span> <span class="n">interval</span><span class="o">=</span><span class="n">dt_rec</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">Nrecord</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">multimeter</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">nest_i</span><span class="p">[:</span> <span class="n">Nrecord</span><span class="p">[</span><span class="n">i</span><span class="p">]],</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">dt</span><span class="p">})</span>

<span class="c1"># create the step current devices and set its parameters:
</span><span class="n">nest_stepcurrent</span> <span class="o">=</span> <span class="n">nest</span><span class="p">.</span><span class="nc">Create</span><span class="p">(</span><span class="sh">"</span><span class="s">step_current_generator</span><span class="sh">"</span><span class="p">,</span> <span class="n">M</span><span class="p">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">M</span><span class="p">):</span>
    <span class="n">nest_stepcurrent</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span><span class="n">amplitude_times</span><span class="o">=</span><span class="n">tstep</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">t0</span><span class="p">,</span> <span class="n">amplitude_values</span><span class="o">=</span><span class="n">step</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">*</span> <span class="n">g_L</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">origin</span><span class="o">=</span><span class="n">t0</span><span class="p">,</span> <span class="n">stop</span><span class="o">=</span><span class="n">t_end</span><span class="p">)</span>
    <span class="n">nest_stepcurrent</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">set</span><span class="p">(</span><span class="n">amplitude_times</span><span class="o">=</span><span class="n">tstep</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">t0</span><span class="p">,</span> <span class="n">amplitude_values</span><span class="o">=</span><span class="n">step</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">*</span> <span class="n">g_L</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">origin</span><span class="o">=</span><span class="n">t0</span><span class="p">,</span> <span class="n">stop</span><span class="o">=</span><span class="n">t_end</span><span class="p">)</span>
    <span class="c1"># optionally a stopping time may be added by: 'stop': sim_T + t0
</span>    <span class="n">pop_</span> <span class="o">=</span> <span class="n">nest_pops</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
    <span class="n">nest</span><span class="p">.</span><span class="nc">Connect</span><span class="p">(</span><span class="n">nest_stepcurrent</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">pop_</span><span class="p">,</span> <span class="n">syn_spec</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">weight</span><span class="sh">"</span><span class="p">:</span> <span class="mf">1.0</span><span class="p">,</span> <span class="sh">"</span><span class="s">delay</span><span class="sh">"</span><span class="p">:</span> <span class="n">dt</span><span class="p">})</span>
</code></pre></div></div>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># simulate 1 step longer to make sure all t are simulated
</span><span class="n">nest</span><span class="p">.</span><span class="nc">Simulate</span><span class="p">(</span><span class="n">t_end</span> <span class="o">+</span> <span class="n">dt</span><span class="p">)</span>
</code></pre></div></div>

<p>For visualization, we extract the data from the spike recorder,</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># extract the data from the spike recorder:
</span><span class="n">t_micro</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">t_end</span><span class="p">,</span> <span class="n">dt_rec</span><span class="p">)</span>
<span class="n">A_N_micro</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">ones</span><span class="p">((</span><span class="n">t</span><span class="p">.</span><span class="n">size</span><span class="p">,</span> <span class="n">M</span><span class="p">))</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">nan</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="nf">len</span><span class="p">(</span><span class="n">nest_pops</span><span class="p">)):</span>
    <span class="n">data_sr</span> <span class="o">=</span> <span class="n">spikerecorder</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">events</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">times</span><span class="sh">"</span><span class="p">)</span> <span class="o">*</span> <span class="n">dt</span> <span class="o">-</span> <span class="n">t0</span>
    <span class="n">bins</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">concatenate</span><span class="p">((</span><span class="n">t_micro</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">t</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">dt_rec</span><span class="p">])))</span>
    <span class="n">A</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">histogram</span><span class="p">(</span><span class="n">data_sr</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="n">bins</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> <span class="o">/</span> <span class="nf">float</span><span class="p">(</span><span class="n">N</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">/</span> <span class="n">dt_rec</span>
    <span class="n">A_N_micro</span><span class="p">[:,</span> <span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span> <span class="o">*</span> <span class="mi">1000</span>  <span class="c1"># in Hz
</span></code></pre></div></div>

<p>and plot the results of the mesoscopic and microscopic simulations:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot excitatory population:
</span><span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">A_N</span><span class="p">[:,</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="mi">1000</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$A_N$ population activity</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">Abar</span><span class="p">[:,</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="mi">1000</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$</span><span class="se">\\</span><span class="s">bar A$ instantaneous population rate</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">population activity</span><span class="se">\n</span><span class="s">[Hz]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">annotate</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">mesoscopic</span><span class="se">\n</span><span class="s">simulation</span><span class="sh">"</span><span class="p">,</span> <span class="n">xy</span><span class="o">=</span><span class="p">(</span><span class="mf">0.75</span><span class="p">,</span> <span class="mf">0.85</span><span class="p">),</span> <span class="n">fontweight</span><span class="o">=</span><span class="sh">"</span><span class="s">normal</span><span class="sh">"</span><span class="p">,</span>
             <span class="n">xycoords</span><span class="o">=</span><span class="sh">"</span><span class="s">axes fraction</span><span class="sh">"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="sh">"</span><span class="s">left</span><span class="sh">"</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper left</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">120</span><span class="p">])</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">yticks</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">101</span><span class="p">,</span> <span class="mi">25</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">"</span><span class="s">Population activities (excitatory population)</span><span class="sh">"</span><span class="p">)</span>

<span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t_micro</span><span class="p">,</span> <span class="n">A_N_micro</span><span class="p">[:,</span><span class="mi">0</span><span class="p">],</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$A_N$</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">population activity</span><span class="se">\n</span><span class="s">[Hz]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">time [ms]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">annotate</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">microscopic</span><span class="se">\n</span><span class="s">simulation</span><span class="sh">"</span><span class="p">,</span> <span class="n">xy</span><span class="o">=</span><span class="p">(</span><span class="mf">0.75</span><span class="p">,</span> <span class="mf">0.85</span><span class="p">),</span> <span class="n">fontweight</span><span class="o">=</span><span class="sh">"</span><span class="s">normal</span><span class="sh">"</span><span class="p">,</span>
             <span class="n">xycoords</span><span class="o">=</span><span class="sh">"</span><span class="s">axes fraction</span><span class="sh">"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="sh">"</span><span class="s">left</span><span class="sh">"</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">120</span><span class="p">])</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">yticks</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">101</span><span class="p">,</span> <span class="mi">25</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>

<span class="c1"># plot inhibitory population:
</span><span class="n">plt</span><span class="p">.</span><span class="nf">figure</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">A_N</span><span class="p">[:,</span><span class="mi">1</span><span class="p">]</span> <span class="o">*</span> <span class="mi">1000</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$A_N$ population activity</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">Abar</span><span class="p">[:,</span><span class="mi">1</span><span class="p">]</span> <span class="o">*</span> <span class="mi">1000</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$</span><span class="se">\\</span><span class="s">bar A$ instantaneous population rate</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">population activity</span><span class="se">\n</span><span class="s">[Hz]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">annotate</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">mesoscopic</span><span class="se">\n</span><span class="s">simulation</span><span class="sh">"</span><span class="p">,</span> <span class="n">xy</span><span class="o">=</span><span class="p">(</span><span class="mf">0.75</span><span class="p">,</span> <span class="mf">0.85</span><span class="p">),</span> <span class="n">fontweight</span><span class="o">=</span><span class="sh">"</span><span class="s">normal</span><span class="sh">"</span><span class="p">,</span>
             <span class="n">xycoords</span><span class="o">=</span><span class="sh">"</span><span class="s">axes fraction</span><span class="sh">"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="sh">"</span><span class="s">left</span><span class="sh">"</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">frameon</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="sh">"</span><span class="s">upper left</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">120</span><span class="p">])</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">yticks</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">101</span><span class="p">,</span> <span class="mi">25</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">"</span><span class="s">Population activities (inhibitory population)</span><span class="sh">"</span><span class="p">)</span>

<span class="c1"># plot instantaneous population rates (in Hz):
</span><span class="n">plt</span><span class="p">.</span><span class="nf">subplot</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">t_micro</span><span class="p">,</span> <span class="n">A_N_micro</span><span class="p">[:,</span><span class="mi">1</span><span class="p">],</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="sh">"</span><span class="s">$A_N$</span><span class="sh">"</span><span class="p">)</span>
<span class="c1">#plt.plot(t, A_N[:,1] * 1000, '-', alpha=0.5, label="inhibitory population")
</span><span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">population activity</span><span class="se">\n</span><span class="s">[Hz]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">"</span><span class="s">time [ms]</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">annotate</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">microscopic</span><span class="se">\n</span><span class="s">simulation</span><span class="sh">"</span><span class="p">,</span> <span class="n">xy</span><span class="o">=</span><span class="p">(</span><span class="mf">0.75</span><span class="p">,</span> <span class="mf">0.85</span><span class="p">),</span> <span class="n">fontweight</span><span class="o">=</span><span class="sh">"</span><span class="s">normal</span><span class="sh">"</span><span class="p">,</span>
             <span class="n">xycoords</span><span class="o">=</span><span class="sh">"</span><span class="s">axes fraction</span><span class="sh">"</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="sh">"</span><span class="s">left</span><span class="sh">"</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="sh">"</span><span class="s">center</span><span class="sh">"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylim</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">120</span><span class="p">])</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">yticks</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">101</span><span class="p">,</span> <span class="mi">25</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">tight_layout</span><span class="p">()</span>
</code></pre></div></div>

<h2 id="results">Results</h2>
<h3 id="excitatory-population">Excitatory population</h3>
<p>Let’s take a look at the simulation results for the excitatory population:</p>

<p class="align-caption"><a href="/assets/images/posts/nest/population_rate_model_population_activity_excitatory.png" title="Simulated population activity of the excitatory population using the mesoscopic and microscopic  simulations."><img src="/assets/images/posts/nest/population_rate_model_population_activity_excitatory.png" width="100%" alt="Simulated population activity of the excitatory population using the mesoscopic and microscopic  simulations." /></a>
Simulated population activity of the excitatory population using mesoscopic and microscopic simulations. The top panel shows the mesoscopic activity from the rate model: $A_N$ (blue) computed from spikerecorder data as a binned histogram (discrete, noisier) and $\bar A$ (orange) from multimeter data as a continuous measure (smoother). $A_N$ is inherently noisier and strongly dependent on bin size, compared to $\bar A$, which averages activity continuously over the recording interval and therefore appears smoother. This is due to the fact that spikerecorder-based histograms capture discrete spike counts, while the multimeter integrates population firing as a continuous variable. The bottom panel shows in contrast to the rate model’s results the microscopic activity $A_N$ derived from simulated spiking GIF neurons. Mesoscopic and microscopic traces are not identical, since one averages firing rates and the other emerges from explicit spikes, but both capture the population’s strong activation after 1500 ms. Rate models thus offer efficient and smooth approximations, while spiking models preserve variability and spike-level detail.</p>

<p>The top panel shows the mesoscopic activity from the rate model, with $A_N$ (blue) computed from spikerecorder data as a binned histogram (discrete, noisier) and $\bar A$ (orange) from multimeter data as a continuously averaged signal (smoother). The bottom panel, in contrast, shows the microscopic activity $A_N$ derived from explicit spikes of GIF neurons. These different definitions explain why $\bar A$ appears smoother and less noisy than $A_N$, and why mesoscopic and microscopic traces are not identical.</p>
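
<p>To make this distinction concrete, here is a minimal sketch of how a binned activity $A_N$ and a smoother, continuously averaged estimate in the spirit of $\bar A$ can be computed from pooled spike times. The surrogate spike data, population size, duration, and window length are illustrative assumptions, not values taken from the simulation above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

# surrogate pooled spike times (in s) of a population, standing in for
# spikerecorder output; N, T and the spike count are assumed values:
rng = np.random.default_rng(1)
N, T = 500, 3.0                      # neurons, duration in s
spike_times = rng.uniform(0, T, size=30_000)

# A_N: binned population activity in Hz (spike count / (N * bin width)):
bin_width = 0.001                    # 1 ms bins
edges = np.arange(0, T + bin_width, bin_width)
counts, _ = np.histogram(spike_times, bins=edges)
A_N = counts / (N * bin_width)       # noisy, bin-size dependent

# smoother estimate in the spirit of the continuously averaged Abar:
win = 50                             # 50 ms moving-average window (assumed)
A_bar = np.convolve(A_N, np.ones(win) / win, mode="same")
</code></pre></div></div>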

<p>We can distinguish two phases in the population activity in both panels:</p>

<p><strong>1. Baseline activity (0 to 1500 ms)</strong>:<br />
In both simulations, the excitatory population initially fires at low but stable rates. In the mesoscopic simulation, $\bar A$ appears smoother because it is a continuous population average over the recording interval, while $A_N$ is noisier and bin-size dependent, as it is based on discrete spike counts (the spikerecorder histogram).</p>
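
<p>The bin-size dependence of $A_N$ can be checked directly. The following self-contained sketch uses surrogate Poisson spikes (population size, duration, and rate are assumed, illustrative values) to show how the standard deviation of the binned activity shrinks as the bin width grows:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np

# surrogate stationary baseline: Poisson spikes pooled over a population
# (population size, duration and rate are assumed values):
rng = np.random.default_rng(2)
N, T, r0 = 500, 1.5, 10.0            # neurons, duration (s), rate (Hz)
spike_times = rng.uniform(0, T, size=rng.poisson(N * T * r0))

# the variance of A_N depends strongly on the chosen bin width:
for bw in (0.0005, 0.001, 0.005, 0.02):   # 0.5, 1, 5, 20 ms bins
    edges = np.arange(0, T + bw, bw)
    counts, _ = np.histogram(spike_times, bins=edges)
    A_N = counts / (N * bw)
    print(f"bin {bw * 1000:5.1f} ms: std(A_N) = {A_N.std():6.1f} Hz")
</code></pre></div></div>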

<p><strong>2. Oscillatory behavior (&gt;1500 ms)</strong>:<br />
At around 1500 ms, a step current input is applied, which markedly increases the firing rates in both simulations. The mesoscopic simulation shows a rapid rise in both $A_N$ and $\bar{A}$, indicating a robust, synchronized response to the step input, followed by oscillations in both the population activity and the instantaneous population rate; $\bar A$ highlights the rhythmic structure more clearly thanks to its continuous averaging, whereas $A_N$ retains higher variance.<br />
In the microscopic simulation, the increase in population activity is also clearly visible, but the oscillations are somewhat more irregular, with higher variability in spike counts and slightly larger amplitudes. This reflects finite-size fluctuations and the inherent stochasticity of explicit spiking dynamics. Overall, the mesoscopic and microscopic traces differ in detail (one averages rates, the other arises from individual spikes), but both consistently capture the strong activation and the transition to oscillatory activity after 1500 ms.</p>
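
<p>One way to quantify the oscillatory regime is to estimate the power spectrum of the post-step rate trace. Below is a minimal sketch using a Welch periodogram on a synthetic oscillatory trace; the 8 Hz rhythm, mean rate, and noise level are assumed stand-ins, and in practice one would pass the simulated $A_N$ after 1500 ms instead:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from scipy.signal import welch

# synthetic post-step population rate at 1 ms resolution (assumed values):
fs = 1000.0                                   # sampling rate in Hz
t = np.arange(0, 1.5, 1 / fs)                 # 1.5 s of post-step activity
rate = 60 + 25 * np.sin(2 * np.pi * 8 * t)    # 8 Hz oscillation around 60 Hz
rate += np.random.default_rng(0).normal(0, 10, t.size)  # finite-size-like noise

# Welch periodogram; the spectral peak marks the dominant oscillation:
f, Pxx = welch(rate - rate.mean(), fs=fs, nperseg=512)
print(f"dominant oscillation at about {f[np.argmax(Pxx)]:.1f} Hz")
</code></pre></div></div>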

<p><strong>Conclusion</strong>:<br />
The smoother and more regular patterns in the mesoscopic simulation highlight the effectiveness of rate models in capturing the overall dynamics of large neural populations. This approach reduces the computational complexity and provides clear insights into the collective behavior of neurons.</p>

<p>The detailed spike timing and higher variability in the microscopic simulation, on the other hand, emphasize the importance of individual neuron dynamics and interactions. This approach is more computationally intensive, but it resolves the network&#8217;s activity at the level of individual spikes.</p>

<p>Both approaches are valuable, with rate models offering a simplified yet insightful view of population dynamics and spiking neuron models providing detailed insights into neural mechanisms.</p>

<h3 id="inhibitory-population">Inhibitory population</h3>
<p>The inhibitory population shows similar behavior to the excitatory population, but with larger $A_N$ irregularities in both the mesoscopic and microscopic simulations:</p>

<p class="align-caption"><a href="/assets/images/posts/nest/population_rate_model_population_activity_inhibitory.png" title="Simulated population activity of the excitatory population using the mesoscopic and microscopic  simulations."><img src="/assets/images/posts/nest/population_rate_model_population_activity_inhibitory.png" width="100%" alt="Simulated population activity of the excitatory population using the mesoscopic and microscopic  simulations." /></a>
Same as above, but for the inhibitory population.</p>

<p>While in both simulations the population activity $A_N$ is much noisier and shows higher amplitudes than for the excitatory population, the instantaneous population rate $\bar A$ remains smooth and clearly reveals the oscillatory behavior in the mesoscopic simulation. The rate model thus captures the collective dynamics of the inhibitory population effectively in this case as well. As above, the difference between $A_N$ (binned, histogram-based) and $\bar A$ (continuously averaged) explains the contrast in smoothness, and the microscopic traces remain more variable due to explicit spiking and finite-size effects.</p>

<h2 id="conclusion">Conclusion</h2>
<p>Rate models are essential tools in <a href="/blog/2026-02-04-neural_dynamics/">computational neuroscience</a> for studying the collective behavior of large populations of neurons. By focusing on the average firing rate of a population, rate models provide a simplified yet insightful view of neural network dynamics. They are particularly useful for studying network-level phenomena such as oscillations, synchronization, and information processing. However, rate models sacrifice the detailed spike timing information of individual neurons, which can be crucial for understanding the underlying mechanisms of neural computation. As emphasized by Gerstner et al. (2014, <a href="https://neuronaldynamics.epfl.ch/online/Ch15.S3.html">Chapter 15</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span>), simple rate equations may also fail to capture fast transients in population responses, which depend critically on precise spike timing. The choice between rate models and spiking neuron models should therefore follow from the research question and the level of temporal detail it requires.</p>

<p>The complete code used in this blog post is available in this <a href="https://github.com/FabrizioMusacchio/neural_dynamics">GitHub repository</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span> (<code class="language-plaintext highlighter-rouge">population_rate_model.py</code>). Feel free to modify and expand upon it, and share your insights.</p>

<h2 id="references">References</h2>
<ul>
  <li>Wulfram Gerstner, Werner M. Kistler, Richard Naud, and Liam Paninski, <em>Chapter 15 Fast Transients and Rate Models</em> in <em>Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition</em>, 2014, Cambridge University Press, ISBN: 978-1-107-06083-8, <a href="https://neuronaldynamics.epfl.ch/online/index.html">free online version</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>P. Dayan, L. F. Abbott, <em>Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems</em>, 2001, MIT Press</li>
  <li>Donald O. Hebb, <em>The Organization of Behavior</em>, 1949, Wiley: New York, doi: <a href="https://doi.org/10.1016/s0361-9230(99)00182-3">10.1016/s0361-9230(99)00182-3</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Fred Rieke, David Warland, Rob de Ruyter van Steveninck, and William Bialek, <em>Spikes: Exploring the Neural Code</em>, 1999, MIT Press, ISBN: 9780262181747, <a href="https://mitpress.mit.edu/9780262181747/spikes/">url</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Hugh R. Wilson, Jack D. Cowan, <em>Excitatory and Inhibitory Interactions in Localized Populations of Model Neurons</em>, 1972, Biophysical Journal, 12(1), 1–24, doi: <a href="https://doi.org/10.1016/S0006-3495(72)86068-5">10.1016/S0006-3495(72)86068-5</a></li>
  <li>Wulfram Gerstner, W. M. Kistler, <em>Spiking Neuron Models: Single Neurons, Populations, Plasticity</em> (2002), Cambridge University Press, ISBN 0-521-81384-0, <a href="https://lcnwww.epfl.ch/gerstner/SPNM/SPNM.html">free online version</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li>Schwalger, T, Deger, M, &amp; Gerstner, W, <em>Towards a theory of cortical columns: From spiking neurons to interacting neural populations of finite size</em> (2017), PLoS Comput Biol, 13(4), e1005507. doi: <a href="https://doi.org/10.1371/journal.pcbi.1005507">10.1371/journal.pcbi.1005507</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/auto_examples/gif_pop_psc_exp.html">NEST’s tutorial “Population rate model of generalized integrate-and-fire neurons”</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/models/gif_pop_psc_exp.html">NEST’s <code class="language-plaintext highlighter-rouge">gif_pop_psc_exp</code> model description</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
  <li><a href="https://nest-simulator.readthedocs.io/en/stable/models/gif_psc_exp.html">NEST’s <code class="language-plaintext highlighter-rouge">gif_psc_exp</code> model description</a><span style="color:#d5d6db;font-size:0.8rem;">ꜛ</span></li>
</ul>

]]></content><author><name> </name></author><category term="Python" /><category term="Computational Science" /><category term="Neuroscience" /><summary type="html"><![CDATA[Rate models provide simplified representations of neural activity in which the precise spike timing of individual neurons is replaced by their average firing rate. This abstraction makes it possible to study the collective behavior of large neuronal populations and to analyze network dynamics in a tractable way. I recently played around with some rate models in Python, and in this post I'd like to share what I have learned so far about their implementation and utility.]]></summary></entry></feed>