Wednesday, November 11, 2015

Nerd Food: Tooling in Computational Neuroscience - Part I: NEURON

In the previous series of posts we built up the theory - right up to the point where we were just about able to make sense of Integrate and Fire, one of the simpler families of neuron models. The series used a reductionist approach - or bottom up, if you prefer1. We are now starting a new series with the opposite take, this time coming at it from the top. The objective is to provide a (very) high-level overview - in layman's terms, still - of a few of the "platforms" used in computational neuroscience. As this is a rather large topic, we'll try to tackle a couple of platforms each post, discussing a little bit of their history, purpose and limitations - whilst trying to maintain a focus on file formats or DSLs. "File formats" may not sound particularly exciting at first glance, but it is important to keep in mind that these are instances of meta-models of the problem domain in question, and as such, their expressiveness is very important. Understand those and you've understood a great deal about the domain and about the engineering choices of those involved.

But first, let's introduce Computational Neuroscience.

Computers and the Brain

Part V of our previous series discussed some of the reasons why one would want to model neurons (section Brief Context on Modeling). What we did not mention is that there is a whole scientific discipline dedicated to this endeavour, called Computational Neuroscience. Wikipedia has a pretty good working definition, which we will take wholesale. It states:

Computational neuroscience […] is the study of brain function in terms of the information processing properties of the structures that make up the nervous system. It is an interdisciplinary science that links the diverse fields of neuroscience, cognitive science, and psychology with electrical engineering, computer science, mathematics, and physics.

Computational neuroscience is distinct from psychological connectionism and from learning theories of disciplines such as machine learning, neural networks, and computational learning theory in that it emphasizes descriptions of functional and biologically realistic neurons (and neural systems) and their physiology and dynamics. These models capture the essential features of the biological system at multiple spatial-temporal scales, from membrane currents, proteins, and chemical coupling to network oscillations, columnar and topographic architecture, and learning and memory.

These computational models are used to frame hypotheses that can be directly tested by biological or psychological experiments.

Lots of big words, of course, but hopefully they make some sense after the previous posts. If not, don't despair; what they all hint at is an "interdisciplinary" effort to create biologically plausible models, and to use these to provide insights on how the brain performs certain functions. Think of the Computational Neuroscientist as the right-hand person of the Neuroscientist - the "computer guy" to the "business guy", if you like. The Neuroscientist (particularly the experimental Neuroscientist) gets his or her hands messy with wetware and experiments, which end up providing data and a better biological understanding; the Computational Neuroscientist takes these and uses them to make improved computer models, which are used to test hypotheses or to make new ones, which can then be validated by experiments and so on, in a virtuous feedback loop.2 Where the "interdisciplinary" part comes in is that many of the people playing the role of "computer guys" are actually not computer scientists but instead come from a variety of backgrounds such as biology, physics, chemistry and so on. This variety adds a lot of value to the discipline because the brain is such a complex organ; understanding it requires all kinds of skills - and then some.

It's Models All the Way Down

At the core, then, the work of the Computational Neuroscientist is to create models. Of course, as we have already seen, one does not simply walk into Mordor and start creating the "most biologically plausible" model of the brain possible; all models must have a scope as narrow as possible if they are to be a) understandable and b) computationally feasible. Thus engineering trade-offs are crucial to the discipline.

Also, it is important to understand that creating a model does not always imply writing things from scratch. Instead, most practitioners rely on the wealth of software already available, all with different advantages and disadvantages.

At this juncture you are probably wondering just what exactly are these "models" we speak so much of. Are they just equations like IaF? Well, yes and no. As it happens, all models have roughly the following structure:

  • a morphology definition: we've already spoken a bit about morphology; think of it as the definition of the entities that exist in your model, their characteristics and relationships. This is actually closer to what we computer scientists think the word modeling means. For example, the morphology defines how many neurons you have, how many axons and dendrites, connectivity, spatial positioning and so on.
  • a functional, mathematical or physical definition: I've heard it named in many ways, but fundamentally, what it boils down to is the definition of the equations that your model requires. For example, are you modeling electrical properties or reaction/diffusion?

For the simpler models, the morphology gets somewhat obscured - after all, in LIF there is very little information about a neuron because all we are interested in are the spikes. For other models, a lot of morphological detail is required.
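
To make the split concrete, here is a minimal, hypothetical sketch in Python - not taken from any of the tools discussed below - of a leaky Integrate and Fire "model" in this two-part form. The morphology is little more than a handful of parameters for a single compartment (which is precisely the point made above), and the functional definition is the membrane equation; all names and values are illustrative.

# Morphology: what exists in the model - here, a single passive compartment.
morphology = {
    "diameter_um": 10.0,     # compartment diameter
    "length_um": 10.0,       # compartment length
    "capacitance_nF": 1.0,   # membrane capacitance
}

# Functional definition: the leaky integrate-and-fire membrane equation,
# dV/dt = (-(V - E_rest)/R + I) / C, before threshold and reset are applied.
def dv_dt(v_mV, i_nA, r_Mohm=100.0, e_rest_mV=-65.0):
    c_nF = morphology["capacitance_nF"]
    return (-(v_mV - e_rest_mV) / r_Mohm + i_nA) / c_nF

# At rest with 0.5 nA injected, the membrane starts depolarising at 0.5 mV/ms.
print(dv_dt(-65.0, 0.5))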

The Tooling Landscape

Idealised…

It is important to keep in mind that these models are to be used in a simulation; that is, we are going to run the program for a period of time (hours or days) and observe different aspects of its behaviour. Thus the functional definition of the model provides the equations that describe the dynamics of the system being simulated, and the morphology provides some of the inputs for those equations.

From here one can start to sketch the requirements for a system aimed at the Computational Neuroscientist:

  • a platform of some kind to provide simulation control: starting, stopping, re-running, storing the results and so on (see the sketch after this list). As the simulations can take a long time to run, the data sets can be quite large - in the hundreds-of-gigabytes range - so efficient handling of the output data is a must.
  • some kind of DSL that provides a user-friendly way to define models, ideally with a graphical user interface that helps author the DSL. The DSL must cover the two aspects we mentioned above.
  • efficient libraries of numerical routines to help solve the equations. The libraries must be exposed in some way to the DSL so that users can make use of them when defining the functional aspects of the model.
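
As a rough illustration of what the first of these requirements might look like, here is another hedged Python sketch; the Simulator class and its methods are invented purely for illustration - real platforms are, of course, far richer. It steps a model's dynamics forward in time with a simple numerical scheme and stores the resulting trace for later analysis.

import numpy as np

class Simulator:
    """Toy sketch of simulation control: run the dynamics, keep and store results."""

    def __init__(self, dynamics, v0_mV=-65.0, dt_ms=0.025):
        self.dynamics = dynamics   # functional definition: dV/dt = f(V, t)
        self.v0_mV = v0_mV
        self.dt_ms = dt_ms
        self.trace = None

    def run(self, duration_ms):
        """Step the equations forward in time with a simple forward-Euler scheme."""
        steps = int(duration_ms / self.dt_ms)
        v = np.empty(steps)
        v[0] = self.v0_mV
        for i in range(1, steps):
            v[i] = v[i - 1] + self.dt_ms * self.dynamics(v[i - 1], i * self.dt_ms)
        self.trace = v
        return v

    def save(self, path):
        """Persist the (potentially very large) result set for later analysis."""
        np.save(path, self.trace)

# Usage: a passive leak back towards rest, simulated for one second.
sim = Simulator(lambda v, t: -(v - (-65.0)) / 10.0)
sim.run(duration_ms=1000.0)
sim.save("membrane_trace.npy")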

Architecturally, the ability to use a cluster or GPUs would of course be very useful, but we shall ignore those aspects for now. Given this idealised platform, we can now make a bit more sense of what actually exists in the wild.

… vs Actual

The multidisciplinary nature of Computational Neuroscience poses some challenges when it comes to software development: as mentioned, many of the practitioners in the field do not have a Software Engineering background; of those that do, most tend not to have strong biology and neuroscience backgrounds. As a result, the landscape is fragmented and the quality is uneven. On the one hand, most of the software is open source, making reuse a lot less of a problem. On the other hand, things such as continuous integration, version control, portability, user interface guidelines, organised releases, packaging and so on still lag behind most "regular" Free and Open Source projects3.

In some ways, to enter Computational Neuroscience is a bit like travelling in time to an era before git, before GitHub, before Travis and all the other things we take for granted. Not everywhere, of course, but still in quite a few places, particularly with the older and more popular projects. One cannot help but get the feeling that the field could do with some of the general energy we have in the FOSS community, but the technical barriers to contributing tend to be high since the domain is so complex.

So after all of this boring introductory material, we can finally look at our first system.

NEURON

Having to choose, one feels compelled to start with NEURON - the most venerable of the lot, with roots in the 80s4. NEURON is a simulation environment with great depth of functionality and a comprehensive user manual published as a (non-free) book. For the less wealthy, an overview paper is available, as are many other online resources. The software itself is fully open source, with a public Mercurial repo.

As with many of the older tools in this field, NEURON development has not quite kept pace with the latest and greatest. For instance, it still has a Motif'esque look to its UI but, alas, do not be fooled - it's not Motif but InterViews, a technology I had never heard of, but which seems to have been popular in the 80s and early 90s. One fears that NEURON may just be the last widely used program relying on InterViews - and the fact that they carry their own fork of it does not make me hopeful.

Figure 1: Source: NEURON Cell Builder

However, once one goes past these layers of legacy, the domain functionality of the tool is very impressive. This goes some way to explain why so many people rely on it daily and why so many papers have been written using it - over 600 papers at the last count.

Whilst NEURON is vast, we are particularly interested in only two aspects of it: hoc and mod (in its many incarnations). These are the file formats used to define models.

Hoc

Hoc has a fascinating history and a pedigree to match. It is actually the creation of Kernighan and Pike, two UNIX luminaries, and sits in the same family as tools like bc and dc. NEURON took hoc and extended it both in terms of syntax and in the number of available functions; NEURON's hoc is now an interpreted, object-oriented language, albeit with some limitations such as the lack of inheritance. Programs written in hoc execute in an interpreter called oc. There are a few variations of this interpreter, with different kinds of libraries made available to the user (UI, neuron-modeling-specific functionality, etc.), but the gist is the same, and the strong point is interactive development with rapid feedback. In the GUI versions of the interpreter, the script can specify its UI elements, including input widgets for parameters and widgets to display the output. Hoc thus ends up as a mix of model/view logic and morphological definition language.

To get a feel for the language, here's a very simple sample from the manual:

create soma    // model topology
access soma    // default section = soma

soma {
   diam = 10   // soma dimensions in um
   L = 10/PI   //   surface area = 100 um^2
}

NMODL

The second language supported by NEURON is NMODL - The NEURON extended MODL (Model Description Language). NMODL is used to specify a physical model in terms of equations such as simultaneous nonlinear algebraic equations, differential equations and so on. In practice, there are actually different versions of NMODL for different NEURON versions, but to keep things simple I'll just abstract these complexities and refer to them as one entity5.

As intimated above, NMODL is a descendant of MODL. As with Hoc, the history of MODL is quite interesting; it was a language defined by the National Biomedical Simulation Resource to specify models for use with SCoP - the Simulation Control Program6. From what I can gather of SCoP, its main purpose was to make life easier when creating simulations, providing an environment where users could focus on what they were trying to simulate rather than on nitty-gritty, implementation-specific details.

NMODL took MODL syntax and extended it with the primitives required by its domain; for instance, it added the NEURON block to the language, which allows multiple instances of "entities". As with MODL, NMODL is translated into efficient C code and linked against supporting libraries that provide the numerics; the NMODL-to-C translator also had to take into account the requirement of linking against NEURON's libraries rather than SCoP's.

Below is a snippet of NMODL code, copied from the NEURON book (chapter 9, listing 9.1):

NEURON {
  SUFFIX leak
  NONSPECIFIC_CURRENT i
  RANGE i, e, g
}

PARAMETER {
  g = 0.001  (siemens/cm2)  < 0, 1e9 >
  e = -65    (millivolt)
}

ASSIGNED {
  i  (milliamp/cm2)
  v  (millivolt)
}

NMODL and hoc are used together to form a model: hoc provides the UI, parameters and morphology, and NMODL provides the physical modeling. The website ModelDB provides a database of models for a variety of platforms, with the main objective of making research reproducible. Here you can see an example of a production NEURON model in its full glory, with a mix of hoc and NMODL files - as well as a few others such as session files, which we can ignore for our purposes.

Thoughts

NEURON is more or less a standard in Computational Neuroscience - together with a few other tools such as GENESIS, which we shall cover later. Embedded deep in its source code is domain logic learned painstakingly over several decades. Whilst, software-engineering-wise, it is creaking at the seams, finding a next-generation heir will be a non-trivial task given the features of the system, the number of models that exist out there, and the knowledge and large community that uses it.

Because of this, the solution many next-generation tools have adopted is to use NEURON as a backend: they provide a shiny modern frontend and then generate the hoc and NMODL required by NEURON. This is then executed in a NEURON environment and the results are sent back to the user for visualisation and processing with modern tools. Le Roi Est Mort, Vive Le Roi!
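
As a toy illustration of this pattern - entirely hypothetical, and far simpler than anything a real frontend does - the Python sketch below turns a high-level description of a cell into hoc source, which could then be handed to NEURON for execution.

def to_hoc(cell):
    """Emit hoc source for a single-compartment cell description (toy example)."""
    name = cell["name"]
    return "\n".join([
        f"create {name}",
        f"access {name}",
        f"{name} {{",
        f"   diam = {cell['diam_um']}",
        f"   L = {cell['length_um']}",
        "}",
    ])

# Mirrors the hoc sample shown earlier; the generated text would be fed to NEURON.
print(to_hoc({"name": "soma", "diam_um": 10.0, "length_um": 3.18}))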

Conclusions

In this first part we've outlined what Computational Neuroscience is all about, what we mean by a model in this context and what services one can expect from a platform in this domain. We also covered the first of such platforms. Tune in for the next instalment where we'll cover more platforms.

Footnotes:

1. I still owe you the final post of that series, coming out soon, hopefully.

2. Of course, once you scratch the surface, things get a bit murkier. Erik De Schutter states:

[…] The term is often used to denote theoretical approaches in neuroscience, focusing on how the brain computes information. Examples are the search for “the neural code”, using experimental, analytical, and (to a limited degree) modeling methods, or theoretical analysis of constraints on brain architecture and function. This theoretical approach is closely linked to systems neuroscience, which studies neural circuit function, most commonly in awake, behaving intact animals, and has no relation at all to systems biology. […] Alternatively, computational neuroscience is about the use of computational approaches to investigate the properties of nervous systems at different levels of detail. Strictly speaking, this implies simulation of numerical models on computers, but usually analytical models are also included […], and experimental verification of models is an important issue. Sometimes this modeling is quite data driven and may involve cycling back and forth between experimental and computational methods.

3. This is a problem that has not gone unnoticed; for instance, this paper provides an interesting and thorough review of the state of affairs in Computational Neuroscience: Current practice in software development for computational neuroscience and how to improve it. In particular, it explains the dilemmas faced by the maintainers of neuroscience packages.

4. The early story of NEURON is available here; see also the scholarpedia page.

5. See the NMODL page for details, in the history section.

6. As far as I can see, in the SCoP days MODL was just called the SCoP Language, but as the related paper is behind a paywall I can't prove it either way. Paper: SCoP: An interactive simulation control program for micro- and minicomputers, from Springer.
