Wednesday, August 26, 2015

Science is a maze

If you want to truly understand how scientific progress works, I suggest fitting mathematical models to dynamical data (i.e. population or community time series) for a few days.
A map for science?

You were probably told sometime early on about the map for science: the scientific method. It was probably displayed for your high school class as a tidy flowchart showing how a hypothetico-deductive approach allows scientists to solve problems. Scientists make observations about the natural world, gather data, and come up with a possible explanation or hypothesis. They then deduce the predictions that follow, and design experiments to test those predictions. If you falsify the predictions, you then circle back and refine, alter, or eventually reject the hypothesis. Scientific progress arises from this process. Sure, you might adjust your hypothesis a few times, but progress is direct and straightforward. Scientists aren't shown getting lost.

Then, once you actively do research, you realize that the formulation-reformulation process dominates. But because for most applications the formulation-reformulation process is slow – that is, each component takes time (e.g. weeks or months to redo experiments and analyses and work through reviews) – you only go through that loop a few times. So you usually still feel like you are making progress and moving forward.

But if you want to remind yourself just how twisting and meandering science actually is, spend some time fitting dynamic models. Thanks to Ben Bolker's indispensable book, this also comes with a map, which shows how closely the process of model fitting mirrors the scientific method. The modeller has some question they wish to address, and experimental or observational data they hope to use to answer it. By fitting or selecting the best model for their data, they can obtain estimates for different parameters and so hopefully test predictions from their hypothesis. Or so one naively imagines.
From Bolker's Ecological Models and Data in R, a map for model selection.
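In that idealized version, a single pass from data to parameter estimates might look something like the sketch below. The data, the choice of a logistic model, and the scipy-based fit are all illustrative assumptions, not anything from the post or from Bolker's book:

```python
# A single naive pass: fit a logistic growth model to a (hypothetical)
# population time series by maximum likelihood, assuming Gaussian errors.
import numpy as np
from scipy.optimize import minimize

t_obs = np.arange(10)                                            # census times
n_obs = np.array([5, 9, 17, 28, 42, 55, 66, 72, 76, 79], float)  # abundances (made up)

def logistic(t, n0, r, K):
    """Deterministic logistic growth trajectory."""
    return K / (1 + (K / n0 - 1) * np.exp(-r * t))

def neg_log_lik(params):
    """Negative log-likelihood under Gaussian observation error."""
    n0, r, K, sigma = params
    if min(n0, r, K, sigma) <= 0:
        return np.inf
    pred = logistic(t_obs, n0, r, K)
    return np.sum(0.5 * np.log(2 * np.pi * sigma**2)
                  + (n_obs - pred)**2 / (2 * sigma**2))

fit = minimize(neg_log_lik, x0=[5, 0.5, 80, 5], method="Nelder-Mead")
n0_hat, r_hat, K_hat, sigma_hat = fit.x
print(f"r = {r_hat:.2f}, K = {K_hat:.1f}")   # estimates to confront with the hypothesis
```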
The reality, however, is much more byzantine, as captured well by Vellend (2010):
“Consider the number of different models that can be constructed from the simple Lotka-Volterra formulation of interactions between two species, layering on realistic complexities one by one. First, there are at least three qualitatively distinct kinds of interaction (competition, predation, mutualism). For each of these we can have either an implicit accounting of basal resources (as in the Lotka-Volterra model) or we can add an explicit accounting in one particular way. That gives six different models so far. We can then add spatial heterogeneity or not (x2), temporal heterogeneity or not (x2), stochasticity or not (x2), immigration or not (x2), at least three kinds of functional relationship between species (e.g., predator functional responses, x3), age/size structure or not (x2), a third species or not (x2), and three ways the new species interacts with one of the existing species (x3 for the models with a third species). Having barely scratched the surface of potentially important factors, we have 2304 different models. Many of them would likely yield the same predictions, but after consolidation I suspect there still might be hundreds that differ in ecologically important ways.”
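The arithmetic in that quote does check out; a quick tally (purely for illustration) shows how fast the factors compound:

```python
# Tallying Vellend's count: the factors in the quote multiply out to 2304.
interaction_types = 3   # competition, predation, mutualism
resources = 2           # implicit vs. explicit basal resources
spatial = 2             # spatial heterogeneity or not
temporal = 2            # temporal heterogeneity or not
stochasticity = 2       # stochastic or not
immigration = 2         # immigration or not
functional_forms = 3    # e.g. predator functional responses
structure = 2           # age/size structure or not

two_species = (interaction_types * resources * spatial * temporal
               * stochasticity * immigration * functional_forms * structure)
total = two_species + two_species * 3   # add a third species, interacting in 3 ways
print(two_species, total)               # 576 2304
```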
Model fitting/selection can actually be (speaking for myself, at least) repetitive and frustrating and filled with wrong turns and dead ends. And because you can make so many loops between formulation and reformulation, and the time penalty is relatively low, you experience just how many possible paths forward there are to be explored. It's easy to get lost and forget which models you've already looked at, and keeping detailed notes/logs/version control is fundamental. And since time and money aren't (as) limiting, it is hard to know/decide when to stop - no model is perfect. When it's possible to so fully explore the path from question to data, you get to suffer through realizing just how complicated and uncertain that path actually is.
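Even something as simple as a running log of every candidate you have fit, and why it was kept or dropped, goes a long way. A sketch of what I mean, with entirely made-up model names and scores:

```python
# A low-tech running log of candidate models, so you don't refit the same
# dead ends twice. The model names and scores below are purely illustrative.
model_log = []

def log_model(name, n_params, neg_log_lik, notes=""):
    """Record a fitted candidate with its AIC and a note on why it lived or died."""
    aic = 2 * n_params + 2 * neg_log_lik
    model_log.append({"model": name, "k": n_params, "AIC": round(aic, 1), "notes": notes})

log_model("density-independent, deterministic", 2, 154.3, "misses the late censuses")
log_model("Ricker + demographic stochasticity", 4, 141.7, "residuals acceptable")

for entry in sorted(model_log, key=lambda e: e["AIC"]):
    print(entry)
```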
What model fitting feels like?

Bolker hints at this (but without the angst):
“modeling is an iterative process. You may have answered your questions with a single pass through steps 1–5, but it is far more likely that estimating parameters and confidence limits will force you to redefine your models (changing their form or complexity or the ecological covariates they take into account) or even to redefine your original ecological questions.”
I bet there are other processes that share this endless, frustrating ability to consider every possible connection between question and data (building a phylogenetic tree, designing a simulation?). And I think that is what science is like on a large temporal and spatial scale too. For any question or hypothesis, there are multiple labs contributing bits and pieces and manipulating slightly different combinations of variables, pushing and pulling the direction of science back and forth, trying to find a path forward.

(As you may have guessed, I spent far too much time this summer fitting models…)

4 comments:

Hans Castorp said...

Great post. In science there are in fact two principles. The first is the well known principle of falsification of hypotheses; the other, much less appreciated, is parsimony or Occam's razor. Often you have more than one hypothesis which agrees with the data, and you must choose the most parsimonious (in statistical textbooks this is often stated as "biologically meaningful").

Caroline Tucker said...

Definitely! And parsimony is fundamental for model-fitting-type approaches.
But it's funny how much science classes focus on the falsification approach. For example, I was looking at the second year ecology lab manual for my department and it presents exactly the first figure (scientific method flowchart) I have above. Popper's influence is pretty strong!

Ben Haller said...

And then there's the really tricky part: if you're trying all of these different models and successively iterating towards something that seems to work, how do you handle the implicit multiple comparisons problem inherent in that process? How do you avoid p-hacking, data-dredging, or whatever term you prefer for what is basically the same phenomenon – searching through model space until you find a model that happens to give you a significant result? When I took stats, I was told that one ought to choose one's whole statistical analysis ahead of time, including all of the "planned comparisons" one would do, to avoid any such issues. But in the real world, that's a total fantasy, as this post illustrates. And so even the most well-meaning scientists seem to descend into the domain of Bad Statistics and Irreproducible Results. How to avoid that?

Caroline Tucker said...

Great points, and these are some of the ongoing conversations I've had with my collaborator (Brett Melbourne) on model fitting.

Model fitting is really about developing mechanistic descriptions of ecological data/patterns. If anything, its limitations have more in common with those of AIC approaches - you can still end up choosing an objectively poor model if all the models you looked at were bad and it just happened to be the best of the bunch.

Two additional points - one major difference from p-hacking (where you might look at every possible variable hoping for some significant result) is that model fitting is about parsimony and eliminating potential models. If, for example, you determine that a growth rate is density-independent, then you can eliminate the possible models where the rate is density-dependent - hundreds of models eliminated at once. (The number of models you can look at always sounds huge, but this is simply because the number of combinations of a few reasonable mechanisms adds up quickly. Bolker makes the point that this technique works poorly for generalized "fishing expeditions" where you have measured lots of variables and want to try to sort them out.)
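Here's a toy sketch of that elimination step - hypothetical data, with a Ricker-style model standing in for the density-dependent alternatives; none of it is from our actual analysis:

```python
# Toy version of the elimination step: fit a density-independent model and a
# simple density-dependent (Ricker-style) alternative to hypothetical data,
# then compare them by AIC.
import numpy as np
from scipy.optimize import minimize

n_t = np.array([12, 15, 19, 25, 31, 40, 52, 66], float)  # abundance time series
growth = np.log(n_t[1:] / n_t[:-1])                       # observed per-capita growth

def neg_log_lik(params, density_dependent):
    if density_dependent:
        r, b, sigma = params
        pred = r - b * n_t[:-1]           # growth declines with density
    else:
        r, sigma = params
        pred = np.full(len(growth), r)    # constant growth, density-independent
    if sigma <= 0:
        return np.inf
    return np.sum(0.5 * np.log(2 * np.pi * sigma**2)
                  + (growth - pred)**2 / (2 * sigma**2))

fit_di = minimize(neg_log_lik, [0.3, 0.1], args=(False,), method="Nelder-Mead")
fit_dd = minimize(neg_log_lik, [0.3, 0.001, 0.1], args=(True,), method="Nelder-Mead")
print(f"AIC, density-independent: {2 * 2 + 2 * fit_di.fun:.1f}")
print(f"AIC, density-dependent:   {2 * 3 + 2 * fit_dd.fun:.1f}")
# If the simpler model wins, every candidate built on this form of density
# dependence can be set aside without fitting it.
```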

The second is that model fitting shouldn't stop at identifying the 'best' model; there is a surprisingly large role for human oversight - you have to examine the fit, residuals, and predictions of the model and determine if these are acceptable given the data and your intuitions about the biology. Model fitting tends to be exploratory, and dependent on the willingness of the practitioner to question and test the results. (But this should be true of any statistical analysis.)
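As a sketch of that last check, with placeholder observed and predicted values standing in for a real fit:

```python
# The kind of after-the-fact check meant here: look at the residuals of the
# chosen model before trusting it. Observed and predicted values below are
# placeholders, not output from a real analysis.
import numpy as np

t = np.arange(10)
observed  = np.array([5, 9, 17, 28, 42, 55, 66, 72, 76, 79], float)
predicted = np.array([6, 10, 16, 26, 39, 53, 64, 71, 77, 80], float)
residuals = observed - predicted

trend = np.polyfit(t, residuals, 1)[0]   # systematic drift in the residuals?
print(f"mean residual: {residuals.mean():.2f}, trend per time step: {trend:.2f}")
# Runs of same-signed residuals or spreading variance are a cue to go back and
# reformulate, not noise to average away.
```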