Tuesday, February 14, 2012

A good null model is hard to find

Ecologists have always found the question of how communities assemble to be of great interest. However, studies of community assembly are often thwarted by the large temporal and spatial scales over which processes occur, making experimental tests of assembly theory difficult. As a result, researchers are often forced to rely on observational data and make inferences about the mechanisms at play from patterns alone. While historical assembly research focused on inferring evidence of competition or environmental filtering from patterns of species co-occurrence, more recent research often looks at patterns of phylogenetic or trait similarity in a community to answer these questions. 

Not surprisingly, when methods rely heavily on observational data they are open to criticism: one of the most important outcomes of the early community assembly literature was the recognition that patterns that appeared to support a hypothesis about competition or environmental filtering could in fact result from random chance. This ultimately led to the widespread incorporation of null models, which are meant to simulate patterns that might be observed by random chance (or other processes not under study), against which the observed data can be compared. Patterns of functional and phylogenetic information in communities can also be compared against null expectations to check that apparent phylogenetic or functional over- or under-dispersion could not have arisen by chance alone. However, while null models are an important tool in assembly research, they are sometimes treated as the foolproof solution to all of its problems.

In a new paper by Francesco de Bello, the author states frankly “whilst reading null-model methods applied in the literature (indeed including my work), one may have the impression of reading a book of magic spells”. While null models are increasingly sophisticated, allowing researchers to determine which processes to control for and which to leave out, de Bello suggests that the decision to include or omit particular factors from a null model can be unclear, making it difficult to interpret results or compare results across studies. Further, results from null models may not mean what researchers expect them to mean.

Using the example of functional diversity (FD; variation in trait values among species in a community), de Bello illustrates how null models may have different meanings than expected. Ideally, a null model for FD should produce random values of FD, against which the observed values of FD can be compared. Interpreting the difference between the observed and random results can be done using the standardized effect size (SES: the difference between the observed FD and the mean of the randomly generated FD values, divided by the standard deviation of the random values). SES values >0 show that traits are more divergent than expected by chance, suggesting competition structures communities. If SES<0, traits are more convergent than expected by chance, suggesting environmental conditions structure communities. Finally, if SES ~0, then trait values aren’t different from random. However, de Bello shows that the SES is driven by the observed FD values, because the ‘random’ FD values are dependent on the pool of observations sampled. This means that the values the null model produces ultimately depend on the observed values, despite the fact that you plan to make inferences by comparing the null and observed values as though they were independent.

For example, consider the situation where you are building a null model of community structure for plant communities found along two vegetation belts. If the null model is constructed using all the plant communities, regardless of the habitat they are found in, the resulting null FD value will be higher, since dissimilar species from different vegetation belts are being randomly placed together in a community. If null models are constructed separately for each vegetation belt, the null FD value is lower, since the species within a belt are more similar. The magnitude of the difference between the null model and the observed values, and hence the biological conclusions one would draw from the study, would therefore depend on which null model was constructed.
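
The pool dependence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not de Bello's actual procedure: FD is taken here to be the mean pairwise difference in a single trait, the null model simply draws species at random from the pool at the observed richness, and the trait values for the two vegetation belts are invented for the example.

```python
import random
import statistics
from itertools import combinations


def functional_diversity(traits):
    """Mean pairwise absolute trait difference (one simple FD index, assumed here)."""
    pairs = list(combinations(traits, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def ses(observed_traits, pool_traits, n_random=999, seed=0):
    """Return (SES, mean null FD) for a community against a given species pool.

    SES = (observed FD - mean of null FD) / standard deviation of null FD.
    Null communities are random draws from the pool at the observed richness.
    """
    rng = random.Random(seed)
    richness = len(observed_traits)
    null_fd = [
        functional_diversity(rng.sample(pool_traits, richness))
        for _ in range(n_random)
    ]
    null_mean = statistics.mean(null_fd)
    effect = (functional_diversity(observed_traits) - null_mean) / statistics.stdev(null_fd)
    return effect, null_mean


# Hypothetical trait values for the species found in each vegetation belt.
belt_a = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
belt_b = [11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5]

# One observed community from belt A, spanning that belt's trait range.
community = [1.0, 2.5, 4.5]

ses_separate, null_separate = ses(community, belt_a)           # pool = belt A only
ses_combined, null_combined = ses(community, belt_a + belt_b)  # pool = both belts

print(f"belt A pool:   null FD = {null_separate:.2f}, SES = {ses_separate:.2f}")
print(f"combined pool: null FD = {null_combined:.2f}, SES = {ses_combined:.2f}")
```

With these made-up numbers the same community looks divergent (SES above zero) against its own belt's pool and convergent (SES below zero) against the combined pool, purely because the combined pool inflates the null FD.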

Figure from de Bello (2012), illustrating how combining species pools (right) can lead to entirely different conclusions about whether communities are convergent or divergent in terms of traits than when the pools are considered separately (left, centre).

De Bello’s findings make important points about the limitations of null models, particularly for functional diversity, but likely for other types of response variables as well. The type of null model he explores is relatively simple (reshuffling of species among sites), and the suggestion that the species pool affects the null model is not new (Shipley & Weiher, 1995). However, even sophisticated and complex null models need to be biologically relevant and interpretable, and null models are still frequently used incorrectly. Although he mentions it only briefly, de Bello also notes another problem with studies of community assembly, which is that popular indices like FD, PD, and others may not always be able to distinguish correctly between different assembly mechanisms (Mouchet et al. 2010; Mayfield & Levine, 2010), something that null models do not control for.