Research interests

Modeling species distribution in space and time is a key challenge for conservation. Typically, such information is crucial to identify essential habitats or biodiversity hotspots and to design protected areas.

Thanks to citizen science and declarative data, a massive amount of data is becoming available to address these challenges at an unprecedented spatio-temporal resolution.

My research seeks to develop spatio-temporal statistical methods to integrate these datasets for mapping species distribution in space and time.

Integrating all these datasets raises strong methodological issues. Typically, combining multiple and heterogeneous datasets in a single statistical framework requires to account for potential sampling bias (opportunistic/preferential sampling). Data are also often available at a rough scale, while the process under study need to be available at a fine scale (change of support or change of scale problem).

When all the data are integrated, they give access to a huge amount of information on species distribution (i.e. information available over large spatial domains and long time period). Providing a low rank representation of the process of interest is pivotal to identify the main mechanisms shaping species spatio-temporal distribution (dimension-reduction).

In the few sections below, I present the different axes of my research work. I first present the methodological bottlenecks that need to be addressed before integrating the datasets (preferential sampling, change of support). I then introduce a model to integrate data while accounting for preferential sampling and change of support. I finally deal with dimension-reduction methods for spatio-temporal data.

Abundance spatio-temporal predictions of an integrated population dynamics model (Olmos et al., 2023)

Handling preferential sampling

Inferring species spatio-temporal distribution can be done through diverse source of data some of which do not necessarily follow a standardized sampling plan (opportunistic sampling). In some cases, the sampling agents target areas where they know the variable under study is high. The typical example is the case of fishermen that target areas where fish biomass is higher. In such case, the sampling is called preferential. Developing methods that handle preferential sampling is a critical issue to provide unbiased predictions of the process under study.

Since my PhD, I have been developing an integrated spatio-temporal hierarchical model to map fish species distribution. It accounts for preferential sampling to integrate massive commercial catch declaration data and standardized fine-scale scientific survey data.

You’ll find some description of this approach in the following references:

Handling change of support

In addition to preferential sampling, these massive datasets are often available at a rough spatio-temporal scale while the process of interest (species distribution) must be inferred at a fine scale. This is typically a change of support problem (or change of scale problem). Spatio-temporal statistical models need to account for change of support to provide reliable predictions of the process under study.

I developed a statistical approach handling change of support for complex ecological data (zero-inflated data with heavy tails). It allows to combine (massive) areal-level data and point-level data to map species distribution. It has been applied on commercial catch declarations data (defined at the scale of 0.5° x 1° ICES rectangles) to predict fish distribution at a fine scale.

Integrating the different data sources

Once the different sources of bias have been identified, all the datasets can be combined in a single hierarchical spatio-temporal model.

I’m currently working on an integrated hierarchical spatio-temporal model that accounts for both change of support and preferential sampling. It is the basis for a generic integrated hierarchical model aiming at combining heterogeneous datasets for ecological and environmental applications.

Conceptual diagram of the hierarchical model

Dimension reduction for spatio-temporal data

Once integrated, all these data sources give access to a huge amount of spatial information over long time series. Methods to reduce the dimension of these data are critical to analyse and interpret such information. Empirical Orthogonal Functions (EOFs) are the keystone method for reducing the dimension of spatio-temporal data.

In my research, I intend to adapt these methods to address specific ecological issues (e.g. extend these methods to the multivariate case, set constrains on EOFs to ensure ecological interpretability).

Here are some references to my work:

  • Alglave et al. (under review) for an introduction to the basics of EOFs and some statistical development of EOFs to address specific ecological issues.

  • Alglave et al. (2024) for an application of EOFs to identify essential habitats from a spatio-temporal species distribution model.

EOFs results for common sole in the Bay of Biscay between 2008 and 2018