Introduction
Nathalie PEYRARD, Stéphane ROBIN and Olivier GIMENEZ
Chapter 1. Trajectory Reconstruction and Behavior Identification Using Geolocation Data
Marie-Pierre ETIENNE and Pierre GLOAGUEN
1.1. Introduction
1.1.1. Reconstructing a real trajectory from imperfect observations
1.1.2. Identifying different behaviors in movement
1.2. Hierarchical models of movement
1.2.1. Trajectory reconstruction model
1.2.2. Activity reconstruction model
1.3. Case study: masked booby, Sula dactylatra (originals)
1.3.1. Data
1.3.2. Projection
1.3.3. Data smoothing
1.3.4. Identification of different activities through movement
1.3.5. Results
1.4. References
Chapter 2. Detection of Eco-Evolutionary Processes in the Wild: Evolutionary Trade-Offs Between Life History Traits
Valentin JOURNÉ, Sarah CUBAYNES, Julien PAPAÏX and Mathieu BUORO
2.1. Context
2.2. The correlative approach to detecting evolutionary trade-offs in natural settings: problems
2.2.1. Mechanistic and statistical modeling as a means of accessing hidden variables
2.3. Case study
2.3.1. Costs of maturing and migration for survival: a theoretical approach
2.3.2. Growth/reproduction trade-off in trees
2.4. References
Chapter 3. Studying Species Demography and Distribution in Natural Conditions: Hidden Markov Models
Olivier GIMENEZ, Julie LOUVRIER, Valentin LAURET and Nina SANTOSTASI
3.1. Introduction
3.2. Overview of HMMs
3.3. HMM and demography
3.3.1. General overview
3.3.2. Case study: estimating the prevalence of dog-wolf hybrids with uncertain individual identification
3.4. HMM and species distribution
3.4.1. General case
3.4.2. Case study: estimating the distribution of a wolf population with species identification errors and heterogeneous detection
3.5. Discussion
3.6. Acknowledgments
3.7. References
Chapter 4. Inferring Mechanistic Models in Spatial Ecology Using a Mechanistic-Statistical Approach
Julien PAPAÏX, Samuel SOUBEYRAND, Olivier BONNEFON, Emily WALKER, Julie LOUVRIER, Etienne KLEIN and Lionel ROQUES
4.1. Introduction
4.2. Dynamic systems in ecology
4.2.1. Temporal models
4.2.2. Spatio-temporal models without reproduction
4.2.3. Spatio-temporal models with reproduction
4.2.4. Numerical solution
4.3. Estimation
4.3.1. Estimation principle
4.3.2. Parameter estimation
4.3.3. Estimation of latent processes
4.3.4. Mechanistic-statistical models
4.4. Examples
4.4.1. The COVID-19 epidemic in France
4.4.2. Wolf (Canis lupus) colonization in southeastern France
4.4.3. Estimating dates and locations of the introduction of invasive strains of watermelon mosaic virus
4.5. References
Chapter 5. Using Coupled Hidden Markov Chains to Estimate Colonization and Seed Bank Survival in a Metapopulation of Annual Plants
Pierre-Olivier CHEPTOU, Stéphane CORDEAU, Sebastian LE COZ and Nathalie PEYRARD
5.1. Introduction
5.2. Metapopulation model for plants: introduction of a dormant state
5.2.1. Dependency structure in the model
5.2.2. Distributions defining the model
5.2.3. Parameterizing the model
5.2.4. Linking the parameters of the model with the ecological parameters of the dynamics of an annual plant
5.2.5. Estimation
5.2.6. Model selection
5.3. Dynamics of weed species in cultivated parcels
5.3.1. Dormancy and weed management in agroecosystems
5.3.2. Description of the data set
5.3.3. Comparison with an HMM with independent patches
5.3.4. Influence of crops on weed dynamics
5.4. Discussion and conclusion
5.5. Acknowledgments
5.6. References
Chapter 6. Using Latent Block Models to Detect Structure in Ecological Networks
Julie AUBERT, Pierre BARBILLON, Sophie DONNET and Vincent MIELE
6.1. Introduction
6.2. Formalism
6.3. Probabilistic mixture models for networks
6.3.1. SBMs for unipartite networks
6.3.2. Stochastic block model for bipartite networks
6.4. Statistical inference
6.4.1. Estimation of parameters and clustering
6.4.2. Model selection
6.5. Application
6.5.1. Food web
6.5.2. A bipartite plant-pollinator network
6.6. Conclusion
6.7. References
Chapter 7. Latent Factor Models: A Tool for Dimension Reduction in Joint Species Distribution Models
Daria BYSTROVA, Giovanni POGGIATO, Julyan ARBEL and Wilfried THUILLER
7.1. Introduction
7.2. Joint species distribution models
7.3. Dimension reduction with latent factors
7.4. Inference
7.5. Ecological interpretation of latent factors
7.6. On the interpretation of JSDMs
7.7. Case study
7.7.1. Introduction of the dataset
7.7.2. R package used
7.7.3. Implementation and convergence diagnosis
7.7.4. Results and discussion
7.8. Conclusion
7.9. References
Chapter 8. The Poisson Log-Normal Model: A Generic Framework for Analyzing Joint Abundance Distributions
Julien CHIQUET, Marie-Josée CROS, Mahendra MARIADASSOU, Nathalie PEYRARD and Stéphane ROBIN
8.1. Introduction
8.2. The Poisson log-normal model
8.2.1. The model
8.2.2. Inference method
8.2.3. Dimension reduction
8.2.4. Inferring networks of interaction
8.3. Data analysis: marine species
8.3.1. Description of the data
8.3.2. Effects due to site and date
8.3.3. Dimension reduction
8.3.4. Inferring ecological interactions
8.4. Discussion
8.5. Acknowledgments
8.6. References
Chapter 9. Supervised Component-Based Generalized Linear Regression: Method and Extensions
Frédéric MORTIER, Jocelyn CHAUVET, Catherine TROTTIER, Guillaume CORNU and Xavier BRY
9.1. Introduction
9.2. Models and methods
9.2.1. Supervised component-based generalized linear regression
9.2.2. Thematic supervised component-based generalized linear regression (THEME-SCGLR)
9.2.3. Mixed SCGLR
9.3. Case study: predicting the abundance of 15 common tree species in the forests of Central Africa
9.3.1. The SCGLR method: a direct approach
9.3.2. THEME-SCGLR: improved characterization of predictive components
9.3.3. Mixed-SCGLR: taking account of the concession effect
9.4. Discussion
9.5. References
Chapter 10. Structural Equation Models for the Study of Ecosystems and Socio-Ecosystems
Fabien LAROCHE, Jérémy FROIDEVAUX, Laurent LARRIEU and Michel GOULARD
10.1. Introduction
10.1.1. Ecological background
10.1.2. Methodological problem
10.1.3. Case study: biodiversity in a managed forest
10.2. Structural equation model
10.2.1. Hypotheses and general structure of an SEM
10.2.2. Likelihood and estimation in an SEM
10.2.3. Fit quality and nested SEM tests
10.3. Case study: biodiversity in managed forests
10.3.1. Preliminary steps
10.3.2. Evaluating the measurement model alone
10.3.3. Evaluating the relational model
10.3.4. Significance of parameters in the relational model
10.3.5. Findings
10.4. Discussion
10.4.1. A confirmatory approach
10.4.2. Gaussian framework
10.4.3. Centered-reduced observed variables
10.4.4. Structural constraints
10.4.5. Use of resampling
10.5. Acknowledgments
10.6. References
List of Authors
Index
Nathalie PEYRARD¹, Stéphane ROBIN² and Olivier GIMENEZ³
¹ University of Toulouse, INRAE, UR MIAT, Castanet-Tolosan, France
² Paris-Saclay University, AgroParisTech, INRAE, UMR MIA-Paris, France
³ CEFE, University of Montpellier, CNRS, EPHE, IRD, Paul Valéry Montpellier 3 University, France
Ecology is the study of living organisms in interaction with their environment. These interactions occur at the individual level (an animal, a plant), at the level of groups of individuals (a population, a species) or across several species (a community). Statistics provides tools to study these interactions, enabling us to collect, organize, present and analyze data on ecological systems and to draw conclusions from them. However, some components of these ecological systems may escape observation: these are known as hidden variables. This book is devoted to models incorporating hidden variables in ecology and to statistical inference for these models.
The hidden variables studied throughout this book can be grouped into three classes, corresponding to three types of questions that can be asked about an ecological system. We may wish to identify groups of individuals or species, such as groups of individuals with the same behavior or similar genetic profiles, or groups of species that interact with the same species or with their environment in a similar way. Alternatively, we may wish to study variables that can only be observed in a "noisy" form, often called a "proxy". For example, the presence of certain species may be missed because of detection difficulties or errors (confusion with another species), or the data may be "noisy" because of technology-related measurement errors. Finally, in the context of data analysis, we may wish to reduce the dimension of the information contained in data sets to a small number of explanatory variables. Note the shift from the notion of a variable that escapes observation, in the first two cases, to a more general notion of hidden variables.
All three of these problems can be translated into questions of inference concerning variables which, in statistical terms, are said to be latent. Inference poses statistical problems that require specific methods, described in detail here. The ecological interpretation of these variables will also be discussed at length. As we shall see, while the statistical treatment of these variables may be complex, their inclusion in models is essential in providing us with a better understanding of ecological systems.
The term "hidden variable", widely used in ecology, corresponds to the more general notion of latent variables in statistical modeling. This notion encompasses several situations and goes beyond the idea of unobservable physical variables alone. In statistics, a latent variable is generally defined as a variable of interest that is not observable, does not necessarily have a physical meaning, and whose value must be deduced from observations. More precisely, latent variables have two characteristic features: (i) their number is comparable to the number of data items, whereas parameters are far fewer. Consider, for example, the case of a hidden Markov chain, where the number of observed variables and the number of latent variables are both equal to the number of observation time steps; (ii) if their values were known, then model parameter estimation would be easier. For example, consider the estimation of the parameters of a mixture model when the group of each individual is known.
In practice, if a latent variable has a physical reality but cannot be observed in the field (e.g. the precise trajectory of an animal, or the abundance of a seed bank), it is often referred to as a hidden variable (although the two terms are often used interchangeably). In other cases, the latent variable naturally plays a role in the description of a given process or system, but has no physical existence. This is the case, for example, for latent variables corresponding to a classification of observations into different groups. We will refer to them as fictitious variables. Finally, latent variables may also play an instrumental role in describing a source of variability in observations that cannot be explained by known covariates, or in establishing a concise description of a dependency structure. They may result from a dimension reduction operation applied to a group of explanatory variables in the context of regression, as with the principal components of a principal component analysis.
The notion of latent variables is connected to that of hierarchical models: the elements in the higher levels of the model that are not parameters are latent variables. It is important to note that the notion of latent variables may be extended to cover the case of deterministic quantities (represented by a constant in a model). For example, this holds true in cases where the latent variable is the trajectory of an ordinary differential equation (ODE) for which only noisy observations are available.
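Point (ii) can be made concrete with the mixture model. The following is a minimal sketch, not taken from the book: the symbols $y_i$ (observations), $z_i$ (group labels), $\pi_k$ (group proportions), $f(\cdot\,; \gamma_k)$ (within-group densities) and $\theta = (\pi, \gamma)$ are notational assumptions introduced for illustration.

```latex
\begin{align*}
% Labels z_i known: the log-likelihood splits into one simple term per group.
\log p(y, z; \theta) &= \sum_{i=1}^{n} \sum_{k=1}^{K} \mathbb{1}\{z_i = k\}
    \left[ \log \pi_k + \log f(y_i; \gamma_k) \right], \\
% Labels z_i unknown: the sum over groups sits inside the logarithm.
\log p(y; \theta) &= \sum_{i=1}^{n} \log \left( \sum_{k=1}^{K} \pi_k \, f(y_i; \gamma_k) \right).
\end{align*}
```

In the first expression each group can be fitted separately from its own observations; in the second, the logarithm of a sum couples all the parameters, which is why the unobserved labels make estimation harder.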
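As an illustration of a deterministic latent quantity, here is a minimal sketch assuming a logistic growth ODE, an explicit Euler scheme and Gaussian observation noise. The model, the parameter values and the function names are all hypothetical choices made for this example; none of them come from the book.

```python
import random


def logistic_trajectory(n0, r, k, dt, n_steps):
    """Euler integration of dN/dt = r * N * (1 - N / K) (assumed model)."""
    traj = [n0]
    for _ in range(n_steps):
        n = traj[-1]
        traj.append(n + dt * r * n * (1.0 - n / k))
    return traj


def observe(traj, sigma, rng):
    """Add independent Gaussian noise to each point of the trajectory."""
    return [n + rng.gauss(0.0, sigma) for n in traj]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Latent variable: the deterministic ODE trajectory, never observed directly.
latent = logistic_trajectory(n0=10.0, r=0.5, k=100.0, dt=0.1, n_steps=200)
# Data: noisy measurements of that trajectory.
observed = observe(latent, sigma=5.0, rng=rng)
```

Here `latent` contains no randomness at all, yet it plays the role of a latent variable because only `observed` would be available to the analyst.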
Some of the most common examples of statistical models featuring latent variables are described here.
Mixture models are used to define a small number of groups into which a set of observations may be sorted. In this case, the latent variables are discrete variables indicating which group each observation belongs to.
Stochastic block models (SBMs) and latent block models (LBMs, or bipartite SBMs) are specific forms of mixture models used when the observations take the form of a network.
Hidden Markov models (HMMs) are often used to analyze data collected over a period of time (such as the trajectory of an animal, observed over a series of dates) and take account of an underlying process (such as the activity of the tracked animal: sleep, movement, hunting, etc.) that affects the observations (the animal's position or trajectory). In this case, the latent variables are discrete and represent the activity of the animal at every instant. In other models, the hidden process itself may be continuous.
Mixed (generalized) linear models are one of the key tools used in ecology to describe the effects of a set of conditions (environmental or otherwise) on a population or community. These models include random effects which are, in essence, latent variables, used to account for higher-than-expected dispersion or for dependency relationships between variables. In most cases, these latent variables are continuous and essentially instrumental in nature.
Joint species distribution models (JSDMs) are a multidimensional version of generalized linear models, used to describe the composition of a community as a function of both environmental variables and the interactions between constituent species. Many JSDMs use a multidimensional (e.g. Gaussian) latent variable, the dependency structure of which is used to describe inter-species interactions.
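The hidden Markov model described above can be sketched by simulation. This is a minimal illustration under stated assumptions: two activities ("rest" and "travel"), hypothetical transition probabilities, and exponentially distributed step lengths whose mean depends on the activity. None of these choices come from the book.

```python
import random

# Hypothetical two-state activity model (all values are assumptions).
STATES = ("rest", "travel")
TRANSITION = {
    "rest": {"rest": 0.9, "travel": 0.1},
    "travel": {"rest": 0.2, "travel": 0.8},
}
MEAN_STEP = {"rest": 0.1, "travel": 2.0}  # mean step length per activity


def simulate_hmm(n_steps, rng):
    """Simulate hidden activities and the step lengths they generate."""
    state = "rest"
    states, steps = [], []
    for _ in range(n_steps):
        states.append(state)
        # Emission: the observation depends only on the current hidden state.
        steps.append(rng.expovariate(1.0 / MEAN_STEP[state]))
        # Transition: the next state depends only on the current state.
        probs = TRANSITION[state]
        state = rng.choices(STATES, weights=[probs[s] for s in STATES])[0]
    return states, steps


rng = random.Random(1)
hidden_states, step_lengths = simulate_hmm(500, rng)
```

There is one latent variable per time step (`hidden_states`), matching the number of observations (`step_lengths`); in a real analysis only the latter would be available.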
In ecology, models are often used to describe the effect of experimental conditions or environmental variables on the response or behavior of one or more species. Explanatory variables of this kind are often known as covariates. These effects are typically accounted for using a regression term, as in the case of generalized linear models. A regression term of this type may also be used in latent variable models, in which case the distribution of the response variable in question is considered to depend on both the observed covariates and non-observable latent variables.
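As a minimal sketch of such a model, consider a count response with a log link. The Poisson distribution, the Gaussian latent variable and the symbols $x_i$ (covariates), $\beta$ (regression coefficients), $Z_i$ (latent variable) and $\sigma^2$ (its variance) are assumptions chosen for illustration.

```latex
\begin{align*}
Z_i &\sim \mathcal{N}(0, \sigma^2), \\
Y_i \mid Z_i &\sim \mathcal{P}\!\left( \exp\left( x_i^\top \beta + Z_i \right) \right).
\end{align*}
```

The regression term $x_i^\top \beta$ carries the effect of the observed covariates, while $Z_i$ absorbs the variability that they do not explain.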
Many methods have been proposed for estimating the parameters of a model featuring latent variables. From a frequentist perspective, the oldest and most widespread means of computing the maximum likelihood estimator is the expectation-maximization (EM) algorithm, which draws on the fact that the parameters of many of these models would be easy to estimate if the latent variables could be observed. The EM algorithm alternates between two steps: in the E step, all of the quantities involving latent variables are calculated; these are then used to update the parameter estimates in the M step. The E step focuses on determining the conditional distribution of the latent variables given the observed data. This calculation may be immediate (as in the case of mixture models and certain mixed models) or possible but costly (as in the case of HMMs); alternatively, it may be impossible for combinatorial or formal reasons.
The estimation problem is even more acute in the context of Bayesian inference, as a conditional distribution must be established not only for the latent variables, but also for the parameters. Once again, except in very specific circumstances, exact determination of this joint conditional distribution (latent variables and parameters) is usually impossible.
The inference methods used in models with an intractable conditional distribution fall into two broad categories: sampling methods and approximation methods. Sampling methods draw a sample from the intractable distribution in order to obtain precise estimates of all relevant quantities. This category includes Monte Carlo, Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC) methods. These algorithms are inherently random, and are notably used in Bayesian inference. Methods in the second category are used to determine an approximation of the conditional distribution of the latent variables (and, in the Bayesian case, of...
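The alternation between the two steps can be sketched for the simplest case, a mixture of two one-dimensional Gaussians, where the E step is immediate. This is a minimal illustration, not the book's implementation: the initialization, the fixed number of iterations and all names are assumptions, and there are no safeguards for degenerate data (for example a component that becomes empty).

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_gaussians(data, n_iter=100):
    """EM for a two-component 1D Gaussian mixture (illustrative sketch)."""
    n = len(data)
    lo, hi = min(data), max(data)
    weight = 0.5                 # proportion of component 0
    means = [lo, hi]             # crude initialization at the data extremes
    sds = [(hi - lo) / 4.0] * 2
    for _ in range(n_iter):
        # E step: conditional probability that each point belongs to component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, means[0], sds[0])
            p1 = (1.0 - weight) * normal_pdf(x, means[1], sds[1])
            total = p0 + p1
            resp.append(p0 / total if total > 0.0 else 0.5)
        # M step: weighted versions of the usual proportion, mean and sd estimates.
        n0 = sum(resp)
        n1 = n - n0
        weight = n0 / n
        means = [
            sum(r * x for r, x in zip(resp, data)) / n0,
            sum((1.0 - r) * x for r, x in zip(resp, data)) / n1,
        ]
        var0 = sum(r * (x - means[0]) ** 2 for r, x in zip(resp, data)) / n0
        var1 = sum((1.0 - r) * (x - means[1]) ** 2 for r, x in zip(resp, data)) / n1
        sds = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
    return weight, means, sds


# Simulated data: two groups of equal size centered at 0 and 5 (assumed values).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(5.0, 1.0) for _ in range(300)]
weight, means, sds = em_two_gaussians(data)
```

The M step uses the same formulas as if the groups were known, with the unknown group indicators replaced by their conditional probabilities from the E step.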
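In symbols, and as a sketch only (the notation $y$ for the data, $z$ for the latent variables and $\theta$ for the parameters is an assumption introduced here), Bayes' formula gives the joint conditional distribution as:

```latex
\[
p(z, \theta \mid y)
  = \frac{p(\theta)\, p(z \mid \theta)\, p(y \mid z, \theta)}
         {\displaystyle \iint p(\theta')\, p(z' \mid \theta')\, p(y \mid z', \theta')
          \,\mathrm{d}z' \,\mathrm{d}\theta'} .
\]
```

The numerator is usually easy to evaluate. The difficulty lies in the denominator, which integrates (or sums, for discrete $z$) over every configuration of the latent variables and every parameter value.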