Simplicity, Complexity and Modelling

Mike Christie, Andrew Cliffe, Philip Dawid and Stephen S. Senn (Editors)

Wiley (Publisher)

Published on 19 October 2011

232 pages

E-Book

ePUB with Adobe-DRM

978-1-119-96096-6 (ISBN)

€90.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

Several points of disagreement exist between different modelling traditions: whether complex models are always better than simpler models, how to combine results from different models, and how to propagate model uncertainty into forecasts. This book is the result of a collaboration between scientists from many disciplines to show how these conflicts can be resolved.

Key Features:

* Introduces important concepts in modelling, outlining different traditions in the use of simple and complex modelling in statistics.
* Provides numerous case studies on complex modelling, such as climate change, flood risk and new drug development.
* Concentrates on a variety of models, including flood risk analysis models and petrol industry forecasts, and summarizes the evolution of water distribution systems.
* Written by experienced statisticians and engineers in order to facilitate communication between modellers in different disciplines.
* Provides a glossary giving terms commonly used in different modelling traditions.

This book provides a much-needed reference guide to approaching statistical modelling. Scientists involved with modelling complex systems in areas such as climate change, flood prediction and prevention, financial market modelling and systems engineering will benefit from this book. It will also be a useful source of modelling case histories.

Reviews / Votes

"In short, this book offers plenty. While reading it cannot entirely replace first-hand experience of actually working with statistical modelling, I think it can be highly useful, either for a course on Ph.D. level, or for a statistician setting out on her own to improve her competence in applying statistical techniques and modelling in non-trivial situations." (International Statistical Review, 1 December 2012)

Content

Preface

Acknowledgements

Contributing authors

1 Introduction
Mike Christie, Andrew Cliffe, Philip Dawid and Stephen Senn

1.1 The origins of the SCAM project

1.2 The scope of modelling in the modern world

1.3 The different professions and traditions engaged in modelling

1.4 Different types of models

1.5 Different purposes for modelling

1.6 The purpose of the book

1.7 Overview of the chapters

References

2 Statistical model selection
Philip Dawid and Stephen Senn

2.1 Introduction

2.2 Explanation or prediction?

2.3 Levels of uncertainty

2.4 Bias-variance trade-off

2.5 Statistical models

2.5.1 Within-model inference

2.6 Model comparison

2.7 Bayesian model comparison

2.7.1 Model uncertainty

2.7.2 Laplace approximation

2.8 Penalized likelihood

2.8.1 Bayesian information criterion

2.9 The Akaike information criterion

2.9.1 Inconsistency of AIC

2.10 Significance testing

2.11 Many variables

2.12 Data-driven approaches

2.12.1 Cross-validation

2.12.2 Prequential analysis

2.13 Model selection or model averaging?

References

3 Modelling in drug development
Stephen Senn

3.1 Introduction

3.2 The nature of drug development and scope for statistical modelling

3.3 Simplicity versus complexity in phase III trials

3.3.1 The nature of phase III trials

3.3.2 The case for simplicity in analysing phase III trials

3.3.3 The case for complexity in modelling clinical trials

3.4 Some technical issues

3.4.1 The effect of covariate adjustment in linear models

3.4.2 The effect of covariate adjustment in non-linear models

3.4.3 Random effects in multi-centre trials

3.4.4 Subgroups and interactions

3.4.5 Bayesian approaches

3.5 Conclusion

3.6 Appendix: The effect of covariate adjustment on the variance multiplier in least squares

References

4 Modelling with deterministic computer models
Jeremy E. Oakley

4.1 Introduction

4.2 Metamodels and emulators for computationally expensive simulators

4.2.1 Gaussian processes emulators

4.2.2 Multivariate outputs

4.3 Uncertainty analysis

4.4 Sensitivity analysis

4.4.1 Variance-based sensitivity analysis

4.4.2 Value of information

4.5 Calibration and discrepancy

4.6 Discussion

References

5 Modelling future climates
Peter Challenor and Robin Tokmakian

5.1 Introduction

5.2 What is the risk from climate change?

5.3 Climate models

5.4 An anatomy of uncertainty

5.4.1 Aleatoric uncertainty

5.4.2 Epistemic uncertainty

5.5 Simplicity and complexity

5.6 An example: The collapse of the thermohaline circulation

5.7 Conclusions

References

6 Modelling climate change impacts for adaptation assessments
Suraje Dessai and Jeroen van der Sluijs

6.1 Introduction

6.1.1 Climate impact assessment

6.2 Modelling climate change impacts: From world development paths to localized impacts

6.2.1 Greenhouse gas emissions

6.2.2 Climate models

6.2.3 Downscaling

6.2.4 Regional/local climate change impacts

6.3 Discussion

6.3.1 Multiple routes of uncertainty assessment

6.3.2 What is the appropriate balance between simplicity and complexity?

References

7 Modelling in water distribution systems
Zoran Kapelan

7.1 Introduction

7.2 Water distribution system models

7.2.1 Water distribution systems

7.2.2 WDS hydraulic models

7.2.3 Uncertainty in WDS hydraulic modelling

7.3 Calibration of WDS hydraulic models

7.3.1 Calibration problem

7.3.2 Existing approaches

7.3.3 Case study

7.4 Sampling design for calibration

7.4.1 Sampling design problem

7.4.2 Existing approaches

7.4.3 Case study

7.5 Summary and conclusions

References

8 Modelling for flood risk management
Jim Hall

8.1 Introduction

8.2 Flood risk management

8.2.1 Long-term change

8.2.2 Uncertainty

8.3 Multi-purpose management

8.4 Modelling for flood risk management

8.4.1 Source

8.4.2 Pathway

8.4.3 Receptors

8.4.4 An example of a system model: Towyn

8.5 Model choice

8.6 Conclusions

References

9 Uncertainty quantification and oil reservoir modelling
Mike Christie

9.1 Introduction

9.2 Bayesian framework

9.2.1 Solution errors

9.3 Quantifying uncertainty in prediction of oil recovery

9.3.1 Stochastic sampling algorithms

9.3.2 Computing uncertainties from multiple history matched models

9.4 Inverse problems and reservoir model history matching

9.4.1 Synthetic problems

9.4.2 Imperial College fault model

9.4.3 Comparison of algorithms on a real field example

9.5 Selecting appropriate detail in models

9.5.1 Adaptive multiscale estimation

9.5.2 Bayes factors

9.5.3 Application of solution error modelling

9.6 Summary

References

10 Modelling in radioactive waste disposal
Andrew Cliffe

10.1 Introduction

10.2 The radioactive waste problem

10.2.1 What is radioactive waste?

10.2.2 How much radioactive waste is there?

10.2.3 What are the options for long-term management of radioactive waste?

10.3 The treatment of uncertainty in radioactive waste disposal

10.3.1 Deep geological disposal

10.3.2 Repository performance assessment

10.3.3 Modelling

10.3.4 Model verification and validation

10.3.5 Strategies for dealing with uncertainty

10.4 Summary and conclusions

References

11 Issues for modellers
Mike Christie, Andrew Cliffe, Philip Dawid and Stephen Senn

11.1 What are models and what are they useful for?

11.2 Appropriate levels of complexity

11.3 Uncertainty

11.3.1 Model inputs and parameter uncertainty

11.3.2 Model uncertainty

References

Glossary

Index

Chapter 1

Introduction

Mike Christie¹, Andrew Cliffe², Philip Dawid³ and Stephen Senn⁴

¹ Institute of Petroleum Engineering, Heriot-Watt University, Edinburgh, UK

² School of Mathematical Sciences, University of Nottingham, UK

³ Centre for Mathematical Sciences, University of Cambridge, UK

⁴ School of Mathematics and Statistics, University of Glasgow, UK

In this introductory chapter we make some brief remarks about this book: what its purpose is, how it relates to the Simplicity, Complexity and Modelling (SCAM) project and, more widely, what the purpose of modelling is and what the various traditions in modelling are.

1.1 The origins of the SCAM project

In January 2006 the Engineering and Physical Sciences Research Council (EPSRC) organized a ‘sandpit’ or ‘ideas factory’ at Shrigley Park under the directorship of Peter Grindrod with the title ‘Scientific Uncertainty and Decision Making for Regulatory and Risk Assessment Purposes’, in which scientists from a wide variety of disciplines participated. At the ideas factory there were frequent informal and formal meetings to discuss issues relevant to uncertainty in modelling. As the week progressed various themes emerged, projects were mooted and teams coalesced. These teams then competed with each other for funding from the EPSRC. Among those that were successful was a project which had the following specific objectives:

First, given that data are finite, what is the appropriate balance between simplicity and complexity required in modelling complex data?
Second, where more than one plausible candidate model is used, how should forecasts be combined?
Third, where model uncertainty exists, how should this uncertainty be propagated into predictions?

However, the project also had the more general and wider purposes of making modellers in different traditions mutually aware of what they were doing and also of making the different terminology that they employed intelligible to each other.

Funding for the project was agreed and the name Simplicity, Complexity and Modelling (SCAM) was chosen. This is the book of the SCAM project.

1.2 The scope of modelling in the modern world

Scientists working in many diverse areas are engaged in modelling the world. Obviously, the various fields in which the models they create are applied vary considerably and this is reflected in the approaches they adopt to build, fit, test and use the models they devise. Consider, for example, credit scoring and climate modelling. In the former case the data consist of billions of transactions every day. The field is data-rich and the opportunities to test the ability of the fitted models to predict (say) good and bad debts abundant. A model that is fitted today can be tested tomorrow and again the day after and so on. On the other hand, climate modellers are trying to predict a unique future. If current trends in human activity persist, will this lead to global warming and what will be the consequences? If the models suggest that the consequences of current activity are serious and if mankind acts on the warning and mends its ways then the prediction will never be validated. Climate modellers are thus cast in the role of Cassandras: if heeded they will ultimately be doubted because what they predict will not come to pass and only disaster will reveal them to have spoken the truth. This may seem somewhat fanciful, yet consider the case of the so-called millennium bug. Huge sums of money were invested in fixing computer code. The world computing network survived the arrival of the year 2000, and now some are convinced that it was all a fuss about nothing while others believe that it was only foresight and action that prevented disaster.

Yet, if one looks a little deeper even in these very different fields there are points in common. For example, in the wake of the global financial crisis of 2008 many financial analysts are no doubt pondering how well the current approach to forecasting the credit weather will serve if the credit climate is changing.

Nevertheless, some things are very different as one moves from one field to another, and it is the belief that knowledge of such differences is valuable that is one of the justifications for this book. On the other hand, some things that appear different are in fact the same or similar, and it is the vocabulary that differs from field to field and sometimes within a field, rather than the concept. For example, the terms random effects model, hierarchical model and mixed model used within the discipline of statistics are either synonyms or so readily interchangeable that they might be applied, depending on author, to exactly the same algebraic construct. However, those who work in pharmacometrics use machinery that is identical to random effects models but are likely to refer to such as population models (Sheiner et al. 1977). This reflects, of course, the fact that even within the same discipline different individuals responding to different perceived needs have stumbled across the same solution, and that as one switches discipline the scope for this phenomenon is even greater.

It is the object of this book, and of the SCAM project, to represent various modelling traditions and application areas, with a view to making researchers aware both of a rich diversity and of the many concerns they share.

1.3 The different professions and traditions engaged in modelling

However, it would be foolish of us to claim that the team members cover all disciplines and hence that our book encompasses the whole field. We are, in fact, three statisticians (APD, JO and SS), an applied mathematician (AC), a climate modeller (PC), a geographer (SD) and three engineers (MC, ZK and JH). Not included in the team, for example, are any computer scientists. Also absent, to name but a few scientific professions, are any econometricians, financial analysts or pharmacometricians (although SS has some interests in the latter field). The bias towards the physical sciences in the team is thus clear. In fact the application areas covered by us include topics from the physical sciences such as climate, oil exploration, flood prevention, nuclear waste disposal, water distribution networks, and simpler approximations of complex computer programs. The modelling of treatment effects in drug development is perhaps the only exception to this theme.

We do not claim that the breadth of the book is great enough to cover all fields or even all lessons that might be learned from study of such fields, but hope that it is great enough to be interesting and valuable and that it will serve to make the strange familiar by drawing parallels where they can be found and to make the familiar strange by alerting modellers in a given field to the fact that others do not necessarily do things the same way and hence that what they take for granted may be far from obvious.

1.4 Different types of models

Cox (1990) identifies two major types of model: substantive and empirical. Models of the former type arise as a result of careful consideration of some well-established or at least plausible background scientific theory. Careful thought concerning processes involved suggests a relationship between quantities of interest. The theory thus embodied may suggest some difficult or intricate mathematical work, and this receives expression in a model. We give a simple example of the thinking that might go into such a model from the field of pharmacokinetics.

Various physiological considerations may suggest that a particular pharmaceutical given by injection will be eliminated at a rate that is proportional to its concentration in the blood. Suppose we have an experiment in which a healthy volunteer is given a pharmaceutical by intravenous injection and then blood samples are drawn at regular and frequent intervals. A differential equation suggests that the concentration–time relationship can then be modelled with concentration on the log scale as a linear function of time. Of course nothing is measured perfectly, so that some random variation should be allowed for. It may thus be valuable to think in terms of data which have a signal plus some noise. The signal part of the model can then be modelled as

$$\mu_t = \mu_0 e^{-kt} \tag{1.1}$$

where $\mu_t$ is the ‘true’ concentration at time $t$ after dosing, $\mu_0$ is the concentration in the blood at time 0 and $k$ is a so-called elimination constant. One could regard such a model as being a simple (incomplete) example of a substantive model. Making it realistic using purely theory-based considerations may be difficult, however. A log transformation is particularly appealing and we can then write

$$\log \mu_t = \log \mu_0 - kt \tag{1.2}$$

(Here we follow the usual statistician's convention of writing natural logarithms as log.) We do not, however, observe $\mu_t$ directly but (say) a quantity $Y_t$. The model given in (1.1) may then be extended to represent observable quantities by proposing some simple relationship between a given observed concentration $Y_i$ taken at time $t_i$ and the true unobserved concentration that involves an unobserved random variable. One possible relationship is

1.3

However, this model is itself not complete until we specify how the...
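As an illustration of the signal-plus-noise idea described above, the following is a minimal sketch rather than the authors' own formulation. It assumes, purely for the example, that the noise is additive and Gaussian on the log scale, so that log Y_i = log μ0 − k·t_i + ε_i; this error model, the function names, and all numerical values are hypothetical choices and are not taken from the book. Under that assumption the elimination constant k can be estimated by ordinary least squares of log concentration on time.

```python
import numpy as np


def simulate_concentrations(mu0, k, times, sigma, rng):
    """Simulate observed concentrations Y_i at the given times.

    ASSUMPTION (not from the source): noise is additive and Gaussian
    on the log scale, log Y_i = log(mu0) - k * t_i + eps_i.
    """
    eps = rng.normal(0.0, sigma, size=times.shape)
    return np.exp(np.log(mu0) - k * times + eps)


def fit_log_linear(times, y):
    """Least-squares fit of log(y) on time; returns (mu0_hat, k_hat)."""
    # np.polyfit returns coefficients from highest degree down:
    # [slope, intercept] for a degree-1 fit.
    slope, intercept = np.polyfit(times, np.log(y), deg=1)
    return np.exp(intercept), -slope


# Illustrative values only: sampling every half hour for 12 hours.
rng = np.random.default_rng(0)
times = np.linspace(0.5, 12.0, 24)
y = simulate_concentrations(mu0=10.0, k=0.3, times=times, sigma=0.05, rng=rng)

mu0_hat, k_hat = fit_log_linear(times, y)
print(f"estimated mu0 = {mu0_hat:.3f}, estimated k = {k_hat:.3f}")
```

A different error model, for example noise added on the original concentration scale, would call for a non-linear fit instead; the log-scale choice here is made only because it keeps the example linear.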
