Introduction to Bayesian Statistics

Name: Introduction to Bayesian Statistics
Brand: Wiley
Price: 128.99 EUR
Availability: OnlineOnly

William M. Bolstad, James M. Curran (Authors)

Wiley (Publisher)

3rd Edition

Published on 2 September 2016

624 pages

E-Book

ePUB with Adobe-DRM

978-1-118-59322-6 (ISBN)

€128.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

"...this edition is useful and effective in teaching Bayesian inference at both elementary and intermediate levels. It is a well-written book on elementary Bayesian inference, and the material is easily accessible. It is both concise and timely, and provides a good collection of overviews and reviews of important tools used in Bayesian statistical methods."

There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts present only frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly added chapters address topics that reflect the rapid advances in the field of Bayesian statistics. The authors continue to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, and robust Bayesian methods, together with Bayesian approaches to inference for discrete random variables, binomial proportions, Poisson and normal means, and simple linear regression. In addition, more advanced topics in the field are presented in four new chapters: Bayesian inference for a normal with unknown mean and variance; Bayesian inference for a multivariate normal mean vector; Bayesian inference for the multiple linear regression model; and computational Bayesian statistics, including Markov chain Monte Carlo. These topics help readers advance from a minimal understanding of statistics to tackling topics in more applied, advanced-level books. Minitab macros and R functions are available on the book's related website to assist with chapter exercises.

Introduction to Bayesian Statistics, Third Edition also features:
* Topics including the joint likelihood function and inference using independent Jeffreys priors and a joint conjugate prior
* The cutting-edge topic of computational Bayesian statistics in a new chapter, with a unique focus on Markov chain Monte Carlo methods
* Exercises throughout the book that have been updated to reflect new applications and the latest software
* Detailed appendices that guide readers through the use of R and Minitab software for Bayesian analysis and Monte Carlo simulations, with all related macros available on the book's website

Introduction to Bayesian Statistics, Third Edition is a textbook for upper-undergraduate or first-year graduate level introductory statistics courses with a Bayesian emphasis. It can also be used as a reference work for statisticians who require a working knowledge of Bayesian statistics.

Content

Preface

1 Introduction to Statistical Science

1.1 The Scientific Method: A Process for Learning

1.2 The Role of Statistics in the Scientific Method

1.3 Main Approaches to Statistics

1.4 Purpose and Organization of This Text

2 Scientific Data Gathering

2.1 Sampling from a Real Population

2.2 Observational Studies and Designed Experiments

Monte Carlo Exercises

3 Displaying and Summarizing Data

3.1 Graphically Displaying a Single Variable

3.2 Graphically Comparing Two Samples

3.3 Measures of Location

3.4 Measures of Spread

3.5 Displaying Relationships Between Two or More Variables

3.6 Measures of Association for Two or More Variables

Exercises

4 Logic, Probability, and Uncertainty

4.1 Deductive Logic and Plausible Reasoning

4.2 Probability

4.3 Axioms of Probability

4.4 Joint Probability and Independent Events

4.5 Conditional Probability

4.6 Bayes' Theorem

4.7 Assigning Probabilities

4.8 Odds and Bayes Factor

4.9 Beat the Dealer

Exercises

5 Discrete Random Variables

5.1 Discrete Random Variables

5.2 Probability Distribution of a Discrete Random Variable

5.3 Binomial Distribution

5.4 Hypergeometric Distribution

5.5 Poisson Distribution

5.6 Joint Random Variables

5.7 Conditional Probability for Joint Random Variables

Exercises

6 Bayesian Inference for Discrete Random Variables

6.1 Two Equivalent Ways of Using Bayes' Theorem

6.2 Bayes' Theorem for Binomial with Discrete Prior

6.3 Important Consequences of Bayes' Theorem

6.4 Bayes' Theorem for Poisson with Discrete Prior

Exercises

Computer Exercises

7 Continuous Random Variables

7.1 Probability Density Function

7.2 Some Continuous Distributions

7.3 Joint Continuous Random Variables

7.4 Joint Continuous and Discrete Random Variables

Exercises

8 Bayesian Inference for Binomial Proportion

8.1 Using a Uniform Prior

8.2 Using a Beta Prior

8.3 Choosing Your Prior

8.4 Summarizing the Posterior Distribution

8.5 Estimating the Proportion

8.6 Bayesian Credible Interval

Exercises

Computer Exercises

9 Comparing Bayesian and Frequentist Inferences for Proportion

9.1 Frequentist Interpretation of Probability and Parameters

9.2 Point Estimation

9.3 Comparing Estimators for Proportion

9.4 Interval Estimation

9.5 Hypothesis Testing

9.6 Testing a One-Sided Hypothesis

9.7 Testing a Two-Sided Hypothesis

Exercises

Monte Carlo Exercises

10 Bayesian Inference for Poisson

10.1 Some Prior Distributions for Poisson

10.2 Inference for Poisson Parameter

Exercises

Computer Exercises

11 Bayesian Inference for Normal Mean

11.1 Bayes' Theorem for Normal Mean with a Discrete Prior

11.2 Bayes' Theorem for Normal Mean with a Continuous Prior

11.3 Choosing Your Normal Prior

11.4 Bayesian Credible Interval for Normal Mean

11.5 Predictive Density for Next Observation

Exercises

Computer Exercises

12 Comparing Bayesian and Frequentist Inferences for Mean

12.1 Comparing Frequentist and Bayesian Point Estimators

12.2 Comparing Confidence and Credible Intervals for Mean

12.3 Testing a One-Sided Hypothesis about a Normal Mean

12.4 Testing a Two-Sided Hypothesis about a Normal Mean

Exercises

13 Bayesian Inference for Difference Between Means

13.1 Independent Random Samples from Two Normal Distributions

13.2 Case 1: Equal Variances

13.3 Case 2: Unequal Variances

13.4 Bayesian Inference for Difference Between Two Proportions Using Normal Approximation

13.5 Normal Random Samples from Paired Experiments

Exercises

14 Bayesian Inference for Simple Linear Regression

14.1 Least Squares Regression

14.2 Exponential Growth Model

14.3 Simple Linear Regression Assumptions

14.4 Bayes' Theorem for the Regression Model

14.5 Predictive Distribution for Future Observation

Exercises

Computer Exercises

15 Bayesian Inference for Standard Deviation

15.1 Bayes' Theorem for Normal Variance with a Continuous Prior

15.2 Some Specific Prior Distributions and the Resulting Posteriors

15.3 Bayesian Inference for Normal Standard Deviation

Exercises

Computer Exercises

16 Robust Bayesian Methods

16.1 Effect of Misspecified Prior

16.2 Bayes' Theorem with Mixture Priors

Exercises

Computer Exercises

17 Bayesian Inference for Normal with Unknown Mean and Variance

17.1 The Joint Likelihood Function

17.2 Finding the Posterior when Independent Jeffreys' Priors for µ and σ² Are Used

17.3 Finding the Posterior when a Joint Conjugate Prior for µ and σ² Is Used

17.4 Difference Between Normal Means with Equal Unknown Variance

17.5 Difference Between Normal Means with Unequal Unknown Variances

Computer Exercises

Appendix: Proof that the Exact Marginal Posterior Distribution of µ is Student's t

18 Bayesian Inference for Multivariate Normal Mean Vector

18.1 Bivariate Normal Density

18.2 Multivariate Normal Distribution

18.3 The Posterior Distribution of the Multivariate Normal Mean Vector when Covariance Matrix Is Known

18.4 Credible Region for Multivariate Normal Mean Vector when Covariance Matrix Is Known

18.5 Multivariate Normal Distribution with Unknown Covariance Matrix

Computer Exercises

19 Bayesian Inference for the Multiple Linear Regression Model

19.1 Least Squares Regression for Multiple Linear Regression Model

19.2 Assumptions of Normal Multiple Linear Regression Model

19.3 Bayes' Theorem for Normal Multiple Linear Regression Model

19.4 Inference in the Multivariate Normal Linear Regression Model

19.5 The Predictive Distribution for a Future Observation

Computer Exercises

20 Computational Bayesian Statistics Including Markov Chain Monte Carlo

20.1 Direct Methods for Sampling from the Posterior

20.2 Sampling-Importance-Resampling

20.3 Markov Chain Monte Carlo Methods

20.4 Slice Sampling

20.5 Inference from a Posterior Random Sample

20.6 Where to Next?

A Introduction to Calculus

B Use of Statistical Tables

C Using the Included Minitab Macros

D Using the Included R Functions

E Answers to Selected Exercises

References

Index

CHAPTER 1
INTRODUCTION TO STATISTICAL SCIENCE

Statistics is the science that relates data to specific questions of interest. This includes devising methods to gather data relevant to the question, methods to summarize and display the data to shed light on the question, and methods that enable us to draw answers to the question that are supported by the data. Data almost always contain uncertainty. This uncertainty may arise from selection of the items to be measured, or it may arise from variability of the measurement process. Drawing general conclusions from data is the basis for increasing knowledge about the world, and is the basis for all rational scientific inquiry. Statistical inference gives us methods and tools for doing this despite the uncertainty in the data. The methods used for analysis depend on the way the data were gathered. It is vitally important that there is a probability model explaining how the uncertainty gets into the data.

Showing a Causal Relationship from Data

Suppose we have observed two variables X and Y. Variable X appears to have an association with variable Y. If high values of X occur with high values of Y and low values of X occur with low values of Y, then we say the association is positive. On the other hand, the association could be negative, in which case high values of X occur with low values of Y. Figure 1.1 shows a schematic diagram where the association is indicated by the dashed curve connecting X and Y. The unshaded area indicates that X and Y are observed variables. The shaded area indicates that there may be additional variables that have not been observed.

Figure 1.1 Association between two variables.

Figure 1.2 Association due to causal relationship.

We would like to determine why the two variables are associated. There are several possible explanations. The association might be a causal one. For example, X might be the cause of Y. This is shown in Figure 1.2, where the causal relationship is indicated by the arrow from X to Y.

On the other hand, there could be an unidentified third variable Z that has a causal effect on both X and Y. They are not related in a direct causal relationship. The association between them is due to the effect of Z. Z is called a lurking variable, since it is hiding in the background and it affects the data. This is shown in Figure 1.3.

Figure 1.3 Association due to lurking variable.

Figure 1.4 Confounded causal and lurking variable effects.

It is possible that a causal effect and a lurking variable are both contributing to the association. This is shown in Figure 1.4. We say that the causal effect and the effect of the lurking variable are confounded. This means that both effects are included in the association.

Our first goal is to determine which of the possible reasons for the association holds. If we conclude that it is due to a causal effect, then our next goal is to determine the size of the effect. If we conclude that the association is due to causal effect confounded with the effect of a lurking variable, then our next goal becomes determining the sizes of both the effects.
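
The lurking-variable situation of Figure 1.3 can be illustrated with a small simulation. The sketch below is not taken from the text: the linear model, the unit coefficients, the sample size, and the helper names `correlation` and `simulate_lurking` are all assumptions chosen for illustration. Z drives both X and Y, and X has no effect on Y, yet X and Y come out associated. Once the contribution of Z is subtracted, the association largely disappears.

```python
import random
import statistics


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_lurking(n=20_000, seed=1):
    """Hypothetical model: Z causes X and Z causes Y; X does not cause Y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


x, y, z = simulate_lurking()
r_xy = correlation(x, y)

# Subtract the part of X and Y that is due to Z (coefficient 1 by construction).
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_without_z = correlation(x_without_z, y_without_z)

print(f"correlation of X and Y:          {r_xy:.2f}")
print(f"correlation with Z's part removed: {r_without_z:.2f}")
```

In practice Z is unobserved, so this subtraction is not available to the analyst; the simulation only shows why an observed association alone cannot establish a causal relationship.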

1.1 The Scientific Method: A Process for Learning

In the Middle Ages, science was deduced from principles set down many centuries earlier by authorities such as Aristotle. The idea that scientific theories should be tested against real-world data revolutionized thinking. This way of thinking, known as the scientific method, sparked the Renaissance.

The scientific method rests on the following premises:

A scientific hypothesis can never be shown to be absolutely true.
However, it must potentially be disprovable.
It is a useful model until it is established that it is not true.
Always go for the simplest hypothesis, unless it can be shown to be false.

This last principle, elaborated by William of Ockham in the 14th century, is now known as Ockham's razor and is firmly embedded in science. It keeps science from developing fanciful, overly elaborate theories. Thus the scientific method directs us through an improving sequence of models, as previous ones get falsified. The scientific method generally proceeds as follows:

1. Ask a question or pose a problem in terms of the current scientific hypothesis.
2. Gather all the relevant information that is currently available. This includes the current knowledge about parameters of the model.
3. Design an investigation or experiment that addresses the question from step 1. The predicted outcome of the experiment should be one thing if the current hypothesis is true, and something else if the hypothesis is false.
4. Gather data from the experiment.
5. Draw conclusions given the experimental results. Revise the knowledge about the parameters to take the current results into account.

The scientific method searches for cause-and-effect relationships between an experimental variable and an outcome variable. In other words, it asks how changing the experimental variable results in a change to the outcome variable. Scientific modeling develops mathematical models of these relationships. Both need to isolate the experiment from outside factors that could affect the experimental results. All outside factors that can be identified as possibly affecting the results must be controlled. It is no coincidence that the earliest successes for the method were in physics and chemistry, where the few outside factors could be identified and controlled. Thus there were no lurking variables. All other relevant variables could be identified and could then be physically controlled by being held constant. That way they would not affect results of the experiment, and the effect of the experimental variable on the outcome variable could be determined. In biology, medicine, engineering, technology, and the social sciences it is not that easy to identify the relevant factors that must be controlled. In those fields a different way to control outside factors is needed, because they cannot be identified beforehand and physically controlled.

1.2 The Role of Statistics in the Scientific Method

Statistical methods of inference can be used when there is random variability in the data. The probability model for the data is justified by the design of the investigation or experiment. This can extend the scientific method into situations where the relevant outside factors cannot even be identified. Since we cannot identify these outside factors, we cannot control them directly. The lack of direct control means the outside factors will be affecting the data. There is a danger that the wrong conclusions could be drawn from the experiment due to these uncontrolled outside factors.

The important statistical idea of randomization has been developed to deal with this possibility. The unidentified outside factors can be "averaged out" by randomly assigning each unit to either treatment or control group. This contributes variability to the data. Statistical conclusions always have some uncertainty or error due to variability in the data. We can develop a probability model of the data variability based on the randomization used. Randomization not only reduces this uncertainty due to outside factors, it also allows us to measure the amount of uncertainty that remains using the probability model. Randomization lets us control the outside factors statistically, by averaging out their effects.
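
A short simulation can make the "averaging out" concrete. The sketch below is an illustration under stated assumptions, not the authors' procedure: the outcome model, the hypothetical `TRUE_EFFECT`, and the function names `self_selected`, `randomized`, and `estimated_effect` are invented for this example. Each unit carries a hidden outside factor that strongly affects the outcome. When units with a high hidden factor end up in the treatment group, the difference in group means is far from the true effect; when a coin flip assigns the groups, the hidden factor is balanced on average and the difference lands close to the true effect.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # hypothetical treatment effect, assumed for illustration


def self_selected(hidden, rng):
    """Units with a high hidden factor end up in the treatment group."""
    return hidden > 0


def randomized(hidden, rng):
    """A fair coin flip decides the group, ignoring the hidden factor."""
    return rng.random() < 0.5


def estimated_effect(assign, n=20_000, seed=2):
    """Difference in mean outcome between treatment and control groups."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)                  # unidentified outside factor
        outcome = 3.0 * hidden + rng.gauss(0, 1)  # outcome without treatment
        if assign(hidden, rng):
            treated.append(outcome + TRUE_EFFECT)
        else:
            control.append(outcome)
    return statistics.fmean(treated) - statistics.fmean(control)


biased = estimated_effect(self_selected)
fair = estimated_effect(randomized)
print(f"true effect:               {TRUE_EFFECT:.2f}")
print(f"estimate, self-selected:   {biased:.2f}")
print(f"estimate, randomized:      {fair:.2f}")
```

The randomized estimate still differs slightly from the true effect; that remaining variability is what the probability model based on the randomization lets us measure.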

Underlying this is the idea of a statistical population, consisting of all possible values of the observations that could be made. The data consist of observations taken from a sample of the population. For valid inferences about the population parameters from the sample statistics, the sample must be "representative" of the population. Amazingly, choosing the sample randomly is the most effective way to get representative samples!
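
The value of random selection can be sketched with a toy population. Everything here is assumed for illustration: the exponential population with mean 50, the sample size of 500, and the "convenience" rule of taking the first 500 items in storage order. The random sample mean falls near the population mean, while the convenience sample, which happens to contain only the smallest values, does not.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population of 50,000 values, stored in sorted order.
population = sorted(rng.expovariate(1 / 50) for _ in range(50_000))
pop_mean = statistics.fmean(population)

# Convenience sample: the 500 items that are "easiest to reach".
convenience_mean = statistics.fmean(population[:500])

# Simple random sample: every item has the same chance of selection.
random_mean = statistics.fmean(rng.sample(population, 500))

print(f"population mean:         {pop_mean:.1f}")
print(f"convenience sample mean: {convenience_mean:.1f}")
print(f"random sample mean:      {random_mean:.1f}")
```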

1.3 Main Approaches to Statistics

There are two main philosophical approaches to statistics. The first is often referred to as the frequentist approach. Sometimes it is called the classical approach. Procedures are developed by looking at how they perform over all possible random samples. The probabilities do not relate to the particular random sample that was obtained. In many ways this indirect method places the "cart before the horse."

The alternative approach that we take in this book is the Bayesian approach. It applies the laws of probability directly to the problem. This offers many fundamental advantages over the more commonly used frequentist approach. We will show these advantages over the course of the book.
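
As a hedged illustration of what applying the laws of probability directly can look like, the sketch below uses Bayes' theorem with a discrete prior on a binomial proportion. The three candidate values, the equal prior weights, and the data (7 successes in 10 trials) are hypothetical numbers chosen for this example, and the function name `binomial_likelihood` is likewise an assumption. The posterior probability of each candidate is proportional to its prior probability times the likelihood of the observed data.

```python
from math import comb

candidates = [0.2, 0.5, 0.8]              # hypothetical values of the proportion
prior = {p: 1 / 3 for p in candidates}    # equal prior weight on each value
successes, trials = 7, 10                 # hypothetical observed data


def binomial_likelihood(p, successes, trials):
    """Probability of the observed count if the true proportion is p."""
    failures = trials - successes
    return comb(trials, successes) * p**successes * (1 - p) ** failures


# Bayes' theorem: posterior is proportional to prior times likelihood.
unnormalized = {
    p: prior[p] * binomial_likelihood(p, successes, trials) for p in candidates
}
total = sum(unnormalized.values())
posterior = {p: weight / total for p, weight in unnormalized.items()}

for p in candidates:
    print(f"proportion {p}: prior {prior[p]:.3f}, posterior {posterior[p]:.3f}")
```

The posterior is a probability statement about the parameter given the sample actually observed, which is the sense in which the approach is direct.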

Frequentist Approach to Statistics

Most introductory statistics books take the frequentist approach to statistics, which is based on the following ideas:

Parameters, the numerical characteristics of the population, are fixed but unknown constants.
Probabilities are always interpreted as long-run relative frequency.
Statistical procedures are judged by how well they perform in the long run over an infinite number of hypothetical repetitions of the experiment.
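
The long-run interpretation in the list above can be sketched by simulation. The example is an assumption-laden illustration, not part of the text: it uses a normal population with known standard deviation, a 95% interval of the form sample mean plus or minus 1.96 standard errors, and a hypothetical function name `coverage`. The 95% figure describes how often the procedure captures the fixed parameter over many hypothetical repetitions, not the probability for any one interval.

```python
import random
import statistics


def coverage(mu=10.0, sigma=2.0, n=25, repetitions=10_000, seed=4):
    """Fraction of repeated 95% intervals that contain the fixed mean mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5   # known-sigma interval half width
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - mu) <= half_width:
            hits += 1
    return hits / repetitions


long_run_coverage = coverage()
print(f"long-run coverage of the 95% interval: {long_run_coverage:.3f}")
```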

Probability...
