
Introduction to Bayesian Statistics
Content
1 Introduction to Statistical Science
1.1 The Scientific Method: A Process for Learning
1.2 The Role of Statistics in the Scientific Method
1.3 Main Approaches to Statistics
1.4 Purpose and Organization of This Text
2 Scientific Data Gathering
2.1 Sampling from a Real Population
2.2 Observational Studies and Designed Experiments
Monte Carlo Exercises
3 Displaying and Summarizing Data
3.1 Graphically Displaying a Single Variable
3.2 Graphically Comparing Two Samples
3.3 Measures of Location
3.4 Measures of Spread
3.5 Displaying Relationships Between Two or More Variables
3.6 Measures of Association for Two or More Variables
Exercises
4 Logic, Probability, and Uncertainty
4.1 Deductive Logic and Plausible Reasoning
4.2 Probability
4.3 Axioms of Probability
4.4 Joint Probability and Independent Events
4.5 Conditional Probability
4.6 Bayes' Theorem
4.7 Assigning Probabilities
4.8 Odds and Bayes Factor
4.9 Beat the Dealer
Exercises
5 Discrete Random Variables
5.1 Discrete Random Variables
5.2 Probability Distribution of a Discrete Random Variable
5.3 Binomial Distribution
5.4 Hypergeometric Distribution
5.5 Poisson Distribution
5.6 Joint Random Variables
5.7 Conditional Probability for Joint Random Variables
Exercises
6 Bayesian Inference for Discrete Random Variables
6.1 Two Equivalent Ways of Using Bayes' Theorem
6.2 Bayes' Theorem for Binomial with Discrete Prior
6.3 Important Consequences of Bayes' Theorem
6.4 Bayes' Theorem for Poisson with Discrete Prior
Exercises
Computer Exercises
7 Continuous Random Variables
7.1 Probability Density Function
7.2 Some Continuous Distributions
7.3 Joint Continuous Random Variables
7.4 Joint Continuous and Discrete Random Variables
Exercises
8 Bayesian Inference for Binomial Proportion
8.1 Using a Uniform Prior
8.2 Using a Beta Prior
8.3 Choosing Your Prior
8.4 Summarizing the Posterior Distribution
8.5 Estimating the Proportion
8.6 Bayesian Credible Interval
Exercises
Computer Exercises
9 Comparing Bayesian and Frequentist Inferences for Proportion
9.1 Frequentist Interpretation of Probability and Parameters
9.2 Point Estimation
9.3 Comparing Estimators for Proportion
9.4 Interval Estimation
9.5 Hypothesis Testing
9.6 Testing a One-Sided Hypothesis
9.7 Testing a Two-Sided Hypothesis
Exercises
Monte Carlo Exercises
10 Bayesian Inference for Poisson
10.1 Some Prior Distributions for Poisson
10.2 Inference for Poisson Parameter
Exercises
Computer Exercises
11 Bayesian Inference for Normal Mean
11.1 Bayes' Theorem for Normal Mean with a Discrete Prior
11.2 Bayes' Theorem for Normal Mean with a Continuous Prior
11.3 Choosing Your Normal Prior
11.4 Bayesian Credible Interval for Normal Mean
11.5 Predictive Density for Next Observation
Exercises
Computer Exercises
12 Comparing Bayesian and Frequentist Inferences for Mean
12.1 Comparing Frequentist and Bayesian Point Estimators
12.2 Comparing Confidence and Credible Intervals for Mean
12.3 Testing a One-Sided Hypothesis about a Normal Mean
12.4 Testing a Two-Sided Hypothesis about a Normal Mean
Exercises
13 Bayesian Inference for Difference Between Means
13.1 Independent Random Samples from Two Normal Distributions
13.2 Case 1: Equal Variances
13.3 Case 2: Unequal Variances
13.4 Bayesian Inference for Difference Between Two Proportions Using Normal Approximation
13.5 Normal Random Samples from Paired Experiments
Exercises
14 Bayesian Inference for Simple Linear Regression
14.1 Least Squares Regression
14.2 Exponential Growth Model
14.3 Simple Linear Regression Assumptions
14.4 Bayes' Theorem for the Regression Model
14.5 Predictive Distribution for Future Observation
Exercises
Computer Exercises
15 Bayesian Inference for Standard Deviation
15.1 Bayes' Theorem for Normal Variance with a Continuous Prior
15.2 Some Specific Prior Distributions and the Resulting Posteriors
15.3 Bayesian Inference for Normal Standard Deviation
Exercises
Computer Exercises
16 Robust Bayesian Methods
16.1 Effect of Misspecified Prior
16.2 Bayes' Theorem with Mixture Priors
Exercises
Computer Exercises
17 Bayesian Inference for Normal with Unknown Mean and Variance
17.1 The Joint Likelihood Function
17.2 Finding the Posterior when Independent Jeffreys' Priors for µ and σ² Are Used
17.3 Finding the Posterior when a Joint Conjugate Prior for µ and σ² Is Used
17.4 Difference Between Normal Means with Equal Unknown Variance
17.5 Difference Between Normal Means with Unequal Unknown Variances
Computer Exercises
Appendix: Proof that the Exact Marginal Posterior Distribution of µ is Student's t
18 Bayesian Inference for Multivariate Normal Mean Vector
18.1 Bivariate Normal Density
18.2 Multivariate Normal Distribution
18.3 The Posterior Distribution of the Multivariate Normal Mean Vector when Covariance Matrix Is Known
18.4 Credible Region for Multivariate Normal Mean Vector when Covariance Matrix Is Known
18.5 Multivariate Normal Distribution with Unknown Covariance Matrix
Computer Exercises
19 Bayesian Inference for the Multiple Linear Regression Model
19.1 Least Squares Regression for Multiple Linear Regression Model
19.2 Assumptions of Normal Multiple Linear Regression Model
19.3 Bayes' Theorem for Normal Multiple Linear Regression Model
19.4 Inference in the Multivariate Normal Linear Regression Model
19.5 The Predictive Distribution for a Future Observation
Computer Exercises
20 Computational Bayesian Statistics Including Markov Chain Monte Carlo
20.1 Direct Methods for Sampling from the Posterior
20.2 Sampling-Importance-Resampling
20.3 Markov Chain Monte Carlo Methods
20.4 Slice Sampling
20.5 Inference from a Posterior Random Sample
20.6 Where to Next?
A Introduction to Calculus
B Use of Statistical Tables
C Using the Included Minitab Macros
D Using the Included R Functions
E Answers to Selected Exercises
References
Index
CHAPTER 1
INTRODUCTION TO STATISTICAL SCIENCE
Statistics is the science that relates data to specific questions of interest. This includes devising methods to gather data relevant to the question, methods to summarize and display the data to shed light on the question, and methods that enable us to draw answers to the question that are supported by the data. Data almost always contain uncertainty. This uncertainty may arise from selection of the items to be measured, or it may arise from variability of the measurement process. Drawing general conclusions from data is the basis for increasing knowledge about the world, and is the basis for all rational scientific inquiry. Statistical inference gives us methods and tools for doing this despite the uncertainty in the data. The methods used for analysis depend on the way the data were gathered. It is vitally important that there is a probability model explaining how the uncertainty gets into the data.
Showing a Causal Relationship from Data
Suppose we have observed two variables X and Y. Variable X appears to have an association with variable Y. If high values of X occur with high values of Y and low values of X occur with low values of Y, then we say the association is positive. On the other hand, the association is negative if high values of X occur with low values of Y. Figure 1.1 shows a schematic diagram where the association is indicated by the dashed curve connecting X and Y. The unshaded area indicates that X and Y are observed variables. The shaded area indicates that there may be additional variables that have not been observed.
Figure 1.1 Association between two variables.
Figure 1.2 Association due to causal relationship.
We would like to determine why the two variables are associated. There are several possible explanations. The association might be a causal one. For example, X might be the cause of Y. This is shown in Figure 1.2, where the causal relationship is indicated by the arrow from X to Y.
On the other hand, there could be an unidentified third variable Z that has a causal effect on both X and Y. They are not related in a direct causal relationship. The association between them is due to the effect of Z. Z is called a lurking variable, since it is hiding in the background and it affects the data. This is shown in Figure 1.3.
Figure 1.3 Association due to lurking variable.
Figure 1.4 Confounded causal and lurking variable effects.
It is possible that a causal effect and a lurking variable are both contributing to the association. This is shown in Figure 1.4. We say that the causal effect and the effect of the lurking variable are confounded. This means that both effects are included in the association.
Our first goal is to determine which of the possible reasons for the association holds. If we conclude that it is due to a causal effect, then our next goal is to determine the size of the effect. If we conclude that the association is due to a causal effect confounded with the effect of a lurking variable, then our next goal becomes determining the sizes of both effects.
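The lurking-variable explanation can be illustrated with a small simulation. The sketch below is not from the text: it assumes a hypothetical model in which an unobserved Z is standard normal and X and Y are each Z plus independent standard normal noise, so X has no causal effect on Y. Under that assumed model the theoretical correlation between X and Y is 0.5, and it should nearly vanish once Z is held approximately constant.

```python
import random
import statistics


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_lurking(n=10_000, seed=1):
    """Assumed model: Z drives both X and Y; X does not cause Y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]      # lurking variable
    x = [zi + rng.gauss(0, 1) for zi in z]       # X depends only on Z
    y = [zi + rng.gauss(0, 1) for zi in z]       # Y depends only on Z
    return z, x, y


z, x, y = simulate_lurking()

# Association between X and Y when Z is ignored.
r_all = correlation(x, y)

# Association among units whose Z is nearly the same (|Z| < 0.1).
near = [(xi, yi) for zi, xi, yi in zip(z, x, y) if abs(zi) < 0.1]
r_fixed = correlation([p[0] for p in near], [p[1] for p in near])

print(f"correlation ignoring Z:      {r_all:.3f}")
print(f"correlation with Z held near 0: {r_fixed:.3f}")
```

In practice a lurking variable is unobserved, so the second calculation is usually not available; the simulation only shows why an observed association alone cannot distinguish the explanations.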
1.1 The Scientific Method: A Process for Learning
In the Middle Ages, science was deduced from principles set down many centuries earlier by authorities such as Aristotle. The idea that scientific theories should be tested against real-world data revolutionized thinking. This way of thinking, known as the scientific method, sparked the Renaissance.
The scientific method rests on the following premises:
- A scientific hypothesis can never be shown to be absolutely true.
- However, it must potentially be disprovable.
- It is a useful model until it is established that it is not true.
- Always go for the simplest hypothesis, unless it can be shown to be false.
This last principle, elaborated by William of Ockham in the 14th century, is now known as Ockham's razor and is firmly embedded in science. It keeps science from developing fanciful, overly elaborate theories. Thus the scientific method directs us through an improving sequence of models, as previous ones get falsified. The scientific method generally proceeds as follows:
- Ask a question or pose a problem in terms of the current scientific hypothesis.
- Gather all the relevant information that is currently available. This includes the current knowledge about parameters of the model.
- Design an investigation or experiment that addresses the question posed in the first step. The predicted outcome of the experiment should be one thing if the current hypothesis is true, and something else if the hypothesis is false.
- Gather data from the experiment.
- Draw conclusions given the experimental results. Revise the knowledge about the parameters to take the current results into account.
The scientific method searches for cause-and-effect relationships between an experimental variable and an outcome variable; in other words, it asks how changing the experimental variable results in a change to the outcome variable. Scientific modeling develops mathematical models of these relationships. Both need to isolate the experiment from outside factors that could affect the experimental results. All outside factors that can be identified as possibly affecting the results must be controlled. It is no coincidence that the earliest successes for the method were in physics and chemistry, where the few outside factors could be identified and controlled. Thus there were no lurking variables. All other relevant variables could be identified and could then be physically controlled by being held constant. That way they would not affect the results of the experiment, and the effect of the experimental variable on the outcome variable could be determined. In biology, medicine, engineering, technology, and the social sciences it is not that easy to identify the relevant factors that must be controlled. In those fields a different way to control outside factors is needed, because they cannot be identified beforehand and physically controlled.
1.2 The Role of Statistics in the Scientific Method
Statistical methods of inference can be used when there is random variability in the data. The probability model for the data is justified by the design of the investigation or experiment. This can extend the scientific method into situations where the relevant outside factors cannot even be identified. Since we cannot identify these outside factors, we cannot control them directly. The lack of direct control means the outside factors will be affecting the data. There is a danger that the wrong conclusions could be drawn from the experiment due to these uncontrolled outside factors.
The important statistical idea of randomization has been developed to deal with this possibility. The unidentified outside factors can be "averaged out" by randomly assigning each unit to either treatment or control group. This contributes variability to the data. Statistical conclusions always have some uncertainty or error due to variability in the data. We can develop a probability model of the data variability based on the randomization used. Randomization not only reduces this uncertainty due to outside factors, it also allows us to measure the amount of uncertainty that remains using the probability model. Randomization lets us control the outside factors statistically, by averaging out their effects.
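The averaging-out effect of randomization can be seen in a small simulation. This is a sketch under assumptions that are not in the text: a hypothetical outcome model with a true treatment effect of 2.0, an unobserved outside factor `u` with coefficient 3.0, and standard normal noise. It compares the difference in group means when units effectively select their own group (those with high `u` end up treated) with the difference when assignment is by a fair coin flip.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed treatment effect in this hypothetical model


def estimate_effect(assign, n=20_000, seed=2):
    """Difference in mean outcome (treated minus control) under a given
    assignment rule. `assign(u, rng)` returns True for the treatment group."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                       # unobserved outside factor
        t = assign(u, rng)
        y = TRUE_EFFECT * t + 3.0 * u + rng.gauss(0, 1)
        (treated if t else control).append(y)
    return statistics.fmean(treated) - statistics.fmean(control)


def self_selected(u, rng):
    """Units with a high outside factor end up in the treatment group."""
    return u > 0


def randomized(u, rng):
    """Each unit is assigned by a fair coin flip, independent of u."""
    return rng.random() < 0.5


biased = estimate_effect(self_selected)
fair = estimate_effect(randomized)

print(f"true effect:               {TRUE_EFFECT:.2f}")
print(f"estimate, self-selected:   {biased:.2f}")
print(f"estimate, randomized:      {fair:.2f}")
```

With self-selection the outside factor is confounded with the treatment, so the estimate is far from the assumed true effect. With random assignment the outside factor is balanced between the groups on average, and what remains is random variability that a probability model can describe.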
Underlying this is the idea of a statistical population, consisting of all possible values of the observations that could be made. The data consist of observations taken from a sample of the population. For valid inferences about the population parameters from the sample statistics, the sample must be "representative" of the population. Amazingly, choosing the sample randomly is the most effective way to get representative samples!
1.3 Main Approaches to Statistics
There are two main philosophical approaches to statistics. The first is often referred to as the frequentist approach. Sometimes it is called the classical approach. Procedures are developed by looking at how they perform over all possible random samples. The probabilities do not relate to the particular random sample that was obtained. In many ways this indirect method places the "cart before the horse."
The alternative approach that we take in this book is the Bayesian approach. It applies the laws of probability directly to the problem. This offers many fundamental advantages over the more commonly used frequentist approach. We will show these advantages over the course of the book.
Frequentist Approach to Statistics
Most introductory statistics books take the frequentist approach to statistics, which is based on the following ideas:
- Parameters, the numerical characteristics of the population, are fixed but unknown constants.
- Probabilities are always interpreted as long-run relative frequency.
- Statistical procedures are judged by how well they perform in the long run over an infinite number of hypothetical repetitions of the experiment.
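The long-run viewpoint in the list above can be made concrete by simulating many repetitions of the same experiment. The sketch below is illustrative rather than taken from the text: it assumes normally distributed observations with a fixed mean `mu` and a known standard deviation `sigma`, and uses the standard 95% interval, sample mean ± 1.96·sigma/√n. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def coverage(mu=10.0, sigma=2.0, n=25, reps=5_000, seed=3):
    """Fraction of repeated samples whose 95% interval contains the fixed mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5   # known-sigma interval half-width
    hits = 0
    for _ in range(reps):
        # One hypothetical repetition of the experiment.
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval xbar +/- half_width covers mu exactly when this holds.
        hits += abs(xbar - mu) <= half_width
    return hits / reps


cov = coverage()
print(f"long-run coverage over repeated samples: {cov:.3f}")
```

The 95% figure describes the procedure across repetitions. In any single repetition the interval either contains the fixed parameter or it does not, which is why the probability does not relate to the particular sample that was obtained.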
Probability...