
Probability and Conditional Expectation
Content
Part I Measure-Theoretical Foundations of Probability Theory
1 Measure
1.1 Introductory Examples
1.2 σ-Algebra and Measurable Space
1.2.1 σ-Algebra Generated by a Set System
1.2.2 σ-Algebra of Borel Sets on ℝⁿ
1.2.3 σ-Algebra on a Cartesian Product
1.2.4 ∩-Stable Set Systems That Generate a σ-Algebra
1.3 Measure and Measure Space
1.3.1 σ-Additivity and Related Properties
1.3.2 Other Properties
1.4 Specific Measures
1.4.1 Dirac Measure and Counting Measure
1.4.2 Lebesgue Measure
1.4.3 Other Examples of a Measure
1.4.4 Finite and σ-Finite Measures
1.4.5 Product Measure
1.5 Continuity of a Measure
1.6 Specifying a Measure via a Generating System
1.7 σ-Algebra That is Trivial With Respect to a Measure
1.8 Proofs
1.9 Exercises
2 Measurable Mapping
2.1 Image and Inverse Image
2.2 Introductory Examples
2.2.1 Example 1: Rectangles
2.2.2 Example 2: Flipping Two Coins
2.3 Measurable Mapping
2.3.1 Measurable Mapping
2.3.2 σ-Algebra Generated by a Mapping
2.3.3 Final σ-Algebra
2.3.4 Multivariate Mapping
2.3.5 Projection Mapping
2.3.6 Measurability With Respect to a Mapping
2.4 Theorems on Measurable Mappings
2.4.1 Measurability of a Composition
2.4.2 Theorems on Measurable Functions
2.5 Equivalence of Two Mappings With Respect to a Measure
2.6 Image Measure
2.7 Proofs
2.8 Exercises
3 Integral
3.1 Definition
3.1.1 Integral of a Nonnegative Step Function
3.1.2 Integral of a Nonnegative Measurable Function
3.1.3 Integral of a Measurable Function
3.2 Properties
3.2.1 Integral of µ-Equivalent Functions
3.2.2 Integral With Respect to a Weighted Sum of Measures
3.2.3 Integral With Respect to an Image Measure
3.2.4 Convergence Theorems
3.3 Lebesgue and Riemann Integral
3.4 Density
3.5 Absolute Continuity and the Radon-Nikodym Theorem
3.6 Integral With Respect to a Product Measure
3.7 Proofs
3.8 Exercises
Part II Probability, Random Variable and its Distribution
4 Probability Measure
4.1 Probability Measure and Probability Space
4.1.1 Definition
4.1.2 Formal and Substantive Meaning of Probabilistic Terms
4.1.3 Properties of a Probability Measure
4.1.4 Examples
4.2 Conditional Probability
4.2.1 Definition
4.2.2 Filtration and Time Order Between Events and Sets of Events
4.2.3 Multiplication Rule
4.2.4 Examples
4.2.5 Theorem of Total Probability
4.2.6 Bayes' Theorem
4.2.7 Conditional-Probability Measure
4.3 Independence
4.3.1 Independence of Events
4.3.2 Independence of Set Systems
4.4 Conditional Independence Given an Event
4.4.1 Conditional Independence of Events Given an Event
4.4.2 Conditional Independence of Set Systems Given an Event
4.5 Proofs
4.6 Exercises
5 Random Variable, Distribution, Density, and Distribution Function
5.1 Random Variable and its Distribution
5.2 Equivalence of Two Random Variables With Respect to a Probability Measure
5.2.1 Identical and P-Equivalent Random Variables
5.2.2 P-Equivalence, P^B-Equivalence, and Absolute Continuity
5.3 Multivariate Random Variable
5.4 Independence of Random Variables
5.5 Probability Function of a Discrete Random Variable
5.6 Probability Density With Respect to a Measure
5.6.1 General Concepts and Properties
5.6.2 Density of a Discrete Random Variable
5.6.3 Density of a Bivariate Random Variable
5.7 Uni- or Multivariate Real-Valued Random Variable
5.7.1 Distribution Function of a Univariate Real-Valued Random Variable
5.7.2 Distribution Function of a Multivariate Real-Valued Random Variable
5.7.3 Density of a Continuous Univariate Real-Valued Random Variable
5.7.4 Density of a Continuous Multivariate Real-Valued Random Variable
5.8 Proofs
5.9 Exercises
6 Expectation, Variance, and Other Moments
6.1 Expectation
6.1.1 Definition
6.1.2 Expectation of a Discrete Random Variable
6.1.3 Computing the Expectation Using a Density
6.1.4 Transformation Theorem
6.1.5 Rules of Computation
6.2 Moments, Variance, and Standard Deviation
6.3 Proofs
6.4 Exercises
7 Linear Quasi-Regression, Covariance, and Correlation
7.1 Linear Quasi-Regression
7.2 Covariance
7.3 Correlation
7.4 Expectation Vector and Covariance Matrix
7.4.1 Random Vector and Random Matrix
7.4.2 Expectation of a Random Vector and a Random Matrix
7.4.3 Covariance Matrix of Two Multivariate Random Variables
7.5 Multiple Linear Quasi-Regression
7.6 Proofs
7.7 Exercises
8 Some Distributions
8.1 Some Distributions of Discrete Random Variables
8.1.1 Discrete Uniform Distribution
8.1.2 Bernoulli Distribution
8.1.3 Binomial Distribution
8.1.4 Poisson Distribution
8.1.5 Geometric Distribution
8.2 Some Distributions of Continuous Random Variables
8.2.1 Continuous Uniform Distribution
8.2.2 Normal Distribution
8.2.3 Multivariate Normal Distribution
8.2.4 Central χ²-Distribution
8.2.5 Central t-Distribution
8.2.6 Central F-Distribution
8.3 Proofs
8.4 Exercises
Part III Conditional Expectation and Regression
9 Conditional Expectation Value and Discrete Conditional Expectation
9.1 Conditional Expectation Value
9.2 Transformation Theorem
9.3 Other Properties
9.4 Discrete Conditional Expectation
9.5 Discrete Regression
9.6 Examples
9.7 Proofs
9.8 Exercises
10 Conditional Expectation
10.1 Assumptions and Definitions
10.2 Existence and Uniqueness
10.2.1 Uniqueness With Respect to a Probability Measure
10.2.2 A Necessary and Sufficient Condition of Uniqueness
10.2.3 Examples
10.3 Rules of Computation and Other Properties
10.3.1 Rules of Computation
10.3.2 Monotonicity
10.3.3 Convergence Theorems
10.4 Factorization, Regression, and Conditional Expectation Value
10.4.1 Existence of a Factorization
10.4.2 Conditional Expectation and Mean-Squared Error
10.4.3 Uniqueness of a Factorization
10.4.4 Conditional Expectation Value
10.5 Characterizing a Conditional Expectation by the Joint Distribution
10.6 Conditional Mean Independence
10.7 Proofs
10.8 Exercises
11 Residual, Conditional Variance, and Conditional Covariance
11.1 Residual With Respect to a Conditional Expectation
11.2 Coefficient of Determination and Multiple Correlation
11.3 Conditional Variance and Covariance Given a σ-Algebra
11.4 Conditional Variance and Covariance Given a Value of a Random Variable
11.5 Properties of Conditional Variances and Covariances
11.6 Partial Correlation
11.7 Proofs
11.8 Exercises
12 Linear Regression
12.1 Basic Ideas
12.2 Assumptions and Definitions
12.3 Examples
12.4 Linear Quasi-Regression
12.5 Uniqueness and Identification of Regression Coefficients
12.6 Linear Regression
12.7 Parametrizations of a Discrete Conditional Expectation
12.8 Invariance of Regression Coefficients
12.9 Proofs
12.10 Exercises
13 Linear Logistic Regression
13.1 Logit Transformation of a Conditional Probability
13.2 Linear Logistic Parametrization
13.3 A Parametrization of a Discrete Conditional Probability
13.4 Identification of Coefficients of a Linear Logistic Parametrization
13.5 Linear Logistic Regression and Linear Logit Regression
13.6 Proofs
13.7 Exercises
14 Conditional Expectation With Respect to a Conditional-Probability Measure
14.1 Introductory Examples
14.2 Assumptions and Definitions
14.3 Properties
14.4 Partial Conditional Expectation
14.5 Factorization
14.5.1 Conditional Expectation Value With Respect to P^B
14.5.2 Uniqueness of Factorizations
14.6 Uniqueness
14.6.1 A Necessary and Sufficient Condition of Uniqueness
14.6.2 Uniqueness w.r.t. P and Other Probability Measures
14.6.3 Necessary and Sufficient Conditions of P-Uniqueness
14.6.4 Properties Related to P-Uniqueness
14.7 Conditional Mean Independence With Respect to P^{Z=z}
14.8 Proofs
14.9 Exercises
15 Conditional Effect Functions of a Discrete Regressor
15.1 Assumptions and Definitions
15.2 Conditional Intercept Function and Effect Functions
15.3 Implications of Independence of X and Z for Regression Coefficients
15.4 Adjusted Conditional Effect Functions
15.5 Conditional Logit Effect Functions
15.6 Implications of Independence of X and Z for the Logit Regression Coefficients
15.7 Proofs
15.8 Exercises
Part IV Conditional Independence and Conditional Distribution
16 Conditional Independence
16.1 Assumptions and Definitions
16.1.1 Two Events
16.1.2 Two Sets of Events
16.1.3 Two Random Variables
16.2 Properties
16.3 Conditional Independence and Conditional Mean Independence
16.4 Families of Events
16.5 Families of Set Systems
16.6 Families of Random Variables
16.7 Proofs
16.8 Exercises
17 Conditional Distribution
17.1 Conditional Distribution Given a σ-Algebra or a Random Variable
17.2 Conditional Distribution Given a Value of a Random Variable
17.3 Existence and Uniqueness
17.3.1 Existence
17.3.2 Uniqueness of the Functions P_{Y|C}(·, A')
17.3.3 Common Null Set (CNS) Uniqueness of a Conditional Distribution
17.4 Conditional-Probability Measure Given a Value of a Random Variable
17.5 Decomposing the Joint Distribution of Random Variables
17.6 Conditional Independence and Conditional Distributions
17.7 Expectations With Respect to a Conditional Distribution
17.8 Conditional Distribution Function and Probability Density
17.9 Conditional Distribution and Radon-Nikodym Density
17.10 Proofs
17.11 Exercises
References
Preface
Why another book on probability?
This book has two titles. The subtitle, 'Fundamentals for the Empirical Sciences', reflects the intentions and the motivation of the first author for writing this book. He received his academic training in psychology but considers himself a methodologist. His scientific interest is in explicating fundamental concepts of empirical research (such as causal effects and latent variables) in terms of a language that is precise and at the same time compatible with the statistical models used in the analysis of empirical data. Applying statistical models aims at estimating and testing hypotheses about parameters such as expectations, variances, covariances, and so on (or of functions of these parameters, such as differences between expectations, ratios of variances, regression coefficients, etc.), all of which are terms of probability theory. Precision is necessary for securing logical consistency of theories, whereas compatibility of theories about real-world phenomena with statistical models is crucial for probing the empirical validity of theoretical propositions via statistical inference.
Much empirical research uses some kind of regression in order to investigate how the expectation of one random variable depends on the values of one or more other random variables. This is true for analysis of variance, regression analysis, applications of the general linear model and the generalized linear model, factor analysis, structural equation modeling, hierarchical linear modeling, and the analysis of qualitative data. Using these methods, we aim at learning about specific regressions. A regression is a synonym for what, in probability theory, is called a factorization of a conditional expectation, provided that the regressor is numerical. This explains the main title of this book, 'Probability and Conditional Expectation'.
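For a discrete regressor, the idea of a regression as a factorization can be made concrete with a small computation. The following is only a sketch under stated assumptions: the joint probability function `joint` and the helper name `regression` are hypothetical and chosen for illustration, not taken from the book. The function g returned here satisfies g(x) = E(Y | X = x) for every value x with P(X = x) > 0, so that E(Y | X) can be written as the composition g(X).

```python
from fractions import Fraction as F

# Hypothetical joint probability function P(X=x, Y=y), for illustration only.
joint = {
    (0, 1): F(1, 8), (0, 3): F(3, 8),
    (1, 2): F(1, 4), (1, 6): F(1, 4),
}

def regression(joint):
    """Return g as a dict with g[x] = E(Y | X=x) for every x with P(X=x) > 0."""
    mass, weighted = {}, {}
    for (x, y), p in joint.items():
        mass[x] = mass.get(x, 0) + p               # accumulates P(X=x)
        weighted[x] = weighted.get(x, 0) + y * p   # accumulates sum of y * P(X=x, Y=y)
    return {x: weighted[x] / mass[x] for x in mass if mass[x] > 0}

g = regression(joint)

# Check: the expectation of g(X) equals the expectation of Y.
p_x = {x: sum(p for (x2, _), p in joint.items() if x2 == x) for x in g}
e_y = sum(y * p for (_, y), p in joint.items())
assert sum(g[x] * p_x[x] for x in g) == e_y

for x in sorted(g):
    print(f"E(Y | X={x}) = {g[x]}")
```

Exact fractions are used so that the check on the two expectations does not depend on floating-point rounding.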
What is it about?
Since the seminal book of Kolmogoroff (1933-1977), the fundamental concepts of probability theory are considered to be special concepts of measure theory. A probability measure is a special finite measure, random variables are special measurable mappings, and expectations of random variables are integrals of measurable mappings with respect to a probability measure. This motivates Part I of this book with three chapters on the measure-theoretical foundations of probability theory. Although at first sight this part seems to be far off from practical applications, the contrary is true. This part is indispensable for probability theory and for its applications in empirical sciences. This applies not only to the concepts of a measure and an integral but also, in particular, to the concept of a measurable mapping, although we concede that the full relevance of this concept will become apparent only in the chapters on conditional expectations. The relevance of measurable mappings is also the reason why chapter 2 is more detailed than the corresponding chapters in other books on measure theory.
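In symbols, the three correspondences just described can be summarized as follows. The notation (Ω, 𝒜, P) for a probability space and (Ω', 𝒜') for the measurable space in which X takes its values is a common convention assumed here; it is not necessarily the notation used in the book.

```latex
\begin{align*}
  &P \colon \mathcal{A} \to [0,1] \text{ is a measure on } (\Omega, \mathcal{A}) \text{ with } P(\Omega) = 1, \\
  &X \colon \Omega \to \Omega' \text{ satisfies } X^{-1}(A') \in \mathcal{A} \text{ for all } A' \in \mathcal{A}', \\
  &E(X) = \int X \, dP \quad \text{(for real-valued $X$ whose integral exists)}.
\end{align*}
```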
Part II of the book is fairly conventional. The material covered - probability, random variable, expectation, variance, covariance, and some distributions - is found in many books on probability and statistics.
Part III is not only the longest; it is also the core of the book and what distinguishes it from other books on probability or on probability and statistics. Only a few of these other books contain detailed chapters on conditional expectations. Exceptions are Billingsley (1995), Fristedt and Gray (1997), and Hoffmann-Jørgensen (1994). Our book does not cover any statistical model. However, we treat in much detail what we are estimating and which hypotheses we test or evaluate in statistical modeling. How we are estimating is important, but what we are estimating is of most interest from the empirical scientist's point of view, and this point is typically neglected in books on statistics and in books on probability theory such as Bauer (1996) or Klenke (2013). A simple example is the meaning of the coefficient β1 in the equation E(Y | X, Z) = β0 + β1X + β2Z + β3ZX. Oftentimes, this coefficient is misinterpreted as the 'main effect' of X. However, sometimes β1 has no autonomous meaning at all, for example if P(Z = 0) = 0. In general, this coefficient is just a component of the function g1(Z) = β1 + β3Z that can be used to compute the conditional effects of X on Y for various values z of Z (see chapter 15 for more details). The crucial point is that such concepts can be treated most clearly within probability theory, without referring to a statistical model, sample, estimation, or testing.
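The role of the function g1 can be illustrated with a few lines of arithmetic. This is a minimal sketch, assuming a 0/1-valued X and hypothetical coefficient values; the four coefficients of the equation above are written `b0` to `b3` in the code. For each value z, the difference E(Y | X = 1, Z = z) − E(Y | X = 0, Z = z) equals g1(z), and the values of z are chosen so that Z never equals 0.

```python
def conditional_expectation(x, z, b0, b1, b2, b3):
    """E(Y | X=x, Z=z) under the parametrization b0 + b1*x + b2*z + b3*z*x."""
    return b0 + b1 * x + b2 * z + b3 * z * x

def g1(z, b1, b3):
    """Conditional effect function g1(z) = b1 + b3*z."""
    return b1 + b3 * z

# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2, b3 = 10.0, 2.0, 1.0, -0.5

# Z takes only the values 1, 2, and 4 here, so Z = 0 never occurs.
for z in (1, 2, 4):
    diff = (conditional_expectation(1, z, b0, b1, b2, b3)
            - conditional_expectation(0, z, b0, b1, b2, b3))
    assert abs(diff - g1(z, b1, b3)) < 1e-12
    print(f"z={z}: conditional effect of X = {diff}")
```

With these values, the conditional effect of X is 1.5, 1.0, and 0.0 for z = 1, 2, and 4, whereas the coefficient b1 = 2.0 is the value g1(0), which corresponds to no value that Z actually takes.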
This also includes exemplifying the limitations of conditional expectations. Simple examples show that conditional expectations do not necessarily serve the purpose of the empirical researcher, which often is to evaluate the effects of an intervention on an outcome variable. But even in these situations, conditional expectations are indispensable for the definition of parameters and other terms of substantive interest (see, e.g., chapter 14).
There is much overlap of Parts II and III with Steyer (2003). However, that book is written in German, and the mathematics is considerably less rigorous. Aside from mathematical precision, the two books also differ in the definition of an important concept: In Steyer (2003), the term regression is used as a synonym of a conditional expectation, whereas in this book we use it as a synonym for the factorization of a conditional expectation, provided that the regressor X is numerical.
In chapter 9, the first chapter of Part III, we gently introduce conditional expectation values and discrete conditional expectations. In chapter 10, we then present the general theory of conditional expectations that has been introduced by Kolmogoroff (1933-1977) and since that time has been treated in many books on probability theory - although much too briefly in order to be intelligible to researchers in empirical sciences. Our chapter on conditional expectations contains many more details and is supplemented by a number of other chapters on important special aspects and special cases.
Such a special aspect is the concept of a residual with respect to a conditional expectation (see chapter 11). Residuals have many interesting properties, and they are used in order to introduce the concepts of conditional variance and covariance, as well as the notion of a partial correlation. We then turn to specific parameterizations of a conditional expectation, including the concepts of a linear regression (chapter 12) and a linear logistic regression (chapter 13). Note that these concepts are introduced as probabilistic concepts. As mentioned, they are what we aim at estimating in applying the corresponding statistical models.
Chapters 14 and 15 provide the probabilistic foundations of the analysis of conditional and average effects of treatments, interventions, or exposure to potentially harmful or beneficial environments. To our knowledge, this material is not found in any other textbook. Note, however, that although these two chapters provide important concepts, they do not cover the theory of causal effects, which is another book project of the first author.
Part IV uses conditional expectations in order to introduce conditional independence (chapter 16) and conditional distributions (chapter 17). Although these two chapters are more extensive than comparable chapters or sections in other books, the material is found in other books on probability theory as well.
For whom is it?
This book has been written for two kinds of readers. The first are applied statisticians and empirical researchers who want to understand in a proper language (i.e., in terms of probability theory) what they estimate and test in their empirical studies. The second kind of readers are mathematicians who want to understand in terms of probability theory what applied statisticians and empirical researchers estimate and test in their research. Both kinds of readers are potential contributors to the methodology of empirical sciences.
Many exercises and their solutions provide extensive material for assignments in courses, but they also facilitate independent learning. At the same time, these exercises and their solutions help streamline the main text.
Note that we do not provide all proofs, in particular in the chapters on measure, integral, and distributions. In these cases, we refer to other textbooks instead. We decided to include only those proofs that may help to increase understanding of the background and to learn important mathematical procedures. Of course, we provide proofs of all propositions for which we did not find an appropriate reference.
Prerequisites
We assume that the reader is familiar with the elementary concepts of logic, sets, functions, sequences, and matrices, as presented for example in chapters 1 and 2 of Rosen (2012). We try to stick to his notation as closely as possible.
One of the exceptions is the symbol for the implication, for which we use ⇒ instead of →. Another exception is the symbol for the equivalence, for which we use ⇔ instead of ↔.
Box 0.1 summarizes the most important notation to start with. The concepts referred to by these symbols are defined, for example, in Rosen (2012) or in Ellis and Gulick (2006). For a rich...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.