
Engineering Biostatistics
Description
Through its scope and depth of coverage, this book addresses the needs of the vibrant and rapidly growing bio-oriented engineering fields while implementing software packages that are familiar to engineers. The book is heavily oriented to computation and hands-on approaches so readers understand each step of the programming. Another dimension of this book is in parallel coverage of both Bayesian and frequentist approaches to statistical inference. It avoids taking sides on the classical vs. Bayesian paradigms, and many examples in this book are solved using both methods. The results are then compared and commented upon. Readers have the choice of MATLAB® for classical data analysis and WinBUGS/OpenBUGS for Bayesian data analysis. Every chapter starts with a box highlighting what is covered in that chapter and ends with exercises, a list of software scripts, datasets, and references.
Engineering Biostatistics: An Introduction using MATLAB® and WinBUGS also includes:
* parallel coverage of classical and Bayesian approaches, where appropriate
* substantial coverage of Bayesian approaches to statistical inference
* material that has been classroom-tested in an introductory statistics course in bioengineering over several years
* exercises at the end of each chapter and an accompanying website with full solutions and hints to some exercises, as well as additional materials and examples
Engineering Biostatistics: An Introduction using MATLAB® and WinBUGS can serve as a textbook for introductory-to-intermediate applied statistics courses, as well as a useful reference for engineers interested in biostatistical approaches.

Person
BRANI VIDAKOVIC, PhD, is a Professor in the School of Industrial and Systems Engineering (ISyE) at Georgia Institute of Technology and Department of Biomedical Engineering at Georgia Institute of Technology/Emory University. Dr. Vidakovic is a Fellow of the American Statistical Association, Elected Member of the International Statistical Institute, an Editor-in-Chief of Encyclopedia of Statistical Sciences, Second Edition, and former and current Associate Editor of several leading journals in the field of statistics.
Content
Preface v
1 Introduction 1
Chapter References 7
2 The Sample and Its Properties 9
2.1 Introduction 9
2.2 A MATLAB Session on Univariate Descriptive Statistics 10
2.3 Location Measures 12
2.4 Variability Measures 15
2.4.1 Ranks 24
2.5 Displaying Data 25
2.6 Multidimensional Samples: Fisher's Iris Data and Body Fat Data 29
2.7 Multivariate Samples and Their Summaries 35
2.8 Principal Components of Data 40
2.9 Visualizing Multivariate Data 45
2.10 Observations as Time Series 49
2.11 About Data Types 52
2.12 Big Data Paradigm 53
2.13 Exercises 55
Chapter References 70
3 Probability, Conditional Probability, and Bayes' Rule 73
3.1 Introduction 73
3.2 Events and Probability 74
3.3 Odds 85
3.4 Venn Diagrams 86
3.5 Counting Principles 88
3.6 Conditional Probability and Independence 92
3.6.1 Pairwise and Global Independence 97
3.7 Total Probability 97
3.8 Reassessing Probabilities: Bayes' Rule 100
3.9 Bayesian Networks 105
3.10 Exercises 111
Chapter References 130
4 Sensitivity, Specificity, and Relatives 133
4.1 Introduction 133
4.2 Notation 134
4.2.1 Conditional Probability Notation 138
4.3 Combining Two or More Tests 141
4.4 ROC Curves 144
4.5 Exercises 149
Chapter References 157
5 Random Variables 159
5.1 Introduction 159
5.2 Discrete Random Variables 161
5.2.1 Jointly Distributed Discrete Random Variables 166
5.3 Some Standard Discrete Distributions 169
5.3.1 Discrete Uniform Distribution 169
5.3.2 Bernoulli and Binomial Distributions 170
5.3.3 Hypergeometric Distribution 174
5.3.4 Poisson Distribution 177
5.3.5 Geometric Distribution 180
5.3.6 Negative Binomial Distribution 183
5.3.7 Multinomial Distribution 184
5.3.8 Quantiles 186
5.4 Continuous Random Variables 187
5.4.1 Joint Distribution of Two Continuous Random Variables 192
5.4.2 Conditional Expectation 193
5.5 Some Standard Continuous Distributions 195
5.5.1 Uniform Distribution 196
5.5.2 Exponential Distribution 198
5.5.3 Normal Distribution 200
5.5.4 Gamma Distribution 201
5.5.5 Inverse Gamma Distribution 203
5.5.6 Beta Distribution 203
5.5.7 Double Exponential Distribution 205
5.5.8 Logistic Distribution 206
5.5.9 Weibull Distribution 207
5.5.10 Pareto Distribution 208
5.5.11 Dirichlet Distribution 209
5.6 Random Numbers and Probability Tables 210
5.7 Transformations of Random Variables 211
5.8 Mixtures 214
5.9 Markov Chains 215
5.10 Exercises 219
Chapter References 232
6 Normal Distribution 235
6.1 Introduction 235
6.2 Normal Distribution 236
6.2.1 Sigma Rules 240
6.2.2 Bivariate Normal Distribution 241
6.3 Examples with a Normal Distribution 243
6.4 Combining Normal Random Variables 246
6.5 Central Limit Theorem 249
6.6 Distributions Related to Normal 253
6.6.1 Chi-square Distribution 254
6.6.2 t-Distribution 258
6.6.3 Cauchy Distribution 259
6.6.4 F-Distribution 260
6.6.5 Noncentral χ², t, and F Distributions 262
6.6.6 Lognormal Distribution 263
6.7 Delta Method and Variance Stabilizing Transformations 265
6.8 Exercises 268
Chapter References 274
7 Point and Interval Estimators 277
7.1 Introduction 277
7.2 Moment Matching and Maximum Likelihood Estimators 278
7.2.1 Unbiasedness and Consistency of Estimators 285
7.3 Estimation of a Mean, Variance, and Proportion 288
7.3.1 Point Estimation of Mean 288
7.3.2 Point Estimation of Variance 290
7.3.3 Point Estimation of Population Proportion 294
7.4 Confidence Intervals 295
7.4.1 Confidence Intervals for the Normal Mean 296
7.4.2 Confidence Interval for the Normal Variance 299
7.4.3 Confidence Intervals for the Population Proportion 302
7.4.4 Confidence Intervals for Proportions When X = 0 306
7.4.5 Designing the Sample Size with Confidence Intervals 307
7.5 Prediction and Tolerance Intervals 309
7.6 Confidence Intervals for Quantiles 311
7.7 Confidence Intervals for the Poisson Rate 312
7.8 Exercises 315
Chapter References 328
8 Bayesian Approach to Inference 331
8.1 Introduction 331
8.2 Ingredients for Bayesian Inference 334
8.3 Conjugate Priors 338
8.4 Point Estimation 340
8.4.1 Normal-Inverse Gamma Conjugate Analysis 343
8.5 Prior Elicitation 345
8.6 Bayesian Computation and Use of WinBUGS 348
8.6.1 Zero Tricks in WinBUGS 351
8.7 Bayesian Interval Estimation: Credible Sets 353
8.8 Learning by Bayes' Theorem 357
8.9 Bayesian Prediction 358
8.10 Consensus Means 362
8.11 Exercises 365
Chapter References 372
9 Testing Statistical Hypotheses 375
9.1 Introduction 375
9.2 Classical Testing Problem 377
9.2.1 Choice of Null Hypothesis 377
9.2.2 Test Statistic, Rejection Regions, Decisions, and Errors in Testing 379
9.2.3 Power of the Test 380
9.2.4 Fisherian Approach: p-Values 381
9.3 Bayesian Approach to Testing 382
9.3.1 Criticism and Calibration of p-Values 386
9.4 Testing the Normal Mean 388
9.4.1 z-Test 389
9.4.2 Power Analysis of a z-Test 389
9.4.3 Testing a Normal Mean When the Variance Is Not Known: t-Test 391
9.4.4 Power Analysis of t-Test 394
9.5 Testing Multivariate Mean: T-Square Test* 397
9.5.1 T-Square Test 397
9.5.2 Test for Symmetry 401
9.6 Testing the Normal Variances 402
9.7 Testing the Proportion 404
9.7.1 Exact Test for Population Proportions 406
9.7.2 Bayesian Test for Population Proportions 409
9.8 Multiplicity in Testing, Bonferroni Correction, and False Discovery Rate 412
9.9 Exercises 415
Chapter References 425
10 Two Samples 427
10.1 Introduction 427
10.2 Means and Variances in Two Independent Normal Populations 428
10.2.1 Confidence Interval for the Difference of Means 433
10.2.2 Power Analysis for Testing Two Means 434
10.2.3 More Complex Two-Sample Designs 438
10.2.4 A Bayesian Test for Two Normal Means 439
10.3 Testing the Equality of Normal Means When Samples Are Paired 443
10.3.1 Sample Size in Paired t-Test 448
10.3.2 Difference-in-Differences (DiD) Tests 449
10.4 Two Multivariate Normal Means 451
10.4.1 Confidence Intervals for Arbitrary Linear Combinations of Mean Differences 453
10.4.2 Profile Analysis With Two Independent Groups 454
10.4.3 Paired Multivariate Samples 456
10.5 Two Normal Variances 459
10.6 Comparing Two Proportions 463
10.6.1 The Sample Size 465
10.7 Risk Differences, Risk Ratios, and Odds Ratios 466
10.7.1 Risk Differences 466
10.7.2 Risk Ratio 467
10.7.3 Odds Ratios 469
10.7.4 Two Proportions from a Single Sample 473
10.8 Two Poisson Rates 476
10.9 Equivalence Tests 479
10.10 Exercises 483
Chapter References 500
11 ANOVA and Elements of Experimental Design 503
11.1 Introduction 503
11.2 One-Way ANOVA 504
11.2.1 ANOVA Table and Rationale for F-Test 506
11.2.2 Testing Assumption of Equal Population Variances 509
11.2.3 The Null Hypothesis Is Rejected. What Next? 511
11.2.4 Bayesian Solution 516
11.2.5 Fixed- and Random-Effect ANOVA 518
11.3 Welch's ANOVA 518
11.4 Two-Way ANOVA and Factorial Designs 521
11.4.1 Two-way ANOVA: One Observation Per Cell 527
11.5 Blocking 529
11.6 Repeated Measures Design 531
11.6.1 Sphericity Tests 534
11.7 Nested Designs 535
11.8 Power Analysis in ANOVA 539
11.9 Functional ANOVA 545
11.10 Analysis of Means (ANOM) 548
11.11 Gauge R&R ANOVA 550
11.12 Testing Equality of Several Proportions 556
11.13 Testing the Equality of Several Poisson Means 557
11.14 Exercises 559
Chapter References 582
12 Models for Tables 585
12.1 Introduction 586
12.2 Contingency Tables: Testing for Independence 586
12.2.1 Measuring Association in Contingency Tables 591
12.2.2 Power Analysis for Contingency Tables 593
12.2.3 Cohen's Kappa 594
12.3 Three-Way Tables 596
12.4 Fisher's Exact Test 600
12.5 Stratified Tables: Mantel-Haenszel Test 603
12.5.1 Testing Conditional Independence or Homogeneity 604
12.5.2 Odds Ratio from Stratified Tables 607
12.6 Paired Tables: McNemar's Test 608
12.7 Risk Differences, Risk Ratios, and Odds Ratios for Paired Tables 610
12.7.1 Risk Differences 610
12.7.2 Risk Ratios 611
12.7.3 Odds Ratios 612
12.7.4 Liddell's Procedure 617
12.7.5 Garth Test 619
12.7.6 Stuart-Maxwell Test 620
12.7.7 Cochran's Q Test* 626
12.8 Exercises 628
Chapter References 643
13 Correlation 647
13.1 Introduction 647
13.2 The Pearson Coefficient of Correlation 648
13.2.1 Inference About ρ 650
13.2.2 Bayesian Inference for Correlation Coefficients 663
13.3 Spearman's Coefficient of Correlation 665
13.4 Kendall's Tau 667
13.5 Cum hoc ergo propter hoc 670
13.6 Exercises 671
Chapter References 677
14 Regression 679
14.1 Introduction 679
14.2 Simple Linear Regression 680
14.2.1 Inference in Simple Linear Regression 688
14.3 Calibration 697
14.4 Testing the Equality of Two Slopes 699
14.5 Multiple Regression 702
14.5.1 Matrix Notation 703
14.5.2 Residual Analysis, Influential Observations, Multicollinearity, and Variable Selection 709
14.6 Sample Size in Regression 720
14.7 Linear Regression That Is Nonlinear in Predictors 720
14.8 Errors-In-Variables Linear Regression 723
14.9 Analysis of Covariance 724
14.9.1 Sample Size in ANCOVA 728
14.9.2 Bayesian Approach to ANCOVA 729
14.10 Exercises 731
Chapter References 748
15 Regression for Binary and Count Data 751
15.1 Introduction 751
15.2 Logistic Regression 752
15.2.1 Fitting Logistic Regression 753
15.2.2 Assessing the Logistic Regression Fit 758
15.2.3 Probit and Complementary Log-Log Links 769
15.3 Poisson Regression 773
15.4 Log-linear Models 779
15.5 Exercises 783
Chapter References 798
16 Inference for Censored Data and Survival Analysis 801
16.1 Introduction 801
16.2 Definitions 802
16.3 Inference with Censored Observations 807
16.3.1 Parametric Approach 807
16.3.2 Nonparametric Approach: Kaplan-Meier or Product-Limit Estimator 809
16.3.3 Comparing Survival Curves 815
16.4 The Cox Proportional Hazards Model 818
16.5 Bayesian Approach 822
16.5.1 Survival Analysis in WinBUGS 823
16.6 Exercises 829
Chapter References 835
17 Goodness of Fit Tests 837
17.1 Introduction 837
17.2 Probability Plots 838
17.2.1 Q-Q Plots 838
17.2.2 P-P Plots 841
17.2.3 Poissonness Plots 842
17.3 Pearson's Chi-Square Test 843
17.4 Kolmogorov-Smirnov Tests 852
17.4.1 Kolmogorov's Test 852
17.4.2 Smirnov's Test to Compare Two Distributions 854
17.5 Cramér-von Mises and Watson's Tests 858
17.5.1 Rosenblatt's Test 860
17.6 Moran's Test 862
17.7 Departures from Normality 863
17.7.1 Elimination of Unknown Parameters by Transformations 866
17.8 Exercises 867
Chapter References 876
18 Distribution-Free Methods 879
18.1 Introduction 879
18.2 Sign Test 880
18.3 Wilcoxon Signed-Rank Test 884
18.4 Wilcoxon Sum Rank Test and Mann-Whitney Test 887
18.5 Kruskal-Wallis Test 890
18.6 Friedman's Test 894
18.7 Resampling Methods 898
18.7.1 The Jackknife 898
18.7.2 Bootstrap 901
18.7.3 Bootstrap Versions of Some Popular Tests 908
18.7.4 Randomization and Permutation Tests 916
18.7.5 Discussion 919
18.8 Exercises 919
Chapter References 929
19 Bayesian Inference Using Gibbs Sampling - BUGS Project 931
19.1 Introduction 931
19.2 Step-by-Step Session 932
19.3 Built-in Functions and Common Distributions in WinBUGS 937
19.4 MATBUGS: A MATLAB Interface to WinBUGS 938
19.5 Exercises 942
Chapter References 943
Index 945
Chapter 1
Introduction
Many people were at first surprised at my using the new words "Statistics" and "Statistical," as it was supposed that some term in our own language might have expressed the same meaning. But in the course of a very extensive tour through the northern parts of Europe, which I happened to take in 1786, I found that in Germany they were engaged in a species of political inquiry to which they had given the name of "Statistics" ... I resolved on adopting it, and I hope that it is now completely naturalised and incorporated with our language.
- Sinclair, 1791; Vol XX
WHAT IS COVERED IN THIS CHAPTER
- What is the subject of statistics?
- Population, sample, data
- Appetizer examples
The problems confronting health professionals today often involve fundamental aspects of device and system analysis, and their design and application. As such they are of extreme importance to engineers and scientists.
Due to many aspects of engineering and scientific practice involving nondeterministic outcomes, understanding and knowledge of statistics is important to any engineer and scientist. Statistics is a guide to the unknown. It is a science that deals with designing experimental protocols; collecting, summarizing, and presenting data; and, most important, making inferences and aiding decisions in the presence of variability and uncertainty. For example, R. A. Fisher's 1943 elucidation of the human blood-group system Rhesus in terms of the three linked loci C, D, and E, as described in Fisher (1947) or Edwards (2007), is a brilliant example of building a coherent structure of new knowledge guided by a statistical analysis of available experimental data.
The uncertainty that statistical science addresses derives mainly from two sources: (1) from observing only a part of an existing, fixed, but large population or (2) from having a process that results in nondeterministic outcomes. At least a part of the process needs to be either a black box or inherently stochastic, so the outcomes cannot be predicted with certainty.
A population is a statistical universe. It is defined as a collection of existing attributes of some natural phenomenon or a collection of potential attributes when a process is involved. In the case of a process, the underlying population is called hypothetical, for obvious reasons. Thus, populations can be either finite or infinite. A subset of a population selected by some relevant criteria is called a subpopulation.
Often we think about a population as an assembly of people, animals, items, events, times, etc., in which the attribute of interest is measurable. For example, the population of all US citizens older than 21 is an example of a population for which many attributes can be assessed. Attributes might be a history of heart disease, weight, political affiliation, level of blood sugar, etc.
A sample is an observed part of a population. Selection of a sample is a rich methodology in itself, but, unless otherwise specified, it is assumed that the sample is selected at random. The randomness ensures that the sample is representative of its population.
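Random selection from a finite population can be sketched in a few lines. The population below is synthetic (10,000 simulated blood sugar values with an assumed mean of 100 mg/dL and standard deviation of 15 mg/dL), and the sample size of 200 is an arbitrary choice for illustration. None of these numbers come from the book.

```python
import random
from statistics import mean

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical finite population: synthetic blood sugar levels in mg/dL.
population = [random.gauss(100, 15) for _ in range(10_000)]

# Simple random sample without replacement: every subset of size 200
# is equally likely to be selected.
sample = random.sample(population, 200)

print(f"population mean: {mean(population):.1f}")
print(f"sample mean:     {mean(sample):.1f}")
```

With random selection the sample mean tends to fall close to the population mean, and the gap typically shrinks as the sample size grows.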
The sampling process depends on the nature of the problem and the population. For example, a sample may be obtained via a retrospective study (usually existing historical outcomes over some period of time), an observational study (an observer monitors the process or population in real time), a sample survey (a researcher administers a questionnaire to measure the characteristics and/or attitudes of subjects), or a designed study (a researcher makes deliberate changes in controllable variables to induce a cause/effect relationship), to name just a few.
Example 1.1. Ohm's Law Measurements. A student constructed a simple electric circuit in which the resistance R and voltage E were controllable. The output of interest is current I, and according to Ohm's law it is $I = E/R$.
This is a mechanistic, theoretical model. In a finite number of measurements under an identical R, E setting, the measured current varies. The population here is hypothetical - an infinite collection of all potentially obtainable measurements of its attribute, current I. The observed sample is finite. In the presence of sample variability, one establishes an empirical (statistical) model for currents from the population as either $I = E/R + \epsilon$ or $I = \epsilon E/R$.
On the basis of a sample, one may first select the model and then proceed with the inference about the nature of the discrepancy, $\epsilon$.
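To make the idea of a hypothetical population concrete, the sketch below simulates repeated current measurements at one fixed R, E setting under two candidate error structures: an additive discrepancy, I = E/R + ε, and a multiplicative one, I = ε·E/R. The voltage, resistance, error distributions, and their spreads are all assumptions chosen for illustration, not values from the book.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is reproducible

E, R = 12.0, 4.0   # hypothetical voltage (V) and resistance (ohm)
ideal = E / R      # current predicted by Ohm's law (A)
n = 1000           # number of simulated measurements

# Additive model: I = E/R + eps, with eps ~ Normal(0, 0.05) (assumed).
additive = [ideal + random.gauss(0, 0.05) for _ in range(n)]

# Multiplicative model: I = eps * E/R, with eps lognormal around 1 (assumed).
multiplicative = [ideal * random.lognormvariate(0, 0.02) for _ in range(n)]

for name, data in (("additive", additive), ("multiplicative", multiplicative)):
    print(f"{name:>14}: mean = {mean(data):.4f} A, sd = {stdev(data):.4f} A")
```

Both simulated samples scatter around the theoretical value E/R. A multiplicative error with a lognormal factor keeps every simulated current positive, which is one reason such a form might be preferred for strictly positive quantities.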
Example 1.2. Cell Counts. In a quantitative engineering physiology laboratory, a team of four students was asked to make a LabVIEW© program to automatically count MC3T3-E1 cells in a hemocytometer (Fig. 1.1). This automatic count was to be compared with the manual count collected through an inverted bright field microscope. The manual count is considered the gold standard.
Fig. 1.1 Cells on a hemocytometer plate.
The experiment consisted of placing 10 µL of cell solutions at two levels of cell confluency: 20% and 70%. There were n1 = 12 pairs of measurements (automatic and manual counts) at 20% and n2 = 10 pairs at 70%, as in the table below.
20% confluency
Automated  34 44 40 62 53 51 30 33 38 51 26 48
Manual     30 43 34 53 49 39 37 42 30 50 35 54
70% confluency
Automated  72 82 100 94 83 94 73 87 107 102
Manual     76 51 92 77 74 81 72 87 100 104
The students wish to answer the following questions:
- Are the automated and manual counts significantly different for a fixed confluency level? What are the confidence intervals for the population differences if normality of the measurements is assumed?
- If the difference between automated and manual counts constitutes an error, are the errors comparable for the two confluency levels?
We will revisit this example later in the book (Exercise 10.20) and see that for the 20% confluency level there is no significant difference between the automated and manual counts, whereas for the 70% level the difference is significant. We will also see that the errors for the two confluency levels significantly differ. The statistical design for comparison of errors is called a difference in differences (DiD) and is quite common in biomedical data analysis.
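As a preview of the first question, a paired t-analysis can be sketched with the counts from the table. This is a minimal illustration in Python rather than the MATLAB solution the book gives in Exercise 10.20. It assumes normally distributed paired differences and uses tabulated 0.975 quantiles of the t-distribution (2.201 for 11 degrees of freedom, 2.262 for 9). The function name `paired_summary` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

automated_20 = [34, 44, 40, 62, 53, 51, 30, 33, 38, 51, 26, 48]
manual_20 = [30, 43, 34, 53, 49, 39, 37, 42, 30, 50, 35, 54]
automated_70 = [72, 82, 100, 94, 83, 94, 73, 87, 107, 102]
manual_70 = [76, 51, 92, 77, 74, 81, 72, 87, 100, 104]

# Tabulated 0.975 quantiles of the t-distribution, keyed by degrees of freedom.
T_CRIT = {11: 2.201, 9: 2.262}


def paired_summary(auto, manual):
    """Return mean difference, t statistic, and 95% CI for paired counts."""
    diffs = [a - m for a, m in zip(auto, manual)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(n)      # standard error of the mean difference
    t_stat = d_bar / se              # tests H0: mean difference = 0
    half = T_CRIT[n - 1] * se        # half-width of the 95% interval
    return d_bar, t_stat, (d_bar - half, d_bar + half)


summary_20 = paired_summary(automated_20, manual_20)
summary_70 = paired_summary(automated_70, manual_70)

for label, (d_bar, t_stat, ci) in (("20%", summary_20), ("70%", summary_70)):
    print(f"{label}: mean diff = {d_bar:.2f}, t = {t_stat:.3f}, "
          f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Under these assumptions the 95% interval at 20% confluency covers zero while the interval at 70% does not, which agrees with the conclusion stated above.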
Example 1.3. Rana Pipiens. Students in a quantitative engineering physiology laboratory were asked to expose the gastrocnemius muscle of the northern leopard frog (Rana pipiens) and stimulate the sciatic nerve to observe contractions in the skeletal muscle. Students were interested in modeling the length-tension relationship. The force used was the active force, calculated by subtracting the measured passive force (no stimulation) from the total force (with stimulation).
The active force represents the dependent variable. The length of the muscle begins at 35 mm and stretches in increments of 0.5 mm, until a maximum length of 42.5 mm is achieved. The velocity at which the muscle was stretched was held constant at 0.5 mm/s.
Reading Change in Length (in %) Passive force Total...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books – i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.