
Nonparametric Statistics with Applications to Science and Engineering with R
Description
Introduction to the methods and techniques of traditional and modern nonparametric statistics, incorporating R code
Nonparametric Statistics with Applications to Science and Engineering with R presents modern nonparametric statistics from a practical point of view. The newly revised edition includes custom R functions that implement nonparametric methods, showing how the methods are computed and making them easier to understand.
Relevant built-in functions and CRAN packages are also presented with sample code. The R code in the new edition enables readers not only to perform nonparametric analyses easily, but also to visualize and explore data using R's graphics systems, such as the ggplot2 package and base R graphics.
The new edition includes useful tables at the end of each chapter that help the reader find data sets, files, functions, and packages that are used and relevant to the respective chapter. New examples and exercises that enable readers to gain a deeper insight into nonparametric statistics and increase their comprehension are also included.
Some of the sample topics discussed in Nonparametric Statistics with Applications to Science and Engineering with R include:
* Basics of probability, statistics, Bayesian statistics, order statistics, Kolmogorov-Smirnov test statistics, rank tests, and designed experiments
* Categorical data, estimating distribution functions, density estimation, least squares regression, curve fitting techniques, wavelets, and bootstrap sampling
* EM algorithms, statistical learning, nonparametric Bayes, WinBUGS, properties of ranks, and Spearman coefficient of rank correlation
* Chi-square and goodness-of-fit, contingency tables, Fisher exact test, McNemar test, Cochran's test, Mantel-Haenszel test, and empirical likelihood
Nonparametric Statistics with Applications to Science and Engineering with R is a highly valuable resource for graduate students in engineering and the physical and mathematical sciences, as well as researchers who need a more comprehensive but succinct understanding of modern nonparametric statistical methods.


Persons
Paul Kvam is a professor in the Department of Mathematics, University of Richmond, USA. He received his Ph.D. from the University of California, Davis.
Brani Vidakovic is a professor in the Department of Statistics, Texas A&M University, USA. He received his Ph.D. from Purdue University.
Seong-joon Kim is an assistant professor in the Department of Industrial Engineering, Chosun University, South Korea. He received his Ph.D. from Hanyang University.
Content
Preface
1 Introduction
1.1 Efficiency of Nonparametric Methods
1.2 Overconfidence Bias
1.3 Computing with R
1.4 Exercises
References
2 Probability Basics
2.1 Helpful Functions
2.2 Events, Probabilities and Random Variables
2.3 Numerical Characteristics of Random Variables
2.4 Discrete Distributions
2.5 Continuous Distributions
2.6 Mixture Distributions
2.7 Exponential Family of Distributions
2.8 Stochastic Inequalities
2.9 Convergence of Random Variables
2.10 Exercises
References
3 Statistics Basics
3.1 Estimation
3.2 Empirical Distribution Function
3.3 Statistical Tests
3.4 Confidence Intervals
3.5 Likelihood
3.6 Exercises
References
4 Bayesian Statistics
4.1 The Bayesian Paradigm
4.2 Ingredients for Bayesian Inference
4.3 Point Estimation
4.4 Interval Estimation: Credible Sets
4.5 Bayesian Testing
4.6 Bayesian Prediction
4.7 Bayesian Computation and Use of WinBUGS
4.8 Exercises
References
5 Order Statistics
5.1 Joint Distributions of Order Statistics
5.2 Sample Quantiles
5.3 Tolerance Intervals
5.4 Asymptotic Distributions of Order Statistics
5.5 Extreme Value Theory
5.6 Ranked Set Sampling
5.7 Exercises
References
6 Goodness of Fit
6.1 Kolmogorov-Smirnov Test Statistic
6.2 Smirnov Test to Compare Two Distributions
6.3 Specialized Tests
6.4 Probability Plotting
6.5 Runs Test
6.6 Meta Analysis
6.7 Exercises
References
7 Rank Tests
7.1 Properties of Ranks
7.2 Sign Test
7.3 Spearman Coefficient of Rank Correlation
7.4 Wilcoxon Signed Rank Test
7.5 Wilcoxon (Two-Sample) Sum Rank Test
7.6 Mann-Whitney U Test
7.7 Test of Variances
7.8 Walsh Test for Outliers
7.9 Exercises
References
8 Designed Experiments
8.1 Kruskal-Wallis Test
8.2 Friedman Test
8.3 Variance Test for Several Populations
8.4 Exercises
References
9 Categorical Data
9.1 Chi-Square and Goodness-of-Fit
9.2 Contingency Tables
9.3 Fisher Exact Test
9.4 McNemar Test
9.5 Cochran's Test
9.6 Mantel-Haenszel Test
9.7 CLT for Multinomial Probabilities
9.8 Simpson's Paradox
9.9 Exercises
References
10 Estimating Distribution Functions
10.1 Introduction
10.2 Nonparametric Maximum Likelihood
10.3 Kaplan-Meier Estimator
10.4 Confidence Interval for F
10.5 Plug-in Principle
10.6 Semi-Parametric Inference
10.7 Empirical Processes
10.8 Empirical Likelihood
10.9 Exercises
References
11 Density Estimation
11.1 Histogram
11.2 Kernel and Bandwidth
11.3 Exercises
References
12 Beyond Linear Regression
12.1 Least Squares Regression
12.2 Rank Regression
12.3 Robust Regression
12.4 Isotonic Regression
12.5 Generalized Linear Models
12.6 Exercises
References
13 Curve Fitting Techniques
13.1 Kernel Estimators
13.2 Nearest Neighbor Methods
13.3 Variance Estimation
13.4 Splines
13.5 Summary
13.6 Exercises
References
14 Wavelets
14.1 Introduction to Wavelets
14.2 How Do the Wavelets Work?
14.3 Wavelet Shrinkage
14.4 Exercises
References
15 Bootstrap
15.1 Bootstrap Sampling
15.2 Nonparametric Bootstrap
15.3 Bias Correction for Nonparametric Intervals
15.4 The Jackknife
15.5 Bayesian Bootstrap
15.6 Permutation Tests
15.7 More on the Bootstrap
15.8 Exercises
References
16 EM Algorithm
16.1 Fisher's Example
16.2 Mixtures
16.3 EM and Order Statistics
16.4 MAP via EM
16.5 Infection Pattern Estimation
16.6 Exercises
References
17 Statistical Learning
17.1 Discriminant Analysis
17.2 Linear Classification Models
17.3 Nearest Neighbor Classification
17.4 Neural Networks
17.5 Binary Classification Trees
17.6 Exercises
References
18 Nonparametric Bayes
18.1 Dirichlet Processes
18.2 Bayesian Categorical Models
18.3 Infinitely Dimensional Problems
18.4 Exercises
References
A WinBUGS
A.1 Using WinBUGS
A.2 Built-in Functions
B R Coding
B.1 Programming in R
B.2 Basics of R
B.3 R Commands
B.4 R for Statistics
R Index
Author Index
Subject Index
Preface
Danger lies not in what we don't know, but in what we think we know that just ain't so.
Mark Twain (1835-1910)
This textbook is a substantial revision of a previous textbook written in 2007 by Kvam and Vidakovic. The biggest difference in this version is the adoption of the R programming language as a supplementary learning tool for the purpose of teaching concepts, illustrating examples, and completing computational homework assignments. In the original book, the authors relied on Matlab.
There has been plenty of change in the world of nonparametric statistics since we finished the first edition of this book. While the statistics community had already adapted to a modern framework for data analysis that relies increasingly on nonparametric procedures (not to mention Bayesian alternatives to traditional inference), we sense more adopters in engineering, medical research, chemistry, biology, and especially the behavioral sciences with each passing year. However, the field of nonparametric statistics has also receded toward the periphery of the statistics curriculum in the wake of data science, which continues to encroach on graduate curriculums associated with statistics, causing more programs to replace traditional statistics courses with the trendier versions involving data structures.
There are quality monographs/texts dealing with nonparametric statistics, such as the encyclopedic book by Hollander and Wolfe, Nonparametric Statistical Methods, or the excellent book by Conover, Practical Nonparametric Statistics, which has served as a staple for a generation of professors tasked to teach a course in this subject. Before engaging in writing the first version of this textbook, we taught several iterations of a graduate course on nonparametric statistics at Georgia Tech. The audience consisted of MS and PhD students in Engineering Statistics, Electrical Engineering, Bioengineering, Management, Logistics, Applied Mathematics, and Physics. While comprising a nonhomogeneous group, all of the students had solid mathematical, programming, and statistical training needed to benefit from the course.
In our course, we relied on the third edition of Conover's book, which is mainly concerned with what most of us think of as traditional nonparametric statistics: proportions, ranks, categorical data, goodness of fit, and so on, with the understanding that the text would be supplemented by the instructor's handouts. We ended up supplying an increasing number of handouts every year, for units such as density and function estimation, wavelets, Bayesian approaches to nonparametric problems, the EM algorithm, splines, machine learning, and other arguably modern nonparametric topics. Later on, we decided to merge the handouts and fill the gaps.
With this new edition, we adhere to the traditional form one expects in an academic textbook, but we aim to provide more informal discussion and commentary to balance with the regimen of lessons that help the student progress through a statistics methods course. Unlike newer books that focus on data science, we want to help the student learn more than just how to implement a statistical procedure. We want them to understand, to a higher degree, what they are doing (or what R is doing for them).
We hope the book provides all of the tools and motivation for a student to study methods of nonparametric statistics, but we also aim to keep a conversational tone in our writing. Reading math-infused textbooks can be challenging, but it need not be drudgery. For that reason, we remind the reader of the bigger picture, including the historical and cultural aspects linked to the development and application of nonparametric procedures. We think it is important to acknowledge the fundamental contributions to the field of nonparametric statistics by not only our field's pioneers, such as Karl Pearson, Nathan Mantel, or Brad Efron, but also others in our vanguard, including François-Marie Arouet (Voltaire), Karl Popper, and Baron Von Munchausen.
Computing. The book is integrated with R, and for many procedures covered in this book, we feature R subroutines and packages (free libraries of code). The choice of software was natural: engineers, scientists, and increasingly statisticians are communicating in the "R language." R is an open-source language for statistical computing and is quickly emerging as the standard environment for research and development. R provides a wide variety of packages for performing various kinds of analyses, along with powerful graphics components. For Bayesian calculation we previously relied on WinBUGS, free software from Cambridge's Biostatistics Research Unit. Both R and WinBUGS are briefly covered in two appendices for readers less familiar with them. For R programmers who want to see a variety of programming modules for nonparametric inference in the R language, we refer you to the R-series guide Nonparametric Statistical Methods Using R by Kloke and McKean.
Outline of Chapters. For a typical graduate student to cover the full breadth of this textbook, two semesters would be required. For a one-semester course, the instructor should necessarily cover Chapters 1-3 and 5-9 to start. Depending on the scope of the class, the last part of the course can include different chapter selections.
Chapters 2-4 contain important background material the student needs to understand to effectively learn and apply the methods taught in a nonparametric analysis course. Because the ranks of observations have special importance in a nonparametric analysis, Chapter 5 presents basic results for order statistics and includes statistical methods to create tolerance intervals.
Traditional topics in estimation and testing are presented in Chapters 7-10 and should receive emphasis even for students who are most curious about advanced topics such as density estimation (Chapter 11), curve fitting (Chapter 13), and wavelets (Chapter 14). These topics include a core of rank tests that are analogous to common parametric procedures (e.g., t-tests, analysis of variance).
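To make the analogy between rank tests and parametric procedures concrete, here is a minimal sketch in base R. It is not taken from the book: the data are simulated, and the object names (x, y, t_res, w_res) are illustrative assumptions. It runs a two-sample t-test and its rank-based counterpart, the Wilcoxon rank-sum test, on the same heavy-tailed samples.

```r
# Simulated data (an assumption for illustration, not from the book):
# two heavy-tailed samples that differ by a location shift of 1.
set.seed(1)
x <- rt(30, df = 3)
y <- rt(30, df = 3) + 1

# Parametric procedure: Welch two-sample t-test comparing means.
t_res <- t.test(x, y)

# Rank-based analogue: Wilcoxon rank-sum (Mann-Whitney) test.
w_res <- wilcox.test(x, y)

# Put the two p-values side by side for comparison.
p_values <- c(t_test = t_res$p.value, wilcoxon = w_res$p.value)
print(p_values)
```

The rank test uses only the ordering of the pooled observations, so a few extreme values affect it less than they affect the t-test.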
Basic methods of categorical data analysis are contained in Chapter 9. Although most students in the biological sciences are exposed to a wide variety of statistical methods for categorical data, engineering students and other students in the physical sciences typically receive less schooling in this quintessential branch of statistics. Topics include methods based on tabled data, chi-square tests, and the introduction of general linear models. Also included in the first part of the book is the topic of "goodness of fit" (Chapter 6), which refers to testing data not in terms of some unknown parameters, but the unknown distribution that generated it. In a way, goodness of fit represents an interface between distribution-free methods and traditional parametric methods of inference, and both analytical and graphical procedures are presented. Chapter 10 presents the nonparametric alternative to maximum likelihood estimation and likelihood ratio-based confidence intervals.
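As a rough illustration of goodness of fit, the sketch below uses the base R function ks.test to compare a sample against two fully specified candidate distributions. The simulated data and the object names (z, ks_exp, ks_norm) are assumptions made for this example and do not come from the book.

```r
# Simulated sample (an assumption for illustration): 50 draws from Exp(1).
set.seed(2)
z <- rexp(50, rate = 1)

# Kolmogorov-Smirnov test against the exponential distribution with rate 1.
ks_exp <- ks.test(z, "pexp", rate = 1)

# Same test against a normal distribution with matching mean and sd (both 1).
ks_norm <- ks.test(z, "pnorm", mean = 1, sd = 1)

# The statistic is the largest gap between the empirical and hypothesized CDFs.
print(c(D_exp = unname(ks_exp$statistic), D_norm = unname(ks_norm$statistic)))
print(c(p_exp = ks_exp$p.value, p_norm = ks_norm$p.value))
```

Both hypothesized distributions are specified in advance here. Estimating their parameters from the same data would change the null distribution of the test statistic.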
The term "regression" is familiar from your previous course that introduced you to statistical methods. Nonparametric regression provides an alternative method of analysis that requires fewer assumptions about the response variable. In Chapter 12, we use the regression platform to introduce other important topics that build on linear regression, including isotonic (constrained) regression, robust regression, and generalized linear models. In Chapter 13, we introduce more general curve fitting methods. Regression models based on wavelets (Chapter 14) are presented in a separate chapter.
In the latter part of the book, emphasis is placed on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. Beyond the conspicuous rank tests, this text includes many of the newest nonparametric tools available to experimenters for data analysis. Chapter 17 introduces fundamental topics of statistical learning as a basis for data mining and pattern recognition and includes discriminant analysis, nearest-neighbor classifiers, neural networks, and binary classification trees. Computational tools needed for nonparametric analysis include bootstrap resampling (Chapter 15) and the EM algorithm (Chapter 16). Bootstrap methods, in particular, have become indispensable for uncertainty analysis with large data sets and elaborate stochastic models.
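The idea behind bootstrap resampling can be sketched in a few lines of base R. This is a minimal example under stated assumptions, not the book's code: the data are simulated, the number of resamples B is an arbitrary choice, and the simple percentile interval is only one of several bootstrap interval constructions.

```r
# Simulated observations (an assumption for illustration).
set.seed(3)
obs <- rexp(40, rate = 0.5)

# Number of bootstrap resamples (arbitrary choice for this sketch).
B <- 2000

# Resample the data with replacement B times and recompute the median each time.
boot_medians <- replicate(B, median(sample(obs, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap medians.
ci <- quantile(boot_medians, probs = c(0.025, 0.975))

print(median(obs))
print(ci)
```

No distributional form is assumed for the data. The spread of the resampled medians stands in for the sampling variability of the median.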
The textbook also unabashedly includes a review of Bayesian statistics and an overview of nonparametric Bayesian estimation. If you are familiar with Bayesian methods, you might wonder what role they play in nonparametric statistics. Admittedly, the connection is not obvious, but in fact nonparametric Bayesian methods (Chapter 18) represent an important set of tools for complicated problems in statistical modeling and learning, where many of the models are nonparametric in nature.
The book is intended both as a reference text and a text for a graduate course. We hope the reader will find this book useful. All comments, suggestions, updates, and critiques will be appreciated.
April 2022 Paul Kvam
Department of Mathematics
University of Richmond
Brani Vidakovic
Department of Statistics
Texas A&M University
Seong-joon...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.