
Nonparametric Statistical Methods
Content
Preface
1. Introduction
1.1. Advantages of Nonparametric Methods
1.2. The Distribution-Free Property
1.3. Some Real-World Applications
1.4. Format and Organization
1.5. Computing with R
1.6. Historical Background
2. The Dichotomous Data Problem
Introduction
2.1. A Binomial Test
2.2. An Estimator for the Probability of Success
2.3. A Confidence Interval for the Probability of Success (Wilson)
2.4. Bayes Estimators for the Probability of Success
3. The One-Sample Location Problem
Introduction
Paired Replicates Analyses by Way of Signed Ranks
3.1. A Distribution-Free Signed Rank Test (Wilcoxon)
3.2. An Estimator Associated with Wilcoxon's Signed Rank Statistic (Hodges-Lehmann)
3.3. A Distribution-Free Confidence Interval Based on Wilcoxon's Signed Rank Test (Tukey)
Paired Replicates Analyses by Way of Signs
3.4. A Distribution-Free Sign Test (Fisher)
3.5. An Estimator Associated with the Sign Statistic (Hodges-Lehmann)
3.6. A Distribution-Free Confidence Interval Based on the Sign Test (Thompson, Savur)
One-Sample Data
3.7. Procedures Based on the Signed Rank Statistic
3.8. Procedures Based on the Sign Statistic
3.9. An Asymptotically Distribution-Free Test of Symmetry (Randles-Fligner-Policello-Wolfe, Davis-Quade)
Bivariate Data
3.10. A Distribution-Free Test for Bivariate Symmetry (Hollander)
3.11. Efficiencies of Paired Replicates and One-Sample Location Procedures
4. The Two-Sample Location Problem
Introduction
4.1. A Distribution-Free Rank Sum Test (Wilcoxon, Mann and Whitney)
4.2. An Estimator Associated with Wilcoxon's Rank Sum Statistic (Hodges-Lehmann)
4.3. A Distribution-Free Confidence Interval Based on Wilcoxon's Rank Sum Test (Moses)
4.4. A Robust Rank Test for the Behrens-Fisher Problem (Fligner-Policello)
4.5. Efficiencies of Two-Sample Location Procedures
5. The Two-Sample Dispersion Problem and Other Two-Sample Problems
Introduction
5.1. A Distribution-Free Rank Test for Dispersion-Medians Equal (Ansari-Bradley)
5.2. An Asymptotically Distribution-Free Test for Dispersion Based on the Jackknife-Medians Not Necessarily Equal (Miller)
5.3. A Distribution-Free Rank Test for Either Location or Dispersion (Lepage)
5.4. A Distribution-Free Test for General Differences in Two Populations (Kolmogorov-Smirnov)
5.5. Efficiencies of Two-Sample Dispersion and Broad Alternatives Procedures
6. The One-Way Layout
Introduction
6.1. A Distribution-Free Test for General Alternatives (Kruskal-Wallis)
6.2. A Distribution-Free Test for Ordered Alternatives (Jonckheere-Terpstra)
6.3. Distribution-Free Tests for Umbrella Alternatives (Mack-Wolfe)
6.3A. A Distribution-Free Test for Umbrella Alternatives, Peak Known (Mack-Wolfe)
6.3B. A Distribution-Free Test for Umbrella Alternatives, Peak Unknown (Mack-Wolfe)
6.4. A Distribution-Free Test for Treatments Versus a Control (Fligner-Wolfe)
Rationale for Multiple Comparison Procedures
6.5. Distribution-Free Two-Sided All-Treatments Multiple Comparisons Based on Pairwise Rankings-General Configuration (Dwass, Steel, and Critchlow-Fligner)
6.6. Distribution-Free One-Sided All-Treatments Multiple Comparisons Based on Pairwise Rankings-Ordered Treatment Effects (Hayter-Stone)
6.7. Distribution-Free One-Sided Treatments-Versus-Control Multiple Comparisons Based on Joint Rankings (Nemenyi, Damico-Wolfe)
6.8. Contrast Estimation Based on Hodges-Lehmann Two-Sample Estimators (Spjøtvoll)
6.9. Simultaneous Confidence Intervals for All Simple Contrasts (Critchlow-Fligner)
6.10. Efficiencies of One-Way Layout Procedures
7. The Two-Way Layout
Introduction
7.1. A Distribution-Free Test for General Alternatives in a Randomized Complete Block Design (Friedman, Kendall-Babington Smith)
7.2. A Distribution-Free Test for Ordered Alternatives in a Randomized Complete Block Design (Page)
Rationale for Multiple Comparison Procedures
7.3. Distribution-Free Two-Sided All-Treatments Multiple Comparisons Based on Friedman Rank Sums-General Configuration (Wilcoxon, Nemenyi, McDonald-Thompson)
7.4. Distribution-Free One-Sided Treatments Versus Control Multiple Comparisons Based on Friedman Rank Sums (Nemenyi, Wilcoxon-Wilcox, Miller)
7.5. Contrast Estimation Based on One-Sample Median Estimators (Doksum)
Incomplete Block Data-Two-Way Layout with Zero or One Observation Per Treatment-Block Combination
7.6. A Distribution-Free Test for General Alternatives in a Randomized Balanced Incomplete Block Design (BIBD) (Durbin-Skillings-Mack)
7.7. Asymptotically Distribution-Free Two-Sided All-Treatments Multiple Comparisons for Balanced Incomplete Block Designs (Skillings-Mack)
7.8. A Distribution-Free Test for General Alternatives for Data From an Arbitrary Incomplete Block Design (Skillings-Mack)
Replications-Two-Way Layout with at Least One Observation for Every Treatment-Block Combination
7.9. A Distribution-Free Test for General Alternatives in a Randomized Block Design with an Equal Number c(>1) of Replications Per Treatment-Block Combination (Mack-Skillings)
7.10. Asymptotically Distribution-Free Two-Sided All-Treatments Multiple Comparisons for a Two-Way Layout with an Equal Number of Replications in Each Treatment-Block Combination (Mack-Skillings)
Analyses Associated with Signed Ranks
7.11. A Test Based on Wilcoxon Signed Ranks for General Alternatives in a Randomized Complete Block Design (Doksum)
7.12. A Test Based on Wilcoxon Signed Ranks for Ordered Alternatives in a Randomized Complete Block Design (Hollander)
7.13. Approximate Two-Sided All-Treatments Multiple Comparisons Based on Signed Ranks (Nemenyi)
7.14. Approximate One-Sided Treatments-Versus-Control Multiple Comparisons Based on Signed Ranks (Hollander)
7.15. Contrast Estimation Based on the One-Sample Hodges-Lehmann Estimators (Lehmann)
7.16. Efficiencies of Two-Way Layout Procedures
8. The Independence Problem
Introduction
8.1. A Distribution-Free Test for Independence Based on Signs (Kendall)
8.2. An Estimator Associated with the Kendall Statistic (Kendall)
8.3. An Asymptotically Distribution-Free Confidence Interval Based on the Kendall Statistic (Samara-Randles, Fligner-Rust, Noether)
8.4. An Asymptotically Distribution-Free Confidence Interval Based on Efron's Bootstrap
8.5. A Distribution-Free Test for Independence Based on Ranks (Spearman)
8.6. A Distribution-Free Test for Independence Against Broad Alternatives (Hoeffding)
8.7. Efficiencies of Independence Procedures
9. Regression Problems
Introduction
One Regression Line
9.1. A Distribution-Free Test for the Slope of the Regression Line (Theil)
9.2. A Slope Estimator Associated with the Theil Statistic (Theil)
9.3. A Distribution-Free Confidence Interval Associated with the Theil Test (Theil)
9.4. An Intercept Estimator Associated with the Theil Statistic and Use of the Estimated Linear Relationship for Prediction (Hettmansperger-McKean-Sheather)
k(≥2) Regression Lines
9.5. An Asymptotically Distribution-Free Test for the Parallelism of Several Regression Lines (Sen, Adichie)
General Multiple Linear Regression
9.6. Asymptotically Distribution-Free Rank-Based Tests for General Multiple Linear Regression (Jaeckel, Hettmansperger-McKean)
Nonparametric Regression Analysis
9.7. An Introduction to Non-Rank-Based Approaches to Nonparametric Regression Analysis
9.8. Efficiencies of Regression Procedures
10. Comparing Two Success Probabilities
Introduction
10.1. Approximate Tests and Confidence Intervals for the Difference between Two Success Probabilities (Pearson)
10.2. An Exact Test for the Difference between Two Success Probabilities (Fisher)
10.3. Inference for the Odds Ratio (Fisher, Cornfield)
10.4. Inference for k Strata of 2 × 2 Tables (Mantel and Haenszel)
10.5. Efficiencies
11. Life Distributions and Survival Analysis
Introduction
11.1. A Test of Exponentiality Versus IFR Alternatives (Epstein)
11.2. A Test of Exponentiality Versus NBU Alternatives (Hollander-Proschan)
11.3. A Test of Exponentiality Versus DMRL Alternatives (Hollander-Proschan)
11.4. A Test of Exponentiality Versus a Trend Change in Mean Residual Life (Guess-Hollander-Proschan)
11.5. A Confidence Band for the Distribution Function (Kolmogorov)
11.6. An Estimator of the Distribution Function When the Data are Censored (Kaplan-Meier)
11.7. A Two-Sample Test for Censored Data (Mantel)
11.8. Efficiencies
12. Density Estimation
Introduction
12.1. Density Functions and Histograms
12.2. Kernel Density Estimation
12.3. Bandwidth Selection
12.4. Other Methods
13. Wavelets
Introduction
13.1. Wavelet Representation of a Function
13.2. Wavelet Thresholding
13.3. Other Uses of Wavelets in Statistics
14. Smoothing
Introduction
14.1. Local Averaging (Friedman)
14.2. Local Regression (Cleveland)
14.3. Kernel Smoothing
14.4. Other Methods of Smoothing
15. Ranked Set Sampling
Introduction
15.1. Rationale and Historical Development
15.2. Collecting a Ranked Set Sample
15.3. Ranked Set Sampling Estimation of a Population Mean
15.4. Ranked Set Sample Analogs of the Mann-Whitney-Wilcoxon Two-Sample Procedures (Bohn-Wolfe)
15.5. Other Important Issues for Ranked Set Sampling
15.6. Extensions and Related Approaches
16. An Introduction to Bayesian Nonparametric Statistics via the Dirichlet Process
Introduction
16.1. Ferguson's Dirichlet Process
16.2. A Bayes Estimator of the Distribution Function (Ferguson)
16.3. Rank Order Estimation (Campbell and Hollander)
16.4. A Bayes Estimator of the Distribution When the Data are Right-Censored (Susarla and Van Ryzin)
16.5. Other Bayesian Approaches
Bibliography
R Program Index
Author Index
Subject Index
Chapter 1
Introduction
1.1 Advantages of Nonparametric Methods
Roughly speaking, a nonparametric procedure is a statistical procedure that has certain desirable properties that hold under relatively mild assumptions regarding the underlying populations from which the data are obtained. The rapid and continuous development of nonparametric statistical procedures over the past decades is due to the following advantages enjoyed by nonparametric techniques:
1. Nonparametric methods require few assumptions about the underlying populations from which the data are obtained. In particular, nonparametric procedures forgo the traditional assumption that the underlying populations are normal.
2. Nonparametric procedures enable the user to obtain exact P-values for tests, exact coverage probabilities for confidence intervals, exact experimentwise error rates for multiple comparison procedures, and exact coverage probabilities for confidence bands without relying on assumptions that the underlying populations are normal.
3. Nonparametric techniques are often (although not always) easier to apply than their normal theory counterparts.
4. Nonparametric procedures are often quite easy to understand.
5. Although at first glance most nonparametric procedures seem to sacrifice too much of the basic information in the samples, theoretical efficiency investigations have shown that this is not the case. Usually, the nonparametric procedures are only slightly less efficient than their normal theory competitors when the underlying populations are normal (the home court of normal theory methods), and they can be mildly or wildly more efficient than these competitors when the underlying populations are not normal.
6. Nonparametric methods are relatively insensitive to outlying observations.
7. Nonparametric procedures are applicable in many situations where normal theory procedures cannot be utilized. Many nonparametric procedures require just the ranks of the observations, rather than the actual magnitude of the observations, whereas the parametric procedures require the magnitudes.
8. The Quenouille–Tukey jackknife (Quenouille (1949), Tukey (1958, 1962)) and Efron's computer-intensive (1979) bootstrap enable nonparametric approaches to be used in many complicated situations where the distribution theory needed to support parametric methods is intractable. See Efron and Tibshirani (1994).
9. Ferguson's Dirichlet process (1973) paved the way to combine the advantages of nonparametric methods and the use of prior information to form a Bayesian nonparametric approach that does not require distributional assumptions.
10. The development of computer software has facilitated fast computation of exact and approximate P-values for conditional nonparametric tests.
1.2 The Distribution-Free Property
The term nonparametric, introduced in Section 1.1, is imprecise. The related term distribution-free has a precise meaning. The distribution-free property is a key aspect of many nonparametric procedures. In this section, we informally introduce the concept of a distribution-free test statistic. The related notions of a distribution-free confidence interval, distribution-free multiple comparison procedure, distribution-free confidence band, asymptotically distribution-free test statistic, asymptotically distribution-free multiple comparison procedure, and asymptotically distribution-free confidence band are introduced at appropriate points in the text.
Distribution-Free Test Statistic
We introduce the concept of a distribution-free test statistic by referring to the two-sample Wilcoxon rank sum statistic, which you will encounter in Section 4.1.
The data consist of a random sample $X_1, \ldots, X_m$ of $m$ observations from a population with continuous probability distribution $F_1$ and an independent random sample $Y_1, \ldots, Y_n$ of $n$ observations from a second population with continuous probability distribution $F_2$. The null hypothesis to be tested is
$$H_0: F_1 = F_2 = F, \quad F \text{ unspecified}.$$
The null hypothesis asserts that the two random samples can be viewed as a single sample of size $N = m + n$ from a common population with unknown distribution $F$. The Wilcoxon (1945) statistic $W$ is obtained by ranking the combined sample of $N$ observations jointly from least to greatest. The test statistic is $W$, the sum of the ranks obtained by the $Y$'s in the joint ranking.
When $H_0$ is true, the distribution of $W$ does not depend on $F$; that is, when $H_0$ is true, for all $a$-values, the probability that $W \le a$, denoted by $P_0(W \le a)$, does not depend on $F$:
$$P_0(W \le a) \text{ does not depend on } F. \tag{1.1}$$
The distribution-free property given by (1.1) enables one to obtain the distribution of $W$ under $H_0$ without specifying the underlying $F$. It further enables one to exactly specify the type I error probability (the probability of rejecting $H_0$ when $H_0$ is true) without making distributional assumptions, such as the assumption that $F$ is a normal distribution; this assumption is required by the parametric $t$-test.
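The distribution-free property can be checked numerically. The following is a minimal Python sketch, not taken from the book (which computes with R). It writes m and n for the two sample sizes and W for the sum of the joint ranks of the second sample; the function names, the sample sizes 3 and 2, and the two test distributions are all illustrative assumptions. It relies on the fact that, under the null hypothesis with continuous data (no ties), every set of n ranks out of 1, ..., m + n is equally likely for the second sample, so the exact null distribution follows by enumeration alone. A small simulation under two quite different common distributions then gives matching frequencies.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb
import random


def exact_null_distribution(m, n):
    """Exact null probabilities P0(W = w) for the rank sum of the second sample.

    Under H0 with no ties, each choice of n ranks from 1..m+n is equally
    likely, so no knowledge of the common distribution F is needed.
    """
    N = m + n
    counts = Counter(sum(ranks) for ranks in combinations(range(1, N + 1), n))
    total = comb(N, n)
    return {w: Fraction(c, total) for w, c in sorted(counts.items())}


def simulated_distribution(sampler, m, n, reps=20000, seed=0):
    """Monte Carlo frequencies of W when both samples come from `sampler`."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        pooled = [(sampler(rng), 0) for _ in range(m)]   # first sample
        pooled += [(sampler(rng), 1) for _ in range(n)]  # second sample
        pooled.sort()                                    # joint ranking
        w = sum(rank for rank, (_, is_second) in enumerate(pooled, start=1)
                if is_second)
        counts[w] += 1
    return {w: counts[w] / reps for w in sorted(counts)}


def normal(rng):
    # Illustrative choice of F: standard normal.
    return rng.gauss(0.0, 1.0)


def exponential(rng):
    # Illustrative choice of F: exponential with rate 1 (heavily skewed).
    return rng.expovariate(1.0)


exact = exact_null_distribution(3, 2)
sim_normal = simulated_distribution(normal, 3, 2)
sim_expo = simulated_distribution(exponential, 3, 2)

# Columns: w, exact P0(W = w), simulated frequency (normal), simulated (exponential)
for w, p in exact.items():
    print(w, p, round(sim_normal.get(w, 0.0), 3), round(sim_expo.get(w, 0.0), 3))
```

With m = 3 and n = 2, for instance, the largest possible value W = 9 has null probability exactly 1/10, so a test that rejects only when W = 9 has type I error probability 1/10 whatever the common continuous distribution is.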
The details concerning how to perform the Wilcoxon test are given in Section 4.1.
1.3 Some Real-World Applications
This book stresses the application of nonparametric techniques to real data. The following 10 examples are a sample of the type of problems you will learn to analyze using nonparametric methods.
Example 1.1 Dose–Response Relationship. In many situations, a dose–response relationship may not be monotonic in the dosage. For example, with in vitro mutagenicity assays, experimental organisms may not survive the toxic side effects of high doses of the test agent, so there may be a reduction in the number of organisms at risk of mutation. This would lead to a downturn (i.e., an umbrella pattern) in the dose–response curve. The data in Table 6.10 were considered by Simpson and Margolin (1986) in a discussion of the analysis of the Ames test results. Plates containing Salmonella bacteria of strain TA98 were exposed to various doses of Acid Red 114. Table 6.10 gives the number of visible revertant colonies on the 18 plates in the study, three plates for each of the six doses (in μg/ml): 0, 100, 333, 1000, 3333, and 10,000. How can we test the hypothesis of equal population median numbers at each dose against the alternative that the peak of the dose–response curve occurs at 1000 μg/ml? How can we determine which particular pairs of doses, if any, significantly differ from one another in the number of revertant colonies? Which particular doses, out of 100, 333, 1000, 3333, and 10,000, differ significantly from the 0 dose in terms of the number of revertant colonies? For doses that significantly differ, how can we estimate the magnitude of the difference? How can we simultaneously estimate all 15 “contrasts,” $\tau_i - \tau_j$, where, for example, $\tau_1 - \tau_2$ denotes the difference between the population medians at dose 0 and dose 100? The methods in Chapter 6 can be used to answer these questions.
Example 1.2 Shelterbelts. Shelterbelts are long rows of tree plantings across the direction of prevailing winds. They are used in developed countries to protect crops and livestock from the effects of the wind. A study was performed by Ujah and Adeoye (1984) to see if shelterbelts would limit severe losses from droughts regularly experienced in the arid and semiarid zones of Nigeria. Droughts are considered to be a leading factor in declining food production in Nigeria and in the neighboring countries. Ujah and Adeoye studied the effect of shelterbelts on a number of factors related to drought conditions, including wind velocity, air and soil temperatures, and soil moisture. Their experiment was conducted at two locations about km apart, near Dambatta. Table 7.7 presents the wind velocity data, averaged over the two locations, at various distances leeward of the shelterbelt. The data are given as percent wind speed reduction relative to the wind velocity on the windward side of the shelterbelt. The data are given for 9 months (data were not available for July, November, and December) and five leeward distances, namely, 20, 40, 100, 150, and 250 m, from the shelterbelt. Does the percent reduction in average wind speed tend to decrease as the leeward distance from a shelterbelt increases? Which particular leeward distances, if any, significantly differ from one another in percent reduction in average wind speed? How can the difference in percent reduction for two leeward distances be estimated? Chapter 7 presents nonparametric methods that will enable you to analyze the data and answer these questions.
Example 1.3 Nasal Brushing. In order to study the effects of pharmaceutical and chemical agents on mucociliary clearance, doctors often use the ciliary beat frequency (CBF) as an index of ciliary activity. One accepted way to measure CBF in a subject is through the collection and analysis of an endobronchial forceps biopsy specimen. This technique is, however, a rather invasive method for measuring CBF. In a study designed to assess the effectiveness of less invasive procedures for measuring CBF, Low et al. (1984) considered the alternative technique of nasal brushing. The data in Table 8.10 are a subset of the data collected by Low et al. during their investigation. The subjects in the study were all men undergoing bronchoscopy for the diagnosis of a variety of pulmonary problems. The CBF values reported in Table 8.10 are averages of 10 consecutive measurements on each subject. How can we test the hypothesis of independence versus the alternative that the CBF measurements corresponding to nasal brushing and...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.