
Kernel Smoothing


Content
Preface
Density Estimation
1.1 Introduction
1.1.1 Orthogonal polynomials
1.2 Histograms
1.2.1 Properties of the histogram
1.2.2 Frequency polygons
1.2.3 Histogram bin widths
1.2.4 Average shifted histogram
1.3 Kernel density estimation
1.3.1 Naive density estimator
1.3.2 Parzen-Rosenblatt kernel density estimator
1.3.3 Bandwidth selection
1.4 Multivariate density estimation
Nonparametric Regression
2.1 Introduction
2.1.1 Method of least squares
2.1.2 Influential observations
2.1.3 Nonparametric regression estimators
2.2 Priestley-Chao regression estimator
2.2.1 Weak consistency
2.3 Local polynomials
2.3.1 Equivalent kernels
2.4 Nadaraya-Watson regression estimator
2.5 Bandwidth selection
2.6 Further remarks
2.6.1 Gasser-Müller estimator
2.6.2 Smoothing splines
2.6.3 Kernel efficiency
Trend Estimation
3.1 Time series replicates
3.1.1 Model
3.1.2 Estimation of common trend function
3.1.3 Asymptotic properties
3.2 Irregularly spaced observations
3.2.1 Model
3.2.2 Derivatives, distribution function, and quantiles
3.2.3 Asymptotic properties
3.2.4 Bandwidth selection
3.3 Rapid change points
3.3.1 Model and definition of rapid change
3.3.2 Estimation and asymptotics
3.4 Nonparametric M-estimation of a trend function
3.4.1 Kernel-based M-estimation
3.4.2 Local polynomial M-estimation
Semiparametric Regression
4.1 Partial linear models with constant slope
4.2 Partial linear models with time-varying slope
4.2.1 Estimation
4.2.2 Assumptions
4.2.3 Asymptotics
Surface Estimation
5.1 Introduction
5.2 Gaussian subordination
5.3 Spatial correlations
5.4 Estimation of the mean and consistency
5.4.1 Asymptotics
5.5 Variance estimation
5.6 Distribution function and spatial Gini index
5.6.1 Asymptotics
References
Author Index
Subject Index
1 Density Estimation
1.1 Introduction
The use of sampled observations to approximate distributions has a long history. An important milestone was the work of Pearson (1895, 1902a, 1902b), who noted that the limiting case of the hypergeometric series can be written as in the equation below and who introduced the Pearsonian system of probability densities. This is a broad class given as the solution to the differential equation
$$\frac{df(x)}{dx} = \frac{(x - a)\, f(x)}{b_0 + b_1 x + b_2 x^2}. \tag{1.1}$$
The different families of densities (Types I-VI) are found by solving this differential equation under varying conditions on the constants $a$, $b_0$, $b_1$, and $b_2$. It turns out that the constants are then expressible in terms of the first four moments of the probability density function (pdf) $f$, so that they can be estimated from a set of observations using the method of moments; see Kendall and Stuart (1963).
If the unknown pdf f is known to belong to a parametric family of density functions satisfying suitable regularity conditions, then maximum likelihood estimation (MLE; Fisher 1912, 1997) can be used to estimate the parameters of the density, thereby estimating the density itself. This method has very powerful statistical properties and continues to be perhaps the most popular method of estimation in statistics. Often, the maximum likelihood estimator is the solution to an estimating equation, as is also the case for the method of least squares. These procedures then come under the general framework of M-estimation. Two other related approaches that use the ordering of the observations are the so-called L-estimation and R-estimation, where the statistics are linear combinations of the order statistics and of their ranks, respectively. These estimation methods are covered in many standard textbooks. Some examples are Rao (1973, Chapters 4 and 5), Serfling (1986, Chapters 7, 8, and 9), and Sen and Srivastava (1990).
1.1.1 Orthogonal polynomials
Yet another approach worth mentioning here is the use of orthogonal polynomials (see Szego 2003). In this method, the unknown density is approximated by a weighted linear combination of a set of basis functions. Čencov (1962) provides a general description, whereas other reviews are in Wegman (1972) and Silverman (1980). Additional background information and further references can be found in Beran et al. (2013, Chapter 3) and Kendall and Stuart (1963, Chapter 6). The essential idea behind the use of orthogonal polynomials is as follows (see Rosenblatt 1971):
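Suppose that the pdf
$$f : \mathbb{R} \to [0, \infty) \tag{1.2}$$
belongs to the space $L_2(G)$ of all square integrable functions with respect to the weight function $G$, i.e.,
$$\int_{\mathbb{R}} f^2(x)\, G(x)\, dx < \infty \tag{1.3}$$
holds, where $\mathbb{R}$ denotes the real line. Also, let $\{G_l(x)\}$ be a complete and orthonormal sequence of functions in $L_2(G)$. Then $f$ admits an expansion
$$f(x) = \sum_{l=0}^{\infty} a_l\, G_l(x), \tag{1.4}$$
which converges to $f$ in $L_2(G)$, where $a_l$ is defined as
$$a_l = \int_{\mathbb{R}} f(x)\, G_l(x)\, G(x)\, dx. \tag{1.5}$$
This formula immediately suggests an unbiased estimator of the coefficient $a_l$ using sampled observations, followed by a substitution in the expansion for $f$.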
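The unbiasedness claim can be checked in one line. The sketch below assumes that the coefficient is the weighted inner product $a_l = \int f\, G_l\, G\, dx$ and that $X_1, \ldots, X_n$ are iid with pdf $f$; the symbol $\hat a_l$ is notation introduced here for illustration.

```latex
\hat a_l = \frac{1}{n} \sum_{i=1}^{n} G_l(X_i)\, G(X_i),
\qquad
E(\hat a_l) = E\{ G_l(X)\, G(X) \}
            = \int_{\mathbb{R}} G_l(x)\, G(x)\, f(x)\, dx = a_l .
```

Substituting $\hat a_l$ for $a_l$ in a truncated version of the expansion then gives a density estimate.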
As an example, we take a brief look at the Gram-Charlier series representation followed by a further extension due to Schwartz (1967). The Gram-Charlier series of Type A is based on Hermite polynomials $H_l$ and the standard normal pdf $\phi$. Note that, for Edgeworth expansion based methods, one would consider the Fourier transform of the product $H_l(x)\phi(x)$ and move on to an expansion that uses the cumulant generating function (see Kendall and Stuart 1963).
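First of all, consider a pdf $f$ that can be expressed as
$$f(x) = \sum_{l=0}^{\infty} c_l\, H_l(x)\, \phi(x). \tag{1.6}$$
For conditions under which this is valid, see two theorems due to Cramér quoted in Kendall and Stuart (1963, pp. 161-162) as well as some historical notes in Cramér (1972).
Here $\phi$ is the standard normal pdf, i.e.,
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \tag{1.7}$$
and $H_l$ is the Hermite polynomial of degree $l$, i.e.,
$$H_l(x) = (-1)^l\, e^{x^2/2}\, \frac{d^l}{dx^l}\, e^{-x^2/2}. \tag{1.8}$$
Using the orthogonality property of the Hermite polynomials, i.e.,
$$\int_{-\infty}^{\infty} H_l(x)\, H_m(x)\, \phi(x)\, dx = 0, \quad l \neq m, \tag{1.9}$$
$$\int_{-\infty}^{\infty} H_l^2(x)\, \phi(x)\, dx = l!, \tag{1.10}$$
we have
$$\int_{-\infty}^{\infty} f(x)\, H_l(x)\, dx = \sum_{m=0}^{\infty} c_m \int_{-\infty}^{\infty} H_m(x)\, H_l(x)\, \phi(x)\, dx \tag{1.11}$$
$$= c_l\, l!. \tag{1.12}$$
In other words, the coefficients $c_l$ are
$$c_l = \frac{1}{l!} \int_{-\infty}^{\infty} H_l(x)\, f(x)\, dx = \frac{1}{l!}\, E\{H_l(X)\}. \tag{1.13}$$
Due to previous detailed work by Chebyshev, the Hermite polynomials are also known as the Chebyshev-Hermite polynomials. In fact, contributions of Laplace are also known. See Sansone (2004) and Szego (2003) for additional information.
The above formula for $c_l$ implies that these coefficients may be estimated from a given set of observations $X_1, \ldots, X_n$ from $f$ as sample means of Hermite polynomials, i.e.,
$$\hat c_l = \frac{1}{n\, l!} \sum_{i=1}^{n} H_l(X_i). \tag{1.14}$$
Since estimating $c_l$ for increasing $l$ involves moments of increasingly high order, this method is not optimal. From a statistical viewpoint, one option is to consider a finite sum.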
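The sample-mean estimation of the Gram-Charlier coefficients can be sketched in Python. This is a minimal illustration, not the book's implementation: it assumes the Chebyshev-Hermite (probabilists') convention, under which the coefficient of degree l equals E[H_l(X)]/l!, and it uses NumPy's `numpy.polynomial.hermite_e` module to evaluate the polynomials. The function names, the truncation order 4, and the simulated standard normal sample are assumptions made for the example.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He  # Chebyshev-Hermite polynomials


def hermite(l, x):
    """Evaluate the Chebyshev-Hermite polynomial H_l at the points x."""
    coef = np.zeros(l + 1)
    coef[l] = 1.0  # select the single basis polynomial of degree l
    return He.hermeval(x, coef)


def gram_charlier_coefs(sample, order):
    """Estimate c_l by the sample mean of H_l(X_i) divided by l!, l = 0..order."""
    x = np.asarray(sample, dtype=float)
    return np.array(
        [hermite(l, x).mean() / math.factorial(l) for l in range(order + 1)]
    )


def gram_charlier_density(x, coefs):
    """Truncated Type A series: sum over l of c_l * H_l(x) * phi(x)."""
    x = np.asarray(x, dtype=float)
    phi = np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)
    return He.hermeval(x, coefs) * phi


# Illustration on simulated data (hypothetical sample, order chosen arbitrarily).
rng = np.random.default_rng(0)
sample = rng.standard_normal(5000)
coefs = gram_charlier_coefs(sample, order=4)
grid = np.linspace(-10.0, 10.0, 4001)
density = gram_charlier_density(grid, coefs)
```

For a standard normal sample, the coefficient of degree 0 is exactly 1 and the higher-order coefficients should be close to 0, so the truncated series stays close to the standard normal pdf. Note that a truncated series of this kind is not guaranteed to be nonnegative.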
To this end, Schwartz (1967) considers a pdf $f$ that is square integrable (or simply bounded) and seeks to give an approximation of the form
$$f(x) \approx \sum_{l=0}^{M_n} d_{l,n}\, G_l(x), \tag{1.15}$$
where $M_n$ is a sequence of integers depending on the sample size $n$, the coefficients $d_{l,n}$ are estimated from observed data, and $G_l$ are the Hermite functions
$$G_l(x) = \left(2^l\, l!\, \sqrt{\pi}\right)^{-1/2} e^{-x^2/2}\, \tilde H_l(x), \qquad \tilde H_l(x) = 2^{l/2}\, H_l(\sqrt{2}\, x). \tag{1.16}$$
The Hermite functions $G_l(x)$ form a complete orthonormal set over the real line. Examples of Hermite polynomials and Hermite functions are in Figure 1.1 and Figure 1.2. Moreover, due to a theorem of Cramér (see Schwartz 1967), $|G_l(x)|$ is bounded above by a constant that does not depend on $x$ or $l$. Since $f$ is square integrable, $f$ can be expanded (orthogonal series expansion) as
$$f(x) = \sum_{l=0}^{\infty} d_l\, G_l(x), \tag{1.17}$$
where
$$d_l = \int_{-\infty}^{\infty} f(x)\, G_l(x)\, dx. \tag{1.18}$$
Schwartz (1967) proposes the estimator
$$\hat f_n(x) = \sum_{l=0}^{M_n} \hat d_{l,n}\, G_l(x), \tag{1.19}$$
where $M_n \to \infty$ and $M_n = o(n)$ as $n \to \infty$, and the coefficients are estimators based on the sample means of Hermite functions
$$\hat d_{l,n} = \frac{1}{n} \sum_{i=1}^{n} G_l(X_i). \tag{1.20}$$
Under some conditions on the $r$th derivative ($r \geq 2$) of $f(x)\phi(x)$, Schwartz (1967) derives asymptotic properties of his estimator, including the rate of convergence to zero of the mean integrated squared error (MISE).
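A Hermite series estimator of this type can be sketched as follows. This is an illustrative sketch under stated assumptions rather than Schwartz's own procedure: the orthonormal Hermite functions are evaluated with the standard three-term recurrence (a numerical convenience chosen here), the coefficients are sample means of the Hermite functions, and the truncation point `M_n = 8`, the simulated normal sample, and all function names are hypothetical choices for the example.

```python
import numpy as np


def hermite_functions(x, max_order):
    """Return an array G with G[l] = G_l(x) for l = 0..max_order.

    Uses the recurrence for orthonormal Hermite functions:
    G_{l+1}(x) = sqrt(2/(l+1)) * x * G_l(x) - sqrt(l/(l+1)) * G_{l-1}(x).
    """
    x = np.asarray(x, dtype=float)
    G = np.empty((max_order + 1,) + x.shape)
    G[0] = np.pi ** -0.25 * np.exp(-0.5 * x**2)
    if max_order >= 1:
        G[1] = np.sqrt(2.0) * x * G[0]
    for l in range(1, max_order):
        G[l + 1] = (
            np.sqrt(2.0 / (l + 1)) * x * G[l] - np.sqrt(l / (l + 1)) * G[l - 1]
        )
    return G


def hermite_series_density(sample, grid, max_order):
    """Truncated orthogonal series estimate evaluated on grid."""
    # Coefficient estimates: sample means of G_l(X_i), one per order l.
    d_hat = hermite_functions(sample, max_order).mean(axis=1)
    # Weighted sum of Hermite functions on the evaluation grid.
    return d_hat @ hermite_functions(grid, max_order)


# Illustration on simulated data (hypothetical sample and truncation point).
rng = np.random.default_rng(1)
sample = rng.normal(loc=0.5, scale=1.0, size=2000)
grid = np.linspace(-6.0, 6.0, 1201)
M_n = 8
f_hat = hermite_series_density(sample, grid, M_n)
```

In practice the truncation point plays the role of a smoothing parameter: a larger value reduces bias but increases variance, which is why it is required to grow more slowly than the sample size.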
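Figure 1.1 Rescaled Hermite polynomials $\tilde H_l$ of degree $l$ for $l = 0, 1, 2$ (left) and the corresponding Hermite functions $G_l(x)$ (right). These functions are related via the relation $G_l(x) = (2^l\, l!\, \sqrt{\pi})^{-1/2} e^{-x^2/2}\, \tilde H_l(x)$, where $\tilde H_l(x) = 2^{l/2} H_l(\sqrt{2}\, x)$ and $H_l$ is the Hermite polynomial of degree $l$.
Figure 1.2 Rescaled Hermite polynomials $\tilde H_l$ of degree $l$ for $l = 3, 4, 5$ (left) and the corresponding Hermite functions $G_l(x)$ (right). These functions are related via the relation $G_l(x) = (2^l\, l!\, \sqrt{\pi})^{-1/2} e^{-x^2/2}\, \tilde H_l(x)$, where $\tilde H_l(x) = 2^{l/2} H_l(\sqrt{2}\, x)$ and $H_l$ is the Hermite polynomial of degree $l$.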
There are various textbooks and review papers that give excellent overviews of nonparametric density estimation techniques. Basic developments and related information can, for instance, be found in Watson and Leadbetter (1964a, 1964b), Shapiro (1969), Rosenblatt (1971), Bickel and Rosenblatt (1973), Rice and Rosenblatt (1976), Tapia and Thompson (1978), Wegman (1982), Silverman (1986), Hart (1990), Jones and Sheather (1991), Müller and Wang (1994), Devroye (1987), Müller (1997), Loader (1999), and Heidenreich et al. (2013). Several textbooks address applied aspects and include theoretical results on general kernel smoothing methods. Some examples are Bowman and Azzalini (1997), Wand and Jones (1995), Simonoff (1996), Scott (1992), and Thompson and Tapia (1987).
In this chapter, we focus on a selection of ideas for density estimation with independently and identically distributed (iid) observations, restricting ourselves to continuous random variables. We start with the univariate case; the multivariate case is mentioned in the sequel.
Let $X, X_1, X_2, \ldots, X_n$ be iid real-valued univariate continuous random variables with an absolutely continuous cumulative distribution function
$$F(x) = P(X \leq x) = \int_{-\infty}^{x} f(u)\, du, \tag{1.21}$$
where $f(x)$ denotes the probability density function (pdf). The pdf $f$ will be assumed to be a three times continuously differentiable function with finite derivatives. Further conditions will be added in the sequel.
The problem is the nonparametric estimation of $f$ using $X_1, X_2, \ldots, X_n$.
1.2 Histograms
The most widely used nonparametric density estimation method is the histogram, especially for univariate random variables. The idea has a long history, and the name "histogram" seems to have been used for the first time by Karl Pearson (1895). Basic information on the use of the histogram as a graphical tool to display frequency distributions can be found in any elementary statistical textbook.
Construction of a histogram proceeds as follows. We consider the univariate case. Let
(1.22)be a partition of the real line...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).