Kernel Smoothing


Principles, Methods and Applications

Sucharita Ghosh (Author)

Wiley (Publisher)

Published on 7 November 2017

272 pages

E-Book

ePUB with Adobe-DRM


978-1-118-89051-6 (ISBN)

€63.99 incl. 7% VAT


E-Book Single Licence

Available for download

Description


Comprehensive theoretical overview of kernel smoothing methods with motivating examples

Kernel smoothing is a flexible nonparametric curve estimation method that is applicable when parametric descriptions of the data are not adequate. This book explores the theory and methods of kernel smoothing in a variety of contexts, considering independent and correlated data, e.g. with short-memory and long-memory correlations, as well as non-Gaussian data that are transformations of latent Gaussian processes. These types of data occur in many fields of research, e.g. the natural and environmental sciences. Nonparametric density estimation, nonparametric and semiparametric regression, trend and surface estimation (in particular for time series and spatial data), and other topics such as rapid change points and robustness are introduced alongside a study of their theoretical properties and optimality issues, such as consistency and bandwidth selection.

Addressing a variety of topics, Kernel Smoothing: Principles, Methods and Applications offers a user-friendly presentation of the mathematical content so that the reader can directly implement the formulas using any appropriate software. The overall aim of the book is to describe the methods and their theoretical backgrounds while maintaining an analytically simple approach and including motivating examples, making it useful in many sciences such as geophysics, climate research, forestry, ecology, and other natural and life sciences, as well as in finance, sociology, and engineering.

* A simple and analytical description of kernel smoothing methods in various contexts
* Presents the basics as well as new developments
* Includes simulated and real data examples

Kernel Smoothing: Principles, Methods and Applications is a textbook for senior undergraduate and graduate students in statistics, as well as a reference book for applied statisticians and advanced researchers.


Content

Preface

1 Density Estimation

1.1 Introduction

1.1.1 Orthogonal polynomials

1.2 Histograms

1.2.1 Properties of the histogram

1.2.2 Frequency polygons

1.2.3 Histogram bin widths

1.2.4 Average shifted histogram

1.3 Kernel density estimation

1.3.1 Naive density estimator

1.3.2 Parzen-Rosenblatt kernel density estimator

1.3.3 Bandwidth selection

1.4 Multivariate density estimation

2 Nonparametric Regression

2.1 Introduction

2.1.1 Method of least squares

2.1.2 Influential observations

2.1.3 Nonparametric regression estimators

2.2 Priestley-Chao regression estimator

2.2.1 Weak consistency

2.3 Local polynomials

2.3.1 Equivalent kernels

2.4 Nadaraya-Watson regression estimator

2.5 Bandwidth selection

2.6 Further remarks

2.6.1 Gasser-Müller estimator

2.6.2 Smoothing splines

2.6.3 Kernel efficiency

3 Trend Estimation

3.1 Time series replicates

3.1.1 Model

3.1.2 Estimation of common trend function

3.1.3 Asymptotic properties

3.2 Irregularly spaced observations

3.2.1 Model

3.2.2 Derivatives, distribution function, and quantiles

3.2.3 Asymptotic properties

3.2.4 Bandwidth selection

3.3 Rapid change points

3.3.1 Model and definition of rapid change

3.3.2 Estimation and asymptotics

3.4 Nonparametric M-estimation of a trend function

3.4.1 Kernel-based M-estimation

3.4.2 Local polynomial M-estimation

4 Semiparametric Regression

4.1 Partial linear models with constant slope

4.2 Partial linear models with time-varying slope

4.2.1 Estimation

4.2.2 Assumptions

4.2.3 Asymptotics

5 Surface Estimation

5.1 Introduction

5.2 Gaussian subordination

5.3 Spatial correlations

5.4 Estimation of the mean and consistency

5.4.1 Asymptotics

5.5 Variance estimation

5.6 Distribution function and spatial Gini index

5.6.1 Asymptotics

References

Author Index

Subject Index

1
Density Estimation

1.1 Introduction

Use of sampled observations to approximate distributions has a long history. An important milestone was Pearson (1895, 1902a, 1902b), who noted that the limiting case of the hypergeometric series can be written as in the equation below and who introduced the Pearsonian system of probability densities. This is a broad class given as the solution to the differential equation

$$\frac{d f(x)}{dx} = \frac{(x - a)\, f(x)}{b_0 + b_1 x + b_2 x^2}. \tag{1.1}$$

The different families of densities (Type I-VI) are found by solving this differential equation under varying conditions on the constants. It turns out that the constants are then expressible in terms of the first four moments of the probability density function (pdf) f, so that they can be estimated given a set of observations using the method of moments; see Kendall and Stuart (1963).

If the unknown pdf f is known to belong to a parametric family of density functions satisfying suitable regularity conditions, then the maximum likelihood estimator (MLE; Fisher 1912, 1997) can be used to estimate the parameters of the density, thereby estimating the density itself. This method has very powerful statistical properties and continues to be perhaps the most popular method of estimation in statistics. Often, the MLE is the solution to an estimating equation, as is also the case for the method of least squares. These procedures then come under the general framework of M-estimation. Two other related approaches are the so-called L-estimation and R-estimation, where the statistics are linear combinations of the order statistics and of the ranks of the observations, respectively. These estimation methods are covered in many standard textbooks. Some examples are Rao (1973, chapters 4 and 5), Serfling (1986, chapters 7, 8, and 9), and Sen and Srivastava (1990).

1.1.1 Orthogonal polynomials

Yet another approach worth mentioning here is the use of orthogonal polynomials (see Szegő 2003). In this method, the unknown density is approximated by a weighted linear combination of a set of basis functions. Čencov (1962) provides a general description, whereas other reviews are in Wegman (1972) and Silverman (1980). Additional background information and further references can be found in Beran et al. (2013, Chapter 3) and Kendall and Stuart (1963, Chapter 6). The essential idea behind the use of orthogonal polynomials is as follows (see Rosenblatt 1971):

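Suppose that the pdf

$$f : \mathbb{R} \to [0, \infty) \tag{1.2}$$

belongs to the space $L^2(G)$ of all square integrable functions with respect to the weight function $G$, i.e.,

$$\int_{\mathbb{R}} f^2(x)\, G(x)\, dx < \infty \tag{1.3}$$

holds, where $\mathbb{R}$ denotes the real line. Also, let $\{G_l(x)\}$ be a complete and orthonormal sequence of functions in $L^2(G)$. Then $f$ admits an expansion

$$f(x) = \sum_{l=0}^{\infty} a_l\, G_l(x), \tag{1.4}$$

which converges to $f$ in $L^2(G)$, where $a_l$ is defined as

$$a_l = \int_{\mathbb{R}} f(x)\, G_l(x)\, G(x)\, dx. \tag{1.5}$$

This formula immediately suggests an unbiased estimator of the coefficient $a_l$ using sampled observations, followed by a substitution in the expansion for $f$.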

As an example, we take a brief look at the Gram-Charlier series representation followed by a further extension due to Schwartz (1967). The Gram-Charlier series of Type A is based on Hermite polynomials $H_l$ and the standard normal pdf $\phi$. Note that, for Edgeworth expansion based methods, one would consider the Fourier transform of the product $H_l(x)\phi(x)$ and move on to an expansion that uses the cumulant generating function (see Kendall and Stuart 1963).

First of all, consider a pdf $f$ that can be expressed as

$$f(x) = \sum_{l=0}^{\infty} c_l\, H_l(x)\, \phi(x). \tag{1.6}$$

For conditions under which this is valid, see two theorems due to Cramér quoted in Kendall and Stuart (1963, pp. 161-162) as well as some historical notes in Cramér (1972).

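Here $\phi$ is the standard normal pdf, i.e.,

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \quad x \in \mathbb{R}, \tag{1.7}$$

and $H_l$ is the Hermite polynomial of degree $l$, i.e.,

$$H_l(x) = (-1)^l\, e^{x^2/2}\, \frac{d^l}{dx^l}\, e^{-x^2/2}. \tag{1.8}$$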

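Using the orthogonality property of the Hermite polynomials, i.e.,

$$\int_{-\infty}^{\infty} H_l(x)\, H_m(x)\, \phi(x)\, dx = 0 \quad \text{if } l \neq m, \tag{1.9}$$

$$\int_{-\infty}^{\infty} H_l^2(x)\, \phi(x)\, dx = l!, \tag{1.10}$$

we have

\begin{align}
\int_{-\infty}^{\infty} f(x)\, H_l(x)\, dx &= \sum_{m=0}^{\infty} c_m \int_{-\infty}^{\infty} H_m(x)\, H_l(x)\, \phi(x)\, dx \tag{1.11} \\
&= c_l\, l!. \tag{1.12}
\end{align}

In other words, the coefficients $c_l$ are

$$c_l = \frac{1}{l!} \int_{-\infty}^{\infty} H_l(x)\, f(x)\, dx = \frac{1}{l!}\, E\left[H_l(X)\right]. \tag{1.13}$$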
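As a short worked illustration (not taken from the book), the first few coefficients can be written out explicitly. The sketch assumes the Chebyshev-Hermite convention, in which the polynomials are orthogonal with respect to the standard normal pdf with squared norm $l!$, and it assumes that the coefficient of degree $l$ equals $E[H_l(X)]/l!$. The symbols $\mu_k' = E[X^k]$ for the raw moments are notation introduced here.

```latex
\begin{align*}
H_0(x) &= 1, & c_0 &= 1, \\
H_1(x) &= x, & c_1 &= \mu_1', \\
H_2(x) &= x^2 - 1, & c_2 &= \tfrac{1}{2}\,(\mu_2' - 1), \\
H_3(x) &= x^3 - 3x, & c_3 &= \tfrac{1}{6}\,(\mu_3' - 3\mu_1'), \\
H_4(x) &= x^4 - 6x^2 + 3, & c_4 &= \tfrac{1}{24}\,(\mu_4' - 6\mu_2' + 3).
\end{align*}
```

For a standardized variable (mean 0, variance 1) this gives $c_1 = c_2 = 0$, while $c_3$ is the skewness divided by 6 and $c_4$ is the excess kurtosis divided by 24, which shows how the higher coefficients depend on higher-order moments.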

Due to previous detailed work by Chebyshev, the Hermite polynomials are also known as the Chebyshev-Hermite polynomials. In fact, contributions of Laplace are also known. See Sansone (2004) and Szegő (2003) for additional information.

The above formula for $c_l$ implies that these coefficients may be estimated from a given set of observations $X_1, \ldots, X_n$ from $f$ as sample means of Hermite polynomials, i.e.,

$$\hat{c}_l = \frac{1}{n\, l!} \sum_{i=1}^{n} H_l(X_i). \tag{1.14}$$

Since the estimation of increasingly higher-order moments is involved as $l$ grows, this method is, however, not optimal. From a statistical viewpoint, one option is to consider a finite sum.
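The estimation step described above can be sketched in Python. This is a minimal illustration, not the book's implementation: it assumes the Chebyshev-Hermite (probabilists') polynomials as provided by `numpy.polynomial.hermite_e`, estimates each coefficient as the sample mean of the polynomial of degree l divided by l!, and truncates the series at a finite degree. The function names, the truncation degree 4, and the simulated standard normal sample are hypothetical choices made for the example.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermeval


def gram_charlier_coefficients(sample, max_degree):
    """Estimate c_l by the sample mean of He_l(X_i) divided by l!, l = 0..max_degree."""
    sample = np.asarray(sample, dtype=float)
    coeffs = np.empty(max_degree + 1)
    for l in range(max_degree + 1):
        basis = np.zeros(l + 1)
        basis[l] = 1.0  # coefficient vector that selects He_l in hermeval
        coeffs[l] = hermeval(sample, basis).mean() / math.factorial(l)
    return coeffs


def gram_charlier_density(x, coeffs):
    """Evaluate the truncated series sum_l c_l He_l(x) phi(x) on the points x."""
    x = np.asarray(x, dtype=float)
    phi = np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)
    return hermeval(x, coeffs) * phi


# Hypothetical usage with simulated data (assumption: standard normal sample).
rng = np.random.default_rng(0)
sample = rng.standard_normal(5000)
coeffs = gram_charlier_coefficients(sample, max_degree=4)
grid = np.linspace(-4.0, 4.0, 9)
estimate = gram_charlier_density(grid, coeffs)
```

Because the polynomial of degree 0 is identically 1, the first estimated coefficient is always 1, so the truncated series integrates to 1. It is not guaranteed to be nonnegative, which is one practical drawback of such finite sums.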

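To this end, Schwartz (1967) considers a pdf $f$ that is square integrable (or simply bounded) and seeks to give an approximation of the form

$$f_n(x) = \sum_{l=0}^{M_n} d_{l,n}\, G_l(x), \tag{1.15}$$

where $M_n$ is a sequence of integers depending on the sample size $n$, the coefficients $d_{l,n}$ are estimated from observed data, and $G_l$ are the Hermite functions

$$G_l(x) = \left(2^l\, l!\, \sqrt{\pi}\right)^{-1/2} e^{-x^2/2}\, \tilde{H}_l(x), \qquad \tilde{H}_l(x) = 2^{l/2}\, H_l\!\left(\sqrt{2}\, x\right), \tag{1.16}$$

with $H_l$ the Hermite polynomial of degree $l$ from (1.8) and $\tilde{H}_l$ its rescaled version.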

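The Hermite functions $G_l(x)$ form a complete orthonormal set over the real line. Examples of Hermite polynomials and Hermite functions are in Figure 1.1 and Figure 1.2. Moreover, due to a theorem of Cramér (see Schwartz 1967), $|G_l(x)|$ is bounded above by a constant that does not depend on $x$ or $l$. Since $f$ is square integrable, $f$ can be expanded (orthogonal series expansion) as

$$f(x) = \sum_{l=0}^{\infty} d_l\, G_l(x), \tag{1.17}$$

where

$$d_l = \int_{-\infty}^{\infty} f(x)\, G_l(x)\, dx. \tag{1.18}$$

Schwartz (1967) proposes the estimator

$$\hat{f}_n(x) = \sum_{l=0}^{M_n} \hat{d}_{l,n}\, G_l(x), \tag{1.19}$$

where $M_n \to \infty$ and $M_n = o(n)$ as $n \to \infty$, and the coefficients $\hat{d}_{l,n}$ are estimators based on the sample means of Hermite functions

$$\hat{d}_{l,n} = \frac{1}{n} \sum_{i=1}^{n} G_l(X_i). \tag{1.20}$$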

Under some conditions on the $r$th derivative ($r \geq 2$) of $f(x)\phi(x)$, Schwartz (1967) derives asymptotic properties of his estimator, including the rate of convergence to zero of the mean integrated squared error (MISE).
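A truncated Hermite-function series estimator of this kind can be sketched as follows. The code is an illustration under stated assumptions rather than the procedure of Schwartz (1967) in every detail: it uses the standard orthonormal Hermite functions built from the physicists' Hermite polynomials, computed by a three-term recurrence, and estimates each coefficient as the sample mean of the corresponding Hermite function. The function names, the truncation point 6, and the simulated data are hypothetical, and no rule for choosing the truncation point from the sample size is implemented.

```python
import math

import numpy as np


def hermite_functions(x, max_degree):
    """Return an array of shape (max_degree + 1, len(x)) whose rows are G_0, ..., G_M.

    Uses the stable recurrence for the orthonormal Hermite functions:
    G_{l+1}(x) = sqrt(2 / (l + 1)) * x * G_l(x) - sqrt(l / (l + 1)) * G_{l-1}(x).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    G = np.empty((max_degree + 1, x.size))
    G[0] = math.pi ** -0.25 * np.exp(-0.5 * x**2)
    if max_degree >= 1:
        G[1] = math.sqrt(2.0) * x * G[0]
    for l in range(1, max_degree):
        G[l + 1] = (math.sqrt(2.0 / (l + 1)) * x * G[l]
                    - math.sqrt(l / (l + 1)) * G[l - 1])
    return G


def hermite_series_density(sample, grid, max_degree):
    """Orthogonal series density estimate: sum over l of d_hat_l * G_l(grid)."""
    # d_hat_l is the sample mean of G_l(X_i); one value per degree l.
    d_hat = hermite_functions(sample, max_degree).mean(axis=1)
    return d_hat @ hermite_functions(grid, max_degree)


# Hypothetical usage with simulated data (assumption: standard normal sample).
rng = np.random.default_rng(1)
sample = rng.standard_normal(5000)
grid = np.linspace(-3.0, 3.0, 7)
density = hermite_series_density(sample, grid, max_degree=6)
```

The recurrence avoids evaluating large factorials and high-degree polynomials separately, which would overflow or lose precision for larger truncation points. As with any truncated series, a larger truncation point reduces bias but increases variance.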

Figure 1.1 Rescaled Hermite polynomials of degree $l$ for $l = 0, 1, 2$ (left) and the corresponding Hermite functions $G_l(x)$ (right). The two are related via the definition of the Hermite functions in (1.16), where $H_l$ is the Hermite polynomial of degree $l$.

Figure 1.2 Rescaled Hermite polynomials of degree $l$ for $l = 3, 4, 5$ (left) and the corresponding Hermite functions $G_l(x)$ (right). The two are related via the definition of the Hermite functions in (1.16), where $H_l$ is the Hermite polynomial of degree $l$.

There are various textbooks and review papers that give excellent overviews of nonparametric density estimation techniques. Basic developments and related information can, for instance, be found in Watson and Leadbetter (1964a, 1964b), Shapiro (1969), Rosenblatt (1971), Bickel and Rosenblatt (1973), Rice and Rosenblatt (1976), Tapia and Thompson (1978), Wegman (1982), Silverman (1986), Hart (1990), Jones and Sheather (1991), Müller and Wang (1994), Devroye (1987), Müller (1997), Loader (1999), and Heidenreich et al. (2013). Several textbooks address applied aspects and include theoretical results on general kernel smoothing methods. Some examples are Bowman and Azzalini (1997), Wand and Jones (1995), Simonoff (1996), Scott (1992), Thompson and Tapia (1987), and others.

In this chapter, we focus on a selection of ideas for density estimation with independently and identically distributed (iid) observations, restricting ourselves to continuous random variables. We start with the univariate case; the multivariate case is addressed later in the chapter.

Let $X, X_1, X_2, \ldots, X_n$ be iid real-valued univariate continuous random variables with an absolutely continuous cumulative distribution function

$$F(x) = P(X \leq x) = \int_{-\infty}^{x} f(u)\, du, \tag{1.21}$$

where $f(x)$ denotes the probability density function (pdf). The pdf $f$ will be assumed to be a three times continuously differentiable function with finite derivatives. Further conditions will be added in the sequel.

The problem is the nonparametric estimation of $f(x)$, $x \in \mathbb{R}$, using $X_1, X_2, \ldots, X_n$.

1.2 Histograms

The most widely used nonparametric density estimation method is the histogram, especially for univariate random variables. The idea has a long history, and the name "histogram" seems to have been used for the first time by Karl Pearson (1895). Basic information on the use of the histogram as a graphical tool to display frequency distributions can be found in any elementary statistics textbook.

Construction of a histogram proceeds as follows. We consider the univariate case. Let

(1.22)

be a partition of the real line...
