An Introduction to Envelopes

Name: An Introduction to Envelopes | Dimension Reduction for Efficient Estimation in Multivariate Statistics
Brand: Wiley
Price: 130.99 EUR
Availability: OnlineOnly

Dimension Reduction for Efficient Estimation in Multivariate Statistics

R. Dennis Cook (Author)

Wiley (Publisher)

1st Edition

Published on 7 September 2018

320 pages

E-Book

ePUB with Adobe-DRM

System requirements

978-1-119-42296-9 (ISBN)

€130.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

Written by the leading expert in the field, this text reviews the major new developments in envelope models and methods.

An Introduction to Envelopes provides an overview of the theory and methods of envelopes, a class of procedures for increasing efficiency in multivariate analyses without altering traditional objectives. The author offers a balance between foundations and methodology by integrating illustrative examples that show how envelopes can be used in practice. He discusses how to use envelopes to target selected coefficients and explores predictor envelopes and their connection with partial least squares regression. The book reveals the potential for envelope methodology to improve estimation of a multivariate mean. The text also covers the use of envelopes in generalized linear models and in regressions with a matrix-valued response, and reviews work on sparse and Bayesian response envelopes. In addition, the text explores relationships between envelopes and other dimension reduction methods, including canonical correlations, reduced-rank regression, supervised singular value decomposition, sufficient dimension reduction, principal components, and principal fitted components.

This important resource:

Offers a text written by the leading expert in this field

Describes groundbreaking work that puts the focus on this burgeoning area of study

Covers the important new developments in the field and highlights the most important directions

Discusses the underlying mathematics and linear algebra

Includes an online companion site with both R and Matlab support

Written for researchers and graduate students in multivariate analysis and dimension reduction, as well as practitioners interested in statistical methodology, An Introduction to Envelopes is the first book on the theory and methods of envelopes.

Content

Preface xv

Notation and Definitions xix

1 Response Envelopes 1

1.1 The Multivariate Linear Model 2

1.1.1 Partitioned Models and Added Variable Plots 5

1.1.2 Alternative Model Forms 6

1.2 Envelope Model for Response Reduction 6

1.3 Illustrations 10

1.3.1 A Schematic Example 10

1.3.2 Compound Symmetry 13

1.3.3 Wheat Protein: Introductory Illustration 13

1.3.4 Cattle Weights: Initial Fit 14

1.4 More on the Envelope Model 19

1.4.1 Relationship with Sufficiency 19

1.4.2 Parameter Count 19

1.4.3 Potential Gains 20

1.5 Maximum Likelihood Estimation 21

1.5.1 Derivation 21

1.5.2 Cattle Weights: Variation of the X-Variant Parts of Y 23

1.5.3 Insights into Ê_Σ(B) 24

1.5.4 Scaling the Responses 25

1.6 Asymptotic Distributions 25

1.7 Fitted Values and Predictions 28

1.8 Testing the Responses 29

1.8.1 Test Development 29

1.8.2 Testing Individual Responses 32

1.8.3 Testing Containment Only 34

1.9 Nonnormal Errors 34

1.10 Selecting the Envelope Dimension, u 36

1.10.1 Selection Methods 36

1.10.1.1 Likelihood Ratio Testing 36

1.10.1.2 Information Criteria 37

1.10.1.3 Cross-validation 37

1.10.2 Inferring About rank(β) 38

1.10.3 Asymptotic Considerations 38

1.10.4 Overestimation Versus Underestimation of u 41

1.10.5 Cattle Weights: Influence of u 43

1.11 Bootstrap and Uncertainty in the Envelope Dimension 45

1.11.1 Bootstrap for Envelope Models 45

1.11.2 Wheat Protein: Bootstrap and Asymptotic Standard Errors, u Fixed 46

1.11.3 Cattle Weights: Bootstrapping u 47

1.11.4 Bootstrap Smoothing 48

1.11.5 Cattle Data: Bootstrap Smoothing 49

2 Illustrative Analyses Using Response Envelopes 51

2.1 Wheat Protein: Full Data 51

2.2 Berkeley Guidance Study 51

2.3 Banknotes 54

2.4 Egyptian Skulls 55

2.5 Australian Institute of Sport: Response Envelopes 58

2.6 Air Pollution 59

2.7 Multivariate Bioassay 63

2.8 Brain Volumes 65

2.9 Reducing Lead Levels in Children 67

3 Partial Response Envelopes 69

3.1 Partial Envelope Model 69

3.2 Estimation 71

3.2.1 Asymptotic Distribution of β̂₁ 72

3.2.2 Selecting u₁ 73

3.3 Illustrations 74

3.3.1 Cattle Weight: Incorporating Basal Weight 74

3.3.2 Men's Urine 74

3.4 Partial Envelopes for Prediction 77

3.4.1 Rationale 77

3.4.2 Pulp Fibers: Partial Envelopes and Prediction 78

3.5 Reducing Part of the Response 79

4 Predictor Envelopes 81

4.1 Model Formulations 81

4.1.1 Linear Predictor Reduction 81

4.1.1.1 Predictor Envelope Model 83

4.1.1.2 Expository Example 83

4.1.2 Latent Variable Formulation of Partial Least Squares Regression 84

4.1.3 Potential Advantages 86

4.2 SIMPLS 88

4.2.1 SIMPLS Algorithm 88

4.2.2 SIMPLS When n < p 90

4.2.2.1 Behavior of the SIMPLS Algorithm 90

4.2.2.2 Asymptotic Properties of SIMPLS 91

4.3 Likelihood-Based Predictor Envelopes 94

4.3.1 Estimation 95

4.3.2 Comparisons with SIMPLS and Principal Component Regression 97

4.3.2.1 Principal Component Regression 98

4.3.2.2 SIMPLS 98

4.3.3 Asymptotic Properties 98

4.3.4 Fitted Values and Prediction 100

4.3.5 Choice of Dimension 101

4.3.6 Relevant Components 101

4.4 Illustrations 102

4.4.1 Expository Example, Continued 102

4.4.2 Australian Institute of Sport: Predictor Envelopes 103

4.4.3 Wheat Protein: Predicting Protein Content 105

4.4.4 Mussels' Muscles: Predictor Envelopes 106

4.4.5 Meat Properties 109

4.5 Simultaneous Predictor-Response Envelopes 109

4.5.1 Model Formulation 109

4.5.2 Potential Gain 110

4.5.3 Estimation 113

5 Enveloping Multivariate Means 117

5.1 Enveloping a Single Mean 117

5.1.1 Envelope Structure 117

5.1.2 Envelope Model 119

5.1.3 Estimation 120

5.1.4 Minneapolis Schools 122

5.1.4.2 Four Untransformed Responses 124

5.1.5 Functional Data 126

5.2 Enveloping Multiple Means with Heteroscedastic Errors 126

5.2.1 Heteroscedastic Envelopes 126

5.2.2 Estimation 128

5.2.3 Cattle Weights: Heteroscedastic Envelope Fit 129

5.3 Extension to Heteroscedastic Regressions 130

6 Envelope Algorithms 133

6.1 Likelihood-Based Envelope Estimation 133

6.2 Starting Values 135

6.2.1 Choosing the Starting Value from the Eigenvectors of M̂ 135

6.2.2 Choosing the Starting Value from the Eigenvectors of M̂ + Û 137

6.2.3 Summary 138

6.3 A Non-Grassmann Algorithm for Estimating E_M(V) 139

6.4 Sequential Likelihood-Based Envelope Estimation 141

6.4.1 The 1D Algorithm 141

6.4.2 Envelope Component Screening 142

6.4.2.1 ECS Algorithm 143

6.4.2.2 Alternative ECS Algorithm 144

6.5 Sequential Moment-Based Envelope Estimation 145

6.5.1 Basic Algorithm 145

6.5.2 Krylov Matrices and dim(V) = 1 147

6.5.3 Variations on the Basic Algorithm 147

7 Envelope Extensions 149

7.1 Envelopes for Vector-Valued Parameters 149

7.1.1 Illustrations 151

7.1.2 Estimation Based on a Complete Likelihood 154

7.1.2.1 Likelihood Construction 154

7.1.2.2 Aster Models 156

7.2 Envelopes for Matrix-Valued Parameters 157

7.3 Envelopes for Matrix-Valued Responses 160

7.3.1 Initial Modeling 161

7.3.2 Models with Kronecker Structure 163

7.3.3 Envelope Models with Kronecker Structure 164

7.4 Spatial Envelopes 166

7.5 Sparse Response Envelopes 168

7.5.1 Sparse Response Envelopes when r ≪ n 168

7.5.2 Cattle Weights and Brain Volumes: Sparse Fits 169

7.5.3 Sparse Response Envelopes when r > n 170

7.6 Bayesian Response Envelopes 171

8 Inner and Scaled Envelopes 173

8.1 Inner Envelopes 173

8.1.1 Definition and Properties of Inner Envelopes 174

8.1.2 Inner Response Envelopes 175

8.1.3 Maximum Likelihood Estimators 176

8.1.4 Race Times: Inner Envelopes 179

8.2 Scaled Response Envelopes 182

8.2.1 Scaled Response Model 183

8.2.2 Estimation 184

8.2.3 Race Times: Scaled Response Envelopes 185

8.3 Scaled Predictor Envelopes 186

8.3.1 Scaled Predictor Model 187

8.3.2 Estimation 188

8.3.3 Scaled SIMPLS Algorithm 189

9 Connections and Adaptations 191

9.1 Canonical Correlations 191

9.1.1 Construction of Canonical Variates and Correlations 191

9.1.2 Derivation of Canonical Variates 193

9.1.3 Connection to Envelopes 194

9.2 Reduced-Rank Regression 195

9.2.1 Reduced-Rank Model and Estimation 195

9.2.2 Contrasts with Envelopes 196

9.2.3 Reduced-Rank Response Envelopes 197

9.2.4 Reduced-Rank Predictor Envelopes 199

9.3 Supervised Singular Value Decomposition 199

9.4 Sufficient Dimension Reduction 202

9.5 Sliced Inverse Regression 204

9.5.1 SIR Methodology 204

9.5.2 Mussels' Muscles: Sliced Inverse Regression 205

9.5.3 The "Envelope Method" 206

9.5.4 Envelopes and SIR 207

9.6 Dimension Reduction for the Conditional Mean 207

9.6.1 Estimating One Vector in S_E(Y|X) 208

9.6.2 Estimating S_E(Y|X) 209

9.7 Functional Envelopes for SDR 211

9.7.1 Functional SDR 211

9.7.2 Functional Predictor Envelopes 211

9.8 Comparing Covariance Matrices 212

9.8.1 SDR for Covariance Matrices 213

9.8.2 Connections with Envelopes 215

9.8.3 Illustrations 216

9.8.4 SDR for Means and Covariance Matrices 217

9.9 Principal Components 217

9.9.1 Introduction 217

9.9.2 Random Latent Variables 219

9.9.2.1 Envelopes 220

9.9.2.2 Envelopes with Isotropic Intrinsic and Extrinsic Variation 222

9.9.2.3 Envelopes with Isotropic Intrinsic Variation 223

9.9.2.4 Selection of the Dimension u 225

9.9.3 Fixed Latent Variables and Isotropic Errors 225

9.9.4 Numerical Illustrations 226

9.10 Principal Fitted Components 229

9.10.1 Isotropic Errors, Σ_X|Y = σ²I_p 230

9.10.2 Anisotropic Errors, Σ_X|Y > 0 231

9.10.3 Nonnormal Errors and the Choice of f 232

9.10.3.1 Graphical Choices 232

9.10.3.2 Basis Functions 232

9.10.3.3 Categorical Response 232

9.10.3.4 Sliced Inverse Regression 233

9.10.4 High-Dimensional PFC 233

Appendix A Envelope Algebra 235

A.1 Invariant and Reducing Subspaces 235

A.2 M-Envelopes 240

A.3 Relationships Between Envelopes 241

A.3.1 Invariance and Equivariance 241

A.3.2 Direct Sums of Envelopes 244

A.3.3 Coordinate Reduction 244

A.4 Kronecker Products, vec and vech 246

A.5 Commutation, Expansion, and Contraction Matrices 248

A.6 Derivatives 249

A.6.1 Derivatives for η, Ω, and Ω₀ 249

A.6.2 Derivatives with Respect to Γ 250

A.6.3 Derivatives of Grassmann Objective Functions 251

A.7 Miscellaneous Results 252

A.8 Matrix Normal Distribution 255

A.9 Literature Notes 256

Appendix B Proofs for Envelope Algorithms 257

B.1 The 1D Algorithm 257

B.2 Sequential Moment-Based Algorithm 262

B.2.1 First Direction Vector w1 263

B.2.2 Second Direction Vector w2 263

B.2.3 (q + 1)st Direction Vector w_q+1, q < u 264

B.2.4 Termination 265

Appendix C Grassmann Manifold Optimization 267

C.1 Gradient Algorithm 268

C.2 Construction of B 269

C.3 Construction of exp{¿¿¿¿A(B)} 271

C.4 Starting and Stopping 272

Bibliography 273

Author Index 283

Subject Index 287

1 Response Envelopes

Envelopes, which were introduced by Cook et al. (2007) and developed for the multivariate linear model by Cook et al. (2010), encompass a class of methods for increasing efficiency in multivariate analyses without altering traditional objectives. They serve to reshape classical methods by exploiting response-predictor relationships that affect the accuracy of the results but are not recognized by classical methods. Multivariate data are often modeled by combining a selected structural component to be estimated with an error component to account for the remaining unexplained variation. Capturing the desired signal and only that signal in the structural component can be an elusive task with the consequence that, in an effort to avoid missing important information, there may be a tendency to overparameterize, leading to overfitting and relatively soft inferences and interpretations. Essentially a type of targeted dimension reduction that can result in substantial gains in efficiency, envelopes operate by enveloping the signal and thereby account for extraneous variation that might otherwise be present in the structural component.

In this chapter, we consider multivariate (multiresponse) linear regression allowing for the presence of "immaterial variation" (described herein) in the response vector. The possibility of such variation being present in the predictors is considered in Chapter 4, where we develop a connection with partial least squares regression. Section 1.1 contains a very brief review of the multivariate linear model, with an emphasis on aspects that will play a role in later developments. Additional background is available from Muirhead (2005). The envelope model for response reduction is introduced in Section 1.2. Introductory illustrations are given in Section 1.3 to provide intuition, to set the tone for later developments, and to provide running examples. In later sections, we discuss additional properties of the envelope model, maximum likelihood estimation, and the asymptotic variance of the envelope estimator of the coefficient matrix. Most of the technical materials used in this chapter are taken from Cook et al. (2010). Some algebraic details are presented without justification. The missing development is given extensively in Appendix A, which covers the linear algebra of envelopes.

1.1 The Multivariate Linear Model

Consider the multivariate regression of a response vector $Y \in \mathbb{R}^r$ on a vector of nonstochastic predictors $X \in \mathbb{R}^p$. The standard linear model for describing a sample $(Y_i, X_i)$ can be represented in vector form as

$$Y_i = \alpha + \beta X_i + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1.1}$$

where the predictors are centered in the sample, $\sum_{i=1}^n X_i = 0$, the error vectors $\varepsilon_i \in \mathbb{R}^r$ are independently and identically distributed normal vectors with mean 0 and covariance matrix $\Sigma > 0$, $\alpha \in \mathbb{R}^r$ is an unknown vector of intercepts, and $\beta \in \mathbb{R}^{r \times p}$ is an unknown matrix of regression coefficients. Centering the predictors facilitates discussion and presentation of some results, but is technically unnecessary. If $X$ is stochastic, so $X$ and $Y$ have a joint distribution, we still condition on the observed values of $X$ since the predictors are ancillary under model 1.1. The normality requirement for $\varepsilon$ is not essential, as discussed in Section 1.9 and in later chapters.

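Let $\mathbb{X}$ denote the $n \times p$ centered matrix with rows $X_i^T$, let $\mathbb{Y}$ denote the $n \times r$ uncentered matrix with rows $Y_i^T$, and let $\mathbb{Y}_c$ denote the $n \times r$ matrix with rows $(Y_i - \bar{Y})^T$, $i = 1, \ldots, n$. Also, let $\bar{Y} = n^{-1}\sum_{i=1}^n Y_i$ and

$$S_X = n^{-1}\mathbb{X}^T\mathbb{X}, \qquad S_{Y,X} = n^{-1}\mathbb{Y}_c^T\mathbb{X}.$$

Then the maximum likelihood estimator of $\alpha$ is $\hat{\alpha} = \bar{Y}$, and the maximum likelihood estimator of $\beta$, which is also the ordinary least squares estimator, is

$$B = S_{Y,X}S_X^{-1} = \mathbb{Y}_c^T\mathbb{X}(\mathbb{X}^T\mathbb{X})^{-1} = \mathbb{Y}^T\mathbb{X}(\mathbb{X}^T\mathbb{X})^{-1}, \tag{1.2}$$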

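where the second equality follows because the predictors are centered. To see this result, let $\hat{Y}_i = \bar{Y} + BX_i$ and $r_i = Y_i - \hat{Y}_i$ denote the $i$th vectors of fitted values and residuals, $i = 1, \ldots, n$, and let $D = \beta - B$. Then after substituting $\bar{Y}$ for $\alpha$, the remaining log-likelihood $L(\beta, \Sigma)$ to be maximized can be expressed as

$$\begin{aligned}
(2/n)L(\beta, \Sigma) &= c - \log|\Sigma| - n^{-1}\sum_{i=1}^n (Y_i - \bar{Y} - \beta X_i)^T\Sigma^{-1}(Y_i - \bar{Y} - \beta X_i) \\
&= c - \log|\Sigma| - n^{-1}\sum_{i=1}^n (r_i - DX_i)^T\Sigma^{-1}(r_i - DX_i) \\
&= c - \log|\Sigma| - n^{-1}\sum_{i=1}^n r_i^T\Sigma^{-1}r_i - n^{-1}\sum_{i=1}^n X_i^TD^T\Sigma^{-1}DX_i,
\end{aligned}$$

where $c = -r\log(2\pi)$ and the last step follows because $\sum_{i=1}^n r_iX_i^T = 0$. Consequently, $L(\beta, \Sigma)$ is maximized over $\beta$ by setting $\beta = B$ so $D = 0$, leaving the partially maximized log-likelihood

$$(2/n)L(\Sigma) = c - \log|\Sigma| - n^{-1}\sum_{i=1}^n r_i^T\Sigma^{-1}r_i.$$

It follows that the maximum likelihood estimator of $\Sigma$ is $S_{Y|X} = n^{-1}\sum_{i=1}^n r_ir_i^T$ and that the fully maximized log-likelihood is

$$\hat{L} = -(nr/2)\log(2\pi) - nr/2 - (n/2)\log|S_{Y|X}|.$$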
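As a numerical check on this derivation, the following sketch evaluates the normal log-likelihood at the least squares fit and compares it with the closed form $-(nr/2)\{1 + \log(2\pi)\} - (n/2)\log|S|$, where $S$ is the residual covariance matrix with divisor $n$. The function names `fit_ols` and `log_likelihood` and the simulated data are illustrative assumptions, not part of the text.

```python
import numpy as np

def fit_ols(Y, X):
    """Least squares fit of Y (n x r) on X (n x p).

    Returns the intercept estimate (mean of Y), the r x p coefficient
    matrix B, and the residual covariance matrix with divisor n.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)            # center predictors in the sample
    Yc = Y - Y.mean(axis=0)
    B = np.linalg.solve(Xc.T @ Xc, Xc.T @ Yc).T
    resid = Yc - Xc @ B.T
    return Y.mean(axis=0), B, resid.T @ resid / n

def log_likelihood(Y, X, alpha, beta, Sigma):
    """Multivariate normal log-likelihood for Y_i = alpha + beta X_i + error."""
    n, r = Y.shape
    Xc = X - X.mean(axis=0)
    E = Y - alpha - Xc @ beta.T        # n x r matrix of errors
    _, logdet = np.linalg.slogdet(Sigma)
    quad = np.trace(np.linalg.solve(Sigma, E.T @ E))
    return -0.5 * n * r * np.log(2 * np.pi) - 0.5 * n * logdet - 0.5 * quad

rng = np.random.default_rng(1)
n, p, r = 200, 3, 2
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, r)) + rng.normal(size=(n, r))

alpha_hat, B, S_res = fit_ols(Y, X)
L_hat = log_likelihood(Y, X, alpha_hat, B, S_res)
closed_form = (-0.5 * n * r * (1 + np.log(2 * np.pi))
               - 0.5 * n * np.linalg.slogdet(S_res)[1])
# Moving the coefficients away from B (same Sigma) can only lower the likelihood.
L_perturbed = log_likelihood(Y, X, alpha_hat, B + 0.05, S_res)
```

Here `L_hat` and `closed_form` agree up to rounding, and `L_perturbed` is smaller, which reflects the fact that the cross term between residuals and predictors vanishes.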

We notice from 1.2 that $B$ can be constructed by doing $r$ separate univariate linear regressions, one for each element of $Y$ on $X$. The coefficients from the $j$th regression then form the $j$th row of $B$, $j = 1, \ldots, r$. The stochastic relationships among the elements of $Y$ are not used in forming these estimators. However, as will be seen later, relationships among the elements of $Y$ play a central role in envelope estimation. Standard inference on $(\beta)_{jk}$, the $(j,k)$th element of $\beta$, under model 1.1 is the same as inference obtained under the univariate linear regression of $Y_j$, the $j$th element of $Y$, on $X$. Model 1.1 becomes operational as a multivariate construction when inferring simultaneously about elements in different rows of $\beta$ or when predicting elements of $Y$ jointly.
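The row-by-row construction can be illustrated with a minimal sketch, assuming simulated data and the hypothetical helper names `ols_joint` and `ols_row_by_row`: the joint matrix formula and the separate univariate regressions give the same coefficient matrix.

```python
import numpy as np

def ols_joint(Y, X):
    """r x p coefficient matrix from the joint formula Yc^T Xc (Xc^T Xc)^{-1}."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return np.linalg.solve(Xc.T @ Xc, Xc.T @ Yc).T

def ols_row_by_row(Y, X):
    """Same matrix, built from one univariate regression per response."""
    Xc = X - X.mean(axis=0)
    rows = []
    for j in range(Y.shape[1]):
        yj = Y[:, j] - Y[:, j].mean()
        coef, *_ = np.linalg.lstsq(Xc, yj, rcond=None)
        rows.append(coef)              # coefficients of the j-th regression
    return np.vstack(rows)

rng = np.random.default_rng(0)
n, p, r = 50, 3, 4
X = rng.normal(size=(n, p))
beta = rng.normal(size=(r, p))
Y = 1.0 + X @ beta.T + rng.normal(size=(n, r))

B_joint = ols_joint(Y, X)
B_rows = ols_row_by_row(Y, X)

# With centered predictors, centering the responses is optional.
Xc = X - X.mean(axis=0)
B_uncentered_Y = np.linalg.solve(Xc.T @ Xc, Xc.T @ Y).T
```

The three matrices coincide, so nothing about the dependence among the responses enters the least squares estimator.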

The sample covariance matrices of , , and can be expressed as

(1.3) 1.4 1.5

where is nonstochastic, denotes the projection onto the column space of , , is the sample covariance matrix of the fitted vectors , and is the sample covariance matrix of the residuals .

We will occasionally encounter the standardized version of ,

(1.6)

which corresponds to the estimated coefficient matrix from the ordinary least squares fit of the standardized responses on the standardized predictors .

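The joint distribution of the elements of $B$ can be found by using the $\mathrm{vec}$ operator to stack the columns of $B$: $\mathrm{vec}(B) = \{(\mathbb{X}^T\mathbb{X})^{-1}\mathbb{X}^T \otimes I_r\}\mathrm{vec}(\mathbb{Y}^T)$, where $\otimes$ denotes the Kronecker product. Since $\mathrm{vec}(\mathbb{Y}^T)$ is normally distributed with mean $1_n \otimes \alpha + (\mathbb{X} \otimes I_r)\mathrm{vec}(\beta)$ and variance $I_n \otimes \Sigma$, it follows that $\mathrm{vec}(B)$ is normally distributed with mean and variance

$$E\{\mathrm{vec}(B)\} = \mathrm{vec}(\beta), \tag{1.7}$$

$$\mathrm{var}\{\mathrm{vec}(B)\} = (\mathbb{X}^T\mathbb{X})^{-1} \otimes \Sigma = n^{-1}S_X^{-1} \otimes \Sigma. \tag{1.8}$$

The covariance matrix can also be represented in terms of $B^T$ by using the $rp \times rp$ commutation matrix $K_{rp}$ to convert $\mathrm{vec}(B)$ to $\mathrm{vec}(B^T)$: $\mathrm{vec}(B^T) = K_{rp}\mathrm{vec}(B)$ and

$$\mathrm{var}\{\mathrm{vec}(B^T)\} = n^{-1}K_{rp}(S_X^{-1} \otimes \Sigma)K_{rp}^T = n^{-1}\Sigma \otimes S_X^{-1}.$$

Background on the commutation matrix, $\mathrm{vec}$, and related operators is available in Appendix A. The variance $\mathrm{var}\{\mathrm{vec}(B)\}$ is typically estimated by substituting the residual covariance matrix $S_{Y|X}$ for $\Sigma$,

$$\widehat{\mathrm{var}}\{\mathrm{vec}(B)\} = n^{-1}S_X^{-1} \otimes S_{Y|X}. \tag{1.9}$$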
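The vec and commutation-matrix manipulations used here can be checked numerically. The sketch below is an illustration under assumptions: `vec` and `commutation_matrix` are hypothetical helpers written for this example, and the data are simulated.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one long vector."""
    return A.reshape(-1, order="F")

def commutation_matrix(r, p):
    """Permutation matrix K with K @ vec(A) == vec(A.T) for any r x p matrix A."""
    K = np.zeros((r * p, r * p))
    for i in range(r):
        for j in range(p):
            # A[i, j] is entry j*r + i of vec(A) and entry i*p + j of vec(A.T)
            K[i * p + j, j * r + i] = 1.0
    return K

rng = np.random.default_rng(2)
n, p, r = 30, 2, 3
X = rng.normal(size=(n, p))
X -= X.mean(axis=0)                    # centered predictor matrix
Y = rng.normal(size=(n, r))

XtX_inv = np.linalg.inv(X.T @ X)
B = Y.T @ X @ XtX_inv                  # r x p least squares coefficients

# vec(B) written as a Kronecker product acting on vec(Y^T)
vecB_kron = np.kron(XtX_inv @ X.T, np.eye(r)) @ vec(Y.T)

# Commutation matrix swaps the order of the Kronecker factors
K = commutation_matrix(r, p)
Sigma = np.cov(Y, rowvar=False)        # any r x r positive definite matrix would do
lhs = K @ np.kron(XtX_inv, Sigma) @ K.T
rhs = np.kron(Sigma, XtX_inv)
```

Both identities hold to rounding error: `vecB_kron` matches `vec(B)`, and `lhs` matches `rhs`.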

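Let $e_j \in \mathbb{R}^r$ denote the indicator vector with a 1 in the $j$th position and 0s elsewhere. Then the covariance matrix for the $j$th row of $B$ is

$$\mathrm{var}(B^Te_j) = n^{-1}(\Sigma)_{jj}S_X^{-1}.$$

We see from this that the covariance matrix for the $j$th row of $B$ is the same as that from the marginal linear regression of $Y_j$ on $X$. We refer to the estimator $(B)_{jk}$ divided by its standard error $\{n^{-1}(S_{Y|X})_{jj}(S_X^{-1})_{kk}\}^{1/2}$ as a $Z$-score:

$$Z = \frac{(B)_{jk}}{\{n^{-1}(S_{Y|X})_{jj}(S_X^{-1})_{kk}\}^{1/2}}. \tag{1.10}$$

This statistic will be used from time to time for assessing the magnitude of $(\beta)_{jk}$, sometimes converting to a $p$-value using the standard normal distribution.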
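A minimal sketch of the computation, assuming the residual covariance matrix is taken with divisor $n$ and that each standard error is the square root of the product of a diagonal element of that matrix and a diagonal element of the inverse predictor covariance matrix, divided by $n$. The function name `z_scores` and the simulated coefficients are hypothetical.

```python
import numpy as np

def z_scores(Y, X):
    """Element-wise ratios of least squares coefficients to their standard errors."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S_X = Xc.T @ Xc / n
    B = np.linalg.solve(Xc.T @ Xc, Xc.T @ Yc).T       # r x p
    resid = Yc - Xc @ B.T
    S_res = resid.T @ resid / n                       # residual covariance, divisor n
    # se[j, k] = sqrt( S_res[j, j] * inv(S_X)[k, k] / n )
    se = np.sqrt(np.outer(np.diag(S_res), np.diag(np.linalg.inv(S_X))) / n)
    return B / se, se

rng = np.random.default_rng(3)
n, p, r = 100, 2, 3
X = rng.normal(size=(n, p))
# Illustrative truth: only two coefficients are nonzero.
beta = np.array([[2.0, 0.0],
                 [0.0, 0.0],
                 [0.0, -1.5]])
Y = X @ beta.T + rng.normal(size=(n, r))

Z, se = z_scores(Y, X)
```

In this simulation the entries of `Z` for the two nonzero coefficients are far from 0, while the remaining entries behave roughly like standard normal draws.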

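We will occasionally encounter a conditional variate of the form $N \mid C^TN$, where $N \in \mathbb{R}^r$ is a normal vector with mean $\mu$ and variance $\Delta$, and $C \in \mathbb{R}^{r \times q}$ is a nonstochastic matrix with $q \le r$. The mean and variance of this conditional form are as follows:

$$E(N \mid C^TN) = \mu + \Delta C(C^T\Delta C)^{-1}C^T(N - \mu), \tag{1.11}$$

$$\mathrm{var}(N \mid C^TN) = \Delta - \Delta C(C^T\Delta C)^{-1}C^T\Delta. \tag{1.12}$$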

With $S_Y$ and $S_{Y|X}$ denoting the sample covariance matrices of the responses and of the residuals, the usual log-likelihood ratio statistic for testing that $\beta = 0$ is

$$\Lambda = n\log\frac{|S_Y|}{|S_{Y|X}|}, \tag{1.13}$$

which is asymptotically distributed under the null hypothesis as a chi-square random variable with $pr$ degrees of freedom. We will occasionally use this statistic in illustrations to assess the presence of any detectable dependence of $Y$ on $X$. This statistic is sometimes reported with an adjustment that is useful when $n$ is not large relative to $r$ and $p$ (Muirhead 2005, Section 10.5.2).
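The statistic can be sketched in a few lines, assuming it is computed as $n$ times the difference between the log-determinants of the sample covariance matrix of the responses and of the residual covariance matrix, both with divisor $n$. The function name `lrt_no_dependence` and the simulated data are hypothetical, and the small-sample adjustment mentioned above is not applied.

```python
import numpy as np

def lrt_no_dependence(Y, X):
    """Log-likelihood ratio statistic and degrees of freedom for 'all coefficients are 0'."""
    n, r = Y.shape
    p = X.shape[1]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S_Y = Yc.T @ Yc / n                               # covariance of the responses
    B = np.linalg.solve(Xc.T @ Xc, Xc.T @ Yc).T
    resid = Yc - Xc @ B.T
    S_res = resid.T @ resid / n                       # covariance of the residuals
    stat = n * (np.linalg.slogdet(S_Y)[1] - np.linalg.slogdet(S_res)[1])
    return stat, p * r

rng = np.random.default_rng(4)
n, p, r = 120, 2, 3
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, r)) + rng.normal(size=(n, r))

stat, df = lrt_no_dependence(Y, X)
```

The statistic is never negative, because the residual covariance matrix cannot exceed the response covariance matrix. A $p$-value can then be read from the upper tail of a chi-square distribution with `df` degrees of freedom, for example with `scipy.stats.chi2.sf(stat, df)` if SciPy is available.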

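The Fisher information for $(\mathrm{vec}^T(\beta), \mathrm{vech}^T(\Sigma))^T$ in model 1.1 is

$$J = \begin{pmatrix} \Sigma_X \otimes \Sigma^{-1} & 0 \\ 0 & \tfrac{1}{2}E_r^T(\Sigma^{-1} \otimes \Sigma^{-1})E_r \end{pmatrix}, \tag{1.14}$$

where $E_r$ is the expansion matrix that satisfies $\mathrm{vec}(A) = E_r\mathrm{vech}(A)$ for symmetric $A \in \mathbb{R}^{r \times r}$, and $\Sigma_X = \lim_{n \to \infty} S_X > 0$. It follows from standard likelihood theory that $\sqrt{n}\{\mathrm{vec}(B) - \mathrm{vec}(\beta)\}$ is asymptotically normal with mean 0 and variance given by the upper left block of $J^{-1}$,

$$\mathrm{avar}\{\sqrt{n}\,\mathrm{vec}(B)\} = \Sigma_X^{-1} \otimes \Sigma. \tag{1.15}$$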

Asymptotic normality holds also without normal errors but with some technical conditions: if the errors have finite fourth moments and , then converges in distribution to a normal vector with mean 0 (e.g. Su and Cook 2012, Theorem 2).

1.1.1 Partitioned Models and Added Variable Plots

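A subset of the predictors may occasionally be of special interest in multivariate regression. Partition $X$ into two sets of predictors $X_1 \in \mathbb{R}^{p_1}$ and $X_2 \in \mathbb{R}^{p_2}$, $p_1 + p_2 = p$, and conformably partition the columns of $\beta$ into $\beta_1$ and $\beta_2$. Then model 1.1 can be rewritten as

$$Y = \alpha + \beta_1X_1 + \beta_2X_2 + \varepsilon, \tag{1.16}$$

where $\beta_1$ holds the coefficients of interest. We next reparameterize this model to force the new predictors to be uncorrelated in the sample and to focus attention on $\beta_1$.

Recalling that $\bar{X} = 0$, let $\hat{R}_{1|2} = X_1 - S_{X_1,X_2}S_{X_2}^{-1}X_2$ denote a typical residual from the ordinary least squares fit of $X_1$ on $X_2$, and let $\beta_2^* = \beta_2 + \beta_1S_{X_1,X_2}S_{X_2}^{-1}$. Then the partitioned model can be reexpressed as

$$Y = \alpha + \beta_1\hat{R}_{1|2} + \beta_2^*X_2 + \varepsilon. \tag{1.17}$$

In this version of the partitioned model, the parameter matrix $\beta_1$ is the same as that in 1.16, while $\beta_2^* \neq \beta_2$ unless $\beta_1S_{X_1,X_2} = 0$. The predictors $\hat{R}_{1|2}$ and $X_2$ in 1.17 are uncorrelated in the sample, $S_{\hat{R}_{1|2},X_2} = 0$, and consequently the maximum likelihood estimator of $\beta_1$ is obtained by regressing $Y$ on $\hat{R}_{1|2}$. The maximum likelihood estimator of $\beta_1$ can also be obtained by regressing $\hat{R}_{Y|2}$, the residuals from the regression of $Y$ on $X_2$, on $\hat{R}_{1|2}$. A plot of $\hat{R}_{Y|2}$ versus $\hat{R}_{1|2}$ is called an added variable plot (Cook and Weisberg 1982). These plots are often used in univariate linear regression ($r = 1$) as general...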
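The equivalence of the three routes to the coefficients of interest can be checked numerically. The sketch below uses simulated data, and the helper names `center`, `ols`, and `residuals` are hypothetical; it compares the full fit, the regression of the responses on the residualized predictors of interest, and the regression of residuals on residuals that underlies the added variable plot.

```python
import numpy as np

def center(A):
    return A - A.mean(axis=0)

def ols(Y, X):
    """r x p coefficient matrix from regressing centered Y on centered X."""
    coef, *_ = np.linalg.lstsq(center(X), center(Y), rcond=None)
    return coef.T

def residuals(Y, X):
    """Residuals from the least squares fit of Y on X (with intercept)."""
    return center(Y) - center(X) @ ols(Y, X).T

rng = np.random.default_rng(5)
n, p1, p2, r = 80, 2, 3, 2
X2 = rng.normal(size=(n, p2))
X1 = 0.5 * X2[:, :p1] + rng.normal(size=(n, p1))      # X1 correlated with X2
Y = (X1 @ rng.normal(size=(p1, r))
     + X2 @ rng.normal(size=(p2, r))
     + rng.normal(size=(n, r)))

# Route 1: coefficients of X1 from the full regression on (X1, X2)
beta1_full = ols(Y, np.hstack([X1, X2]))[:, :p1]

# Route 2: regress Y on the residuals of X1 given X2
R_1_given_2 = residuals(X1, X2)
beta1_from_Y = ols(Y, R_1_given_2)

# Route 3: regress the residuals of Y given X2 on the residuals of X1 given X2
R_Y_given_2 = residuals(Y, X2)
beta1_from_resid = ols(R_Y_given_2, R_1_given_2)
```

All three estimates agree to rounding error. Plotting a column of `R_Y_given_2` against a column of `R_1_given_2` gives the corresponding added variable plot.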
