Multivariate Analysis

Kanti V. Mardia, John T. Kent, Charles C. Taylor (Authors)

Wiley (Publisher)

2nd Edition

Published on 10 June 2024

1094 pages

E-Book

ePUB with Adobe-DRM

978-1-118-89251-0 (ISBN)

€64.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

Multivariate Analysis

Comprehensive Reference Work on Multivariate Analysis and its Applications

The first edition of this book, by Mardia, Kent and Bibby, has been used globally for over 40 years. This second edition brings many topics up to date, with a special emphasis on recent developments.

A wide range of material in multivariate analysis is covered, including the classical themes of multivariate normal theory, multivariate regression, inference, multidimensional scaling, factor analysis, cluster analysis and principal component analysis. The book also now covers modern developments such as graphical models, robust estimation, statistical learning, and high-dimensional methods. The book expertly blends theory and application, providing numerous worked examples and exercises at the end of each chapter. The reader is assumed to have a basic knowledge of mathematical statistics at an undergraduate level together with an elementary understanding of linear algebra. There are appendices which provide a background in matrix algebra, a summary of univariate statistics, a collection of statistical tables and a discussion of computational aspects. The work includes coverage of:

* Basic properties of random vectors, copulas, normal distribution theory, and estimation
* Hypothesis testing, multivariate regression, and analysis of variance
* Principal component analysis, factor analysis, and canonical correlation analysis
* Discriminant analysis, cluster analysis, and multidimensional scaling
* New advances and techniques, including supervised and unsupervised statistical learning, graphical models and regularization methods for high-dimensional data

Although primarily designed as a textbook for final year undergraduates and postgraduate students in mathematics and statistics, the book will also be of interest to research workers and applied scientists.

Content

Epigraph

Preface to the Second Edition

Preface to the First Edition

Acknowledgments from First Edition

Notation, Abbreviations, and Key Ideas

1 Introduction

1.1 Objects and Variables

1.2 Some Multivariate Problems and Techniques

1.3 The Data Matrix

1.4 Summary Statistics

1.5 Linear Combinations

1.6 Geometrical Ideas

1.7 Graphical Representation

1.8 Measures of Multivariate Skewness and Kurtosis

Exercises and Complements

2 Basic Properties of Random Vectors

Introduction

2.1 Cumulative Distribution Functions and Probability Density Functions

2.2 Population Moments

2.3 Characteristic Functions

2.4 Transformations

2.5 The Multivariate Normal Distribution

2.6 Random Samples

2.7 Limit Theorems

Exercises and Complements

3 Nonnormal Distributions

3.1 Introduction

3.2 Some Multivariate Generalizations of Univariate Distributions

3.3 Families of Distributions

3.4 Insights into Skewness and Kurtosis

3.5 Copulas

Exercises and Complements

4 Normal Distribution Theory

4.1 Introduction and Characterization

4.2 Linear Forms

4.3 Transformations of Normal Data Matrices

4.4 The Wishart Distribution

4.5 The Hotelling T² Distribution

4.6 Mahalanobis Distance

4.7 Statistics Based on the Wishart Distribution

4.8 Other Distributions Related to the Multivariate Normal

Exercises and Complements

5 Estimation

Introduction

5.1 Likelihood and Sufficiency

5.2 Maximum-likelihood Estimation

5.3 Robust Estimation of Location and Dispersion for Multivariate Distributions

5.4 Bayesian Inference

Exercises and Complements

6 Hypothesis Testing

6.1 Introduction

6.2 The Techniques Introduced

6.3 The Techniques Further Illustrated

6.4 Simultaneous Confidence Intervals

6.5 The Behrens-Fisher Problem

6.6 Multivariate Hypothesis Testing: Some General Points

6.7 Nonnormal Data

6.8 Mardia's Nonparametric Test for the Bivariate Two-sample Problem

Exercises and Complements

7 Multivariate Regression Analysis

7.1 Introduction

7.2 Maximum-likelihood Estimation

7.3 The General Linear Hypothesis

7.4 Design Matrices of Degenerate Rank

7.5 Multiple Correlation

7.6 Least-squares Estimation

7.7 Discarding of Variables

Exercises and Complements

8 Graphical Models

8.1 Introduction

8.2 Graphs and Conditional Independence

8.3 Gaussian Graphical Models

8.4 Log-linear Graphical Models

8.5 Directed and Mixed Graphs

Exercises and Complements

9 Principal Component Analysis

9.1 Introduction

9.2 Definition and Properties of Principal Components

9.3 Sampling Properties of Principal Components

9.4 Testing Hypotheses About Principal Components

9.5 Correspondence Analysis

9.6 Allometry - Measurement of Size and Shape

9.7 Discarding of Variables

9.8 Principal Component Regression

9.9 Projection Pursuit and Independent Component Analysis

9.10 PCA in High Dimensions

Exercises and Complements

10 Factor Analysis

10.1 Introduction

10.2 The Factor Model

10.3 Principal Factor Analysis

10.4 Maximum-likelihood Factor Analysis

10.5 Goodness-of-fit Test

10.6 Rotation of Factors

10.7 Factor Scores

10.8 Relationships Between Factor Analysis and Principal Component Analysis

10.9 Analysis of Covariance Structures

Exercises and Complements

11 Canonical Correlation Analysis

11.1 Introduction

11.2 Mathematical Development

11.3 Qualitative Data and Dummy Variables

11.4 Qualitative and Quantitative Data

Exercises and Complements

12 Discriminant Analysis and Statistical Learning

12.1 Introduction

12.2 Bayes' Discriminant Rule

12.3 The Error Rate

12.4 Discrimination Using the Normal Distribution

12.5 Discarding of Variables

12.6 Fisher's Linear Discriminant Function

12.7 Nonparametric Distance-based Methods

12.8 Classification Trees

12.9 Logistic Discrimination

12.10 Neural Networks

Exercises and Complements

13 Multivariate Analysis of Variance

13.1 Introduction

13.2 Formulation of Multivariate One-way Classification

13.3 The Likelihood Ratio Principle

13.4 Testing Fixed Contrasts

13.5 Canonical Variables and a Test of Dimensionality

13.6 The Union Intersection Approach

13.7 Two-way Classification

Exercises and Complements

14 Cluster Analysis and Unsupervised Learning

14.1 Introduction

14.2 Probabilistic Membership Models

14.3 Parametric Mixture Models

14.4 Partitioning Methods

14.5 Hierarchical Methods

14.6 Distances and Similarities

14.7 Grouped Data

14.8 Mode Seeking

14.9 Measures of Agreement

Exercises and Complements

15 Multidimensional Scaling

15.1 Introduction

15.2 Classical Solution

15.3 Duality Between Principal Coordinate Analysis and Principal Component Analysis

15.4 Optimal Properties of the Classical Solution and Goodness of Fit

15.5 Seriation

15.6 Nonmetric Methods

15.7 Goodness of Fit Measure: Procrustes Rotation

15.8 Multisample Problem and Canonical Variates

Exercises and Complements

16 High-dimensional Data

16.1 Introduction

16.2 Shrinkage Methods in Regression

16.3 Principal Component Regression

16.4 Partial Least Squares Regression

16.5 Functional Data

Exercises and Complements

A Matrix Algebra

A.1 Introduction

A.2 Matrix Operations

A.3 Further Particular Matrices and Types of Matrices

A.4 Vector Spaces, Rank, and Linear Equations

A.5 Linear Transformations

A.6 Eigenvalues and Eigenvectors

A.7 Quadratic Forms and Definiteness

A.8 Generalized Inverse

A.9 Matrix Differentiation and Maximization Problems

A.10 Geometrical Ideas

B Univariate Statistics

B.1 Introduction

B.2 Normal Distribution

B.3 Chi-squared Distribution

B.4 F and Beta Variables

B.5 t Distribution

B.6 Poisson Distribution

C R Commands and Data

C.1 Basic R Commands Related to Matrices

C.2 R Libraries and Commands Used in Exercises and Figures

C.3 Data Availability

D Tables

References and Author Index

Index

Notation, Abbreviations, and Key Ideas

Matrices and Vectors

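Vectors are viewed as column vectors and are represented using bold lower case letters. Round brackets are generally used when a vector is expressed in terms of its elements. For example, $\mathbf{x} = (x_1, \ldots, x_p)^T$, in which the $i$th element or component is denoted $x_i$. The transpose of $\mathbf{x}$ is denoted $\mathbf{x}^T$, so $\mathbf{x}^T$ is a row vector.
Matrices are written using bold upper case letters, e.g. $\mathbf{A}$ and $\mathbf{B}$. The matrix $\mathbf{A}$ may be written as $\mathbf{A} = (a_{ij})$, in which $a_{ij}$ is the element of the matrix in row $i$ and column $j$. If $\mathbf{A}$ has $n$ rows and $p$ columns, then the $i$th row of $\mathbf{A}$, written as a column vector, is
$$\mathbf{a}_i = (a_{i1}, \ldots, a_{ip})^T,$$
and the $j$th column is written as
$$\mathbf{a}_{(j)} = (a_{1j}, \ldots, a_{nj})^T.$$

Hence, $\mathbf{A}$ can be expressed in various forms,
$$\mathbf{A} = (a_{ij}) = [\mathbf{a}_1, \ldots, \mathbf{a}_n]^T = [\mathbf{a}_{(1)}, \ldots, \mathbf{a}_{(p)}].$$

We generally use square brackets when a matrix is expanded in terms of its elements.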

Operations on a matrix $\mathbf{A}$ include
1. $\mathbf{A}^T$ - transpose
2. $|\mathbf{A}|$ - determinant
3. $\mathbf{A}^{-1}$ - inverse
4. $\mathbf{A}^{-}$ - generalized inverse
where for the final three operations, $\mathbf{A}$ is assumed to be square, and for the inverse operation, $\mathbf{A}$ is additionally assumed to be nonsingular. Different types of matrices are given in Tables A.1 and A.3. Table A.2 lists some further matrix operations.
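The four operations listed above can be tried out numerically. The following is a minimal sketch in Python with NumPy, not taken from the book (whose own examples use R); the matrices are hypothetical, and the Moore-Penrose pseudoinverse is used as one particular choice of generalized inverse.

```python
import numpy as np

# Hypothetical 3 x 3 example matrix (symmetric and nonsingular).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

A_T = A.T                     # transpose
det_A = np.linalg.det(A)      # determinant
A_inv = np.linalg.inv(A)      # inverse (requires A to be nonsingular)

# A singular square matrix has no inverse but still has generalized inverses.
# The Moore-Penrose pseudoinverse is one particular generalized inverse.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])    # rank 1, so its determinant is 0
B_ginv = np.linalg.pinv(B)

inverse_ok = np.allclose(A @ A_inv, np.eye(3))
ginverse_ok = np.allclose(B @ B_ginv @ B, B)   # defining property: B B^- B = B
```

The check `B @ B_ginv @ B == B` (up to rounding) is the property that any generalized inverse must satisfy, so it also holds for other choices than the pseudoinverse.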

Random Variables and Data

In general, a random vector and a nonrandom vector are both indicated using a bold lower case letter, e.g. $\mathbf{x}$. Thus, the distinction between the two must be determined from the context. This convention is in contrast to the standard convention in statistics where upper case letters are used to denote random quantities, and lower case letters their observed values.
The reason for our convention is that bold upper case letters are generally used for a data matrix $\mathbf{X}$, both random and fixed.
In spite of the above convention, we very occasionally (e.g. parts of Chapters 2 and 10) use bold upper case letters $\mathbf{X}$ for a random vector when it is important to distinguish between the random vector $\mathbf{X}$ and a possible value $\mathbf{x}$.
The phrase "high-dimensional data" often implies $p > n$, whereas the phrase "big data" often just indicates that $n$ or $p$ is large.

Parameters and Statistics

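Elements of an $n \times p$ data matrix $\mathbf{X}$ are generally written $x_{ij}$, where indices $i = 1, \ldots, n$ are used to label the observations, and indices $j = 1, \ldots, p$ are used to label the variables.

If the rows of a data matrix $\mathbf{X}$ are normally distributed with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, the following notation is used to distinguish various population and sample quantities:

* Mean vector: parameter $\boldsymbol{\mu}$, sample $\bar{\mathbf{x}}$
* Covariance matrix: parameter $\boldsymbol{\Sigma}$, sample $\mathbf{S}$
* Unbiased covariance matrix: sample $\mathbf{S}_u$
* Concentration matrix: parameter $\mathbf{K} = \boldsymbol{\Sigma}^{-1}$
* Correlation matrix: parameter $\mathbf{P}$, sample $\mathbf{R}$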
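The sample quantities named above can be computed directly from a data matrix. The sketch below uses Python with NumPy and a small hypothetical data set. It assumes that the plain sample covariance matrix uses divisor $n$ while the unbiased version uses divisor $n - 1$, and that the sample concentration matrix is obtained by inverting the sample covariance matrix; both are assumptions made for illustration, as are all variable names.

```python
import numpy as np

# Hypothetical data matrix: n = 5 observations (rows), p = 2 variables (columns).
X = np.array([[2.0, 1.0],
              [4.0, 3.0],
              [6.0, 2.0],
              [8.0, 5.0],
              [10.0, 4.0]])
n, p = X.shape

x_bar = X.mean(axis=0)                   # sample mean vector
H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
S = X.T @ H @ X / n                      # sample covariance matrix, divisor n (assumed)
S_u = S * n / (n - 1)                    # unbiased covariance matrix, divisor n - 1
K_hat = np.linalg.inv(S)                 # sample concentration matrix (plug-in inverse, assumed)
d = np.sqrt(np.diag(S))                  # sample standard deviations
R = S / np.outer(d, d)                   # sample correlation matrix
```

The correlation matrix does not depend on which divisor is used, since the factor cancels between numerator and denominator.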

Distributions

The following notation is used for univariate and multivariate distributions. Appendix B summarizes the univariate distributions used in the book.

* $F(\cdot)$: cumulative distribution function/distribution function (d.f.)
* $f(\cdot)$: probability density function (p.d.f.)
* $\mathsf{E}$: expectation
* c.f.: characteristic function
* d.f.: distribution function
* $T^2$: Hotelling $T^2$
* $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$: multivariate normal distribution in $p$-dimensions with mean $\boldsymbol{\mu}$ (column vector of length $p$) and covariance matrix $\boldsymbol{\Sigma}$
* variance-covariance matrix
* correlation matrix
* Wishart distribution

The terms variance matrix, covariance matrix, and variance-covariance matrix are synonymous.

Matrix Decompositions

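Any symmetric matrix $\mathbf{A}$ $(p \times p)$ can (by the spectral decomposition theorem) be written as $\mathbf{A} = \boldsymbol{\Gamma} \boldsymbol{\Lambda} \boldsymbol{\Gamma}^T$, where $\boldsymbol{\Lambda}$ is a diagonal matrix of eigenvalues of $\mathbf{A}$ (which are real-valued), i.e. $\boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1, \ldots, \lambda_p)$, and $\boldsymbol{\Gamma}$ is an orthogonal matrix whose columns are standardized eigenvectors, i.e. $\boldsymbol{\Gamma} = (\boldsymbol{\gamma}_{(1)}, \ldots, \boldsymbol{\gamma}_{(p)})$ and $\boldsymbol{\Gamma}^T \boldsymbol{\Gamma} = \mathbf{I}$. See Theorem A.6.8.
Using the above, we define the symmetric square root of a positive definite matrix $\mathbf{A}$ by $\mathbf{A}^{1/2} = \boldsymbol{\Gamma} \boldsymbol{\Lambda}^{1/2} \boldsymbol{\Gamma}^T$.
If $\mathbf{A}$ is an $n \times p$ matrix of rank $r$, then by the singular value decomposition, it can be written as $\mathbf{A} = \mathbf{U} \mathbf{L} \mathbf{V}^T$, where $\mathbf{U}$ $(n \times r)$ and $\mathbf{V}$ $(p \times r)$ are column orthonormal matrices, and $\mathbf{L}$ $(r \times r)$ is a diagonal matrix with positive elements. See Theorem A.6.8.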
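The three decompositions described above can be checked numerically. This is a minimal sketch in Python with NumPy under stated assumptions: the matrices are hypothetical examples, and the names `Gamma`, `Lambda`, `U`, `s`, and `Vt` are chosen for illustration only.

```python
import numpy as np

# Hypothetical symmetric positive definite matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Spectral decomposition: eigh returns real eigenvalues in ascending order
# and orthonormal eigenvectors as the columns of Gamma.
lam, Gamma = np.linalg.eigh(A)
Lambda = np.diag(lam)
spectral_ok = np.allclose(Gamma @ Lambda @ Gamma.T, A)
orthogonal_ok = np.allclose(Gamma.T @ Gamma, np.eye(2))

# Symmetric square root: take square roots of the eigenvalues.
A_half = Gamma @ np.diag(np.sqrt(lam)) @ Gamma.T
sqrt_ok = np.allclose(A_half @ A_half, A)

# Singular value decomposition of a hypothetical 3 x 2 matrix of full rank 2.
X = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
U, s, Vt = np.linalg.svd(X, full_matrices=False)   # thin SVD: U is 3 x 2, Vt is 2 x 2
svd_ok = np.allclose(U @ np.diag(s) @ Vt, X)
```

Because the example matrix `X` has full column rank, the thin SVD already has only positive singular values. For a rank-deficient matrix, the zero singular values and the corresponding columns would have to be dropped to obtain the rank-$r$ form with strictly positive diagonal elements.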

Geometry

Table A.5 sets out the basic concepts in $n$-dimensional geometry. In particular,

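* Length of a vector $\mathbf{x}$: $\|\mathbf{x}\| = (\mathbf{x}^T \mathbf{x})^{1/2}$
* Euclidean distance between $\mathbf{x}$ and $\mathbf{y}$: $\|\mathbf{x} - \mathbf{y}\|$
* Squared Mahalanobis distance - one of the most important distances in multivariate analysis, since it takes account of a covariance $\boldsymbol{\Sigma}$, i.e. $(\mathbf{x} - \mathbf{y})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \mathbf{y})$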

Table 14.6 gives a list of various distances.
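To illustrate how the Mahalanobis distance takes account of a covariance, the sketch below (Python with NumPy, hypothetical numbers) compares two displacement vectors of equal Euclidean length under a covariance matrix with positive correlation. The variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical covariance matrix with positive correlation between the variables.
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

# Two displacement vectors x - y with the same Euclidean length sqrt(2).
diff_along = np.array([1.0, 1.0])      # in the direction of the correlation
diff_against = np.array([1.0, -1.0])   # against the direction of the correlation

def squared_mahalanobis(diff, cov):
    """Return diff^T cov^{-1} diff, solving a linear system instead of inverting cov."""
    return float(diff @ np.linalg.solve(cov, diff))

euclid_along = np.linalg.norm(diff_along)
euclid_against = np.linalg.norm(diff_against)
d2_along = squared_mahalanobis(diff_along, Sigma)
d2_against = squared_mahalanobis(diff_against, Sigma)
```

Both displacements are equally long in the Euclidean sense, but the one that runs against the correlation has the larger Mahalanobis distance (2 versus 2/3 here), because such a displacement is less typical under this covariance structure.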

Main Abbreviations and Commonly Used Notation

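* $\approx$: approximately equal to
* $\perp\!\!\!\perp$: (conditionally) independent of
* $\sim$: is distributed as
* $A \setminus B$: the set of elements that are members of $A$ but not $B$
* $\|\mathbf{x} - \mathbf{y}\|$: Euclidean distance between $\mathbf{x}$ and $\mathbf{y}$
* $\mathbf{A}^T$: transpose of matrix $\mathbf{A}$
* $|\mathbf{A}|$: determinant of matrix $\mathbf{A}$
* $\mathbf{A}^{-1}$: inverse of matrix $\mathbf{A}$
* $\mathbf{A}^{-}$: $g$-inverse (generalized inverse)
* $\mathbf{1}$: column vector of 1s
* $\mathbf{0}$: column vector or matrix of 0s
* $\mathbf{B}$: between-groups sum of squares and products (SSP) matrix
* $B(\alpha, \beta)$: beta variable
* $\mathrm{B}(\alpha, \beta)$: normalizing constant for beta distribution (note nonitalic font to distinguish from the above)
* BLUE: best linear unbiased estimate
* covariance between $\mathbf{x}$ and $\mathbf{y}$
* $\chi^2_p$: chi-squared distribution with $p$ degrees of freedom
* $\chi^2_{p;\alpha}$: upper $\alpha$ critical value of chi-squared distribution with $p$ degrees of freedom
* c.f.: characteristic function
* $\partial/\partial \mathbf{x}$: partial derivative - multivariate examples in Appendix A.9
* $\mathbf{D}$: distance matrix
* squared Mahalanobis distance
* d.f.: distribution function
* $\delta_{ij}$: Kronecker delta
* $\operatorname{diag}$: diagonal elements of a square matrix (as column vector) or diagonal matrix created from a vector (see above)
* $\mathsf{E}$: expectation
* $F_{p,q}$: $F$ distribution with degrees of freedom $p$ and $q$
* $F_{p,q;\alpha}$: upper $\alpha$ critical value of $F$ distribution with degrees of freedom $p$ and $q$
* $F(\cdot)$: cumulative distribution function
* $f(\cdot)$: probability density function
* $\Gamma(\cdot)$: gamma function
* GLS: generalized least squares
* $\mathbf{H}$: centering matrix
* $\mathbf{I}$: identity matrix
* ICA: independent component analysis
* i.i.d.: independent and identically distributed
* $J$: Jacobian of transformation (see Table 2.1)
* $\mathbf{K}$: concentration matrix ($= \boldsymbol{\Sigma}^{-1}$)
* $L$: likelihood
* $l$: log likelihood
* LDA: linear discriminant analysis
* $\log$: logarithm to the base $e$ (natural logarithm)
* LRT: likelihood ratio test
* MANOVA: multivariate analysis of variance
* MDS: multidimensional scaling
* ML: maximum likelihood
* m.l.e.: maximum likelihood estimate
* $\boldsymbol{\mu}$: mean (population) vector
* $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$: multivariate normal distribution for...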
