An essential guide on high dimensional multivariate time series including all the latest topics from one of the leading experts in the field
Following the highly successful and much lauded book, Time Series Analysis: Univariate and Multivariate Methods, this new work by William W.S. Wei focuses on high dimensional multivariate time series, and is illustrated with numerous high dimensional empirical time series. Beginning with the fundamental concepts and issues of multivariate time series analysis, this book covers many topics that are not found in general multivariate time series books. Some of these are repeated measurements, space-time series modelling, and dimension reduction. The book also looks at vector time series models, multivariate time series regression models, and principal component analysis of multivariate time series. Additionally, it provides readers with information on factor analysis of multivariate time series, multivariate GARCH models, and multivariate spectral analysis of time series.
With the development of computers and the internet, we have increased potential for data exploration. In the next few years, dimension will become a more serious problem. Multivariate Time Series Analysis and its Applications provides some initial solutions, which may encourage the development of the software needed for high dimensional multivariate time series analysis.
* Written by bestselling author and leading expert in the field
* Covers topics not yet explored in current multivariate books
* Features classroom tested material
* Written specifically for time series courses
Multivariate Time Series Analysis and its Applications is designed for an advanced time series analysis course. It is a must-have for anyone studying time series analysis and is also relevant for students in economics, biostatistics, and engineering.
Fundamental concepts and issues in multivariate time series analysis
With the development of computers and the internet, we have had a data explosion. For example, a study of monthly cancer rates in the United States during the past 10 years can involve 50 or many hundreds or thousands of time series depending on whether we investigate the cancer rates for states, cities, or counties. Multivariate time series analysis methods are needed to properly analyze these data in a study, and these are different from standard statistical theory and methods based on random samples that assume independence. Dependence is the fundamental nature of the time series. The use of highly correlated high-dimensional time series data introduces many complications and challenges. The methods and theory to solve these issues will make up the content of this book.
In studying a phenomenon, we often encounter many variables, Zi,t, where i = 1, 2, ..., m, and the observations are taken according to the order of time, t. For convenience we use a vector, Zt = [Z1,t, Z2,t, ..., Zm,t]′, to denote the set of these variables, where Zi,t is the ith component variable at time t and is a random variable for each i and t. The time t in Zt can be continuous, taking any value in an interval, as in the time series of electric signals and voltages, or discrete, taking specific time points, as in the daily closing prices of various stocks or the total monthly sales of various products at the end of each month. In practice, even for a continuous time series, we take observations only at digitized discrete time points for analysis. Hence, we will consider only discrete time series in this book, and with no loss of generality, we will consider Zi,t, for i = 1, 2, ..., m, t = 0, ±1, ±2, ..., and hence Zt = [Z1,t, Z2,t, ..., Zm,t]′, t = 0, ±1, ±2, ....
We call Zt = [Z1,t, Z2,t, ..., Zm,t]′ a multivariate time series or a vector time series, where the first subscript refers to a component and the second subscript refers to the time. The fundamental characteristic of a multivariate time series is that its observations depend not only on component i but also on time t. The observations Zi,s and Zj,t can be correlated when i ≠ j, regardless of whether the times s and t are the same or not. The Zt are vector-valued random variables. Most standard statistical theory and methods based on random samples are not applicable, and different methods are clearly needed. The body of statistical theory and methods for analyzing these multivariate or vector time series is referred to as multivariate time series analysis.
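The point that different components can be correlated at different times can be seen in a small simulation. The following is only a sketch under stated assumptions: the two-component series, the coefficients 0.8 and 0.6, the series length, and the random seed are all hypothetical choices made for illustration and do not come from the book.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed, used only for reproducibility
T = 2000                          # hypothetical series length

# Component 1 is white noise; component 2 echoes component 1 one step later.
z1 = rng.standard_normal(T)
z2 = np.empty(T)
z2[0] = rng.standard_normal()
z2[1:] = 0.8 * z1[:-1] + 0.6 * rng.standard_normal(T - 1)

# Row t of Z holds the observed vector Z_t = [Z_{1,t}, Z_{2,t}]'.
Z = np.column_stack([z1, z2])

# Same time point (s = t): Z_{1,t} against Z_{2,t}.
same_time = np.corrcoef(Z[:, 0], Z[:, 1])[0, 1]

# Different time points (s = t - 1): Z_{1,t-1} against Z_{2,t}.
lagged = np.corrcoef(Z[:-1, 0], Z[1:, 1])[0, 1]

print(f"same-time correlation: {same_time:.3f}")
print(f"lag-one correlation:   {lagged:.3f}")
```

By construction the two components are nearly uncorrelated at the same time point but strongly correlated one step apart, so a method that treats the rows of Z as an independent random sample would miss the main structure in the data.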
Many issues are involved in multivariate time series analysis. They are different from standard statistical theory and methods based on a random sample that assumes independence and constant variance. In multivariate time series, Zt = [Z1,t, Z2,t, ..., Zm,t]′, a fundamental phenomenon is that dependence exists not only in terms of i but also in terms of t. In addition, we have the following important issues to consider:
- Fundamental concepts and representations related to dependence.
We will introduce the variance-covariance and correlation matrix functions, vector white noise processes, vector autoregressive and vector moving average representations, vector autoregressive models, vector moving average models, and vector autoregressive moving average models.
- Relationship between several multivariate time series.
A multiple regression is known to be a useful statistical model that describes a relationship between a response variable and several predictor variables. The error term in the model is normally assumed to be uncorrelated noise with zero mean and constant variance. We will extend the results to a multivariate time series regression model where both the response variables and the predictor variables are vectors. More importantly, not only are all components in the multivariate regression equation time series variables, but the error term also follows a correlated time series process.
- Dimension reduction and model simplification.
We will introduce useful methods of dimension reduction and representation that do not lose information, including principal component analysis and factor analysis of time series.
- Representations of time variant variance-covariance structure.
Unlike most classical linear methods, where the variance of the error term is assumed to be constant, in time series analysis a non-constant variance often occurs, and generalized autoregressive conditional heteroscedasticity (GARCH) models have been introduced to handle it. The literature and theory of GARCH models for univariate time series were introduced in Chapter 15 of Wei (2006). In this book, we will extend the results to multivariate GARCH models.
- Repeated measurement phenomenon.
Many fields of study, including medicine, biology, social science, and education, involve time series measurements of treatments for different subjects. They are multivariate time series but often relatively short, and the applications of standard time series methods are difficult, if not impossible. We will introduce some methods and various models that are useful for analyzing repeated measures data. Empirical examples will be used as illustrations.
- Space and time series modeling.
In many multivariate time series applications, the components i in Zi,t refer to regions or locations. For example, in a crime study, the observations can be the crime rates of different counties in a state, and in a market study, one could look at the price of a certain product in different regions. Thus, the analysis will involve both regions and time, and we will construct space and time series models.
- Multivariate spectral analysis.
Similar to univariate time series analysis where one can study a time series through its autocovariance/autocorrelation functions and lag relationships or through its spectrum properties, we can study a multivariate time series through a time domain or a frequency domain approach. In the time domain approach we use covariance/correlation matrices, and in the frequency domain approach we will use spectral matrices. We will introduce spectral analysis for both multivariate stationary and nonstationary time series.
- High dimension problem in multivariate time series.
Because of high-speed internet and the power and speed of the new generation of computers, a researcher now faces some very challenging phenomena. First, he/she must deal with an ever-increasing amount of data. To find useful information and hidden patterns underlying the data, a researcher may use various data-mining methods and techniques. Adding a time dimension to these large databases certainly introduces new aspects and challenges. In multivariate time series analysis, a very natural issue is the high dimension problem where the number of parameters may exceed the length of the time series. For example, a simple second-order vector autoregressive, VAR(2), model for the 50 states in the USA will involve more than 5000 parameters, while the time series may be much shorter: 20 years of monthly observations give a length of only 240. Traditional time series methods are not designed to deal with these kinds of high-dimensional variables. Even with today's computer power and speed, there are many difficult problems that remain unsolved. As most statistical methods are developed for a random sample, the use of highly correlated time series data certainly introduces a new set of complications and challenges, especially for a high-dimensional data set.
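The parameter count in the VAR(2) example can be reproduced with a few lines of arithmetic. This is a sketch, and the breakdown is an assumption: the text says only that there are "more than 5000" parameters, so the function below (a hypothetical helper, not from the book) counts the autoregressive coefficient matrices plus, as one plausible reading, a constant vector and a symmetric noise covariance matrix.

```python
def var_parameter_count(m, p):
    """Count free parameters of a VAR(p) model for an m-dimensional series.

    Assumed breakdown (not spelled out in the text): p coefficient
    matrices of size m x m, an m-vector of constants, and a symmetric
    m x m noise covariance matrix.
    """
    coefficients = p * m * m
    constants = m
    noise_covariance = m * (m + 1) // 2
    return {
        "coefficients": coefficients,
        "constants": constants,
        "noise_covariance": noise_covariance,
        "total": coefficients + constants + noise_covariance,
    }


counts = var_parameter_count(m=50, p=2)
series_length = 20 * 12   # 20 years of monthly observations

print(counts["coefficients"])  # 5000
print(counts["total"])         # 6325
print(series_length)           # 240
```

The coefficient matrices alone account for 2 × 50 × 50 = 5000 parameters, and the count grows with the square of the dimension m while the series length does not grow with m at all.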
The methods and theory to solve these issues will be the focus of this book. Examples and applications will be carefully chosen and presented.
1.2 Fundamental concepts
The m-dimensional vector time series process, Zt = [Z1,t, Z2,t, ..., Zm,t]′, is a stationary process if each of its component series is a univariate stationary process and its first two moments are time-invariant. Just as a univariate stationary process or model is characterized by its moments such as mean, autocorrelation function, and partial autocorrelation function, a stationary vector time series process or model is characterized by its mean vector, correlation matrix function, and partial correlation matrix function.
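An informal way to see what time-invariant first two moments mean in practice is to compare estimates taken from different stretches of a simulated stationary series. The sketch below makes several assumptions that are not in the text: the two-dimensional first-order vector autoregression, its coefficient matrix Phi, the series length, and the seed are hypothetical, and comparing two halves is a rough diagnostic, not a formal test of stationarity.

```python
import numpy as np

rng = np.random.default_rng(1)    # fixed seed, used only for reproducibility
T, m = 4000, 2                    # hypothetical length and dimension

# Hypothetical coefficient matrix; its eigenvalues (0.5 and 0.3) lie inside
# the unit circle, so the simulated process is stationary.
Phi = np.array([[0.5, 0.1],
                [0.0, 0.3]])

# Z_t = Phi Z_{t-1} + a_t with standard normal noise a_t, started at zero.
Z = np.zeros((T, m))
for t in range(1, T):
    Z[t] = Phi @ Z[t - 1] + rng.standard_normal(m)

first, second = Z[: T // 2], Z[T // 2:]

# Largest absolute difference between the two halves' sample mean vectors
# and between their sample covariance matrices.
mean_gap = np.abs(first.mean(axis=0) - second.mean(axis=0)).max()
cov_gap = np.abs(np.cov(first, rowvar=False) - np.cov(second, rowvar=False)).max()

print(f"largest mean difference:       {mean_gap:.3f}")
print(f"largest covariance difference: {cov_gap:.3f}")
```

For a stationary process both gaps should be small relative to the scale of the series and shrink as T grows. For a series with a trend or changing variance they would not.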
1.2.1 Correlation and partial correlation matrix functions
Let Zt = [Z1,t, Z2,t, ..., Zm,t]′, t = 0, ±1, ±2, ..., be an m-dimensional stationary real-valued vector process so that E(Zi,t) = µi is constant for each...