Methods and Applications of Longitudinal Data Analysis describes methods for the analysis of longitudinal data in the medical, biological and behavioral sciences. It introduces basic concepts and functions including a variety of regression models, and their practical applications across many areas of research. Statistical procedures featured within the text include:
- descriptive methods for delineating trends over time
- linear mixed regression models with both fixed and random effects
- covariance pattern models on correlated errors
- generalized estimating equations
- nonlinear regression models for categorical repeated measurements
- techniques for analyzing longitudinal data with non-ignorable missing observations
Emphasis is given to applications of these methods, using substantial empirical illustrations, designed to help users of statistics better analyze and understand longitudinal data.
Methods and Applications of Longitudinal Data Analysis equips both graduate students and professionals to confidently apply longitudinal data analysis to their particular discipline. It also provides a valuable reference source for applied statisticians, demographers and other quantitative methodologists.
- From novice to professional: this book starts with the introduction of basic models and ends with the description of some of the most advanced models in longitudinal data analysis
- Enables students to select the correct statistical methods to apply to their longitudinal data and avoid the pitfalls associated with incorrect selection
- Identifies the limitations of classical repeated measures models and describes newly developed techniques, along with real-world examples.
Serving as an introduction to the book, Chapter 1 describes the definition, historical background, data features and structures, and other general specifications of longitudinal data analysis. The purpose of the chapter is to lead the reader into the realm of longitudinal data analysis by addressing its significance, underlying hypotheses, basic expressions of longitudinal modeling, and existing issues. Missing data and intra-individual correlation are the two primary features of longitudinal data, and their impacts on longitudinal data analysis are therefore presented and discussed. The chapter also summarizes the organization of the book with a chapter-by-chapter description. Given the book's emphasis on applications and practice, two longitudinal datasets are used for empirical illustrations throughout the text, one from a randomized controlled clinical trial and one from a large-scale longitudinal survey. Chapter 1 describes these two datasets in detail.
Intra-individual correlation; longitudinal data; missing data patterns; multivariate and univariate data structures; pattern of change over time; repeated measurements
1.1 What is Longitudinal Data Analysis?
1.2 History of Longitudinal Analysis and its Progress
1.3 Longitudinal Data Structures
1.3.1 Multivariate Data Structure
1.3.2 Univariate Data Structure
1.3.3 Balanced and Unbalanced Longitudinal Data
1.4 Missing Data Patterns and Mechanisms
1.5 Sources of Correlation in Longitudinal Processes
1.6 Time Scale and the Number of Time Points
1.7 Basic Expressions of Longitudinal Modeling
1.8 Organization of the Book and Data Used for Illustrations
1.8.1 Randomized Controlled Clinical Trial on the Effectiveness of Acupuncture Treatment on PTSD
1.8.2 Asset and Health Dynamics among the Oldest Old (AHEAD)
1.1. What is longitudinal data analysis?
We live in a dynamic world full of change. A person grows, ages, and dies. During that process, we may contract disease, develop functional disability, and lose mental ability. Accompanying this biological life course, social change also occurs. We attend school, develop a career and retire. In the meantime, many of us experience family disruption, become involved in social activities, cultivate personal habits and hobbies, and make adjustments to our daily activities according to our physical and mental conditions. Indeed, change characterizes almost all aspects of our social lives, ranging from the aforementioned social facets to unemployment, drug use recidivism, occupational careers, and other social events. In these biological and social processes, the gradual changes and developments over a life course reflect a pattern of change over time. More formally, such changes and developments may be referred to as an individual's trajectory. In a wider scope, trajectories are also seen in the pattern of change referring to such phenomena as the decaying quality over time of a commercial product or the collapse of a political system in a country. In the field of business management, change in consumer purchasing behavior is generally linked both with individual characteristics and with competing products. In population studies, demographers are concerned with such longitudinal processes as internal and international migration, and intervals between successive births. In cases such as these events and in others, the pattern of change over time can be influenced and determined by various factors, such as genetic predisposition, illness, violence, environment, medical and social advancements, or the like. Therefore, each trajectory can differ significantly among individuals and other observational units, or by the variables that govern the timing and rate of change in a period of time. 
Data available at a single point in time do not suffice to analyze change and its pattern over time. Cross-sectional data, traditionally popular and widely used across the applied sciences, provide only a snapshot of a course and thus cannot reflect change, growth, or development. Aware of the limitations of cross-sectional studies, many researchers have advanced the analytic perspective by examining data with repeated measurements. When the same variable of interest is measured repeatedly at a number of time points, the change is displayed, its pattern over time is revealed, and constructive findings can be derived with regard to the significance of change. Data with repeated measurements are referred to as longitudinal data. In many longitudinal data designs, subjects are assigned to the levels of a treatment or of other risk factors over a number of time points that are separated by specified intervals.
Analyzing longitudinal data poses considerable challenges to statisticians and other quantitative methodologists because of several unique features inherent in such data. First, the most troublesome feature of longitudinal analysis is the presence of missing data in repeated measurements. In a longitudinal survey, the loss of observations on the variables of interest occurs frequently. For example, in a clinical trial on the effectiveness of a new medical treatment for a disease, patients may be lost to follow-up because of migration or health problems. In a longitudinal observational survey, some baseline respondents may lose interest in participating at subsequent times. These missing cases may possess unique characteristics and attributes, so that data collected at later time points may bear little resemblance to the sample initially gathered. Second, repeated measurements for the same observational unit are usually related because average responses vary randomly between individuals or other observational units, with some being fundamentally high and some being fundamentally low. Consequently, longitudinal data are clustered within observational units. In the meantime, an individual's repeated measurements may be a response to a time-varying, systematic process, resulting in serial correlation. Third, longitudinal data are generally ordered by time, at either equally spaced or unequally spaced intervals, with each scenario calling for a specific analytic approach. Sometimes, even with an equal-spacing design, some respondents may enter a follow-up investigation after a specified survey date, which, in turn, imposes unequal intervals for different individuals.
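The clustering described in the second feature can be illustrated with a small simulation. The sketch below is not taken from the book: it assumes a simple random-intercept setup in which each subject has a personal level (drawn with standard deviation `sd_between`) plus independent occasion-specific noise (standard deviation `sd_within`). All function names, parameter names, and numeric values are hypothetical choices for illustration. Under these assumptions, the correlation between any two measurements on the same subject should be close to sd_between^2 / (sd_between^2 + sd_within^2).

```python
import random
from statistics import mean


def simulate_random_intercept(n_subjects, n_times, sd_between, sd_within, seed=0):
    """Simulate repeated measurements: subject level + occasion noise (assumed model)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        # Subject-specific level: some subjects are fundamentally high, some low.
        level = 10.0 + rng.gauss(0.0, sd_between)
        # Each occasion adds independent noise around that subject's level.
        data.append([level + rng.gauss(0.0, sd_within) for _ in range(n_times)])
    return data


def pearson(x, y):
    """Plain Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def theoretical_icc(sd_between, sd_within):
    """Within-subject correlation implied by the random-intercept assumption."""
    return sd_between ** 2 / (sd_between ** 2 + sd_within ** 2)


# Illustrative values only: 2000 subjects, 4 time points.
data = simulate_random_intercept(2000, 4, sd_between=2.0, sd_within=1.0)
first = [row[0] for row in data]
last = [row[3] for row in data]
empirical = pearson(first, last)
expected = theoretical_icc(2.0, 1.0)
print(f"empirical correlation: {empirical:.2f}, implied by the model: {expected:.2f}")
```

Because the two measurements share the same subject-specific level, they are correlated even though the occasion-specific noise is independent. Methods that treat repeated measurements as independent observations ignore this clustering. The sketch leaves out serial correlation and missing data, both of which the text also identifies as features of longitudinal data.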
Over the years, scientists have developed a variety of statistical models and methods to analyze longitudinal data. Most of these techniques were developed in biomedical and psychological settings, and they therefore remain relatively unfamiliar to researchers in other disciplines. To date, many researchers still analyze longitudinal data with incorrect statistical methods, paying insufficient attention to the unique features of such data. These researchers can readily borrow the advanced models and methods developed specifically for longitudinal data analysis after careful verification, evaluation, and modification. In health and aging research, for example, the pattern of change in health status is generally the main focus. In analyzing such longitudinal courses, failure to use appropriate methods can result in severe bias in parameter estimates and outcome predictions. In these areas, the application of advanced models and methods is essential.
1.2. History of longitudinal analysis and its progress
There were some vague, sporadic discussions about the theory of random effects and growth as early as the nineteenth century (Gompertz, 1820; Ware and Liang, 1996). The year 1918 witnessed the advent of the earliest repeated measures analysis, when Fisher (1918) published the celebrated article on the analysis of variance (ANOVA). In this historical masterpiece, Fisher introduced variance-component models and the concept of "intraclass correlation." Some later works extended Fisher's approach to the domain of mixed modeling with the development of such concepts as the split-plot design and the multilevel ANOVA (Yates, 1935; Jackson, 1939). For a long period of time, these variance decomposition methods were the major statistical tool for analyzing repeated measurements. Though simplistic in many ways, these early works provided a solid foundation for the development of modern mixed modeling techniques. Around the same period, there were also some early mathematical formulations of trajectories to analyze the pattern of change over time in biological and social research (Baker, 1954; Rao, 1958; Wishart, 1938; see the summary in Bollen and Curran, 2006, Chapter 1). Until the early 1980s, however, longitudinal data analysis was largely restricted to the formulation of the classical repeated measures analysis traditionally applied in biomedical settings. Given the substantial limitations and constraints in the traditional approaches in repeated...