
An Introduction to Econometric Theory
Description
A GUIDE TO ECONOMICS, STATISTICS AND FINANCE THAT EXPLORES THE MATHEMATICAL FOUNDATIONS UNDERLYING ECONOMETRIC METHODS
An Introduction to Econometric Theory offers a text to help in the mastery of the mathematics that underlie econometric methods and includes a detailed study of matrix algebra and distribution theory. Designed to be an accessible resource, the text explains in clear language why things are being done, and how previous material informs a current argument. The style is deliberately informal with numbered theorems and lemmas avoided. However, very few technical results are quoted without some form of explanation, demonstration or proof.
The author, a noted expert in the field, covers a wealth of topics including: simple regression, basic matrix algebra, the general linear model, distribution theory, the normal distribution, properties of least squares, unbiasedness and efficiency, eigenvalues, statistical inference in regression, t and F tests, the partitioned regression, specification analysis, random regressor theory, introduction to asymptotics and maximum likelihood. Each of the chapters is supplied with a collection of exercises, some of which are straightforward and others more challenging. This important text:
- Presents a guide for teaching econometric methods to undergraduate and graduate students of economics, statistics or finance
- Offers proven classroom-tested material
- Contains sets of exercises that accompany each chapter
- Includes a companion website that hosts additional materials, a solution manual and lecture slides
Written for undergraduates and graduate students of economics, statistics or finance, An Introduction to Econometric Theory is an essential beginner's guide to the underpinnings of econometrics.
Person
JAMES DAVIDSON is Professor of Econometrics at the University of Exeter. He has also held teaching posts at the University of Warwick, the London School of Economics, the University of Wales Aberystwyth and Cardiff University, as well as visiting positions at the University of California, Berkeley, the University of California, San Diego, and Central European University, Budapest.
Content
List of Figures
Preface
About the Companion Website
Part I Fitting
1 Elementary Data Analysis
1.1 Variables and Observations
1.2 Summary Statistics
1.3 Correlation
1.4 Regression
1.5 Computing the Regression Line
1.6 Multiple Regression
1.7 Exercises
2 Matrix Representation
2.1 Systems of Equations
2.2 Matrix Algebra Basics
2.3 Rules of Matrix Algebra
2.4 Partitioned Matrices
2.5 Exercises
3 Solving the Matrix Equation
3.1 Matrix Inversion
3.2 Determinant and Adjoint
3.3 Transposes and Products
3.4 Cramer's Rule
3.5 Partitioning and Inversion
3.6 A Note on Computation
3.7 Exercises
4 The Least Squares Solution
4.1 Linear Dependence and Rank
4.2 The General Linear Regression
4.3 Definite Matrices
4.4 Matrix Calculus
4.5 Goodness of Fit
4.6 Exercises
Part II Modelling
5 Probability Distributions
5.1 A Random Experiment
5.2 Properties of the Normal Distribution
5.3 Expected Values
5.4 Discrete Random Variables
5.5 Exercises
6 More on Distributions
6.1 Random Vectors
6.2 The Multivariate Normal Distribution
6.3 Other Continuous Distributions
6.4 Moments
6.5 Conditional Distributions
6.6 Exercises
7 The Classical Regression Model
7.1 The Classical Assumptions
7.2 The Model
7.3 Properties of Least Squares
7.4 The Projection Matrices
7.5 The Trace
7.6 Exercises
8 The Gauss-Markov Theorem
8.1 A Simple Example
8.2 Efficiency in the General Model
8.3 Failure of the Assumptions
8.4 Generalized Least Squares
8.5 Weighted Least Squares
8.6 Exercises
Part III Testing
9 Eigenvalues and Eigenvectors
9.1 The Characteristic Equation
9.2 Complex Roots
9.3 Eigenvectors
9.4 Diagonalization
9.5 Other Properties
9.6 An Interesting Result
9.7 Exercises
10 The Gaussian Regression Model
10.1 Testing Hypotheses
10.2 Idempotent Quadratic Forms
10.3 Confidence Regions
10.4 t Statistics
10.5 Tests of Linear Restrictions
10.6 Constrained Least Squares
10.7 Exercises
11 Partitioning and Specification
11.1 The Partitioned Regression
11.2 Frisch-Waugh-Lovell Theorem
11.3 Misspecification Analysis
11.4 Specification Testing
11.5 Stability Analysis
11.6 Prediction Tests
11.7 Exercises
Part IV Extensions
12 Random Regressors
12.1 Conditional Probability
12.2 Conditional Expectations
12.3 Statistical Models Contrasted
12.4 The Statistical Assumptions
12.5 Properties of OLS
12.6 The Gaussian Model
12.7 Exercises
13 Introduction to Asymptotics
13.1 The Law of Large Numbers
13.2 Consistent Estimation
13.3 The Central Limit Theorem
13.4 Asymptotic Normality
13.5 Multiple Regression
13.6 Exercises
14 Asymptotic Estimation Theory
14.1 Large Sample Efficiency
14.2 Instrumental Variables
14.3 Maximum Likelihood
14.4 Gaussian ML
14.5 Properties of ML Estimators
14.6 Likelihood Inference
14.7 Exercises
Part V Appendices
A The Binomial Coefficients
B The Exponential Function
C Essential Calculus
D The Generalized Inverse
Recommended Reading
Index
1
Elementary Data Analysis
1.1 Variables and Observations
Where to begin? Data analysis is the business of summarizing a large volume of information into a smaller compass, in a form that a human investigator can appreciate, assess, and draw conclusions from. The idea is to smooth out incidental variations so as to bring the 'big picture' into focus, and the fundamental concept is averaging, extracting a representative value or central tendency from a collection of cases. The correct interpretation of these averages, and functions of them, on the basis of a model of the environment in which the observed data are generated, is the main concern of statistical theory. However, before tackling these often difficult questions, gaining familiarity with the methods of summarizing sample information and doing the associated calculations is an essential preliminary.
Information must be recorded in some numerical form. Data may consist of measured magnitudes, which in econometrics are typically monetary values, prices, indices, or rates of exchange. However, another important data type is the binary indicator of membership of some class or category, expressed numerically by ones and zeros. A thing or entity of which different instances are observed at different times or places is commonly called a variable. The instances themselves, of which collections are to be made and then analyzed, are the observations. The basic activity to be studied in this first part of the book is the application of mathematical formulae to the observations on one or more variables.
These formulae are, to a large extent, human-friendly versions of coded computer routines. In practice, econometric calculations are always done on computers, sometimes with spreadsheet programs such as Microsoft Excel but more often using specialized econometric software packages. Simple cases are traditionally given to students to carry out by hand, not because they ever need to be done this way but hopefully to cultivate insight into what it is that computers do. Making the connection between formulae on the page and the results of running estimation programs on a laptop is a fundamental step on the path to econometric expertise.
The most basic manipulation is to add up a column of numbers, where the word "column" is chosen deliberately to evoke the layout of a spreadsheet but could equally refer to the page of an accounting ledger in the ink-and-paper technology of a now-vanished age. Nearly all of the important concepts can be explained in the context of a pair of variables. To give them names, call them $x$ and $y$. Going from two variables up to three and more introduces no fundamental new ideas. In linear regression analysis, variables are always treated in pairs, no matter how many are involved in the calculation as a whole.
Thus, let $(x, y)$ denote the pair of variables chosen for analysis. The enclosure of the symbols in parentheses, separated by a comma, is a simple way of indicating that these items are to be taken together, but note that $(x, y)$ is not to be regarded as just another way of writing $(y, x)$. The order in which the variables appear is often significant.
Let $T$, a positive whole number, denote the number of observations or in other words the number of rows in the spreadsheet. Such a collection of observations, whose order may or may not be significant, is often called a series. The convention for denoting which row the observation belongs to is to append a subscript. Sometimes the letters $i$, $j$, or $k$ are used as row labels but there are typically other uses for these, and in this book we generally adopt the symbol $t$ for this purpose. Thus, the contents of a pair of spreadsheet columns may be denoted symbolically as
$$x_1, x_2, \ldots, x_T \qquad \text{and} \qquad y_1, y_2, \ldots, y_T.$$
We variously refer to the $x_t$ and $y_t$ as the elements or the coordinates of their respective series.
This brings us inevitably to the question of the context in which observations are made. Very frequently, macroeconomic or financial variables (prices, interest rates, demand flows, asset stocks) are recorded at successive dates, at intervals of days, months, quarters, or years, and then $t$ is simply a date, standardized with respect to the time interval and the first observation. Such data sets are called time series. Economic data may also be observations of individual economic units. These can be workers or consumers, households, firms, industries, and sometimes regions, states, and countries. The observations can represent quantities such as incomes, rates of expenditure on consumption or investment, and also individual characteristics, such as family size, numbers of employees, population, and so forth. If these observations relate to a common date, the data set is called a cross-section. The ordering of the rows typically has no special significance in this case.
Increasingly commonly studied in economics are data sets with both a time and a cross-sectional dimension, known as panel data, representing a succession of observations on the same cross section of entities. In this case two subscripts are called for, say $i$ and $t$. However, the analysis of panel data is an advanced topic not covered in this book, and for observations we can stick to single subscripts henceforth.
1.2 Summary Statistics
As remarked at the beginning, the basic statistical operation of averaging is a way of measuring the central tendency of a set of data. Take a column of numbers, add them up, and divide by $T$. This operation defines the sample mean of the series, usually written as the symbol for the designated variable with a bar over the top. Thus,
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_T}{T} = \frac{1}{T}\sum_{t=1}^{T} x_t \tag{1.1}$$
where the second equality defines the 'sigma' representation of the sum. The Greek letter $\Sigma$, decorated with upper and lower limits, is a neat way to express the adding-up operation, noting the vital role of the subscript in showing which items are to be added together. The formula for $\bar{y}$ is constructed in just the same way.
The idea of the series mean extends from raw observations to various constructed series. The mean deviations are the series
$$x_t - \bar{x}, \qquad t = 1, \ldots, T.$$
Naturally enough this 'centred' series has zero mean, identically:
$$\frac{1}{T}\sum_{t=1}^{T} (x_t - \bar{x}) = \frac{1}{T}\sum_{t=1}^{T} x_t - \bar{x} = 0. \tag{1.2}$$
Not such an interesting fact, perhaps, but the statistic obtained as the mean of the squared mean deviations is very interesting indeed. This is the sample variance,
$$s_x^2 = \frac{1}{T-1}\sum_{t=1}^{T} (x_t - \bar{x})^2 \tag{1.3}$$
which contains information about how the series varies about its central tendency. The same information, but with units of measurement matching the original data, is conveyed by the square root $s_x$, called the standard deviation of the series. If $\bar{x}$ is a measure of location, then $s_x$ is a measure of dispersion.
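The summary statistics just defined can be tried out on a small column of numbers. The following Python sketch is illustrative only: the data values and the function names `sample_mean`, `mean_deviations`, and `sample_variance` are assumptions made for this example, not taken from the book. The variance divides the sum of squared mean deviations by one less than the number of observations.

```python
import math

def sample_mean(xs):
    """Add up the column and divide by the number of observations."""
    return sum(xs) / len(xs)

def mean_deviations(xs):
    """The 'centred' series: each coordinate minus the sample mean."""
    m = sample_mean(xs)
    return [x - m for x in xs]

def sample_variance(xs):
    """Sum of squared mean deviations divided by (observations - 1)."""
    devs = mean_deviations(xs)
    return sum(d * d for d in devs) / (len(xs) - 1)

x = [2.0, 4.0, 4.0, 6.0, 9.0]        # hypothetical data, 5 observations
x_bar = sample_mean(x)                # location
centred_sum = sum(mean_deviations(x)) # zero, up to rounding error
s2 = sample_variance(x)               # dispersion in squared units
s = math.sqrt(s2)                     # dispersion in the units of the data
print(x_bar, centred_sum, s2, s)
```

For these five values the mean is 5 and the squared mean deviations sum to 28, so the variance is 28/4 = 7.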
One of the mysteries of the variance formula is the division by $T-1$, not $T$ as for the mean itself. There are important technical reasons for this, but to convey the intuition involved here, it may be helpful to think about the case where $T = 1$, a single observation. Clearly, the mean formula still makes sense, because it gives $\bar{x} = x_1$. This is the best that can be done to measure location. There is clearly no possibility of computing a measure of dispersion, and the fact that the formula would involve dividing by zero gives warning that it is not meaningful to try. In other words, to measure the dispersion as $0$, which is what (1.3) would produce with division by $T$ instead of $T-1$, would be misleading. Rather, it is correct to say that no measure of dispersion exists.
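Statistical software tends to follow the same reasoning. As a small illustration (the data value is hypothetical), Python's standard `statistics` module returns a mean for a single observation but raises `StatisticsError` when asked for the sample variance, rather than reporting zero:

```python
import statistics

single = [3.5]  # hypothetical series with one observation

location = statistics.mean(single)  # the mean is just the observation itself

try:
    dispersion = statistics.variance(single)  # sample variance needs >= 2 points
except statistics.StatisticsError:
    dispersion = None  # no measure of dispersion exists

print(location, dispersion)
```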
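Another property of the variance formula worth remarking is found by multiplying out the squared terms and summing them separately, thus:
$$\frac{1}{T-1}\sum_{t=1}^{T} (x_t - \bar{x})^2 = \frac{1}{T-1}\left(\sum_{t=1}^{T} x_t^2 - 2\bar{x}\sum_{t=1}^{T} x_t + T\bar{x}^2\right) = \frac{1}{T-1}\left(\sum_{t=1}^{T} x_t^2 - T\bar{x}^2\right). \tag{1.4}$$
In the first equality, note that "adding up" $T$ instances of $\bar{x}^2$ (which does not depend on $t$) is the same thing as just multiplying by $T$. The second equality then follows by cancellation, given the definition (1.1), which implies $\sum_{t=1}^{T} x_t = T\bar{x}$. This result shows that to compute the variance, there is no need to perform subtractions on the individual coordinates. Simply add up the squares of the coordinates, subtract $T$ times the squared mean, and divide by $T-1$. Clearly, this second formula is more convenient for hand calculations than the first one.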
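The equivalence of the two ways of computing the variance can be confirmed numerically. This is a minimal sketch with hypothetical data; the function names `variance_from_deviations` and `variance_shortcut` are assumptions for the example.

```python
import math

def variance_from_deviations(xs):
    """Subtract the mean from each coordinate, square, sum, divide by n - 1."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

def variance_shortcut(xs):
    """Sum of squares minus n times the squared mean, divided by n - 1."""
    n = len(xs)
    m = sum(xs) / n
    return (sum(x * x for x in xs) - n * m * m) / (n - 1)

x = [2.0, 4.0, 4.0, 6.0, 9.0]  # hypothetical data
v1 = variance_from_deviations(x)
v2 = variance_shortcut(x)
print(v1, v2, math.isclose(v1, v2))
```

The shortcut suits hand calculation, as the text says. In floating-point arithmetic it can lose accuracy when the mean is large relative to the spread, because two nearly equal large numbers are subtracted, so the mean-deviation form is usually the safer choice on a computer.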
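The information contained in the standard deviation is nicely captured by a famous result in statistics called Chebyshev's rule, after the noted Russian mathematician who discovered it. Consider, for some chosen positive number $\varepsilon$, whether a series coordinate $x_t$ falls 'far from' the central tendency of the data set in the sense that either $x_t > \bar{x} + \varepsilon$ or $x_t < \bar{x} - \varepsilon$. In other words, does $x_t$ lie beyond a distance $\varepsilon$ from the mean, either above or below? This condition can be expressed as
$$(x_t - \bar{x})^2 > \varepsilon^2. \tag{1.5}$$
Letting $n_\varepsilon$ denote the number of cases that satisfy inequality (1.5), the inequality
$$n_\varepsilon \varepsilon^2 \le \sum_{(x_t - \bar{x})^2 > \varepsilon^2} (x_t - \bar{x})^2 \tag{1.6}$$
is true by definition, where the 'sigma' notation variant expresses compactly the sum of the terms satisfying the stated condition. However, it is also the case that
$$\sum_{(x_t - \bar{x})^2 > \varepsilon^2} (x_t - \bar{x})^2 \le \sum_{t=1}^{T} (x_t - \bar{x})^2 = (T-1)s_x^2 \tag{1.7}$$
since, remembering the definition of $s_x^2$ from (1.3), the sum cannot exceed $(T-1)s_x^2$, even with $n_\varepsilon = T$. Putting together the inequalities in (1.6) and (1.7) and also dividing through by $T$ and by $\varepsilon^2$ yields the result
$$\frac{n_\varepsilon}{T} \le \frac{T-1}{T}\,\frac{s_x^2}{\varepsilon^2}. \tag{1.8}$$
In words, the proportion of series coordinates falling beyond a distance $\varepsilon$ from the mean is at...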
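The chain of inequalities can be checked on data. In the sketch below the sample values, the function name `chebyshev_check`, and the symbol names are assumptions made for illustration: `n` is the number of observations, `eps` the chosen positive distance, and the bound is what results from combining (1.6) and (1.7) and dividing by the number of observations and by the squared distance, taking the variance with division by `n - 1`.

```python
def chebyshev_check(xs, eps):
    """Return (proportion of coordinates farther than eps from the mean, bound)."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / (n - 1)          # sample variance
    n_far = sum(1 for x in xs if (x - m) ** 2 > eps ** 2)  # cases beyond eps
    proportion = n_far / n
    bound = (n - 1) * s2 / (n * eps ** 2)
    return proportion, bound

x = [2.0, 4.0, 4.0, 6.0, 9.0]  # hypothetical data
proportion, bound = chebyshev_check(x, 3.5)
print(proportion, bound)

# The inequality should hold for any positive distance, not just one choice.
for eps in (0.5, 1.0, 2.0, 3.5, 5.0):
    p, b = chebyshev_check(x, eps)
    assert p <= b
```

With a distance of 3.5 only the observation 9 lies that far from the mean of 5, so the proportion is 1/5, comfortably below the bound.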
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.