## Statistical Prerequisites

Before we demonstrate the actual mechanics of establishing confidence intervals and testing statistical hypotheses, it is assumed that the reader is familiar with the fundamental concepts of probability and statistics. Although not a substitute for a basic course in statistics, Appendix A provides the essentials of statistics with which the reader should be totally familiar. Key concepts such as probability, probability distributions, Type I and Type II errors, level of significance, power of a...

## Info

FIGURE 12.1 Patterns of autocorrelation and nonautocorrelation. Starting at the bottom of the recession, when economic recovery starts, most of these series start moving upward. In this upswing, the value of a series at one point in time is greater than its previous value. Thus there is a momentum built into them, and it continues until something happens (e.g., increase in interest rate or taxes or both) to slow them down. Therefore, in regressions involving time series data, successive...

## Exercises

For the illustrative example discussed in Section C.10, the X'X and X'y using the data in the deviation form are as follows. c. Obtain the variance of $\hat\beta_2$ and $\hat\beta_3$ and their covariances. e. Comparing your results with those given in Section C.10, what do you find are the advantages of the deviation form? C.2. Refer to exercise 22.23. Using the data given therein, set up the appropriate (X'X) matrix and the X'y vector and estimate the parameter vector $\boldsymbol\beta$ and its variance-covariance matrix. Also...

## Illustration Of Gprobit Using Housing Example

Let us continue with our housing example. We have already presented the results of the glogit model for this example. The grouped probit (gprobit) results of the same data are as follows. Using the n.e.d. (= I) given in Table 15.10, the regression results are as shown in Table 15.11. The regression results based on the probits (= n.e.d. + 5) are as shown in Table 15.12. Except for the intercept term, these results are identical with those given in the previous table. But this should not be...

## Eu2

The quantity $\hat\sigma = \sqrt{\sum \hat u_i^2 / (n - 2)}$ is known as the standard error of estimate or the standard error of the regression (se). It is simply the standard deviation of the Y values about the estimated regression line and is often used as a summary measure of the goodness of fit of the estimated regression line, a topic discussed in Section 3.5. Earlier we noted that, given $X_i$, $\sigma^2$ represents the (conditional) variance of both $u_i$ and $Y_i$. Therefore, the standard error of the estimate can also be called the (conditional) standard...
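As a minimal sketch of how the standard error of the regression is computed in the two-variable case, the snippet below fits the line by OLS and then takes the square root of the residual sum of squares divided by n - 2. The data and the function name `ols_two_variable` are hypothetical and serve only as an illustration.

```python
import math

def ols_two_variable(x, y):
    """Return (intercept, slope, sigma_hat) for Y = b1 + b2*X + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    # Residual sum of squares about the fitted line
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    # Two degrees of freedom are used up by b1 and b2
    sigma_hat = math.sqrt(rss / (n - 2))
    return b1, b2, sigma_hat

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, b2, sigma_hat = ols_two_variable(x, y)
```

A smaller `sigma_hat` means the observed Y values lie closer to the fitted line.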

## Statistical Versus Deterministic Relationships

From the examples cited in Section 1.2, the reader will notice that in regression analysis we are concerned with what is known as the statistical, not functional or deterministic, dependence among variables, such as those of classical physics. In statistical relationships among variables we essentially deal with random or stochastic variables, that is, variables that have probability distributions. In functional or deterministic dependency, on the other hand, we also deal with variables, but...

## Example

CONSUMPTION-INCOME RELATIONSHIP IN THE UNITED STATES, 1982-1996. Let us return to the consumption-income data given in Table I.1 of the Introduction. We have already shown the data in Figure I.3 along with the estimated regression line (I.3.3). Now we provide the underlying OLS regression results. (The results were obtained from the statistical package Eviews 3.) Note: Y = personal consumption expenditure (PCE) and X = gross domestic product (GDP), all measured in 1992 billions of dollars. In this...

## Estimation In The Presence Of Perfect Multicollinearity

It was stated previously that in the case of perfect multicollinearity the regression coefficients remain indeterminate and their standard errors are infinite. This fact can be demonstrated readily in terms of the three-variable regression model. Using the deviation form, where all the variables are expressed as deviations from their sample means, we can write the three-variable regression model as

$$y_i = \hat\beta_2 x_{2i} + \hat\beta_3 x_{3i} + \hat u_i \qquad (10.2.1)$$
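The indeterminacy can be seen numerically. In deviation form, the OLS slope formulas for the three-variable model share the denominator $(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i}x_{3i})^2$, which is zero when one regressor is an exact multiple of the other. The sketch below uses hypothetical data and a hypothetical helper `ols_denominator`; it is an illustration, not a reproduction of the book's example.

```python
def deviations(values):
    """Express a series as deviations from its sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def ols_denominator(x2, x3):
    """Denominator shared by the OLS slope formulas in deviation form."""
    d2, d3 = deviations(x2), deviations(x3)
    s22 = sum(a * a for a in d2)
    s33 = sum(b * b for b in d3)
    s23 = sum(a * b for a, b in zip(d2, d3))
    return s22 * s33 - s23 ** 2

# Hypothetical regressors
X2 = [10, 15, 18, 24, 30]
X3_exact = [2 * v for v in X2]        # perfect collinearity: X3 = 2 * X2
X3_near = [52, 75, 97, 129, 152]      # close to, but not exactly, 5 * X2

den_exact = ols_denominator(X2, X3_exact)   # zero: the slopes are 0/0
den_near = ols_denominator(X2, X3_near)     # positive: the slopes are defined
```

With `den_exact` equal to zero the slope estimates take the form 0/0, which is what indeterminate means here.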

## On The World Wide

Economic Statistics Briefing Room: An excellent source of data on output, income, employment, unemployment, earnings, production and business activity, prices and money, credits and security markets, and international statistics. Federal Reserve System Beige Book: Gives a summary of current economic conditions by Federal Reserve District. There are 12 Federal Reserve Districts. Government Information Sharing Project: Provides regional economic information; 1990 population and housing census; 1992...

## Basic Econometrics

The art of the econometrician consists in finding the set of assumptions that are both sufficiently specific and sufficiently realistic to allow him to take the best possible advantage of the data available to him. Econometricians . . . are a positive help in trying to dispel the poor public image of economics (quantitative or otherwise) as a subject in which empty boxes are opened by assuming the existence of can-openers to reveal contents which any ten economists will interpret in 11 ways....

## Caution About Overreacting To Heteroscedasticity

Reverting to the R&D example discussed in the previous section, we saw that when we used the square root transformation to correct for heteroscedasticity in the original model (11.7.3), the standard error of the slope coefficient decreased and its t value increased. Is this change so significant that one should worry about it in practice? To put the matter differently, when should we really worry about the heteroscedasticity problem? As one author contends, heteroscedasticity has never been...

## Figure 228

Monthly percent change in the NYSE Price Index, 1952-1995. (Footnote 24: This graph and the regression results presented below are based on the data collected by Gary Koop, Analysis of Economic Data, John Wiley & Sons, New York, 2000, data from the data disk. The monthly percentage change in the stock price index can be regarded as a rate of return on the index.) Now we obtain the residuals from...

## Example 177 Revisited

If we consider the model given in Eq. (17.4.11), as generated by the adaptive expectations mechanism (i.e., PPCE as a function of expected PPDI), then $\gamma$, the expectations coefficient, can be obtained from (17.5.5) as $\gamma = 1 - 0.4106 = 0.5894$. Then, following the preceding discussion about the AE model, we can say that about 59 percent of the discrepancy between actual and expected PPCE is eliminated within a year. Footnote 19: Like the Koyck model, it can be shown that, under AE, expectations of a variable are...

## C6 Hypothesis Testing About Individual Regression Coefficients In Matrix Notation

For reasons spelled out in the previous chapters, if our objective is inference as well as estimation, we shall have to assume that the disturbances $u_i$ follow some probability distribution. Also for reasons given previously, in regression analysis we usually assume that each $u_i$ follows the normal distribution with zero mean and constant variance $\sigma^2$. In matrix notation, we have $\mathbf{u} \sim N(\mathbf{0}, \sigma^2 \mathbf{I})$, where $\mathbf{u}$ and $\mathbf{0}$ are n x 1 column vectors and $\mathbf{I}$ is an n x n identity matrix, $\mathbf{0}$ being the null vector. Given the normality...
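A minimal sketch of the matrix computations behind a t test on an individual coefficient is given below. It assumes the standard results $\hat{\boldsymbol\beta} = (X'X)^{-1}X'y$ and $\widehat{\text{var-cov}}(\hat{\boldsymbol\beta}) = \hat\sigma^2 (X'X)^{-1}$ with $\hat\sigma^2 = \hat u'\hat u/(n-k)$. The simulated data, the seed, and the function names are assumptions made for illustration.

```python
import numpy as np

def ols_matrix(X, y):
    """OLS in matrix form: coefficients, standard errors, var-cov matrix."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - k)      # unbiased estimator of sigma^2
    var_cov = sigma2_hat * xtx_inv
    se = np.sqrt(np.diag(var_cov))
    return beta_hat, se, var_cov

def t_statistics(beta_hat, se, beta_null=0.0):
    """t ratio for H0: beta_j = beta_null, with n - k df."""
    return (beta_hat - beta_null) / se

# Simulated (hypothetical) data with an intercept and two regressors
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

beta_hat, se, var_cov = ols_matrix(X, y)
t_stats = t_statistics(beta_hat, se)
```

Each element of `t_stats` would be compared with the critical t value for n - k degrees of freedom at the chosen level of significance.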

## A2 Maximum Likelihood Estimation Of Food Expenditure In India

Return to Example 3.2 and regression (3.7.2), which gives the regression of food expenditure on total expenditure for 55 rural households in India. Since under the normality assumption the OLS and ML estimators of the regression coefficients are the same, we obtain the ML estimators as $\tilde\beta_1 = \hat\beta_1 = 94.2087$ and $\tilde\beta_2 = \hat\beta_2 = 0.4368$. The OLS estimator of $\sigma^2$ is $\hat\sigma^2 = 4469.6913$, but the ML estimator is $\tilde\sigma^2 = 4407.1563$, which is smaller than the OLS estimator. As noted, in small samples the ML estimator is downward...

## Just or Exact Identification

The reason we could not identify the preceding demand function or the supply function was that the same variables P and Q are present in both functions and there is no additional information, such as that indicated in Figure 19.1d or e. But suppose we consider the following demand-and-supply model:

Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + u_{1t}, \quad \alpha_1 < 0,\ \alpha_2 > 0$ (19.2.12)

Supply function: $Q_t = \beta_0 + \beta_1 P_t + u_{2t}, \quad \beta_1 > 0$ (19.2.13)

where I = income of the consumer, an exogenous variable, and all other...

## Ef2 XE

Lew Silver, Econometrics: An Introduction, Addison-Wesley, Reading, Mass., 1988, p. 265. Footnote 36: Recall that we have already encountered this assumption in our discussion of the Goldfeld-Quandt test. (Gujarati, Basic Econometrics, Fourth Edition, Part II: Relaxing the Assumptions of the Classical Model, Chapter 11: Heteroscedasticity: What Happens if the Error Variance Is Nonconstant?) FIGURE 11.10 Error variance proportional to $X^2$. Hence the variance of $v_i$...

## Seasonality In Refrigerator Sales

From the data on refrigerator sales given in Table 9.3, we obtain the following regression results:

$$\hat Y_t = 1222.125 D_{1t} + 1467.500 D_{2t} + 1569.750 D_{3t} + 1160.000 D_{4t} \qquad (9.7.2)$$

t = (20.3720) (24.4622) (26.1666) (19.3364)

Note: We have not given the standard errors of the estimated coefficients, as each standard error is equal to 59.9904, because all the dummies take only a value of 1 or zero. The estimated $\alpha$ coefficients in (9.7.2) represent the average, or mean, sales of refrigerators (in thousands of units)...
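The claim that the dummy coefficients are simply the quarterly means can be checked with a small sketch. The sales figures below are hypothetical (they are not the Table 9.3 data); the regression has one dummy per quarter and no intercept, as in (9.7.2).

```python
import numpy as np

# Hypothetical quarterly sales for two years (not the Table 9.3 data)
sales = np.array([10.0, 14.0, 15.0, 11.0, 12.0, 16.0, 17.0, 9.0])
quarter = np.array([1, 2, 3, 4, 1, 2, 3, 4])

# One dummy per quarter and no intercept, which avoids the dummy variable trap
D = np.column_stack([(quarter == q).astype(float) for q in (1, 2, 3, 4)])
coef, *_ = np.linalg.lstsq(D, sales, rcond=None)

# Mean sales in each quarter, computed directly
quarter_means = np.array([sales[quarter == q].mean() for q in (1, 2, 3, 4)])
```

Here `coef` and `quarter_means` coincide, which is why each estimated coefficient can be read as mean sales in that quarter.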

## Regression Versus Causation

Although regression analysis deals with the dependence of one variable on other variables, it does not necessarily imply causation. (Footnote 4: The word stochastic comes from the Greek word stokhos meaning a bull's eye. The outcome of throwing darts on a dart board is a stochastic process, that is, a process fraught with misses.) In the words of Kendall and Stuart, A statistical relationship, however strong and however suggestive, can never establish causal...

## Unit Root Stochastic Process

This model resembles the Markov first-order autoregressive model that we discussed in the chapter on autocorrelation. If $\rho = 1$, (21.4.1) becomes a RWM (without drift). If $\rho$ is in fact 1, we face what is known as the unit root problem, that is, a situation of nonstationarity; we already know that in this case the variance of $Y_t$ is not stationary. The name unit root is due to the fact that $\rho = 1$. Thus the terms nonstationarity, random walk, and unit root can be treated as synonymous. If, however, $\rho$...

## Jan Kmenta Elements Of Econometrics

Source: Economic Report of the President, 1993. Data on Q (Table B-94), on P (Table B-96), on X (Table B-5). To give some numerical results, we obtained the data shown in Table 20.1. First we estimate the reduced-form equations, regressing separately price and quantity on per capita real consumption expenditure. The results are as follows: $\hat P_t = 72.3091 + 0.0043 X_t$, se = (9.2002) (0.0009), t...

## Multiple Regression Analysis The Problem Of Estimation

The two-variable model studied extensively in the previous chapters is often inadequate in practice. In our consumption-income example, for instance, it was assumed implicitly that only income X affects consumption Y. But economic theory is seldom so simple for, besides income, a number of other variables are also likely to affect consumption expenditure. An obvious example is wealth of the consumer. As another example, the demand for a commodity is likely to depend not only on its own price...

## Historical Origin Of The Term Regression

The term regression was introduced by Francis Galton. In a famous paper, Galton found that, although there was a tendency for tall parents to have tall children and for short parents to have short children, the average height of children born of parents of a given height tended to move or regress toward the average height in the population as a whole. In other words, the height of the children of unusually tall or unusually short parents tends to move toward the average height of the...

## Integrated Stochastic Processes

The random walk model is but a specific case of a more general class of stochastic processes known as integrated processes. Recall that the RWM without drift is nonstationary, but its first difference, as shown in (21.3.8), is stationary. (Footnote 14: The following discussion is based on Wojciech W. Charemza et al., op. cit., pp. 89-91.) Therefore, we call the RWM without drift integrated of order 1, denoted as I(1). Similarly, if a time...
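To make the idea of an I(1) series concrete, the sketch below builds a random walk without drift from white-noise shocks and then takes its first difference. The differenced series is just the shock series again, which is stationary by construction. The seed and sample size are arbitrary assumptions.

```python
import random
from itertools import accumulate

random.seed(0)                       # arbitrary seed, for reproducibility
T = 500
u = [random.gauss(0, 1) for _ in range(T)]   # white-noise shocks

# Random walk without drift: Y_t = Y_{t-1} + u_t, starting from Y_0 = 0
Y = list(accumulate(u))

# First difference: Delta Y_t = Y_t - Y_{t-1}
dY = [Y[t] - Y[t - 1] for t in range(1, T)]

# Largest gap between the differenced series and the original shocks
max_gap = max(abs(d - e) for d, e in zip(dY, u[1:]))
```

Because `dY` reproduces `u` up to floating-point rounding, one round of differencing is enough to make this series stationary.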

## Approaches To Economic Forecasting

Broadly speaking, there are five approaches to economic forecasting based on time series data: (1) exponential smoothing methods, (2) single-equation regression models, (3) simultaneous-equation regression models, (4) autoregressive integrated moving average models (ARIMA), and (5) vector autoregression. These are essentially methods of fitting a suitable curve to historical data of a given time series. There are a variety of these methods, such as single exponential smoothing, Holt's linear...

## EuU2n k E yf n

In the formula $\bar R^2 = 1 - \dfrac{\sum \hat u_i^2/(n-k)}{\sum y_i^2/(n-1)}$ (7.8.1), k = the number of parameters in the model including the intercept term. (In the three-variable regression, k = 3. Why?) The $R^2$ thus defined is known as the adjusted $R^2$, denoted by $\bar R^2$. The term adjusted means adjusted for the df associated with the sums of squares entering into (7.8.1): $\sum \hat u_i^2$ has n - k df in a model involving k parameters, which include the intercept term, and $\sum y_i^2$ has n - 1 df. (Why?) For the three-variable case, we know that $\sum \hat u_i^2$ has...
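Since $R^2 = 1 - \sum \hat u_i^2 / \sum y_i^2$, the adjusted measure can be written as $\bar R^2 = 1 - (1 - R^2)\frac{n-1}{n-k}$. A minimal sketch of that calculation follows; the function name and the numbers passed to it are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k parameters (intercept included)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

# Hypothetical values: R^2 = 0.9 from a three-variable model fitted to 10 observations
r2_bar = adjusted_r2(0.9, n=10, k=3)
```

For k greater than 1 the adjusted value is below the unadjusted one, and the gap widens as more parameters are added for a given n.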

## Fa faztyE x2

The variable $Z = \dfrac{\hat\beta_2 - \beta_2}{\operatorname{se}(\hat\beta_2)} = \dfrac{(\hat\beta_2 - \beta_2)\sqrt{\sum x_i^2}}{\sigma}$, as noted in (4.3.6), is a standardized normal variable. It therefore seems that we can use the normal distribution to make probabilistic statements about $\beta_2$ provided the true population variance $\sigma^2$ is known. If $\sigma^2$ is known, an important property of a normally...

## Model Selection Criteria

According to Hendry and Richard, a model chosen for empirical analysis should satisfy the following criteria:

1. Be data admissible; that is, predictions made from the model must be logically possible.
2. Be consistent with theory; that is, it must make good economic sense. For example, if Milton Friedman's permanent income hypothesis holds, the intercept value in the regression of permanent consumption on permanent income is expected to be zero.
3. Have weakly exogenous regressors; that is, the...

## Causality Between Money And Interest Rate In Canada

Refer to the Canadian data given in Table 17.3. Suppose we want to find out if there is any causality between money supply and interest rate in Canada for the quarterly periods of 1979-1988. To show that the Granger causality test depends critically on the number of lagged terms introduced in the model, we present below the results of the F test using several (quarterly) lags. In each case, the null hypothesis is that interest rate does not (Granger) cause money supply and vice versa.

## Reciprocal Models

Models of the following type are known as reciprocal models: $Y_i = \beta_1 + \beta_2 (1/X_i) + u_i$ (6.7.1). Although this model is nonlinear in the variable X because it enters inversely or reciprocally, the model is linear in $\beta_1$ and $\beta_2$ and is therefore a linear regression model. This model has these features: As X increases indefinitely, the term $\beta_2(1/X)$ approaches zero (note $\beta_2$ is a constant) and Y approaches the limiting or asymptotic value $\beta_1$. Therefore, models like (6.7.1) have built in them an asymptote or limit value that the...
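Because the reciprocal model is linear in the parameters, it can be estimated by ordinary two-variable OLS after replacing X with 1/X. The sketch below does this on noiseless, hypothetical data generated with an intercept of 3 and a slope of 2, so the fit should return those values.

```python
def fit_reciprocal(x, y):
    """OLS fit of Y = b1 + b2 * (1/X); returns (b1, b2)."""
    z = [1.0 / xi for xi in x]            # transformed regressor
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    b2 = szy / szz
    b1 = y_bar - b2 * z_bar
    return b1, b2

# Hypothetical noiseless data with asymptote 3 and reciprocal slope 2
x = [1, 2, 4, 5, 10]
y = [3 + 2 / xi for xi in x]
b1, b2 = fit_reciprocal(x, y)

# Fitted value at a very large X is close to the asymptote b1
y_far = b1 + b2 / 1e9
```

The estimated intercept `b1` is the asymptote that Y approaches as X grows without bound.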

## Food Expenditure In India

Refer to the data given in Table 2.8 of exercise 2.15. The data relate to a sample of 55 rural households in India. The regressand in this example is expenditure on food and the regressor is total expenditure, a proxy for income, both figures in rupees. The data in this example are thus cross-sectional data. On the basis of the given data, we obtained the following regression: $\widehat{\text{FoodExp}}_i = 94.2087 + 0.4368\,\text{TotalExp}_i$, with $\operatorname{var}(\hat\beta_1) = 2560.9401$, $\operatorname{se}(\hat\beta_1) = 50.8563$, $\operatorname{var}(\hat\beta_2) = 0.0061$, $\operatorname{se}(\hat\beta_2) = 0.0783$, $r^2 = 0.3698$, $\hat\sigma^2$...

## Some Technical Aspects Of The Dummy Variable Technique

The Interpretation of Dummy Variables in Semilogarithmic Regressions. In Chapter 6 we discussed the log-lin models, where the regressand is logarithmic and the regressors are linear. In such a model, the slope coefficients of the regressors give the semielasticity, that is, the percentage change in the regressand for a unit change in the regressor. This is only so if the regressor is quantitative. What happens if a regressor is a dummy variable? To be specific, consider the following model: $\ln Y_i$...

## The Confidence-Interval Approach

FIGURE 5.2 A 100(1 - $\alpha$)% confidence interval for $\beta_2$. Values of $\beta_2$ lying in the interval from $\hat\beta_2 - t_{\alpha/2}\operatorname{se}(\hat\beta_2)$ to $\hat\beta_2 + t_{\alpha/2}\operatorname{se}(\hat\beta_2)$ are plausible under $H_0$ with 100(1 - $\alpha$)% confidence. Hence, do not reject $H_0$ if $\beta_2$ lies in this region. ... the hypothesis that the true MPC is 0.3, with 95...
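The decision rule of the confidence-interval approach can be sketched in a few lines. The slope estimate (0.5091) and its standard error (0.0357) below are illustrative assumptions, not values quoted in this excerpt; 2.306 is the two-tail 5 percent critical t value for 8 df, and 0.3 is the hypothesized MPC mentioned above.

```python
def confidence_interval(beta_hat, se, t_crit):
    """Interval beta_hat -/+ t_crit * se."""
    half_width = t_crit * se
    return beta_hat - half_width, beta_hat + half_width

def reject_null(beta_null, interval):
    """Reject H0 when the hypothesized value falls outside the interval."""
    lower, upper = interval
    return not (lower <= beta_null <= upper)

# Illustrative (assumed) estimate and standard error, with 8 df
ci = confidence_interval(beta_hat=0.5091, se=0.0357, t_crit=2.306)
decision = reject_null(0.3, ci)
```

With these assumed inputs the hypothesized value 0.3 falls outside the interval, so the null hypothesis would be rejected at the 5 percent level.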

## Yi i 2 X2i 3 X3i 4 X4i u

Aigner, Basic Econometrics, Prentice Hall, Englewood Cliffs, N.J., 1971, pp. 91-92.

$$H_0: \beta_3 = \beta_4 \ \text{ or } \ (\beta_3 - \beta_4) = 0 \qquad H_1: \beta_3 \neq \beta_4 \ \text{ or } \ (\beta_3 - \beta_4) \neq 0 \qquad (8.6.2)$$

that is, under the null hypothesis the two slope coefficients $\beta_3$ and $\beta_4$ are equal. Such a null hypothesis is of practical importance. For example, let (8.6.1) represent the demand function for a commodity where Y = amount of a commodity demanded, $X_2$ = price of the commodity, $X_3$...
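A hypothesis of equal coefficients is commonly tested with the t ratio $t = (\hat\beta_3 - \hat\beta_4)/\operatorname{se}(\hat\beta_3 - \hat\beta_4)$, where $\operatorname{se}(\hat\beta_3 - \hat\beta_4) = \sqrt{\operatorname{var}(\hat\beta_3) + \operatorname{var}(\hat\beta_4) - 2\operatorname{cov}(\hat\beta_3, \hat\beta_4)}$. The sketch below applies that formula; the function name and all input numbers are hypothetical.

```python
import math

def t_equal_coefficients(b3, b4, var3, var4, cov34):
    """t ratio for H0: beta3 = beta4, from estimates and their (co)variances."""
    se_diff = math.sqrt(var3 + var4 - 2 * cov34)
    return (b3 - b4) / se_diff

# Hypothetical estimates, variances, and covariance
t_value = t_equal_coefficients(b3=0.5, b4=0.3, var3=0.01, var4=0.01, cov34=0.0)
```

The resulting value would be compared with the critical t for n - k degrees of freedom, k being the number of estimated parameters.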

## Rbn

Note: * indicates that the elasticity is variable, depending on the value taken by X or Y or both. When no X and Y values are specified, in practice, very often these elasticities are measured at the mean values of these variables, namely, $\bar X$ and $\bar Y$. 3. The coefficients of the model chosen should satisfy certain a priori expectations. For example, if we are considering the demand for automobiles as a function of price and some other variables, we should expect a negative coefficient for the price...

## The Phenomenon Of Spurious Regression

To see why stationary time series are so important, consider the following two random walk models: $Y_t = Y_{t-1} + u_t$ and $X_t = X_{t-1} + v_t$, where we generated 500 observations of $u_t$ from $u_t \sim N(0, 1)$ and 500 observations of $v_t$ from $v_t \sim N(0, 1)$ and assumed that the initial values of both Y and X were zero. We also assumed that $u_t$ and $v_t$ are serially uncorrelated as well as mutually uncorrelated. As you know by now, both these time series are nonstationary; that is, they are I(1) or exhibit stochastic trends. Suppose we regress $Y_t$ on $X_t$....
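The experiment described above can be reproduced in outline as follows. This is a sketch under assumptions: the seed is arbitrary, so the numbers will not match those in the text, and the helper `simple_ols` is a hypothetical name. It regresses one simulated random walk on another, first in levels and then in first differences.

```python
import numpy as np

rng = np.random.default_rng(42)     # arbitrary seed; results vary with it
n = 500
u = rng.normal(size=n)
v = rng.normal(size=n)
Y = np.cumsum(u)                    # Y_t = Y_{t-1} + u_t, initial value zero
X = np.cumsum(v)                    # X_t = X_{t-1} + v_t, initial value zero

def simple_ols(x, y):
    """Two-variable OLS with intercept: returns (slope, t ratio, r squared)."""
    xd = x - x.mean()
    yd = y - y.mean()
    slope = (xd @ yd) / (xd @ xd)
    resid = yd - slope * xd
    rss = resid @ resid
    se = np.sqrt(rss / (len(x) - 2) / (xd @ xd))
    r2 = 1 - rss / (yd @ yd)
    return slope, slope / se, r2

levels = simple_ols(X, Y)                      # regression of Y_t on X_t
diffs = simple_ols(np.diff(X), np.diff(Y))     # regression in first differences
```

Since the two series are independent by construction, any strong apparent relationship in `levels` is spurious. The regression in first differences uses the stationary shock series, so its t ratio behaves as usual.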

## Summary And Conclusions

1. Qualitative response regression models refer to models in which the response, or regressand, variable is not quantitative or an interval scale.
2. The simplest possible qualitative response regression model is the binary model in which the regressand is of the yes/no or presence/absence type.
3. The simplest possible binary regression model is the linear probability model (LPM) in which the binary response variable is regressed on the relevant explanatory variables by using the standard OLS...

## Approaches To Estimating Nonlinear Regression Models

There are several approaches, or algorithms, to NLRMs: (1) direct search or trial and error, (2) direct optimization, and (3) iterative linearization. Direct Search or Trial-and-Error or Derivative-Free Method: In the previous section we showed how this method works. Although intuitively appealing because it does not require the use of calculus methods as the other methods do, this method is generally not used. First, if an NLRM... (Footnote 4: Note that we call $\sum u_i^2$ the error sum of squares and not the usual...)
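A direct search can be sketched as a grid evaluation of the error sum of squares. The exponential model $Y_i = \beta_1 e^{\beta_2 X_i} + u_i$, the grids, and the data below are assumptions chosen for illustration. The data are generated without noise from $\beta_1 = 2$ and $\beta_2 = -0.5$, and both values lie on the grid.

```python
import math

def error_sum_of_squares(b1, b2, x, y):
    """Sum of squared errors for the assumed model Y = b1 * exp(b2 * X)."""
    return sum((yi - b1 * math.exp(b2 * xi)) ** 2 for xi, yi in zip(x, y))

def grid_search(x, y, b1_grid, b2_grid):
    """Try every (b1, b2) pair and keep the one with the smallest sum of squares."""
    best = None
    for b1 in b1_grid:
        for b2 in b2_grid:
            value = error_sum_of_squares(b1, b2, x, y)
            if best is None or value < best[0]:
                best = (value, b1, b2)
    return best

# Hypothetical noiseless data
x = [1, 2, 3, 4, 5]
y = [2.0 * math.exp(-0.5 * xi) for xi in x]

b1_grid = [i / 10 for i in range(10, 31)]    # 1.0, 1.1, ..., 3.0
b2_grid = [-j / 10 for j in range(0, 11)]    # 0.0, -0.1, ..., -1.0
best_ess, b1_hat, b2_hat = grid_search(x, y, b1_grid, b2_grid)
```

The number of evaluations grows multiplicatively with each added parameter and with grid fineness, which helps explain why the method is rarely used in practice.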

## Yt Pq Pi[YXt 1 YXi ut Pq PiYXt Pi1 yX1 ut

$$Y_t = \gamma\beta_0 + \gamma\beta_1 X_t + (1 - \gamma)Y_{t-1} + u_t - (1 - \gamma)u_{t-1} = \gamma\beta_0 + \gamma\beta_1 X_t + (1 - \gamma)Y_{t-1} + v_t$$

... seems eminently sensible. The belief that people learn from experience is obviously a more sensible starting point than the implicit assumption that they are totally devoid of memory, characteristic of static expectations thesis. Moreover, the assertion that more distant experiences exert a lesser effect than more recent...

## Prediction With Multiple Regression

Footnote 17: For a discussion of the Chow test under heteroscedasticity, see William H. Greene, Econometric Analysis, 4th ed., Prentice Hall, Englewood Cliffs, N.J., 2000, pp. 292-293, and Adrian C. Darnell, A Dictionary of Econometrics, Edward Elgar, U.K., 1994, p. 51. *8.10 THE TROIKA OF HYPOTHESIS TESTS: THE LIKELIHOOD RATIO (LR), WALD (W), AND LAGRANGE MULTIPLIER (LM) TESTS. In this and the previous chapters we have, by and large, used the t, F, and...

## Estimation Of Distributed-Lag Models

$$Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + u_t$$

Ad Hoc Estimation of Distributed-Lag Models. ... be applied to (17.3.1). This is the approach taken by Alt and Tinbergen. They suggest that to estimate (17.3.1) one may proceed sequentially; that is, first regress $Y_t$ on $X_t$, then regress $Y_t$ on $X_t$...

## Consequences Of Using Ols In The Presence Of Heteroscedasticity

As we have seen, both $\hat\beta_2^*$ and $\hat\beta_2$ are (linear) unbiased estimators, where $\hat\beta_2^*$ denotes the GLS estimator and $\hat\beta_2$ the OLS estimator: in repeated sampling, on the average, $\hat\beta_2^*$ and $\hat\beta_2$ will equal the true $\beta_2$; that is, they are both unbiased estimators. But we know that it is $\hat\beta_2^*$ that is efficient, that is, has the smallest variance. What happens to our confidence interval, hypotheses testing, and other procedures if we continue to use the OLS estimator $\hat\beta_2$? We distinguish two cases. OLS Estimation Allowing for Heteroscedasticity: Suppose we use $\hat\beta_2$ and use the variance formula...

## Regression Model In R

$\ln Q_i = \ln A + \alpha \ln K_i + \beta \ln L_i + \ln u_{3i}$ (3), where $u_1$, $u_2$, and $u_3$ are stochastic disturbances. In the preceding model there are three equations in three endogenous variables Q, L, and K. P, R, and W are exogenous. a. What problems do you encounter in estimating the model if $\alpha + \beta = 1$, that is, when there are constant returns to scale? b. Even if $\alpha + \beta \neq 1$, can you estimate the equations? Answer by considering the identifiability of the system. c. If the system is not identified, what can be done to make it...

## Properties Of Ols Estimators Under The Normality Assumption

With the assumption that $u_i$ follow the normal distribution as in (4.2.5), the OLS estimators have the following properties (Appendix A provides a general discussion of the desirable statistical properties of estimators):

1. They are unbiased.
2. They have minimum variance. Combined with 1, this means that they are minimum-variance unbiased, or efficient estimators.
3. They have consistency; that is, as the sample size increases indefinitely, the estimators converge to their true population values.
4. $\hat\beta_1$ (being a linear...

## Remedial Measures

J., Comment, Journal of Business and Economic Statistics, vol. 5, 1967, pp. 449-451. The quote is reproduced from Peter Kennedy, A Guide to Econometrics, 4th ed., MIT Press, Cambridge, Mass., 1998, p. 190. ... (10.2.3), we can estimate $\alpha$ uniquely, even if we cannot estimate its two components given there individually. Sometimes this is the best we can do with a given set of data. One can try the following rules of thumb...

## F y Y2Yn faifa2 a ex a1

If $Y_1, Y_2, \ldots, Y_n$ are known or given, but $\beta_1$, $\beta_2$, and $\sigma^2$ are not known, the function in (3) is called a likelihood function, denoted by $\mathrm{LF}(\beta_1, \beta_2, \sigma^2)$, and written as

$$\mathrm{LF}(\beta_1, \beta_2, \sigma^2) = \frac{1}{\sigma^n (\sqrt{2\pi})^n} \exp\left\{ -\frac{1}{2} \sum \frac{(Y_i - \beta_1 - \beta_2 X_i)^2}{\sigma^2} \right\} \qquad (4)$$

The method of maximum likelihood, as the name indicates, consists in estimating the unknown parameters in such a manner that the probability of observing the given Y's is as high (or maximum) as possible. Therefore, we...
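The sketch below evaluates the log of this likelihood function and computes the ML estimates in closed form for the two-variable model. Under normality the ML estimates of the coefficients coincide with OLS, and the ML estimate of the variance is the residual sum of squares divided by n. The data and function names are hypothetical.

```python
import math

def log_likelihood(b1, b2, sigma2, x, y):
    """Log of the normal likelihood for Y_i = b1 + b2*X_i + u_i."""
    n = len(y)
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

def ml_estimates(x, y):
    """Closed-form ML estimates: OLS coefficients and RSS / n for the variance."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b2 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b1 = y_bar - b2 * x_bar
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return b1, b2, rss / n            # divisor n, not n - 2

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1_ml, b2_ml, sigma2_ml = ml_estimates(x, y)
ll_at_ml = log_likelihood(b1_ml, b2_ml, sigma2_ml, x, y)
```

Moving any of the three parameters away from these values lowers the log likelihood. Because the divisor is n rather than n - 2, the ML variance estimate is smaller than the OLS one in finite samples.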

## The Nature Of Heteroscedasticity

As noted in Chapter 3, one of the important assumptions of the classical linear regression model is that the variance of each disturbance term $u_i$, conditional on the chosen values of the explanatory variables, is some constant number equal to $\sigma^2$. This is the assumption of homoscedasticity, or equal (homo) spread (scedasticity), that is, equal variance. Symbolically, $E(u_i^2) = \sigma^2, \quad i = 1, 2, \ldots, n$ (11.1.1). Diagrammatically, in the two-variable regression model homoscedasticity can be shown as in Figure...

## Terminology And Notation

(Footnote 8: In advanced treatment of econometrics, one can relax the assumption that the explanatory variables are nonstochastic; see introduction to Part II.) ... one explanatory variable, as in the crop-yield, rainfall, temperature, sunshine, and fertilizer examples, it is known as multiple regression analysis. In other words, in two-variable regression there is only one explanatory variable, whereas in multiple regression there is more than one explanatory...

## An Autoregressive Integrated Moving Average Arima Process

The time series models we have already discussed are based on the assumption that the time series involved are (weakly) stationary in the sense defined in Chapter 21. Briefly, the mean and variance for a weakly stationary time series are constant and its covariance is time-invariant. But we know that many economic time series are nonstationary, that is, they are integrated; for example, the economic time series in Table 21.1 are integrated. But we also saw in Chapter 21 that if a time series is...

## Regression Models Hayden Economics

PDFs with values of K less than 3 are called platykurtic (fat or short-tailed), and those with values greater than 3 are called leptokurtic (slim or long-tailed). See Figure A.3. A PDF with a kurtosis value of 3 is known as mesokurtic, of which the normal distribution is the prime example. (See the discussion of the normal distribution in Section A.6.) We will show shortly how the measures of skewness and kurtosis can be combined to determine whether a random variable follows a normal FIGURE...
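The moment-based measures of skewness (S) and kurtosis (K) are easy to compute, and one common way to combine them is the Jarque-Bera statistic $JB = n\left[S^2/6 + (K - 3)^2/24\right]$. The sketch below is offered under that assumption; the sample is hypothetical and far too small for a real test.

```python
def skewness_kurtosis(data):
    """Moment-based skewness S and kurtosis K (K = 3 for a normal PDF)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((d - mean) ** 2 for d in data) / n
    m3 = sum((d - mean) ** 3 for d in data) / n
    m4 = sum((d - mean) ** 4 for d in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

def jarque_bera(data):
    """JB statistic combining skewness and excess kurtosis."""
    n = len(data)
    s, k = skewness_kurtosis(data)
    return n * (s ** 2 / 6 + (k - 3) ** 2 / 24)

# Hypothetical symmetric sample
sample = [-2, -1, 0, 1, 2]
S, K = skewness_kurtosis(sample)
JB = jarque_bera(sample)
```

Under normality S is 0 and K is 3, so JB is close to zero. In large samples it is compared with the chi-square distribution with 2 df.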

## Tests Of Specification Errors

Knowing the consequences of specification errors is one thing, but finding out whether one has committed such errors is quite another, for we do not deliberately set out to commit such errors. (Footnote 13: Michael D. Intriligator, Econometric Models, Techniques and Applications, Prentice Hall, Englewood Cliffs, N.J., 1978, p. 189. Recall the Occam's razor principle.) Very often specification biases arise inadvertently, perhaps from our inability to formulate the model as precisely as possible because the...

## Stochastic Specification Of

It is clear from Figure 2.1 that, as family income increases, family consumption expenditure on the average increases, too. But what about the consumption expenditure of an individual family in relation to its (fixed) level of income? It is obvious from Table 2.1 and Figure 2.1 that an individual family's consumption expenditure does not necessarily increase as the income level increases. For example, from Table 2.1 we observe that corresponding to the income level of 100 there is one family...

## Relationship Between Compensation And Productivity

To illustrate the Park approach, we use the data given in Table 11.1 to run the following regression, where Y = average compensation in thousands of dollars, X = average productivity in thousands of dollars, and i = ith employment size of the establishment. The results of the regression were as follows: se = (936.4791) (0.0998) (11.5.3). The results reveal that the estimated slope coefficient is significant at the 5 percent level on the basis of a one-tail t test. The equation shows that as labor...
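The second stage of the Park approach regresses the log of the squared OLS residuals on the log of the explanatory variable, $\ln \hat u_i^2 = \alpha + \beta \ln X_i + v_i$, and examines whether $\beta$ is statistically significant. The sketch below computes only the coefficients of that auxiliary regression. The residuals are constructed by hand (hypothetically) so that their square is exactly $X^3$, which makes the expected slope equal to 3.

```python
import math

def park_regression(residuals, x):
    """OLS of ln(residual^2) on ln(X); returns (intercept, slope)."""
    ly = [math.log(e ** 2) for e in residuals]
    lx = [math.log(xi) for xi in x]
    n = len(lx)
    lx_bar = sum(lx) / n
    ly_bar = sum(ly) / n
    slope = (sum((a - lx_bar) * (b - ly_bar) for a, b in zip(lx, ly))
             / sum((a - lx_bar) ** 2 for a in lx))
    intercept = ly_bar - slope * lx_bar
    return intercept, slope

# Hypothetical residuals whose squares are exactly X**3
x = [1.0, 2.0, 4.0, 8.0]
residuals = [xi ** 1.5 for xi in x]
alpha_hat, beta_hat = park_regression(residuals, x)
```

In an application the residuals would come from the first-stage regression of Y on X. A statistically insignificant slope would be read as no evidence of heteroscedasticity of this form.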

## The Meaning Of Partial Regression Coefficients

As mentioned earlier, the regression coefficients $\beta_2$ and $\beta_3$ are known as partial regression or partial slope coefficients. The meaning of partial regression coefficient is as follows: $\beta_2$ measures the change in the mean value of Y, E(Y), per unit change in $X_2$, holding the value of $X_3$ constant. Put differently, it gives the direct or the net effect of a unit change in $X_2$ on the mean value of Y, net of any effect that $X_3$ may have on mean Y. Likewise, $\beta_3$ measures the change in the mean value of Y...
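The phrase "net of any effect that X3 may have" can be made concrete by partialling out. Regress Y on X3 and X2 on X3, keep the two sets of residuals, and regress the first on the second. The resulting slope equals the coefficient on X2 in the full multiple regression. The sketch uses simulated data, and the seed, coefficients, and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed
n = 100
X2 = rng.normal(size=n)
X3 = 0.6 * X2 + rng.normal(size=n)        # correlated with X2
Y = 1.0 + 2.0 * X2 - 1.5 * X3 + rng.normal(size=n)

def residuals(y, x):
    """Residuals from an OLS regression of y on a constant and x."""
    Z = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef

# Full multiple regression of Y on a constant, X2, and X3
full = np.column_stack([np.ones(n), X2, X3])
beta_full, *_ = np.linalg.lstsq(full, Y, rcond=None)

# Partialling out: remove the linear influence of X3 from both Y and X2
y_res = residuals(Y, X3)
x2_res = residuals(X2, X3)
beta2_partial = (x2_res @ y_res) / (x2_res @ x2_res)
```

`beta2_partial` agrees with `beta_full[1]`, the estimated partial slope of X2.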

## Interaction Effects Using Dummy Variables

Dummy variables are a flexible tool that can handle a variety of interesting problems. To see this, consider the following model: $Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \beta X_i + u_i$ (9.6.1), where X = education (years of schooling), $D_2$ = 1 if female, 0 otherwise, and $D_3$ = 1 if nonwhite and non-Hispanic, 0 otherwise. In this model gender and race are qualitative regressors and education is a quantitative regressor. Implicit in this model is the assumption that the differential effect of the gender dummy $D_2$ is constant across the...

## Hourly Wages In Relation To Marital Status And Region Of Residence

From a sample of 528 persons in May 1985, the following regression results were obtained:

$\hat Y_i = 8.8148 + 1.0997 D_{2i} - 1.6729 D_{3i}$ (9.3.1)
se = (0.4015) (0.4642) (0.4854)
t = (21.9528) (2.3688) (-3.4462)
(0.0000)* (0.0182)* (0.0006)*
$R^2 = 0.0322$

where $D_2$ = married status (1 = married, 0 = otherwise) and $D_3$ = region of residence (1 = South, 0 = otherwise). In this example we have two qualitative regressors, each with two categories. Hence we have assigned a single dummy variable for each category. Which is the benchmark category here?...

## Approaches To Estimation

If we consider the general M equations model in M endogenous variables given in (19.1.1), we may adopt two approaches to estimate the structural equations, namely, single-equation methods, also known as limited information methods, and system methods, also known as full information methods. In the single-equation methods to be considered shortly, we estimate each equation in the system (of simultaneous equations) individually, taking into account any restrictions placed on that equation (such...

## E4E4 x2ix3i

$$\hat\beta_2 = \frac{(\sum y_i x_{2i})(\sum x_{3i}^2) - (\sum y_i x_{3i})(\sum x_{2i} x_{3i})}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2} \qquad (7.4.7)$$

Footnote 7: Douglas Montgomery and Elizabeth Peck, Introduction to Linear Regression Analysis, John Wiley & Sons, New York, 1982, pp. 289-290. See also R. L. Mason, R. F. Gunst, and J. T. Webster, Regression Analysis and Problems of Multicollinearity, Communications in Statistics A, vol. 4, no. 3, 1975, pp. 277-292; R. F. Gunst and R. L. Mason, Advantages of Examining Multicollinearities in Regression Analysis, Biometrics, vol. 33, 1977, pp. 249-260....

## Piecewise Linear Regression

To illustrate yet another use of dummy variables, consider Figure 9.5, which shows how a hypothetical company remunerates its sales representatives. It pays commissions based on sales in such a manner that up to a certain level, the target, or threshold, level X*, there is one (stochastic) commission structure and beyond that level another. (Note: Besides sales, other factors affect sales commission. Assume that these other factors are represented... FIGURE 9.5 Hypothetical relationship between...
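One common way to set up such a two-segment relationship is $Y_i = \alpha_1 + \beta_1 X_i + \beta_2 (X_i - X^*) D_i + u_i$, with $D_i = 1$ if $X_i > X^*$ and 0 otherwise, so that the slope is $\beta_1$ below the threshold and $\beta_1 + \beta_2$ above it. The sketch below fits this form to noiseless hypothetical data. The threshold and the coefficient values are assumptions, and the threshold is taken as known.

```python
import numpy as np

x_star = 5.0                                  # assumed known threshold
X = np.arange(1.0, 11.0)                      # hypothetical sales levels 1..10
D = (X > x_star).astype(float)                # 1 beyond the threshold

# Noiseless hypothetical commissions: slope 0.5 up to X*, then 0.5 + 1.0
Y = 2.0 + 0.5 * X + 1.0 * (X - x_star) * D

# Regressors: constant, X, and the kink term (X - X*) * D
Z = np.column_stack([np.ones_like(X), X, (X - x_star) * D])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
alpha1, beta1, beta2 = coef

slope_below = beta1
slope_above = beta1 + beta2
```

Because the kink term is zero at the threshold, the two fitted segments join there. A significant `beta2` would indicate a change of slope at X*.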

## Figure

Average salary (in dollars) of public school teachers in three regions. Differences in educational levels, in cost of living indexes, in gender and race may all have some effect on the observed differences. Therefore, unless we take into account all the other variables that may affect a teacher's salary, we will not be able to pin down the cause(s) of the differences. From the preceding discussion, it is clear that all one has to do is see if the coefficients attached to the various dummy...

## Detection Of Multicollinearity

Having studied the nature and consequences of multicollinearity, the natural question is: How does one know that collinearity is present in any given situation, especially in models involving more than two explanatory variables? Here it is useful to bear in mind Kmenta's warning:

1. Multicollinearity is a question of degree and not of kind. The meaningful distinction is not between the presence and the absence of multicollinearity, but between its various degrees.
2. Since multicollinearity refers...

## Yi fo X2i fi3 X3i

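$$\hat{Y}_i = (\bar{Y} - \hat{\beta}_2 \bar{X}_2 - \hat{\beta}_3 \bar{X}_3) + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} \quad \text{(Why?)}$$

$$= \bar{Y} + \hat{\beta}_2 (X_{2i} - \bar{X}_2) + \hat{\beta}_3 (X_{3i} - \bar{X}_3) = \bar{Y} + \hat{\beta}_2 x_{2i} + \hat{\beta}_3 x_{3i} \tag{7.4.22}$$

where as usual small letters indicate values of the variables as deviations from their respective means. Summing both sides of (7.4.22) over the sample values and dividing through by the sample size n gives $\bar{\hat{Y}} = \bar{Y}$. (Note: $\sum x_{2i} = \sum x_{3i} = 0$. Why?) Notice that by virtue of (7.4.22) we can write

$$\hat{y}_i = \hat{\beta}_2 x_{2i} + \hat{\beta}_3 x_{3i}$$

where $\hat{y}_i = \hat{Y}_i - \bar{Y}$. Therefore, the SRF (7.4.1) can be expressed in the deviation form as

$$y_i = \hat{y}_i + \hat{u}_i = \hat{\beta}_2 x_{2i} + \hat{\beta}_3 x_{3i} + \hat{u}_i \tag{7.4.24}$$

...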

## R23

It is not difficult to see that (10.7.2) is satisfied by $r_{42} = 0.5$, $r_{43} = 0.5$, and $r_{23} = -0.5$, which are not very high values. Therefore, in models involving more than two explanatory variables, the simple or zero-order correlation will not provide an infallible guide to the presence of multicollinearity. Of course, if there are only two explanatory variables, the zero-order correlations will suffice.

3. Examination of partial correlations. Because of the problem just mentioned in relying on...
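The claim about these three correlations can be checked numerically. Equation (10.7.2) is not reproduced in this excerpt, so the sketch below assumes it is the standard expression for the R² of X4 on X2 and X3 written in terms of zero-order correlations; the function name is hypothetical.

```python
def r_squared_from_zero_order(r42, r43, r23):
    """R^2 from regressing X4 on X2 and X3, using only zero-order correlations.

    Assumed formula: (r42^2 + r43^2 - 2*r42*r43*r23) / (1 - r23^2).
    """
    return (r42**2 + r43**2 - 2 * r42 * r43 * r23) / (1 - r23**2)


# Moderate pairwise correlations, as quoted in the text.
r2 = r_squared_from_zero_order(r42=0.5, r43=0.5, r23=-0.5)
print(r2)
```

With these inputs the numerator and denominator are both 0.75, so X4 is an exact linear combination of X2 and X3 even though no pairwise correlation exceeds 0.5 in absolute value.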

## Child Mortality Revisited

Let us return to the child mortality example we have considered on several occasions. From data for 64 countries, we obtained the regression results shown in Eq. (8.2.1). Since the data are cross sectional, involving diverse countries with different child mortality experiences, it is likely that we might encounter heteroscedasticity. To find this out, let us first consider the residuals obtained from Eq. (8.2.1). These residuals are plotted in Figure 11.12. From this figure it seems that the...

## Notations And Definitions

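To facilitate our discussion, we introduce the following notations and definitions. The general M equations model in M endogenous, or jointly dependent, variables may be written as Eq. (19.1.1):

$$\begin{aligned}
Y_{1t} &= \beta_{12}Y_{2t} + \beta_{13}Y_{3t} + \cdots + \beta_{1M}Y_{Mt} + \gamma_{11}X_{1t} + \gamma_{12}X_{2t} + \cdots + \gamma_{1K}X_{Kt} + u_{1t} \\
Y_{2t} &= \beta_{21}Y_{1t} + \beta_{23}Y_{3t} + \cdots + \beta_{2M}Y_{Mt} + \gamma_{21}X_{1t} + \gamma_{22}X_{2t} + \cdots + \gamma_{2K}X_{Kt} + u_{2t} \\
Y_{3t} &= \beta_{31}Y_{1t} + \beta_{32}Y_{2t} + \cdots + \beta_{3M}Y_{Mt} + \gamma_{31}X_{1t} + \gamma_{32}X_{2t} + \cdots + \gamma_{3K}X_{Kt} + u_{3t} \\
&\;\;\vdots \\
Y_{Mt} &= \beta_{M1}Y_{1t} + \beta_{M2}Y_{2t} + \cdots + \beta_{M,M-1}Y_{M-1,t} + \cdots
\end{aligned}$$

...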

## Panel Data Regression Models

In Chapter 1 we discussed briefly the types of data that are generally available for empirical analysis, namely, time series, cross section, and panel. In time series data we observe the values of one or more variables over a period of time (e.g., GDP for several quarters or years). In cross-section data, values of one or more variables are collected for several sample units, or entities, at the same point in time (e.g., crime rates for 50 states in the United States for a given year). In panel...

## Types Of Specification Errors

Assume that on the basis of the criteria just listed we arrive at a model that we accept as a good model. To be concrete, let this model be

$$Y_i = \beta_1 + \beta_2 X_i + \beta_3 X_i^2 + \beta_4 X_i^3 + u_{1i} \tag{13.2.1}$$

where Y = total cost of production and X = output. Equation (13.2.1) is the familiar textbook example of the cubic total cost function. But suppose for some reason (say, laziness in plotting the scattergram) a researcher decides to use the following model:

$$Y_i = \alpha_1 + \alpha_2 X_i + \alpha_3 X_i^2 + u_{2i} \tag{13.2.2}$$

Note that we have changed...

## Incorrect Specification Of The Stochastic Error Term

A common problem facing a researcher is the specification of the error term ui that enters the regression model. Since the error term is not directly observable, there is no easy way to determine the form in which it enters the model. To see this, let us return to the models given in (13.2.8) and (13.2.9). For simplicity of exposition, we have assumed that there is no intercept in the model. We further assume that ui in (13.2.8) is such that ln ui satisfies the usual OLS assumptions. If we...

## Per Capita Personal Consumption

This example examines per capita personal consumption expenditure (PPCE) in relation to per capita disposable income (PPDI) in the United States for the period 1970-1999, all data in chained 1996 dollars. As an illustration of the Koyck model, consider the data given in Table 17.2. Regression of PPCE on PPDI and lagged PPCE gave the following results:

$$\widehat{PPCE}_t = 1242.169 + 0.6033\,PPDI_t + 0.4106\,PPCE_{t-1}$$

(402.5784) (0.1502) (0.1546)

$R^2 = 0.9926$; d = 1.0056; Durbin h = 5.119

Note: The calculation of Durbin h is...
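As a rough worked example, the two slope estimates can be turned into the usual Koyck summary measures. The sketch assumes the coefficient on lagged PPCE is read as the Koyck decay rate (written `lam` here) and the coefficient on PPDI as the short-run effect; the variable names are illustrative.

```python
import math

b_ppdi = 0.6033   # coefficient on PPDI: short-run (impact) effect
b_lag = 0.4106    # coefficient on lagged PPCE: assumed to estimate the decay rate

lam = b_lag
long_run_effect = b_ppdi / (1 - lam)          # sum of all geometrically declining lag weights
mean_lag = lam / (1 - lam)                    # weighted average lag, in years
median_lag = -math.log(2) / math.log(lam)     # time for half of the total effect to occur

print(round(long_run_effect, 4), round(mean_lag, 4), round(median_lag, 4))
```

On this reading the long-run effect of income on consumption is a little above 1, and both lag measures are under one year, which suggests a fairly quick adjustment.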

## The Koyck Approach To Distributedlag Models

Koyck has proposed an ingenious method of estimating distributed-lag models. Suppose we start with the infinite lag distributed-lag model (17.3.1). Assuming that the β's are all of the same sign, Koyck assumes that they decline geometrically as follows:

$$\beta_k = \beta_0 \lambda^k \qquad k = 0, 1, \ldots \tag{17.4.1}$$

where λ, such that 0 < λ < 1, is known as the rate of decline, or decay, of the distributed lag and where 1 − λ is known as the speed of adjustment. What (17.4.1) postulates is that each successive β coefficient...
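The geometric scheme is easy to see numerically. The sketch below generates the first few lag weights for a decay rate between 0 and 1 (written `lam` here); the function name and the values of `beta0` and `lam` are hypothetical.

```python
def koyck_weights(beta0, lam, n_lags):
    """Lag coefficients beta_k = beta0 * lam**k for k = 0..n_lags."""
    if not 0 < lam < 1:
        raise ValueError("the decay rate must lie strictly between 0 and 1")
    return [beta0 * lam**k for k in range(n_lags + 1)]


weights = koyck_weights(beta0=1.0, lam=0.5, n_lags=4)
long_run = 1.0 / (1 - 0.5)   # limit of the sum of all weights: beta0 / (1 - lam)

print(weights)
print(sum(weights), long_run)
```

Each weight is a fixed fraction of the one before it, so the partial sums approach `beta0 / (1 - lam)` from below; a decay rate closer to 1 spreads the effect over more periods.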

## Log V l log 2 log W u

where (V/L) = value added per unit of labor, L = labor input, and W = real wage rate. The coefficient $\beta_2$ measures the elasticity of substitution between labor and capital (i.e., proportionate change in factor proportions / proportionate change in relative factor prices). From the data given in Table 6.8, verify that the estimated elasticity is 1.3338 and that it is not statistically significantly different from 1. Table 6.9 gives data on the GDP (gross domestic product) deflator for domestic goods and the GDP...

## The Neweywest Method Of Correcting The Ols Standard Errors

Instead of using the FGLS methods discussed in the previous section, we can still use OLS but correct the standard errors for autocorrelation by a procedure developed by Newey and West. This is an extension of White's heteroscedasticity-consistent standard errors that we discussed in the previous chapter. The corrected standard errors are known as HAC (heteroscedasticity- and autocorrelation-consistent) standard errors or simply as Newey-West standard errors. We will not present the...

## Yi Pi P2 i Xi Ui

Which of these model(s) would you choose for the Engel expenditure curve, and why? (Hint: Interpret the various slope coefficients, find out the expressions for elasticity of expenditure with respect to income, etc.)

As it stands, is this a linear regression model? If not, what trick, if any, can you use to make it a linear regression model? How would you interpret the resulting model? Under what circumstances might such a model be appropriate?

6.12....

## Look At Selected Us Economic Time Series

To set the stage, and to give the reader a feel for the somewhat esoteric concepts of time series analysis to be developed in this chapter, it might be useful to consider several U.S. economic time series of general interest. The time series we consider are (1) GDP (gross domestic product), (2) PDI (personal disposable income), (3) PCE (personal consumption expenditure), (4) profits (corporate profits after tax), and (5) dividends (net corporate dividend) all data are in billions of 1987...

## Forecasting

Remember that the GDP data are for the period 1970-I to 1991-IV. Suppose, on the basis of model (22.5.2), we want to forecast GDP for the first four quarters of 1992. But in (22.5.2) the dependent variable is change in the GDP over the previous quarter. Therefore, if we use (22.5.2), what we can obtain are the forecasts of GDP changes between the first quarter of 1992 and the fourth quarter of 1991, second quarter of 1992 over the first quarter of 1992, etc. To obtain the forecast of GDP level...
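Turning forecasts of quarterly changes back into forecasts of levels amounts to a running sum started from the last observed level. The sketch below shows only that bookkeeping step; the starting level and the forecast changes are made-up numbers, not output from model (22.5.2).

```python
from itertools import accumulate

last_observed_level = 4868.0                 # hypothetical GDP level for 1991-IV
forecast_changes = [25.0, 30.0, 28.0, 27.0]  # hypothetical forecast changes, 1992-I to 1992-IV

# Level in each quarter = last observed level + all forecast changes up to that quarter.
level_forecasts = [last_observed_level + total for total in accumulate(forecast_changes)]

for quarter, level in zip(["1992-I", "1992-II", "1992-III", "1992-IV"], level_forecasts):
    print(quarter, level)
```

Because each level forecast builds on the previous one, an error in an early quarter's forecast change carries forward into every later level.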

## The Almon Approach To Distributedlag Models The Almon Or Polynomial Distributed Lag Pdl48

Although used extensively in practice, the Koyck distributed-lag model is based on the assumption that the β coefficients decline geometrically as the lag lengthens (see Figure 17.5). This assumption may be too restrictive in some situations. Consider, for example, Figure 17.7. In Figure 17.7a it is assumed that the β's increase at first and then decrease, whereas in Figure 17.7c it is assumed that they follow a cyclical pattern. Obviously, the Koyck scheme of distributed-lag models will not...

## Econometric Models

An extensive use of simultaneous-equation models has been made in the econometric models built by several econometricians. An early pioneer in this field was Professor Lawrence Klein of the Wharton School of the University of Pennsylvania. His initial model, known as Klein's model I, is as follows:

Consumption function:

$$C_t = \beta_0 + \beta_1 P_t + \beta_2 (W + W')_t + \beta_3 P_{t-1} + u_{1t}$$

$$I_t = \beta_4 + \beta_5 P_t + \beta_6 P_{t-1} + \beta_7 K_{t-1} + u_{2t}$$

$$W_t = \beta_8 + \beta_9 (Y + T - W')_t + \beta_{10} (Y + T - W')_{t-1} + \beta_{11} t + u_{3t}$$

$$Y_t + T_t = C_t + I_t + G_t$$

$Y_t$...

## Keynesian Model Of Income Determination

Consider the simple Keynesian model of income determination:

Consumption function: $C_t = \beta_0 + \beta_1 Y_t + u_t$, $0 < \beta_1 < 1$ (18.2.3)

Income identity: $Y_t = C_t + I_t \; (= S_t)$ (18.2.4)

where C = consumption expenditure, Y = income, I = investment (assumed exogenous), S = savings, t = time, u = stochastic disturbance term, and $\beta_0$ and $\beta_1$ = parameters.

The parameter $\beta_1$ is known as the marginal propensity to consume (MPC) (the amount of extra consumption expenditure resulting from an extra dollar of income). From economic theory, $\beta_1$ is expected to lie...
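Substituting the consumption function into the income identity and solving for income gives Y = (β0 + I + u) / (1 − β1), so income and the disturbance move together. The sketch below works through that algebra with hypothetical parameter values; the function name is illustrative.

```python
def equilibrium_income(beta0, beta1, investment, u=0.0):
    """Solve C = beta0 + beta1*Y + u and Y = C + I jointly for (Y, C)."""
    if not 0 < beta1 < 1:
        raise ValueError("the MPC must lie strictly between 0 and 1")
    income = (beta0 + investment + u) / (1 - beta1)
    consumption = beta0 + beta1 * income + u
    return income, consumption


# Hypothetical values: autonomous consumption 50, MPC 0.8.
y0, c0 = equilibrium_income(beta0=50.0, beta1=0.8, investment=100.0)
y1, c1 = equilibrium_income(beta0=50.0, beta1=0.8, investment=110.0)

multiplier = (y1 - y0) / 10.0   # change in income per unit change in investment
print(y0, c0, multiplier)
```

With an MPC of 0.8 the multiplier is 1 / (1 − 0.8) = 5. The same solution shows that a shock `u` changes income as well as consumption, which is why income cannot be treated as independent of the disturbance in the consumption function.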

## Estimation Of Autoregressive Models

From our discussion thus far we have the following three models:

Koyck: $Y_t = \alpha(1 - \lambda) + \beta_0 X_t + \lambda Y_{t-1} + (u_t - \lambda u_{t-1})$ (17.4.7)

Adaptive expectation: $Y_t = \gamma\beta_0 + \gamma\beta_1 X_t + (1 - \gamma)Y_{t-1} + [u_t - (1 - \gamma)u_{t-1}]$ (17.5.5)

Partial adjustment: $Y_t = \delta\beta_0 + \delta\beta_1 X_t + (1 - \delta)Y_{t-1} + \delta u_t$ (17.6.5)

All these models have the following common form:

$$Y_t = \alpha_0 + \alpha_1 X_t + \alpha_2 Y_{t-1} + v_t \tag{17.8.1}$$

that is, they are all autoregressive in nature. Therefore, we must now look at the estimation problem of such models, because the classical least-squares may not be...

## Detection Of Heteroscedasticity

economic investigations. In this respect the econometrician differs from scientists in fields such as agriculture and biology, where researchers have a good deal of control over their subjects. More often than not, in economic studies there is only one sample Y value corresponding to a particular value of X. And there is no way one can know $\sigma_i^2$ from just one Y observation. Therefore, in most cases involving econometric investigations, heteroscedasticity may...

## Model Misspecification Versus Pure Autocorrelation

then we need to include the time or trend, t, variable in the model to see the relationship between wages and productivity net of the trends in the two variables. To test this, we included the trend variable in (12.5.1) and obtained the following results. The interpretation of this model is straightforward: Over time, the index of real wages has been decreasing by about 0.90 units per year. After allowing for this, if the productivity...

## The Unit Root Test

A test of stationarity (or nonstationarity) that has become widely popular over the past several years is the unit root test. We will first explain it, then illustrate it, and then consider some limitations of this test. The starting point is the unit root (stochastic) process that we discussed in Section 21.4. We start with

$$Y_t = \rho Y_{t-1} + u_t \qquad -1 \le \rho \le 1 \tag{21.4.1}$$

where $u_t$ is a white noise error term. We know that if ρ = 1, that is, in the case of the unit root, (21.4.1) becomes a random walk model...
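Subtracting the lagged value from both sides of the AR(1) equation gives ΔY_t = (ρ − 1)·Y_{t−1} + u_t, so a unit root corresponds to a zero coefficient on the lagged level. The simulation below is a minimal sketch of that regression on artificial data; the function names, seeds, and sample size are arbitrary choices, and the ratio it reports should not be compared with ordinary t tables, since under a unit root it does not follow the usual t distribution.

```python
import random


def simulate_ar1(rho, n, seed):
    """Generate Y_t = rho * Y_{t-1} + u_t with standard normal white noise, Y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y


def level_on_difference_regression(y):
    """OLS of the first difference on the lagged level, no intercept.

    Returns (delta_hat, ratio), where delta_hat estimates rho - 1 and
    ratio is delta_hat divided by its conventional OLS standard error.
    """
    lagged = y[:-1]
    diffs = [curr - prev for prev, curr in zip(y[:-1], y[1:])]
    sxx = sum(v * v for v in lagged)
    delta_hat = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    residuals = [d - delta_hat * v for v, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in residuals) / (len(diffs) - 1)
    return delta_hat, delta_hat / (sigma2 / sxx) ** 0.5


delta_rw, ratio_rw = level_on_difference_regression(simulate_ar1(rho=1.0, n=500, seed=1))
delta_st, ratio_st = level_on_difference_regression(simulate_ar1(rho=0.5, n=500, seed=2))
print(delta_rw, ratio_rw)
print(delta_st, ratio_st)
```

For the random walk the estimated coefficient sits close to zero, while for the stationary series it is close to ρ − 1 = −0.5.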

## E yxi x2 e 4 yxi x e 4 e 4 x2 e 4 x2 e 4

which is an indeterminate expression. The reader can verify that $\hat{\beta}_3$ is also indeterminate. Why do we obtain the result shown in (10.2.2)? Recall the meaning of $\hat{\beta}_2$: It gives the rate of change in the average value of Y as X2 changes by a unit, holding X3 constant. But if X3 and X2 are perfectly collinear, there is no way X3 can be kept constant: As X2 changes, so does X3 by the factor λ. What it means, then, is that there is no way of disentangling the separate influences of X2 and X3 from the...
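The 0/0 result can be reproduced with a tiny data set. The sketch below assumes the standard deviation-form OLS formula for the slope on X2 in a three-variable model and sets X3 equal to a constant multiple of X2; the data and the factor `lam` are made up.

```python
def deviations(values):
    """Express values as deviations from their sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of cross products of two equally long sequences."""
    return sum(p * q for p, q in zip(a, b))


lam = 2.0
x2_raw = [1.0, 2.0, 3.0, 4.0, 5.0]
x3_raw = [lam * v for v in x2_raw]          # perfect collinearity: X3 = lam * X2
y_raw = [3.0, 5.0, 4.0, 8.0, 10.0]

y, x2, x3 = deviations(y_raw), deviations(x2_raw), deviations(x3_raw)

# Numerator and denominator of the assumed slope formula for X2.
numerator = cross(y, x2) * cross(x3, x3) - cross(y, x3) * cross(x2, x3)
denominator = cross(x2, x2) * cross(x3, x3) - cross(x2, x3) ** 2

print(numerator, denominator)   # both are exactly zero for this data
```

Because both pieces vanish, the ratio is undefined rather than merely large, which is the algebraic counterpart of being unable to hold X3 constant while X2 changes.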

## Classical Normal Linear Regression Model Cnlrm

What is known as the classical theory of statistical inference consists of two branches, namely, estimation and hypothesis testing. We have thus far covered the topic of estimation of the parameters of the (two-variable) linear regression model. Using the method of OLS we were able to estimate the parameters $\beta_1$, $\beta_2$, and $\sigma^2$. Under the assumptions of the classical linear regression model (CLRM), we were able to show that the estimators of these parameters, $\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\sigma}^2$, satisfy several...

## Logarithm Of Hourly Wages In Relation To Gender

To illustrate 9.10.1 , we use the data that underlie Example 9.2. The regression results based on 528 observations are as follows t 72.2943 -5.5048 9.10.4 where indicates p values are practically zero. Taking the antilog of 2.1763, we find 8.8136 , which is the median hourly earnings of male workers, and taking the antilog of 2.1763 - 0.2437 1.92857 , we obtain 6.8796 , which is the median hourly earnings of female workers. Thus, the female workers' median hourly earnings is lower by about...

## Table 1518

Convergence achieved after 7 iterations

`Y = EXP(C(0) + C(1)*X1 + C(2)*X2 + C(3)*X3 + C(4)*X4)`

Columns: Coefficient, Std. error, t statistic, Probability

Log likelihood = -197.2096; Durbin-Watson statistic = 1.7358

Note: EXP( ) means e (the base of natural logarithm) raised by the expression in ( ).

42. John Neter, Michael H. Kutner, Christopher J. Nachtsheim, and William Wasserman, Applied Regression Models, Irwin, 3d ed., Chicago, 1996. The data were obtained from the data disk included in the book and refer to exercise 14.28....

## Assumptions Of The Classical Model

Show that the estimates $\hat{\beta}_1 = 1.572$ and $\hat{\beta}_2 = 1.357$ used in the first experiment of Table 3.1 are in fact the OLS estimators.

3.3. According to Malinvaud (see footnote 10), the assumption that $E(u_i \mid X_i) = 0$ is quite important. To see this, consider the PRF: $Y_i = \beta_1 + \beta_2 X_i + u_i$. Now consider two situations: (i) $\beta_1 = 0$, $\beta_2 = 1$, and $E(u_i) = 0$; and (ii) $\beta_1 = 1$, $\beta_2 = 0$, and $E(u_i) = (X_i - 1)$. Now take the expectation of the PRF conditional upon X in the two preceding cases and see if you agree with Malinvaud about the significance of the...

## Stochastic Processes

the first quarter of 1970 could have been any number, depending on the economic and political climate then prevailing. The figure of 2872.8 is a particular realization of all such possibilities. Therefore, we can say that GDP is a stochastic process and the actual values we observed for the period 1970-I to 1991-IV are a particular realization of that process (i.e., sample). The distinction between the stochastic process and...

## An Illustrative Example

As an illustration of the lin-log model, let us revisit our example on food expenditure in India, Example 3.2. There we fitted a linear-in-variables model as a first approximation. But if we plot the data we obtain the plot in Figure 6.5. As this figure suggests, food expenditure increases more slowly as total expenditure increases, perhaps giving credence to Engel's law. The results of fitting the lin-log model to the data are as follows:

$$\widehat{\text{FoodExp}}_i = -1283.912 + 257.2700 \ln \text{TotalExp}_i$$

t = (-4.3848)...
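In a lin-log model the slope divided by 100 approximates the absolute change in the dependent variable for a 1 percent change in the regressor. The sketch below checks that rule of thumb against the fitted equation; the total expenditure level of 500 is a hypothetical starting point, not a value taken from the data.

```python
import math

INTERCEPT = -1283.912
SLOPE = 257.2700


def predicted_food_exp(total_exp):
    """Fitted food expenditure from the lin-log equation."""
    return INTERCEPT + SLOPE * math.log(total_exp)


base = 500.0                                  # hypothetical total expenditure
exact_change = predicted_food_exp(base * 1.01) - predicted_food_exp(base)
rule_of_thumb = SLOPE / 100.0                 # slope times 0.01

print(round(exact_change, 4), round(rule_of_thumb, 4))
```

The exact change equals the slope times ln(1.01), which is slightly smaller than the rule-of-thumb figure, and it is the same at every starting level of total expenditure.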

## The Nature Of Multicollinearity

The term multicollinearity is due to Ragnar Frisch. Originally it meant the existence of a perfect, or exact, linear relationship among some or all explanatory variables of a regression model. For the k-variable regression involving explanatory variables $X_1, X_2, \ldots, X_k$ (where $X_1 = 1$ for all observations to allow for the intercept term), an exact linear relationship is said to exist if the following condition is satisfied:

$$\lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_k X_k = 0 \tag{10.1.1}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_k$ are constants such that not...

## The Logit Model For Ungrouped Or Individual Data

To set the stage, consider the data given in Table 15.7. Letting Y = 1 if a student's final grade in an intermediate microeconomics course was A and Y = 0 if the final grade was a B or a C, Spector and Mazzeo used grade point average (GPA), TUCE, and Personalized System of Instruction (PSI) as the...

TABLE 15.7 DATA ON THE EFFECT OF PERSONALIZED SYSTEM OF INSTRUCTION (PSI) ON COURSE GRADES

## E4e4 1

where $r_{23}$ is the coefficient of correlation between X2 and X3.

10.21. Using (7.4.12) and (7.4.15), show that when there is perfect collinearity, the variances of $\hat{\beta}_2$ and $\hat{\beta}_3$ are infinite.

10.22. Verify that the standard errors of the sums of the slope coefficients estimated from (10.5.6) and (10.5.7) are, respectively, 0.1549 and 0.1825. (See Section 10.5.)

10.23. For the k-variable regression model, it can be shown that the variance of the kth (k = 2, 3, ..., K) partial regression coefficient given in (7.5.6) can...

## 1958-1972 Production Function Of Taiwan

You are to consider the following model:

$$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \beta_4 X_{4t} + \beta_5 X_{5t} + \beta_6 X_{6t} + u_t$$

a. Estimate the preceding regression.
b. What are the expected signs of the coefficients of this model?
c. Are the empirical results in accordance with prior expectations?
d. Are the estimated partial regression coefficients individually statistically significant at the 5 percent level of significance?
e. Suppose you first regress Y on X2, X3, and X4 only and then decide to add the variables X5 and X6. How would you...

## Yt f2 Xt u

where Y = GDP deflator for domestic goods and X = GDP deflator for imports.

a. How would you choose between the two models a priori?
b. Fit both models to the data and decide which gives a better fit.
c. What other model(s) might be appropriate for the data?

6.16. Refer to the data given in exercise 6.15. The means of Y and X are 1456 and 1760, respectively, and the corresponding standard deviations are 346 and 641. Estimate the following regression where...

## Http Fairmodel.econ.yale.edu Rayfair Pdf 1978dat.zip

If Y is observed, the observations ($= n_1$), denoted by dots, will lie in the X-Y plane. It is intuitively clear that if we estimate a regression line based on the $n_1$ observations only, the resulting intercept and slope coefficients are bound to be different than if all the ($n_1 + n_2$) observations were taken into account. How then does one estimate tobit, or censored regression, models, such as (15.11.1)? The actual mechanics involves the method of maximum likelihood, which is...

## Demaris Logit Model Smsa

Notes: All financial variables are in thousands of dollars. Housing status: Rent = 1 if rents, 0 otherwise. Housing status: Own = 1 if owns, 0 otherwise.

Source: Janet A. Fisher, "An Analysis of Consumer Good Expenditure," The Review of Economics and Statistics, vol. 64, no. 1, Table 1, 1962, p. 67.

## Predicting A Bond Rating

Based on a pooled time series and cross-sectional data of 200 Aa (high-quality) and Baa (medium-quality) bonds over the period 1961-1966, Joseph Cappelleri estimated the following bond rating prediction model:

$$Y_i = \beta_1 + \beta_2 X_{2i}^2 + \beta_3 X_{3i} + \beta_4 X_{4i} + \beta_5 X_{5i} + u_i$$

where $Y_i$ = 1 if the bond rating is Aa (Moody's rating), 0 if the bond rating is Baa (Moody's rating); X2 = debt capitalization ratio, a measure of leverage = (dollar value of long-term debt) / (dollar value of total capitalization); X3 = profit rate = dollar value of net total...

## Consequences Of Model Specification Errors

Whatever the sources of specification errors, what are the consequences? To keep the discussion simple, we will answer this question in the context of the three-variable model and consider in this section the first two types of specification errors discussed earlier, namely, (1) underfitting a model, that is, omitting relevant variables, and (2) overfitting a model, that is, including unnecessary variables. Our discussion here can be easily generalized to more than two regressors, but with tedious...

## The Logit Model

We will continue with our home ownership example to explain the basic ideas underlying the logit model. Recall that in explaining home ownership in relation to income, the LPM was

$$P_i = E(Y = 1 \mid X_i) = \beta_1 + \beta_2 X_i \tag{15.5.1}$$

where X is income and Y = 1 means the family owns a house. But now consider the following representation of home ownership:

$$P_i = E(Y = 1 \mid X_i) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 X_i)}} \tag{15.5.2}$$

For ease of exposition, we write (15.5.2) as

$$P_i = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}} \tag{15.5.3}$$

where $Z_i = \beta_1 + \beta_2 X_i$. Equation (15.5.3) represents what is known as the cumulative logistic distribution function. It is easy to verify that as $Z_i$...
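The behavior of the cumulative logistic function is easy to check numerically. The sketch below evaluates P = 1 / (1 + e^(−Z)) at a few values of Z and confirms that the log of the odds, ln(P / (1 − P)), recovers Z; the function names are illustrative and Z stands for the linear index in income.

```python
import math


def logistic(z):
    """Cumulative logistic distribution function evaluated at z."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Log of the odds ratio p / (1 - p), the inverse of the logistic function."""
    return math.log(p / (1.0 - p))


for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    p = logistic(z)
    print(z, round(p, 6), round(log_odds(p), 6))
```

The probabilities stay strictly between 0 and 1 and approach those bounds only as Z becomes very large in absolute value, while the log odds are linear in Z. That linearity is what makes the model estimable in terms of the index even though P itself is nonlinear in income.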

## P n2k2

where n = total number of observations, d = Durbin-Watson d, and k = number of coefficients (including the intercept) to be estimated. Show that for large n, this estimate of ρ is equal to the one obtained by the simpler formula (1 − d/2).

12.7. Estimating ρ: The Hildreth-Lu scanning or search procedure. Since in the first-order autoregressive scheme ρ is expected to lie between −1 and 1, Hildreth and Lu suggest a systematic scanning or search procedure to locate it. They recommend selecting ρ between −1 and...
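The large-sample claim in the first exercise can be explored numerically. The formula itself is garbled in this excerpt, so the sketch assumes the small-sample estimator has the form [n²(1 − d/2) + k²] / (n² − k²); the function names and the values of d and k are hypothetical.

```python
def rho_simple(d):
    """Estimate of rho from the Durbin-Watson d: 1 - d/2."""
    return 1 - d / 2


def rho_small_sample(n, k, d):
    """Assumed small-sample adjustment: (n^2 * (1 - d/2) + k^2) / (n^2 - k^2)."""
    return (n**2 * (1 - d / 2) + k**2) / (n**2 - k**2)


d, k = 1.0, 3
for n in (20, 100, 10_000):
    print(n, round(rho_small_sample(n, k, d), 6), rho_simple(d))
```

Dividing numerator and denominator by n² shows why the two estimates converge: the k²/n² terms vanish as n grows, leaving 1 − d/2.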