Notes: $\hat{Y}_{1i} = 1.572 + 1.357X_i$ (i.e., $\hat\beta_1 = 1.572$ and $\hat\beta_2 = 1.357$); $\hat{Y}_{2i} = 3.0 + 1.0X_i$ (i.e., $\hat\beta_1 = 3$ and $\hat\beta_2 = 1.0$).


Let $\hat\beta_1 = 1.572$ and $\hat\beta_2 = 1.357$ (let us not worry right now about how we got these values; say, it is just a guess).1 Using these $\hat\beta$ values and the X values given in column (2) of Table 3.1, we can easily compute the estimated $Y_i$ given in column (3) of the table as $\hat{Y}_{1i}$ (the subscript 1 denotes the first experiment). Now let us conduct another experiment, but this time using the values $\hat\beta_1 = 3$ and $\hat\beta_2 = 1$. The estimated values of $Y_i$ from this experiment are given as $\hat{Y}_{2i}$ in column (6) of Table 3.1. Since the $\hat\beta$ values in the two experiments are different, we get different values for the estimated residuals, as shown in the table; $\hat{u}_{1i}$ are the residuals from the first experiment and $\hat{u}_{2i}$ those from the second experiment. The squares of these residuals are given in columns (5) and (8). As expected from (3.1.3), these residual sums of squares are different since they are based on different sets of $\hat\beta$ values.
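The two experiments can be replayed in a few lines. This is only a sketch: Table 3.1 is not reproduced in this excerpt, so the four (X, Y) pairs below are an assumption, chosen because they reproduce the residual sums of squares quoted in the text (12.214 and 14). The function name `residual_sum_of_squares` is likewise illustrative.

```python
# Illustrative data (an assumption: Table 3.1 itself is not shown in this excerpt).
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]


def residual_sum_of_squares(b1, b2, xs, ys):
    """Sum of squared residuals for the candidate line Y-hat = b1 + b2 * X."""
    return sum((y - (b1 + b2 * x)) ** 2 for x, y in zip(xs, ys))


# First experiment: the "guessed" values 1.572 and 1.357.
rss_first = residual_sum_of_squares(1.572, 1.357, X, Y)
# Second experiment: the values 3 and 1.
rss_second = residual_sum_of_squares(3.0, 1.0, X, Y)

print(f"first experiment:  {rss_first:.3f}")
print(f"second experiment: {rss_second:.3f}")
```

Trying further pairs by hand in this way is exactly the trial-and-error process that the least-squares principle is meant to replace.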

Now which set of $\hat\beta$ values should we choose? Since the $\hat\beta$ values of the first experiment give us a lower $\sum \hat{u}_i^2$ (= 12.214) than that obtained from the $\hat\beta$ values of the second experiment (= 14), we might say that the $\hat\beta$'s of the first experiment are the "best" values. But how do we know? If we had infinite time and infinite patience, we could conduct many more such experiments, choosing a different set of $\hat\beta$'s each time, comparing the resulting $\sum \hat{u}_i^2$, and then choosing the set of $\hat\beta$ values that gives the least possible value of $\sum \hat{u}_i^2$, assuming of course that we have considered all the conceivable values of $\hat\beta_1$ and $\hat\beta_2$. But since time, and certainly patience, are generally in short supply, we need a shortcut to this trial-and-error process. Fortunately, the method of least squares provides such a shortcut. The principle or the method of least squares chooses $\hat\beta_1$ and $\hat\beta_2$ in such a manner that, for a given sample or set of data, $\sum \hat{u}_i^2$ is as small as possible. In other words, for a given sample, the method of least squares provides us with unique estimates of $\beta_1$ and $\beta_2$ that give the smallest possible value of $\sum \hat{u}_i^2$. How is this accomplished? It is a straightforward exercise in differential calculus. As shown in Appendix 3A, Section 3A.1, the process of differentiation yields the following equations for estimating $\beta_1$ and $\beta_2$:

$$\sum Y_i = n\hat\beta_1 + \hat\beta_2 \sum X_i$$ (3.1.4)

$$\sum X_i Y_i = \hat\beta_1 \sum X_i + \hat\beta_2 \sum X_i^2$$ (3.1.5)

where n is the sample size. These simultaneous equations are known as the normal equations.
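Written out for the two-variable model, the normal equations are a 2-by-2 linear system in the two unknown estimates: the sum of Y equals n times the intercept plus the slope times the sum of X, and the sum of XY equals the intercept times the sum of X plus the slope times the sum of X squared. A minimal sketch of solving that system by Cramer's rule follows; the function name and the sample data are assumptions made for illustration, not part of the text.

```python
def solve_normal_equations(xs, ys):
    """Solve the two normal equations for (intercept, slope) by Cramer's rule.

        sum(Y)   = n * b1      + b2 * sum(X)
        sum(X*Y) = b1 * sum(X) + b2 * sum(X**2)
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    det = n * sum_xx - sum_x ** 2  # determinant of the coefficient matrix
    if det == 0:
        raise ValueError("X does not vary, so the system has no unique solution")

    b1 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b2 = (n * sum_xy - sum_x * sum_y) / det
    return b1, b2


# Illustrative data (assumed, not taken from the text).
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
b1, b2 = solve_normal_equations(X, Y)

# Plug the solution back in: both gaps should vanish up to rounding error.
gap_first = sum(Y) - (len(X) * b1 + b2 * sum(X))
gap_second = sum(x * y for x, y in zip(X, Y)) - (b1 * sum(X) + b2 * sum(x * x for x in X))
print(b1, b2, gap_first, gap_second)
```

The guard on the determinant reflects that the system has a unique solution only when the X values are not all identical.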

1For the curious, these values are obtained by the method of least squares, discussed shortly. See Eqs. (3.1.6) and (3.1.7).


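Solving the normal equations simultaneously, we obtain

$$\hat\beta_2 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum x_i y_i}{\sum x_i^2}$$ (3.1.6)

where $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y and where we define $x_i = (X_i - \bar{X})$ and $y_i = (Y_i - \bar{Y})$. Henceforth we adopt the convention of letting the lowercase letters denote deviations from mean values.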

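$$\hat\beta_1 = \frac{\sum X_i^2 \sum Y_i - \sum X_i \sum X_i Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \bar{Y} - \hat\beta_2 \bar{X}$$ (3.1.7)

The last step in (3.1.7) can be obtained directly from (3.1.4) by simple algebraic manipulations.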
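In deviation form the two estimators are short enough to code directly: the slope is the sum of cross-products of deviations divided by the sum of squared X deviations, and the intercept follows from the sample means. The sketch below assumes a small illustrative data set and a hypothetical helper name, `ols_fit`; neither comes from the text.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) of the two-variable least-squares line.

    slope     = sum(x_i * y_i) / sum(x_i ** 2), lowercase = deviation from mean
    intercept = mean(Y) - slope * mean(X)
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    x_dev = [x - x_bar for x in xs]
    y_dev = [y - y_bar for y in ys]

    sum_x_dev_sq = sum(d * d for d in x_dev)
    if sum_x_dev_sq == 0:
        raise ValueError("X does not vary, so the slope is not defined")

    slope = sum(dx * dy for dx, dy in zip(x_dev, y_dev)) / sum_x_dev_sq
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Illustrative data (assumed, not taken from the text).
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
intercept, slope = ols_fit(X, Y)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
```

On this assumed data the result agrees, up to rounding, with the values 1.572 and 1.357 used in the first experiment.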

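Incidentally, note that, by making use of simple algebraic identities (see Notes 1 and 2 in footnote 2), formula (3.1.6) for estimating $\beta_2$ can be alternatively expressed as

$$\hat\beta_2 = \frac{\sum x_i Y_i}{\sum X_i^2 - n\bar{X}^2} = \frac{\sum X_i y_i}{\sum X_i^2 - n\bar{X}^2}$$ (3.1.8)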

The estimators obtained previously are known as the least-squares estimators, for they are derived from the least-squares principle. Note the following numerical properties of estimators obtained by the method of OLS: "Numerical properties are those that hold as a consequence of the use of ordinary least squares, regardless of how the data were generated."3 Shortly, we will also consider the statistical properties of OLS estimators, that is, properties "that hold only under certain assumptions about the way the data were generated."4 (See the classical linear regression model in Section 3.2.)

2Note 1: $\sum x_i^2 = \sum (X_i - \bar{X})^2 = \sum X_i^2 - 2\sum X_i \bar{X} + \sum \bar{X}^2 = \sum X_i^2 - 2\bar{X}\sum X_i + \sum \bar{X}^2$, since $\bar{X}$ is a constant. Further noting that $\sum X_i = n\bar{X}$ and $\sum \bar{X}^2 = n\bar{X}^2$ since $\bar{X}$ is a constant, we finally get $\sum x_i^2 = \sum X_i^2 - n\bar{X}^2$.

Note 2: $\sum x_i y_i = \sum x_i (Y_i - \bar{Y}) = \sum x_i Y_i - \bar{Y}\sum x_i = \sum x_i Y_i - \bar{Y}\sum (X_i - \bar{X}) = \sum x_i Y_i$, since $\bar{Y}$ is a constant and since the sum of deviations of a variable from its mean value [e.g., $\sum (X_i - \bar{X})$] is always zero. Likewise, $\sum y_i = \sum (Y_i - \bar{Y}) = 0$.
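The identities in footnote 2 are easy to confirm numerically. The sketch below uses an assumed four-point data set (not from the text) and checks that the sum of squared X deviations equals the sum of squared X values minus n times the squared mean, and that the three versions of the cross-product sum coincide.

```python
# Illustrative data (assumed, not taken from the text).
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
x_dev = [x - x_bar for x in X]  # lowercase x_i in the text
y_dev = [y - y_bar for y in Y]  # lowercase y_i in the text

# Note 1: sum of squared deviations versus raw sum of squares minus n * mean^2.
lhs_note1 = sum(d * d for d in x_dev)
rhs_note1 = sum(x * x for x in X) - n * x_bar ** 2

# Note 2: deviations times deviations, deviations times raw Y, raw X times deviations.
sum_xy_dev = sum(dx * dy for dx, dy in zip(x_dev, y_dev))
sum_xdev_y = sum(dx * y for dx, y in zip(x_dev, Y))
sum_x_ydev = sum(x * dy for x, dy in zip(X, y_dev))

print(lhs_note1, rhs_note1)
print(sum_xy_dev, sum_xdev_y, sum_x_ydev)
```

Because the three cross-product sums agree, the slope can be computed from whichever form is most convenient for the data at hand.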

I. The OLS estimators are expressed solely in terms of the observable (i.e., sample) quantities (i.e., X and Y). Therefore, they can be easily computed.

II. They are point estimators; that is, given the sample, each estimator will provide only a single (point) value of the relevant population parameter. (In Chapter 5 we will consider the so-called interval estimators, which provide a range of possible values for the unknown population parameters.)

III. Once the OLS estimates are obtained from the sample data, the sample regression line (Figure 3.1) can be easily obtained. The regression line thus obtained has the following properties:

1. It passes through the sample means of Y and X. This fact is obvious from (3.1.7), for the latter can be written as $\bar{Y} = \hat\beta_1 + \hat\beta_2 \bar{X}$, which is shown diagrammatically in Figure 3.2.

FIGURE 3.2 Diagram showing that the sample regression line passes through the sample mean values of Y and X.

3Russell Davidson and James G. MacKinnon, Estimation and Inference in Econometrics, Oxford University Press, New York, 1993, p. 3.



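2. The mean value of the estimated $Y = \hat{Y}_i$ is equal to the mean value of the actual Y, for

$$\hat{Y}_i = \hat\beta_1 + \hat\beta_2 X_i = (\bar{Y} - \hat\beta_2 \bar{X}) + \hat\beta_2 X_i = \bar{Y} + \hat\beta_2 (X_i - \bar{X})$$ (3.1.9)

Summing both sides of this last equality over the sample values and dividing through by the sample size n gives

$$\bar{\hat{Y}} = \bar{Y}$$ (3.1.10)5

where use is made of the fact that $\sum (X_i - \bar{X}) = 0$. (Why?)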

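3. The mean value of the residuals $\hat{u}_i$ is zero. From Appendix 3A, Section 3A.1, the first equation is

$$-2\sum (Y_i - \hat\beta_1 - \hat\beta_2 X_i) = 0$$

But since $\hat{u}_i = Y_i - \hat\beta_1 - \hat\beta_2 X_i$, the preceding equation reduces to $-2\sum \hat{u}_i = 0$, whence $\bar{\hat{u}} = 0$.6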

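As a result of the preceding property, the sample regression

$$Y_i = \hat\beta_1 + \hat\beta_2 X_i + \hat{u}_i$$ (2.6.2)

can be expressed in an alternative form where both Y and X are expressed as deviations from their mean values. To see this, sum (2.6.2) on both sides to give

$$\sum Y_i = n\hat\beta_1 + \hat\beta_2 \sum X_i + \sum \hat{u}_i = n\hat\beta_1 + \hat\beta_2 \sum X_i \quad \text{since } \sum \hat{u}_i = 0$$ (3.1.11)

Dividing Eq. (3.1.11) through by n, we obtain

$$\bar{Y} = \hat\beta_1 + \hat\beta_2 \bar{X}$$ (3.1.12)

which is the same as (3.1.7). Subtracting Eq. (3.1.12) from (2.6.2), we obtain

$$Y_i - \bar{Y} = \hat\beta_2 (X_i - \bar{X}) + \hat{u}_i \quad \text{or} \quad y_i = \hat\beta_2 x_i + \hat{u}_i$$ (3.1.13)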

5Note that this result is true only when the regression model has the intercept term $\beta_1$ in it. As App. 6A, Sec. 6A.1 shows, this result need not hold when $\beta_1$ is absent from the model.

6This result also requires that the intercept term $\beta_1$ be present in the model (see App. 6A, Sec. 6A.1).


where $y_i$ and $x_i$, following our convention, are deviations from their respective (sample) mean values.

Equation (3.1.13) is known as the deviation form. Notice that the intercept term $\hat\beta_1$ is no longer present in it. But the intercept term can always be estimated by (3.1.7), that is, from the fact that the sample regression line passes through the sample means of Y and X. An advantage of the deviation form is that it often simplifies computing formulas.

In passing, note that in the deviation form, the SRF can be written as

$$\hat{y}_i = \hat\beta_2 x_i$$ (3.1.14)

whereas in the original units of measurement it was $\hat{Y}_i = \hat\beta_1 + \hat\beta_2 X_i$, as shown in (2.6.1).

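4. The residuals $\hat{u}_i$ are uncorrelated with the predicted $Y_i$. This statement can be verified as follows: using the deviation form, we can write

$$\sum \hat{y}_i \hat{u}_i = \hat\beta_2 \sum x_i \hat{u}_i = \hat\beta_2 \sum x_i (y_i - \hat\beta_2 x_i) = \hat\beta_2 \sum x_i y_i - \hat\beta_2^2 \sum x_i^2 = \hat\beta_2^2 \sum x_i^2 - \hat\beta_2^2 \sum x_i^2 = 0$$ (3.1.15)

where use is made of the fact that $\hat\beta_2 = \sum x_i y_i / \sum x_i^2$.

5. The residuals $\hat{u}_i$ are uncorrelated with $X_i$; that is, $\sum \hat{u}_i X_i = 0$. This fact follows from Eq. (2) in Appendix 3A, Section 3A.1.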
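Because these five properties are numerical, they hold for any data set fitted by OLS with an intercept, and they can be checked directly. The sketch below does so on a small assumed data set; the data and the variable names are illustrative only.

```python
# Illustrative data (assumed, not taken from the text).
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares estimates from the deviation-form formulas.
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
b1 = y_bar - b2 * x_bar

fitted = [b1 + b2 * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

# Each entry should be zero up to floating-point rounding.
checks = {
    "1. line passes through the sample means": y_bar - (b1 + b2 * x_bar),
    "2. mean of fitted Y minus mean of actual Y": sum(fitted) / n - y_bar,
    "3. sum of residuals": sum(residuals),
    "4. sum of residuals times fitted deviations": sum(
        u * (f - y_bar) for u, f in zip(residuals, fitted)
    ),
    "5. sum of residuals times X": sum(u * x for u, x in zip(residuals, X)),
}

for name, value in checks.items():
    print(f"{name}: {value:.2e}")
```

Properties 2 and 3 rely on the intercept being in the model, as the footnotes point out; dropping `b1` from the fit would generally break them.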


If our objective is to estimate $\beta_1$ and $\beta_2$ only, the method of OLS discussed in the preceding section will suffice. But recall from Chapter 2 that in regression analysis our objective is not only to obtain $\hat\beta_1$ and $\hat\beta_2$ but also to draw inferences about the true $\beta_1$ and $\beta_2$. For example, we would like to know how close $\hat\beta_1$ and $\hat\beta_2$ are to their counterparts in the population or how close $\hat{Y}_i$ is to the true $E(Y \mid X_i)$. To that end, we must not only specify the functional form of the model, as in (2.4.2), but also make certain assumptions about the manner in which $Y_i$ are generated. To see why this requirement is needed, look at the PRF: $Y_i = \beta_1 + \beta_2 X_i + u_i$. It shows that $Y_i$ depends on both $X_i$ and $u_i$. Therefore, unless we are specific about how $X_i$ and $u_i$ are created or generated, there is no way we can make any statistical inference about the $Y_i$ and also, as we shall see, about $\beta_1$ and $\beta_2$. Thus, the assumptions made about the $X_i$ variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates.

The Gaussian, standard, or classical linear regression model (CLRM), which is the cornerstone of most econometric theory, makes 10 assumptions.7 We first discuss these assumptions in the context of the two-variable regression model; and in Chapter 7 we extend them to multiple regression models, that is, models in which there is more than one regressor.

Assumption 1: Linear regression model. The regression model is linear in the parameters, as shown in (2.4.2):

$$Y_i = \beta_1 + \beta_2 X_i + u_i$$ (2.4.2)

We already discussed model (2.4.2) in Chapter 2. Since linear-in-parameter regression models are the starting point of the CLRM, we will maintain this assumption throughout this book. Keep in mind that the regressand Y and the regressor X themselves may be nonlinear, as discussed in Chapter 2.8

Assumption 2: X values are fixed in repeated sampling. Values taken by the regressor X are considered fixed in repeated samples. More technically, X is assumed to be nonstochastic.

This assumption is implicit in our discussion of the PRF in Chapter 2. But it is very important to understand the concept of "fixed values in repeated sampling," which can be explained in terms of our example given in Table 2.1. Consider the various Y populations corresponding to the levels of income shown in that table. Keeping the value of income X fixed, say, at level $80, we draw at random a family and observe its weekly family consumption expenditure Y as, say, $60. Still keeping X at $80, we draw at random another family and observe its Y value as $75. In each of these drawings (i.e., repeated sampling), the value of X is fixed at $80. We can repeat this process for all the X values shown in Table 2.1. As a matter of fact, the sample data shown in Tables 2.4 and 2.5 were drawn in this fashion.

What all this means is that our regression analysis is conditional regression analysis, that is, conditional on the given values of the regressor(s) X.
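A small simulation can make "fixed in repeated sampling" concrete. In the sketch below the X values are the same in every sample and only the disturbances are redrawn, so Y changes from sample to sample while X does not. The income levels, the population parameters (`BETA1`, `BETA2`, `SIGMA`), and the normal disturbances are all hypothetical choices for illustration, not values from the text.

```python
import random

# Regressor values held fixed across all samples (hypothetical income levels).
X_FIXED = [80, 100, 120, 140]

# Hypothetical population parameters, chosen only for illustration.
BETA1, BETA2, SIGMA = 17.0, 0.6, 5.0


def draw_sample(rng):
    """Draw one sample: X stays at X_FIXED, only the disturbances are redrawn."""
    return [(x, BETA1 + BETA2 * x + rng.gauss(0.0, SIGMA)) for x in X_FIXED]


rng = random.Random(0)  # seeded so the run is repeatable
samples = [draw_sample(rng) for _ in range(3)]

for k, sample in enumerate(samples, start=1):
    print(f"sample {k}:", [(x, round(y, 1)) for x, y in sample])
```

Every printed sample lists the same four X values; what differs is the Y drawn at each of them, which is the sense in which the analysis is conditional on X.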

7It is classical in the sense that it was developed first by Gauss in 1821 and since then has served as a norm or a standard against which may be compared the regression models that do not satisfy the Gaussian assumptions.

8However, a brief discussion of nonlinear-in-the-parameter regression models is given in Chap. 14.


Assumption 3: Zero mean value of disturbance $u_i$. Given the value of X, the mean, or expected, value of the random disturbance term $u_i$ is zero. Technically, the conditional mean value of $u_i$ is zero. Symbolically, we have

$$E(u_i \mid X_i) = 0$$ (3.2.1)

Assumption 3 states that the mean value of ui, conditional upon the given Xi, is zero. Geometrically, this assumption can be pictured as in Figure 3.3, which shows a few values of the variable X and the Y populations associated with each of them. As shown, each Y population corresponding to a given X is distributed around its mean value (shown by the circled points on the PRF) with some Y values above the mean and some below it. The distances above and below the mean values are nothing but the ui, and what (3.2.1) requires is that the average or mean value of these deviations corresponding to any given X should be zero.9
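The requirement can be illustrated by simulation: at each X level, draw many disturbances from a distribution with mean zero and confirm that their average is close to zero. This is a sketch under assumed conditions (normal disturbances with a common standard deviation, three hypothetical X levels); the assumption itself concerns the population mean, so a simulated average only approximates it.

```python
import random
from statistics import mean

rng = random.Random(42)  # seeded so the run is repeatable

X_LEVELS = [80, 100, 120]  # hypothetical X values
N_DRAWS = 20000            # disturbances drawn at each X level
SIGMA = 5.0                # hypothetical standard deviation of u

# Average disturbance at each X level; each should be near zero.
conditional_means = {
    x: mean([rng.gauss(0.0, SIGMA) for _ in range(N_DRAWS)]) for x in X_LEVELS
}

for x, u_bar in conditional_means.items():
    print(f"X = {x}: average u = {u_bar:+.4f}")
```

Individual disturbances are sometimes positive and sometimes negative; it is only their average at a given X that the assumption pins down.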

This assumption should not be difficult to comprehend in view of the discussion in Section 2.4 [see Eq. (2.4.5)]. All that this assumption says is that the factors not explicitly included in the model, and therefore subsumed in ui, do not systematically affect the mean value of Y; so to speak, the positive u
