## Stochastic Specification Of

It is clear from Figure 2.1 that, as family income increases, family consumption expenditure on average increases, too. But what about the consumption expenditure of an individual family in relation to its (fixed) level of income? It is obvious from Table 2.1 and Figure 2.1 that an individual family's consumption expenditure does not necessarily increase as the income level increases. For example, from Table 2.1 we observe that corresponding to the income level of \$100 there is one family whose consumption expenditure of \$65 is less than the consumption expenditures of two families whose weekly income is only \$80. But notice that the average consumption expenditure of families with a weekly income of \$100 is greater than the average consumption expenditure of families with a weekly income of \$80 (\$77 versus \$65).

What, then, can we say about the relationship between an individual family's consumption expenditure and a given level of income? We see from Figure 2.1 that, given the income level of Xi, an individual family's consumption expenditure is clustered around the average consumption of all families at that Xi, that is, around its conditional expectation. Therefore, we can express the deviation of an individual Yi around its expected value as follows:

$$u_i = Y_i - E(Y \mid X_i)$$

or

$$Y_i = E(Y \mid X_i) + u_i \tag{2.4.1}$$

where the deviation ui is an unobservable random variable taking positive or negative values. Technically, ui is known as the stochastic disturbance or stochastic error term.

How do we interpret (2.4.1)? We can say that the expenditure of an individual family, given its income level, can be expressed as the sum of two components: (1) E(Y | Xi), which is simply the mean consumption expenditure of all the families with the same level of income and is known as the systematic, or deterministic, component; and (2) ui, which is the random, or nonsystematic, component. We shall examine shortly the nature of the stochastic disturbance term, but for the moment assume that it is a surrogate or proxy for all the omitted or neglected variables that may affect Y but are not (or cannot be) included in the regression model.

If E(Y | Xi) is assumed to be linear in Xi, as in (2.2.2), Eq. (2.4.1) may be written as

$$Y_i = E(Y \mid X_i) + u_i = \beta_1 + \beta_2 X_i + u_i \tag{2.4.2}$$

Equation (2.4.2) posits that the consumption expenditure of a family is linearly related to its income plus the disturbance term. Thus, the individual consumption expenditures, given X = \$80 (see Table 2.1), can be expressed as

$$
\begin{aligned}
Y_1 &= 55 = \beta_1 + \beta_2(80) + u_1 \\
Y_2 &= 60 = \beta_1 + \beta_2(80) + u_2
\end{aligned}
\tag{2.4.3}
$$
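To make these expressions concrete, here is a minimal Python sketch that puts numbers on the disturbances. It rests on an assumption not stated in this passage: that E(Y | X) is the straight line through the two conditional means quoted earlier (\$65 at X = \$80 and \$77 at X = \$100). The names `beta1`, `beta2`, and `conditional_mean` are illustrative only.

```python
# Conditional means quoted in the text: mean consumption at each income level.
mean_at_80, mean_at_100 = 65.0, 77.0

# Assumption: E(Y | X) = beta1 + beta2 * X is the line through these two points.
beta2 = (mean_at_100 - mean_at_80) / (100 - 80)  # slope
beta1 = mean_at_80 - beta2 * 80                  # intercept


def conditional_mean(x):
    """Systematic component E(Y | X = x) under the linearity assumption."""
    return beta1 + beta2 * x


# The two individual expenditures shown above for X = 80.
observed = [55.0, 60.0]

# Disturbance u_i = Y_i - E(Y | X_i): the nonsystematic component.
disturbances = [y - conditional_mean(80) for y in observed]

print("beta1:", beta1, "beta2:", beta2)
print("disturbances at X = 80:", disturbances)
```

Under this assumption, both families spend less than the conditional mean, so both disturbances are negative. The values of `beta1` and `beta2` follow only from the two quoted means, not from an estimation procedure.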


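Now if we take the expected value of (2.4.1) on both sides, we obtain

$$
\begin{aligned}
E(Y_i \mid X_i) &= E[E(Y \mid X_i)] + E(u_i \mid X_i) \\
&= E(Y \mid X_i) + E(u_i \mid X_i)
\end{aligned}
\tag{2.4.4}
$$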

where use is made of the fact that the expected value of a constant is that constant itself: once the value of Xi is fixed, E(Y | Xi) is a constant. Notice carefully that in (2.4.4) we have taken the conditional expectation, conditional upon the given X's.

Since E(Yi | Xi) is the same thing as E(Y | Xi), Eq. (2.4.4) implies that

$$E(u_i \mid X_i) = 0 \tag{2.4.5}$$

Thus, the assumption that the regression line passes through the conditional means of Y (see Figure 2.2) implies that the conditional mean values of ui (conditional upon the given X's) are zero.
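The zero-conditional-mean property can be checked numerically. The sketch below uses illustrative expenditure values, chosen only so that the group averages match the \$65 and \$77 quoted in the text; they are an assumption for demonstration and are not presented as a reproduction of Table 2.1. The name `consumption_by_income` is hypothetical.

```python
from statistics import mean

# Illustrative (assumed) weekly consumption figures at two income levels,
# chosen so that the group means are 65 and 77 as quoted in the text.
consumption_by_income = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
}

mean_disturbance = {}
for income, spending in consumption_by_income.items():
    # Conditional mean E(Y | X = income) within this income group.
    cond_mean = mean(spending)
    # Disturbances u_i = Y_i - E(Y | X_i) for each family in the group.
    disturbances = [y - cond_mean for y in spending]
    # Average disturbance within the group, i.e. E(u | X = income).
    mean_disturbance[income] = mean(disturbances)
    print(income, cond_mean, mean_disturbance[income])
```

Within each income group the disturbances average to zero by construction, because they are measured as deviations from that group's own mean. This is the numerical counterpart of the statement that a regression line through the conditional means implies zero conditional mean disturbances.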

From the previous discussion, it is clear that (2.2.2) and (2.4.2) are equivalent forms if E(ui | Xi) = 0. But the stochastic specification (2.4.2) has the advantage that it clearly shows that there are other variables besides income that affect consumption expenditure and that an individual family's consumption expenditure cannot be fully explained only by the variable(s) included in the regression model.