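$$Z = \frac{\hat\beta_2 - \beta_2}{\operatorname{se}(\hat\beta_2)} = \frac{(\hat\beta_2 - \beta_2)\sqrt{\sum x_i^2}}{\sigma} \tag{5.3.1}$$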

Gujarati: Basic Econometrics, Fourth Edition | I. Single-Equation Regression Models | 5. Two-Variable Regression: Interval Estimation and Hypothesis Testing | © The McGraw-Hill Companies, 2004


as noted in (4.3.6), is a standardized normal variable. It therefore seems that we can use the normal distribution to make probabilistic statements about $\beta_2$ provided the true population variance $\sigma^2$ is known. If $\sigma^2$ is known, an important property of a normally distributed variable with mean $\mu$ and variance $\sigma^2$ is that the area under the normal curve between $\mu \pm \sigma$ is about 68 percent, that between the limits $\mu \pm 2\sigma$ is about 95 percent, and that between $\mu \pm 3\sigma$ is about 99.7 percent.
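The 68, 95, and 99.7 percent figures can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `area_within` is introduced here for illustration only and does not come from the text.

```python
from statistics import NormalDist


def area_within(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Area under the normal curve between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545, 0.9973
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The result does not depend on the particular values of the mean and standard deviation, which is why the statement holds for any normally distributed variable.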

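But $\sigma^2$ is rarely known, and in practice it is determined by the unbiased estimator $\hat\sigma^2$. If we replace $\sigma$ by $\hat\sigma$, (5.3.1) may be written as

$$t = \frac{\hat\beta_2 - \beta_2}{\operatorname{se}(\hat\beta_2)} = \frac{\text{estimator} - \text{parameter}}{\text{estimated standard error of estimator}} \tag{5.3.2}$$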

where $\operatorname{se}(\hat\beta_2)$ now refers to the estimated standard error. It can be shown (see Appendix 5A, Section 5A.2) that the $t$ variable thus defined follows the $t$ distribution with $n - 2$ df. [Note the difference between (5.3.1) and (5.3.2).] Therefore, instead of using the normal distribution, we can use the $t$ distribution to establish a confidence interval for $\beta_2$ as follows:

$$\Pr(-t_{\alpha/2} \le t \le t_{\alpha/2}) = 1 - \alpha \tag{5.3.3}$$

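where the $t$ value in the middle of this double inequality is the $t$ value given by (5.3.2) and where $t_{\alpha/2}$ is the value of the $t$ variable obtained from the $t$ distribution for $\alpha/2$ level of significance and $n - 2$ df; it is often called the critical $t$ value at $\alpha/2$ level of significance. Substitution of (5.3.2) into (5.3.3) yields

$$\Pr\left[-t_{\alpha/2} \le \frac{\hat\beta_2 - \beta_2}{\operatorname{se}(\hat\beta_2)} \le t_{\alpha/2}\right] = 1 - \alpha \tag{5.3.4}$$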

Rearranging (5.3.4), we obtain

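$$\Pr[\hat\beta_2 - t_{\alpha/2}\,\operatorname{se}(\hat\beta_2) \le \beta_2 \le \hat\beta_2 + t_{\alpha/2}\,\operatorname{se}(\hat\beta_2)] = 1 - \alpha \tag{5.3.5}$$ ³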

³Some authors prefer to write (5.3.5) with the df explicitly indicated. Thus, they would write

$$\Pr[\hat\beta_2 - t_{(n-2),\alpha/2}\,\operatorname{se}(\hat\beta_2) \le \beta_2 \le \hat\beta_2 + t_{(n-2),\alpha/2}\,\operatorname{se}(\hat\beta_2)] = 1 - \alpha$$

But for simplicity we will stick to our notation; the context clarifies the appropriate df involved.
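The rearrangement that leads to (5.3.5) can be spelled out step by step. The sketch below starts from the double inequality on the $t$ ratio and uses only the fact that $\operatorname{se}(\hat\beta_2) > 0$; the intermediate lines are added here for illustration.

```latex
\begin{aligned}
-t_{\alpha/2} &\le \frac{\hat\beta_2 - \beta_2}{\operatorname{se}(\hat\beta_2)} \le t_{\alpha/2} \\
-t_{\alpha/2}\,\operatorname{se}(\hat\beta_2) &\le \hat\beta_2 - \beta_2 \le t_{\alpha/2}\,\operatorname{se}(\hat\beta_2)
  && \text{(multiply through by } \operatorname{se}(\hat\beta_2) > 0\text{)} \\
-\hat\beta_2 - t_{\alpha/2}\,\operatorname{se}(\hat\beta_2) &\le -\beta_2 \le -\hat\beta_2 + t_{\alpha/2}\,\operatorname{se}(\hat\beta_2)
  && \text{(subtract } \hat\beta_2\text{)} \\
\hat\beta_2 - t_{\alpha/2}\,\operatorname{se}(\hat\beta_2) &\le \beta_2 \le \hat\beta_2 + t_{\alpha/2}\,\operatorname{se}(\hat\beta_2)
  && \text{(multiply by } -1 \text{ and reverse the inequalities)}
\end{aligned}
```

Each step leaves the event unchanged, so its probability remains $1 - \alpha$.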


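Equation (5.3.5) provides a $100(1 - \alpha)$ percent confidence interval for $\beta_2$, which can be written more compactly as

$$100(1 - \alpha)\%\ \text{confidence interval for } \beta_2:\quad \hat\beta_2 \pm t_{\alpha/2}\,\operatorname{se}(\hat\beta_2) \tag{5.3.6}$$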

Arguing analogously, and using (4.3.1) and (4.3.2), we can then write:

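$$\Pr[\hat\beta_1 - t_{\alpha/2}\,\operatorname{se}(\hat\beta_1) \le \beta_1 \le \hat\beta_1 + t_{\alpha/2}\,\operatorname{se}(\hat\beta_1)] = 1 - \alpha \tag{5.3.7}$$

or, more compactly,

$$100(1 - \alpha)\%\ \text{confidence interval for } \beta_1:\quad \hat\beta_1 \pm t_{\alpha/2}\,\operatorname{se}(\hat\beta_1) \tag{5.3.8}$$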

Notice an important feature of the confidence intervals given in (5.3.6) and (5.3.8): In both cases the width of the confidence interval is proportional to the standard error of the estimator. That is, the larger the standard error, the larger is the width of the confidence interval. Put differently, the larger the standard error of the estimator, the greater is the uncertainty of estimating the true value of the unknown parameter. Thus, the standard error of an estimator is often described as a measure of the precision of the estimator, i.e., how precisely the estimator measures the true population value.
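To make the proportionality explicit, subtract the lower limit of the interval from the upper limit. The short derivation below is written for $\beta_2$; the same algebra applies to $\beta_1$.

```latex
\text{width}
  = \bigl[\hat\beta_2 + t_{\alpha/2}\,\operatorname{se}(\hat\beta_2)\bigr]
  - \bigl[\hat\beta_2 - t_{\alpha/2}\,\operatorname{se}(\hat\beta_2)\bigr]
  = 2\,t_{\alpha/2}\,\operatorname{se}(\hat\beta_2)
```

For a given confidence coefficient and given df, $t_{\alpha/2}$ is a constant, so doubling the standard error doubles the width of the interval.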

Returning to our illustrative consumption-income example, in Chapter 3 (Section 3.6) we found that $\hat\beta_2 = 0.5091$, $\operatorname{se}(\hat\beta_2) = 0.0357$, and df = 8. If we assume $\alpha = 5\%$, that is, a 95% confidence coefficient, then the $t$ table shows that for 8 df the critical $t_{\alpha/2} = t_{0.025} = 2.306$. Substituting these values in (5.3.5), the reader should verify that the 95% confidence interval for $\beta_2$ is as follows:

$$0.4268 \le \beta_2 \le 0.5914 \tag{5.3.9}$$

that is,

$$0.5091 \pm 2.306(0.0357) = 0.5091 \pm 0.0823 \tag{5.3.10}$$
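The arithmetic of this example can be reproduced with a short, standard-library-only Python sketch. It recovers the critical value $t_{0.025}$ for 8 df by integrating the $t$ density numerically (Simpson's rule plus bisection) instead of reading it from a table. The helper names `t_pdf`, `t_upper_tail`, and `t_critical` are assumptions introduced for illustration; a statistics library would normally supply the quantile directly.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(x: float, df: int, steps: int = 2000) -> float:
    """P(T > x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha: float, df: int) -> float:
    """Two-sided critical value: the x with P(T > x) = alpha / 2, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha / 2:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Values quoted in the text for the consumption-income example.
beta2_hat, se_beta2, df = 0.5091, 0.0357, 8

t_crit = t_critical(0.05, df)
margin = t_crit * se_beta2
lower, upper = beta2_hat - margin, beta2_hat + margin

print(f"critical t = {t_crit:.3f}")
print(f"95% interval for beta_2: ({lower:.4f}, {upper:.4f})")
```

Using the tabulated value 2.306 directly in place of `t_critical(0.05, df)` gives the same interval to four decimal places.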

The interpretation of this confidence interval is: Given the confidence coefficient of 95%, in the long run, in 95 out of 100 cases intervals like (0.4268, 0.5914) will contain the true $\beta_2$. But, as warned earlier, we cannot say that the probability is 95 percent that the specific interval (0.4268 to 0.5914) contains the true $\beta_2$ because this interval is now fixed and no longer random; therefore, $\beta_2$ either lies in it or does not: the probability that the specified fixed interval includes the true $\beta_2$ is therefore 1 or 0.
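The long-run reading of the confidence coefficient can be illustrated by simulation. The sketch below repeatedly draws samples of 10 observations from a two-variable model with normal errors, builds the 95% interval for the slope each time, and counts how often the interval covers the true slope. The true parameter values, the fixed X values, and the function name `simulate_coverage` are hypothetical choices made for this illustration; only the critical value 2.306 for 8 df comes from the text.

```python
import random


def simulate_coverage(n_reps: int = 4000, seed: int = 0) -> float:
    """Share of simulated 95% intervals for the slope that contain its true value."""
    rng = random.Random(seed)
    x = list(range(80, 261, 20))          # 10 fixed X values (illustrative)
    beta1, beta2, sigma = 24.0, 0.5, 6.5  # hypothetical true parameters
    t_crit = 2.306                        # t_{0.025} for n - 2 = 8 df
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    covered = 0
    for _ in range(n_reps):
        # Draw a fresh sample of Y for the same X values.
        y = [beta1 + beta2 * xi + rng.gauss(0, sigma) for xi in x]
        y_bar = sum(y) / n
        # OLS estimates of slope and intercept.
        b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        b1 = y_bar - b2 * x_bar
        # Unbiased estimate of the error variance and the slope's standard error.
        rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
        se_b2 = (rss / (n - 2) / sxx) ** 0.5
        if b2 - t_crit * se_b2 <= beta2 <= b2 + t_crit * se_b2:
            covered += 1
    return covered / n_reps


print(f"share of intervals covering the true slope: {simulate_coverage():.3f}")
```

The share should come out close to 0.95. Any single interval produced inside the loop either contains the true slope or does not, which is the distinction drawn above.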

### Confidence Interval for $\beta_1$

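Following (5.3.7), the reader can easily verify that the 95% confidence interval for $\beta_1$ of our consumption-income example is

$$9.6643 \le \beta_1 \le 39.2448 \tag{5.3.11}$$

that is,

$$24.4545 \pm 2.306(6.4138) = 24.4545 \pm 14.7902 \tag{5.3.12}$$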

Again you should be careful in interpreting this confidence interval. In the long run, in 95 out of 100 cases intervals like (5.3.11) will contain the true $\beta_1$; the probability that this particular fixed interval includes the true $\beta_1$ is either 1 or 0.

### Confidence Interval for $\beta_1$ and $\beta_2$ Simultaneously

There are occasions when one needs to construct a joint confidence interval for $\beta_1$ and $\beta_2$ such that with a confidence coefficient $(1 - \alpha)$, say, 95%, that interval includes $\beta_1$ and $\beta_2$ simultaneously. Since this topic is involved, the interested reader may want to consult appropriate references.⁴ We will touch on this topic briefly in Chapters 8 and 10.

## 5.4 CONFIDENCE INTERVAL FOR $\sigma^2$

As pointed out in Chapter 4, Section 4.3, under the normality assumption, the variable

$$\chi^2 = (n - 2)\frac{\hat\sigma^2}{\sigma^2} \tag{5.4.1}$$

⁴For an accessible discussion, see John Neter, William Wasserman, and Michael H. Kutner, *Applied Linear Regression Models*, Richard D. Irwin, Homewood, Ill., 1983, Chap. 5.
