## 5A.1 Probability Distributions Related to the Normal Distribution

The t, chi-square ($\chi^2$), and F probability distributions, whose salient features are discussed in Appendix A, are intimately related to the normal distribution. Since we will make heavy use of these probability distributions in the following chapters, we summarize their relationship with the normal distribution in the following theorems; the proofs, which are beyond the scope of this book, can be found in the references.1

Theorem 5.1. If $Z_1, Z_2, \ldots, Z_n$ are normally and independently distributed random variables such that $Z_i \sim N(\mu_i, \sigma_i^2)$, then the sum $Z = \sum k_i Z_i$, where the $k_i$ are constants not all zero, is also distributed normally with mean $\sum k_i \mu_i$ and variance $\sum k_i^2 \sigma_i^2$; that is, $Z \sim N\left(\sum k_i \mu_i, \sum k_i^2 \sigma_i^2\right)$. Note: $\mu$ denotes the mean value.

In short, linear combinations of normal variables are themselves normally distributed. For example, if $Z_1$ and $Z_2$ are normally and independently distributed as $Z_1 \sim N(10, 2)$ and $Z_2 \sim N(8, 1.5)$, then the linear combination $Z = 0.8Z_1 + 0.2Z_2$ is also normally distributed with mean $= 0.8(10) + 0.2(8) = 9.6$ and variance $= 0.64(2) + 0.04(1.5) = 1.34$; that is, $Z \sim N(9.6, 1.34)$.
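As a quick numerical check of this example, the following Monte Carlo sketch (assuming NumPy is available; none of this code appears in the original text) simulates the linear combination and compares its sample mean and variance with the theoretical values 9.6 and 1.34:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# N(10, 2) and N(8, 1.5) are written as N(mean, variance) in the text;
# NumPy's normal() takes the standard deviation, hence the square roots.
z1 = rng.normal(10, np.sqrt(2.0), n)
z2 = rng.normal(8, np.sqrt(1.5), n)

z = 0.8 * z1 + 0.2 * z2  # linear combination of independent normals

# Theory: mean = 0.8(10) + 0.2(8) = 9.6, variance = 0.64(2) + 0.04(1.5) = 1.34
print(z.mean(), z.var())  # both close to 9.6 and 1.34
```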

Theorem 5.2. If $Z_1, Z_2, \ldots, Z_n$ are normally distributed but are not independent, the sum $Z = \sum k_i Z_i$, where the $k_i$ are constants not all zero, is also normally distributed with mean $\sum k_i \mu_i$ and variance $\sum k_i^2 \sigma_i^2 + 2 \sum_{i < j} k_i k_j \operatorname{cov}(Z_i, Z_j)$.

Thus, if $Z_1 \sim N(6, 2)$ and $Z_2 \sim N(7, 3)$ and $\operatorname{cov}(Z_1, Z_2) = 0.8$, then the linear combination $0.6Z_1 + 0.4Z_2$ is also normally distributed with mean $= 0.6(6) + 0.4(7) = 6.4$ and variance $= [0.36(2) + 0.16(3) + 2(0.6)(0.4)(0.8)] = 1.584$.
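A similar sketch for the correlated case (again assuming NumPy; the variable names are ours) draws $(Z_1, Z_2)$ jointly with the stated covariance and checks the mean 6.4 and variance 1.584 computed above:

```python
import numpy as np

rng = np.random.default_rng(0)

mean = [6.0, 7.0]                  # Z1 ~ N(6, 2), Z2 ~ N(7, 3)
cov = [[2.0, 0.8],                 # var(Z1) = 2, cov(Z1, Z2) = 0.8
       [0.8, 3.0]]                 # var(Z2) = 3
z = rng.multivariate_normal(mean, cov, size=1_000_000)

w = 0.6 * z[:, 0] + 0.4 * z[:, 1]

# Theory: mean = 6.4; variance = 0.36(2) + 0.16(3) + 2(0.6)(0.4)(0.8) = 1.584
print(w.mean(), w.var())
```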

1For proofs of the various theorems, see Alexander M. Mood, Franklin A. Graybill, and Duane C. Boes, Introduction to the Theory of Statistics, 3d ed., McGraw-Hill, New York, 1974, pp. 239-249.

Gujarati, Basic Econometrics, Fourth Edition. Part One: Single-Equation Regression Models. Chapter 5: Two-Variable Regression: Interval Estimation and Hypothesis Testing. © The McGraw-Hill Companies, 2004.

Theorem 5.3. If $Z_1, Z_2, \ldots, Z_n$ are normally and independently distributed random variables such that each $Z_i \sim N(0, 1)$, that is, a standardized normal variable, then the sum of squares $\sum Z_i^2 = Z_1^2 + Z_2^2 + \cdots + Z_n^2$ follows the chi-square distribution with $n$ df. Symbolically, $\sum Z_i^2 \sim \chi_n^2$, where $n$ denotes the degrees of freedom, df.

In short, "the sum of the squares of independent standard normal variables has a chi-square distribution with degrees of freedom equal to the number of terms in the sum."2

Theorem 5.4. If $Z_1, Z_2, \ldots, Z_n$ are independently distributed random variables each following the chi-square distribution with $k_i$ df, then the sum $\sum Z_i = Z_1 + Z_2 + \cdots + Z_n$ also follows a chi-square distribution with $k = \sum k_i$ df.

Thus, if $Z_1$ and $Z_2$ are independent $\chi^2$ variables with df of $k_1$ and $k_2$, respectively, then $Z = Z_1 + Z_2$ is also a $\chi^2$ variable with $(k_1 + k_2)$ degrees of freedom. This is called the reproductive property of the $\chi^2$ distribution.
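Theorem 5.3 can be illustrated by direct simulation (a NumPy sketch, not from the text): summing the squares of $k$ independent $N(0, 1)$ draws should reproduce the chi-square mean $k$ and variance $2k$:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5

# Each row: sum of squares of k independent N(0, 1) variables (Theorem 5.3)
samples = (rng.standard_normal((200_000, k)) ** 2).sum(axis=1)

# A chi-square variable with k df has mean k and variance 2k
print(samples.mean(), samples.var())  # close to 5 and 10
```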

Theorem 5.5. If $Z_1$ is a standardized normal variable [$Z_1 \sim N(0, 1)$] and another variable $Z_2$ follows the chi-square distribution with $k$ df and is independent of $Z_1$, then the variable defined as

$$t = \frac{Z_1}{\sqrt{Z_2/k}} = \frac{Z_1\sqrt{k}}{\sqrt{Z_2}} = \frac{\text{standard normal variable}}{\sqrt{\text{independent chi-square variable}/\text{df}}} \sim t_k$$

follows Student's t distribution with $k$ df. Note: This distribution is discussed in Appendix A and is illustrated in Chapter 5.

Incidentally, note that as $k$, the df, increases indefinitely (i.e., as $k \to \infty$), the Student's t distribution approaches the standardized normal distribution.3 As a matter of convention, the notation $t_k$ means Student's t distribution or variable with $k$ df.
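The construction in Theorem 5.5 is easy to verify by simulation (a NumPy sketch with our variable names): dividing a standard normal by the square root of an independent chi-square over its df yields draws whose variance matches the t distribution's $k/(k - 2)$:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 10, 200_000

z1 = rng.standard_normal(n)                          # Z1 ~ N(0, 1)
z2 = (rng.standard_normal((n, k)) ** 2).sum(axis=1)  # Z2 ~ chi-square, k df

t = z1 / np.sqrt(z2 / k)  # Theorem 5.5: t ~ Student's t with k df

# A t variable with k df has mean 0 and variance k/(k - 2) = 1.25 for k = 10
print(t.mean(), t.var())
```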

Theorem 5.6. If $Z_1$ and $Z_2$ are independently distributed chi-square variables with $k_1$ and $k_2$ df, respectively, then the variable

$$F = \frac{Z_1/k_1}{Z_2/k_2} \sim F_{k_1,k_2}$$

has the F distribution with $k_1$ and $k_2$ degrees of freedom, where $k_1$ is known as the numerator degrees of freedom and $k_2$ the denominator degrees of freedom.

3For proof, see Henri Theil, Introduction to Econometrics, Prentice-Hall, Englewood Cliffs, N.J., 1978, pp. 237-245.


Again as a matter of convention, the notation $F_{k_1,k_2}$ means an F variable with $k_1$ and $k_2$ degrees of freedom, the df in the numerator being quoted first.

In other words, Theorem 5.6 states that the F variable is simply the ratio of two independently distributed chi-square variables divided by their respective degrees of freedom.
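Constructing an F variable this way from two independent chi-squares (a NumPy sketch, not part of the text) reproduces the theoretical F mean $k_2/(k_2 - 2)$:

```python
import numpy as np

rng = np.random.default_rng(0)
k1, k2, n = 4, 20, 200_000

# Two independent chi-square variables built from standard normals
z1 = (rng.standard_normal((n, k1)) ** 2).sum(axis=1)
z2 = (rng.standard_normal((n, k2)) ** 2).sum(axis=1)

f = (z1 / k1) / (z2 / k2)  # Theorem 5.6: F with (k1, k2) df

# An F(k1, k2) variable has mean k2/(k2 - 2) = 20/18 for k2 > 2
print(f.mean())
```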

Theorem 5.7. The square of (Student's) t variable with $k$ df has an F distribution with $k_1 = 1$ df in the numerator and $k_2 = k$ df in the denominator.4 That is,

$$F_{1,k} = t_k^2$$

Note that for this equality to hold, the numerator df of the F variable must be 1. Thus, $F_{1,4} = t_4^2$ or $F_{1,23} = t_{23}^2$, and so on.
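This identity can be checked exactly on the distributions' percentage points (a sketch assuming SciPy is available): the 5 percent critical value of $F_{1,4}$ equals the squared two-tailed 5 percent critical value of $t_4$:

```python
from scipy import stats

k = 4
t_crit = stats.t.ppf(0.975, df=k)         # two-tailed 5% critical value of t_k
f_crit = stats.f.ppf(0.95, dfn=1, dfd=k)  # 5% critical value of F(1, k)

# Theorem 5.7: F(1, k) = t_k squared, so the two critical values agree
print(t_crit ** 2, f_crit)
```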

As noted, we will see the practical utility of the preceding theorems as we progress.

Theorem 5.8. For large denominator df, the numerator df times the F value is approximately equal to the chi-square value with the numerator df. Thus,

$$m\,F_{m,n} = \chi_m^2 \quad \text{as } n \to \infty$$
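This limit can be seen on the percentage points (again a sketch assuming SciPy): as the denominator df $n$ grows, $m$ times the $F_{m,n}$ critical value approaches the $\chi_m^2$ critical value:

```python
from scipy import stats

m = 5
chi2_crit = stats.chi2.ppf(0.95, df=m)   # 5% chi-square critical value, m df

for n in (10, 100, 10_000):
    approx = m * stats.f.ppf(0.95, dfn=m, dfd=n)
    print(n, approx)                     # approaches chi2_crit as n grows

print("chi-square:", chi2_crit)
```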

Theorem 5.9. For sufficiently large df, the chi-square distribution can be approximated by the standard normal distribution as follows:

$$\sqrt{2\chi^2} - \sqrt{2k - 1} \sim N(0, 1)$$

where $k$ denotes df.
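A simulation sketch of this approximation (NumPy, with df $k = 100$ chosen by us): the transformed chi-square draws should have roughly zero mean and unit standard deviation:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 100

chi2 = rng.chisquare(df=k, size=200_000)

# Theorem 5.9: sqrt(2 * chi2) - sqrt(2k - 1) is approximately N(0, 1)
z = np.sqrt(2 * chi2) - np.sqrt(2 * k - 1)
print(z.mean(), z.std())  # close to 0 and 1
```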

## 5A.2 Derivation of Equation (5.3.2)

Let

$$Z_1 = \frac{(\hat{\beta}_2 - \beta_2)\sqrt{\sum x_i^2}}{\sigma} \quad \text{and} \quad Z_2 = (n - 2)\frac{\hat{\sigma}^2}{\sigma^2}$$

Provided $\sigma$ is known, $Z_1$ follows the standardized normal distribution; that is, $Z_1 \sim N(0, 1)$. (Why?) $Z_2$ follows the $\chi^2$ distribution with $(n - 2)$ df.5
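The point of the derivation, that replacing the unknown $\sigma$ by $\hat{\sigma}$ turns the standardized $\hat{\beta}_2$ into a t variable with $n - 2$ df, can be checked by simulating the two-variable regression (a NumPy sketch; the values of $\beta_1$, $\beta_2$, $\sigma$, and the fixed regressor below are our assumptions, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 100_000
beta1, beta2, sigma = 2.0, 0.5, 1.0      # assumed true parameter values
x = np.arange(1.0, n + 1)                # assumed fixed regressor values
xc = x - x.mean()
sum_x2 = (xc ** 2).sum()                 # sum of squared deviations of X

# Simulate Yi = beta1 + beta2*Xi + ui many times and form the t ratio
y = beta1 + beta2 * x + rng.normal(0.0, sigma, (reps, n))
yc = y - y.mean(axis=1, keepdims=True)
b2 = (yc * xc).sum(axis=1) / sum_x2      # OLS slope estimates
b1 = y.mean(axis=1) - b2 * x.mean()
resid = y - b1[:, None] - b2[:, None] * x
sigma_hat2 = (resid ** 2).sum(axis=1) / (n - 2)  # unbiased sigma^2 estimator
t = (b2 - beta2) / np.sqrt(sigma_hat2 / sum_x2)  # (beta2-hat - beta2)/se

# Student's t with n - 2 = 8 df has mean 0 and variance 8/6 = 1.333
print(t.mean(), t.var())
```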