$$E(uu^T) = \sigma^2 I_n$$

where $I_n$ is the identity matrix of order $n$. The assumption above about the errors simply states that they are pairwise uncorrelated, since $E(u_i u_j) = 0$ for all $i \neq j$. They also have the same variance. If we further assume that the joint distribution of these $n$ errors is normal, then they will be independent, since in the case of joint normality, lack of correlation implies independence, and vice versa. Then we say that the $u_i$s are independently and identically distributed, or i.i.d., and that they are spherical.

2. The $X$s are fixed, real numbers. In other words, in contrast to the random behavior of the $u_i$s, the $X$s do not add to the randomness that is transmitted to the $y$s. Hence the stochastic nature of the model is entirely due to the randomness of the errors. We will also assume that the matrix $X^T X$ is nonsingular.
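Written out element by element, the assumption on the errors takes the following form. This is only a restatement, with $\sigma^2$ denoting the common variance of the errors and $E(u) = 0$ taken as given:

$$
E(uu^T) =
\begin{bmatrix}
E(u_1^2) & E(u_1 u_2) & \cdots & E(u_1 u_n) \\
E(u_2 u_1) & E(u_2^2) & \cdots & E(u_2 u_n) \\
\vdots & \vdots & \ddots & \vdots \\
E(u_n u_1) & E(u_n u_2) & \cdots & E(u_n^2)
\end{bmatrix}
=
\begin{bmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{bmatrix}
= \sigma^2 I_n
$$

The zeros off the main diagonal express the lack of correlation between any two errors, and the common entry on the diagonal expresses the equal variances.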

Least-squares estimation involves minimizing the sum of squared deviations of the predicted from the actual series. The solution that minimizes these squared deviations leads to the best choice of estimator. That is, for each choice of estimator, there is a corresponding predicted (or estimated, or fitted) value of the dependent variable, and we choose the estimator that makes these fitted values mimic the actual data as well as possible. The problem is to minimize the objective function given by

$$\sum_{i=1}^{n} e_i^2 = e^T e = (y - Xb)^T (y - Xb)$$

Note that the objective function is defined in terms of observable quantities, the $e_i$, which are the residuals from the estimation. The residuals are to be distinguished from the errors, the $u_i$, which are unobservable. We use $b$ to denote the estimator of $\beta$, which is, of course, unknown. Different choices of $b$ lead to different values of the objective function, since the regression residuals will be different. We then try to choose the value of $b$ that minimizes the above objective function. Since this objective function is a scalar, all of its components are also scalars. Hence $b^T X^T y = y^T X b$, since taking the transpose of a scalar leaves it unchanged. That is, $b^T X^T y = (y^T X b)^T$. The objective function then becomes

$$e^T e = y^T y - 2 b^T X^T y + b^T X^T X b$$
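The equality of the two forms of the objective function can be checked numerically. The following is a minimal sketch in plain Python, assuming made-up data for the regressor matrix and the dependent variable and a few arbitrary trial coefficient vectors; the helper names (`sse_direct`, `sse_expanded`, and so on) are hypothetical and do not come from the text. It also illustrates that different choices of the estimator give different values of the objective function.

```python
# Plain-Python check (no external libraries) of the expansion
#   e'e = y'y - 2 b'X'y + b'X'Xb
# The data below are invented purely for illustration.

def matvec(A, v):
    """Multiply a matrix A (list of rows) by a vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def sse_direct(X, y, b):
    """Sum of squared residuals computed from e = y - Xb."""
    fitted = matvec(X, b)
    e = [yi - fi for yi, fi in zip(y, fitted)]
    return dot(e, e)

def sse_expanded(X, y, b):
    """The same quantity via y'y - 2 b'X'y + b'X'Xb."""
    Xt = transpose(X)
    Xty = matvec(Xt, y)               # X'y
    XtXb = matvec(Xt, matvec(X, b))   # X'Xb
    return dot(y, y) - 2.0 * dot(b, Xty) + dot(b, XtXb)

# Hypothetical data: a constant term plus one regressor, four observations.
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0], [1.0, 7.0]]
y = [3.0, 4.0, 8.0, 10.0]

# Each trial b produces a different value of the objective function,
# and the two ways of computing it agree.
for b in ([0.0, 1.0], [0.5, 1.5], [1.0, 1.0]):
    print(b, sse_direct(X, y, b), sse_expanded(X, y, b))
```

In practice a linear algebra library would replace the hand-written helpers; they are spelled out here only to keep the sketch free of dependencies.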

In chapter 12 we will investigate the solution to the above minimization problem. The optimal choice of $b$ is known as the ordinary least-squares (OLS) estimator of $\beta$. ■

Example 10.25 The Generalized Least-Squares Transformation

Suppose that the linear regression model $y = X\beta + u$ has errors that are nonspherical. In this case $u \sim N(0, \sigma^2 \Omega)$, where $\Omega$ is a positive definite matrix. However, $\Omega \neq I$, and it may have a nonconstant main diagonal and/or possibly nonzero off-diagonal elements. The classical least-squares model of example 10.24 has certain desirable statistical properties that are partly the result of the i.i.d. (spherical) structure of the errors. Given that the errors in the present model are $u \sim N(0, \sigma^2 \Omega)$, we need to transform them so that they will be spherical. In other words, we want to find a transformation $T$ such that the variance-covariance matrix of the transformed errors $Tu$ is $\sigma^2 I$. We take $T$ to be an $n \times n$ fixed matrix. The variance of $Tu$ is given below.

Since $E(Tu) = TE(u) = 0$, we get

$$\operatorname{var}(Tu) = E(Tuu^T T^T) = T E(uu^T) T^T = \sigma^2 T \Omega T^T$$
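As a small illustration of this formula (the numbers are invented for the purpose), suppose there are two errors that are uncorrelated but have unequal variances, so that the matrix in the error variance is diagonal. Scaling each error by the reciprocal of the square root of its diagonal entry gives

$$
\Omega = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}, \quad
T = \begin{bmatrix} 1/2 & 0 \\ 0 & 1/3 \end{bmatrix}, \quad
\operatorname{var}(Tu) = \sigma^2 T \Omega T^T = \sigma^2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$

so the transformed errors share the common variance $\sigma^2$ and remain uncorrelated.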

If we choose $T$ such that $T \Omega T^T = I$, then transforming the model and applying least squares to the transformed model will bring us back to the environment where least squares is optimal. Below we will demonstrate that such a transformation exists. By assumption, $\Omega$ is a positive definite symmetric matrix. Then we can use theorem 10.6 to write

$$Q^T \Omega Q = \Lambda$$

where $Q$ is the orthogonal matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues. Then, by pre- and postmultiplying the equation above by $Q$ and $Q^T$, respectively, we get

$$\Omega = Q \Lambda Q^T$$

since $QQ^T = I$. We can take the square root of the eigenvalues because they are all positive. That leads us to

$$\Omega = Q \Lambda^{1/2} \Lambda^{1/2} Q^T = PP^T$$

where $P$ is a nonsingular matrix defined as $Q \Lambda^{1/2}$. By choosing $T = P^{-1}$, we obtain

$$T \Omega T^T = P^{-1} P P^T (P^{-1})^T = P^T (P^T)^{-1} = I$$

since we can interchange inversion and transposition. Having found the appropriate transformation $T$, we can apply it to the model as a whole. We can write the transformed model as

$$Ty = TX\beta + Tu$$

Then the objective function to be minimized becomes

$$e_*^T e_* = (Ty - TXb_*)^T (Ty - TXb_*) = (y - Xb_*)^T T^T T (y - Xb_*)$$

Again the solution involves appropriately choosing $b_*$ to minimize $e_*^T e_*$. The optimal choice of $b_*$ is known as the generalized least-squares (GLS) estimator of $\beta$. ■
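The construction of the transformation in this example can be traced in a small numerical sketch. The code below assumes a made-up $2 \times 2$ positive definite error covariance structure with a nonzero off-diagonal entry, for which the eigenvalues and eigenvectors have a closed form. It relies on the fact that, for an orthogonal $Q$, the inverse of $Q \Lambda^{1/2}$ is $\Lambda^{-1/2} Q^T$. All function names and numbers are illustrative assumptions, and the variable `b` stands for an arbitrary trial coefficient vector in the transformed objective function.

```python
import math

# Illustrative sketch: build T = P^{-1}, with P = Q Lambda^{1/2}, for a
# 2x2 positive definite Omega. Omega, X, y and b are invented numbers.

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sub(A, B):
    """Elementwise difference of two matrices of the same shape."""
    return [[a - c for a, c in zip(ra, rb)] for ra, rb in zip(A, B)]

def eig_sym_2x2(S):
    """Eigenvalues and orthonormal eigenvectors of a symmetric 2x2 matrix
    with a nonzero off-diagonal entry (closed-form formulas)."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    eigenvalues = [mean + radius, mean - radius]
    columns = []
    for lam in eigenvalues:
        v = (b, lam - a)                       # solves (S - lam*I) v = 0
        norm = math.hypot(v[0], v[1])
        columns.append((v[0] / norm, v[1] / norm))
    Q = [[columns[0][0], columns[1][0]],
         [columns[0][1], columns[1][1]]]       # eigenvectors as columns
    return eigenvalues, Q

def gls_transform(Omega):
    """Return T = Lambda^{-1/2} Q', the inverse of P = Q Lambda^{1/2}."""
    eigenvalues, Q = eig_sym_2x2(Omega)
    if min(eigenvalues) <= 0:
        raise ValueError("Omega must be positive definite")
    # Scaling row i of Q' by 1/sqrt(lambda_i) is the same as
    # premultiplying Q' by Lambda^{-1/2}.
    return [[q / math.sqrt(lam) for q in row]
            for lam, row in zip(eigenvalues, transpose(Q))]

Omega = [[2.0, 0.5], [0.5, 1.0]]
T = gls_transform(Omega)
TOT = matmul(matmul(T, Omega), transpose(T))   # numerically close to I
print("T Omega T' =", TOT)

# Transformed objective function for one trial b (column vectors as 2x1).
X = [[1.0], [2.0]]
y = [[1.5], [2.5]]
b = [[0.8]]

resid = sub(y, matmul(X, b))                             # y - Xb
resid_t = sub(matmul(T, y), matmul(matmul(T, X), b))     # Ty - TXb
lhs = sum(row[0] ** 2 for row in resid_t)                # (Ty-TXb)'(Ty-TXb)
TtT = matmul(transpose(T), T)
rhs = matmul(matmul(transpose(resid), TtT), resid)[0][0] # (y-Xb)'T'T(y-Xb)
print(lhs, rhs)
```

The two printed values of the objective function agree up to rounding, and the first printed matrix is the identity up to rounding. For larger problems one would rely on a library eigenvalue routine rather than the closed-form $2 \times 2$ formulas used here.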
