Now $\sum y_i^2$ is independent of the number of X variables in the model because it is simply $\sum (Y_i - \bar{Y})^2$. The RSS, $\sum \hat{u}_i^2$, however, depends on the number of regressors present in the model. Intuitively, it is clear that as the number of X variables increases, $\sum \hat{u}_i^2$ is likely to decrease (at least it will not increase); hence $R^2$ as defined in (7.8.1) will increase or, at worst, stay the same. In view of this, in comparing two regression models with the same dependent variable but differing numbers of X variables, one should be very wary of choosing the model with the highest $R^2$.
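The point that adding regressors cannot lower the unadjusted coefficient of determination can be checked numerically. The following is a minimal sketch, not taken from the text: the simulated data, the random seed, and the helper name `r_squared` are all assumptions made for illustration. It fits a sequence of nested OLS models, each adding one regressor that is pure noise, and records 1 - RSS/TSS for each fit.

```python
import numpy as np


def r_squared(X, y):
    """Unadjusted R^2 = 1 - RSS/TSS for an OLS fit that includes an intercept.

    Hypothetical helper for this illustration only.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])        # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    rss = float(residuals @ residuals)               # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())         # total sum of squares
    return 1.0 - rss / tss


# Simulated data (an assumption): y depends on one regressor only.
rng = np.random.default_rng(0)
n = 50
x_relevant = rng.normal(size=n)
noise_regressors = rng.normal(size=(n, 5))           # unrelated to y by construction
y = 2.0 + 1.5 * x_relevant + rng.normal(size=n)

# Nested models: the relevant regressor plus the first k noise regressors.
r2_values = []
for k in range(6):
    X = np.column_stack([x_relevant, noise_regressors[:, :k]])
    r2_values.append(r_squared(X, y))

for k, r2 in enumerate(r2_values):
    print(f"regressors = {k + 1}: R^2 = {r2:.4f}")
```

Because TSS is the same for every fit, any change in the printed values comes entirely from RSS, which cannot rise when a regressor is added to a nested model. The irrelevant regressors therefore never reduce the reported value, which is why the highest value alone is a poor guide for choosing between models of different size.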

To compare two $R^2$ terms, one must take into account the number of X variables present in the model. This can be done readily if we consider an alternative coefficient of determination, which is as follows:

