H0: σ² = σ₀²    H1: σ² > σ₀²    Reject H0 if df(σ̂²)/σ₀² > χ²α,df
H0: σ² = σ₀²    H1: σ² < σ₀²    Reject H0 if df(σ̂²)/σ₀² < χ²(1−α),df
H0: σ² = σ₀²    H1: σ² ≠ σ₀²    Reject H0 if df(σ̂²)/σ₀² > χ²α/2,df or < χ²(1−α/2),df

Note: σ₀² is the value of σ² under the null hypothesis. The first subscript on χ² in the last column is the level of significance, and the second subscript is the degrees of freedom. These are critical chi-square values. Note that df is (n − 2) for the two-variable regression model, (n − 3) for the three-variable regression model, and so on.
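The chi-square decision rule in the table can be sketched in a few lines of Python using scipy.stats; the numerical values below (n = 10, a hypothesized variance of 10 and an estimated variance of 18) are hypothetical, chosen only to illustrate the mechanics.

```python
from scipy.stats import chi2

# Hypothetical numbers (not from the text): a two-variable model with
# n = 10, so df = n - 2 = 8; sigma0_sq is the variance under H0 and
# sigma_hat_sq is its sample estimate.
df = 8
alpha = 0.05
sigma0_sq = 10.0
sigma_hat_sq = 18.0

# Test statistic: df * sigma_hat_sq / sigma0_sq ~ chi-square(df) under H0.
chi2_stat = df * sigma_hat_sq / sigma0_sq

# In the table's notation, chi2_{alpha,df} leaves area alpha in the RIGHT
# tail, i.e. it is the (1 - alpha) quantile of the distribution.
crit_right = chi2.ppf(1 - alpha, df)   # for H1: sigma^2 > sigma0^2
crit_left = chi2.ppf(alpha, df)        # for H1: sigma^2 < sigma0^2

reject_right_tail = chi2_stat > crit_right
```

With these illustrative numbers the statistic (14.4) falls short of the right-tail critical value, so the null variance would not be rejected against the one-sided alternative.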

Gujarati, Basic Econometrics, Fourth Edition. Part I: Single-Equation Regression Models. Chapter 5: Two-Variable Regression: Interval Estimation and Hypothesis Testing. © The McGraw-Hill Companies, 2004.

134 PART ONE: SINGLE-EQUATION REGRESSION MODELS

5.8 HYPOTHESIS TESTING: SOME PRACTICAL ASPECTS

The Meaning of "Accepting" or "Rejecting" a Hypothesis

If on the basis of a test of significance, say, the t test, we decide to "accept" the null hypothesis, all we are saying is that on the basis of the sample evidence we have no reason to reject it; we are not saying that the null hypothesis is true beyond any doubt. Why? To answer this, let us revert to our consumption-income example and assume that H0: β2 (MPC) = 0.50. Now the estimated value of the MPC is β̂2 = 0.5091 with a se(β̂2) = 0.0357. Then on the basis of the t test we find that t = (0.5091 − 0.50)/0.0357 = 0.25, which is insignificant, say, at α = 5%. Therefore, we say "accept" H0. But now let us assume H0: β2 = 0.48. Applying the t test, we obtain t = (0.5091 − 0.48)/0.0357 = 0.82, which too is statistically insignificant. So now we say "accept" this H0. Which of these two null hypotheses is the "truth"? We do not know. Therefore, in "accepting" a null hypothesis we should always be aware that another null hypothesis may be equally compatible with the data. It is therefore preferable to say that we may accept the null hypothesis rather than that we (do) accept it. Better still,
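The two t tests in this paragraph can be reproduced directly (a sketch in Python; the critical value 2.306 is the standard two-tail 5 percent t value for 8 df, the degrees of freedom of the consumption-income example):

```python
# Two null hypotheses tested against the same estimate: neither can be
# rejected, which is why "accept" really means "do not reject."
beta_hat = 0.5091   # estimated MPC
se = 0.0357         # its standard error
df = 8              # n - 2 for the two-variable model with n = 10

t_050 = (beta_hat - 0.50) / se   # H0: beta2 = 0.50
t_048 = (beta_hat - 0.48) / se   # H0: beta2 = 0.48

t_crit = 2.306  # two-tail critical t at alpha = 0.05, 8 df

# Both |t| values fall well inside the critical bounds, so the data are
# compatible with both nulls at the 5 percent level.
reject_050 = abs(t_050) > t_crit
reject_048 = abs(t_048) > t_crit
```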

. . . just as a court pronounces a verdict as "not guilty" rather than "innocent," so the conclusion of a statistical test is "do not reject" rather than "accept."11

The "Zero" Null Hypothesis and the "2-t" Rule of Thumb

A null hypothesis that is commonly tested in empirical work is H0: β2 = 0, that is, the slope coefficient is zero. This "zero" null hypothesis is a kind of straw man, the objective being to find out whether Y is related at all to X, the explanatory variable. If there is no relationship between Y and X to begin with, then testing a hypothesis such as β2 = 0.3 or any other value is meaningless.

This null hypothesis can be easily tested by the confidence interval or the t-test approach discussed in the preceding sections. But very often such formal testing can be shortcut by adopting the "2-t" rule of significance, which may be stated as

"2-t" Rule of Thumb. If the number of degrees of freedom is 20 or more and if α, the level of significance, is set at 0.05, then the null hypothesis β2 = 0 can be rejected if the t value [= β̂2/se(β̂2)] computed from (5.3.2) exceeds 2 in absolute value.
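The rule of thumb can be written as a small helper, and its rationale checked against the exact critical values (a sketch using scipy.stats; the function name is illustrative):

```python
from scipy.stats import t

def two_t_rule(t_value, df, alpha=0.05):
    """The '2-t' rule of thumb: with df >= 20 and alpha = 0.05,
    reject H0: beta2 = 0 when |t| exceeds 2 in absolute value."""
    if df < 20 or alpha != 0.05:
        raise ValueError("rule of thumb applies only for df >= 20, alpha = 0.05")
    return abs(t_value) > 2

# Why it works: the exact two-tail 5 percent critical value stays close
# to 2 once df reaches 20, falling from about 2.086 toward 1.96.
exact_crit_20 = t.ppf(0.975, 20)
exact_crit_200 = t.ppf(0.975, 200)
```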

The rationale for this rule is not too difficult to grasp. From (5.7.1) we know that we will reject H0: β2 = 0 if

t = β̂2/se(β̂2) > tα/2   when β̂2 > 0

11Jan Kmenta, Elements of Econometrics, Macmillan, New York, 1971, p. 114.


CHAPTER FIVE: TWO VARIABLE REGRESSION: INTERVAL ESTIMATION AND HYPOTHESIS TESTING 135

or when

t = β̂2/se(β̂2) < −tα/2   when β̂2 < 0

for the appropriate degrees of freedom.

Now if we examine the t table given in Appendix D, we see that for df of about 20 or more a computed t value in excess of 2 (in absolute terms), say, 2.1, is statistically significant at the 5 percent level, implying rejection of the null hypothesis. Therefore, if we find that for 20 or more df the computed t value is, say, 2.5 or 3, we do not even have to refer to the t table to assess the significance of the estimated slope coefficient. Of course, one can always refer to the t table to obtain the precise level of significance, and one should always do so when the df are fewer than, say, 20.

In passing, note that if we are testing the one-sided hypothesis β2 = 0 versus β2 > 0 or β2 < 0, then we should reject the null hypothesis if

|t| = |β̂2/se(β̂2)| > tα,df

If we fix α at 0.05, then from the t table we observe that for 20 or more df a t value in excess of 1.73 is statistically significant at the 5 percent level of significance (one-tail). Hence, whenever a t value exceeds, say, 1.8 (in absolute terms) and the df are 20 or more, one need not consult the t table for the statistical significance of the observed coefficient. Of course, if we choose α at 0.01 or any other level, we will have to decide on the appropriate t value as the benchmark value. But by now the reader should be able to do that.
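The one-tail benchmarks mentioned here can likewise be pulled from the t distribution (a sketch using scipy.stats):

```python
from scipy.stats import t

# One-tail critical values behind the text's "1.73" benchmark: the upper
# 5 percent point of t at df = 20, which shrinks toward 1.645 as df grows.
crit_20 = t.ppf(0.95, 20)
crit_120 = t.ppf(0.95, 120)

# If alpha were set at 0.01 instead, the benchmark would be higher.
crit_1pct_20 = t.ppf(0.99, 20)
```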

Given the null and the alternative hypotheses, testing them for statistical significance should no longer be a mystery. But how does one formulate these hypotheses? There are no hard-and-fast rules.12 Very often the phenomenon under study will suggest the nature of the null and alternative hypotheses. For example, consider the capital market line (CML) of portfolio theory, which postulates that Ei = β1 + β2σi, where E = expected return on portfolio and σ = the standard deviation of return, a measure of risk. Since return and risk are expected to be positively related—the higher the risk, the

12For an interesting discussion about formulating hypotheses, see J. Bradford De Long and Kevin Lang, "Are All Economic Hypotheses False?" Journal of Political Economy, vol. 100, no. 6, 1992, pp. 1257-1272.


higher the return—the natural alternative hypothesis to the null hypothesis that β2 = 0 would be β2 > 0. That is, one would not choose to consider values of β2 less than zero.

But consider the case of the demand for money. As we shall show later, one of the important determinants of the demand for money is income. Prior studies of the money demand functions have shown that the income elasticity of demand for money (the percent change in the demand for money for a 1 percent change in income) has typically ranged between 0.7 and 1.3. Therefore, in a new study of demand for money, if one postulates that the income-elasticity coefficient β2 is 1, the alternative hypothesis could be that β2 ≠ 1, a two-sided alternative hypothesis.

Thus, theoretical expectations or prior empirical work or both can be relied upon to formulate hypotheses. But no matter how the hypotheses are formed, it is extremely important that the researcher establish these hypotheses before carrying out the empirical investigation. Otherwise, he or she will be guilty of circular reasoning or self-fulfilling prophecies. That is, if one were to formulate hypotheses after examining the empirical results, there may be the temptation to form hypotheses that justify one's results. Such a practice should be avoided at all costs, at least for the sake of scientific objectivity. Keep in mind the Stigler quotation given at the beginning of this chapter!

Choosing α, the Level of Significance

It should be clear from the discussion so far that whether we reject or do not reject the null hypothesis depends critically on α, the level of significance or the probability of committing a Type I error—the probability of rejecting the true hypothesis. In Appendix A we discuss fully the nature of a Type I error, its relationship to a Type II error (the probability of accepting the false hypothesis), and why classical statistics generally concentrates on a Type I error. But even then, why is α commonly fixed at the 1, 5, or at the most 10 percent levels? As a matter of fact, there is nothing sacrosanct about these values; any other values will do just as well.

In an introductory book like this it is not possible to discuss in depth why one chooses the 1, 5, or 10 percent levels of significance, for that will take us into the field of statistical decision making, a discipline unto itself. A brief summary, however, can be offered. As we discuss in Appendix A, for a given sample size, if we try to reduce a Type I error, a Type II error increases, and vice versa. That is, given the sample size, if we try to reduce the probability of rejecting the true hypothesis, we at the same time increase the probability of accepting the false hypothesis. So there is a tradeoff involved between these two types of errors, given the sample size. Now the only way we can decide about the tradeoff is to find out the relative costs of the two types of errors. Then,

If the error of rejecting the null hypothesis which is in fact true (Error Type I) is costly relative to the error of not rejecting the null hypothesis which is in fact


5. Two-Variable Regression: Interval Estimation and Hypothesis Testing

CHAPTER FIVE: TWO VARIABLE REGRESSION: INTERVAL ESTIMATION AND HYPOTHESIS TESTING 137

false (Error Type II), it will be rational to set the probability of the first kind of error low. If, on the other hand, the cost of making Error Type I is low relative to the cost of making Error Type II, it will pay to make the probability of the first kind of error high (thus making the probability of the second type of error low).13

Of course, the rub is that we rarely know the costs of making the two types of errors. Thus, applied econometricians generally follow the practice of setting the value of α at a 1 or a 5 or at most a 10 percent level and choose a test statistic that would make the probability of committing a Type II error as small as possible. Since one minus the probability of committing a Type II error is known as the power of the test, this procedure amounts to maximizing the power of the test. (See Appendix A for a discussion of the power of a test.)
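The tradeoff between the two error types, for a fixed sample size, can be illustrated with a small simulation (this Monte Carlo is not from the text; the setup, a z test of H0: μ = 0 on samples drawn from N(0.5, 1), is hypothetical):

```python
import random

# Because the samples come from N(0.5, 1), H0: mu = 0 is in fact false,
# so every non-rejection is a Type II error. Lowering alpha (fewer Type I
# errors) raises the Type II error rate, given the sample size.
random.seed(42)
n, reps = 20, 2000
z_crit = {0.05: 1.96, 0.01: 2.576}   # two-tail normal critical values
type2 = {0.05: 0, 0.01: 0}

for _ in range(reps):
    xs = [random.gauss(0.5, 1.0) for _ in range(n)]
    z = (sum(xs) / n) / (1.0 / n ** 0.5)   # z test with known sigma = 1
    for alpha, crit in z_crit.items():
        if abs(z) <= crit:                  # failure to reject a false H0
            type2[alpha] += 1

type2_rate_05 = type2[0.05] / reps
type2_rate_01 = type2[0.01] / reps
# The stricter test (alpha = 0.01) misses the false null more often.
```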

But this problem of choosing the appropriate value of α can be avoided if we use what is known as the p value of the test statistic, which is discussed next.

The Exact Level of Significance: The p Value

As just noted, the Achilles heel of the classical approach to hypothesis testing is its arbitrariness in selecting α. Once a test statistic (e.g., the t statistic) is obtained in a given example, why not simply go to the appropriate statistical table and find out the actual probability of obtaining a value of the test statistic as much as or greater than that obtained in the example? This probability is called the p value (i.e., probability value), also known as the observed or exact level of significance or the exact probability of committing a Type I error. More technically, the p value is defined as the lowest significance level at which a null hypothesis can be rejected.

To illustrate, let us return to our consumption-income example. Given the null hypothesis that the true MPC is 0.3, we obtained a t value of 5.86 in (5.7.4). What is the p value of obtaining a t value of as much as or greater than 5.86? Looking up the t table given in Appendix D, we observe that for 8 df the probability of obtaining such a t value must be much smaller than 0.001 (one-tail) or 0.002 (two-tail). By using the computer, it can be shown that the probability of obtaining a t value of 5.86 or greater (for 8 df) is about 0.000189.14 This is the p value of the observed t statistic. This observed, or exact, level of significance of the t statistic is much smaller than the conventionally, and arbitrarily, fixed level of significance, such as 1, 5, or 10 percent. As a matter of fact, if we were to use the p value just computed,

13Jan Kmenta, Elements of Econometrics, Macmillan, New York, 1971, pp. 126-127.

14One can obtain the p value using electronic statistical tables to several decimal places. Unfortunately, the conventional statistical tables, for lack of space, cannot be that refined. Most statistical packages now routinely print out the p values.




and reject the null hypothesis that the true MPC is 0.3, the probability of our committing a Type I error is only about 0.02 percent, that is, only about 2 in 10,000!
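The exact p value quoted above can be recovered from the t distribution (a sketch using scipy.stats):

```python
from scipy.stats import t

# The probability, under H0, of a t value of 5.86 or more with 8 df --
# the figure the text reports as about 0.000189.
t_obs = 5.86
df = 8
p_one_tail = t.sf(t_obs, df)    # P(T >= 5.86), the survival function
p_two_tail = 2 * p_one_tail     # for a two-sided alternative
```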

As we noted earlier, if the data do not support the null hypothesis, |t| obtained under the null hypothesis will be "large" and therefore the p value of obtaining such a |t| value will be "small." In other words, for a given sample size, as | t| increases, the p value decreases, and one can therefore reject the null hypothesis with increasing confidence.

What is the relationship of the p value to the level of significance α? If we make a habit of fixing α equal to the p value of a test statistic (e.g., the t statistic), then there is no conflict between the two values. To put it differently, it is better to give up fixing α arbitrarily at some level and simply choose the p value of the test statistic. It is preferable to leave it to the reader to decide whether to reject the null hypothesis at the given p value. If in an application the p value of a test statistic happens to be, say, 0.145, or 14.5 percent, and if the reader wants to reject the null hypothesis at this (exact) level of significance, so be it. Nothing is wrong with taking a chance of being wrong 14.5 percent of the time if you reject the true null hypothesis. Similarly, as in our consumption-income example, there is nothing wrong if the researcher wants to choose a p value of about 0.02 percent and not take a chance of being wrong more than 2 out of 10,000 times. After all, some investigators may be risk-lovers and some risk-averters!

In the rest of this text, we will generally quote the p value of a given test statistic. Some readers may want to fix a at some level and reject the null hypothesis if the p value is less than a. That is their choice.

Statistical Significance versus Practical Significance

Let us revert to our consumption-income example and now hypothesize that the true MPC is 0.61 (H0: β2 = 0.61). On the basis of our sample result of β̂2 = 0.5091, we obtained the interval (0.4268, 0.5914) with 95 percent confidence. Since this interval does not include 0.61, we can, with 95 percent confidence, say that our estimate is statistically significant, that is, significantly different from 0.61.
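The confidence-interval test here is a one-line check (a sketch using the interval reported in the text):

```python
# Confidence-interval approach to H0: beta2 = 0.61, using the 95 percent
# interval for the MPC reported in the text.
ci_low, ci_high = 0.4268, 0.5914
beta_null = 0.61

# H0 is rejected at the 5 percent level because the hypothesized value
# lies outside the interval.
reject = not (ci_low <= beta_null <= ci_high)
```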

But what is the practical or substantive significance of our finding? That is, what difference does it make if we take the MPC to be 0.61 rather than 0.5091? Is the 0.1009 difference between the two MPCs that important practically?

The answer to this question depends on what we really do with these estimates. For example, from macroeconomics we know that the income multiplier is 1/(1 - MPC). Thus, if MPC is 0.5091, the multiplier is 2.04, but it is 2.56 if MPC is equal to 0.61. That is, if the government were to increase its expenditure by $1 to lift the economy out of a recession, income will eventually increase by $2.04 if the MPC is 0.5091 but by $2.56 if the MPC is 0.61. And that difference could very well be crucial to resuscitating the economy.
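The multiplier arithmetic in this paragraph is easy to verify:

```python
# Practical significance of the 0.1009 gap between the two MPC values:
# the simple Keynesian income multiplier 1/(1 - MPC) each one implies.
def multiplier(mpc):
    return 1.0 / (1.0 - mpc)

m_est = multiplier(0.5091)   # multiplier implied by the estimated MPC
m_alt = multiplier(0.61)     # multiplier implied by the hypothesized MPC
```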



The point of all this discussion is that one should not confuse statistical significance with practical, or economic, significance. As Goldberger notes:

When a null, say, βj = 1, is specified, the likely intent is that βj is close to 1, so close that for all practical purposes it may be treated as if it were 1. But whether 1.1 is "practically the same as" 1.0 is a matter of economics, not of statistics. One cannot resolve the matter by relying on a hypothesis test, because the test statistic [t = (bj − 1)/σ̂bj] measures the estimated coefficient in standard error units, which are not meaningful units in which to measure the economic parameter βj − 1. It may be a good idea to reserve the term "significance" for the statistical concept, adopting "substantial" for the economic concept.15

The point made by Goldberger is important. As sample size becomes very large, issues of statistical significance become much less important but issues of economic significance become critical. Indeed, since with very large samples almost any null hypothesis will be rejected, there may be studies in which the magnitude of the point estimates may be the only issue.
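Goldberger's point about very large samples can be made concrete with a quick calculation (illustrative, not from the text; the deviation of 0.01 and error standard deviation of 1 are hypothetical):

```python
from scipy.stats import norm

# A fixed, economically trivial deviation of 0.01 from the null becomes
# ever more "statistically significant" as n grows, simply because the
# standard error shrinks like 1/sqrt(n).
deviation = 0.01   # hypothetical gap between estimate and null value
sigma = 1.0        # hypothetical error standard deviation

p_values = {}
for n in (100, 10_000, 1_000_000):
    z = deviation / (sigma / n ** 0.5)
    p_values[n] = 2 * norm.sf(z)   # two-tail p value of the z statistic
```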

The Choice between Confidence-Interval and Test-of-Significance Approaches to Hypothesis Testing

In most applied economic analyses, the null hypothesis is set up as a straw man and the objective of the empirical work is to knock it down, that is, reject the null hypothesis. Thus, in our consumption-income example, the null hypothesis that the MPC β2 = 0 is patently absurd, but we often use it to dramatize the empirical results. Apparently editors of reputed journals do not find it exciting to publish an empirical piece that does not reject the null hypothesis. Somehow the finding that the MPC is statistically different from zero is more newsworthy than the finding that it is equal to, say, 0.7!

Thus, J. Bradford De Long and Kevin Lang argue that it is better for economists

. . . to concentrate on the magnitudes of coefficients and to report confidence levels and not significance tests. If all or almost all null hypotheses are false, there is little point in concentrating on whether or not an estimate is indistinguishable from its predicted value under the null. Instead, we wish to cast light on what models are good approximations, which requires that we know ranges of parameter values that are excluded by empirical estimates.16

In short, these authors prefer the confidence-interval approach to the test-of-significance approach. The reader may want to keep this advice in mind.17

15Arthur S. Goldberger, A Course in Econometrics, Harvard University Press, Cambridge, Massachusetts, 1991, p. 240. Note bj is the OLS estimator of βj and σ̂bj is its standard error. For a corroborating view, see D. N. McCloskey, "The Loss Function Has Been Mislaid: The Rhetoric of Significance Tests," American Economic Review, vol. 75, 1985, pp. 201-205. See also D. N. McCloskey and S. T. Ziliak, "The Standard Error of Regression," Journal of Economic Literature, vol. 37, 1996, pp. 97-114.

16See their article cited in footnote 12, p. 1271.

17For a somewhat different perspective, see Carter Hill, William Griffiths, and George Judge, Undergraduate Econometrics, Wiley & Sons, New York, 2001, p. 108.


