The G-test may be used both as a test of goodness-of-fit (comparing frequencies of one nominal variable to theoretical expectations) and as a test of independence (comparing frequencies of one nominal variable for different values of a second nominal variable). The underlying arithmetic of the test is the same. Goodness-of-fit tests and tests of independence are used for quite different experimental designs and test different null hypotheses, so I treat the G-test of goodness-of-fit and the G-test of independence as two distinct statistical tests.

The G-test of independence is an alternative to the chi-square test of independence. Most of the information on this page is identical to that on the chi-square page. You should read the section on "Chi-square vs. G-test", pick either chi-square or G-test, then stick with that choice for the rest of your life.

The G-test of independence is used when you have two nominal variables, each with two or more possible values. A data set like this is often called an "R×C table," where R is the number of rows and C is the number of columns. For example, if you surveyed the frequencies of three flower phenotypes (red, pink, white) in four geographic locations, you would have a 3×4 table. You could also consider it a 4×3 table; it doesn't matter which variable is the columns and which is the rows.

It is also possible to do a G-test of independence with more than two nominal variables, but that experimental design doesn't occur very often and is rather complicated to analyze and interpret, so I won't cover it.

The null hypothesis is that the relative proportions of one variable are independent of the second variable; in other words, the proportions of one variable are the same for different values of the second variable. In the flower example, you would probably say that the null hypothesis was that the proportions of red, pink and white were the same at the four locations.

For some experiments, you can express the null hypothesis in two different ways, and either would make sense. For example, when an individual clasps their hands, there is one comfortable position; either the right thumb is on top, or the left thumb is on top. Downey (1926) collected data on the frequency of right-thumb vs. left-thumb clasping in right-handed and left-handed individuals. You could say that the null hypothesis is that the proportion of right-thumb-clasping is the same for right-handed and left-handed individuals, or you could say that the proportion of right-handedness is the same for right-thumb-clasping and left-thumb-clasping individuals.

For other experiments, it only makes sense to express the null hypothesis one way. In the flower example, it would make sense to say that the null hypothesis is that the proportions of red, pink and white flowers are the same at the four geographic locations; it wouldn't make sense to say that the proportion of flowers at each location is the same for red, pink, and white flowers.

The math of the G-test of independence is the same as for the G-test of goodness-of-fit; only the method of calculating the expected frequencies is different. For the goodness-of-fit test, a theoretical relationship is used to calculate the expected frequencies. For the test of independence, only the observed frequencies are used to calculate the expected. For the hand-clasping example, Downey (1926) found 190 right-thumb and 149 left-thumb-claspers among right-handed women, and 42 right-thumb and 49 left-thumb-claspers among left-handed women. To calculate the expected frequency of right-thumb-claspers among right-handed women, you would first calculate the overall proportion of right-thumb-claspers: (190+42)/(190+42+149+49)=0.5395. Then you would multiply this overall proportion by the total number of right-handed women, 0.5395×(190+149)=182.9. This is the expected number of right-handed right-thumb-claspers under the null hypothesis; the observed number is 190. Similar calculations would be done for each of the cells in this 2×2 table of numbers.
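The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation described in the text; the function name `expected_frequencies` and the layout of the table (rows for handedness, columns for thumb on top) are my own choices, not part of any statistics package.

```python
# Observed counts from Downey (1926), as given in the text.
# Rows are handedness, columns are which thumb is on top.
observed = [
    [190, 149],  # right-handed women: right-thumb, left-thumb
    [42, 49],    # left-handed women:  right-thumb, left-thumb
]

def expected_frequencies(table):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
for obs_row, exp_row in zip(observed, expected):
    print(obs_row, [round(e, 1) for e in exp_row])
```

The top-left expected value is the 182.9 worked out by hand above. Multiplying the row total by the column total and dividing by the grand total is the same as multiplying the overall proportion by the row total, just written in one step.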

(In practice, the calculations for the G-test of independence use shortcuts that don't require calculating the expected frequencies; see Sokal and Rohlf, pp. 731-732.)
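As a rough illustration of such a shortcut, the sketch below compares the direct calculation, G = 2Σ O ln(O/E), with an algebraically equivalent form that needs only the cell counts, the row and column totals, and the grand total. I am assuming this identity is the kind of shortcut meant; it is the standard rearrangement, not a quotation from Sokal and Rohlf, and the function names are hypothetical.

```python
import math

# Hand-clasping counts from Downey (1926), as given in the text.
observed = [[190, 149], [42, 49]]

def xlnx(x):
    """x * ln(x), taking 0 * ln(0) as 0."""
    return x * math.log(x) if x > 0 else 0.0

def g_direct(table):
    """G = 2 * sum(O * ln(O / E)), with E = row total * column total / n."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return 2 * sum(
        obs * math.log(obs * n / (rows[i] * cols[j]))
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
        if obs > 0
    )

def g_shortcut(table):
    """G = 2 * [sum f ln f - sum R ln R - sum C ln C + n ln n]."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    cells = sum(xlnx(obs) for row in table for obs in row)
    return 2 * (cells - sum(xlnx(r) for r in rows)
                - sum(xlnx(c) for c in cols) + xlnx(n))

print(round(g_direct(observed), 4), round(g_shortcut(observed), 4))
```

The two functions should print the same value, since the second is just the first with the logarithm of each expected frequency expanded and the terms collected.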

The degrees of freedom in a test of independence are equal to (number of rows − 1) × (number of columns − 1). Thus for a 2×2 table, there are (2−1)×(2−1)=1 degree of freedom; for a 4×3 table, there are (4−1)×(3−1)=6 degrees of freedom.

Gardemann et al. (1998) surveyed genotypes at an insertion/deletion polymorphism of the apolipoprotein B signal peptide in 2259 men. Of men without coronary artery disease, 268 had the ins/ins genotype, 199 had the ins/del genotype, and 42 had the del/del genotype. Of men with coronary artery disease, there were 807 ins/ins, 759 ins/del, and 184 del/del.

The biological null hypothesis is that the apolipoprotein polymorphism doesn't affect the likelihood of getting coronary artery disease. The statistical null hypothesis is that the proportions of men with coronary artery disease are the same for each of the three genotypes.

The result is G=7.30, 2 d.f., P=0.026. This indicates that the null hypothesis can be rejected; the three genotypes have significantly different proportions of men with coronary artery disease.
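If you want to check this result yourself, here is a minimal Python sketch of the whole test using only the standard library. The function names are my own, and the P-value routine is a closed-form series for the chi-square upper tail that I am assuming is adequate for the small, whole-number degrees of freedom that come up in these tables; a statistics package would be the safer choice for real work.

```python
import math

# Gardemann et al. (1998): columns are ins/ins, ins/del, del/del.
observed = [
    [268, 199, 42],   # men without coronary artery disease
    [807, 759, 184],  # men with coronary artery disease
]

def g_test_independence(table):
    """Return (G, degrees of freedom) for an R x C table of counts."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # a zero cell contributes nothing to G
                expected = rows[i] * cols[j] / n
                g += 2 * obs * math.log(obs / expected)
    df = (len(rows) - 1) * (len(cols) - 1)
    return g, df

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), for integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (j - 0.5) / math.gamma(j + 0.5)
    return total

g, df = g_test_independence(observed)
p = chi2_upper_tail(g, df)
print(f"G={g:.2f}, {df} d.f., P={p:.3f}")
```

Run on the apolipoprotein data, this should reproduce the G, degrees of freedom and P-value reported above.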

[Figure: Spotted moray eel, *Gymnothorax moringa*.]

Young and Winn (2003) counted sightings of the spotted moray eel, *Gymnothorax moringa*, and the purplemouth moray eel, *G. vicinus*, in a 150-m by 250-m area of reef in Belize. They identified each eel they saw, and classified the locations of the sightings into three types: those in grass beds, those in sand and rubble, and those within one meter of the border between grass and sand/rubble. The numbers of sightings are shown in the table, with percentages in parentheses:

| Habitat | *G. moringa* | *G. vicinus* |
|---|---|---|
| Grass | 127 (25.9) | 116 (33.7) |
| Sand | 99 (20.2) | 67 (19.5) |
| Border | 264 (53.9) | 161 (46.8) |

The nominal variables are the species of eel (*G. moringa* or *G. vicinus*) and the habitat type (grass, sand, or border). The difference in habitat use between the species is significant (G=6.23, 2 d.f., P=0.044).
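The percentages in parentheses in the eel table are each habitat's share of that species' total sightings, which are also the proportions you would plot. A small Python sketch of that calculation follows; the dictionary layout and the function name `habitat_percentages` are just illustrative choices of mine.

```python
# Sightings from Young and Winn (2003), as given in the table.
sightings = {
    "G. moringa": {"Grass": 127, "Sand": 99, "Border": 264},
    "G. vicinus": {"Grass": 116, "Sand": 67, "Border": 161},
}

def habitat_percentages(counts):
    """Convert one species' counts to percentages of that species' total."""
    total = sum(counts.values())
    return {habitat: 100 * n / total for habitat, n in counts.items()}

percentages = {
    species: habitat_percentages(counts)
    for species, counts in sightings.items()
}

for species, pct in percentages.items():
    print(species, {habitat: round(v, 1) for habitat, v in pct.items()})
```

Dividing by the species total (rather than the habitat total or the grand total) is what makes the two columns comparable even though the two species were sighted different numbers of times.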

The data used in a test of independence are usually displayed with a bar graph, with the values of one variable on the X-axis and the proportions of the other variable on the Y-axis. If the variable on the Y-axis only has two values, you only need to plot one of them:

[Figure: A bar graph for when the nominal variable has only two values.]

If the variable on the Y-axis has more than two values, you should plot all of them. Sometimes pie charts are used for this:

[Figure: A pie chart for when the nominal variable has more than two values.]

But as much as I like pie, I think pie charts make it difficult to see small differences in the proportions, and difficult to show error bars. In this situation, I prefer bar graphs:

[Figure: A bar graph for when the nominal variable has more than two values.]

If the expected numbers in some classes are small, the G-test will give inaccurate results. In that case, you should try Fisher's exact test; if that doesn't work (because the total sample size is too big, or because there are too many values of one of the nominal variables), you can use the randomization test of independence. See the web page on small sample sizes for further discussion.

If the samples are not independent, but instead are before-and-after observations on the same individuals, you should use McNemar's test.

The chi-square test gives approximately the same results as the G-test. Unlike chi-square values, G-values are additive, which means they can be used for more elaborate statistical designs. G-tests are a subclass of likelihood ratio tests, a general category of tests that have many uses for testing the fit of data to mathematical models; the more elaborate versions of likelihood ratio tests don't have equivalent tests using the Pearson chi-square statistic. The G-test is therefore preferred by many, even for simpler designs. On the other hand, the chi-square test is more familiar to more people, and it's always a good idea to use statistics that your readers are familiar with when possible. You may want to look at the literature in your field and see which is more commonly used.

I have set up an Excel spreadsheet that performs this test for up to 10 columns and 50 rows. It is largely self-explanatory; you just enter your observed numbers, and the spreadsheet calculates the G-test statistic, the degrees of freedom, and the P-value.

There is a web page that will do a G-test of independence for up to a 10×10 table. Be sure to scroll to the bottom of the page and set the number of rows and columns.

Here is a SAS program that uses PROC FREQ for a G-test. It uses the handclasping data from above.

    data handclasp;
       input thumb $ hand $ count;
       cards;
    rightthumb righthand 190
    leftthumb  righthand 149
    rightthumb lefthand   42
    leftthumb  lefthand   49
    ;
    proc freq data=handclasp;
       weight count / zeros;
       tables thumb*hand / chisq;
    run;

The output includes the following:

    Statistics for Table of thumb by hand

    Statistic                              DF     Value     Prob
    ------------------------------------------------------------
    Chi-Square                              1    2.8265   0.0927
    Likelihood Ratio Chi-Square             1    2.8187   0.0932
    Continuity Adj. Chi-Square              1    2.4423   0.1181
    Cochran–Mantel–Haenszel Chi-Square      1    2.8199   0.0931
    Phi Coefficient                              0.0811
    Contingency Coefficient                      0.0808
    Cramer's V                                   0.0811

The "Likelihood Ratio Chi-Square" row gives the results of the G-test; in this case, G=2.8187, 1 d.f., P=0.0932.
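To see how close the chi-square and G statistics are for these data, here is a short Python sketch that computes both from the same expected frequencies. It is only a cross-check on the first two rows of the SAS output, written with the standard library; the helper `p_value_1df` is my own name, and it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so it works for 2×2 tables only.

```python
import math

# Hand-clasping counts from Downey (1926), as given in the text.
observed = [[190, 149], [42, 49]]

rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
n = sum(rows)

chi_square = 0.0  # Pearson chi-square: sum of (O - E)^2 / E
g = 0.0           # G statistic: 2 * sum of O * ln(O / E)
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = rows[i] * cols[j] / n
        chi_square += (obs - expected) ** 2 / expected
        if obs > 0:
            g += 2 * obs * math.log(obs / expected)

def p_value_1df(statistic):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

print(f"chi-square={chi_square:.4f}, P={p_value_1df(chi_square):.4f}")
print(f"G={g:.4f}, P={p_value_1df(g):.4f}")
```

The two statistics differ only in how they measure the distance between observed and expected counts, which is why they land so close together when the sample size is reasonably large.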

If each nominal variable has just two values (a 2×2 table), use the power analysis for Fisher's exact test.

If either nominal variable has more than two values, use the power analysis for chi-squared tests of independence.

Sokal and Rohlf, pp. 729-739.

Zar, pp. 505-506.

Picture of eel from Wikimedia Commons.

Downey, J.E. 1926. Further observations on the manner of clasping the hands. American Naturalist 60: 387-391.

Gardemann, A., D. Ohly, M. Fink, N. Katz, H. Tillmanns, F.W. Hehrlein, and W. Haberbosch. 1998. Association of the insertion/deletion gene polymorphism of the apolipoprotein B signal peptide with myocardial infarction. Atherosclerosis 141: 167-175.

Young, R.F., and H.E. Winn. 2003. Activity patterns, diet, and shelter site use for two species of
