Section 6.4 Homogeneity and independence in two-way tables
Google is constantly running experiments to test new search algorithms. For example, Google might test three algorithms using a sample of 10,000 google.com search queries. Table 6.4.2 shows an example of 10,000 queries split into three algorithm groups. (Footnote 1: Google regularly runs experiments in this manner to help improve its search engine. It is entirely possible that if you and a friend perform the same search, you will see different search results. While the data presented in this section resemble what might be encountered in a real experiment, these data are simulated.) The group sizes were specified before the start of the experiment to be 5000 for the current algorithm and 2500 for each test algorithm.
Search algorithm | current | test 1 | test 2 | Total
Counts | 5000 | 2500 | 2500 | 10000
What is the ultimate goal of the Google experiment? What are the null and alternative hypotheses, in regular words?
The ultimate goal is to see whether there is a difference in the performance of the algorithms. The hypotheses can be described as the following:
\(H_{0}:\) The algorithms each perform equally well.
\(H_{A}:\) The algorithms do not perform equally well.
In this experiment, the explanatory variable is the search algorithm. However, an outcome variable is also needed. This outcome variable should somehow reflect whether the search results align with the user's interests. One possible way to quantify this is to determine whether (1) there was no new, related search, and the user clicked one of the links provided, or (2) there was a new, related search performed by the user. Under scenario (1), we might think that the user was satisfied with the search results. Under scenario (2), the search results probably were not relevant, so the user tried a second search.
Table 6.4.4 provides the results from the experiment. These data are very similar to the count data in Section 6.3. However, now the different combinations of two variables are binned in a two-way table. In examining these data, we want to evaluate whether there is strong evidence that at least one algorithm is performing better than the others. To do so, we apply a chi-square test to this two-way table. The ideas of this test are similar to those ideas in the one-way table case. However, degrees of freedom and expected counts are computed a little differently than before.
Search algorithm | current | test 1 | test 2 | Total
No new search | 3511 | 1749 | 1818 | 7078
New search | 1489 | 751 | 682 | 2922
Total | 5000 | 2500 | 2500 | 10000
TIP: What is so different about one-way tables and two-way tables?
A one-way table describes counts for each outcome in a single variable. A two-way table describes counts for combinations of outcomes for two variables. When we consider a two-way table, we often would like to know whether the two variables are related in any way.
The hypothesis test for this Google experiment is really about assessing whether there is statistically significant evidence that the choice of the algorithm affects whether a user performs a second search. In other words, the goal is to check whether the three search algorithms perform differently.
Subsection 6.4.1 Expected counts in two-way tables
Example 6.4.5
From the experiment, we estimate the proportion of users who were satisfied with their initial search (no new search) as \(7078/10000 = 0.7078\text{.}\) If there really is no difference among the algorithms and 70.78% of people are satisfied with the search results, how many of the 5000 people in the “current algorithm” group would be expected to not perform a new search?
About 70.78% of the 5000 would be satisfied with the initial search:
\[ 0.7078 \times 5000 = 3539 \text{ users} \]
That is, if there was no difference between the three groups, then we would expect 3539 of the current algorithm users not to perform a new search.
Guided Practice 6.4.6
Using the same rationale described in Example 6.4.5, about how many users in each test group would not perform a new search if the algorithms were equally helpful? (Footnote 2: We would expect \(0.7078 \times 2500 = 1769.5\text{.}\) It is okay that this is a fraction.)
We can compute the expected number of users who would perform a new search for each group using the same strategy employed in Example 6.4.5 and Guided Practice 6.4.6. These expected counts were used to construct Table 6.4.7, which is the same as Table 6.4.4, except now the expected counts have been added in parentheses.
Search algorithm | current | test 1 | test 2 | Total
No new search | 3511 (3539) | 1749 (1769.5) | 1818 (1769.5) | 7078
New search | 1489 (1461) | 751 (730.5) | 682 (730.5) | 2922
Total | 5000 | 2500 | 2500 | 10000
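The examples and exercises above provided some help in computing expected counts. In general, expected counts for a two-way table may be computed using the row totals, column totals, and the table total. For instance, if there was no difference between the groups, then about 70.78% of each column should be in the first row:
\begin{align*} 0.7078 \times (\text{column 1 total}) &= 3539\\ 0.7078 \times (\text{column 2 total}) &= 1769.5\\ 0.7078 \times (\text{column 3 total}) &= 1769.5 \end{align*}
Looking back to how the fraction 0.7078 was computed, as the fraction of users who did not perform a new search (\(7078/10000\)), these three expected counts could have been computed as
\begin{align*} \left(\frac{\text{row 1 total}}{\text{table total}}\right) (\text{column 1 total}) &= 3539\\ \left(\frac{\text{row 1 total}}{\text{table total}}\right) (\text{column 2 total}) &= 1769.5\\ \left(\frac{\text{row 1 total}}{\text{table total}}\right) (\text{column 3 total}) &= 1769.5 \end{align*}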
This leads us to a general formula for computing expected counts in a two-way table when we would like to test whether there is strong evidence of an association between the column variable and row variable.
Computing expected counts in a two-way table
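To identify the expected count for the \(i^{th}\) row and \(j^{th}\) column, compute
\[ \text{Expected Count}_{\text{row } i,\ \text{col } j} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{table total}} \]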
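The expected-count rule (row total times column total, divided by the table total) can be sketched in a few lines of Python. This is only an illustration using the simulated counts from Table 6.4.4; the function name `expected_counts` is a hypothetical helper, not something defined in this text.

```python
# Observed counts from Table 6.4.4.
# Rows: no new search, new search. Columns: current, test 1, test 2.
observed = [
    [3511, 1749, 1818],
    [1489, 751, 682],
]


def expected_counts(table):
    """Return (row i total * column j total) / table total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(value, 1) for value in row])
```

The two printed rows should agree with the parenthesized values in Table 6.4.7.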
Subsection 6.4.2 The chi-square test of homogeneity for two-way tables
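The chi-square test statistic for a two-way table is found the same way it is found for a one-way table. For each table count, compute
\begin{align*} \text{General formula:} &\quad \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}\\ \text{Row 1, Col 1:} &\quad \frac{(3511 - 3539)^2}{3539} = 0.222\\ \text{Row 1, Col 2:} &\quad \frac{(1749 - 1769.5)^2}{1769.5} = 0.237\\ &\quad \vdots\\ \text{Row 2, Col 3:} &\quad \frac{(682 - 730.5)^2}{730.5} = 3.220 \end{align*}
Adding the computed value for each cell gives the chi-square test statistic \(X^2\text{:}\)
\[ X^2 = 0.222 + 0.237 + \dots + 3.220 = 6.120 \]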
Just like before, this test statistic follows a chi-square distribution. However, the degrees of freedom are computed a little differently for a two-way table. (Footnote 3: Recall that in the one-way table, the degrees of freedom was the number of groups minus 1.) For two-way tables, the degrees of freedom is equal to
\[ df = (\text{number of rows} - 1) \times (\text{number of columns} - 1) \]
In our example, the degrees of freedom parameter is
\[ df = (2 - 1) \times (3 - 1) = 2 \]
If the null hypothesis is true (i.e. the algorithms are equally useful), then the test statistic \(X^2 = 6.12\) closely follows a chi-square distribution with 2 degrees of freedom. Using this information, we can compute the p-value for the test, which is depicted in Figure 6.4.8.
Computing degrees of freedom for a two-way table
When applying the chi-square test to a two-way table, we use
\[ df = (R - 1) \times (C - 1) \]
where \(R\) is the number of rows in the table and \(C\) is the number of columns.
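As a rough check on the arithmetic, the test statistic and degrees of freedom for the Google table can be computed directly. The sketch below assumes the observed counts from Table 6.4.4; `chi_square_statistic` is a hypothetical helper name introduced only for this illustration.

```python
# Observed counts from Table 6.4.4.
# Rows: no new search, new search. Columns: current, test 1, test 2.
observed = [
    [3511, 1749, 1818],
    [1489, 751, 682],
]


def chi_square_statistic(table):
    """Return (X^2, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / table_total
            statistic += (obs - exp) ** 2 / exp  # one cell's contribution

    df = (len(table) - 1) * (len(table[0]) - 1)  # (R - 1) x (C - 1)
    return statistic, df


x2, df = chi_square_statistic(observed)
print(f"X^2 = {x2:.3f}, df = {df}")
```

The result should match the values \(X^2 = 6.120\) and \(df = 2\) used in this section.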
TIP: Use two-proportion methods for 2-by-2 contingency tables
When analyzing 2-by-2 contingency tables, use the two-proportion methods introduced in Section 6.2.
![](imagese/inference_for_props/googleHTForDiffAlgPerformancePValue.png)
TIP: Conditions for the chi-square test of homogeneity
There are two conditions that must be checked before performing a chi-square test of homogeneity. If these conditions are not met, this test should not be used.
Multiple random samples or randomly allocated treatments. The data are collected from multiple independent random samples or from multiple randomly allocated treatments. The data can then be organized into a two-way table.
All expected counts at least 5. All of the expected counts must be at least 5.
Example 6.4.9
Compute the p-value and draw a conclusion about whether the search algorithms have different performances.
Looking in Appendix D, we examine the row corresponding to 2 degrees of freedom. The test statistic, \(X^2=6.120\text{,}\) falls between the fourth and fifth columns, which means the p-value is between 0.02 and 0.05. Because we typically test at a significance level of \(\alpha=0.05\) and the p-value is less than 0.05, the null hypothesis is rejected. That is, the data provide convincing evidence that there is some difference in performance among the algorithms.
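Instead of bracketing the p-value with a table, it can be computed directly in this particular case. The sketch below relies on a property that holds only for exactly 2 degrees of freedom: the upper-tail area of the chi-square distribution is \(e^{-x/2}\text{.}\) For other degrees of freedom, a table, calculator, or statistics library would be needed. The function name is hypothetical.

```python
import math


def chi_square_upper_tail_df2(x2):
    """Upper-tail area for a chi-square distribution with exactly 2 df.

    Assumption: df == 2, where the tail area has the closed form exp(-x/2).
    """
    return math.exp(-x2 / 2)


alpha = 0.05
p_value = chi_square_upper_tail_df2(6.120)

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The computed p-value falls between 0.02 and 0.05, which agrees with the range read from the table.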
Subsection 6.4.3 The chi-square test of independence for two-way tables
The chi-square test of independence proceeds exactly like the chi-square test of homogeneity, except that it applies when there is only one random sample (versus multiple random samples or an experiment with multiple randomly allocated treatments). The null claim is always that the two variables are independent, while the alternative claim is that the variables are dependent.
Example 6.4.10
Table 6.4.11 summarizes the results of a Pew Research poll. (Footnote 4: See the Pew Research website: http://www.people-press.org/2012/03/14/romney-leads-gop-contest-trails-in-matchup-with-obama/. The counts in Table 6.4.11 are approximate.) We would like to determine whether the three groups and the approval ratings are associated. What are appropriate hypotheses for such a test?
\(H_{0}:\) The ratings are independent of the group. (There is no difference in approval ratings between the three groups.)
\(H_{A}:\) The ratings are dependent on the group. (There is some difference in approval ratings between the three groups, e.g. perhaps Obama's approval differs from Democrats in Congress.)
Group | Obama | Democrats in Congress | Republicans in Congress | Total
Approve | 842 | 736 | 541 | 2119
Disapprove | 616 | 646 | 842 | 2104
Total | 1458 | 1382 | 1383 | 4223
Guided Practice 6.4.12
A chi-square test for a two-way table may be used to test the hypotheses in Example 6.4.10. As a first step, compute the expected values for each of the six table cells. (Footnote 5: The expected count for row one / column one is found by multiplying the row one total (2119) and column one total (1458), then dividing by the table total (4223): \(\frac{2119\times 1458}{4223} = 731.6\text{.}\) Similarly for the first column and the second row: \(\frac{2104\times 1458}{4223} = 726.4\text{.}\) Column 2: 693.5 and 688.5. Column 3: 694.0 and 689.0.)
Guided Practice 6.4.13
Compute the chi-square test statistic. (Footnote 6: For each cell, compute \(\frac{(\text{obs} - \text{exp})^2}{\text{exp}}\text{.}\) For instance, the first row and first column: \(\frac{(842-731.6)^2}{731.6} = 16.7\text{.}\) Adding the results of each cell gives the chi-square test statistic: \(X^2 = 16.7 + \cdots + 34.0 = 106.4\text{.}\))
Guided Practice 6.4.14
Because there are 2 rows and 3 columns, the degrees of freedom for the test is \(df=(2-1)\times (3-1) = 2\text{.}\) Use \(X^2=106.4\text{,}\) \(df=2\text{,}\) and the chi-square table in Appendix D to evaluate whether to reject the null hypothesis. (Footnote 7: The test statistic is larger than the right-most column of the \(df=2\) row of the chi-square table, meaning the p-value is less than 0.001. That is, we reject the null hypothesis because the p-value is less than 0.05, and we conclude that Americans' approval ratings differ among Democrats in Congress, Republicans in Congress, and the president.)
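The three guided practice steps above (expected counts, cell contributions, and the final decision) can be traced in one short script. This is a sketch using the approximate counts from Table 6.4.11; the variable names are illustrative, and the p-value line uses the closed form \(e^{-x/2}\text{,}\) which is valid only because \(df = 2\) here.

```python
import math

rows = ["Approve", "Disapprove"]
columns = ["Obama", "Congr. Dem.", "Congr. Rep."]
observed = [
    [842, 736, 541],
    [616, 646, 842],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

x2 = 0.0
for i, row_name in enumerate(rows):
    for j, col_name in enumerate(columns):
        expected = row_totals[i] * col_totals[j] / table_total
        contribution = (observed[i][j] - expected) ** 2 / expected
        x2 += contribution
        print(f"{row_name:10s} {col_name:12s} "
              f"expected = {expected:6.1f}  contribution = {contribution:5.1f}")

df = (len(rows) - 1) * (len(columns) - 1)
p_value = math.exp(-x2 / 2)  # closed form, valid only when df == 2

print(f"X^2 = {x2:.1f}, df = {df}, p-value = {p_value:.2e}")
```

Printing each cell's contribution makes it easy to see that the Republicans in Congress column accounts for most of the test statistic.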
Conditions for the chi-square test of independence
There are two conditions that must be checked before performing a chi-square test of independence. If these conditions are not met, this test should not be used.
One simple random sample with two variables/questions. The data must come from a simple random sample. After the data are collected, they are categorized according to two variables and can be organized into a two-way table.
All expected counts at least 5. All of the expected counts must be at least 5.
Subsection 6.4.4 Summarizing the chi-square tests for two-way tables
\(X^2\) test of homogeneity
- State the name of the test being used.
  \(X^2\) test of homogeneity
- Verify conditions.
  Multiple random samples or treatments.
  All expected counts \(\ge 5\) (calculate and record expected counts).
- Write the hypotheses in plain language. No mathematical notation is needed for this test.
  H\(_0\text{:}\) distribution of [variable 1] matches the distribution of [variable 2].
  H\(_A\text{:}\) distribution of [variable 1] does not match the distribution of [variable 2].
- Identify the significance level \(\alpha\text{.}\)
- Calculate the test statistic and degrees of freedom.
  \begin{align*} X^2 &= \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}\\ df &= (\# \text{ of rows} - 1) \times (\# \text{ of columns} - 1) \end{align*}
- Find the p-value and compare it to \(\alpha\) to determine whether to reject or not reject \(H_0\text{.}\)
- Write the conclusion in the context of the question.
\(X^2\) test of independence
- State the name of the test being used.
  \(X^2\) test of independence
- Verify conditions.
  A simple random sample.
  All expected counts \(\ge 5\) (calculate and record expected counts).
- Write the hypotheses in plain language. No mathematical notation is needed for this test.
  H\(_0\text{:}\) [variable 1] and [variable 2] are independent.
  H\(_A\text{:}\) [variable 1] and [variable 2] are dependent.
- Identify the significance level \(\alpha\text{.}\)
- Calculate the test statistic and degrees of freedom.
  \begin{align*} X^2 &= \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}\\ df &= (\# \text{ of rows} - 1) \times (\# \text{ of columns} - 1) \end{align*}
- Find the p-value and compare it to \(\alpha\) to determine whether to reject or not reject \(H_0\text{.}\)
- Write the conclusion in the context of the question.
Example 6.4.15
A 2011 survey asked 806 randomly sampled adult Facebook users about their Facebook privacy settings. One of the questions on the survey was, “Do you know how to adjust your Facebook privacy settings to control what people can and cannot see?” The responses are cross-tabulated based on gender. 8 Survey USA, News Poll #17960, data collected February 16-17, 2011
Response | Male | Female | Total
Yes | 288 | 378 | 666
No | 61 | 62 | 123
Not sure | 10 | 7 | 17
Total | 359 | 447 | 806
Carry out an appropriate test at the 0.10 significance level to see if there is an association between gender and knowing how to adjust Facebook privacy settings to control what people can and cannot see.
According to the problem, there was one random sample taken. Two variables were recorded on the respondents: gender and response to the question regarding privacy settings. Because there was one random sample rather than two independent random samples, we carry out a \(X^2\) test of independence.
H\(_0\text{:}\) Gender and knowing how to adjust Facebook privacy settings are independent.
H\(_A\text{:}\) Gender and knowing how to adjust Facebook privacy settings are dependent.
\(\alpha=0.1\)
Table of expected counts:
Response | Male | Female
Yes | 296.64 | 369.36
No | 54.785 | 68.215
Not sure | 7.572 | 9.428
All expected counts are \(\ge 5\text{.}\)
\(X^2 = 3.13\text{;}\) \(df = 2\)
p-value \(= 0.209 > \alpha\)
We do not reject H\(_0\text{.}\) We do not have evidence that gender and knowing how to adjust Facebook privacy settings are dependent.
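The full procedure for this example, including the expected-count condition and the comparison against \(\alpha = 0.10\text{,}\) might be scripted as follows. This is a sketch rather than a general-purpose routine: the variable names are illustrative, and the p-value uses the closed form \(e^{-x/2}\text{,}\) which applies only because this table has \(df = 2\text{.}\)

```python
import math

# Observed counts: rows are Yes / No / Not sure, columns are Male / Female.
observed = [
    [288, 378],
    [61, 62],
    [10, 7],
]
alpha = 0.10

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# Expected counts under independence.
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

# Condition: every expected count must be at least 5.
counts_ok = all(value >= 5 for row in expected for value in row)

x2 = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(observed, expected)
    for obs, exp in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
if df != 2:
    raise ValueError("the closed-form p-value below assumes df == 2")
p_value = math.exp(-x2 / 2)

print(f"all expected counts >= 5: {counts_ok}")
print(f"X^2 = {x2:.2f}, df = {df}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The script should reproduce the expected counts in the table above and lead to the same decision not to reject H\(_0\text{.}\)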
Subsection 6.4.5 Calculator: chi-square test for two-way tables
TI-83/84: Entering data into a two-way table
- Hit 2ND \(x^{-1}\) (i.e. MATRIX).
- Right arrow to EDIT.
- Hit 1 or ENTER to select matrix A.
- Enter the dimensions by typing #rows, ENTER, #columns, ENTER.
- Enter the data from the two-way table.
TI-83/84: Chi-square test of homogeneity and independence
Use STAT, TESTS, \(X^2\)-Test.
- First enter two-way table data as described in the previous box.
- Choose STAT.
- Right arrow to TESTS.
- Down arrow and choose C:\(X^2\)-Test.
- Down arrow, choose Calculate, and hit ENTER, which returns
  \(X^2\) | chi-square test statistic
  p | p-value
  df | degrees of freedom
TI-83/84: Finding the expected counts
- First enter two-way table data as described previously.
- Carry out the chi-square test of homogeneity or independence as described in the previous box.
- Hit 2ND \(x^{-1}\) (i.e. MATRIX).
- Right arrow to EDIT.
- Hit 2 to see matrix B. This matrix contains the expected counts.
Casio fx-9750GII: Chi-square test of homogeneity and independence
- Navigate to STAT (MENU button, then hit the 2 button or select STAT).
- Choose the TEST option (F3 button).
- Choose the CHI option (F3 button).
- Choose the 2WAY option (F2 button).
- Enter the data into a matrix:
  - Hit \(\triangleright\)MAT (F2 button).
  - Navigate to a matrix you would like to use (e.g. Mat C) and hit EXE.
  - Specify the matrix dimensions: m is for rows, n is for columns.
  - Enter the data.
  - Return to the test page by hitting EXIT twice.
- Enter the Observed matrix that was used by hitting MAT (F1 button) and the matrix letter (e.g. C).
- Enter the Expected matrix where the expected values will be stored (e.g. D).
- Hit the EXE button, which returns
  \(x^2\) | chi-square test statistic
  p | p-value
  df | degrees of freedom
- To see the expected values of the matrix, go to \(\triangleright\)MAT (F6 button) and select the corresponding matrix.
Group | Obama | Democrats in Congress | Republicans in Congress | Total
Approve | 842 | 736 | 541 | 2119
Disapprove | 616 | 646 | 842 | 2104
Total | 1458 | 1382 | 1383 | 4223
Guided Practice 6.4.17
Use Table 6.4.16 and a calculator to find the expected values and the \(X^2\) statistic, \(df\text{,}\) and p-value for the corresponding chi-square test. (Footnote 9: First create a \(2\times 3\) matrix with the data. The final summaries should be \(X^2=106.4\text{,}\) \(\text{p-value} = 8.06 \times 10^{-24}\approx 0\text{,}\) and \(df=2\text{.}\) Below is the matrix of expected values.)
Group | Obama | Congr. Dem. | Congr. Rep.
Approve | 731.59 | 693.45 | 693.96
Disapprove | 726.41 | 688.55 | 689.04