Homogeneity and independence in two-way tables

Section 6.4 Homogeneity and independence in two-way tables

Figure 6.4.1 Homogeneity and independence in two way tables

Google is constantly running experiments to test new search algorithms. For example, Google might test three algorithms using a sample of 10,000 google.com search queries. Table 6.4.2 shows an example of 10,000 queries split into three algorithm groups.¹ The group sizes were specified before the start of the experiment to be 5000 for the current algorithm and 2500 for each test algorithm.

¹ Google regularly runs experiments in this manner to help improve their search engine. It is entirely possible that if you perform a search and so does your friend, you will have different search results. While the data presented in this section resemble what might be encountered in a real experiment, these data are simulated.


Search algorithm	current	test 1	test 2	Total
Counts	5000	2500	2500	10000

Table 6.4.2 Google experiment breakdown of test subjects into three search groups.

Example 6.4.3

What is the ultimate goal of the Google experiment? What are the null and alternative hypotheses, in regular words?

Solution

The ultimate goal is to see whether there is a difference in the performance of the algorithms. The hypotheses can be described as the following:

\(H_{0}:\) The algorithms each perform equally well.

\(H_{A}:\) The algorithms do not perform equally well.

In this experiment, the explanatory variable is the search algorithm. However, an outcome variable is also needed. This outcome variable should somehow reflect whether the search results align with the user's interests. One possible way to quantify this is to determine whether (1) there was no new, related search, and the user clicked one of the links provided, or (2) there was a new, related search performed by the user. Under scenario (1), we might think that the user was satisfied with the search results. Under scenario (2), the search results probably were not relevant, so the user tried a second search.

Table 6.4.4 provides the results from the experiment. These data are very similar to the count data in Section 6.3. However, now the different combinations of two variables are binned in a two-way table. In examining these data, we want to evaluate whether there is strong evidence that at least one algorithm is performing better than the others. To do so, we apply a chi-square test to this two-way table. The ideas of this test are similar to those ideas in the one-way table case. However, degrees of freedom and expected counts are computed a little differently than before.

	Search Algorithm
		current	test 1	test 2	Total

No new search		3511	1749	1818	7078
New search		1489	751	682	2922

Total		5000	2500	2500	10000

Table 6.4.4 Results of the Google search algorithm experiment.

TIP: What is so different about one-way tables and two-way tables?

A one-way table describes counts for each outcome in a single variable. A two-way table describes counts for combinations of outcomes for two variables. When we consider a two-way table, we often would like to know, are these variables related in any way?

The hypothesis test for this Google experiment is really about assessing whether there is statistically significant evidence that the choice of the algorithm affects whether a user performs a second search. In other words, the goal is to check whether the three search algorithms perform differently.

Subsection 6.4.1 Expected counts in two-way tables

Example 6.4.5

From the experiment, we estimate the proportion of users who were satisfied with their initial search (no new search) as \(7078/10000 = 0.7078\text{.}\) If there really is no difference among the algorithms and 70.78% of people are satisfied with the search results, how many of the 5000 people in the “current algorithm” group would be expected to not perform a new search?

Solution

About 70.78% of the 5000 would be satisfied with the initial search:

\begin{equation*} 0.7078\times 5000 = 3539\text{ users } \end{equation*}

That is, if there was no difference between the three groups, then we would expect 3539 of the current algorithm users not to perform a new search.

Guided Practice 6.4.6

Using the same rationale described in Example 6.4.5, about how many users in each test group would not perform a new search if the algorithms were equally helpful?²

² We would expect \(0.7078\times 2500 = 1769.5\text{.}\) It is okay that this is a fraction.

We can compute the expected number of users who would perform a new search for each group using the same strategy employed in Example 6.4.5 and Guided Practice 6.4.6. These expected counts were used to construct Table 6.4.7, which is the same as Table 6.4.4, except now the expected counts have been added in parentheses.


Search algorithm	current	test 1	test 2	Total

No new search	3511 (3539)	1749 (1769.5)	1818 (1769.5)	7078
New search	1489 (1461)	751 (730.5)	682 (730.5)	2922

Total	5000	2500	2500	10000

Table 6.4.7 The observed counts and the (expected counts).

The examples and exercises above provided some help in computing expected counts. In general, expected counts for a two-way table may be computed using the row totals, column totals, and the table total. For instance, if there was no difference between the groups, then about 70.78% of each column should be in the first row:

\begin{align*} 0.7078\times (\text{ column 1 total } ) \amp = 3539\\ 0.7078\times (\text{ column 2 total } ) \amp = 1769.5\\ 0.7078\times (\text{ column 3 total } ) \amp = 1769.5 \end{align*}

Looking back to how the fraction 0.7078 was computed — as the fraction of users who did not perform a new search (\(7078/10000\)) — these three expected counts could have been computed as

\begin{align*} \left(\frac{\text{ row 1 total } }{\text{ table total } }\right)\text{ (column 1 total) } \amp = 3539\\ \left(\frac{\text{ row 1 total } }{\text{ table total } }\right)\text{ (column 2 total) } \amp = 1769.5\\ \left(\frac{\text{ row 1 total } }{\text{ table total } }\right)\text{ (column 3 total) } \amp = 1769.5 \end{align*}

This leads us to a general formula for computing expected counts in a two-way table when we would like to test whether there is strong evidence of an association between the column variable and row variable.

Computing expected counts in a two-way table

To identify the expected count for the \(i^{th}\) row and \(j^{th}\) column, compute

\begin{equation*} \text{ Expected Count } _{\text{ row } i,\text{ col } j} = \frac{(\text{ row \(i\) total } ) \times (\text{ column \(j\) total } )}{\text{ table total } } \end{equation*}
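As a minimal sketch of this formula, the Python snippet below recomputes the expected counts for the Google experiment from the observed counts in Table 6.4.4. The function name `expected_counts` and the list-of-rows layout are illustrative assumptions, not something the text prescribes.

```python
def expected_counts(observed):
    """Expected counts for a two-way table, assuming no association.

    `observed` is a list of rows of counts, without the Total row/column.
    (Hypothetical helper written for this illustration.)
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    # (row i total) x (column j total) / (table total) for every cell
    return [[r * c / table_total for c in col_totals] for r in row_totals]


# Observed counts from Table 6.4.4.
# Rows: no new search, new search. Columns: current, test 1, test 2.
google = [
    [3511, 1749, 1818],
    [1489, 751, 682],
]

expected = expected_counts(google)
for row in expected:
    print(row)
```

The two printed rows should agree with the parenthesized values in Table 6.4.7.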

Subsection 6.4.2 The chi-square test of homogeneity for two-way tables

The chi-square test statistic for a two-way table is found the same way it is found for a one-way table. For each table count, compute

\begin{align*} \amp \text{ General formula } \amp \amp \frac{(\text{ observed count } - \text{ expected count } )^2}{\text{ expected count } }\\ \amp \text{ Row 1, Col 1 } \amp \amp \frac{(3511 - 3539)^2}{3539} = 0.222\\ \amp \text{ Row 1, Col 2 } \amp \amp \frac{(1749 - 1769.5)^2}{1769.5} = 0.237\\ \amp \hspace{9mm}\vdots \amp \amp \hspace{13mm}\vdots\\ \amp \text{ Row 2, Col 3 } \amp \amp \frac{(682 - 730.5)^2}{730.5} = 3.220 \end{align*}

Adding the computed value for each cell gives the chi-square test statistic \(X^2\text{:}\)

\begin{equation*} X^2 = 0.222 + 0.237 + \dots + 3.220 = 6.120 \end{equation*}
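For reference, the sum can be written out in full for all six cells, using the observed and expected counts from Table 6.4.7 and rounding each term to three decimal places:

\begin{align*} X^2 \amp = \frac{(3511 - 3539)^2}{3539} + \frac{(1749 - 1769.5)^2}{1769.5} + \frac{(1818 - 1769.5)^2}{1769.5}\\ \amp \quad + \frac{(1489 - 1461)^2}{1461} + \frac{(751 - 730.5)^2}{730.5} + \frac{(682 - 730.5)^2}{730.5}\\ \amp = 0.222 + 0.237 + 1.329 + 0.537 + 0.575 + 3.220 = 6.120 \end{align*}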

Just like before, this test statistic follows a chi-square distribution. However, the degrees of freedom are computed a little differently for a two-way table.³

³ Recall: in the one-way table, the degrees of freedom was the number of groups minus 1.

For two-way tables, the degrees of freedom is equal to

\begin{gather*} df = \text{ (number of rows - 1) } \times \text{ (number of columns - 1) } \end{gather*}

In our example, the degrees of freedom parameter is

\begin{gather*} df = (2-1)\times (3-1) = 2 \end{gather*}

If the null hypothesis is true (i.e. the algorithms are equally useful), then the test statistic \(X^2 = 6.12\) closely follows a chi-square distribution with 2 degrees of freedom. Using this information, we can compute the p-value for the test, which is depicted in Figure 6.4.8.

Computing degrees of freedom for a two-way table

When applying the chi-square test to a two-way table, we use

\begin{equation*} df = (R-1)\times (C-1) \end{equation*}

where \(R\) is the number of rows in the table and \(C\) is the number of columns.

TIP: Use two-proportion methods for 2-by-2 contingency tables

When analyzing 2-by-2 contingency tables, use the two-proportion methods introduced in Section 6.2.

Figure 6.4.8 Computing the p-value for the Google hypothesis test.

TIP: Conditions for the chi-square test of homogeneity

There are two conditions that must be checked before performing a chi-square test of homogeneity. If these conditions are not met, this test should not be used.

Multiple random samples or randomly allocated treatments. The data must be collected from multiple independent random samples or from multiple randomly allocated treatments. The data can then be organized into a two-way table.

All Expected Counts at least 5. All of the expected counts must be at least 5.

Example 6.4.9

Compute the p-value and draw a conclusion about whether the search algorithms have different performances.

Solution

Looking in Appendix D, we examine the row corresponding to 2 degrees of freedom. The test statistic, \(X^2=6.120\text{,}\) falls between the fourth and fifth columns, which means the p-value is between 0.02 and 0.05. Because we typically test at a significance level of \(\alpha=0.05\) and the p-value is less than 0.05, the null hypothesis is rejected. That is, the data provide convincing evidence that there is some difference in performance among the algorithms.
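The table lookup only brackets the p-value. As a rough check, and only because this test has exactly 2 degrees of freedom, the upper-tail area of the chi-square distribution has the closed form \(e^{-x/2}\text{,}\) so the p-value can be computed directly. The function name below is a hypothetical helper, and the shortcut does not carry over to other degrees of freedom.

```python
import math


def chi_square_p_value_df2(statistic):
    """Upper-tail area of a chi-square distribution with df = 2.

    Valid only for 2 degrees of freedom, where the tail area is exp(-x / 2).
    (Hypothetical helper written for this illustration.)
    """
    return math.exp(-statistic / 2)


# Test statistic from the Google experiment.
p_value = chi_square_p_value_df2(6.120)
print(round(p_value, 4))

# The result should fall inside the range read from the table.
print(0.02 < p_value < 0.05)
```

The computed value lands just under 0.05, consistent with the range of 0.02 to 0.05 found from the table.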

Subsection 6.4.3 The chi-square test of independence for two-way tables

The chi-square test of independence proceeds exactly like the chi-square test of homogeneity, except that it applies when there is only one random sample (versus multiple random samples or an experiment with multiple randomly allocated treatments). The null claim is always that two variables are independent, while the alternative claim is that the variables are dependent.

Example 6.4.10

Table 6.4.11 summarizes the results of a Pew Research poll.⁴ We would like to determine if three groups and approval ratings are associated. What are appropriate hypotheses for such a test?

⁴ See the Pew Research website: http://www.people-press.org/2012/03/14/romney-leads-gop-contest-trails-in-matchup-with-obama/. The counts in Table 6.4.11 are approximate.

Solution

\(H_{0}:\) The ratings are independent of the group. (There is no difference in approval ratings between the three groups.)

\(H_{A}:\) The ratings are dependent on the group. (There is some difference in approval ratings between the three groups, e.g. perhaps Obama's approval differs from Democrats in Congress.)

		Congress
	Obama	Democrats	Republicans	Total

Approve	842	736	541	2119
Disapprove	616	646	842	2104

Total	1458	1382	1383	4223

Table 6.4.11 Pew Research poll results of a March 2012 poll.

Guided Practice 6.4.12

A chi-square test for a two-way table may be used to test the hypotheses in Example 6.4.10. As a first step, compute the expected values for each of the six table cells.⁵

⁵ The expected count for row one / column one is found by multiplying the row one total (2119) and column one total (1458), then dividing by the table total (4223): \(\frac{2119\times 1458}{4223} = 731.6\text{.}\) Similarly for the first column and the second row: \(\frac{2104\times 1458}{4223} = 726.4\text{.}\) Column 2: 693.5 and 688.5. Column 3: 694.0 and 689.0.

Guided Practice 6.4.13

Compute the chi-square test statistic.⁶

⁶ For each cell, compute \(\frac{(\text{ obs } - \text{ exp } )^2}{\text{ exp } }\text{.}\) For instance, the first row and first column: \(\frac{(842-731.6)^2}{731.6} = 16.7\text{.}\) Adding the results of each cell gives the chi-square test statistic: \(X^2 = 16.7 + \cdots + 34.0 = 106.4\text{.}\)
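Written out for all six cells of Table 6.4.11, using the expected counts rounded to one decimal place (731.6, 693.5, and 694.0 in the first row; 726.4, 688.5, and 689.0 in the second), the sum is approximately

\begin{align*} X^2 \amp = \frac{(842 - 731.6)^2}{731.6} + \frac{(736 - 693.5)^2}{693.5} + \frac{(541 - 694.0)^2}{694.0}\\ \amp \quad + \frac{(616 - 726.4)^2}{726.4} + \frac{(646 - 688.5)^2}{688.5} + \frac{(842 - 689.0)^2}{689.0}\\ \amp \approx 16.7 + 2.6 + 33.7 + 16.8 + 2.6 + 34.0 = 106.4 \end{align*}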

Guided Practice 6.4.14

Because there are 2 rows and 3 columns, the degrees of freedom for the test is \(df=(2-1)\times (3-1) = 2\text{.}\) Use \(X^2=106.4\text{,}\) \(df=2\text{,}\) and the chi-square table in Appendix D to evaluate whether to reject the null hypothesis.⁷

⁷ The test statistic is larger than the right-most column of the \(df=2\) row of the chi-square table, meaning the p-value is less than 0.001. That is, we reject the null hypothesis because the p-value is less than 0.05, and we conclude that Americans' approval ratings differ among Democrats in Congress, Republicans in Congress, and the president.

Conditions for the chi-square test of independence

There are two conditions that must be checked before performing a chi-square test of independence. If these conditions are not met, this test should not be used.

One simple random sample with two variables/questions. The data must be arrived at by taking a simple random sample. After the data is collected, it is separated and categorized according to two variables and can be organized into a two-way table.

All Expected Counts at least 5. All of the expected counts must be at least 5.

Subsection 6.4.4 Summarizing the chi-square tests for two-way tables

\(X^2\) test of homogeneity

State the name of the test being used.
- \(X^2\) test of homogeneity
Verify conditions.
- Multiple random samples or treatments.
- All expected counts \(\ge 5\) (calculate and record expected counts).
Write the hypotheses in plain language. No mathematical notation is needed for this test.
- H\(_0\text{:}\) the distribution of [variable 1] is the same across all groups of [variable 2].
- H\(_A\text{:}\) the distribution of [variable 1] is not the same across all groups of [variable 2].
Identify the significance level \(\alpha\text{.}\)
Calculate the test statistic and degrees of freedom.

\begin{align*} X^2 \amp = \sum{\frac{\text{ (observed counts - expected counts) } ^2}{\text{ expected counts } }}\\ df \amp = (\# \text{ of rows } - 1) \times (\# \text{ of columns } - 1) \end{align*}
Find the p-value and compare it to \(\alpha\) to determine whether to reject or not reject \(H_0\text{.}\)
Write the conclusion in the context of the question.

\(X^2\) test of independence

State the name of the test being used.
- \(X^2\) test of independence
Verify conditions.
- A simple random sample.
- All expected counts \(\ge 5\) (calculate and record expected counts).
Write the hypotheses in plain language. No mathematical notation is needed for this test.
- H\(_0\text{:}\) [variable 1] and [variable 2] are independent.
- H\(_A\text{:}\) [variable 1] and [variable 2] are dependent.
Identify the significance level \(\alpha\text{.}\)
Calculate the test statistic and degrees of freedom.

\begin{align*} X^2 \amp = \sum{\frac{\text{ (observed counts - expected counts) } ^2}{\text{ expected counts } }}\\ df \amp = (\# \text{ of rows } - 1) \times (\# \text{ of columns } - 1) \end{align*}
Find the p-value and compare it to \(\alpha\) to determine whether to reject or not reject \(H_0\text{.}\)
Write the conclusion in the context of the question.
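The computational steps in both summaries (expected counts, the expected-count condition, the test statistic, the degrees of freedom, and the p-value) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the function name `chi_square_two_way` is hypothetical, and the p-value uses a closed form for the chi-square tail area that holds only when the degrees of freedom are even, so the function returns `None` for the p-value otherwise. The data are the Pew counts from Table 6.4.11.

```python
import math


def chi_square_two_way(observed):
    """Chi-square computations for a two-way table of observed counts.

    `observed` is a list of rows of counts, without totals.
    Returns (statistic, df, expected, p_value). The p-value is computed
    only for even df; for odd df it is None.
    (Hypothetical helper written for this illustration.)
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)

    # Expected count = (row total) x (column total) / (table total)
    expected = [[r * c / table_total for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell
    statistic = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # For even df, tail area = exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
    p_value = None
    if df > 0 and df % 2 == 0:
        half = statistic / 2
        p_value = math.exp(-half) * sum(
            half ** k / math.factorial(k) for k in range(df // 2)
        )
    return statistic, df, expected, p_value


# Observed counts from Table 6.4.11 (rows: approve, disapprove).
pew = [
    [842, 736, 541],
    [616, 646, 842],
]

statistic, df, expected, p_value = chi_square_two_way(pew)

# Condition check: every expected count should be at least 5.
smallest_expected = min(min(row) for row in expected)
print("all expected counts >= 5:", smallest_expected >= 5)
print("X^2 =", round(statistic, 1), " df =", df)
print("p-value < 0.001:", p_value < 0.001)
```

Verifying the sampling condition and writing the hypotheses and conclusion in context remain steps for the analyst; the code covers only the arithmetic.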

Example 6.4.15

A 2011 survey asked 806 randomly sampled adult Facebook users about their Facebook privacy settings. One of the questions on the survey was, “Do you know how to adjust your Facebook privacy settings to control what people can and cannot see?” The responses are cross-tabulated based on gender.⁸

⁸ Survey USA, News Poll #17960, data collected February 16-17, 2011.

		Gender
		Male	Female	Total
Response	Yes	288	378	666
	No	61	62	123
	Not sure	10	7	17
	Total	359	447	806

Carry out an appropriate test at the 0.10 significance level to see if there is an association between gender and knowing how to adjust Facebook privacy settings to control what people can and cannot see.

Solution

According to the problem, there was one random sample taken. Two variables were recorded on the respondents: gender and response to the question regarding privacy settings. Because there was one random sample rather than two independent random samples, we carry out a \(X^2\) test of independence.

H\(_0\text{:}\) Gender and knowing how to adjust Facebook privacy settings are independent.

H\(_A\text{:}\) Gender and knowing how to adjust Facebook privacy settings are dependent.

\(\alpha=0.1\)

Table of expected counts:

		Gender
		Male	Female
Response	Yes	296.64	369.36
	No	54.785	68.215
	Not sure	7.572	9.428

All expected counts are \(\ge 5\text{.}\)

\(X^2 = 3.13\text{;}\) \(df = 2\text{;}\) p-value \(= 0.209 > \alpha\)

We do not reject H\(_0\text{.}\) We do not have evidence that gender and knowing how to adjust Facebook privacy settings are dependent.
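The summary values in this solution can be checked by hand from the observed counts and the table of expected counts. The last line relies on the fact that a chi-square distribution with exactly 2 degrees of freedom has upper-tail area \(e^{-x/2}\text{;}\) that shortcut does not apply for other degrees of freedom.

\begin{align*} X^2 \amp = \frac{(288 - 296.64)^2}{296.64} + \frac{(378 - 369.36)^2}{369.36} + \frac{(61 - 54.785)^2}{54.785}\\ \amp \quad + \frac{(62 - 68.215)^2}{68.215} + \frac{(10 - 7.572)^2}{7.572} + \frac{(7 - 9.428)^2}{9.428}\\ \amp \approx 0.252 + 0.202 + 0.705 + 0.566 + 0.779 + 0.625 \approx 3.13\\ df \amp = (3-1)\times (2-1) = 2\\ \text{ p-value } \amp = e^{-3.13/2} \approx 0.209 \end{align*}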

Subsection 6.4.5 Calculator: chi-square test for two-way tables

TI-83/84: Entering data into a two-way table

Hit 2ND \(x^{-1}\) (i.e. MATRIX).
Right arrow to EDIT.
Hit 1 or ENTER to select matrix A.
Enter the dimensions by typing #rows, ENTER, #columns, ENTER.
Enter the data from the two-way table.

TI-83/84: Chi-square test of homogeneity and independence

Use STAT, TESTS, \(X^2\)-Test.

First enter two-way table data as described in the previous box.
Choose STAT.
Right arrow to TESTS.
Down arrow and choose C:\(X^2\)-Test.
Down arrow, choose Calculate, and hit ENTER, which returns

\(X^2\) chi-square test statistic

p p-value

df degrees of freedom

TI-83/84: Finding the expected counts

First enter two-way table data as described previously.
Carry out the chi-square test of homogeneity or independence as described in previous box.
Hit 2ND \(x^{-1}\) (i.e. MATRIX).
Right arrow to EDIT.
Hit 2 to see matrix B.
- This matrix contains the expected counts.

Casio fx-9750GII: Chi-square test of homogeneity and independence

Navigate to STAT (MENU button, then hit the 2 button or select STAT).
Choose the TEST option (F3 button).
Choose the CHI option (F3 button).
Choose the 2WAY option (F2 button).
Enter the data into a matrix:
- Hit \(\triangleright\)MAT (F2 button).
- Navigate to a matrix you would like to use (e.g. Mat C) and hit EXE.
- Specify the matrix dimensions: m is for rows, n is for columns.
- Enter the data.
- Return to the test page by hitting EXIT twice.
Enter the Observed matrix that was used by hitting MAT (F1 button) and the matrix letter (e.g. C).
Enter the Expected matrix where the expected values will be stored (e.g. D).
Hit the EXE button, which returns

\(X^2\) chi-square test statistic

p p-value

df degrees of freedom
To see the expected values of the matrix, go to \(\triangleright\)MAT (F6 button) and select the corresponding matrix.

		Congress
	Obama	Democrats	Republicans	Total

Approve	842	736	541	2119
Disapprove	616	646	842	2104

Total	1458	1382	1383	4223

Table 6.4.16 Pew Research poll results of a March 2012 poll.

Guided Practice 6.4.17

Use Table 6.4.16 and a calculator to find the expected values and the \(X^2\) statistic, \(df\text{,}\) and p-value for the corresponding chi-square test.⁹

⁹ First create a \(2\times 3\) matrix with the data. The final summaries should be \(X^2=106.4\text{,}\) \(\text{p-value} = 8.06 \times 10^{-24}\approx 0\text{,}\) and \(df=2\text{.}\) Below is the matrix of expected values:

	Obama	Congr. Dem.	Congr. Rep.

Approve	731.59	693.45	693.96
Disapprove	726.41	688.55	689.04