Section 6.2 Difference of two proportions
We often wish to compare two groups to each other. In this section, we will answer the following questions:
How much more effective is a blood thinner than a placebo for those who undergo CPR for a heart attack?
How different is the approval of the 2010 healthcare law under two different question phrasings?
Does the use of fish oils reduce heart attacks better than a placebo?
Subsection 6.2.1 Learning objectives
State and verify whether or not the conditions for inference on the difference of two proportions using a normal distribution are met.
Recognize that the standard error calculation is different for the test and for the interval, and explain why that is the case.
Know how to calculate the pooled proportion and when to use it.
Carry out a complete confidence interval procedure for the difference of two proportions.
Carry out a complete hypothesis test for the difference of two proportions.
Subsection 6.2.2 Sampling distribution of the difference of two proportions
In this section we want to compare two proportions to each other. We can start by taking their difference. If the difference is positive, it tells us that the first one is larger. If it is negative, it tells us that the second one is larger. If the difference is zero, it tells us that they are equal. When comparing two proportions, then, the quantity that we want to estimate is really the difference: \(p_1 - p_2\text{.}\) This tells us how far apart the two proportions are.
Before we find a test statistic and perform inference for the two proportion case, we must investigate the sampling distribution of \(\hat{p}_1 - \hat{p}_2\text{,}\) which will become our point estimate. We know that the sampling distribution should be centered on \(p_1 - p_2\text{.}\) The standard deviation of \(\hat{p}_1 - \hat{p}_2\) can be computed as:
\(SD = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}\)
As with \(\hat{p}\text{,}\) the difference of two sample proportions \(\hat{p}_1 - \hat{p}_2\) follows a normal distribution when certain conditions are met. First, the sampling distribution for each sample proportion must be nearly normal, and second, the samples must be independent. Under these two conditions, the sampling distribution of \(\hat{p}_1 - \hat{p}_2\) may be well approximated using the normal model.
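The standard deviation of a difference of sample proportions can be checked informally with a small simulation. The sketch below is only an illustration: the function names, the population values 0.35 and 0.22, the sample sizes 40 and 50, the number of repetitions, and the random seed are all assumptions chosen for the example.

```python
import math
import random
import statistics


def sd_diff_props(p1, n1, p2, n2):
    """Standard deviation of p1_hat - p2_hat for two independent samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def simulate_diffs(p1, n1, p2, n2, reps, seed=0):
    """Simulate `reps` differences of sample proportions (illustrative helper)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Count successes in each independent sample.
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Assumed population proportions and sample sizes, for illustration only.
formula_sd = sd_diff_props(0.35, 40, 0.22, 50)
diffs = simulate_diffs(0.35, 40, 0.22, 50, reps=5000)

print("SD from formula:    ", round(formula_sd, 4))
print("SD from simulation: ", round(statistics.pstdev(diffs), 4))
print("mean of differences:", round(statistics.mean(diffs), 4))
```

The simulated mean should land near \(p_1 - p_2\) and the simulated standard deviation near the formula value, which is what the sampling distribution described above predicts.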
Subsection 6.2.3 Checking conditions for inference using a normal distribution
When comparing two proportions, we carry out inference on \(p_1 - p_2\text{.}\) The assumptions are that the observations are independent, both between groups and within groups, and that the sampling distribution of \(\hat{p}_1 - \hat{p}_2\) is nearly normal. We check whether these assumptions are reasonable by verifying the following conditions.
Independent. Observations between groups can be considered independent when the data are collected from two independent random samples or, in the context of experiments, from two randomly assigned treatments. Randomly assigning subjects to treatments is equivalent to randomly assigning treatments to subjects. When sampling without replacement from a finite population, observations can be considered independent when sampling less than 10% of the population.
Nearly normal sampling distribution. The sampling distribution of \(\hat{p}_1 - \hat{p}_2\) will be nearly normal when the success-failure condition is met for both groups. In the two sample case, instead of checking two inequalities, there are four to check.
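The four success-failure inequalities can be checked mechanically. The following is a minimal sketch; the function name and the dictionary labels are assumptions made for illustration. It uses the fact that \(n\hat{p}\) is the number of successes and \(n(1-\hat{p})\) is the number of failures. The counts come from the CPR study and from the social experiment exercise in this section.

```python
def success_failure_ok(x1, n1, x2, n2, minimum=10):
    """Return the four success/failure counts and whether all reach `minimum`."""
    counts = {
        "n1 * p1_hat": x1,             # successes in sample 1
        "n1 * (1 - p1_hat)": n1 - x1,  # failures in sample 1
        "n2 * p2_hat": x2,             # successes in sample 2
        "n2 * (1 - p2_hat)": n2 - x2,  # failures in sample 2
    }
    return counts, all(count >= minimum for count in counts.values())


# CPR study: 14 of 40 survived (treatment), 11 of 50 survived (control).
cpr_counts, cpr_ok = success_failure_ok(14, 40, 11, 50)
print(cpr_counts, cpr_ok)

# Social experiment exercise: 5 of 20 and 15 of 25 diners intervened.
social_counts, social_ok = success_failure_ok(5, 20, 15, 25)
print(social_counts, social_ok)
```

This check covers only the nearly normal condition; independence still has to be judged from how the data were collected.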
Subsection 6.2.4 Confidence interval for the difference of two proportions
We consider an experiment for patients who underwent CPR for a heart attack and were subsequently admitted to a hospital. These patients were randomly divided into a treatment group where they received a blood thinner or the control group where they did not receive a blood thinner. The outcome variable of interest was whether the patients survived for at least 24 hours. The results are shown in Table 6.2.1.
Group  Survived  Died  Total
Treatment  14  26  40
Control  11  39  50
Total  25  65  90
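Here, the parameter of interest is a difference of population proportions, specifically, the difference in the proportion of similar patients that would survive for at least 24 hours if in the treatment group versus if in the control group. Let:
\(p_1\text{:}\) the true proportion of similar patients that would survive for at least 24 hours if given the blood thinner (treatment)
\(p_2\text{:}\) the true proportion of similar patients that would survive for at least 24 hours if not given the blood thinner (control)
Then the parameter of interest is \(p_1 - p_2\text{.}\) In order to use a Z-interval to estimate this difference, we must see if the point estimate, \(\hat{p}_{1} - \hat{p}_{2}\text{,}\) follows a normal distribution. Because the patients were randomly assigned to one of the two groups and one heart attack patient is unlikely to influence the next that was in the study, the observations are considered independent, both within the samples and between the samples. Next, the success-failure condition should be verified for each group. We use the sample proportions along with the sample sizes to check the condition.
\(n_1\hat{p}_1 = 40\left(\frac{14}{40}\right) = 14 \geq 10\text{,}\) \(n_1(1-\hat{p}_1) = 40\left(\frac{26}{40}\right) = 26 \geq 10\text{,}\) \(n_2\hat{p}_2 = 50\left(\frac{11}{50}\right) = 11 \geq 10\text{,}\) and \(n_2(1-\hat{p}_2) = 50\left(\frac{39}{50}\right) = 39 \geq 10\)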
Because all conditions are met, the normal model can be used for the point estimate of the difference in survival rate.
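The point estimate is:
\(\hat{p}_1 - \hat{p}_2 = \frac{14}{40} - \frac{11}{50} = 0.35 - 0.22 = 0.13\)
We compute the standard error for the difference of sample proportions in the same way that we compute the standard deviation for the difference of sample proportions; the only difference is that we use the sample proportions in place of the population proportions:
\(SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.35(1-0.35)}{40} + \frac{0.22(1-0.22)}{50}} = 0.0955\)
Let us estimate the true difference in survival rate with 90% confidence. For a 90% confidence level, we use \(z^{\star} = 1.645\text{.}\) The 90% confidence interval is calculated as:
\(\text{ point estimate } \ \pm\ z^{\star} \times SE\ \text{ of estimate } = 0.13 \pm 1.645 \times 0.0955 = (-0.027, 0.287)\)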
We are 90% confident that the true difference in the survival rate (treatment \(-\) control) lies between -0.027 and 0.287. That is, we are 90% confident that the treatment of blood thinners changes survival rate for patients like those in the study by -2.7 to +28.7 percentage points. Because this interval contains both negative and positive values, we do not have enough information to say with confidence whether blood thinners harm or help heart attack patients who have been admitted after they have undergone CPR.
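The arithmetic for this interval can be reproduced in a few lines. The sketch below uses the counts from Table 6.2.1; the variable names are illustrative, and obtaining \(z^{\star}\) from Python's `statistics.NormalDist` rather than from a table is an assumption of this example.

```python
import math
from statistics import NormalDist

# Counts from Table 6.2.1: number surviving and group size.
x1, n1 = 14, 40  # treatment (blood thinner)
x2, n2 = 11, 50  # control

p1_hat = x1 / n1
p2_hat = x2 / n2
point_estimate = p1_hat - p2_hat

# Unpooled standard error: each group keeps its own sample proportion.
se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# z* for 90% confidence leaves 5% in each tail.
z_star = NormalDist().inv_cdf(0.95)

lower = point_estimate - z_star * se
upper = point_estimate + z_star * se

print(f"point estimate = {point_estimate:.3f}")
print(f"SE = {se:.4f}, z* = {z_star:.3f}")
print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```

Because the lower endpoint is negative and the upper endpoint is positive, the interval contains 0.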
Constructing a confidence interval for the difference of two proportions.
To carry out a complete confidence interval procedure to estimate the difference of two proportions \(p_1 - p_2\text{,}\)
Identify: Identify the parameter and the confidence level, C%.
The parameter will be a difference of proportions, e.g. the true difference in the proportion of 17 and 18 year olds with a summer job (proportion of 18 year olds \(-\) proportion of 17 year olds).
Choose: Identify the correct interval procedure and identify it by name.
Here we choose the 2-proportion Z-interval.
Check: Check conditions for the sampling distribution of \(\hat{p}_1 - \hat{p}_2\) to be nearly normal.
Data come from 2 independent random samples or 2 randomly assigned treatments.
\(n_1\hat{p}_1\geq10\text{,}\) \(n_1(1-\hat{p}_1)\geq10\text{,}\) \(n_2\hat{p}_2\geq10\text{,}\) and \(n_2(1-\hat{p}_2)\geq10\)
Calculate: Calculate the confidence interval and record it in interval form.

\(\text{ point estimate } \ \pm\ z^{\star} \times SE\ \text{ of estimate }\)
point estimate: the difference of sample proportions \(\hat{p}_1 - \hat{p}_2\)
\(SE\) of estimate: \(\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)
\(z^{\star}\text{:}\) use a \(t\)-table at row \(\infty\) and confidence level C
(___, ___)
Conclude: Interpret the interval and, if applicable, draw a conclusion in context.
We are C% confident that the true difference in the proportion of [...] is between ___ and ___. If applicable, draw a conclusion based on whether the interval is entirely above, is entirely below, or contains the value 0.
Example 6.2.2.
A remote control car company is considering a new manufacturer for wheel gears. The new manufacturer would be more expensive but their higher quality gears are more reliable, resulting in happier customers and fewer warranty claims. However, management must be convinced that the more expensive gears are worth the conversion before they approve the switch. The quality control engineer collects a sample of gears, examining 1000 gears from each company and finds that 879 gears pass inspection from the current supplier and 958 pass inspection from the prospective supplier. Using these data, construct a 95% confidence interval for the difference in the proportion from each supplier that would pass inspection. Use the five-step framework described above to organize your work.
Identify: First we identify the parameter of interest. Here the parameter we wish to estimate is the true difference in the proportion of gears from each supplier that would pass inspection, \(p_1 - p_2\text{.}\) We will take the difference as: current \(-\) prospective, so \(p_1\) is the true proportion that would pass from the current supplier and \(p_2\) is the true proportion that would pass from the prospective supplier. We will estimate the difference using a 95% confidence level.
Choose: Because the parameter to be estimated is a difference of proportions, we will use a 2-proportion Z-interval.
Check: The samples are independent, but not necessarily random, so to proceed we must assume the gears are all independent. For this sample we will suppose this assumption is reasonable, but the engineer would be more knowledgeable as to whether this assumption is appropriate. We also must verify the minimum sample size conditions:
\(1000(0.879) = 879 \geq 10\text{,}\) \(1000(1-0.879) = 121 \geq 10\text{,}\) \(1000(0.958) = 958 \geq 10\text{,}\) and \(1000(1-0.958) = 42 \geq 10\)
The success-failure condition is met for both samples.
Calculate: We will calculate the interval:
The point estimate is the difference of sample proportions: \(\hat{p}_1 - \hat{p}_2 = 0.879 - 0.958 = -0.079\text{.}\)
The \(SE\) of the difference of sample proportions is:
\(\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.879(1-0.879)}{1000} + \frac{0.958(1-0.958)}{1000}} = 0.0121\)
So the 95% confidence interval is given by:
\(-0.079 \pm 1.96 \times 0.0121 = (-0.103, -0.055)\)
Conclude: We are 95% confident that the true difference (current \(-\) prospective) in the proportion that would pass inspection is between -0.103 and -0.055, meaning that we are 95% confident that the prospective supplier would have between a 5.5% and 10.3% greater rate of passing inspection. Because the entire interval is below zero, the data provide sufficient evidence that the prospective gears pass inspection more often than the current gears. The remote control car company should go with the new manufacturer.
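The Calculate step can be wrapped in a small reusable function. This is a sketch only: the function name `two_prop_z_interval` and its parameters are hypothetical, and the conditions from the Check step still have to be verified separately before the result is trusted.

```python
import math
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, conf_level=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Split the leftover area (1 - conf_level) evenly between the two tails.
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Gears example: sample 1 is the current supplier, sample 2 the prospective one.
low, high = two_prop_z_interval(879, 1000, 958, 1000, conf_level=0.95)
print(f"({low:.4f}, {high:.4f})")
```

Swapping the order of the two samples would negate and reorder the endpoints, so the direction of the difference should be stated alongside the interval.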
Subsection 6.2.5 Calculator: the 2-proportion Z-interval
As with the 1-proportion Z-interval, a calculator can be helpful for evaluating the final interval.
TI-83/84: 2-proportion Z-interval.
Use STAT, TESTS, 2-PropZInt.
Choose STAT.
Right arrow to TESTS.
Down arrow and choose B:2-PropZInt.
Let x1 be the number of yeses (must be an integer) in sample 1 and let n1 be the size of sample 1.
Let x2 be the number of yeses (must be an integer) in sample 2 and let n2 be the size of sample 2.
Let C-Level be the desired confidence level.
Choose Calculate and hit ENTER, which returns:
(___, ___): the confidence interval
\(\hat{p}_1\text{:}\) sample 1 proportion
\(n_1\text{:}\) size of sample 1
\(\hat{p}_2\text{:}\) sample 2 proportion
\(n_2\text{:}\) size of sample 2
Casio fx-9750GII: 2-proportion Z-interval.
Navigate to STAT (MENU button, then hit the 2 button or select STAT).
Choose the INTR option (F4 button).
Choose the Z option (F1 button).
Choose the 2P option (F4 button).
Specify the interval details:
Confidence level of interest for C-Level.
Enter the number of successes for each group, x1 and x2.
Enter the sample size for each group, n1 and n2.
Hit the EXE button, which returns:
Left, Right: the ends of the confidence interval
\(\hat{p}_1\text{,}\) \(\hat{p}_2\text{:}\) the sample proportions
n1, n2: sample sizes
Checkpoint 6.2.3.
From Example 6.2.2, we have that a quality control engineer collects a sample of gears, examining 1000 gears from each company and finds that 879 gears pass inspection from the current supplier and 958 pass inspection from the prospective supplier. Use a calculator to find a 95% confidence interval for the difference (current \(-\) prospective) in the proportion that would pass inspection.^{ 1 }
x1 \(= 879\text{,}\) n1 \(= 1000\text{,}\) x2 \(= 958\text{,}\) and n2 \(= 1000\text{.}\) C-Level is .95. This should lead to an interval of \((-0.1027, -0.0553)\text{,}\) which matches what we found previously.
Subsection 6.2.6 Hypothesis testing when \(H_0\text{:}\) \(p_1 = p_2\)
Here we use a new example to examine a special estimate of the standard error when the null hypothesis is that two population proportions equal each other, i.e. \(H_0\text{:}\) \(p_1 = p_2\text{.}\) We investigate whether the way a question is phrased can influence a person's response. Pew Research Center conducted a survey with the following question:^{ 2 }
As you may know, by 2014 nearly all Americans will be required to have health insurance. [People who do not buy insurance will pay a penalty] while [People who cannot afford it will receive financial help from the government]. Do you approve or disapprove of this policy?
For each randomly sampled respondent, the statements in brackets were randomized: either they were kept in the original order given above, or they were reversed. Results are presented in Table 6.2.4.
Statement order  Sample size  Approve law (%)  Disapprove law (%)  Other
“People who do not buy insurance will pay a penalty” is given first (original order)  771  47  49  4
“People who cannot afford it will receive financial help from the government” is given first (reversed order)  732  34  63  3
Checkpoint 6.2.5.
Is this study an experiment or an observational study? ^{ 3 }
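The approval percents of 47% and 34% seem far apart. However, could this difference be due to random chance? We will answer this question using a hypothesis test. To simplify things, let \(p_1\) represent the true proportion that would approve of the policy when the statements are given in the original order, and let \(p_2\) represent the true proportion that would approve when the statements are given in the reversed order.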
Example 6.2.6.
Set up hypotheses to test whether the two statement orders produce the same response.
The null claim is that the question order does not matter, that is, that the two proportions should be equal. The alternate claim, the one that bears the burden of proof, is that the question ordering does matter.
\(H_0\text{:}\) \(p_1 = p_2\)
\(H_A\text{:}\) \(p_1 \ne p_2\)
Now, we can note that:
\(p_1 = p_2\) is equivalent to \(p_1 - p_2 = 0\text{,}\) and \(p_1 \ne p_2\) is equivalent to \(p_1 - p_2 \ne 0\text{.}\)
We can now see that the hypotheses are really about a difference of proportions: \(p_1 - p_2\text{.}\) In the last section, we used a 2-proportion Z-interval to estimate the parameter \(p_1 - p_2\text{;}\) here, we will use a 2-proportion Z-test to test the null hypothesis that \(p_1 - p_2=0\text{,}\) i.e. that \(p_1=p_2\text{.}\)
Recall that the test statistic Z has the form:
\(Z = \frac{\text{ point estimate } - \text{ null value } }{SE \text{ of estimate } }\)
The parameter of interest is \(p_1 - p_2\text{,}\) so the point estimate will be the observed difference of sample proportions: \(\hat{p}_{1} - \hat{p}_{2} = 0.47 - 0.34 = 0.13\text{.}\)
The null value depends on the null hypothesis. The null hypothesis is that the approval rate would be the same for both statement orderings, i.e. that the difference is 0; therefore, the null value is 0. In this section we consider only the case where \(H_0\text{:}\) \(p_1=p_2\text{,}\) so the null value for the difference will always be 0.
The \(SD\) of a difference of sample proportions has the form:
\(SD = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}\)
However, in a hypothesis test, the distribution of the point estimate is always examined assuming the null hypothesis is true, i.e. in this case, \(p_1 = p_2\text{.}\) Both the success-failure check and the standard error formula should reflect this equality in the null hypothesis. We will use \(p_c\) to represent the common proportion that support the healthcare law regardless of statement order:
\(SD = \sqrt{\frac{p_c(1-p_c)}{n_1} + \frac{p_c(1-p_c)}{n_2}} = \sqrt{p_c(1-p_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\)
We don't know the true proportion \(p_c\text{,}\) but we can obtain a good estimate of it, \(\hat{p}_c\text{,}\) by pooling the results of both samples. We find the total number of “yeses” or “successes” and divide that by the total number of cases. This is equivalent to taking a weighted average of \(\hat{p}_1\) and \(\hat{p}_2\text{.}\) We call \(\hat{p}_c\) the pooled sample proportion, and we use it to check the success-failure condition and to compute the standard error when the null hypothesis is that \(p_1 = p_2\text{.}\) Here:
\(\hat{p}_c = \frac{771(0.47) + 732(0.34)}{771 + 732} = 0.407\)
Pooled sample proportion.
When the null hypothesis is \(p_1 = p_2\text{,}\) it is useful to find the pooled sample proportion:
\(\hat{p}_c = \frac{\text{ number of “successes” } }{\text{ number of cases } } = \frac{x_1 + x_2}{n_1 + n_2}\)
Here \(x_1\) represents the number of successes in sample 1. If \(x_1\) is not given, it can be computed as \(n_1\times \hat{p}_1\text{.}\) Similarly, \(x_2\) represents the number of successes in sample 2 and can be computed as \(n_2\times \hat{p}_2\text{.}\)
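The pooled sample proportion and its equivalence to a weighted average can be illustrated with the healthcare survey numbers. In this sketch the function name is hypothetical, and the success counts are recovered as \(n\times\hat{p}\) without rounding, since only percentages are reported in the table.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Pooled sample proportion: total successes over total cases."""
    return (x1 + x2) / (n1 + n2)


# Sample sizes and approval proportions from the healthcare survey.
n1, p1_hat = 771, 0.47  # original statement order
n2, p2_hat = 732, 0.34  # reversed statement order

# Counts are not reported directly, so recover them as n * p_hat.
x1 = n1 * p1_hat
x2 = n2 * p2_hat

pooled = pooled_proportion(x1, n1, x2, n2)

# The same value written as a weighted average of the sample proportions,
# with weights proportional to the sample sizes.
weighted = (n1 / (n1 + n2)) * p1_hat + (n2 / (n1 + n2)) * p2_hat

print(round(pooled, 3))
print(round(weighted, 3))
```

The pooled value always falls between the two sample proportions, closer to the one from the larger sample.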
Use the pooled sample proportion when \(H_0\text{:}\) \({p}_1 = {p}_2\).
When the null hypothesis states that the proportions are equal, we use the pooled sample proportion (\(\hat{p}_c\)) to check the success-failure condition and to estimate the standard error:
\(SE = \sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\)
Example 6.2.7.
Verify that conditions for using the normal are met and find the \(SE\) of estimate for this hypothesis test. Recall that the pooled proportion \(\hat{p}_c=0.407\text{,}\) \(n_1 = 771\text{,}\) and \(n_2=732\text{.}\)
The data do come from two randomly assigned treatments, where the treatments are the two different orderings of the question regarding healthcare. Also, the success-failure condition (minimums of 10) easily holds for each group.
Here, we compute the \(SE\) for the difference of sample proportions as:
\(SE = \sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{0.407(1-0.407)}\sqrt{\frac{1}{771} + \frac{1}{732}} = 0.0254\)
Example 6.2.8.
Complete the hypothesis test using a significance level of 0.01.
We have already set up the hypotheses and verified that the difference of proportions can be modeled using a normal distribution. We can now calculate the test statistic and p-value.
\(Z = \frac{\text{ point estimate } - \text{ null value } }{SE \text{ of estimate } } = \frac{0.13 - 0}{0.0254} = 5.1\)
This is a two-tailed test as \(H_A\) is that \(p_1\ne p_2\text{.}\) We can find the area in one tail and double it. Here, the p-value \(\approx\) 0. Because the p-value is smaller than \(\alpha = 0.01\text{,}\) we reject the null hypothesis and conclude that the order of the statements affects how likely a respondent is to support the 2010 healthcare law.
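To see how small this p-value is, the test can be carried out numerically. The sketch below uses the survey proportions and sample sizes; the variable names are illustrative, and using `statistics.NormalDist` in place of a normal table is an assumption of this example.

```python
import math
from statistics import NormalDist

# Sample sizes and approval proportions from the healthcare survey.
n1, p1_hat = 771, 0.47  # original statement order
n2, p2_hat = 732, 0.34  # reversed statement order
alpha = 0.01

# Pooled proportion, used because H0 states that p1 = p2.
p_pooled = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)

# Standard error computed under the null hypothesis.
se = math.sqrt(p_pooled * (1 - p_pooled)) * math.sqrt(1 / n1 + 1 / n2)

# Test statistic: (point estimate - null value) / SE, with null value 0.
z = (p1_hat - p2_hat - 0) / se

# Two-tailed p-value: area in one tail, doubled.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"Z = {z:.2f}")
print(f"p-value = {p_value:.2e}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The p-value is far below 0.01, which is why it is reported as approximately 0.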
Hypothesis testing for the difference of two proportions.
To carry out a complete hypothesis test to test the claim that two proportions \(p_1\) and \(p_2\) are equal to each other,
Identify: Identify the hypotheses and the significance level, \(\alpha\text{.}\)
\(H_0\text{:}\) \(p_1=p_2\)
\(H_A\text{:}\) \(p_1\ne p_2\text{;}\) \(H_A\text{:}\) \(p_1>p_2\text{;}\) or \(H_A\text{:}\) \(p_1\lt p_2\)
Choose: Identify the correct test procedure and identify it by name.
Here we choose the 2-proportion Z-test.
Check: Check conditions for the sampling distribution of \(\hat{p}_1 - \hat{p}_2\) to be nearly normal.
Data come from 2 independent random samples or from 2 randomly assigned treatments.
\(n_1\hat{p}_c\geq 10\text{,}\) \(n_1(1-\hat{p}_c)\geq 10\text{,}\) \(n_2\hat{p}_c\geq 10\text{,}\) and \(n_2(1-\hat{p}_c)\geq 10\)
Calculate: Calculate the Z-statistic and p-value.

\(Z = \frac{\text{ point estimate } - \text{ null value } }{SE \text{ of estimate } }\)
point estimate: the difference of sample proportions \(\hat{p}_1 - \hat{p}_2\)
\(SE\) of estimate: \(\sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\text{,}\) where \(\hat{p}_c\) is the pooled proportion
null value: 0
p-value = ___ (based on the Z-statistic and the direction of \(H_A\))
Conclude: Compare the p-value to \(\alpha\text{,}\) and draw a conclusion in context.
If the p-value is \(\lt \alpha\text{,}\) reject \(H_0\text{;}\) there is sufficient evidence that [\(H_A\) in context].
If the p-value is \(> \alpha\text{,}\) do not reject \(H_0\text{;}\) there is not sufficient evidence that [\(H_A\) in context].
Example 6.2.9.
A 5-year experiment was conducted to evaluate the effectiveness of fish oils on reducing heart attacks, where each subject was randomized into one of two treatment groups. We'll consider heart attack outcomes in these patients:
Group  heart_attack  no_event  Total
fish_oil  145  12788  12933
placebo  200  12738  12938
Carry out a complete hypothesis test at the 10% significance level to test whether the use of fish oils is effective in reducing heart attacks.
Identify: Define \(p_1\) and \(p_2\) as follows:
\(p_1\text{:}\) the true proportion that would suffer a heart attack if given fish oil
\(p_2\text{:}\) the true proportion that would suffer a heart attack if given placebo
We will test the following hypotheses at the \(\alpha=0.10\) significance level.
\(H_0\text{:}\) \(p_1=p_2\) Fish oil and placebo are equally effective.
\(H_A\text{:}\) \(p_1 \lt p_2\) Fish oil is effective in reducing heart attacks.
Choose: Because we are testing whether two proportions equal each other, we choose the 2-proportion Z-test.
Check: We must verify that the difference of sample proportions can be modeled using a normal distribution. First we note that there are two randomly assigned treatments. Second, we calculate the pooled proportion as follows:
\(\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2} = \frac{145 + 200}{12933 + 12938} = 0.0133\)
We can now verify: \(12933(0.0133)\geq10\text{,}\) \(12933(1-0.0133)\geq10\text{,}\) \(12938(0.0133)\geq10\text{,}\) and \(12938(1-0.0133)\geq10\text{,}\) so both conditions are met.
Calculate: We will calculate the Z-statistic and the p-value.
The point estimate is the difference of sample proportions: \(\hat{p}_1 - \hat{p}_2 = 0.0112 - 0.0155 = -0.0043\text{.}\)
The value hypothesized for the parameter in \(H_0\) is the null value: null value = 0.
The pooled proportion, calculated above, is: \(\hat{p}_c = 0.0133\text{.}\)
The \(SE\) of the difference of sample proportions, assuming \(H_0\) is true, is:
\(\sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{0.0133(1-0.0133)}\sqrt{\frac{1}{12933} + \frac{1}{12938}}=0.00142\text{.}\)
\(Z = \frac{\text{ point estimate } - \text{ null value } }{SE \text{ of estimate } } = \frac{-0.0043 - 0}{0.00142} = -3.0\)
Because \(H_A\) uses a less than, meaning that it is a lower-tail test, the p-value is the area to the left of \(Z=-3.0\) under the standard normal curve. This area can be found using a normal table or a calculator. The area or p-value = \(0.0013\text{.}\)
Conclude: The p-value of 0.0013 is \(\lt 0.10\text{,}\) so we reject \(H_0\text{;}\) there is sufficient evidence that fish oil is effective in reducing heart attacks.
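The Calculate step of the test procedure can be sketched as a general function that handles all three forms of \(H_A\). The function name `two_prop_z_test` and the labels used for the `alternative` argument are hypothetical choices for this illustration. Because the function works from the raw counts without intermediate rounding, its output differs slightly from the hand calculation above.

```python
import math
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Z-statistic, p-value, and pooled proportion for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    # Pooled standard error, valid under the null hypothesis p1 = p2.
    se = math.sqrt(p_pooled * (1 - p_pooled)) * math.sqrt(1 / n1 + 1 / n2)
    z = (p1_hat - p2_hat - 0) / se
    std_normal = NormalDist()
    if alternative == "less":        # HA: p1 < p2, lower tail
        p_value = std_normal.cdf(z)
    elif alternative == "greater":   # HA: p1 > p2, upper tail
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":  # HA: p1 != p2, both tails
        p_value = 2 * std_normal.cdf(-abs(z))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value, p_pooled


# Fish oil example: sample 1 is fish oil, sample 2 is placebo, HA: p1 < p2.
z, p_value, p_pooled = two_prop_z_test(145, 12933, 200, 12938, alternative="less")
print(f"Z = {z:.3f}, p-value = {p_value:.5f}, pooled proportion = {p_pooled:.4f}")
```

The direction of the alternative must match the order in which the two samples are passed to the function.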
Subsection 6.2.7 Calculator: the 2-proportion Z-test
TI-83/84: 2-proportion Z-test.
Use STAT, TESTS, 2-PropZTest.
Choose STAT.
Right arrow to TESTS.
Down arrow and choose 6:2-PropZTest.
Let x1 be the number of yeses (must be an integer) in sample 1 and let n1 be the size of sample 1.
Let x2 be the number of yeses (must be an integer) in sample 2 and let n2 be the size of sample 2.
Choose \(\ne\text{,}\) \(\lt\text{,}\) or \(>\) to correspond to \(H_A\text{.}\)
Choose Calculate and hit ENTER, which returns:
z: Z-statistic
p: p-value
\(\hat{p}_1\text{:}\) sample 1 proportion
\(\hat{p}_2\text{:}\) sample 2 proportion
\(\hat{p}\text{:}\) pooled sample proportion
Casio fx-9750GII: 2-proportion Z-test.
Navigate to STAT (MENU button, then hit the 2 button or select STAT).
Choose the TEST option (F3 button).
Choose the Z option (F1 button).
Choose the 2P option (F4 button).
Specify the test details:
Specify the sidedness of the test using the F1, F2, and F3 keys.
Enter the number of successes for each group, x1 and x2.
Enter the sample size for each group, n1 and n2.
Hit the EXE button, which returns:
z: Z-statistic
p: p-value
\(\hat{p}_1\text{,}\) \(\hat{p}_2\text{:}\) sample proportions
\(\hat{p}\text{:}\) pooled proportion
n1, n2: sample sizes
Checkpoint 6.2.10.
Use a calculator to find the test statistic, p-value, and pooled proportion for a test with: \(H_A\text{:}\) \(p\) for fish oil \(\lt p\) for placebo.^{ 4 }
Group  heart_attack  no_event  Total
fish_oil  145  12788  12933
placebo  200  12738  12938
z \(= -2.977\) and the p-value p \(= 0.00145\text{.}\) These two values match our calculated values from the previous example to within rounding error. The pooled proportion is given as \(\hat{p}\) \(= 0.0133\text{.}\) Note: values for x1 and x2 were given in the table. If, instead, proportions are given, find x1 and x2 by multiplying the proportions by the sample sizes and rounding the result to an integer.
Subsection 6.2.8 Section summary
In the previous section, we looked at inference for a single proportion. In this section, we compared two groups to each other with respect to a proportion or a percent.
We are interested in whether the true proportion of yeses is the same or different between two distinct groups. Call these proportions \(p_1\) and \(p_2\text{.}\) The difference, \(p_1 - p_2\text{,}\) tells us whether \(p_1\) is greater than, less than, or equal to \(p_2\text{.}\)
When comparing two proportions to each other, the parameter of interest is the difference of proportions, \(p_1 - p_2\text{,}\) and we use the difference of sample proportions, \(\hat{p}_1 - \hat{p}_2\text{,}\) as the point estimate.
The sampling distribution of \(\hat{p}_1 - \hat{p}_2\) is nearly normal when the success-failure condition is met for both groups and when the data are collected using 2 independent random samples or 2 randomly assigned treatments. When the sampling distribution of \(\hat{p}_1 - \hat{p}_2\) is nearly normal, the standardized test statistic also follows a normal distribution.
When the null hypothesis is that the two population proportions are equal to each other, use the pooled sample proportion \(\hat{p}_c=\frac{x_1+x_2}{n_1+n_2}\text{,}\) i.e. the combined number of yeses over the combined sample sizes, when verifying the success-failure condition and when finding the \(SE\text{.}\) For the confidence interval, do not use the pooled sample proportion; use the separate values of \(\hat{p}_1\) and \(\hat{p}_2\text{.}\)

When there are two samples or treatments and the parameter of interest is a difference of proportions, e.g. the true difference in proportion of 17 and 18 year olds with a summer job (proportion of 18 year olds \(-\) proportion of 17 year olds):
Estimate \(p_1 - p_2\) at the C% confidence level using a 2-proportion Z-interval.
Test \(H_0\text{:}\) \(p_1 - p_2=0\) (i.e. \(p_1=p_2\)) at the \(\alpha\) significance level using a 2-proportion Z-test.

Verify the conditions for using a normal model:
Data come from 2 independent random samples or 2 randomly assigned treatments.

CI: \(n_1\hat{p}_1\ge 10\text{,}\) \(n_1(1-\hat{p}_1)\ge 10\text{,}\) \(n_2\hat{p}_2\ge 10\text{,}\) and \(n_2(1-\hat{p}_2)\ge 10\)
Test: \(n_1\hat{p}_c\ge 10\text{,}\) \(n_1(1-\hat{p}_c)\ge 10\text{,}\) \(n_2\hat{p}_c\ge 10\text{,}\) and \(n_2(1-\hat{p}_c)\ge 10\)

When the conditions are met, we calculate the confidence interval and the test statistic using the same structure as in the previous section.
Confidence interval: \(\text{ point estimate } \ \pm\ z^{\star} \times SE\ \text{ of estimate }\)
Test statistic: \(Z = \frac{\text{ point estimate } - \text{ null value } }{SE \text{ of estimate } }\)
Here the point estimate is the difference of sample proportions \(\hat{p}_1 - \hat{p}_2\text{.}\)
The \(SE\) of estimate is the \(SE\) of a difference of sample proportions.
For a CI, use: \(SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\text{.}\)
For a Test, use: \(SE = \sqrt{\hat{p}_c(1-\hat{p}_c)}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\text{.}\)
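The two standard error formulas can be compared side by side on a single data set. The sketch below evaluates both on the healthcare survey numbers from this section; the function name `se_diff` and its `pooled` flag are hypothetical, introduced only for this comparison.

```python
import math


def se_diff(p1_hat, n1, p2_hat, n2, pooled):
    """SE of p1_hat - p2_hat: pooled form for a test, unpooled for a CI."""
    if pooled:
        # Test of H0: p1 = p2, so one common proportion is used for both groups.
        p_c = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
        return math.sqrt(p_c * (1 - p_c)) * math.sqrt(1 / n1 + 1 / n2)
    # Confidence interval: each group keeps its own sample proportion.
    return math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)


# Healthcare survey: approval proportions and sample sizes for each ordering.
se_unpooled = se_diff(0.47, 771, 0.34, 732, pooled=False)
se_pooled = se_diff(0.47, 771, 0.34, 732, pooled=True)

print(f"SE for a confidence interval (unpooled): {se_unpooled:.4f}")
print(f"SE for a test of p1 = p2 (pooled):       {se_pooled:.4f}")
```

The two values are close here but not identical, because the pooled form assumes the null hypothesis is true while the unpooled form does not.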
Exercises 6.2.9 Exercises
1. Social experiment, Part I.
A “social experiment” conducted by a TV program questioned what people do when they see a very obviously bruised woman getting picked on by her boyfriend. On two different occasions at the same restaurant, the same couple was depicted. In one scenario the woman was dressed “provocatively” and in the other scenario the woman was dressed “conservatively”. The table below shows how many restaurant diners were present under each scenario, and whether or not they intervened.
Intervene  Provocative scenario  Conservative scenario  Total
Yes  5  15  20
No  15  10  25
Total  20  25  45
Explain why the sampling distribution of the difference between the proportions of interventions under provocative and conservative scenarios does not follow an approximately normal distribution.
This is not a randomized experiment, and it is unclear whether people would be affected by the behavior of their peers. That is, independence may not hold. Additionally, there are only 5 interventions under the provocative scenario, so the success-failure condition does not hold. Even if we consider a hypothesis test where we pool the proportions, the success-failure condition will not be satisfied. Since one condition is questionable and the other is not satisfied, the difference in sample proportions will not follow a nearly normal distribution.
2. Heart transplant success.
The Stanford University Heart Transplant Study was conducted to determine whether an experimental heart transplant program increased lifespan. Each patient entering the program was officially designated a heart transplant candidate, meaning that he was gravely ill and might benefit from a new heart. Patients were randomly assigned into treatment and control groups. Patients in the treatment group received a transplant, and those in the control group did not. The table below displays how many patients survived and died in each group.^{ 5 }
control  treatment  
alive  4  24 
dead  30  45 
Suppose we are interested in estimating the difference in survival rate between the control and treatment groups using a confidence interval. Explain why we cannot construct such an interval using the normal approximation. What might go wrong if we constructed the confidence interval despite this problem?
3. Gender and color preference.
A study asked 1,924 male and 3,666 female undergraduate college students their favorite color. A 95% confidence interval for the difference between the proportions of males and females whose favorite color is black \((p_{male} - p_{female})\) was calculated to be (0.02, 0.06). Based on this information, determine if the following statements are true or false, and explain your reasoning for each statement you identify as false.^{ 6 }
We are 95% confident that the true proportion of males whose favorite color is black is 2% lower to 6% higher than the true proportion of females whose favorite color is black.
We are 95% confident that the true proportion of males whose favorite color is black is 2% to 6% higher than the true proportion of females whose favorite color is black.
95% of random samples will produce 95% confidence intervals that include the true difference between the population proportions of males and females whose favorite color is black.
We can conclude that there is a significant difference between the proportions of males and females whose favorite color is black and that the difference between the two sample proportions is too large to plausibly be due to chance.
The 95% confidence interval for \((p_{female} - p_{male})\) cannot be calculated with only the information given in this exercise.
(a) False. The entire confidence interval is above 0.
(b) True.
(c) True.
(d) True.
(e) False. It is simply the negated and reordered values: \((-0.06, -0.02)\text{.}\)
4. The Daily Show.
A Pew Research foundation poll indicates that among 1,099 college graduates, 33% watch The Daily Show. Meanwhile, 22% of the 1,110 people with a high school degree but no college degree in the poll watch The Daily Show. A 95% confidence interval for \((p_\text{ college grad } - p_\text{ HS or less } )\text{,}\) where \(p\) is the proportion of those who watch The Daily Show, is (0.07, 0.15). Based on this information, determine if the following statements are true or false, and explain your reasoning if you identify the statement as false. ^{ 7 }
At the 5% significance level, the data provide convincing evidence of a difference between the proportions of college graduates and those with a high school degree or less who watch The Daily Show.
We are 95% confident that 7% less to 15% more college graduates watch The Daily Show than those with a high school degree or less.
95% of random samples of 1,099 college graduates and 1,110 people with a high school degree or less will yield differences in sample proportions between 7% and 15%.
A 90% confidence interval for \((p_\text{ college grad } - p_\text{ HS or less } )\) would be wider.
A 95% confidence interval for \((p_\text{ HS or less } - p_\text{ college grad } )\) is (-0.15, -0.07).
5. National Health Plan, Part III.
Exercise 6.1.10.11 presents the results of a poll evaluating support for a generically branded “National Health Plan” in the United States. 79% of 347 Democrats and 55% of 617 Independents support a National Health Plan.
Calculate a 95% confidence interval for the difference between the proportion of Democrats and Independents who support a National Health Plan \((p_{D} - p_{I})\text{,}\) and interpret it in this context. We have already checked conditions for you.
True or false: If we had picked a random Democrat and a random Independent at the time of this poll, it is more likely that the Democrat would support the National Health Plan than the Independent.
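(a) Standard error: \(SE = \sqrt{\frac{0.79(1 - 0.79)}{347} + \frac{0.55(1 - 0.55)}{617}} = 0.03\text{.}\)
Using \(z^{*}=1.96\text{,}\) we get: \((0.79 - 0.55) \pm 1.96 \times 0.03 \rightarrow (0.181, 0.299)\text{.}\)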
We are 95% confident that the proportion of Democrats who support the plan is 18.1% to 29.9% higher than the proportion of Independents who support the plan.
(b) True.
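The interval in part (a) can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original solution: the function name `two_proportion_ci` and its parameters are illustrative assumptions. It uses the unpooled standard error, which is the appropriate choice for a confidence interval.

```python
import math

def two_proportion_ci(p1_hat, n1, p2_hat, n2, z_star=1.96):
    """Confidence interval for p1 - p2 (hypothetical helper for illustration).

    Uses the unpooled standard error, since no null hypothesis of
    equal proportions is being assumed.
    """
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_star * se
    return diff - margin, diff + margin, se

# Democrats: 79% of 347; Independents: 55% of 617
lower, upper, se = two_proportion_ci(0.79, 347, 0.55, 617)
print(f"SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the code carries the unrounded standard error (about 0.0297 rather than 0.03), its endpoints may differ from the ones quoted in the solution by roughly 0.001. The interpretation is unchanged.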
6. Sleep deprivation, CA vs. OR, Part I.
According to a report on sleep deprivation by the Centers for Disease Control and Prevention, the proportion of California residents who reported insufficient rest or sleep during each of the preceding 30 days is 8.0%, while this proportion is 8.8% for Oregon residents. These data are based on simple random samples of 11,545 California and 4,691 Oregon residents. Calculate a 95% confidence interval for the difference between the proportions of Californians and Oregonians who are sleep deprived and interpret it in the context of the data. ^{ 8 }
7. Offshore drilling, Part I.
A survey asked 827 randomly sampled registered voters in California “Do you support? Or do you oppose? Drilling for oil and natural gas off the Coast of California? Or do you not know enough to say?” Below is the distribution of responses, separated based on whether or not the respondent graduated from college.^{ 9 }
Response  College grad: Yes  College grad: No
Support  154  132
Oppose  180  126
Do not know  104  131
Total  438  389
What percent of college graduates and what percent of the non-college graduates in this sample do not know enough to have an opinion on drilling for oil and natural gas off the Coast of California?
Conduct a hypothesis test to determine if the data provide strong evidence that the proportion of college graduates who do not have an opinion on this issue is different than that of non-college graduates.
(a) College grads: 23.7%. Non-college grads: 33.7%.
(b) Let \(p_{CG}\) and \(p_{NCG}\) represent the proportions of college graduates and non-college graduates who responded “do not know”. \(H_{0} : p_{CG} = p_{NCG}\text{.}\) \(H_{A} : p_{CG} \ne p_{NCG}\text{.}\) Independence is satisfied (random sample), and the success-failure condition, which we would check using the pooled proportion \((\hat{p}_{pool} = 235/827 = 0.284)\text{,}\) is also satisfied. \(Z = -3.18 \rightarrow \text{p-value} = 0.0014\text{.}\) Since the p-value is very small, we reject \(H_{0}\text{.}\) The data provide strong evidence that the proportion of college graduates who do not have an opinion on this issue is different than that of non-college graduates. The data also indicate that fewer college grads say they “do not know” than non-college grads (i.e. the data indicate the direction after we reject \(H_{0}\)).
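The test in part (b) can be sketched in Python as follows. The helper name `pooled_two_proportion_test` and its return values are assumptions made for illustration. It uses the pooled proportion both for the success-failure check and for the standard error, because the null hypothesis states that the two proportions are equal.

```python
import math

def pooled_two_proportion_test(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 = p2 (hypothetical helper for illustration)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: all successes over all observations
    p_pool = (x1 + x2) / (n1 + n2)
    # Success-failure condition, checked with the pooled proportion
    expected = [n1 * p_pool, n1 * (1 - p_pool), n2 * p_pool, n2 * (1 - p_pool)]
    conditions_ok = min(expected) >= 10
    # Standard error under H0 uses the pooled proportion
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided p-value from the standard normal: 2 * P(Z > |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_pool, conditions_ok, z, p_value

# "Do not know" responses: 104 of 438 college grads, 131 of 389 non-college grads
p_pool, conditions_ok, z, p_value = pooled_two_proportion_test(104, 438, 131, 389)
print(f"pooled = {p_pool:.3f}, conditions met = {conditions_ok}")
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
```

Working from the raw counts without intermediate rounding gives a Z of about -3.16, slightly different from the rounded value reported in the solution. The p-value remains well below 0.05, so the conclusion is the same.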
8. Sleep deprivation, CA vs. OR, Part II.
Exercise 6.2.9.6 provides data on sleep deprivation rates of Californians and Oregonians. The proportion of California residents who reported insufficient rest or sleep during each of the preceding 30 days is 8.0%, while this proportion is 8.8% for Oregon residents. These data are based on simple random samples of 11,545 California and 4,691 Oregon residents.
Conduct a hypothesis test to determine if these data provide strong evidence that the rate of sleep deprivation is different for the two states. (Reminder: Check conditions.)
It is possible the conclusion of the test in part (a) is incorrect. If this is the case, what type of error was made?
9. Offshore drilling, Part II.
Results of a poll evaluating support for drilling for oil and natural gas off the coast of California were introduced in Exercise 6.2.9.7.
Response  College grad: Yes  College grad: No
Support  154  132
Oppose  180  126
Do not know  104  131
Total  438  389
What percent of college graduates and what percent of the non-college graduates in this sample support drilling for oil and natural gas off the Coast of California?
Conduct a hypothesis test to determine if the data provide strong evidence that the proportion of college graduates who support offshore drilling in California is different than that of non-college graduates.
(a) College grads: 35.2%. Non-college grads: 33.9%.
(b) Let \(p_{CG}\) and \(p_{NCG}\) represent the proportions of college graduates and non-college graduates who support offshore drilling. \(H_{0} : p_{CG} = p_{NCG}\text{.}\) \(H_{A} : p_{CG} \ne p_{NCG}\text{.}\) Independence is satisfied (random sample), and the success-failure condition, which we would check using the pooled proportion \((\hat{p}_{pool} = 286/827 = 0.346)\text{,}\) is also satisfied. \(Z = 0.39 \rightarrow \text{p-value} = 0.6966\text{.}\) Since the \(\text{p-value} > 0.05\text{,}\) we fail to reject \(H_{0}\text{.}\) The data do not provide strong evidence of a difference between the proportions of college graduates and non-college graduates who support offshore drilling in California.
10. Full body scan, Part I.
A news article reports that “Americans have differing views on two potentially inconvenient and invasive practices that airports could implement to uncover potential terrorist attacks.” This news piece was based on a survey conducted among a random sample of 1,137 adults nationwide, where one of the questions on the survey was “Some airports are now using ‘full-body’ digital x-ray machines to electronically screen passengers in airport security lines. Do you think these new x-ray machines should or should not be used at airports?” Below is a summary of responses based on party affiliation. ^{ 10 }
Party Affiliation
Answer  Republican  Democrat  Independent
Should  264  299  351
Should not  38  55  77
Don't know/No answer  16  15  22
Total  318  369  450
Conduct an appropriate hypothesis test evaluating whether there is a difference in the proportion of Republicans and Democrats who think the full body scans should be applied in airports. Assume that all relevant conditions are met.
The conclusion of the test in part (a) may be incorrect, meaning a testing error was made. If an error was made, was it a Type 1 or a Type 2 Error? Explain.
11. Sleep deprived transportation workers.
The National Sleep Foundation conducted a survey on the sleep habits of randomly sampled transportation workers and a control sample of non-transportation workers. The results of the survey are shown below. ^{ 11 }
Transportation Professionals (all groups except Control)
Hours of sleep  Control  Pilots  Truck Drivers  Train Operators  Bus/Taxi/Limo Drivers
Less than 6 hours of sleep  35  19  35  29  21
6 to 8 hours of sleep  193  132  117  119  131
More than 8 hours  64  51  51  32  58
Total  292  202  203  180  210
Conduct a hypothesis test to evaluate if these data provide evidence of a difference between the proportions of truck drivers and non-transportation workers (the control group) who get less than 6 hours of sleep per day, i.e. are considered sleep deprived.
Subscript \(_C\) means control group. Subscript \(_T\) means truck drivers. \(H_{0} : p_{C} = p_{T}\text{.}\) \(H_{A} : p_{C} \ne p_{T}\text{.}\) Independence is satisfied (random samples), as is the success-failure condition, which we would check using the pooled proportion \((\hat{p}_{pool} = 70/495 = 0.141)\text{.}\) \(Z = -1.65 \rightarrow \text{p-value} = 0.0989\text{.}\) Since the p-value is larger than 0.05 (the default significance level), we fail to reject \(H_{0}\text{.}\) The data do not provide strong evidence that the rates of sleep deprivation are different for non-transportation workers and truck drivers.
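For readers who want to trace the arithmetic behind this test statistic, the steps can be written out as below. This is a sketch of the standard pooled calculation using the counts from the table (35 of 292 in the control group, 35 of 203 truck drivers). The intermediate values are rounded, so the last digit may vary with how much precision is carried.

```latex
\begin{align*}
\hat{p}_C &= \frac{35}{292} \approx 0.120, \qquad \hat{p}_T = \frac{35}{203} \approx 0.172 \\
\hat{p}_{pool} &= \frac{35 + 35}{292 + 203} = \frac{70}{495} \approx 0.141 \\
SE &= \sqrt{\hat{p}_{pool}(1 - \hat{p}_{pool})\left(\frac{1}{292} + \frac{1}{203}\right)} \approx 0.0318 \\
Z &= \frac{\hat{p}_C - \hat{p}_T}{SE} \approx \frac{-0.0525}{0.0318} \approx -1.65
\end{align*}
```

The negative sign only reflects the order of subtraction, \(\hat{p}_C - \hat{p}_T\text{.}\) Because the alternative is two-sided, the p-value depends on \(|Z|\) alone.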
12. Prenatal vitamins and Autism.
Researchers studying the link between prenatal vitamin use and autism surveyed the mothers of a random sample of children aged 24-60 months with autism and conducted another separate random sample for children with typical development. The table below shows the number of mothers in each group who did and did not use prenatal vitamins during the three months before pregnancy (periconceptional period). ^{ 12 }
Periconceptional prenatal vitamin  Autism  Typical development  Total
No vitamin  111  70  181
Vitamin  143  159  302
Total  254  229  483
State appropriate hypotheses to test for independence of use of prenatal vitamins during the three months before pregnancy and autism.
Complete the hypothesis test and state an appropriate conclusion. (Reminder: Verify any necessary conditions for the test.)
A New York Times article reporting on this study was titled “Prenatal Vitamins May Ward Off Autism”. Do you find the title of this article to be appropriate? Explain your answer. Additionally, propose an alternative title. ^{ 13 }
13. HIV in sub-Saharan Africa.
In July 2008 the US National Institutes of Health announced that it was stopping a clinical study early because of unexpected results. The study population consisted of HIV-infected women in sub-Saharan Africa who had been given single dose Nevaripine (a treatment for HIV) while giving birth, to prevent transmission of HIV to the infant. The study was a randomized comparison of continued treatment of a woman (after successful childbirth) with Nevaripine vs Lopinavir, a second drug used to treat HIV. 240 women participated in the study; 120 were randomized to each of the two treatments. Twenty-four weeks after starting the study treatment, each woman was tested to determine if the HIV infection was becoming worse (an outcome called virologic failure). Twenty-six of the 120 women treated with Nevaripine experienced virologic failure, while 10 of the 120 women treated with the other drug experienced virologic failure. ^{ 14 }
Create a two-way table presenting the results of this study.
State appropriate hypotheses to test for difference in virologic failure rates between treatment groups.
Complete the hypothesis test and state an appropriate conclusion. (Reminder: Verify any necessary conditions for the test.)
(a) Summary of the study:
Treatment  Virol. failure: Yes  Virol. failure: No  Total
Nevaripine  26  94  120
Lopinavir  10  110  120
Total  36  204  240
(b) \(H_{0} : p_{N} = p_{L}\text{.}\) There is no difference in virologic failure rates between the Nevaripine and Lopinavir groups. \(H_{A} : p_{N} \ne p_{L}\text{.}\) There is some difference in virologic failure rates between the Nevaripine and Lopinavir groups.
(c) Random assignment was used, so the observations in each group are independent. If the patients in the study are representative of those in the general population (something impossible to check with the given information), then we can also confidently generalize the findings to the population. The success-failure condition, which we would check using the pooled proportion \((\hat{p}_{pool} = 36/240 = 0.15)\text{,}\) is satisfied. \(Z = 2.89 \rightarrow \text{p-value} = 0.0039\text{.}\) Since the p-value is low, we reject \(H_{0}\text{.}\) There is strong evidence of a difference in virologic failure rates between the Nevaripine and Lopinavir groups. Treatment and virologic failure do not appear to be independent.
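The success-failure check and the test statistic in part (c) can be verified by hand. The following is a sketch of the standard pooled calculation using the counts from the two-way table, with intermediate values rounded.

```latex
\begin{align*}
\hat{p}_{pool} &= \frac{26 + 10}{120 + 120} = \frac{36}{240} = 0.15 \\
120 \times 0.15 &= 18 \ge 10, \qquad 120 \times (1 - 0.15) = 102 \ge 10 \\
SE &= \sqrt{0.15 \times 0.85 \times \left(\frac{1}{120} + \frac{1}{120}\right)} \approx 0.0461 \\
Z &= \frac{26/120 - 10/120}{SE} \approx \frac{0.1333}{0.0461} \approx 2.89
\end{align*}
```

Both groups have 120 women, so the expected counts of 18 failures and 102 non-failures apply to each group, and all four are at least 10.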
14. An apple a day keeps the doctor away.
A physical education teacher at a high school wanting to increase awareness on issues of nutrition and health asked her students at the beginning of the semester whether they believed the expression “an apple a day keeps the doctor away”, and 40% of the students responded yes. Throughout the semester she started each class with a brief discussion of a study highlighting positive effects of eating more fruits and vegetables. She conducted the same apple-a-day survey at the end of the semester, and this time 60% of the students responded yes. Can she use a two-proportion method from this section for this analysis? Explain your reasoning.