######
1 Vegetarian college students

Suppose that 8% of college students are vegetarians. Determine if the following statements are true or false, and explain your reasoning.

- The distribution of the sample proportions of vegetarians in random samples of size 60 is approximately normal since \(n \ge 30\text{.}\) Answer
False. The success-failure condition is not satisfied: we expect only \(60 \times 0.08 = 4.8\) vegetarians per sample, which is fewer than 10.

- The distribution of the sample proportions of vegetarian college students in random samples of size 50 is right skewed. Answer
True. The success-failure condition is not satisfied. In most samples we would expect \(\hat{p}\) to be close to 0.08, the true population proportion. While \(\hat{p}\) can be much above 0.08, it is bounded below by 0, suggesting it would take on a right skewed shape. Plotting the sampling distribution would confirm this suspicion.

- A random sample of 125 college students where 12% are vegetarians would be considered unusual. Answer
False. \(SE_{\hat{p}} = 0.0243\text{,}\) and \(\hat{p} = 0.12\) is only \(\frac{0.12 - 0.08}{0.0243} = 1.65\) SEs away from the mean, which would not be considered unusual.

- A random sample of 250 college students where 12% are vegetarians would be considered unusual. Answer
True. \(SE_{\hat{p}} = 0.0172\text{,}\) and \(\hat{p}=0.12\) is \(\frac{0.12 - 0.08}{0.0172} = 2.33\) standard errors away from the mean, which is often considered unusual.

- The standard error would be reduced by one-half if we increased the sample size from 125 to 250. Answer
False. Doubling the sample size decreases the SE by a factor of \(1/\sqrt{2}\text{,}\) not by one-half.
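The arithmetic behind these answers can be checked numerically. The following is a minimal Python sketch and not part of the original exercise; the helper names `standard_error` and `success_failure_ok` are hypothetical, and the threshold of 10 is the usual success-failure rule of thumb.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def success_failure_ok(p, n, threshold=10):
    """True when expected successes and failures both reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

p = 0.08  # population proportion of vegetarians given in the exercise

# Success-failure check and SE for each sample size mentioned above.
for n in (50, 60, 125, 250):
    print(n, success_failure_ok(p, n), round(standard_error(p, n), 4))

# How many standard errors is a sample proportion of 0.12 from 0.08?
z_125 = (0.12 - p) / standard_error(p, 125)
z_250 = (0.12 - p) / standard_error(p, 250)
print(round(z_125, 2), round(z_250, 2))

# Doubling n multiplies the SE by 1/sqrt(2), not by 1/2.
ratio = standard_error(p, 250) / standard_error(p, 125)
print(round(ratio, 4))
```

The last ratio does not depend on \(p\), since \(p(1-p)\) cancels when the two standard errors are divided.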

######
2 Young Americans, Part I

About 77% of young adults think they can achieve the American dream. Determine if the following statements are true or false, and explain your reasoning.^{ 1 }

The distribution of sample proportions of young Americans who think they can achieve the American dream in samples of size 20 is left skewed.

The distribution of sample proportions of young Americans who think they can achieve the American dream in random samples of size 40 is approximately normal since \(n \ge 30\text{.}\)

A random sample of 60 young Americans where 85% think they can achieve the American dream would be considered unusual.

A random sample of 120 young Americans where 85% think they can achieve the American dream would be considered unusual.

######
3 Orange tabbies

Suppose that 90% of orange tabby cats are male. Determine if the following statements are true or false, and explain your reasoning.

- The distribution of sample proportions of random samples of size 30 is left skewed. Answer
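True. The success-failure condition is not satisfied (we expect only \(30 \times 0.10 = 3\) female cats per sample), and \(\hat{p}\) is bounded above by 1 while most sample proportions fall near 0.90. See the reasoning of 6.1(b).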

- Using a sample size that is 4 times as large will reduce the standard error of the sample proportion by one-half. Answer
True. We take the square root of the sample size in the SE formula.

- The distribution of sample proportions of random samples of size 140 is approximately normal. Answer
True. The independence and success-failure conditions are satisfied.

- The distribution of sample proportions of random samples of size 280 is approximately normal. Answer
True. The independence and success-failure conditions are satisfied.
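One way to see the skew claimed in the first part, and how it fades at the larger sample sizes, is to simulate the sampling distribution. The sketch below is an illustration under stated assumptions rather than the textbook's method: the function names are hypothetical, the seed and the 5,000 repetitions are arbitrary choices, and skewness is measured with the simple moment-based formula.

```python
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` random samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

def skewness(values):
    """Moment-based skewness: the average cubed z-score."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

p_male = 0.90  # proportion of male orange tabbies given in the exercise

for n in (30, 140, 280):
    props = simulate_sample_proportions(p_male, n, reps=5000)
    print(f"n={n}: expected females={n * (1 - p_male):.0f}, "
          f"skewness={skewness(props):.2f}")
```

A negative skewness indicates a longer left tail. The value should move toward 0 as \(n\) grows, which is consistent with the normal approximation becoming reasonable once the success-failure condition holds.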

######
4 Young Americans, Part II

About 25% of young Americans have delayed starting a family due to the continued economic slump. Determine if the following statements are true or false, and explain your reasoning.^{ 2 }

The distribution of sample proportions of young Americans who have delayed starting a family due to the continued economic slump in random samples of size 12 is right skewed.

In order for the distribution of sample proportions of young Americans who have delayed starting a family due to the continued economic slump to be approximately normal, we need random samples where the sample size is at least 40.

A random sample of 50 young Americans where 20% have delayed starting a family due to the continued economic slump would be considered unusual.

A random sample of 150 young Americans where 20% have delayed starting a family due to the continued economic slump would be considered unusual.

Tripling the sample size will reduce the standard error of the sample proportion by one-third.

######
5 Prop 19 in California

In a 2010 Survey USA poll, 70% of the 119 respondents between the ages of 18 and 34 said they would vote in the 2010 general election for Prop 19, which would change California law to legalize marijuana and allow it to be regulated and taxed. At a 95% confidence level, this sample has an 8% margin of error. Based on this information, determine if the following statements are true or false, and explain your reasoning.^{ 3 }

- We are 95% confident that between 62% and 78% of the California voters in this sample support Prop 19. Answer
False. A confidence interval is constructed to estimate the population proportion, not the sample proportion.

- We are 95% confident that between 62% and 78% of all California voters between the ages of 18 and 34 support Prop 19. Answer
True. 95% CI: \(70\% \pm 8\%\text{.}\)

- If we considered many random samples of 119 California voters between the ages of 18 and 34, and we calculated 95% confidence intervals for each, 95% of them will include the true population proportion of 18-34 year old Californians who support Prop 19. Answer
True. By the definition of the confidence level.

- In order to decrease the margin of error to 4%, we would need to quadruple (multiply by 4) the sample size. Answer
True. Quadrupling the sample size decreases the SE and ME by a factor of \(1/\sqrt{4}\text{.}\)

- Based on this confidence interval, there is sufficient evidence to conclude that a majority of California voters between the ages of 18 and 34 support Prop 19. Answer
True. The 95% CI is entirely above 50%.
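The interval, the effect of quadrupling the sample size, and the "95% of intervals" interpretation can all be explored with a short simulation. This is a hedged sketch: the function names are hypothetical, and the true proportion of 0.70 used in the coverage check is an assumption made only for illustration, since the real population value is unknown.

```python
import math
import random

Z_STAR = 1.96  # critical value for a 95% confidence level

def wald_interval(p_hat, n, z_star=Z_STAR):
    """Point estimate plus or minus z* times the standard error."""
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(p_true, n, reps, seed=0):
    """Fraction of simulated intervals that contain p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        p_hat = sum(rng.random() < p_true for _ in range(n)) / n
        low, high = wald_interval(p_hat, n)
        hits += low <= p_true <= high
    return hits / reps

# The poll's own interval: 70% of 119 respondents.
low, high = wald_interval(0.70, 119)
margin_119 = (high - low) / 2
print(round(low, 3), round(high, 3), round(margin_119, 3))

# Quadrupling the sample size halves the margin of error.
low_4x, high_4x = wald_interval(0.70, 4 * 119)
margin_476 = (high_4x - low_4x) / 2
print(round(margin_476, 3))

# ASSUMPTION for illustration only: pretend the true proportion is 0.70.
simulated_coverage = coverage(p_true=0.70, n=119, reps=2000)
print(simulated_coverage)
```

The simulated coverage should land near, though not exactly at, 0.95, because the interval relies on a normal approximation and the simulation itself has sampling noise.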

######
6 2010 Healthcare Law

On June 28, 2012 the U.S. Supreme Court upheld the much debated 2010 healthcare law, declaring it constitutional. A Gallup poll released the day after this decision indicates that 46% of 1,012 Americans agree with this decision. At a 95% confidence level, this sample has a 3% margin of error. Based on this information, determine if the following statements are true or false, and explain your reasoning.^{ 4 }

We are 95% confident that between 43% and 49% of Americans in this sample support the decision of the U.S. Supreme Court on the 2010 healthcare law.

We are 95% confident that between 43% and 49% of Americans support the decision of the U.S. Supreme Court on the 2010 healthcare law.

If we considered many random samples of 1,012 Americans, and we calculated the sample proportions of those who support the decision of the U.S. Supreme Court, 95% of those sample proportions will be between 43% and 49%.

The margin of error at a 90% confidence level would be higher than 3%.

######
7 Fireworks on July \(4^\text{th}\)

In late June 2012, Survey USA published results of a survey stating that 56% of the 600 randomly sampled Kansas residents planned to set off fireworks on July \(4^\text{th}\text{.}\) Determine the margin of error for the 56% point estimate using a 95% confidence level.^{ 5 }

Answer
With a random sample from \(\lt 10\%\) of the population, independence is satisfied. The success-failure condition is also satisfied. \(ME = z^{\star} \sqrt{ \frac{\hat{p} (1-\hat{p})} {n} } = 1.96 \sqrt{ \frac{0.56 \times 0.44}{600} }= 0.0397 \approx 4\%\)
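The same margin of error can be computed in a few lines of Python. This is a minimal sketch with a hypothetical function name, `margin_of_error`; it obtains \(z^{\star}\) from the standard library's `statistics.NormalDist` rather than a table, so the critical value is about 1.95996 instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence=0.95):
    """z* times the standard error of a sample proportion."""
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

me = margin_of_error(0.56, 600)
print(f"{me:.4f}")  # roughly 0.0397, i.e. about 4%
```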

######
8 Elderly drivers

In January 2011, The Marist Poll published a report stating that 66% of adults nationally think licensed drivers should be required to retake their road test once they reach 65 years of age. It was also reported that interviews were conducted on 1,018 American adults, and that the margin of error was 3% using a 95% confidence level.^{ 6 }

Verify the margin of error reported by The Marist Poll.

Based on a 95% confidence interval, does the poll provide convincing evidence that *more than* 70% of the population think that licensed drivers should be required to retake their road test once they turn 65?

######
9 Life after college

We are interested in estimating the proportion of graduates at a mid-sized university who found a job within one year of completing their undergraduate degree. Suppose we conduct a survey and find out that 348 of the 400 randomly sampled graduates found jobs. The graduating class under consideration included over 4500 students.

- Describe the population parameter of interest. What is the value of the point estimate of this parameter? Answer
Proportion of graduates from this university who found a job within one year of graduating. \(\hat{p} = 348/400 = 0.87\text{.}\)

- Check if the conditions for constructing a confidence interval based on these data are met. Answer
This is a random sample from less than 10% of the population, so the observations are independent. Success-failure condition is satisfied: 348 successes, 52 failures, both well above 10.

- Calculate a 95% confidence interval for the proportion of graduates who found a job within one year of completing their undergraduate degree at this university, and interpret it in the context of the data. Answer
(0.8371, 0.9029). We are 95% confident that approximately 84% to 90% of graduates from this university found a job within one year of completing their undergraduate degree.

- What does “95% confidence” mean? Answer
95% of such random samples would produce a 95% confidence interval that includes the true proportion of students at this university who found a job within one year of graduating from college.

- Now calculate a 99% confidence interval for the same parameter and interpret it in the context of the data. Answer
(0.8267, 0.9133). Similar interpretation as before.

- Compare the widths of the 95% and 99% confidence intervals. Which one is wider? Explain. Answer
99% CI is wider, as we are more confident that the true proportion is within the interval and so need to cover a wider range.
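The condition checks and both intervals can be reproduced as follows. This is a sketch, not the textbook's own code: `proportion_ci` is a hypothetical helper, and 4,500 is used as a lower bound for the class size because the exercise only says "over 4500". Results may differ from the answers above in the fourth decimal place, since the answers use rounded values of \(z^{\star}\) and the standard error.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a single proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

successes, n = 348, 400
class_size_lower_bound = 4500  # "over 4500 students" in the exercise

# Conditions: sample is under 10% of the population; at least 10 successes and failures.
print("10% condition:", n < 0.10 * class_size_lower_bound)
print("success-failure:", successes >= 10 and n - successes >= 10)

ci_95 = proportion_ci(successes, n, 0.95)
ci_99 = proportion_ci(successes, n, 0.99)
for label, (low, high) in (("95%", ci_95), ("99%", ci_99)):
    print(f"{label} CI: ({low:.4f}, {high:.4f}), width {high - low:.4f}")
```

Only \(z^{\star}\) changes between the two calls, which is why the 99% interval is wider around the same point estimate.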

######
10 Life rating in Greece

Greece has faced a severe economic crisis since the end of 2009. A Gallup poll surveyed 1,000 randomly sampled Greeks in 2011 and found that 25% of them said they would rate their lives poorly enough to be considered “suffering”.^{ 7 }

Describe the population parameter of interest. What is the value of the point estimate of this parameter?

Check if the conditions required for constructing a confidence interval based on these data are met.

Construct a 95% confidence interval for the proportion of Greeks who are “suffering”.

Without doing any calculations, describe what would happen to the confidence interval if we decided to use a higher confidence level.

Without doing any calculations, describe what would happen to the confidence interval if we used a larger sample.

######
11 Study abroad

A survey of 1,509 high school seniors who took the SAT and who completed an optional web survey between April 25 and April 30, 2007 shows that 55% of high school seniors are fairly certain that they will participate in a study abroad program in college.^{ 8 }

- Is this sample a representative sample from the population of all high school seniors in the US? Explain your reasoning. Answer
No. The sample only represents students who took the SAT, and this was also an online survey.

- Let's suppose the conditions for inference are met. Even if your answer to part (a) indicated that this approach would not be reliable, this analysis may still be interesting to carry out (though not report). Construct a 90% confidence interval for the proportion of high school seniors (of those who took the SAT) who are fairly certain they will participate in a study abroad program in college, and interpret this interval in context. Answer
(0.5289, 0.5711). We are 90% confident that 53% to 57% of high school seniors who took the SAT are fairly certain that they will participate in a study abroad program in college.

- What does “90% confidence” mean? Answer
90% of such random samples would produce a 90% confidence interval that includes the true proportion.

- Based on this interval, would it be appropriate to claim that the majority of high school seniors are fairly certain that they will participate in a study abroad program in college? Answer
Yes. The interval lies entirely above 50%.

######
12 Legalization of marijuana, Part I

The 2010 General Social Survey asked 1,259 US residents: “Do you think the use of marijuana should be made legal, or not?” 48% of the respondents said it should be made legal.^{ 9 }

Is 48% a sample statistic or a population parameter? Explain.

Construct a 95% confidence interval for the proportion of US residents who think marijuana should be made legal, and interpret it in the context of the data.

A critic points out that this 95% confidence interval is only accurate if the statistic follows a normal distribution, or if the normal model is a good approximation. Is this true for these data? Explain.

A news piece on this survey's findings states, “Majority of Americans think marijuana should be legalized.” Based on your confidence interval, is this news piece's statement justified?

######
13 Public option, Part I

A *Washington Post* article from 2009 reported that “support for a government-run health-care plan to compete with private insurers has rebounded from its summertime lows and wins clear majority support from the public.” More specifically, the article says “seven in 10 Democrats back the plan, while almost nine in 10 Republicans oppose it. Independents divide 52 percent against, 42 percent in favor of the legislation.” (6% responded with “other”.) There were 819 Democrats, 566 Republicans and 783 Independents surveyed.^{ 10 }

- A political pundit on TV claims that a majority of Independents oppose the health care public option plan. Do these data provide strong evidence to support this statement? Answer
This is an appropriate setting for a hypothesis test. \(H_0: p = 0.50\text{.}\) \(H_A: p \gt 0.50\text{.}\) Both independence and the success-failure condition are satisfied. \(Z=1.12\) \(\to\) p-value \(= 0.1314\text{.}\) Since the p-value \(\gt \alpha=0.05\text{,}\) we fail to reject \(H_0\text{.}\) The data do not provide strong evidence that more than half of all Independents oppose the public option plan.

- Would you expect a confidence interval for the proportion of Independents who oppose the public option plan to include 0.5? Explain. Answer
Yes, since we did not reject \(H_0\) in part (a).
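The test statistic and p-value in part (a), and the interval asked about in part (b), can be sketched in Python. The function name `one_proportion_z_test` is hypothetical, and the 95% level for the interval is an assumption, since part (b) does not name a confidence level.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(p_hat, n, p_null):
    """Return (Z, upper-tail p-value) for H0: p = p_null vs HA: p > p_null."""
    se_null = sqrt(p_null * (1 - p_null) / n)  # the SE uses the null value
    z = (p_hat - p_null) / se_null
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

z, p_value = one_proportion_z_test(p_hat=0.52, n=783, p_null=0.50)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")

# Part (b): a confidence interval uses the SE computed from p_hat instead.
# ASSUMPTION: 95% level with z* = 1.96.
margin = 1.96 * sqrt(0.52 * (1 - 0.52) / 783)
ci_low, ci_high = 0.52 - margin, 0.52 + margin
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
```

The p-value may differ from the tabled 0.1314 in the fourth decimal place because \(Z\) is not rounded to 1.12 before the lookup.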

######
14 The Civil War

A national survey conducted in 2011 among a simple random sample of 1,507 adults shows that 56% of Americans think the Civil War is still relevant to American politics and political life.^{ 11 }

Conduct a hypothesis test to determine if these data provide strong evidence that the majority of the Americans think the Civil War is still relevant.

Interpret the p-value in this context.

Calculate a 90% confidence interval for the proportion of Americans who think the Civil War is still relevant. Interpret the interval in this context, and comment on whether or not the confidence interval agrees with the conclusion of the hypothesis test.

######
15 Browsing on the mobile device

A 2012 survey of 2,254 American adults indicates that 17% of cell phone owners do their browsing on their phone rather than a computer or other device.^{ 12 }

- According to an online article, a report from a mobile research company indicates that 38 percent of Chinese mobile web users only access the internet through their cell phones.^{ 13 } Conduct a hypothesis test to determine if these data provide strong evidence that the proportion of Americans who only use their cell phones to access the internet is different than the Chinese proportion of 38%. Answer
\(H_0: p = 0.38\text{.}\) \(H_A: p \ne 0.38\text{.}\) Independence (random sample, \(\lt 10\%\) of population) and the success-failure condition are satisfied. \(Z=-20.5\) \(\to\) p-value \(\approx 0\text{.}\) Since the p-value is very small, we reject \(H_0\text{.}\) The data provide strong evidence that the proportion of Americans who only use their cell phones to access the internet is different than the Chinese proportion of 38%, and the data indicate that the proportion is lower in the US.

- Interpret the p-value in this context. Answer
If in fact 38% of Americans used their cell phones as a primary access point to the internet, the probability of obtaining a random sample of 2,254 Americans where 17% or less or 59% or more use only their cell phones to access the internet would be approximately 0.

- Calculate a 95% confidence interval for the proportion of Americans who access the internet on their cell phones, and interpret the interval in this context. Answer
(0.1545, 0.1855). We are 95% confident that approximately 15.5% to 18.6% of all Americans primarily use their cell phones to browse the internet.

######
16 Is college worth it? Part I

Among a simple random sample of 331 American adults who do not have a four-year college degree and are not currently enrolled in school, 48% said they decided not to go to college because they could not afford school.^{ 14 }

A newspaper article states that only a minority of the Americans who decide not to go to college do so because they cannot afford it and uses the point estimate from this survey as evidence. Conduct a hypothesis test to determine if these data provide strong evidence supporting this statement.

Would you expect a confidence interval for the proportion of American adults who decide not to go to college because they cannot afford it to include 0.5? Explain.

######
17 Taste test

Some people claim that they can tell the difference between a diet soda and a regular soda in the first sip. A researcher wanting to test this claim randomly sampled 80 such people. He then filled 80 plain white cups with soda, half diet and half regular through random assignment, and asked each person to take one sip from their cup and identify the soda as diet or regular. 53 participants correctly identified the soda.

- Do these data provide strong evidence that these people are able to detect the difference between diet and regular soda, in other words, are the results significantly better than just random guessing? Answer
\(H_0: p = 0.5\text{.}\) \(H_A: p \gt 0.5\text{.}\) Independence (random sample, \(\lt 10\%\) of population) is satisfied, as is the success-failure condition (using \(p_0 = 0.5\text{,}\) we expect 40 successes and 40 failures). \(Z = 2.91\) \(\to\) p-value \(= 0.0018\text{.}\) Since the p-value \(\lt 0.05\text{,}\) we reject the null hypothesis. The data provide strong evidence that the rate of correctly identifying a soda for these people is significantly better than just by random guessing.

- Interpret the p-value in this context. Answer
If in fact people cannot tell the difference between diet and regular soda and they randomly guess, the probability of getting a random sample of 80 people where 53 or more identify a soda correctly would be 0.0018.
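The p-value being interpreted here is a tail probability, which can be computed directly. The sketch below reproduces the normal-approximation value used in the answer and, purely as a cross-check that is not part of the original solution, also sums the exact binomial tail \(P(X \ge 53)\) under \(H_0\text{.}\) The variable names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, correct, p_null = 80, 53, 0.5

# Normal approximation, matching the approach in the answer.
p_hat = correct / n
z = (p_hat - p_null) / sqrt(p_null * (1 - p_null) / n)
p_normal = 1 - NormalDist().cdf(z)

# Exact binomial tail under H0: probability of 53 or more correct out of 80.
# This is a swapped-in cross-check, not the method the answer uses.
p_exact = sum(
    comb(n, k) * p_null ** k * (1 - p_null) ** (n - k)
    for k in range(correct, n + 1)
)

print(f"Z = {z:.2f}, normal p-value = {p_normal:.4f}, exact p-value = {p_exact:.4f}")
```

The two p-values need not agree exactly, since the normal model is only an approximation to the binomial count, but both are well below 0.05 and lead to the same conclusion.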

######
18 Is college worth it? Part II

Exercise 6.5.16 presents the results of a poll where 48% of 331 Americans who decide to not go to college do so because they cannot afford it.

Calculate a 90% confidence interval for the proportion of Americans who decide to not go to college because they cannot afford it, and interpret the interval in context.

Suppose we wanted the margin of error for the 90% confidence level to be about 1.5%. How large of a survey would you recommend?

######
19 College smokers

We are interested in estimating the proportion of students at a university who smoke. Out of a random sample of 200 students from this university, 40 students smoke.

- Calculate a 95% confidence interval for the proportion of students at this university who smoke, and interpret this interval in context. (Reminder: check conditions) Answer
Independence is satisfied (random sample from \(\lt 10\%\) of the population), as is the success-failure condition (40 smokers, 160 non-smokers). The 95% CI: (0.145, 0.255). We are 95% confident that 14.5% to 25.5% of all students at this university smoke.

- If we wanted the margin of error to be no larger than 2% at a 95% confidence level for the proportion of students who smoke, how big of a sample would we need? Answer
We want \(z^{\star}SE\) to be no larger than 0.02 for a 95% confidence level. We use \(z^{\star}=1.96\) and plug in the point estimate \(\hat{p}=0.2\) within the SE formula: \(1.96\sqrt{0.2(1-0.2)/n} \leq 0.02\text{.}\) The sample size \(n\) should be at least 1,537.
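Solving the inequality for \(n\) gives \(n \ge (z^{\star})^2\,\hat{p}(1-\hat{p})/ME^2\text{,}\) rounded up to the next whole number. A minimal Python sketch of that calculation follows; the function names `required_sample_size` and `margin_of_error` are hypothetical.

```python
from math import ceil, sqrt

def required_sample_size(p_guess, margin, z_star):
    """Smallest n satisfying z* * sqrt(p(1-p)/n) <= margin."""
    return ceil(z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2)

def margin_of_error(p_hat, n, z_star):
    """z* times the standard error of a sample proportion."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

n_needed = required_sample_size(p_guess=0.2, margin=0.02, z_star=1.96)
print(n_needed)  # 1537

# Check: n_needed meets the target, while one fewer respondent does not.
print(margin_of_error(0.2, n_needed, 1.96) <= 0.02)
print(margin_of_error(0.2, n_needed - 1, 1.96) <= 0.02)
```

Rounding up rather than to the nearest integer matters here, because any smaller sample would leave the margin of error slightly above the 2% target.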

######
20 Legalize Marijuana, Part II

As discussed in Exercise 6.5.12, the 2010 General Social Survey reported a sample where about 48% of US residents thought marijuana should be made legal. If we wanted to limit the margin of error of a 95% confidence interval to 2%, about how many Americans would we need to survey?

######
21 Public option, Part II

Exercise 6.5.13 presents the results of a poll evaluating support for the health care public option in 2009, reporting that 52% of Independents in the sample opposed the public option. If we wanted to estimate this number to within 1% with 90% confidence, what would be an appropriate sample size?

Answer
The margin of error, which is computed as \(z^{\star}SE\text{,}\) must be smaller than 0.01 for a 90% confidence level. We use \(z^{\star} = 1.65\) for a 90% confidence level, and we can use the point estimate \(\hat{p}=0.52\) in the formula for \(SE\text{.}\) \(1.65\sqrt{0.52(1-0.52)/n} \leq 0.01\text{.}\) Therefore, the sample size \(n\) must be at least 6,796.

######
22 Acetaminophen and liver damage

It is believed that large doses of acetaminophen (the active ingredient in over the counter pain relievers like Tylenol) may cause damage to the liver. A researcher wants to conduct a study to estimate the proportion of acetaminophen users who have liver damage. For participating in this study, he will pay each subject $20 and provide a free medical consultation if the patient has liver damage.

If he wants to limit the margin of error of his 98% confidence interval to 2%, what is the minimum amount of money he needs to set aside to pay his subjects?

The amount you calculated in part (a) is substantially over his budget so he decides to use fewer subjects. How will this affect the width of his confidence interval?