## Section 3.2 Describing Data

Once we have collected data from an observational study or an experiment, we need to summarize and present it in a way that will be meaningful to our audience. The raw data is not very useful by itself. In this section we begin with graphical presentations of data; in the rest of the chapter we will learn about numerical summaries of data.

### Subsection 3.2.1 Types of Data

There are two types of data: categorical data and quantitative data.

Categorical (qualitative) data are pieces of information that allow us to classify the subjects into various categories.

###### Example 3.2.2.

We might conduct a survey asking people to name the favorite movie that they saw in a movie theater. The responses would look like: Finding Nemo, Black Panther, Titanic, etc.

We can count the number of people who give each answer, but the answers themselves do not have any numerical values: we cannot perform computations with an answer like “Black Panther” because it is categorical data.
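
As a minimal sketch of what "counting the number of people who give each answer" means, the Python snippet below tallies a small list of responses. The `responses` list and the `frequency_table` helper are hypothetical, made up for illustration; they are not data from an actual survey.

```python
from collections import Counter

# Hypothetical survey responses (invented for illustration only).
responses = [
    "Finding Nemo", "Black Panther", "Titanic",
    "Black Panther", "Titanic", "Black Panther",
]

def frequency_table(values):
    """Return a dict mapping each category to the number of times it occurs."""
    return dict(Counter(values))

counts = frequency_table(responses)
for movie, n in sorted(counts.items()):
    print(f"{movie}: {n}")
```

Note that the only arithmetic happens on the counts; the movie titles themselves are just labels.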

Quantitative data are responses that are numerical in nature and with which we can perform meaningful calculations.

###### Example 3.2.3.

A survey could ask the number of movies you have seen in a movie theater in the past 12 months (0, 1, 2, 3, 4, ...). This would be quantitative data.
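
To illustrate a "meaningful calculation" on quantitative data, the short Python sketch below averages a handful of responses. The `movies_seen` values are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical answers to "how many movies have you seen in a theater
# in the past 12 months?" (invented for illustration only).
movies_seen = [0, 1, 2, 3, 4, 2]

# Because the responses are numbers, averaging them makes sense.
average = mean(movies_seen)
print(f"Average number of movies seen: {average}")
```

The same computation would be meaningless for categorical answers such as movie titles.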

###### 12.

Habiba categorized the amount of time spent each week into 5 categories: Work, Travel, Housework, Leisure, and Sleep. If there are a total of 168 hours each week, how many hours does Habiba spend travelling each week?

###### 13.

In a survey, 1012 adults were asked whether they personally worried about a variety of environmental concerns. The number of people who indicated that they worried “a great deal” about some selected concerns is listed below.

1. Is this categorical or quantitative data?

2. Make a bar chart for this data.

3. Why can’t we make a pie chart for this data?

| Environmental Issue | Frequency |
| --- | --- |
| Pollution of drinking water | 597 |
| Contamination of soil and water by toxic waste | 526 |
| Air pollution | 455 |
| Global warming | 354 |

Source: Gallup Poll, March 5-8, 2009. http://www.pollingreport.com/enviro.htm

###### 14.

In a survey, 2056 adults were asked about their views on immigration. The percentages of people who responded that immigrants to the United States are making each of the following situations in the country better are listed below.

1. Is this categorical or quantitative data?

2. Make a relative frequency bar chart for this data.

3. Can we make a pie chart for this data?

| Situation | Relative Frequency (%) |
| --- | --- |
| Food, music and the arts | 57 |
| The economy in general | 43 |
| Social and moral values | 31 |
| Job opportunities for | 19 |
| Taxes | 20 |
| Crime | 7 |

###### 15.

The following table comes from a sample of 500 households in Oregon that were asked about the primary source of heating in their home.

1. How many of the households heat their home with firewood?

2. What percent of households heat their home with natural gas?

| Type of Heat | Relative Frequency (%) |
| --- | --- |
| Electricity | 33 |
| Heating Oil | 4 |
| Natural Gas | 50 |
| Firewood | 8 |
| Other | 5 |

###### 16.

The following table is from a sample of 50 undergraduate students at Portland State University.

1. What percent of the sampled students are below senior class?

2. How many of the sampled students are freshmen?

| Class | Relative Frequency (%) |
| --- | --- |
| Freshman | 18 |
| Sophomore | 13 |
| Junior | 23 |
| Senior | 46 |

###### 17.

1. Is this categorical or quantitative data?

2. Make a relative frequency table for the data.

3. Make a bar chart for the data.

4. Make a pie chart for the data.

 1 1 4 2 2 1 2 3 3 1 4 2 2 2 1 3 2 2 1 2 1 1 1 2
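
One possible way to check a hand-built relative frequency table for the values above is sketched below in Python. The helper name `relative_frequency_table` is hypothetical; the approach is simply to count each distinct value and divide by the number of observations.

```python
from collections import Counter

# The 24 values listed in this exercise.
data = [1, 1, 4, 2, 2, 1, 2, 3, 3, 1, 4, 2,
        2, 2, 1, 3, 2, 2, 1, 2, 1, 1, 1, 2]

def relative_frequency_table(values):
    """Map each distinct value to (frequency, relative frequency)."""
    counts = Counter(values)
    n = len(values)
    return {v: (c, c / n) for v, c in sorted(counts.items())}

table = relative_frequency_table(data)
print("Value  Frequency  Relative frequency")
for value, (freq, rel) in table.items():
    print(f"{value:>5}  {freq:>9}  {rel:>18.3f}")
```

The relative frequencies should add up to 1 (or 100%), which is a quick sanity check on any table of this kind.
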
###### 18.

The table below shows scores on a math test.

1. Is this categorical or quantitative data?

2. Make a relative frequency table for the data using a class width of 10.

3. Construct a histogram of the data.

 82 90 55 51 97 73 79 100 60 71 85 78 59 100 88 72 46 82 89 70 100 68 61 52
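
Grouping scores into classes can be sketched in Python as follows. This is only one reasonable convention, and it is an assumption rather than something the exercise specifies: classes start at multiples of 10 and include their lower limit but not their upper limit (40 to 49, 50 to 59, and so on), so a score of 100 falls in a class of its own. The helper names are hypothetical.

```python
from collections import Counter

# The 24 test scores listed in this exercise.
scores = [82, 90, 55, 51, 97, 73, 79, 100, 60, 71, 85, 78,
          59, 100, 88, 72, 46, 82, 89, 70, 100, 68, 61, 52]

def class_counts(values, width=10):
    """Count values per class [k*width, (k+1)*width), keyed by lower limit."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

def relative_frequencies(counts):
    """Convert class counts to relative frequencies."""
    total = sum(counts.values())
    return {lower: c / total for lower, c in counts.items()}

counts = class_counts(scores)
rel = relative_frequencies(counts)
for lower in counts:
    print(f"{lower}-{lower + 9}: {counts[lower]} ({rel[lower]:.1%})")
```

A different choice of starting point for the classes would give a different, equally valid table, as long as every class has width 10 and no score is counted twice.
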
###### 19.

This graph shows the number of adults and kids who prefer each type of soda. There were 130 adults and kids surveyed. Discuss some ways in which the graph could be improved.

###### 20.

A poll was taken asking people if they agreed with the positions of the 4 candidates for a county office. Does this pie chart present a good representation of this data? Explain.

###### 21.

Why is this a misleading or poor graph?

###### 22.

Why is this a misleading or poor graph?

###### 23.

Match each description to one of the graphs.

1. Normal distribution

2. Positive or right skewed

3. Negative or left skewed

4. Bimodal

###### 24.

Write a sentence or two to describe each distribution in terms of modality, symmetry, skewness and outliers.

###### 25.

Studies are often done by pharmaceutical companies to determine the effectiveness of a treatment. Suppose that a new cancer drug is currently under study. Of interest is the average length of time in months patients live once starting the treatment. Two researchers each follow a different set of 40 cancer patients throughout their treatment. The following data (in months) are collected.

1. Create a histogram for each dataset, using the same class intervals and scales so you can compare them.

2. Compare and contrast the two distributions.

Researcher 1: 3, 4, 11, 15, 16, 17, 22, 44, 37, 16, 14, 24, 25, 15, 26, 27, 33, 29, 35, 44, 13, 21, 22, 10, 12, 8, 40, 32, 26, 27, 31, 34, 29, 17, 8, 24, 18, 47, 33, 34

Researcher 2: 3, 14, 11, 5, 16, 17, 28, 41, 31, 18, 14, 14, 26, 25, 21, 22, 31, 2, 35, 44, 23, 21, 21, 16, 12, 18, 41, 22, 16, 25, 33, 34, 29, 13, 18, 24, 23, 42, 33, 29
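
To compare the two datasets fairly, both histograms need the same class intervals. The Python sketch below shows one way to do that: it derives a single set of class boundaries from the combined range of both datasets and then counts each dataset against those boundaries. The class width of 10 months and the helper names `shared_edges` and `bin_counts` are assumptions made for illustration; the text histogram is a stand-in for a proper plot.

```python
researcher_1 = [3, 4, 11, 15, 16, 17, 22, 44, 37, 16, 14, 24, 25, 15, 26,
                27, 33, 29, 35, 44, 13, 21, 22, 10, 12, 8, 40, 32, 26, 27,
                31, 34, 29, 17, 8, 24, 18, 47, 33, 34]
researcher_2 = [3, 14, 11, 5, 16, 17, 28, 41, 31, 18, 14, 14, 26, 25, 21,
                22, 31, 2, 35, 44, 23, 21, 21, 16, 12, 18, 41, 22, 16, 25,
                33, 34, 29, 13, 18, 24, 23, 42, 33, 29]

def shared_edges(datasets, width=10):
    """Class boundaries covering every dataset, aligned to multiples of width."""
    lowest = (min(min(d) for d in datasets) // width) * width
    highest = (max(max(d) for d in datasets) // width + 1) * width
    return list(range(lowest, highest + 1, width))

def bin_counts(values, edges):
    """Count values in each class [edges[i], edges[i+1])."""
    width = edges[1] - edges[0]
    counts = [0] * (len(edges) - 1)
    for v in values:
        counts[(v - edges[0]) // width] += 1
    return counts

edges = shared_edges([researcher_1, researcher_2])
for name, data in [("Researcher 1", researcher_1), ("Researcher 2", researcher_2)]:
    print(name)
    for lower, count in zip(edges, bin_counts(data, edges)):
        # One '#' per patient gives a rough text histogram on a common scale.
        print(f"  {lower:>2}-{lower + 9:<2} {'#' * count}")
```

Because both datasets are counted against the same boundaries, the bars can be compared class by class.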