
Overview: Describing and Interpreting Data: Variable

The document discusses different types of variables and how to analyze and visualize data based on its characteristics. There are two main types of variables - qualitative, which describe qualities in a non-numerical format, and quantitative, which are numerical. Quantitative variables can be discrete, involving counts, or continuous, involving measurements. The document provides examples of different graphs and charts to use for different types of data, such as bar charts for categorical/qualitative data and histograms for continuous quantitative data. It emphasizes selecting visualizations based on the characteristics of the specific data.

Uploaded by

kalindu004
Copyright
© Attribution Non-Commercial (BY-NC)

Overview: Describing and Interpreting Data

The way you analyze data depends on the type of data (variables) that you are evaluating.
Several different classifications are used to describe data.

Variable
• A variable is an item of data.
• Examples of variables include characteristics such as gender, test scores, and weight. The values of these
characteristics vary from one observation to another.

Types/Classifications of Variables
• Qualitative: Non-numerical quality
• Quantitative: Numerical
⇒ Discrete: counts
⇒ Continuous: measures

Qualitative Data
• This data describes the quality of something in a non-numerical format.
• Counts can be applied to qualitative data, but you cannot order or measure this type of variable.
Examples are gender, marital status, geographical region of an organization, job title…
• Qualitative data is usually treated as Categorical Data.
With categorical data, the observations can be sorted into non-overlapping categories according to
a characteristic. For example, shirts can be sorted according to color; the characteristic 'color' can
have non-overlapping categories: white, black, red, etc. People can be sorted by gender with
categories male and female. Categories should be chosen carefully since a bad choice can prejudice
the outcome. Every value of a data set should belong to one and only one category.
• Analyze qualitative data using:
⇒ Frequency tables
⇒ Modes - most frequently occurring
⇒ Graphs: Bar Charts and Pie Charts
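
As a rough illustration of the frequency table and mode listed above, the following Python sketch counts a small set of shirt colors. The sample values and the variable names are invented for the example and are not data from this handout.

```python
from collections import Counter

# Hypothetical sample: shirt colors recorded for 10 observations.
colors = ["white", "black", "red", "white", "black",
          "white", "red", "white", "black", "white"]

# Frequency table: every observation belongs to exactly one category.
freq = Counter(colors)

# Mode: the most frequently occurring category.
mode, mode_count = freq.most_common(1)[0]

for category, count in freq.most_common():
    print(f"{category:<6} {count}")
print("mode:", mode)
```

Because the categories have no order, only counts and the mode are computed; a median or mean would not be meaningful here.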

Quantitative Data
• Quantitative or numerical data arise when the observations are counts or measurements.
• The data are said to be discrete if the values are integers (e.g. number of employees of a
company, number of incorrect answers on a test, number of participants in a program…).
• The data are said to be continuous if the measurements can take on any value, usually within some
range (e.g. weight). Age and income are continuous quantitative variables. For continuous
variables, arithmetic operations such as differences and averages make sense.
Analysis can take almost any form:
⇒ Create groups or categories and generate frequency tables.
⇒ All descriptive statistics can be applied.
⇒ Effective graphs include: Histograms, Stem-and-Leaf plots, Dot Plots, Box plots, and
XY Scatter Plots (2 variables).
• Some variables can be treated only as ranks (ordinal data); they have a natural order, but their values
are not strictly measured. Examples are: 1) age group (taking the values child, teen, adult, senior),
and 2) Likert Scale data (responses such as strongly agree, agree, neutral, disagree, strongly
disagree). For these variables, the distance between adjacent points on the scale is not necessarily
the same, and the ratio of values is not meaningful.
Analyze using:
⇒ Frequency tables
⇒ Mode, Median, Quartiles
⇒ Graphs: Bar Charts, Dot Plots, Pie Charts, and Line Charts (2 variables)
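
The mode and median of ranked data can be sketched in Python as follows. The responses are hypothetical, and using `statistics.median_low` (so that the median is always one of the actual scale points) is an assumption of this sketch, not a method prescribed by the handout.

```python
import statistics

# Likert scale in its natural order, from lowest to highest rank.
scale = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
rank = {label: i for i, label in enumerate(scale)}

# Hypothetical survey responses.
responses = ["agree", "neutral", "strongly agree", "agree",
             "disagree", "agree", "neutral"]

# The mode needs only counts, so it works directly on the labels.
mode_label = statistics.mode(responses)

# The median needs the order, so convert labels to ranks first.
ranks = sorted(rank[r] for r in responses)
median_label = scale[statistics.median_low(ranks)]

print("mode:", mode_label)
print("median:", median_label)
```

The ranks are used only for ordering. Averaging them would treat the steps between scale points as equal, which the text above warns is not necessarily true.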

Goodson/ Sln12 1
Tables and Graphs
Note: Excel will create any graph that you specify, even if that graph is not appropriate for
the data. Remember to consider the type of data that you have before selecting your graph.

Frequency Table/Frequency Distribution: A frequency table is used to summarize categorical,
nominal, and ordinal data. It may also be used to summarize continuous data when the data set has been
divided into meaningful groups.

Count the number of observations that fall into each category. The number associated with each category
is called the frequency and the collection of frequencies over all categories gives the frequency
distribution of that variable.

Table 1: Frequency Distribution of Time

Time    Count
110     1
115     2
120     4
125     3
130     5
135     3
140     4
145     2
150     1

Note on Table 1: There are 9 classes. The frequency of the first class is 1; i.e., there is 1 value within
the class, and the class has a midpoint of 110.

The relative frequency is a number which describes the proportion of observations falling in a given
category. Instead of counts, we report relative frequencies or percentages.
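
A minimal Python sketch of this calculation, using the counts from Table 1 (the variable names are chosen for illustration):

```python
# Counts from Table 1, keyed by class midpoint (Time).
table1 = {110: 1, 115: 2, 120: 4, 125: 3, 130: 5,
          135: 3, 140: 4, 145: 2, 150: 1}

# Total number of observations.
total = sum(table1.values())

# Relative frequency = class frequency / total number of observations.
relative = {time: count / total for time, count in table1.items()}

for time, count in table1.items():
    print(f"{time}  {count:>2}  {relative[time]:.2f}  {relative[time]:.0%}")
```

The relative frequencies always sum to 1 (100%), which is a quick check that no observation was missed or counted twice.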

Graphs Used for Categorical/qualitative Data


Pie Charts
• A circle is divided proportionately and shows what percentage of the whole falls into each category.
• These charts are simple to understand.
• They convey information regarding the relative size of groups more readily than does a table.

[Figure: Pie Chart of Color Preferences. Red 44%, Blue 32%, Yellow 16%, Green 8%.]

Bar Charts
• Bar charts also show percentages in various categories and allow comparison between categories.
• The vertical scale is frequencies, relative frequencies, or percentages.
• The horizontal scale shows categories.
• Consider the following in constructing bar charts.
⇒ All bars should have the same width.
⇒ Leave gaps between the bars (because there is no connection between them).
⇒ Bars can be in any order.
• Bar charts can be used to represent two categorical variables simultaneously.

[Figure: Bar chart "Color Preference of Customers". Horizontal axis: Color (Red, Blue, Yellow, Green); vertical axis: N.]

Graphs for Measured/Continuous Quantitative Data


• Histograms
• Stem and Leaf
• Box plots
• Line Graphs
• XY Scatter Charts (2 variables)

Histograms
Histograms show the frequency distributions of continuous variables. They are similar to Bar Charts, but
in 'pure form' they are drawn without gaps between the bars because the x-axis represents the
class intervals. However, many current software packages (e.g. Excel) do not easily make this
distinction.

• The data is divided into non-overlapping intervals (usually between 5 and 15 intervals).
• Intervals generally have the same length.
• The number of values in each interval is counted (the class frequency).
• Sometimes relative frequencies or percentages are used. (Divide the class frequency by the total
number of observations.)
• Rectangles are drawn over each interval. (The area of each rectangle represents the relative frequency
of the interval. If the intervals are not all of the same length, the heights have to be scaled so that
each area is proportional to the frequency for that interval.)
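
The height scaling for unequal intervals can be sketched in Python as follows. The intervals and frequencies are hypothetical; dividing the relative frequency by the interval width is one common way to make each rectangle's area equal to its relative frequency.

```python
# Hypothetical class intervals (lower, upper) with their frequencies.
# The last interval is twice as wide as the others.
classes = [((0, 10), 5), ((10, 20), 10), ((20, 40), 10)]

total = sum(count for _, count in classes)

heights = []
areas = []
for (lower, upper), count in classes:
    width = upper - lower
    rel_freq = count / total
    height = rel_freq / width      # scale the height by the interval width
    heights.append(height)
    areas.append(height * width)   # area equals the relative frequency
    print(f"[{lower}, {upper})  freq={count}  height={height:.3f}")
```

Here the second and third intervals hold the same number of values, but the third is drawn half as tall because it is twice as wide. Plotting raw frequencies as heights would make the wide interval look too prominent.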

[Figure: Histogram. Horizontal axis: Time (110 to 150 in steps of 5); vertical axis: Frequency.]

XY Scatter Chart
• This type of chart should be used with two variables when both of the variables are quantitative and
continuous.
• Plot pairs of values using the rectangular coordinate system to examine the relationship between the two
variables.

[Figure: XY scatter chart "Worker-Hours by Lot Size". Horizontal axis: Lot Size (0 to 100); vertical axis: Hours (0 to 180).]

A Line Chart is similar to the scatter chart; however, it can be used when the values of the independent
variable (shown on the horizontal axis) are ranked values (i.e. they do not have to be continuous
variables).

Basic Principles for Constructing All Plots
• Data should stand out clearly from the background.
• The information should be clearly labeled and include:
⇒ title
⇒ axes, bars, pie segments, etc. - include units that are needed to interpret the data
⇒ scale, including starting points.
• The source of the data should be identified, as appropriate.
• Do not clutter the graphs with unnecessary information or graphical components.
• Do not put too much information or data on one graph.
• Sometimes, you have to try several approaches before selecting an appropriate graph.

To describe data, consider the following.


• Shape of the Distribution
⇒ Symmetry
⇒ Modality (the most frequently occurring value or values): unimodal, bimodal, or uniform
⇒ Skewness
• Centrality
• Spread
• Extreme values
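
As a hedged sketch, the measures of centrality, spread, and extreme values can be computed for the Time data in Table 1 with Python's standard library. Flagging values more than 1.5 times the interquartile range beyond the quartiles is a common convention assumed here; the handout does not specify a rule for extreme values.

```python
import statistics

# Counts from Table 1, keyed by class midpoint (Time).
table1 = {110: 1, 115: 2, 120: 4, 125: 3, 130: 5,
          135: 3, 140: 4, 145: 2, 150: 1}

# Expand the frequency table into a list of individual values.
values = [time for time, count in table1.items() for _ in range(count)]

# Centrality
mean = statistics.mean(values)
median = statistics.median(values)
modes = statistics.multimode(values)   # one mode means unimodal

# Spread
stdev = statistics.stdev(values)
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Extreme values (assumed rule: beyond 1.5 * IQR from the quartiles)
extremes = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print("mean:", mean, "median:", median, "modes:", modes)
print("stdev:", round(stdev, 2), "IQR:", iqr)
print("extreme values:", extremes)
```

For these data the mean, median, and mode coincide, which is consistent with a roughly symmetric, unimodal distribution.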

In interpreting graphs, consider:


• Horizontal and vertical scales: what is the relationship? Are the distances between, for example, 10
and 20 the same on each axis? If not, the interpretation may be distorted.
• The center point, which is of particular importance in comparing two histograms. Look at the starting point
of the vertical scale: does it start at 0? How could this affect the interpretation of the data?
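
To see why the starting point of the vertical scale matters, the following Python sketch compares how tall two bars are drawn relative to each other under two different axis starting points. The values 100 and 95 and the function name are hypothetical.

```python
def drawn_height_ratio(a, b, axis_start):
    """Ratio of the drawn bar heights for values a and b
    when the vertical axis begins at axis_start."""
    return (a - axis_start) / (b - axis_start)

# Axis starting at 0: the bars look nearly equal.
ratio_from_zero = drawn_height_ratio(100, 95, 0)

# Axis starting at 90: the first bar is drawn twice as tall.
ratio_truncated = drawn_height_ratio(100, 95, 90)

print(round(ratio_from_zero, 2), ratio_truncated)
```

The data are identical in both cases; only the axis changes, yet a difference of about 5% appears as a difference of 100% when the scale starts at 90.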
