Business Analytics Chapter02
using Excel
Chapter 02
Descriptive Statistics
Raw Data: Data stored in its smallest size
No: Yes:
Why?
Because it is easier to analyze data when it is stored in its smallest parts.
Data:
• Textbook: Facts or figures collected, analyzed and summarized
for presentation and interpretation
• Data = all the unorganized raw data in a Proper Data Set
Transaction Number   Date        Sales     SalesRep
12568                12/1/2014   $19,161   Jo
12569                12/1/2014   $15,027   Gigi
12570                12/2/2014   $12,953   Chin
12571                12/2/2014   $12,670   Jo
12572                12/2/2014   $8,893    Gigi
12573                12/3/2014   $4,667    Chin
12574                12/3/2014   $20,272   Jo
12575                12/3/2014   $20,204   Gigi
12576                12/3/2014   $17,223   Chin
Data Types & Default Alignment in Excel
• Empty Cells: Not really a Data Type, but an empty cell is a "thing" in Excel that can sometimes cause problems.
• **Refer to Empty Cells as "Empty Cells", not blanks.
• Why Default Alignment? Because Left means Excel thinks it is Text and Right means Excel thinks it is a Number. This is important when dealing with data because some systems will mistakenly import numbers as text. Numbers stored as text do not always behave as you expect (for example, they are not added by the SUM function). The Default Alignment is a visual cue that informs us about how Excel “sees” the data.
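The numbers-as-text problem can be imitated outside Excel. The following is a minimal Python sketch, not Excel itself: the `cells` list is a hypothetical imported column, and `sum_numbers_only` and `to_number` are illustrative helper names. It assumes that summing a range skips text entries, which is the behavior described above.

```python
# Hypothetical imported column: two of the four entries arrived as text.
cells = [19161, "15027", 12953, "12670"]


def sum_numbers_only(values):
    """Add only true numbers and skip text, roughly as SUM treats a range."""
    return sum(
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )


def to_number(value):
    """Convert a text entry such as '15027' to a number; leave numbers alone."""
    return float(value) if isinstance(value, str) else value


# Text entries are silently left out of the first total.
skipped_total = sum_numbers_only(cells)

# After converting text to numbers, every entry is counted.
repaired_total = sum_numbers_only([to_number(v) for v in cells])

print("Total with numbers stored as text:", skipped_total)
print("Total after converting to numbers:", repaired_total)
```

The gap between the two totals is the kind of silent error that the Default Alignment cue helps you catch before you analyze the data.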
Proper Data Set: Proper Table of Data
• A structure for your data set, necessary so that Excel Data Analysis features like Sort, Filter and PivotTables will work correctly:
1. Fields in first row (no empty cells)
2. Records or Observations in rows
3. Empty cells or Excel Row/Column Headers all the way around Data Set
4. Try not to have empty cells in data set

Transaction Number   Date        Sales     SalesRep
12568                12/1/2014   $19,161   Jo
12569                12/1/2014   $15,027   Gigi
12570                12/2/2014   $12,953   Chin
12571                12/2/2014   $12,670   Jo
12572                12/2/2014   $8,893    Gigi
12573                12/3/2014   $4,667    Chin
12574                12/3/2014   $20,272   Jo
12575                12/3/2014   $20,204   Gigi
12576                12/3/2014   $17,223   Chin
Terms for Proper Data Set
• Primary Key / List of Unique Elements
• Variables
• Element = Entities on which data are collected. We are collecting data for each Transaction Number. Transaction Number is the Element.
• Each row is a Record / Observation
Proper Data Set:
• Proper Data Set with NO Primary Key / List of Unique Elements:
• Using the PivotTable feature we can create a Proper Data Set with a Primary Key (Unique List of Products or Elements):
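The idea of collapsing a field with repeated values into a unique list can be sketched in Python. This is an illustration only, not the PivotTable feature: it assumes the transaction table shown earlier and uses SalesRep as the repeated field (the slide's own example uses Products, which is not reproduced here). The names `transactions`, `is_unique`, and `totals` are hypothetical.

```python
# Transaction table from the earlier slide: (Transaction Number, SalesRep, Sales).
transactions = [
    (12568, "Jo", 19161),
    (12569, "Gigi", 15027),
    (12570, "Chin", 12953),
    (12571, "Jo", 12670),
    (12572, "Gigi", 8893),
    (12573, "Chin", 4667),
    (12574, "Jo", 20272),
    (12575, "Gigi", 20204),
    (12576, "Chin", 17223),
]


def is_unique(values):
    """A field can act as a primary key only if no value repeats."""
    values = list(values)
    return len(values) == len(set(values))


number_is_key = is_unique(t[0] for t in transactions)  # Transaction Number
rep_is_key = is_unique(t[1] for t in transactions)     # SalesRep repeats

# Summarize by SalesRep: one row per unique rep, with total Sales,
# similar in spirit to a PivotTable with SalesRep in the row area.
totals = {}
for _number, rep, sales in transactions:
    totals[rep] = totals.get(rep, 0) + sales

for rep, total in totals.items():
    print(f"{rep:<6} ${total:,}")
```

In the summarized result each SalesRep appears exactly once, so SalesRep serves as the list of unique elements for that smaller table.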
Variables
• Variable (from previous slide)
• A characteristic or quantity of interest that can take on different values
• Decision Variables
• Variables under the direct control of decision makers
• Example
• The “Quantity” Variable for a manufacturer. Managers can decide how many to make
each day.
• Random Variables (uncertain variables):
• In general, variables that are outside of the decision maker's control
• A quantity whose value is not known with certainty
• Examples:
• Stock Price of Yahoo
• Number of units sold of a particular product
Variables and Variation
• Variation
• If you own Yahoo Stock, you would be interested in the Variation in the Variable “Price (Adj Close)”.
gel-boomerang.com 2898
ebay.com 5810
coloradoboomerangs.com 6380
amazon.com 11436
Histograms for Quantitative Data
• Histograms
• Used to show frequency distribution of continuous quantitative data
over a set of class intervals (lower and upper limit for each category)
• Column or Bar Charts where columns are touching to indicate that the
variable is continuous
• Columns touch to indicate that no numbers can fit between classes.
"No numbers can fit between columns - no gaps"
• Height of each column conveys the count for that class
• Order of classes is important to help reveal shape of data, or distribution of data.
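The frequency distribution behind a histogram can be sketched as follows. This is a minimal Python illustration under stated assumptions: it reuses the Sales values from the transaction table, and the class width of 5,000 is an arbitrary choice made for the example, not a value from the slides. Each class runs from its lower limit up to, but not including, the next lower limit, so no number can fall between classes.

```python
# Sales values from the transaction table on the earlier slide.
sales = [19161, 15027, 12953, 12670, 8893, 4667, 20272, 20204, 17223]

class_width = 5000                           # assumed class width
lower_limits = range(0, 25000, class_width)  # 0, 5000, ..., 20000

# Count how many values fall in each class [lower, lower + class_width).
counts = {lower: 0 for lower in lower_limits}
for value in sales:
    lower = (value // class_width) * class_width
    counts[lower] += 1

# Print the classes in order; the row of # marks stands in for column height.
for lower in lower_limits:
    upper = lower + class_width
    print(f"{lower:>6,} to under {upper:>6,}: {'#' * counts[lower]}")
```

Keeping the classes in ascending order is what lets the shape of the distribution show through.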
Mean, Median, Mode
• Mean
• Arithmetic Mean: Add them up and divide by the count
• Good for quantitative data when there are not extreme values - extreme values can make the mean look too
big or too small (Median more representative of a typical value in that case)
• Use AVERAGE function
• Median
• Sort, then take the one in the middle. If count odd, take one in middle, if even, average middle two.
• Marks the point in the sorted list (an actual number) where 50% of the numbers are above and 50% of the
numbers are below
• Good for quantitative data when there are extreme values (like house prices and salaries)
• Use MEDIAN function
• Mode
• One that occurs most frequently (can be bimodal, multimodal)
• Good for Categorical Data (Nominal and Ordinal)
• Use MODE.SNGL for quantitative data and COUNTIF or PivotTable for Categorical or quantitative data. MODE.SNGL will show only 1 mode even if the data set is bi-modal or multi-modal. MODE.MULT can be used for multiple modes.
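The three measures can be checked outside Excel with Python's standard `statistics` module. This is a sketch for comparison, not a replacement for AVERAGE, MEDIAN, MODE.SNGL, or MODE.MULT. It reuses the Sales and SalesRep columns from the transaction table; the extra sale of 500,000 is a hypothetical extreme value added only to show its effect on the mean.

```python
import statistics

# Columns from the transaction table on the earlier slide.
sales = [19161, 15027, 12953, 12670, 8893, 4667, 20272, 20204, 17223]
sales_reps = ["Jo", "Gigi", "Chin", "Jo", "Gigi", "Chin", "Jo", "Gigi", "Chin"]

# Mean: add them up and divide by the count.
mean_sales = statistics.mean(sales)

# Median: middle value of the sorted list (9 values, so the 5th one).
median_sales = statistics.median(sales)

# Hypothetical extreme value: the mean jumps, the median barely moves.
with_extreme = sales + [500000]
mean_extreme = statistics.mean(with_extreme)
median_extreme = statistics.median(with_extreme)  # average of the middle two

# Mode for categorical data. Each rep appears 3 times, so there are 3 modes.
# statistics.mode returns only the first one found (Python 3.8 or later),
# while statistics.multimode returns all of them.
first_mode = statistics.mode(sales_reps)
all_modes = statistics.multimode(sales_reps)

print(f"Mean: {mean_sales:,.2f}  Median: {median_sales:,}")
print(f"With extreme value -> Mean: {mean_extreme:,.2f}  Median: {median_extreme:,}")
print("Single mode:", first_mode, " All modes:", all_modes)
```

The single-mode versus all-modes results mirror the difference described above between a function that reports one mode and one that reports every mode.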