Slide for Chapter 3
CHAPTER 03
Performing the Test Plan and
Analyzing Results
Objectives
11/03/2024
Contents
Descriptive analytics
Descriptive analytics help summarize what has
happened in the past.
• A financial accountant would sum all the sales transactions within a
period to calculate the value for Sales Revenue that appears on the
income statement.
• An analyst would count the number of records in a data extract to
ensure the data are complete before running a more complex analysis.
• An auditor would filter data to limit the scope to transactions that
represent the highest risk.
In all these cases, basic analysis provides an understanding of what has
happened in the past to help decision makers achieve good results and
correct poor results.
Summary statistics
• Summary statistics describe the location, spread, shape, and dependence of a set of observations.
Statistic | Excel formula | Description
Sum | =SUM() | The total value of all numerical values
Mean | =AVERAGE() | The center value; sum of all observations divided by the number of observations
Median | =MEDIAN() | The middle value that divides the top half of the data from the bottom half
Minimum | =MIN() | The smallest value
Maximum | =MAX() | The largest value
Count | =COUNT() | The number of observations
Frequency | =FREQUENCY() | The number of observations in each of a series of numerical or categorical buckets
Standard deviation | =STDEV() | The variability or spread of the data from the mean; a larger standard deviation means a wider spread away from the mean
Quartile | =QUARTILE() | The value that divides a quarter of the data from the rest; indicates skewness of the data
Correlation coefficient | =CORREL() | How closely two datasets are correlated or predictive of one another
Mean vs Median
Examples Mean Median
5, 3, 9, 7
9, 7, 8, 5, 4
8, 4, 4, 6, 2, 6, 6
7, 3, 5, 7, 1, 17, 13, 7
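As a quick check on the four example datasets above, the mean and median can be computed with Python's standard `statistics` module. This is only an illustrative sketch; the variable names `examples` and `results` are chosen here and are not part of the slides.

```python
from statistics import mean, median

# The four example datasets listed on the slide.
examples = [
    [5, 3, 9, 7],
    [9, 7, 8, 5, 4],
    [8, 4, 4, 6, 2, 6, 6],
    [7, 3, 5, 7, 1, 17, 13, 7],
]

# Compute both measures of center for each dataset.
results = [(mean(values), median(values)) for values in examples]

for values, (avg, mid) in zip(examples, results):
    print(f"{values}: mean = {avg:.2f}, median = {mid}")
```

In the last dataset the large values 13 and 17 pull the mean (7.5) above the median (7), which is the kind of gap that grows as data become more skewed.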
Mean vs Median: When to use?
When a distribution is skewed, the median does a better job than the
mean of describing the center of the distribution.
Standard deviation
10,15,20,25,30
5,10,15,20,20,25,30,30,35,40
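The two datasets above can be compared with a short sketch using Python's standard `statistics` module. It reports both the sample standard deviation (the quantity Excel's =STDEV() estimates) and the population standard deviation, since the slide does not say which one is intended; the names `datasets` and `summary` are illustrative.

```python
from statistics import mean, pstdev, stdev

# The two datasets shown on the slide.
datasets = [
    [10, 15, 20, 25, 30],
    [5, 10, 15, 20, 20, 25, 30, 30, 35, 40],
]

summary = []
for values in datasets:
    summary.append({
        "mean": mean(values),
        "sample_sd": stdev(values),       # divides by n - 1
        "population_sd": pstdev(values),  # divides by n
    })

for values, stats in zip(datasets, summary):
    print(values, {name: round(value, 3) for name, value in stats.items()})
```

The second dataset has the larger standard deviation, reflecting its wider spread around its mean.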
Quartile
5,10,15,20,20,25,30,30,35,40
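A minimal sketch of computing the quartiles of this dataset with Python's standard library follows. It assumes the "inclusive" interpolation method, which is intended to mirror how Excel's =QUARTILE() interpolates between data points; other quartile conventions give slightly different cut points.

```python
from statistics import quantiles

# The dataset shown on the slide.
data = [5, 10, 15, 20, 20, 25, 30, 30, 35, 40]

# n=4 splits the data into four parts and returns the three cut points.
# method="inclusive" treats the data as including its own minimum and maximum.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}")
```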
Data reduction
Fuzzy matching locates approximate matches
• Useful for identifying relationships in imperfect data.
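One way to sketch fuzzy matching is with `difflib.get_close_matches` from Python's standard library, which ranks candidates by a string similarity ratio. The vendor names, the 0.6 cutoff, and the helper `best_match` are all hypothetical choices for illustration, not something taken from the slides.

```python
from difflib import get_close_matches

# Hypothetical clean vendor master list.
vendor_master = ["Acme Corporation", "Globex Inc", "Initech LLC"]

# Hypothetical vendor names as typed on invoices (typos, missing suffixes).
invoice_names = ["Acme Corporaton", "Globex Inc.", "Initech", "Umbrella Co"]

def best_match(name, candidates, cutoff=0.6):
    """Return the closest candidate to name, or None if nothing is similar enough."""
    lowered = {c.lower(): c for c in candidates}  # compare case-insensitively
    hits = get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None

matches = {name: best_match(name, vendor_master) for name in invoice_names}

for name, match in matches.items():
    print(f"{name!r} -> {match!r}")
```

Raising the cutoff makes matching stricter (fewer false matches, more misses); the right value depends on how messy the data are.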
Diagnostic analytics
Profiling
• Profiling involves gaining an understanding of the typical behavior
of an individual, group, or population (or sample).
• Profiling can be used to develop complex models to predict
potential fraud.
• Profiling is done primarily using structured data—data that are
stored in a database or spreadsheet and are readily searchable.
Z-Scores - Standardizing Data for Comparison
z = (x - μ) / σ
Where:
• z = Z-score
• x = the value being evaluated
• μ = the mean
• σ = the standard deviation
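The definition above can be applied directly. The sketch below standardizes a small, hypothetical list of days-to-ship values, treating the list as the whole population so that μ and σ are the population mean and standard deviation.

```python
from statistics import mean, pstdev

# Hypothetical days-to-ship values (not from the slides).
days_to_ship = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(days_to_ship)       # the mean
sigma = pstdev(days_to_ship)  # the population standard deviation

# z = (x - mu) / sigma for each observation x.
z_scores = [(x - mu) / sigma for x in days_to_ship]

for x, z in zip(days_to_ship, z_scores):
    print(f"x = {x}: z = {z:+.2f}")
```

A value with a large absolute z-score sits far from the mean relative to the typical spread, which is what makes z-scores useful for spotting outliers.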
Z-scores show spread and outliers.
Exhibit 3-7 Z-Scores Provide an Example of Profiling That Helps Identify Outliers
Box plots (box-and-whisker plots)
• Displays the five-number summary of a set of data including the
minimum, first quartile, median, third quartile, and maximum
• The five-number summary divides the data into sections that each
contain approximately 25% of the data in that set
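The five-number summary behind a box plot can be sketched as follows. The data are hypothetical, and the 1.5 × IQR rule used to flag outliers is a common plotting convention assumed here rather than something stated on the slide.

```python
from statistics import quantiles

# Hypothetical observations; 80 is included to act as an unusually high value.
data = [5, 10, 15, 20, 20, 25, 30, 30, 35, 40, 80]

q1, med, q3 = quantiles(data, n=4, method="inclusive")
five_number = (min(data), q1, med, q3, max(data))

# Assumed convention: points beyond 1.5 * IQR from the box count as outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("min, Q1, median, Q3, max:", five_number)
print("outliers:", outliers)
```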
Box plots show spread and outliers
EXHIBIT 3-8 Box Plots Provide an Example of Profiling That Helps Identify Outliers
(in This Case, Categories with Unusually High Average Days to Ship)
Data profiling in management accounting
Variance analysis
• Internal auditors analyze travel and entertainment expenses for violations of internal controls.
• Managers use profiling to compare variances from target ranges.
Data profiling in auditing
Benford’s Law
Benford’s Law is a diagnostic analytics technique that compares the
actual frequency of leading digits with the expected frequency.
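A minimal sketch of a Benford's Law comparison follows. Under Benford's Law the expected share of leading digit d is log10(1 + 1/d); the transaction amounts below are hypothetical, and a real test would use far more records than this.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(amount):
    """First nonzero digit of a nonzero number, read from scientific notation."""
    return int(f"{abs(amount):.15e}"[0])

# Hypothetical transaction amounts (far too few for a real test).
amounts = [120.50, 1890.00, 23.10, 310.00, 1450.75,
           17.99, 2600.00, 940.00, 105.00, 48.20]

counts = Counter(leading_digit(a) for a in amounts if a != 0)
actual = {d: counts.get(d, 0) / len(amounts) for d in range(1, 10)}
expected = benford_expected()

for d in range(1, 10):
    print(f"digit {d}: actual {actual[d]:.2f}, expected {expected[d]:.3f}")
```

Digits whose actual share departs sharply from the expected share are candidates for follow-up, not proof of a problem.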
Cluster analysis shows natural groupings of data.
Clustering in auditing
• Internal auditors can use clustering to identify groups of transactions that may indicate risk or fraud in insurance or other payments.
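To make the idea of natural groupings concrete, here is a small sketch of k-means clustering on one variable. K-means is one common clustering technique chosen here for illustration; the slides do not name a specific algorithm, and the payment amounts and the helper `kmeans_1d` are hypothetical.

```python
from statistics import mean

def kmeans_1d(values, k=2, iterations=20):
    """Group numbers into k clusters (k >= 2) by repeatedly assigning each
    value to its nearest centroid and recomputing the centroids."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        new_centroids = [mean(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical payment amounts: mostly small, with a few much larger ones.
payments = [100, 110, 95, 105, 980, 1020, 1000, 102]
centroids, clusters = kmeans_1d(payments, k=2)

print("centroids:", centroids)
print("clusters:", clusters)
```

The small cluster of large payments is the kind of grouping an auditor might then examine more closely.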
Hypothesis Testing for Differences in Groups
EXHIBIT 3-13 T-Test Assessing for Significant Differences in Average Shipping Times across Categories
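As an illustration of the calculation behind such a test, the sketch below computes a two-sample t-statistic by hand. It assumes Welch's unequal-variance form of the t-test, which may differ from the variant used in the exhibit, and the shipping times and the helper `welch_t` are hypothetical. Python's standard library has no t-distribution, so the p-value would still need to be looked up in a t-table or computed with a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t-statistic and its approximate degrees of freedom."""
    var_a = variance(sample_a) / len(sample_a)  # squared standard error of mean A
    var_b = variance(sample_b) / len(sample_b)  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Hypothetical days-to-ship for two product categories.
category_a = [2, 3, 4, 3, 3]
category_b = [5, 6, 7, 6, 6]

t_stat, dof = welch_t(category_a, category_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

A t-statistic far from zero relative to the critical value for these degrees of freedom suggests the difference in average shipping times is unlikely to be due to chance alone.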
Predictive analytics
Regression helps predict expected outcomes.
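A minimal sketch of simple linear regression by ordinary least squares follows. The advertising and sales figures are hypothetical, and `fit_line` is an illustrative helper rather than a library function.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: advertising spend (x) and sales (y), in thousands.
ad_spend = [1, 2, 3, 4, 5]
sales = [2.9, 5.1, 7.0, 9.2, 10.8]

slope, intercept = fit_line(ad_spend, sales)

# Use the fitted line to predict the expected outcome for a new input.
predicted_sales = intercept + slope * 6

print(f"sales = {intercept:.2f} + {slope:.2f} * ad_spend")
print(f"predicted sales at ad_spend = 6: {predicted_sales:.2f}")
```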
Classification predicts which class an individual belongs to.
• Identify the classes you wish to predict.
• Manually classify an existing set of records.
• Select a set of classification models.
• Divide your data into training and testing sets.
• Generate your model.
• Interpret the results and select the “best” model.
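The steps above can be sketched end to end on a toy example. Everything here is an assumption made for illustration: the records and their "low"/"high" risk labels are hypothetical, the two candidate models (a one-nearest-neighbor rule and a majority-class baseline) are simple stand-ins for real classification models, and accuracy on the test set is just one possible way to pick the "best" model.

```python
from collections import Counter

# Steps 1-2: classes are "low" and "high" risk; these hypothetical records
# (two numeric features each) have already been classified by hand.
records = [
    ((1.0, 1.0), "low"), ((1.5, 2.0), "low"), ((2.0, 1.5), "low"),
    ((1.2, 1.8), "low"), ((8.0, 8.0), "high"), ((9.0, 7.5), "high"),
    ((8.5, 9.0), "high"), ((2.2, 1.1), "low"),
]

# Step 4: hold out every fourth record for testing; train on the rest.
test_set = records[::4]
train_set = [rec for i, rec in enumerate(records) if i % 4 != 0]

def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Steps 3 and 5: two candidate models built from the training set.
def nearest_neighbor(features):
    """Predict the label of the closest training record."""
    _, label = min(train_set, key=lambda rec: squared_distance(rec[0], features))
    return label

majority_label = Counter(label for _, label in train_set).most_common(1)[0][0]

def majority_class(features):
    """Baseline: always predict the most common training label."""
    return majority_label

# Step 6: score each model on the held-out records and keep the best.
def accuracy(model):
    hits = sum(model(features) == label for features, label in test_set)
    return hits / len(test_set)

scores = {
    "nearest_neighbor": accuracy(nearest_neighbor),
    "majority_class": accuracy(majority_class),
}
best_model = max(scores, key=scores.get)

print(scores, "best:", best_model)
```

Comparing against a simple baseline shows whether a model has learned anything beyond the class proportions in the training data.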
Prescriptive analytics
Decision support systems (DSS) use rules to guide the accountant.
Machine learning learns from past data to predict
better outcomes.
• What machine learning approaches have in common is the use of algorithms and
statistical models to generate a previously unknown model that relies on
patterns and inferences.
• For most applications of artificial intelligence models, companies will
source the underlying system from providers such as Microsoft, Amazon, or
Google rather than develop it themselves.
• These providers have large datasets with which to create more accurate
prediction and recommendation engines.
Summary
• In this chapter, we addressed the third and fourth steps of the IMPACT
cycle model: the “P” for “performing test plan” and “A” for “address and
refine results.” That is, how are we going to test or analyze the data to
address a problem we are facing?
• We identified descriptive analytics that help describe what happened with
the data, including summary statistics, data reduction, and filtering.
• We provided examples of diagnostic analytics that help users identify
relationships in the data that uncover why certain events happen through
profiling, clustering, similarity matching, and co-occurrence grouping.