DAA_Chapter 03
CHAPTER 03
Performing the Test Plan and
Analyzing Results
Prepared by Nguyen Huu [email protected]
Objectives
Contents
21/12/2024
Descriptive analytics
Descriptive analytics help summarize what has
happened in the past.
• A financial accountant would sum all the sales transactions within a
period to calculate the value for Sales Revenue that appears on the
income statement.
• An analyst would count the number of records in a data extract to
ensure the data are complete before running a more complex analysis.
• An auditor would filter data to limit the scope to transactions that
represent the highest risk.
In all these cases, basic analysis provides an understanding of what has
happened in the past to help decision makers achieve good results and
correct poor results.
Descriptive analytics
Descriptive analytics
Summary statistics
• Summary statistics describe the location, spread, ...
Statistic   Excel formula   Description
Sum         =SUM()          The total value of all numerical values
Mean        =AVERAGE()      The center value; sum of all observations divided by the number of observations
Median      =MEDIAN()       The middle value that divides the top half of the data from the bottom half
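The three statistics in the table can also be computed outside Excel. The following is a minimal sketch using Python's standard `statistics` module; the `sales` amounts are hypothetical values chosen only for illustration.

```python
import statistics

# Hypothetical sales transaction amounts for one period (illustrative data only).
sales = [120.0, 80.0, 100.0, 300.0, 100.0]

total = sum(sales)                 # Excel equivalent: =SUM()
mean = statistics.mean(sales)      # Excel equivalent: =AVERAGE()
median = statistics.median(sales)  # Excel equivalent: =MEDIAN()

print(f"Sum: {total}, Mean: {mean}, Median: {median}")
```

Here the single large transaction pulls the mean above the median, which is the situation the mean-versus-median slides address.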
Descriptive analytics
Summary statistics
Mean vs Median: When to use?
Descriptive analytics
Summary statistics
Mean vs Median: When to use?
When a distribution is skewed, the median describes the center of the
distribution better than the mean, because extreme values pull the mean
toward the tail.
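A small sketch of this effect, assuming hypothetical invoice amounts: adding one extreme value moves the mean a long way but barely moves the median.

```python
import statistics

# Hypothetical invoice amounts (illustrative only).
typical = [100, 110, 120, 130, 140]
# The same data with one extreme value that skews the distribution to the right.
skewed = typical + [5000]

mean_shift = statistics.mean(skewed) - statistics.mean(typical)
median_shift = statistics.median(skewed) - statistics.median(typical)

print(f"Mean moved by {mean_shift:.2f}; median moved by {median_shift:.2f}")
```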
Descriptive analytics
Summary statistics
Mean vs Median: When to use?
Descriptive analytics
Summary statistics
Quartile
Descriptive analytics
Summary statistics
Quartile
Descriptive analytics
Descriptive analytics
Data reduction
Fuzzy matching locates approximate matches.
• Useful for identifying relationships in imperfect data.
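One way to sketch fuzzy matching is with `difflib.SequenceMatcher` from Python's standard library, which returns a similarity ratio between 0 and 1. The addresses and the 0.8 cutoff below are assumptions made for illustration; audit tools may use other similarity measures and thresholds.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

# Hypothetical values for a VendorAddress and a RelatedAddress field.
vendor_address = "123 Main Street"
related_address = "123 Main St."

threshold = 0.8  # assumed cutoff; tune it to the quality of the data

score = similarity(vendor_address, related_address)
is_fuzzy_match = score >= threshold
print(f"similarity = {score:.2f}, fuzzy match = {is_fuzzy_match}")
```

An exact comparison would treat these two addresses as different, while the similarity ratio flags them for follow-up.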
[Figure: fuzzy matching between corresponding fields of three tables: Vendor <-> RelatedParty <-> Customer]
Vendor          RelatedParty      Customer
VendorName      RelatedName       CustomerName
VendorAddress   RelatedAddress    CustomerAddress
VendorCity      RelatedCity       CustomerCity
VendorState     RelatedState      CustomerState
VendorZip       RelatedZip        CustomerZip
VendorType      RelatedPosition   CustomerType
VendorAddDate   RelatedAddDate    CustomerAddDate
Diagnostic analytics
Diagnostic analytics
Diagnostic analytics
Diagnostic analytics
Profiling
• Profiling involves gaining an understanding of the typical behavior
of an individual, group, or population (or sample).
• Profiling can be used to develop complex models to predict
potential fraud.
• Profiling is done primarily using structured data—data that are
stored in a database or spreadsheet and are readily searchable.
Diagnostic analytics
Diagnostic analytics
Diagnostic analytics
Profiling
Z-Scores - Standardizing Data for Comparison
$$z = \frac{x - \mu}{\sigma}$$
Where:
• z = Z-score
• x = the value being evaluated
• μ = the mean
• σ = the standard deviation
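The formula can be applied to a whole column of values. This is a minimal sketch with hypothetical shipping times; it assumes the population standard deviation (`statistics.pstdev`) for σ and an illustrative cutoff of |z| > 2 for flagging outliers, neither of which is prescribed by the slides.

```python
import statistics

# Hypothetical days-to-ship values (illustrative only).
days_to_ship = [2, 3, 3, 4, 3, 2, 4, 3, 15]

mu = statistics.mean(days_to_ship)       # the mean
sigma = statistics.pstdev(days_to_ship)  # population standard deviation (assumed choice)

# z = (x - mu) / sigma for every observation
z_scores = [(x - mu) / sigma for x in days_to_ship]

# Flag values more than 2 standard deviations from the mean (assumed cutoff).
outliers = [x for x, z in zip(days_to_ship, z_scores) if abs(z) > 2]
print(outliers)
```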
Diagnostic analytics
Profiling
Z-scores show spread and outliers.
Exhibit 3-7 Z-Scores Provide an Example of Profiling That Helps Identify Outliers
Diagnostic analytics
Profiling
Box plots (box-and-whisker plots)
• Displays the five-number summary of a set of data including the
minimum, first quartile, median, third quartile, and maximum
• The five-number summary divides the data into sections that each
contain approximately 25% of the data in that set
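The five-number summary behind a box plot can be sketched with `statistics.quantiles`. The data are hypothetical, and the `"inclusive"` method is one of several quartile conventions, so other tools may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical observations (illustrative only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))
print(five_number_summary)
```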
Diagnostic analytics
Profiling
Box plots show spread and outliers
EXHIBIT 3-8 Box Plots Provide an Example of Profiling That Helps Identify Outliers
(in This Case, Categories with Unusually High Average Days to Ship)
Diagnostic analytics
Data profiling in management accounting
Variance analysis
• Internal auditors analyze travel and entertainment expenses for
violations of internal controls.
• Managers use profiling to compare variances from target ranges.
Diagnostic analytics
Data profiling in auditing
Benford’s Law
Diagnostic analytics
Benford’s Law is a diagnostic analytic that compares the actual frequency
of leading digits in a data set with the frequency the law predicts.
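Under Benford's Law the expected share of values with leading digit d is log10(1 + 1/d), so 1 is expected to lead about 30.1% of the time. The sketch below compares that expectation with the leading digits of a hypothetical list of invoice amounts; the helper names `benford_expected` and `leading_digit` are illustrative, not part of any library.

```python
import math
from collections import Counter

def benford_expected(d: int) -> float:
    """Expected share of values whose leading digit is d (1-9) under Benford's Law."""
    return math.log10(1 + 1 / d)

def leading_digit(amount: float) -> int:
    """First significant digit, read from scientific notation (4.52e-02 gives 4)."""
    return int(f"{abs(amount):.10e}"[0])

# Hypothetical invoice amounts (illustrative only).
amounts = [1250.00, 187.40, 1999.99, 23.10, 310.75,
           142.00, 98.60, 1120.00, 45.00, 2750.00]

# Zero has no leading digit, so it is excluded.
counts = Counter(leading_digit(a) for a in amounts if a != 0)
n = sum(counts.values())

for d in range(1, 10):
    actual = counts.get(d, 0) / n
    print(f"digit {d}: actual {actual:.2f}  expected {benford_expected(d):.3f}")
```

Ten records are far too few for a meaningful comparison; in practice the test is run over a large population of transactions, and digits whose actual share departs sharply from the expected share are investigated further.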
Diagnostic analytics
Diagnostic analytics
Cluster analysis shows natural groupings of data.
Diagnostic analytics
Clustering in auditing
• Internal auditors can use clustering to identify groups of
transactions that may indicate risk or fraud in insurance or other
payments.
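The slides do not name a clustering algorithm. As one possible illustration, the sketch below applies a simple one-dimensional k-means procedure to hypothetical payment amounts; the function `kmeans_1d` and the choice of starting centers are assumptions made for this example.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center to its group's mean."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical payment amounts (illustrative only).
payments = [95, 100, 105, 110, 980, 1000, 1020]

# Start with the smallest and largest payments as the two initial centers (assumed choice).
centers, groups = kmeans_1d(payments, centers=[min(payments), max(payments)])
print(centers)
print(groups)
```

The payments separate into a group of small amounts and a group of large amounts; an auditor would then ask whether the unusual group has a legitimate explanation.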
Diagnostic analytics
Diagnostic analytics
Diagnostic analytics
Hypothesis Testing for Differences in Groups
EXHIBIT 3-13 T-Test Assessing for Significant Differences in Average Shipping Times across Categories
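The exhibit's data are not reproduced here, so the following sketch uses hypothetical shipping times for two made-up categories. It computes Welch's t-statistic, one common form of the two-sample t-test that does not assume equal variances; whether the exhibit uses this form is not stated in the slides.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t-statistic for two independent samples with possibly unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

# Hypothetical days-to-ship samples for two product categories (illustrative only).
category_a = [5, 6, 7, 6, 6]
category_b = [3, 4, 3, 4, 3]

t_stat = welch_t(category_a, category_b)
print(f"t = {t_stat:.2f}")
```

A t-statistic far from zero suggests the group means differ; turning it into a p-value requires the t-distribution, which a statistics package or Excel's t-test tools can supply.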
Predictive analytics
Predictive analytics
Predictive analytics
Predictive analytics
Predictive analytics
Classification predicts which class an individual
belongs to
• Identify the classes you wish to predict.
• Manually classify an existing set of records.
• Select a set of classification models.
• Divide your data into training and testing sets.
• Generate your model.
• Interpret the results and select the “best” model.
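The steps above can be sketched end to end with a deliberately simple model. Everything in this example is an assumption made for illustration: the records, the labeling rule, the 70/30 split, and the use of single-threshold rules as the candidate "models". A real project would use a proper classification library and richer data.

```python
import random

# Steps 1-2: classes are True/False (defaulted or not); records are (days_past_due, label).
# The labeling rule (more than 45 days past due) is a hypothetical stand-in for manual labels.
records = [(days, days > 45) for days in range(0, 100, 5)]

# Step 4: divide the data into training and testing sets (70/30), seeded for repeatability.
rng = random.Random(42)
shuffled = records[:]
rng.shuffle(shuffled)
cut = len(shuffled) * 7 // 10
train, test = shuffled[:cut], shuffled[cut:]

def accuracy(threshold, rows):
    """Share of rows where the rule 'days > threshold' agrees with the label."""
    return sum((days > threshold) == label for days, label in rows) / len(rows)

# Steps 3 and 5: the candidate models are threshold rules, scored on the training set.
candidates = [45, 20, 70]
best = max(candidates, key=lambda t: accuracy(t, train))

# Step 6: interpret the result by checking the chosen model on data it has not seen.
test_accuracy = accuracy(best, test)
print(f"best threshold = {best}, test accuracy = {test_accuracy:.2f}")
```

Scoring the chosen model on the held-out testing set, rather than on the training set, is what guards against a model that only memorizes the records it was built from.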
Predictive analytics
Classification predicts which class an individual
belongs to
Generate models
• Training data
• Select models
Test models
• Testing data
• Interpret the results
• Select the “best” model
Using model
• New data
• Real classification results
Predictive analytics
Classification models
Logistic Regression
Decision Trees
Predictive analytics
Predictive analytics
Decision trees
[Figure: decision tree with a “Weather” node and Yes/No branches]
Predictive analytics
Predictive analytics
Predictive analytics
Prescriptive analytics
Prescriptive analytics
Prescriptive analytics
Decision support systems (DSS) use rules to guide the accountant.
Prescriptive analytics
Machine learning learns from past data to predict
better outcomes.
• What machine learning approaches have in common is the use of algorithms
and statistical models to generate a previously unknown model that relies
on patterns and inferences.
• For most applications of artificial intelligence, companies will
outsource the underlying system to providers such as Microsoft, Amazon, or
Google rather than develop it themselves.
• These providers have large datasets with which to build more accurate
prediction and recommendation engines.
Summary
• In this chapter, we addressed the third and fourth steps of the IMPACT
cycle model: the “P” for “performing test plan” and “A” for “address and
refine results.” That is, how are we going to test or analyze the data to
address a problem we are facing?
• We identified descriptive analytics that help describe what happened with
the data, including summary statistics, data reduction, and filtering.
• We provided examples of diagnostic analytics that help users identify
relationships in the data that uncover why certain events happen through
profiling, clustering, similarity matching, and co-occurrence grouping.
Summary
Key words
Key words