CMA 2020 P1-F Analytics
Section F
Technology & Analytics
F.1) Information Systems
I) Accounting Information System (AIS)
II) AIS (Business) Cycles
III) Separate Financial & Non-financial Systems
IV) Enterprise Resource Planning (ERP)
V) Database Management System (DBMS)
VI) Data Warehouse
VII) Enterprise Performance Management (EPM)
One of the most important roles of an accountant is to provide information to owners and managers
so that they can make the best possible decisions for their organizations. A well-designed, well-
implemented, and properly controlled information system is the principal tool accountants use to
capture, process, and produce relevant and reliable information. This section examines the role of
technology & analytics in providing and using information. The section covers the role and use of
information systems, data governance including COSO and COBIT, and how technology is used in
transforming financial data into information. The section concludes by discussing data analytics topics
including business intelligence, data mining, analytic tools, and visualization.
Miles CMA Review - Class Notes to Wiley CMA Learning System Part 1, Section F
In order to provide relevant and reliable information to decision makers, a business requires a method
to capture and process data and then report information. This method is called an information system.
This topic reviews the primary elements of a functional information system.
Accounting Information System (AIS) - Formalized process to collect, store, and process accounting
information
Was traditionally maintained with physical books (i.e., journals and ledgers). Today, an AIS is
typically an electronic system made up of software, computers, and servers
Captures the pertinent information and recordkeeping needed in order to produce financial
statements (e.g., balance sheet, income statement, statement of cash flows, statement of
owner's equity) and performance reports (e.g., budgets, project profitability, product cost, etc.)
This information is critical to understanding an organization's business activities
Also, provides information needed for analysis, evaluation, and strategic decision making
Primary function of an AIS is to report information which is relied upon by many
stakeholders/users
Users include but are not limited to the following: executives, accountants, managers,
analysts, auditors, regulators, and tax authorities
Producing accurate and timely information is a critical feature of an effective AIS
AIS Cycles - AIS can be broken down into several cycles, or sets of related business transactions.
The AIS collects and stores information about the activities of an organization related to the
following:
Revenue to Cash Cycle - Process of taking orders, shipping products or delivering services,
billing customers, and collecting cash from sales
Expenditure Cycle - Process of placing orders, receiving shipment of products or delivery of
services, approving invoices, and making cash payments
Production Cycle - Process by which raw materials are converted into finished goods
Human Resources and Payroll Cycle - Process of recruiting, interviewing, and hiring personnel,
paying employees for their work, promoting employees, and finalizing employees’ status from
retirements, firings, or voluntary terminations
Financing Cycle - Process of obtaining funding, through debt or equity, to run an organization's
activities and to purchase PP&E, servicing the financing, and ultimately repaying financial
obligations
Property, Plant & Equipment (PP&E) Cycle - Process of acquiring resources (e.g., land,
buildings, and machinery) needed to enable an organization's business activities
General Ledger and Reporting System - Process of recording, classifying, and categorizing an
organization's economic transactions and producing summary financial reports
Note:
Every company divides its transactions across cycles, but not all companies use every cycle. For
example, a manufacturing company would likely employ all of the cycles; an advertising agency
would not likely use the production cycle, since that cycle is used to track the costs of
manufacturing a product
Although most transactions are now typically done entirely online, manual systems are
presented in this sub-topic to help candidates more easily envision the process
Revenue to Cash Cycle - Process of taking orders, shipping products or delivering services, billing
customers, and collecting cash from sales. Typical process: Order to Cash
II B) Expenditure Cycle - Typical process: Procure to Pay
Operations (Production) Cycle - Varies by company and by product; however, there is a pattern of
accounting data inputs, processes & outputs that can be used when considering the production
cycle
The payroll process for hourly employees starts with an accurate and up‐to‐date master file.
The master file is updated with relevant payroll‐related data from the:
Human resources department - Updates the master file with data on new hires,
promotions, transfers, and firings
Government agencies - Provide tax rate data
Insurance companies - Provide data on the company’s assessed insurance rates
Employees - Provide any updates on their withholdings and deductions, such as a change in the
number of claimed exemptions after a change in their family (e.g., the birth of a baby) or an
increase in the amount the employee wishes to contribute to the retirement plan account
Modern AIS have greater computing capabilities and larger storage capacity than earlier AIS.
This has facilitated an integrated approach in which financial and nonfinancial information can
be linked within a single information system, known as an Enterprise Resource Planning (ERP)
system (discussed in the next sub-topic)
An example of a nonfinancial system that runs in parallel with an AIS that processes financial
data is a Customer Relationship Management (CRM) system. This system captures
information about sales calls, shipment tracking, and customer profiles. An ERP can link the
CRM system to the AIS to reduce potential errors and increase information usefulness
Project Management - Records and maintains billing, costing, time, performance, expenses,
and activity management
Customer Relationship Management - Maintains and supports sales, marketing, services,
contact with customers, commissions, and call center support. In addition, companies seek
to anticipate customer needs through the analysis of big data and the Internet of Things to
the extent possible
System Tools - Provided to establish and maintain master file data, flow of data, access
controls, and others
Database Management System (DBMS) - Interface or program between the database and the
application programs that access the database. The DBMS manages and controls the data and the
interface between the database and the application programs. The DBMS facilitates creating,
retrieving, updating, managing, and protecting data
The DBMS controls two primary components:
The data
The database program that allows data to be accessed, retrieved, modified, and locked if
necessary
The database schema or blueprint defines the database logical structure, or the way humans
view the data
The DBMS provides a centralized view of the data that can be accessed by many different
users from many different locations. The system can also limit which data a particular user
or set of users can access, retrieve, or modify
The DBMS schema allows users to access the database without knowing where the data is
actually physically located
Application programs
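The DBMS ideas above (a logical schema, shared access, and limits on what a given set of users can see) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration only: the table, view, and column names (customers, customer_directory, credit_limit) are hypothetical and do not come from the notes, and a production DBMS would add user accounts, locking, and permissions that SQLite does not model here.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database (hypothetical schema, for illustration)."""
    # The application never states where the data physically lives;
    # the DBMS handles storage behind the logical schema.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " credit_limit REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO customers (name, credit_limit) VALUES (?, ?)",
        [("Acme Co", 5000.0), ("Birch LLC", 12000.0)],
    )
    # A view exposes only part of the data (no credit limits) to some users.
    conn.execute(
        "CREATE VIEW customer_directory AS SELECT id, name FROM customers"
    )
    conn.commit()
    return conn


def customer_names(conn: sqlite3.Connection) -> list:
    """Retrieve names through the restricted view rather than the base table."""
    rows = conn.execute("SELECT name FROM customer_directory ORDER BY name")
    return [name for (name,) in rows]
```

An application program would call customer_names(build_demo_db()) and work only with the logical view, without knowing how or where the rows are stored.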
Data warehouse - Set of large databases consisting of detailed and summarized data that is used
primarily for analysis rather than for processing transactions
It is essentially a repository or storage location for all of a company’s data retrieved from
various programs, sources, and databases. The data is typically cleaned and organized so it can
be searched.
Evolution of EPM - Although EPM has been around for decades, its methodology has become more
and more sophisticated as the tools and software for EPM have improved and evolved
In its earliest stages, EPM consisted simply of face‐to‐face meetings and phone calls. The first
EPM software applications focused on collecting and providing accounting, budgeting, and
financial performance capability
The advent of electronic spreadsheets eliminated the tedious process of creating manual
spreadsheets, facilitating more strategic planning, better budgeting, and improved reporting
Later, dedicated EPM software packages were developed that automated much of the financial
consolidation and reporting duties of the finance and accounting departments. Windows‐based
client/server systems have given way to web‐based programs. Software as a Service (SaaS)
applications have been widely adopted, freeing employees to focus on higher‐level strategic
tasks rather than on managing IT‐related concerns
This topic addresses data governance from the standpoint of its definition, frameworks, life cycle,
retention policy, and protection.
Data governance - Set of defined procedures, policies, rules, and processes that oversee an
organization's data
Concerned primarily with managing the availability, usability, integrity, and security of data,
data governance is important because an organization’s data holds intrinsic value
However, without a well‐designed and functioning data governance program, data can be
corrupted, devalued, rendered unusable, lost, or even stolen
A data governance plan should include an oversight body, a set of procedures and controls,
and a set of policies or directives to implement the procedures and controls.
During the implementation phase of a data governance plan, data stewards should be selected
and trained in their role of responsible caretakers of their assigned data
Data stewards should be given primary responsibility over their data’s availability, usability,
integrity, and security
Various controls should be established to aid the stewards in their responsibilities. Input,
processing, and output controls aid data stewards in maintaining data quality
Input controls include data entry controls, such as proper data input screen or form
design, field checks, limit checks, completeness checks, validity checks, and batch totals
Processing controls include data matching, proper file labels, cross‐footing balance
tests, and concurrent update controls
Output controls include user review of output, reconciliations, and data transmission
controls
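The input controls named above (completeness, field, limit, and validity checks, plus batch totals) can be sketched in Python. This is a simplified illustration under stated assumptions: the record layout (customer_id, amount, invoice_date), the 10,000 limit, and the function names are hypothetical and are not taken from the notes.

```python
from datetime import date


def validate_invoice(record: dict, valid_customers: set,
                     max_amount: float = 10_000.0) -> list:
    """Return a list of input-control violations for one hypothetical invoice."""
    errors = []
    # Completeness check: every required field must be present and non-empty.
    for field in ("customer_id", "amount", "invoice_date"):
        if record.get(field) in (None, ""):
            errors.append(f"missing {field}")
    if errors:
        return errors
    # Field check: amount must be numeric; limit check: within the allowed range.
    amount = record["amount"]
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        errors.append("amount is not numeric")
    elif not 0 < amount <= max_amount:
        errors.append("amount outside limit")
    # Field check: the date field must hold a date.
    if not isinstance(record["invoice_date"], date):
        errors.append("invoice_date is not a date")
    # Validity check: the customer must exist in the master file.
    if record["customer_id"] not in valid_customers:
        errors.append("unknown customer_id")
    return errors


def batch_total_matches(records: list, control_total: float) -> bool:
    """Batch total: the sum of entered amounts must equal the control total."""
    return round(sum(r["amount"] for r in records), 2) == round(control_total, 2)
```

A data entry screen would typically run checks like these before a record is accepted, and compare the batch total after a group of records has been keyed in.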
Data availability should include proper fault tolerance and redundancies built into
information systems, uninterruptible power supplies and backup generators, backup and
tested backup procedures, and real‐time mirroring
The integrity of data can be preserved by proper segregation of duties, data change
management and authorization structures, and independent checks and audits
Data security can be aided by using a defense‐in‐depth approach, which includes implementing
data security controls throughout the organization at various levels; in other words, not relying
on just locking the front door but locking all office and closet doors in case a perpetrator
penetrates the front door
In addition, data security depends on employee training on proper data security
procedures, authentication controls, authorization controls such as an access control
matrix, firewalls and other network security tools, data encryption, and patch management
The best controls in the world can be compromised if one employee accidentally (or on
purpose) leaves the data exposed
Of course, there is no way to completely safeguard data from hackers, no matter how much
money is spent on safeguards. That is where data risk management comes into play
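An access control matrix, mentioned above as an authorization control, can be pictured as a table of roles, resources, and permitted actions. The sketch below is illustrative only: the roles, resources, and permissions are hypothetical, and a real system would also authenticate the user before consulting the matrix.

```python
# Hypothetical access control matrix: role -> resource -> permitted actions.
ACCESS_MATRIX = {
    "ap_clerk": {
        "vendor_master": {"read"},
        "invoices": {"read", "write"},
    },
    "controller": {
        "vendor_master": {"read", "write"},
        "invoices": {"read"},
        "general_ledger": {"read", "write"},
    },
    "auditor": {
        "vendor_master": {"read"},
        "invoices": {"read"},
        "general_ledger": {"read"},
    },
}


def is_authorized(role: str, resource: str, action: str) -> bool:
    """Deny by default: anything not listed in the matrix is refused."""
    return action in ACCESS_MATRIX.get(role, {}).get(resource, set())
```

Note how the hypothetical matrix also supports segregation of duties: the clerk who enters invoices cannot change the vendor master file.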
[Figure: objectives of I/C, entity structure, and components of I/C]
Objectives - I/C over data governance are necessary for operations, reporting, and compliance
with applicable laws and regulations
Entity Structure - I/C over data governance should be implemented at all levels of the
organization, including the entity, divisions, operating units, and individual functions
Components - 5 components of I/C {CRIME}:
Control environment - I/C over data governance depends on good leadership and culture
Risk assessment - companies need to identify risks to data governance
Information and communication - ensuring proper I/C over data governance improves
information quality throughout the company
Monitoring activities - companies need to monitor and adapt controls to respond to
changes in the environment
Existing Control activities - specific policies & procedures put in place to ensure data
governance
As its full name (Control Objectives for Information and Related Technologies) implies, COBIT is
focused on effective internal control as it relates to IT. The COBIT framework provides best
practices for effectively managing controls over IT. It is a voluminous and very detailed set of
manuals for creating, implementing, and maintaining IT‐related controls
The COBIT 4 framework divided IT into four major parts (which are then broken down into 34 IT
management control processes):
Plan and Organize
Acquire and Implement
Deliver and Support
Monitor and Evaluate
COBIT 5 is based on five key principles for governance and management of enterprise IT;
together, these 5 principles enable the enterprise to build an effective governance and
management framework that optimizes information and technology investment and use for the
benefit of stakeholders
Principle 1: Meeting Stakeholder Needs
Enterprises exist to create value for their stakeholders by maintaining a balance
between the realization of benefits and the optimization of risk and use of resources.
COBIT 5 provides all of the required processes and other enablers to support business
value creation through the use of IT
Since every enterprise has different objectives, an enterprise can customize COBIT 5 to
suit its own context through the goals cascade, translating high-level enterprise goals
into manageable, specific, IT-related goals and mapping these to specific processes and
practices
Principle 2: Covering the Enterprise End-to-end
Integrates governance of enterprise IT into enterprise governance:
- It covers all functions and processes within the enterprise; COBIT 5 does not focus
only on the ‘IT function’, but treats information and related technologies as assets
that need to be dealt with just like any other asset by everyone in the enterprise
- It considers all IT-related governance and management enablers to be enterprise-
wide and end-to-end, i.e., inclusive of everything and everyone—internal and
external—that is relevant to governance and management of enterprise information
and related IT
Principle 3: Applying a Single, Integrated Framework
There are many IT-related standards and good practices, each providing guidance on a
subset of IT activities. COBIT 5 aligns with other relevant standards and frameworks at a
high level, and thus, can serve as the overarching framework for governance and
management of enterprise IT
Principle 4: Enabling a Holistic Approach
Efficient and effective governance and management of enterprise IT require a holistic
approach, taking into account several interacting components
COBIT 5 defines a set of enablers to support the implementation of a comprehensive
governance and management system for enterprise IT. Enablers are broadly defined as
anything that can help to achieve the objectives of the enterprise. The COBIT 5
framework defines 7 categories of enablers:
- Principles, Policies and Frameworks
- Processes
- Organizational Structures
- Culture, Ethics and Behavior
- Information
- Services, Infrastructure and Applications
- People, Skills and Competencies
Principle 5: Separating Governance From Management
The COBIT 5 framework makes a clear distinction between governance and
management. These two disciplines encompass different types of activities, require
different organizational structures and serve different purposes.
COBIT 5’s view on this key distinction between governance and management is:
- Governance
Governance ensures that stakeholder needs, conditions and options are
evaluated to determine balanced, agreed-on enterprise objectives to be
achieved; setting direction through prioritization and decision making; and
monitoring performance and compliance against agreed-on direction and
objectives
In most enterprises, overall governance is the responsibility of the board of
directors under the leadership of the chairperson. Specific governance
responsibilities may be delegated to special organizational structures at an
appropriate level, particularly in larger, complex enterprises
- Management
Management plans, builds, runs and monitors activities in alignment with the
direction set by the governance body to achieve the enterprise objectives
In most enterprises, management is the responsibility of the executive
management under the leadership of the chief executive officer (CEO)
Data Life Cycle - Although there is some debate on the actual number of phases included in the
data life cycle, these 8 phases represent a general view of the data life cycle:
Data Maintenance - In order to be useful, data must be converted to a usable form. The
process of creating usable data may include cleansing, scrubbing, and processing through an
extract-transform-load (ETL) methodology
Companies use enterprise resource planning (ERP) systems and other less sophisticated
systems for their information needs. ERP and other information systems utilize database
technology to organize and query their data
In order for data to be loaded into a database, it must be cleansed and scrubbed of
superfluous characters and symbols. Also, it must be checked to ensure that date data fills
date fields, numeric data fills numeric fields, and character data fills character fields. In
essence, data cleansing and scrubbing transforms unstructured data into structured data
that can be used in an organization’s information system
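The cleansing and type checks described above can be sketched as a small transform step in Python. This is a minimal example, assuming a hypothetical raw record with vendor, amount, and paid_on fields and a month/day/year date format; none of these names or formats come from the notes.

```python
import re
from datetime import datetime


def clean_row(raw: dict) -> dict:
    """Scrub one hypothetical raw record so each value fits its field type."""
    # Character field: drop superfluous symbols, then collapse extra spaces.
    vendor = re.sub(r"[^A-Za-z0-9 &.-]", "", raw["vendor"]).strip()
    vendor = re.sub(r"\s+", " ", vendor)
    # Numeric field: strip currency symbols and thousands separators.
    amount = float(re.sub(r"[^0-9.-]", "", raw["amount"]))
    # Date field: strptime raises ValueError if the text is not a valid date.
    paid_on = datetime.strptime(raw["paid_on"].strip(), "%m/%d/%Y").date()
    return {"vendor": vendor, "amount": amount, "paid_on": paid_on}
```

A row that fails a check raises ValueError, so the load step can reject or quarantine it rather than place bad data in the database.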
Data Synthesis - Involves the use of statistical methods that combine data from many sources
or tests in order to obtain a better overall estimate or answer to the questions being asked of
data
Some may term this data modeling or using inductive reasoning to transform data. Others
view data synthesis as a subset of data maintenance
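One common way to combine estimates from several sources into a better overall estimate is an inverse-variance weighted average, in which more precise sources receive more weight. The notes do not name a specific method, so the sketch below is only one possible illustration; the function name and inputs are assumptions.

```python
def pooled_estimate(estimates: list) -> tuple:
    """Combine (value, standard_error) pairs from independent sources.

    Uses inverse-variance weighting (an assumed technique, not one named
    in the notes): each source is weighted by 1 / standard_error**2.
    """
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * v for w, (v, _) in zip(weights, estimates)) / total_weight
    # The pooled standard error is smaller than any single source's error.
    pooled_se = (1.0 / total_weight) ** 0.5
    return pooled, pooled_se
```

For instance, pooling two equally precise estimates of 10.0 and 14.0 gives their simple average, while a less precise source pulls the result less.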
Data Usage - Using data to support the mission of the business, such as strategic planning,
customer relationship management (CRM), processing invoices, sending purchase orders to
vendors, etc.
Data Analytics - Science of examining raw data with the purpose of creating new information
and generating business insight
Encompasses the skills, technologies, and practices for iterative exploration and
investigation of past business performance to gain insight and drive business planning for
the future
At its most basic level, it means using data analysis methodologies to answer questions
Some view data analytics as a subset of data usage
Data Archival - Process of removing data from active use to be stored for potential future use
Record Retention - A competent record retention policy is necessary for every organization
Records must be kept and maintained for internal use as long as they are needed by users to
research, analyze and document past events and decisions
In addition, records must be preserved to meet legal and regulatory requirements
For example, the IRS instructs taxpayers to retain tax return data for between 2 and 7 years
depending on the date of filing or payment and types of deductions the organization
claimed
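A record retention policy ultimately reduces to date arithmetic: a record may be disposed of only after its retention period has passed. The sketch below assumes the retention period in years is supplied by the organization's policy; the function names are hypothetical, and the code does not encode any actual IRS rule.

```python
from datetime import date


def retention_end(filed_on: date, retention_years: int) -> date:
    """First date on which a record filed on `filed_on` may be disposed of."""
    try:
        return filed_on.replace(year=filed_on.year + retention_years)
    except ValueError:
        # Filed on Feb 29 and the target year is not a leap year: use Mar 1.
        return date(filed_on.year + retention_years, 3, 1)


def may_dispose(filed_on: date, retention_years: int, today: date) -> bool:
    """True once the retention period set by policy has fully elapsed."""
    return today >= retention_end(filed_on, retention_years)
```

Records still needed for internal research or subject to a legal hold would be excluded before a check like this is applied.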
This topic addresses technology‐enabled finance transformation—in other words, how technology can
make a company’s finance operations run faster, better, and more effectively. This section addresses
the topics of the systems development life cycle (SDLC), business process analysis, robotic process
automation (RPA), artificial intelligence (AI), cloud computing, software as a service (SaaS), and
blockchain.
Systems Development Life Cycle - Structured road map for designing and implementing a new
information system. Although there are many versions and variations to SDLC, a basic five‐step
approach is presented below:
Systems analysis - Involves identifying the needs of the organization and assembling the
information needed to choose among modifying the current system, purchasing a new system, and
developing a new system
Conceptual design - Involves creating a plan for meeting the needs of the organization. Design
alternatives are prepared and detailed specifications are created to provide instruction on how
to achieve the desired system
Physical design - Involves taking the conceptual design and creating detailed specifications for
creating the system. The design would include specifications for computer code, inputs,
outputs, data files and databases, processes and procedures, as well as proper controls
Implementation and conversion - Involves the installation of the new system including
hardware and software. The new system is tested and users are trained. New standards,
procedures, and controls are instituted
Operations and maintenance - Involves running the system, checking performance, making
adjustments as necessary, and maintaining the system. Improvements are made and fixes are
put in place until the organization determines that the benefits of the old system no longer
justify the cost of maintaining it, and the whole cycle starts over again.
Drawbacks of RPA:
If rules or processes change, then the RPA system requires updating. This can require
significant time and energy if changes are significant.
The initial investment to develop a rule-based system that automates workplace tasks can
be costly. It requires significant time and understanding to define procedures and
processes, and further time and resources to test, verify, and/or audit the process to ensure
accurate completion of the task
RPA can lower companies’ costs by increasing throughput and reducing errors, yet that
benefit must be weighed against the cost of designing and implementing the automated
process
V) Cloud Computing
Cloud computing is a shared resource setup that allows for improved processing of electronic
information. Cloud computing is a network of remote servers that are connected by the
Internet. The remote servers are used to store, manage, and process data. Rather than
performing these functions on a local server or a personal computer, cloud computing joins
numerous servers and computers. Cloud computing can provide access to greater data storage,
faster processing speeds, and a wider range of software applications.
Cloud computing can help avoid data loss due to localized hardware failures and malfunctions
because of networked backups and redundancies. The “cloud” or network of servers provides a
safeguard by storing information on multiple servers at multiple geographic locations. While
cloud computing does rely on the Internet in order to function, it alleviates the reliance on
individual servers and computers to store and process information.
VI) Blockchain
Blockchain -
Most business transactions need some kind of middleman, such as a bank or an insurance company
Blockchain removes the need for a middleman and connects consumers & suppliers directly
Cryptocurrency is the most popular currency used with blockchain technology
Blockchain process -
Transactions initiated are represented as a “block” or an online record (e.g., A wants to send
money to B)
The “block” is broadcast to all parties connected to the network
Once the parties in the network authenticate the transaction, the “block” is added to the
“chain” and the transaction is processed (e.g., money is transferred from A to B)
Note that “blocks” on the “chain” are secure and permanent records that cannot be deleted
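The reason blocks on the chain act as permanent records can be illustrated with a toy hash chain: each block stores the hash of the previous block, so changing an old record breaks every later link. The sketch below is a simplification under stated assumptions; the field names are hypothetical, and the network broadcast and authentication steps described above are left out.

```python
import hashlib
import json


def block_hash(index: int, transaction: dict, previous_hash: str) -> str:
    """SHA-256 hash over a block's contents and its link to the prior block."""
    payload = json.dumps(
        {"index": index, "transaction": transaction, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain: list, transaction: dict) -> dict:
    """Append a block that points at the hash of the current last block."""
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    block = {
        "index": index,
        "transaction": transaction,
        "previous_hash": previous_hash,
        "hash": block_hash(index, transaction, previous_hash),
    }
    chain.append(block)
    return block


def chain_is_valid(chain: list) -> bool:
    """Recompute every hash and link; any edited block makes this fail."""
    previous_hash = "0" * 64
    for block in chain:
        if block["previous_hash"] != previous_hash:
            return False
        expected = block_hash(block["index"], block["transaction"], block["previous_hash"])
        if block["hash"] != expected:
            return False
        previous_hash = block["hash"]
    return True
```

If someone alters the amount in an early block, chain_is_valid returns False, which is how other parties holding a copy of the ledger would detect the tampering.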
Advantages of Blockchain -
High transparency in the transactions, which can be tracked accurately
The ledger is a permanent record (which cannot be altered)
Every core transaction is processed just once (in one shared electronic ledger), thus,
reducing redundancy and delays
Huge cost savings versus maintaining physical records; also builds collaborative technology
between companies doing business
Because the ledger is distributed and publicly verified, and data mining and records verification
happen in nearly real time, less time and effort is spent on reconciling information
Disadvantages of blockchain -
Technology is complex
Regulatory clearance and implications are unknown
Implementation and training are complicated
Blockchain uses - Blockchain technology is currently explored mostly in the financial services
sector and is spreading into healthcare, legal, insurance, telecommunications, etc.
E.g., Settling of stock trades, patients’ health records, insurance
E.g., Supply chain - The ability of blockchain technology to exchange data seamlessly
through decentralized peer‐to‐peer networks with all transactions immutably stored and
available for audit could make the supply chain system much more transparent and reliable.
Blockchain technology and smart contracts could reduce or eliminate the numerous
documents that accompany the international shipment of goods and the time lag required
to process international financial transactions
A few emerging categories for blockchain use cases [Image: Peter Bergstrom]
Cryptocurrency -
Currency which is ‘mined’ by miners (who are the network of parties that authenticate the
transactions)
Miners are generally computer programmers who get rewarded by getting digital currency
Basically, mining serves two purposes:
Authenticating transactions and
Generating digital currency.
Mining needs a lot of resources and is intended to be an arduous task. Each block
contains the miners' proof of work, which is verified every time a new block is
generated. Miners ensure that the transaction is secure and processed properly and safely.
The miners add transaction records to the public ledger
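Proof of work can be illustrated with a toy search for a number (a nonce) that makes a block's hash start with a required number of zeros: finding the nonce takes many attempts, while checking it takes a single hash. The sketch below is a simplified illustration; the difficulty value and the way the data and nonce are joined are assumptions and do not reflect the rules of any real cryptocurrency.

```python
import hashlib


def _digest(block_data: str, nonce: int) -> str:
    """SHA-256 hash of the block data combined with a candidate nonce."""
    return hashlib.sha256(f"{block_data}|{nonce}".encode("utf-8")).hexdigest()


def mine(block_data: str, difficulty: int = 3) -> tuple:
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = _digest(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int = 3) -> bool:
    """Checking the work needs only one hash, so any party can do it cheaply."""
    return _digest(block_data, nonce).startswith("0" * difficulty)
```

Each extra leading zero required makes the search roughly 16 times longer on average, which is one way to picture why mining is intended to be arduous.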
Characteristics of Cryptocurrency:
Just like currency, cryptocurrency is a medium of exchange
Created and stored electronically
Does not have intrinsic value and cannot be redeemed for another commodity
No central bank determines the supply of cryptocurrency; the network is distributed and
decentralized
Cryptocurrencies provide cheaper and faster peer-to-peer payment options without the
need to provide personal details
Regulatory requirements for cryptocurrencies are still not well established and how
different countries are going to react is yet to be seen
E.g., Bitcoin, Ethereum, Litecoin
I) Business Intelligence
Big Data - Refers to datasets which are extremely large and/or complex
Big data is used to gain insight into relationships and associations to improve decision making
and strategy
Big data is too large to be analyzed by traditional spreadsheet software and requires special
software and computational power to be processed and analyzed
The advancement of computer processing has enabled organizations to record, store, and
analyze large volumes of data that were previously unavailable
Big data is broken down into four dimensions:
Volume - Refers to the quantity of the data
Big data is just that, BIG. Enormous amounts of data are stored in databases with
information on thousands, millions, or even billions of observational units.
Velocity - Refers to the speed at which big data is generated and analyzed; frequently, big
data is available in real time
Big data can involve the analysis of a constant stream of new data. For example, the
New York Stock Exchange captures over 1 terabyte of trade information during each
trading session. This huge amount of data is instantly incorporated into analyses
Variety - Deals with the types of data, such as numerical, textual, images, audio, and video
Big data can refer to traditional relational databases, videos, social media exchanges,
emails, health monitors, and more
Veracity - Refers to the quality/accuracy of data
Poor-quality data can make business decision making more difficult
The opportunities for the use of big data are substantial
Big data analysis can draw on many sources so that a business can target advertising directly
to potential customers who have shown interest in its product through internet searches,
social media posts, and demographics
Hospitals can capture data on patients to screen for harmful drug interactions or drug
allergies of an incapacitated patient brought into an emergency room
Big data also presents challenges:
Personal privacy - There is concern about all this data being available to marketers in real
time—to say nothing of the cost of data breaches to customers and companies or the risk of
the infringement of civil rights in the case of governmental use of big data
Human touch - Companies also face the challenge of keeping a human touch on interactions
with potential customers, who may not appreciate being mere sets of bytes to a company
Semi-structured data - Does not have neat, organized fixed fields like structured data, but may
still contain organizing features such as tags or markers
While it does not have the formal structure of a relational database, semi-structured data
does have features which allow it to have classifications and groupings
E.g., XML (eXtensible Markup Language) which is used to encode documents in human and
machine readable formats
Unstructured data - Is unorganized and not easily searchable - i.e., computers cannot easily
work with the data
Unstructured data is often text-based, like human speech, rendering it difficult to categorize
and organize into predefined, set data fields
E.g., Email, Twitter feed, text messages, photos, videos
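As a rough illustration of how the tags in semi-structured data allow classification and grouping, the sketch below parses a small XML document with Python's standard library. The invoice records and tag names are invented for this example and do not come from the notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice records; tag and attribute names are assumptions.
xml_text = """
<invoices>
  <invoice id="1001">
    <customer>Acme Co</customer>
    <amount currency="USD">250.00</amount>
    <note>Rush order, deliver before Friday</note>
  </invoice>
  <invoice id="1002">
    <customer>Baker LLC</customer>
    <amount currency="USD">100.50</amount>
  </invoice>
</invoices>
"""

root = ET.fromstring(xml_text)

# The tags let a program locate each amount even though the records
# do not share a fixed set of fields (the second invoice has no <note>).
amounts = {
    inv.get("id"): float(inv.find("amount").text)
    for inv in root.findall("invoice")
}
total = sum(amounts.values())

# Missing optional fields are handled with a default instead of failing.
notes = [inv.findtext("note", default="") for inv in root.findall("invoice")]

print(amounts)
print(total)
print(notes)
```

The free-text note inside the first invoice is itself unstructured; only the surrounding tags make it easy to find.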
Business intelligence - Applications, tools, and best practices that transform data into actionable
information in order to make better decisions and optimize performance
Data Transformation - From Data to Action
Data by itself has very limited meaning. It is simply facts, statistics, symbols, numbers, texts,
or characters that may lack structure or form. Data is raw and unorganized
When structures or organizations are used, data is transformed into information
Information is different from data because information carries meaning and understanding
Based on information, knowledge can be defined and created. Knowledge is what we know
and how we understand the way things are
From knowledge, critical analysis can be done through logical thinking to gain insights and
understand situations/context in order to make decisions and undertake actions
Using Artificial Intelligence for Business Intelligence - Create computer programs that can mimic
human insight (i.e., to take a database of facts and find the connections/patterns in the data)
Opportunity is to capture and preserve expert knowledge, insight, and decision‐making
skills so that they are not lost to future professionals. Programs can acquire and process
data faster and in greater volume and detail than a human decision maker
Challenge is to capture something as subjective as human insight and decision making and
convert it to something as discrete and objective as computer code, i.e., lines of detailed
instructions that can be programmed into software. In essence, this process creates the
thousands of rules necessary to reproduce human decision making
Descriptive Analytics - Form of data analytics which aims to answer “What happened?”
Is observational and reports the characteristics of historical data
Describes statistical properties such as the mean, median, range, or standard deviation
Beneficial because it provides understanding of what has actually taken place
Diagnostic Analytics - Form of data analytics which aims to answer “Why did it happen?”
Looks at correlations, the size and strength of statistical associations
Can help identify empirical relationships that may be unknown or uncertain by exploring
the data for meaningful statistical associations
Tries to uncover and understand why certain outcomes take place and what could be
important factors causing those outcomes
Predictive Analytics - Form of data analytics which aims to answer “What will happen?”
Builds upon descriptive and diagnostic analytics to make predictions about future events
Can take the form of what-if analysis where sets of possible facts, their likelihoods, and
ranges are used to formulate potential future outcomes
Considers risk assessments, usually taking the form of outcome likelihoods and
uncertainties, to guide the prediction of future outcomes and trends
Prescriptive Analytics - Form of data analytics which aims to answer “How can we make it happen?”
Draws upon the other forms of data analytics to infer or recommend the best course of
action
Can take the form of optimization or simulation analyses to identify and prescribe the
actions to undertake to realize the best or most desired result
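To make the first two forms of analytics concrete, the following minimal sketch computes the descriptive statistics named above (mean, median, range, standard deviation) and one diagnostic measure, the correlation between two variables. The sales and advertising figures are hypothetical, and the helper pearson_r is written out for illustration rather than taken from any particular tool.

```python
import statistics

# Hypothetical monthly figures, invented for illustration.
monthly_sales = [120, 135, 150, 110, 160, 145]
ad_spend = [10, 12, 15, 9, 17, 14]

# Descriptive analytics: "What happened?"
summary = {
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "range": max(monthly_sales) - min(monthly_sales),
    "stdev": statistics.stdev(monthly_sales),  # sample standard deviation
}


def pearson_r(x, y):
    """Correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Diagnostic analytics: "Why did it happen?" One starting point is the
# size and direction of the association between sales and ad spend.
r = pearson_r(ad_spend, monthly_sales)

print(summary)
print(round(r, 3))
```

A strong correlation only flags a candidate explanation; it does not by itself show that advertising caused the change in sales.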
Classification - Data analysis technique which attempts to predict which category or class an
item belongs to
Typically begins with predefined categories and then attempts to sort an item into one of
those categories
- This equation could then be used to predict a total cost based on the output level. If
output was 100 units, then the total cost would be estimated to be
$3,000 = $1,000 + ($20 × 100 units).
How well a regression line fits the observed data is measured by the coefficient of
determination, commonly referred to as R²
R² can be interpreted as the % of variation in the dependent variable that is explained
by variation in the independent variables
- That is, if R² is equal to 0.73, it means that 73% of the variation in the dependent
variable is explained by variation in the independent variables
Another commonly used statistical measure is the correlation coefficient, commonly
referred to as R. R is a measure of how two variables are related. It provides both the
direction, positive or negative, and the strength of the relationship between two variables.
R ranges between -1 and 1.
Regression analysis provides other useful statistical measures:
The standard error of the estimate measures the accuracy of predictions
The regression analysis provides a prediction line, and the standard error of the
estimate tells how far, on average, observed data points fall from the predicted points
The goodness of fit is a measure of how well the observed data fit a statistical model.
That is, it is a summary of the discrepancy between observed values and what the
model would predict the value to be
Finally, a confidence interval is a range of values in which the true value is likely to lie.
Regression models only provide estimates of effect sizes and relationships. A confidence
interval provides a range, rather than a point estimate, over which the true value is likely to be
Regression modeling provides many useful benefits, but does face limitations. For example,
Regression models should only be used to make predictions within what is called the
relevant range. The relevant range refers to the set of observed values. If a desired
prediction is outside of the set of observed values, there is a possibility that the
estimated relationship from the regression model will not hold outside of the
observed values. Predicting outside the relevant range is referred to as extrapolation, whereby
predictions about an unknown observational range are inferred from a known observational range
Regression analysis can also be affected by outliers, extreme values that are far away
from the main observations
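The regression measures discussed above can be sketched for the single-variable case with ordinary least squares. This is a minimal illustration, not the method of any specific software package: the function names and the cost data are assumptions made for the example, and the relevant-range check simply flags predictions outside the observed output levels.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_statistics(x, y, intercept, slope):
    """Return (R squared, standard error of the estimate)."""
    n = len(x)
    mean_y = sum(y) / n
    predicted = [intercept + slope * a for a in x]
    sse = sum((b - p) ** 2 for b, p in zip(y, predicted))  # unexplained
    sst = sum((b - mean_y) ** 2 for b in y)                # total
    r_squared = 1 - sse / sst
    std_error = math.sqrt(sse / (n - 2))  # n - 2 for one independent variable
    return r_squared, std_error


def predict(units, intercept, slope, observed_x):
    """Return (predicted value, True if units lies in the relevant range)."""
    in_range = min(observed_x) <= units <= max(observed_x)
    return intercept + slope * units, in_range


# Hypothetical output levels and total costs, invented for illustration.
units = [50, 80, 100, 120, 150]
total_cost = [2050, 2560, 3010, 3390, 4020]

intercept, slope = fit_line(units, total_cost)
r_squared, std_error = fit_statistics(units, total_cost, intercept, slope)
estimate, in_range = predict(110, intercept, slope, units)
far_estimate, far_in_range = predict(500, intercept, slope, units)

print(round(intercept, 2), round(slope, 2))
print(round(r_squared, 4), round(std_error, 2))
print(round(estimate, 2), in_range)
print(round(far_estimate, 2), far_in_range)  # extrapolation: treat with caution
```

With data that fall exactly on the line from the notes, fit_line([0, 50, 100], [1000, 2000, 3000]) recovers an intercept of 1,000 and a slope of 20, so 100 units gives the $3,000 estimate.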
Time series - Data analysis method that considers data points over time. The temporal ordering
of data points allows for patterns to be identified, aiding the prediction of future values.
It should be noted that regression analysis is often done on cross-sectional data, that is,
data observed within the same time period. It is possible to combine cross-sectional
data with time-series data to observe the same phenomena over multiple periods of time.
Combined cross-sectional and time-series data is referred to as panel data.
Time-series analysis can identify patterns in observed data. Trends provide useful
information to help predict future outcomes based on what has happened in the past.
Common trends include systematic trends (such as prolonged upward or downward
movements), cyclical trends (such as macroeconomic cycles that rise and fall), seasonal
trends (such as periodic spikes in retail shopping around holidays), and irregular trends
(such as erratic fluctuations due to unforeseen events like natural disasters).
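One simple way to separate a systematic trend from seasonal swings is a moving average whose window matches the length of the seasonal cycle. The sketch below is only an illustration of that idea under assumed data: the quarterly sales figures are hypothetical and moving_average is a helper written for this example.

```python
def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical quarterly sales over two years, with a spike every fourth quarter.
quarterly_sales = [100, 120, 90, 150, 110, 130, 100, 160]

# A 4-quarter window averages out the seasonal pattern,
# leaving the underlying upward trend.
trend = moving_average(quarterly_sales, 4)
print(trend)  # [115.0, 117.5, 120.0, 122.5, 125.0]
```

The raw series rises and falls from quarter to quarter, while the smoothed series increases steadily, which is the systematic trend a forecaster would project forward.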
Data tables are the first level of providing information to readers. Tables efficiently use space,
are scalable, and can be simple. Due to the prevalence of spreadsheets, most users are familiar
with and comfortable making tables and graphs. Software programs can make data tables
readily accessible and make it relatively easy to find and manipulate data. Six best practices
that can and should be applied to tables and graphs are listed next.
Planning - Before the first cell receives data, the purpose and content of the table and
graph should be planned. Know the audience for the table/graph and plan accordingly.
Focus - Focus of the table/graph should be the most prominent part of the design so that
readers will instantly recognize it
Alignment - For tables, text must be aligned or justified on the left of the cell and numerical
data must be aligned on the right side of the cell. Column headings must be appropriately
placed according to the content.
Size - Character (text and numeric) size matters. If the text is too small, then it is difficult to
read and probably will not be read. In addition, the use of common fonts is recommended;
uncommon fonts may focus attention on the font and not the information the designer is
seeking to convey.
Clutter - Enemy to every table and graph. Always leave sufficient white space to help the
reader focus on the message
Color - Can be a powerful tool in providing depth, focus, and contrast. However, too much
or poorly planned color can distract from the message of the table/graph
Data visualization tools - Multiple tools/methods each have their specific purpose. The more
common data visualization tools and their purposes are described next. These tools have common
uses. Some are better at providing comparisons, others, distributions, still others at relationships
and trends. They can also be categorized according to their intended use.
Comparisons
Bar charts - Used to compare categories of data or data across time
E.g., we could compare the annual gross revenue of 10 different action movies with
each bar representing a movie. We could also represent the gross receipts for each
month after the release of one movie
Distributions
Histograms - Show how many data points fall into a certain range so that the distribution of
the data points can be viewed
Can look like bar charts. The main difference is that the histogram displays the
distribution (frequency) of one variable where the bar chart displays a comparison
based on two variables (i.e., one variable is tracked on the y‐axis and the other variable
is tracked on the x‐axis). However, histograms are only used with numerical values, and
the data points are displayed in an interval rather than the actual values.
Dot plot - Similar to a histogram except it uses vertically stacked dots to represent a data
distribution rather than bars. Dot plots are useful for relatively small data sets
Box plot (or box‐and‐whisker plot) - Displays the distribution of a data set using five
standard measurements: the minimum data point, the lower quarter of the data points
(first quartile), the median or middle point of the data, the third quartile of the data points,
and the maximum data point
Box plots do not show individual values and can be skewed, but they are also one of the
few techniques that display outliers. They are also useful in showing a comparison
among distributions
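The numbers behind a histogram and a box plot can be computed before anything is drawn. The following sketch, using hypothetical test scores, counts data points per interval (what a histogram displays) and derives the five standard measurements of a box plot. Quartile conventions differ between tools; the "inclusive" method of Python's statistics.quantiles is used here as one assumption among several reasonable choices.

```python
import statistics


def histogram_counts(values, bin_width):
    """Count how many values fall into each interval of width bin_width."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width  # lower edge of the interval
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, maximum."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


# Hypothetical test scores, invented for illustration.
scores = [52, 58, 61, 64, 67, 70, 71, 75, 78, 95]

bins = histogram_counts(scores, 10)
summary = five_number_summary(scores)

print(bins)     # intervals starting at 50, 60, 70, and 90
print(summary)
```

The histogram counts hide the individual values inside each interval, and the five-number summary hides them entirely, which mirrors the trade-offs described above.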
Location
Maps and Filled maps - Display geospatial data so that data points can be viewed in relation
to their geographical locations
When data relates to geographic locations, such as countries, states, cities, and zip
codes, maps and filled maps are powerful tools to visualize data
Relationships
Scatterplot - Displays data points as they are plotted according to their relative position to
the x‐ and y‐axis
This technique provides a view of how the data is related or positioned as well as its
distribution
However, scatterplots do not show the relation between more than two variables.
Bubble chart - Version of the scatterplot. Here the dots on a scatterplot are consolidated
into bubbles, which vary in size to represent the number of data points
Similar to pie charts, bubble charts are best used when the bubble sizes display
substantial variation
Heat map - Displays a relationship among variables through changes in the intensity of color
Heat maps provide a visual way to see numerical values. In fact, heat maps can show a
substantial amount of data without overwhelming readers
Heat maps also facilitate identifying outliers by displaying squares with a color intensity
significantly different from the surrounding squares
However, heat maps are less precise than other techniques since distinguishing among
various color hues can be difficult.
Trend
Line chart - Displays information as a series of data points connected by line segments
Line charts are easy to read and interpret. They are also helpful in making comparisons
between data sets and for showing changes or trends over time
One limitation is that line charts only show data over time