
Lecture 1

The document provides an overview of business intelligence and related concepts. It discusses business analytics, business intelligence, big data, and data mining. It explains that BI tools and techniques turn data into meaningful information to help businesses make better decisions. The document also discusses characteristics of good data for decision making, common BI applications, and tools that can help organizations get more value from their data warehouses.

Uploaded by

adane asfaw
Copyright
© All Rights Reserved

Business Intelligence Overview

1
Business Analytics, BI, Big Data, Data
Mining - What’s the difference?
• Business Analytics – Tools to explore past data to
gain insight into future business decisions.
• Business Intelligence (BI) – Tools and techniques
to turn data into meaningful information.
• Big Data – Data sets that are so large or complex
that traditional data processing applications are
inadequate.
• Data Mining - Tools for discovering
patterns in large data sets.
2
Businesses Need Support for
Decision Making
• Uncertain economics
• Rapidly changing environments
• Global competition
• Demanding customers

• Taking advantage of information acquired by
companies is a Critical Success Factor.

3
Characteristics of Data for Good
Decision Making

4
The Information Gap
• The shortfall between gathering information
and using it for decision making.
– Firms have inadequate data warehouses.
– Business Analysts spend 2 days a week gathering
and formatting data, instead of performing
analysis. (Data Warehousing Institute).
– Business Intelligence (BI) seeks to bridge the
information gap.

5
Business Intelligence
• Tools and techniques to turn data into
meaningful information.
– Process: Methods used by the organization to turn
data into knowledge.
– Product: Information that allows businesses to
make decisions.

6
Business Intelligence
• Collecting and refining information from many
sources (internal and external)
• Analyzing and presenting the information in
useful ways (dashboards, visualizations)
– So that people can make better decisions
– That help build and retain competitive advantage.
• Goal: Convert Data to (Actionable) Knowledge

7
Klipfolio - sample of a marketing
dashboard

8
FitBit – Health Dashboard

9
BI Applications
• Customer Analytics
• Human Capital Productivity Analysis
• Business Productivity Analytics
• Sales Channel Analytics
• Supply Chain Analytics
• Behavior Analytics

10
BI Initiatives
• 70% of senior executives report that analytics will
be important for competitive advantage. Only 2%
feel that they’ve achieved competitive advantage.
(zassociates report)
• 70-80% of BI projects fail because of poor
communication and not understanding what to
ask. (Goodwin, 2010)
• 60-70% of BI projects fail because of technology,
culture and lack of infrastructure (Lapu, 2007)
11
Evolution of BI

12
Evolution of BI (contd.)

13
Implement successful
Business Intelligence Strategy

14
Justify BI
• Users Need TIMELY, ACCURATE, and
CONSISTENT Data
• It is… more than faster reporting
• It’s ALL about profitability

15
Justify BI

16
Justify BI

17
Business Intelligence Benefit OPPORTUNITY

18
BI Selection
• Now that you have a business case… where
do you start?
• Unlocking the Data

19
BI Implementation
• YOU HAVE TO HAVE A PLAN

20
BI Implementation

21
BI Implementation

22
BI Implementation

23
BI Implementation

24
BI Implementation
• Implementation can be easy...
• …getting the value from the technology can be
hard

25
BI Adoption

26
Data Warehouse
• Collection of data
from multiple
sources (internal
and external)
• Summary, historical and raw data from
operations.
• Data “cleaning” before use.
• Stored independently from
operational data.
• Broken down into DataMarts for
use.
27
Data Mining
• “Data mining is an interdisciplinary subfield of
computer science. It is the computational process of
discovering patterns in large data sets involving
methods at the intersection of artificial intelligence,
machine learning, statistics, and database systems.” -
Wikipedia
• Examining large databases to produce new
information.
– Uses statistical methods and artificial intelligence to
analyze data.
– Finds hidden features of the data that were not yet known.

28
5 Tasks of Data Mining in Business
• Classification – Categorizing data into
actionable groups. (ex. loan applicants)
• Estimation – Response rates, probabilities of
responses.
• Prediction – Predicting customer behavior.
• Affinity Grouping – What items or services are
customers likely to purchase together?
• Description – Finding interesting patterns.
29
Data Mining Techniques
• Market Basket Analysis
• Cluster Analysis
• Decision Trees and Rule Induction
• Neural Networks

30
Market Basket Analysis
• Finding patterns or sequences in the way that
people purchase products and services.
• Walmart Analytics
– Obvious: People who buy gin also buy tonic.
– Non-obvious: Men who bought diapers would also
purchase beer.

31
Cluster Analysis
• Grouping data into like clusters based on
specific attributes.
• Examples
– Crime map clusters to better deploy police.
– Where to build a cellular tower.
– Outbreaks of Zika virus.

32
Why Data Mining?
• Now that we have gathered so much data, what
do we do with it?
• The datasets are of little direct value themselves.
What is of value is the knowledge that can be
inferred from the data and put to use.
• Data volumes are TOO BIG for traditional DSS
Query/ Reporting and OLAP tools.
• Organizations have to get value from the huge
investments of time and money made in building
data warehouses.
33
Discover the Diamonds in Your
Data Warehouse
• Maximize your ROI on data warehousing & data marts by enabling
your decision makers to exploit your customer data for competitive
advantage
• This web-enabled, point-and-click approach lets you employ OLAP,
neural networks, churn analysis, and many other visualizations and
analytical techniques to –
– Improve customer retention
– Target key prospects
– Profile market segments
– Detect fraud
– Analyze customer response, and much more

• Without BI, your DW is… well, a warehouse full of data

34
The Economics of Attention
“A wealth of information creates a poverty of
attention.” - Nobel Prize-winning economist
Herbert Simon
• Problem: NOT Information Access BUT
Information Overload
• Challenge: Locating, Filtering &
Communicating What is useful to the user

35
Why is Data Mining a “Hot” Topic
Today?
1. Implementation of ERP, CRM & SCM systems has resulted in vast stores
of operational data.

2. Emergence of global competition has put pressure on companies to
be “data-driven” – i.e., make informed decisions based on facts and not
hunches.

3. The speed of change in the marketplace demands that the pearls of
actionable information be found faster in the ocean of data, for
companies to be one step ahead of competition.

4. The hardware needed to store and process a “ton of data” was
prohibitively expensive until recently – “You would have had to have
NASA at your disposal”. Today, the technology makes it feasible to apply
complex models to ferret out patterns previously left to rot in “data
jails”.

36
The Payoff from Data Mining
- Two Examples
1. Farmers Insurance
– Based on traditional data analysis, drivers of sports cars were determined to
be at higher risk for collisions than drivers of “safe” cars such as Volvos.
– Hence, the company charged them more for car insurance.
– Data mining discovered a pattern that changed the pricing policy:
as long as the sports car was not the only car in the household, the driver
fit the profile of the “safe” family car driver, not the risky sports car driver.
2. Walgreen (a large retailer)
– In the past, success of promotional offers such as 2-for-1 sales was measured
primarily by product sales.
– With data mining, Walgreen can see what other items are selling with its
promotional offers.
– It tuned its programs to put things on sale that people tend to buy in
tandem with high-margin items.

37
Tools to Get Value from Data
Warehouses
• Business Intelligence Tools
– To enable users without programming skills to
analyze the raw data in the data warehouse.
• Ad Hoc Query / Reporting
• OLAP Tools to “slice” and “dice” data.
• Data Mining Tools
– Automate the detection of patterns in the data
warehouse
– Build models to predict behavior through
statistical and machine-learning techniques.
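
To make “slice” and “dice” concrete, here is a minimal sketch in plain Python. The sales records, the field names, and the helpers `select` and `rollup` are made up for illustration; a real OLAP tool does this against the data warehouse, not an in-memory list. Slicing fixes one dimension, dicing fixes two or more, and either result can then be totaled along a remaining dimension.

```python
from collections import defaultdict

# Hypothetical fact rows: one row per (region, product, quarter) sale total.
sales = [
    {"region": "East", "product": "cola",  "quarter": "Q1", "amount": 100},
    {"region": "East", "product": "chips", "quarter": "Q1", "amount": 50},
    {"region": "West", "product": "cola",  "quarter": "Q1", "amount": 70},
    {"region": "East", "product": "cola",  "quarter": "Q2", "amount": 120},
    {"region": "West", "product": "chips", "quarter": "Q2", "amount": 30},
]

def select(rows, **fixed):
    """Keep rows whose dimensions match every fixed value (slice or dice)."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]

def rollup(rows, by):
    """Total the amount along one remaining dimension."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[by]] += r["amount"]
    return dict(totals)

# Slice: fix one dimension (quarter), then total by region.
q1_by_region = rollup(select(sales, quarter="Q1"), by="region")

# Dice: fix two dimensions (region and product), then total by quarter.
east_cola_by_quarter = rollup(select(sales, region="East", product="cola"), by="quarter")

print(q1_by_region)
print(east_cola_by_quarter)
```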
38
Data Mining Not Limited to
Discovery
… i.e., finding an existing nugget of “gold” in the “mountain” of data,
• Data Mining is also used for Prediction
– Telling you not just where the gold is “today”, but where the gold
might be “tomorrow”
– Predict what is going to happen next based on what we have found.
“From the moment I signed up for my Total Rewards card in the
casino lobby and filled in my name, address, date of birth and
driver’s license number, Harrah’s had a pretty good hunch that my
long term potential was already low… I was a 32-year-old man
from the distant state of Montana… did not fit the profile of a
high-value customer!”
Age, gender and distance from the casino were identified through
data mining as critical predictors of frequency of visiting casinos.

39
Knowledge Discovery in Databases
- Steps in KDD process

40
Data Mining is One Step in the KDD
Process
Determine patterns from observed data to solve a business problem.
Step 1: Identify the Business Problem
- e.g., Who are “good” customers?
Which customers are likely to leave?
Step 2: Choose Model or Goal for Data Mining
- Some models are better for predictions while others are better for
describing behavior
Step 3: Choose Technology to Build Model
Step 4: Apply the Algorithm (Computation process) to Data. Review
the results and refine the Model
Step 5: Validate the Model on New Data (the “hold-out” dataset)
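
Steps 4 and 5 can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the customer rows are invented, the “model” is just a single usage threshold, and the helper names (`split_holdout`, `fit_threshold`, `accuracy`) are hypothetical. The point is only that the model is fitted on one part of the data and judged on a part it never saw.

```python
# Hypothetical rows: (monthly_usage, customer_left)
customers = [
    (5, True), (8, True), (30, False), (6, True),
    (40, False), (35, False), (7, True), (45, False),
]

def split_holdout(rows, holdout_every=4):
    """Set aside every n-th row as the hold-out dataset."""
    train = [r for i, r in enumerate(rows) if (i + 1) % holdout_every != 0]
    holdout = [r for i, r in enumerate(rows) if (i + 1) % holdout_every == 0]
    return train, holdout

def accuracy(rows, threshold):
    """Share of rows where 'usage below threshold' matches 'customer left'."""
    correct = sum(1 for usage, left in rows if (usage < threshold) == left)
    return correct / len(rows)

def fit_threshold(train):
    """Step 4: try each observed usage value and keep the best-scoring one."""
    candidates = sorted({usage for usage, _ in train})
    return max(candidates, key=lambda th: accuracy(train, th))

train, holdout = split_holdout(customers)
threshold = fit_threshold(train)

# Step 5: validate on data the model has not seen.
holdout_accuracy = accuracy(holdout, threshold)
print(threshold, holdout_accuracy)
```

A model that scores well on the training rows but poorly on the hold-out rows has memorized the sample rather than learned a pattern.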

41
Data Mining Models
1. Association
- If customer buys spaghetti, also buys red wine in 70% of cases
2. Sequential Patterns – time or event based
- A customer orders new sheets and pillow cases followed by drapes in
75% of the cases
3. Classification
- Opera ticket buyers are usually young urban professionals with high
income while country music concert ticket purchasers are typically blue
collar workers
4. Clustering
- Discovers different groups in the data whose members are very similar
5. Predictive Models
- Relate behavior of customers (“dependent” variable) to predictors
(“independent” variables felt to be “responsible” for the dependent one)

42
Association Models for Market Basket
Analysis
• Model finds items that occur together in a given
event or record
• Discovers rules of the form:
– If item A is part of an event, then X% of the time
(confidence factor), Item B is part of the event.
• Used to discover patterns of items bought
together from the “mountain” of scanner data
• Example:
– If a customer buys corn chips, then 65% of the time the customer
also buys cola, unless there is a promotion, in which
case the customer buys cola 85% of the time.
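
The confidence factor in a rule of this form is a simple ratio: of the events that contain item A, the share that also contain item B. A minimal sketch, using made-up baskets (the figures below are not the 65% or 85% from the slide) and a hypothetical helper named `confidence`:

```python
def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    with_a = [t for t in transactions if antecedent in t]
    if not with_a:
        return 0.0
    return sum(1 for t in with_a if consequent in t) / len(with_a)

# Invented scanner data: each set is one checkout basket.
baskets = [
    {"corn chips", "cola"},
    {"corn chips", "cola", "salsa"},
    {"corn chips", "salsa"},
    {"cola"},
    {"corn chips", "cola"},
]

conf = confidence(baskets, "corn chips", "cola")
print(f"If corn chips, then cola: {conf:.0%} of the time")
```

Here four baskets contain corn chips and three of those also contain cola, so the rule holds 75% of the time.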

43
Sequential Patterns
• Similar to Association Models, except that the
relationships among items are spread over time.
Sequences are associations in which events are linked
by time
• Require data on the identity of the transactors in
addition to details of each transaction.
• Example:
If surgical procedure X is performed, then 45% of the
time infection Y occurs within 5 days
But after 5 days, the likelihood of infection Y drops to
4%
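
A sequential rule adds two things to the association calculation: the identity of the transactor and a time window. The sketch below assumes invented patient events and a hypothetical helper `followed_within`; it counts how often a first event is followed by a second event for the same person within a given number of days.

```python
from datetime import date, timedelta

def followed_within(events, first, then, window_days):
    """Share of `first` events followed by `then` for the same person within the window."""
    window = timedelta(days=window_days)
    starts = [(who, when) for who, what, when in events if what == first]
    if not starts:
        return 0.0
    hits = 0
    for who, start in starts:
        if any(w == who and what == then and start < when <= start + window
               for w, what, when in events):
            hits += 1
    return hits / len(starts)

# Invented records: (patient id, event, date)
events = [
    ("p1", "procedure X", date(2024, 1, 1)),
    ("p1", "infection Y", date(2024, 1, 3)),
    ("p2", "procedure X", date(2024, 1, 2)),
    ("p2", "infection Y", date(2024, 1, 10)),
    ("p3", "procedure X", date(2024, 1, 5)),
    ("p4", "procedure X", date(2024, 1, 6)),
    ("p4", "infection Y", date(2024, 1, 8)),
]

rate = followed_within(events, "procedure X", "infection Y", window_days=5)
print(rate)
```

Without the patient id the two events could not be linked, which is why this model needs data on who made each transaction.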

44
Classification Models
Most Common Data Mining Model
• Describe the group that a member belongs to by
examining existing cases that already have been
classified, and inferring a set of rules
• These IF-THEN rules are often depicted in a tree-like
structure
• Examples:
– What are the characteristics of customers who are likely to
switch to a rival telecom service provider?
– Which kinds of promotions have been effective in keeping
which types of customers so that you can target the right
promotion to the right customer?

45
Clustering Models
• Segment a database into different groups whose members are very
similar
– Similar to Classification except that no groups have yet been defined
• The Clustering model discovers groupings within the data
– You do not know what the clusters will be when you start,
or on what attributes the data will be clustered.
– Hence, a user who is knowledgeable in the business needs to
interpret the clusters.
• Example:
– Xerox has developed predictive models using clusters for analyzing
usage profile history, maintenance data, and representations of
knowledge from field engineers to predict photocopy component
failure. An email is sent to the repair staff to schedule maintenance
PRIOR to the breakdown.
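
The slides do not name a clustering algorithm. As one common choice, the sketch below uses k-means on a single attribute; the monthly copy volumes, the starting centers, and the function `kmeans_1d` are all assumptions made for illustration. No group labels are supplied: the groups emerge from the data.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small k-means on one numeric attribute."""
    centers = list(centers)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Invented monthly copy volumes for six machines.
volumes = [100, 120, 110, 900, 950, 1000]
centers, groups = kmeans_1d(volumes, centers=[0, 500])
print(centers)
print(groups)
```

The algorithm returns two groups but says nothing about what they mean. Someone who knows the business still has to decide that one group is, say, light-use office machines and the other is heavy-use print-room machines.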

46
Predictive Models
• Combine predictors (or “independent” variables) in a model relating them
to the variable to be predicted (“dependent” or “predicted” variable)
using historical data on the predictors and the predicted variable –
the “training” data set
– Resulting model is used to predict the value for new data that does not
include the predicted variable.
• Example 1: Predefined Predictors
– If the customer is rural and her monthly usage is high, then the customer will
probably renew.
– If the customer is urban and new feature exploration is high, then the
customer will probably not renew.
• Example 2: Customer Profiling
– “We can tell the profile of someone who is about to have a baby by what
purchases they make…
We can then compare that profile with those of others “who are moving into
baby space” to predict needs. For instance, such a customer may be a good
target for a life insurance sales pitch.”

47
Data Mining Techniques
- Decision Trees
• Derives rules from patterns in data to create a
hierarchy of IF-THEN statements, called a
Decision Tree, to classify the data.
• Segments the original data set:
– Each segment is one of the leaves of the tree
– Records in each segment are similar with regard to
the variable of interest

48
• Example: Classification of Credit Risks

49
Pros & Cons of Decision Trees
1. How to handle continuous sets of data, like age or sales?
– Ranges have to be created such as 25-34 years, 35-44 years, etc.
– This grouping of ages could inadvertently hide patterns…
e.g., a significant break at 30 could be concealed
2. Crux of the “Tree- Growing” Process:
– What is the best possible question to ask at each branch point of the tree?
– e.g., The question “are you over 35?” may not distinguish between churners
and those who are not if the split of people over 35 is 40% for churners & 60%
for others. The goal is to get a 90%-10% (or 10%-90%) split in the segment of
people over 35 years.
3. The algorithms look at all possible distinguishing questions and the
sequence of asking them that could break up the “training data set” into
segments that are nearly homogeneous with respect to the variable to be
predicted. They stop growing the tree when the improvement is not
substantial enough to warrant asking the question.
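
One way to put a number on “nearly homogeneous” is Gini impurity, a measure used by many tree-growing algorithms. The slides do not say which measure is used, so treat this as an assumed example. For a segment with two classes, impurity is 0 when the segment is pure and 0.5 when it is split 50%-50%.

```python
def gini(churner_share):
    """Gini impurity of a two-class segment, given the share of churners in it."""
    p = churner_share
    return 1.0 - (p ** 2 + (1.0 - p) ** 2)

# The two splits discussed above for the "over 35" segment.
weak_split = gini(0.40)    # 40% churners, 60% others
strong_split = gini(0.90)  # 90% churners, 10% others

print(round(weak_split, 2), round(strong_split, 2))
```

The 40%-60% segment scores about 0.48, close to the worst case, while the 90%-10% segment scores about 0.18. A question that produces the second kind of segment is the better one to ask at that branch point.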

50
Decision Tree for Segmenting Customers
Who Responded to a Marketing Campaign

51
How to Evaluate a Data Mining
Product
1. What kind of business problem does it address?
2. What technique does it use to model the data?
3. How does it handle categorical data and continuous data?
4. How sensitive is it to “noise” data?
5. How does it avoid the problem of “overfitting” the model?
6. Does it have a built-in process for validating the model on
the “holdout” data?
7. Is the user interface easy to understand and use?
8. How long does it take to get useful answers from the data?
9. How clear are the results to interpret?
10. ABOVE ALL, TEST DRIVE THE PRODUCT ON YOUR DATA!

52
Text Mining: An Imperative Today
“We are drowning in information, but are
starving for knowledge”
• Unstructured data, most of it in the form
of text files, typically accounts for 85%
of an organization's knowledge stores, but
it’s not always easy to find, access,
analyze or use.

53
New Generation of Text Mining Tools
…to extract key elements from large unstructured data
sets, discover relationships and summarize the
information
• Categorization:
– Presents the search results in categories, rather than an
undifferentiated mass.
• Clustering:
– Grouping similar documents based on their content.
• Extraction:
– Extracting relevant information from a document
e.g., pulling out all the company names from a
data set.
54
New Generation of Text Mining Tools
• Keyword Search:
– Searching documents for the occurrence of a
particular word or set of words.
• Natural-Language processing:
– Determining the meaning of written words taking
into account their context, grammar, etc.
• Visualization:
– Graphically presenting the mined data so that
relationships are easier to spot and understand.
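
Keyword search is the simplest of these tools and can be sketched in a few lines. The documents, file names, and the function `keyword_search` below are invented for illustration. The sketch matches whole words regardless of case and, unlike natural-language processing, takes no account of context or grammar.

```python
import re

def keyword_search(documents, keywords):
    """Return the names of documents that contain every keyword (case-insensitive)."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for name, text in documents.items():
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        if wanted <= words:
            hits.append(name)
    return sorted(hits)

# Invented document store.
documents = {
    "memo.txt": "Retention policy for SOX audit documents.",
    "notes.txt": "Lunch menu and parking policy.",
    "plan.txt": "Audit plan: SOX controls and retention.",
}

matches = keyword_search(documents, ["SOX", "retention"])
print(matches)
```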

55
Case Example of Text Mining
Air Products & Chemicals’ Knowledge Management System
• Company has over 18,000 employees in 30 countries, and more
than 600 intranet and extranet sites.
• Its file servers contain 9TB of unstructured data, excluding email or
anything stored on local drives.
• Using Smart Discovery to generate a catalog and index of the data
repository so that it can be more easily accessed by MS SharePoint
Portal Document Management System.
• Also using the software for Sarbanes-Oxley compliance and e-
learning since by correctly categorizing the data, business rules can
be applied to a category of documents rather than to individual
documents:
– e.g., if a document relates to operations covered by SOX, then the
appropriate data-retention policies are applied to it.

56
Text Mining Tools
• Come either as stand-alone products or
embedded as part of a larger software system:
• Database vendors: Oracle, IBM,…
– Incorporating pattern-matching algorithms into their
database products
• Data Mining vendors: SAS, SPSS,…
– Added text mining to their portfolios.
• Enterprise Search Engine Vendors: Autonomy,
Verity,…
• Specialized Text Mining Firms: Inxight Software,
Stratify
57
Conclusion

58
