Data Warehousing & Mining Prof. J. N. Rajurkar MIET Bhandara
In general, a data-driven DSS emphasizes access to and manipulation of a time series of internal company data, and sometimes external and real-time data.
Simple file systems accessed by query and retrieval tools provide the most elementary level of functionality.
A data-driven DSS with online analytical processing provides the highest level of functionality and decision support, linked to the analysis of large collections of historical data. Executive information systems are examples of data-driven DSS.
One of the first data-driven DSS was built using an APL-based software package called AAIMS, an acronym for An Analytical Information Management System.
Business Intelligence (BI) is sometimes used interchangeably with briefing books, report and query tools, and executive information systems. In general, business intelligence systems are data-driven DSS.
Communication-driven DSS use network and communications technologies to facilitate decision-relevant collaboration and communication.
In these systems, communication technologies are the dominant architectural component. Tools such as groupware, video conferencing, and computer-based bulletin boards are the primary technologies.
In the past few years, voice and video delivered using the Internet Protocol have greatly expanded the possibilities for synchronous communication-driven DSS.
A document-driven DSS uses computer storage and processing technologies to provide document retrieval and analysis.
Large document databases may include scanned documents, hypertext documents, images, sounds, and videos.
Web (WWW) technologies have significantly increased the availability of documents and facilitated the development of document-driven DSS.
Knowledge-based DSS can suggest or recommend actions to managers. These DSS are person-computer systems with specialized problem-solving expertise.
Such systems have been called suggestion DSS.
Artificial intelligence systems have been developed to detect fraud and expedite financial transactions, and many medical diagnostic systems have been based on AI.
The MYCIN project for blood disease diagnosis is an example of a knowledge-based DSS.
Beginning in approximately 1995, the World Wide Web (WWW) on the global Internet provided a technology platform for further extending the capabilities and development of computerized decision support.
The release of HTML with the form tag and tables was a turning point in the development of Web-based DSS.
A Web-based decision support system delivers decision support information to managers using Web browsers such as Netscape Navigator or Internet Explorer.
Or A data warehouse refers to a data repository that is maintained separately from an organization's operational databases. DW systems allow for the integration of a variety of application systems.
Or According to William H. Inmon, “A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.”
Or A data warehouse is a single, complete, and consistent store of data obtained from a variety of different sources, made available to end users in a form they can understand and use in a business context.
Or A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of a huge amount of data which helps in the management decision-making process.
Subject Oriented: A data warehouse is organized around major subjects such as customer, supplier, product, and sales. Rather than concentrating on the day-to-day operations and transaction processing of an organization, a data warehouse focuses on the modelling and analysis of data for decision makers. Hence data warehouses typically provide a simple and concise view of a particular subject by excluding data that are not useful for decision support.
Integrated: A data warehouse is usually constructed by integrating multiple heterogeneous sources, such as relational databases, flat files, and online transaction records. Data cleaning and integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, and so on.
Time Variant: Data are stored to provide information from a historic perspective (e.g., the past 5-10 years). Every key structure in the data warehouse contains an element of time, either implicitly or explicitly. The time-variant nature of the data in a data warehouse allows for analysis of the past, relates information to the present, and enables forecasts for the future.
Non-Volatile: Non-volatile means that once data have entered the warehouse, they should not change. A data warehouse is always a physically separate store of data transformed from the application data found in the operational environment. Due to this separation, a data warehouse does not require transaction processing, recovery, or concurrency control mechanisms.
Data Granularity: When users query the data warehouse for analysis, they usually start by looking at summary data. Therefore it is efficient to keep the data summarized at different levels. Depending on the query, we can then go to the particular level of detail that satisfies it. Data granularity refers to the level of detail: the lower the level of detail, the finer the data granularity.
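The idea of keeping data at several granularity levels can be sketched in plain Python; the records, column names, and roll-up keys below are purely hypothetical:

```python
from collections import defaultdict

# Hypothetical sales records at the finest granularity: one per transaction.
sales = [
    ("2023-01-05", "pen", 10.0),
    ("2023-01-07", "pen", 15.0),
    ("2023-02-10", "book", 40.0),
    ("2023-02-12", "pen", 5.0),
]

def summarize(records, key):
    """Roll transactions up to the coarser granularity given by `key`."""
    totals = defaultdict(float)
    for date, product, amount in records:
        totals[key(date, product)] += amount
    return dict(totals)

# Medium granularity: totals per (month, product).
by_month_product = summarize(sales, lambda d, p: (d[:7], p))

# Coarsest granularity: totals per month only.
by_month = summarize(sales, lambda d, p: d[:7])

print(by_month)  # {'2023-01': 25.0, '2023-02': 45.0}
```

A query about monthly revenue is answered from the coarse summary, while a query about an individual transaction drills down to the detailed records.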
Q.) What do you mean by Strategic Information? Describe its characteristic features. (W-2015)
Or
Explain the Compelling need for data warehousing.
Ans: Strategic information (SI) is information that helps companies change or otherwise alter their business strategy and/or structure. It is typically used to streamline and quicken the reaction time to environmental changes and to aid in achieving a competitive advantage.
The executives and managers who are responsible for keeping the enterprise competitive need
information to make proper decisions. They need information to formulate the business strategies, establish
goals and monitor results.
The type of information needed to make decisions in the formulation and execution of business strategies and objectives is broad-based and encompasses the entire organization. We may combine all these types of essential information into one group and call it strategic information.
Processing large volumes of data and providing interactive analysis require extra computing power. The explosive increase in computing power and its lower cost make the provision of strategic information feasible.
DW planning: This phase is aimed at determining the scope and goals of the DW, and determines the number of data marts and the order in which they are to be implemented according to business priorities and technical constraints. At this stage the physical architecture of the system must be defined.
Data mart design and implementation: This macro-phase will be repeated for each data mart to be
implemented and will be discussed in more detail in the following. At each iteration a new data mart is
designed and deployed. Multidimensional modelling of each data mart must be carried out considering the
available conformed dimensions and the constraints deriving from previous implementations.
DW maintenance and evolution: DW maintenance mainly concerns performance optimization that must be
periodically carried out due to user requirements that change according to the problems and the opportunities
the managers run into. On the other hand, DW evolution concerns keeping the DW schema up-to-date with
respect to the business domain and the business requirement changes.
Knowledge Discovery from Data (KDD) is the process of discovering useful knowledge from a collection of data.
Major KDD application areas include marketing, fraud detection and telecommunications.
The KDD process includes the following iterative steps:
Data Cleaning: This step is used to remove noise and inconsistent data.
Data Integration: In this step, multiple data sources may be combined.
Data Selection: In this step, the data relevant to the analysis task are retrieved from the database; data not relevant to the analysis task are omitted.
Data Transformation: In this step, data are transformed and consolidated into forms appropriate for mining, for example by performing summary or aggregation operations.
Data Mining: This is the essential process in which intelligent methods are applied to extract data patterns.
Pattern Evaluation: This step is used to identify the truly interesting patterns representing
knowledge based on interestingness measures.
Knowledge presentation: In this step visualization and knowledge representation techniques are
used to present mined knowledge to users.
Steps including data cleaning, data integration, data selection, and data transformation are the data pre-processing steps.
Data mining step may interact with the user or knowledge base. The interesting patterns are presented to the
user and may be stored as new knowledge in the knowledge base.
Data mining is the process of discovering interesting patterns and knowledge from large amounts of data. Data sources may include databases, data warehouses, the Web, and other information repositories.
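As a purely illustrative sketch, the KDD steps listed above can be walked through on toy data; all records, attributes, and the spending threshold here are hypothetical:

```python
from collections import defaultdict

raw_source_a = [{"id": 1, "age": 25, "spend": 200.0},
                {"id": 2, "age": None, "spend": 150.0}]   # noisy record
raw_source_b = [{"id": 3, "age": 40, "spend": 900.0}]

# 1. Data cleaning: drop records with missing values.
def clean(records):
    return [r for r in records if all(v is not None for v in r.values())]

# 2. Data integration: combine multiple sources.
integrated = clean(raw_source_a) + clean(raw_source_b)

# 3. Data selection: keep only attributes relevant to the task.
selected = [{"age": r["age"], "spend": r["spend"]} for r in integrated]

# 4. Data transformation: e.g. discretize spend into bands.
def transform(r):
    r = dict(r)
    r["band"] = "high" if r["spend"] >= 500 else "low"
    return r

transformed = [transform(r) for r in selected]

# 5. Data mining: an intentionally trivial pattern extractor,
#    computing the average age per spending band.
sums, counts = defaultdict(float), defaultdict(int)
for r in transformed:
    sums[r["band"]] += r["age"]
    counts[r["band"]] += 1
patterns = {band: sums[band] / counts[band] for band in sums}

# 6./7. Pattern evaluation and knowledge presentation would filter and
#       display the patterns; here we simply print them.
print(patterns)  # {'low': 25.0, 'high': 40.0}
```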
Q.) Why do you need a separate data staging area in a DWH? Explain its functions. (W-15)
Ans: A staging area is an intermediate storage area used for data processing during the extract, transform and load (ETL) process. The data staging area sits between the data sources and the data targets, which are often data warehouses, data marts, or other data repositories. It is also called a landing zone.
The primary motivations for its use are to increase the efficiency of ETL processes, ensure data integrity, and support data quality operations. The functions of the staging area include the following:
Consolidation: One of the primary functions performed by a staging area is consolidation of data
from multiple source systems. In performing this function the staging area acts as a large "bucket" in
which data from multiple source systems can be temporarily placed for further processing.
Alignment: Aligning data includes standardization of reference data across multiple source systems
and validation of relationships between records and data elements from different sources.
Minimizing contention: The staging area and ETL processes it supports are often designed with a
goal of minimizing contention within source systems.
Independent scheduling/multiple targets: The staging area can support hosting of data to be
processed on independent schedules, and data that is meant to be directed to multiple targets.
Change detection: This functionality is particularly useful when the source systems do not support
reliable forms of change detection, such as system-enforced time stamping.
Cleansing data: Data cleansing includes identification and removal (or update) of invalid data from
the source systems.
Data archiving and troubleshooting: The staging area can be used to maintain historical records during the load process, or to push data into a target archive structure.
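A minimal sketch of the consolidation and alignment functions, assuming two hypothetical source systems that use different codes for the same country:

```python
# Hypothetical source records from two systems feeding the staging area.
source_crm = [{"cust": "C1", "country": "UK"}]
source_erp = [{"cust": "C2", "country": "GBR"}]

# Alignment: standardize reference data across the source systems
# (the code mapping here is an illustrative assumption).
COUNTRY_MAP = {"UK": "GB", "GBR": "GB"}

def stage(record, system):
    staged = dict(record)
    staged["country"] = COUNTRY_MAP.get(staged["country"], staged["country"])
    staged["source_system"] = system   # keep lineage for troubleshooting
    return staged

# Consolidation: one "bucket" temporarily holding data from all sources.
staging_area = [stage(r, "crm") for r in source_crm] + \
               [stage(r, "erp") for r in source_erp]

print([r["country"] for r in staging_area])  # ['GB', 'GB']
```

After this staging step, downstream loading into the warehouse sees a single, consistently coded set of records.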
By a multidimensional OLAP (MOLAP) model, which directly implements multidimensional data and operations.
3. Top Tier: This tier is the front-end client layer. It holds the query tools, reporting tools, analysis tools, and data mining tools (e.g., trend analysis, prediction, and so on).
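As an illustration of the MOLAP idea of directly storing multidimensional data, the sketch below holds a tiny sales cube as a dense nested array and rolls it up along one dimension; all dimensions and cell values are hypothetical:

```python
# cube[product][month][region] = sales amount (hypothetical values).
products = ["pen", "book"]
months = ["Jan", "Feb"]
regions = ["N", "S"]

cube = [
    [[10, 5], [7, 3]],    # pen:  Jan [N, S], Feb [N, S]
    [[20, 8], [6, 4]],    # book: Jan [N, S], Feb [N, S]
]

# Roll-up over the region dimension: sales per product per month.
by_product_month = [[sum(cell) for cell in rows] for rows in cube]

# Roll-up over everything: the grand total.
total = sum(sum(sum(cell) for cell in rows) for rows in cube)

print(by_product_month)  # [[15, 10], [28, 10]]
print(total)             # 63
```

Because the data are stored positionally, aggregations are simple array traversals rather than relational joins, which is the performance appeal of MOLAP storage.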
Virtual Warehouse
The view over an operational data warehouse is known as a virtual warehouse. A virtual warehouse is easy to build, but building one requires excess capacity on the operational database servers.
Data Mart
A data mart contains a subset of organization-wide data. This subset of data is valuable to specific groups within an organization.
In other words, we can say that data marts contain data specific to a particular group. For example, the marketing data mart may contain data related to items, customers, and sales. Data marts are confined to specific subjects.
Points to remember about data marts:
Windows-based or Unix/Linux-based servers are used to implement data marts. They are implemented on low-cost servers.
The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks rather than months or years.
The life cycle of a data mart may be complex in the long run if its planning and design are not organization-wide.
Data marts are small in size.
Data marts are customized by department.
The source of a data mart is a departmentally structured data warehouse.
Data marts are flexible.
Enterprise Warehouse
An enterprise warehouse collects all of the information and subjects spanning an entire organization.
It provides enterprise-wide data integration.
The data is integrated from operational systems and external information providers.
This information can vary from a few gigabytes to hundreds of gigabytes, terabytes or beyond.
This type of warehouse can be implemented on traditional mainframes, supercomputer servers, or parallel architecture platforms.
Ans:- Metadata is simply defined as data about data; data used to describe other data is metadata. Metadata in a data warehouse defines the warehouse objects and acts as a directory that helps the decision support system locate the contents of the data warehouse. Metadata is a road map to the data warehouse, created for the data names and definitions of a given data warehouse. For example, the index of a book serves as metadata for the contents of the book.
Metadata Repository
Metadata repository is an integral part of a data warehouse system. It contains the following metadata:
Business metadata - It contains the data ownership information, business definition, and changing
policies.
Operational metadata - It includes currency of data and data lineage. Currency of data refers to the
data being active, archived, or purged. Lineage of data means history of data migrated and
transformation applied on it.
Algorithms used for summarization - These include measure and dimension definition algorithms, partitions, subject areas, aggregation and summarization, and predefined queries and reports.
Data for mapping from the operational environment to the data warehouse - This metadata includes source databases and their contents, data extraction, data partitioning, cleaning and transformation rules, and data refresh and purging rules.
Data related to system performance - This includes indices and profiles that improve data access and retrieval performance, replication cycles, and so on.
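One way to picture such a repository is as a nested directory structure. The sketch below is purely illustrative; every element name, owner, and lineage value in it is a hypothetical example, not part of any real warehouse:

```python
# A toy metadata repository holding the categories listed above.
metadata_repository = {
    "business": {
        "sales_amount": {"owner": "finance dept",
                         "definition": "net sales in USD"},
    },
    "operational": {
        "sales_amount": {"currency": "active",   # active/archived/purged
                         "lineage": ["erp.orders", "etl.currency_convert"]},
    },
    "mapping": {
        "sales_amount": {"source": "erp.orders.net_total",
                         "transformation": "convert to USD",
                         "refresh": "nightly"},
    },
}

def lineage(element):
    """Use the repository as a directory to locate an element's history."""
    return metadata_repository["operational"][element]["lineage"]

print(lineage("sales_amount"))  # ['erp.orders', 'etl.currency_convert']
```

A decision support tool would consult such a structure to discover where a warehouse element came from and how it was transformed.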
Types of Metadata
Operational Metadata
Extraction and Transformation Metadata
End-User Metadata
Operational Metadata: Data for the data warehouse come from several operational systems of the enterprise. These source systems contain different data structures, and the data elements selected for the data warehouse have various field lengths and data types. In selecting data from the source systems for the data warehouse, we split records, combine parts of records from different source files, and deal with multiple coding schemes and field lengths. When we deliver information to the end-users, we must be able to tie it back to the original source data sets. Operational metadata contain all of this information about the operational data sources.
Extraction and Transformation Metadata: Extraction and transformation metadata contain data about the extraction of data from the source systems, namely the extraction frequencies, extraction methods, and business rules for the data extraction. This category of metadata also contains information about all the data transformations that take place in the data staging area.
End-User Metadata. The end-user metadata is the navigational map of the data warehouse. It enables the
end-users to find information from the data warehouse. The end-user metadata allows the end-users to use
their own business terminology and look for information in those ways in which they normally think of the
business.
Ans:
The operational systems such as order processing, inventory control, claims processing, outpatient billing, and so on are not designed or intended to provide strategic information. If we need the ability to provide strategic information, we must get it from altogether different types of systems. Only specially designed decision support systems, or informational systems, can provide strategic information.
Operational systems are online transaction processing (OLTP) systems. These are the systems that
are used to run the day-to-day core business of the company. They support the basic business
processes of the company. These systems typically get the data into the database.
On the other hand, specially designed and built decision-support systems are not meant to run the
core business processes. They are used to watch how the business runs, and then make strategic
decisions to improve the business.
From the data analyst’s point of view, decision support data differ from operational data in three
main areas: time span, granularity, and dimensionality.
Time span: Operational data cover a short time frame. In contrast, decision support data tend to
cover a longer time frame.
Granularity (level of aggregation): Decision support data must be presented at different levels of
aggregation, from highly summarized to near-atomic.
Dimensionality: Operational data focus on representing individual transactions rather than on the
effects of the transactions over time. In contrast, data analysts tend to include many data dimensions
and are interested in how the data relate over those dimensions.
Benefits of DSS
Quick retrieval
The ability to share information across the company
Support for simultaneous read requests through pre-defined queries
The large amount of business-related data that can be stored
Ans:
Database systems are one of the key enabling forces behind business transformation.
Database system technology also needs to be efficient in terms of storage and speed.
Modern database systems thus need to build high-reliability mechanisms into their designs.
Performance evaluation of database system technology is thus an important concern. Performance evaluation of a database is a non-trivial activity, made more complicated by the existence of different flavors of database systems tuned for specific requirements.
The database is the shared resource at the centre of such a system. The database's function is the optimal storage of data while maintaining the correctness of the data and the consistency of the system at all times.
A database management system is a complex set of software programs that controls the organization, storage, management, and retrieval of data in a database.
Equivalently, a database management system is a complex set of software programs that allows multiple users to access, create, update, and retrieve data to and from the database.
The storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is responsible for tasks such as interaction with the file manager and the efficient storing, retrieving, and updating of data.
Users are differentiated by the way they want to interact with the system. Specialized users write specialized database applications that do not fit into the traditional data-processing framework.
Sophisticated users form requests in a database query language.
Naïve users invoke one of the permanent application programs that have been written previously.
A data model is simply a way of structuring data; it also defines the set of operations that can be performed on the data. The flat model consists of a single, two-dimensional array of data elements.
The network model organizes data using two fundamental structures called records and sets. A relational database contains multiple tables, each similar to a single flat-model table.
The dimensional model is often implemented on top of the relational model using a star schema, consisting of one table containing the facts and surrounding tables containing the dimensions.
Object database models aim to avoid the overhead (referred to as the impedance mismatch) of converting information between the relational and object representations.
After we have extracted data from various operational systems and external sources, we have to prepare the data for storing in the data warehouse.
The extracted data are available in different formats; hence different functions, such as transformation, are applied in the staging area to prepare them for loading.
Data staging provides a place and an area with a set of functions to clean, change, combine, and convert the data for storage and use in the data warehouse.
E) Metadata Component
Metadata in data warehouse is similar to the data dictionary or data catalog in the database
management system.
The data dictionary contains data about the data in the database. Similarly metadata component is the
data about the data in data warehouse.
F) Management and Control Component
This component of the data warehouse architecture sits on top of all the other components.
The management and control component coordinates the services and activities within the data warehouse.
This component controls the data transformation and data transfer into data warehouse storage.
It monitors the movement of data into staging area and from there into data warehouse storage itself.
The management and control component interacts with the metadata component to perform the management and control functions.
Ans: A data warehouse is never static; it evolves as the business expands. As the business evolves, its
requirements keep changing and therefore a data warehouse must be designed to ride with these changes.
Hence a data warehouse system needs to be flexible. The delivery method is a variant of the joint application
development approach adopted for the delivery of a data warehouse.
1. Standard Reports:
Usage: Reports that require infrequent structural changes, and can be easily accessed electronically.
2. Queries
3. Analytical Applications
4. OLAP Analysis
Purpose: Provides ability to perform summary, detailed or trend analysis on requested data.
6. Data Mining
Ans: To design an effective data warehouse we need to understand and analyze business needs and construct
a business analysis framework. A data warehouse can be built using a top-down approach, a bottom-up
approach or a combination of both.
The top-down approach starts with overall design and planning. It is useful in cases where the
technology is mature and well known, and where the business problems that must be solved are clear and
well understood.
The bottom up approach starts with experiments and prototypes. This is useful in the early stage of
business modeling and technology development. It allows an organization to move forward at considerably
less expense and to evaluate the technological benefits before making significant commitments.
In the combined approach, an organization can exploit the planned and strategic nature of the top-down approach while retaining the rapid implementation and opportunistic application of the bottom-up approach.
From the software engineering point of view, the design and construction of a data warehouse may consist of
the following steps: planning, requirements study, problem analysis, warehouse design, data integration and
testing, and finally deployment of the data warehouse. Large software systems can be developed using one
of two methodologies: the waterfall method or the spiral method. The waterfall method performs a
structured and systematic analysis at each step before proceeding to the next, which is like a waterfall,
falling from one step to the next. The spiral method involves the rapid generation of increasingly functional
systems, with short intervals between successive releases. This is considered a good choice for data
warehouse development, especially for data marts, because the turnaround time is short, modifications can
be done quickly, and new designs and technologies can be adapted in a timely manner.
1. Choose a business process to model (e.g., orders, invoices, shipments, inventory, account administration,
sales, or the general ledger). If the business process is organizational and involves multiple complex object
collections, a data warehouse model should be followed. However, if the process is departmental and
focuses on the analysis of one kind of business process, a data mart model should be chosen.
2. Choose the business process grain, which is the fundamental, atomic level of data to be represented in the
fact table for this process (e.g., individual transactions, individual daily snapshots, and so on).
3. Choose the dimensions that will apply to each fact table record. Typical dimensions are time, item,
customer, supplier, warehouse, transaction type, and status.
4. Choose the measures that will populate each fact table record. Typical measures are numeric additive
quantities like dollars_sold and units_sold.
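The four design choices above can be sketched as a tiny star schema in plain Python; the business process (sales), the transaction grain, and all keys and values below are hypothetical illustrations:

```python
# Step 3: dimension tables (time and item), keyed by surrogate keys.
dim_time = {1: {"day": "2023-01-05"}, 2: {"day": "2023-01-06"}}
dim_item = {10: {"name": "pen"}, 11: {"name": "book"}}

# Steps 1, 2, and 4: a fact table for the sales process at transaction
# grain, carrying the additive measures dollars_sold and units_sold.
fact_sales = [
    {"time_key": 1, "item_key": 10, "dollars_sold": 20.0, "units_sold": 4},
    {"time_key": 1, "item_key": 11, "dollars_sold": 35.0, "units_sold": 1},
    {"time_key": 2, "item_key": 10, "dollars_sold": 10.0, "units_sold": 2},
]

# A typical query: total dollars_sold per item name. The fact rows are
# joined to the item dimension, then the additive measure is aggregated.
totals = {}
for row in fact_sales:
    name = dim_item[row["item_key"]]["name"]
    totals[name] = totals.get(name, 0.0) + row["dollars_sold"]

print(totals)  # {'pen': 30.0, 'book': 35.0}
```

Because the measures are additive, any roll-up (by day, by item, or overall) is a simple sum over the fact rows selected through the dimension tables.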
Because data warehouse construction is a difficult and long-term task, its implementation scope
should be clearly defined. The goals of an initial data warehouse implementation should be specific,
achievable, and measurable. This involves determining the time and budget allocations, the subset of the
organization that is to be modeled, the number of data sources selected, and the number and types of
departments to be served.
Once a data warehouse is designed and constructed, the initial deployment of the warehouse includes
initial installation, roll-out planning, training, and orientation. Platform upgrades and maintenance must also
be considered.
Various kinds of data warehouse design tools are available. Data warehouse development tools
provide functions to define and edit metadata repository contents (e.g., schemas, scripts, or rules), answer
queries, output reports, and ship metadata to and from relational database system catalogs. Planning and
analysis tools study the impact of schema changes and of refresh performance when changing refresh rates
or time windows.