Unit2 - DWDM Notes
Prepared by:
Rachana Singh Sisodia
Assistant Professor, CSE
AKGEC
Warehousing Strategy
• A traditional Information Strategy Plan (ISP) addresses
operational computing needs thoroughly but does not give
sufficient attention to decisional information requirements.
• A warehouse strategy focuses on the decisional needs of the enterprise.
Determine Organizational Context
➢ Which IS & IT groups in the organization are involved in
the data warehousing effort?
➢ What are the roles & responsibilities of the individuals who have
been involved in this effort?
Conduct Preliminary Survey of Requirements
🞂 Obtain an inventory of the users' requirements.
🞂 The requirements inventory describes the information that the
warehouse is expected to eventually provide.
🞂 The objective is to understand user needs well enough to prioritize
the requirements.
🞂 The inventory is a critical input for identifying the scope of each data warehouse
rollout.
Interview Categories & Sample Questions
Questions in the following categories are asked:
I. Functions
• What is the mission of your group or unit?
• How do you go about fulfilling this mission?
• How do you know if you've been successful with your mission?
• What are the key performance indicators and critical success factors?
II. Customers
• How do you group or classify your customers?
• Does your grouping affect how you treat your customers?
• What kind of information do you track for each type of client?
III. Profit
• At what level do you measure profitability in your group? Per
agent? Per customer?
IV. Systems
V. Time
VI. Queries and reports
• What reports do you use now?
• What information do you actually use in each of the reports you now receive?
• Can we obtain samples of these reports?
• How often are these reports produced?
• What reports do you produce for other people?
VII. Product
• What products do you sell, and how do you classify them?
• Do you have a product hierarchy?
• Do you analyze data for all products at the same time, or do you analyze one
product type at a time?
• How do you handle changes in product hierarchy and product description?
VIII. Geography
• Does your company operate in more than one location?
• Do you divide your market into geographical areas?
• Do you track sales per geographic region?
Conduct Preliminary Source System Audit
Advantages:
Increased speed and optimized resource utilization
Disadvantages:
Complex programming models that make development difficult
• A computer cluster is a group of linked computers working
together.
• The components of a cluster are connected through fast local
area networks.
• Clusters are deployed to improve performance and availability.
• In such environments, each processing unit (PU) executes a copy of a
standard operating system, and inter-PU communications are
performed over an open-systems-based
interconnect (e.g., Ethernet or TCP/IP).
A cluster consists of:
➢ Nodes (master + computing)
➢ Network
➢ OS
➢ Cluster middleware: middleware such as MPI,
which permits compute clustering programs to be
portable to a wide variety of clusters
…
Cluster
Some hardware examples are:
• Digital: 64-bit AlphaServers running Digital UNIX or OpenVMS, in both SMP and MPP configurations.
• HP: the HP 9000 Enterprise Parallel Server.
• IBM: the RS/6000 and the AIX operating system have been positioned for data warehousing.
• IBM: the AS/400 is used for data mart implementations.
• Microsoft: the Windows NT operating system has been successful for data mart
deployments.
• Sequent: Sequent NUMA-Q and the DYNIX operating system.
Parallel processing software provides the following options:
• The parallel server option allows each node to have its own separate database
instance, and enables all database instances to access a common set of
underlying database files.
• The parallel query option allows key operations such as query processing, data
loading, and index creation to be parallelized.
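The idea behind parallelizing a query can be sketched in a few lines of Python. This is only an illustration, not how any particular database product implements it: the SALES rows, the function names, and the round-robin split are all assumptions made for the example, and threads stand in for the separate processors or nodes a real system would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fact rows: (region, sales_amount).
SALES = [
    ("north", 120), ("south", 80), ("north", 40),
    ("east", 200), ("south", 60), ("east", 10),
]

def partition(rows, n_parts):
    """Split rows into n_parts round-robin slices."""
    return [rows[i::n_parts] for i in range(n_parts)]

def partial_totals(rows):
    """Work done by one worker: aggregate only its own slice."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals

def parallel_totals(rows, n_workers=3):
    """Run the partial aggregations concurrently, then merge them."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_totals, partition(rows, n_workers)))
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds the counts together
    return dict(merged)

print(parallel_totals(SALES))
```

Each worker scans only its own slice of the data, and a final merge step combines the partial results. The same split, work, merge pattern applies to data loading and index creation.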
Advantages of Using Parallel Processing in a Data Warehouse
• Performance improvement for query processing, data loading, and
index creation
• Scalability, allowing addition of CPUs and memory modules without
any changes to the existing application
• Fault tolerance, so that the database remains available even when some
of the parallel processors fail
• Single logical view of the database even though the data may reside
on the disks of multiple nodes
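The last point, a single logical view over data spread across nodes, can be illustrated with a minimal Python sketch. The PartitionedTable class and its methods are hypothetical names invented for this example, and hash partitioning is an assumption: the notes do not say how rows are distributed across nodes.

```python
class PartitionedTable:
    """One logical table whose rows are spread across several nodes."""

    def __init__(self, n_nodes):
        # Each list stands in for the disk of one node (an assumption).
        self.nodes = [[] for _ in range(n_nodes)]

    def insert(self, key, row):
        # Hash partitioning: the key decides which node stores the row.
        self.nodes[hash(key) % len(self.nodes)].append((key, row))

    def select(self, predicate=lambda key, row: True):
        # Callers get one result set; the fan-out over nodes is hidden.
        return [(k, r) for node in self.nodes for (k, r) in node if predicate(k, r)]


table = PartitionedTable(n_nodes=3)
for customer_id in range(10):
    table.insert(customer_id, {"sales": customer_id * 100})

# The query is written against the logical table, not against a node.
big_customers = table.select(lambda key, row: row["sales"] >= 700)
print(sorted(key for key, _ in big_customers))
```

The caller never names a node: inserts and queries go through one interface, which is what lets nodes be added without changing the application.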