IJESRT
INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY
A Review on Data Warehouse Management
Umair Rasheed*1, M. Umer Sarwar2, Ramzan Talib3
*1,2,3 College of Computer Science & Information Studies, Government College University, Faisalabad, Pakistan
[email protected]
Abstract
Data warehouse management is a crucial part of industry and business that has been adopted and put into
practice as the complexity of managing pools of data has increased. The architectural layout of data warehousing is
composed of a centralized warehouse from which views are generated for the users who need them. By convention,
the warehouse receives information from individual distributed databases, which may be unrelated to the
warehouse but hold the raw data that justifies its existence. However, data warehouse management is a relatively
complex procedure because it depends on data updates from these distributed databases. The complexity arises
from the anomaly-prone informational transactions between the data source end and the materialized view end.
This description supports the critical requirements of data warehouses, which are accuracy
and timeliness. This review assesses the data warehouse literature to identify the dominant strategy
employed in data warehouse management and the trade-offs involved. For this purpose, various articles related to
data warehouse management were selected and carefully reviewed. The findings revealed a preference for Immediate
Incremental Management (IIM) of the warehouse over Deferred Incremental Management (DIM). System availability,
which comprises accurate views and real-time updating, was observed to be the determining factor.
Deferred Incremental Management has been the conventional approach, but with the increase in data volume and the
gigabyte transaction rates that define contemporary business and industry, this updating mechanism is
unsuitable and strips data warehouse management of its contextual meaning.
Keywords: Immediate Incremental Management (IIM), Deferred Incremental Management (DIM), Algorithm,
Informational transaction.
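The contrast drawn in the abstract between immediate and deferred incremental management can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than an algorithm taken from the reviewed literature: the class name MaterializedView, the per-region sales total used as the view, and the methods on_source_update and refresh are all hypothetical. Under those assumptions, the sketch shows that an immediately maintained view is accurate as soon as a source reports a change, whereas a deferred view stays stale until its logged changes are applied.

```python
from collections import defaultdict


class MaterializedView:
    """Hypothetical warehouse view: total sales amount per region."""

    def __init__(self, immediate):
        self.immediate = immediate      # True = immediate (IIM), False = deferred (DIM)
        self.totals = defaultdict(int)  # the materialized summary
        self.pending = []               # deltas logged but not yet applied (DIM only)

    def on_source_update(self, region, amount_delta):
        """Called when a distributed source reports a change (assumed interface)."""
        if self.immediate:
            self._apply(region, amount_delta)
        else:
            self.pending.append((region, amount_delta))

    def refresh(self):
        """Apply all logged deltas to the view and return how many were applied."""
        applied = len(self.pending)
        for region, amount_delta in self.pending:
            self._apply(region, amount_delta)
        self.pending.clear()
        return applied

    def _apply(self, region, amount_delta):
        # Incremental maintenance: adjust the stored total, no full recomputation.
        self.totals[region] += amount_delta

    def query(self, region):
        return self.totals.get(region, 0)


def demo():
    iim = MaterializedView(immediate=True)
    dim = MaterializedView(immediate=False)
    for view in (iim, dim):
        view.on_source_update("north", 100)
        view.on_source_update("north", -30)
    before = (iim.query("north"), dim.query("north"))  # deferred view not yet refreshed
    dim.refresh()
    after = (iim.query("north"), dim.query("north"))   # both views now agree
    return before, after
```

Calling demo() returns the view contents before and after the deferred refresh. The sketch deliberately ignores concurrency between sources and the warehouse, which is where the update anomalies mentioned in the abstract would arise.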
Introduction
Data warehousing has been prevalent in contemporary business and industry and can be described as the de facto
informatics of the time. It has been a prerequisite for conducting information-oriented business and various
industrial functions. Intelligence informatics has been applied in these two environments through data warehousing
to create a discernible culmination of events and to generate logically understandable summary information
critical for decision-making [1]. The term “data warehouse” traces its roots back to the 1980s and has seen
adaptation and structural evolution since then [2]. The term originally carried a general meaning covering the
nature of the basic system behind the process and the purpose of the decision making it supported. Recent
advancement, however, has given data warehousing a comprehensive meaning related solely to time and to the
revolutionary changes involving information-flow protocols and transaction guidelines.
A data warehouse is, in particular, a repository of integrated information that has been gathered from distributed
sources, often databases [2, 3, 4, 5]. The information is bundled up by dedicated hardware and software systems
that subject the information (data) to an Extract-Transform-Load (ETL) scheme by implementing algorithms and
definite schemas optimized for the process [6]. This enables a logical representation of juggled machine-oriented
data within the warehouse system to relevant viewers. Since it is vital for data warehouse management to consider
the timing and accuracy of materialized views, it is imperative to assess the optimum mechanism for upholding the
user-expected time and accuracy when accessing warehoused information. This review assumes Immediate
Incremental Management as the optimal strategy in achieving accuracy and timing in fast and reliable data
warehouse management. It also assumes concurrent exchanges between the source and the