Diagrammatically represent and discuss data warehousing architecture

A Data Warehouse (DW) is a centralized repository that stores integrated, historical data from multiple sources, primarily for querying and analysis. It enables organizations to consolidate large amounts of data and perform efficient reporting and data mining activities to support business decision-making.

Key Features of a Data Warehouse:
- Centralized: It integrates data from multiple sources.
- Subject-Oriented: It is organized around key business subjects (e.g., sales, finance).
- Non-Volatile: Once data is entered into the warehouse, it is not changed, allowing for consistent reporting.

Data Warehousing Architecture
1. *Data Source Layer (Source Systems)*:
- Includes external systems and operational databases where raw data originates (e.g., CRM systems, transactional databases).
- Data from these sources is extracted using ETL (Extract, Transform, Load) processes.
2. *Data Staging Layer (ETL Process)*:
- This is where data is collected, cleaned, and transformed. ETL tools extract data from the sources, transform it into the desired format, and load it into the data warehouse.
3. *Data Warehouse Layer (Data Warehouse Database)*:
- The main data repository where processed data is stored.
- This layer typically uses optimized databases such as relational databases (e.g., SQL-based) or columnar databases for efficient querying.
4. *Data Presentation Layer (BI Tools and Analytics)*:
- This layer is where end-users access the data using business intelligence (BI) tools, dashboards, reports, and analytics applications.
- Users perform data analysis, query reports, and visualize data trends.

Q- Multidimensional data model and database

1. Multidimensional Data Model:
A multidimensional data model organizes data into dimensions and facts. The model represents data as a data cube, where each axis (or dimension) corresponds to a specific attribute.
Dimensions: These are descriptive attributes or categories through which data is analyzed. For example, in a sales dataset, dimensions might include Time, Location, and Product.
Facts: These are the measurable or numeric values that are subject to analysis. In the sales example, facts could include Sales Amount, Quantity Sold, etc.
Data Cube: The data is stored in a cube-like structure, where each cell in the cube contains a fact value, and the coordinates represent the intersections of dimension values. For example, a cell could represent total sales of a product in a specific location.
Operations:
Drill-down: Breaking data down to a finer level of granularity (e.g., from yearly to quarterly totals).
Roll-up: Summing or aggregating data to a higher level (e.g., from quarterly to yearly totals).
Slice: Fixing a single value on one dimension while keeping the other dimensions intact, which yields a sub-cube with one fewer dimension.
Dice: A more specific selection that restricts values on two or more dimensions at once, producing a smaller sub-cube.
Pivot: Reorganizing (rotating) the dimensions for better understanding.

2. Multidimensional Databases (MDB):
A multidimensional database is designed to store and manage data that can be modeled using a multidimensional data model.
Structure: MDBs use a multidimensional schema such as the Star Schema or Snowflake Schema to structure data.
Star Schema: A central fact table is connected to dimension tables, forming a star-like structure. It is simple and fast to query.
Snowflake Schema: A more normalized version of the star schema, where dimension tables are further divided into additional tables. It is more complex but saves storage space.
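To make the star schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_location), the column names, and the sample rows are all assumptions made up for illustration; they are not taken from any particular warehouse product.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by two dimension tables (hypothetical names)."""
    conn.executescript(
        """
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            location_id INTEGER REFERENCES dim_location(location_id),
            sales_amount REAL,      -- fact (measure)
            quantity_sold INTEGER   -- fact (measure)
        );
        """
    )
    # Illustrative sample data only.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Laptop"), (2, "Phone")])
    conn.executemany("INSERT INTO dim_location VALUES (?, ?)",
                     [(1, "Delhi"), (2, "Mumbai")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
        (1, 1, 1000.0, 2),
        (1, 2, 500.0, 1),
        (2, 1, 300.0, 3),
        (2, 2, 200.0, 2),
    ])


def sales_by_product(conn):
    """Join the fact table to one dimension and aggregate the measure."""
    return conn.execute(
        """
        SELECT p.name, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()


star_conn = sqlite3.connect(":memory:")
build_star_schema(star_conn)
print(sales_by_product(star_conn))
```

Each analytical query needs only one join per dimension, which is why the star layout is usually described as simple to query. A snowflake variant would split, for example, dim_location into separate city and region tables at the cost of extra joins.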
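The data cube operations (roll-up, drill-down, slice, dice, pivot) can be sketched in plain Python over a small list of fact rows. The dimension names, the sample sales figures, and the helper functions below are hypothetical and chosen only for illustration; a real OLAP engine would precompute and index these aggregates.

```python
from collections import defaultdict

# Dimension order for each row; the last element of a row is the fact (sales amount).
DIMS = ("year", "quarter", "location", "product")
SALES = [
    ("2023", "Q1", "Delhi", "Laptop", 100),
    ("2023", "Q1", "Mumbai", "Laptop", 150),
    ("2023", "Q2", "Delhi", "Phone", 80),
    ("2023", "Q2", "Mumbai", "Phone", 120),
    ("2024", "Q1", "Delhi", "Laptop", 200),
]


def aggregate(rows, group_by):
    """Sum the fact over the listed dimensions (fewer dims = roll-up, more = drill-down)."""
    idx = [DIMS.index(d) for d in group_by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_cube(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def dice_cube(rows, criteria):
    """Dice: restrict several dimensions at once; criteria maps dim -> allowed values."""
    return [
        r for r in rows
        if all(r[DIMS.index(d)] in allowed for d, allowed in criteria.items())
    ]


rolled_up = aggregate(SALES, ["year"])                # coarser: totals per year
drilled_down = aggregate(SALES, ["year", "quarter"])  # finer: totals per quarter
delhi_slice = slice_cube(SALES, "location", "Delhi")
laptop_delhi = dice_cube(SALES, {"location": {"Delhi"}, "product": {"Laptop"}})
# Pivot: same totals, with the axes reordered from (location, product) to (product, location).
pivoted = aggregate(SALES, ["product", "location"])
```

In this sketch roll-up and drill-down are the same aggregation at different granularities, which mirrors how the two operations are inverses of each other on a dimension hierarchy.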
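The staging layer's extract, transform, and load steps can also be sketched in a few lines. This is a toy example under stated assumptions: the raw rows, the cleaning rules, and the sales table are invented for illustration, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw records as they might arrive from a source system.
RAW_ROWS = [
    {"product": " Laptop ", "city": "delhi", "amount": "1000.50"},
    {"product": "Phone", "city": "MUMBAI", "amount": "300"},
    {"product": "", "city": "Delhi", "amount": "50"},        # missing product
    {"product": "Tablet", "city": "Pune", "amount": "n/a"},  # unparseable amount
]


def extract():
    """Extract: read raw records from the source (here, a constant list)."""
    return list(RAW_ROWS)


def transform(rows):
    """Transform: trim text, normalize city casing, drop rows that fail validation."""
    clean = []
    for row in rows:
        product = row["product"].strip()
        city = row["city"].strip().title()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject rows whose amount is not numeric
        if not product:
            continue  # reject rows without a product name
        clean.append((product, city, amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (product TEXT, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    load(conn, transform(extract()))
    return conn.execute(
        "SELECT product, city, amount FROM sales ORDER BY product"
    ).fetchall()


etl_conn = sqlite3.connect(":memory:")
loaded_rows = run_etl(etl_conn)
```

Keeping the three steps as separate functions reflects the layering in the architecture: the source layer feeds extract, the staging layer owns transform, and the warehouse layer receives load.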