CS5504 Surgery 3 SampleSolution
Q1
The manager of an airline company wants to know, daily, the number and revenue of flight
bookings made, per flight, at the different travel agencies.
a. Design an appropriate dimensional star schema to satisfy the above requirement.
Explicitly state the grain of such schema and devise at least two non-key attributes for
each of its dimensions; such attributes should make sense. Also, provide a sample row
of all tables of the proposed schema illustrating the respective data interrelationships.
b. Redesign your schema to a snowflake scenario of your choice.
Sample Answer
Grain: Flight Bookings, per flight, per travel agency, per day.
Date
Date_Key | Date       | Description         | Day of Week | Month    | Year
1        | 21/02/2014 | February 21st, 2014 | Friday      | February | 2014
Travel Agency
TravelAgency_Key | Name       | Description           | Address     | City   | Country
1                | XPT Travel | Central London Branch | 18, Bond St | London | UK
Flight
Flight_Key | Description     | Destination | Origin | Capacity
1          | European Flight | London      | Lisbon | 320
Flight Bookings Facts
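As an illustration only, the star schema could be sketched with Python's built-in sqlite3 module as below. The dimension rows are the sample rows given above. The table names, the column name Full_Date, the fact table columns Number_of_Bookings and Revenue, and the fact row values (12 bookings, 2400.0 revenue) are assumptions made for this sketch and are not part of the sample answer.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Date_Dim (
    Date_Key INTEGER PRIMARY KEY, Full_Date TEXT, Description TEXT,
    Day_of_Week TEXT, Month TEXT, Year INTEGER);
CREATE TABLE TravelAgency_Dim (
    TravelAgency_Key INTEGER PRIMARY KEY, Name TEXT, Description TEXT,
    Address TEXT, City TEXT, Country TEXT);
CREATE TABLE Flight_Dim (
    Flight_Key INTEGER PRIMARY KEY, Description TEXT,
    Destination TEXT, Origin TEXT, Capacity INTEGER);
-- One fact row per flight, per travel agency, per day (the stated grain).
CREATE TABLE FlightBookings_Fact (
    Date_Key INTEGER REFERENCES Date_Dim(Date_Key),
    TravelAgency_Key INTEGER REFERENCES TravelAgency_Dim(TravelAgency_Key),
    Flight_Key INTEGER REFERENCES Flight_Dim(Flight_Key),
    Number_of_Bookings INTEGER,
    Revenue REAL,
    PRIMARY KEY (Date_Key, TravelAgency_Key, Flight_Key));
""")

# Sample dimension rows taken from the tables above.
conn.execute("INSERT INTO Date_Dim VALUES (?, ?, ?, ?, ?, ?)",
             (1, "21/02/2014", "February 21st, 2014", "Friday", "February", 2014))
conn.execute("INSERT INTO TravelAgency_Dim VALUES (?, ?, ?, ?, ?, ?)",
             (1, "XPT Travel", "Central London Branch", "18, Bond St", "London", "UK"))
conn.execute("INSERT INTO Flight_Dim VALUES (?, ?, ?, ?, ?)",
             (1, "European Flight", "London", "Lisbon", 320))

# Hypothetical fact row: the measure values are invented for illustration.
conn.execute("INSERT INTO FlightBookings_Fact VALUES (?, ?, ?, ?, ?)",
             (1, 1, 1, 12, 2400.0))

# Join the fact row back to its three dimensions to show the interrelationships.
row = conn.execute("""
    SELECT d.Full_Date, a.Name, f.Description, b.Number_of_Bookings, b.Revenue
    FROM FlightBookings_Fact AS b
    JOIN Date_Dim AS d ON d.Date_Key = b.Date_Key
    JOIN TravelAgency_Dim AS a ON a.TravelAgency_Key = b.TravelAgency_Key
    JOIN Flight_Dim AS f ON f.Flight_Key = b.Flight_Key
""").fetchone()
print(row)
```

The composite primary key on the three foreign keys is one way to enforce the grain: a second row for the same flight, agency and day would be rejected.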
Example Snowflake
Split Travel Agency (City, Country)
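One possible reading of this split, sketched with Python's sqlite3 module: City and Country move out of the Travel Agency dimension into their own tables, and the agency keeps only a City_Key. The names City_Dim, Country_Dim, City_Key, Country_Key, City_Name and Country_Name are assumptions for this sketch.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
snow = sqlite3.connect(":memory:")
snow.executescript("""
CREATE TABLE Country_Dim (
    Country_Key INTEGER PRIMARY KEY, Country_Name TEXT);
CREATE TABLE City_Dim (
    City_Key INTEGER PRIMARY KEY, City_Name TEXT,
    Country_Key INTEGER REFERENCES Country_Dim(Country_Key));
-- The agency no longer stores City and Country directly, only a City_Key.
CREATE TABLE TravelAgency_Dim (
    TravelAgency_Key INTEGER PRIMARY KEY, Name TEXT, Description TEXT,
    Address TEXT,
    City_Key INTEGER REFERENCES City_Dim(City_Key));

INSERT INTO Country_Dim VALUES (1, 'UK');
INSERT INTO City_Dim VALUES (1, 'London', 1);
INSERT INTO TravelAgency_Dim
    VALUES (1, 'XPT Travel', 'Central London Branch', '18, Bond St', 1);
""")

# Recovering the original flat row now needs two extra joins.
agency = snow.execute("""
    SELECT a.Name, c.City_Name, k.Country_Name
    FROM TravelAgency_Dim AS a
    JOIN City_Dim AS c ON c.City_Key = a.City_Key
    JOIN Country_Dim AS k ON k.Country_Key = c.Country_Key
""").fetchone()
print(agency)
```

The fact table is unchanged by this redesign; only the Travel Agency dimension is normalised.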
Q2. Sample Answers in Italics
You are given the following dimensional schema of a BI system. The schema is derived from the
OLTP system of a company that sells a variety of products to customers worldwide. The
company's products belong to different categories and come from various suppliers. Because
customers are located worldwide, products have to be shipped to them.
Study the dimensional model above and answer the following questions:
1. What is the granularity of the model? What are the measures?
The grain is the order line item (per Customer, Employee, etc.). Measures: Unit Price,
Quantity, Discount, Sales Amount, Freight.
2. Do snowflake elements/structures exist in the schema? If yes, which ones?
Yes: City/State/Country/Continent and Product/Category.
3. Do degenerate dimensions exist in the schema? If yes, which ones?
Yes, Order No, OrderLineNo
Q3.
The manager of the cards department of a bank wants to know on a daily basis the total number
and total amount (i.e. total value) of credit and debit card purchases performed at the
Point-of-Sale (POS) machines of the bank across the country.
Design an appropriate dimensional schema to satisfy the above requirement. The granularity of
such schema should represent every transaction (purchase) performed at each POS of the bank.
Devise at least two non-key attributes for each of the dimensions of the schema.
Also provide a sample row of all entities of the proposed schema illustrating the respective data
interrelationships.
Sample Answer
Date
Date_Key | Date       | Description         | Day of Week | Month    | Year
1        | 21/02/2016 | February 21st, 2016 | Sunday      | February | 2016
POS Machine
POS_Key | POS Code | Manufacturer | Merchant ID | Address     | City   | Region
1       | 32203    | ABC Co       | ZARA        | 18, Bond St | London | London
Card
Card Key | Card Number      | Card Type | Network | BIN  | Cardholder Name | Expiry Date
1        | 5167320011923457 | Debit     | Visa    | 5167 | John Smith      | 17/04/2017
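As a hedged sketch, a transaction-grain fact table for this schema and the manager's daily query could look as follows in Python's sqlite3 module. The fact table name POS_Purchase_Fact, its Amount column, the cut-down dimension tables, the second (Credit) card and all transaction amounts are assumptions for illustration; only the first Date and Card rows come from the sample tables above.

```python
import sqlite3

# In-memory database; names and transaction values are illustrative assumptions.
bank = sqlite3.connect(":memory:")
bank.executescript("""
-- Dimensions are cut down to the columns the query needs.
CREATE TABLE Date_Dim (Date_Key INTEGER PRIMARY KEY, Full_Date TEXT);
CREATE TABLE Card_Dim (Card_Key INTEGER PRIMARY KEY, Card_Type TEXT);
-- One fact row per purchase performed at a POS (transaction grain).
CREATE TABLE POS_Purchase_Fact (
    Date_Key INTEGER REFERENCES Date_Dim(Date_Key),
    POS_Key INTEGER,
    Card_Key INTEGER REFERENCES Card_Dim(Card_Key),
    Amount REAL);

INSERT INTO Date_Dim VALUES (1, '21/02/2016');
INSERT INTO Card_Dim VALUES (1, 'Debit');
INSERT INTO Card_Dim VALUES (2, 'Credit');   -- hypothetical second card

-- Hypothetical purchases, all at POS 1 on the same day.
INSERT INTO POS_Purchase_Fact VALUES (1, 1, 1, 25.0);
INSERT INTO POS_Purchase_Fact VALUES (1, 1, 1, 40.0);
INSERT INTO POS_Purchase_Fact VALUES (1, 1, 2, 100.0);
""")

# Daily number and total value of purchases, split by card type.
daily_totals = bank.execute("""
    SELECT d.Full_Date, c.Card_Type, COUNT(*) AS purchases, SUM(f.Amount) AS total
    FROM POS_Purchase_Fact AS f
    JOIN Date_Dim AS d ON d.Date_Key = f.Date_Key
    JOIN Card_Dim AS c ON c.Card_Key = f.Card_Key
    GROUP BY d.Full_Date, c.Card_Type
    ORDER BY c.Card_Type
""").fetchall()
print(daily_totals)
```

Because each row is a single purchase, the number of purchases is obtained by counting rows, so the fact table needs no explicit count measure.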
The chain has 300 stores and approx. 40 000 products. Also, there are 500 products per brand.
Out of the 40 000 products, assume that only 4000 sell in each store daily, and that a sold item
may be in only one promotion per store, per day. Finally, there is at least one sale per product,
per store, per week.
a. What is the maximum number of fact table rows in this schema? What is the (maximum)
number of fact table records that are actually stored in this schema?
Maximum: 40 000 × 300 × 1825 = 21.9 billion rows
Actually stored (at most): 4 000 × 300 × 1825 × 1 = 2.19 billion rows
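The arithmetic can be checked with a few lines of Python. The factor 1825 is taken from the answer as given; it equals 5 × 365, which suggests five years of daily data, but that reading is an assumption here. The variable names are illustrative.

```python
# Figures from the question text.
products = 40_000
stores = 300
products_sold_per_store_per_day = 4_000
promotions_per_sold_item = 1

# Factor used in the answer; 1825 = 5 * 365 (assumed: five years of daily data).
days = 1825

# Every product in every store on every day.
max_rows = products * stores * days

# Only the products that actually sell, each in at most one promotion.
stored_rows = (products_sold_per_store_per_day * stores * days
               * promotions_per_sold_item)

print(f"{max_rows:,}")     # 21,900,000,000
print(f"{stored_rows:,}")  # 2,190,000,000
```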
b. Estimate the number of fact table rows retrieved by each of the following database
queries.
▪ Query involving 1 product, 1 store, 1 week.
1 fact table row
▪ Query involving 1 product, all stores, 1 week.
300 fact table rows
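These two estimates follow from the statement that there is at least one sale per product, per store, per week, which can be sketched as a small helper. The function name estimate_rows is hypothetical, and the upper bound of 7 rows per product, store and week assumes a daily grain (one row per day on which the product sells), which the answer above does not state.

```python
stores = 300

# From the question text: at least one sale per product, per store, per week.
min_rows_per_product_store_week = 1

# Assumption: daily grain, so at most 7 rows per product, per store, per week.
max_rows_per_product_store_week = 7


def estimate_rows(n_products, n_stores, n_weeks,
                  rows_per_cell=min_rows_per_product_store_week):
    """Estimate fact rows touched by a query over the given ranges."""
    return n_products * n_stores * n_weeks * rows_per_cell


one_store = estimate_rows(1, 1, 1)        # 1 product, 1 store, 1 week
all_stores = estimate_rows(1, stores, 1)  # 1 product, all stores, 1 week
all_stores_upper = estimate_rows(
    1, stores, 1, rows_per_cell=max_rows_per_product_store_week)

print(one_store, all_stores, all_stores_upper)
```

The answers above correspond to the lower bound; under the daily-grain assumption the same queries could return up to seven times as many rows.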