Assignment 2
For this project you will use SQL Server (database engine), SQL Server Analysis Services, and Power BI or Tableau.
You will design and develop a data warehouse, build one or more data cubes on top of it, develop some
OLAP reports and visualize your results. Each group will present its project in Teams (10-15 minutes per group). The
presentation should take the form of a business case. The project consists of the following steps:
1. Find a dataset on the web that seems attractive and interesting to you. Possible links:
www.kaggle.com
https://github.com/caesar0301/awesome-public-datasets
http://www.kdnuggets.com/datasets/index.html
https://catalog.data.gov/dataset?tags=data-warehouse
or search Google for "datasets for data warehousing / data mining / OLAP / etc."
2. Understand the facts and the dimensions of the application. Define a star/snowflake schema in your SQL Server
database. Populate the fact and the dimension tables from the dataset you found, for example by using the
import task in your database server. You may have to clean and transform the dataset, define dimension
tables manually, or insert values by hand.
3. Use SQL Server Analysis Services to define a multi-dimensional model (a cube) over your schema. Explore the
reporting capabilities of your tool and show some OLAP reports (drill down/roll up, pivoting, ranking, etc.).
4. Install Power BI and, using your database schema, show OLAP examples and visualize them, or anything else
you consider interesting. Better (more interesting, more interactive, etc.) visualizations earn a better grade.
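To make steps 2 and 3 more concrete, here is a minimal sketch of a star schema and three typical OLAP-style queries (roll-up, drill-down, ranking). It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere, and the retail-sales tables (dim_date, dim_product, fact_sales), their columns, and all values are hypothetical. In the actual project the schema lives in SQL Server and the reports come from the cube.

```python
import sqlite3

# ASSUMPTION: SQLite (in memory) stands in for SQL Server; the retail-sales
# schema and every value below are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,   -- surrogate key, e.g. 20230101
    year     INTEGER NOT NULL,
    month    INTEGER NOT NULL
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    category     TEXT NOT NULL
);
-- The fact table holds the measures plus one foreign key per dimension.
CREATE TABLE fact_sales (
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    quantity    INTEGER NOT NULL,
    amount      REAL    NOT NULL
);
""")

# Populate the dimensions first, then the facts that reference them.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20230101, 2023, 1), (20230201, 2023, 2), (20240101, 2024, 1)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Laptop", "Electronics"),
                  (2, "Phone", "Electronics"),
                  (3, "Desk", "Furniture")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(20230101, 1, 2, 2000.0),
                  (20230201, 2, 1, 500.0),
                  (20230201, 3, 4, 800.0),
                  (20240101, 1, 1, 1000.0),
                  (20240101, 3, 2, 400.0)])


def rollup_by_year_category(conn):
    """Roll-up: total sales amount per (year, category)."""
    return conn.execute("""
        SELECT d.year, p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_date d    ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY d.year, p.category
        ORDER BY d.year, p.category
    """).fetchall()


def drilldown_by_month(conn, year):
    """Drill-down: the same measure for one year, split by month."""
    return conn.execute("""
        SELECT d.month, p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_date d    ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        WHERE d.year = ?
        GROUP BY d.month, p.category
        ORDER BY d.month, p.category
    """, (year,)).fetchall()


def rank_products_by_revenue(conn):
    """Ranking: products ordered by total revenue, highest first."""
    return conn.execute("""
        SELECT p.product_name, SUM(f.amount) AS revenue
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY revenue DESC
    """).fetchall()


if __name__ == "__main__":
    print(rollup_by_year_category(conn))
    print(drilldown_by_month(conn, 2023))
    print(rank_products_by_revenue(conn))
```

The same joins and GROUP BY levels correspond to what a cube exposes as dimensions (date, product) and measures (quantity, amount); moving from year to month in the GROUP BY is the drill-down, and the reverse is the roll-up.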
The deliverable (aside from the presentation) is a document (.doc or .pdf) describing each of the
above steps in detail, with plenty of screenshots:
(a) the kind of application you are targeting, a description of the dataset you used, where you found it,
what problems you are trying to solve, and what analysis you want to do;
(b) a description of the relational design of your fact and dimension tables, the import methods, and the
cleaning/transformation procedures in detail;
(c) the cube you have built on top of your schema, its dimensions, measures, and calculated measures (if any),
together with a description (in English) of the OLAP reports and screenshots of them;
(d) visualizations of these reports and a description of each visualization, how it was produced, etc.