Business Intelligence Masters Programme

This document provides course descriptions for 4 courses offered at Universite Libre de Bruxelles: 1) Advanced Databases covers recent developments in databases including object-oriented, distributed, and non-traditional data types like spatial and temporal data. 2) Database Systems Architecture examines the implementation of relational databases including query optimization, execution, transaction processing, and concurrency control. 3) Decision Engineering introduces decision theory and models to help decision makers with complex problems involving multiple alternatives, criteria, outcomes, and decision makers. 4) Data Warehousing covers data warehouses for analytical processing of historical data from multiple sources to support analysis and decision making.

Uploaded by

Admire Mamvura

Erasmus Mundus Master Course

Information Technologies for Business Intelligence

Detailed Course Description
Academic Year 2012-2013
University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: ADB (INFO-H-415)
Course name: Advanced Databases
Name and email address of the instructors: Esteban Zimanyi ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh415
Semester: 1
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h.
Exercises: 24h.
Projects: 12h.
Goals:
Today, databases are moving away from typical management applications and address new application areas.
For this, databases must consider (1) recent developments in computer technology, such as the object paradigm and
distribution, and (2) management of new data types such as spatial or temporal data. This course introduces
the concepts and techniques of some innovative database applications.
Learning outcomes:
At the end of the course students are able to
Understand various technologies related to database management systems
Understand when to use these technologies according to the requirements of particular applications
Understand the different alternative approaches proposed by extant database management systems for each of
these technologies
Understand the optimization issues related to particular implementations of these technologies in extant
database management systems.
Readings and text books:
R.T. Snodgrass, Developing Time-Oriented Database Applications in SQL, Morgan Kaufmann, 2000
Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan
Kaufmann, 2001
Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan
Kaufmann, 2002
Shashi Shekhar and Sanjay Chawla, Spatial Databases: A Tour, Prentice Hall, 2003.
Prerequisites:
Knowledge of the basic principles of database management, in particular SQL
Table of contents:
Active Databases
Taxonomy of concepts. Applications of active databases: integrity maintenance, derived data, replication.
Design of active databases: termination, confluence, determinism, modularisation.
Temporal Databases
Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation
of temporal data with standard SQL. New temporal extensions in SQL 2011.
Object-Oriented and Object-Relational Databases
Object-oriented model. Object persistence. ODMG standard: Object Definition Language and Object
Query Language.
Object-relational model. Built-in constructed types. User-defined types. Typed tables. Type and table
hierarchies. SQL standard and Oracle implementation.
Spatial Databases
Application Domains of Geographical Information Systems (GIS), Common GIS data types and analysis.
Conceptual data models for spatial databases. Logical data models for spatial databases: raster model
(map algebra), vector model (OGIS/SQL1999). Physical data models for spatial databases: clustering
methods (space-filling curves), storage methods (R-tree, grid files).
Assessment breakdown:
75% written examination, 25% project evaluation
University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: DBSA (INFO-H-417)
Course name: Database Systems Architecture
Name and email address of the instructors: Stijn Vansummeren ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh417
Semester: 1
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h.
Exercises: 12h.
Projects: 24h.
Goals:
In contrast to a typical introductory course in database systems where one learns to design and query relational
databases, the goal of this course is to get a fundamental insight into the implementation aspects of database
systems. In particular, we take a look under the hood of relational database management systems, with a focus
on query and transaction processing. By having an in-depth understanding of the
query-optimisation-and-execution pipeline, one becomes more proficient in administering DBMSs, and hand-optimising SQL queries
for fast execution.
Learning outcomes:
Upon successful completion of this course, the student:
Understands the workflow by which a relational database management system optimises and executes a
query
Is capable of hand-optimising SQL queries for faster execution
Understands the I/O model of computation, and is capable of selecting and designing data structures
and algorithms that are efficient in this model (both in the context of database systems, and in other
contexts).
Understands the manner in which relational database management systems provide support for transaction
processing, concurrency control, and fault tolerance
Readings and text books:
Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. Database Systems: The Complete Book,
Prentice Hall, 2nd Edition, 2008.
Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. McGraw-Hill, 3rd Edition,
2002.
Prerequisites:
Introductory course on relational databases, including SQL and relational algebra
Course on algorithms and data structures
Knowledge of the Java programming language
Table of contents:
Query Processing
With respect to query processing, we study the whole workflow of how a typical relational database
management system optimises and executes SQL queries. This entails an in-depth study of:
translating the SQL query into a logical query plan;
optimising the logical query plan;
how each logical operator can be algorithmically implemented on the physical (disk) level, and how
secondary-memory index structures can be used to speed up these algorithms; and
the translation of the logical query plan into a physical query plan using cost-based plan estimation.
Transaction Processing
Logging
Serializability
Concurrency control
Assessment breakdown:
75% written examination, 25% project evaluation
University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: DE (MATH-H-405)
Course name: Decision Engineering
Name and email address of the instructors: Yves De Smet ([email protected])
Web page of the course: http://uv.ulb.ac.be (no public website)
Semester: 1
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h.
Exercises: 24h.
Projects: 12h.
Goals:
The goal of this course is to introduce the basics of decision theory. The main aim is to illustrate how
mathematical models and specific algorithms can be used to help decision makers facing complex problems
(involving a large number of alternatives / multiple criteria / uncertain or risky outcomes / multiple decision
makers, ...).
Learning outcomes:
Upon successful completion of this course, the student:
Is able to formulate and to solve basic decision problems;
Can identify the properties and limits of common decision models;
Is ready to deepen his/her knowledge in advanced decision sciences courses.
Readings and text books:
C.D. Aliprantis, S.K. Chakrabarti, Games and decision making, Oxford University Press, 2000
F.S. Hillier, G.J. Lieberman, Introduction to Operations Research, McGraw Hill, 2005
Ph. Vincke, Multicriteria Decision-Aid, J. Wiley, New York, 1992
Prerequisites:
Linear algebra
Basic course on algorithms
Probability and statistics
Table of contents:
Introduction to decision sciences
The origin of operational research and decision sciences, some introductory examples.
Voting theory
Main voting procedures and properties. Paradoxes. Arrow's theorem.
Multicriteria Decision Aid
Main concepts, introduction to multi-objective optimization, multi-attribute utility theory, outranking
methods (ELECTRE & PROMETHEE), applications.
Decision under risk and uncertainty
Common decision criteria: Maxmin, Maxmax, Hurwicz, Savage, Laplace. Expected utility.
Game theory
Classic examples. Nash equilibrium. Cournot duopoly. Median Voter theorem.
Assessment breakdown:
75% written examination, 25% project evaluation
University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: DW (INFO-H-419)
Course name: Data Warehousing
Name and email address of the instructors: Toon Calders ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh419
Semester: 1
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h.
Exercises: 12h.
Projects: 24h.
Goals:
Relational and object-oriented databases are mainly suited for operational settings in which there are many
small transactions querying and writing to the database. Consistency of the database (in the presence of
potentially conflicting transactions) is of utmost importance. The situation is very different in analytical
processing, where historical data is analyzed and aggregated in many different ways. Such queries differ
significantly from the typical transactional queries in the relational model:
1. Typically analytical queries touch a larger part of the database and last longer than the transactional
queries;
2. Analytical queries involve aggregations (min, max, avg, . . .) over large subgroups of the data;
3. When analyzing data it is convenient to see it as multi-dimensional.
For these reasons, data to be analyzed is typically collected into a data warehouse with Online Analytical
Processing support. Online here refers to the fact that the answers to the queries should not take too long
to be computed. Collecting the data is often referred to as Extract-Transform-Load (ETL). The data in the
data warehouse needs to be organized in a way that enables the analytical queries to be executed efficiently. For
the relational model, star and snowflake schemas are popular designs. Next to OLAP on top of a relational
database (ROLAP), native OLAP solutions based on multidimensional structures (MOLAP) also exist. In
order to further improve query answering efficiency, some query results can be materialized in advance in the
database, and new indexing techniques have been developed.
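As a rough illustration of the ideas above, the following sketch builds a tiny star schema in an in-memory SQLite database and runs one analytical query that aggregates the fact table along two dimensions. The table names, column names, and sample rows are all hypothetical and chosen only for this example; they do not come from the course material, and SQLite merely stands in for whatever ROLAP engine would be used in practice.

```python
import sqlite3

# Hypothetical star schema: one fact table (fact_sales) surrounded by
# two dimension tables (dim_product, dim_date). All names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales  (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'books'), (3, 'music');
INSERT INTO dim_date    VALUES (10, 2011), (11, 2012);
INSERT INTO fact_sales  VALUES
    (1, 10, 20.0), (2, 10, 30.0), (3, 10, 5.0),
    (1, 11, 40.0), (3, 11, 15.0);
""")


def rollup_by_category_and_year(connection):
    """Aggregate the fact table along two dimensions (category, year)."""
    query = """
        SELECT p.category, d.year, SUM(f.amount), AVG(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d    ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """
    return connection.execute(query).fetchall()


rows = rollup_by_category_and_year(conn)
for category, year, total, average in rows:
    print(category, year, total, average)
```

Unlike a transactional query that touches a single row, this query scans the whole fact table and groups it by dimension attributes, which is the access pattern the warehouse layout is meant to serve.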
The first and largest part of the course covers the traditional data warehousing techniques. The main
concepts of multidimensional databases are illustrated using the SQL Server tools. The second part of the
course consists of advanced topics such as data warehousing appliances, data stream processing, data mining,
and spatial-temporal data warehousing. The coverage of these topics connects the data warehousing course
with other related courses in the program and serves as an introduction to them. Several associated partners
of the program contribute to the course in the form of invited lectures, case studies, and proof of technology
sessions.
Learning outcomes:
At the end of the course students are able to
Understand the difference between operational databases and data warehouses
Understand the principles of multidimensional modeling
Understand the exploitation of a data warehouse for querying and reporting
Understand best practices and methodologies for data warehouse development
Understand the process of populating a data warehouse from internal and external sources
Readings and text books:
Christian S. Jensen, Torben Bach Pedersen, Christian Thomsen. Multidimensional Databases and Data
Warehousing. Morgan and Claypool Publishers, 2010
Ralph Kimball, Margy Ross, Warren Thornthwaite, Joy Mundy, Bob Becker. The Data Warehouse
Lifecycle Toolkit, 2nd ed. Wiley, 2008.
Selected research papers and articles will be offered on the course website
Prerequisites:
A first course on database systems covering the relational model, SQL, entity-relationship modelling,
constraints such as functional dependencies and referential integrity, primary keys, foreign keys.
Data structures such as binary search trees, linked lists, multidimensional arrays.
Table of contents:
There is a mandatory project to be realized by students in groups of 2 or 3. In this project students
have to select and analyze a software tool in the context of data warehousing. The product and its capabilities
have to be positioned within the larger data warehousing context, and a small demo has to illustrate the capabilities
of the tool. The theoretical part of the course is dedicated to topics that allow the students to successfully
carry out the project. Below is the table of contents of the theoretical part of the course:
Foundations of multidimensional modelling
Querying and reporting a multidimensional database with OLAP
Methodological aspects for data warehouse development
Populating a data warehouse: The ETL process
Assessment breakdown:
75% written examination, 25% project evaluation
University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: BPM (INFO-H-420)
Course name: Business Process Management
Name and email address of the instructors: Toon Calders ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh420
Semester: 1
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h.
Exercises: 12h.
Assignments and project: 24h.
Goals:
This course introduces basic concepts for modeling and implementing business processes using contemporary
information technologies. The first part of the course considers the modeling of business processes, including the
control flow, and the data and resource perspectives. Petri nets will be used as a theoretical underpinning to
formalize the different workflow patterns and unambiguously define the semantics of the different constructions
in the workflow modelling languages. The workflow languages Yet Another Workflow Language (YAWL) and
the Business Process Model and Notation (BPMN) will be introduced in detail, as well as the main
characteristics of the Business Process Execution Language (BPEL) for the composition of web services, and
Event-Driven Process Chains (EPCs).
The second part of the course then goes into the analysis, simulation, verification, and discovery of
workflows. Static techniques to verify properties such as soundness and the option-to-complete at model level will
be studied, as well as dynamic properties such as the compliance of an event log with respect to a given model.
For the discovery of workflows, an overview of the main process mining techniques will be given.
During the course the students have to perform a couple of modelling assignments in YAWL and BPMN. In
the final project, students build a prototype system enacting one of the workflows modelled in their modelling
assignments.
Affiliated industrial partners of the Erasmus Mundus project will be involved in the course in the form of
invited lectures, case studies, and proof of technology sessions. These lectures complement the academic
coverage of the topic with a more business-oriented perspective and provide a more
complete picture of the Business Process Modeling landscape.
Learning outcomes:
At the end of the course students are able to
Understand the value and benefits as well as the limitations of business process management
Understand the business process management life cycle
Model business processes in BPMN and YAWL
Construct a prototype business process in YAWL
Quickly master vendor-specific products in the BPM area
Readings and text books:
Mathias Weske. Business Process Management: Concepts, Languages, Architectures. Springer. 2007
Arthur H. M. ter Hofstede, Wil M. P. van der Aalst, Michael Adams, Nick Russell (Editors), Modern
Business Process Automation: YAWL and its Support Environment. Springer, 2009.
Wil van der Aalst. Process Mining: Discovery, Conformance and Enhancement of Business Processes.
Springer, 2012.
Prerequisites:
Basic programming skills: variables, control structures such as loops and if-then-else, procedures, object-
oriented notions such as classes and objects, ...
Set theory (Notions such as set, set operations, sequence, multiset, function) and logics (mathematical
notation and argumentation; basic proofs)
Basic graph theory (notions such as graphs, reachability, transitivity, ...)
Experience with modelling languages such as UML and ER diagrams is recommended.
Table of contents:
There is a mandatory project, split into several tasks over the whole period of the course, to be
realized by the students in groups of 2. The theoretical part of the course is dedicated to topics that allow
the students to successfully carry out the project. Below is a high-level overview of the theoretical part of the
course:
Short overview of enterprise systems architecture and the place of business process management systems in
it. The BPM life cycle.
Modelling business processes: modelling the control flow, data and resource perspective.
Enacting the business process models.
Static and dynamic verification of process models; conformance checking.
Discovering process models and other properties of processes through process mining.
Assessment breakdown:
50% oral examination, 50% project evaluation
University: UFRT
Department: Computer Science Dept.
Course ID: ADW
Course name: Advanced Data Warehousing
Name and email address of the instructors: Dr. Patrick Marcel, [email protected], Dr.
Veronika Peralta, [email protected]
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours:
Lectures: 22 h.
Exercises: 16 h.
Project: 12 h.
Goals:
The aim of this course is to complement the course Data Warehouses (Semester 1) in its study of database
technology used in Business Intelligence. A particular focus is placed on the problems posed by heterogeneous
data integration and data quality. Classical notions of data warehousing and OLAP are recalled and developed:
architecture, ETL, conceptual and logical design, query processing and optimization. Advanced topics like
query personalization and recommendation are introduced.
Learning outcomes:
Upon successful completion of this course, the student is able to:
efficiently design, construct and query a data warehouse
define, measure and maintain data quality in the context of data warehousing.
Readings and text books:
Ralph Kimball, Margy Ross, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling,
Second edition, John Wiley, 2002.
Matteo Golfarelli, Stefano Rizzi, Data Warehouse Design: Modern Principles and Methodologies, second
edition, McGraw Hill, 2009.
Maurizio Lenzerini, Panos Vassiliadis, Matthias Jarke, Yannis Vassiliou, Fundamentals of Data Warehouses,
Second edition, Springer, 2002.
Carlo Batini, Monica Scannapieco, Data Quality Concepts, Methodologies and Techniques, Springer, 2006.
Prerequisites: Course DBSA and course DW (Semester 1).
Table of contents:
Introduction
Data warehouse architecture and design
Quality
Quality models, diagnosis, correction, prevention
Loading
Integration, ETL
OLAP models and languages
OLAP implementations, query processing, query optimization
Advanced topics
Query personalization and recommendations
Assessment breakdown:
Final written exam (50%), project (50%) assessed by an oral presentation, a demo, and a written report.
University: UFRT
Department: Computer Science Dept.
Course ID: BIS
Course name: Business Intelligence Seminars
Name and email address of the instructors: Dr. Patrick Marcel, [email protected]
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours:
Lectures: 36 h.
Project: 14 h.
Goals:
Thanks to technological advances, the domain of business intelligence is witnessing today an increasing
diversification and it addresses new application areas. This seminar covers current trends and recent developments
in the domain of business intelligence. It also discusses the implications of business intelligence for
individuals, organizations, and society in general. The seminar is designed and jointly taught by all consortium
partners (whether full or associated partners, academic/research institutions or industrial companies), and
will involve guest speakers presenting their organization, research topics, internships, and proposed Master's
thesis subjects for the second year of the master.
Learning outcomes:
With this module, the student will 1) acquire a good understanding of the state of the art and the next
evolutions and challenges in the domain, and 2) learn to synthesize, organize, and present scientific and
technical information related to the domain of Business Intelligence, to make it clear to colleagues, and
discuss it objectively.
Readings and text books: Research articles and white papers in the domain of Business Intelligence.
Prerequisites: All courses of Semester 1
Table of contents: The topics of the seminar vary from year to year, according to the guest speakers. In
addition, students (in groups of maximum 2 persons) must write a report and make a one-hour presentation
in front of their fellow students on a topic of their choice in the domain of Business Intelligence. The subject
must be addressed in a technical way, to explain the underlying technologies. The choice of subject and date
of presentation is determined in agreement with the lecturer. Attendance at presentations
by guest speakers and fellow students is required.
Assessment breakdown:
A mandatory project is to be realized by students in groups of 2.
University: UFRT
Department: Computer Science Dept.
Course ID: IR
Course name: Information Retrieval
Name and email address of the instructors: Dr. Veronika Peralta, [email protected]
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours:
Lectures: 20 h.
Exercises: 12 h.
Lab: 18 h.
Goals:
To study the problems posed by information retrieval
To study the processing, indexing, querying, organization and classification of textual documents
To acquire a general idea of natural language applications and fundamentals
To acquire some practical skills in natural language applications related to information retrieval
Learning outcomes:
Upon successful completion of this course, the student:
Knows how to index a document corpus
Knows how to design, construct and query a document database
Knows how to personalize retrieval results
Has a basic understanding of linguistic modeling
Has first experience with some NLP applications (named entity detection; textual information retrieval;
text mining)
Readings and text books:
Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval,
Cambridge University Press. 2008.
James F. Allen, Natural Language Understanding. 2nd Edition, Benjamin Cummings, 1995.
Douglas Biber, Variations across speech and writing. Cambridge University Press, 1988.
Barbara J. Grosz, Karen Sparck-Jones, Bonnie Lynn Webber (Eds.), Readings in Natural Language
Processing. Morgan Kaufmann Publ., 1986.
Xuedong Huang, Alex Acero, Hsiao-Wuen Hon, Spoken Language Processing: A Guide to Theory, Algorithm
and System Development. Prentice Hall, 2001.
Prerequisites: Courses DBSA and DW (Semester 1). Knowledge of automata and language theory is
welcome.
Table of contents:
Introduction: problems, retrieval processes, architecture, evaluation issues
Retrieval models: boolean model, vector space model, probabilistic model
Indexing
Web search
Personalization
Natural Language Processing for Information Retrieval:
General introduction: applications of NLP and linguistic levels of description
Morphology: linguistic modeling (compound words), stemming, lemmatization
Terminology: motivation and applications
Morphology and syntax: POS tagging, named entity detection
Syntax: parsing
Semantic processing for information retrieval: latent semantic analysis
Assessment breakdown:
Final written exam (50%), project (50%) assessed by an oral presentation, a demo, and a written report.
University: UFRT
Department: Computer Science Dept.
Course ID: KDDM
Course name: Knowledge Discovery and Data Mining
Name and email address of the instructors: Arnaud Giacometti ([email protected]),
Arnaud Soulet ([email protected]) and Haoyuan Li ([email protected])
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours:
Lectures: 22 h.
Exercises: 16 h.
Lab: 12 h.
Goals:
The key objectives of this course are twofold: (i) to give students a detailed understanding of the strengths
and limitations of popular data mining techniques, and (ii) to understand the problems associated with
computational complexity in data mining.
Learning outcomes:
Upon successful completion of this course, the student is able to:
Prepare raw input data, and process it appropriately to provide suitable input for a wide range of data
mining algorithms.
Understand the theoretical background of the main data mining algorithms.
Critically evaluate and select appropriate data mining algorithms.
Apply data mining algorithms, interpret and report the output appropriately.
More generally, students should be able to actively manage and participate in data mining projects executed
by consultants or specialists in data mining
Readings and text books:
J. Han and M. Kamber (2006), Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers.
P.-N. Tan, M. Steinbach, V. Kumar (2005), Introduction to Data Mining, Addison-Wesley Publisher.
D. J. Hand, H. Mannila and P. Smyth (2001), Principles of Data Mining, The MIT Press.
Prerequisites: Attendees must have prior knowledge of databases, algorithms, and data structures (trees,
graphs). Some mathematical and statistical background will help.
Table of contents:
Introduction to data mining: the Knowledge Discovery Process, Data Preparation
Classification Methods:
Basic Concepts and Decision Trees
Introduction to Artificial Neural Networks
Association Rule based Methods
Model Evaluation: Statistical Tests, ROC Analysis
Association Analysis:
Basic Concepts and Algorithms
Sequential Patterns, Data Streams
Clustering:
Basic Concepts and Algorithms
Partitioning and Hierarchical Methods
Bayesian Networks: Basic Concepts and Algorithms
Advanced Topics (seminars)
Assessment breakdown:
Project (40%) + written final examination (60%)
University: UFRT
Department: Computer Science Dept.
Course ID: XWT
Course name: XML and web technologies
Name and email address of the instructors: Dr Beatrice Bouchou Markhoff, Beatrice.Bouchou@univ-tours.fr
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours:
Lectures: 22 h.
Exercises: 16 h.
Lab: 12 h.
Goals:
Understanding of the foundations of the web standard for data management, XML, with associated APIs,
schema languages, query languages and transformation languages.
Basic knowledge on most important developments on the web, web services and semantic web.
Learning outcomes:
Upon successful completion of this course, the student is able to:
Design XML documents
Design XML schemas (DTD, XML Schema)
Design integrity constraints for XML documents
Manage XML namespaces
Process XML documents w.r.t. schemas and integrity constraints (e.g. validation)
Query XML data with XPath and XQuery
Transform XML data with XSLT
Use XML Databases, either with Oracle or eXist
Discover how to express ontologies, create semantic annotations and express queries on semantic data (RDF,
RDFS, OWL, SPARQL)
Discover how to design, modify and publish web services (SOAP, WSDL, UDDI)
Readings and text books:
Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset and Pierre Senellart. Web
Data Management and Distribution. Oxford Press, 2012.
Anders Møller and Michael I. Schwartzbach. An Introduction to XML and Web Technologies.
Addison-Wesley, 2006.
Prerequisites: Attendees must have prior knowledge of algorithms and data structures (trees and graphs),
language theory (finite-state automata), first-order logic and AI
Table of contents:
Introduction to semi-structured data and XML
XML document structure, infoset
Schema languages and validation process
DTD, XML Schema, (bottom-up unranked) tree automata
Navigating XML Trees and integrity constraints
XPath
Integrity constraints for XML
Querying (and transforming) XML documents, XML databases
XQuery and XSLT
XML databases
Programming with XML
API SAX
API DOM
Introduction to Semantic Web technologies
Semantic annotations, RDF, SPARQL
Concepts of ontology, RDF schema and OWL
Introduction to Web Service Technologies
Web service description with SOAP and WSDL, and publication with UDDI
Assessment breakdown:
Written exam (60%) and Project (40%, oral presentation and written report)
University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: VA
Course name: Visual Analytics
Name and email address of the instructors: Anastasia Bezerianos ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h
Laboratory: 24h
Project: 12h
Goals:
This course aims to help students understand the emerging, multidisciplinary field of VA, to familiarise them
with current VA technology, and help them gain the foundations for building visual analytics tools and systems
using real world data.
Learning outcomes:
Upon completing the course, students will be able to:
Understand basic concepts, theories and methodologies of Visual Analytics
Analyse data using appropriate visual thinking and visual analytics techniques
Present data using appropriate visual communication and graphical methods
Design and implement a Visual Analytics system for supporting decision making
Readings and text books:
Edward Tufte, Envisioning Information, Graphics Press, 1990.
Robert Spence, Information Visualisation: Design for Interaction, Second Edition, Prentice Hall, 2007.
Colin Ware, Information Visualisation: Perception for Design, Second Edition, Morgan Kaufmann, 2004.
The course instructor will provide required weekly readings in the form of scientific articles, and will
recommend further reading on each topic.
Prerequisites:
Advanced Data Warehousing (ADW)
Knowledge Discovery and Data Mining (KD&DM)
Table of contents:
VA fundamentals: Theories, methodologies and techniques
Designing interactive graphics
Appropriate methods for different data types: Graphs, Hierarchies, Spatio-temporal data, High-dimensional
data
VA system design practices
Dashboard design
Assessment breakdown:
30% class participation, 70% project (10% proposal, 20% intermediate, 40% final)
University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: CS&SW
Course name: Corporate Semantics and Semantic Web
Name and email address of the instructors: Marie-Aude Aufaure ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h
Laboratory: 24h
Project: 12h
Goals:
This course aims at presenting semantic technologies and the benefits of using them in companies for various
applications such as semantic information search, recommendation, question answering systems, and
search-based applications.
Learning outcomes:
Provide the student with a deep understanding of the Semantic Web Technologies.
Ability to understand how to use Semantic Technologies for corporate applications, with a special emphasis
on the integration of unstructured content with enterprise structured data.
Be able to build an ontology and use it for a specific application.
Readings and text books:
Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph, Foundations of Semantic Web Technologies, Chapman
& Hall/CRC, 2009.
Steffen Staab, Rudi Studer, Handbook on Ontologies, Springer, Second Edition, 2009.
John Davies, Rudi Studer, Paul Warren, Semantic Web Technologies: Trends and Research in Ontology-
based Systems, Wiley, 2006 (published online)
David Taniar, Johanna Wenny Rahayu, Web Semantics Ontology, Idea Group Publishing, 2006.
Prerequisites:
XML and Web Technologies (X&WT)
Information Retrieval (IR)
Table of contents:
The Semantic Web Stack (RDF, RDFS, OWL, SKOS, SPARQL)
Ontology Learning and Life-Cycle
Linked Data
Adding semantics to corporate data (Triple Store, Ontologies and OLAP, Ontologies and Databases)
Applications: Semantic Search, Question Answering, Recommendation, Social Networks.
Assessment breakdown:
50% written examination + 50% project evaluation
University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: DM&ML
Course name: Data Mining and Machine Learning
Name and email address of the instructors: Etienne Cuvelier ([email protected]), Antoine
Cornuejols ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h
Laboratory: 24h
Project: 27h
Goals:
The goal of this course is to allow students to discover and practise advanced techniques of data
mining and machine learning: their principles, but also their variants, their applications and their
weaknesses.
Learning outcomes:
Upon successful completion of this course, the student will be able:
to choose the best techniques to solve a given data mining or machine learning task,
to tune the parameters of the chosen technique,
to interpret the results of the chosen technique.
Readings and text books:
David J. Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining, The MIT Press, 2001.
Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining,
Inference, and Prediction, Second Edition, Springer, 2009.
Luis Torgo, Data Mining with R, learning with case studies, Chapman & Hall/CRC, 2010.
Prerequisites:
Knowledge Discovery and Data Mining (KD&DM)
Table of contents:
Data Mining
Principal components analysis, factorial analysis.
Advanced clustering: spectral algorithms, Galois lattices.
Regressions.
Analysis of non-tabular data types (graphs, symbolic, functional).
Problems and methods of Machine Learning
Formalization of the induction problem. Observations, hypotheses, performance criterion.
Linear models. Generalization in kernel methods: SVM.
Deep neural networks.
Ensemble methods: boosting
Structured Data Learning: relational methods.
New problems: on-line learning, multi-task learning.
Assessment breakdown:
50% written examination + 50% project evaluation
University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: DM
Course name: Decision Modelling
Name and email address of the instructors: Vincent Mousseau ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours:
Lectures: 24h
Laboratory: 24h
Project: 12h
Goals:
This course aims at presenting classical decision models with a special emphasis on decision making in uncertain
situations, decisions with multiple attributes, and decisions with multiple stakeholders. During the course,
various applications will be presented, emphasizing the practical interest and applicability of the models in
real-world decision situations.
Learning outcomes:
Provide the student with decision models and a better understanding of the validity of these decision models.
Ability to understand the three levels of decision analysis: representation of observed decision behaviour
(descriptive analysis), decision aiding and recommendation (prescriptive analysis), and the design of artificial
decision agents (normative analysis).
Readings and text books:
Denis Bouyssou, Thierry Marchant, Marc Pirlot, Alexis Tsoukias, Philippe Vincke, Evaluation and decision
models with multiple criteria: Stepping stones for the analyst, Springer, International Series in Operations
Research and Management Science Volume 86, 2006.
William W. Cooper, Lawrence M. Seiford, and Kaoru Tone, Introduction to Data Envelopment Analysis
and Its Uses, Springer, 2006.
Prerequisites:
Decision Engineering (DE)
Table of contents:
Data Envelopment Analysis: Analysis of the efficiency of production units
Decision under uncertainty, decision trees: theory, modeling and applications
Behavioural decision analysis: Empirical analysis of decision behaviour, cognitive decision biases, prospect
theory
Outranking methods (theory and applications): Presentation of the Electre methods (Electre I, Electre III,
Electre Tri), reference-based ranking.
Applications on a generic decision platform: Decision Deck. Case studies and use of an open-source platform
for decision aid.
Group decision: Group decision, elicitation of a group decision model
Preference learning: Eliciting preference model for a decision maker, for several decision makers
Decision making using Multiple Objective Optimisation: Epsilon constraint method, applications,
approximation algorithms, evolutionary algorithms, NSGA II
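To make the epsilon constraint method from the last item concrete, here is a minimal sketch on a small discrete set of alternatives: one objective (cost) is minimised while the other (quality) is turned into a constraint with a lower bound epsilon, and sweeping epsilon traces out efficient alternatives. The data and the names `Alternative`, `epsilon_constraint`, `trace_front` and `OPTIONS` are hypothetical, invented for this illustration; real applications typically solve a mathematical programme at each step instead of enumerating alternatives.

```python
from typing import List, NamedTuple, Optional, Sequence


class Alternative(NamedTuple):
    name: str
    cost: float     # objective to minimise
    quality: float  # objective to maximise, used as the constraint


def epsilon_constraint(
    alternatives: Sequence[Alternative], min_quality: float
) -> Optional[Alternative]:
    """Minimise cost subject to quality >= min_quality (the epsilon bound)."""
    feasible = [a for a in alternatives if a.quality >= min_quality]
    if not feasible:
        return None
    # Break cost ties in favour of higher quality so the result is not dominated.
    return min(feasible, key=lambda a: (a.cost, -a.quality))


def trace_front(
    alternatives: Sequence[Alternative], epsilons: Sequence[float]
) -> List[Alternative]:
    """Sweep epsilon and collect the distinct efficient alternatives found."""
    found: List[Alternative] = []
    for eps in epsilons:
        best = epsilon_constraint(alternatives, eps)
        if best is not None and best not in found:
            found.append(best)
    return found


# Hypothetical data, invented for illustration only.
OPTIONS = [
    Alternative("A", cost=10, quality=3),
    Alternative("B", cost=14, quality=6),
    Alternative("C", cost=15, quality=5),  # dominated by B
    Alternative("D", cost=20, quality=9),
]

front = trace_front(OPTIONS, epsilons=[3, 5, 6, 9])
```

With these values the sweep selects A, B and D; alternative C is never returned because B is both cheaper and of higher quality.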
Assessment breakdown:
Assessment of the homework exercises (10%), Written exam (60%), Project (30%)
University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: II&R
Course name: Introduction to Innovation and Research
Name and email address of the instructors: Marie-Aude Aufaure ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours:
Lectures: 12h
Laboratory: 18h
Project: 30h
Goals:
The objectives of this course are to offer students industrial and research presentations from the main
BI software vendors and clients, as well as from researchers in this domain. The European research context, as well
as intellectual property, incubators and start-up creation, will be presented. Students will also develop a
research project in a collaborative way.
Learning outcomes:
Provide the student with knowledge about intellectual property, incubators and the European research context
(FP7 projects, ICT-labs, etc.)
Various seminars from key BI actors will be presented.
Ability to manage a research project for a client.
Readings and text books:
Scientific papers will be distributed by the course lecturer according to the topics covered.
Prerequisites:
Business Intelligence Seminar (BIS)
Table of contents:
Seminars: innovation and research
Intellectual Property
Presentation of resources such as incubators
The European research context
Research project
Assessment breakdown:
Project evaluation 100%
University: Universitat Politècnica de Catalunya (UPC)
Department: Department of Service and Information System Engineering
Course ID: SOBI
Course name: Service Oriented Business Intelligence
Name and email address of the instructors: Alberto Abello ([email protected])
Web page of the course: To be created
Semester: 3rd
Number of ECTS: 6
Course breakdown and hours:
Lectures: 36 h.
Problems and laboratories: 18 h.
Self-Study: 96 h.
Goals:
This course focuses on developing the students' skills to apply their knowledge to service companies and to
benefit from service facilities to build business intelligence solutions. On the one hand, we will analyze the
specificity of this sector and present techniques to engineer its systems (i.e., Service Oriented Architecture).
On the other hand, we will also analyze to what extent it is possible to consider Data Warehousing
just a service and use these same techniques in its engineering methods.
Learning outcomes:
Upon successful completion of this course, the student is able to:
Knowledge
Understand the specificity of service companies.
Identify Business Intelligence as a service.
Recognise the characteristics and benefits of Infrastructure as a Service.
Recognise the characteristics and benefits of a Service Oriented Architecture (SOA).
Skills
Be able to develop Business Intelligence Service on Platform as a Service.
Be able to use Business Intelligence Software as a Service.
Be able to work on Business Process as a Service and provide tools to analyse such processes.
Readings and text books:
Marie-Aude Aufaure and Esteban Zimanyi, editors. Proceedings of the 1st European Business Intelligence
Summer School, eBISS 2011, LNBIP 96. Springer, 2012.
Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice
Hall, second edition, 2009
Will M. P. van der Aalst: Process Mining: Discovery, Conformance and Enhancement of Business Processes,
Springer, 2011.
Anand Rajaraman, Jeffrey D. Ullman: Mining of Massive Datasets, Cambridge University Press, 2012
Tamer Özsu and Patrick Valduriez: Principles of Distributed Database Systems, Prentice Hall, third edition,
2011
Other Literature:
Scott E. Sampson. Understanding Service Businesses: Applying Principles of Unified Services Theory, 2nd
Edition, John Wiley & Sons, 2001.
Thomas Erl. Service Oriented Architecture, Prentice Hall, 2006.
Mike Papazoglou et al., editors. Service Research Challenges and Solutions for the Future Internet, LNCS
6500, Springer 2010.
David S. Linthicum. Cloud Computing and SOA Convergence in Your Enterprise: A Step-by-Step Guide,
Addison-Wesley, 2009.
Prerequisites:
Business Process Management (BPM)
Advanced Data Warehousing (ADW)
Table of contents:
Introduction to services: definition and specific characteristics
Service Level Agreement and quality management
Availability
Confidentiality
Cloud Computing
IaaS
PaaS
Relational DBMS (Oracle, DB2, etc.)
CloudDB (BigTable)
MapReduce (Hadoop)
SaaS
CRM
BaaS
Service Oriented Architecture
ETL
Situational BI
Service analysis
KPI definition and simulation
Process mining
Assessment breakdown:
30% written examination, 70% problem classes and laboratories
University: Universitat Politècnica de Catalunya
Department: Department of Service and Information System Engineering
Course ID:
Course name: Software Engineering and Business Intelligence Project (SEBIP)
Name and email address of the instructors: Oscar Romero ([email protected])
Web page of the course: To be created
Semester: 3rd
Number of ECTS: 6
Course breakdown and hours:
Lectures: 20 h.
Projects: 120 h.
Self-Study: 4h.
Goals:
This course focuses on developing the students' skills to put their knowledge of software engineering and
databases (as well as specific knowledge of project management introduced in this course) into practice,
with the aim of developing information systems (IS) to support business intelligence (BI) processes within
organizations. The course simulates an environment whose conditions are similar to those of an industrial BI
project. The students are therefore required to work in a team and undertake a project by planning the
project, modeling the BI processes, gathering requirements, analyzing and specifying an IS that meets the project
requirements, designing and testing the system, and documenting the whole process. Eventually, a prototype
is required.
Learning outcomes:
Upon successful completion of this course, the student is able to undertake BI projects and therefore:
Consolidate his / her software engineering, database and BI concepts by putting them into practice,
Correctly identify and analyze the special needs of the project with regard to requirements,
Propose a suitable architecture meeting the requirements,
Successfully develop the project in a well-rounded, disciplined and methodological manner,
Strengthen the student's teamwork skills, such as the ability to reach agreements, play a specific team role
and reuse and continue other teammates' work.
Additionally, the student will gain basic knowledge on project management and agile software development
methods.
Readings and text books:
M. Golfarelli, S. Rizzi, Data Warehouse Design, McGraw Hill, 2009
Prerequisites:
Advanced Data Warehousing (ADW)
Table of contents: This course focuses on undertaking a BI project. The students are therefore expected
to reuse and consolidate the knowledge of databases, software engineering and BI obtained in previous
courses. At the beginning of the course a case study is presented to the students. Specifically, a set of
end-user requirements (both functional and non-functional) is handed to the students. From then on, the
students are expected to develop the whole project to completion through 6 well-defined stages:
Modeling BI processes
Formal specication and analysis
Design (including choosing the appropriate architecture for the system)
Project documentation
Testing
Project defense
The project is expected to be developed in the course laboratories (under supervision of the teacher), and
as teamwork (with no supervision). The laboratories follow a project-based learning (PBL) approach in
which the students are required to play a specific role within their team, undertake team discussions, reach
agreements and champion their decisions. This course also introduces some notions of project management
and agile software development methods (additional material on these will be handed out to support the planning
and development of the project).
In the first session a case study is presented to the students, who are distributed in groups of 3-5 people.
From then on, a checkpoint is held every two weeks with the teacher, who is meant to control the
project development acting as a customer. The project is supposed to be deployed following the SCRUM
method and be controlled by the students themselves. Students within a group are supposed to play the role
they were assigned and collaborate.
One important aspect of this project is deciding on the architecture (e.g., relational or any NoSQL approach)
to be used for the case study. Each group can decide on its own, but it will be asked to defend and
document its decisions, which will be handed to the teacher before the testing and defense phase in the form
of project documentation. In the testing phase the groups compete with one another, and the other
groups together with the teacher act as the jury. The criteria used to assess each project are the functional
and non-functional requirements handed out in the first session, and the ability of the students to relate them to
what other groups present and to raise pros and cons. Students are asked to justify the marks they give to each project
according to such discussions. Finally, each group must defend its ideas in an open forum held
in the last week of the course and draw its own conclusions. Each group must hand in a final group document
discussing whether its approach was correct or whether it made mistakes during any of the development
phases. Alternatively, a personal document can be handed in if no consensus has been achieved within the group
(in any case, a group document is mandatory). Students are expected to show maturity in their conclusions
regarding their ability to meet the case study requirements, to identify difficulties, and to explain how to tackle them
in the future.
Assessment breakdown:
25% project documentation, 25% project marks (average of the marks provided by other groups + teacher),
10% checkpoints with the teacher, 5% work in the lab sessions, 20% contributions in the forum, 15% final
group document (conclusions) + personal conclusions (if delivered)
University: Universitat Politècnica de Catalunya (UPC)
Department: Department of Service and Information System Engineering
Course ID:
Course name: Web Services (WS)
Name and email address of the instructors: Carles Farre ([email protected])
Web page of the course: http://www.fib.upc.edu/en/estudiar-enginyeria-informatica/enginyeries-pla-2003/assignatures/SW.html
Semester: 3rd
Number of ECTS: 6
Course breakdown and hours:
Lectures: 28 h.
Lab sessions: 28 h.
Autonomous work: 77 h.
Exam preparation: 17 h.
Lectures: 2 hours per week. The instructors may present some of the course contents using slides or
other material. Students may also be required to give short presentations about a topic of interest.
Lab sessions: 2 hours per week. After a brief introduction to the tasks to be carried out, students will
perform them on the computer in accordance with a pre-established work plan and a list of objectives. The
extent to which these objectives are achieved will determine the grade awarded for the lab session in question.
Autonomous work: 5.5 hours per week. Some course contents are not presented in class and must be
studied privately by students. Teachers will indicate which contents should be studied and the teaching
resources that may be used. Students may also be asked to prepare problems or lab sessions as well as to
deliver on-line assignments.
Goals:
This course covers the historical development and current status of Web services and investigates how they
are being used in real-world Web applications, such as those built by Amazon, Twitter, Google, Appian,
Salesforce, etc. Other topics will include studying the REST style of developing Web applications as well
as investigating how Web 2.0 concepts relate to the Web services landscape. By the end of the course,
students should be familiar with these concepts and have some experience both with building Web services
and interacting with them programmatically.
Lectures are accompanied by lab sessions, in which students will get hands-on experience with the tech-
nologies required to consume and develop Web Services.
Learning outcomes:
Upon successful completion of this course, the student will have:
Knowledge of the nature, characteristics and types of Web Services
Knowledge of the fundamental technologies that underpin the Web Service paradigm
Knowledge of the key standards necessary for the development of Web Services
Skills for designing and implementing software that interacts with and uses public and private Web Services and
APIs
Skills for designing, implementing, testing, deploying and monitoring Web Services.
Readings and text books:
Gustavo Alonso, Fabio Casati, Harumi Kuno, Vijay Machiraju. Web Services. Concepts, Architectures and
Applications, Springer, 2004.
Robert Daigneau. Service Design Patterns: Fundamental Design Solutions for SOAP/WSDL and RESTful
Web Services. Addison-Wesley Professional, 2011.
Michael P. Papazoglou. Web Services: Principles and Technology, Prentice Hall, 2008.
Leonard Richardson, Sam Ruby. RESTful Web Services, O'Reilly, 2007.
Leon Shklar, Rich Rosen. Web Application Architecture: Principles, Protocols and Practices. Second
Edition, John Wiley & Sons, 2009.
Prerequisites:
XML and Web Technologies
Table of contents:
Introduction
Origins & Precedents: Distributed Systems, Middleware, SOA
Core Web Technologies:
The Fundamentals: URIs. HTTP. Proxies, caches, cookies
Browser-Based Computing: JavaScript, DOM, AJAX
Server-Side Computing: CGI, PHP, Java Servlets
Web Data Exchange Formats: XML, JSON
Core WS Protocols
SOAP and WSDL
RESTful WS
Consuming WS
Implementing WS Clients: Frameworks and libraries. WS Consoles
UDDI
Developing WS
Properties of a service development methodology
Qualities of service development methodology
Web services development lifecycle
Service analysis, design and construction
Design Patterns for Web Service Development
WS Security
General Concepts
Securing RESTful Web Services
XML Security Standards
Securing WS-* Web Services
Processes and Workflows
Business processes
Workflows
Service composition meta-model
Web services orchestration & choreography
The Business Process Execution Language (WS-BPEL)
Assessment breakdown:
30% written examination, 30% presentations and on-line assignments, 40% laboratories
University: Universitat Politècnica de Catalunya (UPC)
Department: Department of Management
Course ID:
Course name: Viability of Business Projects (VBP)
Name and email address of the instructors: Marc Eguiguren ([email protected])
Web page of the course: http://www.fib.upc.edu/en/estudiar-enginyeria-informatica/enginyeries-pla-2003/assignatures/VPE.html
Semester: 3rd
Number of ECTS: 6
Course breakdown and hours:
Lectures: 40 h.
Projects in the classroom: 35 h.
Projects in the classroom: 56 h.
Self-Study: 31 h.
Goals:
University graduates can find themselves in the situation of having to analyse or take on the project of starting
their own business. This is especially true in the case of computer scientists in any field related to Business
Intelligence (BI) or, more generally, in the world of services. There are moments in one's professional career at
which one must be able to assess or judge the suitability of business ventures undertaken or promoted by third
parties, or understand the possibilities a new service has for success. It is for this reason that this subject
focuses on providing students with an understanding of the main techniques used in analysing the viability
of new business ventures: business start-up or the implementation of new projects in the world of services or
in the specific field of BI. This project-oriented, eminently practical subject aims to enable each student
to draft as realistic a business plan as possible.
Learning outcomes:
Upon successful completion of this course, the student is able to:
Knowledge
Understand the world of services as well as the business/company concept and the keys to success in
a BI business start-up.
Evaluate the entrepreneur's role and identify the skills needed to get a business start-up off the ground.
Know the marketing, financial, operational and human elements that make a good business plan out of a BI
idea.
Know the communication skills needed by an entrepreneur to sell a business start-up.
Skills
The ability to set priorities for a new service oriented business and to realistically appraise the business
opportunities.
Ability to draw up a viable business plan in a rational, efficient manner.
Ability to grasp the technical, marketing, and personnel problems involved in starting a business in the
service sector.
Ability to identify and attract financial resources and to convey and defend the plan for a business
start-up.
Readings and text books:
Rhonda Abrams, Eugène Kleiner. The Successful Business Plan. The Planning Shop, 2003.
Rob Cosgrove. Online Backup Guide for Service Providers, Cosgrove, Rob, 2010.
Peter Drucker. Innovation and Entrepreneurship. Butterworth-Heinemann, Classic Drucker Collection edition,
2007.
Robert D. Hisrich, Michael P. Peters, Dean A. Shepherd. Entrepreneurship. McGraw Hill, 6th Ed., 2005.
Mike McKeever. How to Write a Business Plan. Nolo, 2010.
Lawrence W. Tuller. Finance for Non-Financial Managers and Small Business Owners. Adams Business,
2008.
Prerequisites:
Courses on basic economy or business administration foundations are an asset.
Table of contents: This course focuses on developing a BI or services-oriented business plan. The
students are therefore expected to reuse and consolidate any previous knowledge of databases, software engineering
and BI obtained in previous courses to develop a comprehensible, sustainable and profitable business.
The course is structured in 14 well-defined stages:
Introduction to key aspects of business,
The business idea,
Entrepreneurs, their role in society, traits and profile,
Analysis of business opportunities in services, brainstorming techniques,
From the idea to the company, contents of a business plan,
Differential factors and competitors, SWOT analysis,
Market opportunities in the services world and gap analysis,
Distribution of services or BI based services,
Communication and marketing,
Resource requirements, technical aspects
Collaborators and team building,
Sales, profitability and cost analysis,
Sources of funding: venture capital and external resources,
Closing, reviewing and presenting the plan
The business plan is expected to be partially developed in internal activities (under supervision of the
teacher), and in external activities, always as teamwork (with no supervision).
This course also introduces some notions of business administration and finance, specifically:
SWOT analysis, vision, mission and values,
Marketing Mix, a services-oriented approach,
Basic finance for non-financial experts.
Assessment breakdown:
The assessment is based on student presentations and the defence of the business plan before a jury comprising
course faculty members and, optionally, another member of the teaching staff or a guest professional.
The presentation of the plan is the culmination of the team's work and therefore only those plans meeting
certain minimum requirements may be publicly defended in front of the jury. The presentation simulates a
professional setting. Accordingly, the following aspects will also be assessed: dress, formal and well-structured
communication, etc.
In order to be able to publicly defend the business plan, students must have attended at least 50% of the
classes and teams must have delivered on time the activities that have been planned. The plan is the result
of teamwork, which will be reflected in the grade given to the group as a whole. Each member of the group
will be responsible for part of the project and will be graded individually on his or her contribution.
This approach is designed to foster teamwork, in which members share responsibility for attaining a
common objective.
University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: BDASEM
Course name: Big Data Analytics Seminar
Name and email address of the instructors: Volker Markl ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 3
Course breakdown and hours:
Lectures: 30h.
Exercises: 30h.
Goals:
Participants of this seminar will acquire knowledge about recent research results and trends in the analysis
of web-scale data. Through the work in this seminar, students will learn the comprehensive preparation and
presentation of a research topic in this field. To achieve this, students will read and categorise
a scientific paper, conduct background literature research, and present and discuss their findings.
Learning outcomes:
After the course, students will be able to critically read and evaluate scientific publications, and to conduct
background research. They will be capable of preparing and giving oral presentations on research topics
for an expert audience, of analyzing the state of the art of a research topic, and of summarizing it in a scientific
paper. They should also understand techniques used in the scientific community such as peer reviews, conference
presentations, and defenses of findings after their presentation, as well as methods
for large-scale data analytics.
Readings and text books:
At the beginning of the semester students will receive a set of primary literature, which consists of a basic item
for every participant. Students will then learn about presentation techniques and guidelines on how to read
scientific papers. This is extended by learning how to write texts, especially in the context of the English
language. Students should use secondary sources to research the topic assigned to them in the seminar, which
should go beyond the supplied primary literature. In addition to conventional sources such as the internet, students
are required to use research journals and articles published at information management conferences such as
WWW, VLDB, or SIGMOD.
Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice
Hall, second edition, 2009.
Tom White. Hadoop: The Definitive Guide, Third Edition. O'Reilly Media, 2012.
Jimmy Lin, Chris Dyer. Data-Intensive Text Processing with MapReduce. Morgan & Claypool, 2010.
Prerequisites:
Database Systems Architecture (DBSA)
Table of contents:
Both the sciences and industry are currently undergoing a profound transformation: large-scale, diverse data
sets - derived from sensors, the web, or via crowd sourcing - present a huge opportunity for data-driven decision
making. This data poses new challenges in a variety of dimensions: in its unprecedented volume, in the speed
at which it is generated (its velocity) and in the variety of data sources that need to be integrated. A whole
new breed of systems and paradigms is currently being developed to cope with these challenges.
The field of Big Data Analytics deals with the technological means of gaining insights from huge amounts of
data. In this seminar, students will review the current state of the art in this field.
Assessment breakdown:
The grade of the module will be composed from the results of the presentation (50%) and the written seminar
report (50%) for this presentation.
University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: IDBE (0434 L 468)
Course name: Implementation of a Database Engine
Name and email address of the instructors: Volker Markl ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 6
Course breakdown and hours:
Exercises: 30h.
Projects: 30h
Goals:
In this lab course you will learn how to implement components of a database system as described in the IDB
course. You will create a working SQL query processor that can answer a set of basic queries.
Learning outcomes:
Upon successful completion of this course, the student:
Understands methods for efficient processing and optimisation of relational queries
Is capable of implementing locking and concurrency control strategies
Is capable of creating a working query processor
Readings and text books:
Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice
Hall, second edition, 2009.
Prerequisites:
Internals of Database Systems (IDBS)
Knowledge of data modeling, relational algebra, and SQL as well as a very good command of Java, or possibly
C/C++/C#, programming is required to participate in the course.
Table of contents:
Students will be split up into project teams which, under guided self-direction, will get hands-on experience
of implementing components of a database system in a robust and scalable way. The actual components
implemented may vary each year, but will include parsing, query optimiser, execution engine, index structures
and storage system.
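As a rough idea of what an execution engine component does, the sketch below chains three relational operators (scan, selection, projection) in the iterator style, where each operator pulls rows from its child one at a time. It is written in Python for brevity, although the course itself asks for Java or C/C++/C#. It is not the course's project design: the operator names, the `COURSES` table and the query are assumptions made for this illustration, and only the course IDs and ECTS values are taken from this catalogue.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def scan(table: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: produce every row of a stored table."""
    for row in table:
        yield row


def select(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Selection: pass on only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(child: Iterator[Row], columns: List[str]) -> Iterator[Row]:
    """Projection: keep only the requested columns of each row."""
    for row in child:
        yield {column: row[column] for column in columns}


# Small in-memory table used as the storage layer for this sketch.
COURSES = [
    {"id": "VA", "ects": 5, "university": "ECP"},
    {"id": "SOBI", "ects": 6, "university": "UPC"},
    {"id": "BDASEM", "ects": 3, "university": "TUB"},
    {"id": "IDBE", "ects": 6, "university": "TUB"},
]

# Hand-built plan for: SELECT id FROM courses WHERE ects >= 6
plan = project(select(scan(COURSES), lambda row: row["ects"] >= 6), ["id"])
result = list(plan)
```

Because the operators are generators, no intermediate result is materialised; a parser and optimiser would normally build such a plan from SQL text instead of constructing it by hand.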
Assessment breakdown:
The overall grade of the module consists of the results of exam-equivalent assessments (prüfungsäquivalente
Studienleistung). The grade consists of:
Assessment of the homework exercises (10%)
One or two written exams (Klausuren) (40%)
Successful completion of the implementation project (35%)
Presentation/Demonstration of the implementation project (15%)
Successful completion of the homework exercises is a prerequisite for participation in the exams
University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: H&DIS (0434 L 440)
Course name: Heterogeneous and Distributed Information Systems
Name and email address of the instructors: Ralf Kutsche ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 6
Course breakdown and hours:
Lectures: 30h.
Exercises: 30h.
Goals:
In this course the student will gain conceptual, methodological and practical knowledge about the development
and integration of modern distributed, heterogeneous information systems based on the concepts of
model integration, data integration, promotion of information systems and metadata management. This includes
the design of integration and interoperability platforms in the form of appropriate middleware. Web
programming languages, web architectures and services, and methods for (model-based) evolutionary
development will also be taught.
Learning outcomes:
Upon successful completion of this course, the student:
Understands methods for the integration of modern distributed, heterogeneous information systems
Is capable of using web programming languages
Understands concepts of model integration, metadata management and methods for model-based development
Readings and text books:
There is no single comprehensive textbook for this course. Instead, students are required to read research
articles, book chapters and other resources on the various aspects of heterogeneous distributed
information systems: Federated Information Systems, Distribution Architectures, Middleware, Persistency
Management, Software Architecture and Patterns, Metadata and Semantic Concepts, Information Search
and Extraction, and others. Basic articles will be given to the students at the beginning of the course; later
they will receive and discover more in-depth sources.
Prerequisites:
Database Systems Architecture (DBSA)
Knowledge of data modeling, relational algebra, and SQL as well as a very good command of Java, or possibly
C/C++/C#, programming is required to participate in the course.
Table of contents:
Foundations/Terminology of HDIS (FDBS, FIS, MBIS)
Dimensions of HDIS: Distribution, Heterogeneity, Autonomy
Heterogeneous Data Models in HDIS: structured, semistructured, unstructured
Distributed Data Organisation and Software Architectures of HDIS (FIS, P2P, CS, etc.)
Interoperability and Middleware Platforms for HDIS
Persistency Services
Metadata Standards and Management in HDIS
Model-based Development of HDIS
Applications from industry and public services
Assessment breakdown:
The grade is determined by an oral examination. To be admitted to this final exam, a participant must
fulfil all required tasks during the course: seminar work; active participation in home/lab exercises including
final report and presentation.
University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: IMPRO3 (0434 L 483)
Course name: Big Data Analytics Projects
Name and email address of the instructors: Volker Markl ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 9
Course breakdown and hours:
Projects: 60h
Goals:
In this course you will learn to systematically analyze a current issue in the information management area
and to develop and implement a problem-oriented solution as part of a team. You will learn to cooperate as a
team member and to contribute to project organization, quality assurance and documentation. The quality of
your solution has to be proven through analysis, systematic experiments and test cases. Examples of IMPRO
projects carried out in recent semesters are a tool used to analyse Web 2.0 Forum data, an online multiplayer
game for mobile phones, implementation and analysis of new join methods for a cloud computing platform or
the development of data mining operations on the massively parallel system Hadoop as part of the Apache
open source project Mahout.
Learning outcomes:
After the course, students will be able to understand methods for large-scale data analytics and to solve
large-scale data analytics problems. They will be capable of designing and implementing large-scale data
analytics solutions in a collaborative team.
Readings and text books:
Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice
Hall, second edition, 2009.
Anand Rajaraman, Jeffrey D. Ullman, Mining of Massive Datasets, Cambridge 2010.
Prerequisites:
Heterogeneous and Distributed Information Systems (H&DIS)
Table of contents:
Both the sciences and industry are currently undergoing a profound transformation: large-scale, diverse data
sets - derived from sensors, the web, or via crowd sourcing - present a huge opportunity for data-driven decision
making. This data poses new challenges in a variety of dimensions: in its unprecedented volume, in the speed
at which it is generated (its velocity) and in the variety of data sources that need to be integrated. A whole
new breed of systems and paradigms is currently being developed to cope with these challenges.
The field of Big Data Analytics deals with the technological means of gaining insights from huge amounts
of data. Students will conduct projects that deal with applying data mining algorithms to large datasets.
For that, students will learn to use so-called Parallel Processing Platforms, systems that execute parallel
computations with terabytes of data on clusters of up to several thousand machines.
At the start of the project, a student will receive a topic as well as some information material. The
team, with the assistance of the lecturer, will decide on a project environment with the suitable tools for
team work, project communication, development and testing. Next, the problem will have to be analyzed,
modelled and decomposed into individual components, from which tasks are derived that are subsequently
assigned to smaller teams or individuals. At weekly project meetings, the project team presents progress and
milestones that have been reached. In consultation with the lecturer, it is decided which further steps to take.
The project is concluded with a final report, a project poster as well as a final presentation which includes a
demonstration of the prototype.
Assessment breakdown:
The overall grade for the module consists of the results of exam-equivalent course work
(prüfungsäquivalente Studienleistungen, PäS). The following are included in the final grade:
Active participation in the project (10%)
Prototype with test cases (50%)
Documentation (10%)
Final Report (10%)
Project Poster (10%)
Final presentation (10%)