This document provides course descriptions for 4 courses offered at Universite Libre de Bruxelles:
1) Advanced Databases covers recent developments in databases including object-oriented, distributed, and non-traditional data types like spatial and temporal data.
2) Database Systems Architecture examines the implementation of relational databases including query optimization, execution, transaction processing, and concurrency control.
3) Decision Engineering introduces decision theory and models to help decision makers with complex problems involving multiple alternatives, criteria, outcomes, and decision makers.
4) Data Warehousing covers data warehouses for analytical processing of historical data from multiple sources to support analysis and decision making.
Business Intelligence Masters Programme
Erasmus Mundus Master Course
Information Technologies for Business Intelligence
Detailed Course Description
Academic Year 2012-2013

University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: ADB (INFO-H-415)
Course name: Advanced Databases
Name and email address of the instructors: Esteban Zimanyi ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh415
Semester: 1
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h. Exercises: 24h. Projects: 12h.
Goals: Today, databases are moving away from typical management applications and address new application areas. For this, databases must consider (1) recent developments in computer technology, such as the object paradigm and distribution, and (2) the management of new data types such as spatial or temporal data. This course introduces the concepts and techniques of some innovative database applications.
Learning outcomes: At the end of the course students are able to
Understand various different technologies related to database management systems
Understand when to use these technologies according to the requirements of particular applications
Understand different alternative approaches proposed by extant database management systems for each of these technologies
Understand the optimization issues related to particular implementations of these technologies in extant database management systems.
Readings and text books:
R.T. Snodgrass, Developing Time-Oriented Database Applications in SQL, Morgan Kaufmann, 2000
Jim Melton and Alan R. Simon, SQL: 1999 - Understanding Relational Language Components, Morgan Kaufmann, 2001
Jim Melton, Advanced SQL: 1999 - Understanding Object-Relational and Other Advanced Features, Morgan Kaufmann, 2002
Shashi Shekhar and Sanjay Chawla, Spatial Databases: A Tour, Prentice Hall, 2003.
Prerequisites: Knowledge of the basic principles of database management, in particular SQL
Table of contents:
Active Databases: Taxonomy of concepts.
Applications of active databases: integrity maintenance, derived data, replication. Design of active databases: termination, confluence, determinism, modularisation.
Temporal Databases: Temporal data and applications. Time ontology. Conceptual modeling of temporal aspects. Manipulation of temporal data with standard SQL. New temporal extensions in SQL 2011.
Object-Oriented and Object-Relational Databases: Object-oriented model. Object persistence. ODMG standard: Object Definition Language and Object Query Language. Object-relational model. Built-in constructed types. User-defined types. Typed tables. Type and table hierarchies. SQL standard and Oracle implementation.
Spatial Databases: Application domains of Geographical Information Systems (GIS). Common GIS data types and analysis. Conceptual data models for spatial databases. Logical data models for spatial databases: raster model (map algebra), vector model (OGIS/SQL1999). Physical data models for spatial databases: clustering methods (space-filling curves), storage methods (R-tree, grid files).
Assessment breakdown: 75% written examination, 25% project evaluation

University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: DBSA (INFO-H-417)
Course name: Database Systems Architecture
Name and email address of the instructors: Stijn Vansummeren ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh417
Semester: 1
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h. Exercises: 12h. Projects: 24h.
Goals: In contrast to a typical introductory course in database systems where one learns to design and query relational databases, the goal of this course is to get a fundamental insight into the implementation aspects of database systems. In particular, we take a look under the hood of relational database management systems, with a focus on query and transaction processing.
By having an in-depth understanding of the query-optimisation-and-execution pipeline, one becomes more proficient in administering DBMSs and in hand-optimising SQL queries for fast execution.
Learning outcomes: Upon successful completion of this course, the student:
Understands the workflow by which a relational database management system optimises and executes a query
Is capable of hand-optimising SQL queries for faster execution
Understands the I/O model of computation, and is capable of selecting and designing data structures and algorithms that are efficient in this model (both in the context of database systems and in other contexts)
Understands the manner in which relational database management systems provide support for transaction processing, concurrency control, and fault tolerance
Readings and text books:
Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. Database Systems: The Complete Book, Prentice Hall, 2nd Edition, 2008.
Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. McGraw-Hill, 3rd Edition, 2002.
Prerequisites:
Introductory course on relational databases, including SQL and relational algebra
Course on algorithms and data structures
Knowledge of the Java programming language
Table of contents:
Query Processing: With respect to query processing, we study the whole workflow of how a typical relational database management system optimises and executes SQL queries. This entails an in-depth study of: translating the SQL query into a logical query plan; optimising the logical query plan; how each logical operator can be algorithmically implemented on the physical (disk) level, and how secondary-memory index structures can be used to speed up these algorithms; and the translation of the logical query plan into a physical query plan using cost-based plan estimation.
Transaction Processing: Logging. Serializability. Concurrency control.
Assessment breakdown: 75% written examination, 25% project evaluation

University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: DE (MATH-H-405)
Course name: Decision Engineering
Name and email address of the instructors: Yves De Smet ([email protected])
Web page of the course: http://uv.ulb.ac.be (no public website)
Semester: 1
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h. Exercises: 24h. Projects: 12h.
Goals: The goal of this course is to introduce the basics of decision theory. The main aim is to illustrate how mathematical models and specific algorithms can be used to help decision makers facing complex problems (involving a large number of alternatives / multiple criteria / uncertain or risky outcomes / multiple decision makers, ...).
Learning outcomes: Upon successful completion of this course, the student:
Is able to formulate and to solve basic decision problems;
Can identify the properties and limits of common decision models;
Is ready to deepen his/her knowledge in advanced decision sciences courses.
Readings and text books:
C.D. Aliprantis, S.K. Chakrabarti, Games and Decision Making, Oxford University Press, 2000
F.S. Hillier, G.J. Lieberman, Introduction to Operations Research, McGraw Hill, 2005
Ph. Vincke, Multicriteria Decision-Aid, J. Wiley, New York, 1992
Prerequisites:
Linear algebra
Basic course on algorithms
Probability and statistics
Table of contents:
Introduction to decision sciences: The origin of operational research and decision sciences, some introductory examples.
Voting theory: Main voting procedures and properties. Paradoxes. Arrow's theorem.
Multicriteria Decision Aid: Main concepts, introduction to multi-objective optimization, multi-attribute utility theory, outranking methods (ELECTRE & PROMETHEE), applications.
Decision under risk and uncertainty: Common decision criteria: Maxmin, Maxmax, Hurwicz, Savage, Laplace. Expected utility.
Game theory: Classic examples. Nash equilibrium. Cournot duopoly. Median voter theorem.
Assessment breakdown: 75% written examination, 25% project evaluation

University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: DW (INFO-H-419)
Course name: Data Warehousing
Name and email address of the instructors: Toon Calders ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh419
Semester: 1
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h. Exercises: 12h. Projects: 24h.
Goals: Relational and object-oriented databases are mainly suited for operational settings in which there are many small transactions querying and writing to the database. Consistency of the database (in the presence of potentially conflicting transactions) is of utmost importance. The situation is much different in analytical processing, where historical data is analyzed and aggregated in many different ways. Such queries differ significantly from the typical transactional queries in the relational model:
1. Typically, analytical queries touch a larger part of the database and last longer than the transactional queries;
2. Analytical queries involve aggregations (min, max, avg, ...) over large subgroups of the data;
3. When analyzing data it is convenient to see it as multi-dimensional.
For these reasons, data to be analyzed is typically collected into a data warehouse with Online Analytical Processing (OLAP) support. Online here refers to the fact that the answers to the queries should not take too long to be computed. Collecting the data is often referred to as Extract-Transform-Load (ETL). The data in the data warehouse needs to be organized in a way that enables the analytical queries to be executed efficiently. For the relational model, star and snowflake schemas are popular designs.
Next to OLAP on top of a relational database (ROLAP), native OLAP solutions based on multidimensional structures (MOLAP) also exist. In order to further improve query answering efficiency, some query results can already be materialized in the database, and new indexing techniques have been developed.
The first and largest part of the course covers the traditional data warehousing techniques. The main concepts of multidimensional databases are illustrated using the SQL Server tools. The second part of the course consists of advanced topics such as data warehousing appliances, data stream processing, data mining, and spatial-temporal data warehousing. The coverage of these topics connects the data warehousing course with other related courses in the program and serves as an introduction to them. Several associated partners of the program contribute to the course in the form of invited lectures, case studies, and proof of technology sessions.
Learning outcomes: At the end of the course students are able to
Understand the difference between operational databases and data warehouses
Understand the principles of multidimensional modeling
Understand the exploitation of a data warehouse for querying and reporting
Understand best practices and methodologies for data warehouse development
Understand the process of populating a data warehouse from internal and external sources
Readings and text books:
Christian S. Jensen, Torben Bach Pedersen, Christian Thomsen. Multidimensional Databases and Data Warehousing. Morgan and Claypool Publishers, 2010
Kimball, Ralph; Margy Ross, Warren Thornthwaite, Joy Mundy, Bob Becker. The Data Warehouse Lifecycle Toolkit, 2nd ed. Wiley, 2008.
Selected research papers and articles will be offered on the course website
Prerequisites:
A first course on database systems covering the relational model, SQL, entity-relationship modelling, constraints such as functional dependencies and referential integrity, primary keys, foreign keys.
Data structures such as binary search trees, linked lists, multidimensional arrays.
Table of contents: There is a mandatory project to be realized by students in groups of 2 or 3 students. In this project students have to select and analyze a software tool in the context of data warehousing. The product and its capabilities have to be positioned into the larger data warehousing context, and a small demo has to illustrate the capabilities of the tool. The theoretical part of the course is dedicated to topics that allow the students to successfully carry out the project. Below is the table of contents of the theoretical part of the course:
Foundations of multidimensional modelling
Querying and reporting a multidimensional database with OLAP
Methodological aspects for data warehouse development
Populating a data warehouse: The ETL process
Assessment breakdown: 75% written examination, 25% project evaluation

University: Universite Libre de Bruxelles (ULB)
Department: Faculte des Sciences Appliquees
Course ID: BPM (INFO-H-420)
Course name: Business Process Management
Name and email address of the instructors: Toon Calders ([email protected])
Web page of the course: http://cs.ulb.ac.be/public/teaching/infoh420
Semester: 1
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h. Exercises: 12h. Assignments and project: 24h.
Goals: This course introduces basic concepts for modeling and implementing business processes using contemporary information technologies. The first part of the course considers the modeling of business processes, including the control flow, and the data and resource perspectives. Petri nets will be used as a theoretical underpinning to formalize the different workflow patterns and unambiguously define the semantics of the different constructions in the workflow modelling languages.
The workflow languages Yet Another Workflow Language (YAWL) and the Business Process Model and Notation (BPMN) will be introduced in detail, as well as the main characteristics of the Business Process Execution Language (BPEL) for the composition of web services, and Event-driven Process Chains (EPCs).
The second part of the course then goes into the analysis, simulation, verification, and discovery of workflows. Static techniques to verify properties such as soundness and the option-to-complete at model level will be studied, as well as dynamic properties such as the compliance of an event log with respect to a given model. For the discovery of workflows, an overview of the main process mining techniques will be given.
During the course the students have to perform a couple of modelling assignments in YAWL and BPMN. In the final project, students build a prototype system enacting one of the workflows modelled in their modelling assignments.
Affiliated industrial partners of the Erasmus Mundus project will be involved in the course in the form of invited lectures, case studies, and proof of technology sessions. These lectures complement the academic coverage of the topic with a more business-oriented perspective and provide a more complete picture of the Business Process Modelling landscape.
Learning outcomes: At the end of the course students are able to
Understand the value and benefit as well as the limitations of business process management
Understand the business process management life cycle
Model business processes in BPMN and YAWL
Construct a prototype business process in YAWL
Quickly master vendor-specific products in the BPM area
Readings and text books:
Mathias Weske. Business Process Management: Concepts, Languages, Architectures. Springer, 2007.
Arthur H. M. ter Hofstede, Wil M. P. van der Aalst, Michael Adams, Nick Russell (Editors), Modern Business Process Automation: YAWL and its Support Environment. Springer, 2009.
Wil van der Aalst. Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer, 2012.
Prerequisites:
Basic programming skills: variables, control structures such as loops and if-then-else, procedures, object-oriented notions such as classes and objects, ...
Set theory (notions such as set, set operations, sequence, multiset, function) and logic (mathematical notation and argumentation; basic proofs)
Basic graph theory (notions such as graphs, reachability, transitivity, ...)
Experience with modelling languages such as UML and ER diagrams is recommended.
Table of contents: There is a mandatory project, split into several tasks during the whole period of the course offering, to be realized by the students in groups of 2. The theoretical part of the course is dedicated to topics that allow the students to successfully carry out the project. Below is a high-level overview of the theoretical part of the course:
Short overview of enterprise systems architecture and the place of business process management systems in it.
The BPM life cycle.
Modelling business processes: modelling the control flow, data and resource perspective.
Enacting the business process models.
Static and dynamic verification of process models; conformance checking.
Discovering process models and other properties of processes through process mining.
Assessment breakdown: 50% oral examination, 50% project evaluation

University: UFRT
Department: Computer Science Dept.
Course ID: ADW
Course name: Advanced Data Warehousing
Name and email address of the instructors: Dr. Patrick Marcel, [email protected], Dr. Veronika Peralta, [email protected]
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours: Lectures: 22 h. Exercises: 16 h. Project: 12 h.
Goals: The aim of this course is to complement the course Data Warehousing (Semester 1) in its study of database technology used in Business Intelligence.
A particular focus is given to the problems posed by heterogeneous data integration and data quality. Classical notions of data warehousing and OLAP are recalled and developed: architecture, ETL, conceptual and logical design, query processing and optimization. Advanced topics like query personalization and recommendation are introduced.
Learning outcomes: Upon successful completion of this course, the student is able to:
efficiently design, construct and query a data warehouse
define, measure and maintain data quality in the context of data warehousing.
Readings and text books:
Ralph Kimball, Margy Ross, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, Second edition, John Wiley, 2002.
Matteo Golfarelli, Stefano Rizzi, Data Warehouse Design: Modern Principles and Methodologies, second edition, McGraw Hill, 2009.
Maurizio Lenzerini, Panos Vassiliadis, Matthias Jarke, Yannis Vassiliou, Fundamentals of Data Warehouses, Second edition, Springer, 2002.
Carlo Batini, Monica Scannapieco, Data Quality: Concepts, Methodologies and Techniques, Springer, 2006.
Prerequisites: Course DBSA and course DW (Semester 1).
Table of contents:
Introduction
Data warehouse architecture and design
Quality: Quality models, diagnosis, correction, prevention
Loading: Integration, ETL
OLAP models and languages
OLAP implementations, query processing, query optimization
Advanced topics: Query personalization and recommendations
Assessment breakdown: Final written exam (50%), project (50%) assessed by an oral presentation, a demo, and a written report.

University: UFRT
Department: Computer Science Dept.
Course ID: BIS
Course name: Business Intelligence Seminars
Name and email address of the instructors: Dr. Patrick Marcel, [email protected]
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours: Lectures: 36 h. Project: 14 h.
Goals: Thanks to technological advances, the domain of business intelligence is today witnessing increasing diversification and is addressing new application areas. This seminar covers current trends and recent developments in the domain of business intelligence. It also discusses the implications of business intelligence for individuals, organizations, and society in general. The seminar is designed and jointly taught by all consortium partners (whether full or associated partners, academic/research institutions or industrial companies), and will involve guest speakers presenting their organization, research topics, internships, and proposed Master's thesis subjects for the second year of the master.
Learning outcomes: With this module, the student will 1) acquire a good understanding of the state of the art and the next evolutions and challenges in the domain, and 2) learn to synthesize, organize, and present scientific and technical information related to the domain of Business Intelligence, to make it clear to colleagues, and discuss it objectively.
Readings and text books: Research articles and white papers in the domain of Business Intelligence.
Prerequisites: All courses of Semester 1
Table of contents: The topics of the seminar vary from year to year, according to the guest speakers. In addition, students (in groups of at most 2 persons) must write a report and make a one-hour presentation in front of their fellow students on a topic of their choice in the domain of Business Intelligence. The subject must be addressed in a technical way, to explain the underlying technologies. The choice of subject and date of presentation is determined in agreement with the lecturer. The participation of students in presentations by guest speakers and fellow students is required.
Assessment breakdown: A mandatory project is to be realized by students in groups of 2.

University: UFRT
Department: Computer Science Dept.
Course ID: IR
Course name: Information Retrieval
Name and email address of the instructors: Dr. Veronika Peralta, [email protected]
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours: Lectures: 20 h. Exercises: 12 h. Lab: 18 h.
Goals:
To study the problems posed by information retrieval
To study the processing, indexing, querying, organization and classification of textual documents
To acquire a general idea of natural language applications and fundamentals
To acquire some practical skills in natural language applications related to information retrieval
Learning outcomes: Upon successful completion of this course, the student is able to:
Index a document corpus
Design, construct and query a document database
Personalize retrieval results
Show a basic understanding of linguistic modeling
Draw on a first experience with some NLP applications (named entity detection; textual information retrieval; text mining)
Readings and text books:
Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
James F. Allen, Natural Language Understanding. 2nd Edition, Benjamin Cummings, 1995.
Douglas Biber, Variation across Speech and Writing. Cambridge University Press, 1988.
Barbara J. Grosz, Karen Sparck-Jones, Bonnie Lynn Webber (Eds.), Readings in Natural Language Processing. Morgan Kaufmann Publ., 1986.
Xuedong Huang, Alex Acero, Hsiao-Wuen Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Prentice Hall, 2001.
Prerequisites: Courses DBSA and DW (Semester 1). Knowledge of automata and language theory is welcome.
Table of contents:
Introduction: problems, retrieval processes, architecture, evaluation issues
Retrieval models: boolean model, vector space model, probabilistic model
Indexing
Web search
Personalization
Natural Language Processing for Information Retrieval:
General introduction: applications of NLP and linguistic levels of description
Morphology: linguistic modeling (compound words), stemming, lemmatization
Terminology: motivation and applications
Morphology and syntax: POS tagging, named entity detection
Syntax: parsing
Semantic processing for information retrieval: latent semantic analysis
Assessment breakdown: Final written exam (50%), project (50%) assessed by an oral presentation, a demo, and a written report.

University: UFRT
Department: Computer Science Dept.
Course ID: KDDM
Course name: Knowledge Discovery and Data Mining
Name and email address of the instructors: Arnaud Giacometti ([email protected]), Arnaud Soulet ([email protected]) and Haoyuan Li ([email protected])
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours: Lectures: 22 h. Exercises: 16 h. Lab: 12 h.
Goals: The key objectives of this course are twofold: (i) to give students a detailed understanding of the strengths and limitations of popular data mining techniques, and (ii) to understand the problems associated with the computational complexity issues in data mining.
Learning outcomes: Upon successful completion of this course, the student is able to:
Prepare raw input data, and process it appropriately to provide suitable input for a wide range of data mining algorithms.
Understand the theoretical background of the main data mining algorithms.
Critically evaluate and select appropriate data mining algorithms.
Apply data mining algorithms, interpret and report the output appropriately.
More generally, students should be able to actively manage and participate in data mining projects executed by consultants or specialists in data mining.
Readings and text books:
J. Han and M. Kamber (2006), Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers.
P.-N. Tan, M. Steinbach, V. Kumar (2005), Introduction to Data Mining, Addison-Wesley.
D. J. Hand, H. Mannila and P. Smyth (2001), Principles of Data Mining, The MIT Press.
Prerequisites: Attendees must have prior knowledge of databases, algorithms, and data structures (trees, graphs). Some mathematical and statistical background will help.
Table of contents:
Introduction to data mining: the Knowledge Discovery Process, Data Preparation
Classification Methods: Basic Concepts and Decision Trees
Introduction to Artificial Neural Networks
Association Rule based Methods
Model Evaluation: Statistical Tests, ROC Analysis
Association Analysis: Basic Concepts and Algorithms
Sequential Patterns, Data Streams
Clustering: Basic Concepts and Algorithms
Partitioning and Hierarchical Methods
Bayesian Networks: Basic Concepts and Algorithms
Advanced Topics (seminars)
Assessment breakdown: Project (40%) + written final examination (60%)

University: UFRT
Department: Computer Science Dept.
Course ID: XWT
Course name: XML and web technologies
Name and email address of the instructors: Dr Beatrice Bouchou Markhoff, Beatrice.Bouchou@univ-tours.fr
Web page of the course:
Semester: 2
Number of ECTS: 5
Course breakdown and hours: Lectures: 22 h. Exercises: 16 h. Lab: 12 h.
Goals: Understanding of the foundations of the web standard for data management, XML, with associated APIs, schema languages, query languages and transformation languages. Basic knowledge of the most important developments on the web: web services and the semantic web.
Learning outcomes: Upon successful completion of this course, the student is able to:
Design XML documents
Design XML schemas (DTD, XML Schema)
Design integrity constraints for XML documents
Manage XML namespaces
Process XML documents w.r.t. schemas and integrity constraints (e.g. validation)
Query XML data with XPath and XQuery
Transform XML data with XSLT
Use XML databases, either with Oracle or eXist
Discover how to express ontologies, create semantic annotations and express queries on semantic data (RDF, RDFS, OWL, SPARQL)
Discover how to design, modify and publish web services (SOAP, WSDL, UDDI)
Readings and text books:
Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset and Pierre Senellart. Web Data Management and Distribution. Oxford Press, 2012.
Anders Møller and Michael I. Schwartzbach. An Introduction to XML and Web Technologies. Addison-Wesley, 2006.
Prerequisites: Attendees must have prior knowledge of algorithms and data structures (trees and graphs), language theory (finite state automata), first-order logic and AI
Table of contents:
Introduction to semi-structured data and XML: XML document structure, infoset
Schema languages and validation process: DTD, XML Schema, (bottom-up unranked) tree automata
Navigating XML trees and integrity constraints: XPath; integrity constraints for XML
Querying (and transforming) XML documents, XML databases: XQuery and XSLT; XML databases
Programming with XML: API SAX; API DOM
Introduction to Semantic Web technologies: Semantic annotations, RDF, SPARQL; concepts of ontology, RDF Schema and OWL
Introduction to Web Service technologies: Web service description with SOAP and WSDL, and publication with UDDI
Assessment breakdown: Written exam (60%) and Project (40%, oral presentation and written report)

University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: VA
Course name: Visual Analytics
Name and email address of the instructors: Anastasia Bezerianos ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h. Laboratory: 24h. Project: 12h.
Goals: This course aims to help students understand the emerging, multidisciplinary field of Visual Analytics (VA), to familiarise them with current VA technology, and to help them gain the foundations for building visual analytics tools and systems using real-world data.
Learning outcomes: Upon completing the course, students will be able to:
Understand basic concepts, theories and methodologies of Visual Analytics
Analyse data using appropriate visual thinking and visual analytics techniques
Present data using appropriate visual communication and graphical methods
Design and implement a Visual Analytics system for supporting decision making
Readings and text books:
Edward Tufte, Envisioning Information, Graphics Press, 1990.
Robert Spence, Information Visualisation: Design for Interaction, Second Edition, Prentice Hall, 2007.
Colin Ware, Information Visualisation: Perception for Design, Second Edition, Morgan Kaufmann, 2004.
The course instructor will provide required weekly readings in the form of scientific articles, and will recommend further reading on each topic.
Prerequisites:
Advanced Data Warehousing (ADW)
Knowledge Discovery and Data Mining (KD&DM)
Table of contents:
VA fundamentals: Theories, methodologies and techniques
Designing interactive graphics
Appropriate methods for different data types: Graphs, Hierarchies, Spatio-temporal data, High dimensional data
VA system design practices
Dashboard design
Assessment breakdown: 30% class participation, 70% project (10% proposal, 20% intermediate, 40% final)

University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: CS&SW
Course name: Corporate Semantics and Semantic Web
Name and email address of the instructors: Marie-Aude Aufaure ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h. Laboratory: 24h. Project: 12h.
Goals: This course aims at presenting semantic technologies and the benefits of using them in companies for various applications such as semantic information search, recommendation, question answering systems, and search-based applications.
Learning outcomes:
Provide the student with a deep understanding of Semantic Web technologies.
Ability to understand how to use semantic technologies for corporate applications, with a special emphasis on the integration of unstructured content with enterprise structured data.
Be able to build an ontology and use it for a specific application.
Readings and text books:
Pascal Hitzler, Markus Krotzsch, Sebastian Rudolph, Foundations of Semantic Web Technologies, Chapman & Hall/CRC, 2009.
Steffen Staab, Rudi Studer, Handbook on Ontologies, Springer, Second Edition, 2009.
John Davies, Rudi Studer, Paul Warren, Semantic Web Technologies: Trends and Research in Ontology-based Systems, Wiley, 2006 (published online)
David Taniar, Johanna Wenny Rahayu, Web Semantics & Ontology, Idea Group Publishing, 2006.
Prerequisites: XML and Web Technologies (X&WT), Information Retrieval (IR)
Table of contents:
The Semantic Web stack (RDF, RDFS, OWL, SKOS, SPARQL)
Ontology learning and life-cycle
Linked Data
Adding semantics to corporate data (triple stores, ontologies and OLAP, ontologies and databases)
Applications: semantic search, question answering, recommendation, social networks
Assessment breakdown: 50% written examination + 50% project evaluation

University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: DM&ML
Course name: Data Mining and Machine Learning
Name and email address of the instructors: Etienne Cuvelier ([email protected]), Antoine Cornuejols ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h, Laboratory: 24h, Project: 27h
Goals: The goals of this course are to allow students to discover and practise advanced techniques of data mining and machine learning: their principles, but also their variants, their applications and their weaknesses.
Learning outcomes: Upon successful completion of this course, the student will be able to:
choose the best techniques to solve a given data mining or machine learning task,
tune the parameters of the chosen technique,
interpret the results of the chosen technique.
Readings and text books:
David J. Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining, The MIT Press, 2001.
Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2009.
Luis Torgo, Data Mining with R: Learning with Case Studies, Chapman & Hall/CRC, 2010.
Prerequisites: Knowledge Discovery and Data Mining (KD&DM)
Table of contents:
Data Mining
Principal component analysis, factorial analysis.
Advanced clustering: spectral algorithms, Galois lattices.
Regressions.
Analysis of non-tabular data types (graphs, symbolic, functional).
Problems and methods of Machine Learning
Formalization of the induction problem. Observations, hypotheses, performance criterion.
Linear models.
Generalization in kernel methods: SVM.
Deep neural networks.
Ensemble methods: boosting.
Structured data learning: relational methods.
New problems: on-line learning, multi-task learning.
Assessment breakdown: 50% written examination + 50% project evaluation

University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: DM
Course name: Decision Modelling
Name and email address of the instructors: Vincent Mousseau ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours: Lectures: 24h, Laboratory: 24h, Project: 12h
Goals: This course presents classical decision models with a special emphasis on decision making in uncertain situations, decisions with multiple attributes, and decisions with multiple stakeholders. During the course, various applications will be presented, emphasizing the practical interest and applicability of the models in real-world decision situations.
Learning outcomes:
Provide the student with decision models and a better understanding of the validity of these decision models.
Ability to understand the three levels of decision analysis: representation of observed decision behaviour (descriptive analysis), decision aiding and recommendation (prescriptive analysis), and the design of artificial decision agents (normative analysis).
Readings and text books:
Denis Bouyssou, Thierry Marchant, Marc Pirlot, Alexis Tsoukias, Philippe Vincke, Evaluation and Decision Models with Multiple Criteria: Stepping Stones for the Analyst, Springer, International Series in Operations Research and Management Science, Volume 86, 2006.
William W. Cooper, Lawrence M. Seiford, and Kaoru Tone, Introduction to Data Envelopment Analysis and Its Uses, Springer, 2006.
Prerequisites: Decision Engineering (DE)
Table of contents:
Data Envelopment Analysis: analysis of the efficiency of production units
Decision under uncertainty, decision trees: theory, modelling and applications
Behavioural decision analysis: empirical analysis of decision behaviour, cognitive decision biases, prospect theory
Outranking methods (theory and applications): presentation of the Electre methods (Electre I, Electre III, Electre Tri), reference-based ranking
Applications on a generic decision platform: Decision Deck. Case studies and use of an open-source platform for decision aid.
Group decision: group decision, elicitation of a group decision model
Preference learning: eliciting a preference model for one decision maker, for several decision makers
Decision making using multiple objective optimisation: epsilon-constraint method, applications, approximation algorithms, evolutionary algorithms, NSGA-II
Assessment breakdown: Assessment of the homework exercises (10%), Written exam (60%), Project (30%)

University: Ecole Centrale Paris (ECP)
Department: Computer Science Department
Course ID: II&R
Course name: Introduction to Innovation and Research
Name and email address of the instructors: Marie-Aude Aufaure ([email protected])
Web page of the course: (to be created)
Semester: 3
Number of ECTS: 5
Course breakdown and hours: Lectures: 12h, Laboratory: 18h, Project: 30h
Goals: The objectives of this course are to provide students with industrial and research presentations from the main BI software vendors and clients as well as from researchers in this domain. The European research context as well as intellectual property, incubators and start-up creation will be presented. Students will also develop a research project in a collaborative way.
Learning outcomes:
Provide the student with knowledge about intellectual property, incubators and the European research context (FP7 projects, ICT Labs, etc.). Various seminars from key BI actors will be presented.
Ability to manage a research project for a client.
Readings and text books: Scientific papers will be distributed by the course lecturer according to the topics covered.
Prerequisites: Business Intelligence Seminar (BIS)
Table of contents:
Seminars: innovation and research
Intellectual property
Presentation of resources such as incubators
The European research context
Research project
Assessment breakdown: Project evaluation 100%

University: Universitat Politècnica de Catalunya (UPC)
Department: Department of Service and Information System Engineering
Course ID: SOBI
Course name: Service Oriented Business Intelligence
Name and email address of the instructors: Alberto Abello ([email protected])
Web page of the course: To be created
Semester: 3rd
Number of ECTS: 6
Course breakdown and hours: Lectures: 36 h. Problems and laboratories: 18 h. Self-Study: 96 h.
Goals: This course focuses on developing students' skills in applying their knowledge of service companies and in benefiting from service facilities to build business intelligence solutions. On the one hand, we will analyze the specificity of this sector and present techniques for engineering its systems (i.e., Service Oriented Architecture). On the other hand, we will also analyze to what extent Data Warehousing can be considered just a service, so that these same techniques can be used in its engineering methods.
Learning outcomes: Upon successful completion of this course, the student is able to:
Knowledge
Understand the specificity of service companies.
Identify Business Intelligence as a service.
Recognise the characteristics and benefits of Infrastructure as a Service.
Recognise the characteristics and benefits of a Service Oriented Architecture (SOA).
Skills
Develop a Business Intelligence service on Platform as a Service.
Use Business Intelligence Software as a Service.
Work on Business Process as a Service and provide tools to analyse such processes.
Readings and text books:
Marie-Aude Aufaure and Esteban Zimányi, editors. Proceedings of the 1st European Business Intelligence Summer School, eBISS 2011, LNBIP 96. Springer, 2012.
Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice Hall, second edition, 2009.
Wil M. P. van der Aalst: Process Mining: Discovery, Conformance and Enhancement of Business Processes, Springer, 2011.
Anand Rajaraman, Jeffrey D. Ullman: Mining of Massive Datasets, Cambridge University Press, 2012.
Tamer
Özsu and Patrick Valduriez: Principles of Distributed Database Systems, Prentice Hall, third edition, 2011.
Other literature:
Scott E. Sampson. Understanding Service Businesses: Applying Principles of Unified Services Theory, 2nd Edition, John Wiley & Sons, 2001.
Thomas Erl. Service Oriented Architecture, Prentice Hall, 2006.
Mike Papazoglou et al., editors. Service Research Challenges and Solutions for the Future Internet, LNCS 6500, Springer, 2010.
David S. Linthicum. Cloud Computing and SOA Convergence in Your Enterprise: A Step-by-Step Guide, Addison-Wesley, 2009.
Prerequisites: Business Process Management (BPM), Advanced Data Warehousing (ADW)
Table of contents:
Introduction to services: definition and specific characteristics
Service Level Agreement and quality management
Availability
Confidentiality
Cloud Computing
IaaS
PaaS
Relational DBMS (Oracle, DB2, etc.)
CloudDB (BigTable)
MapReduce (Hadoop)
SaaS
CRM
BaaS
Service Oriented Architecture
ETL
Situational BI
Service analysis
KPI definition and simulation
Process mining
Assessment breakdown: 30% written examination, 70% problem classes and laboratories

University: Universitat Politècnica de Catalunya
Department: Department of Service and Information System Engineering
Course ID:
Course name: Software Engineering and Business Intelligence Project (SEBIP)
Name and email address of the instructors: Oscar Romero ([email protected])
Web page of the course: To be created
Semester: 3rd
Number of ECTS: 6
Course breakdown and hours: Lectures: 20 h. Projects: 120 h. Self-Study: 4 h.
Goals: This course focuses on developing students' skills in putting their knowledge of software engineering and databases (as well as specific knowledge of project management introduced in this course) into practice, with the aim of developing information systems (IS) to support business intelligence (BI) processes within organizations. The course simulates an environment whose conditions are similar to those of an industrial BI project.
Accordingly, the students are required to work in a team and undertake a project by planning it, modelling the BI processes, gathering requirements, analysing and specifying an IS that meets the project requirements, designing and testing the system, and documenting the whole process. Finally, a prototype is required.
Learning outcomes: Upon successful completion of this course, the student is able to undertake BI projects and therefore:
Consolidate his/her software engineering, database and BI concepts by putting them into practice,
Correctly identify and analyze the special needs of the project with regard to requirements,
Propose a suitable architecture meeting the requirements,
Successfully develop the project in a well-rounded, disciplined and methodological manner,
Strengthen his/her teamwork skills, such as the ability to reach agreements, play a specific team role, and reuse and continue other teammates' work.
Additionally, the student will gain basic knowledge of project management and agile software development methods.
Readings and text books: M. Golfarelli, S. Rizzi, Data Warehouse Design, McGraw Hill, 2009.
Prerequisites: Advanced Data Warehousing (ADW)
Table of contents: This course focuses on undertaking a BI project. The students are therefore expected to reuse and consolidate the knowledge of databases, software engineering and BI obtained in previous courses. At the beginning of the course a case study is presented to the students. Specifically, a set of end-user requirements (both functional and non-functional) is handed to the students.
From here on, the students are expected to develop the whole project to completion through 6 well-defined stages:
Modeling BI processes
Formal specification and analysis
Design (including choosing the appropriate architecture for the system)
Project documentation
Testing
Project defense
The project is expected to be developed in the course laboratories (under supervision of the teacher) and as teamwork (with no supervision). The laboratories follow a project-based learning (PBL) approach in which the students are required to play a specific role within their team, undertake team discussions, reach agreements and champion their decisions. This course also introduces some notions of project management and agile software development methods (additional material on these will be handed out to support the planning and development of the project).
In the first session a case study is presented to the students, who are distributed in groups of 3-5 people. From there on, every two weeks a checkpoint is carried out with the teacher, who oversees the project development, acting as a customer. The project is supposed to be carried out following the Scrum method and be controlled by the students themselves. Students within a group are supposed to play the role they were assigned and collaborate. One important aspect of this project is deciding the architecture (e.g., relational or any NoSQL approach) to be adopted for the case study. Each group can decide on its own, but it will be asked to defend and document its decisions, which will be handed to the teacher before the testing and defense phase in the form of project documentation. In the testing phase the groups compete against each other, and the other groups together with the teacher act as the jury.
The criteria used to evaluate each project are the functional and non-functional requirements handed out in the first session, and the ability of the users to relate them to what other groups present and to raise pros and cons. Students are asked to justify the marks they give to each project according to such discussions. Finally, each group must defend its ideas in an open forum held in the last week of the course and draw its own conclusions. The group must hand in a final document discussing whether its approach was correct or whether mistakes were made during any of the development phases. In addition, a personal document can be handed in if no consensus has been achieved within the group (in any case, a group document is mandatory). Students are expected to show maturity in their conclusions regarding their ability to meet the case study requirements, to identify difficulties, and to explain how to tackle them in the future.
Assessment breakdown: 25% project documentation, 25% project marks (average of the marks provided by other groups + teacher), 10% checkpoints with the teacher, 5% work in the lab sessions, 20% contributions in the forum, 15% final group document (conclusions) + personal conclusions (if delivered)

University: Universitat Politècnica de Catalunya (UPC)
Department: Department of Service and Information System Engineering
Course ID:
Course name: Web Services (WS)
Name and email address of the instructors: Carles Farre ([email protected])
Web page of the course: http://www.fib.upc.edu/en/estudiar-enginyeria-informatica/enginyeries-pla-2003/assignatures/SW.html
Semester: 3rd
Number of ECTS: 6
Course breakdown and hours: Lectures: 28 h. Lab sessions: 28 h. Autonomous work: 77 h. Exam preparation: 17 h.
Lectures: 2 hours per week. The instructors may present some contents of the course using slides or other material. Students may also be required to give short presentations about a topic of interest.
Lab sessions: 2 hours per week.
After a brief introduction to the tasks to be carried out, students will perform them on the computer in accordance with a pre-established work plan and a list of objectives. The extent to which these objectives are achieved will determine the grade awarded for the lab session in question.
Autonomous work: 5.5 hours per week. Some course contents are not presented in class and must be studied privately by students. Teachers will indicate which contents should be studied and the teaching resources that may be employed. Students may also be asked to prepare problems or lab sessions as well as to deliver on-line assignments.
Goals: This course covers the historical development and current status of Web services and investigates how they are being used in real-world Web applications, such as those built by Amazon, Twitter, Google, Appian, Salesforce, and others. Other topics include the REST style of developing Web applications and how Web 2.0 concepts relate to the Web services landscape. By the end of the course, students should be familiar with these concepts and have some experience both with building Web services and with interacting with them programmatically. Lectures are accompanied by lab sessions, in which students will get hands-on experience with the technologies required to consume and develop Web services.
Learning outcomes: Upon successful completion of this course, the student has:
Knowledge of the nature, characteristics and types of Web services
Knowledge of the fundamental technologies that underpin the Web service paradigm
Knowledge of the key standards necessary for the development of Web services
Skills for designing and implementing software to interact with and use public and private Web services and APIs
Skills for designing, implementing, testing, deploying and monitoring Web services
Readings and text books:
Gustavo Alonso, Fabio Casati, Harumi Kuno, Vijay Machiraju. Web Services.
Concepts, Architectures and Applications, Springer, 2004.
Robert Daigneau. Service Design Patterns: Fundamental Design Solutions for SOAP/WSDL and RESTful Web Services. Addison-Wesley Professional, 2011.
Michael P. Papazoglou. Web Services: Principles and Technology, Prentice Hall, 2008.
Leonard Richardson, Sam Ruby. RESTful Web Services, O'Reilly, 2007.
Leon Shklar, Rich Rosen. Web Application Architecture: Principles, Protocols and Practices. Second Edition, John Wiley & Sons, 2009.
Prerequisites: XML and Web Technologies
Table of contents:
Introduction
  Origins & Precedents: Distributed Systems. Middleware, SOA
Core Web Technologies
  The Fundamentals: URIs. HTTP. Proxies, caches, cookies
  Browser-Based Computing: JavaScript, DOM, AJAX
  Server-Side Computing: CGI, PHP, Java Servlets
  Web Data Exchange Formats: XML, JSON
Core WS Protocols
  SOAP and WSDL
  RESTful WS
Consuming WS
  Implementing WS Clients: Frameworks and libraries. WS Consoles
  UDDI
Developing WS
  Properties of a service development methodology
  Qualities of service development methodology
  Web services development lifecycle
  Service analysis, design and construction
  Design Patterns for Web Service Development
WS Security
  General Concepts
  Securing RESTful Web Services
  XML Security Standards
  Securing WS-* Web Services
Processes and Workflows
  Business processes
  Workflows
  Service composition meta-model
  Web services orchestration & choreography
  The Business Process Execution Language (WS-BPEL)
Assessment breakdown: 30% written examination, 30% presentations and on-line assignments, 40% laboratories

University: Universitat Politècnica de Catalunya (UPC)
Department: Department of Management
Course ID:
Course name: Viability of Business Projects (VBP)
Name and email address of the instructors: Marc Eguiguren ([email protected])
Web page of the course: http://www.fib.upc.edu/en/estudiar-enginyeria-informatica/enginyeries-pla-2003/assignatures/VPE.html
Semester: 3rd
Number of ECTS: 6
Course
breakdown and hours: Lectures: 40 h. Projects in the classroom: 35 h. Projects in the classroom: 56 h. Self-Study: 31 h.
Goals: University graduates can find themselves having to analyse or take on the project of starting their own business. This is especially true of computer scientists in any field related to Business Intelligence (BI) or, more generally, in the world of services. There are moments in one's professional career at which one must be able to assess the suitability of business ventures undertaken or promoted by third parties, or understand the chances of success of a new service. For this reason, this subject focuses on providing students with an understanding of the main techniques used in analysing the viability of new business ventures: business start-ups or the implementation of new projects in the world of services or in the specific field of BI. This project-oriented, eminently practical subject aims to enable each student to draft as realistic a business plan as possible.
Learning outcomes: Upon successful completion of this course, the student will have acquired:
Knowledge
Understanding of the world of services as well as the business/company concept and the keys to success in a BI business start-up.
Appreciation of the entrepreneur's role and of the skills needed to get a business start-up off the ground.
The marketing, financial, operational and human elements that make a good business plan out of a BI idea.
The communication skills needed by an entrepreneur to sell a business start-up.
Skills
The ability to set priorities for a new service-oriented business and to realistically appraise the business opportunities.
Ability to draw up a viable business plan in a rational, efficient manner.
Ability to grasp the technical, marketing, and personnel problems involved in starting a business in the service sector.
Ability to identify and attract financial resources and to convey and defend the plan for a business start-up.
Readings and text books:
Rhonda Abrams, Eugène Kleiner. The Successful Business Plan. The Planning Shop, 2003.
Rob Cosgrove. Online Backup Guide for Service Providers, Cosgrove, Rob, 2010.
Peter Drucker. Innovation and Entrepreneurship. Butterworth-Heinemann, Classic Drucker Collection edition, 2007.
Robert D. Hisrich, Michael P. Peters, Dean A. Shepherd. Entrepreneurship. McGraw Hill, 6th Ed., 2005.
Mike McKeever. How to Write a Business Plan. Nolo, 2010.
Lawrence W. Tuller. Finance for Non-Financial Managers and Small Business Owners. Adams Business, 2008.
Prerequisites: Courses on basic economics or business administration foundations are an asset.
Table of contents: This course focuses on developing a BI or services-oriented business plan. The students are therefore expected to reuse and consolidate any knowledge of databases, software engineering and BI obtained in previous courses to develop a comprehensible, sustainable and profitable business. The course is structured in 14 well-defined stages:
Introduction to key aspects of business,
The business idea,
Entrepreneurs, their role in society, traits and profile,
Analysis of business opportunities in services, brainstorming techniques,
From the idea to the company, contents of a business plan,
Differential factors and competitors, SWOT analysis,
Market opportunities in the services world and gap analysis,
Distribution of services or BI-based services,
Communication and marketing,
Resource requirements, technical aspects,
Collaborators and team building,
Sales, profitability and cost analysis,
Sources of funding: venture capital and external resources,
Closing, reviewing and presenting the plan.
The business plan is expected to be developed partly in internal activities (under supervision of the teacher) and partly in external activities, always as teamwork (with no supervision).
This course also introduces some notions of business administration and finance, specifically:
SWOT analysis,
Vision, mission and values,
Marketing Mix, a services-oriented approach,
Basic finance for non-financial experts.
Assessment breakdown: The assessment is based on student presentations and the defence of the business plan before a jury comprising course faculty members and, optionally, another member of the teaching staff or a guest professional. The presentation of the plan is the culmination of the team's work and therefore only those plans meeting certain minimum requirements may be publicly defended in front of the jury. The presentation simulates a professional setting. Accordingly, the following aspects will also be assessed: dress, formal, well-structured communication, etc. In order to be able to publicly defend the business plan, students must have attended at least 50% of the classes and teams must have delivered on time the activities that have been planned. The plan is the result of teamwork, which will be reflected in the grade given to the group as a whole. Each member of the group will be responsible for part of the project and will be graded individually on his or her contribution. This approach is designed to foster teamwork, in which members share responsibility for attaining a common objective.

University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: BDASEM
Course name: Big Data Analytics Seminar
Name and email address of the instructors: Volker Markl ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 3
Course breakdown and hours: Lectures: 30h. Exercises: 30h.
Goals: Participants of this seminar will acquire knowledge about recent research results and trends in the analysis of web-scale data.
Through the work in this seminar, students will learn how to comprehensively prepare and present a research topic in this field. To achieve this, students will read and categorise a scientific paper, conduct background literature research, and present and discuss their findings.
Learning outcomes: After the course, students will be able to critically read and evaluate scientific publications, and to conduct background research. They will be capable of preparing and giving oral presentations on research topics for an expert audience, of analyzing the state of the art of a research topic, and of summarizing it in a scientific paper. They should also understand practices of the scientific community such as peer review, conference presentations, and the defense of findings after their presentation, as well as methods for large-scale data analytics.
Readings and text books: At the beginning of the semester students will receive a set of primary literature, which consists of a basic item for every participant. Students will then learn about presentation techniques and guidelines on how to read scientific papers. This is extended by learning how to write texts, especially in English. Students should use secondary sources to research the topic assigned to them in the seminar, going beyond the supplied primary literature. In addition to conventional sources such as the internet, students are required to use research journals and articles published at information management conferences such as WWW, VLDB, or SIGMOD.
Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice Hall, second edition, 2009.
Tom White. Hadoop: The Definitive Guide, Third Edition. O'Reilly Media, 2012.
Jimmy Lin, Chris Dyer. Data-Intensive Text Processing with MapReduce. Morgan & Claypool, 2010.
Prerequisites: Database Systems Architecture (DBSA)
Table of contents: Both the sciences and industry are currently undergoing a profound transformation: large-scale, diverse data sets - derived from sensors, the web, or via crowd sourcing - present a huge opportunity for data-driven decision making. This data poses new challenges in a variety of dimensions: in its unprecedented volume, in the speed at which it is generated (its velocity) and in the variety of data sources that need to be integrated. A whole new breed of systems and paradigms is currently being developed to cope with these challenges. The field of Big Data Analytics deals with the technological means of gaining insights from huge amounts of data. In this seminar, students will review the current state of the art in this field.
Assessment breakdown: The grade of the module is composed of the results of the presentation (50%) and the written seminar report (50%) on this presentation.

University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: IDBE (0434 L 468)
Course name: Implementation of a Database Engine
Name and email address of the instructors: Volker Markl ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 6
Course breakdown and hours: Exercises: 30h. Projects: 30h.
Goals: In this lab course you will learn how to implement components of a database system as described in the IDB course. You will create a working SQL query processor that can answer a set of basic queries.
Learning outcomes: Upon successful completion of this course, the student:
Understands methods for efficient processing and optimisation of relational queries
Is capable of implementing locking and concurrency control strategies
Is capable of creating a working query processor
Readings and text books: Hector Garcia-Molina, Jeffrey D.
Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice Hall, second edition, 2009.
Prerequisites: Internals of Database Systems (IDBS). Knowledge of data modeling, relational algebra, and SQL as well as a very good command of Java, or possibly C/C++/C#, programming is required to participate in the course.
Table of contents: Students will be split into project teams which, under guided self-management, will get hands-on experience in implementing components of a database system in a robust and scalable way. The actual components implemented may vary each year, but will include the parser, query optimiser, execution engine, index structures and storage system.
Assessment breakdown: The overall grade of the module consists of the results of exam-equivalent assessments (prüfungsäquivalente Studienleistung). The grade consists of:
Assessment of the homework exercises (10%)
One or two written exams (Klausuren) (40%)
Successful completion of the implementation project (35%)
Presentation/demonstration of the implementation project (15%)
Successful completion of the homework exercises is a prerequisite for participation in the exams.

University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: H&DIS (0434 L 440)
Course name: Heterogeneous and Distributed Information Systems
Name and email address of the instructors: Ralf Kutsche ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 6
Course breakdown and hours: Lectures: 30h. Exercises: 30h.
Goals: In this course the student will gain conceptual, methodological and practical knowledge about the development and integration of modern distributed, heterogeneous information systems based on the concepts of model integration, data integration, promotion of information systems and metadata management.
This includes the design of integration and interoperability platforms in the form of appropriate middleware. Web programming languages, web architectures and services, and methods for (model-based) evolutionary development will also be taught.
Learning outcomes: Upon successful completion of this course, the student:
Understands methods for the integration of modern distributed, heterogeneous information systems
Is capable of working with web programming languages
Understands concepts of model integration, metadata management and methods for model-based development
Readings and text books: There is no single comprehensive textbook for this course. Instead, students are required to read research articles, book chapters and other resources on the various aspects of heterogeneous distributed information systems: Federated Information Systems, Distribution Architectures, Middleware, Persistency Management, Software Architecture and Patterns, Metadata and Semantic Concepts, Information Search and Extraction, and others. Basic articles will be given to the students at the beginning of the course; later they will receive and discover more in-depth sources.
Prerequisites: Database Systems Architecture (DBSA). Knowledge of data modeling, relational algebra, and SQL as well as a very good command of Java, or possibly C/C++/C#, programming is required to participate in the course.
Table of contents:
Foundations/Terminology of HDIS (FDBS, FIS, MBIS)
Dimensions of HDIS: Distribution, Heterogeneity, Autonomy
Heterogeneous Data Models in HDIS: structured, semistructured, unstructured
Distributed Data Organisation and Software Architectures of HDIS (FIS, P2P, CS, etc.)
Interoperability and Middleware Platforms for HDIS
Persistency Services
Metadata Standards and Management in HDIS
Model-based Development of HDIS
Applications from industry and public services
Assessment breakdown: The grade will be determined by an oral examination.
To be admitted to this final exam, a participant must fulfill all required tasks during the course: seminar work; active participation in home/lab exercises, including a final report and presentation.

University: Technische Universität Berlin (TUB)
Department: School of Electrical Engineering and Computer Science
Course ID: IMPRO3 (0434 L 483)
Course name: Big Data Analytics Projects
Name and email address of the instructors: Volker Markl ([email protected])
Web page of the course: http://www.dima.cs.tu-berlin.de
Semester: 3
Number of ECTS: 9
Course breakdown and hours: Projects: 60h.
Goals: In this course you will learn to systematically analyze a current issue in the information management area and to develop and implement a problem-oriented solution as part of a team. You will learn to cooperate as a team member and to contribute to project organization, quality assurance and documentation. The quality of your solution has to be demonstrated through analysis, systematic experiments and test cases. Examples of IMPRO projects carried out in recent semesters are a tool used to analyse Web 2.0 forum data, an online multiplayer game for mobile phones, the implementation and analysis of new join methods for a cloud computing platform, and the development of data mining operations on the massively parallel system Hadoop as part of the Apache open source project Mahout.
Learning outcomes: After the course, students will be able to understand methods for large-scale data analytics and to solve large-scale data analytics problems. They will be capable of designing and implementing large-scale data analytics solutions in a collaborative team.
Readings and text books:
Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom: Database Systems: The Complete Book, Prentice Hall, second edition, 2009.
Anand Rajaraman, Jeffrey D. Ullman: Mining of Massive Datasets, Cambridge 2010.
Prerequisites: Heterogeneous and Distributed Information Systems (H&DIS)
Table of contents: Both the sciences and industry are currently undergoing a profound transformation: large-scale, diverse data sets - derived from sensors, the web, or via crowd sourcing - present a huge opportunity for data-driven decision making. This data poses new challenges in a variety of dimensions: in its unprecedented volume, in the speed at which it is generated (its velocity) and in the variety of data sources that need to be integrated. A whole new breed of systems and paradigms is currently being developed to cope with these challenges. The field of Big Data Analytics deals with the technological means of gaining insights from huge amounts of data.
Students will conduct projects that deal with applying data mining algorithms to large datasets. For that, students will learn to use so-called Parallel Processing Platforms, systems that execute parallel computations with terabytes of data on clusters of up to several thousand machines.
At the start of the project, students will receive a topic as well as some information material. The team, with the assistance of the lecturer, will decide on a project environment with suitable tools for team work, project communication, development and testing. Next, the problem will have to be analyzed, modelled and decomposed into individual components, from which tasks are derived that are subsequently assigned to smaller teams or individuals. At weekly project meetings, the project team presents its progress and the milestones that have been reached, and the next steps are decided in consultation with the lecturer. The project is concluded with a final report, a project poster and a final presentation that includes a demonstration of the prototype.
Assessment breakdown: The overall grade for the module consists of the results of exam-equivalent course work (Prüfungsäquivalente Studienleistungen, PäS).
The following are included in the final grade:
Active participation in the project (10%)
Prototype with test cases (50%)
Documentation (10%)
Final Report (10%)
Project Poster (10%)
Final presentation (10%)