0- Course Intro

CIT650: Introduction to Big Data is a course offered in Fall 2024, focusing on advanced principles and methods for managing and processing diverse data types. The course aims to equip students with skills in scalable data storage, processing techniques, and performance analysis, covering topics like batch and stream processing systems. Grading includes quizzes, assignments, participation, a midterm, a research project, and a final exam, with references provided for further reading.

Uploaded by

Karim Osama

CIT650: Introduction to Big Data

Fall 2024
(Tuesday 6:30 PM)

Dr. Tamer Arafa


Our Vision
To be a world-class school recognized as one of the top information technology and
computer science schools in the region for research, education, and entrepreneurship.

Our Mission
The school is committed to preparing scientifically and professionally distinguished
graduates in many information technology and computer science disciplines. It strives
to contribute strongly to society's prosperity, achieve sustainable development goals,
and support the information technology industry through multidisciplinary scientific
research, innovation, and the enhancement of entrepreneurial capabilities.
Course Description
▪ Big Data Explosion: The term Big Data was coined to describe the surge in global
digital data, which originates from diverse sources and arrives in diverse formats.
▪ Universal Significance: Big Data is a core theme in industries, research, and society,
impacting sectors like automotive, finance, healthcare, and manufacturing.
▪ Industry Advancements: Industries benefit from faster data processing, with
automotive, finance, healthcare, and manufacturing experiencing notable
improvements.
▪ Tech Boost: Big Data's progress is driven by affordable, high-performance computing
platforms that enable fault-tolerant storage and processing in large clusters with
thousands of processors and terabytes of memory.
Course Aim
▪ Course Objectives: This course aims to familiarize students with advanced principles and
methods for managing and processing data effectively.
▪ Data Handling Techniques: Students will explore storage and processing techniques for various
data types, including structured, semi-structured, and unstructured data.
▪ Cutting-edge Topics: The course will delve into the latest advancements in big data processing
systems, covering areas such as:
  ▪ Batch processing
  ▪ Stream processing
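The difference between the two processing styles named above can be illustrated with a minimal sketch in plain Python. It is not tied to any of the systems covered in the course; the function names `batch_word_count` and `stream_word_count` are hypothetical and chosen only for illustration. The batch version sees the complete dataset and returns one final result, while the stream version handles one record at a time and emits an updated result after each record.

```python
from collections import Counter


def batch_word_count(lines):
    """Batch style: consume the whole (finite) dataset, then return one result."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def stream_word_count(lines):
    """Stream style: update state per record and emit a running result each time."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
        # Yield a snapshot so later updates do not mutate earlier results.
        yield dict(counts)


if __name__ == "__main__":
    data = ["big data", "big stream"]
    print(batch_word_count(data))
    for snapshot in stream_word_count(data):
        print(snapshot)
```

In a real deployment the stream input would be unbounded, so the streaming function would never finish; the finite list here only keeps the sketch runnable.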
Course Outcomes
On successful completion of this course, students should be able to:
▪ Recognize Scalable Data Needs: Understand the escalating demand for scalable data storage
and processing in diverse domains.
▪ Evaluate Solutions: Assess advanced data management solutions, choosing systems for
specific challenges.
▪ Implement Cutting-edge Systems: Apply state-of-the-art data processing for scalable solutions
in diverse domains.
▪ Performance Analysis: Use qualitative and quantitative methods to analyze and compare
system performance.
▪ Build Data Pipelines: Demonstrate skill in constructing complex data processing pipelines for
diverse data types.
Course Topics
▪ Principles of Big Data
▪ Batch Processing Systems for Big Data
  ▪ Hadoop
  ▪ Spark
▪ Big SQL Systems
  ▪ Hive
  ▪ Impala
  ▪ Spark DataFrames/SQL
▪ Big Stream Processing
  ▪ Storm
  ▪ Spark Streaming
  ▪ Flink
Course Delivery
▪ Course Professor
  ▪ Dr. Tamer Arafa ([email protected])
  ▪ Lectures: Tuesday, 6:30 pm – 9:00 pm
  ▪ Office hours: Monday, 4:30 pm – 6:30 pm
▪ Course TA
  ▪ TBD
  ▪ Office hours: TBD
Grade Distribution

▪ 2 Quizzes 10%
▪ 1 Assignment 10%
▪ Lab and Participation 10%
▪ Midterm 20%
▪ 1 Research Project 20%
▪ Final Exam 30%
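As a quick check of how these weights combine, the following minimal sketch computes a weighted course grade. The weights come from the list above; the component names, the `final_grade` helper, and the assumption that each component (including the two quizzes taken together) is summarized as a single score from 0 to 100 are illustrative and not part of the official grading policy.

```python
# Weights from the grade distribution above, as percentages of the final grade.
WEIGHTS = {
    "quizzes": 10,                # assumption: both quizzes combined into one score
    "assignment": 10,
    "lab_and_participation": 10,
    "midterm": 20,
    "research_project": 20,
    "final_exam": 30,
}


def final_grade(scores):
    """Return the weighted grade (0-100) from per-component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {
        "quizzes": 80,
        "assignment": 90,
        "lab_and_participation": 100,
        "midterm": 70,
        "research_project": 85,
        "final_exam": 75,
    }
    print(final_grade(example))  # 80.5
```

The weights sum to 100, so a student scoring 100 on every component receives 100 overall.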
References
▪ Sherif Sakr and Mohamed Gaber. "Large Scale and Big Data: Processing
and Management", CRC Press, 2014.
▪ Sherif Sakr. "Big Data 2.0 Processing Systems", Springer, 2016.
▪ Albert Zomaya and Sherif Sakr. "Handbook of Big Data Technologies",
Springer, 2017.
▪ Sherif Sakr and Albert Zomaya. "Encyclopedia of Big Data Technologies",
Springer, 2018.
▪ Sherif Sakr et al. "Large Scale Graph Processing Using Apache Giraph",
Springer, 2016.
