Database Applications

Cy S 125242

Introduction

Dr. Layla Abdour
Today…
 Motivation
 Course overview & administrivia
 Databases and Security
Outline

Motivation

Course Overview and Administrivia

A Primer on Databases
On the Verge of a Disruptive Century: Breakthroughs

 Gene Sequencing and Biotechnology
 Ubiquitous Computing
 Smaller, Faster, Cheaper Sensors
 Faster Communication
A Common Theme is Data
The amount of data is only growing…
1.2 Zettabytes (1 ZB = 10^21 B or 1 billion TB) in 2010
We Live in a World of Data
 Nearly 500 Exabytes per day are generated by the Large Hadron Collider experiments (not all recorded!)

 2.9 million emails are sent every second

 20 hours of video are uploaded to YouTube every minute

 Google handles over 2.5 exabytes (2,500,000,000 gigabytes) of data every day

 50 million tweets are generated per day

 700 billion total minutes are spent on Facebook each month


Data and Big Data
 The value of data as an organizational asset is widely recognized

 Data is growing explosively along three main dimensions:
• “Volume” or the amount of data
• “Velocity” or the speed of data
• “Variety” or the range of data types and sources

 What is Big Data?
   It is data that floods organizations on a daily basis
   It is high-volume, high-velocity, and/or high-variety information assets
   It requires new forms of processing to enable fast mining, enhanced decision-making, insight discovery and process optimization
What Do We Do With Data and Big Data?
 Store
 Share
 Query
 Mine
 Encrypt
 … and more!

We want to do these seamlessly and fast...


Using Diverse Interfaces & Devices
 Computers
 Mobile Devices
 Consumer Electronics
 Personal Monitors and Sensors
 … and even appliances

We also want to access, share and process our data from all of our devices, anytime, anywhere!
Data is Becoming Critical to Our Lives
Domains of Data:
 Health
 Science
 Education
 Work
 Environment
 Finance
 … and more
Why Study Databases?
 Data is everywhere and is critical to our lives

 Data needs to be recorded, maintained, accessed and manipulated correctly, securely, efficiently and effectively
   At the “low end”: scramble to web-scale (a mess!)
   At the “high end”: scientific applications

 Database management systems (DBMSs) are indispensable software for achieving such goals

 The principles and practices of DBMSs are now an integral part of computer science curricula
   They encompass OS, languages, theory, AI, multimedia, and logic, among others

As such, the study of database systems can prove to be richly rewarding in more ways than one!
Outline

Motivation

Course Overview and Administrivia



A Primer on Databases
Course Objectives
In this course we aim at studying:

 How to design and implement databases from ‘cradle-to-grave’
 How to query and manipulate databases
 How to refine and speed up data retrieval and manipulation
 How to secure our database
 Big Data, Hadoop, BigTable, parallel and distributed DBMSs, NoSQL and NewSQL databases and their security

These objectives span three tracks: Application-Centric, Systems-Centric & Theory-Centric, and Advanced Topics (A Brief Overview)
Assessment Methods
 How do we measure learning?

Type            Weight
Projects        10%
Quizzes         5%
Mid-Exam        30%
Participation   5%
Final exam      50%
Outline

Motivation

Course Overview and Administrivia

A Primer on Databases

A Motivating Scenario
 A Foundation has a “large” collection of data (say 500GB) on employees, students, universities, research centers, etc.

 This data is accessed concurrently by several people
   → Performance (Concurrency Control)

 Queries on data must be answered quickly
   → Performance (Response Time)

 Changes made to the data by different users must be applied consistently
   → Correctness (Consistency)

 Access to certain parts of data (e.g., salaries) must be restricted
   → Correctness (Security)

 This data should survive system crashes/failures
   → Correctness (Durability and Atomicity)
Managing Data using File Systems
 What about managing data using local file systems?
 Files of fixed-length and variable-length records as well as formats
 Main memory vs. disk
 Computer systems with 32-bit addressing vs. 64-bit addressing schemes
 Special programs (e.g., C++ and Python programs) for answering user questions
 Special measures to maintain atomicity
 Special measures to maintain consistency of data
 Special measures to maintain data isolation
 Special measures to offer software and hardware fault-tolerance
 Special measures to enforce security policies in which different users are
granted different permissions to access diverse subsets of data

This becomes tedious and inconvenient, especially at large scale, with evolving/new user queries and a higher probability of failures!
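To make the file-system burden concrete, here is a minimal sketch of one "special program" over fixed-length records. The record layout (6-byte sid, 20-byte name, 8-byte gpa), the helper names and the use of an in-memory stream in place of a disk file are all assumptions made for illustration; they do not come from the slides.

```python
import io
import struct

# Hypothetical fixed-length layout: 6-byte sid, 20-byte name, 8-byte float gpa.
RECORD = struct.Struct("<6s20sd")

def write_students(stream, students):
    """Append each (sid, name, gpa) tuple as one fixed-length record."""
    for sid, name, gpa in students:
        stream.write(RECORD.pack(sid.encode(), name.encode(), gpa))

def students_with_gpa_above(stream, threshold):
    """Answer one specific question by scanning every record in the file."""
    stream.seek(0)
    matches = []
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break  # end of file
        sid, name, gpa = RECORD.unpack(chunk)
        if gpa > threshold:
            # Short names are padded with null bytes by struct; strip them.
            matches.append(name.rstrip(b"\x00").decode())
    return matches

data_file = io.BytesIO()  # stands in for a file on disk
write_students(data_file, [("512412", "Khaled", 3.5),
                           ("512311", "Jones", 3.2),
                           ("512111", "Maria", 3.85)])
high_gpa = students_with_gpa_above(data_file, 3.4)
print(high_gpa)
```

Every new question needs another hand-written scan like this one, and nothing here addresses atomicity, isolation, fault tolerance or permissions.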
Database Management Systems
 Special software is accordingly needed to make the preceding tasks easier

 This software is known as a Database Management System (DBMS)

 DBMSs automatically provide:

 Data independence
 Efficient data access
 Data integrity and security
 Data administration
 Concurrent access and crash recovery
 Reduced application development and tuning time
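As one illustration of what a DBMS handles automatically, the sketch below uses SQLite through Python's standard `sqlite3` module to show an all-or-nothing transaction. The choice of SQLite, the `Enrolled` table layout with a composite primary key and the course ids `C1` and `C2` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT, PRIMARY KEY (sid, cid))"
)
conn.execute("INSERT INTO Enrolled VALUES ('512412', 'C1', 'A')")  # C1 is a made-up course id
conn.commit()

try:
    with conn:  # commits on success, rolls back if an exception escapes the block
        conn.execute("INSERT INTO Enrolled VALUES ('512412', 'C2', 'B')")
        # A duplicate (sid, cid) pair violates the primary key and raises IntegrityError.
        conn.execute("INSERT INTO Enrolled VALUES ('512412', 'C1', 'C')")
except sqlite3.IntegrityError:
    pass  # the whole transaction, including the C2 row, has been rolled back

rows = conn.execute("SELECT sid, cid, grade FROM Enrolled ORDER BY cid").fetchall()
print(rows)
```

The first insert inside the block succeeded on its own, yet it does not survive, because the transaction as a whole failed. Achieving the same with plain files would require the "special measures" listed on the previous slide.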
Some Definitions
 A database is a collection of data which describes one or many real-world enterprises
   E.g., a university database might contain information about entities like students and courses, and relationships like a student’s enrollment in a course

 A DBMS is a software package designed to store and manage databases
   E.g., DB2, Oracle, MS SQL Server and MySQL

 A database system = (Big) Data + DBMS + Application Programs

Data Models
 The user of a DBMS is ultimately concerned with some real-world enterprise (e.g., a university)

 The data to be stored and managed by a DBMS describes various aspects of the enterprise
   E.g., the data in a university database describes student, faculty and course entities and the relationships among them

 A data model is a collection of high-level data description constructs that hide many low-level storage details

 A widely used data model called the entity-relationship (ER) model allows users to pictorially denote entities and the relationships among them
The Relational Model
 The relational model of data is one of the most widely used models today

 The central data description construct in the relational model is the relation

 A relation is basically a table (or a set) with rows (or records or tuples) and columns (or fields or attributes)

 Every relation has a schema, which describes the columns of a relation

 Conditions that records in a relation must satisfy can be specified
   These are referred to as integrity constraints
The Relational Model: An Example
 Let us consider the student entity in a university database
Students Schema

Students(sid: string, name: string, login: string, dob: string, gpa: real)

Each component of the schema, such as “sid: string”, is an attribute, field or column

Integrity Constraint: Every student has a unique sid value

sid      name     login              dob         gpa
512412   Khaled   [email protected]  18-9-1995   3.5
512311   Jones    [email protected]  1-12-1994   3.2
512111   Maria    [email protected]  3-8-1995    3.85

Each line of the table is a record, tuple or row

An instance of a Students relation


Levels of Abstraction
 The data in a DBMS is described at three levels of abstraction: the conceptual (or logical), physical and external schemas

 The conceptual schema describes data in terms of a specific data model (e.g., the relational model of data)

 The physical schema specifies how data described in the conceptual schema are stored on secondary storage devices

 The external schema (or views) allows data access to be customized at the level of individual users or groups of users (views can be 1 or many)

[Figure: View 1, View 2 and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema, which sits on top of the Disk]
Views
 A view is conceptually a relation

 Records in a view are computed as needed and usually not stored in a DBMS

 Example: University Database

Conceptual Schema
• Students(sid: string, name: string, login: string, dob: string, gpa: real)
• Courses(cid: string, cname: string, credits: integer)
• Enrolled(sid: string, cid: string, grade: string)

Physical Schema
• Relations stored as heap files
• Index on first column of Students

External Schema (View)
Students can be allowed to find out course enrollments:
• Course_info(cid: string, enrollment: integer)
Can be computed from the relations in the conceptual schema (so as to avoid data redundancy and inconsistency).
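One way the Course_info view might be defined is sketched below with SQLite via Python's `sqlite3` module. The choice of DBMS, the course ids `C1` and `C2` and the sample enrollments are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
    INSERT INTO Enrolled VALUES ('512412', 'C1', 'A');
    INSERT INTO Enrolled VALUES ('512311', 'C1', 'B');
    INSERT INTO Enrolled VALUES ('512111', 'C2', 'A');

    -- The view stores no rows of its own; it is computed from Enrolled on demand.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

view_query = "SELECT cid, enrollment FROM Course_info ORDER BY cid"
before = conn.execute(view_query).fetchall()

# A new enrollment shows up in the view without any extra bookkeeping.
conn.execute("INSERT INTO Enrolled VALUES ('512111', 'C1', 'B')")
after = conn.execute(view_query).fetchall()
print(before, after)
```

Because the enrollment count is derived rather than stored, it cannot drift out of sync with the Enrolled relation.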
Reiterating: Data Independence
 One of the most important benefits of using a DBMS is data independence

 With data independence, application programs are insulated from how data are structured and stored

 Data independence entails two properties:
   Logical data independence: users are shielded from changes in the conceptual schema (e.g., add/drop a column in a table)
   Physical data independence: users are shielded from changes in the physical schema (e.g., add index or change record order)
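Both properties can be observed in a small experiment, sketched here with SQLite via Python's `sqlite3` module. The DBMS choice, the `login` placeholder values, the index name `idx_students_login` and the added `phone` column are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT, dob TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [("512412", "Khaled", "user1", "18-9-1995", 3.5),   # logins are placeholders
     ("512311", "Jones",  "user2", "1-12-1994", 3.2),
     ("512111", "Maria",  "user3", "3-8-1995",  3.85)],
)

# The "application program": this query text never changes below.
QUERY = "SELECT name FROM Students WHERE login = ?"

def run_query(login):
    return conn.execute(QUERY, (login,)).fetchall()

def query_plan(login):
    # The fourth column of each EXPLAIN QUERY PLAN row is a human-readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (login,)).fetchall()
    return " | ".join(row[3] for row in rows)

result_original = run_query("user2")
plan_original = query_plan("user2")

# Physical change: add an index. Same query, same answer, different access path.
conn.execute("CREATE INDEX idx_students_login ON Students (login)")
result_indexed = run_query("user2")
plan_indexed = query_plan("user2")

# Logical change: add a column. The existing query is unaffected.
conn.execute("ALTER TABLE Students ADD COLUMN phone TEXT")
result_widened = run_query("user2")

print(plan_original)
print(plan_indexed)
```

The exact wording of the plans depends on the SQLite version, but the second plan names the new index while the query results stay identical throughout.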
Queries in a DBMS
 The ease with which information can be queried from a database determines its value to users

 A DBMS provides a specialized language, called the query language, in which queries can be posed

 The relational model supports powerful query languages
   Relational calculus: a formal language based on mathematical logic
   Relational algebra: a formal language based on a collection of operators (e.g., selection and projection) for manipulating relations
   Structured Query Language (SQL):
     Builds upon relational calculus and algebra
     Allows creating, manipulating and querying relational databases
     Can be embedded within a host language (e.g., Java)
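The link between relational algebra and SQL can be sketched as follows. The relation is modeled as a Python set of tuples, `select` and `project` are hypothetical helper names for the selection and projection operators, and SQLite (via `sqlite3`) stands in for a relational DBMS. Python is used as the host language here instead of Java.

```python
import sqlite3

# A relation modeled as a set of (sid, name, gpa) tuples.
students = {
    ("512412", "Khaled", 3.5),
    ("512311", "Jones", 3.2),
    ("512111", "Maria", 3.85),
}

def select(relation, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, *positions):
    """Projection: keep only the listed columns; duplicates collapse in a set."""
    return {tuple(t[i] for i in positions) for t in relation}

# Names of students with gpa above 3.4: project name over select gpa > 3.4.
algebra_result = project(select(students, lambda t: t[2] > 3.4), 1)

# The same question posed in SQL, embedded in the host language.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, gpa REAL)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", sorted(students))
sql_result = set(conn.execute("SELECT name FROM Students WHERE gpa > 3.4").fetchall())

print(algebra_result == sql_result)
```

The `WHERE` clause plays the role of selection and the column list after `SELECT` plays the role of projection.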
The Architecture of a Relational DBMS
[Figure: layered architecture, from top to bottom]
 Web Forms, Application Front Ends and the SQL Interface issue SQL Commands to the DBMS
 Query Evaluation Engine: Parser, Optimizer, Plan Executor, Operator Evaluator
 Files and Access Methods
 Buffer Manager
 Disk Space Manager
 Alongside these layers: Transaction Manager and Lock Manager (Concurrency Control), Recovery Manager
 Database: Data Files, Index Files, System Catalog
People Who Work With Databases
 There are five classes of people associated with databases:
1. End users
 Store and use data in DBMSs
 Usually not computer professionals
2. Application programmers
 Develop applications that facilitate the usage of DBMSs for end users
 Computer professionals who know how to leverage host languages, query languages and DBMSs together
3. Database Administrators (DBAs)
 Design the conceptual and physical schemas
 Ensure security and authorization
 Ensure data availability and recovery from failures
 Perform database tuning
4. Implementers
 Build DBMS software for vendors like IBM and Oracle
 Computer professionals who know how to build DBMS internals
5. Researchers
 Develop new ideas that address evolving and new challenges/problems
The Architecture of a Relational DBMS
[Figure: the same layered architecture, annotated with the people who work at each level]
 End Users (e.g., university staff, travel agents, etc.): Web Forms and Application Front Ends
 Application Programmers & DBAs: SQL Interface and SQL Commands
 Implementers and Researchers: the DBMS internals (Query Evaluation Engine, Files and Access Methods, Buffer Manager, Disk Space Manager, Transaction Manager, Lock Manager, Recovery Manager)
Summary
 We live in a world of data

 The explosion of data is occurring along the 3Vs dimensions

 The data in a DBMS is described at three levels of abstraction

 A DBMS typically has a layered architecture


Summary
 The study of DBMSs is one of the broadest and most exciting areas in computer science!

 This course provides an in-depth treatment of DBMSs with an emphasis on how to design, create, refine, use and build DBMSs and real-world enterprise databases

 Various classes of people who work with databases hold responsible jobs and are well-paid!
