1database Systems Master IBU

This document provides information for the CE502-DS DataBase Systems master course, including the instructor Dr. Festim Halili's contact information, prerequisites, required materials, grading criteria involving SQL projects and a final exam, and communication channels like a website and email. It also gives an overview of database systems concepts like data modeling, query languages, concurrency control, and reliable data storage.

CE502-DS - Database Systems
Master Studies

Doc.Dr. Festim Halili


[email protected]

Lectures:
Thursdays 16:30

Prerequisites
• Required: Database Design and Management (BSc)

Course Materials
• Required: Database Systems, 4th edition, A Practical Approach to Design, Implementation and Management
• Recommended: Database Management Systems, X Edition, Raghu Ramakrishnan and Johannes Gehrke, McGraw-Hill.
• Beginning Database Design Solutions, Wrox, Rod Stephens, 2010

Grading
• Homework
• Projects
• Use SQL Server to design a database in two projects.
– The first project is on the entity-relationship (ER) model.
– The second project is on relational algebra (RA) and relational calculus (RC).
• Final Exam
– In-class, closed-book, non-cumulative

Communication
• Web page: http://sites.google.com/site/festimhalili3
• Email: [email protected]
• Recitation: Lab

What Is a Database System?

• Database: a very large, integrated collection of data.
• Models a real-world enterprise
– Entities (e.g., teams, games)
– Relationships (e.g., The Forty-Niners are playing in The Superbowl)
– More recently, also includes active components, often called “business logic” (e.g., the BCS ranking system)

• A Database Management System (DBMS) is a software system designed to store, manage, and facilitate access to databases.

Database Systems: Then

Database Systems: Today

From Friendster.com on-line tour


Other Ways Databases Make Life Better?
• “Players could finally sign up for the Star Wars Galaxies game last week as Sony opened up registration to the public.”

• “Once players got in to the game they found that the game servers were offline because of database problems.”

• “Some players spent hours tuning their in-game characters only to find that crashes deleted all their hard work.”
Source: BBC News Online, July 1, 2003.

Other databases you may use

Is the WWW a DBMS?
• Fairly sophisticated search available
– crawler indexes pages on the web
– Keyword-based search for pages
• But, currently
– data is mostly unstructured and untyped
– search only:
• can’t modify the data
• can’t get summaries, complex combinations of data
– few guarantees provided for freshness of data, consistency across data items, fault tolerance, …
– Web sites typically have a DBMS in the background to provide these functions.
• The picture is changing
– New standards (e.g., XML, Semantic Web) can help data modeling
– Research groups (e.g., at Berkeley) are working on providing some of this functionality across multiple web sites.

“Search” vs. Query

• What if you wanted to find out which actors donated to John Kerry’s presidential campaign?

• Try “actors donated to john kerry” in your favorite search engine.

A “Database Query” Approach

Is a File System a DBMS?

• Thought Experiment 1:
– You and your project partner are editing the same file.
– You both save it at the same time.
– Whose changes survive?
A) Yours B) Partner’s C) Both D) Neither E) ???

• Thought Experiment 2:
– You’re updating a file.
– The power goes out.
– Which of your changes survive?
A) All B) None C) All Since Last Save D) ???

• Q: How do you write programs over a subsystem when it promises you only “???”?
A: Very, very carefully!!

Current Commercial Outlook
• A major part of the software industry:
– Oracle, IBM, Microsoft, Sybase
– also Informix (now IBM), Teradata
– smaller players: Java-based DBMSs, devices, OO, …
• Well-known benchmarks (esp. TPC)
• Lots of related industries
– data warehouse, document management, storage, backup,
reporting, business intelligence, app integration
• Relational products dominant and evolving
– adapting for extensibility (user-defined types), adding native
XML support.
• Open Source coming on strong
– MySQL, PostgreSQL, BerkeleyDB

Why Study Databases?

• Shift from computation to information
– always true for corporate computing
– Web made this point for personal computing
– more and more true for scientific computing
• Need for DBMS has exploded in recent years
– Corporate: retail swipe/clickstreams, “customer relationship
mgmt”, “supply chain mgmt”, “data warehouses”, etc.
– Scientific: digital libraries, Human Genome project, NASA
Mission to Planet Earth, physical sensors, grid physics
network
• DBMS encompasses much of CS in a practical discipline
– OS, languages, theory, AI, multimedia, logic
– Yet traditional focus on real-world apps

What’s the intellectual content?
• representing information
– data modeling
• languages and systems for querying data
– complex queries with real semantics*
– over massive data sets
• concurrency control for data manipulation
– controlling concurrent access
– ensuring transactional semantics
• reliable data storage
– maintain data semantics even if you pull the
plug

* semantics: the meaning or relationship of meanings of a sign or set of signs

Describing Data: Data Models
• A data model is a collection of concepts for describing data.

• A schema is a description of a particular collection of data, using a given data model.

• The relational model of data is the most widely used model today.
– Main concept: relation, basically a table with rows and columns.
– Every relation has a schema, which describes the columns, or fields.

Levels of Abstraction
• Views describe how users see the data.

• Conceptual schema defines logical structure.

• Physical schema describes the files and indexes used.

• (sometimes called the ANSI/SPARC model)

[Figure: Users, then View 1 / View 2 / View 3, then Conceptual Schema, then Physical Schema, then DB]

Example: University Database
• Conceptual schema:
– Students(sid: string, name: string, login: string, age: integer, gpa: real)
– Courses(cid: string, cname: string, credits: integer)
– Enrolled(sid: string, cid: string, grade: string)
• External Schema (View):
– Course_info(cid: string, enrollment: integer)
• Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
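
The three schema levels above can be sketched in SQL. The example below is a minimal illustration using Python's built-in sqlite3 module rather than the SQL Server setup named for the course projects; the SQLite column types, the index name idx_students_sid, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# In-memory database; SQLite types stand in for the slide's string/integer/real.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the three relations
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- External schema: a view defined over the conceptual schema
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;

    -- Physical schema: an index on the first column of Students
    CREATE INDEX idx_students_sid ON Students (sid);
""")

# Hypothetical sample rows, invented for this sketch.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "CE502", "A"), ("s2", "CE502", "B"), ("s1", "CE101", "A")],
)

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```

A user of Course_info sees only course ids and enrollment counts; the Enrolled relation and the index stay hidden behind the view.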
Data Independence
• Applications insulated from how data is structured and stored.
• Logical data independence: Protection from changes in logical structure of data.
• Physical data independence: Protection from changes in physical structure of data.

• Q: Why are these particularly important for DBMS?
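
Physical data independence can be observed directly: a physical change such as adding an index should leave query results unchanged. The sketch below uses Python's sqlite3 module; the table contents and the index name idx_students_gpa are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("s1", "Ana", 3.6), ("s2", "Ben", 3.1), ("s3", "Cem", 3.9)],
)

# The application's query; it never mentions how the data is stored.
QUERY = "SELECT name FROM Students WHERE gpa > 3.5 ORDER BY name"

before = conn.execute(QUERY).fetchall()

# Physical change only: the logical schema and the query text stay the same.
conn.execute("CREATE INDEX idx_students_gpa ON Students (gpa)")

after = conn.execute(QUERY).fetchall()
print(before == after)
```

The DBMS may now choose a different access path for the query, but the application code did not have to change.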
Queries, Query Plans, and Operators
Tables: Emp (Employees), Proj (Projects), Asgn (Assignments)

SELECT eid, ename, title
FROM Emp E
WHERE E.sal > $50K
Plan: Select over Emp

SELECT E.loc, AVG(E.sal)
FROM Emp E
GROUP BY E.loc
HAVING Count(*) > 5
Plan: Having over Group(agg) over Emp

SELECT COUNT DISTINCT (E.eid)
FROM Emp E, Proj P, Asgn A
WHERE E.eid = A.eid
AND P.pid = A.pid
AND E.loc <> P.loc
Plan: Count distinct over two Joins of Emp, Asgn, and Proj

• System handles query plan generation & optimization; ensures correct execution.

• Issues: view reconciliation, operator ordering, physical operator choice, memory management, access path (index) use, …
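
The three slide queries can be tried out against a toy database. This is a sketch using Python's sqlite3 module. The column lists, sample rows, and place names are assumptions, and the slide shorthand is adapted to valid SQL: $50K becomes 50000, COUNT DISTINCT (E.eid) becomes COUNT(DISTINCT E.eid), and the HAVING threshold is lowered from 5 to 1 so the tiny data set returns a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp  (eid INTEGER, ename TEXT, title TEXT, sal INTEGER, loc TEXT);
    CREATE TABLE Proj (pid INTEGER, loc TEXT);
    CREATE TABLE Asgn (eid INTEGER, pid INTEGER);
""")
# Hypothetical sample data.
conn.executemany("INSERT INTO Emp VALUES (?, ?, ?, ?, ?)", [
    (1, "Ana", "Engineer", 60000, "Skopje"),
    (2, "Ben", "Analyst", 40000, "Skopje"),
    (3, "Cem", "Manager", 75000, "Tetovo"),
])
conn.executemany("INSERT INTO Proj VALUES (?, ?)", [(10, "Skopje"), (20, "Tetovo")])
conn.executemany("INSERT INTO Asgn VALUES (?, ?)", [(1, 10), (1, 20), (2, 10), (3, 20)])

# Selection: employees earning more than 50000.
high_paid = conn.execute(
    "SELECT eid, ename, title FROM Emp E WHERE E.sal > 50000 ORDER BY eid"
).fetchall()

# Grouping with HAVING: average salary per location with more than one employee.
avg_by_loc = conn.execute(
    "SELECT E.loc, AVG(E.sal) FROM Emp E GROUP BY E.loc HAVING COUNT(*) > 1"
).fetchall()

# Three-way join: employees assigned to a project at a different location.
remote = conn.execute("""
    SELECT COUNT(DISTINCT E.eid)
    FROM Emp E, Proj P, Asgn A
    WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc
""").fetchone()[0]

print(high_paid, avg_by_loc, remote)
```

Each query states only what result is wanted; the join order and access paths are chosen by the system.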
Concurrency Control

• Concurrent execution of user programs: key to good DBMS performance.
– Disk accesses frequent, pretty slow
– Keep the CPU working on several programs concurrently.
• Interleaving actions of different programs: trouble!
– e.g., account-transfer & print statement at same time
• DBMS ensures such problems don’t arise.
– Users/programmers can pretend they are using a single-user system. (called “Isolation”)
– Thank goodness! Don’t have to program “very, very carefully”.

Transactions: ACID Properties
• Key concept is a transaction: a sequence of database actions (reads/writes).

• DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle of a Xact.
• Each transaction, executed completely, must take the DB between consistent states or must not run at all.
• DBMS ensures that concurrent transactions appear to run in isolation.
• DBMS ensures durability of committed Xacts even if system crashes.

• Note: can specify simple integrity constraints on the data. The DBMS enforces these.
– Beyond this, the DBMS does not understand the semantics of the data.
– Ensuring that a single transaction (run alone) preserves consistency is largely the user’s responsibility!
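
Atomicity and a simple integrity constraint can be illustrated with an account transfer. This is a minimal sketch using Python's sqlite3 module; the Account table, the transfer helper, and the balances are hypothetical, and a failed CHECK constraint stands in for a mid-transaction failure (it does not simulate a real crash).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simple integrity constraint declared to the DBMS: balances may not go negative.
conn.execute(
    "CREATE TABLE Account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction: both updates or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 30)       # succeeds
failed = transfer(conn, "A", "B", 500)  # second update violates the CHECK
balances = dict(conn.execute("SELECT id, balance FROM Account"))
print(ok, failed, balances)
```

In the failed transfer the credit to B had already been applied when the debit from A was rejected; the rollback undoes it, so no money appears from nowhere.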
Structure of a DBMS
• A typical DBMS has a layered architecture.
• The figure does not show the concurrency control and recovery components.
• Each database system has its own variations.

[Figure: layers, top to bottom]
Query Optimization and Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
DB
(These layers must consider concurrency control and recovery.)

Advantages of a DBMS

• Data independence
• Efficient data access
• Data integrity & security
• Data administration
• Concurrent access, crash recovery
• Reduced application development time
• So why not use them always?
– Expensive/complicated to set up & maintain
– This cost & complexity must be offset by need
– General-purpose, not suited for special-purpose tasks (e.g. text
search!)

Databases make these folks happy ...
• DBMS vendors, programmers
– Oracle, IBM, MS, Sybase, …
• End users in many fields
– Business, education, science, …
• DB application programmers
– Build enterprise applications on top of DBMSs
– Build web services that run off DBMSs
• Database administrators (DBAs)
– Design logical/physical schemas
– Handle security and authorization
– Data availability, crash recovery
– Database tuning as needs evolve

…must understand how a DBMS works


Summary (part 1)
• DBMS used to maintain, query large datasets.
– can manipulate data and exploit semantics
• Other benefits include:
– recovery from system crashes,
– concurrent access,
– quick application development,
– data integrity and security.
• Levels of abstraction provide data independence
– Key when d(app)/dt << d(platform)/dt, i.e., when applications change much more slowly than the platform beneath them

Summary, cont.

• DBAs, DB developers: the bedrock of the information economy

• DBMS R&D represents a broad, fundamental branch of the science of computation

The Relational Model

• The Relational Database Management System (RDBMS) has become the dominant data-processing software in use today, with estimated new licence sales of between US$6 billion and US$10 billion per year (US$25 billion with tools sales included).
• The relational model was first proposed by E. F. Codd in his seminal paper ‘A relational model of data for large shared data banks’ (Codd, 1970).

• The relational model’s objectives were specified as follows:
– To allow a high degree of data independence. Application programs must not be affected by modifications to the internal data representation, particularly by changes to file organizations, record orderings, or access paths.
– To provide substantial grounds for dealing with data semantics, consistency, and redundancy problems. In particular, Codd’s paper introduced the concept of normalized relations, that is, relations that have no repeating groups.
– To enable the expansion of set-oriented data manipulation languages.

• Tuple: A tuple is a row of a relation.

• Degree: The degree of a relation is the number of attributes it contains.
• Cardinality: The cardinality of a relation is the number of tuples it contains.
• Relational database: A collection of normalized relations with distinct relation names.
• Primary key: The candidate key that is selected to identify tuples uniquely within the relation.
• Foreign key: An attribute, or set of attributes, within one relation that matches the candidate key of some (possibly the same) relation.
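
Degree and cardinality are easy to confuse, so a small model may help. The sketch below represents a relation as a set of tuples in plain Python; the Relation class, the Staff attribute names, and the sample rows are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    """A relation modeled as attribute names plus a set of tuples (rows)."""
    attributes: tuple
    rows: frozenset

    @property
    def degree(self):
        # Number of attributes (columns).
        return len(self.attributes)

    @property
    def cardinality(self):
        # Number of tuples (rows); a set cannot hold duplicates.
        return len(self.rows)

staff = Relation(
    attributes=("staffNo", "name", "branchNo"),
    rows=frozenset({
        ("S1", "Ana", "B1"),
        ("S2", "Ben", "B1"),
        ("S2", "Ben", "B1"),  # duplicate tuple collapses in the set
    }),
)
print(staff.degree, staff.cardinality)
```

Adding a row changes the cardinality; adding a column changes the degree.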
• The common convention for representing a relation
schema is to give the name of the relation followed
by the attribute names in parentheses. Normally, the
primary key is underlined.
• The conceptual model, or conceptual schema, is the
set of all such schemas for the database. Figure 3.3
(book) shows an instance of this relational schema.

• Null: Represents a value for an attribute that is currently unknown or is not applicable for this tuple.
• Entity integrity: In a base relation, no attribute of a primary key can be null.
• Referential integrity: If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null.
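
Both integrity rules can be declared to the DBMS and checked on every insert. The sketch below uses Python's sqlite3 module; the Branch and Staff tables and the try_insert helper are hypothetical. Two SQLite specifics are assumed here: foreign keys are enforced only after PRAGMA foreign_keys = ON, and a TEXT primary key needs an explicit NOT NULL to reject nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY NOT NULL);
    CREATE TABLE Staff  (staffNo  TEXT PRIMARY KEY NOT NULL,
                         branchNo TEXT REFERENCES Branch (branchNo));
    INSERT INTO Branch VALUES ('B1');
""")

def try_insert(staff_no, branch_no):
    """Attempt one insert into Staff and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO Staff VALUES (?, ?)", (staff_no, branch_no))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert("S1", "B1"),   # foreign key matches an existing branch
    try_insert("S2", None),   # wholly null foreign key is allowed
    try_insert("S3", "B9"),   # no such branch: referential integrity violated
    try_insert(None, "B1"),   # null primary key: entity integrity violated
]
print(results)
```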

Views
• In the three-level ANSI-SPARC architecture presented in Chapter 2 of the book, an external view is described as the structure of the database as it appears to a particular user. In the relational model, the word ‘view’ has a slightly different meaning.
• Base relation: A named relation corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database.
• View: The dynamic result of one or more relational operations operating on the base relations to produce another relation. A view is a virtual relation that does not necessarily exist in the database but can be produced upon request by a particular user, at the time of request.
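
The "dynamic result" part of the view definition can be seen by querying a view before and after its base relation changes. This is a sketch using Python's sqlite3 module; the Staff table, the StaffB1 view, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base relation: its tuples are physically stored.
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, name TEXT, branchNo TEXT);

    -- View: a stored query, evaluated each time it is referenced.
    CREATE VIEW StaffB1 AS
        SELECT staffNo, name FROM Staff WHERE branchNo = 'B1';

    INSERT INTO Staff VALUES ('S1', 'Ana', 'B1'), ('S2', 'Ben', 'B2');
""")

before = conn.execute("SELECT staffNo FROM StaffB1 ORDER BY staffNo").fetchall()

# Change only the base relation; the view definition is untouched.
conn.execute("INSERT INTO Staff VALUES ('S3', 'Cem', 'B1')")

after = conn.execute("SELECT staffNo FROM StaffB1 ORDER BY staffNo").fetchall()
print(before, after)
```

The new row shows up in the view without any refresh step, because the view is recomputed from the base relation at the time of the request.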

Questions?
