DBMS 4TH Sem Computer Engg
DBMS 4TH Sem Computer Engg
SYSTEM
1
Basic Definitions
Database:
– A logical coherent collection of data representing the mini-world
such that change in the mini-world brings about change in
database collected for a particular purpose and for a group of
intended users.
Data:
– Meaningful facts, text, graphics, images, sound, video segments
that can be recorded and have an implicit meaning.
Metadata:
– Data that describes data
File Processing System
– A collection of application programs that perform services for the
end-users such as production of reports
– Each program defines and manages its own data
Database Management System (DBMS):
– A software package/ system to facilitate the creation and
maintenance of a computerized database.
Database System:
– The DBMS software together with the data itself. Sometimes, the
applications are also included. Database + DBMS
2
Simplified database system
environment
3
Evolution of DB Systems
Flat files - 1960s - 1980s
Hierarchical – 1970s - 1990s
Network – 1970s - 1990s
Relational – 1980s - present
Object-oriented – 1990s - present
Object-relational – 1990s - present
Data warehousing – 1980s -
present
Web-enabled – 1990s - present
4
Purpose of Database Systems
5
Disadvantages of File Processing
Program-Data Dependence
File structure is defined in the program code.
All programs maintain metadata for each file they use
Duplication of Data (Data Redundancy)
Different systems/programs have separate copies of the same data
– Same data is held by different programs.
– Wasted space and potentially different values and/or different formats
for the same item.
Limited Data Sharing
No centralized control of data
Programs are written in different languages, and so cannot easily access
each other’s files.
Lengthy Development Times
Programmers must design their own file formats
Excessive Program Maintenance
80% of of information systems budget
Vulnerable to Inconsistency
Change in one table need changes in corresponding tables as well
otherwise data will be inconsistent
6
Advantages of Database Approach
Data independence and efficient access.
Data integrity and security.
Uniform data administration.
Concurrent access, recovery from crashes.
Replication control
Reduced application development time.
Improved Data Sharing
– Different users get different views of the data
Enforcement of Standards
– All data access is done in the same way
Improved Data Quality
– Constraints, data validation rules
Better Data Accessibility/ Responsiveness
– Use of standard data query language (SQL)
Security, Backup/Recovery, Concurrency
– Disaster recovery is easier
7
Costs and Risks of the
Database Approach
Up-front costs:
– Installation Management Cost and Complexity
– Conversion Costs
Ongoing Costs
– Requires New, Specialized Personnel
– Need for Explicit Backup and Recovery
Organizational Conflict
– Old habits die hard
8
Database Applications
Database Applications:
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries, tax deductions
Databases touch all aspects of our lives
9
Levels of Abstraction
Many views, single View 1 View 2 View 3
conceptual (logical) schema
and physical schema.
– Views describe how users
Conceptual Schema
see the data.
– Conceptual schema defines Physical Schema
logical structure
– Physical schema describes
the files and indexes used.
10
Example: University Database
Conceptual schema:
– Students(sid: string, name: string, login: string,
age: integer, gpa:real)
– Courses(cid: string, cname:string, credits:integer)
– Enrolled(sid:string, cid:string, grade:string)
Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
External Schema (View):
– Course_info(cid:string, enrollment:integer)
11
Instances and Schemas
12
Data Independence
Ability to modify a schema definition in one
level without affecting a schema definition in
the other levels.
The interfaces between the various levels and
components should be well defined so that
changes in some parts do not seriously
influence others.
Two levels of data independence
– Physical data independence:- Protection from changes in
logical structure of data.
– Logical data independence:- Protection from changes in
physical structure of data.
13
Instances and Schemas
Similar to types and variables in programming languages
Schema – the logical structure of the database
– e.g., the database consists of information about a set of customers and accounts
and the relationship between them)
– Analogous to type information of a variable in a program
– Physical schema: database design at the physical level
– Logical schema: database design at the logical level
Instance – the actual content of the database at a particular point in time
– Analogous to the value of a variable
Physical Data Independence – the ability to modify the physical schema
without changing the logical schema
– Applications depend on the logical schema
– In general, the interfaces between the various levels and components should be
well defined so that changes in some parts do not seriously influence others.
14
Database Languages
Data Definition Language (DDL)
Specification notation for defining the database schema
DDL compiler generates a set of tables stored in a data dictionary
Data dictionary contains metadata (data about data)
Data storage and definition language – special type of DDL in which
the storage structure and access methods used by the database system
are specified
15
Database Users
Users are differentiated by the way they expect to
interact with the system
Application programmers – interact with system
through DML calls
Sophisticated users – form requests in a database
query language
Specialized users – write specialized database
applications that do not fit into the traditional data
processing framework
Naïve users – invoke one of the permanent
application programs that have been written
previously
– E.g. people accessing database over the web, bank tellers,
clerical staff
16
Database Administrator
Coordinates all the activities of the database system; the
database administrator has a good understanding of the
enterprise’s information resources and needs.
Database administrator's duties include:
– Schema definition
– Storage structure and access method definition
– Schema and physical organization modification
– Granting user authority to access the database
– Specifying integrity constraints
– Acting as liaison with users
– Monitoring performance and responding to changes
in requirements
17
Data Models
A collection of tools for describing:
– Data
– Data relationships
– Data semantics
– Data constraints
Object-based logical models
– Entity-relationship model
– Object-oriented model
– Semantic model
– Functional model
Record-based logical models
– Relational model (e.g., SQL/DS, DB2)
– Network model
– Hierarchical model (e.g., IMS)
18
Entity-Relationship Model
The basics of Entity-Relationship
modelling
Entities (objects)
– E.g. customers, accounts, bank branch
Attributes
Relationships between entities
– E.g. Account A-101 is held by customer Johnson
– Relationship set depositor associates customers with
accounts
Employees
20
name
since Employees
name dname
super- subor-
ssn lot did budget visor dinate
Reports_To
Employees Works_In Departments
22
Mapping Cardinality Constraints
Express the number of entities to which another
entity can be associated via a relationship set.
Most useful in describing binary relationship sets.
For a binary relationship set the mapping
cardinality must be one of the following types:
– One to one
– One to many
– Many to one
– Many to many
23
Mapping Cardinalities
24
Participation Constraints
Does every department have a manager?
– If so, this is a participation constraint: the participation of Departments in
Manages is said to be total (vs. partial).
Every Department entity must appear in an instance of the
relationship Works_In (have an employee) and every Employee
must be in a Department
Works_In
since
25
Keys
A super key of an entity set is a set of one or more attributes
whose values uniquely determine each entity.
A candidate key of an entity set is a minimal super key
– Customer_id is candidate key of customer
– account_number is candidate key of account
Although several candidate keys may exist, one of the candidate
keys is selected to be the primary key.
Alternate key is the candidate key which are not selected as
primary key.
Foreign key are the attributes of an entity that points to the
primary key of another entity. They act as a cross-reference
between entities.
Composite Key consists of two or more attributes that uniquely
identify an entity.
Non-key attributes are the attributes or fields of a table, other
than candidate key attributes/fields in a table.
Non-prime Attributes are attributes other than Primary Key
attribute(s)..
26
Relational Model
Example of tabular data in the relational model:
name ssn street city account-number
Johnson 192-83-7465 Alma Palo Alto A-101
Smith 019-28-3746 North Rye A-215
Johnson 192-83-7465 Alma Palo Alto A-201
Jones 321-12-3123 Main Harrison A-217
Smith 019-28-3746 North Rye A-201
account-number balance
A-101 500
A-201 900
A-215 700
A-217 750
27
Relational Model (Basic)
The relational model used the basic concept of a
relation or table.
Tuple:- A tuple is a row in a table.
Attribute:- An attribute is the named column of a relation.
Domain:- A domain is the set of allowable values for one or more
attributes.
Degree:- The number of columns in a table is called the degree of
relation.
Cardinality:- The number of rows in a relation,is called the
cardinality of the relation.
28
Integrity Constraints
Integrity constraints guard against accidental damage to the
database, by ensuring that authorized changes
to the database do not result in a loss of data consistency.
Domain Constraints:- It specifies that the value of each
attribute x must be an atomic value from the domain of x.
Key Constraints:- Primary Key must have unique value in
the relational table.
Referential Integrity:-It states that if a foreign key in table
A refers to the primary key of table B then, every value of
the foreign key in table A must be null or be available in
table B.
Entity Integrity:- It states that no attribute of a primary
key can have a null value.
29
A Sample Relational Database
30
SQL Introduction
Standard language for querying and manipulating data
31
SQL
32
Table name Attribute names
Tables in SQL
Product
Tuples or rows
33
Tables Explained
34
Data Types in SQL
Atomic types:
– Characters: CHAR(20), VARCHAR(50)
– Numbers: INT, BIGINT, SMALLINT, FLOAT
– Others: MONEY, DATETIME, …
35
Tables Explained
A tuple = a record
– Restriction: all attributes are of atomic type
36
SQL Query
SELECT <attributes>
FROM <one or more relations>
WHERE <conditions>
37
Simple SQL Query
SELECT *
FROM Product
WHERE category=‘Gadgets’
38
Simple SQL Query
39
Notation
Input Schema
Output Schema
40
Keys and Foreign Keys
Company
CName StockPrice Country
GizmoWorks 25 USA
Key
Canon 65 Japan
Hitachi 15 Japan
Product
PName Price Category Manufacturer
Foreign
Gizmo $19.99 Gadgets GizmoWorks
key
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
41
Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
42
Joins
Product Company
43
More Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
SELECT cname
FROM
WHERE
44
NULLS in SQL
45
Outer Joins
46
Modifying the Database
47
Insertions
General form:
49
Insertion: an Example
Product(name, listPrice, category)
Purchase(prodName, buyerName, price)
prodName is foreign key in Product.name
camera - -
51
Insertion: an Example
camera 200 -
53
Updates
Example:
UPDATE PRODUCT
SET price = price/2
WHERE Product.name IN
(SELECT product
FROM Purchase
WHERE Date =‘Oct, 25, 1999’);
54