DBMS Unit - 1
MODULE – 1
❖ INTRODUCTION TO DATABASES
Basic Definitions
Database:
◦ A collection of related data.
Data:
◦ Known facts that can be recorded and have an implicit meaning.
Mini-world:
◦ Some part of the real world about which data is stored in a database. For example, student grades and transcripts
at a university.
Database System:
◦ The DBMS software together with the data itself. Sometimes, the applications are also included.
DATABASE MANAGEMENT SYSTEMS
The DBMS is a general-purpose software system that facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and applications.
Defining a Database
* Involves specifying the data types, structures, and constraints of the data to be stored
in the database.
* The database definition or descriptive information is also stored by the DBMS in the
form of a database catalog or dictionary; it is called meta-data.
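As an illustration of a catalog, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite keeps every table definition in a built-in table named `sqlite_master`, which plays the role of the catalog here. The `student` table and its columns are hypothetical, chosen only to echo the university mini-world.

```python
import sqlite3

# In-memory database; the student table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " class_year INTEGER CHECK (class_year BETWEEN 1 AND 5))"
)

# The definition itself is stored as data: this is the meta-data.
catalog = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'student'"
).fetchall()

# PRAGMA table_info reports each column's name (index 1) and declared type (index 2).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog[0][0], catalog[0][1])  # table student
print(columns)
```

The program never had to remember the structure of `student`; it asked the DBMS for it.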
Constructing the database
* Is the process of storing the data on some storage medium that is controlled by the
DBMS.
Manipulating a database
* Includes functions such as querying the database to retrieve specific data.
* Updating the database to reflect changes in the miniworld.
* Generating reports from the data.
Sharing a database
* Allows multiple users and programs to access the database simultaneously.
An application program accesses the database by sending queries or requests for data to
the DBMS.
A query typically causes some data to be retrieved.
A transaction may cause some data to be read and some data to be written into the database.
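The difference between a query and a transaction can be sketched with Python's standard `sqlite3` module. The `section` and `enrollment` tables, the seat counter, and the function names are all hypothetical. The point is only that the query reads, while the transaction reads and writes and is undone as a whole if any part of it fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section ("
    " section_id INTEGER PRIMARY KEY,"
    " seats_left INTEGER CHECK (seats_left >= 0))"
)
conn.execute("CREATE TABLE enrollment (student_number INTEGER, section_id INTEGER)")
conn.execute("INSERT INTO section VALUES (85, 1)")  # one seat left
conn.commit()

def seats_left(section_id):
    """A query: it only retrieves data."""
    row = conn.execute(
        "SELECT seats_left FROM section WHERE section_id = ?", (section_id,)
    ).fetchone()
    return row[0]

def enroll(student_number, section_id):
    """A transaction: two writes that must succeed or fail together."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO enrollment VALUES (?, ?)", (student_number, section_id)
            )
            conn.execute(
                "UPDATE section SET seats_left = seats_left - 1 WHERE section_id = ?",
                (section_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK failed, so the INSERT above was rolled back too

first = enroll(17, 85)   # takes the last seat
second = enroll(8, 85)   # would make seats_left negative
rows = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(first, second, seats_left(85), rows)  # True False 0 1
```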
Other important functions provided by the DBMS include
* Protecting the database and maintaining it over a long period of time.
* Protection includes system protection against hardware or software malfunction (or
crashes) and security protection against unauthorized or malicious access.
* The DBMS must be able to maintain the database system by allowing the system to evolve as
requirements change over time.
Simplified Database System Environment
Typical DBMS Functionality
Define a particular database in terms of its data types, structures, and constraints
Construct or Load the initial database contents on a secondary storage medium
Manipulating the database:
◦ Retrieval: Querying, generating reports
◦ Modification: Insertions, deletions and updates to its content
◦ Accessing the database through Web applications
Processing and Sharing by a set of concurrent users and application programs – yet, keeping all
data valid and consistent
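The define, construct, and manipulate steps listed above can be sketched in a few lines with Python's standard `sqlite3` module. The `course` table and its rows are hypothetical sample data, not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints.
conn.execute(
    "CREATE TABLE course ("
    " course_number TEXT PRIMARY KEY,"
    " course_name TEXT NOT NULL,"
    " credit_hours INTEGER CHECK (credit_hours > 0))"
)

# Construct: load the initial contents.
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [
        ("CS1310", "Intro to Computer Science", 4),
        ("CS3320", "Data Structures", 4),
        ("MATH2410", "Discrete Mathematics", 3),
    ],
)

# Manipulate (modification): an update and a deletion.
conn.execute("UPDATE course SET credit_hours = 3 WHERE course_number = 'CS3320'")
conn.execute("DELETE FROM course WHERE course_number = 'MATH2410'")
conn.commit()

# Manipulate (retrieval): a query.
three_credit = conn.execute(
    "SELECT course_number FROM course WHERE credit_hours = 3 ORDER BY course_number"
).fetchall()
print(three_credit)  # [('CS3320',)]
```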
Example of a Database
(with a Conceptual Data Model)
Mini-world for the example:
◦ Part of a UNIVERSITY environment.
Some mini-world relationships:
◦ SECTIONs are of specific COURSEs
◦ STUDENTs take SECTIONs
◦ COURSEs have prerequisite COURSEs
◦ INSTRUCTORs teach SECTIONs
◦ COURSEs are offered by DEPARTMENTs
◦ STUDENTs major in DEPARTMENTs
Note: The above entities and relationships are typically expressed in a conceptual
data model, such as the ENTITY-RELATIONSHIP data model.
Example of a simple database
College Database
Example of a simplified database catalog
Main Characteristics of the Database Approach
The main characteristics of the database approach versus the file-processing approach include:
Data Abstraction:
◦ The characteristic that allows program-data independence and program-operation independence is
called data abstraction.
◦ A data model is used to hide storage details and present the users with a conceptual view of the database.
Differentiate between File System and DBMS
File System: Does not support multiple views of data.
DBMS: A database typically has many users, each of whom may require a different perspective or view of the database.
File System: Concurrent access to the data has many problems, such as one user reading a file while another is deleting or
updating some of its information.
DBMS: Takes care of concurrent access using some form of locking.
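Support for multiple views can be illustrated with SQL views, here run through Python's standard `sqlite3` module. The `grade_report` table, its rows, and the two view names are hypothetical. Each view gives one group of users its own perspective on the same stored data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report ("
    " student_number INTEGER, name TEXT, course_number TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?, ?, ?)",
    [
        (17, "Smith", "CS1310", "C"),
        (17, "Smith", "MATH2410", "B"),
        (8, "Brown", "CS1310", "A"),
    ],
)

# View for users who print transcripts: names, courses, and grades.
conn.execute(
    "CREATE VIEW transcript AS"
    " SELECT name, course_number, grade FROM grade_report"
)

# View for users who plan course capacity: head counts only, no grades.
conn.execute(
    "CREATE VIEW course_enrollment AS"
    " SELECT course_number, COUNT(*) AS enrolled"
    " FROM grade_report GROUP BY course_number"
)

enrollment = conn.execute(
    "SELECT * FROM course_enrollment ORDER BY course_number"
).fetchall()
print(enrollment)  # [('CS1310', 2), ('MATH2410', 1)]
```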
Database Users
• Database administrators: responsible for authorizing access to the database, for coordinating and monitoring its use,
for acquiring software and hardware resources, and for monitoring the efficiency of operations.
• Database designers: responsible for defining the content, the structure, the constraints, and the functions or transactions
against the database. They must communicate with the end-users and understand their needs.
• End-users: they use the data for queries and reports, and some of them actually update the database content.
• System analysts: determine the requirements of end-users, especially naive and parametric end-users, and develop
specifications that meet these requirements.
• Application programmers: implement these specifications as programs, then test, debug, document, and maintain
these programs.
Categories of End-users
Casual: They access the database occasionally, when needed, and use query languages.
Naive or Parametric : They use previously well-defined functions in the form of “canned
transactions” against the database. Examples are bank-tellers or reservation clerks who do
this activity for an entire shift of operations.
Database Users
Workers behind the scene
Advantages of Using the Database Approach
⮚Controlling redundancy in data storage and in development and maintenance efforts.
• In file processing, every user group maintains its own files for handling its data-processing applications, so much
of the data is repeated (redundant).
• This redundancy in storing the same data multiple times leads to several problems. With a DBMS, any redundancy
that is kept can be controlled.
• Provide a security and authorization subsystem, which the DBA uses to create accounts
•The values of program variables or objects are discarded once a program terminates, unless the
programmer explicitly stores them in permanent files
•Such an object is said to be persistent, since it survives the termination of program execution
• Provide specialized data structures and search techniques to speed up query execution.
• The query processing and optimization module of the DBMS is responsible for choosing an efficient query
execution plan for each query based on the existing storage structures.
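To see a plan being chosen from the existing storage structures, the sketch below asks SQLite (through Python's `sqlite3` module) to explain the same query before and after an index is created. The `student` table, the index name, and the helper `plan` are hypothetical, and the exact wording of the plan text varies between SQLite versions, so no exact output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(i, f"name{i}", "CS") for i in range(1000)],
)

query = "SELECT student_number, major FROM student WHERE name = ?"

def plan(sql):
    """Return the plan descriptions SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("name42",)).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

before = plan(query)  # no index yet: expect a full table scan
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after = plan(query)   # the optimizer can now search through the index

print(before)
print(after)
```

The query text is identical in both calls; only the storage structures changed, and the optimizer picked a different plan.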
⮚Providing backup and recovery services
• The backup and recovery subsystem provides facilities for recovering from hardware or software failures.
• If the computer system fails in the middle of a complex update transaction, the database must be restored to
the state it was in before the transaction started.
• Disk backup
• Ex: Query languages for casual users, Programming language interfaces for application
programmers
⮚Representing complex relationships among data
• Ex: Brown
• Ex: (i) Specifying a data type for each data item, USN character(25)
• Deductive database systems Ex: Determining Year Back students based on university rules
• Active database/Triggers(a form of a rule activated by updates to the table) with tables Ex: Check if manager’s
Data Models
•Data Model: A set of concepts to describe the structure of a database, and certain constraints that the
database should obey.
• Data Model Operations: Operations for specifying database retrievals and updates by referring to the
concepts of the data model. Operations on the data model may include basic operations and user-
defined operations
•Categories of data models
i. Conceptual (high-level, semantic) data models
Provide concepts that are close to the way many users perceive data. (Also called entity-based or
object-based data models.)
ii. Physical (low-level, internal) data models
Provide concepts that describe details of how data is stored in the computer
iii. Implementation (representational) data models
Provide concepts that fall between the above two, balancing user views with some computer
storage details.
Schemas, Instances and Database State
•Database Schema
The description of a database. Includes descriptions of the database structure and the constraints that
should hold on the database.
•Schema Diagram
A diagrammatic display of (some aspects of) a database schema.
•Schema Construct
A component of the schema or an object within the schema, e.g., STUDENT, COURSE.
•Database State
The actual data stored in a database at a particular moment in time. Also called snapshot or instance or
occurrence.
Three-Schema Architecture
◦ Internal schema at the internal level to describe physical storage structures and access
paths. Typically uses a physical data model.
◦ Conceptual schema at the conceptual level to describe the structure and constraints for the
whole database for a community of users. Uses a conceptual or an implementation data
model.
◦ External schemas at the external level to describe the various user views. Usually uses the
same data model as the conceptual level.
◦ Mappings among schema levels are needed to transform requests and data.
Programs refer to an external schema, and are mapped by the DBMS to the internal
schema for execution.
Data Independence
The capacity to change the schema at one level of a database system without
having to change the system at the next higher level.
◦ Physical Data Independence: The capacity to change the internal schema without having
to change the conceptual schema.
When a schema at a lower level is changed, only the mappings between this
schema and higher-level schemas need to be changed in a DBMS that fully
supports data independence.
Hence, the application programs need not be changed since they refer to the external schemas.
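A rough analogy for this idea, using Python's `sqlite3` module: the application function below is written only against a view, the stored tables underneath are then reorganized, and only the view definition (the mapping) is rewritten. All table, view, and function names are hypothetical, and a SQL view stands in for an external schema here. It is a sketch of the principle, not of how a full three-schema DBMS is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def names_by_major(major):
    """Application code: refers only to the view, never to stored tables."""
    rows = conn.execute(
        "SELECT name FROM student_view WHERE major = ? ORDER BY name", (major,)
    )
    return [r[0] for r in rows]

# Original lower-level schema and its mapping to the view.
conn.execute("CREATE TABLE student (name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Smith", "CS"), ("Brown", "CS"), ("Lee", "MATH")],
)
conn.execute("CREATE VIEW student_view AS SELECT name, major FROM student")
before = names_by_major("CS")

# Reorganize the lower level: major codes move into their own table.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, code TEXT)")
conn.execute("INSERT INTO department (code) SELECT DISTINCT major FROM student")
conn.execute("CREATE TABLE student2 (name TEXT, dept_id INTEGER)")
conn.execute(
    "INSERT INTO student2"
    " SELECT s.name, d.dept_id FROM student s JOIN department d ON d.code = s.major"
)

# Only the mapping changes; names_by_major is untouched.
conn.execute("DROP VIEW student_view")
conn.execute("DROP TABLE student")
conn.execute(
    "CREATE VIEW student_view AS"
    " SELECT s.name, d.code AS major"
    " FROM student2 s JOIN department d ON d.dept_id = s.dept_id"
)
after = names_by_major("CS")

print(before, after)  # ['Brown', 'Smith'] ['Brown', 'Smith']
```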
Additional Implications of Using the Database Approach
Potential for enforcing standards:
◦ This is crucial for the success of database applications in large organizations. Standards refer to data item names,
display formats, screens, report structures, meta-data (description of data), etc.
Economies of scale:
◦ By consolidating data and applications across departments wasteful overlap of resources and personnel can be avoided.
Brief History of Database Applications
Early Database Applications: The Hierarchical and Network Models were
introduced in mid 1960’s and dominated during the seventies. A bulk of the
worldwide database processing still occurs using these models.
Relational Model based Systems: The model that was originally introduced in
1970 was heavily researched and experimented with in IBM and the universities.
Relational DBMS Products emerged in the 1980’s.
Data on the Web and E-commerce Applications: Web contains data in HTML
(Hypertext markup language) with links among pages. This has given rise to a
new set of applications, and E-commerce is using new standards like XML
(eXtensible Markup Language).
Extending Database Capabilities
New functionality is being added to DBMSs in the
following areas:
◦ Scientific Applications
◦ Image Storage and Management
◦ Audio and Video data management
◦ Data Mining
◦ Spatial data management
◦ Time Series and Historical Data Management
The above gives rise to new research and development in incorporating new
data types, complex data structures, new operations and storage and
indexing schemes in database systems.
When not to use a DBMS
Main inhibitors (costs) of using a DBMS:
◦ High initial investment and possible need for additional hardware.
◦ Overhead for providing generality, security, concurrency control, recovery, and integrity
functions.
◦ If the database users need special operations not supported by the DBMS.
DBMS Languages
•Data Definition Language (DDL): Used by the DBA and database designers to specify the conceptual
schema of a database. In many DBMSs, the DDL is also used to define internal and external schemas
(views). In some DBMSs, separate storage definition language (SDL) and view definition language
(VDL) are used to define internal and external schemas.
•Data Manipulation Language (DML): Used to specify database retrievals and updates.
Low Level or Procedural DML: record-at-a-time; they specify how to retrieve data and include
constructs such as looping.
DBMS Interfaces
Types of Attributes
(i) Composite versus Simple (Atomic) Attributes
(ii) Single-Valued versus Multivalued Attributes
(iii) Stored versus Derived Attributes
(iv) Complex Attributes
◦ For example, {Address_phone({Phone(Area_Code, Phone_Number)}, Address(Street_Address(Number, Street, Apartment_Number), City, Zip_Code))}
(v) Null Values
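One way to picture these attribute types is as fields of a Python dataclass. The `Employee` entity, its fields, and the sample values below are hypothetical and serve only to put one attribute of each kind side by side.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Set

@dataclass
class Name:
    """Composite attribute: made of simpler component attributes."""
    first: str
    last: str

@dataclass
class Employee:
    ssn: str                    # simple (atomic), single-valued
    name: Name                  # composite
    birth_date: date            # stored
    phones: Set[str] = field(default_factory=set)   # multivalued
    apartment_number: Optional[str] = None          # may be null (not applicable or unknown)

    def age(self, today: date) -> int:
        """Derived attribute: computed from the stored birth_date."""
        before_birthday = (today.month, today.day) < (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - int(before_birthday)

e = Employee("123456789", Name("John", "Smith"), date(1965, 1, 9), {"555-1234", "555-9876"})
print(e.age(date(2024, 1, 8)))   # 58, the day before the birthday
print(e.age(date(2024, 1, 9)))   # 59
print(e.apartment_number)        # None
```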
Entity Types
Each entity type in the database is described by its name and attributes.
An entity type describes the schema or intension for a set of entities that share the same
structure.
An entity type is represented in ER diagrams as a rectangular box enclosing the entity type name.
Entity Set
The collection of entities of a particular entity type is grouped into an entity set.
Entity type: CAR
Registration(RegistrationNumber, State), VehicleID, Make, Model, Year, (Color)
Entity set:
car1: ((ABC 123, TEXAS), TK629, Ford Mustang, convertible, 1999, (red, black))
car2: ((ABC 123, NEW YORK), WP9872, Nissan 300ZX, 2-door, 2002, (blue))
car3: ((VSY 720, TEXAS), TD729, Buick LeSabre, 4-door, 2003, (white, blue))
...
Key Attribute of an Entity Type
An entity type usually has an attribute whose values are distinct for each individual entity in the
entity set.
Such an attribute is called a key attribute, and its values can be used to identify each entity
uniquely.
E.g., no two companies will have the same name, so the Name attribute is the key attribute of
COMPANY.
In ER diagrammatic notation, each key attribute has its name underlined inside the oval.
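The uniqueness idea can be checked mechanically on a given entity set. The sketch below uses the three CAR entities shown in these notes; the function name `is_key` and the dictionary layout are assumptions made for illustration. Note that a key is a constraint on every possible entity set, so a check on one set can refute a candidate key but cannot prove it.

```python
def is_key(entities, attributes):
    """True if no two entities share the same values for `attributes`."""
    values = [tuple(entity[a] for a in attributes) for entity in entities]
    return len(values) == len(set(values))

cars = [
    {"RegistrationNumber": "ABC 123", "State": "TEXAS", "VehicleID": "TK629", "Year": 1999},
    {"RegistrationNumber": "ABC 123", "State": "NEW YORK", "VehicleID": "WP9872", "Year": 2002},
    {"RegistrationNumber": "VSY 720", "State": "TEXAS", "VehicleID": "TD729", "Year": 2003},
]

print(is_key(cars, ("VehicleID",)))                    # True
print(is_key(cars, ("RegistrationNumber",)))           # False: ABC 123 appears twice
print(is_key(cars, ("RegistrationNumber", "State")))   # True: the composite Registration
```

Year also happens to be unique in this small set, which shows why the check alone is not enough: nothing in the mini-world prevents two cars from sharing a year.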
Value Sets (Domains) of Attributes
•Each simple attribute of an entity type is associated with a value set (or domain of values), which
specifies the set of values that may be assigned to that attribute for each individual entity.
•Value sets are typically specified using the basic data types.
•Ex: If the range of ages allowed for employees is between 16 and 70, we can specify the value set of the
Age attribute of EMPLOYEE to be the set of integer numbers between 16 and 70.
•We can specify the value set for the Name attribute to be the set of strings of alphabetic characters
separated by blank characters, and so on.
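The two value sets described above can be written as membership tests. This is a minimal sketch; the function names are hypothetical, and "alphabetic" is assumed to mean the letters A to Z in either case.

```python
import re

def in_age_domain(value):
    """Age value set: the integers from 16 to 70 inclusive."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 16 <= value <= 70

# One or more alphabetic words separated by single blanks.
NAME_PATTERN = re.compile(r"[A-Za-z]+( [A-Za-z]+)*")

def in_name_domain(value):
    """Name value set: alphabetic strings separated by blank characters."""
    return isinstance(value, str) and NAME_PATTERN.fullmatch(value) is not None

print(in_age_domain(16), in_age_domain(70), in_age_domain(71))   # True True False
print(in_name_domain("John Smith"), in_name_domain("R2D2"))      # True False
```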
Relationship Types, Relationship Sets, and Instances
A relationship type R among n entity types E1, E2, …, En defines a set of associations (a
relationship set) among entities from these entity types.
Mathematically, the relationship set R is a set of relationship instances ri, where each ri
associates n individual entities (e1, e2, …, en) and each entity ej in ri is a member of entity set Ej, 1 ≤ j ≤ n.
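The definition says, in effect, that a relationship set is a subset of the Cartesian product E1 × E2 × … × En. A small sketch with Python sets makes this concrete; the entity names and the function `is_relationship_set` are hypothetical.

```python
from itertools import product

def is_relationship_set(instances, *entity_sets):
    """Check that every instance (e1, ..., en) has ej in Ej for each j."""
    return all(
        len(r) == len(entity_sets)
        and all(e in E for e, E in zip(r, entity_sets))
        for r in instances
    )

EMPLOYEE = {"e1", "e2", "e3"}
DEPARTMENT = {"d1", "d2"}
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}

print(is_relationship_set(WORKS_FOR, EMPLOYEE, DEPARTMENT))       # True
print(WORKS_FOR <= set(product(EMPLOYEE, DEPARTMENT)))            # True, same condition
print(is_relationship_set({("e1", "p9")}, EMPLOYEE, DEPARTMENT))  # False: p9 is not a department
```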
Example 1
[Figure: relationship instances r1 to r7, each linking one of the entities e1 to e7 with one of the entities d1 to d3.]
Example 2: EMPLOYEE WORKS_ON PROJECT
[Figure: relationship instances r1 to r9 linking the employee entities e1 to e7 with the project entities p1 to p3.]
Relationship Degree
The degree of a relationship type is the number of participating entity types.
A relationship type of degree two is called binary, and one of degree three is called ternary.
Role Names and Recursive Relationships
•Each entity type that participates in a relationship type plays a particular role in the relationship.
•The role name signifies the role that a participating entity from the entity type plays in each
relationship instance, and helps to explain what the relationship means.
•Some entity types participate more than once in a relationship type, in different roles. Such
relationship types are called recursive relationships.
Example: EMPLOYEE SUPERVISION
[Figure: relationship instances r1 to r6 among the employee entities e1 to e7. In each instance, one employee plays
role 1 (supervisor) and another plays role 2 (supervisee, or subordinate).]
Constraints on Binary Relationship Types
There are two types of relationship constraints:
• Cardinality ratio:
◦ The cardinality ratio for a binary relationship specifies the maximum number of relationship instances that an entity can
participate in.
◦ The possible cardinality ratios for binary relationship types are 1:1, 1:N, N:1, and M:N.
• Participation:
◦ The participation constraint specifies whether the existence of an entity depends on its being related to another entity via the
relationship type.
◦ This constraint specifies the minimum number of relationship instances that each entity can participate in.
The cardinality ratio and participation constraints together are called the structural constraints of a relationship type.
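The four cardinality ratios can be told apart by counting how often each entity appears in a relationship set. The sketch below classifies a set of (left, right) pairs; the function name and the sample relationship sets are hypothetical. A cardinality ratio is a constraint on all valid states, so this only reports what one particular state exhibits.

```python
from collections import Counter

def cardinality_ratio(instances):
    """Classify a binary relationship set of (left, right) pairs."""
    left_counts = Counter(left for left, _ in instances)
    right_counts = Counter(right for _, right in instances)
    # Left side is "many" if some right entity is related to several left entities.
    left_many = any(c > 1 for c in right_counts.values())
    # Right side is "many" if some left entity is related to several right entities.
    right_many = any(c > 1 for c in left_counts.values())
    if left_many and right_many:
        return "M:N"
    if left_many:
        return "N:1"
    if right_many:
        return "1:N"
    return "1:1"

MANAGES = {("e2", "d1"), ("e4", "d2"), ("e7", "d3")}      # (employee, department)
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}    # (employee, department)
WORKS_ON = {("e1", "p1"), ("e1", "p2"), ("e2", "p1")}     # (employee, project)

print(cardinality_ratio(MANAGES))    # 1:1
print(cardinality_ratio(WORKS_FOR))  # N:1 (read DEPARTMENT:EMPLOYEE, it is 1:N)
print(cardinality_ratio(WORKS_ON))   # M:N
```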
1:1 Relationship
EMPLOYEE MANAGES DEPARTMENT
[Figure: relationship instances r1 to r3 pair three of the employees e1 to e7 with the departments d1 to d3, one
manager per department.]
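1:N Relationship
EMPLOYEE WORKS_FOR DEPARTMENT
[Figure: relationship instances r1 to r7 link each of the employees e1 to e7 with one of the departments d1 to d3;
a department can have many employees.]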
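M:N Relationship
EMPLOYEE WORKS_ON PROJECT
[Figure: relationship instances r1 to r9 link the employees e1 to e7 with the projects p1 to p3; an employee can work
on several projects and a project can have several employees.]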
Participation Constraint and Existence Dependence
The participation constraint specifies whether the existence of an entity depends on its being related to
another entity via the relationship type.
This constraint specifies the minimum number of relationship instances that each entity can participate in
and is sometimes called the minimum cardinality constraint.
◦ Total participation, also called existence dependency: if a company policy states that every employee must
work for a department, then an employee entity can exist only if it participates in at least one WORKS_FOR
relationship instance.
◦ Partial participation: we do not expect every employee to manage a department, so the participation of
EMPLOYEE in the MANAGES relationship type is partial, meaning that some or part of the set of employee
entities are related to some department entity via MANAGES, but not necessarily all.
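Total versus partial participation can be checked on a given state by asking whether every entity appears in at least one relationship instance. The sets and the function name below are hypothetical, and the result describes one state only, not the constraint declared in the schema.

```python
def participation(entity_set, instances, position):
    """Report whether every entity occurs at `position` in some instance."""
    participating = {r[position] for r in instances}
    return "total" if entity_set <= participating else "partial"

EMPLOYEE = {"e1", "e2", "e3"}
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}   # (employee, department)
MANAGES = {("e2", "d1")}                                  # (employee, department)

print(participation(EMPLOYEE, WORKS_FOR, 0))  # total: every employee works for a department
print(participation(EMPLOYEE, MANAGES, 0))    # partial: only e2 manages a department
```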
Total Participation (Existence Dependency)
EMPLOYEE WORKS_FOR DEPARTMENT
[Figure: every employee e1 to e7 takes part in one of the relationship instances r1 to r7, each of which links it to
one of the departments d1 to d3.]
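1:1 Relationship
EMPLOYEE MANAGES DEPARTMENT
[Figure: only three of the employees e1 to e7 take part in the relationship instances r1 to r3 with the departments
d1 to d3, so the participation of EMPLOYEE in MANAGES is partial.]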
Attributes of Relationship Types
A relationship type can have attributes; for example, HoursPerWeek of WORKS_ON; its value for
each relationship instance describes the number of hours per week that an EMPLOYEE works on
a PROJECT.
Weak Entity Types
Entity types that do not have key attributes of their own are called weak entity types.
The entity types which contain key attributes of their own are called regular entity types
or strong entity types.
Entities belonging to a weak entity type are identified by being related to specific entities
from another entity type in combination with one of their attribute values. This other
entity type is called the identifying entity type or owner entity type.
The relationship type that relates a weak entity type to its owner is called the identifying
relationship of the weak entity type.
The weak entity type normally has a partial key, which is the set of attributes that can
uniquely identify weak entities that are related to the same owner entity.
Example:
Suppose that a DEPENDENT entity is identified by the dependent’s first name and birth date,
and the specific EMPLOYEE that the dependent is related to.
DEPENDENT is a weak entity type with EMPLOYEE as its identifying entity type via the identifying
relationship type DEPENDENT_OF.
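In a relational implementation, a weak entity type is commonly represented by a table whose primary key combines the owner's key with the partial key. The sketch below does this for DEPENDENT using Python's `sqlite3` module. The column names, SSN values, and sample rows are hypothetical, and the `ON DELETE CASCADE` clause is one possible way to express that a dependent cannot exist without its owner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE dependent ("
    " employee_ssn TEXT NOT NULL REFERENCES employee(ssn) ON DELETE CASCADE,"
    " first_name TEXT NOT NULL,"     # partial key, part 1
    " birth_date TEXT NOT NULL,"     # partial key, part 2
    " PRIMARY KEY (employee_ssn, first_name, birth_date))"
)
conn.executemany("INSERT INTO employee VALUES (?, ?)", [("111", "Smith"), ("222", "Wong")])

def try_insert(row):
    """Insert a dependent; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO dependent VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

same_partial_key = [
    try_insert(("111", "Alice", "2010-04-05")),
    try_insert(("222", "Alice", "2010-04-05")),   # different owner: allowed
]
duplicate_ok = try_insert(("111", "Alice", "2010-04-05"))  # same owner, same partial key
orphan_ok = try_insert(("999", "Bob", "2012-01-01"))       # no such owner

conn.execute("DELETE FROM employee WHERE ssn = '111'")     # owner removed
remaining = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]

print(same_partial_key, duplicate_ok, orphan_ok, remaining)  # [True, True] False False 1
```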
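Notations Used in ER Diagrams
[Table of ER diagram symbols and their meanings. The meanings listed are:]
◦ ENTITY TYPE
◦ RELATIONSHIP TYPE
◦ ATTRIBUTE
◦ KEY ATTRIBUTE
◦ MULTIVALUED ATTRIBUTE
◦ COMPOSITE ATTRIBUTE
◦ DERIVED ATTRIBUTE
◦ TOTAL PARTICIPATION OF E2 IN R
◦ STRUCTURAL CONSTRAINT (min, max) ON PARTICIPATION OF E IN R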
Proper Naming of Schema Constructs
Use singular names for entity types, rather than plural ones.
Another naming consideration involves choosing the binary relationship names to make the ER
diagram of the schema readable from left to right and top to bottom.
E – R Diagram for COMPANY
Alternative Notations for ER Diagrams