1. KONGUNADU COLLEGE OF ENGINEERING AND TECHNOLOGY
(AUTONOMOUS)
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND DATA SCIENCE
20CS402 - DATABASE MANAGEMENT SYSTEMS
UNIT 1 – RELATIONAL DATABASES
2. Database Applications Examples
Enterprise Information
• Sales: customers, products, purchases
• Accounting: payments, receipts, assets
• Human Resources: Information about employees, salaries, payroll
taxes.
Manufacturing: management of production, inventory, orders, supply
chain.
Banking and finance
• customer information, accounts, loans, and banking transactions.
• Credit card transactions
• Finance: sales and purchases of financial instruments (e.g., stocks
and bonds); storing real-time market data
Universities: registration, grades
3. Database Applications Examples (Cont.)
Airlines: reservations, schedules
Telecommunication: records of calls, texts, and data usage, generating
monthly bills, maintaining balances on prepaid calling cards
Web-based services
• Online retailers: order tracking, customized recommendations
• Online advertisements
Document databases
Navigation systems: For maintaining the locations of various places of
interest along with the exact routes of roads, train systems, buses, etc.
4. Purpose of Database Systems
In the early days, database applications were built directly on top of file
systems, which leads to:
Data redundancy and inconsistency: data is stored in multiple file
formats, resulting in duplication of information in different files
Difficulty in accessing data
• Need to write a new program to carry out each new task
Data isolation
• Multiple files and formats
Integrity problems
• Integrity constraints (e.g., account balance > 0) become “buried”
in program code rather than being stated explicitly
• Hard to add new constraints or change existing ones
5. Purpose of Database Systems (Cont.)
Atomicity of updates
• Failures may leave database in an inconsistent state with partial
updates carried out
• Example: Transfer of funds from one account to another should either
complete or not happen at all
Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
Ex: Two people reading a balance (say 100) and updating it by
withdrawing money (say 50 each) at the same time
Security problems
• Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
6. Data Models
A collection of tools for describing
• Data
• Data relationships
• Data semantics
• Data constraints
Relational model
Entity-Relationship data model (mainly for database design)
Object-based data models (Object-oriented and Object-relational)
Semi-structured data model (XML)
Other older models:
• Network model
• Hierarchical model
7. Relational Model
All the data is stored in various tables.
Example of tabular data in the relational model: each table has named
columns (attributes) and rows (tuples).
[Figure: sample table; photo of Ted Codd, who received the Turing Award in 1981
for the relational model]
10. Instances and Schemas
Similar to types and variables in programming languages
Logical Schema – the overall logical structure of the database
• Example: The database consists of information about a set of
customers and accounts in a bank and the relationship between them
Analogous to type information of a variable in a program
Physical schema – the overall physical structure of the database
Instance – the actual content of the database at a particular point in time
• Analogous to the value of a variable
11. Physical Data Independence
Physical Data Independence – the ability to modify the physical
schema without changing the logical schema
• Applications depend on the logical schema
• In general, the interfaces between the various levels and
components should be well defined so that changes in some parts
do not seriously influence others.
12. Data Definition Language (DDL)
Specification notation for defining the database schema
Example: create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
DDL compiler generates a set of table templates stored in a data
dictionary
Data dictionary contains metadata (i.e., data about data)
• Database schema
• Integrity constraints
Primary key (ID uniquely identifies instructors)
• Authorization
Who can access what
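As an illustrative sketch (extending the slide's create table example, not part of the original notes), the DDL can also state the integrity constraints and authorization that end up in the data dictionary. The department table follows the university schema used later in these notes; the user name registrar_app is hypothetical:

    -- referenced relation
    create table department (
        dept_name varchar(20),
        building varchar(15),
        budget numeric(12,2),
        primary key (dept_name));

    -- referencing relation with primary-key and foreign-key constraints
    create table instructor (
        ID char(5),
        name varchar(20) not null,
        dept_name varchar(20),
        salary numeric(8,2),
        primary key (ID),
        foreign key (dept_name) references department (dept_name));

    -- authorization: who can access what (user name is hypothetical)
    grant select on instructor to registrar_app;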
13. Data Manipulation Language (DML)
Language for accessing and updating the data organized by the
appropriate data model
• DML also known as query language
There are basically two types of data-manipulation languages:
• Procedural DML -- require a user to specify what data are needed
and how to get those data.
• Declarative DML -- require a user to specify what data are needed
without specifying how to get those data.
Declarative DMLs are usually easier to learn and use than are procedural
DMLs.
Declarative DMLs are also referred to as non-procedural DMLs
The portion of a DML that involves information retrieval is called a query
language.
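For illustration only (a sketch, not taken from the slides), typical SQL DML statements over the instructor table look like this; the values are made up:

    -- insert a new tuple
    insert into instructor values ('10211', 'Smith', 'Biology', 66000);

    -- update existing tuples
    update instructor
    set salary = salary * 1.05
    where dept_name = 'Comp. Sci.';

    -- delete tuples
    delete from instructor where ID = '10211';

    -- the retrieval portion of the DML: a query
    select name, salary from instructor where salary > 70000;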
14. SQL Query Language
SQL query language is nonprocedural. A query takes as input several
tables (possibly only one) and always returns a single table.
Example to find all instructors in Comp. Sci. dept
select name
from instructor
where dept_name = 'Comp. Sci.'
SQL is NOT a Turing machine equivalent language
To be able to compute complex functions SQL is usually embedded in
some higher-level language
Application programs generally access databases through one of
• Language extensions to allow embedded SQL
• Application program interface (e.g., ODBC/JDBC) which allow SQL
queries to be sent to a database
15. Database Access from Application Program
Non-procedural query languages such as SQL are not as powerful as a
universal Turing machine.
SQL does not support actions such as input from users, output to
displays, or communication over the network.
Such computations and actions must be written in a host language, such
as C/C++, Java or Python, with embedded SQL queries that access the
data in the database.
Application programs -- are programs that are used to interact with the
database in this fashion.
16. Database Design
The process of designing the general structure of the database:
Logical Design – Deciding on the database schema. Database design
requires that we find a “good” collection of relation schemas.
• Business decision – What attributes should we record in the
database?
• Computer Science decision – What relation schemas should we
have and how should the attributes be distributed among the
various relation schemas?
Physical Design – Deciding on the physical layout of the database
17. Database Engine
A database system is partitioned into modules that deal with each of the
responsibilities of the overall system.
The functional components of a database system can be divided into
• The storage manager,
• The query processor component,
• The transaction management component.
18. Storage Manager
A program module that provides the interface between the low-level data
stored in the database and the application programs and queries
submitted to the system.
The storage manager is responsible for the following tasks:
• Interaction with the OS file manager
• Efficient storing, retrieving and updating of data
The storage manager components include:
• Authorization and integrity manager
• Transaction manager
• File manager
• Buffer manager
19. Storage Manager (Cont.)
The storage manager implements several data structures as part of the
physical system implementation:
• Data files -- store the database itself
• Data dictionary -- stores metadata about the structure of the
database, in particular the schema of the database.
• Indices -- can provide fast access to data items. A database index
provides pointers to those data items that hold a particular value.
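As a hedged example (not from the slides), an index over a frequently queried attribute can be declared in SQL; the index name is assumed:

    -- index giving fast access to instructor tuples by department
    create index inst_dept_index on instructor (dept_name);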
20. Query Processor
The query processor components include:
• DDL interpreter -- interprets DDL statements and records the
definitions in the data dictionary.
• DML compiler -- translates DML statements in a query language into
an evaluation plan consisting of low-level instructions that the query
evaluation engine understands.
The DML compiler performs query optimization; that is, it picks
the lowest cost evaluation plan from among the various
alternatives.
• Query evaluation engine -- executes low-level instructions generated
by the DML compiler.
22. Transaction Management
A transaction is a collection of operations that performs a single logical
function in a database application
Transaction-management component ensures that the database
remains in a consistent (correct) state despite system failures (e.g.,
power failures and operating system crashes) and transaction failures.
Concurrency-control manager controls the interaction among the
concurrent transactions, to ensure the consistency of the database.
23. Database Architecture
Centralized databases
• One to a few cores, shared memory
Client-server,
• One server machine executes work on behalf of multiple client
machines.
Parallel databases
• Many core shared memory
• Shared disk
• Shared nothing
Distributed databases
• Geographical distribution
• Schema/data heterogeneity
25. Database Applications
Database applications are usually partitioned into two or three parts:
Two-tier architecture -- the application resides at the client machine,
where it invokes database system functionality at the server machine
Three-tier architecture -- the client machine acts as a front end and
does not contain any direct database calls.
• The client end communicates with an application server, usually
through a forms interface.
• The application server in turn communicates with a database
system to access data.
28. Database Administrator
A person who has central control over the system is called a database
administrator (DBA). Functions of a DBA include:
Schema definition
Storage structure and access-method definition
Schema and physical-organization modification
Granting of authorization for data access
Routine maintenance
Periodically backing up the database
Ensuring that enough free disk space is available for normal
operations, and upgrading disk space as required
Monitoring jobs running on the database
29. History of Database Systems
1950s and early 1960s:
• Data processing using magnetic tapes for storage
Tapes provided only sequential access
• Punched cards for input
Late 1960s and 1970s:
• Hard disks allowed direct access to data
• Network and hierarchical data models in widespread use
• Ted Codd defines the relational data model
Would win the ACM Turing Award for this work
IBM Research begins System R prototype
UC Berkeley (Michael Stonebraker) begins Ingres prototype
Oracle releases first commercial relational database
• High-performance (for the era) transaction processing
30. History of Database Systems (Cont.)
1980s:
• Research relational prototypes evolve into commercial systems
SQL becomes industrial standard
• Parallel and distributed database systems
Wisconsin, IBM, Teradata
• Object-oriented database systems
1990s:
• Large decision support and data-mining applications
• Large multi-terabyte data warehouses
• Emergence of Web commerce
31. History of Database Systems (Cont.)
2000s
• Big data storage systems
Google BigTable, Yahoo! PNUTS, Amazon, and other
“NoSQL” systems.
• Big data analysis: beyond SQL
Map reduce and friends
2010s
• SQL reloaded
SQL front end to Map Reduce systems
Massively parallel database systems
Multi-core main-memory databases
32. Outline
Structure of Relational Databases
Database Schema
Keys
Schema Diagrams
Relational Query Languages
The Relational Algebra
33. Example of an Instructor Relation
[Figure: the instructor table, with its attributes (columns) and tuples (rows) labeled]
34. Relation Schema and Instance
A1, A2, …, An are attributes
R = (A1, A2, …, An ) is a relation schema
Example:
instructor = (ID, name, dept_name, salary)
A relation instance r defined over schema R is denoted by r (R).
The current values of a relation are specified by a table
An element t of relation r is called a tuple and is represented by
a row in a table
35. Attributes
The set of allowed values for each attribute is called the domain of the
attribute
Attribute values are (normally) required to be atomic; that is, indivisible
The special value null is a member of every domain, indicating that the
value is “unknown”
The null value causes complications in the definition of many operations
36. Relations are Unordered
Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
Example: instructor relation with unordered tuples
37. Database Schema
Database schema -- is the logical structure of the database.
Database instance -- is a snapshot of the data in the database at a given
instant in time.
Example:
• schema: instructor (ID, name, dept_name, salary)
• Instance:
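The instance itself appears as a table of rows on the slide. As a small assumed sketch, the schema is what the create table statement declares, while an instance is whatever rows happen to be stored at a given instant (the values below are illustrative):

    -- schema: instructor (ID, name, dept_name, salary)
    create table instructor (
        ID char(5),
        name varchar(20),
        dept_name varchar(20),
        salary numeric(8,2));

    -- one possible instance: the rows stored at this moment
    insert into instructor values ('22222', 'Einstein', 'Physics', 95000);
    insert into instructor values ('10101', 'Srinivasan', 'Comp. Sci.', 65000);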
38. Keys
Let K ⊆ R
K is a superkey of R if values for K are sufficient to identify a unique tuple
of each possible relation r(R)
• Example: {ID} and {ID,name} are both superkeys of instructor.
Superkey K is a candidate key if K is minimal
Example: {ID} is a candidate key for Instructor
One of the candidate keys is selected to be the primary key.
• Which one?
Foreign key constraint: Value in one relation must appear in another
• Referencing relation
• Referenced relation
• Example: dept_name in instructor is a foreign key from instructor
referencing department
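In SQL (a sketch assuming the instructor and department tables of the university schema), the chosen primary key and the foreign-key constraint are declared as follows; the constraint name is made up:

    -- {ID} chosen as the primary key of instructor
    alter table instructor add primary key (ID);

    -- dept_name in instructor (referencing relation) must appear in
    -- department (referenced relation)
    alter table instructor
        add constraint instructor_dept_fk
        foreign key (dept_name) references department (dept_name);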
40. Relational Query Languages
Procedural versus non-procedural, or declarative
“Pure” languages:
• Relational algebra
• Tuple relational calculus
• Domain relational calculus
The above 3 pure languages are equivalent in computing power
We will concentrate in this chapter on relational algebra
• Not Turing-machine equivalent
• Consists of 6 basic operations
41. Relational Algebra
A procedural language consisting of a set of operations that take one or
two relations as input and produce a new relation as their result.
Six basic operators
• select: σ
• project: Π
• union: ∪
• set difference: –
• Cartesian product: ×
• rename: ρ
42. Select Operation
The select operation selects tuples that satisfy a given predicate.
Notation: σp(r)
p is called the selection predicate
Example: select those tuples of the instructor relation where the
instructor is in the “Physics” department.
• Query
σdept_name=“Physics” (instructor)
• Result
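The result appears as a table on the slide. An equivalent SQL query, given purely as an illustration of the same selection:

    select *
    from instructor
    where dept_name = 'Physics';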
43. Select Operation (Cont.)
We allow comparisons using
=, ≠, >, ≥, <, ≤
in the selection predicate.
We can combine several predicates into a larger predicate by using the
connectives:
∧ (and), ∨ (or), ¬ (not)
Example: to find the instructors in Physics with a salary greater than
$90,000, we write:
σdept_name=“Physics” ∧ salary>90,000 (instructor)
The select predicate may include comparisons between two attributes.
• Example, find all departments whose name is the same as their
building name:
• σdept_name=building (department)
44. Project Operation
A unary operation that returns its argument relation, with certain attributes
left out.
Notation:
ΠA1, A2, …, Ak (r)
where A1, A2, …, Ak are attribute names and r is a relation name.
The result is defined as the relation of k columns obtained by erasing the
columns that are not listed
Duplicate rows removed from result, since relations are sets
45. Project Operation Example
Example: eliminate the dept_name attribute of instructor
Query:
ΠID, name, salary (instructor)
Result:
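An SQL counterpart, shown only for comparison; distinct is needed because SQL keeps duplicate rows unless told otherwise, whereas the algebra treats relations as sets:

    select distinct ID, name, salary
    from instructor;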
46. Composition of Relational Operations
The result of a relational-algebra operation is a relation, and therefore
relational-algebra operations can be composed together into a
relational-algebra expression.
Consider the query -- Find the names of all instructors in the Physics
department.
Πname (σdept_name=“Physics” (instructor))
Instead of giving the name of a relation as the argument of the projection
operation, we give an expression that evaluates to a relation.
47. Cartesian-Product Operation
The Cartesian-product operation (denoted by X) allows us to combine
information from any two relations.
Example: the Cartesian product of the relations instructor and teaches is
written as:
instructor X teaches
We construct a tuple of the result out of each possible pair of tuples: one
from the instructor relation and one from the teaches relation (see next
slide)
Since the instructor ID appears in both relations, we distinguish between
these attributes by attaching to the attribute the name of the relation from
which the attribute originally came.
• instructor.ID
• teaches.ID
49. Join Operation
The Cartesian-Product
instructor X teaches
associates every tuple of instructor with every tuple of teaches.
• Most of the resulting rows have information about instructors who did
NOT teach a particular course.
To get only those tuples of “instructor X teaches “ that pertain to
instructors and the courses that they taught, we write:
σinstructor.id = teaches.id (instructor × teaches)
• We get only those tuples of “instructor X teaches” that pertain to
instructors and the courses that they taught.
The result of this expression is shown in the next slide.
50. Join Operation (Cont.)
The table corresponding to:
σinstructor.id = teaches.id (instructor × teaches)
51. Join Operation (Cont.)
The join operation allows us to combine a select operation and a
Cartesian-Product operation into a single operation.
Consider relations r (R) and s (S)
Let θ be a predicate on attributes in the schema R ∪ S. The
join operation r ⋈θ s is defined as follows:
r ⋈θ s = σθ (r × s)
Thus
σinstructor.id = teaches.id (instructor × teaches)
can equivalently be written as
instructor ⋈ instructor.id = teaches.id teaches
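An illustrative SQL formulation of the same join (not part of the algebra itself):

    select *
    from instructor join teaches
        on instructor.ID = teaches.ID;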
52. Union Operation
The union operation allows us to combine two relations
Notation: r ∪ s
For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd
column of r deals with the same type of values as does the
2nd column of s)
Example: to find all courses taught in the Fall 2017 semester, or in the
Spring 2018 semester, or in both
Πcourse_id (σsemester=“Fall” ∧ year=2017 (section)) ∪
Πcourse_id (σsemester=“Spring” ∧ year=2018 (section))
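The same query written in SQL, as an illustration (the section table is assumed to have course_id, semester, and year attributes, as in the expression above):

    select course_id from section where semester = 'Fall' and year = 2017
    union
    select course_id from section where semester = 'Spring' and year = 2018;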
53. Union Operation (Cont.)
Result of:
Πcourse_id (σsemester=“Fall” ∧ year=2017 (section)) ∪
Πcourse_id (σsemester=“Spring” ∧ year=2018 (section))
54. Set-Intersection Operation
The set-intersection operation allows us to find tuples that are in both
the input relations.
Notation: r ∩ s
Assume:
• r, s have the same arity
• attributes of r and s are compatible
Example: Find the set of all courses taught in both the Fall 2017 and the
Spring 2018 semesters.
Πcourse_id (σsemester=“Fall” ∧ year=2017 (section)) ∩
Πcourse_id (σsemester=“Spring” ∧ year=2018 (section))
• Result
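An illustrative SQL version; note that a few database systems do not support the intersect keyword:

    select course_id from section where semester = 'Fall' and year = 2017
    intersect
    select course_id from section where semester = 'Spring' and year = 2018;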
55. Set Difference Operation
The set-difference operation allows us to find tuples that are in one relation
but are not in another.
Notation r – s
Set differences must be taken between compatible relations.
• r and s must have the same arity
• attribute domains of r and s must be compatible
Example: to find all courses taught in the Fall 2017 semester, but not in the
Spring 2018 semester
Πcourse_id (σsemester=“Fall” ∧ year=2017 (section)) −
Πcourse_id (σsemester=“Spring” ∧ year=2018 (section))
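An illustrative SQL version; standard SQL uses except (Oracle uses minus instead):

    select course_id from section where semester = 'Fall' and year = 2017
    except
    select course_id from section where semester = 'Spring' and year = 2018;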
56. The Assignment Operation
It is convenient at times to write a relational-algebra expression by
assigning parts of it to temporary relation variables.
The assignment operation is denoted by ← and works like assignment in
a programming language.
Example: Find all instructors in the Physics and Music departments.
Physics ← σdept_name=“Physics” (instructor)
Music ← σdept_name=“Music” (instructor)
Physics ∪ Music
With the assignment operation, a query can be written as a sequential
program consisting of a series of assignments followed by an expression
whose value is displayed as the result of the query.
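SQL has no assignment operator, but a rough analogue (a sketch only) names intermediate results with common table expressions:

    with physics as (select * from instructor where dept_name = 'Physics'),
         music as (select * from instructor where dept_name = 'Music')
    select * from physics
    union
    select * from music;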
57. The Rename Operation
The results of relational-algebra expressions do not have a name that we
can use to refer to them. The rename operator, ρ, is provided for that
purpose
The expression:
ρx (E)
returns the result of expression E under the name x
Another form of the rename operation:
ρx(A1, A2, …, An) (E)
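The closest SQL counterpart, shown only as an illustration, is aliasing a relation with as:

    select T.name
    from instructor as T
    where T.dept_name = 'Physics';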
58. Equivalent Queries
There is more than one way to write a query in relational algebra.
Example: Find instructors in the Physics department with a salary greater
than $90,000
Query 1
σdept_name=“Physics” ∧ salary>90,000 (instructor)
Query 2
σdept_name=“Physics” (σsalary>90,000 (instructor))
The two queries are not identical; they are, however, equivalent -- they
give the same result on any database.
59. Equivalent Queries
There is more than one way to write a query in relational algebra.
Example: Find information about courses taught by instructors in the
Physics department
Query 1
σdept_name=“Physics” (instructor ⋈ instructor.ID = teaches.ID teaches)
Query 2
(σdept_name=“Physics” (instructor)) ⋈ instructor.ID = teaches.ID teaches
The two queries are not identical; they are, however, equivalent -- they
give the same result on any database.
60. Entity Sets
An entity is an object that exists and is distinguishable from other
objects.
• Example: specific person, company, event, plant
An entity set is a set of entities of the same type that share the same
properties.
• Example: set of all persons, companies, trees, holidays
An entity is represented by a set of attributes; i.e., descriptive properties
possessed by all members of an entity set.
• Example:
instructor = (ID, name, salary )
course= (course_id, title, credits)
A subset of the attributes forms a primary key of the entity set; i.e.,
uniquely identifying each member of the set.
61. Representing Entity sets in ER Diagram
Entity sets can be represented graphically as follows:
• Rectangles represent entity sets.
• Attributes listed inside entity rectangle
• Underline indicates primary key attributes
62. Relationship Sets
A relationship is an association among several entities
Example:
44553 (Peltier) advisor 22222 (Einstein)
student entity relationship set instructor entity
A relationship set is a mathematical relation among n ≥ 2 entities, each
taken from entity sets
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship
• Example:
(44553, 22222) ∈ advisor
63. Relationship Sets (Cont.)
Example: we define the relationship set advisor to denote the
associations between students and the instructors who act as their
advisors.
Pictorially, we draw a line between related entities.
65. Relationship Sets (Cont.)
An attribute can also be associated with a relationship set.
For instance, the advisor relationship set between entity sets instructor
and student may have the attribute date which tracks when the student
started being associated with the advisor
[Figure: instructor entities (Crick, Katz, Srinivasan, Kim, Singh, Einstein)
linked by advisor relationships to student entities (Tanaka, Shankar, Zhang,
Brown, Aoi, Chavez, Peltier), each relationship labeled with a date attribute
such as 10 June 2007]
67. Roles
Entity sets of a relationship need not be distinct
• Each occurrence of an entity set plays a “role” in the relationship
The labels “course_id” and “prereq_id” are called roles.
68. Degree of a Relationship Set
Binary relationship
• involve two entity sets (or degree two).
• most relationship sets in a database system are binary.
Relationships between more than two entity sets are rare. Most
relationships are binary. (More on this later.)
• Example: students work on research projects under the guidance of
an instructor.
• relationship proj_guide is a ternary relationship between instructor,
student, and project
69. Non-binary Relationship Sets
Most relationship sets are binary
There are occasions when it is more convenient to represent
relationships as non-binary.
E-R Diagram with a Ternary Relationship
70. Complex Attributes
Attribute types:
• Simple and composite attributes.
• Single-valued and multivalued attributes
Example: multivalued attribute: phone_numbers
• Derived attributes
Can be computed from other attributes
Example: age, given date_of_birth
Domain – the set of permitted values for each attribute
71. Composite Attributes
Composite attributes allow us to divide attributes into subparts (other
attributes).
[Figure: composite attribute name with component attributes first_name,
middle_initial, last_name; composite attribute address with component
attributes street, city, state, postal_code, where street is further divided
into street_number, street_name, apartment_number]
73. Mapping Cardinality Constraints
Express the number of entities to which another entity can be associated
via a relationship set.
Most useful in describing binary relationship sets.
For a binary relationship set the mapping cardinality must be one of the
following types:
• One to one
• One to many
• Many to one
• Many to many
74. Mapping Cardinalities
One to one One to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
75. Mapping Cardinalities
Many to one Many to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
76. Representing Cardinality Constraints in ER Diagram
We express cardinality constraints by drawing either a directed line (→),
signifying “one,” or an undirected line (—), signifying “many,” between the
relationship set and the entity set.
One-to-one relationship between an instructor and a student:
• A student is associated with at most one instructor via the relationship
advisor
• A student is associated with at most one department via stud_dept
77. One-to-Many Relationship
one-to-many relationship between an instructor and a student
• an instructor is associated with several (including 0) students via
advisor
• a student is associated with at most one instructor via advisor,
78. Many-to-One Relationships
In a many-to-one relationship between an instructor and a student,
• an instructor is associated with at most one student via advisor,
• and a student is associated with several (including 0) instructors via
advisor
79. Many-to-Many Relationship
An instructor is associated with several (possibly 0) students via advisor
A student is associated with several (possibly 0) instructors via advisor
80. Total and Partial Participation
Total participation (indicated by double line): every entity in the entity set
participates in at least one relationship in the relationship set
participation of student in advisor relation is total
every student must have an associated instructor
Partial participation: some entities may not participate in any relationship
in the relationship set
• Example: participation of instructor in advisor is partial
81. Notation for Expressing More Complex Constraints
A line may have an associated minimum and maximum cardinality, shown
in the form l..h, where l is the minimum and h the maximum cardinality
• A minimum value of 1 indicates total participation.
• A maximum value of 1 indicates that the entity participates in at most
one relationship
• A maximum value of * indicates no limit.
Example
• Instructor can advise 0 or more students. A student must have 1
advisor; cannot have multiple advisors
82. Cardinality Constraints on Ternary Relationship
We allow at most one arrow out of a ternary (or greater degree)
relationship to indicate a cardinality constraint
For example, an arrow from proj_guide to instructor indicates each
student has at most one guide for a project
If there is more than one arrow, there are two ways of defining the
meaning.
• For example, a ternary relationship R between A, B and C with
arrows to B and C could mean
1. Each A entity is associated with a unique entity from B
and C or
2. Each pair of entities from (A, B) is associated with a
unique C entity, and each pair (A, C) is associated
with a unique B
• Each alternative has been used in different formalisms
• To avoid confusion we outlaw more than one arrow
83. Primary Key
Primary keys provide a way to specify how entities and relations are
distinguished. We will consider:
• Entity sets
• Relationship sets.
• Weak entity sets
84. Primary key for Entity Sets
By definition, individual entities are distinct.
From a database perspective, the differences among them must be
expressed in terms of their attributes.
The attribute values of an entity must be such that they can
uniquely identify the entity.
• No two entities in an entity set are allowed to have exactly the same
value for all attributes.
A key for an entity is a set of attributes that suffice to distinguish entities
from each other
85. Primary Key for Relationship Sets
To distinguish among the various relationships of a relationship set we use
the individual primary keys of the entities in the relationship set.
• Let R be a relationship set involving entity sets E1, E2, .. En
• The primary key for R consists of the union of the primary keys of
entity sets E1, E2, ..En
• If the relationship set R has attributes a1, a2, .., am associated with it,
then the primary key of R also includes the attributes a1, a2, .., am
Example: relationship set “advisor”.
• The primary key consists of instructor.ID and student.ID
The choice of the primary key for a relationship set depends on the
mapping cardinality of the relationship set.
86. Choice of Primary key for Binary Relationship
Many-to-Many relationships. The preceding union of the primary keys is a
minimal superkey and is chosen as the primary key.
One-to-Many relationships. The primary key of the “Many” side is a
minimal superkey and is used as the primary key.
Many-to-one relationships. The primary key of the “Many” side is a minimal
superkey and is used as the primary key.
One-to-one relationships. The primary key of either one of the participating
entity sets forms a minimal superkey, and either one can be chosen as the
primary key.
87. Weak Entity Sets
Consider a section entity, which is uniquely identified by a course_id,
semester, year, and sec_id.
Clearly, section entities are related to course entities. Suppose we create
a relationship set sec_course between entity sets section and course.
Note that the information in sec_course is redundant, since section
already has an attribute course_id, which identifies the course with which
the section is related.
One option to deal with this redundancy is to get rid of the relationship
sec_course; however, by doing so the relationship between section and
course becomes implicit in an attribute, which is not desirable.
88. Weak Entity Sets (Cont.)
An alternative way to deal with this redundancy is to not store the attribute
course_id in the section entity and to only store the remaining attributes
section_id, year, and semester.
• However, the entity set section then does not have enough attributes
to identify a particular section entity uniquely
To deal with this problem, we treat the relationship sec_course as a
special relationship that provides extra information, in this case, the
course_id, required to identify section entities uniquely.
A weak entity set is one whose existence is dependent on another entity,
called its identifying entity
Instead of associating a primary key with a weak entity, we use the
identifying entity, along with extra attributes, called the discriminator, to
uniquely identify a weak entity.
89. Weak Entity Sets (Cont.)
An entity set that is not a weak entity set is termed a strong entity set.
Every weak entity must be associated with an identifying entity; that is,
the weak entity set is said to be existence dependent on the identifying
entity set.
The identifying entity set is said to own the weak entity set that it
identifies.
The relationship associating the weak entity set with the identifying entity
set is called the identifying relationship.
Note that the relational schema we eventually create from the entity set
section does have the attribute course_id, for reasons that will become
clear later, even though we have dropped the attribute course_id from
the entity set section.
90. Expressing Weak Entity Sets
In E-R diagrams, a weak entity set is depicted via a double rectangle.
We underline the discriminator of a weak entity set with a dashed line.
The relationship set connecting the weak entity set to the identifying
strong entity set is depicted by a double diamond.
Primary key for section – (course_id, sec_id, semester, year)
91. Redundant Attributes
Suppose we have entity sets:
• student, with attributes: ID, name, tot_cred, dept_name
• department, with attributes: dept_name, building, budget
We model the fact that each student has an associated department using
a relationship set stud_dept
The attribute dept_name in student below replicates information present
in the relationship and is therefore redundant
• and needs to be removed.
BUT: when converting back to tables, in some cases the attribute gets
reintroduced, as we will see later.
93. Reduction to Relation Schemas
Entity sets and relationship sets can be expressed uniformly as relation
schemas that represent the contents of the database.
A database which conforms to an E-R diagram can be represented by a
collection of schemas.
For each entity set and relationship set there is a unique schema that is
assigned the name of the corresponding entity set or relationship set.
Each schema has a number of columns (generally corresponding to
attributes), which have unique names.
94. Representing Entity Sets
A strong entity set reduces to a schema with the same attributes
student(ID, name, tot_cred)
A weak entity set becomes a table that includes a column for the primary
key of the identifying strong entity set
section ( course_id, sec_id, sem, year )
Example
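A hedged SQL sketch of the two schemas above; the attribute types are assumed:

    create table student (
        ID varchar(5),
        name varchar(20),
        tot_cred numeric(3,0),
        primary key (ID));

    create table section (
        course_id varchar(8),   -- primary key of the identifying entity set course
        sec_id varchar(8),
        semester varchar(6),
        year numeric(4,0),
        primary key (course_id, sec_id, semester, year));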
95. Representation of Entity Sets with Composite Attributes
Composite attributes are flattened out by creating a
separate attribute for each component attribute
• Example: given entity set instructor with composite
attribute name with component attributes first_name
and last_name the schema corresponding to the
entity set has two attributes name_first_name and
name_last_name
Prefix omitted if there is no ambiguity
(name_first_name could be first_name)
Ignoring multivalued attributes, extended instructor
schema is
• instructor(ID,
first_name, middle_initial, last_name,
street_number, street_name,
apt_number, city, state, zip_code,
date_of_birth)
96. Representation of Entity Sets with Multivalued Attributes
A multivalued attribute M of an entity E is represented by a separate
schema EM
Schema EM has attributes corresponding to the primary key of E and an
attribute corresponding to multivalued attribute M
Example: Multivalued attribute phone_number of instructor is
represented by a schema:
inst_phone= ( ID, phone_number)
Each value of the multivalued attribute maps to a separate tuple of the
relation on schema EM
• For example, an instructor entity with primary key 22222 and phone
numbers 456-7890 and 123-4567 maps to two tuples:
(22222, 456-7890) and (22222, 123-4567)
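A possible SQL schema for EM in this example (types assumed; the composite primary key shown is one reasonable choice):

    create table inst_phone (
        ID char(5),
        phone_number varchar(12),
        primary key (ID, phone_number),
        foreign key (ID) references instructor (ID));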
97. Representing Relationship Sets
A many-to-many relationship set is represented as a schema with
attributes for the primary keys of the two participating entity sets, and
any descriptive attributes of the relationship set.
Example: schema for relationship set advisor
advisor = (s_id, i_id)
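As an assumed sketch, the advisor schema with foreign keys to the participating entity sets; the composite primary key follows the many-to-many rule above:

    create table advisor (
        s_id varchar(5),
        i_id char(5),
        primary key (s_id, i_id),
        foreign key (s_id) references student (ID),
        foreign key (i_id) references instructor (ID));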
98. Redundancy of Schemas
Many-to-one and one-to-many relationship sets that are total on the many-
side can be represented by adding an extra attribute to the “many” side,
containing the primary key of the “one” side
Example: Instead of creating a schema for relationship set inst_dept, add
an attribute dept_name to the schema arising from entity set instructor
Example
99. Redundancy of Schemas (Cont.)
For one-to-one relationship sets, either side can be chosen to act as the
“many” side
• That is, an extra attribute can be added to either of the tables
corresponding to the two entity sets
If participation is partial on the “many” side, replacing a schema by an
extra attribute in the schema corresponding to the “many” side could
result in null values
100. Redundancy of Schemas (Cont.)
The schema corresponding to a relationship set linking a weak entity set
to its identifying strong entity set is redundant.
Example: The section schema already contains the attributes that would
appear in the sec_course schema
101. Specialization
Top-down design process; we designate sub-groupings within an entity set
that are distinctive from other entities in the set.
These sub-groupings become lower-level entity sets that have attributes or
participate in relationships that do not apply to the higher-level entity set.
Depicted by a triangle component labeled ISA (e.g., instructor “is a”
person).
Attribute inheritance – a lower-level entity set inherits all the attributes
and relationship participation of the higher-level entity set to which it is
linked.
103. Representing Specialization via Schemas
Method 1:
• Form a schema for the higher-level entity
• Form a schema for each lower-level entity set, include primary key
of higher-level entity set and local attributes
• Drawback: getting information about, e.g., an employee requires
accessing two relations, the one corresponding to the low-level
schema and the one corresponding to the high-level schema
104. Representing Specialization as Schemas (Cont.)
Method 2:
• Form a schema for each entity set with all local and inherited
attributes
• Drawback: name, street and city may be stored redundantly for
people who are both students and employees
105. Generalization
A bottom-up design process – combine a number of entity sets that
share the same features into a higher-level entity set.
Specialization and generalization are simple inversions of each other;
they are represented in an E-R diagram in the same way.
The terms specialization and generalization are used interchangeably.
106. Completeness constraint
Completeness constraint -- specifies whether or not an entity in the
higher-level entity set must belong to at least one of the lower-level
entity sets within a generalization.
• total: an entity must belong to one of the lower-level entity sets
• partial: an entity need not belong to one of the lower-level entity
sets
107. Completeness constraint (Cont.)
Partial generalization is the default.
We can specify total generalization in an ER diagram by adding the
keyword total in the diagram and drawing a dashed line from the
keyword to the corresponding hollow arrow-head to which it applies (for
a total generalization), or to the set of hollow arrow-heads to which it
applies (for an overlapping generalization).
The student generalization is total: All student entities must be either
graduate or undergraduate. Because the higher-level entity set arrived
at through generalization is generally composed of only those entities
in the lower-level entity sets, the completeness constraint for a
generalized higher-level entity set is usually total
108. Aggregation
Consider the ternary relationship proj_guide, which we saw earlier
Suppose we want to record evaluations of a student by a guide on a
project
109. Aggregation (Cont.)
Relationship sets eval_for and proj_guide represent overlapping
information
• Every eval_for relationship corresponds to a proj_guide relationship
• However, some proj_guide relationships may not correspond to any
eval_for relationships
So we can’t discard the proj_guide relationship
Eliminate this redundancy via aggregation
• Treat relationship as an abstract entity
• Allows relationships between relationships
• Abstraction of relationship into new entity
110. Aggregation (Cont.)
Via aggregation, and without introducing redundancy, the following
diagram represents:
• A student is guided by a particular instructor on a particular project
• A student, instructor, project combination may have an associated
evaluation
111. Entities vs. Attributes
Use of entity sets vs. attributes
Use of phone as an entity allows extra information about phone numbers
(plus multiple phone numbers)
112. Entities vs. Relationship sets
Use of entity sets vs. relationship sets
Possible guideline is to designate a relationship set to describe
an action that occurs between entities
Placement of relationship attributes
For example, attribute date as attribute of advisor or as attribute
of student
115. Outline
Features of Good Relational Design
Functional Dependencies
Decomposition Using Functional Dependencies
Normal Forms
Functional Dependency Theory
Algorithms for Decomposition using Functional Dependencies
Decomposition Using Multivalued Dependencies
More Normal Forms
Atomic Domains and First Normal Form
Database-Design Process
Modeling Temporal Data
116. Features of Good Relational Designs
Suppose we combine instructor and department into in_dep, which
represents the natural join on the relations instructor and department
There is repetition of information
Need to use null values (if we add a new department with no instructors)
117. Decomposition
The only way to avoid the repetition-of-information problem in the in_dep
schema is to decompose it into two schemas – instructor and department
schemas.
Not all decompositions are good. Suppose we decompose
employee(ID, name, street, city, salary)
into
employee1 (ID, name)
employee2 (name, street, city, salary)
The problem arises when we have two employees with the same name
The next slide shows how we lose information -- we cannot reconstruct
the original employee relation -- and so, this is a lossy decomposition.
119. Lossless Decomposition
Let R be a relation schema and let R1 and R2 form a decomposition of R.
That is, R = R1 ∪ R2
We say that the decomposition is a lossless decomposition if there is
no loss of information by replacing R with the two relation schemas
R1 and R2
Formally,
ΠR1(r) ⋈ ΠR2(r) = r
And, conversely, a decomposition is lossy if
r ⊂ ΠR1(r) ⋈ ΠR2(r)
120. Example of Lossless Decomposition
Decomposition of R = (A, B, C)
R1 = (A, B) R2 = (B, C)
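The following rough sketch (assumed helper names and toy data, not from the
slides) makes the definition concrete: it computes ΠR1(r) ⋈ ΠR2(r) on a tiny
instance and compares it with r, mirroring the employee example above.

def project(rows, attrs):
    # each tuple is a dict; a projection is a set of (attribute, value) tuples
    return {tuple((a, t[a]) for a in attrs) for t in rows}

def natural_join(rows1, rows2):
    out = []
    for t1 in rows1:
        for t2 in rows2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1 if a in d2):  # agree on common attributes
                out.append({**d1, **d2})
    return out

def is_lossless_on(rows, attrs1, attrs2):
    joined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return {tuple(sorted(t.items())) for t in joined} == \
           {tuple(sorted(t.items())) for t in rows}

emp = [{'ID': 1, 'name': 'Kim', 'city': 'Austin'},
       {'ID': 2, 'name': 'Kim', 'city': 'Boston'}]
print(is_lossless_on(emp, ['ID', 'name'], ['name', 'city']))  # False: two employees
                                           # named Kim produce spurious tuples
print(is_lossless_on(emp, ['ID', 'name'], ['ID', 'city']))    # True on this instance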
121. Normalization Theory
Decide whether a particular relation R is in “good” form.
In the case that a relation R is not in “good” form, decompose it into set
of relations {R1, R2, ..., Rn} such that
• Each relation is in good form
• The decomposition is a lossless decomposition
Our theory is based on:
• Functional dependencies
• Multivalued dependencies
122. Functional Dependencies
There are usually a variety of constraints (rules) on the data in the real
world.
For example, some of the constraints that are expected to hold in a
university database are:
• Students and instructors are uniquely identified by their ID.
• Each student and instructor has only one name.
• Each instructor and student is (primarily) associated with only one
department.
• Each department has only one value for its budget, and only one
associated building.
123. Functional Dependencies (Cont.)
An instance of a relation that satisfies all such real-world constraints is
called a legal instance of the relation;
A legal instance of a database is one where all the relation instances are
legal instances
Constraints on the set of legal relations.
Require that the value for a certain set of attributes determines uniquely
the value for another set of attributes.
A functional dependency is a generalization of the notion of a key.
124. Functional Dependencies Definition
Let R be a relation schema, with α ⊆ R and β ⊆ R
The functional dependency
α → β
holds on R if and only if for any legal relation r(R), whenever any two
tuples t1 and t2 of r agree on the attributes α, they also agree on the
attributes β. That is,
t1[α] = t2[α]  ⟹  t1[β] = t2[β]
Example: Consider r(A,B) with the following instance of r:
A  B
1  4
1  5
3  7
On this instance, B → A holds; A → B does NOT hold.
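A minimal sketch (not from the slides; the representation and function name are
assumed) of this definition: check whether a given instance satisfies α → β by
verifying that tuples agreeing on α also agree on β.

def satisfies_fd(rows, alpha, beta):
    # rows are dicts from attribute name to value
    seen = {}  # maps the α-projection of a tuple to its β-projection
    for t in rows:
        key = tuple(t[a] for a in alpha)
        val = tuple(t[b] for b in beta)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

r = [{'A': 1, 'B': 4}, {'A': 1, 'B': 5}, {'A': 3, 'B': 7}]
print(satisfies_fd(r, ['B'], ['A']))  # True:  B → A holds on this instance
print(satisfies_fd(r, ['A'], ['B']))  # False: A → B does not hold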
125. Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other
functional dependencies that are logically implied by F.
• If A → B and B → C, then we can infer that A → C
• etc.
The set of all functional dependencies logically implied by F is the
closure of F.
We denote the closure of F by F+.
126. Keys and Functional Dependencies
K is a superkey for relation schema R if and only if K → R
K is a candidate key for R if and only if
• K → R, and
• for no α ⊂ K, α → R
Functional dependencies allow us to express constraints that cannot be
expressed using superkeys. Consider the schema:
in_dep (ID, name, salary, dept_name, building, budget ).
We expect these functional dependencies to hold:
dept_name → building
ID → building
but would not expect the following to hold:
dept_name → salary
127. Use of Functional Dependencies
We use functional dependencies to:
• To test relations to see if they are legal under a given set of
functional dependencies.
If a relation r is legal under a set F of functional dependencies,
we say that r satisfies F.
• To specify constraints on the set of legal relations
We say that F holds on R if all legal relations on R satisfy the set
of functional dependencies F.
Note: A specific instance of a relation schema may satisfy a functional
dependency even if the functional dependency does not hold on all legal
instances.
• For example, a specific instance of instructor may, by chance, satisfy
name → ID.
128. Trivial Functional Dependencies
A functional dependency is trivial if it is satisfied by all instances of a
relation
Example:
• ID, name → ID
• name → name
In general, α → β is trivial if β ⊆ α
129. Lossless Decomposition
We can use functional dependencies to show when certain
decompositions are lossless.
For the case of R = (R1, R2), we require that for all possible relations r on
schema R
r = ΠR1(r) ⋈ ΠR2(r)
A decomposition of R into R1 and R2 is a lossless decomposition if at least
one of the following dependencies is in F+:
• R1 ∩ R2 → R1
• R1 ∩ R2 → R2
The above functional dependencies are a sufficient condition for lossless-
join decomposition; the dependencies are a necessary condition only if all
constraints are functional dependencies
130. Example
R = (A, B, C)
F = {A → B, B → C}
R1 = (A, B), R2 = (B, C)
• Lossless decomposition:
R1 ∩ R2 = {B} and B → BC
R1 = (A, B), R2 = (A, C)
• Lossless decomposition:
R1 ∩ R2 = {A} and A → AB
Note:
• B → BC
is a shorthand notation for
• B → {B, C}
131. Dependency Preservation
Testing functional dependency constraints each time the database is
updated can be costly
It is useful to design the database in a way that constraints can be
tested efficiently.
If testing a functional dependency can be done by considering just one
relation, then the cost of testing this constraint is low
When a relation is decomposed, it may no longer be possible to do the
testing without having to perform a Cartesian product.
A decomposition that makes it computationally hard to enforce
functional dependencies is said to be NOT dependency preserving.
132. Dependency Preservation Example
Consider a schema:
dept_advisor(s_ID, i_ID, department_name)
With functional dependencies:
i_ID → dept_name
s_ID, dept_name → i_ID
In the above design we are forced to repeat the department name once
for each time an instructor participates in a dept_advisor relationship.
To fix this, we need to decompose dept_advisor
Any decomposition will not include all the attributes in
s_ID, dept_name → i_ID
Thus, the decomposition will NOT be dependency preserving
133. Boyce-Codd Normal Form
A relation schema R is in BCNF with respect to a set F of functional
dependencies if for all functional dependencies in F+ of the form α → β,
where α ⊆ R and β ⊆ R, at least one of the following holds:
• α → β is trivial (i.e., β ⊆ α)
• α is a superkey for R
134. Boyce-Codd Normal Form (Cont.)
Example schema that is not in BCNF:
in_dep (ID, name, salary, dept_name, building, budget )
because :
• dept_name → building, budget
holds on in_dep
but
• dept_name is not a superkey
When we decompose in_dep into instructor and department
• instructor is in BCNF
• department is in BCNF
135. Example
R = (A, B, C)
F = {A → B, B → C}
R1 = (A, B), R2 = (B, C)
• Lossless-join decomposition:
R1 ∩ R2 = {B} and B → BC
• Dependency preserving
R1 = (A, B), R2 = (A, C)
• Lossless-join decomposition:
R1 ∩ R2 = {A} and A → AB
• Not dependency preserving
(cannot check B → C without computing R1 ⋈ R2)
136. BCNF and Dependency Preservation
It is not always possible to achieve both BCNF and dependency
preservation
Consider a schema:
dept_advisor(s_ID, i_ID, department_name)
With functional dependencies:
i_ID → dept_name
s_ID, dept_name → i_ID
dept_advisor is not in BCNF
• i_ID is not a superkey.
Any decomposition of dept_advisor will not include all the attributes in
s_ID, dept_name → i_ID
Thus, the decomposition is NOT dependency preserving
137. Third Normal Form
A relation schema R is in third normal form (3NF) if for all:
α → β in F+
at least one of the following holds:
• α → β is trivial (i.e., β ⊆ α)
• α is a superkey for R
• Each attribute A in β – α is contained in a candidate key for R.
(NOTE: each attribute may be in a different candidate key)
If a relation is in BCNF it is in 3NF (since in BCNF one of the first two
conditions above must hold).
Third condition is a minimal relaxation of BCNF to ensure dependency
preservation (will see why later).
138. 3NF Example
Consider a schema:
dept_advisor(s_ID, i_ID, dept_name)
With functional dependencies:
i_ID → dept_name
s_ID, dept_name → i_ID
Two candidate keys = {s_ID, dept_name}, {s_ID, i_ID}
We have seen before that dept_advisor is not in BCNF
R, however, is in 3NF
• s_ID, dept_name is a superkey
• i_ID → dept_name and i_ID is NOT a superkey, but:
{dept_name} – {i_ID} = {dept_name} and
dept_name is contained in a candidate key
139. Comparison of BCNF and 3NF
Advantages to 3NF over BCNF. It is always possible to obtain a 3NF
design without sacrificing losslessness or dependency preservation.
Disadvantages to 3NF.
• We may have to use null values to represent some of the possible
meaningful relationships among data items.
• There is the problem of repetition of information.
140. Higher Normal Forms
It is better to decompose inst_info into:
• inst_child (ID, child_name)
• inst_phone (ID, phone_number)
This suggests the need for higher normal forms, such as Fourth
Normal Form (4NF), which we shall see later
141. Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other
functional dependencies that are logically implied by F.
• If A → B and B → C, then we can infer that A → C
• etc.
The set of all functional dependencies logically implied by F is the closure
of F.
We denote the closure of F by F+.
142. Closure of a Set of Functional Dependencies
We can compute F+, the closure of F, by repeatedly applying Armstrong’s
Axioms:
• Reflexive rule: if β ⊆ α, then α → β
• Augmentation rule: if α → β, then γα → γβ
• Transitivity rule: if α → β, and β → γ, then α → γ
These rules are
• Sound -- generate only functional dependencies that actually hold,
and
• Complete -- generate all functional dependencies that hold.
143. Example of F+
R = (A, B, C, G, H, I)
F = { A → B
A → C
CG → H
CG → I
B → H}
Some members of F+
• A → H
by transitivity from A → B and B → H
• AG → I
by augmenting A → C with G, to get AG → CG
and then transitivity with CG → I
• CG → HI
by augmenting CG → I to infer CG → CGI,
and augmenting CG → H to infer CGI → HI,
and then transitivity
144. Closure of Attribute Sets
Given a set of attributes α, define the closure of α under F (denoted by
α+) as the set of attributes that are functionally determined by α under F
Algorithm to compute α+, the closure of α under F
result := α;
while (changes to result) do
for each β → γ in F do
begin
if β ⊆ result then result := result ∪ γ
end
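A direct transliteration of the algorithm above (a sketch with an assumed
representation: each FD is a pair of attribute sets), together with the
superkey test used in the example on the next slide.

def attribute_closure(alpha, fds):
    result = set(alpha)
    changed = True
    while changed:                       # while (changes to result)
        changed = False
        for beta, gamma in fds:          # for each β → γ in F
            if beta <= result and not gamma <= result:
                result |= gamma          # result := result ∪ γ
                changed = True
    return result

F = [({'A'}, {'B'}), ({'A'}, {'C'}), ({'C', 'G'}, {'H'}),
     ({'C', 'G'}, {'I'}), ({'B'}, {'H'})]
R = {'A', 'B', 'C', 'G', 'H', 'I'}

print(sorted(attribute_closure({'A', 'G'}, F)))   # ['A', 'B', 'C', 'G', 'H', 'I']
print(attribute_closure({'A', 'G'}, F) == R)      # True: AG is a superkey
print(attribute_closure({'A'}, F) == R)           # False
print(attribute_closure({'G'}, F) == R)           # False, so AG is a candidate key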
145. Example of Attribute Set Closure
R = (A, B, C, G, H, I)
F = {A → B
A → C
CG → H
CG → I
B → H}
(AG)+
1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
Is AG a candidate key?
1. Is AG a super key?
1. Does AG → R? == Is R ⊆ (AG)+
2. Is any subset of AG a superkey?
1. Does A → R? == Is R ⊆ (A)+
2. Does G → R? == Is R ⊆ (G)+
3. In general: check for each subset of size n-1
146. Canonical Cover
Suppose that we have a set of functional dependencies F on a relation
schema. Whenever a user performs an update on the relation, the
database system must ensure that the update does not violate any
functional dependencies; that is, all the functional dependencies in F are
satisfied in the new database state.
If an update violates any functional dependencies in the set F, the system
must roll back the update.
We can reduce the effort spent in checking for violations by testing a
simplified set of functional dependencies that has the same closure as the
given set.
This simplified set is termed the canonical cover
To define canonical cover we must first define extraneous attributes.
• An attribute of a functional dependency in F is extraneous if we can
remove it without changing F +
147. Dependency Preservation (Cont.)
Let F be the set of dependencies on schema R and let R1, R2 , .., Rn be
a decomposition of R.
The restriction of F to Ri is the set Fi of all functional dependencies in F +
that include only attributes of Ri .
Since all functional dependencies in a restriction involve attributes of only
one relation schema, it is possible to test such a dependency for
satisfaction by checking only one relation.
Note that the definition of restriction uses all dependencies in F+, not
just those in F.
The set of restrictions F1, F2 , .. , Fn is the set of functional dependencies
that can be checked efficiently.
148. Testing for BCNF
To check if a non-trivial dependency α → β causes a violation of BCNF
1. compute α+ (the attribute closure of α), and
2. verify that it includes all attributes of R, that is, it is a superkey of R.
Simplified test: To check if a relation schema R is in BCNF, it suffices to
check only the dependencies in the given set F for violation of BCNF,
rather than checking all dependencies in F+.
• If none of the dependencies in F causes a violation of BCNF, then
none of the dependencies in F+ will cause a violation of BCNF either.
However, the simplified test using only F is incorrect when testing a relation
in a decomposition of R
• Consider R = (A, B, C, D, E), with F = { A → B, BC → D}
Decompose R into R1 = (A,B) and R2 = (A,C,D,E)
Neither of the dependencies in F contains only attributes from
(A,C,D,E) so we might be misled into thinking R2 satisfies BCNF.
In fact, dependency AC → D in F+ shows R2 is not in BCNF.
149. Testing Decomposition for BCNF
To check if a relation Ri in a decomposition of R is in BCNF:
Either test Ri for BCNF with respect to the restriction of F+ to Ri (that
is, all FDs in F+ that contain only attributes from Ri)
Or use the original set of dependencies F that hold on R, but with the
following test:
for every set of attributes α ⊆ Ri, check that α+ (the attribute
closure of α) either includes no attribute of Ri – α, or includes all
attributes of Ri.
• If the condition is violated by some α ⊆ Ri, the dependency
α → (α+ – α) ∩ Ri
can be shown to hold on Ri, and Ri violates BCNF.
• We use the above dependency to decompose Ri
150. BCNF Decomposition Algorithm
result := {R};
done := false;
compute F+;
while (not done) do
if (there is a schema Ri in result that is not in BCNF)
then begin
let α → β be a nontrivial functional dependency that
holds on Ri such that α → Ri is not in F+,
and α ∩ β = ∅;
result := (result – Ri) ∪ (Ri – β) ∪ (α, β);
end
else done := true;
Note: each Ri is in BCNF, and decomposition is lossless-join.
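A rough sketch of the loop above (assumed names; it reuses the attribute-closure
routine and, for simplicity, checks only the dependencies in F rather than the
restriction of F+ to each Ri, which the earlier “Testing Decomposition for BCNF”
slide notes is not correct in general).

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_decompose(R, fds):
    result = [set(R)]
    done = False
    while not done:
        for i, Ri in enumerate(result):
            # look for a nontrivial α → β in F that violates BCNF on Ri
            violation = None
            for alpha, beta in fds:
                if alpha <= Ri:
                    beta_in_ri = (beta & Ri) - alpha
                    if beta_in_ri and not Ri <= closure(alpha, fds):
                        violation = (alpha, beta_in_ri)
                        break
            if violation:
                alpha, beta = violation
                result[i:i + 1] = [Ri - beta, alpha | beta]   # replace Ri
                break
        else:
            done = True
    return result

fds = [({'dept_name'}, {'building', 'budget'}),
       ({'ID'}, {'name', 'salary', 'dept_name'})]
in_dep = {'ID', 'name', 'salary', 'dept_name', 'building', 'budget'}
print(bcnf_decompose(in_dep, fds))
# e.g. [{'ID', 'name', 'salary', 'dept_name'}, {'dept_name', 'building', 'budget'}]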
151. BCNF Decomposition (Cont.)
course is in BCNF
• How do we know this?
building, room_number→capacity holds on class-1
• but {building, room_number} is not a superkey for class-1.
• We replace class-1 by:
classroom (building, room_number, capacity)
section (course_id, sec_id, semester, year, building,
room_number, time_slot_id)
classroom and section are in BCNF.
152. Third Normal Form
There are some situations where
• BCNF is not dependency preserving, and
• efficient checking for FD violation on updates is important
Solution: define a weaker normal form, called Third Normal Form (3NF)
• Allows some redundancy (with resultant problems; we will see
examples later)
• But functional dependencies can be checked on individual relations
without computing a join.
• There is always a lossless-join, dependency-preserving
decomposition into 3NF.
153. 3NF Example -- Relation dept_advisor
dept_advisor (s_ID, i_ID, dept_name)
F = {s_ID, dept_name → i_ID, i_ID → dept_name}
Two candidate keys: s_ID, dept_name, and i_ID, s_ID
R is in 3NF
• s_ID, dept_name → i_ID:
s_ID, dept_name is a superkey
• i_ID → dept_name:
dept_name is contained in a candidate key
154. 3NF Decomposition Algorithm
Let Fc be a canonical cover for F;
i := 0;
for each functional dependency α → β in Fc do
if none of the schemas Rj, 1 ≤ j ≤ i contains α β
then begin
i := i + 1;
Ri := α β
end
if none of the schemas Rj, 1 ≤ j ≤ i contains a candidate key for R
then begin
i := i + 1;
Ri := any candidate key for R;
end
/* Optionally, remove redundant relations */
repeat
if any schema Rj is contained in another schema Rk
then /* delete Rj */
Rj := Ri;
i := i - 1;
return (R1, R2, ..., Ri)
155. 3NF Decomposition Algorithm (Cont.)
The above algorithm ensures:
• Each relation schema Ri is in 3NF
• Decomposition is dependency preserving and lossless-join
Proof of correctness is at the end of this presentation
156. Comparison of BCNF and 3NF
It is always possible to decompose a relation into a set of relations that
are in 3NF such that:
• The decomposition is lossless
• The dependencies are preserved
It is always possible to decompose a relation into a set of relations that
are in BCNF such that:
• The decomposition is lossless
• It may not be possible to preserve dependencies.
157. Multivalued Dependencies (MVDs)
Suppose we record names of children, and phone numbers for
instructors:
• inst_child(ID, child_name)
• inst_phone(ID, phone_number)
If we were to combine these schemas to get
• inst_info(ID, child_name, phone_number)
• Example data:
(99999, David, 512-555-1234)
(99999, David, 512-555-4321)
(99999, William, 512-555-1234)
(99999, William, 512-555-4321)
This relation is in BCNF
• Why?
158. Multivalued Dependencies
Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued
dependency
α ↠ β
holds on R if in any legal relation r(R), for all pairs of tuples t1 and t2 in r
such that t1[α] = t2[α], there exist tuples t3 and t4 in r such that:
t1[α] = t2[α] = t3[α] = t4[α]
t3[β] = t1[β]
t3[R – β] = t2[R – β]
t4[β] = t2[β]
t4[R – β] = t1[R – β]
159. Fourth Normal Form
A relation schema R is in 4NF with respect to a set D of functional and
multivalued dependencies if for all multivalued dependencies in D+ of the
form α ↠ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
• α ↠ β is trivial (i.e., β ⊆ α or α ∪ β = R)
• α is a superkey for schema R
If a relation is in 4NF it is in BCNF
160. 4NF Decomposition Algorithm
result := {R};
done := false;
compute D+;
Let Di denote the restriction of D+ to Ri
while (not done)
if (there is a schema Ri in result that is not in 4NF) then
begin
let α ↠ β be a nontrivial multivalued dependency that holds
on Ri such that α → Ri is not in Di, and α ∩ β = ∅;
result := (result – Ri) ∪ (Ri – β) ∪ (α, β);
end
else done := true;
Note: each Ri is in 4NF, and decomposition is lossless-join
161. Example
R = (A, B, C, G, H, I)
F = { A ↠ B
B ↠ HI
CG ↠ H }
R is not in 4NF since A ↠ B and A is not a superkey for R
Decomposition
a) R1 = (A, B) (R1 is in 4NF)
b) R2 = (A, C, G, H, I) (R2 is not in 4NF, decompose into R3 and R4)
c) R3 = (C, G, H) (R3 is in 4NF)
d) R4 = (A, C, G, I) (R4 is not in 4NF, decompose into R5 and R6)
• A ↠ B and B ↠ HI imply A ↠ HI (MVD transitivity), and
• hence A ↠ I (MVD restriction to R4)
e) R5 = (A, I) (R5 is in 4NF)
f) R6 = (A, C, G) (R6 is in 4NF)
162. First Normal Form
Domain is atomic if its elements are considered to be indivisible units
• Examples of non-atomic domains:
Set of names, composite attributes
Identification numbers like CS101 that can be broken up into parts
A relational schema R is in first normal form if the domains of all attributes
of R are atomic
Non-atomic values complicate storage and encourage redundant
(repeated) storage of data
• Example: Set of accounts stored with each customer, and set of
owners stored with each account
• We assume all relations are in first normal form (and revisit this in
Chapter 22: Object Based Databases)
163. First Normal Form (Cont.)
Atomicity is actually a property of how the elements of the domain are
used.
• Example: Strings would normally be considered indivisible
• Suppose that students are given roll numbers which are strings of the
form CS0012 or EE1127
• If the first two characters are extracted to find the department, the
domain of roll numbers is not atomic.
• Doing so is a bad idea: leads to encoding of information in application
program rather than in the database.
165. Classification of Physical Storage Media
Can differentiate storage into:
• volatile storage: loses contents when power is switched off
• non-volatile storage:
Contents persist even when power is switched off.
Includes secondary and tertiary storage, as well as battery-backed-
up main memory.
Factors affecting choice of storage media include
• Speed with which data can be accessed
• Cost per unit of data
• Reliability
167. Storage Hierarchy (Cont.)
primary storage: Fastest media but volatile (cache, main memory).
secondary storage: next level in hierarchy, non-volatile, moderately fast
access time
• Also called on-line storage
• E.g., flash memory, magnetic disks
tertiary storage: lowest level in hierarchy, non-volatile, slow access time
• also called off-line storage and used for archival storage
• e.g., magnetic tape, optical storage
• Magnetic tape
Sequential access, 1 to 12 TB capacity
A few drives with many tapes
Juke boxes with petabytes (1000’s of TB) of storage
168. Storage Interfaces
Disk interface standards families
• SATA (Serial ATA)
SATA 3 supports data transfer speeds of up to 6 gigabits/sec
• SAS (Serial Attached SCSI)
SAS Version 3 supports 12 gigabits/sec
• NVMe (Non-Volatile Memory Express) interface
Works with PCIe connectors to support lower latency and higher
transfer rates
Supports data transfer rates of up to 24 gigabits/sec
Disks usually connected directly to computer system
In Storage Area Networks (SAN), a large number of disks are connected
by a high-speed network to a number of servers
In Network Attached Storage (NAS) networked storage provides a file
system interface using networked file system protocol, instead of
providing a disk system interface
169. Magnetic Hard Disk Mechanism
Schematic diagram of magnetic disk drive Photo of magnetic disk drive
170. Magnetic Disks
Read-write head
Surface of platter divided into circular tracks
• Over 50K-100K tracks per platter on typical hard disks
Each track is divided into sectors.
• A sector is the smallest unit of data that can be read or written.
• Sector size typically 512 bytes
• Typical sectors per track: 500 to 1000 (on inner tracks) to 1000 to
2000 (on outer tracks)
To read/write a sector
• disk arm swings to position head on right track
• platter spins continually; data is read/written as sector passes under
head
Head-disk assemblies
• multiple disk platters on a single spindle (1 to 5 usually)
• one head per platter, mounted on a common arm.
Cylinder i consists of ith track of all the platters
171. Magnetic Disks (Cont.)
Disk controller – interfaces between the computer system and the disk
drive hardware.
• accepts high-level commands to read or write a sector
• initiates actions such as moving the disk arm to the right track and
actually reading or writing the data
• Computes and attaches checksums to each sector to verify that
data is read back correctly
If data is corrupted, with very high probability stored checksum
won’t match recomputed checksum
• Ensures successful writing by reading back sector after writing it
• Performs remapping of bad sectors
172. Performance Measures of Disks
Access time – the time it takes from when a read or write request is
issued to when data transfer begins. Consists of:
• Seek time – time it takes to reposition the arm over the correct track.
Average seek time is 1/2 the worst case seek time.
• Would be 1/3 if all tracks had the same number of sectors, and
we ignore the time to start and stop arm movement
4 to 10 milliseconds on typical disks
• Rotational latency – time it takes for the sector to be accessed to
appear under the head.
4 to 11 milliseconds on typical disks (5400 to 15000 r.p.m.)
Average latency is 1/2 of the above latency.
• Overall latency is 5 to 20 msec depending on disk model
Data-transfer rate – the rate at which data can be retrieved from or stored
to the disk.
• 25 to 200 MB per second max rate, lower for inner tracks
173. Performance Measures (Cont.)
Disk block is a logical unit for storage allocation and retrieval
• 4 to 16 kilobytes typically
Smaller blocks: more transfers from disk
Larger blocks: more space wasted due to partially filled blocks
Sequential access pattern
• Successive requests are for successive disk blocks
• Disk seek required only for first block
Random access pattern
• Successive requests are for blocks that can be anywhere on disk
• Each access requires a seek
• Transfer rates are low since a lot of time is wasted in seeks
I/O operations per second (IOPS)
• Number of random block reads that a disk can support per second
• 50 to 200 IOPS on current generation magnetic disks
174. Performance Measures (Cont.)
Mean time to failure (MTTF) – the average time the disk is expected to
run continuously without any failure.
• Typically 3 to 5 years
• Probability of failure of new disks is quite low, corresponding to a
“theoretical MTTF” of 500,000 to 1,200,000 hours for a new disk
E.g., an MTTF of 1,200,000 hours for a new disk means that given
1000 relatively new disks, on an average one will fail every 1200
hours
• MTTF decreases as disk ages
175. Flash Storage
NOR flash vs NAND flash
NAND flash
• used widely for storage, cheaper than NOR flash
• requires page-at-a-time read (page: 512 bytes to 4 KB)
20 to 100 microseconds for a page read
Not much difference between sequential and random read
• Page can only be written once
Must be erased to allow rewrite
Solid state disks
• Use standard block-oriented disk interfaces, but store data on multiple
flash storage devices internally
• Transfer rate of up to 500 MB/sec using SATA, and
up to 3 GB/sec using NVMe PCIe
176. Flash Storage (Cont.)
Erase happens in units of erase block
• Takes 2 to 5 millisecs
• Erase block typically 256 KB to 1 MB (128 to 256 pages)
Remapping of logical page addresses to physical page addresses avoids
waiting for erase
Flash translation table tracks mapping
• also stored in a label field of flash page
• remapping carried out by flash translation layer
After 100,000 to 1,000,000 erases, erase block becomes unreliable and
cannot be used
• wear leveling
177. SSD Performance Metrics
Random reads/writes per second
• Typical 4 KB reads: 10,000 reads per second (10,000 IOPS)
• Typical 4KB writes: 40,000 IOPS
• SSDs support parallel reads
Typical 4KB reads:
• 100,000 IOPS with 32 requests in parallel (QD-32) on SATA
• 350,000 IOPS with QD-32 on NVMe PCIe
Typical 4KB writes:
• 100,000 IOPS with QD-32, even higher on some models
Data transfer rate for sequential reads/writes
• 400 MB/sec for SATA3, 2 to 3 GB/sec using NVMe PCIe
Hybrid disks: combine small amount of flash cache with larger magnetic
disk
178. Storage Class Memory
3D-XPoint memory technology pioneered by Intel
Available as Intel Optane
• SSD interface shipped from 2017
Allows lower latency than flash SSDs
• Non-volatile memory interface announced in 2018
Supports direct access to words, at speeds comparable to main-
memory speeds
179. RAID
RAID: Redundant Arrays of Independent Disks
• disk organization techniques that manage a large number of disks,
providing a view of a single disk of
high capacity and high speed by using multiple disks in parallel,
high reliability by storing data redundantly, so that data can be
recovered even if a disk fails
The chance that some disk out of a set of N disks will fail is much higher
than the chance that a specific single disk will fail.
• E.g., a system with 100 disks, each with MTTF of 100,000 hours
(approx. 11 years), will have a system MTTF of 1000 hours (approx.
41 days)
• Techniques for using redundancy to avoid data loss are critical with
large numbers of disks
180. Improvement of Reliability via Redundancy
Redundancy – store extra information that can be used to rebuild
information lost in a disk failure
E.g., Mirroring (or shadowing)
• Duplicate every disk. Logical disk consists of two physical disks.
• Every write is carried out on both disks
Reads can take place from either disk
• If one disk in a pair fails, data still available in the other
Data loss would occur only if a disk fails, and its mirror disk also
fails before the system is repaired
• Probability of combined event is very small
Except for dependent failure modes such as fire or building
collapse or electrical power surges
Mean time to data loss depends on mean time to failure,
and mean time to repair
• E.g., MTTF of 100,000 hours, mean time to repair of 10 hours gives
mean time to data loss of 500 * 10^6 hours (or 57,000 years) for a
mirrored pair of disks (ignoring dependent failure modes)
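As a rough worked check of the figure above (a standard back-of-the-envelope
estimate assuming independent failures; the formula itself is not given on the
slide):
mean time to data loss ≈ MTTF² / (2 × MTTR)
= (100,000 h)² / (2 × 10 h) = 5 × 10^8 h ≈ 57,000 years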
181. Improvement in Performance via Parallelism
Two main goals of parallelism in a disk system:
1. Load balance multiple small accesses to increase throughput
2. Parallelize large accesses to reduce response time.
Improve transfer rate by striping data across multiple disks.
Bit-level striping – split the bits of each byte across multiple disks
• In an array of eight disks, write bit i of each byte to disk i.
• Each access can read data at eight times the rate of a single disk.
• But seek/access time worse than for a single disk
Bit level striping is not used much any more
Block-level striping – with n disks, block i of a file goes to disk (i mod n)
+ 1
• Requests for different blocks can run in parallel if the blocks reside on
different disks
• A request for a long sequence of blocks can utilize all disks in parallel
182. RAID Levels
Schemes to provide redundancy at lower cost by using disk striping
combined with parity bits
• Different RAID organizations, or RAID levels, have differing cost,
performance and reliability characteristics
RAID Level 0: Block striping; non-redundant.
• Used in high-performance applications where data loss is not critical.
RAID Level 1: Mirrored disks with block striping
• Offers best write performance.
• Popular for applications such as storing log files in a database system.
183. RAID Levels (Cont.)
Parity blocks: Parity block j stores XOR of bits from block j of each disk
• When writing data to a block j, parity block j must also be computed
and written to disk
Can be done by using old parity block, old value of current block
and new value of current block (2 block reads + 2 block writes)
Or by recomputing the parity value using the new values of blocks
corresponding to the parity block
• More efficient for writing large amounts of data sequentially
• To recover data for a block, compute XOR of bits from all other
blocks in the set including the parity block
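A small sketch (assumed block contents) of parity in Python: the parity block is
the bytewise XOR of the data blocks, and a lost block is rebuilt by XOR-ing the
remaining blocks with the parity block.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data = [b'\x01\x02', b'\xF0\x0F', b'\xAA\x55']   # three data blocks
parity = xor_blocks(data)                        # parity block

# recover block 1 from the other data blocks plus the parity block
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])                      # True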
184. RAID Levels (Cont.)
RAID Level 5: Block-Interleaved Distributed Parity; partitions data and
parity among all N + 1 disks, rather than storing data in N disks and parity
in 1 disk.
• E.g., with 5 disks, parity block for nth set of blocks is stored on disk
(n mod 5) + 1, with the data blocks stored on the other 4 disks.
185. RAID Levels (Cont.)
RAID Level 5 (Cont.)
• Block writes occur in parallel if the blocks and their parity blocks are
on different disks.
RAID Level 6: P+Q Redundancy scheme; similar to Level 5, but stores
two error correction blocks (P, Q) instead of single parity block to guard
against multiple disk failures.
• Better reliability than Level 5 at a higher cost
Becoming more important as storage sizes increase
186. RAID Levels (Cont.)
Other levels (not used in practice):
• RAID Level 2: Memory-Style Error-Correcting-Codes (ECC) with bit
striping.
• RAID Level 3: Bit-Interleaved Parity
• RAID Level 4: Block-Interleaved Parity; uses block-level striping,
and keeps a parity block on a separate parity disk for corresponding
blocks from N other disks.
RAID 5 is better than RAID 4, since with RAID 4 with random
writes, parity disk gets much higher write load than other disks
and becomes a bottleneck
187. Choice of RAID Level
Factors in choosing RAID level
• Monetary cost
• Performance: Number of I/O operations per second, and bandwidth
during normal operation
• Performance during failure
• Performance during rebuild of failed disk
Including time taken to rebuild failed disk
RAID 0 is used only when data safety is not important
• E.g., data can be recovered quickly from other sources
188. Choice of RAID Level (Cont.)
Level 1 provides much better write performance than level 5
• Level 5 requires at least 2 block reads and 2 block writes to write a
single block, whereas Level 1 only requires 2 block writes
Level 1 has higher storage cost than level 5
Level 5 is preferred for applications where writes are sequential and large
(many blocks), and need large amounts of data storage
RAID 1 is preferred for applications with many random/small updates
Level 6 gives better data protection than RAID 5 since it can tolerate two
disk (or disk block) failures
• Increasing in importance since latent block failures on one disk,
coupled with a failure of another disk can result in data loss with RAID
1 and RAID 5.
189. Hardware Issues
Software RAID: RAID implementations done entirely in software, with
no special hardware support
Hardware RAID: RAID implementations with special hardware
• Use non-volatile RAM to record writes that are being executed
• Beware: power failure during write can result in corrupted disk
E.g., failure after writing one block but before writing the second
in a mirrored system
Such corrupted data must be detected when power is restored
• Recovery from corruption is similar to recovery from failed
disk
• NV-RAM helps to efficiently detect potentially corrupted
blocks
Otherwise all blocks of disk must be read and compared
with mirror/parity block
190. Hardware Issues (Cont.)
Latent failures: data successfully written earlier gets damaged
• can result in data loss even if only one disk fails
Data scrubbing:
• continually scan for latent failures, and recover from copy/parity
Hot swapping: replacement of disk while system is running, without power
down
• Supported by some hardware RAID systems,
• reduces time to recovery, and improves availability greatly
Many systems maintain spare disks which are kept online, and used as
replacements for failed disks immediately on detection of failure
• Reduces time to recovery greatly
Many hardware RAID systems ensure that a single point of failure will not
stop the functioning of the system by using
• Redundant power supplies with battery backup
• Multiple controllers and multiple interconnections to guard against
controller/interconnection failures
192. Optimization of Disk-Block Access
Buffering: in-memory buffer to cache disk blocks
Read-ahead: Read extra blocks from a track in anticipation that they will
be requested soon
Disk-arm-scheduling algorithms re-order block requests so that disk arm
movement is minimized
• elevator algorithm
[Figure: block requests R1–R6 positioned between the inner and outer tracks, illustrating disk-arm scheduling]
193. Magnetic Tapes
Hold large volumes of data and provide high transfer rates
• Few GB for DAT (Digital Audio Tape) format, 10-40 GB with DLT
(Digital Linear Tape) format, 100 GB+ with Ultrium format, and 330 GB
with Ampex helical scan format
• Transfer rates from few to 10s of MB/s
Tapes are cheap, but cost of drives is very high
Very slow access time in comparison to magnetic and optical disks
• limited to sequential access.
• Some formats (Accelis) provide faster seek (10s of seconds) at cost of
lower capacity
Used mainly for backup, for storage of infrequently used information, and
as an off-line medium for transferring information from one system to
another.
Tape jukeboxes used for very large capacity storage
• Multiple petabytes (10^15 bytes)
194. File Organization
The database is stored as a collection of files. Each file is a sequence of
records. A record is a sequence of fields.
One approach
• Assume record size is fixed
• Each file has records of one particular type only
• Different files are used for different relations
This case is easiest to implement; will consider variable length records
later
We assume that records are smaller than a disk block.
195. Fixed-Length Records
Simple approach:
• Store record i starting from byte n * (i – 1), where n is the size of
each record.
• Record access is simple but records may cross blocks
Modification: do not allow records to cross block boundaries
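A minimal sketch (record and block sizes are assumed) of the byte-offset
calculation above, and of placement when records are not allowed to cross
block boundaries.

n = 53                 # record size in bytes (assumed)
block_size = 4096      # assumed block size

def byte_offset(i):
    # record i (counting from 1) starts at byte n * (i - 1)
    return n * (i - 1)

records_per_block = block_size // n   # 77 records fit; 15 bytes per block unused

def block_and_slot(i):
    # placement when records do not cross block boundaries
    return ((i - 1) // records_per_block, (i - 1) % records_per_block)

print(byte_offset(3))        # 106
print(block_and_slot(100))   # (1, 22): record 100 is the 23rd record of block 1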
196. Fixed-Length Records
Deletion of record i: alternatives:
• move records i + 1, . . ., n to i, . . . , n – 1
• move record n to i
• do not move records, but link all free records on a free list
Record 3 deleted
197. Fixed-Length Records
Deletion of record i: alternatives:
• move records i + 1, . . ., n to i, . . . , n – 1
• move record n to i
• do not move records, but link all free records on a free list
Record 3 deleted and replaced by record 11
198. Fixed-Length Records
Deletion of record i: alternatives:
• move records i + 1, . . ., n to i, . . . , n – 1
• move record n to i
• do not move records, but link all free records on a free list
199. Variable-Length Records
Variable-length records arise in database systems in several ways:
• Storage of multiple record types in a file.
• Record types that allow variable lengths for one or more fields such
as strings (varchar)
• Record types that allow repeating fields (used in some older data
models).
Attributes are stored in order
Variable length attributes represented by fixed size (offset, length), with
actual data stored after all fixed length attributes
Null values represented by null-value bitmap
200. Variable-Length Records: Slotted Page Structure
Slotted page header contains:
• number of record entries
• end of free space in the block
• location and size of each record
Records can be moved around within a page to keep them contiguous
with no empty space between them; entry in the header must be
updated.
Pointers should not point directly to record — instead they should point
to the entry for the record in header.
201. Storing Large Objects
E.g., blob/clob types
Records must be smaller than pages
Alternatives:
• Store as files in file systems
• Store as files managed by database
• Break into pieces and store in multiple tuples in separate relation
PostgreSQL TOAST
202. Organization of Records in Files
Heap – record can be placed anywhere in the file where there is space
Sequential – store records in sequential order, based on the value of the
search key of each record
In a multitable clustering file organization records of several different
relations can be stored in the same file
• Motivation: store related records on the same block to minimize I/O
B+-tree file organization
• Ordered storage even with inserts/deletes
• More on this in Chapter 14
Hashing – a hash function computed on search key; the result specifies in
which block of the file the record should be placed
• More on this in Chapter 14
203. Heap File Organization
Records can be placed anywhere in the file where there is free space
Records usually do not move once allocated
Important to be able to efficiently find free space within file
Free-space map
• Array with 1 entry per block. Each entry is a few bits to a byte, and
records fraction of block that is free
• In example below, 3 bits per block, value divided by 8 indicates
fraction of block that is free
• Can have second-level free-space map
• In example below, each entry stores maximum from 4 entries of first-
level free-space map
Free space map written to disk periodically, OK to have wrong (old) values
for some entries (will be detected and fixed)
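A small sketch (assumed values) of the free-space map described above: 3 bits
per block, where a stored value v means roughly a fraction v/8 of the block is
free, plus a second-level map of maxima over groups of first-level entries.

free_space_map = [4, 2, 1, 4, 7, 3, 0, 6]        # one 3-bit entry per block

def blocks_with_free_fraction(needed):
    # block numbers whose recorded free fraction is at least `needed`
    return [i for i, v in enumerate(free_space_map) if v / 8 >= needed]

print(blocks_with_free_fraction(0.5))            # [0, 3, 4, 7]

# second-level map: each entry is the maximum of 4 first-level entries,
# so a search can skip whole regions that cannot satisfy a request
second_level = [max(free_space_map[i:i + 4])
                for i in range(0, len(free_space_map), 4)]
print(second_level)                              # [4, 7]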
204. Sequential File Organization
Suitable for applications that require sequential processing of
the entire file
The records in the file are ordered by a search-key
205. Sequential File Organization (Cont.)
Deletion – use pointer chains
Insertion –locate the position where the record is to be inserted
• if there is free space insert there
• if no free space, insert the record in an overflow block
• In either case, pointer chain must be updated
Need to reorganize the file
from time to time to restore
sequential order
206. Multitable Clustering File Organization
Store several relations in one file using a multitable clustering
file organization
[Figure: multitable clustering file organization of department and instructor]
207. Multitable Clustering File Organization (cont.)
good for queries involving department ⨝ instructor, and for queries
involving one single department and its instructors
bad for queries involving only department
results in variable size records
Can add pointer chains to link records of a particular relation
208. Partitioning
Table partitioning: Records in a relation can be partitioned into smaller
relations that are stored separately
E.g., transaction relation may be partitioned into
transaction_2018, transaction_2019, etc.
Queries written on transaction must access records in all partitions
• Unless query has a selection such as year=2019, in which case only
one partition is needed
Partitioning
• Reduces costs of some operations such as free space management
• Allows different partitions to be stored on different storage devices
E.g., transaction partition for current year on SSD, for older years
on magnetic disk
209. Data Dictionary Storage
The Data dictionary (also called system catalog) stores
metadata; that is, data about data, such as:
Information about relations
• names of relations
• names, types and lengths of attributes of each relation
• names and definitions of views
• integrity constraints
User and accounting information, including passwords
Statistical and descriptive data
• number of tuples in each relation
Physical file organization information
• How relation is stored (sequential/hash/…)
• Physical location of relation
Information about indices (Chapter 14)
210. Relational Representation of System Metadata
Metadata is stored in a relational representation on disk, with specialized
data structures designed for efficient access in memory.
211. Storage Access
Blocks are units of both storage allocation and data transfer.
Database system seeks to minimize the number of block transfers
between the disk and memory. We can reduce the number of disk
accesses by keeping as many blocks as possible in main memory.
Buffer – portion of main memory available to store copies of disk blocks.
Buffer manager – subsystem responsible for allocating buffer space in
main memory.
212. Buffer Manager
Programs call on the buffer manager when they need a block from disk.
• If the block is already in the buffer, buffer manager returns the
address of the block in main memory
• If the block is not in the buffer, the buffer manager
Allocates space in the buffer for the block
• Replacing (throwing out) some other block, if required, to make
space for the new block.
• Replaced block written back to disk only if it was modified
since the most recent time that it was written to/fetched from
the disk.
Reads the block from the disk to the buffer, and returns the
address of the block in main memory to requester.
213. Buffer Manager
Buffer replacement strategy (details coming up!)
Pinned block: memory block that is not allowed to be written back to disk
• Pin done before reading/writing data from a block
• Unpin done when read /write is complete
• Multiple concurrent pin/unpin operations possible
Keep a pin count, buffer block can be evicted only if pin count = 0
Shared and exclusive locks on buffer
• Needed to prevent concurrent operations from reading page contents
as they are moved/reorganized, and to ensure only one
move/reorganize at a time
• Readers get shared lock, updates to a block require exclusive lock
• Locking rules:
Only one process can get exclusive lock at a time
Shared lock cannot be held concurrently with an exclusive lock
Multiple processes may be given shared lock concurrently
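A very simplified sketch (assumed class and method names) of a buffer pool with
pin counts and LRU replacement; a real buffer manager also handles locking,
dirty blocks, forced output, and so on.

from collections import OrderedDict

class BufferPool:
    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block       # function: block_id -> block contents
        self.frames = OrderedDict()        # block_id -> [contents, pin_count]

    def pin(self, block_id):
        if block_id in self.frames:
            self.frames.move_to_end(block_id)         # mark most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[block_id] = [self.read_block(block_id), 0]
        self.frames[block_id][1] += 1
        return self.frames[block_id][0]

    def unpin(self, block_id):
        self.frames[block_id][1] -= 1

    def _evict(self):
        for block_id, (_, pins) in self.frames.items():   # least recently used first
            if pins == 0:
                del self.frames[block_id]
                return
        raise RuntimeError("all buffer blocks are pinned")

pool = BufferPool(2, read_block=lambda b: f"contents of block {b}")
pool.pin(1); pool.unpin(1)
pool.pin(2)                    # block 2 stays pinned
pool.pin(3)                    # evicts block 1, the LRU unpinned block
print(list(pool.frames))       # [2, 3]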
214. Buffer-Replacement Policies
Most operating systems replace the block least recently used (LRU
strategy)
• Idea behind LRU – use past pattern of block references as a
predictor of future references
• LRU can be bad for some queries
Queries have well-defined access patterns (such as sequential scans),
and a database system can use the information in a user’s query to
predict future references
Mixed strategy with hints on replacement strategy provided
by the query optimizer is preferable
Example of bad access pattern for LRU: when computing the join of two
relations r and s by a nested-loops join
for each tuple tr of r do
for each tuple ts of s do
if the tuples tr and ts match …
215. Buffer-Replacement Policies (Cont.)
Toss-immediate strategy – frees the space occupied by a block as soon
as the final tuple of that block has been processed
Most recently used (MRU) strategy – system must pin the block
currently being processed. After the final tuple of that block has been
processed, the block is unpinned, and it becomes the most recently used
block.
Buffer manager can use statistical information regarding the probability
that a request will reference a particular relation
• E.g., the data dictionary is frequently accessed. Heuristic: keep
data-dictionary blocks in main memory buffer
Operating system or buffer manager may reorder writes
• Can lead to corruption of data structures on disk
E.g., linked list of blocks with missing block on disk
File systems perform consistency check to detect such situations
• Careful ordering of writes can avoid many such problems
216. Optimization of Disk Block Access (Cont.)
Buffer managers support forced output of blocks for the purpose of recovery
(more in Chapter 19)
Nonvolatile write buffers speed up disk writes by writing blocks to a non-
volatile RAM or flash buffer immediately
• Writes can be reordered to minimize disk arm movement
Log disk – a disk devoted to writing a sequential log of block updates
• Used exactly like nonvolatile RAM
Write to log disk is very fast since no seeks are required
Journaling file systems write data in-order to NV-RAM or log disk
• Reordering without journaling: risk of corruption of file system data
217. Column-Oriented Storage
Also known as columnar representation
Store each attribute of a relation separately
Example
218. Columnar Representation
Benefits:
• Reduced IO if only some attributes are accessed
• Improved CPU cache performance
• Improved compression
• Vector processing on modern CPU architectures
Drawbacks
• Cost of tuple reconstruction from columnar representation
• Cost of tuple deletion and update
• Cost of decompression
Columnar representation found to be more efficient for decision support than
row-oriented representation
Traditional row-oriented representation preferable for transaction processing
Some databases support both representations
• Called hybrid row/column stores
219. Columnar File Representation
ORC and Parquet: file formats with columnar storage inside the file
Very popular for big-data applications
ORC file format shown on the right:
220. Storage Organization in Main-Memory Databases
Can store records directly in memory without a buffer manager
Column-oriented storage can be used in-memory for decision support
applications
• Compression reduces memory requirement
221. Outline
Basic Concepts
Ordered Indices
B+-Tree Index Files
B-Tree Index Files
Hashing
Static Hashing
Dynamic Hashing
222. Basic Concepts
Indexing mechanisms used to speed up access to desired data.
• E.g., author catalog in library
Search Key – attribute or set of attributes used to look up records in a
file.
An index file consists of records (called index entries) of the form
(search-key, pointer)
Index files are typically much smaller than the original file
Two basic kinds of indices:
• Ordered indices: search keys are stored in sorted order
• Hash indices: search keys are distributed uniformly across
“buckets” using a “hash function”.
223. Index Evaluation Metrics
Access types supported efficiently. E.g.,
• Records with a specified value in the attribute
• Records with an attribute value falling in a specified range of values.
Access time
Insertion time
Deletion time
Space overhead
224. Ordered Indices
In an ordered index, index entries are stored sorted on the search key
value.
Clustering index: in a sequentially ordered file, the index whose search
key specifies the sequential order of the file.
• Also called primary index
• The search key of a primary index is usually but not necessarily the
primary key.
Secondary index: an index whose search key specifies an order
different from the sequential order of the file. Also called
nonclustering index.
Index-sequential file: sequential file ordered on a search key, with a
clustering index on the search key.
225. Dense Index Files
Dense index — Index record appears for every search-key value in the
file.
E.g. index on ID attribute of instructor relation
226. Dense Index Files (Cont.)
Dense index on dept_name, with instructor file sorted on dept_name
227. Sparse Index Files
Sparse Index: contains index records for only some search-key
values.
• Applicable when records are sequentially ordered on search-key
To locate a record with search-key value K we:
• Find index record with largest search-key value < K
• Search file sequentially starting at the record to which the index
record points
228. Sparse Index Files (Cont.)
Compared to dense indices:
• Less space and less maintenance overhead for insertions and deletions.
• Generally slower than dense index for locating records.
Good tradeoff:
• for clustered index: sparse index with an index entry for every block in file,
corresponding to least search-key value in the block.
• For unclustered index: sparse index on top of dense index (multilevel index)
229. Secondary Indices Example
Secondary index on salary field of instructor
Index record points to a bucket that contains pointers to all the actual
records with that particular search-key value.
Secondary indices have to be dense
230. Multilevel Index
If index does not fit in memory, access becomes expensive.
Solution: treat index kept on disk as a sequential file and construct a
sparse index on it.
• outer index – a sparse index of the basic index
• inner index – the basic index file
If even outer index is too large to fit in main memory, yet another level of
index can be created, and so on.
Indices at all levels must be updated on insertion or deletion from the file.
232. Indices on Multiple Keys
Composite search key
• E.g., index on instructor relation on attributes (name, ID)
• Values are sorted lexicographically
E.g. (John, 12121) < (John, 13514) and
(John, 13514) < (Peter, 11223)
• Can query on just name, or on (name, ID)
234. B+-Tree Index Files (Cont.)
A B+-tree is a rooted tree satisfying the following properties:
All paths from root to leaf are of the same length
Each node that is not a root or a leaf has between ⌈n/2⌉ and n
children.
A leaf node has between ⌈(n–1)/2⌉ and n–1 values
Special cases:
• If the root is not a leaf, it has at least 2 children.
• If the root is a leaf (that is, there are no other nodes in the tree), it
can have between 0 and (n–1) values.
235. B+-Tree Node Structure
Typical node
• Ki are the search-key values
• Pi are pointers to children (for non-leaf nodes) or pointers to records or
buckets of records (for leaf nodes).
The search-keys in a node are ordered
K1 < K2 < K3 < . . . < Kn–1
(Initially assume no duplicate keys, address duplicates later)
236. Leaf Nodes in B+-Trees
Properties of a leaf node:
For i = 1, 2, . . ., n–1, pointer Pi points to a file record with search-key value
Ki
If Li, Lj are leaf nodes and i < j, Li’s search-key values are less than or equal
to Lj’s search-key values
Pn points to next leaf node in search-key order
237. Non-Leaf Nodes in B+-Trees
Non leaf nodes form a multi-level sparse index on the leaf nodes. For a
non-leaf node with m pointers:
• All the search-keys in the subtree to which P1 points are less than K1
• For 2 ≤ i ≤ n – 1, all the search-keys in the subtree to which Pi points
have values greater than or equal to Ki–1 and less than Ki
• All the search-keys in the subtree to which Pn points have values
greater than or equal to Kn–1
• General structure
238. Example of B+-tree
B+-tree for instructor file (n = 6)
Leaf nodes must have between 3 and 5 values
(⌈(n–1)/2⌉ and n–1, with n = 6).
Non-leaf nodes other than root must have between 3 and 6
children (⌈n/2⌉ and n, with n = 6).
Root must have at least 2 children.
239. Observations about B+-trees
Since the inter-node connections are done by pointers, “logically” close
blocks need not be “physically” close.
The non-leaf levels of the B+-tree form a hierarchy of sparse indices.
The B+-tree contains a relatively small number of levels
Level below root has at least 2 * ⌈n/2⌉ values
Next level has at least 2 * ⌈n/2⌉ * ⌈n/2⌉ values
.. etc.
• If there are K search-key values in the file, the tree height is no more
than ⌈log⌈n/2⌉(K)⌉
• thus searches can be conducted efficiently.
Insertions and deletions to the main file can be handled efficiently, as the
index can be restructured in logarithmic time (as we shall see).
240. Queries on B+-Trees
function find(v)
1. C = root
2. while (C is not a leaf node)
   1. Let i be the least value such that v ≤ Ki
   2. if there is no such i then
   3.   Set C = last non-null pointer in C
   4. else if (v = C.Ki) Set C = C.Pi+1
   5. else Set C = C.Pi
3. if for some i, C.Ki = v then return C.Pi
4. else return null /* no record with search-key value v exists */
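A runnable sketch of the same search, assuming a simple in-memory node representation (keys list, pointers list, is_leaf flag); this is illustrative only, not the textbook's data layout:

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, pointers, is_leaf):
        self.keys = keys          # K1 .. Kn-1, sorted
        self.pointers = pointers  # children (non-leaf) or record pointers (leaf)
        self.is_leaf = is_leaf

def find(root, v):
    """Return the record pointer for search-key v, or None."""
    c = root
    while not c.is_leaf:
        i = bisect_left(c.keys, v)      # least i with v <= keys[i]
        if i == len(c.keys):            # no such i: follow last non-null pointer
            c = c.pointers[-1]
        elif v == c.keys[i]:            # equal: subtree to the right of Ki
            c = c.pointers[i + 1]
        else:
            c = c.pointers[i]
    # at a leaf: look for v among the leaf's keys
    i = bisect_left(c.keys, v)
    if i < len(c.keys) and c.keys[i] == v:
        return c.pointers[i]
    return None
```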
241. Queries on B+-Trees (Cont.)
Range queries find all records with search key values in a given range
• See book for details of function findRange(lb, ub) which returns set
of all such records
• Real implementations usually provide an iterator interface to fetch
matching records one at a time, using a next() function
242. Queries on B+-Trees (Cont.)
If there are K search-key values in the file, the height of the tree is no
more than ⌈log⌈n/2⌉(K)⌉.
A node is generally the same size as a disk block, typically 4 kilobytes
• and n is typically around 100 (40 bytes per index entry).
With 1 million search key values and n = 100
• at most ⌈log50(1,000,000)⌉ = 4 nodes are accessed in a lookup
traversal from root to leaf.
Contrast this with a balanced binary tree with 1 million search key values
— around 20 nodes are accessed in a lookup
• above difference is significant since every node access may need a
disk I/O, costing around 20 milliseconds
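A quick check of this arithmetic (assuming the slide's figures of n = 100, so a minimum fanout of ⌈n/2⌉ = 50):

```python
import math

K = 1_000_000                            # search-key values in the file
fanout = 50                              # ceil(n/2) with n = 100
print(math.ceil(math.log(K, fanout)))    # 4  -- nodes on a root-to-leaf path
print(math.ceil(math.log2(K)))           # 20 -- for a balanced binary tree
```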
243. Updates on B+-Trees: Insertion (Cont.)
Splitting a leaf node:
• take the n (search-key value, pointer) pairs (including the one being
inserted) in sorted order. Place the first ⌈n/2⌉ in the original node, and
the rest in a new node.
• let the new node be p, and let k be the least key value in p. Insert
(k,p) in the parent of the node being split.
• If the parent is full, split it and propagate the split further up.
Splitting of nodes proceeds upwards till a node that is not full is found.
• In the worst case the root node may be split increasing the height of
the tree by 1.
Result of splitting node containing Brandt, Califieri and Crick on inserting Adams
Next step: insert entry with (Califieri, pointer-to-new-node) into parent
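A minimal sketch of the leaf-split rule alone, with the n (key, pointer) pairs already in sorted order (names follow the slide's example; the pointer values are made up):

```python
import math

def split_leaf(pairs, n):
    """Split an overfull leaf holding n (key, pointer) pairs.

    Returns (original_pairs, new_pairs, separator_key), where separator_key
    is the least key in the new node, to be inserted into the parent.
    """
    cut = math.ceil(n / 2)               # first ceil(n/2) pairs stay in the original node
    original, new = pairs[:cut], pairs[cut:]
    return original, new, new[0][0]

# Slide example: inserting Adams into the node holding Brandt, Califieri, Crick.
pairs = [("Adams", "p1"), ("Brandt", "p2"), ("Califieri", "p3"), ("Crick", "p4")]
print(split_leaf(pairs, len(pairs)))
# Adams and Brandt stay; Califieri and Crick move; ('Califieri', new node) goes to the parent.
```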
246. Examples of B+-Tree Deletion
Deleting “Srinivasan” causes merging of under-full leaves
Before and after deleting “Srinivasan”
Affected nodes
247. Examples of B+-Tree Deletion (Cont.)
Leaf containing Singh and Wu became underfull, and borrowed a value
Kim from its left sibling
Search-key value in the parent changes as a result
Before and after deleting “Singh” and “Wu”
Affected nodes
248. Example of B+-tree Deletion (Cont.)
Node with Gold and Katz became underfull, and was merged with its sibling
Parent node becomes underfull, and is merged with its sibling
• Value separating two nodes (at the parent) is pulled down when merging
Root node then has only one child, and is deleted
Before and after deletion of “Gold”
249. B+-Tree File Organization
B+-Tree File Organization:
• Leaf nodes in a B+-tree file organization store records, instead of
pointers
• Helps keep data records clustered even when there are
insertions/deletions/updates
Leaf nodes are still required to be half full
• Since records are larger than pointers, the maximum number of
records that can be stored in a leaf node is less than the number of
pointers in a nonleaf node.
Insertion and deletion are handled in the same way as insertion and
deletion of entries in a B+-tree index.
250. B+-Tree File Organization (Cont.)
Example of B+-tree File Organization
Good space utilization important since records use more space than
pointers.
To improve space utilization, involve more sibling nodes in redistribution
during splits and merges
• Involving 2 siblings in redistribution (to avoid split / merge where
possible) results in each node having at least ⌊2n/3⌋ entries
251. Static Hashing
A bucket is a unit of storage containing one or more entries (a bucket
is typically a disk block).
• we obtain the bucket of an entry from its search-key value using a
hash function
Hash function h is a function from the set of all search-key values K to
the set of all bucket addresses B.
Hash function is used to locate entries for access, insertion as well as
deletion.
Entries with different search-key values may be mapped to the same
bucket; thus entire bucket has to be searched sequentially to locate an
entry.
In a hash index, buckets store entries with pointers to records
In a hash file-organization buckets store records
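A toy illustration of a static hash index, with Python's built-in hash standing in for h and hypothetical record ids as the entries:

```python
NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]    # each bucket: list of (key, record_id) entries

def h(search_key):
    return hash(search_key) % NUM_BUCKETS     # maps search-key values to bucket addresses

def insert(search_key, record_id):
    buckets[h(search_key)].append((search_key, record_id))

def lookup(search_key):
    # The whole bucket must be searched: entries with different search-key
    # values may be mapped to the same bucket.
    return [rid for k, rid in buckets[h(search_key)] if k == search_key]

insert("Physics", 101)
insert("Music", 202)
print(lookup("Physics"))    # [101]
```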
252. Handling of Bucket Overflows
Bucket overflow can occur because of
• Insufficient buckets
• Skew in distribution of records. This can occur due to two reasons:
multiple records have same search-key value
chosen hash function produces non-uniform distribution of key
values
Although the probability of bucket overflow can be reduced, it cannot be
eliminated; it is handled by using overflow buckets.
253. Handling of Bucket Overflows (Cont.)
Overflow chaining – the overflow buckets of a given bucket are chained
together in a linked list.
The above scheme is called closed addressing (also called closed hashing
or open hashing, depending on the book you use).
• An alternative, called open addressing (also called open hashing or
closed hashing, depending on the book you use), which does not use
overflow buckets, is not suitable for database applications.
254. Example of Hash File Organization
Hash file organization of instructor file, using dept_name as key.
255. Dynamic Hashing
Periodic rehashing
• If number of entries in a hash table becomes (say) 1.5 times size of
hash table,
create new hash table of size (say) 2 times the size of the
previous hash table
Rehash all entries to new table
Linear Hashing
• Do rehashing in an incremental manner
Extendable Hashing
• Tailored to disk based hashing, with buckets shared by multiple hash
values
• Doubling of # of entries in hash table, without doubling # of buckets
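A sketch of the periodic-rehashing idea only; the load factor and growth factor are the slide's illustrative numbers:

```python
class RehashingTable:
    """Grow the bucket array and rehash all entries once the table gets too full."""
    def __init__(self, num_buckets=4, load_factor=1.5):
        self.buckets = [[] for _ in range(num_buckets)]
        self.count = 0
        self.load_factor = load_factor

    def insert(self, key, value):
        if self.count >= self.load_factor * len(self.buckets):
            self._rehash(2 * len(self.buckets))        # new table twice the previous size
        self.buckets[hash(key) % len(self.buckets)].append((key, value))
        self.count += 1

    def _rehash(self, new_size):
        old_entries = [e for b in self.buckets for e in b]
        self.buckets = [[] for _ in range(new_size)]
        for k, v in old_entries:                       # rehash all entries to the new table
            self.buckets[hash(k) % new_size].append((k, v))
```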
256. Comparison of Ordered Indexing and Hashing
Cost of periodic re-organization
Relative frequency of insertions and deletions
Is it desirable to optimize average access time at the expense of worst-
case access time?
Expected type of queries:
• Hashing is generally better at retrieving records having a specified
value of the key.
• If range queries are common, ordered indices are to be preferred
In practice:
• PostgreSQL supports hash indices, but discourages use due to poor
performance
• Oracle supports static hash organization, but not hash indices
• SQLServer supports only B+-trees
258. Outline
Transaction Concept
Transaction State
Concurrent Executions
Serializability
Recoverability
Implementation of Isolation
Transaction Definition in SQL
Testing for Serializability.
259. Transaction Concept
A transaction is a unit of program execution that accesses and possibly
updates various data items.
E.g., transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Two main issues to deal with:
• Failures of various kinds, such as hardware failures and system
crashes
• Concurrent execution of multiple transactions
260. Example of Fund Transfer
Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Atomicity requirement
• If the transaction fails after step 3 and before step 6, money will be “lost”
leading to an inconsistent database state
Failure could be due to software or hardware
• The system should ensure that updates of a partially executed transaction
are not reflected in the database
Durability requirement — once the user has been notified that the transaction
has completed (i.e., the transfer of the $50 has taken place), the updates to the
database by the transaction must persist even if there are software or hardware
failures.
261. Example of Fund Transfer (Cont.)
Consistency requirement in above example:
• The sum of A and B is unchanged by the execution of the transaction
In general, consistency requirements include
• Explicitly specified integrity constraints such as primary keys and foreign
keys
• Implicit integrity constraints
e.g., sum of balances of all accounts, minus sum of loan amounts must
equal value of cash-in-hand
• A transaction must see a consistent database.
• During transaction execution the database may be temporarily
inconsistent.
• When the transaction completes successfully the database must be
consistent
Erroneous transaction logic can lead to inconsistency
262. Example of Fund Transfer (Cont.)
Isolation requirement — if between steps 3 and 6, another transaction T2
is allowed to access the partially updated database, it will see an
inconsistent database (the sum A + B will be less than it should be).
T1                                T2
1. read(A)
2. A := A – 50
3. write(A)
                                  read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
Isolation can be ensured trivially by running transactions serially
• That is, one after the other.
However, executing multiple transactions concurrently has significant
benefits, as we will see later.
263. ACID Properties
A transaction is a unit of program execution that accesses and possibly
updates various data items. To preserve the integrity of data, the database
system must ensure:
Atomicity. Either all operations of the transaction are properly reflected in
the database or none are.
Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
Isolation. Although multiple transactions may execute concurrently, each
transaction must be unaware of other concurrently executing transactions.
Intermediate transaction results must be hidden from other concurrently
executed transactions.
• That is, for every pair of transactions Ti and Tj, it appears to Ti that
either Tj finished execution before Ti started, or Tj started execution
after Ti finished.
Durability. After a transaction completes successfully, the changes it has
made to the database persist, even if there are system failures.
264. Transaction State
Active – the initial state; the transaction stays in this state while it is
executing
Partially committed – after the final statement has been executed.
Failed -- after the discovery that normal execution can no longer proceed.
Aborted – after the transaction has been rolled back and the database
restored to its state prior to the start of the transaction. Two options after it
has been aborted:
• Restart the transaction
Can be done only if no internal logical error
• Kill the transaction
Committed – after successful completion.
266. Concurrent Executions
Multiple transactions are allowed to run concurrently in the system.
Advantages are:
• Increased processor and disk utilization, leading to better
transaction throughput
E.g., one transaction can be using the CPU while another is
reading from or writing to the disk
• Reduced average response time for transactions: short transactions
need not wait behind long ones.
Concurrency control schemes – mechanisms to achieve isolation
• That is, to control the interaction among the concurrent transactions in
order to prevent them from destroying the consistency of the database
Will study in Chapter 15, after studying notion of correctness of
concurrent executions.
267. Schedules
Schedule – a sequence of instructions that specifies the chronological order
in which instructions of concurrent transactions are executed
• A schedule for a set of transactions must consist of all instructions of
those transactions
• Must preserve the order in which the instructions appear in each
individual transaction.
A transaction that successfully completes its execution will have a commit
instruction as the last statement
• By default transaction assumed to execute commit instruction as its last
step
A transaction that fails to successfully complete its execution will have an
abort instruction as the last statement
268. Schedule 1
Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from
A to B.
A serial schedule in which T1 is followed by T2 :
269. Schedule 2
A serial schedule where T2 is followed by T1
270. Schedule 3
Let T1 and T2 be the transactions defined previously. The following
schedule is not a serial schedule, but it is equivalent to Schedule 1
In Schedules 1, 2 and 3, the sum A + B is preserved.
271. Schedule 4
The following concurrent schedule does not preserve the value of (A + B ).
272. Serializability
Basic Assumption – Each transaction preserves database consistency.
Thus, serial execution of a set of transactions preserves database
consistency.
A (possibly concurrent) schedule is serializable if it is equivalent to a serial
schedule. Different forms of schedule equivalence give rise to the notions of:
1. Conflict serializability
2. View serializability
273. Conflicting Instructions
Instructions li and lj of transactions Ti and Tj respectively, conflict if and
only if there exists some item Q accessed by both li and lj, and at least one
of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
Intuitively, a conflict between li and lj forces a (logical) temporal order
between them.
If li and lj are consecutive in a schedule and they do not conflict, their
results would remain the same even if they had been interchanged in the
schedule.
274. Conflict Serializability
If a schedule S can be transformed into a schedule S’ by a series of swaps
of non-conflicting instructions, we say that S and S’ are conflict
equivalent.
We say that a schedule S is conflict serializable if it is conflict equivalent
to a serial schedule
275. Conflict Serializability (Cont.)
Schedule 3 can be transformed into Schedule 6, a serial schedule where T2
follows T1, by series of swaps of non-conflicting instructions. Therefore
Schedule 3 is conflict serializable.
Schedule 3 Schedule 6
276. Conflict Serializability (Cont.)
Example of a schedule that is not conflict serializable:
We are unable to swap instructions in the above schedule to obtain either
the serial schedule < T3, T4 >, or the serial schedule < T4, T3 >.
277. View Serializability
Let S and S’ be two schedules with the same set of transactions. S and S’
are view equivalent if the following three conditions are met, for each data
item Q,
1. If in schedule S, transaction Ti reads the initial value of Q, then in
schedule S’ also transaction Ti must read the initial value of Q.
2. If in schedule S transaction Ti executes read(Q), and that value was
produced by transaction Tj (if any), then in schedule S’ also
transaction Ti must read the value of Q that was produced by the
same write(Q) operation of transaction Tj .
3. The transaction (if any) that performs the final write(Q) operation in
schedule S must also perform the final write(Q) operation in schedule S’.
As can be seen, view equivalence is also based purely on reads and writes
alone.
278. View Serializability (Cont.)
A schedule S is view serializable if it is view equivalent to a serial
schedule.
Every conflict serializable schedule is also view serializable.
Below is a schedule which is view-serializable but not conflict serializable.
What serial schedule is above equivalent to?
Every view serializable schedule that is not conflict serializable has blind
writes.
279. Other Notions of Serializability
The schedule below produces same outcome as the serial schedule
< T1, T5 >, yet is not conflict equivalent or view equivalent to it.
Determining such equivalence requires analysis of operations other
than read and write.
280. Testing for Serializability
Consider some schedule of a set of transactions T1, T2, ..., Tn
Precedence graph — a directed graph where the vertices are the
transactions (names).
We draw an arc from Ti to Tj if the two transactions conflict, and Ti
accessed the data item on which the conflict arose earlier.
We may label the arc by the item that was accessed.
Example of a precedence graph
281. Test for Conflict Serializability
A schedule is conflict serializable if and only if its precedence graph is acyclic.
Cycle-detection algorithms exist which take order n² time, where n is the
number of vertices in the graph.
• (Better algorithms take order n + e, where e is the number of edges.)
If the precedence graph is acyclic, the serializability order can be obtained
by a topological sorting of the graph.
• This is a linear order consistent with the partial order of the graph.
• For example, a serializability order for Schedule A would be
T5 → T1 → T3 → T2 → T4
Are there others?
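A small sketch of this test, assuming each transaction's operations are given as (transaction, op, item) triples in schedule order; the schedule at the bottom is made up for illustration:

```python
def precedence_graph(schedule):
    """schedule: list of (txn, op, item) with op in {'r', 'w'}, in execution order."""
    edges = set()
    for i, (ti, opi, qi) in enumerate(schedule):
        for tj, opj, qj in schedule[i + 1:]:
            # Conflicting instructions: same item, different transactions, at least one write.
            if ti != tj and qi == qj and "w" in (opi, opj):
                edges.add((ti, tj))     # Ti accessed the item earlier, so edge Ti -> Tj
    return edges

def is_conflict_serializable(schedule):
    edges = precedence_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    order = []
    # Acyclic iff we can repeatedly remove a node with no incoming edges (topological sort).
    while nodes:
        free = {n for n in nodes if not any(v == n for _, v in edges)}
        if not free:
            return False, []            # cycle: not conflict serializable
        n = free.pop()
        order.append(n)
        nodes.remove(n)
        edges = {e for e in edges if e[0] != n}
    return True, order                  # order is one valid serializability order

schedule = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A")]
print(is_conflict_serializable(schedule))   # (True, ['T1', 'T2'])
```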
282. Recoverable Schedules
Recoverable schedule — if a transaction Tj reads a data item previously
written by a transaction Ti , then the commit operation of Ti appears before
the commit operation of Tj.
The following schedule (Schedule 11) is not recoverable
If T8 should abort, T9 would have read (and possibly shown to the user) an
inconsistent database state. Hence, database must ensure that schedules
are recoverable.
Need to address the effect of transaction failures on concurrently
running transactions.
283. Cascading Rollbacks
Cascading rollback – a single transaction failure leads to a series of
transaction rollbacks. Consider the following schedule where none of the
transactions has yet committed (so the schedule is recoverable)
If T10 fails, T11 and T12 must also be rolled back.
Can lead to the undoing of a significant amount of work
284. Cascadeless Schedules
Cascadeless schedules — cascading rollbacks cannot occur;
• For each pair of transactions Ti and Tj such that Tj reads a data item
previously written by Ti, the commit operation of Ti appears before the
read operation of Tj.
Every Cascadeless schedule is also recoverable
It is desirable to restrict the schedules to those that are cascadeless
285. Concurrency Control
A database must provide a mechanism that will ensure that all possible
schedules are
• either conflict or view serializable, and
• are recoverable and preferably cascadeless
A policy in which only one transaction can execute at a time generates
serial schedules, but provides a poor degree of concurrency
• Are serial schedules recoverable/cascadeless?
Testing a schedule for serializability after it has executed is a little too late!
Goal – to develop concurrency control protocols that will assure
serializability.
286. Concurrency Control (Cont.)
Schedules must be conflict or view serializable, and recoverable, for the
sake of database consistency, and preferably cascadeless.
A policy in which only one transaction can execute at a time generates
serial schedules, but provides a poor degree of concurrency.
Concurrency-control schemes tradeoff between the amount of concurrency
they allow and the amount of overhead that they incur.
Some schemes allow only conflict-serializable schedules to be generated,
while others allow view-serializable schedules that are not conflict-
serializable.
287. Outline
Lock-Based Protocols
Timestamp-Based Protocols
Validation-Based Protocols
Multiple Granularity
Multiversion Schemes
Insert and Delete Operations
Concurrency in Index Structures
288. Lock-Based Protocols
A lock is a mechanism to control concurrent access to a data item
Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well as
written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
Lock requests are made to concurrency-control manager. Transaction can
proceed only after request is granted.
289. Lock-Based Protocols (Cont.)
Lock-compatibility matrix
A transaction may be granted a lock on an item if the requested lock is
compatible with locks already held on the item by other transactions
Any number of transactions can hold shared locks on an item,
but if any transaction holds an exclusive lock on the item no other
transaction may hold any lock on the item.
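The lock-compatibility matrix itself is tiny; a sketch of the granting check (the modes and structure are generic, not a particular system's API):

```python
# True means the requested mode (first) is compatible with a held mode (second).
COMPATIBLE = {
    ("S", "S"): True,  ("S", "X"): False,
    ("X", "S"): False, ("X", "X"): False,
}

def can_grant(requested_mode, held_modes):
    """Grant a lock only if the requested mode is compatible with every lock
    already held on the item by other transactions."""
    return all(COMPATIBLE[(requested_mode, m)] for m in held_modes)

print(can_grant("S", ["S", "S"]))   # True  -- any number of shared locks
print(can_grant("X", ["S"]))        # False -- exclusive conflicts with everything
```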
290. Schedule With Lock Grants
Grants omitted in rest of chapter
• Assume grant happens just before the next instruction following the
lock request
This schedule is not serializable (why?)
A locking protocol is a set of rules followed by all transactions while
requesting and releasing locks.
Locking protocols enforce serializability by restricting the set of possible
schedules.
291. Deadlock
Consider the partial schedule
Neither T3 nor T4 can make progress — executing lock-S(B) causes T4
to wait for T3 to release its lock on B, while executing lock-X(A) causes
T3 to wait for T4 to release its lock on A.
Such a situation is called a deadlock.
• To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.
292. Deadlock (Cont.)
The potential for deadlock exists in most locking protocols. Deadlocks are
a necessary evil.
Starvation is also possible if concurrency control manager is badly
designed. For example:
• A transaction may be waiting for an X-lock on an item, while a
sequence of other transactions request and are granted an S-lock on
the same item.
• The same transaction is repeatedly rolled back due to deadlocks.
Concurrency control manager can be designed to prevent starvation.
293. The Two-Phase Locking Protocol
A protocol which ensures conflict-serializable schedules.
Phase 1: Growing Phase
• Transaction may obtain locks
• Transaction may not release locks
Phase 2: Shrinking Phase
• Transaction may release locks
• Transaction may not obtain locks
The protocol assures serializability. It can be proved that the transactions
can be serialized in the order of their lock points (i.e., the point where a
transaction acquired its final lock).
[Figure: number of locks held vs. time, showing the growing and shrinking phases]
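A toy enforcement of the two-phase rule inside a transaction object; this is a sketch only, since real lock managers also handle blocking, deadlocks, and lock modes:

```python
class TwoPhaseTransaction:
    """Tracks the growing and shrinking phases of two-phase locking."""
    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.locks = set()
        self.shrinking = False      # becomes True once the transaction releases its first lock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot acquire locks in the shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True       # entering the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.lock("A"); t.lock("B")
t.unlock("A")
try:
    t.lock("C")                     # violates 2PL: acquiring a lock after releasing one
except RuntimeError as e:
    print(e)
```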
294. The Two-Phase Locking Protocol (Cont.)
Two-phase locking does not ensure freedom from deadlocks
Extensions to basic two-phase locking are needed to ensure recoverability and
freedom from cascading roll-back
• Strict two-phase locking: a transaction must hold all its exclusive
locks till it commits/aborts.
Ensures recoverability and avoids cascading roll-backs
• Rigorous two-phase locking: a transaction must hold all locks till
commit/abort.
Transactions can be serialized in the order in which they commit.
Most databases implement rigorous two-phase locking, but refer to it as
simply two-phase locking
295. The Two-Phase Locking Protocol (Cont.)
Two-phase locking is not a necessary condition for serializability
• There are conflict serializable schedules that cannot be obtained if the
two-phase locking protocol is used.
In the absence of extra information (e.g., ordering of access to data),
two-phase locking is necessary for conflict serializability in the following
sense:
• Given a transaction Ti that does not follow two-phase locking, we can
find a transaction Tj that uses two-phase locking, and a schedule for Ti
and Tj that is not conflict serializable.
296. Locking Protocols
Given a locking protocol (such as 2PL)
• A schedule S is legal under a locking protocol if it can be generated
by a set of transactions that follow the protocol
• A protocol ensures serializability if all legal schedules under that
protocol are serializable
297. Lock Conversions
Two-phase locking protocol with lock conversions:
– Growing Phase:
• can acquire a lock-S on item
• can acquire a lock-X on item
• can convert a lock-S to a lock-X (upgrade)
– Shrinking Phase:
• can release a lock-S
• can release a lock-X
• can convert a lock-X to a lock-S (downgrade)
This protocol ensures serializability
298. Deadlock Handling
System is deadlocked if there is a set of transactions such that every
transaction in the set is waiting for another transaction in the set.
299. Deadlock Handling
Deadlock prevention protocols ensure that the system will never enter
into a deadlock state. Some prevention strategies:
• Require that each transaction locks all its data items before it begins
execution (pre-declaration).
• Impose partial ordering of all data items and require that a
transaction can lock data items only in the order specified by the
partial order (graph-based protocol).
300. More Deadlock Prevention Strategies
wait-die scheme — non-preemptive
• Older transaction may wait for younger one to release data item.
• Younger transactions never wait for older ones; they are rolled back
instead.
• A transaction may die several times before acquiring a lock
wound-wait scheme — preemptive
• Older transaction wounds (forces rollback of) the younger transaction
instead of waiting for it.
• Younger transactions may wait for older ones.
• Fewer rollbacks than wait-die scheme.
In both schemes, a rolled-back transaction is restarted with its original
timestamp.
• Ensures that older transactions have precedence over newer ones,
and starvation is thus avoided.
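A sketch of the two decision rules, using transaction timestamps (a smaller timestamp means an older transaction); this is only the decision, not a full lock manager:

```python
def wait_die(requester_ts, holder_ts):
    """Non-preemptive: older transactions wait, younger ones die (roll back)."""
    return "wait" if requester_ts < holder_ts else "rollback(requester)"

def wound_wait(requester_ts, holder_ts):
    """Preemptive: an older requester wounds (rolls back) the younger holder."""
    return "rollback(holder)" if requester_ts < holder_ts else "wait"

# T1 (ts=1, older) requests an item held by T2 (ts=2, younger):
print(wait_die(1, 2))     # wait
print(wound_wait(1, 2))   # rollback(holder)
# T2 requests an item held by T1:
print(wait_die(2, 1))     # rollback(requester)
print(wound_wait(2, 1))   # wait
```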
301. Deadlock prevention (Cont.)
Timeout-Based Schemes:
• A transaction waits for a lock only for a specified amount of time. After
that, the wait times out and the transaction is rolled back.
• Ensures that deadlocks get resolved by timeout if they occur
• Simple to implement
• But may roll back transaction unnecessarily in absence of deadlock
Difficult to determine good value of the timeout interval.
• Starvation is also possible
302. Deadlock Detection
Wait-for graph
• Vertices: transactions
• Edge from Ti → Tj if Ti is waiting for a lock held in a conflicting mode
by Tj
The system is in a deadlock state if and only if the wait-for graph has a
cycle.
Invoke a deadlock-detection algorithm periodically to look for cycles.
Wait-for graph without a cycle Wait-for graph with a cycle
303. Deadlock Recovery
When deadlock is detected:
• Some transaction will have to be rolled back (made a victim) to break
the deadlock cycle.
Select that transaction as victim that will incur minimum cost
• Rollback -- determine how far to roll back transaction
Total rollback: Abort the transaction and then restart it.
Partial rollback: Roll back victim transaction only as far as
necessary to release locks that another transaction in cycle is
waiting for
Starvation can happen (why?)
• One solution: oldest transaction in the deadlock set is never chosen
as victim
304. Multiple Granularity
Allow data items to be of various sizes and define a hierarchy of data
granularities, where the small granularities are nested within larger ones
Can be represented graphically as a tree (but don't confuse with tree-
locking protocol)
When a transaction locks a node in the tree explicitly, it implicitly locks all
the node's descendants in the same mode.
Granularity of locking (level in tree where locking is done):
• Fine granularity (lower in tree): high concurrency, high locking
overhead
• Coarse granularity (higher in tree): low locking overhead, low
concurrency
306. Example of Granularity Hierarchy
The levels, starting from the coarsest (top) level are
• database
• area
• file
• record
The corresponding tree
307. Intention Lock Modes
In addition to S and X lock modes, there are three additional lock modes
with multiple granularity:
• intention-shared (IS): indicates explicit locking at a lower level of the
tree but only with shared locks.
• intention-exclusive (IX): indicates explicit locking at a lower level with
exclusive or shared locks
• shared and intention-exclusive (SIX): the subtree rooted by that
node is locked explicitly in shared mode and explicit locking is being
done at a lower level with exclusive-mode locks.
Intention locks allow a higher level node to be locked in S or X mode
without having to check all descendent nodes.
310. Failure Classification
Transaction failure :
• Logical errors: transaction cannot complete due to some internal
error condition
• System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
System crash: a power failure or other hardware or software failure
causes the system to crash.
• Fail-stop assumption: non-volatile storage contents are assumed to
not be corrupted by system crash
Database systems have numerous integrity checks to prevent
corruption of disk data
Disk failure: a head crash or similar disk failure destroys all or part of disk
storage
• Destruction is assumed to be detectable: disk drives use checksums to
detect failures
311. Recovery Algorithms
Suppose transaction Ti transfers $50 from account A to account B
• Two updates: subtract 50 from A and add 50 to B
Transaction Ti requires updates to A and B to be output to the database.
• A failure may occur after one of these modifications has been made
but before both of them are made.
• Modifying the database without ensuring that the transaction will
commit may leave the database in an inconsistent state
• Not modifying the database may result in lost updates if failure occurs
just after transaction commits
Recovery algorithms have two parts
1. Actions taken during normal transaction processing to ensure enough
information exists to recover from failures
2. Actions taken after a failure to recover the database contents to a state
that ensures atomicity, consistency and durability
312. Storage Structure
Volatile storage:
• Does not survive system crashes
• Examples: main memory, cache memory
Nonvolatile storage:
• Survives system crashes
• Examples: disk, tape, flash memory, non-volatile RAM
• But may still fail, losing data
Stable storage:
• A mythical form of storage that survives all failures
• Approximated by maintaining multiple copies on distinct nonvolatile
media
• See book for more details on how to implement stable storage
313. Stable-Storage Implementation
Maintain multiple copies of each block on separate disks
• copies can be at remote sites to protect against disasters such as fire
or flooding.
Failure during data transfer can still result in inconsistent copies: Block
transfer can result in
• Successful completion
• Partial failure: destination block has incorrect information
• Total failure: destination block was never updated
Protecting storage media from failure during data transfer (one solution):
• Execute output operation as follows (assuming two copies of each
block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same
information onto the second physical block.
3. The output is completed only after the second write successfully
completes.
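A sketch of the two-copy output rule; the block ids and the dicts standing in for the two physical disks are made up for illustration:

```python
def output_block(block_id, data, disks):
    """Write one logical block to two physical copies, one after the other.

    disks -- list of two dicts standing in for the two physical disks.
    The output is considered complete only after the second write finishes,
    so a failure mid-transfer can leave at most one copy inconsistent.
    """
    disks[0][block_id] = data          # 1. write the first physical copy
    # (only after the first write completes successfully...)
    disks[1][block_id] = data          # 2. ...write the same data to the second copy
    return True                        # 3. output complete

disk_a, disk_b = {}, {}
output_block("B7", b"payload", [disk_a, disk_b])
```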
314. Protecting storage media from failure (Cont.)
Copies of a block may differ due to failure during output operation.
To recover from failure:
1. First find inconsistent blocks:
1. Expensive solution: Compare the two copies of every disk block.
2. Better solution:
• Record in-progress disk writes on non-volatile storage (Flash,
Non-volatile RAM or special area of disk).
• Use this information during recovery to find blocks that may
be inconsistent, and only compare copies of these.
• Used in hardware RAID systems
2. If either copy of an inconsistent block is detected to have an error
(bad checksum), overwrite it by the other copy. If both have no error,
but are different, overwrite the second block by the first block.
315. Data Access
Physical blocks are those blocks residing on the disk.
Buffer blocks are the blocks residing temporarily in main memory.
Block movements between disk and main memory are initiated through
the following two operations:
• input (B) transfers the physical block B to main memory.
• output (B) transfers the buffer block B to the disk, and replaces the
appropriate physical block there.
We assume, for simplicity, that each data item fits in, and is stored inside,
a single block.
316. Data Access (Cont.)
Each transaction Ti has its private work-area in which local copies of all
data items accessed and updated by it are kept.
• Ti 's local copy of a data item X is called xi.
Transferring data items between system buffer blocks and its private work-
area done by:
• read(X) assigns the value of data item X to the local variable xi.
• write(X) assigns the value of local variable xi to data item X in the
buffer block.
• Note: output(BX) need not immediately follow write(X). System can
perform the output operation when it deems fit.
Transactions
• Must perform read(X) before accessing X for the first time (subsequent
reads can be from local copy)
• write(X) can be executed at any time before the transaction commits
318. Recovery and Atomicity
To ensure atomicity despite failures, we first output information describing
the modifications to stable storage without modifying the database itself.
We study log-based recovery mechanisms in detail
• We first present key concepts
• And then present the actual recovery algorithm
Less used alternative: shadow-copy and shadow-paging (brief details in
book)
319. Log-Based Recovery
A log is a sequence of log records. The records keep information about
update activities on the database.
• The log is kept on stable storage
When transaction Ti starts, it registers itself by writing a
<Ti start> log record
Before Ti executes write(X), a log record
<Ti, X, V1, V2>
is written, where V1 is the value of X before the write (the old
value), and V2 is the value to be written to X (the new value).
When Ti finishes its last statement, the log record <Ti commit> is written.
Two approaches using logs
• Immediate database modification
• Deferred database modification.
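A minimal in-memory sketch of writing these log records around a transaction's updates; the log is just a Python list here, whereas a real system appends the records to stable storage:

```python
log = []                      # stands in for the log on stable storage
db = {"A": 1000, "B": 2000}   # data items and their current values

def start(txn):
    log.append(f"<{txn} start>")

def write(txn, item, new_value):
    old_value = db[item]
    log.append(f"<{txn}, {item}, {old_value}, {new_value}>")   # log record before the write
    db[item] = new_value

def commit(txn):
    log.append(f"<{txn} commit>")

start("T0")
write("T0", "A", 950)
write("T0", "B", 2050)
commit("T0")
print(log)
# ['<T0 start>', '<T0, A, 1000, 950>', '<T0, B, 2000, 2050>', '<T0 commit>']
```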
320. Immediate Database Modification
The immediate-modification scheme allows updates of an
uncommitted transaction to be made to the buffer, or the disk itself,
before the transaction commits
Update log record must be written before database item is written
• We assume that the log record is output directly to stable storage
• (We will see later how to postpone log record output to some
extent)
Output of updated blocks to disk can take place at any time before or
after transaction commit
Order in which blocks are output can be different from the order in which
they are written.
The deferred-modification scheme performs updates to buffer/disk only
at the time of transaction commit
• Simplifies some aspects of recovery
• But has overhead of storing local copy
321. Transaction Commit
A transaction is said to have committed when its commit log record is
output to stable storage
• All previous log records of the transaction must have been output
already
Writes performed by a transaction may still be in the buffer when the
transaction commits, and may be output later
322. Immediate Database Modification Example
Log                    Write        Output
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                       A = 950
                       B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                       C = 600      BB, BC
<T1 commit>
                                    BA
Note: BX denotes the block containing X.
BC is output before T1 commits; BA is output after T0 commits.
323. Concurrency Control and Recovery
With concurrent transactions, all transactions share a single disk buffer and
a single log
• A buffer block can have data items updated by one or more
transactions
We assume that if a transaction Ti has modified an item, no other
transaction can modify the same item until Ti has committed or aborted
• i.e., the updates of uncommitted transactions should not be visible to
other transactions
Otherwise, how to perform undo if T1 updates A, then T2 updates A
and commits, and finally T1 has to abort?
• Can be ensured by obtaining exclusive locks on updated items and
holding the locks till end of transaction (strict two-phase locking)
Log records of different transactions may be interspersed in the log.
324. Undo and Redo Operations
Undo and Redo of Transactions
• undo(Ti) -- restores the value of all data items updated by Ti to their
old values, going backwards from the last log record for Ti
Each time a data item X is restored to its old value V a special log
record <Ti , X, V> is written out
When undo of a transaction is complete, a log record
<Ti abort> is written out.
• redo(Ti) -- sets the value of all data items updated by Ti to the new
values, going forward from the first log record for Ti
No logging is done in this case
325. Recovering from Failure
When recovering after failure:
• Transaction Ti needs to be undone if the log
Contains the record <Ti start>,
But does not contain either the record <Ti commit> or <Ti abort>.
• Transaction Ti needs to be redone if the log
Contains the records <Ti start>
And contains the record <Ti commit> or <Ti abort>
326. Recovering from Failure (Cont.)
Suppose that transaction Ti was undone earlier and the <Ti abort> record
was written to the log, and then a failure occurs,
On recovery from failure transaction Ti is redone
• Such a redo redoes all the original actions of transaction Ti including
the steps that restored old values
Known as repeating history
Seems wasteful, but simplifies recovery greatly
327. Checkpoints
Redoing/undoing all transactions recorded in the log can be very slow
• Processing the entire log is time-consuming if the system has run for a
long time
• We might unnecessarily redo transactions which have already output
their updates to the database.
Streamline recovery procedure by periodically performing checkpointing
1. Output all log records currently residing in main memory onto stable
storage.
2. Output all modified buffer blocks to the disk.
3. Write a log record < checkpoint L> onto stable storage where L is a
list of all transactions active at the time of checkpoint.
4. All updates are stopped while doing checkpointing
328. Checkpoints (Cont.)
During recovery we need to consider only the most recent transaction Ti
that started before the checkpoint, and transactions that started after Ti.
• Scan backwards from end of log to find the most recent <checkpoint
L> record
• Only transactions that are in L or started after the checkpoint need to
be redone or undone
• Transactions that committed or aborted before the checkpoint
already have all their updates output to stable storage.
Some earlier part of the log may be needed for undo operations
• Continue scanning backwards till a record <Ti start> is found for
every transaction Ti in L.
• Parts of log prior to earliest <Ti start> record above are not needed
for recovery, and can be erased whenever desired.
329. Example of Checkpoints
T1 can be ignored (updates already output to disk due to
checkpoint)
T2 and T3 redone.
T4 undone
330. Recovery Algorithm
Logging (during normal operation):
• <Ti start> at transaction start
• <Ti, Xj, V1, V2> for each update, and
• <Ti commit> at transaction end
Transaction rollback (during normal operation)
• Let Ti be the transaction to be rolled back
• Scan log backwards from the end, and for each log record of Ti of the
form <Ti, Xj, V1, V2>
Perform the undo by writing V1 to Xj,
Write a log record <Ti , Xj, V1>
• such log records are called compensation log records
• Once the record <Ti start> is found stop the scan and write the log
record <Ti abort>
331. Recovery Algorithm (Cont.)
Recovery from failure: Two phases
• Redo phase: replay updates of all transactions, whether they
committed, aborted, or are incomplete
• Undo phase: undo all incomplete transactions
Redo phase:
1. Find last <checkpoint L> record, and set undo-list to L.
2. Scan forward from above <checkpoint L> record
1. Whenever a record <Ti, Xj, V1, V2> or <Ti, Xj, V2> is found, redo
it by writing V2 to Xj
2. Whenever a log record <Ti start> is found, add Ti to undo-list
3. Whenever a log record <Ti commit> or <Ti abort> is found,
remove Ti from undo-list
332. Recovery Algorithm (Cont.)
Undo phase:
1. Scan log backwards from end
1. Whenever a log record <Ti, Xj, V1, V2> is found where Ti is in
undo-list perform same actions as for transaction rollback:
1. perform undo by writing V1 to Xj.
2. write a log record <Ti , Xj, V1>
2. Whenever a log record <Ti start> is found where Ti is in undo-list,
1. Write a log record <Ti abort>
2. Remove Ti from undo-list
3. Stop when undo-list is empty
1. i.e., <Ti start> has been found for every transaction in undo-list
After undo phase completes, normal transaction processing can commence
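A compact sketch of the two passes over a parsed log, assuming each record is already a tuple such as ('start', Ti), ('update', Ti, Xj, V1, V2), ('clr', Ti, Xj, V), ('commit', Ti), ('abort', Ti), or ('checkpoint', L), and that at least one checkpoint record exists; this mirrors the steps above but is illustrative only:

```python
def recover(log, db):
    # --- Redo phase: repeat history from the last checkpoint ---
    cp = max(i for i, r in enumerate(log) if r[0] == "checkpoint")
    undo_list = set(log[cp][1])                     # L: transactions active at the checkpoint
    for rec in log[cp + 1:]:
        kind = rec[0]
        if kind == "update":                        # <Ti, Xj, V1, V2>: redo by writing V2
            _, ti, xj, v1, v2 = rec
            db[xj] = v2
        elif kind == "clr":                         # <Ti, Xj, V>: redo-only record
            _, ti, xj, v = rec
            db[xj] = v
        elif kind == "start":
            undo_list.add(rec[1])
        elif kind in ("commit", "abort"):
            undo_list.discard(rec[1])

    # --- Undo phase: roll back all incomplete transactions ---
    new_records = []
    for rec in reversed(log):
        if not undo_list:
            break                                   # <Ti start> found for all of undo-list
        if rec[0] == "update" and rec[1] in undo_list:
            _, ti, xj, v1, v2 = rec
            db[xj] = v1                             # restore the old value
            new_records.append(("clr", ti, xj, v1)) # write a compensation log record
        elif rec[0] == "start" and rec[1] in undo_list:
            new_records.append(("abort", rec[1]))
            undo_list.discard(rec[1])
    log.extend(new_records)
    return db
```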
334. Database Buffering
Database maintains an in-memory buffer of data blocks
• When a new block is needed, if buffer is full an existing block needs to
be removed from buffer
• If the block chosen for removal has been updated, it must be output to
disk
The recovery algorithm supports the no-force policy: i.e., updated blocks
need not be written to disk when transaction commits
• force policy: requires updated blocks to be written at commit
More expensive commit
The recovery algorithm supports the steal policy: i.e., blocks containing
updates of uncommitted transactions can be written to disk, even before the
transaction commits
335. Database Buffering (Cont.)
If a block with uncommitted updates is output to disk, log records with
undo information for the updates are output to the log on stable storage
first
• (Write ahead logging)
No updates should be in progress on a block when it is output to disk. Can
be ensured as follows.
• Before writing a data item, transaction acquires exclusive lock on
block containing the data item
• Lock can be released once the write is completed.
Such locks held for short duration are called latches.
To output a block to disk
1. First acquire an exclusive latch on the block
Ensures no update can be in progress on the block
2. Then perform a log flush
3. Then output the block to disk
4. Finally release the latch on the block
336. Failure with Loss of Nonvolatile Storage
So far we assumed no loss of non-volatile storage
Technique similar to checkpointing used to deal with loss of non-volatile
storage
• Periodically dump the entire content of the database to stable
storage
• No transaction may be active during the dump procedure; a
procedure similar to checkpointing must take place
Output all log records currently residing in main memory onto
stable storage.
Output all buffer blocks onto the disk.
Copy the contents of the database to stable storage.
Output a record <dump> to log on stable storage.
337. Recovering from Failure of Non-Volatile Storage
To recover from disk failure
• restore database from most recent dump.
• Consult the log and redo all transactions that committed after the dump
Can be extended to allow transactions to be active during dump;
known as fuzzy dump or online dump
• Similar to fuzzy checkpointing
338. ARIES
ARIES is a state of the art recovery method
• Incorporates numerous optimizations to reduce overheads during
normal processing and to speed up recovery
• The recovery algorithm we studied earlier is modeled after ARIES, but
greatly simplified by removing optimizations
Unlike the recovery algorithm described earlier, ARIES
1. Uses log sequence number (LSN) to identify log records
Stores LSNs in pages to identify what updates have already been
applied to a database page
2. Physiological redo
3. Dirty page table to avoid unnecessary redos during recovery
4. Fuzzy checkpointing that only records information about dirty pages,
and does not require dirty pages to be written out at checkpoint time
More coming up on each of the above …
339. ARIES Data Structures: Log Record
Each log record contains LSN of previous log record of the same
transaction
• LSN in log record may be implicit
Special redo-only log record called compensation log record (CLR) used
to log actions taken during recovery that never need to be undone
• Serves the role of operation-abort log records used in earlier recovery
algorithm
• Has a field UndoNextLSN to note next (earlier) record to be undone
Records in between would have already been undone
Required to avoid repeated undo of already undone actions
Update log record:        LSN TransID PrevLSN RedoInfo UndoInfo
Compensation log record:  LSN TransID UndoNextLSN RedoInfo
341. ARIES Recovery Algorithm
ARIES recovery involves three passes
Analysis pass: Determines
• Which transactions to undo
• Which pages were dirty (disk version not up to date) at time of crash
• RedoLSN: LSN from which redo should start
Redo pass:
• Repeats history, redoing all actions from RedoLSN
RecLSN and PageLSNs are used to avoid redoing actions already
reflected on page
Undo pass:
• Rolls back all incomplete transactions
Transactions whose abort was complete earlier are not undone
• Key idea: no need to undo these transactions: earlier undo
actions were logged, and are redone as required
343. Centralized Database Systems
Run on a single computer system
Single-user system
Multi-user systems also known as server systems.
• Service requests received from client systems
• Multi-core systems with coarse-grained parallelism
Typically, a few to tens of processor cores
In contrast, fine-grained parallelism uses a very large number
of computers
344. Speed-Up and Scale-Up
Speedup: a fixed-sized problem executing on a small system is given to a
system which is N-times larger.
• Measured by:
  speedup = (small system elapsed time) / (large system elapsed time)
• Speedup is linear if the equation equals N.
Scaleup: increase the size of both the problem and the system
• N-times larger system used to perform N-times larger job
• Measured by:
  scaleup = (small system small problem elapsed time) / (big system big problem elapsed time)
• Scaleup is linear if the equation equals 1.
347. Distributed Systems
Data spread over multiple machines (also referred to as sites or nodes).
Local-area networks (LANs)
Wide-area networks (WANs)
• Higher latency
[Figure: sites A, B, and C communicating via a network]
348. Distributed Databases
Homogeneous distributed databases
• Same software/schema on all sites, data may be partitioned among
sites
• Goal: provide a view of a single database, hiding details of distribution
Heterogeneous distributed databases
• Different software/schema on different sites
• Goal: integrate existing databases to provide useful functionality
Differentiate between local transactions and global transactions
• A local transaction accesses data in the single site at which the
transaction was initiated.
• A global transaction either accesses data in a site different from the
one at which the transaction was initiated or accesses data in several
different sites.
349. Data Integration and Distributed Databases
Data integration between multiple distributed databases
Benefits:
• Sharing data – users at one site able to access the data residing at
some other sites.
• Autonomy – each site is able to retain a degree of control over data
stored locally.
350. Availability
Network partitioning
Availability of system
• If all nodes are required for system to function, failure of even one
node stops system functioning.
• Higher system availability through redundancy
data can be replicated at remote sites, and system can function
even if a site fails.
351. Implementation Issues for Distributed Databases
Atomicity needed even for transactions that update data at multiple sites
The two-phase commit protocol (2PC) is used to ensure atomicity
• Basic idea: each site executes the transaction until just before commit, and
then leaves the final decision to a coordinator
• Each site must follow the decision of the coordinator, even if there is a failure
while waiting for the coordinator's decision
2PC is not always appropriate: other transaction models based on
persistent messaging, and workflows, are also used
Distributed concurrency control (and deadlock detection) required
Data items may be replicated to improve data availability
Details of all above in Chapter 24
352. Cloud Based Services
Cloud computing widely adopted today
• On-demand provisioning and elasticity
ability to scale up at short notice and to release unused
resources for use by others
Infrastructure as a service
• Virtual machines/real machines
Platform as a service
• Storage, databases, application server
Software as a service
• Enterprise applications, emails, shared documents, etc.
Potential drawbacks
• Security
• Network bandwidth
355. Application Deployment Architectures
Services
Microservice Architecture
• Application uses a variety of services
• Service can add or remove instances as required
Kubernetes supports containers, and microservices
356. Outline
Complex Data Types and Object Orientation
Structured Data Types and Inheritance in SQL
Table Inheritance
Array and Multiset Types in SQL
Object Identity and Reference Types in SQL
Implementing O-R Features
Persistent Programming Languages
Comparison of Object-Oriented and Object-Relational Databases
357. Object-Relational Data Models
Extend the relational data model by including object orientation and
constructs to deal with added data types.
Allow attributes of tuples to have complex types, including non-atomic
values such as nested relations.
Preserve relational foundations, in particular the declarative access to
data, while extending modeling power.
Upward compatibility with existing relational languages.
358. Complex Data Types
Motivation:
• Permit non-atomic domains (atomic ≡ indivisible)
• Example of non-atomic domain: set of integers, or set of tuples
• Allows more intuitive modeling for applications with complex data
Intuitive definition:
• Allow relations whenever we allow atomic (scalar) values — relations
within relations
• Retains mathematical foundation of relational model
• Violates first normal form.
359. Example of a Nested Relation
Example: library information system
Each book has
• Title,
• A list (array) of authors,
• Publisher, with subfields name and branch, and
• A set of keywords
Non-1NF relation books
360. Structured Types and Inheritance in SQL
Structured types (a.k.a. user-defined types) can be declared and used in
SQL
create type Name as
(firstname varchar(20),
lastname varchar(20))
final
create type Address as
(street varchar(20),
city varchar(20),
zipcode varchar(20))
not final
• Note: final and not final indicate whether subtypes can be created
Structured types can be used to create tables with composite attributes
create table person (
name Name,
address Address,
dateOfBirth date)
Dot notation used to reference components: name.firstname
361. Structured Types (cont.)
User-defined row types
create type CustomerType as (
name Name,
address Address,
dateOfBirth date)
not final
Can then create a table whose rows are a user-defined type
create table customer of CustomerType
Alternative using unnamed row types.
create table person_r(
name row(firstname varchar(20),
lastname varchar(20)),
address row(street varchar(20),
city varchar(20),
zipcode varchar(20)),
dateOfBirth date)
362. Methods
Can add a method declaration with a structured type.
method ageOnDate (onDate date)
returns interval year
Method body is given separately.
create instance method ageOnDate (onDate date)
returns interval year
for CustomerType
begin
return onDate - self.dateOfBirth;
end
We can now find the age of each customer:
select name.lastname, ageOnDate (current_date)
from customer
363. Object-Identity and Reference Types
Define a type Department with a field name and a field head which is a
reference to the type Person, with table people as scope:
create type Department (
name varchar (20),
head ref (Person) scope people)
We can then create a table departments as follows
create table departments of Department
We can omit the declaration scope people from the type declaration and
instead make an addition to the create table statement:
create table departments of Department
(head with options scope people)
Referenced table must have an attribute that stores the identifier, called
the self-referential attribute
create table people of Person
ref is person_id system generated;
365. Implementing O-R Features
Similar to how E-R features are mapped onto relation schemas
Subtable implementation
• Each table stores primary key and those attributes defined in that
table
or,
• Each table stores both locally defined and inherited attributes
366. Persistent Programming Languages
Languages extended with constructs to handle persistent data
Programmer can manipulate persistent data directly
• no need to fetch it into memory and store it back to disk (unlike
embedded SQL)
Persistent objects:
• Persistence by class - explicit declaration of persistence
• Persistence by creation - special syntax to create persistent objects
• Persistence by marking - make objects persistent after creation
• Persistence by reachability - object is persistent if it is declared
explicitly to be so or is reachable from a persistent object
367. Comparison of O-O and O-R Databases
Relational systems
• simple data types, powerful query languages, high protection.
Persistent-programming-language-based OODBs
• complex data types, integration with programming language, high
performance.
Object-relational systems
• complex data types, powerful query languages, high protection.
Object-relational mapping systems
• complex data types integrated with programming language, but built as
a layer on top of a relational database system
Note: Many real systems blur these boundaries
• E.g., persistent programming language built as a wrapper on a
relational database offers first two benefits, but may have poor
performance.
368. Outline
Structure of XML Data
XML Document Schema
Querying and Transformation
Application Program Interfaces to XML
Storage of XML Data
XML Applications
369. Introduction
XML: Extensible Markup Language
Defined by the WWW Consortium (W3C)
Derived from SGML (Standard Generalized Markup Language), but
simpler to use than SGML
Documents have tags giving extra information about sections of the
document
• E.g., <title> XML </title> <slide> Introduction …</slide>
Extensible, unlike HTML
• Users can add new tags, and separately specify how the tag should
be handled for display
370. XML Introduction (Cont.)
The ability to specify new tags, and to create nested tag structures make
XML a great way to exchange data, not just documents.
• Much of the use of XML has been in data exchange applications, not
as a replacement for HTML
Tags make data (relatively) self-documenting
• E.g.,
<university>
<department>
<dept_name> Comp. Sci. </dept_name>
<building> Taylor </building>
<budget> 100000 </budget>
</department>
<course>
<course_id> CS-101 </course_id>
<title> Intro. to Computer Science </title>
<dept_name> Comp. Sci </dept_name>
<credits> 4 </credits>
</course>
</university>
371. Comparison with Relational Data
Inefficient: tags, which in effect represent schema information, are
repeated
Better than relational tuples as a data-exchange format
• Unlike relational tuples, XML data is self-documenting due to presence
of tags
• Non-rigid format: tags can be added
• Allows nested structures
• Wide acceptance, not only in database systems, but also in browsers,
tools, and applications
372. Structure of XML Data
Tag: label for a section of data
Element: section of data beginning with <tagname> and ending with
matching </tagname>
Elements must be properly nested
• Proper nesting
<course> … <title> …. </title> </course>
• Improper nesting
<course> … <title> …. </course> </title>
• Formally: every start tag must have a unique matching end tag, that is
in the context of the same parent element.
Every document must have a single top-level element
376. Attributes vs. Subelements
Distinction between subelement and attribute
• In the context of documents, attributes are part of markup, while
subelement contents are part of the basic document contents
• In the context of data representation, the difference is unclear and may
be confusing
Same information can be represented in two ways
• <course course_id= “CS-101”> … </course>
• <course>
<course_id>CS-101</course_id> …
</course>
• Suggestion: use attributes for identifiers of elements, and use
subelements for contents
377. Namespaces
XML data has to be exchanged between organizations
Same tag name may have different meaning in different organizations,
causing confusion on exchanged documents
Specifying a unique string as an element name avoids confusion
Better solution: use unique-name:element-name
Avoid using long unique names all over document by using XML
Namespaces
<university xmlns:yale="http://www.yale.edu">
…
<yale:course>
<yale:course_id> CS-101 </yale:course_id>
<yale:title> Intro. to Computer Science</yale:title>
<yale:dept_name> Comp. Sci. </yale:dept_name>
<yale:credits> 4 </yale:credits>
</yale:course>
…
</university>
378. XML Document Schema
Database schemas constrain what information can be stored, and the data
types of stored values
XML documents are not required to have an associated schema
However, schemas are very important for XML data exchange
• Otherwise, a site cannot automatically interpret data received from
another site
Two mechanisms for specifying XML schema
• Document Type Definition (DTD)
Widely used
• XML Schema
Newer, increasing use
379. Document Type Definition (DTD)
The type of an XML document can be specified using a DTD
A DTD constrains the structure of XML data
• What elements can occur
• What attributes can/must an element have
• What subelements can/must occur inside each element, and how
many times.
DTD does not constrain data types
• All values represented as strings in XML
DTD syntax
• <!ELEMENT element (subelements-specification) >
• <!ATTLIST element (attributes) >
380. Element Specification in DTD
Subelements can be specified as
• names of elements, or
• #PCDATA (parsed character data), i.e., character strings
• EMPTY (no subelements) or ANY (anything can be a subelement)
Example
<!ELEMENT department (dept_name, building, budget)>
<!ELEMENT dept_name (#PCDATA)>
<!ELEMENT building (#PCDATA)>
<!ELEMENT budget (#PCDATA)>
Subelement specification may have regular expressions
<!ELEMENT university ( ( department | course | instructor | teaches )+)>
Notation:
• “|” - alternatives
• “+” - 1 or more occurrences
• “*” - 0 or more occurrences
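A hedged sketch of validating the department example against the DTD above, assuming the third-party lxml library is available (lxml is not part of the slides):

    from io import StringIO
    from lxml import etree   # third-party library, assumed available

    dtd = etree.DTD(StringIO("""
    <!ELEMENT department (dept_name, building, budget)>
    <!ELEMENT dept_name (#PCDATA)>
    <!ELEMENT building (#PCDATA)>
    <!ELEMENT budget (#PCDATA)>
    """))

    doc = etree.fromstring(
        "<department><dept_name>Comp. Sci.</dept_name>"
        "<building>Taylor</building><budget>100000</budget></department>")
    print(dtd.validate(doc))   # True: the declared subelements occur once each, in order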
382. Limitations of DTDs
No typing of text elements and attributes
• All values are strings, no integers, reals, etc.
Difficult to specify unordered sets of subelements
• Order is usually irrelevant in databases (unlike in the document-layout
environment from which XML evolved)
• (A | B)* allows specification of an unordered set, but
Cannot ensure that each of A and B occurs only once
IDs and IDREFs are untyped
• The instructors attribute of a course may contain a reference to
another course, which is meaningless
instructors attribute should ideally be constrained to refer to
instructor elements
383. XML Schema
XML Schema is a more sophisticated schema language which addresses
the drawbacks of DTDs. Supports
• Typing of values
E.g., integer, string, etc.
Also, constraints on min/max values
• User-defined, complex types
• Many more features, including
uniqueness and foreign key constraints, inheritance
XML Schema is itself specified in XML syntax, unlike DTDs
• More-standard representation, but verbose
XML Schema is integrated with namespaces
BUT: XML Schema is significantly more complicated than DTDs.
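A small illustration of value typing, again assuming lxml; the one-element XSD below is a hypothetical example, not taken from the slides:

    from lxml import etree   # third-party library, assumed available

    xsd = etree.XMLSchema(etree.fromstring("""
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="budget" type="xs:integer"/>
    </xs:schema>"""))

    print(xsd.validate(etree.fromstring("<budget>100000</budget>")))   # True
    print(xsd.validate(etree.fromstring("<budget>lots</budget>")))     # False: not an integer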
384. Querying and Transforming XML Data
Translation of information from one XML schema to another
Querying on XML data
Above two are closely related, and handled by the same tools
Standard XML querying/translation languages
• XPath
Simple language consisting of path expressions
• XSLT
Simple language designed for translation from XML to XML and
XML to HTML
• XQuery
An XML query language with a rich set of features
385. Tree Model of XML Data
Query and transformation languages are based on a tree model of XML
data
An XML document is modeled as a tree, with nodes corresponding to
elements and attributes
• Element nodes have child nodes, which can be attributes or
subelements
• Text in an element is modeled as a text node child of the element
• Children of a node are ordered according to their order in the XML
document
• Element and attribute nodes (except for the root node) have a single
parent, which is an element node
• The root node has a single child, which is the root element of the
document
386. XPath
XPath is used to address (select) parts of documents using
path expressions
A path expression is a sequence of steps separated by “/”
• Think of file names in a directory hierarchy
Result of path expression: set of values that along with their containing
elements/attributes match the specified path
E.g., /university-3/instructor/name evaluated on the university-3 data
we saw earlier returns
<name>Srinivasan</name>
<name>Brandt</name>
E.g., /university-3/instructor/name/text( )
returns the same names, but without the enclosing tags
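An illustrative sketch using Python's xml.etree.ElementTree, whose findall() accepts a limited subset of such path expressions (relative paths of element steps); the tiny document below is a hypothetical stand-in for university-3:

    import xml.etree.ElementTree as ET

    root = ET.fromstring("""<university-3>
      <instructor><name>Srinivasan</name></instructor>
      <instructor><name>Brandt</name></instructor>
    </university-3>""")

    for name in root.findall("./instructor/name"):            # steps separated by "/"
        print(ET.tostring(name, encoding="unicode").strip())  # <name>Srinivasan</name> ...
        print(name.text)                                      # like .../text(): just the name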
388. Functions in XPath
XPath provides several functions
• The function count() at the end of a path counts the number of
elements in the set generated by the path
E.g., /university-2/instructor[count(./teaches/course)> 2]
• Returns instructors teaching more than 2 courses (on
university-2 schema)
• Also function for testing position (1, 2, ..) of node w.r.t. siblings
Boolean connectives and and or and function not() can be used in
predicates
IDREFs can be referenced using function id()
• id() can also be applied to sets of references such as IDREFS and
even to strings containing multiple references separated by blanks
• E.g., /university-3/course/id(@dept_name)
returns all department elements referred to from the dept_name
attribute of course elements.
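A sketch of the count() predicate, assuming lxml (whose xpath() method supports full XPath 1.0); the document is a hypothetical stand-in for university-2:

    from lxml import etree   # third-party library, assumed available

    root = etree.fromstring("""<university-2>
      <instructor><name>Kim</name>
        <teaches><course/><course/><course/></teaches>
      </instructor>
      <instructor><name>Wu</name>
        <teaches><course/></teaches>
      </instructor>
    </university-2>""")

    # instructors teaching more than 2 courses
    for i in root.xpath("/university-2/instructor[count(./teaches/course) > 2]"):
        print(i.findtext("name"))   # Kim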
390. Storage of XML Data
XML data can be stored in
• Non-relational data stores
Flat files
• Natural for storing XML
• But has all problems discussed in Chapter 1 (no concurrency,
no recovery, …)
XML database
• Database built specifically for storing XML data, supporting
DOM model and declarative querying
• Currently no commercial-grade systems
• Relational databases
Data must be translated into relational form
Advantage: mature database systems
Disadvantages: overhead of translating data and queries
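A minimal sketch (not from the slides) of translating XML into relational form with Python's standard library, shredding <course> elements into a SQLite table:

    import sqlite3
    import xml.etree.ElementTree as ET

    xml_doc = """<university>
      <course><course_id>CS-101</course_id><title>Intro. to Computer Science</title>
              <dept_name>Comp. Sci.</dept_name><credits>4</credits></course>
    </university>"""

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE course(course_id TEXT, title TEXT, dept_name TEXT, credits INT)")

    # translate each <course> element into a relational tuple
    for c in ET.fromstring(xml_doc).findall("course"):
        db.execute("INSERT INTO course VALUES (?, ?, ?, ?)",
                   (c.findtext("course_id"), c.findtext("title"),
                    c.findtext("dept_name"), int(c.findtext("credits"))))

    print(db.execute("SELECT * FROM course").fetchall())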
391. XML Applications
Storing and exchanging data with complex structures
• E.g., Open Document Format (ODF) format standard for storing Open
Office and Office Open XML (OOXML) format standard for storing
Microsoft Office documents
• Numerous other standards for a variety of applications
ChemML, MathML
Standard for data exchange for Web services
• remote method invocation over HTTP protocol
• More in next slide
Data mediation
• Common data representation format to bridge different systems
392. Outline
Relevance Ranking Using Terms
Relevance Using Hyperlinks
Synonyms, Homonyms, and Ontologies
Indexing of Documents
Measuring Retrieval Effectiveness
Web Search Engines
Information Retrieval and Structured Data
Directories
393. Information Retrieval Systems
Information retrieval (IR) systems use a simpler data model than
database systems
• Information organized as a collection of documents
• Documents are unstructured, no schema
Information retrieval locates relevant documents, on the basis of user
input such as keywords or example documents
• e.g., find documents containing the words “database systems”
Can be used even on textual descriptions provided with non-textual data
such as images
Web search engines are the most familiar example of IR systems
395. Keyword Search
In full text retrieval, all the words in each document are considered to be
keywords.
• We use the word term to refer to the words in a document
Information-retrieval systems typically allow query expressions formed using
keywords and the logical connectives and, or, and not
• Ands are implicit, even if not explicitly specified
Ranking of documents on the basis of estimated relevance to a query is critical
• Relevance ranking is based on factors such as
Term frequency
– Frequency of occurrence of query keyword in document
Inverse document frequency
– How many documents the query keyword occurs in
» Fewer documents → more importance to the keyword
Hyperlinks to documents
– More links to a document → document is more important
396. Relevance Ranking Using Terms
TF-IDF (Term frequency/Inverse Document frequency) ranking:
• Let n(d) = number of terms in the document d
• n(d, t) = number of occurrences of term t in the document d
• n(t) = number of documents that contain term t
• Relevance of a document d to a term t:
TF(d, t) = log( 1 + n(d, t) / n(d) )
The log factor is to avoid excessive weight to frequent terms
• Relevance of document d to query Q:
r(d, Q) = Σ_{t ∈ Q} TF(d, t) / n(t)
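A small Python sketch of this ranking on a hypothetical two-document collection, following the TF and r definitions above:

    import math

    docs = {                                   # hypothetical toy collection
        "d1": "database systems store data in relations".split(),
        "d2": "information retrieval systems rank documents".split(),
    }

    def tf(d, t):                              # TF(d, t) = log(1 + n(d, t) / n(d))
        words = docs[d]
        return math.log(1 + words.count(t) / len(words))

    def n(t):                                  # number of documents containing term t
        return sum(t in words for words in docs.values())

    def r(d, query):                           # r(d, Q) = sum over t in Q of TF(d, t) / n(t)
        return sum(tf(d, t) / n(t) for t in query if n(t) > 0)

    print(sorted(docs, key=lambda d: r(d, ["database", "systems"]), reverse=True))  # ['d1', 'd2']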
405. Measuring Retrieval Effectiveness
Information-retrieval systems save space by using index structures that
support only approximate retrieval. May result in:
• false negative (false drop) - some relevant documents may not be
retrieved.
• false positive - some irrelevant documents may be retrieved.
• For many applications a good index should not permit any false
drops, but may permit a few false positives.
Relevant performance metrics:
• precision - what percentage of the retrieved documents are relevant
to the query.
• recall - what percentage of the documents relevant to the query
were retrieved.
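A tiny worked example (with hypothetical document sets) computing both metrics:

    retrieved = {"doc1", "doc2", "doc3", "doc4"}   # documents returned for the query
    relevant  = {"doc2", "doc4", "doc7"}           # documents actually relevant to the query

    hits = retrieved & relevant
    precision = len(hits) / len(retrieved)         # 2/4 = 0.5
    recall    = len(hits) / len(relevant)          # 2/3 ≈ 0.67
    print(precision, recall)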
410. Directories
Storing related documents together in a library facilitates browsing
• Users can see not only requested document but also related ones.
Browsing is facilitated by a classification system that organizes logically
related documents together.
Organization is hierarchical: classification hierarchy