Dbs Note of 2014
A database is a shared collection of logically related data (and a description of this data),
designed to meet the information needs of an organization (a centralized system). More simply, it is
a collection of related data.
Data: Known facts that can be recorded and have an implicit meaning.
Database Management System (DBMS): A software package/system that facilitates the
creation and maintenance of a computerized database.
Database System: The DBMS software together with the data itself. Sometimes, the
applications are also included. Its elements are the data in the database,
hardware (client and server computers), software (application programs and the
DBMS: DDL, DML, TML etc.) and users (end users, application programmers, database
administrators).
Database Planning
– Analyze the Company Situation
• What is the organization’s general operating environment, and what is its
mission within that environment?
• What is the organization’s structure?
– Identifies work to be done; the resources with which to do it; and the money to
pay for it all.
– Integrated with the overall IS strategy of the organization.
Data analysis and requirements
• Designer’s efforts are focused on
– Information needs,
– Information sources.
• Sources of information for the designer
– Developing and gathering end user data views
– Direct observation of the current system:
– existing system and desired output
Fundamentals of Database Systems Lecture Note
• The designer must identify the company’s business rules and analyze
their impacts.
Define Problems and Constraints
• How does the existing system function?
• What input does the system require?
• What reports does the system generate?
• How is the system output used? By Whom?
• What are the operational relationships among business units?
• What are the limits and constraints imposed on the system?
Database Design
Gather user and system requirements
Create a conceptual model of the database using the Entity Relationship model that is
based on the user requirements
Convert this conceptual model (E-R) into a logical database model (data model mapping) - we will use the
Relational model.
Normalize the Relational model of the database
Implement the normalized relations as tables in a relational database – this is the
Physical Database design and implementation.
Choose a DBMS - in our case, we will use MS SQL Server, a relational DBMS.
Design: most of the emphasis in database development is placed on this phase.
The phase is further divided into three sub-phases:-
a - Conceptual Design: a concise description of:-
– Entities and
– Relationships between entities.
b - Logical Design: a higher-level abstraction of the conceptual design, expressed in a specific
selected data model used to implement the data structure.
– Attributes
– Primary key
– Foreign key
– Mapping E-R diagram into relational schema.
c - Physical Design: involves the actual design of a database according to the
requirements that were established during logical modelling.
– Table names
– Column names
– Column data types
DBMS Selection
• The selection of an appropriate DBMS to support the database application.
• Also design the user interface and the application programs using the selected DBMS.
Prototyping : Building a working model of a database application.
• Purpose
– To identify features of a system that work well, or are inadequate
– To suggest improvements or even new features
– To clarify the users’ requirements
– To evaluate the feasibility of a particular system design.
Implementation
• The physical realization of the database and application designs.
– Use the DDL of the DBMS to create database schemas and empty database files.
– Use the DDL to create any specified user views.
Testing
• The process of executing the application programs with the intent of finding errors.
– Use carefully planned test strategies and realistic data.
– Testing cannot show the absence of faults; it can show only that software faults
are present.
– Demonstrates that database and application programs appear to be working
according to requirements.
Operational Maintenance
– The process of monitoring and maintaining the system following installation.
– Monitoring the performance of the system.
– If performance falls, the database may require reorganization.
– Maintaining and upgrading the database application (when required).
– Incorporating new requirements into the database application.
Fig: Stages of the database development lifecycle: Database Planning, Systems Definition,
Requirements Collection and Analysis, Database Design, DBMS Selection, Application Design,
Prototyping, Implementation, Data Conversion and Loading, Testing, Evaluation & Maintenance.
File-Based Approach
• File-based systems have a number of programs for each of the different applications in the
organization.
• Since every application defines and manages its own data, the system is subject to a
serious data duplication problem.
• A file, in the traditional file based approach, is a collection of records which contain logically
related data.
Limitations of the Traditional File Based approach
• Limited data sharing
• Lengthy development and maintenance time
• Duplication or redundancy of data
• Data dependency on the application
• Incompatible file formats between different applications and programs, creating
inconsistency.
• The most significant problem experienced by the traditional file based approach of data
handling is the “anomalies”. There are three types of anomalies:
1. Modification Anomalies: a problem experienced when one or more data values are
modified in one application program but not in others containing the same data set.
2. Deletion Anomalies: a problem encountered when a record set is deleted from
one application but remains untouched in other application programs.
3. Insertion Anomalies: a problem experienced whenever there is a new data item to be
recorded, and the recording is not made in all the applications.
• And when the same data item is inserted in different applications, there could be encoding
errors which cause the new data item to be treated as a totally different object.
Redundancy can be reduced: isolated data is integrated in database to decrease the redundant
data stored at different applications.
Inconsistency can be avoided: controlled data redundancy will avoid inconsistency of the data
in the database to some extent.
Transaction support can be provided: the basic demands of any transaction support system are
implemented in a full scale DBMS.
Integrity can be maintained: data at different applications will be integrated together with
additional constraints to facilitate a shared data resource.
Security measures can be enforced: the shared data can be secured by having different levels
of clearance and other data security mechanisms.
Centralized information control: the data can be controlled and managed at the central level.
Some Common Uses of Databases
In a university, there is a database
– containing information about yourself, the course you are enrolled in, the
dormitory you have been given, etc.
– containing details of staff who work at the university (personnel, payroll, etc.).
When you visit your library
– There may be a database containing details of the books in the library and details
of the users,
– The database system handles activities such as
• Allowing a user to reserve a book
• Charging materials to users
• Notifying when materials are overdue: sends out reminders to
borrowers who have failed to return books by the due date
– The system will have a bar code reader to keep track of books and users.
Chapter 2: Data models & Database System Architecture
The database state changes every time the database is updated.
The major purpose of a database system is to provide users with an abstract view of the system,
which is sometimes referred to as the architecture of the system.
The system hides certain details of how data is stored and maintained.
Complexity should be hidden from database users.
There are several levels of abstraction of the database architecture:
(a) Physical Level:
– How the data are stored.
– E.g. index, B-tree, hashing.
– Lowest level of abstraction.
– Complex low-level structures described in detail.
(b) Conceptual Level:
– Next highest level of abstraction.
– Describes what data are stored.
– Describes the relationships among data.
– Database administrator level.
(c) View Level:
– Highest level.
– Describes part of the database for a particular group of users.
– Can be many different views of a database.
– E.g. tellers in a bank get a view of customer accounts, but not of payroll data.
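The three levels can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely for illustration (the notes do not prescribe a DBMS at this point), and the table, index and view names (account, payroll, idx_account_customer, teller_view) as well as the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    -- Conceptual level: what data are stored and how they relate.
    CREATE TABLE account (acc_no INTEGER PRIMARY KEY, customer TEXT, balance REAL);
    CREATE TABLE payroll (emp_id INTEGER PRIMARY KEY, salary REAL);

    -- Physical level: how the data are stored (here, an index on customer).
    CREATE INDEX idx_account_customer ON account (customer);

    -- View level: the part of the database a bank teller is allowed to see.
    CREATE VIEW teller_view AS SELECT acc_no, customer, balance FROM account;

    INSERT INTO account VALUES (1, 'Abebe', 250.0);
    INSERT INTO payroll VALUES (10, 9000.0);
""")

# The teller queries the view and never touches the payroll table.
teller_rows = conn.execute("SELECT * FROM teller_view").fetchall()
print(teller_rows)
```

Dropping or adding the index changes only the physical level; the view and the queries against it stay the same.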
Fig: The three levels of data abstraction of a database system architecture
Data Manipulation Language (DML): used to specify database retrievals and updates.
DML commands (data sublanguage) can be embedded in a general-purpose
programming language (host language), such as COBOL, C, C++, or Java.
A library of functions can also be provided to access the DBMS from a
programming language.
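As a minimal sketch of the "library of functions" approach, the example below uses Python as the host language and its standard sqlite3 module as the function library. The student table and its rows are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stuid TEXT PRIMARY KEY, lname TEXT, credits INTEGER)")

# DML update: the '?' placeholders are filled in from host-language values.
new_students = [("S1001", "Smith", 90), ("S1002", "Chin", 36)]
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", new_students)

# DML retrieval: the result comes back as ordinary Python tuples.
min_credits = 50
rows = conn.execute(
    "SELECT stuid, lname FROM student WHERE credits > ?", (min_credits,)
).fetchall()
print(rows)
```

The SQL text is passed to the DBMS as a string, while the data values travel separately as parameters of the library call.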
Different licensing options: site license, maximum number of concurrent users (seat
license), single user, etc.
Network Data Model
• Allows record types to have more than one parent, unlike the hierarchical model
• It does not directly allow many-to-many relationships between entities
• Like the hierarchical model, the network model is a collection of physically linked records.
A Relational model
CHAPTER THREE: Database Modelling using ERD (Conceptual modelling)
• In developing a good database design/model, one should answer such questions as:
• What are the relevant Entities for the Organization?
• What are the important features of each Entity?
• What are the important Relationships?
• What are the important queries from the user?
• What are the other requirements of the Organization and the Users?
The Three levels of Database Design
• Connected entities are called relationship
participants.
Attributes are represented by OVALS
and are connected to the entity by a line.
• Example 1: Build an ER Diagram for the following information:
– A student record management system will have the following two basic data
object categories with their own features or properties:
– Students will have an Id, Name, Dept, Age, GPA and Courses will have an Id,
Name, Credit Hours.
– Whenever a student enrolls in a course in a specific Academic Year and Semester,
the Student will have a grade for the course.
Example 2
• Build an ER Diagram for the following information:
– Patients
• Name, Address, Phone #, Age
– Drugs
• Name, Manufacturer , Expiration Date
– Patients are prescribed drugs
• Dosage, # Days
Example 2:- A customer is associated with at most one loan via the relationship borrower.
– A loan is associated with at most one customer via borrower.
One-To-Many Relationships
• In the one-to-many relationship:-
– a loan is associated with at most one customer via borrower.
– a customer is associated with several (including 0) loans via borrower.
• E.g.: Relationship Leads between staff and project.
• The multiplicity of the relationship:-
– One staff may Lead zero or more project(s).
– One project is led by one staff.
Many-To-Many Relationship
• A customer is associated with several (possibly 0) loans via borrower.
• A loan is associated with several (possibly 0) customers via borrower.
Participation constraint of a relationship
• Specifies whether the existence of an entity depends on its being related to another
entity via the relationship type.
• There are two distinct participation constraints in this respect, namely:-
– Total Participation and
– Partial Participation
• Total participation constraints require the participation of every entity in the relationship.
It is displayed by a double line.
• Partial participation constraints do not require every entity to participate (displayed by a single line).
• Participation of EMPLOYEE in the “Manages” relationship with DEPARTMENT is partial
participation since not all employees are managers.
• Participation of DEPARTMENT in the “Manages” relationship with EMPLOYEE is total since
every department should have a manager.
CHAPTER FOUR
MODELING WITH THE RELATIONAL DATAMODEL & NORMALIZATION
Logical Database Design
Is the process of constructing a model of the information used in an enterprise based on
a specific data model (e.g. relational, hierarchical, network or object),
but independent of a particular DBMS and other physical considerations.
Normalization process:
– Collection of Rules to be maintained
– Discover new entities in the process
– Revise attributes based on the rules and the discovered Entities.
Converting ER Diagram to Relational Tables
Three basic rules to convert ER into tables or relations:-
Rule 1: Entity Names will automatically be table names.
Rule 2: Mapping of attributes: attributes will be columns of the respective tables.
– Atomic or single-valued or derived or stored attributes will be columns.
– Composite attributes: the parent attribute will be ignored and the decomposed attributes
(child attributes) will be columns of the table.
– Multi-valued attributes: will be mapped to a new table where the primary key of the main
table will be posted for cross referencing.
Rule 3: Relationships: relationship will be mapped by using a foreign key attribute.
Foreign key is a primary or candidate key of one relation used to create association
between tables.
– For a relationship with One-to-One Cardinality: post the primary or candidate key of one of
the tables into the other as a foreign key.
In cases where one entity has partial participation in the relationship, it is
recommended to post the candidate key of the partial participant to the total
participant, so as to avoid wasting storage on null values in the foreign key
attribute.
E.g.: for a relationship between Employee and Department where an employee manages a
department, the cardinality is one-to-one.
Here the PK of the Employee can be posted to the Department or
the PK of the Department can be posted to the Employee.
But the Employee has partial participation in the relationship "Manages", as not all
employees are managers of departments.
So it is recommended to post the primary key of the Employee to the Department table as
a foreign key.
– For a relationship with One-to-Many Cardinality: post the primary key or candidate key from
the “one” side as a foreign key attribute to the “many” side.
E.g.: For a relationship called “Belongs To” between
Employee (Many) and Department (One), the primary or candidate key of the one side,
which is Department, should be posted to the many side, which is the Employee table.
– For a relationship with Many-to-Many Cardinality: create a new table (which is the
associative entity) and
post the primary key or candidate key from the participant entities as foreign key attributes
in the new table along with some additional attributes (if applicable).
The same approach should be used for relationships with degree greater than binary.
– For a relationship having Associative Entity property: in cases where the relationship
has its own attributes (associative entity), create a new table for the associative entity
and
post the primary key or candidate key from the participating entities as foreign key attributes in
the new table.
Example: to illustrate the major rules in mapping ER to relational schema, consider the following
Employee, Department and Project information.
Employee: Eid, Name, Salary, Tel.
Department: Did, Dname, Dloc.
Project: Pid, Pname, PFund.
– An employee works for a department.
– An employee might be assigned to manage a department.
– An employee might participate in different projects within
the organization.
– An employee might as well be assigned to lead a project, where the starting and ending dates
of his/her project leadership and the bonus will be registered.
After we have drawn the ER diagram, the next step is to map the ER into a relational
schema so that the rules of the relational data model can be tested for each relational
schema.
The mapping can be done for the entities followed by the relationships, based on the rules
of mapping.
The mapping has been done as follows.
Mapping EMPLOYEE Entity:
There will be an Employee table with EID, Salary, FName and LName being the columns.
The composite attribute Name will be ignored as its decomposed attributes (FName and
LName) are columns in the Employee Table.
The Tel attribute will be a new table as it is multi-valued.
Mapping the MANAGES Relationship:
As the relationship has one-to-one cardinality, the PK or CK of one of the tables can
be posted into the other.
But based on the recommendation, the PK or CK of the partial participant (Employee)
should be posted to the total participant (Department).
This will require adding the PK of Employee (EID) to the Department table as a foreign
key.
We can give the foreign key another name, MEID, to mean "manager's employee
id".
This will increase the degree of the Department table.
Mapping the PARTICIPATES Relationship:
As the relationship has many-to-many cardinality, we need to create a new table
and post the PK or CK of the Employee and Project tables into the new table.
We can give a descriptive new name for the new table like
Emp_Partc_Project to mean "Employee participates in a project".
Mapping the LEADS Relationship:
As the relationship is an associative entity, we are supposed to create a table for the
associative entity where the PKs of the Employee and Project tables will be posted in the
new table as foreign keys.
The new table will have the attributes of the associative entity as columns.
We can give a descriptive new name for the new table like Emp_Lead_Project to mean
"Employee leads a project".
• At the end of the mapping, we will have the following relational schema (tables) for the
logical design phase.
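One possible rendering of the mapped schema is sketched below, using Python's sqlite3 module rather than the DBMS chosen in these notes. The column types, the Emp_Tel table name, the DID column for "works for", the StartDate, EndDate and Bonus column names and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Entity tables; the composite Name is replaced by FName and LName.
    CREATE TABLE Employee (
        EID    INTEGER PRIMARY KEY,
        FName  TEXT,
        LName  TEXT,
        Salary REAL,
        DID    INTEGER REFERENCES Department (DID)  -- works for: key of the "one" side
    );
    CREATE TABLE Department (
        DID   INTEGER PRIMARY KEY,
        DName TEXT,
        DLoc  TEXT,
        MEID  INTEGER NOT NULL UNIQUE REFERENCES Employee (EID)  -- MANAGES (one-to-one)
    );
    CREATE TABLE Project (PID INTEGER PRIMARY KEY, PName TEXT, PFund REAL);

    -- Multi-valued attribute Tel becomes its own table.
    CREATE TABLE Emp_Tel (
        EID INTEGER REFERENCES Employee (EID),
        Tel TEXT,
        PRIMARY KEY (EID, Tel)
    );
    -- Many-to-many PARTICIPATES relationship.
    CREATE TABLE Emp_Partc_Project (
        EID INTEGER REFERENCES Employee (EID),
        PID INTEGER REFERENCES Project (PID),
        PRIMARY KEY (EID, PID)
    );
    -- Associative entity LEADS with its own attributes.
    CREATE TABLE Emp_Lead_Project (
        EID       INTEGER REFERENCES Employee (EID),
        PID       INTEGER REFERENCES Project (PID),
        StartDate TEXT,
        EndDate   TEXT,
        Bonus     REAL,
        PRIMARY KEY (EID, PID)
    );
""")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

conn.execute("INSERT INTO Employee (EID, FName, LName, Salary) "
             "VALUES (1, 'Abebe', 'Kebede', 5000)")
conn.execute("INSERT INTO Department VALUES (10, 'Finance', 'Block A', 1)")
try:
    # A second department managed by the same employee breaks the one-to-one rule.
    conn.execute("INSERT INTO Department VALUES (11, 'Sales', 'Block B', 1)")
    one_to_one_enforced = False
except sqlite3.IntegrityError:
    one_to_one_enforced = True
print(tables, one_to_one_enforced)
```

Making MEID both NOT NULL and UNIQUE is one way to express the total participation of Department and the one-to-one cardinality of MANAGES.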
Normalization
Database normalization is a series of steps followed to obtain a database design that
allows for consistent data storage and efficient access of data in a relational database.
These steps reduce:-
– data redundancy and
– the risk of data becoming inconsistent.
NORMALIZATION is the process of identifying the logical associations between data
items and designing a database that will represent such associations but without
suffering the update anomalies, which are:
1. Insertion Anomalies
2. Deletion Anomalies
3. Modification Anomalies
Normalization may reduce system performance since data will be cross referenced from
many tables.
Thus denormalization is sometimes used to improve performance, at the cost of
reduced consistency guarantees.
All the normalization rules will eventually remove the update anomalies that may exist
during data manipulation after the implementation.
The types of problems that can occur in an insufficiently normalized table are called update
anomalies, which include:
Insertion anomalies
An "insertion anomaly" is a failure to place information about a new database entry into
all the places in the database.
In a properly normalized database, information about a new entry needs to be inserted
into only one place in the database;
Deletion anomalies
A "deletion anomaly" is a failure to remove information about an existing database
entry when it is time to remove that entry.
In a properly normalized database, information about an entry that is to be removed needs to
be deleted from only one place in the database;
Modification anomalies
• A modification of a database involves changing some value
of the attribute of a table.
• In a properly normalized database table, whatever
information is modified by the user needs to be changed in only one place, so the
change takes effect consistently wherever the value is used.
• Thus the purpose of normalization is to reduce the chances
for anomalies to occur in a database.
• Example of problems related with anomalies.
Deletion Anomalies:
If the employee with ID 16 is deleted, then the information about the skill C++ and the type of skill is
deleted from the database.
Then we will not have any information about C++ and its skill type.
Insertion Anomalies:
What if we have a new employee with a skill called Pascal? We cannot decide whether
Pascal is allowed as a value for skill, and
we have no clue about the type of skill that Pascal should be categorized as.
Modification Anomalies:
What if the address for Helico is changed from Piazza to Mexico?
We need to look for every occurrence of Helico and change the value of School_Add
from Piazza to Mexico, which is prone to error.
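The deletion and modification anomalies can be reproduced with a small sketch. The table name emp_skill, its column layout, the employee IDs 12 and 28, the skill types and the second school are hypothetical; only employee 16, the C++ skill and the Helico addresses come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_skill (EmpID INTEGER, Skill TEXT, SkillType TEXT, "
    "School TEXT, School_Add TEXT)"
)
conn.executemany(
    "INSERT INTO emp_skill VALUES (?, ?, ?, ?, ?)",
    [
        (12, "SQL", "Database", "Helico", "Piazza"),
        (16, "C++", "Programming", "OtherSchool", "OtherAddress"),
        (28, "SQL", "Database", "Helico", "Piazza"),
    ],
)

# Deletion anomaly: removing employee 16 also removes the only record of C++.
conn.execute("DELETE FROM emp_skill WHERE EmpID = 16")
cpp_rows = conn.execute(
    "SELECT COUNT(*) FROM emp_skill WHERE Skill = 'C++'"
).fetchone()[0]

# Modification anomaly: updating only one of the Helico rows leaves two addresses.
conn.execute("UPDATE emp_skill SET School_Add = 'Mexico' WHERE EmpID = 12")
helico_addresses = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT School_Add FROM emp_skill WHERE School = 'Helico'"
    )
)
print(cpp_rows, helico_addresses)
```

After the two statements the table no longer knows about C++ at all, and Helico appears with both the old and the new address.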
Functional Dependency (FD)
Two data items A and B are said to be in a dependent relationship if each value of data
item A always appears with the same value of data item B.
The notation is: A → B, which is read as: B is functionally dependent on A.
A → B holds if, whenever two tuples have the same value for A, they must also have the same value for
B.
Since the type of Wine served depends on the type of Dinner, we say Wine is
functionally dependent on Dinner.
Dinner → Wine
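Whether a functional dependency holds in a given set of rows can be checked mechanically. The following is a minimal sketch; the helper name fd_holds and the sample dinner and wine values are made up for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same lhs value must always come with the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


meals = [
    {"Dinner": "Meat", "Wine": "Red"},
    {"Dinner": "Fish", "Wine": "White"},
    {"Dinner": "Cheese", "Wine": "Rose"},
    {"Dinner": "Chicken", "Wine": "White"},
    {"Dinner": "Meat", "Wine": "Red"},
]

dinner_to_wine = fd_holds(meals, ["Dinner"], ["Wine"])
wine_to_dinner = fd_holds(meals, ["Wine"], ["Dinner"])
print(dinner_to_wine, wine_to_dinner)
```

In this sample Dinner → Wine holds, but Wine → Dinner does not, because White wine is served with both Fish and Chicken. A check like this can only refute a dependency on sample data; whether it holds in general is a business rule.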
Partial Dependency
If an attribute which is not a member of the primary key is dependent on some part of
the primary key (if we have composite primary key) then that attribute is partially
functionally dependent on the primary key.
Example:
Let {A,B} be the Primary Key and C a non-key attribute.
Then if {A,B} → C and B → C
Then C is partially functionally dependent on {A,B}
Full Dependency
If an attribute which is not a member of the primary key is not dependent on some part
of the primary key but on the whole key (if we have a composite primary key), then that
attribute is fully functionally dependent on the primary key.
Let {A,B} be the Primary Key and C a non-key attribute.
Then if {A,B} → C holds, and neither B → C nor A → C holds,
Then C is fully functionally dependent on {A,B}.
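The difference between partial and full dependency can be sketched as a check over sample rows. The helper names fd_holds and dependency_kind, the extra attribute D and the data values are hypothetical.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def dependency_kind(rows, key, attr):
    """Classify how the non-key attribute attr depends on the composite key."""
    if not fd_holds(rows, key, [attr]):
        return "none"
    # Partial if some proper, non-empty part of the key already determines attr.
    for size in range(1, len(key)):
        for part in combinations(key, size):
            if fd_holds(rows, part, [attr]):
                return "partial"
    return "full"


rows = [
    {"A": 1, "B": "x", "C": "c1", "D": 10},
    {"A": 1, "B": "y", "C": "c2", "D": 20},
    {"A": 2, "B": "x", "C": "c1", "D": 30},
    {"A": 2, "B": "y", "C": "c2", "D": 40},
]

kind_c = dependency_kind(rows, ["A", "B"], "C")  # C is determined by B alone
kind_d = dependency_kind(rows, ["A", "B"], "D")  # D needs both A and B
print(kind_c, kind_d)
```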
Transitive Dependency
In mathematics and logic, a transitive relationship is a relationship of the following form:
"If A implies B, and if B implies C, then A implies C."
Example:
If Mr X is a Human, and if every Human is an Animal, then Mr X must be an Animal.
A generalized way of describing transitive dependency is that:
If A functionally governs B, AND
If B functionally governs C,
THEN A functionally governs C (C is transitively dependent on A via B).
Steps in Normalization
We have various levels or steps in normalization called Normal Forms.
Normalization towards a logical design consists of the following steps:
Unnormalized Form:
– Identify all data elements
First Normal Form:
– Find the key with which you can find all data.
Second Normal Form:
– Remove part-key dependencies. Make all data dependent on the whole key.
Third Normal Form
– Remove non-key dependencies. Make all data dependent on nothing but the key.
First Normal Form (1NF)
Requires that all column values in a table are atomic.
(e.g., a number is an atomic value, while a list or a set is not).
We have two ways of achieving this:-
1. Putting each repeating group into a separate table and connecting them with a primary key-
foreign key relationship.
2. Moving these repeating groups to new rows by repeating the
common attributes.
Then find the key with which you can find all data.
Definition
A table (relation) is in 1NF if:-
There are no duplicated rows in the table (i.e., it has a unique identifier).
Each cell is single-valued (i.e., there are no repeating groups).
Entries in a column (attribute, field) are of the same kind.
Example for First Normal form (1NF )
FIRST NORMAL FORM (1NF)
Remove all repeating groups.
Distribute the multi-valued attributes into different
rows and
identify a unique identifier for the relation so that it can be said to be a relation in a relational
database.
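The second way of reaching 1NF (one row per value of the repeating group) can be sketched as follows. The function name to_1nf and the employee records are hypothetical.

```python
unnormalized = [
    {"EmpID": 12, "Name": "Abebe", "Skills": ["SQL", "C++"]},
    {"EmpID": 16, "Name": "Lemma", "Skills": ["C++"]},
]


def to_1nf(records, repeating, single):
    """Give each value of the repeating group its own row, repeating the other attributes."""
    rows = []
    for record in records:
        for value in record[repeating]:
            row = {k: v for k, v in record.items() if k != repeating}
            row[single] = value
            rows.append(row)
    return rows


flat = to_1nf(unnormalized, "Skills", "Skill")
# After flattening, EmpID alone no longer identifies a row; (EmpID, Skill) does.
keys = {(row["EmpID"], row["Skill"]) for row in flat}
for row in flat:
    print(row)
```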
Second Normal form 2NF
No partial dependency of a non-key attribute on part of the primary key.
Removing such dependencies results in a set of relations in Second Normal Form.
Any table that is in 1NF and has a single-attribute (i.e., a non-composite) key is
automatically also in 2NF.
Definition: a table (relation) is in 2NF if:-
It is in 1NF and
all non-key attributes are dependent on the entire primary key, i.e. there is no partial
dependency.
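A decomposition into 2NF can be sketched with plain projections. The relation below, with the composite key {EID, PID} and the columns EName, PName and Hours, is a hypothetical illustration and not the example table of these notes.

```python
# Columns: EID, PID, EName, PName, Hours; the primary key is {EID, PID}.
# EName depends only on EID and PName only on PID (partial dependencies).
emp_proj = [
    (1, "P1", "Abebe", "Payroll", 10),
    (1, "P2", "Abebe", "Website", 5),
    (2, "P1", "Lemma", "Payroll", 8),
]


def project(rows, *cols):
    """Project rows onto the given column positions, dropping duplicate rows."""
    return sorted({tuple(row[c] for c in cols) for row in rows})


employee = project(emp_proj, 0, 2)       # EID -> EName
project_table = project(emp_proj, 1, 3)  # PID -> PName
works_on = project(emp_proj, 0, 1, 4)    # {EID, PID} -> Hours

print(employee)
print(project_table)
print(works_on)
```

Each employee name and project name is now stored once, and only Hours, which depends on the whole key, stays with the composite key.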
Example for 2NF:
This schema is in 2NF since the primary key is a single attribute.
Let’s take StudID, Year and Dormitory and see the dependencies.
StudID → Year AND Year → Dormitory
And Year cannot determine StudID and
Dormitory cannot determine StudID. Then transitively StudID → Dormitory.
To convert it to 3NF we need to remove all transitive dependencies of non-key
attributes on another non-key attribute.
The non-primary key attributes that depend on each other will be moved to another
table and linked with the main table using a Candidate Key - Foreign Key relationship.
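The decomposition described above might look as follows, sketched with Python's sqlite3 module. The table names student and year_dorm, the column types and the sample rows are assumptions; only the attributes StudID, Year and Dormitory come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Year -> Dormitory is moved to its own table, keyed by Year.
    CREATE TABLE year_dorm (Year INTEGER PRIMARY KEY, Dormitory TEXT);
    -- The main table keeps Year as a foreign key.
    CREATE TABLE student (
        StudID TEXT PRIMARY KEY,
        Year   INTEGER REFERENCES year_dorm (Year)
    );
    INSERT INTO year_dorm VALUES (1, '401'), (2, '402');
    INSERT INTO student VALUES ('S1', 1), ('S2', 1), ('S3', 2);
""")

# Changing the dormitory of a year is now a single-row update.
conn.execute("UPDATE year_dorm SET Dormitory = '405' WHERE Year = 1")

rows = conn.execute("""
    SELECT s.StudID, s.Year, d.Dormitory
    FROM student AS s JOIN year_dorm AS d ON d.Year = s.Year
    ORDER BY s.StudID
""").fetchall()
print(rows)
```

The join reconstructs the original StudID, Year, Dormitory rows, so no information is lost by the split.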
Generally, even though there are four additional levels of normalization, a table is
said to be normalized if it reaches 3NF.
A database with all tables in 3NF is said to be a Normalized Database.
A mnemonic for remembering the rationale for normalization up to 3NF could be the
following:-
– No Repeating or Redundancy: no repeating fields in the table.
– The Fields Depend Upon the Key: the table should solely depend
on the key.
– The Whole Key: no partial key dependency.
– And Nothing But The Key: no inter data dependency.
5. CHAPTER EIGHT: Structured Query Language (SQL)
Many database management systems support some version of structured query language (SQL).
In some DBMSs (e.g., ORACLE) SQL is the primary data manipulation interface. Consequently,
SQL is a very important topic. The purpose of this document is to introduce you to the major
SQL statements and to show you how they work. This document will concentrate primarily on
ORACLE SQL; however, some attention also will be given to other versions of SQL. This is not
a complete reference of SQL. If you are interested in a more detailed coverage I can suggest
several good textbooks for outside reading.
- Data Definition Language (DDL) - used to define the schema (structure) of the
database
- CREATE TABLE
- ALTER TABLE
- CREATE INDEX
- DROP TABLE
- DROP INDEX
- Data Control Language (DCL) - used to manipulate the processing of data and perform
other misc. functions
- COMMIT
- ROLLBACK
TABLES
Format:
The 'constraint' clause in the CREATE TABLE statement is used to enforce referential integrity.
Specifically, PRIMARY KEY, FOREIGN KEY, and CHECK integrity can be set when you
define the table. The syntax for key and check constraints is shown below.
CHECK (condition)
ALTER TABLE - Add a column to the "right" of an existing table, modify an existing attribute,
or drop an existing column or constraint.
Format:
LIMITATIONS/ENHANCEMENTS:
When you add a column, all existing tuples get the extra column filled with NULL values. You
have to go in and update the column to enter valid data later to get rid of the NULLs. You can
only add or drop a single column at a time in the ALTER statement.
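The effect of adding a column can be seen in a short sketch. It runs on SQLite through Python's sqlite3 module, whose ALTER TABLE syntax differs in details from Oracle's; the EMAIL column and the address are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (STUID TEXT PRIMARY KEY, LNAME TEXT);
    INSERT INTO student VALUES ('S1002', 'Chin');
    INSERT INTO student VALUES ('S1015', 'Jones');
    -- The new column is appended to the right of the existing ones.
    ALTER TABLE student ADD COLUMN EMAIL TEXT;
""")
before = conn.execute("SELECT STUID, EMAIL FROM student ORDER BY STUID").fetchall()

# Existing rows hold NULL (None in Python) until they are updated.
conn.execute("UPDATE student SET EMAIL = 'chin@example.com' WHERE STUID = 'S1002'")
after = conn.execute("SELECT STUID, EMAIL FROM student ORDER BY STUID").fetchall()
print(before)
print(after)
```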
Indexes are used to improve system performance by providing a more efficient means of
accessing selected attributes.
Format:
Example 2: Create a unique index on SSN of EMPLOYEES and make it sort in reverse order
CREATE UNIQUE INDEX empindex ON EMPLOYEES (SSN DESC);
Example 3: Create a composite index on COURSENUM and STUID from ENROLL table
CREATE INDEX enroll_idx ON ENROLL (COURSENUM, STUID);
DROP TABLE - Remove a table (and all data) or an index on a table from database. If the table
has any foreign key constraints then the CASCADE CONSTRAINTS clause is required to drop
the table.
Format:
Example 3: Remove the emp_name index on the employee table
DROP INDEX emp_name;
When you drop a table you also delete all data currently in that table. Be careful!
The DML component of SQL is the part that is used to query and update the tables (once they
are built via DDL commands or other means).
By far, the most commonly used DML statement is the SELECT. It combines a range of
functionality into one complex command.
SELECT
Used primarily to retrieve data from the database. Also used to create copies of tables, create
views, and to specify rows for updating.
General Format: Generic overview applicable to most commercial SQL implementations - lots of
potential combinations. There are several variations available in Oracle.
Only the SELECT and the FROM clauses are required. The others are optional.
FROM - A required clause that lists the tables that the select works on. You can define
"alias" names with this clause to speed up query input and to allow recursive "self-joins".
WHERE - An optional clause that selects rows that meet the stated condition. A "sub-
select" can appear as the expression of a where clause. This is called a "nested select".
GROUP BY - An optional clause that groups rows according to the values in one or
more columns. Some implementations also return the groups sorted in ascending order, but the
order is only guaranteed when ORDER BY is specified. The
duplicate rows are not eliminated, rather they are consolidated into one row. This is
similar to a control break in traditional programming.
HAVING - An optional clause that is used with GROUP BY. It selects from the rows
that result from applying the GROUP BY clause. This works the same as the WHERE
clause, except that it only applies to the output of GROUP BY.
ORDER BY - An optional clause that sorts the final result of the SELECT into either
ascending or descending order on one or more named columns.
There can be complex interaction between the WHERE, GROUP BY, and HAVING clauses.
When all three are present the WHERE is done first, the GROUP BY is done second, and the
HAVING is done last.
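The order of evaluation can be traced on a small example. It runs on SQLite via Python's sqlite3 module; the student IDs and credit values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (STUID TEXT, MAJOR TEXT, CREDITS INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        ("S1", "Math", 40),
        ("S2", "Math", 10),
        ("S3", "Math", 50),
        ("S4", "Art", 60),
        ("S5", "History", 35),
        ("S6", "History", 45),
    ],
)

rows = conn.execute("""
    SELECT MAJOR, COUNT(*) AS N
    FROM student
    WHERE CREDITS > 30        -- 1. drop rows (S2 is removed here)
    GROUP BY MAJOR            -- 2. one group per remaining major
    HAVING COUNT(*) >= 2      -- 3. drop groups (Art has only one row)
    ORDER BY MAJOR            -- sort the final result
""").fetchall()
print(rows)
```

Math is reported with a count of 2, not 3, because the WHERE clause removed S2 before the groups were formed.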
Example 2: Show what salary would be if each employee received a 10% raise.
SELECT LNAME, SALARY AS CURRENT, SALARY * 1.1 AS PROPOSED
FROM EMPLOYEES;
Example 1: Retrieve all information about students ('*' means all attributes)
SELECT *
FROM STUDENT;
Example 2: Find the last name, ID, and credits of all students
SELECT LNAME, STUID, CREDITS
FROM STUDENT;
Example 3: Find all information about students who are math majors
SELECT *
FROM STUDENT
WHERE MAJOR = 'Math';
SELECT STUID
FROM STUDENT
WHERE MAJOR = 'History';
STUID
S1001
MAJOR
Art
CIS
History
Math
The WHERE clause can be enhanced to be more selective. Operators that can appear in
WHERE conditions include:
=, <> ,< ,> ,>= ,<=
IN
BETWEEN...AND...
LIKE
IS NULL
AND, OR, NOT
Example 1: Find the student ID of all math majors with more than 30 credit hours.
SELECT STUID
FROM STUDENT
WHERE MAJOR = 'Math' AND CREDITS > 30;
STUID
S1015
S1002
Example 2: Find the student ID and last name of students with between 30 and 60 hours
(inclusive).
STUID LNAME
S1015 Jones
S1002 Chin
Example 3: Retrieve the ID of all students who are either a math or an art major.
SELECT STUID
FROM STUDENT
WHERE MAJOR IN ('Math','Art');
this is the same as...
SELECT STUID
FROM STUDENT
WHERE (MAJOR = 'Math') OR (MAJOR = 'Art');
STUID
S1010
S1015
S1002
S1013
Example 4: Retrieve the ID and course number of all students without a grade in a class.
STUID COURSENUM
S1010 ART103A
S1010 MTH103C
NOTE: IS NULL is normally used in the WHERE clause. Also note that you say "IS NULL",
not "= NULL". NULL means "unknown" and does not really have a value in the normal sense,
so a comparison written with "= NULL" is never true.
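The query for Example 4 is missing from the notes. A minimal sketch, assuming the grades shown as -0- in the ENROLL table are stored as NULL:

```sql
SELECT STUID, COURSENUM
FROM ENROLL
WHERE GRADE IS NULL;
```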
Example 5: List the ID and course number for all students that successfully completed classes
(the inverse of #4 above).
STUID COURSENUM
S1001 ART103A
S1020 CIS201A
S1002 CIS201A
S1002 ART103A
S1020 MTH101B
S1001 HST205A
S1002 MTH103C
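The query for Example 5 is also missing. Under the same assumption (ungraded enrollments hold NULL in GRADE), the inverse test is IS NOT NULL:

```sql
SELECT STUID, COURSENUM
FROM ENROLL
WHERE GRADE IS NOT NULL;
```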
Example 6: List the course number and faculty ID for all math courses.
COURSENUM FACID
MTH101B F110
MTH103C F110
NOTE: % is a wildcard for any number of characters. _ is a wildcard that replaces a single
character. They can be used together along with normal characters.
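The query for Example 6 is not shown. A sketch using LIKE, assuming that math courses are exactly those whose course number starts with 'MTH' in the CLASS table:

```sql
-- % matches any number of characters after the prefix
SELECT COURSENUM, FACID
FROM CLASS
WHERE COURSENUM LIKE 'MTH%';

-- _ matches exactly one character: any three characters, then 103, then one more
SELECT COURSENUM, FACID
FROM CLASS
WHERE COURSENUM LIKE '___103_';
```

With the example data, the second pattern matches ART103A and MTH103C.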
COLUMN FUNCTIONS (AGGREGATE FUNCTIONS)
Aggregate functions allow you to calculate values based upon all data in an attribute of a table.
The SQL aggregate functions are: Max, Min, Avg, Sum, Count, StdDev, Variance. Note that
AVG and SUM work only with numeric values and both exclude NULL values from the
calculations.
SELECT COUNT(*)
FROM STUDENT;
COUNT(*)
6
NOTE: COUNT can be used in two ways. COUNT(*) is used to count the number of tuples that
satisfy a query. COUNT with DISTINCT is used to count the number of unique values in a
named column.
COUNT(DISTINCT)
4
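The query behind the COUNT(DISTINCT) result above is not shown. Since the example STUDENT table contains four different majors, it was presumably along these lines (an assumption, not the original text):

```sql
-- Counts each major once, however many students have it
SELECT COUNT(DISTINCT MAJOR)
FROM STUDENT;
```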
Example 3: Find the average number of credits for students who major in math.
SELECT AVG(CREDITS)
FROM STUDENT
WHERE MAJOR = 'Math';
AVG(CREDITS)
29
The ORDER BY clause is used to force the query result to be sorted based on one or more
column values. You can select either ascending or descending sort for each named column.
Example 1: List the names and IDs of all faculty members arranged in alphabetical order.
FACID FACNAME
F101 Adams
F110 Byrne
F221 Smith
F202 Smith
F105 Tanaka
Example 2: List names and IDs of faculty members. The primary sort is the name and the
secondary sort is by descending ID (for seniority purposes).
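Neither ORDER BY query appears in the notes. A sketch of both, assuming the FACULTY table of the example database:

```sql
-- Example 1: alphabetical by name (ascending is the default)
SELECT FACID, FACNAME
FROM FACULTY
ORDER BY FACNAME;

-- Example 2: by name, then by descending ID among faculty with the same name
SELECT FACID, FACNAME
FROM FACULTY
ORDER BY FACNAME ASC, FACID DESC;
```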
The GROUP BY clause is used to specify one or more fields that are to be used for organizing
tuples into groups. Rows that have the same value(s) are grouped together.
The only fields that can be displayed are the ones used for grouping and ones derived using
column functions. The column function is applied to each group of tuples instead of to the entire table.
If the HAVING clause is used, it takes the intermediate table produced by the GROUP BY and
applies further selection criteria. Note that the data are aggregated when the HAVING is
applied, so the HAVING expression should be written appropriately.
Example 1: Find the number of students enrolled in each course. Display the course number and
the count.
COURSENUM COUNT(*)
ART103A 3
CIS201A 2
HST205A 1
MTH101B 1
MTH103C 2
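The query for Example 1 is not shown. A sketch that should produce the listing above, assuming the ENROLL table of the example database:

```sql
SELECT COURSENUM, COUNT(*)
FROM ENROLL
GROUP BY COURSENUM
ORDER BY COURSENUM;   -- explicit sort; not every system orders grouped output
```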
Example 2: Find the average number of hours taken for all majors. Display the name of the
major, the number of students, and the average.
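Neither the query nor the result for Example 2 appears in the notes. A possible formulation, assuming "hours" refers to the CREDITS column of STUDENT:

```sql
SELECT MAJOR, COUNT(*), AVG(CREDITS)
FROM STUDENT
GROUP BY MAJOR;
```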
Example 3: Find all courses in which fewer than three students are enrolled.
SELECT COURSENUM
FROM ENROLL
GROUP BY COURSENUM
HAVING COUNT(*) < 3;
COURSENUM
CIS201A
HST205A
MTH101B
MTH103C
A JOIN operation is performed when more than one table is specified in the FROM clause. You
would join two tables if you need information from both.
You must specify the JOIN condition explicitly in SQL. This includes naming the columns in
common and the comparison operator.
Example 1: Find the name and courses that each faculty member teaches.
FACULTY.FACNAME COURSENUM
Adams ART103A
Tanaka CIS201A
Byrne MTH101B
Smith HST205A
Byrne MTH103C
Tanaka CIS203A
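The join query for Example 1 is missing. A sketch, assuming the FACULTY and CLASS tables of the example database, which share the FACID column:

```sql
-- Join condition written in the WHERE clause, as described above
SELECT FACULTY.FACNAME, COURSENUM
FROM FACULTY, CLASS
WHERE FACULTY.FACID = CLASS.FACID;

-- The same join written with the explicit JOIN ... ON syntax
SELECT FACULTY.FACNAME, CLASS.COURSENUM
FROM FACULTY
JOIN CLASS ON FACULTY.FACID = CLASS.FACID;
```

FACID must be qualified because it exists in both tables; COURSENUM exists only in CLASS, so qualifying it is optional.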
When both tables have an attribute name in common, you must specify which version of the
attribute that you are referring to by preceding the attribute name with the table name and a
period. (e.g., table-name.col-name). This is called "qualification".
It is sometimes more convenient to use an "alias" (an alternative name) for each table. SQL
specifies alias names in the FROM clause immediately following the actual table. Once defined,
you can use the alias anywhere in the SELECT where you would normally use the table name.
Example 2: Find the course number and the major of all students taught by the faculty member
with ID number 'F110'. (3 table JOIN)
SELECT E.COURSENUM, S.MAJOR
FROM CLASS C, ENROLL E, STUDENT S
WHERE C.FACID = 'F110'
AND C.COURSENUM = E.COURSENUM
AND E.STUID = S.STUID;
NESTED QUERIES
SQL allows the nesting of one query inside another, but only in the WHERE and the HAVING
clauses. In addition, SQL permits a subquery only on the right hand side of an operator.
Example 1: Find the names and IDs of all faculty members who teach a class in room 'H221'.
SELECT FACID
FROM CLASS
WHERE ROOM = 'H221'; ---> RESULT: F101, F202
Note that the nested SELECT is executed first and its results are used as the argument to the
outer SELECT's IN clause.
FACNAME FACID
Adams F101
Smith F202
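Only the inner query of Example 1 is shown in the notes. A sketch of the complete nested query, assuming the FACULTY and CLASS tables of the example database:

```sql
SELECT FACNAME, FACID
FROM FACULTY
WHERE FACID IN
    (SELECT FACID           -- inner query: who teaches in room H221
     FROM CLASS
     WHERE ROOM = 'H221');
```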
Example 2: Retrieve an alphabetical list of last names and IDs of all students in any class taught
by faculty number 'F110'.
SELECT LNAME, STUID
FROM STUDENT
WHERE STUID IN
(SELECT STUID
FROM ENROLL
WHERE COURSENUM IN
(SELECT COURSENUM
FROM CLASS
WHERE FACID = 'F110'))
ORDER BY LNAME;
LNAME STUID
Burns S1010
Chin S1002
Rivera S1020
The most deeply nested SELECT is done first, followed by the SELECT that encloses it.
Finally, the outer SELECT is executed, giving the result printed above.
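The intermediate steps are not reproduced in the notes. Conceptually (an actual system may evaluate the query differently), each nested SELECT can be replaced by its result from the example database:

```sql
-- Step 1: the innermost SELECT returns the courses taught by F110,
-- so the middle query is equivalent to:
SELECT STUID
FROM ENROLL
WHERE COURSENUM IN ('MTH101B', 'MTH103C');

-- Step 2: that query returns S1020, S1010 and S1002,
-- so the outer query is equivalent to:
SELECT LNAME, STUID
FROM STUDENT
WHERE STUID IN ('S1020', 'S1010', 'S1002')
ORDER BY LNAME;
```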
Example 3: Find the name and IDs of students who have less than the average number of credits.
LNAME STUID
Chin S1002
Rivera S1020
McCarthy S1013
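The query for Example 3 is missing. A sketch using a subquery on the right-hand side of a comparison, assuming the STUDENT table of the example database:

```sql
SELECT LNAME, STUID
FROM STUDENT
WHERE CREDITS <
    (SELECT AVG(CREDITS)    -- a single value: the overall average
     FROM STUDENT);
```

The average of the example data is 42.5. The listing above matches a system that computes AVG over an integer column as an integer (42); where AVG returns 42.5, Jones (42 credits) would also qualify.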
UNION QUERIES
A union query performs the 'union' set operation on two or more tables. The union operation
returns all tuples from all tables (like appending a second table to the bottom of the first).
Plain UNION then removes duplicate rows, while UNION ALL keeps them. The union operation also
allows you to sort the resulting data, apply WHERE restrictions, etc. The
syntax for the UNION operator is shown below.
Format:
SELECT fields
FROM tables
WHERE criteria
GROUP BY field
HAVING criteria
UNION
SELECT fields
FROM tables
WHERE criteria
GROUP BY field
HAVING criteria
ORDER BY sortcriteria;
Each SELECT is a standard SELECT with two exceptions. First, the fields shown in the SELECT
clauses must be 'union compatible' (i.e., equivalent number, type, and order). Second, there can
be only one ORDER BY for the entire query, placed at the end.
Example 1: Join a compatible customer table and a supplier table for all customers and suppliers
located in 'Brazil'. Sort the final result by zip code.
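The query for this example is not in the notes, and the customer and supplier tables are not part of the example database. The sketch below therefore uses hypothetical table and column names (CUSTOMER, SUPPLIER, NAME, CITY, COUNTRY, ZIP) purely for illustration:

```sql
-- Hypothetical tables; names and columns are assumptions
CREATE TABLE CUSTOMER (NAME VARCHAR(40), CITY VARCHAR(40), COUNTRY VARCHAR(40), ZIP VARCHAR(10));
CREATE TABLE SUPPLIER (NAME VARCHAR(40), CITY VARCHAR(40), COUNTRY VARCHAR(40), ZIP VARCHAR(10));

-- Both SELECT lists have the same number, type, and order of columns
SELECT NAME, CITY, ZIP
FROM CUSTOMER
WHERE COUNTRY = 'Brazil'
UNION
SELECT NAME, CITY, ZIP
FROM SUPPLIER
WHERE COUNTRY = 'Brazil'
ORDER BY ZIP;             -- one ORDER BY, applied to the combined result
```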
Oracle SQL also supports the INTERSECT and MINUS operators, which are similar to UNION.
These two connectors have the same syntax as UNION and perform the
set operations indicated by their names (MINUS is set difference).
UPDATE
Update gives you a way to modify individual attributes of a single tuple, a group of tuples, or a
whole table (or view).
Format:
UPDATE table/view
SET field1 = expression1, field2 = expression2...
WHERE update_criteria;
You can only update tuples already present in the table (i.e., you cannot use UPDATE to add
new tuples). You can only UPDATE one table at a time. You don't have to know the present
value of a field to set it (although you can refer to it in the "expression" clause). The expression
cannot be a sub-query or involve aggregate operations.
Example 1: Change the major of student 'S1020' to music. (Update a single field of one tuple)
UPDATE STUDENT
SET MAJOR = 'Music'
WHERE STUID = 'S1020';
Example 2: Change Tanaka's department to MIS and rank to Assistant. (Update several fields in
one tuple)
UPDATE FACULTY
SET DEPT = 'MIS',
RANK = 'Assistant'
WHERE FACNAME = 'Tanaka';
Example 3: Change the major of student 'S1013' from math to NULL. (Updating using NULL)
UPDATE STUDENT
SET MAJOR = NULL
WHERE STUID = 'S1013';
Example 4: Change the grade of every student enrolled in 'CIS201A' to A. (Update several tuples)
UPDATE ENROLL
SET GRADE = 'A'
WHERE COURSENUM = 'CIS201A';
Example 5: Give all students three extra credits. (Update all tuples)
UPDATE STUDENT
SET CREDITS = CREDITS + 3;
Example 6: Change the room to 'B220' for all courses taught by Tanaka. (Updating with a
subquery)
UPDATE CLASS
SET ROOM = 'B220'
WHERE FACID =
(SELECT FACID
FROM FACULTY
WHERE FACNAME = 'Tanaka');
INSERT
The INSERT operator is used to put new records into a table. Normally it is not used to load an
entire database (since other utilities can do that more efficiently). Older implementations of
SQL also used it to remove columns from an existing table, by copying the columns to keep
into a new table (before ALTER TABLE had this capability).
Format1:
INSERT INTO table (fieldlist)
SELECT fieldlist
FROM table
WHERE append_criteria;
OR
Format2
INSERT INTO table (col1, col2...) VALUES (val1, val2...);
In Format2 above, you can list the columns in any order you wish, as long as the values are
listed in the same order; the system will match them to the appropriate table attributes.
Example 1: Insert a new faculty record with ID of 'F330', name of Jones, department of CIS, and
rank of Instructor. (Inserting a single record).
INSERT INTO FACULTY (FACID, FACNAME, DEPT, RANK)
VALUES ('F330','Jones','CIS','Instructor');
Since you are inserting for all fields, you can leave off the column names after FACULTY and
get the same effect. For instance, the following two examples are equivalent:
'Datatable' is a table that holds the data to be inserted. Since the data is already in a table, this
format of the INSERT is not as useful as the first version.
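The two equivalent statements referred to above are missing from the notes. A sketch of what they may have looked like, assuming a table named DATATABLE with the same four columns, in the same order, as FACULTY:

```sql
-- With explicit column lists
INSERT INTO FACULTY (FACID, FACNAME, DEPT, RANK)
SELECT FACID, FACNAME, DEPT, RANK
FROM DATATABLE;

-- Alternative with the same effect: column lists left off
INSERT INTO FACULTY
SELECT *
FROM DATATABLE;
```

These are alternatives; running both would insert the rows twice.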
Example 2: Insert a new student record with Id of 'S1031', name of Maria Bono, 0 credits, and no
major. (Insert a record with NULL value in a field)
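The statement for this example is not shown. A sketch, assuming the STUDENT table of the example database, with the columns deliberately listed out of table order and MAJOR left out:

```sql
INSERT INTO STUDENT (FNAME, LNAME, STUID, CREDITS)
VALUES ('Maria', 'Bono', 'S1031', 0);
```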
Notice that the field names are rearranged from the table order. This does not matter. Also
notice that major is missing and will therefore be inserted into the tuple as NULL.
Example 3: Create and fill a new table that shows each course and the number of students
enrolled in it. (Inserting multiple tuples into a new table)
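The statements for Example 3 are missing. A possible version follows; the table name ENROLLCOUNT, its column names, and the column types are assumptions made for illustration:

```sql
-- Hypothetical new table to hold one row per course
CREATE TABLE ENROLLCOUNT (
    COURSENUM VARCHAR(10),
    STUDENTS  INTEGER
);

-- Fill it with one tuple per group produced by the SELECT
INSERT INTO ENROLLCOUNT (COURSENUM, STUDENTS)
SELECT COURSENUM, COUNT(*)
FROM ENROLL
GROUP BY COURSENUM;
```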
DELETE
The DELETE operator is used to erase records (not table structure). The number of records
deleted may be 0, 1, or many, depending on how many satisfy the predicate.
Format:
DELETE FROM table/view WHERE delete_criteria;
Example 2: Erase all enrollment records for student 'S1020'. (Delete several tuples).
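The statement for this example is not shown; following the format above, it could be written as:

```sql
DELETE FROM ENROLL
WHERE STUID = 'S1020';
```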
Example 3: Erase all the class records. (Deleting all the tuples from a table)
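The statement for this example is not shown either. Leaving off the WHERE clause makes the delete apply to every tuple:

```sql
DELETE FROM CLASS;
```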
Note that the table CLASS still exists, but is empty. To remove the data and the table you use
the DROP TABLE operator.
Example 4: Erase all enrollment records for Owen McCarthy. (Delete with a subquery)
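The statement for this example is missing. A sketch, assuming the student is identified by the LNAME and FNAME columns of STUDENT:

```sql
DELETE FROM ENROLL
WHERE STUID IN              -- IN also works if several students match
    (SELECT STUID
     FROM STUDENT
     WHERE LNAME = 'McCarthy' AND FNAME = 'Owen');
```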
In our database Owen McCarthy has no enrollment records, so no records are deleted from ENROLL.
EXAMPLE DATABASE
The following E/R diagram and tables are used in the examples throughout this handout.
[E/R diagram not reproduced; its labels are STUDENT, ENROLL, CLASS, FAC-CLASS, and FACULTY.]
STUDENT
STUID LNAME FNAME MAJOR CREDITS
S1001 Smith Tom History 90
S1010 Burns Edward Art 63
S1015 Jones Mary Math 42
S1002 Chin Ann Math 36
S1020 Rivera Jane CIS 15
S1013 McCarthy Owen Math 9
CLASS
COURSENUM FACID SCHED ROOM
ART103A F101 MWF9 H221
CIS201A F105 TUTHF10 M110
MTH101B F110 MTUTH9 H225
HST205A F202 MWF11 H221
MTH103C F110 MWF11 H225
CIS203A F105 MTHF12 M110
FACULTY
FACID FACNAME DEPT RANK
F101 Adams Art Professor
F202 Smith History Associate
F105 Tanaka CIS Instructor
F110 Byrne Math Assistant
F221 Smith CIS Professor
ENROLL
COURSENUM STUID GRADE
ART103A S1001 A
CIS201A S1020 B
CIS201A S1002 F
ART103A S1010 -0-
ART103A S1002 D
MTH101B S1020 A
HST205A S1001 C
MTH103C S1010 -0-
MTH103C S1002 B