
Dim 1 Notes


LECTURE I

Introduction to Data and Information Management 1

1/25/2024 CSC 1204 1


Objectives
 Data, Information and its relevance
 Manual filing systems and their challenges
 Characteristics of file based systems and their challenges
 Why database approach?
 Terminologies
 Types of DBMS
 Functions of a DBMS and its major components.
 Advantages and disadvantages of DBMS’s
 Roles/ Personnel in the database approach



Data vs. Information
Data
 "Data" is the plural of datum.
 Data are raw, unprocessed facts with no purposeful meaning.
 Data is represented by symbols such as letters of the alphabet, numerals or other special symbols.

Information
 Data that has been processed into a meaningful form.
 Value added to data:
◼ Summarized
◼ Organized
◼ Analyzed

Data vs. Information: Example
Data (No meaning attached)
◼ 51212

Information (Has meaning)


◼ 5/12/12 The start date of your final exam.

◼ $51,212 The average starting salary of an I.T support officer.

◼ 51212 A ZIP (postal) code


Data and Information - Relevance
 With data, information can be generated

 With the right, accurate and relevant information, individuals or organizations can:
◼ Make informed decisions
◼ React to the internal and external environment

 The relevance of data and information compels individuals or organizations to manage it using various means.
 …….

File-based Systems
 A file is a collection of records which contain logically related data.
 A file-based system is a collection of application programs that perform services for the end users (e.g. reports).
 Each program defines and manages its own data.
 A better alternative to paper-based filing systems.


FILE-BASED APPROACH
 A system of files and a collection of application programs manipulating them is a file-based system.

[Figure: The University's File-Based System]


LIMITATIONS OF FILE-BASED APPROACH
 Separation and isolation of data
✓ Each program maintains its own set of data.
✓ Users of one program may be unaware of potentially
useful data held by other programs.

 Duplication of data
✓ Same data is held by different programs.
✓ Wasted space and potentially different values and/or
different formats for the same item.



LIMITATIONS OF FILE-BASED APPROACH
 Data dependence
✓ Unhealthy dependency between data and programs: the file structure is defined in the program code.
 Incompatible file formats
✓ Programs are written in different languages, and so cannot easily access each other's files.
 Fixed queries/proliferation of application programs
✓ Programs are written to satisfy particular functions. Any new requirement needs a new program.


DATABASE APPROACH
 Databases were developed as a solution to the limitations of file-based systems for data storage and management:
✓ Definition of data was embedded in application programs, rather than being stored separately and independently.
✓ There was no control over access and manipulation of data beyond that imposed by application programs.
✓ Therefore there was a need to separate data from programs but with a facility for them to interact.
 Result: the database, to store, organize and secure data, and the Database Management System (DBMS).


DATABASE TERMINOLOGIES: Database
 A database is, put simply, an organised collection of related data. Formally, it may be defined as a collection of related data, and a description of the data, designed to meet the information needs of an organisation.

 Why databases are important
◼ Efficient manipulation of large data sets.
◼ Integration of multiple data sources.


Characteristics of the Database approach
 Single repository of data
◼ Sharable by multiple users, hence the need for concurrency control.

 Self-describing
◼ The system catalogue contains metadata (data defining other data).
◼ The catalogue contains information about the definitions of the database objects (for example, tables, views) and security information about the type of access that users have to these objects.

 Multiple views of data - to support individual needs of users.
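
The self-describing property can be sketched with SQLite, which keeps its catalogue in a table called sqlite_master. This is only an illustration: the student table and cs_student view are hypothetical names, and other DBMSs expose their catalogues differently.

```python
import sqlite3

# In-memory database; the table and view below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("CREATE VIEW cs_student AS SELECT student_id, name FROM student WHERE dept = 'CS'")

# sqlite_master is SQLite's catalogue: one row of metadata per database object.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

for object_type, object_name in catalogue:
    print(object_type, object_name)
```

The catalogue is queried with the same language as ordinary data, which is what lets programs discover the structure of the database instead of hard-coding it.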


Characteristics of the Database approach cont’d
 Program-data independence
◼ Data independence is the notion of keeping data separate from all programs that make use of the data.
◼ Data independence ensures that the data cannot be redefined or reorganized by any of the programs that make use of the data. In this approach, the data remains accessible, but is also stable and cannot be corrupted by the applications using it.
◼ It refers to the immunity of user applications to changes made in the definition and organization of data.


When not to use a Database
Some circumstances in which a database may not be needed include:
✓ If the application has strict real-time constraints
✓ If the speed of the application is critical. A DBMS adds processing overhead compared with direct file access.
✓ If the data involved is small and can easily be organized using other traditional means
✓ If the application lifespan is short


DATABASE TERMINOLOGIES
 Database application: A program that interacts with a database at some point in its execution.
◼ Database applications access the database, process the data and generate the required information.

 Relational DB: A collection of tabular structures that can be related to each other by a common field.

 Database Management System: A software system that enables users to define, create, and maintain the database and which provides controlled access to this database.


DATABASE LANGUAGES
 Data definition language (DDL)
1. Permits specification of data types, structures and any data constraints that should be part of the database.
2. All specifications are stored in the database.
 Data manipulation language (DML)
✓ Enables those with access to the database to insert, update, delete and retrieve data from it.
✓ Structured Query Language (SQL) is an example of a language that provides a DML.
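
The split between DDL and DML can be sketched as follows, using SQLite through Python. The course table, its columns and the inserted values are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, data types and constraints.
conn.execute(
    "CREATE TABLE course ("
    " code TEXT PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " credits INTEGER CHECK (credits > 0))"
)

# DML: insert, update, delete and retrieve data (all values are hypothetical).
conn.execute("INSERT INTO course VALUES ('CSC 1204', 'Data and Information Management 1', 4)")
conn.execute("INSERT INTO course VALUES ('XYZ 0000', 'Placeholder course', 3)")
conn.execute("UPDATE course SET credits = 5 WHERE code = 'CSC 1204'")
conn.execute("DELETE FROM course WHERE code = 'XYZ 0000'")
rows = conn.execute("SELECT code, title, credits FROM course").fetchall()
print(rows)
```

The CREATE TABLE statement changes the description of the data; the other four statements change or read the data itself.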

Database Management System (DBMS)

[Figure: the DBMS and its users]


Types of DB Management Systems
 The number of users determines whether the DBMS is classified as single-user or multi-user.
◼ Single-user DBMS: supports only one user at a time. In other words, if user A is using the database, users B and C must wait.

◼ Multi-user DBMS: supports multiple users at the same time. It may serve the entire organization and support many users across many departments.

Task: Give examples of single-user and multi-user DBMSs.

[Figure: Components of DBMS Environment]


Components of DBMS Environment
◼ Hardware: refers to the physical parts of a computer and related devices. Can range from a single PC to a network of computers.
◼ Software: collection of programs used by the computers within the database system: the DBMS, operating system and network software.
◼ Procedures: instructions and rules that should be applied to the design and use of the database and DBMS. These may consist of instructions on how to:
 Log onto the DBMS, start and stop the DBMS, make backup copies
Components of DBMS Environment
 People
- Users of the database system.
- Includes database designers, DBAs, application programmers, and end-users.

 Data: Data acts as the bridge between the machine and human components. The database contains both operational data and metadata.


Roles/Users in the Database Environment
Data Administrator (DA): Responsible for the data policies in the organization. Does not need to be a technical person.

 Data administrator's duties include:
◼ Database planning
◼ Development and maintenance of standards and policies
◼ Taking part in the conceptual design of the database


Roles/Users in the Database Environment
Database Administrator (DBA): Responsible for the management and control of the database. More technically oriented than the DA.

 Database administrator's duties include:
◼ Granting user authority to access the database.
◼ Specifying integrity constraints.
◼ Monitoring performance and responding to changes in requirements.


Roles/Users in the Database Environment
Database Designers (Logical and Physical)
◼ Logical database designer: concerned with identifying the data (the attributes, entities and relationships between the data).

◼ Physical database designer: designs any security measures required on the data and selects specific storage structures and access methods for the data to achieve good performance.


Roles/ Users in the Database Environment
Application Programmers: Write programs for accessing and interacting with the database through DML calls such as data retrieval, data update, data insertion and data deletion.

End Users (naive and sophisticated): participate in data entry and data manipulation.
◼ Naive users: do not know anything about the database, e.g. a checkout assistant at a local supermarket.
◼ Sophisticated users: have knowledge of how the database runs, e.g. a systems administrator.


Functions of a DBMS
◼ Data storage, retrieval, and update.
◼ Authorization services and a user-accessible catalog (rights of the user, database tables).
◼ Transaction support.
◼ Concurrency control services.
◼ Recovery services.
◼ Support for data communication.
◼ Integrity services (keeping the data correct and consistent).
◼ Services to promote data independence.


Advantages of DBMS
 Control of data redundancy: As discussed, traditional file-based systems (TFBS) waste space by storing the same information in more than one file. A DBMS helps in integrating files and avoiding multiple copies of the same data.
◼ Redundancy is not eliminated completely, for performance reasons.

 Improved security and integrity: If data is always accessed through the DBMS, the DBMS can enforce integrity constraints, e.g. before inserting salary information for an employee, the DBMS can check that the department budget is not exceeded.
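
One possible way to have the DBMS enforce the budget rule is a trigger, sketched below with SQLite. The department and employee tables, the budget figure of 100 and the trigger name are all hypothetical; a production schema would likely differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id TEXT PRIMARY KEY, budget INTEGER NOT NULL);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    dept_id TEXT NOT NULL REFERENCES department(dept_id),
    salary  INTEGER NOT NULL CHECK (salary > 0)
);
-- Reject an insert that would push total salaries above the department budget.
CREATE TRIGGER salary_within_budget BEFORE INSERT ON employee
BEGIN
    SELECT RAISE(ABORT, 'department budget exceeded')
    WHERE (SELECT COALESCE(SUM(salary), 0) FROM employee WHERE dept_id = NEW.dept_id)
          + NEW.salary
          > (SELECT budget FROM department WHERE dept_id = NEW.dept_id);
END;
INSERT INTO department VALUES ('IT', 100);
""")

conn.execute("INSERT INTO employee (dept_id, salary) VALUES ('IT', 60)")  # accepted

rejected = False
try:
    # 60 + 50 exceeds the budget of 100, so the trigger aborts the insert.
    conn.execute("INSERT INTO employee (dept_id, salary) VALUES ('IT', 50)")
except sqlite3.DatabaseError:
    rejected = True

total = conn.execute("SELECT SUM(salary) FROM employee").fetchone()[0]
print(rejected, total)
```

Because the rule lives in the database, every application that inserts salaries is subject to it, which is the point of routing all access through the DBMS.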



Advantages Cont’d
 Increased concurrency: In some file-based systems, if two or more users are allowed to access the same file simultaneously, the accesses may interfere with one another. With a DBMS, two or more users can access data at the same time without loss of data integrity.

 Improved data accessibility and responsiveness: data crosses departments, and unplanned questions can be asked, unlike the fixed queries in file-based systems.


Advantages cont’d
 Improved backup and recovery services: A DBMS provides facilities for minimizing the amount of processing that is lost following a failure.

 Sharing of data: Typically, files are owned by the departments that use them. On the other hand, the database belongs to the entire organization and can be shared by all authorized users.


Disadvantages of DBMS
 Complexity: A DBMS has several components, which need serious and careful planning to get right.

 Size: the software is large and takes up a lot of disk space and memory.

 Cost of DBMS: varies with the number of users (e.g. 1 or 50).

 Additional hardware costs: e.g. storage, memory, etc.


Disadvantages of DBMS
 Cost of conversion: costs of converting existing applications to run on the new DBMS and hardware, employing specialists, training staff, etc.

 Performance: some applications may not run as fast as they used to, because the DBMS caters for several applications at once.

 Higher impact of a failure: integration of resources increases vulnerability.


CHAPTER 2
Database Environment
CHAPTER 2 - OBJECTIVES

 Purpose of three-level database architecture.
 Contents of external, conceptual, and internal levels.
 Purpose of external/conceptual and conceptual/internal mappings.
 Meaning of logical and physical data independence.
 Functions of a DBMS

ANSI SPARC 3 LEVEL DATABASE ARCHITECTURE

ANSI-SPARC: American National Standards Institute (ANSI), Standards Planning and Requirements Committee (SPARC)

Uses a three-level architecture:
 External Level
 Conceptual Level
 Internal Level
ANSI-SPARC THREE-LEVEL
ARCHITECTURE

 External Level
 Users' view of the database.
 Describes that part of database that is relevant to a
particular user.

 Conceptual Level
 Content view of the database.
 Describes what data is stored in database and
relationships among the data. Also describes the
constraints.
ANSI-SPARC THREE-LEVEL
ARCHITECTURE

 Internal Level
 Physical representation of the database on the computer.
 Describes how the data is stored in the database.
 Describes the definitions of the stored records, the
representations, data fields, etc

[Figure: ANSI-SPARC three-level architecture]

[Figure: Differences between the three levels of the ANSI-SPARC architecture]
OBJECTIVES OF THREE-LEVEL
ARCHITECTURE
 All users should be able to access same data.

 A user's view is immune to changes made in other views.

 Users do not need to know physical database storage details.
OBJECTIVES OF THREE-LEVEL
ARCHITECTURE
 DBA should be able to change database storage structures without affecting the users' views.

 Internal structure of database should be unaffected by changes to physical aspects of storage.

 DBA should be able to change conceptual structure of database without affecting all users.

MAPPINGS

 Mappings between the different database schemas allow for data independence. The DBMS manages these mappings and checks the schemas for consistency.
 Conceptual/internal mappings enable the DBMS to find the records within the database storage medium that correspond to a logical record in the conceptual schema.
 External/conceptual mappings enable the DBMS to match the names of data items, etc., in the user's view with the parts of the conceptual schema that correspond to those items.

DATA INDEPENDENCE
 Data Independence means that the higher levels of the
database model are designed to be unaffected by changes to
the lower levels (internal and physical). There are two types
of Data Independence.

- Logical data independence
- Physical data independence

DATA INDEPENDENCE
 Logical Data Independence
 Refers to immunity of external schemas to
changes in conceptual schema.
 Conceptual schema changes (e.g.
addition/removal of entities) should not require
changes to external schema or rewrites of
application programs.

DATA INDEPENDENCE

 Physical Data Independence


 Refers to immunity of conceptual schema to
changes in the internal schema.
 Internal schema changes (e.g. using different file organizations, storage structures/devices) should not require change to conceptual or external schemas.
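
Both kinds of data independence can be sketched with SQLite, treating a view as a simple external schema. The staff table, the staff_directory view and the index name are hypothetical; adding a column stands in for a conceptual change and adding an index for an internal one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: a base table (hypothetical schema).
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO staff VALUES ('SG37', 'Ann', 12000)")

# External level: a user view that exposes only some attributes.
conn.execute("CREATE VIEW staff_directory AS SELECT staff_no, name FROM staff")
query = "SELECT * FROM staff_directory"
before = conn.execute(query).fetchall()

# Logical data independence: change the conceptual schema by adding an attribute.
conn.execute("ALTER TABLE staff ADD COLUMN branch_no TEXT")
after_conceptual_change = conn.execute(query).fetchall()

# Physical data independence: change a storage structure by adding an index.
conn.execute("CREATE INDEX idx_staff_name ON staff (name)")
after_internal_change = conn.execute(query).fetchall()

print(before == after_conceptual_change == after_internal_change)
```

The application query against the view is never rewritten, even though the levels beneath it change twice.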

[Figure: Data independence and the ANSI-SPARC three-level architecture]

DATABASE PLANNING, DESIGN
AND ADMINISTRATION
OBJECTIVES
 Main components of an information system.
 Main stages of database application lifecycle.
 Main phases of database design: conceptual,
logical, and physical design.
 How to evaluate and select a DBMS.

DATABASE DEVELOPMENT LIFECYCLE /
DATABASE SYSTEM LIFECYCLE
Consists of 11 steps which are not strictly sequential but are iterative to some extent; there are feedback loops between most stages of the lifecycle.

 Database planning
 System definition
 Requirements collection and analysis
 Database design
 DBMS selection (optional)
 Application design
 Prototyping (optional)
 Implementation
 Data conversion and loading
 Testing
 Operational maintenance

[Figure: Stages of the database application lifecycle]

STEP 1: DATABASE PLANNING
Management activities that allow stages of database application lifecycle to be realized as efficiently and effectively as possible.
 Must be integrated with overall IS strategy of the organization.
 Planning should establish:
 What the database application is going to do.
 To what area it will be applied.
 Who will be using it.

DATABASE PLANNING – MISSION
STATEMENT

 Mission statement for the database project defines major aims of database application.
 Those driving database projects normally define the mission statement (director/owner).
 Mission statement helps clarify purpose of the database project and provides clearer path towards the efficient and effective creation of required database application.
DATABASE PLANNING – MISSION
OBJECTIVES
 Once mission statement is defined, mission objectives are defined.
 Each objective should identify a particular task that the database must support.
 May be accompanied by some additional information that specifies the work to be done, the resources with which to do it, and the money to pay for it all.

DATABASE PLANNING
 Database planning should also include development of standards that govern:
 how data will be collected,
 how the format should be specified,
 what necessary documentation will be needed,
 how design and implementation should proceed.
This step is critical and takes a lot of time in terms of development and maintenance, but provides a good basis for staff training and quality control.
STEP 2: SYSTEM DEFINITION
Describes scope and boundaries of database
application and the major user views (both current
and future). It also involves describing or
identifying how it interfaces with the other
parts of the organization's information
system.
 User view defines what is required of a database
application from perspective of:
 a particular job role (such as Manager or
Supervisor) or
 enterprise application area (such as marketing,
personnel, or stock control).
SYSTEM DEFINITION
 Database application may have one or more user views and each user view presents the data (what data is to be held or is needed) and transaction (what will be done with that data) requirements of a system.
 Identifying user views helps ensure that no major users of the database are forgotten when developing requirements for new application (Requirement Elicitation).
 User views also help in development of complex database application allowing requirements to be broken down into manageable pieces.
[Figure: Representation of a database application with multiple user views]

STEP 3: REQUIREMENTS
COLLECTION AND ANALYSIS
Process of collecting and analyzing
information about the part of organization
to be supported by the database application,
and using this information to identify users’
requirements of new system.

REQUIREMENTS COLLECTION
AND ANALYSIS
 Information is gathered for each major user view
including:
 a description of data used or generated
 details of how data is to be used/generated
 any additional requirements for new database
application.

 Information is analyzed to identify requirements to be included in new database application.
REQUIREMENTS COLLECTION
AND ANALYSIS

 Another important activity is deciding how to manage database application with multiple user views.
 Three main approaches:
 centralized approach
 view integration approach
 combination of both approaches.

[Figure: Centralized approach to managing multiple user views]

REQUIREMENTS COLLECTION
AND ANALYSIS
 View integration approach
 Requirements for each user view are used to build a separate data model.
 Suitable when there are significant differences between the user views and the database application is sufficiently complex that it needs to be broken into manageable pieces.

[Figure: View integration approach to managing multiple user views]

STEP 4: DATABASE DESIGN
Process of creating a design for a
database that will support the
enterprise’s operations and
objectives.

DATABASE DESIGN
 Approaches include:
 Top-down
 Bottom-up
 Inside-out
 Mixed
 The most commonly used are the top-down approach (starts with a general overview and details keep being added) and the bottom-up approach (overall design is constructed from the smaller details).
DATABASE DESIGN
 Top-down:
 Starts by developing a model containing few high-level entities and relationships, after which low-level entities are identified. For example, the ER model. Suitable for complex databases.
 Bottom-up:
 Begins at a fundamental level of attributes (properties of entities and relationships).
 The association between the attributes is analyzed. Attributes that are closely related are then grouped into relations that represent types of entities and relationships between entities.
DATABASE DESIGN

 The bottom-up approach is therefore suitable for simple databases: those that have a manageable number of attributes.
 N.B.: For a complex database it is difficult to identify all the attributes, and hence the functional dependencies between them.

DATABASE DESIGN

 Inside-out:
 This approach is a variant of the bottom-up approach, but differs by first identifying the major entities and then spreading out to consider other entities, relationships and attributes associated with those first identified.

 Mixed:
 Combines both top-down and bottom-up in different aspects of the model before finally combining all the parts.
DATABASE DESIGN
 Data modeling
 This is the process of building data models to
represent a designer’s understanding of the
information requirements of an enterprise
 Data modeling is a method used to define and
analyze data requirements needed to support
the business processes of an organization
 Main purposes of data modeling include:
 To assist in understanding the meaning (semantics) of the data. This involves answering questions about entities, relationships and attributes.

DATABASE DESIGN
 Answering such questions enables one to understand each user view's perspective of the data; the nature of the data itself, independent of its physical representations; and the use of data across user views.
 To facilitate communication about the information requirements.
 Data models are a means by which a designer conveys his/her understanding of an enterprise's information requirements.
 Provided the two parties are familiar with the notation, these provide a basis for communication, e.g. Entity Relationship Diagrams (ERD).
DATABASE DESIGN
 Three phases of database design:
 Conceptual database design
 Logical database design
 Physical database design.

 The conceptual and logical design phases correspond to the first two levels of the ANSI-SPARC architecture of a database system.
 The physical design phase provides the basis to define the internal schema.

[Figure: Three-level ANSI-SPARC architecture and phases of database design]

PHYSICAL DATABASE DESIGN
 The physical database design deals with the how, while the logical and conceptual design deal with the what.

STEP 5: DBMS SELECTION
(OPTIONAL)
 This is done at any time prior to the logical database design phase, provided there is sufficient information regarding the system's requirements, e.g. performance, security and integrity constraints.
 The DBMS selection process must cater for the enterprise's current and future requirements at an optimum cost (purchase of the DBMS, additional hardware/software, changeover costs and staff training), as there may be need to expand or replace the system.
DBMS SELECTION
Selection of an appropriate DBMS to
support the database application.
 Main steps to selecting a DBMS:
 define Terms of Reference of study
 shortlist two or three products
 evaluate products
 recommend selection and produce report.

STEP 6: APPLICATION DESIGN
Design of user interface and application programs
that use and process the database.

 Database and application design are parallel activities, since a database exists to support applications.

 Includes two important activities:
 transaction design;
 user interface design.

APPLICATION DESIGN - TRANSACTIONS
 This involves designing the application programs
that access the database and the transactions
(access methods).
 This design provides a description of how the
functionality of the database application will be
achieved.
 A transaction is an action, or series of
actions, carried out by a single user or
application program, which accesses or
changes content of the database.
 For example, cash withdrawal, cash deposit, etc
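
A transaction as "a series of actions" can be sketched with a transfer that combines a deposit and a withdrawal. This is a minimal illustration using SQLite; the account table, account numbers and amounts are hypothetical, and the CHECK constraint is just one way to make an overdraft fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A1', 100)")
conn.execute("INSERT INTO account VALUES ('A2', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Deposit into dst and withdraw from src as one all-or-nothing unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE acc_no = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE acc_no = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok_small = transfer(conn, "A1", "A2", 30)    # both updates are committed
ok_large = transfer(conn, "A1", "A2", 500)   # withdrawal fails, so the deposit is undone
balances = dict(conn.execute("SELECT acc_no, balance FROM account"))
print(ok_small, ok_large, balances)
```

In the failed transfer the deposit has already been applied when the withdrawal is rejected; rolling back the whole transaction is what keeps the two accounts consistent.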

APPLICATION DESIGN -
TRANSACTIONS
 The purpose of the transaction design is to define
and document the high-level characteristics of the
transactions required

 Important characteristics of transactions:
 data to be used by the transaction
 functional characteristics of the transaction
 output of the transaction
 importance to the users
 expected rate of usage.

STEP 7: PROTOTYPING
(OPTIONAL)
It involves building a working model of a database application.
It does not have all the required features or functionality.

 Purpose
 to identify features of a system that work well,
or are inadequate
 to suggest improvements or even new features
 to clarify the users’ requirements
 to evaluate feasibility of a particular system
design.
STEP 8: IMPLEMENTATION
 Physical realization of the database and application designs.
 Database implementation is achieved using the data definition language (DDL) of the selected DBMS.
 Components such as forms, menu screens and reports are also implemented in this stage.

STEP 9: DATA CONVERSION AND
LOADING
Transferring any existing data into new database and converting any existing applications to run on new database.

 Only required when new database system is replacing an old system.
 DBMS normally has utility that loads existing files into new database.
 May be possible to convert and use application programs from old system for use by new system.

STEP 10: TESTING

Process of executing application programs with intent of finding errors.

 Use carefully planned test strategies and realistic data.
 Testing cannot show absence of faults; it can show only that software faults are present.
 Demonstrates that database and application programs appear to be working according to requirements.
 Important to include users at this stage.

STEP 11: OPERATIONAL
MAINTENANCE
Process of monitoring and maintaining
system following installation.
 Monitoring performance of system.
 If performance falls, may require tuning or reorganization of the database.
 Maintaining and upgrading database application (when required).
 Incorporating new requirements into database application.

END

THE RELATIONAL MODEL

Objectives
 Terminology of relational model.
 How tables are used to represent data.
 Properties of database relations.
 How to identify candidate, primary, and foreign keys.
 Meaning of entity integrity and referential integrity.
 Purpose and advantages of views.
Relational Model Terminology
 A relation is a table with columns and rows.
◼ Only applies to logical structure of the database, not the physical structure.
 Attribute is a named column of a relation.
 Domain is the set of allowable values for one or more attributes.
 Tuple is a row of a relation.
 Degree is the number of attributes in a relation.
 Cardinality is the number of tuples in a relation.
 Relational Database is a collection of normalized relations with distinct relation names.
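
Degree and cardinality can be read straight off a relation, as in this small sketch. The relation is modelled as a tuple of attribute names plus a set of tuples; the Branch values are illustrative only.

```python
# A relation modelled as a heading (attribute names) plus a set of tuples.
# The Branch data here is made up for illustration.
attributes = ("branchNo", "street", "city")
tuples = {
    ("B005", "22 Deer Rd", "London"),
    ("B007", "16 Argyll St", "Aberdeen"),
    ("B003", "163 Main St", "Glasgow"),
}

degree = len(attributes)     # number of attributes (columns)
cardinality = len(tuples)    # number of tuples (rows)
print(degree, cardinality)
```

Using a set for the tuples mirrors two properties of relations: duplicates cannot occur and the order of tuples carries no meaning.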
[Figure: Instances of Branch and Staff (part) Relations]
[Figure: Examples of Attribute Domains]
[Figure: Alternative Terminology for Relational Model]
Entity Type
 Entity: An entity is a data object to be modelled/stored in a
database application. The entity may be physical (like
student, staff) or logical (like course unit). An entity can be
strong or weak.
 Entity type
 The set of all possible values for an entity, such as all possible
students.
 Entity occurrence (Instance)
 Individual occurrence of an entity. Similar to a row in the relational
table.

Database Relations
 Relation schema
 Named relation defined by a set of attribute and domain name pairs.

Employee Table
EmpName   Sex   D.O.B   EmpID   Dept

 Relational database schema
 Set of relation schemas, each with a distinct name.
Properties of Relations
 Relation name is distinct from all other relation names in relational schema.
 Each cell of relation contains exactly one atomic (single) value.
 Each attribute has a distinct name.
 Values of an attribute are all from the same domain.
 Each tuple is distinct; there are no duplicate tuples.
 Order of attributes has no significance.
 Order of tuples has no significance, theoretically.

Relational Keys
 Primary Key
 Candidate key selected to identify tuples uniquely within relation.

 Alternate Keys
 Candidate keys that are not selected to be primary key.

 Foreign Key
 Attribute, or set of attributes, within one relation that matches candidate key of some (possibly same) relation.

 Composite Key
 A candidate key that consists of two or more attributes.
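
The idea of a candidate key as a minimal set of attributes that identifies tuples uniquely can be sketched as below. The helper names and the Staff rows are hypothetical. Note the limitation: a sample of data can only rule a key out; whether a set of attributes is truly a candidate key depends on the meaning of the data, not on one instance.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal attribute sets that are unique in this particular instance."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(k) <= set(attrs) for k in found):
                continue  # contains a smaller key, so it is not minimal
            if is_unique(rows, attrs):
                found.append(attrs)
    return found

# Illustrative Staff instance (values are made up).
rows = [
    {"staffNo": "SG37", "name": "Ann", "branchNo": "B003"},
    {"staffNo": "SG14", "name": "David", "branchNo": "B003"},
    {"staffNo": "SL21", "name": "Ann", "branchNo": "B005"},
]
keys = candidate_keys(rows, ("staffNo", "name", "branchNo"))
print(keys)
```

Here (name, branchNo) happens to be unique in three rows, yet two staff with the same name could join one branch tomorrow, so a designer would still choose staffNo as the primary key.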
Relational Integrity
 Null
 Represents value for an attribute that is currently unknown or not applicable for tuple.
 Deals with incomplete or exceptional data.
 Represents the absence of a value and is not the same as zero or spaces, which are values.
Relational Integrity
Entity Integrity
 In a base relation, no attribute of a primary key can be null.
 To identify each row in a table, the table must have a primary key. The primary key is a unique value that identifies each row. This requirement is called the entity integrity constraint.
 Entity integrity ensures that there are no duplicate records within the table and that the field that identifies each record within the table is unique and never null.
 The existence of the PK is the core of entity integrity.
Relational Integrity

Referential Integrity
 If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null.
 Enterprise Constraints
 Additional rules specified by users or database administrators, e.g. no student is allowed to register for more than a given number of courses in a semester.
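
Entity and referential integrity can both be delegated to the DBMS, as in this sketch using SQLite. The branch and staff tables and all key values are hypothetical. Two SQLite-specific assumptions are labelled in the comments: foreign keys are only enforced after the PRAGMA is switched on, and primary key columns are declared NOT NULL explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
CREATE TABLE branch (branch_no TEXT NOT NULL PRIMARY KEY);
CREATE TABLE staff (
    staff_no  TEXT NOT NULL PRIMARY KEY,         -- NOT NULL stated explicitly for SQLite
    branch_no TEXT REFERENCES branch(branch_no)  -- nullable foreign key
);
INSERT INTO branch VALUES ('B003');
""")

def try_insert(staff_no, branch_no):
    """Return True if the DBMS accepts the row, False if an integrity rule rejects it."""
    try:
        conn.execute("INSERT INTO staff VALUES (?, ?)", (staff_no, branch_no))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "matching_fk": try_insert("SG37", "B003"),   # foreign key matches a branch
    "null_fk": try_insert("SG14", None),         # wholly null foreign key is allowed
    "dangling_fk": try_insert("SL21", "B999"),   # no such branch
    "null_pk": try_insert(None, "B003"),         # violates entity integrity
    "duplicate_pk": try_insert("SG37", "B003"),  # primary key must be unique
}
print(results)
```

The first two inserts satisfy the referential integrity rule stated above; the last three are rejected by the DBMS rather than by application code.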
Types of Relations

Base Relation
 Named relation corresponding to an entity in conceptual schema, whose tuples are physically stored in database.
View Relation
 Dynamic result of one or more relational operations operating on base relations to produce another relation.

 A base relation (table) actually contains data. A view is a query over one (or more) base relations but does not actually contain any data itself.
 A view is a virtual relation that does not necessarily actually exist in the database but is produced upon request, at time of request.

 Contents of a view are defined as a query on one or more base relations.

 Views are dynamic, meaning that changes made to base relations that affect view attributes are immediately reflected in the view.
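
The dynamic nature of views can be sketched with SQLite. The staff table, the staff_b003 view and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary INTEGER, branch_no TEXT)"
)
conn.execute("INSERT INTO staff VALUES ('SG37', 'Ann', 12000, 'B003')")

# The view stores only its defining query; salary is deliberately left out.
conn.execute(
    "CREATE VIEW staff_b003 AS "
    "SELECT staff_no, name FROM staff WHERE branch_no = 'B003'"
)
view_query = "SELECT * FROM staff_b003 ORDER BY staff_no"
first = conn.execute(view_query).fetchall()

# A change to the base relation shows up the next time the view is queried.
conn.execute("INSERT INTO staff VALUES ('SG14', 'David', 18000, 'B003')")
second = conn.execute(view_query).fetchall()

print(first, second)
```

Because the view is evaluated at the time of request, no step is needed to refresh it, and users of the view never see the salary column.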
Purpose of Views
 Provides powerful and flexible security mechanism by hiding parts of database from certain users.
 Permits users to access data in a customized way, so that same data can be seen by different users in different ways, at same time.
 Can simplify complex operations on base relations.
Updating Views
 All updates to a base relation should be immediately reflected in all views that reference that base relation.
 If view is updated, underlying base relation should reflect change (there are restrictions on this).
Entity Relationship Modelling
Objectives
 How to use Entity–Relationship (ER) modelling in database design.
 Basic concepts associated with the ER model.
 Diagrammatic technique for displaying an ER model using Unified Modelling Language (UML).
 How to identify and resolve problems with ER models called connection traps.
 How to build an ER model from a requirements specification.
Why ER Modeling?
❑ ER modelling transforms the human-level data collected as user requirements into a universally understandable entity relationship model.
This helps in the following:
❑ Identifying and resolving traps: traps occur where real-life relationships cannot be represented in the model and may be lost once the system is computerized.
❑ Elimination of ambiguity: ambiguity arises when a problem is interpreted differently by different people and is therefore represented differently.
Why ER Modeling? (cont.)
 Foundation for entity and referential integrity: ER modelling provides a foundation for integrity before the database is implemented.
Problems with ER Models
 Problems called connection traps may arise when designing a conceptual data model.
 They are often due to a misinterpretation of the meaning of certain relationships.
 The two main types of connection traps are fan traps and chasm traps.
 Fan Trap
Where a model represents a relationship between entity types, but the pathway between certain entity occurrences is ambiguous.
 Chasm Trap
Where a model suggests the existence of a relationship between entity types, but the pathway does not exist between certain entity occurrences.
An Example of a Fan Trap
Semantic Net of ER Model with Fan Trap
 At which branch office does staff number SG37 work?
Restructuring ER Model to Remove Fan Trap
Semantic Net of Restructured ER Model with Fan Trap Removed
 SG37 works at branch B003.
An Example of a Chasm Trap
Semantic Net of ER Model with Chasm Trap
 At which branch office is property PA14 available?
ER Model Restructured to Remove Chasm Trap
Semantic Net of Restructured ER Model with Chasm Trap Removed
Entity Type
 Strong Entity Type: an entity whose occurrences exist in their own right in the modelled system. In a database application for university students, course unit, student, staff, lecture theatre, etc. are strong entities.
 Weak Entity Type: an entity whose occurrences depend for their existence on specific occurrences of other entities. In a database application for university students, parent and next of kin are weak entities.
 An entity is weak or strong with respect to a particular application, not as a general rule.
Types of Attributes
 Simple Attribute (SA)
 Attribute composed of a single component with an independent existence, for example Reg.No.
 Composite Attribute (CA)
 Attribute composed of several components for the same entity occurrence, each with an independent existence. Examples of composite attributes include address, contact, etc.
Attributes
 Single-valued Attribute
 Attribute that holds a single value for each occurrence of an entity type, e.g. Reg.No.
 Multi-valued Attribute (MVA)
 Attribute that can hold several entries/values for the same entity occurrence. Examples of multi-valued attributes include telephone number and academic qualifications.
Attributes
 Derived Attribute (/)
 Attribute whose value can be derived from another attribute, not necessarily in the same entity type. Age, for example, can be derived from date of birth. Except in special circumstances, derived attributes are not stored.
 The size of the attribute set depends on the amount of information required in the application being developed.
Relationship Types
 Derived from the fact that a database consists of related data, meaning that the entities must be related according to the application being developed.
 Relationships give a broader view of the level of interaction, whereas multiplicities give a more specific view.
 Relationship type
 Set of meaningful associations among entity types.
 Relationship occurrence
 Uniquely identifiable association, which includes one occurrence from each participating entity type.
Semantic net of Has relationship type
Relationship Types
 Degree of a Relationship
 Number of participating entities in the relationship.
 Types
 One is unary (called cyclic in these notes)
 Two is binary
 Three is ternary
 Four is quaternary
 More than two is n-ary (called tertiary in these notes). These are normally broken down into multiple cyclic and binary relationships.
Binary relationship called POwns
Ternary relationship called Registers
Quaternary relationship called Arranges
Relationship Types
 Recursive Relationship
 Relationship type where the same entity type participates more than once in different roles.
 Relationships may be given role names to indicate the purpose that each participating entity type plays in a relationship.
Recursive relationship called Supervises with role names
Entities associated through two distinct relationships with role names
ER diagram of Staff and Branch entities and their attributes
Relationship called Advertises with attributes (representing a weak entity)
Structural Constraints
 The main type of constraint on relationships is called multiplicity.
 Multiplicity is the number (or range) of possible occurrences of an entity type that may relate to a single occurrence of an associated entity type through a particular relationship.
 Represents policies (called business rules) established by the user or company.
 The most common degree for relationships is binary.
 Binary relationships are generally referred to as being:
 one-to-one (1:1)
 one-to-many (1:*)
 many-to-many (*:*)
ONE TO ONE RELATIONSHIP
 Entities A and B have a one-to-one relationship if a single occurrence in A can be mapped onto at most one occurrence in B and a single occurrence in B can be mapped onto at most one occurrence in A.
 Examples include the relationships between
 a football team and a captain,
 a president and a country,
 a university and a vice chancellor, etc.
Semantic net of Staff Manages Branch relationship type
Multiplicity of Staff Manages Branch (1:1) relationship
ONE TO MANY RELATIONSHIP
 Entities A and B have a one-to-many relationship if a single occurrence in A can be mapped onto at most one occurrence in B, but a single occurrence in B can be mapped onto more than one occurrence in A (or the same with the roles of A and B swapped).
 Examples include the relationships between
 a child and a mother,
 a soldier and an army,
 a student and a university, etc.
Semantic net of Staff Oversees PropertyForRent relationship type
Multiplicity of Staff Oversees PropertyForRent (1:*) relationship type
MANY TO MANY RELATIONSHIPS
 Entities A and B have a many-to-many relationship if a single occurrence in A can be mapped onto more than one occurrence in B and a single occurrence in B can be mapped onto more than one occurrence in A.
 Examples include the relationships between
 students and the course units they offer,
 leaders and subjects, etc.
Semantic net of Newspaper Advertises PropertyForRent relationship type
Multiplicity of Newspaper Advertises PropertyForRent (*:*) relationship
Summary of multiplicity constraints
Structural Constraints
 Multiplicity is made up of two types of restrictions on relationships: cardinality and participation.
 Cardinality
 Describes the maximum number of possible relationship occurrences for an entity participating in a given relationship type.
 Participation
 Determines whether all or only some entity occurrences participate in a relationship.
Multiplicity as cardinality and participation constraints
ENHANCED ENTITY RELATIONSHIP MODELLING
EER
 EER model concepts include
 All modelling concepts of basic ER
 Additional concepts:
 subclasses/superclasses
 specialization/generalization
 attribute and relationship inheritance
 These are fundamental to conceptual modelling.
 The additional EER concepts are used to model applications more completely and more accurately.
 EER includes some object-oriented concepts, such as inheritance.
EER MODELLING
 Two steps in EER modelling include:
 removing all aspects that are not acceptable in the relational model and
 identifying and adequately representing specialization/generalization.
UNACCEPTABLE ASPECTS
 many-to-many relationships
 composite attributes
 multi-valued attributes
 cyclic relationships and
 tertiary relationships.
ELIMINATION OF THESE ASPECTS
 Many-to-many relationships:
 Break them up and create two one-to-many relationships with a weak entity in between the initially participating entities.
 Composite attributes:
 Eliminate them and make the components atomic attributes.
 Multi-valued attributes:
 Eliminate them from the parent entity and make them weak entities of the former entities.
ELIMINATION CONT’D
 Cyclic relationships:
 Eliminate them and create an entity along the relationship so as to create two binary relationships.
 Tertiary relationships:
 Replace them with at least n − 1 binary relationships, where n is the number of participating entities.
 This partially translates an ER into an EER.
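As a rough illustration of the first and third elimination rules, the sketch below breaks a many-to-many relationship into two one-to-many relationships through a link table and moves a multi-valued attribute into its own table. It assumes SQLite via Python's `sqlite3`; the table names, the second course code and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student     (reg_no TEXT PRIMARY KEY, name TEXT);
CREATE TABLE course_unit (code   TEXT PRIMARY KEY, title TEXT);
-- Link entity: one row per (student, course unit) pair, giving two 1:* relationships.
CREATE TABLE registration (
    reg_no TEXT REFERENCES student(reg_no),
    code   TEXT REFERENCES course_unit(code),
    PRIMARY KEY (reg_no, code)
);
-- Multi-valued attribute (telephone number) moved out of student into its own table.
CREATE TABLE student_phone (
    reg_no TEXT REFERENCES student(reg_no),
    phone  TEXT,
    PRIMARY KEY (reg_no, phone)
);
INSERT INTO student VALUES ('S1', 'Amina'), ('S2', 'Brian');
INSERT INTO course_unit VALUES ('CSC1204', 'Unit A'), ('CSC1100', 'Unit B');
INSERT INTO registration VALUES ('S1', 'CSC1204'), ('S1', 'CSC1100'), ('S2', 'CSC1204');
INSERT INTO student_phone VALUES ('S1', '0700-000001'), ('S1', '0700-000002');
""")

units_of_s1 = [r[0] for r in conn.execute(
    "SELECT code FROM registration WHERE reg_no = 'S1' ORDER BY code")]
students_on_1204 = [r[0] for r in conn.execute(
    "SELECT reg_no FROM registration WHERE code = 'CSC1204' ORDER BY reg_no")]
phones_of_s1 = [r[0] for r in conn.execute(
    "SELECT phone FROM student_phone WHERE reg_no = 'S1' ORDER BY phone")]
print(units_of_s1, students_on_1204, phones_of_s1)
```

Each student can still have many course units and each course unit many students, but every table now holds only atomic values.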
SPECIALIZATION AND GENERALIZATION
 Issues of specialization and generalization are associated with superclass and subclass entities.
 Supertypes/superclasses and subtypes/subclasses are used when there exist entities which share common properties.
 The entity supertype contains the shared properties of all the subtypes.
 An entity subtype has a more specific role and belongs to a supertype.
SPECIALIZATION / GENERALIZATION
 Superclass
 An entity type that includes one or more distinct subgroupings of its occurrences.
 Subclass
 A distinct subgrouping of occurrences of an entity type.
 The relationship between a superclass and a subclass is one-to-one (mandatory or optional).
 The subclass inherits all the attributes of the superclass.
SUBCLASSES AND SUPERCLASSES
❑ An entity type may have additional meaningful subgroupings of its entities.
Superclass: EMPLOYEE
Subclasses: SECRETARY, ENGINEER, TECHNICIAN, SALARIED_EMPLOYEE, HOURLY_EMPLOYEE
EXAMPLE
[Figure: EER diagram. EMPLOYEE (Fname, Lname, SSN, Addr) WORKS in DEPARTMENT. Subclasses of EMPLOYEE: SECRETARY (TypingSpeed), TECHNICIAN (TGrade), ENGINEER (EngType), MANAGER, SALARIED_EMP, HOURLY_EMP. MANAGER MANAGES PROJECT; HOURLY_EMP BELONGS_TO TRADE_UNION.]
Entity type: relationship types
EMPLOYEE: WORKS
SECRETARY: WORKS
TECHNICIAN: WORKS
ENGINEER: WORKS
MANAGER: WORKS, MANAGES
SALARIED_EMP: WORKS
HOURLY_EMP: WORKS, BELONGS_TO
PROPERTIES OF SUPERCLASSES AND SUBCLASSES (CONT.)
 An entity CANNOT exist in the DB merely by being a member of a subclass. It must also be a member of the superclass.
 An entity can be a member of more than one subclass.
 Example: A salaried employee who is also an engineer belongs to the two subclasses ENGINEER and SALARIED_EMPLOYEE.
 It is not necessary that every entity in a superclass be a member of some subclass.
 Example: A technical writer is an employee but does not belong to any of the subclasses.
ATTRIBUTE INHERITANCE IN SUPERCLASS / SUBCLASS RELATIONSHIPS
 The type of an entity is defined by the attributes it possesses and the relationship types in which it participates.
 An entity that is a member of a subclass inherits all the attributes of the entity as a member of the superclass, as well as all the relationships in which the superclass participates.
EXAMPLE
[Figure: EMPLOYEE superclass (Fname, Lname, SSN, Addr) with disjoint (d) subclasses SECRETARY (TypingSpeed), TECHNICIAN (TGrade) and ENGINEER (EngType).]
EMPLOYEE: Fname, Lname, SSN, Addr
SECRETARY: Fname, Lname, SSN, Addr, TypingSpeed
TECHNICIAN: Fname, Lname, SSN, Addr, TGrade
ENGINEER: Fname, Lname, SSN, Addr, EngType
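Because EER borrows inheritance from object-oriented modelling, the attribute lists in the example above can be mimicked with ordinary class inheritance. This is only an analogy, sketched with Python dataclasses; the snake_case field names and the sample values are hypothetical renderings of the attributes in the slide.

```python
from dataclasses import dataclass, fields

@dataclass
class Employee:               # superclass: attributes shared by every employee
    fname: str
    lname: str
    ssn: str
    addr: str

@dataclass
class Secretary(Employee):    # subclass: inherits everything, adds one attribute
    typing_speed: int

@dataclass
class Engineer(Employee):
    eng_type: str

secretary_attrs = [f.name for f in fields(Secretary)]
engineer_attrs = [f.name for f in fields(Engineer)]

# A member of a subclass is also a member of the superclass.
s = Secretary("Jane", "Smith", "123", "Hypothetical Road", 70)
is_also_employee = isinstance(s, Employee)
print(secretary_attrs, engineer_attrs, is_also_employee)
```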
EXAMPLE
[Figure: the same EER diagram as before, with EMPLOYEE, its subclasses SECRETARY, TECHNICIAN, ENGINEER, MANAGER, SALARIED_EMP and HOURLY_EMP, and the relationships WORKS, MANAGES and BELONGS_TO.]
Entity Type: Relationship Type
EMPLOYEE: WORKS
SECRETARY: WORKS
TECHNICIAN: WORKS
ENGINEER: WORKS
MANAGER: WORKS, MANAGES
SALARIED_EMP: WORKS
HOURLY_EMP: WORKS, BELONGS_TO
SPECIALIZATION
 Top-down design process; we designate subgroupings
within an entity set that are distinctive from other
entities in the set.
 These sub groupings become lower-level entity sets
that have attributes or participate in relationships
that do not apply to the higher-level entity set.
 Depicted by a triangle component labeled ISA (E.g.
customer “is a” person).
 Attribute inheritance – a lower-level entity set
inherits all the attributes and relationship
participation of the higher-level entity set to which it
is linked.
GENERALIZATION
 Generalization is the reverse of the specialization process. It defines a generalized entity type from the given entity types.
 A bottom-up design process: combine a number of entity sets that share the same features into a higher-level entity set.
 Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way.
 The terms specialization and generalization are used interchangeably.
GENERALIZATION (CONT.)
[Figure: entity types CAR (VehicleID, Price, LicensePlateNo, MaxSpeed, NoOfPassengers) and TRUCK (VehicleID, Price, LicensePlateNo, NoOfAxles, Tonnage) generalized into superclass VEHICLE (VehicleID, Price, LicensePlateNo) with disjoint (d) subclasses CAR (MaxSpeed, NoOfPassengers) and TRUCK (NoOfAxles, Tonnage).]
GENERALIZATION (CONT.)
 We can view {CAR, TRUCK} as a specialization of VEHICLE.
 Alternatively, we can view VEHICLE as a generalization of CAR and TRUCK.
SPECIALIZATION / GENERALIZATION
 Specialization
 Process of maximizing differences between occurrences of an entity by identifying their distinguishing characteristics.
 Generalization
 Process of minimizing differences between entities by identifying their common characteristics.
CONSTRAINTS ON SPECIALIZATION/GENERALIZATION
 Disjoint constraints: disjointness (OR) vs. overlap (nondisjoint, AND)
 Participation / completeness constraints: a total (mandatory) specialization vs. a partial (optional) specialization
DISJOINTNESS CONSTRAINT
 The disjointness (d) constraint (OR) specifies that the subclasses of the specialization must be disjoint (an entity can be a member of at most one of the subclasses of the specialization).
 In an EER diagram, d in the circle stands for disjoint.
EXAMPLE
[Figure: EMPLOYEE (Name, SSN, BirthDate, Address) with two disjoint (d) specializations: {SECRETARY (TypeSpeed), TECHNICIAN (TGrade), ENGINEER (EngType)} and {SALARIED_EMP (Salary), HOURLY_EMP (PayScale)}.]
OVERLAP CONSTRAINT
 Overlap (o) (AND) specifies that the subclasses are not constrained to be disjoint, i.e., the same (real-world) entity may be a member of more than one subclass of the specialization.
 Overlap is the default constraint and is displayed by placing an o in the circle.
COMPLETENESS (PARTICIPATION) CONSTRAINT
 The completeness constraint may be total or partial.
 A total specialization (mandatory) constraint specifies that every entity in the superclass must be a member of some subclass in the specialization.
 Represented by a double line connecting the superclass to the circle.
COMPLETENESS (PARTICIPATION) CONSTRAINT (CONT.)
 A partial specialization (optional) allows an entity not to belong to any of the subclasses.
 E.g., if some EMPLOYEE entities, for example sales representatives, do not belong to any of the subclasses {SECRETARY, ENGINEER, TECHNICIAN}, then the specialization is partial.
 Represented by a single line connecting the superclass to the circle.
FOUR POSSIBLE CONSTRAINTS
 The disjointness and completeness constraints are independent.
 There are four possible constraints on specialization:
 Disjoint, total (Mandatory, Or)
 Disjoint, partial (Optional, Or)
 Overlapping, total (Mandatory, And)
 Overlapping, partial (Optional, And)
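One way to get a feel for the four combinations is to check which of them a given population of entities is consistent with. The helper below, `classify_specialization`, is a hypothetical illustration and not part of any DBMS; the employee identifiers are made up. Note that it inspects one snapshot of data, whereas the real constraints are declared on the schema.

```python
def classify_specialization(superclass, subclasses):
    """Return (disjointness, completeness) for sets of entity identifiers."""
    all_memberships = [e for members in subclasses.values() for e in members]
    # Disjoint: no entity appears in more than one subclass.
    disjoint = len(all_memberships) == len(set(all_memberships))
    # Total: every superclass entity appears in at least one subclass.
    total = set(superclass) <= set(all_memberships)
    return ("disjoint" if disjoint else "overlapping",
            "total" if total else "partial")

employees = {"E1", "E2", "E3", "E4"}
by_job = {"SECRETARY": {"E1"}, "TECHNICIAN": {"E2"}, "ENGINEER": {"E3"}}
by_pay = {"SALARIED_EMP": {"E1", "E2"}, "HOURLY_EMP": {"E3", "E4"}}
mixed = {"ENGINEER": {"E3"}, "SALARIED_EMP": {"E1", "E2", "E3"}}

job_result = classify_specialization(employees, by_job)
pay_result = classify_specialization(employees, by_pay)
mixed_result = classify_specialization(employees, mixed)
print(job_result, pay_result, mixed_result)
```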
SPECIALIZATION/GENERALIZATION OF STAFF ENTITY INTO SUBCLASSES REPRESENTING JOB ROLES
EER DIAGRAM WITH SHARED SUBCLASS AND SUBCLASS WITH ITS OWN SUBCLASS
DREAMHOME WORKED EXAMPLE - STAFF SUPERCLASS WITH SUPERVISOR AND MANAGER SUBCLASSES
DREAMHOME WORKED EXAMPLE - OWNER SUPERCLASS WITH PRIVATEOWNER AND BUSINESSOWNER SUBCLASSES
DREAMHOME WORKED EXAMPLE - PERSON SUPERCLASS WITH STAFF, PRIVATEOWNER, AND CLIENT SUBCLASSES
SOME INSERTION AND DELETION RULES APPLIED TO SPECIALIZATION/GENERALIZATION
 Deleting an entity from a superclass implies that it is automatically deleted from all the subclasses to which it belongs.
 Inserting an entity in a superclass implies that the entity is mandatorily inserted in all applicable subclasses.
 Inserting an entity in a superclass of a total specialization implies that the entity is mandatorily inserted in at least one of the subclasses of the specialization.
THE DIFFERENCES BETWEEN SPECIALIZATION AND GENERALIZATION
 The specialization process corresponds to a top-down conceptual refinement process during conceptual schema design.
 We typically start with an entity type and then define subclasses of the entity type by successive specialization.
 The generalization process corresponds to a bottom-up conceptual synthesis.
 We typically start with several entity types (the subclasses) and then define superclasses of those entity types by successive generalization.
FOOD FOR THOUGHT
 Assuming we have entities staff and students in a conceptual model of a university, propose ways in which:
 each can be specialized and
 both can be generalized.
LECTURE SIX
Advanced select statement
Join
Other DML commands
SELECTING FROM MULTIPLE TABLES
❑ You are not limited to selecting from only one table. When you select from more than one table in one SELECT statement, you are said to be joining the tables together.
❑ When you select from several tables at once, there are a few differences in the syntax of the SELECT statement.
❑ You need to ensure that all the tables you are using appear in the FROM clause of the SELECT statement.
SELECTING FROM MULTIPLE TABLES
Suppose you have two tables, Fruit and Color; you can select rows from both tables in one query.
Fruit                 Color
ID  FRUITNAME         ID  COLORNAME
1   Apple             1   Red
2   Orange            2   Orange
3   Grape             3   Purple
4   Banana            4   Yellow
SELECTING FROM MULTIPLE TABLES
❑ Note: When you select from multiple tables, you must build proper WHERE clauses to ensure that you get the result you want.
 From the Fruit and Color tables, the query for selecting FRUITNAME and COLORNAME from both tables where the IDs match would be:
mysql> SELECT FRUITNAME, COLORNAME
       FROM Fruit, Color
       WHERE Fruit.ID = Color.ID;
SELECTING FROM MULTIPLE TABLES
 Both tables have an ID column, so a bare ID is ambiguous. To select the ID from the Fruit table, qualify it with the table name:
mysql> SELECT Fruit.ID, FRUITNAME, COLORNAME
       FROM Fruit, Color
       WHERE Fruit.ID = Color.ID;
SELECTING FROM MULTIPLE TABLES
Result
ID  FRUITNAME  COLORNAME
1   Apple      Red
2   Orange     Orange
3   Grape      Purple
4   Banana     Yellow
The above illustration joins two tables using a single SELECT query.
Using JOIN
 Several types of joins exist in MySQL; they differ in how the tables are put together and in which rows appear in the result.
 The type of join used in the previous example is called an inner join, although it was not written explicitly as such.
 To rewrite the SQL statement using the proper INNER JOIN syntax, you would use:
mysql> SELECT FRUITNAME, COLORNAME
       FROM Fruit INNER JOIN Color
       ON Fruit.ID = Color.ID;
Using JOIN
 Notice the use of an ON clause instead of the WHERE clause.
 For an inner join the two give the same result, and the ON clause accepts any condition that you would use in a WHERE clause, including the various logical and arithmetic operators.
Result table
FRUITNAME  COLORNAME
Apple      Red
Orange     Orange
Grape      Purple
Banana     Yellow
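The equivalence of the implicit (WHERE) and explicit (INNER JOIN ... ON) forms can be checked with a short sketch. It assumes SQLite via Python's `sqlite3` as a stand-in for MySQL; the column types are an assumption, while the rows come from the Fruit and Color tables in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Fruit (ID INTEGER PRIMARY KEY, FRUITNAME TEXT);
CREATE TABLE Color (ID INTEGER PRIMARY KEY, COLORNAME TEXT);
INSERT INTO Fruit VALUES (1,'Apple'), (2,'Orange'), (3,'Grape'), (4,'Banana');
INSERT INTO Color VALUES (1,'Red'), (2,'Orange'), (3,'Purple'), (4,'Yellow');
""")

implicit = conn.execute("""
    SELECT FRUITNAME, COLORNAME FROM Fruit, Color
    WHERE Fruit.ID = Color.ID ORDER BY Fruit.ID
""").fetchall()
explicit = conn.execute("""
    SELECT FRUITNAME, COLORNAME FROM Fruit INNER JOIN Color
    ON Fruit.ID = Color.ID ORDER BY Fruit.ID
""").fetchall()
# Without any join condition every fruit is paired with every colour.
unrestricted = conn.execute("SELECT COUNT(*) FROM Fruit, Color").fetchone()[0]
print(implicit == explicit, unrestricted)
```

The last query shows why the join condition matters: leaving it out pairs each of the 4 fruits with each of the 4 colours.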
LEFT JOIN
❑ Here, all rows from the first table are returned, whether or not there are matches in the second table.
 Consider the following tables:
Email Table
id  Email
42  [email protected]
45  [email protected]
LEFT JOIN
Master_name Table
id  firstname  lastname
1   John       Smith
2   Jane       Smith
3   Jimbo      Jones
4   Andy       Smith
7   Chris      Jones
45  Anna       Bell
44  Jimmy      Carr
43  Albert     Smith
42  John       Doe
LEFT JOIN
 Using LEFT JOIN, you can see that if a row in Master_name has no matching row in the Email table, a null appears in place of the email address.
mysql> SELECT firstname, lastname, Email
       FROM Master_name LEFT JOIN Email
       ON Master_name.id = Email.id;
 The result of the query is shown below:
LEFT JOIN
firstname  lastname  Email
John       Smith     null
Jane       Smith     null
Jimbo      Jones     null
Andy       Smith     null
Chris      Jones     null
Anna       Bell      [email protected]
Jimmy      Carr      null
Albert     Smith     null
John       Doe       [email protected]
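The LEFT JOIN result can be reproduced with the sketch below, again assuming SQLite via Python's `sqlite3` in place of MySQL. The two email addresses are placeholders, since the real ones are not legible in these notes, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Master_name (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE Email (id INTEGER PRIMARY KEY, Email TEXT);
INSERT INTO Master_name VALUES
    (1,'John','Smith'), (2,'Jane','Smith'), (3,'Jimbo','Jones'),
    (4,'Andy','Smith'), (7,'Chris','Jones'), (45,'Anna','Bell'),
    (44,'Jimmy','Carr'), (43,'Albert','Smith'), (42,'John','Doe');
-- Placeholder addresses, not the ones from the original slide.
INSERT INTO Email VALUES (42,'john.doe@example.com'), (45,'anna.bell@example.com');
""")

rows = conn.execute("""
    SELECT firstname, lastname, Email
    FROM Master_name LEFT JOIN Email ON Master_name.id = Email.id
    ORDER BY Master_name.id
""").fetchall()
# SQL NULL comes back as Python None.
without_email = [r for r in rows if r[2] is None]
print(len(rows), len(without_email))
```

Every row of Master_name survives the join; only the two people with a matching id carry an address.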
RIGHT JOIN
 Works like a LEFT JOIN, but with the table order reversed. When using RIGHT JOIN, all rows from the second table are returned, whether or not there are matches in the first table.
 Example
mysql> SELECT firstname, lastname, Email
       FROM Master_name RIGHT JOIN Email
       ON Master_name.id = Email.id;
RIGHT JOIN
Result Table
firstname  lastname  Email
John       Doe       [email protected]
Anna       Bell      [email protected]
Several other types of joins are available in MySQL, such as Equi-Join, Cross Join, Straight Join and Natural Join. To find out more about joins, visit the MySQL manual at:
http://www.mysql.com/doc/J/O/JOIN.html
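A RIGHT JOIN returns the same rows as a LEFT JOIN with the two tables swapped. The sketch below uses that equivalence because older SQLite versions, used here through Python's `sqlite3` as a stand-in for MySQL, do not accept RIGHT JOIN. Only three Master_name rows are loaded, the addresses are placeholders, and the row with id 50 is a hypothetical addition that shows where the nulls appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Master_name (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE Email (id INTEGER PRIMARY KEY, Email TEXT);
INSERT INTO Master_name VALUES (1,'John','Smith'), (42,'John','Doe'), (45,'Anna','Bell');
-- Placeholder addresses; id 50 is hypothetical and has no Master_name row.
INSERT INTO Email VALUES
    (42,'john.doe@example.com'), (45,'anna.bell@example.com'), (50,'nobody@example.com');
""")

# Same rows as: Master_name RIGHT JOIN Email ON Master_name.id = Email.id
rows = conn.execute("""
    SELECT firstname, lastname, Email
    FROM Email LEFT JOIN Master_name ON Master_name.id = Email.id
    ORDER BY Email.id
""").fetchall()
print(rows)
```

John Smith has no email row, so he is absent from the result, while the unmatched address is kept with null names.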
Other DML Commands
 The UPDATE command is used to modify the contents of one or more columns in existing records. The most basic UPDATE syntax is as follows:
UPDATE table_name
SET column1 = 'new value', column2 = 'new value'
[WHERE some_condition_is_true];
 The rules for updating a record are similar to those used when inserting a record:
UPDATE Command
1. The data you are entering must be appropriate to the data type of the field, and
2. You must enclose your strings in single or double quotes.
Example
Consider the fruit table below:
Id  Fruit_name  status
1   Apple       Ripe
2   Pear        Rotten
3   Banana      Ripe
4   Grape       Rotten
UPDATE Command
 To update the status of every fruit to 'Ripe', use
mysql> UPDATE fruit SET status = 'Ripe';
 The above query updates the status column in every row, because it has no WHERE clause.
 NOTE: It is important to include a WHERE clause when only particular records should change.
 Conditional updates use WHERE clauses to match specific records.
Conditional Updates
 Conditional updates can use the same comparison and logical operators as SELECT, such as equal to, less than, etc.
 Using the fruit example, you could have:
mysql> UPDATE fruit SET Fruit_name = 'carrot'
       WHERE Fruit_name = 'Pear';
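The difference between a conditional and an unconditional UPDATE shows up in the number of rows touched. A minimal sketch, assuming SQLite via Python's `sqlite3` instead of MySQL, with the fruit table from these notes (column types are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruit (Id INTEGER PRIMARY KEY, Fruit_name TEXT, status TEXT);
INSERT INTO fruit VALUES
    (1,'Apple','Ripe'), (2,'Pear','Rotten'), (3,'Banana','Ripe'), (4,'Grape','Rotten');
""")

# Conditional update: only the row matching the WHERE clause changes.
conditional = conn.execute(
    "UPDATE fruit SET Fruit_name = 'carrot' WHERE Fruit_name = 'Pear'").rowcount
# Unconditional update: every row in the table changes.
unconditional = conn.execute("UPDATE fruit SET status = 'Ripe'").rowcount

names = [r[0] for r in conn.execute("SELECT Fruit_name FROM fruit ORDER BY Id")]
statuses = {r[0] for r in conn.execute("SELECT status FROM fruit")}
print(conditional, unconditional, names, statuses)
```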
The REPLACE Command
 Another method for modifying records is to use the REPLACE command, which is remarkably similar to INSERT.
 Syntax
REPLACE INTO table_name (column list)
VALUES (column values);
 NOTE: If a row with the same primary key or unique key already exists, REPLACE deletes it and inserts the new row; otherwise it behaves like INSERT.
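The delete-and-reinsert behaviour has a visible side effect: columns left out of a REPLACE are reset rather than preserved. The sketch below assumes SQLite via Python's `sqlite3`, whose REPLACE INTO follows the same delete-then-insert idea as MySQL's; the Mango row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruit (Id INTEGER PRIMARY KEY, Fruit_name TEXT, status TEXT);
INSERT INTO fruit VALUES
    (1,'Apple','Ripe'), (2,'Pear','Rotten'), (3,'Banana','Ripe'), (4,'Grape','Rotten');
""")

# Id 2 already exists, so the old row is removed and the new one takes its place.
conn.execute("REPLACE INTO fruit (Id, Fruit_name, status) VALUES (2, 'Pear', 'Ripe')")
# Id 5 does not exist, so this behaves like a plain INSERT.
conn.execute("REPLACE INTO fruit (Id, Fruit_name, status) VALUES (5, 'Mango', 'Ripe')")
# status is omitted: the old 'Rotten' value is lost with the deleted row.
conn.execute("REPLACE INTO fruit (Id, Fruit_name) VALUES (4, 'Grape')")

rows = conn.execute("SELECT Id, Fruit_name, status FROM fruit ORDER BY Id").fetchall()
print(rows)
```

When only some columns should change and the rest must be kept, UPDATE is the safer choice.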
The DELETE Command
 The basic DELETE syntax is:
DELETE FROM table_name
[WHERE some_condition_is_true];
 An example of a conditional delete using the fruit table is as follows:
DELETE FROM fruit WHERE status = 'Rotten';
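The conditional delete above can be run against the fruit table as a quick check. This sketch assumes SQLite via Python's `sqlite3` as a stand-in for MySQL; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruit (Id INTEGER PRIMARY KEY, Fruit_name TEXT, status TEXT);
INSERT INTO fruit VALUES
    (1,'Apple','Ripe'), (2,'Pear','Rotten'), (3,'Banana','Ripe'), (4,'Grape','Rotten');
""")

# Only rows matching the WHERE clause are removed.
deleted = conn.execute("DELETE FROM fruit WHERE status = 'Rotten'").rowcount
remaining = [r[0] for r in conn.execute("SELECT Fruit_name FROM fruit ORDER BY Id")]
print(deleted, remaining)
```

As with UPDATE, omitting the WHERE clause would remove every row in the table.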
Lecture Nine: Normalization
Introduction to Normalization: redundancy, anomalies
1st – 3rd Normal Forms
Objectives
 Purpose of normalization.
 Problems associated with redundant data.
 Identification of various types of update anomalies such as
insertion, deletion, and modification anomalies.
 How to recognize appropriateness or quality of the design of
relations.
 How functional dependencies can be used to group attributes
into relations that are in a known normal form.
 How to undertake process of normalization.
 How to identify most commonly used normal forms, namely
1NF, 2NF, and 3NF

Normalization
 Normalization is a technique for producing a set of well-designed relations that meet the requirements laid out in the various levels of normalization (normal forms).
 The most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal forms.
 Normalization has the underlying aim of minimizing information redundancy, avoiding data inconsistency and preventing update anomalies (insertion, deletion, and modification anomalies).
Data Redundancy
 A major aim of relational database design is to group attributes into relations so as to minimize data redundancy and reduce the file storage space required by base relations.
 Problems associated with data redundancy are illustrated by comparing the Staff and Branch relations with the StaffBranch relation.
 Anomaly: something that deviates from what is standard, normal, or expected.
Update Anomalies
 Relations that contain redundant information may potentially suffer from update anomalies.
 Types of update anomalies include:
✓ Insertion
✓ Deletion
✓ Modification
 Insertion Anomaly: occurs when extra data beyond the desired data must be added to the database.
Update anomalies: Insertion Anomaly
 Until the new faculty member, Dr. Newsome, is assigned to teach at least one course, his details cannot be recorded.
Update anomalies: Modification Anomaly
 Modification Anomaly: occurs when changing one value means making the same change in every row that repeats it; if some rows are missed, the data becomes inconsistent.
 Employee 519 is shown as having different addresses on different records.
Update anomalies: Deletion Anomaly
 Deletion Anomaly: occurs whenever deleting a row inadvertently causes other data to be deleted.
 All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.
Functional Dependency
 Functional Dependency: describes the relationship between two or more attributes in a given relation.
✓ If A and B are attributes of relation R, B is functionally dependent on A (denoted A -> B) if each value of A in R is associated with exactly one value of B in R.
 Diagrammatic representation: [Figure: functional dependency diagram]
 The determinant of a functional dependency is the attribute or group of attributes on the left-hand side of the arrow.
 Functional dependency is the main concept associated with normalization.
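The definition above can be turned into a small check over sample rows. The helper `holds` and the staff/branch values are hypothetical. A caveat: sample data can only show that a dependency is violated; whether it truly holds is a property of the schema and its business rules, not of one data set.

```python
def holds(rows, lhs, rhs):
    """True if every value of the lhs attributes maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault records the first value seen for this key;
        # any later, different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

staff_branch = [
    {"staffNo": "SL21", "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SG37", "branchNo": "B003", "bAddress": "163 Main St"},
    {"staffNo": "SG14", "branchNo": "B003", "bAddress": "163 Main St"},
]

branch_to_address = holds(staff_branch, ["branchNo"], ["bAddress"])
branch_to_staff = holds(staff_branch, ["branchNo"], ["staffNo"])
staff_to_branch = holds(staff_branch, ["staffNo"], ["branchNo"])
print(branch_to_address, branch_to_staff, staff_to_branch)
```

Here branchNo determines bAddress, but it does not determine staffNo, because branch B003 is associated with two staff numbers.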
Example
 branchNo -> bAddress
Example - Functional Dependency
Example
✓ Given TEXT we know the COURSE.
✓ TEXT -> COURSE
✓ TEXT maps to a single value of COURSE
The Process of Normalization
 Formal technique for analyzing a relation based on its
primary key and functional dependencies between its
attributes.
 Often executed as a series of steps. Each step corresponds to
a specific normal form, which has known properties.
 As normalization proceeds, relations become progressively
more restricted (stronger) in format and also less vulnerable
to update anomalies.

1/25/2024 15
Unnormalized Form (UNF)
 A table that contains one or more repeating groups.
✓ Note: A repeating group is an attribute or group of
attributes within a table that occurs with multiple
values for a single occurrence of the nominated key
attributes for that table. For example a book with
multiple authors, etc

 To create an unnormalized table:


✓ transform data from information source (e.g. form) into
table format with columns and rows.

1/25/2024 16
First normal form (1NF)
 A table is in First Normal Form (1NF) if all its attributes are atomic.
 A domain is atomic if its elements are considered to be indivisible units. A relation in 1NF is one in which the intersection of each row and column contains one and only one value.
 This implies that it should have no composite attributes or multi-valued attributes.
 If a table is not in 1NF, we convert it as follows.
UNF to 1NF
 First identify a primary key, then
Either
 Place each value of a repeating group on a tuple with duplicate values of the non-repeating data (called “flattening” the table)
Or
 Make a new table to cater for the multi-valued attributes.
 Place the repeating data, along with a copy of the original key attribute(s), into a separate relation.
 The new primary key should be a combination of the (multi-valued) attribute and the primary key of the parent table.
HEALTH HISTORY REPORT
(PID and PNAME together form the PROCEDURE group)
PET ID  PET NAME  PET TYPE  PET AGE  OWNER      VISIT DATE   PID  PNAME
246     ROVER     DOG       12       SAM COOK   JAN 13/2002  01   RABIES VACCINATION
                                                MAR 27/2002  10   EXAMINE and TREAT WOUND
                                                APR 02/2002  05   HEART WORM TEST
298     SPOT      DOG       2        TERRY KIM  JAN 21/2002  08   TETANUS VACCINATION
                                                MAR 10/2002  05   HEART WORM TEST
341     MORRIS    CAT       4        SAM COOK   JAN 23/2001  01   RABIES VACCINATION
                                                JAN 13/2002  01   RABIES VACCINATION
519     TWEEDY    BIRD      2        TERRY KIM  APR 30/2002  20   ANNUAL CHECK UP
                                                APR 30/2002  12   EYE WASH
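The "flattening" route from UNF to 1NF can be sketched on the health history report. The code below is a hypothetical illustration using plain Python data structures; only the pet id, pet name, visit date and procedure id columns are carried over, to keep it short. It also tests which attribute combinations could serve as a primary key of the flattened relation.

```python
# UNF: each pet carries a repeating group of (visit date, procedure id).
unf = [
    (246, "ROVER",  [("JAN 13/2002", "01"), ("MAR 27/2002", "10"), ("APR 02/2002", "05")]),
    (298, "SPOT",   [("JAN 21/2002", "08"), ("MAR 10/2002", "05")]),
    (341, "MORRIS", [("JAN 23/2001", "01"), ("JAN 13/2002", "01")]),
    (519, "TWEEDY", [("APR 30/2002", "20"), ("APR 30/2002", "12")]),
]

# 1NF by flattening: repeat the non-repeating data on every row of the group.
flat = [(pet_id, name, date, pid)
        for pet_id, name, visits in unf
        for date, pid in visits]

def is_key(rows, positions):
    """True if the chosen column positions identify every row uniquely."""
    keys = [tuple(row[i] for i in positions) for row in rows]
    return len(keys) == len(set(keys))

pet_only = is_key(flat, [0])               # pet id alone
pet_and_date = is_key(flat, [0, 2])        # pet id + visit date
pet_and_pid = is_key(flat, [0, 3])         # pet id + procedure id
pet_date_pid = is_key(flat, [0, 2, 3])     # all three together
print(len(flat), pet_only, pet_and_date, pet_and_pid, pet_date_pid)
```

In this data, TWEEDY has two procedures on the same date and MORRIS has the same procedure on two dates, so the flattened relation needs all three attributes in its key.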
Second Normal Form (2NF)
 Based on the concept of full functional dependency:
✓ A and B are attributes of a relation,
✓ B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
 2NF - A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
 It applies to relations that have a composite primary key.
1NF to 2NF
 This involves the removal of partial dependencies.
 A partial dependency occurs when the primary key is made up of more than one attribute (i.e. it is a composite primary key) and there exists a non-primary-key attribute that is dependent on only part of the primary key.
 These partial dependencies can be removed by moving all of the partially dependent attributes into another relation along with a copy of the determinant attribute (which is part of the primary key in the original relation).
Third Normal Form (3NF)
 Based on the concept of transitive (indirect) dependency:
✓ A, B and C are attributes of a relation such that if A -> B and B -> C,
✓ then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
 3NF - A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
2NF to 3NF
 Identify the primary key in the 2NF relation.
 Identify functional dependencies in the relation.
 If transitive dependencies on the primary key exist, remove them by placing the transitively dependent attributes in a new relation along with a copy of their determinant.
Exercises: Instructions
 The following tables are susceptible to update anomalies. Provide examples of insertion, deletion, and modification anomalies.
 Describe and illustrate the process of normalizing the tables to 3NF. State any assumptions you make about the data shown in these tables.
Exercise 1
Exercise 2
Exercise 3
Solution: Exercise 3
 0NF
◼ ORDER(order#, customer#, name, address, orderdate(product#,
description, quantity, unitprice))
 1NF
◼ ORDER(order#, customer#, name, address, orderdate)
◼ ORDER_LINE(order#, product#, description, quantity, unitprice)
 2NF
◼ ORDER(order#, customer#, name, address, orderdate)
◼ ORDER_LINE(order#, product#, quantity)
◼ PRODUCT(product#, description, unitprice)
 3NF
◼ ORDER(order#, customer#, orderdate)
◼ CUSTOMER(customer#, name, address)
◼ ORDER_LINE(order#, product#, quantity)
◼ PRODUCT(product#, description, unitprice)
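
The 3NF relations above can be tried out as tables. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in for the DBMS used in class; the column names (customer_no and so on), the data types and the sample rows are assumptions. The table is called orders because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on
conn.executescript("""
CREATE TABLE customer (
    customer_no INTEGER PRIMARY KEY,
    name        TEXT,
    address     TEXT
);
CREATE TABLE product (
    product_no  INTEGER PRIMARY KEY,
    description TEXT,
    unitprice   REAL
);
CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    customer_no INTEGER NOT NULL REFERENCES customer(customer_no),
    orderdate   TEXT
);
CREATE TABLE order_line (
    order_no    INTEGER REFERENCES orders(order_no),
    product_no  INTEGER REFERENCES product(product_no),
    quantity    INTEGER,
    PRIMARY KEY (order_no, product_no)   -- composite key from 1NF is kept
);
""")

# Hypothetical sample data: one customer with two orders.
conn.execute("INSERT INTO customer VALUES (1, 'A. Okello', 'Kampala')")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(10, "Pen", 0.5), (11, "Book", 2.0)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(100, 1, "2024-01-25"), (101, 1, "2024-01-26")])
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?)",
                 [(100, 10, 3), (100, 11, 1), (101, 10, 5)])

# Joining the four tables rebuilds the rows of the unnormalized view.
rows = conn.execute("""
    SELECT o.order_no, c.name, p.description, l.quantity
    FROM order_line l
    JOIN orders   o ON o.order_no    = l.order_no
    JOIN customer c ON c.customer_no = o.customer_no
    JOIN product  p ON p.product_no  = l.product_no
    ORDER BY o.order_no, p.product_no
""").fetchall()
customer_rows = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

The customer's name and address are stored once, however many orders exist, which is what removes the modification anomaly.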
Logical DB Design

Mapping Rules from ERD to a relational database schema
MAPPING
 A conceptual data model is converted to a
logical data model.

 Tables are created which cater for the details


of the entities as well as their relationships.

MAPPING ENTITIES
 Mapping strong entities: When mapping strong
entities, the entity becomes the table/relation and the
attributes become the fields.

Mapping a Strong Entity into a Relation
Employee  An Entity name: Employee
Emp_ID  Attributes: Emp_ID, Emp_Lname,
Emp_Lname Emp_Fname, Salary
Emp_Fname
Salary
 Primary Key: Emp_ID

Employee

Emp_Id Emp_Lname Emp_Fname Salary


Mapping a Strong Entity into a Relation
Example
Entity: Movies (Title, Year, Length, Film Type)

Resulting relation: Movies
title          year  length  filmType
Star Wars      1977  124     color
Mighty Ducks   1991  104     color
Wayne’s World  1992  95      color
Mapping a Weak Entity into a Relation
 Mapping Weak entities: Weak entities are mapped
like strong entities.

 Create a relation that includes all simple attributes

 However, the primary key of the weak entity is


combined with that of the strong entity on which it
depends.
Mapping a Weak Entity into a Relation
Employee (strong entity): Emp_ID, Emp_Name
Dependent (weak entity): Dep_SS_No, Lname, Fname, DOB, Gender

Resulting relations:
Employee(Emp_ID, Emp_name)
Dependent(Dep_SS_No, Emp_ID, Lname, Fname, DOB, Gender)

NOTE: The FK of DEPENDENT should NOT allow null values if
DEPENDENT is a weak entity
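
A minimal sketch of the Employee and Dependent relations as tables, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS. The data types and sample rows are assumptions; the point is the composite primary key and the NOT NULL foreign key on the weak entity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on
conn.executescript("""
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT
);
CREATE TABLE dependent (
    dep_ss_no TEXT    NOT NULL,
    emp_id    INTEGER NOT NULL REFERENCES employee(emp_id),
    lname TEXT, fname TEXT, dob TEXT, gender TEXT,
    PRIMARY KEY (emp_id, dep_ss_no)  -- owner's key combined with the weak entity's own key
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Sarah Naka')")
conn.execute("INSERT INTO dependent VALUES ('D-01', 1, 'Naka', 'Tom', '2015-03-02', 'M')")

try:
    # No employee 99 exists, so this dependent would have no owner.
    conn.execute("INSERT INTO dependent VALUES ('D-02', 99, 'Doe', 'Ann', '2016-01-01', 'F')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

dependent_count = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
```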


Mapping a Weak Entity
(a) Weak entity CHILD
EMPLOYEE: Employee ID, Name (First Name, Last Name)
CHILD: Child Name (First Name, Middle Initial, Last Name), Date Of Birth
Relationship: EMPLOYEE supports CHILD

(b) Tables resulting from mapping entities
Employee(EmployeeID, FirstName, LastName)
Child(EmployeeID, FirstName, MiddleInitial, LastName, DateOfBirth)

Note the composite PK in Child table


Mapping Associative Entities
◼ Identifier Not Assigned
 Default primary key for the table formed for the
associative entity is typically a composite PK
composed of (at least) the primary keys of the two
entities
◼ Identifier Assigned
 May use if one exists that is natural and familiar to
end-users
 Must use if the composite PK cannot be made
unique by adding intersection data.
Mapping an Associative Entity with Identifier not Assigned
(a) Order Line as associative entity
ORDER: Order ID, Order Date, Ship Date
ORDER LINE (associative entity): Quantity, Actual Price
PRODUCT: Product ID, Description, MSRP

(b) Three resulting tables
Order(OrderID, OrderDate, ShipDate)
OrderLine(OrderID, ProductID, Quantity, ActualPrice)
Product(ProductID, Description, MSRP)

Note the PK of the associative table
Note similarity of this situation to the M:M relationship


Mapping an Associative Entity with an Identifier
(a) Associative entity
EMPLOYEE: Employee ID, Name (First Name, Last Name)
ASSIGNMENT (associative entity): Assignment ID, Pay Rate, Assignment Date
PROJECT: Project ID, Name

(b) Three resulting tables
Employee(EmployeeID, FirstName, LastName)
Assignment(AssignmentID, EmployeeID, ProjectID, PayRate, AssignmentDate)
Project(ProjectID, Name)

Note the PK of the associative table


Mapping Attributes
1. Simple attributes: E-R attributes map directly
onto the relation
2. Composite attributes: Use only their simple,
component attributes
3. Multivalued attributes: Each becomes a separate
relation with a foreign key taken from the
superior entity
Multi-valued Attribute & Attributes with
Repeating Values
 Create a new relation to represent the multi-valued attribute
and place a copy of the primary key of the owner entity into
the new relation (to act as a foreign key).

 Primary key of the new relation is usually the primary key


of the owner entity plus the multi-valued attribute, unless
the multi-valued attribute is unique for all the rows.
◼ Example: Telephone(telfno, branchno)
Mapping Multi-valued Attributes
Entity: Employee (SSN, Name, Phone # [multivalued])

Employee           Phone
SSN   Name         SSN   Phone#
E101  Johnson      E101  312 …
E102  Smith        E102  708 …
E103  Conley       E103  312 …
E104  Roberts      E104  603 …


Mapping a Composite Attribute
Mapping Relationships: Many-to-Many
 Mappings are done according to relationships exhibited.

 Mapping many-to-many relationships:


◼ When mapping a many to many relationship, we create a
new table where the primary key is a combination of the
two primary keys of the related entities. Attributes that
may exist on the relationship are put as fields on the
created table.

Example of mapping an M:N
(Many-to-Many) relationship
a) Completes relationship (M:N)
b) Three resulting relations

Example of Mapping an m:m Relationship
Relational schema notation
Mapping Relationships: One-to-Many
 When mapping one to many relationships, the primary key from
the one side migrates to the many side and becomes a foreign
key.

 Entity on “1 side” is designated the parent entity and the entity


on “* side” is the child entity

 Place the primary key of the parent entity into the relation
schema representing the child entity (acting as a foreign key)

 Place the attributes of the relationship type in the relation


schema representing the child entity
Example of mapping a 1:* relationship
a) Relationship between customers and orders
Note the mandatory one

b) Mapping the relationship
Again, no null value in the foreign key; this is because
of the mandatory minimum cardinality
Mapping Relationships: One-to-One
The decision made here depends on the nature of participation of
the related entities.

 Mandatory – mandatory: Merge the two entities and create a


single table out of them. Demote one of the primary keys to an
alternate key.

 Optional – mandatory: Place primary key of entity with


‘optional’ participation (parent entity) to act as foreign key in
relation representing entity with ‘mandatory’ participation
(child entity)
✓ Place attributes of the relationship on the child entity
Example: Mapping a binary 1:1 relationship
In_charge 1:1 relationship

Often in 1:1 relationships, one direction is optional.

Foreign key goes in the relation on the optional side,
matching the primary key on the mandatory side
Mapping Relationships: One-to-One cont’d
 Optional – Optional: This can be approximated to an
optional mandatory case and treated as so. Designation of the
optional participation (parent) and mandatory participation
(child) entities is arbitrary unless one can find out more about
the relationship.
✓ In case it is not possible, it is handled like a many-to-
many relationship.
Mapping Recursive/Unary Relationships:
One-to-Many
 Foreign key in the same relation

 Place recursive foreign key in the same table (also true for
recursive One-to-One)

(a) EMPLOYEE entity with unary relationship (1:M)
EMPLOYEE: Employee ID, Name (First Name, Last Name), Date Of Birth
Unary relationship: manages
Mapping Unary Relationships:
1:M Relationship
(b) Resulting Employee table with recursive foreign key
Employee(EmployeeID, FirstName, LastName, DateOfBirth, ManagerID)
Note mandatory use of synonym for FK
Mapping a Unary 1:M Relationship
(c) Example data for Employee table
EmployeeID (PK)  FirstName  LastName   DateOfBirth  ManagerID (FK)
137              John       Doe        03/15/1980
142              Mary       Brown      05/16/1982   137
170              George     Turner     11/04/1969   137
186              Stephen    Smith      09/17/1978   142
198              Amanda     Walters    12/17/1984   170
204              Ernest     Hodges     08/29/1972   137
267              Michael    Rogers     01/02/1985   170
285              Juan       Rodriguez  10/10/1968   137
323              Kevin      McFadden   11/11/1977   142
361              Charles    Robideaux  02/28/1980   142
Requires a column in the table to act as a recursive foreign
key referencing the primary key of the table
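
A minimal sketch of the recursive foreign key, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS. The first five rows of the example data are used, with dates rewritten in ISO format (an assumption, since SQLite has no dedicated date type). The self-join pairs each employee with his or her manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    first_name    TEXT,
    last_name     TEXT,
    date_of_birth TEXT,
    manager_id    INTEGER REFERENCES employee(employee_id)  -- recursive FK; NULL for the top manager
)""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", [
    (137, "John", "Doe", "1980-03-15", None),
    (142, "Mary", "Brown", "1982-05-16", 137),
    (170, "George", "Turner", "1969-11-04", 137),
    (186, "Stephen", "Smith", "1978-09-17", 142),
    (198, "Amanda", "Walters", "1984-12-17", 170),
])

# Self-join: the same table plays two roles, employee (e) and manager (m).
pairs = conn.execute("""
    SELECT e.last_name, m.last_name
    FROM employee e
    JOIN employee m ON m.employee_id = e.manager_id
    ORDER BY e.employee_id
""").fetchall()
```

Doe does not appear on the employee side of the result because his manager_id is NULL, so the inner join finds no manager row for him.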
Mapping Unary Relationships:
M:N (*:*)Relationship
Two relations are formed
✓ One for the entity type
✓ One for an associative relation in which the primary key has
two fields, both taken from the identifier of the original
entity
Example
 In a manufacturing assembly line, several items consist of
multiple items as components.
✓ One item can be used to create other items.
✓ Associations among items are M:N. That is, there is an M:N
unary relationship.
Mapping Unary Relationships:
M:N (*:*)Relationship
a) “Bill-of-Materials” relationship
ITEM: Item No, Name, Selling Price
Unary relationship: consists of (attribute: Quantity)

(b) Two resulting tables
Item(ItemNo, Name, SellingPrice)
Component(ItemNo, ComponentNo, Quantity)

Note composite PK, two FKs referencing the same PK
Mapping Ternary & N-ary Relationships
 One table for each original entity and one for the common
relationship (associative entity) (i.e. a ternary relationship
maps to a total of four tables)

 Table representing the associative entity has foreign keys to


each entity in the relationship

 PK of the table formed for the associative entity is typically a


composite PK composed of (at least) the primary keys of the
three entities

Mapping a Ternary Relationship
(a) Ternary relationship as associative entity
PATIENT: Patient ID, Name (First Name, Last Name)
PHYSICIAN: Physician ID, Name (First Name, Last Name)
TREATMENT: Treatment Code, Description
ADMINISTRATION (associative entity): DateTime, Results

(b) Four resulting tables
Patient(PatientID, FirstName, LastName)
Physician(PhysicianID, FirstName, LastName)
Treatment(TreatmentCode, Description)
Administration(PatientID, PhysicianID, TreatmentCode, DateTime, Results)

Note composite PK of associative relation (linking table)
Remember that the CPK must represent a unique set of values
Mapping Generalized/Specialized Entities
 When mapping generalized/specialized entities, take their
OPTIONAL/MANDATORY status together with their AND/OR
status into consideration, i.e. the participation and disjoint
constraints.

 Mandatory - And:
✓ Put all attributes of super class and sub class in the same
table.

 Mandatory - Or:
✓ Create one table for each of the subclasses. No table for
the superclass
Mapping Generalized/Specialized Entities
 Optional - And:
✓ Create one table for the superclass and one table for the
subclasses

 Optional - Or:
✓ Create one table for each of the sub classes and one table
for the superclass

 After mapping an EER, the resultant relations are generally in 3NF; it is
therefore usually not necessary to normalize them unless we are
interested in normalizing beyond 3NF.
Mapping Supertype/subtype relationships
Example: Optional, OR
✓ Create a separate relation for the supertype and each of
the subtypes
✓ Assign common attributes to supertype
✓ Assign primary key and unique attributes to each subtype
✓ Assign an attribute of the supertype to act as subtype
discriminator
Mapping Supertype/Subtype Relationships
Would Look Like This...
Mapping Generalization/Specialization to
Relations
 General approach: Create a relation for the superclass and each
subclass.
✓ Add primary key of superclass relation into each subclass
relation

 Alternative: Just create relations for each subclass.


✓ Works for mandatory participation.
✓ Superclass attributes must be added to each subclass relation.
Validating the Number of Tables
 One simple check to ensure that the resulting relational
schema contains all of the required tables based on correctly
converting the original ERD, is to add up the number of the
following structures on the ERD:
✓ Entities (strong, associative, and weak)
✓ M:M relationships
✓ Multivalued attributes

 The number of tables in the schema should match the sum of


the numbers of these items
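
The check above is simple arithmetic. A minimal Python sketch follows; the function name and the ERD counts and table names are hypothetical.

```python
def expected_table_count(entities, many_to_many_relationships, multivalued_attributes):
    """Sum the ERD structures that each map to exactly one table."""
    return entities + many_to_many_relationships + multivalued_attributes

# Hypothetical ERD: 4 entities, 1 M:M relationship, 2 multivalued attributes.
expected = expected_table_count(4, 1, 2)

# Hypothetical schema produced by the mapping.
schema_tables = ["employee", "project", "department", "dependent",
                 "works_on", "employee_phone", "department_location"]
schema_matches_erd = len(schema_tables) == expected
```

A mismatch does not say which table is missing or extra, only that the mapping should be rechecked.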

Redundancies
 Redundancies may occur due to mapping because some tables
are represented many times, or they are not ‘normal’.

 A schema is created to remove all multiple representations of
the same tables by creating a union of the attributes.

 The created schema is then normalized to remove


redundancies when the system is operational.

Mapping Exercise
Task 1:
 Identify any errors in the EERD and correct them

Task 2:
 Derive the relations to be included in the Logical database
design; clearly indicating the primary key and the foreign
keys
DATABASE DEVELOPMENT: SQL
Introduction to SQL
DDL with tables



Introduction to SQL
 SQL stands for Structured Query Language. It is also
known as a transform-oriented language (a language
designed to use relations to transform inputs to outputs).
 A database language must allow a user to
create a database and relation structures, perform
basic data management tasks, and perform simple
and complex queries.
 It should be portable (use commands that allow one
to move from one DBMS to another easily) and have an
easy-to-learn command structure and syntax.
SQL Statements- Summary
Statement                    Description
SELECT                       Data retrieval: retrieves data from the
                             database
CREATE, ALTER, DROP,         Data Definition Language (DDL): allows the
RENAME                       creation and modification of database
                             structures. Sets up, changes and removes data
                             structures from tables
INSERT, UPDATE, DELETE       Data Manipulation Language (DML): used to
                             manipulate data in a database. The statements
                             allow one to enter new rows, change existing
                             rows and remove unwanted rows from tables in
                             a database respectively



ISO SQL Data Types



DDL WITH TABLES
❖ Creating
❖ Modifying
❖ Deleting



Creating a Table

 Involves using the CREATE TABLE statement, which
includes:
◼ Specifying a table name [mandatory]
◼ Defining columns of the table [mandatory]
 Column name [mandatory]
 Column data type [mandatory]
 Column constraints [optional]
 Providing a default value [optional]
◼ Specifying table constraints [optional]



Creating Table Syntax
CREATE TABLE table_name
(
column_name data_type [DEFAULT expr] [column_constraint],
...
[, table_constraints]
)

Example - considering only mandatory specifications.

CREATE TABLE emp (emp_id int, name varchar(30));



Create Table – using the DEFAULT option

 Example:
CREATE TABLE students
(regNo varchar(15), name varchar(20), dob date, gender
char(1) default 'M');



Create Table – using table and column
constraints
 Constraints enforce rules on data whenever a row is
inserted, updated, or deleted from a table. The constraints
have to be satisfied for the operation to succeed.
 Data Integrity Constraints:
◼ Not null: specifies that the column cannot contain a
null value.
◼ Unique: specifies a column or combination of
columns whose values must be unique for all rows in
the table.



Create Table – using table and column
constraints cont’d
◼ Check: specifies a condition that must be true
◼ Primary key: uniquely identifies each row of the table
◼ Foreign key: establishes and enforces a foreign key
relationship between the column and a column of the
referenced table.
 Constraints can be defined at one of two levels:
i. Column constraint:
◼ References a single column and is defined within
the column definition.
◼ Can define any type of integrity constraint.



Create Table – using table and column
constraints Cont’d
ii. Table constraint:
◼ References one or more columns and is defined
after the column list.
◼ Can define any constraint except not null.

 Whether to define a constraint at table level or column
level is a matter of style; however, the not null constraint
can only be defined at the column level.



Create Table – using table and column
constraints Cont’d
Example:
 [using column constraints]
create table students (regNo varchar(15) primary key, name
varchar(20), dob date, gender char(1) not null);
OR
[using table constraints]
create table students (regNo varchar(15), name varchar(20),
dob date, gender char(1) not null, constraint
students_regNo_pk primary key(regNo));
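
To see the constraints being enforced, the table-constraint version of the students table can be created and tested. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the sample rows and the helper name rejected are hypothetical, and error messages will differ in other DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE students (
    regNo  VARCHAR(15),
    name   VARCHAR(20),
    dob    DATE,
    gender CHAR(1) NOT NULL,
    CONSTRAINT students_regNo_pk PRIMARY KEY (regNo)
)""")
conn.execute("INSERT INTO students VALUES ('21/U/001', 'Okello', '2003-05-01', 'M')")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Same regNo as the existing row: violates the primary key.
duplicate_key = rejected("INSERT INTO students VALUES ('21/U/001', 'Naka', '2004-02-11', 'F')")
# gender is left out, so it would be NULL: violates NOT NULL.
missing_gender = rejected("INSERT INTO students (regNo, name) VALUES ('21/U/002', 'Maya')")

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```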



Creating Tables Using Subqueries
 The table is created with the specified column names, and
the rows retrieved by the select statement are inserted into
the table.

 The column definition can contain only the column name


and default value.

 If column specifications are given, the number of columns


must equal the number of columns in the subquery select
list.



Creating Tables Using Subqueries cont’d
 If no column specifications are given, the column names of
the table are the same as the column names in the subquery.

 Only the not null constraint and data types are passed on to the
new table.

 Be sure to give a column alias when selecting an expression.



Creating a table by using a subquery
 [providing column specifications]
create table dept20 (empno primary key, ename, ann_sal)
as select employee_id, last_name || first_name, salary*12
"annual salary" from employees where department_id
=20;
OR

 [without providing column specifications]
create table dept20 as select employee_id, last_name || ' '
|| first_name name, salary*12 "annual salary" from
employees where department_id=20;
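
A minimal sketch of the second form (no column specifications), assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS. The employees rows are hypothetical, and the alias annual_salary is written without spaces here for convenience. The new table takes its column names from the subquery, which is why every expression is given an alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
    salary REAL, department_id INTEGER)""")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?, ?)", [
    (1, "Sarah", "Naka", 1000, 20),
    (2, "Rich", "Maya", 1500, 20),
    (3, "John", "Doe", 2000, 30),
])

# Column names of dept20 come from the select list and its aliases.
conn.execute("""
CREATE TABLE dept20 AS
SELECT employee_id,
       last_name || ' ' || first_name AS name,
       salary * 12                    AS annual_salary
FROM employees
WHERE department_id = 20
""")

columns = [row[1] for row in conn.execute("PRAGMA table_info(dept20)")]
copied = conn.execute("SELECT * FROM dept20 ORDER BY employee_id").fetchall()
```

PRAGMA table_info is specific to SQLite; other DBMSs expose column names through their own catalog views or a DESCRIBE command.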
Modifying a Table Structure
 Modifying a table involves using the ALTER TABLE
statement which could cater for three kinds of adjustments:
◼ MODIFY column (modify [data type/size][not
null][default])
◼ ADD column(s) / constraint(s),
◼ DROP column(s) / constraint(s)
 Examples:
◼ alter table students add weight number;
◼ alter table students add height number;
◼ alter table students modify weight number(4);
General Syntax
To add a column to a table
 ALTER TABLE table_name
ADD column_name datatype;



Modifying a Table Structure cont’d
◼ alter table students drop column height;
◼ alter table students modify gender default 'F';
◼ alter table students add constraint students_gender_ck
check (gender in ('M','F'));
◼ alter table students add constraint students_name_uk
unique (name);
◼ alter table students drop constraint students_name_uk;



Modifying a Table Structure Cont’d

Notes:
 The not null constraint is added by modifying the column
that is to be defined as not null
Example:
ALTER TABLE students
MODIFY weight not null;

 Adjusting populated tables is more restrictive
because the changes must not violate the
data already stored in the tables.
To delete a column in a table
 ALTER TABLE table_name
DROP COLUMN column_name;



Removing a Table from a DB schema

 Example:
DROP TABLE students;



Quiz
 Why does the statement fail to execute:
create table dept20 as select employee_id, last_name
|| ' ' || first_name, salary*12 "annual salary" from
employees where department_id=20;

 Identify the error in this statement:


alter table students add constraint
students_name_nn not null;



Quiz Cont’d

 Write all SQL statements that will create the relational DB schema:

DEPTS (DeptNo, DName, Loc)


PERSONNEL (EmpNo, EName, Job, *Mgr, HireDate, Sal,
Comm, *DeptNo)

Data Constraints:
◼ A department name value must never be repeated.
◼ No person should earn a salary that is more than 5000.
Note:
◼ A foreign key is preceded by an asterisk (*);
◼ Mgr references EmpNo.
◼ Enforce another foreign key where necessary
DML with tables



DML Overview
 This is a language that provides a set of operations to
support the basic data manipulation operations on the
data held in the database.

 There are several types of DML statements, including:
◼ INSERT – to add data/new rows to a table
◼ SELECT – to query data in the database (could
involve one table or many tables; to be covered in
another learning unit)
DML Overview cont’d
◼ UPDATE – to make changes to existing data/rows in a
table

◼ DELETE – to remove data from a table/remove existing


rows from a table

Therefore some of the operations performed are insertion,
modification, retrieval and deletion.



Example to be used
 Create a table called employees with the following attributes:



Statement for creating emp_table
Create table emp_table (empId varchar(20) primary
key, fName varchar(15), lName varchar(15) not null,
dob date);



The INSERT operation
 This is used to insert new rows in the relation or table.

 The general syntax of an INSERT statement is:
INSERT INTO table [(column [, column ...])]
VALUES (value [, value ...]);

 Note:
◼ This statement will insert one row at a time.
◼ Enclose date and character values in single quotes.
◼ By default, the values must be listed in the order of the columns in the
relation.



The Insert Operation
Two primary ways of adding data to a table:
i. Using the VALUES clause in an INSERT INTO statement [only one
row is added]
INSERT INTO emp_table(dob, empid, fName, lName)
VALUES ('22-jan-88','93/3/3', 'sarah', 'naka');

ii. Using a subquery in an INSERT INTO statement [many rows can be
added]; this is the same as copying rows from another table.
INSERT INTO emp_table (empid, fname, lname, dob)
SELECT employee_id,first_name, last_name, hire_date
FROM employees
WHERE lower(last_name) like '%k%';



The Insert Operation Cont’d
 In either case (i.e. using values clause or subquery) – the
order and number of columns listed must match the order
and number of values to be inserted.

 If a column list is omitted then ensure that all values are
listed in the default order of the columns in the table.
Example:
INSERT INTO employees
VALUES ('998', 'rich', 'maya', '02-feb-98');



The Insert Operation Cont’d
 Inserting rows with null values
◼ Implicit method: list only the columns you are supplying values for
and omit the names of the columns you are not
interested in.
Example: INSERT INTO departments (department_id,
department_name)
VALUES (30, 'Purchasing');

◼ Explicit method: specify the NULL
keyword for the columns whose values you are not entering.
Example: INSERT INTO departments
VALUES (30, 'Purchasing', NULL, NULL);
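
Both methods can be run side by side. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS. The two extra columns (manager_id, location_id) and the second department row are assumptions made to match the four values in the explicit example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE departments (
    department_id   INTEGER PRIMARY KEY,
    department_name TEXT NOT NULL,
    manager_id      INTEGER,
    location_id     INTEGER)""")

# Implicit: unlisted columns receive NULL (or their DEFAULT, if one is defined).
conn.execute("INSERT INTO departments (department_id, department_name) "
             "VALUES (30, 'Purchasing')")
# Explicit: every column is supplied, with NULL written out.
conn.execute("INSERT INTO departments VALUES (40, 'Finance', NULL, NULL)")

rows = conn.execute("SELECT * FROM departments ORDER BY department_id").fetchall()
```

SQL NULL comes back as Python's None, so both rows end with two None values.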



The Insert Operation Cont’d
 Known values must be supplied for columns that are
defined as ‘not null’.
Example:
INSERT INTO employees (employee_id, last_name, dob) VALUES
(903,'Okello', '22-jan-88');

 Why does the statement below fail?


INSERT INTO employees (employee_id, first_name, dob) VALUES
(311,'Smith', '22-Oct-03');



SQL Terminology
 Keyword

 Clause

 Statement

Basic SELECT Statements 4.11


SELECT Statement
 When writing the SELECT statement, keep the following
in mind:
◼ SQL statements can be written on one or more lines
◼ SQL statements are not case sensitive, unless indicated;
keywords are usually written in uppercase and other
words in lowercase.
◼ New lines and indents can be used for readability,
e.g. clauses are usually written on separate lines.
◼ Keywords cannot be split or abbreviated.



Basic SELECT statement
 The simplest form of the SELECT statement must include;
◼ The SELECT clause; this specifies the columns that are to
be retrieved from a table/relation
◼ The FROM clause; this specifies the table from which the
specified columns are to be retrieved.
 The generalised syntax of a basic select statement is as
follows:
SELECT * FROM table_name;
SELECT [DISTINCT] column FROM table_name;
SELECT expression [alias] FROM table_name;



SELECT Statement Cont’d
Notes:
 SELECT and FROM clauses are mandatory
 The order of execution is
FROM table(s)
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING group_condition]
SELECT column_list
[ORDER BY column_list]
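
The order of execution can be followed on a small query. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the table contents are hypothetical. The numbered comments follow the logical order listed above, not the order in which the clauses are written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empid TEXT, department_id INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("c01", 10, 1000), ("c02", 10, 3000),
    ("c03", 20, 2500), ("c04", 20, 2600),
    ("c05", 30, 900),  ("c06", 30, 4000),
])

result = conn.execute("""
    SELECT department_id, COUNT(*) AS headcount   -- 5. build the output columns
    FROM employees                                -- 1. read the table
    WHERE salary >= 1000                          -- 2. drop individual rows
    GROUP BY department_id                        -- 3. form groups from the surviving rows
    HAVING COUNT(*) >= 2                          -- 4. drop whole groups
    ORDER BY department_id DESC                   -- 6. sort the final result
""").fetchall()
```

Department 30 has two employees in the table, but c05 is removed by WHERE before grouping, so the group has only one row and is then removed by HAVING.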



Capabilities of a SELECT Statement
3 Capabilities:

i. Projection – specifying columns to be displayed


(achieved through the use of the SELECT clause).
ii. Selection – specifying rows to be retrieved (achieved
through the use of the WHERE clause).
iii. Join – combines information from two or more
tables.



SELECTION
 This is the SQL capability of specifying the rows one is
interested in retrieving from a table (Row selection)

 This is done using the WHERE clause.

 General syntax:
◼ SELECT clause
FROM table
WHERE condition;
 Comparison operators are used to define conditions in the
where clause.
Selection Cont’d

 Note
i. Character strings and dates in the where clause
must be enclosed in single quotation marks (‘ ‘).
Numeric constants should not be enclosed in single
quotation marks.
ii. All character searches are case sensitive.



Selection Cont’d
 Logical Conditions
A logical condition combines the result of two component
conditions to produce a single result based on them or
inverts the result of a single condition. A row is returned
only if the overall result of the condition is true. The three
logical operators available in SQL are:
◼ NOT
◼ AND
◼ OR



Data Retrieval from one Table
 Four major scenarios of data retrieval from a single table:
i. Retrieving all columns and all rows;
ii. Retrieving specific columns and all rows;
iii. Retrieving all columns and specific rows;
iv. Retrieving specific columns and specific
rows.



(i) Retrieving all columns and all rows from the
employees table.

SELECT * FROM employees;

Alternatively specify all columns of the employees table;


separate columns with commas, order of columns does not
matter.



(ii) Retrieving specific columns and all rows
from the employees table.

SELECT empid, fname, last_name FROM Employees;



(iii) Retrieving all columns and specific rows from the
employees table.
SELECT *
FROM Employees
WHERE empid = 'c01';



(iv) Retrieving specific columns and specific rows from
the employees table.

SELECT empid, fname, last_name
FROM Employees
WHERE empid = 'c03';



Quiz
 Describe the intention of the following SQL queries:
i. SELECT empid, fname, lname
FROM employees WHERE fname > 'Rasha';
ii. SELECT empid, fname, lname, salary
FROM employees WHERE salary >= 2500 and
empid like '%MAN%';
iii. SELECT lname, empid FROM employees WHERE
empid not in ('C02', 'C04');



Using the DISTINCT Keyword
 Purpose – enables the retrieval of a unique set of tuple(s)
or row(s).
SELECT DISTINCT lname
FROM employees;
 You can specify multiple columns after the DISTINCT
qualifier which results into distinct tuples.
SELECT DISTINCT lname, fname
FROM employees;



Arithmetic operators and Expressions
 The Arithmetic operators present in SQL include: +, -, *, /
◼ These can be used in any clause except the FROM clause;
◼ The usual operator precedence rules apply in SQL
statements: * and / are evaluated before + and -

 Arithmetic expression can contain column names, constant


values and the arithmetic operator.



Using Expressions in a SELECT Clause
 Date expressions (+, –)
SELECT empid, fname, last_name, dob + 3
FROM employees
WHERE empid='c03';
 Numeric expressions (+, –, * , / )
SELECT lname, salary, salary + 300
FROM employees;
 Character expressions ( concat)
SELECT concat (fname, lname)
FROM employees;
N.B: An expression that involves a null value returns a null value.
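
The N.B. about null values can be demonstrated directly. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the two rows are hypothetical. COALESCE, which replaces a null with a fallback value, is shown as one common way around the problem and is not covered in the notes above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (lname TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("naka", 1200), ("maya", None)])  # maya's salary is unknown (NULL)

rows = conn.execute(
    "SELECT lname, salary, salary + 300, COALESCE(salary, 0) + 300 "
    "FROM employees ORDER BY lname"
).fetchall()
```

For maya, salary + 300 is null (None in Python), while the COALESCE version treats the missing salary as 0 and returns 300.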
Using Column Aliases
 A column heading is changed by using a column alias, with
the purpose of making the column heading more descriptive.
 Double quotation marks must be used when the alias has
spaces, is case sensitive or has special characters.
 Is useful for calculations
 May be defined using the AS keyword.

SELECT column_name AS alias_name
FROM table_name;

SELECT CONCAT(fname,' ' ,lname ) "Name",
salary*12 AS "Annual Salary"
FROM employees;
Using the ORDER BY clause
 Facilitates sorting of data results in a particular order /
direction.
 The ORDER BY clause consists of the column identifiers that the
result is to be sorted on, separated by commas. A
column identifier can be a column name / expression /
alias / position.
 Sorting is done in ascending (ASC) or descending
(DESC) order on any combination of columns, regardless
of whether that column appears in the result.
 Sorting is by default ascending.
ORDER BY Clause Cont’d
 Data Sorting Example:
SELECT last_name, job_id, department_id, hire_date
FROM employees WHERE job_id like '_A%' ORDER BY
hire_date;
Note:
 The ORDER BY clause must be the last clause of the SQL
statement.
 You can sort by multiple columns.
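
Sorting on several columns, by alias and by position, can be sketched as follows, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT, department_id INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("Doe", 20, 1000), ("Brown", 10, 3000),
    ("Smith", 20, 2500), ("Turner", 10, 1500),
])

# Department ascending (the default), then the aliased expression descending.
by_alias = conn.execute("""
    SELECT last_name, department_id, salary * 12 AS annual_salary
    FROM employees
    ORDER BY department_id, annual_salary DESC
""").fetchall()

# The same ordering written with column positions in the select list.
by_position = conn.execute("""
    SELECT last_name, department_id, salary * 12 AS annual_salary
    FROM employees
    ORDER BY 2, 3 DESC
""").fetchall()
```

DESC applies only to the column it follows, so department_id is still sorted in ascending order in both queries.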



The Update Operation
 The Update operation is used to modify existing rows in a
database.

 The general syntax of an update statement is;


UPDATE table
SET column = value [,column = value,…]
[WHERE condition];



The Update Operation
 Specific row(s) are modified if you specify the WHERE
clause in the UPDATE statement. The condition is usually
on the primary key, because a condition on any other column
may cause several rows to be updated unexpectedly.
Example:
UPDATE employees
SET lname = 'nakato'
WHERE empid = 'c05';



The Update Operation cont’d
 All rows in the table are modified if you omit the where
clause.
Example:
UPDATE employees
SET lname= 'nakato';

 Updating rows with values that are tied to integrity


constraints usually returns an error when the constraint is
violated.
◼ Example when updating a foreign key.
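
The effect of the WHERE clause on an UPDATE can be measured by counting affected rows. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the three rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empid TEXT PRIMARY KEY, lname TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("c03", "doe"), ("c04", "smith"), ("c05", "brown")])

# With a WHERE clause on the primary key, exactly one row changes.
one = conn.execute(
    "UPDATE employees SET lname = 'nakato' WHERE empid = 'c05'").rowcount

# Without a WHERE clause, every row in the table is updated.
every = conn.execute("UPDATE employees SET lname = 'nakato'").rowcount

surnames = [r[0] for r in conn.execute("SELECT DISTINCT lname FROM employees")]
```

After the second statement all three employees share the same surname, which is rarely what was intended.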



Updating with the use of a Subquery
 A subquery may be used to update a row or rows in a given
relation
Example:
UPDATE employees
SET empid='SAL_PER',
salary = (select salary
from employees
where empid = 'c06')
WHERE empid = 'c03';



The Update Operation cont’d
 Updating multiple columns using a subquery;
◼ The general syntax is
UPDATE table
SET column = ( SELECT column
FROM table
WHERE condition)
[,
column = ( SELECT column
FROM table
WHERE condition)]
[WHERE condition];



The Delete Operation
 To remove rows from a table, the DELETE
statement is used.

 The general syntax is;


DELETE [FROM] table
[WHERE condition];



The Delete Operation
 Specific row(s) are deleted if you specify the WHERE
clause in the DELETE FROM statement.
Example:
DELETE FROM employees
WHERE empid = 'c03';

 Take care, if a WHERE clause is not specified, then all


rows from the table will be deleted.
Example:
DELETE FROM employees;



Deleting Rows based on Another Table
 This is usually done using a Subquery
Example:
DELETE FROM employees
WHERE empid = (SELECT empid
FROM employees
WHERE location_id=2400);



Deleting Rows: Integrity constraint Error
 You cannot delete a row that contains a primary key that is
used as a foreign key in another table.

◼ Example:
DELETE FROM employees
WHERE empid = 'c04';

Error: child record found violation
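
The error can be reproduced with a parent and a child table. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module as a stand-in DBMS; the dependents table and all rows are hypothetical, and the error text differs from the message quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is on
conn.executescript("""
CREATE TABLE employees (empid TEXT PRIMARY KEY, lname TEXT);
CREATE TABLE dependents (
    dep_name TEXT,
    empid    TEXT NOT NULL REFERENCES employees(empid)
);
INSERT INTO employees  VALUES ('c03', 'doe'), ('c04', 'smith');
INSERT INTO dependents VALUES ('ann', 'c04');
""")

try:
    # c04 is still referenced by a row in dependents.
    conn.execute("DELETE FROM employees WHERE empid = 'c04'")
    parent_delete_blocked = False
except sqlite3.IntegrityError:
    parent_delete_blocked = True

# Deleting the child rows first removes the dependency, so the parent can go.
conn.execute("DELETE FROM dependents WHERE empid = 'c04'")
conn.execute("DELETE FROM employees WHERE empid = 'c04'")
remaining = [r[0] for r in conn.execute("SELECT empid FROM employees")]
```

Deleting child rows first is only one option; many DBMSs can also be told to cascade the delete when the foreign key is declared.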



Common Integrity Constraint Errors
 Leaving out values for columns with NOT NULL
constraints.
 Deleting a parent record that has dependencies on it.
Example:
DELETE FROM dept_copy
WHERE department_id = 20;
 Inserting or updating a record with a foreign key value
that does not exist in the referenced primary or unique key
of the parent table. Example:
UPDATE emp_copy
SET department_id = 55
WHERE department_id = 110;