Bcs Database - Complete Reference 2022
DATABASE SYSTEMS
COMPLETE REFERENCE
SEPTEMBER 2022
FILE SYSTEMS
The traditional approach to data management is called a file system (also known as a
conventional file system). In this approach, each department of the organization has its
own data files and application programs. There is no integration between different
applications. Conventional file systems are created and accessed using conventional
programming languages such as COBOL and Pascal.
Data redundancy – Data is duplicated in several files (e.g. customer data in the accounts
dept for credit sales, the same customer data in the advertising dept for promotions). This
wastes storage, forces the same insertions and updates to be repeated in every copy, and
creates a potential for data inconsistencies (e.g. a customer address changed in one file
but not in the other).
Data inconsistency – The same data field can contain two different values in two different
files (e.g. a customer changes his address and it is changed in one file but not in the
other; now the same customer has two different addresses).
Poor data integrity – All data validations must be coded in application programs. This
makes room for invalid data to get in.
Procedural query language – Data are manipulated by using procedural code such as
“if else” and “while”. This is difficult and time consuming.
DATABASE SYSTEMS
1. Minimum Data Redundancy – In a database each data item is stored just once.
This is due to the fact that a database integrates ALL corporate data into a single
structure. This allows users to extract data from multiple files as if they were in a
single file.
2. Consistency of data – Minimum data redundancy helps to achieve data
consistency. Since each data item is stored just once, something like the same
customer having two different addresses cannot happen.
3. Support for data sharing – Data can be shared across multiple applications.
Access paths are created between related files/tables so that different
applications can access the same data.
4. Improved data integrity – A database offers stronger data integrity via three
types of integrity that are inherent to a relational database: Entity Integrity,
Referential Integrity and Domain Integrity.
DBMSs are expensive to acquire. Hardware such as database servers can be expensive.
A database environment is a highly complex setup requiring sophisticated technology
(e.g. transaction logs, checkpoints, concurrency control). It demands the presence of a
database administrator (DBA). It also requires extensive user training.
NOTE: File systems are still beneficial in some systems. They are economically feasible
in small, single-user business applications that have only a few data files, each with a few
hundred records (e.g. a staff payroll or a stock control system of a small business or a
library management system of a small library). For small-scale systems the complex
technology of a database, the high cost of the DBMS and hardware and the hiring of a
DBA do not provide an adequate return on investment (ROI). Many advanced features of a
DBMS, including concurrency control, recovery, integrity and security, are not relevant to such systems.
A data dictionary (DD) serves as a repository for metadata (i.e. data about data). It is stored
within the database and is accessible to the DBMS online. Information stored in the DD includes:
Descriptions of base tables (Table names, attribute names and their data types
and sizes etc.)
Integrity constraints (e.g. Entity integrity, referential integrity etc.)
Security constraints (declarations of authorized users, their passwords and access
privileges etc.)
Descriptions of views
Information about indexes, clusters etc.
Database statistics (e.g. number of tuples in each table)
An Example of SQL code that accesses Meta data:
Select *
From User_Sys_Privs
This will retrieve all the system privileges granted to user ‘TOM’.
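The same idea can be tried out without an Oracle installation. The sketch below is only an illustration: it assumes SQLite (through Python's sqlite3 module) as the DBMS, where the catalog table sqlite_master plays the role of the data dictionary. The table, view and index names are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("CREATE VIEW emp_names AS SELECT empid, name FROM employee")
conn.execute("CREATE INDEX idx_emp_name ON employee (name)")

# Metadata is queried with ordinary SQL, just like user data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Column descriptions (name and declared type) of a base table.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]
```

Here catalog lists the base table, the view and the index, and columns holds the attribute names with their data types, which corresponds to the first, fourth and fifth items in the list above.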
DATABASE ADMINISTRATOR
NOV 2020
RELATIONAL MODEL
The relational model of data representation was based on set theory and Boolean
algebra. It uses the concept of a relation to represent a file. A relation is a two
dimensional table in which rows represent records and columns represent fields. It is
equivalent to a set in mathematics. Data are stored in relations.
Rows are called tuples while the columns are called attributes.
The intersection of a row and a column forms a cell. A cell can contain only an atomic
value.
Each relation/table has a primary key that can be used to identify rows/records
uniquely. Joining two tables is facilitated via foreign keys.
1. Entity integrity – This is concerned with primary keys. This rule says that the
column or group of columns chosen as the primary key should be unique and
not null.
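2. Referential integrity – This rule says that mismatching foreign keys are not
permitted into the system (i.e. every foreign key value must either match an existing
primary key value in the related table or be null).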
3. Domain integrity – This is concerned with the validity of values within
columns. The values entered into a column of a table should be drawn from
the correct domain.
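The three rules can be seen in action with a small experiment. This is a minimal sketch, assuming SQLite via Python's sqlite3 module as the DBMS; the course and student tables and their rows are hypothetical. Note that SQLite only checks foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE course (courseno TEXT PRIMARY KEY NOT NULL, title TEXT)")
conn.execute("""
    CREATE TABLE student (
        stdno    INTEGER PRIMARY KEY NOT NULL,             -- entity integrity
        name     TEXT,
        marks    INTEGER CHECK (marks BETWEEN 0 AND 100),  -- domain integrity
        courseno TEXT REFERENCES course (courseno)         -- referential integrity
    )
""")
conn.execute("INSERT INTO course VALUES ('C1', 'BCS')")
conn.execute("INSERT INTO student VALUES (125, 'Roy', 75, 'C1')")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

entity_violation = is_rejected("INSERT INTO student VALUES (125, 'Jane', 60, 'C1')")
referential_violation = is_rejected("INSERT INTO student VALUES (127, 'Jane', 60, 'Z9')")
domain_violation = is_rejected("INSERT INTO student VALUES (128, 'Ben', 150, 'C1')")
```

Each of the three inserts is refused: the first repeats a primary key value, the second uses a foreign key with no matching course, and the third uses a mark outside the domain 0 to 100.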
Terminology:
Domain – A pool of values, from which the actual values appearing in a column of a
table are drawn. All the values of a given domain belong to the same data type. e.g.
Domain of marks – Type: Integer; Values: numbers from 0 to 100
Foreign key – A non-key attribute of one table which draws its values from the primary
key domain of some related table. In a relational database, foreign keys are used to join
related tables together.
Composite key – A group of two or more attributes that can be used to identify a record
uniquely (i.e. a primary key consisting of two or more fields)
Candidate key – An attribute or group of attributes that could serve as the primary key.
Alternate key – Once we choose the primary key from a set of candidate keys, the other
candidate keys become alternate keys.
In the Employee table, if we choose EmployeeID as the primary key then NICNumber
and email will become alternate keys.
Degree – Number of columns in a table. e.g. in the above employee table the degree is
4 because it has 4 columns.
Cardinality – Number of rows in a table. e.g. in the above employee table the
cardinality is 3 because it has 3 rows/tuples.
In ER Diagrams - Degree means how many Entity Types are involved in a relationship.
This can be UNARY, BINARY and TERNARY.
In Relational Model - Degree means the number of COLUMNS in a table. The degree of
the student table below is 3 (i.e. 3 columns/attributes)
STUDENTS
125 Roy 13
127 Jane 16
128 Ben 13
129 Tina 17
In relational model, Cardinality means the number of rows in a table. In the above
table, cardinality is 4 (i.e. 4 rows/tuples)
RELATIONAL ALGEBRA
A set of 8 operators for querying a relational database. In algebra, we formulate a query
by writing a series of instructions. Each instruction produces an output which is taken as
the input to the next instruction. This feature is called the property of “Closure”.
PROJECT π
This takes one relation as the input and produces another relation as the output, which contains a
subset of columns from the input relation. It allows us to partition a table vertically and select certain
columns.
Employee
T1
Brown 5000
SELECT / RESTRICT σ
This takes one relation as the input and produces another relation as the output, which
is a subset of rows from the input relation. It allows us to partition a table horizontally
and select certain rows, which satisfies a condition.
T1 ← σ salary>8000 (Employee)
Select *
From Employee
Where salary>8000;
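PROJECT and SELECT, and the closure property that lets one feed the other, can be sketched in a few lines of Python. This is only an illustration: the Employee rows and the function names project and select are hypothetical, and a relation is modelled as a list of dictionaries.

```python
# Hypothetical Employee relation; names and salaries are illustrative only.
employee = [
    {"empno": 1, "name": "Brown", "salary": 5000},
    {"empno": 2, "name": "Jane", "salary": 9000},
    {"empno": 3, "name": "Ajit", "salary": 12000},
]

def project(relation, attributes):
    """PROJECT: keep the named columns and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result

def select(relation, predicate):
    """SELECT/RESTRICT: keep the rows that satisfy the condition."""
    return [row for row in relation if predicate(row)]

t1 = select(employee, lambda r: r["salary"] > 8000)   # horizontal partition
t2 = project(t1, ["name", "salary"])                  # closure: t1 feeds the next operator
```

Because every operator returns a relation, the output t1 can be passed straight into project, which is exactly what the closure property promises.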
JOIN ⋈
This is used to combine related tuples from two relations into single tuples over a
common attribute. Joins are performed between a foreign key and a matching primary
key. Where the foreign key value of one table matches with a primary key value of
another table, the two corresponding tuples are joined.
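A join over a foreign key can be sketched as a pair of nested loops. This is a minimal sketch, not how a DBMS actually executes joins: the Department rows follow the example in these notes, while the Employee rows and the function name join are hypothetical.

```python
# Department rows follow the notes; the Employee rows are hypothetical.
department = [
    {"deptno": "A1", "depname": "IT"},
    {"deptno": "A2", "depname": "Accounts"},
]
employee = [
    {"empno": 1, "name": "Brown", "deptno": "A1"},
    {"empno": 2, "name": "Jane", "deptno": "A2"},
    {"empno": 3, "name": "Ajit", "deptno": "A1"},
]

def join(left, right, attribute):
    """Equi-join: pair every left tuple with the right tuples whose
    value for the common attribute matches."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if l[attribute] == r[attribute]
    ]

t1 = join(employee, department, "deptno")
```

Each employee tuple is combined with the department tuple whose primary key equals the employee's foreign key value, so t1 has one row per employee.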
Employee
Department
DeptNo DepName
A1 IT
A2 Accounts
JOIN RESULT
T1
Select *
UNION ∪
It takes two union compatible relations (i.e. The two input tables must have the same number of
columns and each corresponding pair of columns must have the same domain) as the input and
produces a result consisting of all tuples appearing in either or both original relations. Duplicate tuples
are eliminated from the result.
UNION COMPATIBLE TABLES
Two tables with the same number of columns where each corresponding pair of
columns have the same domain.

ADMINISTRATOR
StaffNo  Name
001      Brown
002      Jane
003      Ajit

LECTURER
StaffNo  Name
004      Lee
002      Jane

T1 ← ADMINISTRATOR ∪ LECTURER
StaffNo  Name
001      Brown
002      Jane
003      Ajit
004      Lee

SQL implementation
(Select * From Administrator)
UNION
(Select * From Lecturer);
CARTESIAN PRODUCT ×
It takes two tables as the input and produces a result, which contains all possible
combinations of rows from the input tables. The two input tables need not be union
compatible.
EMP
EMPID  NAME  SALARY
111    Tom   5000
222    Sam   6000

T3 ← EMP × DEP

The SQL clause ‘FROM’ corresponds to Cartesian Product.
Select *
From Emp, Dep;
INTERSECT
This takes two union compatible relations as the input and produces tuples that
are common to both relations as the output.
T1 ← ADMINISTRATOR ∩ LECTURER
T1
StaffNo  Name
002      Jane
SQL implementation
(Select * From Administrator)
INTERSECT
(Select * From Lecturer);
DIFFERENCE/MINUS -
This takes two union compatible relations as the input and produces tuples that are found in the first
one but not in the second one as the output. (i.e. When denoted by R MINUS S, the result is all tuples in
R that are not in S).
T1 ← ADMINISTRATOR – LECTURER
(Select * From Administrator)
MINUS
(Select * From Lecturer);
T1
StaffNo  Name
001      Brown
003      Ajit
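Since relations are sets of tuples, UNION, INTERSECT and DIFFERENCE map directly onto Python's built-in set operators. The sketch below uses the ADMINISTRATOR and LECTURER rows from these notes; modelling each tuple as a (StaffNo, Name) pair is an assumption made for illustration.

```python
# Tuples are (StaffNo, Name); both relations are union compatible.
administrator = {("001", "Brown"), ("002", "Jane"), ("003", "Ajit")}
lecturer = {("004", "Lee"), ("002", "Jane")}

union = administrator | lecturer          # duplicates such as 002 Jane appear once
intersection = administrator & lecturer   # tuples common to both relations
difference = administrator - lecturer     # in ADMINISTRATOR but not in LECTURER
```

Note that difference is not symmetric: lecturer - administrator would give only 004 Lee.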
DIVIDE ÷
This takes one binary table (Table with two columns - X,Y) and one unary table (Table
with one column - X) as the input. The unary table’s column is also present in the binary
table. We divide the binary table by the unary table to produce a result that contains all
Y column values which are common to the whole set of X column values in the unary
table.
Find the suppliers who supply ALL the products

T1
PRODUCT  SUPPLIER
TV       Brown
DVD      Sam
USB      Brown
TV       Ben
USB      Ben
DVD      Brown
TV       Sam

T2
PRODUCT
TV
DVD
USB

T3 ← T1 ÷ T2

T3
SUPPLIER
Brown

SQL does not support DIVIDE operation
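Because SQL has no DIVIDE operator, it helps to see the operation spelled out. The following is a minimal sketch in Python using the T1 and T2 data from the example; the function name divide is hypothetical, and T1 is modelled as a set of (PRODUCT, SUPPLIER) pairs.

```python
# T1(PRODUCT, SUPPLIER) and T2(PRODUCT) as they appear in the notes.
t1 = {
    ("TV", "Brown"), ("DVD", "Sam"), ("USB", "Brown"), ("TV", "Ben"),
    ("USB", "Ben"), ("DVD", "Brown"), ("TV", "Sam"),
}
t2 = {"TV", "DVD", "USB"}

def divide(binary, unary):
    """Return the Y values that are paired with every X value in `unary`."""
    candidates = {y for _, y in binary}
    return {y for y in candidates if all((x, y) in binary for x in unary)}

t3 = divide(t1, t2)
```

Only Brown is paired with TV, DVD and USB, so t3 contains Brown alone. Sam lacks USB and Ben lacks DVD.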
ALGEBRA EXERCISES
T1 ← σ gender="M" (Customer)
T2 ← T1 ⋈ cno=cno Orders
T3 ← π orderno, amount (T2)
We could also write it as follows
Alternative answer
T1 ← π cno (Customers)
T2 ← π cno (Orders)
T3 ← T1 – T2
Entity Type – The term entity type may refer to a living thing such as a person or
animal, an object, an organization or an event which is important to the organization.
The organization wishes to store data about it. e.g. Student, Customer, Course, Product
Student
Entity instance – A single occurrence of an entity type. (e.g. BCS is an entity instance of
Course, Ann Davis is an instance of Student). Sometimes the word “entity” is used to
refer to an entity instance.
Attributes – The properties of an entity are known as attributes (e.g. name and salary
are attributes of Employee entity)
[Diagram: entity type with the attributes Salary, Qualification, Address and Age]
Simple attribute – An attribute that cannot be broken down into components (e.g.
salary in the diagram above)
Composite attribute – An attribute that can be broken down into sub parts or
components (e.g. address)
Multi value attribute – an attribute that may store several values for a single entity
occurrence/instance. It is shown by a double line ellipse (e.g. qualification). Later in the
design, each multi value attribute maps to a separate table in the database.
Single value attribute - an attribute that stores a single value for a single entity
occurrence/instance. (e.g. salary, name, date of birth)
Derived attribute - an attribute that is computed dynamically from one or more stored
attributes (e.g. Age is a derived attribute because it can be computed from date of
birth). A derived attribute has a dashed line ellipse.
Relationship – An association among entity types. Also called relationship type. (e.g. Student
follows Course). A relationship involves a property called Cardinality/Multiplicity. This is
concerned with the number of instances of one entity type which is related to a single instance
of another entity type. Based on cardinality, we can identify three types of relationships:
One-to-many (1:N): Customer Places Order
One-to-one (1:1): Country Ruled by President
Many-to-many (M:N): Student Follows Course
Participation
This refers to the extent to which an entity type participates in a relationship. This can
be partial (i.e. optional) or total (i.e. mandatory).
Total – Every entity instance must participate in the relationship. Also called mandatory
participation. Total participation is shown by a double line connecting the relevant
entity type to the relationship.
Partial - It is not necessary for all instances to participate in the relationship. There can
be some entity instances that do not participate in the relationship. Also called optional
participation. Partial participation is shown by a single line connecting the relevant
entity type to the relationship.
Degree of relationship
[Diagram: unary relationship, EMPLOYEE Manages EMPLOYEE (1:N)]
Binary – A relationship between two entity types. This is the most common type of
relationship.
[Diagram: binary relationship, Student Follows Course (M:N)]
A strong entity type (also called regular entity type) has its own identifier or primary
key. Its existence does not depend on another entity type. That is, it can exist
independently.
A weak entity type cannot exist independently. Its existence depends on a strong
entity type or owner. A weak entity type has total participation in the relationship with
its owner. A weak entity type does not have its own primary key. It has a partial key
which can uniquely identify weak entities related to the same owner entity.
In the example below, “Dependant” (e.g. wife, child) is a weak entity type. Its partial key
would be first name, assuming two persons do not have the same name within the
family. At a later design stage, this partial key would be combined with the owner’s
primary key to form a full primary key. So the primary key of Dependant would become
EmpNo+FirstName.
Associative entity type – An entity type that associates the instances of one or more entity
types and contains attributes that are peculiar to the relationship between those entity instances.
EXERCISE:
Patient
Appointment
Doctor
Ward
Each entity type becomes a table. Attributes of the entity become columns of the
table. Identification attribute of the entity becomes the primary key.
Each multi value attribute is mapped to a separate table. The primary key of the original
entity will be taken to the new table as foreign key.
For each one-to-many relationship, take the primary key on the “one” end to the
table representing the “many” end as the foreign key.
If the relationship is an associative entity type, then its attributes will be taken to the
link table.
We can combine both entity types into a single table. The identification attribute of
either entity type could be chosen as the primary key of the new table.
In this example, every vehicle relates to a single manager who can have only one vehicle, but
there can be managers without vehicles. Create two tables and take the primary key of the
optional end (i.e. Manager) to mandatory end (i.e. Vehicle) as the foreign key. This will avoid
the need for storing null values for foreign keys.
Assume that there can be drivers without a bus currently assigned and there can be buses
without a driver assigned. Create two tables and take the primary key of either end to
the other end as the foreign key.
Weak entity type has a partial identifier (firstname in this example). Forming a primary
key is done by combining the partial id with the owner’s primary key.
Create a single table and form a foreign key whose values are drawn from the
identification attribute/primary key of the same table
Person
P1 Brown 34 M P2
P2 Jane 27 F P1
P3 Ted 25 M P4
P4 Tina 22 F P3
Create a single table and form a foreign key whose values are drawn from the
identification attribute/primary key of the same table (same procedure as the one used
for unary 1:1)
Employee
E1 John 5000
E2 Sam 3000 E1
E3 Jane 2000 E1
E4 Kate 6000
E5 Ben 3000 E4
A unary many-to-many relationship is mapped by creating two tables: one for the original
entity and another to serve as a link entity. The link table should contain two foreign keys, both
drawing values from the primary key domain of the original entity.
ResearchArticle
Cites
ReferencingID ReferredToID
A1 A2
A1 A4
A3 A1
A3 A2
Ternary
Create four tables – three for the original entity types and the other for linking the
different entity instances of the original three. The new link table has three foreign keys
whose values are drawn from the primary key domains of the original entity types.
Relational Schema
Customer (cno, name, address, contactno)
Order (orderno, date, orderTotal, cno*)
OrderLine (orderno*, productno* , orderQuantity)
Product (pno, pname, price, QuantityOnHand)
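The relational schema above can be turned into tables almost line by line. The sketch below is one possible translation, assuming SQLite via Python's sqlite3 module; the column data types are assumptions, and the Order table is named orders because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        cno TEXT PRIMARY KEY, name TEXT, address TEXT, contactno TEXT
    );
    CREATE TABLE product (
        pno TEXT PRIMARY KEY, pname TEXT, price REAL, quantityonhand INTEGER
    );
    -- ORDER is a reserved word in SQL, so the table is named orders here.
    CREATE TABLE orders (
        orderno TEXT PRIMARY KEY, date TEXT, ordertotal REAL,
        cno TEXT REFERENCES customer (cno)        -- 1:N, key of the "one" end
    );
    CREATE TABLE orderline (
        orderno TEXT REFERENCES orders (orderno),
        productno TEXT REFERENCES product (pno),
        orderquantity INTEGER,
        PRIMARY KEY (orderno, productno)          -- composite key of the link table
    );
""")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

The attributes marked with * in the schema become REFERENCES clauses, and OrderLine, which resolves the many-to-many relationship between Order and Product, gets a composite primary key made of its two foreign keys.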
ORDER
ORDERLINE
PRODUCT
INSTANCE DIAGRAMS
This shows how entity instances connect at run time depending on their cardinality.
001
C1
002
C2
003
C3
NORMALIZATION
Normalization is a way of grouping data in relational database design that will eliminate
data redundancies. That in turn removes data anomalies. The difficulties of inserting,
updating and deleting data are known as data anomalies.
Update anomaly – a modification to a single data item will require looking for multiple
occurrences of the same data item.
Insert anomaly – Inserting data about one thing depends on data about another thing.
Delete anomaly – Deleting unwanted data will also result in the deletion of useful data.
A B
Product No Price
EMPLOYEE INFORMATION
ENO ENAME SALARY DNO DNAME DLOCATION
(INSERT ANOMALY)
Deleting an employee (e.g. Kate) also deletes information about department (DELETE
ANOMALY)
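The delete anomaly, and how splitting the table removes it, can be imitated with plain Python data. This is only a sketch: the rows below are hypothetical stand-ins for the EMPLOYEE INFORMATION table, with ENO as the key.

```python
# Hypothetical rows for the EMPLOYEE INFORMATION table (ENO is the key).
rows = [
    {"eno": "E1", "ename": "John", "salary": 5000,
     "dno": "A1", "dname": "Admin", "dlocation": "Colombo"},
    {"eno": "E2", "ename": "Sam", "salary": 3000,
     "dno": "A1", "dname": "Admin", "dlocation": "Colombo"},
    {"eno": "E3", "ename": "Kate", "salary": 6000,
     "dno": "A2", "dname": "Factory", "dlocation": "Galle"},
]

# Delete anomaly in the single table: removing Kate also removes department A2.
flat_after_delete = [r for r in rows if r["ename"] != "Kate"]
a2_lost_in_flat = all(r["dno"] != "A2" for r in flat_after_delete)

# Decompose into EMPLOYEE and DEPARTMENT so each department is stored once.
employee = [{k: r[k] for k in ("eno", "ename", "salary", "dno")} for r in rows]
department = {r["dno"]: (r["dname"], r["dlocation"]) for r in rows}

# The same delete now touches EMPLOYEE only; DEPARTMENT still holds A2.
employee_after_delete = [e for e in employee if e["ename"] != "Kate"]
a2_kept_after_split = "A2" in department
```

In the single table the department details for A1 are stored twice, which is the redundancy behind the update anomaly. After the split they are stored once in department.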
Student
001 Sam IT
003 Tiran IT
The primary key of the student table is Std#. All the attributes of the Student table are
fully functionally dependent on the primary key. Therefore, the student table is in third
normal form (3NF).
Subject-Lecturer-Grade 1NF
Grade depends fully on the whole key, but other attributes like Title etc.
depend only on Subject#. This is called a partial dependency. As a result, there are
anomalies (e.g. updating the address of a lecturer such as Ann requires changing several
rows; a new subject cannot be inserted without at least one student).
StudGrade – 3NF
001 111 A
001 222 C
001 333 B
002 444 A
002 222 C
SubjectLecturer – 2NF
The primary key of the SubjectLecturer table is Subject#. Subject title functionally
depends on the key, but other attributes like lecturer name and address depend on
Lec#. This is called a non-key/transitive/hidden dependency. As a result there will be
anomalies (e.g. changing Ann’s address requires updating several rows; if ICT is deleted,
John is deleted too; a new lecturer cannot be inserted unless we give him a subject).
Subject – 3NF
Lecturer – 3NF
EMPLOYEE INFORMATION
ENO ENAME SALARY DNO DNAME DLOCATION
… … … … … ..
ENO
2NF
EMPLOYEE
… … … …
DEPARTMENT
A1 Admin Colombo
A2 Factory Galle
A2 Sales Kelaniya
… … ..
Conceptual level - The conceptual level has conceptual schema that describes the
structure of the whole database for a community of users. It describes the entity types,
their relationships and the integrity constraints of the entire database. The conceptual
schema does not include the details of physical storage structures and access
mechanisms.
External level – The external level includes a number of external schemas or user views.
Each external schema describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group. Application
programs and user application interfaces such as database forms/web forms work on
this level (i.e. they access the database via the external schemas or views).
External Level
Conceptual Level
Employee Table
Internal level
Na bytes 20
Add bytes 30
Mappings
The correspondence from one level of the three-schema architecture to another level is
called a mapping. The conceptual/internal mapping defines the correspondence
between the conceptual level and the stored database. The external/conceptual
mapping defines the correspondence between the conceptual level and the user views.
Data independence
The ability to change the schema at one level of a database system without affecting
the schema at the next higher level is known as data independence.
Database vendors adopted the three-schema terminology, but they have implemented
it in incompatible ways. There was no single standard. Various groups attempted to
define their own standards for the conceptual schema and physical schema. However,
external schema (SQL Views) has one standard among most vendors. As a result,
achieving Logical Data Independence is fairly uniform across many RDBMS products
such as Oracle and MS SQL Server.
TRANSACTION MANAGEMENT
A transaction is a collection of operations that forms a single logical unit of work (e.g.
transferring a sum of 1000 from account A to account B).
A transaction takes the database from one consistent state to another. To ensure the
integrity of the database, the DBMS must maintain the following properties of a
transaction known as ACID (Atomicity, Consistency, Isolation and Durability).
1. Atomicity
When executing a transaction, the DBMS must ensure that either the entire transaction
is performed or none of it is performed (All or nothing). Atomicity is ensured by the
operations COMMIT and ROLLBACK.
COMMIT
This signals the successful end of a transaction. It tells the transaction manager that a
logical unit of work has been successfully completed and that all updates made by the
transaction should be made permanent.
ROLLBACK
This signals the unsuccessful end of a transaction. It tells the transaction manager that
something has gone wrong and all updates made so far by that transaction should be
undone/removed.
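COMMIT and ROLLBACK can be tried out on the funds transfer example. The sketch below assumes SQLite via Python's sqlite3 module; the account table, the balances and the function names transfer and balances are hypothetical. A CHECK constraint is used to make the second transfer fail halfway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (accno TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1500), ("B", 200)])
conn.commit()

def transfer(source, target, amount):
    """Move `amount` between accounts as one all-or-nothing unit of work."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE accno = ?",
                     (amount, target))
        conn.execute("UPDATE account SET balance = balance - ? WHERE accno = ?",
                     (amount, source))
        conn.commit()      # successful end: make both updates permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # unsuccessful end: undo the partial update
        return False

def balances():
    return dict(conn.execute("SELECT accno, balance FROM account"))

first = transfer("A", "B", 1000)    # succeeds
second = transfer("A", "B", 1000)   # would overdraw A, so it is rolled back
```

In the second transfer the credit to B has already been applied when the debit from A is refused. ROLLBACK removes that credit, so the database is left as it was after the first transfer.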
2. Consistency
All transactions must preserve the consistency and the integrity of the database.
Transaction must not leave the database in an inconsistent state. This is enforced by
data integrity constraints (Entity, Referential, Domain and Additional integrity).
3. Isolation
Concurrent transactions must not interfere with each other. Any given transaction’s
updates should be concealed from other transactions until that transaction commits.
Isolation is enforced by concurrency control methods such as locking.
4. Durability
Once a transaction completes successfully, all updates carried out by that transaction on
the database must persist, even in case of a software or a hardware failure. The changes
made by the transaction must be preserved. This is supported by backup and recovery
methods.
Concurrency
Allowing multiple transactions to access the same database at the same time is
called concurrent processing. Concurrent processing is a desirable feature of a
DBMS as it improves data utilization.
Despite its many benefits, concurrent processing has its own problems which, IF
NOT CONTROLLED, would lead to an inconsistent database.
Concurrency Problems
Lost Update - Two concurrent transactions read a record to update it, and the first one
to write the record loses its update, when the second one completes (Second transaction
overwrites the update made by the first one).
Time  Transaction A          Transaction B          Record1 (Disk)
0                                                   Seats = 10
1     Read Rec1 [Seats=10]                          Seats = 10
2     Seats = Seats + 3                             Seats = 10
3                            Read Rec1 [Seats=10]   Seats = 10
4     Write Rec1, Commit                            Seats = 13
5                            Seats = Seats + 5      Seats = 13
6     
                       Write Rec1, Commit     Seats = 15
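The timeline above can be replayed step by step in Python. This is a plain simulation with no real DBMS involved; the variable names are hypothetical and each transaction's private workspace is modelled as a local variable.

```python
disk = {"seats": 10}          # Record1 on disk

# Time 1-3: both transactions read the record before either writes.
a_local = disk["seats"]       # Transaction A reads 10
a_local += 3                  # A computes 13 in its own workspace
b_local = disk["seats"]       # Transaction B also reads 10

# Time 4-6: A writes first, then B overwrites A's value.
disk["seats"] = a_local       # A writes 13 and commits
b_local += 5                  # B computes 15 from its stale copy
disk["seats"] = b_local       # B writes 15 and commits

final_interleaved = disk["seats"]
final_serial = 10 + 3 + 5     # what running A and then B would give
```

The interleaved run ends with 15 seats although a serial run would end with 18. The 3 seats added by Transaction A are lost.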
Uncommitted Dependency (Dirty read) - One transaction B reads a value that has
been changed by an as yet incomplete transaction A. If transaction A rolls back, then
transaction B is using a data value which is no longer valid.
Unrepeatable read - A transaction T1 reads a data item twice and the item is changed
by another transaction T2 between the two reads. Hence the reading transaction T1
receives two different values of the same data item within the same transaction.
Locking
A transaction can lock a data granule such as a record or table so that other
transactions will not be able to access it. The transaction can retain this lock until
COMMIT so that other transactions will not be able to interfere with it. Locking can
prevent concurrency problems such as Lost update, Dirty read, Unrepeatable read etc.
Types of Locks:
Shared Lock (Read Lock) : This gives read-only access to a data unit such as a record and
prevents any transaction from updating the data. Any number of transactions can hold a
shared lock on a data unit at a given time.
Exclusive Lock (Write Lock) : The transaction imposing the write lock can have both
read and write access to the data. Other transactions cannot read from or write to
it. A data unit can have only one write lock at a given time.
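The compatibility rules for the two lock types can be captured in a very small lock manager. This is a simplified sketch: the class name LockManager is hypothetical, a refused request simply returns False instead of making the transaction wait, and lock upgrades are not handled.

```python
class LockManager:
    """Grants shared (S) and exclusive (X) locks on named data items."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of transaction ids)

    def acquire(self, txn, item, mode):
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)          # any number of readers may share
            return True
        return False                  # every other combination conflicts

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]

lm = LockManager()
read_1 = lm.acquire("T1", "rec1", "S")      # granted
read_2 = lm.acquire("T2", "rec1", "S")      # granted, shared with T1
write_3 = lm.acquire("T3", "rec1", "X")     # refused while readers hold the item
lm.release("T1", "rec1")
lm.release("T2", "rec1")
write_3_retry = lm.acquire("T3", "rec1", "X")   # granted once the item is free
read_4 = lm.acquire("T1", "rec1", "S")          # refused while T3 holds the write lock
```

A real DBMS would queue the refused transactions and grant the lock when it becomes available, which is where problems such as deadlock come from.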
T1 T2
Write (X)
Read (x)
Write (A)
Read (A)
Write (A)
SERIAL SCHEDULE
A serial schedule is a schedule where all operations of each transaction are
executed consecutively without any interleaved operation from other
transactions.
T1 T2
Read (x)
Write (x)
Read(x)
Write (x)
Read (x)
x = x + 10
write (x)
read (y)
y = y +2
write (y)
read (x)
x = x +5
write (x)
read (y)
y = y +1
write (y)
T1 T2
Read (x)
x = x + 10
write (x)
read (x)
x = x +5
write (x)
Read (y)
y=y+2
write (y)
read (y)
y=y+1
write (y)
Read (x)
x = x + 10
read (x)
x = x +5
write (x)
read (y)
Write (x)
read (y)
y=y+2
write (y)
y=y+1
write (y)
SERIALIZABILITY
To ensure that concurrent execution of transactions does not take the database to an
incorrect state, the system must enforce Serializability. This concept says that a non-
serial schedule must have the same effect as a serial schedule.
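Whether a non-serial schedule has the same effect as a serial one can be checked by simply running both. The sketch below is an illustration in Python using the operations from the schedules in these notes (T1 adds 10 to x and 2 to y, T2 adds 5 to x and 1 to y). The starting values x=100 and y=50 and the function names are assumptions.

```python
def run(schedule, x=100, y=50):
    """Execute (transaction, operation, item) steps on a shared database.
    A read copies the item into the transaction's workspace and a write
    stores the workspace value plus that transaction's increment."""
    db = {"x": x, "y": y}
    increments = {("T1", "x"): 10, ("T1", "y"): 2, ("T2", "x"): 5, ("T2", "y"): 1}
    workspace = {}
    for txn, op, item in schedule:
        if op == "read":
            workspace[(txn, item)] = db[item]
        else:
            db[item] = workspace[(txn, item)] + increments[(txn, item)]
    return db

def steps(txn, item):
    return [(txn, "read", item), (txn, "write", item)]

serial = steps("T1", "x") + steps("T1", "y") + steps("T2", "x") + steps("T2", "y")
interleaved = steps("T1", "x") + steps("T2", "x") + steps("T1", "y") + steps("T2", "y")
conflicting = [
    ("T1", "read", "x"), ("T2", "read", "x"),
    ("T1", "write", "x"), ("T2", "write", "x"),
] + steps("T1", "y") + steps("T2", "y")

serial_result = run(serial)
interleaved_result = run(interleaved)
conflicting_result = run(conflicting)
```

The interleaved schedule gives the same final values as the serial one, so it is serializable. The conflicting schedule lets T2 read x before T1 has written it, so T1's update to x is lost and the result differs from the serial schedule.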
DATABASE RECOVERY
Recovery facilities:
2. Transaction Log
A transaction log records essential details of each transaction. Data that are recorded
for each transaction include:
A single transaction can have several log records depending on the number of data
items accessed. The log allows two types of recovery:
Backward recovery - When a transaction fails halfway through, it should be undone (i.e.
rollback). This is done by using before images of changed data items.
Forward recovery – This is used to reapply changes (i.e. redo) to the database. This is
done by using after images of changed data items.
3. Checkpoints
Periodically (e.g. every 15 minutes) the DBMS takes a checkpoint where it ensures the
consistency of the database. At the checkpoint, any committed transactions still waiting
in the buffer are force written to disk and a special checkpoint record is written to the
transaction log. This checkpoint record indicates the time of the check point and a list of
transactions currently in progress. Checkpoints make recovering from a failure easier.
How to recover: We must take the most recent back up copy of the database and copy
it to a new disk. We need to bring the database to a state immediately before it got lost.
This is done by REDOING the transactions that have occurred since the last back up.
Redoing involves applying after images of changed data items from the log to the
database. This is called FORWARD RECOVERY.
2. Transaction failure
A single transaction aborts due to an exception/run time error.
3. System failure
All the transactions currently in progress in the database server are lost. e.g. power
failure. There are 2 recovery methods for system failure:
Modifications to data items are written to disk as they occur, without waiting for the
transaction to reach its COMMIT point. In case of a failure, all the transactions that
were in progress at the time of failure must be rolled back/undone. Since a
transaction is allowed to commit before all its changes are written to the actual
database (they are fully recorded in the log under the write-ahead logging (WAL) protocol),
there may be a need to REDO committed transactions in case of a failure. So the transactions
that were committed after the last checkpoint are redone. This is called the UNDO/REDO
recovery algorithm.
T1 , T3 UNDO/ROLLBACK T2 , T4 REDO
T5 DO NOTHING
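The UNDO/REDO idea can be sketched with a log of before and after images. This is a simplified illustration in Python, not a real recovery manager: the log records, the data items x, y and z, and the function name recover are all hypothetical, and checkpoints are ignored.

```python
# Each log record: (transaction, item, before image, after image).
log = [
    ("T1", "x", 10, 20),
    ("T2", "y", 5, 8),
    ("T1", "z", 7, 9),
]
committed = {"T2"}          # T1 was still in progress when the system failed

def recover(database, log, committed):
    """UNDO/REDO recovery over a copy of the database state found on disk."""
    db = dict(database)
    for txn, item, before, after in reversed(log):   # backward recovery
        if txn not in committed:
            db[item] = before
    for txn, item, before, after in log:             # forward recovery
        if txn in committed:
            db[item] = after
    return db

# State on disk at the crash: T1's change to x reached disk, T2's change to y did not.
on_disk = {"x": 20, "y": 5, "z": 7}
recovered = recover(on_disk, log, committed)
```

The uncommitted T1 is undone with before images (x goes back to 10) and the committed T2 is redone with after images (y becomes 8).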
Defer or postpone any actual updates to the database until the transaction completes
its execution successfully and reaches its commit point.
After the transaction reaches its commit point the updates are recorded in the actual
database.
If a transaction fails before reaching its commit point, there is no need to UNDO any
operation. However, it may be necessary to REDO a committed transaction from the
log because its effect may not have been recorded in the actual database. Hence,
deferred update is known as NO_UNDO/REDO recovery algorithm.
Gather client’s data requirements using methods like interviewing, questionnaire etc.
Conceptual Design
This is a detailed conceptual model of all entity types, relationships and constraints,
free of any technology or database model. This is expressed as an ER diagram.
Logical design
This maps the conceptual model (i.e. ER) to the logical schema of the target database
model. This is independent of any technology (DBMS, platform etc.) but dependent on a
particular data model. It is a concise description of the data requirements. If the target
model is relational then the result is a fully normalized relational schema in bracketing
notation.
Physical design
The result of logical design is converted to data structures of the target database model
using the target DBMS. The result is a physical schema which can be used to create the
physical database. This is technology dependent. The physical schema is created using SQL
DDL.
capacity int,
….. );
fullname varchar(20),
age int,
);
DATABASE SECURITY
Authentication
Authentication is concerned with the positive identification of database users. It
ensures that only authorized users with a login account and a valid password are
allowed access to the database. To enroll users into the DBMS, the DBA allocates a
unique user name, a password and a predefined profile. Certain initial privileges such as
any server roles, what databases are allowed to access, disk space quota etc. are
assigned to the profile.
Once the users are logged into the database, they should be given appropriate database
privileges. For a given user, these privileges define the types of action he/she can take
against the database.
Types of privileges include SELECT, INSERT, UPDATE and DELETE. Some users might be
given read only access while others may be given the full set of privileges. Access
privileges can be defined for individual users as well as for user roles.
GRANT SELECT, UPDATE ON Employee TO Jane;
The above statement issues read and update rights on the employee table to Jane.
GRANT ALL ON Employee TO Jane;
The above statement issues ALL privileges (insert, update, delete, select) on the employee
table to Jane.
Roles - We can group users according to their roles. For example, in a bank, the roles
could include Teller, Manager and System Administrator. Once we classify the users into
roles, we can then issue access privileges to roles rather than individual users.
Views
Confidentiality of data can be enforced via SQL Views. Views can be used to hide
sensitive/confidential data from inappropriate staff. Views ensure that a given user
sees only what is relevant to him/her. Views are used by the DBA to control access to
the database.
Example: Suppose there is an employee table with empid, name, address, contactno and
salary. The reception staff need to know only the contact numbers. So we create a view that has
only employee names and contact numbers. Confidential data like salary are hidden from that user
group.
Security can be further enhanced by providing Access Privileges to Views via GRANT
statement of SQL.
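The reception example can be sketched as follows, assuming SQLite via Python's sqlite3 module. The view name reception_contacts and the sample row are hypothetical. SQLite has no GRANT statement, so the last step (granting SELECT on the view to the reception role) would be done in a multi-user DBMS and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(empid INTEGER PRIMARY KEY, name TEXT, address TEXT, contactno TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Jane', 'Colombo', '0112345678', 90000)")

# The view exposes only the non-confidential columns.
conn.execute("CREATE VIEW reception_contacts AS SELECT name, contactno FROM employee")

cursor = conn.execute("SELECT * FROM reception_contacts")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
```

Even with Select *, a user of the view sees only name and contactno. Address and salary stay hidden because they are not part of the view definition.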
NOTE: GRANT clause can only provide column level or table level security (i.e. it can
only restrict access to columns or tables). It cannot provide row level security. To restrict
access to rows, views can be used. Grants can then be defined on views to define who
can access them.
Data Encryption
This involves scrambling data so that they cannot be understood without the proper
decryption key. Sensitive information that is typically encrypted includes credit card numbers,
sensitive personal information (e.g. health details) and user passwords. The decryption
key should be given only to authorized users.
OCTOBER 2021
CLIENT/SERVER
The client/server model divides duties between client machines and a server. The
server is a powerful machine that contains the database. Clients send queries to
the server. The server processes the queries and sends the answers to the clients.
User authentication, concurrency control and integrity checking are all done by the server.
Above advantages/reasons make Three Tier more suitable for Web applications.
CLOUD COMPUTING
The idea of Cloud is to obtain computing services such as servers, data analytics,
storage and other things over the Internet at a price (usually a monthly rental fee). This
is similar to the way we obtain services such as electricity and water. Companies
offering these computing services are called cloud providers.
1. Cost
Cloud computing eliminates the high initial cost of buying expensive hardware and
software. It also eliminates many costs related to running and maintaining an on-site
data center.
2. Expertise
Companies that provide cloud services specialize in the necessary expertise and technical
“know how”. Such expert knowledge is not easily available to small companies.
3. Performance
Cloud computing services run on powerful, fast and secure data centres. They are
regularly upgraded to latest computing hardware. This vastly reduces network latency
for business applications.
SQL
required.
BLOB (Binary Large Object) – It is used to store large amounts of binary data.
CLOB (Character Large Object) – It is used to store very large text objects.
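A small sketch of how the two large object types might be declared. The table and column names are hypothetical, and the type names vary between systems (PostgreSQL, for example, uses BYTEA and TEXT instead).

```sql
-- Hypothetical table; BLOB and CLOB type names are not available in every DBMS.
CREATE TABLE product_doc (
    doc_id      INTEGER PRIMARY KEY,
    photo       BLOB,   -- large binary data
    description CLOB    -- very large text
);
```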
EMP
From Emp
Where Gender='M';
The condition can have multiple parts – the logical operators AND, OR and NOT can be used
to construct multi-part conditions.
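A sketch of a multi-part condition, assuming the Emp table has columns named ename, salary, gender and depno (the column names and values are assumptions made for illustration).

```sql
-- Male employees who either earn more than 5000 or work in department A1,
-- excluding the employee named BROWN.
SELECT ename, salary
FROM Emp
WHERE gender = 'M'
  AND (salary > 5000 OR depno = 'A1')
  AND NOT ename = 'BROWN';
```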
2. Insert Data
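A minimal sketch of inserting a row, assuming Emp has the columns ename, gender, salary and depno (assumed names and values).

```sql
-- Naming the columns makes the statement independent of column order.
INSERT INTO Emp (ename, gender, salary, depno)
VALUES ('JOE', 'M', 5000, 'A1');
```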
3. Update Data
Update Emp
Update Emp
Update Emp
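Complete update statements might look like the following sketch. The column names and values are assumptions, not the original examples.

```sql
-- Change one employee's salary.
UPDATE Emp
SET salary = 6000
WHERE ename = 'JOE';

-- Give a 10% rise to every employee of department A1.
UPDATE Emp
SET salary = salary * 1.1
WHERE depno = 'A1';
```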
4. Delete Data
DELETE BROWN
The TRUNCATE command deletes all the records/rows, but the table structure remains. This
operation cannot be rolled back. It performs an automatic COMMIT.
Note: TRUNCATE has the same effect as DELETE FROM EMP, but the difference is that the
DELETE command can be rolled back, whereas TRUNCATE cannot.
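The difference between the two commands can be sketched as follows, assuming a session where changes are not committed automatically. The rollback behaviour of TRUNCATE described here is that of systems such as Oracle and may differ elsewhere.

```sql
-- Delete selected rows only.
DELETE FROM Emp WHERE ename = 'BROWN';

-- Delete every row; the rows return if the transaction is rolled back.
DELETE FROM Emp;
ROLLBACK;

-- Remove every row but keep the table structure.
TRUNCATE TABLE Emp;
```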
Sorting Information
Select ename, salary
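A complete sorting query might look like this sketch. The choice of sort columns is an assumption.

```sql
-- Highest salary first; employees with equal salaries are listed by name.
SELECT ename, salary
FROM Emp
ORDER BY salary DESC, ename ASC;
```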
Pattern Matching
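SQL pattern matching enables you to use “_” to match any single character and “%” to
match an arbitrary number of characters. Whether patterns are case-sensitive depends on
the DBMS and its settings (for example, they are case-insensitive by default in MySQL but
case-sensitive in Oracle). Pattern matching is done by using the LIKE or NOT LIKE
comparison operators.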
EMP
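A few pattern matching sketches against the EMP table, assuming a name column called ename (an assumed column name).

```sql
-- Names starting with J.
SELECT ename FROM Emp WHERE ename LIKE 'J%';

-- Names that are exactly three characters long.
SELECT ename FROM Emp WHERE ename LIKE '___';

-- Names whose second character is A.
SELECT ename FROM Emp WHERE ename LIKE '_A%';

-- Names that do not contain OW anywhere.
SELECT ename FROM Emp WHERE ename NOT LIKE '%OW%';
```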
Summary Information
Summaries are produced by using the “Group By” clause. It is used with one or more
aggregate functions. When a condition applies to the summarized (grouped) output, the
“where” clause cannot be used; instead a clause called HAVING is used.
PURCHASE
SUPPLIER   PRODUCT          QUANTITY
LG         AC               25
SONY       TV               10
LG         DVD PLAYER       40
SONY       DIGITAL CAMERA   15
APPLE      IPOD             7
From Purchase
Group By Supplier ;
List the number of products supplied by each supplier, from highest to lowest
From Purchase
Group By Supplier
List the total and average number of products supplied by each supplier provided
he/she supplies more than one product
From Purchase
Group By Supplier
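The summary queries on the PURCHASE table can be sketched as below. The select lists are a reconstruction and the column aliases are assumptions. “Total and average” is read here as the total and average quantity.

```sql
-- Number of products supplied by each supplier.
SELECT Supplier, COUNT(*) AS NumProducts
FROM Purchase
GROUP BY Supplier;

-- The same summary, from highest to lowest.
SELECT Supplier, COUNT(*) AS NumProducts
FROM Purchase
GROUP BY Supplier
ORDER BY COUNT(*) DESC;

-- Total and average quantity for suppliers with more than one product.
SELECT Supplier, SUM(Quantity) AS TotalQty, AVG(Quantity) AS AvgQty
FROM Purchase
GROUP BY Supplier
HAVING COUNT(*) > 1;
```

With the sample rows, LG and SONY each supply two products and pass the HAVING condition, while APPLE supplies one and is left out of the last result.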
Table Joins
EMP
DEP
A1 IT 45454
A2 Admin 56565
A3 Sales 67677
A4 R&D 22222
List the employee names along with their department numbers and department names
List the names and salaries of employees who belong to “IT” department
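The two queries above can be sketched by joining the tables in the WHERE clause. The column names ENAME, SALARY, DEPNO and DEPNAME are assumed.

```sql
-- Employee names with their department numbers and department names.
SELECT EMP.ENAME, EMP.DEPNO, DEP.DEPNAME
FROM EMP, DEP
WHERE EMP.DEPNO = DEP.DEPNO;

-- Names and salaries of employees who belong to the IT department.
SELECT EMP.ENAME, EMP.SALARY
FROM EMP, DEP
WHERE EMP.DEPNO = DEP.DEPNO
  AND DEP.DEPNAME = 'IT';
```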
Note: We can also answer the above query using “IN” clause
Select DEPNAME
From DEP
Where DEPNO IN
( Select DEPNO
From EMP) ;
Using “IN” clause method, List the names of employees who belong to “ADMIN” department
Select ENAME
From EMP
Where DEPNO IN
( Select DEPNO
From DEP
Where DEPNAME = 'ADMIN' ) ;
Select DEPNAME
From DEP
Where DEPNO IN
( Select DEPNO
From EMP) ;
Using “IN” clause method, List the names of male employees who belong to “IT” department
Select ENAME
From EMP
Where GENDER = 'M' And DEPNO IN
( Select DEPNO
From DEP
Where DEPNAME = 'IT' ) ;
Inner Join
Inner join is the most common type of join. It combines rows/tuples of two tables based
on a join predicate which refers to the primary keys and foreign keys. Where the primary
key value of one table matches with the foreign key value of the other table, the two
corresponding tuples are joined.
On EMP.DEPNO = DEP.DEPNO;
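A complete inner join statement might look like this sketch. The select list is an assumption; the join predicate is the one shown above.

```sql
-- Only employees whose DEPNO matches a row in DEP appear in the result.
SELECT EMP.ENAME, EMP.DEPNO, DEP.DEPNAME
FROM EMP
INNER JOIN DEP
ON EMP.DEPNO = DEP.DEPNO;
```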
Outer Join
An outer join extracts rows/tuples that do not match their primary and foreign key
values in addition to the tuples that match. There are three types of outer joins: LEFT
OUTER JOIN, RIGHT OUTER JOIN and FULL OUTER JOIN.
LEFT OUTER JOIN – Where table A and table B are joined and table A is specified as left,
a left outer join produces all the tuples from table A including the ones that do not
match with table B.
SQL
ALGEBRA
JOE A1 IT A1
SAM A2 ADMIN A2
BROWN A1 IT A1
PAT A2 ADMIN A2
TOM
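A left outer join that gives a result of this shape can be sketched as follows, with an assumed select list. An employee such as TOM, who has no matching department, still appears, with NULL in the department columns.

```sql
-- EMP is the left table, so every employee is kept.
SELECT EMP.ENAME, DEP.DEPNAME
FROM EMP
LEFT OUTER JOIN DEP
ON EMP.DEPNO = DEP.DEPNO;
```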
RIGHT OUTER JOIN – a right outer join produces all the rows from the right table
including the tuples that do not match with the left table.
SQL
ALGEBRA
JOE A1 IT A1
SAM A2 ADMIN A2
BROWN A1 IT A1
PAT A2 ADMIN A2
QM A3
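The corresponding right outer join can be sketched as follows, again with an assumed select list. A department with no employees still appears, with NULL in the employee columns.

```sql
-- DEP is the right table, so every department is kept.
SELECT EMP.ENAME, DEP.DEPNAME, DEP.DEPNO
FROM EMP
RIGHT OUTER JOIN DEP
ON EMP.DEPNO = DEP.DEPNO;
```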
FULL OUTER JOIN – Produces all the tuples from both tables (it combines left outer
join and right outer join)
SQL
ALGEBRA
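A full outer join can be sketched as follows, with an assumed select list. Not every DBMS supports this syntax (MySQL, for example, does not).

```sql
-- Keeps unmatched employees and unmatched departments alike.
SELECT EMP.ENAME, DEP.DEPNAME
FROM EMP
FULL OUTER JOIN DEP
ON EMP.DEPNO = DEP.DEPNO;
```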
SQL VIEWS
A view is a virtual table. It derives its data from underlying base tables.
Views are used by the DBA to control access to the database. Views can be used to
hide sensitive/confidential data from inappropriate staff. Views ensure that a given
user sees only what is relevant to him/her. In a multiuser database, each user/user
group has their own view of the database.
Views belong to the external level of the ANSI-SPARC Three Schema Architecture.
Views can be manipulated just like other tables within some limits. We can also create a
view from another view.
EMP
Fname   Lname      Address   Salary   Gender   Dno
Tom     Jones      Kandy     5000     M        A1
Rita    Heyworth   Colombo   8000     F        A2
Alan    Turing     Galle     4000     M        A1
Roy     Silva      Jaffna    9000     M        A1
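A view over the EMP table might be defined as in this sketch. The choice of columns for EmpContact is an assumption. Once created, the view can be queried like any other table.

```sql
-- The view hides Salary, Gender and Dno from its users.
CREATE VIEW EmpContact AS
SELECT Fname, Lname, Address
FROM Emp;
```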
SELECT *
FROM EmpContact ;
FROM Emp;
Columns holding derived information (i.e. calculations) and aggregate functions (sum,
avg etc.) cannot be modified through a view
Views defined with “GROUP BY”, DISTINCT or ORDER BY cannot be modified
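A sketch of a view that can be read but not modified, using the EMP table. The view name DeptPay and the alias AvgSalary are hypothetical.

```sql
-- Summary view: one row per department, computed with an aggregate function.
CREATE VIEW DeptPay AS
SELECT Dno, AVG(Salary) AS AvgSalary
FROM Emp
GROUP BY Dno;

-- Reading the view works as for any table.
SELECT * FROM DeptPay;

-- An update such as the one below is expected to be rejected, because
-- AvgSalary does not correspond to a single stored value in Emp:
-- UPDATE DeptPay SET AvgSalary = 7000 WHERE Dno = 'A1';
```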
Modifying a view
It is NOT possible to modify a view but you can replace an existing view like this:
Create Or Replace View DEMO As Select * From STUDENT Where Gender = 'M';
Advantages of Views:
Deleting a view
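A view is removed with the DROP VIEW statement. The sketch below uses the DEMO view from the earlier example. Only the view definition is removed; the base table STUDENT and its data are not affected.

```sql
DROP VIEW DEMO;
```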
Data Control Language (DCL) is used to create user accounts and grant access privileges to
those users on database objects such as tables and views. Privileges are access rights (e.g.
the SELECT privilege for data retrieval) over the database objects given to users. DCL is
also used to remove those access privileges.
DCL has the GRANT statement to issue privileges and the REVOKE statement to remove
privileges.
DML privileges include SELECT, INSERT, UPDATE and DELETE (‘ALL’ can be given when
issuing all privileges)
Removing privileges
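A sketch of issuing and removing privileges. The table Emp and the user jane are hypothetical names.

```sql
-- Issue selected privileges, or all of them.
GRANT SELECT, UPDATE ON Emp TO jane;
GRANT ALL ON Emp TO jane;

-- Remove one privilege, or all of them.
REVOKE UPDATE ON Emp FROM jane;
REVOKE ALL ON Emp FROM jane;
```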
GRAPH DATABASES
A graph database organizes data into nodes, relationships and properties. Nodes are
entities or objects that have properties, such as name and age. Nodes have
relationships to other nodes. Graph databases are used in NoSQL systems.
A graph database is a group of nodes related to each other. The edges of the graph
connect one node to another via a relationship.
A node can hold any number of attributes (e.g. empid, name, salary) called properties.
Nodes can be tagged with labels (e.g. Employee, Company, City).
KEY–VALUE DATABASES/MODEL
The database consists of lots of aggregates (an aggregate is a grouping of related data)
with each aggregate having a key / ID to retrieve data. The aggregate may contain
alphanumerical data (e.g. product name, price), images, audio and video.
With a key-value database, we can only access an aggregate by a lookup based on its
key. That is, it allows us to search for a given key (e.g. product id) and retrieve the
corresponding data value (e.g. product price).
An object is a software component that has a unique ObjectID (OID), a state and a set
of operations that work on the state. The Object ID is system generated. The OID is
used by the system to identify objects uniquely and to implement interobject references
(e.g. a customer object referring to order objects).
The state of an object is made up of its attribute values. The operations specify the
behavior of objects. These are implemented as methods or functions.
Objects are grouped into classes. A class is a blueprint for a set of similar objects. A class
is a type. An object is an instance of a class. Classes are created using Object Definition
Language (ODL).
An OO database has PERSISTENT objects, that is they continue to exist even after
terminating the program.
March 2013 – Q5. - (a) Describe the various interfaces, tools and techniques that a
user (technical or otherwise) may employ when interacting with a database.
(10 Marks)
SQL interface at command line terminal - A basic text only interface where the user can
enter and execute SQL code (example: Oracle SQL*Plus). This can be used for complex
queries and database administration work. It requires comprehensive training in SQL.
Suitable for advanced users such as the DBA, not for end users.
QBE (Query By Example) interfaces (example: QBE of Microsoft Access) that allow end
users to perform queries simply by filling a template. This is easy to use and does not
require any special training but it is not suitable for complex queries.
Report generators and Form generators – An easy to use GUI based interface that
enables end users to create their own database reports (example: Report Wizard of
Microsoft Access) and database access forms (example: Form Wizard of Microsoft
Access). Users do not need coding knowledge. These facilities enable faster application
development but may require more memory and processing power. There may also be
problems with customizing the generated output. Suitable for end users.
SQL script - A file of SQL code used to perform a batch of database tasks. (e.g. a
Notepad file with .sql extension containing a series of CREATE TABLE statements).
Created by technical users such as the DBA or developers. It demands strong coding
skills and technical knowledge.
Database Utilities – Provide automated support for common database related tasks
(e.g. ETL tools for loading data into a data warehouse). They require high levels of
technical knowledge. Used by the DBA and operators.
Middleware – Software that interposes between a client and a server so that a client
application can connect to a remote database (examples: ODBC, JDBC)
Web forms – Provide a simple and easy to use facility to query, insert, view or update
data in a database. They are used at the client layer (presented in a browser) of the three
tier client/server architecture. Anyone with Internet access can use them (e.g. customers,
employees).
Client side processing tools (e.g. JavaScript) to carry out client side processing such as
validating user input. Performing validations on the client itself is more efficient as it
saves the round trip to the server.
Server side processing tools (e.g. PHP) to carry out server side processing such as
requesting data from a database.
(b) Explain what the term data validation means. Using your own examples, describe
the various data validation techniques that may be embedded into a forms-based
interface to a database. (10 Marks)
Data validation is concerned with ensuring that only valid, accurate and well-
formatted data is accepted into the database. Data validation ensures data integrity.
A form based interface could carry out the following validations:
Validating user name and password at log on – This is done by comparing the input
user name and password with the ones stored in the database
Range checks for numerical quantities (for example, marks entered for a single subject
should be in the range 0 to 100)
Format masks (for example, date of birth can have DD/MM/YY format)
Consistency checking (e.g. a user who enters “Mr” for title, must enter “Male” for
gender)
Presence checks to ensure important fields are not left blank (example: customer
name and address cannot be null)
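The same kinds of rules can also be backed up at the database level, so that invalid data is rejected even if a form lets it through. The sketch below assumes a DBMS that enforces CHECK constraints. The table and column names are hypothetical.

```sql
-- Hypothetical table combining presence, range and consistency checks.
CREATE TABLE student_mark (
    student_name VARCHAR(50) NOT NULL,        -- presence check
    title        VARCHAR(5)  NOT NULL,
    gender       VARCHAR(6)  NOT NULL,
    marks        INTEGER     NOT NULL
        CHECK (marks BETWEEN 0 AND 100),      -- range check
    -- consistency check: a title of 'Mr' requires a gender of 'Male'
    CHECK (title <> 'Mr' OR gender = 'Male')
);
```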
(c) Describe the form components that may be used to implement these data
validation techniques. (5 Marks)
Drop down list boxes (combo boxes) ensure only a predefined item is selected out of
many (example: selecting country of the user)
Radio buttons ensure only a single choice is made out of several (example: selecting
gender out of male or female)
An on screen calendar prevents typing mistakes when entering date fields such as
reservation date
Check boxes enable the user to select several items out of many
(c) Using examples derived from the ER model explain the difference between :-
A strong entity type can exist independently on its own. Its existence does not depend on
another entity type. A strong entity type has its own primary key. Order is an example of a
strong entity type. Its primary key is ordernumber.
A weak entity type cannot exist on its own. Its existence depends on another entity type
(the owner). It cannot form its own primary key. It only has a partial key. A complete primary
key is formed by combining the partial key with the owner’s primary key, which the weak
entity holds as a foreign key.
Orderitem is a weak entity type. It has itemnumber as its partial key, which is unique only
among the items of a single order. We can form a complete primary key by combining the
partial key itemnumber with ordernumber, the key of the owner.
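When mapped to tables, the strong and weak entity types might be sketched as follows. The table is called Orders because ORDER is a reserved word in SQL. The columns orderdate and quantity, and the cascading delete, are assumptions added for illustration.

```sql
-- Strong entity type: has its own primary key.
CREATE TABLE Orders (
    ordernumber INTEGER PRIMARY KEY,
    orderdate   DATE
);

-- Weak entity type: its key combines the owner's key with the partial key.
CREATE TABLE Orderitem (
    ordernumber INTEGER NOT NULL,
    itemnumber  INTEGER NOT NULL,
    quantity    INTEGER,
    PRIMARY KEY (ordernumber, itemnumber),
    FOREIGN KEY (ordernumber) REFERENCES Orders (ordernumber)
        ON DELETE CASCADE   -- items cannot outlive their order
);
```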
Following instance diagram shows how instances of three entity types relate.
March 2018 - A1
a) Explain the MAIN objectives of the ANSI-SPARC architecture for a DBMS. Discuss
briefly the challenges of achieving these objectives in practice. (10 marks)
The main challenge is that there is no universal standard among database vendors ……..
(explain)
When concurrent transactions update the same data at the same time, problems like lost
update, dirty read, etc. could make the database inconsistent. The following example shows
the problem of lost update: Two concurrent transactions read a record to update it, and the
first one to write the record loses its update when the second one completes (the second
transaction overwrites the update made by the first one).
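The interleaving behind a lost update can be sketched as a timeline. The account table, the account number and the amounts are hypothetical, and T1 and T2 stand for two sessions running at the same time.

```sql
-- Assume the balance of account 1 starts at 100.

-- T1 and T2 both read the balance before either one writes.
SELECT balance FROM account WHERE accno = 1;        -- T1 reads 100
SELECT balance FROM account WHERE accno = 1;        -- T2 reads 100

-- T1 adds 50 to the value it read and writes the result.
UPDATE account SET balance = 150 WHERE accno = 1;   -- T1

-- T2 subtracts 30 from the value it read (still 100) and writes the result.
UPDATE account SET balance = 70 WHERE accno = 1;    -- T2

-- The final balance is 70 rather than 120: the update made by T1 is lost.
```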
Concurrent transactions must not interfere with each other. They should be isolated
from one another. Even though there will be many transactions running concurrently,
any given transaction’s updates should be concealed from other transactions until that
transaction commits. If not, the updates made by one transaction can get overwritten by
another (lost update) or changes made by one transaction may be read by another
before the first one commits (dirty read). Isolation is enforced by concurrency control
methods such as locking. When a transaction locks a data unit, it cannot be accessed
by others until the first transaction completes.
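One way a transaction can lock a row before updating it is sketched below, assuming a DBMS that supports SELECT ... FOR UPDATE (for example Oracle, PostgreSQL or MySQL with InnoDB) and a session that does not commit automatically. The account table is hypothetical.

```sql
-- Read the row and lock it; other transactions that try to lock or
-- update the same row must wait.
SELECT balance FROM account WHERE accno = 1 FOR UPDATE;

-- Update the row while the lock is held.
UPDATE account SET balance = balance + 50 WHERE accno = 1;

-- Committing makes the change visible and releases the lock.
COMMIT;
```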
e) How the DBMS recovers transactions that are lost following system failure or
crashes.
Both the transaction log and the most recent backup copy are needed to recover from a
disk crash. Copy the backup to a new disk and REDO all the transactions that have
occurred between the time of the backup and the time of the crash. This is done by
applying the AFTER images of the changed data items.