CS312 Handouts
(CS312)
Database Modeling and Design VU
Contents
Module 01: Introduction to Data & Information
Module 02: Data Storage Mechanism
Module 03: Three-Tier Architecture
Module 04: Fact Finding Technique
Module 05: Implementing Fact Finding Techniques
Module 06: Process to Database Design
Module 07: Relational Database
Module 08: Conceptual Data Modeling and Entity Relationship Diagram Overview
Module 09: Entities, Attributes and Relationship
Module 10: Extended Entity Relationship Diagram
Module 11: Example of Entity Relationship Diagram (ERD)
Module 12: Anomalies
Module 13: Normalization
Module 14: Denormalization
Module 15: Introduction to Oracle 11g on Cloud
Module 16: Using Data Definition Language (DDL) in Oracle 11g
Module 17: Using Data Manipulation Language (DML) and Data Control Language (DCL)
Module 18: Structured Query Language (SQL) Basics
Module 19: Advance SQL
Module 20: Database Views and Data Dictionaries in Oracle 11g
Module 21: Introduction to Sequence and Synonyms with implementation in Oracle 11g
Module 22: Indexes in Databases
Module 23: Transaction
Module 24: Locks & Granularity
A. Definition of Data
Data can be described as the raw material from which we can draw conclusions after analysis, or
as the facts from which we can infer new facts. Data on its own does not make decisions, but it
lays down the foundation for decision making. Data is collected and further analyzed to convert
it into information suitable for decision making. Data and information are two separate things
and are not to be confused with each other.
B. Example of Data
The following are examples of data:
Telephone Directory: it is a huge collection of data with an internal structure of
representation, i.e. name, telephone number, city or address. Facts with structure are also
termed data.
Student Record on Computer: a file on a computer containing the records of ten thousand
students is also data. This data too has an internal structure in the form of Student ID,
Student Name, CGPA and the like.
C. Definition of Information
Information can be defined as processed or organized data presented in a given context so as to
make it useful for the problem in hand. When a specific set of data is analyzed or
interpreted, it becomes information that is more suitable for decision making. Simply stated,
data becomes information when it becomes relevant to the decision problem.
D. Examples of Information
The following are examples of information:
Telephone Directory: as stated above, a telephone directory is a huge set of data, and when we
process this set of data to get the telephone number of a specific dentist or a colleague, it
becomes information.
Student Record on Computer: a file on a computer containing the records of ten thousand students
is again a huge set of data, which converts into information when processed to get the list of
students with a CGPA of more than three.
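The student-record example can be sketched in a few lines of Python. This is only an illustration: the record layout, the sample names and the helper students_above are assumptions made for the sketch, not part of the handout.

```python
# Hypothetical student records: raw data with an internal structure
# (student ID, name, CGPA). All values are made up for illustration.
students = [
    {"student_id": 1, "name": "Ali", "cgpa": 3.4},
    {"student_id": 2, "name": "Sara", "cgpa": 2.8},
    {"student_id": 3, "name": "Bilal", "cgpa": 3.9},
]


def students_above(records, threshold):
    """Process raw records into information: names with CGPA above threshold."""
    return [r["name"] for r in records if r["cgpa"] > threshold]


# The stored records are data; the filtered list answers a specific question.
high_achievers = students_above(students, 3.0)
print(high_achievers)
```

The list of records is the data; the filtered list is the information relevant to the decision problem.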
Example 02:
Database: Individual files on Hard Disk
DBMS: Operating System to read the file from Hard Disk
W. Layer Independence
The tiers mentioned above have no interdependence and work independently. Every tier is totally
unaware of the internal working or the logic of the other tiers, and the decisions made by the
tiers are also independent. In simple words:
Data layer is not aware of the internal working of the logic layer
Logic layer is not concerned about the workings of the data layer
Presentation layer works in total independence of the other layers
FF.
facts, getting more diverse ideas and information, gathering the requirements, involving the end
user to generate enthusiasm and to make them feel engaged in the whole project.
much about the individual and activity to perform as possible. Here the questions to ask include
when the low, normal and peak periods are for the activity being observed.
MM. Research
It is very valuable to research the solution for the problem in hand. The Internet, journals and
reference books are good sources of information about how others have solved related problems
or what solutions are available. Research can save a lot of time, as we may find ready-to-deploy
solutions that fulfill the requirements. It is also a helpful tool for validating the
information gathered through observation, as there is a chance that the actual execution of the
activity is different from what is observed.
NN. Questionnaire
Another useful method for fact finding is to conduct surveys through questionnaires, which are
special-purpose documents used to gather information from a large number of people while
maintaining some control over the responses. There are two types of questions that can be asked
in a questionnaire, and the details are given below:
Free-Format Questions: these offer greater freedom to the respondents, who answer the question
in the space provided after it. The responses are more subjective in nature and difficult to
tabulate.
Fixed-Format Questions: these kinds of questions require specific responses, and the
individuals have to choose from the options available. The responses are more specific, which
makes them much easier to tabulate. The disadvantage of this kind of question is that
the researcher doesn’t get additional information that might be worth knowing.
Requirements:
In a plaza renting system, there are three types of buildings: apartments, offices and
shops. There are multiple floors in a plaza; 3 floors are allocated to apartments, and the
remaining floors can hold a mix of shops and offices. When a customer is registered, the customer
id, name, phone and address are usually stored as a profile. A bill for the rent is generated on
a bi-monthly basis, but the customer has the option to pay the bill on a monthly basis. A contract
is signed and uploaded against every building type, and a contract id is assigned to every
contract. A customer may rent multiple building types; in this case a different contract number
is assigned for each one under the same customer. One bill is generated for a customer with
multiple building types.
architecture and the design of the system to be developed. The concept definition artifact lays
down the foundation of system definition; concept definition comprises the activities in which
the needs and the requirements of all the stakeholders are closely examined before defining the
system. The concept definition is the articulation of the System-of-Interest (SoI), the model of
the system on the basis of needs and requirements, which is the collective set of all elements of
any system being considered by the lifecycle.
SS. Process of Extracting System Definition
System definition is based on the concept definition, a deep understanding of concepts, needs,
expectations and stakeholders’ requirements. System is further divided into subsystems through
top-down decomposition of system-of-interest in order to reach the exact destination. The same
model is shown in the Figure 13 on the right side. The needs and the concepts along with the
stakeholder requirements are gathered to form a system-of-interest and this is further divided
into subsystem to fulfill the desired requirements.
Figure 2: Process of Extracting System Definition
The system definition should list all the important features of a database system, and these
should include the following:
Initial database size
Database rate of growth
The types and average number of record searches
Performance
Security
Backup and recovery
Legal issues etc.
features required by the end users or the delivery schedule disturbing the development process.
The attributes of the requirements are explained in the following parts.
Issue 01: How uniquely would you be able to identify the unique batch report?
Issue 02: What if requirement has changed and discrepancy report is not required?
Implication: Zero Traceability
Solution 01: System must generate a batch report when batch is completed and aborted –
Assign Unique Requirement Number
Solution 02: System must generate a discrepancy report when batch is aborted or
completed – Assign Unique Requirement Number
Now understand the concept. A requirement is traceable if both the origins and the references of
the requirement are available. Traceability of the origin of a requirement can help understand
what modifications have been made to the requirement to bring it to its current
state. Traceability of references is used to aid the modification of future documents by stating
where a requirement has been referenced. By having forward traceability, consistency can be
more easily controlled. Making requirements traceable is also useful in coding and testing of the
system.
MMM. Requirement Specification
Requirement specification is the result of requirement analysis. Requirement specification
establishes the basis for the agreement between clients and the development team on what the
software product is to do and what it is not to do. The requirements are further classified into
functional and nonfunctional categories and recorded in a mutually agreed document. This document
is called the Software Requirement Specification or SRS. It lists the necessary and sufficient
requirements for the project development. A clear and thorough understanding of the system to be
developed is ensured through continuous communication between the client and the project
development team.
Figure 5: Process to Database Design
A relational database can be defined as a database whose organization is based on the relational
model of data, in which all the data is presented in the form of tuples (also known as records or
rows) which are further grouped into relations. The relational model was first proposed by E. F.
Codd in his seminal paper ‘A relational model of data for large shared data banks’ (Codd, 1970).
There are some rules proposed by E. F. Codd which any database must follow to be called a
relational database.
The system must support set-at-a-time insert, update, and delete operators. This must not be
limited to a single row, that is, it must also support union, intersection and minus operations to
yield sets of data records.
locations. Users should always get the impression that the data is located at one site only. Users
have no concern with the internal storage strategy. This rule has been regarded as the foundation
of distributed database systems.
reference which is known as a key. This key equips each row or database record with a unique
identification which can be easily tracked.
AAAA. Relationships
A relationship represents how data is connected among entities in a given system. The association
among the entities can also be termed a relationship. In our school example, the two entities
student and course have an association or relation with each other, as students enroll in a
course. Interaction among entities is captured using relationships.
The ability to find meaningful names comes with fundamental understanding of what the model
represents. Since the requirement specification is a description of a business, it is best to choose
meaningful business names wherever that is possible. If there is no business name for an entity,
you must give the entity a name that fits its purpose in the model.
As a rule, a primary key should be minimal and should not contain unnecessary information.
In the above case, building name fits the criteria for becoming a primary key, as the rest of the
attributes can be complex and contain unnecessary information. Apartment ID is unique, whereas
apartments in the building can have the same covered area, rent or status, making these
attributes common to several apartments. The primary key for the rest of the entities is
determined in the very same way.
A receipt database has a receipt table, and each receipt is associated with a particular customer.
Customer details (such as name and address) are kept in a separate table; each customer is given
a 'customerID' to identify it. Each receipt record has an attribute containing the customerID (cid)
for that receipt. The 'customerID' is then the primary key in the customer table, and that primary
key will be the foreign key in the receipt table.
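The customer and receipt tables described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in for Oracle; the column names other than customerID and cid, the sample rows and the helper insert_orphan_receipt are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: customerID is the primary key.
conn.execute(
    "CREATE TABLE customer (customerID INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
# Child table: cid is a foreign key pointing at customer.customerID.
conn.execute(
    """
    CREATE TABLE receipt (
        receiptID INTEGER PRIMARY KEY,
        amount REAL,
        cid INTEGER REFERENCES customer (customerID)
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Ali', 'Lahore')")
conn.execute("INSERT INTO receipt VALUES (100, 250.0, 1)")


def insert_orphan_receipt(connection):
    """Try to store a receipt for a customer that does not exist."""
    try:
        connection.execute("INSERT INTO receipt VALUES (101, 99.0, 42)")
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_orphan_receipt(conn)
print(orphan_accepted)
```

The second insert is refused because cid 42 has no matching customerID, which is exactly what the foreign key is there to guarantee.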
OOOO. Relationships
A relationship defines how data is connected among the entities in a given system, or in other
words how one entity is logically connected with another entity of the system. Relationships in a
database are said to be a combination of cardinality and optionality, where an optional
relationship is one in which there may or may not be a matching record in the parent / child table,
and cardinality represents the concept of “how many”, which is normally 0 or more. The equation
for creating a relationship is as follows:
Relationship (R) = Cardinality (C) + Optionality (O)
The concepts of cardinality and optionality are explained below in detail.
Relationships are bi-directional in nature. The relationship between two entities A and B is as
follows:
i. Relationship from A to B
ii. Relationship from B to A
(______) . A dotted or solid line shows this kind of relationship. To translate …… or _____
following rules are to be followed:
Optional: -----: zero or one
Mandatory: ____: Exactly one
a. One-To-Many Relationship
A one-to-many relationship is a type of cardinality that refers to the relationship between two
entities A and B in which an element of A may be linked to many elements of B, but a member of B
is linked to only one element of A. The one-to-many relationship is the most common type of
relationship. In this type of relationship, a row in table A can have many matching rows in table
B, but a row in table B can have only one matching row in table A. A one-to-many relationship
occurs when a parent record in one table can potentially reference several child records in
another table.
As a rule, the primary key of the parent table acts as a foreign key in the child table. The
foreign key is often made part of a composite primary key together with the PK of the child table,
but this is not a fixed rule.
[Diagram: entities A and B]
b. One-To-One Relationship
A one-to-one relationship is a type of cardinality that refers to the relationship between two
entities A and B in which one element of A may only be linked to one element of B, and vice
versa. As the one-to-one label implies, in this relationship one entity can be related to only one
other entity, and vice versa. It is important to note that a one-to-one relationship is not a
property of the data, but rather of the relationship itself, as there is no parent-child
relationship in a one-to-one scenario. The following are the rules to implement One-to-One:
i. There will be a foreign key in any one of the participating tables.
ii. The foreign key will be made a unique key.
[Diagram: entities A and B]
A to B:
Optionality is mandatory (exactly one) because the line is solid from A to the midpoint, and
cardinality is also one since there is no < or > symbol attached to B
Relationship:
Every instance of A has exactly one instance in B
B to A:
Optionality is mandatory (exactly one) because the line is solid from B to the midpoint, and
cardinality is also one since there is no < or > symbol attached to A
Relationship:
Every instance of B has exactly one instance in A
Overall Relationship:
One-to-One from A to B
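The two implementation rules for One-to-One (a foreign key in one table, made unique) can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, and the table names a and b, their columns and the helper link_second_b are assumptions chosen to mirror entities A and B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.execute("CREATE TABLE a (a_id INTEGER PRIMARY KEY, a_name TEXT)")
# Rule i: the foreign key lives in one of the two tables (here: b).
# Rule ii: the foreign key is also UNIQUE, so each row of a is used at most once.
conn.execute(
    """
    CREATE TABLE b (
        b_id INTEGER PRIMARY KEY,
        b_name TEXT,
        a_id INTEGER NOT NULL UNIQUE REFERENCES a (a_id)
    )
    """
)
conn.execute("INSERT INTO a VALUES (1, 'first A')")
conn.execute("INSERT INTO b VALUES (10, 'first B', 1)")


def link_second_b(connection):
    """Try to attach a second row of b to the same row of a."""
    try:
        connection.execute("INSERT INTO b VALUES (11, 'second B', 1)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


second_link = link_second_b(conn)
print(second_link)
```

Without the UNIQUE constraint the same schema would behave as One-to-Many, since several rows of b could then point at one row of a.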
d. Many-To-Many Relationship
A many-to-many relationship is a type of cardinality that refers to the relationship between two
entities A and B in which A may contain a parent record for which there are many children in B,
and vice versa. This means that a parent row in one table has several child rows in the
second table, and vice versa. Many-to-many relationships are tricky to represent and are not
supported directly in the relational environment. To represent this kind of relationship, a third
entity or intersection table is created, in which the PKs of the two tables act as FKs and
together form the CPK of the third table.
[Diagram: entities A and B]
A to B:
Optionality is mandatory (exactly one) because the line is solid from A to the midpoint, and
cardinality is more than one due to the < symbol attached to B
Relationship:
Every instance of A has one or more instances in B
B to A:
Optionality is mandatory (exactly one) because the line is solid from B to the midpoint, and
cardinality is more than one due to the > symbol attached to A
Relationship:
Every instance of B has one or more instances in A
Overall Relationship:
Many-to-Many from A to B
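An intersection table can be sketched as below, using the Writer, Songs and Writer_song relations that appear in the handout's band example. The sketch runs on Python's built-in sqlite3 module instead of Oracle; the reduced column lists and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key checks in SQLite
conn.executescript(
    """
    CREATE TABLE writer (wid INTEGER PRIMARY KEY, wname TEXT);
    CREATE TABLE songs (sname TEXT PRIMARY KEY, duration INTEGER);

    -- Intersection table: both parent keys are FKs and together form the CPK.
    CREATE TABLE writer_song (
        wid INTEGER REFERENCES writer (wid),
        sname TEXT REFERENCES songs (sname),
        PRIMARY KEY (wid, sname)
    );

    INSERT INTO writer VALUES (1, 'Writer One'), (2, 'Writer Two');
    INSERT INTO songs VALUES ('Song A', 200), ('Song B', 180);
    INSERT INTO writer_song VALUES (1, 'Song A'), (1, 'Song B'), (2, 'Song A');
    """
)

# One writer has many songs ...
songs_by_writer_1 = [
    row[0]
    for row in conn.execute(
        "SELECT sname FROM writer_song WHERE wid = 1 ORDER BY sname"
    )
]
# ... and one song has many writers.
writers_of_song_a = [
    row[0]
    for row in conn.execute(
        "SELECT w.wname FROM writer w "
        "JOIN writer_song ws ON ws.wid = w.wid "
        "WHERE ws.sname = 'Song A' ORDER BY w.wid"
    )
]
print(songs_by_writer_1, writers_of_song_a)
```

Each side of the many-to-many relationship becomes an ordinary one-to-many relationship into the intersection table.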
UUUU. Cascading
The purpose of the foreign key is to identify a particular row of the referenced table, so it is
required that the foreign key either has a valid reference in the referenced table or has a NULL
value. This rule is called the referential integrity constraint, and violation of this constraint
can create many issues. Cascading is the mechanism that ensures foreign keys have valid references
in the parent table. Cascading defines the behavior of the foreign key when the record in the
parent table is deleted or updated. To maintain referential and data integrity, cascading defines
two rules: CASCADE DELETE and CASCADE UPDATE. These are discussed in detail in the
following lines.
Example:
Suppose company X has 2 tables, an Employee table and a Department table. In the Employee
table we have 2 columns: the employee ID and the Department ID. In the Department table we
also have 2 columns: the Department ID and the Department name for the
given ID. Department ID is the foreign key in the Employee table. Now suppose we want to
remove a department for any reason. When a department is deleted, any rows in the Employee
table that reference its Department ID are also deleted.
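The Employee and Department example can be reproduced with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle; the exact column names and the sample rows are assumptions, and ON DELETE CASCADE is the clause that provides the CASCADE DELETE behavior here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for cascades to fire in SQLite
conn.executescript(
    """
    CREATE TABLE department (
        department_id INTEGER PRIMARY KEY,
        department_name TEXT
    );
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        department_id INTEGER
            REFERENCES department (department_id) ON DELETE CASCADE
    );
    INSERT INTO department VALUES (10, 'Accounts'), (20, 'Sales');
    INSERT INTO employee VALUES (1, 10), (2, 10), (3, 20);
    """
)

# Deleting the parent row removes every employee row that references it.
conn.execute("DELETE FROM department WHERE department_id = 10")
remaining_employees = [
    row[0]
    for row in conn.execute("SELECT employee_id FROM employee ORDER BY employee_id")
]
print(remaining_employees)
```

Employees 1 and 2 disappear together with department 10, so no foreign key is left pointing at a missing parent.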
The diagram above explains the concept of super-type and sub-type. Common attributes (sno,
sname, designation, address) are placed in the Staff entity (super type) because there are
multiple employee types (Full-time, Part-time) and every employee type shares these attributes.
Instead of writing (sno, sname, designation, address) multiple times for the Full-time and
Part-time sub-types, these 4 attributes are written only once in the super type and shared by
the sub-types.
DDDDD. Exhaustive
It is the first property of super and sub types, which states that there should be at least two
sub types and each sub type should have at least one attribute of its own. A sub type without any
specific attributes is not a valid sub type. In other words, every instance of the super type is
also an instance of one of the sub types.
If we remove the Salary attribute from the sub-type Full-time then it is not a valid sub-type,
because as per the Exhaustive rule there has to be at least one specific attribute for a sub-type
to qualify as a valid sub-type.
In the diagram above, for example, if there is sno=1 at any particular time, then this sno can be
associated with only one of the sub-types, i.e. either Full-time or Part-time, not both.
If sno=1 is associated with Full-time, then the sub-type Part-time is not aware of this
association due to the mutually exclusive rule.
As a rule, it is not possible to show foreign keys in an ERD, but it is possible to show both
attributes and foreign keys in a Physical Data Model.
As a rule of thumb, the primary key of the parent table is used as a foreign key in the child
table in a One-to-Many relationship.
In Many-to-Many, a new third table is introduced in which the primary keys of both tables are
written as FKs and made part of the CPK.
In One-to-One, any one table will contain the FK, and as a rule the FK will be made a UK.
The Physical Data Model is the last step before database implementation.
Building(bname, address)
Floor (floor #,no_apt, bname)
Assumption:
Floor# can be repeated across multiple buildings; because of this, bname is made part of the
composite primary key along with Floor# in the Floor table
Band (bname,totalperson,cat)
Company( cname, address, phone#)
Songs(sname,duration,language,bname,aname)
Writer( wid, wname, charges, category)
Album( aname, total_songs,cname)
Writer_song(wid,sname)
Flight (fid,dod,tod,destination,aname)
Passenger(pid,pname,fid,contact,address,date)
Ticket(fid,tid,pid,seat#,meal)
Plane(pid,capacity,age)
Plane_flight(pid,fid) – Resolving Many-to-Many between flight and Plane
A. What is Anomaly?
Anomalies can be defined as the problems that can occur due to poor planning and design of
databases. They usually occur when a single table is created instead of multiple
tables. An anomaly is an irregularity, or something which deviates from the expected or normal
state.
This anomaly occurs when data is lost due to the deletion of some other data. The reason for
this anomaly is the same as mentioned above, i.e. a poor design decision. Due to this anomaly
certain attributes are lost when other attributes are deleted. If we want to delete pid=4, we
will also be deleting cid=2, because a delete removes the complete row and that row is the only
record for cid=2. We therefore end up losing customer data when we delete product data, which is
an unintended loss of data.
An update anomaly exists when one or more instances of duplicated data are updated, but not
all. It occurs when the same data item has to be updated more than once, which
can lead to errors and inconsistency of data. The same information can be expressed on multiple
rows; therefore all the instances must be updated. If this is not done properly, an update
anomaly arises. If there are multiple records and we fail to update all the instances, the result
is data inconsistency across the database and, in turn, wrong reporting of data, which is not
acceptable.
In the table given above, if we decide to update the Cname of cid=3 and fail to update all the
rows (cname), we end up with wrong information. The records in yellow show that
cid=3 has different Cname values, so the system will report both Cname values against cid=3,
which is wrong. This is the result of not updating all the records.
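The update anomaly can be reproduced with a tiny single-table sketch. It uses Python's built-in sqlite3 module; the table name customer_product, its columns beyond cid and cname, and the sample names are assumptions made for illustration and do not reproduce the handout's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- A single unnormalized table: the customer name is repeated on every row.
    CREATE TABLE customer_product (cid INTEGER, cname TEXT, pid INTEGER);
    INSERT INTO customer_product VALUES
        (3, 'Ahmed', 1),
        (3, 'Ahmed', 2),
        (3, 'Ahmed', 5);
    """
)

# Incomplete update: only one of the three duplicated rows is changed.
conn.execute(
    "UPDATE customer_product SET cname = 'Ahmad Khan' WHERE cid = 3 AND pid = 1"
)

# The same customer id now reports two different names.
names_for_cid_3 = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT cname FROM customer_product WHERE cid = 3"
    )
)
print(names_for_cid_3)
```

If cname were stored once in a separate customer table, a single UPDATE would be enough and this inconsistency could not arise.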
A. Normalization Basics
Database normalization is the process of storing the data efficiently in order to reduce data
redundancy and undesirable characteristics from the system. The two main objectives of the
database normalization are to reduce redundancy and to make sure the relationships /
dependencies are logical. That’s why the related data is stored together. This process serves as a
solution to database anomalies as normalization is a method to remove all these anomalies and
bring the database to a consistent state.
group as FK.
First group has to be non-repeating group
Based on Figure 24 above, the following entities and attributes can be defined in 1NF:
The primary key of the first group, which has to be the non-repeating group (the Student group),
has to be written as FK in every following group (Course and Semester). Just as in One-to-Many
the primary key of the parent table is written as FK in the child table, here Student is the
parent table and Course and Semester are the child tables.
First Normal Form (1NF)
Entities    Attributes
Student     Sid, Sname, Total_Registered_Courses, Status
Course      Cid, Cname, Credithrs, Sid
Semester    Semid, Sid, Start Date, End Date
The data presented in the above format fulfills all the criteria of being in first normal form.
Semester (Semid, Sid, Start Date, End Date)
Semid ⇒ start date, end date. Both have a functional dependency on only part of the CPK, i.e.
semid. To find out the correct start and end date only the value of Semid is required. The
Semester table is not in 2NF.
Semester_info ( semid, startdate, enddate )    New table, now in 2NF
Semester ( semid, sid )                        Original table, now in 2NF
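The split of the Semester table can be checked with a small sketch. It uses Python's built-in sqlite3 module; the name semester_1nf for the table before the split and all sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Before: start and end dates depend on semid alone (a partial dependency).
    CREATE TABLE semester_1nf (
        semid TEXT, sid INTEGER, startdate TEXT, enddate TEXT,
        PRIMARY KEY (semid, sid)
    );
    INSERT INTO semester_1nf VALUES
        ('F24', 1, '2024-09-01', '2025-01-15'),
        ('F24', 2, '2024-09-01', '2025-01-15'),
        ('S25', 1, '2025-02-01', '2025-06-15');

    -- After: the dates move to a table keyed by semid only.
    CREATE TABLE semester_info (semid TEXT PRIMARY KEY, startdate TEXT, enddate TEXT);
    CREATE TABLE semester (
        semid TEXT REFERENCES semester_info (semid),
        sid INTEGER,
        PRIMARY KEY (semid, sid)
    );
    INSERT INTO semester_info
        SELECT DISTINCT semid, startdate, enddate FROM semester_1nf;
    INSERT INTO semester SELECT semid, sid FROM semester_1nf;
    """
)

info_rows = conn.execute("SELECT COUNT(*) FROM semester_info").fetchone()[0]
enrolment_rows = conn.execute("SELECT COUNT(*) FROM semester").fetchone()[0]

# Joining the two 2NF tables gives back exactly the original rows.
rejoined = conn.execute(
    "SELECT s.semid, s.sid, i.startdate, i.enddate "
    "FROM semester s JOIN semester_info i ON i.semid = s.semid "
    "ORDER BY s.semid, s.sid"
).fetchall()
original = conn.execute(
    "SELECT semid, sid, startdate, enddate FROM semester_1nf ORDER BY semid, sid"
).fetchall()
print(info_rows, enrolment_rows, rejoined == original)
```

The dates for each semester are now stored once, and no information is lost by the decomposition.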
After applying the normalization up to third normal form (3NF) the data would be in the
following shape:
TableName         Attributes
Course            (cid, sname, credits)
Student-course    (cid, id)
Semester          (semid, sid)
Semester_info     (semid, startdate, enddate)
Student-Status    (Total_courses_registered, status)
Student           (sid, sname, reg_courses)
After applying the normalization up to third normal form (3NF) the data would be in the
following shape:
TableName       Attributes
Flight          Flight#, destination, origin, distance, dod, tod
Org-Dest        origin, destination, distance
Flight          Flight#, origin, destination, dod, tod
Passenger       Flight#, pid, pname, seat#, ticket#, cost
Airport-Info    aid, gate#, aname, runway#
A. What is Denormalization?
Denormalization is the process of systematically adding redundancy to a database with the
purpose of improving database performance. Tables are merged into each other in order
to group data: the more tables you have, the more joins you have to perform in your queries,
and more joins mean more time is required to retrieve data because the data has to be collected
from multiple tables. The purpose of denormalization is to reduce the running time of queries,
i.e. the time it takes the database management system (DBMS) to calculate the results.
Considering the above image, the normalization of the database resulted in two separate tables
for Order and Order Details, as shown in the figure. In order to speed up the query processing
time, these two tables will be merged together, i.e. denormalized. In this way no joins will be
required, which speeds up query processing and ultimately improves database performance, at the
cost of the extra storage taken by the redundant data.
produce fewer table joins when queries execute. It is appropriate when queries frequently require
values from a grandparent and grandchild, but not from the parent.
B. Introduction
It’s a cloud service hosted by Oracle with full access to the features and operations that are
available with Oracle Database, but Oracle hosts the VM and cloud storage. You can perform all
database management and development operations—without purchasing and maintaining
hardware, without knowing backup and recovery commands, and without having to perform such
complex tasks as database software upgrades and patching.
C. Login to Cloud
Login to the cloud will require the following steps:
Step 01:
Open the following address to access Oracle 11g:
https://apex.oracle.com/en/
Step 02:
Click on the Sign In button on the upper right corner of the page and enter the following
credentials:
Workspace Name
Username
Password
Step 03:
Click on the SQL Workshop tab at the top-mid of the page
Step 04:
Click on SQL Commands tab from the drop down list to access the code or enter SQL
Statements / Commands and click Run to see the results
Data Definition Language (DDL) is used to create database objects in the database. Information
about each database object is kept in the data dictionary, which is created and managed by the
system. The table is the most important database object, as it is required to store and retrieve
data. DDL includes the Create, Alter and Drop statements.
A. Syntax of DDL
To implement a single table (the Building table) from the ERD given below:
Create table Building (bname varchar2 (30) primary key, address varchar2 (40), phoneno
number (10));
In the Floor table, bname is a foreign key, and Floor# and bname together form the composite
primary key. The code to implement the table is as below:
Create table Floor (floorno number (10), no_of_apt number (10), bname varchar2 (30)
references building (bname), primary key (floorno, bname));
Alter command is used to either add new column or modify data type or size in existing table
Adding New Column in Building Table
To modify data type of phone# from number to varchar2, following code can be used:
Alter table Building modify (phoneno varchar2 (10));
To drop a table from the database, the drop command can be used. While dropping tables it is
important to follow the correct sequence: all the child tables should be dropped first,
and then the parent table can be dropped.
Drop table floor;
Drop table building;
Continuing with the above example, the example of insert statement would be:
INSERT into building
VALUES ('Test', 'VU', 3948283);
UPDATE emp
SET ename='Anders'
WHERE empno=7369;
This will update the employee name to Anders for employee no. 7369
To verify the changes following command can be used:
SELECT * from emp;
UPDATE emp
SET job ='MANAGER' ,sal = 2000
WHERE empno=7369;
This will update Job and Sal column of the Emp table
made in the database to the way it was before (this only works if you haven’t already used the
Commit command). The Delete command can also be used with a condition to delete a particular row.
DELETE FROM table_name;
DELETE FROM table_name WHERE condition;
The command below will delete all the rows from table emp:
DELETE FROM emp;
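The interplay of Update, Delete, Commit and Rollback can be tried out with a short sketch. It uses Python's built-in sqlite3 module rather than Oracle, where conn.commit() and conn.rollback() play the role of the Commit and Rollback commands; the reduced emp column list and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, job TEXT, sal REAL)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(7369, "SMITH", "CLERK", 800), (7499, "ALLEN", "SALESMAN", 1600)],
)
conn.commit()

# An update that is committed becomes permanent.
conn.execute("UPDATE emp SET job = 'MANAGER', sal = 2000 WHERE empno = 7369")
conn.commit()

# A delete without a condition removes every row ...
conn.execute("DELETE FROM emp")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]

# ... but it can still be undone, because it has not been committed yet.
conn.rollback()
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
job_7369 = conn.execute("SELECT job FROM emp WHERE empno = 7369").fetchone()[0]
print(rows_after_delete, rows_after_rollback, job_7369)
```

The rollback restores the deleted rows but leaves the committed update in place.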
The following command will display all the unique values from the Deptno column of the
Emp table
The following command will display the values from all columns with all the rows, meaning it
will display the whole table
The following command will display the ename and job columns from the emp table
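The three queries described above can be sketched as follows. This is an illustration using Python's built-in sqlite3 module, not the handout's original listing; the reduced emp column list and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, job TEXT, deptno INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [
        (7369, "SMITH", "CLERK", 20),
        (7499, "ALLEN", "SALESMAN", 30),
        (7521, "WARD", "SALESMAN", 30),
    ],
)

# Unique values of the deptno column.
unique_deptnos = [
    row[0] for row in conn.execute("SELECT DISTINCT deptno FROM emp ORDER BY deptno")
]
# Every column of every row, i.e. the whole table.
whole_table = conn.execute("SELECT * FROM emp").fetchall()
# Only the ename and job columns.
names_and_jobs = conn.execute(
    "SELECT ename, job FROM emp ORDER BY empno"
).fetchall()
print(unique_deptnos, len(whole_table), names_and_jobs)
```

DISTINCT collapses the two rows for department 30 into one value, while the other two queries return one result row per table row.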
Example 02: In a situation where you need to get list of all employees (from the employee table)
whose salary is not equal to 2000. The code will be as follows:
SELECT ename, sal FROM emp
WHERE sal<> 2000;
The following table shows the result of comparing two conditions with Logical Operator:
Truth Table
Condition 01    Condition 02    OR       AND      Description
TRUE            TRUE            TRUE     TRUE     If both conditions are TRUE, AND/OR will result in TRUE
TRUE            FALSE           TRUE     FALSE    If one condition is TRUE and other is FALSE, OR Operator
                                                  will result in TRUE, AND Operator will result in FALSE
FALSE           TRUE            TRUE     FALSE    Same as the previous row
FALSE           FALSE           FALSE    FALSE    If both conditions are FALSE, AND/OR will result in FALSE
character used with MS-DOS. The percent sign allows for the substitution of zero or more
characters in a field. The underscore is similar to the MS-DOS wildcard question mark character.
The underscore allows for the substitution of a single character in an expression. Wild cards are
used where only part of the value to be searched is known rather than the exact value, i.e. when
the pattern is known but not the complete value. Wild cards are used to build search patterns in
queries. There are two wild cards as below:
Practice Scenario:
Write a query to display all information about those employees who have at least one
vowel in the ename.
Write a query to display the list of those employees who are earning more than 2000 but at
most 5998 and have at least two occurrences of E in the ename.
IIIIIII. IN Operator
The IN operator in SQL works like a series of OR conditions and allows searching for a value in a
given list of values. The basic syntax of the IN operator is shown below:
SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1,value2,...);
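To see that IN behaves like a chain of OR conditions, the following sketch runs both forms against the same rows. It uses Python's built-in sqlite3 module; the emp columns and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "ADAM", "CLERK"), (2, "BELLA", "MANAGER"), (3, "CARL", "ANALYST")],
)

# Search for a value from a given list of values with IN ...
in_rows = conn.execute(
    "SELECT ename FROM emp WHERE job IN ('CLERK', 'MANAGER') ORDER BY empno"
).fetchall()
# ... and the equivalent query written with OR.
or_rows = conn.execute(
    "SELECT ename FROM emp WHERE job = 'CLERK' OR job = 'MANAGER' ORDER BY empno"
).fetchall()
print(in_rows == or_rows, in_rows)
```

Both queries return the same rows; IN is simply shorter to write when the list of values grows.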
Solution – 1
SELECT avg(sal), max(sal) FROM emp
WHERE mgr = 7839
GROUP BY job;
Solution – 2
SELECT avg(sal), max(sal), job, mgr, sal FROM emp
WHERE mgr = 7839
GROUP BY job;
Result – Error: mgr and sal appear in the SELECT list but are neither aggregated nor written after the GROUP BY clause
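The grouping in Solution 1 can be tried out with the sketch below, which uses Python's sqlite3 module as a stand-in for Oracle and made-up emp rows. The job column is added to the SELECT list so that each group is labelled. Solution 2 is not reproduced here, because SQLite, unlike Oracle, does not reject non-grouped columns and so would not show the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT, mgr INTEGER, sal INTEGER)")
# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [
        ("A", "MANAGER", 7839, 3000),
        ("B", "MANAGER", 7839, 2000),
        ("C", "ANALYST", 7839, 4000),
        ("D", "CLERK", 7902, 800),   # different manager, filtered out by WHERE
    ],
)

# One result row per job among the employees managed by 7839.
per_job = conn.execute(
    "SELECT job, AVG(sal), MAX(sal) FROM emp "
    "WHERE mgr = 7839 GROUP BY job ORDER BY job"
).fetchall()
print(per_job)
```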
NEXT_DAY    The NEXT_DAY function returns the first weekday that is greater than a date.
            Syntax: NEXT_DAY( date, weekday )
LAST_DAY    The LAST_DAY function returns the last day of the month based on a date value.
            Syntax: LAST_DAY( date )
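To make the semantics of the two functions concrete, the sketch below emulates them in Python. The helper names next_day and last_day are hypothetical, and the weekday is passed as a number (0 = Monday ... 6 = Sunday) rather than as a day name as in Oracle.

```python
import calendar
from datetime import date, timedelta

def next_day(d, weekday):
    """First date strictly after d that falls on weekday (0=Monday ... 6=Sunday)."""
    days_ahead = (weekday - d.weekday() - 1) % 7 + 1   # always between 1 and 7
    return d + timedelta(days=days_ahead)

def last_day(d):
    """Last day of the month that contains d."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)

monday = date(2024, 1, 1)                # 1 January 2024 is a Monday
print(next_day(monday, 0))               # the following Monday, not the same day
print(last_day(date(2024, 2, 10)))       # February in a leap year
```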
A. Cartesian Product
Mathematically, the Cartesian product is a binary operation in which two objects or sets (or tables)
are combined in an “everything in combination with everything” fashion. In an SQL statement, a
Cartesian product is where every row of the first table is joined with every row of the second
table. Put simply, pairing each element of one set with every element of a
second set is called the Cartesian product. The concept is further explained by the following
example:
Example:
Suppose the following two sets:
Set A = {2, 5, 6}
Set B = {8, 1}
The Cartesian product of the above two sets would be:
A * B = {(2,8), (2,1), (5,8), (5,1), (6,8), (6,1)}
Total Number of Pairs = No. of Elements in A * No. of Elements in B
= 3 * 2 = 6
In an SQL Statement:
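A minimal sketch of the same example as a query, using Python's sqlite3 module as a stand-in for Oracle. The table names a and b and the column names x and y are hypothetical; they simply hold the elements of Set A and Set B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (y INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(2,), (5,), (6,)])   # Set A
conn.executemany("INSERT INTO b VALUES (?)", [(8,), (1,)])         # Set B

# No join condition: every row of a is paired with every row of b.
pairs = conn.execute("SELECT x, y FROM a CROSS JOIN b").fetchall()
print(len(pairs), sorted(pairs))
```

Listing both tables in the FROM clause without a WHERE condition (FROM a, b) produces the same pairs.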
AAAAAAAA. Self-Join
A self-join is a query in which a table is joined (compared) to itself. Self-joins are used to
compare values in a column with other values in the same column in the same table. A table is
typically joined with itself when it has a FOREIGN KEY which references its
own PRIMARY KEY. Joining a table to itself means that each row of the table is combined with
itself and with every other row of the table; the join condition then keeps only the matching
pairs. Self-joins are used in a recursive relationship. To
explain that further, think of a COURSE table with columns including PREREQUISITE,
COURSE_NO and others. There is a recursive relationship between PREREQUISITE and
COURSE_NO, as a PREREQUISITE is valid only if it is also a valid COURSE_NO. The basic
syntax of a self-join is:
SELECT a.column_name, b.column_name...
FROM table1 a, table1 b
WHERE a.common_field = b.common_field;
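The sketch below applies this syntax to a typical recursive relationship: an emp table whose mgr column refers to the empno of another row in the same table. It uses Python's sqlite3 module as a stand-in for Oracle, and the three rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, mgr INTEGER)")
# Hypothetical sample rows: KING has no manager, JONES reports to KING, SCOTT to JONES.
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(7839, "KING", None), (7566, "JONES", 7839), (7788, "SCOTT", 7566)],
)

# The same table appears twice under two aliases: e (employee) and m (manager).
employee_manager = conn.execute(
    "SELECT e.ename, m.ename FROM emp e, emp m "
    "WHERE e.mgr = m.empno ORDER BY e.ename"
).fetchall()
print(employee_manager)
```

KING does not appear in the result because his mgr value is NULL and therefore matches no empno.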
Orders Table
Suppliers Table
The rows for Microsoft and NVIDIA will be included because a FULL OUTER JOIN was used, but the
order_date field contains a NULL value for these two.
Suppliers Table
Orders Table
The row for 500127 (order_id) would be included because a RIGHT OUTER JOIN was used.
However, you will notice that the supplier_name field for that record contains a <null> value.
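The effect described above can be reproduced with the sketch below. It uses Python's sqlite3 module as a stand-in for Oracle, and all rows except the order_id 500127 are made-up sample data. Because older SQLite versions lack RIGHT OUTER JOIN, the sketch writes the equivalent LEFT OUTER JOIN with the tables swapped; this is an assumption of the sketch, not the handout's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER, supplier_id INTEGER, order_date TEXT)")
# Hypothetical sample rows: order 500127 refers to a supplier that does not exist.
conn.executemany("INSERT INTO suppliers VALUES (?, ?)", [(10000, "IBM"), (10001, "Microsoft")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(500125, 10000, "2013-05-12"), (500127, 10004, "2013-05-14")],
)

# Equivalent to: suppliers RIGHT OUTER JOIN orders ON the supplier_id columns.
rows = conn.execute(
    "SELECT o.order_id, o.order_date, s.supplier_name "
    "FROM orders o LEFT OUTER JOIN suppliers s ON s.supplier_id = o.supplier_id "
    "ORDER BY o.order_id"
).fetchall()
print(rows)
```

Every order is kept, and the order without a matching supplier gets a NULL (None in Python) supplier_name.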
Scenario 02: Write a query to display all those deptno where the minimum salary is less than
the average salary of all the employees.
Solution:
SELECT deptno, MIN(sal)
FROM employees
GROUP BY deptno
HAVING MIN(sal) < (SELECT AVG(sal)
FROM employees);
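The solution can be checked against a small data set with the sketch below, which uses Python's sqlite3 module as a stand-in for Oracle. The salaries are made-up sample data; their overall average is 3200.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (deptno INTEGER, sal INTEGER)")
# Hypothetical sample rows: the average of all five salaries is 3200.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(10, 1000), (10, 5000), (20, 3000), (20, 3000), (30, 4000)],
)

# The subquery is evaluated once; HAVING then filters whole groups.
below_average = conn.execute(
    "SELECT deptno, MIN(sal) FROM employees "
    "GROUP BY deptno "
    "HAVING MIN(sal) < (SELECT AVG(sal) FROM employees) "
    "ORDER BY deptno"
).fetchall()
print(below_average)
```

Department 30 is dropped because its minimum salary (4000) is not below the overall average.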
The first reference to customers_seq.nextval returns 1000. The second returns 1001. Each
subsequent reference will return a value 1 greater than the previous reference.
A. Index Basics
An index is a data structure that the database uses to find records within a table more quickly.
Indexes are built on one or more columns of a table; each index maintains a list of the values in
those columns, sorted in ascending or descending order. Rather than sorting records on the
field or fields during query execution, the system can simply access the rows in the order of the
index. In simple terms, an index is a database object which helps in efficient searching of the data.
Without an index, every time a database table is accessed, all the rows in the table are searched
to find the required data, which limits database performance. Indexes are used to quickly
locate data without having to search every row each time the table is accessed.
As an example, consider the data shown in the table. In order to find the salary of Aqeel,
for example, all the rows in the table would have to be searched. In large tables, this can
tremendously increase the run time of a query. Imagine the same scenario if the data is to be
located from different tables.
An index is like a supplement to the table and contains just two columns: the key values and a row
ID/address pointing to the actual row in the table. The index structure stores the data of the column
in sorted (for example, alphabetical) order, with the address or row ID in another column. As shown
in the above tables, after the index is created, the data is sorted alphabetically and the row ID of
each record is also stored in the index. Now, to search for any record, the index can supply the
row ID of that record.
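The idea of a two-column index (key value plus row ID) can be sketched in Python as a sorted list that is searched with binary search. This is a simplified illustration, not how Oracle stores its indexes internally; apart from the name Aqeel, the names, salaries, and the function lookup_salary are made up.

```python
from bisect import bisect_left

# The "table": the row ID is simply the position of the row in the list.
table = [("Zahid", 52000), ("Aqeel", 45000), ("Maria", 61000), ("Bilal", 38000)]

# The "index": (key value, row ID) pairs sorted by key value.
index = sorted((name, row_id) for row_id, (name, _salary) in enumerate(table))
keys = [key for key, _row_id in index]

def lookup_salary(name):
    """Find a salary through the index instead of scanning every row."""
    pos = bisect_left(keys, name)               # binary search on the sorted keys
    if pos < len(keys) and keys[pos] == name:
        row_id = index[pos][1]                  # row ID stored beside the key
        return table[row_id][1]                 # direct access to the row
    return None                                 # key not present

print(lookup_salary("Aqeel"))
```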
An index is also a database object, and data dictionary views are maintained for it. Following are the
data dictionary views maintained for indexes.
USER_INDEXES: displays all the indexes created in the schema of the currently logged-in user
USER_IND_COLUMNS: displays all the columns of the indexes created in the schema of the currently logged-in user
A. Transaction Basics
A transaction is a sequence of operations performed as a single logical unit of work, whether
manually by a user or automatically by some sort of database program. A transaction is
the propagation of one or more changes to the database. Practically, many SQL statements (DDL or
DML) can be grouped and executed together as part of a transaction. A
logical unit of work, i.e. a transaction, must exhibit four properties, called the atomicity,
consistency, isolation, and durability (ACID) properties, to qualify as a transaction. The details
are given in the following.
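The "single logical unit of work" idea can be illustrated with the sketch below, which uses Python's sqlite3 module as a stand-in for Oracle. The accounts table, its CHECK constraint, and the transfer function are hypothetical. A transfer consists of two UPDATE statements; if the second one fails, the first one is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:   # CHECK constraint violated: overdraft
        return False

def balances():
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling transfer("A", "B", 500) fails on the second UPDATE, and the credit already applied to B is undone by the rollback, so the balances are unchanged.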
WWWWWWWW. Isolation
This property states that all the steps during the execution of a transaction are done without
interference. This simply means that no two transactions can interfere with each other, and if
another transaction wants to access the same data then it should wait. In a database system where
more than one transaction is being executed simultaneously and in parallel, the property of
isolation states that each transaction will be carried out and executed as if it were the only
transaction in the system. No transaction will affect the existence of any other transaction.
XXXXXXXX. Durability
This property states that the system should be capable of holding transactional data and that, if
a transaction fails, there should be a backup and recovery process. The database should be
durable enough to hold all its latest updates even if the system fails or restarts. After a transaction
has completed, its effects are permanently in place in the system. The modifications persist even
in the event of a system failure.
A. Concurrent Transaction
Concurrent execution of database transactions in a multi-user system means that any number of
users can use the same database at the same time. Data is shared and accessed by multiple users
simultaneously and this simultaneous access and processing of data may lead to dirty data. The
uncontrolled execution of concurrent transactions in a multi-user environment can lead to various
problems. The two main problems and examples of how they can occur are listed below.
level, and practically all major database vendors support row level locks.
Column: A particular column or multiple columns are locked for a particular user.
DDDDDDDDD. Level of Locks
There are two levels of locks in a database. These are:
1. Exclusive Lock: When a statement modifies data, its transaction holds an exclusive lock on
the data that prevents other transactions from accessing the data. This lock remains in place until
the transaction holding the lock issues a commit or rollback. Table-level locking lowers
concurrency in a multi-user system.
2. Shared Lock: When a statement reads data without making any modifications, its transaction
obtains a shared lock on the data. Another transaction that tries to read the same data is
permitted to read, but a transaction that tries to update the data will be prevented from doing
so until the shared lock is released.
Using exclusive locks, the locked data can be read or processed by one user only. A request for
another exclusive lock or for a shared lock is rejected, as shown in the ‘Compatibility Graph’.
In shared locks, several users can read the same data at the same time, but as soon as a user edits
the data, a second user can no longer access this data. Requests for further shared locks are
accepted, even if they are issued by different users, but requests for exclusive locks are rejected.
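The compatibility rules described above can be written down as a small lookup table. The sketch below is illustrative only: "S" and "X" are shorthand for shared and exclusive locks, and the function can_grant is hypothetical.

```python
# (lock already held, lock requested) -> may the request be granted?
COMPATIBLE = {
    ("S", "S"): True,    # several readers may share the data
    ("S", "X"): False,   # a writer must wait for the readers
    ("X", "S"): False,   # a reader must wait for the writer
    ("X", "X"): False,   # only one writer at a time
}

def can_grant(held_locks, requested):
    """A request is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_locks)

print(can_grant(["S", "S"], "S"))   # a further shared lock on data that is being read
print(can_grant(["S"], "X"))        # an exclusive lock on data that is being read
```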
EEEEEEEEE. Deadlock
In a database, a deadlock is a situation in which two or more transactions are waiting for one
another to give up locks. For example, Transaction A might hold a lock on some rows in the
ACCOUNTS table and needs to update some rows in the ORDERS table to finish. Transaction B
holds locks on those very rows in the ORDERS table but needs to update the rows in the
ACCOUNTS table held by Transaction A. Transaction A cannot complete
because of the lock on ORDERS. Transaction B cannot complete because of the lock
on ACCOUNTS. All activity comes to a halt and remains at a standstill forever unless the DBMS
detects the deadlock and aborts one of the transactions.
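One common way to detect such a situation is to record which transaction waits for which and to look for a cycle in that "wait-for" graph. The sketch below shows the idea in Python; it is a simplified illustration and not the algorithm of any particular DBMS, and the function name has_deadlock is hypothetical.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph (transaction -> transactions it waits for) has a cycle."""
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:          # already fully explored, no cycle through here
            return False
        if txn in visiting:      # reached a transaction on the current path again: cycle
            return True
        visiting.add(txn)
        found = any(visit(other) for other in waits_for.get(txn, ()))
        visiting.discard(txn)
        done.add(txn)
        return found

    return any(visit(txn) for txn in waits_for)

# Transaction A waits for B (lock on ORDERS) and B waits for A (lock on ACCOUNTS).
print(has_deadlock({"A": ["B"], "B": ["A"]}))
```

When a cycle is found, aborting any one transaction on the cycle releases its locks and lets the others proceed.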