MCA DBMS
Introduction:
Data: Data refers to known facts (unorganized) that can be recorded on computer media. Data
consists of text, graphics, images, audio, video, etc., that have meaning in the user's environment.
Database: We can define a database as a well-organized collection of logically related data.
DBMS (Database Management System): A software application that is used to create, maintain, and
provide controlled access to databases for users.
The database approach has proven far better than the traditional file management
system. The database approach has many characteristics that make it more robust.
ACCOUNTING APPLICATIONS
An accounting system is a custom database application used to manage financial data. Custom forms are
used to record assets, liabilities, inventory and the transactions between customers and suppliers. The
income statements, balance sheets, purchase orders and invoices generated are custom reports based
upon information that is entered into the database. Accounting applications can run on a single computer
suitable for a small business or in a networked shared environment to accommodate the needs of
multiple departments and locations in larger organizations. "Microsoft Money," "Quicken,"
"QuickBooks" and "Peachtree" are accounting systems built upon database applications.
CRM APPLICATIONS
A customer relationship management system (CRM) is another example of a database application that
has been customized to manage the marketing, sales, and support relationships between a business and
its customers. The ultimate goal is to maximize sales, minimize costs, and foster strategic customer
relationships. Simple contact management programs such as "ACT," or the task manager in Microsoft's
"Outlook" can be customized to suit the needs of individuals and small businesses. "SAP,"
"Salesforce.com," and Oracle's "Siebel" are robust CRM database applications suitable for larger
enterprises.
WEB APPLICATIONS
Many contemporary web sites are built using several database applications simultaneously as core
components. Most retail store Web sites including "Bestbuy.com," and "Amazon.com" use database
systems to store, update and present data about products for sale. These Web sites also combine an
accounting database system to record sales transactions and a CRM database application to incorporate
feedback and drive a positive customer experience. The popular Web-based "Facebook" application is
essentially a database built upon the "MySQL" database system and is an indication of the increasing
usage of database applications as foundations for Web-based applications.
Hierarchical Model: In the Hierarchical Model, the user views the database as a tree in which one
level is specially designated as the root. Every segment in the hierarchy has a parent, and every parent can have
multiple children. Thus the relationship between a parent and its children is 1:M.
Network Model: In Network Model, the user views the database as a collection of records in 1:M relationships.
Network Model allows a record to have more than one parent.
Entity Relationship Model: The Entity Relationship data model consists of a collection of objects called entities and
relationships among those entities.
An entity is anything, such as a person, place, object, event, or concept in the user environment, about which
data is to be collected and stored.
Example:
Person: Employee, Student
Place : Store, State
Object-Oriented Model:
In the Object-Oriented data model, both data and their relationships are contained in a single structure called an object.
DATABASE SCHEMA:
In the SQL environment, a schema is a group of database objects, such as tables, indexes, and views, that are
related to each other. Usually a schema belongs to a single user or application. A single database can hold
multiple schemas belonging to different users or applications. Schemas are useful as they group tables by owner
and enforce a first level of security for them. Most RDBMSs automatically create a schema for each new user
when that user is created.
Syntax:
CREATE USER <User Name> IDENTIFIED BY <Password>;
GRANT <Privilege List> PRIVILEGES TO <User Name>;
Example:
CREATE USER Student IDENTIFIED BY Stu;
GRANT ALL PRIVILEGES TO Student;
Three schema architecture:
The design of a database is called its schema. Schemas are of three types: physical schema, logical schema,
and view schema.
All the schemas are logical; the actual data is stored in bit format on the disk. Physical data independence is the
power to change the physical schema without impacting the logical schema or the data as seen by applications.
DBMS Languages
DDL – the data definition language, used by the DBA and database designers to define the
conceptual and internal schemas.
The DBMS has a DDL compiler to process DDL statements in order to identify the schema
constructs, and to store the description in the catalogue.
In databases where there is a separation between the conceptual and internal schemas, DDL is
used to specify the conceptual schema, and SDL, storage definition language, is used to specify
the internal schema.
For true three-schema architecture, VDL, view definition language, is used to specify the user
views and their mappings to the conceptual schema. But in most DBMSs, the DDL is used to
specify both the conceptual schema and the external schemas.
Once the schemas are compiled, and the database is populated with data, users need to
manipulate the database. Manipulations include retrieval, insertion, deletion and modification.
The DBMS provides operations using the DML, data manipulation language.
In most DBMSs, the VDL, DDL, and DML are not considered separate languages, but one
comprehensive integrated language for conceptual schema definition, view definition, and data
manipulation. Storage definition is kept separate to fine-tune performance, and is usually handled by
the DBA staff.
An example of such a comprehensive language is SQL, which serves as a VDL, DDL, and DML, as well as
providing statements for constraint specification, etc.
High-Level / Non-Procedural
Can be used on its own to specify complex database operations.
DBMSs allow DML statements to be entered interactively from a terminal, or to be embedded in
a programming language. If the commands are embedded in a general-purpose programming
language, the statements must be identified so they can be extracted by a precompiler and
processed by the DBMS.
Low-Level / Procedural
Must be embedded in a general-purpose programming language.
Typically retrieves individual records or objects from the database and processes each separately;
therefore it needs to use programming language constructs such as loops.
Low-level DMLs are also called record-at-a-time DMLs because of this.
High-level DMLs, such as SQL, can specify and retrieve many records in a single DML
statement, and are called set-at-a-time or set-oriented DMLs.
High-level languages are often called declarative, because the DML often specifies what to
retrieve, rather than how to retrieve it.
DML Commands
When DML commands are embedded in a general-purpose programming language, the
programming language is called the host language and the DML is called the data sublanguage.
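For instance, with JDBC as the call-level interface, SQL acts as the data sublanguage embedded in the Java host language; a precompiler-based embedded SQL system would extract the same statements at compile time instead. A minimal sketch, reusing the svarts/scott/tiger connection and emp table that appear later in these notes:
import java.sql.*;

class EmbeddedDmlDemo {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection("jdbc:odbc:svarts", "scott", "tiger");
        Statement st = con.createStatement();
        // a high-level, set-at-a-time DML statement handed to the DBMS as a string
        ResultSet rs = st.executeQuery("select ename from emp");
        while (rs.next())                        // the host language processes the result set
            System.out.println(rs.getString("ename"));
        con.close();
    }
}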
Forms-Based Interfaces
Displays a form to each user.
User can fill out form to insert new data or fill out only certain entries.
Designed and programmed for naïve users as interfaces to canned transactions.
The database and the DBMS catalogue are usually stored on disk. Access to the disk is
controlled primarily by the operating system (OS), which schedules disk input/output. A higher-level
stored data manager module of the DBMS controls access to DBMS information that is stored on the
disk.
The top part of the figure shows the interfaces for casual users, DBA staff, application programmers, and
parametric users.
The DDL compiler processes schema definitions, specified in the DDL, and stores the
description of the schema in the DBMS catalogue. The catalogue includes information such as the
names and sizes of files, the data types of data items, storage details of every file, mapping
information among schemas, and constraints.
Casual users, and persons with only an occasional need for information from the database, interact
through an interactive query interface. Their queries are parsed and analysed for correctness of the
operations, of the names of data elements, and so on by a query compiler that compiles them into an
internal form. This internal query is subjected to query optimization; the query optimizer is concerned
with the rearrangement and possible reordering of operations and the elimination of redundancies.
Application programmers write programs in host languages. A precompiler extracts the DML
commands from an application program.
2. Database System Utilities
Loading: A loading utility is used to load existing data files-such as text files or sequential files-
into the database. Usually, the current (source) format of the data file and the desired (target)
database file structure are specified to the utility, which then automatically reformats the data and
stores it in the database. With the proliferation of DBMSs, transferring data from one DBMS to
another is becoming common in many organizations. Some vendors are offering products that
generate the appropriate loading programs, given the existing source and target database storage
descriptions (internal schemas). Such tools are also called conversion tools.
Backup: A backup utility creates a backup copy of the database, usually by dumping the entire
database onto tape. The backup copy can be used to restore the database in case of catastrophic
failure. Incremental backups are also often used, where only changes since the previous backup
are recorded. Incremental backup is more complex but saves space.
File Reorganization: This utility can be used to reorganize a database file into a different file
organization to improve performance.
Performance Monitoring: Such a utility monitors database usage and provides statistics to the
DBA. The DBA uses the statistics in making decisions such as whether or not to reorganize files
to improve performance.
Two-tier client/server architecture is used where the user interface programs and application programs
run on the client side. An interface called ODBC (Open Database Connectivity) provides an API that allows
client-side programs to call the DBMS; most DBMS vendors provide ODBC drivers. A client program may
connect to several DBMSs. In this architecture some variation is possible; for example, in some DBMSs
more functionality, such as the data dictionary and query optimization, is transferred to the client, and the
server, which then mainly supplies data, is called a data server.
Three-tier Client / Server Architecture
In the three-tier architecture, an intermediate layer, an application server or Web server, is added
between the client and the database server; it runs the application programs and business rules that are
used to access data from the database server.
Attributes:
An Attribute is a property or characteristic of an Entity type. Based on the importance and behaviour,
attributes are divided into the following types.
Required Attributes:
A required attribute is an attribute that must have a value for each entity occurrence
of an entity set. So a required attribute of an entity set should not be left empty.
Optional Attribute:
An optional attribute is an attribute that may or may not have a value for each
occurrence of an entity set. So an optional attribute can be left empty.
Identifier:
An identifier is an attribute or set of attributes that uniquely identifies each instance of
an entity set. The attributes that are part of an identifier are underlined in ER Diagrams. When the
identifier contains only one key attribute, then it is called simple identifier otherwise it is called
composite identifier.
Composite Attribute:
The attribute that can be sub-divided to yield additional attributes is called as a
“Composite Attribute”.
Simple Attribute:
The attribute that cannot be sub-divided to yield additional attributes is called as a
“Simple Attribute”.
Single-Valued Attribute:
A single-valued attribute can have only one value for a given entity instance.
Multi-Valued Attribute:
A multi-valued attribute can have more than one value for a given entity instance.
Stored Attribute:
A stored attribute is an attribute about which the designer wants to store data in the
database.
Derived Attribute:
A derived attribute is an attribute whose value can be calculated from other stored
attributes.
Relationships:
A relationship is a meaningful association between entities. Each relationship is
identified by a name that describes the relationship. The entities that participate in a relationship are
called participants.
The relationship between two entities might be a strong relationship or a weak
relationship. The relationship between two strong entities is called as strong relationship or identifying
relationship and the relationship between two weak entities or one strong and one weak entity is called
as weak relationship or non-identifying relationship.
Relationship Participation:
The participation in an entity relationship is either Optional or Mandatory.
Optional participation of a relationship means that the occurrence of one entity does not require a
corresponding entity occurrence in that particular relationship.
Mandatory participation of a relationship means that the occurrence of one entity requires a
corresponding entity occurrence in that particular relationship.
Example: Consider two entities, PROFESSOR and CLASS. Some professors conduct research
without teaching any class, but each class must be conducted by a professor. Hence the entity CLASS is
optional for the entity PROFESSOR in the relationship "PROFESSOR teaches CLASS", while the entity
PROFESSOR is mandatory for the entity CLASS in the same relationship.
Unary Relationship:
When an association is maintained within a single entity in a relationship, it is called a
"Unary Relationship."
Binary Relationship:
When an association is maintained between two entities in a relationship, it is called a
"Binary Relationship." The majority of relationships are binary relationships.
Ternary Relationship:
When an association is maintained among three entities in a relationship, it is called a
"Ternary Relationship."
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are
called Relational Integrity Constraints. There are three main integrity constraints −
Key constraints
Domain constraints
Referential integrity constraints
Key Constraints
There must be at least one minimal subset of attributes in the relation, which can identify a tuple
uniquely. This minimal subset of attributes is called key for that relation. If there are more than one such
minimal subsets, these are called candidate keys.
Key constraints enforce that −
In a relation with a key attribute, no two tuples can have identical values for the key attributes.
A key attribute cannot have NULL values.
Key constraints are also referred to as entity constraints.
Domain Constraints
Attributes have specific values in real-world scenarios. For example, age can only be a positive integer.
Similar constraints are employed on the attributes of a relation: every attribute is bound to
have a specific range of values. For example, age cannot be less than zero and telephone numbers cannot
contain a digit outside 0-9.
In relational data model, the understanding and usage of keys is very important. Keys are used to identify
individual records of table uniquely. They are also used to represent relationships between tables and maintain
integrity of data.
Key:
A key is an attribute, or combination of attributes, that uniquely identifies a tuple and thereby determines the other attributes.
Relational algebra is a procedural query language, which takes instances of relations as input and
yields instances of relations as output. It uses operators to perform queries. An operator can be
either unary or binary. They accept relations as their input and yield relations as their output.
Relational algebra is performed recursively on a relation and intermediate results are also considered
relations.
Select
Project
Union
Set difference
Cartesian product
Rename
Select Operation ( σ ): It selects tuples that satisfy the given predicate from a relation.
Notation − σp(r)
Where σ stands for the selection predicate and r stands for the relation. p is a propositional logic formula which
may use connectors like and, or, and not. These terms may use relational operators such as =, ≠, ≥, <, >,
≤.
For example −
σsubject = "database"(Books)
Output − Selects tuples from books where subject is 'database'.
Set Difference ( − ): Notation − r − s. The result contains the tuples that are in r but not in s.
Cartesian Product ( Χ ): Notation − r Χ s, where
r Χ s = { q t | q ∈ r and t ∈ s }
Rename ( ρ ): Notation − ρ x (E), which returns the result of expression E under the name x.
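As a short worked example (Books(title, subject, price) is an assumed schema, consistent with the σ example above):
π title ( σ subject = "database" (Books) )
This first selects the tuples of Books whose subject is 'database', and then projects only the title attribute of those tuples; the intermediate result of the selection is itself a relation, which the projection takes as its input.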
There are several processes and algorithms available to convert ER Diagrams into Relational Schema.
Some of them are automated and some of them are manual. We may focus here on the mapping
diagram contents to relational basics.
Mapping Entity
An entity is a real-world object with some attributes.
Mapping Relationship
A relationship is an association among entities.
Mapping Process
Create a table for each weak entity set.
Create tables for all higher-level entities.
A bottom-up approach design methodology considers the basic relationships among individual attributes
and uses those to construct relation schema.
If we integrate these two and use them as a single table, i.e., a Student table:
Whenever we insert tuples, there may be 'N' students in one department, so the
Dept No and Dept Name values are repeated 'N' times, which leads to data redundancy.
Another problem is update anomalies, i.e., we cannot insert a new department that has no students.
If we delete the last student of a department, the whole information about that department will be
deleted.
If we change the value of one of the department attributes, we must update the tuples
of all the students belonging to that department, or else the database will become
inconsistent.
Note: Design tables in such a way that no insertion, deletion, or modification anomalies occur.
Note: The relations should be designed to satisfy the lossless join condition. No spurious tuples should
be generated by doing a natural-join of any relations.
EMPLOYEE
One goal of schema design is to minimize the storage space used by the base relations.
Another problem with using such relations as base relations is the problem of update anomalies.
These can be classified into insertion, deletion, and modification anomalies.
When many of the attributes do not apply to all tuples in the relation, we end up with many NULLs in
those tuples. This can waste space at the storage level and may also lead to problems with
understanding the meaning of attributes and with specifying JOIN operations at the logical level.
Definition: The set of all dependencies that includes F, as well as all dependencies that can be inferred
from F, is called the closure of F; it is denoted by F+.
F = { ENO → {ENAME, DOB, ADDRESS, DNUMBER}, DNUMBER → {DNAME, DMGRENO} }
Some of the additional functional dependencies that we can infer from F are the following:
ENO → {DNAME, DMGRENO} and DNUMBER → DNAME
The closure F+ of F is the set of all functional dependencies that can be inferred from F. To
determine a systematic way to infer dependencies, we must discover a set of inference rules that can
be used to infer new dependencies from a given set of dependencies.
We use the notation F ⊨ X→Y to denote that the functional dependency X→Y is inferred from F. The
following six rules, IR1 through IR6, are well-known inference rules for functional dependencies.
Definition: Two sets of functional dependencies E and F are equivalent if E+ = F+. Hence equivalence
means that every FD in E can be inferred from F, and every FD in F can be inferred from E; i.e., E is
equivalent to F if both the conditions 'E covers F' and 'F covers E' hold.
We can formally define a set of functional dependencies F to be minimal if it satisfies the following
conditions.
Every dependency in F has a single attribute for its right-hand side.
We cannot replace any dependency X→A in F with a dependency Y→A, where Y is a proper
subset of X, and still have a set of dependencies that is equivalent to F.
We cannot remove any dependency from F and still have a set of dependencies that is
equivalent to F.
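Testing whether a dependency belongs to F+ is usually mechanized through the closure of an attribute set. A minimal sketch in Java, using the example F given above (the FD representation is an illustrative assumption):
import java.util.*;

class ClosureDemo {
    record FD(Set<String> lhs, Set<String> rhs) {}   // one functional dependency X -> Y

    // Closure of an attribute set under a set of FDs: repeatedly apply any FD
    // whose left-hand side is already contained in the result.
    static Set<String> closure(Set<String> attrs, List<FD> fds) {
        Set<String> result = new HashSet<>(attrs);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (FD fd : fds)
                if (result.containsAll(fd.lhs) && result.addAll(fd.rhs))
                    changed = true;
        }
        return result;
    }

    public static void main(String[] args) {
        List<FD> f = List.of(
            new FD(Set.of("ENO"), Set.of("ENAME", "DOB", "ADDRESS", "DNUMBER")),
            new FD(Set.of("DNUMBER"), Set.of("DNAME", "DMGRENO")));
        // {ENO}+ contains DNAME and DMGRENO, so ENO -> {DNAME, DMGRENO} is in F+
        System.out.println(closure(Set.of("ENO"), f));
    }
}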
Functional Dependency
A Functional Dependency is a constraint between two attributes or two sets of attributes. The
keys are identified based on the theory of Functional Dependency.
If there are two attributes A and B, then attribute B is said to be functionally dependent on
attribute A if each value of A determines one and only one value of B.
The functional dependency between the attributes A and B can be written as A → B. For example, in
EMPLOYEE(Empno, Ename), Empno → Ename: each employee number determines exactly one employee name.
Normalization is the process of evaluating and correcting a poor table structure to minimize data
redundancy and anomalies.
Normalization works through a series of stages called Normal Forms.
First Normal Form(1 NF):
A relation is said to be in 1NF if it does not have repeating groups of attributes; the
value at each row and column intersection must be atomic (a single value).
In 1NF, the normalization process starts with a 3-step procedure.
EMPLOYEE
Empno   Name   Age   Address
101     Ravi   22    Gudur, 1-2-44, Raja street
102     Siva   21    Nellore, 5-3-202, Ramlingapuram
This relation is not in 1NF because it has a multivalued attribute, Address. To convert this
relation into 1NF, replace the multivalued attribute with its sub-attributes.
A relation is said to be in 2NF, if it is in 1NF and every non-key attribute is fully functionally
dependent on the primary key.
A relation in 2NF satisfies the following conditions:
1. It must not contain any partial functional dependencies.
2. Every non-key attribute must be fully functionally dependent on the whole set of primary key attributes.
Consider the following relation and its functional dependencies.
STUDENT
The above relation is not in 2NF; here the primary key (SNO, COURSE) is a composite primary key,
and the functional dependencies are:
1. SNO → SNAME, AGE, GROUP
2. SNO, COURSE → FEES
In the first functional dependency, SNO, which is only a part of the composite primary key, determines
non-key attributes; hence this dependency is called a partial functional dependency. This partial
dependency creates redundancy in the relation, which results in anomalies.
1. Insertion anomaly: if we want to insert the details of a student who has not joined any course, we
must insert null values for the course-related attributes, which wastes memory.
2. Deletion anomaly: if we delete a student's details, the related course details are also
removed, so there is a loss of information.
3. Update anomaly: if we want to modify a student's details, we must modify several tuples, which takes more time.
To overcome these problems, we must decompose the above relation into two relations:
STUDENT and STUDENT_COURSE
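In SQL, the decomposition can be carried out with two CREATE TABLE statements; a sketch via JDBC, in the style of the indexing example later in these notes (the column types are assumptions, and GROUP is renamed GRP because GROUP is a reserved word in SQL):
import java.sql.*;

class Decompose2NF {
    public static void main(String[] args) throws SQLException {
        Connection con = DriverManager.getConnection("jdbc:odbc:svarts", "scott", "tiger");
        Statement st = con.createStatement();
        // attributes fully dependent on SNO alone stay in STUDENT
        st.execute("CREATE TABLE STUDENT (SNO INT PRIMARY KEY, "
                 + "SNAME VARCHAR(30), AGE INT, GRP VARCHAR(10))");
        // FEES depends on the whole key (SNO, COURSE), so it moves to STUDENT_COURSE
        st.execute("CREATE TABLE STUDENT_COURSE (SNO INT, COURSE VARCHAR(20), "
                 + "FEES INT, PRIMARY KEY (SNO, COURSE))");
        st.close();
        con.close();
    }
}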
When a relation has more than one candidate key, anomalies may result even though that
relation is in 3NF.
Consider the following relation STUDENT in 3NF, but not in BCNF.
STUDENT
Subject   Lecturer     TextBook
Cpp       Rahul        Balaguruswamy
Java      Roopsagar    S.Hortsmann
Java      KiranKumar   ORelly
(and so on for the remaining Subject-Lecturer-TextBook combinations)
In this relation there is a multivalued dependency between the attributes Subject, Lecturer and
TextBook; that is, Subject →→ Lecturer and Subject →→ TextBook.
Now, to eliminate the multivalued dependency for this relation by dividing the relation into 2
relations each of these relations contains 2 attributes that have a multivalued relationship in the
original relation.
Course_Lecturer
Subject   Lecturer
Cpp       Rahul
Cpp       Vikram
Cpp       Srinivas
Java      Pratap
Java      Roopsagar
Java      KiranKumar

Course_TextBook
Subject   TextBook
Cpp       Balaguruswamy
Cpp       K.R Venugopal
Cpp       Eric S.Roberts
Java      Herbert shildt
Java      S.Hortsmann
Java      ORelly

Each of these relations is in 4NF.
If we can decompose a table further to eliminate redundancy and anomalies, then when we
re-join the decomposed tables by means of candidate keys, we should not lose the original
data, and no new records should arise. In simple words, joining two
or more decomposed tables should neither lose records nor create new records.
Hence we have to decompose the table in such a way that it satisfies all the rules up to
4NF, and when we join the pieces by using keys, the join should yield the correct records. Here we can represent
each lecturer's subject area and their classes in a better way: we can divide the above table into
three - (SUBJECT, LECTURER), (LECTURER, CLASS), (SUBJECT, CLASS)
A set of relations that together form the relational database schema must possess certain
additional properties to ensure a good design, such as the dependency preservation and
lossless-join properties.
I. Relation decomposition:
We start with a single universal relation schema R = {A1, A2, . . ., An} that includes all the attributes of the
database, where every attribute name is unique.
Using the functional dependencies, the design algorithms decompose the universal relation
schema R into a set of relation schemas
D = {R1, R2, …, Rm} that will become the relational database schema; D is called a
decomposition of R.
Each attribute in R must appear in at least one relation schema Ri in the
decomposition, so that no attributes are lost.
FD1 : STUDENT.COURSE → INSTRUCTOR
FD2 : INSTRUCTOR → COURSE
Decomposing this relation schema into two relation schemas is possible in one of the following
pairs.
Definition: A decomposition D = {R1, R2, …, Rm} of R has the lossless (nonadditive) join property with respect to the set
of dependencies F on R if, for every relation state r of R that satisfies F, the natural join of the projections
of r onto the Ri equals r: *(πR1(r), …, πRm(r)) = r.
The word loss here refers to loss of information, not to loss of tuples. If a decomposition does not
have the lossless join property, we may get additional spurious tuples after the PROJECT (π) and
NATURAL JOIN (*) operations are applied.
The decomposition of
EMP-PROJ (ENO, PNUMBER, HOURS, ENAME, PNAME, PLOCATION)
into
EMP-LOCS (ENAME, PLOCATION) and
EMP-PROJ1 (ENO, PNUMBER, HOURS, PNAME, PLOCATION)
Suppose we use EMP-LOCS and EMP-PROJ1 as the base relations instead of EMP-PROJ. This produces a
bad schema design, because we cannot recover the information that was originally in EMP-PROJ. If we
perform a natural join operation on EMP-LOCS and EMP-PROJ1, the result produces many more tuples than
the original set of tuples in EMP-PROJ. The additional tuples that are not in EMP-PROJ are
called spurious tuples; they represent wrong information that is not valid.
The following algorithm decomposes a universal relation schema R = {A1, A2, …, An} into a decomposition
D = {R1, R2, …, Rm} such that each Ri is in BCNF and the decomposition D has the lossless join property with
respect to F.
Input: a universal relation R and a set of functional dependencies F on the attributes of R.
1. Set D = {R};
2. While there is a relation schema Q in D that is not in BCNF do
{
a. choose a relation schema Q in D that is not in BCNF;
b. find a functional dependency X→Y in Q that violates BCNF;
c. replace Q in D by the two relation schemas (Q − Y) and (X ∪ Y);
};
For example, consider the relation EMP, which represents the fact that an employee whose name
is ENAME works on the project whose name is PNAME and has a dependent whose name is DNAME. An
employee may work on several projects and may have several dependents, and the employee's
projects and dependents are independent of one another.
Whenever X→→Y holds, we say that X multidetermines Y. Because of the symmetry in the definition,
whenever X→→Y holds in R, so does X→→Z, where Z = R − (X ∪ Y). Hence X→→Y implies X→→Z, and
therefore it is sometimes written as X→→Y|Z.
Another dependency, called a join dependency, if present, requires carrying out a multiway
decomposition into 5NF.
A join dependency JD, denoted by JD(R1, R2, …, Rn) and specified on relation schema R, specifies a
constraint on the states r of R. The constraint states that every legal state r of R should have a
nonadditive join decomposition into R1, R2, …, Rn; that is, for every such r we have
*(πR1(r), πR2(r), …, πRn(r)) = r
A join dependency JD(R1, R2, …, Rn) specified on relation schema R is a trivial JD if one of the relation
schemas Ri in JD(R1, R2, …, Rn) is equal to R. Such a dependency is called trivial because it has
the nonadditive join property for any relation state r of R and hence does not specify any
constraint on R.
Definition: A relation schema R is in 5NF, or project-join normal form (PJNF), with respect to a set F
of functional, multivalued, and join dependencies if, for every nontrivial join dependency
JD(R1, R2, …, Rn) in F+, every Ri is a superkey of R.
A database system is classified according to number of users who can use the system
concurrently.
A DBMS is single-user if at most one user at a time can use the system, and it is multi-user if
many users can use the system.
Multi-user DBMSs are found in airline reservation systems, banks, insurance agencies, stock exchanges,
and supermarkets, where many users submit transactions concurrently to the system.
Multiple users can access databases thanks to multiprogramming, which allows the computer to
execute multiple programs at the same time. A multiprogramming OS executes some commands of one
process, then suspends that process and executes some commands from the next process, and so on.
The concurrent execution of processes is thus interleaved; it keeps the CPU busy when a process
requires an input/output operation: the CPU is switched to execute another process rather than
remain idle during I/O time.
2. Transactions, Read and Write operations and DBMS Buffer:
A transaction is a logical unit of work, it includes one or more database access operations such as
insertion, deletion or retrieval operations.
The database operations that form a transaction can either be embedded within an application
program or specified interactively with a high-level query language such as SQL.
One way of specifying the transaction boundaries is by specifying explicit ‘Begin transaction’
and ‘End Transaction’ statements in an application program, all the database access operations
between these two statements are considered as one transaction.
A single application program may contain more than one transaction if it contains several
transaction boundaries.
A database is basically represented as a collection of named data items. The size of a data item is
called its 'granularity'; a data item may be a field of some record or it may be a disk block.
The basic database access operations that a transaction can include are,
read_item(x): Reads a data base item x into a program variable x.
write_item(x): Writes the value of variable x into the database item x.
For read and write operations the item values in disk block are copied into a buffer in the main
memory.
3. Need of Concurrency Control:
T1                                  T2
read_item(X);
X := X - N;
                                    read_item(X);
                                    X := X + M;
write_item(X);
read_item(Y);
                                    write_item(X);
Y := Y + N;
write_item(Y);
The lost update problem:
This problem occurs when two transactions access the same database item and one of the
updates is lost. Transactions T1 and T2 are submitted at the same time and their operations are
interleaved; the final value of item X is then incorrect, because T2 reads the value of X
before T1 changes it in the database, and hence the updated value resulting from T1 is lost.
Let X = 80, N = 5 and M = 4. The final result should be X = 79, but the interleaving produces the incorrect result X = 84.
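The arithmetic can be traced with a small sequential simulation of that interleaving (a sketch; a real DBMS interleaves concurrent processes, which is only mimicked here by the statement order):
class LostUpdateDemo {
    static int dbX = 80;                  // database item X

    public static void main(String[] args) {
        final int N = 5, M = 4;
        int x1 = dbX;                     // T1: read_item(X)
        x1 = x1 - N;                      // T1: X := X - N
        int x2 = dbX;                     // T2: read_item(X), before T1 writes
        x2 = x2 + M;                      // T2: X := X + M
        dbX = x1;                         // T1: write_item(X) -> 75
        dbX = x2;                         // T2: write_item(X) -> 84, T1's update is lost
        System.out.println(dbX);          // prints 84; a serial execution gives 79
    }
}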
Committed − If a transaction executes all its operations successfully, it is said to be committed. All its
effects are now permanently established on the database system.
The system maintains a Log to keep track of all transaction operations that affect the values of database
items. This information may be needed to permit recovery from failures.
The log is kept on disk, so it is not affected by any type of failure except a disk crash.
In the Log records, T refers to a unique Transaction_Id that is generated automatically by the system and
it is used to identify each transaction.
1. [start_transaction, T]: Indicates that transaction T has started execution.
2. [write_item, T, x, old_value, new_value]: Indicates that transaction T has changed the value of
database item x from old_value to new_value.
3. [read_item, T ,x]: indicates that transaction T has read the value of database item x.
4. [commit, T]: indicates that transaction T has completed successfully and recorded permanently
to the database.
5. [abort, T]: indicates that transacation T has been aborted.
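For example, a transaction T1 that changes item X from 80 to 75 (hypothetical values) would leave the following trail in the log:
[start_transaction, T1]
[read_item, T1, X]
[write_item, T1, X, 80, 75]
[commit, T1]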
Consistency − The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database. If the database
was in a consistent state before the execution of a transaction, it must remain consistent after the
execution of the transaction as well.
Durability − The database should be durable enough to hold all its latest updates even if the
system fails or restarts. If a transaction updates a chunk of data in a database and commits, then
the database will hold the modified data. If a transaction commits but the system fails before the
data could be written on to the disk, then that data will be updated once the system springs back
into action.
Isolation − In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that each transaction will be
carried out and executed as if it were the only transaction in the system. No transaction will affect
the existence of any other transaction.
Characterizing schedules:
Schedules of Transaction:
When transactions are executing concurrently in an interleaved manner, then the order of
execution of operations from various transactions is known as schedule.
A schedule S of n transactions T1, T2, …, Tn is an ordering of the operations of the transactions
such that, for each transaction Ti that participates in S, the operations of Ti in S must appear in the same
order in which they occur in Ti.
For describing schedules, the symbols r, w, c, and a are used for the operations read_item,
write_item, commit, and abort respectively.
Sa is the schedule of the lost update problem transactions:
Sa = r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
For some schedules it is easy to recover from transaction failures. It is important to characterize the
types of schedules for which recovery is possible as well as those for which recovery is relatively simple.
1. Once a transaction T is committed, it should never be necessary to roll it back. Schedules that meet this
criterion are called recoverable schedules, and those that do not are called non-recoverable schedules. In a
recoverable schedule, a transaction T reads from a transaction T1 if T1 writes an item x that is later read by T; T1
should not have been aborted before T reads item x, and there should be no transactions that write x after
T1 writes it and before T reads it.
Consider the schedule S1a, which is the same as Sa except that two commit operations have been added to
Sa.
2. A schedule is said to be cascadeless, or to avoid cascading rollback, if every transaction in the schedule
reads only items that were written by committed transactions.
S1 = r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); c1; c2; In this schedule the r2(x) command must be postponed
until after T1 has committed, thus delaying T2 but ensuring that no cascading rollback occurs if T1 aborts.
3. A more restrictive type of schedule is a strict schedule, in which transactions can neither read nor write an
item x until the last transaction that wrote x has committed or aborted. Strict schedules simplify the recovery
process.
Suppose the original value of x is 9, which is the before image stored in the log. If T1 aborts, the recovery
procedure that restores the before image of an aborted write operation will restore the value of x to 9, even
though it has already been changed to 8 by T2, thus leading to potentially incorrect results. A strict schedule
does not have this problem.
Serial schedule: A schedule S is serial if, for every transaction T participating in the schedule, all the operations of
T are executed consecutively in the schedule. Otherwise, the schedule is called a non-serial schedule.
Schedule A (serial, T1 followed by T2):
T1: read_item(X); X := X - N; write_item(X); read_item(Y); Y := Y + N; write_item(Y);
T2: read_item(X); X := X + M; write_item(X);

Schedule B (serial, T2 followed by T1):
T2: read_item(X); X := X + M; write_item(X);
T1: read_item(X); X := X - N; write_item(X); read_item(Y); Y := Y + N; write_item(Y);

Schedule C (non-serial, interleaved):
T1: read_item(X); X := X - N;
T2: read_item(X); X := X + M;
T1: write_item(X); read_item(Y);
T2: write_item(X);
T1: Y := Y + N; write_item(Y);

Schedule D (non-serial, interleaved):
T1: read_item(X); X := X - N; write_item(X);
T2: read_item(X); X := X + M; write_item(X);
T1: read_item(Y); Y := Y + N; write_item(Y);

Schedules A and B are called serial because the operations of each transaction are executed consecutively;
schedules C and D are called non-serial because each interleaves operations from the two transactions.
Serializable Schedule: A schedule S is serializable if it is equivalent to some serial schedule of the same n
transactions.
Result equivalent: Two schedules are called result equivalent if they produce the same final state of the
database.
Conflict equivalent: Two schedules are said to be conflict equivalent if the order of any two conflicting
operations is the same in both schedules.
If two conflicting operations are applied in different orders in two schedules, the effect can be different on the
database or on the other transactions in the schedule, and hence the schedules are not conflict equivalent.
Ex: Consider the operations r1(X); w2(X) in a schedule S1 and the reverse order w2(X); r1(X) in a schedule S2:
the value read by r1(X) can be different in the two schedules.
Conflict serializable:
A schedule S is said to be conflict serializable if it is conflict equivalent to some serial schedule.
Being serializable is not the same as being serial.
Being serializable implies that the schedule is a correct concurrent schedule.
We have concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories − 1. Lock based
protocols and 2. Time stamp based protocols
1. Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any transaction
cannot read or write data until it acquires an appropriate lock on it.
Locks are of two kinds −
Binary Locks − A lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive − This type of locking mechanism differentiates the locks based on their uses. If a lock is
acquired on a data item to perform a write operation, it is an exclusive lock. Allowing more than one transaction
to write on the same data item would lead the database into an inconsistent state. Read locks are shared
because no data value is being changed.
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a 'write' operation is
performed. Transactions may unlock the data item after completing the ‘write’ operation.
1. Expanding phase (Growing phase): locks are acquired and no locks are released (the number of locks can
only increase).
2. Shrinking phase: locks are released and no locks are acquired.
The two phase locking rule can be summarized as: never acquire a lock after a lock has been released.
The serializability property is guaranteed for a schedule with transactions that obey this rule.
Typically, without explicit knowledge in a transaction on end of phase-1, it is safely determined only
when a transaction has completed processing and requested commit. In this case all the locks can be
released at once (phase-2).
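A minimal sketch of the rule in Java (the lock-table shape and item names are assumptions; a real DBMS lock manager is far more elaborate):
import java.util.*;
import java.util.concurrent.locks.*;

class TwoPhaseTransaction {
    private final Map<String, ReentrantReadWriteLock> lockTable;
    private final List<Lock> held = new ArrayList<>();
    private boolean shrinking = false;          // set once the first lock is released

    TwoPhaseTransaction(Map<String, ReentrantReadWriteLock> lockTable) {
        this.lockTable = lockTable;
    }

    // growing phase: locks may be acquired only while nothing has been released
    void writeLock(String item) {
        if (shrinking)
            throw new IllegalStateException("2PL violated: lock acquired after a release");
        Lock l = lockTable.computeIfAbsent(item, k -> new ReentrantReadWriteLock()).writeLock();
        l.lock();
        held.add(l);
    }

    // shrinking phase: here all locks are released at once at commit (strict variant)
    void commit() {
        shrinking = true;
        for (Lock l : held) l.unlock();
        held.clear();
    }
}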
Lock compatibility for SCO
Lock type     read-lock    write-lock
read-lock     compatible   X
write-lock    X            X
All database READ and WRITE operations within the same transaction must have the same
timestamp. The DBMS executes conflicting operations in timestamp order, which ensures serializability
of the transactions. If two transactions conflict, one is stopped, rolled back, rescheduled, and assigned
the same timestamp value.
There are two schemes to control conflicting transactions by using time stamp methods:
1. Wait/Die Scheme
2. Wound/Wait Scheme
Wait/Die Scheme:
If there are two conflicting transactions each assigned a time stamp value, Wait/Die scheme follows the
following rules to control them.
1. If the transaction requesting the lock is the older of the two transactions, it will wait until the other
transaction is completed and the locks are released.
2. If the transaction requesting the lock is the younger of the two transactions, it will die (rollback) and
is rescheduled using the same time stamp value.
Wound/Wait Scheme:
If there are two conflicting transactions each assigned a time stamp value, Wound/Wait scheme follows
the following rules to control them.
1. If the transaction requesting the lock is the older of the two transactions, it will preempt (wound) the
younger transaction. The younger preempted transaction is rescheduled using the same time stamp
value.
2. If the transaction requesting the lock is the younger of the two transactions, it will wait until the other
transaction is completed and the locks are released.
In both the schemes, one of the transactions waits for other transaction to finish and release the locks .
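Both schemes reduce to a single comparison of timestamps; a sketch (smaller timestamp = older transaction; rollback is reported rather than actually performed):
class DeadlockPrevention {
    enum Action { WAIT, ROLLBACK }

    // Wait/Die: an older requester waits; a younger requester dies (is rolled back)
    static Action waitDie(long requesterTs, long holderTs) {
        return requesterTs < holderTs ? Action.WAIT : Action.ROLLBACK;
    }

    // Wound/Wait: an older requester wounds (rolls back) the younger lock holder;
    // a younger requester waits
    static Action woundWait(long requesterTs, long holderTs) {
        return requesterTs < holderTs ? Action.ROLLBACK : Action.WAIT;
    }

    public static void main(String[] args) {
        System.out.println(waitDie(1, 2));     // older requester -> WAIT
        System.out.println(woundWait(1, 2));   // older requester -> the holder is wounded
    }
}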
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol. This protocol uses
either system time or logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the time of
execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the
transaction. A transaction created at 0002 clock time would be older than all other transactions that
come after it. For example, any transaction 'y' entering the system at 0004 is two seconds younger and
the priority would be given to the older one.
In addition, every data item is given the latest read and write-timestamp. This lets the system know
when the last ‘read and write’ operation was performed on the data item.
Timestamp Ordering Protocol
The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and writes
operations. This is the responsibility of the protocol system that the conflicting pair of tasks should be executed
according to the timestamp values of the transactions.
The timestamp of transaction Ti is denoted as TS(Ti).
Read time-stamp of data-item X is denoted by R-timestamp(X).
Write time-stamp of data-item X is denoted by W-timestamp(X).
Timestamp ordering protocol works as follows −
Ti issues a read(X) operation:
- If TS(Ti) < W-timestamp(X), the operation is rejected (Ti is rolled back).
- If TS(Ti) >= W-timestamp(X), the operation is executed and the data-item read timestamp is updated.

Ti issues a write(X) operation:
- If TS(Ti) < R-timestamp(X), the operation is rejected.
- If TS(Ti) < W-timestamp(X), the operation is rejected and Ti is rolled back.
- Otherwise, the operation is executed.
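A sketch of these checks for a single data item X (timestamps are plain longs; rollback is reported by the return value rather than actually performed):
class TimestampOrdering {
    static long readTs = 0, writeTs = 0;   // R-timestamp(X) and W-timestamp(X)

    // Ti issues read(X): reject if a younger transaction has already written X
    static boolean read(long ts) {
        if (ts < writeTs) return false;    // rejected: Ti would be rolled back
        readTs = Math.max(readTs, ts);     // executed: R-timestamp(X) updated
        return true;
    }

    // Ti issues write(X): reject if a younger transaction already read or wrote X
    static boolean write(long ts) {
        if (ts < readTs || ts < writeTs) return false;  // rejected: Ti rolled back
        writeTs = ts;                      // executed: W-timestamp(X) updated
        return true;
    }
}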
Hardware failures may be of several types, such as hard disk failure, a bad capacitor on the motherboard, or failing
memory. Software failures may occur in application programs, operating systems, etc. These are the most
common causes of database failures.
Sometimes end-users make mistakes, unintentionally or intentionally. Unintentional errors include
mistakes like deleting or updating the wrong rows of a table, choosing inappropriate options, etc. Intentional
actions, like stealing data by accessing unauthorized data resources or attacking with viruses, cause major problems.
Natural Disasters:
After natural disasters such as fires, earthquakes, floods, and power failures, it is critical, and difficult,
to bring the system back to an operational state.
Transaction Recovery:
Database transaction recovery uses data in the transaction log to recover a database from an
inconsistent state to a consistent state.
There are the following concepts that affect the recovery process:
1. The write-ahead-log protocol ensures that transaction logs are always written before any database data are
actually updated. In case of a failure, the database can recover to consistent state using the data in the
transaction log.
2. Database checkpoints are operations in which the DBMS writes all of its update buffers to disk.
The database recovery process involves bringing the database to a consistent state after a failure.
Transaction recovery procedures generally make use of techniques like deferred write and write-through.
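The write-ahead rule of point 1 above can be sketched as follows (the in-memory list and map stand in for the log file and data files on disk; names are assumptions):
import java.util.*;

class WalSketch {
    static final List<String> stableLog = new ArrayList<>();      // stands in for the log on disk
    static final Map<String, Integer> database = new HashMap<>(); // stands in for the data files

    // WAL: the log record is written BEFORE the database item is updated
    static void writeItem(String txn, String item, int newValue) {
        int old = database.getOrDefault(item, 0);
        stableLog.add("[write_item, " + txn + ", " + item + ", " + old + ", " + newValue + "]");
        database.put(item, newValue);     // only now is the data actually changed
    }

    public static void main(String[] args) {
        writeItem("T1", "X", 75);
        System.out.println(stableLog);
    }
}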
Database recovery-Techniques
In many ways, the concurrency control and recovery processes are interrelated.
Concurrency control uses two-phase locking, so the locks on items remain in effect
until the transaction reaches its commit point; after that, the locks are released.
Recovery using deferred update in a multi-user environment uses two lists of transactions: the
transactions T committed since the last checkpoint, and the active transactions T'.
REDO all the write_item operations of the committed transactions from the log, in the order in which they
were written into the log. The active transactions that did not commit are effectively canceled and
must be resubmitted.
• In these techniques, the database on disk can be updated immediately, without any need to wait for the transaction
to reach its commit point.
• However, the update operation must still be recorded in the log (on disk) before it is applied to the database,
using the write-ahead logging (WAL) protocol, so that we can recover in case of failure.
• Undo the effect of update operations that have been applied to the database by a failed transaction: roll back the
transaction and UNDO the effect of its write operations.
• If the recovery technique ensures that all updates of a transaction are recorded in the database on disk before the
transaction commits, there is never a need to REDO any operations of committed transactions (the UNDO/NO-REDO
recovery algorithm).
• If the transaction is allowed to commit before all its changes are written to the database, we must REDO all the
operations of committed transactions: the UNDO/REDO recovery algorithm. Procedure RIU_S (Recovery using
Immediate Update in a Single-User environment) works as follows:
• Use two lists of transactions maintained by the system: the committed transactions since the last checkpoint
and the active transactions (at most one, because the environment is single-user).
• Undo all the write_item operations of the active transaction from the log, using the UNDO procedure. The
operations should be undone in the reverse of the order in which they were written into the log. After making
these changes, the recovery subsystem writes a log record [abort, T] for each uncommitted transaction T into the
log.
• Redo the write_item operations of the committed transactions from the log, in the order in which they were
written in the log, using the REDO procedure.
• UNDO(write_op): Examine the log entry [write_item, T, X, old_value, new_value] and set the value of item X in the
database to old_value, which is the before image (BFIM).
Log-Based Recovery
The most widely used structure for recording database modifications is the log. The log is a sequence
of log records and maintains a history of all update activities in the database. There are several types of
log records.
An update log record describes a single database write and has these fields:
Transaction identifier.
Data-item identifier.
Old value.
New value.
Whenever a transaction performs a write, it is essential that the log record for that write be created
before the database is modified. Once a log record exists, we can output the modification to the database
if desirable, and we have the ability to undo a modification that has already
been output to the database, by using the old-value field of the log records.
For log records to be useful for recovery from system and disk failures, the log must reside on stable
storage. However, since the log contains a complete record of all database activity, the volume of data
stored in the log may become unreasonably large.
The immediate-update technique allows database modifications to be output to the database while the
transaction is still in the active state. These modifications are called uncommitted modifications. In the
event of a crash or transaction failure, the system must use the old-value field of the log records to
restore the modified data items.
Transactions T0 and T1 executed one after the other in the order T0 followed by T1. The portion
of the log containing the relevant information concerning these two transactions appears in the
following,
Portion of the system log corresponding to T0 and T1
< T0 start >
< T0, A, 1000, 950 >
< T0, B, 2000, 2050 >
< T0 commit >
< T1 start >
< T1, C, 700, 600 >
< T1 commit >
Checkpoints
Periodically, the DBMS writes all of its modified buffers to disk and records a [checkpoint] entry in the log;
during recovery, REDO work can then start from the last checkpoint rather than from the beginning of the log.
Shadow Paging
Shadow paging considers the database to be made up of a number of fixed-size disk blocks (pages), for
recovery purposes.
A directory with n entries is constructed, where the ith entry points to the ith database page on disk.
The directory is kept in main memory if it is not too large, and all references (read/write) to
database pages go through it.
When a transaction begins executing, the current directory, whose entries point to the most recent
database pages, is copied into a shadow directory. The shadow directory is saved on disk while the
current directory is used by the transaction.
During execution, the shadow directory is never modified. When a write_item operation is
performed, a new copy of the modified page is created without overwriting the old copy. The new page is
written to an unused disk block.
The current directory entry is modified to point to the new disk block, whereas the shadow directory is not
modified and continues to point to the old, unmodified disk block.
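A sketch of the two directories (page and block numbers are illustrative assumptions):
class ShadowPagingSketch {
    static int[] disk = new int[100];        // fixed-size disk blocks
    static int nextFree = 10;                // first unused disk block

    static int[] current = {0, 1, 2};        // current directory: page i -> disk block
    static int[] shadow = current.clone();   // shadow directory, saved on disk, never modified

    // write_item on page p: copy the page into an unused block (copy-on-write)
    static void writePage(int p, int value) {
        int newBlock = nextFree++;
        disk[newBlock] = value;              // new copy of the modified page
        current[p] = newBlock;               // current directory points to the new block;
    }                                        // shadow[p] still points to the old block

    // recovery: discard all uncommitted updates by reinstating the shadow directory
    static void recoverAfterCrash() {
        current = shadow.clone();
    }
}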
JDBC Packages:
The JDBC API is contained in two packages: java.sql and javax.sql.
The first package, java.sql, contains the core Java data objects of the JDBC API.
It provides the basics for connecting to the DBMS and interacting with data stored in the
DBMS. It is part of J2SE.
The other package that contains the JDBC API is javax.sql, which extends java.sql and is
in the J2EE. It includes java data objects that interact with Java Naming and Directory
Interface (JNDI) and java data objects that manage connection pooling, among other
advanced JDBC features.
Statement st=con.createStatement();
ResultSet rs=st.executeQuery("select * from student");
Process Data returned by the DBMS:
The java.sql.ResultSet object is assigned the results received from the DBMS after the
query is processed. It contains methods used to interact with data that is returned by the
DBMS to the J2EE component.
The result returned by the DBMS is assigned to the ResultSet object.
The next( ) method moves the virtual cursor to the next row of the ResultSet and returns a boolean value
that, if false, indicates that no more rows are present in the ResultSet.
The getInt( ) method is used to copy the value of a specified column in the current row of the
ResultSet to an int.
The getString( ) method is used to copy the value of specified column in the current row of
the ResultSet to a String object.
For example the student table contains 3 columns namely sno, sname and class.
while(rs.next())
{
System.out.println(rs.getInt("sno")+" "
+rs.getString("sname")+" "
+rs.getString("class"));
}
Terminating the connection with the DBMS:
The connection to the DBMS is terminated by using the close( ) method of the connection
object once the J2EE component is finished accessing the DBMS.
The close( ) method throws an exception if a problem is encountered while disconnecting from the
database.
For example : con.close();
Database Connection:
Before this connection is made, the JDBC driver must be loaded and registered with the
DriverManager.
The JDBC driver is automatically registered with the DriverManager once the JDBC
driver is loaded and is therefore available to the JVM and can be used by J2EE
components.
The Class.forName() method is used to load the JDBC driver. In the example, the
JDBC/ODBC Bridge is the driver that is being loaded.
The Connection:
The data source that the JDBC component will connect to is defined using the URL
format. The URL consists of the three parts. These are
1. jdbc which indicates that the JDBC protocol is to be used to read the URL.
2. <subprotocol> which is the JDBC driver name.
3. <subname> which is the name of the database.
The connection to the database is established by using one of the three getConnection()
methods of the DriverManager object.
A connection object is returned by the getConnection( ) method if access is granted;
otherwise, the getConnection( ) method throws a SQLException.
String url = "jdbc:odbc:CustomerInformation";
Statement DataRequest;
Connection con;
try {
    Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
    con = DriverManager.getConnection("jdbc:odbc:svarts", "scott", "tiger");
}
catch (ClassNotFoundException error) {
    System.err.println("Unable to load the JDBC/ODBC bridge. " + error);
    System.exit(1);
}
catch (SQLException error) {
    System.err.println("Cannot connect to the database. " + error);
    System.exit(2);
}
TimeOut:
The DriverManager.setLoginTimeout() method can be used to specify the maximum time, in seconds,
that a driver waits while attempting to connect to a database.
2. PreparedStatement Object:
The PreparedStatement interface extends the Statement interface, which gives you
added functionality with a couple of advantages over a generic Statement object.
A SQL query can be precompiled and executed by using the PreparedStatement
object.
The query is constructed with a question mark as a placeholder for a value that is
inserted into the query after the query is compiled. This value can change each time the
query is executed.
The prepareStatement( ) method of the Connection object is called to return the
PreparedStatement object. This method is passed the query, which is then
precompiled.
For example, to insert a row, the statements are as follows:
String sql = "insert into student values(?,?,?,?)";
PreparedStatement pstmt = conn.prepareStatement(sql);
For example, to update a row, the statements are as follows:
String sql = "update student set age=? where sno=?";
PreparedStatement pstmt = conn.prepareStatement(sql);
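The placeholders are then bound with the setXXX( ) methods and the statement is executed; continuing the update example above with assumed values:
pstmt.setInt(1, 25);                   // bind the first ? (age)
pstmt.setInt(2, 101);                  // bind the second ? (sno)
int rows = pstmt.executeUpdate();      // returns the number of rows updated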
IN: A parameter whose value is unknown when the SQL statement is created. We bind
values to IN parameters with the setXXX( ) methods.
OUT: A parameter whose value is supplied by the SQL statement it returns. We retrieve
values from OUT parameters with the getXXX( ) methods.
INOUT: A parameter that provides both input and output values. We bind variables with
the setXXX( ) methods and retrieve values with the getXXX( ) methods.
The prepareCall( ) method of the Connection object is called and is passed the query.
The registerOutParameter( ) method requires two parameters: the first is an
integer that represents the position of the parameter, and the second is the data type of the
value returned by the stored procedure, here Types.VARCHAR.
The execute( ) method of the CallableStatement object is called next to execute the query.
The Syntax to create an object, is as follows
String query = "{call procedureName(?,?)}";
CallableStatement cstmt = conn.prepareCall(query);
cstmt.registerOutParameter(2,Types.VARCHAR);
In SQL, we can create a procedure as follows.
create or replace procedure getename(n in number,
name out varchar)is
begin
select ename into name from emp where empno=n;
end;
/
Consider the following a J2EE component.
import java.sql.*;
import java.io.*;
class procedurecall
{
public static void main(String args[])throws
SQLException,IOException,Exception
{
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
Connection con=
DriverManager.getConnection("jdbc:odbc:svarts","scott","tiger");
DataInputStream in=new DataInputStream(System.in);
int num;
String ename;
System.out.println("Enter employee number");
num=Integer.parseInt(in.readLine());
String s="{call getename(?,?)}";
CallableStatement cstmt=con.prepareCall(s);
cstmt.setInt(1,num);
cstmt.registerOutParameter(2,Types.VARCHAR);
cstmt.execute();
ename=cstmt.getString(2);
System.out.println("Employee Name="+ename);
cstmt.close();
}
}
Scrollable ResultSet:
The JDBC 2.1 API also enables a J2EE component to specify the number of rows to
return from the DBMS.
There are six methods first( ),last ( ), previous( ), absolute( ), relative( ), and getRow( ).
1. first( ): the first ( )method moves the virtual cursor to the first row in the ResultSet.
2. last( ): the last( ) method positions the virtual cursor at the last row in the ResultSet.
3. previous( ): the previous( ) method moves the virtual cursor to the previous row.
4. absolute( ): the absolute( ) method positions the virtual cursor at the row number
specified by the integer passed as a parameter to the absolute ( ) method.
5. relative( ): the relative( ) method moves the virtual cursor the given number of rows
forward (positive) or backward (negative) relative to the current row.
6. getRow( ): the getRow() method returns an integer that represents the number of the
current row in the ResultSet.
The Statement object that is created using the createStatement( )of the Connection object
must be set up to handle a scrollable ResultSet by passing the createStatement( ) method
one of three constants. These constants are TYPE_FORWARD_ONLY,
TYPE_SCROLL_INSENSITIVE and TYPE_SCROLL_SENSITIVE.
The TYPE_FORWARD_ONLY constant restricts the virtual cursor to downward
movement, which is the default setting. The TYPE_SCROLL_INSENSITIVE and
TYPE_SCROLL_SENSITIVE constants permit the virtual cursor to move in both
directions.
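A minimal sketch of scrollable navigation, assuming an open Connection con and the emp table used earlier; note that this form of createStatement( ) also takes a concurrency constant as its second argument:
Statement st = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY);
ResultSet rs = st.executeQuery("select empno, ename from emp");
rs.last(); // move the virtual cursor to the last row
System.out.println("Rows in ResultSet: " + rs.getRow()); // getRow( ) gives the current row number
rs.absolute(2); // jump directly to the second row
rs.previous(); // move back to the first row
rs.relative(1); // move one row forward again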
Updatable ResultSet:
The rows contained in the ResultSet can be updated by passing the createStatement( ) method of the Connection object the ResultSet.CONCUR_UPDATABLE constant, together with one of the scroll-type constants described above.
There are 3 ways in which a ResultSet can be changed.
1. Update a row in the ResultSet:
The updatexxx( ) method is used to change the value of a column in the current row of
the ResultSet.
The updatexxx( ) method requires two parameters. The first is either the number or name of the column of the ResultSet that is being updated, and the second parameter is the value that will replace the value in the column of the ResultSet.
The updateRow( ) method is called after all the updatexxx( ) methods are called. The updateRow( ) method changes the values in the columns of the current row of the ResultSet, based on the values supplied to the updatexxx( ) methods.
Consider the following example:
try {
String query="select empno,ename,job from emp where ename='SMITH'";
Statement st=con.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
ResultSet rs=st.executeQuery(query);
rs.next(); // position the virtual cursor on the row to be changed
rs.updateString("job","MANAGER"); // change the job column of the current row
rs.updateRow(); // write the change back to the database
con.close();
}catch(SQLException error)
{
System.out.println(error);
}
2. Delete a row in the ResultSet:
The deleteRow( ) method is used to remove a row from a ResultSet.
The deleteRow( ) method removes the current row, so the virtual cursor must first be positioned at the row that is to be deleted.
The values of that row should be examined by the program to assure it is the proper row before the deleteRow( ) method is called, as shown in the following statement.
rs.deleteRow();
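A fuller sketch, assuming a scrollable, updatable ResultSet rs over the emp table, opened as in the update example above:
rs.absolute(3); // position the virtual cursor on the third row
if ("SMITH".equals(rs.getString("ename"))) // verify it is the proper row
{
rs.deleteRow(); // remove the current row from the ResultSet and the table
}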
Transaction Processing
A database transaction consists of a set of SQL statements, each of which must be successfully completed for the transaction to be completed. If one fails, the SQL statements that already executed are rolled back.
A database transaction is not completed until the J2EE component calls the commit( ) method of the Connection object.
By default, each SQL statement is committed as soon as it executes. If a J2EE component is processing a transaction, this AutoCommit feature must be deactivated by calling the setAutoCommit( ) method and passing it a false parameter.
Once the transaction is completed by calling the commit( ) method, the J2EE component can call the setAutoCommit( ) method again and pass it a true parameter to restore the default behaviour.
Example:
con.setAutoCommit(false);
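A fuller sketch of a transaction, assuming an open Connection con, the student table used earlier, and a method declared to throw SQLException:
con.setAutoCommit(false); // begin the transaction
try {
Statement st = con.createStatement( );
st.executeUpdate("Update student set class='mca' where sno=111");
st.executeUpdate("Update student set class='mca' where sno=222");
con.commit(); // both statements succeeded: make the changes permanent
} catch (SQLException error) {
con.rollback(); // a statement failed: undo the statements that executed
}
con.setAutoCommit(true); // restore the default AutoCommit behaviour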
ResultSet Holdability:
Whenever the commit( ) method is called, all ResultSet objects that were created for the transaction are closed. Sometimes a J2EE component needs to keep the ResultSet open even after the commit( ) method is called.
The HOLD_CURSORS_OVER_COMMIT constant keeps ResultSet objects open following a call to the commit( ) method, and the CLOSE_CURSORS_AT_COMMIT constant closes ResultSet objects when the commit( ) method is called.
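A minimal sketch, assuming a JDBC 3.0 driver; the three-argument form of createStatement( ) takes the scroll type, the concurrency, and the holdability constant:
Statement st = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY,
ResultSet.HOLD_CURSORS_OVER_COMMIT); // ResultSet objects stay open after commit( )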
RowSets:
The JDBC RowSet object is used to encapsulate a ResultSet for use with Enterprise
Java Beans (EJB).
A RowSet object contains rows of data from a table that can be used in disconnected
operations.
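A minimal sketch of a disconnected RowSet, using the standard CachedRowSet from the javax.sql.rowset package (available in later JDK versions) and the data source from the earlier examples:
import javax.sql.rowset.*;
CachedRowSet crs = RowSetProvider.newFactory().createCachedRowSet();
crs.setUrl("jdbc:odbc:svarts");
crs.setUsername("scott");
crs.setPassword("tiger");
crs.setCommand("select empno, ename from emp");
crs.execute(); // connects, fills the rowset, then disconnects
while (crs.next())
System.out.println(crs.getInt(1) + " " + crs.getString(2));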
Indexing:
Creating an Index:
An index is created by using the CREATE INDEX statement in a query. The statement contains the name of the index and any modifier that describes to the DBMS the type of index that is to be created.
try {
String query= "CREATE UNIQUE INDEX empnoindex on EMP(empno)";
Statement st=con.createStatement( );
st.execute(query);
st.close( );
con.close();
}
Creating a Secondary Index:
A secondary index is created by using the CREATE INDEX statement in a query without the UNIQUE modifier, so that a secondary index can have duplicate values.
try {
String query= "CREATE INDEX enameindex on EMP(ename)";
Statement st=con.createStatement( );
st.execute(query);
st.close( );
con.close();
}
Creating a Clustered Index:
A clustered index is an index whose key is created from two or more columns of a table.
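A minimal sketch, assuming a hypothetical index name empenameindex; the key is built from two columns of EMP:
try {
String query = "CREATE INDEX empenameindex on EMP(empno, ename)";
Statement st = con.createStatement( );
st.execute(query);
}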
The OR clause requires that at least one of the expressions in the compound expression evaluate to true before the WHERE clause expression evaluates to true.
String query="select * from emp where job='MANAGER' OR job='CLERK'";
The NOT clause is used to reverse the logic, changing an expression that evaluates to true to false.
String query="select * from emp where NOT job='MANAGER'";
ResultSet rs=st.executeQuery(query);
DISTINCT Modifier:
The SELECT statement returns all rows in a table unless a WHERE clause is used to exclude specific rows.
However, the ResultSet may include duplicate rows unless a primary key index is created for the table or only unique rows exist in the table.
When we want to exclude all but one copy of a row from the ResultSet, we can do so by using the DISTINCT modifier in the SELECT statement. The DISTINCT modifier tells the DBMS not to include duplicate rows in the ResultSet.
String query="select DISTINCT(JOB) from emp";
ResultSet rs=st.executeQuery(query);
IN Modifier:
The IN modifier is used to define a set of values used by the DBMS to match values in a specified column. The set can include any number of values, and they can appear in any order.
String query="select * from emp where EMPNO IN(101,103,105)";
ResultSet rs=st.executeQuery(query);
Updating Tables:
Modifying data in a database is one of the most common functionalities included in every J2EE component that provides database interactions. Generally, any information that is retrievable is also changeable, depending on access rights and data integrity issues.
The next several sections illustrate techniques that are used to update rows in a table of a database. The code segments use the executeUpdate( ) method to process queries.
The executeUpdate( ) method does not return a ResultSet; it returns an integer count of the rows affected by the query.
Updating a Row and Column:
The UPDATE statement is used to change the value of one or more columns in one or
multiple rows of a table. The UPDATE statement must contain the name of the table that
is to be updated and a SET clause.
The SET clause identifies the name of the column and the new values that will be placed
into the column, overriding the current value.
The UPDATE statement may have a WHERE clause if a specific number of rows are to be updated. If the WHERE clause is omitted, all rows are updated based on the value of the SET clause.
String query="Update student set class='mca' where sno=111";
Statement st=con.createStatement( );
st.executeUpdate(query);
Updating Multiple Rows:
Multiple rows of a table can be updated by formatting the WHERE clause expressions to
include criteria that qualify multiple rows for the update.
The IN test: the WHERE clause expression contains multiple values in the IN clause that
must match the value in the specified column for the update to occur in the row.
try {
String qry="Update student set class='mca' where sno in(111,222,333)";
Statement st=con.createStatement( );
st.executeUpdate(qry);
}
The IS NULL test: Rows that do not have a value in the specified column are updated when the IS NULL operator is used in the WHERE clause expression.
try {
String qry="Update student set class='mca' where college IS NULL";
Statement st=con.createStatement( );
st.executeUpdate(qry);
}
The Comparison test: The WHERE clause expression contains a comparison operator that compares the value in the specified column with a value in the WHERE clause expression.
try {
String qry="Update student set class='mca' where sno>=111";
Statement st=con.createStatement( );
st.executeUpdate(qry);
}
All Rows: Updating Every Row: A query can direct the DBMS to update the specified column in all rows of a table by excluding the WHERE clause from the query.
try {
String qry="Update student set class='mca'";
Statement st=con.createStatement( );
st.executeUpdate(qry);
}
Updating Based on Values in a Column:
An expression in the WHERE clause can be used to identify rows that are to be updated
by the UPDATE statement. The WHERE clause formats available to the SELECT statement also apply to the UPDATE statement.
Updating Every Row:
All rows in a table can be updated by excluding the WHERE clause in the UPDATE
statement.
Updating Multiple Columns:
Multiple columns of rows can be updated simultaneously by specifying the column names and appropriate values in the SET clause of the query, as shown in the sketch below.
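A minimal sketch, using the college column that appears in the IS NULL example above:
try {
String qry="Update student set class='mca', college='SV ARTS' where sno=111";
Statement st=con.createStatement( );
st.executeUpdate(qry);
}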
Updating Using Calculations:
The value that replaces the current value in a column does not need to be explicitly defined in the SET clause if the value can be derived from a value in another column of the same row.
try {
String qry="Update employee set comm=basic*0.12";
Statement st=con.createStatement( );
st.executeUpdate(qry);
}
Descending Sort:
In addition to choosing the column to sort on, we can also select the direction of the sort by using the ASC or DESC modifier.
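A minimal sketch sorting the emp table by salary, highest value first, assuming an open Statement st:
String query="select empno, ename, sal from emp order by sal DESC";
ResultSet rs=st.executeQuery(query);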
SubQueries:
We can refine the result of a query by creating a subquery within the query.
A subquery joins together two queries to form one complex query, which efficiently identifies data to be included in the ResultSet.
A subquery is formatted very much like a query. Each has a SELECT statement and a FROM clause, and can also include a WHERE clause and a HAVING clause to qualify rows to return.
The WHERE clause and the HAVING clause are used to express a condition that must
be met for a row to be included in the ResultSet.
There are two rules that we must follow when using a subquery in our program.
1. Return one column from the query: The purpose of a subquery is to derive a
list of information from which a query can choose appropriate rows. Only a
single column needs to be included in the list.
2. Don’t sort or group the result from a subquery: since the ResultSet of the
subquery isn’t going to be returned in the ResultSet of the query, there isn’t a
need to sort or group data in the ResultSet of a Subquery.
Creating a Subquery:
A subquery is a query whose results are evaluated by an expression in the WHERE clause
of another query.
String qry="SELECT empno,ename,job,sal from emp where sal>=(select max(sal) from Emp)";
ResultSet rs= st.executeQuery(qry);
while (rs.next())
{
System.out.println(rs.getInt(1)+" "
+rs.getString(2)+" "
+rs.getString(3)+" "
+rs.getInt(4));
}
Conditional Testing:
There are four types of conditional tests we can use with a subquery. These are:
1. Comparison test: This is a test that uses comparison operators to compare values in
the temporary table with values in the table used by the query.