
Dbms

Uploaded by

Lissy
Copyright
© © All Rights Reserved

A DBMS (Database Management System) is a software system used to manage and
organize data in a database. It provides an interface for users and applications to
interact with data in a structured way. DBMSs enable the storage, retrieval,
updating, and management of data while ensuring its integrity, security, and
accessibility.

Key Functions of a DBMS:

1. Data Definition: DBMS allows users to define the structure of the data
(tables, views, indexes, etc.) using a Data Definition Language
(DDL).
2. Data Manipulation: It provides tools for inserting, updating, deleting, and
querying data using a Data Manipulation Language (DML).
3. Data Retrieval: DBMS supports querying the database and retrieving data
based on user-defined criteria using Structured Query Language (SQL).
4. Concurrency Control: It manages multiple users accessing the data
simultaneously, so that concurrent operations do not leave the data
inconsistent (e.g., two users modifying the same data at the same time).
5. Data Integrity: DBMS ensures that the data is accurate and consistent by
enforcing constraints, such as primary keys and foreign keys.
6. Data Security: It controls access to data, allowing different levels of user
privileges to protect sensitive information.
7. Backup and Recovery: DBMS provides methods for backing up data and
restoring it in case of failure.
8. Data Independence: DBMS abstracts the physical details of the data
storage and provides a logical view, which simplifies data management.

Types of DBMS:

1. Hierarchical DBMS: Data is organized in a tree-like structure, where each
record has a single parent. Example: IBM's IMS.
2. Network DBMS: Data is stored in a graph structure, where records can have
multiple parent and child relationships. Example: Integrated Data
Store (IDS).
3. Relational DBMS (RDBMS): Data is organized into tables (relations) with rows
(records) and columns (fields). This is the most widely used type of DBMS.
Example: MySQL, PostgreSQL, Oracle, SQL Server.
4. Object-Oriented DBMS (OODBMS): Data is stored as objects, similar to how
data is represented in object-oriented programming languages.
Example: db4o, ObjectDB.
5. NoSQL DBMS: These are non-relational databases designed for handling
unstructured, semi-structured, or large volumes of data.
Examples: MongoDB, Cassandra, Couchbase, Redis.

Examples of DBMSs:

● MySQL: Open-source relational DBMS.
● PostgreSQL: Open-source object-relational DBMS known for its advanced
features.
● Oracle: A commercial relational DBMS with extensive enterprise-level
features.
● MongoDB: A NoSQL DBMS, suitable for handling large amounts of
unstructured data.
● Microsoft SQL Server: A relational DBMS developed by Microsoft.

In the context of databases, keys are fundamental concepts used to uniquely
identify and establish relationships between records in a database. They are
essential for maintaining data integrity and ensuring that the data is consistent
and accurate.

Here are the main types of keys in a relational database:

1. Primary Key
● A primary key is a column or a set of columns in a table that uniquely
identifies each record in that table.
● Each table can have only one primary key.
● The primary key must contain unique values, and it cannot contain NULL
values.
Example: In a Customers table, the CustomerID might be a primary key
because it uniquely identifies each customer.

2. Foreign Key
● A foreign key is a column or a set of columns in one table that refers to the
primary key (or another unique key) of another table.
● Foreign keys create a relationship between two tables.
● The values in the foreign key column must either be NULL or match a
value in the referenced key.
Example: In an Orders table, the CustomerID might be a foreign key
that links to the CustomerID in the Customers table.

3. Candidate Key
● A candidate key is a column or set of columns that can uniquely identify
records in a table.
● Each table may have multiple candidate keys, but only one primary key is
selected.
● All candidate keys are potential primary keys.
Example: In a Students table, both StudentID and Email could be candidate
keys if they both uniquely identify students.

4. Alternate Key
● An alternate key is any candidate key that is not chosen as the primary key.
● It is still unique and can be used as an alternative to the primary key.
Example: If StudentID is the primary key in the Students table, then Email
might be considered an alternate key.

5. Composite Key
● A composite key is a key (often the primary key) that consists of two or more
columns used together to uniquely identify a record.
● This is used when a single column is not sufficient to uniquely identify
records.
Example: In an Enrollment table that records students in courses, the
combination of StudentID and CourseID might be used as a composite key to
uniquely identify each enrollment.
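
The composite-key example above can be tried out with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the Grade column and the sample rows are assumptions added for illustration, and other DBMSs may report the violation with a different error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Enrollment (
        StudentID INTEGER NOT NULL,
        CourseID  INTEGER NOT NULL,
        Grade     TEXT,                      -- hypothetical extra column
        PRIMARY KEY (StudentID, CourseID)    -- composite key
    )
""")
conn.execute("INSERT INTO Enrollment VALUES (1, 101, 'A')")
conn.execute("INSERT INTO Enrollment VALUES (1, 102, 'B')")  # same student, other course: allowed

try:
    conn.execute("INSERT INTO Enrollment VALUES (1, 101, 'C')")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Enrollment").fetchone()[0]
```

Neither StudentID nor CourseID is unique on its own here; only the pair is, which is why the third insert is the one that fails.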

6. Superkey
● A superkey is a set of one or more attributes (columns) that uniquely identify
a record in a table.
● A superkey can contain additional attributes beyond what is necessary for
uniqueness.
● All primary keys are superkeys, but not all superkeys are primary keys.
Example: In an Employees table, a set of columns like EmployeeID + Email
can be a superkey, even though EmployeeID alone can uniquely identify a
record.

7. Unique Key
● A unique key is similar to a primary key in that it ensures that all values in
the column are unique across the table.
● However, unlike the primary key, a unique key column can have NULL values
(depending on the DBMS).
Example: In a Users table, the Username might be a unique key, ensuring
that no two users can have the same username, but it can allow NULL
values if some users have not set a username.
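
As a rough illustration of the "depending on the DBMS" caveat, the sketch below shows how SQLite (through Python's sqlite3 module) treats a unique key: duplicate non-NULL usernames are rejected, while several NULLs are accepted. The table layout and rows are assumptions; some systems, such as SQL Server, accept only a single NULL in a unique constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Users (
        UserID   INTEGER PRIMARY KEY,
        Username TEXT UNIQUE          -- unique key; NULL is permitted
    )
""")
conn.execute("INSERT INTO Users VALUES (1, 'alice')")
conn.execute("INSERT INTO Users VALUES (2, NULL)")
conn.execute("INSERT INTO Users VALUES (3, NULL)")  # SQLite treats NULLs as distinct

try:
    conn.execute("INSERT INTO Users VALUES (4, 'alice')")  # duplicate username
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_usernames = conn.execute(
    "SELECT COUNT(*) FROM Users WHERE Username IS NULL"
).fetchone()[0]
```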

8. Candidate Key vs. Primary Key

● Candidate Key: A set of columns that can uniquely identify a record.
There may be more than one candidate key.
● Primary Key: One of the candidate keys chosen to uniquely identify records in
a table. It cannot have NULL values.

9. Foreign Key Constraints


● A foreign key constraint ensures that the values in the foreign key column(s)
correspond to values in the primary key of another table (or the same table in
the case of self-referencing tables).
● It helps maintain referential integrity, ensuring that no non-NULL value in the
foreign key column can exist without a corresponding record in the referenced
table.
Example: A foreign key constraint in the Orders table ensures that the
CustomerID must correspond to a valid CustomerID in the Customers table.
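
A minimal sketch of this constraint in action, using Python's sqlite3 module, is shown below. The column lists and sample rows are assumptions; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, whereas most other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER REFERENCES Customers(CustomerID)
    )
""")
conn.execute("INSERT INTO Customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO Orders VALUES (10, 1)")     # valid: customer 1 exists
conn.execute("INSERT INTO Orders VALUES (11, NULL)")  # valid: NULL foreign key

try:
    conn.execute("INSERT INTO Orders VALUES (12, 99)")  # no customer 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
```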

Summary of Key Types:


Key Type       Purpose                                              Allows NULL?  Uniqueness
Primary Key    Uniquely identifies each record in a table.          No            Yes
Foreign Key    Creates a relationship with another table.           Yes           No (it references a primary key)
Candidate Key  Potential keys that can uniquely identify records.   No            Yes
Alternate Key  A candidate key that is not the primary key.         No            Yes
Composite Key  A key that combines multiple columns.                No            Yes
Superkey       A set of columns that uniquely identifies a record.  No            Yes (may include extra attributes)
Unique Key     Ensures uniqueness but may allow NULL values.        Yes           Yes

The Entity-Relationship (E-R) Model is a conceptual framework used to describe the
structure of a database. It defines the relationships between entities in a system
and helps in designing the database schema before it is implemented. The E-R
Model is a high-level, abstract representation of how data interacts and is organized,
making it easier to visualize and design databases.

Key Components of the E-R Model

1. Entities:
○ An entity is any object or thing in the real world that has a distinct
existence and can be represented in the database.
○ Entities can be physical objects (like Employee or Product) or
abstract concepts (like Course or Order).
○ An entity set is a collection of similar entities.
○ Each entity has a unique identifier called a primary key.
Example: In a school database, the entities might be Student,
Teacher, and Course.
2. Attributes:
○ Attributes are the properties or characteristics that describe an
entity. Each attribute stores specific information about the entity.
○ There are different types of attributes:
■ Simple (Atomic): An attribute that cannot be further divided.
E.g., Age, Salary.
■ Composite: An attribute that can be broken down into smaller
parts. E.g., Full Name can be divided into First Name and Last Name.
■ Derived: An attribute whose value can be derived from other
attributes. E.g., Age can be derived from Date of Birth.
■ Multivalued: An attribute that can have multiple values. E.g.,
Phone Numbers (a person may have more than one phone
number).
Example: In a Student entity, attributes could be StudentID, Name, DOB,
and Email.
3. Relationships:
○ A relationship is an association between two or more entities. A
relationship connects entities based on their attributes or behavior
in the real world.
○ Relationships can be classified as:
■ One-to-One (1:1): One entity instance is related to exactly one
instance of another entity.
■ One-to-Many (1:N): One entity instance is related to many
instances of another entity.
■ Many-to-Many (M:N): Many instances of one entity are related
to many instances of another entity.
Example: A Teacher can teach many Courses (one-to-many relationship),
while a Student can enroll in many Courses, and each Course can have many
Students (many-to-many relationship).
4. Primary Key:
○ The primary key is an attribute or set of attributes that uniquely
identifies each entity in an entity set.
○ In the E-R diagram, primary keys are often underlined.
Example: In a Student entity, the StudentID could be the primary key.
5. Foreign Key:
○ A foreign key is an attribute in one entity that refers to the primary key
of another entity, establishing a relationship between them.
○ In a conceptual E-R diagram, the relationship itself (a line or diamond
between entities) carries this link; foreign key columns appear once the
diagram is mapped to tables.

Types of E-R Diagrams

1. Chen's Notation:
○ Introduced by Peter Chen, it uses rectangles for entities, ellipses for
attributes, and diamonds for relationships.
○ Lines connect attributes to entities and entities to relationships.
2. Crow's Foot Notation:
○ A more popular and simplified version of E-R diagrams,
commonly used in modern database design.
○ It uses a “crow’s foot” symbol to represent the "many" side of a
relationship and a straight line for the "one" side.
3. UML (Unified Modeling Language) Class Diagrams:
○ UML is often used for Object-Oriented Modeling, but its class diagrams
can also represent E-R models.

Example of an E-R Diagram

Let's consider a simple database for a school system.

● Entities: Student, Course, Instructor
● Attributes:
○ Student: StudentID, Name, DOB
○ Course: CourseID, CourseName
○ Instructor: InstructorID, Name
● Relationships:
○ A Student can enroll in many Courses (many-to-many relationship).
○ An Instructor can teach many Courses (one-to-many relationship).

In this case, you could create an E-R diagram with:

● Student and Course entities, connected by an enrollment relationship.
● Instructor related to Course through a teaching relationship.
● Foreign keys would be used to connect Student to Enrollment and
Instructor to Course.
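
One possible mapping of this E-R example to tables is sketched below with Python's sqlite3 module. The Enrollment table name, the InstructorID column on Course, and all sample rows are assumptions made for illustration; a real design might add further attributes to the relationship table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Student    (StudentID INTEGER PRIMARY KEY, Name TEXT, DOB TEXT);
    CREATE TABLE Instructor (InstructorID INTEGER PRIMARY KEY, Name TEXT);
    -- One-to-many: each course row points at a single instructor.
    CREATE TABLE Course (
        CourseID     INTEGER PRIMARY KEY,
        CourseName   TEXT,
        InstructorID INTEGER REFERENCES Instructor(InstructorID)
    );
    -- Many-to-many: the enrollment relationship becomes its own table.
    CREATE TABLE Enrollment (
        StudentID INTEGER REFERENCES Student(StudentID),
        CourseID  INTEGER REFERENCES Course(CourseID),
        PRIMARY KEY (StudentID, CourseID)
    );
    INSERT INTO Student VALUES (1, 'Alice', '2004-01-15'), (2, 'Bob', '2003-09-30');
    INSERT INTO Instructor VALUES (1, 'Mr. Smith');
    INSERT INTO Course VALUES (101, 'Math', 1), (102, 'Science', 1);
    INSERT INTO Enrollment VALUES (1, 101), (1, 102), (2, 101);
""")

# Count how many courses each student is enrolled in.
courses_per_student = conn.execute("""
    SELECT s.Name, COUNT(*)
    FROM Student s JOIN Enrollment e ON e.StudentID = s.StudentID
    GROUP BY s.StudentID
    ORDER BY s.StudentID
""").fetchall()
```

The one-to-many relationship needs only a foreign key column on the "many" side, while the many-to-many relationship needs a separate table whose key combines both foreign keys.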

Cardinality in E-R Diagrams

Cardinality defines the number of instances of one entity that can or must be
associated with each instance of another entity in a relationship:
● One-to-One (1:1): One instance of an entity is related to only one instance
of another entity.
● One-to-Many (1:N): One instance of an entity is related to many instances
of another entity.
● Many-to-Many (M:N): Many instances of one entity are related to many
instances of another entity.

Advantages of the E-R Model:

1. Simplified Database Design: E-R diagrams provide a clear and visual
representation of how data entities are related, which helps in
designing databases.
2. Database Schema Generation: E-R diagrams are a blueprint for creating
relational database schemas, making it easier to translate into tables
and relationships.
3. Helps in Understanding Data: It simplifies understanding the
structure and relationships of data, even for non-technical stakeholders.

Normalization is a process in database design that aims to reduce redundancy and
undesirable dependencies by organizing data so that insertion, update, and
deletion anomalies are eliminated. The goal is to structure the database in a
way that allows it to be easily updated, modified, and maintained while ensuring
consistency and efficiency.

Normalization involves dividing large tables into smaller ones and defining
relationships between them. It is done in several normal forms (NF), ranging from
1NF (First Normal Form) to 5NF (Fifth Normal Form). Each normal form addresses
different types of issues and progressively reduces redundancy and improves data
integrity.

1. First Normal Form (1NF)

Definition: A table is in 1NF if it meets the following conditions:

● Each column contains atomic values, meaning the values in each column
cannot be further subdivided.
● Each record (row) must be unique (no duplicate rows).
● All columns must contain single-valued attributes (no repeating groups or
arrays).

Steps to achieve 1NF:

● Eliminate repeating groups or multi-valued attributes.
● Ensure that all attributes contain atomic values.

Example:

StudentID Name Courses
1 Alice Math, Science
2 Bob English, History

To convert the above table to 1NF, we need to remove the multi-valued
attribute Courses:

StudentID Name Course
1 Alice Math
1 Alice Science
2 Bob English
2 Bob History
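
The conversion above can be sketched in a few lines of Python. The function name and the assumption that course names are separated by commas are illustrative choices, not part of the original notes.

```python
# Non-1NF rows from the example: Courses holds a comma-separated list.
unnormalized = [
    (1, "Alice", "Math, Science"),
    (2, "Bob", "English, History"),
]

def to_first_normal_form(rows):
    """Emit one (StudentID, Name, Course) row per course value."""
    result = []
    for student_id, name, courses in rows:
        for course in courses.split(","):
            result.append((student_id, name, course.strip()))
    return result

normalized = to_first_normal_form(unnormalized)
```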

2. Second Normal Form (2NF)

Definition: A table is in 2NF if:

● It is in 1NF.
● It has no partial dependencies, i.e., every non-prime attribute (an attribute
that is not part of any candidate key) must be fully functionally dependent on
the entire primary key (more generally, on every candidate key), not just part of it.

Steps to achieve 2NF:

● Remove partial dependencies by dividing the table into two or more tables.
● Each non-prime attribute must depend on the whole primary key (if the
primary key is composite).

Example: Consider a table with a composite primary key:

StudentID CourseID Instructor Grade

1 101 Mr. Smith A

1 102 Mrs. Davis B

2 101 Mr. Smith C

The partial dependency is Instructor depends only on CourseID, not on the entire
primary key (StudentID, CourseID). To normalize to 2NF, we split the table:

1. Student_Courses table (with the composite key of StudentID and CourseID):

StudentID CourseID Grade
1 101 A
1 102 B
2 101 C

2. Courses table (with CourseID as the primary key):

CourseID Instructor
101 Mr. Smith
102 Mrs. Davis
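
To check that this split loses no information, the sketch below loads the two 2NF tables into an in-memory SQLite database (via Python's sqlite3 module) and joins them back together. The variable names are assumptions; the rows are those of the example.

```python
import sqlite3

# Rows of the original table: (StudentID, CourseID, Instructor, Grade).
original = [
    (1, 101, "Mr. Smith", "A"),
    (1, 102, "Mrs. Davis", "B"),
    (2, 101, "Mr. Smith", "C"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student_Courses (
        StudentID INTEGER, CourseID INTEGER, Grade TEXT,
        PRIMARY KEY (StudentID, CourseID)
    );
    CREATE TABLE Courses (CourseID INTEGER PRIMARY KEY, Instructor TEXT);
""")
# Project the original rows onto the two 2NF tables.
conn.executemany(
    "INSERT INTO Student_Courses VALUES (?, ?, ?)",
    [(s, c, g) for s, c, _, g in original],
)
conn.executemany(
    "INSERT OR IGNORE INTO Courses VALUES (?, ?)",  # each course is stored once
    [(c, i) for _, c, i, _ in original],
)

# Joining on CourseID reconstructs the original rows.
rebuilt = conn.execute("""
    SELECT sc.StudentID, sc.CourseID, c.Instructor, sc.Grade
    FROM Student_Courses sc JOIN Courses c ON c.CourseID = sc.CourseID
    ORDER BY sc.StudentID, sc.CourseID
""").fetchall()
course_rows = conn.execute("SELECT COUNT(*) FROM Courses").fetchone()[0]
```

The instructor for course 101 is now stored once instead of once per enrolled student, which is the redundancy 2NF removes.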

3. Third Normal Form (3NF)

Definition: A table is in 3NF if:

● It is in 2NF.
● It has no transitive dependencies, i.e., non-prime attributes must not depend
on other non-prime attributes.

Steps to achieve 3NF:

● Remove transitive dependencies by creating separate tables.

Example: Consider a table:

StudentID CourseID Instructor InstructorPhone

1 101 Mr. Smith 123-456

1 102 Mrs. Davis 987-654

Here, InstructorPhone depends on Instructor, and Instructor depends on
CourseID, creating a transitive dependency. To convert to 3NF, we split the
table:

1. Student_Courses table:

StudentID CourseID
1 101
1 102

2. Courses table:

CourseID Instructor
101 Mr. Smith
102 Mrs. Davis

3. Instructors table:

Instructor InstructorPhone
Mr. Smith 123-456
Mrs. Davis 987-654

4. Fourth Normal Form (4NF)

Definition: A table is in 4NF if:

● It is in 3NF (strictly, in Boyce-Codd Normal Form, BCNF).
● It has no multi-valued dependencies.

Multi-valued Dependency: A multi-valued dependency occurs when one attribute
determines a set of values for each of two or more other attributes, and those
sets are independent of each other.

Steps to achieve 4NF:

● Eliminate multi-valued dependencies by creating separate tables for
each set of multi-valued attributes.

Example: Consider a table where a student can have multiple skills and
hobbies:

StudentID Skill Hobby
1 Coding Reading
1 Drawing Hiking

Here, both Skill and Hobby are multi-valued attributes. To convert to 4NF, we
create two separate tables:

1. Student_Skills table:

StudentID Skill
1 Coding
1 Drawing

2. Student_Hobbies table:

StudentID Hobby
1 Reading
1 Hiking

5. Fifth Normal Form (5NF)

Definition: A table is in 5NF if:

● It is in 4NF.
● It has no join dependency other than those implied by its candidate keys,
i.e., it cannot be decomposed any further into smaller tables that rejoin
without losing information.

Join Dependency: A join dependency occurs when a table can be split into
multiple tables without any loss of data, and the original table can be
reconstructed by joining the smaller tables.

Steps to achieve 5NF:

● Decompose the table along each remaining join dependency until no further
lossless decomposition is possible.

Example: Consider a table that stores a combination of a student, course, and
instructor:

StudentID CourseID Instructor ExamDate
1 101 Mr. Smith 2024-05-01
1 102 Mrs. Davis 2024-06-01

This table has a join dependency because you could separate CourseID,
Instructor, and ExamDate into different tables. To achieve 5NF, we decompose
the table further:

1. Student_Courses table:

StudentID CourseID
1 101
1 102

2. Course_Instructor table:

CourseID Instructor
101 Mr. Smith
102 Mrs. Davis

3. Course_ExamDate table:

CourseID ExamDate
101 2024-05-01
102 2024-06-01

Summary of Normal Forms:

Normal Form (NF)  Conditions
1NF               No repeating groups, atomic values only.
2NF               In 1NF, no partial dependencies (non-prime attributes depend on the full primary key).
3NF               In 2NF, no transitive dependencies (non-prime attributes do not depend on other non-prime attributes).
4NF               In 3NF, no multi-valued dependencies.
5NF               In 4NF, no join dependencies (tables cannot be decomposed without loss of information).

Aggregate functions in SQL are functions that operate on a collection of values,
returning a single value as a result. These functions are often used in conjunction
with the GROUP BY clause to group rows that have the same values in specified
columns and perform calculations on each group.

Commonly Used Aggregate Functions in SQL:

1. COUNT():
○ Counts the number of rows in a set or the number of non-NULL
values in a column.

Syntax:
SELECT COUNT(column_name) FROM table_name;

Example:
SELECT COUNT(*) FROM Employees;

○ This counts the total number of rows in the Employees table.
2. SUM():
○ Adds up all the values in a specified column (usually
numerical).

Syntax:
SELECT SUM(column_name) FROM table_name;

Example:
SELECT SUM(Salary) FROM Employees;

○ This sums up the Salary column in the Employees table.
3. AVG():
○ Calculates the average value of a numeric column.

Syntax:
SELECT AVG(column_name) FROM table_name;

Example:
SELECT AVG(Salary) FROM Employees;

○ This returns the average salary of all employees.


4. MIN():
○ Returns the smallest (minimum) value in a column.

Syntax:
SELECT MIN(column_name) FROM table_name;

Example:
SELECT MIN(Salary) FROM Employees;

○ This returns the minimum salary from the Employees table.


5. MAX():
○ Returns the largest (maximum) value in a column.

Syntax:
SELECT MAX(column_name) FROM table_name;

Example:
SELECT MAX(Salary) FROM Employees;

○ This returns the maximum salary from the Employees table.


6. GROUP_CONCAT() (MySQL) / STRING_AGG() (PostgreSQL, SQL Server):
○ Concatenates values from multiple rows into a single string,
separated by a delimiter.

Syntax (MySQL - GROUP_CONCAT):
SELECT GROUP_CONCAT(column_name) FROM table_name;

Example:
SELECT GROUP_CONCAT(CourseName) FROM Students;

○ This returns a single string with all course names concatenated
together.
7. VARIANCE():
○ Calculates the variance of a set of values. Variance measures how far
the values in the set are spread out from the mean.
○ The function name varies by DBMS: MySQL, PostgreSQL, and Oracle provide
VARIANCE(), while SQL Server uses VAR().

Syntax:
SELECT VARIANCE(column_name) FROM table_name;

Example:
SELECT VARIANCE(Salary) FROM Employees;

○ This returns the variance of the Salary column.


8. STDDEV():
○ Calculates the standard deviation of a set of values, which is the
square root of the variance.
○ The function name varies by DBMS: MySQL, PostgreSQL, and Oracle provide
STDDEV(), while SQL Server uses STDEV().

Syntax:
SELECT STDDEV(column_name) FROM table_name;

Example:
SELECT STDDEV(Salary) FROM Employees;

○ This returns the standard deviation of the Salary column.
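
The basic aggregates can be tried together in a small sketch using Python's sqlite3 module. The three sample rows are made up for illustration. SQLite is assumed here only because it ships with Python; it offers GROUP_CONCAT but has no built-in VARIANCE or STDDEV, so those two are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, Salary REAL)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, "Ann", 40000), (2, "Ben", 50000), (3, "Cy", 60000)],  # made-up sample rows
)

count, total, average, lowest, highest = conn.execute(
    "SELECT COUNT(*), SUM(Salary), AVG(Salary), MIN(Salary), MAX(Salary) FROM Employees"
).fetchone()

# The order of values inside GROUP_CONCAT is not guaranteed,
# so sort in Python before comparing.
names = conn.execute("SELECT GROUP_CONCAT(Name, ',') FROM Employees").fetchone()[0]
sorted_names = sorted(names.split(","))
```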

Using Aggregate Functions with GROUP BY

When you use aggregate functions along with the GROUP BY clause, SQL will group
the rows based on one or more columns and perform the aggregation for each
group.

Example: Let's say you want to find the average salary of employees in each
department.
SELECT DepartmentID, AVG(Salary)
FROM Employees
GROUP BY DepartmentID;

This query will group the employees by their DepartmentID and calculate
the average salary for each department.
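
A runnable sketch of this grouping, assuming a cut-down Employees table and made-up salaries, is shown below using Python's sqlite3 module. The ORDER BY is added only to make the result order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, DepartmentID INTEGER, Salary REAL)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, 10, 40000), (2, 10, 60000), (3, 20, 55000)],  # made-up sample rows
)

# One output row per department, each with that department's average salary.
avg_by_department = conn.execute("""
    SELECT DepartmentID, AVG(Salary)
    FROM Employees
    GROUP BY DepartmentID
    ORDER BY DepartmentID
""").fetchall()
```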

Example of Multiple Aggregate Functions in One Query:

You can use multiple aggregate functions in a single query. For instance, you may
want to find the total number of employees, the maximum salary, and the average
salary in a company:

SELECT COUNT(*) AS TotalEmployees, MAX(Salary) AS MaxSalary,
AVG(Salary) AS AvgSalary
FROM Employees;

This query returns:

● The total number of employees,
● The maximum salary,
● The average salary from the Employees table.

Handling NULL Values in Aggregate Functions

● COUNT(): Does not count NULL values if a specific column is mentioned. If
COUNT(*) is used, it counts all rows including NULLs.
● SUM(), AVG(), MIN(), MAX(): Ignore NULL values in the calculation.
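
The difference between COUNT(*) and COUNT(column) is easy to see with one NULL salary. The sketch below uses Python's sqlite3 module with made-up rows; Python's None is stored as SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Salary REAL)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [(1, 40000), (2, None), (3, 60000)],  # employee 2 has no recorded salary
)

all_rows, non_null, total, average = conn.execute(
    "SELECT COUNT(*), COUNT(Salary), SUM(Salary), AVG(Salary) FROM Employees"
).fetchone()
```

AVG divides by the number of non-NULL values (two here), not by the number of rows, so a NULL is not treated as zero.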

Summary of Aggregate Functions:


Function                         Purpose                                                  Example
COUNT()                          Counts the number of rows or non-NULL values             SELECT COUNT(*) FROM Employees;
SUM()                            Adds up all values in a column                           SELECT SUM(Salary) FROM Employees;
AVG()                            Calculates the average value of a numeric column         SELECT AVG(Salary) FROM Employees;
MIN()                            Returns the minimum value in a column                    SELECT MIN(Salary) FROM Employees;
MAX()                            Returns the maximum value in a column                    SELECT MAX(Salary) FROM Employees;
GROUP_CONCAT() / STRING_AGG()    Concatenates values from multiple rows into a single string   SELECT GROUP_CONCAT(CourseName) FROM Students;
VARIANCE()                       Calculates the variance of a set of values               SELECT VARIANCE(Salary) FROM Employees;
STDDEV()                         Calculates the standard deviation of a set of values     SELECT STDDEV(Salary) FROM Employees;

These aggregate functions are essential tools for summarizing and analyzing data
in SQL, and they play a key role in reporting and decision-making processes.

Nested Subqueries in SQL are queries that are embedded within another query. A
subquery (also known as an inner query or a nested query) is a query that is
placed inside the WHERE, FROM, or SELECT clause of another SQL query (the
outer query). The purpose of nested subqueries is to allow for more complex
querying by breaking down problems into smaller, manageable parts.

Types of Subqueries

1. Single-row Subquery:
○ Returns a single value (one row and one column).
○ Often used in the WHERE clause to compare values.
2. Multiple-row Subquery:
○ Returns multiple rows (one column, multiple rows).
○ Used in conjunction with operators like IN, ANY, or ALL.
3. Multiple-column Subquery:
○ Returns multiple columns (one or more rows, each containing
multiple columns).
○ Often used in the FROM clause.
4. Correlated Subquery:
○ A subquery that references columns from the outer query. It cannot be
run independently of the outer query because it depends on the values
from the outer query for its execution.

Syntax of Nested Subqueries


SELECT column_name
FROM table_name
WHERE column_name = (
    SELECT column_name
    FROM table_name
    WHERE condition
);

Examples of Different Types of Nested Subqueries

1. Single-Row Subquery

A single-row subquery returns a single value. For example, if you want to find
employees who earn more than the average salary:

SELECT EmployeeID, Name, Salary
FROM Employees
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employees
);

● Here, the subquery (SELECT AVG(Salary) FROM Employees) returns a
single value, the average salary, and the outer query finds employees
earning more than that value.
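
A small sketch of this single-row subquery, run through Python's sqlite3 module on made-up rows, is shown below. With salaries of 40000, 50000, and 75000 the average is 55000, so only one employee qualifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, Salary REAL)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, "Ann", 40000), (2, "Ben", 50000), (3, "Cy", 75000)],  # made-up sample rows
)

# The inner SELECT yields one value (the average); the outer query compares against it.
above_average = conn.execute("""
    SELECT Name FROM Employees
    WHERE Salary > (SELECT AVG(Salary) FROM Employees)
    ORDER BY Name
""").fetchall()
```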

2. Multiple-Row Subquery
A multiple-row subquery returns more than one row of values. For example, if you
want to find employees who are in departments with a budget greater than
$1,000,000:

SELECT EmployeeID, Name, DepartmentID
FROM Employees
WHERE DepartmentID IN (
    SELECT DepartmentID
    FROM Departments
    WHERE Budget > 1000000
);

● The subquery (SELECT DepartmentID FROM Departments WHERE Budget
> 1000000) returns a list of department IDs, and the outer query selects
employees whose DepartmentID matches any of those returned by the
subquery.

3. Multiple-Column Subquery

A multiple-column subquery returns more than one column and is typically used in
the FROM clause. For example, to compare employees' salaries against the average
salary per department:

SELECT E.EmployeeID, E.Name, E.Salary, D.AvgSalary
FROM Employees E
JOIN (
SELECT DepartmentID, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY DepartmentID
) D ON E.DepartmentID = D.DepartmentID
WHERE E.Salary > D.AvgSalary;

● In this example, the subquery calculates the average salary for each
department and is joined with the Employees table to find employees who
earn more than the average salary in their department.

4. Correlated Subquery
A correlated subquery refers to columns from the outer query. Conceptually, it is
evaluated once for each row processed by the outer query, although a query
optimizer may rewrite it into a more efficient plan. Here’s an example of finding
employees who earn more than the average salary in their department:

SELECT EmployeeID, Name, Salary, DepartmentID
FROM Employees E
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employees
    WHERE DepartmentID = E.DepartmentID
);

● The subquery (SELECT AVG(Salary) FROM Employees WHERE DepartmentID
= E.DepartmentID) depends on the DepartmentID from the outer query.
For each row processed by the outer query, the subquery is re-evaluated
with the corresponding DepartmentID.
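
The correlated form and the derived-table form from the multiple-column example answer the same question. The sketch below, using Python's sqlite3 module and made-up rows, runs both and compares the results; it is an illustration of equivalence on this data set, not a statement about which form a given DBMS executes faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, "
    "DepartmentID INTEGER, Salary REAL)"
)
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)", [
    (1, "Ann", 10, 40000), (2, "Ben", 10, 60000),              # department 10, average 50000
    (3, "Cy", 20, 55000), (4, "Di", 20, 45000), (5, "Ed", 20, 50000),  # department 20, average 50000
])

# Correlated subquery: the inner query refers to E.DepartmentID.
correlated = conn.execute("""
    SELECT EmployeeID FROM Employees E
    WHERE Salary > (SELECT AVG(Salary) FROM Employees
                    WHERE DepartmentID = E.DepartmentID)
    ORDER BY EmployeeID
""").fetchall()

# Derived table: averages are computed once per department, then joined.
joined = conn.execute("""
    SELECT E.EmployeeID
    FROM Employees E
    JOIN (SELECT DepartmentID, AVG(Salary) AS AvgSalary
          FROM Employees GROUP BY DepartmentID) D
      ON E.DepartmentID = D.DepartmentID
    WHERE E.Salary > D.AvgSalary
    ORDER BY E.EmployeeID
""").fetchall()
```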

Operators Used with Subqueries

1. IN:
○ Used to compare a value against a list of values returned by a
subquery.

Example:
SELECT Name
FROM Employees
WHERE DepartmentID IN (
    SELECT DepartmentID
    FROM Departments
    WHERE Budget > 1000000
);

2. ANY/ALL:
○ Used to compare a value with any or all of the values returned by a
subquery.
○ ANY: Returns true if the comparison is true for at least one of the values
returned by the subquery.
○ ALL: Returns true if the comparison is true for all of the values
returned by the subquery.

Example:
SELECT Name
FROM Employees
WHERE Salary > ANY (
    SELECT Salary
    FROM Employees
    WHERE DepartmentID = 101
);

3. EXISTS:
○ Used to check if a subquery returns any rows. If the subquery
returns one or more rows, the condition is true.

Example:
SELECT Name
FROM Employees E
WHERE EXISTS (
    SELECT 1
    FROM Departments D
    WHERE D.DepartmentID = E.DepartmentID AND D.Budget > 1000000
);

4. NOT EXISTS:
○ The opposite of EXISTS. It checks whether a subquery returns no rows. If
the subquery returns no rows, the condition is true.

Example:
...
WHERE P.EmployeeID = E.EmployeeID
);
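
The sketch below runs IN, EXISTS, and NOT EXISTS against a tiny made-up data set using Python's sqlite3 module. The helper function `names` and the sample rows are assumptions for illustration. ANY and ALL are omitted because SQLite does not support them; they are available in systems such as PostgreSQL, MySQL, and SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, Budget REAL);
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, DepartmentID INTEGER);
    INSERT INTO Departments VALUES (10, 2000000), (20, 500000);
    INSERT INTO Employees VALUES (1, 'Ann', 10), (2, 'Ben', 20), (3, 'Cy', 10);
""")

def names(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# Correlated subquery shared by the EXISTS and NOT EXISTS variants.
rich = ("SELECT 1 FROM Departments D "
        "WHERE D.DepartmentID = E.DepartmentID AND D.Budget > 1000000")

with_in = names("SELECT Name FROM Employees WHERE DepartmentID IN "
                "(SELECT DepartmentID FROM Departments WHERE Budget > 1000000)")
with_exists = names(f"SELECT Name FROM Employees E WHERE EXISTS ({rich})")
with_not_exists = names(f"SELECT Name FROM Employees E WHERE NOT EXISTS ({rich})")
```

On this data IN and EXISTS select the same employees, and NOT EXISTS selects exactly the remaining one.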

Benefits of Using Subqueries

1. Simplify Complex Queries: Subqueries allow you to break down complex
logic into smaller, more understandable parts, making the overall
query easier to read and maintain.
2. Modular Design: Subqueries help modularize the logic, as a self-contained
(non-correlated) subquery can be written and tested on its own before it is
embedded in a larger query.
3. Flexibility: They can be used in various clauses (WHERE, FROM, SELECT) to
provide flexibility in query design.

Performance Considerations

While subqueries are powerful tools, they can sometimes lead to
performance issues, particularly when:

● The subquery is executed many times (as in correlated subqueries).
● The subquery returns a large number of rows or columns.

In such cases, it may be better to use joins or temporary tables instead of
subqueries to improve performance.

Views in SQL

A view is a virtual table in SQL that provides a way to simplify complex queries,
enhance security, and offer a customized view of the data. A view is essentially a
stored SQL query that can be treated as a table, but unlike a table, it does not store
data physically. Instead, the data is dynamically generated by the SQL query each
time the view is queried.

Why Use Views?

1. Simplify Complex Queries: Views allow you to encapsulate complex joins and
queries into a single virtual table, making it easier to query data.
2. Security: Views can limit access to specific columns or rows in a table,
providing a way to expose only relevant data while hiding sensitive
information.
3. Reusability: You can reuse views in different queries, which makes
complex query writing more manageable and consistent.
4. Data Abstraction: Views can present a simplified, customized version of
data for end-users or applications.
5. Data Integrity: Views can be used to enforce business rules by filtering and
presenting data in a controlled manner.

Types of Views

1. Simple Views:
○ Simple views are created using a single SELECT query without any
complex joins or aggregations. They map directly to the underlying
table(s).
2. Complex Views:
○ These views may involve complex SQL queries, including JOIN
operations, aggregate functions, and multiple tables.
3. Updatable Views:
○ Views that allow data modification (INSERT, UPDATE, DELETE) on the
underlying tables. However, not all views are updatable (e.g., views
with joins or aggregate functions may not be updatable).
4. Materialized Views (in some databases like Oracle or PostgreSQL):
○ Unlike regular views, materialized views store the query result
physically in the database. This improves performance but needs to
be refreshed periodically.

Creating a View

To create a view, you use the CREATE VIEW statement, followed by the view
name and the SELECT query that defines the view.

Syntax:

CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;

Example:

CREATE VIEW EmployeeDetails AS
SELECT EmployeeID, Name, Department, Salary
FROM Employees
WHERE Status = 'Active';

In this example, a view named EmployeeDetails is created, which selects
EmployeeID, Name, Department, and Salary from the Employees table where
the employee status is 'Active'.

Using Views

Once a view is created, you can use it in the same way you would use a table in a
query.

Example:

SELECT * FROM EmployeeDetails;

This query retrieves all data from the EmployeeDetails view, which, in turn, pulls
data from the Employees table based on the defined criteria (where Status =
'Active').
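
The sketch below creates the EmployeeDetails view in an in-memory SQLite database (through Python's sqlite3 module) and shows that the view reflects changes to the base table without storing rows itself. The Employees column list follows the example above; the two sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY, Name TEXT, Department TEXT,
        Salary REAL, Status TEXT
    );
    INSERT INTO Employees VALUES
        (101, 'Ann', 'Sales', 50000, 'Active'),
        (102, 'Ben', 'Sales', 52000, 'Inactive');
    CREATE VIEW EmployeeDetails AS
        SELECT EmployeeID, Name, Department, Salary
        FROM Employees
        WHERE Status = 'Active';
""")
before = conn.execute("SELECT Name FROM EmployeeDetails ORDER BY Name").fetchall()

# The view stores no rows of its own, so a change to the base table shows up immediately.
conn.execute("UPDATE Employees SET Status = 'Active' WHERE EmployeeID = 102")
after = conn.execute("SELECT Name FROM EmployeeDetails ORDER BY Name").fetchall()
```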

Updating Data Through Views

Some views are updatable, meaning you can use INSERT, UPDATE, and DELETE
statements on them to modify the underlying data. However, there are limitations.
For example:

● Views involving aggregates, GROUP BY, or JOINs might not be updatable.
● Some databases have specific rules regarding what makes a view
updatable (e.g., MySQL allows updates on simple views but not on
complex ones).

Example: If the view is simple and maps directly to a table, you can update the
underlying data through the view:

UPDATE EmployeeDetails
SET Salary = 55000
WHERE EmployeeID = 101;

This updates the Salary in the Employees table for the employee with EmployeeID
= 101.

Modifying or Dropping a View

● Modifying a View: You cannot directly modify a view once it is created.


However, you can drop the existing view and recreate it with new logic.

Syntax:

sql
DROP VIEW view_name;

To drop the EmployeeDetails view:

sql
DROP VIEW EmployeeDetails;

● Altering a View: Some databases, like MySQL and PostgreSQL, allow you to
modify a view using CREATE OR REPLACE VIEW.

sql
CREATE OR REPLACE VIEW view_name AS
SELECT ...

Performance Considerations

1. Views do not store data: Since views are virtual tables, they don’t store data
on their own. Every time you query a view, the database runs the
underlying query to fetch the data, which can sometimes be less
efficient for complex queries or large datasets.
2. Materialized Views: To optimize performance, you can use materialized views
(available in some databases like Oracle or PostgreSQL), which store the
result of the query physically. This can greatly improve query performance
but requires periodic refreshing to keep the data up-to-date.

Example Use Cases for Views


1. Simplifying Complex Joins:

sql
CREATE VIEW DepartmentEmployee AS
SELECT E.EmployeeID, E.Name, D.DepartmentName
FROM Employees E
JOIN Departments D ON E.DepartmentID = D.DepartmentID;

Now, to get the list of employees with their department names, you can simply
query the view:

sql
SELECT * FROM DepartmentEmployee;

2. Limiting Data Access: If you want to limit users to only viewing active
employees' data:

sql
CREATE VIEW ActiveEmployees AS
SELECT EmployeeID, Name, Department, Salary
FROM Employees
WHERE Status = 'Active';

3. Aggregating Data: To create a view that shows the total salary per
department:

sql
CREATE VIEW DepartmentSalaries AS
SELECT DepartmentID, SUM(Salary) AS TotalSalaries
FROM Employees
GROUP BY DepartmentID;

Joined Relations in SQL


In relational databases, a join is an operation that combines columns from two or
more tables based on a related column between them. Joins are essential when
you need to retrieve data that is spread across multiple tables.

There are different types of joins in SQL, each serving a specific purpose in
combining tables based on the conditions you define.

Types of Joins

Inner Join

The INNER JOIN returns records that have matching values in both tables. If there is
no match, the row will not appear in the result.

Syntax:

sql
SELECT columns
FROM table1
INNER JOIN table2
ON table1.column = table2.column;

Example: Suppose we have two tables: Employees and Departments.

sql
SELECT Employees.EmployeeID, Employees.Name, Departments.DepartmentName
FROM Employees
INNER JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID;

● This query returns a list of employee IDs, names, and their associated
department names, but only for employees who belong to a department (i.e.,
where DepartmentID matches in both Employees and Departments).

Left Join (Left Outer Join)


The LEFT JOIN (or LEFT OUTER JOIN) returns all records from the left table and
the matching records from the right table. If there is no match, the result is
NULL on the right side.

Syntax:

sql
SELECT columns
FROM table1
LEFT JOIN table2
ON table1.column = table2.column;

Example:

sql
SELECT Employees.EmployeeID, Employees.Name, Departments.DepartmentName
FROM Employees
LEFT JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID;

● This query returns all employees, even those who are not assigned to a
department. For those without a department, the DepartmentName will be NULL.
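The difference between INNER JOIN and LEFT JOIN is easiest to see on a table that contains an unmatched row. The following is a minimal sketch, assuming Python's sqlite3 module; the sample rows and the QUERY template are illustrative assumptions, not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, DepartmentID INTEGER);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'IT');
    -- Bram has no department, so he has no matching row in Departments.
    INSERT INTO Employees VALUES (101, 'Asha', 1), (102, 'Bram', NULL);
""")

# Same query text for both joins; only the join keyword differs.
QUERY = """
    SELECT Employees.Name, Departments.DepartmentName
    FROM Employees
    {join} Departments
      ON Employees.DepartmentID = Departments.DepartmentID
    ORDER BY Employees.EmployeeID
"""

inner_rows = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
print(inner_rows)
print(left_rows)
```

The inner join drops the employee without a department, while the left join keeps that employee and reports the department name as NULL (None in Python).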

Right Join (Right Outer Join)


The RIGHT JOIN (or RIGHT OUTER JOIN) is the opposite of the LEFT JOIN. It returns all
records from the right table and the matching records from the left table. If there is
no match, the result is NULL on the left side.

Syntax:

sql
SELECT columns
FROM table1
RIGHT JOIN table2
ON table1.column = table2.column;

Example:

sql
SELECT Employees.EmployeeID, Employees.Name, Departments.DepartmentName
FROM Employees
RIGHT JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID;

● This query returns all departments, including those without any employees.
For departments without employees, the EmployeeID and Name will be NULL.

Full Join (Full Outer Join)

The FULL JOIN (or FULL OUTER JOIN) returns all records when there is a match in either
the left or the right table. If there is no match, the result is NULL on the side where
no match exists.

Syntax:

sql
SELECT columns
FROM table1
FULL JOIN table2
ON table1.column = table2.column;

Example:

sql
SELECT Employees.EmployeeID, Employees.Name, Departments.DepartmentName
FROM Employees
FULL JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID;

● This query returns all employees and all departments. If an employee does not
have a department, the DepartmentName will be NULL. If a department
has no employees, the EmployeeID and Name will be NULL.

Cross Join (Cartesian Join)


The CROSS JOIN returns the Cartesian product of both tables. That is, it
returns every combination of rows from both tables. This type of join does
not require a condition.

Syntax:

sql
SELECT columns
FROM table1
CROSS JOIN table2;

Example:

sql
SELECT Employees.Name, Departments.DepartmentName
FROM Employees
CROSS JOIN Departments;

● This query will return every possible combination of Employees and
Departments rows.

Self Join

A self join is a join where a table is joined with itself. This is useful when you need
to compare rows within the same table, such as when you need to find
relationships between records.

Syntax:

sql
SELECT A.column1, B.column2
FROM table_name A
JOIN table_name B
ON A.common_column = B.common_column;

Example: Suppose the Employees table has a ManagerID field that links
to the EmployeeID of the manager. A self join could find employees and
their managers:

sql
SELECT E.Name AS Employee, M.Name AS Manager
FROM Employees E
LEFT JOIN Employees M
ON E.ManagerID = M.EmployeeID;

● This query returns a list of employees and their managers by joining the
Employees table with itself.
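The self join can be tried on a tiny hierarchy. This sketch assumes Python's sqlite3 module and three made-up employees; only the query itself comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, ManagerID INTEGER);
    -- Dana is the top-level manager, so her ManagerID is NULL.
    INSERT INTO Employees VALUES
        (1, 'Dana', NULL),
        (2, 'Eli',  1),
        (3, 'Femi', 1);
""")

# E plays the role of "employee", M the role of "manager"; both are Employees.
pairs = conn.execute("""
    SELECT E.Name AS Employee, M.Name AS Manager
    FROM Employees E
    LEFT JOIN Employees M
      ON E.ManagerID = M.EmployeeID
    ORDER BY E.EmployeeID
""").fetchall()
print(pairs)
```

Using LEFT JOIN rather than INNER JOIN keeps the employee who has no manager in the result, with the manager shown as NULL.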

Join Conditions

The most common way to join two tables is by matching a column in one table to
a column in the other table, typically using the ON keyword. In most cases, this
will be a foreign key in one table that references a primary key in the other.

Example: Let's assume we have two tables, Orders and Customers, where
CustomerID is a foreign key in the Orders table that references the primary key in
the Customers table.

sql
SELECT Orders.OrderID, Orders.OrderDate, Customers.Name
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID;

● This query will return all orders along with the customer name by
matching the CustomerID in both tables.

Join Clauses and Filtering

You can filter the result of a join operation using the WHERE clause in addition to
the ON condition.

Example:

sql
SELECT Employees.Name, Departments.DepartmentName
FROM Employees
INNER JOIN Departments
ON Employees.DepartmentID = Departments.DepartmentID
WHERE Employees.Salary > 50000;

● This query returns employees who earn more than $50,000 and their
corresponding department names.
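With an inner join, a filter behaves the same in ON or in WHERE, but with an outer join the placement matters: WHERE filters the joined result, while an extra condition in ON only restricts which rows count as matches. The sketch below illustrates this, assuming Python's sqlite3 module and made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, DepartmentID INTEGER);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO Employees VALUES (101, 'Asha', 1), (102, 'Bram', 2), (103, 'Cleo', NULL);
""")

# Filter in WHERE: applied after the join, so rows whose department is not
# 'Sales' (including NULL-extended rows) are removed.
where_rows = conn.execute("""
    SELECT E.Name, D.DepartmentName
    FROM Employees E
    LEFT JOIN Departments D ON E.DepartmentID = D.DepartmentID
    WHERE D.DepartmentName = 'Sales'
    ORDER BY E.EmployeeID
""").fetchall()

# Filter in ON: only decides what counts as a match; every employee is kept.
on_rows = conn.execute("""
    SELECT E.Name, D.DepartmentName
    FROM Employees E
    LEFT JOIN Departments D
      ON E.DepartmentID = D.DepartmentID AND D.DepartmentName = 'Sales'
    ORDER BY E.EmployeeID
""").fetchall()
print(where_rows)
print(on_rows)
```

A WHERE condition on a right-table column effectively turns a LEFT JOIN back into an inner join, which is a common source of surprising results.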

Aliases in Joins

It is a common practice to use aliases for tables in join queries, especially when
joining multiple tables or performing self joins. Table aliases make the query more
concise and easier to read.

Example:

sql
SELECT E.Name AS EmployeeName, D.DepartmentName
FROM Employees E
INNER JOIN Departments D
ON E.DepartmentID = D.DepartmentID;

● In this case, E and D are aliases for the Employees and Departments
tables, respectively.

Join Performance Considerations

● Indexes: Joins can be more efficient if appropriate indexes are used on the
columns being joined (especially foreign keys and primary keys).
● Join Type: The choice of join type (e.g., INNER JOIN, LEFT JOIN, etc.) can
impact query performance depending on the amount of data and the type
of relationship between the tables.
● Avoiding Cross Joins: CROSS JOIN should be used with caution as it can
generate large result sets that could cause performance issues.

Summary of Joins

Join Type | Returns | Use Case
INNER JOIN | Only rows with matching values in both tables. | Commonly used to combine related data from two tables (e.g., employees and departments).
LEFT JOIN | All rows from the left table, and matched rows from the right table (NULL for non-matching rows). | To ensure all records from the left table are included (e.g., find all employees, including those with no department).
RIGHT JOIN | All rows from the right table, and matched rows from the left table (NULL for non-matching rows). | To ensure all records from the right table are included (e.g., list all departments, even without employees).
FULL JOIN | All rows from both tables, with NULLs where there is no match. | To combine data from both tables and handle unmatched rows from either side.
CROSS JOIN | Cartesian product of both tables (all combinations of rows). | Used when you need all combinations of rows (e.g., testing all possible pairs of items).
SELF JOIN | Joins a table with itself. | Used to compare rows within the same table (e.g., employees and their managers).

Joins are a crucial aspect of relational databases as they allow you to combine data
across multiple tables and answer complex queries that involve relationships
between different entities.

Transactions and ACID Properties in SQL


A transaction in SQL is a sequence of one or more operations (like INSERT, UPDATE,
DELETE, SELECT) that are executed as a single unit of work. A transaction ensures
that the database remains in a consistent state, even in the face of failures,
crashes, or unexpected behavior.

The ACID properties are a set of principles that guarantee that transactions are
processed reliably, ensuring the integrity of the database.

ACID Properties

The acronym ACID stands for four key properties that are critical to ensuring
that database transactions are processed reliably:

1. Atomicity
2. Consistency
3. Isolation
4. Durability

1. Atomicity

● Definition: Atomicity ensures that a transaction is an "all-or-nothing"
operation. This means that either all operations within a transaction are
completed successfully, or none are. If any part of the transaction fails, the
entire transaction is rolled back, and no changes are made to the database.
● Example: Suppose you transfer money from one bank account to another. If
you deduct the money from the sender’s account but the credit operation to
the receiver's account fails, the transaction is rolled back, and no money is
transferred. Either both the deduction and the credit happen, or neither
happens.
● Real-world analogy: Think of it like an atomic unit of work: you can't just pay
part of the price of a good — the whole transaction must succeed or fail as a
whole.

SQL Example:

sql
BEGIN TRANSACTION;

-- Deduct amount from account A
UPDATE Accounts
SET balance = balance - 100
WHERE account_id = 1;

-- Add amount to account B
UPDATE Accounts
SET balance = balance + 100
WHERE account_id = 2;

COMMIT;

If an error occurs before the COMMIT statement, the transaction is rolled back,
either explicitly by the application issuing ROLLBACK or automatically by the
DBMS when the session ends or the system crashes, so none of its changes
persist and the transaction remains atomic.
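The all-or-nothing behaviour can be observed from application code. The following is a minimal sketch, assuming Python's sqlite3 module; the CHECK constraint, the starting balances, and the transfer and balances helpers are illustrative assumptions. The credit is deliberately issued first so that a failing debit leaves a partial change for ROLLBACK to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (
        account_id INTEGER PRIMARY KEY,
        balance    INTEGER NOT NULL CHECK (balance >= 0)  -- no overdrafts
    );
    INSERT INTO Accounts VALUES (1, 100), (2, 50);
""")

def transfer(connection, src, dst, amount):
    """Move amount from src to dst; return True on commit, False on rollback."""
    try:
        connection.execute(
            "UPDATE Accounts SET balance = balance + ? WHERE account_id = ?",
            (amount, dst))
        connection.execute(
            "UPDATE Accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, src))
        connection.commit()
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint: undo the credit as well.
        connection.rollback()
        return False

def balances(connection):
    """Return {account_id: balance} for all accounts."""
    return dict(connection.execute("SELECT account_id, balance FROM Accounts"))

failed = transfer(conn, 1, 2, 500)      # account 1 only holds 100
after_failed = balances(conn)
succeeded = transfer(conn, 1, 2, 30)
after_success = balances(conn)
print(failed, after_failed, succeeded, after_success)
```

After the failed transfer both balances are unchanged, even though the credit statement had already run inside the transaction.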

2. Consistency

● Definition: Consistency ensures that a transaction brings the database from
one valid state to another. The database must always follow all predefined
rules, constraints, and triggers, such as primary keys, foreign keys, and
checks on data validity. If a transaction violates any of these rules, it will not
be allowed to commit, ensuring the integrity of the database is maintained.
● Example: If the business logic requires that an employee's salary cannot be
less than a certain threshold, the transaction must not violate this constraint.
If the transaction tries to update the salary to a value lower than the
threshold, it will be rolled back.
● Real-world analogy: Consistency is like a bookkeeper ensuring that every
entry made in the ledger follows the rules and that the final balance is
correct.

SQL Example:

sql
BEGIN TRANSACTION;

-- Insert a new record (should satisfy all constraints)
INSERT INTO Employees (EmployeeID, Name, Salary)
VALUES (1001, 'John Doe', 50000);

COMMIT;

If the table defines a rule such as CHECK (Salary >= 0) and an invalid salary is
attempted (e.g., a negative salary), the statement is rejected, the transaction
cannot commit that change, and the data remains consistent.
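A constraint violation can be reproduced in a few lines. This sketch assumes Python's sqlite3 module and a hypothetical CHECK (Salary >= 0) rule on the Employees table; the second employee row is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Salary     INTEGER NOT NULL CHECK (Salary >= 0)
    )
""")
conn.execute("INSERT INTO Employees VALUES (1001, 'John Doe', 50000)")
conn.commit()

try:
    # Violates the CHECK constraint, so the database rejects the statement.
    conn.execute("INSERT INTO Employees VALUES (1002, 'Jane Roe', -1)")
    conn.commit()
    rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
print(rejected, row_count)
```

The invalid row never reaches the table, so every committed state satisfies the declared rule.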

3. Isolation

● Definition: Isolation ensures that the operations of one transaction are isolated
from other concurrent transactions. The results of a transaction are not
visible to other transactions until the transaction is committed. This prevents
"dirty reads" (reading uncommitted changes), "non-repeatable reads" (where
data changes between reads within a transaction), and "phantom reads"
(where new data is added or removed while a transaction is executing).
● Isolation Levels: SQL databases offer different isolation levels to
control how transactions interact with each other:
○ Read Uncommitted: Transactions can see uncommitted changes made
by other transactions.
○ Read Committed: Transactions can only see committed changes made
by other transactions.
○ Repeatable Read: Transactions can read the same data multiple times
and get the same result, but other transactions can still insert new
rows.
○ Serializable: The highest level of isolation, where the outcome is the
same as if the transactions had run one after another in some serial
order.
● Example: If two transactions are updating the same account balance at the
same time, isolation ensures that the result is the same as if one
transaction had completed before the other began, so neither update
interferes with the other.

SQL Example:

sql
BEGIN TRANSACTION;

-- Update balance
UPDATE Accounts
SET balance = balance - 100
WHERE account_id = 1;

COMMIT;

If another transaction tries to update the same account at the same time, it
will either wait (depending on the isolation level) or encounter a conflict.

4. Durability

● Definition: Durability ensures that once a transaction is committed, its
changes are permanent and will survive any subsequent system failures,
such as power outages, crashes, or hardware failures. The database will
guarantee that the changes made by the transaction are stored safely in
non-volatile storage (e.g., hard drives or SSDs) and can be recovered when
the system restarts.
● Example: After committing a transaction that updates an account balance,
even if the system crashes immediately after, the changes to the account
balance will be preserved when the system is restored.
● Real-world analogy: Durability is like keeping a signed contract in a fireproof
safe. Once the transaction is complete, the results are permanent and cannot
be undone by a system failure.

SQL Example:

sql
BEGIN TRANSACTION;

-- Transfer funds
UPDATE Accounts
SET balance = balance - 100
WHERE account_id = 1;

UPDATE Accounts
SET balance = balance + 100
WHERE account_id = 2;

COMMIT; -- Once committed, the changes are permanent.

Even if the system crashes immediately after the COMMIT, the funds are still
transferred when the system recovers, thanks to the durability property.

Transaction Lifecycle in SQL

1. BEGIN TRANSACTION: Marks the beginning of a transaction.


2. COMMIT: Finalizes the transaction, making all changes permanent.
3. ROLLBACK: Reverts the changes made during the transaction if
something goes wrong.

Example of a Complete Transaction


sql
BEGIN TRANSACTION;

-- Deduct from sender's account
UPDATE Accounts
SET balance = balance - 200
WHERE account_id = 1;

-- Add to receiver's account
UPDATE Accounts
SET balance = balance + 200
WHERE account_id = 2;

-- If all updates succeed, commit the transaction
COMMIT;

In case of any failure, the transaction is rolled back and no changes are made:

sql
ROLLBACK;

Summary of ACID Properties


Property | Description | Example
Atomicity | The transaction is atomic; either all operations are completed, or none are. | If an error occurs during a bank transfer, no money is deducted or added to any account.
Consistency | The transaction must bring the database from one valid state to another, adhering to constraints. | Enforcing rules like foreign keys or check constraints to maintain the integrity of the database.
Isolation | The transaction is isolated from others, ensuring that no other transactions can interfere. | Two bank transfers happening at the same time on the same account won't lead to inconsistent balances.
Durability | Once committed, the changes are permanent and survive system failures. | After committing a transaction, even if the system crashes, the changes are saved and recoverable.

These properties are crucial in maintaining the integrity, reliability, and
correctness of the database in a multi-user environment.

Concurrency Control in Databases

Concurrency control refers to the management of concurrent access to a database
by multiple users or transactions. It ensures that transactions are executed in a way
that maintains the consistency and integrity of the database, even when multiple
transactions are accessing or modifying the database at the same time. Without
proper concurrency control, issues such as lost updates, dirty reads, non-repeatable
reads, and phantom reads can arise, leading to inconsistent or incorrect data.

Key Issues in Concurrency Control

1. Lost Update: This occurs when two transactions simultaneously update the
same data, and one of the updates is overwritten by the other, resulting in
a loss of one update.
○ Example: If Transaction 1 reads a value, adds 100 to it, and writes it
back, and at the same time Transaction 2 reads the same original value,
adds 50, and writes it back last, the update from Transaction 1 is lost:
the final value reflects only the +50.
2. Dirty Read: A transaction reads data that has been written by another
transaction but not yet committed. If the other transaction is
rolled back, the data read by the first transaction is invalid.
○ Example: If Transaction 1 updates a row but has not yet committed,
and Transaction 2 reads that uncommitted data, Transaction 2 may
work with invalid data.
3. Non-Repeatable Read: A transaction reads a value, but when it reads the
same value again, it has changed due to another concurrent transaction.
○ Example: If Transaction 1 reads a value, then Transaction 2 updates
that value and commits, when Transaction 1 reads the value again, it
gets a different result.
4. Phantom Read: A transaction reads a set of rows that match a given
condition, but another transaction concurrently inserts, deletes,
or updates rows, causing the result set to change between reads
within the same transaction.
○ Example: Transaction 1 queries for all accounts with a balance greater
than $1000, but Transaction 2 inserts a new account with a balance of
$1500. When Transaction 1 queries again, it gets a different set of
results.
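The lost update problem can be simulated without real concurrency by interleaving the reads and writes by hand. The sketch below assumes Python's sqlite3 module and a single connection standing in for two transactions, so it only illustrates the ordering of operations, not actual parallel execution; the helper names are hypothetical.

```python
import sqlite3

def fresh_db():
    """Create an in-memory database with one account holding 1000."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Accounts (account_id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO Accounts VALUES (1, 1000)")
    conn.commit()
    return conn

def read_balance(conn):
    return conn.execute(
        "SELECT balance FROM Accounts WHERE account_id = 1").fetchone()[0]

# Interleaved read-modify-write: both "transactions" read before either writes.
conn = fresh_db()
t1_seen = read_balance(conn)              # Transaction 1 reads 1000
t2_seen = read_balance(conn)              # Transaction 2 reads 1000
conn.execute("UPDATE Accounts SET balance = ? WHERE account_id = 1", (t1_seen + 100,))
conn.execute("UPDATE Accounts SET balance = ? WHERE account_id = 1", (t2_seen + 50,))
lost_update_result = read_balance(conn)   # Transaction 1's +100 is overwritten

# Letting the database apply each change in place avoids the stale read.
conn = fresh_db()
conn.execute("UPDATE Accounts SET balance = balance + 100 WHERE account_id = 1")
conn.execute("UPDATE Accounts SET balance = balance + 50 WHERE account_id = 1")
safe_result = read_balance(conn)
print(lost_update_result, safe_result)
```

The first sequence ends at 1050 because the second write is based on a stale value; the second sequence ends at 1150 because each UPDATE reads the current balance at the moment it runs.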

Concurrency Control Mechanisms

There are two primary methods for managing concurrency:

1. Locking Mechanisms
2. Timestamp-based Protocols

1. Locking Mechanisms

Locks are used to control access to data items and prevent conflicting
operations from happening simultaneously. When a transaction locks a data
item, other transactions cannot perform conflicting operations on that item
until the lock is released.

Types of Locks:

● Shared Lock (S-Lock): A shared lock allows multiple transactions to read (but
not modify) a data item. Other transactions can acquire shared locks but
cannot modify the data until the lock is released.
○ Example: If Transaction 1 reads a record, it may acquire a shared lock
on that record to prevent any other transactions from modifying it
until the transaction is complete.
● Exclusive Lock (X-Lock): An exclusive lock allows a transaction to both
read and modify a data item. No other transactions can read or write the
data until the lock is released.
○ Example: If Transaction 1 is updating a record, it acquires an
exclusive lock to prevent other transactions from accessing or
modifying that record until it is finished.
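The rules for shared and exclusive locks amount to a small compatibility table. The following sketch expresses it in Python; the names COMPATIBLE and can_grant are hypothetical and only illustrate the decision a lock manager makes, not any particular database's implementation.

```python
# Lock modes: "S" = shared (read), "X" = exclusive (read and write).
# Key: (mode already held by another transaction, mode being requested).
COMPATIBLE = {
    ("S", "S"): True,    # many readers may share an item
    ("S", "X"): False,   # a writer must wait for readers
    ("X", "S"): False,   # readers must wait for a writer
    ("X", "X"): False,   # only one writer at a time
}

def can_grant(requested, held_by_others):
    """Return True if `requested` is compatible with every lock held by others."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)

print(can_grant("S", ["S", "S"]))  # another reader joins existing readers
print(can_grant("X", ["S"]))       # writer blocked by a reader
print(can_grant("X", []))          # no other locks held
```

Only the shared/shared combination is compatible, which is why readers never block each other but any writer blocks everyone else on that item.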

Locking Protocols:

● Two-Phase Locking (2PL): This is a protocol that guarantees serializable
schedules (the behavior of the highest isolation level). It requires that a
transaction follows two phases:
○ Growing Phase: The transaction can acquire locks but cannot
release any locks.
○ Shrinking Phase: The transaction can release locks but cannot acquire
any new locks.
● Once a transaction releases its first lock, it enters the shrinking phase and
cannot acquire more locks, ensuring that no other transaction can interfere
with its operations while it still holds the locks it needs.
○ Example: A transaction acquires locks on data items as it needs them
(growing phase). In the common strict variant, it keeps its exclusive
locks until it commits or rolls back and only then releases them
(shrinking phase), so other transactions never see its uncommitted
changes.
● Deadlock: A deadlock occurs when two or more transactions are waiting for
each other to release locks, causing a cycle of dependency where none of
the transactions can proceed. To resolve deadlocks, the system can use
techniques like deadlock detection (identifying circular waits) or deadlock
prevention (avoiding situations where deadlocks can occur).
○ Example: Transaction 1 holds a lock on data item A and waits for a
lock on data item B, while Transaction 2 holds a lock on data item B
and waits for a lock on data item A. Both transactions are now in a
deadlock.
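Deadlock detection is commonly described in terms of a wait-for graph: an edge from T1 to T2 means T1 is waiting for a lock that T2 holds, and a cycle in the graph is a deadlock. The sketch below is one way to find such a cycle with a depth-first search; the function name find_deadlock and the dictionary representation are assumptions made for illustration.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle in the wait-for graph, or None.

    waits_for maps each transaction to the set of transactions it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for other in sorted(waits_for.get(txn, ())):
            state = color.get(other, WHITE)
            if state == GREY:
                # `other` is already on the current path: a circular wait.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in sorted(waits_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None

# T1 waits for T2 (which holds B) and T2 waits for T1 (which holds A).
print(find_deadlock({"T1": {"T2"}, "T2": {"T1"}}))
print(find_deadlock({"T1": {"T2"}, "T2": set()}))
```

Once a cycle is found, a system typically picks one transaction in it as the victim and rolls it back so the others can proceed.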

2. Timestamp-based Protocols

Timestamp-based concurrency control mechanisms ensure that transactions
execute in a way that respects the order of their timestamps, preventing conflicts
between transactions.

● Basic Timestamp Ordering: In this protocol, each transaction is assigned a
timestamp when it starts, and each data item records the timestamps of the
latest transaction that read it and the latest transaction that wrote it. The
protocol ensures that conflicting operations take effect in timestamp order.
○ Rules:
■ If Transaction T1 wants to read a data item that has already
been written by a transaction with a later timestamp, T1 is
aborted and restarted with a new timestamp. Otherwise the
read is allowed.
■ If T1 wants to write to a data item, it checks whether any
transaction with a later timestamp has already read or
written that item. If it has, T1 is aborted to maintain
consistency. Otherwise the write is allowed.
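The standard read and write checks of basic timestamp ordering can be sketched in a few lines. The class and function names below (Item, read, write, Aborted) are hypothetical, and the sketch tracks only timestamps for a single data item; it is not a complete scheduler.

```python
class Aborted(Exception):
    """Raised when an operation arrives too late and the transaction must restart."""

class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # largest timestamp of any transaction that read the item
        self.write_ts = 0   # timestamp of the transaction that last wrote the item

def read(item, ts):
    """Read on behalf of the transaction with timestamp ts."""
    if ts < item.write_ts:
        # A younger transaction already overwrote the value this read should see.
        raise Aborted(f"T{ts} is too old to read (written by T{item.write_ts})")
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    """Write on behalf of the transaction with timestamp ts."""
    if ts < item.read_ts or ts < item.write_ts:
        # A younger transaction has already read or written the item.
        raise Aborted(f"T{ts} is too old to write")
    item.value = value
    item.write_ts = ts

x = Item(10)
first_read = read(x, 2)          # T2 reads 10; read_ts becomes 2
try:
    write(x, 1, 99)              # T1 is older than the reader T2
    old_write_aborted = False
except Aborted:
    old_write_aborted = True
write(x, 3, 30)                  # T3 is the youngest so far: allowed
print(first_read, old_write_aborted, x.value)
```

An aborted transaction would normally be restarted with a fresh, larger timestamp, which is why this protocol needs no locks and cannot deadlock.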
● Thomas' Write Rule: This rule is a relaxation of the basic timestamp ordering
protocol. If a transaction T1 tries to write a data item X that a later
transaction T2 (with a higher timestamp) has already written, and no later
transaction has read X, then the write by T1 is obsolete and is simply
ignored instead of aborting T1.

Isolation Levels in SQL

SQL provides different isolation levels to control how transactions interact with
each other. These isolation levels define the extent to which the operations in one
transaction are isolated from other concurrent transactions:

1. Read Uncommitted:
○ Allows dirty reads, meaning a transaction can read uncommitted
data from another transaction.
○ Lowest level of isolation, resulting in a risk of reading invalid
data.
2. Read Committed:
○ Ensures that transactions only read committed data, preventing
dirty reads.
○ However, non-repeatable reads can occur, meaning the value
can change between reads within the same transaction.
3. Repeatable Read:
○ Ensures that if a transaction reads a data item, it will get the
same value if it reads it again.
○ Prevents dirty and non-repeatable reads, but phantom reads can
still happen.
4. Serializable:
○ The highest isolation level, ensuring that transactions are executed in
such a way that they behave as if they were executed one after the
other (serially).
○ Prevents dirty reads, non-repeatable reads, and phantom reads, but it
can lead to performance bottlenecks due to high locking.

Concurrency Control Techniques Summary

Technique | Description | Pros | Cons
Locking (2PL) | Uses locks to control access to data, ensuring serializability through two-phase locking. | Guarantees serializability and consistency. | Deadlocks can occur; performance issues due to excessive locking.
Timestamp Ordering | Transactions are ordered based on timestamps to maintain consistency without locking. | Efficient without locking; prevents conflicts. | Can cause transaction rollbacks due to conflicts.
Optimistic Concurrency | Assumes transactions will not conflict and validates at commit time. | Less locking overhead. | May require rolling back transactions if validation fails.
Multiversion Concurrency | Maintains multiple versions of data to allow concurrent reads and writes without blocking. | Improved read performance and concurrency. | Complex to manage multiple versions.

Triggers in Databases

A trigger in a database is a set of instructions that is automatically executed (or
triggered) in response to specific events on a particular table or view. Triggers are
used to enforce business rules, data integrity, and automation in the database
system. They are typically associated with insertions, updates, or deletions of
records.

Triggers are often used for:

● Enforcing business rules: Automatically checking and enforcing constraints
beyond what is defined by normal database constraints.
● Auditing changes: Tracking changes to data for purposes like logging or
auditing.
● Data synchronization: Automatically updating or synchronizing data in
related tables.
● Complex validations: Performing validation checks or preventing operations
that do not meet certain criteria.

Types of Triggers

Triggers can be classified based on the timing and the event that causes them to fire.

1. Based on Timing
● BEFORE Trigger: Executes before the triggering event (such as INSERT,
UPDATE, or DELETE) is applied to the data. These are useful when you
need to validate or modify data before it is actually written to
the database.
● AFTER Trigger: Executes after the triggering event has completed. These are
useful for performing actions like logging changes, updating related data, or
enforcing rules after the changes have been applied.
● INSTEAD OF Trigger: Executes in place of the triggering event. Instead of
performing the usual INSERT, UPDATE, or DELETE, it runs custom logic. This
is particularly useful for views that require updates.

2. Based on Events

Triggers can also be defined based on the type of event that causes them:

● INSERT Trigger: Fired when a new row is inserted into the table.
● UPDATE Trigger: Fired when an existing row is updated in the table.
● DELETE Trigger: Fired when a row is deleted from the table.

Basic Syntax of Triggers

The general syntax for creating a trigger varies by the database system (e.g.,
MySQL, PostgreSQL, SQL Server), but the structure is often similar. Below is a
general SQL syntax for creating a trigger:

sql
CREATE TRIGGER trigger_name
{ BEFORE | AFTER | INSTEAD OF }
{ INSERT | UPDATE | DELETE }
ON table_name
FOR EACH ROW
BEGIN
-- Triggered action goes here
END;

● trigger_name: Name of the trigger.
● BEFORE | AFTER | INSTEAD OF: Timing of the trigger.
● INSERT | UPDATE | DELETE: Type of event that fires the trigger.
● table_name: The name of the table where the trigger is defined.
● FOR EACH ROW: Specifies that the trigger will be executed for each
affected row (common in row-level triggers).

Examples of Triggers

1. BEFORE INSERT Trigger

This trigger, written in MySQL syntax, checks whether the value being inserted
into the salary column is at least 10000. If it is lower, the trigger prevents
the insert operation.

sql
CREATE TRIGGER check_salary_before_insert
BEFORE INSERT ON Employees
FOR EACH ROW
BEGIN
IF NEW.salary < 10000 THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'Salary must be at least 10,000';
END IF;
END;

● Explanation: The trigger fires before a new row is inserted into the Employees
table. If the salary value is less than 10,000, the trigger raises an error and
prevents the insert.
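The same validation can be tried locally without a MySQL server. The sketch below assumes Python's sqlite3 module, so the trigger body uses SQLite's RAISE(ABORT, ...) in place of MySQL's SIGNAL statement; the try_insert helper and sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, salary INTEGER);

    -- SQLite version of the check: abort the INSERT when the salary is too low.
    CREATE TRIGGER check_salary_before_insert
    BEFORE INSERT ON Employees
    FOR EACH ROW
    WHEN NEW.salary < 10000
    BEGIN
        SELECT RAISE(ABORT, 'Salary must be at least 10,000');
    END;
""")

def try_insert(emp_id, name, salary):
    """Attempt the insert; return 'inserted' or the database error message."""
    try:
        conn.execute("INSERT INTO Employees VALUES (?, ?, ?)", (emp_id, name, salary))
        conn.commit()
        return "inserted"
    except sqlite3.DatabaseError as exc:
        conn.rollback()
        return str(exc)

low = try_insert(1, "Low Pay", 5000)
ok = try_insert(2, "Fair Pay", 20000)
stored = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
print(low, ok, stored)
```

Only the row that satisfies the rule is stored; the rejected insert surfaces to the application as a database error carrying the trigger's message.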

2. AFTER INSERT Trigger

This trigger logs information about new rows added to the Employees table into a
separate
AuditLog table.

sql
CREATE TRIGGER log_employee_insert
AFTER INSERT ON Employees
FOR EACH ROW
BEGIN
INSERT INTO AuditLog (action, table_name, record_id, action_time)
VALUES ('INSERT', 'Employees', NEW.EmployeeID, NOW());
END;

● Explanation: After a new row is inserted into the Employees table, this
trigger logs the insertion in the AuditLog table, recording the action, table
name, record ID, and the current timestamp.
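An audit trigger of this kind can be exercised end to end in a small sketch. It assumes Python's sqlite3 module, so datetime('now') stands in for MySQL's NOW(); the AuditLog column types and the sample employee are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE AuditLog (
        action      TEXT,
        table_name  TEXT,
        record_id   INTEGER,
        action_time TEXT
    );

    CREATE TRIGGER log_employee_insert
    AFTER INSERT ON Employees
    FOR EACH ROW
    BEGIN
        INSERT INTO AuditLog (action, table_name, record_id, action_time)
        VALUES ('INSERT', 'Employees', NEW.EmployeeID, datetime('now'));
    END;
""")

# The application only inserts into Employees; the trigger writes the log row.
conn.execute("INSERT INTO Employees VALUES (7, 'Gita')")
conn.commit()

audit_rows = conn.execute(
    "SELECT action, table_name, record_id FROM AuditLog").fetchall()
print(audit_rows)
```

Because the log row is written by the database itself, every insert is recorded regardless of which application performed it.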

3. AFTER DELETE Trigger

This trigger updates a Department table to decrement the employee count after
an employee is deleted.

sql
CREATE TRIGGER update_department_count_after_delete
AFTER DELETE ON Employees
FOR EACH ROW
BEGIN
UPDATE Department
SET employee_count = employee_count - 1
WHERE DepartmentID = OLD.DepartmentID;
END;

● Explanation: After an employee is deleted, this trigger updates the
employee_count field in the Department table to reflect the
removal of an employee. The OLD keyword refers to the values of
the row before it was deleted.

4. INSTEAD OF Trigger (on a View)

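If you have a view that combines data from multiple tables and you want to
allow INSERT, UPDATE, or DELETE operations on the view, you can use an
INSTEAD OF trigger. These triggers are supported by, for example, SQL Server,
Oracle, PostgreSQL, and SQLite, but not by MySQL.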

sql
CREATE TRIGGER instead_of_update_on_view
INSTEAD OF UPDATE ON EmployeeSalaryView
FOR EACH ROW
BEGIN
UPDATE Employees
SET salary = NEW.salary
WHERE EmployeeID = OLD.EmployeeID;
END;

● Explanation: This INSTEAD OF trigger allows updates on a view
(EmployeeSalaryView). When an update is performed on the view, the
trigger instead updates the Employees table directly.
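SQLite accepts this trigger almost as written, which makes it convenient for a quick check. The following sketch assumes Python's sqlite3 module; the definition of EmployeeSalaryView and the sample row are assumptions, since the notes do not show how the view is defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT, salary INTEGER);
    INSERT INTO Employees VALUES (101, 'Asha', 50000);

    -- Hypothetical view definition; SQLite views are read-only by default.
    CREATE VIEW EmployeeSalaryView AS
        SELECT EmployeeID, Name, salary FROM Employees;

    CREATE TRIGGER instead_of_update_on_view
    INSTEAD OF UPDATE ON EmployeeSalaryView
    FOR EACH ROW
    BEGIN
        UPDATE Employees
        SET salary = NEW.salary
        WHERE EmployeeID = OLD.EmployeeID;
    END;
""")

# The UPDATE targets the view; the trigger redirects it to the base table.
conn.execute("UPDATE EmployeeSalaryView SET salary = 55000 WHERE EmployeeID = 101")
conn.commit()

new_salary = conn.execute(
    "SELECT salary FROM Employees WHERE EmployeeID = 101").fetchone()[0]
print(new_salary)
```

Without the trigger, SQLite would reject the UPDATE on the view; with it, the change lands in the Employees table.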

Trigger Considerations

1. Performance Impact: Triggers can affect the performance of the database
because they introduce additional processing. They can make insert,
update, and delete operations slower, as they execute additional
logic behind the scenes.
2. Recursive Triggers: Be careful with recursive triggers where one trigger
causes another trigger to fire. For example, an AFTER INSERT trigger might
insert another record, which causes another trigger to execute, and so on.
To avoid infinite loops, recursive triggers are often disabled or restricted.
3. Data Integrity: While triggers can enforce business rules and integrity
constraints that are not easily captured with normal constraints,
they can also introduce complexity that makes the system
harder to maintain.
4. Error Handling: Triggers often use error handling mechanisms to roll back
changes if the condition is not met (e.g., raising an exception or signaling
an error).
5. Use Cases:
○ Auditing: Logging changes to critical tables (inserts, updates,
deletes).
○ Referential Integrity: Automatically updating or deleting related rows
in child tables when rows in a parent table are modified or deleted.
○ Enforcing business rules: Preventing invalid data from being
inserted or updating data automatically based on certain rules.

Stored Procedures in Databases

A stored procedure is a set of SQL statements that are stored in the database and
can be executed (or invoked) by a user or application. Unlike ad-hoc SQL queries,
which are executed one at a time, a stored procedure allows you to write a series
of SQL commands and execute them with a single call. Stored procedures are often
used to encapsulate business logic, improve performance, and provide better
security and maintainability of database operations.

Advantages of Stored Procedures

1. Performance:
○ In many database systems, stored procedures are parsed once and
their execution plans can be cached and reused. This can lead to
faster execution compared to sending the same SQL queries
individually.
2. Reusability:
○ Once created, stored procedures can be reused by multiple applications
or users, reducing the amount of redundant SQL code.
3. Maintainability:
○ Storing business logic in stored procedures centralizes the code. If
there’s a need to change the logic, it only needs to be updated in the
procedure, rather than in multiple application code locations.
4. Security:
○ Access to the data can be controlled via the stored procedure,
providing an additional layer of security. Users can be granted
permissions to execute a procedure without having direct access
to the underlying tables.
5. Reduced Network Traffic:
○ By executing a series of SQL commands in one call, stored
procedures reduce the amount of data that needs to be transmitted
over the network.
6. Error Handling:
○ Stored procedures support error handling, allowing you to manage
exceptions and roll back transactions if necessary.

Creating and Using Stored Procedures

Basic Syntax for Creating a Stored Procedure

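The syntax for creating a stored procedure differs between databases (MySQL,
PostgreSQL, SQL Server, Oracle). The examples in this section use MySQL-style
syntax. In the mysql command-line client, change the statement delimiter first
(for example, DELIMITER //) so that the semicolons inside the procedure body do
not end the CREATE PROCEDURE statement early. The basic structure is:

CREATE PROCEDURE procedure_name (parameter1 datatype, parameter2 datatype, ...)
BEGIN
    -- SQL statements
END;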

● procedure_name: The name of the procedure.
● parameter1, parameter2, ...: Parameters that the procedure accepts, which
can be used inside the procedure.
● SQL statements: The SQL code that the procedure will execute.

Example 1: Simple Stored Procedure

Let’s create a stored procedure that takes an employee ID as a parameter and
retrieves information about that employee.

CREATE PROCEDURE GetEmployeeInfo (IN emp_id INT)
BEGIN
    SELECT * FROM Employees WHERE EmployeeID = emp_id;
END;

● Explanation: This stored procedure, GetEmployeeInfo, accepts an emp_id as
an input parameter and retrieves the record for that employee from the
Employees table.

Example 2: Stored Procedure with Output Parameters

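You can also use stored procedures with output parameters to return values back
to the caller.

CREATE PROCEDURE GetEmployeeSalary (IN emp_id INT, OUT emp_salary DECIMAL(10,2))
BEGIN
    -- The output parameter is named emp_salary rather than salary: in MySQL a
    -- parameter with the same name as a column takes precedence over the column.
    SELECT Salary INTO emp_salary FROM Employees WHERE EmployeeID = emp_id;
END;

● Explanation: This stored procedure accepts an emp_id as an input and
returns the employee's salary via the output parameter emp_salary. The
parameter is declared DECIMAL(10,2) because a bare DECIMAL in MySQL defaults
to DECIMAL(10,0), which would drop the fractional part.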

Example 3: Stored Procedure with Multiple SQL Statements

You can execute multiple SQL statements within a stored procedure. Here's an
example where we insert a new employee record and return the newly inserted
employee’s ID.

CREATE PROCEDURE AddEmployee (IN emp_name VARCHAR(100),
                              IN emp_salary DECIMAL(10,2),
                              OUT new_emp_id INT)
BEGIN
    INSERT INTO Employees (EmployeeName, Salary)
    VALUES (emp_name, emp_salary);

    SET new_emp_id = LAST_INSERT_ID();
END;

● Explanation: This procedure inserts a new employee into the Employees table
and retrieves the ID of the newly inserted employee using the
LAST_INSERT_ID() function, storing it in the output parameter new_emp_id.
This assumes that EmployeeID is an AUTO_INCREMENT column.

Executing Stored Procedures

Once a stored procedure is created, you can execute it using a CALL statement in
SQL.

Example: Executing a Stored Procedure with Parameters

CALL GetEmployeeInfo(101);

● Explanation: This command calls the GetEmployeeInfo procedure and
passes the employee ID 101 as a parameter. It returns the information
for the employee with that ID.

Example: Executing a Stored Procedure with Output Parameters

CALL GetEmployeeSalary(101, @salary);
SELECT @salary;

● Explanation: This calls the GetEmployeeSalary procedure with the employee
ID 101 and retrieves the salary via the user variable @salary, which
receives the value of the output parameter.

Control-of-Flow in Stored Procedures

Stored procedures support control-of-flow constructs such as conditional
statements, loops, and error handling.

1. IF Statement:
○ Used for conditional execution of SQL statements.

CREATE PROCEDURE CheckSalary(IN emp_id INT)
BEGIN
    DECLARE emp_salary DECIMAL(10,2);
    SELECT Salary INTO emp_salary FROM Employees WHERE EmployeeID = emp_id;

    IF emp_salary < 10000 THEN
        SELECT 'Salary is below 10,000';
    ELSE
        SELECT 'Salary is sufficient';
    END IF;
END;
2. LOOP / WHILE Loop:
○ Loops can be used to repeat a set of statements based on certain
conditions.

CREATE PROCEDURE DecreaseSalaries (IN percentage DECIMAL(5,2))
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE emp_id INT;
    DECLARE emp_salary DECIMAL(10,2);
    DECLARE cur CURSOR FOR
        SELECT EmployeeID, Salary FROM Employees;
    -- Set done when the cursor runs out of rows
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cur;
    read_loop: LOOP
        FETCH cur INTO emp_id, emp_salary;
        IF done THEN
            LEAVE read_loop;
        END IF;

        UPDATE Employees SET Salary = Salary * (1 - percentage / 100)
        WHERE EmployeeID = emp_id;
    END LOOP;
    CLOSE cur;
END;

○ Explanation: This procedure uses a cursor to loop through all
employees and decrease their salary by a specified percentage.
3. Error Handling:
○ MySQL procedures handle errors with condition handlers
(DECLARE ... HANDLER); SQL Server offers TRY...CATCH blocks for the
same purpose.

CREATE PROCEDURE UpdateEmployeeSalary(IN emp_id INT, IN new_salary DECIMAL(10,2))
BEGIN
    -- On any SQL error, report the message and exit the procedure
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        GET DIAGNOSTICS CONDITION 1 @error_message = MESSAGE_TEXT;
        SELECT @error_message AS ErrorMessage;
    END;

    UPDATE Employees SET Salary = new_salary WHERE EmployeeID = emp_id;
END;

○ Explanation: This procedure attempts to update an employee’s
salary and handles any errors by displaying the error message.
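A design note on the cursor loop in DecreaseSalaries: it updates one row per iteration, which is useful for demonstrating loops, but when every row receives the same treatment a single set-based UPDATE usually gives the same result with less work. The sketch below compares the two approaches. It is an illustration only: it uses Python's built-in sqlite3 module (SQLite has no stored procedures), integer salaries to keep the arithmetic exact, and made-up table contents.

```python
import sqlite3

def make_db():
    """Create an in-memory Employees table with sample (hypothetical) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Salary INTEGER)")
    conn.executemany(
        "INSERT INTO Employees VALUES (?, ?)",
        [(1, 40000), (2, 60000), (3, 80000)])
    return conn

percentage = 10

# Row-by-row, mirroring the cursor loop: one UPDATE per employee
loop_db = make_db()
for (emp_id,) in loop_db.execute("SELECT EmployeeID FROM Employees").fetchall():
    loop_db.execute(
        "UPDATE Employees SET Salary = Salary * (100 - ?) / 100 "
        "WHERE EmployeeID = ?",
        (percentage, emp_id))

# Set-based: one UPDATE covers every row
set_db = make_db()
set_db.execute(
    "UPDATE Employees SET Salary = Salary * (100 - ?) / 100", (percentage,))

query = "SELECT EmployeeID, Salary FROM Employees ORDER BY EmployeeID"
loop_result = loop_db.execute(query).fetchall()
set_result = set_db.execute(query).fetchall()
```

Both versions leave the table in the same state; the loop form becomes necessary mainly when each row needs different, procedural handling.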

Transaction Control in Stored Procedures

You can also include transaction control commands (such as COMMIT and ROLLBACK)
in stored procedures to ensure that multiple statements are executed as a single
transaction.

CREATE PROCEDURE TransferFunds (IN from_account INT, IN to_account INT,
                                IN amount DECIMAL(10,2))
BEGIN
    DECLARE current_balance DECIMAL(10,2);

    -- If any statement fails, undo partial changes and re-raise the error
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        RESIGNAL;
    END;

    -- Begin transaction
    START TRANSACTION;

    -- Check the balance inside the transaction, locking the source row
    SELECT balance INTO current_balance FROM Accounts
    WHERE AccountID = from_account FOR UPDATE;

    IF current_balance < amount THEN
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Insufficient balance';
    END IF;

    -- Transfer funds
    UPDATE Accounts SET balance = balance - amount WHERE AccountID = from_account;
    UPDATE Accounts SET balance = balance + amount WHERE AccountID = to_account;

    -- Commit transaction
    COMMIT;
END;

● Explanation: This procedure transfers money from one account to another. It
checks for sufficient balance inside the transaction and commits only if
both updates succeed; if any statement fails, the handler rolls back the
changes so that no partial transfer is left behind.
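The all-or-nothing behavior described above can be tried without a MySQL server. The following is a minimal sketch, not the procedure itself: it uses Python's built-in sqlite3 module, the transfer_funds function and the sample balances are hypothetical, and it debits first and checks afterwards purely to show that a rollback undoes a change that has already been made.

```python
import sqlite3

def transfer_funds(conn, from_account, to_account, amount):
    """Move amount between accounts; commit only if every step succeeds."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE AccountID = ?",
                (amount, from_account))
            (balance,) = conn.execute(
                "SELECT balance FROM Accounts WHERE AccountID = ?",
                (from_account,)).fetchone()
            if balance < 0:
                # Raising here triggers the rollback of the debit above
                raise ValueError("Insufficient balance")
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE AccountID = ?",
                (amount, to_account))
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

ok = transfer_funds(conn, 1, 2, 200)       # enough money: committed
failed = transfer_funds(conn, 1, 2, 1000)  # not enough: rolled back
balances = conn.execute(
    "SELECT AccountID, balance FROM Accounts ORDER BY AccountID").fetchall()
```

After both calls, only the first transfer is visible in the table; the failed attempt leaves the balances exactly as they were before it started.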
