
SQL for Beginners to Advance Le - RAJPUT, ANANT

This document is a comprehensive guide for learning SQL from basic to advanced levels, including definitions, commands, and interview questions. It covers key concepts such as relational databases, SQL commands, constraints, data integrity, and normalization. The document emphasizes the importance of understanding SQL for effective data management and retrieval in relational database systems.

Uploaded by

evelasquez

SQL FOR BEGINNERS TO ADVANCE LEVEL

LEARN SQL FROM BASIC TO ADVANCE LEVEL IN SIMPLEST WAY WITH SOME INTERVIEW QUESTIONS

Special Thanks to Mr. GIRISH PRATAP SINGH who secured ALL INDIA RANK 32 in GATE EXAM.
Copyright © by GIRISH PRATAP SINGH
All rights reserved. No part of this book may be reproduced,
stored in a retrieval system, or transmitted in any form or by
any means, electronic, mechanical, photocopying, recording, or
otherwise, without prior written permission from the copyright
owner, except for brief quotations embodied in critical reviews
and certain other noncommercial uses permitted by copyright
law. For permission requests, write to the publisher at
[email protected]
This work is protected by the copyright laws of India and other
countries. Unauthorized reproduction or distribution of this
work, or any portion of it, may result in severe civil and criminal
penalties, and will be prosecuted to the maximum extent
possible under the law.
**Please note that all legal proceedings related to this work,
including any disputes or discrepancies, shall be subject to the
jurisdiction of the Allahabad High court.**
I am not going to give you an index page; just follow each page in order.
Thank you.

We will start with definitions and then carry on with queries and interview questions.

Before going through this book, install a SQL application on your laptop/desktop. The easiest option is an online SQL application, where you can run SQL commands without downloading anything. The rest is your choice.

SQL: SQL stands for Structured Query Language, a computer language for storing, manipulating and retrieving data stored in a relational database.
Relational database: A database structured to recognize relations among stored items of information.
SQL is the standard language for relational database systems.
Relational database management system (RDBMS): Software used to store, manipulate or retrieve data stored in a relational database.
RDBMSs such as MySQL, MS Access, Oracle, Sybase, Informix, PostgreSQL and SQL Server use SQL as their standard database language. They also use different dialects, such as:
• MS SQL Server using T-SQL.

SQL COMMANDS

SQL Commands: These are the commands you will use in your SQL queries.

DDL - Data Definition Language: Create, Alter, Drop, Truncate (Remember it as D CAT)

DML - Data Manipulation Language: Insert, Update, Delete, Call, Lock (Remember it as Lock CUDI)

DCL - Data Control Language: Grant, Revoke

DQL - Data Query Language: Select

TCL - Transaction Control Language: Commit, Rollback, Savepoint, Set Transaction, Set Constraints (CCPTR)
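To make the categories concrete, here is a minimal sketch that runs one statement from most of them in sequence. The table name DemoNotes and its columns are made up for illustration, and the BEGIN keyword for starting a transaction is an assumption about your system (SQL Server, for example, expects BEGIN TRANSACTION). DCL statements such as GRANT are left out because they depend on user accounts that exist only on your own server.

```sql
-- DDL: define a new table (DemoNotes is a hypothetical name)
CREATE TABLE DemoNotes (
    NoteID INT PRIMARY KEY,
    Body VARCHAR(100)
);

-- TCL: start a transaction (BEGIN is assumed; some systems use BEGIN TRANSACTION)
BEGIN;

-- DML: add a row and then change it
INSERT INTO DemoNotes (NoteID, Body) VALUES (1, 'first draft');
UPDATE DemoNotes SET Body = 'final text' WHERE NoteID = 1;

-- TCL: make the changes permanent
COMMIT;

-- DQL: read the data back
SELECT NoteID, Body FROM DemoNotes;

-- DDL: remove the table again
DROP TABLE DemoNotes;
```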
RDBMS

RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL and for relational database systems such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.
A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as introduced by E. F. Codd.

What is a table? The data in an RDBMS is stored in database objects called tables. A table is a collection of related data entries and it consists of columns and rows. Remember, a table is the most common and simplest form of data storage in a relational database.
A field is a column in a table that is designed to maintain specific information about every record in the table. A column is a vertical entity in a table that contains all information associated with a specific field in a table.
A record, also called a row of data, is each individual entry that exists in a table.
NULL VALUE
In SQL, a `NULL` value represents the absence of a value or an unknown value in a database field. It is not the same as an empty string or zero; it signifies the absence of any data in that particular column. Understanding and handling `NULL` values is an essential part of SQL programming.

Here's an example to illustrate NULL values:

Suppose you have a table named Employees with the following structure:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Birthdate DATE,
DepartmentID INT
);

Now, let's insert some data into this table:


INSERT INTO Employees (EmployeeID, FirstName, LastName, Birthdate,
DepartmentID)
VALUES
(1, 'John', 'Doe', '1990-05-15', 101),
(2, 'Jane', 'Smith', '1985-08-22', NULL),
(3, 'Robert', 'Johnson', NULL, 102),
(4, 'Emily', 'Brown', '1998-03-10', 101);

In this example, we've inserted four rows into the Employees table.
Note that:

1. The Birthdate column has a NULL value for employee 3 (Robert) because we don't have his birthdate information.
2. The DepartmentID column has a NULL value for employee 2 (Jane) because we don't know which department she belongs to.

Now open your laptop/desktop, run the commands above, then run the queries below and look at the output.
1- SELECT * FROM Employees; (press Enter after the semicolon)
2- SELECT * FROM Employees WHERE Birthdate IS NULL;
3- SELECT * FROM Employees WHERE Birthdate IS NOT NULL;
Handling NULL values in SQL often involves using the IS NULL
and IS NOT NULL operators to filter and manage data
effectively. It's important to be cautious when working with NULL
values, as they can affect query results and require special
consideration in your database design and application logic.
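The reason IS NULL exists as a separate operator can be seen in a short sketch against the Employees table and sample rows created above. The COALESCE fallback value 0 and the alias DepartmentOrZero are arbitrary choices made for this illustration.

```sql
-- Comparing with = NULL never matches: the comparison evaluates to
-- "unknown" rather than true, so this query returns no rows
SELECT * FROM Employees WHERE DepartmentID = NULL;

-- IS NULL is the correct test; with the sample data this returns Jane Smith
SELECT * FROM Employees WHERE DepartmentID IS NULL;

-- COALESCE substitutes a fallback value for NULL in the output
-- (0 is an arbitrary placeholder chosen for this sketch)
SELECT FirstName, COALESCE(DepartmentID, 0) AS DepartmentOrZero
FROM Employees;
```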
CONSTRAINTS
In SQL, constraints are rules and restrictions applied to a table's
columns to maintain data integrity and ensure that the data
stored in the table follows certain criteria. Constraints help
enforce business rules and prevent data inconsistencies. There
are several types of constraints in SQL, including:
Primary Key Constraint:

Ensures that a column (or a set of columns) uniquely identifies each row in a table.
Primary key values must be unique and cannot contain NULL values.
Example table creation with a primary key constraint:
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50)
);

Unique Constraint:

Ensures that the values in a column (or a set of columns) are unique across all rows in a table.
Unlike a primary key, a unique constraint allows NULL values (how many NULLs are permitted varies by DBMS).
Example table creation with a unique constraint:
CREATE TABLE Products (
ProductID INT UNIQUE,
ProductName VARCHAR(50),
Price DECIMAL(10, 2)
);
Foreign Key Constraint:

Establishes a relationship between two tables by enforcing referential integrity.
A foreign key in one table refers to the primary key in another table.
Example table creation with a foreign key constraint:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE,
FOREIGN KEY (CustomerID) REFERENCES
Customers(CustomerID)
);
Check Constraint:

Defines a condition that data in a column must satisfy.
The data is only inserted or updated if it meets the specified condition.
Example table creation with a check constraint:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Salary DECIMAL(10, 2) CHECK (Salary >= 0)
);
Not Null Constraint:

Ensures that a column cannot contain NULL values.
Example table creation with a not null constraint:
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
Email VARCHAR(100)
);
Constraints are an essential part of database design and
management, ensuring data consistency and reliability. They
help maintain the integrity of your data and prevent common
errors and inconsistencies.
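A quick way to see constraints at work is to try to break them. The sketch below uses a made-up table, DemoMembers, that combines several of the constraints described above; the table, its columns, and the sample values are all hypothetical. One caveat: MySQL versions before 8.0.16 accept a CHECK clause but do not enforce it.

```sql
-- Hypothetical table combining several constraint types
CREATE TABLE DemoMembers (
    MemberID INT PRIMARY KEY,
    Email VARCHAR(100) NOT NULL UNIQUE,
    Balance DECIMAL(10, 2) CHECK (Balance >= 0)
);

-- Accepted: satisfies every constraint
INSERT INTO DemoMembers (MemberID, Email, Balance)
VALUES (1, 'a@example.com', 100.00);

-- Each statement below is expected to be rejected with a constraint error
INSERT INTO DemoMembers (MemberID, Email, Balance)
VALUES (1, 'b@example.com', 10.00);   -- duplicate primary key

INSERT INTO DemoMembers (MemberID, Email, Balance)
VALUES (2, 'a@example.com', 10.00);   -- duplicate Email breaks UNIQUE

INSERT INTO DemoMembers (MemberID, Email, Balance)
VALUES (3, NULL, 10.00);              -- NULL in a NOT NULL column

INSERT INTO DemoMembers (MemberID, Email, Balance)
VALUES (4, 'd@example.com', -5.00);   -- negative Balance breaks CHECK
```

After running it, SELECT * FROM DemoMembers; should show only the first row.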

Drop Constraint:
Dropping a constraint removes the constraint from a table. The specific syntax for dropping a constraint varies by DBMS (e.g., SQL Server, PostgreSQL, MySQL). You typically need to specify the name of the constraint you want to drop.
Dropping a PRIMARY KEY Constraint (this assumes the constraint was named PK_Students when it was created; in MySQL the statement is ALTER TABLE Students DROP PRIMARY KEY instead):
ALTER TABLE Students
DROP CONSTRAINT PK_Students;
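Dropping a constraint by name only works if you know the name, so it helps to choose one when the table is created. The following is a sketch under the assumption that you are on a system that supports named constraints and DROP CONSTRAINT, such as PostgreSQL, SQL Server, or Oracle (SQLite does not support DROP CONSTRAINT). The table DemoStudents and the name PK_DemoStudents are invented for the example.

```sql
-- Name the primary key explicitly at creation time
CREATE TABLE DemoStudents (
    StudentID INT NOT NULL,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    CONSTRAINT PK_DemoStudents PRIMARY KEY (StudentID)
);

-- The chosen name can now be used to drop the constraint
ALTER TABLE DemoStudents
DROP CONSTRAINT PK_DemoStudents;
```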

PRIMARY KEY AND FOREIGN KEY


Primary Key: A primary key is a column or a set of columns in a table that uniquely identifies each row in that table. It enforces the uniqueness of values and ensures that there are no duplicate rows in the table. Primary keys are what foreign keys in other tables reference to establish relationships.

Foreign Key: A foreign key is a column or a set of columns in a table that establishes a link between the data in two tables. It represents a relationship between the tables and enforces referential integrity. A foreign key in one table references the primary key in another table.
Let's consider two tables, Employees and Departments, and describe the concepts of primary keys and foreign keys using this example.
Table 1: Employees
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
DepartmentID INT
);
In this table:

EmployeeID is the primary key, which uniquely identifies each employee in the Employees table.
FirstName and LastName are columns to store employee names.
DepartmentID is a column that will later serve as a foreign key, referencing the Departments table.
Table 2: Departments
CREATE TABLE Departments (
DepartmentID INT PRIMARY KEY,
DepartmentName VARCHAR(100)
);
In this table:

DepartmentID is the primary key, which uniquely identifies each department in the Departments table.
DepartmentName is a column to store the name of the department.
Primary Key:
In the Employees table, the EmployeeID column serves as the
primary key. This means that each employee must have a
unique EmployeeID, and no two employees can have the same
ID. The primary key constraint ensures data integrity by
preventing duplicate entries and providing a unique identifier for
each record.
Foreign Key:
In the Employees table, the DepartmentID column is intended to serve as a foreign key. This means that it establishes a relationship between the Employees table and the Departments table. Note that the CREATE TABLE statement above does not declare this constraint, so the database will not enforce it until a FOREIGN KEY constraint is added.
Specifically, once the constraint is declared:

The DepartmentID column in the Employees table references the DepartmentID column in the Departments table.
It ensures that the value in the DepartmentID column of an employee's record in the Employees table corresponds to a valid department in the Departments table.
Here's an example query to illustrate the use of foreign keys:
-- Insert a new department
INSERT INTO Departments (DepartmentID, DepartmentName)
VALUES (101, 'Human Resources');

-- Insert a new employee and associate them with the Human Resources department
INSERT INTO Employees (EmployeeID, FirstName, LastName, DepartmentID)
VALUES (1, 'John', 'Doe', 101);
In this example, we insert a new department into the
Departments table and then insert a new employee into the
Employees table, specifying the department they belong to
using the DepartmentID foreign key.
Using primary keys and foreign keys in this way ensures data
consistency and maintains the integrity of relationships between
tables in a relational database.
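Referential integrity is only enforced when the foreign key is actually declared. The sketch below declares it and then shows the two kinds of statement it is expected to block. The table names DemoDepartments and DemoEmployees are hypothetical, chosen so they do not clash with the tables above. In SQLite, foreign key checks are off by default and must be enabled first with PRAGMA foreign_keys = ON;.

```sql
-- Parent table (hypothetical name)
CREATE TABLE DemoDepartments (
    DepartmentID INT PRIMARY KEY,
    DepartmentName VARCHAR(100)
);

-- Child table with the foreign key declared explicitly
CREATE TABLE DemoEmployees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    DepartmentID INT,
    FOREIGN KEY (DepartmentID) REFERENCES DemoDepartments(DepartmentID)
);

INSERT INTO DemoDepartments (DepartmentID, DepartmentName)
VALUES (101, 'Human Resources');

-- Accepted: department 101 exists
INSERT INTO DemoEmployees (EmployeeID, FirstName, LastName, DepartmentID)
VALUES (1, 'John', 'Doe', 101);

-- Expected to be rejected: there is no department 999
INSERT INTO DemoEmployees (EmployeeID, FirstName, LastName, DepartmentID)
VALUES (2, 'Jane', 'Smith', 999);

-- Expected to be rejected: department 101 is still referenced by John
DELETE FROM DemoDepartments WHERE DepartmentID = 101;
```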

INDEX
An index in a database is a data structure that improves the speed of
data retrieval operations on a database table. It's like an organized
reference to the data stored in a table, making it faster to search,
filter, and retrieve specific records. Indexes are essential for
optimizing database performance, especially when dealing with large
datasets.
Here are some key points to understand about indexes:
1. Purpose of Indexes:

Speed up data retrieval: Indexes allow the database management system to locate specific rows in a table more quickly, reducing the need for a full table scan.
Facilitate efficient searching: Indexes are particularly useful for WHERE clause conditions in SQL queries, as they can dramatically reduce the time it takes to find matching rows.
2. How Indexes Work:

Indexes store a sorted or hashed subset of the data in a table, containing a copy of the indexed columns and pointers to the actual rows in the table.
When a query with a matching condition is executed, the database engine uses the index to quickly identify the relevant rows instead of scanning the entire table.
3. Types of Indexes:

Single-Column Index: Indexes created on a single column.
Composite Index: Indexes created on multiple columns, useful for queries with conditions involving multiple columns.
Unique Index: Ensures that indexed columns have unique values, similar to a primary key constraint.
Clustered Index: Dictates the physical order of data rows in a table. Each table can have only one clustered index.
Non-Clustered Index: Contains a copy of the indexed columns along with pointers to the actual data rows.
4. Creating and Managing Indexes:

Indexes are typically created with a separate CREATE INDEX statement after the table has been defined; many systems also create an index automatically for a primary key or unique constraint.
Indexes can be modified or dropped as needed to adapt to changing query patterns or data requirements.
5. Trade-offs and Considerations:

While indexes improve data retrieval speed, they come with some trade-offs. They require storage space and can slow down data insertion and update operations because the indexes must be maintained.
It's important to strike a balance between the number of indexes and the type of queries you expect to run. Over-indexing can be counterproductive.
Here's an example of creating a simple index in SQL:
-- Creating an index on the 'LastName' column of the 'Employees' table
CREATE INDEX idx_LastName ON Employees (LastName);
In this example, we're creating an index on the LastName column of the Employees table (a non-clustered index, in SQL Server terms). This index will speed up queries that involve searching or filtering based on the last name of employees.
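A composite index follows the same pattern with more than one column. This sketch assumes the Employees table above, with its LastName and FirstName columns; the index name idx_LastName_FirstName is an arbitrary choice. The DROP INDEX form shown is the one used by PostgreSQL, SQLite, and Oracle; MySQL and SQL Server expect DROP INDEX idx_LastName_FirstName ON Employees; instead.

```sql
-- Composite index on two columns (the index name is hypothetical)
CREATE INDEX idx_LastName_FirstName ON Employees (LastName, FirstName);

-- A query that filters on both indexed columns and can benefit from the index
SELECT EmployeeID, FirstName, LastName
FROM Employees
WHERE LastName = 'Doe' AND FirstName = 'John';

-- Remove the index when it is no longer useful
-- (syntax varies by DBMS; see the note above)
DROP INDEX idx_LastName_FirstName;
```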
DATA INTEGRITY
Data Integrity: Data integrity is a fundamental concept in the field of
data management and database systems. It refers to the accuracy,
consistency, and reliability of data in a database. Data integrity
ensures that data is valid, reliable, and free from errors or
inconsistencies. It involves maintaining the quality and
trustworthiness of data throughout its lifecycle.
The following categories of data integrity exist in each RDBMS:
• Entity Integrity: There are no duplicate rows in a table.
• Domain Integrity: Enforces valid entries for a given column by restricting the type, the format, or the range of values.
• Referential Integrity: Rows that are referenced by other records cannot be deleted.
• User-Defined Integrity: Enforces some specific business rules that do not fall into entity, domain, or referential integrity.

Data integrity is typically enforced through various means, including:

Constraints: Database constraints, such as primary keys, unique constraints, and check constraints, help maintain data integrity by ensuring that data follows predefined rules.
Referential Integrity: Foreign keys establish relationships between tables and maintain referential integrity by ensuring that related data remains consistent.
Transactions: The use of database transactions ensures that a series of operations on the database either all succeed or fail together, preventing partial updates and maintaining data consistency.
Data Validation: Implementing input validation and data validation checks at the application level helps prevent the insertion of incorrect or malformed data.
Data integrity is critical in various fields and industries, including
finance, healthcare, e-commerce, and more, as it ensures the
reliability and trustworthiness of data for decision-making, reporting,
and compliance purposes.
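The all-or-nothing behaviour of transactions mentioned above can be sketched with a simple transfer between two rows. The table DemoAccounts and its values are invented for this example, and BEGIN is assumed as the keyword that starts a transaction (SQL Server uses BEGIN TRANSACTION).

```sql
-- Hypothetical table with two accounts
CREATE TABLE DemoAccounts (
    AccountID INT PRIMARY KEY,
    Balance DECIMAL(10, 2) CHECK (Balance >= 0)
);
INSERT INTO DemoAccounts (AccountID, Balance)
VALUES (1, 100.00), (2, 50.00);

-- Transfer 30.00 from account 1 to account 2 as one unit of work
BEGIN;
UPDATE DemoAccounts SET Balance = Balance - 30.00 WHERE AccountID = 1;
UPDATE DemoAccounts SET Balance = Balance + 30.00 WHERE AccountID = 2;
COMMIT;   -- both updates become permanent together (balances 70.00 and 80.00)

-- Start another change, then abandon it
BEGIN;
UPDATE DemoAccounts SET Balance = Balance - 10.00 WHERE AccountID = 1;
ROLLBACK; -- the update is undone

-- Balances are still 70.00 and 80.00
SELECT AccountID, Balance FROM DemoAccounts;
```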

Database Normalization

Database normalization is a process in database design that helps eliminate data redundancy and ensure data integrity by organizing data into separate related tables. It is a set of rules and guidelines that guide the structure of a relational database to optimize its efficiency and reduce the risk of data anomalies.
The primary goals of database normalization are:

1. Eliminating Data Redundancy: Reducing data duplication by storing data in a structured and efficient manner. This saves storage space and avoids inconsistencies that can arise when the same data is stored in multiple places.
2. Minimizing Data Anomalies: Ensuring data integrity by preventing insertion, update, or deletion anomalies. Anomalies occur when changes to data in one place lead to inconsistencies or errors in other parts of the database.
3. Improving Data Consistency: Ensuring that data is stored consistently and without ambiguity, making it easier to maintain and query.
Database normalization is typically divided into several normal
forms, each with specific rules and criteria. The most commonly
used normal forms are:

First Normal Form (1NF):

Each column in a table must contain atomic (indivisible) values.
There should be no repeating groups or arrays in a column.
The order of rows and columns should not affect the data.
Second Normal Form (2NF):

The table must already be in 1NF.
All non-key attributes (columns) must be fully functionally dependent on the entire primary key.
Third Normal Form (3NF):

The table must already be in 2NF.
There should be no transitive dependencies. In other words, non-key attributes should not depend on other non-key attributes.
Boyce-Codd Normal Form (BCNF):

A stronger version of 3NF.
For every non-trivial functional dependency, the determinant (the left-hand side) must be a superkey.
Fourth Normal Form (4NF):

Deals with multi-valued dependencies.
A table is in 4NF if it's in BCNF and has no non-trivial multi-valued dependencies.
Fifth Normal Form (5NF) and Beyond:

These are more advanced forms of normalization, addressing complex situations and specific anomalies.
Normalization is an iterative process, and not all databases need to
be normalized to the highest level. The level of normalization
depends on the specific requirements and use cases of the
database. Over-normalization can lead to complex query
requirements, so it's essential to strike a balance between
normalization and query performance.
Normalization is a fundamental concept in relational database
design, and it helps ensure that data is stored efficiently, consistently,
and accurately, which is critical for data integrity and database
performance.
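As a small illustration of removing a transitive dependency (the 3NF rule above), consider the sketch below. All table and column names here are hypothetical. In the flat table, DepartmentName depends on DepartmentID rather than on the key EmployeeID, so the same department name would be repeated on every employee row; splitting it out stores each name once.

```sql
-- Not in 3NF: DepartmentName depends on DepartmentID, a non-key column
CREATE TABLE StaffFlat (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    DepartmentID INT,
    DepartmentName VARCHAR(100)
);

-- 3NF version: department details live in their own table
CREATE TABLE StaffDepartments (
    DepartmentID INT PRIMARY KEY,
    DepartmentName VARCHAR(100)
);

CREATE TABLE Staff (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    DepartmentID INT,
    FOREIGN KEY (DepartmentID) REFERENCES StaffDepartments(DepartmentID)
);

-- A join rebuilds the combined view whenever it is needed
SELECT s.EmployeeID, s.FirstName, d.DepartmentName
FROM Staff s
JOIN StaffDepartments d ON d.DepartmentID = s.DepartmentID;
```

The price of the split is the join in the last query, which is the kind of added query complexity the paragraph above warns about.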

Now we will go through the SQL commands. I hope you have your laptop with you so you can run the queries yourself.

SQL COMMANDS
CREATE AND INSERT
Let's start with SQL commands. First we will learn how to create a table and then how to insert data into that table.
Let's create a table called Students with the following columns: StudentID, FirstName, LastName, and Birthdate.
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Birthdate DATE
);
This SQL command creates a table named Students with four
columns: StudentID (as the primary key), FirstName, LastName,
and Birthdate.
Now, let's insert some sample data into the Students table:
-- Inserting a single student
INSERT INTO Students (StudentID, FirstName, LastName, Birthdate)
VALUES (1, 'John', 'Doe', '1995-03-15');

-- Inserting multiple students in a single statement
INSERT INTO Students (StudentID, FirstName, LastName, Birthdate)
VALUES
(2, 'Jane', 'Smith', '1998-07-22'),
(3, 'Robert', 'Johnson', '1997-05-10'),
(4, 'Emily', 'Brown', '2000-02-18');
In these SQL statements, we're inserting data into the Students table:

The first INSERT statement inserts a single student with StudentID 1 and other details.
The second INSERT statement inserts multiple students in a single statement, each with a unique StudentID.
Now, run this query on your laptop:
SELECT * FROM Students; (press Enter)
Remember to type the table name exactly as you created it. SQL keywords are not case sensitive, but in some systems (for example MySQL on Linux) table names are, and a mistyped table name will produce an error.
SELECT AND FROM

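Let's create a table called Employees with some sample data. If an Employees table from an earlier example still exists, remove it first by running DROP TABLE Employees; (this version has different columns):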


CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);
INSERT INTO Employees (EmployeeID, FirstName, LastName,
Department, Salary)
VALUES
(1, 'John', 'Doe', 'HR', 50000.00),
(2, 'Jane', 'Smith', 'Finance', 60000.00),
(3, 'Robert', 'Johnson', 'Engineering', 75000.00),
(4, 'Emily', 'Brown', 'Marketing', 55000.00);

Now, you have a table named Employees with data.

Perform a SELECT and FROM Query:
To retrieve data from the Employees table, you can use the SELECT and FROM commands as follows:
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees;

This SQL query selects the EmployeeID, FirstName, LastName, and Salary columns from the Employees table.
WHERE COMMAND
The WHERE clause is used to filter rows in a SQL query based on a
specified condition. Here's an example of a SELECT query with a
WHERE clause using the same Employees table we created earlier:
Example SQL Query with a WHERE Clause:
Let's say you want to retrieve the employees whose salaries are
greater than $60,000:
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Salary > 60000.00;
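The greater-than comparison is only one of the operators a WHERE clause accepts. The sketch below tries a few others against the same Employees table and sample rows from the SELECT AND FROM section; the result comments assume exactly those four rows.

```sql
-- Greater than or equal: returns Jane (60000.00) and Robert (75000.00)
SELECT FirstName, Salary
FROM Employees
WHERE Salary >= 60000.00;

-- Not equal: returns everyone except John, who is in HR
SELECT FirstName, Department
FROM Employees
WHERE Department <> 'HR';

-- Equality on text: returns Jane, the only Finance employee
SELECT FirstName, Department
FROM Employees
WHERE Department = 'Finance';
```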
AND and OR OPERATORS
The AND and OR operators are used in SQL to combine multiple
conditions in the WHERE clause of a query. They allow you to create
more complex conditions for filtering data.
1. AND Operator: The AND operator is used to retrieve rows that
meet multiple conditions. It requires that all specified conditions be
true for a row to be included in the result.
Example using the AND operator:
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Department = 'Engineering' AND Salary > 70000.00;
This query retrieves employees in the "Engineering" department with
a salary greater than $70,000. Both conditions (Department is
'Engineering' and Salary > 70000.00) must be true for a row to be
included in the result.

2. OR Operator: The OR operator is used to retrieve rows that meet at least one of the specified conditions. It allows for greater flexibility in query conditions.
Example using the OR operator:
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE Department = 'Engineering' OR Salary > 70000.00;
This query retrieves employees in the "Engineering" department or
those with a salary greater than $70,000. A row is included in the
result if either condition is true.
You can also combine AND and OR operators to create more
complex conditions. Here's an example:
SELECT EmployeeID, FirstName, LastName, Salary
FROM Employees
WHERE (Department = 'Engineering' AND Salary > 70000.00) OR
(Department = 'Finance' AND Salary > 65000.00);
In this query, employees in the "Engineering" department with a
salary greater than $70,000 or employees in the "Finance"
department with a salary greater than $65,000 will be selected.
The AND and OR operators provide powerful tools for filtering data
based on multiple conditions, allowing you to retrieve precisely the
data you need from your database.
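When AND and OR are mixed without parentheses, AND is evaluated first, which can silently change what a query means. The sketch below shows the difference using the Employees sample rows from the SELECT AND FROM section; the result comments assume exactly those four rows.

```sql
-- Without parentheses this is read as:
--   Department = 'HR' OR (Department = 'Finance' AND Salary > 55000.00)
-- Returns John (HR, 50000.00) and Jane (Finance, 60000.00)
SELECT FirstName, Department, Salary
FROM Employees
WHERE Department = 'HR' OR Department = 'Finance' AND Salary > 55000.00;

-- Parentheses force the OR to be evaluated first
-- Returns only Jane, because John's salary is not above 55000.00
SELECT FirstName, Department, Salary
FROM Employees
WHERE (Department = 'HR' OR Department = 'Finance') AND Salary > 55000.00;
```

Writing the parentheses explicitly, as the combined example above already does, avoids having to remember the precedence rule.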
IN , BETWEEN and LIKE
The IN, BETWEEN, and LIKE operators are used in the WHERE clause to filter and query data based on various criteria. Let's look at each of them:
1. IN Operator: The IN operator allows you to specify a list of values and retrieve rows where a specified column's value matches any value in the list. It's particularly useful for filtering data when you have multiple values to check.
Example using the IN operator:
SELECT FirstName, LastName
FROM Employees
WHERE Department IN ('Engineering', 'Marketing', 'Sales');
This query retrieves employees who belong to the 'Engineering', 'Marketing', or 'Sales' departments.
2. BETWEEN Operator: The BETWEEN operator is used to filter rows based on a range of values for a particular column. It checks if a column's value is within the specified range, including both endpoints.
Example using the BETWEEN operator:
SELECT ProductName, Price
FROM Products
WHERE Price BETWEEN 50.00 AND 100.00;
This query retrieves products with prices from $50.00 to $100.00, inclusive.
3. LIKE Operator: The LIKE operator is used to perform pattern matching on text data. It allows you to search for values that match a specific pattern or contain a specific substring. You can use the wildcard characters % (matches any number of characters, including none) and _ (matches exactly one character) with the LIKE operator.
Example using the LIKE operator:
SELECT ProductName
FROM Products
WHERE ProductName LIKE 'Desk%';
This query retrieves product names that start with 'Desk'. The % wildcard matches any number of characters that follow 'Desk'.
You can use these operators individually or in combination to create complex filtering conditions and retrieve specific data from your database. They provide flexibility in querying data, whether you need to filter based on specific values, ranges, or text patterns.
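To see BETWEEN and LIKE return something, the queries need rows to work on. The sketch below sets up a small table with made-up products; the table name DemoProducts and every row in it are hypothetical, and the result comments assume exactly these four rows.

```sql
-- Hypothetical product table with sample rows
CREATE TABLE DemoProducts (
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(50),
    Price DECIMAL(10, 2)
);
INSERT INTO DemoProducts (ProductID, ProductName, Price)
VALUES
(1, 'Desk Lamp', 35.00),
(2, 'Desk Chair', 89.99),
(3, 'Standing Desk', 249.00),
(4, 'Bookshelf', 100.00);

-- Returns Desk Chair and Bookshelf; 100.00 qualifies because
-- BETWEEN includes both endpoints
SELECT ProductName, Price
FROM DemoProducts
WHERE Price BETWEEN 50.00 AND 100.00;

-- Returns Desk Lamp and Desk Chair; 'Standing Desk' does not START with 'Desk'
SELECT ProductName
FROM DemoProducts
WHERE ProductName LIKE 'Desk%';

-- Returns all three desk products, since 'Desk' may appear anywhere
SELECT ProductName
FROM DemoProducts
WHERE ProductName LIKE '%Desk%';
```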

SOME LIKE PATTERNS


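SELECT * FROM table_name WHERE column_name LIKE 'XXXX%';   -- starts with XXXX
SELECT * FROM table_name WHERE column_name LIKE '%XXXX%';  -- contains XXXX anywhere
SELECT * FROM table_name WHERE column_name LIKE 'XXXX_';   -- XXXX followed by exactly one character
SELECT * FROM table_name WHERE column_name LIKE '_XXXX';   -- exactly one character, then XXXX
SELECT * FROM table_name WHERE column_name LIKE '_XXXX_';  -- XXXX with exactly one character on each side
Practice the patterns above on your laptop. Seeing the results for yourself will also build your confidence.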

ORDER BY and GROUP BY


The ORDER BY and GROUP BY clauses are two important SQL
clauses used to sort and group data in the result set of a SQL query.
Let's explore each of them:
1. ORDER BY Clause: The ORDER BY clause is used to sort the
result set of a query based on one or more columns in either
ascending (ASC) or descending (DESC) order.
Syntax of the ORDER BY clause:
SELECT column1, column2, ...
FROM table_name
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;

column1, column2, ...: The columns to be selected in the result set.
table_name: The name of the table from which you are querying data.
ORDER BY: The clause that specifies the column(s) by which the result set should be sorted.
column1 [ASC|DESC], column2 [ASC|DESC], ...: The column(s) by which to sort the data, along with the sorting direction (ASC for ascending, DESC for descending).
Example of using ORDER BY:
SELECT FirstName, LastName, Salary
FROM Employees
ORDER BY Salary DESC;
This query retrieves employee names and salaries from the
Employees table and sorts the result in descending order of salary,
showing the highest-paid employees first.
2. GROUP BY Clause: The GROUP BY clause is used to group
rows with the same values in one or more columns into summary
rows. It is typically used with aggregate functions (such as SUM,
COUNT, AVG, etc.) to generate summary data for each group.
Syntax of the GROUP BY clause:
SELECT column1, column2, ..., aggregate_function(column)
FROM table_name
GROUP BY column1, column2, ...;

column1, column2, ...: The columns to be selected or used for grouping.
table_name: The name of the table from which you are querying data.
GROUP BY: The clause that specifies the column(s) by which the data should be grouped.
aggregate_function(column): An aggregate function applied to one or more columns within each group.
Example of using GROUP BY:
SELECT Department, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY Department;

This query groups employees by department and calculates the average salary for each department.
In summary, the ORDER BY clause is used for sorting the result set,
while the GROUP BY clause is used for grouping data and
summarizing it. Both clauses are essential for generating meaningful
and organized results from your database queries.
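The two clauses are often used together: GROUP BY builds the summary rows and ORDER BY then sorts them. The sketch below uses a made-up table, DemoStaff, with more than one person per department so the grouping has something to combine; the table and its rows are hypothetical.

```sql
-- Hypothetical table with several people per department
CREATE TABLE DemoStaff (
    StaffID INT PRIMARY KEY,
    Department VARCHAR(50),
    Salary DECIMAL(10, 2)
);
INSERT INTO DemoStaff (StaffID, Department, Salary)
VALUES
(1, 'Engineering', 75000.00),
(2, 'Engineering', 85000.00),
(3, 'Finance', 60000.00),
(4, 'Finance', 50000.00),
(5, 'HR', 50000.00);

-- One summary row per department, highest average salary first:
-- Engineering (2 staff, average 80000), Finance (2 staff, average 55000),
-- HR (1 staff, average 50000)
SELECT Department, COUNT(*) AS StaffCount, AVG(Salary) AS AvgSalary
FROM DemoStaff
GROUP BY Department
ORDER BY AvgSalary DESC;
```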
COUNT and HAVING
The COUNT function and the HAVING clause are used in SQL to
aggregate data and filter the results of a query based on aggregated
values. Let's explore each of them with an example:
1. COUNT Function: The COUNT function is used to count the
number of rows in a result set or the number of non-null values in a
specific column. It is often used with the GROUP BY clause to count
the occurrences of values within a group.
Example of using the COUNT function:
SELECT Department, COUNT(*) AS EmployeeCount
FROM Employees
GROUP BY Department;

In this example, the COUNT function is used to count the number of employees in each department. The result will show the department name and the count of employees in that department.
2. HAVING Clause: The HAVING clause is used to filter the results
of a query based on an aggregate condition, typically when using
aggregate functions like COUNT, SUM, or AVG. It allows you to filter
groups of data that meet a specified condition.
Example of using the HAVING clause:
SELECT Department, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY Department
HAVING AVG(Salary) > 60000;

In this example, we're calculating the average salary for each department using the AVG function and then using the HAVING clause to keep only the departments where the average salary is greater than $60,000.
So, the HAVING clause is applied after grouping data, and it filters
the grouped results based on aggregate conditions.
In summary, the COUNT function is used for aggregation,
specifically for counting rows or values, while the HAVING clause is
used for filtering grouped results based on aggregate conditions.
These SQL features are useful for summarizing and filtering data in
complex queries.
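A common point of confusion is the difference between WHERE and HAVING. The sketch below uses both in one query against the Employees table and sample rows from the SELECT AND FROM section: WHERE removes individual rows before grouping, and HAVING removes whole groups afterwards. The result comment assumes exactly those four rows.

```sql
-- Step 1 (WHERE): drop rows with Salary below 55000.00, which removes John
-- Step 2 (GROUP BY): group the remaining rows by department
-- Step 3 (HAVING): keep only groups whose average salary exceeds 60000
-- Result: Engineering, with an average of 75000.00
SELECT Department, AVG(Salary) AS AvgSalary
FROM Employees
WHERE Salary >= 55000.00
GROUP BY Department
HAVING AVG(Salary) > 60000;
```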

DESC and ASC


In SQL, "DESC" and "ASC" are used to specify the order in which the result set of a query should be sorted. They are used in conjunction with the ORDER BY clause.
"ASC" stands for "ascending," and it sorts the result set in ascending order (from smallest to largest values). It is the default when no direction is given.
"DESC" stands for "descending," and it sorts the result set in descending order (from largest to smallest values).
Here's an example using an "Employees" table: If you want to retrieve a list of employees ordered by their salary in descending order, you would use "DESC" as follows:
SELECT * FROM Employees
ORDER BY Salary DESC;

If you want to retrieve a list of employees ordered by their salary in ascending order, you would use "ASC" as follows:
SELECT * FROM Employees
ORDER BY Salary ASC;
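Each column in an ORDER BY list can carry its own direction. As a brief sketch, assuming the Employees table from the SELECT AND FROM section, the query below sorts departments alphabetically and, within each department, lists the highest salary first. The second sort key only has a visible effect when a department contains more than one employee.

```sql
-- Sort by Department A to Z, then by Salary from highest to lowest
-- within each department
SELECT FirstName, LastName, Department, Salary
FROM Employees
ORDER BY Department ASC, Salary DESC;
```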

ALTER TABLE
In SQL, you can use the ALTER TABLE statement to make
structural changes to an existing table, such as adding, dropping, or
modifying columns. Here are examples of each of these actions:
1. Adding a Column:
To add a new column to the "Employees" table, you use the ALTER
TABLE statement with the ADD clause. For instance, if you want to
add a "Department" column to the "Employees" table, you can use
the following SQL statement:
ALTER TABLE Employees
ADD Department VARCHAR(50);

This SQL statement adds a new column named "Department" with a


VARCHAR data type to the "Employees" table.
2. Dropping a Column:
If you wish to remove a column from the "Employees" table, you can
do so using the ALTER TABLE statement with the DROP clause.
For example, let's say you want to remove the "Phone" column:
ALTER TABLE Employees
DROP COLUMN Phone;
This SQL statement will delete the "Phone" column from the
"Employees" table.
3. Modifying a Column:
To modify an existing column in MySQL or Oracle, you can use the ALTER TABLE
statement with the MODIFY clause (PostgreSQL and SQL Server use
ALTER COLUMN instead). For example, if you want to
change the data type of the "Salary" column from INT to DECIMAL:
ALTER TABLE Employees
MODIFY Salary DECIMAL(10, 2);
This SQL statement modifies the "Salary" column's data type to
DECIMAL(10, 2).

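4. Renaming a Table:
To rename a table, you can use the ALTER TABLE
statement along with the RENAME TO clause (supported by MySQL,
PostgreSQL, and Oracle, among others; SQL Server uses the
sp_rename procedure instead). Here's how you
can rename the "Employees" table to "NewEmployees":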
ALTER TABLE Employees
RENAME TO NewEmployees;
Keep in mind that the specific syntax for altering tables can vary
between different database management systems (e.g., MySQL,
PostgreSQL, SQL Server), so it's essential to consult the
documentation for the database system you're using to ensure you
use the correct syntax and options. Additionally, be cautious when
altering tables, as it can affect the integrity and functionality of your
database, especially in a production environment. Always make sure
to have backups and test your alterations in a safe environment
before applying them to a live database.
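
As an illustration of how the syntax differs, the sketch below writes the column change from example 3 and the rename from example 4 for several systems. These are alternatives, not a script: run only the statement that matches your database, and check its documentation for the exact options it accepts.

```sql
-- Changing the data type of the Salary column

-- MySQL (Oracle uses a similar MODIFY clause)
ALTER TABLE Employees
MODIFY Salary DECIMAL(10, 2);

-- PostgreSQL
ALTER TABLE Employees
ALTER COLUMN Salary TYPE DECIMAL(10, 2);

-- SQL Server
ALTER TABLE Employees
ALTER COLUMN Salary DECIMAL(10, 2);

-- Renaming the table in SQL Server, which has no RENAME TO clause
EXEC sp_rename 'Employees', 'NewEmployees';
```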
UPDATE and DELETE
In SQL, the UPDATE and DELETE commands are used to modify
and remove data from a table, respectively. Let's use an example
table called employees with columns for employee_id and
employee_name.
UPDATE Command:
The UPDATE command is used to modify existing records in a table.
You can specify which rows to update based on certain conditions.
Here's the basic syntax:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
table_name: The name of the table you want to update.
SET: Specifies the columns you want to update and their
new values.
WHERE: Specifies a condition to determine which rows will be
updated. It is optional, but if omitted, every row in the
table is updated.
For example, if you want to change the name of an employee with
employee_id = 1 from "John" to "Jane," you can do this:
UPDATE employees
SET employee_name = 'Jane'
WHERE employee_id = 1;

DELETE Command:
The DELETE command is used to remove records from a table. You
can also use a WHERE clause to specify which rows to delete.
Here's the basic syntax:
DELETE FROM table_name
WHERE condition;

table_name: The name of the table you want to delete
records from.
WHERE: Specifies the condition to determine which
rows to delete. If omitted, it will delete all rows in the
table.
For example, if you want to delete the employee with employee_id
= 2 from the employees table, you can do this:
DELETE FROM employees
WHERE employee_id = 2;
These commands are fundamental for data manipulation in SQL.
The UPDATE command allows you to modify existing data, and the
DELETE command lets you remove data that is no longer needed.
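
Because a missing or wrong WHERE clause affects every row, one cautious habit is to run the condition as a SELECT first and only then reuse it in the UPDATE or DELETE. A minimal sketch, assuming the two-column employees table described above with a few made-up rows:

```sql
-- Sample rows are illustrative only.
CREATE TABLE employees (
    employee_id   INT PRIMARY KEY,
    employee_name VARCHAR(50)
);

INSERT INTO employees (employee_id, employee_name) VALUES (1, 'John');
INSERT INTO employees (employee_id, employee_name) VALUES (2, 'Priya');
INSERT INTO employees (employee_id, employee_name) VALUES (3, 'Omar');

-- Step 1: preview exactly which rows the condition matches.
SELECT employee_id, employee_name
FROM employees
WHERE employee_id = 2;

-- Step 2: reuse the same condition in the DELETE.
DELETE FROM employees
WHERE employee_id = 2;

-- Step 3: confirm the result; rows 1 and 3 remain.
SELECT employee_id, employee_name
FROM employees
ORDER BY employee_id;
```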
DATA TYPE
In SQL, a data type is an attribute that specifies the type of data that
a particular column or variable can hold. Data types are essential for
defining the structure of a table and for ensuring data integrity by
specifying the kind of data that can be stored in each column. Here
are some common SQL data types:

1. INTEGER Data Type:
INT, INTEGER, SMALLINT, BIGINT: Used
for whole numbers, without a
decimal point.
Example: age INT can store integer values
like 18, 42, -5.
2. NUMERIC/DECIMAL Data Type:
NUMERIC(precision, scale),
DECIMAL(precision, scale): Used for exact fixed-point
numbers, where precision is the total number of
digits and scale is the number of digits after the
decimal point.
Example: price DECIMAL(8, 2) can store
numbers like 12345.67 (up to 999999.99).
3. FLOATING-POINT Data Type:
FLOAT, REAL, DOUBLE PRECISION: Used
for approximate numeric values with a
floating-point representation.
Example: height REAL can store values like
5.8, 3.14159265.
4. CHARACTER Data Type:
CHAR(n): Used for fixed-length character
strings where n is the number of characters;
shorter values are padded with spaces.
Example: first_name CHAR(50) can store
strings like "John" and "Alice."
5. VARCHAR Data Type:
VARCHAR(n): Used for variable-length
character strings with a maximum length of n.
Example: last_name VARCHAR(100) can
store strings like "Smith" and "Johnson."
6. TEXT Data Type:
TEXT: Used for longer text strings where the
length may vary greatly.
Example: comments TEXT can store
paragraphs of text.
7. DATE and TIME Data Types:
DATE, TIME, TIMESTAMP: Used for storing
dates, times, or date-time combinations.
Example: birth_date DATE can store dates
like '1990-05-15'.
8. BOOLEAN Data Type:
BOOLEAN: Used for storing true/false
values.
Example: is_active BOOLEAN can store
values like TRUE or FALSE.
9. BINARY Data Types:
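BLOB, BYTEA, VARBINARY: Used for
storing binary data, such as images or files.
The name depends on the system: for example,
BLOB in MySQL and Oracle, BYTEA in PostgreSQL,
and VARBINARY in SQL Server.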
Example: profile_picture BLOB can store
binary image data.
10. ENUM Data Type:
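ENUM: Used for defining a list of allowable
values for a column. The inline form shown
here is MySQL syntax; PostgreSQL defines enum
types separately with CREATE TYPE.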
Example: gender ENUM('Male', 'Female',
'Other') can store one of these three values.

These are just a few common SQL data types. The specific data
types available may vary slightly between database systems, but the
concepts are largely consistent. Choosing the appropriate data type
for each column is crucial for efficient storage and data integrity in a
database.
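
To see several of these types working together, here is a minimal sketch of a table definition. The products table and all of its column names are hypothetical, and the statements are written with PostgreSQL and MySQL in mind; other systems may need different type names (SQL Server, for instance, stores date-time values in DATETIME2 rather than TIMESTAMP).

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE products (
    product_id   INT PRIMARY KEY,        -- whole numbers
    product_name VARCHAR(100) NOT NULL,  -- variable-length text, up to 100 characters
    country_code CHAR(2),                -- fixed-length text, always 2 characters
    price        DECIMAL(8, 2),          -- exact: 8 digits in total, 2 after the decimal point
    weight_kg    REAL,                   -- approximate floating-point value
    added_on     DATE,                   -- calendar date
    updated_at   TIMESTAMP               -- date and time
);

INSERT INTO products
    (product_id, product_name, country_code, price, weight_kg, added_on, updated_at)
VALUES
    (1, 'Desk Lamp', 'IN', 1299.50, 0.85, '2024-01-15', CURRENT_TIMESTAMP);

SELECT product_name, price, added_on
FROM products;
```

DECIMAL is chosen for price because money needs exact values, while REAL is acceptable for a measurement such as weight where small rounding differences do not matter.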

MODULUS
The modulus operator, denoted by %, is a mathematical
operation that returns the remainder when one number is divided by another. In
the context of programming and mathematics, the modulus operation is often used
with integers. Here's how it works:

If you divide an integer a by another integer b using the
modulus operator (a % b), it will return the remainder of
the division.
For example, 10 % 3 would return 1 because when you
divide 10 by 3, you get a quotient of 3 and a remainder
of 1.
The modulus operator is commonly used in programming for various
tasks such as determining if a number is even or odd, cycling
through a fixed range of values, and checking divisibility.

OPERATORS
ARITHMETIC OPERATORS
Arithmetic operators in SQL are used to perform mathematical
operations on numerical values in the database. These operators
allow you to perform addition, subtraction, multiplication, division,
and modulo operations. Here's a brief description of each arithmetic
operator:
Addition Operator (+): The addition operator is used to add two or
more numerical values together. It returns the sum of the values.
Example: SELECT 5 + 3 AS sum_result; -- Returns 8
Subtraction Operator (-): The subtraction operator is used to
subtract one numerical value from another. It returns the difference
between the values.
Example: SELECT 10 - 5 AS difference_result; -- Returns 5
Multiplication Operator (*): The multiplication operator is used to
multiply two or more numerical values together. It returns the product
of the values.
Example: SELECT 4 * 6 AS product_result; -- Returns 24
Division Operator (/): The division operator is used to divide one
numerical value by another. It returns the quotient of the division. Be
cautious with this operator: dividing by zero leads to an error (or NULL
in some systems), and dividing two integers discards the fractional part
in many systems.
Example: SELECT 20 / 4 AS division_result; -- Returns 5

Modulo Operator (%): The modulo operator is used to find the
remainder when one numerical value is divided by another. It returns
the remainder of the division.
Example: SELECT 15 % 4 AS modulo_result; -- Returns 3
(remainder of 15 / 4)
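
The behaviour of division with whole numbers is a common surprise, so here is a short sketch of it alongside the modulo operator. The statements need no table. They are written for systems that accept a SELECT without FROM and support the % operator (such as PostgreSQL, MySQL, SQL Server, and SQLite); Oracle, for example, uses the MOD function and requires FROM dual.

```sql
-- Two integer operands: PostgreSQL, SQL Server and SQLite return 3
-- (the fractional part is discarded); MySQL returns 3.5000.
SELECT 7 / 2 AS int_division;

-- Making one operand a decimal gives a fractional result on all of them.
SELECT 7 / 2.0 AS decimal_division;

-- The remainder of 7 divided by 2 is 1, so 7 is odd.
SELECT 7 % 2 AS remainder;
```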

COMPARISON OPERATORS
Comparison operators in SQL are used to compare values in a database table.
They return a Boolean (TRUE or FALSE) result based on the comparison, or
UNKNOWN when one of the values is NULL. These
operators are essential for filtering and querying data based on specific conditions.
Now, let's describe each of the comparison operators in more detail:
Equal to (=): The equal-to operator checks if two values are equal. It
returns TRUE if they are, and FALSE if they are not.
Example: SELECT * FROM employees WHERE department_id = 5;
Not equal to (!= or <>): The not-equal-to operator checks if two values are not
equal. It returns TRUE if they are not equal, and FALSE if they are equal.
Example: SELECT * FROM products WHERE category_id != 2;
-- OR
SELECT * FROM products WHERE category_id <> 2;
Less than (<): The less-than operator checks if one value is less than another. It
returns TRUE if the condition is met and FALSE otherwise.
Example:
SELECT * FROM orders WHERE order_total < 100;
Greater than (>): The greater-than operator checks if one value is greater than
another. It returns TRUE if the condition is met and FALSE otherwise.
Example:
SELECT * FROM products WHERE price > 50;
Less than or equal to (<=): The less-than-or-equal-to operator checks if one
value is less than or equal to another. It returns TRUE if the condition is met and
FALSE otherwise.
Example:
SELECT * FROM students WHERE score <= 75;
Greater than or equal to (>=): The greater-than-or-equal-to operator checks if
one value is greater than or equal to another. It returns TRUE if the condition is
met and FALSE otherwise.
Example:
SELECT * FROM employees WHERE salary >= 50000;
Not Less Than (!<): The not-less-than operator checks if one value is not less
than another, which is the same as >=. It is specific to SQL Server and is not
part of standard SQL.
Example: SELECT * FROM products WHERE stock_quantity !< 10;
Not Greater Than (!>): The not-greater-than operator checks if one value is not
greater than another, which is the same as <=. It is also specific to SQL Server.
Example:
SELECT * FROM customers WHERE age !> 30;

LOGICAL OPERATORS
Logical operators in SQL are used to combine or modify conditions in SQL
statements. They allow you to create complex conditions for filtering and selecting
data. The commonly used logical operators are:

1. AND: The "AND" operator is used to combine two or more
conditions. It returns true if all the conditions are true.
2. OR: The "OR" operator is used to combine two or more conditions.
It returns true if at least one of the conditions is true.
3. NOT: The "NOT" operator is used to negate a condition. It returns
true if the condition is false, and false if the condition is true.

Since the AND and OR operators were discussed earlier, they are not repeated here.
Now, let's describe the other operators in the context of the "employees" table:

ALL Operator: The "ALL" operator is used to compare a value to all values in a
subquery. It returns true if the condition is true for all rows in the subquery. For
example, you might use it to find employees with salaries greater than all salaries
in a particular department.
SELECT employee_name
FROM employees
WHERE salary > ALL (SELECT salary FROM employees WHERE department =
'Sales');
ANY Operator: The "ANY" operator is used to compare a value to any values in a
subquery. It returns true if the condition is true for at least one row in the subquery.
For instance, you could find employees with salaries greater than any salary in the
IT department.
SELECT employee_name
FROM employees
WHERE salary > ANY (SELECT salary FROM employees WHERE department =
'IT');
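
The difference between ALL and ANY is easiest to see when both are run against the same subquery. The sketch below assumes a small employees table with made-up rows; it should run on systems that support these operators, such as PostgreSQL, MySQL, and SQL Server (SQLite does not).

```sql
-- Sample rows are illustrative only.
CREATE TABLE employees (
    employee_name VARCHAR(50),
    department    VARCHAR(50),
    salary        INT
);

INSERT INTO employees (employee_name, department, salary) VALUES ('Asha', 'Sales', 40000);
INSERT INTO employees (employee_name, department, salary) VALUES ('Ben',  'Sales', 50000);
INSERT INTO employees (employee_name, department, salary) VALUES ('Chen', 'IT',    45000);
INSERT INTO employees (employee_name, department, salary) VALUES ('Dev',  'IT',    60000);

-- Greater than ALL Sales salaries (40000 and 50000): the salary
-- must exceed the highest one, so only Dev qualifies.
SELECT employee_name
FROM employees
WHERE salary > ALL (SELECT salary FROM employees WHERE department = 'Sales');

-- Greater than ANY Sales salary: the salary only has to exceed the
-- lowest one, so Ben, Chen and Dev qualify.
SELECT employee_name
FROM employees
WHERE salary > ANY (SELECT salary FROM employees WHERE department = 'Sales');
```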

BETWEEN Operator: The "BETWEEN" operator is used to specify a range for a
condition. It's useful for selecting rows with values within a specified range, such
as finding employees with salaries between a minimum and maximum value.
Both end values are included in the range.
SELECT employee_name
FROM employees
WHERE salary BETWEEN 40000 AND 60000;

EXISTS Operator: The "EXISTS" operator is used to check for the existence of
rows in a subquery. It returns true if the subquery returns one or more rows.
SELECT department_name
FROM departments
WHERE EXISTS (SELECT 1 FROM employees WHERE employees.department_id =
departments.department_id);

IN Operator: The "IN" operator is used to compare a value to a list of values. It
returns true if the value matches any value in the list. For example, you can find
employees in the Marketing department or the Sales department.
SELECT employee_name
FROM employees
WHERE department IN ('Marketing', 'Sales');

LIKE Operator: The "LIKE" operator is used for pattern matching using wildcard
characters: % matches any sequence of characters and _ matches exactly one
character. It's often used to search for rows with values that match a specific
pattern. For instance, you could find employees whose names start with "J."
SELECT employee_name
FROM employees
WHERE employee_name LIKE 'J%';

NOT Operator: The "NOT" operator negates a condition. It returns true if the
condition is false. You can use it to find employees who are not in the IT
department.
SELECT employee_name
FROM employees
WHERE NOT department = 'IT';

IS NULL Operator: The "IS NULL" operator is used to check for NULL values in a
column. It returns true if a column contains a NULL value. For example, you can
find employees with no assigned manager.
SELECT employee_name
FROM employees
WHERE manager_id IS NULL;

UNIQUE Operator: "UNIQUE" is most often met as a constraint (a unique
constraint or primary key) that enforces uniqueness of values in a column, ensuring
that no two rows have the same value in that column. In a query, DISTINCT is the
keyword that removes duplicate values from the result.
-- Assuming the "employee_id" column is unique (usually a primary key),
-- DISTINCT returns the same rows as a plain SELECT would
SELECT DISTINCT employee_id
FROM employees;
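
NOT is also commonly combined with the other operators in this list. The sketch below assumes a small employees table with made-up rows; the column names follow the earlier examples.

```sql
-- Sample rows are illustrative only.
CREATE TABLE employees (
    employee_name VARCHAR(50),
    department    VARCHAR(50),
    salary        INT,
    manager_id    INT
);

INSERT INTO employees (employee_name, department, salary, manager_id)
VALUES ('John', 'IT', 55000, NULL);
INSERT INTO employees (employee_name, department, salary, manager_id)
VALUES ('Sarah', 'Sales', 45000, 1);
INSERT INTO employees (employee_name, department, salary, manager_id)
VALUES ('Jaya', 'Marketing', 70000, 1);

-- Not in the listed departments: returns John.
SELECT employee_name FROM employees
WHERE department NOT IN ('Marketing', 'Sales');

-- Salary outside the range 40000 to 60000: returns Jaya.
SELECT employee_name FROM employees
WHERE salary NOT BETWEEN 40000 AND 60000;

-- Name does not start with J: returns Sarah.
SELECT employee_name FROM employees
WHERE employee_name NOT LIKE 'J%';

-- Has a manager assigned: returns Sarah and Jaya.
SELECT employee_name FROM employees
WHERE manager_id IS NOT NULL;
```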

TOP, DISTINCT and CURRENT TIMESTAMP


TOP and DISTINCT are SQL keywords used to retrieve and
manipulate data in a database. They are often used in the SELECT
statement to control the number of rows and filter out duplicate
values. However, it's important to note that their usage and syntax
may vary depending on the specific SQL database system you are
using (e.g., SQL Server, MySQL, PostgreSQL).
TOP:

TOP is typically used in Microsoft SQL Server
and Sybase databases. In other database
systems like MySQL and PostgreSQL, you
use LIMIT instead.
TOP allows you to specify the maximum number
of rows to return from a query result. It's often
used when you want to retrieve a limited number
of rows from a table.
You can combine TOP with the ORDER BY
clause to control which rows are returned,
ordering the results in a specific way.
Example (SQL Server): SELECT TOP 5 * FROM
customers;
This query retrieves 5 rows from the
"customers" table. Without an ORDER BY clause,
which 5 rows are returned is not guaranteed.
Example (MySQL, PostgreSQL): SELECT * FROM
customers LIMIT 5;
This query retrieves 5 rows from the
"customers" table in MySQL or PostgreSQL.

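Combining the row limit with ORDER BY is what makes the result predictable. The sketch below shows one way to fetch the five highest-paid employees in three dialects; it assumes an employees table with employee_name and salary columns, as in the earlier examples. These are alternatives, so run only the form your system supports.

```sql
-- SQL Server
SELECT TOP 5 employee_name, salary
FROM employees
ORDER BY salary DESC;

-- MySQL, PostgreSQL, SQLite
SELECT employee_name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;

-- Standard SQL form (supported by PostgreSQL and Oracle 12c or later, among others)
SELECT employee_name, salary
FROM employees
ORDER BY salary DESC
FETCH FIRST 5 ROWS ONLY;
```
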
DISTINCT:
DISTINCT is used to filter out duplicate values in the
result set. It ensures that the query returns only unique
rows, removing any duplicates.
You can use DISTINCT with one or more columns to
specify the uniqueness constraint. The query will then
return distinct combinations of values in those columns.

Example:
SELECT DISTINCT department FROM employees;

This query retrieves a list of distinct department names
from the "employees" table, eliminating duplicates.
You can also use DISTINCT with multiple columns to
ensure uniqueness across those columns.
Example:
SELECT DISTINCT first_name, last_name FROM
employees;

This query retrieves unique combinations of first
names and last names from the "employees"
table.
It's important to note that using DISTINCT can
impact query performance, especially on large
datasets, as the database must compare rows to
identify duplicates.
Both TOP and DISTINCT are valuable SQL tools that
help you control the result set to meet your specific
needs, whether it's limiting the number of rows or
ensuring uniqueness in your query results. The exact
syntax and availability of these keywords may vary
depending on the database system you are using.
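
DISTINCT can also be used inside an aggregate function to count the different values rather than list them. A minimal sketch, assuming an employees table with a few made-up rows:

```sql
-- Sample rows are illustrative only.
CREATE TABLE employees (
    employee_name VARCHAR(50),
    department    VARCHAR(50)
);

INSERT INTO employees (employee_name, department) VALUES ('John', 'HR');
INSERT INTO employees (employee_name, department) VALUES ('Sarah', 'Sales');
INSERT INTO employees (employee_name, department) VALUES ('Michael', 'HR');

-- One row per different department: HR and Sales.
SELECT DISTINCT department
FROM employees;

-- The number of different departments: 2.
SELECT COUNT(DISTINCT department) AS department_count
FROM employees;
```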
CURRENT_TIMESTAMP
CURRENT_TIMESTAMP is a standard SQL function
that returns the current date and time, typically in a
timestamp format. It's often used in SQL queries and
database operations to record or reference the current
date and time.
Here's how you can use CURRENT_TIMESTAMP:
Timestamp Value: CURRENT_TIMESTAMP returns a
timestamp value representing the current date and time
according to the system clock of the database server.
Example:
SELECT CURRENT_TIMESTAMP;

Recording Timestamps: You can use CURRENT_TIMESTAMP
when inserting or updating records in a table to record when a
specific action occurred. For example, you might use it to timestamp
when a new order was placed or when a record was last updated.
Example: INSERT INTO orders (order_id, order_date)
VALUES (1, CURRENT_TIMESTAMP);
This would record the current date and time in the order_date
column.

Comparing Timestamps: You can use
CURRENT_TIMESTAMP to compare with timestamp values
in your database. For instance, you can find all orders
placed today.
Example: SELECT * FROM orders WHERE order_date
>= CURRENT_DATE;
This would return all orders placed today or later.
Calculations: You can perform calculations with
CURRENT_TIMESTAMP. For example, you can find
out how much time has passed since a particular
timestamp by subtracting it from
CURRENT_TIMESTAMP.
Example:
SELECT (CURRENT_TIMESTAMP - order_date) AS
time_passed FROM orders WHERE order_id = 1;
In PostgreSQL this returns an interval (days, hours,
minutes, and so on) since the order was placed. Other
systems handle date arithmetic differently and provide
functions such as TIMESTAMPDIFF (MySQL) or DATEDIFF
(SQL Server) instead.
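
To get the elapsed time as a number of hours, each system needs its own expression. The sketch below shows one way to write it in three dialects, assuming the orders table from the examples above with a timestamp column named order_date. Run only the statement that matches your database.

```sql
-- PostgreSQL: subtracting timestamps gives an interval;
-- convert it to seconds and divide by 3600 to get hours.
SELECT EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - order_date)) / 3600 AS hours_passed
FROM orders
WHERE order_id = 1;

-- MySQL: TIMESTAMPDIFF returns the difference in whole hours.
SELECT TIMESTAMPDIFF(HOUR, order_date, CURRENT_TIMESTAMP) AS hours_passed
FROM orders
WHERE order_id = 1;

-- SQL Server: DATEDIFF counts the hour boundaries crossed.
SELECT DATEDIFF(HOUR, order_date, CURRENT_TIMESTAMP) AS hours_passed
FROM orders
WHERE order_id = 1;
```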

ALIAS
In SQL, an alias is a temporary name or shorthand for a table or
column that you can use in a query to make your SQL statements
more readable or to avoid naming conflicts. Aliases are often used to
simplify complex queries, especially when working with self-joins,
subqueries, or when column names are long or not user-friendly.
There are two primary types of aliases in SQL:

1. Table Alias (Table Name Alias):
A table alias is used to provide a shorter or
more meaningful name for a table in a SQL
query. This can make your SQL statements
more concise and easier to understand.
Table aliases are particularly useful when
you're joining multiple tables, especially if
those tables have long names.
Example:
SELECT e.employee_id, e.first_name,
d.department_name
FROM employees e
JOIN departments d ON e.department_id =
d.department_id;

In this query, "e" and "d" are table aliases for the
"employees" and "departments" tables,
respectively.

2. Column Alias (AS Clause):
A column alias is used to provide a temporary
name for a column in the result set. It's often
used when you want to change the name of a
column in the query result, create calculated
columns, or clarify the purpose of a column.
You specify a column alias using the AS
clause, followed by the
desired alias name.
Example: SELECT employee_id AS
"Employee ID", first_name AS "First Name"
FROM employees;

In this query, "Employee ID" and "First Name" are
column aliases for the "employee_id" and "first_name"
columns, respectively.
Column aliases can also be useful for renaming the
result of a calculation or expression in the query.
Example:
SELECT salary * 12 AS annual_salary
FROM employees;

In this query, "annual_salary" is a column alias
for the result of multiplying the "salary" column
by 12.

Aliases improve the readability and clarity of SQL queries and are
especially helpful when working with complex queries or large
databases. They don't affect the underlying data in the tables; they
are used solely for presentation in the query result.
JOINS
This is a very important topic, so read it carefully. Before
going to the theory portion, look at this image carefully.

Let's assume we have two tables: Employees and Departments.
Together they contain information about employees and their
departments.

Table : Employees
EmployeeID   EmployeeName   DepartmentID
1            John           101
2            Sarah          102
3            Michael        101
4            Emily          103
5            Chris          104

Table : Departments
DepartmentID   DepartmentName
101            HR
102            Sales
103            IT
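
To try the join queries in this section yourself, the two sample tables can be created with a script along the following lines. The column types are an assumption, since the tables above only show the values. No foreign key is declared, so that Chris can keep DepartmentID 104, which has no matching department.

```sql
CREATE TABLE Departments (
    DepartmentID   INT PRIMARY KEY,
    DepartmentName VARCHAR(50)
);

CREATE TABLE Employees (
    EmployeeID   INT PRIMARY KEY,
    EmployeeName VARCHAR(50),
    DepartmentID INT  -- deliberately not a foreign key; 104 has no department row
);

INSERT INTO Departments (DepartmentID, DepartmentName) VALUES (101, 'HR');
INSERT INTO Departments (DepartmentID, DepartmentName) VALUES (102, 'Sales');
INSERT INTO Departments (DepartmentID, DepartmentName) VALUES (103, 'IT');

INSERT INTO Employees (EmployeeID, EmployeeName, DepartmentID) VALUES (1, 'John', 101);
INSERT INTO Employees (EmployeeID, EmployeeName, DepartmentID) VALUES (2, 'Sarah', 102);
INSERT INTO Employees (EmployeeID, EmployeeName, DepartmentID) VALUES (3, 'Michael', 101);
INSERT INTO Employees (EmployeeID, EmployeeName, DepartmentID) VALUES (4, 'Emily', 103);
INSERT INTO Employees (EmployeeID, EmployeeName, DepartmentID) VALUES (5, 'Chris', 104);
```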

INNER JOIN: An INNER JOIN returns only the rows that have
matching values in both tables. In this case, it will return employees
who belong to a department.
SELECT EmployeeName, DepartmentName
FROM Employees
INNER JOIN Departments ON Employees.DepartmentID =
Departments.DepartmentID;
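Run this query on your own system to see the results. With the sample
data above, it returns John (HR), Sarah (Sales), Michael (HR), and
Emily (IT); Chris is left out because DepartmentID 104 has no matching
row in Departments.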

LEFT JOIN (LEFT OUTER JOIN): A LEFT JOIN returns all rows
from the left table (Employees) and the matching rows from the right
table (Departments). If there's no match in the right table, it still
includes the left table data with NULL values for the right table.
SELECT EmployeeName, DepartmentName
FROM Employees
LEFT JOIN Departments ON Employees.DepartmentID =
Departments.DepartmentID;

RIGHT JOIN (RIGHT OUTER JOIN): A RIGHT JOIN returns all rows
from the right table (Departments) and the matching rows from the
left table (Employees). If there's no match in the left table, it still
includes the right table data with NULL values for the left table.
SELECT EmployeeName, DepartmentName
FROM Employees
RIGHT JOIN Departments ON Employees.DepartmentID =
Departments.DepartmentID;

FULL JOIN (FULL OUTER JOIN): A FULL JOIN returns all rows
from both tables, matched where possible. If a row has no
match in the other table, it includes NULL values for that side.
SELECT EmployeeName, DepartmentName
FROM Employees
FULL JOIN Departments ON Employees.DepartmentID =
Departments.DepartmentID;
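
MySQL does not support FULL JOIN. A common workaround, sketched below for the same sample tables, is to combine a LEFT JOIN and a RIGHT JOIN with UNION. One caveat: UNION removes duplicate rows, so if the real result could contain identical rows, they would be merged into one.

```sql
-- Emulating FULL JOIN where it is not available (for example, MySQL).
SELECT e.EmployeeName, d.DepartmentName
FROM Employees e
LEFT JOIN Departments d ON e.DepartmentID = d.DepartmentID
UNION
SELECT e.EmployeeName, d.DepartmentName
FROM Employees e
RIGHT JOIN Departments d ON e.DepartmentID = d.DepartmentID;
```

With the sample data this returns five rows: the four matched employees, plus Chris with a NULL department name.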

Self Join (Finding Employees and Their Managers): To perform a
self join to find employees and their respective managers, you would
use the Employees table, which has an EmployeeID and a
ManagerID column.
The sample table above has no ManagerID column, so add one yourself
before running this query.
SELECT e.EmployeeName AS Employee, m.EmployeeName AS
Manager
FROM Employees e
LEFT JOIN Employees m ON e.ManagerID = m.EmployeeID;
This query joins the Employees table with itself, connecting the
ManagerID of each employee to the EmployeeID of their manager.
It retrieves the names of employees and their respective managers.
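
One way to prepare the sample Employees table for this query is sketched below. The reporting lines are made up for illustration: John manages Sarah and Michael, and Sarah manages Emily and Chris.

```sql
-- Add the column used by the self join.
ALTER TABLE Employees
ADD ManagerID INT;

-- Hypothetical reporting lines; John (EmployeeID 1) has no manager,
-- so his ManagerID stays NULL.
UPDATE Employees SET ManagerID = 1 WHERE EmployeeID IN (2, 3);
UPDATE Employees SET ManagerID = 2 WHERE EmployeeID IN (4, 5);
```

With these values, the self join above pairs Sarah and Michael with John, and Emily and Chris with Sarah. Because it is a LEFT JOIN, John still appears, with NULL as his manager.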

Cross Join (Combining Employees and Departments): To perform a
cross join, you can combine all employees with all departments,
creating all possible combinations.
SELECT EmployeeName, DepartmentName
FROM Employees
CROSS JOIN Departments;
This query generates a result set that includes all employees
combined with all departments. With the sample tables above,
that is 5 x 3 = 15 rows.

UNION, UNION ALL, INTERSECT, EXCEPT


UNION:

UNION combines the result sets of two or more SELECT
statements into a single result set.
It removes duplicate rows from the combined result, so
each row appears only once in the final result.
The number of columns in each SELECT statement must be
the same, and their data types must be compatible.
The order of the result is not guaranteed; add an ORDER BY
clause at the end if you need sorted output.
Example:
SELECT employee_name FROM employees
UNION
SELECT customer_name FROM customers;
This query combines the names of employees and customers into a
single list, removing duplicates.

UNION ALL:

UNION ALL is similar to UNION, but it doesn't remove
duplicate rows. It combines all rows from the SELECT
statements into the result set, including duplicates.
It is faster than UNION because it doesn't involve the
extra step of removing duplicates.
The number and data types of columns in each SELECT
statement must be the same.
Example:
SELECT employee_name FROM employees
UNION ALL
SELECT customer_name FROM customers;
This query combines the names of employees and customers,
including duplicates.

INTERSECT:

INTERSECT returns only the rows that appear in both of
the result sets from two SELECT statements.
The number and data types of columns in each SELECT
statement must be the same.
Example:
SELECT product_name FROM product_catalog
INTERSECT
SELECT product_name FROM online_sales;
This query returns a list of product names that are both in the
product catalog and have been sold online.
EXCEPT:

EXCEPT returns the rows that are present in the first
result set but not in the second result set from two
SELECT statements.
The number and data types of columns in each SELECT
statement must be the same.
Example:
SELECT product_name FROM product_catalog
EXCEPT
SELECT product_name FROM in_store_sales;
This query returns a list of product names that are in the product
catalog but haven't been sold in-store.
Set operators like UNION, UNION ALL, INTERSECT, and EXCEPT
are useful for combining and comparing data from multiple tables or
result sets, allowing you to perform more complex queries and
analysis on your data.
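
The four operators are easiest to compare on the same small data set. The sketch below assumes two single-column tables with made-up names, where Sarah appears in both. INTERSECT and EXCEPT are not available everywhere: older MySQL versions lack them, and Oracle has traditionally used MINUS in place of EXCEPT, so check your system's documentation.

```sql
-- Sample rows are illustrative only.
CREATE TABLE employees (employee_name VARCHAR(50));
CREATE TABLE customers (customer_name VARCHAR(50));

INSERT INTO employees (employee_name) VALUES ('Sarah');
INSERT INTO employees (employee_name) VALUES ('John');
INSERT INTO customers (customer_name) VALUES ('Sarah');
INSERT INTO customers (customer_name) VALUES ('Emily');

-- UNION: 3 rows (Emily, John, Sarah); the duplicate Sarah is removed.
-- The ORDER BY applies to the combined result.
SELECT employee_name AS person_name FROM employees
UNION
SELECT customer_name FROM customers
ORDER BY person_name;

-- UNION ALL: 4 rows; Sarah appears twice.
SELECT employee_name AS person_name FROM employees
UNION ALL
SELECT customer_name FROM customers
ORDER BY person_name;

-- INTERSECT: 1 row (Sarah), the only name in both tables.
SELECT employee_name FROM employees
INTERSECT
SELECT customer_name FROM customers;

-- EXCEPT: 1 row (John), an employee who is not a customer.
SELECT employee_name FROM employees
EXCEPT
SELECT customer_name FROM customers;
```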

TRUNCATE
The SQL TRUNCATE TABLE command is used to delete all
data from an existing table. You can also use the DROP TABLE
command to delete the whole table, but that removes the
table structure from the database as well, and you would need to
re-create the table if you wish to store data in it again.
Syntax: The basic syntax for the TRUNCATE statement is:
TRUNCATE TABLE table_name;

Key Points:
TRUNCATE is used for removing all
rows from a table. Depending on the database
system, it may also be usable on other objects
such as temporary tables.
Unlike the DELETE statement, TRUNCATE
does not generate individual row delete
operations. It simply deallocates the data
pages associated with the table, effectively
dropping all rows in a single operation.
Due to its efficiency, TRUNCATE is usually
faster than DELETE for removing all rows
from a table.
TRUNCATE is typically used for removing all
data from a table without affecting the table's
structure (columns, constraints, indexes,
etc.). After truncating a table, it's still possible
to insert data into it.
Because TRUNCATE is a minimally logged
operation, it generates fewer log entries and
uses fewer system resources, making it a
good choice for quickly cleaning out a table,
especially in high-performance or batch
processing scenarios.
Considerations:
When you use TRUNCATE, be aware that it's an
all-or-nothing operation. You can't use a WHERE
clause to selectively remove specific rows; it
removes all rows.
Be cautious when using TRUNCATE because in
some systems (such as MySQL and Oracle) it is
committed implicitly and can't be rolled back. In
others (such as PostgreSQL and SQL Server) it can
be rolled back only while the enclosing transaction
is still open.
Tables with foreign key constraints may not be
truncated unless the database system supports
cascading constraints and you specify the
CASCADE option.
TRUNCATE may reset identity columns or auto-
increment values, depending on the database
system.
Ensure you have the necessary privileges to
execute the TRUNCATE statement on the table.
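
The difference between DELETE, TRUNCATE, and DROP can be seen by running them one after another on a throwaway table. The audit_log table below is hypothetical. The sketch applies to systems that support TRUNCATE TABLE, such as MySQL, PostgreSQL, SQL Server, and Oracle (SQLite does not).

```sql
-- Hypothetical scratch table; names and rows are illustrative only.
CREATE TABLE audit_log (
    log_id  INT PRIMARY KEY,
    message VARCHAR(100)
);

INSERT INTO audit_log (log_id, message) VALUES (1, 'started');
INSERT INTO audit_log (log_id, message) VALUES (2, 'finished');

-- DELETE can target specific rows with WHERE.
DELETE FROM audit_log
WHERE log_id = 1;

-- TRUNCATE removes every remaining row but keeps the table itself.
TRUNCATE TABLE audit_log;

-- The table still exists and is empty, so this returns 0.
SELECT COUNT(*) AS remaining_rows
FROM audit_log;

-- DROP removes the table structure as well; any later query
-- against audit_log would fail until it is created again.
DROP TABLE audit_log;
```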
TRANSACTION
A transaction in the context of a database management system
(DBMS) is a sequence of one or more SQL statements that are
executed as a single unit of work. Transactions ensure the
consistency, integrity, and isolation of the database by allowing
multiple operations to be treated as a single, atomic entity. Here are
key concepts related to transactions:
Properties of a Transaction:

1. ACID Properties:
Atomicity: A transaction is atomic, meaning
it is an all-or-nothing operation. Either all its
changes are applied, or none are. If any part
of the transaction fails, the entire transaction
is rolled back.
Consistency: A transaction ensures that the
database transitions from one consistent
state to another. It preserves the integrity of
the data by enforcing constraints and rules.
Isolation: Transactions are executed in
isolation from one another, meaning the
changes made by one transaction are not
visible to other transactions until the first
transaction is committed.
Durability: Once a transaction is committed,
its changes are permanent and survive
system failures. Even if the system crashes,
the changes are not lost.
2. Beginning and Ending a Transaction:
A transaction begins with the BEGIN
TRANSACTION, START TRANSACTION, or
similar statement.
It ends with a COMMIT to make the changes
permanent or a ROLLBACK to undo the
changes.
3. Savepoints:
Savepoints allow you to mark a point within a
transaction to which you can later roll back
without affecting the entire transaction.
4. Nested Transactions:
Some database systems support nested
transactions, allowing transactions to be
divided into subtransactions. Subtransactions
can be committed or rolled back
independently, but the outer transaction's
final outcome affects all subtransactions.
Transaction Control Statements:

1. BEGIN TRANSACTION / START TRANSACTION:
Initiates a new transaction. All statements that
follow belong to it until a COMMIT or ROLLBACK.
2. COMMIT:
Finalizes the transaction, making all its
changes permanent in the database.
Example: BEGIN TRANSACTION;
------- SQL statements within the transaction
COMMIT;

3. ROLLBACK:
Undoes all changes made within the transaction,
reverting the database to the state it was in before the
transaction started.
Example: BEGIN TRANSACTION;
--------- SQL statements within the transaction
ROLLBACK;

4. SAVEPOINT:
Defines a savepoint within a transaction, allowing you to
roll back to that point later.
Example: BEGIN TRANSACTION;
-- SQL statements
SAVEPOINT my_savepoint;
-- More SQL statements
In this example, a savepoint named my_savepoint is
created within the transaction. You can later use this
savepoint to roll back to this specific point.

5. ROLLBACK TO SAVEPOINT:
Rolls back the transaction to a specific savepoint.
Example: BEGIN TRANSACTION;
-- SQL statements
SAVEPOINT my_savepoint;
-- More SQL statements
ROLLBACK TO my_savepoint;
-- Any changes made after my_savepoint are undone
6. RELEASE SAVEPOINT:
The RELEASE SAVEPOINT statement is used to release a savepoint,
effectively removing it from the transaction.
Once a savepoint is released, you cannot roll back to it.
Example: BEGIN TRANSACTION;
-- SQL statements
SAVEPOINT my_savepoint;
-- More SQL statements
RELEASE SAVEPOINT my_savepoint; -- The savepoint is released
-- and cannot be rolled back to

7. SET TRANSACTION:
Used to set characteristics of a transaction,
such as isolation level.
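
The control statements above can be put together in one small example. The sketch below moves 100 from one account to another, then uses a savepoint to undo a mistaken update before committing. The accounts table and its values are hypothetical. The syntax shown works in PostgreSQL and SQLite; MySQL starts a transaction with START TRANSACTION, and SQL Server uses SAVE TRANSACTION for savepoints.

```sql
-- Hypothetical table; names and values are illustrative only.
CREATE TABLE accounts (
    account_id INT PRIMARY KEY,
    balance    INT
);

INSERT INTO accounts (account_id, balance) VALUES (1, 500);
INSERT INTO accounts (account_id, balance) VALUES (2, 100);

BEGIN TRANSACTION;

-- Both updates belong to one unit of work.
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;

SAVEPOINT after_transfer;

-- A mistaken update made after the savepoint...
UPDATE accounts SET balance = 0 WHERE account_id = 2;

-- ...is undone, while the transfer before the savepoint is kept.
ROLLBACK TO SAVEPOINT after_transfer;

COMMIT;

-- Account 1 now holds 400 and account 2 holds 200.
SELECT account_id, balance
FROM accounts
ORDER BY account_id;
```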
Transaction Isolation Levels:
Isolation levels control how transactions interact with each other. The
SQL standard defines the first four levels below; snapshot isolation is an
additional level offered by some systems:

1. READ UNCOMMITTED:
Allows a transaction to read uncommitted
changes made by other transactions. It offers
the lowest isolation but the highest
concurrency.
2. READ COMMITTED:
A transaction can only read committed
changes made by other transactions. This
provides a balance between isolation and
concurrency.
3. REPEATABLE READ:
Ensures that a transaction can read the same
data consistently throughout its duration.
However, it may still allow new data to be
inserted by other transactions.
4. SERIALIZABLE:
Provides the highest level of isolation by
preventing other transactions from modifying
or inserting data that a transaction is reading.
5. SNAPSHOT ISOLATION:
This is a variation of isolation that allows a
transaction to see a consistent snapshot of
the database as it existed at the start of the
transaction.
Transaction management is crucial for maintaining data consistency
and integrity in a multi-user database environment. Choosing the
appropriate isolation level and using transaction control statements
correctly are essential for developing robust and reliable database
applications.
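
How an isolation level is chosen differs between systems, so the sketch below shows two common forms rather than a single portable one. It assumes a hypothetical accounts table with account_id and balance columns. Whether the SET TRANSACTION statement affects only the next transaction or the whole session also depends on the system, so treat this as a starting point and check your documentation.

```sql
-- MySQL and SQL Server: set the level first, then start the transaction.
-- (SQL Server starts it with BEGIN TRANSACTION instead.)
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT balance FROM accounts WHERE account_id = 1;
COMMIT;

-- PostgreSQL: the level is given when the transaction starts.
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM accounts WHERE account_id = 1;
COMMIT;
```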

TEMPORARY TABLES
Temporary tables are a type of database table that is used for storing
temporary data. These tables are typically created and managed
within a session and exist only for the duration of that session. Once
the session ends or the connection is closed, temporary tables are
automatically deleted. Temporary tables are useful for storing
intermediate results, temporary data, or isolating data within a
specific session without affecting the global database schema.
Here are some key characteristics and uses of temporary tables:

1. Temporary Table Creation:
Temporary tables are typically created using
the CREATE TEMPORARY TABLE or
CREATE TEMP TABLE statement (the
syntax may vary slightly depending on the
database system).
Temporary tables can have the same
structure and functionality as regular
(permanent) tables.
Example (MySQL syntax):
CREATE TEMPORARY TABLE temp_customers (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(50)
);

2. Session Scope:
Temporary tables are local to the session or
connection in which they are created.
They are not visible or accessible to other
sessions or connections.
Each session can create its own set of
temporary tables with the same name without
conflicts.
3. Automatic Cleanup:
Temporary tables are automatically dropped
(deleted) when the session ends or the
connection is closed.
They do not persist beyond the duration of
the session.
This automatic cleanup simplifies
management and ensures that temporary
tables do not clutter the database.
4. Data Isolation:
Temporary tables are useful for isolating and
segregating data within a specific session.
They are often used to store intermediate
results during complex query operations or to
prevent naming conflicts in scenarios where
multiple sessions may be working with similar
data.
5. Performance Optimization:
Temporary tables can be employed to
improve query performance by allowing you
to store and manipulate intermediate results
rather than repeatedly recalculating them.
6. Data Transformation:
Temporary tables are commonly used during
data transformation, such as data cleansing,
aggregation, or data loading processes.
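7. Stand-in for Materialized Views:
Within a session, a temporary table can play a
role similar to a materialized view: it stores
the results of a complex or time-consuming
query so that later statements can read them
without re-running the query. Unlike a true
materialized view, it is dropped when the
session ends and is never refreshed
automatically.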
8. Security and Permissions:
Because a temporary table is visible only to
the session that created it, other users cannot
read or modify it. Creating one may still
require a specific privilege, such as CREATE
TEMPORARY TABLES in MySQL.
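9. Database System Compatibility:
The support and syntax for temporary tables
vary between database systems. For example,
MySQL and PostgreSQL use CREATE
TEMPORARY TABLE, while SQL Server marks a
temporary table with a # prefix on its name
(CREATE TABLE #temp_customers). Refer to the
documentation of your specific database
management system to understand how
temporary tables are implemented.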
Temporary tables are a valuable tool in database development and
query optimization, allowing for the efficient management of data
within a session while minimizing the impact on the overall database
schema. They are particularly useful in scenarios where data needs
to be isolated, transformed, or stored temporarily.
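The session scope and automatic cleanup described above can be tried out with a short script. This is a minimal sketch, assuming Python's built-in sqlite3 module and SQLite rather than MySQL; the file name demo.db and the sample rows are hypothetical. Two connections stand in for two sessions.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file, created in a scratch directory for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

session_a = sqlite3.connect(db_path)
session_b = sqlite3.connect(db_path)

# Session A creates a temporary table and stores intermediate data in it.
session_a.execute(
    "CREATE TEMPORARY TABLE temp_customers ("
    "CustomerID INTEGER PRIMARY KEY, CustomerName VARCHAR(50))"
)
session_a.execute("INSERT INTO temp_customers VALUES (1, 'Asha'), (2, 'Ravi')")
rows_in_a = session_a.execute("SELECT COUNT(*) FROM temp_customers").fetchone()[0]

# Session B cannot see session A's temporary table.
try:
    session_b.execute("SELECT COUNT(*) FROM temp_customers")
    visible_in_b = True
except sqlite3.OperationalError:
    visible_in_b = False

# Closing session A drops its temporary tables; a new session starts clean.
session_a.close()
session_c = sqlite3.connect(db_path)
try:
    session_c.execute("SELECT COUNT(*) FROM temp_customers")
    survives_reconnect = True
except sqlite3.OperationalError:
    survives_reconnect = False

session_b.close()
session_c.close()
print(rows_in_a, visible_in_b, survives_reconnect)
```

Session A sees its two rows, while the other two sessions get a "no such table" error, which is the isolation and cleanup behaviour listed in points 2 and 3.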
CLONING A TABLE
Cloning a table involves creating a new table that has the same
structure as an existing table. The clone, or copied table, typically
has the same columns, data types, and constraints as the original
table. Cloning is useful for various purposes, such as creating
backups, creating a working copy for analysis or experimentation, or
creating a template for new tables. The specific SQL syntax for
cloning a table may vary depending on the database system you are
using, but here's a general outline of how it is done:
Basic Steps to Clone a Table:

1. Create the New Table Structure:


Create a new table with the desired name
and structure that matches the original table.
You can use the CREATE TABLE statement
to define the new table's columns and data
types.
2. Copy Constraints and Indexes (Optional):
If you want the cloned table to have the same
constraints (e.g., primary keys, foreign keys)
and indexes as the original table, you need to
define them as part of the new table's
structure.
3. Copy Data (Optional):
If you want the cloned table to also contain
the same data as the original table, you can
use an INSERT INTO statement to copy the
data from the original table to the new table.
Example SQL for Cloning a Table:
Here's a simplified example in MySQL:
-- Step 1: Create the new table structure (no rows are copied)
CREATE TABLE cloned_table AS
SELECT * FROM original_table WHERE 1 = 0;

-- Step 2: Copy constraints and indexes (if desired)
-- (Define primary keys, foreign keys, indexes, etc. on cloned_table)

-- Step 3: Copy data (if desired)
INSERT INTO cloned_table
SELECT * FROM original_table;
In this example:
cloned_table is created with the same columns as
original_table. Selecting all columns with a condition
that is always false (1 = 0) ensures that no rows are
copied at this stage.
Constraints and indexes can be defined on
cloned_table as needed.
If you want to copy data, you can use an INSERT INTO
statement to transfer the data from original_table to
cloned_table.
The exact SQL syntax varies between database systems such as
PostgreSQL, SQL Server, and Oracle, so refer to your database
system's documentation for specific syntax and options. In MySQL,
CREATE TABLE cloned_table LIKE original_table; is an alternative
first step that copies the column definitions and indexes of the
original table, although not its foreign keys.
Keep in mind that CREATE TABLE ... AS SELECT copies only the
columns and, if the query returns any, the rows. It does not copy
keys, indexes, triggers, stored procedures, or other database
objects associated with the original table.
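The three cloning steps can be checked end to end with a small script. This is a sketch under assumptions: it uses Python's built-in sqlite3 module and SQLite instead of MySQL, and the columns id and name are hypothetical. It also shows why step 2 matters, because the clone created with CREATE TABLE ... AS SELECT carries no primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE original_table ("
    "id INTEGER PRIMARY KEY, name VARCHAR(50) NOT NULL)"
)
conn.execute("INSERT INTO original_table VALUES (1, 'first'), (2, 'second')")

# Step 1: clone the structure only; WHERE 1 = 0 matches no rows.
conn.execute("CREATE TABLE cloned_table AS SELECT * FROM original_table WHERE 1 = 0")
rows_after_step1 = conn.execute("SELECT COUNT(*) FROM cloned_table").fetchone()[0]

# Step 3: copy the data.
conn.execute("INSERT INTO cloned_table SELECT * FROM original_table")
rows_after_step3 = conn.execute("SELECT COUNT(*) FROM cloned_table").fetchone()[0]


def primary_key_columns(table):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})") if row[5] > 0]


original_pk = primary_key_columns("original_table")
cloned_pk = primary_key_columns("cloned_table")
print(rows_after_step1, rows_after_step3, original_pk, cloned_pk)
```

The clone starts empty, holds both rows after the INSERT INTO ... SELECT, and reports no primary key columns until you define them yourself.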
AUTO-INCREMENT

Auto-increment is a feature in SQL databases that allows a numeric
column, typically an integer, to automatically generate a unique value
for each new row added to a table. This feature is commonly used
for creating primary keys to ensure each record in a table has a
unique identifier. Here are the key points to understand about
auto-increment:

1. Auto-Increment Column:
An auto-increment column is a column in a
database table that automatically generates a
unique value for each new row.
It is often used as the primary key of the
table, ensuring that each row has a distinct
identifier.
2. Primary Key Use:
Auto-increment columns are commonly used
as primary keys, but they can also be used in
other situations where unique identifiers are
needed.
3. Data Type:
Auto-increment columns are typically of
integer data types, such as INT or BIGINT.
PostgreSQL's SERIAL and BIGSERIAL are
shorthand for an integer column backed by a
sequence.
4. Initialization and Increment:
The initial value and increment (step size) are
configurable. You can set the starting value
and specify how much the column should
increment by for each new row.
5. Database Support:
The syntax and implementation of auto-
increment columns may vary between
database systems. For example, in MySQL,
you use AUTO_INCREMENT, in
PostgreSQL, you use SERIAL, and in SQL
Server, you use IDENTITY.
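6. Uniqueness:
The database system does not normally hand
out the same generated value twice, but
uniqueness is only enforced when the column
also carries a PRIMARY KEY or UNIQUE
constraint. The sequence can contain gaps,
for example after deleted rows or rolled-back
inserts.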
7. No Manual Assignment:
You typically don't need to manually provide a
value for an auto-increment column when
inserting a new row. The database system
handles it automatically.
8. Concurrency Considerations:
Auto-increment columns are designed to
handle concurrent inserts. Even in a multi-
user environment, each new row will receive
a unique value.
9. Resetting or Restarting:
In some database systems, you can reset or
restart the auto-increment value, but this
operation should be used with caution as it
can affect data integrity.
Example in MySQL:
CREATE TABLE Employees (
EmployeeID INT AUTO_INCREMENT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50)
);
In this example, the EmployeeID column is defined as an auto-
increment primary key. When new rows are inserted into the
Employees table, the EmployeeID value is automatically generated
and incremented.
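The behaviour described above can be observed with a short script. This is a minimal sketch, assuming Python's built-in sqlite3 module; SQLite spells the feature INTEGER PRIMARY KEY AUTOINCREMENT rather than MySQL's AUTO_INCREMENT, and the employee names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite's counterpart of MySQL's INT AUTO_INCREMENT PRIMARY KEY.
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "FirstName VARCHAR(50), LastName VARCHAR(50))"
)

insert_sql = "INSERT INTO Employees (FirstName, LastName) VALUES (?, ?)"

# No EmployeeID is supplied; the database assigns it.
first_id = conn.execute(insert_sql, ("Asha", "Verma")).lastrowid
second_id = conn.execute(insert_sql, ("Ravi", "Kumar")).lastrowid

# Deleting the newest row leaves a gap: its ID is not handed out again.
conn.execute("DELETE FROM Employees WHERE EmployeeID = ?", (second_id,))
third_id = conn.execute(insert_sql, ("Meena", "Joshi")).lastrowid

print(first_id, second_id, third_id)
```

The three inserts receive 1, 2, and 3 even though row 2 was deleted in between, which is why auto-increment values should be treated as identifiers rather than as a gap-free row count.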

HANDLING DUPLICATES IN SQL


Handling duplicates in SQL involves managing and removing
duplicate records from a table or result set. Duplicate records can
cause data quality issues and affect the accuracy of your queries.
SQL provides several techniques and keywords to address duplicate
records. Here are common methods for handling duplicates in SQL:

1. SELECT DISTINCT:
Use the SELECT DISTINCT statement to
retrieve unique values from a column or
combination of columns in a table. This
eliminates duplicate values and returns only
distinct rows.
SYNTAX: SELECT DISTINCT column1, column2
FROM table_name;

2. GROUP BY and HAVING:
The GROUP BY clause is used to group rows with
identical values into summary rows. You can then use
the HAVING clause to filter the grouped results based on
conditions.
SYNTAX: SELECT column1, COUNT(*)
FROM table_name
GROUP BY column1
HAVING COUNT(*) > 1;
This query retrieves the values from column1 that occur
more than once in the table.

3. ROW_NUMBER() and Common Table Expressions (CTEs):
You can use the ROW_NUMBER() function in
combination with a common table expression (CTE) to
assign row numbers to rows within a group of duplicates.
This allows you to filter and keep only one instance of
each duplicate based on the row number.
SYNTAX: WITH CTE AS (
SELECT column1, column2, ROW_NUMBER() OVER
(PARTITION BY column1, column2 ORDER BY column1)
AS rn
FROM table_name
)
SELECT column1, column2
FROM CTE
WHERE rn = 1;
4. UNION:
If you have two or more result sets that contain duplicate
rows, you can use the UNION operator to combine them
and eliminate duplicates. To retain all rows, including
duplicates, you can use UNION ALL.
SYNTAX: SELECT column1, column2
FROM table1
UNION
SELECT column1, column2
FROM table2;
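5. DELETE Duplicate Rows:
If you want to permanently remove duplicate rows from a
table, you can use the DELETE statement with a
subquery that picks one row to keep from each set of
duplicates. This requires a column that distinguishes the
duplicates from one another, such as a unique id
(assumed here).
SYNTAX:
DELETE FROM table_name
WHERE id NOT IN (
SELECT MIN(id)
FROM table_name
GROUP BY column1, column2
);
This query keeps the row with the lowest id in each set of
duplicates based on column1 and column2 and deletes the
rest. MySQL does not allow the subquery to read directly
from the table being deleted from, so there the subquery
must be wrapped in a derived table.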
6. INSERT INTO with SELECT DISTINCT:
To insert data into a new table while removing
duplicates, you can use the INSERT INTO
statement with a SELECT DISTINCT
subquery from the original table.

SYNTAX:
INSERT INTO new_table (column1, column2)
SELECT DISTINCT column1, column2
FROM original_table;
Handling duplicates in SQL depends on the specific
requirements and the database system you are using.
Depending on the situation, you may choose one or a
combination of these techniques to address duplicate
records in your data.
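Three of the techniques above can be compared on the same sample data. This is a sketch, assuming Python's built-in sqlite3 module with a SQLite version that supports window functions (3.25 or later); the contacts table, its id column, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO contacts (name, email) VALUES (?, ?)",
    [
        ("Asha", "asha@example.com"),
        ("Ravi", "ravi@example.com"),
        ("Asha", "asha@example.com"),
        ("Asha", "asha@example.com"),
    ],
)

# GROUP BY and HAVING: report which (name, email) pairs occur more than once.
duplicates = conn.execute(
    "SELECT name, email, COUNT(*) FROM contacts "
    "GROUP BY name, email HAVING COUNT(*) > 1"
).fetchall()

# ROW_NUMBER() in a CTE: number the rows of each pair and keep row 1.
kept_ids = [
    row[0]
    for row in conn.execute(
        "WITH ranked AS ("
        "  SELECT id, ROW_NUMBER() OVER (PARTITION BY name, email ORDER BY id) AS rn"
        "  FROM contacts) "
        "SELECT id FROM ranked WHERE rn = 1 ORDER BY id"
    )
]

# DELETE: permanently remove every row except the lowest id of each pair.
conn.execute(
    "DELETE FROM contacts WHERE id NOT IN ("
    "  SELECT MIN(id) FROM contacts GROUP BY name, email)"
)
remaining = conn.execute("SELECT id, name FROM contacts ORDER BY id").fetchall()

print(duplicates, kept_ids, remaining)
```

The first query only reports the duplicates, the second selects one survivor per group without changing the table, and the DELETE makes the cleanup permanent.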
SQL INJECTION
SQL injection is a malicious technique in which an attacker
manipulates the data input into an application's SQL query,
exploiting vulnerabilities in the application's code to gain
unauthorized access to a database. It is one of the most
common and dangerous web application security
vulnerabilities. Here's an overview of SQL injection, its risks,
and how to prevent it:
How SQL Injection Works:
1. User Input: SQL injection occurs when an application
takes user input, such as form fields, URL parameters,
or cookies, and directly includes it in SQL queries
without proper validation or sanitization.
2. Malicious Input: An attacker submits specially crafted
input that includes SQL code as part of the user input.
This input can include SQL statements that manipulate
the query's behavior.
3. Exploitation: If the application does not properly
validate or sanitize the input, the attacker's input is
treated as part of the SQL query, allowing the attacker to
execute arbitrary SQL statements against the database.
4. Unauthorized Access: Depending on the vulnerability
and the attacker's skill, this can lead to unauthorized
access, data leakage, data manipulation, or even
complete control of the application and database.

Risks of SQL Injection:
SQL injection poses several significant risks:
1. Data Exposure: Attackers can access and retrieve
sensitive data from the database, including user
credentials, personal information, financial data, and
more.
2. Data Manipulation: Attackers can modify, delete, or
insert data into the database, potentially causing data
corruption and unauthorized changes.
3. Unauthorized Access: In the worst case, attackers can
gain full control of the database, application, and
potentially the server itself, depending on the privileges
of the application's database connection.

Preventing SQL Injection:
Preventing SQL injection is crucial for ensuring the security
of web applications. Here are some best practices to
mitigate SQL injection vulnerabilities:
1. Use Parameterized Queries: Parameterized queries or
prepared statements ensure that user input is treated as
data, not code. They separate user input from SQL code
and prevent SQL injection.
2. Input Validation and Sanitization: Implement strict
input validation to ensure that user input adheres to
expected patterns and formats. Reject input that doesn't
match the expected data type.
3. Escape User Input: If you can't use parameterized
queries, escape user input with the escaping functions
provided by your programming language or database
library rather than writing your own. Escaping is easy to
get wrong, so treat it as a last resort.
4. Least Privilege: Assign the minimal required
permissions to the database user account used by the
application. Avoid using highly privileged accounts.
5. Web Application Firewall (WAF): Consider using a
web application firewall that can detect and block SQL
injection attempts.
6. Security Patching: Keep your web application and
database management system up-to-date with security
patches to address known vulnerabilities.
7. Error Handling: Implement custom error handling to
prevent detailed error messages from being exposed to
attackers. These messages can provide valuable
information about the database structure.
8. Security Testing: Regularly perform security testing,
including penetration testing and code review, to identify
and remediate SQL injection vulnerabilities.

SQL injection is a critical security issue that requires
constant vigilance and proactive security measures to
protect your web applications and databases from potential
exploitation.
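The difference between building a query from user input and using a parameterized query is easiest to see side by side. This is a minimal sketch, assuming Python's built-in sqlite3 module and its ? placeholders; the users table and the credentials are hypothetical, and the password is stored in plain text only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

# Input an attacker might type into a password field.
malicious_password = "' OR '1'='1"

# UNSAFE: the input is pasted into the SQL text, so its quotes and OR
# become part of the query and the password check is bypassed.
unsafe_sql = (
    "SELECT COUNT(*) FROM users "
    "WHERE username = 'alice' AND password = '" + malicious_password + "'"
)
unsafe_matches = conn.execute(unsafe_sql).fetchone()[0]

# SAFE: placeholders send the input as data, so it is compared literally
# against the stored password and matches nothing.
safe_matches = conn.execute(
    "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?",
    ("alice", malicious_password),
).fetchone()[0]

print(unsafe_matches, safe_matches)
```

The concatenated query matches alice's row without the real password, while the parameterized query matches no rows. Other database libraries use different placeholder styles, but the principle is the same.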

RAND and CONCAT
This is the last chapter, and I hope this book helps you gain
knowledge of the SQL programming language.
RAND() Function:
RAND() is a built-in MySQL function that generates a
random floating-point number between 0 (inclusive) and
1 (exclusive).
It is commonly used to obtain random values for various
purposes, such as selecting random rows from a table or
generating random test data.
Example:
To generate random integers within a specific range, you can use
RAND() in combination with mathematical operations and the
FLOOR() function:
SELECT FLOOR(RAND() * 100) AS random_integer;
-- Generates a random integer between 0 and 99

CONCAT() Function:
CONCAT() is used to concatenate (join together) two or
more strings into a single string.
It can take multiple arguments, combining them in the
order they are provided.
Example:
SELECT CONCAT('Hello, ', 'World!'); -- Concatenates the two strings
You can also use CONCAT() to combine columns or add separators
between values:
SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM
employees;
In this example, the CONCAT() function combines the
first_name and last_name columns, separated by a space,
to create a full name.

These functions are not fully portable. CONCAT() is available in
MySQL, PostgreSQL, and SQL Server, while the SQL standard itself
defines the || operator for concatenation. RAND() exists in MySQL and
SQL Server, but PostgreSQL provides random() instead. Both kinds of
function are useful for a variety of tasks, including data
manipulation, report generation, and random data generation.
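The range arithmetic behind FLOOR(RAND() * 100) generalizes to any pair of bounds, which the sketch below mirrors in Python. The helper random_int is hypothetical and not a SQL function. The second half uses Python's built-in sqlite3 module, where the || operator is used for concatenation as an assumption of this sketch, in place of CONCAT().

```python
import math
import random
import sqlite3


def random_int(low, high):
    # Same arithmetic as FLOOR(RAND() * (high - low + 1)) + low in SQL.
    return math.floor(random.random() * (high - low + 1)) + low


samples = [random_int(0, 99) for _ in range(1000)]  # like FLOOR(RAND() * 100)
dice = [random_int(1, 6) for _ in range(1000)]      # whole numbers from 1 to 6

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO employees VALUES ('Asha', 'Verma')")

# first_name || ' ' || last_name gives the same result as
# CONCAT(first_name, ' ', last_name) when no value is NULL.
full_name = conn.execute(
    "SELECT first_name || ' ' || last_name AS full_name FROM employees"
).fetchone()[0]

print(min(samples) >= 0, max(samples) <= 99, full_name)
```

Because the random fraction is always below 1, multiplying by the number of possible values and rounding down can never reach the upper bound plus one, so every result stays inside the requested range.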
INTERVIEW QUESTIONS

Beginner Level SQL Interview Questions:

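1. What does SQL stand for?
SQL stands for Structured Query Language.
2. What are the two primary types of SQL commands?
Data Definition Language (DDL) and Data
Manipulation Language (DML) commands.
Commands are also commonly grouped into
Data Control Language (DCL) and Transaction
Control Language (TCL).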
3. What is a database?
A database is a structured collection of data
that is organized and stored for easy retrieval
and manipulation.
4. What is a table in a database?
A table is a database object that stores data
in rows and columns.
5. How do you retrieve all records from a table named
'employees'?
Use the SQL query: SELECT * FROM
employees;
6. What is the purpose of the SQL WHERE clause?
The WHERE clause is used to filter records
based on a specified condition.
7. How do you insert a new record into a table?
Use the SQL query: INSERT INTO
table_name (column1, column2, ...)
VALUES (value1, value2, ...);
8. What is the primary key in a table?
A primary key is a unique identifier for each
record in a table.
9. What is an SQL JOIN?
An SQL JOIN is used to combine rows from
two or more tables based on a related column
between them.
10. What is the SQL ORDER BY clause used for?
The ORDER BY clause is used to sort the
result set in ascending or descending order.
11. What is an SQL function?
An SQL function is a pre-defined operation
that takes one or more arguments and
returns a single value.
12. What is the purpose of the SQL GROUP BY clause?
The GROUP BY clause is used to group
rows that have the same values in specified
columns into summary rows.
13. What does the SQL COUNT() function do?
COUNT(*) counts the number of rows in a
result set or group, while COUNT(column)
counts only the rows where that column is
not NULL.
14. What is the difference between a primary key and a
unique key?
A primary key is used to uniquely identify a
record in a table and cannot contain NULL
values. A unique key, on the other hand,
enforces uniqueness but can contain NULL
values.
15. What is the purpose of the SQL LIMIT clause?
The LIMIT clause is used to limit the number
of rows returned in a result set. It is
supported by MySQL and PostgreSQL, among
others; SQL Server uses TOP instead.
16. What is an SQL view?
An SQL view is a virtual table that is based
on the result of an SQL SELECT query.
17. How do you update data in a table using SQL?
Use the SQL query: UPDATE table_name
SET column1 = value1, column2 = value2
WHERE condition;
18. What is the difference between the SQL WHERE and
HAVING clauses?
The WHERE clause filters rows before
grouping, while the HAVING clause filters
groups after grouping.
19. What is the purpose of the SQL BETWEEN
operator?
The BETWEEN operator is used to filter a
result set based on a range of values. Both
endpoints of the range are included.
20. What is an SQL alias?
An SQL alias is a temporary name assigned
to a table or column for the duration of a
query.
21. What does the SQL LIKE operator do?
The LIKE operator is used to search for a
specified pattern in a column.
22. What is the difference between SQL and NoSQL
databases?
SQL databases are relational databases with
structured data, while NoSQL databases are
non-relational databases that can store
unstructured or semi-structured data.
23. What is the SQL MAX() function used for?
The MAX() function is used to find the
maximum value in a column.
24. How do you delete records from a table using SQL?
Use the SQL query: DELETE FROM
table_name WHERE condition;
25. What is a foreign key in a table?
A foreign key is a field in a table that is linked
to the primary key of another table and is
used to establish a relationship between the
two tables.
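Several of the beginner answers above (WHERE, GROUP BY, HAVING, COUNT, MAX, ORDER BY) can be seen working together in one query. This is a minimal sketch, assuming Python's built-in sqlite3 module; the employees columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "Sales", 50000),
        ("Ravi", "Sales", 30000),
        ("Meena", "Sales", 45000),
        ("John", "IT", 60000),
        ("Sara", "IT", 20000),
    ],
)

# WHERE removes individual rows (salary below 40000) before grouping.
# HAVING then removes whole groups (departments with fewer than 2 rows left).
result = conn.execute(
    "SELECT department, COUNT(*) AS headcount, MAX(salary) AS top_salary "
    "FROM employees "
    "WHERE salary >= 40000 "
    "GROUP BY department "
    "HAVING COUNT(*) >= 2 "
    "ORDER BY department"
).fetchall()

print(result)
```

Only Sales survives: two of its three employees pass the WHERE filter, whereas IT is left with a single row and is dropped by HAVING.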
Intermediate Level SQL Interview Questions:

1. What is a subquery in SQL?
A subquery is a query nested inside another
query. It can be used to retrieve data that will
be used as a condition in the main query.
2. What is the difference between INNER JOIN and
OUTER JOIN in SQL?
INNER JOIN returns only matching rows
between tables, while OUTER JOIN returns
matching rows and all unmatched rows from
at least one table.
3. Explain the difference between UNION and UNION
ALL in SQL.
UNION removes duplicate rows, while
UNION ALL includes all rows, even if they are
duplicates.
4. What is a SQL index, and why is it used?
A SQL index is a data structure used to
improve the speed of data retrieval
operations on a database table. It allows for
faster data lookup.
5. What is a SQL transaction, and why is it important?
A transaction is a series of one or more SQL
statements that are executed as a single unit
of work. It ensures that all the statements
within the transaction are either fully executed
or fully rolled back.
6. What is a self-join in SQL, and how is it different
from a regular join?
A self-join is when a table is joined with itself.
It's useful when you want to compare rows
within the same table. It uses table aliases to
distinguish between the two instances of the
same table.
7. Explain the concept of normalization in databases.
Normalization is the process of organizing
data in a database to reduce data
redundancy and improve data integrity by
breaking down tables into smaller, related
tables.
8. What is the purpose of the SQL CASE statement?
The SQL CASE statement is used to perform
conditional logic within an SQL query, similar
to an "if-else" statement in other
programming languages.
9. What is an SQL trigger, and when would you use it?
An SQL trigger is a set of actions that are
automatically executed when a specified
event occurs in a database. It's used to
enforce data integrity or automate tasks.
10. What is the difference between a candidate key, a
primary key, and a superkey?
A superkey is a set of one or more attributes
that can uniquely identify a tuple. A candidate
key is a minimal superkey, and the primary
key is the chosen candidate key that uniquely
identifies the tuple.
11. Explain the concept of ACID properties in SQL
transactions.
ACID stands for Atomicity, Consistency,
Isolation, and Durability. These properties
ensure the reliability of database
transactions.
12. What is a stored procedure in SQL, and how is it
different from a function?
A stored procedure is a set of SQL
statements that can be executed in a
predefined order. It doesn't return a value
directly, while a function does return a value.
13. What is SQL injection, and how can it be prevented?
SQL injection is a malicious technique where
attackers inject SQL code into input fields to
manipulate a database. It can be prevented
by using prepared statements, input
validation, and escaping user input.
14. Explain the concept of database transactions and
the use of COMMIT and ROLLBACK.
COMMIT is used to save the changes made
during a transaction, while ROLLBACK is
used to undo those changes if an error
occurs.
15. What is the purpose of the SQL EXISTS operator?
The EXISTS operator is used to check if a
subquery returns any rows and returns true if
at least one row is found.
16. What is the difference between a clustered index
and a non-clustered index in SQL?
A clustered index determines the physical
order of data in a table, and there can be only
one per table. A non-clustered index is a
separate structure that improves data
retrieval but doesn't affect the physical order
of data.
17. What is the SQL DDL and DML, and how do they
differ?
DDL (Data Definition Language) is used for
defining database structures (e.g., CREATE,
ALTER, DROP). DML (Data Manipulation
Language) is used for managing data (e.g.,
SELECT, INSERT, UPDATE, DELETE).
18. Explain the purpose of the SQL GROUP_CONCAT
function.
GROUP_CONCAT is a MySQL function used to
concatenate values from multiple rows into a
single string, often used with GROUP BY to
aggregate data. PostgreSQL and SQL Server
provide STRING_AGG for the same purpose.
19. What is the SQL window function, and how is it
different from aggregate functions?
A window function performs a calculation
across a set of table rows related to the
current row, without merging rows into a
single result. It differs from aggregate
functions that collapse multiple rows into one.
20. What is a foreign key constraint in SQL, and why is
it important?
A foreign key constraint enforces referential
integrity, ensuring that data in one table
corresponds to data in another table. It
maintains the relationship between tables.
21. What is the SQL NULL value, and how does it affect
query results?
NULL is a special marker used to indicate
that a value is missing or unknown. It
affects query results because comparisons
involving NULL, even NULL = NULL, evaluate
to unknown rather than true, so such rows
are filtered out by WHERE. Use IS NULL or
IS NOT NULL to test for it.
22. Explain the difference between a view and a table in
SQL.
A table is a physical storage structure, while
a view is a virtual table that is generated
based on the result of a query. Views don't
store data themselves but provide a dynamic
way to access data.
23. What is the SQL self-contained subquery, and how
is it different from a correlated subquery?
A self-contained subquery is independent
and doesn't reference columns from the outer
query. A correlated subquery depends on the
outer query's results and references columns
from it.
24. What is the SQL TRUNCATE statement used for?
The TRUNCATE statement is used to
remove all records from a table quickly,
essentially resetting the table.
25. Explain the concept of SQL schema and its role in
database management.
A schema is a logical container for database
objects like tables, views, and procedures. It
helps organize and manage database
objects, providing separation between
different parts of the database.
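Question 19 above contrasts window functions with aggregate functions, and the difference shows up clearly when both run on the same rows. This is a sketch, assuming Python's built-in sqlite3 module with a SQLite version that supports window functions (3.25 or later); the sales table and its figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100), ("North", 300), ("South", 200)],
)

# Aggregate with GROUP BY: each region collapses into one summary row.
aggregated = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Window function: every original row is kept and gains its region's total.
windowed = conn.execute(
    "SELECT region, amount, SUM(amount) OVER (PARTITION BY region) AS region_total "
    "FROM sales ORDER BY region, amount"
).fetchall()

print(aggregated)
print(windowed)
```

The aggregate query returns two rows, one per region, while the window query returns all three original rows with the regional total repeated alongside each amount.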

Hard Level SQL Interview Questions:

1. Explain the differences between a clustered index
and a non-clustered index in detail. When would you
choose one over the other, and why?
Clustered Index:
Determines the physical order of data in a table.
Only one per table.
Typically used on columns with a high degree of
uniqueness.
Non-clustered Index:
A separate structure that improves data retrieval
but doesn't affect physical data order.
Multiple non-clustered indexes can exist on a
table.
Useful for columns frequently used in search
conditions.

2. What is the SQL execution plan, and how can you
generate and analyze it to optimize query
performance?
The execution plan is a roadmap for how
a SQL query will be processed. It helps
identify areas where query performance can
be improved. You can generate and analyze
it using tools like EXPLAIN (for some
databases) or by inspecting execution plans
generated by the database query optimizer.
3. Explain the concept of database normalization and
its different forms. Provide examples of each form.
Database normalization is the process of
organizing data in a database to reduce
redundancy and improve data integrity. The
forms include 1NF (First Normal Form), 2NF
(Second Normal Form), 3NF (Third Normal
Form), BCNF (Boyce-Codd Normal Form),
and 4NF (Fourth Normal Form). Each form
builds upon the previous one, removing
certain types of data redundancy.
4. What is a recursive SQL query, and when might you
use it? Provide an example of a recursive query.
A recursive SQL query is used to query
hierarchical or self-referencing data. It
typically involves a common table expression
(CTE) that references itself. For example, you
might use it to query an organizational
hierarchy or a tree structure.
5. Explain the concept of SQL injection in detail, and
discuss advanced techniques to prevent it.
SQL injection is a malicious technique where
attackers inject SQL code into input fields to
manipulate a database. To prevent it, you
should use prepared statements, input
validation, and parameterized queries, as well
as security mechanisms like Web Application
Firewalls (WAFs).
6. What is a materialized view in SQL, and how is it
different from a regular view? When would you use
a materialized view?
A materialized view is a physical copy of the
result set of a query. It's different from a
regular view because it stores data on disk,
whereas a regular view is a virtual table.
Materialized views are useful when you need
to precompute and store complex
aggregations or frequently queried data for
performance reasons.
7. Explain the differences between the MERGE and
UPDATE statements in SQL. Provide an example of
when to use each.
The MERGE statement is used to perform
multiple operations (INSERT, UPDATE,
DELETE) in a single query based on a
specified condition. The UPDATE statement
is used to modify existing data in a table.
MERGE is typically used for complex updates
that require multiple actions.
8. What is a database trigger? Discuss the types of
triggers and scenarios where they are useful.
A database trigger is a set of actions that are
automatically executed when a specified
event occurs in a database. Types of triggers
include DML triggers (fire on data
manipulation events) and DDL triggers (fire
on data definition events). Triggers are used
for enforcing data integrity, auditing changes,
or automating tasks.
9. Explain the SQL concept of window functions and
provide examples of common window functions.
Window functions perform calculations across
a set of table rows related to the current row,
without collapsing multiple rows into one.
Common window functions include
ROW_NUMBER, RANK, DENSE_RANK,
LAG, and LEAD. They are often used for
ranking, analytics, and aggregations within
partitions of data.
10. What is the purpose of the SQL Pivot and Unpivot
operations, and how are they used?
Pivot and Unpivot are operations used to
transform data from rows to columns (Pivot)
or from columns to rows (Unpivot). They are
useful when you need to restructure data for
reporting or analysis purposes.
11. Explain the concept of database sharding, and
describe the advantages and challenges of
implementing sharding in a database system.
Database sharding is a technique used to
horizontally partition a database into smaller,
more manageable pieces. It can improve
performance and scalability but introduces
complexity in terms of data distribution, data
consistency, and maintenance.
12. What are Common Table Expressions (CTEs), and
how can they be used to simplify complex SQL
queries?
CTEs are temporary result sets that can be
referenced within a SELECT, INSERT,
UPDATE, or DELETE statement. They are
used to break down complex queries into
more manageable parts, making the SQL
code more readable and maintainable.
13. Explain the SQL deadlock phenomenon, and
discuss strategies to detect and resolve deadlocks
in a database system.
A deadlock occurs when two or more
transactions are unable to proceed because
each is waiting for the other to release a
resource. Strategies to address deadlocks
include deadlock detection, timeout
mechanisms, and adjusting isolation levels.
14. Discuss the principles and best practices for SQL
query optimization, including indexing, query
rewriting, and statistics.
SQL query optimization involves various
techniques such as creating appropriate
indexes, rewriting queries, and ensuring that
statistics are up-to-date. Understanding
execution plans and analyzing query
performance are also essential.
15. What is a SQL Full-Text Search, and how is it
different from standard text search? When would
you use Full-Text Search?
SQL Full-Text Search is a technology that
enables fast and efficient searching of text-
based data. It is different from standard text
search as it allows for more advanced search
operations like linguistic analysis, word
stemming, and relevance ranking. It's
commonly used in applications that require
powerful text search capabilities, such as
search engines.
16. Explain the concept of SQL data encryption,
including data at rest and data in transit. Describe
the methods and best practices for securing data
using encryption.
Data encryption involves protecting data from
unauthorized access using encryption
algorithms. Data at rest is secured by
encrypting the data on storage devices, while
data in transit is secured during transmission.
Common methods include Transparent Data
Encryption (TDE) for data at rest and
SSL/TLS for data in transit.
17. What is temporal data in SQL, and how can it be
managed using temporal tables and system-
versioned tables?
Temporal data refers to data that changes
over time. Temporal tables and system-
versioned tables allow you to track changes
to data and query it at specific points in time.
These features are useful for auditing,
historical reporting, and compliance
purposes.
18. Explain the differences between OLAP (Online
Analytical Processing) and OLTP (Online
Transaction Processing) database systems. Provide
examples of scenarios where each is more suitable.
OLAP databases are optimized for complex
queries and reporting, while OLTP databases
are designed for transactional operations.
OLAP is suitable for data warehousing and
business intelligence, while OLTP is used in
e-commerce and order processing systems.
19. Discuss the challenges and best practices for database
backup and recovery in a high-availability, production database
environment.

Database backup and recovery are critical for ensuring
data availability and integrity. Challenges include
minimizing downtime during backups, handling large
datasets, and maintaining recovery point objectives
(RPO) and recovery time objectives (RTO). Best
practices often involve a combination of full backups,
differential backups, transaction log backups, and
backup verification.
20. What is SQL Server Analysis Services (SSAS), and how is it
used for data analysis and business intelligence?

SQL Server Analysis Services (SSAS) is a Microsoft
service for analyzing and visualizing data. It offers two
modes: Multidimensional and Tabular. SSAS is used to
build OLAP cubes, create data models, and perform
advanced analytics for business intelligence.
21. Explain the concept of SQL injection defenses beyond
prepared statements and input validation, such as web
application firewalls (WAFs) and ORM frameworks.

Beyond prepared statements and input validation,
additional defenses include using Web Application
Firewalls (WAFs) to filter malicious requests, using
Object-Relational Mapping (ORM) frameworks that
handle parameterization, and implementing security
policies at the application layer.
22. What is the concept of data warehousing in SQL, and how
does it differ from traditional relational databases? Discuss the
advantages and use cases of data warehousing.

A data warehouse is a central repository that integrates
data from various sources and is designed for efficient
querying and reporting. It differs from a traditional
transactional database in that it is optimized for
read-heavy, analytical workloads over historical data
rather than for frequent small updates. Data warehousing
is used for business intelligence, data analytics, and
historical reporting.
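
One common way to organize a warehouse for such analytical queries is a star schema, with a central fact table joined to descriptive dimension tables. The text above does not prescribe this layout, so treat the sketch below as an assumption; the table names, columns, and figures are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_date VALUES (10, 2022), (11, 2023);
INSERT INTO fact_sales VALUES
    (1, 10, 30.0), (1, 11, 45.0), (2, 11, 60.0), (2, 11, 15.0);
""")

# Analytical query: join the fact table to its dimensions, then aggregate.
revenue_by_year_and_category = conn.execute("""
    SELECT d.year, p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
""").fetchall()
print(revenue_by_year_and_category)
```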
23. Explain the CAP theorem (Consistency, Availability, Partition
Tolerance) and how it relates to distributed databases. Discuss
trade-offs and examples of databases that prioritize different
aspects of the CAP theorem.

The CAP theorem states that a distributed database can
provide at most two of three properties: Consistency,
Availability, and Partition Tolerance. Because network
partitions cannot be ruled out in practice, the real
trade-off is between consistency and availability while
a partition lasts. For example, databases like Cassandra
prioritize availability and partition tolerance, while
databases like HBase, and MongoDB in its default
configuration, prioritize consistency and partition
tolerance.
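
The trade-off can be illustrated with a toy model, which is not how any real database is implemented. In this sketch, three replicas are split by a partition and a client can reach only one of them; the write function and its "CP" and "AP" modes are hypothetical names chosen for the example.

```python
def write(values, reachable, new_value, mode):
    """Try to write new_value to the replicas a client can reach.

    values: list holding one stored value per replica.
    reachable: indexes of the replicas on the client's side of the partition.
    mode: "CP" refuses the write without a majority; "AP" always accepts it.
    """
    has_majority = len(reachable) > len(values) // 2
    if mode == "CP" and not has_majority:
        return False  # stay consistent by giving up availability
    for i in reachable:
        values[i] = new_value
    return True  # stay available; replicas may now disagree


# A partition leaves the client able to reach only replica 2 of 3.
cp_values = [1, 1, 1]
cp_accepted = write(cp_values, reachable=[2], new_value=2, mode="CP")

ap_values = [1, 1, 1]
ap_accepted = write(ap_values, reachable=[2], new_value=2, mode="AP")

print(cp_accepted, cp_values)
print(ap_accepted, ap_values)
```

The CP-style system rejects the write and all replicas still agree, while the AP-style system accepts it and the replicas diverge until the partition heals and they are reconciled.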
24. Describe SQL data masking and data obfuscation
techniques for protecting sensitive data in non-production
environments.

Data masking and obfuscation techniques involve
modifying or substituting sensitive data to protect privacy
in non-production environments. Methods include
tokenization, encryption, and anonymization. These
techniques help prevent data leaks and security
breaches.
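
A few of these techniques can be sketched in plain Python before data is copied to a test environment. Everything here is an assumption for illustration: the sample row, the function names, and the demo key are hypothetical. The keyed hash is a simple stand-in for tokenization, which in practice usually relies on a secured lookup vault so that tokens can be mapped back to the original values.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-only-key"  # hypothetical; keep real keys in a secrets manager


def pseudonymize(value: str) -> str:
    """Replace a value with a deterministic, keyed, non-reversible token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def mask_card(number: str) -> str:
    """Show only the last four digits of a card number."""
    digits = [c for c in number if c.isdigit()]
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


row = {"name": "Asha Rao", "email": "asha@example.com", "card": "4111 1111 1111 1111"}
masked_row = {
    "name": pseudonymize(row["name"]),
    "email": mask_email(row["email"]),
    "card": mask_card(row["card"]),
}
print(masked_row)
```

Because the keyed hash is deterministic, the same input always yields the same token, which keeps joins between masked tables working.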
25. Explain the SQL concept of data lineage and data
provenance. How are they important for data governance and
compliance?

Data lineage refers to the tracking of data as it moves
through various systems and processes, documenting its
origins and transformations. Data provenance is the
history of data, including its source, ownership, and
changes. Both are crucial for data governance, auditing,
and regulatory compliance, allowing organizations to
trace and control the flow of data.
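
One simple way to record lineage, offered here as a sketch and not as a standard, is a table with one row per data movement, which a recursive query can then walk back to the original source. The lineage table, dataset names, and transformations below are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lineage (target TEXT, source TEXT, transformation TEXT);
INSERT INTO lineage VALUES
    ('staging.orders', 'crm.orders_export', 'load CSV'),
    ('warehouse.fact_sales', 'staging.orders', 'deduplicate and convert currency'),
    ('reports.monthly_revenue', 'warehouse.fact_sales', 'aggregate by month');
""")

# Walk the lineage graph upstream from a report to its original source.
upstream = conn.execute("""
    WITH RECURSIVE trace(dataset, depth) AS (
        SELECT source, 1 FROM lineage WHERE target = ?
        UNION ALL
        SELECT l.source, t.depth + 1
        FROM lineage AS l
        JOIN trace AS t ON l.target = t.dataset
    )
    SELECT dataset FROM trace ORDER BY depth
""", ("reports.monthly_revenue",)).fetchall()

upstream_names = [name for (name,) in upstream]
print(upstream_names)
```

An auditor can use such a trace to show which source systems feed a given report and which transformations were applied along the way.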
ABOUT THE AUTHOR
ANANT RAJPUT
Anant Rajput is an accomplished SQL
programmer and a renowned academician in
the field of computer science. With a passion
for teaching and an extensive background in
database management, he has made
significant contributions to the world of SQL.

Anant's academic journey has been nothing short of remarkable. He
achieved an outstanding feat by qualifying for the Joint Entrance
Examination (JEE), showcasing his exceptional aptitude for science
and mathematics. His dedication to excellence only continued to
shine as he secured an impressive All India Rank 32 in the Graduate
Aptitude Test in Engineering (GATE) exam in 2023.

Moreover, Anant's intellectual prowess was further affirmed by his
achievement of qualifying the National Eligibility Test (NET) with the
Junior Research Fellowship (JRF) twice, demonstrating his passion
for research and academia.

However, Anant's love for teaching and knowledge-sharing goes
beyond exams and rankings. His most cherished hobby is to write
academic books, where he distills complex technical concepts into
clear, accessible language for both students and professionals. His
commitment to educational excellence and the SQL programming
community is evident in his dedication to authoring insightful and
practical resources.

Anant's unique combination of academic achievements and a
passion for teaching makes him a valuable resource for both novice
and experienced SQL programmers. He continues to inspire and
guide countless individuals in their journey to mastering SQL
programming.
