SQL Server Documentation

This document provides an overview of SQL Server, including: 1. It defines what an RDBMS is and explains that SQL Server is a popular RDBMS. 2. It describes how to connect to SQL Server using SQL Server Management Studio and the basics of creating, altering, and dropping databases in SQL Server. 3. It provides an overview of the different data types supported in SQL Server such as integer, decimal, money/currency, date/time, character, and binary data types.

Uploaded by

sudha

MSSQL SERVER

Table of Contents

What is RDBMS
What is SQL?
Connecting to SQL Server using SSMS
Creating, altering and dropping a database
DataTypes in SQL Server
Sub Languages in SQL Server
Constraints in Sql Server
Select Statement
Joins in SQL Server
Stored Procedure in Sql Server
Functions in Sql Server
Indexes in Sql Server
Views in Sql Server
Triggers in Sql Server
Temporary tables in SQL Server
Derived Tables, Table variables in Sql Server
Common Table Expressions in Sql Server
Database Normalization
Pivot operator in sql server
Error handling in sql server
Transactions in SQL Server
Subqueries in Sql Server
Cursors in Sql Server
Merge in SQL Server
Union, Intersect and Except in Sql Server
Cross Apply and Outer Apply in Sql Server
RANK, DENSE_RANK and ROW_NUMBER functions in SQL Server

What is RDBMS
A Relational Database Management System (RDBMS) is a database management
system that is based on the relational model. It has the following major components: Table,
Record/Tuple/Row, Field, and Column/Attribute. Examples of the most popular RDBMSs are
MySQL, Oracle, IBM DB2, and Microsoft SQL Server.

What is SQL?
1. It is a non-procedural (declarative) language that is used to communicate with
relational databases such as Oracle, SQL Server, etc.
2. SQL grew out of the relational model published in 1970 by E. F. Codd, a British-born
computer scientist at IBM. The language itself was designed at IBM in the early 1970s by
Donald Chamberlin and Raymond Boyce.
3. ANSI (American National Standards Institute) adopted SQL as a standard in 1986.
4. SQL is also pronounced "sequel", after its original name SEQUEL, which stands for
Structured English Query Language.
5. SQL provides a common language interface: the same core language can communicate
with many database products such as SQL Server, Oracle, MySQL, Sybase, DB2, etc.
6. SQL keywords are not case-sensitive. Whether comparisons on the stored data are
case-sensitive depends on the collation in use.
7. Every SQL statement should end with a semicolon (;). In SQL Server it is optional for
most statements.
8. SQL can be called an NLI (Natural Language Interface), meaning that SQL commands
read much like ordinary English.

Connecting to SQL Server using SSMS


SQL Server Management Studio (SSMS) is the client tool that can be used to write and
execute SQL queries. To open SQL Server Management Studio:
1. Click Start
2. Select All Programs
3. Select Microsoft SQL Server 2005, 2008, or 2008 R2 (depending on the version installed)
4. Select SQL Server Management Studio
You will now see the Connect to Server window.
1. Select Database Engine as the Server Type. The other options that you will see here are
Analysis Services (SSAS), Reporting Services (SSRS) and Integration Services (SSIS).
Server type = Database Engine

2. Next, specify the Server Name. Here we can specify the name of the server or its
IP address. Server name = (local)

3. Now select Authentication. The options available here depend on how you installed
SQL Server. If you chose mixed mode authentication during installation, you will have
both Windows Authentication and SQL Server Authentication. Otherwise, you will only
be able to connect using Windows Authentication.

4. If you have chosen Windows Authentication, you don't have to enter a user name and
password; otherwise, enter the user name and password. Then click Connect.
You should now be connected to SQL Server. Now, click on New Query in the top left-hand
corner of SSMS. This opens a new query editor window, where we can type SQL
queries and execute them.

SSMS is a client tool and not the server itself. Usually the database server (SQL Server)
runs on a dedicated machine, and developers connect to it using SSMS from their
respective local (development) computers. For example, developer machines 1, 2, 3 and 4
all connect to the same database server using SSMS.

Creating, altering and dropping a database


A SQL Server database can be created, altered and dropped
1. Graphically using SQL Server Management Studio (SSMS) or
2. Using a Query
To create the database graphically
1. Right Click on Databases folder in the Object explorer
2. Select New Database
3. In the New Database dialog box, enter the Database name and click OK.

To Create the database using a query


Create database DatabaseName

Whether you create a database graphically using the designer or using a query, the
following 2 files get generated.
.MDF file - Data File (Contains actual data)
.LDF file - Transaction Log file (Used to recover the database)

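To rename a database once it's created

Alter database DatabaseName Modify Name = NewDatabaseName
Alternatively, you can use the system stored procedure sp_renameDB (an older procedure
that Microsoft has deprecated in favor of the Alter database syntax above)
Execute sp_renameDB 'OldDatabaseName','NewDatabaseName'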

To Delete or Drop a database


Drop Database DatabaseThatYouWantToDrop

Dropping a database deletes its LDF and MDF files.

You cannot drop a database if it is currently in use. You get an error stating - Cannot drop
database "NewDatabaseName" because it is currently in use. So, if other users are
connected, you need to put the database in single user mode and then drop the database.
Alter Database DatabaseName Set SINGLE_USER With Rollback Immediate

The With Rollback Immediate option rolls back all incomplete transactions and closes the
other connections to the database.
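
The statements in this section can be put together as one short script. This is only a sketch: SalesDemo and SalesArchive are hypothetical database names chosen for illustration, and GO is the batch separator understood by SSMS and sqlcmd rather than a T-SQL statement.

```sql
-- Hypothetical names: SalesDemo and SalesArchive are placeholders.
USE master;   -- work from master so our own session is not using the target database
GO

CREATE DATABASE SalesDemo;                              -- creates the .MDF and .LDF files
GO

ALTER DATABASE SalesDemo MODIFY NAME = SalesArchive;    -- rename the database
GO

-- Disconnect other users and roll back their open transactions,
-- then drop the database (this deletes its data and log files).
ALTER DATABASE SalesArchive SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO

DROP DATABASE SalesArchive;
GO
```

Running the script from the master database matters: if the current session were itself connected to SalesArchive, the drop would fail with the "currently in use" error described above.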

DataTypes in SQL Server

SQL Server supports the following data types


Integer data types
Decimal data types
Money / currency data types
Date and Time data types
Character data types
Binary data types
Special data types
Integer Data Types in SQL Server:
Integer data types hold whole-number values only and suit columns such as EmpId,
ProductCode, and BranchCode. They are classified into 4 types based on their range
and storage size:
tinyint: 1 byte, 0 to 255
smallint: 2 bytes, -32,768 to 32,767
int: 4 bytes, -2,147,483,648 to 2,147,483,647
bigint: 8 bytes, -2^63 to 2^63 - 1
Decimal Data Types in SQL Server:
These data types store numbers with a fixed number of digits after the decimal point.
There are two of them:
Decimal (P, S)
Numeric (P, S)
Both behave the same. Here P is the precision (the total number of digits) and S is the
scale (the number of digits after the decimal point). The default is Decimal (18, 0), and
likewise Numeric (18, 0).

SQL Server Money / Currency Data Type:


These data types are used to store currency values in a table column. There are two
money data types: money (8 bytes) and smallmoney (4 bytes).

SQL Server Date and Time data types:


Date and Time data types are used to store date and time information. They suit columns
such as date of joining, date of birth, hire date, and order date. The three most commonly
used are listed below (SQL Server also provides smalldatetime, datetime2 and
datetimeoffset):
Date: This data type accepts a date only. The default string format of the date data type
is 'YYYY-MM-DD'.
Time: It accepts a time of day only. The default string format of the time data type is
'hh:mm:ss[.nnnnnnn]'.
DateTime: It accepts both date and time. The default string format of the DateTime data
type is 'YYYY-MM-DD hh:mm:ss[.mmm]'.
Character Data Types in SQL Server:
Character data types store text: letters, digits, and other symbols. They suit columns such
as employee names, student names, and product names. Character data types are
classified into two groups: Unicode data types and Non-Unicode data types.
Non Unicode data types: char (size), varchar (size/max), text
Unicode data types: nchar(size), nvarchar(size/max), ntext
Char (size):
It is a fixed-length data type (static data type).
It stores data in Non-Unicode form, which means it occupies 1 byte per character.
The size can be from 1 to 8000 bytes.
Disadvantage: memory wastage, because values shorter than the declared size are padded
with trailing spaces.
Varchar (size/max):
It is a variable-length data type (dynamic data type) and stores characters in Non-Unicode
form, which means it takes 1 byte per character.
The size can be from 1 to 8000 bytes; varchar(max) can hold up to 2GB.
Text:
The text data type is an older SQL Server data type, similar to the varchar(max)
data type.
Note: The above 3 data types are Non-Unicode, so they can only store characters from the
code page of the column's collation (for example, English and Western European
characters under a Latin collation).
What are the differences between the fixed-length data type and a variable-length data
type?
A fixed-length data type always occupies its declared size, padding shorter values with
trailing spaces. A variable-length data type occupies only as many bytes as the stored
value needs (plus a small length overhead), up to the declared maximum.

nchar(Size) data type:


It is a fixed-length data type and stores characters in Unicode form, which
means it takes 2 bytes of memory per character.
The size can be from 1 to 4000 characters (8000 bytes).
Nvarchar(size/max) data type:
It is a variable-length data type and stores characters in Unicode form, which
means it occupies 2 bytes of memory per character.
The size can be from 1 to 4000 characters (8000 bytes); nvarchar(max) can hold up to 2GB.
ntext data type:
It is an older SQL Server data type, similar to the nvarchar(max) data type.
Here 'n' stands for national.
Binary data type:
These data types are used to store binary content such as image files, audio files, and
video files in a database.
Binary data types are classified into three types:
Binary(size):
It is a fixed-length data type and stores information in binary format (0,1).
The size can be from 1 to 8000 bytes.
Varbinary(size/max):
It is a variable-length data type and stores information in binary format.
The size can be from 1 to 8000 bytes; varbinary(max) can hold up to 2GB.
Image data type:
It is an older SQL Server data type, similar to the varbinary(max) data type.
Note: Instead of the text, ntext and image data types, use varchar(max), nvarchar(max) and
varbinary(max) in the latest versions of SQL Server.
Boolean Type:
To hold the Boolean values it provides a bit data type that can take a value of 1, 0, or NULL.
Note: The string values TRUE and FALSE can be converted to bit values. TRUE is converted to
1 and FALSE is converted to 0.
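
The storage rules described above can be checked with DATALENGTH, which returns the number of bytes a value occupies. The following is a minimal sketch; the variable names are made up for illustration, and declaring and initializing a variable in one statement assumes SQL Server 2008 or later.

```sql
-- Illustrative variable names; each holds the same three-character value.
DECLARE @fixed    CHAR(10)     = 'abc';    -- fixed length, padded with spaces
DECLARE @variable VARCHAR(10)  = 'abc';    -- variable length, 1 byte per character
DECLARE @unicode  NVARCHAR(10) = N'abc';   -- variable length, 2 bytes per character

SELECT DATALENGTH(@fixed)    AS FixedBytes,     -- 10
       DATALENGTH(@variable) AS VariableBytes,  -- 3
       DATALENGTH(@unicode)  AS UnicodeBytes;   -- 6

-- The string values TRUE and FALSE convert to the bit values 1 and 0.
SELECT CAST('TRUE' AS BIT)  AS TrueAsBit,       -- 1
       CAST('FALSE' AS BIT) AS FalseAsBit;      -- 0
```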

Sub Languages in SQL Server

SQL contains the following sublanguages

1. DDL (5 commands- create, alter, sp_rename, drop, truncate)
2. DML (3 commands- Insert, Update, Delete).
3. DQL/ DRL (1 command- select).
4. TCL (3 commands- commit, rollback, savepoint; in SQL Server the savepoint command
is SAVE TRANSACTION)
5. DCL (2 commands- Grant, Revoke).
Data Definition Language (DDL):
1. Data Definition Language (DDL) is used to define database objects such as tables,
synonyms, views, procedures, functions, triggers, etc. In other words, DDL statements
create or modify the structure and schema of a database or table.
2. DDL commands work on the structure of a table, not on the data of a table.
3. This language contains five commands: CREATE, ALTER, SP_RENAME,
TRUNCATE, DROP. (SP_RENAME is a system stored procedure rather than a statement,
but it is grouped here because it changes object names.)
Create Command in SQL Server:
The CREATE command is used to create a new database object in a database, such as a table,
view, or function. In SQL Server, objects are typically created in the dbo schema, so they
appear with a name of the form "dbo.<object name>". The general syntax to create a table is:
CREATE TABLE <TABLENAME> (<COLUMNNAME> <DATA TYPE>[SIZE], ...)

Example: The following Create command creates a table called Student

CREATE TABLE student
(
studid INT,
sname VARCHAR(max),
salary DECIMAL(6, 2)
)
Alter Command in SQL Server:
This command is used to change or modify the structure of a table. In SQL Server, using the
ALTER command we can perform the following operations on an existing table.
1. Increase/decrease the width of a column.
2. Change the data type of a column.
3. Change NOT NULL to NULL or NULL to NOT NULL.
4. Add a new column to an existing table.
5. Drop an existing column.
6. Add a new constraint.
7. Drop an existing constraint on a table.
8. Disable or re-enable a check constraint of a table.
9. Renaming a column is also a structural change, but in SQL Server it is done with the
SP_RENAME stored procedure rather than with ALTER.

ALTER-ALTER COLUMN:
This command is used to change the data type of a column from an old data type to a new
one, and also to change the size of a column's data type.
Syntax: ALTER TABLE <TABLENAME> ALTER COLUMN <COLUMNNAME> <NEW DATA TYPE>[NEW SIZE]
Change the width of a column
For example, to change the width of the Name column to VARCHAR (100), use the Alter
command as shown below.
ALTER TABLE Student ALTER COLUMN Name VARCHAR(100)
Changing the data type of an existing column.
ALTER TABLE Student ALTER COLUMN Name NVARCHAR(100)
Changing the column NULL to NOT NULL.
ALTER TABLE Student ALTER COLUMN No INT NOT NULL
Changing NOT NULL to NULL.
ALTER TABLE Student ALTER COLUMN No INT NULL
Adding a new column to an existing table in SQL Server:
ALTER TABLE <TABLENAME> ADD <NEWCOLUMNNAME> <DATA TYPE>[NEW SIZE]
ALTER TABLE Student ADD Branch VARCHAR(20)
Deleting Column in SQL Server:
ALTER TABLE <TABLENAME> DROP COLUMN <COLUMNNAME>
ALTER TABLE Student DROP COLUMN Branch
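
The ALTER operations above can be tried end to end on a throwaway table. The sketch below uses a hypothetical temporary table, #StudentDemo, with hypothetical columns RollNo and Name, so it does not depend on the Student table defined earlier and leaves nothing behind.

```sql
-- #StudentDemo, RollNo and Name are hypothetical names used only for this walkthrough.
CREATE TABLE #StudentDemo
(
    RollNo INT NULL,
    Name   VARCHAR(50)
);

ALTER TABLE #StudentDemo ALTER COLUMN Name VARCHAR(100);    -- widen the column
ALTER TABLE #StudentDemo ALTER COLUMN Name NVARCHAR(100);   -- change its data type
ALTER TABLE #StudentDemo ALTER COLUMN RollNo INT NOT NULL;  -- NULL to NOT NULL
ALTER TABLE #StudentDemo ALTER COLUMN RollNo INT NULL;      -- NOT NULL back to NULL
ALTER TABLE #StudentDemo ADD Branch VARCHAR(20);            -- add a new column
ALTER TABLE #StudentDemo DROP COLUMN Branch;                -- drop it again

DROP TABLE #StudentDemo;
```

Changing a column to NOT NULL succeeds here because the table is empty; on a populated table it fails if any existing row holds NULL in that column.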
SP_RENAME Command in SQL Server:
SP_RENAME '<TABLE NAME>.<OLD COLUMN NAME>', 'NEW COLUMN NAME', 'COLUMN'
Suppose you want to rename the column Name to StudentName. Use this stored procedure
as shown below. The third argument, 'COLUMN', states that the object being renamed is a
column.
SP_RENAME 'Student.Name', 'StudentName', 'COLUMN'
This SP_RENAME stored procedure can also be used to change a table name from the old
table name to a new name. The syntax to change the table name using the SP_RENAME
stored procedure is given below.
SP_RENAME 'OLD TABLE NAME', 'NEW TABLE NAME'
For example, if you want to change the table name from Student to StudentDetails, then you
need to use the SP_RENAME stored procedure as shown below:
SP_RENAME 'Student', 'StudentDetails'
Truncate Command in SQL Server:
Whenever you want to delete all the records or rows from a table without any condition, then
you need to use the Truncate command in SQL Server. So, using this command you cannot
delete specific records from the table because the truncate command does not support the
“where” clause. The syntax to use the TRUNCATE command is given below.
TRUNCATE TABLE <TABLENAME>
Suppose, you want to delete all the records from the Student table, then you need to use
the TRUNCATE command as shown below in SQL Server.
TRUNCATE TABLE Student
Note: The truncate command will delete rows but not the structure of the table.
Drop Command in SQL Server:
If you want to delete the table from the database, then you need to use the DROP command
in SQL Server. The syntax to use the DROP command is given below.
DROP TABLE <OBJECT NAME>
Suppose, you want to delete the Student table from the database, then you need to use the
DROP command as shown below.
DROP TABLE Student
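Note: When a table is dropped, all the constraints defined on that table are dropped
with it. A parent (master) table that is referenced by a foreign key in another table cannot
be dropped until the referencing constraint or table is dropped first.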
What are the differences between Delete and Truncate Command in SQL Server?
1. Delete is a DML command. Truncate is a DDL command.
2. Delete can remove specific records from the table because it supports the WHERE
clause. Truncate does not support the WHERE clause and always removes every row.
3. Delete logs each removed row individually. Truncate deallocates the table's data
pages instead, which is usually faster on large tables.
4. In SQL Server both can be rolled back when they run inside an explicit transaction.
Once the work is committed, neither deletion can be undone without a backup.
5. Delete does not reset the identity property. Truncate resets it to the seed value.
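
A small experiment makes the identity and rollback behaviour concrete. This is a sketch only: #Demo is a hypothetical temporary table, and the multi-row VALUES syntax assumes SQL Server 2008 or later. Note that in SQL Server a TRUNCATE TABLE issued inside an explicit transaction can be rolled back.

```sql
-- #Demo is a hypothetical temporary table used only for this experiment.
CREATE TABLE #Demo (Id INT IDENTITY(1,1), Name VARCHAR(20));
INSERT INTO #Demo (Name) VALUES ('A'), ('B'), ('C');   -- Ids 1, 2, 3

DELETE FROM #Demo WHERE Name = 'B';      -- DELETE can target specific rows
DELETE FROM #Demo;                       -- remove the rest; identity is NOT reset
INSERT INTO #Demo (Name) VALUES ('D');
SELECT Id, Name FROM #Demo;              -- Id = 4

TRUNCATE TABLE #Demo;                    -- removes all rows and resets identity
INSERT INTO #Demo (Name) VALUES ('E');
SELECT Id, Name FROM #Demo;              -- Id = 1

BEGIN TRANSACTION;
    TRUNCATE TABLE #Demo;                -- table is empty inside the transaction
ROLLBACK TRANSACTION;                    -- the truncate is undone
SELECT COUNT(*) AS RowsAfterRollback FROM #Demo;   -- 1

DROP TABLE #Demo;
```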

Data Manipulation Language (DML) statements are used to add, remove, and alter data in
the database. These statements work on the data stored in tables. Some examples
of DML keywords include:
1. UPDATE: used for modifying the data in the database.
2. INSERT: used for adding or inserting new data into the database.
3. DELETE: used for deleting already existing data from the database.

Data Control Language (DCL) is used to control access to the database, securing it by
granting and revoking permissions for users and roles.
Examples: GRANT, REVOKE statements.
1. GRANT: used for giving users access permissions on database objects.
2. REVOKE: used for revoking the already assigned permissions.

Transaction Control Language (TCL) is used to manage the transactions occurring within
a database.

Examples: COMMIT, ROLLBACK, SAVE TRANSACTION statements.
1. COMMIT: used for saving the work done in a particular transaction. For
example: "Ctrl + S" in a Word file.
2. ROLLBACK: used for reverting the transaction to its original state before
commit. For example: "Ctrl + Z" in a Word file.
3. SAVE TRANSACTION: used for setting a save point within a transaction.
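
The three TCL statements can be seen working together in a small sketch. The #Account table, its values, and the save point name AfterDebit are all hypothetical, chosen only to illustrate a partial rollback.

```sql
-- #Account and AfterDebit are hypothetical names for illustration.
CREATE TABLE #Account (Id INT PRIMARY KEY, Balance MONEY);
INSERT INTO #Account (Id, Balance) VALUES (1, 500), (2, 100);

BEGIN TRANSACTION;
    UPDATE #Account SET Balance = Balance - 200 WHERE Id = 1;
    SAVE TRANSACTION AfterDebit;               -- save point after the debit

    UPDATE #Account SET Balance = Balance + 9999 WHERE Id = 2;   -- a mistake
    ROLLBACK TRANSACTION AfterDebit;           -- undo only the work after the save point

    UPDATE #Account SET Balance = Balance + 200 WHERE Id = 2;    -- the intended credit
COMMIT TRANSACTION;                            -- make the debit and credit permanent

SELECT Id, Balance FROM #Account ORDER BY Id;  -- both balances are now 300.00

DROP TABLE #Account;
```

Rolling back to a save point keeps the transaction open, so the earlier debit survives and the transaction still has to be committed (or rolled back entirely) afterwards.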

Constraints in Sql Server


Constraints can be specified when the table is created with the CREATE TABLE statement, or
after the table is created with the ALTER TABLE statement.
SQL constraints are used to specify rules for the data in a table.
Constraints are used to limit the type of data that can go into a table. This ensures the
accuracy and reliability of the data in the table. If there is any violation between the
constraint and the data action, the action is aborted.
Constraints can be column level or table level. Column level constraints apply to a column,
and table level constraints apply to the whole table.
The following constraints are commonly used in SQL:
NOT NULL - Ensures that a column cannot have a NULL value
UNIQUE - Ensures that all values in a column are different
PRIMARY KEY - A combination of a NOT NULL and UNIQUE. Uniquely identifies each row in a
table
FOREIGN KEY - Prevents actions that would destroy links between tables
CHECK - Ensures that the values in a column satisfy a specific condition
DEFAULT - Sets a default value for a column if no value is specified
SQL NOT NULL Constraint:
By default, a column can hold NULL values. The NOT NULL constraint forces a column to
NOT accept NULL values.
NOT NULL on CREATE TABLE:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255) NOT NULL,
Age int
);
SQL NOT NULL on ALTER:
ALTER TABLE Persons
ALTER COLUMN Age int NOT NULL;
SQL UNIQUE Constraint:
The UNIQUE constraint ensures that all values in a column are different.
Both the UNIQUE and PRIMARY KEY constraints provide a guarantee for uniqueness for a
column or set of columns.
A PRIMARY KEY constraint automatically has a UNIQUE constraint.
However, you can have many UNIQUE constraints per table, but only one PRIMARY KEY
constraint per table.
The following SQL creates a UNIQUE constraint on the "ID" column when the "Persons"
table is created:
CREATE TABLE Persons (
ID int NOT NULL UNIQUE,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
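To add a named UNIQUE constraint on multiple columns after the table has been created:
ALTER TABLE Persons
ADD CONSTRAINT UC_Person UNIQUE (ID,LastName);
To drop a UNIQUE constraint:
ALTER TABLE Persons
DROP CONSTRAINT UC_Person;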
SQL PRIMARY KEY Constraint
The PRIMARY KEY constraint uniquely identifies each record in a table.
Primary keys must contain UNIQUE values, and cannot contain NULL values.
A table can have only ONE primary key; and in the table, this primary key can consist of
single or multiple columns (fields).
CREATE TABLE Persons (
ID int NOT NULL PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
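To add a named PRIMARY KEY constraint on multiple columns to an existing table (the
columns must already be declared NOT NULL, and the table must not already have a
primary key):
ALTER TABLE Persons
ADD CONSTRAINT PK_Person PRIMARY KEY (ID,LastName);
To drop a PRIMARY KEY constraint:
ALTER TABLE Persons
DROP CONSTRAINT PK_Person;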

SQL FOREIGN KEY Constraint


The FOREIGN KEY constraint is used to prevent actions that would destroy links between
tables.
A FOREIGN KEY is a field (or collection of fields) in one table, that refers to the PRIMARY
KEY in another table.
The table with the foreign key is called the child table, and the table with the primary key is
called the referenced or parent table.
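In the following example, the PersonID column of the Orders table refers to the ID
primary key column of the Persons table created above.
CREATE TABLE Orders (
OrderID int NOT NULL PRIMARY KEY,
OrderNumber int NOT NULL,
PersonID int FOREIGN KEY REFERENCES Persons(ID)
);
To add a named FOREIGN KEY constraint to an existing table:
ALTER TABLE Orders
ADD CONSTRAINT FK_PersonOrder
FOREIGN KEY (PersonID) REFERENCES Persons(ID);
To drop a FOREIGN KEY constraint:
ALTER TABLE Orders
DROP CONSTRAINT FK_PersonOrder;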
SQL CHECK Constraint
The CHECK constraint is used to limit the value range that can be placed in a column.
If you define a CHECK constraint on a column it will allow only certain values for this column.
If you define a CHECK constraint on a table it can limit the values in certain columns based
on values in other columns in the row.
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
City varchar(255),
CONSTRAINT CHK_Person CHECK (Age>=18 AND City='Sandnes')
);

To add a named CHECK constraint to an existing table:
ALTER TABLE Persons
ADD CONSTRAINT CHK_PersonAge CHECK (Age>=18 AND City='Sandnes');
To drop a CHECK constraint:
ALTER TABLE Persons
DROP CONSTRAINT CHK_PersonAge;
SQL DEFAULT Constraint
The DEFAULT constraint is used to set a default value for a column.
The default value will be added to all new records, if no other value is specified.
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
City varchar(255) DEFAULT 'Sandnes'
);
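To add a named DEFAULT constraint to an existing table:
ALTER TABLE Persons
ADD CONSTRAINT df_City
DEFAULT 'Sandnes' FOR City;
To drop a DEFAULT constraint in SQL Server, drop it by name (SQL Server does not
support the ALTER COLUMN ... DROP DEFAULT form used by some other databases):
ALTER TABLE Persons
DROP CONSTRAINT df_City;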
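
To see several of these constraints act on real inserts, the following sketch combines them in one hypothetical temporary table, #Member. The table, its columns, and the sample values are assumptions made for illustration; TRY...CATCH is used only so that the failing insert reports its error instead of stopping the script.

```sql
-- #Member and its data are hypothetical; the constraints mirror the ones described above.
CREATE TABLE #Member (
    ID    INT NOT NULL PRIMARY KEY,
    Email VARCHAR(255) NOT NULL UNIQUE,
    Age   INT CHECK (Age >= 18),
    City  VARCHAR(255) DEFAULT 'Sandnes'
);

-- City is omitted, so the DEFAULT constraint supplies 'Sandnes'.
INSERT INTO #Member (ID, Email, Age) VALUES (1, 'a@example.com', 30);

-- Age 15 violates the CHECK constraint, so this insert is rejected.
BEGIN TRY
    INSERT INTO #Member (ID, Email, Age) VALUES (2, 'b@example.com', 15);
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS CheckViolation;
END CATCH;

SELECT ID, Email, Age, City FROM #Member;   -- only the first row exists

DROP TABLE #Member;
```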
Cascading referential integrity constraint
A cascading referential integrity constraint lets you define the actions Microsoft SQL Server
should take when a user attempts to delete or update a key to which existing foreign
keys point.

For example, consider two tables, tblGender and tblPerson, where rows in tblPerson
reference rows in tblGender through a foreign key. If you delete the row with ID =
1 from the tblGender table, then the row with ID = 3 in the tblPerson table, which refers to
it, becomes an orphan record. You will not be able to tell the Gender for this row. So, a
cascading referential integrity constraint can be used to define the actions Microsoft SQL
Server should take when this happens. By default, we get an error and the DELETE or
UPDATE statement is rolled back.

However, you have the following options when setting up Cascading referential integrity
constraint
1. No Action: This is the default behaviour. No Action specifies that if an attempt is made to
delete or update a row with a key referenced by foreign keys in existing rows in other
tables, an error is raised and the DELETE or UPDATE is rolled back.

2. Cascade: Specifies that if an attempt is made to delete or update a row with a key
referenced by foreign keys in existing rows in other tables, all rows containing those foreign
keys are also deleted or updated.

3. Set NULL: Specifies that if an attempt is made to delete or update a row with a key
referenced by foreign keys in existing rows in other tables, all rows containing those foreign
keys are set to NULL.

4. Set Default: Specifies that if an attempt is made to delete or update a row with a key
referenced by foreign keys in existing rows in other tables, all rows containing those foreign
keys are set to default values.
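
The options are chosen with ON DELETE and ON UPDATE clauses on the foreign key. The sketch below uses hypothetical tables tblGenderDemo and tblPersonDemo and a hypothetical GenderId column, so it does not clash with the tblGender and tblPerson tables discussed in the text. Permanent tables are used because SQL Server does not enforce foreign keys on temporary tables.

```sql
-- tblGenderDemo, tblPersonDemo and GenderId are hypothetical names for illustration.
CREATE TABLE tblGenderDemo (ID INT PRIMARY KEY, Gender NVARCHAR(20));

CREATE TABLE tblPersonDemo (
    ID       INT PRIMARY KEY,
    Name     NVARCHAR(20),
    GenderId INT NULL,
    CONSTRAINT FK_tblPersonDemo_GenderId FOREIGN KEY (GenderId)
        REFERENCES tblGenderDemo (ID)
        ON DELETE SET NULL     -- deleting a gender sets referencing rows to NULL
        ON UPDATE CASCADE      -- changing a gender ID updates referencing rows
);

INSERT INTO tblGenderDemo (ID, Gender) VALUES (1, 'Male'), (2, 'Female');
INSERT INTO tblPersonDemo (ID, Name, GenderId) VALUES (3, 'Sam', 1);

DELETE FROM tblGenderDemo WHERE ID = 1;   -- allowed because of ON DELETE SET NULL

SELECT ID, Name, GenderId FROM tblPersonDemo;   -- GenderId is now NULL for Sam

DROP TABLE tblPersonDemo;   -- drop the child table first
DROP TABLE tblGenderDemo;
```

With the default No Action in place of SET NULL, the DELETE would instead fail with a foreign key violation.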
Identity Column
If a column is marked as an identity column, then the values for this column are
automatically generated when you insert a new row into the table. The following create
table statement marks PersonId as an identity column with seed = 1 and identity increment
= 1. Seed and increment values are optional. If you don't specify the seed and increment,
both default to 1.
Create Table tblPerson
(
PersonId int Identity(1,1) Primary Key,
Name nvarchar(20)
)

In the following 2 insert statements, we only supply values for Name column and not for
PersonId column.
Insert into tblPerson values ('Sam')
Insert into tblPerson values ('Sara')

If you select all the rows from tblPerson table, you will see that, 'Sam' and 'Sara' rows have
got 1 and 2 as PersonId.

Now, if I try to execute the following query, I get an error stating - An explicit value for the
identity column in table 'tblPerson' can only be specified when a column list is used and
IDENTITY_INSERT is ON.
Insert into tblPerson values (1,'Todd')

So if you mark a column as an Identity column, you don't have to explicitly supply a value for
that column when you insert a new row. The value is automatically calculated and provided
by SQL Server. So, to insert a row into tblPerson table, just provide a value for the Name column.
Insert into tblPerson values ('Todd')

Now delete the row with PersonId = 1 and insert another row. The new row gets the next
identity value; SQL Server does not reuse 1. A record with PersonId = 1 no longer
exists, and I want to fill this gap. To do this, we must explicitly supply
the value for the identity column:
1. First turn on identity insert - SET Identity_Insert tblPerson ON
2. In the insert query specify the column list
Insert into tblPerson(PersonId, Name) values(1, 'John')

As long as Identity_Insert is turned on for a table, you need to explicitly provide the
value for that column. If you don't provide the value, you get an error - Explicit value must
be specified for identity column in table 'tblPerson' either when IDENTITY_INSERT is set to
ON or when a replication user is inserting into a NOT FOR REPLICATION identity column.

After you have filled the gaps in the identity column, turn off Identity_Insert so that
SQL Server calculates the value again.
SET Identity_Insert tblPerson OFF

If you have deleted all the rows in a table, and you want to reset the identity column value,
use DBCC CHECKIDENT command. This command will reset PersonId identity column.
DBCC CHECKIDENT(tblPerson, RESEED, 0)

Example queries for getting the last generated identity value


Select SCOPE_IDENTITY()
Select @@IDENTITY
Select IDENT_CURRENT('tblPerson')
In brief:
SCOPE_IDENTITY() - returns the last identity value that is created in the same session and in
the same scope.
@@IDENTITY - returns the last identity value that is created in the same session and across
any scope.
IDENT_CURRENT('TableName') - returns the last identity value that is created for a specific
table across any session and any scope.
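
The gap-filling steps and the identity functions can be tried together. This is a sketch under stated assumptions: #PersonDemo is a hypothetical temporary table standing in for tblPerson, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- #PersonDemo is a hypothetical stand-in for tblPerson.
CREATE TABLE #PersonDemo
(
    PersonId INT IDENTITY(1,1) PRIMARY KEY,
    Name     NVARCHAR(20)
);

INSERT INTO #PersonDemo (Name) VALUES ('Sam'), ('Sara');   -- PersonId 1 and 2

DELETE FROM #PersonDemo WHERE PersonId = 1;                -- leaves a gap at 1

-- Fill the gap by supplying the identity value explicitly.
SET IDENTITY_INSERT #PersonDemo ON;
INSERT INTO #PersonDemo (PersonId, Name) VALUES (1, 'John');
SET IDENTITY_INSERT #PersonDemo OFF;

-- Automatic generation resumes after the highest value used so far.
INSERT INTO #PersonDemo (Name) VALUES ('Todd');

SELECT SCOPE_IDENTITY() AS LastIdentityInThisScope;        -- the value generated for Todd
SELECT PersonId, Name FROM #PersonDemo ORDER BY PersonId;

DROP TABLE #PersonDemo;
```

SCOPE_IDENTITY() is generally the safest of the three functions when a table has triggers, because @@IDENTITY may return a value generated by a trigger's insert into a different table.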

Select Statement
Basic select statement syntax
SELECT Column_List FROM Table_Name

If you want to select all the columns, you can also use *. For better performance use the
column list, instead of using *.
SELECT * FROM Table_Name
To Select distinct rows use DISTINCT keyword
SELECT DISTINCT Column_List FROM Table_Name

Example: Select distinct city from tblPerson

Filtering rows with WHERE clause


SELECT Column_List FROM Table_Name WHERE Filter_Condition

Example: Select Name, Email from tblPerson where City = 'London'


Note: Text values must be enclosed in single quotes; numeric values need no quotes.
Group By
SQL Server provides many aggregate functions. Examples:
1. Count()
2. Sum()
3. avg()
4. Min()
5. Max()

Group by clause is used to group a selected set of rows into a set of summary rows by the
values of one or more columns or expressions. It is always used in conjunction with one or
more aggregate functions.

Suppose we want an SQL query that gives the total salaries paid by City, with one row per
city showing the city name and its total salary.

Query for retrieving total salaries by city:


We are applying SUM() aggregate function on Salary column, and grouping by city column.
This effectively adds, all salaries of employees with in the same city.
Select City, SUM(Salary) as TotalSalary
from tblEmployee
Group by City

Note: If you omit, the group by clause and try to execute the query, you get an error
- Column 'tblEmployee.City' is invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
Now suppose we want an SQL query that gives total salaries by City and by Gender, with one
row for each combination of city and gender.

Query for retrieving total salaries by city and by gender: It's possible to group by multiple
columns. In this query, we are grouping first by city and then by gender.
Select City, Gender, SUM(Salary) as TotalSalary
from tblEmployee
group by City, Gender

Now suppose we want an SQL query that gives total salaries and the total number of
employees by City and by Gender.

Query for retrieving total salaries and total number of employees by City, and by gender:
The only difference here is that, we are using Count() aggregate function.
Select City, Gender, SUM(Salary) as TotalSalary,
COUNT(ID) as TotalEmployees
from tblEmployee
group by City, Gender
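
The grouping by two columns described above can be tried without a SQL Server instance. The following is a minimal sketch in Python, assuming SQLite (through the standard sqlite3 module) as a stand-in for SQL Server. The sample rows are hypothetical, since the original tblEmployee data is not reproduced here. The query itself uses only GROUP BY, SUM() and COUNT(), which behave the same way in T-SQL. The ORDER BY is added only to make the result order predictable.

```python
import sqlite3

# Hypothetical sample rows (assumption: the real tblEmployee data is not shown here).
rows = [
    (1, "Tom", "Male", 4000, "London"),
    (2, "Pam", "Female", 3000, "New York"),
    (3, "John", "Male", 3500, "London"),
    (4, "Sara", "Female", 4800, "London"),
    (5, "Todd", "Male", 2800, "New York"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblEmployee "
    "(ID INTEGER PRIMARY KEY, Name TEXT, Gender TEXT, Salary INTEGER, City TEXT)"
)
conn.executemany("INSERT INTO tblEmployee VALUES (?, ?, ?, ?, ?)", rows)

# One summary row is produced per distinct (City, Gender) pair.
summary = conn.execute(
    """
    SELECT City, Gender, SUM(Salary) AS TotalSalary, COUNT(ID) AS TotalEmployees
    FROM tblEmployee
    GROUP BY City, Gender
    ORDER BY City, Gender
    """
).fetchall()

for row in summary:
    print(row)
```

With these sample rows the two London males collapse into a single summary row, which is the effect of grouping first by city and then by gender.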

Filtering Groups:
The WHERE clause filters rows before aggregation, whereas the HAVING clause filters
groups after aggregation. The following 2 queries produce the same result.

Filtering rows using the WHERE clause, before aggregations take place:


Select City, SUM(Salary) as TotalSalary
from tblEmployee
Where City = 'London'
group by City

Filtering groups using the HAVING clause, after all aggregations take place:
Select City, SUM(Salary) as TotalSalary
from tblEmployee
group by City
Having City = 'London'

From a performance standpoint, you cannot say that one method is less efficient than the
other. The SQL Server optimizer analyzes each statement and selects an efficient way of
executing it. As a best practice, use the syntax that clearly describes the desired result, and
try to eliminate rows that you don't need as early as possible.

It is also possible to combine WHERE and HAVING


Select City, SUM(Salary) as TotalSalary
from tblEmployee
Where Gender = 'Male'
group by City
Having City = 'London'

Difference between WHERE and HAVING clause:


1. The WHERE clause can be used with SELECT, UPDATE, and DELETE statements, whereas the
HAVING clause can only be used with the SELECT statement.
2. WHERE filters rows before aggregation (grouping), whereas HAVING filters groups
after the aggregations are performed.
3. Aggregate functions cannot be used in the WHERE clause, unless it is in a subquery
contained in a HAVING clause, whereas aggregate functions can be used in the HAVING clause.
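
The third difference is easiest to see with a filter on an aggregate. The sketch below is a minimal illustration in Python, assuming SQLite (standard sqlite3 module) in place of SQL Server and using hypothetical sample rows. It filters rows with WHERE, filters groups with HAVING SUM(Salary) > 5000, and then shows that the same aggregate is rejected in a WHERE clause. The exact error message differs between SQLite and SQL Server, but both engines refuse the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblEmployee "
    "(ID INTEGER PRIMARY KEY, Gender TEXT, Salary INTEGER, City TEXT)"
)
# Hypothetical sample rows, used only for illustration.
conn.executemany(
    "INSERT INTO tblEmployee VALUES (?, ?, ?, ?)",
    [
        (1, "Male", 4000, "London"),
        (2, "Female", 3000, "New York"),
        (3, "Male", 3500, "London"),
        (4, "Female", 4800, "London"),
        (5, "Male", 2800, "New York"),
    ],
)

# WHERE removes female rows first; HAVING then keeps only the groups
# whose aggregated salary exceeds 5000.
high_paying_cities = conn.execute(
    """
    SELECT City, SUM(Salary) AS TotalSalary
    FROM tblEmployee
    WHERE Gender = 'Male'
    GROUP BY City
    HAVING SUM(Salary) > 5000
    """
).fetchall()
print(high_paying_cities)

# The same aggregate condition placed in WHERE is rejected by the engine.
try:
    conn.execute("SELECT City FROM tblEmployee WHERE SUM(Salary) > 5000 GROUP BY City")
    where_aggregate_rejected = False
except sqlite3.Error as error:
    where_aggregate_rejected = True
    print("Rejected:", error)
```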

Joins in SQL Server


Joins in SQL server are used to query (retrieve) data from 2 or more related tables. In
general tables are related to each other using foreign key constraints.
In SQL server, there are different types of JOINS.
1. CROSS JOIN
2. INNER JOIN
3. OUTER JOIN

Outer Joins are again divided into 3 types


1. Left Join or Left Outer Join
2. Right Join or Right Outer Join
3. Full Join or Full Outer Join

Now let's understand all the JOIN types, with examples and the differences between
them.
Employee Table (tblEmployee)

Departments Table (tblDepartment)

SQL Script to create tblEmployee and tblDepartment tables


Create table tblDepartment
(
ID int primary key,
DepartmentName nvarchar(50),
Location nvarchar(50),
DepartmentHead nvarchar(50)
)
Go

Insert into tblDepartment values (1, 'IT', 'London', 'Rick')


Insert into tblDepartment values (2, 'Payroll', 'Delhi', 'Ron')
Insert into tblDepartment values (3, 'HR', 'New York', 'Christie')
Insert into tblDepartment values (4, 'Other Department', 'Sydney', 'Cindrella')
Go

Create table tblEmployee


(
ID int primary key,
Name nvarchar(50),
Gender nvarchar(50),
Salary int,
DepartmentId int foreign key references tblDepartment(Id)
)
Go

Insert into tblEmployee values (1, 'Tom', 'Male', 4000, 1)


Insert into tblEmployee values (2, 'Pam', 'Female', 3000, 3)
Insert into tblEmployee values (3, 'John', 'Male', 3500, 1)
Insert into tblEmployee values (4, 'Sam', 'Male', 4500, 2)
Insert into tblEmployee values (5, 'Todd', 'Male', 2800, 2)
Insert into tblEmployee values (6, 'Ben', 'Male', 7000, 1)
Insert into tblEmployee values (7, 'Sara', 'Female', 4800, 3)
Insert into tblEmployee values (8, 'Valarie', 'Female', 5500, 1)
Insert into tblEmployee values (9, 'James', 'Male', 6500, NULL)
Insert into tblEmployee values (10, 'Russell', 'Male', 8800, NULL)
Go

General Formula for Joins


SELECT ColumnList
FROM LeftTableName
JOIN_TYPE RightTableName
ON JoinCondition

CROSS JOIN
CROSS JOIN produces the Cartesian product of the 2 tables involved in the join. For
example, in the Employees table we have 10 rows and in the Departments table we have 4
rows. So, a cross join between these 2 tables produces 10 x 4 = 40 rows. A CROSS JOIN does
not take an ON clause.

CROSS JOIN Query:


SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
CROSS JOIN tblDepartment
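
The row count of a cross join can be checked quickly. This is a small sketch in Python, assuming SQLite (standard sqlite3 module) as a stand-in for SQL Server. The tables are cut down to a single ID column because only the number of rows matters here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEmployee (ID INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE tblDepartment (ID INTEGER PRIMARY KEY)")

# 10 employees and 4 departments, as in the example above.
conn.executemany("INSERT INTO tblEmployee VALUES (?)", [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO tblDepartment VALUES (?)", [(i,) for i in range(1, 5)])

# Every employee row is paired with every department row.
(row_count,) = conn.execute(
    "SELECT COUNT(*) FROM tblEmployee CROSS JOIN tblDepartment"
).fetchone()
print(row_count)
```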

JOIN or INNER JOIN


Write a query, to retrieve Name, Gender, Salary and DepartmentName from Employees and
Departments table. The output of the query should be as shown below.

SELECT Name, Gender, Salary, DepartmentName


FROM tblEmployee
INNER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

OR

SELECT Name, Gender, Salary, DepartmentName


FROM tblEmployee
JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

Note: JOIN and INNER JOIN mean the same thing. It's always better to use INNER JOIN, as this
explicitly states your intention.

So, in summary, INNER JOIN returns only the matching rows between both the tables. Non
matching rows are eliminated.

LEFT JOIN or LEFT OUTER JOIN


Now, let's say, I want all the rows from the Employees table, including JAMES and RUSSELL
records. I want the output, as shown below.

SELECT Name, Gender, Salary, DepartmentName


FROM tblEmployee
LEFT OUTER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

OR

SELECT Name, Gender, Salary, DepartmentName


FROM tblEmployee
LEFT JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

Note: You can use LEFT JOIN or LEFT OUTER JOIN. The OUTER keyword is optional.

LEFT JOIN returns all the matching rows + non matching rows from the left table. In practice,
INNER JOIN and LEFT JOIN are the most extensively used.

RIGHT JOIN or RIGHT OUTER JOIN


I want all the rows from the right table. The query output should be as shown below.
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
RIGHT OUTER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

OR

SELECT Name, Gender, Salary, DepartmentName


FROM tblEmployee
RIGHT JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

Note: You can use RIGHT JOIN or RIGHT OUTER JOIN. The OUTER keyword is optional.

RIGHT JOIN returns all the matching rows + non matching rows from the right table.

FULL JOIN or FULL OUTER JOIN


I want all the rows from both the tables involved in the join. The query output should be, as
shown below.
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee
FULL OUTER JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

OR

SELECT Name, Gender, Salary, DepartmentName


FROM tblEmployee
FULL JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.Id

Note: You can use FULL JOIN or FULL OUTER JOIN. The OUTER keyword is optional.

FULL JOIN returns all rows from both the left and right tables, including the non matching
rows.
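
To compare the join types side by side, the sketch below counts the rows each one returns for the tblEmployee and tblDepartment data from the script above (only the columns needed for the join are kept). It is written in Python and assumes SQLite (standard sqlite3 module) as a stand-in for SQL Server. Because older SQLite builds do not support RIGHT JOIN and FULL JOIN, those two are emulated here: RIGHT JOIN as a LEFT JOIN with the tables swapped, and FULL JOIN as the LEFT JOIN rows plus the departments that have no employee. In SQL Server you would simply use the RIGHT JOIN and FULL JOIN queries shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblDepartment (ID INTEGER PRIMARY KEY, DepartmentName TEXT);
    CREATE TABLE tblEmployee (ID INTEGER PRIMARY KEY, Name TEXT, DepartmentId INTEGER);
    INSERT INTO tblDepartment VALUES
        (1, 'IT'), (2, 'Payroll'), (3, 'HR'), (4, 'Other Department');
    INSERT INTO tblEmployee VALUES
        (1, 'Tom', 1), (2, 'Pam', 3), (3, 'John', 1), (4, 'Sam', 2), (5, 'Todd', 2),
        (6, 'Ben', 1), (7, 'Sara', 3), (8, 'Valarie', 1), (9, 'James', NULL),
        (10, 'Russell', NULL);
    """
)


def row_count(query):
    """Return how many rows the given SELECT statement produces."""
    return conn.execute(f"SELECT COUNT(*) FROM ({query})").fetchone()[0]


columns = "SELECT e.Name, d.DepartmentName "
inner_join = columns + "FROM tblEmployee e INNER JOIN tblDepartment d ON e.DepartmentId = d.ID"
left_join = columns + "FROM tblEmployee e LEFT JOIN tblDepartment d ON e.DepartmentId = d.ID"
# Emulated RIGHT JOIN: a LEFT JOIN with the two tables swapped.
right_join = columns + "FROM tblDepartment d LEFT JOIN tblEmployee e ON e.DepartmentId = d.ID"
# Emulated FULL JOIN: the LEFT JOIN rows plus departments without any employee.
full_join = (
    left_join
    + " UNION ALL SELECT NULL, d.DepartmentName FROM tblDepartment d"
    + " WHERE NOT EXISTS (SELECT 1 FROM tblEmployee e WHERE e.DepartmentId = d.ID)"
)

counts = {
    "inner": row_count(inner_join),
    "left": row_count(left_join),
    "right": row_count(right_join),
    "full": row_count(full_join),
}
print(counts)
```

The counts follow from the data: 8 employees have a department, James and Russell have none, and 'Other Department' has no employees.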

Joins Summary
Consider tblEmployees table shown below.

Write a query which gives the following result.

Self Join Query:


A MANAGER is also an EMPLOYEE. Both the EMPLOYEE and MANAGER rows are present in
the same table. Here we are joining tblEmployee with itself using different alias names, E for
Employee and M for Manager. We are using LEFT JOIN to also get the rows where ManagerId
is NULL. You can see in the output that TODD's record is also retrieved, but the MANAGER is
NULL. If you replace LEFT JOIN with INNER JOIN, you will not get TODD's record.
Select E.Name as Employee, M.Name as Manager
from tblEmployee E
Left Join tblEmployee M
On E.ManagerId = M.EmployeeId
In short, joining a table with itself is called a SELF JOIN.
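
The self join can be reproduced with a few rows. The following is a minimal sketch in Python, assuming SQLite (standard sqlite3 module) as a stand-in for SQL Server. The rows are hypothetical and chosen so that Todd has no manager, as in the description above. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblEmployee (EmployeeId INTEGER PRIMARY KEY, Name TEXT, ManagerId INTEGER)"
)
# Hypothetical rows: Todd is at the top of the hierarchy, so his ManagerId is NULL.
conn.executemany(
    "INSERT INTO tblEmployee VALUES (?, ?, ?)",
    [(1, "Mike", 3), (2, "Rob", 1), (3, "Todd", None), (4, "Ben", 1), (5, "Sam", 1)],
)

self_join = """
    SELECT E.Name AS Employee, M.Name AS Manager
    FROM tblEmployee E
    {join_type} JOIN tblEmployee M
    ON E.ManagerId = M.EmployeeId
    ORDER BY E.EmployeeId
"""

# LEFT JOIN keeps Todd with a NULL manager; INNER JOIN drops his row.
with_left_join = conn.execute(self_join.format(join_type="LEFT")).fetchall()
with_inner_join = conn.execute(self_join.format(join_type="INNER")).fetchall()

print(with_left_join)
print(with_inner_join)
```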

Stored Procedure in Sql Server:


A stored procedure is a group of T-SQL (Transact SQL) statements. If you find yourself
writing the same query over and over again, you can save that query as a
stored procedure and call it just by its name.
Creating a simple stored procedure without any parameters: This stored procedure
retrieves the Name and Gender of all the employees. To create a stored procedure we
use the CREATE PROCEDURE or CREATE PROC statement.

Create Procedure spGetEmployees


as
Begin
Select Name, Gender from tblEmployee
End

Note: When naming user defined stored procedures, Microsoft recommends not to
use "sp_" as a prefix. All system stored procedures, are prefixed with "sp_". This avoids any
ambiguity between user defined and system stored procedures and any conflicts, with some
future system procedure.

To execute the stored procedure, you can just type the procedure name and press F5, or
use EXEC or EXECUTE keywords followed by the procedure name as shown below.
1. spGetEmployees
2. EXEC spGetEmployees
3. Execute spGetEmployees

Note: You can also right click on the procedure name, in object explorer in SQL Server
Management Studio and select EXECUTE STORED PROCEDURE.

Creating a stored procedure with input parameters: This SP, accepts GENDER and
DEPARTMENTID parameters. Parameters and variables have an @ prefix in their name.

Create Procedure spGetEmployeesByGenderAndDepartment


@Gender nvarchar(50),
@DepartmentId int
as
Begin
Select Name, Gender from tblEmployee Where Gender = @Gender and DepartmentId =
@DepartmentId
End

To invoke this procedure, we need to pass the value for @Gender and @DepartmentId
parameters. If you don't specify the name of the parameters, you have to first pass value for
@Gender parameter and then for @DepartmentId.
EXECUTE spGetEmployeesByGenderAndDepartment 'Male', 1

On the other hand, if you change the order, you will get an error stating "Error converting
data type varchar to int." This is because, the value of "Male" is passed into @DepartmentId
parameter. Since @DepartmentId is an integer, we get the type conversion error.
spGetEmployeesByGenderAndDepartment 1, 'Male'

When you specify the names of the parameters when executing the stored procedure the
order doesn't matter.
EXECUTE spGetEmployeesByGenderAndDepartment @DepartmentId=1, @Gender = 'Male'

To view the text, of the stored procedure


1. Use system stored procedure sp_helptext 'SPName'
OR
2. Right click the SP in Object Explorer -> Script Stored Procedure as -> Create To -> New Query
Editor Window
To change the stored procedure, use ALTER PROCEDURE statement:
Alter Procedure spGetEmployeesByGenderAndDepartment
@Gender nvarchar(50),
@DepartmentId int
as
Begin
Select Name, Gender from tblEmployee Where Gender = @Gender and DepartmentId =
@DepartmentId order by Name
End
To encrypt the text of the SP, use WITH ENCRYPTION option. Once, encrypted, you cannot
view the text of the procedure, using sp_helptext system stored procedure. There are ways
to obtain the original text, which we will talk about in a later session.
Alter Procedure spGetEmployeesByGenderAndDepartment
@Gender nvarchar(50),
@DepartmentId int
WITH ENCRYPTION
as
Begin
Select Name, Gender from tblEmployee Where Gender = @Gender and DepartmentId =
@DepartmentId
End

To delete the SP, use DROP PROC SPName or DROP PROCEDURE SPName (the name is not quoted).
To create an SP with output parameter, we use the keywords OUT or OUTPUT.
@EmployeeCount is an OUTPUT parameter. Notice, it is specified with OUTPUT keyword.
Create Procedure spGetEmployeeCountByGender
@Gender nvarchar(20),
@EmployeeCount int Output
as
Begin
Select @EmployeeCount = COUNT(Id)
from tblEmployee
where Gender = @Gender
End
To execute this stored procedure with OUTPUT parameter
1. First declare a variable of the same datatype as that of the output parameter. We have
declared the @EmployeeTotal integer variable.
2. Then pass the @EmployeeTotal variable to the SP. You have to specify
the OUTPUT keyword. If you don't specify the OUTPUT keyword, the variable will be NULL.
3. Execute

Declare @EmployeeTotal int


Execute spGetEmployeeCountByGender 'Female', @EmployeeTotal output
Print @EmployeeTotal

If you don't specify the OUTPUT keyword, when executing the stored procedure, the
@EmployeeTotal variable will be NULL. Here, we have not specified OUTPUT keyword.
When you execute, you will see '@EmployeeTotal is null' printed.

Declare @EmployeeTotal int


Execute spGetEmployeeCountByGender 'Female', @EmployeeTotal
if(@EmployeeTotal is null)
Print '@EmployeeTotal is null'
else
Print '@EmployeeTotal is not null'

You can pass parameters in any order, when you use the parameter names. Here, we are
first passing the OUTPUT parameter and then the input @Gender parameter.

Declare @EmployeeTotal int


Execute spGetEmployeeCountByGender @EmployeeCount = @EmployeeTotal OUT,
@Gender = 'Male'
Print @EmployeeTotal

The following system stored procedures are extremely useful when working with stored procedures.
sp_help SP_Name : View information about the stored procedure, such as parameter
names and their datatypes. sp_help can be used with any database object, like tables, views,
SPs, triggers etc. Alternatively, you can also press ALT+F1 when the name of the object is
highlighted.
sp_helptext SP_Name : View the text of the stored procedure.
sp_depends SP_Name : View the dependencies of the stored procedure. This system SP is
very useful, especially if you want to check whether any stored procedures
reference a table that you are about to drop. sp_depends can also be used with other
database objects, such as tables.
Note: All parameter and variable names in SQL Server must start with the @ symbol.
What are stored procedure status variables?
Whenever you execute a stored procedure, it returns an integer status variable. Usually,
zero indicates success, and non-zero indicates failure. To see this yourself, execute any
stored procedure from the Object Explorer in SQL Server Management Studio.
1. Right click the procedure and select 'Execute Stored Procedure'.
2. If the procedure expects parameters, provide the values and click OK.
3. Along with the result that you expect, the stored procedure returns a Return Value = 0.

So, from this we understood that, when a stored procedure is executed, it returns an integer
status variable. With this in mind, let's understand the difference between output
parameters and RETURN values. We will use the Employees table below for this purpose.

The following procedure returns total number of employees in the Employees table, using
output parameter - @TotalCount.
Create Procedure spGetTotalCountOfEmployees1
@TotalCount int output
as
Begin
Select @TotalCount = COUNT(ID) from tblEmployee
End

Executing spGetTotalCountOfEmployees1 returns 3.


Declare @TotalEmployees int
Execute spGetTotalCountOfEmployees1 @TotalEmployees Output
Select @TotalEmployees

Re-written stored procedure using return variables


Create Procedure spGetTotalCountOfEmployees2
as
Begin
return (Select COUNT(ID) from tblEmployee)
End

Executing spGetTotalCountOfEmployees2 returns 3.


Declare @TotalEmployees int
Execute @TotalEmployees = spGetTotalCountOfEmployees2
Select @TotalEmployees

So, we are able to achieve what we want using output parameters as well as return values.
Now, let's look at an example where return status variables cannot be used, but output
parameters can.

In this SP, we are retrieving the Name of the employee, based on their Id, using the output
parameter @Name.
Create Procedure spGetNameById1
@Id int,
@Name nvarchar(20) Output
as
Begin
Select @Name = Name from tblEmployee Where Id = @Id
End

Executing spGetNameById1, prints the name of the employee


Declare @EmployeeName nvarchar(20)
Execute spGetNameById1 3, @EmployeeName out
Print 'Name of the Employee = ' + @EmployeeName

Now let's try to achieve the same thing, using return status variables.
Create Procedure spGetNameById2
@Id int
as
Begin
Return (Select Name from tblEmployee Where Id = @Id)
End

Executing spGetNameById2 returns an error stating 'Conversion failed when converting the
nvarchar value 'Sam' to data type int.'. The return status variable is an integer, and hence,
when we select Name of an employee and try to return that we get a conversion error.

Declare @EmployeeName nvarchar(20)


Execute @EmployeeName = spGetNameById2 1
Print 'Name of the Employee = ' + @EmployeeName
So, using return values, we can return only integers, and only one of them. It is not
possible to return more than one value using return values, whereas output parameters
can return any datatype, and an SP can have more than one output parameter. I always
prefer using output parameters over RETURN values.

In general, RETURN values are used to indicate the success or failure of a stored procedure,
especially when we are dealing with nested stored procedures. A return value of 0 indicates
success, and any nonzero value indicates failure.
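Difference between return values and output parameters

Return status value                      Output parameters
Only integer datatype                    Any datatype
Only one value                           More than one value
Used to convey success or failure        Used to return values such as a name or a count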

The following are the advantages of using stored procedures over ad hoc queries (inline SQL):
1. Execution plan retention and reusability - Stored procedures are compiled, and their
execution plan is cached and reused when the same SP is executed again. Although
ad hoc queries also create and reuse plans, the plan is reused only when the query text
matches exactly and the datatypes match those of the previous call. If a datatype changes, or
the query contains even an extra space, a new plan is created.

2. Reduces network traffic - You only need to send the EXECUTE SP_Name statement over the
network, instead of the entire batch of ad hoc SQL code.

3. Code reusability and better maintainability - A stored procedure can be reused by
multiple applications. If the logic has to change, there is only one place to change it. With
inline SQL used in multiple applications, we end up with multiple copies of the same SQL,
and a change in logic has to be made in every copy, which makes inline SQL harder to
maintain.

4. Better security - A database user can be granted access to an SP while being prevented from
executing direct "select" statements against a table. This fine-grained access control
helps control what data a user has access to.

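5. Helps avoid SQL injection attacks - Values passed to an SP as parameters are treated as
data, not as executable SQL. An SP that concatenates its parameters into dynamic SQL can
still be vulnerable.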
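
The injection point is easier to appreciate with a concrete case. The sketch below is a minimal illustration in Python, assuming SQLite (standard sqlite3 module) rather than SQL Server and a hypothetical three-row table. It does not use a stored procedure. It only contrasts inline SQL built by string concatenation with a query whose value is bound as a parameter, which is the same protection that stored procedure parameters give.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEmployee (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO tblEmployee VALUES (?, ?)", [(1, "Tom"), (2, "Pam"), (3, "John")]
)

# Input crafted by an attacker instead of a real employee name.
malicious_input = "x' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the OR condition is executed.
unsafe_sql = "SELECT Name FROM tblEmployee WHERE Name = '" + malicious_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a parameter and compared as a plain string.
safe_rows = conn.execute(
    "SELECT Name FROM tblEmployee WHERE Name = ?", (malicious_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every row in the table, while the parameterized query returns none, because no employee has that literal name.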

Functions in Sql Server


Functions in SQL server can be broadly divided into 2 categories
1. Built-in functions
2. User Defined functions
Built in string functions in sql server :
ASCII(Character_Expression) - Returns the ASCII code of the given character expression.
To find the ASCII code of the capital letter 'A':
Example: Select ASCII('A')
Output: 65
CHAR(Integer_Expression) - Converts an int ASCII code to a character. The
Integer_Expression should be between 0 and 255.
The following SQL prints the characters for the ASCII values from 1 through 255:
Declare @Number int
Set @Number = 1
While(@Number <= 255)
Begin
Print CHAR(@Number)
Set @Number = @Number + 1
End

Note: The while loop will become an infinite loop, if you forget to include the following line.
Set @Number = @Number + 1
Printing uppercase alphabets using CHAR() function:
Declare @Number int
Set @Number = 65
While(@Number <= 90)
Begin
Print CHAR(@Number)
Set @Number = @Number + 1
End

Printing lowercase alphabets using CHAR() function:


Declare @Number int
Set @Number = 97
While(@Number <= 122)
Begin
Print CHAR(@Number)
Set @Number = @Number + 1
End
Another way of printing lower case alphabets using CHAR() and LOWER() functions.
Declare @Number int
Set @Number = 65
While(@Number <= 90)
Begin
Print LOWER(CHAR(@Number))
Set @Number = @Number + 1
End
LTRIM(Character_Expression) - Removes blanks on the left hand side of the given character
expression.
Example: Removing the 3 white spaces on the left hand side of the '   Hello' string using
LTRIM() function.
Select LTRIM('   Hello')
Output: Hello
RTRIM(Character_Expression) - Removes blanks on the right hand side of the given
character expression.
Example: Removing the 3 white spaces on the right hand side of the 'Hello   ' string using
RTRIM() function.
Select RTRIM('Hello   ')
Output: Hello
Example: To remove white spaces on both sides of the given character expression, use
LTRIM() and RTRIM() as shown below.
Select LTRIM(RTRIM('   Hello   '))
Output: Hello

LOWER(Character_Expression) - Converts all the characters in the given
Character_Expression to lowercase letters.
Example: Select LOWER('CONVERT This String Into Lower Case')
Output: convert this string into lower case

UPPER(Character_Expression) - Converts all the characters in the given
Character_Expression to uppercase letters.
Example: Select UPPER('CONVERT This String Into upper Case')
Output: CONVERT THIS STRING INTO UPPER CASE

REVERSE('Any_String_Expression') - Reverses all the characters in the given string
expression.
Example: Select REVERSE('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
Output: ZYXWVUTSRQPONMLKJIHGFEDCBA

LEN(String_Expression) - Returns the count of total characters in the given string
expression, excluding the blanks at the end of the expression.
Example: Select LEN('SQL Functions ')
Output: 13
LEFT(Character_Expression, Integer_Expression) - Returns the specified number of
characters from the left hand side of the given character expression.
Example: Select LEFT('ABCDE', 3)
Output: ABC
RIGHT(Character_Expression, Integer_Expression) - Returns the specified number of
characters from the right hand side of the given character expression.
Example: Select RIGHT('ABCDE', 3)
Output: CDE

CHARINDEX('Expression_To_Find', 'Expression_To_Search', 'Start_Location') - Returns the
starting position of the specified expression in a character string. The Start_Location
parameter is optional.
Example: In this example, we get the starting position of the '@' character in the email string
'sara@aaa.com'.
Select CHARINDEX('@','sara@aaa.com',1)
Output: 5

SUBSTRING('Expression', 'Start', 'Length') - As the name suggests, this function returns a
substring (part of the string) from the given expression. You specify the starting location
using the 'Start' parameter and the number of characters in the substring using the 'Length'
parameter. All 3 parameters are mandatory.
Example: Display just the domain part of the given email 'John@bbb.com'.
Select SUBSTRING('John@bbb.com',6, 7)
Output: bbb.com

In the above example, we have hardcoded the starting position and the length parameters.
Instead of hardcoding them, we can retrieve them dynamically using the CHARINDEX() and
LEN() string functions as shown below.

Example:
Select SUBSTRING('John@bbb.com',(CHARINDEX('@', 'John@bbb.com') + 1),
(LEN('John@bbb.com') - CHARINDEX('@','John@bbb.com')))
Output: bbb.com

A real-world example where we can use the LEN(), CHARINDEX() and SUBSTRING() functions:
let us assume we have the table shown below.
Write a query to find out the total number of emails by domain. The result of the query should
be as shown below.

Query
Select SUBSTRING(Email, CHARINDEX('@', Email) + 1,
LEN(Email) - CHARINDEX('@', Email)) as EmailDomain,
COUNT(Email) as Total
from tblEmployee
Group By SUBSTRING(Email, CHARINDEX('@', Email) + 1,
LEN(Email) - CHARINDEX('@', Email))
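
The same grouping by email domain can be tried locally. This is a minimal sketch in Python, assuming SQLite (standard sqlite3 module) as a stand-in for SQL Server. SQLite has no CHARINDEX(), SUBSTRING() or LEN(), so its INSTR(), SUBSTR() and LENGTH() functions take their place here (note that INSTR takes the string first and the search text second). The email addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEmployee (ID INTEGER PRIMARY KEY, Email TEXT)")
# Hypothetical addresses, used only for illustration.
conn.executemany(
    "INSERT INTO tblEmployee VALUES (?, ?)",
    [
        (1, "Sara@aaa.com"),
        (2, "Tom@aaa.com"),
        (3, "John@bbb.com"),
        (4, "Pam@bbb.com"),
        (5, "Sam@ccc.com"),
    ],
)

# Start one character after '@' and take everything up to the end of the string.
domain_totals = conn.execute(
    """
    SELECT SUBSTR(Email, INSTR(Email, '@') + 1,
                  LENGTH(Email) - INSTR(Email, '@')) AS EmailDomain,
           COUNT(Email) AS Total
    FROM tblEmployee
    GROUP BY SUBSTR(Email, INSTR(Email, '@') + 1,
                    LENGTH(Email) - INSTR(Email, '@'))
    ORDER BY EmailDomain
    """
).fetchall()
print(domain_totals)
```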

REPLICATE (String_To_Be_Replicated, Number_Of_Times_To_Replicate) - Repeats the
given string the specified number of times. No separator is added between the copies.

Example: SELECT REPLICATE('Pragim', 3)
Output: PragimPragimPragim

A practical example of using REPLICATE() function: We will be using this table, for the rest of
our examples in this article.
Let's mask the email with 5 * (star) symbols. The output should be as shown below.

Query:
Select FirstName, LastName, SUBSTRING(Email, 1, 2) + REPLICATE('*',5) +
SUBSTRING(Email, CHARINDEX('@',Email), LEN(Email) - CHARINDEX('@',Email)+1) as Email
from tblEmployee

SPACE(Number_Of_Spaces) - Returns the number of spaces specified by the
Number_Of_Spaces argument.
Example: The SPACE(5) function, inserts 5 spaces between FirstName and LastName
Select FirstName + SPACE(5) + LastName as FullName
From tblEmployee

Output:
PATINDEX ('%Pattern%', Expression)
Returns the starting position of the first occurrence of a pattern in a specified expression. It
takes two arguments, the pattern to be searched and the expression. PATINDEX() is similar to
CHARINDEX(). With CHARINDEX() we cannot use wildcards, whereas PATINDEX() provides
this capability. If the specified pattern is not found, PATINDEX() returns zero.
Example:
Select Email, PATINDEX('%@aaa.com', Email) as FirstOccurence
from tblEmployee
Where PATINDEX('%@aaa.com', Email) > 0
Output:

REPLACE (String_Expression, Pattern , Replacement_Value)


Replaces all occurrences of a specified string value with another string value.
Example: All .COM strings are replaced with .NET
Select Email, REPLACE(Email, '.com', '.net') as ConvertedEmail
from tblEmployee
STUFF (Original_Expression, Start, Length, Replacement_expression)
The STUFF() function inserts Replacement_expression at the specified start position, after
removing the number of characters specified by the Length parameter.
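
To make the start and length arithmetic concrete, here is a small Python emulation of the behaviour described above. The helper name stuff is hypothetical and the sketch only covers valid, 1-based start positions. The real STUFF() function also returns NULL for out-of-range arguments, which is not modelled here.

```python
def stuff(original, start, length, replacement):
    """Emulate T-SQL STUFF() for a valid 1-based start position (illustrative only)."""
    head = original[: start - 1]           # characters before the start position
    tail = original[start - 1 + length:]   # characters after the removed span
    return head + replacement + tail


# Remove 3 characters starting at position 2, then insert five stars there.
print(stuff("sara@aaa.com", 2, 3, "*****"))
```

For the sample address, the characters 'ara' are removed and the stars are inserted in their place, so only the first letter and the domain remain visible.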
Example:
Select FirstName, LastName,Email, STUFF(Email, 2, 3, '*****') as StuffedEmail
From tblEmployee
Output:

There are several built-in DateTime functions available in SQL Server. All of the following
functions return the current system date and time of the computer on which SQL Server is
installed.

Function               Date Time Format                       Description
GETDATE()              2012-08-31 20:15:04.543                Commonly used function
CURRENT_TIMESTAMP      2012-08-31 20:15:04.543                ANSI SQL equivalent to GETDATE
SYSDATETIME()          2012-08-31 20:15:04.5380028            More fractional seconds precision
SYSDATETIMEOFFSET()    2012-08-31 20:15:04.5380028 + 01:00    More fractional seconds precision + Time zone offset
GETUTCDATE()           2012-08-31 19:15:04.543                UTC Date and Time
SYSUTCDATETIME()       2012-08-31 19:15:04.5380028            UTC Date and Time, with more fractional seconds precision

Note: UTC stands for Coordinated Universal Time, based on which, the world regulates
clocks and time. There are slight differences between GMT and UTC, but for most common
purposes, UTC is synonymous with GMT.

To understand in practice how the different date and time datatypes available in SQL Server
store data, create the sample table tblDateTime.
CREATE TABLE [tblDateTime]
(
[c_time] [time](7) NULL,
[c_date] [date] NULL,
[c_smalldatetime] [smalldatetime] NULL,
[c_datetime] [datetime] NULL,
[c_datetime2] [datetime2](7) NULL,
[c_datetimeoffset] [datetimeoffset](7) NULL
)
To Insert some sample data, execute the following query.
INSERT INTO tblDateTime VALUES (GETDATE(),GETDATE(),GETDATE(),GETDATE(),GETDATE(),GETDATE())
ISDATE() - Checks if the given value, is a valid date, time, or datetime. Returns 1 for success,
0 for failure.
Examples:
Select ISDATE('PRAGIM') -- returns 0
Select ISDATE(Getdate()) -- returns 1
Select ISDATE('2012-08-31 21:02:04.167') -- returns 1
Note: For datetime2 values, IsDate returns ZERO.
Example:
Select ISDATE('2012-09-01 11:34:21.1918447') -- returns 0.
Day() - Returns the 'Day number of the Month' of the given date
Examples:
Select DAY(GETDATE()) -- Returns the day number of the month, based on current system
datetime.
Select DAY('01/31/2012') -- Returns 31

Month() - Returns the 'Month number of the year' of the given date
Examples:
Select Month(GETDATE()) -- Returns the Month number of the year, based on the current
system date and time
Select Month('01/31/2012') -- Returns 1

Year() - Returns the 'Year number' of the given date


Examples:
Select Year(GETDATE()) -- Returns the year number, based on the current system date
Select Year('01/31/2012') -- Returns 2012

DateName(DatePart, Date) - Returns a string that represents a part of the given date. This
function takes 2 parameters. The first parameter, 'DatePart', specifies the part of the date
we want. The second parameter is the actual date from which we want that part.

Valid Datepart parameter values

Examples:
Select DATENAME(Day, '2012-09-30 12:43:46.837') -- Returns 30
Select DATENAME(WEEKDAY, '2012-09-30 12:43:46.837') -- Returns Sunday
Select DATENAME(MONTH, '2012-09-30 12:43:46.837') -- Returns September
A simple practical example using some of these DateTime functions. Consider the table
tblEmployees.

Write a query, which returns Name, DateOfBirth, Day, MonthNumber, MonthName, and
Year as shown below.

Query:
Select Name, DateOfBirth, DateName(WEEKDAY,DateOfBirth) as [Day],
Month(DateOfBirth) as MonthNumber,
DateName(MONTH, DateOfBirth) as [MonthName],
Year(DateOfBirth) as [Year]
From tblEmployees

DatePart(DatePart, Date) - Returns an integer representing the specified DatePart. This
function is similar to DateName(). DateName() returns nvarchar, whereas DatePart()
returns an integer. The valid DatePart parameter values are shown below.
Examples:
Select DATEPART(weekday, '2012-08-30 19:45:31.793') -- returns 5
Select DATENAME(weekday, '2012-08-30 19:45:31.793') -- returns Thursday

DATEADD (datepart, NumberToAdd, date) - Returns the DateTime, after adding specified
NumberToAdd, to the datepart specified of the given date.

Examples:
Select DateAdd(DAY, 20, '2012-08-30 19:45:31.793')
-- Returns 2012-09-19 19:45:31.793
Select DateAdd(DAY, -20, '2012-08-30 19:45:31.793')
-- Returns 2012-08-10 19:45:31.793

DATEDIFF(datepart, startdate, enddate) - Returns the count of the specified datepart
boundaries crossed between the specified startdate and enddate.

Examples:
Select DATEDIFF(MONTH, '11/30/2005','01/31/2006') -- returns 2
Select DATEDIFF(DAY, '11/30/2005','01/31/2006') -- returns 62
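
Because DATEDIFF() counts boundaries crossed rather than elapsed time, the results can be surprising. The sketch below emulates that counting rule in Python for the YEAR, MONTH and DAY dateparts. The helper names are hypothetical, and the functions only illustrate the rule. They are not how SQL Server implements DATEDIFF().

```python
from datetime import date


def datediff_year(start, end):
    """Number of year boundaries (1 January) crossed between start and end."""
    return end.year - start.year


def datediff_month(start, end):
    """Number of month boundaries (1st of a month) crossed between start and end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def datediff_day(start, end):
    """Number of day boundaries (midnights) crossed between start and end."""
    return (end - start).days


# The two examples above: 2 month boundaries and 62 day boundaries.
print(datediff_month(date(2005, 11, 30), date(2006, 1, 31)))
print(datediff_day(date(2005, 11, 30), date(2006, 1, 31)))

# Only one day apart, yet one year boundary is crossed.
print(datediff_year(date(2005, 12, 31), date(2006, 1, 1)))
```

This is why the age calculations in this section subtract 1 when the birthday has not yet occurred in the current year.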

Consider the employees table below.


Write a query to compute the age of a person, when the date of birth is given. The output
should be as shown below.

CREATE FUNCTION fnComputeAge(@DOB DATETIME)
RETURNS NVARCHAR(50)
AS
BEGIN
    DECLARE @tempdate DATETIME, @years INT, @months INT, @days INT
    SELECT @tempdate = @DOB

    -- Whole years, minus 1 if this year's birthday has not occurred yet
    SELECT @years = DATEDIFF(YEAR, @tempdate, GETDATE()) -
        CASE WHEN (MONTH(@DOB) > MONTH(GETDATE()))
                  OR (MONTH(@DOB) = MONTH(GETDATE()) AND DAY(@DOB) > DAY(GETDATE()))
             THEN 1 ELSE 0 END
    SELECT @tempdate = DATEADD(YEAR, @years, @tempdate)

    -- Whole months remaining after the years are removed
    SELECT @months = DATEDIFF(MONTH, @tempdate, GETDATE()) -
        CASE WHEN DAY(@DOB) > DAY(GETDATE()) THEN 1 ELSE 0 END
    SELECT @tempdate = DATEADD(MONTH, @months, @tempdate)

    -- Days remaining after the years and months are removed
    SELECT @days = DATEDIFF(DAY, @tempdate, GETDATE())

    DECLARE @Age NVARCHAR(50)
    SET @Age = CAST(@years AS NVARCHAR(4)) + ' Years ' + CAST(@months AS NVARCHAR(2)) +
        ' Months ' + CAST(@days AS NVARCHAR(2)) + ' Days Old'
    RETURN @Age
END
Using the function in a query to get the expected output along with the age of the person.
Select Id, Name, DateOfBirth, dbo.fnComputeAge(DateOfBirth) as Age from tblEmployees
CAST and CONVERT
To convert one data type to another, CAST and CONVERT functions can be used.

Syntax of CAST and CONVERT functions from MSDN:


CAST ( expression AS data_type [ ( length ) ] )
CONVERT ( data_type [ ( length ) ] , expression [ , style ] )

From the syntax, it is clear that the CONVERT() function has an optional style parameter,
whereas the CAST() function lacks this capability.

Consider the Employees Table below

The following 2 queries convert, DateOfBirth's DateTime datatype to NVARCHAR. The first
query uses the CAST() function, and the second one uses CONVERT() function. The output is
exactly the same for both the queries as shown below.
Select Id, Name, DateOfBirth, CAST(DateofBirth as nvarchar) as ConvertedDOB
from tblEmployees
Select Id, Name, DateOfBirth, Convert(nvarchar, DateOfBirth) as ConvertedDOB
from tblEmployees

Output:

Now, let's use the style parameter of the CONVERT() function, to format the Date as we
would like it. In the query below, we are using 103 as the argument for style parameter,
which formats the date as dd/mm/yyyy.
Select Id, Name, DateOfBirth, Convert(nvarchar, DateOfBirth, 103) as ConvertedDOB
from tblEmployees
Output:

The following table lists a few of the common DateTime styles:

To get just the date part, from DateTime


SELECT CONVERT(VARCHAR(10),GETDATE(),101)

The DATE datatype was introduced in SQL Server 2008, so you can also use:
SELECT CAST(GETDATE() as DATE)
SELECT CONVERT(DATE, GETDATE())

Note: To control the formatting of the Date part, DateTime has to be converted to
NVARCHAR using the styles provided. When converting to DATE data type, the CONVERT()
function will ignore the style parameter.

Now, let's write a query which produces the following output:

In this query, we are using CAST() function, to convert Id (int) to nvarchar, so it can be
appended with the NAME column. If you remove the CAST() function, you will get an error
stating - 'Conversion failed when converting the nvarchar value 'Sam - ' to data type int.'
Select Id, Name, Name + ' - ' + CAST(Id AS NVARCHAR) AS [Name-Id]
FROM tblEmployees
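
One detail worth keeping in mind: the conversions so far write NVARCHAR without a length. In CAST and CONVERT, an omitted length defaults to 30 characters, and longer strings are cut off without an error. The following is a minimal sketch of that behaviour; the variable @s and its 40-character value are invented for the demonstration.

```sql
-- @s is a made-up 40-character string used only to make the truncation visible.
DECLARE @s NVARCHAR(40) = N'0123456789012345678901234567890123456789'

SELECT LEN(CAST(@s AS NVARCHAR))     AS CastDefault,    -- 30: length defaults to 30
       LEN(CONVERT(NVARCHAR, @s))    AS ConvertDefault, -- 30: same default in CONVERT
       LEN(CAST(@s AS NVARCHAR(40))) AS ExplicitLength  -- 40: nothing is cut off
```

For an int or a formatted date, 30 characters is plenty, but for longer values it is safer to state the length explicitly.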

Now let's look at a practical example of using CAST function. Consider the registrations
table below.

Write a query which returns the total number of registrations by day

Query:
Select CAST(RegisteredDate as DATE) as RegistrationDate,
COUNT(Id) as TotalRegistrations
From tblRegistrations
Group By CAST(RegisteredDate as DATE)
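
The registrations table itself is not reproduced in this text, so the script below is only a sketch for trying the query out. The Name column and the sample rows are assumptions; only Id and RegisteredDate are taken from the query above.

```sql
-- Hypothetical table definition; only Id and RegisteredDate appear in the original query.
CREATE TABLE tblRegistrations
(
    Id INT PRIMARY KEY,
    Name NVARCHAR(50),
    RegisteredDate DATETIME
)

-- Made-up rows: two registrations on 24 August, one on 25 August.
INSERT INTO tblRegistrations VALUES (1, 'John', '2012-08-24T11:04:30')
INSERT INTO tblRegistrations VALUES (2, 'Sam',  '2012-08-24T15:10:05')
INSERT INTO tblRegistrations VALUES (3, 'Todd', '2012-08-25T09:45:00')

-- Grouping on the raw DATETIME would produce one group per distinct timestamp.
-- Casting to DATE first puts both registrations of 24 August into the same group.
SELECT CAST(RegisteredDate AS DATE) AS RegistrationDate,
       COUNT(Id) AS TotalRegistrations
FROM tblRegistrations
GROUP BY CAST(RegisteredDate AS DATE)
```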

The following are the differences between the 2 functions.


1. Cast is based on ANSI standard and Convert is specific to SQL Server. So, if portability is a
concern and if you want to use the script with other database applications, use Cast().
2. Convert provides more flexibility than Cast. For example, it's possible to control how you
want DateTime datatypes to be converted using styles with convert function.
User Defined functions:
In SQL Server there are 3 types of User Defined functions
1. Scalar functions
2. Inline table-valued functions
3. Multistatement table-valued functions

Scalar functions may or may not have parameters, but always return a single (scalar) value.
The returned value can be of any data type, except text, ntext, image, cursor, and
timestamp.
To create a function, we use the following syntax:
CREATE FUNCTION Function_Name(@Parameter1 DataType, @Parameter2
DataType,..@Parametern Datatype)
RETURNS Return_Datatype
AS
BEGIN
Function Body
Return Return_Datatype
END

Let us now create a function which calculates and returns the age of a person. To compute
the age we need the date of birth, so let's pass the date of birth as a parameter. The Age()
function therefore accepts a date parameter and returns an integer.
CREATE FUNCTION Age(@DOB Date)
RETURNS INT
AS
BEGIN
DECLARE @Age INT
SET @Age = DATEDIFF(YEAR, @DOB, GETDATE()) - CASE WHEN (MONTH(@DOB)
> MONTH(GETDATE())) OR (MONTH(@DOB) = MONTH(GETDATE()) AND DAY(@DOB)
> DAY(GETDATE())) THEN 1 ELSE 0 END
RETURN @Age
END

When calling a scalar user-defined function, you must supply a two-part
name, OwnerName.FunctionName. dbo stands for database owner.
Select dbo.Age('10/08/1982')

You can also invoke it using the complete 3 part name,
DatabaseName.OwnerName.FunctionName.
Select SampleDB.dbo.Age('10/08/1982')

Consider the Employees table below.

Scalar user defined functions can be used in the Select clause as shown below.
Select Name, DateOfBirth, dbo.Age(DateOfBirth) as Age from tblEmployees
Scalar user defined functions can be used in the Where clause, as shown below.
Select Name, DateOfBirth, dbo.Age(DateOfBirth) as Age
from tblEmployees
Where dbo.Age(DateOfBirth) > 30

A stored procedure also can accept DateOfBirth and return Age, but you cannot use stored
procedures in a select or where clause. This is just one difference between a function and a
stored procedure. There are several other differences, which we will talk about in a later
session.
To alter a function we use the ALTER FUNCTION FunctionName statement, and to delete it we
use DROP FUNCTION FunctionName.
To view the text of the function use sp_helptext FunctionName
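
The three statements can be sketched as follows for the Age function created earlier. The change made by ALTER FUNCTION here (a DATETIME parameter instead of DATE) is an arbitrary edit chosen only for illustration.

```sql
-- ALTER FUNCTION repeats the complete definition with the desired changes.
ALTER FUNCTION dbo.Age(@DOB DATETIME)
RETURNS INT
AS
BEGIN
    RETURN DATEDIFF(YEAR, @DOB, GETDATE())
         - CASE WHEN MONTH(@DOB) > MONTH(GETDATE())
                  OR (MONTH(@DOB) = MONTH(GETDATE()) AND DAY(@DOB) > DAY(GETDATE()))
                THEN 1 ELSE 0 END
END
GO

-- Show the stored definition of the function.
EXECUTE sp_helptext 'dbo.Age'
GO

-- Remove the function. Later examples in this document still call dbo.Age,
-- so run this only once the function is no longer needed.
DROP FUNCTION dbo.Age
GO
```

GO is the batch separator used by SQL Server Management Studio; ALTER FUNCTION has to be the first statement in its batch.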
An Inline Table Valued function returns a table.

Syntax for creating an inline table valued function


CREATE FUNCTION Function_Name(@Param1 DataType, @Param2 DataType..., @ParamN
DataType)
RETURNS TABLE
AS
RETURN (Select_Statement)

Consider this Employees table shown below, which we will be using for our example.
Create a function that returns EMPLOYEES by GENDER.
CREATE FUNCTION fn_EmployeesByGender(@Gender nvarchar(10))
RETURNS TABLE
AS
RETURN (Select Id, Name, DateOfBirth, Gender, DepartmentId
from tblEmployees
where Gender = @Gender)

If you look at the way we implemented this function, it is very similar to a SCALAR function,
with the following differences
1. We specify TABLE as the return type, instead of any scalar data type
2. The function body is not enclosed in a BEGIN and END block. An inline table valued
function body cannot have a BEGIN and END block.
3. The structure of the table that gets returned is determined by the SELECT statement
within the function.

Calling the user defined function


Select * from fn_EmployeesByGender('Male')

Output:

As the inline user defined function returns a table, issue the select statement against
the function as if you were selecting data from a table.

Where can we use Inline Table Valued functions


1. Inline Table Valued functions can be used to achieve the functionality of parameterized
views. We will talk about views, in a later session.
2. The table returned by the table valued function, can also be used in joins with other
tables.

Consider the Departments Table


Joining the Employees returned by the function, with the Departments table
Select Name, Gender, DepartmentName
from fn_EmployeesByGender('Male') E
Join tblDepartment D on D.Id = E.DepartmentId

Executing the above query should produce the following output

Multi statement table valued functions are very similar to Inline Table valued functions, with
a few differences. Let's look at an example, and then note the differences.

Employees Table:

Let's write an inline and a multi-statement Table Valued function that both return the
output shown below.

Inline Table Valued function(ILTVF):


Create Function fn_ILTVF_GetEmployees()
Returns Table
as
Return (Select Id, Name, Cast(DateOfBirth as Date) as DOB
From tblEmployees)
Multi-statement Table Valued function(MSTVF):
Create Function fn_MSTVF_GetEmployees()
Returns @Table Table (Id int, Name nvarchar(20), DOB Date)
as
Begin
Insert into @Table
Select Id, Name, Cast(DateOfBirth as Date)
From tblEmployees

Return
End

Calling the Inline Table Valued Function:


Select * from fn_ILTVF_GetEmployees()

Calling the Multi-statement Table Valued Function:


Select * from fn_MSTVF_GetEmployees()

Now let's understand the differences between Inline Table Valued functions and Multi-
statement Table Valued functions
1. In an Inline Table Valued function, the RETURNS clause cannot contain the structure of
the table the function returns, whereas with the multi-statement table valued function,
we specify the structure of the table that gets returned.
2. An Inline Table Valued function cannot have a BEGIN and END block, whereas the body of a
multi-statement function is enclosed in one.
3. Inline Table valued functions generally perform better than multi-statement table
valued functions. If the given task can be achieved using an inline table valued function,
prefer it over a multi-statement table valued function.
4. It's possible to update the underlying table using an inline table valued function, but not
possible using a multi-statement table valued function.
Updating the underlying table using inline table valued function:
This query will change Sam to Sam1 in the underlying table tblEmployees. When you try to do
the same thing with the multi-statement table valued function, you will get an error
stating 'Object 'fn_MSTVF_GetEmployees' cannot be modified.'
Update fn_ILTVF_GetEmployees() set Name='Sam1' Where Id = 1
Reason for improved performance of an inline table valued function:
Internally, SQL Server treats an inline table valued function much like it would a view and
treats a multi-statement table valued function similar to how it would a stored procedure.
Indexes in Sql Server
Why indexes?
Indexes are used by queries to find data from tables quickly. Indexes are created on tables
and views. An index on a table or a view is very similar to an index that we find in a book.
If you don't have an index in a book, and I ask you to locate a specific chapter in that book,
you will have to look at every page starting from the first page of the book.
On the other hand, if you have the index, you look up the page number of the chapter in the
index, and then go directly to that page number to locate the chapter.
Obviously, the book index drastically reduces the time it takes to find
the chapter.
In a similar way, table and view indexes can help a query to find data quickly.
In fact, the existence of the right indexes can drastically improve the performance of the
query. If there is no index to help the query, then the query engine checks every row in the
table from the beginning to the end. This is called a Table Scan. On a large table, a table
scan is bad for performance.

Index Example: At the moment, the Employees table, does not have an index on SALARY
column.

Consider, the following query


Select * from tblEmployee where Salary > 5000 and Salary < 7000

To find all the employees who have a salary greater than 5000 and less than 7000, the query
engine has to check each and every row in the table, because there is no index to help the
query. The resulting table scan can adversely affect performance, especially if the table
is large.

Now let's create an index to help the query: Here, we are creating an index on the Salary
column of the tblEmployee table
CREATE Index IX_tblEmployee_Salary
ON tblEmployee (SALARY ASC)

The index stores salary of each employee, in the ascending order as shown below. The
actual index may look slightly different.

Now, when SQL Server has to execute the same query, it has an index on the Salary
column to help this query. Since the salaries are arranged in ascending order, the entries
between 5000 and 7000 sit next to each other in the index (towards the bottom in this
example). SQL Server picks up the row addresses from the index and directly fetches the
records from the table, rather than scanning each row in the table. This is called an Index Seek.
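
One way to observe the difference yourself is to compare the number of page reads before and after creating the index. The sketch below assumes the tblEmployee table and the query from this example. On a very small table the optimizer may still choose a scan, so the effect is easiest to see with a larger number of rows.

```sql
-- Report the logical reads of each statement in the Messages tab.
SET STATISTICS IO ON
GO

SELECT * FROM tblEmployee WHERE Salary > 5000 AND Salary < 7000
GO

SET STATISTICS IO OFF
GO
```

In SQL Server Management Studio, turning on "Include Actual Execution Plan" before running the query also shows whether a scan or a seek operator was used.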
To view the Indexes: In the object explorer, expand the Indexes folder of the table. Alternatively
use the sp_helpindex system stored procedure. The following command returns all the indexes
on the tblEmployee table.
Execute sp_helpindex tblEmployee
To delete or drop the index: When dropping an index, specify the table name as well
Drop Index tblEmployee.IX_tblEmployee_Salary
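
The TableName.IndexName form above is kept for backward compatibility. The current form of the statement names the index first and the table after ON. The sketch below also lists the indexes through the sys.indexes catalog view, as an alternative to sp_helpindex.

```sql
-- List the indexes defined on tblEmployee.
SELECT name, type_desc, is_unique
FROM sys.indexes
WHERE object_id = OBJECT_ID('tblEmployee')

-- Drop the index using the "index ON table" form.
DROP INDEX IX_tblEmployee_Salary ON tblEmployee
```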
The following are the different types of indexes in SQL Server
1. Clustered
2. Nonclustered
3. Unique
4. Filtered
5. XML
6. Full Text
7. Spatial
8. Columnstore
9. Index with included columns
10. Index on computed columns
Clustered Index:
A clustered index determines the physical order of data in a table. For this reason, a table
can have only one clustered index.

Create the tblEmployee table using the script below.


CREATE TABLE [tblEmployee]
(
[Id] int Primary Key,
[Name] nvarchar(50),
[Salary] int,
[Gender] nvarchar(10),
[City] nvarchar(50))
Note that the Id column is marked as the primary key. A primary key constraint creates a clustered
index automatically if no clustered index already exists on the table and a nonclustered
index is not specified when you create the PRIMARY KEY constraint.

To confirm this, execute sp_helpindex tblEmployee, which will show a unique clustered
index created on the Id column.
Now execute the following insert queries. Note that, the values for Id column are not in a
sequential order.
Insert into tblEmployee Values(3,'John',4500,'Male','New York')
Insert into tblEmployee Values(1,'Sam',2500,'Male','London')
Insert into tblEmployee Values(4,'Sara',5500,'Female','Tokyo')
Insert into tblEmployee Values(5,'Todd',3100,'Male','Toronto')
Insert into tblEmployee Values(2,'Pam',6500,'Female','Sydney')

Execute the following SELECT query


Select * from tblEmployee

In spite of inserting the rows in a random order, when we execute the select query we can
see that all the rows in the table are arranged in ascending order based on the Id column.
This is because a clustered index determines the physical order of data in a table, and we
have got a clustered index on the Id column. Keep in mind, though, that only an ORDER BY
clause guarantees the order in which a query returns its rows.

Because a clustered index dictates the physical storage order of the data
in a table, a table can contain only one clustered index. If you take the example
of the tblEmployee table, the data is already arranged by the Id column, and if we try to create
another clustered index on the Name column, the data needs to be rearranged based on
the NAME column, which will affect the ordering of rows that's already done based on the
ID column.

For this reason, SQL server doesn't allow us to create more than one clustered index per
table. The following SQL script, raises an error stating 'Cannot create more than one
clustered index on table 'tblEmployee'. Drop the existing clustered index
PK__tblEmplo__3214EC0706CD04F7 before creating another.'
Create Clustered Index IX_tblEmployee_Name
ON tblEmployee(Name)

A clustered index is analogous to a telephone directory, where the data is arranged by the
last name. We just learnt that, a table can have only one clustered index. However, the
index can contain multiple columns (a composite index), like the way a telephone directory
is organized by last name and first name.

Let's now create a clustered index on 2 columns. To do this we first have to drop the
existing clustered index on the Id column.
Drop index tblEmployee.PK__tblEmplo__3214EC070A9D95DB

When you execute this query, you get an error message stating 'An explicit DROP INDEX is
not allowed on index 'tblEmployee.PK__tblEmplo__3214EC070A9D95DB'. It is being used
for PRIMARY KEY constraint enforcement.' We will talk about the role of unique index in the
next session. To successfully delete the clustered index, right click on the index in the Object
explorer window and select DELETE.
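
The same result can be reached from a query window by dropping the primary key constraint, which removes the clustered index that enforces it. A minimal sketch follows. The constraint name is system generated, so the one used here is simply the name from the error message above and will differ on your server.

```sql
-- Look up the system-generated name of the primary key constraint.
SELECT name
FROM sys.key_constraints
WHERE parent_object_id = OBJECT_ID('tblEmployee')
  AND type = 'PK'

-- Drop the constraint, and with it the clustered index, using the name returned above.
ALTER TABLE tblEmployee
DROP CONSTRAINT PK__tblEmplo__3214EC070A9D95DB
```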

Now, execute the following CREATE INDEX query, to create a composite clustered Index on
the Gender and Salary columns.
Create Clustered Index IX_tblEmployee_Gender_Salary
ON tblEmployee(Gender DESC, Salary ASC)

Now, if you issue a select query against this table you should see the data physically
arranged, FIRST by Gender in descending order and then by Salary in ascending order. The
result is shown below.

Non Clustered Index:


A nonclustered index is analogous to an index in a textbook. The data is stored in one place,
the index in another place. The index will have pointers to the storage location of the data.
Since the nonclustered index is stored separately from the actual data, a table can have
more than one nonclustered index, just like how a book can have an index by chapters at
the beginning and another index by common terms at the end.
In the index itself, the data is stored in ascending or descending order of the index key,
which doesn't in any way influence the storage of data in the table. The following SQL
creates a nonclustered index on the NAME column of the tblEmployee table:
Create NonClustered Index IX_tblEmployee_Name
ON tblEmployee(Name)

Difference between Clustered and NonClustered Index:


1. A table can have only one clustered index, whereas it can have more than one nonclustered
index.
2. A clustered index is usually faster than a nonclustered index, because the nonclustered
index has to refer back to the table if a selected column is not present in the index.
3. A clustered index determines the storage order of rows in the table, and hence doesn't
require additional disk space, whereas a nonclustered index is stored separately from
the table, so additional storage space is required.
Unique index is used to enforce uniqueness of key values in the index. Let's understand this
with an example.
Create the Employee table using the script below
CREATE TABLE [tblEmployee]
(
[Id] int Primary Key,
[FirstName] nvarchar(50),
[LastName] nvarchar(50),
[Salary] int,
[Gender] nvarchar(10),
[City] nvarchar(50)
)

Since we have marked the Id column as the primary key for this table, a UNIQUE CLUSTERED
INDEX gets created on the Id column, with Id as the index key.

We can verify this by executing the sp_helpindex system stored procedure as shown below.
Execute sp_helpindex tblEmployee

Output:

Since we now have a UNIQUE CLUSTERED INDEX on the Id column, any attempt to
duplicate the key values will throw an error stating 'Violation of PRIMARY KEY constraint
'PK__tblEmplo__3214EC07236943A5'. Cannot insert duplicate key in object
'dbo.tblEmployee'.'

Example: The second insert query below repeats the Id value 1 and therefore fails (the first
fails as well if a row with Id 1 already exists)


Insert into tblEmployee Values(1,'Mike', 'Sandoz',4500,'Male','New York')
Insert into tblEmployee Values(1,'John', 'Menco',2500,'Male','London')

Now let's try to drop the Unique Clustered index on the Id column. This will raise an error
stating - 'An explicit DROP INDEX is not allowed on
index tblEmployee.PK__tblEmplo__3214EC07236943A5. It is being used for PRIMARY KEY
constraint enforcement.'
Drop index tblEmployee.PK__tblEmplo__3214EC07236943A5
So this error message proves that SQL Server internally uses the UNIQUE index to enforce
the uniqueness of values and the primary key.

Expand the keys folder in the object explorer window, and you can see a primary key
constraint. Now, expand the indexes folder and you should see a unique clustered index. In
the object explorer it just shows the word 'CLUSTERED'. To confirm that this is in fact a
UNIQUE index, right click and select properties. The properties window shows the UNIQUE
checkbox being selected.

SQL Server allows us to delete this UNIQUE CLUSTERED INDEX from the object explorer. So,
right click on the index, select DELETE and finally click OK. Along with the UNIQUE
index, the primary key constraint is also deleted.
Now, let's try to insert duplicate values for the ID column. The rows should be accepted,
without any primary key violation error.
Insert into tblEmployee Values(1,'Mike', 'Sandoz',4500,'Male','New York')
Insert into tblEmployee Values(1,'John', 'Menco',2500,'Male','London')
So, the UNIQUE index is used to enforce the uniqueness of values and primary key
constraint.
UNIQUENESS is a property of an Index, and both CLUSTERED and NON-CLUSTERED indexes
can be UNIQUE.
Creating a UNIQUE NON CLUSTERED index on the FirstName and LastName columns.
Create Unique NonClustered Index UIX_tblEmployee_FirstName_LastName
On tblEmployee(FirstName, LastName)

This unique nonclustered index ensures that no 2 entries in the index have the same first
and last names. A unique constraint can also be used to enforce the uniqueness of values
across one or more columns. There are no major differences between a unique constraint
and a unique index.
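
To see the composite unique index at work without depending on the current contents of tblEmployee, the sketch below repeats the idea on a throwaway temporary table. The table #Emp and its index name are made up for this illustration.

```sql
-- #Emp is a hypothetical scratch table with just the two key columns.
CREATE TABLE #Emp (FirstName NVARCHAR(50), LastName NVARCHAR(50))

CREATE UNIQUE NONCLUSTERED INDEX UIX_Emp_FirstName_LastName
ON #Emp(FirstName, LastName)

INSERT INTO #Emp VALUES ('Mike', 'Sandoz') -- accepted
INSERT INTO #Emp VALUES ('Mike', 'Menco')  -- accepted: only the first name repeats
INSERT INTO #Emp VALUES ('Mike', 'Sandoz') -- rejected: the name pair already exists

DROP TABLE #Emp
```

Uniqueness is checked on the combination of both columns, which is why the second row is accepted.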

In fact, when you add a unique constraint, a unique index gets created behind the scenes.
To prove this, let's add a unique constraint on the city column of the tblEmployee table.
ALTER TABLE tblEmployee
ADD CONSTRAINT UQ_tblEmployee_City
UNIQUE NONCLUSTERED (City)

At this point, we expect a unique constraint to be created. Refresh and Expand the
constraints folder in the object explorer window. The constraint is not present in this folder.
Now, refresh and expand the 'indexes' folder. In the indexes folder, you will see a UNIQUE
NONCLUSTERED index with name UQ_tblEmployee_City.

Also, executing EXECUTE SP_HELPCONSTRAINT tblEmployee lists the constraint as a
UNIQUE NONCLUSTERED index.
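
The sys.indexes catalog view gives another way to check this. In the sketch below, the is_unique_constraint column is 1 for an index that was created by a UNIQUE constraint and 0 for one created with CREATE INDEX.

```sql
-- One row per index on tblEmployee, with flags describing how it came to exist.
SELECT name, type_desc, is_unique, is_primary_key, is_unique_constraint
FROM sys.indexes
WHERE object_id = OBJECT_ID('tblEmployee')
```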

So creating a UNIQUE constraint actually creates a UNIQUE index. A UNIQUE index can
therefore be created explicitly, using the CREATE INDEX statement, or indirectly, using a
UNIQUE constraint. So, when should you create a unique constraint rather than a unique index?
Create a unique constraint when data integrity is the objective.
This makes the objective of the index very clear. In either case, data is validated in
the same manner, and the query optimizer does not differentiate between a unique index
created by a unique constraint and one created manually.

Note:
1. By default, a PRIMARY KEY constraint creates a unique clustered index, whereas a
UNIQUE constraint creates a unique nonclustered index. These defaults can be changed if
you wish to.
2. A UNIQUE constraint or a UNIQUE index cannot be created on an existing table, if the
table contains duplicate values in the key columns. To solve this, remove the offending key
columns from the index definition, or delete or update the duplicate values.
3. By default, duplicate values are not allowed on key columns when you have a unique
index or constraint. For example, if I try to insert 10 rows in a single statement, out of
which 5 rows contain duplicates, then all 10 rows are rejected. However, if I want only the
5 duplicate rows to be rejected and the 5 non-duplicate rows to be accepted, then I can use
the IGNORE_DUP_KEY option. An example of using the IGNORE_DUP_KEY option is shown below.
CREATE UNIQUE INDEX IX_tblEmployee_City
ON tblEmployee(City)
WITH IGNORE_DUP_KEY
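
The effect of the option can be tried out on a small scratch table. The sketch below uses a made-up temporary table #Cities and the parenthesized form of the option, WITH (IGNORE_DUP_KEY = ON), which is the current syntax for the same setting.

```sql
-- #Cities is a throwaway temporary table used only for this demonstration.
CREATE TABLE #Cities (City NVARCHAR(50))

CREATE UNIQUE INDEX IX_Cities_City
ON #Cities(City)
WITH (IGNORE_DUP_KEY = ON)

-- One statement, three rows, one repeated value.
INSERT INTO #Cities VALUES ('London'), ('Tokyo'), ('London')

-- Two rows remain (London and Tokyo). The repeated London was discarded
-- with the warning 'Duplicate key was ignored.' instead of failing the statement.
SELECT City FROM #Cities

DROP TABLE #Cities
```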
The following select query benefits from the index on the Salary column, because the
salaries are sorted in ascending order in the index. From the index, it's easy to identify the
records where salary is between 4000 and 8000, and using the row address the
corresponding records from the table can be fetched quickly.
Select * from tblEmployee where Salary > 4000 and Salary < 8000
Not only the SELECT statement, but also the following DELETE and UPDATE statements can
benefit from the index. To update or delete a row, SQL Server needs to first find that
row, and the index can help in searching and finding that specific row quickly.
Delete from tblEmployee where Salary = 2500
Update tblEmployee Set Salary = 9000 where Salary = 7500

Indexes can also help queries that ask for sorted results. Since the salaries are already
sorted, the database engine simply scans the index from the first entry to the last entry and
retrieves the rows in sorted order. This avoids sorting the rows during query execution, which
can significantly improve the processing time.
Select * from tblEmployee order by Salary

The index on the Salary column, can also help the query below, by scanning the index in
reverse order.
Select * from tblEmployee order by Salary Desc

GROUP BY queries can also benefit from indexes. To group the employees with the same
salary, the query engine can use the index on the Salary column to retrieve the already sorted
salaries. Since matching salaries are present in consecutive index entries, it is easy to count
the total number of employees at each salary quickly.
Select Salary, COUNT(Salary) as Total
from tblEmployee
Group By Salary

Disadvantages of Indexes:
Additional Disk Space: A clustered index does not require any additional storage. Every
nonclustered index requires additional space, as it is stored separately from the table. The
amount of space required will depend on the size of the table, and the number and types of
columns used in the index.
Insert, Update and Delete statements can become slow: When DML (Data Manipulation
Language) statements (INSERT, UPDATE, DELETE) modify data in a table, the data in all the
indexes also needs to be updated. Indexes can help to search and locate the rows that we
want to delete, but too many indexes to update can actually hurt the performance of data
modifications.
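
To get a feel for the space cost, the sp_spaceused system stored procedure reports how much space a table's data and its indexes occupy. A minimal sketch, assuming the tblEmployee table from this section:

```sql
-- Returns one row with the columns name, rows, reserved, data, index_size and unused.
EXECUTE sp_spaceused 'tblEmployee'
```

Running it before and after creating a nonclustered index shows how much the index adds. With only a handful of rows the difference is too small to be meaningful.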
What is a covering query?
If all the columns that you have requested in the SELECT clause of a query are present in the
index, then there is no need to look them up in the table again. The requested column data can
simply be returned from the index.
A clustered index always covers a query, since it contains all of the data in a table. A
composite index is an index on two or more columns. Both clustered and nonclustered
indexes can be composite indexes. To a certain extent, a composite index can cover a
query.
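
As a sketch of the idea, the composite index below is built on the Salary and Gender columns of the tblEmployee table used in the index examples. The index name is made up for this illustration.

```sql
-- Hypothetical composite nonclustered index on two columns.
CREATE NONCLUSTERED INDEX IX_tblEmployee_Salary_Gender
ON tblEmployee(Salary, Gender)

-- Covered: every requested column is stored in the index itself.
SELECT Salary, Gender FROM tblEmployee WHERE Salary > 4000 AND Salary < 8000

-- Not covered: City is not in the index, so the remaining columns
-- have to be fetched from the table.
SELECT * FROM tblEmployee WHERE Salary > 4000 AND Salary < 8000
```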

Views in Sql Server


What is a View?
A view is nothing more than a saved SQL query. A view can also be considered as a virtual
table.
Let's understand views with an example. We will base all our examples
on tblEmployee and tblDepartment tables.

SQL Script to create tblEmployee table:


CREATE TABLE tblEmployee
(
Id int Primary Key,
Name nvarchar(30),
Salary int,
Gender nvarchar(10),
DepartmentId int
)

SQL Script to create tblDepartment table:


CREATE TABLE tblDepartment
(
DeptId int Primary Key,
DeptName nvarchar(20)
)

Insert data into tblDepartment table


Insert into tblDepartment values (1,'IT')
Insert into tblDepartment values (2,'Payroll')
Insert into tblDepartment values (3,'HR')
Insert into tblDepartment values (4,'Admin')
Insert data into tblEmployee table
Insert into tblEmployee values (1,'John', 5000, 'Male', 3)
Insert into tblEmployee values (2,'Mike', 3400, 'Male', 2)
Insert into tblEmployee values (3,'Pam', 6000, 'Female', 1)
Insert into tblEmployee values (4,'Todd', 4800, 'Male', 4)
Insert into tblEmployee values (5,'Sara', 3200, 'Female', 1)
Insert into tblEmployee values (6,'Ben', 4800, 'Male', 3)

At this point Employees and Departments table should look like this.
Employees Table:

Departments Table:

Now, let's write a Query which returns the output as shown below:

Select Id, Name, Salary, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId

Now let's create a view, using the JOINS query, we have just written.
Create View vWEmployeesByDepartment
as
Select Id, Name, Salary, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId

To select data from the view, the SELECT statement can be used the same way we use it with a table.
SELECT * from vWEmployeesByDepartment

When this query is executed, the database engine actually retrieves the data from the
underlying base tables, tblEmployee and tblDepartment. The view itself does not store
any data by default. However, we can change this default behaviour, which we will talk
about in a later session. This is the reason a view is considered to be just a stored query or
a virtual table.

Advantages of using views:


1. Views can be used to reduce the complexity of the database schema for non-IT users. The
sample view, vWEmployeesByDepartment, hides the complexity of joins. Non-IT users find
it easier to query the view than to write complex joins.

2. Views can be used as a mechanism to implement row and column level security.
Row Level Security:
For example, I want an end user to have access only to IT Department employees. If I grant
him access to the underlying tblEmployee and tblDepartment tables, he will be able to
see the employees of every department. To achieve this, I can create a view which returns only IT
Department employees, and grant the user access to the view and not to the underlying
tables.

View that returns only IT department employees:


Create View vWITDepartment_Employees
as
Select Id, Name, Salary, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
where tblDepartment.DeptName = 'IT'

Column Level Security:


Salary is confidential information and I want to prevent access to that column. To achieve
this, we can create a view which excludes the Salary column, and then grant the end user
access to this view, rather than the base tables.

View that returns all columns except Salary column:


Create View vWEmployeesNonConfidentialData
as
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
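
Granting access to a view and not to the base tables could look like the sketch below. ReportUser is a hypothetical database user, and the sketch assumes that the views and the base tables have the same owner (dbo), so that permission on the view is enough to read through it.

```sql
-- ReportUser is a made-up user name, not something defined in this document.
-- Row level security: the user sees IT department employees only.
GRANT SELECT ON dbo.vWITDepartment_Employees TO ReportUser

-- Column level security: the user sees every column except Salary.
GRANT SELECT ON dbo.vWEmployeesNonConfidentialData TO ReportUser

-- No permission is granted on dbo.tblEmployee or dbo.tblDepartment,
-- so the user cannot query the base tables directly.
```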

3. Views can be used to present only aggregated data and hide detailed data.

View that returns summarized data, Total number of employees by Department.


Create View vWEmployeesCountByDepartment
as
Select DeptName, COUNT(Id) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
Group By DeptName

To look at view definition - sp_helptext vWName
To modify a view - ALTER VIEW statement
To Drop a view - DROP VIEW vWName
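
Applied to the vWEmployeesCountByDepartment view created above, the three commands might be used as in the following sketch. The HAVING clause is an arbitrary change, added only to have something to alter.

```sql
-- ALTER VIEW repeats the full definition with the desired change.
ALTER VIEW vWEmployeesCountByDepartment
AS
SELECT DeptName, COUNT(Id) AS TotalEmployees
FROM tblEmployee
JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.DeptId
GROUP BY DeptName
HAVING COUNT(Id) > 1
GO

-- Show the definition that is now stored for the view.
EXECUTE sp_helptext vWEmployeesCountByDepartment
GO

-- Remove the view. The base tables and their data are not affected.
DROP VIEW vWEmployeesCountByDepartment
GO
```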

Create Table tblEmployee Script:


CREATE TABLE tblEmployee
(
Id int Primary Key,
Name nvarchar(30),
Salary int,
Gender nvarchar(10),
DepartmentId int
)

Script to insert data:


Insert into tblEmployee values (1,'John', 5000, 'Male', 3)
Insert into tblEmployee values (2,'Mike', 3400, 'Male', 2)
Insert into tblEmployee values (3,'Pam', 6000, 'Female', 1)
Insert into tblEmployee values (4,'Todd', 4800, 'Male', 4)
Insert into tblEmployee values (5,'Sara', 3200, 'Female', 1)
Insert into tblEmployee values (6,'Ben', 4800, 'Male', 3)

Let's create a view which returns all the columns from the tblEmployee table, except the
Salary column.
Create view vWEmployeesDataExceptSalary
as
Select Id, Name, Gender, DepartmentId
from tblEmployee

Select data from the view: A view does not store any data. So, when this query is executed,
the database engine actually retrieves data, from the underlying tblEmployee base table.
Select * from vWEmployeesDataExceptSalary

Is it possible to insert, update and delete rows from the underlying tblEmployee table,
using the view vWEmployeesDataExceptSalary?
Yes, SQL Server views are updateable.

The following query updates the Name column from Mike to Mikey. Though we are updating
the view, SQL Server correctly updates the base table tblEmployee. To verify, execute a
SELECT statement on the tblEmployee table.
Update vWEmployeesDataExceptSalary
Set Name = 'Mikey' Where Id = 2

Along the same lines, it is also possible to insert and delete rows from the base table using
views.
Delete from vWEmployeesDataExceptSalary where Id = 2

Insert into vWEmployeesDataExceptSalary values (2, 'Mikey', 'Male', 2)

Now, let us see, what happens if our view is based on multiple base tables. For this purpose,
let's create tblDepartment table and populate with some sample data.
SQL Script to create tblDepartment table
CREATE TABLE tblDepartment
(
DeptId int Primary Key,
DeptName nvarchar(20)
)

Insert data into tblDepartment table


Insert into tblDepartment values (1,'IT')
Insert into tblDepartment values (2,'Payroll')
Insert into tblDepartment values (3,'HR')
Insert into tblDepartment values (4,'Admin')

Create a view which joins the tblEmployee and tblDepartment tables, and returns the result as
shown below.

View that joins tblEmployee and tblDepartment


Create view vwEmployeeDetailsByDepartment
as
Select Id, Name, Salary, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId

Select Data from view vwEmployeeDetailsByDepartment


Select * from vwEmployeeDetailsByDepartment

vwEmployeeDetailsByDepartment Data:

Now, let's update, John's department, from HR to IT. At the moment, there are 2 employees
(Ben, and John) in the HR department.
Update vwEmployeeDetailsByDepartment
set DeptName='IT' where Name = 'John'

Now, Select data from the view vwEmployeeDetailsByDepartment:


Notice that Ben's department is also changed to IT. To understand the reason for the incorrect
UPDATE, select data from the tblDepartment and tblEmployee base tables.

tblEmployee Table

tblDepartment

The UPDATE statement updated DeptName from HR to IT in the tblDepartment table, instead
of updating the DepartmentId column in the tblEmployee table. So, the conclusion - if a view is
based on multiple tables, and if you update the view, it may not update the underlying base
tables correctly. To correctly update a view that is based on multiple tables, INSTEAD OF
triggers are used.
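
A minimal sketch of such a trigger is shown below. It is an illustration under simplifying assumptions, not a complete solution: the trigger name is made up, only a change of DeptName is handled, and a department name that does not exist in tblDepartment is silently ignored.

```sql
-- Hypothetical trigger: translate a DeptName change made through the view
-- into a DepartmentId change on tblEmployee.
CREATE TRIGGER tr_vwEmployeeDetailsByDepartment_InsteadOfUpdate
ON vwEmployeeDetailsByDepartment
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON

    IF UPDATE(DeptName)
    BEGIN
        -- inserted holds the view rows with their new values.
        UPDATE E
        SET E.DepartmentId = D.DeptId
        FROM tblEmployee E
        JOIN inserted I ON I.Id = E.Id
        JOIN tblDepartment D ON D.DeptName = I.DeptName
    END
END
GO

-- With the trigger in place, this moves only John to the IT department
-- and leaves tblDepartment unchanged.
UPDATE vwEmployeeDetailsByDepartment
SET DeptName = 'IT' WHERE Name = 'John'
```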
What is an Indexed View or What happens when you create an Index on a view?
A standard or non-indexed view is just a stored SQL query. When we try to retrieve data
from the view, the data is actually retrieved from the underlying base tables. So, a view is
just a virtual table; it does not store any data by default.

However, when you create an index on a view, the view gets materialized. This means the
view is now capable of storing data. In SQL Server, we call them Indexed views and in
Oracle, Materialized views.
Let's now look at an example of creating an Indexed view. For the purpose of this example, we
will be using the tblProduct and tblProductSales tables.

Script to create table tblProduct


Create Table tblProduct
(
ProductId int primary key,
Name nvarchar(20),
UnitPrice int
)

Script to populate tblProduct with sample data


Insert into tblProduct Values(1, 'Books', 20)
Insert into tblProduct Values(2, 'Pens', 14)
Insert into tblProduct Values(3, 'Pencils', 11)
Insert into tblProduct Values(4, 'Clips', 10)

Script to create table tblProductSales


Create Table tblProductSales
(
ProductId int,
QuantitySold int
)

Script to populate tblProductSales with sample data


Insert into tblProductSales values(1, 10)
Insert into tblProductSales values(3, 23)
Insert into tblProductSales values(4, 21)
Insert into tblProductSales values(2, 12)
Insert into tblProductSales values(1, 13)
Insert into tblProductSales values(3, 12)
Insert into tblProductSales values(4, 13)
Insert into tblProductSales values(1, 11)
Insert into tblProductSales values(2, 12)
Insert into tblProductSales values(1, 14)

tblProduct Table
tblProductSales Table

Create a view which returns Total Sales and Total Transactions by Product. The output
should be as shown below.

Script to create view vWTotalSalesByProduct


Create view vWTotalSalesByProduct
with SchemaBinding
as
Select Name,
SUM(ISNULL((QuantitySold * UnitPrice), 0)) as TotalSales,
COUNT_BIG(*) as TotalTransactions
from dbo.tblProductSales
join dbo.tblProduct
on dbo.tblProduct.ProductId = dbo.tblProductSales.ProductId
group by Name
If you want to create an index on a view, the view must follow these rules. For the complete
list of rules, please check MSDN.
1. The view should be created with the SchemaBinding option.
2. If an aggregate function in the SELECT list references an expression, and there is a
possibility for that expression to become NULL, then a replacement value should be
specified. In this example, we are using the ISNULL() function to replace NULL values with
zero.
3. If GROUP BY is specified, the view select list must contain a COUNT_BIG(*) expression.
4. The base tables in the view should be referenced with 2-part names. In this example,
tblProduct and tblProductSales are referenced using dbo.tblProduct and
dbo.tblProductSales respectively.

Now, let's create an Index on the view:


The first index that you create on a view must be a unique clustered index. After the unique
clustered index has been created, you can create additional nonclustered indexes.
Create Unique Clustered Index UIX_vWTotalSalesByProduct_Name
on vWTotalSalesByProduct(Name)

Since we now have an index on the view, the view gets materialized. The data is stored in
the view. So when we execute Select * from vWTotalSalesByProduct, the data is returned
from the view itself, rather than being retrieved from the underlying base tables.

Indexed views can significantly improve the performance of queries that involve JOINs and
aggregations. However, the cost of maintaining an indexed view is much higher than the cost of
maintaining a table index.

Indexed views are ideal for scenarios where the underlying data is not frequently changed.
Indexed views are more often used in OLAP systems, because the data is mainly used for
reporting and analysis purposes. Indexed views may not be suitable for OLTP systems, as
the data is frequently added and changed.
Limitations of views
1. You cannot pass parameters to a view. Table Valued functions are an excellent
replacement for parameterized views.

We will use tblEmployee table for our examples. SQL Script to create tblEmployee table:
CREATE TABLE tblEmployee
(
Id int Primary Key,
Name nvarchar(30),
Salary int,
Gender nvarchar(10),
DepartmentId int
)
Insert data into tblEmployee table
Insert into tblEmployee values (1,'John', 5000, 'Male', 3)
Insert into tblEmployee values (2,'Mike', 3400, 'Male', 2)
Insert into tblEmployee values (3,'Pam', 6000, 'Female', 1)
Insert into tblEmployee values (4,'Todd', 4800, 'Male', 4)
Insert into tblEmployee values (5,'Sara', 3200, 'Female', 1)
Insert into tblEmployee values (6,'Ben', 4800, 'Male', 3)

Employee Table

-- Error : Cannot pass Parameters to Views


Create View vWEmployeeDetails
@Gender nvarchar(20)
as
Select Id, Name, Gender, DepartmentId
from tblEmployee
where Gender = @Gender

Table Valued functions can be used as a replacement for parameterized views.


Create function fnEmployeeDetails(@Gender nvarchar(20))
Returns Table
as
Return
(Select Id, Name, Gender, DepartmentId
from tblEmployee where Gender = @Gender)

Calling the function


Select * from dbo.fnEmployeeDetails('Male')

2. Rules and Defaults cannot be associated with views.

3. The ORDER BY clause is invalid in views unless TOP or FOR XML is also specified.
Create View vWEmployeeDetailsSorted
as
Select Id, Name, Gender, DepartmentId
from tblEmployee
order by Id
If you use ORDER BY, you will get an error stating - 'The ORDER BY clause is invalid in views,
inline functions, derived tables, subqueries, and common table expressions, unless TOP or
FOR XML is also specified.'
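
A common way around this limitation is to leave ORDER BY out of the view and sort when
selecting from it. The sketch below assumes the tblEmployee table above; the view name
vWEmployeeDetailsUnsorted is made up for illustration.

```sql
-- The view definition has no ORDER BY
Create View vWEmployeeDetailsUnsorted
as
Select Id, Name, Gender, DepartmentId
from tblEmployee
GO

-- Sort in the query that reads from the view
Select Id, Name, Gender, DepartmentId
from vWEmployeeDetailsUnsorted
order by Id
```

GO is a batch separator understood by SSMS and sqlcmd. It is needed here because CREATE VIEW
must be the only statement in its batch.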

4. Views cannot be based on temporary tables.

Create Table ##TestTempTable(Id int, Name nvarchar(20), Gender nvarchar(10))

Insert into ##TestTempTable values(101, 'Martin', 'Male')
Insert into ##TestTempTable values(102, 'Joe', 'Female')
Insert into ##TestTempTable values(103, 'Pam', 'Female')
Insert into ##TestTempTable values(104, 'James', 'Male')

-- Error: Cannot create a view on Temp Tables


Create View vwOnTempTable
as
Select Id, Name, Gender
from ##TestTempTable

Triggers in Sql Server


In SQL Server there are 3 types of triggers:
1. DML triggers
2. DDL triggers
3. Logon triggers
In general, a trigger is a special kind of stored procedure that automatically executes when
an event occurs in the database server.

DML stands for Data Manipulation Language. INSERT, UPDATE, and DELETE statements are
DML statements. DML triggers are fired whenever data is modified using INSERT, UPDATE,
or DELETE statements.

DML triggers can be further classified into 2 types:

1. After triggers (sometimes called FOR triggers)
2. Instead of triggers

After triggers, as the name says, fire after the triggering action. The INSERT, UPDATE, and
DELETE statements cause an after trigger to fire after the respective statements complete
execution.

On the other hand, INSTEAD OF triggers, as the name says, fire instead of the triggering action.
The INSERT, UPDATE, and DELETE statements can cause an INSTEAD OF trigger to fire
instead of the respective statement execution.
We will use tblEmployee and tblEmployeeAudit tables for our examples

SQL Script to create tblEmployee table:


CREATE TABLE tblEmployee
(
Id int Primary Key,
Name nvarchar(30),
Salary int,
Gender nvarchar(10),
DepartmentId int
)

Insert data into tblEmployee table


Insert into tblEmployee values (1,'John', 5000, 'Male', 3)
Insert into tblEmployee values (2,'Mike', 3400, 'Male', 2)
Insert into tblEmployee values (3,'Pam', 6000, 'Female', 1)

tblEmployee

SQL Script to create tblEmployeeAudit table:


CREATE TABLE tblEmployeeAudit
(
Id int identity(1,1) primary key,
AuditData nvarchar(1000)
)

Whenever a new employee is added, we want to capture the ID and the date and time the
new employee was added, in the tblEmployeeAudit table. The easiest way to achieve this is by
having an AFTER trigger for the INSERT event.

Example for AFTER TRIGGER for INSERT event on tblEmployee table:


CREATE TRIGGER tr_tblEMployee_ForInsert
ON tblEmployee
FOR INSERT
AS
BEGIN
Declare @Id int
Select @Id = Id from inserted

insert into tblEmployeeAudit
values('New employee with Id = ' + Cast(@Id as nvarchar(5)) + ' is added at '
+ cast(Getdate() as nvarchar(20)))
END

In the trigger, we are getting the Id from the inserted table. So, what is this inserted table?
The INSERTED table is a special table used by DML triggers. When you add a new row into
the tblEmployee table, a copy of the row is also made available in the inserted table, which only a
trigger can access. You cannot access this table outside the context of the trigger. The
structure of the inserted table is identical to the structure of the tblEmployee table.
Now, let's execute the following INSERT statement on tblEmployee. Immediately after
the row is inserted into tblEmployee, the trigger gets fired (executed automatically), and
a row is also inserted into tblEmployeeAudit.
Insert into tblEmployee values (7,'Tan', 2300, 'Female', 3)
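Note that the trigger above reads a single Id from the inserted table. If one INSERT statement
adds several rows, inserted contains several rows and only one of those Ids would be captured.
A set-based variant might look like the sketch below; the trigger name
tr_tblEmployee_ForInsert_SetBased is hypothetical, and it is meant as a replacement for the
trigger above, not an addition (with both in place, each insert would be audited twice).

```sql
CREATE TRIGGER tr_tblEmployee_ForInsert_SetBased
ON tblEmployee
FOR INSERT
AS
BEGIN
    SET NOCOUNT ON

    -- One audit row per inserted employee, however many rows the INSERT added
    insert into tblEmployeeAudit (AuditData)
    Select 'New employee with Id = ' + Cast(Id as nvarchar(10))
           + ' is added at ' + Cast(Getdate() as nvarchar(20))
    from inserted
END
```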
Along the same lines, let us now capture audit information when a row is deleted from the
tblEmployee table.
Example for AFTER TRIGGER for DELETE event on tblEmployee table:
CREATE TRIGGER tr_tblEMployee_ForDelete
ON tblEmployee
FOR DELETE
AS
BEGIN
Declare @Id int
Select @Id = Id from deleted

insert into tblEmployeeAudit
values('An existing employee with Id = ' + Cast(@Id as nvarchar(5)) + ' is deleted at '
+ Cast(Getdate() as nvarchar(20)))
END

The only difference here is that we are specifying the triggering event as DELETE and
retrieving the deleted row's Id from the DELETED table. The DELETED table is a special table used by
DML triggers. When you delete a row from the tblEmployee table, a copy of the deleted row is
made available in the DELETED table, which only a trigger can access. Just like the INSERTED
table, the DELETED table cannot be accessed outside the context of the trigger, and the
structure of the DELETED table is identical to the structure of the tblEmployee table.

Now let's look at INSTEAD OF triggers, specifically the INSTEAD OF INSERT trigger. We know that
AFTER triggers are fired after the triggering event (INSERT, UPDATE or DELETE), whereas INSTEAD
OF triggers are fired instead of the triggering event (INSERT, UPDATE or DELETE).
INSTEAD OF triggers are usually used to correctly update views that are based on
multiple tables.
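
The examples that follow use a view named vWEmployeeDetails, whose definition is not shown in
this section. Judging from the columns the triggers reference (Id, Name, Gender and DeptName),
it is presumably along the lines of the sketch below. Treat this as an assumption; it relies on
tblEmployee having a DepartmentId column and on a tblDepartment table with DeptId and DeptName
columns.

```sql
-- Assumed definition of the view used in the INSTEAD OF trigger examples
Create View vWEmployeeDetails
as
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
```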

Now, let's try to insert a row into the view vWEmployeeDetails by executing the following
query. At this point, an error will be raised stating 'View or function vWEmployeeDetails is
not updatable because the modification affects multiple base tables.'
Insert into vWEmployeeDetails values(7, 'Valarie', 'Female', 'IT')
So, inserting a row into a view that is based on multiple tables raises an error by default.
Now, let's understand how INSTEAD OF triggers can help us in this situation. Since we
are getting an error when we try to insert a row into the view, let's create an
INSTEAD OF INSERT trigger on the view vWEmployeeDetails.

Script to create INSTEAD OF INSERT trigger:


Create trigger tr_vWEmployeeDetails_InsteadOfInsert
on vWEmployeeDetails
Instead Of Insert
as
Begin
Declare @DeptId int

--Check if there is a valid DepartmentId
--for the given DepartmentName
Select @DeptId = DeptId
from tblDepartment
join inserted
on inserted.DeptName = tblDepartment.DeptName
--If DepartmentId is null throw an error
--and stop processing
if(@DeptId is null)
Begin
Raiserror('Invalid Department Name. Statement terminated', 16, 1)
return
End
--Finally insert into tblEmployee table
Insert into tblEmployee(Id, Name, Gender, DepartmentId)
Select Id, Name, Gender, @DeptId
from inserted
End

Now, let's execute the insert query:


Insert into vWEmployeeDetails values(7, 'Valarie', 'Female', 'IT')

The INSTEAD OF trigger correctly inserts the record into the tblEmployee table. Since we are
inserting a row, the inserted table contains the newly added row, whereas
the deleted table will be empty.
INSTEAD OF UPDATE trigger: An INSTEAD OF UPDATE trigger gets fired instead of an update
event on a table or a view. For example, let's say we have an INSTEAD OF UPDATE trigger
on a view or a table; when you try to update a row within that view or table,
instead of the UPDATE, the trigger gets fired automatically. INSTEAD OF UPDATE triggers
are of immense help in correctly updating a view that is based on multiple tables.
Script to create INSTEAD OF UPDATE trigger:
Create Trigger tr_vWEmployeeDetails_InsteadOfUpdate
on vWEmployeeDetails
instead of update
as
Begin
-- if EmployeeId is updated
if(Update(Id))
Begin
Raiserror('Id cannot be changed', 16, 1)
Return
End

-- If DeptName is updated
if(Update(DeptName))
Begin
Declare @DeptId int

Select @DeptId = DeptId
from tblDepartment
join inserted
on inserted.DeptName = tblDepartment.DeptName

if(@DeptId is NULL )
Begin
Raiserror('Invalid Department Name', 16, 1)
Return
End

Update tblEmployee set DepartmentId = @DeptId
from inserted
join tblEmployee
on tblEmployee.Id = inserted.id
End

-- If gender is updated
if(Update(Gender))
Begin
Update tblEmployee set Gender = inserted.Gender
from inserted
join tblEmployee
on tblEmployee.Id = inserted.id
End

-- If Name is updated
if(Update(Name))
Begin
Update tblEmployee set Name = inserted.Name
from inserted
join tblEmployee
on tblEmployee.Id = inserted.id
End
End

Now, let's try to update JOHN's Department to IT.


Update vWEmployeeDetails
set DeptName = 'IT'
where Id = 1

The UPDATE query works as expected. The INSTEAD OF UPDATE trigger correctly updates
JOHN's DepartmentId to 1 in the tblEmployee table.

Now, let's try to update Name, Gender and DeptName. The UPDATE query works as
expected, without raising the error 'View or function vWEmployeeDetails is not updatable
because the modification affects multiple base tables.'
Update vWEmployeeDetails
set Name = 'Johny', Gender = 'Female', DeptName = 'IT'
where Id = 1
The Update() function used in the trigger returns true even if you update a column with the
same value. For this reason, I recommend comparing values between the inserted and deleted
tables, rather than relying on the Update() function. The Update() function does not operate
on a per-row basis, but across all rows.
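As a rough illustration of comparing inserted with deleted, the sketch below is an AFTER UPDATE
trigger on tblEmployee that writes an audit row only when Name really changed. The trigger name
tr_tblEmployee_ForUpdate_NameChanged is hypothetical, the tblEmployeeAudit table from the
earlier examples is assumed to exist, and the join assumes the Id column itself is not updated.

```sql
CREATE TRIGGER tr_tblEmployee_ForUpdate_NameChanged
ON tblEmployee
FOR UPDATE
AS
BEGIN
    SET NOCOUNT ON

    -- deleted holds the old values, inserted holds the new values
    insert into tblEmployeeAudit (AuditData)
    Select 'Employee with Id = ' + Cast(i.Id as nvarchar(10))
           + ' changed Name from ' + IsNull(d.Name, 'NULL')
           + ' to ' + IsNull(i.Name, 'NULL')
    from inserted i
    join deleted d
    on d.Id = i.Id
    -- <> alone misses changes to or from NULL, so those are checked explicitly
    where i.Name <> d.Name
       or (i.Name is null and d.Name is not null)
       or (i.Name is not null and d.Name is null)
END
```

Because the comparison is made row by row, an UPDATE that sets Name to its existing value
produces no audit rows, whereas Update(Name) would still return true.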
INSTEAD OF DELETE trigger: An INSTEAD OF DELETE trigger gets fired instead of the DELETE
event on a table or a view. For example, let's say we have an INSTEAD OF DELETE trigger on
a view or a table; when you try to delete a row from that view or table, instead of
the actual DELETE event, the trigger gets fired automatically. INSTEAD OF DELETE triggers
are used to delete records from a view that is based on multiple tables.
Script to create INSTEAD OF DELETE trigger:
Create Trigger tr_vWEmployeeDetails_InsteadOfDelete
on vWEmployeeDetails
instead of delete
as
Begin
Delete tblEmployee
from tblEmployee
join deleted
on tblEmployee.Id = deleted.Id

--Subquery
--Delete from tblEmployee
--where Id in (Select Id from deleted)
End

Notice that the trigger tr_vWEmployeeDetails_InsteadOfDelete makes use of the DELETED
table. The DELETED table contains all the rows that we tried to DELETE from the view. So, we
are joining the DELETED table with tblEmployee to delete the rows. You can also use
subqueries to do the same. In most cases, JOINs are faster than subqueries. However, in
cases where you only need a subset of records from a table that you are joining with,
subqueries can be faster.

Upon executing the following DELETE statement, the row gets DELETED as expected from
tblEmployee table
Delete from vWEmployeeDetails where Id = 1

Trigger | INSERTED or DELETED?
Instead of Insert | DELETED table is always empty, and the INSERTED table contains the newly inserted data.
Instead of Delete | INSERTED table is always empty, and the DELETED table contains the rows deleted.
Instead of Update | DELETED table contains OLD data (before update), and the INSERTED table contains NEW data (updated data).

Temporary tables in SQL Server


What are Temporary tables?
Temporary tables are very similar to permanent tables. Permanent tables get created in
the database you specify, and remain in the database permanently, until you delete (drop)
them. On the other hand, temporary tables get created in TempDB and are
automatically deleted when they are no longer used.
Different Types of Temporary tables
In SQL Server, there are 2 types of Temporary tables: Local Temporary tables and Global
Temporary tables.
How to Create a Local Temporary Table:
Creating a local temporary table is very similar to creating a permanent table, except that
you prefix the table name with 1 pound (#) symbol. In the example
below, #PersonDetails is a local temporary table with Id and Name columns.
Create Table #PersonDetails(Id int, Name nvarchar(20))

Insert Data into the temporary table:


Insert into #PersonDetails Values(1, 'Mike')
Insert into #PersonDetails Values(2, 'John')
Insert into #PersonDetails Values(3, 'Todd')

Select the data from the temporary table:


Select * from #PersonDetails

How to check if the local temporary table is created


Temporary tables are created in TempDB. Query the sysobjects system table in
TempDB. The name of the table is suffixed with a lot of underscores and a random number.
For this reason, you have to use the LIKE operator in the query.
Select name from tempdb..sysobjects
where name like '#PersonDetails%'
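
sysobjects is kept mainly for backward compatibility. On SQL Server 2005 and later, the same
check can be written against the sys.tables catalog view, as in this small sketch (it assumes
the #PersonDetails table created above):

```sql
-- Same check using the catalog view instead of the sysobjects compatibility view
Select name, create_date
from tempdb.sys.tables
where name like '#PersonDetails%'
```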

You can also check the existence of temporary tables using Object Explorer. In Object
Explorer, expand the TEMPDB database folder, then expand the TEMPORARY TABLES folder,
and you should see the temporary table that we have created.
A local temporary table is available only for the connection that has created the table. If
you open another query window and select from #PersonDetails there, you get an error
stating 'Invalid object name #PersonDetails'. This proves that local temporary tables are
available only for the connection that has created them.
A local temporary table is automatically dropped when the connection that created
it is closed. If you want to explicitly drop the temporary table, you can do so using
DROP TABLE #PersonDetails
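
If a script may run when the table does not exist, a plain DROP TABLE raises an error. A
commonly used guard, sketched here for the #PersonDetails table above, is to check OBJECT_ID
first:

```sql
-- Drop the temporary table only if it exists in the current session
IF OBJECT_ID('tempdb..#PersonDetails') IS NOT NULL
    DROP TABLE #PersonDetails
```

On SQL Server 2016 and later, DROP TABLE IF EXISTS #PersonDetails does the same in one statement.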

If the temporary table is created inside a stored procedure, it gets dropped
automatically upon the completion of the stored procedure execution. The stored procedure
below creates the #PersonDetails temporary table, populates it, and then returns the
data. The temporary table is destroyed immediately after the stored procedure
completes.
Create Procedure spCreateLocalTempTable
as
Begin
Create Table #PersonDetails(Id int, Name nvarchar(20))

Insert into #PersonDetails Values(1, 'Mike')
Insert into #PersonDetails Values(2, 'John')
Insert into #PersonDetails Values(3, 'Todd')

Select * from #PersonDetails
End

It is also possible for different connections to create a local temporary table with the same
name. For example, User1 and User2 can both create a local temporary table with the
name #PersonDetails. Now, if you expand the Temporary Tables folder in the TEMPDB
database, you should see 2 tables with the name #PersonDetails and some random number at
the end of the name. SQL Server appends the random number at the end of the temp table name
to differentiate between the User1 and User2 local temp tables.

How to Create a Global Temporary Table:


To create a Global Temporary Table, prefix the name of the table with 2 pound (##)
symbols. In the example below, ##EmployeeDetails is a global temporary table, as we have
prefixed its name with ##.
Create Table ##EmployeeDetails(Id int, Name nvarchar(20))

Global temporary tables are visible to all the connections of the SQL Server, and are only
destroyed when the last connection referencing the table is closed.

Multiple users, across multiple connections, can have local temporary tables with the same
name, but a global temporary table name has to be unique. If you inspect the name of
the global temp table in Object Explorer, there will be no random numbers suffixed at
the end of the table name.

Difference Between Local and Global Temporary Tables:


1. Local temp tables are prefixed with a single pound (#) symbol, whereas global temp tables
are prefixed with 2 pound (##) symbols.

2. SQL Server appends some random numbers at the end of the local temp table name,
whereas this is not done for global temp table names.

3. Local temporary tables are only visible to the SQL Server session that created
them, whereas global temporary tables are visible to all the SQL Server sessions.

4. Local temporary tables are automatically dropped when the session that created the
temporary tables is closed, whereas global temporary tables are destroyed when the last
connection that is referencing the global temp table is closed.

Derived Tables,Table variables in Sql Server

SQL Script to create tblEmployee table:


CREATE TABLE tblEmployee
(
Id int Primary Key,
Name nvarchar(30),
Gender nvarchar(10),
DepartmentId int
)

SQL Script to create tblDepartment table


CREATE TABLE tblDepartment
(
DeptId int Primary Key,
DeptName nvarchar(20)
)

Insert data into tblDepartment table


Insert into tblDepartment values (1,'IT')
Insert into tblDepartment values (2,'Payroll')
Insert into tblDepartment values (3,'HR')
Insert into tblDepartment values (4,'Admin')

Insert data into tblEmployee table


Insert into tblEmployee values (1,'John', 'Male', 3)
Insert into tblEmployee values (2,'Mike', 'Male', 2)
Insert into tblEmployee values (3,'Pam', 'Female', 1)
Insert into tblEmployee values (4,'Todd', 'Male', 4)
Insert into tblEmployee values (5,'Sara', 'Female', 1)
Insert into tblEmployee values (6,'Ben', 'Male', 3)

Now, we want to write a query which returns the following output. The query should
return the Department Name and the Total Number of employees within the department. Only
the departments with 2 or more employees should be returned.

Obviously, there are several ways to do this. Let's see how to achieve this with the help of a
view.
Script to create the View
Create view vWEmployeeCount
as
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId

Query using the view:


Select DeptName, TotalEmployees
from vWEmployeeCount
where TotalEmployees >= 2

Note: Views get saved in the database, and are available to other queries and stored
procedures. However, if this view is only used at this one place, it can easily be eliminated
using other options, like a CTE, Derived Table, Temp Table, Table Variable etc.

Now, let's see how to achieve the same using temporary tables. We are using local
temporary tables here.

Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
into #TempEmployeeCount
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId

Select DeptName, TotalEmployees
From #TempEmployeeCount
where TotalEmployees >= 2

Drop Table #TempEmployeeCount

Note: Temporary tables are stored in TempDB. Local temporary tables are visible only in the
current session, and can be shared between nested stored procedure calls. Global
temporary tables are visible to other sessions and are destroyed, when the last connection
referencing the table is closed.

Using Table Variable:


Declare @tblEmployeeCount table
(DeptName nvarchar(20),DepartmentId int, TotalEmployees int)

Insert @tblEmployeeCount
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId

Select DeptName, TotalEmployees
From @tblEmployeeCount
where TotalEmployees >= 2

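Note: Just like temp tables, a table variable is also created in TempDB. The scope of a table
variable is the batch, stored procedure, or statement block in which it is declared. A table
variable can be passed as a parameter between procedures, provided it is declared with a
user-defined table type and the receiving parameter is marked READONLY.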
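
Passing a table variable to a procedure relies on table-valued parameters (SQL Server 2008 and
later). A minimal sketch is shown below; the type name EmployeeCountType and the procedure name
spGetLargeDepartments are hypothetical, and the tblEmployee and tblDepartment tables above are
assumed.

```sql
-- A user-defined table type describes the shape of the parameter
Create Type EmployeeCountType as Table
(DeptName nvarchar(20), DepartmentId int, TotalEmployees int)
GO

-- Table-valued parameters must be declared READONLY
Create Procedure spGetLargeDepartments
@Counts EmployeeCountType READONLY
as
Begin
 Select DeptName, TotalEmployees
 from @Counts
 where TotalEmployees >= 2
End
GO

-- Declare a table variable of that type, fill it, and pass it in
Declare @tblEmployeeCount EmployeeCountType

Insert @tblEmployeeCount
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId

Exec spGetLargeDepartments @Counts = @tblEmployeeCount
```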

Using Derived Tables


Select DeptName, TotalEmployees
from
(
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId
)
as EmployeeCount
where TotalEmployees >= 2

Note: Derived tables are available only in the context of the current query.

Using CTE
With EmployeeCount(DeptName, DepartmentId, TotalEmployees)
as
(
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId
)

Select DeptName, TotalEmployees
from EmployeeCount
where TotalEmployees >= 2
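
For comparison, this particular result does not strictly need an intermediate object at all.
Since the filter is on an aggregate, a HAVING clause on the grouped query should return the same
rows. A sketch, assuming the same tblEmployee and tblDepartment tables:

```sql
-- HAVING filters groups after aggregation, so no view, temp table or CTE is needed
Select DeptName, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId
having COUNT(*) >= 2
```

The view, temp table, table variable, derived table and CTE versions become more useful when
the intermediate result is reused or joined further.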

Common Table Expressions in Sql Server


The common table expression (CTE) was introduced in SQL Server 2005. A CTE is a temporary
result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement
that immediately follows the CTE.
Let's create the required Employee and Department tables that we will be using for this
demo.

SQL Script to create tblEmployee table:


CREATE TABLE tblEmployee
(
Id int Primary Key,
Name nvarchar(30),
Gender nvarchar(10),
DepartmentId int
)

SQL Script to create tblDepartment table


CREATE TABLE tblDepartment
(
DeptId int Primary Key,
DeptName nvarchar(20)
)
Insert data into tblDepartment table
Insert into tblDepartment values (1,'IT')
Insert into tblDepartment values (2,'Payroll')
Insert into tblDepartment values (3,'HR')
Insert into tblDepartment values (4,'Admin')

Insert data into tblEmployee table


Insert into tblEmployee values (1,'John', 'Male', 3)
Insert into tblEmployee values (2,'Mike', 'Male', 2)
Insert into tblEmployee values (3,'Pam', 'Female', 1)
Insert into tblEmployee values (4,'Todd', 'Male', 4)
Insert into tblEmployee values (5,'Sara', 'Female', 1)
Insert into tblEmployee values (6,'Ben', 'Male', 3)

Write a query using CTE, to display the total number of Employees by Department Name.
The output should be as shown below.

Before we write the query, let's look at the syntax for creating a CTE.
WITH cte_name (Column1, Column2, ..)
AS
( CTE_query )

SQL query using CTE:


With EmployeeCount(DepartmentId, TotalEmployees)
as
(
Select DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
group by DepartmentId
)

Select DeptName, TotalEmployees
from tblDepartment
join EmployeeCount
on tblDepartment.DeptId = EmployeeCount.DepartmentId
order by TotalEmployees
We define a CTE using the WITH keyword, followed by the name of the CTE. In our
example, EmployeeCount is the name of the CTE. Within parentheses, we specify the
columns that make up the CTE. DepartmentId and TotalEmployees are the columns
of the EmployeeCount CTE. These 2 columns map to the columns returned by the CTE's SELECT
query. The CTE column names and the CTE query column names can be different. In fact, CTE
column names are optional. However, if you do specify them, the number of CTE columns and
the number of columns in the CTE SELECT query should be the same. Otherwise you will get an
error such as 'EmployeeCount has fewer columns than were specified in the column list'. The
column list is followed by the as keyword, following which we have the CTE query within a
pair of parentheses.
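
To illustrate that the column list is optional, here is the same query sketched without it. In
that case the CTE takes its column names from the SELECT inside it, so every column there needs
a name (which is why COUNT(*) carries the alias TotalEmployees).

```sql
-- No column list after the CTE name: column names come from the inner SELECT
With EmployeeCount
as
(
 Select DepartmentId, COUNT(*) as TotalEmployees
 from tblEmployee
 group by DepartmentId
)
Select DeptName, TotalEmployees
from tblDepartment
join EmployeeCount
on tblDepartment.DeptId = EmployeeCount.DepartmentId
order by TotalEmployees
```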

The EmployeeCount CTE is joined with the tblDepartment table in the SELECT query that
immediately follows the CTE. Remember, a CTE can only be referenced by a SELECT, INSERT,
UPDATE, or DELETE statement that immediately follows the CTE. If you try to do something
else in between, you get an error stating 'Common table expression defined but not used'.
The following SQL raises an error.

With EmployeeCount(DepartmentId, TotalEmployees)
as
(
Select DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
group by DepartmentId
)

Select 'Hello'

Select DeptName, TotalEmployees
from tblDepartment
join EmployeeCount
on tblDepartment.DeptId = EmployeeCount.DepartmentId
order by TotalEmployees

It is also possible to create multiple CTEs using a single WITH clause.


With EmployeesCountBy_Payroll_IT_Dept(DepartmentName, Total)
as
(
Select DeptName, COUNT(Id) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
where DeptName IN ('Payroll','IT')
group by DeptName
),
EmployeesCountBy_HR_Admin_Dept(DepartmentName, Total)
as
(
Select DeptName, COUNT(Id) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
where DeptName IN ('HR','Admin')
group by DeptName
)
Select * from EmployeesCountBy_HR_Admin_Dept
UNION
Select * from EmployeesCountBy_Payroll_IT_Dept
Is it possible to UPDATE a CTE?
Yes and no, depending on the number of base tables the CTE is created upon, and the
number of base tables affected by the UPDATE statement. If this is not clear at the moment,
don't worry. We will try to understand this with an example.
Let's create a simple common table expression based on the tblEmployee
table. The Employees_Name_Gender CTE gets all the required columns from one base
table, tblEmployee.
With Employees_Name_Gender
as
(
Select Id, Name, Gender from tblEmployee
)
Select * from Employees_Name_Gender

Let's now UPDATE JOHN's gender from Male to Female, using
the Employees_Name_Gender CTE:
With Employees_Name_Gender
as
(
Select Id, Name, Gender from tblEmployee
)
Update Employees_Name_Gender Set Gender = 'Female' where Id = 1

Now, query the tblEmployee table. JOHN's gender is actually updated. So, if a CTE is
created on one base table, then it is possible to UPDATE the CTE, which in turn will update
the underlying base table. In this case, updating the Employees_Name_Gender CTE
updates the tblEmployee table.
Now, let's create a CTE on both the tables, tblEmployee and tblDepartment. The CTE
should return Employee Id, Name, Gender and Department. In short, the output should be
as shown below.
CTE that returns Employees by Department:
With EmployeesByDepartment
as
(
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblDepartment.DeptId = tblEmployee.DepartmentId
)
Select * from EmployeesByDepartment

Let's update this CTE. Let's change JOHN's Gender from Female to Male. Here, the CTE is
based on 2 tables, but the UPDATE statement affects only one base table, tblEmployee. So
the UPDATE succeeds. So, if a CTE is based on more than one table, and the UPDATE
affects only one base table, then the UPDATE is allowed.
With EmployeesByDepartment
as
(
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblDepartment.DeptId = tblEmployee.DepartmentId
)
Update EmployeesByDepartment set Gender = 'Male' where Id = 1

Now, let's try to UPDATE the CTE in such a way that the update affects both the tables,
tblEmployee and tblDepartment. This UPDATE statement changes Gender from the tblEmployee
table and DeptName from the tblDepartment table. When you execute this UPDATE, you get an
error stating 'View or function EmployeesByDepartment is not updatable because the
modification affects multiple base tables'. So, if a CTE is based on multiple tables, and the
UPDATE statement affects more than 1 base table, then the UPDATE is not allowed.
With EmployeesByDepartment
as
(
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblDepartment.DeptId = tblEmployee.DepartmentId
)
Update EmployeesByDepartment set
Gender = 'Female', DeptName = 'IT'
where Id = 1

Finally, let's try to UPDATE just the DeptName. Let's change JOHN's DeptName from HR to
IT. Before you execute the UPDATE statement, notice that BEN is also currently in the HR
department.
With EmployeesByDepartment
as
(
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblDepartment.DeptId = tblEmployee.DepartmentId
)
Update EmployeesByDepartment set
DeptName = 'IT' where Id = 1

After you execute the UPDATE, select data from the CTE, and you will see that BEN's
DeptName is also changed to IT.

This is because, when we updated the CTE, the UPDATE actually changed
the DeptName from HR to IT in the tblDepartment table, instead of changing
the DepartmentId column (from 3 to 1) in the tblEmployee table. So, if a CTE is based on
multiple tables, and the UPDATE statement affects only one base table, the update
succeeds. But the update may not work as you expect.
So, in short:
1. If a CTE is based on a single base table, the UPDATE succeeds and works as expected.
2. If a CTE is based on more than one base table, and the UPDATE affects multiple base
tables, the update is not allowed and the statement terminates with an error.
3. If a CTE is based on more than one base table, and the UPDATE affects only one base
table, the UPDATE succeeds (but not always as expected).
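For completeness, one way to get the intended result through the CTE is to expose DepartmentId
and update that column, so that only tblEmployee is touched. This is a sketch that assumes the
original sample data, where IT has DeptId 1:

```sql
With EmployeesByDepartment
as
(
 Select Id, Name, Gender, DepartmentId, DeptName
 from tblEmployee
 join tblDepartment
 on tblDepartment.DeptId = tblEmployee.DepartmentId
)
-- DepartmentId belongs to tblEmployee, so only JOHN's row changes
Update EmployeesByDepartment set DepartmentId = 1 where Id = 1
```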
A CTE that references itself is called a recursive CTE. Recursive CTEs can be of great help
when displaying hierarchical data, for example, displaying employees in an organization
hierarchy. A simple organization chart is shown below.

Let's create the tblEmployee table, which holds the data that's in the organization chart.
Create Table tblEmployee
(
EmployeeId int Primary key,
Name nvarchar(20),
ManagerId int
)

Insert into tblEmployee values (1, 'Tom', 2)
Insert into tblEmployee values (2, 'Josh', null)
Insert into tblEmployee values (3, 'Mike', 2)
Insert into tblEmployee values (4, 'John', 3)
Insert into tblEmployee values (5, 'Pam', 1)
Insert into tblEmployee values (6, 'Mary', 3)
Insert into tblEmployee values (7, 'James', 1)
Insert into tblEmployee values (8, 'Sam', 5)
Insert into tblEmployee values (9, 'Simon', 1)

Since a MANAGER is also an EMPLOYEE, both manager and employee details are stored in the
tblEmployee table. Data from tblEmployee is shown below.
Let's say we want to display each EmployeeName along with their ManagerName. The output
should be as shown below.

To achieve this, we can simply join tblEmployee with itself. Joining a table with itself is called
a self join. We discussed self joins in Part 14 of this video series. In the output, notice
that since JOSH does not have a manager, we are displaying 'Super Boss' instead of NULL.
We used the IsNull() function to replace NULL with 'Super Boss'.
SELF JOIN QUERY:
Select Employee.Name as [Employee Name],
IsNull(Manager.Name, 'Super Boss') as [Manager Name]
from tblEmployee Employee
left join tblEmployee Manager
on Employee.ManagerId = Manager.EmployeeId

Along with each employee's name and their manager's name, we also want to display their level in the
organization. The output should be as shown below.
We can easily achieve this using a self-referencing CTE.
With
EmployeesCTE (EmployeeId, Name, ManagerId, [Level])
as
(
Select EmployeeId, Name, ManagerId, 1
from tblEmployee
where ManagerId is null

union all

Select tblEmployee.EmployeeId, tblEmployee.Name,
tblEmployee.ManagerId, EmployeesCTE.[Level] + 1
from tblEmployee
join EmployeesCTE
on tblEmployee.ManagerID = EmployeesCTE.EmployeeId
)
Select EmpCTE.Name as Employee, Isnull(MgrCTE.Name, 'Super Boss') as Manager,
EmpCTE.[Level]
from EmployeesCTE EmpCTE
left join EmployeesCTE MgrCTE
on EmpCTE.ManagerId = MgrCTE.EmployeeId

The EmployeesCTE contains 2 queries combined with the UNION ALL operator. The first query (the
anchor) selects the EmployeeId, Name, ManagerId, and 1 as the level from tblEmployee where
ManagerId is NULL. So, here we are giving LEVEL = 1 to the super boss (whose ManagerId is NULL).
In the second query (the recursive part), we are joining tblEmployee with EmployeesCTE itself,
which allows us to loop through the hierarchy: each pass picks up the employees who report to
the rows found in the previous pass, at Level + 1. Finally, to get the required output, we are
joining EmployeesCTE with itself.
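The same technique can also walk the hierarchy in the other direction. The following is a minimal sketch, assuming tblEmployee is populated as in the script above; the variable @EmployeeId, the CTE name ManagerChain and the Depth column are hypothetical names introduced only for this illustration.

```sql
-- Sketch: list an employee and everyone above them, up to the super boss.
Declare @EmployeeId int
Set @EmployeeId = 8;   -- 8 is Sam in the sample data

With ManagerChain (EmployeeId, Name, ManagerId, Depth)
as
(
    -- Anchor: the employee we start from
    Select EmployeeId, Name, ManagerId, 0
    from tblEmployee
    where EmployeeId = @EmployeeId

    union all

    -- Recursive part: the manager of the row found in the previous pass
    Select Mgr.EmployeeId, Mgr.Name, Mgr.ManagerId, ManagerChain.Depth + 1
    from tblEmployee Mgr
    join ManagerChain
    on Mgr.EmployeeId = ManagerChain.ManagerId
)
Select Name, Depth
from ManagerChain
order by Depth
option (maxrecursion 100)   -- 100 is also the default recursion limit
```

With the sample data, this returns Sam, Pam, Tom and Josh, in that order. The recursion stops at Josh because his ManagerId is NULL, so the join finds no further rows.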
Database Normalization
Database normalization is the process of organizing data to minimize data redundancy
(data duplication), which in turn ensures data consistency. Let's understand with an
example how redundant data can cause data inconsistency. Consider the Employees table
below. For every employee within the same department, we are repeating all 3 department
columns (DeptName, DeptHead and DeptLocation). For example, if there are 50 thousand
employees in the IT department, we would have unnecessarily repeated the data in all 3
department columns (DeptName, DeptHead and DeptLocation) 50 thousand times.

Another common problem is that data can become inconsistent. For example, let's say
JOHN has resigned, and we have a new department head (STEVE) for the IT department. At
present, there are 3 IT department rows in the table, and we need to update all of them.
If I update only one row and forget to update the other 2 rows, then obviously
the data becomes inconsistent.

Another problem is that DML queries (Insert, Update and Delete) could become slow, as there
could be many records and columns to process.

So, to reduce the data redundancy, we can divide this large, badly organised table into two
(Employees and Departments), as shown below. Now we have reduced the redundant
department data. So, if we have to update the department head's name, we only have one row to
update, even if there are 10 million employees in that department.

Normalized Departments Table


Normalized Employees Table
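As a rough sketch, the two normalized tables could be declared as follows. Only DeptId, DeptName, DeptHead, DeptLocation and EmpId are named in the text; the EmployeeName column, the column sizes and the sample head name are assumptions made for illustration.

```sql
-- Department details are stored once per department
Create Table Departments
(
    DeptId int primary key,
    DeptName nvarchar(50),
    DeptHead nvarchar(50),
    DeptLocation nvarchar(50)
)

-- Each employee row only carries a reference to its department
Create Table Employees
(
    EmpId int primary key,
    EmployeeName nvarchar(50),   -- hypothetical column
    DeptId int foreign key references Departments(DeptId)
)

-- Changing the department head now touches exactly one row,
-- however many employees belong to the department
Update Departments set DeptHead = 'Steve' where DeptName = 'IT'
```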

Database normalization is a step-by-step process. There are 6 normal forms, First Normal
Form (1NF) through Sixth Normal Form (6NF). Most databases are in third normal form (3NF).
Each normal form has certain rules that a table must follow.

Now, let's explore the first normal form (1NF). A table is said to be in 1NF, if
1. The data in each column is atomic. No multiple values, separated by commas.
2. The table does not contain any repeating column groups.
3. Each record is identified uniquely using a primary key.

In the table below, data in the Employee column is not atomic. It contains multiple employees
separated by commas. From the data you can see that in the IT department, we have 3
employees - Sam, Mike, Shan. Now, let's say I want to change just SHAN's name. It is not
possible; we have to update the entire cell. Similarly, it is not possible to select or delete just
one employee, as the data in the cell is not atomic.

The 2nd rule of the first normal form is that the table should not contain any repeating
column groups. Consider the Employee table below. We have repeated the Employee
column, from Employee1 to Employee3. The problem with this design is that, if a
department is going to have more than 3 employees, then we have to change the table
structure to add an Employee4 column. The Employee2 and Employee3 columns in the HR
department are NULL, as there is only one employee in this department. The disk space is
simply wasted.

To eliminate the repeating column groups, we divide the table into 2. The repeating
Employee columns are moved into a separate table, with a foreign key pointing to the
primary key of the other table. We also introduce a primary key to uniquely identify each
record.

Second Normal Form (2NF):
A table is said to be in 2NF, if
1. The table meets all the conditions of 1NF
2. Redundant data is moved to a separate table
3. Relationships between these tables are created using foreign keys

The table below violates second normal form. There is a lot of redundant data in the table.
Let's say, in my organization there are 100,000 employees and only 2 departments (IT &
HR). Since we are storing the DeptName, DeptHead and DeptLocation columns in the same
table, the values in these columns are repeated 100,000 times, which results in
unnecessary duplication of data.

So this table is clearly violating the rules of the second normal form, and the redundant
data can cause the following issues.
1. Disk space wastage
2. Data inconsistency
3. DML queries (Insert, Update, Delete) can become slow

Now, to put this table in the second normal form, we need to break the table into 2, and
move the redundant department data (DeptName, DeptHead and DeptLocation) into its
own table. To link the tables with each other, we use the DeptId foreign key. The tables
below are in 2NF.
Third Normal Form (3NF):
A table is said to be in 3NF, if the table
1. Meets all the conditions of 1NF and 2NF
2. Does not contain columns (attributes) that are not fully dependent upon the primary key

The table below violates third normal form, because the AnnualSalary column is not fully
dependent on the primary key EmpId. The AnnualSalary is also dependent on
the Salary column. In fact, to compute the AnnualSalary, we multiply the Salary by 12.
Since AnnualSalary is not fully dependent on the primary key, and it can be computed, we
can remove this column from the table, which will then adhere to 3NF.
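If the annual figure is still needed in query results, one option is to let SQL Server derive it with a computed column, so that no independently stored value can drift out of sync with Salary. This is only a sketch: the table name tblEmployeeSalary, the EmployeeName column and the sample row are hypothetical.

```sql
Create Table tblEmployeeSalary
(
    EmpId int primary key,
    EmployeeName nvarchar(50),
    Salary int,
    -- Computed column: derived from Salary when the row is read,
    -- so it cannot be updated on its own
    AnnualSalary as (Salary * 12)
)

-- The computed column is left out of the insert column list
Insert into tblEmployeeSalary (EmpId, EmployeeName, Salary) values (1, 'Sam', 4500)

-- AnnualSalary comes back as 54000 for this row
Select EmpId, Salary, AnnualSalary from tblEmployeeSalary
```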

Let's look at another example of a Third Normal Form violation. In the table
below, the DeptHead column is not fully dependent on the EmpId column. DeptHead is also
dependent on DeptName. So, this table is not in 3NF.
To put this table in 3NF, we break it into 2 tables, and move all the columns that are
not fully dependent on the primary key to a separate table, as shown below. This design is
now in 3NF.

Pivot operator in sql server


Pivot is a SQL Server operator that can be used to turn unique values from one column
into multiple columns in the output, thereby effectively rotating a table.

Let's understand the power of the PIVOT operator with an example.


Create Table tblProductSales
(
SalesAgent nvarchar(50),
SalesCountry nvarchar(50),
SalesAmount int
)

Insert into tblProductSales values('Tom', 'UK', 200)


Insert into tblProductSales values('John', 'US', 180)
Insert into tblProductSales values('John', 'UK', 260)
Insert into tblProductSales values('David', 'India', 450)
Insert into tblProductSales values('Tom', 'India', 350)
Insert into tblProductSales values('David', 'US', 200)
Insert into tblProductSales values('Tom', 'US', 130)
Insert into tblProductSales values('John', 'India', 540)
Insert into tblProductSales values('John', 'UK', 120)
Insert into tblProductSales values('David', 'UK', 220)
Insert into tblProductSales values('John', 'UK', 420)
Insert into tblProductSales values('David', 'US', 320)
Insert into tblProductSales values('Tom', 'US', 340)
Insert into tblProductSales values('Tom', 'UK', 660)
Insert into tblProductSales values('John', 'India', 430)
Insert into tblProductSales values('David', 'India', 230)
Insert into tblProductSales values('David', 'India', 280)
Insert into tblProductSales values('Tom', 'UK', 480)
Insert into tblProductSales values('John', 'US', 360)
Insert into tblProductSales values('David', 'UK', 140)

Select * from tblProductSales: As you can see, we have 3 sales agents selling in 3 countries.
Now, let's write a query which returns TOTAL SALES, grouped
by SALESCOUNTRY and SALESAGENT. The output should be as shown below.

A simple GROUP BY query can produce this output.


Select SalesCountry, SalesAgent, SUM(SalesAmount) as Total
from tblProductSales
group by SalesCountry, SalesAgent
order by SalesCountry, SalesAgent

At this point, let's try to present the same data in a different format using the PIVOT operator.

Query using PIVOT operator:


Select SalesAgent, India, US, UK
from tblProductSales
Pivot
(
Sum(SalesAmount) for SalesCountry in ([India],[US],[UK])
) as PivotTable
This PIVOT query converts the unique values (India, US, UK)
in the SALESCOUNTRY column into columns in the output, and aggregates
the SALESAMOUNT column for each of them. The outer query simply selects the SALESAGENT column,
along with the pivoted columns, from the PivotTable.
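To see what PIVOT is doing, it can help to write the same result by hand with GROUP BY and conditional aggregation. The query below is an alternative formulation for comparison, not part of the PIVOT syntax; it assumes the tblProductSales data inserted above.

```sql
-- Sketch: one row per SalesAgent, one SUM per country.
-- Each CASE expression passes SalesAmount through only for its own country,
-- so every SUM adds up just the rows that belong to that column.
Select SalesAgent,
       SUM(Case When SalesCountry = 'India' Then SalesAmount End) as India,
       SUM(Case When SalesCountry = 'US' Then SalesAmount End) as US,
       SUM(Case When SalesCountry = 'UK' Then SalesAmount End) as UK
from tblProductSales
group by SalesAgent
```

PIVOT groups implicitly by every column that is neither aggregated nor pivoted, which here is just SalesAgent, so both queries should return the same totals.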

Having understood the basics of PIVOT, let's look at another example. Let's
create tblProductsSale, a slight variation of tblProductSales, which we have already created.
The table that we are creating now has an additional Id column.
Create Table tblProductsSale
(
Id int primary key,
SalesAgent nvarchar(50),
SalesCountry nvarchar(50),
SalesAmount int
)

Insert into tblProductsSale values(1, 'Tom', 'UK', 200)


Insert into tblProductsSale values(2, 'John', 'US', 180)
Insert into tblProductsSale values(3, 'John', 'UK', 260)
Insert into tblProductsSale values(4, 'David', 'India', 450)
Insert into tblProductsSale values(5, 'Tom', 'India', 350)
Insert into tblProductsSale values(6, 'David', 'US', 200)
Insert into tblProductsSale values(7, 'Tom', 'US', 130)
Insert into tblProductsSale values(8, 'John', 'India', 540)
Insert into tblProductsSale values(9, 'John', 'UK', 120)
Insert into tblProductsSale values(10, 'David', 'UK', 220)
Insert into tblProductsSale values(11, 'John', 'UK', 420)
Insert into tblProductsSale values(12, 'David', 'US', 320)
Insert into tblProductsSale values(13, 'Tom', 'US', 340)
Insert into tblProductsSale values(14, 'Tom', 'UK', 660)
Insert into tblProductsSale values(15, 'John', 'India', 430)
Insert into tblProductsSale values(16, 'David', 'India', 230)
Insert into tblProductsSale values(17, 'David', 'India', 280)
Insert into tblProductsSale values(18, 'Tom', 'UK', 480)
Insert into tblProductsSale values(19, 'John', 'US', 360)
Insert into tblProductsSale values(20, 'David', 'UK', 140)
Now, run the same PIVOT query that we have already created, just by changing the name
of the table to tblProductsSale instead of tblProductSales
Select SalesAgent, India, US, UK
from tblProductsSale
Pivot
(
Sum(SalesAmount) for SalesCountry in ([India],[US],[UK])
)
as PivotTable

This output is not what we expected.

This is because of the presence of the Id column in tblProductsSale, which is also considered
when performing pivoting and group by. Since every row has a different Id, each row ends up in
its own group. To eliminate Id from the calculations, we use a derived table, which only
selects SALESAGENT, SALESCOUNTRY, and SALESAMOUNT.
The rest of the query is very similar to what we have already seen.
Select SalesAgent, India, US, UK
from
(
Select SalesAgent, SalesCountry, SalesAmount from tblProductsSale
) as SourceTable
Pivot
(
Sum(SalesAmount) for SalesCountry in (India, US, UK)
) as PivotTable
UNPIVOT performs the opposite operation to PIVOT by rotating columns of a table-valued
expression into column values.
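A short sketch of UNPIVOT, under the assumption that tblProductsSale is populated as above: the pivoted result is wrapped in a CTE (PivotedSales is a hypothetical name) and then rotated back into rows.

```sql
With PivotedSales as
(
    Select SalesAgent, India, US, UK
    from
    (
        Select SalesAgent, SalesCountry, SalesAmount from tblProductsSale
    ) as SourceTable
    Pivot
    (
        Sum(SalesAmount) for SalesCountry in (India, US, UK)
    ) as PivotTable
)
-- The India, US and UK columns become values of SalesCountry,
-- and their contents become values of SalesAmount
Select SalesAgent, SalesCountry, SalesAmount
from PivotedSales
Unpivot
(
    SalesAmount for SalesCountry in (India, US, UK)
) as UnpivotTable
```

Note that this does not recover the original 20 rows. PIVOT has already summed the amounts, so UNPIVOT returns one total per agent and country.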

The syntax of PIVOT operator from MSDN


SELECT <non-pivoted column>,
[first pivoted column] AS <column name>,
[second pivoted column] AS <column name>,
...
[last pivoted column] AS <column name>
FROM
(<SELECT query that produces the data>)
AS <alias for the source query>
PIVOT
(
<aggregation function>(<column being aggregated>)
FOR
[<column that contains the values that will become column headers>]
IN ( [first pivoted column], [second pivoted column], ... [last pivoted column])
)
AS <alias for the pivot table>
<optional ORDER BY clause>;

Error handling in sql server


With the introduction of Try/Catch blocks in SQL Server 2005, error handling in SQL Server is now
similar to programming languages like C# and Java. Before understanding error handling using
try/catch, let's step back and understand how error handling was done in SQL Server 2000,
using the system function @@Error. Sometimes, system functions that begin with two at signs (@@)
are called global variables. They are not variables and do not have the same behaviours as
variables; instead they are very similar to functions.

Now let's create tblProduct and tblProductSales, that we will be using for the rest of this
demo.

SQL script to create tblProduct


Create Table tblProduct
(
ProductId int NOT NULL primary key,
Name nvarchar(50),
UnitPrice int,
QtyAvailable int
)

SQL script to load data into tblProduct


Insert into tblProduct values(1, 'Laptops', 2340, 100)
Insert into tblProduct values(2, 'Desktops', 3467, 50)

SQL script to create tblProductSales


Create Table tblProductSales
(
ProductSalesId int primary key,
ProductId int,
QuantitySold int
)
Create Procedure spSellProduct
@ProductId int,
@QuantityToSell int
as
Begin
-- Check the stock available, for the product we want to sell
Declare @StockAvailable int
Select @StockAvailable = QtyAvailable
from tblProduct where ProductId = @ProductId

-- Throw an error to the calling application, if enough stock is not available


if(@StockAvailable < @QuantityToSell)
Begin
Raiserror('Not enough stock available',16,1)
End
-- If enough stock available
Else
Begin
Begin Tran
-- First reduce the quantity available
Update tblProduct set QtyAvailable = (QtyAvailable - @QuantityToSell)
where ProductId = @ProductId

Declare @MaxProductSalesId int


-- Calculate MAX ProductSalesId
Select @MaxProductSalesId = Case When
MAX(ProductSalesId) IS NULL
Then 0 else MAX(ProductSalesId) end
from tblProductSales
-- Increment @MaxProductSalesId by 1, so we don't get a primary key violation
Set @MaxProductSalesId = @MaxProductSalesId + 1
Insert into tblProductSales values(@MaxProductSalesId, @ProductId, @QuantityToSell)
Commit Tran
End
End

1. The stored procedure spSellProduct has 2 parameters - @ProductId and @QuantityToSell.
@ProductId specifies the product that we want to sell, and @QuantityToSell specifies the
quantity we would like to sell.
2. Sections of the stored procedure are commented, and are self-explanatory.
3. In the procedure, we are using the Raiserror() function to return an error message back to
the calling application, if the stock available is less than the quantity we are trying to sell.
We have to pass at least 3 parameters to the Raiserror() function.
RAISERROR('Error Message', ErrorSeverity, ErrorState)
Severity and State are integers. In most cases, when you are returning custom errors, the
severity level is 16, which indicates general errors that can be corrected by the user. In this
case, the error can be corrected by adjusting @QuantityToSell to be less than or equal
to the stock available. ErrorState is also an integer between 1 and 255. RAISERROR only
generates errors with state from 1 through 127.

4. The problem with this procedure is that the transaction is always committed, even if
there is an error somewhere between updating tblProduct and inserting into tblProductSales. In
fact, the main purpose of wrapping these 2 statements (the Update tblProduct statement
and the Insert into tblProductSales statement) in a transaction is to ensure that both
statements are treated as a single unit. For example, if we have an error when executing the
second statement, then the first statement should also be rolled back.
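RAISERROR also accepts optional arguments after the state, which are substituted into printf-style placeholders in the message. The sketch below uses hard-coded illustrative values instead of reading from tblProduct, and the message text is an assumption, not the one used in spSellProduct. Declaring and initializing a variable in one statement needs SQL Server 2008 or later.

```sql
Declare @StockAvailable int = 100   -- illustrative value
Declare @QuantityToSell int = 150   -- illustrative value

if(@StockAvailable < @QuantityToSell)
Begin
    -- The %d placeholders are filled, in order, from the arguments after State
    Raiserror('Not enough stock available: requested %d, only %d in stock', 16, 1,
              @QuantityToSell, @StockAvailable)
End
```

A message like this tells the caller how far the quantity has to be reduced, which fits the idea of severity 16 as an error the user can correct.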
In SQL Server 2000, to detect errors, we can use the @@Error system function. @@Error
returns a NON-ZERO value if the previous SQL statement encountered an error; otherwise it
returns ZERO. The stored procedure spSellProductCorrected makes
use of the @@ERROR system function to detect any errors that may have occurred. If there are
errors, we roll back the transaction; otherwise we commit it. If you comment out the line
(Set @MaxProductSalesId = @MaxProductSalesId + 1) and then execute the stored
procedure, there will be a primary key violation error when trying to insert
into tblProductSales (once the table contains at least one row). As a result, the entire
transaction will be rolled back.
Create Procedure spSellProductCorrected
@ProductId int,
@QuantityToSell int
as
Begin
-- Check the stock available, for the product we want to sell
Declare @StockAvailable int
Select @StockAvailable = QtyAvailable
from tblProduct where ProductId = @ProductId

-- Throw an error to the calling application, if enough stock is not available


if(@StockAvailable < @QuantityToSell)
Begin
Raiserror('Not enough stock available',16,1)
End
-- If enough stock available
Else
Begin
Begin Tran
-- First reduce the quantity available
Update tblProduct set QtyAvailable = (QtyAvailable - @QuantityToSell)
where ProductId = @ProductId

Declare @MaxProductSalesId int


-- Calculate MAX ProductSalesId
Select @MaxProductSalesId = Case When
MAX(ProductSalesId) IS NULL
Then 0 else MAX(ProductSalesId) end
from tblProductSales
-- Increment @MaxProductSalesId by 1, so we don't get a primary key violation
Set @MaxProductSalesId = @MaxProductSalesId + 1
Insert into tblProductSales values(@MaxProductSalesId, @ProductId, @QuantityToSell)
if(@@ERROR <> 0)
Begin
Rollback Tran
Print 'Rolled Back Transaction'
End
Else
Begin
Commit Tran
Print 'Committed Transaction'
End
End
End

Note: @@ERROR is cleared and reset on each statement execution. Check it immediately
following the statement being verified, or save it to a local variable that can be checked
later.

In the tblProduct table, we already have a record with ProductId = 2. So the insert
statement causes a primary key violation error. @@ERROR retains the error number, as we
are checking for it immediately after the statement that caused the error.

Insert into tblProduct values(2, 'Mobile Phone', 1500, 100)


if(@@ERROR <> 0)
Print 'Error Occurred'
Else
Print 'No Errors'
On the other hand, when you execute the code below, the message 'No Errors' is printed.
This is because @@ERROR is cleared and reset on each statement execution.
Insert into tblProduct values(2, 'Mobile Phone', 1500, 100)
--At this point @@ERROR will have a NON ZERO value
Select * from tblProduct
--At this point @@ERROR gets reset to ZERO, because the
--select statement executed successfully
if(@@ERROR <> 0)
Print 'Error Occurred'
Else
Print 'No Errors'

In this example, we are storing the value of @@Error function to a local variable, which is
then used later.
Declare @Error int
Insert into tblProduct values(2, 'Mobile Phone', 1500, 100)
Set @Error = @@ERROR
Select * from tblProduct
if(@Error <> 0)
Print 'Error Occurred'
Else
Print 'No Errors'
Error handling in sql server 2005, and later versions
Syntax:
BEGIN TRY
{ Any set of SQL statements }
END TRY
BEGIN CATCH
[ Optional: Any set of SQL statements ]
END CATCH
[Optional: Any other SQL Statements]

Any set of SQL statements that can possibly throw an exception is wrapped between
BEGIN TRY and END TRY. If there is an exception in the TRY block, control
immediately jumps to the CATCH block. If there is no exception, the CATCH block is
skipped, and the statements after the CATCH block are executed.
Errors trapped by a CATCH block are not returned to the calling application. If any part of
the error information must be returned to the application, the code in the CATCH block
must do so by using the RAISERROR() function.
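As a minimal sketch of that last point, the CATCH block below captures the details of the trapped error and raises them again for the caller. The divide-by-zero statement is only a stand-in for any failing statement, and the variable names are hypothetical.

```sql
Begin Try
    -- Divide by zero raises error 8134 and transfers control to the CATCH block
    Select 1 / 0 as Result
End Try
Begin Catch
    Declare @Message nvarchar(2048), @Severity int, @State int

    -- These functions only return values inside the CATCH block
    Select @Message = ERROR_MESSAGE(),
           @Severity = ERROR_SEVERITY(),
           @State = ERROR_STATE()

    -- Return the error information to the calling application
    Raiserror(@Message, @Severity, @State)
End Catch
```

The caller receives the original message text, but because the message is passed as a string, the re-raised error carries error number 50000 and not the original 8134.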

1. In the procedure spSellProduct, the Begin Transaction and Commit Transaction statements are
wrapped between Begin Try and End Try. If there are no errors in the code that is
enclosed in the TRY block, then COMMIT TRANSACTION gets executed and the changes are
made permanent. On the other hand, if there is an error, then control immediately
jumps to the CATCH block. In the CATCH block, we are rolling the transaction back. So, it's
much easier to handle errors with the Try/Catch construct than with the @@Error system function.
2. Also notice that, in the scope of the CATCH block, there are several system functions that
are used to retrieve more information about the error that occurred. These functions return
NULL if they are executed outside the scope of the CATCH block.
3. TRY/CATCH cannot be used in user-defined functions.

Create Procedure spSellProduct
@ProductId int,
@QuantityToSell int
as
Begin
-- Check the stock available, for the product we want to sell
Declare @StockAvailable int
Select @StockAvailable = QtyAvailable
from tblProduct where ProductId = @ProductId

-- Throw an error to the calling application, if enough stock is not available


if(@StockAvailable < @QuantityToSell)
Begin
Raiserror('Not enough stock available',16,1)
End
-- If enough stock available
Else
Begin
Begin Try
Begin Transaction
-- First reduce the quantity available
Update tblProduct set QtyAvailable = (QtyAvailable - @QuantityToSell)
where ProductId = @ProductId

Declare @MaxProductSalesId int


-- Calculate MAX ProductSalesId
Select @MaxProductSalesId = Case When
MAX(ProductSalesId) IS NULL
Then 0 else MAX(ProductSalesId) end
from tblProductSales
--Increment @MaxProductSalesId by 1, so we don't get a primary key violation
Set @MaxProductSalesId = @MaxProductSalesId + 1
Insert into tblProductSales values(@MaxProductSalesId, @ProductId, @QuantityToSell)
Commit Transaction
End Try
Begin Catch
Rollback Transaction
Select
ERROR_NUMBER() as ErrorNumber,
ERROR_MESSAGE() as ErrorMessage,
ERROR_PROCEDURE() as ErrorProcedure,
ERROR_STATE() as ErrorState,
ERROR_SEVERITY() as ErrorSeverity,
ERROR_LINE() as ErrorLine
End Catch
End
End

Transactions in SQL Server


What is a Transaction?
A transaction is a group of commands that change the data stored in a database. A
transaction is treated as a single unit. A transaction ensures that either all of the
commands succeed, or none of them do. If one of the commands in the transaction fails, all of
the commands fail, and any data that was modified in the database is rolled back. In this
way, transactions maintain the integrity of data in a database.

Transaction processing follows these steps:
1. Begin a transaction.
2. Process database commands.
3. Check for errors.
If errors occurred, roll back the transaction; else, commit the transaction.

Let's understand transaction processing with an example. For this purpose, let's create and
populate the tblMailingAddress and tblPhysicalAddress tables.
Create Table tblMailingAddress
(
AddressId int NOT NULL primary key,
EmployeeNumber int,
HouseNumber nvarchar(50),
StreetAddress nvarchar(50),
City nvarchar(10),
PostalCode nvarchar(50)
)

Insert into tblMailingAddress values (1, 101, '#10', 'King Street', 'Londoon', 'CR27DW')
Create Table tblPhysicalAddress
(
AddressId int NOT NULL primary key,
EmployeeNumber int,
HouseNumber nvarchar(50),
StreetAddress nvarchar(50),
City nvarchar(10),
PostalCode nvarchar(50)
)

Insert into tblPhysicalAddress values (1, 101, '#10', 'King Street', 'Londoon', 'CR27DW')

An employee with EmployeeNumber 101 has the same address as his physical and mailing
address. His city name is misspelled as Londoon instead of London. The following stored
procedure, spUpdateAddress, updates the physical and mailing addresses. Both
UPDATE statements are wrapped between BEGIN TRANSACTION and COMMIT
TRANSACTION, which in turn are wrapped between BEGIN TRY and END TRY.
So, if both UPDATE statements succeed without any errors, the transaction is
committed. If there are errors, control is immediately transferred to the CATCH
block. The ROLLBACK TRANSACTION statement in the CATCH block rolls back the
transaction, and any data that was written to the database by the commands is backed out.

Create Procedure spUpdateAddress
as
Begin
Begin Try
Begin Transaction
Update tblMailingAddress set City = 'LONDON'
where AddressId = 1 and EmployeeNumber = 101

Update tblPhysicalAddress set City = 'LONDON'
where AddressId = 1 and EmployeeNumber = 101
Commit Transaction
End Try
Begin Catch
Rollback Transaction
End Catch
End
Let's now make the second UPDATE statement fail. The CITY column length in the
tblPhysicalAddress table is 10. The second UPDATE statement fails because the value for the
CITY column ('LONDON LONDON', 13 characters) is more than 10 characters long.
Alter Procedure spUpdateAddress
as
Begin
Begin Try
Begin Transaction
Update tblMailingAddress set City = 'LONDON12'
where AddressId = 1 and EmployeeNumber = 101

Update tblPhysicalAddress set City = 'LONDON LONDON'
where AddressId = 1 and EmployeeNumber = 101
Commit Transaction
End Try
Begin Catch
Rollback Transaction
End Catch
End

Now, if we execute spUpdateAddress, the first UPDATE statement succeeds, but the
second UPDATE statement fails. As soon as the second UPDATE statement fails, control
is immediately transferred to the CATCH block. The CATCH block rolls the transaction back.
So, the change made by the first UPDATE statement is undone.
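One way to confirm the rollback is to run the procedure and then read both tables back. This is only a verification sketch; the AddressType label column is a hypothetical name added to tell the two rows apart.

```sql
Execute spUpdateAddress

-- Neither row should show 'LONDON12' or 'LONDON LONDON': the failure of the
-- second UPDATE caused the first UPDATE to be rolled back as well
Select 'Mailing' as AddressType, City from tblMailingAddress where AddressId = 1
union all
Select 'Physical', City from tblPhysicalAddress where AddressId = 1
```

Both rows should still hold whatever City value they had before the procedure was executed.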

A transaction is a group of database commands that are treated as a single unit. A successful
transaction must pass the "ACID" test, that is, it must be
A - Atomic
C - Consistent
I - Isolated
D - Durable

Atomic - All statements in the transaction either completed successfully or they were all
rolled back. The task that the set of operations represents is either accomplished or not, but
in any case not left half-done. For example, in the spUpdateInventory_and_Sell stored
procedure, both the UPDATE and the INSERT statements should succeed. If the UPDATE statement
succeeds and the INSERT statement fails, the database should undo the change
made by the UPDATE statement, by rolling it back. In short, the transaction should be
ATOMIC.
Create Procedure spUpdateInventory_and_Sell
as
Begin
Begin Try
Begin Transaction
Update tblProduct set QtyAvailable = (QtyAvailable - 10)
where ProductId = 1

Insert into tblProductSales values(3, 1, 10)


Commit Transaction
End Try
Begin Catch
Rollback Transaction
End Catch
End
Consistent - All data touched by the transaction is left in a logically consistent state. For
example, if the stock available numbers are decremented in tblProduct, then there has
to be a related entry in the tblProductSales table. The inventory can't just disappear.
Isolated - The transaction must affect data without interfering with other concurrent
transactions, or being interfered with by them. This prevents transactions from making
changes to data based on uncommitted information, for example changes to a record that
are subsequently rolled back. Most databases use locking to maintain transaction
isolation.
Durable - Once a change is made, it is permanent. If a system error or power failure occurs
before a set of commands is complete, those commands are undone and the data is
restored to its original state once the system begins running again.

Subqueries in Sql Server


Please create the required tables and insert sample data using the script below.
Create Table tblProducts
(
[Id] int identity primary key,
[Name] nvarchar(50),
[Description] nvarchar(250)
)
Create Table tblProductSales
(
Id int primary key identity,
ProductId int foreign key references tblProducts(Id),
UnitPrice int,
QuantitySold int
)
Insert into tblProducts values ('TV', '52 inch black color LCD TV')
Insert into tblProducts values ('Laptop', 'Very thin black color acer laptop')
Insert into tblProducts values ('Desktop', 'HP high performance desktop')

Insert into tblProductSales values(3, 450, 5)


Insert into tblProductSales values(2, 250, 7)
Insert into tblProductSales values(3, 450, 4)
Insert into tblProductSales values(3, 450, 9)

Write a query to retrieve products that have not been sold at all.

This can be very easily achieved using a subquery, as shown below.
Select [Id], [Name], [Description]
from tblProducts
where Id not in (Select Distinct ProductId from tblProductSales)

Most of the time, subqueries can be very easily replaced with joins. The above query is
rewritten below using joins, and produces the same results.
Select tblProducts.[Id], [Name], [Description]
from tblProducts
left join tblProductSales
on tblProducts.Id = tblProductSales.ProductId
where tblProductSales.ProductId IS NULL

In this example, we have seen how to use a subquery in the where clause.
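One caveat with the NOT IN form is worth knowing. ProductId in tblProductSales is declared without NOT NULL, and if the subquery ever returns a NULL, a NOT IN comparison cannot evaluate to true for any row, so the outer query returns nothing. A NOT EXISTS subquery does not have this problem. The following is a sketch of the same "never sold" query in that form.

```sql
Select [Id], [Name], [Description]
from tblProducts
where not exists
(
    -- Correlated on the outer row's Id; a NULL ProductId simply never matches
    Select 1
    from tblProductSales
    where tblProductSales.ProductId = tblProducts.Id
)
```

With the sample data above, where no ProductId is NULL, this returns only the TV row, the same result as the NOT IN query.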

Let us now discuss using a subquery in the SELECT clause. Write a query to retrieve
the NAME and TOTALQUANTITY sold, using a subquery.
Select [Name],
(Select SUM(QuantitySold) from tblProductSales where ProductId = tblProducts.Id) as TotalQuantity
from tblProducts
order by Name

Query with an equivalent join that produces the same result.


Select [Name], SUM(QuantitySold) as TotalQuantity
from tblProducts
left join tblProductSales
on tblProducts.Id = tblProductSales.ProductId
group by [Name]
order by Name

From these examples, it should be clear that a subquery is simply a SELECT statement
that is nested inside a SELECT, UPDATE, INSERT, or DELETE statement. A subquery used in
the SELECT list must return a single value, while a subquery used with IN or NOT IN can
return a list of values.
It is also possible to nest a subquery inside another subquery.
According to MSDN, subqueries can be nested up to 32 levels.
Subqueries are always enclosed in parentheses and are also called inner queries, and the
query containing the subquery is called the outer query.
The columns from a table that is present only inside a subquery cannot be used in the
SELECT list of the outer query.
Correlated subquery in sql
In the example below, the subquery is executed first and only once. The subquery results are
then used by the outer query. Such a non-correlated subquery can be executed independently of
the outer query.
Select [Id], [Name], [Description]
from tblProducts
where Id not in (Select Distinct ProductId from tblProductSales)

If the subquery depends on the outer query for its values, then that subquery is called a
correlated subquery. In the where clause of the subquery below, the "ProductId" column gets its
value from the tblProducts table that is present in the outer query. So, here the subquery is
dependent on the outer query for its value; hence this subquery is a correlated subquery.
Conceptually, a correlated subquery gets executed once for every row that is selected by the
outer query. A correlated subquery cannot be executed independently of the outer query.
Select [Name],
(Select SUM(QuantitySold) from tblProductSales where ProductId =
tblProducts.Id) as TotalQuantity
from tblProducts
order by Name
What to choose for performance - SubQueries or Joins
According to MSDN, in SQL Server, there is usually no performance difference
between queries that use subqueries and equivalent queries that use joins. For example, on
my machine I have
400,000 records in the tblProducts table
600,000 records in the tblProductSales table
The following query returns the list of products that we have sold at least once. This
query is formed using a subquery. When I execute this query I get 306,199 rows in 6
seconds.
Select Id, Name, Description
from tblProducts
where ID IN
(
Select ProductId from tblProductSales
)
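At this stage, please clear the data (buffer) cache and the execution plan cache using the following T-SQL
commands. Run these only on a test server, because every query that follows is slower until the caches warm up again.
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS; -- Removes clean data pages from the buffer pool
GO
DBCC FREEPROCCACHE; -- Clears execution plan cache
GO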
Now, run the query that is formed using joins. Notice that I get the exact same 306,199
rows in 6 seconds.
Select distinct tblProducts.Id, Name, Description
from tblProducts
inner join tblProductSales
on tblProducts.Id = tblProductSales.ProductId

Please note: I used an automated SQL script to insert large amounts of random data.
Please watch Part 61 of the SQL Server tutorial, in which we discussed this
automated script.

According to MSDN, in some cases where existence must be checked, a join produces better
performance. Otherwise, the nested query must be processed for each result of the outer
query. In such cases, a join approach would yield better results.
The following query returns the products that we have never sold. This query is
formed using a subquery. When I execute this query I get 93,801 rows in 3 seconds.

Select Id, Name, [Description]
from tblProducts
where Not Exists(Select * from tblProductSales where ProductId = tblProducts.Id)

When I execute the below equivalent query, that uses joins, I get the exact same 93,801
rows in 3 seconds.

Select tblProducts.Id, Name, [Description]
from tblProducts
left join tblProductSales
on tblProducts.Id = tblProductSales.ProductId
where tblProductSales.ProductId IS NULL

Joins are often assumed to be faster than subqueries, but in reality it all depends on the execution
plan that SQL Server generates. No matter how we have written the query, SQL
Server always transforms it into an execution plan. If SQL Server generates the same plan
for both queries, we will get the same performance.
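
Rather than relying on wall-clock time alone, one way to compare the two forms is to have SQL Server report the reads and CPU time for each statement. The sketch below reuses the two "sold at least once" queries from this section. The numbers it prints depend entirely on your data and hardware, so none are shown here.

```sql
-- Report logical reads and CPU/elapsed time for each statement.
-- In SSMS the figures appear in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

-- Subquery version
Select Id, Name, Description
from tblProducts
where ID IN (Select ProductId from tblProductSales);
GO

-- Join version
Select distinct tblProducts.Id, Name, Description
from tblProducts
inner join tblProductSales
    on tblProducts.Id = tblProductSales.ProductId;
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

Turning on "Include Actual Execution Plan" in SSMS before running the batch also lets you check whether both queries produce the same plan.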

Cursors in Sql Server


If there is ever a need to process rows one at a time, cursors are an option.
However, cursors are usually very bad for performance and should be avoided wherever a
set-based alternative exists. Most of the time, cursors can be replaced easily using joins.
There are different types of cursors in SQL Server, as listed below. We will talk about the
differences between these cursor types in a later video session.
1. Forward-Only
2. Static
3. Keyset
4. Dynamic
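
As a rough sketch, the cursor type is chosen with a keyword in the DECLARE CURSOR statement. The cursor names below are made up for this illustration, and the SELECT is the same in every case. Note that a KEYSET cursor needs a unique index on the underlying table; without one, SQL Server implicitly converts it to a static cursor.

```sql
-- One declaration per cursor type; only the type keyword differs.
Declare ForwardOnlyCursor CURSOR FORWARD_ONLY FOR
    Select ProductId from tblProductSales

Declare StaticCursor CURSOR STATIC FOR
    Select ProductId from tblProductSales

Declare KeysetCursor CURSOR KEYSET FOR
    Select ProductId from tblProductSales

Declare DynamicCursor CURSOR DYNAMIC FOR
    Select ProductId from tblProductSales

-- Clean up so the script can be run again without a name clash
DEALLOCATE ForwardOnlyCursor
DEALLOCATE StaticCursor
DEALLOCATE KeysetCursor
DEALLOCATE DynamicCursor
```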
Let us now look at a simple example of using a SQL Server cursor to process one row at a
time. We will be using the tblProducts and tblProductSales tables for this example. The tables
here show only 5 rows from each table. However, on my machine, there are 400,000
records in tblProducts and 600,000 records in tblProductSales.

Cursor Example: Let us say I want to update the UnitPrice column in the tblProductSales
table, based on the following criteria
1. If the ProductName = 'Product - 55', set UnitPrice to 55
2. If the ProductName = 'Product - 65', set UnitPrice to 65
3. If the ProductName is like 'Product - 100%', set UnitPrice to 1000

Declare @ProductId int

-- Declare the cursor using the declare keyword
Declare ProductIdCursor CURSOR FOR
Select ProductId from tblProductSales

-- Open statement executes the SELECT statement
-- and populates the result set
Open ProductIdCursor

-- Fetch the first row from the result set into the variable
Fetch Next from ProductIdCursor into @ProductId

-- As long as the last fetch succeeded, @@FETCH_STATUS will be ZERO
While(@@FETCH_STATUS = 0)
Begin
    Declare @ProductName nvarchar(50)
    Select @ProductName = Name from tblProducts where Id = @ProductId

    if(@ProductName = 'Product - 55')
    Begin
        Update tblProductSales set UnitPrice = 55 where ProductId = @ProductId
    End
    else if(@ProductName = 'Product - 65')
    Begin
        Update tblProductSales set UnitPrice = 65 where ProductId = @ProductId
    End
    else if(@ProductName like 'Product - 100%')
    Begin
        Update tblProductSales set UnitPrice = 1000 where ProductId = @ProductId
    End

    -- Fetch the next row; the loop ends when no rows are left
    Fetch Next from ProductIdCursor into @ProductId
End

-- Release the row set
CLOSE ProductIdCursor
-- Deallocate the resources associated with the cursor
DEALLOCATE ProductIdCursor

The cursor loops through each row in the tblProductSales table. As there are 600,000 rows to be
processed on a row-by-row basis, it takes around 40 to 45 seconds on my machine. We can
achieve the same result very easily using a join, which significantly improves performance. We
will discuss this in our next video session.

To check if the rows have been correctly updated, please use the following query.
Select Name, UnitPrice
from tblProducts join
tblProductSales on tblProducts.Id = tblProductSales.ProductId
where (Name='Product - 55' or Name='Product - 65' or Name like 'Product - 100%')

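The same kind of update can be written as a single set-based statement that joins the two tables.
Note that this version sets different prices (155, 165 and 10001), so its effect can be told apart
from the cursor run.
Update tblProductSales
set UnitPrice =
Case
When Name = 'Product - 55' Then 155
When Name = 'Product - 65' Then 165
When Name like 'Product - 100%' Then 10001
End
from tblProductSales
join tblProducts
on tblProducts.Id = tblProductSales.ProductId
Where Name = 'Product - 55' or Name = 'Product - 65' or
Name like 'Product - 100%'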

When I executed this query on my machine, it took less than a second, whereas the same
kind of update using a cursor took 45 seconds. That shows how much impact cursors can have on
performance. Cursors should be used as your last option. Most of the time cursors can be
very easily replaced using joins.

To check the result of the UPDATE statement, use the following query.
Select Name, UnitPrice from
tblProducts join
tblProductSales on tblProducts.Id = tblProductSales.ProductId
where (Name='Product - 55' or Name='Product - 65' or
Name like 'Product - 100%')

Merge in SQL Server


What is the use of MERGE statement in SQL Server
The MERGE statement, introduced in SQL Server 2008, allows us to perform Inserts, Updates and
Deletes in one statement. This means we no longer have to use multiple statements for
performing Insert, Update and Delete.

With the merge statement we require 2 tables
1. Source Table - Contains the changes that need to be applied to the target table
2. Target Table - The table that requires changes (Inserts, Updates and Deletes)

The merge statement joins the target table to the source table by using a common column
in both the tables. Based on how the rows match up as a result of the join, we can then
perform insert, update, and delete on the target table.

Merge statement syntax


MERGE [TARGET] AS T
USING [SOURCE] AS S
ON [JOIN_CONDITIONS]
WHEN MATCHED THEN
[UPDATE STATEMENT]
WHEN NOT MATCHED BY TARGET THEN
[INSERT STATEMENT]
WHEN NOT MATCHED BY SOURCE THEN
[DELETE STATEMENT]

Example 1 : In the example below, INSERT, UPDATE and DELETE are all performed in one
statement
1. When matching rows are found, the StudentTarget table is UPDATED (i.e. WHEN MATCHED)
2. When rows are present in the StudentSource table but not in the StudentTarget table, those
rows are INSERTED into the StudentTarget table (i.e. WHEN NOT MATCHED BY TARGET)
3. When rows are present in the StudentTarget table but not in the StudentSource table, those
rows are DELETED from the StudentTarget table (i.e. WHEN NOT MATCHED BY SOURCE)
Create table StudentSource
(
ID int primary key,
Name nvarchar(20)
)
GO

Insert into StudentSource values (1, 'Mike')
Insert into StudentSource values (2, 'Sara')
GO

Create table StudentTarget
(
ID int primary key,
Name nvarchar(20)
)
GO

Insert into StudentTarget values (1, 'Mike M')
Insert into StudentTarget values (3, 'John')
GO

MERGE StudentTarget AS T
USING StudentSource AS S
ON T.ID = S.ID
WHEN MATCHED THEN
UPDATE SET T.NAME = S.NAME
WHEN NOT MATCHED BY TARGET THEN
INSERT (ID, NAME) VALUES(S.ID, S.NAME)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;

Please Note : Merge statement should end with a semicolon, otherwise you would get an
error stating - A MERGE statement must be terminated by a semi-colon (;)
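
To see which action MERGE took for each row, an OUTPUT clause can be added to the statement. The sketch below is a variation of the Example 1 MERGE and assumes the two tables are in their freshly populated state (that is, before the MERGE above has been run). The column aliases are made up for this illustration.

```sql
MERGE StudentTarget AS T
USING StudentSource AS S
ON T.ID = S.ID
WHEN MATCHED THEN
    UPDATE SET T.NAME = S.NAME
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ID, NAME) VALUES (S.ID, S.NAME)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- $action reports INSERT, UPDATE or DELETE for each affected row
OUTPUT $action AS MergeAction,
       inserted.ID AS NewId, inserted.Name AS NewName,
       deleted.ID AS OldId, deleted.Name AS OldName;

-- With the sample data (row order is not guaranteed):
--   UPDATE  NewId 1 'Mike'   OldId 1 'Mike M'
--   INSERT  NewId 2 'Sara'   OldId NULL
--   DELETE  NewId NULL       OldId 3 'John'
```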

In practice we mostly perform INSERTs and UPDATEs. The rows that are present in the target
table but not in the source table are usually not deleted from the target table.

Example 2 : In the example below, only INSERT and UPDATE are performed. We are not
deleting the rows that are present in the target table but not in the source table.
Truncate table StudentSource
Truncate table StudentTarget
GO

Insert into StudentSource values (1, 'Mike')
Insert into StudentSource values (2, 'Sara')
GO

Insert into StudentTarget values (1, 'Mike M')
Insert into StudentTarget values (3, 'John')
GO

MERGE StudentTarget AS T
USING StudentSource AS S
ON T.ID = S.ID
WHEN MATCHED THEN
UPDATE SET T.NAME = S.NAME
WHEN NOT MATCHED BY TARGET THEN
INSERT (ID, NAME) VALUES(S.ID, S.NAME);

Union, Intersect and Except in Sql Server


The following diagram explains the difference graphically

UNION operator returns all the unique rows from both the left and the right query. UNION
ALL includes the duplicates as well.
INTERSECT operator retrieves the common unique rows from both the left and the right
query.
EXCEPT operator returns unique rows from the left query that aren’t in the right query’s
results.

Let us understand these differences with examples. We will use the following 2 tables for
the examples.
SQL Script to create the tables
Create Table TableA
(
Id int,
Name nvarchar(50),
Gender nvarchar(10)
)
Go
Insert into TableA values (1, 'Mark', 'Male')
Insert into TableA values (2, 'Mary', 'Female')
Insert into TableA values (3, 'Steve', 'Male')
Insert into TableA values (3, 'Steve', 'Male')
Go
Create Table TableB
(
Id int primary key,
Name nvarchar(50),
Gender nvarchar(10)
)
Go

Insert into TableB values (2, 'Mary', 'Female')
Insert into TableB values (3, 'Steve', 'Male')
Insert into TableB values (4, 'John', 'Male')
Go

UNION operator returns all the unique rows from both the queries. Notice the duplicates
are removed.
Select Id, Name, Gender from TableA
UNION
Select Id, Name, Gender from TableB

Result : 4 rows (1 Mark, 2 Mary, 3 Steve, 4 John). Mary and Steve each appear only once.

UNION ALL operator returns all the rows from both the queries, including the duplicates.
Select Id, Name, Gender from TableA
UNION ALL
Select Id, Name, Gender from TableB

Result : 7 rows. Mary appears twice and Steve three times.

INTERSECT operator retrieves the common unique rows from both the left and the right
query. Notice the duplicates are removed.
Select Id, Name, Gender from TableA
INTERSECT
Select Id, Name, Gender from TableB
Result : 2 rows (2 Mary, 3 Steve).

EXCEPT operator returns unique rows from the left query that aren’t in the right query’s
results.
Select Id, Name, Gender from TableA
EXCEPT
Select Id, Name, Gender from TableB
Result : 1 row (1 Mark).

For all these operators to work, the following 2 conditions must be met
1. The number and the order of the columns must be the same in both queries
2. The data types must be the same or at least compatible
For example, if the number of columns is different, you will get the following error
Msg 205, Level 16, State 1, Line 1
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal
number of expressions in their target lists.
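
A small sketch of how the column-count error arises and one way to work around it, using TableA and TableB from above. Padding the shorter select list with a typed NULL is only one option; which placeholder makes sense depends on what the combined result is meant to show.

```sql
-- Fails with Msg 205: the first query has 3 columns, the second has 2.
-- Select Id, Name, Gender from TableA
-- UNION
-- Select Id, Name from TableB

-- Works: a NULL placeholder, cast to a compatible type, pads the second query
-- so that both select lists have 3 columns in the same order.
Select Id, Name, Gender from TableA
UNION
Select Id, Name, CAST(NULL AS nvarchar(10)) as Gender from TableB
```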

Cross Apply and Outer Apply in Sql Server

We will use the following 2 tables for examples in this demo


SQL Script to create the tables and populate with test data
Create table Department
(
Id int primary key,
DepartmentName nvarchar(50)
)
Go

Insert into Department values (1, 'IT')
Insert into Department values (2, 'HR')
Insert into Department values (3, 'Payroll')
Insert into Department values (4, 'Administration')
Insert into Department values (5, 'Sales')
Go

Create table Employee
(
Id int primary key,
Name nvarchar(50),
Gender nvarchar(10),
Salary int,
DepartmentId int foreign key references Department(Id)
)
Go

Insert into Employee values (1, 'Mark', 'Male', 50000, 1)
Insert into Employee values (2, 'Mary', 'Female', 60000, 3)
Insert into Employee values (3, 'Steve', 'Male', 45000, 2)
Insert into Employee values (4, 'John', 'Male', 56000, 1)
Insert into Employee values (5, 'Sara', 'Female', 39000, 2)
Go

We want to retrieve all the matching rows between Department and Employee tables.

This can be very easily achieved using an Inner Join as shown below.
Select D.DepartmentName, E.Name, E.Gender, E.Salary
from Department D
Inner Join Employee E
On D.Id = E.DepartmentId

Now suppose we want to retrieve all the matching rows between the Department and Employee
tables + the non-matching rows from the LEFT table (Department).
This can be very easily achieved using a Left Join as shown below.
Select D.DepartmentName, E.Name, E.Gender, E.Salary
from Department D
Left Join Employee E
On D.Id = E.DepartmentId

Now let's assume we do not have access to the Employee table. Instead we have access to
the following Table Valued function, that returns all employees belonging to a department
by Department Id.

Create function fn_GetEmployeesByDepartmentId(@DepartmentId int)
Returns Table
as
Return
(
Select Id, Name, Gender, Salary, DepartmentId
from Employee where DepartmentId = @DepartmentId
)
Go

The following query returns the employees of the department with Id =1.
Select * from fn_GetEmployeesByDepartmentId(1)

Now if you try to perform an Inner or Left join between Department table
and fn_GetEmployeesByDepartmentId() function you will get an error.

Select D.DepartmentName, E.Name, E.Gender, E.Salary
from Department D
Inner Join fn_GetEmployeesByDepartmentId(D.Id) E
On D.Id = E.DepartmentId

If you execute the above query you will get the following error
Msg 4104, Level 16, State 1, Line 3
The multi-part identifier "D.Id" could not be bound.

This is where we use Cross Apply and Outer Apply operators. Cross Apply is semantically
equivalent to Inner Join and Outer Apply is semantically equivalent to Left Outer Join.

Just like Inner Join, Cross Apply retrieves only the matching rows from the Department table
and fn_GetEmployeesByDepartmentId() table valued function.

Select D.DepartmentName, E.Name, E.Gender, E.Salary
from Department D
Cross Apply fn_GetEmployeesByDepartmentId(D.Id) E

Just like Left Outer Join, Outer Apply retrieves all matching rows from the Department table
and fn_GetEmployeesByDepartmentId() table valued function + non-matching rows from
the left table (Department)

Select D.DepartmentName, E.Name, E.Gender, E.Salary
from Department D
Outer Apply fn_GetEmployeesByDepartmentId(D.Id) E

How do Cross Apply and Outer Apply work

The APPLY operator, introduced in SQL Server 2005, is used to join a table to a table-valued
function.
The table-valued function on the right-hand side of the APPLY operator gets called for each
row from the left table (also called the outer table).
Cross Apply returns only matching rows (semantically equivalent to Inner Join)
Outer Apply returns matching + non-matching rows (semantically equivalent to Left Outer
Join). The unmatched columns of the table-valued function will be set to NULL.
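
APPLY is not limited to table-valued functions. The right-hand side can also be a derived table that refers to columns of the left table. As a sketch, the query below uses the Department and Employee tables from this section to pick the highest-paid employee in each department, assuming we do have access to the Employee table.

```sql
-- For each department row, the derived table is evaluated with that row's D.Id.
Select D.DepartmentName, E.Name, E.Salary
from Department D
Cross Apply
(
    Select top 1 Name, Salary
    from Employee
    where DepartmentId = D.Id
    order by Salary desc
) E

-- With the sample data this returns IT / John / 56000, HR / Steve / 45000
-- and Payroll / Mary / 60000. Administration and Sales have no employees,
-- so Cross Apply leaves them out; Outer Apply would return them with NULLs.
```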

RANK, DENSE_RANK and ROW_NUMBER functions in SQL Server

Similarities between RANK, DENSE_RANK and ROW_NUMBER functions

• Each returns an increasing integer value starting at 1, based on the ordering of
rows imposed by the ORDER BY clause (if there are no ties)
• ORDER BY clause is required
• PARTITION BY clause is optional
• When the data is partitioned, the integer value is reset to 1 when the
partition changes
We will use the following Employees table for the examples in this video

SQL Script to create the Employees table


Create Table Employees
(
Id int primary key,
Name nvarchar(50),
Gender nvarchar(10),
Salary int
)
Go

Insert Into Employees Values (1, 'Mark', 'Male', 6000)
Insert Into Employees Values (2, 'John', 'Male', 8000)
Insert Into Employees Values (3, 'Pam', 'Female', 4000)
Insert Into Employees Values (4, 'Sara', 'Female', 5000)
Insert Into Employees Values (5, 'Todd', 'Male', 3000)

Notice that no two employees in the table have the same salary. So all the 3 functions
RANK, DENSE_RANK and ROW_NUMBER produce the same increasing integer value when
ordered by Salary column.

SELECT Name, Salary, Gender,
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNumber,
RANK() OVER (ORDER BY Salary DESC) AS [Rank],
DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank
FROM Employees

You will only see the difference when there are ties (duplicate values in the column used in the
ORDER BY clause).

Now let's include duplicate values for the Salary column.

To do this, first delete the existing data from the Employees table
DELETE FROM Employees

Then insert new rows with duplicate values for the Salary column
Insert Into Employees Values (1, 'Mark', 'Male', 8000)
Insert Into Employees Values (2, 'John', 'Male', 8000)
Insert Into Employees Values (3, 'Pam', 'Female', 8000)
Insert Into Employees Values (4, 'Sara', 'Female', 4000)
Insert Into Employees Values (5, 'Todd', 'Male', 3500)

At this point data in the Employees table should be as shown below

Notice 3 employees have the same salary 8000. When you execute the following query you
can clearly see the difference between RANK, DENSE_RANK and ROW_NUMBER functions.

SELECT Name, Salary, Gender,
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNumber,
RANK() OVER (ORDER BY Salary DESC) AS [Rank],
DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank
FROM Employees

Difference between RANK, DENSE_RANK and ROW_NUMBER functions

✓ ROW_NUMBER : Returns an increasing unique number for each row starting at 1,
even if there are duplicates. The order in which tied rows are numbered is not guaranteed.
✓ RANK : Returns an increasing number for each row starting at 1. When there
are duplicates, the same rank is assigned to all the duplicate rows, but the next row after
the duplicate rows gets the rank it would have been assigned if there had been
no duplicates. So the RANK function skips rankings if there are duplicates. With the data
above, the three employees earning 8000 get rank 1, Sara gets 4 and Todd gets 5.
✓ DENSE_RANK : Returns an increasing number for each row starting at 1.
When there are duplicates, the same rank is assigned to all the duplicate rows, but the
DENSE_RANK function does not skip any ranks. This means the next row after the
duplicate rows gets the next rank in the sequence. With the data above, the three
employees earning 8000 get 1, Sara gets 2 and Todd gets 3.
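
The queries above use only ORDER BY. As a sketch of the optional PARTITION BY clause, the query below restarts the numbering for each Gender, using the Employees rows with duplicate salaries inserted earlier.

```sql
-- The numbering restarts at 1 whenever the Gender partition changes.
SELECT Name, Gender, Salary,
    ROW_NUMBER() OVER (PARTITION BY Gender ORDER BY Salary DESC) AS RowNumber,
    RANK()       OVER (PARTITION BY Gender ORDER BY Salary DESC) AS [Rank],
    DENSE_RANK() OVER (PARTITION BY Gender ORDER BY Salary DESC) AS DenseRank
FROM Employees

-- Female partition: Pam (8000) gets 1, 1, 1 and Sara (4000) gets 2, 2, 2.
-- Male partition: Mark and John (8000) both get Rank 1 and DenseRank 1,
-- with row numbers 1 and 2 in either order; Todd (3500) gets
-- RowNumber 3, Rank 3 and DenseRank 2.
```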
