KUDVENKAT SQL SERVER TRANSCRIPT-Part1
2. Next you need to specify the Server Name. Here we can specify the name of the server or its IP
address. If you have SQL Server installed on your local machine, you can specify (local), just .
(period), or 127.0.0.1
Server name = (local)
3. Now select Authentication. The options available here depend on how you installed SQL
Server. If you chose mixed mode authentication during installation, you will have both
Windows Authentication and SQL Server Authentication. Otherwise, you will only be able to
connect using Windows Authentication.
4. If you have chosen Windows Authentication, you don't have to enter a user name and password;
otherwise, enter the user name and password and click Connect.
You should now be connected to SQL Server. Now, click on New Query, on the top left hand
corner of SSMS. This should open a new query editor window, where we can type sql queries and
execute.
SSMS is a client tool and not the server itself. Usually the database server (SQL Server) will be on
a dedicated machine, and developers connect to the server using SSMS from their respective
local (development) computers.
Developer Machines 1, 2, 3 and 4 connect to the database server using SSMS.
Whether you create a database graphically using the designer or using a query, the following 2
files get generated.
.MDF file - Data File (Contains actual data)
.LDF file - Transaction Log file (Used to recover the database)
You cannot drop a database, if it is currently in use. You get an error stating - Cannot drop database
"NewDatabaseName" because it is currently in use. So, if other users are connected, you need to put the
database in single user mode and then drop the database.
Alter Database DatabaseName Set SINGLE_USER With Rollback Immediate
The With Rollback Immediate option rolls back all incomplete transactions and closes the connections to
the database.
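Put together, the drop sequence could be sketched as follows. SampleDB is a placeholder database name used only for illustration; it does not come from the video.

```sql
-- Assumption: SampleDB is a hypothetical database name.
-- Switch to master first, so this session is not itself using SampleDB.
Use master
Go
-- Roll back incomplete transactions and disconnect the other users.
Alter Database SampleDB Set SINGLE_USER With Rollback Immediate
Go
Drop Database SampleDB
Go
```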
The following statement creates tblGender table, with ID and Gender columns. ID column is the primary
key column. The primary key is used to uniquely identify each row in a table. Primary key does not
allow nulls.
Create Table tblGender
(ID int Not Null Primary Key,
Gender nvarchar(50))
In tblPerson table, GenderID is the foreign key referencing ID column in tblGender table. Foreign key
references can be added graphically using SSMS or using a query.
Foreign keys are used to enforce database integrity. In layman's terms, a foreign key in one table
points to a primary key in another table. The foreign key constraint prevents invalid data from being
inserted into the foreign key column. The values that you enter into the foreign key column have to be
among the values contained in the column it points to.
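Adding the foreign key using a query might look like the sketch below. It assumes tblPerson already exists with an int GenderId column; the constraint name is only a suggested naming pattern.

```sql
-- Assumption: tblPerson has an int GenderId column and no foreign key on it yet.
-- FK_tblPerson_GenderId is an illustrative constraint name.
Alter Table tblPerson
Add Constraint FK_tblPerson_GenderId
Foreign Key (GenderId) References tblGender(ID)
```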
In this video, we will learn adding a Default Constraint. A column default can be specified using a Default
constraint. The default constraint is used to insert a default value into a column. The default value is
added to every new record for which no value is specified for that column. An explicitly supplied NULL
counts as a value, so NULL is inserted instead of the default.
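A default constraint of this kind can be added to an existing column along these lines. The constraint name is an assumption; the default of 1 for GenderId matches the value used in this part.

```sql
-- Assumption: DF_tblPerson_GenderId is an illustrative constraint name.
-- New rows that omit GenderId will get the value 1.
Alter Table tblPerson
Add Constraint DF_tblPerson_GenderId
Default 1 For GenderId
```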
The insert statement below does not provide a value for GenderId column, so the default of 1 will be
inserted for this record.
Insert into tblPerson(ID,Name,Email) values(5,'Sam','[email protected]')
On the other hand, the following insert statement will insert NULL, instead of using the default.
Insert into tblPerson(ID,Name,Email,GenderId) values (6,'Dan','[email protected]',NULL)
To drop a constraint
ALTER TABLE { TABLE_NAME }
DROP CONSTRAINT { CONSTRAINT_NAME }
A cascading referential integrity constraint lets you define the actions Microsoft SQL Server should take
when a user attempts to delete or update a key to which existing foreign keys point.
For example, consider the 2 tables shown below. If you delete row with ID = 1 from tblGender table,
then row with ID = 3 from tblPerson table becomes an orphan record. You will not be able to tell the
Gender for this row. So, Cascading referential integrity constraint can be used to define actions Microsoft
SQL Server should take when this happens. By default, we get an error and the DELETE or UPDATE
statement is rolled back.
However, you have the following options when setting up Cascading referential integrity
constraint
1. No Action: This is the default behaviour. No Action specifies that if an attempt is made to delete or
update a row with a key referenced by foreign keys in existing rows in other tables, an error is raised and
the DELETE or UPDATE is rolled back.
2. Cascade: Specifies that if an attempt is made to delete or update a row with a key referenced by
foreign keys in existing rows in other tables, all rows containing those foreign keys are also deleted or
updated.
3. Set NULL: Specifies that if an attempt is made to delete or update a row with a key referenced by
foreign keys in existing rows in other tables, all rows containing those foreign keys are set to NULL.
4. Set Default: Specifies that if an attempt is made to delete or update a row with a key referenced by
foreign keys in existing rows in other tables, all rows containing those foreign keys are set to default
values.
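As a sketch, these options are specified with ON DELETE and ON UPDATE when the foreign key is created. The constraint name below is an assumption, and the example assumes there is no existing foreign key on GenderId.

```sql
-- Assumption: no foreign key exists yet on tblPerson.GenderId.
-- On Delete Set Default relies on GenderId having a default value
-- that is present in tblGender.
Alter Table tblPerson
Add Constraint FK_tblPerson_GenderId
Foreign Key (GenderId) References tblGender(ID)
On Delete Set Default   -- alternatives: No Action, Cascade, Set Null
On Update Cascade
```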
CHECK constraint is used to limit the range of the values, that can be entered for a column.
Let's say, we have an integer AGE column, in a table. The AGE in general cannot be less than ZERO and
at the same time cannot be greater than 150. But, since AGE is an integer column it can accept negative
values and values much greater than 150.
So, to limit the values, that can be added, we can use CHECK constraint. In SQL Server, CHECK
constraint can be created graphically, or using a query.
The following check constraint restricts Age to values greater than ZERO and less than 150.
ALTER TABLE tblPerson
ADD CONSTRAINT CK_tblPerson_Age CHECK (Age > 0 AND Age < 150)
If the BOOLEAN_EXPRESSION returns true, then the CHECK constraint allows the value, otherwise it
doesn't. Since, AGE is a nullable column, it's possible to pass null for this column, when inserting a row.
When you pass NULL for the AGE column, the boolean expression evaluates to UNKNOWN, and allows
the value.
In the following 2 insert statements, we only supply values for Name column and not for PersonId
column.
Insert into tblPerson values ('Sam')
Insert into tblPerson values ('Sara')
If you select all the rows from tblPerson table, you will see that, 'Sam' and 'Sara' rows have got 1 and 2 as
PersonId.
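For reference, the inserts above presuppose a table whose PersonId is an identity column. A minimal sketch of such a table is shown here; the seed and increment of 1 and the column size are assumptions. Note that this is a different tblPerson from the one used in the earlier parts.

```sql
-- Assumption: seed 1, increment 1, and nvarchar(20) are illustrative choices.
Create Table tblPerson
(
    PersonId int Identity(1,1) Primary Key,
    Name nvarchar(20)
)
```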
Now, if I try to execute the following query, I get an error stating - An explicit value for the identity column
in table 'tblPerson' can only be specified when a column list is used and IDENTITY_INSERT is ON.
Insert into tblPerson values (1,'Todd')
So if you mark a column as an Identity column, you don't have to explicitly supply a value for that column
when you insert a new row. The value is automatically calculated and provided by SQL Server. So, to
insert a row into tblPerson table, just provide a value for the Name column.
Insert into tblPerson values ('Todd')
Delete the row, that you have just inserted and insert another row. You see that the value for PersonId is
2. Now if you insert another row, PersonId is 3. A record with PersonId = 1, does not exist, and I want to
fill this gap. To do this, we should be able to explicitly supply the value for identity column. To explicitly
supply a value for identity column
1. First turn on identity insert - SET Identity_Insert tblPerson ON
2. In the insert query specify the column list
Insert into tblPerson(PersonId, Name) values(1, 'John')
As long as the Identity_Insert is turned on for a table, you need to explicitly provide the value for that
column. If you don't provide the value, you get an error - Explicit value must be specified for identity
column in table 'tblPerson1' either when IDENTITY_INSERT is set to ON or when a replication user is
inserting into a NOT FOR REPLICATION identity column.
After, you have the gaps in the identity column filled, and if you wish SQL server to calculate the value,
turn off Identity_Insert.
SET Identity_Insert tblPerson OFF
If you have deleted all the rows in a table, and you want to reset the identity column value, use DBCC
CHECKIDENT command. This command will reset PersonId identity column.
DBCC CHECKIDENT(tblPerson, RESEED, 0)
How to get the last generated identity column value in SQL Server - Part 8
From the previous session, we understood that identity column values are auto generated. There are
several ways in sql server, to retrieve the last identity value that is generated. The most common way is to
use SCOPE_IDENTITY() built in function.
SCOPE_IDENTITY() returns the last identity value that is created in the same session (Connection) and
in the same scope (in the same Stored procedure, function, trigger). Let's say, I have 2 tables tblPerson1
and tblPerson2, and I have a trigger on tblPerson1 table, which will insert a record into tblPerson2 table.
Now, when you insert a record into tblPerson1 table, SCOPE_IDENTITY() returns the identity value
that is generated in tblPerson1 table, whereas @@IDENTITY returns the value that is generated in
tblPerson2 table. So, @@IDENTITY returns the last identity value that is created in the same session
without any consideration to the scope. IDENT_CURRENT('tblPerson') returns the last identity value
created for a specific table across any session and any scope.
In brief:
SCOPE_IDENTITY() - returns the last identity value that is created in the same session and in the same
scope.
@@IDENTITY - returns the last identity value that is created in the same session and across any scope.
IDENT_CURRENT('TableName') - returns the last identity value that is created for a specific table across
any session and any scope.
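The three can be compared side by side right after an insert. This sketch assumes tblPerson1 has an identity column plus a single Name column, and that the trigger described above inserts into tblPerson2.

```sql
-- Assumption: tblPerson1(identity column, Name) and a trigger that inserts into tblPerson2.
Insert into tblPerson1 values ('Sam')

Select SCOPE_IDENTITY() as ScopeIdentity,              -- same session, same scope
       @@IDENTITY as SessionIdentity,                  -- same session, any scope
       IDENT_CURRENT('tblPerson2') as TableIdentity    -- given table, any session
```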
Both primary key and unique key are used to enforce, the uniqueness of a column. So, when do
you choose one over the other?
A table can have, only one primary key. If you want to enforce uniqueness on 2 or more columns, then we
use unique key constraint.
What is the difference between Primary key constraint and Unique key constraint? This question
is asked very frequently in interviews.
1. A table can have only one primary key, but more than one unique key
2. Primary key does not allow nulls, where as unique key allows one null
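A unique key constraint can be added using a query as sketched below. It assumes tblPerson has an Email column; the constraint name is illustrative.

```sql
-- Assumption: tblPerson has an Email column; UQ_tblPerson_Email is an illustrative name.
Alter Table tblPerson
Add Constraint UQ_tblPerson_Email Unique (Email)
```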
If you want to select all the columns, you can also use *. For better performance use the column
list, instead of using *.
SELECT *
FROM Table_Name
Group By - Part 11
In SQL Server we have a lot of aggregate functions. Examples:
1. Count()
2. Sum()
3. avg()
4. Min()
5. Max()
The Group by clause is used to group a selected set of rows into a set of summary rows by the values of
one or more columns or expressions. It is typically used in conjunction with one or more aggregate
functions.
I want an sql query, which gives total salaries paid by City. The output should be as shown below.
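A query along these lines would give the total salaries by city. It is a sketch that uses the tblEmployee table and the City and Salary columns that appear in the other queries of this part.

```sql
-- One summary row per City, with the salaries in that city added up.
Select City, SUM(Salary) as TotalSalary
from tblEmployee
Group by City
```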
Note: If you omit, the group by clause and try to execute the query, you get an error - Column
'tblEmployee.City' is invalid in the select list because it is not contained in either an aggregate function or
the GROUP BY clause.
Now, I want an sql query, which gives total salaries by City, by gender. The output should be as shown
below.
Query for retrieving total salaries by city and by gender: It's possible to group by multiple columns. In
this query, we are grouping first by city and then by gender.
Select City, Gender, SUM(Salary) as TotalSalary
from tblEmployee
group by City, Gender
Now, I want an sql query, which gives total salaries and total number of employees by City, and by
gender. The output should be as shown below.
Query for retrieving total salaries and total number of employees by City, and by gender: The only
difference here is that, we are using Count() aggregate function.
Select City, Gender, SUM(Salary) as TotalSalary,
COUNT(ID) as TotalEmployees
from tblEmployee
group by City, Gender
Filtering Groups:
The WHERE clause is used to filter rows before aggregation, whereas the HAVING clause is used to filter
groups after aggregation. The following 2 queries produce the same result.
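Filtering rows using WHERE clause, before aggregations take place:
Select City, SUM(Salary) as TotalSalary
from tblEmployee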
Where City = 'London'
group by City
Filtering groups using HAVING clause, after all aggregations take place:
Select City, SUM(Salary) as TotalSalary
from tblEmployee
group by City
Having City = 'London'
From a performance standpoint, you cannot say that one method is less efficient than the other. Sql
server optimizer analyzes each statement and selects an efficient way of executing it. As a best practice,
use the syntax that clearly describes the desired result. Try to eliminate rows that
you wouldn't need, as early as possible.
Joins in sql server - Part 12
Joins in SQL server are used to query (retrieve) data from 2 or more related tables. In general tables are
related to each other using foreign key constraints.
Please watch Parts 3 and 5 in this video series, before continuing with this video.
Part 3 - Creating and working with tables
Part 5 - Cascading referential integrity constraint
Now let's understand all the JOIN types, with examples and the differences between them.
Employee Table (tblEmployee)
DepartmentName nvarchar(50),
Location nvarchar(50),
DepartmentHead nvarchar(50)
Go
Go
Name nvarchar(50),
Gender nvarchar(50),
Salary int,
Go
Go
CROSS JOIN
CROSS JOIN, produces the cartesian product of the 2 tables involved in the join. For example, in the
Employees table we have 10 rows and in the Departments table we have 4 rows. So, a cross join
between these 2 tables produces 40 rows. Cross Join shouldn't have ON clause.
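A cross join between the two tables could be written as below. The sketch reuses the column names from the other join queries in this series and has no ON clause.

```sql
-- Every employee row is paired with every department row.
Select Name, Gender, Salary, DepartmentName
from tblEmployee
Cross Join tblDepartment
```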
OR
Note: JOIN and INNER JOIN mean the same thing. It's always better to use INNER JOIN, as this explicitly
specifies your intention.
If you look at the output, we got only 8 rows, but in the Employees table, we have 10 rows. We didn't get
JAMES and RUSSELL records. This is because the DEPARTMENTID, in Employees table is NULL for
these two employees and doesn't match with ID column in Departments table.
So, in summary, INNER JOIN, returns only the matching rows between both the tables. Non matching
rows are eliminated.
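An inner join of this kind might look like the following sketch, which uses the same tables, aliases and join columns as the queries in Part 13.

```sql
-- Only employees whose DepartmentId matches a department Id are returned.
Select Name, Gender, Salary, DepartmentName
from tblEmployee E
Inner Join tblDepartment D
On E.DepartmentId = D.Id
```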
OR
Note: You can use LEFT JOIN or LEFT OUTER JOIN. The OUTER keyword is optional.
LEFT JOIN, returns all the matching rows + non matching rows from the left table. In reality, INNER JOIN
and LEFT JOIN are extensively used.
OR
Note: You can use RIGHT JOIN or RIGHT OUTER JOIN. The OUTER keyword is optional.
RIGHT JOIN, returns all the matching rows + non matching rows from the right table.
OR
Note: You can use FULL JOIN or FULL OUTER JOIN. The OUTER keyword is optional.
FULL JOIN, returns all rows from both the left and right tables, including the non matching rows.
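The outer joins differ only in the join keyword. As a sketch using the same tables and join columns as the Part 13 queries:

```sql
-- Replace Full Join with Left Join or Right Join to get the other outer joins.
Select Name, Gender, Salary, DepartmentName
from tblEmployee E
Full Join tblDepartment D
On E.DepartmentId = D.Id
```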
Joins Summary
Before watching this video, please watch Part 12 - Joins in SQL Server
How to retrieve only the non matching rows from the left table. The output should be as shown
below:
Query:
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee E
LEFT JOIN tblDepartment D
ON E.DepartmentId = D.Id
WHERE D.Id IS NULL
How to retrieve only the non matching rows from the right table
Query:
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee E
RIGHT JOIN tblDepartment D
ON E.DepartmentId = D.Id
WHERE E.DepartmentId IS NULL
How to retrieve only the non matching rows from both the left and right table. Matching rows
should be eliminated.
Query:
SELECT Name, Gender, Salary, DepartmentName
FROM tblEmployee E
FULL JOIN tblDepartment D
ON E.DepartmentId = D.Id
WHERE E.DepartmentId IS NULL
OR D.Id IS NULL
In parts 12 and 13, we have seen joining 2 different tables - tblEmployees and tblDepartments. Have
you ever thought of a need to join a table with itself? Consider tblEmployees table shown below.
In short, joining a table with itself is called a SELF JOIN. SELF JOIN is not a different type of JOIN. It
can be classified under any type of JOIN - INNER, OUTER or CROSS Joins. The above query is a LEFT
OUTER SELF Join.
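A left outer self join of this kind can be sketched as follows. The ManagerID and EmployeeID column names are taken from the ISNULL() query in Part 15; E and M are two aliases for the same table.

```sql
-- E plays the role of the employee, M the role of that employee's manager.
-- Employees without a manager are kept, with NULL in the Manager column.
Select E.Name as Employee, M.Name as Manager
from tblEmployee E
Left Join tblEmployee M
On E.ManagerID = M.EmployeeID
```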
from tblEmployee
Cross Join tblEmployee
group by City
Having City = 'London'
In Part 14, we have learnt writing a LEFT OUTER SELF JOIN query, which produced the following output.
In the output, MANAGER column, for Todd's rows is NULL. I want to replace the NULL value, with 'No
Manager'
Replacing NULL value using ISNULL() function: We are passing 2 parameters to IsNULL() function. If
M.Name returns NULL, then 'No Manager' string is used as the replacement value.
SELECT E.Name as Employee, ISNULL(M.Name,'No Manager') as Manager
FROM tblEmployee E
LEFT JOIN tblEmployee M
ON E.ManagerID = M.EmployeeID
Replacing NULL value using COALESCE() function: COALESCE() function, returns the first NON
NULL value.
SELECT E.Name as Employee, COALESCE(M.Name, 'No Manager') as Manager
FROM tblEmployee E
LEFT JOIN tblEmployee M
ON E.ManagerID = M.EmployeeID
Consider the Employees Table below. Not all employees have their First, Middle and Last Names filled.
Some of the employees have the First name missing, some have the Middle Name missing and some
the Last name.
Now, let's write a query that returns the Name of the Employee. If an employee, has all the columns
filled - First, Middle and Last Names, then we only want the first name.
If the FirstName is NULL, and if Middle and Last Names are filled then, we only want the middle
name. For example, Employee row with Id = 1, has the FirstName filled, so we want to retrieve his
FirstName "Sam". Employee row with Id = 2, has Middle and Last names filled, but the First name is
missing. Here, we want to retrieve his middle name "Todd". In short, The output of the query should be as
shown below.
We are passing FirstName, MiddleName and LastName columns as parameters to the COALESCE()
function. The COALESCE() function returns the first non null value from the 3 columns.
SELECT Id, COALESCE(FirstName, MiddleName, LastName) AS Name
FROM tblEmployee
Note: If you want to see the cost of DISTINCT SORT, you can turn on the estimated query execution plan
using CTRL + L.
Note: For UNION and UNION ALL to work, the Number, Data types, and the order of the columns in the
select statements should be same.
If you want to sort the results of UNION or UNION ALL, the ORDER BY clause should be used on
the last SELECT statement as shown below.
Select Id, Name, Email from tblIndiaCustomers
UNION ALL
Select Id, Name, Email from tblUKCustomers
UNION ALL
Select Id, Name, Email from tblUSCustomers
Order by Name
There are several advantages of using stored procedures, which we will discuss in a later video session.
In this session, we will learn how to create, execute, change and delete stored procedures.
Creating a simple stored procedure without any parameters: This stored procedure, retrieves Name
and Gender of all the employees. To create a stored procedure we use, CREATE PROCEDURE or
CREATE PROC statement.
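A procedure matching that description could be sketched like this. The name spGetEmployees is the one used in the execution examples later in this part; the body is an assumption based on the description.

```sql
-- Assumption: tblEmployee has Name and Gender columns.
Create Procedure spGetEmployees
as
Begin
    Select Name, Gender from tblEmployee
End
```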
Note: When naming user defined stored procedures, Microsoft recommends not using "sp_" as a prefix.
All system stored procedures are prefixed with "sp_". Avoiding the prefix prevents ambiguity between
user defined and system stored procedures, and conflicts with some future system procedure.
To execute the stored procedure, you can just type the procedure name and press F5, or use EXEC or
EXECUTE keywords followed by the procedure name as shown below.
1. spGetEmployees
2. EXEC spGetEmployees
3. Execute spGetEmployees
Note: You can also right click on the procedure name, in object explorer in SQL Server Management
Studio and select EXECUTE STORED PROCEDURE.
Creating a stored procedure with input parameters: This SP, accepts GENDER and
DEPARTMENTID parameters. Parameters and variables have an @ prefix in their name.
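One way such a procedure could be written is shown below. The parameter sizes and the selected columns are assumptions; the procedure and parameter names are the ones used in the execution examples that follow.

```sql
-- Assumption: tblEmployee has Name, Gender and DepartmentId columns.
Create Procedure spGetEmployeesByGenderAndDepartment
    @Gender nvarchar(20),
    @DepartmentId int
as
Begin
    Select Name, Gender, DepartmentId
    from tblEmployee
    Where Gender = @Gender and DepartmentId = @DepartmentId
End
```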
To invoke this procedure, we need to pass the value for @Gender and @DepartmentId parameters. If
you don't specify the name of the parameters, you have to first pass value for @Gender parameter and
then for @DepartmentId.
EXECUTE spGetEmployeesByGenderAndDepartment 'Male', 1
On the other hand, if you change the order, you will get an error stating "Error converting data type
varchar to int." This is because, the value of "Male" is passed into @DepartmentId parameter. Since
@DepartmentId is an integer, we get the type conversion error.
spGetEmployeesByGenderAndDepartment 1, 'Male'
When you specify the names of the parameters when executing the stored procedure the order doesn't
matter.
EXECUTE spGetEmployeesByGenderAndDepartment @DepartmentId=1, @Gender = 'Male'
To encrypt the text of the SP, use WITH ENCRYPTION option. Once, encrypted, you cannot view the
text of the procedure, using sp_helptext system stored procedure. There are ways to obtain the original
text, which we will talk about in a later session.
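As a sketch, the option goes between the procedure name (or its parameter list) and the AS keyword. The example assumes a procedure named spGetEmployees already exists and simply re-creates its body with encryption.

```sql
-- Assumption: spGetEmployees already exists and selects Name and Gender.
Alter Procedure spGetEmployees
With Encryption
as
Begin
    Select Name, Gender from tblEmployee
End
```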
To delete the SP, use DROP PROC SPName or DROP PROCEDURE SPName, where SPName is the name of
the procedure.
In the next session, we will learn creating stored procedures with OUTPUT parameters.
To create an SP with output parameter, we use the keywords OUT or OUTPUT. @EmployeeCount is
an OUTPUT parameter. Notice, it is specified with OUTPUT keyword.
Create Procedure spGetEmployeeCountByGender
@Gender nvarchar(20),
@EmployeeCount int Output
as
Begin
Select @EmployeeCount = COUNT(Id)
from tblEmployee
where Gender = @Gender
End
1. First initialise a variable of the same datatype as that of the output parameter. We have declared
@EmployeeTotal integer variable.
2. Then pass the @EmployeeTotal variable to the SP. You have to specify the OUTPUT keyword. If you
don't specify the OUTPUT keyword, the variable will be NULL.
3. Execute
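The three steps above can be sketched as follows. The 'Female' argument is only an example value for @Gender.

```sql
-- Step 1: declare a variable of the same datatype as the output parameter.
Declare @EmployeeTotal int
-- Steps 2 and 3: pass the variable with the OUTPUT keyword and execute.
Execute spGetEmployeeCountByGender 'Female', @EmployeeTotal Output
Print @EmployeeTotal
```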
If you don't specify the OUTPUT keyword, when executing the stored procedure, the @EmployeeTotal
variable will be NULL. Here, we have not specified OUTPUT keyword. When you execute, you will see
'@EmployeeTotal is null' printed.
You can pass parameters in any order, when you use the parameter names. Here, we are first
passing the OUTPUT parameter and then the input @Gender parameter.
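With named parameters, the call might look like this sketch. The 'Male' argument is only an example value.

```sql
Declare @EmployeeTotal int
-- The output parameter is passed first; the names make the order irrelevant.
Execute spGetEmployeeCountByGender @EmployeeCount = @EmployeeTotal Output, @Gender = 'Male'
Print @EmployeeTotal
```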
The following system stored procedures are extremely useful when working with stored procedures.
sp_help SP_Name : View the information about the stored procedure, like parameter names, their
datatypes etc. sp_help can be used with any database object, like tables, views, SP's, triggers etc.
Alternatively, you can also press ALT+F1, when the name of the object is highlighted.
sp_depends SP_Name : View the dependencies of the stored procedure. This system SP is very useful,
especially if you want to check whether there are any stored procedures that reference a table that you
are about to drop. sp_depends can also be used with other database objects like tables etc.
Note: All parameter and variable names in SQL Server need to have the @ symbol as a prefix.
So, from this we understood that, when a stored procedure is executed, it returns an integer status
variable. With this in mind, let's understand the difference between output parameters and RETURN
values. We will use the Employees table below for this purpose.
The following procedure returns total number of employees in the Employees table, using output
parameter - @TotalCount.
Create Procedure spGetTotalCountOfEmployees1
@TotalCount int output
as
Begin
Select @TotalCount = COUNT(ID) from tblEmployee
End
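The same count can also be handed back through the return status. The sketch below assumes a second procedure, named spGetTotalCountOfEmployees2 here for illustration, and shows how the returned integer is captured by the caller.

```sql
-- Assumption: spGetTotalCountOfEmployees2 is an illustrative name.
Create Procedure spGetTotalCountOfEmployees2
as
Begin
    Return (Select COUNT(ID) from tblEmployee)
End
Go

-- Capture the integer return status in a variable.
Declare @TotalEmployees int
Execute @TotalEmployees = spGetTotalCountOfEmployees2
Select @TotalEmployees
```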
So, we are able to achieve what we want, using output parameters as well as return values. Now, let's
look at example, where return status variables cannot be used, but Output parameters can be used.
In this SP, we are retrieving the Name of the employee, based on their Id, using the output
parameter @Name.
Create Procedure spGetNameById1
@Id int,
@Name nvarchar(20) Output
as
Begin
Select @Name = Name from tblEmployee Where Id = @Id
End
Now let's try to achieve the same thing, using return status variables.
Create Procedure spGetNameById2
@Id int
as
Begin
Return (Select Name from tblEmployee Where Id = @Id)
End
Executing spGetNameById2 returns an error stating 'Conversion failed when converting the nvarchar
value 'Sam' to data type int.'. The return status variable is an integer, and hence, when we select the
Name of an employee and try to return it, we get a conversion error.
So, using return values, we can only return integers, and only one integer at that. It is not possible to
return more than one value using return values, whereas output parameters can return any datatype and
an SP can have more than one output parameter. I always prefer using output parameters over
RETURN values.
In general, RETURN values are used to indicate success or failure of a stored procedure, especially when
we are dealing with nested stored procedures. A return value of 0 indicates success, and any nonzero
value indicates failure.
2. Reduces network traffic - You only need to send, EXECUTE SP_Name statement, over the network,
instead of the entire batch of adhoc SQL code.
3. Code reusability and better maintainability - A stored procedure can be reused with multiple
applications. If the logic has to change, we only have one place to change, where as if it is inline sql, and
if you have to use it in multiple applications, we end up with multiple copies of this inline sql. If the logic
has to change, we have to change at all the places, which makes it harder maintaining inline sql.
4. Better Security - A database user can be granted access to an SP and prevent them from executing
direct "select" statements against a table. This is fine grain access control which will help control what
data a user has access to.
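5. Helps avoid SQL Injection attacks - Parameterised SP's help prevent sql injection attacks, provided the
procedure does not itself build and execute dynamic SQL from its parameters. Please watch this video on
SQL Injection Attack, for more information.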
There are several built-in functions. In this video session, we will look at the most common string
functions available.
Note: The while loop will become an infinite loop, if you forget to include the following line.
Set @Number = @Number + 1
Another way of printing lowercase letters using CHAR() and LOWER() functions.
Declare @Number int
Set @Number = 65
While(@Number <= 90)
Begin
Print LOWER(CHAR(@Number))
Set @Number = @Number + 1
End
LTRIM(Character_Expression) - Removes blanks on the left hand side of the given character expression.
Example: Removing the 3 white spaces on the left hand side of the '   Hello' string using LTRIM()
function.
Select LTRIM('   Hello')
Output: Hello
RTRIM(Character_Expression) - Removes blanks on the right hand side of the given character
expression.
Example: Removing the 3 white spaces on the right hand side of the 'Hello   ' string using RTRIM()
function.
Select RTRIM('Hello   ')
Output: Hello
Example: To remove white spaces on either side of the given character expression, use LTRIM() and
RTRIM() as shown below.
Select LTRIM(RTRIM('   Hello   '))
Output: Hello
LEN(String_Expression) - Returns the count of total characters, in the given string expression, excluding
the blanks at the end of the expression.
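A quick illustration of how LEN() treats blanks, using literal strings chosen for this sketch:

```sql
Select LEN('Hello   ')   -- returns 5: the 3 trailing blanks are not counted
Select LEN('   Hello')   -- returns 8: the 3 leading blanks are counted
```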
In the next video session, we will discuss about the rest of the commonly used built-in string functions.
Example: In this example, we get the starting position of '@' character in the email string
'[email protected]'.
Select CHARINDEX('@','[email protected]',1)
Output: 5
SUBSTRING('Expression', 'Start', 'Length') - As the name, suggests, this function returns substring (part
of the string), from the given expression. You specify the starting location using the 'start' parameter and
the number of characters in the substring using 'Length' parameter. All the 3 parameters are mandatory.
Example: Display just the domain part of the given email '[email protected]'.
Select SUBSTRING('[email protected]',6, 7)
Output: bbb.com
In the above example, we have hardcoded the starting position and the length parameters. Instead of
hardcoding we can dynamically retrieve them using CHARINDEX() and LEN() string functions as shown
below.
Example:
Select SUBSTRING('[email protected]',(CHARINDEX('@', '[email protected]') + 1), (LEN('[email protected]') -
CHARINDEX('@','[email protected]')))
Output: bbb.com
Real time example, where we can use LEN(), CHARINDEX() and SUBSTRING() functions. Let us
assume we have table as shown below.
Write a query to find out total number of emails, by domain. The result of the query should be as shown
below.
Query
Select SUBSTRING(Email, CHARINDEX('@', Email) + 1,
LEN(Email) - CHARINDEX('@', Email)) as EmailDomain,
COUNT(Email) as Total
from tblEmployee
Group By SUBSTRING(Email, CHARINDEX('@', Email) + 1,
LEN(Email) - CHARINDEX('@', Email))
A practical example of using REPLICATE() function: We will be using this table, for the rest of our
examples in this article.
Let's mask the email with 5 * (star) symbols. The output should be as shown below.
Query:
Select FirstName, LastName, SUBSTRING(Email, 1, 2) + REPLICATE('*',5) +
SUBSTRING(Email, CHARINDEX('@',Email), LEN(Email) - CHARINDEX('@',Email)+1) as Email
from tblEmployee
Example: The SPACE(5) function, inserts 5 spaces between FirstName and LastName
Select FirstName + SPACE(5) + LastName as FullName
From tblEmployee
Output:
PATINDEX('%Pattern%', Expression)
Returns the starting position of the first occurrence of a pattern in a specified expression. It takes two
arguments, the pattern to be searched and the expression. PATINDEX() is similar to CHARINDEX(). With
CHARINDEX() we cannot use wildcards, whereas PATINDEX() provides this capability. If the specified
pattern is not found, PATINDEX() returns ZERO.
Example:
Select Email, PATINDEX('%@aaa.com', Email) as FirstOccurence
from tblEmployee
Where PATINDEX('%@aaa.com', Email) > 0
Output:
Example:
Select FirstName, LastName,Email, STUFF(Email, 2, 3, '*****') as StuffedEmail
From tblEmployee
There are several built-in DateTime functions available in SQL Server. All the following functions can be
used to get the current system date and time, where you have sql server installed.
Note: UTC stands for Coordinated Universal Time, based on which, the world regulates clocks and
time. There are slight differences between GMT and UTC, but for most common purposes, UTC is
To practically understand how the different date time datatypes available in SQL Server, store data,
create the sample table tblDateTime.
CREATE TABLE [tblDateTime]
(
[c_time] [time](7) NULL,
[c_date] [date] NULL,
[c_smalldatetime] [smalldatetime] NULL,
[c_datetime] [datetime] NULL,
[c_datetime2] [datetime2](7) NULL,
[c_datetimeoffset] [datetimeoffset](7) NULL
)
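To have something to look at, the table needs a row. One way to populate it, offered as a sketch rather than the statement used in the video, is to insert the current system datetime into every column and let SQL Server convert it to each datatype.

```sql
-- Assumption: GETDATE() is used for all six columns; each value is
-- implicitly converted to the datatype of its column.
Insert Into tblDateTime
Values (GETDATE(), GETDATE(), GETDATE(), GETDATE(), GETDATE(), GETDATE())
```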
Now, issue a select statement, and you should see, the different types of datetime datatypes, storing the
current datetime, in different formats.
Output:
IsDate, Day, Month, Year and DateName DateTime functions in SQL Server - Part 26
ISDATE() - Checks if the given value, is a valid date, time, or datetime. Returns 1 for success, 0 for
failure.
Examples:
Select ISDATE('PRAGIM') -- returns 0
Select ISDATE(Getdate()) -- returns 1
Select ISDATE('2012-08-31 21:02:04.167') -- returns 1
Example:
Select ISDATE('2012-09-01 11:34:21.1918447') -- returns 0.
Day() - Returns the 'Day number of the Month' of the given date
Examples:
Select DAY(GETDATE()) -- Returns the day number of the month, based on current system datetime.
Select DAY('01/31/2012') -- Returns 31
Month() - Returns the 'Month number of the year' of the given date
Examples:
Select Month(GETDATE()) -- Returns the Month number of the year, based on the current system date
and time
Select Month('01/31/2012') -- Returns 1
Year() - Returns the 'Year number' of the given date
Examples:
Select Year(GETDATE()) -- Returns the year number, based on the current system date
Select Year('01/31/2012') -- Returns 2012
DateName(DatePart, Date) - Returns a string that represents a part of the given date. This function
takes 2 parameters. The first parameter 'DatePart' specifies the part of the date we want. The second
parameter is the actual date, from which we want the part of the date.
Examples:
Select DATENAME(Day, '2012-09-30 12:43:46.837') -- Returns 30
Select DATENAME(WEEKDAY, '2012-09-30 12:43:46.837') -- Returns Sunday
Select DATENAME(MONTH, '2012-09-30 12:43:46.837') -- Returns September
A simple practical example using some of these DateTime functions. Consider the table tblEmployees.
Write a query, which returns Name, DateOfBirth, Day, MonthNumber, MonthName, and Year as shown
below.
Query:
Select Name, DateOfBirth, DateName(WEEKDAY,DateOfBirth) as [Day],
Month(DateOfBirth) as MonthNumber,
DateName(MONTH, DateOfBirth) as [MonthName],
Year(DateOfBirth) as [Year]
From tblEmployees
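DatePart(DatePart, Date) - Returns an integer that represents the specified part of the given date. This
function is similar to DateName(), except that DateName() returns a string, whereas DatePart() returns
an integer.
Examples: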
Select DATEPART(weekday, '2012-08-30 19:45:31.793') -- returns 5
Select DATENAME(weekday, '2012-08-30 19:45:31.793') -- returns Thursday
DATEADD (datepart, NumberToAdd, date) - Returns the DateTime obtained by adding the specified
NumberToAdd to the specified datepart of the given date.
Examples:
Select DateAdd(DAY, 20, '2012-08-30 19:45:31.793')
-- Returns 2012-09-19 19:45:31.793
Select DateAdd(DAY, -20, '2012-08-30 19:45:31.793')
-- Returns 2012-08-10 19:45:31.793
DATEDIFF(datepart, startdate, enddate) - Returns the count of the specified datepart boundaries crossed
between the specified startdate and enddate.
Examples:
Select DATEDIFF(MONTH, '11/30/2005','01/31/2006') -- returns 2
Select DATEDIFF(DAY, '11/30/2005','01/31/2006') -- returns 62
Write a query to compute the age of a person, when the date of birth is given. The output should be as
shown below.
End
Using the function in a query to get the expected output along with the age of the person.
Select Id, Name, DateOfBirth, dbo.fnComputeAge(DateOfBirth) as Age from tblEmployees
From the syntax, it is clear that the CONVERT() function has an optional style parameter, whereas the CAST()
function lacks this capability.
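For reference, the general forms of the two functions are repeated as comments in the sketch below, followed by one query that uses each form. The column aliases are only illustrative names.

```sql
-- CAST ( expression AS data_type [ ( length ) ] )
-- CONVERT ( data_type [ ( length ) ] , expression [ , style ] )
SELECT CAST(GETDATE() AS nvarchar(30)) AS UsingCast,
       CONVERT(nvarchar(30), GETDATE()) AS UsingConvert,
       CONVERT(nvarchar(30), GETDATE(), 103) AS UsingConvertWithStyle
```

Only the third column uses the style argument, which has no equivalent in CAST().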
The following 2 queries convert, DateOfBirth's DateTime datatype to NVARCHAR. The first query uses
the CAST() function, and the second one uses CONVERT() function. The output is exactly the same for
both the queries as shown below.
Select Id, Name, DateOfBirth, CAST(DateofBirth as nvarchar) as ConvertedDOB
from tblEmployees
Select Id, Name, DateOfBirth, Convert(nvarchar, DateOfBirth) as ConvertedDOB
from tblEmployees
Output:
Now, let's use the style parameter of the CONVERT() function, to format the Date as we would like it. In
the query below, we are using 103 as the argument for style parameter, which formats the date as
dd/mm/yyyy.
Select Id, Name, DateOfBirth, Convert(nvarchar, DateOfBirth, 103) as ConvertedDOB
from tblEmployees
Output:
For the complete list of Date and Time styles, please check MSDN.
The Date data type was introduced in SQL Server 2008, so you can also use
SELECT CAST(GETDATE() as DATE)
SELECT CONVERT(DATE, GETDATE())
Note: To control the formatting of the Date part, DateTime has to be converted to NVARCHAR using the
styles provided. When converting to DATE data type, the CONVERT() function will ignore the style
parameter.
In this query, we are using CAST() function, to convert Id (int) to nvarchar, so it can be appended with
the NAME column. If you remove the CAST() function, you will get an error stating - 'Conversion failed
when converting the nvarchar value 'Sam - ' to data type int.'
Select Id, Name, Name + ' - ' + CAST(Id AS NVARCHAR) AS [Name-Id]
FROM tblEmployees
Now let's look at a practical example of using CAST function. Consider the registrations table below.
Query:
Select CAST(RegisteredDate as DATE) as RegistrationDate,
COUNT(Id) as TotalRegistrations
From tblRegistrations
Group By CAST(RegisteredDate as DATE)
The general guideline is to use CAST(), unless you want to take advantage of the style functionality in
CONVERT().
ABS ( numeric_expression ) - ABS stands for absolute and returns, the absolute (positive) number.
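A quick illustration of ABS(), using arbitrary sample values:

```sql
SELECT ABS(-101.5) -- Returns 101.5
SELECT ABS(101.5)  -- Returns 101.5
```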
CEILING and FLOOR functions accept a numeric expression as a single parameter. CEILING() returns
the smallest integer value greater than or equal to the parameter, whereas FLOOR() returns the largest
integer less than or equal to the parameter.
Examples:
Select CEILING(15.2) -- Returns 16
Select CEILING(-15.2) -- Returns -15
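For comparison, the same two values passed to FLOOR() give the following results:

```sql
SELECT FLOOR(15.2)  -- Returns 15
SELECT FLOOR(-15.2) -- Returns -16
```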
Power(expression, power) - Returns the power value of the specified expression to the specified power.
Example: The following example calculates '2 TO THE POWER OF 3' = 2*2*2 = 8
Select POWER(2,3) -- Returns 8
RAND([Seed_Value]) - Returns a random float number between 0 and 1. The RAND() function takes an
optional seed parameter. When a seed value is supplied, the
RAND() function always returns the same value for the same seed.
Example:
Select RAND(1) -- Always returns the same value
If you want to generate a random number between 1 and 100, RAND() and FLOOR() functions can be
used as shown below. Every time you execute this query, you get a random number between 1 and 100.
Select FLOOR(RAND() * 100) + 1 -- FLOOR(RAND() * 100) alone yields 0 to 99, so add 1
SQUARE ( Number ) - Returns the square of the given number.
Example:
Select SQUARE(9) -- Returns 81
SQRT ( Number ) - SQRT stands for Square Root. This function returns the square root of the given
value.
Example:
Select SQRT(81) -- Returns 9
ROUND ( numeric_expression , length [ ,function ] ) - Rounds the given numeric expression based on
the given length. This function takes 3 parameters.
1. Numeric_Expression is the number that we want to round.
2. The length parameter specifies the number of digits that we want to round to. If the length is a
positive number, then the rounding is applied to the decimal part, whereas if the length is negative, then
the rounding is applied to the number before the decimal point.
3. The optional function parameter is used to indicate rounding or truncation. A value of 0
indicates rounding, whereas a nonzero value indicates truncation. The default, if not specified, is 0.
Examples:
-- Round to 2 places after (to the right) the decimal point
Select ROUND(850.556, 2) -- Returns 850.560
-- Truncate anything after 2 places, after (to the right) the decimal point
Select ROUND(850.556, 2, 1) -- Returns 850.550
-- Truncate anything after 1 place, after (to the right) the decimal point
Select ROUND(850.556, 1, 1) -- Returns 850.500
-- Round the last 2 places before (to the left) the decimal point
Select ROUND(850.556, -2) -- Returns 900.000
-- Round the last 1 place before (to the left) the decimal point
Select ROUND(850.556, -1) -- Returns 850.000
We will cover
1. User Defined Functions in sql server
2. Types of User Defined Functions
3. Creating a Scalar User Defined Function
4. Calling a Scalar User Defined Function
5. Places where we can use Scalar User Defined Function
6. Altering and Dropping a User Defined Function
Scalar functions may or may not have parameters, but always return a single (scalar) value. The
returned value can be of any data type, except text, ntext, image, cursor, and timestamp.
END
Let us now create a function which calculates and returns the age of a person. To compute the age we
require, date of birth. So, let's pass date of birth as a parameter. So, AGE() function returns an integer
and accepts date parameter.
CREATE FUNCTION Age(@DOB Date)
RETURNS INT
AS
BEGIN
DECLARE @Age INT
SET @Age = DATEDIFF(YEAR, @DOB, GETDATE()) - CASE WHEN (MONTH(@DOB) >
MONTH(GETDATE())) OR (MONTH(@DOB) = MONTH(GETDATE()) AND DAY(@DOB) >
DAY(GETDATE())) THEN 1 ELSE 0 END
RETURN @Age
END
When calling a scalar user-defined function, you must supply a two-part name,
OwnerName.FunctionName. dbo stands for database owner.
Select dbo.Age('10/08/1982')
You can also invoke it using the complete 3 part name, DatabaseName.OwnerName.FunctionName.
Select SampleDB.dbo.Age('10/08/1982')
Scalar user defined functions can be used in the Select clause as shown below.
Select Name, DateOfBirth, dbo.Age(DateOfBirth) as Age from tblEmployees
Scalar user defined functions can be used in the Where clause, as shown below.
Select Name, DateOfBirth, dbo.Age(DateOfBirth) as Age
from tblEmployees
Where dbo.Age(DateOfBirth) > 30
A stored procedure also can accept DateOfBirth and return Age, but you cannot use stored procedures
in a select or where clause. This is just one difference between a function and a stored procedure.
There are several other differences, which we will talk about in a later session.
To alter a function we use the ALTER FUNCTION FunctionName statement, and to delete it, we use DROP
FUNCTION FunctionName.
In Part 30, we learnt that a scalar function returns a single value. An Inline Table
Valued function, on the other hand, returns a table.
Consider this Employees table shown below, which we will be using for our example.
If you look at the way we implemented this function, it is very similar to SCALAR function, with the
following differences
1. We specify TABLE as the return type, instead of any scalar data type
2. The function body is not enclosed between BEGIN and END block. Inline table valued function body,
cannot have BEGIN and END block.
3. The structure of the table that gets returned is determined by the SELECT statement within the
function.
Output:
As the inline user defined function, is returning a table, issue the select statement against the function, as
if you are selecting the data from a TABLE.
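The function definition itself is not shown here. A minimal sketch of what an inline table valued function named fn_EmployeesByGender might look like is given below; the parameter name and the column list are assumptions based on the tblEmployees table used in this series. GO is the batch separator used by SSMS, needed because CREATE FUNCTION must be the first statement in its batch.

```sql
-- Sketch of an inline table valued function (parameter name and column list are assumptions)
CREATE FUNCTION fn_EmployeesByGender(@Gender nvarchar(10))
RETURNS TABLE
AS
RETURN (SELECT Id, Name, DateOfBirth, Gender, DepartmentId
        FROM tblEmployees
        WHERE Gender = @Gender)
GO

-- Query the function as if it were a table
SELECT * FROM fn_EmployeesByGender('Male')
```

Notice that the return type is TABLE, there is no BEGIN and END block, and the shape of the result comes entirely from the SELECT statement.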
Joining the Employees returned by the function, with the Departments table
Select Name, Gender, DepartmentName
from fn_EmployeesByGender('Male') E
Join tblDepartment D on D.Id = E.DepartmentId
Multi statement table valued functions are very similar to Inline Table valued functions, with a few
differences. Let's look at an example, and then note the differences.
Employees Table:
Let's write an Inline and multi-statement Table Valued functions that can return the output shown
below.
From tblEmployees
Return
End
Now let's understand the differences between Inline Table Valued functions and Multi-statement
Table Valued functions
1. In an Inline Table Valued function, the RETURNS clause cannot contain the structure of the table the
function returns, whereas with a multi-statement table valued function, we specify the structure of the
table that gets returned.
2. An Inline Table Valued function cannot have a BEGIN and END block, whereas a multi-statement function
can.
3. Inline Table valued functions are better for performance, than multi-statement table valued functions. If
the given task, can be achieved using an inline table valued function, always prefer to use them, over
multi-statement table valued functions.
4. It's possible to update the underlying table, using an inline table valued function, but not possible using
multi-statement table valued function.
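To make the first two differences concrete, here is a sketch of the same query written both ways. The function names and the returned column list (Id, Name, and DateOfBirth cast to Date) are assumptions for illustration, based on the tblEmployees table.

```sql
-- Inline table valued function: RETURNS TABLE, no BEGIN and END block
CREATE FUNCTION fn_ILTVF_GetEmployees()
RETURNS TABLE
AS
RETURN (SELECT Id, Name, CAST(DateOfBirth AS Date) AS DOB
        FROM tblEmployees)
GO

-- Multi-statement table valued function: the table structure is declared in the RETURNS clause
CREATE FUNCTION fn_MSTVF_GetEmployees()
RETURNS @Table TABLE (Id int, Name nvarchar(50), DOB Date)
AS
BEGIN
    INSERT INTO @Table
    SELECT Id, Name, CAST(DateOfBirth AS Date)
    FROM tblEmployees
    RETURN
END
GO

-- Both are called the same way
SELECT * FROM fn_ILTVF_GetEmployees()
SELECT * FROM fn_MSTVF_GetEmployees()
```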
Nondeterministic functions may return different results each time they are called with a specific set of
input values even if the database state that they access remains the same.
Examples: GetDate() and CURRENT_TIMESTAMP
Rand() function is a Non-deterministic function, but if you provide the seed value, the function
becomes deterministic, as the same value gets returned for the same seed value.
We will be using tblEmployees table, for the rest of our examples. Please, create the table using this
script.
CREATE TABLE [dbo].[tblEmployees]
(
[Id] [int] Primary Key,
[Name] [nvarchar](50) NULL,
[DateOfBirth] [datetime] NULL,
[Gender] [nvarchar](10) NULL,
[DepartmentId] [int] NULL
)
Insert rows into the table using the insert script below.
Insert into tblEmployees values(1,'Sam','1980-12-30 00:00:00.000','Male',1)
Insert into tblEmployees values(2,'Pam','1982-09-01 12:02:36.260','Female',2)
Insert into tblEmployees values(3,'John','1985-08-22 12:03:30.370','Male',1)
Insert into tblEmployees values(4,'Sara','1979-11-29 12:59:30.670','Female',3)
Insert into tblEmployees values(5,'Todd','1978-11-29 12:59:30.670','Male',1)
Begin
Return (Select Name from tblEmployees Where Id = @Id)
End
Now try to retrieve the text of the function using sp_helptext fn_GetEmployeeNameById. You will
get a message stating 'The text for object 'fn_GetEmployeeNameById' is encrypted.'
Note: You have to use the 2 part object name i.e, dbo.tblEmployees, to use WITH SCHEMABINDING
option. dbo is the schema name or owner name, tblEmployees is the table name.
6. Now, try to drop the table using - Drop Table tblEmployees. You will get a message stating, 'Cannot
DROP TABLE tblEmployees because it is being referenced by object fn_GetEmployeeNameById.'
So, Schemabinding, specifies that the function is bound to the database objects that it references. When
SCHEMABINDING is specified, the base objects cannot be modified in any way that would affect the
function definition. The function definition itself must first be modified or dropped to remove dependencies
on the object that is to be modified.
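As a sketch of the schema-bound version of the function discussed above (the return type nvarchar(50) is an assumption chosen to match the Name column; use ALTER FUNCTION instead of CREATE FUNCTION if the function already exists):

```sql
CREATE FUNCTION fn_GetEmployeeNameById(@Id int)
RETURNS nvarchar(50)
WITH SCHEMABINDING
AS
BEGIN
    -- The 2 part name dbo.tblEmployees is required with SCHEMABINDING
    RETURN (SELECT Name FROM dbo.tblEmployees WHERE Id = @Id)
END
GO

-- This is expected to fail while the schema-bound function references the table
DROP TABLE dbo.tblEmployees
```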
Creating a local temporary table is very similar to creating a permanent table, except that you prefix the
table name with a single pound (#) symbol. In the example below, #PersonDetails is a local temporary table,
with Id and Name columns.
Create Table #PersonDetails(Id int, Name nvarchar(20))
You can also check the existence of temporary tables using Object Explorer. In Object Explorer,
expand the TEMPDB database folder, and then expand the TEMPORARY TABLES folder, and you should see
the temporary table that we have created.
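The same check can also be done with a query. A minimal sketch, run from the connection that created #PersonDetails, looks the table up in the tempdb catalog; the LIKE pattern is needed because SQL Server pads the stored name of a local temp table with a suffix.

```sql
SELECT name
FROM tempdb.sys.tables
WHERE name LIKE '#PersonDetails%'
```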
A local temporary table is available only to the connection that created it. If you open
another query window and execute the following query, you get an error stating 'Invalid object name
#PersonDetails'. This proves that local temporary tables are available only to the connection that
created them.
Select * from #PersonDetails
A local temporary table is automatically dropped when the connection that created it is
closed. If the user wants to explicitly drop the temporary table, they can do so using
DROP TABLE #PersonDetails
If the temporary table is created inside a stored procedure, it gets dropped automatically upon the
completion of stored procedure execution. The stored procedure below creates the #PersonDetails
temporary table, populates it, and then returns the data. The temporary table is destroyed
immediately after the stored procedure completes.
Create Procedure spCreateLocalTempTable
as
Begin
Create Table #PersonDetails(Id int, Name nvarchar(20))
Insert into #PersonDetails Values(1, 'Mike') -- illustrative sample rows
Insert into #PersonDetails Values(2, 'John')
Select * from #PersonDetails
End
It is also possible for different connections, to create a local temporary table with the same name. For
example User1 and User2, both can create a local temporary table with the same name #PersonDetails.
Now, if you expand the Temporary Tables folder in the TEMPDB database, you should see 2 tables with
name #PersonDetails and some random number at the end of the name. To differentiate between, the
User1 and User2 local temp tables, sql server appends the random number at the end of the temp table
name.
Global temporary tables are visible to all the connections of the sql server, and are only destroyed
when the last connection referencing the table is closed.
Multiple users, across multiple connections can have local temporary tables with the same name, but,
a global temporary table name has to be unique, and if you inspect the name of the global temp table, in
the object explorer, there will be no random numbers suffixed at the end of the table name.
2. SQL Server appends some random numbers at the end of the local temp table name, whereas this is not
done for global temp table names.
3. Local temporary tables are only visible to the SQL Server session which created them, whereas
Global temporary tables are visible to all SQL Server sessions.
4. Local temporary tables are automatically dropped when the session that created them
is closed, whereas Global temporary tables are destroyed when the last connection that is referencing
the global temp table is closed.
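A short sketch of a global temporary table, which is created by prefixing the name with 2 pound (##) symbols. The table name ##EmployeeDetails and the sample row are assumptions for illustration.

```sql
-- Connection 1: create and populate a global temporary table (note the ## prefix)
CREATE TABLE ##EmployeeDetails(Id int, Name nvarchar(20))
INSERT INTO ##EmployeeDetails VALUES (1, 'Mike')

-- Connection 2 (a different query window): the table is visible here as well
SELECT * FROM ##EmployeeDetails
```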
If you don't have an index in a book, and I ask you to locate a specific chapter in that book, you will have
to look at every page starting from the first page of the book.
On, the other hand, if you have the index, you lookup the page number of the chapter in the index,
and then directly go to that page number to locate the chapter.
Obviously, the book index is helping to drastically reduce the time it takes to find the chapter.
In a similar way, Table and View indexes, can help the query to find data quickly.
In fact, the existence of the right indexes can drastically improve the performance of a query. If there is
no index to help the query, then the query engine checks every row in the table from the beginning to the
end. This is called a Table Scan. A table scan is bad for performance, especially on large tables.
Index Example: At the moment, the Employees table, does not have an index on SALARY column.
To find all the employees who have a salary greater than 5000 and less than 7000, the query engine has
to check each and every row in the table, resulting in a table scan, which can adversely affect
performance, especially if the table is large. Since there is no index to help the query, the query engine
performs an entire table scan.
Now let's create the index to help the query: Here, we are creating an index on the Salary column in the
employee table
CREATE Index IX_tblEmployee_Salary
ON tblEmployee (SALARY ASC)
The index stores salary of each employee, in the ascending order as shown below. The actual index
may look slightly different.
Now, when SQL Server has to execute the same query, it has an index on the Salary column to
help this query. Salaries in the range of 5000 to 7000 are usually present at the bottom, since the
salaries are arranged in ascending order. SQL Server picks up the row addresses from the index and
directly fetches the records from the table, rather than scanning each row in the table. This is called an
Index Seek.
An Index can also be created graphically using SQL Server Management Studio
1. In the Object Explorer, expand the Databases folder and then specific database you are working with.
2. Expand the Tables folder
3. Expand the Table on which you want to create the index
4. Right click on the Indexes folder and select New Index
5. In the New Index dialog box, type in a meaningful name
6. Select the Index Type and specify Unique or Non Unique Index
7. Click Add
8. Select the columns that you want to add as index key
9. Click OK
10. Save the table
To view the Indexes: In the object explorer, expand the Indexes folder. Alternatively, use the sp_helpindex
system stored procedure. The following command returns all the indexes on the tblEmployee table.
Execute sp_helpindex tblEmployee
To delete or drop the index: When dropping an index, specify the table name as well
Drop Index tblEmployee.IX_tblEmployee_Salary
In this video session, we will talk about Clustered and Non-Clustered indexes.
Clustered Index:
A clustered index determines the physical order of data in a table. For this reason, a table can have only
one clustered index.
Note that the Id column is marked as the primary key. A primary key constraint creates a clustered index
automatically if no clustered index already exists on the table and a nonclustered index is not specified
when you create the PRIMARY KEY constraint.
To confirm this, execute sp_helpindex tblEmployee, which will show a unique clustered index created on
the Id column.
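The table script for this example is not shown here. A minimal sketch, assuming the table does not already exist and using column types inferred from the insert statements in this part, could be:

```sql
-- Assumed definition of tblEmployee; the PRIMARY KEY on Id creates the clustered index
CREATE TABLE tblEmployee
(
    Id int PRIMARY KEY,
    Name nvarchar(50),
    Salary int,
    Gender nvarchar(10),
    City nvarchar(50)
)
GO

-- Lists the indexes on the table, including the one created for the primary key
EXECUTE sp_helpindex tblEmployee
```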
Now execute the following insert queries. Note that, the values for Id column are not in a sequential
order.
Insert into tblEmployee Values(3,'John',4500,'Male','New York')
Insert into tblEmployee Values(1,'Sam',2500,'Male','London')
Insert into tblEmployee Values(4,'Sara',5500,'Female','Tokyo')
Insert into tblEmployee Values(5,'Todd',3100,'Male','Toronto')
Insert into tblEmployee Values(2,'Pam',6500,'Female','Sydney')
In spite of inserting the rows in a random order, when we execute the select query we can see that all
the rows in the table are arranged in an ascending order based on the Id column. This is because a
clustered index determines the physical order of data in a table, and we have got a clustered index on the
Id column.
Because of the fact that, a clustered index dictates the physical storage order of the data in a table,
a table can contain only one clustered index. If you take the example of tblEmployee table, the data is
already arranged by the Id column, and if we try to create another clustered index on the Name column,
the data needs to be rearranged based on the NAME column, which will affect the ordering of rows that's
already done based on the ID column.
For this reason, SQL server doesn't allow us to create more than one clustered index per table. The
following SQL script, raises an error stating 'Cannot create more than one clustered index on table
'tblEmployee'. Drop the existing clustered index PK__tblEmplo__3214EC0706CD04F7 before creating
another.'
Create Clustered Index IX_tblEmployee_Name
ON tblEmployee(Name)
A clustered index is analogous to a telephone directory, where the data is arranged by the last name.
We just learnt that, a table can have only one clustered index. However, the index can contain multiple
columns (a composite index), like the way a telephone directory is organized by last name and first name.
Let's now create a clustered index on 2 columns. To do this we first have to drop the existing
clustered index on the Id column.
Drop index tblEmployee.PK__tblEmplo__3214EC070A9D95DB
When you execute this query, you get an error message stating 'An explicit DROP INDEX is not
allowed on index 'tblEmployee.PK__tblEmplo__3214EC070A9D95DB'. It is being used for PRIMARY
KEY constraint enforcement.' We will talk about the role of unique index in the next session. To
successfully delete the clustered index, right click on the index in the Object explorer window and select
DELETE.
Now, execute the following CREATE INDEX query, to create a composite clustered Index on the
Gender and Salary columns.
Create Clustered Index IX_tblEmployee_Gender_Salary
ON tblEmployee(Gender DESC, Salary ASC)
Now, if you issue a select query against this table you should see the data physically arranged, FIRST
by Gender in descending order and then by Salary in ascending order. The result is shown below.
In the index itself, the data is stored in an ascending or descending order of the index key, which doesn't
in any way influence the storage of data in the table.
The following SQL creates a Nonclustered index on the NAME column on tblEmployee table:
Create NonClustered Index IX_tblEmployee_Name
ON tblEmployee(Name)
Unique index is used to enforce uniqueness of key values in the index. Let's understand this with an
example.
CREATE TABLE [tblEmployee]
(
[Id] int Primary Key,
[FirstName] nvarchar(50),
[LastName] nvarchar(50),
[Salary] int,
[Gender] nvarchar(10),
[City] nvarchar(50)
)
Since, we have marked Id column, as the Primary key for this table, a UNIQUE CLUSTERED INDEX
gets created on the Id column, with Id as the index key.
We can verify this by executing the sp_helpindex system stored procedure as shown below.
Execute sp_helpindex tblEmployee
Output:
Since, we now have a UNIQUE CLUSTERED INDEX on the Id column, any attempt to duplicate the
key values, will throw an error stating 'Violation of PRIMARY KEY constraint
'PK__tblEmplo__3214EC07236943A5'. Cannot insert duplicate key in object dbo.tblEmployee'
Now let's try to drop the Unique Clustered index on the Id column. This will raise an error stating - 'An
explicit DROP INDEX is not allowed on index tblEmployee.PK__tblEmplo__3214EC07236943A5. It is
being used for PRIMARY KEY constraint enforcement.'
Drop index tblEmployee.PK__tblEmplo__3214EC07236943A5
So this error message proves that, SQL server internally, uses the UNIQUE index to enforce the
uniqueness of values and primary key.
Expand the Keys folder in the object explorer window, and you can see a primary key constraint. Now,
expand the Indexes folder and you should see a unique clustered index. In the object explorer it just
shows the word 'CLUSTERED'. To confirm that this is in fact a UNIQUE index, right click it and select
Properties. The properties window shows the UNIQUE checkbox selected.
SQL Server allows us to delete this UNIQUE CLUSTERED INDEX from the object explorer. So, right
click on the index, select DELETE, and finally click OK. Along with the UNIQUE index, the primary
key constraint is also deleted.
Now, let's try to insert duplicate values for the ID column. The rows should be accepted, without any
primary key violation error.
Insert into tblEmployee Values(1,'Mike', 'Sandoz',4500,'Male','New York')
Insert into tblEmployee Values(1,'John', 'Menco',2500,'Male','London')
So, the UNIQUE index is used to enforce the uniqueness of values and primary key constraint.
UNIQUENESS is a property of an Index, and both CLUSTERED and NON-CLUSTERED indexes can
be UNIQUE.
Creating a UNIQUE NON CLUSTERED index on the FirstName and LastName columns.
Create Unique NonClustered Index UIX_tblEmployee_FirstName_LastName
On tblEmployee(FirstName, LastName)
This unique nonclustered index ensures that no 2 entries in the index have the same first and last
names. In Part 9 of this video series, we learnt that a Unique Constraint can be used to enforce
the uniqueness of values across one or more columns. There are no major differences between a unique
constraint and a unique index.
In fact, when you add a unique constraint, a unique index gets created behind the scenes. To prove
this, let's add a unique constraint on the city column of the tblEmployee table.
ALTER TABLE tblEmployee
ADD CONSTRAINT UQ_tblEmployee_City
UNIQUE NONCLUSTERED (City)
At this point, we expect a unique constraint to be created. Refresh and Expand the constraints folder
in the object explorer window. The constraint is not present in this folder. Now, refresh and expand the
'indexes' folder. In the indexes folder, you will see a UNIQUE NONCLUSTERED index with name
UQ_tblEmployee_City.
So creating a UNIQUE constraint actually creates a UNIQUE index. A UNIQUE index can be
created explicitly, using the CREATE INDEX statement, or indirectly, using a UNIQUE constraint. So, when
should you create a unique constraint rather than a unique index? To make our intentions clear, create
a unique constraint when data integrity is the objective. This makes the objective of the index very clear.
In either case, data is validated in the same manner, and the query optimizer does not differentiate
between a unique index created by a unique constraint and one created manually.
Note:
1. By default, a PRIMARY KEY constraint, creates a unique clustered index, where as a UNIQUE
constraint creates a unique nonclustered index. These defaults can be changed if you wish to.
2. A UNIQUE constraint or a UNIQUE index cannot be created on an existing table, if the table contains
duplicate values in the key columns. To solve this, remove the key columns from the index
definition, or delete or update the duplicate values.
3. By default, duplicate values are not allowed on key columns, when you have a unique index or
constraint. For, example, if I try to insert 10 rows, out of which 5 rows contain duplicates, then all the 10
rows are rejected. However, if I want only the 5 duplicate rows to be rejected and accept the non-
duplicate 5 rows, then I can use IGNORE_DUP_KEY option. An example of using IGNORE_DUP_KEY
option is shown below.
CREATE UNIQUE INDEX IX_tblEmployee_City
ON tblEmployee(City)
WITH IGNORE_DUP_KEY
In this video session, we talk about the advantages and disadvantages of indexes. We will also talk
about a concept called covering queries.
In Part 35, we have learnt that, Indexes are used by queries to find data quickly. In this part, we will
learn about the different queries that can benefit from indexes.
NonClustered Index
The following select query benefits from the index on the Salary column, because the salaries are
sorted in ascending order in the index. From the index, it's easy to identify the records where salary is
between 4000 and 8000, and using the row address the corresponding records from the table can be
fetched quickly.
Select * from tblEmployee where Salary > 4000 and Salary < 8000
Not only the SELECT statement, but also the following DELETE and UPDATE statements can
benefit from the index. To update or delete a row, SQL Server first needs to find that row, and the index
can help in searching for and finding that specific row quickly.
Delete from tblEmployee where Salary = 2500
Update tblEmployee Set Salary = 9000 where Salary = 7500
Indexes can also help queries that ask for sorted results. Since the salaries are already sorted, the
database engine simply scans the index from the first entry to the last entry and retrieves the rows in
sorted order. This avoids sorting rows during query execution, which can significantly improve the
processing time.
Select * from tblEmployee order by Salary
The index on the Salary column, can also help the query below, by scanning the index in reverse order.
Select * from tblEmployee order by Salary Desc
GROUP BY queries can also benefit from indexes. To group the employees with the same salary, the
query engine can use the index on the Salary column to retrieve the already sorted salaries. Since matching
salaries are present in consecutive index entries, it can count the total number of employees at each
salary quickly.
Select Salary, COUNT(Salary) as Total
from tblEmployee
Group By Salary
Disadvantages of Indexes:
Additional Disk Space: A clustered index does not require any additional storage. Every nonclustered
index requires additional space, as it is stored separately from the table. The amount of space required will
depend on the size of the table, and the number and types of columns used in the index.
Insert, Update and Delete statements can become slow: When DML (Data Manipulation Language)
statements (INSERT, UPDATE, DELETE) modify data in a table, the data in all the indexes also needs
to be updated. Indexes can help to search and locate the rows that we want to delete, but too many
indexes to update can actually hurt the performance of data modifications.
there is no need to lookup in the table again. The requested columns data can simply be returned from
the index.
A clustered index, always covers a query, since it contains all of the data in a table. A composite index
is an index on two or more columns. Both clustered and nonclustered indexes can be composite indexes.
To a certain extent, a composite index, can cover a query.
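As a rough illustration of a composite index covering a query, consider the sketch below. The index name is hypothetical, and it assumes the tblEmployee table has Gender, Salary and City columns as in the earlier examples. Whether the optimizer actually picks the index depends on the data and the available statistics.

```sql
-- Hypothetical composite nonclustered index on Gender and Salary
CREATE NONCLUSTERED INDEX IX_tblEmployee_Gender_Salary_NC
ON tblEmployee (Gender, Salary)

-- Covered: every column the query asks for is present in the index
SELECT Gender, Salary FROM tblEmployee WHERE Gender = 'Male'

-- Not covered: City is not part of the index, so it has to be looked up in the table
SELECT Gender, Salary, City FROM tblEmployee WHERE Gender = 'Male'
```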
Let's understand views with an example. We will base all our examples on tblEmployee and
tblDepartment tables.
At this point Employees and Departments table should look like this.
Employees Table:
Departments Table:
Now, let's write a Query which returns the output as shown below:
To get the expected output, we need to join the tblEmployee table with the tblDepartment table. If you are
new to joins, please watch the video on Joins in SQL Server first.
Select Id, Name, Salary, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
Now let's create a view, using the JOINS query, we have just written.
Create View vWEmployeesByDepartment
as
Select Id, Name, Salary, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
To select data from the view, SELECT statement can be used the way, we use it with a table.
SELECT * from vWEmployeesByDepartment
When this query is executed, the database engine actually retrieves the data from the underlying base
tables, tblEmployee and tblDepartment. The view itself does not store any data by default. However,
we can change this default behaviour, which we will talk about in a later session. This is the reason a
view is considered to be just a stored query or a virtual table.
2. Views can be used as a mechanism to implement row and column level security.
Row Level Security:
For example, I want an end user to have access only to IT Department employees. If I grant him access
to the underlying tblEmployee and tblDepartment tables, he will be able to see the employees of every
department. To achieve this, I can create a view which returns only IT Department employees, and grant
the user access to the view and not to the underlying tables.
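A view of this kind could look like the sketch below. The view name is an assumption, and the column names follow the join query used earlier in this part.

```sql
-- Returns only the employees who belong to the IT department
CREATE VIEW vWITDepartment_Employees
AS
SELECT Id, Name, Salary, Gender, DeptName
FROM tblEmployee
JOIN tblDepartment
ON tblEmployee.DepartmentId = tblDepartment.DeptId
WHERE tblDepartment.DeptName = 'IT'
```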
3. Views can be used to present only aggregated data and hide detailed data.
on tblEmployee.DepartmentId = tblDepartment.DeptId
Group By DeptName
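The two lines above are the tail of a view definition. A complete view of that shape might look like the sketch below; the view name vWEmployeesCountByDepartment and the TotalEmployees alias are assumptions, not taken from the transcript.

```sql
-- Hypothetical view that exposes only aggregated data (employee count per department)
Create View vWEmployeesCountByDepartment
as
Select DeptName, COUNT(Id) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
Group By DeptName
GO

-- The detailed employee rows are not visible through this view
Select * from vWEmployeesCountByDepartment
```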
Let's create a view which returns all the columns from the tblEmployee table, except the Salary column.
Create view vWEmployeesDataExceptSalary
as
Select Id, Name, Gender, DepartmentId
from tblEmployee
Select data from the view: A view does not store any data. So, when this query is executed, the
database engine actually retrieves the data from the underlying tblEmployee base table.
Select * from vWEmployeesDataExceptSalary
Is it possible to insert, update and delete rows from the underlying tblEmployee table, using the view
vWEmployeesDataExceptSalary?
Yes, SQL Server views are updateable.
The following query updates the Name column from Mike to Mikey. Though we are updating the view,
SQL Server correctly updates the base table tblEmployee. To verify, execute a SELECT statement on the
tblEmployee table.
Update vWEmployeesDataExceptSalary
Set Name = 'Mikey' Where Id = 2
Along the same lines, it is also possible to insert and delete rows from the base table using views.
Delete from vWEmployeesDataExceptSalary where Id = 2
Insert into vWEmployeesDataExceptSalary values (2, 'Mikey', 'Male', 2)
Now, let us see what happens if our view is based on multiple base tables. For this purpose, let's
create the tblDepartment table and populate it with some sample data.
SQL Script to create tblDepartment table
CREATE TABLE tblDepartment
(
DeptId int Primary Key,
DeptName nvarchar(20)
)
Create a view which joins the tblEmployee and tblDepartment tables, and returns the result as shown
below.
vwEmployeeDetailsByDepartment Data:
Now, let's update John's department from HR to IT. At the moment, there are 2 employees (Ben and
John) in the HR department.
Update vwEmployeeDetailsByDepartment
set DeptName='IT' where Name = 'John'
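Notice that Ben's department is also changed to IT. The UPDATE changed the DeptName value in
tblDepartment, which both employees share, rather than John's DepartmentId in tblEmployee. To see this
for yourself, select data from the tblDepartment and tblEmployee base tables.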
tblEmployee Table
tblDepartment
We will discuss triggers, and how to correctly update a view that is based on multiple tables, in a later
video session.
In Part 39, we covered the basics of views, and in Part 40, we saw how to update the
underlying base tables through a view. In this video session, we will learn about INDEXED VIEWS.
What is an Indexed View, or what happens when you create an index on a view?
A standard or non-indexed view is just a stored SQL query. When we try to retrieve data from the
view, the data is actually retrieved from the underlying base tables. So, a view is just a virtual table; it does
not store any data by default.
However, when you create an index on a view, the view gets materialized. This means the view is
now capable of storing data. In SQL Server we call them Indexed Views, and in Oracle, Materialized
Views.
Let's now look at an example of creating an Indexed View. For the purpose of this video, we will be
using the tblProduct and tblProductSales tables.
tblProduct Table
tblProductSales Table
Create a view which returns Total Sales and Total Transactions by Product. The output should be
as shown below.
If you want to create an index on a view, the view must follow these rules. For the
complete list of rules, please check MSDN.
1. The view should be created with the SchemaBinding option.
2. If an aggregate function in the SELECT list references an expression, and there is a possibility for
that expression to become NULL, then a replacement value should be specified. In this example, we are
using the ISNULL() function to replace NULL values with zero.
3. If GROUP BY is specified, the view select list must contain a COUNT_BIG(*) expression.
4. The base tables in the view should be referenced with a 2 part name (schema name and table name).
In this example, tblProduct and tblProductSales are the base tables.
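The rules above can be put together in a sketch like the following. The view name vWTotalSalesByProduct comes from the text, but the table columns (ProductId, Name, UnitPrice, QuantitySold), the dbo schema and the index name are assumptions made for illustration.

```sql
-- Assumed base tables with hypothetical columns; skip if they already exist
Create Table dbo.tblProduct
(
    ProductId int Primary Key,
    Name nvarchar(20),
    UnitPrice int
)
Create Table dbo.tblProductSales
(
    ProductId int,
    QuantitySold int
)
GO

Create View dbo.vWTotalSalesByProduct
with SchemaBinding                                              -- rule 1
as
Select Name,
       SUM(ISNULL(QuantitySold * UnitPrice, 0)) as TotalSales,  -- rule 2
       COUNT_BIG(*) as TotalTransactions                        -- rule 3
from dbo.tblProductSales                                        -- rule 4: 2 part names
join dbo.tblProduct
on dbo.tblProduct.ProductId = dbo.tblProductSales.ProductId
group by Name
GO

-- The first index created on a view must be a unique clustered index.
-- Creating it is the step that materializes the view.
Create Unique Clustered Index UIX_vWTotalSalesByProduct_Name
on dbo.vWTotalSalesByProduct(Name)
GO
```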
Since we now have an index on the view, the view gets materialized. The data is stored in the view.
So when we execute Select * from vWTotalSalesByProduct, the data can be returned from the view itself,
rather than being retrieved from the underlying base tables (on some editions of SQL Server, the
NOEXPAND table hint may be needed for the stored data to be used).
Indexed views can significantly improve the performance of queries that involve JOINs and
aggregations. However, the cost of maintaining an indexed view is much higher than the cost of maintaining a
table index.
Indexed views are ideal for scenarios where the underlying data does not change frequently. Indexed
views are more often used in OLAP systems, because the data is mainly used for reporting and analysis
purposes. Indexed views may not be suitable for OLTP systems, as the data is frequently added and
changed.
1. You cannot pass parameters to a view. Table Valued functions are an excellent replacement for
parameterized views.
We will use tblEmployee table for our examples. SQL Script to create tblEmployee table:
CREATE TABLE tblEmployee
(
Id int Primary Key,
Name nvarchar(30),
Salary int,
Gender nvarchar(10),
DepartmentId int
)
Employee Table
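As a sketch of the first limitation, an inline table valued function can play the role of a parameterized view. The function name fnEmployeeDetails and the dbo schema are assumptions; the columns come from the tblEmployee script above.

```sql
-- Hypothetical inline table valued function acting as a "parameterized view"
Create Function fnEmployeeDetails(@Gender nvarchar(10))
Returns Table
as
Return
(
    Select Id, Name, Gender, DepartmentId
    from tblEmployee
    where Gender = @Gender
)
GO

-- The function is queried like a view, but accepts a parameter
Select * from dbo.fnEmployeeDetails('Male')
```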
3. The ORDER BY clause is invalid in views unless TOP or FOR XML is also specified.
Create View vWEmployeeDetailsSorted
as
Select Id, Name, Gender, DepartmentId
from tblEmployee
order by Id
If you use ORDER BY, you will get an error stating - 'The ORDER BY clause is invalid in views, inline
functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also
specified.'
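One way around this restriction, assuming the sort order is only needed when the view is queried, is to leave ORDER BY out of the view and sort in the outer query. The view name vWEmployeeDetailsUnsorted is hypothetical.

```sql
-- The view itself defines no ordering
Create View vWEmployeeDetailsUnsorted
as
Select Id, Name, Gender, DepartmentId
from tblEmployee
GO

-- The ordering is applied by the query that selects from the view
Select * from vWEmployeeDetailsUnsorted
order by Id
```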
We will discuss DDL and logon triggers in a later session. In this video, we will learn about
DML triggers.
In general, a trigger is a special kind of stored procedure that automatically executes when an event
occurs in the database server.
DML stands for Data Manipulation Language. INSERT, UPDATE, and DELETE statements are DML
statements. DML triggers are fired whenever data is modified using INSERT, UPDATE, or DELETE
statements.
AFTER triggers, as the name says, fire after the triggering action. The INSERT, UPDATE, and
DELETE statements cause an AFTER trigger to fire after the respective statement completes execution.
On the other hand, INSTEAD OF triggers, as the name says, fire instead of the triggering action. The
INSERT, UPDATE, and DELETE statements can cause an INSTEAD OF trigger to fire instead of the
respective statement execution.
DepartmentId int
)
tblEmployee
Whenever a new employee is added, we want to capture the ID, and the date and time the new
employee is added, in the tblEmployeeAudit table. The easiest way to achieve this is by having an AFTER
trigger for the INSERT event.
In the trigger, we are getting the Id from the inserted table. So, what is this inserted table? The INSERTED
table is a special table used by DML triggers. When you add a new row into the tblEmployee table, a copy of
the row is also made available in the inserted table, which only a trigger can access. You cannot access this
table outside the context of the trigger. The structure of the inserted table is identical to the structure
of the tblEmployee table.
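A minimal sketch of such an AFTER INSERT trigger is shown below. The structure of tblEmployeeAudit (an identity column plus one text column) and the trigger name tr_tblEmployee_ForInsert are assumptions. Because the inserted table can hold several rows, this version writes one audit row per inserted employee instead of reading a single Id into a variable.

```sql
-- Assumed audit table; skip if it already exists
Create Table tblEmployeeAudit
(
    Id int identity(1,1) Primary Key,
    AuditData nvarchar(1000)
)
GO

Create Trigger tr_tblEmployee_ForInsert   -- hypothetical name
on tblEmployee
For Insert
as
Begin
    -- One audit row for every row present in the inserted table
    Insert into tblEmployeeAudit (AuditData)
    Select 'New employee with Id = ' + Cast(Id as nvarchar(10))
         + ' is added at ' + Convert(nvarchar(30), GetDate(), 120)
    from inserted
End
GO
```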
So, if we now execute an INSERT statement on tblEmployee, then immediately after the row is inserted
into the tblEmployee table, the trigger gets fired (executed automatically), and a row is added to the
tblEmployeeAudit table.
Along the same lines, let us now capture audit information when a row is deleted from the
tblEmployee table.
Example for AFTER TRIGGER for DELETE event on tblEmployee table:
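CREATE TRIGGER tr_tblEMployee_ForDelete
ON tblEmployee
FOR DELETE
AS
BEGIN
-- Reads a single Id, so this assumes one row is deleted per statement
Declare @Id int
Select @Id = Id from deleted
-- Assumed completion: expects tblEmployeeAudit to have a single non-identity text column
Insert into tblEmployeeAudit
values ('An existing employee with Id = ' + Cast(@Id as nvarchar(5)) + ' is deleted at ' + Cast(Getdate() as nvarchar(20)))
END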
The only difference here is that we are specifying the triggering event as DELETE, and retrieving the
deleted row ID from the DELETED table. The DELETED table is a special table used by DML triggers. When you
delete a row from the tblEmployee table, a copy of the deleted row is made available in the DELETED table,
which only a trigger can access. Just like the INSERTED table, the DELETED table cannot be accessed outside
the context of the trigger, and the structure of the DELETED table is identical to the structure of the
tblEmployee table.
In the next session, we will talk about the AFTER trigger for the UPDATE event.
Triggers make use of 2 special tables, INSERTED and DELETED. For an UPDATE, the inserted table contains the
updated data and the deleted table contains the old data. The AFTER trigger for the UPDATE event makes use
of both the inserted and deleted tables.
Immediately after the UPDATE statement executes, the AFTER UPDATE trigger gets fired, and you
should see the contents of the INSERTED and DELETED tables.
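A trigger that simply displays both special tables might look like this sketch. The trigger name matches the ALTER statement used later in this session; the sample UPDATE and its values are assumptions.

```sql
Create Trigger tr_tblEmployee_ForUpdate
on tblEmployee
For Update
as
Begin
    Select * from deleted    -- rows as they were before the UPDATE
    Select * from inserted   -- rows as they are after the UPDATE
End
GO

-- Assumes an employee with Id = 1 exists; the new values are arbitrary
Update tblEmployee set Name = 'Tods', Salary = 2000 where Id = 1
```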
The following AFTER UPDATE trigger audits employee information upon UPDATE, and stores the
audit data in the tblEmployeeAudit table.
Alter trigger tr_tblEmployee_ForUpdate
on tblEmployee
for Update
as
Begin
-- Declare variables to hold old and updated data
Declare @Id int
Declare @OldName nvarchar(20), @NewName nvarchar(20)
Declare @OldSalary int, @NewSalary int
Declare @OldGender nvarchar(20), @NewGender nvarchar(20)
Declare @OldDeptId int, @NewDeptId int
-- Delete the row from temp table, so we can move to the next row
Delete from #TempTable where Id = @Id
End
End
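The trigger above walks through the updated rows one at a time using a temp table. As an alternative sketch (not the approach used in the video), the same kind of audit can be written as a single set-based INSERT that joins the deleted and inserted tables. It assumes tblEmployeeAudit has a text column named AuditData, that the trigger already exists, and it audits only Name and Salary to keep the example short.

```sql
Alter Trigger tr_tblEmployee_ForUpdate
on tblEmployee
For Update
as
Begin
    Set NoCount On

    Insert into tblEmployeeAudit (AuditData)
    Select 'Employee with Id = ' + Cast(d.Id as nvarchar(10)) + ' changed'
         + Case when IsNull(d.Name, '') <> IsNull(i.Name, '')
                then ' NAME from ' + IsNull(d.Name, 'NULL') + ' to ' + IsNull(i.Name, 'NULL')
                else '' end
         + Case when IsNull(d.Salary, -1) <> IsNull(i.Salary, -1)
                then ' SALARY from ' + IsNull(Cast(d.Salary as nvarchar(12)), 'NULL')
                   + ' to ' + IsNull(Cast(i.Salary as nvarchar(12)), 'NULL')
                else '' end
    from deleted d
    join inserted i on i.Id = d.Id          -- assumes the Id column itself is not updated
    -- Audit only rows where Name or Salary really changed.
    -- NULL is treated the same as '' for Name and as -1 for Salary.
    where IsNull(d.Name, '') <> IsNull(i.Name, '')
       or IsNull(d.Salary, -1) <> IsNull(i.Salary, -1)
End
```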
In this video we will learn about INSTEAD OF triggers, specifically the INSTEAD OF INSERT trigger. We
know that AFTER triggers are fired after the triggering event (INSERT, UPDATE or DELETE),
whereas INSTEAD OF triggers are fired instead of the triggering event (INSERT, UPDATE or
DELETE). INSTEAD OF triggers are usually used to correctly update views that are
based on multiple tables.
We will base our demos on the Employee and Department tables. So, first, let's create these 2 tables.
Since we now have the required tables, let's create a view based on these tables. The view should
return the Employee Id, Name, Gender and DepartmentName columns. So, the view is obviously based on
multiple tables.
When you execute Select * from vWEmployeeDetails, the data from the view should be as shown
below.
Now, let's try to insert a row into the view vWEmployeeDetails by executing the following query. At
this point, an error will be raised stating 'View or function vWEmployeeDetails is not updatable because
the modification affects multiple base tables.'
Insert into vWEmployeeDetails values(7, 'Valarie', 'Female', 'IT')
So, inserting a row into a view that is based on multiple tables raises an error by default. Now, let's
understand how INSTEAD OF triggers can help us in this situation. Since we are getting an error
when we try to insert a row into the view, let's create an INSTEAD OF INSERT trigger on the view
vWEmployeeDetails.
on vWEmployeeDetails
Instead Of Insert
as
Begin
Declare @DeptId int
The INSTEAD OF trigger correctly inserts the record into the tblEmployee table. Since we are inserting a
row, the inserted table contains the newly added row, whereas the deleted table will be empty.
In the trigger, we used the Raiserror() function to raise a custom error when the
DepartmentName provided in the insert query does not exist. We are passing 3 parameters to the
Raiserror() function. The first parameter is the error message, and the second parameter is the severity level.
Severity level 16 indicates general errors that can be corrected by the user. The final parameter is
the state. We will talk about Raiserror() and exception handling in SQL Server in a later video session.
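Only the first lines of the trigger survive above, so here is a sketch of what a complete INSTEAD OF INSERT trigger along these lines could look like. The trigger name and the error message text are assumptions, and DeptName is assumed to be unique in tblDepartment. A view can have only one INSTEAD OF INSERT trigger, so this would replace any existing one rather than sit beside it.

```sql
Create Trigger tr_vWEmployeeDetails_InsteadOfInsert   -- hypothetical name
on vWEmployeeDetails
Instead Of Insert
as
Begin
    -- Reject the statement if any inserted row names a department that does not exist
    if exists (Select 1
               from inserted
               left join tblDepartment
               on tblDepartment.DeptName = inserted.DeptName
               where tblDepartment.DeptId is null)
    Begin
        Raiserror('Invalid Department Name. Statement terminated', 16, 1)
        Return
    End

    -- Translate DeptName into DeptId and insert into the base table
    Insert into tblEmployee (Id, Name, Gender, DepartmentId)
    Select inserted.Id, inserted.Name, inserted.Gender, tblDepartment.DeptId
    from inserted
    join tblDepartment
    on tblDepartment.DeptName = inserted.DeptName
End
```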
In this video we will learn about the INSTEAD OF UPDATE trigger. An INSTEAD OF UPDATE trigger
gets fired instead of an update event on a table or a view. For example, let's say we have an INSTEAD
OF UPDATE trigger on a view or a table. When you try to update a row within that view or table,
the trigger gets fired automatically instead of the UPDATE. INSTEAD OF UPDATE triggers are of
immense help in correctly updating a view that is based on multiple tables.
Let's create the required Employee and Department tables that we will be using for this demo.
Since we now have the required tables, let's create a view based on these tables. The view should
return the Employee Id, Name, Gender and DepartmentName columns. So, the view is obviously based on
multiple tables.
Script to create the view:
Create view vWEmployeeDetails
as
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
When you execute Select * from vWEmployeeDetails, the data from the view should be as shown
below.
In Part 45, we tried to insert a row into the view, and we got an error stating - 'View or function
vWEmployeeDetails is not updatable because the modification affects multiple base tables.'
Now, let's try to update the view in such a way that it affects both the underlying tables, and see if we
get the same error. The following UPDATE statement changes the Name column from tblEmployee and the
DeptName column from tblDepartment. So, when we execute this query, we get the same error.
Update vWEmployeeDetails
set Name = 'Johny', DeptName = 'IT'
where Id = 1
Now, let's try to change just the department of John from HR to IT. The following UPDATE query
affects only one table, tblDepartment. So, the query should succeed. But before executing the query,
please note that employees JOHN and BEN are in the HR department.
Update vWEmployeeDetails
set DeptName = 'IT'
where Id = 1
After executing the query, select the data from the view, and notice that BEN's DeptName is also
changed to IT. We intended to change just JOHN's DeptName. So, the UPDATE didn't work as
expected. This is because the UPDATE query updated the DeptName from HR to IT in the tblDepartment
table. For the UPDATE to work correctly, we should change the DepartmentId of JOHN from 3 to 1 in the
tblEmployee table.
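Written directly against the base table, the intended change is a one line statement. The Id and department values are the ones used in the text (JOHN has Id = 1, IT has DeptId = 1).

```sql
-- Move JOHN to the IT department without touching tblDepartment
Update tblEmployee set DepartmentId = 1 where Id = 1
```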
So, the conclusion is that if a view is based on multiple tables and you update the view, the
UPDATE may not always work as expected. To correctly update the underlying base tables through a view,
an INSTEAD OF UPDATE trigger can be used.
Before we create the trigger, let's update the DeptName back to HR for the record with DeptId = 3.
Update tblDepartment set DeptName = 'HR' where DeptId = 3
-- If DeptName is updated
if(Update(DeptName))
Begin
Declare @DeptId int
if(@DeptId is NULL )
Begin
Raiserror('Invalid Department Name', 16, 1)
Return
End
-- If gender is updated
if(Update(Gender))
Begin
Update tblEmployee set Gender = inserted.Gender
from inserted
join tblEmployee
on tblEmployee.Id = inserted.id
End
-- If Name is updated
if(Update(Name))
Begin
Update tblEmployee set Name = inserted.Name
from inserted
join tblEmployee
on tblEmployee.Id = inserted.id
End
End
The UPDATE query works as expected. The INSTEAD OF UPDATE trigger correctly updates JOHN's
DepartmentId to 1 in the tblEmployee table.
Now, let's try to update Name, Gender and DeptName. The UPDATE query works as expected,
without raising the error - 'View or function vWEmployeeDetails is not updatable because the modification
affects multiple base tables.'
Update vWEmployeeDetails
set Name = 'Johny', Gender = 'Female', DeptName = 'IT'
where Id = 1
The Update() function used in the trigger returns true even if you update a column with the same value. For this
reason, I recommend comparing values between the inserted and deleted tables, rather than relying on the
Update() function. The Update() function does not operate on a per row basis, but across all rows.
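The comparison recommended above could be sketched as follows for the Name and Gender columns; the DeptName handling is left out to keep the example short. The trigger name is hypothetical, and because a view allows only one INSTEAD OF UPDATE trigger, the existing trigger would have to be dropped or altered first.

```sql
Create Trigger tr_vWEmployeeDetails_InsteadOfUpdate_Compare   -- hypothetical name
on vWEmployeeDetails
Instead Of Update
as
Begin
    -- Update Name only for the rows where the value really changed
    Update tblEmployee set Name = inserted.Name
    from inserted
    join deleted on deleted.Id = inserted.Id
    join tblEmployee on tblEmployee.Id = inserted.Id
    where IsNull(inserted.Name, '') <> IsNull(deleted.Name, '')

    -- Same comparison for Gender
    Update tblEmployee set Gender = inserted.Gender
    from inserted
    join deleted on deleted.Id = inserted.Id
    join tblEmployee on tblEmployee.Id = inserted.Id
    where IsNull(inserted.Gender, '') <> IsNull(deleted.Gender, '')
End
```

The IsNull() wrappers treat NULL and an empty string as equal, which is a simplification.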
In this video we will learn about the INSTEAD OF DELETE trigger. An INSTEAD OF DELETE trigger gets
fired instead of the DELETE event on a table or a view. For example, let's say we have an INSTEAD OF
DELETE trigger on a view or a table. When you try to delete a row from that view or table,
the trigger gets fired automatically instead of the actual DELETE event. INSTEAD OF DELETE
triggers are used to delete records from a view that is based on multiple tables.
Let's create the required Employee and Department tables that we will be using for this demo.
Since we now have the required tables, let's create a view based on these tables. The view should
return the Employee Id, Name, Gender and DepartmentName columns. So, the view is obviously based on
multiple tables.
Script to create the view:
When you execute Select * from vWEmployeeDetails, the data from the view should be as shown
below.
In Part 45, we tried to insert a row into the view, and we got an error stating - 'View or function
vWEmployeeDetails is not updatable because the modification affects multiple base tables'. Along the
same lines, in Part 46, when we tried to update a view that is based on multiple tables, we got the same
error. To get the error, the UPDATE should affect both the base tables. If the update affects only one
base table, we don't get the error, but the UPDATE does not work correctly if the DeptName column is
updated.
Now, let's try to delete a row from the view. We get the same error.
Delete from vWEmployeeDetails where Id = 1
--Subquery
--Delete from tblEmployee
--where Id in (Select Id from deleted)
End
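Only the commented-out subquery alternative of the trigger survives above. A complete INSTEAD OF DELETE trigger using a join could be sketched like this; the trigger name is an assumption.

```sql
Create Trigger tr_vWEmployeeDetails_InsteadOfDelete   -- hypothetical name
on vWEmployeeDetails
Instead Of Delete
as
Begin
    -- Delete from the employee base table every row that the DELETE
    -- statement on the view selected (available in the deleted table)
    Delete tblEmployee
    from tblEmployee
    join deleted
    on tblEmployee.Id = deleted.Id
End
```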
Upon executing the following DELETE statement, the row gets deleted as expected from the
tblEmployee table.
Delete from vWEmployeeDetails where Id = 1
Instead of Insert: DELETED table is always empty and the INSERTED table contains the newly inserted data.
Instead of Delete: INSERTED table is always empty and the DELETED table contains the rows deleted.
Instead of Update: DELETED table contains OLD data (before update), and the INSERTED table contains NEW data (updated data).
Let's create the required Employee and Department tables, that we will be using for this demo.
Now, we want to write a query which would return the following output. The query should return the
Department Name and the total number of employees within the department. Only the departments with
2 or more employees should be returned.
Obviously, there are several ways to do this. Let's see how to achieve this with the help of a view.
Script to create the View
Create view vWEmployeeCount
as
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId
Note: Views get saved in the database, and can be available to other queries and stored procedures.
However, if this view is only used at this one place, it can be easily eliminated using other options, like
CTEs, derived tables, temp tables, table variables etc.
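With the view in place, the query for the required output reduces to a simple filter. This sketch applies the requirement stated earlier (departments with 2 or more employees).

```sql
Select DeptName, TotalEmployees
from vWEmployeeCount
where TotalEmployees >= 2
```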
Now, let's see how to achieve the same using temporary tables. We are using local temporary
tables here.
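Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
into #TempEmployeeCount
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId
-- Return only the departments with 2 or more employees
Select DeptName, TotalEmployees
from #TempEmployeeCount
where TotalEmployees >= 2
-- Clean up the temporary table once it is no longer needed
Drop Table #TempEmployeeCount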
Note: Temporary tables are stored in TempDB. Local temporary tables are visible only in the current
session, and can be shared between nested stored procedure calls. Global temporary tables are visible to
other sessions and are destroyed when the last connection referencing the table is closed.
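-- The table variable must be declared before it is used
Declare @tblEmployeeCount table
(DeptName nvarchar(20), DepartmentId int, TotalEmployees int)
Insert @tblEmployeeCount
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId
-- Return only the departments with 2 or more employees
Select DeptName, TotalEmployees
from @tblEmployeeCount
where TotalEmployees >= 2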
Note: Just like temp tables, a table variable is also created in TempDB. The scope of a table variable is
the batch, stored procedure, or statement block in which it is declared. Table variables can be passed as
parameters between procedures (as read-only table valued parameters of a user-defined table type).
Note: Derived tables are available only in the context of the current query.
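The derived table version of the query is not shown in this transcript. A sketch that follows the same pattern as the other options is given below; the alias EmployeeCount is an assumption.

```sql
Select DeptName, TotalEmployees
from
(
    -- The derived table exists only for the duration of this query
    Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
    from tblEmployee
    join tblDepartment
    on tblEmployee.DepartmentId = tblDepartment.DeptId
    group by DeptName, DepartmentId
) as EmployeeCount
where TotalEmployees >= 2
```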
Using CTE
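With EmployeeCount(DeptName, DepartmentId, TotalEmployees)
as
(
Select DeptName, DepartmentId, COUNT(*) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName, DepartmentId
)
-- A CTE must be followed immediately by the statement that uses it
Select DeptName, TotalEmployees
from EmployeeCount
where TotalEmployees >= 2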
Note: A CTE can be thought of as a temporary result set that is defined within the execution scope of a
single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived
table in that it is not stored as an object and lasts only for the duration of the query.
Let's create the required Employee and Department tables that we will be using for this demo.
Write a query using a CTE to display the total number of employees by Department Name. The output
should be as shown below.
Before we write the query, let's look at the syntax for creating a CTE.
WITH cte_name (Column1, Column2, ..)
AS
( CTE_query )
We define a CTE using the WITH keyword, followed by the name of the CTE. In our example,
EmployeeCount is the name of the CTE. Within parentheses, we specify the columns that make up the
CTE. DepartmentId and TotalEmployees are the columns of the EmployeeCount CTE. These 2 columns
map to the columns returned by the CTE's SELECT query. The CTE column names and the CTE query
column names can be different. In fact, CTE column names are optional. However, if you do specify them, the
number of CTE columns and the number of columns in the CTE SELECT query should be the same. Otherwise you
will get an error stating - 'EmployeeCount has fewer columns than were specified in the column list'. The column
list is followed by the AS keyword, following which we have the CTE query within a pair of parentheses.
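A CTE matching this description could be sketched as follows. The CTE name and its two columns come from the text; the body of the query and the ORDER BY are assumptions.

```sql
With EmployeeCount(DepartmentId, TotalEmployees)
as
(
    Select DepartmentId, COUNT(*) as TotalEmployees
    from tblEmployee
    group by DepartmentId
)
-- The SELECT that uses the CTE follows it immediately
Select DeptName, TotalEmployees
from tblDepartment
join EmployeeCount
on tblDepartment.DeptId = EmployeeCount.DepartmentId
order by TotalEmployees
```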
The EmployeeCount CTE is joined with the tblDepartment table in the SELECT query that immediately
follows the CTE. Remember, a CTE can only be referenced by a SELECT, INSERT, UPDATE, or
DELETE statement that immediately follows the CTE. If you try to do something else in between, you
get an error stating - 'Common table expression defined but not used'. The following SQL raises an error.
Select 'Hello'
EmployeesCountBy_HR_Admin_Dept(DepartmentName, Total)
as
(
Select DeptName, COUNT(Id) as TotalEmployees
from tblEmployee
join tblDepartment
on tblEmployee.DepartmentId = tblDepartment.DeptId
group by DeptName
)
Select * from EmployeesCountBy_HR_Admin_Dept
UNION
Select * from EmployeesCountBy_Payroll_IT_Dept
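The statement above starts in the middle: the WITH keyword and the first CTE are missing. A complete statement that defines two CTEs in a single WITH clause, separated by a comma, could look like this sketch. The two CTE names come from the text, but both WHERE filters are assumptions inferred from those names.

```sql
With EmployeesCountBy_Payroll_IT_Dept(DepartmentName, Total)
as
(
    Select DeptName, COUNT(Id) as TotalEmployees
    from tblEmployee
    join tblDepartment
    on tblEmployee.DepartmentId = tblDepartment.DeptId
    where DeptName in ('Payroll', 'IT')     -- assumed filter
    group by DeptName
),
EmployeesCountBy_HR_Admin_Dept(DepartmentName, Total)
as
(
    Select DeptName, COUNT(Id) as TotalEmployees
    from tblEmployee
    join tblDepartment
    on tblEmployee.DepartmentId = tblDepartment.DeptId
    where DeptName in ('HR', 'Admin')       -- assumed filter
    group by DeptName
)
Select * from EmployeesCountBy_HR_Admin_Dept
UNION
Select * from EmployeesCountBy_Payroll_IT_Dept
```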
Let's create the required tblEmployee and tblDepartment tables that we will be using for this demo.
Let's now update JOHN's gender from Male to Female, using the Employees_Name_Gender CTE.
With Employees_Name_Gender
as
(
Select Id, Name, Gender from tblEmployee
)
Update Employees_Name_Gender Set Gender = 'Female' where Id = 1
Now, query the tblEmployee table. JOHN's gender is actually updated. So, if a CTE is created on
one base table, then it is possible to UPDATE the CTE, which in turn will update the underlying base
table. In this case, updating the Employees_Name_Gender CTE updates the tblEmployee table.
Now, let's create a CTE on both the tables - tblEmployee and tblDepartment. The CTE should
return the Employee Id, Name, Gender and Department. In short, the output should be as shown below.
Let's update this CTE. Let's change JOHN's Gender from Female to Male. Here, the CTE is based on 2
tables, but the UPDATE statement affects only one base table, tblEmployee. So the UPDATE succeeds.
So, if a CTE is based on more than one table, and the UPDATE affects only one base table, then
the UPDATE is allowed.
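With EmployeesByDepartment
as
(
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblDepartment.DeptId = tblEmployee.DepartmentId
)
Update EmployeesByDepartment set Gender = 'Male' where Id = 1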
Now, let's try to UPDATE the CTE in such a way that the update affects both the tables - tblEmployee
and tblDepartment. This UPDATE statement changes Gender from the tblEmployee table and DeptName
from the tblDepartment table. When you execute this UPDATE, you get an error stating - 'View or function
EmployeesByDepartment is not updatable because the modification affects multiple base tables'. So, if a
CTE is based on multiple tables, and the UPDATE statement affects more than 1 base table, then the
UPDATE is not allowed.
With EmployeesByDepartment
as
(
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblDepartment.DeptId = tblEmployee.DepartmentId
)
Update EmployeesByDepartment set
Gender = 'Female', DeptName = 'IT'
where Id = 1
Finally, let's try to UPDATE just the DeptName. Let's change JOHN's DeptName from HR to IT.
Before you execute the UPDATE statement, notice that BEN is also currently in the HR department.
With EmployeesByDepartment
as
(
Select Id, Name, Gender, DeptName
from tblEmployee
join tblDepartment
on tblDepartment.DeptId = tblEmployee.DepartmentId
)
Update EmployeesByDepartment set
DeptName = 'IT' where Id = 1
After you execute the UPDATE, select data from the CTE, and you will see that BEN's DeptName is also
changed to IT.
This is because, when we updated the CTE, the UPDATE actually changed the DeptName from HR
to IT in the tblDepartment table, instead of changing the DepartmentId column (from 3 to 1) in the
tblEmployee table. So, if a CTE is based on multiple tables, and the UPDATE statement affects only
one base table, the update succeeds. But the update may not work as you expect.
So in short:
1. If a CTE is based on a single base table, then the UPDATE succeeds and works as expected.
2. If a CTE is based on more than one base table, and the UPDATE affects multiple base tables, the
update is not allowed and the statement terminates with an error.
3. If a CTE is based on more than one base table, and the UPDATE affects only one base table, the
UPDATE succeeds (but not always as expected).
Note: Text values should be enclosed in single quotes; quotes are not required for numeric values.
Group By - Part 11
SQL Server has many aggregate functions. Examples:
1. Count()
2. Sum()
3. Avg()
4. Min()
5. Max()
The GROUP BY clause is used to group a selected set of rows into a set of summary rows by the values of one
or more columns or expressions. It is typically used in conjunction with one or more aggregate functions.
I want a SQL query which gives total salaries paid by City. The output should be as shown below.
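The query for this first requirement is not shown in this transcript. Following the pattern of the later queries, it could be sketched like this, assuming tblEmployee has City and Salary columns.

```sql
-- Total salary per city: one summary row for each distinct City value
Select City, SUM(Salary) as TotalSalary
from tblEmployee
group by City
```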
Note: If you omit the GROUP BY clause and try to execute the query, you get an error - Column
'tblEmployee.City' is invalid in the select list because it is not contained in either an aggregate function or
the GROUP BY clause.
Now, I want a SQL query which gives total salaries by City and by Gender. The output should be as shown
below.
Query for retrieving total salaries by city and by gender: It's possible to group by multiple columns. In
this query, we are grouping first by City and then by Gender.
Select City, Gender, SUM(Salary) as TotalSalary
from tblEmployee
group by City, Gender
Now, I want a SQL query which gives total salaries and the total number of employees by City and by
Gender. The output should be as shown below.
Query for retrieving total salaries and total number of employees by City, and by gender: The only
difference here is that we are also using the Count() aggregate function.
Select City, Gender, SUM(Salary) as TotalSalary,
COUNT(ID) as TotalEmployees
from tblEmployee
group by City, Gender
Filtering Groups:
The WHERE clause is used to filter rows before aggregation, whereas the HAVING clause is used to filter groups
after aggregation. The following 2 queries produce the same result.
Filtering groups using the HAVING clause, after all aggregations take place:
Select City, SUM(Salary) as TotalSalary
from tblEmployee
group by City
Having City = 'London'
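Only the HAVING version of the two queries is shown above. The equivalent WHERE version, which filters the rows before they are grouped, could be sketched as:

```sql
-- Filtering rows using the WHERE clause, before aggregation takes place
Select City, SUM(Salary) as TotalSalary
from tblEmployee
Where City = 'London'
group by City
```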
From a performance standpoint, you cannot say that one method is less efficient than the other. The SQL
Server optimizer analyzes each statement and selects an efficient way of executing it. As a best practice,
use the syntax that clearly describes the desired result. Try to eliminate rows that
you don't need as early as possible.
2. WHERE filters rows before aggregation (grouping), whereas HAVING filters groups after the
aggregations are performed.
3. Aggregate functions cannot be used in the WHERE clause, unless it is in a subquery contained in a
HAVING clause, whereas aggregate functions can be used in the HAVING clause.
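As a short sketch of the last point, an aggregate can be used directly in HAVING to filter groups. The threshold of 4000 is an arbitrary value chosen for illustration.

```sql
-- Cities whose total salary exceeds the (arbitrary) threshold
Select City, SUM(Salary) as TotalSalary
from tblEmployee
group by City
Having SUM(Salary) > 4000

-- Writing the same condition as "Where SUM(Salary) > 4000" raises an error,
-- because WHERE is evaluated before the rows are grouped and aggregated
```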