Indexes and Operators

Uploaded by

venkatbj

Indexes:

Indexes are used to speed up data access in the database. With an index, the engine can
quickly find rows in a table without having to read all of the table's data.

The index structure resembles an inverted tree, similar to a directory structure. The
tree begins with the first page of the index, the root node, which contains
pointers to other pages in the index. Next come the intermediate (branch)
nodes, which contain pointers to leaf nodes or to other branch nodes. The leaf
nodes are the lowest-level pages in an index; each contains either a row identifier
(RowId) that points to the actual data in the table, or the clustering
key itself.

The following B-Tree (Balanced Tree) is a sample index containing 1000
records, showing how they are divided among the root node, branch nodes and leaf nodes.

Consider the above B-Tree structure and suppose we are searching for a field with the value 978.
First the database engine checks the root node, which covers the values 1 to 1000. Since the
value is 978, it follows the pointer to the branch node covering 501 to 1000, and from there
to the branch node covering 751 to 1000. Finally it reaches the leaf node covering 876 to 1000
and finds the desired record, 978.

Thus, by using the index, the engine needed only a few page reads, one per level of the tree.
Compared with scanning all 1000 records to fetch the result, using the index reduces
the IO to a large extent.
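
One way to observe this difference in reads is SET STATISTICS IO, which reports the logical reads for each statement. The sketch below assumes a hypothetical table dbo.Orders with an index on OrderID; these names are illustrative and do not come from the original post.

```sql
-- Hypothetical table and column names, for illustration only.
SET STATISTICS IO ON;    -- report logical/physical reads for each statement

SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE OrderID = 978;     -- with an index on OrderID, only a few pages should be read

SET STATISTICS IO OFF;
```

Running the same query before and after creating the index, and comparing the "logical reads" figures in the messages output, shows how much IO the index saves.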

Clustered Index:
A clustered index stores the table data itself at the leaf level of the index, ordered by the
clustering key. Because the clustered index keeps the data in sorted order, the rows do
not need to be sorted again when they are read in key order. There can be only one clustered index
per table, since the rows can be stored in only one order.

Consider the analogy of a dictionary, where the words are sorted alphabetically and each
definition appears next to its word. Similarly, in a clustered index (CI) the leaf pages
contain the complete records, in sorted order.

NonClustered Index:
A nonclustered index is analogous to the index at the back of a book: we can use
the book's index to locate the pages that match an index entry. In SQL Server 2005 a table
can have up to 249 nonclustered indexes (NCI) and 1 clustered index; SQL Server 2008 and
later allow up to 999 nonclustered indexes.
If a table does not have a clustered index, it is unsorted and is called a heap. An
NCI created on a heap contains pointers to table rows: each entry in the index page
contains a row ID (RID), which is a pointer to a table row in the heap. If the table
has a clustered index, the index pages of an NCI contain the CI keys rather than RIDs.
This pointer, whether it is a RID or a CI key, is called a row locator, and following it to
fetch the row is called a lookup (a RID lookup on a heap, a key lookup on a clustered table).
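
As a minimal sketch of the three cases described above (heap, clustered index, nonclustered index), the statements below use a hypothetical table dbo.Customer; the table, column and index names are assumptions made for illustration.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.Customer
(
    CustomerID int         NOT NULL,
    LastName   varchar(50) NOT NULL,
    City       varchar(50) NULL
);  -- no clustered index yet, so the table is a heap

-- Only one clustered index is allowed per table:
-- its leaf level holds the data rows, ordered by CustomerID.
CREATE CLUSTERED INDEX CIX_Customer_CustomerID
    ON dbo.Customer (CustomerID);

-- Nonclustered index: its leaf entries hold LastName plus the
-- clustering key (CustomerID), which is used to locate the full row.
CREATE NONCLUSTERED INDEX IX_Customer_LastName
    ON dbo.Customer (LastName);
```

Had the nonclustered index been created while the table was still a heap, its leaf entries would carry RIDs instead of clustering keys.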

Scan and Seek:

The following are the operators related to indexes. They appear in SQL
Server query execution plans.

Table scan – A table scan reads all of the data in a table, whether the query returns
the entire table or only specific records. On a table with numerous records a table scan
affects performance severely; on a table with few records it is fine, and can even be
the cheapest plan.
So if you see that SQL Server has performed a Table Scan, take note of how
many rows are in the table. If there aren't many, the Table Scan is not a problem.
In the queries below against the AdventureWorks database, the first is a Select *
with no WHERE clause, so it must fetch every row and hence does a table scan. The second
filters on the customerid field, which has no index, so it also results in a table scan.

Select * from sales.storecontact


Select * from sales.storecontact
where customerid=322

Clustered Index scan – This is simply a table scan on a table that has a
clustered index. Since the clustered index leaf pages contain the data itself,
a clustered index scan reads all of the records, which affects
performance. As mentioned earlier, if the table has few records the impact
is small.

For the below query the optimizer does a clustered index scan to retrieve the
records:

Select * from sales.storecontact
where contacttypeid=15

Clustered index seek – A seek retrieves only the selected records from a table,
whereas a scan traverses all the records in the table. If the seek
operation is done using a clustered index, it is a clustered index seek. Basically, any seek
operation traverses the B-Tree vertically, from the root down to the leaf, to fetch the records.
Consider the below query, for which I created a clustered index on the Customerid field.
The optimizer uses the clustered index and performs a clustered index seek.

Select * from sales.storecontact
where customerid=322

The clustered index seek goes directly to the records where
customerid=322 and fetches the output. Compared with a table scan, which
traverses all the records, an index seek reads only the qualifying
records and is therefore good for performance.

Index scan – An index scan is a scan of a nonclustered index: all
the rows in its leaf level are read. Since a scan touches every row whether or not it
qualifies, its cost is proportional to the total number of rows in the table.

Index seek – An index seek uses the nonclustered index to seek the records in a
table, and is better for performance when the predicate is highly selective, that is,
when it matches only a small fraction of the rows.

For the below query the optimizer does an index seek using the NC index on the
contactid field. Since the NC index covers only contactid, it cannot supply all
the columns with an index seek alone. So it uses a seek to find the records that
have contactid=322 and then does a key lookup, using the clustered index key, to
fetch the remaining columns of those records.

Select * from sales.storecontact
where contactid=322

The key lookup is an expensive operation if there are numerous records. Since key
lookups increase the IO, we might have to avoid them in some cases. An index with
included columns can overcome this situation: it covers the entire query,
so the query is answered by an index seek alone.
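
A minimal sketch of such a covering index is given below. The index name is hypothetical, and the example assumes the query needs only the contactid and contacttypeid columns rather than Select *.

```sql
-- Assumption: the index name is illustrative, and the query reads only these two columns.
CREATE NONCLUSTERED INDEX IX_StoreContact_ContactID_Covering
    ON Sales.StoreContact (ContactID)
    INCLUDE (ContactTypeID);   -- stored at the leaf level only, not part of the key

-- Every column the query references is now in the index,
-- so no key lookup against the clustered index is needed.
SELECT ContactID, ContactTypeID
FROM Sales.StoreContact
WHERE ContactID = 322;
```

A Select * query would still need a key lookup unless every column of the table were included, which is why listing only the required columns matters here.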

POSTED BY DEEPAK AT 3:06 PM
LABELS: PERFORMANCE TUNING

MONDAY, MARCH 17, 2008

Identifying Database Fragmentation in SQL Server 2005

We can make use of the dynamic management views or dynamic
management functions in SQL Server 2005 to determine the internal and
external fragmentation. The DMF sys.dm_db_index_physical_stats is used
to identify fragmentation in SQL Server 2005.

The general syntax for this DMF is given below,

SELECT * FROM sys.dm_db_index_physical_stats
(DB_ID(N'Database Name'), OBJECT_ID(N'Table Name'), NULL, NULL, NULL)

The following columns are essential for understanding the fragmentation of
all the tables in the AdventureWorks database. Note that
avg_page_space_used_in_percent is returned as NULL in the default LIMITED
scan mode, so the query uses the SAMPLED mode.

SELECT object_id, index_id, avg_fragmentation_in_percent,
avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'),
NULL,NULL,NULL,'SAMPLED')

The below query provides the list of all the tables and their fragmentation
levels.

USE ADVENTUREWORKS
GO
SELECT CAST(DB_NAME(database_id) AS varchar(20)) AS [Database Name],
CAST(OBJECT_NAME(object_id) AS varchar(20)) AS [TABLE NAME],
index_id, index_type_desc, avg_fragmentation_in_percent,
avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'),
NULL,NULL,NULL,'Detailed')

The output of the above query is shown in the below screenshot.

We should view the output of the avg_fragmentation_in_percent column in the
above query. If the value is greater than 10, it implies that external
fragmentation is present.

We should also view the output of the avg_page_space_used_in_percent
column. If it is less than 75, it indicates internal fragmentation.

Managing Index Fragmentation:

Once we have identified whether there is internal or external fragmentation,
we need to perform one of the following,

ALTER INDEX REORGANIZE:

This command reorganizes an index: it physically reorders the leaf
pages to match their logical order and compacts them. The syntax is given below,

ALTER INDEX index_name ON object_name REORGANIZE

ALTER INDEX REBUILD:

This command rebuilds the index. It removes both internal and
external fragmentation by dropping and re-creating the index.

ALTER INDEX index_name ON object_name REBUILD

Refer to the ALTER INDEX topic in SQL Server Books Online for more options in the Reorganize and Rebuild commands.
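
As a concrete sketch, the statements below apply both commands to the dbo.DatabaseLog table in AdventureWorks. The ALL keyword targets every index on the table, so no index name has to be assumed; the fill factor of 80 is an illustrative value, not a recommendation from the original post.

```sql
USE AdventureWorks;
GO
-- Reorganize every index on the table; the table stays available while this runs.
ALTER INDEX ALL ON dbo.DatabaseLog REORGANIZE;
GO
-- Rebuild every index on the table, leaving 20% free space on each leaf page.
ALTER INDEX ALL ON dbo.DatabaseLog REBUILD WITH (FILLFACTOR = 80);
GO
```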

Determine when to Reorganize or Rebuild an Index:

1. Execute ALTER INDEX REORGANIZE when avg_page_space_used_in_percent is
less than 75 and greater than 60, or avg_fragmentation_in_percent is
greater than 10 but less than 15.

2. Execute ALTER INDEX REBUILD when avg_page_space_used_in_percent is less
than 60, or avg_fragmentation_in_percent is greater than 15.
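
These two rules can be sketched as a single query that suggests an action for each index. This is one possible reading of the thresholds: the rules do not say what to do at exactly 60 or 15, so the sketch treats those boundary values as REORGANIZE, and the column alias suggested_action is an assumption for illustration.

```sql
USE AdventureWorks;
GO
SELECT OBJECT_NAME(ps.object_id)         AS table_name,
       i.name                            AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.avg_page_space_used_in_percent,
       CASE
           -- Rule 2: heavy internal or external fragmentation
           WHEN ps.avg_page_space_used_in_percent < 60
             OR ps.avg_fragmentation_in_percent > 15 THEN 'REBUILD'
           -- Rule 1: moderate fragmentation (boundary values land here)
           WHEN ps.avg_page_space_used_in_percent < 75
             OR ps.avg_fragmentation_in_percent > 10 THEN 'REORGANIZE'
           ELSE 'NONE'
       END                               AS suggested_action
FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'),
         NULL, NULL, NULL, 'SAMPLED') AS ps   -- SAMPLED mode fills in page space used
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.index_id > 0                          -- skip heaps, which have no index to alter
  AND ps.alloc_unit_type_desc = 'IN_ROW_DATA';
```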

POSTED BY DEEPAK AT 12:24 PM

SQL Server 2000 Index Defragmentation

When you perform any data modification operations (INSERT, UPDATE, or
DELETE statements), table fragmentation can occur. When changes are made
to the data that affect the index, index fragmentation can occur and the
information in the index can get scattered in the database. Fragmented data
can cause SQL Server to perform unnecessary data reads, so a query's
performance against a heavily fragmented table can be very poor.

Index fragmentation comes in two different forms: External
Fragmentation and Internal Fragmentation.

External Fragmentation:
External fragmentation occurs when the index leaf pages are not physically stored in
their logical order. When an index is created, the index keys are placed in logical order
on a set of index pages. As new data is inserted into the index, it is possible for the new
keys to be inserted in between existing keys. This may cause new index pages to be created to
accommodate any existing keys that were moved so that the new keys can be
inserted in the correct order. These new index pages usually will not be physically
adjacent to the pages the moved keys were originally stored in. It is this process of
creating new pages that causes the physical order of the index pages to differ from
their logical order.

Internal Fragmentation:

Internal fragmentation occurs when the index pages are not filled to their
maximum capacity. Some free space may be an advantage in an application with heavy
data inserts, which is why setting a fill factor deliberately leaves space on index pages.
Severe internal fragmentation, however, increases the index size and causes additional
reads to be performed to return the needed data. These extra reads can degrade
query performance.

How to determine if an index is fragmented?

SQL Server provides a database command, DBCC SHOWCONTIG, to
determine whether a particular table or index is fragmented.

DBCC SHOWCONTIG

[(
{ 'table_name' | table_id | 'view_name' | view_id }
[ , 'index_name' | index_id ]
)]
[ WITH
{
[ , [ ALL_INDEXES ] ]
[ , [ TABLERESULTS ] ]
[ , [ FAST ] ]
[ , [ ALL_LEVELS ] ]
[ NO_INFOMSGS ]
}
]

Permissions default to members of the sysadmin server role,
the db_owner and db_ddladmin database roles, and the
table owner.

The below query identifies the fragmentation of all
the indexes in the DatabaseLog table in the
AdventureWorks database.

USE AdventureWorks
GO
DBCC SHOWCONTIG ('dbo.Databaselog') With All_Indexes
GO

DBCC SHOWCONTIG scanning 'DatabaseLog' table...
Table: 'DatabaseLog' (2073058421); index ID: 2, database ID: 6
LEAF level scan performed.
- Pages Scanned...............................: 1
- Extents Scanned.............................: 1
- Extent Switches.............................: 0
- Avg. Pages per Extent......................: 1.0
- Scan Density [Best Count:Actual Count]..: 100.00% [1:1]
- Logical Scan Fragmentation ...............: 0.00%
- Extent Scan Fragmentation ................: 0.00%
- Avg. Bytes Free per Page ..................:1331.0
- Avg. Page Density (full) ....................: 83.56%
DBCC execution completed. If DBCC printed error messages,
contact your system administrator.

This output indicates that the table is free of fragmentation,
based on the logical scan fragmentation (0.00%), extent scan
fragmentation (0.00%) and scan density (100%).
When you examine the results from DBCC SHOWCONTIG, pay
particular attention to logical scan fragmentation
and average page density.
Consider defragmenting the indexes if the
logical scan fragmentation is 20% or more.
The scan density should be close to 100%; a value
below 100% indicates some fragmentation. We need to
defragment the indexes if the page density is less than 80%,
because low values of average page density result in more
pages that must be read to satisfy a query.

How to reduce fragmentation?

We can reduce fragmentation and improve performance
using one of the following methods,
1. Dropping and re-creating the index

2. Rebuilding an index using DBCC DBREINDEX statement

3. Defragmenting an index using DBCC INDEXDEFRAG statement

Dropping and re-creating the index:

Dropping and re-creating an index has the advantage of completely rebuilding
it, which reorders the index pages, compacts the pages, and drops
any unneeded pages. You may need to consider dropping and re-creating indexes
that show high levels of both internal and external fragmentation.

One disadvantage of dropping and re-creating an index with DROP
INDEX and CREATE INDEX is that the index disappears in the meantime.
Between the drop and the re-creation it is not
available for queries, and query performance may suffer dramatically until the
index is rebuilt.

Another disadvantage of dropping and re-creating an index is the potential to cause
blocking, as all requests to the index are blocked until the index is rebuilt.
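
A minimal sketch of this method is shown below, using the older table.index form of DROP INDEX. The table dbo.Orders and the index IX_Orders_CustomerID are hypothetical names chosen for illustration.

```sql
-- Hypothetical table and index names, for illustration only.
-- Between these two statements the index does not exist,
-- so queries that relied on it fall back to scans.
DROP INDEX Orders.IX_Orders_CustomerID;

CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);
```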

Rebuilding an index using DBCC DBREINDEX statement:

Rebuilding an index is a more efficient way to reduce fragmentation than
dropping and re-creating it, because the rebuild is done
by one statement, which is easier than coding multiple DROP INDEX and CREATE
INDEX statements. To rebuild indexes, you can use the DBCC DBREINDEX
statement. It can be used to rebuild one or more indexes for a specified table.

Advantages:

1. It automatically rebuilds the statistics while rebuilding an index.

2. It is faster when rebuilding large or heavily fragmented indexes.

Disadvantages:

1. While DBREINDEX is running the table is unavailable to users (it is an offline operation).

2. Log space usage is high during a DBREINDEX operation, so it is advisable to
switch the recovery model to Bulk-Logged while it runs.

3. If you stop DBREINDEX, the entire operation is rolled back, since it runs as one atomic
transaction.
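
As a short sketch, the statement below rebuilds every index on the dbo.DatabaseLog table used earlier. The empty index name means "all indexes on the table", and the fill factor of 80 is an illustrative value rather than a recommendation.

```sql
USE AdventureWorks;
GO
-- Arguments: table name, index name ('' = all indexes), fill factor.
DBCC DBREINDEX ('dbo.DatabaseLog', '', 80);
GO
```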

Defragmenting an index using DBCC INDEXDEFRAG statement:

The DBCC INDEXDEFRAG statement defragments the leaf level of the index so that the
physical order of the index pages matches the left-to-right logical order of the leaf
nodes. The statement reports an estimated
percentage completed every five minutes and can be terminated at any point in the
process; any completed work is retained.

Advantages:
1. INDEXDEFRAG is an online operation, so the table and indexes remain available while the
indexes are defragmented.

2. It can be stopped and restarted without a rollback, and no work is lost, because each
unit of DBCC INDEXDEFRAG work occurs as a separate transaction.

Disadvantages:

1. On an active system where the data is updated frequently, DBCC INDEXDEFRAG
skips the locked pages (pages that are being updated), so it cannot completely
eliminate fragmentation.

2. It is slow when run against large indexes, and it does not rebuild the statistics.
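
A minimal sketch of the command is given below. It defragments index ID 2 of dbo.DatabaseLog, the index reported in the DBCC SHOWCONTIG output earlier, and then refreshes statistics separately, since INDEXDEFRAG does not do so.

```sql
USE AdventureWorks;
GO
-- Arguments: database name, table name, index name or index ID.
DBCC INDEXDEFRAG (AdventureWorks, 'dbo.DatabaseLog', 2);
GO
-- Statistics are not updated by INDEXDEFRAG, so update them explicitly.
UPDATE STATISTICS dbo.DatabaseLog;
GO
```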
