Hungred Dot Com: 15 Ways To Optimize Your SQL Queries
The previous article, 10 Ways To Destroy A SQL Database, covered the mistakes
many companies make with their databases that eventually lead to a destroyed
database. In this article, you will get to know 15 ways to optimize your SQL
queries. Some of them are common knowledge, while others are less obvious.
Indexes
Indexing a column is a common way to speed up searches. Nonetheless, you must
understand how indexing works in your particular database in order to make full use
of indexes. Adding indexes blindly, without understanding how they work, might just
do the opposite, since every index also has to be maintained on each write.
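As a rough illustration of the point above, here is a minimal sketch that asks a database how it plans to run the same lookup before and after an index is added. It uses Python's built-in sqlite3 module with an in-memory SQLite database purely for convenience. The users table, its columns and the index name are made up for this example, and other DBMSs report their plans differently (MySQL, for instance, has its own EXPLAIN output).

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's EXPLAIN QUERY PLAN detail text as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of every plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (email, age) VALUES (?, ?)",
    [(f"user{i}@example.com", 18 + i % 50) for i in range(1000)],
)

lookup = "SELECT * FROM users WHERE email = 'user42@example.com'"

# Without an index, SQLite has to walk the whole table.
plan_without_index = query_plan(conn, lookup)

# With an index on the searched column, SQLite can seek to the match.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_with_index = query_plan(conn, lookup)

print(plan_without_index)
print(plan_with_index)
```

Checking the plan like this before and after adding an index is a cheap way to confirm the index is actually used, rather than assuming it is.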
Symbol Operator
Symbol operators such as >, <, =, != etc. are very helpful in our queries. We can
optimize some queries that use a symbol operator, provided the column is indexed. For
example, consider a query that selects every row whose value is smaller than 16.
A query that asks for values smaller than 16 is not optimized, because the DBMS may
have to look for the value 16 and THEN scan through the values below it. An
optimized version asks for values smaller than or equal to 15 instead.
This way the DBMS might jump straight to the value 15 instead. It is much the same
as how we would find the value 15 ourselves (we scan through and target ONLY 15)
compared to finding a value smaller than 16 (we have to determine whether each value
is smaller than 16, which is an additional operation). Any gain is likely to be
small, so measure it on your own DBMS.
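The original example queries did not survive on this page, so the sketch below is only an assumed reconstruction of the two forms being compared: a strict comparison against 16 and an inclusive comparison against 15. It uses Python's built-in sqlite3 module, and the scores table and points column are hypothetical. All it demonstrates is that the two forms return the same rows on an integer column, which is what makes the rewrite safe. Whether one is faster depends on the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with an indexed integer column.
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, points INTEGER)")
conn.execute("CREATE INDEX idx_scores_points ON scores (points)")
conn.executemany(
    "INSERT INTO scores (points) VALUES (?)",
    [(p,) for p in range(1, 31)],
)

# Strict comparison against 16.
below_16 = conn.execute(
    "SELECT points FROM scores WHERE points < 16 ORDER BY points"
).fetchall()

# Inclusive comparison against 15: same rows for integer values.
at_most_15 = conn.execute(
    "SELECT points FROM scores WHERE points <= 15 ORDER BY points"
).fetchall()

print(below_16 == at_most_15)
```

Note that the equivalence only holds for whole numbers. On a decimal column, a value such as 15.5 matches the first query but not the second.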
Wildcard
In SQL, the % symbol serves as a wildcard. Using a wildcard will slow down your
query, especially on really huge tables. We can optimize a wildcard query by using a
postfix wildcard instead of a prefix or full wildcard, because a postfix wildcard keeps
the start of the string fixed, so an index on the column can still be used.
#Full wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE '%hello%';
#Postfix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE 'hello%';
#Prefix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE '%hello';
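To see the difference between the three forms, the sketch below prints the query plan for each one. It is an illustration under several assumptions: it uses Python's built-in sqlite3 module, the greetings table and its columns are invented, and the column is declared COLLATE NOCASE because that is one of the conditions SQLite documents for using an index with its default case-insensitive LIKE. Other DBMSs have their own rules.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's EXPLAIN QUERY PLAN detail text as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table; NOCASE lets SQLite's default LIKE use the index.
conn.execute(
    "CREATE TABLE greetings ("
    "id INTEGER PRIMARY KEY, phrase TEXT COLLATE NOCASE, note TEXT)"
)
conn.execute("CREATE INDEX idx_greetings_phrase ON greetings (phrase)")
conn.executemany(
    "INSERT INTO greetings (phrase, note) VALUES (?, ?)",
    [("hello world", "a"), ("say hello", "b"), ("oh hello there", "c")],
)

full_plan = query_plan(conn, "SELECT * FROM greetings WHERE phrase LIKE '%hello%'")
postfix_plan = query_plan(conn, "SELECT * FROM greetings WHERE phrase LIKE 'hello%'")
prefix_plan = query_plan(conn, "SELECT * FROM greetings WHERE phrase LIKE '%hello'")

postfix_rows = conn.execute(
    "SELECT phrase FROM greetings WHERE phrase LIKE 'hello%'"
).fetchall()

print("full:   ", full_plan)
print("postfix:", postfix_plan)
print("prefix: ", prefix_plan)
```

On a typical SQLite build, only the postfix form is reported as an index search. The other two are reported as scans, because the start of the string is unknown.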
NOT Operator
Try to avoid the NOT operator in SQL. It is usually faster to search for an exact
match with a positive operator such as LIKE, IN, EXISTS or = than with a negative
operator such as NOT LIKE, NOT IN, NOT EXISTS or !=. A negative operator often
forces the search to check every single row to confirm that it does NOT match,
whereas a positive operator can use an index and stop as soon as the result has been
found. Imagine you have 1 million records in a table. That's bad.
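One way to check this claim on your own data is to compare query plans for a positive and a negative condition on the same indexed column. The sketch below does that with Python's built-in sqlite3 module. The tickets table, its columns and the status values are hypothetical, and the result is specific to SQLite. Other optimizers may treat negated conditions differently.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's EXPLAIN QUERY PLAN detail text as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table with an index on the filtered column.
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, title TEXT)")
conn.execute("CREATE INDEX idx_tickets_status ON tickets (status)")
conn.executemany(
    "INSERT INTO tickets (status, title) VALUES (?, ?)",
    [("open" if i % 10 == 0 else "closed", f"ticket {i}") for i in range(1000)],
)

# Positive condition: the index can be used to seek to matching rows.
positive_plan = query_plan(conn, "SELECT * FROM tickets WHERE status = 'open'")

# Negative condition: SQLite falls back to reading the table.
negative_plan = query_plan(conn, "SELECT * FROM tickets WHERE status != 'closed'")

print(positive_plan)
print(negative_plan)
```

Both queries return the same rows here, since every ticket is either open or closed, yet only the positive form is planned as an index search.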
COUNT VS EXIST
Some of us might use the COUNT operator to determine whether particular data exists.
Similarly, this is a very bad query, since COUNT will go through every matching record
in the table to compute the numeric value. The better alternative is to use the
EXISTS operator, which stops as soon as it finds the first record. Hence, it exists.
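The example query is missing from this copy of the article, so the following is a sketch of what the two approaches might look like, not the original code. It uses Python's built-in sqlite3 module, and the orders table and customer_id column are assumptions. Both checks give the same yes/no answer. The difference is that EXISTS is allowed to stop at the first matching row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for the existence check.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)",
    [(i % 20,) for i in range(1000)],
)


def exists_via_count(conn, customer_id):
    """Counts every matching row, then compares the total with zero."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return count > 0


def exists_via_exists(conn, customer_id):
    """Asks only whether at least one matching row is present."""
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM orders WHERE customer_id = ?)", (customer_id,)
    ).fetchone()
    return bool(found)


print(exists_via_count(conn, 7), exists_via_exists(conn, 7))
print(exists_via_count(conn, 99), exists_via_exists(conn, 99))
```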
Wildcard VS Substr
Most developers practice indexing. Hence, if a particular COLUMN has been indexed, it
is best to use a wildcard instead of SUBSTR.
#BAD
SELECT * FROM TABLE WHERE
The above applies SUBSTR to every single row in order to compare a single character,
so an index on the column cannot be used. On the other hand,
#BETTER
SELECT * FROM TABLE WHERE
COLUMN LIKE 'value%';
Wildcard query will run faster if the above query is searching for all rows that contain
Data Types
Use the most efficient (smallest) data types possible. It is unnecessary, and sometimes
dangerous, to provide a huge data type when a smaller one is more than sufficient.
For example, use the smaller integer types where possible to get smaller tables. In
MySQL, MEDIUMINT is often a better choice than INT because a MEDIUMINT column
uses 25% less space (3 bytes instead of 4). Likewise, VARCHAR is better than
LONGTEXT for storing an email address or other small details.
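The 25% figure follows directly from the storage sizes of MySQL's integer types. The small Python sketch below works through the arithmetic and also shows the largest signed value each type can hold, which is the other half of the trade-off. The helper names are made up for this illustration.

```python
# Storage size in bytes of MySQL's integer types.
MYSQL_INT_BYTES = {
    "TINYINT": 1,
    "SMALLINT": 2,
    "MEDIUMINT": 3,
    "INT": 4,
    "BIGINT": 8,
}


def savings_percent(smaller, larger):
    """Percentage of per-value storage saved by using `smaller` over `larger`."""
    return 100 * (1 - MYSQL_INT_BYTES[smaller] / MYSQL_INT_BYTES[larger])


def max_signed_value(type_name):
    """Largest value a signed column of this type can store."""
    bits = 8 * MYSQL_INT_BYTES[type_name]
    return 2 ** (bits - 1) - 1


print(savings_percent("MEDIUMINT", "INT"))   # 25.0
print(max_signed_value("MEDIUMINT"))         # 8388607
print(max_signed_value("INT"))               # 2147483647
```

So MEDIUMINT is only the better choice when the column will never need to go past roughly 8.3 million (signed).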
Primary Index
The primary key column used for indexing should be made as short as possible. This
makes identifying each row easy and efficient for the DBMS.
String indexing
It is unnecessary to index the whole string when a prefix or postfix of the string can be
indexed instead. This is especially advisable if the prefix or postfix provides a unique
identifier for the string. Shorter indexes are faster, not only because they require less
disk space, but also because they give you more hits in the index cache, and thus
fewer disk seeks.
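A common way to pick a prefix length is to measure how many distinct values each candidate length produces. The sketch below does this with Python's built-in sqlite3 module. The users table and the generated email values are invented for the example, and SQLite is used only to run the measurement. In MySQL, the resulting length would then be used in a prefix index such as CREATE INDEX idx_email ON users (email(4)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: 1000 addresses that differ only in a 3-digit number.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"u{i:03d}@example.com",) for i in range(1000)],
)


def prefix_selectivity(conn, length):
    """Fraction of rows that stay distinguishable using only `length` characters."""
    distinct, total = conn.execute(
        "SELECT COUNT(DISTINCT substr(email, 1, ?)), COUNT(*) FROM users",
        (length,),
    ).fetchone()
    return distinct / total


def shortest_unique_prefix(conn, max_length=20):
    """Smallest prefix length at which every row is still unique, or None."""
    for length in range(1, max_length + 1):
        if prefix_selectivity(conn, length) == 1.0:
            return length
    return None


for length in (1, 2, 3, 4):
    print(length, prefix_selectivity(conn, length))
print("shortest unique prefix:", shortest_unique_prefix(conn))
```

With this made-up data, four characters are already enough to tell every row apart, so indexing the full address would add size without adding selectivity.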
Hence, don't be lazy: limit the number of rows returned, which is both efficient and
can help minimize the damage of an SQL injection attack.
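The query that accompanied this tip is missing here, so the following is only a sketch of the idea, using Python's built-in sqlite3 module with a hypothetical articles table. The row limit is passed as a bound parameter and is not concatenated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with more rows than any single page needs.
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (title) VALUES (?)",
    [(f"Article {i}",) for i in range(1, 501)],
)

page_size = 10

# Only the first `page_size` rows are returned, however large the table is.
first_page = conn.execute(
    "SELECT id, title FROM articles ORDER BY id LIMIT ?",
    (page_size,),
).fetchall()

print(len(first_page))
print(first_page[0])
```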
In Subquery
Some of us will use a subquery within the IN operator such as this.
Doing this is very expensive because the DBMS may evaluate the outer query first and
then run the inner query again for each of its rows. Instead, we can use this.
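Neither of the original queries survived on this page. A common rewrite of an IN subquery is a join, so the sketch below assumes that is what was meant and shows that the two forms return the same rows. It uses Python's built-in sqlite3 module, and the customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: Bob has no orders, Ann has two, Cid has one.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cid")])
conn.executemany("INSERT INTO orders (customer_id) VALUES (?)",
                 [(1,), (1,), (3,)])

# Subquery inside IN.
with_in = conn.execute(
    "SELECT name FROM customers "
    "WHERE id IN (SELECT customer_id FROM orders) "
    "ORDER BY name"
).fetchall()

# Equivalent join; DISTINCT removes the duplicate caused by Ann's two orders.
with_join = conn.execute(
    "SELECT DISTINCT c.name FROM customers AS c "
    "JOIN orders AS o ON o.customer_id = c.id "
    "ORDER BY c.name"
).fetchall()

print(with_in)
print(with_join)
```

Without DISTINCT the join would list a customer once per order, so the rewrite is only equivalent when duplicates are handled.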
On the other hand, using UNION such as this will utilize indexes.
Summary
Admittedly, these optimization tips don't guarantee that your queries won't become
your system's bottleneck. It will take much more benchmarking and profiling to
optimize your SQL queries further. However, the simple optimizations above can be
used by anyone, and they might just help save some colleague's rice bowl while you
learn to write good queries. (It's either you or your team leader/manager.)
About Clay
I am Clay, the main writer for this website. I own a small web hosting
company in Malaysia and I'm available for hire as an individual contractor on
elance or odesk. You can find me on twitter.
I couldn't understand the Symbol Operator point. Could you please explain it a
little further?
You are wrong about the NOT operator. If you think about it you will realize that
you can determine if there are NO black marbles in a bowl just as fast as you can
determine if there is at least one black marble. There is no need to examine every
marble; you can stop as soon as you find one black marble.
NOT EXISTS is exactly that: an EXISTS test that is logically negated. It's possible
that a NOT EXISTS (or NOT LIKE or NOT IN) test will examine every
row/character/list member if the searched item is not present, but that will
happen for both EXISTS and NOT EXISTS.
MAX and MIN do not look for the maximum or minimum value in a column,
and they aren't operators. The MIN and MAX functions are aggregate functions
that operate on the selected rows (or groups of rows if GROUP BY is used).
SELECT MAX(col) FROM table will find the maximum value of col, but the
functions are more general than that. Indexes are expensive to maintain, and
indexing columns just to speed up MIN and MAX is not great advice.
Clay says:
October 27 at 3:20 PM
@Greg : I agree with you that indexing columns just to speed up MIN and MAX
is not good advice. Maybe there is a misunderstanding on that point. I meant
that MAX and MIN can be used on an indexed column for better speed. Deliberately
indexing a column because of a MIN or MAX is a pure NO NO. Thanks for the
feedback.
Well, regarding the NOT operator, if there is any algorithm available in the
world that works like a human, maybe your theory might just hit the right spot.
Clay says:
October 27 at 4:52 PM
@Veera : To make things simple: if there is a 15 in that column, it will
point directly at 15 (if there is no exact value, it behaves just like <16)
instead of going through all the rows to find the highest value that is
smaller than 16 (it might be 15, 14, 13, 12, 11, etc.; the DBMS does not know until
it looks through them). If there is an equal-to symbol, there is a probability
that it will jump directly to 15 and return the result directly to you.
unreal4u says:
October 28 at 1:08 AM
FROM customer
WHERE a=b)
UNION
(SELECT
0 AS name,
number_of_products
FROM products
WHERE y=z)
instead of:
SELECT * FROM customer
Faster:
SELECT * FROM customer AS a, products AS b
WHERE a.id = b.id_customer
I don't know how or why, but it turned out that the first case was much slower (10
seconds) than the second case (3,4 seconds).
It was, however, a system already in production, so I didn't have the chance to play
much with the query or with indexes.
The client, however, was very happy with the results xD
Greetings
JW says:
October 28 at 1:49 AM
"However, if that particular column was never used for searching purposes, it
gives no reason to index that particular column although it is given unique"
Um, I would say this is misleading too. What about unique columns that are
frequently used in joins, but never searched?
Clay says:
October 28 at 6:40 AM
@JW : Yup, you don't need an index when no search is being done on a
particular column. But if it is a unique and frequently used column, having an
index will perform better.
For joins, it depends on what DBMS you use. In MySQL, indexes perform more
efficiently when the joined columns have the same data type and size. Although you
don't need its result, a column that is frequently used in joins needs the DBMS to
search for the matching partners. Hence, having indexes will be better in joins too,
but the criteria to make it efficient depend on each individual DBMS implementation.
Clay says:
October 28 at 7:00 AM