blog-algomaster-io-p-a-detailed-guide-on-database-indexes
Suppose you’re trying to find the page in a book that contains information related to a
certain word.
Without an index page, you would have to go through every page, which could take
hours or even days.
Once you find the right index entry, you can efficiently jump to that page.
The index, since it's sorted alphabetically and gives page numbers for specific
information, saves us from spending too much time flipping through every page.
Credit: https://www.ctpublic.org/show/the-colin-mcenroe-show/2022-08-17/how-indexes-help-organize-our-world
Database indexes work in a similar manner. They guide the database to the exact
location of the data, enabling faster and more efficient data retrieval.
A database index is a super-efficient lookup table that allows a database to find data
much faster.
It holds the indexed column values along with pointers to the corresponding rows in
the table.
Without an index, the database might have to scan every single row in a massive table
to find what you want – a painfully slow process.
But, with an index, the database can zero in on the exact location of the desired data
using the index’s pointers.
Let's say we have a table named employees with the following structure:
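A minimal sketch of such a table, using Python's built-in sqlite3 module with an in-memory database. Only first_name and last_name are named in this article; the id, department, and salary columns and the sample rows are assumptions added for illustration.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Only first_name and last_name come from the article; the other
# columns are hypothetical.
conn.execute("""
    CREATE TABLE employees (
        id         INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL,
        department TEXT,
        salary     REAL
    )
""")

# Made-up sample rows.
conn.executemany(
    "INSERT INTO employees (first_name, last_name, department, salary) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Alice", "Smith", "Engineering", 120000),
        ("Bob", "Jones", "Sales", 80000),
        ("Carol", "Smith", "Marketing", 95000),
    ],
)

# PRAGMA table_info lists one row per column; field 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]
print(columns)
```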
Now, let's create an index on the last_name column to improve the performance of
queries that frequently search or sort based on the last name.
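A minimal sketch of that statement, run through Python's built-in sqlite3 module. The reduced three-column employees table is an assumption; the index name and column are the ones used in this article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced, hypothetical version of the employees table.
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

# Create an index named idx_last_name on the last_name column.
conn.execute("CREATE INDEX idx_last_name ON employees (last_name)")

# PRAGMA index_list (SQLite-specific) reports the indexes defined on a table;
# field 1 of each row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list(employees)")]
print(index_names)
```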
In this example, we use the CREATE INDEX statement to create an index named
idx_last_name on the employees table. The index is created on the last_name
column.
After creating the index, queries that involve conditions or sorting on the last_name
column will be optimized. For example:
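A minimal sketch of such a query, assuming SQLite via Python's sqlite3 module. The table layout and sample rows are hypothetical. EXPLAIN QUERY PLAN is SQLite-specific; other databases expose their plans through different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees (first_name, last_name) VALUES (?, ?)",
    [("Alice", "Smith"), ("Bob", "Jones"), ("Carol", "Smith")],
)
conn.execute("CREATE INDEX idx_last_name ON employees (last_name)")

query = "SELECT * FROM employees WHERE last_name = 'Smith'"
rows = conn.execute(query).fetchall()

# The last field of each EXPLAIN QUERY PLAN row describes one step of the plan.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]

print(rows)
print(plan)
```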
This query will use the idx_last_name index to quickly locate the rows where the
last_name is 'Smith', avoiding a full table scan.
You can also create indexes on multiple columns (composite indexes) if your queries
frequently involve conditions on multiple columns together. For example:
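A minimal sketch, assuming SQLite via Python's sqlite3 module. The index name idx_full_name is hypothetical, as is the reduced table layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

# idx_full_name is a made-up name; the column list defines the composite key.
conn.execute("CREATE INDEX idx_full_name ON employees (first_name, last_name)")

query = "SELECT * FROM employees WHERE first_name = 'Alice' AND last_name = 'Smith'"

# The last field of each EXPLAIN QUERY PLAN row describes one step of the plan.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
print(plan)
```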
This creates a composite index on the first_name and last_name columns, which
can be useful for queries that search or sort based on both columns.
2. Index Building: The database management system builds the index by scanning
the table and storing the values of the indexed column(s) along with a pointer to
the corresponding data.
4. Index Search: If an index exists, the database searches the index for the requested
data, using the pointers to quickly locate the data.
5. Data Retrieval: The database retrieves the requested data, using the pointers from
the index.
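The building, search, and retrieval steps can be sketched in a few lines of Python. This is a toy model, not how any particular engine implements indexes: the table is a plain list, the list position stands in for a row pointer, and the helper names build_index and index_lookup are hypothetical.

```python
from bisect import bisect_left

# Hypothetical table: the list position plays the role of a row pointer.
table = [
    {"id": 1, "last_name": "Smith"},
    {"id": 2, "last_name": "Jones"},
    {"id": 3, "last_name": "Brown"},
    {"id": 4, "last_name": "Smith"},
]


def build_index(rows, column):
    """Index building: scan the table once, store sorted (value, pointer) pairs."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))


def index_lookup(index, rows, value):
    """Index search and data retrieval: binary-search, then follow the pointers."""
    # (value, -1) sorts before every real entry for `value`, so bisect_left
    # lands on the first matching entry, if there is one.
    i = bisect_left(index, (value, -1))
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(rows[index[i][1]])
        i += 1
    return matches


last_name_index = build_index(table, "last_name")
smiths = index_lookup(last_name_index, table, "Smith")
print([row["id"] for row in smiths])
```

The lookup touches only the index entries around the match instead of every row in the table.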
Reduced CPU Usage: By reducing the number of rows that need to be scanned,
indexes can decrease CPU usage and optimize resource utilization.
Rapid Data Retrieval: Indexes enable quick data retrieval for queries that involve
equality or range conditions on the indexed columns.
Efficient Sorting: Indexes can also be used to efficiently sort data based on the
indexed columns, eliminating the need for expensive sorting operations.
Better Data Organization: Indexes can help maintain data organization and
structure, making it easier to manage and maintain the database.
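The sorting benefit can be observed in a query plan. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical employees table; the plan wording is SQLite-specific and other engines report this differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)


def plan_for(sql):
    """Return the human-readable steps of SQLite's query plan for `sql`."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT last_name FROM employees ORDER BY last_name"

# Without an index, SQLite has to sort the rows in a temporary structure.
plan_without_index = plan_for(query)

# With an index on last_name, the rows can be read in index order instead.
conn.execute("CREATE INDEX idx_last_name ON employees (last_name)")
plan_with_index = plan_for(query)

print(plan_without_index)
print(plan_with_index)
```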
Sparse index: Has entries only for some of the search key values. Suitable for
situations where the data has a large number of distinct search key values.
Hash Index: An index that uses a hash function to map values to specific locations.
Great for exact match queries.
Filtered Index: Indexes a subset of rows based on a specific filter condition.
Useful to improve query speed on commonly filtered columns.
Covering Index: Includes all the columns required by a query in the index itself,
eliminating the need to access the underlying table data.
Function-based index: Indexes that are created based on the result of a function
or expression applied to one or more columns of a table.
Full-Text Index: An index designed for full-text search, allowing for efficient
searching of text data.
Spatial Index: Used for indexing geographical data types.
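Three of these types (covering, filtered, and function-based) can be sketched in SQLite, where filtered indexes are called partial indexes and function-based indexes are called indexes on expressions. The table layout, column names, and index names below are all assumptions made for illustration, and the syntax differs in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        id         INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name  TEXT,
        email      TEXT,
        active     INTEGER
    )
""")

# Covering: the index holds every column the first query below needs.
conn.execute("CREATE INDEX idx_name_cover ON employees (last_name, first_name)")
# Filtered (partial): only rows with active = 1 are indexed.
conn.execute("CREATE INDEX idx_active_email ON employees (email) WHERE active = 1")
# Function-based (expression): indexes the result of lower(email).
conn.execute("CREATE INDEX idx_lower_email ON employees (lower(email))")


def plan_for(sql):
    """Return the human-readable steps of SQLite's query plan for `sql`."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


covering_plan = plan_for(
    "SELECT first_name FROM employees WHERE last_name = 'Smith'"
)
filtered_plan = plan_for(
    "SELECT id FROM employees WHERE active = 1 AND email = 'a@example.com'"
)
function_plan = plan_for(
    "SELECT id FROM employees WHERE lower(email) = 'a@example.com'"
)

for plan in (covering_plan, filtered_plan, function_plan):
    print(plan)
```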
Most database engines use either a B-Tree or a variation of B-Trees like B+ Trees.
B-Trees have a hierarchical structure with a root node, internal nodes (index nodes),
and leaf nodes.
Each node in a B-Tree contains a sorted array of keys and pointers to child nodes.
Self-Balancing: B-trees ensure that the 'height' of the tree stays balanced even
when inserting or deleting data. This ensures logarithmic time complexity
for insertion, deletion, and searching.
Ordered: B-trees keep the data sorted, making range queries ("find all orders
between date X and Y") and inequality comparisons very fast.
Disk-Friendly: B-trees are designed to work well with disk-based storage. A single
node of a B-tree often corresponds to a disk block, minimizing disk access
operations.
Many databases use a slightly modified B-tree variant called the B+ tree.
In a B+ tree, all data values are stored only in the leaf nodes, and the leaf nodes are
typically linked to one another, which can further improve performance for sequential
access patterns like range queries.
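The lookup path through such a tree can be sketched as follows. This is a simplified, hand-built B+ tree in Python that only supports reads; insertion, deletion, and rebalancing are left out, and the Node layout and function names are assumptions rather than any engine's actual design.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # internal nodes only
    values: list = field(default_factory=list)    # leaf nodes only (row pointers)
    next_leaf: Optional["Node"] = None            # leaf nodes only


def find_leaf(root, key):
    """Descend from the root to the leaf that would contain `key`."""
    node = root
    while node.children:
        # Child i covers keys k with keys[i-1] <= k < keys[i].
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Exact-match lookup: returns the row pointer or None."""
    leaf = find_leaf(root, key)
    for k, v in zip(leaf.keys, leaf.values):
        if k == key:
            return v
    return None


def range_scan(root, low, high):
    """Range lookup: find the first leaf, then walk the linked leaves."""
    leaf = find_leaf(root, low)
    result = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return result
            if k >= low:
                result.append(v)
        leaf = leaf.next_leaf
    return result


# Hand-built two-level tree with three linked leaves.
leaf3 = Node(keys=[40, 50], values=["row-40", "row-50"])
leaf2 = Node(keys=[20, 30], values=["row-20", "row-30"], next_leaf=leaf3)
leaf1 = Node(keys=[5, 10], values=["row-5", "row-10"], next_leaf=leaf2)
root = Node(keys=[20, 40], children=[leaf1, leaf2, leaf3])

print(search(root, 30))
print(range_scan(root, 10, 40))
```

The range scan descends the tree once and then follows next_leaf links, which is why sorted, linked leaves suit range queries.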
Hash Tables
Hash tables are used for hash indexes, which are based on a hash function.
A hash table consists of an array of buckets, with each bucket containing the addresses
for rows in the data.
Hash indexes employ a hash function to map keys to their corresponding bucket in the
hash table, enabling constant-time lookup operations.
Hash indexes provide fast equality lookups, as the hash function determines the exact
location of the data based on the key.
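A toy version of a hash index in Python, assuming a fixed number of buckets and using Python's built-in hash function. The HashIndex class and its methods are hypothetical; real engines also handle bucket resizing and on-disk layout.

```python
class HashIndex:
    """Toy hash index: an array of buckets holding (key, row address) pairs."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # The hash function decides which bucket a key belongs to.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, row_address):
        self._bucket(key).append((key, row_address))

    def lookup(self, key):
        # Only one bucket is inspected, regardless of table size.
        # Keys that collide into the same bucket are filtered out here.
        return [addr for k, addr in self._bucket(key) if k == key]


index = HashIndex()
for row_address, last_name in enumerate(["Smith", "Jones", "Brown", "Smith"]):
    index.insert(last_name, row_address)

print(index.lookup("Smith"))
print(index.lookup("Lee"))
```

Because the buckets are not ordered by key, this structure answers equality lookups but offers no help with range or sort queries.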
Bitmaps
Bitmap indexes use a bitmap (a binary array) to represent the presence or absence of a
specific key value in each row of a table.
Each bit in the bitmap corresponds to a row, and the value of the bit indicates whether
the key value exists in that row.
Bitmap indexes are well-suited for columns with low cardinality (a small number of
distinct values) and for performing complex queries involving multiple conditions.
Bitmap operations like AND, OR, and NOT are performed efficiently using bitwise
operations, making bitmap indexes suitable for analytical queries involving multiple
columns.
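A minimal sketch of the idea in Python, using arbitrary-precision integers as bitmaps (bit i represents row i). The sample rows, column names, and helper functions are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical table with two low-cardinality columns.
rows = [
    {"gender": "F", "status": "active"},
    {"gender": "M", "status": "inactive"},
    {"gender": "F", "status": "inactive"},
    {"gender": "M", "status": "active"},
    {"gender": "F", "status": "active"},
]


def build_bitmap_index(rows, column):
    """One bitmap per distinct value; bit i is set if row i holds that value."""
    bitmaps = defaultdict(int)
    for i, row in enumerate(rows):
        bitmaps[row[column]] |= 1 << i
    return dict(bitmaps)


def matching_rows(bitmap):
    """Translate a bitmap back into the list of row positions it marks."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


gender = build_bitmap_index(rows, "gender")
status = build_bitmap_index(rows, "status")
all_rows = (1 << len(rows)) - 1  # mask so that NOT stays within the table

female_and_active = gender["F"] & status["active"]
male_or_active = gender["M"] | status["active"]
not_active = ~status["active"] & all_rows

print(matching_rows(female_and_active))
print(matching_rows(male_or_active))
print(matching_rows(not_active))
```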
Identify Query Patterns: Analyze the most frequent and critical queries executed
against your database to determine which columns to index and which type of
index to use.
Index Frequently Used Columns: Consider indexing columns that are frequently
used in WHERE, JOIN, and ORDER BY clauses.
Index Selective Columns: Indexes are most effective on columns with a good
spread of data values (high cardinality). Indexing a gender column might be less
beneficial than one with a unique customer_id.
Use Appropriate Index Types: Choose the right index type for your data and
queries.
Consider Composite Indexes: For queries involving multiple columns, consider
creating composite indexes that encompass all relevant columns. This reduces the
need for multiple single-column indexes and improves query performance.
Avoid Over-Indexing: Avoid creating too many indexes, as this can lead to
increased storage requirements and slower write performance.
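One rough way to judge whether a column is selective enough to index is to compare its number of distinct values with the row count. The sketch below assumes SQLite via Python's sqlite3 module; the customers table, its generated data, and the selectivity helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, gender TEXT)")
conn.executemany(
    "INSERT INTO customers (customer_id, gender) VALUES (?, ?)",
    [(i, "F" if i % 2 else "M") for i in range(1, 1001)],
)


def selectivity(table, column):
    """Distinct values divided by total rows; closer to 1.0 means more selective."""
    # Identifiers are interpolated directly, so only pass trusted names.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total


gender_selectivity = selectivity("customers", "gender")
id_selectivity = selectivity("customers", "customer_id")

print(gender_selectivity)
print(id_selectivity)
```

A value near 1.0, as with customer_id here, means an index lookup narrows the search to very few rows, while a value near 0, as with gender, means each lookup still returns a large share of the table.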
Indexes take up extra disk space since they're additional data structures that
need to be stored alongside your tables.
Every time you insert, update, or delete data in a table with an index, the index
needs to be updated too. This slows down write operations, and the cost grows with
each additional index on the table.
To get the most out of indexes, choose the right columns and index type, monitor
performance, and avoid over-indexing.