Database Performance Considerations - Where indexes can improve performance and where not

In a nutshell a database index is an auxiliary data structure which allows for faster retrieval of data stored in the database. They are keyed off of a specific column so that queries like “Give me all people with a last name of ‘Sam’” are fast.

Indexes don’t come for free. What you gain for in retrieval speed you lose in insertion and deletion speed because every time you alter a table the indexes must be updated accordingly. If your table is updating frequently it’s possible that having indexes will cause overall performance of your database to suffer. There is also a space penalty, as the indexes take up space in memory or on disk. A single index is smaller than the table because it doesn’t contain all the data, only pointers to the data, but in general the larger the table the larger the index

(1) When to Index

- Columns frequently used in search or query expressions
- Columns used to join tables
- High-selectivity keys. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index's selectivity is optimal if few rows have the same value.
- Foreign keys

(2) When do not Index
- Columns and expressions with few distinct values
- Frequently updated columns

Consider indexing foreign keys of referential integrity constraints, since this column is commonly used to join both tables. Indexing foreign key columns helps avoid full table scans while searching for matching rows in the child when DML is performed on the parent.

When choosing to index a column (or group of columns), consider whether the performance gain for queries is worth the performance loss for massive INSERTs, UPDATEs, and DELETEs and the quantity of space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL Trace facility.

Finally, the best way to know if you need to index or not is generating the execution plan for the query. Analyzing it you can determine if an index should be or not created.