Should I index my database table(s), and if so, how?
An index is like a set of pointers to specific rows in a table. These pointers are ordered by the column(s) the index is defined on, which makes lookups much more efficient: the engine finds the pointers to the rows with the relevant data (based on a WHERE or other clause), and jumps right to the row(s).
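To make the "ordered pointers" idea concrete, here is a minimal Python sketch. It is only a model, not how SQL Server actually stores an index: the sample rows, `build_index` and `seek` are all made up for illustration. The table is a plain list (a heap), and the index is a sorted list of (key, row id) pairs that can be binary-searched.

```python
from bisect import bisect_left, bisect_right

# Hypothetical heap of (a, b, c) rows in arbitrary order; list position = row id.
heap = [(3, 1, 2), (1, 2, 3), (2, 1, 1), (1, 1, 2), (3, 2, 1), (1, 3, 1), (2, 3, 3)]

def build_index(rows, col):
    """Return (key, row_id) pairs sorted by key: the 'ordered pointers'."""
    return sorted((row[col], rid) for rid, row in enumerate(rows))

def seek(index, key):
    """Binary-search the ordered pointers and return the matching row ids."""
    lo = bisect_left(index, (key, -1))           # first pointer with this key
    hi = bisect_right(index, (key, len(index)))  # just past the last one
    return [rid for _, rid in index[lo:hi]]

index_a = build_index(heap, 0)  # index on column a
print([heap[rid] for rid in seek(index_a, 1)])  # [(1, 2, 3), (1, 1, 2), (1, 3, 1)]
```

The point is that `seek` never looks at rows that do not match: it finds the start and end of the matching run of pointers and follows only those.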
If you have n indexes on one column each, there will be n sets of pointers to the rows, each ordered by its own column. As I will discuss later, you should choose your index(es) wisely.
If you have one index on multiple columns, it creates one larger set of pointers, ordered by the first column you chose, then by the second column within each value of the first, and so on.
So let's say you have a table with three integer columns (a, b and c). Let's insert some sample data, which is stored on disk in heap format (remember: tables are not guaranteed to be sorted in any way), so a normal SELECT a,b,c FROM table returns the rows in no particular order.
Now, in the first example, we'll create a multi-column index on a, b and c. The pointers are then arranged as if the rows had been sorted with ORDER BY a, b, c.
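As a rough illustration, here is a Python sketch with some fictitious rows (the values are invented for this example, not real data). It prints the rows in heap order, then the pointers of a multi-column index on (a, b, c), modeled as a list sorted by the whole (a, b, c) tuple.

```python
# Hypothetical heap of (a, b, c) rows in arbitrary order; list position = row id.
heap = [(3, 1, 2), (1, 2, 3), (2, 1, 1), (1, 1, 2), (3, 2, 1), (1, 3, 1), (2, 3, 3)]

print("heap order:")
for rid, row in enumerate(heap):
    print("  row", rid, row)

# Multi-column index: pointers sorted by a, then b, then c.
index_abc = sorted((row, rid) for rid, row in enumerate(heap))

print("index order (a, b, c):")
for key, rid in index_abc:
    print(" ", key, "-> row", rid)
```

In the index order, all of the a=1 entries sit next to each other, followed by all of the a=2 entries, and so on, while the heap itself is left untouched.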
So now, imagine running a query SELECT a,b,c FROM table WHERE a=1. This query will be very efficient, because the index keeps the pointers to all of the a=1 rows grouped together.
Now, imagine running a similar query, but this time SELECT a,b,c FROM table WHERE b=1. This query will not be very efficient, since SQL Server's only index option is this index, which does NOT treat b as its leading column, so the pointers to the b=1 rows are scattered throughout the index rather than grouped together. It's the jumping around on the pointers that makes SQL Server work harder to get all the rows that match the WHERE clause - in some cases it may be more efficient for SQL Server to just do a table scan and ignore your index.
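The difference between the two queries shows up if you look at where the matching entries sit inside the (a, b, c) index. This is a small Python sketch over made-up rows; `positions` and `is_contiguous` are helper names invented for this illustration.

```python
# Hypothetical heap of (a, b, c) rows; list position = row id.
heap = [(3, 1, 2), (1, 2, 3), (2, 1, 1), (1, 1, 2), (3, 2, 1), (1, 3, 1), (2, 3, 3)]

# Multi-column index: pointers sorted by a, then b, then c.
index_abc = sorted((row, rid) for rid, row in enumerate(heap))

def positions(index, col, value):
    """Positions within the index whose key has key[col] == value."""
    return [pos for pos, (key, _) in enumerate(index) if key[col] == value]

def is_contiguous(ps):
    """True when the positions form one unbroken run."""
    return all(b - a == 1 for a, b in zip(ps, ps[1:]))

a_hits = positions(index_abc, 0, 1)  # WHERE a = 1
b_hits = positions(index_abc, 1, 1)  # WHERE b = 1
print(a_hits, is_contiguous(a_hits))  # [0, 1, 2] True
print(b_hits, is_contiguous(b_hits))  # [0, 3, 5] False
```

The a=1 entries form one run that can be found with a single seek; the b=1 entries can only be found by walking the whole index.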
Let's create multiple indexes now, one on each column. Each index holds its own set of pointers, ordered by its own column.
Now, running the two queries mentioned above, each will be very efficient. In the first query, SQL Server will choose the first index, and get all the rows where a is grouped together - minimizing read / scan time. In the second query, SQL Server is smart enough to ignore the first index, and use the second index instead.
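Here is a sketch of that setup in Python, again as a toy model over invented rows rather than anything SQL Server does literally. There is one sorted pointer list per column, and `select_where` (a hypothetical helper) simply picks the list that matches the column in the WHERE clause.

```python
from bisect import bisect_left, bisect_right

# Hypothetical heap of (a, b, c) rows; list position = row id.
heap = [(3, 1, 2), (1, 2, 3), (2, 1, 1), (1, 1, 2), (3, 2, 1), (1, 3, 1), (2, 3, 3)]
COLUMNS = ("a", "b", "c")

# One single-column index per column: {column name: sorted (key, row_id) pairs}.
indexes = {
    name: sorted((row[pos], rid) for rid, row in enumerate(heap))
    for pos, name in enumerate(COLUMNS)
}

def select_where(column, value):
    """Answer WHERE column = value using the index built on that column."""
    index = indexes[column]
    lo = bisect_left(index, (value, -1))
    hi = bisect_right(index, (value, len(heap)))
    return [heap[rid] for _, rid in index[lo:hi]]

print(select_where("a", 1))  # [(1, 2, 3), (1, 1, 2), (1, 3, 1)]
print(select_where("b", 1))  # [(3, 1, 2), (2, 1, 1), (1, 1, 2)]
```

Both queries now find their rows with a binary search, at the cost of keeping three pointer lists instead of one.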
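I will mention that with a compound index, let's say on columns a, b and c, the optimizer can use the index for a query on column a, or a and b, or a and b and c. For a query on a and c it can still use the index, but only to narrow down on a; the condition on c then has to be checked against every a match. The index will not be able to seek for queries against column b, column c, or columns b and c, because none of those starts with the leading column.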
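One way to think about this rule: the index can be used to seek only on the longest leading run of its columns that appear in the WHERE clause. The following Python sketch encodes just that rule for equality conditions; `seek_prefix` is a made-up name, and a real optimizer weighs much more than this.

```python
def seek_prefix(index_columns, predicate_columns):
    """Leading index columns that equality predicates can seek on."""
    prefix = []
    for col in index_columns:
        if col not in predicate_columns:
            break  # the first gap ends the usable prefix
        prefix.append(col)
    return prefix

INDEX = ("a", "b", "c")
for cols in (["a"], ["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"], ["c"], ["b", "c"]):
    usable = seek_prefix(INDEX, set(cols))
    print(cols, "->", usable if usable else "no seek")
```

For a query on a and c this prints `['a']`: the index helps with a, and c is left as a filter. For b, c, or b and c it prints "no seek".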
Keep in mind that while an index can speed up SELECTs, it can also slow down INSERTs, UPDATEs and DELETEs, because every index on the table has to be maintained when the data changes. In addition, an index occupies disk space, which can be an issue not only for performance but also for backup / replication purposes. Just something you should keep in mind before adding 19,000 indexes to a database - there is definitely a happy medium between no indexes and too many.
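The write cost can be sketched with a toy model in Python. The `Table` class and its `writes` counter are invented for this illustration and say nothing about real I/O costs; they only show that each INSERT has to touch the heap plus every index.

```python
from bisect import insort

class Table:
    """Toy table: a heap plus one sorted pointer list per indexed column."""

    def __init__(self, indexed_columns):
        self.heap = []
        self.indexes = {col: [] for col in indexed_columns}
        self.writes = 0  # structures touched by INSERTs so far

    def insert(self, row):
        row_id = len(self.heap)
        self.heap.append(row)  # one write for the row itself
        self.writes += 1
        for col, index in self.indexes.items():
            insort(index, (row[col], row_id))  # keep the pointers in order
            self.writes += 1

rows = [(3, 1, 2), (1, 2, 3), (2, 1, 1)]  # hypothetical (a, b, c) rows
no_indexes = Table([])
three_indexes = Table([0, 1, 2])  # one index per column position
for row in rows:
    no_indexes.insert(row)
    three_indexes.insert(row)

print(no_indexes.writes)     # 3
print(three_indexes.writes)  # 12
```

Three rows cost three writes with no indexes and twelve with three indexes, and each of those pointer lists takes up space of its own.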
Indexing in and of itself is a science, and is not easy to master without spending a lot of time analyzing different indexes and their impacts on performance and disk space. And I didn't even start getting into fragmentation, partitioning, clustered indexes, compound indexes...
What you use and what will work best depends on your schema, the nature of your queries, where your performance counts, the load on your system, hardware, levels of transactions, acceptable query times, type of application, etc. I strongly recommend running Index Tuning Wizard (replaced by Database Engine Tuning Advisor in SQL Server 2005 and later), feeding it a SQL trace of the typical activity on your system. The wizard should identify which indexes will work best for your scenario.