Applications

SQL Server: Data Warehouse Indexing Strategy

Indexing strategies on data warehouses can be extremely light, depending on its usage. Most data warehouses are used solely to populate the SSAS database and, therefore, are not queried directly. In this case, the index strategy would be geared towards speeding up the ETL process. In some cases, the data warehouse may be queried directly. If that is the case, then you would adjust your index strategy accordingly.

 

Basic Index Strategy

 

  • Dimension tables will have a clustered single-column Primary Key index on the Primary Keys.
    • For a VERY LARGE (> 100k rows) dimension table, you may want to add a few indexes on the non-key columns most often used in queries.
    • For LARGE dimensions (~100k rows), may want to consider indexing the source system key (remember that this key would not be unique if you have any Type 2 attributes in the dimension). This is not for query time, but for the surrogate key pipeline during ETL.

     

  • Fact tables will have a clustered single-column Primary Key index on the Primary Keys, and will also have a single-column non-clustered index on the Foreign Keys to the dimension tables.

Leave a Reply