Business Intelligence

Stats and Query Performance Primer

 

If you haven’t picked up on a theme yet in my posts, then shame on you for not reading everything on this site….. To be frank, I love working with performance issues on databases (mostly from a design and code perspective, my hardware skills are a little out of date, but I try to keep up).

This code snippet will be just one more for your tool belt in evaluating performance and looking for optimizations. The script below will give you the name of the statistic and the table column it is associated with for a given database. Stats names that start with PK are usually going to be your primary keys whereas names that are similar to WA_Sys are going to be system generated statistics for your objects.

select    object_name(s.object_id) as TableName

        , s.name as StatisticName

        , COL_NAME(s.object_id, sc.column_id)as ColumnName

from        sys.stats s

inner join    sys.stats_columns sc

on        s.stats_id = sc.stats_id

and        s.object_id = sc.object_id

where        OBJECTPROPERTY(s.object_id, ‘ISMSShipped’) = 0

order by    object_name(s.object_id)

With this output, you can generate details about specific stat objects (the StatisticName output column from above). The output will give you three result sets:

  1. First result set is general information, such as update date, number of rows in the table, rows sampled for the statistics, etc. The step value in here should be equal to the number of rows in your third set, which is the detailed histogram data for this specific index.
  2. If DENSITY_VECTOR is specified, then the second result set will describes the columns returned in the result set
  3. Third result set is the detailed data for every step in the statistic. From books online:
    1. RANGE_HI_KEY: Upper bound value of a histogram step
    2. RANGE_ROWS: Estimated number of rows from the table that fall within a histogram step, excluding the upper bound
    3. EQ_ROWS: Estimated number of rows from the table that are equal in value to the upper bound of the histogram step
    4. DISTINCT_RANGE_ROWS: Estimated number of distinct values within a histogram step, excluding the upper bound
    5. AVG_RANGE_ROWS: Frequency or average number of duplicate values within a histogram step, excluding the upper bound (RANGE_ROWS / DISTINCT_RANGE_ROWS for DISTINCT_RANGE_ROWS > 0)

For specifics on strings, Karen Delany wrote this article for the http://www.sqlcommunity.com website.

Leave a Reply

x

We use cookies to ensure the best possible experience on our website. Detailed information on the use of cookies on this site is provided in our Privacy and Cookie Policy. Further instruction on how to disable our cookies can be found there.