Tuesday 2 December 2008

A problem with out of date table statistics

We had an interesting database by problem We extracted data from an original database into a new database. Then we ran a specific query on both databases.So the query runs over exactly the same data, but on the original database the query took 14 hours to complete and on the new database 20 seconds.

The root cause was out-of-date statistics on the tables/views. I built up a new maintenance plan, including the Rebuild Index task and Update Statistics task and after finishing the maintplan, the same query finished within 16 seconds on the original database.

I also compared the execution plans and found it is the out-of-date statistics which caused SQL Server resulted in wrong cardinality estimation. The problem was that there was a huge difference between the EstimateRows and Rows columns. The estimated rows for the entire view is incorrectly estimated to 1 although there are 84938 rows existing. So SQL Server incorrectly generated the poor nested loop join operator based on this incorrect cardinality estimation. The result is that SQL Server had to run the table scan and was the reason why we saw the “IO_Completion” waittype and unnecessary disk bottleneck.

From this interesting case, we can learn two things:
(1) Frequently update statistics is a best practice to keep SQL Server “thinking” properly.
(2) Even with incorrect/out-of-date statistics, a good database design (correct indexing) could also help avoid the unnecessary overhead caused by the incorrect statistics.

To clear out of date statistics use sp_updatestats. But be carefull executing this on a productive system as it can lead to performance issues as the SQL server has less information to base it’s execution plans on