Monday 18 February 2013

Heaps Of Trouble

I recently came across a situation at a client which I thought I would share with you. While doing some routine maintenance I noticed a small table consuming a lot of memory. This table, a smallish lookup table, was consuming almost 430 MB of buffer space. The thing about lookup tables and the way they are used is that the entire table is usually read into memory. That is by design, but there is no way this table should need 430 MB.
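For anyone wondering how you would spot something like this in the first place, a query along the following lines adds up the cached pages per table in the current database. There are many variations of it around, and this particular shape is mine rather than anything specific to this case:
-- Sum buffer pool pages cached per object in the current database
SELECT o.name AS table_name,
       COUNT(*) AS cached_pages,
       COUNT(*) * 8 / 1024 AS cached_mb
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au
  ON bd.allocation_unit_id = au.allocation_unit_id
JOIN sys.partitions AS p
  ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)
  OR (au.type = 2 AND au.container_id = p.partition_id)
JOIN sys.objects AS o
  ON p.object_id = o.object_id
WHERE bd.database_id = DB_ID()
  AND o.is_ms_shipped = 0
GROUP BY o.name
ORDER BY cached_pages DESC;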


First I will check the structure of the table. It is a heap with no indexes at all and contains only 3 columns, each of which is a varchar, with lengths of 1, 15 and 150 characters. This means the maximum possible size of a row is 166 bytes of data (plus a small amount of per-row overhead), which is not very big at all.
CREATE TABLE Codes
      (Code VARCHAR(15) NOT NULL,
      Category VARCHAR(1) NOT NULL,
      Condition VARCHAR(150) NOT NULL);

OK so how much data is stored in this table? A simple SELECT COUNT(*) reveals 42,997 rows in this table. Whoa, hang on a second. If we had 42,997 rows all using the maximum possible row size we would have the following:
42997 * 166 = 7137502 bytes
7137502 / 1024 = 6970 kilobytes
6970 / 1024 = 6.8 megabytes

6.8 MB is somewhat different to the 430 MB the table is currently using. What is going on?
Next I want to have a look at sp_spaceused to see what the system reports for this table.
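That is just the standard system procedure run against the table, and the output follows:
EXEC sp_spaceused N'Codes';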

rows – 42997
reserved – 517 MB
data – 430 MB
index_size – 32 KB
unused – 86 MB

Obviously something is terribly wrong, so next I want to have a look at the index physical stats:
SELECT index_id, index_type_desc, avg_fragmentation_in_percent,
       avg_page_space_used_in_percent, page_count, record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Codes'), NULL, NULL, 'DETAILED');

avg_fragmentation_in_percent – 30.9%
avg_page_space_used_in_percent – 4.02%
page_count – 55,143

Well, well, well. The table storage is not normal at all. Fragmentation, at 31%, is not too bad, but this table has 55,143 pages to store 43,647 records (the DMV's record count, which on a heap can differ slightly from COUNT(*)), and on average only 4% of each page holds data. In other words we are using 430 MB worth of data pages to store about 6 MB worth of data. Not only do we waste 424 MB of disk space, we waste 424 MB of memory, and we have to read that 430 MB from disk just to get at our 6 MB of data. Any reduction in IO will usually benefit the entire database server.
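To put a number on that, each data page is 8 KB:
55143 * 8 = 441144 kilobytes
441144 / 1024 = 430.8 megabytes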
The cause of this problem is documented in Books Online:
When rows are deleted from a heap the Database Engine may use row or page locking for the operation. As a result, the pages made empty by the delete operation remain allocated to the heap. When empty pages are not deallocated, the associated space cannot be reused by other objects in the database. To delete rows in a heap and deallocate pages, use one of the following methods.

· Specify the TABLOCK hint in the DELETE statement. Using the TABLOCK hint causes the delete operation to take a shared lock on the table instead of a row or page lock. This allows the pages to be deallocated.
· Use TRUNCATE TABLE if all rows are to be deleted from the table.
· Create a clustered index on the heap before deleting the rows. You can drop the clustered index after the rows are deleted. This method is more time consuming than the previous methods and uses more temporary resources.
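In practice the first two options are one-line changes to the delete, assuming (as is the case here) that the whole table is being cleared:
-- Option 1: table-level lock hint so the emptied pages can be deallocated
DELETE FROM Codes WITH (TABLOCK);
-- Option 2: remove all rows and deallocate pages in one go (requires ALTER permission on the table)
TRUNCATE TABLE Codes;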

Now I know this table gets loaded from another system on a daily basis. I had a look at the procedure that loads the table and saw the following:
DELETE FROM Codes;

INSERT INTO Codes
SELECT ......

So pages have been allocated to this table on a daily basis, and the delete statement ensured they would never be deallocated. In my particular case I am going to create a clustered index on the table to remedy the situation and to prevent a recurrence.
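The fix itself is a one-liner. The index name and the choice of Code as the key column below are mine rather than anything dictated by the source system; any reasonably selective column would do:
-- Build a clustered index so the table is no longer a heap and emptied pages can be deallocated and reused
CREATE CLUSTERED INDEX CIX_Codes_Code ON Codes (Code);

Let's see what that has done for our storage: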
rows – 42,997
reserved – 3,624 KB
data – 3,408 KB
index_size – 56 KB
unused – 160 KB

avg_fragmentation_in_percent – 0.2%
avg_page_space_used_in_percent – 99%
page_count – 426

Bang, I've just scored 430 MB of RAM for free.

3 comments:

  1. Sean,

    In your row size math above you use X * 166 to get bytes. (And I understand this may be a bonehead question...) How did you derive that number?

    Thanks,

    -Jeff

  2. That is the maximum possible row size for the data:
    varchar(150)
    varchar(15)
    varchar(1)
    So 150 + 15 + 1 = 166 single-byte characters if every column was full, i.e. a worst case scenario.

    Replies
    1. Thanks! Here I was thinking it was some clever way of establishing row size. I should have looked at the obvious. :)
