SQL Server Central is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
Search:  
 
 

Ken Kaufman

Add to Technorati Favorites Add to Google
Browse by Tag : Performance,MetaData,DDL (RSS)

Performance Effects on Column Type Changes

By Ken Kaufman in Ken Kaufman | 05-14-2008 6:47 PM | Categories: Filed under: , ,
Rating: (not yet rated) Rate this |  Discuss | 419 Reads | 281 Reads in Last 30 Days |no comments

 

The other day I was given a request to change a field from varchar(6) to varchar(10).  Not thinking it was a big deal I said just give me the checked in code and I'll make the change.  One of our sharper dev's (Bill Carlson) was concerned with the lock caused by this statement.  I'd never actually took a deep dive on how this would occur so I decided to run a quick test to see what would happen.  Making this change on a 2.5 million row table took 10 ms.  What this means is SQL is smart enough to understand that since it's growing a varchar field from 6 to 10 it didn't need to go to the data to verify this statement would succeed.  It simply had to make the meta data changes in the system tables.  However when I reverted this back from 10 to 6 it took 32 seconds.  In this case SQL had to verify each value of each record to ensure it was less then 7 characters. 

 

This brought on the realization of how SQL Server verifies data during inserts and updates.  If you look at the structure of a record with variable lengths there is no meta data on the length of variable fields it simply knows to create offsets within the row. Checking to determine whether a field is too long occurs at a higher level as data comes in.  Once it passes this check data streams to the data page without any additional verification.

 

After wetting my interest on varchars the next logical choice was to look at int's.  Starting out with an unfragmented table I changed a field from int to bigint, then back again from bigint to int.  Moving from an int to a bigint was almost 7x longer then the reverse.  This is was due to page splitting and allocation which reared it's head in perfmon.  The simple explanation here is that unlike varchar items, every value takes a defined space 4 bytes/8 bytes for int/bigint respectively.  Going from int to bigint added 4 bytes to each record, and forced pages to be allocated and split.  In my case the initial int was 4 bytes of a 13 byte row almost 30%,  growing it to 8 bytes made it almost 50%.  In terms of page size it went from one page holding 620 rows to 474.  I guess the moral of this is whenever you're looking at running ddl statements you need to consider what the real effects on the table are, and have someone bright looking over your shoulder.