The other day I was given a request to change a field from
varchar(6) to varchar(10). Not thinking
it was a big deal I said just give me the checked in code and I'll make the
change. One of our sharper dev's (Bill Carlson) was concerned with the lock caused by
this statement. I'd never actually took
a deep dive on how this would occur so I decided to run a quick test to see
what would happen. Making this change on
a 2.5 million row table took 10 ms. What
this means is SQL is smart enough to understand that since it's growing a
varchar field from 6 to 10 it didn't need to go to the data to verify this
statement would succeed. It simply had
to make the meta data changes in the system tables. However when I reverted this back from 10 to
6 it took 32 seconds. In this case SQL
had to verify each value of each record to ensure it was less then 7 characters.
This brought on the realization of how SQL Server verifies
data during inserts and updates. If you
look at the structure of a record with variable lengths there is no meta data
on the length of variable fields it simply knows to create offsets within the row.
Checking to determine whether a field is too long occurs at a higher level as
data comes in. Once it passes this check
data streams to the data page without any additional verification.
After wetting my interest on varchars the next logical
choice was to look at int's. Starting
out with an unfragmented table I changed a field from int to bigint, then back
again from bigint to int. Moving from an
int to a bigint was almost 7x longer then the reverse. This is was due to page splitting and
allocation which reared it's head in perfmon.
The simple explanation here is that unlike varchar items, every value
takes a defined space 4 bytes/8 bytes for int/bigint respectively. Going from int to bigint added 4 bytes to
each record, and forced pages to be allocated and split. In my case the initial int was 4 bytes of a 13
byte row almost 30%, growing it to 8
bytes made it almost 50%. In terms of
page size it went from one page holding 620 rows to 474. I guess the moral of this is whenever you're
looking at running ddl statements you need to consider what the real effects on
the table are, and have someone bright looking over your shoulder.