The general rule of thumb on "variable tables" VS "temp
tables" is if your working with a small datasets or complex query plans use a
temp table. When working with simple
queries or small datasets use variable tables.
This was built by design.
Variable tables are light in memory structures, locking is minimal,
statistics aren't created, and plans don't recompile with changes to the
table. Temp tables are real table
structures, and behave almost identical to normal tables, with a shorter life
span. Therefore there is an overhead, in
creating and tearing down, with temp tables. The question is why use temp tables at all to
hold temporary storage.
The trick is in recompilation. Recompilation occurs mid stream during a
batch, when circumstances change, from the initial plan, where SQL determines
to recompile a more optimal plan. This is
often observed when a subplan is dependent on a previous subplan. For example if a query inserts data into a
work table, then selects off the work table.
If the precompiled plan expects 100 rows in the table, but gets 100K,
the plan might need to be changed. This
is where the difference lays between the two data structures. Since SQL keeps statistics on a temp table,
it can recognize major changes and make the appropriate changes to the plan through
a recompile. However with that said a
recompile is expensive and gains from the new plan might not justify the cost
of the recompile.
Recompiles on temp tables occur when the cardinality becomes
greater then 6, and then again greater then 500. I always look at 1000 being a threshold on
whether to use a temp table over a table variable. This is a personal rule of thumb, and you
should always look at the complexity of the query itself which in the end has
the final say on whether a query plan will be recompiled. The following example uses a union join to
force this situation.
There are three portions to this script, the first creates
the tables and populates them. The
second and third create stored procedures based on a temp table and a table
variable. The @rowcount variable in part
one will be used to throttle the query into different states.
--Part 1 Create table
Declare @rowcount int
set @rowcount = 5
if exists(select name from sys.objects where name = 'table1')
Begin
drop table table1
end
if exists(select name from sys.objects where name = 'table2')
Begin
drop table table2
end
if exists(select name from sys.objects where name = 'table3')
Begin
drop table table3
end
--Create tables
create table table1
(PKInt int primary key,
charfield varchar(20))
create table table2
(PKInt int primary key,
charfield varchar(20))
create table table3
(PKInt int primary key,
charfield varchar(20))
Declare @Interval int
set @interval = 1
while @interval < @rowcount
Begin
Insert into table1
values(@interval, 'Table1')
set @interval = @interval + 2
end
set @interval =2
while @interval < @rowcount
Begin
Insert into table2
values(@interval, 'Table2')
set @interval = @interval + 2
end
set @interval = 1
while @interval < @rowcount
Begin
Insert into table3
values(@interval, 'Table3')
set @interval = @interval + 3
end
--Part2 create SP with table variable
Create Procedure tableVariable
as
Declare @tableVar table
(PKInt int primary key,
charfield varchar(20))
insert into @tableVar
select * from table1
union
select * from table2
select t.Pkint, t.charfield, tv.charfield
from table3 t inner join
@tableVar
tv on
t.pkint = tv.pkint
go
--Part2 create SP with table variable
Create Procedure temptable
as
create table #temptable
(PKInt int primary key,
charfield varchar(20))
insert into #temptable
select * from table1
union
select * from table2
select t.Pkint, t.charfield, tt.charfield
from table3 t inner join
#temptable
tt on
t.pkint = tt.pkint
Once these are in place run
the following commands. Display the
execution plan for the newly created stored procedures.
exec
tablevariable
go
exec temptable
You'll notice that both SP's are
using a nested loop to join the temporary storage with table3, with a scan of
table3 and a seek on the temp structure. Now go ahead and execute them and turn
on the display actual execution plan.
This will allow you to see if the plan is recompiled with a new
plan. In addition execute "set statistics
io on" so we can see how many IO's this is performing. The freeproccache is for good measure so
everything is clean when compiled
dbcc
freeproccache
go
set statistics io on
go
exec
tablevariable
go
exec temptable
Both held there plan, and if
you look at the "select join" portion of the statement for reads both have an
identical cost of 6. This is inline with
the statement above that if under 6 rows change then no recompile will take
place. Since we only had 5 no recompile
occurred. So lets prove out that theory by running the
first set of statements to recreate the tables and set the @rowcount variable
to 6. Once complete repeat the above
exercise of executing the SP's.
Declare @rowcount int
set @rowcount =
6
You'll notice the estimated
plan is again identical nested loop with the temp storage doing a scan and
table3 using a seek. When you look at
the actual plan the tablevariable SP kept the precompiled plan while the
temptable SP recompiled and chose a different plan. Still using a nested loop but this time using
a seek against the temptable and a scan against table3. The IO's also show the recompile improved
IO's on the select join with the variable having 12 IO's while the temptable
used only 6.
As stated before 500 is also
a benchmark where a recompile will take place.
Lets take a look at what the difference is here. Recreate the tables by setting @rowcount variable
to 500. By this time you should be
convinced that the precompiled plan for both SP's will stay the same, so lets
only focus on the output. This time the
temptable SP is no longer using a nested loop, but rather a merge join. The comparitable difference is 1002 Reads for
the tablevariable and 6 for the temptable.
I played this out a little further to get some more numbers. Notice the expotential growth difference
below. As you can see the larger the
changes to temp storage the more disparity occurs, which equals resources for
the server, and user time waiting.
|
Insert
Rows
|
Temp
Reads
|
Variable
Reads
|
|
5
|
6
|
6
|
|
6
|
6
|
12
|
|
500
|
6
|
1002
|
|
1000
|
7
|
2003
|
|
10000
|
43
|
20029
|
Before you go out and change all your variable tables into
temp tables, there are a few things to consider. As mentioned above temp tables are real
tables and have the associated overhead with them. There's the metadata creation and
maintenance. There's heavier locking,
and transactions are logged. Of course
the biggest overhead comes in the form of recompilations. The reason execution plans are cached is
because they're costly to recreate. The
gains made in IO's can easily be swallowed up by the cost of the recompilation.