in

SQLServerCentral.com

The largest free SQL Server community.

Ken Kaufman

November 2007 - Posts

  • Parameter Sniffing

    Parameter sniffing is the default strategy used by SQL Server for caching query plans.  The concept is that when a paramatized query is compiled for the first time, it uses the input parameters to build an optimal plan.   This plan is then cached.  Reuse of the query will always be based on the cached plan whether it is optimal for incoming parameters or not.  This explanation is a little convoluted and the best way to explain is through example.

    In the following example I’ve created a subset table based off of the “person.address” table in the AdventureWorks database. The primary field to be concerned with here is the “City” field where there has been a non-clustered index created.   The subsetted data has a breakdown of the following city information,

    cityCount   city
    ----------- ------------------------------
    1           Crossville
    1           Denby
    1           Escondido
    213         Beaverton
    214         Concord
    215         Burien
    398         Paris
    434         London


    To start out I’ve created the following SP to query the data by providing a parameter on the city

    CREATE procedure [dbo].[snifftest]
    @city varchar(100)
    as
    select * from testaddress where city = @city

    Once created we’ll look at the following:
    1. What the plan looks like for London (29% of rows)
    2. What the plan looks like for Denby (Less then 1% of rows)
    3. How many reads for London
    4. How many reads for Denby


    set showplan_text off
    go
    dbcc freeproccache
    go
    set showplan_text on
    go
    exec snifftest 'London'
    go
    exec snifftest 'Denby'

    |--Clustered Index Scan

    The above returned a cluster index scan for both queries.  This is because London was the first parameter in, and it is more efficient to use a scan with such a high density. Now view the IO’s by running the following queries.

    set showplan_text off
    go
    set statistics IO on
    go
    exec snifftest 'London'
    go
    exec snifftest 'Denby'

    Each query uses the same method which is a scan and therefore has the same IO’s which is 24 in this case. 


    set showplan_text off
    go
    set statistics IO on
    go
    exec snifftest 'London'
    go
    exec snifftest 'Denby'

    Scan count 1, logical reads 24

    Now reverse the order, have Denby go first.  The results show an index seek.


    set showplan_text off
    go
    dbcc freeproccache
    go
    set showplan_text on
    go
    exec snifftest 'Denby'
    go
    exec snifftest 'London'

    |--Index Seek

    The next query is where the rubber meets the road, when we execute this query to look at the IO’s for each value.

    set showplan_text off
    go
    set statistics IO on
    go
    exec snifftest 'London'
    go
    exec snifftest 'Denby'

    Scan count 1, logical reads 871

    Scan count 1, logical reads 4

    For Denby it’s 4 reads, but for London it’s 871.  This is a huge difference.  With London making up such a large percentage of the data it only follows that the amount of calls to retrieve this data for London should be inline.  What this means is 29% of your calls will do 857 more reads then needed.  If London or another value with a high density is executed first parameter sniffing is optimal here.  Even though calls to a lower densitity value, loose a small amount of performance, it more then makes up for the overall system performance. 

    So now the question is why does SQL by default use parameter sniffing.  This is because of query compilation.  There’s a cost involved in compiling the correct plan for a query.  Rather then compiling on every query, SQL cheats by caching the plans.  When the call for a SP comes in it simply reuses the plan already in memory, forgoes the cost of compilation.  As with any cheat there’s always a risk in that on the first compilation an edge case value could come in, and force the wrong plan.    In the case above this would be a value requiring an index seek rather then a scan.  In a future blog I’ll talk about troubleshooting and avoiding these scenerios.

  • Renaming Tables in a Transaction


    A common method of ETL is to bulk load tables from a staging area into your production environment,  Of course the issue here is that if your production environment is 24X7, it means taking the table offline, which can cause significant issues for your users.  One commonly used work around is to load this data into a holding table and rename.  The rename is instantaneous, and usually has no to little effect on your users.  To do so you would often issue a command similar to this.

     

    execute sp_rename 'table1', 'table1_old'

    execute sp_rename 'table1_new', 'table1'

     

    The issue arrises with the above statement is what happens if the first command succeeds and the second command fails.  You’ve now orphanned your users with no table to hit.  The work around is to put this into a transaction.  Pre 2005 transactions meant using the @@error return.  For example after each statement you would need something like this:

     

     

     

    Begin Tran rename

    execute sp_rename 'table1', 'table1_old'

    if @@error > 0

    begin

    goto Errorhandler

    end

     

    execute sp_rename 'table1_new', 'table1'

    if @@error > 0

    begin

    goto Errorhandler

    end

    commit tran rename

    errorhandler:

    Rollback tran rename

     

    I’m not a big fan of complexity or typing, so the easier way around is to use the new try/catch feature of sql server.  Put all your transactions in the try, and rollback in the catch

     

    Begin Tran rename

    begin try

    execute sp_rename 'table1', 'table1_old'

    execute sp_rename 'table1_new', 'table1'

    Commit Tran rename

    end try

    Begin Catch

    rollback tran rename

    end catch

     

     

  • Looking Inside the Log for Transactional Replication

    As the name implies the log reader is responsible for reading the log on the publishing database.  Using the fn_dblog command we can peek inside the log to see what’s happening here. 

    When a database is set as the publisher in a replication schema, it marks dml statements against it’s published articles as for replication.  What this does is it holds the transactions within its transaction log until the log agent comes by and picks up these transactions.  Where the effect of this can be seen is in checkpointing, which flushes the dirty pages to disk.  If a transaction has not been grabbed by the log reader a checkpoint will not flush to disk.  This can cause significant headaches for an admin in that the log will continue to grow, even if the db is in simple mode or a backup log is issued. 

     

    To see how this works you can leverage fn_dblog function.  Create a db in “simple mode”, then create a sample table and publish it for transactional replication.  Once this is setup view the transaction log using the following command:

     

    select * from fn_dblog(null,null)

     

    You should noticed several records returned.  Since your db is in simple mode your seeing all the transactions that have occurred since the last “checkpoint” has occurred.  You can clear your log (Flush to disk) by issuing the Checkpoint command.  Execute

     

    Checkpoint

    Go

    select * from fn_dblog(null,null)

     

    You’ll now notice only two records in the log, which represent the checkpoint that has occurred. 

    LOP_BEGIN_CKPT

    LOP_END_CKPT

     

    Now that we see how that works insert a row into your published table then issue the fn_dblog again.

     

    Insert into table1(charfield)

    values('hello')

    go

    select * from fn_dblog(null,null)

    Depending on how your table was setup you’ll see a few more rows.  If you go to the bottom of the resultset you’ll see the following results under the operation command:

     

    LOP_BEGIN_XACT

    LOP_INSERT_ROWS

    LOP_COMMIT_XACT

    This is the command you just issued.  If you scroll horizontally here you’ll see some keys to this operation: 

    • Under transaction id all three rows have the same id. 
    • Under AllocationUnitName you’ll see the object that was just effected by the transaction
    • Under description there is the command just executed and below that the record is replicated
    • Under Replicated records you see the amount of records changed by the last statement.

     

     

    If you’re a geek like me you can hunt all day through the resultsets, but it’s clear the change just made is set for replication.  Go ahead and clear the transaction log, and view the results using the previous command:

     

    Checkpoint

    Go

    select * from fn_dblog(null,null)

     

    B

     

    Before moving on to the next step,  take a look at  a couple of columns (Oldest Replicated Begin LSN/ Next Replicated End LSN).  These lsn numbers are set to 0 and indicate that no active records are slated for replication.  Now go ahead and stop the log reader agent on your distributor and issue some of the same commands as above. 

     

    Insert into table1(charfield)

    values('hello')

    go

    Checkpoint

    go

    select * from fn_dblog(null,null)

     

     

    Now the results get really interesting.  Using the checkpoint doesn’t clear the log.  Even though the insert is committed, the log will not clear, this is because the records marked for replication have not been retrieved by the log reader.  If you look at the checkpoint rows at the bottom you’ll also see Oldest Replicated Begin LSN/ Next Replicated End LSN point to the first and last transaction respectively for the replicated data.  This is one of the biggest problems experienced in transactional replication.  If the log reader stops without notice the log will grow out of control and bring the system to its knees.

     

    We can manually issue the same commands to clear the log.  Run sp_replcmds against the publisher DB.  You’ll retrieve the same data the agent does.  The command is picked up in a hash format.  However if you run a “checkpoint” then the “fn_dblog” nothing has changed the data has not been cleared.  After the agent runs the sp_replcmds, it then lets the log know these have been picked up by issuing sp_repldone.   This marks the transactions as being picked up.  If you issue the checkpoint/fn_dblog your back to two records representing the checkpoint.

     

    Now try to start the log reader agent, noticed it won’t.  This is due to a sql server safety measure, that only one connection can be made using these ms_repl commands.  To get the log reader back issue, kill your current session.  Some other cool commands that can be found in BOL are:

    • sp_replcounters – Give statistics such as number of commands to be replicated, Frequency, starting and ending lsn’s.
    • sp_repltrans- shows transactions waiting to be replicated
    • sp_replshowcmds – Shows in text form the commands that have been executed

     

Copyright Red Gate Software
Powered by Community Server (Commercial Edition), by Telligent Systems