in

SQLServerCentral.com

The largest free SQL Server community.

Ken Kaufman

Transferring Data In Memory using SQLBulkCopy Class

In a SQL Server enterprise there are several ways to move data, choosing the best method, is often more of a challenge then actually implementing.  The key factors involved are performance, Dev time, and ease of implementation.  A nice utility in the .Net framework is the SqlBulkCopy class.  This leverages the sturdy dll's of the command line tool BCP, with one major advantage, it's done in memory.  Using BCP to import data requires a disk location.  If you want to use BCP to move data from one server to the next, you must drop to disk from the source, then import from disk into the destination.  Until solid state drives become a reality, this method comes with a heavy overhead tax.  This is where the bulk copy class comes into play.  It eliminates this bottleneck, by working within memory.  This satisfies the performance requirement.  Leveraging Sql Server's support for assemblies, gives the ability to execute from within a SQL Stored Sproc, and satisfies the ease of implementation.  

 

The core under workings of using this class is attaching to the source using ADO with a command reader to the data, then transferring this data to the SqlBulkCopy class which pushes to the destination.  At its simplest implementation you could have a working assembly in less the ten lines of code, which satisfies Dev Time requirements.  In the technical world every alternative solution has its own cost.  Using this method transfers your data buffering to a more expensive storage in the form of memory.  This is not as big an issue as one would expect.  Using the BCP command line tool requires transferring the entire data set to disk, then picking it up again.  Using a data reader allows for a smaller foot print, were data is fetched and inserted in a pipelined operation. Because it's pipelined you can easily transfer a GB of data with less then 20 MB of memory. 

 

There are several samples of this on the net, and in Visual Studio documentation of how to do this.  I've included a recent code snippet I used.  As with most code it has it's own flair, essentially focusing on data verification, by counting rows at the source and destination and allowing for the ability to either sync (truncate the destination) or append data. 

 

 

Managed Code

public static void pushData(string sourceTable, string destServer,

            string destDB, string destTable, int batchSize, int tearDown)

        {

         

                //Declare Variables

                string _destServer;

                string _destDB;

                string _query;

                string _destTable;

                int _batchSize;

                string destConnString;

                string _cntDestTable;

                string _cntSourceTable;

                int _cntDestResult;

                int _cntPriorDestResult;

                int _cntDestReturn;

                int _cntSourceResult;

                string _cntError;

               

 

                //Set Variables

                _destDB = destDB;

                _destServer = destServer;

                _destTable = destTable;

                _batchSize = batchSize;

                //Source Query

                _query = "select * from " + sourceTable;

                //Verify Queries

                _cntDestTable = "select count(*) from " + destTable;

                _cntSourceTable = "select count(*) from " + sourceTable;

                _cntError = "RAISERROR ('Source and Dest Counts Dont Match', 16, 1)";

            

                //Build Destination ConnString

                destConnString = "Data Source=" + _destServer + ";Initial Catalog=" +

                _destDB + ";Integrated Security=True";

 

                //connect Locally get data

                SqlConnection oConn = new SqlConnection("context connection=true");

                oConn.Open();

 

                //Gets Count of Source used to verify at end of assembly

                SqlCommand cmdCntSource = new SqlCommand(_cntSourceTable, oConn);

                SqlDataReader sourceCntReaderPrior = cmdCntSource.ExecuteReader();

                sourceCntReaderPrior.Read();

                _cntSourceResult = (int)sourceCntReaderPrior[0];

                sourceCntReaderPrior.Close();

 

                //Get Rows into reader to push   

                SqlCommand cmd = new SqlCommand(_query, oConn);

                //pull into reader

                SqlDataReader reader = cmd.ExecuteReader();

 

 

                //Open Connection to Destination

                SqlConnection destConn = new SqlConnection(destConnString);

                destConn.Open();

 

                //Truncate Destination if needed else Get Existing row count

                if (tearDown == 1)

                {

                    string tearQuery = "truncate table " + _destTable;

                    SqlCommand cmdTear = new SqlCommand(tearQuery,destConn);

                    cmdTear.ExecuteNonQuery();

                    //Set seed count to 0

                    _cntPriorDestResult = 0;

                 }

                else

                {

                   //Get Seed Count in Dest table

                    SqlCommand cmdCntDest =new SqlCommand(_cntDestTable, destConn);

                    SqlDataReader destCntReaderPrior = cmdCntDest.ExecuteReader();

                    destCntReaderPrior.Read();

                    _cntPriorDestResult = (int)destCntReaderPrior[0];

                    destCntReaderPrior.Close();

                  }

                           

                //Move Data to desting

                SqlBulkCopy sbc = new SqlBulkCopy(destConn);

                sbc.DestinationTableName = _destTable;

                sbc.BatchSize = _batchSize;

                sbc.WriteToServer(reader);

                reader.Close();

 

                //Get count of Destination after insert

                SqlCommand dstCmd = new SqlCommand(_cntDestTable, destConn);

                SqlDataReader destCntReader = dstCmd.ExecuteReader();

                destCntReader.Read();

                _cntDestResult = (int)destCntReader[0];

                destCntReader.Close();

               

                //Calculate Rows Copied

                _cntDestReturn = _cntDestResult - _cntPriorDestResult;

 

                //Compare Source/Dest for Accuracy If they don't match send error

                if (_cntSourceResult != _cntDestReturn)

                {

                   

                    SqlCommand cmdCntError = new SqlCommand(_cntError, oConn);

                    SqlContext.Pipe.ExecuteAndSend(cmdCntError);

                   

                }

                else

                {

                    //Send results and Cleanup

                    SqlContext.Pipe.Send(_cntSourceResult.ToString() + " Rows sent :)");

                    oConn.Close();

                    destConn.Close();

                }

              

 

            

        }

 

 

The stored sproc to call this would look like this:

 

CREATE PROCEDURE [dbo].[sp_BCP]

      @sourceTable [nvarchar](50),

      @destServer [nvarchar](50),

      @destDB [nvarchar](50),

      @destTable [nvarchar](50),

      @batchSize [int],

      @tearDown [int]

WITH EXECUTE AS CALLER

AS

EXTERNAL NAME [bcpTest].[BCPData].[pushData]

Published May 08 2008, 05:21 PM by Ken Kaufman
Filed under: ,

Comments

No Comments
Copyright Red Gate Software
Powered by Community Server (Commercial Edition), by Telligent Systems