Tuesday, 17 November 2015

Update Column like Batch in SQL Server

Developers sometimes need to modify data in one or more columns for all or most rows in a table. This is usually not an issue as long as the table is fairly small.
If the table is large, however, your update statement will lock the entire table and make it unavailable, even for data reads. Further more, a highly volatile table can bring down the entire application or website for the duration of the update. At times, a large, single transaction like this will greatly expand the size of the transaction log and -- in extreme scenarios -- contribute to running out of disk space on the database server.
It is therefore a good practice to do mass updates in batches, combined with frequent transaction log backups. In my experience, a batch of 10,000 or 50,000 works best. It is difficult to specify a threshold of when you should start considering batching, as it all depends on factors such as how fast you disk I/O is, how heavily the table is used, and more.
There is one guideline you can use though. A typical command timeout in ADO.NET is about 30 seconds. While the update takes place, other processes have to wait until it is finished. So if you expect that your update will take longer than 20-25 seconds, you are better off doing a batch update, otherwise you will end up with application timeouts.
Here is a sample code that shows how to update a column in a table, using 10,000 as batch size:
WHILE ( 0 = 0 )
    BEGIN
        UPDATE TOP
 ( 10000 )
                Person
        SET     Status = 2
        WHERE   Status = 1
        IF @@ROWCOUNT = 0
            BREAK
    END

No comments:

Post a Comment