Wednesday, February 21, 2007

Avoid Deadlocks using the SQL Adapter and BizTalk Server 2006

As applications scale it is often necessary to add additional servers in order to take advantage of the power of parallel processing. For BizTalk solutions this means that there can be two or more BizTalk hosts running identical solutions and all sharing the same resources. BizTalk does a good job of managing multiple hosts internally (i.e. maintaining unique identifiers) but sharing external resources (i.e. a database) needs to be handled more carefully. This article describes a strategy for maximizing efficiency and minimizing conflict between multiple BizTalk Hosts communicating with a single database.

The primary reason why you would want to use the following approach is to avoid errors like this:
"Transaction (Process ID 56) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. Error Number : -2147467259."

The approach described in this article follows the pattern where two (or more) BTS servers are polling a single database using identical SQL Adapter port configurations but this approach also applies to a single BTS server where deadlocks are an issue under load.


The following goals were used to define how this strategy should behave:

  1. No duplicate records
  2. Maximize use of all BTS instances
  3. No SQL locking conflicts
  4. Acquire data as quickly as possible

The first three requirements can be handled by a carefully crafted stored procedure (see section 1.3) while the fourth is controlled by BizTalk. In the receive port properties set Poll While Data Found to True. This will make BizTalk keep reading the database until there are no more records to process. If you do not set this option then the solution will not scale to handle larger volumes.

The Aquisition Stored Procedure

Due to the fine-grain control that stored procedures provide it is strongly recommended that they be used over a direct SQL query. Especially since a direct query will not allow you to manage SQL locking which can be problematic with multiple BTS hosts competing for the same resource. The basic steps of the acquisition procedure are (complete source code available in section 1.5):

  1. Set the Transaction Isolation Level
  2. Acquire unique identifier of the record
  3. Mark the record as processed
  4. Return the contents of the record

Set Transaction Isolation Level

By default BizTalk wraps all calls made with the SQL Adapter in a transaction with an isolation level of SERIALIZABLE. This level is much more aggressive with resource locking than the default level of READ COMMITTED. As a result it is important to specify the Transaction Isolation Level for the transaction within the procedure. In the sample provided the transaction level of READ COMMITTED is.

In order to maximize concurrency while eliminating conflict it is often necessary to further refine resource locking by the use SQL locking. Locking hints provide a fine degree of control over how SQL handles the visibility/availability of records to other transactions.

It’s worth noting that there is more than one way to approach the way that SQL locking is handled and that a REPEATABLE READ solution with some modifications to the locking hints could also theoretically work. Although, since the sample solution discussed in this document has been implemented and tested in a real-world application so it is the recommended solution.

Acquire Unique Identifier

The first step is to get the unique identifier of the record to process. Generally, this is a select statement with any WHERE or JOIN expressions necessary to return a single row’s identifier where the processed flag is not set (see the next step details). In the sample provided TOP and ROWCOUNT are also used to insure a single row result. It is important to note that the acquisition procedure will only acquire and process a single row. It is left up to the BTS receive port to poll the database and control execution of the acquisition procedure to consume multiple rows.

Two SQL locking hints are used to lock the row (UPDLOCK) and read past (READPAST) any existing locks. The row is locked at this point (as early as possible) to prevent another SQL instance from also acquiring it for the lifetime of the transaction. The read past hint is used to skip past any existing locks (e.g. created by other instances of this procedure) without stopping and waiting for the lock to clear which is the default behavior.

It is recommended that the number of JOINS be minimized so as to reduce the chance of locking a row in a table required by other processes. One possible solution is to populate a ‘Staging’ table and then consume rows from there.

Mark As Processed

Using the identifier acquired in the previous step, a processed flag is set to indicate that this record has been consumed by BizTalk and should not be included in future acquisition attempts. The query used to acquire a unique record identifier specifically avoids rows that have been processed so once this is set it will not be consumed again.

In the sample provided, another field called ProcessedOn is set at the same time to date stamp the time that the record was consumed by BizTalk. This field is not strictly required but can be handy for auditing and testing purposes.

Get Message Contents

The final step of the acquisition procedure is to return the contents of the row to the SQL Adapter. The sample provided uses SELECT * but this is only for demonstration purposes and is not recommended for real-world scenarios.

Since the stored procedure is designed to facilitate the BTS SQL Adapter it is important to include the FOR XML AUTO clause in the SELECT statement. This will return the results of the query in a form (XML) that BizTalk can consume.

Finally, when the transaction ends all locks are released.

SQL Locking Hints

There are a variety of specialized locking hints available in SQL Server 2000. Below is a summary of hints that are useful for sharing a single resource by multiple hosts (descriptions taken from SQL 2000 Books Online go there for more information).

READPAST

Specifies that locked rows are skipped (read past). For example, assume table T1 contains a single integer column with the values of 1, 2, 3, 4, 5. If transaction A changes the value of 3 to 8 but has not yet committed, a SELECT * FROM T1 (READPAST) yields values 1, 2, 4, 5. READPAST applies only to transactions operating at READ COMMITTED isolation and reads past only row-level locks. This lock hint is used primarily to implement a work queue on a SQL Server table.

ROWLOCK

Specifies that a shared row lock is taken when a single shared page or table lock is normally taken.

UPDLOCK

Specifies that update locks instead of shared locks are taken while reading the table, and that they are held until the end-of-statement or end-of-transaction.

READ UNCOMMITTED

Implements dirty read, or isolation level 0 locking, which means that no shared locks are issued and no exclusive locks are honored. When this option is set, it is possible to read uncommitted or dirty data; values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the four isolation levels.


REPEATABLE READ

Locks are placed on all data that is used in a query, preventing other users from updating the data, but new phantom rows can be inserted into the data set by another user and are included in later reads in the current transaction. Because concurrency is lower than the default isolation level, use this option only when necessary.

SERIALIZABLE

Places a range lock on the data set, preventing other users from updating or inserting rows into the data set until the transaction is complete. This is the most restrictive of the four isolation levels. Because concurrency is lower, use this option only when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.


Sample Stored Procedure

/*
This stored procedure efficiently consumes a single row for multiple BizTalk without deadlocks.
Receive Ports using the SQL Adapter.

Author: Adam.Bowron@Visiphor.com
Created: October 25, 2004
*/
CREATE Procedure Shared_BTS_Select
AS

BEGIN TRANSACTION

--Explicitly set the Transaction Isolation Level since the procedure relies on it
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
DECLARE @Record_ID integer
SET NOCOUNT ON
--Acquire a single row and lock it so that another concurrent query will ignore it.
--READPAST is used to skip locked rows without waiting for them to release.
--UPDLOCK is used to lock the row for the lifetime of the transaction
SET ROWCOUNT 1
SELECT
@Record_ID = Source_Table_ID FROM
Source_Table WITH (READPAST, UPDLOCK) WHERE
Processed = 0
ORDER BY
Last_Update_Dt ASC
SET ROWCOUNT 0
--Mark the row as processed so that it is ignored by future queries
UPDATE
Source_Table
SET
Processed = 1,
ProcessedOn = getdate()
WHERE
Source_Table_ID = @Record_ID
--Return the contents of the row in XML form to the SQL Adapter.
SELECT
*
FROM
Source_Table
WHERE
Source_Table_ID = @Record_ID FOR XML AUTO

COMMIT TRANSACTION
GO


References

SQL 2000 Books Online (transaction/locking hints are copied verbatim)

Thursday, February 01, 2007

"Best Before" Phase

There are many software development methodologies and all of them have variations on phases with a project. There are names like "Envisioning", "Planning", "Stabilization", etc. All good names with valid, serious, meanings...

At this point, I'd like to officially introduce another phase that lays on top of all the stodgy official phases. I call it the Best Before phase. This is generally near the end of the project when you start to notice that things in the fridge expire on or after your project is due to be completed. There's a funny emotional reaction to seeing that the expiry date on the carton of milk is a few days after your project is scheduled to complete. The reaction ranges from relief ("Thank god it's almost over") to outright fear ("I'll never get it done in time"). Either way, you know you're in the Best Before phase.

"I don't know phases"
- Doctor Evil