As applications scale it is often necessary to add additional servers in order to take advantage of the power of parallel processing. For BizTalk solutions this means that there can be two or more BizTalk hosts running identical solutions and all sharing the same resources. BizTalk does a good job of managing multiple hosts internally (i.e. maintaining unique identifiers) but sharing external resources (i.e. a database) needs to be handled more carefully. This article describes a strategy for maximizing efficiency and minimizing conflict between multiple BizTalk Hosts communicating with a single database.
The primary reason why you would want to use the following approach is to avoid errors like this:
"Transaction (Process ID 56) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. Error Number : -2147467259."
The approach described in this article follows the pattern where two (or more) BTS servers are polling a single database using identical SQL Adapter port configurations but this approach also applies to a single BTS server where deadlocks are an issue under load.
The following goals were used to define how this strategy should behave:
- No duplicate records
- Maximize use of all BTS instances
- No SQL locking conflicts
- Acquire data as quickly as possible
The first three requirements can be handled by a carefully crafted stored procedure (see section 1.3) while the fourth is controlled by BizTalk. In the receive port properties set Poll While Data Found to True. This will make BizTalk keep reading the database until there are no more records to process. If you do not set this option then the solution will not scale to handle larger volumes.
The Aquisition Stored Procedure
Due to the fine-grain control that stored procedures provide it is strongly recommended that they be used over a direct SQL query. Especially since a direct query will not allow you to manage SQL locking which can be problematic with multiple BTS hosts competing for the same resource. The basic steps of the acquisition procedure are (complete source code available in section 1.5):
- Set the Transaction Isolation Level
- Acquire unique identifier of the record
- Mark the record as processed
- Return the contents of the record
Set Transaction Isolation Level
By default BizTalk wraps all calls made with the SQL Adapter in a transaction with an isolation level of SERIALIZABLE. This level is much more aggressive with resource locking than the default level of READ COMMITTED. As a result it is important to specify the Transaction Isolation Level for the transaction within the procedure. In the sample provided the transaction level of READ COMMITTED is.
In order to maximize concurrency while eliminating conflict it is often necessary to further refine resource locking by the use SQL locking. Locking hints provide a fine degree of control over how SQL handles the visibility/availability of records to other transactions.
It’s worth noting that there is more than one way to approach the way that SQL locking is handled and that a REPEATABLE READ solution with some modifications to the locking hints could also theoretically work. Although, since the sample solution discussed in this document has been implemented and tested in a real-world application so it is the recommended solution.
Acquire Unique Identifier
The first step is to get the unique identifier of the record to process. Generally, this is a select statement with any WHERE or JOIN expressions necessary to return a single row’s identifier where the processed flag is not set (see the next step details). In the sample provided TOP and ROWCOUNT are also used to insure a single row result. It is important to note that the acquisition procedure will only acquire and process a single row. It is left up to the BTS receive port to poll the database and control execution of the acquisition procedure to consume multiple rows.
Two SQL locking hints are used to lock the row (UPDLOCK) and read past (READPAST) any existing locks. The row is locked at this point (as early as possible) to prevent another SQL instance from also acquiring it for the lifetime of the transaction. The read past hint is used to skip past any existing locks (e.g. created by other instances of this procedure) without stopping and waiting for the lock to clear which is the default behavior.
It is recommended that the number of JOINS be minimized so as to reduce the chance of locking a row in a table required by other processes. One possible solution is to populate a ‘Staging’ table and then consume rows from there.
Mark As Processed
Using the identifier acquired in the previous step, a processed flag is set to indicate that this record has been consumed by BizTalk and should not be included in future acquisition attempts. The query used to acquire a unique record identifier specifically avoids rows that have been processed so once this is set it will not be consumed again.
In the sample provided, another field called ProcessedOn is set at the same time to date stamp the time that the record was consumed by BizTalk. This field is not strictly required but can be handy for auditing and testing purposes.
Get Message Contents
The final step of the acquisition procedure is to return the contents of the row to the SQL Adapter. The sample provided uses SELECT * but this is only for demonstration purposes and is not recommended for real-world scenarios.
Since the stored procedure is designed to facilitate the BTS SQL Adapter it is important to include the FOR XML AUTO clause in the SELECT statement. This will return the results of the query in a form (XML) that BizTalk can consume.
Finally, when the transaction ends all locks are released.
SQL Locking Hints
There are a variety of specialized locking hints available in SQL Server 2000. Below is a summary of hints that are useful for sharing a single resource by multiple hosts (descriptions taken from SQL 2000 Books Online go there for more information).
READPAST
Specifies that locked rows are skipped (read past). For example, assume table T1 contains a single integer column with the values of 1, 2, 3, 4, 5. If transaction A changes the value of 3 to 8 but has not yet committed, a SELECT * FROM T1 (READPAST) yields values 1, 2, 4, 5. READPAST applies only to transactions operating at READ COMMITTED isolation and reads past only row-level locks. This lock hint is used primarily to implement a work queue on a SQL Server table.
ROWLOCK
Specifies that a shared row lock is taken when a single shared page or table lock is normally taken.
UPDLOCK
Specifies that update locks instead of shared locks are taken while reading the table, and that they are held until the end-of-statement or end-of-transaction.
READ UNCOMMITTED
Implements dirty read, or isolation level 0 locking, which means that no shared locks are issued and no exclusive locks are honored. When this option is set, it is possible to read uncommitted or dirty data; values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the four isolation levels.
REPEATABLE READ
Locks are placed on all data that is used in a query, preventing other users from updating the data, but new phantom rows can be inserted into the data set by another user and are included in later reads in the current transaction. Because concurrency is lower than the default isolation level, use this option only when necessary.
SERIALIZABLE
Places a range lock on the data set, preventing other users from updating or inserting rows into the data set until the transaction is complete. This is the most restrictive of the four isolation levels. Because concurrency is lower, use this option only when necessary. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.
Sample Stored Procedure
/*
This stored procedure efficiently consumes a single row for multiple BizTalk without deadlocks.
Receive Ports using the SQL Adapter.
Author: Adam.Bowron@Visiphor.com
Created: October 25, 2004
*/
CREATE Procedure Shared_BTS_Select
AS
BEGIN TRANSACTION
--Explicitly set the Transaction Isolation Level since the procedure relies on it
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
DECLARE @Record_ID integer
SET NOCOUNT ON
--Acquire a single row and lock it so that another concurrent query will ignore it.
--READPAST is used to skip locked rows without waiting for them to release.
--UPDLOCK is used to lock the row for the lifetime of the transaction
SET ROWCOUNT 1
SELECT
@Record_ID = Source_Table_ID FROM
Source_Table WITH (READPAST, UPDLOCK) WHERE
Processed = 0
ORDER BY
Last_Update_Dt ASC
SET ROWCOUNT 0
--Mark the row as processed so that it is ignored by future queries
UPDATE
Source_Table
SET
Processed = 1,
ProcessedOn = getdate()
WHERE
Source_Table_ID = @Record_ID
--Return the contents of the row in XML form to the SQL Adapter.
SELECT
*
FROM
Source_Table
WHERE
Source_Table_ID = @Record_ID FOR XML AUTO
COMMIT TRANSACTION
GO
References
SQL 2000 Books Online (transaction/locking hints are copied verbatim)
9 comments:
Hi Adam,
Great information!
Maybe it is cost to mention that the same effect (deadlocks) would be by processing batches of the messages, when the stored procedure starts simultaneously in many instances/sessions.
BTW I have one BizTalk application when the only way to avoid the deadlocks was do insert a "dummy" SELECT with (TABLOCKX, HOLDLOCK).
Leonid Ganeline
BizTalk Solution Developer
----------------------------------------------------------------------------------
Microsoft Certified Technology Specialist: BizTalk Server 2006
http://geekswithblogs.net/leonidganeline/
This is an excellent description if the SQL Stored procedure has to retrieve data and delete them. Imagine what if the data still remains (we are not deleting the data) and the data has to be processed only once in the BizTalk environment.
Any thots?
Thanks
kays
Kays-
The example in this article is just that... data (records) are queued up to be read. Once the stored procedure executes the logic updates the queued record and sets as "processed" so that the same data will not be read during the next execution.
- Jay
Good article Adam. However, I'd differ abit on using the "Poll while data found" on the receive adapter. This would trigger more instances of the BizTalk artifact and therefore increase inbound latency and the number of persistence points on your servers.
Using this option won't be ideal if the SQL queue needs to be sent out in a batch by BizTalk. For your specific scenario, I like the way you used the ROWPAST feature. Top work there.
Cheers,
Zeeshan
Hi Adam,
Great information which saved lot of time for us. One of the best posts I have come across.
Thanks,
Rajashekar
Using this technique, should UseAmbientTransaction be set to true or false?
Hi --thanks for the write up! In BTS 2009 and WCF-SQL adapter we have a separate procedure for data available required. What would you advise for locking in this procedure? select count(*) from source_table with(readpast) where processed=0?
Thank you
Post a Comment