BizTalk servers are essentially stateless. The SQL Server databases store all of the configuration, tracking, and processing information that the BizTalk servers require for normal operation.
Of these, the MessageBox is the heart of any BizTalk solution and without it nothing would work. All inbound and outbound messages, along with orchestration execution, dehydration, rehydration, and message tracking, rely on the BizTalk MessageBox.
Once the message has passed from the adapter and through the pipeline, the BizTalk Message Agent, which runs as part of a BizTalk host instance, is responsible for evaluating who has subscribed to this message and committing it into the BizTalk MessageBox.
The BizTalk MessageBox is implemented as a SQL Server database and is shared between all BizTalk servers in a BizTalk group. Multiple MessageBoxes can be utilized in very specific high-load or latency-critical scenarios.
The MessageBox is a huge database that is essentially BizTalk’s storage engine during processing. The most important point to make is that changes of any kind to the MessageBox are not supported.
In this post, we are primarily focused on understanding how the physical messages themselves are stored in the MessageBox database.
All physical messages within BizTalk are stored across four tables in the MessageBox database: Spool, MessageParts, Parts, and Fragments. The reason for this is that each message can consist of one or more parts, and each part consists of one or more fragments.
The fragmentation of messages within BizTalk can be controlled to some degree. The BizTalk group’s large message fragment size setting is used to determine when the Messaging Engine will overflow a message into the MessageBox before it has completed processing. This is done to help avoid out-of-memory conditions within the BizTalk process when processing in the context of a receive port.
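To make the message, part, and fragment relationship concrete, here is a minimal Python sketch. It is only a toy model and not the actual MessageBox schema: the `Message` and `Part` classes, the `split_into_fragments` and `build_message` helpers, and the `fragment_size` parameter (standing in for the group's large message fragment size setting) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Part:
    name: str
    fragments: List[bytes] = field(default_factory=list)


@dataclass
class Message:
    message_id: str
    parts: List[Part] = field(default_factory=list)


def split_into_fragments(data: bytes, fragment_size: int) -> List[bytes]:
    """Split one part's bytes into chunks of at most fragment_size bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    chunks = [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]
    # Every part has at least one fragment, even when it is empty.
    return chunks or [b""]


def build_message(message_id: str, bodies: Dict[str, bytes], fragment_size: int) -> Message:
    """Build a message with one part per entry in bodies (hypothetical helper)."""
    parts = [Part(name, split_into_fragments(body, fragment_size))
             for name, body in bodies.items()]
    return Message(message_id, parts)
```

For example, `build_message("m1", {"body": b"abcdefghij"}, 4)` produces a single part whose fragments are `b"abcd"`, `b"efgh"`, and `b"ij"`. The real Messaging Engine decides when to fragment based on more than a simple byte count, so treat this purely as a picture of the one-to-many relationships.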
For every host created, a corresponding set of queue tables is created in the BizTalk MessageBox database. When you configure BizTalk, the default host BizTalkServerApplication is created. If you examine the MessageBox database, you will see the following tables for this host:
- BizTalkServerApplicationQ (main queue) – the host polls this table to collect messages pending processing.
- BizTalkServerApplicationQ_Scheduled (scheduled queue) – this is created but never actually used by BizTalk.
- InstanceStateMessageReferences_BizTalkServerApplication (state queue) – holds the list of messages that have been processed but are required later.
- BizTalkServerApplicationQ_Suspended (suspended queue) – suspended messages are stored here.
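The naming pattern in the list above can be sketched as a small Python helper. This assumes the pattern shown for BizTalkServerApplication carries over unchanged to any other host name; `host_queue_tables` is a hypothetical function written for this post, not something BizTalk provides.

```python
from typing import Dict


def host_queue_tables(host_name: str) -> Dict[str, str]:
    """Return the queue table names expected for a host.

    Assumption: other hosts follow the same naming pattern as the
    default BizTalkServerApplication host.
    """
    return {
        "main": f"{host_name}Q",
        "scheduled": f"{host_name}Q_Scheduled",
        "state": f"InstanceStateMessageReferences_{host_name}",
        "suspended": f"{host_name}Q_Suspended",
    }
```

Calling `host_queue_tables("BizTalkServerApplication")` returns the four table names listed above, which can be handy when looking for a particular host's tables in the MessageBox database. Remember that these tables should only ever be inspected, never modified.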
These host queues are used only to store references to the actual messages. Each instance of the host runs within its own BTSNTSVC.EXE process and will poll its main queue table as configured, for as long as there is work on the queues to read and there are resources available to process the work.
If no work is present on the queue, the polling interval will slow down until it eventually reaches the maximum polling interval.
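The slowing-down behaviour can be pictured with a short Python sketch. The exact back-off algorithm is not described here, so the doubling step and the 500 ms and 8000 ms limits below are assumptions made purely for illustration, and `next_polling_interval` is a hypothetical function.

```python
def next_polling_interval(current_ms: int, found_work: bool,
                          min_ms: int = 500, max_ms: int = 8000) -> int:
    """Return the next polling interval in milliseconds.

    Assumption: the interval doubles on each empty poll and resets to the
    minimum as soon as work is found. The real engine may back off differently.
    """
    if found_work:
        return min_ms
    return min(current_ms * 2, max_ms)


def simulate_empty_polls(count: int, min_ms: int = 500, max_ms: int = 8000) -> list:
    """Return the sequence of intervals seen over `count` consecutive empty polls."""
    intervals = [min_ms]
    for _ in range(count):
        intervals.append(next_polling_interval(intervals[-1], False, min_ms, max_ms))
    return intervals
```

With these assumed values, an idle host starts at 500 ms and stops backing off once it reaches the 8000 ms ceiling, then drops back to 500 ms the next time a poll finds work.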
When a host instance processes a message, it actually retrieves the physical message data from the Spool, MessageParts, Parts, and Fragments tables.
There is a clear separation of the tables used to store the physical messages and those that hold their references as shown below.
The reason for the distinction is that each message can have one or more subscribing services that may run in different hosts.
Storing only a reference to any message on the host queues enables each message to be processed by multiple hosts but be stored only once. Each instance of a particular host will poll its main queue table (for example, the HostAq table in the figure), which contains the message reference. The message content itself will be accessed through the Spool, MessageParts, Parts, and Fragments tables.
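The idea of storing a message once while queuing references for each subscribing host can be sketched in Python as follows. This is a simplified in-memory model, not the real table layout: `MessageBoxModel`, `publish`, and `dequeue` are hypothetical names, and a single dictionary stands in for the Spool, MessageParts, Parts, and Fragments tables.

```python
from collections import deque
from typing import Deque, Dict, List


class MessageBoxModel:
    """Toy model: one shared message store plus one reference queue per host."""

    def __init__(self) -> None:
        # Message id -> body. Stands in for the four physical message tables.
        self.spool: Dict[str, bytes] = {}
        # Host name -> queue of message ids (references only, never bodies).
        self.host_queues: Dict[str, Deque[str]] = {}

    def publish(self, message_id: str, body: bytes, subscriber_hosts: List[str]) -> None:
        """Store the body once and enqueue a reference for every subscribing host."""
        self.spool[message_id] = body
        for host in subscriber_hosts:
            self.host_queues.setdefault(host, deque()).append(message_id)

    def dequeue(self, host: str) -> bytes:
        """Take the next reference from a host queue and resolve it to the body."""
        message_id = self.host_queues[host].popleft()
        return self.spool[message_id]
```

Publishing one message to both `"HostA"` and `"HostB"` leaves a single entry in `spool`, while each host queue holds its own reference to it. That is the property the separation of tables is designed to give.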
When a message is routed to only one subscriber, a reference is inserted into the MessageZeroSum table as soon as the message has finished processing. This table is used to mark the physical messages for cleanup by a job running within SQL Server Agent, which means that these types of messages will be cleaned up very quickly from the MessageBox database.
For messages going to multiple subscribers, once the message finishes processing, the BizTalk service will not remove the message from the database or insert a reference in the MessageZeroSum table, because it does not know whether there are any more subscribers to the message (for example, if it is referenced in another queue). The reference counts (or number of subscribers) for these messages are maintained by a combination of the following tables: MessageRefCountLog1, MessageRefCountLog2, and MessageRefCountLogTotals. The ActiveRefCountLog table is also used as an auxiliary table.
A job running within SQL Server Agent is used to aggregate the reference counts in these tables and determine which messages can be deleted. The Message IDs that can be deleted are then inserted into the MessageZeroSum table, from which another SQL Server Agent job will clean them up.
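The two-stage cleanup for multi-subscriber messages can be sketched in Python. This is a toy model under stated assumptions: the `RefCountModel` class and its methods are hypothetical, a single log list stands in for the MessageRefCountLog tables, and the real SQL Server Agent jobs work against very different structures.

```python
from typing import Dict, List, Set, Tuple


class RefCountModel:
    """Toy model of reference counting followed by deferred cleanup."""

    def __init__(self) -> None:
        self.spool: Dict[str, bytes] = {}          # stored message bodies
        self.ref_log: List[Tuple[str, int]] = []   # (message id, count change)
        self.totals: Dict[str, int] = {}           # aggregated reference counts
        self.zero_sum: Set[str] = set()            # message ids marked for deletion

    def publish(self, message_id: str, body: bytes, subscribers: int) -> None:
        """Store the message and log one reference per subscriber."""
        self.spool[message_id] = body
        self.ref_log.append((message_id, subscribers))

    def finish(self, message_id: str) -> None:
        """A subscriber finished: log a release, but do not delete anything."""
        self.ref_log.append((message_id, -1))

    def aggregate_job(self) -> None:
        """First job: fold the log into totals and mark zero-count messages."""
        for message_id, delta in self.ref_log:
            self.totals[message_id] = self.totals.get(message_id, 0) + delta
        self.ref_log.clear()
        for message_id, total in list(self.totals.items()):
            if total == 0:
                self.zero_sum.add(message_id)
                del self.totals[message_id]

    def cleanup_job(self) -> int:
        """Second job: delete marked messages and return how many were removed."""
        removed = sum(1 for mid in self.zero_sum if self.spool.pop(mid, None) is not None)
        self.zero_sum.clear()
        return removed
```

In this model a message published to two subscribers stays in `spool` after the first `finish` call, is marked in `zero_sum` only once both subscribers have finished and `aggregate_job` has run, and is physically removed by the next `cleanup_job` run.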