Thursday, August 19, 2010

BizTalk Key concepts

I am writing this post to help people who need to know the basic concepts of the BizTalk architecture, i.e. how it works and which technologies it relies on to hold its information. All of the information here is taken from Professional BizTalk® Server 2006 (PDF edition). My main reason for writing it is that the BizTalk architecture can seem haphazard, and its tools may sometimes confuse you while you are trying to get the concepts straight.

Thread Pool for BizTalk:

Because large parts of the BizTalk engine are written in .NET, which itself leverages the built-in CLR thread pool, if you run the SOAP adapter and your orchestrations in the same host, they will fight over limited resources, such as threads. When running multiple orchestrations, each using a thread pool thread, it’s easy to starve the SOAP adapter of threads to process outbound messages. This results in the SOAP adapter raising errors and then retrying the message later. For the most part, this is due to the fact that the Orchestration Engine uses the .NET ThreadPool very aggressively.
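To make the starvation idea concrete, here is a minimal sketch in Python. It is only an analogy: concurrent.futures.ThreadPoolExecutor stands in for the CLR thread pool, and the function name adapter_gets_a_thread and the pool size are made up for this post. It shows that a "send" task gets no thread while every thread of a shared pool is held by an orchestration, and that it runs at once when it has its own pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def adapter_gets_a_thread(shared_pool: bool, pool_size: int = 2) -> bool:
    """Return True if the stand-in adapter task runs while orchestrations hold threads."""
    release = threading.Event()
    orchestration_pool = ThreadPoolExecutor(max_workers=pool_size)
    # Hypothetical setup: the adapter either shares the pool or gets its own.
    adapter_pool = orchestration_pool if shared_pool else ThreadPoolExecutor(max_workers=1)
    try:
        # Each stand-in orchestration occupies one pool thread until released.
        for _ in range(pool_size):
            orchestration_pool.submit(release.wait)
        send = adapter_pool.submit(lambda: "sent")
        try:
            send.result(timeout=0.2)
            return True
        except TimeoutError:
            return False  # the send is still queued behind the orchestrations
    finally:
        release.set()
        orchestration_pool.shutdown(wait=True)
        if adapter_pool is not orchestration_pool:
            adapter_pool.shutdown(wait=True)
```

Calling adapter_gets_a_thread(shared_pool=True) reports that the send was starved, while shared_pool=False lets it through, which is the same reasoning behind giving the SOAP adapter and the orchestrations separate hosts.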

Hosts:

By default, BizTalk has two hosts: BizTalkServerApplication and BizTalkServerIsolatedHost. The BizTalkServerApplication host is the default host for any adapters or orchestrations that you create and for outbound SOAP messages generated by BizTalk, whereas the BizTalkServerIsolatedHost is used for messages received via the SOAP adapter.

There are several good reasons for this split. Inbound SOAP messages are handled by Internet Information Server (IIS), not the BizTalk Server engine, unlike other adapters. Also, you can think of messages being received from IIS as being far less trusted than those from other adapters; in many cases they will be received over an untrusted network, such as the Internet. Many adapters, such as the SOAP adapter, need to be hosted by processes other than the BizTalk NT Service. The isolated host model enables this and effectively means that an adapter may be hosted outside of BizTalk. For these scenarios, the adapter actually loads the BizTalk Messaging Engine and, subsequently, the Message Agent into its process address space. This ensures that the same out-of-process communication is required regardless of whether the adapter is hosted by BizTalk or as an isolated adapter.

Pipeline

When a pipeline component is invoked, it may read the message data from the message stream and perform whatever normalization or translation is required. In the case of the Flat File Disassembler pipeline component, it is responsible for reading the flat-file contents from the stream and applying the flat-file schema parsing and transformation to create an XML message.
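The idea of reading a flat file from a stream and emitting XML can be sketched in a few lines of Python. This is not the Flat File Disassembler itself and uses none of its API; the function name disassemble_flat_file, the comma delimiter, and the record and field names are all assumptions chosen for illustration.

```python
import io
import xml.etree.ElementTree as ET


def disassemble_flat_file(stream, record_name, field_names, delimiter=","):
    """Read delimited lines from a text stream and return them as one XML string."""
    root = ET.Element(record_name + "s")
    for line in stream:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        record = ET.SubElement(root, record_name)
        # The field list plays the role of a (very simplified) flat-file schema.
        for name, value in zip(field_names, line.split(delimiter)):
            ET.SubElement(record, name).text = value
    return ET.tostring(root, encoding="unicode")


flat_file = io.StringIO("1,Widget\n2,Gadget\n")
print(disassemble_flat_file(flat_file, "Order", ["Id", "Product"]))
```

The important point is the same as in BizTalk: the component works on a stream, and the schema (here just a list of field names) drives how the bytes become XML.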

Subscription & Context

A subscription is made up of a number of conditions to ensure that a subscriber gets the message it requires. When a message is processed, metadata is derived from the message content, the transport adapter, and the BizTalk port that the message was received over and is written to the context of the message. These context properties are evaluated against the subscriptions in order to determine how to route the message. These subscriptions are held within the Subscription table of the BizTalkMsgBoxDb SQL database and are evaluated when a message is published to the MessageBox by the BizTalk Message Agent, which is run as part of the BizTalk host.

Each BizTalk message has an associated context; this context is used to hold out-of-band metadata and is often used to make routing decisions and to make available various transport information for use later in the processing.
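As a rough model of how context properties are evaluated against subscriptions, consider the following Python sketch. It is a simplification under my own assumptions: real subscriptions live in the MessageBox database and support richer predicates than equality, and the class names, subscriber names, and property values below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    subscriber: str
    predicates: dict  # context property name -> required value (all must match)


@dataclass
class Message:
    body: str
    context: dict = field(default_factory=dict)  # out-of-band metadata


def matching_subscribers(message, subscriptions):
    """Return the subscribers whose every predicate matches the message context."""
    return [
        s.subscriber
        for s in subscriptions
        if all(message.context.get(name) == value for name, value in s.predicates.items())
    ]


subscriptions = [
    Subscription("OrderOrchestration", {"MessageType": "Order"}),
    Subscription("AuditSendPort", {"ReceivePortName": "OrdersIn"}),
    Subscription("InvoiceOrchestration", {"MessageType": "Invoice"}),
]
msg = Message("<Order/>", {"MessageType": "Order", "ReceivePortName": "OrdersIn"})
print(matching_subscribers(msg, subscriptions))
```

Note that the body of the message is never inspected during routing in this sketch; only the context is, which is why a property has to be written to the context before it can take part in routing.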

The key difference is that property promotions are visible at both the messaging infrastructure layer and within Orchestrations. Property promotions are made available within Orchestrations by storing the promoted property in the Message context of a given message. Unlike distinguished promotions, property promotions are limited to 255 characters.

When an orchestration port is bound to a particular send port, the information about that binding is stored in the BizTalk Management database.

With this information, the Message Agent inserts the message once into each MessageBox database that has a subscription by calling the bts_InsertMessage stored procedure. The bts_InsertMessage stored procedure is first called with a subscription ID. On this first pass, the stored procedure calls the int_EvaluateSubscriptions stored procedure which is responsible for looking up the subscription detail information, verifying that the message meets security requirements for the application by checking that message predicates match application properties for the host, and inserting a reference to the message in the application specific queue or application specific suspended queue depending on the state. The message ID, subscription ID, service ID, and other subscription information are inserted into the application specific queue table for each subscription that was found for this application. After the messages are inserted, the message properties and message predicates tables are cleared of the batch related values.

The bts_InsertMessage stored procedure is called subsequently for each part in the message. On the first call, the message context is passed and is then inserted into the SPOOL table along with metadata about the message such as the number of parts, the body part name and ID. In addition the message body part is inserted into the PARTS table using the int_InsertPart stored procedure. The bts_InsertMessage stored procedure is then called for each of the remaining message parts where they are simply passed to the int_InsertPart stored procedure to be persisted in the PARTS table.

No conversion or translation of messages is performed by adapters; they purely bring the bits into BizTalk so that pipelines can then optionally perform any data translation, such as converting a flat file to an XML document.

Consider the scenario whereby an orchestration has checkpointed during execution. If that BizTalk server were to suffer a power outage, another BizTalk server could continue the orchestration from the checkpoint rather than having to start the orchestration from the beginning, as it would have to without this checkpoint. This is a great optimization but has overhead. The cost of serializing the orchestration out of memory into the BizTalk MessageBox will impact the CPU utilization of the BizTalk server and the SQL server, impact network bandwidth, and impact the amount of I/O for the database. Such impact is a necessary evil, and BizTalk has been tuned to reduce the impact. For example, the state is compressed before being written to the MessageBox, which effectively trades CPU cycles for network bandwidth. The XLang compiler does an excellent job of collapsing multiple persistence points into one where possible. It can do this because it has implicit knowledge of the orchestration.
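The trade-off of compressing state before persisting it can be illustrated with a small Python sketch. It uses pickle and zlib purely as stand-ins; BizTalk's actual serialization format and compression scheme are not described in the text above, and the function names checkpoint and resume are hypothetical.

```python
import pickle
import zlib


def checkpoint(state: dict) -> bytes:
    """Serialize the state and compress it (spends CPU to save bandwidth and I/O)."""
    return zlib.compress(pickle.dumps(state))


def resume(blob: bytes) -> dict:
    """Rebuild the state from a checkpoint, possibly on a different server."""
    return pickle.loads(zlib.decompress(blob))


state = {"step": 3, "payload": "<Order>4</Order>" * 100}
blob = checkpoint(state)
print(len(pickle.dumps(state)), "bytes raw,", len(blob), "bytes compressed")
assert resume(blob) == state
```

Whoever picks up the blob can continue from step 3 instead of starting over, which is the whole point of a persistence point.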

Sometimes, developers want to make use of .NET Framework types that are not serializable. A very common example is System.Xml types, such as XmlNode. Use of these types will be blocked by the orchestration compiler or at runtime because it would cause a failure when persisting the orchestration. Developers quickly found that they could make use of these types within atomic scopes because persistence cannot happen within an atomic scope. (It’s all or nothing.) Atomic scopes are there for a reason, but they were not designed for such a scenario. These types can also be used by adopting a stateless model whereby they are not held as member variables; hence, the fact that they do not support serialization is not relevant.

Dehydration

Business processes are typically long running because they execute over extended periods of time and are always asynchronous in terms of messaging. If an orchestration sends a message using a Send shape, the process of physically sending the message is done asynchronously to the orchestration. Once the message has been published to the BizTalk MessageBox, the orchestration carries on executing until it reaches a waiting point, such as a Receive shape.

Waiting points are typically Receive and Delay shapes. In these cases, there is no reason for the orchestration to remain in memory to wait for a message to arrive or a delay period to expire. In fact, such a delay could take hours or days. At these points, the orchestration becomes a candidate for dehydration. The Orchestration Engine may dehydrate your orchestration after such a delay has exceeded the engine threshold and persist it into the BizTalk MessageBox.
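Here is a toy model of that behaviour in Python. It is a sketch under simplifying assumptions: a single fixed threshold in seconds, a dictionary standing in for the MessageBox, and explicit timestamps passed in by the caller. The real engine's throttling logic is more involved, and the class and method names are invented for this post.

```python
import pickle


class OrchestrationEngine:
    """Toy engine: keeps waiting instances in memory until a threshold passes."""

    def __init__(self, dehydration_threshold):
        self.dehydration_threshold = dehydration_threshold  # seconds (assumed unit)
        self.in_memory = {}    # instance id -> (state, time the wait began)
        self.message_box = {}  # instance id -> serialized state

    def begin_wait(self, instance_id, state, now):
        """Record that an instance reached a waiting point (Receive or Delay)."""
        self.in_memory[instance_id] = (state, now)

    def sweep(self, now):
        """Dehydrate every instance that has waited longer than the threshold."""
        for instance_id, (state, since) in list(self.in_memory.items()):
            if now - since > self.dehydration_threshold:
                self.message_box[instance_id] = pickle.dumps(state)
                del self.in_memory[instance_id]

    def rehydrate(self, instance_id):
        """Load a dehydrated instance back, e.g. when its message arrives."""
        return pickle.loads(self.message_box.pop(instance_id))


engine = OrchestrationEngine(dehydration_threshold=30)
engine.begin_wait("A", {"order_id": "4"}, now=0)
engine.begin_wait("B", {"order_id": "7"}, now=20)
engine.sweep(now=40)  # "A" has waited 40s and is dehydrated; "B" only 20s
print(sorted(engine.message_box), sorted(engine.in_memory))
```

Instances that wait only briefly stay in memory and avoid the serialization cost, while long waits free up memory for other work.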

You should ensure that any schemas used by your BizTalk solution are housed in a separate BizTalk project and therefore deployed in a different assembly. Having your orchestrations and schemas combined in the same assembly will prevent versioning with this approach, as you cannot have multiple versions of the same schema deployed at any one time. You would then be required to undeploy all orchestration and schema assemblies, which prevents any form of in-place upgrade.

Direct port binding

Direct port binding enables you to loosely couple orchestrations together or loosely couple orchestrations from the message source. Orchestrations with direct bound ports are not bound to physical ports at deployment time but instead rely on messages arriving from other receive ports or being created from other orchestrations at runtime.

Such an approach is particularly useful when you want to implement a publish-and-subscribe messaging architecture and want to slot in new orchestrations without modifying the rest of your architecture. A common scenario that I like to use when explaining direct ports is an insurance-brokering system. One insurance quote message arrives in the MessageBox and is consumed by an orchestration that generates an internal quotation message.

Correlation

BizTalk correlates a returning message with the right orchestration instance by requiring a developer to identify one or more pieces of data present in the initial message or the message context which, when combined, produce a unique identifier. Often, there is a unique identifier already present in the message which can be used as is; a common example of such a unique identifier is a GUID (globally unique identifier).

This unique identifier must be present somewhere in the return message or message context and allow BizTalk to then identify which dehydrated orchestration the message is ultimately destined for. It could be that the unique identifier is actually a pair of properties that together form a unique identifier. Correlation is implemented as part of the messaging platform and therefore requires that any data to be used as a unique identifier for correlation be promoted into the message context using property promotions. (Remember that distinguished promotions are only available within orchestrations.)
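The matching step can be pictured with a short Python sketch. It assumes the correlation identifier is a pair of promoted properties, customer_id and order_id, both of which are hypothetical names; the class CorrelationTable is likewise invented here and is not a BizTalk API.

```python
class CorrelationTable:
    """Maps a tuple of promoted property values to a waiting orchestration instance."""

    def __init__(self, property_names):
        self.property_names = tuple(property_names)
        self._waiting = {}  # correlation key -> orchestration instance id

    def _key(self, context):
        # Raises KeyError if a required property was never promoted.
        return tuple(context[name] for name in self.property_names)

    def initialize(self, context, instance_id):
        """Called when the orchestration sends the initial message."""
        self._waiting[self._key(context)] = instance_id

    def resolve(self, context):
        """Return the instance a returning message belongs to, or None."""
        try:
            return self._waiting.get(self._key(context))
        except KeyError:
            return None  # the message lacks a correlation property


table = CorrelationTable(["customer_id", "order_id"])
table.initialize({"customer_id": "C1", "order_id": "4"}, "instance-1")
table.initialize({"customer_id": "C1", "order_id": "5"}, "instance-2")
print(table.resolve({"customer_id": "C1", "order_id": "4"}))
```

A message whose context lacks one of the properties cannot be matched at all, which mirrors the requirement that correlation data be promoted rather than merely distinguished.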

If the orchestration also has other (nonactivation) Receive shapes for different messages, there’s no subscription present for these until the Receive shape is executed.

If you consider the orchestration in Figure 5-46, you can see that there are two Receive shapes shown, which seems simple enough. The first Receive has been marked as an activating Receive, as usual.

The Singleton Pattern

In some scenarios, you may wish to process a number of messages using one orchestration instance rather than multiple instances. This limits the parallelism you would otherwise get, so it should only be used if absolutely necessary.

This pattern is used together with property promotion: a property promoted in the message determines which message is returned. For example, if the field order_id is promoted and its value is ‘4’, then when an order is returned we check that it is the order whose order_id is ‘4’.

Promoted properties

A term that is commonly used to describe a message’s promoted properties is message context. Message context includes all the instance-specific and exchange-specific data fields, and essentially is the metadata that the messaging engine of BizTalk Server uses to process messages. As previously noted, instance-specific properties are those that pertain to a specific message instance, and they must be promoted explicitly during development. A common example of this type of property is an XML element containing a unique ID, which may capture an important data field such as an order number. From a message schema, XML elements, attributes, and records may be promoted.

Distinguished fields

• Distinguished fields are available only within a single orchestration instance, and they are not available to other BizTalk Server objects, such as receive locations, send ports, send port groups, and tracking utilities.

• Distinguished fields can be of any length; promoted properties have a maximum length of 255 characters.

• Distinguished fields have less of a performance impact than promoted properties, as they are not persisted to the MessageBox database. Instead, they are essentially XPath aliases, which simply point to the appropriate XML data field. Additionally, adding the DistinguishedField attribute to a field on a .NET class allows it to be exposed as a distinguished field.

Distinguished fields are accessed through a reference to the name of the message, the name of the record structure containing the distinguished field (which could include multiple levels of child records), and the name of the distinguished field, with each named item separated by periods: MessageName.RecordName.ChildRecordName.DistinguishedFieldName. Promoted properties, on the other hand, are accessed through a reference to the name of the message, the name of the property schema, and the name of the promoted property, via the following format: MessageName(PropertySchemaName.PromotedPropertyName).
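The difference between the two kinds of promotion can be modelled in a few lines of Python. This is a conceptual sketch only: a plain dictionary stands in for the message context, xml.etree.ElementTree paths stand in for the XPath aliases, and the function names promote and read_distinguished are made up. The 255-character limit is the one stated above.

```python
import xml.etree.ElementTree as ET

MAX_PROMOTED_LENGTH = 255  # limit for promoted properties, as stated in the text


def promote(context, name, value):
    """Copy a value into the message context so that routing can see it."""
    if len(value) > MAX_PROMOTED_LENGTH:
        raise ValueError(f"{name} is {len(value)} characters; promoted properties allow 255")
    context[name] = value


def read_distinguished(xml_text, path):
    """Read a field straight from the message body; nothing is stored in the context."""
    return ET.fromstring(xml_text).findtext(path)


order = "<Order><Id>4</Id><Notes>" + "x" * 300 + "</Notes></Order>"
context = {}
promote(context, "order_id", read_distinguished(order, "Id"))
print(context)                                        # the Id is now routable
print(len(read_distinguished(order, "Notes")))        # long values are fine when distinguished
```

Trying to promote the Notes value would raise an error in this sketch, whereas reading it as a distinguished field works at any length because it is only an alias into the body.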

Property schemas

Property schemas allow you to promote properties so that they can be used when setting up filter expressions. As long as the PassThruReceive pipeline is not used, these promoted properties are added to the message context during pipeline processing. Once added to the message context, they can be used as filter expressions on send ports. These properties are also available to be evaluated or modified in orchestrations. To create a property schema and promote a property, follow these steps:

Rule Engine

Executing rules against a large XML message is something to avoid. The BizTalk Rules Engine (BRE) requires that XML messages be asserted into the BRE memory using a TypedXmlDocument based on an XmlDocument (which therefore requires the entire message to be loaded into memory).

BAM

You can use the BAM Excel add-in to define BAM activities. These activities will contain the data you wish to collect. It depends on your solution, but quite often you will end up with an activity for each key concept that you wish to collect, so for an order-processing system you may have an activity for an Order and Line Item. These activities can be linked together in a parent-child relationship to maintain the logical relationships between them, which we’ll discuss later.

BAM activity views are also defined using Microsoft Excel in the same way as BAM activities and are analogous in approach to SQL views. An activity view can enable you to provide a customized view of the activity data for different roles within your organization and to define precalculated queries and dimensions that will be automatically created (via an OLAP cube) when deploying the database infrastructure.

After you have defined activities in the Excel spreadsheet, you can use the BAM Management tool (bm.exe) to provision the database infrastructure to support your BAM activities. Each activity is represented as a SQL table, with each activity item represented as a column within the activity.

Data Dimension

The data dimension enables you to view activity data grouped by data held within the activity itself. When you define a data dimension, you specify one or more data items that will be used when grouping the results. The following table shows how many total products have been sold, but it doesn’t show any more detail as to which products have been sold.
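As a rough illustration of what grouping by a data dimension means, here is a small Python sketch. The activity rows, the column names product and quantity, and the function name are all invented for the example; in BAM the grouping is done by the generated OLAP cube, not by code like this.

```python
from collections import defaultdict


def group_by_data_dimension(activities, dimension, measure):
    """Sum a measure over activity rows, grouped by one data item."""
    totals = defaultdict(int)
    for row in activities:
        totals[row[dimension]] += row[measure]
    return dict(totals)


sales = [
    {"product": "Widget", "quantity": 2},
    {"product": "Gadget", "quantity": 1},
    {"product": "Widget", "quantity": 3},
]
print(sum(row["quantity"] for row in sales))                  # total only, no detail
print(group_by_data_dimension(sales, "product", "quantity"))  # broken down by product
```

The first figure is the undifferentiated total; adding the product dimension is what tells you which products were sold.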

Numeric Range Dimension

The numeric range dimension enables you to group data based on friendly names assigned to numeric ranges. For example, a loan application between $0 and $5000 might be defined as a low-risk loan, but a loan of more than $5000 might be defined as a high-risk loan.
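The loan example translates directly into a lookup over range boundaries. The Python sketch below is only an illustration of the concept: the labels and the choice to treat exactly $5000 as low risk follow the wording above, while the function name range_label and the way ranges are stored are my own assumptions.

```python
import math

# (inclusive upper bound, friendly name), in ascending order
LOAN_RANGES = [(5000, "Low risk"), (math.inf, "High risk")]


def range_label(amount, ranges=LOAN_RANGES):
    """Return the friendly name of the numeric range an amount falls into."""
    if amount < 0:
        raise ValueError("amount must not be negative")
    for upper_bound, label in ranges:
        if amount <= upper_bound:
            return label


for amount in (1200, 5000, 7500):
    print(amount, range_label(amount))
```

Grouping the activity data by the returned label instead of the raw amount is what gives the dimension its "friendly name" view.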

Time Dimension

The time dimension enables you to view activity data grouped by milestones held within the activity. When you define a time dimension, you specify a business milestone to base the dimension on and a format in which to display the date and time.

The following table shows the number of products sold, grouped using a time dimension that breaks down the sales into year and months. You can control to what granularity the data is grouped.
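Grouping by year and month can be sketched like this in Python. The sample sales and the function name are hypothetical, and the milestone is assumed to be the time each sale was made; a finer granularity (day, hour) would simply add more fields to the grouping key.

```python
from collections import Counter
from datetime import datetime


def sales_by_month(sales):
    """Sum quantities per (year, month) of the milestone timestamp."""
    counts = Counter()
    for sold_at, quantity in sales:
        counts[(sold_at.year, sold_at.month)] += quantity
    return dict(counts)


sales = [
    (datetime(2010, 7, 30), 2),
    (datetime(2010, 8, 1), 1),
    (datetime(2010, 8, 19), 3),
]
print(sales_by_month(sales))
```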
