Runtime Engine Durability

I understand that the runtime engine does a lot of internal communication through queues. For example, the Instance Manager within the runtime engine will schedule a workflow by placing it on a queue. The Scheduler within the workflow runtime then takes the workflow off the queue and fires the Execute method on the workflow, which in turn adds the first activity to the queue, and so on. I also understand that workflow state can be persisted so that it survives reboots, and that the SQL timer service can be used to make delay events durable as well.

My question is, how durable is the runtime itself? For example, how much of the data on the internal queues is retained if the host process goes down or the box reboots? Let's say I have a sequential workflow that starts up and executes 5 Code activities, then has a Listen activity, then has 5 more Code activities before completing. I understand that with the persistence service, once the Listen activity is reached the workflow will be persisted to the underlying store; however, what happens if the machine goes down during the first 5 activities, or the last 5? Or, using the same workflow example, the event is fired and the runtime pulls the instance back from the persistence store, but then the host dies. After the host comes back up it has no idea that the event took place, and the persisted workflow instance is still "listening" for the event. Is this accurate, or am I missing some mechanism for storing the incoming events?

What if we wrapped a transaction around some of those activities? Would that make a difference?

[1652 byte] By [mikewo] at [2007-12-18]
# 1

To answer your first question: it depends on the activities. If any of the first 5 activities is a TransactionScopeActivity, the state of the workflow will be persisted after each activity of that type completes. Or, if you have a Delay activity and the instance goes idle, it will be unloaded and persisted as well.

So, if the host goes down during the execution of the workflow, when it comes back the workflow will resume execution from the last persistence point.
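To make "resume from the last persistence point" concrete, here is a small Python simulation. It is only a sketch of the idea, not the WF runtime: the `run` function, the JSON checkpoint file, and the `crash_at` parameter are all hypothetical names invented for illustration.

```python
import json
import os
import tempfile


def run(steps, persist_after, store_path, crash_at=None, log=None):
    """Run steps in order, checkpointing after the indexes in persist_after.

    Hypothetical simulation: store_path stands in for the persistence store,
    and crash_at is the index of the step at which the host "dies".
    """
    log = [] if log is None else log
    start = 0
    # On startup, resume from the last persistence point if one exists.
    if os.path.exists(store_path):
        with open(store_path) as f:
            start = json.load(f)["next_step"]
    for i in range(start, len(steps)):
        if crash_at == i:
            raise RuntimeError("host died")
        log.append(steps[i])  # "execute" the activity
        if i in persist_after:
            # Persistence point: record which step should run next.
            with open(store_path, "w") as f:
                json.dump({"next_step": i + 1}, f)
    return log


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "workflow.json")
    steps = ["a1", "a2", "a3", "a4", "a5"]
    log = []
    try:
        # Persist only after a2 (index 1); the host dies before a4 (index 3).
        run(steps, {1}, path, crash_at=3, log=log)
    except RuntimeError:
        pass
    run(steps, {1}, path, log=log)  # the host comes back and resumes
    print(log)
```

In this simulation a3 ran before the crash but after the last checkpoint, so it runs again on resume. The practical consequence is that work done after the last persistence point should be safe to repeat.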

As for the external events: we don't provide any mechanism for making them durable, so if the scenario you describe does happen, your workflow will never resume. You will have to implement a "message box"-like service to make sure that your external events reach their destination in a durable fashion.
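A rough sketch of what such a "message box" could look like, written in Python with SQLite purely for illustration. The `MessageBox` class, its table layout, and the method names are assumptions of mine and not part of WF. The idea is to commit each event to a durable store before attempting delivery, and to mark it delivered only after the handler succeeds.

```python
import sqlite3


class MessageBox:
    """Hypothetical durable store for external events (not a WF API)."""

    def __init__(self, path=":memory:"):
        # Use a file path instead of ":memory:" to survive process restarts.
        self.db = sqlite3.connect(path)
        with self.db:
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS events ("
                " id INTEGER PRIMARY KEY AUTOINCREMENT,"
                " instance_id TEXT NOT NULL,"
                " name TEXT NOT NULL,"
                " delivered INTEGER NOT NULL DEFAULT 0)"
            )

    def post(self, instance_id, name):
        # Commit the event before any delivery attempt is made.
        with self.db:
            self.db.execute(
                "INSERT INTO events (instance_id, name) VALUES (?, ?)",
                (instance_id, name),
            )

    def pending(self):
        row = self.db.execute(
            "SELECT COUNT(*) FROM events WHERE delivered = 0"
        ).fetchone()
        return row[0]

    def deliver_pending(self, handler):
        """Deliver undelivered events in order; return how many succeeded."""
        rows = self.db.execute(
            "SELECT id, instance_id, name FROM events"
            " WHERE delivered = 0 ORDER BY id"
        ).fetchall()
        count = 0
        for event_id, instance_id, name in rows:
            try:
                handler(instance_id, name)
            except Exception:
                break  # host unavailable: leave the event pending for a retry
            with self.db:
                self.db.execute(
                    "UPDATE events SET delivered = 1 WHERE id = ?", (event_id,)
                )
            count += 1
        return count
```

On host startup you would call `deliver_pending` with a handler that raises the event into the workflow instance. Because an event is marked delivered only after the handler returns, a crash between the two steps causes a redelivery (at-least-once), so the receiving side should tolerate duplicates.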

In your last scenario, if the workflow has been pulled back from the store and then the host dies, after the host comes back it will reload all the workflows that were running in memory at the time of its untimely death. It is much worse if the external event arrives while the host is down and you don't have any mechanism (a message box) to store your external events and assure delivery.

Let me know if it answers your question.

Thanks, Iza

Iza at 2007-9-8