Moving unstructured pieces of "work" in a workflow.
In WF terms, what is the preferred method of attaching, persisting, and working with unstructured objects such as PDFs, image files, and objects that do not fit very well in a table or row? I see basically how to work with messages and flow logic but would like someone to point me in the right direction for a discussion on working with unstructured data and large binary files that require processing, approval, etc. In the past, it has simply been a pointer (\\server\share\queue1\myimage.jpg) to a file. I would guess that SharePoint would be overkill for this?
DeBug
Hey Doug,
The main piece of advice I'd give is to avoid keeping the data as instance properties on the workflow. If you have a large PDF or image, you don't want it to be serialized with the instance state every time the workflow is persisted. Be aware that if you're raising events using local services, the event data is also serialized into the workflow queue. If you have really large files, or if perf/memory is a big issue, the safest bet is to pass around URLs to the files and only load and utilize them inside the activity that's doing the processing.
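To make the size difference concrete, here is a rough sketch in Python rather than actual WF code. The persist function is only a stand-in for whatever the persistence service does, and the names (persist, process_document, demo, document_url) are made up for illustration. It compares persisting state that embeds a 1 MB file with state that only carries a path to it.

```python
import base64
import json
import tempfile
from pathlib import Path


def persist(state: dict) -> bytes:
    """Stand-in for a persistence service: serialize the whole instance state."""
    def encode(value):
        # Binary payloads must be encoded to travel with the state.
        if isinstance(value, bytes):
            return base64.b64encode(value).decode("ascii")
        raise TypeError(f"cannot serialize {type(value).__name__}")
    return json.dumps(state, default=encode).encode("utf-8")


def process_document(state: dict) -> int:
    """Load the file only inside the step that actually needs it."""
    data = Path(state["document_url"]).read_bytes()  # local, never persisted
    return len(data)


def demo() -> tuple[int, int, int]:
    """Return (embedded size, by-reference size, bytes loaded on demand)."""
    payload = b"\x00" * 1_000_000  # pretend this is a 1 MB scanned image
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "myimage.jpg"
        path.write_bytes(payload)
        embedded = {"step": "approval", "document": payload}
        by_reference = {"step": "approval", "document_url": str(path)}
        return (
            len(persist(embedded)),
            len(persist(by_reference)),
            process_document(by_reference),
        )
```

Calling demo() shows the embedded state costing more than a megabyte on every persist, while the by-reference state stays at a few dozen bytes and the file is only read when the processing step runs.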
As for storage mediums, it would really depend on your scenario, but using SharePoint or the filesystem are both fine avenues for storing files. (Although I've heard SharePoint requires some tweaking to work properly with large files.)
Anyways, hope this helps. Feel free to reply if you have questions about a specific scenario you had in mind.
Arjun
Arjun hit the most important point: the workflow should only contain a reference to the file, not the actual data. There are a couple of other important considerations: what controls the document lifecycle and, less important, where it is actually stored.
The original question does not provide enough context for a specific recommendation, but I can outline a thought process that should lead to a solution.
Are the documents just read-only references that support the process, or does the process create or modify documents? If the latter, you will probably want at least some elements of a document management system: check-in/check-out, audit history, and perhaps storage of version/change history.
What other document infrastructure exists? Is there an incumbent document management system? What are the legal/business requirements for these documents? Are there specific access restrictions, retention policies, or audit requirements?
Most document management systems provide APIs, typically web services today. If there is a current system, creating custom activities to interact with it is a good approach.
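As a rough illustration of that custom-activity idea, here is a sketch in Python rather than real WF code. DocumentStore, InMemoryStore, and stamp_approved_activity are all hypothetical names; a real implementation would call the web service of whatever document system is already in place. The point is that the workflow step receives only a document id and leaves locking, versioning, and audit to the store.

```python
from dataclasses import dataclass, field
from typing import Protocol


class DocumentStore(Protocol):
    """Hypothetical facade over the existing document management system."""

    def check_out(self, doc_id: str, user: str) -> bytes: ...
    def check_in(self, doc_id: str, user: str, content: bytes) -> int: ...


@dataclass
class InMemoryStore:
    """Test double standing in for a real document management system."""

    docs: dict[str, list[bytes]] = field(default_factory=dict)
    locks: dict[str, str] = field(default_factory=dict)
    audit: list[tuple[str, str, str]] = field(default_factory=list)

    def check_out(self, doc_id: str, user: str) -> bytes:
        if doc_id in self.locks:
            raise RuntimeError(f"{doc_id} is checked out by {self.locks[doc_id]}")
        self.locks[doc_id] = user
        self.audit.append((doc_id, user, "check_out"))
        return self.docs[doc_id][-1]  # latest version

    def check_in(self, doc_id: str, user: str, content: bytes) -> int:
        if self.locks.get(doc_id) != user:
            raise RuntimeError(f"{doc_id} is not checked out by {user}")
        self.docs[doc_id].append(content)
        del self.locks[doc_id]
        self.audit.append((doc_id, user, "check_in"))
        return len(self.docs[doc_id])  # new version number


def stamp_approved_activity(store: DocumentStore, doc_id: str, user: str) -> int:
    """Workflow step: takes only the document id, never holds the bytes in state."""
    content = store.check_out(doc_id, user)
    return store.check_in(doc_id, user, content + b"\nAPPROVED")
```

Because the step depends only on the DocumentStore interface, the same workflow logic could sit in front of SharePoint, a file share, or an incumbent system without changing what gets persisted with the workflow instance.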
As Arjun suggested, in many ways the physical storage of the file is the least important consideration, and will probably be determined by other factors.
Perhaps the best way to think about it is to determine how the documents would be managed if you weren't using WF. From the document's perspective, the role of workflow is simply to deliver it to the right participants at the right time, without adding any unnecessary complexity. If you can achieve this, your users should be happy.