Tuesday, April 27, 2010

Misleading error message in Windows Workflow Foundation 4.0

Suppose you are running a Windows Workflow Foundation 4.0 workflow service in Visual Studio 2010, testing it with the WCF Test Client, and you get a message similar to the following:

Failed to invoke the service. Possible causes: The service is offline or inaccessible; the client-side configuration does not match the proxy; the existing proxy is invalid. Refer to the stack trace for more detail. You can try to recover by starting a new proxy, restoring to default configuration, or refreshing the service.

The operation could not be performed because WorkflowInstance 'cb123a26-34bd-4ab8-876f-63dee2080b42' has completed.

Server stack trace:

   at System.ServiceModel.Channels.ServiceChannel.HandleReply(ProxyOperationRuntime operation, ProxyRpc& rpc)

   at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

   at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

   at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:

   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

   at IService.GetData(String inParameter1)

   at ServiceClient.GetData(String inParameter1)

The good news is that if you have not messed with bindings, etc., and your service started throwing this exception after a simple change to the workflow, the message is most likely just misleading. As far as I can tell, you will get this message any time an exception is thrown during the execution of the workflow. The cause could be something as simple as a null reference or something much more complex.

The question is how we figure out what the real exception is. A more important question is how we catch and log these exceptions. Without logging, we won’t know whether our users are having issues, let alone have a clue what the cause is.

One way to address this issue is to use the TryCatch Activity in the Windows Workflow Foundation (WF) Designer in Visual Studio 2010. It works just like a try-catch-finally in C#. You can create a custom Code Activity called something like LogError. Click here for details on creating this custom Activity. You can then use it in the catch portion of the TryCatch Activity.
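As a rough sketch of what a LogError activity could look like: this is only one possible shape, not the code from the linked article. The argument name Error, the Source property, and the "MyWorkflowService" event source name are placeholders I made up for illustration.

```csharp
using System;
using System.Activities;
using System.Diagnostics;

// Hypothetical logging activity. Only the name LogError comes from the post;
// the arguments and the event source name are assumptions.
public sealed class LogError : CodeActivity
{
    public LogError()
    {
        // Placeholder event source name; replace with your own.
        Source = "MyWorkflowService";
    }

    // The exception handed over by the Catch block.
    [RequiredArgument]
    public InArgument<Exception> Error { get; set; }

    // Plain CLR property: the event source to write under.
    public string Source { get; set; }

    protected override void Execute(CodeActivityContext context)
    {
        Exception error = Error.Get(context);
        if (error == null)
        {
            return; // nothing to log
        }

        // ToString() includes the message, the stack trace and inner exceptions.
        EventLog.WriteEntry(Source, error.ToString(), EventLogEntryType.Error);
    }
}
```

In the designer you would drop this into the Catch block and bind Error to the exception variable of that block. Keep in mind that writing under an event source that does not exist yet generally needs administrative rights, so the source may have to be registered once up front.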

You can put the TryCatch Activity at the highest level in your workflow to serve as a catch-all, or you can use it at particular points in your workflow. Just like when coding, it is often appropriate to do both.
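If you prefer to see the structure in code rather than in the designer, here is a minimal console sketch of a top-level TryCatch built by hand. The workflow body is invented for the demo (a WriteLine followed by a Throw that simulates a failure), and the handler uses the built-in WriteLine where a custom logging activity would normally go.

```csharp
using System;
using System.Activities;
using System.Activities.Statements;

class Program
{
    static void Main()
    {
        // Carries the caught exception into the Catch handler.
        var ex = new DelegateInArgument<Exception>("ex");

        Activity workflow = new TryCatch
        {
            // Stand-in for the real workflow body (assumption for this demo).
            Try = new Sequence
            {
                Activities =
                {
                    new WriteLine { Text = "Doing work..." },
                    new Throw
                    {
                        Exception = new InArgument<Exception>(
                            ctx => new InvalidOperationException("Simulated failure"))
                    }
                }
            },
            Catches =
            {
                // Catch-all: handles any exception type thrown in the Try body.
                new Catch<Exception>
                {
                    Action = new ActivityAction<Exception>
                    {
                        Argument = ex,
                        // A custom logging activity could replace this WriteLine.
                        Handler = new WriteLine
                        {
                            Text = new InArgument<string>(
                                ctx => "Caught: " + ex.Get(ctx).Message)
                        }
                    }
                }
            }
        };

        WorkflowInvoker.Invoke(workflow);
    }
}
```

The same shape works for a narrower TryCatch around a single risky step: add more Catch<T> entries for specific exception types and keep the outer Catch<Exception> as the safety net.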

Now when you run or test your workflow, you will see your error in the Windows Event Log (via the Event Viewer). Since the workflow didn’t return the expected response, you still get the generic, useless error in the test client, but at least you now know the cause.

If, after all this, no exception is being thrown, then you are likely trying to send a message to your workflow that is not valid. By not valid, I mean that the message you are sending may not be the one the workflow is currently waiting for. Consider the case where you have 2 ReceiveRequest Activities in a Sequence. If you try to send a message to the second one before the first one, that is not valid. Why? Because they are in a sequence. The first Activity must complete before the second one can be executed. That is the very nature of a workflow. If you need them to be callable regardless of order, then you should probably use the Parallel Activity.

Lastly, if you are executing the Activities in order and still getting the error, make sure they all reference the same CorrelationHandle and that you have specified a key for it to use as the correlation object. This is essentially a primary key for an instance of the workflow. In WF3 this was the workflow id. In WF4, you can use a key in your data or you can use a GUID like WF3 did, but in any case you need to tell all your Receive Activities what to use to make the correlation. If you don’t have a correlation handle and key defined, then WF4 will have no way of telling which instance of the workflow you are trying to access.

1 comment:

Anonymous said...

I have a situation where I have 2 ReceiveRequest Activities in a Sequence (one is before a logic branch and the second is contained in one of the logic branches). If I try to send a Message to the second one before the first one, this causes an exception as expected. However, when I then go back and send a Message to the first one successfully, the workflow skips the second ReceiveRequest activity and completes. Do you know why this happens?