Troubleshoot a Mule Application
Introduction
The easiest part of software development is writing code. Anyone can write a piece of code (even a kid). The bigger challenge is writing code that performs smoothly under hostile conditions, and to get there the code has to be exercised under exactly those conditions.
In this article I want to present a troubleshooting case for a Mule application. So, let's get started.
Problem Scenario
I have a simple Mule application deployed in CloudHub 1.0.
The application is a REST API that invokes Salesforce to upsert some data. Recently, we have seen that the application becomes unresponsive after running for some hours, with no threads available to process new requests.
Troubleshooting
When the application is running but not accepting new requests, the next best step after checking the logs is thread dump analysis. This article focuses on analyzing thread dumps and taking action based on what they show.
Taking a thread dump
You can take a thread dump only of a running application. Open the logs of the Mule application and hover over the Worker to reveal a Diagnostics option.
Click the Diagnostics option and it will download the thread dumps of your application.
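If you want to practice on a JVM you control before touching CloudHub, a rough local stand-in is easy to write in plain Java. The sketch below is only an illustration: the class name LocalThreadDump is hypothetical, it is not what the Diagnostics option does internally, and its output only loosely imitates the layout of a real dump.

```java
import java.util.Map;

public class LocalThreadDump {
    /** Builds a dump-like text for every live thread in the current JVM. */
    public static String capture() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            // Header line with the thread name, then its state, then one line per frame.
            out.append('"').append(thread.getName()).append('"').append('\n');
            out.append("   java.lang.Thread.State: ").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("\tat ").append(frame).append('\n');
            }
            out.append('\n'); // blank line between threads, as in a real dump
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(capture());
    }
}
```

The text it produces is enough to try out the kind of analysis described in the rest of this article.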
Analyze the issue
There are many tools for thread dump analysis; I prefer fastthread.io. You can upload the downloaded dump, and it provides a lot of helpful information.
Well, you can see that it has already detected some issues.
Let's dig deeper.
You can see that 141 threads out of 220 are in a WAITING state. Something must have gone wrong.
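If you would rather check the state counts yourself, they can be pulled from the raw dump with a few lines of Java. This is a minimal sketch, assuming a HotSpot-style dump in which each thread has a "java.lang.Thread.State: ..." line; the class name ThreadStateCounter is hypothetical and the file path is whatever you pass on the command line.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadStateCounter {
    // Matches lines such as "java.lang.Thread.State: WAITING (parking)".
    private static final Pattern STATE_LINE =
            Pattern.compile("java\\.lang\\.Thread\\.State:\\s+(\\w+)");

    /** Counts how many threads in the dump text are in each state. */
    public static Map<String, Integer> countStates(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher matcher = STATE_LINE.matcher(dump);
        while (matcher.find()) {
            counts.merge(matcher.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a downloaded thread dump file.
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        countStates(dump).forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

A large share of threads in WAITING is not a problem by itself, since idle pool threads wait too, so treat the numbers as a hint about where to look next.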
Let's check the identical stack traces.
Well, the first trace (stackTrace1) stands out: a large number of threads are doing the same thing and producing identical stack traces.
Now, let's open stackTrace1 and see what's going on.
Well, you can see that 114 threads are printing identical traces, and if you go through them one by one (left-hand side), you can see the same behavior.
All these threads are in a WAITING state while making a call to Salesforce using the Salesforce connector.
So the likely cause is a misconfiguration of the Salesforce connector.
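The "identical stack traces" idea is simple enough to reproduce by hand. The sketch below is my own approximation, not how fastthread.io works: it assumes thread entries are separated by blank lines and frames start with "at ", and the class name StackTraceGrouper is hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public class StackTraceGrouper {
    /** Maps each distinct stack (its frames joined by newlines) to the number of threads showing it. */
    public static Map<String, Integer> groupByStack(String dump) {
        Map<String, Integer> groups = new LinkedHashMap<>();
        // Thread entries are assumed to be separated by blank lines.
        for (String block : dump.split("\\n\\s*\\n")) {
            StringBuilder frames = new StringBuilder();
            for (String line : block.split("\\n")) {
                String trimmed = line.trim();
                // Keep only the frames; lock lines carry addresses that differ per thread.
                if (trimmed.startsWith("at ")) {
                    frames.append(trimmed).append('\n');
                }
            }
            if (frames.length() > 0) {
                groups.merge(frames.toString(), 1, Integer::sum);
            }
        }
        return groups;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a downloaded thread dump file.
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        groupByStack(dump).entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue()) // biggest group first
                .limit(3)
                .forEach(e -> System.out.println(e.getValue() + " threads:\n" + e.getKey()));
    }
}
```

The biggest group is usually the place to start reading, which is exactly what happened here with stackTrace1.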
Final outcome of the analysis
Well, this is a known issue. Check the official MuleSoft article on it.
In brief, if we don't set the Read Timeout in the Salesforce connector (the default is 0), the connector will wait indefinitely for a response from Salesforce. If Salesforce is slow to respond (or down), Mule uses a new thread for every incoming request to your application, and each of those threads then blocks on the Salesforce call. Eventually Mule runs out of threads and your application halts.
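The mechanism is easier to see in a toy model. The sketch below is not Mule's scheduler: it assumes a plain fixed-size Java thread pool and uses a latch as a stand-in for a Salesforce call that never returns, and the names PoolExhaustionDemo and newRequestStarves are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolExhaustionDemo {
    /** Returns true if a new request could not get a thread within the wait window. */
    public static boolean newRequestStarves(int poolSize, long waitMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch neverAnswers = new CountDownLatch(1); // stand-in for a backend that never replies
        try {
            // Occupy every worker with a call that blocks without any timeout.
            for (int i = 0; i < poolSize; i++) {
                pool.submit(() -> { neverAnswers.await(); return null; });
            }
            // The next request is queued behind them and never gets to run.
            Future<String> next = pool.submit(() -> "processed");
            try {
                next.get(waitMillis, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                return true;
            }
        } finally {
            neverAnswers.countDown(); // release the blocked workers so the demo can exit
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("new request starved: " + newRequestStarves(2, 300));
    }
}
```

A thread dump taken while the workers are blocked would show all of them waiting in the same frames, which is the pattern we saw in the analysis.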
Solution
Set a proper Read Timeout in the Salesforce Connector configuration to avoid such issues.
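To see what a read timeout buys you, here is a minimal sketch using plain Java sockets rather than the Salesforce connector, whose internals I am not reproducing here. It assumes a local server that accepts connections but never sends data, and the names ReadTimeoutDemo and readTimesOut are hypothetical. In java.net.Socket, a timeout of 0 also means "wait forever".

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {
    /** Returns true if the read gave up with a timeout instead of blocking forever. */
    public static boolean readTimesOut(int readTimeoutMillis) throws IOException {
        // Stand-in for an unresponsive backend: the connection is established
        // through the listen backlog, but no byte is ever written back.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            client.setSoTimeout(readTimeoutMillis); // must be > 0 here; 0 would block indefinitely
            try {
                client.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                // The thread is free again and can report an error to the caller.
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a timeout in place, a slow backend costs you a failed request instead of a permanently blocked thread. The right value depends on how long your Salesforce calls normally take.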
Conclusion
In this brief article, I have shown how to troubleshoot a Mule application using thread dumps. If you have other situations where thread dumps helped you find the root cause, please let us know.
Thanks.