The Different Meanings of Finished
This is the third article in our series on how to deal with incomplete cases in process mining. You can find an overview of all articles in the series here.
Once you have determined what your start points and endpoints are, you still need to think about what finished or completed actually means for your process.
Multiple interpretations are possible and the differences can be subtle, but you will need to use different filters depending on the meaning that you want to apply. The results will be different and you need to be clear about which meaning is right for your data set.
Here are four examples of how you can filter incomplete cases. It's not that any of these is better or more appropriate than the others in general. Instead, it depends on your process and on the meaning of finished that you want to choose.
Perhaps the most common meaning of finished is to look at which activities have occurred as the very last activity (for end points) or as the very first activity (for start points) in a case.
This corresponds to the dashed lines that you see in the process map. You can use the Endpoints Filter in Discard cases mode to keep only the cases that start or end with a particular set of activities (see Figure 1).
When you add this filter, only the activities that occurred as the very first event in any of the cases are shown in the Start event values on the left and only activities that occurred as the very last event in any of the cases are shown in the End event values on the right.
You can then select only the regular start and end activities that you have identified in the previous step to focus on your completed cases. For example, if we only select the Order completed activity as a regular end point for our refund process, then the remaining data set will only contain the 333 cases that actually ended with Order completed. If you use the shortcut Filter for this start/end activity after clicking on a dashed line in the process map, Disco will automatically add a pre-configured Endpoints filter to your data set.
To use your filtered data set as the new reference point for your further analysis, you can enable the checkbox Apply filters permanently after pressing the Copy and filter button. The outcome of applying the filter will be the same (the same 333 cases remain), but the applied filter will be consolidated in a new data set, so that successive analyses use this new baseline as the new 100% of cases.
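The logic behind this first meaning of finished can also be sketched outside of Disco. The following Python snippet is only an illustration of that logic and not Disco's implementation: the mini event log, the case IDs, and the function names are made up for the example.

```python
from collections import defaultdict

# Hypothetical mini event log: (case ID, activity, timestamp). All values are made up.
EVENTS = [
    ("case-1", "Order created", "2024-01-01"),
    ("case-1", "Refund issued", "2024-01-03"),
    ("case-1", "Order completed", "2024-01-05"),
    ("case-2", "Order created", "2024-01-02"),
    ("case-2", "Missing documents requested", "2024-01-04"),
]


def group_by_case(events):
    """Return {case ID: activities in timestamp order}."""
    traces = defaultdict(list)
    # ISO dates sort correctly as plain strings.
    for case_id, activity, _timestamp in sorted(events, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    return dict(traces)


def discard_by_endpoints(traces, start_activities, end_activities):
    """Keep only cases whose first and last activities are in the selected sets."""
    return {
        case_id: activities
        for case_id, activities in traces.items()
        if activities[0] in start_activities and activities[-1] in end_activities
    }


completed = discard_by_endpoints(
    group_by_case(EVENTS),
    start_activities={"Order created"},
    end_activities={"Order completed"},
)
print(sorted(completed))  # ['case-1']
```

Here, case-2 is discarded because its very last activity is not one of the selected end activities.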
Sometimes, the very last activity that happened in a case is not the best way to determine whether a case has been completed or not.
For example, after completing an order there might be back-end activities such as archiving or other documentation steps that occur later. In these cases, Order completed will not be the very last step in the process (so, the case would not be picked up if you use the Endpoints filter).
If you are mainly concerned with whether one or more milestone activities that indicate the completion of your process have occurred, you can use the Attribute Filter in Mandatory mode (see Figure 2). This way, you keep all cases in which at least one of the selected activities has happened, but you don’t care whether it was the very last step in the process or whether other activities were recorded afterwards.
Instead of manually adding this filter, you can also use the shortcut Filter this activity… after clicking on the activity in the process map. Disco will automatically add a pre-configured Attribute Filter in Mandatory mode to your data set with the right activity already selected.
If we apply this meaning of finished based on the milestone activity Order completed for the refund process, we get a slightly different outcome compared to the Endpoints Filter before. Instead of 333 cases, there now remain 334 cases after applying the filter and we can see that the additional case ended with the activity Warehouse (see Figure 3).
If we now click on the dashed line leading from the Warehouse activity and use the shortcut to investigate this case in more detail, we can see in the history of the case that the activity Order completed did indeed occur. However, it occurred in the middle of the process after the order was initially rejected. Then, the case got picked up again and the refund was actually granted (see Figure 4).
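The difference between the two meanings can be made concrete with a small sketch. The traces below are hypothetical (they are not taken from the refund data set) and the function names are our own. The point is only to show why a check for "occurred anywhere" can keep a case that a check for "occurred last" discards.

```python
# Hypothetical traces (case ID -> activities in order); not taken from the real data set.
TRACES = {
    "case-1": ["Order created", "Refund issued", "Order completed"],
    "case-2": ["Order created", "Order completed", "Refund issued", "Warehouse"],
    "case-3": ["Order created", "Missing documents requested"],
}


def ends_with(traces, end_activities):
    """Endpoints-style check: the very last activity must be selected."""
    return sorted(c for c, acts in traces.items() if acts[-1] in end_activities)


def contains_any(traces, milestones):
    """Mandatory-style check: a selected activity occurs anywhere in the case."""
    return sorted(c for c, acts in traces.items() if any(a in milestones for a in acts))


print(ends_with(TRACES, {"Order completed"}))     # ['case-1']
print(contains_any(TRACES, {"Order completed"}))  # ['case-1', 'case-2']
```

In this made-up example, case-2 plays the role of the additional case: Order completed occurs in the middle, so only the milestone check keeps it.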
In another scenario, you might be analyzing the refund process from a customer perspective: This is a process that the customers of an electronics manufacturer go through after the product that they purchased was broken and they now want to get their money back. So, from the customer's point of view the process is finished as soon as they have received their refund.
To analyze the data from this perspective, we can focus on the three payment activities Payment issued, Refund issued and Special Refund issued (see Figure 5).
If we search for these activities in the process map, then we can see that there are several activities that happen afterwards. Sometimes, the delays in the back-end processing can be quite long (for example, 7.5 days on average after the Payment issued step), but from the customer's perspective this delay is not relevant.
So, to focus our analysis on the part of the process that is relevant for the customer, we can use the Endpoints Filter in Trim longest mode (see Figure 6).
When we change the Endpoints Filter mode from Discard cases to Trim longest, then all of the activities become available as Start event values on the left and as End event values on the right. We can now select only the three payment activities as the customer endpoints in our process.
As a result, everything that happened after any of these three payment activities is cut off. We can see that the customer payments now appear as the endpoints in our process map (see Figure 7).
The cases that remain in the data set after applying the filter are the same ones as if we had used the Attribute Filter in Mandatory mode. But cutting off all activities after the payments enables us to focus our process analysis on the part of the process that is relevant from the customer's perspective:
The process map does not show the back-end activities after the payments anymore, so our bottleneck analysis (see also Analyze SLAs and Bottlenecks) will point us to the right places in the map that we should focus on.
The case durations in the statistics views are only shown for the times from the creation of the refund order until the time that the customer has received their money back.
The variants now only show us the process scenarios from the time of the order creation until the payment activities, so they are more meaningful for this perspective.
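The trimming idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Disco's implementation: we assume that the start side is left unrestricted and that each case is cut after the last occurrence of a selected end activity. The traces and the function name are hypothetical.

```python
PAYMENTS = {"Payment issued", "Refund issued", "Special Refund issued"}

# Hypothetical traces; the activity order is made up for illustration.
TRACES = {
    "case-1": ["Order created", "Payment issued", "Warehouse", "Order completed"],
    "case-2": ["Order created", "Missing documents requested"],
}


def trim_to_last_endpoint(traces, end_activities):
    """Cut each case after the last selected activity; drop cases that have none."""
    trimmed = {}
    for case_id, activities in traces.items():
        positions = [i for i, a in enumerate(activities) if a in end_activities]
        if positions:
            trimmed[case_id] = activities[: positions[-1] + 1]
    return trimmed


print(trim_to_last_endpoint(TRACES, PAYMENTS))
# {'case-1': ['Order created', 'Payment issued']}
```

The same cases survive as with a milestone check on the payment activities, but the events after the payment are gone, so durations and variants computed on the result only cover the customer-relevant part.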
Open for longer than X
There might be activities in your process that can be considered an endpoint if there has been a certain period of inactivity afterwards (see also Reason No. 3 at the beginning of this series). For example, we can request missing information (like the purchase receipt) from a customer to handle their refund order but the customer might not get back to us.
If we want to focus on cases where the activity Missing documents requested was the last step in the process and nothing has happened for a month since, we can use a combination of filters in the following way.
First, we add an Endpoints filter as shown in Figure 8.
Then, we add a second filter by clicking the click to add filter button again and we add a Timeframe filter on top of it (see Figure 9).
By adapting the selected timeframe in such a way that the past month is not covered, we will only keep those cases that did end with Missing documents requested and where that last step took place more than one month ago.
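The combination of the two filters boils down to two conditions per case, which the following Python sketch spells out. The cases, the reference date, and the function name are hypothetical, and the 30-day idle period is just one way to approximate "one month".

```python
from datetime import datetime, timedelta

# Hypothetical cases: case ID -> (activity, timestamp) pairs in chronological order.
TRACES = {
    "case-1": [("Order created", datetime(2024, 1, 2)),
               ("Missing documents requested", datetime(2024, 1, 5))],
    "case-2": [("Order created", datetime(2024, 2, 20)),
               ("Missing documents requested", datetime(2024, 2, 25))],
    "case-3": [("Order created", datetime(2024, 1, 3)),
               ("Order completed", datetime(2024, 1, 10))],
}


def stalled_cases(traces, last_activity, now, idle=timedelta(days=30)):
    """Cases that end with last_activity and whose last event is older than the idle period."""
    cutoff = now - idle
    return sorted(
        case_id
        for case_id, events in traces.items()
        if events[-1][0] == last_activity and events[-1][1] < cutoff
    )


print(stalled_cases(TRACES, "Missing documents requested", now=datetime(2024, 3, 1)))
# ['case-1']
```

In this example, case-2 also ends with Missing documents requested, but its last step is too recent, so it may still be waiting for a regular customer response.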