Inside Nitro: The Endpoint Filter
With Nitro 3.0 we introduced log filters. Filtering is an essential tool to clean your data, focus your analysis, and drill down into specific aspects of your process. As powerful as it is, filtering also introduces a certain level of complexity. So, we decided that sticky notes won’t be enough anymore, and we promised to write more about the new log filters to guide your way.
Today, I’ll explain how the Endpoint filter works, and when and why you need it.
The filter can be used in two different modes: Discard and Trim.
1. Discard: Clean up incomplete cases
The ‘Discard’ option can be used to remove incomplete process instances from your data set.
In this case, the endpoints are used as a selection criterion to decide whether to keep or throw away a process instance during the filtering.
Almost all data sets that are extracted from real IT systems contain incomplete cases, which were either still running when the data was exported or had started before the time frame selected for analysis.
Below, you see the mined process model of a purchasing process that was created based on a data set with many incomplete cases. There are arcs from many activities in the middle of the process to the end of the process.
This model accurately reflects the data set, but it does not show the regular process flow from start to finish. You can use the Endpoint filter to clean up your data in the following way.
In Nitro, select the Filter tab and add a new endpoint filter as shown below.
The filter settings then show you a list of activities that occurred as the first event (Start event values) and as the last event (End event values) in all cases in your data set. In the screenshot below, you can see that in the example process all cases start with the activity Create Purchase Requisition, but there are many different end activities.
Based on our domain knowledge, we know that there are only three legitimate end activities:
- Pay invoice (the regular end activity of the purchasing process)
- Analyze Purchase Requisition (if the purchase requisition was not approved by the manager and the process has been stopped)
- Analyze Request for Quotation (if the request for quotation was not approved by the purchasing agent)
So, we select only these three activities as End event values, apply the filter by clicking ‘Start filtering…’, and export the filtered event log. The discovered process model then reflects the behavior of the process based only on completed log traces, as shown below.
Now, what do you do when you are confronted with a data set where you are not sure which end activities are legitimate and which just come from cases that are still running? Here is a trick that you can use to find out more.
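If you want to reproduce the Discard logic outside of Nitro, here is a minimal Python sketch. It is not how Nitro implements the filter; the function names, the row layout (case ID, timestamp, activity), and the tiny example log are all assumptions made for illustration.

```python
from collections import defaultdict

def build_traces(events):
    """Group (case_id, timestamp, activity) rows into time-ordered activity lists."""
    by_case = defaultdict(list)
    for case_id, timestamp, activity in events:
        by_case[case_id].append((timestamp, activity))
    return {
        case_id: [activity for _, activity in sorted(rows, key=lambda r: r[0])]
        for case_id, rows in by_case.items()
    }

def discard_incomplete(traces, start_values, end_values):
    """Keep only cases whose first and last events are selected endpoints."""
    return {
        case_id: trace
        for case_id, trace in traces.items()
        if trace[0] in start_values and trace[-1] in end_values
    }

# Hypothetical mini log; timestamps are plain integers for brevity.
events = [
    ("case-1", 1, "Create Purchase Requisition"),
    ("case-1", 2, "Analyze Purchase Requisition"),
    ("case-2", 1, "Create Purchase Requisition"),
    ("case-2", 2, "Send Invoice"),  # still running at export time
    ("case-3", 1, "Create Purchase Requisition"),
    ("case-3", 2, "Send Invoice"),
    ("case-3", 3, "Pay invoice"),
]

start_values = {"Create Purchase Requisition"}
end_values = {
    "Pay invoice",
    "Analyze Purchase Requisition",
    "Analyze Request for Quotation",
}

complete = discard_incomplete(build_traces(events), start_values, end_values)
print(sorted(complete))  # case-2 is dropped because it ends with "Send Invoice"
```

Note that the whole case is kept or thrown away; no individual events are removed from the cases that survive.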
Bonus trick: Find the regular end activities for your process
For this, you’ll get a peek at another filter in Nitro: The Timeframe filter. With this filter, you can restrict the timeframe for your event log.
Lower the upper timeframe limit by dragging the timeframe slider from the right to the middle to exclude cases that might still be running (see screenshot).
If you then inspect the Activity Statistics as shown below, you will find only the end activities for process instances that have been completed by the selected upper timeframe date. In the purchasing example, we can easily recover the three regular end activities that we already knew.
This works best if your dataset covers a large timeframe and the individual processes are well-contained.
In the same way, you can also look for start activities: Just limit your timeframe to the second half of the dataset and see which activities are the typical start activities for your process. It is unlikely that there are process instances that were already running before the start of the covered timeframe and had then been inactive for many months.
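The end-activity version of this trick can also be sketched in a few lines of Python. This is only an approximation under stated assumptions: it treats a case as “completed by the cutoff” when its last recorded event falls on or before the cutoff date, and the function name, data layout, and dates are made up for the example. Nitro’s Timeframe filter may handle the boundaries differently.

```python
from collections import Counter
from datetime import datetime

def end_activities_before(cases, cutoff):
    """Count the last activity of every case whose final event is on or before cutoff."""
    counts = Counter()
    for events in cases.values():
        ordered = sorted(events, key=lambda e: e[0])
        last_time, last_activity = ordered[-1]
        if last_time <= cutoff:
            counts[last_activity] += 1
    return counts

# Hypothetical cases: lists of (timestamp, activity) per case ID.
cases = {
    "case-1": [(datetime(2011, 1, 3), "Create Purchase Requisition"),
               (datetime(2011, 1, 20), "Pay invoice")],
    "case-2": [(datetime(2011, 2, 1), "Create Purchase Requisition"),
               (datetime(2011, 2, 9), "Analyze Purchase Requisition")],
    "case-3": [(datetime(2011, 11, 28), "Create Purchase Requisition"),
               (datetime(2011, 12, 1), "Send Invoice")],  # probably still running
}

# Cutoff in the middle of the covered timeframe.
counts = end_activities_before(cases, cutoff=datetime(2011, 6, 30))
for activity, n in counts.most_common():
    print(activity, n)
```

Activities that show up often in this count are good candidates for legitimate end activities; ones that only appear near the export date are more likely to come from running cases.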
2. Trim: Chop your process to the size you want it
The ‘Trim’ option can be used to focus your analysis on a part of the process.
In this case, the endpoints are used as clipping markers and all events before the indicated ‘start’ activities plus all events after the indicated ‘end’ activities are thrown away during the filtering.
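To make the clipping idea concrete, here is a small Python sketch. It rests on two assumptions that the description above does not settle: the cut begins at the first occurrence of a start activity and ends at the last occurrence of an end activity, and cases missing either marker are dropped. The function name is hypothetical, and Nitro may resolve these edge cases differently.

```python
def trim_trace(trace, start_values, end_values):
    """Return the part of trace between the clipping markers, or None.

    Assumption: clip from the first start marker to the last end marker
    at or after it; return None when either marker is missing.
    """
    start = next((i for i, a in enumerate(trace) if a in start_values), None)
    if start is None:
        return None
    end = next(
        (i for i in range(len(trace) - 1, start - 1, -1) if trace[i] in end_values),
        None,
    )
    if end is None:
        return None
    return trace[start:end + 1]

trace = [
    "Create Purchase Requisition",
    "Send Invoice",
    "Release Supplier's Invoice",
    "Authorize Supplier's Invoice Payment",
    "Pay invoice",
]

trimmed = trim_trace(
    trace,
    start_values={"Send Invoice"},
    end_values={"Authorize Supplier's Invoice Payment"},
)
print(trimmed)
```

Unlike Discard, the cases that remain are shortened: everything outside the two markers is gone, and only the segment in between is kept.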
Let’s say that we have discovered a conformance problem in our purchasing process: Sometimes the process moves from Send Invoice directly to the Authorize Supplier’s Invoice Payment step. The obligatory Release Supplier’s Invoice process step, which needs to be performed by the Financial Manager, has been skipped in 10 instances.
Furthermore, we have received complaints from our suppliers that the payments are made really late. Perhaps that is one of the reasons that the Release Supplier’s Invoice process step was sometimes skipped?
In any case, we want to focus our analysis just on this part of the overall process, and here is where the Trim option of the Endpoint filter comes in handy. Select the start and end activities for the process part you wish to focus on as shown below.
When you mine a new process model based on the filtered event log, the result is a process that is “chopped off” at the indicated endpoints. Now you can send that (much more focused) picture over to your colleague who should take a look at that conformance issue.
Furthermore, you can analyze the throughput times for this sub-process just like you would analyze them for the whole process. So, if you ever come across the question “How long does it usually take to get from this point in the process to that point of the process?”, the Trim filter is your friend.
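As a rough illustration of the “from this point to that point” question, the sketch below measures the time between two activities per case and takes the median. The timestamps are invented, the function name is hypothetical, and the choice of first start marker to last end marker is an assumption rather than Nitro’s documented behavior.

```python
from datetime import datetime
from statistics import median

def subprocess_hours(events, start_activity, end_activity):
    """Hours from the first start_activity to the last end_activity after it.

    events is a list of (timestamp, activity); returns None if a marker is missing.
    """
    ordered = sorted(events, key=lambda e: e[0])
    starts = [t for t, a in ordered if a == start_activity]
    if not starts:
        return None
    ends = [t for t, a in ordered if a == end_activity and t >= starts[0]]
    if not ends:
        return None
    return (ends[-1] - starts[0]).total_seconds() / 3600

# Hypothetical cases with made-up timestamps.
cases = {
    "case-A": [(datetime(2011, 3, 1, 9), "Send Invoice"),
               (datetime(2011, 3, 2, 9), "Release Supplier's Invoice"),
               (datetime(2011, 3, 3, 9), "Authorize Supplier's Invoice Payment")],
    "case-B": [(datetime(2011, 3, 1, 9), "Send Invoice"),
               (datetime(2011, 3, 1, 21), "Authorize Supplier's Invoice Payment")],
    "case-C": [(datetime(2011, 3, 1, 9), "Create Purchase Requisition")],
}

durations = [
    hours
    for events in cases.values()
    if (hours := subprocess_hours(
        events, "Send Invoice", "Authorize Supplier's Invoice Payment")) is not None
]
typical_hours = median(durations)
print(typical_hours)
```

The median is used here instead of the mean because a handful of very slow cases would otherwise dominate the answer.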
The Trim option can also be used for clean-up purposes if your end activities are not guaranteed to be the last event in the process. For example, sometimes you have a dataset where after a successful completion event there may still be some kind of comment activities, thus making it impossible to use the Discard option for clean-up without removing the comment activities first. Use Trim to directly indicate where your process starts and ends, and it’ll throw away the rest.
I hope you found this useful!
How do you clean up your incomplete cases? Have you ever used the Timeframe filter trick before?