ProM Tips — Which Log Filters Should You Use?

In an earlier post, I explained how you can mine a first process model and create an animation. In this post, I want to familiarize you with the log filters in ProM [1].

What is a log filter?

Once you have created an event log and opened it in ProM, you may want to change it in various ways. Filtering is done for two main reasons: cleaning the data or narrowing down the analysis. Sometimes there are also technical reasons (just based on how some of the plug-ins in ProM work).

All in all, you can use log filters in ProM to change your event log in 4 ways:

  1. Remove process instances (cases)

  2. Add events

  3. Remove events

  4. Modify events

In the following, I’ll show you my two most important log filters – I use them in every project. There will also be a rough overview of other useful filters at the end of this post.

1. Adding artificial ‘start’ and ‘end’ events

Let’s pick up the call center example again. Previously, we had discovered the following process model using the Fuzzy Miner.

One thing that can be observed in the picture above is that it is hard to see where the process starts and where it ends. All of the activities are connected, and there is no clear beginning or end point.

To create a clear start and end point in your process models, you can use the so-called Add Artificial Start Task Log Filter and the Add Artificial End Task Log Filter. They are located in the ‘Advanced’ filter tab (see screenshot below).

To use these filters on your event log, you first have to select each of them in the list on the right and then press ‘add selected filter’. In the next screen, you can provide a custom name if you like [2], and then press ‘add new filter’. After that, the filter should appear in the ‘Log filter chain’ on the left (see screenshot above).

The effect of the log filter can be directly verified in the log inspector. The event log is exactly the same, just that the artificial ‘Start’ event has been inserted at the very beginning in each trace (see screenshot below). Similarly, the ‘End’ event is inserted after the last event in the sequence.
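To make the effect concrete outside of ProM, here is a minimal Python sketch of what the two filters do to a log. It is not ProM code: the dictionary-based log layout, the function name `add_artificial_start_end`, and the activity names in the toy log are assumptions made purely for illustration.

```python
from typing import Dict, List

# Assumed layout: a log maps a case ID to its ordered list of activity names.
Log = Dict[str, List[str]]


def add_artificial_start_end(log: Log, start: str = "Start", end: str = "End") -> Log:
    """Return a new log in which every trace is wrapped in artificial events."""
    wrapped = {}
    for case_id, trace in log.items():
        # The original events stay untouched and in the same order.
        wrapped[case_id] = [start] + list(trace) + [end]
    return wrapped


# Hypothetical toy log; the activity names are invented for this example.
log = {
    "case-1": ["Incoming Call", "Handle Call"],
    "case-2": ["Incoming Email"],
}

filtered = add_artificial_start_end(log)
for case_id, trace in filtered.items():
    print(case_id, trace)
```

Every trace now begins and ends with the same event, which is what gives the discovered model a single entry and exit point.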

If we now discover a process model based on the filtered log, we get the following process model:

The main path of the process is now clearly visible based on the thickness of the arcs: Most process instances are handled by an incoming call at the front line and are then directly completed.

Be aware that some of the mining algorithms (such as the Heuristics Miner) assume that there is an identical start and end event for each case; otherwise, the quality of the result may be reduced. So, it is strongly advisable to add these start and end events in most situations [3].

2. Filtering cases based on ‘start’ and ‘end’ events

Now, this sounds very similar but in fact is quite the opposite: Instead of adding artificial ‘start’ and ‘end’ events we select only those process instances that start and/or end with particular activities in the first place.

In many situations, one gets a data extract of the complete process logging for a particular time frame. The event log therefore most likely contains some process instances that are incomplete, either because they were started before the data extract begins or because they were not yet finished when the data extract stops. To clean up your data, you should remove those incomplete process instances from the log.

This is exactly what the simple log filter can do. In the middle area (see screenshot below), you can determine which are the legal start and end events for your process and throw away all cases that do not fulfill these filtering criteria.
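The same selection logic can be sketched in a few lines of Python. This is only an illustration of the idea, not how ProM implements it: the log layout, the function name `keep_complete_cases`, and the activity names are assumptions.

```python
from typing import Dict, List, Set

# Assumed layout: a log maps a case ID to its ordered list of activity names.
Log = Dict[str, List[str]]


def keep_complete_cases(log: Log, legal_starts: Set[str], legal_ends: Set[str]) -> Log:
    """Keep only cases whose first and last events are both legal."""
    kept = {}
    for case_id, trace in log.items():
        if not trace:
            continue  # an empty trace has no start or end event to check
        if trace[0] in legal_starts and trace[-1] in legal_ends:
            kept[case_id] = trace
    return kept


# Hypothetical toy log; the activity names are invented for this example.
log = {
    "case-1": ["Incoming Call", "Handle Call", "Close Request"],
    "case-2": ["Handle Call", "Close Request"],   # started before the extract
    "case-3": ["Incoming Email", "Handle Email"],  # not finished yet
}

complete = keep_complete_cases(
    log,
    legal_starts={"Incoming Call", "Incoming Email"},
    legal_ends={"Close Request"},
)
print(sorted(complete))
```

Only the first case survives here, because it is the only one that both starts and ends with a legal event.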

Again, you can observe and verify your current filter settings by changing to the inspector tab in the log window.

In the call center example, we extracted complete process instances from the database: we retrieved data only for those service requests that were started within one particular month. Therefore, all ‘start’ events that are found in the log really do correspond to the start of the process.

Nevertheless, looking at the list of start events in the simple log filter UI above, we can see that the process did not always start with an incoming phone call or an incoming email (as expected). This indicates that agents are not always re-using existing service requests but give out new IDs in the middle of the process. It is important for the service quality that all activities are logged with the same service ID because it ensures that agents in follow-up activities have all the necessary information at hand.
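If you want to reproduce such a list of start events outside of ProM, a small sketch like the following would do. The log layout, the helper name `start_event_counts`, and all activity names except ‘Handle Email-FL’ are assumptions for illustration; the counts come from the invented toy data and not from the call center log.

```python
from collections import Counter
from typing import Dict, List

# Assumed layout: a log maps a case ID to its ordered list of activity names.
Log = Dict[str, List[str]]


def start_event_counts(log: Log) -> Counter:
    """Count how often each activity occurs as the first event of a case."""
    return Counter(trace[0] for trace in log.values() if trace)


# Hypothetical toy log, not the real call center data.
log = {
    "case-1": ["Incoming Call", "Handle Call"],
    "case-2": ["Incoming Email", "Handle Email-FL"],
    "case-3": ["Handle Email-FL"],
    "case-4": ["Incoming Call"],
}

counts = start_event_counts(log)
expected_starts = {"Incoming Call", "Incoming Email"}

# Start events outside the expected set hint at cases opened mid-process.
unexpected = {name: n for name, n in counts.items() if name not in expected_starts}
print(unexpected)
```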

For example, the activity ‘Handle Email-FL’ appears in the list of start events. However, when an email is received by the call center through the web form of the company, a service request ID is automatically assigned. So, by the time an agent handles the email, a service request number already exists and should be re-used.

We can use the start event filter to narrow down our analysis to these special cases. Furthermore, we want to focus only on the activity ‘Handle Email-FL’ in order to find out which agents are making this mistake. To remove all other activities from the event log, we deselect them in the right part of the filter (see screenshot above).

We can view the log summary for the filtered log by changing to the Summary tab of the log dialog (see screenshot above). Scrolling to the Originators category, we can see that the agents Samuil and Jochem most often create new service IDs when they handle incoming emails. We can now give them targeted training and explain to them why it is important to re-use existing service numbers.
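The combination of the two filter settings and the originator summary can be approximated with the following Python sketch. It assumes each event is a small dictionary with an `activity` and an `originator` key; the function name, the agent names, and the toy data are hypothetical and do not reproduce the actual numbers from the call center log.

```python
from collections import Counter
from typing import Dict, List

# Assumed layout: each event carries its activity name and its originator.
Event = Dict[str, str]
Log = Dict[str, List[Event]]


def originators_for_start_activity(log: Log, activity: str) -> Counter:
    """Count originators of `activity` in cases that start with `activity`."""
    counts = Counter()
    for trace in log.values():
        # Start event filter: skip cases that begin with anything else.
        if not trace or trace[0]["activity"] != activity:
            continue
        # Event filter: ignore every other activity in the remaining cases.
        for event in trace:
            if event["activity"] == activity:
                counts[event["originator"]] += 1
    return counts


# Hypothetical toy log with invented agent names.
log = {
    "case-1": [
        {"activity": "Handle Email-FL", "originator": "agent_a"},
        {"activity": "Close Request", "originator": "agent_c"},
    ],
    "case-2": [
        {"activity": "Incoming Email", "originator": "agent_b"},
        {"activity": "Handle Email-FL", "originator": "agent_b"},
    ],
    "case-3": [{"activity": "Handle Email-FL", "originator": "agent_a"}],
}

by_agent = originators_for_start_activity(log, "Handle Email-FL")
print(by_agent.most_common())
```

In this toy data, the second case is dropped because it starts with a regular incoming email, so only the agent who opened new cases shows up in the count.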

The example shows that, using just simple filtering methods, one can already answer certain questions about a process and find quality problems.

Other log filters in ProM

There are several other filter plug-ins in ProM. Here are a few that I use:

I hope this gives an overview and a starting point for how to use the log filters in ProM. Just like ProM itself, the filters have grown over time based on specific needs. So, you may very well end up missing a certain type of filtering functionality.

Which filters do you use most often? Which ones have you missed and how did you deal with it? Let us know in the comments!

  1. We generally recommend using ProM 5 because of its stability and functionality. The log filters are in fact a good example where ProM 6 is not fully functional yet. If you want to know where ProM is heading in the future, you can learn more about the advantages of ProM 6 here.

  2. I choose ‘Start’ and ‘End’ instead of the somewhat lengthy ‘Artificial Start Task’ and ‘Artificial End Task’.

  3. You can avoid redoing the filtering multiple times if you choose Exports -> Filtered log -> Efficient MXML.GZ Export from the menu and save the filtered log.
