Case Study: Auditing With Process Mining — Part VI: Data Transformation

Step 5: Data Transformation

This is the 6th article in our case study series on auditing with process mining. The series is written by Jasmine Handler and Andreas Preslmayr from the City of Vienna. You can find an overview of all the articles in the series here.

The goal of the next step was to bring the raw data into a format that we could load into the process mining software. We filtered the relevant information from the raw data files and linked the data tables based on the previously defined connections. The output data was formatted as an event log, with a unique ID serving as the case ID, plus an activity name, a timestamp, a resource, and attributes for each event.
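To make the target format concrete, here is a minimal Python sketch of such an event log. It is not the KNIME workflow used in the project. The tables, field names, and values are made up for illustration, and it assumes that the order number serves as the case ID.

```python
from datetime import datetime

# Hypothetical raw tables; all field names and values are illustrative only.
orders = [
    {"order_id": "O1", "created": "2020-01-07 09:15", "user": "alice", "amount": 1200.0},
    {"order_id": "O2", "created": "2020-01-08 14:00", "user": "bob", "amount": 300.0},
]
invoices = [
    {"invoice_id": "I1", "order_id": "O1", "received": "2020-01-20 10:30", "user": "carol"},
    {"invoice_id": "I2", "order_id": "O2", "received": "2020-01-15 08:45", "user": "carol"},
]


def parse(ts):
    """Turn a 'YYYY-MM-DD HH:MM' string into a datetime object."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")


def build_event_log(orders, invoices):
    """Flatten both tables into one list of events, one dict per event."""
    events = []
    for o in orders:
        events.append({
            "case_id": o["order_id"],        # assumption: the order is the case
            "activity": "Create order",
            "timestamp": parse(o["created"]),
            "resource": o["user"],
            "amount": o["amount"],           # extra attribute
        })
    for i in invoices:
        events.append({
            "case_id": i["order_id"],        # linked to the case via the order
            "activity": "Receive invoice",
            "timestamp": parse(i["received"]),
            "resource": i["user"],
            "invoice_id": i["invoice_id"],   # extra attribute
        })
    # Sort by case, then chronologically within each case.
    events.sort(key=lambda e: (e["case_id"], e["timestamp"]))
    return events


event_log = build_event_log(orders, invoices)
for e in event_log:
    print(e["case_id"], e["activity"], e["timestamp"], e["resource"])
```

Each row of the result is one event, and all events that share a case ID form the trace of that case.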

We performed the data transformation using the open-source software KNIME. To validate the transformed data, we cross-checked it against the productive system whenever we changed the data transformation workflow. These validation steps revealed considerable room for improvement, and we adapted the workflow several times until the output data finally matched the data in the productive system (see Figure 7 below).

Figure 7: The first (left) and last (right) data transformation workflow version
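One simple form of the cross-checking described above is to compare event counts per activity with figures taken from the source system. The following Python sketch illustrates the idea only. The function name `compare_event_counts`, the sample events, and the reference numbers are hypothetical, and the actual checks in the project may have looked different.

```python
from collections import Counter


def compare_event_counts(event_log, reference_counts):
    """Return {activity: (count_in_log, count_in_reference)} for every mismatch."""
    log_counts = Counter(e["activity"] for e in event_log)
    activities = set(log_counts) | set(reference_counts)
    return {
        a: (log_counts.get(a, 0), reference_counts.get(a, 0))
        for a in sorted(activities)
        if log_counts.get(a, 0) != reference_counts.get(a, 0)
    }


# Illustrative transformed data (made-up values).
event_log = [
    {"case_id": "O1", "activity": "Create order"},
    {"case_id": "O1", "activity": "Receive invoice"},
    {"case_id": "O2", "activity": "Create order"},
]
# Illustrative counts as they might be reported from the productive system.
reference_counts = {"Create order": 2, "Receive invoice": 2}

mismatches = compare_event_counts(event_log, reference_counts)
print(mismatches)  # {'Receive invoice': (1, 2)}
```

An empty result means the counts agree. Any entry points to an activity where the transformation lost or duplicated events.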

The data transformation was the most time-consuming step within the process mining project. One reason was that we had no direct access to the productive system, so the audited party had to support the data validation and help with cross-checking. This led to waiting times and delays in the project.

Another factor was that we had initially not given enough consideration to the 1:n and n:m relationships when tracing the case IDs. For example, one order can lead to several invoices and payments, one invoice can address multiple orders, and one payment can cover more than one invoice. These one-to-many and many-to-many relationships had to be handled adequately during data transformation.
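One common way to deal with an n:m relationship is to copy an event into every case it is linked to. The Python sketch below shows this for invoices and orders. It illustrates the general technique, not the approach taken in the project. It assumes that the order number is the case ID, and the link table, identifiers, and timestamps are made up.

```python
from collections import defaultdict

# Hypothetical link table (invoice_id, order_id):
# invoice I1 addresses two orders, and order O1 has two invoices.
invoice_order_links = [("I1", "O1"), ("I1", "O2"), ("I2", "O1")]

# Hypothetical invoice events, keyed by invoice ID.
invoice_events = {
    "I1": {"activity": "Receive invoice", "timestamp": "2020-01-20 10:30"},
    "I2": {"activity": "Receive invoice", "timestamp": "2020-02-03 16:10"},
}


def expand_invoice_events(links, invoice_events):
    """Emit one copy of each invoice event per linked order (case)."""
    log = []
    for invoice_id, order_id in sorted(set(links)):  # set() drops duplicate links
        event = dict(invoice_events[invoice_id])     # copy, keep the original intact
        event["case_id"] = order_id
        event["invoice_id"] = invoice_id
        log.append(event)
    return log


expanded = expand_invoice_events(invoice_order_links, invoice_events)

# Group invoice IDs by case to see which invoices ended up in which case.
events_per_case = defaultdict(list)
for e in expanded:
    events_per_case[e["case_id"]].append(e["invoice_id"])

print(dict(events_per_case))  # {'O1': ['I1', 'I2'], 'O2': ['I1']}
```

With this approach, invoice I1 appears in two cases, so totals computed across all cases count it twice. An alternative is to merge all linked documents into a single case, which avoids duplication but produces larger and more complex cases.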

After several adaptations to the transformation workflow, we passed all the validation steps and generated a data set we were confident working with.


New parts in this auditing series will appear on this blog every week. Simply come back or sign up to be notified about new blog entries here.
