Case Study: Auditing With Process Mining — Part V: Raw Data
This is the fifth article in our case study series on auditing with process mining. The series is written by Jasmine Handler and Andreas Preslmayr from the City of Vienna. You can find an overview of all the articles in the series here.
The data for our process mining analysis was stored in two different systems: SAP and the WMD xSuite feeder. We had identified the data tables that needed to be extracted from these systems when we specified the data model. We had no direct access to the Wiener Stadtwerke’s information systems. Thus, the audited party extracted the data tables for us and provided the raw data in CSV files.
From the data model, we already knew which tables held the timestamps for each activity. For example, we knew that the timestamp for ‘Create purchase request’ could be found in the EBAN table, the timestamp for the ‘Release purchase request’ activity in the WMD table, and so on. However, because the raw data was distributed across multiple CSV files, we also needed to find the connections between the individual data tables so that we could merge the files into one. Figure 6 shows the connections between the tables.
Figure 6: Raw data with connections between the tables
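To make the idea of connecting tables concrete, here is a minimal sketch of merging two CSV extracts on a shared key using only the Python standard library. It is not the procedure we used in the audit. The sample rows, the semicolon delimiter, and all column names (BANFN, ERNAM, BADAT, pr_number, released_by, released_on) are illustrative assumptions, as are the helper functions read_table and left_join.

```python
import csv
import io

# Hypothetical extracts (assumed layout and column names).
# Real files would be opened with open(path, newline="") instead.
EBAN_CSV = """BANFN;ERNAM;BADAT
1000001;MILLER;2019-03-01
1000002;SMITH;2019-03-02
"""

WMD_CSV = """pr_number;released_by;released_on
1000001;HUBER;2019-03-04
"""


def read_table(text, delimiter=";"):
    """Parse CSV text into a list of dicts, one dict per row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def left_join(left, right, left_key, right_key):
    """Left outer join: keep every left row, attach matching right rows."""
    # Index the right-hand table by its key for fast lookup.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for row in left:
        # A left row without a match is kept once, without right-hand columns.
        for match in index.get(row[left_key], [None]):
            merged_row = dict(row)
            if match is not None:
                merged_row.update(match)
            joined.append(merged_row)
    return joined


merged = left_join(read_table(EBAN_CSV), read_table(WMD_CSV), "BANFN", "pr_number")
for row in merged:
    print(row)
```

A left outer join is used here so that purchase requests without a matching release record are kept rather than silently dropped. Such incomplete cases are often exactly what an audit wants to see.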
For each table, we identified which fields could serve as the timestamp for an activity, which fields identified the resources, and which fields held other activity- or case-related attributes.
Knowing the relevant data fields for the activity timestamps, attributes, and resources, and understanding the connections between the raw data tables, we now had the basis for building our event log.
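As a rough illustration of how such a field mapping can be turned into an event log, the following sketch describes, per activity, which table and which columns supply the case ID, the timestamp, and the resource. The two activity names and their tables come from the text above. Everything else is an assumption made for the example: the column names, the date format, the sample rows, and the function build_event_log.

```python
from datetime import datetime

# Hypothetical mapping:
# (activity, table, case ID column, timestamp column, resource column)
ACTIVITY_SOURCES = [
    ("Create purchase request", "EBAN", "BANFN", "BADAT", "ERNAM"),
    ("Release purchase request", "WMD", "pr_number", "released_on", "released_by"),
]

# Hypothetical rows, as they might look after reading the CSV files.
TABLES = {
    "EBAN": [
        {"BANFN": "1000001", "ERNAM": "MILLER", "BADAT": "2019-03-01"},
        {"BANFN": "1000002", "ERNAM": "SMITH", "BADAT": "2019-03-02"},
    ],
    "WMD": [
        {"pr_number": "1000001", "released_by": "HUBER", "released_on": "2019-03-04"},
    ],
}


def build_event_log(tables, sources, date_format="%Y-%m-%d"):
    """Emit one event per source row, then order events by case and time."""
    events = []
    for activity, table, case_col, ts_col, res_col in sources:
        for row in tables[table]:
            events.append({
                "case_id": row[case_col],
                "activity": activity,
                "timestamp": datetime.strptime(row[ts_col], date_format),
                "resource": row[res_col],
            })
    events.sort(key=lambda e: (e["case_id"], e["timestamp"]))
    return events


event_log = build_event_log(TABLES, ACTIVITY_SOURCES)
for event in event_log:
    print(event["case_id"], event["timestamp"].date(), event["activity"], event["resource"])
```

Keeping the mapping in a single list separates the decision about which field means what from the mechanics of assembling the log, so adding a further activity only requires one more entry.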
New parts in this auditing series will appear on this blog every week. Simply come back or sign up to be notified about new blog entries here.