Process Mining for Usability Tests
You might have noticed that products, and especially consumer electronics, are becoming more and more complex. As a result, people are not always able to deal with this complexity, and usability becomes a distinguishing factor in brand reputation and customer satisfaction.
Process mining is a new technology that makes invisible process flows visible by analyzing existing log data in a bottom-up manner. Earlier, we saw how process mining can be applied to the test process of ASML and to an HR process. But can process mining also help to improve the usability of products?
Usability tests, for example first-use consumer tests, can help to get early feedback from the field while there is still time to adapt the product before releasing it to the market. Traditional usability measures include mostly static information, for example:
number of errors produced,
the time required to complete a task,
number of keystrokes, etc.
The results typically do not reflect the temporal aspects of the test data. So, in this project, we looked at how process mining can be used to get insights into the actual user behavior.
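To illustrate why static measures lose the temporal picture, here is a minimal sketch with two made-up sessions. The button names and timings are hypothetical and not taken from the study. Both sessions have the same number of keystrokes, the same duration, and the same per-button counts, yet the order of actions differs.

```python
from collections import Counter

# Two hypothetical sessions: (seconds since start, button pressed).
# Button names and timings are invented for illustration only.
session_a = [(0, "Menu"), (5, "Dual screen"), (9, "Menu"), (20, "OK")]
session_b = [(0, "Menu"), (4, "Menu"), (12, "Dual screen"), (20, "OK")]

def static_measures(session):
    """Order-insensitive measures: keystroke count, duration, per-button counts."""
    times = [t for t, _ in session]
    buttons = [b for _, b in session]
    return len(buttons), max(times) - min(times), Counter(buttons)

def sequence(session):
    """The temporal view: buttons in the order they were pressed."""
    return [b for _, b in sorted(session)]

same_static = static_measures(session_a) == static_measures(session_b)
same_order = sequence(session_a) == sequence(session_b)
print(same_static, same_order)
```

The static measures cannot tell these two sessions apart, while the ordered sequences can. That ordering is the information process mining works with.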
For the project, a group of 29 Dutch volunteers (aged 22 to 66) participated in a usability test for a new television. 19 participants were male and 10 were female. The usability test took place in a simulated living room to make the participants feel at home as much as possible.
The participants were asked to complete the following three tasks:
Channel selection. After installation of the television, channel RTL 7 has been automatically programmed on channel 25. The participants were asked to put RTL 7 on channel 7.
Dual screen. The Dual screen function is innovative in comparison with previous versions of the product. It is one of the features promoted by marketing to sell the product. The participants were asked to watch the channels NEDERLAND 2 and NET 5 simultaneously.
Digital picture. Another function that is new in comparison with previous versions of the television is the Digital picture function, which allows users to view digital pictures from a USB stick on the television screen.
The “correct procedures” to solve each of these three usability tasks are shown in the picture below. Further down, you can see the process models of the actual user behavior for the middle task (Dual Screen).
(Process models of the optimal user behavior for solving each of the three television usability tasks.)
The entire experiment took between 15 and 40 minutes per person, depending on the participant’s performance. During the experiment, the participants and the television screen were captured on a video camera. From these video recordings, an event log of the actual usage behavior was created semi-automatically. You can find more details about the event log creation in Pieter’s Master’s thesis [1].
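As a rough idea of what such an event log could look like, here is a minimal sketch. It assumes each event records a participant, a timestamp, and an activity. The participant ids and timestamps are made up, and the activity names are loosely based on the process models in this article. The actual log format used in the study is described in the thesis and may differ.

```python
from collections import defaultdict

# Hypothetical event log rows: (participant id, timestamp in seconds, activity).
# Ids and timestamps are invented; this is not the study's data or format.
events = [
    ("P01", 12, "Dual screen mode"),
    ("P02", 3, "TV menu"),
    ("P01", 4, "Dual screen button"),
    ("P02", 15, "Dual screen mode"),
    ("P01", 20, "Highlight second screen"),
]

def build_traces(rows):
    """Group events per participant and order each group by timestamp."""
    grouped = defaultdict(list)
    for case_id, timestamp, activity in rows:
        grouped[case_id].append((timestamp, activity))
    return {
        case_id: [activity for _, activity in sorted(items)]
        for case_id, items in grouped.items()
    }

traces = build_traces(events)
for case_id, trace in sorted(traces.items()):
    print(case_id, " -> ".join(trace))
```

Each participant's session becomes one trace, that is, one ordered sequence of activities. These traces are the input from which a process model can be discovered.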
One of the goals of the study was to assess the effect of a consumer’s (product) knowledge on usability. People with high product knowledge are assumed to be more familiar with the product, and to have more experience in using it.
The test participants were divided into ‘High Knowledge’ (13 people) and ‘Low Knowledge’ (16 people) groups based on their knowledge ratings in the questionnaire [2]. Process mining was done on the usage logs of these two groups separately.
(Process model discovered for the ‘High Knowledge’ group performing the Dual Screen task. The numbers and coloring indicate frequencies.)
Look at the behavior of the ‘High Knowledge’ group performing the Dual Screen task above. One can nicely see the paths that were taken from the start to the end of the task.
Most people use the dual screen button and immediately enter the dual screen menu, whereas some people find the dual screen menu via the TV menu.
In the dual screen mode, some participants press the TV button to select the preferred channel for the left screen (which is not needed) and then go back to the dual screen mode.
If people from the ‘High Knowledge’ group visit a deviant state, they return to the dual screen mode.
An interesting loop that shows the switches between the screens leads from ‘Highlight second screen’ back to ‘Highlight first screen’ and then to ‘Selected channel first screen’.
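The frequency numbers on the arcs of such a model can be approximated by counting how often one activity directly follows another. The sketch below builds this kind of directly-follows graph from a few hypothetical traces. It is a simplification for illustration, not necessarily the discovery algorithm used in the study, and the traces are invented.

```python
from collections import Counter

# Hypothetical traces for illustration; not the study's data.
traces = [
    ["Dual screen button", "Dual screen mode", "Highlight second screen"],
    ["TV menu", "Dual screen mode", "Highlight second screen",
     "Highlight first screen", "Selected channel first screen"],
    ["Dual screen button", "Dual screen mode", "Highlight second screen",
     "Highlight first screen", "Highlight second screen"],
]

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    arcs = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            arcs[(a, b)] += 1
    return arcs

dfg = directly_follows(traces)
for (a, b), freq in dfg.most_common():
    print(f"{a} -> {b}: {freq}")
```

In this toy data, the arcs between ‘Highlight second screen’ and ‘Highlight first screen’ run in both directions, which is how a switching loop between the two screens would show up in the counts.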
(Process model discovered for the ‘Low Knowledge’ group performing the Dual Screen task. Compared to the ‘High Knowledge’ group, this model is more complex, showing more variability among the group of users.)
In the model above, it is clearly visible that the people in the ‘Low Knowledge’ group were even further away from the optimal solution. One person even had to give up.
The participants in this group exhibit more varied behavior, which can be seen from the lower frequency numbers on the arcs.
Furthermore, they visit many menus that are irrelevant for the dual screen task.
Interestingly, the participants in this group return to the main menu more frequently than the participants in the ‘High Knowledge’ group.
Another important insight is that none of the participants in the ‘Low Knowledge’ group reached the dual screen mode via the TV menu (there is no arc between ‘TV menu’ and ‘Dual screen mode’); they seem to get stuck in this menu.
I find this a nice example of how visualizing actual user behavior can reveal usage patterns that provide both qualitative and quantitative feedback.
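A missing arc like this one can also be checked programmatically by comparing the directly-follows arcs of the two groups. The sketch below uses invented traces for both groups. Only the activity names ‘TV menu’ and ‘Dual screen mode’ come from the models above, and the other names are assumptions for illustration.

```python
from collections import Counter

def directly_follows(traces):
    """Count each pair of consecutive activities across all traces."""
    arcs = Counter()
    for trace in traces:
        arcs.update(zip(trace, trace[1:]))
    return arcs

# Hypothetical traces per group; not the study's data.
high = [
    ["TV menu", "Dual screen mode"],
    ["Dual screen button", "Dual screen mode"],
]
low = [
    ["TV menu", "Main menu", "TV menu", "Main menu"],
    ["Main menu", "Dual screen button", "Dual screen mode"],
]

arc = ("TV menu", "Dual screen mode")
dfg_high = directly_follows(high)
dfg_low = directly_follows(low)

# Arcs that appear in the 'High Knowledge' model but not in the 'Low Knowledge' one.
only_high = sorted(set(dfg_high) - set(dfg_low))
print(arc in dfg_high, arc in dfg_low)
print(only_high)
```

With a group of this size, a missing arc only says that nobody in the sample took that path. It is a pointer to where the interface may be confusing, not proof that the path cannot be found.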
Not only for Consumers
Usage behavior can also be crucial in a business context. For example, in call centers there is an increasing use of analytics for operational performance management. Agents often have to switch between 4-6 different applications (Siebel, SAP, etc.) while handling a call. Desktop analysis tools can analyze the keystrokes of an agent, and the resulting insight can be used to build an abstraction layer on top of the actual applications that matches the typical call flow.
Do you see other examples where understanding user behavior is important? Let us know in the comments.
[1] See details in P.P.H.J. Hofstra. Analysing the Effect of Consumer Knowledge on Product Usability Using Process Mining Techniques. Master’s thesis, Eindhoven University of Technology, Department of Industrial Design, Eindhoven, The Netherlands, 2009.
[2] The knowledge level of each participant was measured by asking them questions assessing their “familiarity” and “expertise” with televisions and computers. For example, participants had to react to statements such as “Compared to most other people, I know less about televisions/computers” (familiarity) and “I usually talk with friends and colleagues about new developments regarding televisions/computers” (expertise) on a 5-point Likert scale (ranging from total disagreement to total agreement). For more details see Pieter’s thesis.