Conformance checking


Business process conformance checking is a family of process mining techniques to compare a process model with an event log of the same process. It is used to check if the actual execution of a business process, as recorded in the event log, conforms to the model and vice versa.
For instance, there may be a process model indicating that purchase orders of more than one million euros require two checks. Analysis of the event log will show whether this rule is followed or not.
Another example is the checking of the so-called “four-eyes” principle stating that particular activities should not be executed by one and the same person. By scanning the event log using a model specifying these requirements, one can discover potential cases of fraud. Hence, conformance checking may be used to detect, locate and explain deviations, and to measure the severity of these deviations.
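The four-eyes check can be illustrated with a minimal sketch. It assumes the event log is a list of records with hypothetical `case`, `activity` and `resource` fields, and that the rule concerns two named activities; the activity names and resources below are invented for illustration.

```python
from collections import defaultdict

def four_eyes_violations(events, activity_a, activity_b):
    """Return the ids of cases in which one resource performed both activities."""
    # performers[case][activity] = set of resources that executed the activity
    performers = defaultdict(lambda: defaultdict(set))
    for e in events:
        performers[e["case"]][e["activity"]].add(e["resource"])
    return sorted(
        case
        for case, by_activity in performers.items()
        if by_activity.get(activity_a, set()) & by_activity.get(activity_b, set())
    )

# Hypothetical log: in case c2 the same person creates and approves the order.
log = [
    {"case": "c1", "activity": "create order", "resource": "alice"},
    {"case": "c1", "activity": "approve order", "resource": "bob"},
    {"case": "c2", "activity": "create order", "resource": "carol"},
    {"case": "c2", "activity": "approve order", "resource": "carol"},
]
violations = four_eyes_violations(log, "create order", "approve order")
print(violations)
```

Cases returned by such a check are only candidates for closer inspection; whether they constitute fraud requires further analysis.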

Overview

Conformance checking techniques take as input a process model and an event log, and return a set of differences between the behavior captured in the process model and the behavior captured in the event log. These differences may be represented visually or textually, for example as lists of natural language statements. Some techniques also produce normalized measures indicating to what extent the process model and the event log match each other.
The interpretation of non-conformance depends on the purpose of the model:
  • If the model is intended to be descriptive, discrepancies between model and log indicate that the model needs to be improved to capture reality better.
  • If the model is normative, then such discrepancies may be interpreted in two ways: they may expose undesirable deviations, or they may reveal desirable deviations.

Techniques

The purpose of conformance checking is to identify two types of discrepancies:
  • Unfitting log behavior: behavior observed in the log that is not allowed by the model.
  • Additional model behavior: behavior allowed in the model but never observed in the log.
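Under the simplifying assumption that the model allows only a finite set of complete traces (no loops), the two types of discrepancy reduce to set differences. The traces below are hypothetical.

```python
# Assumption: the model's behavior is given as a finite set of complete traces.
model_traces = {("A", "B", "C"), ("A", "C", "B")}
log_traces = [("A", "B", "C"), ("A", "B"), ("A", "B", "C")]

observed = set(log_traces)

# Unfitting log behavior: observed in the log, not allowed by the model.
unfitting = observed - model_traces

# Additional model behavior: allowed by the model, never observed in the log.
additional = model_traces - observed

print(unfitting, additional)
```

Real process models usually contain loops and concurrency, so their set of traces is infinite or too large to enumerate, which is why the techniques described here are needed.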
There are broadly three families of techniques for detecting unfitting log behavior: replay, trace alignment and behavioral alignment.
In replay techniques, each trace is replayed against the process model one event at a time. When a replay error is detected, it is reported and a local correction is made so that the replay can resume. The local correction may be, for example, to skip a task in the process model or to ignore an event in the log.
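The replay procedure can be sketched as follows. The sketch assumes the model is a simple transition system given as a dictionary from state to enabled tasks; the `replay` function, the state names and the two recovery rules are illustrative choices, not a specific published algorithm.

```python
def replay(trace, transitions, start, finals):
    """Replay a trace event by event and return the list of replay errors."""
    state, errors = start, []
    for position, event in enumerate(trace):
        enabled = transitions.get(state, {})
        if event in enabled:
            state = enabled[event]  # the event fits: follow the model
            continue
        # Local correction 1: skip one model task if the event is enabled after it.
        for task, next_state in enabled.items():
            if event in transitions.get(next_state, {}):
                errors.append((position, "skipped model task " + task))
                state = transitions[next_state][event]
                break
        else:
            # Local correction 2: ignore the event in the log.
            errors.append((position, "ignored log event " + event))
    if state not in finals:
        errors.append((len(trace), "model not completed"))
    return errors

# Hypothetical sequential model: A, then B, then C.
model = {"s0": {"A": "s1"}, "s1": {"B": "s2"}, "s2": {"C": "s3"}}
errors_missing = replay(["A", "C"], model, "s0", {"s3"})
errors_extra = replay(["A", "X", "B", "C"], model, "s0", {"s3"})
print(errors_missing, errors_extra)
```

Each correction is chosen using only the current state and the current event, which is what makes the recovery local.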
A general limitation of replay methods is that error recovery is performed locally each time an error is encountered. Hence, these methods might not identify the minimum number of errors that can explain the unfitting log behavior. This limitation is addressed by trace alignment techniques. These techniques identify, for each trace in the log, the closest corresponding trace that can be parsed by the model. Trace alignment techniques also compute an alignment showing the points of divergence between these two traces. The output is a set of pairs of aligned traces. Each pair shows a trace in the log that does not exactly match a trace in the model, together with the corresponding closest trace produced by the model.
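A minimal sketch of trace alignment follows. It assumes that the model's behavior is available as a finite list of traces and that every move on the log only or on the model only costs 1; practical techniques search the model's state space instead of enumerating traces. The `align` function and the `>>` skip symbol are illustrative.

```python
SKIP = ">>"  # marks the side that does not move

def align(log_trace, model_trace):
    """Return (cost, moves) of an optimal alignment with unit-cost skips."""
    n, m = len(log_trace), len(model_trace)
    # cost[i][j]: minimal cost of aligning log_trace[i:] with model_trace[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n and j < m and log_trace[i] == model_trace[j]:
                options.append(cost[i + 1][j + 1])  # synchronous move
            if i < n:
                options.append(1 + cost[i + 1][j])  # move on log only
            if j < m:
                options.append(1 + cost[i][j + 1])  # move on model only
            cost[i][j] = min(options)
    # Walk forward through the table to recover one optimal alignment.
    moves, i, j = [], 0, 0
    while i < n or j < m:
        if (i < n and j < m and log_trace[i] == model_trace[j]
                and cost[i][j] == cost[i + 1][j + 1]):
            moves.append((log_trace[i], model_trace[j]))
            i, j = i + 1, j + 1
        elif i < n and cost[i][j] == 1 + cost[i + 1][j]:
            moves.append((log_trace[i], SKIP))
            i += 1
        else:
            moves.append((SKIP, model_trace[j]))
            j += 1
    return cost[0][0], moves

# Hypothetical model traces and one unfitting log trace.
model_traces = [("A", "B", "C", "D"), ("A", "E", "D")]
log_trace = ("A", "B", "D")
best_cost, best_moves = min(
    (align(log_trace, t) for t in model_traces), key=lambda result: result[0]
)
print(best_cost, best_moves)
```

In this example the closest model trace is A, B, C, D, and the alignment shows a single point of divergence: task C occurs in the model but is missing from the log trace.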
Trace alignment techniques do not explicitly handle concurrent tasks or cyclic behavior. If, for example, four tasks can occur only in a fixed order in the process model but can occur concurrently in the log, this difference cannot be directly detected by trace alignment, because it cannot be observed at the level of individual traces. This limitation is addressed by behavioral alignment techniques. These techniques compute an alignment between the state space of the process captured by the model and the state space of the process recorded in the log, and detect states where a task or behavioral relation occurs in the model but not in the log. These techniques can hence detect both unfitting behavior and additional behavior.
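The idea of comparing behavioral relations rather than individual traces can be illustrated with a deliberately simplified sketch. It does not align state spaces; instead it assumes a footprint-style heuristic in which two tasks are treated as concurrent when each is observed directly after the other, and it assumes the model is given as a finite set of traces. All names and traces are hypothetical.

```python
def directly_follows(traces):
    """Pairs (a, b) such that b occurs immediately after a in some trace."""
    return {(a, b) for t in traces for a, b in zip(t, t[1:])}

def concurrent_pairs(traces):
    """Heuristic: a and b are concurrent if both orders are observed."""
    df = directly_follows(traces)
    return {frozenset((a, b)) for a, b in df if a != b and (b, a) in df}

# The model fixes the order of B and C; the log shows them in both orders.
model_traces = [("A", "B", "C", "D")]
log_traces = [("A", "B", "C", "D"), ("A", "C", "B", "D")]

# Relations present on one side but not on the other.
only_in_log = concurrent_pairs(log_traces) - concurrent_pairs(model_traces)
only_in_model = concurrent_pairs(model_traces) - concurrent_pairs(log_traces)
print(only_in_log, only_in_model)
```

The reported difference is a relation between tasks (B and C are concurrent in the log but sequential in the model), not a property of any single trace.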
Other methods to identify additional behavior are based on negative events. These methods start by enhancing all or some of the traces in the log with artificial events. A negative event is inserted after a given prefix of a trace if this event is never observed preceded by that prefix anywhere in the log.
For example, if event C is never observed after prefix AB, then C can be inserted as a negative event after AB. Thereafter, the log enhanced with negative events is replayed against the process model. If the process model can replay the negative events, it means that there is behavior captured in the process model that is not captured in the log.
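The negative-event approach can be sketched as follows, assuming that traces are tuples of event names, that candidate negative events are drawn from the events seen in the log, and that the model is given as a finite set of traces. Only prefixes that are continued by at least one trace are considered. The function names are illustrative.

```python
def negative_events(log_traces):
    """Map each observed prefix to the events never seen directly after it."""
    alphabet = {event for trace in log_traces for event in trace}
    observed_next = {}
    for trace in log_traces:
        for i in range(len(trace)):
            observed_next.setdefault(trace[:i], set()).add(trace[i])
    return {prefix: alphabet - nxt for prefix, nxt in observed_next.items()}

def additional_behavior(log_traces, model_traces):
    """Negative events that the model can nevertheless replay."""
    model_prefixes = {
        trace[:i] for trace in model_traces for i in range(1, len(trace) + 1)
    }
    return sorted(
        (prefix, event)
        for prefix, negatives in negative_events(log_traces).items()
        for event in negatives
        if prefix + (event,) in model_prefixes
    )

# Hypothetical log in which C is never observed after the prefix A, B.
log_traces = [("A", "B", "D"), ("A", "C", "D")]
model_traces = [("A", "B", "D"), ("A", "C", "D"), ("A", "B", "C")]
extra = additional_behavior(log_traces, model_traces)
print(extra)
```

Here the model accepts C after the prefix A, B although the log never shows it, so this pair is reported as behavior captured in the model but not in the log.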