Avoiding over-fitting in ILP-based process discovery

Conference Contribution

van Zelst, S.J., van Dongen, B.F. & van der Aalst, W.M.P. (2015). Avoiding over-fitting in ILP-based process discovery. In J. Recker, H.R. Motahari-Nezhad & M. Weidlich (Eds.), Business Process Management (pp. 163-171). (Lecture Notes in Computer Science, No. 9253). Dordrecht: Springer. Cited 9 times in Scopus.


The aim of process discovery is to discover a process model from business process execution data recorded in an event log. One of several existing process discovery techniques is the ILP-based process discovery algorithm. The algorithm is able to unravel complex process structures and provides formal guarantees w.r.t. the discovered model, e.g., it guarantees that a discovered model describes all behavior present in the event log. Unfortunately, the algorithm is unable to cope with exceptional behavior present in event logs. As a result, the application of ILP-based process discovery techniques in everyday process discovery practice is limited. This paper addresses this problem by proposing a filtering technique tailored towards ILP-based process discovery. The technique helps to produce process models that over-fit the event log less, are more understandable, and capture the dominant behavior present in the event log more adequately. The technique is implemented in the ProM framework.
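
The abstract does not describe the filter itself, so the following is only an illustrative sketch of what a frequency-based filter on an event log could look like, not the technique proposed in the paper or its ProM implementation. It assumes that a log is a list of traces (lists of activity labels) and that a trace prefix counts as exceptional when it occurs much less often than the most frequent continuation of the same parent prefix. The function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def prefix_counts(log):
    """Count how often each prefix (including the empty one) occurs in the log."""
    counts = Counter()
    for trace in log:
        for i in range(len(trace) + 1):
            counts[tuple(trace[:i])] += 1
    return counts


def filter_prefixes(log, threshold):
    """Return a prefix-closed set of prefixes that survive the filter.

    Assumption (not from the paper): a prefix is dropped, together with all
    of its extensions, when its count is below `threshold` times the count
    of the most frequent prefix sharing the same parent.
    """
    counts = prefix_counts(log)

    # For every parent prefix, record the count of its most frequent child.
    best_child = defaultdict(int)
    for prefix, n in counts.items():
        if prefix:
            parent = prefix[:-1]
            best_child[parent] = max(best_child[parent], n)

    kept = {()}
    for trace in log:
        for i in range(1, len(trace) + 1):
            prefix = tuple(trace[:i])
            if counts[prefix] < threshold * best_child[prefix[:-1]]:
                break  # cut off this branch and everything after it
            kept.add(prefix)
    return kept


# Nine traces follow a-b-c; one exceptional trace follows a-c-b.
example_log = [["a", "b", "c"]] * 9 + [["a", "c", "b"]]
dominant = filter_prefixes(example_log, threshold=0.25)
```

With `threshold=0.0` nothing is removed, so a discovery step working on the surviving prefixes would still have to account for every trace in the log. Raising the threshold trades that guarantee for a model built from the dominant behavior only.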

Keywords: Process mining, Process discovery, Integer linear programming, Filtering