Chapter 6
Advanced Process
Discovery Techniques
prof.dr.ir. Wil van der Aalst
www.processmining.org
Overview
Chapter 1: Introduction

Part I: Preliminaries
  Chapter 2: Process Modeling and Analysis
  Chapter 3: Data Mining

Part II: From Event Logs to Process Models
  Chapter 4: Getting the Data
  Chapter 5: Process Discovery: An Introduction
  Chapter 6: Advanced Process Discovery Techniques

Part III: Beyond Process Discovery
  Chapter 7: Conformance Checking
  Chapter 8: Mining Additional Perspectives
  Chapter 9: Operational Support

Part IV: Putting Process Mining to Work
  Chapter 10: Tool Support
  Chapter 11: Analyzing “Lasagna Processes”
  Chapter 12: Analyzing “Spaghetti Processes”

Part V: Reflection
  Chapter 13: Cartography and Navigation
  Chapter 14: Epilogue
                                                                          PAGE 1
Process discovery

                              supports/
      “world”    business
                               controls
                processes                      software
   people   machines                            system
        components
           organizations                              records
                                                   events, e.g.,
                                                    messages,
                                   specifies       transactions,
    models
                                  configures            etc.
   analyzes
                                 implements
                                   analyzes


                            discovery
        (process)                                 event
                            conformance
          model                                    logs
                            enhancement
                                                                   PAGE 2
Challenge

 “able to replay event log”                 “Occam’s razor”

          fitness                             simplicity

                               process
                              discovery



generalization                                precision
 “not overfitting the log”                “not underfitting the log”



                                                              PAGE 3
Observing a stable process infinitely long

       frequent                  all behavior
       behavior    trace in   (including noise)
                  event log




                                                  PAGE 4
Target model


               target model




                              PAGE 5
Non-fitting model


                    non-fitting model




                                        PAGE 6
Overfitting model


                    overfitting model




                                        PAGE 7
Underfitting model


               underfitting model




                                    PAGE 8
Characteristics of process discovery
 algorithms
• Representational bias
   −   Inability to represent concurrency
   −   Inability to deal with (arbitrary) loops
   −   Inability to represent silent actions
   −   Inability to represent duplicate actions
   −   Inability to model OR-splits/joins
   −   Inability to represent non-free-choice behavior
   −   Inability to represent hierarchy
• Ability to deal with noise
• Completeness notion assumed
• Approach used (direct algorithmic approaches, two-phase
  approaches, computational intelligence approaches,
  partial approaches, etc.)                              PAGE 9
Examples
• Algorithmic techniques
  • Alpha miner
  • Alpha+, Alpha++, Alpha#
  • FSM miner
  • Fuzzy miner
  • Heuristic miner
  • Multi-phase miner
• Genetic process mining
  • Single/duplicate tasks
  • Distributed GM
• Region-based process mining
  • State-based regions
  • Language-based regions
• Classical approaches not dealing with concurrency
  • Inductive inference (Mark Gold, Dana Angluin et al.)
  • Sequence mining
                                                           PAGE 10
Heuristic mining

• To deal with noise and incompleteness.
• To have a better representational bias than the α
  algorithm (AND/XOR/OR/skip).
• Uses C-nets.


                            b
                          check
                          policy

               a            c                 e
            register       check             close
             claim        damage             case

                            d
                                   consult
                                   expert
                                                      PAGE 11
Example log; problem α algorithm




                 p5

                 b



        a   p1   d      p3   e

start                              end

            p2    c     p4

                                         PAGE 12
Taking into account frequencies




                                  PAGE 13
Dependency measure
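
The formula on this slide did not survive extraction. As a hedged sketch, the dependency measure commonly used in heuristic mining compares how often x is directly followed by y with how often y is directly followed by x: (|x>y| - |y>x|) / (|x>y| + |y>x| + 1), with |x>x| / (|x>x| + 1) for self-loops. The event log `LOG` below is an assumption: it is a hypothetical log chosen so that the counts agree with the numbers printed on the threshold slides, not a log recovered from the deck.

```python
from collections import Counter

# Hypothetical event log (trace -> frequency). Assumption: chosen to reproduce
# the arc labels on the threshold slides, e.g. 11(0.92), 13(0.93), 5(0.83), 4(0.80).
LOG = {
    ("a", "e"): 5,
    ("a", "b", "c", "e"): 10,
    ("a", "c", "b", "e"): 10,
    ("a", "b", "e"): 1,
    ("a", "c", "e"): 1,
    ("a", "d", "e"): 10,
    ("a", "d", "d", "e"): 2,
    ("a", "d", "d", "d", "e"): 1,
}

def direct_succession(log):
    """Count how often x is directly followed by y, summed over all traces."""
    counts = Counter()
    for trace, freq in log.items():
        for x, y in zip(trace, trace[1:]):
            counts[(x, y)] += freq
    return counts

def dependency(counts, x, y):
    """Dependency measure; values close to 1 suggest that x causes y."""
    xy, yx = counts[(x, y)], counts[(y, x)]
    if x == y:
        return xy / (xy + 1)  # self-loop case
    return (xy - yx) / (xy + yx + 1)

counts = direct_succession(LOG)
for x, y in [("a", "b"), ("a", "d"), ("a", "e"), ("d", "d"), ("b", "c")]:
    print(f"{x}->{y}: {counts[(x, y)]} ({dependency(counts, x, y):.2f})")
```

The "+ 1" in the denominator keeps the value below 1 and penalizes arcs that were observed only a few times, which is what makes the measure usable as a noise filter. Note that b and c follow each other equally often, so their dependency is 0: they look concurrent rather than causally related.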




                     PAGE 14
Example




          PAGE 15
Lower threshold (2 direct successions and
a dependency of at least 0.7)
       5(0.83)

                      b

           11(0.92)       11(0.92)

  a                   c                    e
         11(0.92)            11(0.92)


      13(0.93)                  13(0.93)
                      d

          4(0.80)




                                               PAGE 16
Higher threshold (5 direct successions
and a dependency of at least 0.9)

                  b
    11(0.92)             11(0.92)



a                 c                 e
       11(0.92)       11(0.92)


    13(0.93)             13(0.93)
                  d
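
The effect of the two threshold settings can be sketched as a simple filter. The arc values below are read from the figures on the two threshold slides, and the assignment of each label to a specific arc is inferred from the figure layout. The arcs b->c and c->b (10 successions each, dependency 0) are an added assumption, included only to show an arc that both settings reject.

```python
# Arc -> (number of direct successions, dependency measure).
ARCS = {
    ("a", "b"): (11, 0.92), ("a", "c"): (11, 0.92), ("a", "d"): (13, 0.93),
    ("a", "e"): (5, 0.83), ("b", "e"): (11, 0.92), ("c", "e"): (11, 0.92),
    ("d", "e"): (13, 0.93), ("d", "d"): (4, 0.80),
    ("b", "c"): (10, 0.0), ("c", "b"): (10, 0.0),  # assumed, see lead-in
}

def dependency_graph(arcs, min_count, min_dependency):
    """Keep only the arcs that meet both thresholds."""
    return {
        arc for arc, (count, dep) in arcs.items()
        if count >= min_count and dep >= min_dependency
    }

lower = dependency_graph(ARCS, min_count=2, min_dependency=0.7)
higher = dependency_graph(ARCS, min_count=5, min_dependency=0.9)

# Arcs that survive the lower thresholds but not the higher ones.
print(sorted(lower - higher))
```

Raising the thresholds drops the direct a->e arc (dependency 0.83 is below 0.9) and the self-loop on d (4 successions is below 5), which matches the difference between the two figures.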




                                         PAGE 17
Learning splits and joins

                          5
                                  20    b       20

                                       21
           5             20                          20         5


                    20            20            20   20
      a                                 c                        e
      40                 20            21            20         40
                                                           13
               13
                                  13            13
                    13                                13
                                        d
                              4        17
                                            4
                                  4



                                                                     PAGE 18
Alternative visualization

                     5
                             20   b        20

                                  21
     5              20                          20         5


               20            20            20   20
a                                  c                        e
40                  20            21            20         40
                                                      13
         13
                             13            13
               13                                13
                                  d                                       b
                         4        17
                                       4
                             4
                                                                    AND       AND
                                                                a         c         e




                                                                          d




                                                                                        PAGE 19
Characteristics of heuristic mining

• Can deal with noise and is therefore quite robust.
• Improved representational bias.
• Split and join rules are only considered locally
  (therefore most of the discovered models are not
  sound and require repair actions).




                                                     PAGE 20
Genetic process mining

                    create initial
                     population



   event log                                                  mutation

                                     next generation
                  compute
                   fitness
                                       elitism
  termination
                       tournament                           children

                                                       crossover

    select best                  parents
     individual



                             “dead” individuals



                                                                         PAGE 21
Design decisions

•   Representation of individuals
•   Initialization
•   Fitness function
•   Selection strategy (tournament and elitism)
•   Crossover
•   Mutation


                                                                                                     PAGE 22
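
The design decisions listed above can be illustrated with a generic genetic-algorithm loop. This is a deliberately simplified sketch, not the representation used in genetic process mining itself: an individual is assumed to be a plain set of directly-follows arcs, the fitness function (share of replayable steps minus a small size penalty) is hypothetical, and the three-trace log is a toy example.

```python
import random

# Toy log and search space (assumptions for illustration only).
LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
ACTIVITIES = sorted({x for trace in LOG for x in trace})
ALL_ARCS = [(x, y) for x in ACTIVITIES for y in ACTIVITIES if x != y]
STEPS = [(x, y) for trace in LOG for x, y in zip(trace, trace[1:])]

def fitness(arcs):
    """Share of log steps the arc set allows, minus a penalty per arc."""
    replayed = sum(step in arcs for step in STEPS) / len(STEPS)
    return replayed - 0.01 * len(arcs)

def tournament(population, rng, k=3):
    """Pick k random individuals and return the fittest one."""
    return max(rng.sample(population, k), key=fitness)

def crossover(p1, p2, rng):
    """Keep arcs shared by both parents; inherit the others with probability 0.5."""
    return frozenset(
        arc for arc in sorted(p1 | p2)
        if (arc in p1 and arc in p2) or rng.random() < 0.5
    )

def mutate(arcs, rng, rate=0.05):
    """Flip (add or remove) each possible arc with a small probability."""
    flipped = {arc for arc in ALL_ARCS if rng.random() < rate}
    return frozenset(arcs ^ flipped)

def evolve(generations=30, size=20, elite=2, seed=0):
    rng = random.Random(seed)
    population = [
        frozenset(arc for arc in ALL_ARCS if rng.random() < 0.3)
        for _ in range(size)
    ]
    history = []  # best fitness per generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        children = [
            mutate(crossover(tournament(population, rng),
                             tournament(population, rng), rng), rng)
            for _ in range(size - elite)
        ]
        population = population[:elite] + children  # elitism
    return max(population, key=fitness), history

best, history = evolve()
print(sorted(best), round(history[-1], 2))
```

Because the elite individuals are copied unchanged into the next generation, the best fitness per generation can never decrease. Individuals that are neither elite nor selected as parents correspond to the “dead” individuals in the diagram.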
Example: crossover

[Figure: four process models, two parents and the two children produced by
crossover. Each model connects start and end through the activities
a (register request), b (examine thoroughly), c (examine casually),
d (check ticket), e (decide), f (reinitiate request), g (pay compensation)
and h (reject request).]




                                                                                                                                            PAGE 23
Example: mutation



[Figure: a process model before and after mutation, with the same activities
a to h between start and end as on the crossover slide. The annotations read
"remove place" and "added arc".]




                                                                                                                                        PAGE 24
Characteristics of genetic
 process mining




• Requires a lot of computing power.
• Can be distributed easily.
• Can deal with noise, infrequent behavior, duplicate tasks,
  invisible tasks, etc.
• Allows for incremental improvement and combinations
  with other approaches (heuristics post-optimization, etc.).
                                                       PAGE 25
Region-based mining

• Two types of region theory:
   − State-based regions
   − Language-based regions
• All about discovering places (like in the α algorithm)!


              a1                          b1


              a2                          b2

              ...         p(A,B)          ...
              am                          bn


        A={a1,a2, … am}            B={b1,b2, … bn}
                                                      PAGE 26
State-based regions

Two steps:
1. Discover a transition system (different abstractions
   are possible).
2. Convert the transition system into an “equivalent” Petri
   net.




                                                     PAGE 27
Step 1: learning a transition system

                                 current state


       trace:   abcdcdcde faghhhi
                      past                       future

                             past and future

•   past, future, past+future
•   sequence, multiset, set abstraction
•   limited horizon to abstract further
•   filtering e.g. based on transaction type, names, etc.
•   labels based on activity name or other features
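
The abstraction choices above can be sketched as follows, restricted to states built from the past of a case. The three traces are an assumption: they are read off the example transition systems on the following slides, since the log itself is not printed. The function names and the `kind` and `horizon` parameters are hypothetical.

```python
from collections import Counter

# Assumed example log, inferred from the transition systems shown in this deck.
LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]

def abstract(prefix, kind="sequence", horizon=None):
    """Map the events seen so far (the past) to a hashable state."""
    if horizon is not None:
        # Limited horizon: only the last `horizon` events matter.
        prefix = prefix[max(0, len(prefix) - horizon):]
    if kind == "sequence":
        return tuple(prefix)
    if kind == "multiset":
        return tuple(sorted(Counter(prefix).items()))
    if kind == "set":
        return frozenset(prefix)
    raise ValueError(f"unknown abstraction: {kind}")

def transition_system(log, kind="sequence", horizon=None):
    """Return (states, transitions); transitions are (source, activity, target)."""
    states, transitions = set(), set()
    for trace in log:
        for i, activity in enumerate(trace):
            source = abstract(trace[:i], kind, horizon)
            target = abstract(trace[:i + 1], kind, horizon)
            states.update((source, target))
            transitions.add((source, activity, target))
    return states, transitions

for kind, horizon in [("sequence", None), ("multiset", None), ("sequence", 1)]:
    states, transitions = transition_system(LOG, kind, horizon)
    print(kind, horizon, len(states), len(transitions))
```

A coarser abstraction merges more states: the multiset abstraction no longer distinguishes ‹a,b,c› from ‹a,c,b›, and a horizon of 1 keeps only the last event.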
                                                            PAGE 28
Past without abstraction (full sequence)


                    c             d
       ‹a,b›
                        ‹a,b,c›       ‹a,b,c,d›
                b
      a             e             d
 ‹›       ‹a›           ‹a,e›         ‹a,e,d›
                c
                    b             d
       ‹a,c›
                        ‹a,c,b›       ‹a,c,b,d›

                                                PAGE 29
Future without abstraction


             a             b        ‹c,d›
 ‹a,b,c,d›       ‹b,c,d›       c
             a             e              d
  ‹a,e,d›         ‹e,d›            ‹d ›       ‹›
                               b
             a             c
                                    ‹b,d›
 ‹a,c,b,d›       ‹c,b,d›

                                                   PAGE 30
Past with multiset abstraction

           [a,e]
                             d
                                      [a,d,e]
                e       [a,b]
      a             b
 []       [a]
                c        c
                    b             d
           [a,c]        [a,b,c]       [a,b,c,d]

                                                  PAGE 31
Only last event matters for state

                                ‹e›
                    e                      d
        a               b
                                ‹ b›       d
  ‹›         ‹a ›           c          b       ‹d›
                    c                      d

                                ‹c›

                                                     PAGE 32
Step 2: constructing a Petri net using
regions
                                            a = enter
               b                d           b = enter
       a                            e       c = exit
                                            d = exit
                   f            d           e = do not cross
   e                                        f = do not cross
           e

                       f        c
       a

                           R


                       a                c

           e                                      f
                           pR
                       b                d
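
The enter / exit / do-not-cross classification can be checked mechanically. The sketch below assumes the multiset transition system of the example on the next slide, written out by hand with states as strings, and a hypothetical helper `classify`. A set of states is a region only if every activity behaves the same way on all of its transitions.

```python
# Transition system with multiset states, transcribed as an assumption
# from the example slide: (source state, activity, target state).
TS = [
    ("[]", "a", "[a]"),
    ("[a]", "b", "[a,b]"),
    ("[a]", "c", "[a,c]"),
    ("[a]", "e", "[a,e]"),
    ("[a,b]", "c", "[a,b,c]"),
    ("[a,c]", "b", "[a,b,c]"),
    ("[a,e]", "d", "[a,d,e]"),
    ("[a,b,c]", "d", "[a,b,c,d]"),
]

def classify(transitions, region):
    """Return {activity: kind} if `region` is a region, otherwise None."""
    kinds = {}
    for source, activity, target in transitions:
        if source not in region and target in region:
            kind = "enter"
        elif source in region and target not in region:
            kind = "exit"
        else:
            kind = "do not cross"
        # All transitions with the same label must agree.
        if kinds.setdefault(activity, kind) != kind:
            return None
    return kinds

print(classify(TS, {"[a]", "[a,c]"}))  # a enters; b and e exit
print(classify(TS, {"[a]"}))           # None: c exits once but does not cross elsewhere
```

For the region {[a], [a,c]}, the entering activity a becomes the input transition of the corresponding place and the exiting activities b and e become its output transitions, which corresponds to place p1 in the Petri net of the example.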

                                                               PAGE 33
Example

                                                      d
                                        e
                                            [a,e]             [a,d,e]
                               [ a,b]
             a             b
        []       [a]                    c
                       c
                           b                          d
                  [a,c]                     [a,b,c]           [a,b,c,d]




                               b



        a        p1            e              p3          d

start                                                                end

                 p2            c              p4
                                                                           PAGE 34
Language-based regions


                  f                  c1

                          a1                    b1

              e                       c                      d
                                     pR
                          a2                    b2


                          X                     Y

Region R = (X,Y,c) corresponding to place pR: X = {a1,a2,c1} =
transitions producing a token for pR, Y = {b1,b2,c1} = transitions
consuming a token from pR, and c is the initial marking of pR.       PAGE 35
Basic idea: enough tokens should be
present when consuming
                           A place is feasible if it
                           can be added without
       f        c1         disabling any of the
                           traces in the event log.

           a1        b1

   e            c          d
                pR
           a2        b2


           X         Y
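
The feasibility condition can be sketched as a token-counting replay. The function name `feasible`, its parameters, and the three-trace log are assumptions for illustration. A candidate place is given by its producing transitions X, its consuming transitions Y, and its initial marking c.

```python
# Assumed example log; each trace is a sequence of activity names.
LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]

def feasible(log, producers, consumers, initial=0):
    """True if adding the place never blocks an event of any trace in the log."""
    for trace in log:
        tokens = initial
        for activity in trace:
            if activity in consumers:
                if tokens == 0:
                    return False  # the place would disable this event
                tokens -= 1
            if activity in producers:
                tokens += 1
    return True

print(feasible(LOG, producers={"a"}, consumers={"b", "e"}))  # True
print(feasible(LOG, producers={"b"}, consumers={"c"}))       # False
```

The second place is infeasible because the trace ‹a,c,b,d› executes c before b has produced a token. Consumption is checked before production, so a transition that is in both X and Y needs a token to already be present.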



                                               PAGE 36
Example




          PAGE 37
Regions




          PAGE 38
Model

        a        p5            d

                      c
 p1         p2            p3       p4
        b                      e

                 p6




                                        PAGE 39
Characteristics of region-based mining

• Can be used to discover more complex control-flow
  structures.
• Classical approaches need to be adapted
  (overfitting!).
• Representational bias can be parameterized (e.g.,
  free-choice nets, label splitting, etc.).
• Problems dealing with noise.




                                                  PAGE 40
Other approaches, e.g. fuzzy mining




                                      PAGE 41
Evaluating the discovered process



                         Fitness: Is the event log
                         possible according to the
                         model?

Precision: Is the model                        Generalization: Is the model
not underfitting (allow for                    not overfitting (only allow for
too much)?                                     the “accidental” examples)?


                         Structure: Is this the
                         simplest model (Occam's
                         Razor)?



                                                                          PAGE 42
Ad

Recommended

Process Mining - Chapter 5 - Process Discovery
Process Mining - Chapter 5 - Process Discovery
Wil van der Aalst
 
Process Mining - Chapter 4 - Getting the Data
Process Mining - Chapter 4 - Getting the Data
Wil van der Aalst
 
Process Mining - Chapter 11 - Analyzing Lasagna Processes
Process Mining - Chapter 11 - Analyzing Lasagna Processes
Wil van der Aalst
 
Process Mining - Chapter 10 - Tool Support
Process Mining - Chapter 10 - Tool Support
Wil van der Aalst
 
Process Mining - Chapter 3 - Data Mining
Process Mining - Chapter 3 - Data Mining
Wil van der Aalst
 
Process Mining - Chapter 1 - Introduction
Process Mining - Chapter 1 - Introduction
Wil van der Aalst
 
Process Mining - Chapter 2 - Process Modeling and Analysis
Process Mining - Chapter 2 - Process Modeling and Analysis
Wil van der Aalst
 
Process Mining - Chapter 7 - Conformance Checking
Process Mining - Chapter 7 - Conformance Checking
Wil van der Aalst
 
Process Mining - Chapter 8 - Mining Additional Perspectives
Process Mining - Chapter 8 - Mining Additional Perspectives
Wil van der Aalst
 
Process Mining 2.0: From Insights to Actions
Process Mining 2.0: From Insights to Actions
Marlon Dumas
 
Process Mining - Chapter 9 - Operational Support
Process Mining - Chapter 9 - Operational Support
Wil van der Aalst
 
Process Mining Introduction
Process Mining Introduction
Vala Ali Rohani
 
Process Mining Intro (Eng)
Process Mining Intro (Eng)
Dafna Levy
 
Introduction to Business Process Monitoring and Process Mining
Introduction to Business Process Monitoring and Process Mining
Marlon Dumas
 
Enterprise Architecture and Open Source
Enterprise Architecture and Open Source
Karim Baïna
 
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
Wil van der Aalst
 
Ppt
Ppt
bullsrockr666
 
Introduction to ETL and Data Integration
Introduction to ETL and Data Integration
CloverDX (formerly known as CloverETL)
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 Updates
Taro L. Saito
 
Business Process Modeling
Business Process Modeling
Sandy Kemsley
 
Introduction to Data Warehousing
Introduction to Data Warehousing
Jason S
 
DataFusion-and-Arrow_Supercharge-Your-Data-Analytical-Tool-with-a-Rusty-Query...
DataFusion-and-Arrow_Supercharge-Your-Data-Analytical-Tool-with-a-Rusty-Query...
aiuy
 
Change Data Feed in Delta
Change Data Feed in Delta
Databricks
 
OLAP technology
OLAP technology
Dr. Mahendra Srivastava
 
Better decision making with proper business intelligence
Better decision making with proper business intelligence
madhavlankapati
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format Ecosystem
Databricks
 
Accelerating Data Ingestion with Databricks Autoloader
Accelerating Data Ingestion with Databricks Autoloader
Databricks
 
Success Factors for Process Mining Technology
Success Factors for Process Mining Technology
Celonis
 
Process Mining - a new governance approach
Process Mining - a new governance approach
Martin Pscheidl
 
Process Mining - Chapter 12 - Analyzing Spaghetti Processes
Process Mining - Chapter 12 - Analyzing Spaghetti Processes
Wil van der Aalst
 

More Related Content

What's hot (20)

Process Mining - Chapter 8 - Mining Additional Perspectives
Process Mining - Chapter 8 - Mining Additional Perspectives
Wil van der Aalst
 
Process Mining 2.0: From Insights to Actions
Process Mining 2.0: From Insights to Actions
Marlon Dumas
 
Process Mining - Chapter 9 - Operational Support
Process Mining - Chapter 9 - Operational Support
Wil van der Aalst
 
Process Mining Introduction
Process Mining Introduction
Vala Ali Rohani
 
Process Mining Intro (Eng)
Process Mining Intro (Eng)
Dafna Levy
 
Introduction to Business Process Monitoring and Process Mining
Introduction to Business Process Monitoring and Process Mining
Marlon Dumas
 
Enterprise Architecture and Open Source
Enterprise Architecture and Open Source
Karim Baïna
 
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
Wil van der Aalst
 
Ppt
Ppt
bullsrockr666
 
Introduction to ETL and Data Integration
Introduction to ETL and Data Integration
CloverDX (formerly known as CloverETL)
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 Updates
Taro L. Saito
 
Business Process Modeling
Business Process Modeling
Sandy Kemsley
 
Introduction to Data Warehousing
Introduction to Data Warehousing
Jason S
 
DataFusion-and-Arrow_Supercharge-Your-Data-Analytical-Tool-with-a-Rusty-Query...
DataFusion-and-Arrow_Supercharge-Your-Data-Analytical-Tool-with-a-Rusty-Query...
aiuy
 
Change Data Feed in Delta
Change Data Feed in Delta
Databricks
 
OLAP technology
OLAP technology
Dr. Mahendra Srivastava
 
Better decision making with proper business intelligence
Better decision making with proper business intelligence
madhavlankapati
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format Ecosystem
Databricks
 
Accelerating Data Ingestion with Databricks Autoloader
Accelerating Data Ingestion with Databricks Autoloader
Databricks
 
Success Factors for Process Mining Technology
Success Factors for Process Mining Technology
Celonis
 
Process Mining - Chapter 8 - Mining Additional Perspectives
Process Mining - Chapter 8 - Mining Additional Perspectives
Wil van der Aalst
 
Process Mining 2.0: From Insights to Actions
Process Mining 2.0: From Insights to Actions
Marlon Dumas
 
Process Mining - Chapter 9 - Operational Support
Process Mining - Chapter 9 - Operational Support
Wil van der Aalst
 
Process Mining Introduction
Process Mining Introduction
Vala Ali Rohani
 
Process Mining Intro (Eng)
Process Mining Intro (Eng)
Dafna Levy
 
Introduction to Business Process Monitoring and Process Mining
Introduction to Business Process Monitoring and Process Mining
Marlon Dumas
 
Enterprise Architecture and Open Source
Enterprise Architecture and Open Source
Karim Baïna
 
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
On the Role of Fitness, Precision, Generalization and Simplicity in Process D...
Wil van der Aalst
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 Updates
Taro L. Saito
 
Business Process Modeling
Business Process Modeling
Sandy Kemsley
 
Introduction to Data Warehousing
Introduction to Data Warehousing
Jason S
 
DataFusion-and-Arrow_Supercharge-Your-Data-Analytical-Tool-with-a-Rusty-Query...
DataFusion-and-Arrow_Supercharge-Your-Data-Analytical-Tool-with-a-Rusty-Query...
aiuy
 
Change Data Feed in Delta
Change Data Feed in Delta
Databricks
 
Better decision making with proper business intelligence
Better decision making with proper business intelligence
madhavlankapati
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format Ecosystem
Databricks
 
Accelerating Data Ingestion with Databricks Autoloader
Accelerating Data Ingestion with Databricks Autoloader
Databricks
 
Success Factors for Process Mining Technology
Success Factors for Process Mining Technology
Celonis
 





Process Mining - Chapter 6 - Advanced Process Discovery_techniques

  • 1. Chapter 6 Advanced Process Discovery Techniques prof.dr.ir. Wil van der Aalst www.processmining.org
  • 2. Overview
      • Chapter 1: Introduction
      • Part I: Preliminaries
          − Chapter 2: Process Modeling and Analysis
          − Chapter 3: Data Mining
      • Part II: From Event Logs to Process Models
          − Chapter 4: Getting the Data
          − Chapter 5: Process Discovery: An Introduction
          − Chapter 6: Advanced Process Discovery Techniques
      • Part III: Beyond Process Discovery
          − Chapter 7: Conformance Checking
          − Chapter 8: Mining Additional Perspectives
          − Chapter 9: Operational Support
      • Part IV: Putting Process Mining to Work
          − Chapter 10: Tool Support
          − Chapter 11: Analyzing “Lasagna Processes”
          − Chapter 12: Analyzing “Spaghetti Processes”
      • Part V: Reflection
          − Chapter 13: Cartography and Navigation
          − Chapter 14: Epilogue
  • 3. Process discovery
      [Figure: the “world” (business processes, people, machines, components, organizations) is supported/controlled by a software system, which records events (e.g., messages, transactions, etc.) in event logs. The (process) model models and analyzes the world and specifies, configures, implements, and analyzes the software system. Model and event logs are linked by discovery, conformance, and enhancement.]
  • 4. Challenge
      Process discovery has to balance four quality criteria:
      • Fitness: “able to replay event log”
      • Simplicity: “Occam’s razor”
      • Generalization: “not overfitting the log”
      • Precision: “not underfitting the log”
  • 5. Observing a stable process infinitely long
      [Figure labels: frequent behavior; all behavior (including noise); trace in event log.]
  • 6. Target model
  • 7. Non-fitting model
  • 8. Overfitting model
  • 9. Underfitting model
  • 10. Characteristics of process discovery algorithms
      • Representational bias
          − Inability to represent concurrency
          − Inability to deal with (arbitrary) loops
          − Inability to represent silent actions
          − Inability to represent duplicate actions
          − Inability to model OR-splits/joins
          − Inability to represent non-free-choice behavior
          − Inability to represent hierarchy
      • Ability to deal with noise
      • Completeness notion assumed
      • Approach used (direct algorithmic approaches, two-phase approaches, computational intelligence approaches, partial approaches, etc.)
  • 11. Examples
      • Algorithmic techniques
          − Alpha miner
          − Alpha+, Alpha++, Alpha#
          − FSM miner
          − Fuzzy miner
          − Heuristic miner
          − Multi-phase miner
      • Genetic process mining
          − Single/duplicate tasks
          − Distributed GM
      • Region-based process mining
          − State-based regions
          − Language-based regions
      • Classical approaches not dealing with concurrency
          − Inductive inference (Mark Gold, Dana Angluin et al.)
          − Sequence mining
  • 12. Heuristic mining
      • To deal with noise and incompleteness.
      • To have a better representational bias than the α algorithm (AND/XOR/OR/skip).
      • Uses C-nets.
      [Figure: C-net with activities a (register claim), b (check policy), c (check damage), d (consult expert), e (close case).]
  • 13. Example log; problem α algorithm
      [Figure: Petri net with transitions a, b, c, d, e and places start, p1, p2, p3, p4, p5, end.]
  • 14. Taking into account frequencies
  • 16. Example
  • 17. Lower threshold (at least 2 direct successions and a dependency of at least 0.7)
      [Figure: dependency graph over a, b, c, d, e. Arc labels, written as count(dependency): 5(0.83); 11(0.92) four times; 13(0.93) twice; 4(0.80).]
  • 18. Higher threshold (at least 5 direct successions and a dependency of at least 0.9)
      [Figure: dependency graph over a, b, c, d, e. Arc labels, written as count(dependency): 11(0.92) four times; 13(0.93) twice.]
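
The two thresholds above can be illustrated with a short sketch. The transcript does not spell out the dependency formula, so the code assumes the measure commonly used in heuristic mining, (|a>b| − |b>a|) / (|a>b| + |b>a| + 1), with |a>a| / (|a>a| + 1) for self-loops. The log `LOG` and the function names are hypothetical, chosen only so that the counts reproduce the arc labels quoted on slides 17 and 18.

```python
from collections import Counter

# Hypothetical event log (an assumption): trace variants with multiplicities
# picked so that the direct-succession counts match the slide labels.
LOG = (
    [("a", "e")] * 5
    + [("a", "b", "c", "e")] * 10
    + [("a", "c", "b", "e")] * 10
    + [("a", "b", "e")] * 1
    + [("a", "c", "e")] * 1
    + [("a", "d", "e")] * 10
    + [("a", "d", "d", "e")] * 2
    + [("a", "d", "d", "d", "e")] * 1
)


def direct_succession_counts(log):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Assumed dependency measure; values close to 1 suggest a causes b."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    if a == b:
        return ab / (ab + 1)
    return (ab - ba) / (ab + ba + 1)


def dependency_graph(log, min_count, min_dependency):
    """Keep arcs that meet both the count and the dependency threshold."""
    counts = direct_succession_counts(log)
    graph = {}
    for (a, b), n in counts.items():
        dep = dependency(counts, a, b)
        if n >= min_count and dep >= min_dependency:
            graph[(a, b)] = (n, round(dep, 2))
    return graph


if __name__ == "__main__":
    for arc, label in sorted(dependency_graph(LOG, 2, 0.7).items()):
        print(arc, label)
```

Under these assumptions, b and c follow each other equally often in both directions, so their dependency is 0 and neither arc survives either threshold. That is how the measure separates concurrency from causality.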
  • 19. Learning splits and joins
      [Figure: C-net over a, b, c, d, e annotated with frequencies (values shown include 4, 5, 13, 17, 20, 21, and 40).]
  • 20. Alternative visualization
      [Figure: the same annotated C-net over a, b, c, d, e, next to an alternative drawing in which AND connectors are shown explicitly.]
  • 21. Characteristics of heuristic mining
      • Can deal with noise and is therefore quite robust.
      • Improved representational bias.
      • Split and join rules are only considered locally (therefore most of the discovered models are not sound and require repair actions).
  • 22. Genetic process mining
      [Figure: flow of the genetic algorithm. Labels: event log; create initial population; compute fitness; termination; select best individual; elitism; tournament; parents; crossover; children; mutation; next generation; “dead” individuals.]
  • 23. Design decisions
      • Representation of individuals
      • Initialization
      • Fitness function
      • Selection strategy (tournament and elitism)
      • Crossover
      • Mutation
      [Figure: the same genetic algorithm flow as on the previous slide.]
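
The loop and the design decisions above can be sketched as follows. This is a deliberately simplified illustration, not the genetic miner itself: individuals are assumed to be plain sets of directed arcs between activities, and the fitness function (share of observed direct successions covered, minus a penalty per unobserved arc) is a stand-in. All names and parameter values are hypothetical.

```python
import random


def fitness(individual, log_pairs, penalty=0.1):
    """Stand-in fitness: coverage of observed successions minus extra arcs."""
    covered = len(individual & log_pairs) / len(log_pairs)
    extra = len(individual - log_pairs)
    return covered - penalty * extra


def tournament(population, scores, rng, k=3):
    """Pick k random individuals and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(p1, p2, all_arcs, rng):
    """Each candidate arc is inherited from one randomly chosen parent."""
    return frozenset(
        arc for arc in all_arcs if arc in (p1 if rng.random() < 0.5 else p2)
    )


def mutate(individual, all_arcs, rng, rate=0.05):
    """Add or remove each candidate arc with a small probability."""
    flipped = {arc for arc in all_arcs if rng.random() < rate}
    return frozenset(individual ^ flipped)


def evolve(log, generations=30, size=20, elite=2, seed=0):
    rng = random.Random(seed)
    activities = sorted({a for trace in log for a in trace})
    all_arcs = [(a, b) for a in activities for b in activities]
    log_pairs = frozenset((a, b) for t in log for a, b in zip(t, t[1:]))

    # Initialization: random arc sets.
    population = [
        frozenset(arc for arc in all_arcs if rng.random() < 0.5)
        for _ in range(size)
    ]
    history = []  # best fitness per generation
    for _ in range(generations):
        scores = [fitness(ind, log_pairs) for ind in population]
        ranked = sorted(range(size), key=lambda i: scores[i], reverse=True)
        history.append(scores[ranked[0]])
        # Elitism: the best individuals survive unchanged.
        next_generation = [population[i] for i in ranked[:elite]]
        # Tournament selection, crossover, and mutation fill the rest.
        while len(next_generation) < size:
            parent1 = tournament(population, scores, rng)
            parent2 = tournament(population, scores, rng)
            child = crossover(parent1, parent2, all_arcs, rng)
            next_generation.append(mutate(child, all_arcs, rng))
        population = next_generation
    best = max(population, key=lambda ind: fitness(ind, log_pairs))
    return best, history


if __name__ == "__main__":
    example_log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
    best, history = evolve(example_log)
    print(sorted(best), history[-1])
```

Because the elite individuals are copied without mutation, the best fitness per generation never decreases in this sketch. A real genetic miner would use a richer representation and a replay-based fitness function.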
  • 24. Example: crossover
      [Figure: two parent models and the two children produced by crossover. Each model runs from start to end over the activities a (register request), b (examine thoroughly), c (examine casually), d (check ticket), e (decide), f (reinitiate request), g (pay compensation), h (reject request).]
  • 25. Example: mutation
      [Figure: the same model before and after mutation, with the annotations “remove place” and “added arc”.]
  • 26. Characteristics of genetic process mining
      • Requires a lot of computing power.
      • Can be distributed easily.
      • Can deal with noise, infrequent behavior, duplicate tasks, invisible tasks, etc.
      • Allows for incremental improvement and combinations with other approaches (heuristics, post-optimization, etc.).
  • 27. Region-based mining
      • Two types of region theory:
          − State-based regions
          − Language-based regions
      • All about discovering places (like in the α algorithm)!
      [Figure: place p(A,B) with input transitions A = {a1, a2, …, am} and output transitions B = {b1, b2, …, bn}.]
  • 28. State-based regions
      Two steps:
      1. Discover a transition system (different abstractions are possible).
      2. Convert the transition system into an “equivalent” Petri net.
  • 29. Step 1: learning a transition system
      [Figure: the trace “abcdcdcde faghhhi” with the current state marked, separating the past from the future.]
      • State based on past, future, or past + future
      • Sequence, multiset, or set abstraction
      • Limited horizon to abstract further
      • Filtering, e.g. based on transaction type, names, etc.
      • Labels based on activity name or other features
  • 30. Past without abstraction (full sequence)
      [Figure: transition system with states ‹›, ‹a›, ‹a,b›, ‹a,b,c›, ‹a,b,c,d›, ‹a,e›, ‹a,e,d›, ‹a,c›, ‹a,c,b›, ‹a,c,b,d›.]
  • 31. Future without abstraction
      [Figure: transition system with states ‹a,b,c,d›, ‹b,c,d›, ‹c,d›, ‹a,e,d›, ‹e,d›, ‹a,c,b,d›, ‹c,b,d›, ‹b,d›, ‹d›, ‹›.]
  • 32. Past with multiset abstraction
      [Figure: transition system with states [], [a], [a,b], [a,c], [a,e], [a,b,c], [a,d,e], [a,b,c,d].]
  • 33. Only last event matters for state
      [Figure: transition system with states ‹›, ‹a›, ‹b›, ‹c›, ‹d›, ‹e›.]
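
The state abstractions on slides 29 to 33 can be sketched as interchangeable functions that map the prefix of a trace (its past) to a state. The function names are hypothetical, and the three-trace log is an assumption inferred from the states listed on the slides.

```python
from collections import Counter


def past_sequence(prefix):
    """No abstraction: the state is the full sequence seen so far."""
    return tuple(prefix)


def past_multiset(prefix):
    """Multiset abstraction: order is ignored, frequencies are kept."""
    return tuple(sorted(Counter(prefix).items()))


def past_set(prefix):
    """Set abstraction: order and frequencies are ignored."""
    return frozenset(prefix)


def last_k(k):
    """Limited horizon: only the last k events matter for the state."""
    return lambda prefix: tuple(prefix[-k:]) if k else ()


def transition_system(log, state_of):
    """Replay every trace and collect the states and labeled transitions."""
    states, transitions = set(), set()
    for trace in log:
        for i, activity in enumerate(trace):
            source = state_of(trace[:i])
            target = state_of(trace[: i + 1])
            states.update((source, target))
            transitions.add((source, activity, target))
    return states, transitions


if __name__ == "__main__":
    # Assumed log: the three traces consistent with the states on the slides.
    log = [tuple("abcd"), tuple("acbd"), tuple("aed")]
    for name, fn in [("sequence", past_sequence), ("multiset", past_multiset),
                     ("set", past_set), ("last event", last_k(1))]:
        states, transitions = transition_system(log, fn)
        print(name, len(states), len(transitions))
```

For this assumed log the full-sequence abstraction gives 10 states, the multiset abstraction 8, and the last-event abstraction 6, in line with the state lists above: coarser abstractions merge states and therefore generalize more.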
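  • 34. Step 2: constructing a Petri net using regions
      Region R in the transition system:
      • a = enter
      • b = enter
      • c = exit
      • d = exit
      • e = do not cross
      • f = do not cross
      [Figure: region R and the corresponding place pR, with a and b as input transitions and c and d as output transitions.]
  • 35. Example
      [Figure: transition system with states [], [a], [a,b], [a,c], [a,e], [a,b,c], [a,d,e], [a,b,c,d], and the resulting Petri net with transitions a, b, c, d, e and places start, p1, p2, p3, p4, end.]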
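
The enter / exit / do-not-cross classification can be checked mechanically. The sketch below tests whether a set of states is a region, i.e. whether every activity behaves uniformly with respect to it. The transition system `TS` is an assumption that encodes the multiset-abstraction states of the example as short strings, and the names are hypothetical.

```python
# Assumed encoding of the example transition system: "" stands for [],
# "ab" for [a,b], and so on. Each entry is (source, activity, target).
TS = [
    ("", "a", "a"),
    ("a", "b", "ab"),
    ("a", "c", "ac"),
    ("a", "e", "ae"),
    ("ab", "c", "abc"),
    ("ac", "b", "abc"),
    ("ae", "d", "ade"),
    ("abc", "d", "abcd"),
]


def classify(region, transitions):
    """Return {activity: kind} if `region` is a region, otherwise None.

    A set of states is a region when all transitions with the same label
    agree: they all enter it, all exit it, or none of them crosses it.
    """
    kinds = {}
    for source, activity, target in transitions:
        if source not in region and target in region:
            kind = "enter"
        elif source in region and target not in region:
            kind = "exit"
        else:
            kind = "do not cross"
        if kinds.setdefault(activity, kind) != kind:
            return None  # the same activity behaves inconsistently
    return kinds


if __name__ == "__main__":
    print(classify({"a", "ac"}, TS))  # a enters; b and e exit
    print(classify({"a"}, TS))        # not a region
```

In this encoding the region {[a], [a,c]} yields a place with a as input transition and b and e as output transitions: entering activities produce a token and exiting activities consume one.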
  • 36. Language-based regions
      Region R = (X, Y, c) corresponding to place pR:
      • X = {a1, a2, c1} = transitions producing a token for pR,
      • Y = {b1, b2, c1} = transitions consuming a token from pR, and
      • c is the initial marking of pR.
      [Figure: place pR with producing transitions X (a1, a2, c1) and consuming transitions Y (b1, b2, c1); transitions c, d, e, f are not connected to pR.]
  • 37. Basic idea: enough tokens should be present when consuming
      A place is feasible if it can be added without disabling any of the traces in the event log.
      [Figure: the same place pR with transition sets X and Y as on the previous slide.]
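
The feasibility condition can be sketched as a replay over the log: a candidate place (X, Y, c) is rejected as soon as some event would have to consume a token that is not there. The function name and the small log are hypothetical; a transition in both X and Y (such as c1 on the slide) consumes first and then produces.

```python
def is_feasible(X, Y, c, log):
    """Check that the place (X, Y, c) never blocks a trace of the log.

    X: activities producing a token, Y: activities consuming a token,
    c: initial number of tokens in the place.
    """
    for trace in log:
        tokens = c
        for activity in trace:
            if activity in Y:
                tokens -= 1
                if tokens < 0:
                    return False  # the place would disable this trace
            if activity in X:
                tokens += 1
    return True


if __name__ == "__main__":
    # Assumed log for illustration only.
    log = [tuple("abcd"), tuple("acbd"), tuple("aed")]
    print(is_feasible({"a"}, {"b", "e"}, 0, log))  # feasible
    print(is_feasible({"a"}, {"b", "c"}, 0, log))  # blocks <a,b,c,d>
```

Many places pass this check, including places that constrain nothing, so feasibility alone does not determine a model. A miner still has to choose which feasible places to add.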
  • 38. Example
  • 39. Regions
  • 40. Model
      [Figure: Petri net with transitions a, b, c, d, e and places p1, p2, p3, p4, p5, p6.]
  • 41. Characteristics of region-based mining
      • Can be used to discover more complex control-flow structures.
      • Classical approaches need to be adapted (overfitting!).
      • Representational bias can be parameterized (e.g., free-choice nets, label splitting, etc.).
      • Problems dealing with noise.
  • 42. Other approaches, e.g. fuzzy mining
  • 43. Evaluating the discovered process
      • Fitness: Is the event log possible according to the model?
      • Precision: Is the model not underfitting (allow for too much)?
      • Generalization: Is the model not overfitting (only allow for the “accidental” examples)?
      • Structure: Is this the simplest model (Occam's Razor)?