We discussed the fundamentals and uses of process mining in my previous article. It would be helpful to understand how process mining can be practically implemented
The two questions to help you understand how to implement process mining :
- How do you connect to your information systems both on-prem as well as SaaS providers.
2. What is the workflow of the Data Engineer for using the raw data in multiple tables in your information systems to create a file that maps out your process? This file is called Activity File containing Activity Table that in turn contains Case Key, Activity Name and timestamp.
How do you connect to the information systems?
The Data Engineer’s task is to connect, extract, transform and setup the data model. Celonis helps you to extract data from arbitrary source systems to transform and visualize your business processes. Typically, event logs sit in different information system software. The information systems can exist both on premise (Oracle and SAP) and in the cloud (ServiceNow and Salesforce).
- On Premise: Source data exists in your own IT systems/ network and does not allow third party software. An ‘On premise connector’ is used to handle the communication between source systems and Event Collection (module of Celonis). In Celonis, the Event Collection module is where the Data Engineer starts his job. It helps you to connect to process data in your underlying source systems, and gather time-stamped event logs from operational systems across your entire digital footprint.
- On Cloud: The source data exists in the cloud. The Celonis Intelligent Business Cloud (CIBC) uses cloud connectors to talk to cloud service provider via their web APIs.
What is the workflow of the Data Engineer?
Aka how does she use the raw data in multiple tables to create one master file that maps out your business process?
Data Pools is the main structure that clusters the data connection, data jobs and data models so that you can setup your entire workflow
Data Connection gives you access to the multiple on premise and cloud connectors. Finally, the data jobs consists of the setup of all the data extraction and transformation
Here is a visual of what the Activity File looks like
For our example, we are considering the P2P process in SAP. SAP Procure to Pay process is required when we need to purchase materials/services from an external vendor. This process includes all the business tasks starting from a purchase requisition and finishing with payment to the vendor. There are about 30+tables in SAP for the P2P process. The visual below focusses on a few important tables below. In our discussion we will focus on the first step .i.e., “Create Purchase Requisition” of the P2P process.
Here are the 4 tables relevant for our discussion
Some definitions to know:
Purchase Requisition is an internal document in which one department is requesting another department to buy goods.
Purchase order is created after the requisition and is a document that is used to actually purchase those goods from an outside vendor.
PR or PO Item: Line Item on the Purchase Order or Purchase Requisition document and reflects material type, quantity and price
The visual below will help you to understand how the EKKO (Purchase Document Header) and EKPO (Purchase Order Items) are related. While the EKKO table provides information about the Purchase Order Document headers the EKPO table provides information about the Purchase Order Items in the Purchase document
Generic procedure to create a data log of a portion of the business process:
The goal of the exercise below is to extract data for the first step of P2P process called ‘Create Purchase Requisition”. The expectation is that you should be able to apply these steps to the other parts of the P2P process as well
- Specify the objects you want to analyze (stays the same for all activities, here it is the Purchase Order item)
- Depending on the activity, add tables with necessary further information by joining them to 1)
- Extract relevant information (case key, activity name, event time, …)
- Insert results into activity table. Without getting into the details, let’s assume that we have already created an empty Activity table with 3 columns Case key, Activity and Event time.
The whole SQL query below demonstrates how an Activity table is created out of a diverse network of tables in our information system
We will break down the SQL query to understand how we arrived at the Activity Table
a) Identify the SAP tables to be joined.
The EKPO table has all the details at the Purchase Order item level. The EKKO table contains all the headers of the Purchase Orders. So we extract all the purchase order items for the purchase orders extracted from the EKKO table. The MANDT (client) and EBELN (Document #) are the columns common to these 2 tables. This leads us to join EKKO and EKPO table using the query below:
SELECT……
FROM “EKPO”
JOIN “EKKO” ON “EKPO”.”MANDT” = “EKKO”.”MANDT” AND “EKPO”.”EBELN” = “EKKO”.”EBELN”
b) Depending on the activity, add tables with necessary further information by joining them to a)
To the above query we add the EBAN table as well since that has the timestamp of when the purchase requisition was created(timestamp column=BADAT). The BANFN column from the EBAN table gives the purchase requisition number and coincides with the EKPO table, hence is used in the join. The BNFPO gives the purchase item number and coincides with the EKPO table
SELECT……
FROM “EKPO”
JOIN “EKKO” ON “EKPO”.”MANDT” = “EKKO”.”MANDT” AND “EKPO”.”EBELN” = “EKKO”.”EBELN”
JOIN “EBAN”
ON “EKPO”.”MANDT” = “EBAN”.”MANDT” AND
“EKPO”.”BANFN” = “EBAN”.”BANFN”
AND “EKPO”.”BNFPO” = “EBAN”.”BNFPO”
WHERE “EKKO”.”BSTYP” = ‘F’;
The result of this join gives us the Purchase Order (from EKKO), Purchase Order Items (from EKPO) and timestamp details (from EBAN).
c)Extract relevant information (case key, activity name, event time, …)
These were restricted to the document type=’F’ as they relate to the process of purchase requisition items
d) Insert results into activity table.
SELECT “EKPO”.”MANDT” || “EKPO”.”EBELN” || “EKPO”.”EBELP” AS “_CASE_KEY” ,’Create Purchase Requisition Item’ AS “ACTIVITY_EN” ,”EBAN”.”BADAT” AS “EVENTTIME”
Finally the Case Key in the activity table is the concatenation of primary keys of the Client ID, Purchase Order Document # and the Purchase Order Item #
This entire process helps us extract the data related to the process “Create Purchase Requisition Item”
The Activity table is now ready!
Sources
[1]Badakhshan, Bernhart, Geyer-Klingeberg, Nakladal, Schenk, Vogelgesang The Action Engine — Turning Process Insights into Action. http://ceur-ws.org/Vol-2374/paper8.pdf
[2]Davenport and Spanyi (2019).What Process Mining Is, and Why Companies Should Do It. https://hbr.org/2019/04/what-process-mining-is-and-why-companies-should-do-it
[3]www.celonis.com
[4]Online Training available for Analyst and Data Engineer for the Celonis Intelligent Business Cloud at https://www.celonis.com/training/
[5]Irani (2017). SAP Procure to Pay Process. https://erproof.com/mm/free-training/sap-procure-to-pay-process/#:~:text=SAP%20Procure%20to%20Pay%20process%20is%20required%20when%20we%20need,with%20payment%20to%20the%20vendor.
[5]http://www.heeh.nl/en/processmining
[6]Santoso, A., Felderer, M. Specification-driven predictive business process monitoring. Softw Syst Model 19, 1307–1343 (2020). https://doi.org/10.1007/s10270-019-00761-w
[7]https://www.celonis.com/process-mining/process-mining-white-paper#how-process-mining-works