Build Survivable, Hardened Cloud Processes with BPMN
Blog: Process Modeling 2.0
Integrating the cloud into your error processing strategy can be challenging. Fortunately, BPMN provides a powerful approach to managing failures, in the form of exceptions and compensations. Adding resources and services into the ‘Cloud’ adds error points that must be managed. The designer of these processes should strive to make their processes ‘survivable’ and hardened. The exception shapes and compensations allow continuous operation throughout intermittent error conditions such as those encounter in integrating cloud services and resources. Without a hardened process your participants might need to use a manual process for the exception flow. This can be an outcome of overfocus on “happy day” paths. Manual processes promote staff improvisations and undocumented “work arounds” and obviously should be avoided.
Figure 1 presents an example of BPMN’s modeling of survivable processes in a survey pattern. If an error occurs in the email-reading process, it can be posted to a log and systems administrators can respond. The survey can continue. Finally, the process can loop back or complete.
Figure 1: BPMN for an exception in the survey process.
The aim of BPMN is to support “long-running” transactions so it is especially suited to the cloud scenario. For instance, a process-oriented car rental agency might create one process for each vehicle, each using a cloud-based machine health service. In this case, the process could run for years, with thousands of calls to the cloud-based service.
Three Error Types
There are three types of exceptions that may affect process execution:
- Technical exceptions—the process server has failed
- Transient exceptions—a needed resource (for instance a cloud service) for the process is unavailable
- Business exceptions—the condition of the process is in error. Data is incomplete or has errors.
Unfortunately, there is no cure or notation for the technical exception. When the server halts, recovering processes are dependent on the technical nature of the process server.
Decisions, supported by business rules, can create named business exceptions. If the process is transactional, it should compensate for transactions that fail to complete. In compensation, the process cleans up or backs out records from ERP’s, operational data stores, or data warehouses – perhaps these can be cloud based.
Processes often must be completed within a specific time period, so there could be time-out actions noted on the diagram. Business processes usually receive a message and translate that message into yet another message for consumption by the next part of the process. BPMN is powerful for developing time-outs, exceptions, and compensations. In building a basic process in BPMN, you can attach time-outs, exceptions, and compensations directly to a subprocess task list. When time-outs and exceptions occur within the process tasks, the process server traps these at the point of the subprocess.
Figure 2 presents a process server “architecture”, including a cloud services, for understanding the three types of errors. If the process server fails, then it obviously cannot run and a technical error occurs. If the ‘on-premise’ database service fails, then the process cannot get the data it needs to complete. This is a transient error. You should be able to restart the process instance and complete it once the database is restored. However cloud services are more complex, you might not have the same level of control that you would have for the database.
Users send data for the process in application servers. If users send incorrect data, then a business exception has occurred. Finally, you use process steps to resolve the error. If the users have used a cloud-bases SAS system to create a request or order, then more complexity is added.
Figure 2: Architecture for a process environment in a cloud-based ecosystem.
You can model business processes with advanced abilities that suspend processing, or communicate with disconnected, cloud-based processes. They can add new services and capabilities such as payments processing. These powers yield opportunities and fulfill the need for the enterprise to reduce the cost of their most complicated practices. Yet, to develop these cloud-based capabilities, you should understand how BPM tools use the three types of exceptions and what you must plan for when developing the production version of the process diagram.
Cloud-based- Error Handling Example
A process diagram should use exception handlers to redirect the process flow when a cloud based exception happens. So, if the subprocess in the figure below raises an error, then steps can be taken to accommodate the error.
Figure 3: using subprocess for exceptional flow
In BMPN, handling errors arising from a subprocess is known as ‘Exceptional flow’ and, for errors, this is always interrupting: the subprocess that generated the error will terminate, and the alternate “exceptional flow” is followed instead. Later, the process can merge exceptional flow with the main flow. The exceptional flow merge does not need to immediately return to the subprocess that generated the error. If required, a process model can merge several steps downstream.
Finally, cloud process integration diagrams in BPMN should include error handling, compensation, and transactions. All services have the potential for error conditions, understanding how to use BPMN to automate error handling and recover, or be survivable, is critical.