Ontology-based Representation and Reasoning on Process Models: A Logic Programming Approach

Fabrizio Smith and Maurizio Proietti
National Research Council
Istituto di Analisi dei Sistemi ed Informatica “Antonio Ruberti”
Via dei Taurini 19, 00185, Roma, Italy
{fabrizio.smith, maurizio.proietti}@iasi.cnr.it
Abstract

We propose a framework grounded in Logic Programming for representing and reasoning about business processes from both the procedural and ontological points of view. In particular, our goal is threefold: (1) define a logical language and a formal semantics for process models enriched with ontology-based annotations; (2) provide an effective inference mechanism that supports the combination of reasoning services dealing with the structural definition of a process model, its behavior, and the domain knowledge related to the participating business entities; (3) implement such a theoretical framework in a process modeling and reasoning platform. To this end we define a process ontology coping with a relevant fragment of the popular BPMN modeling notation. The behavioral semantics of a process is defined as a state transition system by following an approach similar to the Fluent Calculus, and allows us to specify state change in terms of preconditions and effects of the enactment of activities. Then we show how the procedural process knowledge can be seamlessly integrated with the domain knowledge specified by using the OWL 2 RL rule-based ontology language. Our framework provides a wide range of reasoning services, including CTL model checking, which can be performed by using standard Logic Programming inference engines through a goal-oriented, efficient, sound and complete evaluation procedure. We also present a software environment implementing the proposed framework, and we report on an experimental evaluation of the system, whose results are encouraging and show the viability of the approach.

Keywords: Business Processes, Ontologies, Logic Programming, Knowledge Representation, Verification.

1 Introduction

The adoption of structured and systematic approaches for the management of Business Processes (BPs) that operate within an organization is constantly gaining popularity, especially in medium to large organizations such as manufacturing enterprises, service providers, and public administrations. The core of such approaches is the development of BP models that represent the knowledge about processes in machine accessible form. One of the main advantages of process modeling is that it enables automated analysis facilities, such as the verification that the requirements specified over the models are enforced. The automated analysis issue is addressed in the BP Management (BPM) community mainly from a control flow perspective, with the aim of verifying whether the behavior of the modeled system presents logical errors (see, for instance, the notion of soundness [64]).

Unfortunately, standard BP modeling languages are not fully adequate to capture process knowledge in all its aspects. While their focus is on the procedural representation of a BP as a workflow graph that specifies the planned order of operations, the domain knowledge regarding the entities involved in such a process, i.e., the business environment in which processes are carried out, is often left implicit. This kind of knowledge is typically expressed through natural language comments and labels attached to the models, which constitute very limited, informal and ambiguous pieces of information. The lack of a formal representation of the domain knowledge within process models is widely recognized as an obstacle for the further automation of BPM tools and methodologies that effectively support process analysis, retrieval, and reuse [32].

In order to overcome this limitation, the application of well-established techniques stemming from the area of Knowledge Representation in the domains of BP modeling [32, 17, 36, 66] and Web Services [12, 22] has been shown to be a promising approach. In particular, the use of computational ontologies is the most established approach for representing in a machine processable way the knowledge about the domain where business processes operate, providing formal definitions for the basic entities involved in a process, such as activities, actors, data items, and the relations between them. However, there are still several open issues regarding the combination of BP modeling languages (with their execution semantics) and ontologies, and the accomplishment of behavioral reasoning tasks involving both these components. Indeed, most of the approaches developed for the semantic enrichment of process models or Web Services (such as the ones cited above) provide neither an adequate model theory nor an axiomatization to capture and reason about the dynamic aspects of process descriptions. On the other hand, approaches based on action languages developed in AI (e.g., [57, 6, 44]) are very expressive formalisms that can be used to simultaneously capture the process and the domain knowledge, but they are too general to be applied to BP modeling, and must be suitably restricted not only towards decidability of reasoning but also to reflect the peculiarities of processes. Indeed, action languages provide limited support for process definition, in terms of workflow constructs, and they lack a clear mapping from standard (ontology and process) modeling languages.

The main objective of this paper is to design a framework for representing and reasoning about business process knowledge from both the procedural and ontological points of view. To achieve this goal, we do not propose yet another business process modeling language, but we provide a framework based on Logic Programming (LP) [38] for reasoning about process-related knowledge expressed by means of de facto standards for BP modeling, like BPMN [46], and ontology definition, like OWL [43]. We define a rule-based procedural semantics for a relevant fragment of BPMN, by following an approach inspired by the Fluent Calculus [61], and we extend it in order to take into account OWL annotations that describe preconditions and effects of activities and events occurring within a BP. In particular, we integrate our procedural BP semantics with the OWL 2 RL profile thanks to a common grounding in LP. OWL 2 RL is indeed a fragment of the OWL ontology language that has a suitable rule-based presentation, thus constituting an excellent compromise between expressivity and efficiency.

The contributions of this paper can be summarized as follows.

After presenting the preliminaries in Section 2, we propose, in Section 3, a revised and extended version of the Business Process Abstract Language (BPAL) [16, 56], a process ontology for modeling the procedural semantics of a BP regarded as a workflow. To this end we introduce an axiomatization to cope with a relevant fragment of the BPMN 2.0 specification, allowing us to deal with a large class of process models.

We then propose, in Section 4, an approach for the semantic annotation of BP models, where BP elements are described by using an OWL 2 RL ontology.

In Section 5 we provide a general verification mechanism by integrating the temporal logic CTL [15] within our framework, in order to analyze properties of the states that the system can reach, by taking into account both the control-flow and the semantic annotation.

In Section 6 we show how a repository of semantically enriched BPs can be organized in a Business Process Knowledge Base (BPKB), which, due to the common representation of its components in LP, provides a uniform and formal framework that enables logical inference. We then discuss how, by using state-of-the-art LP systems, we can perform some very sophisticated reasoning tasks, such as verification, querying and trace compliance checking, that combine both the procedural and the domain knowledge relative to a BP.

In Section 7 we provide the computational characterization of the reasoning services that can be performed on top of a BPKB, showing in particular that, for a large class of them, advanced resolution strategies (such as SLG-Resolution [14]) guarantee an efficient, sound and complete procedure.

In Section 8 we describe the implemented tool, which provides a graphical user interface to support the semantic BP design, and a reasoner, developed in XSB Prolog [58], able to operate on the BPKB. We also report on an evaluation of the system performance, demonstrating that complex reasoning tasks can be performed on business processes of small-to-medium size with an acceptable consumption of time and memory.

In Section 9 we compare our work to related approaches and in the concluding section we give a critical discussion of our approach, along with directions for future work.

2 Preliminaries

In order to clarify the terminology and the notation used throughout this paper, in this section we recall some basic notions related to the BPMN notation [46], Description Logics [4] as well as foundations of the OWL [43] standard, and Logic Programming [38].

2.1 BPMN

Business Process Model and Notation (BPMN) [46] is a graphical language for BP modeling, standardized by the OMG (http://www.omg.org). The primary goal of BPMN is to provide a standard notation readily understandable by all business stakeholders, which include the business analysts who create and refine the processes, the technical developers responsible for their implementation, and the business managers who monitor and manage them.

A BPMN model is defined through a Business Process Diagram (BPD), which is a kind of flowchart incorporating constructs to represent the control flow, data flow, resource allocation (i.e., how the work is assigned to the participants), and exception handling (i.e., how erroneous behavior can be handled and compensated). We briefly overview the core BPMN constructs by referring to the example in Figure 1.

The constructs of BPMN are classified as flow objects, artifacts, connecting objects, and swimlanes.

Flow objects are partitioned into activities (represented as rounded rectangles), events (represented as circles), and gateways (represented as diamonds). Activities are a generic way of representing some kind of work performed within the process, and can be tasks (i.e., atomic activities such as create_order) or compound activities corresponding to the execution of entire sub-processes (e.g., ordering). Events denote something that “happens” during the enactment of a business process, and are classified as start events, intermediate events, and end events, which can start (e.g., s), suspend, or end (e.g., e) the process enactment. An intermediate event, such as ex, attached to the boundary of an activity models exception handling. Gateways model the branching and merging of the control flow. There are several types of gateways in BPMN, each of which may be used as a branch gateway if it has multiple outgoing flows, or a merge gateway if it has multiple incoming flows. The split and join behavior depends on the semantics associated with each type of gateway. Exclusive branch gateways (e.g., g1) are decision points where exactly one of a set of mutually exclusive alternative flows is selected, while an exclusive merge gateway (e.g., g2) merges two incoming flows into a single one. Parallel branch gateways (e.g., g7) create parallel threads of execution, while parallel merge gateways (e.g., g8) synchronize concurrent flows. Inclusive branch gateways (e.g., g3) are decision points where at least one of a set of non-exclusive alternative flows is selected, while an inclusive merge gateway (e.g., g4) is supposed to be able to synchronize a varying number of threads, i.e., it is executed only when at least one of its predecessors has been executed and no other predecessor will eventually be executed. (For the sake of completeness, BPMN provides two more types of gateways, which we do not exemplify, namely the event-based and the complex gateway.)

Connecting objects are sequence flows (e.g., the directed edge between g1 and g3) and associations (e.g., the dashed edge between create_order and order). A sequence flow links two flow objects and denotes a control flow relation, i.e., it states that the control flow can pass from the source to the target object. An association is used to associate artifacts (i.e., data objects) with flow objects, and its direction defines if a data object is used as an input (e.g., order is an input of accept_order) or it is an output (e.g., order is an output of create_order) of some flow element.

Swimlanes are used to model participants, i.e., a generic notion representing a role within a company (e.g., Sales Clerk), a department (e.g., Finance) or a business partner (e.g., Courier), which is assigned to the execution of a collection of activities.

Figure 1: Handle Order Business Process

2.2 Description Logics and Rule-based OWL Ontologies

Description Logics (DLs) [4] are a family of knowledge representation languages that can be used to represent the knowledge of an application domain in a structured and formally well-understood way. DLs are typically adopted for the definition of ontologies since, on the one hand, the important notions of the domain are described by concept descriptions, i.e., expressions that are built from atomic concepts (usually thought of as sets of individuals, e.g., $\mathit{Person}$) and atomic roles (relations between concepts, e.g., $\mathit{worksFor}$) using the concept and role constructors provided by the particular DL (e.g., $\mathit{Person} \sqcap \exists \mathit{worksFor}.\mathit{Company}$, that is, the set of persons who work for a company). On the other hand, DLs correspond to decidable fragments of classical first-order logic (FOL), and thus are equipped with a formal, logic-based semantics that makes such languages suitable for automated reasoning.

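| OWL Axiom | DL Expression | FOL Formula |
|---|---|---|
| a type C | $a : C$ | $C(a)$ |
| a P b | $(a,b) : P$ | $P(a,b)$ |
| C subClassOf D | $C \sqsubseteq D$ | $\forall x.\, C(x) \rightarrow D(x)$ |
| C disjointWith D | $C \sqcap D \sqsubseteq \bot$ | $\forall x.\, C(x) \wedge D(x) \rightarrow \mathit{false}$ |
| P domain C | $\top \sqsubseteq \forall P^{-}.C$ | $\forall x,y.\, P(x,y) \rightarrow C(x)$ |
| P range C | $\top \sqsubseteq \forall P.C$ | $\forall x,y.\, P(x,y) \rightarrow C(y)$ |
| transitiveProperty P | $P^{+} \sqsubseteq P$ | $\forall x,y,z.\, P(x,y) \wedge P(y,z) \rightarrow P(x,z)$ |
| functionalProperty P | $\top \sqsubseteq\ \leq 1\, P$ | $\forall x,y,z.\, P(x,y) \wedge P(x,z) \rightarrow y = z$ |
| P inverseOf Q | $P \equiv Q^{-}$ | $\forall x,y.\, P(x,y) \leftrightarrow Q(y,x)$ |

| OWL Constructor | DL Expression | FOL Formula |
|---|---|---|
| C intersectionOf D | $C \sqcap D$ | $C(x) \wedge D(x)$ |
| C unionOf D | $C \sqcup D$ | $C(x) \vee D(x)$ |
| P allValuesFrom C | $\forall P.C$ | $\forall y.\, P(x,y) \rightarrow C(y)$ |
| P someValuesFrom C | $\exists P.C$ | $\exists y.\, P(x,y) \wedge C(y)$ |
| complementOf D | $\neg D$ | $\neg D(x)$ |

Table 1: Main OWL statements and FOL equivalence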

Typically, Description Logics are used for representing a TBox (Terminological Box) and an ABox (Assertional Box). The TBox describes concept (and role) hierarchies, while the ABox contains assertions about individuals.

The growing interest in the Semantic Web vision [7], where Knowledge Representation techniques are adopted to make resources machine-interpretable by “intelligent agents”, has pushed the standardization of languages for ontology and meta-data sharing over the (semantic) web. Among these, one of the most promising standards is the Web Ontology Language (OWL) [43], formally grounded in DLs, proposed by the Web Ontology Working Group of W3C. OWL is syntactically layered on RDF [34] and RDFS [10], and can be considered as an extension of RDFS in terms of modeling capabilities and reasoning facilities. The underlying data model (derived from RDF) is based on statements (or RDF triples) of the form $\langle \mathit{subject}, \mathit{property}, \mathit{object} \rangle$, which allow us to describe a resource (subject) in terms of named relations (properties). Values of named relations (i.e., objects) can be URIrefs of Web resources or literals, i.e., representations of data values (such as integers and strings).

Table 1 shows, for some OWL statements, the corresponding DL notations and FOL formulae, where C and D are concepts (OWL classes), P and Q are roles (OWL properties), a and b are constants, and x and y are variables.

The recent OWL 2 specification defines profiles that correspond to syntactic subsets of OWL, each of which is designed to trade some expressive power for efficiency of reasoning. In particular, we consider OWL 2 RL, closely related to the Horn fragment of FOL, which is based on Description Logic Programs [28] and pD* [59]. The use of OWL 2 RL allows us to take advantage of the efficient resolution strategies developed for logic programs, in order to perform the reasoning tasks typically supported by Description Logics reasoning systems, such as concept subsumption and ontology consistency. Indeed, the semantics of OWL 2 RL is defined through a partial axiomatization of the OWL 2 RDF-Based Semantics in the form of first-order implications (OWL 2 RL/RDF rules), and constitutes an upward-compatible extension of RDF and RDFS.

OWL 2 RL ontologies are modeled by means of the ternary predicate $t(s,p,o)$ representing an OWL statement with subject $s$, predicate $p$ and object $o$. For instance, the assertion $t(a, \mathit{rdfs{:}subClassOf}, b)$ represents the inclusion axiom $a \sqsubseteq b$. Reasoning on triples is supported by OWL 2 RL/RDF rules of the form $t(s,p,o) \leftarrow t(s_1,p_1,o_1) \wedge \ldots \wedge t(s_n,p_n,o_n)$. Table 2 shows some of the rules of the OWL 2 RL/RDF rule-set. According to the terminology we will introduce in the next section, this rule set is a definite logic program.

Transitive subsumption
Inheritance
Domain
Range
Transitivity
Subsumption of existential formulae
Intersection
Disjointness
Table 2: Excerpt of the OWL 2 RL/RDF rule-set
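
As an illustration of how rules of this kind operate on triples, the following minimal Python sketch applies four of the rule categories listed in Table 2 (transitive subsumption, inheritance, domain, and range) by naive forward chaining. It is only a sketch under stated assumptions: the rule bodies follow the corresponding rules of the W3C OWL 2 RL/RDF rule-set, the function name `owl_rl_closure` is hypothetical, and the classes, properties, and individuals in the example are invented for the purpose of the illustration.

```python
# Triples are (subject, predicate, object) tuples. The vocabulary names mirror
# RDF/RDFS terms; the classes and individuals used below are hypothetical.
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"


def owl_rl_closure(triples):
    """Apply four OWL 2 RL/RDF-style rules until no new triple is derived."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                # Transitive subsumption: s <= o and o <= o2 give s <= o2.
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # Inheritance: s2 is an instance of s, and s <= o.
                if p == SUBCLASS and p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
                # Domain: property s has domain o and (s2, s, o2) is asserted.
                if p == DOMAIN and p2 == s:
                    new.add((s2, TYPE, o))
                # Range: property s has range o and (s2, s, o2) is asserted.
                if p == RANGE and p2 == s:
                    new.add((o2, TYPE, o))
        if new <= facts:  # fixpoint reached
            return facts
        facts |= new


kb = {
    ("PurchaseOrder", SUBCLASS, "Order"),
    ("Order", SUBCLASS, "BusinessDocument"),
    ("placedBy", DOMAIN, "Order"),
    ("placedBy", RANGE, "Customer"),
    ("o1", TYPE, "PurchaseOrder"),
    ("o2", "placedBy", "c1"),
}
closure = owl_rl_closure(kb)
```

Since the rules contain no function symbols and no negation, the closure is finite and the loop terminates. A goal-oriented engine would instead derive only the triples relevant to a given query.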

2.3 Logic Programming

We briefly recall the basic notions of Logic Programming. In particular, we will consider the class of locally stratified logic programs, or stratified programs, for short, and their standard semantics defined by the perfect model. (Recall that all major declarative semantics of logic programs coincide on stratified programs.) This class of logic programs is expressive enough to represent several complementary pieces of knowledge related to business processes, such as the syntactic structure of the control flow, the operational semantics, the ontology-based properties, and the temporal properties of the execution. For more details about LP we refer to [38, 2].

A term is either a constant, or a variable, or an expression of the form $f(t_1,\ldots,t_n)$, where $f$ is a function symbol and $t_1,\ldots,t_n$ are terms. An atom is a formula of the form $p(t_1,\ldots,t_n)$, where $p$ is a predicate symbol and $t_1,\ldots,t_n$ are terms. A literal is either an atom or a negated atom. A rule is a formula of the form $A \leftarrow L_1 \wedge \ldots \wedge L_n$, where $A$ is an atom (the head of the rule) and $L_1 \wedge \ldots \wedge L_n$ is a conjunction of literals (the body of the rule). If $n = 0$ we call the rule a fact. A rule (term, atom, literal) is ground if no variables occur in it. A logic program is a set of rules. A definite program is a logic program with no negated atoms in the body of its rules. For a logic program $P$, by $\mathit{ground}(P)$ we denote the set of ground instances of rules in $P$.

Let $B_P$ denote the Herbrand base for $P$, that is, the set of ground atoms that can be constructed in the language of program $P$. An (Herbrand) interpretation $I$ is a subset of $B_P$. A ground atom $A$ is true in $I$ if $A \in I$. A ground negated atom $\neg A$ is true in $I$ if $A \notin I$. A ground rule $A \leftarrow L_1 \wedge \ldots \wedge L_n$ is true in $I$ if either $A$ is true in $I$ or, for some $i \in \{1,\ldots,n\}$, $L_i$ is not true in $I$. An interpretation $I$ is a model of $P$ if all rules in $\mathit{ground}(P)$ are true in $I$. Every definite program has a least Herbrand model. However, this property does not hold for general logic programs.

A (local) stratification is a function $\sigma$ from the Herbrand base $B_P$ to the set of all countable ordinals [2, 50]. However, for the purposes of this paper it will be enough to consider stratification functions from $B_P$ to the set $\mathbb{N}$ of the natural numbers. For a ground atom $A$, $\sigma(A)$ is called the stratum of $A$. A stratification $\sigma$ extends to negated atoms by taking $\sigma(\neg A) = \sigma(A) + 1$. A ground rule $A \leftarrow L_1 \wedge \ldots \wedge L_n$ is stratified with respect to $\sigma$ if, for $i = 1,\ldots,n$, $\sigma(A) \geq \sigma(L_i)$. A program $P$ is stratified with respect to $\sigma$ if every rule in $\mathit{ground}(P)$ is. Finally, a logic program is stratified if it is stratified with respect to some stratification function.

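The perfect model of $P$, denoted $\mathit{Perf}(P)$, is defined as follows. Let $P$ be stratified with respect to $\sigma$. For every $n \in \mathbb{N}$, let $S_n$ be the set of rules in $\mathit{ground}(P)$ whose head has stratum $n$. Thus, $\mathit{ground}(P) = \bigcup_{n \in \mathbb{N}} S_n$. We define a sequence of interpretations $M_0, M_1, \ldots$ as follows: (i) $M_0$ is the least model of $S_0$ (note that $S_0$ is a definite program), and (ii) $M_{n+1}$ is the least model of $S_{n+1}$ that contains $M_n$. The perfect model of $P$ is defined as $\mathit{Perf}(P) = \bigcup_{n \in \mathbb{N}} M_n$. (Here we are using the simplifying assumption that the codomain of the stratification function is $\mathbb{N}$.)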
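
The stratum-by-stratum construction can be made concrete with a small Python sketch for finite ground programs. This is an illustration under assumptions, not part of the framework: rules are encoded as triples (head, positive body atoms, negated body atoms), the stratification is supplied explicitly as a dictionary, and the function names and atom names are hypothetical.

```python
def is_stratified(rules, stratum):
    """Check the stratification condition: the stratum of the head is at
    least that of each positive atom and strictly above each negated atom."""
    return all(
        all(stratum[head] >= stratum[a] for a in pos)
        and all(stratum[head] >= stratum[a] + 1 for a in neg)
        for head, pos, neg in rules
    )


def perfect_model(rules, stratum):
    """Compute the perfect model of a finite ground stratified program."""
    model = set()
    for n in sorted(set(stratum.values())):
        layer = [r for r in rules if stratum[r[0]] == n]
        changed = True
        while changed:  # least fixpoint of the rules whose head has stratum n
            changed = False
            for head, pos, neg in layer:
                # Negated atoms belong to lower strata, so their truth value
                # is already final when stratum n is processed.
                body_true = (all(a in model for a in pos)
                             and all(a not in model for a in neg))
                if body_true and head not in model:
                    model.add(head)
                    changed = True
    return model


# A tiny ground program about two sequence flows a -> b -> c.
rules = [
    ("seq_a_b", [], []),
    ("seq_b_c", [], []),
    ("reach_a_b", ["seq_a_b"], []),
    ("reach_b_c", ["seq_b_c"], []),
    ("reach_a_c", ["reach_a_b", "reach_b_c"], []),
    ("unreach_c_a", [], ["reach_c_a"]),  # negation as failure
]
stratum = {atom: 0 for atom in
           ["seq_a_b", "seq_b_c", "reach_a_b", "reach_b_c", "reach_a_c", "reach_c_a"]}
stratum["unreach_c_a"] = 1
model = perfect_model(rules, stratum)
```

In this example the atom `reach_c_a` is not derivable at stratum 0, so the rule with the negated body fires at stratum 1.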

The operational semantics of logic programs is based on the notion of derivation, which is constructed by SLD-resolution augmented with the Negation as Failure rule [38]. Given a stratified program $P$, we will define below the one-step derivation relation $Q_1 \stackrel{\theta}{\Longrightarrow}_P Q_2$, where $Q_1, Q_2$ are queries, that is, conjunctions of literals, and $\theta$ is a substitution. The definition of the one-step derivation relation depends on the following notions. A derivation for a query $Q_0$ with respect to $P$ is a sequence $Q_0 \stackrel{\theta_1}{\Longrightarrow}_P \ldots \stackrel{\theta_n}{\Longrightarrow}_P Q_n$ ($n \geq 0$). We will omit the reference to $P$ when clear from the context. A derivation is successful if its last query is the empty conjunction true. A query succeeds if there exists a successful derivation for it. A query fails if it does not succeed. The one-step derivation relation is defined by the following two derivation rules.

  1. Let $A \wedge Q$ be a query, where $A$ is an atom. Suppose that $H \leftarrow L_1 \wedge \ldots \wedge L_n$ ($n \geq 0$) is a rule in $P$ such that $A$ is unifiable with $H$ via a most general unifier $\theta$ [38]. Then $A \wedge Q \stackrel{\theta}{\Longrightarrow} (L_1 \wedge \ldots \wedge L_n \wedge Q)\theta$.

  2. (N) Let $\neg A \wedge Q$ be a query, where $A$ is a ground atom. Suppose that $A$ fails. Then $\neg A \wedge Q \stackrel{\epsilon}{\Longrightarrow} Q$, where $\epsilon$ is the identity substitution.

Note that in the definition of a derivation we assume the left-to-right selection rule for literals. Note also that, in rule (N), the one-step derivation from $\neg A \wedge Q$ refers to the set of all derivations from $A$ (to show that $A$ fails). However, this definition is well-founded because the program is stratified. We say that a query $Q'$ is generable from a query $Q$ if there exists either a derivation $Q \Longrightarrow \ldots \Longrightarrow Q'$, or a derivation $Q \Longrightarrow \ldots \Longrightarrow \neg A \wedge Q''$ and $Q'$ is generable from $A$. An answer for a query $Q$ is a substitution $\theta$ such that there exists a successful derivation $Q \stackrel{\theta_1}{\Longrightarrow} \ldots \stackrel{\theta_n}{\Longrightarrow} \mathit{true}$ and $\theta$ is the restriction of the composition $\theta_1 \cdots \theta_n$ to the variables occurring in $Q$. A query $Q$ flounders if there exists a query $Q'$ generable from $Q$ such that the leftmost literal of $Q'$ is a non-ground negated atom.

The operational semantics is sound and complete with respect to the perfect model semantics for queries that do not flounder. Indeed, it can be shown (see, for instance, [50, 2]) that, given a program $P$ and an atom $A$ that does not flounder with respect to $P$: (1) if $A$ succeeds with answer $\theta$, then every ground instance of $A\theta$ belongs to $\mathit{Perf}(P)$, and (2) if $A\theta$ belongs to $\mathit{Perf}(P)$ for some substitution $\theta$, then $A$ succeeds with an answer which is more general than $\theta$.

The definition of a derivation given above is quite abstract and not fully constructive. In particular, the application of rule (N) requires testing that an atom has no successful derivations, and this property is undecidable in the general case. Thus, an effective query evaluation strategy depends on the concrete way derivations are constructed.

A well-known difficulty of the evaluation strategy based on depth-first search is that infinite derivations may be constructed, even in cases where a finite set of atoms (modulo variants) is derived from a given initial query. In particular, this nonterminating behavior can occur for stratified Datalog programs, that is, function free stratified programs.

In order to avoid this difficulty, in this paper we adopt SLG-resolution, a query evaluation mechanism that implements SLD-resolution with Negation as Failure by means of tabling [14]. During the construction of the derivations for a given atom $A$, a table is maintained to record the answers to $A$ and to the atoms generated from $A$. The tabled answers are used the next time an atom is generated, and hence no atom is evaluated more than once. Thus, SLG-resolution is able to compute in finite time all answers to a query, if a finite set of atoms is generated and a finite set of answers for those atoms exists. In particular, SLG-resolution always terminates and is able to compute all answers for queries to stratified Datalog programs.
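
The effect of tabling can be illustrated, in a much simplified form, by the following Python sketch. It is not SLG-resolution itself: it only mimics the idea of recording calls and answers in a table for the Datalog program reach(X,Y) ← seq(X,Y) and reach(X,Y) ← seq(X,Z) ∧ reach(Z,Y), on which a depth-first evaluation would loop forever when the seq relation contains a cycle. The predicate names, the function name `tabled_reach`, and the example graph are assumptions made for the illustration.

```python
def tabled_reach(edges, start):
    """Answers Y to the call reach(start, Y), using a table of calls/answers."""
    table = {start: set()}  # call reach(x, _) -> answers found so far
    changed = True
    while changed:  # iterate until no table entry grows
        changed = False
        for x in list(table):
            for (u, v) in edges:
                if u != x:
                    continue
                # Rule 1: reach(x, v) <- seq(x, v).
                answers = {v}
                # Rule 2: reach(x, y) <- seq(x, v), reach(v, y).
                # The subgoal reach(v, _) is looked up in the table
                # instead of being evaluated again from scratch.
                if v not in table:
                    table[v] = set()
                    changed = True
                answers |= table[v]
                if not answers <= table[x]:
                    table[x] |= answers
                    changed = True
    return table[start]


# A cyclic graph: a -> b -> c -> a, plus an exit c -> d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
answers_a = tabled_reach(edges, "a")
answers_d = tabled_reach(edges, "d")
```

Each call appears in the table once and its answer set can only grow within a finite set of constants, which is why the loop terminates even on cyclic graphs.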

3 Rule-based Representation of BP Schemas

In this section we introduce a formal representation of business processes by means of the notion of Business Process Schema (BPS). A BPS, its meta-model, and its procedural (or behavioral) semantics will all be specified by sets of rules, for which we adopt the standard notation and semantics of LP (see Section 2.3).

3.1 Introducing BPAL

The Business Process Abstract Language (BPAL) is a language conceived to provide a declarative modeling method capable of fully capturing procedural knowledge in a business process. BPAL constructs are common to the most widely used BP modeling languages (e.g., BPMN [46], UML activity diagrams [47], EPC [33]) and, in particular, BPAL is based on the BPMN 2.0 specification [46].

Formally, a (set of) BPS(s) is specified by a set of ground facts of the form $p(c_1,\ldots,c_n)$, where $c_1,\ldots,c_n$ are constants denoting flow elements (e.g., activities, events, and gateways) and $p$ is a predicate symbol. In Table 3 we list some of the BPAL predicates, and in Table 4 we exemplify their usage by reporting the translation of the Handle Order process (ho for short) depicted in Figure 1 as a BPAL BPS. An extended discussion can be found in [16, 55].

| Construct | Description |
|---|---|
| bp(p,s,e) | p is a process, with entry-point s and exit-point e |
| element(x) | x is a flow element occurring in some process |
| relation(x,y,p) | the elements x and y are in relation in the process p |
| task(a) | a is an atomic activity |
| event(e) | e is an event |
| exception(e,a,p) | the intermediate event e (an exception) is attached to the activity a in the process p |
| comp_act(a,s,e) | a is a compound activity, with entry-point s and exit-point e |
| seq(el1,el2,p) | a sequence flow relation is defined between el1 and el2 in p |
| par_branch(g) | the execution of g enables all the successor flow elements |
| par_merge(g) | g waits for the completion of all the predecessor flow elements |
| exc_branch(g) | the execution of g enables one of the successor flow elements |
| exc_merge(g) | g waits for the completion of one of the predecessor flow elements |
| inc_branch(g) | the execution of g enables at least one of its successors |
| inc_merge(g) | g waits for the completion of the predecessor flow elements that will be eventually executed |
| item(i) | i is a data element |
| input(a,i,p) | the activity a uses as input the data element i in the process p |
| output(a,i,p) | the activity a produces as output the data element i in the process p |
| participant(part) | part is a participant |
| assigned(a,part,p) | the activity a is assigned to the participant part in the process p |

Table 3: Excerpt of the BPAL language

bp(ho,s,e) seq(s, ordering, ho) comp_act(ordering, s, e)
seq(ordering,g1,ho) seq(g1,g2, ho) assigned(ordering,sales_clerk,ho)
seq(g1,g3,ho) seq(g3,parts_auction,ho) assigned(delivering,shipper,ho)
seq(g3,allocate_inventory,ho) seq(parts_auction,g4,ho) seq(s1,create_order,ordering)
seq(allocate_inventory,g4,ho) seq(g4,g5,ho) seq(create_order,g9,ordering)
seq(g5,g2,ho) seq(g5,select_shipper,ho) seq(g9,accept_order,ordering)
seq(g2,notify_rejection,ho) seq(select_shipper,g7,ho) exception(ex,accept_order,ordering)
seq(notify_rejection,g6,ho) seq(g6,e,ho) seq(ex,g10,ordering)
exc_branch(g1) participant(sales_clerk) input(accept_order,order,ordering)
inc_branch(g3) task(create_order) output(create_order,order,ordering)
par_branch(g7) item(order)
Table 4: BPS representing the Handle Order process

Our formalization also includes a set of rules that represents the meta-model, defining: hierarchical relationships among the BPAL predicates; disjointness relationships among BPAL elements; and structural properties which regard a BPS as a directed graph, where edges correspond to sequence and item flow relations. A first set of structural properties represents constraints that should be satisfied by a well-formed (i.e., syntactically correct) BPS: (1) every process is assigned to a unique start event and to a unique end event; (2) every flow element occurs on a path from the start event to the end event; (3) start events have no predecessors and end events have no successors; (4) branch gateways have exactly one predecessor and at least two successors, while merge gateways have at least two predecessors and exactly one successor; (5) activities and intermediate events have exactly one predecessor and one successor; (6) there are no cycles in the hierarchy of compound activities.
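
To give a flavor of how such constraints can be checked on a set of facts, the following Python sketch tests constraints (3), (4) and (5) on the sequence flow relation of a single process. It is only an illustration under assumptions: the function name `check_well_formed`, the encoding of seq facts as pairs, and the small example process are hypothetical, and exception events attached to activity boundaries are not considered.

```python
from collections import defaultdict


def check_well_formed(start, end, seq, branch, merge):
    """Return the list of violations of constraints (3), (4), (5).
    seq is a set of (source, target) pairs of one process;
    branch and merge are the sets of branch and merge gateways."""
    preds, succs = defaultdict(set), defaultdict(set)
    for x, y in seq:
        succs[x].add(y)
        preds[y].add(x)
    errors = []
    if preds[start]:
        errors.append(f"(3) start event {start} has a predecessor")
    if succs[end]:
        errors.append(f"(3) end event {end} has a successor")
    elements = {e for pair in seq for e in pair} - {start, end}
    for e in sorted(elements):
        n_in, n_out = len(preds[e]), len(succs[e])
        if e in branch:
            if not (n_in == 1 and n_out >= 2):
                errors.append(f"(4) branch gateway {e}: {n_in} in, {n_out} out")
        elif e in merge:
            if not (n_in >= 2 and n_out == 1):
                errors.append(f"(4) merge gateway {e}: {n_in} in, {n_out} out")
        elif not (n_in == 1 and n_out == 1):
            errors.append(f"(5) element {e}: {n_in} in, {n_out} out")
    return errors


# Hypothetical process: s -> a -> g1 -> (b | c) -> g2 -> e
good = {("s", "a"), ("a", "g1"), ("g1", "b"), ("g1", "c"),
        ("b", "g2"), ("c", "g2"), ("g2", "e")}
broken = good - {("g1", "c"), ("c", "g2")}  # gateways left with one branch
ok_report = check_well_formed("s", "e", good, {"g1"}, {"g2"})
bad_report = check_well_formed("s", "e", broken, {"g1"}, {"g2"})
```

In a rule-based setting each of these checks would correspond to a rule deriving a violation, so that a BPS is well-formed when no violation is derivable.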

Finally, other meta-model properties are related to the notions of path and reachability between flow elements. Two of them will be used in the sequel: a predicate representing the transitive closure of the sequence flow relation, and a predicate which holds if there is a path in a process between two flow elements that does not include a given third element.

With respect to the framework introduced in [16, 55], here we consider unstructured cyclic workflows whose behavioral semantics will be introduced in the following.
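
The two path-related properties mentioned above can be sketched in Python as a single breadth-first search with an optional element to avoid. The sketch uses the seq facts of the ho process listed in Table 4; the function name `reachable` and its parameters are hypothetical, and we assume that the avoided element is only checked among the intermediate elements of the path, not at its endpoints.

```python
from collections import deque

# Sequence flows of the ho process, taken from Table 4.
SEQ_HO = [
    ("s", "ordering"), ("ordering", "g1"), ("g1", "g2"), ("g1", "g3"),
    ("g3", "parts_auction"), ("g3", "allocate_inventory"),
    ("parts_auction", "g4"), ("allocate_inventory", "g4"), ("g4", "g5"),
    ("g5", "g2"), ("g5", "select_shipper"), ("g2", "notify_rejection"),
    ("select_shipper", "g7"), ("notify_rejection", "g6"), ("g6", "e"),
]


def reachable(seq, x, y, avoiding=None):
    """True if there is a non-empty path from x to y along seq whose
    intermediate elements never include `avoiding` (if given)."""
    frontier, seen = deque([x]), set()
    while frontier:
        u = frontier.popleft()
        for (a, b) in seq:
            if a != u:
                continue
            if b == y:
                return True
            if b != avoiding and b not in seen:
                seen.add(b)
                frontier.append(b)
    return False


plain = reachable(SEQ_HO, "g1", "notify_rejection")
blocked = reachable(SEQ_HO, "g1", "notify_rejection", avoiding="g2")
detour = reachable(SEQ_HO, "g3", "g5", avoiding="parts_auction")
```

The set of visited elements plays the same role as a table of answers: it guarantees termination also on cyclic workflows.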

3.2 Behavioral Semantics

Now we present a formal definition of the behavioral semantics, or enactment, of a BPS, by following an approach inspired by the Fluent Calculus, a well-known calculus for action and change (see [61] for an introduction), which is formalized in Logic Programming.

In the Fluent Calculus, the state of the world is represented as a collection of fluents, i.e., terms representing atomic properties that hold at a given instant of time. An action, also represented as a term, may cause a change of state, i.e., an update of the collection of fluents associated with it. Finally, a plan is a sequence of actions that leads from the initial to the final state. For states we use set notation (here we depart from [61], where an associative-commutative operator is used for representing collections of fluents). A fluent is an expression of the form $f(a_1,\ldots,a_n)$, where $f$ is a fluent symbol and $a_1,\ldots,a_n$ are constants or variables. In order to model the behavior of a BPS, we represent states as finite sets of ground fluents. We take a closed-world interpretation of states, that is, we assume that a fluent $F$, different from true, holds in a state $S$ iff $F \in S$. Our set-based representation of states relies on the assumption that the BPS is safe, that is, during its enactment there are no concurrent executions of the same flow element [64]. This assumption enforces that the set of states reachable by a given BPS is finite. A fluent expression is built inductively from fluents, the binary function symbol $\wedge$, and the unary function symbol $\neg$. The satisfaction relation assigns a truth value to a fluent expression with respect to a state. This relation is encoded by a predicate $\mathit{holds}(F,S)$, which holds if the fluent expression $F$ is true in the state $S$. We also introduce a constant symbol true, such that $\mathit{holds}(\mathit{true},S)$ holds for every state $S$. According to the closed-world interpretation given to states, the satisfaction relation is defined by the following rules:

$\mathit{holds}(F,S) \leftarrow F = \mathit{true}$
$\mathit{holds}(F,S) \leftarrow F \in S$
$\mathit{holds}(\neg F,S) \leftarrow \neg\, \mathit{holds}(F,S)$
$\mathit{holds}(F_1 \wedge F_2,S) \leftarrow \mathit{holds}(F_1,S) \wedge \mathit{holds}(F_2,S)$

Note that, by the perfect model semantics, reflecting the closed-world assumption, for any fluent $F$ different from true, $\neg F$ holds in a state $S$ iff $F \notin S$.

We will consider the following two kinds of fluents:

  • $\mathit{cf}(E_1,E_2,P)$, which means that the flow element $E_1$ has been executed and the successor flow element $E_2$ is waiting for execution, during the enactment of the process $P$ (cf stands for control flow);

  • $\mathit{en}(A,P)$, which means that the activity $A$ is being executed during the enactment of the process $P$ (en stands for enacting).

To clarify our terminology note that, when a flow element $E_2$ is waiting for execution, $E_2$ might not be enabled to execute, because other conditions need to be fulfilled, such as those depending on the synchronization with other flow elements (see, in particular, the semantics of merging behaviors below).
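
The closed-world satisfaction relation and the two kinds of fluents can be illustrated by the following Python sketch, where a state is a frozen set of ground fluents. The encoding is an assumption made for the illustration: fluents are tuples tagged with "cf" or "en", fluent expressions are tuples tagged with "and" or "not", and the function name `holds` is hypothetical.

```python
def holds(expr, state):
    """Truth value of a fluent expression in a state (a set of ground fluents)."""
    if expr == "true":
        return True
    if expr[0] == "not":
        # Closed world: a negated expression holds iff its argument does not.
        return not holds(expr[1], state)
    if expr[0] == "and":
        return holds(expr[1], state) and holds(expr[2], state)
    # A plain fluent holds iff it belongs to the state.
    return expr in state


# g1 has been executed and g3 is waiting; parts_auction is being executed.
state = frozenset({
    ("cf", "g1", "g3", "ho"),
    ("en", "parts_auction", "ho"),
})
running = ("en", "parts_auction", "ho")
idle = ("en", "allocate_inventory", "ho")
query = ("and", running, ("not", idle))
result = holds(query, state)
```

Because states are finite sets, the negation case always terminates, which mirrors the role of stratification in the rule-based definition.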

We assume that the execution of an activity has a beginning and a completion (although we do not associate a duration with activity execution), while the other flow elements execute instantaneously. Thus, we will consider two kinds of actions: which starts the execution of an activity , and , which represents the completion of the execution of a flow element (possibly, an activity). The change of state determined by the execution of an action will be formalized by a relation , which holds if action can be executed in state leading to state . For defining the relation the following auxiliary predicates will be used: (i) , which holds if , where and are sets of fluents, and (ii) , which holds if is the set of ground instances of fluent such that condition holds.

The relation holds if a state is immediately reachable from a state , that is, some action can be executed in state leading to state :

We say that a state $S_2$ is reachable from a state $S_1$ if there is a finite, possibly empty, sequence of actions from $S_1$ to $S_2$, that is, $reachable\_state(S_1, S_2)$ holds, where the relation reachable_state is the reflexive-transitive closure of the immediate reachability relation defined above.
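Since states are finite sets of ground fluents and, under the safeness assumption, only finitely many states are reachable, the reflexive-transitive closure can be computed by a plain breadth-first search. The sketch below is only illustrative: successors is a hypothetical callback that plays the role of the result relation by yielding (action, next_state) pairs, and toy_successors is a made-up three-state chain used to exercise it.

```python
from collections import deque


def reachable_states(initial, successors):
    """Return the set of states reachable from `initial` in zero or more steps.

    `successors(state)` is an assumed callback yielding (action, next_state)
    pairs; states must be hashable (e.g. frozensets of fluents).
    """
    seen = {initial}              # reflexive: the initial state is reachable
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for _action, nxt in successors(state):
            if nxt not in seen:   # visit every state at most once
                seen.add(nxt)
                queue.append(nxt)
    return seen


def toy_successors(state):
    """Made-up transition system: s0 -> s1 -> s2, where s2 has no successor."""
    chain = {
        frozenset({"s0"}): frozenset({"s1"}),
        frozenset({"s1"}): frozenset({"s2"}),
    }
    if state in chain:
        yield ("step", chain[state])


print(len(reachable_states(frozenset({"s0"}), toy_successors)))
```

The search terminates because each state is enqueued at most once, which is where the finiteness of the reachable state space matters.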

In the rest of this section we present a fluent-based formalization of the behavioral semantics of a BPS as a set of rules, partially reported in Table 5. The proposed formal semantics focuses on a core of the BPMN language and mainly follows the semantics described (informally) in the most recent specification of the language [46]. Most of the constructs considered here (e.g., parallel or exclusive branching/merging) have the same interpretation in most workflow languages. However, when different interpretations are given, e.g., in the case of inclusive merge, we stick to the BPMN one.

3.2.1 Activity and Event Execution

The enactment of a process $P$ begins with the execution of the associated start event $E$ in a state where the fluent $cf(start, E, P)$ holds, where start is a reserved constant. After the execution of the start event, its unique successor waits for execution (Rule E1). The execution of an end event leads to the final state of a process execution, in which the fluent $cf(E, end, P)$ holds, where $E$ is the end event associated with the process $P$ and end is a reserved constant (Rule E2).

According to the informal semantics of BPMN, intermediate events are intended as instantaneous patterns of behavior that are registered at a given time point. Thus, we formally model the execution of an intermediate event as a single state transition, as defined in Rule E3. Intermediate events in BPMN can also be attached to activity boundaries to model exceptional flows. Upon occurrence of an exception, the execution of the activity is interrupted, and the control flow moves along the sequence flow that leaves the event (Rule E4).

The execution of an activity is enabled to begin after the completion of its unique predecessor flow element. The effects of the execution of an activity vary depending on its type (i.e., atomic task or compound activity). The beginning of an atomic task $A$ is modeled by adding the fluent $en(A, P)$ to the state (Rule A1). At the completion of $A$, the fluent $en(A, P)$ is removed and the control flow moves on to the unique successor of $A$ (Rule A2). The execution of a compound activity, whose internal structure is defined as a process itself, begins by enabling the execution of the associated start event (Rule A3), and completes after the execution of the associated end event (Rule A4).
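The two transitions of an atomic task can be sketched as set updates on the state. The Python fragment below is a rough illustration in the spirit of Rules A1 and A2, not a transcription of them; the tuple encoding of the cf and en fluents, the succ dictionary, and the element names in the usage lines are assumptions made for the example.

```python
# Assumed encoding: ("cf", E1, E2, P) and ("en", A, P) are tuples of strings,
# and a state is a frozenset of such tuples.

def task_transitions(state, tasks, succ, proc):
    """Yield (action, next_state) pairs for the atomic tasks of process `proc`.

    tasks: set of task names; succ: dict mapping each task to its unique
    successor. Both are hypothetical stand-ins for the process structure.
    """
    for fluent in state:
        if fluent[0] == "cf" and fluent[3] == proc and fluent[2] in tasks:
            # Begin: the task was waiting, so consume the control-flow
            # fluent and record that the task is being enacted.
            task = fluent[2]
            yield ("begin", task), (state - {fluent}) | {("en", task, proc)}
        elif fluent[0] == "en" and fluent[2] == proc and fluent[1] in tasks:
            # Complete: stop enacting and pass control to the unique successor.
            task = fluent[1]
            token = ("cf", task, succ[task], proc)
            yield ("complete", task), (state - {fluent}) | {token}


s0 = frozenset({("cf", "start_ev", "t", "p")})
for action, nxt in task_transitions(s0, {"t"}, {"t": "end_ev"}, "p"):
    print(action, sorted(nxt))
```

Each transition removes exactly one fluent and adds exactly one, which keeps a single thread of control for a sequential fragment.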

[Table 5: Fragment of the behavioral semantics of the BPAL language. Rules E1-E4 (events), A1-A4 (activities), B1-B3 (branch gateways), X1 (exclusive merge), O1-O5 (inclusive merge), P1-P2 (parallel merge).]

3.2.2 Branching Behaviors

When a branch gateway is executed, a subset of its successors is selected for execution. We consider here exclusive, inclusive, and parallel branch gateways.

An exclusive branch leads to the execution of exactly one successor (Rule B1), while an inclusive branch leads to the concurrent execution of a non-empty subset of its successors (Rule B2). The set of successors of exclusive or inclusive decision points may depend on guards, i.e., conditions that usually take the form of tests on the value of the items that are passed between the activities. While Rules B1-B2 formalize a nondeterministic choice among the successors of a decision point, in Section 4.3 guard expressions will be included in the framework in the form of fluent expressions whose truth value is tested with respect to the current state. Finally, a parallel branch leads to the concurrent execution of all its successors (Rule B3).
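The three branching behaviors differ only in which sets of successors may be selected. A small Python sketch of this nondeterministic choice, ignoring guards, is given below; the function names and the tuple encoding of cf fluents are illustrative assumptions rather than part of the formalization.

```python
from itertools import combinations


def branch_targets(kind, successors):
    """Return the admissible sets of successors for a branch gateway."""
    ordered = sorted(successors)
    if kind == "exclusive":
        # Exactly one successor is selected.
        return [frozenset({s}) for s in ordered]
    if kind == "inclusive":
        # Any non-empty subset of the successors may be selected.
        return [frozenset(c)
                for r in range(1, len(ordered) + 1)
                for c in combinations(ordered, r)]
    if kind == "parallel":
        # All successors are selected.
        return [frozenset(ordered)]
    raise ValueError(f"unknown gateway kind: {kind}")


def branch_transitions(state, gateway, kind, successors, proc):
    """Yield the states obtained by executing a waiting branch gateway."""
    waiting = [f for f in state
               if f[0] == "cf" and f[2] == gateway and f[3] == proc]
    for fluent in waiting:
        for chosen in branch_targets(kind, successors):
            added = {("cf", gateway, y, proc) for y in chosen}
            yield (state - {fluent}) | added


print(len(branch_targets("inclusive", {"a", "b", "c"})))
```

An inclusive gateway with n successors therefore has 2^n - 1 possible outcomes, which is one reason guards are useful for pruning the choice.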

3.2.3 Merging Behaviors

An exclusive merge can be executed whenever at least one of its predecessors has been executed (Rule X1).

For the inclusive merge several operational semantics have been proposed, due to the complexity of its non-local semantics (see, e.g., [33, 65]). An inclusive merge is supposed to be able to synchronize a varying number of threads, i.e., it is executed only when at least one predecessor has been executed and no other predecessor will eventually be executed. Here we refer to the semantics described in [65] and adopted by BPMN, stating that (Rule O1) an inclusive merge M can be executed if the following two conditions hold (Rules O2, O3):

  1. at least one of its predecessors has been executed,

  2. for each non-executed predecessor X, there is no flow element U which is waiting for execution and is upstream of X. The notion of being upstream captures the fact that U may lead to the execution of X, and is defined as follows. A flow element U is upstream of X if (Rules O4, O5): a) there is a path from U to X not including M, and b) there is no path from U to an executed predecessor of M not including M.

Finally, a parallel merge can be executed if all its predecessors have been executed, as defined in Rule P1. The auxiliary condition used in that rule (Rule P2) holds if there exists no predecessor of the merge that has not been executed in the current state.
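The enabling conditions of the three merge gateways can be sketched as checks on the current state and on the sequence-flow graph. The Python fragment below follows the informal description given above and is not a transcription of Rules X1, O1-O5, and P1-P2. It assumes that cf fluents are tuples ("cf", E1, E2, P), that graph is a dictionary from each flow element to its successors, that a path of length zero counts as a path, and that only elements with a pending cf fluent are considered to be waiting. All names are hypothetical.

```python
def reaches(graph, src, dst, avoid):
    """True if `dst` can be reached from `src` without visiting `avoid`."""
    if src == avoid or dst == avoid:
        return False
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nxt in graph.get(node, ()):
            if nxt != avoid and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def exclusive_merge_enabled(state, merge, preds, proc):
    """At least one predecessor has been executed."""
    return any(("cf", x, merge, proc) in state for x in preds)


def parallel_merge_enabled(state, merge, preds, proc):
    """No predecessor is still missing."""
    return all(("cf", x, merge, proc) in state for x in preds)


def inclusive_merge_enabled(state, merge, preds, graph, proc):
    """Some predecessor was executed and no waiting element is upstream of a
    non-executed predecessor."""
    executed = {x for x in preds if ("cf", x, merge, proc) in state}
    if not executed:
        return False
    # Elements waiting for execution, other than the merge itself.
    waiting = {f[2] for f in state
               if f[0] == "cf" and f[3] == proc and f[2] != merge}
    for x in set(preds) - executed:
        for u in waiting:
            upstream = reaches(graph, u, x, merge) and not any(
                reaches(graph, u, e, merge) for e in executed)
            if upstream:
                return False
    return True


graph = {"g": ["a", "b"], "a": ["m"], "b": ["m"], "m": ["z"]}
state = frozenset({("cf", "a", "m", "p"), ("cf", "g", "b", "p")})
print(inclusive_merge_enabled(state, "m", ["a", "b"], graph, "p"))
```

In the usage lines the branch b is still pending and can only reach the merge, so the inclusive merge has to wait, whereas an exclusive merge would already be enabled in the same state.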

3.2.4 Item Flow

BP modeling must be able to represent the physical and the information items that are produced and consumed by the various activities during the execution of a process. For the formalization of the item flow semantics, we commit to the BPMN standard, where the so-called data objects are used to store information created and read by activities.

In our approach items are essentially regarded as variables, and hence at any time during the execution there is a single instance of a given item, which may be (over-)written by some activity. We consider two main types of relations between activities and items. First, an activity may use a particular item (input relation). This implies that the item is expected to hold a value before the activity is executed. Second, an activity may produce a particular value (output relation), causing the item to get a new value. If the item has no value yet, it is created; otherwise it is overwritten. It is worth noting that the item flow is not necessarily imposed over the control flow, but the two interact in defining the process behavior. For instance, an activity expecting a value from a given item may cause a deadlock if this condition is never satisfied.

The item flow is modeled through the wrtn fluent (wrtn stands for “written”), which represents the situation in which the item It has been produced by the activity A in the enactment of the process P. In order to handle item manipulation, the semantics of task enactment (Rules A1, A2) is extended as follows:

where holds if, at a state during the enactment of process , there exists some input item It for that has not been produced. Thus,

The case of compound activities can be treated in a similar way and is omitted.
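The interaction between item flow and control flow for atomic tasks can be illustrated with a small Python sketch. It is a loose rendering of the idea described above, not the extended rules themselves: the argument order of the wrtn tuple, the inputs and outputs dictionaries, the overwrite policy (dropping any earlier wrtn fluent for the same item), and the element names are all assumptions made for the example.

```python
# Assumed encoding: ("wrtn", A, It, P) records that activity A produced item It.

def missing_input(state, task, inputs, proc):
    """True if some input item of `task` has not been produced yet."""
    written = {f[2] for f in state if f[0] == "wrtn" and f[3] == proc}
    return any(item not in written for item in inputs.get(task, ()))


def begin_task(state, pred, task, inputs, proc):
    """Start `task` if it is waiting and all its input items are available."""
    token = ("cf", pred, task, proc)
    if token not in state or missing_input(state, task, inputs, proc):
        return None  # not enabled; may deadlock if inputs never arrive
    return (state - {token}) | {("en", task, proc)}


def complete_task(state, task, succ, outputs, proc):
    """Complete `task`, (over-)writing its output items."""
    running = ("en", task, proc)
    if running not in state:
        return None
    items = set(outputs.get(task, ()))
    # Single instance per item: drop earlier writes of the same items.
    stale = {f for f in state
             if f[0] == "wrtn" and f[3] == proc and f[2] in items}
    written = {("wrtn", task, item, proc) for item in items}
    return (state - stale - {running}) | written | {("cf", task, succ, proc)}


inputs = {"ship": ["order"]}
outputs = {"create": ["order"]}
s0 = frozenset({("cf", "start_ev", "create", "p")})
s1 = begin_task(s0, "start_ev", "create", inputs, "p")
s2 = complete_task(s1, "create", "ship", outputs, "p")
print(begin_task(s2, "create", "ship", inputs, "p") is not None)
```

If the task create did not declare order as an output, the call that starts ship would return None, which mirrors the deadlock situation mentioned in the text.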

4 Semantic Annotation

In the previous section we have shown how a procedural representation of a BPS can be modeled in our rule-based framework as an activity workflow. However, not all the relevant knowledge regarding process enactment is captured by a workflow model, which defines the planned order of operations but does not provide an explicit representation of the domain knowledge regarding the entities involved in such a process, i.e., the business environment in which processes are carried out.

Similarly to proposals like Semantic BPM [32] and Semantic Web Services [22], we will make use of semantic annotations to enrich the procedural knowledge specified by a BPS with domain knowledge expressed in terms of a given business reference ontology. Annotations provide two kinds of ontology-based information: (i) formal definitions of the basic entities involved in a process (e.g., activities, actors, items) to specify their meaning in an unambiguous way (terminological annotations), and (ii) specifications of preconditions and effects of the enactment of flow elements (functional annotations).

4.1 Reference Ontology

A business reference ontology is intended to capture the semantics of a business scenario in terms of the relevant vocabulary plus a set of axioms (TBox) that define the intended meaning of the vocabulary terms. In order to represent the semantic annotations of a BPAL BPS in a uniform way, we will consider ontologies falling within the OWL 2 RL profile (see Section 2.2), and hence expressible as sets of rules. An OWL 2 RL ontology is represented as a set of rules, consisting of a set of facts, called triples, encoding the specific OWL TBox, along with the rules that are common to all OWL 2 RL ontologies, such as the ones of Table 2.

In Table 6 we show an example of a business reference ontology for the annotation of the Handle Order process depicted in Figure 1. For the sake of conciseness and clarity, the axioms of the ontology are represented as DL expressions, instead of sets of triples. The translation into triple form can be done automatically as shown in [28, 59].

Actors
Organizational_Actor Actor Human_Actor Actor
Corporate_Customer Organizational_Actor Employee Human_Actor
Department Organizational_Actor Business_Partner Organizational_Actor
Accounting_Dpt Department Supply_Dpt Department
Order_Mgt_Dept Department Warehouse_Mgt Department
Carrier Organizational_Actor Courier Carrier Business_Partner
Carrier_Dpt Carrier Department
Objects
ClosedPO Purchase_Order ApprovedPO Purchase_Order
CancelledPO ClosedPO FulfilledPO ClosedPO
UnavailablePL Part_List AvailablePL Part_List
payment related payment Invoice
CancelledPO ApprovedPO UnavailablePL AvailablePL
ApprovedPO FulfilledPO Order CancelledPO
Processes
AuthorizingProcedure Process Transportation Process
Payment Process Invoicing Process
Communication Process Refuse Communication
Rejecting Authorizing_Procedure Accepting Authorizing_Procedure
Relations
member related content related
destination related member Human_Actor
member Organizational_Actor
Table 6: Business Reference Ontology excerpt
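To give a concrete feel for an ontology expressed as triples plus rules, the following Python sketch saturates a small triple set with two standard OWL 2 RL rules: transitivity of rdfs:subClassOf (scm-sco) and propagation of rdf:type along rdfs:subClassOf (cax-sco). Note that it uses naive forward chaining, whereas the framework described here relies on goal-oriented evaluation by a Logic Programming engine. The subclass triples assume that the corresponding entries of Table 6 are subclass axioms, and the individual acc1 is hypothetical.

```python
SUB = "rdfs:subClassOf"
TYPE = "rdf:type"


def saturate(triples):
    """Forward-chain scm-sco and cax-sco until no new triple is derived."""
    facts = set(triples)
    changed = True
    while changed:
        new = set()
        for (s, p, o) in facts:
            if p != SUB:
                continue
            for (s2, p2, o2) in facts:
                if p2 == SUB and s2 == o:
                    # scm-sco: s subClassOf o, o subClassOf o2 => s subClassOf o2
                    new.add((s, SUB, o2))
                if p2 == TYPE and o2 == s:
                    # cax-sco: s2 type s, s subClassOf o => s2 type o
                    new.add((s2, TYPE, o))
        changed = not new <= facts
        facts |= new
    return facts


tbox = {
    ("Accounting_Dpt", SUB, "Department"),
    ("Department", SUB, "Organizational_Actor"),
    ("Organizational_Actor", SUB, "Actor"),
}
abox = {("acc1", TYPE, "Accounting_Dpt")}  # hypothetical individual
closure = saturate(tbox | abox)
print(("acc1", TYPE, "Actor") in closure)
```

Forward chaining is chosen here only because it fits in a few lines; it materializes every consequence, while a goal-oriented procedure derives just what a given query needs.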

4.2 Terminological Annotation

A terminological annotation associates elements of a BPS with concepts of a reference ontology, in order to describe the former in terms of a suitable conceptualization of the underlying business domain provided by the latter. This association is specified by a set of OWL assertions, each relating a BPS element BpsEl to a concept Concept through a property termRef, where:

  • BpsEl is an element of a BPS;

  • Concept is either i) a named concept defined in the ontology, e.g., Purchase_Order, or ii) a complex concept, defined by a class expression, e.g., Rejecting ⊓ ∃content.Purchase_Order;

  • termRef is an OWL property name.

We do not assume that every BPS element is annotated, nor that every concept is the meaning associated with some BPS element. Furthermore, different BPS elements could be annotated with respect to the same concept, to provide an alignment of the different terminologies and conceptualizations used in different BP schemas. E.g., the activities bill_client and issue_invoice occurring in different processes may actually refer to the same notion, suitably defined in the ontology.

Example 1.

Examples of annotations related to the Handle Order process (Figure 1) are listed below. The item order is annotated with the Purchase_Order concept, while the participant shipper is annotated with the concept Carrier, which can be either an internal Department or a Business_Partner. A sales_clerk is defined as an Employee who is part of the Order_Mgt_Dpt. The task delivering is defined as a Transportation related to some sort of Product. Finally, notify_rejection represents a Communication with a Corporate_Customer and, in particular, a Refuse related to a Purchase_Order.






4.3 Functional Annotation

By using the ontology vocabulary and axioms, we define semantic annotations for modeling the behavior of individual process elements in terms of preconditions under which a flow element can be executed, and effects on the state of the world after its execution. Preconditions and effects, collectively called functional annotations, can be used, for instance, to model input/output relations of activities with business entities. Fluents can represent the properties of a business entity affected by the execution of an activity at a given time during the execution of the process. A precondition specifies the properties a business entity must possess when an activity is enabled to start, and an effect specifies the properties of a business entity after the activity has been completed. These aspects are only partially supported by current BP modeling notations, such as BPMN, in terms of data objects representing information storage during the BP enactment.

Functional annotations are formulated by means of the following relations:

  • $pre(A, C, P)$, which specifies a fluent expression $C$, called enabling condition, that must hold to execute an element $A$ in the process $P$;

  • $eff(A, Q, E^-, E^+, P)$, which specifies the set $E^-$ of fluents, called negative effects, that do not hold after the execution of $A$ and the set of fluents $E^+$, called positive effects, that hold after the execution of $A$ in the process $P$. $Q$ is a fluent expression that must hold to complete the activity $A$. We assume that $E^-$ and $E^+$ are disjoint sets of fluents, and the variables occurring in them also occur in $Q$.

  • $c\_seq(G, B, Y, P)$, which models a conditional sequence flow used to select the set of successors of decision points. $G$ is a guard associated with the exclusive or inclusive branch gateway $B$, i.e., a fluent expression that must hold in order to enable the flow element $Y$, successor of $B$ in the process $P$. We also have the rule $seq(B, Y, P) \leftarrow c\_seq(G, B, Y, P)$.

The enabling conditions, the guards and the negative and positive effects occurring in functional annotations are fluent expressions built from fluents of the form , corresponding to the OWL statement , where we adopt the usual rdf, rdfs, and owl prefixes for names in the OWL vocabulary, and the bro prefix for names relative to our specific examples. We assume that the fluents appearing in functional annotations are either of the form , corresponding to the unary atom , or of the form , corresponding to the binary atom , where and are individuals, while and are concepts and properties, respectively, defined in the reference ontology . Thus, fluents correspond to assertions about individuals, i.e., assertions belonging to the ABox of the ontology, and hence the ABox may change during process enactment due to the effects specified by the functional annotations, while , providing the ontology definitions and axioms, i.e., the TBox of the ontology, does not change.

Let us now present an example of specification of functional annotations. In particular, our example shows nondeterministic effects, that is, a case where a flow element is associated with more than one pair of negative and positive effects.

Example 2.

Consider again the Handle Order process shown in Figure 1. After the execution of create_order, a purchase order is issued. This order can be approved or canceled upon execution of the activities accept_order and cancel_order, respectively. Depending on the inventory capacity checked during the check_inventory task, the missing parts are requisitioned from an external supplier (parts_auction). Once all the order parts are available, the order can be fulfilled and an invoice is associated with the order. This behavior is specified by the functional annotations reported in Table 7.

[Table 7: Functional annotation for the Handle Order process. The first part has columns Flow Element, Enabling Condition (pre), and Effects (eff, given as a condition Q and effect sets E), with rows for create_order, accept_order, cancel_order, check_inventory, parts_auction, bill_client, and payment. The second part has columns Branch, Successor, and Guard, with rows for the pairs (g1, g3), (g1, g2), (g3, parts_auction), (g5, g2), and (g5, select_shipper).]

4.3.1 Formal Semantics of Functional Annotations

In the presence of functional annotations, the enactment of a BPS is modeled by extending the result relation so as to take into account the pre and eff relations. We only consider the case of task execution. The other cases are similar and will be omitted.

Given a state $S_1$, a flow element $A$ can be enacted if $A$ is waiting for execution according to the control flow semantics, and its enabling condition $C$ is satisfied, i.e., $holds(C, S_1)$ is true. Moreover, given an annotation $eff(A, Q, E^-, E^+, P)$, when $A$ is completed in a given state $S_1$, then a new state $S_2$ is obtained by taking out from $S_1$ the set $E^-$ of fluents and then adding the set $E^+$ of fluents. The execution of tasks considering functional annotations is then defined as:


Note that, since the variables occurring in $E^-$ and $E^+$ are included in those of $Q$, the evaluation of $Q$ binds these variables to constants.
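For ground annotations, the extended task semantics can be sketched in Python as follows. This is only an approximation of the idea: it does not handle variables (so the binding step mentioned above is skipped), the tuple encoding of fluents and the tf tag for ontology-related fluents are assumptions, and the annotations in the usage lines are invented for illustration rather than taken from Table 7.

```python
def holds(expr, state):
    """Closed-world satisfaction of a ground fluent expression."""
    if expr == "true":
        return True
    if isinstance(expr, tuple) and len(expr) == 2 and expr[0] == "not":
        return not holds(expr[1], state)
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "and":
        return holds(expr[1], state) and holds(expr[2], state)
    return expr in state


def begin(state, pred, task, pre, proc):
    """Start `task` if it is waiting and its enabling condition holds."""
    token = ("cf", pred, task, proc)
    if token in state and holds(pre.get(task, "true"), state):
        return (state - {token}) | {("en", task, proc)}
    return None


def complete(state, task, succ, eff, proc):
    """Yield one next state per applicable (Q, E-, E+) annotation of `task`."""
    running = ("en", task, proc)
    if running not in state:
        return
    for cond, neg, pos in eff.get(task, [("true", set(), set())]):
        if holds(cond, state):
            # Remove negative effects, then add positive effects.
            base = (state - {running}) - set(neg)
            yield base | set(pos) | {("cf", task, succ, proc)}


# Invented ground annotations for a hypothetical order o1.
po = ("tf", "o1", "rdf:type", "bro:Purchase_Order")
approved = ("tf", "o1", "rdf:type", "bro:ApprovedPO")
pre = {"accept_order": po}
eff = {"accept_order": [("true", set(), {approved})]}

s0 = frozenset({("cf", "create_order", "accept_order", "p"), po})
s1 = begin(s0, "create_order", "accept_order", pre, "p")
print([approved in s for s in complete(s1, "accept_order", "g1", eff, "p")])
```

Representing the effects of a task as a list of alternatives makes nondeterministic effects straightforward: each alternative whose condition holds contributes one successor state.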

Similarly, the semantics of inclusive and exclusive branches is extended to evaluate the associated guard expressions, in order to determine the set of successors to be enabled. The execution of decision points is then defined as: