# Verification of Agent-Based Artifact Systems

## Abstract

Artifact systems are a novel paradigm for specifying and implementing business processes described in terms of interacting modules called artifacts. Artifacts consist of data and lifecycles, accounting respectively for the relational structure of the artifacts’ states and their possible evolutions over time. In this paper we put forward artifact-centric multi-agent systems, a novel formalisation of artifact systems in the context of multi-agent systems operating on them. Differently from the usual process-based models of services, the semantics we give explicitly accounts for the data structures on which artifact systems are defined.

We study the model checking problem for artifact-centric multi-agent systems against specifications written in a quantified version of temporal-epistemic logic expressing the knowledge of the agents in the exchange. We begin by noting that the problem is undecidable in general. We then identify two noteworthy restrictions, one syntactical and one semantical, that enable us to find bisimilar finite abstractions and therefore reduce the model checking problem to the instance on finite models. Under these assumptions we show that the model checking problem for these systems is EXPSPACE-complete. We then introduce artifact-centric programs, compact and declarative representations of the programs governing both the artifact system and the agents. We show that, while these in principle generate infinite-state systems, under natural conditions their verification problem can be solved on finite abstractions that can be effectively computed from the programs. Finally we exemplify the theoretical results of the paper through a mainstream procurement scenario from the artifact systems literature.

## 1 Introduction

Much of the work in the area of reasoning about knowledge involves the development of formal techniques for the representation of epistemic properties of rational actors, or agents, in a multi-agent system (MAS). The approaches based on modal logic are often rooted in interpreted systems [Parikh & Ramanujam, 1985], a computationally grounded semantics [Wooldridge, 2000] used for the interpretation of several temporal-epistemic logics. This line of research was thoroughly explored in the 1990s, leading to a significant body of work [Fagin et al., 1995]. Further significant explorations have been conducted since then; a recent topic of interest has been the development of automatic techniques, including model checking [Clarke et al., 1999], for the verification of temporal-epistemic specifications for the autonomous agents in a MAS [Gammie & van der Meyden, 2004; Kacprzak et al., 2008; Lomuscio et al., 2009]. This has led to developments in a number of areas traditionally outside artificial intelligence, knowledge representation and MAS, including security [Dechesne & Wang, 2010; Ciobaca et al., 2012], web services [Lomuscio et al., 2010] and cache-coherence protocols in hardware design [Baukus & van der Meyden, 2004]. The ambition of the present paper is to offer a similar change of perspective in the area of artifact systems [Cohn & Hull, 2009], a growing topic in Service-Oriented Computing (SOC).

Artifacts are structures that “combine data and process in an holistic manner as the basic building block[s]” [Cohn & Hull, 2009] of systems’ descriptions. Artifact systems are services constituted by complex workflow schemes based on artifacts which the agents interact with. The data component is given by the relational databases underpinning the artifacts in a system, whereas the workflows are described by “lifecycles” associated with each artifact schema. While in the standard services paradigm services are made public by exposing their process interfaces, in artifact systems both the data structures and the lifecycles are advertised. Services are composed in a “hub” where operations on the artifacts are executed. Implementations of artifact systems, such as the IBM engine Barcelona [Heath et al., 2011], provide a hub where the service choreography and service orchestration [Alonso et al., 2004] are carried out.

While artifact systems are beginning to drive new application areas, such as case management systems [Marin et al., 2012], we identify two shortcomings in the present state-of-the-art. Firstly, the artifact systems literature [Bhattacharya et al., 2007; Deutsch et al., 2009; Hull, 2008; Nooijen et al., 2012] focuses exclusively on the artifacts themselves. While there is obviously a need to model and implement the artifact infrastructure, importantly we also need to account for the agents implementing the services acting on the artifact system. This is of particular relevance given that artifact systems are envisaged to play a leading role in information systems. We need to be able to reason not just about the artifact states but also about what actions specific participants are allowed and not allowed to do, what knowledge they can or cannot derive in a system run, what system state they can achieve in coordination with their peers, etc. In other words, we need to move from the description of the artifact infrastructure to one that encompasses both the agents and the infrastructure.

Secondly, there is a pressing demand to provide the hub with automatic choreography and orchestration capabilities. It is well-known that choreography techniques can leverage automatic model checking techniques; orchestration can be recast as a synthesis problem, which, in turn, can also benefit from model checking technology. However, while model checking and its applications are relatively well-understood in plain process-based modelling, the presence of data makes these problems much harder, and they remain virtually unexplored. Additionally, infinite domains in the underlying databases lead to infinite state-spaces and undecidability of the model checking problem.

The aim of this paper is to make a concerted contribution to both problems above. Firstly, we provide a computationally grounded semantics to systems comprising the artifact infrastructure and the agents operating on it. We use this semantics to interpret a temporal-epistemic language with first-order quantifiers to reason about the evolution of the hub as well as the knowledge of the agents in the presence of evolving, structured data. We observe that the model checking problem for these structures is undecidable in general and analyse two notable decidable fragments. In this context, a contribution we make is to provide finite abstractions to infinite-state artifact systems, thereby presenting a technique for their effective verification for a class of declarative agent-based, artifact-centric programs that we here define. We evaluate this methodology by studying its computational complexity and by demonstrating its use on a well-known scenario from the artifact systems literature.

### 1.1 Artifact-Centric Systems

Service-oriented computing is concerned with the study and development of distributed applications that can be automatically discovered and composed by means of remote interfaces. A point of distinction over more traditional distributed systems is the interoperability and connectedness of services and the shared format for both data and remote procedure calls. Two technology-independent concepts permeate the service-oriented literature: orchestration and choreography [\BCAYAlonso, Casati, Kuno, \BBA MachirajuAlonso et al.2004, \BCAYSingh \BBA HuhnsSingh \BBA Huhns2005]. Orchestration involves the ordering of actions of possibly different services, facilitated by a controller or orchestrator, to achieve a certain overall goal. Choreography concerns the distributed coordination of different actions through publicly observable events to achieve a certain goal. A MAS perspective [\BCAYWooldridgeWooldridge2001] is known to be particularly helpful in service-oriented computing in that it allows us to ascribe information states and private or common goals to the various services. Under this view the agents of the system implement the services and interact with one another in a shared infrastructure or environment.

A key theoretical problem in SOC is to devise effective mechanisms to verify that service composition is correct according to some specification. Techniques based on model checking [Clarke et al., 1999] and synthesis [Berardi et al., 2008] have been put forward to solve the composition and orchestration problem for services described and advertised at interface level through finite state machines [Calvanese et al., 2008]. More recently, attention has turned to services described by languages such as WS-BPEL [Alves et al., 2007], which provide potentially unbounded variables in the description of the service process. Again, model checking approaches have successfully been used to verify complex service compositions [Bertoli et al., 2010; Lomuscio et al., 2012].

While WS-BPEL provides a model for services with variables, the data referenced by them is non-permanent. The area of data-centric workflows [Hull et al., 2009; Nigam & Caswell, 2003] evolved as an attempt to provide support for permanent data, typically present in the form of underlying databases. Although usually abstracted away, permanent data is of central importance to services, which typically query data sources and are driven by the answers they obtain; see, e.g., [Berardi et al., 2005]. Therefore, a faithful model of a service behavior cannot, in general, disregard this component. In response to this, proposals have been made in the workflows and service communities in terms of declarative specifications of data-centric services that are advertised for automatic discovery and composition. The artifact-centric approach [Cohn & Hull, 2009] is now one of the leading emerging paradigms in the area. As described in [Hull, 2008; Hull et al., 2011], artifact-centric systems can be presented along four dimensions.

Artifacts are the holders of all structured information available in the system. In a business-oriented scenario this may include purchase orders, invoices, payment records, etc. Artifacts may be created, amended, and destroyed at run time; however, abstract artifact schemas are provided at design time to define the structure of all artifacts to be manipulated in the system. Intuitively, external events cause changes in the system, including in the value of artifact attributes.

The evolution of artifacts is governed by lifecycles. These capture the changes that an artifact may go through from creation to deletion. Intuitively, a purchase order may be created, amended and operated on by several events before it is fulfilled and its existence in the system terminated: a lifecycle associated with a purchase order artifact formalises these transitions.

Services are seen as the actors operating on the artifact system. They represent both human and software actors, possibly distributed, that generate events on the artifact system. Some services may “own” artifacts, and some artifacts may be shared by several services. However, not all artifacts, or parts of artifacts, are visible to all services. Views and windows respectively determine which parts of artifacts and which artifact instances are visible to which service. An artifact hub is a system that maintains the artifact system and processes the events generated by the services.

Services generate events on the artifact system according to associations. Typically these are declarative descriptions providing the preconditions and postconditions for the generation of events. These generate changes in the artifact system according to the artifact lifecycles. Since events may trigger changes in several artifacts in the system, events are processed by a well-defined semantics [Damaggio et al., 2011; Hull et al., 2011] that governs the sequence of changes an artifact system may undertake upon consumption of an event. Such a semantics, based on the use of Prerequisite-Antecedent-Consequent (PAC) rules, ensures acyclicity and full determinism in the updates on the artifact system. GSM (Guard-Stage-Milestone) is a declarative language that can be used to describe artifact systems. Barcelona is an engine that can be used to run a GSM-based artifact-centric system [Heath et al., 2011].

As will become clear in the next section, in line with the agent-based approach to services, we will use agent-based concepts to model services. The artifact system will be represented as an environment, constituted by evolving databases, upon which the agents operate; lifecycles and associations will be modelled by local and global transition functions. The model is intended to incorporate all artifact-related concepts including views and windows.

In view of the above, in this paper we address the following questions. How can we give a transition-based semantics for artifacts and agents operating on them? What syntax should we use to specify properties of the agents and the artifacts themselves? Can we verify that an artifact system satisfies certain properties? As this will be shown to be undecidable, can we find suitable fragments on which this can actually be carried out? If so, what is the resulting complexity? Lastly, can we provide declarative specifications for the agent programs so that these can be verified by model checking? Can this technique be used on mainstream scenarios from the SOC literature?

This paper aims to contribute to answering these questions.

### 1.2 Related Work

As stated above, virtually all current literature on artifact-centric systems focuses on properties and implementations of the artifact-system as such. Little or no attention is given to the actors on the system, whether they are human or artificial agents. A few formal techniques have, however, been put forward to verify the core, non-agent aspects of the system; in the following we briefly compare these to this contribution.

To our knowledge the verification of artifact-centric business processes was first discussed in [Bhattacharya et al., 2007], where reachability and deadlocks are phrased in the context of artifact-centric systems and complexity results for the verification problem are given. The present contribution differs markedly from [Bhattacharya et al., 2007] by employing a more expressive specification language, even if the agent-related aspects are not considered, and by putting forward effective abstraction procedures for verification.

In [Gerede & Su, 2007] a verification technique for artifact-centric systems against a variant of computation-tree logic is put forward. The decidability of the verification problem is proven for the language considered under the assumption that the interpretation domain is bounded. Decidability is also shown for the unbounded case by making restrictions on the values that quantified variables can range over. In the work here presented we also work on unbounded domains, but do not require the restrictions present in [Gerede & Su, 2007]: we only insist on the fact that the number of distinct values in the system does not exceed a given threshold at any point in any run. Most importantly, the interplay between quantification and modalities here considered allows us to bind and use variables in different states. This is a major difference as this feature is very expressive and known to lead to undecidability.

A related line of research is followed in [Deutsch et al., 2009; Damaggio et al., 2012], where the verification problem for artifact systems against two variants of first-order linear-time temporal logic is considered. Decidability of the verification problem is retained by imposing syntactic restrictions on both the system descriptions and the specifications to check. This effectively limits the way in which new values introduced at every computational step can be used by the system. Properties based on arithmetic operators are considered in [Damaggio et al., 2012]. While there are elements of similarity between these approaches and the one we put forward here, including the fact that the concrete interpretation domain is replaced by an abstract one, the contribution here presented has significant differences from these. Firstly, our setting is branching-time and not linear-time, thereby resulting in different expressive power. Secondly, differently from [Deutsch et al., 2009; Damaggio et al., 2012], we impose no constraints on nested quantifiers. In contrast, [Damaggio et al., 2012] admits only universal quantification over combinations of quantifier-free first-order formulas. Thirdly, the abstraction results we present here are given in general terms on the semantics of declarative programs and do not depend on a particular presentation of the system.

More closely related to the present contribution is [Hariri et al., 2012], where conditions for the decidability of the model checking problem for data-centric dynamic systems, i.e., dynamic systems with relational states, are given. In this case the specification language used is a first-order version of the $\mu$-calculus. While our temporal fragment is subsumed by the $\mu$-calculus, since we use indexed epistemic modalities as well as a common knowledge operator, the two specification languages have different expressive power. To retain decidability, like we do here, the authors assume a constraint on the size of the states. However, differently from the contribution here presented, [Hariri et al., 2012] assume limited forms of quantification whereby only individuals persisting in the system evolution can be quantified over. In this contribution we do not make this restriction.

Irrespective of the above, the most important feature that characterises our work is that the set-up is entirely based on epistemic logic and multi-agent systems. We use agents to represent the autonomous services operating in the system, and agent-based concepts play a key role in the modelling, the specifications, and the verification techniques put forward. Differently from all approaches presented above, we are not only concerned with whether the artifact system meets a particular specification. Instead, we also wish to consider what knowledge the agents in the system acquire by interacting among themselves and with the artifact system during a system run. Additionally, the abstraction methodology put forward is modular with respect to the agents in the system. These features enable us to give constructive procedures for the generation of finite abstractions for artifact-centric programs associated with infinite models. We are not aware of any work in the literature tackling any of these aspects.

Relation to previous work by the authors. This paper combines and expands preliminary results originally discussed in [Belardinelli et al., 2011a], [Belardinelli et al., 2011b], [Belardinelli et al., 2012a], and [Belardinelli et al., 2012b]. In particular, the technical set up of artifacts and agents is different from that of our preliminary studies and makes it more natural to express artifact-centric concepts such as views. Differently from our previous attempts we here incorporate an operator for common knowledge and provide constructive methods to define abstractions for all notions of bisimulation. We also consider the complexity of the verification problem, previously unexplored, and evaluate the technique in detail on a case study.

### 1.3 Scheme of the Paper

The rest of the paper is organised as follows. In Section 2 we introduce Artifact-Centric Multi-Agent Systems (AC-MAS), the semantics we will be using throughout the paper to describe agents operating on an artifact system. In the same section we put forward FO-CTLK, a first-order logic with knowledge and time to reason about the evolution of the knowledge of the agents and the artifact system. This enables us to propose a satisfaction relation based on the notion of bounded quantification, define the model checking problem, and highlight some properties of isomorphic states.

An immediate result we will explore concerns the undecidability of the model checking problem for AC-MAS in their general setting. Section 3 is concerned with syntactical restrictions on FO-CTLK that enable us to guarantee the existence of finite abstractions of infinite-state AC-MAS, thereby making the model checking problem feasible by means of standard techniques.

Section 4 tackles restrictions orthogonal to those of Section 3 by focusing on a subclass of AC-MAS that admits a decidable model checking problem when considering full FO-CTLK specifications. The key finding here is that bounded and uniform AC-MAS, a class identified by studying a strong bisimulation relation, admit finite abstractions for any FO-CTLK specification. The section concludes by showing that under these restrictions the model checking problem is EXPSPACE-complete.

We turn our attention to artifact programs in Section 6 by defining the concept of artifact-centric programs. We define them through natural, first-order preconditions and postconditions in line with the artifact-centric approach. We give a semantics to them in terms of AC-MAS and show that their generated models are precisely those uniform AC-MAS studied earlier in the paper. It follows that, under some boundedness conditions, which can be naturally expressed, the model checking problem for artifact-centric programs is decidable and can be executed on finite models.

Section 7 reports a scenario from the artifact systems literature. This is used to exemplify the technique by providing finite abstractions that can be effectively verified.

We conclude in Section 8 where we consider the limitations of the approach and point to further work.

## 2 Artifact-Centric Multi-Agent Systems

In this section we formalise artifact-centric systems and state their verification problem. As data and databases are important constituents of artifact systems, our formalisation of artifacts relies on them as underpinning concepts. However, as discussed in the previous section, we here give prominence to agent-based concepts. As such, we define our systems as comprising both the artifacts in the system as well as the agents that interact with the system.

A standard paradigm for logic-based reasoning about agent systems is interpreted systems [Parikh & Ramanujam, 1985; Fagin et al., 1995]. In this setting agents are endowed with private local states and evolve by performing actions according to an individual protocol. As data play a key part, as well as to allow us to specify properties of the artifact system, we will define the agents’ local states as evolving database instances. We call this formalisation artifact-centric multi-agent systems (AC-MAS). AC-MAS enable us to represent naturally and concisely concepts much used in the artifact paradigm such as the one of view discussed earlier.

Our specification language will include temporal-epistemic logic but also quantification over a domain so as to represent the data. This is an unusual verification setting, so we will formally define the model checking problem for this set-up.

### 2.1 Databases and First-Order Logic

As discussed above, we use databases as the basic building blocks for defining the states of the agents and the artifact system. We here fix the notation and terminology used. We refer to [Abiteboul et al., 1995] for more details on databases.

###### Definition 2.1 (Database Schemas)

A (relational) database schema $\mathcal{D}$ is a set $\{P_1/q_1, \ldots, P_n/q_n\}$ of relation symbols $P_i$, each associated with its arity $q_i \in \mathbb{N}$.

Instances of database schemas are defined over interpretation domains.

###### Definition 2.2 (Database Instances)

Given an interpretation domain $U$ and a database schema $\mathcal{D}$, a $\mathcal{D}$-instance over $U$ is a mapping $D$ associating each relation symbol $P_i \in \mathcal{D}$ with a finite $q_i$-ary relation over $U$, i.e., $D(P_i) \subseteq U^{q_i}$.

The set of all $\mathcal{D}$-instances over an interpretation domain $U$ is denoted by $\mathcal{D}(U)$. We simply refer to “instances” whenever the database schema is clear from the context. The active domain of an instance $D$, denoted as $adom(D)$, is the set of all individuals in $U$ occurring in some tuple of some predicate interpretation $D(P_i)$. Observe that, since $\mathcal{D}$ contains a finite number of relation symbols and each $D(P_i)$ is finite, so is $adom(D)$.
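The notions of schema, instance and active domain can be illustrated with a minimal Python sketch. The encoding (dictionaries from relation names to arities and to finite sets of tuples) and the relation names `Order` and `Paid` are assumptions made here for illustration only; they do not come from the formal development.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical encoding: a schema maps relation names to arities,
# an instance maps relation names to finite sets of tuples.
Schema = Dict[str, int]
Instance = Dict[str, FrozenSet[Tuple[str, ...]]]


def is_instance(schema: Schema, inst: Instance) -> bool:
    """Check that inst interprets exactly the symbols of schema, with matching arities."""
    return set(inst) == set(schema) and all(
        len(t) == schema[name] for name, rel in inst.items() for t in rel
    )


def adom(inst: Instance) -> FrozenSet[str]:
    """Active domain: every individual occurring in some tuple of some relation."""
    return frozenset(u for rel in inst.values() for t in rel for u in t)


# Illustrative schema and instance (names are made up for this example).
schema: Schema = {"Order": 2, "Paid": 1}
inst: Instance = {
    "Order": frozenset({("o1", "alice"), ("o2", "bob")}),
    "Paid": frozenset({("o1",)}),
}
```

Because every relation is a finite set, `adom(inst)` is finite even when the interpretation domain is infinite, which mirrors the observation above.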

To fix the notation, we recall the syntax of first-order formulas with equality and no function symbols. Let $Var$ be a countable set of individual variables and $Con$ be a finite set of individual constants. A term is any element $t \in Var \cup Con$.

###### Definition 2.3 (FO-formulas over D)

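Given a database schema $\mathcal{D}$, the formulas $\varphi$ of the first-order language $\mathcal{L}_{\mathcal{D}}$ are defined by the following BNF grammar:

$$\varphi ::= t = t' \mid P_i(t_1,\ldots,t_{q_i}) \mid \neg\varphi \mid \varphi \to \varphi \mid \forall x\,\varphi$$

where $P_i \in \mathcal{D}$, $t_1,\ldots,t_{q_i}$ is a $q_i$-tuple of terms and $t, t'$ are terms.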

We assume “$=$” to be a special binary predicate with fixed obvious interpretation. To summarise, $\mathcal{L}_{\mathcal{D}}$ is a first-order language with equality over the relational vocabulary $\mathcal{D}$ with no function symbols and with finitely many constant symbols from $Con$. Observe that considering a finite set of constants is not a limitation. Indeed, since we will be working with finite sets of formulas, $Con$ can always be defined so as to be able to express any formula of interest.

In the following we use the standard abbreviations $\exists$, $\wedge$, $\vee$, and $\neq$. Also, free and bound variables are defined as standard. For a formula $\varphi$ we denote the set of its variables as $var(\varphi)$, the set of its free variables as $free(\varphi)$, and the set of its constants as $con(\varphi)$. We write $\varphi(\vec{x})$ to list explicitly in arbitrary order all the free variables $x_1,\ldots,x_\ell$ of $\varphi$. By slight abuse of notation, we treat $\vec{x}$ as a set, thus we write $\vec{x} = free(\varphi)$. A sentence is a formula with no free variables.
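For reference, and assuming the abbreviations are the usual ones for a language whose primitives are negation, implication, universal quantification and equality, they can be unfolded as follows. The metavariable $\psi$ is introduced here only for this illustration.

```latex
\begin{align*}
\exists x\,\varphi    &:= \neg \forall x\, \neg \varphi \\
\varphi \wedge \psi   &:= \neg(\varphi \to \neg \psi) \\
\varphi \vee \psi     &:= \neg \varphi \to \psi \\
t \neq t'             &:= \neg (t = t')
\end{align*}
```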

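Given an interpretation domain $U$ such that $Con \subseteq U$, an assignment is a function $\sigma : Var \to U$. For an assignment $\sigma$, we denote by $\sigma\binom{x}{u}$ the assignment such that: (i) $\sigma\binom{x}{u}(x) = u$; and (ii) $\sigma\binom{x}{u}(x') = \sigma(x')$, for every $x' \in Var$ different from $x$. For convenience, we extend assignments to constants so that $\sigma(t) = t$, if $t \in Con$; that is, we assume a Herbrand interpretation of constants. We can now define the semantics of $\mathcal{L}_{\mathcal{D}}$.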

###### Definition 2.4 (Satisfaction of FO-formulas)

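Given a $\mathcal{D}$-instance $D$, an assignment $\sigma$, and an FO-formula $\varphi \in \mathcal{L}_{\mathcal{D}}$, we inductively define whether $D$ satisfies $\varphi$ under $\sigma$, written $(D,\sigma) \models \varphi$, as follows:

$$\begin{array}{lcl}
(D,\sigma) \models P_i(t_1,\ldots,t_{q_i}) & \text{iff} & \langle \sigma(t_1),\ldots,\sigma(t_{q_i}) \rangle \in D(P_i) \\
(D,\sigma) \models t = t' & \text{iff} & \sigma(t) = \sigma(t') \\
(D,\sigma) \models \neg\varphi & \text{iff} & \text{it is not the case that } (D,\sigma) \models \varphi \\
(D,\sigma) \models \varphi \to \psi & \text{iff} & (D,\sigma) \models \neg\varphi \text{ or } (D,\sigma) \models \psi \\
(D,\sigma) \models \forall x\,\varphi & \text{iff} & \text{for all } u \in adom(D), \text{ we have that } (D,\sigma\binom{x}{u}) \models \varphi
\end{array}$$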

A formula $\varphi$ is true in $D$, written $D \models \varphi$, iff $(D,\sigma) \models \varphi$, for all assignments $\sigma$.

Observe that we adopt an active-domain semantics, that is, quantified variables range only over the active domain of $D$. Also notice that constants are interpreted rigidly; so, two constants are equal if and only if they are syntactically the same. In the rest of the paper, we assume that every interpretation domain includes $Con$. Also, as a usual shortcut, we write $(D,\sigma) \not\models \varphi$ to express that it is not the case that $(D,\sigma) \models \varphi$.
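The satisfaction relation with active-domain quantification can be sketched as a small recursive evaluator. This is only an illustration under assumptions made here: formulas are encoded as nested tuples with the hypothetical tags `"rel"`, `"eq"`, `"not"`, `"imp"` and `"forall"`, instances are dictionaries from relation names to sets of tuples, and any term not bound by the assignment is treated as a constant denoting itself.

```python
def val(term, sigma):
    # Variables are looked up in the assignment; any other term is read as a
    # constant that denotes itself (Herbrand interpretation of constants).
    return sigma.get(term, term)


def adom(inst):
    # Active domain: individuals occurring in some tuple of some relation.
    return {u for rel in inst.values() for t in rel for u in t}


def sat(inst, sigma, phi):
    """Decide whether (inst, sigma) satisfies phi, one clause per connective."""
    op = phi[0]
    if op == "rel":      # ("rel", name, (t1, ..., tq))
        return tuple(val(t, sigma) for t in phi[2]) in inst[phi[1]]
    if op == "eq":       # ("eq", t, t')
        return val(phi[1], sigma) == val(phi[2], sigma)
    if op == "not":      # ("not", phi)
        return not sat(inst, sigma, phi[1])
    if op == "imp":      # ("imp", phi, psi)
        return (not sat(inst, sigma, phi[1])) or sat(inst, sigma, phi[2])
    if op == "forall":   # ("forall", x, phi): x ranges over the active domain only
        x, body = phi[1], phi[2]
        return all(sat(inst, {**sigma, x: u}, body) for u in adom(inst))
    raise ValueError(f"unknown connective: {op}")


inst = {
    "Order": {("o1", "alice"), ("o2", "bob")},
    "Paid": {("o1",)},
}

# "Every paid item has an order": forall x (Paid(x) -> exists y Order(x, y)),
# with the existential unfolded as not-forall-not.
paid_has_order = (
    "forall", "x",
    ("imp",
     ("rel", "Paid", ("x",)),
     ("not", ("forall", "y", ("not", ("rel", "Order", ("x", "y")))))),
)

# "Everything is paid": forall x Paid(x).
all_paid = ("forall", "x", ("rel", "Paid", ("x",)))
```

Since quantifiers range over `adom(inst)`, which is finite, the evaluator always terminates, regardless of how large the interpretation domain is.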

Finally, we introduce the $\oplus$ operator on $\mathcal{D}$-instances that will be used later in the paper. Let the primed version of a database schema $\mathcal{D}$ be the schema $\mathcal{D}' = \{P'_1/q_1,\ldots,P'_n/q_n\}$ obtained from $\mathcal{D}$ by syntactically replacing each predicate symbol $P_i$ with its primed version $P'_i$ of the same arity.

###### Definition 2.5 (⊕ Operator)

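Given two $\mathcal{D}$-instances $D$ and $D'$, we define $D \oplus D'$ as the $(\mathcal{D} \cup \mathcal{D}')$-instance such that $D \oplus D'(P_i) = D(P_i)$ and $D \oplus D'(P'_i) = D'(P_i)$.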

Intuitively, the $\oplus$ operator defines a disjunctive join of the two instances, where relation symbols in $\mathcal{D}$ are interpreted according to $D$, while their primed versions are interpreted according to $D'$.
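A minimal sketch of the $\oplus$ operator, assuming instances are encoded as dictionaries from relation names to sets of tuples and that the primed version of a symbol is written by appending an apostrophe to its name (both are conventions chosen here, not taken from the formal development):

```python
def primed(name: str) -> str:
    # Hypothetical naming convention for the primed copy of a relation symbol.
    return name + "'"


def oplus(d, d_prime):
    """Combine two instances of the same schema into a single instance over
    the schema together with its primed version."""
    if set(d) != set(d_prime):
        raise ValueError("both instances must be over the same schema")
    joined = {name: set(rel) for name, rel in d.items()}
    joined.update({primed(name): set(rel) for name, rel in d_prime.items()})
    return joined


before = {"Paid": {("o1",)}}
after = {"Paid": {("o1",), ("o2",)}}
both = oplus(before, after)
```

One plausible use, assumed here, is to evaluate a single first-order formula that mentions both unprimed and primed symbols, so that it relates the contents of two instances at once.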

### 2.2 Artifact-Centric Multi-Agent Systems

In the following we introduce the semantic structures that we will use throughout the paper. We define an artifact-centric multi-agent system as a system comprising an environment representing all interacting artifacts in the system and a finite set of agents interacting with such environment. As agents have views of the artifact state, i.e., projections of the status of particular artifacts, we assume the building blocks of their private local states also to be modelled as database instances. In line with the interpreted systems semantics [Fagin et al., 1995] not everything in the agents’ states needs to be present in the environment; a portion of it may be entirely private and not replicated in other agents’ states. So, we start by introducing the notion of agent.

###### Definition 2.6 (Agent)

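Given an interpretation domain $U$, an agent is a tuple $A = \langle \mathcal{D}, L, Act, Pr \rangle$, where:

• $\mathcal{D}$ is the local database schema;

• $L \subseteq \mathcal{D}(U)$ is the set of local states;

• $Act$ is the finite set of action types of the form $\alpha(\vec{p})$, where $\vec{p}$ is the tuple of abstract parameters;

• $Pr : L \to 2^{Act(U)}$ is the local protocol function, where $Act(U)$ is the set of ground actions of the form $\alpha(\vec{u})$ where $\alpha(\vec{p}) \in Act$ and $\vec{u} \in U^{|\vec{p}|}$ is a tuple of ground parameters.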

Intuitively, at a given time each agent $A$ is in some local state $l \in L$ that represents all the information agent $A$ has at its disposal. In this sense we follow [Fagin et al., 1995] but require that this information is structured as a database. Again, following standard literature we assume that the agents are autonomous and proactive and perform the actions in $Act$ according to the protocol function $Pr$. In the definition above we distinguish between “abstract parameters” to denote the language in which particular action parameters are given, and their concrete values or “ground parameters”.
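The distinction between action types, ground actions and the protocol can be sketched in Python as follows. The `Agent` class, the `buyer_protocol` function, and the `Order`/`Paid` relations are hypothetical names chosen for this illustration; the interpretation domain is restricted to a finite set so that the ground actions can be enumerated.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Set, Tuple

LocalState = Dict[str, Set[Tuple[str, ...]]]   # a database instance
GroundAction = Tuple[str, Tuple[str, ...]]     # (action name, ground parameters)


@dataclass
class Agent:
    schema: Dict[str, int]          # relation name -> arity
    action_types: Dict[str, int]    # action name -> number of abstract parameters
    protocol: Callable[[LocalState], Set[GroundAction]]

    def ground_actions(self, domain: Set[str]) -> Set[GroundAction]:
        """All ground instances of the action types over a finite domain."""
        return {
            (name, args)
            for name, arity in self.action_types.items()
            for args in product(sorted(domain), repeat=arity)
        }


def buyer_protocol(state: LocalState) -> Set[GroundAction]:
    # Illustrative protocol: the agent may pay any order it knows of that is
    # not yet paid; if there is none, it can only skip.
    unpaid = state["Order"] - state["Paid"]
    return {("pay", t) for t in unpaid} or {("skip", ())}


buyer = Agent(
    schema={"Order": 1, "Paid": 1},
    action_types={"pay": 1, "skip": 0},
    protocol=buyer_protocol,
)
state = {"Order": {("o1",), ("o2",)}, "Paid": {("o1",)}}
```

Here `pay` has one abstract parameter, and `("pay", ("o2",))` is one of its ground instances; the protocol selects, for each local state, the subset of ground actions the agent may perform.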

We assume that the agents interact among themselves and with an environment comprising all artifacts in the system. As artifacts are entities involving both data and process, we can see them as collections of database instances paired with actions and governed by special protocols. Without loss of generality we can assume the environment state to be a single database instance including all artifacts in the system. From a purely formal point of view this allows us to represent the environment as a special agent. Of course, in any specific instantiation the environment and the agents will be rather different, exactly in line with the standard propositional version of interpreted systems.

We can therefore define the synchronous composition of agents with the environment.

###### Definition 2.7 (Artifact-Centric Multi-Agent Systems)

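Given an interpretation domain $U$ and a set $Ag = \{A_0, \ldots, A_n\}$ of agents $A_i = \langle \mathcal{D}_i, L_i, Act_i, Pr_i \rangle$ defined on $U$, an artifact-centric multi-agent system (or AC-MAS) is a tuple $\mathcal{P} = \langle \mathcal{S}, U, s_0, \tau \rangle$ where:

• $\mathcal{S} \subseteq L_0 \times \cdots \times L_n$ is the set of reachable global states;

• $U$ is the interpretation domain;

• $s_0 \in \mathcal{S}$ is the initial global state;

• $\tau : \mathcal{S} \times Act(U) \to 2^{\mathcal{S}}$ is the global transition function, where $Act(U) = Act_0(U) \times \cdots \times Act_n(U)$ is the set of global (ground) actions, and $\tau(\langle l_0,\ldots,l_n \rangle, \langle \alpha_0(\vec{u}_0),\ldots,\alpha_n(\vec{u}_n) \rangle)$ is defined iff $\alpha_i(\vec{u}_i) \in Pr_i(l_i)$ for every $i \leq n$.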

As we will see in later sections, AC-MAS are the natural extension of interpreted systems to the first order to account for environments constituted of artifact-centric systems. They can be seen as a specialisation of quantified interpreted systems [Belardinelli & Lomuscio, 2012], a general extension of interpreted systems to the first-order case.

In the formalisation above the agent is referred to as the environment . The environment includes all artifacts in the system as well as additional information to facilitate communication between the agents and the hub, e.g., messages in transit etc. At any given time an AC-MAS is described by a tuple of database instances, representing all the agents in the system as well as the artifact system. A single interpretation domain for all database schemas is given. Note that this does not break the generality of the representation as we can always extend the domain of all agents and the environment before composing them into a single AC-MAS. The global transition function defines the evolution of the system through synchronous composition of actions for the environment and all agents in the system.

Much of the interaction we are interested in modelling involves message exchanges with payload, hence the action parameters, between agents and the environment, i.e., agents operating on the artifacts. However, note that the formalisation above does not preclude us from modelling agent-to-agent interactions, as the global transition function does not rule out successors in which only some agents change their local state following some actions. Also observe that essential concepts such as views are naturally expressed in AC-MAS by insisting that the local state of an agent includes part of the environment’s, i.e., the artifacts the agent has access to. Not all AC-MAS need to have views defined, so it is also possible for the views to be empty.

Other artifact-based concepts such as lifecycles are naturally expressed in AC-MAS. As artifacts are modelled as part of the environment, a lifecycle is naturally encoded in AC-MAS simply as the sequence of changes induced by the transition function on the fragment of the environment representing the lifecycle in question. We will show an example of this in Section 7.

Some technical remarks now follow. To simplify the notation, we denote a global ground action as , where and , with each of appropriate size. We define the transition relation on such that if and only if there exists a such that . If , we say that is a successor of . A run from is an infinite sequence , with . For , we take . A state is reachable from if there exists a run from the global state such that , for some . We assume that the relation is serial. This can be easily obtained by assuming that each agent has a skip action enabled at each local state and that performing skip induces no changes in any of the local states. We consider to be the set of states reachable from the initial state . For convenience we will use also the concept of temporal-epistemic (t.e., for short) run. Formally a t.e. run from a state is an infinite sequence such that and or , for some . A state is said to be temporally-epistemically reachable (t.e. reachable, for short) from if there exists a t.e. run from the global state such that for some we have that . Obviously, temporal-epistemic runs include purely temporal runs as a special case.
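The notions of successor, seriality and reachability can be sketched on a toy system as follows. This is only an illustration under strong simplifying assumptions: local states are plain integers rather than database instances, the names `protocol`, `tau`, `successors` and `reachable` are hypothetical, and the exploration terminates only because the toy system is finite-state.

```python
from collections import deque
from itertools import product

def protocol(local):
    # Hypothetical local protocol: skip is always enabled, inc only below 2.
    return {"skip"} | ({"inc"} if local < 2 else set())

def tau(state, joint_action):
    # Hypothetical global transition function: each agent updates its own counter.
    return tuple(l + 1 if a == "inc" else l for l, a in zip(state, joint_action))

def successors(state, protocols):
    """All s' such that tau(state, alpha) = s' for some enabled joint action alpha."""
    enabled = [sorted(p(l)) for p, l in zip(protocols, state)]
    return {tau(state, alpha) for alpha in product(*enabled)}

def reachable(initial, protocols):
    """Breadth-first exploration; terminates only if the reachable set is finite."""
    seen, queue = {initial}, deque([initial])
    while queue:
        s = queue.popleft()
        for t in successors(s, protocols):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

S = reachable((0, 0), [protocol, protocol])
```

Because `skip` is enabled in every local state and changes nothing, every global state has at least one successor (itself), which is the seriality assumption made above.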

As in plain interpreted systems [Fagin et al., 1995], we say that two global states and are epistemically indistinguishable for agent , written , if . Differently from interpreted systems, the local equality is evaluated on database instances. Also, notice that we admit to be infinite, thereby allowing the possibility of the set of states to be infinite. Indeed, unless we specify otherwise, we assume that we are working with infinite-state AC-MAS.

Finally, for technical reasons it is useful to refer to a global database schema of an AC-MAS. Every global state is associated with the (global) -instance such that , for . We omit the subscript when is clear from the context and we write for . Notice that for every , the associated with is unique, while the converse is not true in general.

### 2.3 Model Checking

We now define the problem of verifying an artifact-centric multi-agent system against a specification of interest. By following the artifact-centric model, we wish to give data the same prominence as processes. To deal with data and the underlying database instances, our specification language needs to include first-order logic. Further, we require temporal logic to describe the system execution. Lastly, we use epistemic logic to express the information the agents have at their disposal. Hence, we define a first-order temporal epistemic specification language to be interpreted on AC-MAS. The specification language will be used in Section 6 to formalise properties of artifact-centric programs.

###### Definition 2.8 (The Logic FO-CTLK)

The first-order CTLK (or FO-CTLK) formulas over a database schema are inductively defined by the following BNF:

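$$\varphi ::= \phi \mid \neg\varphi \mid \varphi \to \varphi \mid \forall x \varphi \mid AX\varphi \mid A\varphi U\varphi \mid E\varphi U\varphi \mid K_i\varphi \mid C\varphi$$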

where and .

The notions of free and bound variables for FO-CTLK extend straightforwardly from , as well as functions vars, free, and const. As usual, the temporal formulas and (resp. ) are read as “for all runs, at the next step ” and “for all runs (resp. some run), until ”. The epistemic formulas and intuitively mean that “agent knows ” and “it is common knowledge among all agents that ” respectively. We use the abbreviations , , , , and as standard. Observe that free variables can occur within the scope of modal operators, which permits the unconstrained alternation of quantifiers and modal operators and lets us refer to elements in different modal contexts. We consider also a number of fragments of FO-CTLK. The sentence-atomic version of FO-CTLK without epistemic modalities, or SA-FO-CTL, is the language obtained from Definition 2.8 by removing the clauses for epistemic operators and restricting atomic formulas to first-order sentences, so that no variable appears free in the scope of a modal operator:

$$\varphi ::= \phi \mid \neg\varphi \mid \varphi \to \varphi \mid AX\varphi \mid A\varphi U\varphi \mid E\varphi U\varphi$$

where is a sentence.

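We will consider also the language FO-ECTLK, i.e., the existential fragment of FO-CTLK, defined as follows:

$$\varphi ::= \phi \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid \forall x \varphi \mid \exists x \varphi \mid EX\varphi \mid E\varphi U\varphi \mid \bar{K}_i\varphi \mid \bar{C}\varphi,$$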

where , with and the standard abbreviations, , and .

The semantics of FO-CTLK formulas is defined as follows.

###### Definition 2.9 (Satisfaction for FO-CTLK)

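Consider an AC-MAS $P$, an FO-CTLK formula $\varphi$, a state $s$, and an assignment $\sigma$. We inductively define whether $P$ satisfies $\varphi$ in $s$ under $\sigma$, written $(P, s, \sigma) \models \varphi$, as follows:

$$
\begin{array}{lcl}
(P, s, \sigma) \models \varphi & \text{iff} & (D_s, \sigma) \models \varphi, \text{ if } \varphi \text{ is an FO-formula} \\
(P, s, \sigma) \models \neg\varphi & \text{iff} & \text{it is not the case that } (P, s, \sigma) \models \varphi \\
(P, s, \sigma) \models \varphi \to \varphi' & \text{iff} & (P, s, \sigma) \models \neg\varphi \text{ or } (P, s, \sigma) \models \varphi' \\
(P, s, \sigma) \models \forall x \varphi & \text{iff} & \text{for all } u \in adom(s),\ (P, s, \sigma^x_u) \models \varphi \\
(P, s, \sigma) \models AX\varphi & \text{iff} & \text{for all runs } r, \text{ if } r(0) = s, \text{ then } (P, r(1), \sigma) \models \varphi \\
(P, s, \sigma) \models A\varphi U\varphi' & \text{iff} & \text{for all runs } r, \text{ if } r(0) = s, \text{ then there is } k \geq 0 \text{ s.t. } (P, r(k), \sigma) \models \varphi', \\
 & & \text{and for all } j,\ 0 \leq j < k \text{ implies } (P, r(j), \sigma) \models \varphi \\
(P, s, \sigma) \models E\varphi U\varphi' & \text{iff} & \text{for some run } r,\ r(0) = s \text{ and there is } k \geq 0 \text{ s.t. } (P, r(k), \sigma) \models \varphi', \\
 & & \text{and for all } j,\ 0 \leq j < k \text{ implies } (P, r(j), \sigma) \models \varphi \\
(P, s, \sigma) \models K_i\varphi & \text{iff} & \text{for all } s',\ s \sim_i s' \text{ implies } (P, s', \sigma) \models \varphi \\
(P, s, \sigma) \models C\varphi & \text{iff} & \text{for all } s',\ s \sim s' \text{ implies } (P, s', \sigma) \models \varphi
\end{array}
$$

where $\sim$ is the transitive closure of $\bigcup_{i} \sim_i$, the union being taken over all agents.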

A formula is said to be true at a state , written , if for all assignments . Moreover, is said to be true in , written , if .
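To give some intuition for the temporal and epistemic clauses of Definition 2.9, the sketch below evaluates AX, EU and the knowledge operator on an explicitly given finite transition system. It is a simplification under stated assumptions: subformulas are represented directly by the sets of states satisfying them, the first-order part and assignments are ignored, the system is assumed finite and serial, and the function names (`sat_AX`, `sat_EU`, `sat_K`) are hypothetical.

```python
def sat_AX(succ, sat_phi):
    """States all of whose successors satisfy phi."""
    return {s for s in succ if succ[s] <= sat_phi}

def sat_EU(succ, sat_phi, sat_psi):
    """Least fixpoint: psi holds now, or phi holds now and some successor is already in."""
    result = set(sat_psi)
    changed = True
    while changed:
        changed = False
        for s in succ:
            if s not in result and s in sat_phi and succ[s] & result:
                result.add(s)
                changed = True
    return result

def sat_K(states, local, sat_phi):
    """States s such that every state with the same local state as s satisfies phi."""
    return {s for s in states
            if all(t in sat_phi for t in states if local(t) == local(s))}

# Toy system: 0 -> 1 -> 2 -> 2 and 3 -> 3.
succ = {0: {1}, 1: {2}, 2: {2}, 3: {3}}
phi, psi = {0, 1}, {2}

eu = sat_EU(succ, phi, psi)
ax = sat_AX(succ, psi)
# Hypothetical agent that only observes the parity of the global state.
k = sat_K(set(succ), lambda s: s % 2, {0, 1, 2})
```

State 1 is excluded from `k` because the agent cannot distinguish it from state 3, where the formula fails; this is the epistemic indistinguishability condition at work.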

A key concern in this paper is to explore the model checking of AC-MAS against first-order temporal-epistemic specifications.

###### Definition 2.10 (Model Checking)

Model checking an AC-MAS  against an FO-CTLK formula amounts to finding an assignment such that .

It is easy to see that whenever is finite the model checking problem is decidable as is a finite-state system. In general this is not the case.

###### Theorem 2.11

The model checking problem for AC-MAS w.r.t. FO-CTLK is undecidable.

Proof (sketch).  This can be proved by showing that every Turing machine whose tape contains an initial input can be simulated by an artifact system . The problem of checking whether terminates on that particular input can be reduced to checking whether , where encodes the termination condition. The detailed construction is similar to that of Theorem 4.10 of [Deutsch et al., 2007].

Given the general setting in which the model checking problem is defined above, the negative result is not surprising. In the following we identify syntactic and semantic restrictions for which the problem is decidable.

### 2.4 Isomorphisms

We now investigate the concept of isomorphism on AC-MAS. This will be needed in later sections to produce finite abstractions of infinite-state AC-MAS. In what follows let and be two AC-MAS.

###### Definition 2.12 (Isomorphism)

Two local states are isomorphic, written , iff there exists a bijection such that:

• is the identity on ;

• for every , , we have that iff .

When this is the case, we say that is a witness for .

Two global states and are isomorphic, written , iff there exists a bijection such that for every , is a witness for .

Notice that isomorphisms preserve the constants in as well as predicates in the local states up to renaming of the corresponding terms. Any function as above is called a witness for . Obviously, the relation is an equivalence relation. Given a function defined on , denotes the interpretation in obtained from by renaming each as . If is also injective (thus invertible) and the identity on , then .

Example. For an example of isomorphic states, consider an agent with local database schema , let be an interpretation domain, and fix the set of constants. Let be the local state such that and (see Figure 1). Then, the local state such that and is isomorphic to . This can be easily seen by considering the isomorphism , where: , , and . On the other hand, the state where and is not isomorphic to . Indeed, although a bijection exists that “transforms” into , it is easy to see that none can be such that .
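The isomorphism test of Definition 2.12 can be sketched as a brute-force search for a witness. The code below is only illustrative: the relation names, the individuals and the constant `"b"` are hypothetical data chosen for the example, instances are dictionaries from relation names to sets of tuples, and the search enumerates all permutations, so it is meant for very small instances only.

```python
from itertools import permutations

def active_domain(instance, constants):
    """All individuals occurring in the instance, together with the constants."""
    return {v for tuples in instance.values() for t in tuples for v in t} | set(constants)

def find_witness(inst1, inst2, constants):
    """Return a bijection (as a dict) witnessing that inst1 and inst2 are
    isomorphic, or None if no such bijection exists."""
    if set(inst1) != set(inst2):
        return None
    dom1 = sorted(active_domain(inst1, constants))
    dom2 = sorted(active_domain(inst2, constants))
    if len(dom1) != len(dom2):
        return None
    free1 = [v for v in dom1 if v not in constants]
    free2 = [v for v in dom2 if v not in constants]
    for perm in permutations(free2):
        iota = {c: c for c in constants}   # identity on the constants
        iota.update(zip(free1, perm))
        if all({tuple(iota[v] for v in t) for t in inst1[rel]} == set(inst2[rel])
               for rel in inst1):
            return iota
    return None

C = {"b"}
l1 = {"P1": {("a", "b"), ("b", "d")}, "P2": {("a",)}}
l2 = {"P1": {("c", "b"), ("b", "e")}, "P2": {("c",)}}
l3 = {"P1": {("f", "d"), ("d", "e")}, "P2": {("f",)}}

w12 = find_witness(l1, l2, C)
w13 = find_witness(l1, l3, C)
w13_no_constants = find_witness(l1, l3, set())
```

In this hypothetical data `l1` and `l3` have the same relational shape, but no bijection between them can be the identity on the constant `"b"`, so they are isomorphic only when the set of constants is taken to be empty.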

Note that, while isomorphic states have the same relational structure, two isomorphic states do not necessarily satisfy the same FO-formulas as satisfaction depends also on the values assigned to free variables. To account for this, we introduce the following notion.

###### Definition 2.13 (Equivalent assignments)

Given two states and , and a set of variables , two assignments and are equivalent for w.r.t.  and iff there exists a bijection such that:

• is a witness for ;

• .

Intuitively, equivalent assignments preserve both the (in)equalities of the variables in and the constants in up to renaming. Note that, by definition, the above implies that are isomorphic. We say that two assignments are equivalent for an FO-CTLK formula , omitting the states and when it is clear from the context, if these are equivalent for .

We can now show that isomorphic states satisfy exactly the same FO-formulas.

###### Proposition 2.14

Given two isomorphic states and , an FO-formula , and two assignments and equivalent for , we have that

$$(D_s, \sigma) \models \varphi \quad \text{iff} \quad (D_{s'}, \sigma') \models \varphi$$

Proof.  The proof is by induction on the structure of . Consider the base case for the atomic formula . Then iff . Since and are equivalent for , and , this is the case iff , that is, . The base case for is proved similarly, by observing that the satisfaction of depends only on the assignments, and that the function of Def. 2.13 is a bijection, thus all the (in)equalities between the values assigned by and are preserved. This is sufficient to guarantee that iff . The inductive step for the propositional connectives is straightforward. Finally, if , then iff for all , . Now consider the witness for , where is as in Def. 2.13. We have that and are equivalent for . By induction hypothesis iff . Since is a bijection, this is the case iff for all , , i.e., .

This leads us to the following result.

###### Corollary 2.15

Given two isomorphic states and and an FO-sentence , we have that

$$D_s \models \varphi \quad \text{iff} \quad D_{s'} \models \varphi$$

Proof.  From right to left. Suppose, by contradiction, that . Then there exists an assignment s.t. . Since , if is a witness for , then the assignment is equivalent to for and . By Proposition 2.14 we have that , that is, . The case from left to right can be shown similarly.

Thus, isomorphic states cannot be distinguished by FO-sentences. This enables us to use this notion when defining simulations as we will see in the next section.

## 3 Abstractions for Sentence Atomic FO-CTL

In the previous section we have observed that model checking AC-MAS against FO-CTLK is undecidable in general. So, it is clearly of interest to identify decidable settings. In what follows we introduce two main results. The first, presented in this section, identifies restrictions on the language; the second, presented in the next section, focuses on semantic constraints. While these cases are in some sense orthogonal to each other, we show that they both lead to decidable model checking problems. They are also both carried out on a rather natural subclass of AC-MAS that we call bounded, which we identify below. Our goal for proceeding in this manner is to identify finite abstractions of infinite-state AC-MAS so that verification of programs, that admit AC-MAS as models, can be conducted on them, rather than on infinite-state AC-MAS. We will see this in detail in Section 6.

Given our aims we begin by defining a first notion of bisimulation in the context of AC-MAS. Bisimulations will be used to show that all bounded AC-MAS admit a finite, bisimilar abstraction that satisfies the same SA-FO-CTL specifications as the original AC-MAS. Also in what follows we assume that and .

###### Definition 3.1 (Simulation)

A relation is a simulation iff implies:

1. ;

2. for every , if then there exists s.t.  and .

Definition 3.1 presents the standard notion of simulation applied to the case of AC-MAS. The difference from the propositional case is that we here insist on the states being isomorphic, a generalisation from the usual requirement for propositional valuations to be equal [Blackburn et al., 2001]. As in the standard case, two states and are said to be similar, written , if there exists a simulation relation s.t. . It can be proven that the similarity relation is a simulation itself, and in particular the largest one w.r.t. set inclusion, and that it is transitive and reflexive. Finally, we say that simulates , written , if . We extend the above to bisimulations.

###### Definition 3.2 (Bisimulation)

A relation is a bisimulation iff both and are simulations.

We say that two states and are bisimilar, written , if there exists a bisimulation s.t. . Similarly to simulations, it can be proven that the bisimilarity relation is the largest bisimulation. Further, it is an equivalence relation. Finally, and are said to be bisimilar, written , if .
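For finite-state systems the largest bisimulation of Definitions 3.1 and 3.2 can be computed as a greatest fixpoint, as in the following sketch. The predicate `iso` is a stand-in for the isomorphism test of Definition 2.12 (here it merely compares hypothetical "shape" labels), and the function name `largest_bisimulation` is likewise an assumption of this example. The procedure does not apply directly to infinite-state AC-MAS.

```python
def largest_bisimulation(succ1, succ2, iso):
    """Start from all isomorphic pairs and repeatedly discard pairs
    that violate the forth or back condition."""
    rel = {(s, t) for s in succ1 for t in succ2 if iso(s, t)}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(rel):
            # Forth: every move of s is matched by some move of t.
            forth = all(any((s2, t2) in rel for t2 in succ2[t]) for s2 in succ1[s])
            # Back: every move of t is matched by some move of s.
            back = all(any((s2, t2) in rel for s2 in succ1[s]) for t2 in succ2[t])
            if not (forth and back):
                rel.discard((s, t))
                changed = True
    return rel

# Two toy systems given by their successor maps.
succ1 = {0: {1}, 1: {0}}
succ2 = {"x": {"y"}, "y": {"z"}, "z": {"y"}}
# Hypothetical isomorphism classes of the states.
shape = {0: "A", 1: "B", "x": "A", "y": "B", "z": "A"}

rel = largest_bisimulation(succ1, succ2, lambda s, t: shape[s] == shape[t])
```

Two systems are then bisimilar in the sense above when the pair of their initial states belongs to the computed relation.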

Since, as shown in Corollary 2.15, the satisfaction of FO-sentences is invariant under isomorphisms, we can now extend the usual bisimulation result from the propositional case to that of SA-FO-CTL. We begin by showing a result on bisimilar runs.

###### Proposition 3.3

Consider two AC-MAS  and such that , , for some , and a run of such that . Then there exists a run of such that:

• ;

• for all , .

Proof.  We show by induction that such a run in exists. For , let . Obviously, . Now, assume, by induction hypothesis, that . Let . Since , by Def. 3.1, there exists such that and . Let ; hence we obtain . By definition is a run of .

This enables us to show that bisimilar AC-MAS preserve SA-FO-CTL formulas. This is an extension of analogous results on propositional CTL.

###### Lemma 3.4

Consider the AC-MAS  and such that , , for some and an SA-FO-CTL formula . Then,

$$(P, s) \models \varphi \quad \text{iff} \quad (P', s') \models \varphi$$

Proof.  The proof is by induction on the structure of . Observe first that since is sentence-atomic, its satisfaction does not depend on assignments. We report the proof for the left-to-right direction of the equivalence; the converse can be shown similarly.

The base case for an FO-sentence follows from Cor. 2.15. The inductive cases for propositional connectives are straightforward.

For , assume for contradiction that and . Then, there exists a run s.t.  and . By Def. 3.2 and 3.1 there exists a s.t.  and . Further, by seriality of , can be extended to a run s.t.  and . By the induction hypothesis we obtain that . Hence, , which is a contradiction.

For , let be a run with such that there exists such that , and for every , implies . By Prop. 3.3 there exists a run s.t.  and for all , . By the induction hypothesis we have that for each , iff , and iff . Therefore, is a run s.t. , , and for every , implies , i.e., .

For , assume for contradiction that and . Then, there exists a run s.t.  and for every , if , then there exists s.t.  and . By Prop. 3.3 there exists a run s.t.  and for all , . Further, by the induction hypothesis we have that iff and iff . But then is s.t.  and for every , if , then there exists s.t.  and . That is, , which is a contradiction.

By applying the result above to the case of and , we obtain the following.

###### Theorem 3.5

Consider the AC-MAS  and such that , and an SA-FO-CTL formula . We have

$$P \models \varphi \quad \text{iff} \quad P' \models \varphi$$

In summary we have proved that bisimilar AC-MAS validate the same SA-FO-CTL formulas. In the next section we use this result to reduce, under additional assumptions, the verification of an infinite-state AC-MAS to that of a finite-state one.

### 3.1 Finite Abstractions of Bisimilar AC-MAS

We now define a notion of finite abstraction for AC-MAS. We prove that abstractions are bisimilar to the corresponding concrete model. We are particularly interested in finite abstractions, so we operate on a special class of infinite models that we call bounded.

###### Definition 3.6 (Bounded AC-MAS)

An AC-MAS  is -bounded, for , if for all , .

An AC-MAS is -bounded if none of its reachable states contains more than distinct elements. Observe that bounded AC-MAS may be defined on infinite domains . Furthermore, note that a -bounded AC-MAS may contain infinitely many states, all bounded by . So -bounded systems are infinite-state in general. Notice also that the value bounds only the number of distinct individuals in a state, not the size of the state itself, i.e., the amount of memory required to accommodate the individuals. Indeed, the infinitely many elements of need an unbounded number of bits to be represented (e.g., as finite strings), so, even though each state is guaranteed to contain at most distinct elements, nothing can be said about how large the actual space required by such elements is. On the other hand, it should be clear that memory-bounded AC-MAS are finite-state (hence -bounded, for some ).
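The difference between bounding the number of distinct individuals per state and bounding the number of states can be made concrete with a small sketch. The representation below (a global state as a tuple of local database instances, the relation name `Counter`, and the helper names `adom` and `is_b_bounded`) is hypothetical and only meant to illustrate Definition 3.6 on an explicitly enumerated set of states.

```python
def adom(global_state):
    """Active domain of a global state: every individual occurring
    in some tuple of some local database instance."""
    return {v
            for local in global_state
            for tuples in local.values()
            for t in tuples
            for v in t}

def is_b_bounded(states, b):
    """True if no state in the given collection has more than b distinct individuals."""
    return all(len(adom(s)) <= b for s in states)

# A family of single-agent global states, each storing exactly one value.
# Every state contains one individual, yet there are as many distinct
# states as there are values, so the family grows with the domain.
family = [({"Counter": {(n,)}},) for n in range(1000)]

bounded_by_one = is_b_bounded(family, 1)
distinct_values = {next(iter(adom(s))) for s in family}
```

If the range of values is allowed to be the whole of an infinite domain, the same construction yields a 1-bounded system with infinitely many states, which is the situation described above.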

Thus, seen as programs, -bounded AC-MAS are in general memory-unbounded. Therefore, for the purpose of verification, they cannot be trivially checked by generating all their executions, as standard model checking techniques typically do and as would be possible if they were memory-bounded. However, we will show later that any -bounded infinite-state AC-MAS admits a finite abstraction which can be used to verify it.

We now introduce abstractions in a modular manner by first introducing a set of abstract agents from a concrete AC-MAS.

###### Definition 3.7 (Abstract agent)

Let be an agent defined on the interpretation domain . Given a set of individuals, we define the abstract agent on such that:

1. ;

2. ;

3. ;

4. iff there exist and s.t. , for some witness , and , for some bijection extending to .

Given a set of agents defined on , let be the set of the corresponding abstract agents on .

We remark that , as defined in Definition 3.7, is indeed an agent and complies with Definition 2.6. Notice that the protocol of is defined on the basis of its corresponding concrete agent and requires the existence of a bijection between the elements in the local states and the action parameters. Thus, in order for a ground action of to have a counterpart in , the last requirement of Definition 3.7 constrains to contain a sufficient number of distinct values. As will become apparent later, the size of determines how closely an abstract system can simulate its concrete counterpart.

We can now formalise the notion of abstraction that we will use in this section.

###### Definition 3.8 (Abstraction)

Let be an AC-MAS over and the set of agents obtained as in Definition 3.7, for some . The AC-MAS  defined over is said to be an abstraction of iff:

• ;

• for some iff there exist and , such that , and for some witness , and for some extending .

Notice that abstractions have initial states isomorphic to their concrete counterparts. The condition in Definition 3.8 means that whenever for some witness , , and , then . This constraint means that actions are data-independent. So, for example, a copy action in the concrete model has a corresponding copy action in the abstract model regardless of the data that are copied. Crucially, this condition requires that the domain contains enough elements to simulate the concrete states and action effects, as the following result makes precise. In what follows we take , i.e., is the sum of the maximum numbers of parameters contained in the action types of each agent in .

###### Theorem 3.9

Consider a -bounded AC-MAS  over an infinite interpretation domain , an SA-FO-CTL formula , and a finite interpretation domain such that and . Any abstraction of is bisimilar to .

Proof.  Define a relation as . We show that is a bisimulation such that . Observe first that , so . Next, consider and such that (i.e., ), and assume that , for some . Then, there exists s.t. . We show next that there exists s.t.  and . To this end, observe that, since and , we can define an injective function such that . We take ; it remains to prove that . By the condition on the cardinality of we can extend to as well, and set . Then, by the definition of we have that