Verification of Agent-Based Artifact Systems
Abstract
Artifact systems are a novel paradigm for specifying and implementing business processes described in terms of interacting modules called artifacts. Artifacts consist of data and lifecycles, accounting respectively for the relational structure of the artifacts’ states and their possible evolutions over time. In this paper we put forward artifact-centric multi-agent systems, a novel formalisation of artifact systems in the context of multi-agent systems operating on them. Differently from the usual process-based models of services, the semantics we give explicitly accounts for the data structures on which artifact systems are defined.
We study the model checking problem for artifact-centric multi-agent systems against specifications written in a quantified version of temporal-epistemic logic expressing the knowledge of the agents in the exchange. We begin by noting that the problem is undecidable in general. We then identify two noteworthy restrictions, one syntactical and one semantical, that enable us to find bisimilar finite abstractions and therefore reduce the model checking problem to the instance on finite models. Under these assumptions we show that the model checking problem for these systems is EXPSPACE-complete. We then introduce artifact-centric programs, compact and declarative representations of the programs governing both the artifact system and the agents. We show that, while these in principle generate infinite-state systems, under natural conditions their verification problem can be solved on finite abstractions that can be effectively computed from the programs. Finally we exemplify the theoretical results of the paper through a mainstream procurement scenario from the artifact systems literature.
1 Introduction
Much of the work in the area of reasoning about knowledge involves the development of formal techniques for the representation of epistemic properties of rational actors, or agents, in a multi-agent system (MAS). The approaches based on modal logic are often rooted in interpreted systems [Parikh & Ramanujam, 1985], a computationally grounded semantics [Wooldridge, 2000] used for the interpretation of several temporal-epistemic logics. This line of research was thoroughly explored in the 1990s, leading to a significant body of work [Fagin et al., 1995]. Further significant explorations have been conducted since then; a recent topic of interest has been the development of automatic techniques, including model checking [Clarke et al., 1999], for the verification of temporal-epistemic specifications for the autonomous agents in a MAS [Gammie & van der Meyden, 2004; Kacprzak et al., 2008; Lomuscio et al., 2009]. This has led to developments in a number of areas traditionally outside artificial intelligence, knowledge representation and MAS, including security [Dechesne & Wang, 2010; Ciobaca et al., 2012], web services [Lomuscio et al., 2010] and cache-coherence protocols in hardware design [Baukus & van der Meyden, 2004]. The ambition of the present paper is to offer a similar change of perspective in the area of artifact systems [Cohn & Hull, 2009], a growing topic in Service-Oriented Computing (SOC).
Artifacts are structures that “combine data and process in an holistic manner as the basic building block[s]” [Cohn & Hull, 2009] of systems’ descriptions. Artifact systems are services constituted by complex workflow schemes based on artifacts which the agents interact with. The data component is given by the relational databases underpinning the artifacts in a system, whereas the workflows are described by “lifecycles” associated with each artifact schema. While in the standard services paradigm services are made public by exposing their process interfaces, in artifact systems both the data structures and the lifecycles are advertised. Services are composed in a “hub” where operations on the artifacts are executed. Implementations of artifact systems, such as the IBM engine Barcelona [Heath et al., 2011], provide a hub where the service choreography and service orchestration [Alonso et al., 2004] are carried out.
While artifact systems are beginning to drive new application areas, such as case management systems [Marin et al., 2012], we identify two shortcomings in the present state of the art. Firstly, the artifact systems literature [Bhattacharya et al., 2007; Deutsch et al., 2009; Hull, 2008; Nooijen et al., 2012] focuses exclusively on the artifacts themselves. While there is obviously a need to model and implement the artifact infrastructure, we also need to account for the agents implementing the services acting on the artifact system. This is of particular relevance given that artifact systems are envisaged to play a leading role in information systems. We need to be able to reason not just about the artifact states but also about what actions specific participants are and are not allowed to perform, what knowledge they can or cannot derive in a system run, what system state they can achieve in coordination with their peers, etc. In other words, we need to move from the description of the artifact infrastructure to one that encompasses both the agents and the infrastructure.
Secondly, there is a pressing demand to provide the hub with automatic choreography and orchestration capabilities. It is well known that choreography techniques can leverage automatic model checking techniques; orchestration can be recast as a synthesis problem, which, in turn, can also benefit from model checking technology. However, while model checking and its applications are relatively well understood in plain process-based modelling, the presence of data makes these problems much harder and virtually unexplored. Additionally, infinite domains in the underlying databases lead to infinite state spaces and undecidability of the model checking problem.
The aim of this paper is to make a concerted contribution to both problems above. Firstly, we provide a computationally grounded semantics to systems comprising the artifact infrastructure and the agents operating on it. We use this semantics to interpret a temporal-epistemic language with first-order quantifiers to reason about the evolution of the hub as well as the knowledge of the agents in the presence of evolving, structured data. We observe that the model checking problem for these structures is undecidable in general and analyse two notable decidable fragments. In this context, a contribution we make is to provide finite abstractions to infinite-state artifact systems, thereby presenting a technique for their effective verification for a class of declarative agent-based, artifact-centric programs that we here define. We evaluate this methodology by studying its computational complexity and by demonstrating its use on a well-known scenario from the artifact systems literature.
1.1 Artifact-Centric Systems
Service-oriented computing is concerned with the study and development of distributed applications that can be automatically discovered and composed by means of remote interfaces. A point of distinction over more traditional distributed systems is the interoperability and connectedness of services and the shared format for both data and remote procedure calls. Two technology-independent concepts permeate the service-oriented literature: orchestration and choreography [Alonso et al., 2004; Singh & Huhns, 2005]. Orchestration involves the ordering of actions of possibly different services, facilitated by a controller or orchestrator, to achieve a certain overall goal. Choreography concerns the distributed coordination of different actions through publicly observable events to achieve a certain goal. A MAS perspective [Wooldridge, 2001] is known to be particularly helpful in service-oriented computing in that it allows us to ascribe information states and private or common goals to the various services. Under this view the agents of the system implement the services and interact with one another in a shared infrastructure or environment.
A key theoretical problem in SOC is to devise effective mechanisms to verify that service composition is correct according to some specification. Techniques based on model checking [Clarke et al., 1999] and synthesis [Berardi et al., 2008] have been put forward to solve the composition and orchestration problem for services described and advertised at interface level through finite state machines [Calvanese et al., 2008]. More recently, attention has turned to services described by languages such as WS-BPEL [Alves et al., 2007], which provide potentially unbounded variables in the description of the service process. Again, model checking approaches have successfully been used to verify complex service compositions [Bertoli et al., 2010; Lomuscio et al., 2012].
While WS-BPEL provides a model for services with variables, the data referenced by them is non-permanent. The area of data-centric workflows [Hull et al., 2009; Nigam & Caswell, 2003] evolved as an attempt to provide support for permanent data, typically present in the form of underlying databases. Although usually abstracted away, permanent data is of central importance to services, which typically query data sources and are driven by the answers they obtain; see, e.g., [Berardi et al., 2005]. Therefore, a faithful model of a service behavior cannot, in general, disregard this component. In response to this, proposals have been made in the workflows and service communities in terms of declarative specifications of data-centric services that are advertised for automatic discovery and composition. The artifact-centric approach [Cohn & Hull, 2009] is now one of the leading emerging paradigms in the area. As described in [Hull, 2008; Hull et al., 2011], artifact-centric systems can be presented along four dimensions.
Artifacts are the holders of all structured information available in the system. In a business-oriented scenario this may include purchase orders, invoices, payment records, etc. Artifacts may be created, amended, and destroyed at run time; however, abstract artifact schemas are provided at design time to define the structure of all artifacts to be manipulated in the system. Intuitively, external events cause changes in the system, including in the value of artifact attributes.
The evolution of artifacts is governed by lifecycles. These capture the changes that an artifact may go through from creation to deletion. Intuitively, a purchase order may be created, amended and operated on by several events before it is fulfilled and its existence in the system terminated: a lifecycle associated with a purchase order artifact formalises these transitions.
Services are seen as the actors operating on the artifact system. They represent both human and software actors, possibly distributed, that generate events on the artifact system. Some services may “own” artifacts, and some artifacts may be shared by several services. However, not all artifacts, or parts of artifacts, are visible to all services. Views and windows respectively determine which parts of artifacts and which artifact instances are visible to which service. An artifact hub is a system that maintains the artifact system and processes the events generated by the services.
Services generate events on the artifact system according to associations. Typically these are declarative descriptions providing the preconditions and postconditions for the generation of events. These generate changes in the artifact system according to the artifact lifecycles. Since events may trigger changes in several artifacts in the system, events are processed by a well-defined semantics [Damaggio et al., 2011; Hull et al., 2011] that governs the sequence of changes an artifact system may undertake upon consumption of an event. Such a semantics, based on the use of Prerequisite-Antecedent-Consequent (PAC) rules, ensures acyclicity and full determinism in the updates on the artifact system. GSM is a declarative language that can be used to describe artifact systems. Barcelona is an engine that can be used to run a GSM-based artifact-centric system [Heath et al., 2011].
The above is only a partial description of the artifact paradigm. We refer to [Cohn & Hull, 2009; Hull, 2008; Hull et al., 2011] for more details.
As will become clear in the next section, in line with the agent-based approach to services, we will use agent-based concepts to model services. The artifact system will be represented as an environment, constituted by evolving databases, upon which the agents operate; lifecycles and associations will be modelled by local and global transition functions. The model is intended to incorporate all artifact-related concepts including views and windows.
In view of the above, in this paper we address the following questions. How can we give a transition-based semantics for artifacts and agents operating on them? What syntax should we use to specify properties of the agents and the artifacts themselves? Can we verify that an artifact system satisfies certain properties? As this will be shown to be undecidable, can we find suitable fragments on which this can actually be carried out? If so, what is the resulting complexity? Lastly, can we provide declarative specifications for the agent programs so that these can be verified by model checking? Can this technique be used on mainstream scenarios from the SOC literature?
This paper aims to contribute to answering these questions.
1.2 Related Work
As stated above, virtually all current literature on artifact-centric systems focuses on properties and implementations of the artifact system as such. Little or no attention is given to the actors on the system, whether they are human or artificial agents. A few formal techniques have, however, been put forward to verify the core, non-agent aspects of the system; in the following we briefly compare these to this contribution.
To our knowledge the verification of artifact-centric business processes was first discussed in [Bhattacharya et al., 2007], where reachability and deadlocks are phrased in the context of artifact-centric systems and complexity results for the verification problem are given. The present contribution differs markedly from [Bhattacharya et al., 2007] by employing a more expressive specification language, even if the agent-related aspects are not considered, and by putting forward effective abstraction procedures for verification.
In [Gerede & Su, 2007] a verification technique for artifact-centric systems against a variant of computation-tree logic is put forward. The decidability of the verification problem is proven for the language considered under the assumption that the interpretation domain is bounded. Decidability is also shown for the unbounded case by making restrictions on the values that quantified variables can range over. In the work presented here we also work on unbounded domains, but do not require the restrictions present in [Gerede & Su, 2007]: we only insist that the number of distinct values in the system does not exceed a given threshold at any point in any run. Most importantly, the interplay between quantification and modalities considered here allows us to bind and use variables in different states. This is a major difference, as this feature is very expressive and known to lead to undecidability.
A related line of research is followed in [Deutsch et al., 2009; Damaggio et al., 2012], where the verification problem for artifact systems against two variants of first-order linear-time temporal logic is considered. Decidability of the verification problem is retained by imposing syntactic restrictions on both the system descriptions and the specifications to check. This effectively limits the way in which new values introduced at every computational step can be used by the system. Properties based on arithmetic operators are considered in [Damaggio et al., 2012]. While there are elements of similarity between these approaches and the one we put forward here, including the fact that the concrete interpretation domain is replaced by an abstract one, the contribution here presented has significant differences from these. Firstly, our setting is branching-time and not linear-time, thereby resulting in different expressive power. Secondly, differently from [Deutsch et al., 2009; Damaggio et al., 2012], we impose no constraints on nested quantifiers. In contrast, [Damaggio et al., 2012] admits only universal quantification over combinations of quantifier-free first-order formulas. Thirdly, the abstraction results we present here are given in general terms on the semantics of declarative programs and do not depend on a particular presentation of the system.
More closely related to the present contribution is [Hariri et al., 2012], where conditions for the decidability of the model checking problem for data-centric dynamic systems, i.e., dynamic systems with relational states, are given. In this case the specification language used is a first-order version of the $\mu$-calculus. While our temporal fragment is subsumed by the $\mu$-calculus, since we use indexed epistemic modalities as well as a common knowledge operator, the two specification languages have different expressive power. To retain decidability, like we do here, the authors assume a constraint on the size of the states. However, differently from the contribution here presented, [Hariri et al., 2012] assume limited forms of quantification whereby only individuals persisting in the system evolution can be quantified over. In this contribution we do not make this restriction.
Irrespective of the above, the most important feature that characterises our work is that the setup is entirely based on epistemic logic and multi-agent systems. We use agents to represent the autonomous services operating in the system, and agent-based concepts play a key role in the modelling, the specifications, and the verification techniques put forward. Differently from all approaches presented above, we are not only concerned with whether the artifact system meets a particular specification. Instead, we also wish to consider what knowledge the agents in the system acquire by interacting among themselves and with the artifact system during a system run. Additionally, the abstraction methodology put forward is modular with respect to the agents in the system. These features enable us to give constructive procedures for the generation of finite abstractions for artifact-centric programs associated with infinite models. We are not aware of any work in the literature tackling any of these aspects.
Relation to previous work by the authors. This paper combines and expands preliminary results originally discussed in [Belardinelli et al., 2011a], [Belardinelli et al., 2011b], [Belardinelli et al., 2012a], and [Belardinelli et al., 2012b]. In particular, the technical setup of artifacts and agents is different from that of our preliminary studies and makes it more natural to express artifact-centric concepts such as views. Differently from our previous attempts we here incorporate an operator for common knowledge and provide constructive methods to define abstractions for all notions of bisimulation. We also consider the complexity of the verification problem, previously unexplored, and evaluate the technique in detail on a case study.
1.3 Scheme of the Paper
The rest of the paper is organised as follows. In Section 2 we introduce Artifact-Centric Multi-Agent Systems (AC-MAS), the semantics we will be using throughout the paper to describe agents operating on an artifact system. In the same section we put forward FO-CTLK, a first-order logic with knowledge and time to reason about the evolution of the knowledge of the agents and the artifact system. This enables us to propose a satisfaction relation based on the notion of bounded quantification, define the model checking problem, and highlight some properties of isomorphic states.
An immediate result we will explore concerns the undecidability of the model checking problem for AC-MAS in their general setting. Section 3 is concerned with syntactical restrictions on FO-CTLK that enable us to guarantee the existence of finite abstractions of infinite-state AC-MAS, thereby making the model checking problem feasible by means of standard techniques.
Section 4 tackles restrictions orthogonal to those of Section 3 by focusing on a subclass of AC-MAS that admits a decidable model checking problem when considering full FO-CTLK specifications. The key finding here is that bounded and uniform AC-MAS, a class identified by studying a strong bisimulation relation, admit finite abstractions for any FO-CTLK specification. The section concludes by showing that under these restrictions the model checking problem is EXPSPACE-complete.
We turn our attention to artifact programs in Section 6 by defining the concept of artifact-centric programs. We define them through natural, first-order preconditions and postconditions in line with the artifact-centric approach. We give a semantics to them in terms of AC-MAS and show that their generated models are precisely those uniform AC-MAS studied earlier in the paper. It follows that, under some boundedness conditions, which can be naturally expressed, the model checking problem for artifact-centric programs is decidable and can be executed on finite models.
Section 7 reports a scenario from the artifact systems literature. This is used to exemplify the technique by providing finite abstractions that can be effectively verified.
We conclude in Section 8 where we consider the limitations of the approach and point to further work.
2 Artifact-Centric Multi-Agent Systems
In this section we formalise artifact-centric systems and state their verification problem. As data and databases are important constituents of artifact systems, our formalisation of artifacts relies on them as underpinning concepts. However, as discussed in the previous section, we here give prominence to agent-based concepts. As such, we define our systems as comprising both the artifacts in the system as well as the agents that interact with the system.
A standard paradigm for logic-based reasoning about agent systems is interpreted systems [Parikh & Ramanujam, 1985; Fagin et al., 1995]. In this setting agents are endowed with private local states and evolve by performing actions according to an individual protocol. Since data play a key part, and in order to specify properties of the artifact system, we define the agents’ local states as evolving database instances. We call this formalisation artifact-centric multi-agent systems (AC-MAS). AC-MAS enable us to represent naturally and concisely concepts much used in the artifact paradigm, such as that of view discussed earlier.
Our specification language will include temporal-epistemic logic but also quantification over a domain so as to represent the data. This is an unusual verification setting, so we will formally define the model checking problem for this setup.
2.1 Databases and First-Order Logic
As discussed above, we use databases as the basic building blocks for defining the states of the agents and the artifact system. We here fix the notation and terminology used. We refer to [Abiteboul et al., 1995] for more details on databases.
Definition 2.1 (Database Schemas)
A (relational) database schema $\mathcal{D}$ is a set $\{P_1/q_1, \ldots, P_n/q_n\}$ of relation symbols $P_i$, each associated with its arity $q_i \in \mathbb{N}$.
Instances of database schemas are defined over interpretation domains.
Definition 2.2 (Database Instances)
Given an interpretation domain $U$ and a database schema $\mathcal{D}$, a $\mathcal{D}$-instance over $U$ is a mapping $D$ associating each relation symbol $P_i \in \mathcal{D}$ with a finite $q_i$-ary relation over $U$, i.e., $D(P_i) \subseteq U^{q_i}$.
The set of all $\mathcal{D}$-instances over an interpretation domain $U$ is denoted by $\mathcal{D}(U)$. We simply refer to “instances” whenever the database schema is clear from the context. The active domain of an instance $D$, denoted as $ad(D)$, is the set of all individuals in $U$ occurring in some tuple of some predicate interpretation $D(P_i)$. Observe that, since $\mathcal{D}$ contains a finite number of relation symbols and each $D(P_i)$ is finite, so is $ad(D)$.
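As an illustration of the two definitions above, the following minimal Python sketch represents a schema as a mapping from relation symbols to arities and an instance as a mapping from relation symbols to finite sets of tuples. The names `is_instance` and `active_domain`, as well as the `Order`/`Paid` schema, are hypothetical and chosen only for this example; they are not part of the formal development.

```python
from typing import Dict, Set, Tuple

Schema = Dict[str, int]               # relation symbol -> arity
Instance = Dict[str, Set[Tuple]]      # relation symbol -> finite relation

def is_instance(schema: Schema, inst: Instance) -> bool:
    """True iff inst interprets exactly the schema's symbols with tuples of the right arity."""
    return set(inst) == set(schema) and all(
        len(row) == schema[name] for name, rel in inst.items() for row in rel
    )

def active_domain(inst: Instance) -> set:
    """All individuals occurring in some tuple of some relation of inst."""
    return {value for rel in inst.values() for row in rel for value in row}

# Hypothetical example: two orders, one of which has been paid.
schema = {"Order": 2, "Paid": 1}
inst = {"Order": {("o1", "alice"), ("o2", "bob")}, "Paid": {("o1",)}}
```

Since every relation is a finite set, the active domain computed here is finite even when the interpretation domain is not, which is the observation made at the end of the paragraph above.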
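To fix the notation, we recall the syntax of first-order formulas with equality and no function symbols. Let $Var$ be a countable set of individual variables and $Con$ be a finite set of individual constants. A term is any element $t \in Var \cup Con$.
Definition 2.3 (FO-formulas over $\mathcal{D}$)
Given a database schema $\mathcal{D}$, the formulas $\varphi$ of the first-order language $\mathcal{L}_{\mathcal{D}}$ are defined by the following BNF grammar:
$$\varphi ::= t = t' \mid P_i(t_1, \ldots, t_{q_i}) \mid \neg\varphi \mid \varphi \rightarrow \varphi \mid \forall x \varphi$$
where $P_i \in \mathcal{D}$, $t_1, \ldots, t_{q_i}$ is a $q_i$-tuple of terms and $t, t'$ are terms.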
We assume “$=$” to be a special binary predicate with fixed obvious interpretation. To summarise, $\mathcal{L}_{\mathcal{D}}$ is a first-order language with equality over the relational vocabulary $\mathcal{D}$ with no function symbols and with finitely many constant symbols from $Con$. Observe that considering a finite set of constants is not a limitation. Indeed, since we will be working with finite sets of formulas, $Con$ can always be defined so as to be able to express any formula of interest.
In the following we use the standard abbreviations $\exists$, $\wedge$, $\vee$, and $\neq$. Also, free and bound variables are defined as standard. For a formula $\varphi$ we denote the set of its variables as $vars(\varphi)$, the set of its free variables as $free(\varphi)$, and the set of its constants as $const(\varphi)$. We write $\varphi(\vec{x})$ to list explicitly in arbitrary order all the free variables $\vec{x}$ of $\varphi$. By slight abuse of notation, we treat $\vec{x}$ as a set, thus we write $\vec{x} = free(\varphi)$. A sentence is a formula with no free variables.
Given an interpretation domain $U$ such that $Con \subseteq U$, an assignment is a function $\sigma : Var \rightarrow U$. For an assignment $\sigma$, we denote by $\sigma^x_u$ the assignment such that: (i) $\sigma^x_u(x) = u$; and (ii) $\sigma^x_u(x') = \sigma(x')$, for every $x' \in Var$ different from $x$. For convenience, we extend assignments to constants so that $\sigma(t) = t$, if $t \in Con$; that is, we assume a Herbrand interpretation of constants. We can now define the semantics of $\mathcal{L}_{\mathcal{D}}$.
Definition 2.4 (Satisfaction of FO-formulas)
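Given a $\mathcal{D}$-instance $D$, an assignment $\sigma$, and an FO-formula $\varphi \in \mathcal{L}_{\mathcal{D}}$, we inductively define whether $D$ satisfies $\varphi$ under $\sigma$, written $(D, \sigma) \models \varphi$, as follows:
$(D, \sigma) \models P_i(t_1, \ldots, t_{q_i})$ iff $\langle \sigma(t_1), \ldots, \sigma(t_{q_i}) \rangle \in D(P_i)$
$(D, \sigma) \models t = t'$ iff $\sigma(t) = \sigma(t')$
$(D, \sigma) \models \neg\varphi$ iff it is not the case that $(D, \sigma) \models \varphi$
$(D, \sigma) \models \varphi \rightarrow \psi$ iff $(D, \sigma) \models \neg\varphi$ or $(D, \sigma) \models \psi$
$(D, \sigma) \models \forall x \varphi$ iff for all $u \in ad(D)$, we have that $(D, \sigma^x_u) \models \varphi$
A formula $\varphi$ is true in $D$, written $D \models \varphi$, iff $(D, \sigma) \models \varphi$, for all assignments $\sigma$.
Observe that we adopt an active-domain semantics, that is, quantified variables range only over the active domain of $D$. Also notice that constants are interpreted rigidly; so, two constants are equal if and only if they are syntactically the same. In the rest of the paper, we assume that every interpretation domain includes $Con$. Also, as a usual shortcut, we write $(D, \sigma) \not\models \varphi$ to express that it is not the case that $(D, \sigma) \models \varphi$.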
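The inductive definition of satisfaction can be read as a recursive evaluator. The sketch below is one possible rendering under explicit assumptions: formulas are encoded as nested tuples, variables are strings beginning with `?`, and every other term is a constant that denotes itself. The encoding and the names `holds`, `value`, and `exists` are hypothetical and serve only to illustrate the active-domain reading of quantifiers.

```python
def active_domain(inst):
    """All individuals occurring in some tuple of some relation of inst."""
    return {value for rel in inst.values() for row in rel for value in row}

def value(term, sigma):
    # Assumed encoding: variables start with "?"; constants denote themselves.
    is_var = isinstance(term, str) and term.startswith("?")
    return sigma[term] if is_var else term

def holds(inst, sigma, phi):
    """Decide whether inst satisfies phi under the assignment sigma."""
    op = phi[0]
    if op == "atom":                      # P(t1, ..., tq)
        _, pred, terms = phi
        return tuple(value(t, sigma) for t in terms) in inst[pred]
    if op == "eq":                        # t = t'
        return value(phi[1], sigma) == value(phi[2], sigma)
    if op == "not":
        return not holds(inst, sigma, phi[1])
    if op == "imp":
        return (not holds(inst, sigma, phi[1])) or holds(inst, sigma, phi[2])
    if op == "forall":                    # x ranges over the active domain only
        _, x, body = phi
        return all(holds(inst, {**sigma, x: u}, body) for u in active_domain(inst))
    raise ValueError(f"unknown connective: {op}")

def exists(x, body):
    """Existential quantification as the usual abbreviation of not-forall-not."""
    return ("not", ("forall", x, ("not", body)))

inst = {"Order": {("o1", "alice"), ("o2", "bob")}, "Paid": {("o1",)}}
# Every paid individual is an order placed by someone.
paid_implies_ordered = ("forall", "?x", ("imp", ("atom", "Paid", ("?x",)),
                        exists("?y", ("atom", "Order", ("?x", "?y")))))
everything_paid = ("forall", "?x", ("atom", "Paid", ("?x",)))
```

Because quantifiers range over the active domain, a universally quantified formula is vacuously satisfied by an instance whose relations are all empty, whatever the size of the interpretation domain.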
Finally, we introduce the $\oplus$ operator on $\mathcal{D}$-instances that will be used later in the paper. Let the primed version of a database schema $\mathcal{D}$ be the schema $\mathcal{D}' = \{P'_1/q_1, \ldots, P'_n/q_n\}$ obtained from $\mathcal{D}$ by syntactically replacing each predicate symbol $P_i$ with its primed version $P'_i$ of the same arity.
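Definition 2.5 ($\oplus$ Operator)
Given two $\mathcal{D}$-instances $D$ and $D'$, we define $D \oplus D'$ as the $(\mathcal{D} \cup \mathcal{D}')$-instance such that $(D \oplus D')(P_i) = D(P_i)$ and $(D \oplus D')(P'_i) = D'(P_i)$.
Intuitively, the $\oplus$ operator defines a disjunctive join of the two instances, where relation symbols in $\mathcal{D}$ are interpreted according to $D$, while their primed versions are interpreted according to $D'$.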
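A small sketch may help to fix the intuition. Assuming instances are encoded as dictionaries from relation names to sets of tuples, and that the primed version of a symbol is written by appending an apostrophe to its name (both conventions are assumptions of this example, as is the function name `oplus`), the operator can be rendered as follows.

```python
def oplus(d, d_prime):
    """Combine two instances of one schema; the second is stored under primed names."""
    if set(d) != set(d_prime):
        raise ValueError("both instances must be over the same schema")
    combined = {name: set(rel) for name, rel in d.items()}
    combined.update({name + "'": set(rel) for name, rel in d_prime.items()})
    return combined

# Hypothetical example: the same relation in two different instances.
first = {"Paid": {("o1",)}}
second = {"Paid": {("o1",), ("o2",)}}
both = oplus(first, second)
```

The result keeps the two interpretations side by side, so a single first-order formula over the combined schema can refer to both instances at once.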
2.2 Artifact-Centric Multi-Agent Systems
In the following we introduce the semantic structures that we will use throughout the paper. We define an artifact-centric multi-agent system as a system comprising an environment representing all interacting artifacts in the system and a finite set of agents interacting with such environment. As agents have views of the artifact state, i.e., projections of the status of particular artifacts, we assume the building blocks of their private local states also to be modelled as database instances. In line with the interpreted systems semantics [Fagin et al., 1995] not everything in the agents’ states needs to be present in the environment; a portion of it may be entirely private and not replicated in other agents’ states. So, we start by introducing the notion of agent.
Definition 2.6 (Agent)
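Given an interpretation domain $U$, an agent is a tuple $A_i = \langle \mathcal{D}_i, L_i, Act_i, Pr_i \rangle$, where:
• $\mathcal{D}_i$ is the local database schema;
• $L_i \subseteq \mathcal{D}_i(U)$ is the set of local states;
• $Act_i$ is the finite set of action types of the form $\alpha(\vec{p})$, where $\vec{p}$ is the tuple of abstract parameters;
• $Pr_i : L_i \rightarrow 2^{Act_i(U)}$ is the local protocol function, where $Act_i(U)$ is the set of ground actions of the form $\alpha(\vec{u})$ where $\alpha(\vec{p}) \in Act_i$ and $\vec{u} \in U^{|\vec{p}|}$ is a tuple of ground parameters.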
Intuitively, at a given time each agent $A_i$ is in some local state $l_i \in L_i$ that represents all the information agent $A_i$ has at its disposal. In this sense we follow [Fagin et al., 1995] but require that this information is structured as a database. Again, following standard literature we assume that the agents are autonomous and proactive and perform the actions in $Act_i$ according to the protocol function $Pr_i$. In the definition above we distinguish between “abstract parameters” to denote the language in which particular action parameters are given, and their concrete values or “ground parameters”.
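The components of an agent can be sketched as a small data structure. The following Python fragment is illustrative only: the class `Agent`, the method `enabled`, and the buyer agent with its `pay` and `skip` actions are hypothetical names introduced for this example. Action types are recorded by name and number of abstract parameters, and the protocol maps a local state, itself a database instance, to the set of ground actions enabled in it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

Instance = Dict[str, Set[tuple]]
GroundAction = Tuple[str, tuple]          # (action name, ground parameters)

@dataclass
class Agent:
    schema: Dict[str, int]                # local schema: relation symbol -> arity
    action_types: Dict[str, int]          # action name -> number of abstract parameters
    protocol: Callable[[Instance], Set[GroundAction]]

    def enabled(self, local_state: Instance) -> Set[GroundAction]:
        """Ground actions the protocol allows in local_state."""
        actions = self.protocol(local_state)
        for name, params in actions:
            # Every ground action must instantiate one of the declared action types.
            if self.action_types.get(name) != len(params):
                raise ValueError(f"{name}{params} matches no action type")
        return actions

# Hypothetical buyer: may pay any known order that is not yet paid, or skip.
def buyer_protocol(state: Instance) -> Set[GroundAction]:
    unpaid = {o for (o,) in state["Known"]} - {o for (o,) in state["Paid"]}
    return {("pay", (o,)) for o in unpaid} | {("skip", ())}

buyer = Agent({"Known": 1, "Paid": 1}, {"pay": 1, "skip": 0}, buyer_protocol)
state = {"Known": {("o1",), ("o2",)}, "Paid": {("o1",)}}
```

In this example the abstract parameter of `pay` is instantiated only with values from the local state; the definition itself allows ground parameters to be drawn from the whole interpretation domain.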
We assume that the agents interact among themselves and with an environment comprising all artifacts in the system. As artifacts are entities involving both data and process, we can see them as collections of database instances paired with actions and governed by special protocols. Without loss of generality we can assume the environment state to be a single database instance including all artifacts in the system. From a purely formal point of view this allows us to represent the environment as a special agent. Of course, in any specific instantiation the environment and the agents will be rather different, exactly in line with the standard propositional version of interpreted systems.
We can therefore define the synchronous composition of agents with the environment.
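Definition 2.7 (Artifact-Centric Multi-Agent Systems)
Given an interpretation domain $U$ and a set $Ag = \{A_0, \ldots, A_n\}$ of agents $A_i = \langle \mathcal{D}_i, L_i, Act_i, Pr_i \rangle$ defined on $U$, an artifact-centric multi-agent system (or AC-MAS) is a tuple $\mathcal{P} = \langle \mathcal{S}, U, s_0, \tau \rangle$ where:
• $\mathcal{S} \subseteq L_0 \times \cdots \times L_n$ is the set of reachable global states;
• $U$ is the interpretation domain;
• $s_0 \in \mathcal{S}$ is the initial global state;
• $\tau : \mathcal{S} \times Act(U) \rightarrow 2^{\mathcal{S}}$ is the global transition function, where $Act(U) = Act_0(U) \times \cdots \times Act_n(U)$ is the set of global (ground) actions, and $\tau(\langle l_0, \ldots, l_n \rangle, \langle \alpha_0(\vec{u}_0), \ldots, \alpha_n(\vec{u}_n) \rangle)$ is defined iff $\alpha_i(\vec{u}_i) \in Pr_i(l_i)$ for every $i \leq n$.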
As we will see in later sections, ACMAS are the natural extension of interpreted systems to the first order to account for environments constituted of artifactcentric systems. They can be seen as a specialisation of quantified interpreted systems [\BCAYBelardinelli \BBA LomuscioBelardinelli \BBA Lomuscio2012], a general extension of interpreted systems to the firstorder case.
In the formalisation above the agent $A_0$ is referred to as the environment $E$. The environment includes all artifacts in the system as well as additional information to facilitate communication between the agents and the hub, e.g., messages in transit etc. At any given time an ACMAS is described by a tuple of database instances, representing all the agents in the system as well as the artifact system. A single interpretation domain for all database schemas is given. Note that this does not break the generality of the representation as we can always extend the domain of all agents and the environment before composing them into a single ACMAS. The global transition function defines the evolution of the system through synchronous composition of actions for the environment and all agents in the system.
Much of the interaction we are interested in modelling involves message exchanges with payload, hence the action parameters, between agents and the environment, i.e., agents operating on the artifacts. However, note that the formalisation above does not preclude us from modelling agenttoagent interactions, as the global transition function does not rule out successors in which only some agents change their local state following some actions. Also observe that essential concepts such as views are naturally expressed in ACMAS by insisting that the local state of an agent includes part of the environment’s, i.e., the artifacts the agent has access to. Not all ACMAS need to have views defined, so it is also possible for the views to be empty.
Other artifactbased concepts such as lifecycles are naturally expressed in ACMAS. As artifacts are modelled as part of the environment, a lifecycle is naturally encoded in ACMAS simply as the sequence of changes induced by the transition function on the fragment of the environment representing the lifecycle in question. We will show an example of this in Section 7.
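Some technical remarks now follow. To simplify the notation, we denote a global ground action as $\vec{\alpha}(\vec{u})$, where $\vec{\alpha} = \langle \alpha_0(\vec{p}_0), \ldots, \alpha_n(\vec{p}_n) \rangle$ and $\vec{u} = \langle \vec{u}_0, \ldots, \vec{u}_n \rangle$, with each $\vec{u}_i$ of appropriate size. We define the transition relation $\to$ on $S \times S$ such that $s \to s'$ if and only if there exists a $\vec{\alpha}(\vec{u}) \in Act(U)$ such that $s' \in \tau(s, \vec{\alpha}(\vec{u}))$. If $s \to s'$, we say that $s'$ is a successor of $s$. A run $r$ from $s \in S$ is an infinite sequence $s^0 \to s^1 \to \cdots$, with $s^0 = s$. For $n \in \mathbb{N}$, we take $r(n) = s^n$. A state $s'$ is reachable from $s$ if there exists a run $r$ from the global state $r(0) = s$ such that $r(i) = s'$, for some $i \geq 0$. We assume that the relation $\to$ is serial. This can be easily obtained by assuming that each agent has a skip action enabled at each local state and that performing skip induces no changes in any of the local states. We consider $S$ to be the set of states reachable from the initial state $s_0$. For convenience we will use also the concept of temporalepistemic (t.e., for short) run. Formally a t.e. run $r$ from a state $s \in S$ is an infinite sequence $s^0 \leadsto s^1 \leadsto \cdots$ such that $s^0 = s$ and, for every $i \geq 0$, $s^i \to s^{i+1}$ or $s^i \sim_k s^{i+1}$, for some $k \in Ag$. A state $s'$ is said to be temporallyepistemically reachable (t.e. reachable, for short) from $s$ if there exists a t.e. run $r$ from the global state $r(0) = s$ such that for some $i \geq 0$ we have that $r(i) = s'$. Obviously, temporalepistemic runs include purely temporal runs as a special case.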
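For a finite fragment of a system, the successor relation, reachability and seriality described above can be computed directly. The sketch below is an illustration only: it assumes the transition function is given explicitly as a dictionary from (state, action) pairs to sets of successor states, and the state and action names are hypothetical placeholders.

```python
from collections import deque


def successors(tau, s):
    """All t with s -> t, i.e. t in tau(s, a) for some global ground action a."""
    return {t for (src, _action), targets in tau.items() if src == s
            for t in targets}


def reachable(tau, s0):
    """States reachable from s0 through the transition relation (breadth-first)."""
    seen, frontier = {s0}, deque([s0])
    while frontier:
        s = frontier.popleft()
        for t in successors(tau, s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen


def is_serial(tau, states):
    """True iff every state in `states` has at least one successor."""
    return all(successors(tau, s) for s in states)


# Hypothetical explicit transition function over three named states.
tau = {
    ("s0", "a"): {"s1"},
    ("s1", "skip"): {"s1"},
    ("s2", "a"): {"s0"},
}
```

In this toy system `s2` is not reachable from `s0`, so it would not belong to the set of global states of an ACMAS whose initial state is `s0`; the `skip` loop on `s1` is what keeps the relation serial on the reachable part.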
As in plain interpreted systems [\BCAYFagin, Halpern, Moses, \BBA VardiFagin et al.1995], we say that two global states $s = \langle l_0, \ldots, l_n \rangle$ and $s' = \langle l'_0, \ldots, l'_n \rangle$ are epistemically indistinguishable for agent $A_i$, written $s \sim_i s'$, if $l_i = l'_i$. Differently from interpreted systems the local equality is evaluated on database instances. Also, notice that we admit $U$ to be infinite, thereby allowing the possibility of the set $S$ of states to be infinite. Indeed, unless we specify otherwise, we will assume to be working with infinitestate ACMAS.
Finally, for technical reasons it is useful to refer to a global database schema of an ACMAS. Every global state is associated with the (global) instance such that , for . We omit the subscript when is clear from the context and we write for . Notice that for every , the associated with is unique, while the converse is not true in general.
2.3 Model Checking
We now define the problem of verifying an artifactcentric multiagent system against a specification of interest. By following the artifactcentric model, we wish to give data the same prominence as processes. To deal with data and the underlying database instances, our specification language needs to include firstorder logic. Further, we require temporal logic to describe the system execution. Lastly, we use epistemic logic to express the information the agents have at their disposal. Hence, we define a firstorder temporal epistemic specification language to be interpreted on ACMAS. The specification language will be used in Section 6 to formalise properties of artifactcentric programs.
Definition 2.8 (The Logic FOCTLK)
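The firstorder CTLK (or FOCTLK) formulas $\varphi$ over a database schema $D$ are inductively defined by the following BNF:
$\varphi ::= \phi \mid \neg\varphi \mid \varphi \to \varphi \mid \forall x \varphi \mid AX\varphi \mid A\varphi U\varphi \mid E\varphi U\varphi \mid K_i\varphi \mid C\varphi$
where $\phi$ is an FOformula over $D$ and $0 < i \leq n$.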
The notions of free and bound variables for FOCTLK extend straightforwardly from FOformulas, as well as functions vars, free, and const. As usual, the temporal formulas $AX\varphi$ and $A\varphi U\varphi'$ (resp. $E\varphi U\varphi'$) are read as “for all runs, at the next step $\varphi$” and “for all runs (resp. some run), $\varphi$ until $\varphi'$”. The epistemic formulas $K_i\varphi$ and $C\varphi$ intuitively mean that “agent $i$ knows $\varphi$” and “it is common knowledge among all agents that $\varphi$” respectively. We use the abbreviations $EX\varphi$, $AF\varphi$, $AG\varphi$, $EF\varphi$, and $EG\varphi$ as standard. Observe that free variables can occur within the scope of modal operators. This allows for the unconstrained alternation of quantifiers and modal operators, and lets us refer to elements in different modal contexts. We consider also a number of fragments of FOCTLK. The sentence atomic version of FOCTLK without epistemic modalities, or SAFOCTL, is the language obtained from Definition 2.8 by removing the clauses for epistemic operators and restricting atomic formulas to firstorder sentences, so that no variable appears free in the scope of a modal operator:
$\varphi ::= \phi \mid \neg\varphi \mid \varphi \to \varphi \mid AX\varphi \mid A\varphi U\varphi \mid E\varphi U\varphi$
where $\phi$ is a sentence.
We will consider also the language FOECTLK, i.e., the existential fragments of FOCTLK, defined as follows:
where , with and the standard abbreviations, , and .
The semantics of FOCTLK formulas is defined as follows.
Definition 2.9 (Satisfaction for FOCTLK)
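Consider an ACMAS $\mathcal{P}$, an FOCTLK formula $\varphi$, a state $s \in S$, and an assignment $\sigma$. We write $D_s$ for the global instance associated with $s$, and $\sigma^x_u$ for the assignment that agrees with $\sigma$ except that it maps $x$ to $u$. We inductively define whether $\mathcal{P}$ satisfies $\varphi$ in $s$ under $\sigma$, written $(\mathcal{P}, s, \sigma) \models \varphi$, as follows:

$(\mathcal{P}, s, \sigma) \models \varphi$ iff $(D_s, \sigma) \models \varphi$, if $\varphi$ is an FOformula

$(\mathcal{P}, s, \sigma) \models \neg\varphi$ iff it is not the case that $(\mathcal{P}, s, \sigma) \models \varphi$

$(\mathcal{P}, s, \sigma) \models \varphi \to \varphi'$ iff $(\mathcal{P}, s, \sigma) \not\models \varphi$ or $(\mathcal{P}, s, \sigma) \models \varphi'$

$(\mathcal{P}, s, \sigma) \models \forall x \varphi$ iff for all $u \in adom(s)$, $(\mathcal{P}, s, \sigma^x_u) \models \varphi$

$(\mathcal{P}, s, \sigma) \models AX\varphi$ iff for all runs $r$, if $r(0) = s$, then $(\mathcal{P}, r(1), \sigma) \models \varphi$

$(\mathcal{P}, s, \sigma) \models A\varphi U\varphi'$ iff for all runs $r$, if $r(0) = s$, then there is $k \geq 0$ s.t. $(\mathcal{P}, r(k), \sigma) \models \varphi'$, and for all $j$, $0 \leq j < k$ implies $(\mathcal{P}, r(j), \sigma) \models \varphi$

$(\mathcal{P}, s, \sigma) \models E\varphi U\varphi'$ iff for some run $r$, $r(0) = s$ and there is $k \geq 0$ s.t. $(\mathcal{P}, r(k), \sigma) \models \varphi'$, and for all $j$, $0 \leq j < k$ implies $(\mathcal{P}, r(j), \sigma) \models \varphi$

$(\mathcal{P}, s, \sigma) \models K_i\varphi$ iff for all $s'$, $s \sim_i s'$ implies $(\mathcal{P}, s', \sigma) \models \varphi$

$(\mathcal{P}, s, \sigma) \models C\varphi$ iff for all $s'$, $s \sim s'$ implies $(\mathcal{P}, s', \sigma) \models \varphi$

where $\sim$ is the transitive closure of $\bigcup_{1 \leq i \leq n} \sim_i$.
A formula $\varphi$ is said to be true at a state $s$, written $(\mathcal{P}, s) \models \varphi$, if $(\mathcal{P}, s, \sigma) \models \varphi$ for all assignments $\sigma$. Moreover, $\varphi$ is said to be true in $\mathcal{P}$, written $\mathcal{P} \models \varphi$, if $(\mathcal{P}, s_0) \models \varphi$.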
A key concern in this paper is to explore the model checking of ACMAS against firstorder temporalepistemic specifications.
Definition 2.10 (Model Checking)
Model checking an ACMAS $\mathcal{P}$ against an FOCTLK formula $\varphi$ amounts to finding an assignment $\sigma$ such that $(\mathcal{P}, s_0, \sigma) \models \varphi$.
It is easy to see that whenever $U$ is finite the model checking problem is decidable as $\mathcal{P}$ is a finitestate system. In general this is not the case.
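To make the finite case concrete, the following Python sketch evaluates the temporal and epistemic operators of Definition 2.9 over an explicitly given finite state space by the usual fixpoint computations. It is a simplified illustration, not the procedure of the paper: it assumes that the subformulas have already been evaluated to sets of states (so quantification and assignments are not handled), that the successor relation is serial, and that all state names and local states are hypothetical.

```python
def sat_AX(succ, phi):
    """States all of whose successors lie in the set phi."""
    return {s for s, targets in succ.items() if targets <= phi}


def sat_EU(succ, phi, psi):
    """Least fixpoint for E[phi U psi]: psi now, or phi now and E[phi U psi] next."""
    result = set(psi)
    while True:
        grown = result | {s for s in phi if succ[s] & result}
        if grown == result:
            return result
        result = grown


def sat_AU(succ, phi, psi):
    """Least fixpoint for A[phi U psi]; relies on succ being serial."""
    result = set(psi)
    while True:
        grown = result | {s for s in phi if succ[s] <= result}
        if grown == result:
            return result
        result = grown


def sat_K(local, i, phi):
    """States where agent i knows phi: phi holds in every state that has
    the same i-th local state."""
    return {s for s in local
            if all(t in phi for t in local if local[t][i] == local[s][i])}


# Hypothetical three-state system: s0 branches to two absorbing states.
succ = {"s0": {"s1", "s2"}, "s1": {"s1"}, "s2": {"s2"}}
# Global states as tuples (environment state, local state of agent 1).
local = {"s0": ("e0", "x"), "s1": ("e1", "x"), "s2": ("e2", "y")}
```

With `phi = {"s0"}` and `psi = {"s1"}`, the state `s0` satisfies the existential until but not the universal one, because the run through `s2` never reaches `psi`. Agent 1 cannot tell `s0` and `s1` apart, so it knows a property at `s1` only if the property also holds at `s0`.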
Theorem 2.11
The model checking problem for ACMAS w.r.t. FOCTLK is undecidable.
Proof (sketch). This can be proved by showing that every Turing machine $T$ whose tape contains an initial input $I$ can be simulated by an artifact system $\mathcal{P}_{T,I}$. The problem of checking whether $T$ terminates on that particular input can be reduced to checking whether $\mathcal{P}_{T,I} \models \varphi$, where $\varphi$ encodes the termination condition. The detailed construction is similar to that of Theorem 4.10 of [\BCAYDeutsch, Sui, \BBA VianuDeutsch et al.2007].
Given the general setting in which the model checking problem is defined above, the negative result is not surprising. In the following we identify syntactic and semantic restrictions for which the problem is decidable.
2.4 Isomorphisms
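We now investigate the concept of isomorphism on ACMAS. This will be needed in later sections to produce finite abstractions of infinitestate ACMAS. In what follows let $\mathcal{P} = \langle S, U, s_0, \tau \rangle$ and $\mathcal{P}' = \langle S', U', s'_0, \tau' \rangle$ be two ACMAS.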
Definition 2.12 (Isomorphism)
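Two local states $l \in D(U)$ and $l' \in D(U')$ are isomorphic, written $l \simeq l'$, iff there exists a bijection $\iota : adom(l) \cup C \to adom(l') \cup C$ such that:

$\iota$ is the identity on $C$;

for every relation symbol $P \in D$ and every tuple $\vec{u}$ over $adom(l) \cup C$ of the arity of $P$, we have that $\vec{u} \in l(P)$ iff $\iota(\vec{u}) \in l'(P)$.
When this is the case, we say that $\iota$ is a witness for $l \simeq l'$.
Two global states $s \in S$ and $s' \in S'$ are isomorphic, written $s \simeq s'$, iff there exists a bijection $\iota : adom(s) \cup C \to adom(s') \cup C$ such that for every $j \in Ag$, $\iota$ is a witness for $l_j \simeq l'_j$.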
Notice that isomorphisms preserve the constants in $C$ as well as predicates in the local states up to renaming of the corresponding terms. Any function $\iota$ as above is called a witness for $s \simeq s'$. Obviously, the relation $\simeq$ is an equivalence relation. Given a function $f : U \to U'$ defined on $adom(s)$, $f(s)$ denotes the interpretation in $D(U')$ obtained from $s$ by renaming each $u \in adom(s)$ as $f(u)$. If $f$ is also injective (thus invertible) and the identity on $C$, then $f(s) \simeq s$.
Example. For an example of isomorphic states, consider an agent with local database schema , let be an interpretation domain, and fix the set of constants. Let be the local state such that and (see Figure 1). Then, the local state such that and is isomorphic to . This can be easily seen by considering the isomorphism , where: , , and . On the other hand, the state where and is not isomorphic to . Indeed, although a bijection exists that “transforms” into , it is easy to see that none can be such that .
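For small states, the existence of a witness in the sense of Definition 2.12 can be checked by brute force. The sketch below is illustrative only: it assumes states are finite sets of facts over string-valued elements, it enumerates all bijections that fix the constants, and the relations `P`, `Q` and the elements `a`, `b`, `c` are hypothetical data rather than the instances of Figure 1.

```python
from itertools import permutations


def adom(state):
    """Active domain: all elements occurring in some fact of the state."""
    return {u for _rel, args in state for u in args}


def find_witness(l1, l2, constants):
    """Return a bijection (as a dict) witnessing that l1 and l2 are
    isomorphic, or None if no such bijection exists. Constants are fixed."""
    free1 = sorted(adom(l1) - constants)
    free2 = sorted(adom(l2) - constants)
    if len(free1) != len(free2):
        return None
    for image in permutations(free2):
        iota = {c: c for c in constants}   # identity on the constants
        iota.update(zip(free1, image))     # candidate renaming of the rest
        renamed = {(rel, tuple(iota[u] for u in args)) for rel, args in l1}
        if renamed == set(l2):             # facts are preserved in both directions
            return iota
    return None


constants = {"b"}
l1 = frozenset({("P", ("a", "b")), ("Q", ("a",))})
l2 = frozenset({("P", ("c", "b")), ("Q", ("c",))})   # l1 with a renamed to c
l3 = frozenset({("P", ("b", "c")), ("Q", ("c",))})   # argument positions swapped
```

Here `l1` and `l2` are isomorphic through the renaming of `a` to `c`, whereas no bijection fixing the constant `b` maps `l1` to `l3`. The enumeration is factorial in the number of non-constant elements, so this is only meant for small examples.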
Note that, while isomorphic states have the same relational structure, two isomorphic states do not necessarily satisfy the same FOformulas as satisfaction depends also on the values assigned to free variables. To account for this, we introduce the following notion.
Definition 2.13 (Equivalent assignments)
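Given two states $s \in S$ and $s' \in S'$, and a set of variables $V \subseteq Var$, two assignments $\sigma : Var \to U$ and $\sigma' : Var \to U'$ are equivalent for $V$ w.r.t. $s$ and $s'$ iff there exists a bijection $\gamma : adom(s) \cup C \cup \sigma(V) \to adom(s') \cup C \cup \sigma'(V)$ such that:

the restriction of $\gamma$ to $adom(s) \cup C$ is a witness for $s \simeq s'$;

$\sigma'(x) = \gamma(\sigma(x))$ for every $x \in V$.
Intuitively, equivalent assignments preserve both the (in)equalities of the variables in $V$ and the constants in $s$, $s'$ up to renaming. Note that, by definition, the above implies that $s$, $s'$ are isomorphic. We say that two assignments are equivalent for an FOCTLK formula $\varphi$, omitting the states $s$ and $s'$ when they are clear from the context, if these are equivalent for $free(\varphi)$.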
We can now show that isomorphic states satisfy exactly the same FOformulas.
Proposition 2.14
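Given two isomorphic states $s \in S$ and $s' \in S'$, an FOformula $\varphi$, and two assignments $\sigma$ and $\sigma'$ equivalent for $\varphi$, we have that
$(D_s, \sigma) \models \varphi$ iff $(D_{s'}, \sigma') \models \varphi$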
Proof. The proof is by induction on the structure of . Consider the base case for the atomic formula . Then iff . Since and are equivalent for , and , this is the case iff , that is, . The base case for is proved similarly, by observing that the satisfaction of depends only on the assignments, and that the function of Def. 2.13 is a bijection, thus all the (in)equalities between the values assigned by and are preserved. This is sufficient to guarantee that iff . The inductive step for the propositional connectives is straightforward. Finally, if , then iff for all , . Now consider the witness for , where is as in Def. 2.13. We have that and are equivalent for . By induction hypothesis iff . Since is a bijection, this is the case iff for all , , i.e., .
This leads us to the following result.
Corollary 2.15
Given two isomorphic states $s \in S$ and $s' \in S'$ and an FOsentence $\varphi$, we have that
$D_s \models \varphi$ iff $D_{s'} \models \varphi$
Proof. From right to left. Suppose, by contradiction, that . Then there exists an assignment s.t. . Since , if is a witness for , then the assignment is equivalent to for and . By Proposition 2.14 we have that , that is, . The case from left to right can be shown similarly.
Thus, isomorphic states cannot be distinguished by FOsentences. This enables us to use this notion when defining simulations as we will see in the next section.
3 Abstractions for Sentence Atomic FOCTL
In the previous section we have observed that model checking ACMAS against FOCTLK is undecidable in general. So, it is clearly of interest to identify decidable settings. In what follows we introduce two main results. The first, presented in this section, identifies restrictions on the language; the second, presented in the next section, focuses on semantic constraints. While these cases are in some sense orthogonal to each other, we show that they both lead to decidable model checking problems. They are also both carried out on a rather natural subclass of ACMAS that we call bounded, which we identify below. Our goal for proceeding in this manner is to identify finite abstractions of infinitestate ACMAS so that verification of programs, that admit ACMAS as models, can be conducted on them, rather than on infinitestate ACMAS. We will see this in detail in Section 6.
Given our aims we begin by defining a first notion of bisimulation in the context of ACMAS. Bisimulations will be used to show that all bounded ACMAS admit a finite, bisimilar abstraction that satisfies the same SAFOCTL specifications as the original ACMAS. Also in what follows we assume that $\mathcal{P} = \langle S, U, s_0, \tau \rangle$ and $\mathcal{P}' = \langle S', U', s'_0, \tau' \rangle$.
Definition 3.1 (Simulation)
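A relation $R \subseteq S \times S'$ is a simulation iff $\langle s, s' \rangle \in R$ implies:

$s \simeq s'$;

for every $t \in S$, if $s \to t$ then there exists $t' \in S'$ s.t. $s' \to t'$ and $\langle t, t' \rangle \in R$.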
Definition 3.1 presents the standard notion of simulation applied to the case of ACMAS. The difference from the propositional case is that we here insist on the states being isomorphic, a generalisation from the usual requirement for propositional valuations to be equal [\BCAYBlackburn, de Rijke, \BBA VenemaBlackburn et al.2001]. As in the standard case, two states $s \in S$ and $s' \in S'$ are said to be similar, written $s \preceq s'$, if there exists a simulation relation $R$ s.t. $\langle s, s' \rangle \in R$. It can be proven that the similarity relation $\preceq$ is a simulation itself, and in particular the largest one w.r.t. set inclusion, and that it is transitive and reflexive. Finally, we say that $\mathcal{P}'$ simulates $\mathcal{P}$, written $\mathcal{P} \preceq \mathcal{P}'$, if $s_0 \preceq s'_0$. We extend the above to bisimulations.
Definition 3.2 (Bisimulation)
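A relation $B \subseteq S \times S'$ is a bisimulation iff both $B$ and $B^{-1} = \{\langle s', s \rangle \mid \langle s, s' \rangle \in B\}$ are simulations.
We say that two states $s \in S$ and $s' \in S'$ are bisimilar, written $s \approx s'$, if there exists a bisimulation $B$ s.t. $\langle s, s' \rangle \in B$. Similarly to simulations, it can be proven that the bisimilarity relation $\approx$ is the largest bisimulation. Further, it is an equivalence relation. Finally, $\mathcal{P}$ and $\mathcal{P}'$ are said to be bisimilar, written $\mathcal{P} \approx \mathcal{P}'$, if $s_0 \approx s'_0$.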
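On finite systems the largest bisimulation can be computed as a greatest fixpoint: start from all isomorphic pairs and repeatedly discard pairs that violate the forth or back condition. The sketch below illustrates this standard refinement under simplifying assumptions: both systems are finite and given by explicit successor maps, and the isomorphism test is passed in as a function `iso`. In the example, equality of hypothetical labels stands in for isomorphism of states.

```python
def largest_bisimulation(succ1, succ2, iso):
    """Greatest fixpoint: keep the pairs (s, t) that are isomorphic and whose
    successors can be matched in both directions inside the relation."""
    rel = {(s, t) for s in succ1 for t in succ2 if iso(s, t)}
    changed = True
    while changed:
        changed = False
        for s, t in list(rel):
            forth = all(any((s2, t2) in rel for t2 in succ2[t]) for s2 in succ1[s])
            back = all(any((s2, t2) in rel for s2 in succ1[s]) for t2 in succ2[t])
            if not (forth and back):
                rel.discard((s, t))
                changed = True
    return rel


# A three-state chain ending in a loop versus a single looping state,
# with all states treated as isomorphic.
chain = {0: {1}, 1: {2}, 2: {2}}
loop = {"a": {"a"}}
same = largest_bisimulation(chain, loop, lambda s, t: True)

# Two systems whose labels (a stand-in for isomorphism) can be told apart:
# state "b" can move to "c", which the first system cannot match.
left = {0: {1}, 1: {0}}
right = {"a": {"b"}, "b": {"a", "c"}, "c": {"c"}}
labels = {0: "p", 1: "q", "a": "p", "b": "q", "c": "q"}
different = largest_bisimulation(left, right, lambda s, t: labels[s] == labels[t])
```

In the first example every state of the chain is bisimilar to the single looping state, which is the kind of collapse that finite abstractions exploit. In the second the relation becomes empty, so the two initial states are not bisimilar.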
Since, as shown in Corollary 2.15, the satisfaction of FOsentences is invariant under isomorphisms, we can now extend the usual bisimulation result from the propositional case to that of SAFOCTL. We begin by showing a result on bisimilar runs.
Proposition 3.3
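Consider two ACMAS $\mathcal{P}$ and $\mathcal{P}'$ such that $\mathcal{P} \approx \mathcal{P}'$, $s \approx s'$, for some $s \in S$, $s' \in S'$, and a run $r$ of $\mathcal{P}$ such that $r(0) = s$. Then there exists a run $r'$ of $\mathcal{P}'$ such that:

$r'(0) = s'$;

for all $i \geq 0$, $r(i) \approx r'(i)$.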
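Proof. We show by induction that such a run $r'$ in $\mathcal{P}'$ exists. For $i = 0$, let $r'(0) = s'$. Obviously, $r(0) \approx r'(0)$. Now, assume, by induction hypothesis, that $r(i) \approx r'(i)$. Let $r(i) \to r(i+1)$. Since $r(i) \approx r'(i)$, by Def. 3.1, there exists $t' \in S'$ such that $r'(i) \to t'$ and $r(i+1) \approx t'$. Let $r'(i+1) = t'$; hence we obtain $r'(i) \to r'(i+1)$. By definition $r'$ is a run of $\mathcal{P}'$.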
This enables us to show that bisimilar ACMAS preserve SAFOCTL formulas. This is an extension of analogous results on propositional CTL.
Lemma 3.4
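Consider the ACMAS $\mathcal{P}$ and $\mathcal{P}'$ such that $\mathcal{P} \approx \mathcal{P}'$, $s \approx s'$, for some $s \in S$, $s' \in S'$, and an SAFOCTL formula $\varphi$. Then,
$(\mathcal{P}, s) \models \varphi$ iff $(\mathcal{P}', s') \models \varphi$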
Proof. The proof is by induction on the structure of . Observe first that since is sentenceatomic, its satisfaction does not depend on assignments. We report the proof for the lefttoright part of the implication; the converse can be shown similarly.
The base case for an FOsentence follows from Cor. 2.15. The inductive cases for propositional connectives are straightforward.
For , assume for contradiction that and . Then, there exists a run s.t. and . By Def. 3.2 and 3.1 there exists a s.t. and . Further, by seriality of , can be extended to a run s.t. and . By the induction hypothesis we obtain that . Hence, , which is a contradiction.
For , let be a run with such that there exists such that , and for every , implies . By Prop. 3.3 there exists a run s.t. and for all , . By the induction hypothesis we have that for each , iff , and iff . Therefore, is a run s.t. , , and for every , implies , i.e., .
For , assume for contradiction that and . Then, there exists a run s.t. and for every , if , then there exists s.t. and . By Prop. 3.3 there exists a run s.t. and for all , . Further, by the induction hypothesis we have that iff and iff . But then is s.t. and for every , if , then there exists s.t. and . That is, , which is a contradiction.
By applying the result above to the case of $s = s_0$ and $s' = s'_0$, we obtain the following.
Theorem 3.5
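Consider the ACMAS $\mathcal{P}$ and $\mathcal{P}'$ such that $\mathcal{P} \approx \mathcal{P}'$, and an SAFOCTL formula $\varphi$. We have
$\mathcal{P} \models \varphi$ iff $\mathcal{P}' \models \varphi$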
In summary we have proved that bisimilar ACMAS validate the same SAFOCTL formulas. In the next section we use this result to reduce, under additional assumptions, the verification of an infinitestate ACMAS to that of a finitestate one.
3.1 Finite Abstractions of Bisimilar ACMAS
We now define a notion of finite abstraction for ACMAS. We prove that abstractions are bisimilar to the corresponding concrete model. We are particularly interested in finite abstractions, so we operate on a special class of infinite models that we call bounded.
Definition 3.6 (Bounded ACMAS)
An ACMAS $\mathcal{P}$ is $b$-bounded, for $b \in \mathbb{N}$, if for all $s \in S$, $|adom(s)| \leq b$, where $adom(s)$ is the active domain of $s$.
An ACMAS is $b$-bounded if none of its reachable states contains more than $b$ distinct elements. Observe that bounded ACMAS may be defined on infinite domains $U$. Furthermore, note that a $b$-bounded ACMAS may contain infinitely many states, all bounded by $b$. So $b$-bounded systems are infinitestate in general. Notice also that the value $b$ bounds only the number of distinct individuals in a state, not the size of the state itself, i.e., the amount of memory required to accommodate the individuals. Indeed, the infinitely many elements of $U$ need an unbounded number of bits to be represented (e.g., as finite strings), so, even though each state is guaranteed to contain at most $b$ distinct elements, nothing can be said about how large the actual space required by such elements is. On the other hand, it should be clear that memorybounded ACMAS are finitestate (hence $b$-bounded, for some $b$).
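The difference between bounding the active domain and bounding the number of states can be seen on a small example. The sketch below is illustrative and makes its own assumptions: states are finite sets of facts, and the generator `counter_states` together with the relation `Current` are hypothetical. Only a finite prefix of an infinite family of states can be inspected, so the check is a sanity test rather than a proof of boundedness.

```python
from itertools import count, islice


def adom(state):
    """Active domain: all elements occurring in some fact of the state."""
    return {u for _rel, args in state for u in args}


def is_b_bounded(states, b):
    """True iff every given state mentions at most b distinct elements."""
    return all(len(adom(s)) <= b for s in states)


def counter_states():
    """Infinitely many pairwise distinct states, each storing one value."""
    for n in count():
        yield frozenset({("Current", (str(n),))})


prefix = list(islice(counter_states(), 1000))
```

Every state produced by `counter_states` has an active domain of size one, so the family is 1-bounded although it contains infinitely many states, and the stored values need ever longer representations.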
Thus, seen as programs, bounded ACMAS are in general memoryunbounded. Therefore, for the purpose of verification, they cannot be trivially checked by generating all their executions, as standard model checking techniques typically do and as would be possible if they were memorybounded. However, we will show later that any bounded infinitestate ACMAS admits a finite abstraction which can be used to verify it.
We now introduce abstractions in a modular manner by first introducing a set of abstract agents from a concrete ACMAS.
Definition 3.7 (Abstract agent)
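Let $A = \langle D, L, Act, Pr \rangle$ be an agent defined on the interpretation domain $U$. Given a set $U'$ of individuals, we define the abstract agent $A' = \langle D', L', Act', Pr' \rangle$ on $U'$ such that:

$D' = D$;

$L' \subseteq D'(U')$;

$Act' = Act$;

$\alpha(\vec{u}') \in Pr'(l')$ iff there exist $l \in L$ and $\alpha(\vec{u}) \in Pr(l)$ s.t. $l' \simeq l$, for some witness $\iota$, and $\vec{u}' = \iota'(\vec{u})$, for some bijection $\iota'$ extending $\iota$ to $\vec{u}$.
Given a set $Ag$ of agents defined on $U$, let $Ag'$ be the set of the corresponding abstract agents on $U'$.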
We remark that $A'$, as defined in Definition 3.7, is indeed an agent and complies with Definition 2.6. Notice that the protocol of $A'$ is defined on the basis of its corresponding concrete agent $A$ and requires the existence of a bijection between the elements in the local states and the action parameters. Thus, in order for a ground action of $A$ to have a counterpart in $A'$, the last requirement of Definition 3.7 constrains $U'$ to contain a sufficient number of distinct values. As will become apparent later, the size of $U'$ determines how closely an abstract system can simulate its concrete counterpart.
We can now formalise the notion of abstraction that we will use in this section.
Definition 3.8 (Abstraction)
Let be an ACMAS over and the set of agents obtained as in Definition 3.7, for some . The ACMAS defined over is said to be an abstraction of iff:

;

for some iff there exist and , such that , and for some witness , and for some extending .
Notice that abstractions have initial states isomorphic to their concrete counterparts. The condition in Definition 3.8 means that whenever for some witness , , and , then . This constraint means that actions are dataindependent. So, for example, a copy action in the concrete model has a corresponding copy action in the abstract model regardless of the data that are copied. Crucially, this condition requires that the domain contains enough elements to simulate the concrete states and action effects, as the following result makes precise. In what follows we take , i.e., is the sum of the maximum numbers of parameters contained in the action types of each agent in .
Theorem 3.9
Consider a bounded ACMAS over an infinite interpretation domain , an SAFOCTLK formula , and a finite interpretation domain such that and . Any abstraction of is bisimilar to .
Proof. Define a relation as . We show that is a bisimulation such that . Observe first that , so . Next, consider and such that (i.e., ), and assume that , for some . Then, there exists s.t. . We show next that there exists s.t. and . To this end, observe that, since and , we can define an injective function such that . We take ; it remains to prove that . By the condition on the cardinality of we can extend to as well, and set . Then, by the definition of we have that