# Matrix Graph Grammars with Application Conditions

Pedro Pablo Pérez Velasco    Juan de Lara
School of Computer Science, Universidad Autónoma de Madrid
{pedro.perez, jdelara}@uam.es
###### Abstract

In the Matrix approach to graph transformation we represent simple digraphs and rules with Boolean matrices and vectors, and the rewriting is expressed using Boolean operators only. In previous works, we developed analysis techniques enabling the study of the applicability of rule sequences, their independence, state reachability and the minimal graph able to fire a sequence.

In the present paper we improve our framework in two ways. First, we make explicit (in the form of a Boolean matrix) some negative implicit information in rules. This matrix (called nihilation matrix) contains the elements that, if present, forbid the application of the rule (i.e. potential dangling edges, or newly added edges, which cannot be already present in the simple digraph). Second, we introduce a novel notion of application condition, which combines graph diagrams together with monadic second order logic. This allows for more flexibility and expressivity than previous approaches, as well as more concise conditions in certain cases. We demonstrate that these application conditions can be embedded into rules (i.e. in the left hand side and the nihilation matrix), and show that the applicability of a rule with arbitrary application conditions is equivalent to the applicability of a sequence of plain rules without application conditions. Therefore, the analysis of the former is equivalent to the analysis of the latter, showing that in our framework no additional results are needed for the study of application conditions. Moreover, all analysis techniques of [21, 22] for the study of sequences can be applied to application conditions.

XXI (2009)

Keywords: Graph Transformation, Matrix Graph Grammars, Application Conditions, Monadic Second Order Logic, Graph Dynamics.

## 1 Introduction

Graph transformation [8, 32] is becoming increasingly popular as a means to describe system behaviour due to its graphical, declarative and formal nature. For example, it has been used to describe the operational semantics of Domain Specific Visual Languages (DSVLs) [19], taking advantage of the fact that it is possible to use the concrete syntax of the DSVL in the rules, which then become more intuitive to the designer.

The main formalization of graph transformation is the so called algebraic approach [8], which uses category theory to express the rewriting step. Prominent examples of this approach are the double [3, 8] and single [6] pushout (DPO and SPO), for which interesting analysis techniques have been developed, for example to check sequential and parallel independence between pairs of rules [8, 32], or to calculate critical pairs [14, 17].

Frequently, graph transformation rules are equipped with application conditions (ACs) [7, 8, 15], stating extra (i.e. in addition to the left hand side) positive and negative conditions that the host graph should satisfy for the rule to be applicable. The algebraic approach has proposed a kind of ACs with predefined diagrams (i.e. the graphs and morphisms making up the condition) and quantifiers regarding the existence or not of matchings of the different graphs of the constraint in the host graph [7, 8]. Most analysis techniques for plain rules (without ACs) then have to be adapted for rules with ACs (see e.g. [17] for critical pairs with negative ACs). Moreover, different adaptations may be needed for different kinds of ACs. Thus, a uniform approach to analyse rules with arbitrary ACs would be very useful.

In previous works [21, 22, 23, 25], we developed a framework (Matrix Graph Grammars, MGGs) for the transformation of simple digraphs. Simple digraphs and their transformation rules can be represented using Boolean matrices and vectors. Thus, the rewriting can be expressed using Boolean operators only. One important point is that, unlike other approaches, we explicitly represent the rule dynamics (addition and deletion of elements), instead of only the static parts (rule pre- and post-conditions). This fact gives an interesting viewpoint enabling useful analysis techniques, such as checking independence of a sequence of arbitrary length and a permutation of it, or obtaining the smallest graph able to fire a sequence. On the theoretical side, our formalization of graph transformation introduces concepts from many branches of mathematics, like Boolean algebra, group theory, functional analysis, tensor algebra and logics [25]. This wealth of available mathematical results opens the door to new analysis methods not developed so far, like sequential independence and explicit parallelism not limited to pairs of sequences, applicability, congruence and reachability. On the practical side, the implementations of our analysis techniques, being based on Boolean algebra manipulations, are expected to have a good performance.

In this paper we improve the framework, by extending grammar rules with a matrix (the nihilation matrix) that contains the edges that, if present in the host graph, forbid rule application. These are potential dangling edges and newly added ones, which cannot be added twice, since we work with simple digraphs. This matrix, which can be interpreted as a graph, makes explicit some implicit negative information in the rule’s pre-condition. To the best of our knowledge, this idea is not present in any approach to graph transformation.

In addition, we propose a novel approach for graph constraints and ACs, where the diagram and the quantifiers are not fixed. For the quantification, we use a full-fledged formula using monadic second order logic (MSOL) [4]. We show that once the match is considered, a rule with ACs can be transformed into plain rules, by adding the positive information to the left hand side, and the negative in the nihilation matrix. This way, the applicability of a rule with arbitrary ACs is equivalent to the applicability of one of the sequences of plain rules in a set: analysing the latter is equivalent to analysing the former. Thus, in MGGs, there is no need to extend the analysis techniques to special cases of ACs. Although we present the concepts in the MGGs framework, many of these ideas are applicable to other approaches as well.

Paper organization. Section 2 gives an overview of MGGs. Section 3 introduces our graph constraints and ACs. Section 4 shows how ACs can be embedded into rules. Section 5 presents the equivalence between ACs and sequences. Section 6 compares with related work and Section 7 ends with the conclusions. This paper is an extension of [24].

## 2 Matrix Graph Grammars

Simple Digraphs. We work with simple digraphs, which we represent as $G = (M, V)$, where $M$ is a Boolean matrix for edges (the graph adjacency matrix) and $V$ a Boolean vector for vertices or nodes. We use the notation $E_G$ and $V_G$ to denote the set of edges and nodes of $G$, respectively. Note that we explicitly represent the nodes of the graph with a vector. This is necessary because in our approach we add and delete nodes, and thus we mark the existing nodes with a 1 in the corresponding position of the vector. The left of Fig. 1 shows a graph representing a production system made of a machine (controlled by an operator), which consumes and produces pieces through conveyors. Generators create pieces in conveyors. Self-loops in operators and machines indicate that they are busy.

Note that the matrix and the vector in the figure are the smallest ones able to represent the graph. Adding zero elements to the vector (and accordingly zero rows and columns to the matrix) would result in equivalent graphs. The next definition formalizes the representation of simple digraphs.

{definition}

[Simple Digraph Representation] A simple digraph $G$ is represented by $G = (M, V)$, where $M$ is the graph's adjacency matrix and $V$ the Boolean vector of its nodes.

Compatibility. Well-formedness of graphs (i.e., absence of dangling edges) can be checked by verifying the identity $\left\| M \odot \overline{V} \vee M^t \odot \overline{V} \right\|_1 = 0$, where $\odot$ is the Boolean matrix product (like the regular matrix product, but with and and or instead of multiplication and addition), $M^t$ is the transpose of the matrix $M$, $\overline{V}$ is the negation of the nodes vector $V$, and $\|\cdot\|_1$ is an operation (a norm, actually) that results in the or of all the components of the vector. We call this property compatibility [21]. Note that $M \odot \overline{V}$ results in a vector that contains a 1 in position $i$ when there is an outgoing edge from node $i$ to a non-existing node. A similar expression with the transpose of $M$ is used to check for incoming edges. The next definition formally characterizes compatibility.

{definition}

[Compatibility] A simple digraph $G = (M, V)$ is compatible iff $\left\| M \odot \overline{V} \vee M^t \odot \overline{V} \right\|_1 = 0$.
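As a sanity check, the compatibility test amounts to a couple of Boolean matrix operations. The following sketch (helper names are ours, using NumPy; not code from the paper) flags a digraph whose edge touches a non-existing node:

```python
import numpy as np

# A simple digraph is a pair (M, V): Boolean adjacency matrix M and node vector V.
# Compatibility: no edge may touch a non-existing node, i.e.
# norm1( (M ⊙ not V) or (M^t ⊙ not V) ) == 0, with ⊙ the Boolean matrix product.

def bool_prod(M, v):
    """Boolean matrix-vector product: 'and' instead of *, 'or' instead of +."""
    return ((M @ v) > 0).astype(int)

def compatible(M, V):
    not_V = 1 - V
    out_dangling = bool_prod(M, not_V)    # 1 at i: edge from node i to a missing node
    in_dangling = bool_prod(M.T, not_V)   # 1 at i: edge from a missing node to node i
    return not bool((out_dangling | in_dangling).any())

# Edge 0 -> 1: compatible while both nodes exist, incompatible if node 1 is removed.
M = np.array([[0, 1],
              [0, 0]])
print(compatible(M, np.array([1, 1])))   # True
print(compatible(M, np.array([1, 0])))   # False: edge 0 -> 1 dangles
```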

Typing. A type is assigned to each node in $G$ by a function $\lambda$ from the set of nodes to a set of types $T$, $\lambda \colon V_G \to T$. In Fig. 1, types are represented as an extra column in the matrices; the numbers before the colon distinguish elements of the same type. For edges we use the types of their source and target nodes.

{definition}

[Typed Simple Digraph]

A typed simple digraph over a set of types $T$ is made of a simple digraph $G = (M, V)$ and a function $\lambda \colon V_G \to T$ from the set of nodes to the set of types.

Next, we define the notion of partial morphism between typed simple digraphs.

{definition}

[Typed Simple Digraph Morphism]

Given two typed simple digraphs $G_i = (M_i, V_i, \lambda_i)$ for $i \in \{1, 2\}$, a morphism $f = (f^V, f^E) \colon G_1 \to G_2$ is made of two partial injective functions $f^V$ and $f^E$ between the sets of nodes ($V_i$) and edges ($E_i$), s.t. $\forall (n, m) \in \mathrm{Dom}(f^E)$, $f^E(n, m) = (f^V(n), f^V(m))$ and $\lambda_2(f^V(n)) = \lambda_1(n)$; where $\mathrm{Dom}(f)$ is the domain of the partial function $f$.

Productions. A production, or rule, $p \colon L \to R$ is a morphism of typed simple digraphs. Using a static formulation, a rule is represented by two typed simple digraphs that encode the left and right hand sides (LHS and RHS). The matrices and vectors of these graphs are arranged so that the elements identified by morphism $p$ match (this is called completion, see below).

{definition}

[Static Formulation of Production]

A production $p \colon L \to R$ is statically represented as $p = (L^E, R^E; L^V, R^V)$, where superindex $E$ stands for edges and $V$ for vertices.

A production adds and deletes nodes and edges, therefore using a dynamic formulation, we can encode the rule's pre-condition (its LHS) together with matrices and vectors representing the addition and deletion of edges and nodes. We call such matrices and vectors $e$ (for "erase") and $r$ (for "restock").

{definition}

[Dynamic Formulation of Production]

A production $p$ is dynamically represented as $p = (L, e, r)$, where $e = (e^E, e^V)$ are the deletion Boolean matrix and vector, and $r = (r^E, r^V)$ are the addition Boolean matrix and vector (the vector of added nodes also carries the types of the new nodes). They have a 1 in the position where the element is to be deleted or added, respectively.

The output of rule $p$ is calculated by the Boolean formula $R = p(L) = r \vee \overline{e}\,L$, which applies both to nodes and edges (the $\wedge$ (and) symbol is usually omitted in formulae, so $\overline{e}\,L$ stands for $\overline{e} \wedge L$).
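As a sketch of the rewriting formula (with toy matrices of our own, not those of Fig. 2), $R = r \vee \overline{e}\,L$ is a one-liner on Boolean arrays:

```python
import numpy as np

def apply_rule(L, e, r):
    """R = r OR (NOT e AND L), componentwise; works for edge matrices and node vectors."""
    return r | ((1 - e) & L)

# Toy rule: delete edge 0 -> 1 and add edge 1 -> 0.
L = np.array([[0, 1], [0, 0]])  # LHS edges
e = np.array([[0, 1], [0, 0]])  # deleted edges
r = np.array([[0, 0], [1, 0]])  # added edges
R = apply_rule(L, e, r)         # -> [[0, 0], [1, 0]]
```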

Example. Fig. 2 shows a rule and its associated matrices. The rule models the consumption of a piece by a machine. Compatibility of the resulting graph must be ensured, thus the rule cannot be applied if the machine is already busy, as it would end up with two self-loops, which is not allowed in a simple digraph. This restriction of simple digraphs can be useful in this kind of situations, and acts like a built-in negative AC. Later we will see that the nihilation matrix takes care of this restriction.

Completion. In order to operate with the matrix representation of graphs of different sizes, an operation called completion adds extra rows and columns with zeros to matrices and vectors, and rearranges rows and columns so that the identified edges and nodes of the two graphs match. For example, in Fig. 2, if we need to operate $L^E$ and $R^E$, completion adds a fourth 0-row and a fourth 0-column to $R^E$.

Stated another way, whenever we have to operate graphs $G_1$ and $G_2$, a morphism (i.e. a partial function) $f \colon G_1 \to G_2$ has to be defined. Completion rearranges the matrices and vectors of both graphs so that the elements related by $f$ end up in the same row and column of the matrices; thus, after the completion, $f$ restricted to its domain is the identity. In the examples, we omit such operation, assuming that matrices are completed when necessary. Later we will operate with the matrices of different productions, thus we have to select the elements (nodes and edges) of each rule that get identified to the same element in the host graph. That is, one has to establish morphisms between the LHS and RHS of the different rules, and completion rearranges the matrices according to the morphisms. Note that there may be different ways to complete two matrices, by choosing different orderings for their rows and columns. This is because a simple digraph can be represented by many adjacency matrices, which differ in the order of rows and columns. In any case, the graphs represented by the matrices are the same.
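Completion can be sketched as zero-padding plus a simultaneous permutation of rows and columns; the mapping below is a hypothetical example of ours, not taken from the paper:

```python
import numpy as np

def complete(M, positions, size):
    """Embed adjacency matrix M into a size x size zero matrix, placing row/column i
    of M at index positions[i]; unidentified positions stay at zero."""
    C = np.zeros((size, size), dtype=int)
    idx = np.asarray(positions)
    C[np.ix_(idx, idx)] = M       # simultaneous row/column permutation
    return C

# A 2-node graph with edge 0 -> 1, completed into a 3-node context where its
# nodes are identified with nodes 2 and 0 of the larger graph.
M = np.array([[0, 1], [0, 0]])
C = complete(M, [2, 0], 3)        # the single edge now sits at position (2, 0)
```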

Nihilation Matrix. In order to consider the elements in the host graph that disable a rule application, we extend the notation for rules with a new graph $N$. Its associated matrix $N^E$ specifies the two kinds of forbidden edges: those incident to nodes which are going to be erased, and any edge added by the rule (which cannot be added twice, since we are dealing with simple digraphs). Notice however that $N^E$ considers only potential dangling edges with source and target in the nodes belonging to the rule's LHS.

{definition}

[Nihilation Matrix]

Given the production $p \colon L \to R$, its nihilation matrix $N^E$ contains non-zero elements in positions corresponding to newly added edges, and to non-deleted edges adjacent to deleted nodes.

We extend the rule formulation with this nihilation matrix. The concept of rule remains unaltered, because we are just making explicit some implicit information. Matrices are derived in the following order: $(L, R) \to (e, r) \to N$. Thus, a rule is statically determined by its LHS $L$ and RHS $R$, from which it is possible to give a dynamic definition $(L, e, r)$, with $e = L\,\overline{R}$ and $r = \overline{L}\,R$, to end up with a full specification including its environmental behaviour, $(L, N, e, r)$. No extra effort is needed from the grammar designer, because $N^E$ can be automatically calculated as the image by rule $p$ of a certain matrix (see Proposition 2).

{definition}

[Full Dynamic Formulation of Production]

A production $p$ is dynamically represented as $p = (L, N, e, r)$, where $N$ is the nihilation matrix, $e = (e^E, e^V)$ are the deletion Boolean matrix and vector, and $r = (r^E, r^V)$ are the addition Boolean matrix and vector.

The next proposition shows how to calculate the nihilation matrix, by applying the production $p$ to a certain matrix.

{proposition}

[Nihilation matrix]

The nihilation matrix of a given production $p$ is calculated as $N^E = p\left(\overline{D}\right)$, with $\overline{D} = \overline{\,\overline{e^V} \otimes \overline{(e^V)^t}\,}$.¹

¹ Symbol $\otimes$ denotes the tensor product, which sums up the covariant and contravariant parts and multiplies every element of the first vector by the whole second vector.

Proof. Matrix $\overline{D}$ specifies the potential dangling edges incident to nodes in $p$'s LHS:

$$\overline{D} = \left( d_{ij} \right), \qquad d_{ij} = \begin{cases} 1 & \text{if } (e^V)_i = 1 \text{ or } (e^V)_j = 1 \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

Note that $\overline{D} = \overline{\,\overline{e^V} \otimes \overline{(e^V)^t}\,}$ satisfies (1). Every edge incident to a node that is deleted becomes dangling, except those explicitly deleted by the production: $\overline{e}\,\overline{D}$. In addition, edges added by the rule cannot be present in the host graph, so $N^E = p\left(\overline{D}\right) = r \vee \overline{e}\,\overline{D}$. $\square$

Example. The nihilation matrix for the example rule of Fig. 2 is calculated as follows:

$$\overline{D} = \overline{\,\overline{e^V} \otimes \overline{(e^V)^t}\,} = \overline{\begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix} \otimes \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix}^t} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$$

The nihilation matrix is then given by $N^E = r \vee \overline{e}\,\overline{D}$:

$$N^E = r \vee \overline{e}\,\overline{D} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \vee \overline{\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}} \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}$$

The matrix $N^E$ indicates any dangling edge from the deleted piece (the edge to the conveyor is not signaled, as it is explicitly deleted), as well as the self-loops in the machine and in the operator.
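The computation above can be replayed mechanically. The sketch below (NumPy; variable names and node ordering are our reading of the displayed matrices) rebuilds $\overline{D}$ and $N^E$ from the deletion and addition matrices of the example:

```python
import numpy as np

eV = np.array([0, 0, 0, 1])                # only the piece (last node) is deleted
e = np.zeros((4, 4), dtype=int)
e[3, 0] = 1                                 # explicitly deleted edge piece -> conveyor
r = np.zeros((4, 4), dtype=int)
r[1, 1] = r[2, 2] = 1                       # added self-loops (busy machine/operator)

# Potential dangling edges: D-bar = NOT( NOT(eV) ⊗ NOT(eV)^t )
D_bar = 1 - np.outer(1 - eV, 1 - eV)

# Nihilation matrix: N = r OR (NOT e AND D-bar)
N = r | ((1 - e) & D_bar)
print(N)
```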

Matrix $N^E$ can be extended to a simple digraph $N$ by taking the nodes in the LHS: $N = (N^E, L^V)$. Note that it defines a simple digraph, as one basically needs to add the source and target nodes of the edges in $N^E$, which are a subset of the nodes in $L$, because for the calculation of $\overline{D}$ we have used the edges stemming from the nodes in $L$. Fig. 3 shows the graph representation for the nihilation matrix of the previous example. The nihilation matrix should not be confused with the notion of Negative Application Condition (NAC) [8], which is an additional graph specified by the designer (i.e. not derived from the rule) containing extra negative conditions.

The evolution of the rule's LHS (i.e. how it is transformed into the RHS) is given by the production itself ($R = p(L)$). It is interesting to analyse the behaviour of the nihilation matrix, which is given by the next proposition.

{proposition}

[Evolution of the Nihilation Matrix]

Let $p$ be a compatible production with nihilation matrix $N^E$. Then, the elements that must not appear once the production is applied are given by $p^{-1}\left(N^E\right) = e \vee \overline{D}$, where $p^{-1}$ is the inverse of $p$ (the production that adds what $p$ deletes and vice versa, obtained by swapping $e$ and $r$).

Proof. The elements that should not appear in the RHS are the potential dangling edges and those deleted by the production: $e \vee \overline{D}$. This coincides with $p^{-1}\left(N^E\right)$, as shown by the following set of identities:

$$p^{-1}\left(N^E\right) = e \vee \overline{r}\,N^E = e \vee \overline{r}\left(r \vee \overline{e}\,\overline{D}\right) = e \vee \overline{e}\,\overline{r}\,\overline{D} = e \vee \overline{r}\,\overline{D} = e \vee \overline{D} \qquad (2)$$

In the last equality of (2) compatibility has been used: $r\,\overline{D} = 0$ (the production adds no potential dangling edge), so $\overline{r}\,\overline{D} = \overline{D}$. $\square$

Remark. Though strange at first glance, a dual behaviour of the negative part of a production with respect to the positive part should be expected. The fact that $N^E$ uses $p^{-1}$ rather than $p$ for its evolution is quite natural. When a production erases one element, it asks its LHS to include it, so it demands its presence. The opposite happens when $p$ adds some element. For $N^E$, things happen in the opposite direction: if the production asks for the addition of some element, then the size of $N^E$ (its number of edges) is increased, while if some element is deleted, $N^E$ shrinks.

Example. Fig. 4 shows the calculation of $p^{-1}\left(N^E\right)$ using the graph representation of the matrices in equation (2).
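The identity of the proposition can also be checked numerically on the matrices of the running example; $p^{-1}$ simply swaps the roles of $e$ and $r$ (a sketch with our own helper names):

```python
import numpy as np

# Same example matrices as before (our reading of the displayed calculation).
eV = np.array([0, 0, 0, 1])
e = np.zeros((4, 4), dtype=int); e[3, 0] = 1
r = np.zeros((4, 4), dtype=int); r[1, 1] = r[2, 2] = 1
D_bar = 1 - np.outer(1 - eV, 1 - eV)
N = r | ((1 - e) & D_bar)                  # nihilation matrix

# Inverse production swaps deletion and addition: p^{-1}(X) = e OR (NOT r AND X)
evolved = e | ((1 - r) & N)

# Equation (2): p^{-1}(N^E) = e OR D-bar
assert (evolved == (e | D_bar)).all()
```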

Next definition introduces a functional notation for rules (already used in [22]), inspired by the Dirac or bra-ket notation [2]. This notation will be useful for reasoning and proving the propositions in Section 5.

{definition}

[Functional Formulation of Production]

A production $p$ can be depicted as $R = \langle L, p \rangle$, splitting the static part (initial state, $L$) from the dynamics (element addition and deletion, $p$).

Using such formulation, the ket operators (i.e. those to the right side of the bra-ket) can be moved to the bra (i.e. left hand side) by using their adjoints (which are usually decorated with an asterisk). We make use of this notation in Section 5.

Match and Derivations. Matching is the operation of identifying the LHS of a rule inside a host graph (we consider only injective matches). Given rule $p \colon L \to R$ and a simple digraph $G$, any total injective morphism $m \colon L \to G$ is a match for $p$ in $G$; thus, it is one of the ways of completing $L$ in $G$. The following definition considers not only the elements that should be present in the host graph (those in $L$) but also those that should not (those in the nihilation matrix, $N^E$).

{definition}

[Direct Derivation]

Given rule $p \colon L \to R$ and graph $G$ as in Fig. 5(a), $d = (p, m)$ – with $m = (m_L, m_N)$ – is called a direct derivation with result $H$ if the following conditions are satisfied:

1. There exist total injective morphisms $m_L \colon L \to G$ and $m_N \colon N \to \overline{G}$.

2. $m_L(n) = m_N(n)$, for every node $n$ in $L^V$.

3. The match $m_L$ induces a completion of $L$ in $G$. Matrices $e$ and $r$ are then completed in the same way to yield $e^*$ and $r^*$. The output graph is calculated as $H = p^*(G) = r^* \vee \overline{e^*}\,G$.

Remark. Item 2 is needed to ensure that $L$ and $N$ are matched to the same nodes in $G$.
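Assuming all matrices have already been completed to the size of $G$ at a fixed match, the conditions of the definition reduce to elementwise Boolean checks. A minimal sketch (toy matrices and helper names of ours):

```python
import numpy as np

def direct_derivation(G, L, N, e, r):
    """Check L ⊆ G and N ⊆ not(G); if both hold, return H = r OR (NOT e AND G)."""
    if (L & (1 - G)).any():        # some element of L is missing from G
        return None
    if (N & G).any():              # some forbidden element of N is present in G
        return None
    return r | ((1 - e) & G)

G = np.array([[0, 1], [0, 0]])
L = np.array([[0, 1], [0, 0]])
N = np.array([[1, 0], [0, 0]])           # forbidden: self-loop on node 0
e, r = L, np.array([[0, 0], [1, 0]])     # toy rule: reverse the edge
H = direct_derivation(G, L, N, e, r)     # -> [[0, 0], [1, 0]]
```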

Example. Fig. 5(b) shows the application of rule startProcess to graph $G$. We have also depicted the inclusion of $N$ in $\overline{G}$ (bidirectional arrows have been used for simplification). $\overline{G^E}$ is the complement (negation) of matrix $G^E$.

It is useful to consider the structure defined by the negation of the host graph, $\overline{G}$, which is made up of the negated adjacency matrix $\overline{G^E}$ and the vector of nodes $G^V$. Note that the negation of a graph is not a graph, because in general compatibility fails; that is why the term "structure" is used.

The complement of a graph coincides with the negation of its adjacency matrix, but while negation is just the logical operation, taking the complement means that a completion operation has been performed before. Hence, taking the complement of a matrix is its negation with respect to some appropriate completion. That is, the complement of graph $G_1$ with respect to graph $G_2$, through a morphism $f \colon G_1 \to G_2$, is a two-step operation: (i) complete $G_1$ and $G_2$ according to $f$; (ii) negate $G_1$. As long as no confusion arises, negation and complements will not be syntactically distinguished.

Examples. Suppose we have two graphs $G$ and $N$ as those depicted in Fig. 6, and that we want to check that $N$ is not in $G$. Note that $N$ is not contained in the plain negation of $G$ (an operator node does not even appear), but it does appear in the negation of the completion of $G$ with respect to $N$ (graph $\overline{G}$ in the same figure).

In the context of Fig. 5(b), we see that there is an inclusion $N \subseteq \overline{G}$ (i.e. the forbidden elements after applying the production are not in $G$). This is so because we complete $\overline{G}$ with an additional piece (which was deleted from $G$). Note also that in Definition 2 we have to complete $e$ and $r$ (step 3). As an occurrence of $L$ has to be found in $G$, all nodes of $N$ have to be present in $G$, and thus $\overline{G}$ is big enough to be able to find an inclusion of $N$.

When applying a rule, dangling edges can occur. This is possible because the nihilation matrix only considers dangling edges incident to nodes appearing in the rule's LHS; however, a dangling edge can also occur between a node deleted by the rule and a node not considered by the rule's LHS. In MGG, we propose an SPO-like behaviour [21], where the dangling edges are deleted. Thus, if rule $p$ produces dangling edges (a fact that is partially signaled by $N$), it is enlarged to explicitly consider the dangling edges in the LHS. This is equivalent to adding a pre-production (called $\epsilon$-production) to be applied before the original rule [22]. Thus, rule $p$ is transformed into the sequence $p; p_\epsilon$ (applied from right to left), where $p_\epsilon$ deletes the dangling edges and $p$ is applied as it is. In order to ensure that both productions are applied to the same elements (matches are non-deterministic), we defined a marking operator which modifies the rules: the resulting $\epsilon$-production, in addition, adds a special node connected to the elements to be marked, and the resulting rule considers the special node in the LHS and then deletes it. This is a technique to control rule application by passing the match from one rule to the next.

Analysis Techniques. In [21, 22, 23, 25] we developed several analysis techniques for MGGs; we briefly give an intuition of those that will be used in Section 5.2.

One of the goals of our previous work was to analyse rule sequences independently of a host graph. We represent a rule sequence as $s = p_n; p_{n-1}; \ldots; p_1$, where application is from right to left (i.e. $p_1$ is applied first). For its analysis, we complete the sequence, by identifying the nodes across rules which are assumed to be mapped to the same node in the host graph.

Once the sequence is completed, our notion of sequence coherence [21, 25, 26] permits knowing if, for the given identification, the sequence is potentially applicable (i.e. if no rule disturbs the application of those following it). The formula for coherence results in a matrix and a vector (which can be interpreted as a graph) with the problematic elements: if the sequence is coherent, both are zero; if not, they contain the problematic elements. A coherent sequence is compatible if its application produces a simple digraph, that is, if no dangling edges are produced in intermediate steps.

Given a completed sequence, the minimal initial digraph (MID) is the smallest graph that permits applying such sequence. Conversely, the negative initial digraph (NID) contains all elements that should not be present in the host graph for the sequence to be applicable. In this way, the NID is a graph that should be found in $\overline{G}$ for the sequence to be applicable (i.e. none of its edges can be found in $G$). If the sequence is not completed (i.e. no overlapping of rules is decided), we can also give the set of all graphs able to fire such sequence or spoil its application. We call them initial digraph set and negative digraph set, respectively. See section 6 in [26] or sections 4.4 and 5.3 in [25].

Other concepts we developed aim at checking sequential independence (i.e. same result) between a sequence and a permutation of it. G-congruence detects if two sequences (one a permutation of the other) have the same MID and NID. It returns two matrices and two vectors, representing two graphs, which are the differences between the MIDs and NIDs of each sequence, respectively; thus, if they are zero, the sequences have the same MID and NID. Two coherent and compatible completed sequences that are G-congruent are sequentially independent. See section 7 in [26] or section 6.1 in [25].

## 3 Graph Constraints and Application Conditions

In this section, we present our concepts of graph constraints (GCs) and application conditions (ACs). A GC is defined as a diagram plus an MSOL formula. The diagram is made of a set of graphs and morphisms (partial injective functions) which specify the relationship between elements of the graphs. The formula specifies the conditions to be satisfied in order to make a host graph $G$ satisfy the GC (i.e. we check whether $G$ is a model for the diagram and the formula). The domain of discourse of the formulae are simple digraphs, and the diagram is a means to represent the interpretation function I.²

² Recall that, in essence, the domain of discourse is a set of individual elements which can be quantified over. The interpretation function assigns meanings (semantics) to symbols [5].

GC formulae are made of expressions about graph inclusions. For this purpose, we introduce the following two predicates:

$$P(X_1, X_2) = \forall m \left[ F(m, X_1) \Rightarrow F(m, X_2) \right] \qquad (3)$$
$$Q(X_1, X_2) = \exists e \left[ F(e, X_1) \wedge F(e, X_2) \right] \qquad (4)$$

where predicate $F(x, X)$ states that element $x$ (a node or an edge) is in graph $X$. In this way, predicate $P(X_1, X_2)$ means that graph $X_1$ is included in $X_2$. Note that $m$ ranges over all nodes and edges (edges are defined by their initial and final node) of $X_1$, thus ensuring the containment of $X_1$ in $X_2$ (i.e. preserving the graph structure). Predicate $Q(X_1, X_2)$ asserts that there is a partial morphism between $X_1$ and $X_2$ which is defined on at least one edge; that is, $X_1$ and $X_2$ share an edge. In this case, $e$ ranges over all edges.
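Once the graphs are completed to a common set of rows and columns, both predicates reduce to elementwise checks on adjacency matrices. A small sketch (function names mirror the predicates; the matrices are hypothetical):

```python
import numpy as np

def P(X1, X2):
    """Graph inclusion: every element (edge, here) of X1 is also in X2."""
    return not bool((X1 & (1 - X2)).any())

def Q(X1, X2):
    """X1 and X2 share at least one edge (a partial morphism defined on >= 1 edge)."""
    return bool((X1 & X2).any())

A = np.array([[0, 1], [0, 0]])
B = np.array([[0, 1], [1, 0]])
print(P(A, B), P(B, A), Q(A, B))   # True False True
```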

Predicates decorated with superindices $E$ or $V$ refer to Edges or Vertices. Thus, $P^V(X_1, X_2)$ says that every vertex in graph $X_1$ should also be present in $X_2$. Actually, $P(X_1, X_2)$ is in fact a shortcut stating that all vertices in $X_1$ should be found in $X_2$ ($P^V(X_1, X_2)$), all edges in $X_1$ should be found in $X_2$ ($P^E(X_1, X_2)$) and, in addition, the set of nodes found should correspond to the source and target nodes of the edges.

Predicate $P$ asks for an inclusion morphism $h \colon X_1 \to X_2$. The diagram of the constraint may already include such a morphism (i.e. the diagram can be seen as a set of restrictions imposed on the interpretation function I), and we can either permit extensions of $h$ (i.e. the model – host graph – may relate more elements of $X_1$ and $X_2$) or keep it as defined in the diagram. In this latter case, the host graph should identify exactly the specified elements in $X_1$ and $X_2$ and keep different the elements not related by $h$. This is represented using predicate $P_U$, which can be expressed using $P^E$:

$$P^E_U(X_1, X_2) = \forall a \left[ \neg \left( F(a, D) \oplus F(a, coD) \right) \right] = P^E(D, coD) \wedge P^E\left(D^C, coD^C\right) \qquad (5)$$

where $D = \mathrm{Dom}(h) \subseteq X_1$, $coD = h(D) \subseteq X_2$, superindex $C$ stands for the complement (i.e. $D^C$ is the complement of $D$ w.r.t. $X_1$) and $\oplus$ is the xor operation. A similar reasoning applies to nodes.

The notation (syntax) will be simplified by making the host graph $G$ the default second argument for predicates $P$ and $Q$, writing $P(X)$ for $P(X, G)$. Besides, it will be assumed that by default total morphisms are demanded: unless otherwise stated, predicate $P$ is assumed.

Example. Before starting with formal definitions, we give an intuition of GCs. The following GC is satisfied if for every occurrence of $A_1$ in $G$ it is possible to find a related occurrence of $A_2$ in $G$: $\forall A_1 \exists A_2 \left[ P(A_1) \Rightarrow P(A_2) \right]$, equivalent by definition to $\forall A_1 \exists A_2 \left[ \overline{P}(A_1) \vee P(A_2) \right]$. Nodes and edges in $A_1$ and $A_2$ are related through the diagram shown in Fig. 7, which relates elements with the same number and type. As a notational convenience, to enhance readability, each graph in the diagram has been marked with the quantifier given in the formula. If a total match is sought, no additional inscription is presented, but if a partial match is demanded the graph is additionally marked with a $Q$. Similarly, if a total match is forbidden by the formula, the graph is marked with $\overline{P}$. This convention will be used in most examples throughout the paper. The GC in Fig. 7 expresses that each machine should have an output conveyor.

Note the identity $\forall x \left[ P(x) \right] = \neg \exists x \left[ \neg P(x) \right]$, which we use throughout the paper. We take the convention that negations in abbreviations apply to the predicate (e.g., $\overline{P}(A_1) = \neg P(A_1)$) and not to the graph's adjacency matrix.

A bit more formally, the syntax of well-formed formulas is inductively defined as in monadic second-order logic, which is first-order logic plus variables for subsets of the domain of discourse. Throughout this paper, formulas will normally have one variable term, which represents the host graph. Usually, the rest of the terms will be given (they will be constant terms). Predicates will consist of $P$ and $Q$ and combinations of them through negation and binary connectives. The next definition formally presents the notion of diagram.

{definition}

[Diagram] A diagram $\mathfrak{d}$ is a set of simple digraphs $\{A_i\}_{i \in I}$ together with a set of partial injective morphisms $\{d_j\}_{j \in J}$, with $d_j \colon A_{i_1} \to A_{i_2}$. Diagram $\mathfrak{d}$ is well defined if every cycle of morphisms commutes.

The formulae in the constraints use variables in the set $\{A_i\}_{i \in I}$, and predicates $P$ and $Q$. Formulae are restricted to have no free variables except for the default second argument of predicates $P$ and $Q$, which is the host graph in which we evaluate the GC. The next definition presents the notion of GC.

{definition}

[Graph Constraint] $GC = (\mathfrak{d}, \mathfrak{f})$ is a graph constraint, where $\mathfrak{d}$ is a well defined diagram and $\mathfrak{f}$ a sentence with variables in $\{A_i\}_{i \in I}$. A constraint is called basic if $\mathfrak{f} = \forall A_1 \left[ P(A_1, G) \right]$ or $\mathfrak{f} = \exists A_1 \left[ P(A_1, G) \right]$ (with one bound variable and one free variable) and $\mathfrak{d} = \{A_1\}$.

In general, there will be an outstanding variable among the $A_i$ representing the host graph, being the only free variable in $\mathfrak{f}$. In previous paragraphs it has been denoted by $G$, the default second argument for predicates $P$ and $Q$. We sometimes speak of a "GC defined over $G$". A basic GC is one made of just one graph and no morphisms in the diagram (recall that the host graph is not represented by default in the diagram nor included in the formulas).

Next, we define an AC as a GC where exactly one of the graphs in the diagram is the rule’s LHS (existentially quantified over the host graph) and another one is the graph induced by the nihilation matrix (existentially quantified over the negation of the host graph).

{definition}

[Application Condition] Given rule $p \colon L \to R$ with nihilation matrix $N^E$, an AC (over the free variable $G$) is a GC satisfying:

1. There exist $A_i, A_j$ in the diagram such that $A_i = L$ and $A_j = N$.

2. The formula $\mathfrak{f}$ is such that $G$ is its only free variable.

3. $\mathfrak{f}$ must demand the existence of $L$ in $G$ and the existence of $N$ in $\overline{G}$.

The simple graph $G$ can be thought of as a host graph to which some grammar rules are to be applied. For simplicity, we usually do not explicitly show condition 3 in the formulae of ACs, nor the nihilation matrix in the diagram. However, if omitted, both $L$ and $N$ are existentially quantified before any other graph of the AC. Thus, an AC has the form $\exists L \exists N \left[ P(L, G) \wedge P(N, \overline{G}) \wedge \mathfrak{f}' \right]$, with $\mathfrak{f}'$ the rest of the condition. Note the similarities between Def. 3 and that of derivation in Def. 2.

Actually, we can interpret the rule's LHS and its nihilation matrix as the minimal AC a rule can have; hence, any well defined production has a natural associated AC. Note also that, in addition to the AC diagram, the structure of the rule itself imposes a relation between $L$ and $N$ (and between their images through the production). For technical reasons, related to converting pre- into post-conditions and vice versa, we assume that morphisms in the diagram do not have codomain $L$ or $N$. This is easily solved, as we may always use their inverses due to the morphisms' injectiveness.

Semantics of Quantification. In GCs or ACs, graphs are quantified either existentially or universally. We now give the intuition of the semantics of such quantification applied to basic formulae. Thus, we consider the four basic cases: (i) $\exists A \left[ P(A) \right]$, (ii) $\forall A \left[ P(A) \right]$, (iii) $\forall A \left[ \neg P(A) \right]$, (iv) $\exists A \left[ \neg P(A) \right]$.

Case (i) states that $G$ should include graph $A$. For example, in Fig. 8, the GC $\exists A \left[ P(A) \right]$ demands an occurrence of $A$ in $G$ (which exists).

Case (ii) demands that, for all potential occurrences of in , the shape of graph is actually found. The term potential occurrences means all distinct maximal partial matches (which are total on nodes) of in ; a match is partial if it does not identify all nodes or edges of the source graph, and the domain of a partial match must be a graph. A non-empty partial match in is maximal if it is not strictly included in another partial or total match. For example, consider the GC in the context of Fig. 8. There are two possible instantiations of (as there are two machines and one operator), and these are the two input elements to the formula. As only one of them satisfies – the expanded form of – the GC is not satisfied by .

Case (iii) demands that, for all potential occurrences of , none of them should have the shape of . The term potential occurrence has the same meaning as in case (ii). In Fig. 8, there are two potential instantiations of the GC . As one of them actually satisfies , the formula is not satisfied by .

Finally, case (iv) is equivalent to , where by definition . This GC states that, among all possible instantiations of , at least one must not have the shape of . This means that a non-empty partial morphism should be found. The GC in Fig. 8 is satisfied by because, again, there are two possible instantiations, and one of them actually does not have an edge between the operator and the machine.
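The four basic cases above can be sketched in Python for tiny untyped digraphs. As a simplification of ours (not the paper's definitions), we take a potential occurrence to be an injective map of the constraint graph's nodes into the host's nodes (total on nodes), and we gloss over the non-emptiness requirement on partial morphisms.

```python
from itertools import permutations

# A digraph is a pair (number_of_nodes, set_of_edges).
# A potential occurrence of A in G is an injective node map; it is a
# total morphism when every edge of A is preserved.

def node_maps(A, G):
    return permutations(range(G[0]), A[0])

def is_total(m, A, G):
    return all((m[s], m[t]) in G[1] for (s, t) in A[1])

def exists_total(A, G):          # case (i): some occurrence is total
    return any(is_total(m, A, G) for m in node_maps(A, G))

def forall_total(A, G):          # case (ii): every occurrence is total
    return all(is_total(m, A, G) for m in node_maps(A, G))

def none_total(A, G):            # case (iii): no occurrence is total
    return not exists_total(A, G)

def some_partial(A, G):          # case (iv): negation of case (ii)
    return not forall_total(A, G)

A = (2, {(0, 1)})   # constraint graph: a single edge
G = (3, {(0, 1)})   # host: three nodes, one edge
print(exists_total(A, G), forall_total(A, G))   # True False
```

Here case (i) holds (the edge is found once), but case (ii) fails: most injective node maps do not preserve the edge, so case (iv) holds as well.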

Next definition formalizes the previous intuition, where we use the following notation:

• is a maximal non-empty partial morphism s.t.

• is a total morphism

• is an isomorphism

where are the nodes of the graph in the domain of . Thus, denotes the set of all potential occurrences of a given constraint graph in , where we require all nodes in to be present in the domain of . Note that each may be empty in edges.

{definition}

[Basic Constraint Satisfaction]

The host graph satisfies , written (the notation is explained in more detail after Def. 3), iff .
The host graph satisfies , written , iff .

The diagrams associated to the formulas in previous definition have been omitted for simplicity as they consist of a single element: . Recall that by default predicate is assumed as well as as second argument, e.g. the first formula in previous definition is actually . Note also that only these two cases are needed, as one has and .

Thus, this is a standard interpretation of MSOL formulae, save for the domain of discourse (graphs) and therefore the elements of quantification (maximal non-empty partial morphisms). Taking this fact into account, we next define when a graph satisfies an arbitrary . This definition also applies to ACs.

{definition}

[Graph Constraint Satisfaction] We say that satisfies the graph constraint under the interpretation function , written , if is a model for that satisfies the element relations specified by the diagram (as any mapping, assigns elements in the domain to elements in the codomain; elements so related should be mapped to the same element. For example, let and with and . Further, assume ; then .), and the following interpretation for the predicates in :

1. total injective morphism.

2. partial injective morphism, non-empty in edges.

where with (it can be the case that ) and . The interpretation of quantification is as in Def. 3 but setting and instead of and , respectively.

The notation deserves the following comments:

1. The notation means that the formula is satisfied under interpretation given by , assignments given by morphisms specified in and substituting the variables in with the graphs in .

2. As commented after Def. 3, in many cases the formula will have a single variable (the one representing the host graph ) and always the interpretation function will be that given in Def. 3. We may thus write which is the notation that appears in Def. 3. The notation may also be used.

3. Similarly, as an AC is just a GC where , and are present, we may write . For practical purposes, we are interested in testing whether, given a host graph , a certain match satisfies the AC. In this case we write . In this way, the satisfaction of an AC by a match and a host graph is like the satisfaction of a GC by a graph , where a morphism is already specified in the diagram of the GC.

Remark. For technical reasons, we require all graphs in the GC for which a partial morphism is demanded to be found in the host graph to have at least one edge and be connected. That is why has to be non-empty in edges.

Examples. Fig. 9 shows rule contract, with an AC given by the diagram in the figure (where morphisms identify elements with the same type and number; this convention is followed throughout the paper), together with formula . The rule creates a new operator and assigns it to a machine. The rule can be applied if there is a match of the LHS (a machine is found), the machine is not busy (), and all operators are busy (). Graph to the right satisfies the AC, with the match that identifies the machine in the LHS with the machine in with the same number.

Using the terminology of ACs in the algebraic approach [8], is a negative application condition (NAC). On the other hand, there is no equivalent to in the algebraic approach, though in this case it could be emulated by a diagram made of two graphs stating that if an operator exists then it does not have a self-loop. However, this is not possible in all cases, as the next example shows.

Fig. 10 shows rule move, which has an AC with formula: . As previously stated, in this example and the following ones, the rule’s LHS and the nihilation matrix are omitted in the AC’s formula. The example AC checks whether all conveyors connected to conveyor 1 in the LHS reach a common target conveyor in one step. We can use “global” information, as graph has to be found in and then all output conveyors are checked to be connected to it ( is existentially quantified in the formula before the universal). Note that we first obtain all possible conveyors (). As the identifications of the morphism have to be preserved, we consider only those potential instances of with equal to in . From these, we take those that are connected (), and which therefore have to be connected with the conveyor identified by the LHS. Graph satisfies the AC, while graph does not, as the target conveyor connected to is not the same as the one connected to and . To the best of our knowledge, it is not possible to express this condition using the standard ACs of the DPO approach given in [8].

## 4 Embedding Application Conditions into Rules

In this section, the goal is to embed arbitrary ACs into rules by including the positive and negative conditions in and , respectively. It is necessary to check that direct derivations can be the codomain of the interpretation function; that is, intuitively, we want to assert whether “MGG + AC = MGG” and “MGG + GC = MGG”.

As stated in the previous section, in direct derivations the matching corresponds to formula , but additional ACs may represent much more general properties, due to universal quantifiers and partial morphisms. Normally, plain rules (without ACs) in the different approaches to graph transformation do not care about elements that cannot be present. If so, a match is just . Thus, we seek a means to translate universal quantifiers and partial morphisms into existential quantifiers and total morphisms.

For this purpose, we introduce two operations on basic diagrams: closure (), dealing with universal quantifiers only, and decomposition (), for partial morphisms only (i.e. with the predicate).

The closure operator converts a universal quantification into a number of existentials, as many as there are maximal partial matches in the host graph (see definition 3). Thus, given a host graph , demanding the universal appearance of graph in is equivalent to asking for the existence of as many replicas of as there are partial matches of in .

{definition}

[Closure]

Given with diagram , ground formula and a host graph , the result of applying to is calculated as follows:

$$\begin{aligned} d \;&\longmapsto\; d' = \big(\{A_1,\ldots,A_n\},\; d_{ij}\colon A_i \to A_j\big)\\ f \;&\longmapsto\; f' = \exists A_1 \ldots \exists A_n \Big[\, \bigwedge_{i=1}^{n} A_i \;\wedge\; \bigwedge_{\substack{i,j=1\\ j>i}}^{n} PU(A_i, A_j) \Big] \end{aligned} \qquad (6)$$

with , , and .

Remark. Closure creates a morphism between each pair of different and (both isomorphic to ), but morphisms are not needed in both directions (i.e. is not needed). The condition that morphism must not be an isomorphism means that at least one element of and has to be identified in different places of . This is accomplished by means of predicate (see its definition in equation 5), which ensures that the elements not related by are not related in .

The interpretation of the closure operator is that demanding the universal appearance of a graph is equivalent to the existence of all of its potential instances (i.e. those elements in ) in the specified digraph (, or some other). Some nodes can be the same for different identifications (), so the procedure does not take into account morphisms that identify every single node, . Therefore, each contains the image of a potential match of in (there are possible occurrences of in ) and identifies elements considered equal.

Example. Assume the diagram to the left of Fig. 11, made of just graph , together with formula , and graph , where such GC is to be evaluated. The GC asks for the existence of all potential connections between each generator and each conveyor. Performing closure we obtain , where diagram is shown to the right of Fig. 11, and each identifies elements with the same number and type. The closure operator makes explicit that three potential occurrences must be found (as ), thus, taking information from the graph where the GC is evaluated and placing it in the GC itself.
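The counting step behind closure can be sketched in Python. We use untyped digraphs and take potential occurrences to be injective node maps, so the replica count below differs from the typed example of Fig. 11; the encoding and names are ours, not the paper's.

```python
from itertools import permutations

# Closure replaces a universally quantified graph A by n existential
# replicas A_1,...,A_n, one per potential occurrence of A in the host,
# i.e. (in this untyped simplification) one per injective map of A's
# nodes into the host's nodes.

def potential_occurrences(A_nodes, G_nodes):
    return list(permutations(range(G_nodes), A_nodes))

def closure_replicas(A_nodes, A_edges, G_nodes):
    # Each replica A_i is A instantiated on one tuple of host nodes.
    return [{(m[s], m[t]) for (s, t) in A_edges}
            for m in potential_occurrences(A_nodes, G_nodes)]

# A 2-node, 1-edge constraint graph over a 3-node host yields 6 replicas
# (3 * 2 injective node maps); with typed nodes, as in Fig. 11, the
# count would be smaller.
replicas = closure_replicas(2, {(0, 1)}, 3)
print(len(replicas))   # 6
```

As in the example, closure takes information from the graph where the GC is evaluated (the number of occurrences) and places it in the GC itself.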

The idea behind decomposition is to split a graph into its basic components to transform partial morphisms into total morphisms of one of its parts. For this purpose, the decomposition operator splits a digraph into its edges, generating as many digraphs as edges in . As stated in remark 1 of definition 3, all graphs for which the GC asks for a partial morphism are forbidden to have isolated nodes. We are more interested in the behaviour of edges (which to some extent comprise nodes, as source and target elements of the edges, except for isolated nodes) than in nodes alone, as edges define the topology of the graph. This is also the reason why predicate was defined to be true in the presence of a partial morphism non-empty in edges. If so desired, in order to consider isolated nodes, it is possible to define two decomposition operators, one for nodes and one for edges, but this is left for future work.

{definition}

[Decomposition]

Given with ground formula and diagram , acts on in the following way:

$$\begin{aligned} d \;&\longmapsto\; d' = \big(\{A_1,\ldots,A_n\},\; d_{ij}\colon A_i \to A_j\big)\\ f \;&\longmapsto\; f' = \exists A_1 \ldots \exists A_n \Big[\, \bigvee_{i=1}^{n} A_i \Big] \end{aligned} \qquad (7)$$

with , the number of edges of , and , where contains a single edge of .

Demanding a partial morphism is equivalent to asking for the existence of a total morphism of some of its edges, that is, each contains exactly one of the edges of .

Example. Consider , where graph is shown to the left of Fig. 12. The constraint is satisfied by a host graph if there is a partial morphism non-empty in edges . Thus, we require that either the two conveyors are connected, or there is a piece in one of them. Using decomposition, we obtain . Diagram is shown in Fig. 12(b), together with a graph satisfying the constraint in Fig. 12(c). Note that this constraint can be expressed more concisely than in other approaches, like the algebraic/categorical one of [8].
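A minimal sketch of decomposition, using node labels in place of types (a simplification of ours): the constraint graph is split into single-edge graphs, and a partial morphism non-empty in edges exists iff one of those single-edge graphs is totally matched in the host.

```python
# A graph is encoded as a set of labelled edges (src_label, tgt_label);
# labels stand in for node types. Decomposition splits A into its
# single-edge graphs A_1,...,A_n; a partial morphism of A into G,
# non-empty in edges, exists iff some A_i has a total match in G.

def decompose(A_edges):
    return [frozenset([e]) for e in A_edges]

def partial_match_exists(A_edges, G_edges):
    return any(next(iter(Ai)) in G_edges for Ai in decompose(A_edges))

# In the spirit of Fig. 12: A asks for two connected conveyors OR a
# piece on a conveyor; G only has a piece on a conveyor, so the
# constraint is satisfied.
A = {("conveyor", "conveyor"), ("piece", "conveyor")}
G = {("piece", "conveyor"), ("conveyor", "machine")}
print(partial_match_exists(A, G))   # True
```

The disjunction of equation 7 is visible in `any(...)`: it suffices that one single-edge component be found totally.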

Note how decomposition is not affected by the host graph on which it is evaluated. Also, we do not care whether some graphs in the decomposition are matched in the same place in the host graph (e.g. and ), as the GC just requires one of them to be found.

Now we show the main result of this section, which states that it is possible to reduce any formula in an AC (or GC) into another one using existential quantifiers and total morphisms only. This theorem is of interest because derivations as defined in MGGs (the matching part) use only total morphisms and existential quantifiers.

{theorem}

[ reduction]

Let be a GC with a ground formula. Then can be transformed into a logically equivalent with existential quantifiers only.

Proof. Let the depth of a graph for a fixed node be the maximum over the shortest paths (to avoid cycles) starting in any node different from and ending in . The depth of a graph is the maximum depth over all its nodes. Diagram is a graph whose nodes are digraphs and whose edges are morphisms . We use to denote the depth of . In order to prove the theorem we apply induction on the depth, checking every case. There are 16 possibilities for and a single element , summarized in Table 1.

Elements in the same row for each pair of columns are related using equalities and , so it is possible to reduce the study to cases (1)–(4) and (9)–(12). Identities and reduce (9)–(12) to formulae (1)–(4):

$$\exists A\,[\,\overline{Q}(A)\,] = \exists A\,[\,P(A, \overline{G})\,], \qquad \exists A\,[\,Q(A)\,] = \exists A\,[\,\overline{P}(A, \overline{G})\,]$$
$$\nexists A\,[\,Q(A)\,] = \nexists A\,[\,\overline{P}(A, \overline{G})\,], \qquad \nexists A\,[\,\overline{Q}(A)\,] = \nexists A\,[\,P(A, \overline{G})\,].$$

Thus, it is enough to study the first four cases, but we have to specify if must be found in or . Finally, all cases in the first column can be reduced to (1):

• (1) is the definition of match.

• (2) can be transformed into total morphisms (case 1) using operator : .

• (3) can be transformed into total morphisms (case 1) using operator : . Here for simplicity, the conditions on are assumed to be satisfied and thus have not been included.

• (4) combines (2) and (3), where operators and are applied in order (see remark below): .

If there is more than one element at depth 1, this same procedure can be applied mechanically (well-definedness guarantees independence with respect to the order in which elements are selected). Note that if the depth is 1, graphs on the diagram are unrelated (otherwise, the depth would be greater than 1).

Induction Step. When there is a universal quantifier , according to equation 4, elements of are replicated as many times as potential instances of can be found in the host graph. In order to continue the application procedure, we have to clone the rest of the diagram for each replica of , except those graphs which are existentially quantified before in the formula. That is, if we have a formula , when performing the closure of , we have to replicate as many times as , but not . Moreover, has to be connected to each replica of , preserving the identifications of the morphism . In more detail, when closure is applied to , we iterate over all graphs in the diagram:

• If is existentially quantified after () then it is replicated as many times as . Appropriate morphisms are created between each and if a morphism existed. The new morphisms identify elements in and according to . This permits finding different matches of for each , some of which can be equal (if, for example, there are three instances of in the host graph but only one of , then the three replicas of are matched to the same part of ).

• If is existentially quantified before () then it is not replicated, but just connected to each replica of if necessary. This ensures that a unique has to be found for each . Moreover, the replication of has to preserve the shape of the original diagram. That is, if there is a morphism , then each has to preserve the identifications of (this means that we take only those which preserve the structure of the diagram).

• If is universally quantified (no matter if it is quantified before or after ), again it is replicated as many times as . Afterwards, will itself need to be replicated due to its universality. The order in which these replications are performed is not relevant as .

Remark. Operators and commute, i.e. . In the equation of item 4, the application order does not matter. Composition is a direct translation of , which first considers all appearances of nodes in and then splits these occurrences into separate digraphs. This is the same as considering every pair of nodes in connected by one edge and taking their closure, i.e. .

Example. Fig. 13 shows rule endProc and the diagram of its AC, which has formula: . The AC allows for the application of the rule if all machines connected (as output) to the conveyor in are operated by the same operator. This is so because the AC considers all machines connected to the LHS conveyor by . For these machines, it should be the case that a unique operator ( is placed at the beginning of the formula) is connected to them ().

The bottom of the figure shows the resulting diagram after applying the previous theorem, using graph to the upper right of the figure. At depth 2, graph is replicated three times, as it is universally quantified and there are three machines. Then, the rest of the diagram is replicated, except the graphs quantified before ( and ). The resulting formula of the AC is