Duality in Graphical Models

Abstract

Graphical models have proven to be powerful tools for representing high-dimensional systems of random variables. One example of such a model is the undirected graph, in which lack of an edge represents conditional independence between two random variables given the rest. Another example is the bidirected graph, in which absence of edges encodes pairwise marginal independence. Both classes have been studied extensively and are considered dual to one another, but with few exceptions this duality has not been thoroughly investigated. In this paper, we demonstrate how duality between undirected and bidirected models can be used to transport results for one class of graphical models to the dual model in a transparent manner. We proceed to apply this technique to extend existing results as well as to prove new ones, in three important domains. First, we discuss the pairwise and global Markov properties for undirected and bidirected models, using the pseudographoid and reverse pseudographoid rules, which are weaker conditions than the commonly used intersection and composition rules. Second, we investigate these pseudographoid and reverse pseudographoid rules in the context of probability distributions, using the concept of duality in the process. Duality allows us to quickly relate them to the more familiar intersection and composition properties. Third, we apply the dualization method to understand the implications of faithfulness, which in turn leads to a more general form of an existing result.

1 Introduction

Graphical models are used to study conditional independence statements about a set of random variables where . Most generally, a graphical model for such a system is a graph . In such a graph, inclusion of in the edge set is predicated on some conditional independence statement about and . For example, the undirected graph corresponding to (called the concentration graph in some settings, such as [12]) is constructed such that if and only if and are conditionally independent given the rest of the variables. The bidirected graph (called the covariance graph in some settings, such as [3]), is constructed using the rule if and only if and are independent.

Undirected and bidirected graphical models have been widely studied in the literature [4]. Important to the study of such models are the equivalence between the pairwise Markov and global Markov properties (see Section 2.4), general conditions for such equivalences, and the concept of faithfulness. Under certain conditions, graphical models defined using pairwise relations also encode more complicated global conditional independence statements. These conditional independence statements are typically represented by separation statements in an undirected or bidirected graph , and when this occurs the random variable is said to be globally Markov with respect to . More specifically, when the global Markov condition is satisfied, separation of two disjoint subsets and given a third separating subset implies a conditional independence statement about , , and . The ability of a graphical model to encode such complex conditional independence statements is important when using such models in applications. The reverse containment, namely that all conditional independence statements within a random vector are encoded by separation statements in a given graph, is known as faithfulness.

Several authors have specified conditions under which the pairwise and global Markov properties are equivalent in both undirected and bidirected models [5]. Conditions under which a distribution is faithful to a graphical model have also been formulated for undirected and bidirected trees [2]. However, although undirected and bidirected graphical models are known to be dual to each other (a notion formalized by [9]), they have frequently been treated differently, especially when proving properties of such models. In several instances, authors have obtained results for bidirected graphs that parallel those for undirected graphs but, as we shall demonstrate, have used more complicated proof techniques than necessary. In this paper, we demonstrate that many of these results could have been achieved using the dual framework of [10], and we use this formalism to develop even more general results. Our approach shows how, without exception, results on bidirected graphs can be obtained easily by analyzing undirected graphs.

Besides introducing and investigating the important technique of using duality to “transport” results in the undirected graph setting to the bidirected graph setting, we also enumerate below the additional novel contributions of the paper. First, weaker conditions (the so-called pseudographoid rules) for the equivalence between the pairwise and global Markov conditions on both undirected and bidirected graphs are given. Duality is used to adapt the result on undirected graphs to the dual result on bidirected graphs. Second, the relationship between the familiar intersection/composition properties and the more general pseudographoid rules is formally derived. Third, a result on faithfulness in the bidirected graph setting, known to be true only for Gaussian random variables, is generalized in a significant way. In many cases, “direct” proofs are given side by side with proofs using duality, to underscore the power of the technique.

The paper is organized as follows. In Section 2, we introduce preliminary notation and concepts. The general approach taken by the paper is outlined in Section 3. The equivalence between pairwise and global Markov properties is considered in Section 4. The conditions used there, called the pseudographoid and reverse pseudographoid rules, are studied in greater detail in Section 5. Finally, the idea of faithfulness is investigated in Section 6, where the use of duality allows a result which formerly applied only to Gaussian distributions to be extended to a more general setting.

2 Preliminaries

In this section, notation concerning conditional independence structures and graphical models is introduced. These are then stated in the language of relations as in [10], a formalism which is used throughout this paper. Some preliminary results regarding various closure rules on the set of relations are derived.

2.1 Conditional Independence

In the sequel, we will distance ourselves from conditional independence structures of probability distributions and instead work with general relations (defined in Section 2.2). However, it is useful to use probability distributions as a motivating example before moving to the language of relations.

Throughout this paper, is a random vector indexed by a finite set . For any , denotes a sub-vector of . For disjoint , with , , and taking values in -algebras , , and respectively, is said to be conditionally independent of given if ,

where is the law of . In this case we say that

which will henceforth be shortened to when there is no ambiguity.

Any system of random variables satisfies the following semigraphoid rules [12]:

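  • Symmetry: $A \perp\!\!\!\perp B \mid C \iff B \perp\!\!\!\perp A \mid C$.

  • Decomposition: $A \perp\!\!\!\perp B \cup C \mid D \implies A \perp\!\!\!\perp B \mid D$.

  • Weak Union: $A \perp\!\!\!\perp B \cup C \mid D \implies A \perp\!\!\!\perp B \mid C \cup D$.

  • Contraction: $A \perp\!\!\!\perp B \mid C \cup D$ and $A \perp\!\!\!\perp C \mid D \implies A \perp\!\!\!\perp B \cup C \mid D$,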

where is shorthand for when there is no ambiguity, and all sets above are disjoint. Above, and can represent any partition of with . Note that although any set of random variables necessarily satisfies the semigraphoid axioms above, the list is not complete; in fact, no finite characterization of random variables using such rules exists [14].
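The four semigraphoid rules can be checked mechanically on a small finite relation. The following is a minimal Python sketch under stated assumptions: a triple is encoded as three pairwise disjoint frozensets `(A, B, C)`, read as "A is independent of B given C", and the names `is_semigraphoid`, `nonempty_proper_subsets`, and `fs` are illustrative helpers that do not come from the paper.

```python
from itertools import combinations

def nonempty_proper_subsets(S):
    """Yield every nonempty proper subset of the frozenset S."""
    items = list(S)
    for r in range(1, len(items)):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_semigraphoid(L):
    """Check closure of a finite relation L under the four semigraphoid rules."""
    L = set(L)
    for (A, E, D) in L:
        # Symmetry: (A, E | D) requires (E, A | D).
        if (E, A, D) not in L:
            return False
        # Split E into two nonempty parts B and C.
        for B in nonempty_proper_subsets(E):
            C = E - B
            # Decomposition: (A, B u C | D) requires (A, B | D).
            if (A, B, D) not in L:
                return False
            # Weak union: (A, B u C | D) requires (A, B | C u D).
            if (A, B, C | D) not in L:
                return False
    # Contraction: (A, B | C u D) and (A, C | D) require (A, B u C | D).
    for (A, B, K) in L:
        for (A2, C, D) in L:
            if A2 == A and D <= K and C == K - D:
                if (A, B | C, D) not in L:
                    return False
    return True

def fs(*xs):
    """Shorthand for building a frozenset."""
    return frozenset(xs)

# Hand-closed structure generated by "1 is independent of {2, 3}".
L_closed = {
    (fs(1), fs(2, 3), fs()), (fs(2, 3), fs(1), fs()),
    (fs(1), fs(2), fs()), (fs(2), fs(1), fs()),
    (fs(1), fs(3), fs()), (fs(3), fs(1), fs()),
    (fs(1), fs(2), fs(3)), (fs(2), fs(1), fs(3)),
    (fs(1), fs(3), fs(2)), (fs(3), fs(1), fs(2)),
}
# The generating statement alone (with its mirror image) is not closed.
L_open = {(fs(1), fs(2, 3), fs()), (fs(2, 3), fs(1), fs())}

print(is_semigraphoid(L_closed), is_semigraphoid(L_open))
```

Only the right-hand forms of decomposition and weak union are tested explicitly; the left-hand forms follow once symmetry holds, which is why symmetry is checked on every triple.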

Any random vector also satisfies the following localizability property [10]:

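  • Localizability: $A \perp\!\!\!\perp B \mid C \iff a \perp\!\!\!\perp b \mid C'$ for all $a \in A$, $b \in B$, and $C \subseteq C' \subseteq (A \cup B \cup C) \setminus \{a, b\}$,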

where lowercase letters, above and henceforth, represent singletons, i.e., . More generally, [10] demonstrates that any semigraphoid is localizable.

Furthermore, may satisfy

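  • Intersection (I): $A \perp\!\!\!\perp B \mid C \cup D$ and $A \perp\!\!\!\perp C \mid B \cup D \implies A \perp\!\!\!\perp B \cup C \mid D$.

  • Composition (M): $A \perp\!\!\!\perp B \mid D$ and $A \perp\!\!\!\perp C \mid D \implies A \perp\!\!\!\perp B \cup C \mid D$.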

If admits a strictly positive density, it follows that satisfies the intersection rule [6]. It is well-known that Gaussian random variables satisfy both intersection and composition. Some simple results relating the above properties are presented in Section 2.2.

2.2 Relations

In the sequel, it will be useful to abstract conditional independence structures using the notation of relations. For a finite set , define as the set of all triples with pairwise disjoint and both nonempty. A subset of will be referred to as a relation. For a random vector indexed by , define the relation by (where are disjoint) which is called the conditional independence structure of . As all relations in general will be subsets of , the dependence on will be henceforth suppressed.
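To make the ambient set of triples concrete, the following short Python sketch enumerates every triple of pairwise disjoint subsets whose first two components are nonempty, for a small index set. The function name `all_triples` and the encoding of a triple as three frozensets are assumptions made for illustration only.

```python
from itertools import product

def all_triples(V):
    """Yield every triple (A, B, C) of pairwise disjoint subsets of V
    with A and B nonempty; C may be empty."""
    elements = sorted(V)
    # Each element is placed in A, B, or C, or is left out ("-").
    for labels in product("ABC-", repeat=len(elements)):
        A = frozenset(v for v, s in zip(elements, labels) if s == "A")
        B = frozenset(v for v, s in zip(elements, labels) if s == "B")
        C = frozenset(v for v, s in zip(elements, labels) if s == "C")
        if A and B:
            yield (A, B, C)

triples = list(all_triples({1, 2, 3}))
print(len(triples))  # 18
```

Since each of the n elements has four possible placements, inclusion-exclusion over "A empty" and "B empty" gives 4^n - 2·3^n + 2^n triples, which is 18 for n = 3. A relation in the sense of this section is then any subset of this list.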

All of the closure rules of Section 2.1 can be translated into the notation of relations. For example, is a semigraphoid if it is closed under the semigraphoid rules in the sense that

The following simple lemma (stated by [9] without proof) gives a compact representation of properties , and above.

Suppose first that is closed under , and . If , it follows by that and by that . On the other hand, if , it follows by that .

Now suppose that is closed under defined in the statement of the lemma. Then it is clear by definition that is closed under , and .

This representation of the semigraphoid rules is parsimonious in the sense that it consists of one if and only if statement. We shall later see that this is convenient when proving results concerning semigraphoids.

Furthermore, is localizable if

Note that for any random variable , the corresponding conditional independence structure is a semigraphoid [12]. Under the localizability condition, global statements about general triples can be constructed from pairwise triples of the form , where are singletons. Specifically, define . Global statements about a localizable set of triples can be made when working only with . This technique is used effectively, for example, in [10].
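The passage from pairwise triples to general triples can be illustrated with a small check. The sketch below assumes the usual form of the localizability condition: a triple (A, B | C) belongs to the relation exactly when (a, b | C') does for all singletons a in A, b in B and every C' with C ⊆ C' ⊆ (A ∪ B ∪ C) \ {a, b}. This reading, the function names, and the frozenset encoding are assumptions for illustration, not notation taken from the paper.

```python
from itertools import combinations, product

def all_triples(V):
    """Yield all (A, B, C) with A, B, C pairwise disjoint subsets of V, A and B nonempty."""
    elements = sorted(V)
    for labels in product("ABC-", repeat=len(elements)):
        A = frozenset(v for v, s in zip(elements, labels) if s == "A")
        B = frozenset(v for v, s in zip(elements, labels) if s == "B")
        C = frozenset(v for v, s in zip(elements, labels) if s == "C")
        if A and B:
            yield (A, B, C)

def subsets(S):
    """Yield every subset of S, including the empty set and S itself."""
    items = list(S)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_localizable(L, V):
    """Check the assumed localizability condition for every triple over V."""
    L = set(L)
    for (A, B, C) in all_triples(V):
        # C' = C u extra, where extra ranges over subsets of (A u B) minus {a, b}.
        pairwise_ok = all(
            (frozenset([a]), frozenset([b]), C | extra) in L
            for a in A for b in B
            for extra in subsets((A | B) - {a, b})
        )
        if ((A, B, C) in L) != pairwise_ok:
            return False
    return True

def fs(*xs):
    return frozenset(xs)

V = {1, 2, 3}
# Semigraphoid generated by "1 is independent of {2, 3}" (closed by hand).
L_closed = {
    (fs(1), fs(2, 3), fs()), (fs(2, 3), fs(1), fs()),
    (fs(1), fs(2), fs()), (fs(2), fs(1), fs()),
    (fs(1), fs(3), fs()), (fs(3), fs(1), fs()),
    (fs(1), fs(2), fs(3)), (fs(2), fs(1), fs(3)),
    (fs(1), fs(3), fs(2)), (fs(3), fs(1), fs(2)),
}
# A global statement without its pairwise consequences is not localizable.
L_global_only = {(fs(1), fs(2, 3), fs()), (fs(2, 3), fs(1), fs())}

print(is_localizable(L_closed, V), is_localizable(L_global_only, V))
```

The first relation is a semigraphoid, so passing the check is consistent with the statement that semigraphoids are localizable; the second fails because the pairwise triples that should witness the global statement are missing.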

The following lemma shows that localizability is a strictly weaker condition than the semigraphoid rules.

The fact that semigraphoids are localizable is shown in [10], Lemma 3. See also [11], Lemma 1.

Now let be localizable. Then

which shows closure of under decomposition . Moreover,

and so is closed under weak union .

It remains to show by counterexample that need not be closed under contraction. Consider the relation which is closed under localizability . However, is not closed under contraction, as it does not contain .

Localizability and its relation to the semigraphoid axioms is considered in more detail in Section 5, when we consider localizability and the pseudographoid rules in detail. For more details on the rich mathematical framework underlying and its subfamilies, we refer the reader to [7], [17], [15] and [9].

2.3 Graphical Models

Let be an undirected graph with vertex set and edge set such that . Two distinct vertices are said to be adjacent in if and in this case we write or simply when it is unambiguous. The vertices are said to be connected in if there exists some , with distinct, such that . In this case, is said to be a path connecting and in . We define as the set of all paths connecting and .

Given an undirected graph and disjoint , we say that separates and in if , any path in contains at least one element of . In this case, we write

and define by . For any undirected graph , is a semigraphoid closed under intersection and composition [12]. It follows that is localizable; for a direct proof of this statement, see the appendix. For a further discussion of in a larger context, see [11], [7] and [9].
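Separation as defined above can be tested with a breadth-first search that is not allowed to enter the separating set. The following is a minimal Python sketch; the function name `separates` and the representation of a graph as a list of vertex pairs are illustrative assumptions, and the three vertex sets are assumed to be pairwise disjoint.

```python
from collections import deque

def separates(edges, A, B, C):
    """Return True if every path from a vertex of A to a vertex of B
    passes through C (A, B, C assumed pairwise disjoint)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen = set(A)
    queue = deque(A)
    while queue:
        u = queue.popleft()
        if u in B:
            return False  # reached B without entering C
        for w in adjacency.get(u, ()):
            if w not in seen and w not in C:
                seen.add(w)
                queue.append(w)
    return True

path = [(1, 2), (2, 3)]  # the path graph 1 - 2 - 3
print(separates(path, {1}, {3}, {2}))    # True: the only path runs through 2
print(separates(path, {1}, {3}, set()))  # False: 1 - 2 - 3 is unblocked
```

Collecting all separated triples of a graph in this way yields the separation relation discussed in this section.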

Given a relation , we construct the undirected graph corresponding to , written as , according to the rule

The bidirected graph corresponding to is the graph constructed according to the rule

Note that each edge of can be thought of as a bidirected edge. We use this convention to be consistent with previous work on directed graphs, and adhere to the notation of [6]. Some authors refer to the undirected graph above as the concentration graph and the bidirected graph as the covariance graph [4]. This terminology makes sense when modelling Gaussian distributions, for which pairwise partial independence between random variables is encoded by sparsity in the inverse covariance (concentration) matrix. Since the context of this paper is much broader, we use the undirected/bidirected nomenclature.
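The two constructions can be sketched in a few lines of Python. Following the description in the introduction, the undirected graph omits an edge exactly when the two vertices are independent given all remaining vertices, and the bidirected graph omits an edge exactly when they are marginally independent. The names `holds`, `undirected_graph`, and `bidirected_graph`, and the choice to accept a pairwise triple in either orientation, are assumptions for illustration.

```python
from itertools import combinations

def holds(L, a, b, C):
    """True if the pairwise triple (a, b | C) is in L, in either orientation."""
    A, B, C = frozenset([a]), frozenset([b]), frozenset(C)
    return (A, B, C) in L or (B, A, C) in L

def undirected_graph(L, V):
    """Edge a - b is present unless a and b are independent given all other vertices."""
    V = frozenset(V)
    return {frozenset((a, b)) for a, b in combinations(sorted(V), 2)
            if not holds(L, a, b, V - {a, b})}

def bidirected_graph(L, V):
    """Edge a <-> b is present unless a and b are marginally independent."""
    return {frozenset((a, b)) for a, b in combinations(sorted(V), 2)
            if not holds(L, a, b, ())}

def fs(*xs):
    return frozenset(xs)

V = {1, 2, 3}
# A Markov-chain-like relation: 1 and 3 are independent given 2, nothing else.
L = {(fs(1), fs(3), fs(2)), (fs(3), fs(1), fs(2))}

print(sorted(sorted(e) for e in undirected_graph(L, V)))  # [[1, 2], [2, 3]]
print(sorted(sorted(e) for e in bidirected_graph(L, V)))  # [[1, 2], [1, 3], [2, 3]]
```

In this example the undirected graph is the path 1 - 2 - 3, while the bidirected graph is complete because the relation contains no marginal independence.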

Such graphs have been widely studied going back to [3], [5], and [12]. For general reference on both undirected and bidirected graphs, as well as other types of graphical models, see for example [4], [6], [16], and [18].

2.4 Global Markov Properties

Under certain assumptions, pairwise statements about graphical models can be used to construct global statements. Given a relation and graph , we say that is:

  • Undirected (concentration) pairwise Markov

    with respect to if , that is, .

  • Undirected global Markov

    with respect to if , that is, .

  • Bidirected (covariance) pairwise Markov

    with respect to if .

  • Bidirected global Markov

    with respect to if .

Note that we speak of Markov properties with respect to a relation as compared to a random variable . In the language of conditional independence, the undirected global Markov rule, e.g., is the statement that

Fixing some arbitrary , note that if is undirected pairwise Markov with respect to then ; is minimal in this sense. Further, if satisfies , then is undirected pairwise Markov with respect to . Similarly, if is bidirected pairwise Markov with respect to , then . If , then is bidirected pairwise Markov with respect to .
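The gap between the pairwise and global undirected Markov properties can be seen on a three-vertex example. The sketch below is an illustration under stated assumptions: triples are encoded as frozensets, a graph is a list of vertex pairs, and all function names are hypothetical. The pairwise property asks that every non-adjacent pair be independent given the rest, while the global property asks that every separation statement of the graph appear in the relation.

```python
from collections import deque
from itertools import permutations, product

def all_triples(V):
    """Yield all (A, B, C) with A, B, C pairwise disjoint subsets of V, A and B nonempty."""
    elements = sorted(V)
    for labels in product("ABC-", repeat=len(elements)):
        parts = {s: frozenset(v for v, t in zip(elements, labels) if t == s) for s in "ABC"}
        if parts["A"] and parts["B"]:
            yield (parts["A"], parts["B"], parts["C"])

def separates(edges, A, B, C):
    """True if every path from A to B meets C (breadth-first search avoiding C)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, queue = set(A), deque(A)
    while queue:
        u = queue.popleft()
        if u in B:
            return False
        for w in adjacency.get(u, ()):
            if w not in seen and w not in C:
                seen.add(w)
                queue.append(w)
    return True

def undirected_pairwise_markov(L, V, edges):
    """Every non-adjacent pair (a, b) must satisfy (a, b | V minus {a, b}) in L."""
    V = frozenset(V)
    E = {frozenset(e) for e in edges}
    return all((frozenset([a]), frozenset([b]), V - {a, b}) in L
               for a, b in permutations(V, 2) if frozenset((a, b)) not in E)

def undirected_global_markov(L, V, edges):
    """Every separation statement of the graph must belong to L."""
    return all(t in L for t in all_triples(V) if separates(edges, *t))

V = {1, 2, 3}
edges = []  # the empty graph on three vertices
# Only the pairwise statements "a independent of b given the third vertex".
L_pairwise = {(frozenset([a]), frozenset([b]), frozenset(V) - {a, b})
              for a, b in permutations(V, 2)}
# The full separation relation of the empty graph.
L_separation = {t for t in all_triples(V) if separates(edges, *t)}

print(undirected_pairwise_markov(L_pairwise, V, edges),
      undirected_global_markov(L_pairwise, V, edges))
print(undirected_pairwise_markov(L_separation, V, edges),
      undirected_global_markov(L_separation, V, edges))
```

The relation `L_pairwise` is pairwise Markov but not global Markov with respect to the empty graph, since the graph also separates 1 and 2 given the empty set; an additional closure assumption on the relation is what bridges this gap.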

It is well-known [12] that closure of a relation under intersection is a sufficient condition for equivalence between the undirected pairwise and undirected global Markov properties. The same is true [5] for relations closed under composition with respect to the bidirected Markov properties. In Section 4, the assumptions of intersection and composition will be weakened. In addition, the concept of duality will be used to demonstrate how the result on bidirected graphs follows directly from that on undirected graphs.

2.5 Additional Closure Rules

We now present four additional closure rules on . First, we define the pseudographoid and reverse pseudographoid rules. Pseudographoids have been previously studied going back at least to [11], and are defined here in their pairwise form as in [7]. The reverse pseudographoid rule is also considered by [7].

  • Pseudographoid Rule : , where are singletons.

  • Reverse Pseudographoid Rule : where are singletons.

Note that the pseudographoid rule is analogous to intersection, but with some sets restricted to be singletons. On the other hand, the reverse pseudographoid rule is a weakened version of composition. In Section 4, these rules are shown to be both necessary and sufficient for equivalence of the Markov properties (Section 2.4). In Section 5, the pseudographoid and reverse pseudographoid rules are studied with respect to semigraphoids, and shown to be equivalent to intersection (respectively, composition) in that context.

In the sequel, the following two closure properties will be used to study faithfulness (see Section 6) of graphical models:

  • Decomposable Transitivity: .

  • Dual Decomposable Transitivity: .

Decomposable transitivity was first defined in [2] to provide conditions for faithfulness of random variables to tree graphs. Dual decomposability is a new notion introduced in this paper to facilitate derivation of a dual result to that of [2] in Section 6.

3 Duality

We now introduce the notion of duality, a tool which we rely heavily on in the sequel. Given some triple , define its dual by . Similarly, given a relation , its dual is defined to be

Note that . Duality is discussed in detail in [9], in the context of various classes of relations.

Consider the undirected and bidirected graphs corresponding to a given relation , as defined in Section 2.3. Note that , and hence the undirected graph corresponding to is the dual of the bidirected graph (and vice versa), that is . Furthermore, the Markov properties for bidirected graphs are simply defined using duality. A relation is bidirected pairwise Markov with respect to some if , and bidirected global Markov with respect to if .
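The relationship between the two graphs under dualization can be checked on a small example. The sketch below assumes the standard form of the dual of a triple, in which the conditioning set C of (A, B | C) is replaced by the complement of A ∪ B ∪ C in the index set; this formula, the function names, and the frozenset encoding are assumptions for illustration.

```python
from itertools import combinations

def dual_relation(L, V):
    """Replace each conditioning set C by V minus (A u B u C)."""
    V = frozenset(V)
    return {(A, B, V - A - B - C) for (A, B, C) in L}

def holds(L, a, b, C):
    """True if the pairwise triple (a, b | C) is in L, in either orientation."""
    A, B, C = frozenset([a]), frozenset([b]), frozenset(C)
    return (A, B, C) in L or (B, A, C) in L

def undirected_graph(L, V):
    """No edge a - b exactly when (a, b | all other vertices) is in L."""
    V = frozenset(V)
    return {frozenset((a, b)) for a, b in combinations(sorted(V), 2)
            if not holds(L, a, b, V - {a, b})}

def bidirected_graph(L, V):
    """No edge a <-> b exactly when (a, b | empty set) is in L."""
    return {frozenset((a, b)) for a, b in combinations(sorted(V), 2)
            if not holds(L, a, b, ())}

def fs(*xs):
    return frozenset(xs)

V = {1, 2, 3}
L = {(fs(1), fs(3), fs(2)), (fs(3), fs(1), fs(2))}
L_dual = dual_relation(L, V)

print(dual_relation(L_dual, V) == L)                          # dualizing twice is the identity
print(undirected_graph(L, V) == bidirected_graph(L_dual, V))  # graphs are swapped
print(bidirected_graph(L, V) == undirected_graph(L_dual, V))
```

The swap works because the pairwise triple conditioning on all remaining vertices is sent by dualization to the triple conditioning on the empty set, and vice versa.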

We now define a sense in which one can “dualize” a result regarding undirected graphs to one about bidirected graphs. Let be a relation. As in [15], we define a rule with antecedents, , as a set of -tuples of with each a distinct triple on . We say that a relation is closed under if for every -tuple , it holds that

For example, the intersection rule is the set of all -tuples of the form

where are arbitrary disjoint subsets of . We will also allow for , and in this case is closed under the unary rule when

which is simply set-containment.

The dual of a rule is defined as

Noting that

it is easy to see that the dual of the intersection rule is composition, i.e., the set of all triples of the form

after the change of variables .

Note that the closure of a relation under some rule is equivalent to closure of under . Similarly, suppose it holds that whenever a relation is closed under some finite set of rules, , it must also be closed under . Then it immediately holds that whenever a relation is closed under , it must also be closed under . Otherwise, there would exist some which was closed under but not under . In this case, would be closed under but not under , contradicting the assumption.
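The correspondence between a rule and its dual can be tested numerically for the intersection and composition rules. The following Python sketch draws random relations over a four-element index set and checks that a relation is closed under intersection exactly when its dual is closed under composition. It assumes the dual of (A, B | C) conditions on the complement of A ∪ B ∪ C; the function names and the sampling scheme are illustrative choices, and a finite sample is evidence rather than proof.

```python
import random
from itertools import product

def all_triples(V):
    """Yield all (A, B, C) with A, B, C pairwise disjoint subsets of V, A and B nonempty."""
    elements = sorted(V)
    for labels in product("ABC-", repeat=len(elements)):
        parts = {s: frozenset(v for v, t in zip(elements, labels) if t == s) for s in "ABC"}
        if parts["A"] and parts["B"]:
            yield (parts["A"], parts["B"], parts["C"])

def dual_relation(L, V):
    """Replace each conditioning set C by V minus (A u B u C)."""
    V = frozenset(V)
    return {(A, B, V - A - B - C) for (A, B, C) in L}

def closed_under_intersection(L):
    """(A, B | C u D) and (A, C | B u D) must imply (A, B u C | D)."""
    for (A, B, K) in L:
        for (A2, C, K2) in L:
            # K = C u D and K2 = B u D for a common D.
            if A2 == A and C <= K and B <= K2 and K - C == K2 - B:
                if (A, B | C, K - C) not in L:
                    return False
    return True

def closed_under_composition(L):
    """(A, B | D) and (A, C | D) with B, C disjoint must imply (A, B u C | D)."""
    for (A, B, D) in L:
        for (A2, C, D2) in L:
            if A2 == A and D2 == D and not (B & C):
                if (A, B | C, D) not in L:
                    return False
    return True

def duality_holds_on_samples(V, trials=200, seed=0):
    """Compare the two closure checks on randomly sampled relations over V."""
    rng = random.Random(seed)
    triples = list(all_triples(V))
    for _ in range(trials):
        L = set(rng.sample(triples, rng.randint(0, 12)))
        if closed_under_intersection(L) != closed_under_composition(dual_relation(L, V)):
            return False
    return True

print(duality_holds_on_samples({1, 2, 3, 4}))
```

Each instance of the intersection rule is mapped by dualization to exactly one instance of the composition rule, so the two checks agree on every relation, not only on the sampled ones.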

For example, for a given graph , consider the unary rules

and

These are the pairwise undirected and pairwise bidirected Markov properties of Section 2.4; they are jointly dual. Similarly, the global Markov properties are given by

and

Hence, the existence of a result providing equivalence between the undirected pairwise and undirected global Markov rules under the assumption of some closure rule yields the dual result, i.e., equivalence between the bidirected pairwise and bidirected global Markov rules under the dual closure rule. Although we will use the technique described above in the sequel, we will not require the preceding notation regarding rules.

Before proceeding, we provide three lemmas which will be used in the remainder of this paper. The following lemma specifies the effect of the dualization operator on relations with respect to various closure rules.

It is clear by definition that , therefore it suffices to show one direction. Therefore, assume that is localizable; we will show that is as well.

We first show the direction of the localizability condition. Assume that , and consider any singletons , and satisfying . We are done if , which is equivalent to showing that . Since , we have that for any , by localizability of . But , and hence as required.

To show the direction, consider pairwise disjoint , and assume that for any singletons and . Therefore, for any such , it holds that . Hence, for any , which implies by localizability of that . Therefore, , completing the proof of .

Let be a semigraphoid and consider . Then which implies that by decomposition and weak union, respectively. However, this in turn yields . Next, let such that . By contraction, it follows that , which implies that , and hence is a semigraphoid. The reverse direction follows from the fact that , completing the proof of .

Let be closed under intersection, and consider . Then

and as is closed under intersection this yields that . It follows that , and hence is closed under composition.

Next, assume that is closed under composition, and let . Then

This implies by composition that , which in turn yields , completing the proof of .

Assume first that is closed under , and for pairwise disjoint with singletons let . It follows that

Then by , it also holds that

This in turn implies that which shows that is closed under .

Next assume that is closed under , and as before let . Then

and by it follows that

This implies that , and hence is closed under , completing the proof of .

This follows directly from the definition of dual decomposable transitivity.

Lemma ? will greatly expedite proofs regarding undirected and bidirected graphs in the sequel. The following lemma from [7] allows dualization of any Gaussian random vector.

In particular, application of Lemma ? yields the following result.

Assume that there exists some Gaussian random vector for which is not closed under . Consider then the Gaussian random vector with , the existence of which is guaranteed by Lemma ?. Since closure of a relation under is equivalent to closure of under , it must be the case that is not closed under . Then is a Gaussian random vector for which is not closed under , contradicting our assumption.

As we will demonstrate in Section 6, Lemmas ? and ? allow dualization of closure rules specific to the Gaussian distribution in general.

4 Duality, Pseudographoid Rules and the Global Markov Properties

In this section, the pairwise and global Markov properties are examined using the dualization technique developed in Section 3. We provide weaker conditions for the equivalences between pairwise and global Markov properties. The typical assumptions for this equivalence are the semigraphoid rules in combination with intersection for undirected graphs or composition for bidirected graphs. Instead, we use localizability (shown to be weaker than the semigraphoid rules by Lemma ?) in combination with either the pseudographoid or reverse pseudographoid rule (which are weaker than intersection/composition).

We begin by considering this equivalence for undirected graphs, and proceed to dualize the result to bidirected graphs. The equivalence between the pairwise and global Markov properties for bidirected graphs is treated differently in the literature than that for undirected graphs, but as we will see, the results are equivalent in the dual sense described in Section 3. The full proof is also given for bidirected graphs for completeness, although dualization is a far more efficient method. We first restate the following well known results regarding the global Markov properties in the language of relations.

The assumptions of both theorems above can be separately weakened. Localizability can be used in place of the semigraphoid rules, while the pseudographoid and reverse pseudographoid rules can be used in place of intersection and composition, respectively. Moreover, each can be stated in terms of necessary and sufficient conditions. We consider both theorems above in light of these weakened properties. In doing so, we modify the logic to assume the pairwise Markov property, and then provide an equivalence between properties of and the global Markov property. While this deviates from the literature, we find it a more natural framework from a graphical modelling perspective; when a graph is used to model a relation , the graph is chosen such that is pairwise Markov with respect to and hence that should be assumed.

4.1 Pseudographoids and the Undirected Markov Properties

To begin, we adapt Theorem ? above, considering the undirected pairwise and undirected global Markov properties for any localizable relation closed under the pseudographoid rule. This equivalence was originally considered by [12] under the assumptions of the semigraphoid and intersection rules, and a related result is stated in the language of relations by [7]. The proof in one direction is technical, but essentially mirrors that of Theorem ? due to [12]; it has therefore been moved to the appendix. While [12] shows that closure under intersection is sufficient for the result, we instead assume closure under the pseudographoid rule.

See appendix.

Assume now that is undirected global Markov with respect to . Then by definition, . By [11], is a localizable pseudographoid, completing the proof.

4.2 Reverse Pseudographoids and the Bidirected Markov Properties

The reverse pseudographoid rule is now examined in place of composition to relate the bidirected pairwise and bidirected global Markov properties (see Theorem ?). The original proof of such equivalence by [5] showed sufficiency of the semigraphoid and composition rules, an assumption which is weakened here. We first provide a direct proof of this more general result. We then proceed to use the concept of duality to yield a much simpler proof of this general result. The direct proof in the direction of Theorem ? below uses techniques similar in some ways to the proof of Theorem ? by [5]. However, there are subtle differences. In fact, it parallels the proof of Theorem ? exactly, in a dual sense. For this reason, and also to provide contrast with the brevity of the ensuing alternate proof which leverages duality, the direct proof of Theorem ? is nevertheless given below.

Assume that is and . To begin, we show that , where are singletons. As in the proof of Theorem ?, this is done by induction on . To begin, if (i.e., ), it follows that , which implies by the bidirected pairwise Markov assumption.

Assume now that for singletons and with , . Then let with . As , we can find some singleton , and hence and . By the inductive hypothesis, this implies that .

Next, as in the proof of Theorem ?, and implies that either or . Without loss of generality, let . Then , and by the inductive hypothesis, . Since satisfies the reverse pseudographoid rule, implies . This completes the induction on .

Finally, assume that for disjoint . Then for singletons and any . By the previous part of this proof, this implies that for all such . As the family is equivalent to , it follows by localizability that , completing the assertion.

Assume now that is bidirected global Markov with respect to . Then by definition, . As by [11], is and , it remains to note that localizability is preserved under dualization and that the dual of a pseudographoid is a reverse pseudographoid (Lemma ?).

Theorem ? can also be proven using the concept of duality, outlined in Section 3. After applying Lemma ?, Theorem ? is seen to follow directly from Theorem ? after a very short proof.

Since is bidirected pairwise Markov with respect to , by definition

Hence, is undirected pairwise Markov with respect to .

Therefore by Theorem ?, is undirected global Markov with respect to if and only if is localizable and closed under . By Lemma ?, the foregoing statement is true if and only if is localizable and closed under .

Thus, duality has allowed for a much shorter and simpler proof of Theorem ? through reuse of Theorem ?.

5 Duality, Semigraphoids, and Pseudographoid Rules

In Section 4, the pairwise and global Markov properties were investigated using the pseudographoid and reverse pseudographoid rules. We now examine pseudographoid and reverse pseudographoid rules in the context of semigraphoids and localizability.

The pseudographoid and reverse pseudographoid rules are weaker than the intersection and composition rules typically used to study Markov properties when considered on a general relation . However, undirected and bidirected graphs are defined solely to study systems of random variables, which are semigraphoids by default. Therefore, it makes sense to consider the pseudographoid and reverse pseudographoid rules as they relate to relations of the form for some random variable with finite index set .

In this section, we demonstrate that when restricted to semigraphoids, closure under the pseudographoid and intersection rules are equivalent. The analogous statement is true for the reverse pseudographoid and composition rule. We demonstrate both equivalences in their own right, but further show how one equivalence can be dualized into the other for a simpler result.

Furthermore, we show that localizable pseudographoids are a less restrictive class of relations than semigraphoids satisfying intersection. The dual result follows immediately. In particular, this shows that indeed our conditions for equivalences between pairwise and global Markov properties in Section 4 are weaker than those which currently exist, for the space of relations.

As remarked in Section 2.1, for any random vector , the conditional independence structure is a semigraphoid, satisfying symmetry , decomposition , weak union , and contraction . By Lemma ?, this is equivalent to satisfying

Similar parsimonious equivalences can be stated under the further assumption of intersection, composition, or the pseudographoid rules.

The direction of is by definition, while follows from the weak union rule (as defined in Section 2.2). Similarly, the direction of is by definition while the direction follows by decomposition.

To prove , note that if is a semigraphoid closed under the pseudographoid rule, then for singletons,

where for clarity we underscore implications with the letter of the corresponding rule as given in Section 2.1. The other direction follows by definition.

To prove , note that

.

The following lemma shows equivalence between the pseudographoid and intersection rules, when the semigraphoid rules are satisfied.

The direction is clear; we will show the direction. Define first the generalized pseudographoid rule by

This rule will be denoted by ; note that the pseudographoid rule on a semigraphoid is while intersection is . Starting then with , by induction on , , and .

Claim ?.1

If the semigraphoid is closed under for some , it is also closed under .

To prove Claim ?.1, fix some pairwise disjoint with and singletons and assume . Then first,

Second,

Finally,

which shows closure under .

The next claim shows induction on the next coordinate.

Claim ?.2

If the semigraphoid is closed under for some , it is also closed under .

Assume that for some pairwise disjoint with and singletons, . Then

which proves Claim ?.2.

The next claim completes the inductive argument.

Claim ?.3

If the semigraphoid is closed under for some , it is also closed under .

For disjoint with and , assume that . Then