Models and Algorithms for Graph Watermarking
Abstract
We introduce models and algorithmic foundations for graph watermarking. Our frameworks include security definitions and proofs, as well as characterizations of when graph watermarking is algorithmically feasible, in spite of the fact that the general problem is NP-complete by simple reductions from the subgraph isomorphism or graph edit distance problems. In the digital watermarking of many types of files, an implicit step in the recovery of a watermark is the mapping of individual pieces of data, such as image pixels or movie frames, from one object to another. In graphs, this step corresponds to approximately matching vertices of one graph to another based on graph invariants such as vertex degree. Our approach is based on characterizing the feasibility of graph watermarking in terms of keygen, marking, and identification functions defined over graph families with known distributions. We demonstrate the strength of this approach with exemplary watermarking schemes for two random graph models, the classic Erdős-Rényi model and a random power-law graph model, both of which are used to model real-world networks.
1 Introduction
In the classic media watermarking problem, we are given a digital representation, , for some media object, , such as a piece of music, a video, or an image, such that there is a rich space, , of possible representations for besides that are all more or less equivalent. Informally, a digital watermarking scheme for is a function that maps and a reasonably short random message, , to an alternative representation, , for in . The verification of such a marking scheme takes and a presumably marked representation, (which was possibly altered by an adversary), along with the set of messages previously used for marking, and it either identifies the message from this set that was assigned to or it indicates a failure. Ideally, it should be difficult for an adversary to transform a representation, (which he was given), into another representation in , that causes the identification function to fail. Some example applications of such digital watermarking schemes include steganographic communication and marking digital works for copyright protection (e.g., see [16, 25, 50]).
With respect to digital representations of media objects that are intended to be rendered for human performances, such as music, videos, and images, there is a well-established literature on digital watermarking schemes and even well-developed models for such schemes (e.g., see Hopper et al. [24]). Typically, such watermarking schemes take advantage of the fact that rendered works have many possible representations with almost imperceptibly different renderings from the perspective of a human viewer or listener.
In this paper, we are inspired by recent systems work on graph watermarking by Zhao et al. [56, 55], who propose a digital watermarking scheme for graphs, such as social networks, protein-interaction graphs, etc., which are to be used for commercial, entertainment, or scientific purposes. This work by Zhao et al. presents a system and experimental results for their particular method for performing graph watermarking, but it is lacking in formal security and algorithmic foundations. For example, Zhao et al. do not provide formal proofs for circumstances under which graph watermarking is undetectable or when it is computationally feasible. Thus, as complementary work to the systems results of Zhao et al., we are interested in the present paper in providing models and algorithms for graph watermarking, in the spirit of the watermarking model provided by Hopper et al. [24] for media files. In particular, we are interested in providing a framework for identifying when graph watermarking is secure and computationally feasible.
1.1 Additional Related Work
Under the term “graph watermarking,” there is some additional work, although it is not actually for the problem of graph watermarking as we are defining it. For instance, there is a line of research involving software watermarking using graph-theoretic concepts and encodings. In this case, the object being marked is a piece of software and the goal of a “graph watermarking” scheme is to create a graph, , from a message, , and then embed into the control flow of a piece of software, , to mark . Examples of such work include pioneering work by Collberg and Thomborson [15], as well as subsequent work by Venkatesan, Vazirani, and Sinha [52] and Collberg et al. [14]. (See also Chen et al. [7] and Bento et al. [4], as well as a survey by Hamilton and Danicic [22].) This work on software watermarking differs from the graph watermarking problem we study in the present paper, however, because in the graph watermarking problem we study an input graph is provided and we want to alter it to add a mark. In the graph-based software watermarking problem, a graph is instead created from a message to have a specific, known structure, such as being a permutation graph, and then that graph is embedded into the control flow of the piece of software.
A line of research that is more closely related to the graph watermarking problem we study is anonymization and de-anonymization for social networks (e.g., see [3, 57, 23, 26, 37, 43, 53]). One of the closest examples of such prior work is by Backstrom, Dwork, and Kleinberg [3], who show how to introduce a small set of “rogue” vertices into a social network and connect them to each other and to other vertices so that if that same network is approximately replicated in another setting it is easy to match the two copies. Such work differs from graph watermarking, however, because the set of rogue vertices is designed to “stand out” from the rest of the graph rather than “blend in,” and it may in some cases be relatively easy for an adversary to identify and remove such rogue vertices. Also, we would ideally prefer graph watermarking schemes that make small changes to the adjacencies of existing vertices rather than mark a graph by introducing new vertices, since in some applications it may not be possible to introduce new vertices into a graph that we wish to watermark. In addition to this work, also of note is work by Narayanan and Shmatikov [43], who study the problem of approximately matching two social networks without marking, as well as the work of Khanna and Zane [28] on watermarking road networks by perturbing vertex positions (which is a marking method outside the scope of our approach).
Our approach to graph watermarking is also necessarily related to the problem of graph isomorphism and its approximation (e.g., see [1, 2, 17, 27, 30, 46]). In the graph isomorphism problem, we are given two vertex graphs, and , and asked if there is a mapping, , of vertices in to vertices in such that is an edge in if and only if is an edge in . While the graph isomorphism problem is “famous” for having an uncertain status with respect to being NP-complete (although NP-completeness is considered unlikely [1]), extensions to subgraph isomorphism and graph edit distance are known to be NP-complete (e.g., see [20]).
1.2 Our Results
In this paper, we introduce a general graph watermarking framework that is based on the use of key generation, marking, and identification functions, as well as a hypothetical watermarking security experiment (which would be performed by an adversary). We define these functions in terms of graphs taken over random families of graphs, which allows us to quantify situations in which graph watermarking is provably feasible.
We also provide some graph watermarking schemes as examples of our framework, defined in terms of the classic Erdős-Rényi random-graph model and a random power-law graph model. Our schemes extend and build upon previous results on graph isomorphism for these graph families, which may be of independent interest. In particular, we design simple marking schemes for these random graph families based on simple edge-flipping strategies involving high- and medium-degree vertices. Analyzing the correctness of our schemes is quite nontrivial, however, and our analysis and proofs involve intricate probabilistic arguments. We provide an analysis of our scheme against adversaries that can themselves flip edges in order to defeat our mark identification algorithms. In addition, we provide experimental validation of our algorithms, showing that our edge-flipping scheme can succeed for a graph without specific knowledge of the parameters of its deriving graph family. We also conducted experiments to fit real-world networks to the random power-law graph model. The results showed that the model was generally a good fit for the networks tested, but the learned values did not fall into the range needed for our scheme.
2 Our Watermarking Framework
We begin by presenting a general framework for graph watermarking, which differs from the general model of Hopper et al. [24], but is similar in spirit.
Suppose we are given an undirected graph, , that we wish to mark. To define the security of a watermarking scheme for , must come from a family of graphs with some degree of entropy [56]. We formalize this by assuming a probability distribution over the family of graphs from which is taken.
Definition 1.
A graph watermarking scheme is a tuple over a set, , of graphs where

is a private key generation function, such that is a list of (pseudo)random graph elements, such as vertices and/or vertex pairs, defined over a graph of vertices. These candidate locations for marking are defined independent of a specific graph; that is, vertices in are identified simply by the numbering from to . For example, could be a small random graph, , and some random edges to connect to a larger input graph [56], or could be a set of vertex pairs in an input graph that form candidate locations for marking.

takes a private key generated by , and a specific graph from , and returns a pair, , such that is a unique identifier for and is the graph obtained by adding the mark determined by to in the location determined by the private key . is called every time a different marked copy needs to be produced, with the th copy being denoted by . Therefore, the unique identifiers should be thought of as being generated randomly. To associate a marked graph with the user who receives it, the watermarking scheme can be augmented with a table storing user names and unique identifiers. Alternatively, the identifiers can be generated pseudorandomly as a hash of a private key provided by the user.

takes a private key from , the original graph, , identifiers of previously marked copies of , and a test graph, , and it returns the identifier, , of the watermarked graph that it is identifying as a match for . It may also return , as an indication of failure, if it does not identify any of the graphs as a match for .
In addition, in order for a watermarking scheme to be effective, we require that with high probability (or “whp,” that is, with probability at least , for some ) over the graphs from and output pairs, of , for any , we have .
Algorithm 1 shows a hypothetical security experiment for a watermarking scheme with respect to an adversary, , who is trying to defeat the scheme. Intuitively, in the hypothetical experiment, we generate a key , choose a graph , from family according to distribution (as discussed above), and then generate marked graphs according to our scheme (for some set of messages). Next, we randomly choose one of the marked graphs, , and communicate it to an adversary. The adversary then outputs a graph that is similar to where his goal is to cause our identification algorithm to fail on .
In order to characterize differences between graphs, we assume a similarity measure , defining the distance between graphs in family . We also include a similarity threshold , that defines the advantage of an adversary performing the experiment in Algorithm 1. Specifically, the advantage of an adversary, who is trying to defeat our watermarking scheme is
The watermarking scheme is secure against adversary if the similarity threshold is and ’s advantage is polynomially negligible (i.e., is for some ).
Examples of adversaries could include the following:

Arbitrary edge-flipping adversary: a malicious adversary who can arbitrarily flip edges in the graph. That is, the adversary adds an edge if it is not already there, and removes it otherwise.

Random edge-flipping adversary: an adversary who independently flips each edge with a given probability.

Arbitrary adversary: a malicious adversary who can arbitrarily add and/or remove vertices and flip edges in the graph.

Random adversary: an adversary who independently adds and/or removes vertices with a given probability and independently flips each edge with a given probability.
One could also imagine other types of adversaries, as well, such as a random adversary who is limited in terms of the numbers or types of edges or vertices that he can change.
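As a minimal illustration of the random edge-flipping adversary listed above, the following Python sketch flips each vertex pair independently. The function name `random_edge_flip`, the edge-set representation (pairs `(u, v)` with `u < v`), and the parameter name `q` for the flip probability are assumptions made for this example and are not notation from the paper.

```python
import random
from itertools import combinations


def random_edge_flip(n, edges, q, seed=None):
    """Flip every vertex pair of an n-vertex graph independently with probability q.

    `edges` is an iterable of vertex pairs; the result is a set of pairs (u, v)
    with u < v. Flipping adds the edge if it is absent and removes it otherwise.
    All names here are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    result = {tuple(sorted(e)) for e in edges}
    for pair in combinations(range(n), 2):
        if rng.random() < q:
            result ^= {pair}  # symmetric difference toggles the pair
    return result
```

With `q = 0` the graph is returned unchanged, and with `q = 1` the adversary outputs the complement graph, which gives two easy sanity checks.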
2.1 Random graph models
As defined above, a graph watermarking scheme requires that graphs to be marked come from some distribution. In this paper, we consider two families of random graphs, the classic Erdős-Rényi model and a random power-law graph model, which should capture large classes of applications where graph watermarking would be of interest.
Definition 2 (The Erdős-Rényi model).
A random graph is a graph with vertices, where each of the possible edges appears in the graph independently with probability .
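The definition above can be made concrete with a short sampler. This is only a sketch: the function name `sample_gnp` and the representation of a graph as a set of pairs `(u, v)` with `u < v` are assumptions for illustration.

```python
import random
from itertools import combinations


def sample_gnp(n, p, seed=None):
    """Sample an Erdős-Rényi graph on vertices 0..n-1.

    Each of the n*(n-1)/2 possible edges is included independently
    with probability p. Returns a set of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    return {pair for pair in combinations(range(n), 2) if rng.random() < p}
```

For example, `sample_gnp(5, 1.0)` yields the complete graph on five vertices and `sample_gnp(5, 0.0)` yields the empty graph.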
Definition 3 (The random powerlaw graph model, §5.3 of [9]).
Given a sequence , such that , the general random graph is defined by labeling the vertices through and choosing each edge independently from the others with probability , where .
We define a random powerlaw graph parameterized by the maximum degree and average degree . Let for values of in the range between and , where
(1) 
This definition implies that each edge appears with probability
(2) 
As we show in the following proposition, this model does indeed have a powerlaw degree distribution.
Proposition 4.
In the random powerlaw graph , the expected number of vertices with degree is between and where .
Proof.
The function relating the index of a vertex to its expected degree is convex and decreasing. By the mean value theorem, the number of indices such that satisfies
Now the derivative of is . Noting that is the expected number of vertices of degree , the result is proven. ∎
2.2 Graph watermarking algorithms
We discuss some instantiations of the graph watermarking framework defined above. Unlike previous watermarking or deanonymization schemes that add vertices [3, 56], we describe an effective and efficient scheme based solely on edge flipping. Such an approach would be especially useful for applications where it could be infeasible to add vertices as part of a watermark.
Our scheme does not require adding labels to the vertices or additional objects stored in the graph for identification purposes. Instead, we simply rely on the structural properties of graphs for the purposes of marking. In particular, we focus on the use of vertex degrees, that is, the number of edges incident on each vertex. We identify high- and medium-degree vertices as candidates for finding edges that can be flipped in the course of marking. The specific degree thresholds for what we mean by “high-degree” and “medium-degree” depend on the graph family, however, so we postpone defining these notions precisely until our analysis sections.
Algorithms providing an example implementation of our graph watermarking scheme are shown in Algorithm 2. The algorithm randomly selects a set of candidate vertex pairs for flipping, from among the high- and medium-degree vertices, with no vertex being incident to more than a parameter of candidate pairs. We introduce a procedure, , which labels high-degree vertices by their degree ranks and each medium-degree vertex, , by a bit vector identifying its high-degree adjacencies. This bit vector has a bit for each high-degree vertex, which is for neighbors of and for non-neighbors. The algorithm , takes a random set of candidate edges and a graph, , and it flips the corresponding edges in according to a resampling of the edges using the distribution . The algorithm, approximate-isomorphism, returns a mapping of the high- and medium-degree vertices in to matching high- and medium-degree vertices in , if possible. The algorithm, , uses the approximate isomorphism algorithm to match up high- and medium-degree vertices in and , and then it extracts the bit vector from this matching using .
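The labeling procedure described above can be sketched as follows. This is not the paper's Algorithm 2; it is a hedged illustration that assumes the high-degree vertices are simply the `k` vertices of largest degree (ties broken by vertex id) and that every remaining vertex is treated as medium-degree. The names `label_vertices`, `adj`, and `k` are hypothetical.

```python
def label_vertices(adj, k):
    """Label vertices by degree rank or by high-degree adjacency bit vector.

    adj: dict mapping each vertex to the set of its neighbors.
    k:   number of vertices treated as high-degree (an assumption here).

    Returns (rank, bits) where rank[v] is the degree rank of a high-degree
    vertex v, and bits[v] is a tuple with one entry per high-degree vertex:
    1 if that vertex is a neighbor of v, 0 otherwise.
    """
    order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    high = order[:k]
    rank = {v: i for i, v in enumerate(high)}
    bits = {v: tuple(1 if h in adj[v] else 0 for h in high) for v in order[k:]}
    return rank, bits


# Small usage example on a 5-vertex graph.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}, 4: {0}}
rank, bits = label_vertices(adj, 2)
```

In this toy graph, vertices 2 and 3 receive the same bit vector, so they could not be told apart by their high-degree neighborhoods. That is the situation the separation analysis is meant to rule out with high probability.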
As mentioned above, we also need a notion of distance for graphs. We use two different such notions. The first is the graph edit distance, which is the minimum number of edge flips needed to go from one graph to another. The second is vertex distance, which intuitively is an edge-flipping metric localized to vertices.
Definition 5 (Graph distances).
Let be the set of graphs on vertices. If , define as the set of bijections between the vertex sets and . Define the graph edit distance as
where is the symmetric difference of the two edge sets under correspondence . Define the vertex distance as
where is the set of edges incident to .
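For very small graphs, both distances can be computed by brute force over all bijections. The sketch below assumes that the vertex distance is the minimum over bijections of the largest number of differing edges incident to any single vertex; that reading is an assumption based on the surrounding text, since the displayed formulas are not reproduced here. The function names are hypothetical, and the running time is factorial in the number of vertices.

```python
from itertools import permutations


def _mapped(edges, perm):
    """Relabel each edge (u, v) as {perm[u], perm[v]}."""
    return {frozenset((perm[u], perm[v])) for u, v in edges}


def edit_distance(n, e1, e2):
    """Minimum, over all vertex bijections, of the number of differing edges."""
    e2set = {frozenset(e) for e in e2}
    return min(len(_mapped(e1, perm) ^ e2set) for perm in permutations(range(n)))


def vertex_distance(n, e1, e2):
    """Minimum, over all vertex bijections, of the maximum number of
    differing edges incident to a single vertex (assumed definition)."""
    e2set = {frozenset(e) for e in e2}
    best = None
    for perm in permutations(range(n)):
        diff = _mapped(e1, perm) ^ e2set
        worst = max((sum(1 for e in diff if v in e) for v in range(n)), default=0)
        best = worst if best is None else min(best, worst)
    return best
```

A triangle and a three-vertex path are at distance 1 under both measures, while the empty graph and a star on four vertices are at distance 3 under both, since all three differing edges meet at the center.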
3 Identifying High- and Medium-Degree Vertices
We begin analyzing our proposed graph watermarking scheme by showing how high- and medium-degree vertices can be identified under our two random graph distributions. We begin with some technical results related to graph isomorphism that form the basis of our watermarking approach, with the goal of determining the conditions under which a vertex of a random graph can be identified with high probability, either by its degree (if the degree is high) or by its set of high-degree neighbors (if it has medium degree). We ignore low-degree vertices: their information content and distinguishability are low, and they are not used by our example scheme. Because our results on vertex identifiability are used in our graph watermarking scheme, we also determine how robust these identifications are, based on how well-separated the vertices are by their degrees.
We first find a threshold number such that the vertices with highest degree are likely to have distinct and well-separated degree values. We call these vertices the high-degree vertices. Next, we look among the remaining vertices for those that are well-separated in terms of their high-degree neighbors. Specifically, the (high-degree) neighborhood distance between two vertices is the number of high-degree vertices which are connected to exactly one of the two vertices. Note that we will omit the term “high-degree” in “high-degree neighborhood distance” from now on, as it will always be implied.
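The neighborhood distance just defined is a Hamming distance between high-degree adjacency patterns. A minimal sketch, with hypothetical names and an adjacency-set representation chosen for illustration:

```python
def neighborhood_distance(adj, high, u, v):
    """Number of high-degree vertices adjacent to exactly one of u and v.

    adj:  dict mapping each vertex to the set of its neighbors.
    high: iterable of the high-degree vertices (u and v are assumed not
          to be among them).
    """
    return sum(1 for h in high if (h in adj[u]) != (h in adj[v]))


# Usage on a 5-vertex graph whose high-degree vertices are taken to be 0 and 1.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}, 4: {0}}
high = [0, 1]
```

Here vertices 2 and 4 are at neighborhood distance 1, because only vertex 1 is adjacent to exactly one of them.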
In the Erdős-Rényi model, we show that all vertices that are not high-degree nevertheless have well-separated high-degree neighborhoods whp. In the random power-law graph model, however, there will be many lower-degree vertices whose high-degree neighborhoods cannot be separated. Those that have well-separated high-degree neighborhoods with high probability form the medium-degree vertices, and the rest are the low-degree vertices.
For completeness, we include the following well-known Chernoff concentration bound, which we will refer to repeatedly.
Lemma 6 (Chernoff inequality [9]).
Let be independent random variables with
We consider the sum , with expectation . Then
3.1 Vertex separation in the Erdős-Rényi model
Let us next consider vertex separation results for the classic Erdős-Rényi random-graph model. Recall that in this model, each edge is chosen independently with probability .
Definition 7.
Index vertices in nonincreasing order by degree. Let represent the th highest degree in the graph. Given , we say that a vertex is highdegree with respect to if it has degree at least . Otherwise, we say that the vertex is mediumdegree. We just say highdegree when the value of is understood from context.
Note that in this randomgraph model, there are no lowdegree vertices.
Definition 8.
A graph is separated if all highdegree vertices differ in their degree by at least and all mediumdegree vertices are neighborhood distance apart.
Note: this definition depends on how highdegree or mediumdegree vertices are defined and will therefore be different for the random powerlaw graph model.
Lemma 9 (Extension of Theorem 3.15 in [5]).
Suppose , , and . Then with probability
is such that
where .
Proof.
We quantify and extend the probability analysis of a proof from [5]. Let
The event of the result fails if or if there is such that .
The statement of theorem 3.12 of [5] still holds when the words “a.e. satisfies” are replaced by “ satisfies with probability greater than ”. This can be seen directly from the part of the proof where Chebychev’s inequality is applied.
By this result, the probability that is . The probability that for a given is . ∎
Lemma 10 (Vertex separation in the ErdősRényi model).
Let , , , . Suppose is such that . Then is separated with probability .
Proof.
We prove the theorem with probability at least . Let and . By Lemma 9, the probability that for some is at most .
Let be the expected neighborhood distance between two vertices . We have
so that, if ,
Since the high-degree vertices are separated by more than two degrees, the fact that they are high-degree vertices is independent of whether they are neighbors of and . Consequently, we can apply a Chernoff bound (Lemma 6). Then, by the union bound, the probability that for some medium-degree is less than . ∎
Thus, high-degree vertices are well-separated with high probability in the Erdős-Rényi model, and the medium-degree vertices are distinguished with high probability by their high-degree neighborhoods.
3.2 Vertex separation in the random power-law graph model
We next study vertex separation for a random power-law graph model, which can match the degree distributions of many graphs that naturally occur in social networking and science. For more information about power-law graphs and their applications, see e.g. [6, 40, 44].
In the random powerlaw graph model, vertex indices are used to define edge weights and therefore do not necessarily start at 1. The lowest index that corresponds to an actual vertex is denoted . So vertex indices range from to . Additionally, there are two other special indices and , which we define in this section, that separate the three classes of vertices.
Definition 11.
The vertices ranging from to are the highdegree vertices, those that range from to are the mediumdegree vertices, and those beyond are the lowdegree vertices.
In this model, the value of is constrained by the requirement that . When , this constraint is not actually restrictive. However, when , must be asymptotically greater than . The constraints on also constrain the value of the maximal and average degree of the graph.
We define and to be independent of , but dependent on parameters that control the amount and probability of separation at each level. The constraints that and translate into corresponding restrictions on the valid values of , namely that and . We define in the following lemma.
Lemma 12 (Separation of highdegree vertices).
In the model, let . Then,
(3) 
Moreover, for all satisfying and , the probability that
is at least .
Proof.
The first statement follows from the fact that is a convex function of and from taking its derivative at and .
For the second statement, let and let . We will show that if , then
(4) 
Now we choose such that . The inequality implies that and (4) holds for all . By the union bound applied to Eq. 4
Since , the right hand side is bounded above by . This proves the result.
Now, we prove Eq. 4. Clearly, since , we have that . So if and , then . This implies that
where the second inequality follows from Eq. 3 and the definition of given in Definition 3. If , the right hand side is lowerbounded by . The result follows by applying a Chernoff bound (Lemma 6). ∎
For simplicity, we often use the following observation.
Observation 13.
Rewriting to show its dependence on , we have
(5) 
For the graph model to make sense, the highdegree threshold must be asymptotically greater than the lowest index. In other words, we must have that . Since , this implies that .
We next define , the degree threshold for mediumdegree vertices, in the following lemma.
Lemma 14 (Separation of mediumdegree vertices).
Let be defined as in Definition 3, be defined as in Eq. 5, and
(6) 
Let denote the neighborhood distance between two vertices and in . If , for every and , the probability that
where
(7) 
is at least for sufficiently large .
Proof.
Let and let
We claim that if , then
(8) 
If we choose , we have that , so that Eq. 8 applies to all such that . Moreover, since
our choice of implies that . By applying the union bound to Eq. 8, we have
which establishes the lemma.
Let us now prove the claim. Observe that is the sum over the high-degree vertices , of indicator variables for the event that vertex is connected to exactly one of the vertices and . For fixed and , these are independent random variables. Therefore, we can apply a Chernoff bound. The probability that is
Since , for sufficiently large , this expression is bounded below by , and
by Eq. 2, Eq. 5 and Eq. 7, as can be shown by a straightforward but lengthy computation. Let . This implies that
Therefore, applying the Chernoff bound (Lemma 6) to the for fixed and and all highdegree vertices proves the claim. ∎
Observation 15.
We would have the undesirable situation that whenever , or equivalently when . In fact, in order for , we must have .
We illustrate the breakpoints for high, medium, and lowdegree vertices in Fig. 1.
The next lemma summarizes the above discussion and provides the forms of and that we use in our analysis.
Lemma 16 (Vertex separation in the powerlaw model).
Let . Fix . Let and where and . Let
For sufficiently large , the probability that a graph is not separated is at most .
Proof.
Let be defined as in Lemma 12. A straightforward computation using Eq. 1, Eq. 3, and Eq. 5 shows that
So for sufficiently large , we have . For all , the average degrees of consecutive vertices are at least apart. So for two highdegree vertices to be within of each other, at least one of the two must have degree at least away from its expected degree. By Lemma 12, the probability that some highdegree vertex satisfies is at most .
By Lemma 14, the probability that there are two mediumdegree vertices with neighborhood distance less than is at most . ∎
Thus, our marking scheme for the random powerlaw graph model is effective.
4 Adversary Tolerance
In this section, we study the degree to which our exemplary graph watermarking scheme can tolerate an arbitrary edge-flipping adversary. To measure success, we use the notions of security and adversary advantage, which are formally defined in Section 2. We quantify the number of edge flips that can be tolerated under the Erdős-Rényi model and the random power-law graph model.
Theorem 17 (Security against an arbitrary edge-flipping adversary in the Erdős-Rényi model).
Let , , and such that . Let be sufficiently large so that
(9) 
Suppose the similarity measure is the vertex distance , the similarity threshold is , we have a number of watermarked copies, and their identifiers are generated using bits. Suppose also that the identifiers map to sets of edges of a graph constrained by the fact that no more than edges can be incident to any vertex. The watermarking scheme defined in Algorithm 2 is secure against any deterministic adversary.
The proof of this theorem relies on two lemmas. Lemma 18 identifies conditions under which a set of bit vectors with bits independently set to 1 is unlikely to have two close bit vectors. Lemma 19 states that a deterministic adversary’s ability to guess the location of the watermark is limited. Informally, this is because the watermarked graph was obtained through a random process, so that there are many likely original graphs that could have produced it.
Lemma 18 (Separation of IDs).
Consider random bit strings of length , where each bit is independently set to 1, and the ith bit is 1 with probability satisfying for a fixed value . The probability that at least two of these strings are within Hamming distance of each other is at most if .
Proof.
The expected distance between two such strings is at least Applying Lemma 6 with , we have that the probability that their Hamming distance is less than is at most . Therefore, the probability that at least two out of strings are within Hamming distance of each other is at most . ∎
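The quantity bounded in Lemma 18 can be checked empirically. The sketch below draws random identifiers with independent bits and reports the smallest pairwise Hamming distance. The function names `random_ids` and `min_pairwise_hamming`, and the parameter names `m` and `probs`, are assumptions for illustration rather than notation from the paper.

```python
import random
from itertools import combinations


def random_ids(m, probs, seed=None):
    """Draw m bit strings; bit i of each string is 1 with probability probs[i]."""
    rng = random.Random(seed)
    return [tuple(1 if rng.random() < p else 0 for p in probs) for _ in range(m)]


def min_pairwise_hamming(ids):
    """Smallest Hamming distance between any two distinct identifiers in the list."""
    return min(sum(a != b for a, b in zip(x, y)) for x, y in combinations(ids, 2))
```

Comparing `min_pairwise_hamming(random_ids(m, probs))` against a chosen distance threshold over many trials gives a rough empirical view of how often two identifiers collide or come close, under the stated independence assumption.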
Lemma 19 (Guessing power of adversary).
Consider a complete graph on vertices, and let of its edges be red. Let be a sample of edges chosen uniformly at random among those that satisfy the constraint that no more than edges of the sample can be incident to any one vertex. Suppose also that and are nondecreasing functions of such that
(10) 
For sufficiently large , the probability that contains at least red edges is bounded by . Moreover, if , then the probability that contains at least red edge is bounded by , for some and for sufficiently large .
Proof.
In the process of selecting edges without replacement, let be the event that the sample contains at least red edges, and let be the event that the sample satisfies the degree constraint. The event whose probability we want to bound is equal to
Let us first show that can be lower bounded by a constant. To prove this, we select vertices with replacement uniformly at random, and pair consecutive vertices to obtain edges. Choosing vertices uniformly in this way will simplify showing that the degree constraint is satisfied. Of course we want to avoid “self-loops”, or edges where both end vertices are the same. Let denote the event that there is a vertex that is incident to more than edges of the sample. Also, let denote the event that the sample contains no self-loops and no duplicate edges. Then
Now, the probability of encountering a self-loop is and the probability of an edge being a duplicate of another is at most . Therefore,
By Eq. 10, . So is bounded away from 0. Moreover, since the edges now consist of pairs of independently chosen vertices, we can approximate the number of edges incident to each vertex by independent Poisson random variables with parameter thusly:
where the middle factor is a bound on the probability that one Poisson variable is at least (Theorem 5.4 of [41]), and the last factor is an adjustment factor for this approximation (Corollary 5.9 of [41]). This expression is bounded by a constant factor times the expression on the lefthand side of Eq. 10. Consequently, converges to 0, and for sufficiently large , , as was to be shown.
Now we find an upper bound for . To do this, we select edges with replacement uniformly at random. Because is relatively small when compared to , it is unlikely that the sample will contain any duplicates. Formally, let be the event that the sample contains at least red edges, and be the event that the sample consists of distinct edges. We have
The probability that two selected edges are the same edge is . So
So for large enough , is bounded below by .
Finally, we bound . The expected number of red edges in this sample is which is bounded below by and bounded above by . So using these bounds and a Chernoff bound (Lemma 6), where we set equal to , we have that
If as , set equal to :
for some constant . Putting it all together, we have that for large enough , and is bounded above by times one of the two bounds for . This proves the result. ∎
Proof of Theorem 17.
An upper bound on the advantage of any deterministic adversary on graphs on vertices is given by the conditional probability
where the parameters passed to are defined according to the experiment in Algorithm 1. We show that this quantity is polynomially negligible.
For to be successfully identified, it is sufficient for the following three conditions to hold:

the original graph is separated;

the Hamming distance between any two and involved in a pair in is at least ;

changes no edges of the watermark.
These are sufficient conditions because we only test graphs whose vertices had at most incident edges modified by the adversary, and another incident edges modified by the watermarking. So for original graphs that are separated, the labeling of the vertices can be successfully recovered. Finally, if the adversary does not modify any potential edge that is part of the watermark, the of the graph is intact and can be recovered from the labeling.
Now, by Lemma 10, the probability that is not separated is less than . Moreover, since , by Lemma 18, the probability that there are two identifiers in that are within of each other is at most .
Finally, for graphs in which an adversary makes fewer than modifications per vertex, the total number of edges the adversary can modify is . Since all vertices are high and mediumdegree vertices in this model, . Therefore, . Equation 9 guarantees that the hypothesis given by Eq. 10 of Lemma 19 is satisfied. Consequently, the probability that changes one or more adversary edges is for some constant .
This proves that each of the three conditions listed above fails with polynomially negligible probability, which implies that the conditional probability is also polynomially negligible. ∎
Theorem 20 (Security against an arbitrary edge-flipping adversary in the random power-law graph model).
Let , , and where , and .
Let . Suppose the similarity measure is a vector of distances , that the corresponding similarity threshold is the vector where is the maximum number of edges the adversary can flip in total, and the maximum number of edges it can flip per vertex. Suppose that we have watermarked copies of the graph, that we use to watermark a graph.
Suppose also that the identifiers map to sets of edges of a graph constrained by the fact that no more than edges can be incident to any vertex. Then the watermarking scheme defined in Algorithm 2 is secure against any deterministic adversary.
Proof.
The proof is similar to the proof of Theorem 17. An upper bound on the advantage of any deterministic adversary on graphs on vertices is given by the conditional probability
where the parameters passed to are defined according to the experiment in Algorithm 1. We show that this quantity is polynomially negligible.
For to be successfully identified, it is sufficient for the following three conditions to hold:

the original graph is