# A Statistical Model for Motifs Detection

Hamid Javadi (Department of Electrical Engineering, Stanford University) and Andrea Montanari (Department of Electrical Engineering and Statistics, Stanford University)

###### Abstract

We consider a statistical model for the problem of finding subgraphs with specified topology in an otherwise random graph. This task plays an important role in the analysis of social and biological networks. In these types of networks, small subgraphs with a specific structure have important functional roles, and they are referred to as ‘motifs.’

Within this model, one or multiple copies of a subgraph are added (‘planted’) in an Erdős-Rényi random graph with $n$ vertices and edge probability $q_0$. We ask whether the resulting graph can be distinguished reliably from a pure Erdős-Rényi random graph, and we present two types of results. First, we investigate the question from a purely statistical perspective, and ask whether there is any test that can distinguish between the two graph models. We provide necessary and sufficient conditions that are essentially tight for small enough subgraphs.

Next we study two polynomial-time algorithms for solving the same problem: a spectral algorithm, and a semidefinite programming (SDP) relaxation. For the spectral algorithm, we establish sufficient conditions under which it distinguishes the two graph models with high probability. Under the same conditions the spectral algorithm indeed identifies the hidden subgraph.

The spectral algorithm is substantially sub-optimal with respect to the optimal test. We show that a similar gap is present for the more sophisticated SDP approach.

## 1 Introduction

‘Motifs’ play a key role in the analysis of social and biological networks. Quoting from an influential paper in this area [MSOI02], the term ‘motif’ broadly refers to

“patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks.”

For instance, the authors of [MSOI02] considered directed graph representations of various types of data: gene regulation networks, neural circuits, food webs, the world wide web, electronic circuits. They identified a number of small subgraphs that are found in atypically large numbers in such networks, and provided interpretations of their functional role. The analysis of motifs in large biological networks was pursued in a number of publications, see e.g. [KIMA04, YLSK04, KA05, SSR05, Alo07].

The analysis of subgraph frequencies has an even longer history within sociology, in part because sociological theories are predictive of specific subgraph occurrences. We refer to [Gra73] for early insights, and to [WF94, EK10] for recent reviews of this research area.

This paper studies a statistical model for the motif detection problem. In order to provide a formal statement of the problem, denote by $\mathcal{G}_n$ the space of graphs over vertex set $[n]$.

###### Definition 1.1.

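We say that the two (sequences of) probability laws $\mathbb{P}_{0,n}$, $\mathbb{P}_{1,n}$ over $\mathcal{G}_n$ are strongly distinguishable if there exists a sequence of functions $T:\mathcal{G}_n\to\{0,1\}$ (a ‘test’) such that

$$
\limsup_{n\to\infty}\mathbb{P}_{0,n}\big(T(G_n)=1\big)=\limsup_{n\to\infty}\mathbb{P}_{1,n}\big(T(G_n)=0\big)=0.
$$

We say that they are weakly distinguishable if there exists $T$ such that

$$
\limsup_{n\to\infty}\Big[\mathbb{P}_{0,n}\big(T(G_n)=1\big)+\mathbb{P}_{1,n}\big(T(G_n)=0\big)\Big]<1.
$$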

We say that they are polynomial-time weakly (or strongly) distinguishable if a test exists that achieves the above and can be computed by a polynomial-time algorithm.

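Throughout the paper, $\mathbb{P}_{0,n}$ will correspond to a standard Erdős-Rényi random graph, while $\mathbb{P}_{1,n}$ will be an Erdős-Rényi random graph with planted copies of a small graph $H_n$. Namely, fix $q_0\in(0,1)$, and let $(H_n)_{n\ge 1}$ be a sequence of graphs, indexed by $n$. Let us emphasize that $H_n$ is a non-random graph on $v(H_n)$ vertices. Given a graph $F$, we denote by $\mathcal{L}(F,n)$ the set of labelings of the vertices of $F$ taking values in $[n]$, i.e.

$$
\mathcal{L}(F,n)\equiv\big\{\varphi:V(F)\to[n]\ \text{s.t.}\ \varphi(i)\neq\varphi(j)\ \ \forall i\neq j\big\}. \tag{1.1}
$$

(In particular $|\mathcal{L}(F,n)| = n!/(n-v(F))!$.) For an edge $e=(i,j)\in E(F)$, we let $\varphi(e)$ be the unordered pair $(\varphi(i),\varphi(j))$, and hence $\varphi(E(F)) = \{\varphi(e): e\in E(F)\}$.

We then let $\mathbb{P}_{0,n}\big((i,j)\in E\big) = q_0$ independently for all pairs $i<j$. On the other hand

$$
\mathbb{P}_{1,n}(\,\cdot\,) = \frac{1}{|\mathcal{L}(H_n,n)|}\sum_{\varphi\in\mathcal{L}(H_n,n)}\mathbb{P}_{1,n}(\,\cdot\,|\,\varphi), \tag{1.2}
$$

$$
\mathbb{P}_{1,n}\big((i,j)\in E\,\big|\,\varphi\big) = \begin{cases} 1 & \text{if } (i,j)\in\varphi(E(H_n)),\\ q_0 & \text{otherwise.} \end{cases} \tag{1.3}
$$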

Here it is understood that edges are independent conditional on the labeling $\varphi$. In Section 4 we will generalize this definition by considering the case in which a large number of identical copies of $H_n$ are planted.
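To make the two laws concrete, the following is a minimal sampling sketch, not code from the paper. It assumes vertices are labeled $0,\dots,n-1$ rather than $1,\dots,n$, represents a graph as a set of unordered pairs, and uses the hypothetical helper names `sample_null` and `sample_planted`.

```python
import random
from itertools import combinations


def sample_null(n, q0, rng):
    """Null model: every unordered pair is an edge independently w.p. q0."""
    return {frozenset(p) for p in combinations(range(n), 2) if rng.random() < q0}


def sample_planted(n, q0, motif_edges, motif_size, rng):
    """Planted model of Eqs. (1.2)-(1.3) for a single copy of the motif.

    motif_edges: list of pairs (i, j) with 0 <= i, j < motif_size.
    Returns the edge set and the labeling phi (phi[i] is the image of i).
    """
    # A uniformly random injective map V(H) -> {0, ..., n-1}.
    phi = rng.sample(range(n), motif_size)
    planted = {frozenset((phi[i], phi[j])) for i, j in motif_edges}
    edges = set(planted)  # planted pairs are edges with probability 1
    for p in combinations(range(n), 2):
        pair = frozenset(p)
        if pair not in planted and rng.random() < q0:
            edges.add(pair)  # all other pairs are edges w.p. q0
    return edges, phi
```

For example, `sample_planted(30, 0.1, [(0, 1), (1, 2), (0, 2)], 3, random.Random(0))` plants a triangle in a sparse background graph on 30 vertices.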

Whenever the two laws $\mathbb{P}_{0,n}$ and $\mathbb{P}_{1,n}$ are distinguishable, we will also say that the motif $H_n$ is detectable. We will consider the following fundamental questions:

• Under which conditions on $q_0$, $H_n$ are the two laws $\mathbb{P}_{0,n}$, $\mathbb{P}_{1,n}$ weakly distinguishable? Under which conditions are they strongly distinguishable?

• Assuming the conditions for distinguishability of $\mathbb{P}_{0,n}$, $\mathbb{P}_{1,n}$ are satisfied, under which conditions does there exist a polynomial-time computable test that distinguishes $\mathbb{P}_{0,n}$ from $\mathbb{P}_{1,n}$?

• Which features of the motif $H_n$ control its detectability?

The first two questions have attracted a substantial amount of work since the nineties for special cases such as planted cliques or dense subgraphs [Jer92, FK00, FPK01a]. A brief overview of this line of work is presented in Section 1.1. However, applied studies investigate a much broader collection of motifs than just cliques [MSOI02, Alo07, WF94, EK10] and –in fact– cliques are rarely the most interesting motif from a scientific perspective.

It is a priori unclear whether intuitions developed on the planted clique problem apply to general motif detection. Quantitative implications, namely tight necessary and sufficient conditions for detectability of an arbitrary motif are even less clear. The present paper builds upon some key insights that were developed in the earlier literature, to obtain a broader picture that is potentially applicable to motifs of scientific interest.

In the rest of the paper, we present the following results:

1. Section 2 presents upper and lower bounds on the statistical threshold for motif detection. The bounds match for sufficiently small subgraphs (in particular for ) yielding a sharp characterization in that regime. The key feature controlling statistical detection turns out to be the maximum graph density (see Section 2 for a definition).

Determining sharp detectability conditions for larger motifs, or for background density depending on , remains an open problem.

2. In order to explore the computational limits of motif detection, Section 3.1 analyzes a simple spectral algorithm that computes the leading eigenvector of the centered adjacency matrix of . The key feature controlling the behavior of this algorithm is the leading eigenvalue of the adjacency matrix of . We prove that the spectral approach succeeds with high probability if this eigenvalue is larger than .

We also show that, in the same regime, the spectral algorithm can be augmented with a combinatorial step that identifies the planted subgraph. We prove that this step is successful under a certain ‘balancedness’ condition on the motif $H_n$.

3. Section 3.2 applies a semidefinite programming (SDP) relaxation of the quadratic assignment problem to motif detection. We prove that the SDP approach is not successful unless –again– the leading eigenvalue of $\mathbf{A}_{H_n}$ is of order $\sqrt{n}$. In other words, a similarly large gap between statistical and algorithmic thresholds exists for SDP as for spectral methods.

As is often the case in proving negative results for SDP relaxation, our analysis relies on a careful construction of a primal witness. This construction is of potential interest for other applications of quadratic assignment.

4. Finally, Section 4 considers the more general case in which copies of the motif are planted in the same graph. We obtain upper and lower bounds on the statistical threshold. These bounds match when the motif size is small enough or the number of copies is small enough.

These results suggest that motif detection might be computationally hard in regimes in which it is statistically feasible. As discussed in the next section, this phenomenon has already been extensively investigated for the planted clique problem. More interestingly, our analysis suggests that the two thresholds are controlled by different features of the motif $H_n$. While the statistical threshold depends on the maximum density of $H_n$, the spectral threshold is related to its principal eigenvalue.

### 1.1 Related work

Statistical models for motif detection have been studied so far only for specific cases. An important line of work within theoretical computer science has focused on the planted clique problem, which corresponds to the case $H_n = K_k$, the complete graph over $k$ vertices. It is a classical result that the planted model is distinguishable from a pure Erdős-Rényi random graph, for $q_0 = 1/2$, provided $k \ge (2+\varepsilon)\log_2 n$ (with $\varepsilon > 0$ arbitrarily small), while it is indistinguishable for $k \le (2-\varepsilon)\log_2 n$, see e.g. [GM75]. On the other hand, approximating the size of the largest clique in a graph is hard in a worst case setting, even within a factor $n^{1-\varepsilon}$ [Hås99, Kho01]. Starting with Jerrum’s seminal work, a number of authors analyzed broad classes of algorithms for the statistical model. A short list includes Markov chain Monte Carlo [Jer92], spectral algorithms [AKS98], SDP relaxations within the Lovász-Schrijver hierarchy [FK00], statistical query models [FGR13], message passing algorithms [DM14], and, most recently, SDP relaxations within the Sum-Of-Squares hierarchy [MPW15, DM15, RS15, HKP15, BHK16].

While these works provide precious insights into the tradeoff between statistical and computational limits for graph estimation, they cannot be applied directly to general motif detection. In particular, they do not clarify what features of the motif are relevant for detection.

The planted clique problem is arguably the most studied statistical problem presenting a large gap between optimal statistical estimation and computationally efficient algorithms. In fact, several works assume hardness of planted clique to prove hardness of other statistical estimation problems, see e.g. [BR13]. Similar in spirit is the recent work in [DLSS14], that instead assumes hardness of refuting random satisfiability formulas, as well as [ADBL11].

A generalization of the maximum-clique problem is provided by the densest $k$-subgraph problem. Given a graph $G$ and an integer $k$, this requires finding a subset of $k$ vertices that contains the largest number of edges [FPK01a]. The best polynomial-time algorithm guarantees an $O(n^{1/4+\varepsilon})$ approximation ratio [BCC10]. The recent paper [Man17] proves that an $n^{1/(\log\log n)^{c}}$ approximation ratio cannot be achieved under the exponential time hypothesis (see also [Fei02, Kho06, AAM11] for other inapproximability results). Semidefinite programming relaxations were studied, among others, in [FPK01b, BCV12].

Once more, these results do not apply directly to the motif detection problem. In particular, it is not clear –in general– that searching for a dense subgraph is necessarily the best approach to detect a given motif . Further, the focus of this line of work is on worst-case approximation guarantees, rather than on statistical thresholds.

Closer to the scope of the present paper is some of the recent work on ‘community detection’. A random graph over vertices is generated by selecting a subset of vertices uniformly at random. Edges are conditionally independent given . Two vertices are connected by an edge with probability if and probability otherwise. Statistical and computational thresholds for detection and estimation of the set were studied by a number of authors [VAC15, ACV14, HWX15, Mon15, HWX17]. Again, these works focus uniquely on the density of the planted subgraph, and do not model its structure.

Following the original work on motifs in biological networks [MSOI02], several papers developed algorithms to sample uniformly random networks with certain given characteristics, e.g. with a given degree sequence [MKI03, TS07, BD11]. Uniform samples can be used to assess the significance level of specific subgraph counts in a real network under consideration. Let, for instance, denote the number of copies of a certain small graph in . If in a real network of interest we find , the probability is used as significance level for this discovery. Let us mention two key differences with respect to the problem considered in the present paper. First, we focus on conditions under which the laws $\mathbb{P}_{0,n}$ and $\mathbb{P}_{1,n}$ are strongly distinguishable. For instance, in the case of a single planted subgraph, under the null model with very high probability. Hence, Monte Carlo is not effective in accurately computing $p$-values in this regime.

Second, the work of [MKI03, TS07, BD11] implicitly assumes that the subgraph has bounded size so that can be computed in time by exhaustive search. Here we consider instead large subgraphs , and address the computational challenge of testing for in polynomial time. Let us emphasize that, while we typically assume to have diverging size, in practice it is impossible to perform exhaustive search already for quite small subgraphs. For instance, if , and , exhaustive search requires of the order of operations.

### 1.2 Notations

Given , we let denote the set of first integers. We write for the cardinality of a set . We denote by the incomplete factorial.

Throughout this paper, we will use lowercase boldface (e.g. , , etc.) to denote vectors and uppercase boldface (e.g. , , etc.) to denote matrices. For a vector and a set , we define as for and , otherwise. For a matrix we use to denote its spectral norm. Given a square matrix , we denote its trace by . Given a symmetric matrix , we denote by its ordered eigenvalues.

We denote by the all-ones vector, and by , the identity and all-ones matrices, respectively. For a matrix , is the vector whose ’th entry is where and are the quotient and the remainder in dividing by , respectively. Also, denotes the ’th standard unit vector.

A simple graph is a pair , where is a vertex set and is a set of unordered pairs , . We will write , whenever necessary to specify which graph we are referring to. Throughout, we will be focusing on finite graphs. We let , . For , degree of node is shown by . For , the number of nodes in which are connected to is denoted by . We let be the set of graphs over vertex set .

We follow the standard Big-Oh notation. Given functions , we write if there exists a constant such that , if there exists a constant such that , and if and . Further if for all and large enough, and if for all and large enough.

## 2 Statistical limits on hypothesis testing

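In this section we address the first question stated in Section 1: under which conditions on $q_0$, $H_n$ are the two laws $\mathbb{P}_{0,n}$, $\mathbb{P}_{1,n}$ distinguishable? We focus on the case of a single planted subgraph, $m_n = 1$, deferring the generalization to $m_n > 1$ to Section 4. We note that strong and weak distinguishability are equivalent to $\lim_{n\to\infty}\|\mathbb{P}_{0,n}-\mathbb{P}_{1,n}\|_{\rm TV} = 1$, and $\liminf_{n\to\infty}\|\mathbb{P}_{0,n}-\mathbb{P}_{1,n}\|_{\rm TV} > 0$, respectively, where $\|\cdot\|_{\rm TV}$ denotes the total variation distance.

Our results depend on the graph sequence $(H_n)_{n\ge 1}$ through its maximum density $d(H_n)$. For a graph $H$, we define

$$
d(H)\equiv\max_{F\subseteq H}\left(\frac{e(F)}{v(F)}\right). \tag{2.1}
$$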
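For small motifs, the maximum density of Eq. (2.1) can be computed by exhaustive search. The sketch below is illustrative only: it enumerates vertex subsets (the maximum over subgraphs is always attained by an induced subgraph, since adding edges on a fixed vertex set cannot lower the density), so its cost is exponential in $v(H)$. The function name `max_density` and the edge-list representation are assumptions of this example.

```python
from fractions import Fraction
from itertools import combinations


def max_density(num_vertices, edges):
    """Return d(H) = max over subgraphs F of e(F)/v(F), as an exact fraction.

    edges: iterable of pairs (i, j) with 0 <= i, j < num_vertices.
    """
    edges = list(edges)
    best = Fraction(0)
    for size in range(1, num_vertices + 1):
        for subset in combinations(range(num_vertices), size):
            chosen = set(subset)
            # Number of edges of the subgraph induced by the chosen vertices.
            e_f = sum(1 for i, j in edges if i in chosen and j in chosen)
            best = max(best, Fraction(e_f, size))
    return best
```

For a clique on four vertices this returns $3/2$. Attaching one extra degree-one vertex to the clique leaves the value at $3/2$, because the whole graph has density $7/5$ and the clique remains the densest subgraph.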

The following theorem provides a sufficient condition for the distinguishability of the laws $\mathbb{P}_{0,n}$, $\mathbb{P}_{1,n}$.

###### Theorem 1.

Let be a sequence of non-empty graphs such that and for let be the null model with edge density , and be planted model with parameters , . If

$$
\liminf_{n\to\infty}\frac{d(H_n)\log(1/q_0)}{\log n}>1, \tag{2.2}
$$

then the two laws $\mathbb{P}_{0,n}$, $\mathbb{P}_{1,n}$ are strongly distinguishable.

###### Remark 2.1.

The proof of this theorem also provides an explicit test which has asymptotically vanishing error probability under the assumptions of the theorem. Let , where is the subgraph of with the smallest number of vertices, such that . Then, the test developed in the proof requires searching over all subsets of vertices which, in most cases, is non-polynomial.
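As a worked instance of condition (2.2), consider a planted clique $H_n = K_{k_n}$ (assuming the remaining hypotheses of Theorem 1 hold for this sequence). Every subgraph of a clique is at most as dense as the clique itself, so

$$
d(K_k) = \frac{e(K_k)}{v(K_k)} = \frac{k(k-1)/2}{k} = \frac{k-1}{2},
\qquad\text{and (2.2) becomes}\qquad
\liminf_{n\to\infty}\frac{(k_n-1)\log(1/q_0)}{2\log n} > 1 .
$$

This holds, for instance, whenever $k_n \ge (2+\varepsilon)\log n/\log(1/q_0)$ for some fixed $\varepsilon>0$, which for $q_0 = 1/2$ reads $k_n \ge (2+\varepsilon)\log_2 n$, in line with the classical planted clique threshold recalled in Section 1.1.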

The next theorem provides a condition under which the two laws are indistinguishable.

###### Theorem 2.

Let , , , be as in Theorem 1. Then the two models are not weakly distinguishable if

$$
\limsup_{n\to\infty}\frac{d(H_n)\log(1/q_0)+(5/2)\log v(H_n)}{\log n}<1. \tag{2.3}
$$

Further, if , then the laws , are not weakly distinguishable if

$$
\limsup_{n\to\infty}\frac{v(H_n)}{n^{1/2}}=0. \tag{2.4}
$$

Note that, under the condition for any (i.e. when the hidden subgraph is ‘not too large’), or when as (i.e. when the hidden subgraph is ‘dense enough’), this bound matches the positive result of Theorem 1. We illustrate these results with a few examples in Appendix A.

## 3 Computationally efficient tests

In this section we study two computationally plausible algorithms for detecting the planted subgraph in the setting described in Section 1. The first method leverages the spectral properties of the given graph for solving the problem. In this case, we establish sufficient conditions under which the algorithm succeeds with high probability. We then show that a modification of the spectral algorithm can be used to identify the hidden subgraph. The second approach uses an SDP relaxation of the problem, that is a priori more powerful than the spectral approach.

### 3.1 Spectral algorithm

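For $p\in(0,1)$ we denote by $\mathbf{A}^{p}_{G}$ the shifted adjacency matrix of the graph $G$, defined as follows:

$$
(\mathbf{A}^{p}_{G})_{ij}=\begin{cases} 1 & \text{if } (i,j)\in E(G),\\ -p/(1-p) & \text{otherwise.} \end{cases} \tag{3.1}
$$

Further, we will denote by $\mathbf{A}_G$ the adjacency matrix of $G$. Recall that $\lambda_1(\mathbf{M})\ge\lambda_2(\mathbf{M})\ge\dots\ge\lambda_n(\mathbf{M})$ denote the eigenvalues of a symmetric matrix $\mathbf{M}$. The spectral test is simply based on the leading eigenvalue:

$$
T_{\rm spec}(G)=\begin{cases} 1 & \text{if } \lambda_1(\mathbf{A}^{q_0}_{G})\ge 2.1\,\sigma(q_0)\sqrt{n},\\ 0 & \text{otherwise,} \end{cases} \tag{3.2}
$$

$$
\sigma(q_0)\equiv\sqrt{\frac{q_0}{1-q_0}}. \tag{3.3}
$$

This algorithm was first proposed for the planted clique problem in [AKS98]. Note that this test uses the knowledge of $q_0$, but does not assume the knowledge of the planted subgraph $H_n$.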
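The spectral test of Eqs. (3.1)-(3.3) can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the authors' code: the graph is given as a symmetric 0/1 adjacency array with zero diagonal, the diagonal of the shifted matrix takes the ‘otherwise’ value of Eq. (3.1) (this only shifts all eigenvalues by the constant $-q_0/(1-q_0)$ relative to a zero diagonal), and the function names are hypothetical.

```python
import numpy as np


def shifted_adjacency(adj, p):
    """Eq. (3.1): entry 1 on edges and -p/(1-p) on every other entry."""
    adj = np.asarray(adj)
    return np.where(adj == 1, 1.0, -p / (1.0 - p))


def sigma(q0):
    """Eq. (3.3): standard deviation of an off-diagonal entry under the null."""
    return np.sqrt(q0 / (1.0 - q0))


def spectral_test(adj, q0):
    """Eq. (3.2): return 1 if the leading eigenvalue exceeds 2.1*sigma*sqrt(n)."""
    n = np.asarray(adj).shape[0]
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    lam1 = np.linalg.eigvalsh(shifted_adjacency(adj, q0))[-1]
    return int(lam1 >= 2.1 * sigma(q0) * np.sqrt(n))
```

The dense eigenvalue computation costs $O(n^3)$ operations; for large $n$ one would typically replace it with an iterative method for the top eigenvalue only.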

###### Theorem 3.

Let be a sequence of non-empty graphs such that and for let be the null model with edge density , and be planted model with parameters , . Define as per Eq. (3.3). If

$$
\liminf_{n\to\infty}\frac{\lambda_1(\mathbf{A}_{H_n})}{\sqrt{n}}>3\sigma(q_0), \tag{3.4}
$$

then the two laws $\mathbb{P}_{0,n}$, $\mathbb{P}_{1,n}$ are strongly distinguishable.

###### Remark 3.1.

The constant in Eq. (3.2) can be reduced to for any . In addition, we expect that with further work the constant in Eq. (3.4) can be reduced to for any . These improvements are not the focus of the present paper.

Can spectral methods be used to identify the hidden subgraph? We start by noting that, even if can be detected, a subset of its nodes might remain unidentified. As an example, let be a graph over vertices, whereby vertices are connected by a clique, and vertex is connected to the others by a single edge:

[Figure: a clique with one additional vertex attached to it by a single edge.]

Then Example A.1 implies that can be detected with high probability as soon as . As we will see below, the spectral algorithm detects with high probability if . However it is intuitively clear (and not hard to prove) that the degree-one vertex in cannot be identified reliably.

With this caveat in mind, Algorithm 1 gives a spectral approach to identify a subset of the vertices of the hidden subgraph. In order to characterize the set of ‘important’ vertices of , we introduce the following notion.

###### Definition 3.1.

Given a graph $H$, and a constant $c$, we define the $c$-significant set of $H$, as the following set of vertices

$$
S_c(H):=\big\{i\in V(H):\ \deg(i)>c\,v(H)\big\}. \tag{3.5}
$$

We also need to assume that the leading eigenvector of $\mathbf{A}_{H}$ is sufficiently spread out.

###### Definition 3.2.

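Let $H$ be a graph with adjacency matrix $\mathbf{A}_H$. For $\varepsilon\in(0,1)$, we say that $H$ has spectral expansion $\varepsilon$, if

$$
1-\varepsilon\ \ge\ \frac{\max\big(\lambda_2(\mathbf{A}_H);\,-\lambda_n(\mathbf{A}_H)\big)}{\lambda_1(\mathbf{A}_H)}. \tag{3.6}
$$

Finally let $\mathbf{v}$ be the leading eigenvector of $\mathbf{A}_H$. We say that $H$ is $(\varepsilon,\mu)$-balanced in spectrum if it has spectral expansion $\varepsilon$ and

$$
\min_{i\in V(H)}|v_i|\ \ge\ \frac{\mu}{\sqrt{v(H)}}. \tag{3.7}
$$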
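The quantities in Definitions 3.1 and 3.2 are straightforward to evaluate numerically for a given motif. The sketch below is an illustration under assumptions: the motif is a symmetric 0/1 NumPy adjacency array, condition (3.7) is read as $\min_i |v_i| \ge \mu/\sqrt{v(H)}$ for a unit-norm leading eigenvector, and the helper names are hypothetical. `spectral_expansion` returns the largest $\varepsilon$ compatible with (3.6), and `balance_mu` the largest $\mu$ compatible with (3.7).

```python
import numpy as np


def significant_set(adj, c):
    """c-significant set of Eq. (3.5): vertices with degree > c * v(H)."""
    adj = np.asarray(adj)
    n = adj.shape[0]
    degrees = adj.sum(axis=1)
    return {i for i in range(n) if degrees[i] > c * n}


def spectral_expansion(adj):
    """Largest eps with 1 - eps >= max(lambda_2, -lambda_n) / lambda_1."""
    lam = np.linalg.eigvalsh(np.asarray(adj, dtype=float))  # ascending order
    return 1.0 - max(lam[-2], -lam[0]) / lam[-1]


def balance_mu(adj):
    """Largest mu with min_i |v_i| >= mu / sqrt(v(H)), v the unit top eigenvector."""
    adj = np.asarray(adj, dtype=float)
    _, vecs = np.linalg.eigh(adj)
    v = vecs[:, -1]  # eigenvector of the largest eigenvalue, unit norm
    return float(np.min(np.abs(v)) * np.sqrt(adj.shape[0]))
```

For a clique on four vertices the eigenvalues are $3,-1,-1,-1$ and the leading eigenvector is uniform, giving spectral expansion $2/3$ and $\mu = 1$. A star graph is bipartite, so $-\lambda_n = \lambda_1$ and its spectral expansion is $0$.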

The following definition helps us present our result on Algorithm 1.

###### Definition 3.3.

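Let $H$ be a graph. For any $i\in V(H)$, the graph obtained by removing $i$ from $H$ is denoted by $H\setminus i$. Then:

1. We say that $H$ is $(\varepsilon,\mu)$-strictly balanced in spectrum if for all $i\in V(H)$, $H\setminus i$ is $(\varepsilon,\mu)$-balanced in spectrum.

2. We define $\lambda^{-}(H)$ as

$$
\lambda^{-}(H)\equiv\min_{i\in V(H)}\lambda_1(\mathbf{A}_{H\setminus i}). \tag{3.8}
$$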
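The quantity in Eq. (3.8) can be evaluated directly by deleting one vertex at a time. The following is a minimal NumPy sketch, with the function name `lambda_minus` chosen for this example and the motif assumed to be a symmetric 0/1 adjacency array on at least two vertices.

```python
import numpy as np


def lambda_minus(adj):
    """Eq. (3.8): minimum over vertices i of the top eigenvalue of H with i removed."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    values = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        sub = adj[np.ix_(keep, keep)]  # adjacency matrix of H \ i
        values.append(np.linalg.eigvalsh(sub)[-1])
    return float(min(values))
```

For a clique on four vertices every vertex deletion leaves a triangle, so the value is $2$. For a star, deleting the center leaves an empty graph and the value drops to $0$, which illustrates why this quantity is sensitive to vertices whose removal destroys the spectral structure of the motif.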

The next theorem states sufficient conditions under which Algorithm 1 succeeds in identifying the significant set of the planted subgraph.

###### Theorem 4.

Given , , let be the law of the random graph with edge density and planted subgraph , cf. Section 1, and assume . Assume and that, for each , is -strictly balanced in spectrum for some . Let be such that

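$$
\frac{2\delta}{\mu^{2}(1-\delta)}<1.
$$

Finally, assume that

$$
\liminf_{n\to\infty}\frac{|\lambda^{-}(H_n)|}{\sqrt{n}}>\frac{3\sigma(q_0)}{\varepsilon\delta}, \tag{3.9}
$$

where $\lambda^{-}(H_n)$ is defined as per Eq. (3.8).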

Let be the output of Algorithm 1, and set , . Then the following statements hold with high probability as :

1. contains all the vertices of that correspond to the -significant set of planted subgraph .

2. does not contain any vertex that does not correspond to those of the planted subgraph .

###### Remark 3.2.

Note that if where is as in Theorem 4 (the minimum degree of nodes in the hidden subgraph is ‘sufficiently large’), and under the assumptions of Theorem 4, Algorithm 1 will find all the nodes of the planted subgraph .

In the opposite case, contains some ‘low degree’ vertices, namely, for some , , we have strictly. Then, in order to find all vertices of the planted subgraph in , after finding the output of Algorithm 1, , we can select the nodes such that , for some . Note that if is then for any , with high probability. Hence, this procedure will not choose any node such that . Moreover, this procedure will find the planted subgraph if for all nodes , .

Note that for any graph , . Hence by definition

 λ−(H)≤v(H∖i)≤v(H)q0. (3.10)

Hence, the assumptions of Theorem 4 imply in particular

$$
\liminf_{n\to\infty}\frac{v(H_n)}{n^{1/2}}>0. \tag{3.11}
$$

We can compare this condition with the one of Theorem 1. If $H_n$ is a dense graph, we expect generically , and hence there is a large gap between the condition of Theorem 1 (that guarantees distinguishability) and that of Theorem 4. We illustrate this with a few examples in Appendix B.

### 3.2 SDP relaxation

Since the spectral method is generally sub-optimal with respect to the statistical detection threshold, it is natural to look for more powerful algorithms. In this section we use an SDP relaxation of the quadratic assignment problem, first proposed in [ZKRW98], for motif detection.

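Recall that we denote by $\mathbf{A}_G$ the adjacency matrix of graph $G$:

$$
(\mathbf{A}_G)_{ij}=\begin{cases} 1 & \text{if } (i,j)\in E(G),\\ 0 & \text{otherwise.} \end{cases} \tag{3.12}
$$

We want to find a planted copy of a given graph $H$ in graph $G$. Let $n = v(G)$, $k = v(H)$. We therefore consider the problem

$$
\begin{aligned}
\text{maximize}\quad & \mathrm{Tr}\big(\mathbf{A}_H\boldsymbol{\Pi}^{\sf T}\mathbf{A}_G\boldsymbol{\Pi}\big)\\
\text{subject to}\quad & \boldsymbol{\Pi}^{\sf T}\boldsymbol{\Pi}=\mathbf{I}_k,\\
& \boldsymbol{\Pi}\in\{0,1\}^{n\times k}.
\end{aligned}
\tag{3.13}
$$

This is a non-convex optimization problem known as the Quadratic Assignment Problem (QAP) and is well studied in the literature; see for example [Bur13]. We will denote the value of this problem by $\mathrm{OPT}(G;H)$.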

Note indeed that $\boldsymbol{\Pi}$ is feasible if it contains exactly one non-zero entry per column and at most one per row. Call $\varphi(i)$ the position of the non-zero entry of column $i$. Then $\varphi$ is a labeling of the vertices of $H$, and the objective function can be rewritten as

$$
\mathrm{Tr}\big(\mathbf{A}_H\boldsymbol{\Pi}^{\sf T}\mathbf{A}_G\boldsymbol{\Pi}\big)=2\sum_{(i,j)\in E(H)}(\mathbf{A}_G)_{\varphi(i),\varphi(j)}. \tag{3.14}
$$

Hence, if $G$ contains a planted copy of $H$ (e.g. under model $\mathbb{P}_{1,n}$), we have $\mathrm{OPT}(G;H) = 2e(H)$. This suggests the following optimization-based test:

$$
T_{\rm OPT}(G)=\begin{cases} 1 & \text{if } \mathrm{OPT}(G;H)\ge 2e(H),\\ 0 & \text{otherwise.} \end{cases} \tag{3.15}
$$

The proof of Theorem 1 suggests that this test is nearly optimal, provided $d(H) = e(H)/v(H)$, i.e. $H$ has no subgraph denser than itself. (If this is not the case, the optimization problem (3.13) can be modified by replacing $H$ with its densest subgraph.)
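The identity (3.14) and the test (3.15) can be checked on tiny instances by brute force. The sketch below is illustrative only and assumes symmetric 0/1 NumPy adjacency arrays with vertices labeled from $0$; the helper names are hypothetical. The exhaustive search visits all $n!/(n-k)!$ labelings, so it is usable only for very small $n$ and $k$, which is precisely the computational obstacle discussed in the text.

```python
import numpy as np
from itertools import permutations


def labeling_matrix(phi, n):
    """n x k matrix Pi with Pi[phi[i], i] = 1, so that Pi^T Pi = I_k."""
    k = len(phi)
    pi = np.zeros((n, k), dtype=int)
    for col, row in enumerate(phi):
        pi[row, col] = 1
    return pi


def qap_objective(a_h, a_g, phi):
    """Objective of (3.13): Tr(A_H Pi^T A_G Pi) for the labeling phi."""
    pi = labeling_matrix(phi, a_g.shape[0])
    return int(np.trace(a_h @ pi.T @ a_g @ pi))


def opt_brute_force(a_h, a_g):
    """OPT(G; H) by enumerating all injective labelings (tiny instances only)."""
    n, k = a_g.shape[0], a_h.shape[0]
    return max(qap_objective(a_h, a_g, phi) for phi in permutations(range(n), k))


def opt_test(a_h, a_g):
    """Test (3.15): 1 if OPT(G; H) >= 2 e(H), else 0."""
    num_edges_h = int(a_h.sum()) // 2
    return int(opt_brute_force(a_h, a_g) >= 2 * num_edges_h)
```

With a triangle as $H$, a labeling that maps it onto a triangle of $G$ gives objective value $6 = 2e(H)$, while on a path graph the best labeling covers only two edges and gives $4$.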

Unfortunately, in general, is NP-complete even to approximate within a constant factor [SG76]. We will then resort to an SDP relaxation of the same problem. The following Lemma provides a different formulation of (3.13).

###### Lemma 3.4.

Let be an optimal solution of problem (3.13). Then such that is an optimal solution of the following problem

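$$
\begin{aligned}
\text{maximize}\quad & \mathrm{Tr}\big((\mathbf{A}_G\otimes\mathbf{A}_H)\mathbf{Y}\big)\\
\text{subject to}\quad & \mathbf{Y}\in\{0,1\}^{nk\times nk},\\
& \mathbf{Y}\succeq 0,\\
& \mathrm{Tr}(\mathbf{Y}\mathbf{J}_{nk})=k^2,\\
& \mathrm{Tr}\big(\mathbf{Y}(\mathbf{I}_n\otimes(\mathbf{e}_i\mathbf{e}_i^{\sf T}))\big)=1\quad\text{for } i=1,2,\dots,k,\\
& \mathrm{Tr}\big(\mathbf{Y}((\mathbf{e}_j\mathbf{e}_j^{\sf T})\otimes\mathbf{I}_k)\big)\le 1\quad\text{for } j=1,2,\dots,n,\\
& \mathrm{rank}(\mathbf{Y})=1.
\end{aligned}
\tag{3.16}
$$

Now, we try the following SDP relaxation of problem (3.16), which was proposed in [ZKRW98]:

$$
\begin{aligned}
\text{maximize}\quad & \mathrm{Tr}\big((\mathbf{A}_G\otimes\mathbf{A}_H)\mathbf{Y}\big)\\
\text{subject to}\quad & \mathbf{Y}\succeq 0,\\
& 0\le\mathbf{Y}\le 1,\\
& \mathrm{Tr}(\mathbf{Y}\mathbf{J}_{nk})=k^2,\\
& \mathrm{Tr}\big(\mathbf{Y}(\mathbf{I}_n\otimes(\mathbf{e}_i\mathbf{e}_i^{\sf T}))\big)=1\quad\text{for } i=1,2,\dots,k,\\
& \mathrm{Tr}\big(\mathbf{Y}((\mathbf{e}_j\mathbf{e}_j^{\sf T})\otimes\mathbf{I}_k)\big)\le 1\quad\text{for } j=1,2,\dots,n.
\end{aligned}
\tag{3.17}
$$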

The following theorem states an upper bound on the performance of the hypothesis testing method that rejects the null hypothesis if .

###### Theorem 5.

Let , , be as in Theorem 1. Consider the hypothesis testing problem in which under null is generated according to and under alternative it is generated according to . Define as per Eq. (3.3). If

$$
\limsup_{n\to\infty}\frac{\lambda_1(\mathbf{A}_{H_n})}{\sqrt{n}}<\frac{1}{4}\sigma(q_0),
$$

then for the method that rejects the null hypothesis if ,

$$
\mathbb{P}_{0,n}\{T(G_n)=1\}\to 1
$$

as $n\to\infty$.

## 4 The case of multiple planted subgraphs

In this section we generalize the results given in Section 2 to the regime in which $m_n > 1$ atypical subgraphs are added to a random graph. Namely, fix $m_n$ and let $\varphi_1,\dots,\varphi_{m_n}$ be independent uniformly random labelings of $H_n$ in $[n]$. As before, we let $\mathbb{P}_{0,n}$ be the law of an Erdős-Rényi random graph with edge probability $q_0$. On the other hand, under $\mathbb{P}_{1,n}$ edges are conditionally independent given $(\varphi_l)_{l\le m_n}$, with

$$
\mathbb{P}_{1,n}\big((i,j)\in E\,\big|\,(\varphi_l)_{l\le m_n}\big)=\begin{cases} 1 & \text{if } (i,j)\in\varphi_l(E(H_n)) \text{ for some } l\in\{1,2,\dots,m_n\},\\ q_0 & \text{otherwise.} \end{cases} \tag{4.1}
$$

We would like to find the conditions on $q_0$, $H_n$, $m_n$ under which the two laws are strongly or weakly distinguishable, generalizing Theorem 1 and Theorem 2 to $m_n > 1$. The following theorem states a sufficient condition under which the two laws are indistinguishable.

###### Theorem 6.

Let be a sequence of non-empty graphs and for let be the null model with edge density and be the planted model as in (4.1) with parameters , , . Then the two models are not weakly distinguishable if

$$
\limsup_{n\to\infty}\frac{(5/2)\log v(H_n)+\log m_n}{\log n}<1, \tag{4.2}
$$

and

$$
\limsup_{n\to\infty}\frac{d(H_n)\log(1/q_0)+(5/2)\log v(H_n)}{\log n}<1. \tag{4.3}
$$

The following theorem states sufficient conditions under which the two models are strongly distinguishable.

###### Theorem 7.

Let be as in Theorem 6. Then the two laws are strongly distinguishable if

$$
\liminf_{n\to\infty}\frac{m_n\,e(H_n)}{n}=\infty, \tag{4.5}
$$

or if

$$
\liminf_{n\to\infty}\frac{d(H_n)\log(1/q_0)}{\log n}>1. \tag{4.6}
$$

###### Remark 4.1.

While the necessary and sufficient conditions in the above theorems do not match in general, they do match in specific regimes of interest. A first regime is the one of bounded: in this case we recover the same asymptotics as for , cf. Section 2.

A second regime is obtained when the planted graphs have bounded size: for all . Theorem 6 implies that the two models are not weakly distinguishable if for some and all large enough.

Vice versa, by Theorem 7, they are strongly distinguishable if . In other words, we have a characterization of the distinguishability threshold that is tight up to sub-polynomial factors.

###### Remark 4.2.

The expected number of edges under the null model is roughly , and its standard deviation is of order . The sufficient condition (4.5) is therefore equivalent to requiring that the total number of edges in the planted graphs is much larger than this standard deviation. The proof of Theorem 7 constructs a simple test , by letting if and otherwise.

Notice that condition (4.6) is instead the same as in Theorem 1. In this regime, it is sufficient to find a single copy of the highest density subgraph of , and the multiplicity does not seem to help.
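The edge-counting idea of Remark 4.2 can be sketched as follows. The exact threshold used in the proof of Theorem 7 is not reproduced here; as a loudly flagged assumption, this example rejects the null when the edge count exceeds its null mean $q_0\binom{n}{2}$ by `t` null standard deviations, where `t` is a hypothetical tuning parameter and `edge_count_test` a hypothetical name.

```python
import math


def edge_count_test(num_edges, n, q0, t=3.0):
    """Return 1 if the edge count is unusually large under the null, else 0.

    Under the null the edge count is Binomial(n(n-1)/2, q0), so it has mean
    q0 * n(n-1)/2 and standard deviation of order n.
    The multiplier t is an assumption of this sketch, not the paper's choice.
    """
    pairs = n * (n - 1) // 2
    mean = q0 * pairs
    std = math.sqrt(pairs * q0 * (1.0 - q0))
    return int(num_edges >= mean + t * std)
```

Since the planted copies force their pairs to be edges, they shift the expected count upward, and condition (4.5) asks that the total number of planted edges dominate the order-$n$ standard deviation.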

###### Remark 4.3.

It is possible to use the spectral algorithm in subsection 3.1 to detect the planted subgraphs by setting in Algorithm 1.

Note that if then the planted subgraphs will have disjoint vertex sets with high probability as . Hence the law studied in this section is the same as the one obtained by planting a graph consisting in the disjoint union of copies of . The top eigenvalue of the graph consisting of disjoint copies of a graph is the same as the top eigenvalue of . Hence, in this regime, we expect the spectral method to succeed in detecting motifs under the same conditions under which it succeeds in detecting one motif.
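The claim that disjoint copies do not change the top eigenvalue is easy to confirm numerically: the adjacency matrix of a disjoint union is block diagonal, so its spectrum is the spectrum of one block repeated. A small NumPy check, with hypothetical helper names:

```python
import numpy as np


def top_eigenvalue(adj):
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return float(np.linalg.eigvalsh(np.asarray(adj, dtype=float))[-1])


def disjoint_copies(adj, m):
    """Adjacency matrix of m vertex-disjoint copies of the graph (block diagonal)."""
    return np.kron(np.eye(m, dtype=int), np.asarray(adj))
```

For a triangle, both the single copy and three disjoint copies have top eigenvalue $2$.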

## Acknowledgments

H.J. was supported by the William R. Hewlett Stanford Graduate Fellowship. A.M. was partially supported by NSF grants CCF-1319979 and DMS-1106627 and the AFOSR grant FA9550-13-1-0036.

## References

• [AAM11] Noga Alon, Sanjeev Arora, Rajsekar Manokaran, Dana Moshkovitz, and Omri Weinstein, Inapproximability of densest $\kappa$-subgraph from average case hardness, Unpublished manuscript 1 (2011).
• [ACV14] Ery Arias-Castro and Nicolas Verzelen, Community detection in dense random networks, The Annals of Statistics 42 (2014), no. 3, 940–969.
• [ADBL11] Alekh Agarwal, John C Duchi, Peter L Bartlett, and Clement Levrard, Oracle inequalities for computationally budgeted model selection, Proceedings of the 24th Annual Conference on Learning Theory, 2011, pp. 69–86.
• [AKS98] Noga Alon, Michael Krivelevich, and Benny Sudakov, Finding a large hidden clique in a random graph, Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, 1998, pp. 594–598.
• [Alo07] Uri Alon, Network motifs: theory and experimental approaches, Nature Reviews Genetics 8 (2007), no. 6, 450–461.
• [BCC10] Aditya Bhaskara, Moses Charikar, Eden Chlamtac, Uriel Feige, and Aravindan Vijayaraghavan, Detecting high log-densities: an $O(n^{1/4})$ approximation for densest $k$-subgraph, Proceedings of the forty-second ACM symposium on Theory of computing, ACM, 2010, pp. 201–210.
• [BCV12] Aditya Bhaskara, Moses Charikar, Aravindan Vijayaraghavan, Venkatesan Guruswami, and Yuan Zhou, Polynomial integrality gaps for strong sdp relaxations of densest k-subgraph, Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, 2012, pp. 388–405.
• [BD11] Joseph Blitzstein and Persi Diaconis, A sequential importance sampling algorithm for generating random graphs with prescribed degrees, Internet Mathematics 6 (2011), no. 4, 489–522.
• [BHK16] Boaz Barak, Samuel B Hopkins, Jonathan Kelner, Pravesh Kothari, Ankur Moitra, and Aaron Potechin, A nearly tight sum-of-squares lower bound for the planted clique problem, Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, FOCS ’16, IEEE, 2016, pp. 428–437.
• [BR13] Quentin Berthet and Philippe Rigollet, Complexity theoretic lower bounds for sparse principal component detection, Conference on Learning Theory, 2013, pp. 1046–1066.
• [Bur13] Rainer E Burkard, Quadratic assignment problems, Springer, 2013.
• [Chu97] Fan RK Chung, Spectral graph theory, no. 92, American Mathematical Soc., 1997.
• [DLSS14] Amit Daniely, Nati Linial, and Shai Shalev-Shwartz, From average case complexity to improper learning complexity, Proceedings of the forty-sixth annual ACM symposium on Theory of computing, ACM, 2014, pp. 441–448.
• [DM14] Yash Deshpande and Andrea Montanari, Finding hidden cliques of size $\sqrt{N/e}$ in nearly linear time, Foundations of Computational Mathematics (2014), 1–60.
• [DM15] Yash Deshpande and Andrea Montanari, Improved sum-of-squares lower bounds for hidden clique and hidden submatrix problems, Proceedings of The 28th Conference on Learning Theory (Paris, France) (Peter Grünwald, Elad Hazan, and Satyen Kale, eds.), Proceedings of Machine Learning Research, vol. 40, PMLR, 03–06 Jul 2015, pp. 523–562.
• [EK10] David Easley and Jon Kleinberg, Networks, crowds, and markets: Reasoning about a highly connected world, Cambridge University Press, 2010.
• [Fei02] Uriel Feige, Relations between average case complexity and approximation complexity, Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, ACM, 2002, pp. 534–543.
• [FGR13] Vitaly Feldman, Elena Grigorescu, Lev Reyzin, Santosh Vempala, and Ying Xiao, Statistical algorithms and a lower bound for detecting planted cliques, Proceedings of the Forty-fifth Annual ACM Symposium on Theory of Computing (New York, NY, USA), STOC ’13, ACM, 2013, pp. 655–664.
• [FK00] Uriel Feige and Robert Krauthgamer, Finding and certifying a large hidden clique in a semirandom graph, Random Structures and Algorithms 16 (2000), no. 2, 195–208.
• [FPK01a] Uriel Feige, David Peleg, and Guy Kortsarz, The dense k-subgraph problem, Algorithmica 29 (2001), no. 3, 410–421.
• [FPK01b] Uriel Feige, David Peleg, and Guy Kortsarz, The dense k-subgraph problem, Algorithmica 29 (2001), no. 3, 410–421.
• [GM75] G. R. Grimmett and C. J. H. McDiarmid, On colouring random graphs, Mathematical Proceedings of the Cambridge Philosophical Society 77 (1975), no. 2, 313–324.
• [Gra73] Mark S Granovetter, The strength of weak ties, American journal of sociology (1973), 1360–1380.
• [Hås99] Johan Håstad, Clique is hard to approximate within $n^{1-\epsilon}$, Acta Mathematica 182 (1999), no. 1, 105–142.
• [HKP15] Samuel B Hopkins, Pravesh K Kothari, and Aaron Potechin, Sos and planted clique: Tight analysis of mpw moments at all degrees and an optimal lower bound at degree four, arXiv:1507.05230 (2015).
• [HWX15] Bruce Hajek, Yihong Wu, and Jiaming Xu, Computational lower bounds for community detection on random graphs, Proceedings of The 28th Conference on Learning Theory (Paris, France) (Peter Grünwald, Elad Hazan, and Satyen Kale, eds.), Proceedings of Machine Learning Research, vol. 40, PMLR, 03–06 Jul 2015, pp. 899–928.
• [HWX17] Bruce Hajek, Yihong Wu, and Jiaming Xu, Information limits for recovering a hidden community, IEEE Transactions on Information Theory 63 (2017), no. 8, 4729–4745.
• [Jer92] Mark Jerrum, Large cliques elude the metropolis process, Random Structures & Algorithms 3 (1992), no. 4, 347–359.
• [KA05] Nadav Kashtan and Uri Alon, Spontaneous evolution of modularity and network motifs, Proceedings of the National Academy of Sciences of the United States of America 102 (2005), no. 39, 13773–13778.
• [Kho01] Subhash Khot, Improved inapproximability results for maxclique, chromatic number and approximate graph coloring, Foundations of Computer Science, 2001. Proceedings. 42nd IEEE Symposium on, IEEE, 2001, pp. 600–609.
• [Kho06]  , Ruling out ptas for graph min-bisection, dense k-subgraph, and bipartite clique, SIAM Journal on Computing 36 (2006), no. 4, 1025–1071.
• [KIMA04] Nadav Kashtan, Shalev Itzkovitz, Ron Milo, and Uri Alon, Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs, Bioinformatics 20 (2004), no. 11, 1746–1758.
• [Man17] Pasin Manurangsi, Almost-polynomial ratio eth-hardness of approximating densest k-subgraph, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, ACM, 2017, pp. 954–961.
• [MKI03] Ron Milo, Nadav Kashtan, Shalev Itzkovitz, Mark EJ Newman, and Uri Alon, On the uniform generation of random graphs with prescribed degree sequences, arXiv preprint cond-mat/0312028 (2003).
• [Mon15] Andrea Montanari, Finding one community in a sparse graph, Journal of Statistical Physics 161 (2015), no. 2, 273–299.
• [MPW15] Raghu Meka, Aaron Potechin, and Avi Wigderson, Sum-of-squares lower bounds for planted clique, Proceedings of the Forty-seventh Annual ACM Symposium on Theory of Computing (New York, NY, USA), STOC ’15, ACM, 2015, pp. 87–96.
• [MSOI02] Ron Milo, Shai Shen-Orr, Shalev Itzkovitz, Nadav Kashtan, Dmitri Chklovskii, and Uri Alon, Network motifs: simple building blocks of complex networks, Science 298 (2002), no. 5594, 824–827.
• [RS15] Prasad Raghavendra and Tselil Schramm, Tight lower bounds for planted clique in the degree-4 sos program, arXiv:1507.05136 (2015).
• [SG76] Sartaj Sahni and Teofilo Gonzalez, P-complete approximation problems, Journal of the ACM (JACM) 23 (1976), no. 3, 555–565.
• [SSR05] Sen Song, Per Jesper Sjöström, Markus Reigl, Sacha Nelson, and Dmitri B Chklovskii, Highly nonrandom features of synaptic connectivity in local cortical circuits, PLoS Biol 3 (2005), no. 3, e68.
• [Tao12] Terence Tao, Topics in random matrix theory, vol. 132, American Mathematical Soc., 2012.
• [TS07] Thomas Thorne and Michael PH Stumpf, Generating confidence intervals on biological networks, BMC bioinformatics 8 (2007), no. 1, 467.
• [VAC15] Nicolas Verzelen and Ery Arias-Castro, Community detection in sparse random networks, The Annals of Applied Probability 25 (2015), no. 6, 3465–3510.
• [WF94] Stanley Wasserman and Katherine Faust, Social network analysis: Methods and applications, vol. 8, Cambridge university press, 1994.
• [YLSK04] Esti Yeger-Lotem, Shmuel Sattath, Nadav Kashtan, Shalev Itzkovitz, Ron Milo, Ron Y Pinter, Uri Alon, and Hanah Margalit, Network motifs in integrated cellular networks of transcription–regulation and protein–protein interaction, Proceedings of the National Academy of Sciences of the United States of America 101 (2004), no. 16, 5934–5939.
• [ZKRW98] Qing Zhao, Stefan E Karisch, Franz Rendl, and Henry Wolkowicz, Semidefinite programming relaxations for the quadratic assignment problem, Journal of Combinatorial Optimization 2 (1998), no. 1, 71–109.

## Appendix A Examples: Statistical limits

###### Example A.1.

Recall that denotes the complete graph over vertices (hence having degree ). Setting we recover the hidden clique problem. In this case . Hence, our theorems imply that the two laws are strongly distinguishable if , and are not weakly distinguishable if .
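As a concrete illustration of the hidden clique instance in this example, the following minimal sketch samples an Erdős-Renyi graph and adds a clique on a uniformly random vertex subset. The names `n`, `p`, `k`, `erdos_renyi` and `plant_clique` are illustrative assumptions, not notation taken from the paper, and the sketch only generates the planted model; it does not implement any of the tests analyzed in the theorems.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, rng):
    """Sample G(n, p): each pair (i, j) with i < j is an edge independently w.p. p."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def plant_clique(n, p, k, rng):
    """Sample G(n, p) and add all edges among k uniformly chosen vertices.

    Returns the edge set of the planted graph and the sorted clique vertices.
    (Hypothetical helper; n, p, k are assumed names for the graph size,
    edge probability, and clique size.)
    """
    edges = erdos_renyi(n, p, rng)
    clique = sorted(rng.sample(range(n), k))
    # combinations of a sorted list yields pairs (i, j) with i < j,
    # matching the edge representation used above.
    edges.update(combinations(clique, 2))
    return edges, clique


if __name__ == "__main__":
    rng = random.Random(0)
    edges, clique = plant_clique(n=200, p=0.1, k=12, rng=rng)
    print(len(edges), clique)
```

Edges are stored as ordered pairs so that the union of the random graph and the planted clique is a plain set union, which matches the description of the subgraph being added to the random graph.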

###### Example A.2.

Let be the hypercube graph over vertices (hence having degree ): this is the graph whose vertices are binary vectors of length , connected by an edge whenever their Hamming distance is exactly one. Set . In other words, is a hypercube over vertices. It is easy to see that .

Let . Theorem 1 implies that this graph can be detected provided for some and all large enough. On the other hand, Theorem 2 implies that it cannot be detected if for some and all large enough. Hence the lower and upper bounds for distinguishability are close for small , and the gap between the bounds grows as increases.
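The hypercube construction used in this example can be sketched as follows. The parameter name `m` (the length of the binary vectors) and the function name `hypercube` are assumptions made for illustration. The sketch builds the adjacency lists directly from the Hamming-distance definition and checks the standard facts that the graph has $2^m$ vertices, is $m$-regular, and has $m\,2^{m-1}$ edges.

```python
from itertools import product


def hypercube(m):
    """Adjacency lists of the hypercube on binary vectors of length m.

    Two vectors are adjacent exactly when they differ in one coordinate,
    i.e. when their Hamming distance equals one.
    """
    vertices = list(product((0, 1), repeat=m))
    adj = {v: [] for v in vertices}
    for v in vertices:
        for i in range(m):
            # Flip coordinate i to obtain the neighbor at Hamming distance one.
            w = v[:i] + (1 - v[i],) + v[i + 1:]
            adj[v].append(w)
    return adj


if __name__ == "__main__":
    m = 4
    adj = hypercube(m)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    print(len(adj), num_edges)
```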

###### Example A.3.

Let be a regular tree with degree and generations (hence , ). In this case for any , . Therefore, and Theorem 1 cannot guarantee strong distinguishability of the two hypotheses. Furthermore, and we are in the low density region. Hence, Theorem 2 implies that the null and planted models are not weakly distinguishable if .

###### Example A.4.

Let be the -th power of the cycle over vertices. This is the graph with vertex set , in which two vertices are connected if for some . Let , for two functions , . In this case, and for all , . Therefore, for any , . Since , by definition, . By Theorem 1, the two models are strongly distinguishable if . In addition, depending on , , we can be in different graph density regimes. If , the laws are not weakly distinguishable if and we get a tight characterization. If , the two models are not weakly distinguishable if . Finally, for the intermediate regime where , if