Controlling the joint local false discovery rate is more powerful than metaanalysis methods in joint analysis of summary statistics from multiple genomewide association studies
Abstract
In genomewide association studies (GWASs) of common diseases/traits, we often analyze multiple GWASs with the same phenotype together to discover associated genetic variants with higher power. Since it is difficult to access data with detailed individual measurements, summarystatisticsbased metaanalysis methods have become popular to jointly analyze data sets from multiple GWASs.
In this paper, we propose a novel summarystatisticsbased joint analysis method based on controlling the joint local false discovery rate (Jlfdr). We prove that our method is the most powerful summarystatisticsbased joint analysis method when controlling the false discovery rate at a certain level. In particular, the Jlfdrbased method achieves higher power than commonly used metaanalysis methods when analyzing heterogeneous data sets from multiple GWASs. Simulation experiments demonstrate the superior power of our method over metaanalysis methods. Also, our method discovers more associations than metaanalysis methods from empirical data sets of four phenotypes. The Rpackage is available at: http://bioinformatics.ust.hk/Jlfdr.html.
1 Introduction
Understanding genetic mechanisms of common diseases and traits is important in biological and medical research. The goal of genomewide association studies (GWASs) is to discover the susceptibility of single nucleotide polymorphisms (SNPs) to common diseases/traits (Altshuler et al., 2008). Due to decreasing genotyping costs (Perkel, 2008), constantly emerging successful stories (Klein et al., 2005; Kraft and Haiman, 2010) and efforts of the GWAS consortiums (Wellcome Trust Case Control Consortium, 2007; Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014), more and more GWASs have been conducted for common phenotypes (Welter et al., 2014).
Analyses of GWAS results show that the identified associations can only explain a small part of the additive genetic variances. This is referred to as the “missing heritability” problem (Manolio et al., 2009). The hints of hidden heritability (Gibson, 2010; Yang et al., 2010) and the estimated distribution of common SNPs’ effect sizes (Park et al., 2010) suggest that common diseases/traits are influenced by thousands of SNPs with small effects. To discover these genetic variants with small effects, we need to improve studies’ power. Jointly analyzing data sets from multiple GWASs on the same diseases in the same population provide an opportunity to improve the power.
There are two kinds of joint analysis methods: individuallevel joint analysis and summarystatisticsbased joint analysis. Individuallevel joint analysis uses individuallevel genotype data from all studies. One such example is megaanalysis (Major Depressive Disorder Working Group of the Psychiatric GWAS Consortium, 2013), which pools all data together. Summarystatisticsbased joint analysis only uses summary statistics from different studies. Since individuallevel genotype data is difficult to access, summarystatisticsbased analysis is widely used in joint analysis. The most commonly used method of summarystatisticsbased joint analysis is metaanalysis (Evangelou and Ioannidis, 2013), which derives a new statistic for each SNP using summary statistics from multiple studies.
Our focus in this paper is to study summarystatisticsbased joint analysis methods. More specifically, we like to study which joint analysis method provides the highest power for a given false discovery rate level. Figure 1 illustrates our motivation.
Our major contribution in this paper is that we propose a novel summarystatisticsbased joint analysis method based on controlling the joint local false discovery rate (Jlfdr). The Jlfdr generalizes the concept of the local false discovery rate (Efron, 2005) from the analysis of single study to the joint analysis of multiple studies. We prove that our method is the most powerful summarystatisticsbased joint analysis method for a given false discovery rate level. In particular, the Jlfdrbased method is more powerful than commonly used metaanalysis methods when analyzing heterogeneous data sets from multiple GWASs.
The rest of this paper is organized as follows. In Section 2, we will first give the mathematical formulation of summarystatisticsbased joint analysis methods. We will prove that the most powerful summarystatisticsbased joint analysis method should control the Jlfdr. Then we will give implementation details of the Jlfdrbased method under the Gaussian mixture model. We will also discuss the relationship between the Jlfdrbased method and metaanalysis methods. In Section 3, we will use simulation experiments to demonstrate that Jlfdrbased method outperforms metaanalysis methods in terms of achieving higher power. Then we will show the empirical results using four different data sets. In Section 4, we will discuss limitations of our current method. Section 5 concludes the paper.
2 Method
2.1 Notations and criteria
Our method deals with a multiple GWAS setting. For simplicity, we illustrate the concepts with a twoGWAS setting. We use parenthesized superscript “” to denote the study index. For example, the sample sizes in study 1 and 2 are and , respectively. We use subscript (, is the total number of genotyped SNPs) to denote the SNP index.
To detect associations, we construct a null hypothesis for each SNP, in which association is assumed nonexistent. Assume we use a value scheme to detect associations between SNPs and the phenotype, i.e., the test statistics follow a standard normal distribution under a null hypothesis. We use to denote the observed effect size in study . The asymptotically standard error of is . Correspondingly, the test statistic in study is . The underlying expected effect size is . The expected effect size of the same SNP may vary in different studies due to heterogeneity. The test statistic (uppercase letter indicates a random variable) follows an distribution. We use to represent the vector of test statistics in all studies, i.e., . Similarly, we use to represent the vector of expected effect sizes in all studies, i.e., .
We further assume SNPs have no association with the phenotype and SNPs have associations. Thus, the null proportion reads (). We use and to denote the null hypothesis and the alternative hypothesis, respectively.
In the joint analysis of summary statistics from multiple GWASs, we assume that of the hypotheses are rejected. There are false positives and true positives (i.e., ). Table 1 summarizes the numbers of hypotheses in the different categories.
is true  is false  Total  

is rejected  
is not rejected  
Total 
When testing multiple hypotheses, it is very easy to have false positives by random chance. This problem is known as the “multiplicity” problem. Many criteria are proposed to address the multiplicity problem. We present an incomplete list of these criteria in Table 2 (a). Let’s define the false discovery proportion (Fdp) as with “” denoting the maximum operation. Fdp is an unknown quantity in real cases. The classical false discovery rate (FDR) is the expectation of the Fdp. Controlling the FDR is more powerful than controlling the familywise error rate (FWER). The Bayesian false discovery rate (Fdr) is the expected value of the Fdp given . Compared to FDR, Fdr is conditional on since we are only interested in controlling false positives when . We adopt Fdr in this paper as the criterion to avoid a plethora of false positives.
In addition to controlling false positives, we also need a criterion to measure the amount of true positives when evaluating a rejection region. A direct concept is power. The classical definition of power is a function of a given effect size as shown in the first row of Table 2(b). Since effect sizes of associated SNPs are different and unobserved, the actual power values are unknown. The Bayesian power removes the dependence of power on effect size by taking the expectation of the empirical power, which is defined as (). We list the definitions of the power and the Bayesian power in Table 2 (b). In this paper, we use the Bayesian power as the criterion to measure the amount of true positives.
Both Fdr and Bayesian power are functions of the rejection region . For two different rejection regions with the same Fdr level, we prefer the region with higher Bayesian power because it can find more true associations without increasing the proportion of false positives in the findings. Thus, we propose a joint analysis method determining the optimal rejection region when controlling the Fdr at a certain threshold , i.e.
(1) 
Here denotes the Bayesian power. Actually, when controlling the Fdr at the same threshold, metaanalysis methods can also be regarded as the solutions to the above optimization problem with further constraint about the form of , i.e.
(2)  
Here , and is a function which has different forms in different metaanalysis methods. We will give the explicit forms of the function in metaanalysis methods in subsection 2.4. Also, we will discuss the relationship between our proposed method and metaanalysis methods in detail in that subsection. In the next subsection, we will present the solution to the optimization problem in Eq. (1).
2.2 Jlfdr and optimal rejection region
To derive the solution to the optimization problem in Eq. (1), we need to introduce the concept of joint local false discovery rate (Jlfdr) first. Jlfdr is a simple extension of the local false discovery rate (Efron, 2005) from the analysis of single study to the joint analysis of multiple studies. It reads as
(3) 
which is the posterior probability of a null hypothesis, given the observed summary statistic vector .
The relationship between Jlfdr and Fdr is (see the Supplementary Note for details)
(4) 
In other words, Fdr is the expectation of Jlfdr, given that the test statistic vector is in the rejection region .
Let us define a rejection region , where is a threshold such that . We have the following theorem:
Theorem 1.
For any rejection region with , we have .
We show the proof of Theorem 1 in the Supplementary Note. Theorem 1 shows that is the most powerful rejection region when controlling Fdr at . This gives us a clue that we can improve the power of summarystatisticsbased joint analysis by controlling the Jlfdr. In the next section, we shall provide details of the implementation of the Jlfdrbased method under the Gaussian mixture model.
2.3 Implementation of Jlfdrbased method under the Gaussian mixture model
The hints of hidden heritability (Gibson, 2010; Yang et al., 2010) and the estimated distribution of common SNPs’ effect sizes (Park et al., 2010) suggest that thousands of common SNPs with small effect sizes are associated with complex diseases. A natural prior to depict this “infinitesimal model” (Gibson, 2012) is Gaussian distribution with mean and variance . We assume the effect sizes of associated SNPs have this prior distribution. Since we don’t know which SNP is associated with diseases, we propose the following twocomponent mixture model to describe the prior distribution of effect sizes:
(5) 
where is the unit point mass distribution at zero.
There may be some heterogeneity in different studies. The effect sizes of the same SNP may vary in different studies. We assume the effect sizes of the same associated SNP in different studies are normally distributed with mean and variance , i.e., . The distribution of the effect size vector is
(6) 
where is the bivariate unit point mass distribution at the origin, and denotes the bivariate Gaussian distribution with expectation and covariance matrix .
Since the observed effect size asymptotically follows Gaussian distribution , the test statistics vector with prior (6) follows twocomponent Gaussian mixture distribution:
(7) 
Here, is the identity matrix.
In order to obtain the global behavior of all SNPs, we need to obtain the marginal distribution of the test statistic vectors of all SNPs. Overall, the test statistic vector follows
(8) 
Here is the index set of all associated SNPs, and is the corresponding cardinality of . is normally unknown.
The above Gaussian mixture model is computationally difficult due to the large number of components ( is normally in the range of hundred to thousand). To simplify the model, we use a component Gaussian mixture model to approximate the nonnull components, i.e.,
(9) 
Then we reduce the distribution of to a component Gaussian mixture model:
(10) 
There are some unknown parameters and in the above mixture model. Dempster et al. (1977) proposed an EMalgorithm to estimate parameters with unobserved latent variables. With the observed vectors of summary statistics (), we use the EMalgorithm to estimate the parameters and in the Gaussian mixture model (10). Please note that is always much larger than any entry of in the GWAS setting. Hence, a Dirichlet prior is added for the proportions . This is the same penalty strategy proposed by Muralidharan (2010). Our experiments show that the rejection regions are not sensitive to the penalization parameter and the number of mixture components in the associated SNPs . In our default setting, and .
Denote the probability density function (pdf) of bivariate normal distribution as and the pdf of as . The Jlfdr reads
(11) 
After calculating the Jlfdr, we approximate Fdr as
(12) 
We determine the optimal rejection region by Jlfdrthresholding, which determines the rejection region with smaller than the threshold . To determine the threshold , we sort the calculated Jlfdr values of each SNP in an ascending order first. Denote the th Jlfdr value as . We can approximate the Fdr of the region as
(13) 
We use to denote the largest such that , namely
(14) 
Then the Jlfdr threshold is . We reject all SNPs with .
We present the detailed steps of the Jlfdrbased method in Algorithm 1.
2.4 Relationship between Jlfdrbased method and metaanalysis methods
We have the following theorem about the rejection region of Jlfdrbased method when using the Gaussian mixture model:
Theorem 2.
In the Gaussian mixture model (8), the rejection region of the Jlfdrbased method is
(15) 
where is a constant determined by . If no heterogeneity exists between studies, the rejection region is
(16) 
where , and is a constant determined by .
We present the proof of the above theorem in the Supplementary Note.
Metaanalysis methods are the most commonly used summarystatisticsbased joint analysis methods. In metaanalysis, we usually calculate the weighted average of effect sizes in different studies. Dividing the weighted average effect size by its standard error yields a new valuebased test statistic. There are two kinds of models in the metaanalysis: fixedeffects model and randomeffects model.
In the fixedeffects model, we assume that the underlying true effect sizes in different studies are identical. This corresponds to in the Gaussian mixture model 8. The optimal weighting strategy is the inversevariance weighting since it minimizes the variance of the weighted average. Each effect size is weighted by the inverse of its variance, i.e.,
(17) 
Here is the weighted average effect size. Its standard error is
(18) 
Dividing by yields the new test statistic . We have the following theorem about the rejection region of the fixedeffects metaanalysis method (See the Supplementary Note for detailed proof):
Theorem 3.
The rejection region of the fixedeffects metaanalysis method is asymptotically
(19) 
where , and is a constant determined by . .
This kind of rejection region is illustrated in Figure 1. The region coincides with the rejection region of the Jlfdrbased method when no heterogeneity exists between studies. Hence, the Jlfdrbased method and the fixedeffects metaanalysis method will have the same performance. In contrast, if heterogeneity exists between studies, the rejection regions determined by the Jlfdrbased method and the fixedeffects metaanalysis method are different. According to Theorem 1, the rejection region determined by the Jlfdrbased method can achieve the highest power among all summarystatisticsbased joint analysis methods when controlling the Fdr. In other words, the Jlfdrbased method is more powerful than the fixedeffects metaanalysis method.
In the randomeffects model, we assume that the true effect sizes in different studies are not identical and follow a distribution. Then we adjust the weights by incorporating the betweenstudy variance. The weighted average effect size is
(20) 
Its standard error is
(21) 
Dividing by yields the new test statistic . We have the following theorem about the rejection region of the randomeffects metaanalysis method (See the Supplementary Note for detailed proof):
Theorem 4.
The rejection region of the randomeffects metaanalysis method is asymptotically
(22) 
Here , which is the 2nd Hadamard power of , and is a constant determined by .
If no heterogeneity exists between studies, the randomeffects metaanalysis method is less powerful than the fixedeffects metaanalysis method and the Jlfdrbased method. If heterogeneity exists between studies, we usually need a large number of studies to estimate the betweenstudy variance precisely in the randomeffects metaanalysis. Since we usually only have a few GWASs of the same diseases in the same population, the randomeffects metaanalysis is not powerful enough. The Jlfdrbased method overcomes this problem by borrowing information from all genotyped SNPs. In any case, the rejection region determined by the Jlfdrbased method and the randomeffects metaanalysis method are different. According to Theorem 1, the Jlfdrbased method is more powerful than the randomeffects metaanalysis method.
3 Results
3.1 Simulation experiments
We use simulation experiments to demonstrate that the Jlfdrbased method is more powerful than the commonly used metaanalysis methods in analyzing summary statistics from multiple GWASs.
In our simulation experiments, we fix the sample size at 10000 in study 1. We conduct experiments with different sample sizes of 5000, 10000 and 15000 in study 2. The sample size ratios are 0.5, 1 and 1.5 correspondingly. The individual numbers in the control group and case group are the same in both studies, and the number of SNPs is . We simulate the minor allele frequency of each SNP according to uniform distribution . The proportion of the associated SNPs is . For associated SNPs, the expected odds ratio in each study is simulated according to the following model:
(23) 
where . In the homogeneous setting, . In the heterogeneous setting, . For nonassociated SNPs, the expected odds ratio is . The prevalence of the disease is . We use the odds ratio test to detect associations in our experiments.
We use the Jlfdrbased method, the fixedeffects metaanalysis method and the randomeffects metaanalysis method to jointly analyze summary statistics from study 1 and study 2. The Fdr is controlled at . In the fixedeffects metaanalysis and the randomeffects metaanalysis, we use the onedimensional mixture method Muralidharan (2010) to control the Fdr at .
In the homogeneous setting (), each SNP shares the same expected effect size between the two studies. Figure 2 presents the average empirical power and the average Fdp of 10 experimental runs using different methods. The average Fdp is well controlled in all methods. In this setting, the average empirical powers are almost the same in the Jlfdrbased method and the fixedeffects metaanalysis method. The subtle differences are due to random initial choices of the EMalgorithm and the Fdr approximations used in Eq. (13). This verifies the previous statement about the equivalence between the Jlfdrbased method and the fixedeffects metaanalysis method in the homogeneous setting.
In the heterogeneous setting (), the expected effect sizes of each SNP vary between studies. Figure 3 plots the discovered associations using the Jlfdrbased method and the fixedeffects metaanalysis method in one run when . Although the Jlfdrbased method missed some associations detected by the fixedeffects metaanalysis method, it identifies more associations than the metaanalysis method. We ran the simulation experiments 10 times for the sample size ratio and . Figure 4 shows the average empirical power and the average Fdp. The average Fdp using all three methods are about in all sample size ratio settings. From the figure, we can see that the Jlfdrbased method can achieve higher power than the other methods when controlling Fdr at the same threshold.
3.2 Real data applications
3.2.1 SCZ data from PGC
We jointly analyze the summary statistics from schizophrenia (SCZ) studies conducted by the Psychiatric Genomics Consortium (PGC). The summary statistics from two SCZ studies, Sweden+SCZ1 (Ripke et al., 2013) and SCZ2 (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014), are available from the PGC. Sweden+SCZ1 is a largescale metaanalysis of Swedish and mixedEuropean ancestry individuals that comprises 13,833 schizophrenia cases and 18,310 controls in the analysis. We use it as Study 1. SCZ2 is a largerscale metaanalysis that comprises 36,989 schizophrenia cases and 113,075 controls. The analysis includes the individuals which have been analyzed in Sweden+SCZ1. By using the following inverse metaanalysis formula, we obtain the summary statistics from the metaanalysis comprising the individuals only be analyzed in SCZ2. The formula is
(24) 
We use as the summary statistics of Study 2. We remove the SNPs with value in the test of homogeneity. After that, there are SNPs remaining.
We use the Jlfdrbased method, the fixedeffects metaanalysis method and the randomeffects metaanalysis method to jointly analyze the summary statistics from two studies. The Fdr is controlled at . We adopt the onedimensional mixture method to control the Fdr at in metaanalysis methods.
Figure 5(a) plots the discovered associations using the Jlfdrbased method and the fixedeffects metaanalysis method. The Jlfdrbased method identifies more associations. Table 3 shows the numbers of discovered associations and the rejection criteria of the different analysis methods. Besides the loci discovered by metaanalysis methods, there are eight novel loci discovered by the Jlfdrbased method. Each locus is separated by at least 500 kilobases (kb) or a weak linkage disequilibrium (). The SNPs showing the most significant association with SCZ in these novel loci are presented in Supplementary Table 1.
Method  Rejection Criterion  #{Identified SNPs} 

Jlfdrbased method  13405  
Fixedeffects metaanalysis  13014  
Randomeffects metaanalysis  8348 
3.2.2 SLE data from dbGaP
We conduct summarystatisticsbased joint analysis in systemic lupus erythematosus (SLE) data from phs000122.v1.p1 and phs000216.v1.p1 in dbGaP (Mailman et al., 2007; Tryka et al., 2014). We use the study phs000122.v1.p1, in which there are 1,311 SLE cases and 3,340 controls, as Study 1, and we use the study phs000216.v1.p1, in which there are 706 cases and 353 controls, as Study 2. The individuals in the first study are all North Americans of European descent, and those in the second study are all females of European ancestry. We use the following quality control procedures for both studies:

Missing data control: The SNPs with a missing data rate larger than are discarded.

Minor allele frequency control: The SNPs with minor allele frequency less than in either case group or control group are discarded.

HardyWeinberg equilibrium control: In the HardyWeinberg equilibrium test, the SNPs with values less than in either case group or control group are discarded.

Homogeneity control: In the homogeneity test, SNPs with values less than are discarded.
After the quality control steps, there are autosome SNPs remaining.
We use as the Fdr threshold in all analyses. 5(b) plots the associations discovered by the Jlfdrbased method and the fixedeffects metaanalysis method. The Jlfdrbased method discovers more associations than the metaanalysis methods. Table 4 lists the numbers of the associations discovered using the different methods. Besides the loci discovered by metaanalysis methods, there are three novel loci discovered by the Jlfdrbased method. The loci are separated by at least 500kb or a weak linkage disequilibrium (). The most significant associations in these novel loci can be seen in Supplementary Table 2.
Method  Rejection Criterion  #{Identified SNPs} 

Jlfdrbased method  106  
Fixedeffects metaanalysis  94  
Randomeffects metaanalysis  54 
3.2.3 BMI data from GIANT
We jointly analyze summary statistics from body mass index (BMI) studies conducted by the Genetic Investigation of ANthropometric Traits (GIANT) consortium (Locke et al., 2015). We use the joint GWAS and metabochip metaanalysis of 152,893 European men as Study 1, and we use the joint GWAS and metabochip metaanalysis of 171,977 European women as Study 2. There are autosome SNPs passing the homogeneity control (value).
We use as the Fdr threshold in all analyses. Figure 5(c) plots the associations discovered by the Jlfdrbased method and the fixedeffects metaanalysis method. The Jlfdrbased method discovers more associations than metaanalysis methods. Table 5 shows the number of discovered associations and the corresponding rejection criterion of each method. There are six novel loci discovered by the Jlfdrbased method. The SNPs showing the most significant associations in these novel loci are listed in Supplementary Table 3.
Method  Rejection Criterion  #{Identified SNPs} 

Jlfdrbased method  2717  
Fixedeffects metaanalysis  2667  
Randomeffects metaanalysis  2186 
3.2.4 WHRadjBMI data from GIANT
We conduct joint analysis in waisttohip ratio after adjusting for BMI (WHRadjBMI) studies from GIANT consortium (Shungin et al., 2015). We use the joint GWAS and metabochip metaanalysis of 93,480 European men as Study 1, and we use the joint GWAS and metabochip metaanalysis of 116,742 European women as Study 2. There are autosome SNPs passing the homogeneity control (value).
Figure 5(d) highlights the associations discovered by the Jlfdrbased method and the fixedeffects metaanalysis method. The Jlfdrbased method identifies more associations than metaanalysis methods when controlling Fdr at the same level . Table 6 shows the number of the discovered associations and the corresponding rejection criterion of each method. Besides the loci discovered by metaanalysis methods, there are four novel loci discovered by the Jlfdrbased method. The details of the most significant SNPs in these loci are listed in Supplementary Table 4.
Method  Rejection Criterion  #{Identified SNPs} 

Jlfdrbased method  452  
Fixedeffects metaanalysis  420  
Randomeffects metaanalysis  192 
4 Discussion
Both the Jlfdrbased method and the metaanalysis methods jointly analyze summary statistics from multiple GWASs. Metaanalysis methods collapse the test statistics of all studies into a weighted average value for each SNP, which is simpler than the Jlfdrbased method. When no heterogeneity exists between studies, the Jlfdrbased method will degenerate to the fixedeffects metaanalysis method. This can be understood by the fact that there is no information loss during the collapsing when all studies are homogeneous. When heterogeneity exists between studies, however, the Jlfdrbased method can achieve higher power than the fixedeffects metaanalysis method. This is understandable as information about heterogeneity is lost during collapse when using the metaanalysis method. Since heterogeneity widely exists in most cases, we suggest to use the Jlfdrbased method instead of metaanalysis methods to jointly analyze summary statistics from multiple GWASs.
This paper proves that the Jlfdrbased method is the most powerful summarystatisticsbased joint analysis method when the underlying distribution of the test statistics is known. In reality, we only know the theoretical distribution under a null hypothesis. The distribution under alternative hypotheses is usually unknown. Hence, in the implementation of the Jlfdrbased method, we assume test statistics follow the Gaussian mixture model. Then we use the EMalgorithm to infer parameters in the mixture model. Violation of the model assumptions and inaccuracy of parameters estimation will decrease the performance of the Jlfdrbased method.
We assume an independence between SNPs in the Gaussian mixture model. However, correlations between nearby SNPs often exist, which is known as linkage disequilibrium. We may further improve the Jlfdrbased method by taking advantage of the dependency information between SNPs.
5 Conclusion
Jointly analyzing data sets from multiple GWASs is a common strategy to discover associations. Since it is usually difficult to access individuallevel genotyping data, summarystatisticsbased joint analysis has become popular for jointly analyzing data sets from multiple GWASs. Among different summarystatisticsbased joint analysis methods, we prefer the method with higher Bayesian power when Fdr is controlled at the same level, because it can discover more associations. With this criterion, we propose the Jlfdrbased method. It is the most powerful summarystatisticsbased method. Simulation and empirical experiments demonstrate its superior performance over traditional metaanalysis methods.
Acknowledgement
This paper was partially supported by a grant under the Themebased Research Scheme (project T12402/13N) of the Hong Kong Research Grant Council (RGC).
References
 Altshuler et al. (2008) Altshuler, D., Daly, M. J., and Lander, E. S. (2008). Genetic mapping in human disease. Science, 322(5903):881–888.
 Benjamini and Hochberg (1995) Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1):289–300.
 Dempster et al. (1977) Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the em algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–38.
 Efron (2005) Efron, B. (2005). Local false discovery rates. Technical Report 200520B, Department of Statistics, Stanford University.
 Evangelou and Ioannidis (2013) Evangelou, E. and Ioannidis, J. P. (2013). Metaanalysis methods for genomewide association studies and beyond. Nature Reviews Genetics, 14(6):379–389.
 Gibson (2010) Gibson, G. (2010). Hints of hidden heritability in gwas. Nature Genetics, 42(7):558–560.
 Gibson (2012) Gibson, G. (2012). Rare and common variants: twenty arguments. Nature Reviews Genetics, 13(2):135–145.
 Klein et al. (2005) Klein, R. J., Zeiss, C., Chew, E. Y., Tsai, J.Y., Sackler, R. S., Haynes, C., Henning, A. K., SanGiovanni, J. P., Mane, S. M., Mayne, S. T., et al. (2005). Complement factor H polymorphism in agerelated macular degeneration. Science, 308(5720):385–389.
 Kraft and Haiman (2010) Kraft, P. and Haiman, C. A. (2010). GWAS identifies a common breast cancer risk allele among BRCA1 carriers. Nature Genetics, 42(10):819–820.
 Kruschke (2010) Kruschke, J. K. (2010). What to believe: Bayesian methods for data analysis. Trends in cognitive sciences, 14(7):293–300.
 Locke et al. (2015) Locke, A. E., Kahali, B., Berndt, S. I., Justice, A. E., Pers, T. H., Day, F. R., Powell, C., Vedantam, S., Buchkovich, M. L., Yang, J., et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature, 518(7538):197–206.
 Mailman et al. (2007) Mailman, M. D., Feolo, M., Jin, Y., Kimura, M., Tryka, K., Bagoutdinov, R., Hao, L., Kiang, A., Paschall, J., Phan, L., et al. (2007). The NCBI dbGaP database of genotypes and phenotypes. Nature Genetics, 39(10):1181–1186.
 Major Depressive Disorder Working Group of the Psychiatric GWAS Consortium (2013) Major Depressive Disorder Working Group of the Psychiatric GWAS Consortium (2013). A megaanalysis of genomewide association studies for major depressive disorder. Molecular Psychiatry, 18(4):497–511.
 Manolio et al. (2009) Manolio, T. A., Collins, F. S., Cox, N. J., Goldstein, D. B., Hindorff, L. A., Hunter, D. J., McCarthy, M. I., Ramos, E. M., Cardon, L. R., Chakravarti, A., et al. (2009). Finding the missing heritability of complex diseases. Nature, 461(7265):747–753.
 Muralidharan (2010) Muralidharan, O. (2010). An Empirical Bayes mixture method for effect size and false discovery rate estimation. The Annals of Applied Statistics, 4(1):422–438.
 Neyman and Pearson (1933) Neyman, J. and Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London. Series A, 231:289–337.
 Park et al. (2010) Park, J.H., Wacholder, S., Gail, M. H., Peters, U., Jacobs, K. B., Chanock, S. J., and Chatterjee, N. (2010). Estimation of effect size distribution from genomewide association studies and implications for future discoveries. Nature Genetics, 42(7):570–575.
 Perkel (2008) Perkel, J. (2008). SNP genotyping: six technologies that keyed a revolution. Nature Methods, 5(5):447–453.
 Ripke et al. (2013) Ripke, S., O’Dushlaine, C., Chambert, K., Moran, J. L., Kähler, A. K., Akterin, S., Bergen, S. E., Collins, A. L., Crowley, J. J., Fromer, M., et al. (2013). Genomewide association analysis identifies 13 new risk loci for schizophrenia. Nature Genetics, 45(10):1150–1159.
 Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014) Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014). Biological insights from 108 schizophreniaassociated genetic loci. Nature, 511(7510):421–427.
 Shungin et al. (2015) Shungin, D., Winkler, T. W., CroteauChonka, D. C., Ferreira, T., Locke, A. E., Mägi, R., Strawbridge, R. J., Pers, T. H., Fischer, K., Justice, A. E., et al. (2015). New genetic loci link adipose and insulin biology to body fat distribution. Nature, 518(7538):187–196.
 Storey (2003) Storey, J. D. (2003). The positive false discovery rate: a Bayesian interpretation and the qvalue. Annals of Statistics, 31(6):2013–2035.
 Tryka et al. (2014) Tryka, K. A., Hao, L., Sturcke, A., Jin, Y., Wang, Z. Y., Ziyabari, L., Lee, M., Popova, N., Sharopova, N., Kimura, M., et al. (2014). NCBI’ s database of genotypes and phenotypes: dbGaP. Nucleic Acids Research, 42(D1):D975–D979.
 Tukey (1953) Tukey, J. W. (1953). The problem of multiple comparisons. In The Collected Works of John W Tukey VIII. Multiple Comparisons: 19481983, number 1300. Chapman and Hall, New York. Unpublished manuscript.
 Wellcome Trust Case Control Consortium (2007) Wellcome Trust Case Control Consortium (2007). Genomewide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 447(7145):661–678.
 Welter et al. (2014) Welter, D., MacArthur, J., Morales, J., Burdett, T., Hall, P., Junkins, H., Klemm, A., Flicek, P., Manolio, T., Hindorff, L., et al. (2014). The NHGRI GWAS catalog, a curated resource of SNPtrait associations. Nucleic Acids Research, 42(D1):D1001–D1006.
 Yang et al. (2010) Yang, J., Benyamin, B., McEvoy, B. P., Gordon, S., Henders, A. K., Nyholt, D. R., Madden, P. A., Heath, A. C., Martin, N. G., Montgomery, G. W., et al. (2010). Common SNPs explain a large proportion of the heritability for human height. Nature Genetics, 42(7):565–569.