On composite likelihood in bivariate metaanalysis of diagnostic test accuracy studies
Abstract
The composite likelihood (CL) is amongst the computational methods used for estimation of the generalized linear mixed model (GLMM) in the context of bivariate metaanalysis of diagnostic test accuracy studies. Its advantage is that the likelihood can be derived conveniently under the assumption of independence between the random effects, but there has not been a clear analysis of the merit or necessity of this method.
For synthesis of diagnostic test accuracy studies, a copula mixed model has been proposed in the biostatistics literature.
This general model includes the GLMM
as a special case and can also allow for flexible dependence modelling, different from assuming simple linear correlation structures, normality and tail independence in the joint tails. A maximum likelihood (ML) method, which is based on evaluating the bidimensional integrals of the likelihood with quadrature methods has been proposed, and in fact it eases any computational difficulty that might be caused by the double integral in the likelihood function.
Both methods are thoroughly examined with extensive simulations and illustrated with data of a published metaanalysis. It is shown that the ML method has no nonconvergence issues or computational difficulties and at the same time allows estimation of the dependence between studyspecific sensitivity and specificity and thus prediction via summary receiver operating curves.
Keywords: Copula mixed model; diagnostic odds ratio; generalized linear mixed model; sensitivity/specificity; SROC.
0pt
2pt plus 4pt minus 2pt2pt plus 2pt minus 2pt
1 Introduction
Synthesis of diagnostic test accuracy studies is the most common medical application of multivariate metaanalysis; we refer the interested reader to the surveys by Jackson et al. (2011); Mavridis and Salanti (2013); Ma et al. (2016). These data have two important properties. The first is that the estimated sensitivities and specificities are typically negatively associated across studies, because studies that adopt less stringent criterion for declaring a test positive invoke higher sensitivities and lower specificities (Jackson et al., 2011). The second important property of the data is the substantial betweenstudy heterogeneity in sensitivities and specificities (Chu et al., 2012).
Nikoloulopoulos (2015a), to deal with the aforementioned properties, proposed a copula mixed model for bivariate metaanalysis of diagnostic test accuracy studies and made the argument for moving to copula random effects models. This general model includes the generalized linear mixed model (Chu and Cole, 2006; Arends et al., 2008) as a special case and can also operate on the original scale of sensitivity and specificity.
Chen et al. (2014, 2016b) proposed a composite likelihood (CL) method for estimation of the the generalized linear mixed model (hereafter GLMM) and the Sarmanov betabinomial model (Chu et al., 2012). Note in passing that both models are special cases of a copula mixed model (Nikoloulopoulos, 2015a). The composite likelihood can be derived conveniently under the assumption of independence between the random effects. The CL method has been recommended by Chen et al. (2014, 2016b) to overcome practical ‘issues’ in the joint likelihood inference such as computational difficulty caused by a double integral in the joint likelihood function, and restriction to bivariate normality.
However,

GLMM can only be unstable if there are too many parameters in the covariance matrix of the random effects or too many random effects for a small sample, which is not the case in this application domain. Furthermore, Chen et al. (2014, 2016b) restrict themselves to SAS PROC NLMIXED which is a general routine for random effect models and thus gives limited capacity. The CL method is well established as a surrogate alternative of maximum likelihood when the joint likelihood is too difficult to compute (Varin et al., 2011), which is apparently not the case in the synthesis of diagnostic test accuracy studies. The general model in Nikoloulopoulos (2015a) includes the GLMM as a special case and its numerical evaluation has been implemented in the package CopulaREMADA (Nikoloulopoulos, 2016) within the open source statistical environment R (R Core Team, 2015).

The random effects distribution of a copula mixed model can be expressed via other copulas (other than the bivariate normal) that allow for flexible dependence modelling, different from assuming simple linear correlation structures, normality and tail independence.
The contribution of this paper is to examine the merit of the CL method in the context of diagnostic test accuracy studies and compare it to the ML method in Nikoloulopoulos (2015a). The remainder of the paper proceeds as follows. Section 2 summarizes the copula mixed model for diagnostic test accuracy studies. Section 3 discusses both maximum and composite likelihood for estimation of the model parameters. Section 4 contains smallsample efficiency calculations to compare the two methods. Section 5 presents applications of the likelihood estimation methods to several data frames with diagnostic studies. We conclude with some discussion in Section 6.
2 The copula mixed model
We first introduce the notation used in this paper. The focus is on twolevel (withinstudy and betweenstudies) cluster data. The data are are , where is an index for the within study measurements and is an index for the individual studies. The data, for study , can be summarized in a table with the number of true positives (), true negatives (), false negatives (), and false positives ().
The withinstudy model assumes that the number of true positives and true negatives are conditionally independent and binomially distributed given , where denotes the bivariate latent pair of (transformed) sensitivity and specificity. That is
(1) 
where is a link function.
The stochastic representation of the between studies model takes the form
(2) 
where is a parametric family of copulas with dependence parameter and is the cdf of the univariate distribution of the random effect. The copula parameter is a parameter of the random effects model and it is separated from the univariate parameters, the univariate parameters and are the metaanalytic parameters for the sensitivity and specificity, and and express the betweenstudy variabilities. The models in (2) and (2) together specify a copula mixed model (Nikoloulopoulos, 2015a) with joint likelihood
(3) 
where is the copula density and is the binomial probability mass function (pmf). The choices of the and are given in Table 1.
logit, probit, cloglog  

Beta  identity 
3 Estimation methods
3.1 Maximum likelihood method
Estimation of the model parameters can be approached by the standard maximum likelihood (ML) method, by maximizing the logarithm of the joint likelihood in (3). For mixed models of the form with joint likelihood as in (3), numerical evaluation of the joint pmf is easily done with the following steps (Nikoloulopoulos, 2015a):

Convert from independent uniform random variables and to dependent uniform random variables and that have distribution . The inverse of the conditional distribution corresponding to the copula is used to achieve this.

Numerically evaluate the joint pmf
in a double sum:
The inverse conditional copula cdfs are given in Table 2 for the sufficient list of parametric families of copulas for metaanalysis of diagnostic test accuracy studies in Nikoloulopoulos (2015a, b). Since the copula parameter of each family has different range, in the sequel we reparametrize them via their Kendallâs ; that is comparable across families.
Copula  

BVN  
Frank  
Clayton  
Clayton by 90  
Clayton by 180  
Clayton by 270 
3.2 Composite likelihood method
The composite likelihood method assumes independence between the random effects. Hence, it is identical for any copula mixed model, since all the parametric families of copulas in Table 2 contain the independence copula as a special case. This subsection summarizes the composite likelihood estimating equations and the asymptotic covariance matrix for the estimator that solves them in the context of diagnostic test accuracy studies.
Composite likelihood estimator
Chen et al. (2014) and Chen et al. (2016b) proposed the composite likelihood method for estimation of the copula mixed model with normal and beta margins, respectively. Composite likelihood is a surrogate likelihood which leads asymptotically to unbiased estimating equations obtained by the derivatives of the composite loglikelihoods. Estimation of the model parameters can be approached by solving the marginal estimating equations or equivalently by maximizing the sum of composite (univariate) likelihoods.
By using composite likelihood the authors are assuming betweenstudy independence in sensitivities and specificities and thus the joint likelihood in (3) reduces to:
(4) 
where , since under the independence assumption the copula density is equal to 1. Note that the joint likelihood reduces to the product of two univariate likelihoods and the evaluation of univariate integrals, thus the computational effort (if any) is subsided. Essentially, for beta margins the univariate likelihoods result in a closed form since
where
is the pmf of a BetaBinomial() distribution.
Composite likelihood estimates can be obtained by maximizing the logarithm of the joint likelihood in (4) over the univariate parameters. The efficiency of the composite likelihood estimates has been studied and shown in a series of a papers (Varin, 2008; Varin et al., 2011). However, CL ignores the dependence at the estimation of the univariate marginal parameters, thus it is expected to be worse as the dependence increases.
Asymptotic covariance matrix–Inverse Godambe
Let . The asymptotic covariance matrix for the CL estimator , also known as the inverse Godambe information matrix (Godambe, 1991), is
4 Small and moderatesample efficiency–misspecification of the univariate distribution of the random effect
In this section an extensive simulation study with two different scenarios is conducted (a) to assess the performance of the CL and ML methods, and (b) to investigate in detail the effect of the misspecification of the parametric margin of the random effects distribution. The CL method assumes the independence copula and its focus is on marginal parameters and apparently not the choice of the copula. Hence in the simulations we only investigate the effect of the misspecification of the parametric margin of the random effects distribution. We refer the interested reader to Nikoloulopoulos (2015a) for a study on the misspecification of the parametric family of copulas of the random effects distribution.
We use the simulation process in Nikoloulopoulos (2015a) and set the univariate parameters and disease prevalence to mimic the telomerase data in Section 5. The details are given below:

Simulate the study size from a shifted gamma distribution, i.e., and round off to the nearest integer.

Simulate from a parametric family of copulas ; is converted to the dependence parameter via the relations in Table 2.

Convert to beta or normal realizations via for .

Draw the number of diseased from a distribution.

Set and generate from a for .
In the first scenario the simulated data are generated from the BVN copula mixed model with normal margins, logit link (the resulting model is the same with the GLMM) and true marginal parameters , while in the second scenario the simulated data are generated from the BVN copula mixed model with beta margins and true marginal parameters . The number of studies is set to and to represent a relatively small and moderate metaanalysis, and the Kendall’s association between studyspecific sensitivity and specificity is set to and to represent moderate and strong negative dependence.
True margin  MLnormal  MLbeta  CLnormal  CLbeta  

normal  10  0.5  45  17  995  0 
0.8  14  7  1025  0  
20  0.5  3  0  1118  0  
0.8  1  0  1136  0  
beta  10  0.5  50  27  1384  0 
0.8  24  8  1495  0  
20  0.5  0  0  1575  0  
0.8  0  0  1576  0  
MLnormal  MLbeta  CLnormal  CLbeta  

Bias  SD  RMSE  Bias  SD  RMSE  Bias  SD  RMSE  Bias  SD  RMSE  
10  0.5  0.08  2.95  2.62  2.95  0.68  2.91  2.63  2.99  0.02  2.93  2.80  2.93  0.67  2.91  2.72  2.98  
0.8  0.01  2.96  2.52  2.96  0.82  2.93  2.65  3.04  0.03  2.95  2.80  2.95  0.72  2.91  2.75  3.00  
20  0.5  0.01  2.07  1.91  2.07  0.74  2.03  1.93  2.16  0.01  2.07  2.00  2.07  0.77  2.05  1.98  2.19  
0.8  0.11  2.09  1.81  2.10  0.74  2.07  1.89  2.19  0.08  2.08  1.99  2.08  0.67  2.06  1.98  2.17  
10  0.5  1.76  5.87  4.38  6.13  9.12  7.04  4.72  11.52  1.68  5.57  5.13  5.82  9.29  6.99  5.76  11.63  
0.8  1.60  5.87  4.13  6.08  8.69  7.06  4.54  11.20  1.61  5.63  5.17  5.86  9.18  7.05  5.76  11.58  
20  0.5  0.93  3.90  3.18  4.01  9.40  5.04  3.72  10.67  0.92  3.74  3.63  3.85  9.43  5.01  4.30  10.68  
0.8  1.01  3.97  3.10  4.09  9.16  5.01  3.50  10.44  0.96  3.75  3.64  3.87  9.48  4.99  4.31  10.71  
10  0.5  6.16  16.46  15.55  17.57  1.94  1.86  8.39  18.59  56.84  20.40  1.97  1.99  
0.8  4.66  15.36  14.38  16.05  1.88  1.83  8.71  18.52  61.11  20.47  1.94  2.03  
20  0.5  3.42  11.75  10.82  12.24  1.48  1.38  3.99  12.72  19.78  13.33  1.48  1.43  
0.8  2.97  10.49  9.84  10.90  1.40  1.31  4.35  12.67  21.13  13.39  1.45  1.43  
10  0.5  20.52  46.38  42.88  50.72  10.54  7.42  20.85  45.98  14.51  50.49  10.20  8.68  
0.8  21.28  43.66  40.11  48.57  10.13  6.84  20.76  45.96  14.49  50.43  10.17  8.70  
20  0.5  12.33  32.85  31.25  35.09  7.81  5.82  12.12  32.71  10.02  34.89  7.61  6.62  
0.8  13.80  31.22  28.48  34.14  7.65  5.16  11.94  32.57  10.01  34.69  7.56  6.64  
dOR  10  0.5  14.46  41.89  35.65  44.32  51.62  54.24  36.04  74.88  13.46  39.07  33.66  41.32  52.46  53.57  43.79  74.98 
0.8  14.65  45.93  35.06  48.21  49.71  57.06  36.70  75.68  13.93  42.23  36.62  44.47  52.69  56.20  45.95  77.04  
20  0.5  6.87  23.95  24.03  24.92  48.18  35.32  26.27  59.74  6.61  22.75  21.32  23.69  48.11  34.95  30.53  59.46  
0.8  8.10  25.89  24.52  27.13  47.48  36.61  26.27  59.95  7.47  24.07  22.60  25.20  49.41  36.59  32.12  61.48  
The data are generated from the BVN copula mixed model with normal margins (GLMM) with true marginal parameters for different number of studies and different values of Kendall’s association between studyspecific sensitivity and specificity. All entries are multiplied by 100.
MLnormal  MLbeta  CLnormal  CLbeta  

Bias  SD  RMSE  Bias  SD  RMSE  Bias  SD  RMSE  Bias  SD  RMSE  
10  0.5  0.91  2.92  2.59  3.05  0.15  2.86  2.62  2.86  0.87  2.91  2.77  3.04  0.20  2.84  2.72  2.85  
0.8  1.05  2.89  2.46  3.07  0.20  2.82  2.59  2.83  1.02  2.91  2.73  3.08  0.38  2.84  2.70  2.87  
20  0.5  0.94  2.08  1.90  2.29  0.18  2.03  1.92  2.04  0.93  2.08  1.99  2.28  0.21  2.03  1.98  2.04  
0.8  0.97  2.04  1.78  2.26  0.08  2.00  1.88  2.00  0.96  2.06  1.98  2.27  0.25  2.00  1.98  2.01  
10  0.5  7.37  6.27  4.75  9.68  0.12  6.72  5.18  6.72  7.47  6.10  5.58  9.65  0.08  6.70  6.00  6.70  
0.8  7.12  6.33  4.64  9.53  0.34  6.64  4.89  6.65  7.28  6.17  5.66  9.54  0.25  6.68  6.02  6.69  
20  0.5  8.25  4.20  3.50  9.25  0.11  4.69  4.08  4.69  8.29  4.06  3.99  9.23  0.20  4.65  4.47  4.65  
0.8  8.04  4.29  3.41  9.11  0.33  4.72  3.76  4.74  8.23  4.14  4.00  9.22  0.19  4.71  4.46  4.71  
10  0.5  14.84  13.66  0.44  1.80  1.71  1.85  16.59  67.17  0.49  1.87  1.84  1.94  
0.8  13.73  12.62  0.45  1.65  1.63  1.71  16.74  38.99  0.63  1.82  1.83  1.92  
20  0.5  10.52  9.71  0.28  1.32  1.27  1.35  11.12  15.20  0.29  1.35  1.34  1.38  
0.8  9.57  8.83  0.28  1.25  1.19  1.28  11.20  17.56  0.34  1.35  1.34  1.39  
10  0.5  45.35  48.49  5.05  9.33  8.41  10.61  45.99  15.16  4.86  9.24  9.49  10.44  
0.8  41.46  44.93  5.73  8.92  7.59  10.61  46.86  15.13  4.90  9.27  9.53  10.49  
20  0.5  32.03  35.44  2.89  6.79  6.56  7.38  32.46  10.50  2.77  6.75  7.12  7.30  
0.8  29.41  31.67  3.54  6.63  5.66  7.52  32.87  10.52  2.89  6.78  7.11  7.37  
dOR  10  0.5  24.77  39.17  34.13  46.35  7.24  46.20  34.94  46.76  25.53  38.03  32.81  45.81  8.44  46.23  39.33  47.00 
0.8  22.20  42.84  35.09  48.25  7.20  48.26  35.55  48.80  23.32  41.45  35.22  47.56  11.10  49.40  41.65  50.63  
20  0.5  31.65  22.93  23.52  39.08  4.69  29.91  25.76  30.27  31.95  22.13  20.90  38.86  5.19  29.83  27.79  30.28  
0.8  30.13  24.88  23.90  39.08  2.58  31.31  25.24  31.42  31.20  23.84  22.36  39.27  5.83  31.76  28.94  32.29 
The data are generated from the BVN copula mixed model with beta margins with true marginal parameters for different number of studies and different values of Kendall’s association between studyspecific sensitivity and specificity. All entries are multiplied by 100.
As stated in Chen et al. (2014, 2016b) one advantage of the CL method is that the problem of nonconvergence is avoided, so we also report on the nonconvergence of different methods in Table 3. To summarize the simulated data, we report the resultant biases, root mean square errors (RMSE), and standard deviations (SD), along with average theoretical variances for the ML and CL estimates of the univariate parameters under different marginal choices based on iterations in which all four competing approaches converged in Table 4 and Table 5. Following Chen et al. (2016b) we also summarize the diagnostic odds ratio, that is dOR. Clearly, this is a function of the univariate parameters; its value ranges from zero to infinity, with a higher value indicating better discriminatory power.
Conclusions from the values in the table are the following:

The CL method is nearly as efficient as the ‘gold standard’ ML method.

The metaanalytic ML and CL estimates and SDs are not robust to the margin misspecification.

The ML method has negligible nonconvergence issues.

The CL method in Chen et al. (2014) has a nonconvergence rate between 10% to 16%.

The CL method in Chen et al. (2016b) has no nonconvergence issues at all as expected since the has a closed form.
The simulation results indicate that for both methods the effect of misspecifying the marginal choice can be seen as substantial for both the univariate parameters and the parameters that are functions of them, such as the dOR. This is in line with Nikoloulopoulos (2015a, b) for the ML method. Here we also show that the CL method is not robust to the misspecification of the margin. This agrees with the conclusions of Xu and Reid (2011) and Ogden (2016) who argue that if the marginal distribution of the random effects is misspecified then the CL estimator no longer retains robustness. This has not been revealed before, since Chen et al. (2014) (Chen et al., 2016b) focused solely on a beta (normal) margin and didn’t study the effect of misspecification of the marginal random effect distribution. The focus in the CL method is on marginal parameters and their functions (e.g., dOR). Since these are univariate inferences, all that matters, as regard as to the bias, is the univariate model.
5 Tumor markers for bladder cancer
In this section we illustrate the methods with data of the published metaanalyses in Glas et al. (2003b); also analysed in Chen et al. (2016b). This metaanalyses deal with the most common urological cancer, that is bladder cancer. Several diagnostic markers are assessed including the cytology () which is the classical marker for detecting bladder cancer since 1945 and is not expensive compared with the reference standard (that is cystoscopy procedure), but lacks the diagnostic sensitivity. The other markers under investigation to give a better sensitivity are NMP22 (), BTA (), BTASTAT(), telomerase (), and BTATRAK ().
For all the metaanalyses, we fit the copula mixed model for all different choices of parametric families of copulas and margins. Sufficient choices of copulas are BVN, Frank, Clayton, and the rotated versions of the latter (Table 2). These families have different strengths of tail behaviour; for more details see Nikoloulopoulos (2015a, b). We use the loglikelihood at estimates as a rough diagnostic measure for goodness of fit between the models and summarize the choice of the copula and margin with the largest loglikelihood, along with the GLMM (BVN copula mixed model with normal margins) as a benchmark. We also estimate the model parameters with the CL method under the assumption of both normal (CLnorm) and beta (CLbeta) margins. In Table 6 we report the resulting maximized ML and CL loglikelihoods, estimates, and standard errors.
NMP22  Telomerase  
GLMM  CLnorm  Cln270beta  CLbeta  GLMM  CLnorm  BVNbeta  CLbeta  
Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  
0.71  0.04  0.71  0.04  0.69  0.04  0.69  0.04  0.77  0.03  0.77  0.03  0.76  0.03  0.76  0.03  
0.79  0.03  0.79  0.03  0.78  0.03  0.77  0.03  0.91  0.05  0.90  0.05  0.81  0.06  0.81  0.06  
0.64  0.19  0.65  0.11  0.08  0.04  0.08  0.04  0.43  0.13  0.39  0.08  0.03  0.02  0.03  0.02  
0.58  0.15  0.56  0.13  0.05  0.03  0.05  0.02  1.83  0.40  1.66  0.14  0.28  0.10  0.24  0.09  
dOR  0.65  0.20  0.65  0.19  dOR  0.65  0.21  0.66  0.18  dOR  0.31  0.32  0.36  0.26  dOR  0.77  0.36  0.73  0.39 
0.17  0.21  0    0.28  0.20  0    1    0    1    0    
93.97  98.02  92.96  94.05  50.37  57.40  51.14  55.34  
BTA  BTATRAK  
GLMM  CLnorm  Cln180norm  CLbeta  GLMM  CLnorm  Cln270beta  CLbeta  
Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  
0.47  0.10  0.48  0.10  0.47  0.10  0.49  0.09  0.66  0.03  0.67    0.66  0.03  0.67  0.02  
0.80  0.04  0.81  0.04  0.80  0.04  0.79  0.04  0.69  0.14  0.70    0.66  0.09  0.66  0.10  
0.82  0.32  0.84  0.19  0.76  0.31  0.13  0.08  0.12  0.13  0.00    0.00  0.01  0.00  0.01  
0.53  0.20  0.55  0.18  0.48  0.19  0.04  0.03  1.20  0.52  1.23    0.18  0.10  0.19  0.11  
dOR  0.22  0.11  0.23  0.09  0.21  0.10  dOR  0.25  0.10  dOR  0.87  0.81  0.88    dOR  1.01  0.55  1.03  0.49 
0.26  0.36  0    0.30  0.25  0    0.92  0.49  0    0.91  0.18  0    
35.87  37.57  35.28  36.14  34.17  34.61  33.61  34.13  
BTASTAT  Cytology  
GLMM  CLnorm  BVNbeta  CLbeta  GLMM  CLnorm  Cln90beta  CLbeta  
Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  Est.  SE  
0.74  0.04  0.75  0.04  0.74  0.03  0.74  0.04  0.56  0.04  0.56  0.04  0.56  0.03  0.55  0.04  
0.75  0.05  0.76  0.05  0.73  0.05  0.74  0.05  0.96  0.01  0.96  0.01  0.92  0.02  0.92  0.02  
0.36  0.17  0.40  0.10  0.03  0.02  0.03  0.02  0.75  0.13  0.71  0.07  0.11  0.03  0.10  0.03  
0.72  0.21  0.72  0.14  0.08  0.04  0.08  0.04  1.46  0.23  1.51  0.10  0.12  0.05  0.12  0.04  
dOR  0.94  0.38  0.94  0.37  dOR  1.02  0.41  1.01  0.34  dOR  0.05  0.03  0.05  0.02  dOR  0.11  0.04  0.11  0.03 
0.30  0.38  0    0.29  0.38  0    0.09  0.19  0    0.06  0.11  0    
51.70  54.55  51.52  51.84  153.28  158.39  152.08  152.18 
Clnnorm and Clnbeta denotes a Clayton rotated by degrees copula mixed with normal and beta margins, respectively.
The CLnorm estimate of the between study variance was approximately zero, thus for this case the standard errors are unreliable as the betweenstudy variance parameter estimate is on the boundary of the parameter space.
Nmp22
The loglikelihoods show that a copula mixed model with rotated by 270 degrees Clayton copula and beta margins provides the best fit and the estimates of sensitivity and specificity are smaller under this assumption. The CL method performs well since the estimated is weak and not significantly different from zero.
Bta
The loglikelihoods show that a copula mixed model with rotated by 180 degrees Clayton copula and normal margins provides the best fit. Chen et al. (2016b) previously restricted to beta margins thus the sensitivity and dOR were overestimated (CLbeta).
Btastat
The loglikelihoods show that a BVN copula mixed model with beta margins provides the best fit and the estimates of specificity and dOR are smaller and larger, respectively, under this assumption.
Telomerase
Nikoloulopoulos (2015a) has previously analysed these data to illustrate the copula mixed model when there exists negative perfect dependence, and thus there is only one copula: the countermonotonic copula. This is a limiting case for all the parametric families of copulas, when the dependence parameter is fixed to the left boundary of its parameter space. Both models agree on the estimated sensitivity but the estimate of specificity is larger under the standard GLMM. The loglikelihood is for normal margins and for beta margins, and thus a normal margin seems to be a better fit for the data. In this example the CL method overestimates the dOR, since it ignores the perfect negative dependence at the estimation of the parameters.
Btatrak
The loglikelihoods show that a copula mixed model with rotated by 270 degrees Clayton copula and beta margins provides the best fit. Note that the CLnorm estimate of the between study variance was approximately zero, thus for this case the standard errors are unreliable as the betweenstudy variance parameter estimate is on the boundary of the parameter space.
Cytology
The loglikelihoods show that a copula mixed model with rotated by 90 degrees Clayton copula and beta margins provides the best fit. All models agree on the estimated sensitivity , but the estimated specificity and dOR are smaller when beta margins are assumed. The CL method performs well on the estimation of the univariate parameters and their functions since the estimated is weak and not significantly different from zero.
6 Discussion
In this paper we have challenged claims made in Chen et al. (2014, 2016b) about the advantages of using a composite likelihood in metaanalysis of diagnostic test accuracy studies, in terms of convergence and robustness to model misspecification. The usual reason for using a composite likelihood does not apply here, because the full likelihood is straightforward to compute. We have demonstrated that the copula mixed model in Nikoloulopoulos (2015a) does not suffer for computational problems or convergence issues. Nikoloulopoulos (2015a) proposed a numerically stable ML estimation technique based on GaussLegendre quadrature; the crucial step is to convert from independent to dependent quadrature points. Furthermore, it has been shown the secondary motivation of robustness of the CL method is not retained in this context if the marginal distributions are misspecified. Hence it is a digression to use the CL methods in Chen et al. (2014, 2016b) for estimation in metaanalysis of diagnostic test accuracy studies as apparently there is neither computationally difficulty in the calculation of the bivariate loglikelihood nor robustness in the misspecification of the marginal distribution of the random effects. These conclusions hold to any context where clinical trials or observational studies report more than a single binary outcome.
Furtermore, in Chen et al. (2014, 2016b) the main inference is univariate such as the overall sensitivity or specificity or their functions as a single measure of diagnostic accuracy, e.g., the diagnostic odds ratio (dOR). The dOR for many cases is not useful since cannot distinguish the ability to detect individuals with disease from the ability to identify healthy individuals (Chen et al., 2014). Whenever the balance between false negative and false positive rates is of immediate importance, both the prevalence and the conditional error rates of the test have to be taken into consideration to make a balanced decision; hence the dOR is less useful, as it does not distinguish between the two types of diagnostic mistake (Glas et al., 2003a).
NMP22 
Telomerase 



BTA  BTATRAK 
BTASTAT  Cytology 
In fact, if the interest is only to overall sensitivity, and specificity, then the overall test accuracy across studies will not be clearly defined. Different studies use different thresholds for a positive test result, thus the overall sensitivity and specificity do not make sense. Instead, some form of the summary receiver operating characteristic (SROC) curve makes much more sense and will help decision makers to assess the actual diagnostic accuracy of a diagnostic test. In an era of evidencebased medicine, decision makers need highquality procedures such as the SROC curves to support decisions about whether or not to use a diagnostic test in a specific clinical situation and, if so, which test.
An SROC curve is deduced for the copula mixed model in Nikoloulopoulos (2015a) through a median regression curve of on . For the copula mixed model, the model parameters (including dependence parameters), the choice of the copula, and the choice of the margin affect the shape of the SROC curve (Nikoloulopoulos, 2015a). However, there is no priori reason to regress on instead of the other way around, so Nikoloulopoulos (2015a) also provides a median regression curve of on . Apparently, while there is a unique definition of the ROC curve within a study with fixed accuracy, there is no unique definition of SROC curve across multiple studies with different accuracies (Rücker and Schumacher, 2010). As Arends et al. (2008) have pointed out, none of the SROC curves proposed in the literature can be interpreted as an average ROC. Rücker and Schumacher (2009) stated that instead of summarizing data using an SROC, it might be preferable to give confidence regions. Hence, in addition to using just median regression curves, Nikoloulopoulos (2015a) proposed quantile regression curves with a focus on high ( = 0.99) and low quantiles ( = 0.01), which are strongly associated with the upper and lower tail dependence imposed from each parametric family of copulas. These can been seen as confidence regions of the median regression SROC curve. Among the parametric families of copulas in Table 2 the tail dependence varies, and is a property to consider when choosing amongst different families of copulas as affects the shape of SROC curves (Nikoloulopoulos, 2015a). Finally, Nikoloulopoulos (2015a) to reserve the nature of a bivariate response instead of a univariate response along with a covariate, proposed to plot the estimated contour of the random effects distribution. The contour plot can be seen as the predictive region of the estimated pair of sensitivity and specificity. The prediction region of the copula mixed model does not depend on the assumption of bivariate normality of the random effects and has nonelliptical shape.
Figure 1 demonstrates these curves and summary operating points (a pair of average sensitivity and specificity) with a confidence and a predictive region from the best fitted copula mixed model for all the metaanalyses in Section 5. Both CL methods in Chen et al. (2014, 2016b) cannot be used to produce the SROC curves, since the dependence parameters affect the shape of the SROC curve and these are set to independence by definition. Note in passing that the CL method in Chen et al. (2014) can provide a confidence region but this is restricted to the elliptical shape.
Nevertheless, the additional feature of having to estimate the association among the random effects in ML estimation has been found to require larger sample sizes than in CL estimation where this parameter is set to independence. The application example includes cases with an adequate number of individual studies. For metaanalyses with fewer studies the CL methods in Chen et al. (2014, 2016b) can be recommended if a bivariate copula mixed model is near nonidentifiable (or has a flat loglikelihood) and the estimation of an average operating point (summary sensitivity and specificity) is of interest instead of a SROC curve.
Software
The R package CopulaREMADA (Nikoloulopoulos, 2016) has been used to produce the ML estimates (along with their SE) of the parameters from the copula mixed models and plot the SROC curves and summary operating points (a pair of average sensitivity and specificity) with a confidence and a predictive region. The R package xmeta (Chen et al., 2016a) has been used to produce the CL estimates (along with their SE) of the parameters from both methods in Chen et al. (2014, 2016b).
Acknowledgements
The simulations presented in this paper were carried out on the High Performance Computing Cluster supported by the Research and Specialist Computing Support service at the University of East Anglia.
Footnotes
 A.Nikoloulopoulos@uea.ac.uk, School of Computing Sciences, University of East Anglia, Norwich NR4 7TJ, UK
References
 Arends, L. R., Hamza, T. H., van Houwelingen, J. C., HeijenbrokKal, M. H., Hunink, M. G. M., and Stijnen, T. (2008). Bivariate random effects metaanalysis of ROC curves. Medical Decision Making, 28(5):621–638.
 Chen, Y., Hong, C., Chu, H., and Liu, Y. (2016a). xmeta: A Toolbox for Multivariate MetaAnalysis. R package version 1.12.
 Chen, Y., Hong, C., Ning, Y., and Su, X. (2016b). Metaanalysis of studies with bivariate binary outcomes: a marginal betabinomial model approach. Statistics in Medicine, 1:21–40.
 Chen, Y., Liu, Y., Ning, J., Nie, L., Zhu, H., and Chu, H. (2014). A composite likelihood method for bivariate metaanalysis in diagnostic systematic reviews. Statistical Methods in Medical Research. DOI:10.1177/0962280214562146.
 Chu, H. and Cole, S. R. (2006). Bivariate metaanalysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. Journal of Clinical Epidemiology, 59(12):1331–1332.
 Chu, H., Nie, L., Chen, Y., Huang, Y., and Sun, W. (2012). Bivariate random effects models for metaanalysis of comparative studies with binary outcomes: Methods for the absolute risk difference and relative risk. Statistical Methods in Medical Research, 21(6):621–633.
 Glas, A., Lijmer, J., Prins, M., Bonsel, G., and Bossuyt, P. (2003a). The diagnostic odds ratio: A single indicator of test performance. Journal of Clinical Epidemiology, 56(11):1129–1135.
 Glas, A., Roos, D., Deutekom, M., Zwinderman, A., Bossuyt, P., and Kurth, K. (2003b). Tumor markers in the diagnosis of primary bladder cancer. a systematic review. The Journal of Urology, 169(6):1975–1982.
 Godambe, V. P. (1991). Estimating Functions. Oxford University Press, Oxford.
 Jackson, D., Riley, R., and White, I. R. (2011). Multivariate metaanalysis: Potential and promise. Statistics in Medicine, 30(20):2481–2498.
 Ma, X., Nie, L., Cole, S. R., and Chu, H. (2016). Statistical methods for multivariate metaanalysis of diagnostic tests: An overview and tutorial. Statistical Methods in Medical Research, 25:1596–1619.
 Mavridis, D. and Salanti, G. (2013). A practical introduction to multivariate metaanalysis. Statistical Methods in Medical Research, 22(2):133–158.
 Nikoloulopoulos, A. K. (2015a). A mixed effect model for bivariate metaanalysis of diagnostic test accuracy studies using a copula representation of the random effects distribution. Statistics in Medicine, 34:3842–3865.
 Nikoloulopoulos, A. K. (2015b). A vine copula mixed effect model for trivariate metaanalysis of diagnostic test accuracy studies accounting for disease prevalence. Statistical Methods in Medical Research. DOI:10.1177/0962280215596769.
 Nikoloulopoulos, A. K. (2016). CopulaREMADA: Copula mixed effect models for bivariate and trivariate metaanalysis of diagnostic test accuracy studies. R package version 1.0.
 Ogden, H. (2016). A caveat on the robustness of composite likelihood estimators: The case of a misspecified random effect distribution. Statistica Sinica, 26(2):639–651.
 R Core Team (2015). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
 Rücker, G. and Schumacher, M. (2009). Letter to the editor. Biostatistics, 10(4):806–807.
 Rücker, G. and Schumacher, M. (2010). Summary ROC curve based on a weighted Youden index for selecting an optimal cutpoint in metaanalysis of diagnostic accuracy. Statistics in Medicine, 29:3069–3078.
 Stroud, A. H. and Secrest, D. (1966). Gaussian Quadrature Formulas. PrenticeHall, Englewood Cliffs, NJ.
 Varin, C. (2008). On composite marginal likelihoods. Advances in Statistical Analysis, 92:1–28.
 Varin, C., Reid, N., and Firth, D. (2011). An overview of composite likelihood methods. Statistica Sinica, 21:5–42.
 Xu, X. and Reid, N. (2011). On the robustness of maximum composite likelihood estimate. Journal of Statistical Planning and Inference, 141(9):3047–3054.