Power analysis for a linear regression model
when regressors are matrix sampled
Abstract
Multiple matrix sampling is a survey methodology technique that randomly chooses a relatively small subset of items to be presented to survey respondents for the purpose of reducing respondent burden. The data produced are missing completely at random (MCAR), and special missing data techniques should be used in linear regression and other multivariate statistical analysis. We derive asymptotic variances of regression parameter estimates that allow us to conduct power analysis for linear regression models fit to the data obtained via a multiple matrix sampling design. The ideas are demonstrated with a variation of the Big Five Inventory of psychological traits. An exploration of the regression parameter space demonstrates instability of the sample size requirements, and substantial losses of precision with matrixsampled regressors. A simulation with nonnormal data demonstrates the advantages of a semiparametric multiple imputation scheme.
Key Words: MCAR, multiple imputation, multiple matrix sampling, power analysis, respondent burden.
1 Introduction and motivation
This work was conceived and carried out in the context of a task to reduce respondent burden in a mental health study. We were interested in a range of outcome variables, and our analytical goal was fitting regression models to explain the mental health outcomes. The instrument collects demographic explanatory variables, as well as scores from the Big Five Inventory (John and Srivastava, 1999), a commonly used set of five psychological traits that are often found to be correlated with behaviors and outcomes. We expected that multiple matrix sampling would allow us to reduce the instrument length from over an hour to about 20–25 minutes. A key component of sampling design, sample size determination, will be based on a linear regression power analysis. However, complexities of regression analysis with missing data required custom derivations of power analyses, which is what this technical paper addresses.
2 Regression setting
Consider a regression analysis problem where an outcome is predicted by a set of explanatory variables :
(0) 
In the simplest possible case of no missing data and homoskedastic normal errors , the maximum likelihood estimates are the OLS estimates
(0) 
Inference on regression coefficients is based on normality of coefficient estimates, .
An unbiased estimate of can be obtained by changing the denominator (degrees of freedom) from to . The OLS estimates hold desirable properties in more general settings, e.g., by dropping the normality requirement. When regressors are stochastic, the OLS estimates and their variance estimates given above only have an asymptotic justification, and require independence of regressors and errors . When the basic assumptions are violated, sandwichtype or resampling variance estimates need to be used.
2.1 Power analysis and sample size determination in regression setting
Power analysis and sample size determination are statistical tasks of addressing, quantifying and controling type I error. In a typical power analysis problem, a null hypothesis, , and an alternative, , are formulated; a test statistic is selected, for which a critical region of level is specified. E.g., assuming that high values of the test statistic indicate disagreement between the data and the null, as is typical with or statistic tests common in regression models, the rejection region would have the form so that when the true value of the parameter . Finally, power analysis addresses the issue of Type II error, i.e., under the alternative . While the null hypothesis typically represents a simple hypothesis or a subset of reasonably small dimension, the alternative is necessarily complex. Hence researchers often formulate a measure of effect size and consider power analysis for parameter values under the alternative that are at least away from the specific value or the subset in an appropriate metric.
As the power to reject the null typically grows with the sample size , the task of sample size determination is to find the value that guarantees a given level of power . While the size of the test is often taken to be , the traditional type II error rate is leading to power.
While testing the location parameter of two populations, for example, the natural hypotheses might be vs. the (twosided) alternative , with relatively straightforward testing based on the Student distribution, linear regression models feature a variety of statistics that may be subject to testing and power analysis:

Test of overall fit: .

A version of this hypothesis can be formulated as . Depending on how easy or difficult it is to conduct inference on parameter estimates or regression sums of squares, one or the other may be preferred in applications.


An increase of overall fit: vs. .

Specific regression coefficients: the coefficient of the th explanatory variable is zero, vs. .

Linear hypothesis , which covers cases like:

Equality of two regression coefficient for the th and the th explanatory variable, , so

No impact of a set of variables : , so that is a subset of rows of a unit matrix, and is a zero vector of conforming dimension.


Tests on error variance , e.g. vs. .
3 Multiple matrix sampling
Multiple matrix sampling of a survey questionnaire consists of administering only a specific subset of items to a given respondent, out of all items this respondent is potentially eligible to be asked. The name stems from representation of the data with respondents as rows, and items as columns, so that matrix sampling concerns selecting specific entries in the matrix to be administered, rather than the full row as is typically done. The focus of the technique is on selecting items out of all the relevant ones that the respondent could be asked, with the potential skip patterns already taken into account. Similar or equivalent techniques are also known as partitioned designs and questionnaire splitting. The method originated in educational testing (Shoemaker, 1973), where it was first used to select items from a large pool of available ones. The educational testing companies have identified the need to implement multiple matrix sampling methods to protect the integrity of their data products, so that the students taking a standardized test are not able to get trained on a small subset of items known to be administered on standardized tests, thus biasing the estimates of achievement.
Given the relatively esoteric nature of the method, the existing publications have addressed some specific niche problems in matrix sampling (and the current paper is no exception). Gonzalez and Eltinge (2007) provided a review of matrix sampling and applications in Consumer Expenditure Quarterly Survey. Chipperfield and Steel (2009) put the problem of matrix sampling into a cost optimization framework, where proper subsets of items can be administered on up to forms at specific cost per item rates. They demonstrated that with two items (or groups of items), the split questionnaire best linear unbiased estimator (BLUE; also related to the GLS estimator) provides modest efficiency gains over a design in which all items are administered at once, and over a twophase design in which all items are administered to a fraction of respondents, and one subset of items is administered to the remainder of respondents. Merkouris (2010) extended their work to provide simplified composite estimation using the estimates based on the formspecific subsamples, where compositing is based on the secondorder probabilities of selection and the way they are utilized in estimating the variance of the HorvitzThompson estimator. Eltinge (2013) discussed connections to and relations with multiple frame and multiple systems estimation methods (e.g., integration of survey and administrative data, where administrative data may fill some of the survey items when available). We add to this literature by providing the asymptotic variancecovariance matrix of the coefficient estimates under matrix sampling of regressors, assuming that the outcome is always collected. We also discuss implications for power analysis and sample size determination.
3.1 A simple example
Consider the following matrix sampling design, in which the outcome is collected on every form, while the explanatory variables differ between forms.
Form  

1  +  
2  +  
3  + 
With this design, summaries (means, totals) of all the variables () can be obtained, and the bivariate relations between each of the regressors and the outcome can be analyzed. However, estimation of a multiple regression model requires estimability of all of the entries of the matrix, which this specific matrix sampling design does not provide.
To conduct regression analysis, we need to observe the crossentries of the matrix, which necessitates the following matrix sampling design.
Form  

1  +  +  
2  +  +  
3  +  + 
3.2 Parameter estimation under matrix sampling
Since the components of and/or necessary to obtain the OLS regression estimates may not be jointly available, more complex estimation strategies may need to be employed. We study two such strategies.
One possibility is to utilize structural equation modeling (SEM) with missing data, in which the marginal regression model of interest is formulated by using the regressors as exogenous variables, the dependent variable is introduced as the only endogenous variable explained by the model (Bollen, 1989), and the existing SEM estimation methods are applied (Yuan and Bentler, 2000; Savalei, 2010).
Alternatively, since the data are missing by design, and can be treated as MCAR, multiple imputation (Rubin, 1996; van Buuren, 2012, MI) can be used to fill in the missing values, with Rubin’s variance formulae used to combine MI estimates and provide inference. Of the several existing flavors of multiple imputation, one of the simplest strategies is imputation under multivariate normality (which we expect to behave in ways similar to the estimation methods for SEM with missing data under multivariate normality). A less modeldependent method is predictive mean matching (Little, 1988) in which a regression model is fit for each imputed variable, a linear prediction is obtained for each case with missing variable, and an imputation is made by choosing the value of the dependent variable from one of the nearest neighbors in terms of the linear prediction score.
4 Set up and notation
All of the derivations in this paper concern the joint matrix of the first and second order moments of the data:
(0) 
The maximum likelihood estimates of the coefficients in the regression of on (obtained, for instance, through SEM modeling using maximum likelihood estimates with multivariate normal missing data method; or approximated through multiple imputation) are obtained as
(0) 
where is the maximum likelihood estimator of the joint parameter matrix:
(0) 
where is the regression intercept by convention, so that , are the (estimated) means of the th explanatory variable, and is the estimated mean of .
To derive the likelihood, we need the formspecific submatrices obtained by multiplying the overall matrix by selector matrices. For instance, in Design 2 above, for the first form, the relevant covariance matrix is
(0) 
Matrices necessary to form and are defined in a similar way.
Define the unit selector vector that picks up the estimates of the means , which is the unit vector with 1 in the “zeroth” position corresponding to the intercepts in the parameter matrix . In addition to selecting the first order moments, define the unit selection vectors as the unit vector selecting the last row/column of corresponding to the parameters, and is a unit vector with 1 in the th position corresponding to the th variable (with the convention of indexing starting at zero). Then we observe that
(0) 
5 Likelihood and derivatives
5.1 Likelihood
Indexing the forms by , and observations within forms by , the likelihood can be written as
(0) 
where is the number of observations on which the th form is collected, and is the selector matrix corresponding to the th form.
Derivations of the asymptotic properties of the MLE estimate are based on the matrix differential (Magnus and Neudecker, 1999)
(0) 
After some tedious algebra, the following information matrix results.
(0)  
(0)  
(0) 
The zero expected crossderivatives indicate that the estimates of the multivariate normal means and the variancecovariance parameters are independent. (This may not be the case in general if the missing data mechanism coded by the matrices is not MCAR, and instead related to the data values.)
(0)  
(0)  
(0)  
(0) 
where is the th entry of the inverse of the formspecific covariance matrix; and indices can enumerate the explanatory variables and the response . As and are considered jointly multivariate normal at this point, there is no separation into dependent and explanatory variables.
Putting these entries together into a matrix, and using the standard maximum likelihood estimation theory results, the asymptotic variance of the maximum likelihood estimates of is given by
(0) 
5.2 The delta method derivation of the asymptotic variance of
Let us now return to the task of estimating the coefficients of the regression equation
via ( ‣ 4). The asymptotic variancecovariance matrix of can be obtained from the asymptotic covariance matrix of using the deltamethod, i.e., linearization of the relation ( ‣ 4):
(0) 
where the individual components of can be obtained from ( ‣ 5.1). Thus
(0) 
where the derivatives are with respect to the components of the vectorization , of which the last term is . By the standard multivariate deltamethod results (Newey and McFadden, 1994; van der Vaart, 1998),
(0) 
6 Example: Big Five Inventory
In our application, we wanted to analyze the relation between mental health outcomes and the Big Five personal traits:

Openness to experience (inventive/curious vs. consistent/cautious)

Conscientiousness (efficient/organized vs. easygoing/careless)

Extraversion (outgoing/energetic vs. solitary/reserved)

Agreeableness (friendly/compassionate vs. challenging/detached)

Neuroticism (sensitive/nervous vs. secure/confident)
These personal traits have been found in numerous studies to be related to academic performance, disorders, general health, and many other behaviors and outcomes. The standard Big Five scale consists of 44 items, some of which are reverse worded and reverse scored to minimize the risk of straightlining, and with items from different subscales mixed throughout the scales. Each item is a 5 point Likert scale with a clear midpoint.
In the population of interest, the Big Five traits are expected to have the following correlations, based on preceding research:
(0) 
We thus consider a regression model
where are subscale scores of the Big Five traits. Measurement error in these scores is ignored, although more accurate methods are available to account for it (Skrondal and Laake, 2001).
A balanced multiple matrix sampling design would consist of ten forms, each administering the outcome and two of the Big Five subscales:
Form  1  2  3  4  5  6  7  8  9  10 

O  +  +  +  +  
C  +  +  +  +  
E  +  +  +  +  
A  +  +  +  +  
N  +  +  +  +  
+  +  +  +  +  +  +  +  +  + 
7 Simulation 1: Parameter space exploration
In this simulation exercise, we explore the parameter space of regression coefficients to gauge the degree of variability of sample size determination results. Asymptotic variance resulting from ( ‣ 5.2) is used to obtain the sample sizes for the tasks outlined in Section 2.1. Simulation 1 consists of the following steps.

Population regression parameters are simulated from .

To provide the scale of the residual variance, the fraction of explained variance is set to , a moderate effect for behavioral and social science data, and the associated residual variance is calculated based on this value of .

The complete data variances stemming from ( ‣ 2) are recorded.

The multiplematrixsampled data variances stemming from ( ‣ 5.2) are recorded.

Sample size to reject the test of overall significance at 5% level with 80% power is recorded.

Sample size to detect an increase in by 0.01 (i.e., from 0.15 to 0.16), through a uniform multiplicative increase in the values of the regression parameters, keeping the residual variance constant, at 5% level with 80% power, is recorded.

Sample size to detect an increase in by 0.01 (i.e., from 0.15 to 0.16), through an increase in the value of the coefficient , keeping the residual variance constant and other regression parameters constant, at 5% level with 80% power, is recorded.
1,000 Monte Carlo draws of the vector, and subsequent analytical computation of asymptotic variances and power, were done. Results are presented graphically. Figure 1 presents the sample sizes obtained in steps 7–7 of the parameter exploration. A striking feature of the plot is wide variability of the sample sizes as a function of the specific configuration of parameters. While the lower limit of the sample size necessary to detect an overall increase in by is about , the median value is , the 95th percentile is , and the maximum (worst case scenario) identified in this simulation is . The patterns of the coefficients of the worst case scenarios typically indicate large coefficients of opposite signs of the positively correlated variables ( through ), or large coefficients of similar size of one of the positively correlated factors ( through ) and a high value of factor that is negatively correlated with all other subscales. This wide range of variability makes it difficult to provide a definite recommendation concerning the sample size for the study to the stakeholders. A conservative value based on a high percentile (80% or 90%) can be recommended, to protect against bad population values of regression parameters at the expense of a potentially unnecessary increase in costs.
Figure 2 presents the exploration distribution of the fraction of missing information due to the missing data. FMI for the intercept is generally low, below 0.2. FMI for regression slopes are generally high, in the range of about 70% to 80%. Given the structure of the missing data shown by the multiple matrix sampling design in Table 3, each of the predictor variables is observed in 40% of the data (informing the diagonal entries of the matrix), and each pairwise combination of the regressors is observed in 10% of the data (informing the offdiagonal entries). This yields an expected information loss for the predictor variables somewhere between 60% and 90%.
(a)  (b) 
(c)  (d) 
8 Simulation 2: Performance in finite samples
To study the performance of estimation methods based on SEM estimation with missing data, and on multiple imputation procedures, a simulation with microdata was also performed. For each simulation draw, the following steps were taken.

Sample size is set to (i.e., observations per form).

Multivariate nonnormal factor scores are simulated:

The nonnormal principal components of are simulated as
(0) (0) (0) so that each principal component has a mean of 0 and variance of 1, with all the underlying random variables being drawn independently of each other. The first component has a marginal exponential distribution with a heavy right tail, ensuring the overall skewness of each factor. The second component has a bimodal distribution with two exponential components and heavy tails. The remaining three components are normal.

The factor values are reconstructed as
(0) where is the eigenvalue decomposition of the target covariance matrix ( ‣ 6) of the Big Five factors.


The outcome is obtained as where the specific value of the residual variance was chosen to ensure that in the population.

The regression model with the complete data is fit to obtain the benchmark for FMI calculation.

The values of regressors were deleted in accordance with the multiple matrix sampling design in Table 3.

The normal theory based SEM model for missing data was fit; regression parameter estimates and their asymptotic standard errors based on the inverse Hessian were recorded.

complete data sets were imputed using multivariate normal imputation model.

The regression model was estimated using the first data sets, in accordance with the traditional recommendation regarding the number of imputed data sets. Regression parameter estimates and their asymptotic standard errors based on the Rubin’s rules were recorded.

The regression model was estimated using all of the data sets. Regression parameter estimates and their asymptotic standard errors based on the Rubin’s rules were recorded.

complete data sets were imputed using predictive mean matching imputation model for each of the missing variables.

The regression model was estimated using the first data sets, in accordance with the traditional recommendation regarding the number of imputed data sets. Regression parameter estimates and their asymptotic standard errors based on the Rubin’s rules were recorded.

The regression model was estimated using all of the data sets. Regression parameter estimates and their asymptotic standard errors based on the Rubin’s rules were recorded.
There were 1,200 Monte Carlo samples drawn.
Figure 3 reports the simulated distributions of the estimates of parameter . The population value of 0.3 is shown as a vertical line on the plot. As expected, the complete data regression model demonstrates higher efficiency. Estimates based on the multivarate normal methods are biased up, while those based on MI with predictive mean matching are biased down. Distributions of the estimates based on the multivariate normal methods are more spread out than the asymptotic variance based on ( ‣ 5.2), while those based on PMM MI are less spread out, with apparent efficiency gains extracted from higher moments of the data. The plots in Figure 3 are truncated, with about 3% of the Monte Carlo simulations outside the right range of the plot (the value of ), and about 1% of the Monte Carlo simulations outside the left range of the plot (the value of ) for each of the methods based on multivariate normality. Details for and other regression coefficient estimates are provided in Table 4.
Method  

Complete  0.3002  0.0016  0.0006  0.3002  0.0015 
data  [0.298,0.303]  [0.001,0.004]  [0.002,0.003]  [0.298,0.303]  [0.001,0.004] 
regression  0.0418  0.0408  0.0440  0.0414  0.0413 
SEM with  0.3277  0.0203  0.0130  0.3324  0.0096 
MVN  [0.320,0.336]  [0.028,0.013]  [0.022,0.004]  [0.324,0.340]  [0.003,0.017] 
missing data  0.1414  0.1356  0.1588  0.1429  0.1249 
MI using  0.3369  0.0253  0.0173  0.3393  0.0091 
MVN model,  [0.329,0.345]  [0.033,0.017]  [0.027,0.008]  [0.331,0.347]  [0.002,0.016] 
0.1390  0.1415  0.1645  0.1435  0.1259  
MI using  0.3430  0.0314  0.0208  0.3466  0.0109 
MVN model,  [0.334,0.352]  [0.040,0.023]  [0.031,0.011]  [0.338,0.355]  [0.003,0.018] 
0.1556  0.1507  0.1760  0.1531  0.1336  
MI using  0.2661  0.0356  0.0261  0.2666  0.0056 
PMM model,  [0.262,0.270]  [0.032,0.039]  [0.022,0.030]  [0.263,0.271]  [0.009,0.002] 
0.0707  0.0679  0.0758  0.0707  0.0631  
MI using  0.2678  0.0361  0.0251  0.2671  0.0043 
PMM model,  [0.264,0.272]  [0.032,0.040]  [0.021,0.029]  [0.263,0.271]  [0.008,0.001] 
0.0676  0.0656  0.0719  0.0665  0.0591  
Population  0.3  0  0  0.3  0 
0.0791  0.0856  0.0926  0.0824  0.0832 
Figure 4 provides the Monte Carlo distributions of the standard errors reported for the missing data methods. The dotted vertical line is the asymptotic standard error based on ( ‣ 5.2), 0.0791. The dashed lines are empirical means of the standard errors. All distributions are skewed with heavy right tails. The distributions of the standard errors based on multivariate data contain outliers outside the range of the plot (3% of the SEM with missing data results; 6% of the results for MI using the multivariate normal model with ; 8% of the results for MI using the multivariate normal model with ; the range of the plots is from 0 to the asymptotic standard error, 0.0791). Distributions of the standard errors for the multivariate normal methods are significantly higher that this asymptotic standard error, which reflects, to some extent, the greater variability of the estimates observed above in Figure 3 and Table 4. Distributions of the standard errors for the PMM MI method are significantly lower that the asymptotic standard error, which reflects, to some extent, the lower variability of the estimates based on this method. A higher number of multiple imputations vs. helps to stabilize the variance estimates, particularly in the case of PMM.
Coverage of the nominal 95% confidence intervals is analyzed in Table 5. Despite the shortcomings of both the point estimates and the standard errors noted above, things seem to balance out and provide confidence interval coverage fairly close to the target.
Method  

Complete data regression  95.5%  95.4%  95.1%  95.8%  97.3% 
SEM with MVN missing data  97.8%  98.6%  97.3%  98.7%  98.4% 
MI using MVN model,  93.3%  93.8%  93.0%  93.3%  96.8% 
MI using MVN model,  92.8%  93.4%  93.2%  93.1%  97.9% 
MI using PMM model,  94.4%  95.6%  94.5%  95.0%  94.2% 
MI using PMM model,  96.3%  96.6%  95.8%  96.8%  97.2% 
Estimated fractions of missing information reported by the software are shown on Figure 5. The dotted line is the value based on asymptotic variance, 73.6%. Dashed lines are the empirical FMI, based on the ratios of the Monte Carlo variance of based on a given missing data method to the variance of based on the complete data. The latter empirical FMI is greater than the theoretical one for the MI methods based on the multivariate normality assumption, and lower than the theoretical one for the PMM MI methods. The methods based on multivariate normality appear to underestimate FMI, as the distributions of the reported empirical FMI appear to the left of the true value (dashed line). The FMI that come out of PMM MI appear to be more accurate. An increase in the number of completed data sets from to helps to improve stability of the FMI estimates, making the distributions of the empirical FMI more concentrated.
9 Concluding remarks
This paper provides an analytical framework for analysis of regression models (and, more generally, other statistical methods that are based on the covariance matrices of observed items or scales) that allows for quick power analysis avoiding computationally intensive simulations.
Revisiting the initial motivation of burden reduction, the results are underwhelming. Is burden really reduced by multiple matrix sampling in the example considered? Out of five explanatory variables (based on approximately 8 survey items each) and one outcome, only three variables are collected on each of the matrix sampled instrument forms. This translates to about 50% burden reduction per respondent. However, given that the loss of information quantified by the fraction of missing information (FMI) is about 75â80%, the data collection sample sizes would need to be about 4–5 times larger compared to the traditional data collection of all items at once. Unless the response rate drops sharply by a factor of more than two due to the increase in questionnaire length, the total public burden is increased.
The sample sizes necessary to detect the required effect sizes in increased demonstrate long tails in the exploration of parameter spaces. These long tails make it difficult to plan for the worstcase scenarios associated with “unfortunate” regression parameter configurations. Should a specific decision need to be made based on the parameter explorations akin to those undertaken in Section 7, the tradeoff between the survey costs due to large sample sizes and risks of having an underpowered study should the coefficient estimates be found to have an “unfortunate” configuration should be carefully discussed with the survey stakeholders to find the most appropriate course of action.
We conducted a finite sample simulation with nonnormal data and several missing data methods, and determined that the methods that assume multivariate normality generally perform poorly, and generate a nonnegligible proportion of really bad outliers. In comparison, semiparametric multiple imputation by predictive mean matching with sufficiently large number of imputed data sets seem to work best.
Our work can be extended in a number of additional dimensions. The derivations of asymptotic variances are based on the working assumption of multivariate normality and using the inverse information matrix to estimate variances. With nonnormal data, the problem can be formulated in terms of estimating equations, and sandwich variance estimators should be formed. As our simulation demonstrated, asymptotic standard errors based on inverse information matrix are inadequate for the analysis methods that we used, leading to underestimates with misspecified normalitybased methods, and overestimates with a more accurate semiparametric method.
The current paper assumed independence of respondents. In practice, complex survey features such as strata, clusters, unequal probabilities of selection, and weight calibration would affect asymptotic properties of the estimates. In particular, the sandwich variance estimation will be required. Many practical survey statistics issues may also interact with multiple matrix sampling in unusual ways. How would differential nonresponse by form affect the results? What should we do when a stratum has fewer than two cases of a given form? These and other questions related to designbased inference would need to be answered when multiple matrix sampling is applied in practice.
Finally, in terms of ensuring adequate measurement properties, we note that psychometric properties are usually established and validated for scales, but not necessarily subscales that respondents are exposed to in multiple matrix sampling instruments. In particular, if the order of the items, or the degree of mixing of items from the different subscales of the Big Five Inventory is important for the validity of the scale and its subscales, these properties may be violated when shorter subscales are administered that require the respondent to answer similar questions more frequently.
References
 (1)
 Bollen (1989) Bollen, K. A. (1989), Structural Equations with Latent Variables, Wiley, New York.
 Chipperfield and Steel (2009) Chipperfield, J. and Steel, D. (2009), ‘Design and estimation for split questionnaire surveys’, Journal of Official Statistics 25, 227–244.
 Eltinge (2013) Eltinge, J. L. (2013), Integration of matrix sampling and multipleframe methodology, in ‘Proceedings of the 59th ISI World Statistics Congress’, Hong Kong, pp. 323–328.
 Gonzalez and Eltinge (2007) Gonzalez, J. M. and Eltinge, J. L. (2007), Multiple matrix sampling: A review, in ‘Proceedings of the American Statistical Association’, Survey Research Methods Section, American Statistical Association, Alexandria, VA, pp. 3069–3075.
 John and Srivastava (1999) John, O. P. and Srivastava, S. (1999), The Big Five trait taxonomy: History, measurement, and theoretical perspectives, in L. A. Pervin and O. P. John, eds, ‘Handbook of personality: Theory and research’, Guilford Press, New York, pp. 102–138.
 Little (1988) Little, R. J. A. (1988), ‘Missingdata adjustments in large surveys’, Journal of Business and Economic Statistics 6(3), 287–296.
 Magnus and Neudecker (1999) Magnus, J. R. and Neudecker, H. (1999), Matrix calculus with applications in statistics and econometrics, 2nd edn, John Wiley & Sons.
 Merkouris (2010) Merkouris, T. (2010), An estimation method for matrix survey sampling, in ‘Proceedings of the American Statistical Association’, Survey Research Methods Section, American Statistical Association, American Statistical Association, Alexandria, VA, pp. 4880–4886.
 Newey and McFadden (1994) Newey, W. and McFadden, D. (1994), Large sample estimation and hypothesis testing, in R. Engle and D. McFadden, eds, ‘Handbook of econometrics’, Vol. IV, North Holland, New York, chapter 36.
 Rubin (1996) Rubin, D. B. (1996), ‘Multiple imputation after 18+ years’, Journal of the American Statistical Association 91(434), 473–489.
 Savalei (2010) Savalei, V. (2010), ‘Expected versus observed information in SEM with incomplete normal and nonnormal data’, Psychological Methods 15(4), 352–367.
 Shoemaker (1973) Shoemaker, D. (1973), Principles and Procedures of Multiple Matrix Sampling, Ballinger Publishing Company, Cambridge, Massachusetts.
 Skrondal and Laake (2001) Skrondal, A. and Laake, P. (2001), ‘Regression among factor scores’, Psychometrika 66(4), 563–575.
 van Buuren (2012) van Buuren, S. (2012), Flexible Imputation of Missing Data, Interdisciplinary Statistics, Chapman and Hall/CRC, Boca Raton, FL.
 van der Vaart (1998) van der Vaart, A. W. (1998), Asymptotic Statistics, Cambridge University Press, Cambridge.
 Yuan and Bentler (2000) Yuan, K. H. and Bentler, P. M. (2000), ‘Three likelihoodbased methods for mean and covariance structure analysis with nonnormal missing data’, Sociological Methodology 30, 165–200.