Cryo-EM reconstruction of continuous heterogeneity by Laplacian spectral volumes

# Cryo-EM reconstruction of continuous heterogeneity by Laplacian spectral volumes

Amit Moscovich, Amit Halevi, Joakim Andén and Amit Singer Program in Applied & Computational Mathematics, Princeton University, Princeton, NJ Center for Computational Mathematics, Flatiron Institute, New York, NY Department of Mathematics, Princeton University, Princeton, NJ Equal contribution.
###### Abstract

Single-particle electron cryomicroscopy is an essential tool for high-resolution 3D reconstruction of proteins and other biological macromolecules. An important challenge in cryo-EM is the reconstruction of non-rigid molecules with parts that move and deform. Traditional reconstruction methods fail in these cases, resulting in smeared reconstructions of the moving parts. This poses a major obstacle for structural biologists, who need high-resolution reconstructions of entire macromolecules, moving parts included. To address this challenge, we present a new method for the reconstruction of macromolecules exhibiting continuous heterogeneity. The proposed method uses projection images from multiple viewing directions to construct a graph Laplacian through which the manifold of three-dimensional conformations is analyzed. The 3D molecular structures are then expanded in a basis of Laplacian eigenvectors, using a novel generalized tomographic reconstruction algorithm to compute the expansion coefficients. These coefficients, which we name spectral volumes, provide a high-resolution visualization of the molecular dynamics. We provide a theoretical analysis and evaluate the method empirically on several simulated data sets.

\usetkzobj

all

, , and

Keywords: single particle electron cryomicroscopy, heterogeneity, tomographic reconstruction, molecular conformation space, manifold learning, Laplacian eigenmaps, diffusion maps

## 1 Introduction

The function of biological macromolecules is determined not only by their chemical composition but also by their 3D configuration. Hence, accurately estimating these configurations is of great importance to the field of structural biology. Macromolecules may deform their structure, resulting in a continuum of possible configurations, known as conformations. It is not always possible to isolate different conformations and study each separately. As a result, practitioners often image a heterogeneous sample which is then “purified” computationally.

While X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy have been very successful in imaging homogeneous molecular structures to high resolution, they rely on aggregate measurements over an entire sample and are therefore ill-suited for imaging heterogeneous molecules. Single-particle electron cryomicroscopy (cryo-EM), on the other hand, produces a separate image for each individual molecule, opening the door to accurate determination of structural variability.

The cryo-EM process consists of rapidly freezing a solution containing the molecular sample and then imaging it using a transmission electron microscope. The electron detector captures a movie where each frame records the electron counts for every pixel. To reduce the effects of ionization damage—which destroys the fine structure of the molecules—the electron dose is kept low, resulting in exceptionally noisy images. See bottom row in Figure 1 for examples. Since each particle is randomly oriented with respect to the imaging plane, the resulting image contains projections of molecules from many random viewing directions. Each projection image is typically modeled as the line integral of the particle’s electric potential along the viewing direction, followed by convolution with a point spread function and the addition of noise [Frank2006, VulovicEtAl2013]. The goal of cryo-EM reconstruction is to invert this process and obtain an estimate of the molecular volume from its projection images. In recent years, better sample preparation techniques and improved detectors have led to reconstructions at a near-atomic resolution [Kuhlbrandt2014, AmuntsEtal2014, LiaoEtal2013, BartesaghiEtal2018].

The standard assumption in 3D reconstruction by cryo-EM is that of a homogeneous sample with no structural variability. Many methods for homogeneous 3D reconstruction have been proposed [Frank2006, BarnettGreengardPatakiSpivak2017, ChengEtal2015, MilneEtal2012, VinothkumarHenderson2016]. The prevalent methods are based on a Bayesian approach [Scheres2012a] which starts from some initial guess for the volume and then performs expectation-maximization (EM), alternating between estimating a latent distribution of viewing directions for every image and estimating the volume given these distributions. As discussed above, however, the homogeneous assumption does not hold in general. Resolving molecular structures with variability is known as the heterogeneity problem in single-particle cryo-EM. Two types of heterogeneity are typically considered: discrete and continuous.

### 1.1 Discrete heterogeneity

This is perhaps the simplest model for heterogeneity. In this model, it is assumed that the particles in the sample can be approximated by a finite number of fixed volumes. An example of a molecule that may be effectively modeled in this way is ATP synthase, an enzyme that acts as a molecular stepper motor and spends most of its time in one of three rotation angles [YasudaEtal1998].

Several software packages support reconstruction with discrete heterogeneity, also known as 3D classification in the cryo-EM community. These include RELION [Scheres2012b], cryoSPARC [PunjaniEtal2017], FREALIGN/cisTEM [LyumkisEtal2013, GrantRohouGrigorieff2018], and EMAN2 [TangEtal2007]. To recover distinct volumes, these methods assign, for each particle image, a latent distribution over the set . This is incorporated into the EM algorithm, similar to the latent distribution over the viewing directions.

### 1.2 Continuous heterogeneity

In this model, the molecular volumes in the sample vary continuously subject to the many constraints due to molecular bonds. If the number of degrees of freedom associated with the flexible motion is small then the space of molecular volumes forms a low-dimensional manifold (up to thermal vibrations). Figure 1 shows a simple molecular model with continuous heterogeneity that we use in our simulations. Here, the continuous motion is the free rotation of the top part around the vertical axis. In this case, the manifold of molecular volumes is diffeomorphic to the unit circle .

One approach for analyzing structural heterogeneity is to perform principal component analysis (PCA) of the 3D molecular structures represented as densities on an voxel grid. This idea goes back to [LiuFrank1995] and was further developed by [Penczek2002, PenczekEtal2006, PenczekFrankSpahn2006, PenczekKimmelSpahn2011, LiaoFrank2010]. These methods estimate the covariance matrix of the 3D volumes and compute its leading eigenvectors, known as eigenvolumes. One variant relies on a consistent least-squares estimator for the covariance [KatsevichKatsevichSinger2015, AndenKatsevichSinger2015, AndenSinger2018]. These methods may capture continuous heterogeneity—as illustrated by [AndenSinger2018]—but are ill-suited for high-resolution reconstruction, as we discuss in Section 3.3. A notable exception is the method proposed in [TagareEtal2015] that attempts to directly compute the leading eigenvectors, at high resolution, without estimating the entire covariance matrix.

A different approach is taken in [DashtiEtAl2014, SchwanderFungOurmazd2014, FrankOurmazd2016, DashtiEtal2018] and is based on diffusion maps, a non-linear dimensionality reduction method that is well-suited for recovering low-dimensional manifold structure [CoifmanEtal2005, CoifmanLafon2006]. These methods first cluster the projection images by their viewing direction and then compute a separate low-dimensional embedding for each cluster. All of these different embeddings are then aligned, yielding a global embedding of conformations. Sets of close points in the global embedding may then be used to reconstruct a 3D volume corresponding to a particular conformation. This approach faces two important challenges: first, unsupervised global registration of the embeddings is by itself a very challenging problem [WangMahadevan2009, CuiChangShanChen2014]; second, each individual embedding uses only a small subset of images from a particular viewing direction, which may be insufficient for accurate manifold recovery.

The RELION software package has also been recently extended to include multi-body refinement [NakaneEtal2018]. This method takes a segmentation of a 3D molecular reconstruction and attempts to refine each part separately from a static base model, with independent viewing directions and shift parameters for each part. Multi-body refinement, however, is limited to rigid variability and may fail to accurately reconstruct the interface between moving parts.

Other methods have been proposed based on normal mode analysis of the molecular structure reconstruction [JinEtal2014, SchilbachEtal2017]. However, the underlying harmonic oscillator model used in these methods may be too simple to describe sophisticated continuous variability such as structural deformations. See [SorzanoEtal2019] for a survey of methods for studying continuous heterogeneity using cryo-EM.

### 1.3 Our contribution

We present a new method for recovering continuous variability based on manifold learning. In contrast to the viewing-direction specific manifold estimates of [DashtiEtAl2014], our method directly approximates the global manifold of conformations from all projection images, regardless of their viewing direction.

Throughout this paper, we identify molecular volumes with their electric potential sampled on a 3D voxel grid of dimension . Under the continuous heterogeneity model, a single molecule corresponds to an embedded submanifold of . This manifold is the range of a smooth function that maps a set of conformation parameters to a volume. A standard technique for approximating smooth functions on manifolds is by series expansion in Laplacian eigenfunctions. This technique generalizes the familiar Fourier series expansion in Euclidean space. However, to apply it we need to have the Laplacian eigenfunctions. This is a “chicken and egg” problem: The computation of the Laplacian eigenfunctions requires the distribution of 3D volumes, which is the very thing we would like to estimate. To resolve this problem, we use the covariance-based approach [AndenSinger2018] to obtain low-resolution estimates of the 3D volumes. These reconstructions are then used to form an empirical graph Laplacian whose eigenvectors with lowest eigenvalues are used in lieu of the unknown Laplacian eigenfunctions. Then we compute a set of expansion coefficient vectors , which we refer to as spectral volumes. Together, they define a high-resolution 3D reconstruction for each projection image:

 ^x_s:=∑_ℓ=0r−1^α(ℓ)^ϕ(ℓ)s, (1)

To compute the expansion coefficients , we formulate a novel generalized tomographic reconstruction problem posed as a 3D deconvolution, similar to [WangShkolniskySinger2013]. The convolution kernel is computed efficiently using a non-uniform fast Fourier transform (NUFFT) [DuttRokhlin1993, GreengardLee2004] and the solution is computed using the conjugate gradient method, leveraging the fast Fourier transform (FFT) for the application of the convolution. These computational details are key for scaling up to high-resolution. See Figure 2 for a diagram of the main steps which constitute our method.

###### Remark 1.

The eigenvectors of the Laplacian can be used not only for function representation but also for non-linear dimensionality reduction (e.g. [BelkinNiyogi2003, CoifmanEtal2005]). In our case, they define an embedding of the low-resolution reconstructions that is useful for visualizing the underlying manifold of conformations. In Figure 5 we show the two-dimensional embedding of the ChannelSpin dataset using the second and third eigenvectors. Despite the high noise levels, the underlying circular manifold of motions is recovered.

###### Remark 2.

The spectral volumes have the same dimensionality as the high-resolution volumes that we reconstruct. They may therefore be visualized as 3D molecular volumes, albeit with negative values as well as positive. These visualizations provide insight regarding the range of motions of the molecule. See Figure 6 for examples and Section 5 for an asymptotic analysis of the spectral volumes.

Section 2 defines the forward model and formulates the inverse problem for continuous heterogeneity in cryo-EM. We describe our method in Section 3, including the generalized tomographic reconstruction from noisy projection images. Section 4 outlines the algorithms used and their computational complexity. In Section 5 we prove the convergence of the spectral volumes and high-resolution reconstructions under the manifold assumption. Finally, we present results on synthetic datasets in Section 6.

## 2 Problem formulation

We begin by describing the forward model for cryo-EM and then define the inverse problem that we wish to solve, first by considering the simpler case without heterogeneity and then by generalizing to the case of continuous heterogeneity.

### 2.1 Forward model

A sample of many identical molecules is prepared in a solution and then rapidly frozen, forming a thin sheet of vitreous ice which is then imaged using a transmission electron microscope. The resulting image is a measurement of the electrostatic potential of this thin sheet, integrated along the direction perpendicular to the imaging plane. The individual molecules, known as “particles” in the cryo-EM literature, are all captured in different orientations.

For every molecule in a particular 3D conformation, there is a corresponding real-valued electrostatic density map which we simply refer to as the volume and discretize it on an grid of voxels. We now describe the data generation model. First, the volumes are drawn i.i.d. from some distribution on which describes the structural variability of the molecule. Then, linear imaging operators are drawn i.i.d from some distribution. These operators are the composition of a volume rotation operator tomographic projection onto the imaging plane, and convolution with a point spread function. The individual particle images are formed by

 y_s=P_sx_s+ε_s∀s=1,2,…,n, (2)

where are noise terms. For simplicity, we assume that . The cryo-EM forward operator also includes an in-plane shift after the projection and filtering. In our pipeline, this is estimated and corrected for during the classical reconstruction stage (Figure 2, step (i)).

We consider the volumes as functions , where , is the grid for even values of (a similar grid may be defined for odd ). Similarly, the images are functions . To define the imaging operators we must define the tomographic projection operation. One approach to this is in terms of line integrals perpendicular to the projection plane but since the volumes lie on a discrete grid one must incorporate an interpolation scheme. An alternative is to express tomographic projection in the Fourier domain. Let be a -dimensional signal on , its discrete Fourier transform (DFT) is given by

 (F_ds)(k):=∑_u∈M_Nde−2πi⟨k,u⟩s[u]∀k∈Rd (3)

where is a wave vector that corresponds to a particular directional frequency. By the Fourier slice theorem, a tomographic projection along the axis in the spatial domain is equivalent to a restriction to the - plane in the Fourier domain [Natterer2001]. We use this fact to express the projection image in the Fourier domain as follows:

 (F_2P_sx_s)([k_1,k_2]T)=(F_3x_s)(R_s−1[k_1,k_2,0]T)⋅(F_2h_s)([k_1,k_2]T). (4)

where is a wave vector in the resulting 2D projection image, is the rotation of particle number and is the point-spread function whose Fourier transform is known as the contrast transfer function (CTF). See Section 2 of [AndenSinger2018] for more details on the forward model.

### 2.2 Inverse problem

Homogeneous case.  The traditional inverse problem in single-particle cryo-EM assumes that all of the molecular volumes in the sample are identical. Thus, the forward model (2) simplifies to

 y_s=P_sμ+ε_s∀s=1,2,…,n, (5)

where is a mean volume. Suppose the orientations and CTFs are known so that we have the imaging operators . Furthermore, suppose that the images are centered (i.e. in-plane shifts have been accounted for). Then for a white Gaussian noise model, the maximum-likelihood estimate of is the solution to the following least-squares problem:

 ^μ=argminμ∈RN3∑_s=1n∥∥y_s−P_sμ∥∥2. (6)

This problem and regularized variants of it are not well-posed in general, with the condition number depending on the distribution of the viewing angles, the CTFs, and the desired resolution of the reconstruction. Nevertheless, high accuracy solutions are routinely obtained using cryo-EM software packages. [Scheres2012b, PunjaniEtal2017, GrantRohouGrigorieff2018, TangEtal2007].

Continuous heterogeneity. Our main goal when analyzing a heterogeneous sample is to estimate the density of volumes associated with a given molecule. We approach this problem by performing reconstructions of the individual volumes . Clearly, estimating voxel values from merely noisy measurements is an ill-posed problem and much harder than the homogeneous problem, where only a single volume of voxels needs to be estimated. In this paper we make two main assumptions: The first is that the molecular volumes in the sample lie near a low-dimensional manifold. This model is natural since many heterogeneous macromolecules only have a few degrees of freedom that describe their range of motions [DashtiEtAl2014, SchwanderFungOurmazd2014, FrankOurmazd2016, DashtiEtal2018]. Varying these degrees of freedom traces out a smooth, low-dimensional manifold . The second assumption is that the imaging operators can be accurately estimated using standard cryo-EM reconstruction tools. This is the case when the molecule contains a large fixed component and a smaller heterogeneous part. A good indication that this is indeed the case for a particular dataset is when the reconstruction of the mean volume has a high resolution in some regions and lower resolution in others.

In the next section, we explain how we combine these assumptions with spectral techniques for function approximation on low-dimensional spaces to reconstruct all of the volumes in a heterogeneous molecular sample.

## 3 Methods

In this section, we describe our spectral approach to the reconstruction of molecular samples with continuous heterogeneity. Our approach is based on the representation and approximation of molecular volumes using an orthogonal basis expansion of eigenfunctions. By expanding the molecular volumes in this basis and imposing the projection constraints we obtain a generalized spectral formulation of the cryo-EM reconstruction problem.

### 3.1 Manifold spectral representation

Our method builds on the output of a low-resolution reconstruction method [AndenSinger2018] that we describe in Section 3.3. In this method, each reconstructed volume is a linear combination of PCA eigenvolumes, hence it defines some mapping where is the vector of eigenvolume coefficients corresponding to a low-dimensional representation of . In what follows, we ignore potential ambiguities due to the projection and consider the low-resolution reconstruction as a linear dimensionality reduction of the underlying volume Since we assumed the underlying manifold of volumes is -dimensional, then if the image of this mapping is some compact domain that is a -dimensional immersed manifold.

In what follows we consider the approximation of smooth functions on general domains via eigenfunctions of the Laplacian operator. We briefly review some relevant facts [GrebenkovNguyen2013]. The Laplacian has a set of real eigenfunctions that form a complete orthonormal basis of with corresponding non-negative eigenvalues . The smoothness of is controlled by , which corresponds to the spatial frequency of . Consequently, the eigenfunctions with lowest eigenvalues form a natural basis for approximating smooth functions on . In fact, this basis is optimal for the approximation of smooth functions with bounded gradient magnitudes [AflaloBrezisKimmel2015]. The idea of using Laplacian eigenfunctions for approximation and regression over arbitrary domains is a generalization of the classical approach for signal representation by Fourier series expansion [GreblickiPawlak1985].

Let us therefore consider the basis formed by the first eigenfunctions . Fix a voxel and consider its associated restriction function . We may approximate this function using low-frequency eigenfunctions

 x[u]≈∑_ℓ=0r−1α_u(ℓ)ϕ(ℓ)(β(x)), (7)

where is the image of in PCA coordinates. This can be written more succinctly by aggregating the coefficients for all voxels into a single volume, yielding

 x≈∑_ℓ=0r−1α(ℓ)ϕ(ℓ)(β(x)),∀β∈B. (8)

We call the coefficient vectors spectral volumes. Note that the above construction does not rely on a voxel-wise representation of the volumes as the same type of expansion can be done for volumes represented in any spatial basis.

The eigenfunctions are unknown, so we employ a widely used technique from the field of manifold learning, replacing them with estimates given by eigenvectors of a data-driven graph Laplacian. More specifically, we build a weighted undirected graph, where the vertices correspond to the projection images and the edge weights are estimates of the affinity between the underlying molecular conformations. In our case, the affinities are computed from the low-resolution reconstruction coordinate described in Section 3.3. We then form the symmetric normalized graph Laplacian and compute its eigenvectors with the lowest eigenvalues,

 ϕ^ϕ(0),…,ϕ^ϕ(r−1)∈Rn. (9)

See Section 4.1 for the specific algorithms used for forming the graph and computing these eigenvectors. As we explain in Section 5.3, we may assume that these estimates converge to the eigenfunctions in the sense that

 ^ϕ(ℓ)s≈1√nϕ(ℓ)(β_s)∀s=1,2,…,n, (10)

where the factor is needed for proper normalization, so that

 ∑_s=1n(^ϕ(ℓ)s)2=1. (11)

We can now write a data-driven variant of the spectral expansion in (8),

 x_s≈√n∑_ℓ=0r−1α(ℓ)^ϕ(ℓ)s∀s=1,2,…,n. (12)

In the next section we explain how we estimate the coefficients of this expansion.

### 3.2 Generalized tomographic reconstruction

We assume that the molecular orientations can be accurately estimated using standard methods for homogeneous cryo-EM reconstruction [PenczekKimmelSpahn2011, LiaoHashemFrank2015], so that the projection operators are estimated to high accuracy. By applying the imaging matrix to both sides of (12) and plugging in the forward model (2), we obtain

 y_s≈√n∑_ℓ=0r−1(P_sα(ℓ))^ϕ(ℓ)s∀s=1,2,…,n. (13)

We seek spectral volumes that minimize the squared error

 (^α(0),…,^α(r−1)):=argmin{α(0),…,α(r−1)}∑_s=1n∥∥∥y_s−√n∑_ℓ=0r−1(P_sα(ℓ))^ϕ(ℓ)s∥∥∥2. (14)

The minimizer can be calculated efficiently by forming the normal equations and solving them using the conjugate gradient method. See Section 4.2 for more details on the numerical solution of this minimization problem. Note that in contrast to the low-resolution PCA eigenvolumes, the spectral volumes are at the full resolution . Our high-resolution reconstructions of the molecular volumes are now given by

 ^x_s=√n∑_ℓ=0r−1^ϕ(ℓ)s^α(ℓ)∀s=1,2,…,n. (15)

This estimator generalizes the least-squares estimator (6) for a single mean volume to multiple volumes whose contribution to the reconstructed volumes is given by the Laplacian eigenvectors defined in Eq. (9).

### 3.3 Low-resolution reconstruction

While the approach outlined above provides a recipe for computing the eigenvectors and using them to obtain high-resolution volume estimates, a crucial ingredient is missing still: the graph weights . We would like them to approximate an affinity of the underlying molecular volumes.

Several approaches have been proposed for computing affinities between projection images of heterogeneous ensembles. One of the earliest was to compute affinities using a common-line distance [HermanKalinowski2008], without estimating the relative orientations. This procedure finds the best common-line correspondence out of all candidate common lines, resulting in very noisy affinity estimates. To reduce the noise one can first estimate the orientations of the projection images and then compute the common line distance based on the relative orientation. This was proposed in [ShatskyEtal2010], however, the resulting affinity measure is still very noisy, so the authors first performed 2D class averaging within each set of projection images from the same orientation. However, this may average different conformations together.

We define the affinity to be the Euclidean distance between the low-resolution reconstructions, obtained using the covariance estimation method [AndenSinger2018]. This approach achieves robustness to noise without averaging different conformations together. We now briefly describe their method. The first step is to estimate the mean of the distribution of molecular volumes. This is done by taking the derivative of Equation (6) with respect to and setting it equal to zero. This yields the normal equations

 1n(∑_s=1nP_sTP_s)^μ=1n∑_s=1nP_sTy_s. (16)

This formulation corresponds to the maximum-likelihood estimator of in the setting of Gaussian white noise. As a consequence, is a consistent estimator [KatsevichKatsevichSinger2015]. A similar estimator for the covariance matrix is given by

 ^Σ=argminΣ∈RN3×N3∑_s=1n∥∥(P_sΣP_sT+σ2I_N2)−(y_s−P_s^μ)(y_s−P_s^μ)T∥∥2_F. (17)

While not a maximum-likelihood estimator, it is consistent under mild conditions [KatsevichKatsevichSinger2015]. Computing its normal equations yields a linear system in variables. Fortunately, this linear system can be reformulated as a deconvolution problem in six dimensions. Precalculating the convolution kernel requires operations, but it can then be applied with complexity . The equations can now be solved using the preconditioned conjugate gradient method. Empirically, it takes around iterations to converge [AndenSinger2018].

While more efficient than a naive approach, the algorithm outlined above still scales poorly in image size . As a result, this covariance estimation method is not currently practical for . Furthermore, from a simple dimensionality argument, to estimate the elements of from images of size , we need at least images. So to apply the algorithm to experimental data, we must first downsample the images from to . It is possible to gain insight on the structural variability using this approach, but the resulting reconstructions are of low-resolution.

After obtaining the mean and covariance estimates and , the volumes can be reconstructed by the PCA method introduced in [PenczekKimmelSpahn2011]. First, the eigenvectors, or eigenvolumes, of are extracted and arranged as columns in a matrix . They represent the principal directions of molecular volume variability in . Together with the estimated mean, they define an affine -dimensional subspace of of the form , where is a coordinates vector. Each image may then be associated with a volume in the affine subspace through [AndenSinger2018]

 (18)

where is the diagonal matrix of the leading eigenvalues of . The above estimator is the maximum a posteriori (MAP) estimator of the coordinates of for Gaussian distributions of and . It is also equal to the Wiener filter estimator and the linear minimum mean squared error estimator of the coordinates [Mallat2009, Kay1993].

Given the solutions to (18), we have a low-resolution estimate of each volume given by . We assume that the manifold structure of is not destroyed by the mapping of projection images to coordinate vectors in , hence that it is possible to invert this process and associate a unique molecular conformation with every low-dimensional reconstruction. If the intrinsic dimensionality of the conformation space is low and the volumes vary smoothly along this space then the inverse map can be approximated by a small number of spectral volumes.

## 4 Algorithms and computational complexity

In this section, we provide the technical details of our reconstruction method. In Section 4.1 we describe the precise methods used to form the graph Laplacian and compute its eigenvectors, and in Section 4.2 we describe the deconvolution-based solution of the generalized tomographic reconstruction problem (14).

### 4.1 Graph computations

To compute the PCA eigenvolumes, we begin by downsampling the input images to size , where is typically about 16. These images are then fed into the mean and covariance estimation pipeline described in [AndenSinger2018]. It has computational complexity . The condition number is of the order of . The top eigenvectors of the estimated covariance are computed and the -dimensional coordinates of each image are obtained via (18). This step has computational complexity , following the algorithm described in [AndenSinger2018]. A weighted undirected graph is then constructed with vertices that correspond to the images and edge weights calculated from the PCA coordinates . We tested two kinds of weight matrices:

1. Gaussian kernel weights .

2. Binary symmetric KNN matrices, whereby if and only if is one of the nearest neighbors of or vice versa, and otherwise.

In our preliminary experiments we obtained similar results with both choices. For our final results, we chose to use the symmetric KNN graph since it is sparse, which reduces the memory and computational costs. For the Laplacian matrix, we use the symmetric normalized graph Laplacian

 L:=D−1/2(D−W)D−1/2=I−D−1/2WD−1/2, (19)

where is a diagonal matrix that satisfies . The symmetry of permits the use of specialized algorithms for eigenvector calculation and guarantees that the resulting eigenvectors are orthogonal. See the tutorial by [Vonluxburg2007] for other common choices of weight and Laplacian matrices.

We build the KNN weights matrix using MATLAB’s knnsearch function which for low dimensions is based on a KDTree [FriedmanBentleyFinkel1977]. The running time of this part is where is the dimension of the PCA coordinates used in the low-resolution reconstruction. We then form the Laplacian matrix and compute its eigenvectors with lowest eigenvalues using MATLAB’s eigs function. This function implements the Krylov–Schur algorithm [Stewart2001]. The matrices and are stored as sparse matrices of average degree , hence their memory usage is . There exist newer methods of computing eigenvectors, such as the algebraic multigrid preconditioner used by the megaman manifold learning package [McQueenEtal2016, OlsonSchroder2018]. We did not incorporate such methods in the current work, as the eigenvector calculation step was not a bottleneck in our implementation.

### 4.2 Spectral volume estimation

Recall that the spectral volumes are defined in (14) as minimizers of the generalized tomographic reconstruction equation, (14). To find this minimum, we compute the gradient with respect to and set it to zero, obtaining the normal equations

 1√n∑_s=1n^ϕ(ℓ)sP_sTy_s=∑_m=0r−1∑_s=1n^ϕ(ℓ)s^ϕ(m)sP_sTP_sα(m)∀ℓ=0,1,…,r−1. (20)

We can rewrite the equation in vector notation by defining the vectors to be weighted backprojected images

 b(ℓ)=1√n∑_s=1n^ϕ(ℓ)sP_sTy_s , (21)

and to be an block matrix, with blocks of size . Each block is a weighted sum of projection-backprojection matrices, with its block given by

 K(ℓ,m)=∑_s=1n^ϕ(ℓ)s^ϕ(m)sP_sTP_s. (22)

By defining the vector to be the concatenation of and to be the concatenation of we can rewrite (20) as

 b=Kα. (23)

Since is of size , it would be very expensive to directly solve this equation using standard direct inversion algorithms such as those based on LU or Cholesky decomposition, since this would require operations. Even merely storing the matrix in RAM may be prohibitive. However, if we use an iterative solver such as the conjugate gradient method, we do not need to explicitly store the matrix so long as we have an efficient method to apply it. To this end, we draw on the work of [WangShkolniskySinger2013] and note that applying to a volume is equivalent to convolving that volume with a kernel calculated from and . A complication arises from the fact that the points in (4) do not lie on a regular grid, hence to evaluate the expression we need to compute Fourier amplitudes on a non-regular grid which cannot be achieved through the standard FFT. Instead, we use the FINUFFT non-uniform fast Fourier transform software package [BarnettEtal2019]. It has computational complexity where is the number of points at which the transform is computed. Here, , as both and are of size , and we consider instances of projection images. We must compute the convolution kernel that corresponds to for each of the -pairs, and for each . Thus, the total time to calculate the convolution kernels of all the blocks of is . The backprojected images vector is also calculated from , , and using a non-uniform FFT at a total computational cost of .

Each step of the conjugate gradient method involves applying the forward operator as well as performing several vector dot products and additions. Applying the forward operator is done using FFT operations of size , which has a total complexity of . The complexity of the conjugate gradient method is thus , where is the condition number of , since the conjugate gradient method converges in steps [GolubVanLoan2013, TrefethenBau1997]. In conclusion, the total runtime for solving the normal equations (20) is . For our synthetic data sets ChannelSpin and ChannelStretch, using spectral volumes we found that is of the order of . See Section 6.3 for empirical runtimes on these data sets.

###### Remark 3.

The running time may be reduced by computing an approximation to . In the proof of Theorem 1 we show that in probability. We can thus approximate by setting the off-diagonal blocks to zero and setting the diagonal blocks to the empirical estimate of

 K(ℓ,ℓ)=1n∑_s=1nP_sTP_s. (24)

With this approximation, the time to approximate reduces to and is now dominated by the computation of . The time to multiply vectors by is , so the total runtime drops by a factor of to .

## 5 Theory

In this section, we analyze the solution to the generalized tomographic reconstruction as defined in (14), starting with a simplified special case.

### 5.1 Warmup: Spectral volumes without projections

We first analyze the solution in an easy setting where the imaging operators are all equal to the identity matrix. That is, we have direct, albeit noisy, measurements without projections and point spread function. This case is directly applicable for reconstructing a manifold of 2D images, as we later demonstrate in Section 6.1. In this setting, the spectral volumes minimize

 ∑_s=1n∥∥∥z_s−√n∑_ℓ=0r−1^ϕ(ℓ)sα(ℓ)∥∥∥2. (25)

In this sum, each voxel can be considered separately, giving

 ^α(ℓ)[u]=argminα(ℓ)[u]∑_s=1n∣∣∣z_s[u]−√n∑_ℓ=0r−1^ϕ(ℓ)sα(ℓ)[u]∣∣∣2%. (26)

For a symmetric graph Laplacian , the eigenvectors form an orthonormal set. Hence the coefficient is given by an orthonormal projection of onto

 ^α(ℓ)[u]=1√n∑_s=1n^ϕ(ℓ)sz_s[u]=1√n∑_s=1n^ϕ(ℓ)s(x_s[u]+ε_s[u]), (27)

or, in vector form,

 ^α(ℓ)=1√n∑_s=1n^ϕ(ℓ)s(x_s+ε_s)=1√n∑_s=1n^ϕ(ℓ)sx_s+N(0,σ2nI_N2). (28)

The last equality stems from the fact that the noise terms satisfy . Consequently, the spectral volumes in this simplified model are, up to a noise term, orthogonal projections of the true volumes onto the basis of Laplacian eigenvectors. In the next subsection, we show that this is also the case when tomographic projections are incorporated into the model.

### 5.2 Spectral volumes with projections

We now consider the full forward model with non-trivial imaging operators . First note that in our model, the images and the imaging operators are random vectors, therefore the Laplacian eigenvectors are also random vectors. For our analysis, we make the following two assumptions:

###### Assumption 1.

Let be an image, drawn according to the forward model (2). Then its Laplacian eigenvector coordinates are independent of .

In other words, the Laplacian eigenmap (or diffusion map) coordinates are independent of the viewing direction and CTF of the particle. We can justify this assumption by assuming that the covariance-based method of [AndenSinger2018] performs accurate low-resolution reconstruction, regardless of the viewing angle.

###### Assumption 2.

For any , the following sum converges in probability:

 max_ℓ∈{0,…,r−1}∑_s=1n(^ϕ(ℓ)s)4→0. (29)

That is, for any , the probability that tends to zero as .

Note that from the normalization constraint , unless the energy of the eigenvectors is highly concentrated, we expect to have and thus , in which case Assumption 2 holds. Under standard assumptions the Laplacian eigenvectors converge to limiting eigenfunctions of some differential operator. As we show in Section 5.3, if the eigenfunctions are bounded and this spectral convergence holds then Assumption 2 follows.

Before stating our main result, we recall big-O in probability notation for stochastic boundedness: a sequence of random variables satisfies if for every , there is some bound such that . We now state our main result which characterizes the estimated spectral volumes up to a stochastically bounded error.

###### Theorem 1.

(Spectral volume convergence) Let be an eigenvector of the symmetric graph Laplacian described in Section 4.1. Under Assumptions 1 and 2, the spectral volumes as defined in (14) satisfy

 ^α(ℓ)=E[1√n∑_s=1n^ϕ(ℓ)sx_s]+OP(1√n), (30)

where the expectation is taken with respect to the random draw of projection images as described in Section 2.1.

The proof is in A.

### 5.3 Convergence of the reconstructed volumes

Consider the graph Laplacian eigenvectors computed from the low-resolution reconstruction coordinates . Several variants of the discrete graph Laplacian are known to converge to a continuous linear operator on . This convergence is not only pointwise, but also spectral, meaning that the eigenvectors of the graph Laplacian converge to the eigenfunctions of this operator [VonluxburgBelkinBousquet2008, RosascoBelkinDevito2010, LeeIzbicki2016]. In particular cases, the limiting operator is the continuous Laplacian, but more generally it is a weighted Laplacian operator, or Fokker–Planck operator, which has an additional drift term towards, or away from, high-density regions [CoifmanLafon2006, NadlerLafonCoifmanKevrekidis2005, TingHuangJordan2010]. For our theoretical analysis we only need spectral convergence towards some set of eigenfunctions, not necessarily the Laplacian eigenfunctions. We formulate this requirement in the following assumption. For simplicity, we ignore possible eigenvalue multiplicities.

###### Assumption 3.

The domain is compact and the eigenvectors of the graph Laplacian, ordered by their eigenvalues, converge in probability to a set of eigenfunctions of some continuous linear differential operator on , in the sense that

 sup_s=1,…,n|√n^ϕ(ℓ)s−ϕ(ℓ)(^β_s)|→0. (31)

Furthermore, form an orthonormal set with respect to the measure ,

 ⟨ϕ(ℓ),ϕ(m)⟩=∫_Bϕ(ℓ)(β)ϕ(m)(β)dν(β)=δ_ℓ,m. (32)
###### Remark 4.

Under this assumption, the eigenfunctions have upper bounds, which we denote as . This is due to the fact that they are continuous functions on a compact domain. It follows that,

 ∑_s=1n(^ϕ(ℓ)s)4→∑_s=1n(1√nϕ(ℓ)(^β_s))4≤1nU_ℓ4. (33)

Thus, Assumption 2 follows from Assumption 3.

###### Remark 5.

The term in (31) is necessary for the eigenvector normalization, since

 ∑_s=1n(^ϕ(ℓ)s)2→∑_s=1n(1√nϕ(ℓ)(^β_s))2→∫_B(ϕ(ℓ)(β))2dν(β)=1. (34)

By Propositions 2.1 and 2.2 of [KatsevichKatsevichSinger2015], the low-resolution mean and covariance estimates are consistent. However, unlike these aggregate quantities, the PCA coordinates are computed from a single image, so they must contain an irreducible error term due to the finite noise level. We codify this in the following assumption.

###### Assumption 4.

The estimated PCA coordinates are correct up to some stochastically bounded noise term,

 ^β_s=β(x_s)+O_P(1). (35)

We now show that, up to noise, the spectral volumes are merely voxel-wise orthogonal projections of the true volumes onto a basis of eigenfunctions.

###### Corollary 1.

Under Assumptions 1, 3 and 4 it follows from Theorem 1 that

 ^α(ℓ)=E[ϕ(ℓ)(β(x)+O_P(1))x]+OP(1√n) (36)

where the expectation is with respect to the distribution of .

So far we have treated the convergence of the spectral volumes. We now turn to the convergence of the high-resolution reconstructions. As discussed in Section 3.1, we assume that the manifold of the molecular volumes can be well approximated by a small number of eigenfunctions. We define this notion precisely in the following assumption.

###### Assumption 5.

There is a set of spectral volumes and a non-negative function that satisfies such that bounds the approximation of by spectral volumes:

 ∥∥∥x−∑_ℓ=0r−1α(ℓ)ϕ(ℓ)(β(x))∥∥∥=O(h(r))∀x∈M. (37)

In that case, we can prove that the true volumes are recovered up to noise.

###### Theorem 2.

Consider a sample from a manifold that conforms to (37), then it follows from Assumptions 1, 3, 4 and 5 that as we have

 ^x_s=x_s+∑_ℓ=0r−1O_P(C_ℓ)α(ℓ)+O_P(h(r))(∑_ℓ=0r−1ϕ(ℓ)(β(x_s))+O_P(C_ℓ)), (38)

where is an upper bound on the norm of the gradient of .

The proof of this theorem is in A. Note that the first error term contains an irreducible error from the finite level of noise in the PCA coordinate assignment.

## 6 Results

In this section, we apply our method to several synthetic datasets with a low-dimensional conformation space. We first consider clean images of a clock face with a rotating hand, then more realistic datasets of molecular volumes with one- and two-dimensional motions.

### 6.1 Clock dataset

We begin with a toy model of a 2D clock face with a single moving hand. Since the objects we wish to reconstruct are images rather than volumes, no projections are involved. This is the setting studied in Section 5.1. The simulated dataset is comprised of noisy images with , where each image shows the clock hand at a random angle with additive Gaussian noise. The affinity matrix was chosen to be where is the standard deviation of the Gaussian noise. We then constructed a normalized graph Laplacian, extracted its eigenvectors, and computed spectral volumes by solving the least-squares problem of (25). Figure 3 shows representative input images and their corresponding reconstructions. Figure 4 shows the estimated spectral volumes (spectral images in this case). We need a large value of to get good reconstructions, since the clock hand has sharp discontinuities.

To analyze this example we note that the clock dataset has the manifold geometry of the unit circle . Ignoring an arbitrary phase offset, the set of real eigenfunctions of the Laplace–Beltrami operator on are and for all integer , and . It follows from (28) that

 ^α(ℓ)→E[ϕ(ℓ)(z)z]=∫_z∼