Stable, Robust and Super Fast Reconstruction of Tensors Using Multi-Way Projections
Abstract
In the framework of multidimensional Compressed Sensing (CS), we introduce an analytical reconstruction formula that allows one to recover an N-th-order data tensor from a reduced set of multi-way compressive measurements by exploiting its low multilinear-rank structure. Moreover, we show that an interesting property of multi-way measurements allows us to build the reconstruction from compressive linear measurements taken in only two selected modes, independently of the tensor order N. In addition, it is proved that, in the matrix case and in a particular case with 3rd-order tensors where the same 2D sensing operator is applied to all mode-3 slices, the proposed reconstruction is stable, in the sense that the approximation error is comparable to the one provided by the best low multilinear-rank approximation, where δ is a threshold parameter that controls the approximation error. Through the analysis of the upper bound of the approximation error we show that, in the 2D case, an optimal value for the threshold parameter δ exists, which is confirmed by our simulation results. On the other hand, our experiments on 3D datasets show that very good reconstructions are obtained using δ = 0, which means that this parameter does not need to be tuned. Our extensive simulation results demonstrate the stability and robustness of the method when it is applied to real-world 2D and 3D signals. A comparison with state-of-the-art sparsity-based CS methods specialized for multidimensional signals is also included. A very attractive characteristic of the proposed method is that it provides a direct computation, i.e. it is non-iterative, in contrast to all existing sparsity-based CS algorithms, thus providing super fast computations, even for large datasets.
I Introduction
During the last few years there has been increased interest in Compressed Sensing (CS), whose aim is the reconstruction of signals from a set of measurements that is much smaller than the original signal size. Thus, instead of acquiring a potentially large signal and then compressing it, CS suggests that signals can be compressively sampled, reducing the number of measurements. More specifically, in standard CS [1, 2], a signal x ∈ R^N, which is assumed unavailable, is reconstructed from a reduced set of linear projections y = Φx ∈ R^M (M ≪ N), where the sensing matrix Φ ∈ R^{M×N} is typically random or composed of a few selected rows of the Fourier transform matrix [3]. In order to make the CS problem solvable, it is necessary to exploit a priori information about the signal of interest by imposing some constraints. For example, it is widely assumed that the signal is compressible in a known wavelet basis (dictionary). In other words, it is assumed that every signal admits a sparse representation on a given dictionary, i.e. a combination of only a few of its elements, called atoms. Within the scope of these sparse models, many efficient CS algorithms were developed to reconstruct signals from compressive measurements, involving iterative refinements of the solution by means of greedy algorithms or by minimizing the ℓ1 norm of the solution (see [4] for an up-to-date summary of algorithms). These algorithms have found many applications in diverse fields such as medical imaging, surveillance, machine learning, etc. [5].
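As a toy illustration of this pipeline (a sketch only: the problem sizes, the Gaussian sensing matrix and the use of Orthogonal Matching Pursuit as the greedy solver are our own illustrative choices, not prescriptions of this paper), a sparse vector can be recovered from M < N random projections:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 100, 50, 4          # signal size, number of measurements, sparsity

x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)  # k-sparse signal
Phi = rng.standard_normal((m, n)) / np.sqrt(m)               # random sensing matrix
y = Phi @ x                                                  # compressive measurements

# Orthogonal Matching Pursuit: greedily build the support of x
support, r = [], y.copy()
for _ in range(k):
    support.append(int(np.argmax(np.abs(Phi.T @ r))))        # most correlated atom
    coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
    r = y - Phi[:, support] @ coef                           # update the residual

x_hat = np.zeros(n)
x_hat[support] = coef
assert np.allclose(x_hat, x, atol=1e-6)                      # exact recovery
```

With these dimensions a random Gaussian matrix satisfies the usual recovery conditions with overwhelming probability, so the greedy solver finds the exact support.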
I-A Multidimensional CS
Most of the development of CS has focused on problems involving 1D signals or 2D image data encoded in vectors. However, many important applications involve higher-dimensional signals, i.e. tensors. Some data sources are readily generated as tensors, such as hyperspectral images, videos, 3D light field displays [6], Magnetic Resonance Imaging (MRI) [7], etc.; in other cases, tensors can be synthetically created by a rearrangement of lower-dimensional data structures or by mathematical construction [8]. In some applications, as in materials science [9] or in scientific computation [10], the exponential increase in memory and time requirements as the number of dimensions grows makes it impossible to work with full datasets, and models with few parameters must be used. Such models are referred to as tensor decompositions, which can be obtained by making a few inspections of the full datasets or by taking compressive measurements.
Recently, the Kronecker-CS model [11] has been proposed in order to provide a practical implementation of CS for higher-order tensors by exploiting their multidimensional structure. This model explicitly assumes that multidimensional signals have sparse representations using separable dictionaries, usually known as Kronecker dictionaries. Kronecker bases are well known and widely used in image processing; for example, given a 3D image, its associated dictionary is D = D3 ⊗ D2 ⊗ D1, where Dn (n = 1, 2, 3) are small dictionaries associated with columns (mode-1), rows (mode-2) and tubes (mode-3), respectively. Moreover, when working with multidimensional signals, the Kronecker structure also arises naturally in the physical implementation of the sensing devices, since they can operate on different dimensions or modes of the signal independently, through separate sensing matrices [11, 12]. This Kronecker structure of the sensing operator/dictionary is equivalent to applying a constrained Tucker model [13] and made it possible to implement relatively fast and practical algorithms based on sparsity structures, e.g., on hyperspectral 3D images and video data through an ℓ1-norm minimization algorithm [11]. More recently, greedy algorithms especially designed to take advantage of the Kronecker structure and block sparsity of the representations were proposed in [13] and applied to a variety of multidimensional signal processing problems such as MRI, hyperspectral imaging and multidimensional inpainting [14]. Also, in [15], the authors developed generalized tensor compressed sensing algorithms by exploiting the Kronecker structure in a similar way as done in [13, 16] but using an ℓ1-minimization approach.
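The equivalence between mode-wise (separable) dictionaries and one large Kronecker dictionary acting on the vectorized signal can be checked numerically. The following sketch uses NumPy with column-major (Fortran-order) vectorization; all sizes are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def modeprod(X, A, n):
    """Mode-n product: multiply every mode-n fiber of X by the matrix A (0-based n)."""
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

# A small 3D "image" synthesized from a core and three mode dictionaries
D1 = rng.standard_normal((8, 4))   # columns (mode-1)
D2 = rng.standard_normal((8, 4))   # rows    (mode-2)
D3 = rng.standard_normal((8, 4))   # tubes   (mode-3)
S = rng.standard_normal((4, 4, 4)) # core coefficients (sparsity not needed for the identity)
X = modeprod(modeprod(modeprod(S, D1, 0), D2, 1), D3, 2)

# The separable representation equals one big Kronecker dictionary D = D3 (x) D2 (x) D1
D = np.kron(D3, np.kron(D2, D1))
assert np.allclose(X.reshape(-1, order='F'), D @ S.reshape(-1, order='F'))
```

The same identity is what makes separable sensing hardware equivalent to one large Kronecker sensing matrix applied to the vectorized signal.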
I-B Exploiting low-rank approximations instead of sparsity
While sparsity is the workhorse of standard CS, recently a new line of research has been proposed suggesting that, instead of using sparse models as a priori information about multidimensional signals, the low-rank approximation property can be exploited, i.e. without any a priori knowledge about the possible bases or factor matrices for each mode (dictionaries). This idea was first explored in [17], where matrices were reconstructed from undersampled measurements by solely assuming the existence of a low-rank approximation and by solving a convex optimization problem involving the matrix nuclear norm. These ideas have been extended to tensors by considering different models for the measurements and using tensor low-rank approximations (based on the CANDECOMP/PARAFAC (CP) model) or low multilinear-rank approximations (based on the Tucker model). For example, in [18], tensor completion of visual data was analyzed by generalizing the minimization of the matrix nuclear norm to the tensor case. In [19], the problem of estimating missing entries in tensors was also considered, by assuming that a low-rank CP model is fitted through a weighted least squares problem formulation. In [20], hyperspectral images (3D tensors) are recovered from random linear projections of all channels by using a reconstruction algorithm that combines low-rank and joint-sparse matrix recovery. In [21], the problem of reconstructing tensors having a low multilinear-rank Tucker model, based on a set of linear measurements, was investigated and used for tensor denoising via an iterative multilinear-rank minimization method. In the context of optical-interferometric imaging, a method for recovering a supersymmetric rank-1 3D tensor from a set of multilinear measurements was proposed in [22] by using a convex linear program.
In [23], the Kronecker sensing structure was used for tensor compression, and a method involving a low-rank model fitting, followed by a per-mode decompression, was proposed in order to recover a low-rank CP tensor with sparse factors. We refer to Table I for a brief summary of recent approaches to CS involving tensor datasets, where the differences and similarities among the methods are highlighted.
Articles                 | Tensor order | Data model                                       | Measurements model       | Algorithm type
Duarte et al. [11]       | –            | Sparsity, Kronecker dictionaries                 | Multilinear              | ℓ1-norm minimization
Caiafa et al. [13, 14]   | –            | Sparsity, block-sparsity, Kronecker dictionaries | Multilinear              | Greedy
Li et al. [15]           | –            | Sparsity, Kronecker dictionaries                 | Multilinear              | ℓ1-norm minimization
Candes et al. [17]       | –            | Low-rank matrix                                  | Linear (missing entries) | Nuclear norm minimization
Liu et al. [18]          | –            | Low-rank (CP model)                              | Linear (missing entries) | Tensor nuclear norm minimization
Acar et al. [19]         | –            | Low-rank (CP model)                              | Linear (missing entries) | Weighted least squares
Golbabaee et al. [20]    | –            | Joint-sparsity, low-rank matrices                | Linear                   | Nuclear norm, mixed norm minimization
Rauhut et al. [21]       | –            | Low multilinear-rank (Tucker model)              | Linear                   | Iterative hard thresholding
Auría et al. [22]        | –            | Rank-1 tensor                                    | Multilinear              | Convex programming
Sidiropoulos et al. [23] | –            | Low-rank sparse CP model                         | Multilinear              | ℓ1-norm decompression
Current article          | –            | Low multilinear-rank (Tucker model)              | Multilinear              | Non-iterative (direct) reconstruction
In this work, we extend the ideas and results of our recent conference paper [24], providing a direct (i.e., analytical) reconstruction formula that allows us to recover a tensor from a set of multilinear projections obtained by multiplying the data tensor by a different sensing matrix in each mode. This model comes into play naturally in many potential applications, for example, in the case of sensing 2D or 3D images by means of a separable operator as developed in [25, 26, 11, 12], i.e., by taking compressive measurements of columns, rows, etc. separately, imposing a Kronecker structure on the sensing operator. The key assumption is that our multidimensional signal is well approximated by a low multilinear-rank Tucker model, which is realistic for many structured datasets, especially in the case of multidimensional images. We formulate our multidimensional CS model in a general setting for N-th-order tensors and provide theoretical stability analysis and robustness evidence for N = 2 and a very important particular case with N = 3.
A particularly distinctive and attractive feature of the proposed reconstruction method is that, unlike all other methods listed in Table I, it is non-iterative. We believe that the present mathematical model could be fully exploited by the next generation of multidimensional compressive sensors for very large datasets. Through extensive simulations on real-world datasets, we also illustrate the relevance of our results in hyperspectral compressive imaging, for which the technology is already available [11, 25, 12].
I-C Paper organization
This paper is organized as follows: in Section II, the tensor notation, definitions and basic results used throughout the paper are introduced; in Section III, the reconstruction formula is introduced for the ideal case where the tensor of interest admits an exact low multilinear-rank representation; in Section IV, the effect of a more realistic signal model is analyzed and a modified reconstruction formula is proposed in order to guarantee stable reconstructions; in Section V, several numerical results based on 2D and 3D real-world signals are provided, validating our theoretical results and evaluating the stability and robustness of the proposed reconstruction scheme. The performance is evaluated in terms of computational time, quality of reconstructions and variance of the results over Monte Carlo simulations (robustness). Finally, in Section VI, the main conclusions of the present work are outlined.
II Notation, definitions and preliminary results
II-A Tensor notation and operations
Tensors (multi-way arrays) are denoted by underlined boldface capital letters, e.g. X ∈ R^{I1×I2×···×IN} is an N-th-order tensor of real numbers. Matrices (2D arrays) are denoted by bold uppercase letters and vectors by boldface lowercase letters, e.g. A ∈ R^{I×J} and b ∈ R^{I} are a matrix and a vector, respectively. The (i1, i2, ..., iN) element of a tensor X is referred to as x_{i1 i2 ... iN}. The Frobenius norm is defined by ‖X‖_F = (Σ_{i1,...,iN} x²_{i1...iN})^{1/2}. The spectral norm ‖A‖₂ of a matrix A corresponds to its largest singular value.
Given a tensor X ∈ R^{I1×···×IN}, its mode-n fibers are the vectors obtained by fixing all indices except i_n; they correspond to columns (n = 1), rows (n = 2), and so on. The mode-n unfolding of a tensor X yields a matrix X_(n) ∈ R^{In×Π_{m≠n}Im} whose columns are the corresponding mode-n fibers arranged in a particular order; to be more precise, tensor element (i1, i2, ..., iN) maps to matrix element (i_n, j), where j = 1 + Σ_{k≠n}(i_k − 1)J_k with J_k = Π_{m<k, m≠n} I_m [27].
Given a multidimensional signal (tensor) X ∈ R^{I1×···×IN} and a matrix A ∈ R^{J×In}, the mode-n tensor-by-matrix product Y = X ×_n A is defined by:

(1)   y_{i1,...,i_{n−1},j,i_{n+1},...,iN} = Σ_{in=1}^{In} x_{i1,...,in,...,iN} a_{j,in},

with Y ∈ R^{I1×···×I_{n−1}×J×I_{n+1}×···×IN} and j = 1, 2, ..., J. It should be noted that this corresponds to the product of the matrix A by each one of the mode-n fibers of X, since Y_(n) = A X_(n).
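In NumPy (with 0-based mode indices), the mode-n unfolding and the mode-n product can be sketched as follows; the defining identity Y_(n) = A X_(n) is verified on random data:

```python
import numpy as np

rng = np.random.default_rng(0)

def unfold(X, n):
    """Mode-n unfolding: the mode-n fibers become the columns (column-major order)."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1, order='F')

def modeprod(X, A, n):
    """Mode-n tensor-by-matrix product Y = X x_n A (0-based mode index)."""
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

X = rng.standard_normal((3, 4, 5))
A = rng.standard_normal((6, 4))       # multiplies the fibers of the second mode (n = 1 here)

Y = modeprod(X, A, 1)
assert Y.shape == (3, 6, 5)
# The defining matrix identity: Y_(n) = A X_(n)
assert np.allclose(unfold(Y, 1), A @ unfold(X, 1))
```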
II-B Tucker model and multilinear-rank
The Tucker decomposition model [28] provides a generalization of the low-rank approximation of matrices to the case of tensors, i.e. for a given tensor X ∈ R^{I1×···×IN}, we have X = X̂ + E, where E is an error tensor and the multilinear-rank-(R1, R2, ..., RN) tensor approximation X̂ is defined as follows (Tucker model):

(2)   X̂ = G ×₁ A1 ×₂ A2 ··· ×_N AN,

with a core tensor G ∈ R^{R1×···×RN} and factor matrices An ∈ R^{In×Rn} (typically Rn ≤ In). A data tensor is said to have multilinear-rank-(R1, R2, ..., RN) if such a decomposition is exact for a set of minimal values Rn, i.e. X = X̂. We say that a tensor is full-rank if all its unfolded matrices are full-rank, i.e., rank(X_(n)) = min(In, Π_{m≠n}Im), n = 1, 2, ..., N. A particularly interesting case of the Tucker model is when the factor matrices are orthogonal and chosen as the truncated matrices of left singular vectors associated with the unfolding matrices X_(n). In this case, we obtain the so-called truncated Higher Order Singular Value Decomposition (HOSVD) [28]. It is noted that, in the matrix case, the truncated SVD provides the best low-rank approximation, having orthogonal factors and a diagonal core matrix.
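A minimal sketch of the truncated HOSVD: the factor matrices are the leading left singular vectors of each unfolding, and the core is obtained by projecting onto them. For a tensor with exact multilinear rank the truncated decomposition reproduces the tensor exactly (sizes and ranks below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1, order='F')

def modeprod(X, A, n):
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

def hosvd(X, ranks):
    """Truncated HOSVD: factors = leading left singular vectors of each unfolding."""
    U = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
         for n, r in enumerate(ranks)]
    G = X
    for n, Un in enumerate(U):          # core: G = X x_1 U1^T x_2 U2^T x_3 U3^T
        G = modeprod(G, Un.T, n)
    return G, U

# A tensor with exact multilinear rank (2, 3, 2) is recovered exactly
G0 = rng.standard_normal((2, 3, 2))
A = [rng.standard_normal((10, 2)), rng.standard_normal((12, 3)), rng.standard_normal((9, 2))]
X = G0
for n, An in enumerate(A):
    X = modeprod(X, An, n)

G, U = hosvd(X, (2, 3, 2))
X_hat = G
for n, Un in enumerate(U):
    X_hat = modeprod(X_hat, Un, n)
assert np.allclose(X_hat, X)            # exact for an exactly low multilinear-rank tensor
```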
II-C Multi-Way Projections
While in classical 1D CS the set of compressive measurements is obtained by a linear projection, i.e. by multiplying the vector signal by a sensing matrix, in the tensor case we can exploit the multi-way structure and use devices that provide compressive measurements by multiplying an N-th-order tensor by sensing matrices in several modes, similarly or identically to the Kronecker-CS setting [11]. According to the definition of the mode-n product in eqn. (1), multiplying a data tensor by a sensing matrix in mode n corresponds to applying a projection to every mode-n fiber. For example, a 2D signal X ∈ R^{I1×I2} can be compressively sensed by using two sensing matrices, Φ1 ∈ R^{M1×I1} and Φ2 ∈ R^{M2×I2}, for mode-1 and mode-2, respectively, i.e. Y = X ×₁ Φ1 ×₂ Φ2 or, equivalently, Y = Φ1XΦ2ᵀ, or vec(Y) = (Φ2 ⊗ Φ1)vec(X), where vec(Y) and vec(X) are the vectorized versions of the matrices Y and X, respectively. Thus, the objective of Kronecker-CS is to recover the signal X from the measured data matrix Y.
In this paper, we assume that the following set of compressive multi-way measurements Y^(n) (n = 1, 2, ..., N) is available:

(3)   Y^(n) = X ×₁ Φ1 ··· ×_{n−1} Φ_{n−1} ×_{n+1} Φ_{n+1} ··· ×_N ΦN,

where Φn ∈ R^{Mn×In} (Mn ≤ In) are the corresponding mode-n sensing matrices. Note that eqn. (3) indicates that the original tensor is multiplied by the set of sensing matrices in all modes except mode n (see Fig. 1 (top)). We also assume available the following core tensor W ∈ R^{M1×M2×···×MN}:

(4)   W = X ×₁ Φ1 ×₂ Φ2 ··· ×_N ΦN,
which, in fact, is redundant since it can be computed from Y^(n) and Φn, taking into account that W = Y^(n) ×_n Φn for any n.
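This redundancy is easy to check numerically; the sketch below uses random data and arbitrary sizes, with `modeprod` implementing the mode-n product of eqn. (1):

```python
import numpy as np

rng = np.random.default_rng(0)

def modeprod(X, A, n):
    """Mode-n product (0-based mode index)."""
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

X = rng.standard_normal((10, 11, 12))
Phi = [rng.standard_normal((4, 10)), rng.standard_normal((5, 11)), rng.standard_normal((6, 12))]

# Y3: sensed in every mode except mode 3 (eqn. (3) with n = 3)
Y3 = modeprod(modeprod(X, Phi[0], 0), Phi[1], 1)
# Core tensor W (eqn. (4)): completing the remaining mode product on Y3 ...
W = modeprod(Y3, Phi[2], 2)
# ... equals sensing X in all three modes directly
W_direct = modeprod(modeprod(modeprod(X, Phi[0], 0), Phi[1], 1), Phi[2], 2)
assert np.allclose(W, W_direct)
```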
Most state-of-the-art tensor reconstruction algorithms based on Kronecker-CS [11, 13, 14, 22, 23] assume that the only available measurement tensor is W, i.e. the product of the original dataset by the sensing matrices in all modes simultaneously. On the other hand, our method requires the set of tensor measurements obtained by multiplying the dataset by all the sensing matrices except one, i.e. the tensors Y^(n) (n = 1, 2, 3), not necessarily involving a larger amount of measurements (see the experimental comparison in Section V-E). However, it is not difficult to see that existing hardware implementations of CS imaging systems can be easily adapted to provide the kind of measurements required by our method. For example, in the 2D case, our method proposes to collect measurements on columns using a common sensing matrix Φ1, and on rows using Φ2. Following the ideas of the single-pixel camera developed in [29] and used in [11], our sensing operator can be obtained by using, for example, a linear array of DMDs (Digital Micromirror Devices) with random orientations to sense columns, and another linear array of DMDs to sense rows. It is also interesting to note that in [25] a different hardware implementation was used to provide real Kronecker-CS, i.e., applying different sensing matrices to columns and rows, which can be used to provide the measurements required by our method. Likewise, in [12], a new hardware implementation has recently been proposed that employs Kronecker-CS with separable sensing operators in the spatial and spectral domains, and could also provide such measurements.
In some 3D applications, the mode-3 sensing matrix is the identity matrix, i.e. Φ3 = I, because the same sensing operator is applied to each frontal slice of the tensor. This is the case, for instance, in hyperspectral compressive imaging, where each frequency band (a frontal slice) is sensed by applying a different selective filter [11, 30]. This mathematical model is also valid for video sequences, where each frontal slice of the tensor corresponds to a snapshot taken at a given time.
II-D The Truncated Moore-Penrose Pseudoinverse
It is well known that using the Moore-Penrose (MP) pseudoinverse of an ill-conditioned matrix A ∈ R^{I×J} produces unstable behavior, since the norm ‖A†‖₂ can be extremely large when the smallest singular value is close to zero. In order to avoid this problem, the truncated MP pseudoinverse can be used as a regularization technique; given the SVD A = USVᵀ, it is defined as follows [31]:

(5)   A†_δ = V S†_δ Uᵀ,

with the entries of the diagonal matrix S†_δ defined as follows:

(6)   [S†_δ]_{ii} = 1/σ_i if σ_i > δ, and [S†_δ]_{ii} = 0 otherwise,
where δ ≥ 0 is a free threshold parameter (see next sections). It is noted that A†_δ → A† as δ → 0, and that ‖A†_δ‖₂ ≤ 1/δ. Also, the following properties are easily verified:

(7)   A†_δ A = V D Vᵀ,

with d_{ii} = 1 if σ_i > δ, and d_{ii} = 0 if σ_i ≤ δ;
(8)  
(9) 
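The truncated pseudoinverse of eqns. (5)-(6) is straightforward to implement. The sketch below also checks two of its basic properties: δ = 0 recovers the ordinary MP pseudoinverse, and the spectral norm is bounded by 1/δ:

```python
import numpy as np

rng = np.random.default_rng(0)

def trunc_pinv(A, delta):
    """Truncated Moore-Penrose pseudoinverse: invert only singular values > delta."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > delta
    s_inv[keep] = 1.0 / s[keep]
    return (Vt.T * s_inv) @ U.T   # V diag(s_inv) U^T

A = rng.standard_normal((8, 5))

# delta = 0 recovers the ordinary MP pseudoinverse (for a full-rank random matrix)
assert np.allclose(trunc_pinv(A, 0.0), np.linalg.pinv(A))

# The spectral norm of the truncated pseudoinverse is bounded by 1/delta
delta = 0.5
assert np.linalg.norm(trunc_pinv(A, delta), 2) <= 1.0 / delta + 1e-12
```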
The following lemma provides a generalization of property (7) to the 3rd-order tensor case.
Lemma II.1.
For a given tensor X with smallest singular values σ^(n) in each mode n = 1, 2 and 3, respectively, the following property holds:
(10) 
with
if 
where and ().
Proof.
See the proof in Appendix A. ∎
III Exact low multilinear-rank tensor recovery
The following theorem provides an explicit reconstruction formula, as illustrated in Fig. 1 (bottom), and states the conditions under which the original tensor X can be exactly recovered from the set of multi-way measurements Y^(n) (n = 1, 2, ..., N) and W, defined in equations (3) and (4), respectively.
Theorem III.1 (Low multilinear-rank case).
If the tensor X has multilinear-rank-(R1, R2, ..., RN) and the sensing matrices Φn ∈ R^{Mn×In} (Rn ≤ Mn ≤ In) are such that the tensor W = X ×₁ Φ1 ×₂ Φ2 ··· ×_N ΦN is full-rank, then the following reconstruction formula is exact, i.e. X̂ = X:

(11)   X̂ = W ×₁ Z1 ×₂ Z2 ··· ×_N ZN,

where "†" stands for the MP pseudoinverse of a matrix and Zn = Y^(n)_(n) (W_(n))†, with n = 1, 2, ..., N.
Proof.
Let us consider the exact HOSVD decomposition X = G ×₁ U1 ··· ×_N UN, with core tensor G ∈ R^{R1×···×RN} and orthogonal factors Un ∈ R^{In×Rn}, which exists because the tensor X is assumed to have multilinear-rank-(R1, ..., RN). We consider, for convenience, a change of bases such that the new factors are matched to the sensing matrices. We can do this by defining the new set of factors as follows:
(12) 
thus, we have where . Note also that, with these new bases, the multiway measurements are now simplified to or, equivalently, . Taking into account that , the mode unfolded version of eqn. (11) is:
(13) 
which can be written in terms of its associated mode-n unfolding matrix as:
Now, by substituting into the previous equation and repeating this process for all modes n = 1, 2, ..., N, we finally arrive at:
(14) 
which proves that X̂ = X. ∎
It is noted that, using a simplified notation, the Tucker model of a tensor of size I × I × ··· × I having multilinear rank (R, R, ..., R) requires O(NIR + R^N) parameters, while the suggested reconstruction needs no more than N·I·R^{N−1} measurements, corresponding to the tensors Y^(n), n = 1, 2, ..., N (taking Mn = R). This is larger than the number of parameters of the associated Tucker model; however, when R ≪ I the number of measurements remains a small multiple of the number of model parameters, which is in general much smaller than the total number of entries I^N.
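The whole recovery procedure of Theorem III.1 can be simulated end to end. The sketch below uses random data, with Zn = Y^(n)_(n) (W_(n))† as the per-mode factors of the reconstruction formula; sizes, ranks and measurement counts are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1, order='F')

def modeprod(X, A, n):
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

I, R, M = (12, 13, 14), (3, 2, 4), (5, 4, 6)   # sizes, multilinear rank, measurements

# Ground-truth tensor with exact multilinear rank (R1, R2, R3)
X = rng.standard_normal(R)
for n in range(3):
    X = modeprod(X, rng.standard_normal((I[n], R[n])), n)

Phi = [rng.standard_normal((M[n], I[n])) for n in range(3)]

# Multi-way measurements: Y^(n) skips mode n (eqn. (3)); W senses all modes (eqn. (4))
Y = []
for n in range(3):
    Yn = X
    for m in range(3):
        if m != n:
            Yn = modeprod(Yn, Phi[m], m)
    Y.append(Yn)
W = modeprod(Y[0], Phi[0], 0)

# Reconstruction formula (11): X_hat = W x_1 Z1 x_2 Z2 x_3 Z3, Zn = Y^(n)_(n) W_(n)^+
X_hat = W
for n in range(3):
    Zn = unfold(Y[n], n) @ np.linalg.pinv(unfold(W, n))
    X_hat = modeprod(X_hat, Zn, n)

assert np.linalg.norm(X_hat - X) < 1e-6 * np.linalg.norm(X)   # exact up to round-off
```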
III-A Multi-Way Measurements Via Linear Projections Applied to Only Two Selected Modes
It is interesting to note that all the multi-way measurements defined in eqn. (3) (n = 1, 2, ..., N) and in eqn. (4) can be computed from linear measurements taken in only two selected modes out of N. To show this, suppose that we have at our disposal the linear measurements in modes n1 and n2 given by Z_{n1} = X ×_{n1} Φ_{n1} and Z_{n2} = X ×_{n2} Φ_{n2}, with n1 ≠ n2; then, it is easy to see that the mode-n unfolding matrix of each multi-way measurement Y^(n), n = 1, 2, ..., N, can be written as follows:
For example, in the important particular case of hyperspectral images, it is sometimes easier to take compressive measurements of the columns (mode-1) and rows (mode-2) of a 3rd-order tensor X ∈ R^{I1×I2×I3} by using the corresponding matrices Φ1 and Φ2 as follows:

(15)   Z1 = X ×₁ Φ1,   Z2 = X ×₂ Φ2.

It is easy to see that the multi-way measurements required by our method can be obtained from these two sets of linear projections, for example, noting that Y^(3) = Z1 ×₂ Φ2 = Z2 ×₁ Φ1, Y^(1) = Z2 ×₃ Φ3, Y^(2) = Z1 ×₃ Φ3, and that the tensor W can be obtained using W = Y^(n) ×_n Φn (for any n). In this case, the matrix Φ3 is not a real sensing matrix, and its rank must be chosen in order to capture the numerical rank of the dataset in its 3rd mode. In Algorithm 1, the steps to reconstruct a tensor from the linear projections defined in equation (15) are summarized.
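This two-mode acquisition scheme can be sketched the same way: only Z1 and Z2 are "measured", Φ3 is an auxiliary matrix chosen by the user (here random, with enough rows to capture the mode-3 rank), and all remaining quantities are derived from them. Sizes and ranks are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1, order='F')

def modeprod(X, A, n):
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

I, R, M = (12, 13, 14), (3, 2, 4), (5, 4, 6)

X = rng.standard_normal(R)
for n in range(3):
    X = modeprod(X, rng.standard_normal((I[n], R[n])), n)   # multilinear rank R

Phi1, Phi2 = rng.standard_normal((M[0], I[0])), rng.standard_normal((M[1], I[1]))
Z1 = modeprod(X, Phi1, 0)      # linear measurements of columns (mode 1), eqn. (15)
Z2 = modeprod(X, Phi2, 1)      # linear measurements of rows (mode 2)

# Phi3 is not a physical sensing matrix; any matrix capturing the mode-3
# numerical rank works (here: random with M3 >= R3 rows)
Phi3 = rng.standard_normal((M[2], I[2]))

# All multi-way measurements follow from Z1 and Z2 alone
Y3 = modeprod(Z1, Phi2, 1)
assert np.allclose(Y3, modeprod(Z2, Phi1, 0))   # the two orderings agree
Y1 = modeprod(Z2, Phi3, 2)
Y2 = modeprod(Z1, Phi3, 2)
W = modeprod(Y3, Phi3, 2)

# ... and the direct reconstruction of eqn. (11) applies unchanged
X_hat = W
for n, Yn in enumerate((Y1, Y2, Y3)):
    X_hat = modeprod(X_hat, unfold(Yn, n) @ np.linalg.pinv(unfold(W, n)), n)
assert np.linalg.norm(X_hat - X) < 1e-6 * np.linalg.norm(X)
```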
IV Stable Reconstructions of Approximately Low Multilinear-Rank Tensors
The reconstruction formula of eqn. (11) is exact when the tensor has multilinear-rank-(R1, R2, ..., RN). However, in real-world applications signals usually do not have low multilinear-rank, but they admit good low multilinear-rank approximations. Thus, a more realistic model for a signal is X = X0 + E, where X0 has exact multilinear-rank-(R1, R2, ..., RN) and E is an error tensor with sufficiently small norm ‖E‖ = ε.
In this case, one may ask for the optimal choice of the sensing matrices, in order to provide the best low multilinear-rank reconstruction. In the matrix case (N = 2), it is not difficult to see that, if the sensing matrices are constructed from the first singular vectors in each mode, then the obtained reconstruction is optimal (the best low-rank approximation). To be more specific, given the SVD of the data matrix X = UΣVᵀ, let U_R and V_R contain the first R left and right singular vectors, respectively, and let Σ_R be the diagonal matrix containing the first R singular values on its main diagonal in decreasing order. If we define the sensing matrices as Φ1 = U_Rᵀ and Φ2 = V_Rᵀ, then the reconstruction is given by:

(16)   X̂ = Z1 W Z2ᵀ = U_R Σ_R V_Rᵀ = X_R,

where X_R is, by definition, the truncated SVD, which is the best rank-R approximation.
However, in practice we do not know the singular vectors, because the original dataset is not available, so we need to use sensing matrices that are independent of the dataset, for example by generating them randomly, which gives us suboptimal reconstructions, i.e. ‖X − X̂‖ ≥ ‖X − X_R‖.
In particular, we say that a reconstruction method is stable if the obtained error is comparable to the input error ε, i.e. ‖X − X̂‖ ≤ Cε for some constant C. As we will show in this section, the formula given by eqn. (11) may suffer from unstable behavior, especially in the matrix case (N = 2), i.e. it may generate large output errors even when the input error is small. We will show that this unstable behavior can be avoided by using the truncated pseudoinverse, as defined in Section II-D, instead of the exact MP pseudoinverse. In other words, we define the modified reconstruction formula as follows:
(17)   X̂ = W ×₁ Y^(1)_(1)(W_(1))†_δ ×₂ Y^(2)_(2)(W_(2))†_δ ··· ×_N Y^(N)_(N)(W_(N))†_δ,

where (W_(n))†_δ is the truncated pseudoinverse of the matrix W_(n) (with threshold parameter δ). It is noted that, when δ = 0, eqn. (17) is equivalent to eqn. (11).
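For N = 2 the stabilized formula reduces to matrix products and can be sketched as follows. The noiseless case with δ = 0 reproduces the exact recovery of Theorem III.1, while the noisy runs merely illustrate the role of δ; sizes, ranks and the noise level are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def trunc_pinv(A, delta):
    """Truncated MP pseudoinverse: invert only singular values larger than delta."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > delta
    s_inv[keep] = 1.0 / s[keep]
    return (Vt.T * s_inv) @ U.T

def reconstruct2d(W, Y1, Y2, delta):
    """Eqn. (17) for N = 2: X_hat = Z1 W Z2^T with truncated pseudoinverses."""
    Z1 = Y1 @ trunc_pinv(W, delta)        # Z1 = Y^(1)_(1) (W_(1))^+_delta
    Z2 = Y2.T @ trunc_pinv(W.T, delta)    # Z2 = Y^(2)_(2) (W_(2))^+_delta
    return Z1 @ W @ Z2.T

I1, I2, R = 60, 50, 5
X0 = rng.standard_normal((I1, R)) @ rng.standard_normal((R, I2))  # exact rank R
X = X0 + 1e-3 * rng.standard_normal((I1, I2))                     # small model error

Phi1 = rng.standard_normal((R, I1))   # M1 = R measurements per column
Phi2 = rng.standard_normal((R, I2))   # M2 = R measurements per row

# Noiseless sanity check: with delta = 0 the reconstruction is exact (Thm III.1)
W0, Y10, Y20 = Phi1 @ X0 @ Phi2.T, X0 @ Phi2.T, Phi1 @ X0
assert np.linalg.norm(reconstruct2d(W0, Y10, Y20, 0.0) - X0) < 1e-6 * np.linalg.norm(X0)

# With model error, the threshold trades approximation bias against stability
W, Y1, Y2 = Phi1 @ X @ Phi2.T, X @ Phi2.T, Phi1 @ X
for delta in (0.0, 1e-2, 1e-1):
    err = np.linalg.norm(reconstruct2d(W, Y1, Y2, delta) - X0) / np.linalg.norm(X0)
    print(f"delta = {delta:g}: relative error vs rank-R part = {err:.2e}")
```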
In the next sections we derive theoretical upper bounds on the reconstruction errors for N = 2 (the matrix case) and for a particular case of a 3D tensor (N = 3), where the same 2D sensor is applied to every frontal slice. We also analyze under which conditions the parameter δ can be chosen in order to provide the minimum upper bound (optimal value δ*).
IV-A Error Upper Bound for the 2D Case (N = 2)
The following theorem provides an upper bound for the reconstruction of eqn. (17) and shows that the error bound is proportional to the best approximation error ε, and thus approaches zero as ε → 0.
Theorem IV.1.
Let the matrix X ∈ R^{I1×I2} be approximated by a rank-R matrix X0, i.e. X = X0 + E with ‖E‖₂ = ε, and let the sensing matrices Φ1 ∈ R^{M1×I1}, Φ2 ∈ R^{M2×I2} be such that W = Φ1XΦ2ᵀ is full-rank; then the following error upper bound holds:
(18) 
where σ_R is the R-th (smallest) nonzero singular value of the matrix X0, and the constants C1, C2 and C3 are defined below.
Proof.
Let X0 = UΣVᵀ be the truncated SVD of X, with the factors defined by eqn. (12); then we obtain
(19) 
The mode measurement matrix is , with (where we assumed that ), and using a similar analysis for mode, we obtain that , with .
Using property (8) of the truncated pseudoinverse, the reconstructed matrix becomes:
(20) 
Now, by inserting the expressions for and into eqn. (20) we obtain:
(21) 
Assuming that with if , and , if (see Section IID), and using and equation (19), we obtain:
(22) 
By applying the spectral norm to the last matrix equation and using standard norm inequalities, we finally obtain
(23) 
where the constants are identified as follows:
(24)  
(25)  
(26) 
∎
It is important to note that the obtained theoretical bound is not tight in general, because it relies on matrix inequalities that are usually not tight, such as the triangle inequality. However, it is still useful since:

- The case ε = 0 (exact case) is not realistic, since real-world datasets never have exactly low multilinear rank; thus a theoretical bound, even if it is not tight, is needed to characterize the reconstruction behavior in real applications;

- The bound, as a function of ε, is asymptotically linear, i.e. of order O(ε). Thus, the more precise the model (the smaller ε), the better the reconstruction, with an error of the same order; this is intuitive but can only be proved through a theoretical bound. In particular, it shows that the bound approaches zero together with the best low-rank approximation error ε, and that perfect reconstruction is obtained when the signal has exact rank R, as was already proved in Theorem III.1.
IV-B Optimal selection of the threshold parameter δ
A quite important question arises here: what is the optimal value of the threshold parameter δ? Since the objective is to obtain a reconstruction error as small as possible, we can try to minimize the upper bound and hope that the theoretical bound attains its minimum at a point close to the minimum of the actual error. After a careful look at eqn. (18) as a function of δ, if we define

(27)

we can identify two different cases:
However, it is important to note that, in practice, we are not able to compute the optimal parameter δ*, because we do not know the matrices involved, which depend on the SVD of the original unknown signal. Nevertheless, we may use a rough overestimate of this parameter, which provides us with the following estimate:

(28)

But the problem still remains challenging, since usually we do not know exactly the error ε of the best low-rank estimation.
IV-C Error Upper Bound for a Particular 3D Case (N = 3)
In this section we consider the recovery of a 3rd-order tensor X ∈ R^{I1×I2×I3} in the particular and very important case where the same sensing operator is applied to every frontal slice of the tensor, as in hyperspectral CS imaging or video CS [11]. In other words, we assume that the mode-3 sensing matrix is the identity matrix (i.e. Φ3 = I), thus the following multi-way projections are available: