Bayesian 3D Reconstruction of Subsampled Multispectral Singlephoton Lidar Signals
Abstract
Light detection and ranging (Lidar) singlephoton devices capture range and intensity information from a 3D scene. This modality enables long range 3D reconstruction with high range precision and low laser power. A multispectral singlephoton Lidar system provides additional spectral diversity, allowing the discrimination of different materials. However, the main drawback of such systems can be the long acquisition time needed to collect enough photons in each spectral band. In this work, we tackle this problem in two ways: first, we propose a Bayesian 3D reconstruction algorithm that is able to find multiple surfaces per pixel, using few photons, i.e., shorter acquisitions. In contrast to previous algorithms, the novel method processes the jointly all the spectral bands, obtaining better reconstructions using less photon detections. The proposed model promotes spatial correlation between neighbouring points within a given surface using spatial point processes. Secondly, we account for different spatial and spectral subsampling schemes, which reduce the total number of measurements, without significant degradation of the reconstruction performance. In this way, the total acquisition time, memory requirements and computational time can be significantly reduced. The experiments performed using both synthetic and real singlephoton Lidar data demonstrate the advantages of tailored sampling schemes over random alternatives. Furthermore, the proposed algorithm yields better estimates than other existing methods for multisurface reconstruction using multispectral Lidar data.
I Introduction
Singlephoton Lidar devices provide accurate range information by constructing, for each pixel, a histogram of time delays between emitted light pulses and photon detections. Using time correlated singlephoton counting (TCSPC) technology, Lidar systems are able to provide accurate depth information (of the order of centimetres) over very long ranges, while allowing the use of laser sources of low power [1]. The acquired range information (3D structure) has many important applications, such as selfdriving cars [2], the study of structures behind dense forests [3] and environmental monitoring [4]. Multispectral Lidar (MSL) systems gather measurements at many spectral bands, constructing one histogram of time delays per wavelength, as illustrated in Figure 1. The spectral diversity can be obtained either using a supercontinuum laser source [5, 6] or multiple lasers [7]. The detector generally consists of a spatial form of wavelength routing to demultiplex the channels [5, 6, 7] or wavelengthtotime codification [8]. Recovering spatial and spectral information from MSL data is a challenging task, specially in scenes with strong ambient illumination (i.e., multiple spurious detections) or when the acquisition time is very low (i.e., very few photon detections per histogram). Moreover, in a general setting, it is possible to find more than one object per pixel. This phenomenon occurs in scenes where the light goes through semitransparent materials (e.g., glass), or when the laser footprint is such that multiple surfaces appear in the field of view. Thus, several signal processing algorithms have been proposed to address these challenges: while many algorithms are available for singlewavelength Lidar, either assuming a single surface per pixel [9, 10, 11] and multiple surfaces per pixel [12, 13, 14], to the best of our knowledge, only the singlesurfaceperpixel case was studied in the multispectral case [6, 15, 16]. The singledepth assumption greatly simplifies the reconstruction problem, as it significantly reduces memory requirements and overall complexity [15].
In this work, we propose a new method to perform 3D reconstruction with subsampled MSL data, which is capable of finding multiple surfaces per pixel. The novel method draws ideas from a recently published algorithm named MANIPOP [14], which obtains stateoftheart reconstructions in the multiple surface, singlewavelength case. However, due to the significantly larger dimensionality of multispectral data, we propose to modify the Bayesian model and estimation strategy of MANIPOP. Adopting a Bayesian framework similar to [14], we assign a spatial point process prior to promote spatial correlation and a Gaussian Markov random field prior to regularize the spectral reflectivity within surfaces/objects. Inference using the resulting posterior distribution is performed using reversible jump Markov chain Monte Carlo algorithm (RJMCMC), coupled with a multiresolution approach that improves the convergence speed and reduces the total computing time of the algorithm. We introduce new RJMCMC proposals, which take into account the additional spectral dimension and improve the acceptance ratio, compared to the ones proposed in [14]. Moreover, we propose an empirical Bayes approach to build the prior distribution associated with the background detections, which further improves the convergence of the RJMCMC sampler.
Datasets containing dozens of wavelengths can be prohibitively large for practical 3D reconstruction algorithms, both in terms of memory and computing requirements. For example, a typical MSL hypercube with 32 wavelengths has more than data voxels. To alleviate this problem, some compression and subsampling strategies have been proposed. While TCSPC technology hinders compressive techniques along the depth axis^{1}^{1}1Due to the asynchronous nature of TCSPC technology, it is not possible to measure an arbitrary subset of histogram bins or linear combination of them. However, other alternatives such as gated cameras [17] can obtain such measurements., reducing the number of measurements can be achieved by integrating multiple wavelengths in a single histogram [8] or measuring less histograms (i.e., subsampling) [15]. The wavelengthtotime approach proposed by Ren et al. [8] compresses histograms into a single one by shifting in time the photon detections according to each measured wavelength. While significantly reducing the data size, this strategy is not wellsuited in the presence of multiple surfaces per pixel, as the resulting likelihood becomes highly multimodal and extremely difficult to handle. A recent analysis in [15] studied different random subsampling schemes, without obtaining any significant differences in terms of reconstruction quality in the lowphoton count regime. In this work, we investigate a new pseudorandom subsampling scheme for lowphoton count MSL data based on ideas from coded aperture design [18, 19]. By choosing the pixels measured for each wavelength in a more principled way, we achieve better results than the completely random schemes of Altmann et al. [15]. Furthermore, the proposed subsampling strategy can be easily implemented in many Lidar systems, reducing the total number of measurements, i.e., the time to acquire an MSL frame. Rasterscan systems using a laser supercontinuum source [5, 6] can be easily modified to measure only a subset of pixels per wavelength. More interestingly, singlewavelength array technology [20] can be combined with coded apertures [18], which measure different wavelengths at each pixel.
The main contributions of this work are

a new Bayesian 3D reconstruction algorithm for subsampled MSL data with multiple surfaces per pixel

the analysis of an appropriate subsampling scheme based on coded aperture design, which provides better results than completely random alternatives.
The new algorithm is referred to as MUSAPOP, as it models MultiSpectrAl Lidar signals using POint Processes. The remainder of the paper is organized as follows. Section II recalls the classical observation model for singlephoton MSL data. Sections IV and III present the Bayesian 3D reconstruction algorithm and the associated RJMCMC sampler. Section V details the principled subsampling strategy. Experiments performed with synthetic and real Lidar data are introduced and discussed in Section VI. Conclusions and future work are finally reported in Section VII.
Ii Singlephoton multispectral Lidar
A full multispectral Lidar data hypercube consists of discrete photoncount measurements, where is the set of positive integers, and are the numbers of vertical and horizontal pixels respectively, is the histogram length (i.e., the number of time bins) and is the number of acquired wavelengths. The 3D reconstruction algorithm estimates a point cloud, referred to as , from the data hypercube . The 3D point cloud is represented by an unordered set of points, that is
(1) 
where is the total number of points, and and are the coordinate vector and the spectral response of the th point, respectively. In the presence of distributed objects, the observed photon count at pixel , bin and spectral band follows a Poisson distribution [15], whose intensity is a mixture of the background level, denoted by , and the responses of the surfaces present in that pixel, i.e.,
where denotes the Poisson distribution, is a known binary variable that indicates whether wavelength at pixel has been acquired/observed or not, is the set of points corresponding to pixel and wavelength , and is the impulse response of the Lidar device at wavelength , which can be measured using a spectralon panel during a calibration step. Note that to lighten notations, we assume that the impulse responses are only wavelengthdependent but the algorithm can be easily adapted when the responses vary among pixels. Assuming mutual independence between noise realizations in different bins, pixels and spectral bands [1], the likelihood of the proposed model can be written as
(2) 
For clarity in the notation, we will also denote the set of point coordinates as and the set of spectral responses as . The set of all background levels is denoted by , which is the concatenation of images , one for each wavelength. The cube of binary measurements is designated by , where . Note that the model used in the MANIPOP algorithm [14] can be obtained from (2) by setting all the binary variables to 1, and considering only one band, i.e., .
Iii Multiplereturn multiplewavelength 3D reconstruction
Recovering the position of the objects , their spectral signatures and background levels from the subsampled MSL data is an illposed inverse problem, as many solutions can explain the observed photon counts. Thus, prior regularization is necessary to promote reconstructions in a set of feasible realworld 3D point clouds. In this work, we adopt a Bayesian framework, which allows us to include prior knowledge about the scene through tailored prior distributions assigned to the parameters of interest.
Iiia Prior distributions
The proposed model considers prior regularization for the point positions and reflectivity. As explained in Section IIIA3, an empirical Bayes prior [21] is assigned to the background levels.
IiiA1 Spatial configuration
We adopt the spatial prior distribution of 3D points developed in the MANIPOP algorithm. This distribution is designed to promote spatial correlation between points within one surface and repulsion between points belonging to different surfaces. While only a brief summary is included below, we refer the reader to [14] for a detailed discussion about this prior model. The spatial point process prior for the position of the points is modelled by a density defined with respect to a Poisson point process reference measure [22, Chapter 9], i.e.,
where and are the Strauss and area interaction [23] processes respectively. The repulsive Strauss process is written as
where is the minimum separation between two surfaces in the same pixel. Attraction between points within the same surface is promoted by the area interaction process, that is
(3) 
where denotes the standard Lebesgue measure, defines a convex set around the point , and and are two hyperparameters, accounting for the amount of attraction and total number of points, respectively. Both densities define Markovian iteractions between points, only correlating points in a local neighbourhood. Moreover, the combination of both processes implicitly defines a connectedsurface structure, which is used to model 2D manifolds in a 3D space.
IiiA2 Reflectivity
The spectral signatures are related to the materials of the surfaces [16]. Neighbouring points corresponding to a surface composed of a specific material show similar spectral signatures. This prior information is added to the model using a Gaussian Markov random field distribution, where the neighbours are defined by the connectedsurface structure of the point process prior. First, in order to avoid the positivity constraint on the intensity , we use the canonical form [24, 25, 26],
(4) 
where denotes the logintensity of the th point at band . As multispectral devices only acquire dozens of wellseparated wavelengths, the spectral measurements within a pixel do not show significant correlation. Hence, although potential correlations between wavelengths could be modelled, we choose here to neglect this correlation to keep the estimation strategy tractable. As a consequence, we consider the following reflectivity prior model
(5) 
Spatial correlations between logintensity values in neighbouring pixels are defined according to the distribution
(6) 
where and are hyperparameters controlling the level of smoothness. The precision matrix is very sparse due to the Markovian structure, being defined by
(7) 
where is the set of neighbours of point , which is obtained using the connectedsurface structure illustrated in Figure 2, and is the Euclidean distance between two points, normalized according to the camera parameters of the scene to have a physical meaning [27].
IiiA3 Background levels
Background detections are due to detector dark counts and ambient illumination, as explained in [12, 28]. If the transceiver system is monostatic [1], the set of background levels can be interpreted as a multispectral passive image of the scene, as background detections generally come from ambient illumination reflected onto the target and collected by the singlephoton detector. In this case, the background levels are strongly spatially correlated within each wavelength. However, in bistatic systems [12], the transmit and receive channels do not share the same objective lens aperture, yielding weakly or uncorrelated background detections. Although not showing a strong spatial correlation in this second case, all the background levels have similar values, which also serves as prior information.
Independent prior distributions
In order to simultaneously model potential spatial correlation and ensure the positivity of the background levels, MANIPOP uses a 2D gamma Markov random field, which was introduced by Dikmen et al. in [29]. However, this prior is not well suited for MSL data as it introduces an undesired penalization for large background levels, whose negative effects are amplified when considering multiple bands (see Appendix A for details). Other alternatives such as Gaussian Markov random fields [24] cannot be sampled directly in closed form, requiring proposals with a rejection step, whose mixing and convergence scale badly with the dimension of the spectral cube, as shown in [30]. To alleviate these problems, we assign independent gamma priors, i.e.,
(8) 
where and are the shape and scale hyperparameters of the gamma distributions. Despite using independent priors, we can capture the spatial correlation by setting the hyperparameters appropriately. More precisely, in a similar fashion to variational Bayes [31] or expectation propagation [32] methods, in order to simplify the estimation of , we specify (8) such that is similar to another distribution which explicitly correlates the background levels in neighbouring pixels and assumes mutual independence between spectral bands. Here, we use as a similarity criterion the KullbackLeibler divergence
(9) 
As discussed in Appendix B, solving (9) can be achieved by computing expectations with respect to .
Empirical Bayes approach
To ensure that the prior model (8) is informative, a suitable distribution should be chosen. Assuming that we have a coarse estimate of the point cloud (this information will be obtained using the multiresolution approach detailed in Section IVB), we build the distribution following an empirical Bayes approach, as illustrated in Figure 3. First, one can discard almost all the signal photons in the dataset by removing the photons detected in the compact support of around each point (see Figure 3, central subplot). The number of bins that is not excluded in each pixel is referred to as . Secondly, we integrate the remaining photons of each pixel, obtaining a coarse estimate of the perpixel background photons, denoted by . We then define with
(10) 
where a vectorized image of background levels at wavelength , a positive semidefinite matrix, is a fixed hyperparameter controlling the degree of smoothness. In monostatic systems^{2}^{2}2In monostatic Lidar systems, the laser and detector are coaxial, whereas in bistatic systems, the source and detector do not share the same axis., a twodimensional Laplacian filter is chosen for to promote spatial correlation [24], whereas in bistatic systems, is replaced by the identity matrix, only penalizing large background levels. is a translation parameter centring in the linear part of the exponential function and is defined as . As mentioned above, solving (9) requires the computation of expectations with respect to which are unfortunately not available in closed form. Instead of using additional MCMC sampling to find numerical approximations (detailed in Appendix B), obtaining samples from (10) is more attractive both in terms of convergence and computational complexity than using the original full datacube which includes mixtures of background and signal photons. Indeed, (10) simply involves integrated photon counts (over the range dimension). Moreover, given the independence property of among spectral bands, all the bands can be processed independently in parallel.
IiiB Posterior distribution
Following Bayes theorem, the joint posterior distribution of the model parameters is given by
(11) 
where denotes the set of hyperparameters and is the Poisson point process reference measure. Figure 4 shows the directed acyclic graph associated with the proposed hierarchical Bayesian model.
Iv Inference
In this work, we estimate the point cloud positions and spectral signatures using the maximumaposteriori (MAP) estimator, i.e.,
(12) 
Instead of using the MAP for the background levels, the minimum mean squared error estimator of [21] is preferred, as we found experimentally that it provides better estimates. This estimator is defined as
(13) 
As this expectation cannot be derived analytically, we propose to investigate Markov Chain Monte Carlo (MCMC) simulation methods. In particular, we use a reversible jump MCMC algorithm that can handle the varying dimension nature of the spatial point process. This sampler generates samples of and from the posterior distribution (11) denoted as
(14) 
These samples are then used to approximate the statistics of interest, i.e.,
where is the number of burnin iterations.
Iva Reversible jump MCMC moves
While other stochastic simulation algorithms for varying dimensions can be used [22], we choose an RJMCMC sampler, as it allows us to build proposals tailored for the MSL reconstruction problem. RJMCMC can be interpreted as a natural extension of the MetropolisHastings algorithm for problems with an a priori unknown dimensionality. Given the actual state of the chain of model order , a random vector of auxiliary variables is generated to create a new state of model order , according to an appropriate deterministic function . To ensure reversibility, an inverse mapping with auxiliary random variables has to exist such that . The move is accepted or rejected with probability , where is defined as
(15) 
and is the probability of proposing the move , is the probability distribution of the random vector , and is the Jacobian of the mapping . The RJMCMC algorithm performs birth/death, dilation/erosion, spatial and mark shifts, and split/merge moves with probabilities , , , , , , and respectively. Due to the Markovian nature of the prior distributions, all the proposed moves are local, having a complexity proportional to the size of the neighbourhood. These moves are detailed in the following subsections. For ease of presentation, we summarize the main aspects of each move, inviting the reader to consult Appendix C for more details.
Birth and death moves
The birth move proposes a new point uniformly at random in the 3D cube. The spectral signature of the new point is proposed by extracting a fraction from the current value of the background level according to the SBR , that is for each wavelength
(16) 
where denotes the uniform distribution on the interval . The death move proposes the removal of a point. In contrast to the birth move, we modify the background level according to
(17) 
Dilation and erosion moves
Birth moves have low acceptance ratio, as the probability of randomly proposing a point within or close to the surfaces of interest is very low. However, this problem can be overcome by using the current estimation of the surface to propose in regions of high probability. The dilation move proposes a point inside the neighbourhood of an existing surface with uniform probability across all possible neighbouring positions where a point can be added. Once a new position is proposed, the spectral signature is sampled in the same way as the birth move (16). The complementary move (named erosion), proposes to remove a point with one or more neighbours. In this case, the background is updated in the same way as the death move.
Mark and shift moves
The mark move updates the logintensity of a randomly chosen point . Each wavelength is updated independently using a Gaussian proposal with variance as a proposal (also known as Metropolis Gaussian random walk), that is
(18) 
Similarly, the shift move updates the position of a uniformly chosen point using a Gaussian proposal with variance
(19) 
The values of and are adjusted by crossvalidation to yield an acceptance ratio close to for each move, which is the optimal value, as explained in [22, Chapter 4].
Split and merge moves
Some pixels might present two points with overlapping impulse responses in depth. In such cases, a death move followed by two birth moves would happen with very low probability. Hence, as in MANIPOP, we propose a split move, which randomly picks a point and proposes two new points, and . The logintensity is proposed for each wavelength following the mapping
(20) 
where denotes the beta distribution and is a fixed parameter. The new positions are determined according to
(21) 
where e denotes the Bernoulli distribution, is the length in bins of the instrumental response at the wavelength with longest impulse response. The complementary move, named merge move, is performed by randomly choosing two points and inside the same pixel ( and ) that satisfy the condition
(22) 
The merged point is obtained by the inverse mapping
which preserves the total pixel intensity and weights the spatial shift of each peak according to the sum of the intensity values.
IvA1 Sampling the background
In order to exploit the conjugacy between the Poisson likelihood and gamma priors for the background levels, we use a data augmentation scheme as in [33], which classifies each photondetection according to the source (target(s) or background), i.e.,
where are the photons at bin associated with the th surface and are the ones associated with the background. In the augmented space defined by , the background levels are sampled as follows
(23) 
where denotes the Binomial distribution. The transition kernel defined by (23) produces samples of distributed according to the marginal posterior distribution of . In practice, we observed that only one iteration of this kernel is sufficient.
IvB Full algorithm
We adopt a multiresolution approach to speed up the convergence of the RJMCMC algorithm, in a fashion similar to [14]. The dataset is downsampled by integrating photon detections in patches of pixels. Hence, the number of pixels is reduced by a factor of , meaning less points and background levels to infer with times more photons per pixel. The estimated point cloud at the coarse scale is upsampled using a simple nearest neighbour algorithm and used as initialization for the next (finer) scale. In all our experiments we repeat the process for scales. The background hyperpriors and are initialized with noninformative values, i.e., and for all pixels and wavelengths . In finer scales, these hyperparameters are computed using the algorithm detailed in Section IIIA3. The multiresolution approach is finally summarized in Algorithm 1.
V Subsampling strategy
Despite not being able to design compressive measurements along the depth axis, we can still reduce the number of measurements in the two spatial (horizontal and vertical) dimensions and in the spectral dimension [15]. Given the point positions, recovering their reflectivity profile reduces to a multispectral image restoration problem using measured data corrupted by Poisson noise. While many compressive sensing strategies have been proposed for measurements under this noise assumption [34, 35, 36], MSL datasets have an additional limitation if multiple surfaces per pixel are considered: photondetections belonging to different wavelengths cannot be integrated into a single histogram, as the reconstruction problem becomes highly nonconvex, preventing practical reconstruction algorithms^{3}^{3}3As mentioned in the introduction, the system presented in [8] considers the integration of photons belonging to different histograms, but is limited to one surface per pixel.. Indeed, summing histograms associated with different wavelengths and including multiple peaks generates histograms with even more peaks (possibly overlapping), which makes the 3D reconstruction and the reflectivity estimation more difficult. As a consequence, we only consider subsampling of depth histograms, which incorporates all of the practical sampling limitations. Following the formulation of the observation model (2), the subsampling strategy consists of choosing the binary coefficients for a given compression level , with the average number of observed band per pixel. Several subsampling strategies have been proposed for different applications, such as halftoning and stippling [37], rendering, compressive spectral imaging [38, 39, 40], compressive computed tomography [41, 42], geometry processing [43], amongst others [44, 45, 46]. These approaches exploit the sampling geometry of the sensing devices to design a set of criteria. Similarly to coded aperture snapshot spectral imaging systems, the distribution of reflectivity profiles of 3D surfaces in real natural scenes suggests an uniform sampling in the row, column and spectral axes. Following the design in Correa et al. [19], the coefficients are chosen according to the spatiotemporal characteristics of blue noise, which distributes the measurements in spectral and spatial dimensions as homogeneously as possible [47]. The binary cube is obtained by minimizing the variance of (weighted) measurements per local neighbourhood, i.e.,
subject to 
where denotes the variance operator, denotes the set of pixels in a local neighborhood of and are the weights. The minimization is simplified by dividing the data in slices of contiguous bands and running the algorithm introduced in [19] per slice. As shown in Figure 5, the proposed strategy distributes the measurements uniformly in space, while other random strategies [15] tend to exhibit clusters, leaving some regions without measurements.
Vi Experiments
To illustrate the efficacy of the proposed method, the new reconstruction algorithm is compared to other alternatives [5] using a synthetic dataset. Subsequently, the new subsampling scheme is compared with other random subsampling choices for a real MSL dataset. In all the experiments, the performance was measured using the following summary statistics:

True detections : Probability of true detection as a function of the distance , considering an estimated point as a true detection if there is another point in the ground truth point cloud in the same pixel ( and ) such that .

False detections : Number of estimated points that cannot be assigned to a ground truth point at a distance .

Mean intensity absolute error at distance (IAE): Mean across all the points of the intensity absolute error , normalized with respect to the total number of ground truth points. The ground truth and estimated points are coupled using the probability of detection . Note that if a point was falsely estimated or a ground truth point was not found, then they are considered to have resulted in an error of . The comparison is done with normalized intensity values, that is for .

Background normalized mean squared error : Mean of the normalized squared error of the estimated background at each wavelength, i.e., .
Hyperparameter  Coarse scale  Fine scale 

Via Synthetic data
We first assessed the performance of the proposed algorithm using a synthetic dataset created from the “Art” scene of the Middlebury dataset [48], as shown in Figure (a)a. The measurements were obtained by simulating the singlephoton multispectral Lidar system of [16]. The generated dataset has and pixels, wavelengths (red, green and blue), a bin width of 0.3 mm and a mean of 9.83 photons per wavelength per pixel, where approximately photons are due to the background illumination.
We compared the proposed method to a twostage algorithm that first estimates the point positions using MANIPOP [14] (singlewavelength multiplereturn stateoftheart algorithm) on the most energetic spectral band and then infers the spectral signatures with fixed point positions, following the procedure suggested by Wallace et al. [5], as summarized in Algorithm 2. Figure 6 shows the estimated 3D point clouds, whereas Figure 7 illustrates , and IAE for both methods.
The proposed algorithm performs significantly better than the twostage alternative, as it finds % of the true points, whereas the twostage MANIPOP only recovers % of these points. The proposed MUSAPOP algorithm performs slightly worse in terms of false detections, finding 8990 false points in comparison to 4172 by the competitor. In terms of intensity estimation, MUSAPOP obtains significantly better results, having an asymptotic IAE of 0.88, whereas the twostage algorithm obtains an IAE of 1.24. The estimated background levels are shown in Figure 8 showing that the proposed method yields an NMSE of 0.053, whereas the twostage algorithm leads to an NMSE of 0.264. The total execution time for MUSAPOP was 1300 s, whereas the twostage MANIPOP required 593 s.
ViB Real MSL data
The proposed subsampling scheme was evaluated on a real MSL dataset [16]. The scene consists of wavelengths sampled at regular intervals of 10 nm from 500 nm to 810 nm, pixels and histogram bins. The target is composed by a series of blocks of different types of clay and two leaves. Figure 9 shows an RGB image of the scene and the 3D reconstruction using the full acquisition time of 10 ms per wavelength per pixel. We compare the blue noise codes mentioned in Section V with the random schemes introduced in [15], all yielding the same total number of measurements and acquisition time:

Random sampling without overlap: out of bands per pixel are sampled without replacement (i.e., for a given pixel, each wavelength is measured at most once).

Random sampling without overlap: For each wavelength, % of the pixels are sampled without replacement.

Proposed sampling method: For each wavelength, % of the pixels are chosen following the scheme presented in Section V.
The codes were evaluated for bands per pixel and acquisition times of 0.1, 1 and 10 ms per measurement (i.e., the histogram of one wavelength), using as groundtruth the reconstruction obtained with all the measurements and an acquisition time of 10 ms. Figure 10 shows the percentage of true detections, IAE and background NMSEs for all codes, acquisition times and numbers of sensed bands per pixel . In terms of total number of estimated points, the blue noise codes achieve better performance in high compression scenarios and low acquisition times (0.1 and 1 ms). For example, for an acquisition time of 1 ms, almost all points are reconstructed using blue noise codes, whereas the random codes only yield around of the groundtruth points. The choice of the subsampling strategy has a stronger impact in terms of IAE, achieving smaller IAE for all acquisition times and number of bands per pixel.
Figure 11 shows the execution time for an acquisition time of 0.1 ms and different numbers of sensed bands. The proposed RJMCMC sampler has a complexity proportional to the number of photon detections in the support of the impulse response around the 3D point being modified, whereas the background update has a complexity proportional to the total number of active histogram bins in the Lidar scene. The background extraction step required around of the total execution time, which could be significantly reduced if all the bands were processed in parallel instead of sequentially as it is done in the current implementation.
Vii Conclusions and future work
This paper has studied a new 3D reconstruction algorithm referred to as MUSAPOP using multispectral Lidar data, which is able to find multiple surfaces per pixel. The proposed method leads to better reconstruction quality than other alternatives, as it considers all measured wavelengths in a single observation model. While based on some ideas initially investigated in MANIPOP [14], MUSAPOP also relies on new strategies to deal with the very high dimensionality of the multispectral problem. The first novelty is the use of an empirical Bayes prior for the background levels, which speeds up significantly the RJMCMC algorithm. A second new idea is the use of dilation/erosion and split/merge moves yielding higher acceptance rates in the multispectral case. Finally, the subsampling strategy further reduces both the algorithm’s complexity and total number of measurements, leading to faster acquisitions and reconstructions. The sparse point cloud representation of the proposed method speeds up the computations proportionally to the number of measurements, whereas other dense models [12, 13] would not achieve similar improvements.
Further work will be devoted to the design of compressive systems that are not limited to subsampling the spectral cube. Moreover, the compression can also be extended to depth information, for example using a rangegated camera [17].
Appendix A Gamma Markov random field
Most classical prior distributions (often referred to as regularization terms in the convex optimization literature) for images, such as Laplacian [24] and total variation [49], only penalize variations between neighbouring pixels, ignoring the mean intensity of the image. However, the gamma Markov random field [29], includes a penalization on the mean intensity, which promotes pixels with smaller values. This can be shown by inspecting the marginal distribution (defined as [30]),
(24) 
where is a hyperparameter controlling the degree of smoothness and denotes the set of pixels in the neighbourhood of pixel . For an image of constant intensity , we have for all pixels and spectral bands, yielding the density
(25) 
This dependency on the mean promotes reconstructions with lower background levels, decreasing the acceptance ratio of death and erosion moves (that propose to increase the background levels). In the case of MANIPOP, only one band is considered (). Thus, the bias towards smaller background levels does not impact the overall reconstruction significantly. However, in the MSL case (), the reconstruction quality is reduced, hindering the use of gamma Markov random fields.
Appendix B Empirical prior for the background levels
The prior for the background levels is chosen to minimize the KullbackLeibler divergence in (9), where the correlated model is given by (10). The minimization of (9) can be written as
(26) 
Considering the product of independent gamma distributions in (8), the problem reduces to
(27) 
for all pixels and and wavelengths . The expected values and cannot be obtained in closed form for the PoissonGaussian model of (10). Thus, we approximate them numerically by obtaining MCMC samples of . As explained in [30], offtheshelf sampling strategies (e.g., Hamiltonian Monte Carlo [22, Chapter 9]) do not scale well with the dimension of the problem, being inefficient when applied to large multispectral Lidar datasets. Hence, we consider proposals from a Gaussian approximation of (10) (as detailed in [24]) using the perturbation optimization algorithm [50], accepting or rejecting them according to the MetropolisHastings rule [22, 24]. We generate samples and compute the desired expected values as
(28)  
(29) 
Finally, the values of the hyperparameters are obtained by setting and minimizing (27) with a onedimensional Newton method.
Appendix C Complete expressions of the RJMCMC moves
The birth move of point has an acceptance ratio given by with
where is defined as
Similarly, the death move is accepted with probability , where the term in the second line is replaced by . The dilation move of point is accepted with probability with
where is defined as