Single-Photon Depth Imaging Using a Union-of-Subspaces Model
Abstract
Light detection and ranging systems reconstruct scene depth from time-of-flight measurements. For low light-level depth imaging applications, such as remote sensing and robot vision, these systems use single-photon detectors that resolve individual photon arrivals. Even so, they must detect a large number of photons to mitigate Poisson shot noise and to reject anomalous photon detections from background light. We introduce a novel framework for accurate depth imaging using a small number of detected photons in the presence of an unknown amount of background light that may vary spatially. It employs a Poisson observation model for the photon detections plus a union-of-subspaces constraint on the discrete-time flux from the scene at any single pixel. Together, they enable a greedy signal-pursuit algorithm to rapidly and simultaneously converge on accurate estimates of scene depth and background flux, without any assumptions on spatial correlations of the depth or background flux. Using experimental single-photon data, we demonstrate that our proposed framework recovers depth features with 1.7 cm absolute error, using 15 photons per image pixel and an illumination pulse with 6.7 cm scaled root-mean-square length. We also show that our framework outperforms conventional pixelwise log-matched filtering, which is a computationally efficient approximation to the maximum-likelihood solution, by a factor of 6.1 in absolute depth error.
Computational imaging, LIDAR, single-photon imaging, union of subspaces, greedy algorithms.
1 Introduction
A conventional light detection and ranging (LIDAR) system, which uses a pulsed light source and a single-photon detector, forms a depth image pixelwise using histograms of photon detection times. The acquisition times for such systems are made long enough to detect hundreds of photons per pixel, because accurate depth estimation requires finely binned histograms. In this letter, we introduce a framework for accurate depth imaging using only a small number of photon detections per pixel, despite the presence of an unknown amount of spatially varying background light in the scene. Our framework uses a Poisson observation model for the photon detections plus a union-of-subspaces constraint on the scene's discrete-time flux at any single pixel. Using a greedy signal-pursuit algorithm, a modification of CoSaMP [1], we solve for accurate estimates of scene depth and background flux. Our method forms estimates pixelwise and thus avoids assumptions on transverse spatial correlations that may hinder the ability to resolve very small features. Using experimental single-photon data, we demonstrate that our proposed depth imaging framework outperforms log-matched filtering, which is the maximum-likelihood (ML) depth estimator given zero background light.
Because our proposed framework is photon-efficient while using an estimator that is pixelwise and without background calibration, it can be useful for dynamic low light-level imaging applications, such as environmental mapping using unmanned aerial vehicles.
1.1 Prior Art
The conventional LIDAR technique of estimating depth using histograms of photon detections is accurate when the number of photon detections is high. In the low photon-count regime, the depth solution is noisy due to shot noise. It has been shown that image denoising methods, such as wavelet thresholding, can boost the performance of scene depth recovery in the presence of background noise [2]. Also, an imaging model that incorporates occlusion constraints has been proposed to recover an accurate depth map [3]. However, these denoising algorithms implicitly assume that the observations are Gaussian distributed. Thus, at low photon counts, where depth estimates are highly non-Gaussian [4], their performance degrades significantly [5].
First-photon imaging (FPI) [6] is a framework that allows high-accuracy imaging using only the first detected photon at every pixel. It demonstrated that centimeter-accurate depth recovery is possible by combining the non-Gaussian statistics of first-photon detection with spatial correlations of natural scenes. The FPI framework uses an imaging setup that includes a raster-scanning light source and a lensless single-photon detector. More recently, photon-efficient imaging frameworks that use a detector array setup, in which every pixel has the same acquisition time, have also been proposed [5, 7, 8].
We observe two limitations common to prior active imaging frameworks for depth reconstruction.

Oversmoothing: Many of the frameworks assume spatial smoothness of the scene to mitigate the effect of shot noise. In some imaging applications, however, it is important to capture fine spatial features that occupy only a few image pixels. Methods that assume spatial correlations may produce erroneously oversmoothed images that wash out the scene's fine-scale features. In such scenarios, a robust pixelwise imager is preferable.

Calibration: Many imaging methods assume a calibration step to measure the amount of background flux present in the environment. This calibration mitigates bias in the depth estimate caused by background-photon or dark-count detections, which have high temporal variance. In practical imaging scenarios, however, the background response varies in time, and continuous calibration may not be practical. Furthermore, many methods assume that background flux does not vary spatially. Thus, a calibrationless imager that performs simultaneous estimation of scene parameters and spatially varying background flux from photon detections is useful.
In this letter, we propose a novel framework for depth acquisition that is applied pixelwise and without calibration. At each pixel, our imager estimates the background response along with scene depth from photon detections. Similar to [3], we use a union-of-subspaces constraint for modeling the scene parameters. However, our union-of-subspaces constraint is defined for both the incoherent signal and background waveform parameters that generate photon detections; the constraint in [3] is defined only for the coherent signal waveform that is perturbed by Gaussian noise, not photon noise.
Using the derived imaging model, we propose a greedy signal-pursuit algorithm that accurately solves for the scene parameters at each pixel. We evaluate the photon efficiency of this framework using experimental single-photon data. In the presence of strong background light, we show that our pixelwise imager gives an absolute depth error that is 6.1 times lower than that of the pixelwise log-matched filter.
2 Single-Photon Imaging Setup
Figure 1 illustrates our imaging setup, for one illumination pulse, in which photon detections are made. A focused optical source, such as a laser, illuminates a pixel of the scene with the pulse waveform $s(t)$ that starts at time $t = 0$ and has root-mean-square pulsewidth $T_p$. This illumination is repeated every $T_r$ seconds for a sequence of $N$ pulses. The single-photon detector, in conjunction with a time correlator, is used to time stamp individual photon detections relative to the time at which the immediately preceding pulse was transmitted. These detection times, which are observations of a time-inhomogeneous Poisson process whose rate function combines contributions from the pixel return, background light, and dark counts, are used to estimate scene depth for the illuminated pixel. This pixelwise acquisition process is repeated for all image pixels by raster scanning the light source in the transverse directions.
3 Forward Imaging Model
In this section, we study the relationship between the photon detections and the scene parameters. For simplicity of exposition and notation, we focus on one pixel; the analysis is repeated for each pixel of a raster-scanning or array-detection setup.
Let $\alpha$, $z$, and $b$ be unknown scalar values that represent reflectivity, depth, and background flux at the given pixel. The reflectivity value includes the effects of radial falloff, view angle, and material properties. Then, after illuminating the scene pixel with a single pulse $s(t)$, the backreflected waveform that is incident at the single-photon detector is

(1) $\quad r(t) = \alpha\, s(t - 2z/c) + b, \qquad t \in [0, T_r),$

where $c$ is the speed of light.
3.1 Photodetection statistics
Using (1), we observe that the rate function that generates the photon detections is

(2) $\quad \lambda(t) = \eta\, r(t) + b_d = \eta\,\alpha\, s(t - 2z/c) + \eta\, b + b_d,$

where $\eta \in (0,1]$ is the quantum efficiency of the detector and $b_d$ is the dark-count rate of the single-photon detector.
Let $\Delta$ be the time-bin duration of the single-photon detector. Then, we define $m = T_r/\Delta$ to be the total number of time bins that capture photon detections. Let $\mathbf{y}$ be the vector of size $m \times 1$ that contains the number of photon detections at each time bin after we illuminate the pixel $N$ times with pulse waveform $s(t)$. Then, from photodetection theory [9], we can derive

(3) $\quad y_k \sim \mathrm{Poisson}\!\left( N \int_{(k-1)\Delta}^{k\Delta} \lambda(t)\,dt \right),$

for $k = 1, \ldots, m$. Note that we have assumed that our total pixelwise acquisition time $N T_r$ is short enough that the scene parameters $(\alpha, z, b)$ are constant during that time. Let

(4) $\quad x_j = \eta\,\alpha \int_{(j-1)\Delta}^{j\Delta} \delta(t - 2z/c)\,dt,$

(5) $\quad B = (\eta\, b + b_d)\,\Delta,$

(6) $\quad S_{k,j} = \int_{(k-1)\Delta}^{k\Delta} s\big(t - (j-1)\Delta\big)\,dt,$

for $j, k = 1, \ldots, m$, where $\delta(\cdot)$ is the Dirac delta function, so that $x_j$ isolates the time bin containing the round-trip delay $2z/c$, $B$ collects the per-bin background and dark-count contributions, and the columns of the $m \times m$ matrix $S$ are integrated shifts of the pulse shape. Defining $\mathbf{1}$ to be an $m \times 1$ vector of ones, we can approximate the rate function in (3), to within the time-bin quantization of the delay $2z/c$, and rewrite the distribution as

(7) $\quad y_k \sim \mathrm{Poisson}\!\left( N\, [\,S\mathbf{x} + B\,\mathbf{1}\,]_k \right),$

for $k = 1, \ldots, m$. Finally, defining $A = [\,S \;\; \mathbf{1}\,]$ and $\boldsymbol{\theta} = [\,\mathbf{x}^T \;\, B\,]^T$, we can further rewrite (7) as

(8) $\quad y_k \sim \mathrm{Poisson}\!\left( N\, [A\boldsymbol{\theta}]_k \right).$
So far, we have simplified the pixelwise single-photon observation model so that the photon-count vector $\mathbf{y}$ is a linear measurement of the scene response vector $\boldsymbol{\theta}$ corrupted by Poisson noise.
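As a concrete sanity check, this simplified observation model can be simulated directly: draw Poisson counts whose mean is proportional to a shifted pulse plus a flat background. All numerical values and the Gaussian pulse shape below are illustrative assumptions, not the experimental settings; the matrix of circularly shifted pulse columns plays the role of the kernel matrix described above.

```python
import numpy as np

rng = np.random.default_rng(0)

m = 100            # number of time bins (illustrative)
N = 50             # number of illumination pulses (illustrative)
delta = 1.0        # bin duration, arbitrary units

# Gaussian pulse sampled on the bin grid (assumed pulse shape).
t = np.arange(m) * delta
pulse = np.exp(-0.5 * ((t - 5.0) / 2.0) ** 2)
pulse /= pulse.sum()

# Kernel matrix: column j is the pulse circularly shifted by j bins
# (a circular shift is a reasonable stand-in, since the illumination
# is periodic with period T_r).
S = np.column_stack([np.roll(pulse, j) for j in range(m)])

# Scene response: one nonzero entry at the true depth bin, plus background B.
j_true = 42
x = np.zeros(m)
x[j_true] = 0.8            # plays the role of eta * alpha
B = 0.002                  # per-bin background-plus-dark contribution
theta = np.concatenate([x, [B]])

# A = [S  1]; photon counts are Poisson with mean N * (A @ theta).
A = np.hstack([S, np.ones((m, 1))])
y = rng.poisson(N * (A @ theta))
```

Because the signal part of `theta` has a single nonzero entry, the resulting histogram `y` is a noisy shifted pulse riding on a flat background, which is exactly the structure the estimator must exploit.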
3.2 Scene parameter constraints
Using the expression in (4), we observe that

(9) $\quad x_j = \eta\,\alpha\,\mathbb{1}\{(j-1)\Delta \le 2z/c < j\Delta\}$

(10) $\quad\;\; = \eta\,\alpha\,\mathbb{1}\{\, j = \lceil 2z/(c\Delta) \rceil \,\},$

for $j = 1, \ldots, m$, where $\mathbb{1}\{\cdot\}$ is an indicator function that equals $1$ if its argument holds and $0$ otherwise. In other words, the vector $\mathbf{x}$ has exactly one nonzero element, and the value and index of the nonzero element represent the scene reflectivity and depth at an image pixel, respectively.
We define our signal $\boldsymbol{\theta} = [\,\mathbf{x}^T \;\, B\,]^T$ to be a concatenation of $\mathbf{x}$, which is the scene response vector of size $m \times 1$, and $B$, which is the scalar representing background flux. Since $\mathbf{x}$ has exactly one nonzero entry, $\boldsymbol{\theta}$ lies in the union of subspaces defined as

(11) $\quad \mathcal{U} = \bigcup_{i=1}^{m} \left\{ \boldsymbol{\theta} \in \mathbb{R}^{m+1} : \theta_j = 0 \text{ for all } j \notin \{i,\, m+1\} \right\},$

where each subspace is of dimension $2$.
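Projection onto the closest subspace in this union has a particularly simple form: keep the largest-magnitude entry of the signal part together with the background coordinate, and zero everything else. A minimal sketch (the function name and array layout are ours):

```python
import numpy as np

def project_uos(theta):
    """Project theta = [x; B] onto the closest subspace in the union:
    keep the single largest-magnitude entry of x, zero the rest,
    and always retain the background coordinate B."""
    theta = np.asarray(theta, dtype=float)
    x, B = theta[:-1], theta[-1]
    j = np.argmax(np.abs(x))       # best depth-bin index
    x_proj = np.zeros_like(x)
    x_proj[j] = x[j]
    return np.concatenate([x_proj, [B]])
```

For example, `project_uos([0.1, 0.9, 0.2, 0.05])` keeps only the `0.9` entry of the signal part along with the background value `0.05`.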
4 Solving the Inverse Problem
Using accurate photodetection statistics and scene constraints, we have interpreted the problem of robust single-photon depth imaging as a noisy linear inverse problem, where the signal of interest lies in the union of subspaces $\mathcal{U}$. Using (8), the observed photon-count histogram has the probability mass function

(12) $\quad P(\mathbf{y}; \boldsymbol{\theta}) = \prod_{k=1}^{m} \frac{\left( N [A\boldsymbol{\theta}]_k \right)^{y_k} \exp\!\left( -N [A\boldsymbol{\theta}]_k \right)}{y_k!}.$
Thus, neglecting terms in the negative log-likelihood function that are dependent on $\mathbf{y}$ but not on $\boldsymbol{\theta}$, we can define the objective function

(13) $\quad \mathcal{L}(\boldsymbol{\theta}; \mathbf{y}) = \sum_{k=1}^{m} \left( N [A\boldsymbol{\theta}]_k - y_k \log [A\boldsymbol{\theta}]_k \right).$
This objective can be proved to be convex in $\boldsymbol{\theta}$.
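Up to the constant terms that were dropped, the objective is a sum of a linear term and a negative logarithm of a linear term, and is straightforward to evaluate. The sketch below (names are ours) assumes all entries of the measurement product are strictly positive so the logarithm is finite:

```python
import numpy as np

def neg_log_likelihood(theta, A, y, N):
    """Evaluate the objective: sum_k ( N*[A theta]_k - y_k * log [A theta]_k ).
    Assumes [A theta]_k > 0 for all k."""
    rate = A @ theta
    return float(np.sum(N * rate - y * np.log(rate)))
```

Convexity follows because each summand is a convex function (linear plus negative log) of the linear map $A\boldsymbol{\theta}$.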
We solve for $\boldsymbol{\theta}$ by minimizing $\mathcal{L}(\boldsymbol{\theta}; \mathbf{y})$ with the constraint that $\boldsymbol{\theta}$ lies in the union of subspaces $\mathcal{U}$. Also, because photon flux is a nonnegative quantity, the minimization results in a more accurate estimate when we include a nonnegative signal constraint. In summary, the optimization problem that we want to solve can be written as

(14) $\quad \hat{\boldsymbol{\theta}} = \underset{\boldsymbol{\theta}}{\arg\min}\;\; \mathcal{L}(\boldsymbol{\theta}; \mathbf{y})$
$\qquad\;\;$ s.t. $\;\boldsymbol{\theta} \in \mathcal{U}, \quad \boldsymbol{\theta} \ge 0.$
To solve our constrained optimization problem, we propose an algorithm that is inspired by existing fast greedy algorithms for sparse signal pursuit. CoSaMP [1] is a greedy algorithm that finds a sparse approximate solution to a linear inverse problem. We modify the CoSaMP algorithm so that we obtain a solution constrained to the union of subspaces $\mathcal{U}$, instead of a sparse one.
Fig. 2. (a) Photograph. (b) Ground truth. (c) Log-matched filter. (d) Proposed. (e) Error of (c). (f) Error of (d).
Our proposed greedy algorithm is given in Algorithm 1. We define $[\,\cdot\,]_+$ to be the thresholding operator setting all negative entries of its argument to zero, $\mathrm{supp}(\boldsymbol{\theta})$ to be the set of indices of nonzero elements of $\boldsymbol{\theta}$, and $\mathcal{T}(\boldsymbol{\theta}, r)$ to be the vector that approximates $\boldsymbol{\theta}$ with its largest $r$ terms. Also, we take $A_\Omega$ to be a matrix with columns of $A$ chosen by the index set $\Omega$. Finally, we use $A^T$ and $A^\dagger$ to denote the transpose and pseudoinverse of matrix $A$, respectively.
To solve the constrained optimization problem in (14), our algorithm iteratively performs the following steps:

gradient descent on the objective $\mathcal{L}(\boldsymbol{\theta}; \mathbf{y})$ in (13), which is approximated by the squared $\ell_2$-norm data-fidelity term $\|\mathbf{y}/N - A\boldsymbol{\theta}\|_2^2$ for computational efficiency;

projection of the intermediate estimate onto the closest subspace in the union of subspaces $\mathcal{U}$; and

projection of the intermediate estimate onto the nonnegative cone,
until a convergence criterion is satisfied. We define convergence of the solution as $\|\boldsymbol{\theta}^{(i)} - \boldsymbol{\theta}^{(i-1)}\|_2 < \epsilon$, where $\boldsymbol{\theta}^{(i)}$ is the estimate at iteration $i$ and $\epsilon$ is a small number.
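The three steps above can be sketched end to end. The code below is our simplified illustration of this modified-CoSaMP loop (the function name, the size of the candidate set, and the least-squares refinement on the merged support are our choices), not a faithful transcription of Algorithm 1:

```python
import numpy as np

def greedy_uos(y, A, N, eps=1e-6, max_iter=50):
    """Greedy pursuit sketch for the union-of-subspaces-constrained
    Poisson inverse problem. theta = [x; B]: x has one nonzero entry
    (depth/reflectivity), B >= 0 is the background coordinate."""
    m = A.shape[1] - 1                 # signal length; last column is background
    theta = np.zeros(m + 1)
    for _ in range(max_iter):
        theta_prev = theta.copy()
        # 1) Gradient of the squared-norm surrogate ||y/N - A theta||^2.
        g = A.T @ (y / N - A @ theta)
        # 2) Merge the current signal support with the best new candidates.
        cand = set(np.argsort(np.abs(g[:m]))[-2:]) | set(np.flatnonzero(theta[:m]))
        support = sorted(cand) + [m]   # background column always included
        # 3) Least-squares estimate restricted to the merged support.
        sol, *_ = np.linalg.lstsq(A[:, support] * N, y, rcond=None)
        tmp = np.zeros(m + 1)
        tmp[support] = sol
        # 4) Project onto the union of subspaces: keep one signal entry + B.
        j = np.argmax(np.abs(tmp[:m]))
        theta = np.zeros(m + 1)
        theta[j], theta[m] = tmp[j], tmp[m]
        # 5) Project onto the nonnegative cone.
        theta = np.maximum(theta, 0.0)
        if np.linalg.norm(theta - theta_prev) < eps:
            break
    return theta
```

On noiseless data generated from the linear model, a loop of this form locks onto the correct depth bin and background level in very few iterations; the Poisson objective is used here only through its squared-norm surrogate, matching the computational shortcut described above.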
5 Experimental results
To validate our imaging framework, we used a dataset collected by D. Venkatraman for the First-Photon Imaging project [6]; this dataset and others are available from [10]. The experimental setup uses a pulsed laser diode as the illumination source, a two-axis galvo to raster scan a mannequin face, and a lensless SPAD detector for detection. The background light level was set using an incandescent lamp. The original mannequin data from [10] had a background count rate approximately equal to the signal count rate; our experiment uses the cropped data showing only the mannequin's face. Although we used a raster-scanning setup for our single-photon imaging experiments, our imaging algorithm is applied pixelwise, so it can also be used for imaging with a floodlight illumination source and a single-photon detector array.
A natural comparison for our imaging method is the ML estimator of the scene parameters. Unfortunately, due to nonzero background flux, ML estimation of $\alpha$, $z$, and $b$ requires minimizing a nonconvex cost function, leading to a solution without an accuracy guarantee. Thus, zero background is conventionally assumed, so that the ML depth estimate reduces to the simple log-matched filter [11]:

(15) $\quad \hat{z}_{\mathrm{ML}} = \frac{c\Delta}{2}\,(\hat{j} - 1), \qquad \hat{j} = \underset{j \in \{1, \ldots, m\}}{\arg\max} \;\sum_{k=1}^{m} y_k \log S_{k,j}.$

We use (15) as the baseline depth estimator that is compared with our proposed estimator.
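Operationally, the log-matched filter correlates the photon-count histogram with the logarithm of the shifted pulse shape and picks the best shift. A sketch under the same illustrative discretization as before (the names and the small floor that keeps the logarithm finite are ours):

```python
import numpy as np

def log_matched_filter(y, S, delta, c=3e8):
    """Pixelwise log-matched filter: choose the shift j maximizing
    sum_k y_k * log S[k, j], then convert the (0-indexed) bin to depth.
    A small floor keeps the log finite where the pulse is ~zero."""
    logS = np.log(np.maximum(S, 1e-12))
    j_hat = int(np.argmax(y @ logS))
    return c * delta * j_hat / 2.0
```

With zero background this is the ML depth estimate; with strong background, stray detections far from the pulse location can dominate the correlation, which is exactly the failure mode observed below.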
Figure 2 shows the results of recovering the depth of the mannequin face using single-photon observations. The kernel matrix $S$ was obtained by an offline measurement of the pulse shape. Note that this measurement depends only on the source, not on properties of the scene. The ground-truth depth, shown in Fig. 2(b), was generated separately by using background-calibrated ML estimation from a large number of detected photons at each pixel.
In our depth imaging experiment, the number of photon detections at each pixel was set to 15. We observe that, due to extraneous background photon detections, the log-matched filter estimate in Fig. 2(c) is corrupted with high-variance noise and the facial features of the mannequin are not visible. On the other hand, our estimate, shown in Fig. 2(d), achieves high-accuracy depth recovery, with a 1.7 cm absolute depth error that is a factor of 6.1 lower than that of the log-matched filter. As shown by the error maps in Figs. 2(e) and (f), both methods fail in depth recovery in the face boundary regions, where very little light is reflected back from the scene to the single-photon detector and the signal-to-background ratio is thus very low. Also, we observe that our estimated background level, averaged over all pixels, was very close to the calibrated true background level. Code and data used to generate results can be downloaded from [12].
6 Conclusions and Future Work
In this letter, we presented an imaging framework for calibrationless, pixelwise depth reconstruction using single-photon observations. Our imaging model combined photon detection statistics with discrete-time flux constraints expressed using a union-of-subspaces model. Then, using our imaging model, we developed a greedy algorithm that recovers scene depth by solving a constrained optimization problem.
Our pixelwise imaging framework can be used in low light-level imaging applications, where the scene being imaged has fine features and filtering techniques that exploit patchwise smoothness can potentially wash out those details. For example, it can be useful in applications such as airborne remote sensing [13], where the aim is to recover finely featured 3D terrain maps.
It is straightforward to generalize the proposed single-photon imaging framework to multiple-depth estimation, where more than one reflector may be present at each pixel. In the case of estimating the depths of $K$ reflectors at a pixel, the 1-sparsity assumption on $\mathbf{x}$ must be changed to a more general $K$-sparsity assumption when defining the union-of-subspaces constraint.
Footnotes
This material is based upon work supported in part by a Samsung Scholarship, by the US National Science Foundation under Grant No. 1422034, and by the MIT Lincoln Laboratory Advanced Concepts Committee.
D. Shin and J. H. Shapiro are with the Department of Electrical Engineering and Computer Science and the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139 USA.
V. K. Goyal is with the Department of Electrical and Computer Engineering, Boston University, Boston, MA 02215 USA.
References
 D. Needell and J. A. Tropp, “CoSaMP: Iterative signal recovery from incomplete and inaccurate samples,” Appl. Comput. Harmon. Anal., vol. 26, no. 3, pp. 301–321, 2009.
B.-Y. Sun, D.-S. Huang, and H.-T. Fang, “Lidar signal denoising using least-squares support vector machine,” IEEE Signal Process. Lett., vol. 12, no. 2, pp. 101–104, 2005.
 P. T. Boufounos, “Depth sensing using active coherent illumination,” in Proc. IEEE Int. Conf. Acoust., Speech, and Signal Process., 2012, pp. 5417–5420.
 A. McCarthy, R. J. Collins, N. J. Krichel, V. Fernández, A. M. Wallace, and G. S. Buller, “Long-range time-of-flight scanning sensor based on high-speed time-correlated single-photon counting,” Appl. Optics, vol. 48, no. 32, pp. 6241–6251, 2009.
 D. Shin, A. Kirmani, V. K. Goyal, and J. H. Shapiro, “Photon-efficient computational 3-D and reflectivity imaging with single-photon detectors,” IEEE Trans. Comput. Imaging, vol. 1, 2015, to appear.
 A. Kirmani, D. Venkatraman, D. Shin, A. Colaço, F. N. Wong, J. H. Shapiro, and V. K. Goyal, “First-photon imaging,” Science, vol. 343, no. 6166, pp. 58–61, 2014.
 D. Shin, A. Kirmani, V. K. Goyal, and J. H. Shapiro, “Computational 3-D and reflectivity imaging with high photon efficiency,” in Proc. IEEE Int. Conf. Image Process., Oct. 2014, pp. 46–50.
 Y. Altmann, X. Ren, A. McCarthy, G. S. Buller, and S. McLaughlin, “Lidar waveform based analysis of depth images constructed using sparse single-photon data,” arXiv preprint arXiv:1507.02511, 2015.
 D. L. Snyder, Random Point Processes. Wiley, New York, 1975.
 “GitHub repository for photonefficient imaging,” https://github.com/photonefficientimaging/sampledata/.
 R. E. Blahut, Principles and Practice of Information Theory. AddisonWesley Longman Publishing Co., Inc., 1987.
 “GitHub repository for unionofsubspace imaging,” https://github.com/photonefficientimaging/uosimaging/.
 M. Nilsson, “Estimation of tree heights and stand volume using an airborne lidar system,” Remote Sensing of Environment, vol. 56, no. 1, pp. 1–7, 1996.