Imaging with SPADs and DMDs: Seeing through Diffraction-Photons
This paper addresses the problem of imaging in the presence of diffraction-photons. Diffraction-photons arise from the low contrast ratio of DMDs ( ), and very much degrade the quality of images captured by SPAD-based systems.
Herein, a joint illumination-deconvolution scheme is designed to overcome diffraction-photons, enabling the acquisition of intensity and depth images. Additionally, a proof-of-concept experiment is conducted to demonstrate the viability of the designed scheme. It is shown that by co-designing the illumination and deconvolution phases of imaging, one can substantially overcome diffraction-photons.
Today, imaging with single-photon avalanche diodes (SPADs) is rapidly gaining attention. The high sensitivity and temporal resolution of SPADs is finding favor in a variety of applications, ranging from light-in-flight recording [Gariepy] to imaging around corners [NLOS].
Traditionally, photon detection has relied on photomultiplier tubes (PMTs). PMTs have a fast response time and provide a high amplification gain, yet they are fairly large vacuum tubes and thus not well-suited for cameras.
Nowadays, SPADs are increasingly becoming the preferred solid-state device for photon detection. Low dark counts ( Hz) and high quantum efficiencies (up to ) [MPD] are among the attractive features of SPADs.
SPAD cameras are, however, still in their infancy. Currently, commercially available SPAD cameras have low pixel resolutions ( pixels). To increase the spatial resolution of single-pixel SPAD cameras, raster-scanning, whereby pixel values of a scene are sequentially acquired, is typically used.
Raster-scanning is commonly achieved by means of galvanometric mirrors. In this approach, light is steered to positions of a scene mechanically via a pair of mirrors. Such mechanical systems, however, are cumbersome.
In contrast to single-point illumination in galvanometric systems, digital micro-mirror devices (DMDs) can project optical patterns. A DMD is a spatial light modulator (SLM), comprising a matrix of micro-mirrors integrated with a semiconductor chip. When light shines on a DMD, high-resolution ( pixels) [TI] optical patterns are projected; in our experiment, we use these optical patterns to illuminate a scene.
A challenging drawback of DMD-based projection systems, compared with their galvanometric counterparts, is the presence of scattered light. Light diffracted by edges of DMD structures, such as mirrors, gives rise to scattered light; this light can undergo multiple internal reflections within the DMD and exit as a wide cone of light, Fig. 1. Due to scattered light, scene locations that would ideally be masked by DMD mirrors in the off-state are illuminated; this imperfection hinders the collection of quality images.
The degree to which a DMD system rejects light is quantified by the contrast ratio111The full-on/full-off contrast ratio is defined as , where is the measured luminance when all DMD mirrors are in the on-state, resulting in a fully illuminated screen, and is the measured luminance when all mirrors are in the off-state.. The higher the contrast ratio, the lower the impact of scattered light on image quality. We shall refer to photons originating from a DMD’s scattered light as diffraction-photons, Fig. 1.
There are further challenges. For example, increasing the illumination power may not always be a feasible option for enhancing image quality, such as when imaging sensitive biological samples [Handbook]. Moreover, it is desirable to operate an imaging system in a gateless manner in order to reduce the number of measurements.
This study aims to overcome diffraction-photons and provide an alternative to raster-scanning. Using a SPAD, a DMD, and a co-designed illumination-deconvolution scheme, we experimentally demonstrate a system capable of acquiring intensity and depth images in an environment tainted by diffraction-photons.
The contributions of this study are as follows:
The central idea of this paper can be summarized as follows:
Project overlapping illumination blocks to overcome diffraction-photons, and let deconvolution algorithms untangle pixel values.
The remainder of this paper is organized into seven sections: In Section II, we introduce some preliminaries. In Section III, we review prior work. In Section IV, we describe a method for scene illumination, and we then present our image-formation models in Section V. In Section VI, we describe details of our experimental setup. In Section VII, we present and discuss our experimental results. In Section VIII, we draw conclusions from our findings and outline future research directions.
In this section, we introduce some principles that will serve as starting points for the subsequent sections. This section is based in part on [System].
Each pixel, , of the scene has a depth and reflectivity , and is illuminated times by a laser pulse with a waveform . The rate of photons impinging on a SPAD is given by
where is the speed of light and is the average photon rate due to ambient light.
The quantum efficiency of the SPAD lowers the photon rate by a factor of , and dark counts, , of the SPAD are added:
A time-correlated single-photon counting (TCSPC) system quantizes the observation interval, , into time-bins of duration . Accordingly, the average number of photon counts in a time-bin (per illumination) is
where is the index of a time-bin, .
After illuminations, the number of photons, , detected in a time-bin draws values from a Poisson distribution:
The signal-to-background ratio (SBR), a measure of signal fidelity, is defined as follows: SBR, where and are the mean number of signal and noise photons, respectively. Moreover, deadtime is defined as the time a SPAD needs to reset to its initial state after an avalanche event occurs. During this period of time, arriving photons will go undetected.
Iii Prior Work
Based on the method of acquisition employed, the literature can be categorized into the following classes:
Raster-Scanning: In raster-scanning pixel values are sequentially acquired. Typically, a pair of galvanometric mirrors or a motorized translation stage is used to obtain a scene’s spatial content [First_Photon, Stanford, Photonics, Shin].
Our imaging system and systems in this category require the same number of measurements to reconstruct an intensity and depth image. The main problems, however, associated with mechanical scanning systems such as galvanometric mirrors or motorized translation stages is that they are substantially large, require high power levels, and in the case of galvanometric mirrors, suffer from geometric distortion [ThorlabsD]. A solution to these problems is to employ a DMD, which we use in our experiment.
Time-Gated Compressive Sensing: In [Sweep2, Sweep3, Sweep4], 3D imaging systems based on compressive sensing were demonstrated. Using range gating, time-slices of a scene are obtained, from which depth images are reconstructed.
While such systems can be used to reduce the impact of diffraction-photons, a full time sweep is required to construct a 3D image, as gating is needed to distinguish objects that fall within the same range interval.
What sets our imaging system apart from others in this category is that it is gateless, hence requiring fewer measurements to reconstruct a 3D image. The number of measurements in our imaging system is equal to the number of pixels, , as opposed to for gated compressed sensing, where is a selected constant for a given image sparsity/compressibility (typically in the range between 2 and 4) and is the number of time-bins ().
Gateless Compressive Sensing: Methods of circumventing the need for time gating have been proposed. In [Andrea], an imaging system able to construct a depth image in a gateless manner was demonstrated. Prior to recovering a scene’s spatial information, parametric deconvolution is first used to estimate the set of depths present in scene. Once the set of depths is determined, a compressive sensing approach is used to recover a scene’s spatial content.
A major obstacle to adopting the aforementioned system is the presence of deadtime-interference. To give an example, consider the following: an object in the background of a scene may not be visible as photons reflected from it may go undetected due to the deadtime initiated by preceding photons reflected from foreground objects. A way to resolve this obstacle is to use a photo diode operating in the linear-mode (analogue mode) [Sun], as opposed to Geiger-mode (photon-counting mode); this, however, will entail a loss in photon detection sensitivity.
In our experiment, we employ a SPAD (Geiger-mode), enabling single-photon detection sensitivity. We also use block illumination (Section IV), which ameliorates deadtime-interference, allowing our imaging system to operate in gateless manner.
Epipolar Imaging: A relatively new method of image acquisition is epipolar imaging [Epi1, Epi2, Epi3]. In epipolar imaging, an illumination sheet sweeps through a scene, and an array of pixels on an epipolar plane of illumination detects incoming light. This method of imaging can potentially overcome diffraction-photons as it limits indirect light. To realize such system, however, some hardware, such as controllable mirrors, is needed to select an epipolar plane. Our imaging system is free of such hardware, making it simpler to operate.
Trading-off spatial resolution in return for a higher signal power can be made. At low illumination levels, raster-scanning (Fig. 1(a)) can yield inaccurate estimations of intensity and depth images due to the presence of diffraction-photons and low SBR. To improve signal power, one can scan large patches of the scene; this, however, comes at the expense of a loss in spatial resolution (Fig. 1(b)).
A simple yet effective approach to boosting signal power while maintaining a given spatial resolution is to use an illumination window (for example, of size ) that scans the scene in steps of one pixel (Fig. 1(c) provides an illustration). This approach improves signal power as more photons are collected and simultaneously retains the native resolution. Additionally, deadtime-interference is alleviated as the optical power is concentrated within an illumination block, where depth values of a scene are notably correlated.
A by-product, however, of using overlapping scanning blocks (Fig. 1(c)) is a blur artifact as photons from adjacent pixels are collected. In Section V, we overcome this challenge via a deconvolution technique designed for the problem at hand.
A key advantage of the proposed scheme is that it provides (via optimization) a means to relax the trade-off between spatial resolution and signal power.
Let us now discuss the two prime factors that need to be taken into consideration when selecting an illumination window size, :
Deadtime and SPAD saturation: Increasing the window size results in more signal photons arriving at the SPAD. This, however, can improve performance only to a limited extent, because the deadtime causes the SPAD to saturate at high photon rates.
Contrast ratio and diffraction-photons: A DMD with a high contrast ratio requires a relatively small window size, because fewer diffraction-photons are emitted from it; the converse is equally true.
V Image Formation
Using the illumination scheme (Fig. 1(c)) presented in Section IV, we now describe approaches to recovering intensity and depth images from photon-arrival events. We first formulate the image reconstruction process as a discrete inverse problem and then solve it by a convex optimization algorithm.
V-a Intensity Image
The goal here is to reconstruct an intensity image, , with the aid of the proposed illumination scheme (see Fig. 1(c)). The input–output relationship, in the absence of noise, can be described as follows:
where denotes spatial convolution, is the point-spread function that describes the proposed illumination scheme (Fig. 1(c)), is the latent intensity image, and is the observed number of photon counts at pixel .
Let denote the vectorized representations of image —column-wise stacked. Likewise, let denote the observation vector, where is the total number of photon counts at the pixel.
The two-dimensional convolution in Eq. 5 can be expressed as matrix–vector multiplication:
where is the DMD reflection coefficient. When a pixel of the scene is within an illumination block (see Fig. 1(c)), takes on a value of one—full reflection. Other pixels of the scene are, however, moderately illuminated due to the low contrast ratio of the DMD: when a DMD mirror is in the off-state, a fraction of the optical power illuminates the pixels of the scene; we denote this fraction by . More formally:
where denotes the number of rows of the image, and is the length of an illumination window of size (Fig. 1(c) provides an illustration). Additionally, let “” denote the smallest positive integer “” such that .
The following four steps are performed to recover (Eq. 6):
In Eq. 6, is a Poisson distributed random vector; to use an -norm apt for Gaussian noise, we apply an Anscombe transform to convert the signal-dependent Poisson noise of to (approximately) additive Gaussian noise with a constant variance [Anscombe]:
The aim here is to denoise a blurred image prior to deconvolving it:
where and is a denoised yet blurred image. A constraint is added because the minimum value of Eq. 8 is , which occurs when .
Algebraic inversion of Eq. 8 produces a bias; we therefore use a maximum likelihood (ML) transformation [Foi]:
where can be regarded as the denoised version of (Eq. 6). In this step, a lookup table is used to map the values of to .
We now wish to solve the following system of equations . Matrix is ill-conditioned. Accordingly, we consider a regularized least-squares approach to recover our latent image:
Limitation: A limitation of the procedure described (in steps 1–4) is that a low or high value of in Eq. 9 may diminish the quality of the image recovered in step 4 (Eq. 11), and thus one must carefully select . In our experimental results (Section VII), the value of is chosen by inspection: the value of that produces the best result is selected.
V-B Depth Image
The aim here is to obtain a depth image, , from photon-arrival events. In the absence of noise, the input–output relationship can be expressed as follows:
Here, two convolution processes take place:
is a 2D spatial convolution over , describing the convolution of the scene with the proposed overlapping illumination blocks, (Fig. 1(c) provides an example).
is a 1D temporal convolution over , describing the convolution of the scene with the waveform of the illumination pulse, .
Each pixel, , of the scene has a depth, , value and a corresponding time-of-flight (ToF), , which can be represented by the following impulse response:
where is a Dirac function (Fig. 2(a)) and is the speed of light. Likewise, in the continuous spatial and temporal domain, function in Eq. 12 is zero except at the ToF. Additionally, let denote the number of photons at a given space-time point .
Discretization: The continuous volume containing the scene is discretized into a voxel grid: the DMD discretizes the scene spatially into pixels, and the TCSPC module discretizes the scene temporally into time-bins. In this voxel grid, a discrete depth image is represented by a vector .
To keep the notation light, the discrete version of in this voxel domain is denoted by —akin to Eq. 13, but now for a complete image—and is given by
where is the time-bin, is the speed of light, is the depth at pixel , is the number of time-bins, is the total number of pixels, and is a Kronecker delta function defined as
Each column of matrix in Eq. 14 represents a time-slice of the voxel grid at a particular time-bin, , and each row is allotted to a single pixel of the scene. Additionally, matrix is a binary matrix, and the summation of entries along each row of has a value of one—as each pixel can only have a single depth.
The convolution operators in Eq. 12 can be represented as a matrix multiplication:
Here, each row of matrix is a time-histogram of the illumination waveform and is a circular shift of its preceding row. When a row of is multiplied by matrix , the illumination waveform is shifted to the ToF (Fig. 2(a) provides a detailed example).
In Eq. 15, matrix is the observed time-histogram at the detector (Fig. 4 provides an illustration). Each row of is a time-histogram, resulting from convolving the scene with both an illumination window and pulse waveform:
Optimization: The inverse problem of recovering (Eq. 15) can be formulated using the following optimization problem:
where is a regularizer weighted by . Eq. 18 is a combinatorial optimization problem, and it is challenging to solve within a reasonable period of time (the search space is ).
A tractable method for inferring depth from Eq. 18 can be attained using the following three steps:
The optimization problem for each time-slice (Fig. 2(b)) of the voxel grid is
Solving this optimization problem for each yields matrix . We solve this convex optimization problem using an ADMM algorithm. Details of the ADMM algorithm are provided in the Appendix.
Prior to temporally deconvolving our image, we apply an -order median filter. Experimentation has shown that this filter significantly improves our results.
Let denote the row of matrix . The -order median filter of is
from which we obtain .
Using from the previous step, we can now determine directly in a per-pixel manner. Vector contains the received signal for pixel , which can be regarded as a delayed and scaled version of the transmitted pulse. The ToF (or signal delay) for each pixel is independently obtained by cross-correlating with the waveform of the transmitted pulse, , and finding the maximum, as follows:
For simplicity of notation, denotes the element of vector . Using Eq. 21, the depth at pixel, , is , from which we obtain .
Vi Hardware and Experimental Setup
Fig. 5 shows a schematic of the experimental setup. The illumination source is a laser diode (LDH-P-C-650, PicoQuant) with a central wavelength of nm. The laser is controlled by a laser driver (PDL 828 Sepia II, PicoQuant) and emits pulses with a duration of ps (FWHM) and a repetition rate of MHz.
An optical beam expander enlarges the laser beam by a factor of . The DMD (in a DLP 4500 projector, Texas Instruments) spatially modulates the incoming light and projects it towards the scene. The contrast ratio of the DMD is .
Photons reflected by surfaces of the scene are detected by a SPAD (PDM m Series, Micro Photon Devices) with a ps (FWHM) temporal resolution. The SPAD has a quantum efficiency, ns deadtime, active area diameter of m, and dark counts rate of Hz. Photons detected by the SPAD are time-stamped, relative to a sent laser pulse, with a resolution of ps by a TCSPC module (PicoHarp 300, PicoQuant).
The scene is spatially sampled by the DMD into pixels, forming an image of size rows by columns. The observation interval is divided by the TCSPC module into time-bins, each with a duration of ps. A total of laser pulses are transmitted for each measurement. The integration time for each measurement is ms.
The scene consists of a ball ( cm radius) placed in front of a screen. The ball is cm away from the SPAD, and the horizontal distance between the SPAD and the background screen is cm. Table I provides a summary of the experimental parameters.
|Average dark-count rate of SPAD||Hz|
|Center wavelength of laser||nm|
|Contrast ratio of DMD|
|Diameter of active area of SPAD||m|
|Integration time per measurement||ms|
|Laser pulse duration||ps|
|Number of transmitted laser pulses per measurement|
|Number of time-bins|
|Quantum efficiency of SPAD|
|Repetition rate of laser pulses||MHz|
|Timing resolution of SPAD||ps|
|Timing resolution of TCSPC module||ps|
|Non-extensible deadtime [DeadTime]. Full width at half maximum.|
Vii Results and Discussion
In this section, we present and discuss the results of a proof-of-concept experiment. The goal is to demonstrate the ability of the designed illumination-deconvolution scheme (Sections IV and V) to overcome diffraction-photons.
Fig. 6 displays histograms of our experimental data, and the statistics of our measurements are as follows: the average number of noise (ambient + dark) photons was measured and found to be photons/pixel. The average number diffraction-photons with noise was measured and found to be photons/pixel. For consistency, we have taken samples for all measurements presented in this section—this value is equal to the number of pixels of the sought image.
The average number of photons (signal + noise + diffraction) per pixel for raster-scanning, and illumination blocks of size , , and , was measured and found to be , , , and photons/pixel, respectively.
Fig. 6(b) and Figs. 7(a)–7(e), display data gathered from the experiment for the observation vector (Eq. 6) and matrix (Eq. 17), respectively. Fig. 6(d) and Figs. 7(h)–7(j) present the performance of the proposed scheme.
A rich body of research exists on 3D structure and reflectivity imaging with photon-counting detectors. However, this is the first paper to address the issue of diffraction-photons in such systems. We therefore believe that, in addition to presenting the performance of our scheme, it would be beneficial to compare it with commonly used benchmarks ([System] and [First_Photon]).
Fig. 7 and 8 show that the proposed scheme reveals a latent image amid a bath of diffraction-photons ( photons/pixel, Fig. 6). Methods based on traditional raster-scanning [First_Photon, System], however, are adversely affected by diffraction-photons. The reason for this is that the average number of signal photons per pixel for raster-scanning is greatly below the number of diffraction-photons; this results in diminished image quality.
The proposed scheme has an improved performance as it takes co-designed illumination and deconvolution approach to solve the image capturing problem, which boosts the average number of signal photons collected per pixel (Fig. 6) and maintains the native image resolution as the relatively large illumination blocks overlap (Fig. 2). For our experiment, an illumination window size of produced the best results: this size is a workable balance between signal photons collected and blur introduced.
It is important to point out, however, that the performance improvement of the proposed scheme over [System] and [First_Photon] should be viewed with some caution, because [System] and [First_Photon] are designed for imaging environments without diffraction-photons. As such, they produce a lower image quality than our scheme.
This study set out to overcome diffraction-photons. We described a method for acquiring intensity and depth images in an environment tainted by diffraction-photons ( photons/pixel).
A proof-of-concept experiment demonstrates the viability of the designed scheme. It is shown that at a low DMD contrast ratio ( ), diffraction-photons can be overcome via a joint illumination-deconvolution scheme, enabling the acquisition of intensity and depth images.
The scheme works as the number of signal photons collected is boosted by overlapping sizeable illumination blocks. The overlapping blocks mix pixel values, which are subsequently untangled by deconvolution algorithms.
The central conclusion that can be drawn from this study is that the designed scheme offers a means to relax the trade-off between spatial resolution and signal power—this is achieved mainly through convex optimization techniques.
A promising avenue for future research is to determine the optimal illumination window size mathematically, for a given contrast ratio, deadtime and noise level. Thus far, the optimal window size has been determined by experimental testing. Additionally, a potential research direction is to analyze the sensitivity of the problems in Eqs. 11 and 19 to noise in . Such an analysis would provide insight into how to best deconvolve images captured by SPAD/DMD-based imaging systems.
To conclude, we believe that the findings in this paper will be of special interest to researchers in the field of ToF imaging as it addresses a new practical challenge.
Appendix A alternating direction method of multipliers
In this section we shall denote the shrinkage operator (element-wise soft thresholding) by
Additionally, we denote the projection of onto set and by and , respectively.
Appendix B Derivative Matrices
In this section we describe the derivative matrices used throughout the paper. Let be defined as follows:
where and are the first and second order derivative matrices, respectively; and constant controls the strength of the second derivative. Here, the first and second derivative matrices are defined as
where are convolution matrices for the first and second derivatives, respectively; and their subscripts specify the direction in which a derivative operation is performed.
The rationale of including a second derivative in our regularizer is that it encourages the recovery of image curvatures, rendering deblurred images more naturally-looking.
The authors wish to express their gratitude to Congli Wang, Muxingzi Li and Qilin Sun for their technical assistance and fruitful discussions. This research was made possible by KAUST Baseline Funding.