Increasing the frame rate of a camera via temporal ghost imaging
Computational temporal ghost imaging (CTGI) allows the reconstruction of a fast signal from a two-dimensional detection with no temporal resolution. High-speed spatial modulation is implemented to encode the temporal detail of the signal into the two-dimensional detection. By calculating the correlation between the modulation and the recorded image, the temporal information can be retrieved. CTGI thus indicates a way to detect high-speed non-reproducible signals with a slow detector. Based on CTGI, we propose a scheme that can increase the frame rate of a camera by resolving the temporal detail within every camera exposure. To achieve this, CTGI is conducted in parallel on different areas of the scene. High-speed spatially multiplexed modulation is performed, slicing the continuous scene into a series of short-time-scale frames. All the modulated frames are accumulated into one image that is eventually used in the correlation retrieval. By performing the CTGI reconstruction on each area independently, the temporal detail of the whole scene can be obtained. This method has strong potential for application in ultrafast imaging.
Modern digital cameras use focal plane arrays (FPAs) to record two-dimensional images. In the visible spectrum, the charge-coupled device (CCD) and the complementary metal-oxide-semiconductor (CMOS) sensor are the two most widely used kinds of FPA. To take an image, an FPA device has to accomplish signal acquisition and data transfer in sequence, which together define the frame rate of the system. Due to the limitations of detection sensitivity and data transfer speed, the frame rate of a normal CCD or CMOS sensor is generally below the kHz level. This is far from sufficient to capture high-speed scenes.
To conduct ultrafast observation, there is a high demand for increasing the frame rate of FPA cameras. Besides advances in camera hardware, computational methods have become a powerful approach to break through the frame rate constraint of FPA systems. Bub et al. (2010) employed a digital micro-mirror device (DMD) functioning as a high-speed shutter to resolve relayed images of a sample beyond the temporal resolution of the camera. Wilburn et al. (2005) and Agrawal et al. (2010) utilized a series of high-speed triggers to control a camera array precisely and sequentially so as to achieve temporal super-resolution. Pournaghi and Wu (2014) proposed a methodology for acquiring high-frame-rate video using multiple cameras with random coded exposure. In addition, prototypes of compressive video cameras have also been proposed, which take advantage of image sparsity to conduct reconstruction from under-sampled data Gao et al. (2014); Koller et al. (2015). These compressive methods can increase the spatial and temporal resolution by up to roughly a factor of ten.
In this work, we propose a novel high-speed imaging scheme based on temporal ghost imaging (TGI) that can capture high-frame-rate video with a slow FPA camera. Ghost imaging (GI) is a unique imaging technique that produces an image by correlating two beams of light, only one of which interacts with the object. Originally, GI was conducted in the spatial domain. It can retrieve two-dimensional spatial information from a non-pixelated bucket detection by utilizing the spatial correlation of entangled photons Pittman et al. (1995), classical light Chen et al. (2009); Chan et al. (2009), or a computational scheme Shapiro (2008); Khamoushi et al. (2015); Sun et al. (2013). Spatial GI has attracted intensive research interest for its minimal requirement on the pixel number of the detector Altmann et al. (2018). Only recently has GI been extended to the temporal domain, by taking into account space-time duality in optics Ryczkowski et al. (2016). TGI has been investigated theoretically, numerically, and experimentally with both classical light and bi-photon states Cho and Noh (2012); Shirai et al. (2010); Chen et al. (2013). Like spatial GI, TGI uses two-beam correlation to retrieve an object, which enables the detection of a fast signal with a slow detector. Later on, a computational version of TGI (CTGI) was proposed based on spatially multiplexed modulation Devaux et al. (2016). By using an FPA camera, CTGI not only simplifies the TGI system into a one-arm scheme, but also realizes the TGI detection in a single measurement. It enables the reconstruction of a single non-reproducible, periodic or non-periodic temporal signal Devaux et al. (2016); Jiang et al. (2017); Xu et al. (2018). The high-frame-rate imaging scheme proposed here utilizes CTGI to resolve the temporal details of a scene captured by a camera in one exposure. To do this, we divide the camera into many identical areas.
Each area comprises many pixels, so that it can carry out a CTGI reconstruction to resolve the temporal details of the part of the scene projected onto it. By conducting CTGI in all these areas in parallel, the scheme can record video at a speed much faster than the frame rate of the camera.
II.1 Computational temporal ghost imaging
In TGI, a test light beam with temporal fluctuation is generated and split into two arms. In the object arm, the test light interacts with a temporal object and gets modulated; the total power is then collected by a slow detector. Meanwhile, in the reference arm, the waveform of the test light is recorded by a fast detector. After many iterations of measurement with changing test light beams, the accumulated two-arm correlation reveals the temporal object. In analogy to computational spatial GI, if the test light beams are known, TGI can be conducted in a one-arm scheme, which is called computational TGI (CTGI). In Devaux et al. (2016), CTGI is proposed by implementing both spatially multiplexed modulation and FPA detection. A spatial modulator displays a series of $K$ independent random binary patterns of $N$ pixels each. Let $M(x,k)$ be the matrix representing the modulation, where $x$ and $k$ indicate the spatial and temporal coordinates, respectively. These successively displayed binary patterns interact with a temporal object and are recorded on a multi-pixel camera within one exposure time. The process can be expressed as

$$I(x) = \sum_{k=1}^{K} M(x,k)\, T(k). \qquad (1)$$
Here $I(x)$ is the two-dimensional image recorded by the camera, i.e., the time integration of the displayed patterns modulated by the temporal object, with $x$ running over the camera pixels, and $T(k)$ is the transmittance of the temporal object at the time when the $k$-th pattern is displayed. The temporal object can be reconstructed by calculating the intensity correlation between the time-integrated image and the random patterns according to the following equation

$$\tilde{T}(k) = \langle I(x)\, M(x,k) \rangle_x - \langle I(x) \rangle_x \langle M(x,k) \rangle_x. \qquad (2)$$
Here $\langle \cdot \rangle_x$ stands for an ensemble average over the $N$ spatial pixels. It should be noted that the reconstructed object carries no spatial resolution. The final temporal resolution is equal to the number of modulation steps $K$.
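As a sanity check of Eqs. (1) and (2), the CTGI forward model and correlation retrieval can be sketched in a few lines of Python. This is a toy illustration with an assumed pulse-train object and assumed parameter values, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

K = 64        # number of modulation steps (temporal resolution)
N = 4096      # number of spatial pixels in the detection

# Toy temporal object: transmittance T(k), here an assumed pulse train.
T = np.zeros(K)
T[10:20] = 1.0
T[40:45] = 0.5

# M(x, k): independent random binary patterns, one column per time step.
M = rng.integers(0, 2, size=(N, K)).astype(float)

# Eq. (1): the camera records the time integration of the modulated patterns.
I = M @ T                                  # I(x) = sum_k M(x, k) T(k)

# Eq. (2): intensity correlation between the integrated image and the patterns.
T_rec = (M * I[:, None]).mean(axis=0) - M.mean(axis=0) * I.mean()

# The correlation recovers T up to an affine scaling; normalize for display.
T_rec = (T_rec - T_rec.min()) / (T_rec.max() - T_rec.min())
```

With N much larger than K, the residual crosstalk noise of the random patterns averages out and the pulse train is clearly recovered.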
II.2 High frame rate imaging scheme based on CTGI
The fastest motion that a normal camera can record is determined by its exposure time. Motion detail within an exposure time cannot be resolved and even causes smearing or motion blur. Inspired by CTGI, we propose a method to increase the frame rate of an FPA camera. The principle of our scheme is illustrated in Fig. 1. To resolve the temporal information, we implement a series of spatial modulations during one exposure. The high-speed modulation slices the dynamic scene into $K$ frames. What the camera records is the accumulation of all these modulated frames. In this process, both the modulation and the camera are spatially divided into "super-pixels" according to the resolution of the scene. Every super-pixel independently conducts an $N$-pixel CTGI operation. In other words, for every pixel in the scene, an area of $N$ pixels in both the modulation and the recording plane is allocated to perform the temporal correlation measurement. By conducting the CTGI reconstruction on every pixel, information of the scene in the time domain can be resolved.
In this spatially multiplexed operation, CTGI works to "transfer" spatial pixels into temporal resolution. The increase of the imaging frame rate therefore comes at the cost of reduced spatial resolution. Since each scene pixel occupies a super-pixel of $N$ camera pixels, a camera can only conduct CTGI over a scene that is $1/N$ of its spatial resolution. Generally speaking, for a given reconstruction quality, more modulation pixels are required for higher temporal resolution. If the temporal resolution has been increased by $K$ times, we define a ratio $\eta$ as

$$\eta = K/N. \qquad (3)$$
$\eta$ can be regarded as the transfer efficiency from spatial pixels to their temporal counterpart. This transfer efficiency depends on the sampling efficiency of the modulation basis $M$. In Devaux et al. (2016), random binary modulation is used. Since random binary patterns contain inherent crosstalk that introduces correlation noise and reduces the reconstruction quality, to guarantee a reasonable reconstruction the modulation resolution has to be set much larger than the temporal resolution ($N \gg K$, i.e., $\eta \ll 1$) even in a noise-free situation Sun et al. (2012); Ferri et al. (2010). In our scheme, we want to minimize the spatial cost in order to realize the best imaging resolution in both the time and space domains. To achieve this, we use the Walsh-Hadamard matrix as the modulation basis, as its orthogonality maximizes the correlation efficiency. Substituting the Walsh-Hadamard matrix into Eq. 1, one can resolve a $K$-dimensional temporal signal with one complete set of $N = K$ samples. That is, $\eta = 1$.
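The orthogonality argument can be made concrete. The sketch below (our own illustration with an assumed toy sinusoidal object, not the paper's data) builds a 0/1 modulation from a 64-order Walsh-Hadamard matrix and recovers the $K$-dimensional temporal signal exactly from $N = K$ pixels, i.e. $\eta = 1$:

```python
import numpy as np
from scipy.linalg import hadamard

K = 64
H = hadamard(K).astype(float)     # Sylvester construction: symmetric, H @ H = K * I
M = (H + 1) / 2                   # binary 0/1 patterns: M[x, k]

T = np.sin(np.linspace(0, 3 * np.pi, K)) ** 2   # assumed toy temporal object
I = M @ T                         # Eq. (1): one integrated value per pixel

# Because H @ M = (K * identity + K * e0 * ones^T) / 2, inversion is exact;
# only the first component needs a correction for the all-ones (DC) offset.
S = (2.0 / K) * (H @ I)           # S[k] = T[k] for k > 0; S[0] = T[0] + sum(T)
T_rec = S.copy()
T_rec[0] = 0.5 * (S[0] - S[1:].sum())
```

No statistical averaging is involved: with a complete orthogonal basis, the recovery is a deterministic matrix inversion rather than a noisy correlation estimate.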
To further increase the transfer efficiency, we use compressive sensing to reduce the spatial cost of CTGI. Compressive sensing is already well studied in spatial GI and single-pixel cameras, where it is used to reduce the number of measurement steps required for a reasonable single-pixel image. In our scheme, we can utilize compressive sensing to reduce the spatial cost of the temporal reconstruction. This is possible because temporal signals are sparse, or can be made sparse in a certain transform domain. To conduct the compressive sensing reconstruction, the correlation measurement can be rewritten as

$$I = A\,T. \qquad (4)$$
Here $T$ is still a column vector with $K$ elements that denotes the temporal object, and $A$ is the measurement matrix of size $n \times K$. It means that a $K$-dimensional signal is projected into an $n$-dimensional detection value through the basis $A$. For compressive sensing, the size $n$ of a super-pixel is less than the temporal resolution $K$ to be resolved, i.e., $n < K$. To solve the underdetermined problem of Eq. 4, we adopt a total variation (TV) constraint term, based on the fact that the gradient of most natural images is sparse. The objective function of this procedure can be written as:

$$\hat{T} = \arg\min_T \; \|A\,T - I\|_2^2 + \lambda\, \|T\|_{TV}, \qquad (5)$$
where $\|\cdot\|_2$ is the $\ell_2$ norm and $\lambda$ is a constant. $\|T\|_{TV}$ is the sum of the magnitudes of the discrete gradient in the horizontal and vertical directions, which can be represented as $\|T\|_{TV} = \sum_{i,j} \sqrt{(T_{i+1,j}-T_{i,j})^2 + (T_{i,j+1}-T_{i,j})^2}$. There are many algorithms designed to solve this problem; here, we adopt the L1-magic solver of Candès and Romberg (Caltech).
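The paper uses the L1-magic solver; purely as an illustration of the idea, a minimal TV-regularized reconstruction of a one-dimensional temporal object can be sketched with plain gradient descent on a smoothed TV term. The toy signal, the solver, and all parameter values below are our assumptions, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

K, n = 64, 32                  # temporal resolution K, super-pixel size n < K
lam, eps, lr = 0.05, 1e-6, 5e-4  # assumed penalty weight, smoothing, step size

# Assumed piecewise-constant temporal object: sparse gradient, so TV-friendly.
T = np.zeros(K)
T[8:24] = 1.0
T[40:56] = 0.6

A = rng.integers(0, 2, size=(n, K)).astype(float)  # random binary basis (Eq. 4)
I = A @ T                                          # n detection values

# Minimize ||A t - I||^2 + lam * TV(t), with TV smoothed by eps so it has
# a gradient everywhere (L1-magic solves the exact problem instead).
t = np.zeros(K)
for _ in range(30000):
    d = np.diff(t)
    g = d / np.sqrt(d * d + eps)   # derivative of the smoothed |d|
    tv_grad = np.zeros(K)
    tv_grad[:-1] -= g
    tv_grad[1:] += g
    t -= lr * (2 * A.T @ (A @ t - I) + lam * tv_grad)
```

Even with only n = 32 measurements of a 64-sample signal, the sparse-gradient prior lets the under-determined system be inverted to a close approximation of the object.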
III Results and discussion
To demonstrate the feasibility of this new high-frame-rate imaging scheme, a proof-of-principle simulation is conducted. The system configuration is illustrated in Fig. 2; it includes an imaging lens, a DMD, and a camera. The DMD is an excellent device for pixel-multiplexed modulation of a dynamic scene, owing to its high spatial resolution and peak modulation frequency of over twenty kilohertz. Under ambient illumination, the scene is imaged onto the DMD by the imaging lens. To slice and modulate the incident dynamic scene, a series of binary patterns is generated and displayed on the DMD. The scene modulated by the DMD is recorded by the camera in a single exposure. Our dynamic scene is a rotating Rubik's cube, composed of 64 frames of 8-bit grayscale images; its spatial resolution matches the super-pixel array of the modulator. This 64-frame video is set to last for a period equal to the exposure time of the camera. Fig. 3(a) illustrates the image captured by a camera directly: the accumulation of the dynamic frames renders obvious blur, and no temporal details can be seen.
In the first simulation, to resolve the images from one exposure of the camera, a 64-order Walsh-Hadamard matrix is selected as the modulation basis. Each of its rows (or columns) is reshaped into a two-dimensional $8 \times 8$ array, which acts as a super-pixel of the modulator. An array of such super-pixels is employed to match the spatial resolution of the target, and a total of $K = 64$ modulation steps are conducted. The entire modulation basis can thus be represented as a three-dimensional array of super-pixels. Mathematically, the modulation process is the Hadamard product between the modulation basis and the dynamic scene. All the modulated frames are added together to form a single camera image with the same spatial resolution as the DMD. From this image together with the modulation basis, a 64-frame video can be reconstructed using Eq. 2. Fig. 3(b) shows all the frames reconstructed by our high-frame-rate imaging scheme; the temporal resolution is 64 times that of Fig. 3(a).
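The first simulation can be reproduced in miniature. The sketch below is our own toy version (a 4 x 4 scene of a moving bright dot instead of the Rubik's cube): it tiles the 64-order Hadamard basis into 8 x 8 super-pixels, integrates all modulated frames into one camera image, and recovers the full 64-frame video super-pixel by super-pixel:

```python
import numpy as np
from scipy.linalg import hadamard

K, s, n = 64, 8, 4     # frames, super-pixel side (s*s = K), scene is n x n

# Assumed toy dynamic scene: a bright dot scanning an n x n grid over K frames.
scene = np.zeros((K, n, n))
for k in range(K):
    scene[k, (k // n) % n, k % n] = 1.0

# Row k of the Hadamard matrix, mapped to 0/1, becomes the s x s pattern
# displayed at modulation step k (identically in every super-pixel).
H = hadamard(K)
basis = ((H + 1) // 2).reshape(K, s, s).astype(float)

# Modulate and time-integrate: one camera image for the whole exposure.
camera = np.zeros((n * s, n * s))
for k in range(K):
    frame = np.kron(scene[k], np.ones((s, s)))    # scene expanded to DMD size
    camera += frame * np.tile(basis[k], (n, n))   # pattern tiled over super-pixels

# Per-super-pixel reconstruction via the Hadamard orthogonality relation.
video = np.zeros_like(scene)
for i in range(n):
    for j in range(n):
        I = camera[i * s:(i + 1) * s, j * s:(j + 1) * s].ravel()
        S = (2.0 / K) * (H @ I)            # S[k] = T[k] for k > 0
        S[0] = 0.5 * (S[0] - S[1:].sum())  # DC correction for the first frame
        video[:, i, j] = S
```

A single blurred 32 x 32 camera image is thus unfolded into 64 sharp 4 x 4 frames, trading an 8 x 8 block of spatial pixels for each scene pixel's temporal history.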
In the second simulation, the compressive sensing algorithm is employed. To minimize the sampling rate, random binary patterns are used as the modulation basis instead of a specially designed matrix. As compressive sensing enables reconstruction from under-sampled data, the spatial resolution $n$ of a super-pixel is less than $K = 64$ and varies according to the sampling rate $n/K$. Fig. 4 shows selected CTGI frames under four different sampling rates. A low sampling rate means less spatial cost for a given temporal resolution; on the other hand, a lower sampling rate leads to worse image quality. For example, at a sampling rate of 25%, i.e., $n = 16$, the required effective resolution of the camera and modulator is a quarter of that for the Hadamard reconstruction. However, the reconstructed images, especially the moving part, are severely dimmed. As the sampling rate increases, the quality of the reconstructed images improves significantly. When the sampling rate reaches 76%, i.e., $n = 49$, a nearly perfect reconstruction can be achieved. In practical applications, the sampling rate needed for an acceptable reconstruction depends on the spatial and temporal complexity of the motion. For some simple moving targets, one could achieve a highly compressed reconstruction at low spatial cost.
Both simulations above are conducted in a way that each target pixel is projected onto an exclusive modulation area. This is to ensure that the target within a super-pixel is uniform at each modulation step, as a non-uniform distribution within a super-pixel leads to crosstalk noise between different pixels and different frames. In practice, however, due to the complexity of the object as well as the limited spatial resolution of the modulation device, a non-uniform distribution within a super-pixel is likely, which undermines the assumption behind exclusive modulation areas. From this standpoint, in the Hadamard scheme we can take advantage of the periodic structure of the spatial modulation to maximize the reconstruction resolution. More specifically, in the situation above each spatial modulation is a periodic tiling of a certain $8 \times 8$ pixel pattern. Therefore, any adjacent $8 \times 8$ pixels on the DMD form a complete modulation basis over the 64 modulation steps, so the reconstructed frames approach the full pixel scale of the DMD. A simulation applying this approach is shown in Fig. 5. The dynamic target is the free fall of three simple objects against a constant background, represented as a 64-frame video. The left column shows the image captured by the camera in one exposure; the resolved frames are shown on the right part of the figure. All the reconstructed images in Fig. 5 are at this near-full spatial resolution. As mentioned above, the non-uniform distribution of the target within a super-pixel leads to crosstalk noise and eventually a trailing effect in the resolved frames. This trailing effect has been removed in the results simply by applying a threshold to the reconstructed images.
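The windowing argument can be checked numerically: because the patterns are tiled periodically, an 8 x 8 window at any offset sees a cyclic shift, i.e. a spatial permutation, of the complete Hadamard-derived basis, which remains invertible. A small sketch of our own (with a random temporal object assumed uniform across the window):

```python
import numpy as np
from scipy.linalg import hadamard

K, s = 64, 8
basis = ((hadamard(K) + 1) // 2).reshape(K, s, s).astype(float)

# An s x s window of the periodically tiled modulation at offset (dy, dx)
# is a cyclic shift of the pattern: the same basis, spatially permuted.
dy, dx = 3, 5                        # arbitrary window offset for the check
M_win = np.stack(
    [np.roll(basis[k], (-dy, -dx), axis=(0, 1)).ravel() for k in range(K)],
    axis=1,
)                                    # M_win[x, k]: window pixel x at step k

rng = np.random.default_rng(3)
T = rng.random(K)                    # temporal object, uniform over the window
I = M_win @ T                        # integrated camera values in the window

T_rec = np.linalg.solve(M_win, I)    # the permuted basis is still complete
```

Since a permutation of an invertible basis is itself invertible, every shifted window supports its own exact CTGI reconstruction, which is what allows the reconstruction grid to slide across the full DMD resolution.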
To summarize, we have proposed a novel method to boost the frame rate of a camera based on CTGI. By implementing high-speed spatially multiplexed modulation in front of the camera, the temporal information within a single camera image can be resolved. The boost in frame rate comes at the cost of spatial resolution and depends on the modulation efficiency. To achieve a reasonable temporal resolution while minimizing the loss of spatial resolution, two reconstruction algorithms, Hadamard correlation and compressive sensing, are employed in the proof-of-principle simulations. To further minimize the spatial cost in the Hadamard scheme, we take advantage of the periodic structure of the modulation to form the maximum number of CTGI modulation units by combining adjacent modulation pixels of the proper size. Our high-frame-rate scheme has great application potential, especially in low-spatial-resolution circumstances, e.g., tracking the location of a fast-moving object. Using a high-speed spatial modulator such as a DMD, one can readily increase the frame rate of a slow camera by two orders of magnitude or more.
This work was supported by National Natural Science Foundation of China (NSFC) (Grant No. 61675117).
- Bub et al. (2010) G. Bub, M. Tecza, M. Helmes, P. Lee, and P. Kohl, Nat. Methods 7, 209 (2010), URL https://search.proquest.com/docview/223244618?accountid=13813.
- Wilburn et al. (2005) B. Wilburn, N. Joshi, V. Vaish, E.-V. Talvala, E. Antunez, A. Barth, A. Adams, M. Horowitz, and M. Levoy, ACM Trans. Graph. 24, 765 (2005), ISSN 0730-0301, URL http://doi.acm.org/10.1145/1073204.1073259.
- Agrawal et al. (2010) A. Agrawal, M. Gupta, A. Veeraraghavan, and S. G. Narasimhan, in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on (IEEE, 2010), pp. 599–606.
- Pournaghi and Wu (2014) R. Pournaghi and X. Wu, IEEE Transactions on Image Processing 23, 5670 (2014).
- Gao et al. (2014) L. Gao, J. Liang, C. Li, and L. V. Wang, Nature 516, 74 (2014).
- Koller et al. (2015) R. Koller, L. Schmid, N. Matsuda, T. Niederberger, L. Spinoulas, O. Cossairt, G. Schuster, and A. K. Katsaggelos, Optics Express 23, 15992 (2015).
- Pittman et al. (1995) T. Pittman, Y. Shih, D. Strekalov, and A. Sergienko, Physical Review A 52, R3429 (1995).
- Chen et al. (2009) X.-H. Chen, Q. Liu, K.-H. Luo, and L.-A. Wu, Optics Letters 34, 695 (2009).
- Chan et al. (2009) K. W. C. Chan, M. N. O'Sullivan, and R. W. Boyd, Optics Letters 34, 3343 (2009).
- Shapiro (2008) J. H. Shapiro, Physical Review A 78, 061802 (2008).
- Khamoushi et al. (2015) S. M. Khamoushi, Y. Nosrati, and S. H. Tavassoli, Optics Letters 40, 3452 (2015).
- Sun et al. (2013) B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. Padgett, Science 340, 844 (2013).
- Altmann et al. (2018) Y. Altmann, S. McLaughlin, M. J. Padgett, V. K. Goyal, A. O. Hero, and D. Faccio, Science 361, eaat2298 (2018).
- Ryczkowski et al. (2016) P. Ryczkowski, M. Barbier, A. T. Friberg, J. M. Dudley, and G. Genty, Nature Photonics 10, 167 (2016).
- Cho and Noh (2012) K. Cho and J. Noh, Optics Communications 285, 1275 (2012).
- Shirai et al. (2010) T. Shirai, T. Setälä, and A. T. Friberg, JOSA B 27, 2549 (2010).
- Chen et al. (2013) Z. Chen, H. Li, Y. Li, J. Shi, and G. Zeng, Optical Engineering 52, 076103 (2013).
- Devaux et al. (2016) F. Devaux, P.-A. Moreau, S. Denis, and E. Lantz, Optica 3, 698 (2016).
- Jiang et al. (2017) S. Jiang, Y. Wang, T. Long, X. Meng, X. Yang, R. Shu, and B. Sun, Scientific Reports 7, 7676 (2017).
- Xu et al. (2018) Y.-K. Xu, S.-H. Sun, W.-T. Liu, G.-Z. Tang, J.-Y. Liu, and P.-X. Chen, Optics Express 26, 99 (2018).
- Sun et al. (2012) B. Sun, S. S. Welsh, M. P. Edgar, J. H. Shapiro, and M. J. Padgett, Optics Express 20, 16892 (2012).
- Ferri et al. (2010) F. Ferri, D. Magatti, L. A. Lugiato, and A. Gatti, Physical Review Letters 104, 253603 (2010).
- Candès and Romberg E. Candès and J. Romberg (Caltech), L1-magic, URL https://statweb.stanford.edu/~candes/l1magic/.