Accelerated High-Resolution Photoacoustic Tomography via Compressed Sensing
Abstract
Current 3D photoacoustic tomography (PAT) systems offer either high image quality or high frame rates but are not able to deliver high spatial and temporal resolution simultaneously, which limits their ability to image dynamic processes in living tissue (4D PAT). A particular example is the planar Fabry-Pérot (FP) photoacoustic scanner, which yields high-resolution 3D images but takes several minutes to sequentially map the incident photoacoustic field on the 2D sensor plane, point-by-point. However, as the spatiotemporal complexity of many absorbing tissue structures is rather low, the data recorded in such a conventional, regularly sampled fashion is often highly redundant. We demonstrate that combining model-based, variational image reconstruction methods using spatial sparsity constraints with the development of novel PAT acquisition systems capable of subsampling the acoustic wave field can dramatically increase the acquisition speed while maintaining good spatial resolution: First, we describe and model two general spatial subsampling schemes. Then, we discuss how to implement them using the FP interferometer and demonstrate the potential of these novel compressed sensing PAT devices through simulated data from a realistic numerical phantom and through measured data from a dynamic experimental phantom as well as from in-vivo experiments. Our results show that images with good spatial resolution and contrast can be obtained from highly subsampled PAT data if variational image reconstruction techniques that describe the tissue structures with suitable sparsity constraints are used. In particular, we examine the use of total variation (TV) regularization enhanced by Bregman iterations. These novel reconstruction strategies offer new opportunities to dramatically increase the acquisition speed of photoacoustic scanners that employ point-by-point sequential scanning, as well as to reduce the channel count of parallelized schemes that use detector arrays.
1 Introduction
Photoacoustic Tomography (PAT) is an emerging biomedical "imaging from coupled physics" technique [1] based on laser-generated ultrasound (US). It allows the rich contrast afforded by optical absorption to be imaged with the high spatial resolution of ultrasound. Furthermore, the wavelength dependence of the optical absorption can, in principle, be utilized to provide spectroscopic (chemical) information on the absorbing molecules (chromophores). While PAT’s potential for an increasing variety of clinical applications is currently being explored [2, 3], it is already widely used in preclinical studies to examine small animal anatomy, physiology and pathology [4, 5, 6].
For further applications and references we refer to the reviews [7, 8, 9].
To obtain high-quality three-dimensional (3D) photoacoustic (PA) images with a spatial resolution around one hundred micrometres, acoustic waves with a frequency content typically in the range of tens of MHz need to be sampled over centimetre-scale apertures. Hence, satisfying the spatial Nyquist criterion necessitates sampling intervals on a scale of tens of micrometres, which requires scanning several thousand detection points. Using sequential scanning schemes, such as the Fabry-Pérot based PA scanner (FP scanner) or mechanically scanned piezoelectric receivers, this inevitably results in long acquisition times. In principle, this can be overcome by using an array of detectors. However, a fully sampled array comprising several thousand elements, each with their own signal conditioning and radio-frequency analog-to-digital (RF A/D) electronics, would be prohibitively expensive, and efficient multiplexing is challenging due to the low pulse repetition frequency of current excitation lasers. The slow acquisition speed currently limits the use of PAT in applications where movement of the target causes image artefacts, and it prohibits the examination of dynamic anatomical and physiological events at high resolution in real time, which is the goal of 4D PAT.
A different approach to accelerate sequential PAT scanners relies on the key observation that, in many situations, the spatial complexity of many of the absorbing tissue structures is rather low and, therefore, data recorded in a conventional, regularly sampled fashion is highly redundant. It may be possible, therefore, to speed up the data acquisition without a significant loss of image quality by exploiting this redundancy and measuring a subset of the data chosen in such a way as to maximize its non-redundancy. This concept, established as the field of compressed sensing (CS) [10, 11, 12], has been applied to several imaging modalities with success, most notably to magnetic resonance imaging (MRI) [13, 14, 15] and computed tomography (CT) [16, 17, 18].
As the inverse problem of PAT is conceptually similar to those of these modalities, there has been increasing interest in applying CS to PAT in different ways and for different types of scanners:
In [19, 20, 21, 22, 23], 2D reconstructions from regularly subsampled circular and linear transducer arrays were computed using matrix-based algorithms, and first promising results were obtained. An asymmetrical 2D circular sensor arrangement designed on the basis of a low-resolution pre-image was examined in [24]. In [25], the CS-based recovery of suitably transformed integrating line detector data followed by a universal backprojection in 2D was proposed. Another approach, using patterned excitation light, was presented in [26, 27]; it is a possible approach for photoacoustic microscopy, but is not applicable in PAT, where the light undergoes strong scattering. In our work, we further explore the potential for accelerating high-resolution 3D PAT using randomized, incoherent encoding of the PA signals and a sparsity-constrained image reconstruction via variational regularization enhanced by Bregman iterations. Extensive studies with simulated data from realistic numerical phantoms, measured data from experimental phantoms and in-vivo recordings are carried out to demonstrate which conditions have to be fulfilled to obtain high subsampling rates. To cope with the immense computational challenges, GPU computing is combined with matrix-free, state-of-the-art optimization approaches.
The remainder of the paper is organized as follows: Section 2 provides background on theoretical and practical aspects of PAT. Section 3 describes two general approaches to accelerate sequential scanning schemes by spatial subsampling/compressed sensing and exemplifies their implementation with an FP scanner. In Section 4, we then describe how images can be reconstructed from the subsampled/compressed data. Our methods are exemplified by simulation studies in Section 5 and by reconstructions from experimental data in Section 6. In Section 7, we summarize and discuss the results of our work and point to future directions of research. Table 1 lists all commonly occurring abbreviations for later lookup.
2 Background
2.1 Basics of Photoacoustic Tomography
The photoacoustic (PA) signal is generated by the coupling of optical and acoustic wave propagation processes through the photoacoustic effect: Firstly, the tissue is illuminated by a laser pulse with a duration of a few nanoseconds. Inside the tissue, the photons are scattered and absorbed, the latter predominantly in regions with a high concentration of chromophores, such as haemoglobin. The photoacoustic effect occurs when a sufficient part of the absorbed optical energy is converted to heat (thermalised) sufficiently fast rather than being re-emitted: The induced, local pressure increase initiates acoustic waves travelling through the tissue on the microsecond timescale. These waves can be measured by ultrasonic transducers at the boundary of the domain.
With several assumptions on the tissue’s properties (see [28] for a detailed discussion), the acoustic part of the signal generation can be modelled by the following initial value problem for the wave equation:

(1)  $\partial_{tt}\, p(x,t) = c^2 \Delta p(x,t), \qquad p(x,0) = p_0(x), \qquad \partial_t p(x,0) = 0,$

where $c$ denotes the speed of sound and $p_0$ the initial pressure distribution.
The measurement data consists of samples of the pressure $p(x,t)$ on the boundary of the domain. See [29, 30] for recent reviews on measurement systems. The computation of a high-resolution reconstruction of $p_0$, usually referred to as the photoacoustic image (PA image), from this data is the subject of this paper. Obtaining a high-quality PA image is of crucial importance for any subsequent analysis, e.g., for quantitative photoacoustic tomography (QPAT) [31], wherein the optical part of the signal generation is inverted based on the PA image.
2.2 Nyquist Sampling in Space and Time
Before we discuss how to subsample the incident photoacoustic field in Section 3, it is important to understand that a complete sampling requires a certain relation between the sampling in space and time: Imagine we measure the PA signal on the boundary of a domain with a homogeneous sound speed $c$, and assume the signal is band-limited to a maximal frequency $\nu_{max}$. Firstly, the Nyquist criterion requires us to sample the PA signal with a temporal spacing of $\delta_t \leq 1/(2\nu_{max})$. Secondly, the PA signal is caused by incident acoustic waves coming from various spatial directions. As an incident wave with a wave vector $k$ leads to a PA signal with frequency

(2)  $\nu = \frac{c}{2\pi}\, |k|,$

the bandlimit of the PA signal implies that the spatial wave vectors are limited by $|k| \leq 2\pi \nu_{max}/c$. To resolve all the spatial information, the Nyquist criterion would require us to sample the domain boundary with a spatial spacing of $\delta_x \leq \pi/|k|_{max} = c/(2\nu_{max})$.
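As a concrete illustration, the two sampling conditions above can be evaluated numerically. The sound speed and bandlimit below are illustrative assumptions, not values from this study:

```python
import math

# Illustrative assumptions (not values from this study):
c = 1500.0     # speed of sound in soft tissue [m/s]
nu_max = 20e6  # acoustic bandlimit of the PA signal [Hz]

# Temporal Nyquist criterion: delta_t <= 1 / (2 * nu_max).
dt_max = 1.0 / (2.0 * nu_max)

# Eq. (2): an incident wave with wave vector k produces the temporal
# frequency nu = c * |k| / (2 * pi), so |k| <= 2 * pi * nu_max / c.
k_max = 2.0 * math.pi * nu_max / c

# Spatial Nyquist criterion: delta_x <= pi / |k|_max = c / (2 * nu_max).
dx_max = math.pi / k_max

print(dt_max)  # 2.5e-08 s  (25 ns)
print(dx_max)  # 3.75e-05 m (37.5 micrometres)
```

For these assumed values, a centimetre-scale aperture would indeed require grid spacings of tens of micrometres, i.e., several tens of thousands of detection points.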
3 Compressed Photoacoustic Sensing
3.1 Sub-Sampling of the Photoacoustic Field
As discussed in Section 1, a drawback of all current 3D PAT systems is that they offer either exquisite image quality or high frame rates, but not both, partly due to the difficulty of realizing a scheme that would complete a scan with a sufficiently small spatial sampling interval in an acceptable acquisition time. In this section, we describe two novel sensing paradigms that aim to accelerate the data acquisition by spatially subsampling the incident photoacoustic field. In this study, the practical realization of these two approaches is achieved via the Fabry-Pérot (FP) photoacoustic scanner as described in Section 3.2, but they are equally applicable to other sequential scanning systems, such as mechanically scanned piezoelectric detectors.
For simplicity but without loss of generality, we assume that a planar detection surface located at $z = 0$ is used, that a rectangular area on it can be interrogated, and that the target is located in the region $z > 0$. The extension to other detection geometries is straightforward, but complicates the notation. The concrete measurement process can be modeled in the following way: The incident photoacoustic field on the detection plane, $p(x, y, t) := p(x, y, z = 0, t)$, caused by the pulse of the excitation laser, is first multiplied by a spatial window function $\varphi(x, y)$ and then integrated over the whole detection surface:
(3)  $f(t) = \int \varphi(x, y)\, p(x, y, t)\, \mathrm{d}x\, \mathrm{d}y$
The spatial sampling is followed by a temporal sampling, e.g., by measuring $f(t)$ at discrete times $t_j = j\, \delta_t$. The setting is sketched in Figure 1.
3.1.1 Single-Point Scanning
A standard approach is to try to focus $\varphi$ on a single point (cf. Figure LABEL:subfig:FPSL), which belongs to a set of $N$ points forming a regular grid with spacings $\delta_x$ and $\delta_y$. We will refer to the data obtained by this scanning pattern as the conventional (cnv.) data. Ideally, these spacings are chosen fine enough to ensure that the Nyquist criterion is satisfied in space and time (cf. Section 2.2).
Due to the spatiotemporal characteristics of the wave propagation, the pressure time series recorded at two neighbouring locations on the regular grid provide very similar information. To reduce the coherence of the measured time series, we can instead sample the photoacoustic field at a smaller number $M < N$ of randomly chosen grid points, which would yield an acceleration factor of $N/M$. We will denote this subsampling strategy by rSP. Choosing random points is firstly motivated by several results in compressed sensing theory [12] that point out the importance of randomness for designing subsampling patterns. Secondly, we want to avoid choosing sampling patterns that might lead to systematic biases. We will return to this point in the computational studies in Sections 5 and 6.
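A minimal sketch of how such an rSP pattern could be drawn, assuming a hypothetical 64 x 64 interrogation grid and an acceleration factor of 8 (both illustrative values, not taken from this study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Conventional scan: an Nx-by-Ny regular grid of interrogation points.
Nx, Ny = 64, 64
n_full = Nx * Ny

# rSP subsampling: keep only M randomly chosen grid points,
# giving an acceleration factor of n_full / M.
accel = 8
M = n_full // accel
subset = rng.choice(n_full, size=M, replace=False)  # flat grid indices

# Binary mask over the detection plane: True where a time series is recorded.
mask = np.zeros((Nx, Ny), dtype=bool)
mask.flat[subset] = True
```

Sampling without replacement guarantees that exactly M distinct locations are interrogated; drawing a fresh mask per experiment avoids systematic biases tied to one particular realization.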
3.1.2 Patterned Interrogation Scanning
The second idea to accelerate the acquisition is to choose a series of $M$ orthogonal patterns $\varphi_m$, each being supported all over the interrogation area, with no particular focus on a single location (cf. Figure LABEL:subfig:FPBP). Again, choosing $M < N$ patterns would yield an acceleration factor of $N/M$ over conventional single-point scanning. As we will use a specific type of binary patterns based on scrambled Hadamard matrices described later (Section 4.3), we will denote this subsampling strategy by sHd.
3.2 Implementation of Sub-Sampling Schemes using the FP Scanner
The planar, interferometry-based Fabry-Pérot photoacoustic scanner provides a convenient implementation of the subsampling strategies described above. In the standard setup, the FP scanner performs the conventional single-point scanning described in Section 3.1.1: For each pulse sent in by the excitation laser, the pressure time series at a different location on a grid is measured by interrogating the FP sensor head with an interrogation laser [32]. Due to the planar geometry and the option to introduce the excitation laser through the transparent detection plane, PA signals from a large range of anatomical targets can be scanned in the frequency range from DC to several tens of MHz on a scale of tens of micrometres ($\varphi$ in our model corresponds to the beam profile of the interrogation laser, cf. Figure LABEL:subfig:FPSL). Superficial features located a few millimetres below the skin surface can be imaged with high spatial resolution from a conventional FP scan (see, e.g., [6]). However, as described in Section 1, obtaining such an image requires scanning several tens of thousands of locations which, due to the repetition rates of currently available excitation lasers, takes several minutes.
While the single-point subsampling described in Section 3.1.1 is straightforward to implement with a standard FP scanner, implementing the patterned interrogation scheme requires several modifications: Instead of focusing the interrogation beam on a single location, the whole detection plane is illuminated and the reflected beam is patterned before being focused onto the photodiode. The spatial modulation is inspired by the working principle of the celebrated "single-pixel camera" of Rice University [33]: A digital micromirror device (DMD) is used to block rectangular sections of the reflected interrogation beam, which creates binary patterns. Note that this hardware realization slightly differs from the conceptual sketch in Figure LABEL:subfig:FPBP, where the direct interrogation beam is patterned. Further details of the patterned interrogation scanner can be found in [34, 35, 36].
4 Image Reconstruction from Sub-Sampled Data
In this section, we describe a model of the accelerated data acquisition and how to invert it.
4.1 Continuous Forward Model
The PAT forward operator, $A$, maps a given initial pressure $p_0$ in the volume to the time-dependent pressure on the detection plane, $p$, as determined by (1): $p = A\, p_0$. A more detailed discussion of the operator and its adjoint can be found in [37]. In this work, we assume that the target’s dynamics are slow enough for $p_0$ to be well approximated by a constant within the acquisition time. A sensing operator $M$ implements (3) to produce the measured data $f$ from the detection plane pressure $p$:
(4)  $f = M A\, p_0 + \varepsilon,$
where we added the term $\varepsilon$ to account for additive measurement noise. In the following, $M$ denotes the sampling operator: the conventional point-by-point scan described in Section 3.1.1 (where $M$ is simply the identity on the grid data), or one of the two subsampling strategies, i.e., the random single-point subsampling described in Section 3.1.1 or the patterned interrogation described in Section 3.1.2.
4.2 Image Reconstruction Strategies
Given the compressed data $f$, there are in principle two strategies of how to reconstruct $p_0$: In two-step procedures, we first reconstruct the detection plane pressure $p$ from $f$ based on (4) (data reconstruction) and then use a standard PAT reconstruction for complete planar data (see, e.g., [38]). In one-step procedures, $p_0$ is reconstructed directly from $f$ using the model-based approach (4). While both approaches have advantages and disadvantages, and novel two-step procedures have been introduced in [35, 25, 39], the focus of this work is not to carry out a fair, detailed comparison between one-step and two-step approaches: We rather want to emphasize the differences between simple, linear reconstruction techniques and variational, model-based reconstruction techniques, independent of the former being two-step and the latter being one-step procedures. To ease the following presentation, we first introduce the discrete PAT model we will use for numerical computations before discussing the details of the reconstruction techniques.
4.3 Discrete Forward Model
As all methods we examine directly rely on the wave equation (1), we need a fast numerical method for 3D wave propagation with high spatial and temporal resolution. Our choice is the k-space pseudospectral time domain method [40, 41, 42] implemented in the k-Wave MATLAB Toolbox [43]. For the following, it is only important that k-Wave discretizes the computational domain into voxels and uses an explicit time stepping. The time step used will always be the same as used in the temporal sampling of the pressure field, i.e., $\delta_t$ (cf. Section 3.1). From now on, all variables used are discrete, although we will continue to use the same notation for them. For instance, the discretization of the PAT forward operator is now a matrix $A$, mapping the discrete initial pressure $p_0$ to the pressure at the first layer of voxels in the $z$ direction at the discrete time steps, as these voxels represent the detection plane. Note that we cannot construct $A$ explicitly, but rely on computing matrix-vector products with $A$ and $A^T$ using k-Wave. A detailed discussion of the implementation can be found in [37]. The discrete subsampling operators $M$ map from the pressure-time series of the detection plane voxels to the measured data $f$. The single-point sampling operators (cf. Section 3.1.1) simply extract the pressure-time series at the sensor voxels (which are, in general, only a subset of the detection plane voxels). For the interrogation with binary sensing patterns constructed by the DMD (cf. Section 3.2), (3) can be implemented by multiplying a binary matrix with the vector of pressure values at the scanning voxels, separately for each time point. Compressed sensing theory suggests that random Bernoulli matrices are optimal for many applications [12]. As those are difficult to implement experimentally and computationally, we use scrambled Hadamard matrices, which are known to have very similar properties to random Bernoulli matrices and can be implemented very efficiently by a fast-Fourier-transform-like operation [44, 12].
Note that as the entries of Bernoulli and Hadamard matrices take the values $\pm 1$, and not $\{0, 1\}$ as implemented by the DMD, we need to de-mean the experimental data in a preprocessing step. Further details can be found in [35, 39, 36].
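The scrambled Hadamard sensing and the de-meaning step can be sketched as follows; the sizes are illustrative, and the dense Sylvester construction stands in for the fast transform used in practice:

```python
import numpy as np

rng = np.random.default_rng(1)

def hadamard(n):
    """Hadamard matrix by Sylvester's construction (n a power of two).
    In practice, a fast transform replaces this dense matrix."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n, m = 256, 32   # detection-plane points and number of patterns (illustrative)

# Scrambled Hadamard sensing matrix: random row subset + random column permutation.
H = hadamard(n)
rows = rng.choice(n, size=m, replace=False)
perm = rng.permutation(n)
S = H[rows][:, perm]             # m x n matrix with +/-1 entries

# The DMD can only realize 0/1 patterns, so we measure with B = (S + 1) / 2
# and de-mean afterwards: S p = 2 B p - sum(p).
B = (S + 1) // 2                 # binary DMD patterns
p = rng.standard_normal(n)       # plane pressure at one time instant
f_ideal = S @ p                  # ideal +/-1 measurements
f_dmd = 2.0 * (B @ p) - p.sum()  # de-meaned binary measurements
```

The identity S p = 2 B p - sum(p) is exactly the de-meaning step mentioned above: one additional all-ones measurement (or an estimate of the mean) recovers the ideal +/-1 measurements from the binary ones.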
4.4 Linear Back-Projection-Type Reconstructions
As described above, we will use simple, linear two-step procedures to compare the more sophisticated variational methods to: First, we reconstruct the complete data by applying the pseudoinverse $M^\dagger$ of the sampling operator to $f$ (cf. (4)), which can be computed explicitly for the subsampling operators we consider here. Then, we either multiply with the adjoint $A^T$ (called backprojection (BP) here), or apply the discrete time reversal (TR) [45, 46, 6] operator $T$:
(5)  $p_0^{\mathrm{BP}} = A^T M^\dagger f$
(6)  $p_0^{\mathrm{TR}} = T\, M^\dagger f$
The difference between BP and TR (including enhanced variants of TR [47]) is discussed in more detail in [37]. In summary, in the continuous setting, they differ in the way they introduce the time-reversed pressure time series at the detection plane: TR approaches use them as a time-dependent Dirichlet boundary condition for the wave equation (1), while the adjoint approach introduces them as a time-dependent source term without altering the boundary conditions.
4.5 Variational Image Reconstruction
Variational regularization [48] is a popular and well-understood approach for approximating the solutions of ill-posed operator equations like (4) in a reasonable and controlled manner: The regularized solution is defined as the minimizer of a suitable energy functional. Assuming that the additive noise, $\varepsilon$, is i.i.d. normally distributed, a reasonable approach is to solve
(7)  $p_0^\lambda \in \underset{p_0}{\operatorname{arg\,min}}\ \tfrac{1}{2} \| M A\, p_0 - f \|_2^2 + \lambda\, J(p_0)$
to obtain a regularized solution $p_0^\lambda$. While the first term in the composite functional measures the misfit between measured and predicted data (data fidelity term), the functional $J$ has to render the minimization problem (7) well-posed by ensuring existence, uniqueness and stability of $p_0^\lambda$ (regularization functional). Furthermore, its choice can be used to penalize or constrain unwanted features of the solution, thereby encoding a-priori knowledge about it. The regularization parameter $\lambda$ controls the balance between both terms.
The first variational method we examine corresponds to classical, zeroth-order Tikhonov regularization, augmented by the physical constraint $p_0 \geq 0$:
(8)  $p_0^\lambda \in \underset{p_0 \geq 0}{\operatorname{arg\,min}}\ \tfrac{1}{2} \| M A\, p_0 - f \|_2^2 + \tfrac{\lambda}{2} \| p_0 \|_2^2$
As a second functional $J$, we examine the popular total variation (TV) energy, which is a discrete version of the total variation semi-norm [49, 50]:
(9)  $TV(p_0) = \sum_i \big\| (\nabla p_0)_i \big\|_2$
The TV energy measures the $\ell^1$ norm of the amplitude of the gradient field of $p_0$ (the details of its implementation are given in Appendix A) and is a prominent example of non-smooth, edge-preserving image reconstruction techniques and, more generally, of spatial sparsity constraints. While TV regularization has been used for PAT before (see, e.g., [51]), our main interest in it arises from its results when applied to subsampled data from other imaging modalities [13, 52, 16, 53, 17, 54, 15, 18]: TV regularization is often able to recover the object’s main feature edges even for high subsampling factors. Therefore, we focus on this rather general regularization energy in this first PAT subsampling study and examine more specific functionals in future work.
As all of the involved functionals and constraints are convex, a variety of methods exist to solve (7) computationally. In this work, we use an accelerated proximal gradient-type method described in Appendix B.
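To illustrate the type of accelerated proximal gradient method meant here, the following sketch applies FISTA-style iterations to the non-negativity-constrained Tikhonov problem (8), with a small random matrix standing in for the matrix-free operator M A; all sizes and parameters are illustrative assumptions, not the settings used in this study:

```python
import numpy as np

rng = np.random.default_rng(2)

# Small dense stand-in for the matrix-free operator M*A (illustrative only).
n, d = 80, 50
K = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.maximum(rng.standard_normal(d), 0.0)
f = K @ x_true + 0.01 * rng.standard_normal(n)

lam = 1e-3                               # illustrative regularization parameter
L = np.linalg.norm(K, 2) ** 2 + lam      # Lipschitz constant of the gradient

def prox(z):
    """Proximal map of the constraint x >= 0: projection onto the positive orthant."""
    return np.maximum(z, 0.0)

# Accelerated proximal gradient (FISTA) on
#   0.5*||K x - f||^2 + 0.5*lam*||x||^2   subject to   x >= 0.
x = np.zeros(d); y = x.copy(); t = 1.0
for _ in range(300):
    grad = K.T @ (K @ y - f) + lam * y   # gradient of the smooth part
    x_new = prox(y - grad / L)           # proximal (projected) gradient step
    t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
    y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # Nesterov extrapolation
    x, t = x_new, t_new
```

For the TV functional (9), the projection step would be replaced by a (numerically computed) TV proximal operator; the overall accelerated structure stays the same.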
4.6 Bregman Iterations
A potential drawback of variational techniques like (7) is that they inevitably lead to a systematic bias of the regularized solutions: The solution moves from an unbiased data-fit towards a minimizer of $J$. Formally, let $f^\dagger = M A\, p_0^\dagger$ be the true, noise-free data and $p_0^\lambda$ the solution of (7) for $f = f^\dagger$. Then, by the minimizing properties of $p_0^\lambda$, we have
(10)  $\tfrac{1}{2} \| M A\, p_0^\lambda - f^\dagger \|_2^2 + \lambda\, J(p_0^\lambda) \;\leq\; \tfrac{1}{2} \| M A\, p_0^\dagger - f^\dagger \|_2^2 + \lambda\, J(p_0^\dagger) \;=\; \lambda\, J(p_0^\dagger),$
and thereby, $J(p_0^\lambda) \leq J(p_0^\dagger)$. For the TV energy, this bias manifests in the well-known, nonlinear contrast loss of TV regularized solutions [50]. For PAT, this systematic error poses a crucial limitation on the use of TV regularized PA images for quantitative analysis like QPAT studies. To overcome this drawback, an iterative enhancement of variational solutions by the Bregman iteration [55] was proposed in [56]: For (7), they take the form
(11)  $p_0^{k+1} \in \underset{p_0}{\operatorname{arg\,min}}\ \tfrac{1}{2} \| M A\, p_0 - f^k \|_2^2 + \lambda\, J(p_0)$
(12)  $f^{k+1} = f^k + \left( f - M A\, p_0^{k+1} \right)$
with $f^0 = f$. This iteration has several attractive features [50]: It solves the unregularized problem
(13)  $\underset{p_0}{\min}\ \tfrac{1}{2} \| M A\, p_0 - f \|_2^2$
by solving a sequence of well-regularized problems (11), while the residual of the iterates, $\| M A\, p_0^k - f \|_2$, is monotonically decreasing. The potential of Bregman iterations, in particular when used on subsampled data, has been demonstrated in [52, 53, 54, 15]. Note the difference between the use of the Bregman iteration in the Split Bregman method [57], a method to solve problems like (7) that is also known as the augmented Lagrangian method, and its usage here, which does not have an equivalent Lagrangian interpretation.
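The add-the-residual-back structure of (11)-(12) can be sketched on a toy problem. Here J is taken as the Tikhonov energy, so that the inner variational problem has a closed-form solution, and a random matrix stands in for M A; all sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy linear model standing in for M*A (illustrative sizes).
n, d = 60, 40
K = rng.standard_normal((n, d)) / np.sqrt(n)
f = K @ rng.standard_normal(d)       # noise-free data

lam = 1.0
KtK = K.T @ K + lam * np.eye(d)      # normal equations of the inner problem

# Bregman iteration: solve the regularized inner problem (11),
# then add the residual back to the data (12), and repeat.
f_k = f.copy()
residuals = []
for _ in range(25):
    x = np.linalg.solve(KtK, K.T @ f_k)     # inner variational problem (11)
    residuals.append(np.linalg.norm(K @ x - f))
    f_k = f_k + (f - K @ x)                 # data update (12)
```

Even with a strong regularization parameter, the residual with respect to the measured data decreases monotonically over the Bregman iterations, which is the contrast-restoring mechanism exploited for TV+Br in this work.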
5 Simulation Studies
We now examine the different inverse methods described in the previous section when applied to subsampled data from numerical phantoms.
5.1 Realistic Numerical Phantom
While studies using numerical phantoms composed of simple geometrical objects can provide valuable insights into the basic properties of the inverse problem and the reconstruction methods, it is often unclear how their results translate to experimental data from complex targets. In this section, we briefly describe the construction of a realistic numerical phantom that will be used in the main simulation studies.
The phantom is based on a segmentation of a micro-CT scan of a mouse brain into gray matter, vasculature and dura mater. The vasculature is morphologically closed (dilation followed by erosion) using the 18-neighborhood as a structuring element. Thereafter, the vasculature is one connected component with respect to the 26-neighborhood. The whole segmentation is clipped to the bounding box of the vasculature. Next, a gray matter voxel in the central part of the segmentation is chosen uniformly at random. It is used as the seed point for the construction of an artificial cancer tissue inside the gray matter by a stochastic growth process that consists of an iterative application of morphological operations on the surface voxels. Figure 2 shows the final result of this construction. Note that vasculature and tumor tissue are non-intersecting.
For the studies in this work, the clipped volume is embedded into a cubic volume of voxels, centered in the $x$ and $y$ directions and placed at two different heights in the $z$ direction: In the first phantom, called Tumor1, the distance between the detection plane and the phantom is half of the distance between the opposite plane and the phantom. In the second phantom, called Tumor2, it is centered in the $z$ direction, leading to a larger distance between sensor and target (cf. Figure 3). To construct $p_0$ from the segmentation, all vasculature voxels are assigned a fixed value, whereas the tumor tissue is assigned different values for Tumor1 and Tumor2. Next, $p_0$ is downsampled to the desired resolution by successive sub-averaging over blocks. For Tumor1, we modify the resulting $p_0$ to obtain sharper boundaries: $p_0$ is normalized and the intensity of all non-zero voxels is raised to a minimal value, thereby ensuring a minimal contrast between background and target. Figure 3 shows maximum intensity projections (mxIP) of the resulting $p_0$.
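The down-sampling step above can be sketched as block averaging; the volume sizes below are illustrative, not the phantom's actual dimensions:

```python
import numpy as np

def downsample_block_average(v, b):
    """Down-sample a 3D volume by averaging over non-overlapping b^3 blocks.
    All volume dimensions must be divisible by b."""
    nx, ny, nz = v.shape
    return v.reshape(nx // b, b, ny // b, b, nz // b, b).mean(axis=(1, 3, 5))

# Illustrative use on a random binary "segmentation" volume.
rng = np.random.default_rng(4)
seg = (rng.random((8, 8, 8)) > 0.5).astype(float)
low = downsample_block_average(seg, 2)
```

Block averaging preserves the mean intensity of the volume while converting the binary segmentation into smooth partial-volume values at the coarser resolution.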
5.2 Simulation Studies with Tumor1
We first examine the inverse reconstructions using Tumor1 for "inverse crime" data [59], which means that we assume exact knowledge of all physical parameters, which are summarized in Table 2, and use the same model for both data simulation and reconstruction. In addition, we sampled the data with the spatial spacing given by the Nyquist criterion: The spatial sampling intervals coincide with the spatial spacing of the computational grid and are fine enough to capture all relevant spatial frequencies of the incident photoacoustic field (cf. Section 2.2): $p_0$ was pre-smoothed to ensure that its discrete approximation by the truncated Fourier basis used in k-Wave was non-oscillatory. This is reflected in a sharp drop of the power spectrum of the pressure time series around 4.8 MHz, which (cf. Section 2.2) corresponds to a spatial Nyquist spacing equal to the spatial grid spacing. White noise of fixed standard deviation is added to the clean data, leading to a certain signal-to-noise ratio (SNR).
For all computed solutions, voxels with negative pressure and all sensor voxels have been set to zero in a post-processing step. We will mainly rely on a visual comparison of the results via maximum intensity projections along the $y$ direction (cf. Figures LABEL:subfig:VaTu1Y, LABEL:subfig:VaTu2Y). Unless stated otherwise, the color scale of each figure is determined independently: It ranges from zero to a clipping value chosen to separate the few largest values of the reconstruction from the rest. This clipping is necessary to prevent a few large outliers from determining the contrast of the image and complicating the comparison between different methods. In addition to the visual comparison, we report the mean squared error (MSE) between the reconstructed solution $p_0^{rec}$ and the ground truth $p_0^\dagger$ in the conventional logarithmic scaling termed peak signal-to-noise ratio (PSNR):
(14)  $\mathrm{PSNR}(p_0^{rec}, p_0^\dagger) = 10 \log_{10} \left( \frac{\max(p_0^\dagger)^2}{\mathrm{MSE}(p_0^{rec}, p_0^\dagger)} \right)$
The reconstruction $p_0^{rec}$ might have a different scaling compared to $p_0^\dagger$ and contain small-scale noise. As this typically does not influence the evaluation of a human observer, we also do not want to account for it when computing the PSNR. Therefore, we first rescale and threshold $p_0^{rec}$ and $p_0^\dagger$, and then compute the PSNR by (14).
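A direct implementation of (14), omitting the rescaling and thresholding step described above, might look as follows:

```python
import numpy as np

def psnr(rec, ref):
    """Peak signal-to-noise ratio in dB: 10*log10(max(ref)^2 / MSE), cf. (14)."""
    mse = np.mean((rec - ref) ** 2)
    return 10.0 * np.log10(ref.max() ** 2 / mse)

# Illustrative check: a constant offset of 0.1 on a unit-amplitude phantom
# gives MSE = 0.01 and hence a PSNR of 20 dB.
ref = np.zeros((16, 16)); ref[4:12, 4:12] = 1.0
rec = ref + 0.1
```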
Figure 4 shows the results of TR and BP using rSP128 or sHd128 subsampling, i.e., accelerated by a factor of 128 compared to the conventional scan. This high subsampling factor leads to unsatisfactory reconstructions. The different subsampling strategies lead to a different appearance of back-propagated noise and subsampling artifacts in the reconstructed images: In the rSP128 case, both features and noise are back-propagated along the surfaces of spheres centered on the scanning locations, while in the sHd128 case, the non-local nature of the patterned interrogation leads to back-propagation along planes parallel to the detection plane. Figure 5 shows the corresponding results of the classical Tikhonov regularization (8), denoted by "L2+", of TV regularization (9), denoted by "TV+", and of the Bregman iteration (11)-(12) applied to TV regularization, denoted by "TV+Br". Despite the high subsampling factor, TV+ and TV+Br are still able to reconstruct the main structures of the phantom without excessive image noise.
In the results shown, the regularization parameter $\lambda$ for L2+ and TV+ was chosen by the discrepancy principle (DP): For the general variational problem (7), the discrepancy principle selects $\lambda$ such that
(15)  $\| M A\, p_0^\lambda - f \|_2 = \eta\, \| \varepsilon \|_2$
for a factor $\eta \geq 1$. The DP is based on the heuristic argument that the regularized solution should explain the data no more than up to the noise level, which is assumed to be known. As the residual is monotonically increasing in $\lambda$, the DP is robust and easy to implement. We chose a simple interval-based method that linearly interpolates the left hand side of (15) (the discrepancy of the data) in the current search interval. It was terminated when (15) was satisfied within a small tolerance. The Bregman iterations were started with $\lambda$ set to the regularization parameter found for TV+, and stopped as soon as the discrepancy of the iterate falls below the noise level (recall that the residual, and thereby the discrepancy, monotonically decreases over the Bregman iterations, cf. Section 4.6). This typically happens after about 10 Bregman iterations.
5.2.1 Enhancement through Bregman Iterations
The Bregman iteration was introduced to compensate for the systematic contrast loss of TV regularized solutions. Figure 6 compares TV+ and TV+Br solutions using the same color scaling and additionally shows mxIPs of the positive and negative parts of the difference between TV+Br and TV+. The difference plots demonstrate that using Bregman iterations especially improves the contrast of the small-scale vessel structures and that the benefit is more pronounced for subsampled data than for conventional data.
5.3 Simulation Studies with Tumor2
The good quality of, e.g., the TV+Br results at very high subsampling factors has to be interpreted with care: The phantom Tumor1 was intentionally easy to reconstruct, as it was close to the sensors and had high contrast (cf. Figure 3). In addition, the data was created by the same forward model used in the reconstruction, which is known as committing an "inverse crime" [59], and the conventional data was sampled finely enough to fulfill the Nyquist criterion. The phantom Tumor2 was designed to carry out simulation studies that more accurately reproduce the challenges of experimental data scenarios.
5.3.1 Inverse Crimes
While inverse crimes are, to a certain extent, unavoidable when carrying out simulation studies, they make it more difficult to extrapolate the obtained results to experimental data. In this section, we discuss how to bridge this gap by modifying the model used for the data generation:

The sensitivity of the FP sensor varies spatially [32]. While this effect can be mitigated by calibration and data preprocessing procedures, a residual uncertainty remains, which we model by modifying the discrete (static) data generation model to
(16) $g = \Gamma A p + \varepsilon$, where $\Gamma$ is a diagonal matrix that multiplies the pressure-time course of each voxel in the plane by a random variable $\gamma_i$ following a centered log-normal distribution:
(17) $\Gamma_{ii} = \gamma_i = \exp(\sigma \zeta_i), \quad \zeta_i \sim \mathcal{N}(0, 1).$ We choose $\sigma$ such that 90% of the $\gamma_i$ are in .
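The sensitivity perturbation of (16) and (17) can be sketched in a few lines of numpy; the value sigma = 0.1 and the array layout (n_sensors, n_time) are assumptions for illustration, not the paper's actual parameters:

```python
import numpy as np

def sensitivity_factors(n_sensors, sigma=0.1, rng=None):
    """Per-voxel sensitivity factors gamma_i = exp(sigma * Z_i), Z_i ~ N(0,1):
    a centered log-normal as in (17).  sigma = 0.1 is an assumed value."""
    rng = np.random.default_rng(rng)
    return np.exp(sigma * rng.standard_normal(n_sensors))

def apply_sensitivity(g, gamma):
    """Apply the diagonal matrix of (16): scale the pressure-time course of
    each detection-plane voxel (rows of g, shape (n_sensors, n_time))."""
    return gamma[:, None] * g
```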

In a similar spirit, we assume that we only have a rough estimate of the statistics of , e.g., from baseline measurements, and can therefore only approximately decorrelate the data before the inversion. The residual uncertainty about the noise variance per sensor voxel is modeled by replacing the additional noise term by , where is constructed like with (90% of are in ). Keeping as the standard deviation of , we end up with an average SNR of (the value is considerably lower than for Tumor1 due to the larger distance to the sensors and the lower contrast of Tumor2).

Although we assume that the medium we image is sufficiently homogeneous to justify a constant sound speed in the inverse reconstruction, the real sound speed will vary slightly, especially between different tissue types. We use for the data generation, where is constructed by adding a smooth, normalized Gaussian random field and the normalized initial pressure (as it represents a tissue different from the background). Then, is centered and scaled such that its mean is and its maximal absolute value is (a sound speed variation of % is not unusual for soft tissue [60]). The resulting sound speed is shown in Figure LABEL:subfig:SoundSpeed.
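A possible numpy sketch of this construction follows; the helper smooth_random_field (low-pass-filtered white noise), the background speed c0 = 1500 and the 1% maximal variation are illustrative assumptions, not the paper's values:

```python
import numpy as np

def smooth_random_field(shape, keep=2, rng=None):
    """Smooth Gaussian random field: white noise, low-pass filtered by
    keeping only the lowest spatial frequencies (assumed construction)."""
    rng = np.random.default_rng(rng)
    spec = np.fft.fftn(rng.standard_normal(shape))
    grids = np.meshgrid(*[np.fft.fftfreq(n) for n in shape], indexing="ij")
    mask = sum(g ** 2 for g in grids) <= (keep / max(shape)) ** 2
    return np.fft.ifftn(spec * mask).real

def perturbed_sound_speed(p0, c0=1500.0, rel_var=0.01, rng=None):
    """c(x) = c0 * (1 + dc): smooth random component plus the normalized
    initial pressure, centered (zero mean) and scaled (max |dc| = rel_var)."""
    dc = smooth_random_field(p0.shape, rng=rng)
    dc = dc / np.max(np.abs(dc)) + p0 / np.max(np.abs(p0))
    dc = dc - dc.mean()                        # center: zero-mean perturbation
    dc = dc * (rel_var / np.max(np.abs(dc)))   # scale: max deviation = rel_var
    return c0 * (1.0 + dc)
```

Adding the normalized initial pressure makes the absorbing structures acoustically distinct from the background, as described above.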

Among the many other ways to modify the data generation model are to include acoustic absorption, inhomogeneous illumination, acoustic reflections, baseline shifts and drifts in the pressuretime series, correlated noise and corrupted channels. We leave these extensions to future studies.
Figure LABEL:subfig:PressureTimeCourse compares the noise-free pressure-time series of a single voxel before and after adding sensitivity and sound speed variations (adding noise and noise variation would complicate a visual comparison).
5.3.2 Nyquist criterion
In contrast to the previous studies, we now compare to conventional data acquired on a regular grid with a grid spacing corresponding to twice the length determined by the Nyquist criterion (cf. Section 5.2): , i.e., we model the conventional data by extracting the pressure-time series at a subset of the detection plane voxels forming a regular grid with spacing 2. This is closer to the measured data sets we will examine in Section 6. All acceleration factors are defined with respect to the total number of locations of the regular grid, which is given by . Note that it now also makes sense to consider rSP1 and sHd1 as "subsampling" patterns: although , the data they measure cannot be converted to the conventionally sampled data, as was the case with Tumor1.
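In code, the conventional-grid extraction and the corresponding acceleration factor could look as follows; the array layout (Nx, Ny, Nt) and the function names are hypothetical:

```python
import numpy as np

def conventional_grid(data, spacing=2):
    """'Conventional' data: pressure-time series on a regular sub-grid of the
    detection plane with the given spacing (here twice the Nyquist spacing).
    data has the assumed layout (Nx, Ny, Nt)."""
    return data[::spacing, ::spacing, :]

def acceleration_factor(nx, ny, n_measurements, spacing=2):
    """Acceleration factors are defined against the number of locations of
    the regular conventional grid, not the full detection plane."""
    return ((nx // spacing) * (ny // spacing)) / n_measurements
```

With this convention, a pattern that takes as many measurements as the full detection plane has grid points at spacing 2 gets an acceleration factor of 1, matching the remark about rSP1 and sHd1 above.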
5.3.3 Reconstruction Results
As the results of TR and BP are of similar quality as for Tumor1 (cf. Figure 4), we omit them here and concentrate on the results of the variational methods. We again used the DP () to select the regularization parameter. Figure 8 compares them for rSP16 and sHd16 subsampling: TV+Br, again, leads to the best results by visual impression and PSNR. In Figure 9, we therefore examine the influence of on the reconstructed images for TV+Br only: up to , the rSP-based TV+Br reconstructions deteriorate only slightly. From to , however, a clear degradation is visible. For the sHd subsampling, the image quality remains acceptable up to .



























5.3.4 Influence of the Spatial SubSampling Pattern
As described in Section 3.1.1, a random partition of the scanning locations was chosen as the main single point subsampling pattern to avoid unintended systematic artifacts such as aliasing; the choice was furthermore inspired by several results in compressed sensing theory [12]. In Figure 10, we compare this random pattern to a regular subsampling pattern based on a coarse grid, which we denote as gSP. The results show that the concrete choice of the single point subsampling pattern seems to have only a minor influence compared to, e.g., the choice of the inverse method used. We leave a more detailed examination and the design of optimal (dynamic) sampling patterns for future research.
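For concreteness, the two pattern families can be generated as follows; grid sizes, subsampling numbers and function names are placeholders:

```python
import numpy as np

def rsp_pattern(nx, ny, n_keep, rng=None):
    """Random single-point (rSP) pattern: n_keep scan locations drawn
    uniformly at random without replacement."""
    rng = np.random.default_rng(rng)
    mask = np.zeros((nx, ny), dtype=bool)
    flat = rng.choice(nx * ny, size=n_keep, replace=False)
    mask[np.unravel_index(flat, (nx, ny))] = True
    return mask

def gsp_pattern(nx, ny, stride):
    """Regular coarse-grid (gSP) pattern with the given stride."""
    mask = np.zeros((nx, ny), dtype=bool)
    mask[::stride, ::stride] = True
    return mask
```

Both masks select the same number of locations for matching parameters, so the comparison isolates the effect of the spatial arrangement alone.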






6 Experimental Data
In this section, we examine the subsampling strategies for three example experimental data sets. To have a ground truth, subsampling was only carried out artificially: For each experiment, a conventional data set was acquired first, and subsampled data sets were produced from this data thereafter.
6.1 Single Point SubSampling  Dynamic Phantom


We start with data acquired by a conventional single point FP scanner, measuring a pseudodynamic experimental phantom, which we call "Knot".
6.1.1 Setup
Figure LABEL:subfig:ThKnSetup shows the experimental setup: two polythene tubes were filled with 10% and 100% ink and interleaved to form a knot. One of the loose ends was tied to a motor shaft (top right corner of Figure LABEL:subfig:ThKnSetup) while the other three ends were fixated. The pseudo-dynamic data was acquired in a stop-motion style: a conventional scan (duration 15) was performed while the whole arrangement was fixed. Then, the motor shaft was rotated by a fixed angle, causing the knot to tighten and to move toward the motor, and a new scan was performed. This way, a conventional data set comprising frames was acquired. The tubes were immersed in a % Intralipid solution with deionised water. The excitation laser pulses used had a wavelength of , an energy of around , and were delivered at a rate of . A conventional scan consisted of locations (), measured over time points with a resolution of .
6.1.2 Preprocessing
First, the data was clipped to locations. The baseline of the pressure-time course at each location was estimated by the median of the pre-excitation-pulse time points () and subtracted. Then, a zero-phase band-pass filter around was applied (see applyFilter.m in the kWave toolbox). The % of locations with the highest variance (computed over the time points ) were excluded from the further analysis. Finally, the remaining data was clipped to the time points .
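The preprocessing chain can be sketched as follows. This is a hedged sketch: the paper uses kWave's applyFilter.m for the band-pass step, which is emulated here by a zero-phase FFT-domain mask; all parameter values and the (n_locations, n_time) array layout are assumptions:

```python
import numpy as np

def preprocess(data, pre_idx, band, dt, var_quantile=0.99):
    """Sketch of the preprocessing chain:
    1. subtract the per-location baseline (median of pre-excitation samples);
    2. zero-phase band-pass (a real mask in the frequency domain adds no
       phase; the paper uses applyFilter.m from the kWave toolbox);
    3. exclude the locations with the highest signal variance."""
    data = data - np.median(data[:, pre_idx], axis=1, keepdims=True)
    freqs = np.fft.rfftfreq(data.shape[1], dt)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    data = np.fft.irfft(np.fft.rfft(data, axis=1) * mask,
                        n=data.shape[1], axis=1)
    var = data.var(axis=1)
    keep = var <= np.quantile(var, var_quantile)
    return data[keep]
```

The variance-based rejection is a simple proxy for the corrupted-channel detection described above: channels dominated by artifacts have a much larger variance than channels carrying only signal and noise.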
6.1.3 Results
Table 2 shows the parameters of the acoustic model used for the inversion. Note that the spacing of the spatial grid, , is times finer than the distance between scanning locations: assuming a sound speed of , the Nyquist criterion would require sampling with in space and in time to be able to reconstruct initial pressure distributions leading to signals with a frequency content of . This means that the conventional data is oversampled in time but already undersampled in space, similar to the simulation studies with Tumor2 (cf. Section 2.2). By choosing a finer spatial grid for the reconstruction than the one on which the data was recorded, we attempt to recover some spatial resolution from the higher temporal resolution of the pressure-time series.
In 4D PAT, we can vary the subsampling operator used in each frame . For single point subsampling, one would try to avoid measuring the pressure-time series at the same location in subsequent frames, as it may contain very similar information (for patterned interrogation, one would likewise not use the same pattern in subsequent frames). Therefore, we randomly partition the set of all scanned locations into subsets, each containing different random locations. This yields a sequence of subsampling operators that we apply periodically over the frames, i.e., after frames, each location has been scanned once and the th frame is scanned with again. Figure 12 shows the inverse reconstructions of the middle frame for conventional data, rSP4, rSP8 and rSP16 (a movie of the complete reconstruction can be found in the supplementary material). Next to TR, TV+ and TV+Br, we also show the result of post-processing the TR solution with positivity-constrained TV denoising, which we will denote by "TRppTV+":
(18) $p_{\mathrm{TRppTV+}} = \operatorname*{argmin}_{u \geq 0} \; \tfrac{1}{2} \| u - p_{\mathrm{TR}} \|_2^2 + \beta \, \mathrm{TV}(u),$
where we chose the denoising parameter $\beta$ large enough to suppress most visible reconstruction artifacts. Solving (18) is discussed in B. The regularization parameter chosen for TV+ was = for the conventional data, while this value was multiplied by for the subsampled data. For TV+Br, we carried out 10 Bregman iterations with = .
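The cyclic random partitioning of the scan locations described above might be implemented as follows (function names are hypothetical):

```python
import numpy as np

def framewise_partitions(n_locations, factor, rng=None):
    """Randomly partition all scan locations into `factor` disjoint subsets;
    each subset serves as one frame's single-point subsampling pattern."""
    rng = np.random.default_rng(rng)
    return np.array_split(rng.permutation(n_locations), factor)

def pattern_for_frame(partitions, t):
    """Cycle through the partitions: after len(partitions) frames, every
    location has been scanned exactly once."""
    return partitions[t % len(partitions)]
```

Because the subsets are disjoint and exhaustive, consecutive frames never revisit a location until the whole plane has been covered once.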
(a) TR, cnv.  (b) TRppTV+, cnv.  (c) TV+, cnv.  (d) TV+Br, cnv. 
(e) TR, rSP4  (f) TRppTV+, rSP4  (g) TV+, rSP4  (h) TV+Br, rSP4 
(i) TR, rSP8  (j) TRppTV+, rSP8  (k) TV+, rSP8  (l) TV+Br, rSP8 
(m) TR, rSP16  (n) TRppTV+, rSP16  (o) TV+, rSP16  (p) TV+Br, rSP16 
6.2 Patterned Interrogation  Static Phantom
Next, we investigate data acquired by the patterned interrogation FP scanner (cf. Figure LABEL:subfig:FPBP), measuring a static experimental phantom, which we will call "Hair".
6.2.1 Setup
The full technical details of the scan can be found in [34]. The target to be scanned was a knotted artificial hair (see Figure LABEL:subfig:HrKnSetup, diameter 150 ), immersed in 1% Intralipid solution and positioned approximately above the detection plane and below the Intralipid surface. On the DMD, an active area of micromirrors was subdivided by grouping micromirrors to form one of "pixels". Due to an angle in the optical path, each pixel corresponded to an area of on the detection plane. Then, the rows of a scrambled Hadamard matrix were used to implement binary patterns on the DMD.
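A sketch of how such binary patterns can be generated is given below. It uses the Sylvester construction for the Hadamard matrix; the scrambling by a random column permutation and the random row selection are simplified assumptions (cf. [44] for structurally random matrices and [34] for the scheme actually used):

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix; n must be a
    power of two."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def scrambled_hadamard_patterns(n_pixels, n_patterns, rng=None):
    """Rows of a column-scrambled Hadamard matrix, mapped to {0, 1} for the
    DMD: +1 -> mirror on, -1 -> mirror off (illustrative sketch only)."""
    rng = np.random.default_rng(rng)
    H = hadamard(n_pixels)[:, rng.permutation(n_pixels)]
    rows = rng.choice(n_pixels, size=n_patterns, replace=False)
    return (1 + H[rows]) // 2   # binary on/off patterns
```

The +1/-1 to 0/1 mapping reflects that a micromirror can only redirect light on or off; the offset this introduces is handled in the calibration step mentioned in Section 6.2.2.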
6.2.2 Preprocessing
First, the data needed to be calibrated to fit to our model (cf. Section 4.3). Subsequently, a zerophase bandpass filter around  was applied and the data was clipped to the time points .
6.2.3 Results
Table 2 shows the parameters of the acoustic model used for the inversion. Note that the conventional data is, again, slightly oversampled in time but undersampled in space. For all computed solutions , all voxels in the first x-layers were set to in a post-processing step. This was done to ease the visualization of the results through maximum intensity projections: the first time points of the signal only seem to contain noise, which means that up to a distance of in voxel lengths, will only account for noise. For TR and BP solutions, voxels with negative values were also set to . Figure 13 shows different reconstructions using the conventional data (sHd1) and : the regularization parameter chosen for TV+Br was = for the conventional data, while this value was first divided by and then multiplied by for the subsampled data. A total of 10 Bregman iterations were carried out.
(a) TR, sHd1  (b) BP, sHd1  (c) TRppTV+, sHd1  (d) BPppTV+, sHd1  (e) TV+Br, sHd1 
(f) TR, sHd4  (g) BP, sHd4  (h) TRppTV+, sHd4  (i) BPppTV+, sHd4  (j) TV+Br, sHd4 
(k) TR, sHd8  (l) BP, sHd8  (m) TRppTV+, sHd8  (n) BPppTV+, sHd8  (o) TV+Br, sHd8 
(p) TR, sHd16  (q) BP, sHd16  (r) TRppTV+, sHd16  (s) BPppTV+, sHd16  (t) TV+Br, sHd16 
6.3 InVivo Measurements  Single Point SubSampling
While validating inverse methods on data from experimental phantoms is an important step forward from pure simulation studies, experimental phantoms cannot reproduce all the features of real invivo data sets. As a last example, we therefore investigate a static invivo data set of skin vasculature and subcutaneous anatomy near the right flank of a nude mouse, which we will call "Vessels".
6.3.1 Setup
The data was acquired with an excitation wavelength of , further technical details and illustrations can be found in [6]. A conventional scan consisted of locations over a region of size of (), measured at time points with a resolution of .
6.3.2 Preprocessing
The data was clipped to locations and to the time points .
6.3.3 Results
Table 2 shows the parameters of the acoustic model used for the inversion. Note that, by a similar reasoning about the spatial and temporal sampling intervals, the spatial spacing is, again, chosen times finer than the distance between scanning locations. Figure 14 shows maximum intensity projections and a slice through for TR, TRppTV+ and TV+Br solutions when using the conventional, rSP4 and rSP8 data. The denoising parameter was, again, chosen large enough to suppress most visible reconstruction artifacts. The regularization parameter chosen for TV+Br was = for the conventional data, while this value was multiplied by for the subsampled data. A total of 10 Bregman iterations were carried out.
(a) TR, cnv.  (b) TR, rSP4  (c) TR, rSP8 
(d) TRppTV+, cnv.  (e) TRppTV+, rSP4  (f) TRppTV+, rSP8 
(g) TV+Br, cnv.  (h) TV+Br, rSP4  (i) TV+Br, rSP8 
7 Discussion, Outlook and Conclusion
7.1 Discussion
The main results of the simulation studies (Section 5) can be summarized as follows:

Using modelbased variational reconstruction methods employing spatial sparsity constraints, such as TV+, (9), is essential for obtaining good quality PA images from subsampled PAT data. The linear methods, TR (6) and BP (5), and the L2+ method (8) could not produce images of acceptable quality in any setting.

The results obtained for Tumor1 and Tumor2 demonstrate that the image quality obtained for a certain subsampling rate varies strongly: the "inverse crime" data of the more superficial, high-contrast target Tumor1 can be subsampled up to without a significant loss of image quality. On the other hand, a bad model fit, i.e., a mismatch between the models used for data generation and inversion, combined with a more challenging target, such as Tumor2, significantly impairs the image quality beyond a certain subsampling rate ( in our particular example).

Using Bregman iterations improves upon conventional variational approaches for PAT. Most importantly, the systematic contrast loss of small-scale structures such as blood vessels is mitigated, which is a crucial prerequisite for QPAT studies.

Subsampling by patterned interrogation (sHd) was slightly more efficient than single point subsampling (rSP).
The subsampling rates we achieved with experimental data were similar to those obtained with simulated data in the more realistic Tumor2 scenario. However, unexpectedly, the qualitative difference between the linear methods TR and BP and the variational methods TV+ and TV+Br was not as dramatic as in the simulation studies (cf. Figures 4, 5, 12, 13). In many cases, postprocessing the TR solution with TV+ denoising, (18), comes remarkably close to the TV+ solution, which is not to be expected from the simulation studies (see, e.g., the TR sHd128 solution in Figure 4). In addition, Bregman iterations could not improve much over conventional variational approaches, again in contrast to our findings in Section 5.2.1. A partial explanation for both findings is a bad model fit: While some preprocessing routines (e.g., baseline correction, bandpass filtering and a detection and deletion of corrupt channels) were implemented and carried out to align the data with the model used, other known modelmismatches (e.g., the inhomogeneous sensitivity of the FP sensor and the nonwhiteness of the measurement noise) were not accounted for.
A closer examination of the "Knot" data reveals several flaws that deteriorated our results: firstly, the optical excitation was inhomogeneous in the lateral direction (cf. Figure 12), leading to an inhomogeneous initial pressure distribution in regions consisting of the same materials. The TV energy is not well suited to recover such targets. Secondly, the baseline shifts in the data are more complex than what we corrected for, and a spatio-temporal visualization of the data shows several artifacts on the sensor that our automatic channel deletion procedure cannot fully remove. Lastly, the acoustic properties of the polythene tubes lead to reflections we did not account for in our model. For these reasons, the decrease in image quality for subsampled data is faster than in the simulation studies. While we see a clear advantage of using TV+ as opposed to the simpler TRppTV+, using Bregman iterations does not seem to lead to a better image quality.
The "Hair" data was acquired with the novel patterned FP scanner, which still suffers from several major technical difficulties: e.g., the widening of the interrogation beam leads to a significant loss in SNR, and systematic shifts in signal power can be observed that need to be examined more carefully. Furthermore, we suspect that the hair knot might have moved during the acquisition (cf. Figure 13(c)). For these reasons, even the reconstructions from the sHd1 data are not of very good quality. If we accept these as a ground truth nonetheless, we see that we can reach subsampling rates of around without a significant further loss of image quality. If we compare the different methods, we find that TRppTV+ yields the visually most appealing results, while BPppTV+ and TV+Br look very similar. As BPppTV+ is more or less the first iterate of the optimization scheme we use to compute TV+Br (cf. Section B), the latter might, again, point to a bad model fit (see above).
For the "Vessels" data set, the visual impression of the conventional data reconstructions (cf. Figure 14) and the fact that we did not have to perform any preprocessing suggest that the in-vivo data examined is of good quality and that we have a good model fit. A comparison with the results from the experimental phantoms further reveals that the diffusive nature of biological tissue leads to a more even lateral illumination. Now, TV+Br clearly outperforms TR and TRppTV+. For instance, from the slice view we can see that TR seems to overestimate the diameter of the blood vessels. The reason for not achieving higher subsampling rates despite the good data quality is the apparent mismatch between the target geometry and the spatial sparsity constraints employed by the TV energy: the PA image is exceptionally rich in vasculature, but TV regularization tends to break up such anisotropic, line-like structures (cf. Figure 14(i)).
7.2 Outlook
As the simulation studies show that a model misfit can severely decrease the subsampling rates achievable, we need to improve the accuracy of the acoustic forward model to obtain better results for experimental data: Data preprocessing aligns the data with the forward model, model calibration can determine some uncertain parameters of the forward model (such as the FP sensitivity distribution or the noise statistics), and Bayesian model selection [61] or Bayesian approximation error modeling [62, 59, 63] can reduce or account for the uncertainty in other parameters, such as . To improve the results for invivo data (cf. Figure 14) acoustic absorption models of biological tissue [42, 64] need to be incorporated.
While we exploited spatial sparsity to accelerate the acquisition of a single scan here, the next step to enable 4D PAT imaging with both high spatial and temporal resolution (cf. Section 6.1) is to extend the framebyframe inversion methods examined here to full spatiotemporal variational models that also exploit the temporal redundancy of data generated by dynamics of low complexity.
We used the TV energy as a generic, wellunderstood first example of a spatial sparsity constraint. However, as discussed above, it is, e.g., not very suitable to recover thin vasculature. Higher subsampling rates could be reached by employing more sophisticated regularization functionals designed to recover such anisotropic structures [65].
Choosing random locations as singlepoint subsampling and scrambled Hadamard pattern as patterned interrogation was based on results obtained for similar applications such as CT and MRI. The optimal choice for PAT applications is yet to be determined. For instance, [66] has shown that a nonuniform distribution of the sampling locations in singlepoint subsampling can be used to focus into a specific area (at the expense of the resolution elsewhere). A theoretical examination, e.g., through microlocal analysis [67], could help to gain new insights on this.
7.3 Conclusion
In this study, we investigated different possibilities to subsample the incident photoacoustic field in order to accelerate the acquisition of high resolution PAT. In simulation studies, we demonstrated that PAT wave fields generated by targets with a low spatial complexity can indeed be highly compressible and identified under which conditions this feature can be exploited to obtain high quality images from highly subsampled data: Firstly, variational image reconstruction methods employing sparsity constraints that match the structure of the target have to be used. Secondly, using an accurate forward model wellaligned with the data is crucial. We furthermore applied the methods developed to three experimental data sets from experimental phantoms and invivo recordings. While obtaining promising first results, we also identified several challenges for realizing the full potential of data subsampling, most notably obtaining a good modelfit as discussed above. While we focused on subsampling the data using the FabryPérot based scanner, the novel reconstruction strategies offer new opportunities to dramatically increase the acquisition speed of other photoacoustic scanners that employ pointbypoint sequential scanning as well as reducing the channel count of parallelized schemes that use detector arrays.
The numerical phantom is based upon CT data that was kindly provided by Simon WalkerSamuel at the Centre for Advanced Biomedical Imaging, University College London, and a segmentation thereof kindly provided by Roman Hochuli, University College London.
This work was supported by the Engineering and Physical Sciences Research Council, UK (EP/K009745/1), the European Union project FAMOS (FP7 ICT, Contract 317744) and the National Institute of General Medical Sciences of the National Institutes of Health under grant number P41 GM10354518.
Appendix A Discrete Total Variation Energy
Let the voxels of the 3D pressure $p \in \mathbb{R}^{N_1 \times N_2 \times N_3}$ be indexed by $(i,j,k)$, $i = 1,\dots,N_1$, $j = 1,\dots,N_2$, $k = 1,\dots,N_3$. Using finite forward differences, the most commonly used discretization of the total variation semi-norm with Neumann boundary conditions is given by
$$\mathrm{TV}(p) = \sum_{i,j,k} \sqrt{(p_{i+1,j,k} - p_{i,j,k})^2 + (p_{i,j+1,k} - p_{i,j,k})^2 + (p_{i,j,k+1} - p_{i,j,k})^2},$$
where the differences across the domain boundary are omitted, i.e., $p_{N_1+1,j,k} - p_{N_1,j,k} := 0$, $p_{i,N_2+1,k} - p_{i,N_2,k} := 0$ and $p_{i,j,N_3+1} - p_{i,j,N_3} := 0$. Additional terms have to be added to account for different boundary conditions: in the simulation studies, we use Dirichlet boundary conditions, which requires adding the jumps at the domain boundaries. For the experimental data, we impose Dirichlet boundary conditions at the detection plane voxels, while Neumann boundary conditions are applied on all other faces of the image cube.
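A direct numpy transcription of the Neumann-boundary discretization (a sketch; the Dirichlet variants would add the boundary jumps as extra terms):

```python
import numpy as np

def tv_neumann(p):
    """Isotropic discrete TV with forward differences and Neumann boundary
    conditions: differences across the domain boundary are simply omitted
    (implemented by zero-padding the short axes of the difference arrays)."""
    gx = np.zeros(p.shape); gx[:-1] = np.diff(p, axis=0)
    gy = np.zeros(p.shape); gy[:, :-1] = np.diff(p, axis=1)
    gz = np.zeros(p.shape); gz[:, :, :-1] = np.diff(p, axis=2)
    return float(np.sqrt(gx**2 + gy**2 + gz**2).sum())
```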
Appendix B Optimization
The optimization problem given by (7) consists of minimizing the sum of two functionals, $E(p) = G(p) + F(p)$, where we can compute the gradient of the strictly convex, smooth data-fidelity functional $G(p) = \frac{1}{2}\|A p - f\|_2^2$,

(19) $\nabla G(p) = A^* (A p - f),$

but no higher derivatives, and we know how to compute the proximal operator of the convex, potentially non-differentiable functional $F$:

(20) $\mathrm{prox}_{\mu F}(z) = \operatorname*{argmin}_u \; F(u) + \frac{1}{2\mu} \| u - z \|_2^2.$

In the case of $F$ being the positivity-constrained TV energy, the proximal operator simply solves a positivity-constrained TV denoising problem (18).
A wide range of semismooth, first order optimization algorithms for image reconstruction have been developed over recent years [68], each of them advantageous for a specific scenario. In our case, the special feature of PAT is that the application of and requires considerably more computation time than solving most proximal operators (20), including the 3D positivityconstrained TV denoising problem (18), up to a high numerical precision. Under these circumstances, the rather simple proximal gradient descent scheme:
(21) $p^{k+1} = \mathrm{prox}_{\mu F}\bigl(p^{k} - \mu \, \nabla G(p^{k})\bigr)$
turns out to be most efficient if tuned carefully (see [69] for an extensive overview):

The step size is set to $\mu = 1/L$, where $L$ is an approximation of the Lipschitz constant of $\nabla G$. For a given setting and subsampling scheme, $L$ can be computed quite efficiently by a simple power iteration and then stored in a lookup table.

We use a gradient extrapolation modification (fast or accelerated gradient methods) ensuring an $O(1/k^2)$ convergence rate. The concrete technique we use is the FISTA algorithm [70], where we restart the acceleration if an increase in is detected and switch to a plain gradient step for this iteration, followed by up to 5 backtracking steps if necessary ( is, however, not changed for future iterations).

For the 3D PAT problems considered in this work, the iterates from onwards are usually not visually distinguishable. However, we computed a maximum of iterations for all scenarios except for Knot, where we only computed iterations. We terminated the iteration earlier if did not change or did not decrease times in a row.
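The overall scheme, the restart safeguard and the power-iteration step-size estimate can be sketched as follows. This is a simplified version: the backtracking refinement is omitted, the restart test uses a gradient-based criterion, and all function names are placeholders:

```python
import numpy as np

def lipschitz_estimate(apply_A, apply_At, x_shape, n_iter=20, rng=None):
    """Power iteration for ||A||^2, the Lipschitz constant of the gradient
    of the quadratic data term; the step size is then its reciprocal."""
    rng = np.random.default_rng(rng)
    x = rng.standard_normal(x_shape)
    for _ in range(n_iter):
        x = apply_At(apply_A(x))
        x = x / np.linalg.norm(x)
    return np.linalg.norm(apply_At(apply_A(x)))

def fista(grad, prox, x0, step, n_iter=100):
    """Proximal gradient descent as in (21) with FISTA acceleration and a
    restart-on-increase safeguard (backtracking refinement omitted)."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        x_new = prox(y - step * grad(y), step)
        # restart the momentum if the extrapolated iterate moved "uphill"
        if np.vdot(y - x_new, x_new - x) > 0:
            y, t = x, 1.0
            x_new = prox(y - step * grad(y), step)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```

Passing the forward operator and its adjoint as callables mirrors the matrix-free setting of PAT, where only the wave propagation and its adjoint are available.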
The proximal operator for (8) can be computed componentwise and explicitly:
(22) $\bigl(\mathrm{prox}_{\mu F}(z)\bigr)_i = \frac{\max(z_i, 0)}{1 + 2 \mu \alpha} \quad \text{for} \quad F(u) = \alpha \|u\|_2^2 + \iota_{\geq 0}(u).$
The positivityconstrained TV denoising is implemented by a primaldual hybrid gradient algorithm as described in [71].
Appendix C Implementation
All routines have been implemented as part of a larger Matlab toolbox for PAT image reconstruction, which will be made available in the near future. The toolbox relies on the kWave toolbox (see [43], http://www.kwave.org/) to implement and , which allows the use of highly optimized C++ and CUDA code to compute the 3D wave propagation on parallel CPU or GPU architectures. To give an idea of the range of different computation times: computing one application of for the in-vivo Vessels scenario (cf. Table 2) in single precision takes 15 using the optimized CUDA code on a Tesla K40 GPU (counting only the GPU runtime), 51 using the Matlab code on the same GPU, 47/6 36 using the optimized C++ code on 12/1 cores of an Intel Xeon CPU (2.70GHz) (counting only the CPU runtime), and 4 3/26 48 using the Matlab code on 12/1 cores of the same CPU.
References
 [1] S R Arridge and O Scherzer. Imaging from coupled physics. Inverse Problems, 28(8):080201, 2012.
 [2] S. Zackrisson, S. M W Y van de Ven, and S. S. Gambhir. Light In and Sound Out: Emerging Translational Strategies for Photoacoustic Imaging. Cancer Research, 74(4):979–1004, 2014.
 [3] Adrian Taruttis and Vasilis Ntziachristos. Advances in realtime multispectral optoacoustic imaging and its applications. Nature Photonics, 9(4):219–227, 2015.
 [4] Junjie Yao and Lihong V Wang. Photoacoustic Brain Imaging: from Microscopic to Macroscopic Scales. Neurophotonics, 1(1):011003–1–13, 2014.
 [5] Jun Xia and Lihong V Wang. Smallanimal wholebody photoacoustic tomography: a review. IEEE Transactions on Biomedical Engineering, 61(5):1380–9, 2014.
 [6] Amit P. Jathoul, Jan Laufer, Olumide Ogunlade, Bradley Treeby, Ben Cox, Edward Zhang, Peter Johnson, Arnold R. Pizzey, Brian Philip, Teresa Marafioti, Mark F. Lythgoe, R. Barbara Pedley, Martin A. Pule, and Paul Beard. Deep in vivo photoacoustic imaging of mammalian tissues using a tyrosinasebased genetic reporter. Nature Photon, 9(4):239–246, 2015.
 [7] Lihong V. Wang. Multiscale photoacoustic microscopy and computed tomography. Nature Photonics, 3(9):503–509, Sep 2009.
 [8] Paul Beard. Biomedical photoacoustic imaging. Interface Focus, 2011.
 [9] Liming Nie and Xiaoyuan Chen. Structural and functional photoacoustic molecular tomography aided by emerging contrast agents. Chemical Society Reviews, 43(20):7132–70, 2014.
 [10] E.J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. Information Theory, IEEE Transactions on, 52(2):489–509, 2006.
 [11] David L. Donoho. Compressed sensing. IEEE Trans Inf Theory, 52(4):1289–1306, 2006.
 [12] Simon Foucart and Holger Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser Basel, 2013.
 [13] Michael Lustig, David Donoho, and John M. Pauly. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6):1182–1195, 2007.
 [14] B. Tremoulheac, N. Dikaios, D. Atkinson, and S.R. Arridge. Dynamic MR Image Reconstruction: Separation From Undersampled (k,t)-Space via Low-Rank Plus Sparse Prior. Medical Imaging, IEEE Transactions on, 33(8):1689–1701, Aug 2014.
 [15] E. von Harbou, H. T. Fabich, M. Benning, A. B. Tayler, A. J. Sederman, L. F. Gladden, and D. J. Holland. Quantitative mapping of chemical compositions with MRI using compressed sensing. Journal of Magnetic Resonance, 261:27 – 37, 2015.
 [16] Ludwig Ritschl, Frank Bergner, Christof Fleischmann, and Marc Kachelrieß. Improved total variationbased CT image reconstruction applied to clinical data. Physics in Medicine and Biology, 56(6):1545, 2011.
 [17] Keijo Hämäläinen, Aki Kallonen, Ville Kolehmainen, Matti Lassas, Kati Niinimaki, and Samuli Siltanen. Sparse Tomography. SIAM J Sci Comput, 35(3):B644–B665, 2013.
 [18] J. S. Jørgensen and E. Y. Sidky. How little data is enough? Phasediagram analysis of sparsityregularized Xray computed tomography. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 373(2043), 2015.
 [19] J. Provost and F. Lesage. The application of compressed sensing for photoacoustic tomography. IEEE Transactions on Medical Imaging, 28(4):585–594, April 2009.
 [20] Zijian Guo, Changhui Li, Liang Song, and Lihong V. Wang. Compressed sensing in photoacoustic tomography in vivo. Journal of Biomedical Optics, 15(2):021311–021311–6, 2010.
 [21] Yan Zhang, Yuanyuan Wang, and Chen Zhang. Total variation based gradient descent algorithm for sparseview photoacoustic image reconstruction. Ultrasonics, 52(8):1046 – 1055, 2012.
 [22] Jing Meng, Lihong V. Wang, Dong Liang, and Liang Song. In vivo opticalresolution photoacoustic computed tomography with compressed sensing. Opt. Lett., 37(22):4573–4575, Nov 2012.
 [23] Jing Meng, Lihong V. Wang, Leslie Ying, Dong Liang, and Liang Song. Compressedsensing photoacoustic computed tomography in vivo with partially known support. Opt. Express, 20(15):16510–16523, Jul 2012.
 [24] Meng Cao, Jie Yuan, Sidan Du, Guan Xu, Xueding Wang, Paul L. Carson, and Xiaojun Liu. Fullview photoacoustic tomography using asymmetric distributed sensors optimized with compressed sensing method. Biomedical Signal Processing and Control, 21:19 – 25, 2015.
 [25] M. Sandbichler, F. Krahmer, T. Berer, P. Burgholzer, and M. Haltmeier. A novel compressed sensing scheme for photoacoustic tomography. SIAM Journal on Applied Mathematics, 75(6):2475–2494, 2015.
 [26] Dong Liang, Hao F. Zhang, and Leslie Ying. Compressedsensing photoacoustic imaging based on random optical illumination. International Journal of Functional Informatics and Personalised Medicine, 2(4):394–406, 2009.
 [27] Mingjian Sun, Naizhang Feng, Yi Shen, Xiangli Shen, Liyong Ma, Jiangang Li, and Zhenghua Wu. Photoacoustic imaging method based on arcdirection compressed sensing and multiangle observation. Opt. Express, 19(16):14801–14806, Aug 2011.
 [28] Kun Wang and MarkA. Anastasio. Photoacoustic and thermoacoustic tomography: Image formation principles. In Otmar Scherzer, editor, Handbook of Mathematical Methods in Imaging, pages 781–815. Springer New York, 2011.
 [29] M. Benning. Singular Regularization of Inverse Problems. PhD thesis, University of Muenster, 2011.
 [30] Christian Lutzweiler and Daniel Razansky. Optoacoustic imaging and tomography: Reconstruction approaches and outstanding challenges in image performance and quantification. Sensors, 13(6):7345, 2013.
 [31] Ben Cox, Jan G. Laufer, Simon R. Arridge, and Paul C. Beard. Quantitative spectroscopic photoacoustic imaging: a review. Journal of Biomedical Optics, 17(6):061202–1–061202–22, 2012.
 [32] Edward Zhang, Jan Laufer, and Paul Beard. Backward-mode multiwavelength photoacoustic scanner using a planar Fabry-Pérot polymer film ultrasound sensor for high-resolution three-dimensional imaging of biological tissues. Appl. Opt., 47(4):561–577, Feb 2008.
 [33] M.F. Duarte, M.A. Davenport, D. Takhar, J.N. Laska, Ting Sun, K.F. Kelly, and R.G. Baraniuk. Single-pixel imaging via compressive sampling. Signal Processing Magazine, IEEE, 25(2):83–91, March 2008.
 [34] Nam Huynh, Edward Zhang, Marta Betcke, Simon Arridge, Paul Beard, and Ben Cox. Patterned interrogation scheme for compressed sensing photoacoustic imaging using a Fabry-Pérot planar sensor. In Proc. SPIE, volume 8943, pages 894327–894327–5, 2014.
 [35] Nam Huynh, Edward Zhang, Marta Betcke, Simon Arridge, Paul Beard, and Ben Cox. A real-time ultrasonic field mapping system using a Fabry-Pérot single pixel camera for 3D photoacoustic imaging. In Proc. SPIE, volume 9323, pages 93231O–93231O–7, 2015.
 [36] Nam Huynh, Edward Zhang, Marta Betcke, Simon Arridge, Paul Beard, and Ben Cox. Single-pixel optical camera for video rate ultrasonic imaging. Optica, 3(1):26–29, Jan 2016.
 [37] Simon R. Arridge, Marta M. Betcke, Ben Cox, Felix Lucka, and Brad Treeby. On the adjoint operator in photoacoustic tomography. arXiv, (1602.02027), 2016.
 [38] Peter Kuchment and Leonid Kunyansky. Mathematics of photoacoustic and thermoacoustic tomography. In Otmar Scherzer, editor, Handbook of Mathematical Methods in Imaging, pages 817–865. Springer New York, 2011.
 [39] M. M. Betcke, B. T. Cox, N. Huynh, E. Z. Zhang, P. C. Beard, and S. R. Arridge. Acoustic Wave Field Reconstruction from Compressed Measurements with Application in Photoacoustic Tomography. arXiv, (1609.02763), September 2016.
 [40] T.D. Mast, L.P. Souriau, D.L.D. Liu, M. Tabei, A.I. Nachman, and R.C. Waag. A k-space method for large-scale models of wave propagation in tissue. Ultrasonics, Ferroelectrics, and Frequency Control, IEEE Transactions on, 48(2):341–354, March 2001.
 [41] B. T. Cox, S. Kara, S. R. Arridge, and P. C. Beard. k-space propagation models for acoustically heterogeneous media: Application to biomedical photoacoustics. The Journal of the Acoustical Society of America, 121(6), 2007.
 [42] Bradley E Treeby, Edward Z Zhang, and B T Cox. Photoacoustic tomography in absorbing acoustic media using time reversal. Inverse Problems, 26(11):115003, 2010.
 [43] Bradley E. Treeby and B. T. Cox. k-Wave: MATLAB toolbox for the simulation and reconstruction of photoacoustic wave fields. Journal of Biomedical Optics, 15(2):021314–021314–12, 2010.
 [44] T.T. Do, Lu Gan, N.H. Nguyen, and T.D. Tran. Fast and efficient compressive sensing using structurally random matrices. Signal Processing, IEEE Transactions on, 60(1):139–154, Jan 2012.
 [45] David Finch and Sarah K Patch. Determining a function from its mean values over a family of spheres. SIAM journal on mathematical analysis, 35(5):1213–1240, 2004.
 [46] Yuan Xu and Lihong V Wang. Application of time reversal to thermoacoustic tomography. In Biomedical Optics 2004, pages 257–263. International Society for Optics and Photonics, 2004.
 [47] Plamen Stefanov and Gunther Uhlmann. Thermoacoustic tomography with variable sound speed. Inverse Problems, 25(7):075011, 2009.
 [48] Otmar Scherzer, Markus Grasmair, Harald Grossauer, Markus Haltmeier, and Frank Lenzen. Variational methods in imaging, volume 320. Springer, 2009.
 [49] Leonid I. Rudin, Stanley Osher, and Emad Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60:259–268, November 1992.
 [50] Martin Burger and Stanley Osher. A Guide to the TV Zoo. In Level Set and PDE Based Reconstruction Methods in Imaging, Lecture Notes in Mathematics, pages 1–70. Springer International Publishing, 2013.
 [51] Chao Huang, Kun Wang, Liming Nie, L.V. Wang, and M.A. Anastasio. Full-wave iterative image reconstruction in photoacoustic tomography with acoustically inhomogeneous media. Medical Imaging, IEEE Transactions on, 32(6):1097–1110, June 2013.
 [52] C. Brune, A. Sawatzky, and M. Burger. Primal and dual Bregman methods with application to optical nanoscopy. International Journal of Computer Vision, pages 1–19, 2010.
 [53] J. Mueller. Advanced Image Reconstruction and Denoising: Bregmanized (Higher Order) Total Variation and Application in PET. PhD thesis, Institute for Computational and Applied Mathematics, University of Münster, 2013.
 [54] M. Benning, L. Gladden, D. Holland, C.-B. Schönlieb, and T. Valkonen. Phase reconstruction from velocity-encoded MRI measurements: A survey of sparsity-promoting variational approaches. Journal of Magnetic Resonance, 238:26–43, 2014.
 [55] L.M. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200–217, 1967.
 [56] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin. An iterative regularization method for total variation-based image restoration. Multiscale Modeling and Simulation, 4(2):460–489, 2006.
 [57] Tom Goldstein and Stanley Osher. The Split Bregman method for L1-regularized problems. SIAM Journal on Imaging Sciences, 2:323–343, April 2009.
 [58] SCI Institute, 2015. SCIRun: A Scientific Computing Problem Solving Environment, Scientific Computing and Imaging Institute (SCI), Download from: http://www.scirun.org.
 [59] Jari P Kaipio and Erkki Somersalo. Statistical inverse problems: Discretization, model reduction and inverse crimes. J Comput Appl Math, 198(2):493–504, 2007. Applied Computational Inverse Problems.
 [60] Xing Jin and Lihong V Wang. Thermoacoustic tomography with correction for acoustic speed variations. Physics in Medicine and Biology, 51(24):6437, 2006.
 [61] U. von Toussaint. Bayesian inference in physics. Rev Mod Phys, 83(3):943–999, 2011.
 [62] S R Arridge, J P Kaipio, V Kolehmainen, M Schweiger, E Somersalo, T Tarvainen, and M Vauhkonen. Approximation errors and model reduction with an application in optical diffusion tomography. Inverse Problems, 22(1):175, 2006.
 [63] T. Tarvainen, A. Pulkkinen, B.T. Cox, J.P. Kaipio, and S.R. Arridge. Bayesian Image Reconstruction in Quantitative Photoacoustic Tomography. IEEE Trans Med Imaging, 32(12):2287–2298, Dec 2013.
 [64] Bradley E. Treeby. Acoustic attenuation compensation in photoacoustic tomography using time-variant filtering. Journal of Biomedical Optics, 18(3):036008–036008, 2013.
 [65] Gitta Kutyniok and Wang-Q Lim. Curves and Surfaces: 7th International Conference, Avignon, France, June 24-30, 2010, Revised Selected Papers, chapter Image Separation Using Wavelets and Shearlets, pages 416–430. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
 [66] J. Schmid, T. Glatz, B. Zabihian, M. Liu, W. Drexler, and O. Scherzer. Non-Equispaced Grid Sampling in Photoacoustics with a Non-Uniform FFT. arXiv, (1510.01078), 2015.
 [67] Jürgen Frikel and Eric Todd Quinto. Artifacts in incomplete data tomography with applications to photoacoustic tomography and sonar. SIAM Journal on Applied Mathematics, 75(2):703–725, 2015.
 [68] Martin Burger, Alexander Sawatzky, and Gabriele Steidl. First Order Algorithms in Variational Image Processing. arXiv, (1412.4237):60, 2014.
 [69] T. Goldstein, C. Studer, and R. Baraniuk. A Field Guide to Forward-Backward Splitting with a FASTA Implementation. arXiv e-prints, November 2014.
 [70] A. Beck and M. Teboulle. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. Image Processing, IEEE Transactions on, 18(11):2419–2434, Nov 2009.
 [71] Antonin Chambolle and Thomas Pock. A FirstOrder PrimalDual Algorithm for Convex Problems with Applications to Imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011.
Tables and table captions
Abbreviation  Meaning  Reference 
BP  back projection  Sec. 4.4, (5) 
BPppTV+  BP followed by TV+ denoising as postprocessing  see TRppTV+ 
cnv.  conventional (data sampling)  see Sec. 3.1.1 
DMD  digital micromirror device  Sec. 4.3 
DP  discrepancy principle  Sec. 5.2, (15) 
FP  Fabry-Pérot interferometer  Sec. 3.2 
L2+  positivityconstrained regularization  Sec. 4.5, (8) 
mxIP  maximum intensity projection  Sec. 5.1 
PSNR  peak signaltonoise ratio  (14) 
(Q)PAT  (quantitative) photoacoustic tomography  Sec. 1 
rSP  random single point subsampling  Sec. 3.1.1 
sHd  scrambled Hadamard subsampling  Sec. 3.1.2, 4.3 
TR  time reversal  Sec. 4.4, (6) 
TRppTV+  TR followed by TV+ denoising as postprocessing  (18) 
TV+  positivityconstrained total variation regularization  Sec. 4.5, (9) 
TV+BR  Bregman iterations applied to TV+  Sec. 4.6, (11), (12) 
parameter  Tumor1/2  Knot  Hair  Vessels 
128  (44,264,264)  (28,128,128)  (42,282,282)  
156.25  75  (62.12,62.12,68.00)  50  
740  391  56  621  
31.25  12  20  10  
16384  17424  16384  19881  
156.25/312.5  150  /  100  
1500  1540  1500  1420 