Improved lowcount quantitative PET reconstruction with a variational neural network
Abstract
Image reconstruction in lowcount PET is particularly challenging because gammas from natural radioactivity in Lubased crystals cause high random fractions that lower the measurement signaltonoiseratio (SNR). In modelbased image reconstruction (MBIR), using more iterations of an unregularized method may increase the noise, so incorporating regularization into the image reconstruction is desirable to control the noise. New regularization methods based on learned convolutional operators are emerging in MBIR. We modify the architecture of a variational neural network, BCDNet, for PET MBIR, and demonstrate the efficacy of the trained BCDNet using XCAT phantom data that simulates the low true coincidence countrates with high random fractions typical for Y90 PET patient imaging after Y90 microsphere radioembolization. Numerical results show that the proposed BCDNet significantly improves PET reconstruction performance compared to MBIR methods using nontrained regularizers, total variation (TV) and nonlocal means (NLM), and a nonMBIR method using a single forward pass deep neural network, UNet. BCDNet improved activity recovery for a hot sphere significantly and reduced noise, whereas nontrained regularizers had a tradeoff between noise and quantification. BCDNet improved CNR and RMSE by 43.4% (85.7%) and 12.9% (29.1%) compared to TV (NLM) regularized MBIR. Moreover, whereas the image reconstruction results show that the nonMBIR UNet overfits the training data, BCDNet successfully generalizes to data that differs from training data. Improvements were also demonstrated for the clinically relevant phantom measurement data where we used training and testing datasets having very different activity distribution and countlevel.
I Introduction
Image reconstruction in lowcount PET is particularly challenging because dominant gammas from natural radioactivity in Lubased crystals cause low signaltonoiseratio (SNR) measurements [1]. To accurately reconstruct images in lowcount PET, regularized modelbased image reconstruction (MBIR) solves the following variational problem consisting of 1) a data fidelity that models the physical PET imaging model, and 2) a regularization term that penalizes image roughness and controls noise[2]:
(1) 
Here, is the Poisson negative loglikelihood for measurement and estimated measurement means , the matrix denotes the system model, and denotes the mean background events such as scatter and random coincidences. Recently, applying learned regularizers to is emerging for MBIR [3].
While there is much ongoing research on machine learning or deeplearning techniques applied to CT [4, 5, 6, 7, 8] and MRI [9, 10, 11, 12, 13] reconstruction problems, only a few studies have applied these techniques to PET. Most past PET studies use deep learning in image space without exploiting the physical imaging model in (1). For example, [14] applied a deep neural network (NN) mapping between reconstructed PET images with normal dose and reduced dose and [15] applied a multilayer perceptron mapping between reconstructed images using maximum a posteriori algorithm and a reference (true) image, however, their framework uses the acquisition data only to form the initial image. Therefore, the reconstruction quality depends greatly on the training dataset and information from atypical imaging situations and variable noise levels (that are not part of the training set) may not be recovered well. A recent work [16] trained a NN to reconstruct an image directly from sinogram, however, its framework has similar limitation with [14]. Recently [17, 18] proposed a PET MBIR framework using a deeplearning based regularizer. Our proposed MBIR framework, BCDNet, also uses a regularizer that penalizes differences between the unknown image and “denoised” images given by regression neural network in a recurrent manner. In particular, whereas [17, 18] trained only a single image denoising NN, the proposed method is a recurrent framework that is composed of multiple trained NNs. This recurrent framework enables NNs in the later stages to learn how to recover fine details. Our proposed BCDNet is also distinct from [17, 18] in that denoising NNs are derived by variational (optimization) formulation with a mathematical motivation (whereas, for the trained regularizer, [17, 18] brought UNet [19] and DnCNN [20] developed for medical image segmentation and general Gaussian denoising) and characterized by less parameters, thereby avoiding overfitting and generalizing well to unseen data especially when training samples are limited (see Section IV).
Variational NNs [21, 11, 9, 22, 10, 8] are a broad family of methods that originate from unrolling algorithm for solving an optimization problem and BCDNet [23] is a specific example of a variational NN. BCDNet is constructed by unfolding block coordinate descent (BCD) MBIR algorithm using “learned” convolutional analysis operators[24, 25], and significantly improved image recovery accuracy in extreme imaging applications, e.g., highly undersampled MRI, denoising lowSNR images, etc. A preliminary version of this paper was presented at the 2018 Nuclear Science Symposium and Medical Imaging Conference [26]. We significantly extended this work by applying our proposed method to the measurement data with newly developed techniques. We also added detailed analysis on our proposed method as well as comparisons to related works.
To show the efficacy of our proposed method BCDNet in lowcount PET imaging, we performed both digital phantom simulation and experimental measurement studies with activity distributions and countrates that are relevant to clinical Y90 PET imaging after liver radioembolization. Novel therapeutic applications have sparked growing interest in quantitative imaging of Y90, an almost pure beta emitter that is widely used in internal radionuclide therapy. In addition to the FDA approved Y90 microsphere radioembolization and Y90 ibritumomab radioimmunotherapy, there are 50 active clinical trials for Y90 labeled therapies (www.clinicaltrails.gov). However, the lack of gamma photons complicates imaging of Y90; it involves SPECT via bremsstrahlung photons produced by the betas [27] or PET via a very low abundance positron in the presence of bremsstrahlung that leads to low signaltonoise [28]. We apply a deep BCDNet that is trained for realistic lowcount PET imaging environments and compare its performance with those of nontrained regularizers and single forward pass denoising NN.
Our proposed BCDNet applies to PET imaging in general, particularly in other imaging situations that also have low counts. Using shorter scan times and lower tracer activity in diagnostic PET has cost benefits and reduces radiation exposure, but at the expense of reduced counts that makes traditional iterative reconstruction challenging.
Section II develops our proposed BCDNet architecture for PET MBIR. Section II also explains the simulation studies in the setting of Y90 radioembolization and provides details on how we perform the physical phantom measurement. Section III presents how the different reconstruction methods perform with the simulation and measurement data. Section IV discusses what imaging factor affects generalization performance of BCDNet most, and compares between BCDNet using shallow convolutional autoencoding denoiser and BCDNet using deep UNet denoiser. Section V concludes with future works.
Ii Methods
This section presents the problem formulation of the BCDNet and gives a detailed derivation on how we get the final form of BCDNet. We also provide several techniques for BCDNet that we specifically devised for PET measurement data where each measurement has different countlevel (and noiselevel). Then we review the related works that we compare with BCDNet such as MBIR methods using conventional nontrained regularizers and single forward pass denoising (nonMBIR) NN. This section also describes the simulation setting and details on measurement data and what evaluation metrics are used to assess the efficacy of each reconstruction algorithm.
Iia BCD algorithm for MBIR using “learned” convolutional regularization
Conventional PET regularizers penalize differences between neighboring pixels [29]. That approach is equivalent to assuming that convolving the image with the [1,1] finite difference filters with different directions produces a sparse output. Using such “handcrafted” filters is unlikely to be the best approach. A recent trend is to use training data to learn filters that produce sparse outputs when convolved with images of interest [24, 25]. Such learned filters can be used to define a regularizer that prefers images having sparse outputs, as follows[30]:
(2) 
where is regularization parameter, is a set of convolutional filters, is a set of sparse codes, is a set of thresholding parameters controlling the sparsity of , is the number of image voxels, and and is the size and number of learned filters, respectively. BCDNet is inspired by this type of “learned” regularizer. Ultimately, we hope that the learned regularizer can better separate true signal from noisy components.
BCD algorithm solves (1) with regularizer (2) by alternatively updating and :
(3)  
(4) 
where is the elementwise soft thresholding operator: .
Assuming that learned filters satisfy the tightframe condition, [24], we rewrite the updates in (3)(4) as follows:
(5)  
(6) 
where denotes a rotated version of . For efficient image reconstruction (6) in PET, we use EMsurrogate of Poisson loglikelihood function [31]:
where and is the number of rays. Equating to zero is equivalent to finding the root of the following quadratic formula:
where denotes an element of the system model at th row and th column, and denotes th iteration in (6).
IiB BCDNet for PET MBIR and its training
To further improve denoising capability, we extend the convolutional autoencoder in (5) [23], by replacing with separate decoding filters . We define the following updates for each layer:
(7)  
(8) 
One can further extend the convolutional autoencoder in (7) to a general regression NN, e.g., deep UNet [19]. See Section IV.
We train the image denoising module in the form of (7) – specifically, consisting of encoding and decoding filters, and thresholding values – that “best” map between highquality images (e.g., true images if available) and noisy images in the sense of mean squared error:
(9) 
where is a set of true images and is a set of images estimated by image reconstruction module in the th layer.
IiC BCDNet for measurement data in PET
IiC1 Normalization and scaling scheme
Different PET images can have very different intensity values due to variations in scan time and activity, and it is important for trained methods to be able to generalize to a wide range of count levels. Towards this end, we implemented normalization and scaling techniques in BCDNet. [18] extended [17] by implementing “local linear fitting” to ensure that the denoising NN output has similar intensity as the input patch from the current estimated image. Our approach is different in that we normalize and scale the image with a global approach, not a patchbased approach. In particular, we modify the architecture in (7)(8) as:
(10)  
(11) 
where the normalization function is defined by to satisfy , and the scaling function is defined by with . We solve the optimization problem over using Newton’s method:
(12) 
We also apply this normalization technique when training the convolutional filters and thresholding values:
IiC2 Adaptive regularization parameter scheme
The best regularization parameter value can also vary greatly between scans, depending on the count level. Therefore, instead of choosing one specific value for regularization parameter, we set the value based on evaluation on current gradients of datafidelity term and regularization term:
(13) 
where is a constant specifying how we balance between the datafidelity term and regularization term and denotes th iteration in (11).
IiD Conventional MBIR methods: Nontrained regularizers
We compared the proposed BCDNet with two MBIR methods using standard nontrained regularizers.
IiD1 Totalvariation (TV)
TV regularization penalizes the sum of absolute value of differences between adjacent voxels:
where is finite differencing matrix. Recent work [32] applied PrimalDual Hybrid Gradient (PDHG) [33] for PET MBIR using TV regularization and demonstrated that PDHGTV is superior than clinical reconstruction (e.g., OSEM) for lowcount datasets in terms of several image quality evaluation metrics such as contrast recovery and variability.
IiD2 Nonlocal means (NLM)
NLM regularization minimizes the differences between nearby patches in image:
where is a potential function of a scalar variable , is the search neighborhood around the th voxel, and is a patch extraction operator at the th voxel. We implemented Fair potential function for :
where is a design parameter and is the number of voxels in the patch . Unlike conventional local filters that assume similarity between only adjacent voxels, NLM filters can average image intensities over distant voxels. As in [34], we used ADMM to accelerate algorithmic convergence with an adaptive penalty parameter selection method [35].
IiE Conventional nonMBIR method: Denoising deep UNet
Many related works [5, 36, 14] use single image denoising (deep) NN (e.g., UNet) as a postreconstruction processing. We implemented nonrecurrent (single forward pass) 3D version of UNet to compare with the proposed BCDNet. The ‘encoder’ part of UNet consists of multiple sets of 1) max pooling layer, 2) 333 convolutional layer, 3) batch normalization (BN) layer, 4) ReLU layer and the ‘decoder’ part of UNet consists of multiple sets of 1) upsampling with trilinear interpolation [17], 2) 333 convolutional layer, 3) batch normalization (BN) layer, 4) ReLU layer. We used ReLU layer as the last step to enforce the nonnegativity constraint on image [17].
IiF Experimental setup: Digital phantom simulation and experimental measurement
IiF1 Y90 PET/CT XCAT simulations
We used XCAT [37] phantom (Fig. 2) to simulate Y90 PET following radioembolization. We set the image size to 128128100 with a voxel size 4.04.04.0 (mm) and chose 100 slices ranging from lung to liver. To simulate the extremely low count scan, typical for Y90 PET, we set total true coincidences and random fraction based on numbers from patient PET imaging performed after radioembolization [38]. To test the generalization capability of trained NNs, we changed all imaging factors between training and testing dataset. Here, imaging factors include activity distribution (shape and size of tumor and liver background, concentration ratio between hot and warm region) and countlevel (total true coincidences and random fraction). Fig. 2 and Table I provide details on how we changed the testing dataset from the training dataset.
Training data  Testing data  

Concentration ratio (hot:warm)  9:1  4:1 
Total net trues  200 K  500 K 
Random fraction (%)  90.9  87.5 
Sphere  Livertorso  

Total activity (GBq)  0.65  1.9 
Concentration ratio (hot:warm)  8.9:1  5.4:1 
Total prompts  3.2  6.3 M  2.3 M 
Total randoms  2.9  5.7 M  2.1 M 
Total net trues  308  599 K  220 K 
Random fraction(%)  90.3  90.5  90.7 
Patient A  

Total activity (GBq)  2.55 
Total prompts  2.7 M 
Total randoms  2.3 M 
Total net trues  380 K 
Random fraction(%)  85.8 
IiF2 Y90 PET/CT physical phantom measurements and patient scan
For training NNs (BCDNet and UNet), we used measurement with sphere phantom (Fig. 4) where six ‘hot’ spheres (2,4,8,16,30 and 113 mL, 0.5 MBq/ml) are placed in a ‘warm’ background (0.057 MBq/ml) with total activity of 0.65 GBq. The phantom was scanned for 40 (3 acquisitions)  80 (1 acquisition) minutes on a Siemens Biograph mCT PET/CT. For testing NNs and other reconstruction algorithms, we used an anthropomorphic liver/lung torso phantom (Fig. 4) with total activity and distribution that is clinically realistic for imaging following radioembolization with Y90 microspheres: 5% lung shunt, 1.17 MBq/mL in liver, 3 hepatic lesions (4 and 16 mL spheres, 29 mL ovoid) of 6.6 MBq/ml. The phantom with total activity of 1.9 GBq was scanned for 30 minutes on a Siemens Biograph mCT PET/CT. Fig. 4 and Table II provide details on the countlevel and activity distribution difference between training (sphere phantom) and testing (liver phantom) dataset. We also tested NNs with an actual Y90 patient scan and the countlevel information is provided in Table III.
We acquired all measurement data with time of flight information. The measurement data size is 40016862113. The last dimension of measurement indicates the number of time bin. The reconstructed image size is 400400112 with a voxel size 2.042.042.03 (mm). To reconstruct the image with measurement data, we used SIEMENS time of flight system model ( in (1)) along with manufacturer given attenuation/normalization correction, PSF modelling, and randoms/scatters estimation.
IiG Evaluation metrics
For XCAT phantom simulation, we evaluated each reconstruction with activity recovery (AR) (VOI: hot sphere, background), contrast recovery (CR) (VOI: cold spot), contrast to noise ratio (CNR), and root mean squared error (RMSE). For physical phantom measurement, we used AR, coefficient of variation (CV) [39] (VOI: hot spheres), and CNR averaged over multiple hot spheres. For patient measurement, we used field of view (FOV) activity bias since the total activity in FOV is known (equal to the injected activity because the microspheres are trapped) while the activity distribution is unknown:
where is mean counts in the volume of interest (VOI), is standard deviation between voxel values in uniform background liver, and is the total number of voxels in field of view (FOV).
Attenuation map (coronal)  Attenuation map (axial)  True activity (training)  True activity (testing)  Zoomed in 
EM  TV  NLM  UNet  BCDNet 
Profile at Lesion 
Profile at Cold Spot 
Sphere phantom attenuation map (coronal)  Attenuation map (axial)  True activity image  of BCDNet 
Liver phantom attenuation map (coronal)  Attenuation map (axial)  True activity image  of BCDNet 
TRUE  OSEM w/ postfiltering  UNet  BCDNet 

Iii Results
Iiia Experimental setup: Training BCDNet and UNet
IiiA1 BCDNet
We trained 3D convolutional filters and thresholding values in each layer with a stochastic gradient descent method using PyTorch [40] deeplearning library. We trained a thirtylayer BCDNet where each layer has sets of thresholding values and convolutional encoding/decoding filters for the simulation experiment and twentylayer BCDNet with (due to GPU memory capacity) for the measurement experiment. We set the size of each filter as , and the initial thresholding values by sorting the initial estimate of image and getting a 10% largest value of sorted initial image. We used Adam optimization method [41] to train the NN with learning rate of for encoding filters, for decoding filters, and for thresholding values. We applied the learning rate decay scheme (e.g., decreasing the learning rate as a factor of 0.9 per 20 epochs). Due to the large size of 3D input, we set the batch size as 1. We used 300 epochs to train the denoising NN at each layer.
With the XCAT simulation data, we trained BCDNet using five pairs () of 3D true image and 3D realizations (1 true image, 5 realizations). We generated multiple realizations to train the denoising NN to deal with the Poisson noise.
With measurement data, we used the sphere phantom data to train the BCDNet. We used 4 pairs () of 3D true image and 4D measurements (1 true image, 4 scans) for training. We set in (13). In measurement experiment, due to the memory capacity of GPUs, we downsample with a factor of 2 before feeding into the denoising module and upsample before feeding into the image reconstruction module. The final output image is from the reconstruction module.
IiiA2 UNet
For training UNet, we used identical training dataset that we used for training BCDNet. We also used Adam optimization method to train the NN with learning rate of . We scaled the image generated by UNet with described in (12). We set the number of convolutional filter channels of first layer of encoder as 48 for simulation dataset and 32 for measurement dataset.
IiiB Experimental setup: Reconstruction and testing trained NN
With simulation dataset, we compare the proposed method BCDNet to the standard EM (1 subset), TVbased MBIR with PDHG algorithm (PDHGTV), NLMbased MBIR with ADMM algorithm (ADMMNLM), and single forward pass denoising UNet. For BCDNet and UNet, we used 20 EM algorithm iterations to get the initial image. We report averaged evaluation metrics over 3 realizations. With measurement dataset, we compare the proposed method BCDNet to the standard EM (1 subset), OSEM (1 iteration and 21 subsets) with postfiltering which is identical to reconstruction setting used in our clinic, and denoising UNet. We used 10 EM algorithm iterations to get the initial image for BCDNet and UNet.
IiiB1 Reconstruction (testing) results on simulation data
We used 50 iterations for EM and 200 iterations for PDHGTV, ADMMNLM, and 5 iterations for the reconstruction module (8) at each layer of BCDNet. When reporting evaluation results, we selected the iteration number for EM to obtain the highest CNR. For each regularizer, we finely tuned the regularization parameter (within range ) to achieve highest CNR and lowest RMSE values. For NLM, we additionally tuned the window and search sizes.
IiiB2 Reconstruction (testing) results on measurement data
We tested on the livertorso phantom measurement data and patient measurement data with trained BCDNet (and UNet) using sphere phantom data. We used 2 iterations for the reconstruction module (11) at each layer of BCDNet.
IiiC Results: Reconstruction (testing) on simulation data
The proposed variational NN, BCDNet, significantly improves overall reconstruction performance over the other nontrained regularization methods. See Table IV and Fig. 2. Table IV shows that BCDNet achieves best results in most evaluation metrics. In particular, BCDNet improves CNR and RMSE by 43.4%/85.7%/8.3% and 12.9%/29.1%/26.9% compared to PDHGTV/ADMMNLM/UNet. Fig. 2 shows that reconstructed image using BCDNet is closest to the true image whereas PHDGTV and UNet exceedingly blur in cold region and ADMMNLM is noisy in uniform region. The profiles in Fig. 3 also illustrate that PDHGTV and UNet underestimate the hot region, whereas profile of BCDNet is very close to true value in both hot and cold region.
AR  AR  CR  CNR  RMSE  
Lesion  Liver  Cold Spot  
EM  77.6  79.5  49.4  7.7  7.0 
PDHGTV  73.8  90.9  64.5  6.9  6.5 
ADMMNLM  77.6  88.2  68.0  5.3  7.9 
UNet  72.3  88.3  35.7  9.1  7.7 
BCDNet  83.7  90.4  70.2  9.9  5.6 
AR  CV  CNR  


29ml  16ml  4ml  Liver  
EM  74.9  66.0  49.1  88.3  35.2  6.1 
OSEM w/ postfiltering  83.9  80.1  47.5  90.4  40.9  5.6 
UNet  72.9  62.1  35.9  46.5  12.7  7.2 
BCDNet  97.1  80.3  54.0  87.2  25.6  6.8 
We report CV and CNR value using average of three hot lesions.
FOV bias  

OSEM w/ filter  19.2 
UNet  33.0 
BCDNet  10.6 
IiiD Results: Reconstruction (testing) on measurement data
IiiD1 Phantom study
Similar to the result of simulation data, BCDNet improves overall reconstruction performance over the other reconstruction methods. See Fig. 4 and Table V. When reporting evaluation metrics and visualizing the images, we selected the iteration number for EM and BCDNet to obtain the highest CNR. Table V shows that BCDNet improves AR/CV/CNR at multiple spheres by 9.3%/37.5%/22.3% compared to OSEM. Even though UNet improves CNR more than BCDNet, image generated by UNet looks unnatural and the shape of liver is roundshape like sphere phantom showing that UNet is overfitting to the training data.
IiiD2 Patient study
Because of the unknown true activity distribution, we quantitatively evaluated each reconstruction method with FOV activity bias. BCDNet improves the FOV bias by 44.9% and 67.0% compared to OSEM and UNet. UNet generates very artificial image that resembles the sphere phantom activity distribution. See Fig. 5.
Patient attenuation map (coronal)  Attenuation map (axial)  OSEM w/ filter (coronal)  OSEM w/ filter (axial) 
UNet (coronal)  UNet (axial)  BCDNet (coronal)  BCDNet (axial) 
Iv Discussion
In this study we showed the efficacy of trained regularization method BCDNet on both qualitative and quantitative Y90 PET/CT imaging and compared between conventional nontrained regularizers and other trainingbased denoising method. The proposed regularization method uses convolutional filters to lift estimated signals and thresholding operations to remove unwanted signals. Moreover, the recurrent framework of BCDNet enables one to train the filters and thresholding values to deal with the different image roughness at its each layer. We experimentally demonstrate its capability of generalization with simulation and measurement data. In the XCAT PET/CT simulation with activity distributions and countrates mimicking Y90 PET imaging, AR in the volume of interest was significantly underestimated with standard reconstruction and other MBIR methods using nontrained regularization, however approached the true activity with the proposed regularization method. Improvements were also demonstrated for the measurement data where we used training and testing datasets having very different activity distribution and countlevel.
We investigated if the recurrent BCDNet combined with UNet denoisers [17] (denoising module in (7) is replaced by UNet) is better than the proposed BCDNet using convolutional autoencoder denoiser (7). At each BCDNet layer, UNet in this experiment has about 22.6 million trainable parameters, whereas the denoiser in (7) has only 7040 ( trainable parameters. In Fig. 6, RMSE value of UNetbased BCDNet is lower than that of the proposed BCDNet using (7) in training dataset, however, it is opposite in testing data. This result demonstrates that the shallow convolutional autoencoders in (7) has better generalization capability over the deep UNet denoisers, when the training samples are limited. Fig. 6 also compares between reconstructed images from BCDNet with the convolutional autoencoders (7) and that using UNets, and it shows that the visual quality of (7)based BCDNet is better than that of UNetbased BCDNet in that UNet denoisers generates artificial cold region around the hot region. This overfitting issue with UNet based regularizer could be moderated if one trained the NN with more diverse and large dataset; however, medical imaging community often finds it hard to collect large enough datasets compared to the computer vision community [42], therefore it is important for a reconstruction method to ensure the generalization to unseen dataset.
(a) RMSE in training dataset  (b) RMSE in testing dataset 

(c) TRUE  (d) Autoencoder denoiser  (e) UNet denoiser 

(a) Impact of number of filter  (b) Impact of size of filter 
(Fixed )  (Fixed ) 
(c) Trained with loss  (d)  (e) 

(a) at each layer 
(b) at each layer 
in training and testing case when 
Changed imaging variable  Training  Testing  RMSE  Drop (%) 
Identical    4.74    
Shape and size  See Fig. 2  5.49  15.9  
Concentration ratio  9:1  4:1  5.55  17.1 
Concentration ratio  1.7:1  4:1  5.81  22.5 
Trues Countlevel  5.01  5.7  
Trues Countlevel  5.71  20.5 
We tested which imaging variable impacts most on the generalization performance of the proposed BCDNet. Table VII shows that how BCDNet performs when training and testing data had same activity distribution and countlevel (only difference is Poisson noise) and how the performance of BCDNet is degraded when each imaging variable is changed between training and testing dataset. We changed one of three factors (shape and size of tumor and liver, concentration ratio, countlevel) in training dataset compared to testing dataset. The result shows that generalization performance of the proposed BCDNet depends largely on all of imaging variables. However, training with higher contrast and lower countlevel dataset (compared to testing dataset) gives less degradation of performance compared to the opposite cases. This result suggests that it is better to have noisier data in training dataset than testing dataset. In other words, training for extra noise reduction than needed is better than less noise reduction than needed.
We investigated how each factor in training of denoising module (7) impacts on the generalization capability of BCDNet. Fig. 7(a)(b) show the impact of number and size of filters on performance. Plots show that the proposed BCDNet achieves lower training RMSE when using larger number and size of filters; however, it does not decrease testing RMSE compared to smaller number and size of filters and BCDNet with larger size of filter exceedingly blurs image thereby resulting in higher RMSE. See Fig. 7(e). This result well corresponds to the above result related to the parameter dimension of NN. We also tested if replacing training loss (mean squared error) in (9) by loss improves the performance. However, it results in unnaturally piecewise constant image and details in small cold regions are ignored.
Fig. 8 shows plots of the thresholding values at each layer and sum of the values at each layer. Sum of thresholding value decreases as the algorithm iterates. The result corresponds to the motivation of BCDNet in that the noise component of signal (unwanted signal) is removed by thresholding operations at early layer and fine details of true signal are recovered at later layer.
Fig. 9 shows how regularization parameter in (13) changes with iterations in training and testing datasets. The value in each layer converges to different limits in training and testing cases. Large discrepancy of values between training and testing cases is mostly from the different intensity of normalization correction factor (sum of factors in training case is times larger than testing case) associated in the system model in (1). These empirical results underscore the importance of such adaptive regularization parameter selection schemes proposed in Section IIC2 in PET imaging.
V Conclusion
It is important for a “learned” regularizer to have generalization capability to guarantee the performance when applying it to unseen dataset. For lowcount PET reconstruction, the proposed variational NN, BCDNet, significantly improves the generalization capability, compared to a deep NN, UNet, to the unseen dataset, when training dataset is small. The proposed BCDNet achieves significant qualitative and quantitative improvements over the conventional MBIR methods using “handcrafted” nontrained regularizers, TV and NLM. In particular, these conventional MBIR methods have a tradeoff between noise and recovery accuracy, whereas the proposed BCDNet improves AR for hot regions and reduced noise at the same time. Visual comparison of reconstructed images also shows that the proposed BCDNet significantly improves PET image reconstruction performance compared to MBIR methods using nontrained regularizers and nonMBIR denoising UNet.
Future work includes investigating performance of BCDNet trained with endtoend training principles and adaptive selection of trainable parameter numbers depending on the size of training dataset.
Vi Acknowledgment
We acknowledge Se Young Chun (UNIST) for providing NLM regularization codes. We also acknowledge Maurizio Conti and Deepak Bharkhada (SIEMENS Healthcare Molecular Imaging) for providing the forward/back projector for TOF measurement data. This work was supported by NIHNIBIB grant R01EB022075.
References
 [1] T. Carlier, K. P. Willowson, E. Fourkal, D. L. Bailey, M. Doss, and M. Conti, “Y90PET imaging: exploring limitations and accuracy under conditions of low counts and high random fraction,” Med. Phys., vol. 42, no. 7, pp. 4295–309, Jun. 2015.
 [2] S. Ahn, S. G. Ross, E. Asma, J. Miao, X. Jin, L. Cheng, S. D. Wollenweber, and R. M. Manjeshwar, “Quantitative comparison of OSEM and penalized likelihood image reconstruction using relative difference penalties for clinical PET,” Physics in Medicine & Biology, vol. 60, no. 15, p. 5733, 2015.
 [3] G. Wang, J. C. Ye, K. Mueller, and J. A. Fessler, “Image reconstruction is a new frontier of machine learning,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1289–96, Jun. 2018.
 [4] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, “Lowdose CT with a residual encoderdecoder convolutional neural network,” IEEE transactions on medical imaging, vol. 36, no. 12, pp. 2524–2535, 2017.
 [5] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
 [6] J. C. Ye, Y. Han, and E. Cha, “Deep convolutional framelets: A general deep learning framework for inverse problems,” SIAM Journal on Imaging Sciences, vol. 11, no. 2, pp. 991–1048, 2018.
 [7] H. Gupta, K. H. Jin, H. Q. Nguyen, M. T. McCann, and M. Unser, “CNNbased projected gradient descent for consistent CT image reconstruction,” IEEE transactions on medical imaging, vol. 37, no. 6, pp. 1440–1453, 2018.
 [8] I. Y. Chun, H. Lim, Z. Huang, and J. A. Fessler, “Fast and convergent iterative signal recovery using trained convolutional neural networkss,” in Proc. Allerton Conf. on Commun., Control, and Comput., Allerton, IL, Oct. 2018.
 [9] H. K. Aggarwal, M. P. Mani, and M. Jacob, “MoDL: modelbased deep learning architecture for inverse problems,” IEEE Trans. Med. Imag., vol. 38, no. 2, pp. 394–405, Feb. 2019.
 [10] K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, and F. Knoll, “Learning a variational network for reconstruction of accelerated MRI data,” Magnetic resonance in medicine, vol. 79, no. 6, pp. 3055–3071, 2018.
 [11] J. Sun, H. Li, Z. Xu et al., “Deep ADMMNet for compressive sensing MRI,” in Advances in neural information processing systems, 2016, pp. 10–18.
 [12] M. Mardani, E. Gong, J. Y. Cheng, S. S. Vasanawala, G. Zaharchuk, L. Xing, and J. M. Pauly, “Deep generative adversarial neural networks for compressive sensing MRI,” IEEE transactions on medical imaging, vol. 38, no. 1, pp. 167–179, 2019.
 [13] G. Yang, S. Yu, H. Dong, G. Slabaugh, P. L. Dragotti, X. Ye, F. Liu, S. Arridge, J. Keegan, Y. Guo et al., “DAGAN: Deep DeAliasing Generative Adversarial Networks for fast compressed sensing MRI reconstruction,” IEEE transactions on medical imaging, vol. 37, no. 6, pp. 1310–1321, 2018.
 [14] J. Xu, E. Gong, J. Pauly, and G. Zaharchuk, “200x lowdose PET reconstruction using deep learning,” arXiv preprint arXiv:1712.04119, 2017.
 [15] B. Yang, L. Ying, and J. Tang, “Artificial Neural Network Enhanced Bayesian PET Image Reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1297–1309, June 2018.
 [16] I. Haggstrom, C. R. Schmidtlein, G. Campanella, and T. J. Fuchs, “DeepPET: A deep encoderdecoder network for directly solving the PET image reconstruction inverse problem,” Med. Im. Anal., vol. 54, pp. 253–62, May 2019.
 [17] K. Gong, J. Guan, K. Kim, X. Zhang, J. Yang, Y. Seo, G. El Fakhri, J. Qi, and Q. Li, “Iterative PET image reconstruction using convolutional neural network representation,” IEEE transactions on medical imaging, vol. 38, no. 3, pp. 675–685, 2019.
 [18] K. Kim, D. Wu, K. Gong, J. Dutta, J. H. Kim, Y. D. Son, H. K. Kim, G. El Fakhri, and Q. Li, “Penalized PET reconstruction using deep learning prior and local linear fitting,” IEEE transactions on medical imaging, vol. 37, no. 6, pp. 1478–1487, 2018.
 [19] O. Ronneberger, P. Fischer, and T. Brox, “Unet: Convolutional networks for biomedical image segmentation,” in International Conference on Medical image computing and computerassisted intervention. Springer, 2015, pp. 234–241.
 [20] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142–3155, July 2017.
 [21] K. Gregor and Y. LeCun, “Learning fast approximations of sparse coding,” in Proceedings of the 27th International Conference on International Conference on Machine Learning. Omnipress, 2010, pp. 399–406.
 [22] Y. Chen and T. Pock, “Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration,” IEEE transactions on pattern analysis and machine intelligence, vol. 39, no. 6, pp. 1256–1272, 2017.
 [23] I. Y. Chun and J. A. Fessler, “Deep BCDnet using identical encodingdecoding CNN structures for iterative image recovery,” in 2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP). IEEE, 2018, pp. 1–5.
 [24] ——, “Convolutional analysis operator learning: Acceleration and convergence,” arXiv preprint arXiv:1802.05584, Jan 2018.
 [25] I. Y. Chun, D. Hong, B. Adcock, and J. A. Fessler, “Convolutional analysis operator learning: Dependence on training data,” submitted, Feb. 2019.
 [26] H. Lim, Z. Huang, J. A. Fessler, Y. K. Dewaraja, and I. Y. Chun, “Application of trained deep BCDNet to iterative lowcount PET image reconstruction,” in 2018 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC). IEEE, 2018.
 [27] M. Elschot, M. G. Lam, M. A. van den Bosch, M. A. Viergever, and H. W. de Jong, “Quantitative Monte Carlobased 90Y SPECT reconstruction,” Journal of Nuclear Medicine, vol. 54, no. 9, pp. 1557–1563, 2013.
 [28] A. S. Pasciak, A. C. Bourgeois, J. M. McKinney, T. T. Chang, D. R. Osborne, S. N. Acuff, and Y. C. Bradley, “Radioembolization and the dynamic role of 90Y PET/CT,” Frontiers in oncology, vol. 4, p. 38, 2014.
 [29] J. Nuyts, D. Beque, P. Dupont, and L. Mortelmans, “A concave prior penalizing relative differences for maximumaposteriori reconstruction in emission tomography,” IEEE Transactions on nuclear science, vol. 49, no. 1, pp. 56–60, 2002.
 [30] I. Y. Chun and J. A. Fessler, “Convolutional analysis operator learning: Application to sparseview CT,” in Proc. Asilomar Conf. on Signals, Syst., and Comput., Pacific Grove, CA, Oct. 2018.
 [31] A. R. De Pierro, “A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography,” IEEE Trans. Med. Imag., vol. 14, no. 1, pp. 132–7, Mar. 1995.
 [32] Z. Zhang, S. Rose, J. Ye, A. E. Perkins, B. Chen, C.M. Kao, E. Y. Sidky, C.H. Tung, and X. Pan, “Optimizationbased image reconstruction from lowcount, listmode TOFPET data,” IEEE Transactions on Biomedical Engineering, vol. 65, no. 4, pp. 936–946, 2018.
 [33] A. Chambolle and T. Pock, “An introduction to continuous optimization for imaging,” Acta Numerica, vol. 25, pp. 161–319, 2016.
 [34] S. Y. Chun, Y. K. Dewaraja, and J. A. Fessler, “Alternating direction method of multiplier for tomography with nonlocal regularizers,” IEEE transactions on medical imaging, vol. 33, no. 10, pp. 1960–1968, 2014.
 [35] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. & Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2010.
 [36] E. Kang, J. Min, and J. C. Ye, “A deep convolutional neural network using directional wavelets for lowdose Xray CT reconstruction,” Medical physics, vol. 44, no. 10, pp. e360–e375, 2017.
 [37] W. Segars, G. Sturgeon, S. Mendonca, J. Grimes, and B. M. Tsui, “4D XCAT phantom for multimodality imaging research,” Medical physics, vol. 37, no. 9, pp. 4902–4915, 2010.
 [38] H. Lim, Y. K. Dewaraja, and J. A. Fessler, “A PET reconstruction formulation that enforces nonnegativity in projection space for bias reduction in Y90 imaging,” Phys. Med. Biol., vol. 63, no. 3, p. 035042, Feb. 2018.
 [39] Quality Assurance for PET and PET/CT Systems, ser. Human Health Series. Vienna: INTERNATIONAL ATOMIC ENERGY AGENCY, 2009, no. 1. [Online]. Available: https://www.iaea.org/publications/8002/qualityassuranceforpetandpet/ctsystems
 [40] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in PyTorch,” in NIPSW, 2017.
 [41] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
 [42] J. Deng, W. Dong, R. Socher, L.J. Li, K. Li, and L. FeiFei, “ImageNet: A LargeScale Hierarchical Image Database,” in CVPR09, 2009.