Deep Mesh Projectors for Inverse Problems


Sidharth Gupta
University of Illinois at Urbana-Champaign
gupta67@illinois.edu
Konik Kothari
University of Illinois at Urbana-Champaign
kkothar3@illinois.edu
Maarten V. de Hoop
Rice University
mdehoop@rice.edu
Ivan Dokmanić
University of Illinois at Urbana-Champaign
dokmanic@illinois.edu
S. Gupta and K. Kothari contributed equally.
Abstract

We develop a new learning-based approach to ill-posed inverse problems. Instead of directly learning the complex mapping from the measured data to the reconstruction, we learn an ensemble of simpler mappings from data to projections of the unknown model into random low-dimensional subspaces. We form the reconstruction by combining the estimated subspace projections. Structured subspaces of piecewise-constant images on random Delaunay triangulations allow us to address inverse problems with extremely sparse data and still get good reconstructions of the unknown geometry. This choice also makes our method robust against arbitrary data corruptions not seen during training. Further, it marginalizes the role of the training dataset which is essential for applications in geophysics where ground-truth datasets are exceptionally scarce.

1 Introduction

Deep neural networks produce impressive results on a variety of inverse problems, as documented by a rapidly growing number of recent papers (cf. Section 2). Typically, a neural network is trained to remove artifacts due to sparse data or noise, for example in low-dose CT imaging Chen et al. [2017]. Classical approaches rely on universal regularization principles such as smoothness or sparsity; although they give good results, deep neural networks often do better.

In certain domains, however, problems are so ill-posed and the data so sparse that the artifact-removal paradigm is not appropriate: even a coarse reconstruction of the unknown model is hard to get. Unlike in the typical biomedical setting, where applying a regularized pseudoinverse of the imaging operator to the measurements (in the linear case) already brings out considerable structure, in the applications we target standard techniques cannot produce a reasonable image. We illustrate this point in Figure 1. This highly unresolved regime is common in geophysics and requires alternative, more involved strategies Galetti et al. [2017]. The sought reconstructions are accordingly much less detailed.

Figure 1: We consider the problem of reconstructing an image from its tomographic measurements. In the moderately ill-posed problem, conventional methods based on the pseudoinverse and regularized constrained least squares give correct structural information. In fact, total variation approaches already give a very good result. A neural network Jin et al. [2016] can be trained to remove the artifacts. In a severely ill-posed problem, on the other hand (traveltime tomography with few sensors and scarce ground truth data), neither the classical techniques nor a neural network trained for direct inversion gives a reasonable output.

We propose a new way to regularize ill-posed inverse problems using convolutional neural networks that map measured data to low-dimensional projections of the unknown model.

Concretely, we are concerned with the following operator equation,

$F(x) = y$,    (1)

with $F : \mathcal{X} \to \mathcal{Y}$, where the domain $\mathcal{X}$ and the range $\mathcal{Y}$ are Hilbert spaces. In applications, $F$ models the data acquisition. (Following the inverse problems literature, we refer to $y$ as the data and to $x$ as the model.)

For ill-posed problems, attempting to learn $F^{-1}$, even when it formally exists, is a dubious proposal: the inverse mapping satisfies very poor stability estimates, so learning it is not meaningful. A discretization of (1) might give rise to systems that are not singular in theory, but that are too ill-conditioned for practical computation. Discretizing the problem can be interpreted as projecting $x$ into some high- (but finite-) dimensional subspace. Generally, one can show that the projected inverse mapping is Lipschitz stable, but with a very poor constant Beretta et al. [2013], Mandache [2001]. Thus, even if we could learn it, the result would be brittle.
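As a rough numerical illustration of this ill-conditioning (a toy operator of our own choosing, not the operator studied in this paper), the sketch below discretizes 1D integration at increasing resolution and tracks the smallest singular value of the forward matrix; the stability constant of the discretized inverse grows with the discretization dimension.

```python
import numpy as np

# Toy illustration (not the paper's operator): y = A x discretizes 1D integration,
# (A x)_i = (1/n) * sum_{j <= i} x_j. The forward map smooths, so the inverse has a
# Lipschitz constant on the order of 1 / sigma_min(A), which grows with n.
for n in [16, 64, 256, 1024]:
    A = np.tril(np.ones((n, n))) / n            # discretized integration operator
    s = np.linalg.svd(A, compute_uv=False)      # singular values
    print(f"n={n:5d}  sigma_min={s[-1]:.2e}  stability constant ~ {1.0 / s[-1]:.2e}")
```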

Instead, we have to learn a regularized inverse. One possibility is to restrict the inversion to the model manifold, but this requires many ground truth training examples and leads to considerable model bias. A good universal strategy would be to learn the best $L$-Lipschitz approximation of the inverse for some favorable $L$. Alas, we do not know how to translate Lipschitzness into an optimization constraint on the network weights.

Main contributions.

We suggest an alternative strategy: one can show that the Lipschitz constant of the inverse projected into a carefully chosen low-dimensional subspace is exponentially smaller than that of a projection into a high-dimensional subspace. Thus, instead of learning the full inverse map $y \mapsto x$, we learn the maps $y \mapsto P_{S_j} x$ for a collection of projections $\{P_{S_j}\}_{j=1}^J$ onto low-dimensional subspaces $S_j$. Each projection is easier to learn, for example, in terms of sample complexity Cooper [1995] or achievable sup-norm error, and each has a controlled Lipschitz constant Beretta et al. [2013]. We then construct a regularized approximation of $x$ from the estimates of its projections.

We test our ideas on the problem of linearized seismic traveltime tomography Bording et al. [1987], Hole [1992] and show that the proposed method outperforms learned direct inversion in terms of reconstruction quality, robustness to errors in the data, and independence from the training dataset. The latter is essential in domains with few available ground truth images. Finally, we propose a new architecture, the SubNet, which receives as input the low-dimensional subspace in which to compute the reconstruction. This dramatically shortens the training time and allows us to quickly generate projections onto many subspaces not seen at training time, instead of training a separate network for each subspace.

2 Related work

Although neural networks have long been used to address inverse problems Ogawa et al. [1998], Hoole [1993], Schiller and Doerffer [2010], the past few years have seen the number of related deep learning papers grow exponentially. The majority address biomedical imaging Güler and Übeylı [2005], Hudson and Cohen [2000], with several special issues (IEEE Transactions on Medical Imaging, May 2016 Greenspan et al. [2016]; IEEE Signal Processing Magazine, November 2017 and January 2018 Porikli et al. [2017, 2018]) and review papers Lucas et al. [2018], McCann et al. [2017] dedicated to the topic. All these papers address reconstruction from subsampled or low-quality data, often motivated by reduced scanning time or lower radiation doses. Beyond biomedical imaging, machine learning techniques are emerging in geophysical imaging Araya-Polo et al. [2017], Lewis et al. [2017], Bianco and Gerstoft [2017], though at a slower pace, perhaps partly due to the lack of standard open datasets.

Existing methods can be grouped into non-iterative methods that learn a feed-forward mapping from the measured data (or some standard manipulation of it, such as the adjoint or a pseudoinverse) to the model Jin et al. [2016], Pelt and Batenburg [2013], Zhu et al. [2018], Wang [2016], Antholzer et al. [2017], Han et al. [2016], Zhang et al. [2016]; and iterative energy-minimization methods, with either the regularizer being a neural network Li et al. [2018], or neural networks replacing various iteration components such as gradients, projectors, or proximal mappings Kelly et al. [2017], Adler and Öktem [2017b, a], Rick Chang et al. [2017]. These are further related to the notion of plug-and-play regularization Tikhonov A.N. [2013], Chambolle [2004], Mallat [1999], as well as to early uses of neural nets to unroll and adapt standard sparse reconstruction algorithms Gregor and LeCun [2010], Xin et al. [2016]. An advantage of the first group of methods is that they are fast; an advantage of the second group is that they are better at enforcing data consistency.

A rather different take was proposed by Bora et al. [2017, 2018] in the context of compressed sensing, where the reconstruction is constrained to lie in the range manifold of a pretrained generative network. Their scheme achieves impressive results on compressed sensing and comes with theoretical guarantees. However, training generative networks requires many examples of ground truth, and the method is inherently subject to dataset bias.

Our work is further related to sketching Gribonval et al. [2017], Pilanci and Wainwright [2016], where the learning problem is also simplified by random low-dimensional projections of some object, either the data or the unknown reconstruction itself Yurtsever et al. [2017]. This also exposes natural connections with learning via random features Ali Rahimi [2008, 2009]. Estimating projections and asking for consistency across the various subspaces can also be considered a variant of multi-task learning Zhang et al. [2014], Collobert and Weston [2008], Seltzer and Droppo [2013].

3 Lipschitz Stability of Inverse Problems

In inverse problems, one is concerned with whether the data $y$ determines the model $x$, and whether it does so stably. We assume that $F$ is continuous and locally Fréchet differentiable. One way to analyze the uniqueness and stability of an inverse problem is to couple them to a construction of a local solution based on the Landweber iteration Landweber [1951]. The radius of convergence is then a quantitative measure of well-posedness.

Let $B_\rho(x_0) \subset \mathcal{X}$ denote a closed ball centered at $x_0$ with radius $\rho$. We let $x^\dagger$ generate the data $y$, that is,

$y = F(x^\dagger)$,    (2)

and we assume that $x^\dagger \in B_\rho(x_0)$.

Assumption 1.

Let $F$ denote the operator modelling the data. Then

  1. The Fréchet derivative $F'$ of $F$ is Lipschitz continuous locally in $B_\rho(x_0)$,

    $\| F'(x_1) - F'(x_2) \| \le L \, \| x_1 - x_2 \|$ for all $x_1, x_2 \in B_\rho(x_0)$.    (3)

    Moreover, $F'$ is uniformly bounded on $B_\rho(x_0)$,

    $\| F'(x) \| \le \Lambda$ for all $x \in B_\rho(x_0)$.    (4)
  2. $F$ is weakly sequentially closed, that is, if $x_n \rightharpoonup x$ and $F(x_n) \rightharpoonup y$, then $F(x) = y$.

  3. The inversion has uniform Lipschitz stability, that is, there exists a constant $C > 0$ such that

    $\| x_1 - x_2 \| \le C \, \| F(x_1) - F(x_2) \|$ for all $x_1, x_2 \in B_\rho(x_0)$.    (5)

Then the inversion is stable in the sense that the Landweber iteration has a finite radius of convergence, say $\rho_0$, whose expression in terms of the constants $L$, $\Lambda$, and $C$ is given in de Hoop et al. [2012].

With a twice Fréchet-differentiable operator $F$, the essence of the constants is more transparent. Let $F''$ stand for the second-order Fréchet derivative of $F$; then the Lipschitz constant of $F'$ is controlled by a bound on $\| F'' \|$. In other words, the relevant constant expresses a curvature-to-gradient condition, which degenerates to zero for linear operators. Thus any bound on it restricts the nonconvexity of the problem.

The Lipschitz stability condition (5) is implied by a lower bound on the Fréchet derivative $F'$. More precisely, if there exists a constant $c > 0$ such that

$\| F'(x) \, h \| \ge c \, \| h \|$ for all $x \in B_\rho(x_0)$ and all $h \in \mathcal{X}$,    (8)

with $\rho$ sufficiently small, then it can be shown that (5) holds for some constant $C$ depending on $c$ and $\rho$. We note that a very similar condition specialized to linear operators plays a central role in Bora et al. [2017].

While in the above discussion the stability estimate is assumed to hold on the entire ball $B_\rho(x_0)$, in realistic problems this estimate typically holds only on a convex, compact subset $K \subset B_\rho(x_0)$. A projected gradient descent can still yield an approximate reconstruction in $K$, with error determined by the distance from the true model $x^\dagger$ to $K$ when $x^\dagger$ lies outside $K$; the expression for the radius of convergence in this case is given in de Hoop et al. [2012].

Unfortunately, (8) fails for ill-posed problems; in fact, it can fail for linear and non-linear problems alike. On the other hand, a Lipschitz stability estimate commonly holds on finite-dimensional subspaces. The growth of the stability constant with the subspace dimension, typically exponential, reflects the ill-posedness of the inverse problem. This motivates the approach developed in this paper, which keeps the subspace dimension relatively small.

3.1 Decomposing Lipschitz Maps by Random Mesh Projections

We begin with a simple randomization argument. Suppose that we wish to reconstruct a high-resolution image $x$ with $N$ pixels. If $N$ is large, then the inverse mapping projected into this $N$-dimensional subspace is Lipschitz, but with a poor constant $L_N$,

$\| x_1 - x_2 \| \le L_N \, \| F(x_1) - F(x_2) \|$.

Consider instead the map from the data to a projection of the model into some $k$-dimensional subspace $S$, where $k \ll N$. Denote the projection by $P_S$ and assume $S$ is chosen uniformly at random. (One way to construct the corresponding projection matrix is as $P_S = W (W^\top W)^{-1} W^\top$, where $W$ is an $N \times k$ matrix with standard iid Gaussian entries.) We want to evaluate the expected Lipschitz constant of the map from $y$ to $P_S x$, noting that it can be written as $P_S \circ F^{-1}$:

$\mathbb{E} \, \| P_S (x_1 - x_2) \| \le \sqrt{\mathbb{E} \, \| P_S (x_1 - x_2) \|^2} \le \sqrt{k / N} \, L_N \, \| F(x_1) - F(x_2) \|$,

where the first inequality is Jensen’s inequality, and the second one follows from $\| x_1 - x_2 \| \le L_N \| F(x_1) - F(x_2) \|$ and the observation that $\mathbb{E} \, \| P_S v \|^2 = (k / N) \, \| v \|^2$ for any fixed $v$. In other words, random projections reduce the Lipschitz constant by a factor of $\sqrt{k / N}$ on average. This simple computation already suggests exponential gains in terms of sample complexity when learning the projected mapping Cooper [1995]. However, the inverse problem theory tells us that a careful choice of subspace family can give exponential improvements in Lipschitz stability. In particular, it is favorable to consider subspaces of piecewise-constant images, with each basis function being the characteristic function of some subset of the domain Beretta et al. [2013].
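The $\sqrt{k/N}$ factor is easy to verify numerically. The following sketch (our own illustration; variable names are not from the paper) draws random $k$-dimensional subspaces via Gaussian matrices, as in the construction above, and compares the average norm of a projected vector with the predicted factor.

```python
import numpy as np

rng = np.random.default_rng(0)
N, k, trials = 1024, 50, 500
v = rng.standard_normal(N)              # a fixed vector playing the role of x1 - x2

ratios = []
for _ in range(trials):
    W = rng.standard_normal((N, k))     # random subspace via a Gaussian matrix
    Q, _ = np.linalg.qr(W)              # orthonormal basis; P_S = Q Q^T
    ratios.append(np.linalg.norm(Q.T @ v) / np.linalg.norm(v))

print(f"mean ||P_S v|| / ||v|| = {np.mean(ratios):.3f}")   # close to sqrt(k/N)
print(f"sqrt(k/N)              = {np.sqrt(k / N):.3f}")
```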

The Case for Delaunay Triangulations.

Motivated by this observation, we use subspaces of piecewise-constant functions over random Delaunay triangle meshes. Delaunay triangulations enjoy a number of desirable learning-theoretic properties. In the context of function learning, it was shown that, given a set of vertices, piecewise linear functions on the Delaunay triangulation achieve the smallest sup-norm error among all triangulations of those vertices Omohundro [1989].

Lipschitz Constant of the Composite Map.

Fix a collection of $k$-dimensional subspaces $\{ S_j \}_{j=1}^J$. Suppose that for each subspace we have an $L_k$-Lipschitz map that takes the data $y$ to an estimate $\hat{a}_j$ of the expansion coefficients of $P_{S_j} x$ in some orthonormal basis for $S_j$, ascribed to the columns of a matrix $B_j$. Let $\mathbf{B} = [B_1, \ldots, B_J]$ and $\hat{\mathbf{a}} = [\hat{a}_1^\top, \ldots, \hat{a}_J^\top]^\top$; then we can estimate $x$ as

$\hat{x} = \arg\min_x \| \mathbf{B}^\top x - \hat{\mathbf{a}} \|_2 = (\mathbf{B}^\top)^\dagger \hat{\mathbf{a}}$.

Denote the mapping from $y$ to $\hat{x}$ by $\Phi$. Then we have the following simple estimate:

$\| \Phi(y_1) - \Phi(y_2) \| \le \frac{\sqrt{J} \, L_k}{\sigma_{\min}(\mathbf{B})} \, \| y_1 - y_2 \|$,

with $\sigma_{\min}(\mathbf{B})$ the smallest (non-zero) singular value of $\mathbf{B}$. We observe empirically that this composite constant grows exponentially with the number of meshes, which is consistent with the theory. However, each individual mesh projection gives “correct” local information which can be used to form the final estimate.
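To get a feel for how the composite constant behaves, the sketch below (our own illustration) stacks an increasing number of random orthonormal bases, a stand-in for the mesh bases, and tracks $\sigma_{\min}$ of the stacked matrix together with the resulting constant; the growth rate with actual mesh bases will differ.

```python
import numpy as np

rng = np.random.default_rng(1)
N, k = 256, 10
blocks = []
for J in range(1, 25):                      # keep J*k <= N so sigma_min stays non-zero
    Q, _ = np.linalg.qr(rng.standard_normal((N, k)))
    blocks.append(Q)                        # orthonormal basis of one random subspace
    B = np.hstack(blocks)                   # N x (J*k) stacked bases
    smin = np.linalg.svd(B, compute_uv=False).min()
    if J % 6 == 0:
        print(f"J={J:2d}  sigma_min(B)={smin:.3f}  sqrt(J)/sigma_min = {np.sqrt(J) / smin:.2f}")
```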

Learning Lipschitz Functions.

It is a standard result in statistical learning theory Cooper [1995] that the number of samples required to learn a $d$-variate $L$-Lipschitz function to a prescribed accuracy $\epsilon$ in the sup-norm is of the order $(L / \epsilon)^d$. While this result is proved for scalar-valued multivariate functions, it is reasonable to expect that the same scaling in $L$ should hold for vector-valued maps if we treat pixelized images as collections of scalar maps. Thus, a reduction of the Lipschitz constant by any factor allows us to work with exponentially fewer samples. Conversely, given a fixed training dataset, we obtain much more accurate estimates.

3.2 Summary of the Proposed Scheme

We decompose a hard learning task into an ensemble of easier problems of estimating projections of the unknown model in random piecewise-constant subspaces. The subspace estimates are then combined to get a higher-resolution reconstruction, as illustrated in Figure 2.

Figure 2: Regularization by random projections: each projector is approximated by a convolutional neural network which maps a non-negative least squares reconstruction of an image to its projection onto a lower-dimensional subspace. We then combine the projections and estimate the original image using total variation (TV) regularization.

Consider a set of $J$ random Delaunay triangulations, and let $\Lambda_j$ be the map from the data $y$ to the projection $P_{S_j} x$ onto the corresponding subspace $S_j$. Instead of learning the hard inverse mapping $y \mapsto x$, with $x$ living in the high-dimensional “pixel” space, we learn an ensemble of simpler mappings $\{ \Lambda_j \}_{j=1}^J$. Each $\Lambda_j$ is approximated by a convolutional neural network $\Lambda_{\theta_j}$ parameterized by a set of weights $\theta_j$. The weights are chosen by minimizing the empirical risk,

$\theta_j^* = \arg\min_{\theta_j} \sum_{i=1}^{n} \| \Lambda_{\theta_j}(y_i) - P_{S_j} x_i \|_2^2$,

where $\{ (x_i, y_i) \}_{i=1}^{n}$ is a set of training models and measurements.

Recall that the columns of $B_j$ form an orthonormal basis for $S_j$. We then compute an estimate of the expansion coefficients of $P_{S_j} x$ in $B_j$ as

$\hat{a}_j = B_j^\top \Lambda_{\theta_j^*}(y)$,

and use those to get a final estimate as

$\hat{x} = \arg\min_x \sum_{j=1}^{J} \| B_j^\top x - \hat{a}_j \|_2^2 + \lambda \, \| x \|_{\mathrm{TV}}$.    (9)

The total variation (TV) seminorm $\| \cdot \|_{\mathrm{TV}}$ is used primarily for visualization purposes. Used directly on the data $y$, TV regularization fails to recover any geometry (Figure 1).
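A minimal sketch of this combination step, under simplifications of our own: the TV seminorm in (9) is replaced by a quadratic finite-difference penalty so that the fusion remains a single sparse least-squares solve, and the function and variable names are illustrative rather than taken from the authors' code.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr

def gradient_operator(n):
    """Sparse first-difference operator on an n x n image (quadratic stand-in for TV)."""
    D1 = sparse.eye(n - 1, n, k=1) - sparse.eye(n - 1, n, k=0)
    I = sparse.eye(n)
    return sparse.vstack([sparse.kron(I, D1),   # differences along rows
                          sparse.kron(D1, I)])  # differences along columns

def combine_projections(B_list, a_hat_list, n, lam=0.1):
    """Solve min_x sum_j ||B_j^T x - a_hat_j||^2 + lam ||D x||^2, a smoothed stand-in for (9)."""
    A = sparse.vstack([sparse.csr_matrix(Bj.T) for Bj in B_list])   # stacked analysis maps
    b = np.concatenate(a_hat_list)                                  # stacked coefficient estimates
    D = gradient_operator(n)
    M = sparse.vstack([A, np.sqrt(lam) * D])
    rhs = np.concatenate([b, np.zeros(D.shape[0])])
    return lsqr(M, rhs)[0].reshape(n, n)
```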

4 Numerical Results and Discussion

4.1 Application: Traveltime Tomography

In this work, we restrict ourselves to linear ill-posed inverse problems with sparse data, $y = A x$ with $A \in \mathbb{R}^{M \times N}$ and $M \ll N$. We discretize the domain into $N$ pixels so that $x \in \mathbb{R}^N$. Concretely, we consider linearized traveltime tomography Hole [1992], Bording et al. [1987], but we note that the method applies to any inverse problem.

In traveltime tomography, we measure wave travel times between sensors as in Figure 3. Travel times depend on the medium property called slowness (the inverse of speed), and the task is to reconstruct the spatial slowness map. In the linearized regime, the problem becomes that of straight-line tomography, with the travel time between sensors $p$ and $q$ modeled as

$t_{pq} = \int_0^1 s\big( (1 - \alpha) \, r_p + \alpha \, r_q \big) \, \| r_q - r_p \| \, \mathrm{d}\alpha$,    (10)

where $s$ is the continuous slowness map and $r_p$, $r_q$ are sensor locations. We use a pixel grid with sensors placed uniformly on an inscribed circle, and corrupt the measurements with zero-mean iid Gaussian noise.
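One way to discretize (10) into a sparse measurement matrix is sketched below; the grid size, sensor count, and simple point-sampling quadrature are our own illustrative choices, not the paper's setup.

```python
import itertools
import numpy as np

def ray_matrix(n=64, n_sensors=16, samples=400):
    """Approximate the line integrals (10) on an n x n pixel grid: one row of A
    per sensor pair, with entries accumulating the path length through each pixel."""
    angles = 2 * np.pi * np.arange(n_sensors) / n_sensors
    sensors = 0.5 + 0.5 * np.stack([np.cos(angles), np.sin(angles)], axis=1)  # inscribed circle
    pairs = list(itertools.combinations(range(n_sensors), 2))
    A = np.zeros((len(pairs), n * n))
    for row, (p, q) in enumerate(pairs):
        rp, rq = sensors[p], sensors[q]
        length = np.linalg.norm(rq - rp)
        alphas = (np.arange(samples) + 0.5) / samples        # sample points along the ray
        pts = rp[None, :] + alphas[:, None] * (rq - rp)[None, :]
        idx = np.clip((pts * n).astype(int), 0, n - 1)       # pixel containing each sample
        np.add.at(A[row], idx[:, 0] * n + idx[:, 1], length / samples)
    return A, sensors

A, sensors = ray_matrix()
print(A.shape)   # (number of sensor pairs, number of pixels): far fewer rows than columns
```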

Figure 3: An illustration of the measurements: on the left we show a sample model, with red crosses indicating sensor locations and dashed blue lines indicating linearized travel paths; on the right we show a reconstruction by non-negative least squares.

4.2 Architectures and Reconstruction

We generate random Delaunay meshes, each with 50 triangles. The corresponding projector matrices compute the average intensity over each triangle to yield a piecewise-constant approximation of $x$. We test two distinct architectures: (i) ProjNet, tasked with estimating the projection onto a single subspace; and (ii) SubNet, tasked with estimating projections onto multiple subspaces.
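One way such a mesh basis and projector could be built with SciPy is sketched below; the vertex count, the pixel-center membership test, and all names are our own illustrative choices.

```python
import numpy as np
from scipy.spatial import Delaunay

def mesh_basis(n=64, n_vertices=30, seed=0):
    """Orthonormal basis of piecewise-constant images over a random Delaunay mesh.

    Each pixel is assigned to the triangle containing its center; the projector
    P = B @ B.T then replaces pixel values by per-triangle averages."""
    rng = np.random.default_rng(seed)
    # random interior vertices plus the domain corners, so the mesh covers [0, 1]^2
    pts = np.vstack([rng.random((n_vertices, 2)),
                     [[0, 0], [0, 1], [1, 0], [1, 1]]])
    tri = Delaunay(pts)
    centers = (np.indices((n, n)).reshape(2, -1).T + 0.5) / n   # pixel centers
    labels = tri.find_simplex(centers)                          # triangle index per pixel
    cols = []
    for t in range(tri.nsimplex):
        ind = (labels == t).astype(float)                       # indicator of triangle t
        if ind.sum() > 0:
            cols.append(ind / np.linalg.norm(ind))              # normalized indicator
    return np.stack(cols, axis=1)                               # N x k, orthonormal columns

B = mesh_basis()
print(B.shape, np.allclose(B.T @ B, np.eye(B.shape[1])))        # disjoint supports => orthonormal
```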

The ProjNet architecture is inspired by the FBPConvNet Jin et al. [2016] and the U-Net Ronneberger et al. [2015], as shown in Figure 4a. Similar to Jin et al. [2016], we do not use the data $y$ directly, as this would require the network to first learn to map back to the image domain; we rather warm-start the reconstruction with a non-negative least squares reconstruction. The network consists of a sequence of downsampling layers followed by upsampling layers, with skip connections He et al. [2016b, a] between the downsampling and upsampling layers. Crucially, we constrain the network output to live in $S_j$ by fixing the last layer of the network to be the projector $P_{S_j}$ (Figure 4a). A similar trick in a different context was proposed in Sønderby et al. [2016].
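A minimal PyTorch sketch of such a projection-constrained output follows; the tiny convolutional trunk below is a placeholder, not the encoder-decoder of Figure 4a, and all layer sizes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyProjNet(nn.Module):
    """Maps an n x n warm-start image to an estimate of its projection onto S_j."""

    def __init__(self, basis, n=64):
        super().__init__()
        self.n = n
        self.trunk = nn.Sequential(              # toy stand-in for the U-Net-style trunk
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
        # fixed projector P = B B^T stored as a buffer, so it is never updated by training
        B = torch.as_tensor(basis, dtype=torch.float32)
        self.register_buffer("P", B @ B.T)

    def forward(self, x):
        out = self.trunk(x).flatten(1)           # (batch, n*n)
        out = out @ self.P                       # constrain the output to the subspace
        return out.view(-1, 1, self.n, self.n)

# usage sketch: with the basis B from the mesh construction above,
# net = TinyProjNet(B, n=64); pred = net(torch.randn(4, 1, 64, 64))
```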

We combine the projection estimates from many ProjNets by the regularized linear least squares (9) to get the reconstructed model (cf. Figure 2), with the regularization parameter determined on five held-out images. A drawback of this approach is that a separate ProjNet must be trained for each subspace. That is the motivation for the SubNet, shown in Figure 4b. Each input to SubNet is the concatenation of a non-negative least squares reconstruction and 50 basis functions, one for each triangle. This approach scales to any number of subspaces, which allows us to get visually smoother reconstructions without any further regularization as in (9). On the other hand, the projections are inexact, which can lead to slightly degraded performance. Both networks are trained using the Adam optimizer Kingma and Ba [2014]. (Code is available at https://github.com/swing-research/deepmesh.)

Figure 4: a) architecture of the ProjNet; b) architecture of the SubNet. In both cases, the input is a non-negative least squares reconstruction and the network is trained to reconstruct a projection into one subspace. In SubNet, the subspace specification is concatenated to the data.

As a quantitative figure of merit we use the signal-to-noise ratio (SNR). The input SNR is defined as $10 \log_{10}(\sigma_x^2 / \sigma_n^2)$, where $\sigma_x^2$ and $\sigma_n^2$ are the signal and noise variances; the output SNR is defined analogously, with the error between the ground truth $x$ and the reconstruction $\hat{x}$ playing the role of the noise.

130 ProjNets are trained with measurements at various SNRs; SubNet is trained with 350 different triangular meshes. We compare the ProjNet and SubNet reconstructions with a baseline convolutional neural network built to directly reconstruct images from their non-negative least squares reconstructions. We pick the best-performing baseline from multiple candidate networks (inspired by Jin et al. [2016]) designed to have a number of trainable parameters comparable to SubNet. We test on patches from the BP2004 model (http://software.seg.org/datasets/2D/2004_BP_Vel_Benchmark/).

Robustness to Corruptions.

Figure 5: a) Reconstructions for different combinations of training and testing input SNR; the output SNR is indicated for each reconstruction. We compare our method against a network which directly reconstructs images from non-negative least squares reconstructions; our method performs better when the training and testing noise levels do not match. b) Reconstructions with independent erasures of the measurements at several erasure probabilities, obtained from networks trained for an input SNR of 10 dB. The direct network cannot produce a reasonable image in any of the cases.

To demonstrate that our subspace regularization gives results that are robust against arbitrary assumptions made at training time, we consider two experiments. First, we corrupt the measured data with the same type of noise as the training data, but at a different SNR. In Figure 5a, we summarize the results with reconstructions of geophysical images by networks (arbitrarily) trained on the LSUN bridges dataset Yu et al. [2015]. In all cases our method achieves better SNRs than the direct reconstruction network. In fact, when trained without noise and tested with a 10 dB input SNR, the direct method is unable to produce a workable result and instead hallucinates structures seen in training. For applications in geophysics it is essential that our method correctly captures the shape of the cavities, unlike the direct inversion, which produces sharp but wrong geometries (see annotations in the figure).

Second, we consider a different corruption in which traveltime measurements are erased (set to zero) independently with a given probability, and use networks trained with a 10 dB input SNR to reconstruct. Figure 5b summarizes our findings. Unlike with Gaussian noise (Figure 5a), the direct method completely fails to recover the coarse geometry in all test cases.

Robustness Against Dataset Overfitting.

Finally, in Figure 6 we show that the training dataset has only a marginal influence on the reconstructions, a desirable property in applications where real ground truth is unavailable. Training with LSUN Yu et al. [2015], CelebA Liu et al. [2015], and a synthetic dataset of random overlapping shapes (as in Figure 1) gives comparable reconstructions.

Figure 6: Reconstructions by networks trained on non-negative least squares reconstructions of different datasets (LSUN, CelebA, and Shapes). All measurements in the training data had an SNR of 10 dB.

We complement the above results with reconstructions of other types of images in Figure 7: checkerboards and X-ray images of metal castings Mery et al. [2015]. We note that in addition to better SNR, our method produces more accurate geometry estimates, per annotations in the figure.

Figure 7: Reconstructions (with SNR) of checkerboard and metal-casting X-ray images. Inset A: the direct method misses the corners; inset B: it misses the parallel-line structure, while our method captures both. Overall, our method provides better SNR values (reported in the bottom left of each image).

5 Conclusion

We proposed a new way to solve ill-posed inverse problems based on decomposing a complex mapping that is hard to learn into a collection of simpler mappings. These simpler mappings correspond to reconstructions in low-dimensional subspaces of images piecewise-constant on Delaunay triangular meshes. Numerical experiments show that our method consistently produces better reconstructions than a method trained to do the inversion directly, both in terms of output SNR and, more importantly, in producing correct geometric features. When the data is corrupted in ways not seen at training time, our method still produces good results while the direct inversion breaks down altogether.

A simple intuitive argument explains this behavior. Instead of learning to estimate pixel values directly, we learn to estimate local averages of pixel values. Estimating averages is a much simpler task, since averages are considerably more stable than individual pixels. This statement can be made precise for many inverse problems in terms of Lipschitz stability estimates. The key is that estimating averages over triangles lets us robustly convert global traveltime measurements into local information. This has important consequences: robustness against overfitting the dataset, robustness against various unseen corruptions, and the ability to recover correct global geometric information without hallucinating sharp but wrong structures.

Acknowledgement

This work utilizes resources supported by the National Science Foundation’s Major Research Instrumentation program, grant #1725729, as well as the University of Illinois at Urbana-Champaign.

References

  • Adler and Öktem [2017a] Jonas Adler and Ozan Öktem. Solving ill-posed inverse problems using iterative deep neural networks. arXiv preprint arXiv:1704.04058v2, April 2017a.
  • Adler and Öktem [2017b] Jonas Adler and Ozan Öktem. Learned Primal-dual Reconstruction. arXiv preprint arXiv:1707.06474v1, July 2017b.
  • Ali Rahimi [2008] Ali Rahimi and Benjamin Recht. Random features for large-scale kernel machines. Advances in Neural Information Processing Systems (NIPS), 2008.
  • Ali Rahimi [2009] Ali Rahimi and Benjamin Recht. Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning. Advances in Neural Information Processing Systems (NIPS), pages 1313–1320, 2009.
  • Antholzer et al. [2017] Stephan Antholzer, Markus Haltmeier, and Johannes Schwab. Deep Learning for Photoacoustic Tomography from Sparse Data. arXiv preprint arXiv:1704.04587v2, April 2017.
  • Araya-Polo et al. [2017] Mauricio Araya-Polo, Joseph Jennings, Amir Adler, and Taylor Dahlke. Deep-learning tomography. The Leading Edge, December 2017.
  • Beretta et al. [2013] Elena Beretta, Maarten V de Hoop, and Lingyun Qiu. Lipschitz Stability of an Inverse Boundary Value Problem for a Schrödinger-Type Equation. SIAM J. Math. Anal., 45(2):679–699, March 2013.
  • Bianco and Gerstoft [2017] Michael Bianco and Peter Gerstoft. Sparse travel time tomography with adaptive dictionaries. arXiv preprint arXiv:1712.08655, 2017.
  • Bora et al. [2017] Ashish Bora, Ajil Jalal, Eric Price, and Alexandros G Dimakis. Compressed sensing using generative models. arXiv preprint arXiv:1703.03208, 2017.
  • Bora et al. [2018] Ashish Bora, Eric Price, and Alexandros G Dimakis. Ambientgan: Generative models from lossy measurements. In International Conference on Learning Representations (ICLR), 2018.
  • Bording et al. [1987] R Phillip Bording, Adam Gersztenkorn, Larry R Lines, John A Scales, and Sven Treitel. Applications of seismic travel-time tomography. Geophysical Journal International, 90(2):285–303, 1987.
  • Chambolle [2004] Antonin Chambolle. An algorithm for total variation minimization and applications. Journal of Mathematical Imaging and Vision, 20(1-2):89–97, 2004.
  • Chen et al. [2017] Hu Chen, Yi Zhang, Weihua Zhang, Peixi Liao, Ke Li, Jiliu Zhou, and Ge Wang. Low-dose CT denoising with convolutional neural network. In 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), pages 143–146. IEEE, 2017.
  • Collobert and Weston [2008] Ronan Collobert and Jason Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, pages 160–167. ACM, 2008.
  • Cooper [1995] Duane A Cooper. Learning lipschitz functions. International Journal of Computer Mathematics, 59(1-2):15–26, 1995.
  • de Hoop et al. [2012] Maarten V de Hoop, Lingyun Qiu, and Otmar Scherzer. Local analysis of inverse problems: Hölder stability and iterative reconstruction. Inverse Problems, 28(4):045001, April 2012.
  • Galetti et al. [2017] Erica Galetti, Andrew Curtis, Brian Baptie, David Jenkins, and Heather Nicolson. Transdimensional Love-wave tomography of the British Isles and shear-velocity structure of the East Irish Sea Basin from ambient-noise interferometry. Geophys. J. Int., 208(1):36–58, January 2017.
  • Greenspan et al. [2016] Hayit Greenspan, Bram van Ginneken, and Ronald M Summers. Deep Learning in Medical Imaging: Overview and Future Promise of an Exciting New Technique. IEEE Trans. Med. Imag., 35(5):1153–1159, may 2016.
  • Gregor and LeCun [2010] Karol Gregor and Yann LeCun. Learning fast approximations of sparse coding. In Proceedings of the 27th International Conference on International Conference on Machine Learning, pages 399–406. Omnipress, 2010.
  • Gribonval et al. [2017] Rémi Gribonval, Gilles Blanchard, Nicolas Keriven, and Yann Traonmilin. Compressive statistical learning with random feature moments. arXiv preprint arXiv:1706.07180, 2017.
  • Güler and Übeylı [2005] İnan Güler and Elif Derya Übeylı. ECG beat classifier designed by combined neural network model. Pattern Recognition, 38(2):199–208, 2005.
  • Han et al. [2016] Yo Seob Han, Jaejun Yoo, and Jong Chul Ye. Deep Residual Learning for Compressed Sensing CT Reconstruction via Persistent Homology Analysis. arXiv preprint arXiv:1611.06391, November 2016.
  • He et al. [2016a] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. In European Conference on Computer Vision, pages 630–645. Springer, 2016a.
  • He et al. [2016b] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the Conference on Computer Vision and Pattern Recognition, pages 770–778. IEEE, 2016b.
  • Hole [1992] John Hole. Nonlinear high-resolution three-dimensional seismic travel time tomography. Journal of Geophysical Research: Solid Earth, 97(B5):6553–6562, 1992.
  • Hoole [1993] S R H Hoole. Artificial neural networks in the solution of inverse electromagnetic field problems. IEEE Trans. Magn., 29(2):1931–1934, March 1993.
  • Hudson and Cohen [2000] Donna L Hudson and Maurice E Cohen. Neural networks and artificial intelligence for biomedical engineering. Wiley Online Library, 2000.
  • Jin et al. [2016] Kyong Hwan Jin, Michael T McCann, Emmanuel Froustey, and Michael Unser. Deep Convolutional Neural Network for Inverse Problems in Imaging. arXiv preprint arXiv:1611.03679v1, November 2016.
  • Kelly et al. [2017] Brendan Kelly, Thomas P Matthews, and Mark A Anastasio. Deep Learning-Guided Image Reconstruction from Incomplete Data. arXiv preprint arXiv:1709.00584, September 2017.
  • Kingma and Ba [2014] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • Landweber [1951] Louis Landweber. An iteration formula for fredholm integral equations of the first kind. American Journal of Mathematics, 73(3):615–624, 1951.
  • Lewis et al. [2017] Winston Lewis, Denes Vigh, et al. Deep learning prior models from seismic images for full-waveform inversion. In SEG International Exposition and Annual Meeting. Society of Exploration Geophysicists, 2017.
  • Li et al. [2018] Housen Li, Johannes Schwab, Stephan Antholzer, and Markus Haltmeier. NETT: Solving Inverse Problems with Deep Neural Networks. arXiv preprint arXiv:1803.00092v1, February 2018.
  • Liu et al. [2015] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), December 2015.
  • Lucas et al. [2018] Alice Lucas, Michael Iliadis, Rafael Molina, and Aggelos K Katsaggelos. Using Deep Neural Networks for Inverse Problems in Imaging: Beyond Analytical Methods. IEEE Signal Process. Mag., 35(1):20–36, 2018.
  • Mallat [1999] Stéphane Mallat. A wavelet tour of signal processing. Academic Press, 1999.
  • Mandache [2001] Niculae Mandache. Exponential instability in an inverse problem for the Schrodinger equation. Inverse Problems, 17(5):1435–1444, October 2001.
  • McCann et al. [2017] Michael T McCann, Kyong Hwan Jin, and Michael Unser. Convolutional neural networks for inverse problems in imaging: A review. IEEE Signal Process. Mag., 34(6):85–95, 2017.
  • Mery et al. [2015] Domingo Mery, Vladimir Riffo, Uwe Zscherpel, Germán Mondragón, Iván Lillo, Irene Zuccar, Hans Lobel, and Miguel Carrasco. GDXray: The Database of X-ray Images for Nondestructive Testing. Journal of Nondestructive Evaluation, 34, 11 2015.
  • Ogawa et al. [1998] Takehiko Ogawa, Yukio Kosugi, and Hajime Kanada. Neural network based solution to inverse problems. In Neural Networks Proceedings, 1998. IEEE World Congress on Computational Intelligence. The 1998 IEEE International Joint Conference on, volume 3, pages 2471–2476. IEEE, 1998.
  • Omohundro [1989] S M Omohundro. The Delaunay triangulation and function learning, 1989.
  • Pelt and Batenburg [2013] Daniel Maria Pelt and Kees Joost Batenburg. Fast tomographic reconstruction from limited data using artificial neural networks. IEEE Trans. on Image Process., 22(12):5238–5251, 2013.
  • Pilanci and Wainwright [2016] Mert Pilanci and Martin J. Wainwright. Iterative Hessian sketch: Fast and accurate solution approximation for constrained least-squares. Journal of Machine Learning Research, 2016.
  • Porikli et al. [2017] Fatih Porikli, Shiguang Shan, Cees Snoek, Rahul Sukthankar, and Xiaogang Wang. Deep Learning for Visual Understanding [From the Guest Editors]. IEEE Signal Process. Mag., 34(6):24–25, nov 2017.
  • Porikli et al. [2018] Fatih Porikli, Shiguang Shan, Cees Snoek, Rahul Sukthankar, and Xiaogang Wang. Deep Learning for Visual Understanding: Part 2 [From the Guest Editors]. IEEE Signal Process. Mag., 35(1):17–19, jan 2018.
  • Rick Chang et al. [2017] JH Rick Chang, Chun-Liang Li, Barnabas Poczos, BVK Vijaya Kumar, and Aswin C Sankaranarayanan. One Network to Solve Them All–Solving Linear Inverse Problems Using Deep Projection Models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5888–5897, 2017.
  • Ronneberger et al. [2015] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
  • Schiller and Doerffer [2010] Helmut Schiller and Roland Doerffer. Neural network for emulation of an inverse model operational derivation of Case II water properties from MERIS data. International Journal of Remote Sensing, November 2010.
  • Seltzer and Droppo [2013] Michael L Seltzer and Jasha Droppo. Multi-task learning in deep neural networks for improved phoneme recognition. In International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pages 6965–6969. IEEE, 2013.
  • Sønderby et al. [2016] Casper Kaae Sønderby, Jose Caballero, Lucas Theis, Wenzhe Shi, and Ferenc Huszár. Amortised map inference for image super-resolution. arXiv preprint arXiv:1610.04490, 2016.
  • Tikhonov A.N. [2013] A.N. Tikhonov, A.V. Goncharsky, V.V. Stepanov, and A.G. Yagola. Numerical methods for the solution of ill-posed problems, volume 328. Springer Science & Business Media, 2013.
  • Wang [2016] Ge Wang. A perspective on deep imaging. IEEE Access, 4:8914–8924, 2016.
  • Xin et al. [2016] Bo Xin, Yizhou Wang, Wen Gao, David Wipf, and Baoyuan Wang. Maximal sparsity with deep networks? In Advances in Neural Information Processing Systems, pages 4340–4348, 2016.
  • Yu et al. [2015] Fisher Yu, Yinda Zhang, Shuran Song, Ari Seff, and Jianxiong Xiao. LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop. arXiv preprint arXiv:1506.03365, 2015.
  • Yurtsever et al. [2017] Alp Yurtsever, Madeleine Udell, Joel A Tropp, and Volkan Cevher. Sketchy decisions: Convex low-rank matrix optimization with optimal storage. arXiv preprint arXiv:1702.06838, 2017.
  • Zhang et al. [2016] Hanming Zhang, Liang Li, Kai Qiao, Linyuan Wang, Bin Yan, Lei Li, and Guoen Hu. Image Prediction for Limited-angle Tomography via Deep Learning with Convolutional Neural Network. arXiv preprint arXiv:1607.08707v1, July 2016.
  • Zhang et al. [2014] Zhanpeng Zhang, Ping Luo, Chen Change Loy, and Xiaoou Tang. Facial landmark detection by deep multi-task learning. In European Conference on Computer Vision, pages 94–108. Springer, 2014.
  • Zhu et al. [2018] Bo Zhu, Jeremiah Z Liu, Stephen F Cauley, Bruce R Rosen, and Matthew S Rosen. Image reconstruction by domain-transform manifold learning. Nature, 555(7697):487, March 2018.