# Dynamic MRI Reconstruction from Undersampled Data with an Anatomical Prescan

Julian Rasch¹, Ville Kolehmainen², Riikka Nivajärvi³, Mikko Kettunen³, Olli Gröhn³, Martin Burger¹ and Eva-Maria Brinkmann¹

¹ Applied Mathematics Münster: Institute for Analysis and Computational Mathematics, Westfälische Wilhelms-Universität (WWU) Münster, Einsteinstr. 62, 48149 Münster, Germany. e-mail: julian.rasch@wwu.de
² Department of Applied Physics, University of Eastern Finland, POB 1627, 70211 Kuopio, Finland
³ Kuopio Biomedical Imaging Unit, A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, POB 1627, 70211 Kuopio, Finland
###### Abstract

The goal of dynamic magnetic resonance imaging (dynamic MRI) is to visualize tissue properties and their local changes over time that are traceable in the MR signal. We propose a new variational approach for the reconstruction of subsampled dynamic MR data, which combines smooth, temporal regularization with spatial total variation regularization. In particular, it uses the infimal convolution of two total variation Bregman distances to incorporate structural a-priori information from an anatomical MRI prescan into the reconstruction of the dynamic image sequence. The method promotes a high structural similarity between the reconstructed image sequence and the anatomical prior, while still allowing for local intensity changes which are smooth in time. The approach is evaluated using artificial data simulating functional magnetic resonance imaging (fMRI), and experimental dynamic contrast-enhanced magnetic resonance data from small animal imaging using radial golden angle sampling of the k-space.

Keywords: Dynamic magnetic resonance imaging, spatio-temporal regularization, structural prior, infimal convolution of Bregman distances, total variation, golden angle subsampling

## 1 Introduction

Since its invention in the 1970s, magnetic resonance imaging (MRI) has developed into a very versatile non-invasive medical imaging technique, which can provide information about a rich variety of tissue properties with a high spatial resolution and good soft tissue contrast. Initially, however, this high image quality came along with long acquisition times, rendering the imaging of rapid physiological changes impossible. Hence, since the early days of MRI, the speed-up of the data acquisition process has been a major direction of research. In the meantime, significant progress in hardware and the development of fast imaging protocols have paved the way for modern dynamic imaging techniques such as functional MRI (fMRI) and dynamic contrast-enhanced MRI (DCE-MRI).

In this paper, we propose a novel variational approach for the reconstruction of highly subsampled dynamic MR data. It combines temporal smoothing with spatial total variation (TV) regularization and uses an $\mathrm{ICB}_{\mathrm{TV}}$ regularization functional, i.e. the infimal convolution of two TV Bregman distances, to incorporate structural prior information from an anatomical MR scan acquired prior to the dynamic sequence. In the subsequent paragraphs, we briefly introduce the concepts of fMRI and DCE-MRI and afterwards explain the idea of our approach in more detail.

Functional magnetic resonance imaging
The first fMRI images depicting physiological function inside the brain were presented in the early 1990s [33, 34]. Since then, fMRI has developed into a widely used imaging technique in basic brain research and serves as a tool for the identification and characterization of brain diseases such as Alzheimer’s disease. The fMRI technique relies on the idea that neuronal activity leads to an increase in metabolism and hence to a local change in the cerebral blood flow that can be detected in $T_2^*$-weighted MR signals. Consequently, during an fMRI experiment a series of fast $T_2^*$-weighted MR scans is taken over time to obtain spatial information about the hemodynamic changes in the brain. For a general reference on fMRI, see for example [22].

Dynamic contrast-enhanced magnetic resonance imaging
The technique of DCE-MRI also dates back to the early 1990s [42]. In contrast to fMRI, this imaging method is based on the injection of an exogenous bolus of gadolinium-based contrast agent. Aiming at the detection of cerebral ischemia, brain tumors or infections, it relies on the fact that a characteristic property of normal and healthy tissue in the brain is an intact blood-brain barrier. This blood-brain barrier prevents the contrast agent from passing the cell membranes and hence causes the contrast agent to quickly re-enter the vessels instead of accumulating in the tissue. However, in tissue that has been damaged by a tumor or an infection the blood-brain barrier is disrupted, and hence the contrast agent accumulates. In order to detect this change of contrast, a time series of fast MR scans is acquired. The scans can be based on either a short or long echo time depending on the application. While imaging of tumors is usually based on a short echo time, aiming at good $T_1$ contrast, brain ischemia studies usually use $T_2^*$-weighted imaging with short echo times for the detection of the fast perfusion related changes. For further details on DCE-MRI, see for example the review papers [37, 10, 41].

Compressed sensing in magnetic resonance imaging
Since the strength of dynamic MRI methods lies in their ability to measure spatial contrast or physiological changes with rapidly switching dynamics, a critical point is to guarantee a sufficiently high temporal resolution in the measured data. In this context, the evolution of the theory of compressed sensing (CS) and its application to MRI, e.g. [7, 12, 27, 21], allowed for a significant acceleration of the data acquisition process. Building on the observation that images are likely to have a sparse representation in some transform domain, it led to a boost in the development of new mathematical reconstruction techniques. These are capable of producing reconstructions of comparable quality from a substantially reduced amount of sampled k-space coefficients, provided that the remaining coefficients are chosen in a suitable way. In this context, the subsampling scheme is regarded as suitable if the resulting aliasing artifacts in the images obtained by linear reconstruction methods are incoherent, or in other words noise-like, in the respective transform domain. Appropriate nonlinear reconstruction methods promoting sparsity in the particular transform domain and consistency with the acquired data can then facilitate high quality reconstructions despite the comparably small amount of measured data. In the following, we will therefore assume that at each timestep of the respective dynamic MRI acquisition process only a very sparse subset of the entire k-space is sampled.

The proposed approach
Gathering the ideas of fMRI, DCE-MRI and compressed sensing described above, we introduce the three building blocks of our variational model for the reconstruction of dynamic MR data.

The first is an appropriate data fidelity term (cf. Sections 2.1 and 2.2) combined with a (spatial) total variation (TV) regularization [39] to ensure that the reconstructed images to a certain degree exhibit sparsity in the gradient domain (cf. Section 2.3.1). Provided a sufficiently high temporal resolution, we moreover expect the intensity values to change only in a small part of the entire image domain between two consecutive timesteps (since only a small area is activated). Note that this is a common assumption for dynamic MRI applications, especially when motion is negligible, see for example [17, 16, 47]. Hence, we also suppose a temporal redundancy of the dynamic data that we want to derive benefit from. Against this background, it seems reasonable to sample the k-space for successive timesteps in a complementary manner, meaning that the subsampling scheme should not be identical among timesteps that are temporally close to each other, but rather such that it adds to the information gained at neighboring points in time. One such sampling scheme, which is popular in DCE-MRI, is the radial golden angle sampling, where the k-space is sampled continuously along radial spokes with a uniform angular stepping (the golden angle, approximately 111.25°). It leads to a rather regular spatial coverage of the k-space over time with high temporal incoherence and offers great flexibility with respect to the selection of the time resolution in the sense that one can afterwards choose how many radial spokes per frame are used in the reconstruction of the data [46, 16]. The second building block is hence a temporal regularization (cf. Section 2.4), which couples the sampled data across the dynamic sequence such that the information acquired at neighboring time points can contribute to the reconstruction of the respective frame.
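As a small illustration (our own, not part of the paper's pipeline), the golden-angle scheme can be sketched in a few lines: each new spoke is rotated by roughly 111.25°, and for Fibonacci numbers of spokes the resulting spoke angles cover the half circle almost uniformly, even though the spokes are acquired strictly sequentially.

```python
import numpy as np

# Golden-angle increment: 180 degrees divided by the golden ratio, ~111.246 degrees.
golden_angle = 180.0 / ((1 + np.sqrt(5)) / 2)

n_spokes = 34  # a Fibonacci number gives a particularly even angular coverage
angles = np.mod(np.arange(n_spokes) * golden_angle, 180.0)

# Gaps between the sorted spoke angles stay close to the uniform spacing 180/n,
# so any consecutive block of spokes covers k-space fairly evenly.
gaps = np.diff(np.sort(angles))
assert gaps.max() < 2 * 180.0 / n_spokes
```

By the three-distance theorem, the sorted gaps take at most three distinct values for any number of spokes, which is what makes the retrospective choice of spokes per frame so forgiving.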

Finally, we seek to take advantage of the specific experimental setups: in both fMRI and DCE-MRI, the experimental protocol includes an anatomical high-resolution MRI scan, which is carried out prior to the acquisition of the actual dynamic MRI data. While this data contains high resolution anatomical information of the target, it is mostly used for visualization purposes only. One of the goals of our approach is to utilize the structural information contained in the anatomical image as prior information in the reconstruction of the dynamic image sequence. In order to increase quality and hence obtain a higher spatial resolution in the reconstruction of the dynamic MRI data series, we incorporate the edge information of the anatomical high-resolution scan, encoded in a subgradient of the total variation, into the reconstruction framework (cf. Section 2.3.2). Mathematically, this is realized by the $\mathrm{ICB}_{\mathrm{TV}}$ regularization, which was first introduced for color image processing in [30] and discussed in more detail and applied to PET-MRI joint reconstruction in [38].

Related work
Our approach certainly stands in the tradition of other CS-related variational models for the reconstruction of dynamic MRI data that combine suitable undersampling strategies with spatio-temporal regularization [28, 3, 23, 43, 36, 48, 17, 16, 47]. Note that the list of related work given here is only a small excerpt of the methods proposed in the literature. We believe that the method of Adluru and coworkers [3] is closest to our new approach. However, to the best of our knowledge, no method has yet been proposed which combines the incorporation of high-resolution structural a-priori information from the anatomical prescan with spatio-temporal regularization, allowing simultaneously for a surprisingly high temporal and spatial resolution.

Organization of the paper
The remainder of the paper is organized as follows: In Section 2 we describe the mathematical forward model for MRI and introduce the proposed reconstruction method. In Section 3, we first comment on the implementation of all operators and then explain how to solve the minimization problem by a first-order primal-dual optimization method. The numerical results are given in Section 4. First we evaluate the method using a test case based on fMRI with simulated measurement data, then the approach is evaluated using experimental small animal DCE-MRI data from a glioma model rat specimen, where in both cases the data results from golden angle radial sampling. We complete the paper with a brief conclusion and give an outlook on future directions of research in this area.

## 2 The proposed reconstruction framework

In this section we translate our novel reconstruction approach, whose basic modules we motivated above, into a mathematical setting. To this end, we specify the modeling of the required operators and all terms contained. We start with the static MRI forward problem suitable for modeling any arbitrary MR data acquisition protocol.

### 2.1 Undersampled MRI: A mathematical model

While the forward model in MRI can become arbitrarily difficult if modeling the influence of all the underlying physical and technical parameters, it is common practice to assume it to be a simple Fourier transform and hide all the unknowns. Following this line, we define the Fourier transform of the MR image $u$ at $k$-space coordinate $\xi$ by

$$
(Ku)(\xi)=\int_{\Omega}u(x)\,e^{-ix\cdot\xi}\,dx, \tag{2.1}
$$

where $\Omega\subset\mathbb{R}^d$ is a bounded image domain (usually $d=2$). The associated inverse problem is then to reconstruct $u$ from noisy measurements $f$ obtained by

$$
Ku+e=f, \tag{2.2}
$$

where $e$ models the (complex-valued) noise.

A strong point of MRI is that it allows for a great variety of different data acquisition procedures, each of which leads to a very distinct appearance of the image $u$, which does not become obvious from the above formula (2.1), since, as mentioned before, all the parameters characterizing the specific acquisition scheme are hidden. In particular, different scanning protocols produce different types of image contrast. For example, depending on the respective relaxivities, the same tissue can appear very bright in $T_1$ contrast while being very dark in $T_2$ contrast, and some details might only be visible at a specific setup of the acquisition scheme. However, since the relaxivity varies only between different types of tissue while being constant across the same tissue, it is very likely for images of different contrast to nevertheless share the same structures (with the exception of features visible in only one of the contrasts).

In practice, one has access to only a small portion of Fourier coefficients $f_m=(Ku)(\xi_m)$, $m=1,\dots,M$, such that the system (2.2) becomes highly underdetermined. As a remedy, one also discretizes the image domain and samples the desired reconstruction at the respective locations. More precisely, letting $\Omega_N$ be a discretization of the image domain into $N$ pixels, we define the discrete Fourier transform of $u$ by

$$
(K_N u)(\xi_m)=\sum_{x_n\in\Omega_N}u(x_n)\,e^{-ix_n\cdot\xi_m} \tag{2.3}
$$

for all $k$-space coordinates $\xi_m$, $m=1,\dots,M$. Here $u(x_n)$ denotes a sampling of $u$ at location $x_n$. It is worth noticing that the $k$-space coordinates can be chosen arbitrarily and are in particular independent of the choice of discretization of the image domain. Hence, from a mathematical perspective, the resolution $N$ of the reconstruction can (theoretically) be chosen arbitrarily at the price of a nontrivial nullspace, and in particular does not depend on the number of sampled Fourier coefficients. The standard way in a Cartesian setting is to choose $M=N$, i.e. to sample as many Fourier coefficients as there are pixels in the desired image, since in this case there holds a one-to-one relationship between an image and its Fourier transform. Vice versa, having measured $M$ Fourier coefficients on a Cartesian grid, the canonical resolution of $u$ is $N=M$ pixels. This gives the notion of the standard Cartesian Fourier transform, which can simply be inverted (cf. Section 3.1).
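To make the discretization concrete, here is a minimal numpy sketch (our own illustration, not the paper's implementation) of the direct evaluation of (2.3) at arbitrary k-space coordinates; choosing the Cartesian frequencies recovers the standard DFT, reflecting the one-to-one relationship mentioned above.

```python
import numpy as np

def K_N(u, xs, xis):
    """Direct evaluation of (K_N u)(xi_m) = sum_n u(x_n) exp(-i x_n . xi_m).

    u:   (N,) complex pixel values sampled at locations xs: (N, 2)
    xis: (M, 2) k-space coordinates, which need not lie on a Cartesian grid
    """
    return np.exp(-1j * (xis @ xs.T)) @ u

n = 8
rng = np.random.default_rng(0)
u = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# pixel locations on an integer grid
ii, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
xs = np.stack([ii.ravel(), jj.ravel()], axis=1).astype(float)

# Cartesian choice xi_m = 2*pi*m/n: K_N coincides with the standard DFT
xis = 2 * np.pi * xs / n
f = K_N(u.ravel(), xs, xis).reshape(n, n)
assert np.allclose(f, np.fft.fft2(u))
```

For non-Cartesian `xis` (e.g. radial spokes) the same direct sum applies, but it is what the NUFFT of Section 3.1 approximates efficiently.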

However, driven in particular by developments in compressed sensing (cf. [27, 28]), it is common practice to speed up the data acquisition by undersampling the $k$-space, i.e. by performing significantly fewer measurements than there are desired image pixels. In particular, the measurements are usually not sampled on a Cartesian grid (cf. Section 2.2). In this case there is neither the possibility to simply “invert” the operator, nor a “canonical” way of choosing the resolution $N$. Instead, one can try to compensate for the missing data using additional a-priori information (regularization) and choose the resolution as high as the respective method allows. The quality of the reconstruction then depends on the location of the measured Fourier coefficients and on the ratio $M/N$. As we show in the numerical section, this ratio can actually be less than one per cent (as opposed to 100 per cent for $M=N$), choosing a golden angle sampling and an appropriate reconstruction method (see also Section 4).

Summing up the discrete setting, we aim to solve the inverse problem

$$
K_N u+e=f, \tag{2.4}
$$

with $u\in\mathbb{C}^N$, $f\in\mathbb{C}^M$ and random measurement errors $e\in\mathbb{C}^M$. Since the noise in MRI is commonly approximated as Gaussian (and more advanced noise models such as Rician are beyond the scope of the present study), the canonical related minimization problem reads

$$
u\in\operatorname*{arg\,min}_{u\in\mathbb{C}^N}\ \frac{\alpha}{2}\|K_N u-f\|^2_{\mathbb{C}^M}+J(u).
$$

The regularizer $J$ is yet to be defined. The goal is to construct it such that it promotes desirable properties (such as smoothness) of the sought-after solution.

### 2.2 Dynamic MRI: model extension

The scanning protocol we consider throughout this paper involves a (densely sampled) prior scan, followed by a (subsampled) dynamic scan. In order to cover the dynamic scan we extend the above idea along the same lines to dynamic MR imaging.

Let us first comment on the characteristics of the dynamic data: Generally, there are many eligible options in how to choose the specific scanning protocol and (sub-)sampling pattern in practice. In this work, we consider radial samplings, that is, the data is collected on radial spokes through the $k$-space center. Note that the basic idea does not change for different types of samplings. The interesting part about dynamic MRI is that the MR scanner collects the data almost continuously in time, meaning that it measures the radial spokes in $k$-space sequentially one after the other. This leaves us free to determine the time resolution of the recorded sequence only during the reconstruction process. More precisely, for the reconstruction we divide the entire collection of spokes into an arbitrary number $T$ of sets of (consecutive) spokes. For example, if the scanner recorded 100 spokes of data, we can divide the data set into 100 sets of 1 spoke each, or 10 sets of 10 spokes each, etc., leading to a very high time resolution (1 spoke/frame) or to a more moderate time resolution (10 spokes/frame), respectively. In choosing $T$, we naturally always have to face the trade-off between a high temporal resolution and a sufficient amount of data per frame available for a meaningful reconstruction to build upon. In the following, we shall always assume that we have already divided the dynamic data set into $T$ parts.
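The re-binning step described above can be sketched as follows (an illustration with the hypothetical numbers from the text, not the authors' code): the same stream of sequentially acquired spokes is partitioned into frames only at reconstruction time.

```python
def bin_spokes(n_spokes, spokes_per_frame):
    # partition sequentially acquired spoke indices into T consecutive sets
    T = n_spokes // spokes_per_frame
    return [list(range(t * spokes_per_frame, (t + 1) * spokes_per_frame))
            for t in range(T)]

fine = bin_spokes(100, 1)     # 100 frames: highest temporal resolution
coarse = bin_spokes(100, 10)  # 10 frames: more data per frame
assert len(fine) == 100 and len(coarse) == 10
assert coarse[0] == list(range(10))
```

The raw data never changes; only the grouping into frames does, which is exactly the flexibility of the golden-angle acquisition.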

In this setting, we then denote the anatomical prior image by $u_0\in\mathbb{C}^{N_0}$ with the corresponding densely sampled $k$-space data $f_0\in\mathbb{C}^{M_0}$, and write $f=(f_1,\dots,f_T)$ for the undersampled dynamic data set, where each $f_t\in\mathbb{C}^{M_t}$. The corresponding sought-after reconstruction is $u=(u_1,\dots,u_T)$. The imaging operators are

$$
K_0\colon\mathbb{C}^{N_0}\to\mathbb{C}^{M_0},\qquad K_t\colon\mathbb{C}^{N}\to\mathbb{C}^{M_t},
$$

for the (densely) sampled anatomical prior and for time index $t=1,\dots,T$ in the dynamic sequence, respectively. Note that we drop the dependence on the resolutions $N$ and $N_0$ in the following, since they will always be clear from the context.

We remark that we keep $N$ fixed for the dynamic scans, implying that the spatial resolution of the frames $u_t$ stays the same over time. Theoretically, however, there also exists the possibility to change the resolution over time. The corresponding joint reconstruction problem (see also [14, 38]) for the dynamic sequence then reads

$$
u\in\operatorname*{arg\,min}_{u\in\mathbb{C}^{N\times T}}\ \sum_{t=1}^{T}\frac{\alpha_t}{2}\|K_t u_t-f_t\|^2_{\mathbb{C}^{M_t}}+J(u,u_0),
$$

with a regularizer $J$ that now depends on the different frames and on the anatomical prior image $u_0$ and thus introduces a coupling among all these images. Note that this regularizer can, of course, be composed of several terms.

### 2.3 Spatial regularization: TV and ICBTV

We now specify our choice of the regularizer $J$. In the current section, we concentrate on penalty terms that do not take into account that we actually aim at reconstructing an entire sequence of consecutive time frames $u_1,\dots,u_T$. Instead, these terms act on each frame independently. The question of how to link the time frames during the reconstruction will be addressed in the subsequent subsection.

#### 2.3.1 TV: enforcing sparsity in the gradient domain

As motivated and explained before, we assume that at each time step of the dynamic sequence only a small portion of all Fourier coefficients is sampled along radial spokes through the $k$-space center. According to the theory of compressed sensing [7, 12, 27, 21], this requires the data fidelity term to be accompanied by a term that guarantees sparsity in some transform domain. One popular choice for such a term, and the approach we pursue in this paper, is the (spatial) total variation regularization. First introduced for image denoising in 1992 [39], the distinctive feature of this regularization is the promotion of piecewise constant solutions with sharp edges and thus of sparsity with respect to the image gradient. For real-valued images $z\in\mathbb{R}^N$, the discrete, isotropic form of TV reads

$$
\mathrm{TV}_d(z):=\|\nabla z\|_1:=\sum_{n=1}^{N}|(\nabla z)_n|=\sum_{n=1}^{N}\sqrt{|(\nabla z)_{n,1}|^2+|(\nabla z)_{n,2}|^2},
$$

with $|\cdot|$ denoting the Euclidean norm on $\mathbb{R}^2$.

However, since in this work we model the unknown MRI reconstruction to be complex-valued, it is necessary to extend the concept of gradients and total variation accordingly. The extension of gradients is straightforward: for a complex-valued image $u$ we just define the gradient componentwise on the real and imaginary part, i.e. $\nabla u=\nabla\mathrm{Re}(u)+i\,\nabla\mathrm{Im}(u)$; more details on our particular implementation can be found in Section 3.1. The extension of the total variation is slightly more involved: though the underlying image is complex-valued, in most real-world applications one is interested only in the magnitude of the reconstructed image. More precisely, writing $u=|u|e^{i\varphi}$, one ideally aims to perform the regularization of the sought-after image directly on its magnitude, i.e. to penalize the total variation of $|u|$. Unfortunately, this leads to a forward operator in (2.2) or (2.4) that is nonlinear in the unknowns, rendering the numerical solution of the resulting minimization problem considerably more difficult. We refer the reader to [44] for an extensive study of this situation.

In this work, we shall instead use the linear forward model (2.4) and an approximation to the total variation on the magnitude, which turns out to have a similar effect in practice. More precisely, in the complex-valued case, we redefine the discrete, isotropic total variation in the following way

$$
\mathrm{TV}_d(u)=\sum_{n=1}^{N}\sqrt{|\mathrm{Re}(\nabla u)_{n,1}|^2+|\mathrm{Im}(\nabla u)_{n,1}|^2+|\mathrm{Re}(\nabla u)_{n,2}|^2+|\mathrm{Im}(\nabla u)_{n,2}|^2},
$$

which reveals that this approach is equivalent to identifying the complex-valued image with an element of $\mathbb{R}^{2N}$ and applying the real-valued isotropic total variation to its gradient. Simply speaking, this approach consists in taking the magnitude of the gradient of $u$ rather than the gradient of the magnitude of $u$.
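A direct numpy transcription of this complex-valued TV (our own sketch, using the forward-difference gradient later adopted in Section 3.1) reads:

```python
import numpy as np

def grad(u):
    # forward differences; last row/column of each component is zero
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def tv_complex(u):
    gx, gy = grad(u)
    # |z|^2 of a complex number is Re(z)^2 + Im(z)^2, so this sums
    # sqrt(Re(gx)^2 + Im(gx)^2 + Re(gy)^2 + Im(gy)^2) over all pixels
    return np.sqrt(np.abs(gx) ** 2 + np.abs(gy) ** 2).sum()

# piecewise constant image with a single complex jump of magnitude |3+4i| = 5
u = np.zeros((4, 4), dtype=complex)
u[:, 2:] = 3 + 4j
assert np.isclose(tv_complex(u), 4 * 5.0)
```

Note that the jump contributes its complex magnitude 5, not the separate real and imaginary jump heights, which is exactly the "magnitude of the gradient" reading above.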

At the end of this paragraph we address the TV regularization also in a slightly different context, namely with respect to the reconstruction of the anatomical prior $u_0$. As already explained in Section 1, we shall use the structural information of this high resolution image as additional a-priori information for the reconstruction of the dynamic sequence $u_1,\dots,u_T$. In order to reconstruct the usually fully sampled prior scan, i.e. $M_0=N_0$ in Equation (2.3), one would typically simply apply the inverse Fourier transform. However, since we do not want to incorporate minor oscillations, but rather like to concentrate on the major structures encoded in the prior scan, we incorporate a TV regularization term into the reconstruction problem, i.e., in the first step we solve:

$$
u_0\in\operatorname*{arg\,min}_{u_0\in\mathbb{C}^{N_0}}\ \frac{\alpha_0}{2}\|K_0 u_0-f_0\|^2_{\mathbb{C}^{M_0}}+\mathrm{TV}_d(u_0).
$$

We hence obtain a high-quality piecewise constant anatomical reconstruction with sharp edges which serves as the basis for our further considerations.

#### 2.3.2 ICBTV: incorporating structural prior information

In this paragraph, we explain how we can utilize the structural information of the (high resolution) MR image $u_0$ to support the reconstruction of the dynamic MRI sequence. For this purpose, we will again build upon the concept of total variation regularization [39] and the related Bregman distances and iterations [35]. Extensions of these techniques have been shown to be very effective for coupling the edge information of different color channels in RGB image processing [30] and of different medical imaging modalities in PET-MRI [38], as well as in the context of related debiasing methods [6].

A natural approach for extracting the structural information from the reconstruction of the prior scan is to use the image gradient, since it essentially encodes the edges and hence the structure of the image. Based on this idea, there emerged a variety of methods for structural priors in the literature, of which we only mention a few here [24, 14, 13, 5, 9, 32, 4, 25, 26, 45]. However, since most of these approaches are not normalized with respect to the size of the gradient, a common difficulty is to deal with images in different scales, i.e. images sharing the same structure, but having very different intensity ranges and hence different sizes of jumps. Considering MRI this problem is of particular importance, since the absolute intensities of the reconstructions can be very different depending on the acquisition protocol chosen.

In [38] it has therefore been argued for using a subgradient rather than a gradient to include edge information without considering the size of the jumps. In order to define a straightforward extension of subdifferentials to the complex setting, we need to equip $\mathbb{C}^N$ with the following inner product:

$$
\langle u,v\rangle_{\mathbb{C}^N}:=\mathrm{Re}(u^{*}v),\qquad u,v\in\mathbb{C}^N,
$$

with $u^{*}$ denoting the conjugate transpose of $u$. This inner product induces the same norm as the standard inner product on $\mathbb{C}^N$, but is real-valued, which enables us to consider angles between complex vectors. Indeed, for $u,v\neq 0$ we have

$$
\cos(\varphi)=\frac{\langle u,v\rangle_{\mathbb{C}^N}}{\|u\|_{\mathbb{C}^N}\,\|v\|_{\mathbb{C}^N}},
$$

where $\varphi$ denotes the angle between $u$ and $v$.

We now define the subdifferential of a convex functional $J$ at $u_t$ by

$$
\partial J(u_t)=\left\{p\in\mathbb{C}^N\ \middle|\ J(v)\geq J(u_t)+\langle p,v-u_t\rangle_{\mathbb{C}^N}\quad\forall v\in\mathbb{C}^N\right\}.
$$

By the chain rule, a subgradient of the total variation at $u_0$ is then given by

$$
p_0\in\partial\mathrm{TV}_d(u_0)\quad\Leftrightarrow\quad p_0=-\mathrm{div}(q_0),\quad q_0\in\partial\|\nabla u_0\|_1,
$$

with $q_0\in\mathbb{C}^{N\times 2}$ such that

$$
|q_0|\leq 1\quad\text{and}\quad q_0=\frac{\nabla u_0}{|\nabla u_0|}\ \text{ if }\nabla u_0\neq 0.
$$

Hence, we see that where the gradient of $u_0$ does not vanish, the vector field $q_0$ equals the direction of the edge in $u_0$ at this location, and this information is encoded in the subgradient. At positions with zero gradient, the prior does not provide any structural information (apart from being constant), and $q_0$ is an arbitrary vector from the unit ball. It has been argued in [35, 30, 38] that specific choices of subgradients can provide “hints” about structure also at these locations. In this work, however, we want to use only the location of edges already clearly visible in the reconstructed anatomical prior. Accordingly, we will use the following vector field

$$
q_0=\begin{cases}\dfrac{\nabla u_0}{|\nabla u_0|},&\text{if }\nabla u_0\neq 0,\\[1mm]0,&\text{else},\end{cases}
$$

to obtain a subgradient $p_0=-\mathrm{div}(q_0)\in\partial\mathrm{TV}_d(u_0)$. The associated Bregman distance can be written as

$$
\begin{aligned}
D^{p_0}_{\mathrm{TV}_d}(u_t,u_0)&=\mathrm{TV}_d(u_t)-\langle p_0,u_t\rangle_{\mathbb{C}^N}=\|\nabla u_t\|_1-\langle q_0,\nabla u_t\rangle_{\mathbb{C}^{N\times 2}}\\
&=\sum_{\{\nabla u_0\neq 0\}}|(\nabla u_t)_n|\left(1-\frac{\langle(\nabla u_t)_n,(\nabla u_0)_n\rangle}{|(\nabla u_t)_n|\,|(\nabla u_0)_n|}\right)+\sum_{\{\nabla u_0=0\}}|(\nabla u_t)_n|\\
&=\sum_{\{\nabla u_0\neq 0\}}|(\nabla u_t)_n|\bigl(1-\cos(\varphi_n)\bigr)+\sum_{\{\nabla u_0=0\}}|(\nabla u_t)_n|,
\end{aligned}
$$

where $\varphi_n$ denotes the angle between the vectors $(\nabla u_t)_n$ and $(\nabla u_0)_n$. Hence we conclude that on the support of $\nabla u_0$ the Bregman distance penalizes the gradient of $u_t$ weighted by its directional deviation from $\nabla u_0$, while outside the support of $\nabla u_0$ we obtain a standard TV regularization. Consequently, this functional favors aligned edges between $u_t$ and the prior without penalizing the size of the gradients. More precisely, the height of the jumps of shared edges can be determined only in dependence on the data term, since once an edge of $u_t$ is aligned to the edge of $u_0$, i.e. $\cos(\varphi_n)=1$, the regularization functional vanishes at this position. We refer to [30, 38] for an extensive discussion of the behavior.

In view of noisy data and with respect to practical applications it often makes sense to use a regularized approximation of the vector field $q_0$:

$$
q_0^{\eta}=\begin{cases}\dfrac{\nabla u_0}{|\nabla u_0|},&\text{if }|\nabla u_0|\geq\eta,\\[1mm]0,&\text{else}.\end{cases}\tag{2.5}
$$

In this formulation, the threshold parameter $\eta$ defines a minimum height for a jump in $u_0$ to be considered an edge, and in this way avoids falsely identifying small oscillations as edges (see also [13] for a similar approach). In this case, the Bregman distance reads

$$
D^{p_0^{\eta}}_{\mathrm{TV}_d}(u_t,u_0)=\sum_{\{|\nabla u_0|\geq\eta\}}|(\nabla u_t)_n|\bigl(1-\cos(\varphi_n)\bigr)+\sum_{\{|\nabla u_0|<\eta\}}|(\nabla u_t)_n|.\tag{2.6}
$$

Due to potentially different image contrasts in the prior scan and the dynamic sequence, it is unfortunately unrewarding to directly use this Bregman distance to include the structural prior information in the reconstruction framework for the dynamic sequence. To illustrate this, we consider a situation where taking a step up across the edges of a structure in the prior corresponds to taking a step down across the same structure in the dynamic sequence. In this situation, the angle between the two vectors in Equation (2.6) is close to $\pi$, such that the Bregman distance (2.6) would highly penalize the edge in $u_t$ at the respective position, which in our setting seems counterintuitive. We therefore face a situation where we expect edge positions to coincide, but where we do not necessarily aim at aligning edges in the same direction, and would rather like to adapt the regularizer such that it also promotes the opposing direction of the jump. In order to obtain a measure that is symmetric with respect to the orientation of edges, we again follow [30] and [38] and employ the infimal convolution of two Bregman distances with respect to subgradients with opposing signs:

$$
\mathrm{ICB}^{p_0^{\eta}}_{\mathrm{TV}}(u_t,u_0):=\inf_{\phi+\psi=u_t}\ D^{p_0^{\eta}}_{\mathrm{TV}_d}(\phi,u_0)+D^{-p_0^{\eta}}_{\mathrm{TV}_d}(\psi,-u_0).
$$

This functional yields a decomposition of the image $u_t$ into two parts, of which one matches the edge set of $u_0$, while the other matches the edge set of $-u_0$, hence with “inverted” contrast. Acting as an edge indicator independent of the sign of the jumps, the functional thus allows to incorporate the structure of the prior image as a-priori information into the reconstruction of the dynamic sequence. For a more detailed discussion of the behavior of the functional we again refer to [38].
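To make (2.5) and (2.6) concrete, here is a small numpy sketch (our own illustration with a simple forward-difference gradient, not the authors' code). For $u_t=u_0$ with a clean edge above the threshold the Bregman distance vanishes, and it is insensitive to the jump height, while an edge of opposite sign is penalized with twice its TV cost; the latter asymmetry is exactly what the infimal convolution removes.

```python
import numpy as np

def grad(u):
    # forward differences, components stacked along the first axis
    g = np.zeros((2,) + u.shape, dtype=u.dtype)
    g[0, :-1, :] = u[1:, :] - u[:-1, :]
    g[1, :, :-1] = u[:, 1:] - u[:, :-1]
    return g

def bregman_tv(ut, u0, eta):
    g0, gt = grad(u0), grad(ut)
    m0 = np.sqrt((np.abs(g0) ** 2).sum(0))
    mt = np.sqrt((np.abs(gt) ** 2).sum(0))
    # thresholded edge field q0^eta from (2.5)
    q0 = np.where(m0 >= eta, g0 / np.maximum(m0, 1e-30), 0.0)
    # D(ut, u0) = ||grad ut||_1 - <q0, grad ut> with the real inner product
    return (mt - np.real(np.conj(q0) * gt).sum(0)).sum()

u0 = np.zeros((4, 4))
u0[:, 2:] = 1.0  # one clean vertical edge of height 1
assert np.isclose(bregman_tv(u0, u0, eta=0.5), 0.0)            # aligned edge: free
assert np.isclose(bregman_tv(2 * u0, u0, eta=0.5), 0.0)        # jump size not penalized
assert np.isclose(bregman_tv(-u0, u0, eta=0.5), 2 * 4 * 1.0)   # opposite sign: 2x TV
```

The full $\mathrm{ICB}_{\mathrm{TV}}$ functional would additionally minimize over splittings $\phi+\psi=u_t$, so that the last case also becomes free; the single Bregman distance shown here upper-bounds it.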

### 2.4 Temporal regularization

So far, we have only discussed spatial regularization techniques, which affect all the time frames independently from each other, thus not imposing any particular relation between frames at successive points in time. The last ingredient we want to add to our dynamic reconstruction framework is an additional temporal regularization, which interlinks consecutive time frames.

In the literature a whole zoo of approaches can be found, ranging from Tikhonov-type regularization of the time derivative [1, 3], via sparsity in some known basis [28] after applying e.g. the temporal Fourier transform [43], the time derivative [16, 47] or the discrete cosine transform [15], to low rank regularization [48]. In addition, it has been proposed to combine the last two classes to obtain a decomposition into a low rank part and a part which is sparse in a known transform domain [43, 36]. To make full use of the temporal redundancy present in the dynamic data set, one should always choose the regularization in dependence on the expected dynamic behavior. For example, for a recurrent dynamic such as the beat of a heart a low rank regularization stands to reason, while in case of objects quickly moving from one region of the image to another, sparsity of the time derivative seems to be a more natural choice. Since the (pixelwise) temporal behavior we expect in our applications is commonly modeled by a smooth curve, we penalize the squared time derivative.

For a stack of images $u=(u_1,\dots,u_T)$, a discrete time derivative can be defined by forward differences, i.e.

$$
\partial_t u_t=\begin{cases}(u_{t+1}-u_t)/\Delta t,&t=1,\dots,T-1,\\0,&t=T,\end{cases}
$$

with step size $\Delta t$. Using a Tikhonov-type regularization for the time derivative, we arrive at the penalty function

$$
\Delta t\sum_{t=1}^{T-1}\|\partial_t u_t\|^2_{\mathbb{C}^N}=\frac{1}{\Delta t}\sum_{t=1}^{T-1}\|u_{t+1}-u_t\|^2_{\mathbb{C}^N}.
$$
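As a sketch (our own, assuming the frames are stacked along the first array axis), this penalty is a one-liner:

```python
import numpy as np

def temporal_penalty(u, dt):
    # (1/dt) * sum over t of ||u_{t+1} - u_t||^2, via the T-1 forward differences
    return (np.abs(np.diff(u, axis=0)) ** 2).sum() / dt

# three one-pixel "frames" with values 0, 1, 3: squared differences 1 and 4
u = np.array([[0.0], [1.0], [3.0]])
assert np.isclose(temporal_penalty(u, dt=1.0), 5.0)
```

Note the $1/\Delta t$ scaling: halving the frame duration doubles the penalty for the same intensity change, which is why the weights in the final model must be adjusted when the time resolution changes.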

### 2.5 The proposed method

We are finally able to combine the above considerations into a full method for dynamic MRI with a structural prior. The goal of the method is that for every $t$ the reconstructed time frame $u_t$

1. matches the data $f_t$, i.e. $\|K_t u_t-f_t\|_{\mathbb{C}^{M_t}}$ is small,

2. has a sparse (spatial) gradient, i.e. $\mathrm{TV}_d(u_t)$ is small,

3. has a structure similar to the prior $u_0$, i.e. $\mathrm{ICB}^{p_0}_{\mathrm{TV}}(u_t,u_0)$ is small,

4. is not too different from the previous and consecutive frames, i.e. the time derivative $\partial_t u_t$ is small.

In order to weight between these four a-priori assumptions on $u_t$, we introduce weighting factors $\alpha_t$, $w_t$ and $\gamma_t$, and the reconstruction method becomes:

$$
\begin{aligned}
u\in\operatorname*{arg\,min}_{u}\ &\sum_{t=1}^{T}\frac{\alpha_t}{2}\|K_t u_t-f_t\|^2_{\mathbb{C}^{M_t}} &&\text{(data fidelity)}\\
+\,&\sum_{t=1}^{T}w_t\,\mathrm{TV}_d(u_t) &&\text{(TV regularization)}\\
+\,&\sum_{t=1}^{T}(1-w_t)\,\mathrm{ICB}^{p_0}_{\mathrm{TV}}(u_t,u_0) &&\text{(structural prior)}\\
+\,&\sum_{t=1}^{T-1}\frac{\gamma_t}{2}\|u_{t+1}-u_t\|^2_{\mathbb{C}^N}. &&\text{(temporal smoothing)}
\end{aligned}
\tag{2.7}
$$

We quickly outline the behavior. The parameters $\alpha_t$ control the data fidelity, introducing data accuracy for the reconstruction. The parameters $w_t$ weight between total variation regularization and proximity to the structural prior. The weights $w_t$ are chosen between 0 and 1 since both the $\mathrm{TV}$ and the $\mathrm{ICB}$ term enforce spatial regularity. For small $w_t$, the reconstructions will have a structure very similar to the prior; for $w_t$ close to 1, the prior does not substantially influence the reconstructions. Since we expect dynamics only in some part of the image while the rest stays constant, it is reasonable to keep more (but not all the) weight on the prior. The parameters $\gamma_t$ control the temporal smoothness of the reconstructed sequence. For the sake of clarity we embed the temporal resolution $\Delta t$ in $\gamma_t$. However, when switching to a different time resolution, the parameters have to be adjusted according to the change in $\Delta t$.

## 3 Numerical implementation and solution

In this section, we explain how to implement and solve the minimization problem (2.7) numerically, which, depending on the number of time steps $T$, can be challenging. We derive a general primal-dual algorithm for its solution, before we outline some strategies to reduce computational costs and speed up the implementation at the end of this section.

### 3.1 Gradients and sampling operators

In order to use the discrete total variation already defined in Section 2.3.1, we need a discrete gradient operator $\nabla$ that maps an image $u \in \mathbb{C}^N$ to its gradient $\nabla u \in \mathbb{C}^{N \times 2}$. Following [8], we implement the gradient by standard forward differences. Moreover, we will use its discrete adjoint, the negative divergence $-\mathrm{div}$, defined by the identity $\langle \nabla u, v \rangle_{\mathbb{C}^{N \times 2}} = \langle u, -\mathrm{div}\, v \rangle_{\mathbb{C}^{N}}$. The inner product on the gradient space is defined in a straightforward way as

\langle v, w \rangle_{\mathbb{C}^{N\times 2}} = \mathrm{Re}(v_1^* w_1) + \mathrm{Re}(v_2^* w_2),

for $v, w \in \mathbb{C}^{N \times 2}$. For $v \in \mathbb{C}^{N \times 2}$, the (isotropic) 1-norm is defined by

\|v\|_1 := \sum_{i=1}^{N} \sqrt{|(v_i)_1|^2 + |(v_i)_2|^2},

and accordingly the dual $\infty$-norm for $w \in \mathbb{C}^{N \times 2}$ is given by

\|w\|_\infty := \max_{i=1,\dots,N} |w_i| = \max_{i=1,\dots,N} \sqrt{|(w_i)_1|^2 + |(w_i)_2|^2}.
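A minimal NumPy sketch of these discrete operators may be helpful. Assuming forward differences with zero boundary handling in the last row/column, as is standard for discrete TV following [8], the gradient, a matching divergence satisfying the adjoint identity, and the isotropic 1-norm can be implemented as follows (function names are illustrative).

```python
import numpy as np

def grad(u):
    """Forward-difference gradient of an N1 x N2 image; the last row/column
    of each component is zero, matching the discrete TV setting."""
    g = np.zeros(u.shape + (2,), dtype=u.dtype)
    g[:-1, :, 0] = u[1:, :] - u[:-1, :]
    g[:, :-1, 1] = u[:, 1:] - u[:, :-1]
    return g

def div(v):
    """Discrete divergence chosen so that <grad u, v> = <u, -div v>."""
    d = np.zeros(v.shape[:2], dtype=v.dtype)
    d[:-1, :] += v[:-1, :, 0]
    d[1:, :] -= v[:-1, :, 0]
    d[:, :-1] += v[:, :-1, 1]
    d[:, 1:] -= v[:, :-1, 1]
    return d

def norm1_iso(v):
    """Isotropic 1-norm: sum over pixels of sqrt(|v_1|^2 + |v_2|^2)."""
    return np.sum(np.sqrt(np.abs(v[..., 0])**2 + np.abs(v[..., 1])**2))
```

The adjoint identity can be verified numerically on random images, which is a useful sanity check before plugging the operators into a primal-dual scheme.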

The sampling operators we consider for $K_t$ (and analogously for $K_0$) are either a standard fast Fourier transform (FFT) on a Cartesian grid, followed by a projection onto the sampled frequencies, or a non-uniform fast Fourier transform (NUFFT), in case the sampled frequencies are not located on a Cartesian grid [18].

Fourier transform on a Cartesian grid - the simulated data case
In the numerical study on artificial data we use a simple version of the Fourier transform and sampling operator (the same can also be found in e.g. [13, 38]). We discretize the image domain on the unit square using an (equispaced) Cartesian grid with $N = N_1 \times N_2$ pixels such that the discrete grid points are given by

\Omega_N = \Bigl\{\Bigl(\tfrac{n_1}{N_1-1}, \tfrac{n_2}{N_2-1}\Bigr) \,\Big|\, n_1 = 0, \dots, N_1-1,\ n_2 = 0, \dots, N_2-1\Bigr\}.

We proceed analogously with the $k$-space, i.e. the location of the $(m_1, m_2)$-th Fourier coefficient is given by the corresponding grid point. Then, we arrive at the following formula for the (standard) Fourier transform $\mathcal{F}$ applied to $u$:

(\mathcal{F}u)_{m_1,m_2} = \frac{1}{N_1 N_2} \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} u_{n_1,n_2}\, e^{-2\pi i \left(\frac{n_1 m_1}{N_1} + \frac{n_2 m_2}{N_2}\right)},

where $m_1 = 0, \dots, N_1-1$ and $m_2 = 0, \dots, N_2-1$. For simplicity, we use a vectorized version such that $\mathcal{F} \colon \mathbb{C}^N \to \mathbb{C}^N$ with $N = N_1 N_2$. We then employ a simple sampling operator which discards all Fourier frequencies which are not located on the desired sampling geometry at time $t$ (i.e. the chosen spokes). More precisely, following [13], if we let $P_t \colon \{1, \dots, M_t\} \to \{1, \dots, N\}$ be an injective mapping which chooses $M_t$ Fourier coefficients from the $N$ coefficients available, we can define the sampling operator $S_t$ applied to $f \in \mathbb{C}^N$ as

S_t \colon \mathbb{C}^N \to \mathbb{C}^{M_t}, \qquad (S_t f)_k = f_{P_t(k)}.

The full forward operator $K_t = S_t \mathcal{F}$ can hence be expressed as

K_t \colon \mathbb{C}^N \xrightarrow{\ \mathcal{F}\ } \mathbb{C}^N \xrightarrow{\ S_t\ } \mathbb{C}^{M_t}. \qquad (3.1)

The corresponding adjoint operator of $K_t$ is given by

K_t^* \colon \mathbb{C}^{M_t} \xrightarrow{\ S_t^*\ } \mathbb{C}^N \xrightarrow{\ \mathcal{F}^{-1}\ } \mathbb{C}^N,

where $\mathcal{F}^{-1}$ denotes the standard inverse Fourier transform and $S_t^*$ ‘fills’ the missing frequencies with zeros, i.e.

(S_t^* z)_l = \sum_{k=1}^{M_t} z_k\, \delta_{l, P_t(k)}, \qquad \text{for } l = 1, \dots, N.

For the prior $u_0$, we choose a full Cartesian sampling, which corresponds to $S_0$ being the identity. For the subsequent dynamic scan, we set up $P_t$ such that it chooses the frequencies located on (discrete) spokes through the center of the $k$-space. It is important to notice that this implies that the locations of the (discretized) spokes are still on a Cartesian grid, which allows us to employ a standard fast Fourier transform (FFT) followed by the above projection onto the desired frequencies. This is not the case for the operators we use for real data.
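For intuition, the composed operator and its adjoint can be sketched in a few lines of NumPy. We use the unitary FFT normalization so that the adjoint of the Fourier transform coincides with its inverse; the $1/(N_1 N_2)$ scaling used above only changes constant factors. The function names and the index array `idx` (playing the role of $P_t$) are our own choices.

```python
import numpy as np

def make_ops(shape, idx):
    """Forward operator K = S∘F and its adjoint K* = F^{-1}∘S* on a
    Cartesian grid.

    shape -- (N1, N2) image size
    idx   -- flat k-space indices of the sampled coefficients (injective)
    """
    N = shape[0] * shape[1]

    def K(u):
        # Fourier transform, then keep only the sampled frequencies (S)
        return np.fft.fft2(u.reshape(shape), norm="ortho").ravel()[idx]

    def K_adj(z):
        # S* fills the missing frequencies with zeros, then apply F^{-1}
        f = np.zeros(N, dtype=complex)
        f[idx] = z
        return np.fft.ifft2(f.reshape(shape), norm="ortho").ravel()

    return K, K_adj
```

Checking the adjoint identity on random inputs is again a quick way to validate the pair before using it in the reconstruction algorithm.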

Non-uniform Fourier transform - the real data case
In contrast to the above (simplified) setup for artificial data, in many real-world applications the measured $k$-space frequencies in (2.3) are not located on a Cartesian grid. While this is not a problem with respect to the formula itself, it excludes the possibility to employ a fast Fourier transform, which usually reduces the computational cost of an $N$-point Fourier transform from the order of $\mathcal{O}(N^2)$ to $\mathcal{O}(N \log N)$. To obtain a similar order of complexity also for non-Cartesian samplings, it is necessary to employ the concept of non-uniform fast Fourier transforms (NUFFT) [18, 19, 29, 31, 40]. We only give a quick intuition here; for further information we refer the reader to the literature listed above. The main idea is to use a (weighted) and oversampled standard Cartesian $N$-point FFT, followed by an interpolation in $k$-space onto the desired frequencies. Note that the oversampling takes place in $k$-space. The operator $K_t$ for time $t$ can hence again be expressed as a concatenation of an $N$-point FFT and a sampling operator

K_t \colon \mathbb{C}^N \xrightarrow{\ \mathcal{F}\ } \mathbb{C}^N \xrightarrow{\ S_t\ } \mathbb{C}^{M_t}.

For our numerical experiments with the experimental DCE-MRI data, the sampling operator $S_t$ and its adjoint were taken from the NUFFT package [19].

### 3.2 Numerical solution

Due to the nondifferentiability and the involved operators we apply a primal-dual method [8] to solve the minimization problem (2.7). We first outline how to solve the (simple) TV-regularized problem (3.2) for the prior and then extend the approach to the dynamic problem. Interestingly, the problem for the prior already provides all the ingredients needed for the numerical solution of the dynamic problem, which can then be treated in a very straightforward way. We consider the problem

\min_{u_0}\ \frac{\alpha_0}{2}\|K_0 u_0 - f_0\|^2_{\mathbb{C}^{M_0}} + \|\nabla u_0\|_1, \qquad (3.2)

with $\alpha_0 > 0$. Dualizing both terms leads to its primal-dual formulation

\min_{u_0}\max_{y_1, y_2}\ \langle y_1, K_0 u_0 - f_0\rangle_{\mathbb{C}^{M_0}} - \frac{1}{2\alpha_0}\|y_1\|^2_{\mathbb{C}^{M_0}} + \langle y_2, \nabla u_0\rangle_{\mathbb{C}^{N_0\times 2}} + \chi_C(y_2), \qquad (3.3)

where $y_1 \in \mathbb{C}^{M_0}$, $y_2 \in \mathbb{C}^{N_0 \times 2}$, and $\chi_C$ denotes the characteristic function of the set

C := \{y \in \mathbb{C}^{N_0 \times 2} \mid \|y\|_\infty \le 1\}.

The primal-dual algorithm in [8] now essentially consists in performing a proximal gradient descent on the primal variable $u_0$ and a proximal gradient ascent on the dual variables $y_1$ and $y_2$, where the gradients are taken with respect to the linear part, the proximum with respect to the nonlinear part. We hence need to compute the proximal operators for the nonlinear parts in (3.3) to obtain the update steps for $y_1$ and $y_2$. It is easy to see that the proximal operator for $\phi(y_1) = \frac{1}{2\alpha_0}\|y_1\|^2_{\mathbb{C}^{M_0}}$ is given by

y_1 = \mathrm{prox}_{\sigma\phi}(r) \quad\Leftrightarrow\quad y_1 = \frac{\alpha_0 r}{\alpha_0 + \sigma}. \qquad (3.4)

The proximal operator for the update of $y_2$ is given by a simple projection onto the set $C$, i.e.

y_2 = \mathrm{proj}_C(r) \quad\Leftrightarrow\quad (y_2)_i = r_i / \max(|r_i|, 1) \quad \text{for all } i. \qquad (3.5)

Putting everything together leads to Algorithm 1.
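The two proximal maps (3.4) and (3.5) are simple pointwise operations. A NumPy sketch could look as follows; the `radius` parameter is our addition, anticipating constraint sets with bounds other than 1 such as those appearing in the dynamic problem.

```python
import numpy as np

def prox_data(r, alpha, sigma):
    """Prox of phi(y) = ||y||^2 / (2*alpha): returns alpha*r / (alpha+sigma)."""
    return alpha * r / (alpha + sigma)

def proj_ball(r, radius=1.0):
    """Pointwise projection of an (..., 2) field onto {y : ||y||_inf <= radius},
    where |.| is the pointwise Euclidean magnitude of the two components."""
    mag = np.sqrt(np.abs(r[..., 0])**2 + np.abs(r[..., 1])**2)
    scale = np.maximum(mag / radius, 1.0)
    return r / scale[..., None]
```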

The numerical realization of the dynamic problem is now straightforward. In order to deal with the infimal convolution, we use its definition and introduce an additional auxiliary variable $z = (z_1, \dots, z_T)$, yielding

\begin{aligned}
\min_u\ & \sum_{t=1}^{T} \frac{\alpha_t}{2}\|K_t u_t - f_t\|^2_{\mathbb{C}^{M_t}} + \sum_{t=1}^{T-1} \frac{\gamma_t}{2}\|u_{t+1} - u_t\|^2_{\mathbb{C}^{N}} + \sum_{t=1}^{T} w_t\,\mathrm{TV}(u_t) + \sum_{t=1}^{T} (1-w_t)\,\mathrm{ICB}^{p_0}_{\mathrm{TV}}(u_t, u_0)\\
= \min_{u,z}\ & \sum_{t=1}^{T} \frac{\alpha_t}{2}\|K_t u_t - f_t\|^2_{\mathbb{C}^{M_t}} + \sum_{t=1}^{T-1} \frac{\gamma_t}{2}\|u_{t+1} - u_t\|^2_{\mathbb{C}^{N}} + \sum_{t=1}^{T} w_t\|\nabla u_t\|_1\\
&+ \sum_{t=1}^{T} (1-w_t)\bigl[\|\nabla(u_t - z_t)\|_1 + \|\nabla z_t\|_1 - \langle p_0, u_t\rangle_{\mathbb{C}^{N}} + \langle 2p_0, z_t\rangle_{\mathbb{C}^{N}}\bigr],
\end{aligned}

where $u = (u_1, \dots, u_T)$ and $z = (z_1, \dots, z_T)$. Introducing a dual variable for each of the terms containing an operator leads to the primal-dual formulation

\begin{aligned}
\min_{u,z}\max_y\ & \sum_{t=1}^{T}\Bigl(\langle y_{t,1}, K_t u_t - f_t\rangle_{\mathbb{C}^{M_t}} - \frac{1}{2\alpha_t}\|y_{t,1}\|^2_{\mathbb{C}^{M_t}}\Bigr) + \sum_{t=1}^{T-1}\frac{\gamma_t}{2}\|u_{t+1} - u_t\|^2_{\mathbb{C}^{N}}\\
&+ \sum_{t=1}^{T}\Bigl(\langle y_{t,2}, \nabla u_t\rangle_{\mathbb{C}^{N\times 2}} + \langle y_{t,3}, \nabla(u_t - z_t)\rangle_{\mathbb{C}^{N\times 2}} + \langle y_{t,4}, \nabla z_t\rangle_{\mathbb{C}^{N\times 2}}\Bigr)\\
&- \sum_{t=1}^{T}\langle (1-w_t)p_0, u_t\rangle_{\mathbb{C}^{N}} + \sum_{t=1}^{T}\langle 2(1-w_t)p_0, z_t\rangle_{\mathbb{C}^{N}}\\
&+ \sum_{t=1}^{T}\bigl(\chi_{C_{t,2}}(y_{t,2}) + \chi_{C_{t,3}}(y_{t,3}) + \chi_{C_{t,4}}(y_{t,4})\bigr) \qquad (3.6)
\end{aligned}

where $y_{t,1} \in \mathbb{C}^{M_t}$ and $y_{t,2}, y_{t,3}, y_{t,4} \in \mathbb{C}^{N \times 2}$ for all $t$, and

C_{t,2} := \{y \in \mathbb{C}^{N\times 2} \mid \|y\|_\infty \le w_t\}, \quad C_{t,3} := \{y \in \mathbb{C}^{N\times 2} \mid \|y\|_\infty \le 1-w_t\}, \quad C_{t,4} := \{y \in \mathbb{C}^{N\times 2} \mid \|y\|_\infty \le 1-w_t\}.

To solve the problem, we again perform a proximal gradient descent on the primal variables $u$ and $z$, and a proximal gradient ascent on the dual variables $y_{t,i}$, where the gradients are taken with respect to the linear parts, the proximum with respect to the nonlinear parts. We hence need to compute the proximal operators for the nonlinear parts in (3.6) to obtain the update steps for every $t$. The proximal operators for $y_{t,1}$ can be computed exactly as in (3.4). The proximal operators for the updates of $y_{t,2}$, $y_{t,3}$, $y_{t,4}$ are given by projections onto the sets $C_{t,2}$, $C_{t,3}$, $C_{t,4}$ similar to (3.5). For the squared norm related to the time regularization, we notice that for every $t$, the frame $u_t$ only interacts with the previous and the following time step, i.e. $u_{t-1}$ and $u_{t+1}$. Hence, analogously to (3.4), the proximum for

\psi_t(u_t) = \frac{\gamma_{t-1}}{2}\|u_t - u_{t-1}\|^2_{\mathbb{C}^{N}} + \frac{\gamma_t}{2}\|u_{t+1} - u_t\|^2_{\mathbb{C}^{N}}

is given by

u_t = \mathrm{prox}_{\tau\psi_t}(r) \quad\Leftrightarrow\quad u_t = \frac{r + \tau\gamma_t u_{t+1} + \tau\gamma_{t-1} u_{t-1}}{\tau(\gamma_t + \gamma_{t-1}) + 1}.

The two boundary updates for $u_1$ and $u_T$ can be obtained from the same formula by simply setting $\gamma_0 = 0$ and $\gamma_T = 0$, respectively. Putting everything together, we obtain Algorithm 2.
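The temporal proximal map has the closed form above; a direct NumPy translation (function and argument names are ours, boundary frames handled by passing zero weights) is:

```python
def prox_temporal(r, u_prev, u_next, g_prev, g_next, tau):
    """Closed-form prox of
    psi(u) = g_prev/2 * ||u - u_prev||^2 + g_next/2 * ||u_next - u||^2.
    For the boundary frames pass g_prev = 0 (t = 1) or g_next = 0 (t = T)."""
    return (r + tau * g_next * u_next + tau * g_prev * u_prev) \
        / (tau * (g_next + g_prev) + 1.0)
```

The function works elementwise, so `r`, `u_prev` and `u_next` may be scalars or arrays of the same shape.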

### 3.3 Step sizes and stopping criteria

We quickly discuss the choice of the step sizes and stopping criteria for Algorithm 2. In most standard applications it stands to reason to choose the step sizes $\sigma$ and $\tau$ according to the condition $\sigma\tau\|K\|^2 < 1$ (where $K$ denotes the collection of all involved operators) such that convergence of the algorithm is guaranteed [8]. However, depending on $T$, i.e. the number of time frames we consider, the norm of the operator can be very costly to compute, or too large such that the condition only permits extremely small step sizes. For practical use, we instead simply choose $\sigma$ and $\tau$ reasonably "small" and track both the energy of the problem and the primal-dual residual [20] to monitor convergence. For the sake of brevity, we do not write down the primal-dual residual for Algorithm 2 and instead refer the reader to [20] for its definition; the implementation is then straightforward. We hence stop the algorithm if both the relative change in energy between consecutive iterates and the primal-dual residual have dropped below a certain threshold.
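The energy-based half of this stopping rule can be sketched as follows; the primal-dual residual check of [20] is omitted here, and the function name and tolerance are illustrative.

```python
def converged(energies, tol=1e-6):
    """Check the relative change in energy between consecutive iterates.
    `energies` is the list of objective values recorded so far; in the full
    stopping rule this is combined with a primal-dual residual check."""
    if len(energies) < 2:
        return False
    e_old, e_new = energies[-2], energies[-1]
    return abs(e_new - e_old) / max(abs(e_old), 1e-15) < tol
```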

### 3.4 Practical considerations

It is clear that for a large number of time frames $T$, Algorithm 2 starts to require an increasing amount of time to return reliable results, and for reasonably "large" step sizes $\sigma$ and $\tau$ it is even doubtful whether we can obtain convergence. In practice, it is hence necessary to divide the time series into smaller blocks of consecutive time frames. More precisely, we choose numbers $0 = T_0 < T_1 < \dots < T_n = T$ and define blocks $I_j = \{T_{j-1}+1, \dots, T_j\}$ such that $\cup_j I_j = \{1, \dots, T\}$. We can then perform the reconstruction separately for all $I_j$. In order to keep the "continuity" between $I_j$ and $I_{j+1}$, we can include the last frame of $I_j$ into the reconstruction of $I_{j+1}$ by letting $I_{j+1}$ additionally contain the index $T_j$ and choosing $u_{T_j}$ as the respective last frame of $I_j$. This divides the overall problem into smaller and easier subproblems, which can be solved faster. In practice, we observed that a size of five to ten frames per subset is a reasonable choice, which essentially gives very similar results as a reconstruction of the entire time series at once.
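The splitting into overlapping blocks can be sketched as follows; the block size and the one-frame overlap correspond to the strategy described above, while the function name and list-based interface are our own.

```python
def split_frames(T, block=8):
    """Split frame indices 1..T into consecutive blocks of at most `block`
    frames; each block after the first also contains the last index of its
    predecessor, to keep temporal continuity at the block boundaries."""
    blocks, start = [], 1
    while start <= T:
        end = min(start + block - 1, T)
        blocks.append(list(range(max(start - 1, 1), end + 1)))
        start = end + 1
    return blocks
```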

## 4 Numerical results

In this section we show numerical results for simulated fMRI data and real DCE-MRI data.

### 4.1 Methods

In order to evaluate the potential of the proposed method and to illustrate how each of the terms of our model contributes to the result, we consider the following reconstruction approaches:

• Least squares (LS) reconstruction of the dynamic sequence:

\min_{u_t}\ \|K_t u_t - f_t\|^2_{\mathbb{C}^{M_t}}, \qquad t = 1, \dots, T. \qquad (4.1)

The solution of (4.1) corresponds to a conventional, non-regularized reconstruction.

• Reconstruction of the dynamic sequence using spatial TV regularization (TV):

\min_{u_t}\ \frac{\alpha_t}{2}\|K_t u_t - f_t\|^2_{\mathbb{C}^{M_t}} + \mathrm{TV}(u_t), \qquad t = 1, \dots, T. \qquad (4.2)

The solution of (4.2) gives a gradient sparsity promoting reconstruction without time correlation between the frames in the image sequence.

• Reconstruction of the dynamic sequence using temporal (smoothness) regularization, but no spatial regularization (temp):

\min_u\