Deep Convolutional Framelets: A General Deep Learning Framework for Inverse Problems^{†}^{†}thanks: The authors would like to thank Dr. Cynthia McCollough, the Mayo Clinic, the American Association of Physicists in Medicine (AAPM), and grants EB01705 and EB01785 from the National Institute of Biomedical Imaging and Bioengineering for providing the Low-Dose CT Grand Challenge data set. This work is supported by the National Research Foundation of Korea, grant numbers NRF-2016R1A2B3008104, NRF-2015M3A9A7029734, and NRF-2017M3C7A1047904.
Abstract
Recently, deep learning approaches with various network architectures have achieved significant performance improvement over existing iterative reconstruction methods in various imaging problems. However, it is still unclear why these deep learning architectures work for specific inverse problems. Moreover, in contrast to the usual evolution of signal processing theory around classical theories, the link between deep learning and classical signal processing approaches, such as wavelets, non-local processing, and compressed sensing, is not yet well understood. To address these issues, here we show that the long-searched-for missing link is the convolution framelets for representing a signal by convolving local and non-local bases. The convolution framelets were originally developed to generalize the theory of low-rank Hankel matrix approaches for inverse problems, and this paper further extends the idea so that we can obtain a deep neural network using multi-layer convolution framelets with perfect reconstruction (PR) under rectified linear unit (ReLU) nonlinearity. Our analysis also shows that popular deep network components such as the residual block, redundant filter channels, and concatenated ReLU (CReLU) do indeed help to achieve the PR, while the pooling and unpooling layers should be augmented with high-pass branches to meet the PR condition. Moreover, by changing the number of filter channels and biases, we can control the shrinkage behavior of the neural network. This discovery reveals the limitations of many existing deep learning architectures for inverse problems, and leads us to propose a novel theory for a deep convolutional framelets neural network. Using numerical experiments with various inverse problems, we demonstrate that our deep convolutional framelets network shows consistent improvement over existing deep architectures.
This discovery suggests that the success of deep learning stems not from the magical power of a black box, but rather from the power of a novel signal representation using a non-local basis combined with a data-driven local basis, which is indeed a natural extension of classical signal processing theory.
Key words. Convolutional neural network, framelets, deep learning, inverse problems, ReLU, perfect reconstruction condition
AMS subject classifications. Primary, 94A08, 97R40, 94A12, 92C55, 65T60, 42C40; Secondary, 44A12
1 Introduction
Deep learning approaches have achieved tremendous success in classification problems [44] as well as low-level computer vision problems such as segmentation [59], denoising [76], super-resolution [42, 61], etc. The theoretical origin of this success has been investigated [58, 63], and the exponential expressivity under a given network complexity (in terms of the VC dimension [3] or Rademacher complexity [5]) has often been attributed to it. A deep network is also known to learn high-level abstractions/features of the data, similar to visual processing in the human brain, using multiple layers of neurons with nonlinearity [47].
Inspired by the success of deep learning in low-level computer vision, several machine learning approaches have recently been proposed for image reconstruction problems. In X-ray computed tomography (CT), Kang et al [39] provided the first systematic study of deep convolutional neural networks (CNNs) for low-dose CT and showed that a deep CNN using directional wavelets is efficient at removing low-dose-related CT noise. Unlike these low-dose artifacts from reduced tube currents, the streaking artifacts originating from sparse projection views are global and therefore difficult to remove using conventional denoising CNNs [15, 52]. Han et al [29] and Jin et al [35] independently proposed residual learning using U-Net [59] to remove the global streaking artifacts caused by sparse projection views. In MRI, Wang et al [67] were the first to apply deep learning to compressed sensing MRI (CS-MRI). They trained a deep neural network to map downsampled reconstruction images to fully sampled reconstructions, and then used the deep learning result either as an initialization or as a regularization term in classical CS approaches. A multilayer perceptron was developed for accelerated parallel MRI [46, 45]. A deep network architecture using an unfolded iterative compressed sensing (CS) algorithm was also proposed [25]. Instead of using handcrafted regularizers, the authors in [25] tried to learn a set of optimal regularizers. Domain adaptation from a sparse-view CT network to projection reconstruction MRI was also proposed [30]. These pioneering works have consistently demonstrated impressive reconstruction performance, often superior to existing iterative approaches. However, the more impressive empirical results we observe in image reconstruction problems, the more unanswered questions we encounter. For example, to the best of our knowledge, we do not have complete answers to the following questions that are critical to network design:

What is the role of the filter channels in convolutional layers?

Why do some networks need fully connected layers whereas others do not?

What is the role of nonlinearities such as the rectified linear unit (ReLU)?

Why do we need pooling and unpooling layers in some architectures?

What is the role of a bypass connection or residual network?

How many layers do we need?
Furthermore, the most troubling issue for the signal processing community is that the link to classical signal processing theory is still not fully understood. For example, wavelets [17] have been extensively investigated as an efficient signal representation theory for many image processing applications by exploiting the energy compaction property of wavelet bases. Compressed sensing theory [19, 14] has further extended the idea to demonstrate that accurate recovery is possible from undersampled data, if the signal is sparse in some frame and the sensing matrix is incoherent. Non-local image processing techniques such as non-local means [8], BM3D [16], etc. have also demonstrated impressive performance in many image processing applications. The links between these algorithms have been extensively studied over the last few years using various mathematical tools from harmonic analysis, convex optimization, etc. However, recent years have witnessed that a blind application of deep learning toolboxes sometimes provides even better performance than mathematics-driven classical signal processing approaches. Does this imply the dark age of signal processing or a new opportunity?
Therefore, the main goal of this paper is to address these open questions. In fact, our paper is not the only attempt to address these issues. For instance, Papyan et al [56] showed that once the ReLU nonlinearity is employed, the forward pass of a network can be interpreted as a deep sparse coding algorithm. Wiatowski et al [69] discuss the importance of pooling for networks, proving that it leads to translation invariance. Moreover, several works including [23] provide explanations for residual networks. The interpretation of a deep network in terms of unfolded (or unrolled) sparse recovery is another prevailing view in the research community [24, 71, 25, 35]. However, this interpretation still does not answer several key questions: for example, why do we need multi-channel filters? In this paper, we therefore depart from these existing views and propose a new interpretation of a deep network as a novel signal representation scheme. In fact, signal representation theories such as wavelets and frames have been active areas of research for many years [50], and Mallat [51] and Bruna et al [7] proposed the wavelet scattering network as a translation-invariant and deformation-robust image representation. However, this approach does not have learning components as in existing deep learning networks.
Then, what is missing here? One of the most important contributions of our work is to show that the geometry of deep learning can be revealed by lifting a signal to a high dimensional space using a Hankel structured matrix. More specifically, many types of input signals that occur in signal processing can be factored into left and right bases as well as a sparse matrix with energy compaction properties when lifted into a Hankel structured matrix. This results in a frame representation of the signal using the left and right bases, referred to as the non-local and local basis matrices, respectively. The origin of this nomenclature will become clear later. One of our novel contributions is the realization that the non-local basis determines the network architecture, such as pooling/unpooling, while the local basis allows the network to learn convolutional filters. More specifically, application-specific domain knowledge leads to a better choice of a non-local basis, on which the local basis is learned to maximize the performance.
In fact, the idea of exploiting the two bases via the so-called convolution framelets was originally proposed by Yin et al [74]. However, the aforementioned close link to deep neural networks was not revealed in [74]. Most importantly, we demonstrate for the first time that the convolution framelet representation can be equivalently represented as an encoder-decoder convolution layer, and that a multi-layer convolution framelet expansion is also feasible by relaxing the conditions in [74]. Furthermore, we derive the perfect reconstruction (PR) condition under the rectified linear unit (ReLU). The mysterious role of the redundant multi-channel filters can then be easily understood as an important tool to meet the PR condition. Moreover, by augmenting local filters with paired filters of opposite phase, the ReLU nonlinearity disappears and the deep convolutional framelet becomes a linear signal representation. However, in order for the deep network to satisfy the PR condition, the number of channels should increase exponentially along the layers, which is difficult to achieve in practice. Interestingly, we can show that an insufficient number of filter channels results in a shrinkage behavior via a low-rank approximation of an extended Hankel matrix, and this shrinkage behavior can be exploited to maximize network performance. Finally, to overcome the limitations of the pooling and unpooling layers, we introduce a multi-resolution analysis (MRA) for convolution framelets using a wavelet non-local basis as a generalized pooling/unpooling. We call this new class of deep networks using convolution framelets deep convolutional framelets.
1.1 Notations
For a matrix $A$, $\mathcal{R}(A)$ denotes the range space of $A$ and $\mathcal{N}(A)$ refers to the null space of $A$. $P_{\mathcal{R}(A)}$ denotes the projection to the range space of $A$, whereas $P_{\mathcal{R}(A)}^\perp$ denotes the projection to the orthogonal complement of $\mathcal{R}(A)$. The notation $1_n$ denotes an $n$-dimensional vector of 1's. The $n \times n$ identity matrix is referred to as $I_n$. For a given matrix $A$, the notation $A^\dagger$ refers to the generalized inverse. The superscript $H$ of $A^H$ denotes the Hermitian transpose. Because we are mainly interested in real-valued cases, $A^H$ is equivalent to the transpose $A^\top$. The inner product in matrix space is defined by $\langle A, B \rangle = \operatorname{tr}(A^\top B)$, where $A, B \in \mathbb{R}^{n \times m}$. For a matrix $A$, $\|A\|_F$ denotes its Frobenius norm. For a given matrix $A$, $a_i$ denotes its $i$-th column, and $a_{ij}$ denotes its $(i,j)$-th element. If a matrix $\Psi$ is partitioned into sub-matrices $\Psi_j$, then $\psi_j^i$ refers to the $i$-th column of $\Psi_j$. A vector $\overline{v}$ is referred to as the flipped version of a vector $v$, i.e. its indices are reversed. Similarly, for a given matrix $\Psi$, the notation $\overline{\Psi}$ refers to a matrix composed of flipped vectors. For a block structured matrix $\Psi$, with a slight abuse of notation, we define $\overline{\Psi}$ as
(1) 
Finally, Table LABEL:tbl:notation summarizes the notation used throughout the paper.
Notation  Definition 

$\Phi$  non-local basis matrix at the encoder  
$\tilde{\Phi}$  non-local basis matrix at the decoder  
$\Psi$  local basis matrix at the encoder  
$\tilde{\Psi}$  local basis matrix at the decoder  
$b_{\rm enc}$, $b_{\rm dec}$  encoder and decoder biases  
$\phi_i$  $i$-th non-local basis or filter at the encoder  
$\tilde{\phi}_i$  $i$-th non-local basis or filter at the decoder  
$\psi_i$  $i$-th local basis or filter at the encoder  
$\tilde{\psi}_i$  $i$-th local basis or filter at the decoder  
$C$  convolutional framelet coefficients at the encoder  
$\hat{C}$  convolutional framelet coefficients at the decoder  
$n$  input dimension  
$d$  convolutional filter length  
$p$  number of input channels  
$q$  number of output channels  
$x$  single-channel input signal, i.e. $x \in \mathbb{R}^n$  
$X$  a $p$-channel input signal, i.e. $X = [x_1 \cdots x_p] \in \mathbb{R}^{n \times p}$  
$\mathbb{H}_d(\cdot)$  Hankel operator, i.e. $\mathbb{H}_d : \mathbb{R}^n \to \mathbb{R}^{n \times d}$  
$\mathbb{H}_{d|p}(\cdot)$  extended Hankel operator, i.e. $\mathbb{H}_{d|p} : \mathbb{R}^{n \times p} \to \mathbb{R}^{n \times pd}$  
$\mathbb{H}_d^{\dagger}(\cdot)$  generalized inverse of the Hankel operator  
$\mathbb{H}_{d|p}^{\dagger}(\cdot)$  generalized inverse of the extended Hankel operator  
$U$  left singular vector matrix of an (extended) Hankel matrix  
$V$  right singular vector matrix of an (extended) Hankel matrix  
$\Sigma$  singular value matrix of an (extended) Hankel matrix  
circulant matrix 
2 Mathematics of Hankel matrix
Since the Hankel structured matrix is the key component in our theory, this section discusses various properties of the Hankel matrix that will be extensively used throughout the paper.
2.1 Hankel matrix representation of convolution
Hankel matrices arise repeatedly in many different contexts in signal processing and control theory, such as system identification [21], harmonic retrieval, array signal processing [33], subspace-based channel identification [64], etc. A Hankel matrix can also be obtained from a convolution operation [72], which is of particular interest in this paper. Here, to avoid special treatment of the boundary condition, our theory is mainly derived using the circular convolution.
Let $x = [x[1], \cdots, x[n]]^\top \in \mathbb{R}^n$ and $h = [h[1], \cdots, h[d]]^\top \in \mathbb{R}^d$. Then, a single-input single-output (SISO) convolution of the input $x$ and the filter $h$ can be represented in a matrix form:
(2) $y = x \circledast \overline{h} = \mathbb{H}_d(x)\, h,$
where $\mathbb{H}_d(x)$ is a wrap-around Hankel matrix:
(3) $\mathbb{H}_d(x) = \begin{bmatrix} x[1] & x[2] & \cdots & x[d] \\ x[2] & x[3] & \cdots & x[d+1] \\ \vdots & \vdots & \ddots & \vdots \\ x[n] & x[1] & \cdots & x[d-1] \end{bmatrix}.$
Similarly, a single-input multi-output (SIMO) convolution using $q$ filters $\psi^1, \cdots, \psi^q \in \mathbb{R}^d$ can be represented by
(4) $Y = x \circledast \overline{\Psi} = \mathbb{H}_d(x)\, \Psi,$
where $\Psi = [\psi^1 \cdots \psi^q] \in \mathbb{R}^{d \times q}$.
On the other hand, a multi-input multi-output (MIMO) convolution for the $p$-channel input $X = [x_1 \cdots x_p]$ can be represented by
(5) $y_i = \sum_{j=1}^{p} x_j \circledast \overline{\psi}_j^i, \qquad i = 1, \ldots, q,$
where $p$ and $q$ are the number of input and output channels, respectively; $\psi_j^i \in \mathbb{R}^d$ denotes the length-$d$ filter that convolves the $j$-th channel input to compute its contribution to the $i$-th output channel. By defining the MIMO filter kernel $\Psi$ as follows:
(6) $\Psi = \begin{bmatrix} \Psi_1 \\ \vdots \\ \Psi_p \end{bmatrix}, \qquad \Psi_j = [\psi_j^1 \cdots \psi_j^q],$
the corresponding matrix representation of the MIMO convolution is then given by
(7) $Y = X \circledast \overline{\Psi}$ 
(8) $\quad = \mathbb{H}_{d|p}(X)\, \Psi$ 
(9) $\quad = \sum_{j=1}^{p} \mathbb{H}_d(x_j)\, \Psi_j,$
where $\overline{\Psi}$ is a flipped block structured matrix in the sense of (LABEL:eq:block), and $\mathbb{H}_{d|p}(X)$ is an extended Hankel matrix obtained by stacking $p$ Hankel matrices side by side:
(10) $\mathbb{H}_{d|p}(X) = \begin{bmatrix} \mathbb{H}_d(x_1) & \mathbb{H}_d(x_2) & \cdots & \mathbb{H}_d(x_p) \end{bmatrix}.$
For notational simplicity, we denote $\mathbb{H}_{d|1}(\cdot) = \mathbb{H}_d(\cdot)$. Fig. LABEL:fig:hankel illustrates the procedure to construct an extended Hankel matrix when the convolution filter length is 2.
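As a numerical sanity check on these definitions, the following Python sketch (our own illustration; the indexing convention and helper names are ours) builds the wrap-around Hankel matrix and verifies that multiplying it by a filter reproduces the circular convolution with the flipped filter, as in the SISO matrix form above:

```python
import numpy as np

def hankel_wrap(x, d):
    """Wrap-around Hankel matrix H_d(x): row i holds x[i], ..., x[i+d-1] (mod n)."""
    n = len(x)
    return np.array([[x[(i + j) % n] for j in range(d)] for i in range(n)])

rng = np.random.default_rng(0)
n, d = 8, 3
x = rng.standard_normal(n)
h = rng.standard_normal(d)

# matrix form of the SISO convolution: y = H_d(x) h
y = hankel_wrap(x, d) @ h

# circular convolution of x with the circularly flipped, zero-padded filter
hp = np.zeros(n); hp[:d] = h
hbar = np.roll(hp[::-1], 1)                       # hbar[m] = hp[(-m) % n]
y_ref = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(hbar)))
assert np.allclose(y, y_ref)
```

Both expressions compute the same cross-correlation-type sum, so the matrix-vector product and the FFT-based circular convolution agree to machine precision.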
Finally, as a special case of the MIMO convolution for $q = 1$, the multi-input single-output (MISO) convolution is defined by
(11) $y = \sum_{j=1}^{p} x_j \circledast \overline{\psi}_j = \mathbb{H}_{d|p}(X)\, \psi,$
where $\psi = [\psi_1^\top \cdots \psi_p^\top]^\top \in \mathbb{R}^{pd}$.
The SISO, SIMO, MIMO, and MISO convolutional operations are illustrated in Fig. LABEL:fig:conv(a)(d).
The extension to the multi-channel 2D convolution operation for an image domain CNN (and multi-dimensional convolutions in general) is straightforward, since similar matrix-vector operations can also be used. The only required change is the definition of the (extended) Hankel matrices, which are now defined as block Hankel matrices. Specifically, for a 2D input, the block Hankel matrix associated with filtering with a 2D filter is given by
(12) 
Similarly, an extended block Hankel matrix from the channel input image is defined by
(13) 
Then, the output from the 2D SISO convolution for a given image with 2D filter can be represented by a matrix vector form:
where denotes the vectorization operation by stacking the column vectors of the 2D matrix . Similarly, 2D MIMO convolution for given input images with 2D filter can be represented by a matrix vector form:
(14) 
Therefore, by defining
(15) 
(16) 
the 2D MIMO convolution can be represented by
(17) 
Due to these similarities between 1D and 2D convolutions, we will therefore use the 1D notation throughout the paper for the sake of simplicity; however, readers are advised that the same theory applies to 2D cases.
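To make the 2D block Hankel construction concrete, the following sketch (our own illustration; the patch-extraction routine is essentially an im2col with wrap-around boundary, and the ordering conventions are our choices) verifies that a matrix-vector product with the block Hankel matrix reproduces 2D circular filtering:

```python
import numpy as np

def block_hankel_2d(X, p, q):
    """Wrap-around block Hankel matrix: each row is a vectorized p x q patch of X."""
    n1, n2 = X.shape
    rows = []
    for i in range(n1):
        for j in range(n2):
            patch = X[np.arange(i, i + p) % n1][:, np.arange(j, j + q) % n2]
            rows.append(patch.ravel(order='F'))   # column-major, matching vec(.)
    return np.array(rows)

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 6))
K = rng.standard_normal((3, 3))

# vec(Y) = H(X) vec(K): 2D circular cross-correlation with the kernel K
y_vec = block_hankel_2d(X, 3, 3) @ K.ravel(order='F')
Y = y_vec.reshape(6, 6)           # row-major reshape matches the (i, j) loop order

# direct 2D circular cross-correlation for comparison
Y_ref = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        Y_ref[i, j] = np.sum(X[np.arange(i, i+3) % 6][:, np.arange(j, j+3) % 6] * K)
assert np.allclose(Y, Y_ref)
```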
In convolutional neural networks (CNNs), a particular form of multi-dimensional convolution is used. Specifically, to generate $q$ output channels from $p$ input channels, each output channel is computed by first convolving the $p$ input channel images with $p$ two-dimensional filters, and then applying a weighted sum to the outputs (the weighted sum is often referred to as a $1 \times 1$ convolution). For 1D signals, this operation can be written by
(18) 
where denotes the 1D weighting. Note that this is equivalent to a MIMO convolution, since we have
(19)  
where
(20) 
The aforementioned matrix vector operations using the extended Hankel matrix also describe the filtering operation (LABEL:eq:2dConv) in 2D CNNs as shown in Fig. LABEL:fig:cnnConv.
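The equivalence between multi-channel CNN filtering and a single extended-Hankel matrix product can be checked numerically. In the following sketch (a hypothetical 1D illustration with our own variable names), each output channel is computed once as a sum of per-channel Hankel products and once as one product with the extended Hankel matrix:

```python
import numpy as np

def hankel_wrap(x, d):
    n = len(x)
    return np.array([[x[(i + j) % n] for j in range(d)] for i in range(n)])

n, d, p, q = 8, 3, 2, 4                  # signal length, filter taps, in/out channels
rng = np.random.default_rng(2)
X = rng.standard_normal((n, p))          # p-channel input
Hf = rng.standard_normal((q, p, d))      # filters h_ij (j-th input -> i-th output)

# extended Hankel matrix: p Hankel blocks stacked side by side
Hext = np.hstack([hankel_wrap(X[:, j], d) for j in range(p)])      # n x (p d)

# MIMO convolution as a single matrix product per output channel
Y = np.column_stack([Hext @ np.concatenate([Hf[i, j] for j in range(p)])
                     for i in range(q)])

# channel-by-channel check: y_i = sum_j H_d(x_j) h_ij
Y_ref = np.column_stack([sum(hankel_wrap(X[:, j], d) @ Hf[i, j] for j in range(p))
                         for i in range(q)])
assert np.allclose(Y, Y_ref)
```

The stacked-filter vector ordering here mirrors the side-by-side stacking of the Hankel blocks, which is the point of the extended Hankel representation.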
Throughout the paper, we denote the space of the wrap-around Hankel structured matrices of the form in (LABEL:eq:hank) as , and an extended Hankel matrix composed of Hankel matrices of the form in (LABEL:eq:ehank) as . The basic properties of Hankel matrices used in this paper are described in Lemma LABEL:lem:calculus in Appendix LABEL:ap1. In the next section, we describe advanced properties of the Hankel matrix that will be extensively used in this paper.
2.2 Low-rank property of Hankel matrices
One of the most intriguing features of the Hankel matrix is that it often has a low-rank structure, and its low-rankness is related to the sparsity in the Fourier domain (for the case of Fourier samples, it is related to the sparsity in the spatial domain) [72, 37].
Note that many types of image patches have sparsely distributed Fourier spectra. For example, as shown in Fig. LABEL:fig:flowchart(a), a smoothly varying patch usually has spectral content in the low-frequency regions, while the other frequency regions have very few spectral components. Similar spectral domain sparsity can be observed in the texture patch shown in Fig. LABEL:fig:flowchart(b), where the spectral components of the patch are determined by the spectrum of the patterns. For the case of an abrupt transition along an edge as shown in Fig. LABEL:fig:flowchart(c), the spectral components are mostly localized along an axis. In these cases, if we construct a Hankel matrix using the corresponding image patch, the resulting Hankel matrix is low-ranked [72]. This property is extremely useful, as demonstrated by many applications [37, 34, 55, 48, 49, 36]. For example, this idea can be used for image denoising [38] and deconvolution [53] by modeling the underlying intact signals as having a low-rank Hankel structure, from which the artifacts or blur components can be easily removed.
In order to understand this intriguing relationship, consider a 1D signal $x$, whose spectrum in the Fourier domain is sparse and can be modelled as the sum of $k$ Diracs:
(21) $\hat{x}(\omega) = 2\pi \sum_{j=0}^{k-1} c_j\, \delta(\omega - \omega_j), \qquad \omega_j \in [0, 2\pi),$
where $\{\omega_j\}_{j=0}^{k-1}$ refer to the corresponding harmonic components in the Fourier domain. Then, the corresponding discrete time-domain signal is given by:
(22) $x[m] = \sum_{j=0}^{k-1} c_j\, e^{i \omega_j m}.$
Suppose that we have a length-$(k+1)$ filter $h$ which has the following z-transform representation [66]:
(23) $\hat{h}(z) = \sum_{l=0}^{k} h[l]\, z^{-l} = \prod_{j=0}^{k-1} \left(1 - e^{i \omega_j} z^{-1}\right).$
Then, it is easy to see that
(24) $(h \circledast x)[m] = 0, \qquad \forall m,$
because
(25) $(h \circledast x)[m] = \sum_{l=0}^{k} h[l]\, x[m-l] = \sum_{j=0}^{k-1} c_j\, e^{i \omega_j m}\, \hat{h}(e^{i \omega_j}) = 0,$ 
where the last equality comes from (LABEL:eq:afilter) [66]. Thus, the filter $h$ annihilates the signal $x$, so it is referred to as the annihilating filter. Moreover, using the notation in (LABEL:eq:SISO), Eq. (LABEL:eq:annf) can be represented by $\mathbb{H}_{k+1}(x)\, \overline{h} = 0.$
This implies that the Hankel matrix $\mathbb{H}_{k+1}(x)$ is rank-deficient. In fact, the rank of the Hankel matrix can be explicitly calculated, as shown in the following theorem:
Theorem 1.
[72] Let $\hat{\ell}$ denote the minimum length of the annihilating filters that annihilate the signal $x$. Then, for a given Hankel structured matrix $\mathbb{H}_d(x)$ with $d \geq \hat{\ell}$, we have
(26) $\operatorname{rank} \mathbb{H}_d(x) = \hat{\ell} - 1,$
where $\operatorname{rank}$ denotes a matrix rank.
Thus, if we choose a sufficiently large $d$, the resulting Hankel matrix is low-ranked. This relationship is quite general, and Ye et al [72] further showed that the rank of the associated Hankel matrix is $r$ if and only if the signal can be represented by
(27) 
for some . If , then it is directly related to the signals with a finite rate of innovation (FRI) [66]. Thus, the low-rank Hankel matrix provides an important link between the FRI sampling theory and compressed sensing, such that a sparse recovery problem can be solved using a measurement-domain low-rank interpolation [72].
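The low-rank and annihilation properties above can be illustrated numerically. In the following sketch (our own illustration; the signal length, frequency values, and filter construction via polynomial roots are arbitrary choices), a signal with $k$ on-grid harmonic components yields a wrap-around Hankel matrix of rank $k$, annihilated by a length-$(k+1)$ filter whose z-transform has zeros at the active frequencies:

```python
import numpy as np

def hankel_wrap(x, d):
    n = len(x)
    return np.array([[x[(i + j) % n] for j in range(d)] for i in range(n)])

n, d, k = 64, 16, 3
freqs = [3, 7, 11]                               # k on-grid harmonic components
t = np.arange(n)
x = sum(np.exp(2j * np.pi * f * t / n) for f in freqs)

# the Hankel matrix of a k-sparse-spectrum signal has rank k (<< min(n, d))
H = hankel_wrap(x, d)
assert np.linalg.matrix_rank(H) == k

# length-(k+1) annihilating filter: polynomial with zeros at the active frequencies
h = np.poly([np.exp(2j * np.pi * f / n) for f in freqs])   # highest degree first
assert np.allclose(hankel_wrap(x, k + 1) @ h[::-1], 0)     # H_{k+1}(x) h-bar = 0
```

Note that on-grid frequencies are used so that the wrap-around (circular) boundary does not disturb the harmonic structure.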
In [34], we also showed that the rank of the extended Hankel matrix in (LABEL:eq:ehank) is low when the multiple signals have the following structure:
(28) 
such that the Hankel matrix has the following decomposition:
(29) 
where is wraparound Hankel matrix, and for any with is defined by
(30) 
Accordingly, the extended Hankel matrix has the following decomposition:
(31) 
Due to the rank inequality , we therefore have the following rank bound:
(32)  
Therefore, if the filter length $d$ is chosen such that the number of columns of the extended matrix is sufficiently large, then the concatenated matrix becomes low-ranked.
Note that the low-rank Hankel matrix algorithms are usually performed in a patch-by-patch manner [38, 37]. It is also remarkable that this is similar to the current practice of deep CNNs for low-level computer vision applications, where the network input is usually given as a patch. Later, we will show that this is not a coincidence; rather, it suggests an important link between the low-rank Hankel matrix approach and a CNN.
2.3 Hankel matrix decomposition and the convolution framelets
The last but not least important property of the Hankel matrix is that its decomposition results in a framelet representation whose bases are constructed by the convolution of so-called local and non-local bases [74]. More specifically, for a given input vector $x \in \mathbb{R}^n$, suppose that the Hankel matrix $\mathbb{H}_d(x)$ with rank $r$ has the following singular value decomposition:
(33) $\mathbb{H}_d(x) = U \Sigma V^\top,$
where $U = [u_1 \cdots u_r] \in \mathbb{R}^{n \times r}$ and $V = [v_1 \cdots v_r] \in \mathbb{R}^{d \times r}$ denote the left and right singular vector basis matrices, respectively, and $\Sigma \in \mathbb{R}^{r \times r}$ is the diagonal matrix whose diagonal components contain the singular values. Then, by multiplying $U^\top$ and $V$ to the left and right of the Hankel matrix, we have
(34) $\Sigma = U^\top \mathbb{H}_d(x)\, V.$
Note that the $(i,j)$-th element of $\Sigma$ is given by
(35) $\sigma_{ij} = u_i^\top \mathbb{H}_d(x)\, v_j = \langle x,\, u_i \circledast v_j \rangle,$
where the last equality comes from (LABEL:eq:inner). Since the numbers of rows and columns of $\mathbb{H}_d(x)$ are $n$ and $d$, respectively, the right-multiplied vector $v_j$ interacts locally with a $d$-neighborhood of the signal, whereas the left-multiplied vector $u_i$ has a global interaction with the entire elements of the signal. Accordingly, (LABEL:eq:sigma) represents the strength of the simultaneous global and local interaction of the signal with the bases. Thus, we call $u_i$ and $v_j$ the non-local and local bases, respectively.
This relation holds for arbitrary basis matrices $\Phi \in \mathbb{R}^{n \times n}$ and $\Psi \in \mathbb{R}^{d \times d}$ that are multiplied to the left and right of the Hankel matrix, respectively, to yield the coefficient matrix:
(36) $C = \Phi^\top \mathbb{H}_d(x)\, \Psi,$
which represents the interaction of $x$ with the non-local basis $\Phi$ and the local basis $\Psi$. Using (LABEL:eq:C0) as expansion coefficients, Yin et al derived the following signal expansion, which they called the convolution framelet expansion [74]:
Proposition 2 ([74]).
Let $\phi_i$ and $\psi_j$ denote the $i$-th and $j$-th columns of the orthonormal matrices $\Phi \in \mathbb{R}^{n \times n}$ and $\Psi \in \mathbb{R}^{d \times d}$, respectively. Then, for any $n$-dimensional vector $x \in \mathbb{R}^n$,
(37) $x = \frac{1}{d} \sum_{i=1}^{n} \sum_{j=1}^{d} \langle x,\, \phi_i \circledast \psi_j \rangle\; \phi_i \circledast \psi_j.$
Furthermore, $\{\phi_i \circledast \psi_j\}_{i,j}$ with $i = 1, \ldots, n$ and $j = 1, \ldots, d$ forms a tight frame for $\mathbb{R}^n$ with the frame constant $d$.
This implies that any input signal $x$ can be expanded using the convolution frame $\{\phi_i \circledast \psi_j\}$ and the expansion coefficients $\langle x, \phi_i \circledast \psi_j \rangle$. Although the framelet coefficient matrix in (LABEL:eq:C0) for general non-local and local bases is not as sparse as (LABEL:eq:Sigma) from the SVD bases, Yin et al [74] showed that the framelet coefficients can be made sufficiently sparse by optimally learning the local basis for a given non-local basis. Therefore, the choice of the non-local basis is one of the key factors in determining the efficiency of the framelet expansion. In the following, several examples of non-local bases in [74] are discussed.
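The perfect reconstruction implied by the tight frame property can be verified numerically. In the following sketch (our own illustration; the anti-diagonal-averaging un-Hankelization is one standard choice of generalized inverse), framelet coefficients are computed with arbitrary orthonormal non-local and local bases, and the input is recovered exactly:

```python
import numpy as np

def hankel_wrap(x, d):
    n = len(x)
    return np.array([[x[(i + j) % n] for j in range(d)] for i in range(n)])

def hankel_pinv(H):
    """Generalized inverse: average each wrap-around anti-diagonal of H."""
    n, d = H.shape
    x = np.zeros(n)
    for i in range(n):
        for j in range(d):
            x[(i + j) % n] += H[i, j] / d
    return x

rng = np.random.default_rng(3)
n, d = 16, 4
x = rng.standard_normal(n)

# orthonormal non-local basis Phi (n x n) and local basis Psi (d x d);
# random orthonormal matrices here -- any orthonormal pair works
Phi, _ = np.linalg.qr(rng.standard_normal((n, n)))
Psi, _ = np.linalg.qr(rng.standard_normal((d, d)))

C = Phi.T @ hankel_wrap(x, d) @ Psi        # framelet coefficients
x_rec = hankel_pinv(Phi @ C @ Psi.T)       # perfect reconstruction
assert np.allclose(x_rec, x)
```

Since each wrap-around anti-diagonal of $\mathbb{H}_d(x)$ repeats the same sample $d$ times, averaging over anti-diagonals inverts the Hankel lifting, which is why the frame constant $d$ appears in the expansion.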

SVD: From the singular value decomposition in (LABEL:eq:svd0), the SVD basis is constructed by augmenting the left singular vector basis $U$ with an orthogonal complement $U_\perp$:
$\Phi = [U \;\; U_\perp],$ such that $\Phi^\top \Phi = I_n$. Thanks to (LABEL:eq:sigma), this is the most energy-compacting basis. However, the SVD basis is input-signal dependent, and the calculation of the SVD is computationally expensive.

Haar: The Haar basis comes from the Haar wavelet transform and is constructed as follows:
$\Phi = [\Phi_{\rm low} \;\; \Phi_{\rm high}],$ where the low-pass and high-pass operators $\Phi_{\rm low}, \Phi_{\rm high} \in \mathbb{R}^{n \times n/2}$ are defined by
$\Phi_{\rm low} = \frac{1}{\sqrt{2}}\, I_{n/2} \otimes \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \Phi_{\rm high} = \frac{1}{\sqrt{2}}\, I_{n/2} \otimes \begin{bmatrix} 1 \\ -1 \end{bmatrix}.$
Note that each column of the Haar basis has only two non-zero elements, so one level of Haar decomposition does not represent a global interaction. However, by cascading the Haar bases, the interaction becomes global, resulting in a multi-resolution decomposition of the input signal. Moreover, the Haar basis is a useful global basis because it can sparsify piecewise constant signals. Later, we will show that the average pooling operation is closely related to the Haar basis.

DCT: The discrete cosine transform (DCT) basis is an interesting global basis proposed by Yin et al [74] due to its energy compaction property, proven by the JPEG image compression standard. The DCT basis matrix is fully populated and dense, which clearly represents a global interaction. To the best of our knowledge, the DCT basis has never been used in deep CNNs, which could be an interesting direction of research.
In addition to the nonlocal bases used in [74], we will also investigate the following nonlocal bases:

Identity matrix: In this case, $\Phi = I_n$, so there is no global interaction between the basis and the signal. Interestingly, this non-local basis is quite often used in CNNs that do not have a pooling layer. In this case, it is believed that the local structure of the signal is more important, and the local bases are trained such that they maximally capture the local correlation structure of the signal.

Learned basis: In the extreme case where we do not have specific knowledge of the signal, the non-local basis can also be learned. However, care must be taken, since the learned non-local basis has a size of $n \times n$, which quickly becomes very large for image processing applications: for an image of $n$ pixels, the required memory to store the learnable non-local basis scales as $n^2$, which is impossible to store or estimate for typical image sizes. However, if the input patch size is sufficiently small, this may be another interesting direction of research in deep CNNs.
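The connection between the Haar non-local basis and pooling can be made explicit in a few lines. In the following sketch (our own illustration, using the Kronecker-product form of the Haar operators), the low-pass branch alone acts as a lossy 2x average pooling, while retaining the high-pass branch restores perfect reconstruction:

```python
import numpy as np

n = 8
rng = np.random.default_rng(4)
x = rng.standard_normal(n)

# Haar non-local basis: low-pass (average-pooling) and high-pass branches
L = np.kron(np.eye(n // 2), np.array([[1.0], [1.0]])) / np.sqrt(2)    # n x n/2
Hi = np.kron(np.eye(n // 2), np.array([[1.0], [-1.0]])) / np.sqrt(2)  # n x n/2
Phi = np.hstack([L, Hi])

# the combined basis is orthonormal
assert np.allclose(Phi.T @ Phi, np.eye(n))

# low-pass branch alone = 2x average pooling (up to sqrt(2) scaling): lossy
x_pool = L.T @ x
assert not np.allclose(L @ x_pool, x)

# keeping the high-pass branch restores perfect reconstruction
assert np.allclose(L @ (L.T @ x) + Hi @ (Hi.T @ x), x)
```

This is precisely the observation made later in the paper: pooling/unpooling layers discard the high-pass branch and therefore cannot satisfy the PR condition on their own.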
3 Main Contributions: Deep Convolutional Framelets Neural Networks
In this section, which is our main theoretical contribution, we will show that the convolution framelets by Yin et al [74] are directly related to deep neural networks if we relax the conditions of the original convolution framelets to allow a multi-layer implementation. The multi-layer extension of convolution framelets, which we call the deep convolutional framelets, can explain many important components of deep learning.
3.1 Deep Convolutional Framelet Expansion
While the original convolution framelets by Yin et al [74] exploit the advantages of the low-rank Hankel matrix approaches using two bases, there are several limitations. First, their convolution framelets use only orthonormal bases. Second, the significance of a multi-layer implementation was not noticed. Here, we discuss extensions that relax these limitations. As will become clear, this is a basic building block of a deep convolutional framelets neural network.
Proposition 3.
Let $\Phi$ and $\Psi$ denote the non-local and local basis matrices, respectively. Suppose, furthermore, that $\tilde{\Phi}$ and $\tilde{\Psi}$ denote their dual basis matrices such that they satisfy the frame conditions:
(38)  
(39) 
Then, for any input signal , we have
(40) 
or equivalently,
(41) 
where is the th column of the framelet coefficient matrix
(42)  
(43) 
Proof.
Using the frame condition (LABEL:eq:phi0) and (LABEL:eq:ri0), we have
where denotes the framelet coefficient matrix computed by
and its th element is given by
where we use (LABEL:eq:inner) for the last equality. Furthermore, using (LABEL:eq:recon1) and (LABEL:eq:invfilter), we have
This concludes the proof.
Note that the so-called perfect reconstruction (PR) condition represented by (LABEL:eq:frame) can be equivalently studied using:
(44) 
Similarly, for a matrix input, the perfect reconstruction condition can be given by
(45) 
which is explicitly represented in the following proposition:
Proposition 4.
Let $\Phi, \tilde{\Phi}$ denote the non-local basis and its dual, and $\Psi, \tilde{\Psi}$ denote the local basis and its dual, respectively, which satisfy the frame conditions:
(46)  
(47) 
Suppose, furthermore, that the local basis matrices have a block structure:
(48) 
with whose th column is represented by and , respectively. Then, for any matrix , we have
(49) 
or equivalently,
(50) 
where is the th column of the framelet coefficient matrix
Proof.
For a given , using the frame condition (LABEL:eq:phi0) and (LABEL:eq:ri0), we have
where denotes the framelet coefficient matrix computed by
and its th element is given by
Furthermore, using (LABEL:eq:recon1), (LABEL:eq:invfilter) and (LABEL:eq:recon2), we have
This concludes the proof.
Remark 1.
Compared to Proposition LABEL:prp:yin, Propositions LABEL:prp:1 and LABEL:prp:2 are more general, since they consider redundant and non-orthonormal non-local and local bases under relaxed conditions. The specific reason for allowing redundant local bases is to investigate existing CNNs that have a large number of filter channels in the lower layers. The redundant global basis is also believed to be useful for future research, so Proposition LABEL:prp:1 is derived with this further extension in mind. However, since most existing deep networks use a non-redundant non-local basis, we will mainly focus on this special case for the rest of the paper.
Remark 2.
For the given SVD in (LABEL:eq:svd0), the frame conditions (LABEL:eq:phi0) and (LABEL:eq:ri0) can be further relaxed to the following conditions:
due to the following matrix identity:
In this case, the number of bases in the non-local and local basis matrices can be smaller than in Proposition LABEL:prp:1 and Proposition LABEL:prp:2. Therefore, a smaller number of bases still suffices for PR.
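The perfect reconstruction property of Propositions LABEL:prp:1 and LABEL:prp:2 with a redundant local basis can be illustrated numerically. The following sketch is our own reading of the relaxed frame condition: we assume it takes the form $\Psi \tilde{\Psi}^\top = I_d$, choose the dual (decoder) filters via the pseudo-inverse, and use the identity non-local basis (no pooling); routine names are ours:

```python
import numpy as np

def hankel_wrap(x, d):
    n = len(x)
    return np.array([[x[(i + j) % n] for j in range(d)] for i in range(n)])

def hankel_pinv(H):
    """Generalized inverse: average each wrap-around anti-diagonal of H."""
    n, d = H.shape
    x = np.zeros(n)
    for i in range(n):
        for j in range(d):
            x[(i + j) % n] += H[i, j] / d
    return x

rng = np.random.default_rng(5)
n, d, m = 16, 4, 6                   # m > d: redundant local filter channels
x = rng.standard_normal(n)

Phi = np.eye(n)                      # identity non-local basis (no pooling)
Psi = rng.standard_normal((d, m))    # redundant local (encoder) filters
Psi_dual = np.linalg.pinv(Psi).T     # dual (decoder) filters
assert np.allclose(Psi @ Psi_dual.T, np.eye(d))   # frame condition

# encoder: framelet coefficients; decoder: perfect reconstruction
C = Phi.T @ hankel_wrap(x, d) @ Psi
x_rec = hankel_pinv(Phi @ C @ Psi_dual.T)
assert np.allclose(x_rec, x)
```

The encoder-decoder structure is visible here: the coefficient computation is a (SIMO) convolution with the encoder filters, and the reconstruction is a matched convolution with the dual decoder filters, exactly the pairing that Theorem 5 formalizes.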
Finally, using Propositions LABEL:prp:1 and LABEL:prp:2, we will show that the convolution framelet expansion can be realized by two matched convolution layers, which have a striking similarity to neural networks with encoder-decoder structures [54]. Our main contribution is summarized in the following theorem.
Theorem 5 (Deep Convolutional Framelets Expansion).
Under the assumptions of Proposition LABEL:prp:2, we have the following decomposition of input