Multi domain CT metal artifacts reduction using partial convolution based inpainting


Recent CT Metal Artifacts Reduction (MAR) methods are often based on image-to-image convolutional neural networks that adjust either corrupted sinograms or the images themselves. In this paper we explore the capabilities of a multi-domain method which consists of both sinogram correction (projection domain step) and restored image correction (image domain step). Moreover, we formulate the first step as sinogram inpainting, which allows us to use methods from this specific field such as partial convolutions. The proposed method achieves a state-of-the-art improvement in MSE in comparison with a classic benchmark, Li-MAR.

Artem Pimkin, Alexander Samoylenko, Natalia Antipina
Anna Ovechkina, Andrey Golanov, Alexandra Dalechina and Mikhail Belyaev

Skolkovo Institute of Science and Technology, Moscow, Russia
Moscow Institute of Physics and Technology, Moscow, Russia
N. N. Burdenko National Medical Research Center of Neurosurgery
Lomonosov Moscow State University
The Gamma Knife Center at the N. N. Burdenko Institute of Neurosurgery

Keywords: Convolutional Networks, Computed Tomography (CT) images, Metal Artifacts Reduction, Sinogram Inpainting, Partial Convolutions

1 Introduction

Computed Tomography is a commonly used imaging method in disease diagnosis and in radiation therapy, where dose distribution is calculated from the electron density of the irradiated tissues and the patient-specific anatomy. High-density objects (e.g., containing metal) may occur in the area of interest and strongly affect the attenuation of X-rays, which may distort the final image reconstructed from an inconsistent sinogram [boas2012ct]. These image artifacts can have a significant impact on the accuracy of the dose distribution and reduce the visibility of organs and structures close to the metal objects [giantsoudi2017metal], [kovacs2018metal].

The MAR problem can be formulated as follows: given an input 3D tensor that represents a CT scan corrupted by the presence of high-density objects, generate a corresponding CT image with suppressed artifacts. Different methods have been developed over the last 40 years [gjesteby2016metal]. The majority of the proposed solutions can be divided into two large groups based on the domain of the input data:

  1. Projection-based algorithms, which remove the high-density area from the sinogram and then reconstruct the image from the remaining data, e.g. Li-MAR [kalender1987reduction]. Nowadays this problem may be solved using image-to-image deep learning based approaches (e.g. [park2017sinogram]).

  2. Image-based solutions that use image-to-image networks and reduce artifacts directly on the scans (e.g. [park2017machine]).

In this paper we propose a deep learning based method that combines both approaches described above: it consists of two models that process the image representation in two domains. The first inpainting model is responsible for removal of the distorted metal trace from the sinogram. The second refining model corrects the residual artifacts after restoration of the image. In this work we have successfully verified the following statements:

  1. The refining model improves the quality of the result, since even minor inconsistencies in the sinogram may lead to significant artifacts in the restored image.

  2. Sinogram adjustment via the inpainting model may simplify the problem for the direct image-to-image refining model.

Besides, we formulate the restoration of the corrupted sinogram area directly as image inpainting, which in turn allows us to use state-of-the-art approaches for this step, e.g. a partial convolution based neural network (introduced in [Liu2018ImageIF] for inpainting of irregular holes).

To train the deep learning models we propose a pipeline for generating random high-density objects, which addresses the absence of natural paired data. It also lets us avoid the problem of domain adaptation, since the shapes of the generated objects are structurally close to the real shapes (more details in Section 2.2).

Li-MAR is a classic MAR method which fills the missing data within the metal trace by linear interpolation. For this reason, we chose it as our reference. It also allows us to compare the quality of our approach with other works, using the relative difference between the performance of a model and that of Li-MAR as a metric. In these terms we compared our performance with FCN-MAR [ghani2018deep], Deep-MAR [ghani2019fast] and CNN-MAR [zhang2018convolutional], which are among the most popular and recent deep learning based methods of metal artifact reduction.
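As an illustration of the linear-interpolation idea behind Li-MAR, the following is a minimal NumPy sketch rather than the reference implementation. The function name `li_mar_interpolate`, the `(n_detectors, n_angles)` sinogram layout and the boolean `trace` mask are assumptions made for this example only.

```python
import numpy as np

def li_mar_interpolate(sinogram, trace):
    """Fill the metal trace by linear interpolation along the detector axis.

    sinogram: 2D array of shape (n_detectors, n_angles) (assumed layout).
    trace:    boolean array of the same shape, True inside the metal trace.
    """
    restored = sinogram.astype(float).copy()
    detectors = np.arange(sinogram.shape[0])
    for j in range(sinogram.shape[1]):  # handle one projection angle at a time
        missing = trace[:, j]
        if missing.any() and not missing.all():
            # Interpolate the missing bins from the known bins of this projection.
            restored[missing, j] = np.interp(
                detectors[missing], detectors[~missing], sinogram[~missing, j]
            )
    return restored
```

On a projection whose true values vary linearly across the detector, this restores the trace exactly; on real data it only approximates the missing values, which is the source of the residual artifacts discussed in this paper.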

Figure 1: Example of real shapes of high density objects (above) and shapes generated via proposed pipeline (below).
Figure 2: Training pipeline. Step 1 in the figure represents the training version of the first step of the algorithm, in which segmentation is replaced by a generated object and an uncorrupted scan.

2 Proposed solution

The main idea of the proposed solution is to combine the image-based and projection-based approaches. The first step is the removal of the metal trace from the sinogram, followed by restoration of the deleted area (projection domain step). The second step is the elimination of residual artifacts from the restored image (image domain step). A more precise formulation of the pipeline is given in Section 2.1.

2.1 Structure

First, it is important to mention that we implemented a slice-by-slice pipeline in order to avoid spatial inconsistencies in the data, e.g. different spacing between slices. Thus, we propose the following algorithm structure for each corrupted slice:

  1. Extract a mask of the high-density object using a threshold (since CT voxel intensities represent density). This is a common method of detecting such objects (e.g. [ghani2019fast]).

  2. Obtain sinograms for the corrupted image and for the mask of the object using the Radon transform.

  3. Crop the sinogram according to the transformed mask.

  4. Restore the cropped sinogram area using the inpainting model. Ideally the result has to match the sinogram of an image containing neither the high-density object nor the artifacts caused by it.

  5. Restore the image from the obtained sinogram.

  6. Adjust the image to remove residual artifacts using the refining model.

  7. Restore the high-density object using the mask from step 1.

At steps 4 and 6 we use convolutional neural networks as described in Section 2.3. To measure the effectiveness of the proposed algorithm in comparison with sinogram inpainting alone and with a direct image-to-image model, we have also trained the same pipeline without step 6 and without step 4 respectively. More details on the pipeline and training loops are shown in Figure 2.
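Steps 1 to 3 of the algorithm above can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' code: `radon_nn` is a toy nearest-neighbour parallel-beam Radon transform written for this example (a practical implementation would rather use a library routine such as `skimage.transform.radon`), and `crop_metal_trace`, the threshold value and the angle set are hypothetical. Steps 4 to 7 (inpainting, reconstruction, refinement and re-insertion of the object) are omitted.

```python
import numpy as np

def radon_nn(image, angles_deg):
    """Toy parallel-beam Radon transform with nearest-neighbour sampling.

    Returns an array of shape (n_detectors, n_angles) for a square image.
    """
    n = image.shape[0]
    centre = (n - 1) / 2.0
    ys, xs = np.mgrid[0:n, 0:n] - centre
    sinogram = np.zeros((n, len(angles_deg)))
    for j, angle in enumerate(np.deg2rad(angles_deg)):
        # Rotate the sampling grid about the image centre.
        src_x = np.rint(np.cos(angle) * xs - np.sin(angle) * ys + centre).astype(int)
        src_y = np.rint(np.sin(angle) * xs + np.cos(angle) * ys + centre).astype(int)
        inside = (src_x >= 0) & (src_x < n) & (src_y >= 0) & (src_y < n)
        rotated = np.zeros((n, n))
        rotated[inside] = image[src_y[inside], src_x[inside]]
        # Summing down each column of the rotated image gives one projection.
        sinogram[:, j] = rotated.sum(axis=0)
    return sinogram

def crop_metal_trace(image, threshold, angles_deg):
    """Steps 1-3: threshold mask, sinograms of image and mask, cropped sinogram."""
    metal_mask = image > threshold                                   # step 1
    sinogram = radon_nn(image, angles_deg)                           # step 2 (image)
    trace = radon_nn(metal_mask.astype(float), angles_deg) > 0       # step 2 (mask)
    cropped = np.where(trace, 0.0, sinogram)                         # step 3
    return cropped, trace, metal_mask
```

The boolean `trace` marks exactly the sinogram bins that the inpainting model of step 4 would have to fill.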

2.2 Synthetic data

Due to the absence of natural paired data, we created a synthetic dataset from CT scans of patients with no high-density objects and hence without the artifacts caused by them. To do so we built a random object generator whose outputs have a structure similar to real high-density objects on head CT scans. To achieve this we used the following algorithm:

  1. Select a uniformly random size (up to 10% of the linear size of the image) of the volume where the object will be placed.

  2. Put a uniformly random number (from 1 to 25) of geometrical structures (ball, octahedron, parallelepiped) of linear size up to 10 pixels into this volume.

  3. Merge them using the morphological closing operation to obtain a randomly shaped object or a small set of such objects.

  4. Put up to 30 geometrical structures of small size (from 1 to 3 pixels) into the volume to obtain an object with outliers.

  5. Select a position of the obtained object randomly (via uniform 2D distribution) so that the overlap between the mask and the brain is .

  6. Repeat the process up to 10 times to obtain a scan with multiple objects.

Figure 1 shows examples of real and generated objects. Using this algorithm, for each of the 24 patients (more details in Section 3) we created 30 differently distributed object masks, with a depth of approximately 90 slices per sample on average.

2.3 Architectures

For deep learning problems formulated above we used the following architectures:

UNet [unet] - a model with this architecture won the ISBI Challenge: Segmentation of neuronal structures in EM stacks. It was effective even on a small amount of data and also performed well in metal artifact reduction on sinograms (e.g. [park2017sinogram]) as well as in direct image-to-image artifact removal (e.g. [park2017machine]). In our case we use a model with this architecture for step 6 of our algorithm - reduction of the residual artifacts in the images.

UNet with partial convolutions - this architecture differs from the UNet mentioned above in that its convolutions are replaced by partial convolutions [Liu2018ImageIF], which are masked and renormalized so that the output is conditioned only on the pixels that are not masked. Since we formulate sinogram correction directly as inpainting, we were able to use this method, the state of the art in the field, for that part of the pipeline.
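To make the masking and renormalization concrete, here is a minimal single-channel NumPy sketch of one partial convolution in the spirit of [Liu2018ImageIF]. It is a didactic sketch rather than the layer used in this work: the function name, the explicit loops, the absence of padding and the scalar bias are simplifying assumptions, and `valid` uses 1 for known pixels and 0 for the hole (here, the metal trace).

```python
import numpy as np

def partial_conv2d(x, valid, weight, bias=0.0):
    """Single-channel partial convolution (no padding, stride 1).

    x:      (H, W) input, arbitrary values inside the hole.
    valid:  (H, W) array with 1 for known pixels and 0 for hole pixels.
    weight: (k, k) convolution kernel.
    Returns the output feature map and the updated validity mask.
    """
    k = weight.shape[0]
    h, w = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((h, w))
    new_valid = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            m = valid[i:i + k, j:j + k]
            n_valid = m.sum()
            if n_valid > 0:
                window = x[i:i + k, j:j + k] * m  # hole pixels contribute nothing
                # Rescale by (window size / number of valid pixels).
                out[i, j] = (weight * window).sum() * (k * k / n_valid) + bias
                new_valid[i, j] = 1.0  # the hole shrinks layer by layer
    return out, new_valid
```

With an averaging kernel and a constant image, the output stays constant even where the window overlaps the hole, which is the effect of the renormalization.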

3 Experimental setup

Data. The initial dataset consists of 24 CT brain scans taken for radiation therapy planning. Each scan is represented by a 3D tensor of shape , where N is the number of axial slices, varying from 130 to 210. In addition, a set of 47 corrupted scans with the same characteristics was used for visual validation of the solution. The data was divided patient-wise into three parts: a training set for the first model, a training set for the second model and a testing set.

Preprocessing. All input images, both for sinogram inpainting and for reduction of residuals on the restored CT slices, were linearly normalized to fit into window. Such simple preprocessing was used to maintain the physical meaning of voxel intensities.

Metrics. We used the standard MAE (Mean Absolute Error), MSE (Mean Squared Error) and SSIM (Structural Similarity) metrics to measure the quality of the final result on the CT scans.
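For reference, MAE and MSE can be written in a few lines of NumPy, as in the sketch below. The helper names are chosen for this example, and SSIM is left out because it is more involved (library implementations such as `skimage.metrics.structural_similarity` exist). The toy arrays also show how errors concentrated in a few pixels inflate MSE while leaving MAE unchanged.

```python
import numpy as np

def mae(pred, target):
    """Mean absolute error between two arrays of the same shape."""
    return float(np.mean(np.abs(pred - target)))

def mse(pred, target):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((pred - target) ** 2))

target = np.zeros(100)
uniform_error = np.ones(100)   # every pixel is off by 1
sparse_error = np.zeros(100)
sparse_error[0] = 100.0        # a single pixel is off by 100

print(mae(uniform_error, target), mse(uniform_error, target))
print(mae(sparse_error, target), mse(sparse_error, target))
```

Both error patterns have the same MAE, but the sparse one has a hundredfold larger MSE.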

Training. For our full pipeline and the comparison of approaches we trained 3 models: sinogram inpainting (UNet with partial convolutions), residual artifacts elimination (UNet) and an image-to-image model for the pipeline without the inpainting step (UNet). All of the models mentioned above were trained using Adam optimizer with initial learning rate of .

For sinogram inpainting, the UNet with partial convolutions was trained for 500 epochs with multiplication of the learning rate by on each of the following epochs: 100, 200, 300 and multiplication by on 400, 450, 475 and 490 epochs respectively.

Both models with the plain UNet architecture were trained for 200 epochs with learning rate multiplication on epochs 100, 150 and 175.
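The step-wise schedules described above can be expressed as a small framework-independent helper, sketched below. The milestone epochs come from the text, whereas `INITIAL_LR`, `FACTOR_A` and `FACTOR_B` are placeholder values assumed purely for illustration, since the actual initial learning rate and multiplication factors are not given here. In PyTorch, the single-factor case corresponds to `torch.optim.lr_scheduler.MultiStepLR`.

```python
def lr_at_epoch(initial_lr, epoch, milestones):
    """Learning rate at `epoch`, given a mapping {milestone epoch: factor}.

    Each factor is applied once, starting from its milestone epoch.
    """
    lr = initial_lr
    for milestone, factor in sorted(milestones.items()):
        if epoch >= milestone:
            lr *= factor
    return lr

# Placeholder values (assumptions, not taken from the paper).
INITIAL_LR = 1e-3
FACTOR_A = 0.5
FACTOR_B = 0.1

# Milestone epochs as described in the text.
inpainting_schedule = {100: FACTOR_A, 200: FACTOR_A, 300: FACTOR_A,
                       400: FACTOR_B, 450: FACTOR_B, 475: FACTOR_B, 490: FACTOR_B}
unet_schedule = {100: FACTOR_A, 150: FACTOR_A, 175: FACTOR_A}

for epoch in (0, 100, 300, 499):
    print(epoch, lr_at_epoch(INITIAL_LR, epoch, inpainting_schedule))
```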

All models were created and trained using the PyTorch framework, with DPipe for configuration management and experiment setup.

4 Results and conclusion

Test metrics of the inpainting-only method, the image-to-image only method, Li-MAR as a classic benchmark and the proposed method are presented in Table 1.

Algorithm              MAE (HU)   MSE (HU^2)   SSIM
Li-MAR                 26.6       3282         0.94
Inpainting-only        22.0       10170        0.94
Image-to-image only    14.5       2064         0.97
Proposed algorithm     9.6        831          0.99

Table 1: Comparison between one-step methods, Li-MAR and the two-step method in terms of MAE, MSE and SSIM between the final output and the test set. HU denotes Hounsfield units.

Here we can see a significant decrease of MAE between the inpainting-only method and the proposed solution (from 22.0 to 9.6 HU). Thus we can conclude that the image-to-image network successfully removes residual artifacts and increases the quality of the resulting model. The inpainting model yields a very large MSE together with a relatively small MAE because of the inhomogeneity of its errors, which is caused by sinogram inconsistency. On the other hand, MAE also decreases between the image-to-image only method and our two-step approach (from 14.5 to 9.6 HU). This supports our hypothesis that sinogram inpainting as preprocessing greatly simplifies the task for the image-to-image model. Moreover, the proposed solution significantly outperforms Li-MAR in terms of MAE (9.6 versus 26.6 HU).
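The relative MAE decreases behind this comparison can be recomputed from the rounded entries of Table 1, as in the short sketch below. The helper `relative_decrease` is introduced for this example only, and figures derived from rounded table values may differ slightly from numbers computed on unrounded data.

```python
def relative_decrease(baseline, value):
    """Relative decrease of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline

# MAE values in Hounsfield units, copied from Table 1.
mae_hu = {
    "Li-MAR": 26.6,
    "Inpainting-only": 22.0,
    "Image-to-image only": 14.5,
}
proposed_mae = 9.6

for name, baseline in mae_hu.items():
    print(f"{name}: {relative_decrease(baseline, proposed_mae):.1f}% lower MAE")
```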

Table 2 shows the relative drop of MSE reported in different papers in comparison with the Li-MAR score reported in the same paper, and indicates the relatively good performance of the proposed model.

Algorithm                            MSE drop (%)
Li-MAR [kalender1987reduction]       0.0
FCN-MAR [ghani2018deep]              -52.2
Deep-MAR [ghani2019fast]             -58.5
CNN-MAR [zhang2018convolutional]     -71.1
Proposed algorithm                   -75.5

Table 2: Relative drop compared to Li-MAR performance.

Figure 3 shows examples of the output of our algorithm and of Li-MAR on real CT scans of patients with high-density objects that cause artifacts in the images. In the brain window it is visible that the model restores distinguishable brain structures well.

Figure 3: Examples of processing real corrupted CT scans, with intensities shown in the brain window (40 ± 80 HU). From left to right: original image, Li-MAR output, output of the proposed solution. Each row corresponds to a different patient.

In this paper we present a state-of-the-art multi-domain deep learning procedure that consists of sinogram completion and additional adjustment after the inverse Radon transform. The results of our experiments show the effectiveness of the proposed approach compared to one-step algorithms. Thus, qualitative and quantitative analysis shows the application potential of the proposed method. The next step of our research will be clinical validation on dose calculations for radiosurgery planning.

The results have been obtained under the support of the Russian Foundation for Basic Research grant 18-29-26030.

