Autoencoders for Multi-Label Prostate MR Segmentation

Ard de Gelder¹, Henkjan Huisman
¹Ard de Gelder is with the Diagnostic Image Analysis Group, Radboud University Nijmegen Medical Centre
July 15, 2019

Organ image segmentation can be improved by incorporating prior knowledge about the anatomy. One way to do this is to train an autoencoder to learn a low-dimensional representation of the segmentation. In this paper, this approach is applied to multi-label prostate MR segmentation, with modestly positive results.



1 Introduction

Prostate cancer is a major cause of cancer mortality among men. Multi-parametric MRI (mpMRI) is used both in the diagnosis and in the treatment of prostate cancer. For these purposes, however, segmentation is necessary, which requires considerable expertise, so automating this task could greatly benefit clinical practice, possibly even enabling population-wide pre-emptive screening. In particular, it would be useful to segment two different zones within the prostate, the transition zone (TZ) and the peripheral zone (PZ), since these have differing guidelines for mpMRI diagnosis of cancer. This is quite challenging, especially since the border between the two zones is subtle and hard to segment.

Automatic segmentation has improved greatly in recent years, most notably through the use of convolutional neural networks such as UNet [cicek]. A variant, VNet, has been applied to whole-prostate segmentation [milletari], and recently a 3D version of UNet has been used for multi-zonal prostate segmentation [germonda].

In cardiac imaging, autoencoders have been used as a way of incorporating prior knowledge into neural networks, with some positive results [oktay]. We hypothesize that the same technique can be used to increase the accuracy of automatic multi-label prostate segmentation.

Figure 1: MRI-slice (left) with manual segmentation (right). Red = TZ, Green = PZ

2 Dataset

The dataset used consists of 64 3D T2-weighted MRI volumes of the prostate and surrounding region from the 2016 Detection Archive [dataset]. In these volumes both the TZ and the PZ are annotated by hand; see Figure 1 for an example.

The original images are too large to fit in memory, so they are cropped and rescaled; see Table 1.

During training, the data is augmented with small translations, left-right flips, isotropic expansions, elastic deformations, and rotations.
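As an illustrative sketch (not the paper's exact augmentation pipeline), two of these augmentations, a left-right flip and a small random translation, can be written with NumPy; the shift range of ±2 voxels is an assumed value chosen here for illustration:

```python
import numpy as np

def augment(volume, rng):
    """Randomly apply a left-right flip and a small translation.

    volume: 3D array indexed (x, y, z); the flip is along the x axis.
    The +/- 2 voxel shift range is an illustrative choice, not a value
    taken from the paper.
    """
    if rng.random() < 0.5:
        volume = volume[::-1, :, :]          # left-right flip
    shift = int(rng.integers(-2, 3))         # small translation along x
    volume = np.roll(volume, shift, axis=0)  # roll as a simple stand-in for shift-with-padding
    return volume

rng = np.random.default_rng(0)
vol = np.zeros((36, 36, 18))
out = augment(vol, rng)
assert out.shape == vol.shape
```

In a real pipeline the random parameters would be drawn once per sample and applied identically to the image and its label volume, so that the segmentation stays aligned.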

dimension | original voxels | voxel size | rescaled voxels | voxel size
x         | 384 or 640      | 0.5 mm     | 36              | 3 mm
y         | 384 or 640      | 0.5 mm     | 36              | 3 mm
z         | 19-27           | 3.6 mm     | 18              | 3.6 mm
Table 1: Image rescaling

3 Methods

The main network used for segmentation is based on the 3D-UNet architecture, with one modification: to reflect the anisotropy of the MRI volumes, some 3D convolutions were replaced by 2D convolutions [cicek, germonda].

In an attempt to improve on this baseline, an autoencoder is added.

3.1 Autoencoder

An autoencoder is a neural network consisting of two parts: an encoder, which reduces a given segmentation to a lower-dimensional encoding, and a decoder, which aims to reconstruct the original segmentation from the encoding as accurately as possible.
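The idea can be illustrated with a minimal linear autoencoder; this is a schematic with arbitrary dimensions and untrained random weights, not the convolutional autoencoder used in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 64, 8                     # input dimension and (smaller) code dimension

W_enc = rng.normal(size=(d, k))  # encoder weights (untrained, for shapes only)
W_dec = rng.normal(size=(k, d))  # decoder weights

x = rng.normal(size=(d,))        # a flattened "segmentation"
code = W_enc.T @ x               # encoding: d -> k
recon = W_dec.T @ code           # reconstruction: k -> d

assert code.shape == (k,) and recon.shape == (d,)
```

Training would adjust the weights to minimize the reconstruction error between `x` and `recon`; the bottleneck `k < d` is what forces the code to capture only the dominant structure of the data.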

Figure 2: Encoder. Figure from [oktay]

Since the encoding is smaller than the input, the encoder has to capture the most important features of the data, so this lower-dimensional encoding can be used as a summary of the global properties of a segmentation.

The autoencoder used here is fully convolutional and reduces the 36x36x18 segmentation to a 9x9x5 encoding; see Figure 5 for an example.

This autoencoder is trained on the manual annotations in the dataset for 100 epochs, using binary cross-entropy loss and the Adam optimizer.

The main metric used to evaluate performance is the DICE score, given by

DICE(P, G) = 2|P ∩ G| / (|P| + |G|),

where P is the prediction and G the ground truth.
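For a binary mask, this score is straightforward to compute with NumPy (the convention of returning 1.0 when both masks are empty is our choice):

```python
import numpy as np

def dice(pred, truth):
    """DICE coefficient 2|P ∩ G| / (|P| + |G|) for binary masks."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: define as perfect overlap
    return 2.0 * np.logical_and(pred, truth).sum() / denom

p = np.array([1, 1, 0, 0])
g = np.array([1, 0, 0, 0])
print(dice(p, g))  # 2*1/(2+1) -> 0.6666666666666666
```

For a multi-label segmentation, the score is computed per label (here once for the TZ and once for the PZ) and the results are reported separately.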

The autoencoder could reconstruct the TZ with an average DICE of 0.95 and the PZ with an average DICE of 0.85.

Figure 3: 3D-UNet based architecture. Figure from [germonda]
Figure 4: Autoencoder architecture
Figure 5: Left: ground truth, center: one encoding slice, right: reconstruction

3.2 Implementation

During training of the 3D-UNet, the pre-trained encoder is used to add an extra global loss, as shown in Figure 6. This global loss is added to the pixel-wise loss, with the pixel-wise loss given a weight of 1 and the encoder-generated global loss a weight of 0.2.
The pixel-wise loss is a weighted categorical cross-entropy, in which the background has weight 1, the TZ weight 2, and the PZ weight 6, to compensate for label imbalance.
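A sketch of this combined loss in NumPy, assuming softmax class probabilities and one-hot ground truth flattened to shape (n_voxels, 3); the global loss is passed in as a number here, since computing it requires the trained encoder:

```python
import numpy as np

CLASS_WEIGHTS = np.array([1.0, 2.0, 6.0])  # background, TZ, PZ
GLOBAL_WEIGHT = 0.2                        # weight of the encoder-based loss

def pixelwise_loss(probs, onehot):
    """Weighted categorical cross-entropy over all voxels.

    probs, onehot: arrays of shape (n_voxels, 3).
    """
    ce = -(onehot * np.log(probs + 1e-8)).sum(axis=1)  # per-voxel cross-entropy
    w = (onehot * CLASS_WEIGHTS).sum(axis=1)           # per-voxel class weight
    return (w * ce).mean()

def total_loss(probs, onehot, global_loss):
    """Pixel-wise loss (weight 1) plus encoder-based global loss (weight 0.2).

    global_loss stands in for the distance between the encodings of the
    prediction and of the ground truth.
    """
    return pixelwise_loss(probs, onehot) + GLOBAL_WEIGHT * global_loss
```

A perfect prediction makes the pixel-wise term (near) zero, so the total loss then reduces to 0.2 times the global term.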

Figure 6: Implementation of the encoder. Figure from [oktay]

The 3D-UNet was trained for 300 epochs, using the Adam optimizer with a learning rate of 0.0001 and L2 kernel regularization.

4 Results

First, the 3D-UNet was trained without the extra loss provided by the encoder; the results were then compared to a 3D-UNet trained with the extra encoder-based loss.

The 3D-UNet that was trained with the encoder obtained slightly better results.

                             | TZ   | PZ
3D-UNet                      | 0.85 | 0.60
3D-UNet trained with encoder | 0.85 | 0.67
Table 2: Segmentation DICE scores
Figure 7: Left: manual segmentation (ground truth), center: segmentation by 3D-UNet trained without encoder, right: segmentation by 3D-UNet trained with encoder

5 Conclusions

In this work we applied convolutional autoencoders to aid the training of a 3D-UNet for multi-label prostate segmentation. This increased segmentation accuracy, but only slightly.
One reason the improvement is small could be that the image size is already reduced considerably before it is fed to the 3D-UNet. It could be studied further whether the autoencoder has more impact on performance for larger images.
Another possible improvement would be to gradually decrease the weight of the encoder-based loss while training the 3D-UNet, since the contribution of the encoder-based global loss is largest at the beginning of training.
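Such a schedule could, for example, linearly decay the weight from its initial value of 0.2 to zero over the 300 training epochs; this is a hypothetical sketch, not something evaluated in this work:

```python
def global_loss_weight(epoch, n_epochs=300, w0=0.2):
    """Linearly decay the encoder-loss weight from w0 to 0 over training."""
    return w0 * max(0.0, 1.0 - epoch / n_epochs)

print(global_loss_weight(0))    # 0.2 at the start
print(global_loss_weight(150))  # 0.1 halfway through
print(global_loss_weight(300))  # 0.0 at the end
```

Other decay shapes (stepwise or exponential) would serve the same purpose of letting the pixel-wise loss dominate once the network produces anatomically plausible shapes.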
