HAMLET: Hierarchical Harmonic Filters for Learning Tracts from Diffusion MRI
In this work we propose HAMLET, a novel tract learning algorithm, which, after training, maps raw diffusion weighted MRI directly onto an image which simultaneously indicates tract direction and tract presence. The automatic learning of fiber tracts based on diffusion MRI data is a rather new idea, which tries to overcome limitations of atlas-based techniques. HAMLET takes a such an approach. Unlike the current trend in machine learning, HAMLET has only a small number of free parameters HAMLET is based on spherical tensor algebra which allows a translation and rotation covariant treatment of the problem. HAMLET is based on a repeated application of convolutions and non-linearities, which all respect the rotation covariance. The intrinsic treatment of such basic image transformations in HAMLET allows the training and generalization of the algorithm without any additional data augmentation. We demonstrate the performance of our approach for twelve prominent bundles, and show that the obtained tract estimates are robust and reliable. It is also shown that the learned models are portable from one sequence to another.
It becomes more and more apparent that looking through the eyes of diffusion MRI (dMRI) shows rather a rough sketch of the underlying connectome. It was shown that tracking algorithms running on brain-like numerical phantoms with known ground truth heavily suffer from false positives . There have been attempts to account for such problems by reweighing and filtering strategies. Of course, also data quality plays a role, low spatial and angular resolution as experienced in the clinics is not sufficient and introduces ambiguities that cannot be resolved with pure model-based approaches. Explicit usage of anatomical prior knowledge is necessary. Atlas-guided methods  do not suffer from these problems, because the existence of a fiber bundle is taken as a prerequisite and the position in atlas-space is very informative. However, those methods rely on an accurate registration with an anatomical contrast. Also the inclusion of the tract prior knowledge is usually on a local level, i.e. the progression probabilities during streamlining are multiplied by priors defined locally. In this work we follow an alternative approach and use machine learning. We will directly map the dMRI data onto a directional saliency map of tract presence. It is nearby to require such a mapping to be translation covariant, as convolution neural networks (CNN) are used to be. We go here a step further and require additionally the filter to be rotation covariant. Which is, if we fully refrain from using a template space or normalization frame, also a very reasonable assumption. The idea we propose here for representation of such a mapping, we call it rotation covariant filters, is based on Spherical Tensor Algebra . The filter is, similar to CNNs, an iterated application of spatial convolutions and non-linearities, however, all operations of the filter are rotation covariant, i.e. wherever one rotates the input data, the output is rotated accordingly. So, the filter will only learn the typical spatio-orientational relationships between white matter structures and nothing about their absolute orientations nor positions. It has the advantage that the filter does not have to learn the robustness against orientational changes from the training data, but has it intrinsically built in. The filter can generalize quick and does not need any data augmentation.
The outputs of HAMLET are spherical tensor fields of order , or, equivalently fields of trace-less, symmetric matrices. The magnitude of the tensors represent the local ”likeliness” for the presence of a specific tract, the tensor orientation predicts the local tract directions. The top of Figure 2 shows an overview of the principle. Once the tensor field is predicted several interpretations are possible. It is nearby to use the field for a subsequent streamline tracking, which is also the way we follow mostly in this study.
There are two main technical contribution in this paper: (1) In HAMLET, we use an extension of the scalar filter proposed in . Unlike its scalar valued predecessor, the extended filter in HAMLET can now learn and output directions in addition to a tract saliency score. (2) In order to determine the tract locations and orientations, a global treatment of the problem is proposed. The large amount of 3D image data requires a hierarchical, multi-scale implementation to incorporate global image features for the rough tract localization, and local features for the precise representation of fiber bundle properties on a voxel resolution level.
Practically, we provide a fully automatic tracking and tract labeling procedure for (known) anatomical white matter fiber bundles. We will show that the procedure can measure apparent tract volumes in a reliable way and provides nice visualizations and representations of white matter geometry, even in a clinical setting. In its current implementation the filter relies only on second-order orientation information of the dMRI data, and is hence applicable to large varieties of protocols. We will show that a filter can also work for a dMRI protocol it was not trained with.
1.1 Related Work
Machine learning in the context of diffusion MRI was already used for simple white/gray matter segmentations , or learning brain parcellations/segmentations [17, 16]. Also in the context of microstructure estimation machine learning approaches provide efficient estimates . In [10, 5] neural networks are used to predict progression probabilities for streamline tracking. In  these ideas are extended to tract shape. Also the direct learning of fiber orientation distributions is possible ([8, 7]).
In the following we give a brief introduction to spherical tensor algebra and show how it can be used to build rotation covariant trainable mappings. For a more complete introduction see .
2.1 Spherical Tensor Algebra
Spherical harmonic representations are quite commonly used in the dMRI community to represent the direction dependent diffusion signal.
Let be a function which varies with image coordinates and directions . A nearby example is a dMRI signal. We call such a function an orientation field. It should be noted that we consider, without any loss of generality, that is complex valued. The field can be expressed in terms of the spherical harmonic expansion
The functions are the orthogonal spherical harmonic basis functions of order . The expansion coefficients form themselves a field . We call such a field of spherical harmonic coefficients a spherical tensor field of order (ST-field).
Any orientation field can be rotated with , where the first argument is a coordinate transformation, and the second argument a value transformation that rotates the local spherical function accordingly. For the corresponding expansion fields this means,
where is the Wigner-D matrix, the irreducible representations of . This fact can be used as the defining one:
Definition 1 (Spherical Tensor Field)
A function is called a spherical tensor field of order if it transforms with respect to rotations as
for all .
Spherical Tensor Algebra defines a set of rotation covariant operations on such fields. The most important operation is the spherical product:
Definition 2 (Spherical Products)
For every we define a family of bilinear forms, a tensor product
where has to be chosen according to the triangle inequality . The product is defined by
where is the th standard basis vector in and are the Clebsch-Gordan coefficients.
Spherical Products are covariant in the sense that for any and , and any .
On the basis of spherical products spherical derivatives can be introduced. The homogeneous polynomials of order are called the “solid harmonics“; With we denote the distance to the center, and with the direction. The operator transformation maps the Cartesian gradient operator onto the spherical domain and one can define a spherical differential operator by
The operator can be used to linearly map ST-fields onto other ST-fields by intertwining spatial neighborhood information. From a computational perspective this derivative operator can be used to efficiently compute local projection on spherical basis function.
2.2 Covariant Nonlinear Filters
The basic concept we rely on was already presented here for scalar-valued output (spherical order ). As we are interested in learning images of tracts we want to extend this here to second-order (spherical order ) tensor-like output. But first, the general concept. Suppose we have given a collection of spherical tensor fields describing the dMRI data, i.e. the SH-projections of a dMRI signal at a certain b-value. Note that in all experiments we set , i.e. already a rather low number of gradient directions will be sufficient. We can now form linear features describing local neighborhoods by the following differential operations:
where is a convolution with a radial function, typically a Gaussian. Due to the commutativity of convolutions the above operation may be interpreted as a projection onto the by generated basis functions. For a Gaussian , they are related to the Gauss-Laguerre basis. For simplicity we set here including all free indices/parameters on the right hand side. Note the constraints on the indices coming from the selection rules of the Clebsch-Gordan coefficients, hence, not all combinations are possible. The linear features can now be used to form non-linear ones by spherical products . One can form also higher order products than order two, but here we refrain from doing so. Indeed, the predicted tract labels are homogeneous of order two in the b0-normalized dMRI signal. Now, these products can again be linearly mapped by differentiation to any possible order. As we want to learn tracts, we map them to order two:
Finally, linear combinations of such form the final filter output, i.e. the input containing the dMRI data is mapped onto
where are coefficients to be determined during a training stage. Due to the linearity in the training can easily be accomplished in a least-squares sense by solving a linear equation. The number of features, i.e. the number of combinations for the multi-index can take, is crucial, it can easily get quite high, we denote it by .
2.3 Hierarchical Covariant Filters
The described approach works already quite well for simple tasks, however, problems appear when the learning task gets difficult. Increasing the complexity of the filter by increasing the number of features by allowing more product combinations or higher orders is possible, however, one can easily get into trouble regarding the memory consumption and computation time. One dataset can easily take tenth of Gigabytes of memory (). We propose here a simple hierarchical multi-scale generalization of the above filter to increase the expressiveness of the filter while keeping computation and memory requirement in a reasonable range. The filter is consecutively applied: it starts at low resolution with a rather high number of features and the output is forwarded to the next higher scale. To be precise: let denote the result of the filter at scale and the linear features from equation LABEL:eq computed on properly rescaled versions of the data at scale s, then the filter output is defined by
and with . The output is a linear combinations of: the output of the last scale, together with a quadratic term (as known from the single scale filter) at the current scale windowed with the previous layer’s magnitude response, and finally, a term being a product of the signal with the normalized spherical powers of the previous output.
In all our experiments we used a rescaling factor of and three scales . The parameters are the factors to be learned. The number of features differ and decrease with increasing resolution to keep the memory consumption similar at each scale. Details how the the rescaling was done can be found in the appendix A.2.
2.3.1 Parity and Point Symmetry
Besides rotation covariance the filter can also to be made to be invariant to parity changes (axial or point reflections in 3D), if the spherical products are chosen accordingly. It seems to be reasonable to require the filter to be covariant with respect to mirroring at the saggital plane, which involves a parity change of the coordinate system. In this way the machine does not have any prejudices about hemispherical differences. Recall the features
For parity covariant filters the sum has to be even. More details how the products are chosen can be found in the appendix A.1
A consequence of the parity covariance is that the filter can intrinsically not decide whether a tract is in the right or left hemisphere, so it makes no sense to learn tracts individually on the left or right hemisphere. All training labels should contain both homologues tracts for intra-hemispheric connection, commissural tracts are already approximately symmetric. So, only approximately parity symmetric training labels make sense. Note, that this does not mean that the output image of the filter is point symmetric, the filter is covariant with respect to parity changes.
Suppose, we have given a number of training tuples , were stands for the raw dMRI dataset of the th subject and for a properly generated spherical order 2 tract label map. Imagine the label map as an indicator map of the tract to be detected, with values in regions where the tract is present and zero otherwise. The principal direction of the should coincide with the direction of the tract. For example, for a set of ’ground truth’ tracts represented by their traces with tangents we can compute
and normalize , where is small constant determining a threshold of detectability. Indeed, we could also use directly the density map , however, then fiber density and detection probability would intertwine, which is not wanted and difficult to control.
For the training procedure we follow rather an ad-hoc strategy. Training starts at the lowest resolution, with properly smoothed and downsampled data and label maps. Once lowest resolution is trained over the whole training set, the predictions of the trained filter at this resolution are upsampled and forwarded to create the upper level features. Then, training is performed on the this upper level, and so on, until finest resolution is reached. Note, that with this approach we do not optimize for interdependencies of weights between the layers, but learn the filter layer by layer. This approach should work well, if the intermediate predictions are reproducible and generalize well. In order to assure this, an L2-regularization (Thikonov) is necessary, which has to be stronger for low resolutions than for high resolutions. Although high regularizations degrade predictions on the training set, they help to generalize quicker and keep prediction more reliable though less accurate. For details of the regularization see the appendix A.2.
We consider two data sets, one rather high quality protocol measured at a Siemens PRISMA for training and one low quality measured at a Tim TRIO. The first cohort consists of 55 healthy controls (also used here ), the second includes 28 volunteers all scanned twice in different sessions for testing reliability.
(PRISMA) Healthy subjects were scanned on a Siemens 3â¯T TIM PRISMA using an SE EPI sequence with a TEâ¯=â¯88â¯ms and TRâ¯=â¯2008â¯ms, bandwidth 1780â¯Hz, flip-angle 90Â°, GRAPPA factor 2, SMS factor 3 with 17 non-diffusion weighted images, 2*58 images with b-factor bâ¯=â¯1000 and 2000â¯s/mm2; with an in-plane voxel size of 1.5â¯mmâ¯Ãâ¯1.5â¯mm and a slice thickness of 3â¯mm. One phase-encoding flipped bâ¯=â¯0 image was acquired, which is used for distortion correction (FSL’s top up (Andersson et al., 2003)). Additionally to the healthy subjects three tumor patients will be shown to demonstrate how HAMLET works with unseen data.
(TRIO) Subjects were scanned on a Siemens TIM TRIO using a 2-shell protocol with two shells at b-values 1 and with 60 directions per shell, at an isotropic resolution of , 6/8 partial Fourier, TR=10,900 ms, TE=107 ms. The data was reconstructed with adaptive combine (Walsh et al., 2000) such that the noise distribution is close to Rician. Additionally, a T1-weighted structural dataset was acquired, resolution 1â¯mm isotropic. (maxim’s dico)
The diffusion weighted images were first denoised by a post-processing technique which uses random matrix theory (see  for details). This is followed by a Gibbs artifact removal  based on local sub-voxel shifts. Then, images (from PRISMA) were corrected for EPI distortions by FSL’s top up . The PRISMA dataset was additionally up-sampled to isotropic resolution by an edge preserving interpolation approach (Reisert et al., ISMRM 2017). Finally, on all scans underwent a simple BET. For all subjects the anatomical reference scans were segmented using VBM8 (http://www.fil.ion.ucl.ac.uk/spm). White matter probability maps were thresholded at a probability of 0.5 to determine the area of fiber reconstruction.
3.2 Generation of Training Data
To generate labels one may use any kind of classical fiber tracking algorithm at hand, take a certain selection or seeding strategy to delineate the tract of interest and generate a tract images by equation (7). We decided to use global tractography as described in  and use point based selections to generate tract labels. We selected the 12 tracts, which should cover most of the geometries typically encountered: inter- and intra hemispherics, commissural and tracts originating or passing through the midbrain. Details about the selection criteria can be found in appendix A.3. All selections are based on the non-rigid deformations to MNI group space provided by VBM8. We want to emphasize that the training set generation we used here is nothing than perfect. Still, false tracts can appear due to too loose restrictions, or tracts are missed because of slight atlas misregistrations.
To get a first qualitative impression of the behavior Figure 2 shows results from a random test subject from the PRISMA dataset. We compare with the automatic selected bundles from ordinary global tractography. The thresholding used for visualization was determined during training by maximization of segmentation accuracy. The tracking itself was obtained from a simple Euler integration of the vector field formed by the principal vector (eigenvector of the trace-less matrix corresponding to the maximal eigenvalue). Tracts are randomly seeded within the volume and terminated at its boundary. No other stopping criteria are used.
4.1 False Positives/Negatives
The detected tracts follow mostly the automatically selected GT tracts. However, also problems are apparent. Overall, the detections are rather ’smooth’. For example, for the MFB the bottleneck at its trunk is not accurately modeled. However, this is not always the case, for example the fornix and the anterior commisure can be nicely detected, although the structures are rather fine. However, the general problem exists that regions of a tract with high variability (over subjects) get lower responses in these regions. So, depending on the threshold, the ’trunk’ of a tract is always a bit enlarged, if also the fine tract processes should be included. For example, looking at the commisural fibers in the example in Figure 2, the laterals are not well represented with HAMLET in this subject.
In general, false positive tracts are small additional processes. For example, the AF shows sometimes a small additional extension at its posterior bend. Or the IFOF shows frontally a wrong inter-hemispheric connection. In fact, some tracts are just simple to learn and others (e.g. the optic radiation) are more difficult. One can also find false positives in regions with high fractional anisotropy.
4.2 Retest Reliability
The TRIO dataset consists of 28 healthy volunteers scanned twice in two different sessions. We use this dataset to quantify the reproducibility of the obtained tract images, and second, to show that a trained machine does also work for protocols that are quite different compared to the one used for training. Compared to the PRISMA dataset the TRIO protocol uses partial Fourier, has a different resolution, different echo time and different distortion corrections. The diffusion weighting schemes are approximately the same (see details above). Figure 3 shows the AF and MFB for three random subjects and the two different sessions. Both tracts are cleanly detected and the appearance and shape of the tract is well preserved between the sessions.
To measure the ’size’ or ’volume’ of the tract as a biomarker one can follow a simple strategy: if is the tract image, we just calculated the squared power to measure an apparent tract volume. In Figure 4 we show correlation plots between the sessions for all considered tracts. For reference we show the streamline counts obtained from the automatic selection strategy based on GT. To quantify reproducibility intra-class correlation coefficients111 Used definition of ICC: and are values from first and second session, then ICC is defined by (ICC) are reported for HAMLET’s apparent tract volumes and the streamline count obtained from GT.
The detection and classification of structures in dMRI is challenging because of the low spatial resolution, the large image size and strong acquisition noise. The difficult, time consuming and error prone task of creating high quality training data makes it even more challenging to solve such problems in a machine learning driven manner. For a long time, the usage of machine learning techniques for solving classification tasks in dMRI data was mostly restricted to the classification of rather simple problems, like the classification into brain gray-matter and brain white-matter . With the contemporary remarkable improvements in machine learning, it is very recent that it became possible to perform complex classification tasks in dMRI, such as the localization of whole fiber tracts [20, 19].
In this context, we have presented HAMLET, a novel machine learning approach that tackles the challenging problem of dMRI-based fiber tracing. in contrast to the state-of-the-art, HAMLET provides both the tract location and tract directions. The combination of localization (tract position) and tract orientation is, in general, a difficult, memory demanding problem due to its high dimensionality in 3D. However, with the novel, rotation covariant training framework, HAMLET has been tailored to the learning of the concept of complicated fiber tracts with a small number of free parameters in short time.
This is contrary to the trend of modern CNN architectures, like , which are very generic functional regressors with millions of unknowns. To train a CNN on biomedical imagery usual requires a large number of training samples to let the network learn the underlying pattern statistics, and to learn the underlying concept of possible image transformations. An incomplete or incorrect representation of the underlying concepts is often the reason for over-fitting or poor generalization. A common method to regularize the training is the usage of image augmentation, like the on-the-fly generated deformations enable quick generalization. However, the optimization of such high-dimensional problems is frequently based on heuristics, and therefore it is currently impractical to prove whether a CNN has correctly learned the underlying transformation concepts or not. Further, training from scratch usually takes days.
In fact, the deformations used for data augmentations are usually diffeomorphisms and locally very close to simple rotations. So, it is quite nearby to require the predictor to be intrinsically robust, or even covariant against rotations. The strict covariance constraint can drastically decrease the parametric complexity of the whole regressor (not necessarily the computational complexity). We followed the rather radical idea of full rotation and parity covariance. In its current setting HAMLET needs about 1000 parameters to describe a tract, which is three orders of magnitudes less compared to CNNs used for 3D segmentations on comparable sized volumes (e.g. [2, 19]). We found that already a few training examples without any augmentations are enough and a reasonable training session is accomplished within an hours. If you turn it up-side-down, the space of possible tracts can be described within a 1000-dimensional space. We have also seen that the covariance makes the obtained apparent tract volumes very stable and an ideal candidate for a quantitative biomarker.
On the other hand, the problems of the approach have also been highlighted: responses are rather smooth and the performance depends on the structure to be learned. Of course, it is not surprising that certain structures can be defined better than others if only relative spatio-orientational information is available. It is also not very surprising that a machine with about 1000 parameters produces rather rough segmentations. However, we also have seen that the simplicity of the machine makes it to generalize very well. Additionally, the ’ground truth’ itself is a problem in general. Variabilities within the training set are not just caused by the ’true’ natural variability, but also by errors during its generation.
Memory is also an issue, one application of HAMLET takes about 4 GBs of memory, and learning is rather limited by memory than by speed, as long as no batch-learning is used. In the current setting the training of 20 subjects needs about 60 GBs of data concurrently held in memory. However, improvements using batch-learning are nearby.
This is a feasibility study to show that fully automatic, rotation covariant dMRI tract learning based on Spherical Tensor Algebra is possible. The input of the machine is very close to the raw data and not very demanding with respect to the acquisition protocol. The bundle anatomies follow the ones obtained from conventional tractography and their volumes are stable across sessions, which makes the application as a quantitative biomarker possible. Instead of predicting scalar fields the prediction of rank-2 tensor fields offers the opportunity to perform visually appealing streamline tracking on the predicted tract images. Applications to neuro-navigation are nearby.
For future work, it might be reasonable to drop partially the rotation covariance constraint as it makes the problem quite hard and rough normalizations can usually be determined in a reliable way. Taking higher order input might also be an option, however, this will limit the number of possible protocols. We only learned symmetric tracts; learning asymmetry is also possible, however, we found in preliminary experiments that it is probably not worth to directly learn asymmetric tracts, because it is hard to avoid contralateral false positives. And actually it is not necessary that the machine solves the left/right problem for each label again and again. Labels could share the prediction of of learned presegmentations of left and right hemisphere. One could even go further and learn, in an unsupervised way, rough, initial brain parcellations. The actual tract learning would just get these parcellations as additional input. Also concentrating on certain parts by subdividing the brain could be a way to increase the resolution of the predictions.
Appendix A Appendix
a.1 Product Selection
There are a manifold of different ways to combine products and initial convolutions, however, not all of them can be robustly computed. For example, large initial smoothing allow higher order differentiations, while tight kernels only allow for low order expansions. In the following we explain the gritty details how different smoothings and products are chosen.
Recall, the initial linear features are spherical tensor fields of order and obtained as follows
and are combined quadratically by
or linearly with the output of the previous layers
Here denote the radially symmetric initial smoothings. We work with Gaussians and Laplacians of Gaussians. Let’s call this set . The set depends on the level of the filter. The final smoothing is a fixed Gaussian with width of voxel. We denote a Gaussian of width by and a Laplacian of Gaussian by
The free indices can vary according to the triangle inequality (or for higher levels ). To draw a cutoff we restricted and to be smaller than a certain limit , while the other free indices can take all possible (parity covariant) values. For different smoothing kernels , it makes sense to adapt the cutoff, so . The cutoff is also depending on the filter level and decreases with resolution.
The following inequalities conclude the selection criteria
The used filter uses three layers, in Table XX we conclude the level specific parameters. We additionally report , the number of features quadratic in the signal (obtained from equation (8)) and , the number of features obtained by combining signal and the output from the previous layer (from equation (9)). Note that, the latter have to be computed for each tract label individually, while the other do not depend on the previous layer’s output, and hence, are independent of the label to be learned.
a.2 Details on Processing and Learning
In the following we will give a more detailed description of the actual processing and learning steps.
As computational load depends strongly on the size of the processed volume, we used MNI coordinates to automatically generate a minimal bounding box. To avoid boundary effects and fold overs due to the circularity of FFT-based convolutions an additional margin had to be respected (of about 3 voxels). Each volume is resampled to a resolution 1.5mm isotropic and windowed by a smooth windowing function. If no MNI coordinates are available, alternative brain extraction tools could also be used (e.g. by the use of FSL’s BET). Depending on the scale, we apply a Gaussian smooth before we resample the raw data. If is the scale, we smooth by . The resampling of the label images is done accordingly on each scale.
The learning itself is based on a simple step-wise linear training. Let be five matrices representing th spherical component of the collected features from all subjects. That is, the matrix is of size where is the number of voxels in the th subject and the number of features.
And further let the label vectors, each of size . Then, we optimize the following linear objective
where are the parameters to be learned. For the lowest scale , or for the higher scales (see equation (6)).
Due to the high dynamic variance the features are normalized before inversion by their power (the square-root of the diagonal of ). So, also the regularization have to be seen in this respect, relative to the power of the feature. The conditioning of the problem usually gets better with higher scales, because dimensionality decreases and the number of examples increases. We used the same regularization of for all scales, which might be not optimal, but simple. We found that differences in performance are not very prominent as long as is not too low. Using the pseudoinverse is also an option, but would lengthen the computation time.
a.3 Tract Selection
Supposing MNI coordinates are given: a tract, namely a set of streamlines, is determined by a sequence of MNI coordinates and tolerances with . A arc-length parametrized streamline belongs to a tract, if
where . Here the list of 12 tracts used in this study:
|Sup. Long. Fasc. (SLF)||-45||30||22||8|
|Arcuate Fascicle (AF)||-48||22||10||10|
|anterior commisure (AC)||-30||-78||-25||15|
All selections where taken from a global tractography with 5-fold accumulated tracts.
-  Jesper LR Andersson, Stefan Skare, and John Ashburner. How to correct susceptibility distortions in spin-echo echo-planar images: application to diffusion tensor imaging. Neuroimage, 20(2):870–888, 2003.
-  Özgün Çiçek, Ahmed Abdulkadir, Soeren S Lienkamp, Thomas Brox, and Olaf Ronneberger. 3d u-net: learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 424–432. Springer, 2016.
-  Volker Arnd Coenen, Lena Valerie Schumacher, Christoph Kaller, Thomas Eduard Schlaepfer, Peter Christoph Reinacher, Karl Egger, Horst Urbach, and Marco Reisert. The anatomy of the human medial forebrain bundle: Ventral tegmental area connections to reward-associated subcortical and frontal lobe regions. NeuroImage: Clinical, 18:770–783, 2018.
-  Donald J Hagler, Mazyar E Ahmadi, Joshua Kuperman, Dominic Holland, Carrie R McDonald, Eric Halgren, and Anders M Dale. Automated white-matter tractography using a probabilistic diffusion tensor atlas: Application to temporal lobe epilepsy. Human brain mapping, 30(5):1535–1547, 2009.
-  Daniel Jörgens, Örjan Smedby, and Rodrigo Moreno. Learning a single step of streamline tractography based on neural networks. In Computational Diffusion MRI, pages 103–116. Springer, 2018.
-  Elias Kellner, Bibek Dhital, Valerij G Kiselev, and Marco Reisert. Gibbs-ringing artifact removal based on local subvoxel-shifts. Magnetic resonance in medicine, 76(5):1574–1581, 2016.
-  Simon Koppers, Matthias Friedrichs, and Dorit Merhof. Reconstruction of diffusion anisotropies using 3d deep convolutional neural networks in diffusion imaging. In Modeling, Analysis, and Visualization of Anisotropy, pages 393–404. Springer, 2017.
-  Simon Koppers and Dorit Merhof. Direct estimation of fiber orientations using deep learning in diffusion imaging. In International Workshop on Machine Learning in Medical Imaging, pages 53–60. Springer, 2016.
-  Klaus H Maier-Hein, Peter F Neher, Jean-Christophe Houde, Marc-Alexandre Côté, Eleftherios Garyfallidis, Jidan Zhong, Maxime Chamberland, Fang-Cheng Yeh, Ying-Chia Lin, Qing Ji, et al. The challenge of mapping the human connectome based on diffusion tractography. Nature communications, 8(1):1349, 2017.
-  Peter F Neher, Michael Götz, Tobias Norajitra, Christian Weber, and Klaus H Maier-Hein. A machine learning based approach to fiber tractography using classifier voting. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 45–52. Springer, 2015.
-  Philippe Poulin, Marc-Alexandre Cote, Jean-Christophe Houde, Laurent Petit, Peter F Neher, Klaus H Maier-Hein, Hugo Larochelle, and Maxime Descoteaux. Learn to track: Deep learning for tractography. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 540–547. Springer, 2017.
-  M. Reisert, I. Mader, C. Anastasopoulos, M. Weigel, S. Schnell, and V. Kiselev. Global fiber reconstruction becomes practical. Neuroimage, 54(2):955–62, 2011.
-  Marco Reisert, Elias Kellner, Bibek Dhital, Juergen Hennig, and Valerij G Kiselev. Disentangling micro from mesostructure by diffusion mri: A bayesian approach. NeuroImage, 147:964–975, 2017.
-  Susanne Schnell, Dorothee Saur, BW Kreher, Jürgen Hennig, H Burkhardt, and Valerij G Kiselev. Fully automated classification of hardi in vivo data using a support vector machine. NeuroImage, 46(3):642–651, 2009.
-  Henrik Skibbe and Marco Reisert. Spherical tensor algebra: A toolkit for 3d image processing. Journal of Mathematical Imaging and Vision, pages 1–33.
-  Henrik Skibbe and Marco Reisert. Dense rotation invariant brain pyramids for automated human brain parcellation. In Proceedings of the Workshop on Emerging Technologies for Medical Diagnosis and Therapy (Informatik???11), 2011.
-  Henrik Skibbe, Marco Reisert, and Hans Burkhardt. Gaussian neighborhood descriptors for brain segmentation. In MVA, pages 35–38, 2011.
-  Jelle Veraart, Dmitry S Novikov, Daan Christiaens, Benjamin Ades-Aron, Jan Sijbers, and Els Fieremans. Denoising of diffusion mri using random matrix theory. Neuroimage, 142:394–406, 2016.
-  Jakob Wasserthal, Peter Neher, and Klaus H Maier-Hein. Tractseg-fast and accurate white matter tract segmentation. arXiv preprint arXiv:1805.07103, 2018.
-  Jakob Wasserthal, Peter F Neher, Fabian Isensee, and Klaus H Maier-Hein. Direct white matter bundle segmentation using stacked u-nets. arXiv preprint arXiv:1703.02036, 2017.