Multimodal Registration of Retinal Images Using Domain-Specific Landmarks and Vessel Enhancement


The analysis of different image modalities is frequently performed in ophthalmology, as they provide complementary information for the diagnosis and follow-up of relevant diseases, like hypertension or diabetes. This work presents a hybrid method for the multimodal registration of color fundus retinography and fluorescein angiography. The proposed method combines a feature-based approach, using domain-specific landmarks, with an intensity-based approach that employs a domain-adapted similarity metric. The methodology was tested on a dataset of 59 image pairs containing both healthy and pathological cases. The results show a satisfactory performance of the proposed combined approach in the multimodal scenario, improving on the registration accuracy achieved by the feature-based and the intensity-based methods applied independently.

1 Introduction

Multimodal medical image registration is important in the context of diagnosis and follow-up of many relevant diseases. An accurate multimodal registration allows the integration of information obtained from different image modalities, providing complementary information to the clinicians and improving the diagnostic capabilities. Ophthalmology benefits from this fact given the significant number of existing retinal image modalities: color fundus retinography, fluorescein angiography, autofluorescence fundus retinography or red-free fundus retinography, among others. These modalities offer different visualizations of the anatomical structures, lesions and pathologies of the eye fundus, and no single modality provides the combined multimodal information on its own.

In general, registration algorithms can be classified in two groups: feature-based registration (FBR) and intensity-based registration (IBR) [1]. FBR methods use interest points, such as landmarks, along with the intensity profiles in their neighbourhoods to find point correspondences and estimate the spatial transformation between the images. For the detection of interest points, common algorithms such as the Harris corner detector [2], SIFT [3][4] and SURF [5], as well as variations of them [6][7], have been used in different proposals for retinal images. These algorithms detect a large number of interest points in the images. However, as the detected points are not necessarily representative characteristics of the retinal image contents, many of them may not be present across the different modalities. An excessive number of non-representative points increases the computational cost of the subsequent point matching and the likelihood of matching wrong correspondences. The application of generic descriptors is also limited by the differences among retinal image modalities, so a preprocessing step is required for their use in multimodal scenarios [7]. Some proposals address this issue by designing domain-specific descriptors [8][2], but they still rely on non-specific methods for the detection of interest points. The detection of line structures, mainly from vessels and disease boundaries, may be seen as a first approach to detecting representative characteristics of the retinal images [9]. However, those boundaries do not show a clear correspondence among all the modalities. More representative characteristics can be obtained by extracting natural landmarks, such as vessel bifurcations. This idea was tested by Laliberté et al. [10] for the registration of retinal images, although their method, which also requires the detection of the optic disc, was not robust enough and failed on several images. To our knowledge, the use of these natural landmarks for multimodal retinal registration has not been explored in later works, even though its successful application can greatly reduce the number of candidate points detected for the matching process.

IBR methods use similarity metrics that take into account the intensity values of the whole images instead of sparse local neighbourhoods. This makes it possible to perform the registration with high order transformations [1], as it avoids the risk of overfitting to a small number of points. The registration is performed by optimizing a similarity measure, such as intensity differences or cross-correlation for monomodal cases, or mutual information (MI) for multimodal cases. Nevertheless, the application in multimodal scenarios depends on the complexity of the image modalities and the relation between their intensity distributions. Specifically for retinal images, Legg et al. [11] found that in some cases the MI value is inconsistent with the accuracy of the registration: there are transformations with better MI scores than the ground truth registration. These difficulties may explain the reduced number of IBR proposals for multimodal retinal image registration. IBR can also be combined with FBR in hybrid methodologies that try to exploit the capabilities of both strategies [12][13].

In this work, we propose a hybrid methodology for the multimodal registration of color fundus retinographies and fluorescein angiographies. The method combines an initial FBR approach driven by domain-specific landmarks with an IBR refinement that uses a domain-adapted similarity metric. Both approaches exploit the presence of the retinal vascular tree in the retinographies and the angiographies. The proposed FBR is based on the detection of landmarks present in both retinal image modalities, i.e. vessel bifurcations and crossovers. These landmarks can be detected with high specificity, which greatly reduces the number of detected points and facilitates the subsequent point matching. We completely avoid the descriptor computation step, as the point matching is performed with only the geometric information already obtained from the detection algorithm. The subsequent IBR aims to refine the registration through the estimation of a high order transformation. To perform the IBR over the multimodal images, a domain-adapted similarity metric is used. This adaptation consists in an enhancement of the vessel regions that transforms retinography and angiography to a common image space where the similarity metrics from monomodal scenarios can be employed. Experiments are conducted to evaluate the performance of the hybrid approach and the improvement over the independent application of the FBR and IBR methods.

2 Methodology

2.1 Landmarks-based Registration

The retinal vascular tree is a complex network of arteries and veins that branch and intersect frequently. The intersection points of the blood vessel segments are natural characteristic points of the retina and have proven to be a reliable biometric pattern [14]. These intersection points, consisting of vessel bifurcations and crossovers, are used as landmarks. The detection and matching of these domain-specific landmarks is performed following an approach proposed for retinal biometric authentication [14]. The original method was applied in a monomodal scenario with optic disc centered images to compute the similarity between different retinographies. The multimodal registration poses the reverse problem, as it is known that both images are from the same individual and the similarity between them must be maximized. This implies that a higher accuracy in the localization of the landmarks is needed. The mentioned method is adapted to detect landmarks in both retinography and angiography with modality-specific modifications.

Retinal images can be seen as landscapes where vessels appear as creases (ridges and valleys). In retinographies, the vessels are valleys in the landscape, while in angiographies they are ridges. Defining the images as level functions, valleys (or ridges) are formed at the points of negative minima (or positive maxima) of curvature. The local curvature minima and maxima are detected using the MLSEC-ST operator [15]. The vessel tree is given by the set of valleys (or ridges) for retinography (or angiography). The result is a binary image for each modality, consisting of one-pixel-wide vessel segments.

The obtained vessel tree is fragmented at some points. Discontinuities appear at crossovers and bifurcations, where vessels with different directions meet, and in the middle of a single vessel due to illumination and contrast variations of the image. Bifurcations and crossovers are detected by joining the segmented vessels as described in [16]. Bifurcations are established where a segment, extended up to a given maximum distance, intersects another segment. Crossovers, instead, are considered as double bifurcations. They are detected at positions where two bifurcations are closer than a given distance and the relative angle between their directions is below a certain threshold. Fig. 1 shows an example of a retinography/angiography pair together with the resulting vessel trees and landmarks.

Figure 1: Example of a multimodal image pair and result of the landmark detection method: a) retinography; b) angiography; c,d) binary vessel tree and detected landmarks for the retinography (c) and the angiography (d).
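The crossover rule described above (two nearby bifurcations with similar directions) can be illustrated with a minimal sketch. This is not the method of [16]: the function names, the representation of a bifurcation as an `(x, y, theta)` tuple, the treatment of directions as undirected orientations, the use of the midpoint as the crossover position, and the thresholds `max_dist` and `max_angle` are all assumptions made for the example.

```python
import math
from itertools import combinations


def angle_between(a, b):
    """Smallest angle (radians) between two undirected orientations."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def find_crossovers(bifurcations, max_dist, max_angle):
    """Pair bifurcations that are close together and similarly oriented.

    bifurcations: list of (x, y, theta) tuples, theta in radians.
    Returns one (x, y) midpoint per detected pair (hypothetical convention).
    """
    crossovers = []
    for (x1, y1, t1), (x2, y2, t2) in combinations(bifurcations, 2):
        close = math.hypot(x2 - x1, y2 - y1) < max_dist
        aligned = angle_between(t1, t2) < max_angle
        if close and aligned:
            crossovers.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0))
    return crossovers


# Two nearby, similarly oriented bifurcations and one isolated bifurcation.
points = [(10.0, 10.0, 0.10), (13.0, 10.0, 0.15), (100.0, 100.0, 0.10)]
crossovers = find_crossovers(points, max_dist=5.0, max_angle=0.2)
```

With these illustrative thresholds, only the first two bifurcations are merged into a crossover, and the isolated one is left as a plain bifurcation.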

This detection method yields a low number of suitable points and makes it possible to perform the transformation estimation immediately, without an additional computation of descriptors. Bifurcations and crossovers are used to estimate the transformation between image pairs with a RANSAC point matching algorithm. The applied transformation is a restricted form of affine transformation that only considers translation, rotation and isotropic scaling. Therefore, the transformation has 4 degrees of freedom and can be computed with only two pairs of matched points. For each previously detected bifurcation or crossover, the position and vessel orientation are known, as they are directly obtained from the detection method. These two characteristics are enough to perform the registration, without the need for a specific descriptor computation stage. The high specificity of the detection method leads to a low number of detected landmarks per image. Thus, it is practical to consider all possible matching pairs. The number of possibilities is further reduced by imposing a maximum and minimum scaling factor, which can be computed in advance as the ratio between the distance of two points in one image and the distance of any two points in the other image. Relative angles between points, derived from the vessel orientation, are also used as additional restrictions [14].
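To make the two-pair estimation concrete, the following sketch computes the 4-degree-of-freedom transformation (translation, rotation, isotropic scaling) from two point correspondences and then searches all candidate pairs, keeping the transformation with the most inliers. It is a simplified illustration and not the procedure of [14]: the function names, the inlier tolerance `tol`, the scale bounds and the synthetic points are assumptions, the orientation-based restrictions are omitted, and all points are assumed to be distinct.

```python
import numpy as np
from itertools import combinations, permutations


def similarity_from_two_pairs(p1, p2, q1, q2):
    """Return (scale, angle, translation) such that p1 -> q1 and p2 -> q2.

    Points are (x, y) pairs; the model is q = scale * R(angle) * p + t.
    """
    zp1, zp2 = complex(*p1), complex(*p2)
    zq1, zq2 = complex(*q1), complex(*q2)
    a = (zq2 - zq1) / (zp2 - zp1)   # a = scale * exp(i * angle)
    b = zq1 - a * zp1               # translation as a complex number
    return abs(a), float(np.angle(a)), (b.real, b.imag)


def apply_similarity(points, scale, angle, translation):
    """Apply the similarity transformation to an (N, 2) array of points."""
    z = np.asarray(points, dtype=float) @ np.array([1.0, 1.0j])
    w = scale * np.exp(1j * angle) * z + complex(*translation)
    return np.column_stack([w.real, w.imag])


def best_similarity(src, dst, min_scale, max_scale, tol):
    """Test every pair of candidate matches; keep the model with most inliers."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    best, best_inliers = None, -1
    for i, j in combinations(range(len(src)), 2):
        for k, l in permutations(range(len(dst)), 2):
            # Discard candidates whose implied scale is out of range.
            ratio = np.linalg.norm(dst[l] - dst[k]) / np.linalg.norm(src[j] - src[i])
            if not (min_scale <= ratio <= max_scale):
                continue
            model = similarity_from_two_pairs(src[i], src[j], dst[k], dst[l])
            mapped = apply_similarity(src, *model)
            # Distance from each mapped source point to its nearest target point.
            dists = np.linalg.norm(mapped[:, None, :] - dst[None, :, :], axis=2)
            inliers = int((dists.min(axis=1) < tol).sum())
            if inliers > best_inliers:
                best, best_inliers = model, inliers
    return best, best_inliers


# Synthetic check: recover a known transformation from unordered landmarks.
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 5.0], [7.0, 7.0]])
dst = apply_similarity(src, 1.2, 0.3, (5.0, -3.0))[::-1]
(scale, angle, translation), n_inliers = best_similarity(src, dst, 0.8, 1.5, 1e-6)
```

The exhaustive search is affordable only because the number of landmarks per image is small. With many points, a random sampling of candidate pairs would be needed instead.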

2.2 Intensity-based Registration

The registration accuracy of the proposed FBR method is limited by the complexity of the transformation considered and by the landmark localization precision. A refinement stage that considers high order transformations is proposed to improve the registration accuracy. In order to estimate higher order transformations, it is convenient to use an IBR approach that considers all the pixels of the image pairs. A new domain-adapted similarity metric is constructed by combining a vessel enhancement preprocessing with the normalized cross-correlation (NCC). The vessel enhancement transforms images from both modalities to a common image space where the NCC can be successfully employed. The whole operation is named VE-NCC.

The vessel enhancement is motivated by the fact that the vessels are present in both the retinography and the angiography in the form of tubular regions of low or high intensity values, respectively. These vessels vary in thickness throughout the image and can appear in any direction, which motivates the use of a multiscale analysis. A scale-space $L(\mathbf{x}; t)$ is defined as $L(\mathbf{x}; t) = G(\mathbf{x}; t) * f(\mathbf{x})$, where $f$ is the input image, $t$ is the scale parameter and $G(\mathbf{x}; t)$ is a Gaussian kernel [17]. The enhancement of the vessel regions is performed using the Laplace operator $\nabla^2$. The Laplacian image, $\nabla^2 L(\mathbf{x}; t)$, has a high response at positions near the image edges, like those at the vessel boundaries. The distance from the Laplacian peaks to the edges depends on the scale used to compute $L(\mathbf{x}; t)$. The vessel centerlines achieve maximum response at the scales where the peaks from both vessel boundaries concur. The scale parameter $t$, therefore, controls the scale of the vessels to enhance. The normalized Laplacian scale-space is defined as:

$$\nabla^2_{norm} L(\mathbf{x}; t) = t^{\gamma} \, \nabla^2 L(\mathbf{x}; t)$$

where $t^{\gamma}$ is a normalization factor. A property of the scale-space representation is that the amplitude of spatial derivatives decreases with the scale [17]. The normalization factor allows the comparison and combination of the magnitude at different scales. Finally, the maximum value across scales for every point is computed as:

$$V(\mathbf{x}) = \max_{t} \left\lfloor s \, \nabla^2_{norm} L(\mathbf{x}; t) \right\rfloor_{+}$$

where $s = -1$ for angiography and $s = 1$ for retinography, and $\lfloor \cdot \rfloor_{+}$ denotes halfwave rectification. The rectification is used to discard the negative Laplacian peaks outside the vessel regions, so that only the vessel interiors are represented in the enhanced images. This results in a common representation for retinography and angiography, with enhanced vessel regions and the same intensity level pattern. Fig. 2 shows the result of the vessel enhancement operation applied to the retinography/angiography pair from Fig. 1.

Figure 2: Example of the vessel enhancement operation applied to a multimodal image pair: a) retinography; b) angiography.
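As an illustration of the enhancement described above, the following minimal sketch computes a rectified multiscale Laplacian with SciPy. It is not the authors' implementation: the function name `enhance_vessels`, the list of scales, the synthetic test images and the choice of $\sigma^2$ as normalization factor (with $\sigma$ the standard deviation of the Gaussian kernel) are assumptions made for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace


def enhance_vessels(image, sigmas, sign):
    """Maximum over scales of the rectified, scale-normalized Laplacian.

    sign = +1 for dark vessels on a bright background (retinography),
    sign = -1 for bright vessels on a dark background (angiography).
    """
    image = np.asarray(image, dtype=float)
    enhanced = np.zeros_like(image)
    for sigma in sigmas:
        # Assumed normalization: multiply the Laplacian by sigma squared.
        laplacian = (sigma ** 2) * gaussian_laplace(image, sigma)
        # Halfwave rectification keeps only the vessel interiors.
        rectified = np.maximum(sign * laplacian, 0.0)
        enhanced = np.maximum(enhanced, rectified)
    return enhanced


# Synthetic pair: a 5-pixel-wide vertical vessel in both "modalities".
retino = np.ones((41, 41))
retino[:, 18:23] = 0.0          # dark vessel on a bright background
angio = 1.0 - retino            # bright vessel on a dark background

sigmas = [1.0, 2.0, 3.0]
enh_retino = enhance_vessels(retino, sigmas, sign=+1)
enh_angio = enhance_vessels(angio, sigmas, sign=-1)
```

On this synthetic pair, both enhanced images show a high response along the vessel centerline and a near-zero response elsewhere, so they are approximately equal despite the inverted contrast of the inputs.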

The transformation between images is obtained through minimization of the negative VE-NCC. It is important to initialize the algorithm with a proper initial transformation, so the transformation estimated by the FBR serves as initialization for the IBR. Two different transformation models are considered to perform the IBR: Affine Transformation (AT) and Free Form Deformation (FFD). AT allows translation, rotation, anisotropic scaling and shearing, having 6 degrees of freedom. In contrast, FFD uses a grid of control points that are moved individually along the image to define a high order transformation.
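The cost function of the affine refinement can be sketched as follows. The example evaluates the negative NCC between a fixed image and a moving image warped with a 6-parameter affine map. It is an assumption-laden illustration: the function names and the parameter layout are hypothetical, the inputs stand in for vessel-enhanced images, and the source does not specify the optimizer or the interpolation used.

```python
import numpy as np
from scipy.ndimage import affine_transform


def ncc(a, b):
    """Normalized cross-correlation between two images of equal shape."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def affine_cost(params, fixed, moving):
    """Negative NCC after warping `moving` with a 6-parameter affine map.

    params = (a11, a12, a21, a22, t1, t2). Following the convention of
    scipy.ndimage.affine_transform, the matrix and offset map coordinates
    (row, col) of the output image to coordinates of `moving`.
    """
    params = np.asarray(params, dtype=float)
    matrix = params[:4].reshape(2, 2)
    offset = params[4:]
    warped = affine_transform(np.asarray(moving, dtype=float), matrix,
                              offset=offset, order=1, mode="nearest")
    return -ncc(fixed, warped)


# Toy data standing in for a pair of vessel-enhanced images.
rng = np.random.default_rng(0)
fixed = rng.random((32, 32))
identity = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
shifted = [1.0, 0.0, 0.0, 1.0, 5.0, 0.0]

cost_identity = affine_cost(identity, fixed, fixed)
cost_shifted = affine_cost(shifted, fixed, fixed)
```

A generic numerical optimizer could minimize `affine_cost` starting from the parameters estimated by the FBR. The random-noise example also shows why the initialization matters: a shift of a few pixels is enough to move the cost far from its minimum.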

3 Experiments and Results

For the evaluation of the proposed methodology, we used the publicly available Isfahan MISP dataset of retinography and angiography images of diabetic patients [18]. This dataset consists of 59 image pairs divided in two collections of healthy and pathological cases. The pathological cases correspond to patients with mild and moderate retinal diseases due to diabetic retinopathy. The images have a resolution of 720 × 576 pixels. The division of the dataset in healthy and pathological cases makes it possible to analyze the effect of the pathologies on the registration performance.

Several experiments were conducted to evaluate the hybrid methodology as well as the performance of the FBR and IBR methods. Regarding the IBR method, both the affine transformation (IBR-AT) and the free form deformation (IBR-FFD) variations were applied. We propose the hybrid method formed by the sequential application of FBR, IBR-AT and IBR-FFD, and alternative variations obtained by removing one or two steps at a time. This results in 7 different methods, as reported in Table 1. The table shows the average and standard deviation of the VE-NCC for each method in healthy and pathological cases. Fig. 3 depicts the cumulative distribution of the VE-NCC values. The best results are achieved by the proposed hybrid method. There is a large difference between the experiments that perform the initial FBR and the ones that directly apply IBR. For the latter experiments the registration failed in most cases. Most of the image pairs do not significantly change their VE-NCC values by applying IBR alone, and only a few of them obtained values over the minimum that was achieved by the initial FBR. These results indicate that, with the use of IBR and high order transformations, more accurate registrations can be achieved. However, they also evidence the importance of a proper initialization for the convergence of the optimization algorithm, which is provided by the initial FBR. Moreover, the IBR-FFD also benefits from the previous IBR-AT, as the order of the applied transformation directly fixes the search space dimensionality, increasing the complexity of the optimization.
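The count of 7 methods follows from taking every non-empty subset of the three steps while keeping their sequential order. A short sketch of this enumeration is given below. The variable names are illustrative only.

```python
from itertools import combinations

# The three steps of the hybrid method, in their order of application.
steps = ("FBR", "IBR-AT", "IBR-FFD")

# Full pipeline first, then the variants with one and with two steps removed.
configurations = [
    subset
    for size in range(len(steps), 0, -1)
    for subset in combinations(steps, size)
]

for configuration in configurations:
    print(" + ".join(configuration))
```

Since `combinations` preserves the input order, each variant applies its remaining steps in the same sequence as the full method.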

Healthy cases (mean ± std)    Pathological cases (mean ± std)
0.6123 ± 0.0815               0.4758 ± 0.1419
0.5980 ± 0.0865               0.4661 ± 0.1406
0.5668 ± 0.0828               0.4401 ± 0.1381
0.5266 ± 0.0928               0.3961 ± 0.1416
0.0673 ± 0.0500               0.0930 ± 0.1065
0.0733 ± 0.0627               0.1005 ± 0.1250
0.0581 ± 0.0323               0.0656 ± 0.0497
0.0481 ± 0.0159               0.0518 ± 0.0220
Table 1: Average and standard deviation VE-NCC for the different configurations tested.
Figure 3: Cumulative distribution of the VE-NCC: a) healthy cases; b) pathological cases.

Additionally, we performed a more in-depth analysis of the effect of the different steps in the proposed hybrid configuration. Fig. 4 shows scatter plots of the VE-NCC values before and after each step of the hybrid method for both healthy and pathological cases. It is observed that the biggest contribution comes from the initial FBR. The improvement decreases with each step, as only minor adjustments of the estimated transformation are required. The presence of pathologies in the images does not affect the general behaviour of the proposed hybrid method, as similar conclusions can be drawn from both sets of scatter plots. However, the average VE-NCC values are slightly lower for the pathological cases, while the variance is slightly higher. This may indicate a slight influence of the pathological structures on the VE-NCC, as the optimum value is not necessarily the same for every image pair.

Figure 4: Step-by-step analysis of the proposed hybrid method: a,b,c) healthy cases; d,e,f) pathological cases.

4 Conclusions

The joint analysis of color fundus retinography and fluorescein angiography usually requires the registration of the images. In this work, a hybrid method for the multimodal registration of pairs of retinographies and angiographies is presented. Domain-specific solutions, exploiting the presence of the retinal vasculature in both image modalities, were proposed for both the feature-based and intensity-based registration steps that constitute the hybrid proposal. The use of a domain-adapted similarity metric allows the estimation of high order transformations that increase the accuracy of the registration. At the same time, an accurate registration is only feasible when starting from the initial registration with domain-specific landmarks. Different experiments were conducted to validate the suitability of the proposed method and to evaluate the contribution of each registration step. The results demonstrated that the hybrid method outperforms the individual application of each of its constituent approaches.


This work is supported by I.S. Carlos III, Government of Spain, and the ERDF of the EU through the DTS15/00153 research project, and by the MINECO, Government of Spain, through the DPI2015-69948-R research project. The authors of this work also receive financial support from the ERDF and ESF of the EU, and the Xunta de Galicia through Centro Singular de Investigación de Galicia, accreditation 2016-2019, ref. ED431G/01 and Grupo de Referencia Competitiva, ref. ED431C 2016-047 research projects, and the predoctoral grant contract ref. ED481A-2017/328.


  1. Oliveira, F., Tavares, J.: Medical image registration: A review. Computer methods in biomechanics and biomedical engineering 17 (01 2014) 73–93
  2. Chen, J., Tian, J., Lee, N., Zheng, J., Smith, R.T., Laine, A.F.: A partial intensity invariant feature descriptor for multimodal retinal image registration. IEEE Transactions on Biomedical Engineering 57(7) (July 2010) 1707–1718
  3. Yang, G., Stewart, C.V., Sofka, M., Tsai, C.L.: Registration of challenging image pairs: Initialization, estimation, and decision. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(11) (Nov 2007) 1973–1989
  4. Tsai, C.L., Li, C.Y., Yang, G., Lin, K.S.: The edge-driven dual-bootstrap iterative closest point algorithm for registration of multimodal fluorescein angiogram sequence. IEEE Transactions on Medical Imaging 29(3) (March 2010) 636–649
  5. Wang, G., Wang, Z., Chen, Y., Zhao, W.: Robust point matching method for multimodal retinal image registration. Biomedical Signal Processing and Control 19 (2015) 68–76
  6. Ghassabi, Z., Sedaghat, A., Shanbehzadeh, J., Fatemizadeh, E.: An efficient approach for robust multimodal retinal image registration based on UR-SIFT features and PIIFD descriptors. EURASIP J. Image and Video Processing 2013 (2013) 25
  7. Ma, J., Jiang, J., Chen, J., Liu, C., Li, C.: Multimodal retinal image registration using edge map and feature guided gaussian mixture model. In: 2016 Visual Communications and Image Processing (VCIP). (Nov 2016) 1–4
  8. Lee, J.A., Cheng, J., Lee, B.H., Ong, E.P., Xu, G., Wong, D.W.K., Liu, J., Laude, A., Lim, T.H.: A low-dimensional step pattern analysis algorithm with application to multimodal retinal image registration. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (June 2015) 1046–1053
  9. Hernandez, M., Medioni, G., Hu, Z., Sadda, S.: Multimodal registration of multiple retinal images based on line structures. In: 2015 IEEE Winter Conference on Applications of Computer Vision. (Jan 2015) 907–914
  10. Laliberté, F., Gagnon, L., Sheng, Y.: Registration and fusion of retinal images-an evaluation study. IEEE Transactions on Medical Imaging 22(5) (May 2003) 661–673
  11. Legg, P.A., Rosin, P.L., Marshall, D., Morgan, J.: Improving accuracy and efficiency of mutual information for multi-modal retinal image registration using adaptive probability density estimation. Computerized Medical Imaging and Graphics 37 (2013) 597–606
  12. Chanwimaluang, T., Fan, G., Fransen, S.R.: Hybrid retinal image registration. IEEE Transactions on Information Technology in Biomedicine 10(1) (Jan 2006) 129–142
  13. Kolar, R., Harabis, V., Odstrcilik, J.: Hybrid retinal image registration using phase correlation. The Imaging Science Journal 61 (05 2013) 369–384
  14. Ortega, M., Penedo, M.G., Rouco, J., Barreira, N., Carreira, M.J.: Retinal verification using a feature points-based biometric pattern. EURASIP Journal on Advances in Signal Processing 2009(1) (Mar 2009) 235746
  15. López, A.M., Lloret, D., Serrat, J., Villanueva, J.J.: Multilocal creaseness based on the level-set extrinsic curvature. Computer Vision and Image Understanding 77(2) (2000) 111–144
  16. Ortega, M., Rouco, J., Novo, J., Penedo, M.G.: Vascular landmark detection in retinal images. In Moreno-Díaz, R., Pichler, F., Quesada-Arencibia, A., eds.: Computer Aided Systems Theory - EUROCAST 2009, Berlin, Heidelberg, Springer Berlin Heidelberg (2009) 211–217
  17. Lindeberg, T.: Edge detection and ridge detection with automatic scale selection. International Journal of Computer Vision 30(2) (Nov 1998) 117–156
  18. Golmohammadi, H., Kashefpur, M., Kafieh, R., Jorjandi, S., Khodabande, Z., Abbasi, M., Akbar Fakharzadeh, A., Kashefpoor, M., Rabbani, H.: Isfahan MISP dataset. Journal of Medical Signals and Sensors 7(1) (2017) 43–48