Fractal Dimension Invariant Filtering and Its CNN-based Implementation


Hongteng Xu, Junchi Yan, Nils Persson, Weiyao Lin, Hongyuan Zha
School of ECE, CSE, Chemical & Biomolecular Engineering, Georgia Tech
IBM Research – China, Department of EE, Shanghai Jiao Tong University
{hxu42, npersson3}@gatech.edu, yanjc@cn.ibm.com, wylin@sjtu.edu.cn, zha@cc.gatech.edu
Abstract

Fractal analysis has been widely used in computer vision, especially in texture image processing and texture analysis. The key concept of fractal-based image models is the fractal dimension, which is invariant to bi-Lipschitz transformations of an image and is thus capable of representing intrinsic structural information of an image robustly. However, the invariance of the fractal dimension generally does not hold after filtering, which limits the application of fractal-based image models. In this paper, we propose a novel fractal dimension invariant filtering (FDIF) method, extending the invariance of the fractal dimension to filtering operations. Utilizing the notion of local self-similarity, we first develop a local fractal model for images. By adding a nonlinear post-processing step after an anisotropic filter bank, we demonstrate that the proposed filtering method is capable of preserving the local fractal dimension of an image. Meanwhile, we show that the FDIF method can be re-instantiated approximately via a CNN-based architecture, where the convolution layer extracts the anisotropic structure of the image and the nonlinear layer enhances the structure by preserving the local fractal dimension of the image. The proposed filtering method provides us with a novel geometric interpretation of CNN-based image models. Focusing on a challenging image processing task — detecting complicated curves from texture-like images — the proposed method obtains results superior to state-of-the-art approaches.

1 Introduction

Many complex natural scenes can be modeled as fractals [21, 29]. In the field of computer vision, fractal analysis has been proven to be a useful tool for modeling textures, and many methods have been developed. Taking textures as fractals, the work in [39] learns local fractal dimensions and lengths as features for classifying textures. Similarly, the work in [45] learns the spectrum of fractal dimension as texture features via the box-counting method [10]. All of these methods treat the fractal dimension as the key concept of the fractal-based image model because the fractal dimension is invariant to bi-Lipschitz transformations. This property means that the fractal dimension is robust to geometric deformations (e.g., rigid and non-rigid transformations) of an image. Hence, the fractal dimension reflects intrinsic structural information of an image, which can be treated as a representative image feature.

(a) Iterative FDIF
(b) CNN-based Implementation
(c) Illustration of the FDIF-based Curve Detector
Figure 1: Given real-world noisy images (i.e., material images) containing complicated curves, we apply the proposed iterative FDIF method in (a) to detect curves. FDIF can be efficiently and approximately re-instantiated via a CNN, as shown in (b). The FDIF-based curve detector is illustrated in (c).

Unfortunately, the fractal dimension of an image is not preserved after filtering, which might lead to the loss of structural information. A typical example is the interpolation of a digital image, where the result can be viewed as low-pass filtering of the ground truth. The low-pass filtering suppresses the high-resolution details of the image and thus leads to the loss of structural information. The work in [44] shows that the fractal dimension of an interpolated image is smaller than that of a real high-resolution image. However, the recent development of deep convolutional neural networks (CNNs) shows that stacked nonlinear filtering models are very suitable for learning image features, with a capability of extracting structural and semantic information of images robustly. Many CNN-based methods have been proposed to deal with various tasks, e.g., image classification [16], texture analysis [4], and contour detection [49]. In other words, for extracting representative image features, filtering operations are instrumental in CNNs yet detrimental to fractal-based methods. Given these two seemingly contradictory phenomena, the following two problems arise: 1) Can we propose a filtering method preserving the invariance of the fractal dimension? 2) Is there any connection between fractal-based image models and CNNs, especially for unsupervised feature learning?

In this paper, we give positive answers to these two problems. We propose a fractal dimension invariant filtering (FDIF) method and use a CNN-based architecture to re-instantiate it. This work provides us with a geometric interpretation of CNNs based on local fractal analysis of images. The proposed method obtains encouraging curve detection results for texture-like images, superior to those of competing methods. As Fig. 1(a) shows, we give a local fractal model of images and propose a curve detector under an iterative FDIF framework. In each iteration, we take patches of the image as local fractals and compute their fractal dimensions accordingly. An anisotropic filter is designed for each patch of the image according to the analysis of the gradient field, and the filtering result is further enhanced by preserving fractal dimensions across different measurements. Inspired by the iterative filtering strategy in [23], we apply the steps above repeatedly to obtain the features of curves, and detect curves via unsupervised (i.e., thresholding) or supervised (i.e., logistic regression) methods. In particular, we demonstrate that such a pipeline can be implemented via a CNN-based architecture, as shown in Fig. 1(b). This CNN is interpretable from a geometric viewpoint — the convolution layer corresponds to an anisotropic filter bank, while the nonlinear layer approximately preserves local fractal dimensions. Applying the backpropagation algorithm in the supervised case and predefined parameters (filters) in the unsupervised case, we achieve encouraging curve detection results.

As Fig. 1(c) shows, the principle of our FDIF-based curve detector is to preserve local fractal dimensions by adjusting the measurement of the fractal (i.e., the image itself). Generally, the measurement obtained via anisotropic filtering is smoothed. To preserve local fractal dimensions, we apply the nonlinear processing and obtain a new measurement, where the sharpness of curves is enhanced while the sharpness of the remaining regions is suppressed. As a result, the FDIF method provides us with a better representation of curves.

We test our method on a collected atomic force microscopy (AFM) image set, detecting complicated curves of materials from AFM images. Experimental results show that our method is promising in most situations, especially in noisy and texture-like cases, obtaining results superior to those of existing curve detectors. Overall, the contributions of our work are mainly in three aspects. First, to the best of our knowledge, our work is the first attempt to propose a fractal dimension invariant filtering method and connect it with CNNs. It is also, to our knowledge, the first attempt to interpret CNNs from a (fractal) geometry perspective. Second, our method connects traditional handcrafted filter-based curve detectors with a CNN architecture. It bridges the gap between filter-based curve detectors and learning-based, especially CNN-based, ones. This connection also allows us to instantiate a new predefined CNN that can work in an unsupervised setting, different from most of its peers known for their ravenous appetite for labeled data. Third, we demonstrate a meaningful interdisciplinary application of our curve detector in computational material science. A material informatics image dataset is collected and will be released with this paper for future public research.

2 Related Work

Fractal Analysis: Fractal-based image models have been widely used to solve many problems of computer vision, including texture analysis [31], bio-medical image processing [40], and image quality assessment [46]. The local fractal analysis method in [39] and the spectrum of fractal dimension in [52, 45] take advantage of the bi-Lipschitz invariance property of the fractal dimension for texture classification, whose features are very robust to deformations and scale changes of textures. Because local self-similarity is ubiquitous in images, both within and across scales [15, 12], natural images can also be modeled as fractals locally [21, 29]. Recently, the fractal model of natural images has been applied to image super-resolution [44, 50], where local fractal analysis is used to enhance image gradients adaptively. In [40], a fractal-based dissimilarity measurement is proposed to analyze MRI images. However, because the invariance of the fractal dimension does not hold after filtering, it is difficult to merge fractal analysis into other image processing methods.

Convolutional Neural Networks: CNNs have been widely used to extract visual features from images, with many successful applications. In recent years, this useful tool has been introduced into many low- and middle-level vision problems, e.g., image reconstruction [41, 5], super-resolution [8], dynamic texture synthesis [47], and contour detection [42, 49]. Currently, the physical meanings of different CNN modules are not fully understood. For example, the nonlinear layer of a CNN, i.e., the rectified linear unit (ReLU), and its output are often mysterious. Many attempts have been made to comprehend CNNs in depth. Several existing feature extraction methods have been proven to be equivalent to deep CNNs, such as deformable part models in [14] and random forests in [27]. A pre-trained deep learning model called the scattering convolution network (SCN) is proposed in [20, 4, 25]. This model consists of hierarchical wavelet transformations and translation-invariant operators, which explains deep learning from the viewpoint of signal processing. However, none of these methods discuss a geometric explanation of CNNs from the viewpoint of fractal analysis.

Curve Detection: Curve detection is a potential application of fractal-based image processing in many practical tasks, such as power line detection [19], geological measurement [26], and rigid body detection [28]. More recently, curve detection techniques have been introduced into interdisciplinary fields, e.g., materials, biology, and nanotechnology [48, 36, 17]. Surprisingly, although in the following section we show that fractal-based image models are very suitable for the problem of curve detection, very few existing methods apply fractal analysis to solve it. Taking advantage of the directionality of curves, early curve detectors are based on diverse transformations, including the Hough transform [9], curvelets [35], and wave atoms [43]. Besides direction, the multiscale property of curves is considered by applying the multiscale Fourier transformation [6], Frangi filtering [11], and the scale-space distance transformation [34]. Focusing on curve and line segment detection, the parameterless fitting model proposed in [28] achieves the state-of-the-art. These methods principally construct a filter bank and detect locally strong responses to certain directions. Beyond these manually-designed methods, learning-based approaches have become popular as huge amounts of labeled images have become available [1, 51]. Focusing on edge detection, a problem related to curve detection, the structured forest-based detector [7] and CNN-based detectors [33, 2, 42, 32] have been proposed. These methods learn their parameters on large datasets and thus have a powerful generalization ability for challenging cases. However, most of the existing methods aim to detect sparse curves from a relatively smooth background. Few of them can detect complicated curves from texture-like images.

3 Fractal Dimension Invariant Filtering

In this section, we introduce our fractal-based image model and show the derivation of local fractal dimension. According to the model, we propose an iterative fractal dimension invariant filtering method, which preserves local fractal dimensions of patches across various measurements in the phase of feature extraction.

3.1 Fractal-based Image Model

As shown in Fig. 2, a typical fractal is generated by transforming a geometry into several analogues with a fixed scaling factor and then applying the transformation recursively to each analogue. The union of the analogues is a fractal, denoted as $E$. The fractal is a “mathematical monster” that is unmeasurable in the ordinary measure space of its embedding dimension. Therefore, the analysis of fractals is mainly based on the Hausdorff measure [21], which gives rise to the concept of fractal dimension. The fractal dimension is defined by a power law of measurements across multiple scales, i.e., the quantities $\mu(B(x,r)) \propto r^{D}$. Here $D$ is called the fractal dimension, which is larger than the topological dimension of $E$.

Figure 2: Transforming each line segment into $4$ analogues with scaling factor $1/3$ repeatedly, we obtain a fractal, the so-called Von Koch curve, with $D = \log 4/\log 3 \approx 1.26$.

In our work, an image is represented via a function of pixels, denoted as $J:\Omega\to\mathbb{R}$. Here $\Omega\subset\mathbb{R}^{2}$ is the union of the coordinates of pixels. Each pixel coordinate is denoted as $x=(x_{1},x_{2})$. We propose a fractal-based image model, representing $\Omega$ as a union of local fractals and the image as $(E,\mu)$, where $\mu$ is a measurement supported on the fractal set $E$. According to the power law of measurements mentioned above, for each pixel $x$ we have $\mu(B(x,r))\propto r^{d(x)}$, where $B(x,r)$ is a ball centered at $x$ with radius $r$ and $d(x)$ is the local fractal dimension at $x$ under the measurement $\mu$. Here, we use the intensities of pixels as the measurement directly, so the local fractal dimension at $x$ is

$d(x) = \lim_{r\to 0} \frac{\log \mu(B(x,r))}{\log r} = \lim_{r\to 0} \frac{\log\,(G_{r}*J)(x)}{\log r}$,   (1)

where $\mu(B(x,r)) = (G_{r}*J)(x)$, $G_{r}$ is a Gaussian kernel with scale $r$ as defined in [45, 44], and “$*$” indicates the valid convolution.

In practice, we estimate the local fractal dimension in (1) numerically by linear regression. Specifically, we calculate sample pairs $\{(\log r_{s},\,\log \mu_{r_{s}}(x))\}_{s=1}^{S}$ by multiscale Gaussian filtering, and learn a linear model $\log \mu_{r_{s}}(x) = d(x)\log r_{s} + \log c(x)$ for all $s$ according to (1). Here $c(x)$ is the value of the measurement in the unit ball ($r=1$), which is interpreted as the $d(x)$-dimensional fractal length in [39]. Algorithm 1 gives the scheme of fractal dimension estimation.

1:  Input: image $J$, the number of scales $S$.
2:  Output: fractal dimension $d(x)$ and fractal length $c(x)$.
3:  For $s=1,\dots,S$, perform a convolution of $J$ with $G_{r_{s}}$ to get $\mu_{r_{s}}(x) = (G_{r_{s}}*J)(x)$.
4:  Set $y_{s}(x) = \log \mu_{r_{s}}(x)$ and $t_{s} = \log r_{s}$ for $s=1,\dots,S$.
5:  Fit $(d(x), \log c(x))$ by least-squares linear regression of $\{y_{s}(x)\}$ against $\{t_{s}\}$.
Algorithm 1 Fractal Dimension Estimation
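As a concrete sketch of Algorithm 1, the multiscale measurement and per-pixel regression can be written in a few lines of NumPy. This is a simplified sketch under stated assumptions: the ball measurement $\mu_{r}(x)$ is approximated by a box sum of intensities in a $(2r+1)\times(2r+1)$ window instead of Gaussian filtering, and the function name and default radii are ours.

```python
import numpy as np

def local_fractal_dimension(J, radii=(1, 2, 3, 4)):
    """Sketch of Algorithm 1: estimate the local fractal dimension d(x).

    The measurement mu_r(x) is approximated by the sum of intensities
    in a (2r+1)x(2r+1) box around x; d(x) is the least-squares slope
    of log mu_r(x) against log r."""
    H, W = J.shape
    log_mu = []
    for r in radii:
        pad = np.pad(J, r, mode="edge")
        mu = np.zeros((H, W), dtype=float)
        for dy in range(-r, r + 1):          # box-sum "measurement"
            for dx in range(-r, r + 1):
                mu += pad[r + dy:r + dy + H, r + dx:r + dx + W]
        log_mu.append(np.log(mu + 1e-12))
    log_mu = np.stack(log_mu)                # shape (S, H, W)
    t = np.log(np.asarray(radii, dtype=float))
    t_c = t - t.mean()
    # per-pixel least-squares slope of log mu_r against log r
    d = np.tensordot(t_c, log_mu - log_mu.mean(axis=0), axes=(0, 0)) / (t_c ** 2).sum()
    return d
```

On a constant image the estimate is spatially uniform, while a thin bright line yields a lower local dimension than a uniformly bright region, since its measurement mass grows roughly linearly rather than quadratically in $r$.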

The local fractal dimension contains important structural information of an image: e.g., smooth patches have fractal dimensions close to $2$, patches containing curves have fractal dimensions close to $1$, and textures have fractal dimensions between $1$ and $2$ [10, 44]. For detecting structures, e.g., curves, in images robustly, the fractal dimension shall be preserved. One fundamental property of the fractal dimension is its invariance to bi-Lipschitz transformations, shown in Theorem 1:

Theorem 1.

Bi-Lipschitz Invariance. For a fractal $E$ with fractal dimension $D_{E}$, its bi-Lipschitz transformation $T(E)$ is still a fractal, whose fractal dimension $D_{T(E)} = D_{E}$.

Recalling (1), we can find that the fractal dimension is not unique; it depends on the choice of the measurement $\mu$. The theorem holds because a bi-Lipschitz transformation (i.e., a geometric transformation or non-rigid deformation of the image) does not change the measurement of the fractal, as revealed by the proof in the appendix.

However, after filtering or convolution, the invariance of the fractal dimension no longer holds. For example, if we change the convolution kernel in (1), the measurement of the fractal and the associated fractal dimension will change accordingly. Therefore, we cannot find a filter ensuring that the fractal dimension of the filtering result is exactly the same as that of the original image.

To pursue the fractal dimension preservation philosophy in the face of the reality that filtering inevitably changes the fractal dimension, we aim to suppress the expected change between the original fractal dimension and the filtered one. Denote the proposed filter as $f$, and the measurement and the fractal dimension of the filtering result as $\hat{\mu}$ and $\hat{d}(x)$, respectively. We assume that the filter $f$ is a random variable obeying a probabilistic distribution. According to (1), we have

$\mathbb{E}[\hat{d}(x)] = \mathbb{E}\Big[\lim_{r\to 0} \frac{\log\,(G_{r}*f*J)(x)}{\log r}\Big] \approx \lim_{r\to 0} \frac{\log\,(G_{r}*\mathbb{E}[f]*J)(x)}{\log r}$,   (2)

where $\mathbb{E}[\cdot]$ computes the expectation of a random variable. Obviously, to minimize the expected change between $d(x)$ and $\hat{d}(x)$, the expectation of the filter, $\mathbb{E}[f]$, should be as close to an impulse function as possible.

3.2 Iterative FDIF Framework

Motivated by the analysis above, we propose the following iterative FDIF method as detailed in Fig. 1(a).

Anisotropic Filtering: To suppress the change of the fractal dimension, the expectation of the filter shall be as close as possible to an impulse function. Anisotropic filters are a natural choice for this purpose. Take directional filtering [30] as an example: for each pixel $x$, compute the smoothed gradients in its neighborhood $\mathcal{N}(x)$ and collect them as $V(x) = \frac{1}{|\mathcal{N}(x)|}\sum_{y\in\mathcal{N}(x)} g(y)g(y)^{\top}$, where $g(y) = \mathrm{vec}\big((G*\partial_{h}J)(y),\,(G*\partial_{v}J)(y)\big)$. Here $G$ is a Gaussian filter, $|\mathcal{N}(x)|$ is the cardinality of the neighborhood, $\partial_{h}$ ($\partial_{v}$) is the partial differential operator along the horizontal (vertical) direction, and $\mathrm{vec}(\cdot)$ denotes vectorization. The eigenvector corresponding to the largest eigenvalue of $V(x)$, with angle denoted as $\theta(x)$, indicates the direction information of $x$. Such a direction field of the image induces a series of directional filters in the polar coordinate system, denoted as $\{f_{\theta}\}$, whose elements satisfy

$f_{\theta(x)}(\rho,\phi) = \frac{1}{Z}\exp\Big(-\frac{\rho^{2}\cos^{2}(\phi-\theta(x))}{2\sigma_{1}^{2}} - \frac{\rho^{2}\sin^{2}(\phi-\theta(x))}{2\sigma_{2}^{2}}\Big),\quad \sigma_{1}\gg\sigma_{2}$,   (3)

where $(\rho,\phi)$ are polar coordinates, $Z$ is a normalization constant, and $\sigma_{1}$ ($\sigma_{2}$) controls the extent of the filter along (across) the direction $\theta(x)$.

Obviously, the filtering result at $x$ has the strongest response when the local structure is aligned with $\theta(x)$. The directional filters satisfy the following proposition:

Proposition 2.

If the distribution of the pixels’ directions $\theta(x)$ is uniform on $[0,\pi)$, then the expected value of the filters in (3) is an impulse function $c\,\delta$, where $c$ is a constant determined by the normalization of the filters.

The proof is given in the appendix. Fig. 3 visualizes several typical directional filters and their mean (rightmost), which further verifies the proposition. Recalling (2), as long as the distribution of directions in the direction field of the image is uniform, the proposition indicates that the proposed filters tend to preserve the expected value of the fractal dimension after filtering.

Figure 3: The illustration of the $f_{\theta_{k}}$'s with uniformly sampled directions $\theta_{k}$. The average of the filters (the last one) is close to an impulse function.
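To illustrate Proposition 2 numerically, one can average a small bank of oriented filters and check that the mean concentrates at the center. In the sketch below, the directional filter is an anisotropic Gaussian elongated along $\theta$ — an assumed stand-in for the filters of (3) — and the sizes and scales are hypothetical.

```python
import numpy as np

def oriented_filter(theta, size=9, s_long=3.0, s_short=0.6):
    """An anisotropic Gaussian elongated along direction theta
    (an assumed stand-in for the directional filters of Eq. (3))."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    u = x * np.cos(theta) + y * np.sin(theta)    # along the filter axis
    v = -x * np.sin(theta) + y * np.cos(theta)   # across the filter axis
    f = np.exp(-u ** 2 / (2 * s_long ** 2) - v ** 2 / (2 * s_short ** 2))
    return f / f.sum()                           # normalize to unit mass

# average the bank over uniformly sampled directions in [0, pi)
K = 16
bank = [oriented_filter(k * np.pi / K) for k in range(K)]
mean_f = np.mean(bank, axis=0)   # peaked at the center, close to an impulse
```

The averaged filter is radially symmetric and sharply peaked at the origin, which is the impulse-like behavior the proposition describes.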

Nonlinear Post-processing: Anisotropic filtering prevents the expected fractal dimension from changing globally. Furthermore, we propose a transformation to preserve the local fractal dimensions of the filtering result $\hat{J} = f*J$. In particular, although the local fractal dimension $\hat{d}(x)$ under the measurement $\hat{\mu}$ is not equal to the original $d(x)$ under $\mu$, we can apply a transformation to $\hat{J}$ such that the fractal dimension under the new measurement $\mu'$, denoted as $d'(x)$, is equal to $d(x)$. According to the definition of fractal dimension in (1) and the linear relationship $\log \mu_{r}(x) = d(x)\log r + \log c(x)$ given by Algorithm 1, it is easy to find that the proposed transformation should be $\hat{J} \mapsto \hat{J}^{\gamma(x)}$, where $\gamma(x) = d(x)/\hat{d}(x)$. In this situation, we have

$\mu'(B(x,r)) = \big(G_{r}*\hat{J}^{\gamma(x)}\big)(x) \approx \big((G_{r}*\hat{J})(x)\big)^{\gamma(x)} \propto r^{\gamma(x)\hat{d}(x)} = r^{d(x)}$.
In other words, the local fractal dimension under the new measurement is $d'(x) = \gamma(x)\hat{d}(x) = d(x)$. Then we apply the transformation directly to the filtering result such that the local fractal dimension is preserved under the new measurement. At each $x$, we have

$J'(x) = \Big(\frac{\hat{J}(x)}{(f_{m}*\hat{J})(x)}\Big)^{\gamma(x)} (f_{m}*\hat{J})(x)$,   (4)

where $f_{m}$ is a mean filter over the neighborhood of $x$.

Here the normalization term $(f_{m}*\hat{J})(x)$ preserves the energy of the filtering result; the transformation merely changes the fractal length.
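A minimal NumPy sketch of this post-processing step follows, assuming a box mean for the normalization term and a fixed exponent gamma (in FDIF proper, $\gamma(x) = d(x)/\hat{d}(x)$ is estimated per pixel via Algorithm 1); the function name is ours.

```python
import numpy as np

def fd_preserving_transform(J_hat, gamma=0.5, radius=2):
    """Sketch of the Eq. (4) post-processing: rescale the filtered
    image J_hat so that its local fractal dimension matches the
    original one. The local mean S(x) acts as the energy-preserving
    normalization term; gamma plays the role of d(x)/d_hat(x)."""
    H, W = J_hat.shape
    pad = np.pad(J_hat, radius, mode="edge")
    S = np.zeros((H, W), dtype=float)
    for dy in range(-radius, radius + 1):    # box mean over the neighborhood
        for dx in range(-radius, radius + 1):
            S += pad[radius + dy:radius + dy + H, radius + dx:radius + dx + W]
    S = np.maximum(S / (2 * radius + 1) ** 2, 1e-12)
    return (np.maximum(J_hat, 0.0) / S) ** gamma * S
```

Flat regions are fixed points of the transformation (their ratio to the local mean is one), so only structured regions are rescaled.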

Iterative Framework: Combining the anisotropic filtering with the post-processing, we obtain the proposed FDIF method. As Fig. 1(a) shows, FDIF can be applied iteratively, in order to extract structures hidden in images.

Take curve detection as an example. Fig. 4 shows the enlarged output for an AFM image in each iteration and compares the iterative filtering process with the traditional path operator [22]. We can find that the pixels corresponding to curves become more and more discriminative. When the labels of curves are available, we learn the curve detector as a binary classifier with the help of logistic regression: sampling the final filtering result into patches with overlaps, we learn the parameters of the sigmoid function. On the contrary, if labels are unavailable, we simply apply a thresholding method [24] to convert the filtering result into a binary image. Traditional morphological filtering methods, e.g., the path operator [37, 22], also aim at detecting curves and tubes, but they are sensitive to the noise in the image. These two detection methods are shown in the last layer of Fig. 1(a). The iterative FDIF-based curve detector is physically interpretable. The fractal dimension of a patch reflects its sharpness: a patch containing a curve has higher sharpness than a patch of a smooth region, whose fractal dimension tends to $2$. The filters we use achieve an anisotropic smoothing of the image, so the measurement of the fractal dimension is smoothed as well. Essentially, preserving fractal dimensions under a smoother measurement, as (4) does, enhances the sharpness of curves and suppresses the sharpness of the remaining regions, which provides us with a better representation of curves.

(a) Original
(b) Path operator
(c) #1 Iteration
(d) #3 Iteration
Figure 4: Comparison between the iterative adaptive filtering process and traditional path operator [22].

4 FraCNN: Implementing FDIF via CNN

In this section, we will show that FDIF can be re-instantiated via a CNN, as described in Fig. 1(b). In particular, the convolution layer can be explained as an anisotropic filter bank and the nonlinear layer performs the post-processing function approximately.

4.1 The Architecture of The CNN

Convolution Layer: The anisotropic filtering can be approximately implemented via a filter bank. At each pixel $x$, the process can be rewritten as

$\hat{J}(x) = \max_{k=1,\dots,K}\,(f_{\theta_{k}}*J)(x)$,   (5)

where $\{f_{\theta_{k}}\}_{k=1}^{K}$ is the bank of anisotropic filters, whose directions $\theta_{k}$ are sampled uniformly from $[0,\pi)$. The $\max$ operator only preserves the filtering result having the maximum response.

Nonlinear Layer: The proposed post-processing can also be approximated via the following nonlinear layer:

$J'(x) = \mathrm{ReLU}\Big(\Big(\frac{\hat{J}(x)}{(f_{m}*\hat{J})(x)}\Big)^{\gamma} (f_{m}*\hat{J})(x)\Big)$.   (6)

Here the normalization term $(f_{m}*\hat{J})(x)$ is implemented via a convolution, where $f_{m}$ is a mean filter that averages the intensities in the neighborhood of each $x$. Different from neuroscience, we explain the rectified linear unit (ReLU, $\max(0,\cdot)$) based on fractal analysis. The ReLU ensures the filtering result to be a valid measurement (like the measurement used in the box-counting method [10, 45]): a valid measurement $\mu$ defined on the set $E$ simultaneously satisfies nonnegativity ($\mu(A)\ge 0$), countable additivity ($\mu(\bigcup_{i}A_{i}) = \sum_{i}\mu(A_{i})$ for disjoint $A_{i}$), and null empty set ($\mu(\emptyset)=0$), where $A, A_{i}\subseteq E$. The null empty set condition is satisfied by our filtering result naturally, while the ReLU operator guarantees the nonnegativity and countable additivity.

Note that the parameter $\gamma(x)$ of the transformation operation can be fixed approximately as a constant $\gamma$. This approximation is reasonable for the problem of curve detection. On one hand, after filtering, we model the coordinates of the image as a set of fractals whose fractal dimension must be in an interval $[2, 2+\epsilon_{1}] \subset [2, 3)$, where $2$ is the topological dimension of 2D geometry, and $3$ is excluded because the fractal dimension of a fractal generated from a 2D geometry via a 2D transformation cannot reach $3$. On the other hand, the curves are also modeled as a set of fractals with fractal dimension in an interval $[1, 1+\epsilon_{2}] \subset [1, 2)$, where $1$ is the topological dimension of a curve (1D geometry) and $2$ is excluded. Based on the fractal-based model, we have $\gamma(x) = d(x)/\hat{d}(x) \in \big[\frac{1}{2+\epsilon_{1}}, \frac{1+\epsilon_{2}}{2}\big]$. When $\epsilon_{1}$ and $\epsilon_{2}$ are small, we can estimate $\gamma \approx \frac{1}{2}$ for all $x$'s.

4.2 FraCNN-based Curve Detection

The iterative FDIF framework can be achieved via stacking the layers above. As a result, the architecture of the proposed CNN is shown in Fig. 1(b). For convenience, we call it FraCNN. Similar to the iterative FDIF framework, we can either add a sigmoid layer to the end of the CNN and train the model via the traditional backpropagation algorithm, or apply a thresholding layer for the final output. In contrast to many CNN models, with their disadvantage of a ravenous appetite for labeled training data, we believe the adaptability of our method to unlabeled data is perhaps due to the fact that we instantiate our tailored CNN from the fractal-based geometry perspective. Focusing on the task of curve detection, we propose the detection algorithm shown in Algorithm 2.

1:  Input: image $J^{0} = J$, filter bank $\{f_{\theta_{k}}\}_{k=1}^{K}$, layer number $L$.
2:  Output: binary map $B$ corresponding to curves.
3:  For $l=1,\dots,L$, obtain $J^{l}$ from $J^{l-1}$ via (5, 6), and set $J^{l}$ as the input of the next layer.
4:  Unsupervised: $B = \mathbf{1}(J^{L} > \tau)$ for a threshold $\tau$.
5:  Supervised: $B = \mathrm{sigmoid}(w^{\top}P)$, where $w$ is the vector of learned parameters and $P$ is the patch matrix of $J^{L}$.
Algorithm 2 FraCNN-based Curve Detector
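For concreteness, one FraCNN layer — i.e., one pass of (5) and (6) — can be sketched as follows. This is a simplified NumPy version under assumptions: plain "same" correlation with edge padding, a box mean filter for the normalization term, and a fixed exponent gamma; the helper names are ours.

```python
import numpy as np

def conv2_same(J, f):
    """'Same'-size 2D correlation with edge padding (helper)."""
    H, W = J.shape
    h = f.shape[0] // 2
    pad = np.pad(J, h, mode="edge")
    out = np.zeros((H, W), dtype=float)
    for dy in range(-h, h + 1):
        for dx in range(-h, h + 1):
            out += f[h + dy, h + dx] * pad[h + dy:h + dy + H, h + dx:h + dx + W]
    return out

def fracnn_layer(J, bank, gamma=0.5, mean_size=5):
    """One FraCNN layer: the Eq. (5) max over an anisotropic filter
    bank, then the Eq. (6) nonlinearity (rescaling + ReLU)."""
    # convolution layer: keep only the strongest directional response
    J_hat = np.max([conv2_same(J, f) for f in bank], axis=0)
    # nonlinear layer: dimension-preserving rescaling, then ReLU
    f_m = np.full((mean_size, mean_size), 1.0 / mean_size ** 2)  # mean filter
    S = np.maximum(conv2_same(J_hat, f_m), 1e-12)
    return np.maximum(0.0, (np.maximum(J_hat, 0.0) / S) ** gamma * S)

def fracnn(J, bank, depth=3, gamma=0.5):
    """Stack the layer `depth` times, as in Algorithm 2 (before the
    thresholding or sigmoid output layer)."""
    for _ in range(depth):
        J = fracnn_layer(J, bank, gamma)
    return J
```

The thresholding or sigmoid output layer of Algorithm 2 is then applied to the final feature map.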

We present further comparisons and analysis as follows.

FraCNN vs. FDIF: The proposed CNN model can be viewed as a fast implementation of FDIF. Firstly, the adaptive anisotropic filtering is approximately achieved by an anisotropic filter bank. The direction of the filter is no longer computed from the eigenvector of the local gradient matrix, but sampled uniformly from the interval $[0,\pi)$ (as Fig. 3 shows). Although such an approximation reduces the accuracy of the description of direction, it avoids performing an eigendecomposition for each pixel and thus accelerates the filtering process notably. Secondly, the ratio between the target fractal dimension and the filtered one is replaced by a fixed value $\gamma$, so that we do not need to apply Algorithm 1 to estimate fractal dimensions. As a result, the computational complexity of the original FDIF is $\mathcal{O}(NM^{2} + SNM^{2})$ per iteration, where $N$ is the number of pixels and $M\times M$ is the filter size; the first term corresponds to adaptive filtering and the second term to local fractal dimension estimation ($S$ is the number of scales in Algorithm 1). The complexity of the proposed CNN is at most $\mathcal{O}(KNM^{2})$ per layer, where $K$ is the number of filters in the filter bank.

FraCNN vs. Scattering Convolution Network: The work most closely related to our CNN model might be the scattering convolution network (SCN) in [4, 25]. Both our fractal-based CNN and the SCN can apply predefined filters and are suitable for unsupervised learning when labels are unavailable. However, there are several important differences between our model and SCNs. First, SCNs aim at extracting discriminative features for image recognition and classification, while our fractal-based CNN model focuses on low- and middle-level vision problems, i.e., curve detection. Second, the nonlinear layer of an SCN applies multiple nonlinear operators to enhance the invariance of features to geometric transformations; for example, the absolute operator is applied to achieve translation invariance. In our work, the nonlinear layer aims to preserve local fractal dimensions such that the local structural information of the image is enhanced — geometric invariance of features is not our goal. Finally, different from wavelet transformations, our fractal-based CNN does not down-sample the filtering results (i.e., no pooling operation).

5 Experiments

5.1 The AFM Image Benchmark and Protocols

We apply our fractal dimension invariant filtering method to a challenging real-world task: detecting structural curves in atomic force microscopy (AFM) images of materials. The demo code and partial data are available at https://sites.google.com/site/htxu313/resources/software. The images in this study are AFM phase images of nano-fibers, each taken in tapping mode at a fixed scan size. The fibrillar structure of the material has a huge influence on its electronic properties, and it is represented by the complex salient curves in the images, as Fig. 1 shows. Detecting curves from the AFM images is challenging. First, the AFM images often suffer from heavy noise and low contrast, which has a negative influence on curve detection. Second, the curves in these scenes are very complicated: dense curves (i.e., nano-fibers) with different shapes and directions are distributed randomly in the image and overlap with each other. The ground-truth curves are extracted manually with a semi-automatic tool called FiberApp [38].

We compare our FraCNN-based curve detector with the original FDIF-based detector in both unsupervised and supervised cases. Specifically, we consider these two detectors with thresholding-based binary processing (BP) and logistic regression (LR) as the last layer, respectively. The same filter size is used in FDIF and FraCNN, and the anisotropic filters used in FraCNN are shown in Fig. 3. For investigating the influence of the model's iteration number (depth) on the learning results, we set the iteration number of FDIF to be either relatively shallow or relatively deep, and the depth of FraCNN accordingly. In the supervised case (note, only for the last layer), we use a subset of the AFM images as the training set and the remaining AFM images as the testing set. Patches are sampled from the output images of FDIF or FraCNN to train the parameters of the sigmoid layer. Half of the training patches, whose central pixels correspond to curves, are labeled as positive samples, while the rest are negative ones.

To further demonstrate the superiority of our method, we consider the following competitors: the curve and line segment detector (ELSD) in [28]; the traditional Frangi filtering-based curve detector [11]; simple logistic regression (LR) using patches as features directly; the classical CNN known as LeNet [18]; and the state-of-the-art holistically-nested edge detector (HED) [42]. Although HED is originally designed to detect edges, it should also be suitable for detecting curves because both curves and edges satisfy the assumption of multi-scale consistency. Therefore, we use the training images to fine-tune the pre-trained HED model and learn a curve detector accordingly. (The training code and pre-trained model are from https://github.com/s9xie/hed.) Following the instructions in [42], a post-process is applied to the output of the CNNs, achieving the shrinkage and binarization of the detected curves. The logistic regression is trained on patches sampled randomly from the training images. The training samples of LeNet are also image patches; the only difference is the size of the patches. In the testing phase, each patch of a testing image is classified, and its label is used as the corresponding pixel value of the final binary map.

Similar to contour detection [1], we use the standard metrics for curve detection, including the optimal F-score with a fixed threshold (ODS), the optimal F-score with a per-image best threshold (OIS), and the average precision (AP).
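For reference, the ODS and OIS scores can be computed as sketched below. This is a simplified pixelwise version: the standard benchmark additionally matches detections to the ground truth within a small spatial tolerance [1], which is omitted here.

```python
import numpy as np

def f_score(pred, gt, t):
    """Pixelwise F-score of a soft curve map `pred` thresholded at t,
    against a boolean ground-truth map `gt`."""
    b = pred >= t
    tp = np.logical_and(b, gt).sum()
    prec = tp / max(b.sum(), 1)
    rec = tp / max(gt.sum(), 1)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)

def ods_ois(preds, gts, thresholds=np.linspace(0.1, 0.9, 9)):
    """ODS: best F-score for one threshold shared by the whole dataset;
    OIS: mean of the best per-image F-scores."""
    per_image = np.array([[f_score(p, g, t) for t in thresholds]
                          for p, g in zip(preds, gts)])
    ods = per_image.mean(axis=0).max()   # fix t, average over images, best t
    ois = per_image.max(axis=1).mean()   # best t per image, then average
    return ods, ois
```

By construction OIS ≥ ODS, since choosing the threshold per image can only help.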

Method                  ODS     OIS     AP
Non-Learning
  ELSD [28]             0.058   0.058   0.030
  Frangi [11]           0.629   0.659   0.578
  FDIF (shallow) + BP   0.717   0.735   0.699
  FDIF (deep) + BP      0.715   0.733   0.695
  FraCNN (shallow) + BP 0.691   0.719   0.708
  FraCNN (deep) + BP    0.689   0.715   0.702
Learning
  LR                    0.639   0.706   0.707
  LeNet [18]            0.677   0.718   0.643
  HED [42]              0.722   0.739   0.784
  FDIF (shallow) + LR   0.728   0.770   0.700
  FDIF (deep) + LR      0.724   0.767   0.697
  FraCNN (shallow) + LR 0.743   0.782   0.730
  FraCNN (deep) + LR    0.739   0.774   0.718

Table 1: Performance comparison for various methods.

5.2 Experimental Results

Table 1 gives the comparison results for the various methods, and Figs. 5 and 6 visualize some typical results. More experimental results are provided in the appendix.

The traditional image processing methods, such as ELSD and the Frangi filter, seem unsuitable for detecting the complicated curves in our case. The ELSD method aims at detecting line segments and elliptical curves of rigid bodies in natural images. The Frangi filter is originally designed for detecting vessels in medical images. Both of these methods can only detect sparse curves from a relatively smooth background. In our case, however, the curves of nano-fibers are very dense and complex, and the AFM images are generally noisy. As a result, ELSD cannot detect complete curves and obtains very low ODS, OIS, and AP, while the Frangi filtering method is not robust to noise and contrast changes and can only obtain chaotic results.

(a) AFM image
(b) Manual labels
(c) ELSD [28]
(d) Frangi [11]
(e) LR
(f) LeNet [18]
(g) HED [42]
(h) FDIF+BP
Figure 5: Visual comparisons for various methods.

The learning-based approaches, including LR, LeNet, and HED, achieve much better results (i.e., higher ODS, OIS, and AP) than the basic image processing methods. However, their results are still very noisy. In Fig. 5, LR's results contain many non-curve pixels and many broken curves. LeNet makes some improvements: long curves are detected correctly, but there are still many non-curve pixels. HED is superior to LR and LeNet: long curves are detected with more confidence, and fewer incorrect isolated pixels appear in the results. Table 1 confirms the superiority of HED among these competitors.

FDIF and FraCNN both achieve encouraging results. Specifically, our unsupervised methods, FDIF+BP and FraCNN+BP, notably outperform the other non-learning methods (ELSD and Frangi), both numerically in Table 1 and visually in Fig. 5. Additionally, FDIF+BP and FraCNN+BP are better than some learning-based methods, obtaining higher ODS, OIS, and AP than LR and LeNet. These comparisons demonstrate that the fractal-based image model is suitable for curve detection and that our methods extract representative features for curves. In the supervised case, our FDIF+LR and FraCNN+LR methods outperform all competitors in ODS and OIS while achieving slightly worse AP than HED. Moreover, the enlarged comparisons in Fig. 6 show that HED's result is still very coarse, while our method produces thin curves. These results demonstrate that the proposed methods are at least comparable to the state-of-the-art for curve detection. Note that our method is superior to HED in computational complexity: in each layer, FraCNN applies only 2D convolutions with small kernels to the image, while HED applies 3D convolutions to a multi-channel image tensor.
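The per-layer cost argument can be made concrete: FraCNN's convolution layer applies a small bank of oriented 2D kernels and keeps, at every pixel, the maximum response over orientations, so the work per pixel grows only with the number of orientations times the 2D kernel area. A hedged sketch with hypothetical oriented line kernels (the paper's actual filter bank, kernel size, and number of orientations are not reproduced here):

```python
import numpy as np

def oriented_kernel(size, theta):
    """A hypothetical anisotropic kernel: a thin line at angle theta,
    made zero-mean so flat regions give no response."""
    k = np.zeros((size, size))
    c = size // 2
    for t in np.linspace(-c, c, 4 * size):
        row = int(round(c + t * np.sin(theta)))
        col = int(round(c + t * np.cos(theta)))
        if 0 <= row < size and 0 <= col < size:
            k[row, col] = 1.0
    return k - k.mean()

def conv2d(img, k):
    """Plain 'valid' 2D convolution via explicit loops (for clarity)."""
    H, W = img.shape
    s = k.shape[0]
    out = np.zeros((H - s + 1, W - s + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + s, j:j + s] * k)
    return out

def filter_bank_response(img, size=7, n_orient=8):
    """Max response over a bank of oriented kernels: n_orient * size^2
    multiply-adds per output pixel, i.e. 2D convolutions only."""
    thetas = np.pi * np.arange(n_orient) / n_orient
    responses = [conv2d(img, oriented_kernel(size, th)) for th in thetas]
    return np.max(responses, axis=0)
```

On an image containing a horizontal line, the maximum response is attained by the kernel whose orientation matches the line, which is exactly the direction-selection behavior the comparison above discusses.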

One important observation is that although FraCNN can be viewed as an implementation of FDIF, it sometimes outperforms FDIF in Table 1. A potential explanation is that FDIF is more sensitive to noise in the image. Specifically, the flexibility of FDIF in selecting directions might be a "double-edged sword": heavy noise can lead to poor estimates of the filter's direction and thus degrade the filtering results. FraCNN, by contrast, uses a predefined anisotropic filter bank, and its limited set of directions might help suppress the influence of noise. Additionally, experiments show that as the iteration number and depth increase, the performance of our methods degrades slightly. From the viewpoint of numerical analysis, too many iterations or too deep an architecture might cause pixel-value underflow. In the unsupervised case, instead of fine-tuning the threshold case by case, we use a uniform threshold for fair comparison. Note that the threshold has a direct impact on the final results: underflow points may appear on curves, so thresholding can break a complete curve into several short segments. In the supervised case, underflow points in patches also hurt the representation of curves, which negatively affects training of the sigmoid layer.
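The discussion above hinges on the local fractal dimension that the nonlinear layer preserves. That quantity is typically estimated by the box-counting method [10]: count the number N(delta) of boxes of side delta needed to cover the structure, and fit the slope of log N(delta) against log(1/delta). A hedged sketch for a binary patch (the grid sizes and fitting range are illustrative choices, not the paper's):

```python
import numpy as np

def box_count(mask, delta):
    """Number of delta x delta boxes containing at least one 'on' pixel."""
    H, W = mask.shape
    count = 0
    for i in range(0, H, delta):
        for j in range(0, W, delta):
            if mask[i:i + delta, j:j + delta].any():
                count += 1
    return count

def box_counting_dimension(mask, deltas=(1, 2, 4, 8)):
    """Least-squares slope of log N(delta) vs log(1/delta)."""
    ns = np.array([box_count(mask, d) for d in deltas], dtype=float)
    x = np.log(1.0 / np.array(deltas, dtype=float))
    y = np.log(ns)
    return np.polyfit(x, y, 1)[0]
```

For a filled 32 x 32 square the slope is 2, while a single straight line of pixels gives 1, matching the intuition that curves in an image are locally one-dimensional sets.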

(a) Manual labels
(b) HED’s curves
(c) FDIF’s curves
Figure 6: Enlarged comparisons for various methods. The red curves are the manually labeled ground truth in (a) and the detection results of the respective methods in (b) and (c). The green regions mark unlabeled curves.

Furthermore, we select some texture images containing curves from the public Brodatz texture data set [3], label them manually, and test our method accordingly. Some typical visual results and numerical results are shown in Fig. 7, which further verify the performance of our method.

Figure 7: Brodatz texture images and filtering results. The numerical results of FDIF()+LR are: OIS; ODS; AP. On the other hand, the results of HED (the best competitor) are: OIS; ODS; AP.

5.3 Robustness to Missing Labels

Compared with state-of-the-art learning-based detectors, an important advantage of the proposed method is its ability to detect unlabeled curves. The ground truth is labeled manually, and when labeling texture-like complex images humans are likely to miss some subtle or short curves, as exemplified in Fig. 6(a). As a result, learning-based methods (e.g., HED) tend to ignore many existing curves or merge them together, because during training they have been "taught" to pay less attention to such unlabeled curves (Fig. 6(b)). In contrast, our method (e.g., FDIF+BP) is more robust to unlabeled curves (Fig. 6(c)). We attribute this partially to its intrinsically unsupervised nature: the representation of a curve aims at preserving the local fractal dimension rather than matching manual labels. As long as the response of a patch after anisotropic filtering is large enough, the patch is preserved as part of a curve. From this viewpoint, our method can serve as a robust feature extractor with the potential to label salient curves automatically.

5.4 Other Possible Applications

Besides curve detection, our fractal dimension invariant filtering method can also be used to create painting-style images from natural images. Since the objects in most paintings are drawn as a series of curved strokes, we can treat paintings as unions of curves (fractals). Therefore, we apply the iterative FDIF method to a natural image, enhancing its strokes and suppressing its textures. Fig. 8 gives a typical example, and more visual results are given in the appendix. Similar to the neural algorithm of artistic style in [13], our FDIF method has the potential to generate diverse artistic styles by designing or learning different anisotropic filters.
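The iterative scheme can be sketched as a repeated filter-then-enhance loop. In the sketch below, a simple isotropic smoothing and a fixed power-law mapping stand in for the paper's anisotropic filter bank and its fractal-dimension-preserving nonlinearity, which are defined in the method section; both stand-ins are illustrative assumptions:

```python
import numpy as np

def smooth3(img):
    """3x3 box smoothing with edge replication (stand-in for the
    anisotropic filter bank)."""
    p = np.pad(img, 1, mode="edge")
    H, W = img.shape
    return sum(p[i:i + H, j:j + W]
               for i in range(3) for j in range(3)) / 9.0

def enhance(response, gamma=1.5):
    """Stand-in nonlinearity: power-law mapping on normalized responses
    (the paper derives its mapping from the local fractal dimension;
    gamma here is a fixed illustrative constant)."""
    r = response - response.min()
    r = r / (r.max() + 1e-12)
    return r ** gamma

def iterative_fdif(img, n_iter=3):
    """Repeated filter + enhance, as in the painting-style experiments
    (Figs. 14-19 in the appendix show iterations 1-3)."""
    out = img.astype(float)
    for _ in range(n_iter):
        out = enhance(smooth3(out))
    return out
```

Each iteration smooths away fine texture and re-sharpens the surviving high-response structures, which is why the stroke-like appearance strengthens with the iteration count.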

(a) Natural image
(b) Painting-style image
Figure 8: The painting-style portrait of Benoit B. Mandelbrot, the author of “The fractal geometry of nature” [21].

6 Conclusion and Outlook

Taking an image as a union of local fractals, this paper presents a model that combines anisotropic filtering with fractal dimension preservation, and re-implements the model via a CNN architecture. To our knowledge, this is the first attempt to bridge fractal-based image models with neural networks.

One notable characteristic of our method is its unsupervised feature extraction, which does not rely on manually labeled data. This is potentially of interest to the community: manual labeling for low-level vision problems is tedious and error-prone, which limits the practical use of supervised learning approaches, whereas our method obtains competitive performance on these tasks against a supervised method (i.e., HED). From the feature-learning perspective, we believe that fractal dimension invariant filtering can be further integrated with supervised learning techniques. Additionally, we will explore further applications of our method, e.g., the artistic style generation problem mentioned above.

Acknowledgment: The work is supported in part via NSF IIS-1639792, NSF DMS-1317424, NSF DMS-1620345, NSF 1258425, NSFC 61471235, NSF FLAMEL IGERT Traineeship program, IGERT-CIF21, the Key Program of Shanghai Science and Technology Commission under Grant 15JC1401700, and the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Information under Grant U1609220.

7 Appendix

7.1 Proof of Theorem 1

Proof.

The mapping f is a bi-Lipschitz transform if and only if f is invertible and there exist constants 0 < C1 ≤ C2 such that C1‖x − y‖ ≤ ‖f(x) − f(y)‖ ≤ C2‖x − y‖ holds for all x, y. According to this definition, for two arbitrary points we have

Recall the relationship that . The following condition holds for all ’s:

which implies and . ∎
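The inequalities elided above follow the standard box-counting argument (see Falconer [10]); a hedged reconstruction of the key step, writing $N_\delta(F)$ for the number of $\delta$-boxes needed to cover a set $F$:

```latex
% The bi-Lipschitz bounds C_1\|x-y\| \le \|f(x)-f(y)\| \le C_2\|x-y\|
% sandwich the covering numbers of the image set:
N_{\delta/C_1}(F) \;\le\; N_{\delta}(f(F)) \;\le\; N_{\delta/C_2}(F).
% Since \dim_B F = \lim_{\delta \to 0} \frac{\log N_\delta(F)}{\log(1/\delta)},
% rescaling \delta by a constant does not change the limit, hence
\dim_B f(F) \ge \dim_B F \quad\text{and}\quad \dim_B f(F) \le \dim_B F,
% i.e. \dim_B f(F) = \dim_B F.
```

The upper bound comes from mapping a $(\delta/C_2)$-cover of $F$ forward through $f$; the lower bound comes from mapping a $\delta$-cover of $f(F)$ back through $f^{-1}$, which is Lipschitz with constant $1/C_1$.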

7.2 Proof of Proposition 2

Proof.

According to the assumption, the expectation of the filters in (3) is , . For each element , when , we have

When , we have

When , . In summary, is the proposed impulse function. ∎

7.3 More Enlarged Experimental Results

Figs. 9-13 show enlarged experimental results of detecting nano-fiber curves from AFM images. Figs. 14-19 show painting-style generation results for several famous portraits. The contrast of each image in Figs. 14-19 is adjusted so that the average intensity of FDIF's result equals that of the original image.

(a) AFM image
(b) Manual labels
(c) ELSD [28]
(d) Frangi [11]
(e) LR
(f) LeNet [18]
(g) HED [42]
(h) FDIF
Figure 9: Visual comparisons for various methods.
(a) AFM image
(b) Manual labels
(c) ELSD [28]
(d) Frangi [11]
(e) LR
(f) LeNet [18]
(g) HED [42]
(h) FDIF
Figure 10: Visual comparisons for various methods.
(a) AFM image
(b) Manual labels
(c) ELSD [28]
(d) Frangi [11]
(e) LR
(f) LeNet [18]
(g) HED [42]
(h) FDIF
Figure 11: Visual comparisons for various methods.
(a) AFM image
(b) Manual labels
(c) ELSD [28]
(d) Frangi [11]
(e) LR
(f) LeNet [18]
(g) HED [42]
(h) FDIF
Figure 12: Visual comparisons for various methods.
(a) AFM image
(b) Manual labels
(c) ELSD [28]
(d) Frangi [11]
(e) LR
(f) LeNet [18]
(g) HED [42]
(h) FDIF
Figure 13: Visual comparisons for various methods.
(a) Natural image
(b) FDIF: #1 Iteration
(c) FDIF: #2 Iteration
(d) FDIF: #3 Iteration
Figure 14: The painting-style portrait of Albert Einstein generated via iterative FDIF method.
(a) Natural image
(b) FDIF: #1 Iteration
(c) FDIF: #2 Iteration
(d) FDIF: #3 Iteration
Figure 15: The painting-style portrait of Max Planck generated via iterative FDIF method.
(a) Natural image
(b) FDIF: #1 Iteration
(c) FDIF: #2 Iteration
(d) FDIF: #3 Iteration
Figure 16: The painting-style portrait of Niels Bohr generated via iterative FDIF method.
(a) Natural image
(b) FDIF: #1 Iteration
(c) FDIF: #2 Iteration
(d) FDIF: #3 Iteration
Figure 17: The painting-style portrait of David Hilbert generated via iterative FDIF method.
(a) Natural image
(b) FDIF: #1 Iteration
(c) FDIF: #2 Iteration
(d) FDIF: #3 Iteration
Figure 18: The painting-style portrait of Carl Friedrich Gauss generated via iterative FDIF method.
(a) Natural image
(b) FDIF: #1 Iteration
(c) FDIF: #2 Iteration
(d) FDIF: #3 Iteration
Figure 19: The painting-style portrait of Benoit B. Mandelbrot generated via iterative FDIF method.

References

  • [1] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. TPAMI, 33(5):898–916, 2011.
  • [2] G. Bertasius, J. Shi, and L. Torresani. Deepedge: A multi-scale bifurcated deep network for top-down contour detection. In CVPR, 2015.
  • [3] P. Brodatz. Textures: a photographic album for artists and designers. Dover Pubns, 1966.
  • [4] J. Bruna and S. Mallat. Invariant scattering convolution networks. TPAMI, 35(8):1872–1886, 2013.
  • [5] H. C. Burger, C. J. Schuler, and S. Harmeling. Image denoising: Can plain neural networks compete with bm3d? In CVPR, 2012.
  • [6] A. Calway and R. Wilson. Curve extraction in images using the multiresolution fourier transform. In ICASSP, 1990.
  • [7] P. Dollár and C. L. Zitnick. Fast edge detection using structured forests. TPAMI, 37(8):1558–1570, 2015.
  • [8] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image super-resolution. In ECCV. 2014.
  • [9] R. O. Duda and P. E. Hart. Use of the hough transformation to detect lines and curves in pictures. Communications of the ACM, 15(1):11–15, 1972.
  • [10] K. Falconer. Fractal geometry: mathematical foundations and applications. John Wiley & Sons, 2004.
  • [11] A. F. Frangi, W. J. Niessen, K. L. Vincken, and M. A. Viergever. Multiscale vessel enhancement filtering. In Medical Image Computing and Computer-Assisted Interventation, pages 130–137. 1998.
  • [12] G. Freedman and R. Fattal. Image and video upscaling from local self-examples. TOG, 30(2):12, 2011.
  • [13] L. Gatys, A. Ecker, and M. Bethge. A neural algorithm of artistic style. Nature Communications, 2015.
  • [14] R. Girshick, F. Iandola, T. Darrell, and J. Malik. Deformable part models are convolutional neural networks. In CVPR, 2015.
  • [15] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV, 2009.
  • [16] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.
  • [17] S. Jordens, L. Isa, I. Usov, and R. Mezzenga. Non-equilibrium nature of two-dimensional isotropic and nematic coexistence in amyloid fibrils at liquid interfaces. Nature communications, 4:1917, 2013.
  • [18] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • [19] Q. Ma, D. S. Goshi, Y.-C. Shih, and M.-T. Sun. An algorithm for power line detection and warning based on a millimeter-wave radar video. TIP, 20(12):3534–3543, 2011.
  • [20] S. Mallat. Group invariant scattering. Communications on Pure and Applied Mathematics, 65(10):1331–1398, 2012.
  • [21] B. B. Mandelbrot. The fractal geometry of nature, volume 173. Macmillan, 1983.
  • [22] O. Merveille, H. Talbot, L. Najman, and N. Passat. Tubular structure filtering by ranking orientation responses of path operators. In ECCV, 2014.
  • [23] P. Milanfar. A tour of modern image filtering: New insights and methods, both practical and theoretical. Signal Processing Magazine, 30(1):106–128, 2013.
  • [24] N. Otsu. A threshold selection method from gray-level histograms. Automatica, 11(285-296):23–27, 1975.
  • [25] E. Oyallon and S. Mallat. Deep roto-translation scattering for object classification. In CVPR, 2015.
  • [26] J. W. Park, J. W. Lee, and K. Y. Jhang. A lane-curve detection based on an lcf. Pattern Recognition Letters, 24(14):2301–2313, 2003.
  • [27] A. B. Patel, T. Nguyen, and R. G. Baraniuk. A probabilistic theory of deep learning. arXiv preprint arXiv:1504.00641, 2015.
  • [28] V. Pătrăucean, P. Gurdjos, and R. G. Von Gioi. A parameterless line segment and elliptical arc detector with enhanced ellipse fitting. In ECCV. 2012.
  • [29] A. P. Pentland. Fractal-based description of natural scenes. TPAMI, (6):661–674, 1984.
  • [30] G. Peyré. Texture synthesis with grouplets. TPAMI, 32(4):733–746, 2010.
  • [31] Y. Quan, Y. Xu, Y. Sun, and Y. Luo. Lacunarity analysis on image patterns for texture classification. In CVPR, 2014.
  • [32] L. Shen, T. Wee Chua, and K. Leman. Shadow optimization from structured deep edge detection. In CVPR, 2015.
  • [33] W. Shen, X. Wang, Y. Wang, X. Bai, and Z. Zhang. Deepcontour: A deep convolutional feature learned by positive-sharing loss for contour detection. In CVPR, 2015.
  • [34] A. Sironi, V. Lepetit, and P. Fua. Multiscale centerline detection by learning a scale-space distance transform. In CVPR, 2014.
  • [35] J.-L. Starck, E. J. Candès, and D. L. Donoho. The curvelet transform for image denoising. TIP, 11(6):670–684, 2002.
  • [36] C. J. Takacs, N. D. Treat, S. Krämer, Z. Chen, A. Facchetti, M. L. Chabinyc, and A. J. Heeger. Remarkable order of a high-performance polymer. Nano letters, 13(6):2522–2527, 2013.
  • [37] H. Talbot and B. Appleton. Efficient complete and incomplete path openings and closings. Image and Vision Computing, 25(4):416–425, 2007.
  • [38] I. Usov and R. Mezzenga. Fiberapp: an open-source software for tracking and analyzing polymers, filaments, biomacromolecules, and fibrous objects. Macromolecules, 48(5):1269–1280, 2015.
  • [39] M. Varma and R. Garg. Locally invariant fractal features for statistical texture classification. In ICCV, 2007.
  • [40] C. Wang, E. Subashi, F.-F. Yin, and Z. Chang. Dynamic fractal signature dissimilarity analysis for therapeutic response assessment using dynamic contrast-enhanced mri. Medical physics, 43(3):1335–1347, 2016.
  • [41] J. Xie, L. Xu, and E. Chen. Image denoising and inpainting with deep neural networks. In NIPS, 2012.
  • [42] S. Xie and Z. Tu. Holistically-nested edge detection. In ICCV, 2015.
  • [43] H. Xu, G. Zhai, L. Chen, and X. Yang. Automatic movie restoration based on wave atom transform and nonparametric model. EURASIP Journal on Advances in Signal Processing, (1):1–19, 2012.
  • [44] H. Xu, G. Zhai, and X. Yang. Single image super-resolution with detail enhancement based on local fractal analysis of gradient. TCSVT, 23(10):1740–1754, 2013.
  • [45] Y. Xu, H. Ji, and C. Fermüller. Viewpoint invariant texture description using fractal analysis. IJCV, 83(1):85–100, 2009.
  • [46] Y. Xu, D. Liu, Y. Quan, and P. Le Callet. Fractal analysis for reduced reference image quality assessment. TIP, 24(7):2098–2109, 2015.
  • [47] X. Yan, H. Chang, S. Shan, and X. Chen. Modeling video dynamics with deep dynencoder. In ECCV. 2014.
  • [48] H. Yang and W. B. Lindquist. Three-dimensional image analysis of fibrous materials. In International Symposium on Optical Science and Technology, pages 275–282, 2000.
  • [49] J. Yang, B. Price, S. Cohen, H. Lee, and M.-H. Yang. Object contour detection with a fully convolutional encoder-decoder network. arXiv preprint arXiv:1603.04530, 2016.
  • [50] L. Yu, Y. Xu, H. Xu, and X. Yang. Self-example based super-resolution with fractal-based gradient enhancement. In ICME Workshops, pages 1–6, 2013.
  • [51] C. Zhang, X. Ruan, Y. Zhao, and M.-H. Yang. Contour detection via random forest. In ICPR, 2012.
  • [52] Q. Zhang and Y. Xu. Block-based selection random forest for texture classification using multi-fractal spectrum feature. Neural Computing and Applications, 27(3):593–602, 2016.