Fractal Dimension Invariant Filtering and Its CNN-based Implementation
Abstract
Fractal analysis has been widely used in computer vision, especially in texture image processing and texture analysis. The key concept of the fractal-based image model is the fractal dimension, which is invariant to bi-Lipschitz transformations of the image and is thus capable of representing the intrinsic structural information of the image robustly. However, the invariance of the fractal dimension generally does not hold after filtering, which limits the application of fractal-based image models. In this paper, we propose a novel fractal dimension invariant filtering (FDIF) method, extending the invariance of the fractal dimension to filtering operations. Utilizing the notion of local self-similarity, we first develop a local fractal model for images. By adding a nonlinear post-processing step behind anisotropic filter banks, we demonstrate that the proposed filtering method is capable of preserving the local invariance of the fractal dimension of the image. Meanwhile, we show that the FDIF method can be reinstantiated approximately via a CNN-based architecture, where the convolution layer extracts the anisotropic structure of the image and the nonlinear layer enhances the structure via preserving the local fractal dimension of the image. The proposed filtering method provides a novel geometric interpretation of the CNN-based image model. Focusing on a challenging image processing task, detecting complicated curves from texture-like images, the proposed method obtains results superior to those of the state-of-the-art approaches.
1 Introduction
Many complex natural scenes can be modeled as fractals [21, 29]. In the field of computer vision, fractal analysis has been proven to be a useful tool for modeling textures, and many methods have been developed on this basis. Taking textures as fractals, the work in [39] learns local fractal dimensions and lengths as features for classifying textures. Similarly, the work in [45] learns the spectrum of fractal dimensions as texture features via the box-counting method [10]. All of these methods treat the fractal dimension as the key concept of the fractal-based image model because the fractal dimension is invariant to bi-Lipschitz transformations. This property means that the fractal dimension is robust to geometric deformations (e.g., rigid and non-rigid transformations) of the image. Hence, the fractal dimension reflects intrinsic structural information of the image and can be treated as a representative image feature.
Unfortunately, the fractal dimension of an image is not preserved after filtering, which might lead to the loss of structural information. A typical example is the interpolation of a digital image, where the result can be viewed as a low-pass filtering of the ground truth. The low-pass filtering suppresses the high-resolution details of the image and thus leads to the loss of structural information. The work in [44] shows that the fractal dimension of an interpolated image is smaller than that of the real high-resolution image. However, the recent development of deep convolutional neural networks (CNNs) shows that the stacked nonlinear filtering model is very suitable for learning image features, with a capability of extracting structural and semantic information robustly. Many CNN-based methods have been proposed to deal with various tasks, e.g., image classification [16], texture analysis [4], and contour detection [49]. In other words, for extracting representative image features, filtering operations are instrumental in CNNs while detrimental to fractal-based methods. Given these two seemingly contradictory phenomena, two problems arise: 1) Can we propose a filtering method that preserves the invariance of the fractal dimension? 2) Is there any connection between fractal-based image models and CNNs, especially for unsupervised feature learning?
In this paper, we give positive answers to these two problems. We propose a fractal dimension invariant filtering (FDIF) method and use a CNN-based architecture to reinstantiate it. This work provides a geometric interpretation of CNNs based on local fractal analysis of images, and obtains encouraging curve detection results for texture-like images, superior to those of other competitors. As Fig. 1(a) shows, we give a local fractal model of images and propose a curve detector under an iterative FDIF framework. In each iteration, we take patches of the image as local fractals and compute their fractal dimensions accordingly. An anisotropic filter is designed for each patch of the image according to an analysis of the gradient field, and the filtering result is further enhanced via preserving the fractal dimension across measurements. Inspired by the iterative filtering strategy in [23], we apply the steps above repeatedly to obtain features of curves, and detect curves via unsupervised (i.e., thresholding) or supervised (i.e., logistic regression) methods. In particular, we demonstrate that such a pipeline can be implemented via a CNN-based architecture, as shown in Fig. 1(b). This CNN is interpretable from a geometric viewpoint: the convolution layer corresponds to an anisotropic filter bank while the nonlinear layer approximately preserves local fractal dimensions. Applying the back-propagation algorithm in the supervised case and predefined parameters (filters) in the unsupervised case, we achieve encouraging curve detection results.
As Fig. 1(c) shows, the principle of our FDIF-based curve detector is to preserve local fractal dimensions by adjusting the measurement of the fractal (i.e., the image itself). Generally, the measurement obtained via anisotropic filtering is smoothed. To preserve local fractal dimensions, we apply nonlinear processing and obtain a new measurement, where the sharpness of curves is enhanced while the sharpness of the remaining regions is suppressed. As a result, the FDIF method provides a better representation of curves.
We test our method on a collected atomic force microscopy (AFM) image set, detecting complicated curves of materials from AFM images. Experimental results show that our method is promising in most situations, especially in noisy and texture-like cases, obtaining results superior to those of existing curve detectors. Overall, the contributions of our work are mainly in three aspects. First, to the best of our knowledge, our work is the first attempt to propose a fractal dimension invariant filtering method and connect it with CNNs; it is also perhaps the first to interpret CNNs from a (fractal) geometry perspective. Second, our method connects traditional handcrafted filter-based curve detectors with a CNN architecture, bridging the gap between filter-based curve detectors and learning-based (especially CNN-based) ones. This connection also allows us to instantiate a new predefined CNN that can work in an unsupervised setting, unlike most of its peers, which are known for their ravenous appetite for labeled data. Third, we demonstrate a meaningful interdisciplinary application of our curve detector in computational materials science. A material informatics image dataset is collected and will be released with this paper for future public research.
2 Related Work
Fractal Analysis: Fractal-based image models have been widely used to solve many problems in computer vision, including texture analysis [31], biomedical image processing [40], and image quality assessment [46]. The local fractal analysis method in [39] and the spectrum of fractal dimensions in [52, 45] take advantage of the bi-Lipschitz invariance of the fractal dimension for texture classification, yielding features that are very robust to deformations and scale changes of textures. Because local self-similarity is ubiquitous both within and across scales [15, 12], natural images can also be modeled as fractals locally [21, 29]. Recently, the fractal model of natural images has been applied to image super-resolution [44, 50], where local fractal analysis is used to enhance image gradients adaptively. In [40], a fractal-based dissimilarity measure is proposed to analyze MRI images. However, because the invariance of the fractal dimension does not hold after filtering, it is difficult to merge fractal analysis into other image processing methods.
Convolutional Neural Networks: CNNs have been widely used to extract visual features from images, with many successful applications. In recent years, this useful tool has been introduced into many low- and middle-level vision problems, e.g., image reconstruction [41, 5], super-resolution [8], dynamic texture synthesis [47], and contour detection [42, 49]. Currently, the physical meanings of different CNN modules are not fully understood. For example, the nonlinear layer of a CNN, i.e., the rectified linear unit (ReLU), and its output often remain mysterious. Many attempts have been made to comprehend CNNs in depth. Several existing feature extraction methods have been proven to be equivalent to deep CNNs, such as deformable part models in [14] and random forests in [27]. A predefined deep model called the scattering convolution network (SCN) is proposed in [20, 4, 25]. This model consists of hierarchical wavelet transformations and translation-invariant operators, which explains deep learning from the viewpoint of signal processing. However, none of these methods discuss a geometric explanation of CNNs from the viewpoint of fractal analysis.
Curve Detection: Curve detection is a potential application of fractal-based image processing in many practical tasks, such as power line detection [19], geological measurement [26], and rigid body detection [28]. More recently, curve detection techniques have been introduced into interdisciplinary fields, e.g., materials science, biology, and nanotechnology [48, 36, 17]. Surprisingly, although we show in the following section that the fractal-based image model is very suitable for curve detection, very few existing methods apply fractal analysis to this problem. Taking advantage of the directionality of curves, early curve detectors are based on diverse transformations, including the Hough transformation [9], curvelets [35], and wave atoms [43]. Besides direction, the multi-scale property of curves is considered via the multi-scale Fourier transformation [6], Frangi filtering [11], and the scale-space distance transformation [34]. Focusing on curve and line segment detection, the parameterless fitting model proposed in [28] achieves the state of the art. These methods principally construct an anisotropic filter bank and detect locally strong responses to certain directions. Beyond these manually designed methods, learning-based approaches have become popular as huge amounts of labeled images have become available [1, 51]. Focusing on edge detection, a problem related to curve detection, the structured-forest-based detector [7] and CNN-based detectors [33, 2, 42, 32] have been proposed. These methods learn their parameters on large datasets and thus have powerful generalization ability for challenging cases. However, most existing methods aim to detect sparse curves from a relatively smooth background; few of them can detect complicated curves from texture-like images.
3 Fractal Dimension Invariant Filtering
In this section, we introduce our fractal-based image model and show the derivation of the local fractal dimension. Based on this model, we propose an iterative fractal dimension invariant filtering method, which preserves the local fractal dimensions of patches across measurements during feature extraction.
3.1 Fractal-based Image Model
As shown in Fig. 2, a typical fractal is generated by transforming a geometry into scaled-down analogues and then applying the transformation recursively to each analogue. The union of the analogues is a fractal, denoted as $F$. The fractal is a “Mathematical monster” that is unmeasurable in the ordinary measure space. Therefore, the analysis of fractals is mainly based on the Hausdorff measure [21], which gives rise to the concept of fractal dimension. The fractal dimension arises from a power law of measurements across multiple scales, i.e., the measurement of $F$ at scale $r$ satisfies $\mu_r(F) \propto r^{D}$. Here $D$ is called the fractal dimension, which is larger than the topological dimension of $F$.
In our work, an image is represented via a function on pixels, denoted as $I$. Here $X$ is the union of the coordinates of the pixels, and each pixel coordinate is denoted as $x \in X$. We propose a fractal-based image model, representing $X$ as a union of local fractals and the image as $(X, \mu)$, where $\mu$ is a measurement supported on the fractal set $X$. According to the power law of measurements mentioned above, for each pixel $x$ we have $\mu(B(x, r)) \propto r^{D(x)}$, where $B(x, r)$ is a ball centered at $x$ with radius $r$ and $D(x)$ is the local fractal dimension at $x$ under the measurement $\mu$. Here, we use the intensity of pixels as the measurement directly, so the local fractal dimension at $x$ is
$D(x) = \lim_{r \to 0} \dfrac{\log\big((G_r * I)(x)\big)}{\log r}$,  (1)
where $\mu(B(x, r)) = (G_r * I)(x)$, $G_r$ is a Gaussian kernel with scale $r$ as defined in [45, 44], and “$*$” indicates the valid convolution.
In practice, we estimate the local fractal dimension in (1) numerically by linear regression. Specifically, we calculate sample pairs $\{(\log r_n, \log \mu(B(x, r_n)))\}_{n=1}^{N}$ by multi-scale Gaussian filtering, and learn a linear model $\log \mu(B(x, r_n)) \approx D(x) \log r_n + \log L(x)$ for all $x$ according to (1). Here $L(x)$ is the value of the measurement in the unit ball ($r = 1$), which is interpreted as the $D(x)$-dimensional fractal length in [39]. Algorithm 1 gives the scheme of fractal dimension estimation.
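This regression scheme can be sketched numerically as follows (a minimal sketch; the function name, the choice of scales, and the use of SciPy's normalized Gaussian filter as the measurement are our assumptions, not the paper's exact implementation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_fractal_dimension(image, radii=(1.0, 2.0, 3.0, 4.0)):
    """Estimate the per-pixel fractal dimension D(x) by linear regression:
    the slope of log mu_r(x) against log r, where mu_r = G_r * I."""
    image = image.astype(np.float64)
    # multi-scale Gaussian measurements mu_r (small epsilon avoids log(0))
    logs = np.stack([np.log(gaussian_filter(image, r) + 1e-8) for r in radii])
    log_r = np.log(np.asarray(radii))
    lr = log_r - log_r.mean()
    # closed-form least-squares slope, computed at every pixel simultaneously
    slope = np.tensordot(lr, logs - logs.mean(axis=0), axes=(0, 0)) / (lr ** 2).sum()
    return slope
```

The intercept of the same regression gives $\log L(x)$, the fractal length. Note that this sketch uses a normalized Gaussian, so a constant image yields slope zero; the paper's exact measurement definition may differ.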
The local fractal dimension contains important structural information of the image: smooth patches have fractal dimensions close to 2, patches containing curves have fractal dimensions close to 1, and textures have fractal dimensions between 1 and 2 [10, 44]. To detect structures, e.g., curves, in images robustly, the fractal dimension shall be preserved. One fundamental property of the fractal dimension is its invariance to bi-Lipschitz transformations, shown in Theorem 1:
Theorem 1.
Bi-Lipschitz Invariance. For a fractal $F$ with fractal dimension $D$, its bi-Lipschitz transformation $g(F)$ is still a fractal, whose fractal dimension equals $D$.
Recalling (1), we find that the fractal dimension is not unique; it depends on the choice of the measurement $\mu$. The theorem holds because a bi-Lipschitz transformation (i.e., a geometric transformation or non-rigid deformation of the image) does not change the measurement of the fractal, as revealed by the proof in the appendix.
However, after filtering or convolution, the invariance of the fractal dimension no longer holds. For example, if we change the convolution kernel in (1), the measurement of the fractal and the associated fractal dimension will change accordingly. Therefore, we cannot find a filter ensuring that the fractal dimension of the filtering result is exactly the same as that of the original image.
To pursue the fractal-dimension-preservation philosophy in the face of the reality that filtering inevitably changes the fractal dimension, we aim to suppress the expected change between the original fractal dimension and the filtered one. Denote the proposed filter as $f$, and the measurement and fractal dimension of the filtering result as $\mu_f$ and $D_f$, respectively. We assume that the filter is a random variable obeying a probabilistic distribution. According to (1), we have
$\mathbb{E}_f\big[\mu_f(B(x, r))\big] = \mathbb{E}_f\big[(G_r * f * I)(x)\big] = (G_r * \mathbb{E}[f] * I)(x)$,  (2)
where $\mathbb{E}[\cdot]$ computes the expectation of a random variable. Obviously, to minimize the expected change between $D(x)$ and $D_f(x)$, the expectation of the filter should be as close to an impulse function as possible.
3.2 Iterative FDIF Framework
Motivated by the analysis above, we propose the following iterative FDIF method, as detailed in Fig. 1(a).
Anisotropic Filtering: To suppress the change of fractal dimension, the expectation of the filter shall be as close as possible to an impulse function. Anisotropic filters are a natural choice for this purpose. Take directional filtering [30] as an example: for each pixel $x$, compute the smoothed gradient matrix in its neighborhood $N(x)$ as $T(x) = \frac{1}{|N(x)|} \sum_{y \in N(x)} g(y) g(y)^{\top}$, with $g(y) = \mathrm{vec}\big((G * \partial_h I)(y), (G * \partial_v I)(y)\big)$. Here $G$ is a Gaussian filter, $|N(x)|$ is the cardinality of the neighborhood, $\partial_h$ ($\partial_v$) is the partial differential operator along the horizontal (vertical) direction, and $\mathrm{vec}(\cdot)$ denotes vectorization. The eigenvector corresponding to the largest eigenvalue of $T(x)$, denoted as $[\cos\theta(x), \sin\theta(x)]^{\top}$, indicates the direction information at $x$. Such a direction field of the image induces a series of directional filters in the polar coordinate system, denoted as $f_{\theta(x)}$, whose elements satisfy
$f_{\theta(x)}(\rho, \varphi) \propto \exp\Big(-\dfrac{\rho^2 \sin^2(\varphi - \theta(x))}{2\sigma^2}\Big)$,  (3)
Obviously, the filtering result at $x$ has the strongest response when the local structure is aligned with $\theta(x)$. The directional filters satisfy the following proposition:
Proposition 2.
If the distribution of the pixels' directions is uniform, then the expectation of the filters in (3) is an impulse function centered at the origin.
The proof is given in the appendix. Fig. 3 visualizes several typical directional filters and, rightmost, their mean, which further verifies the proposition. Recalling (2), as long as the distribution of directions in the direction field of the image is uniform, the proposition indicates that the proposed filters tend to preserve the expected value of the fractal dimension after filtering.
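As an illustration, both the direction-field computation and Proposition 2 can be checked numerically (the structure-tensor formulation and the discrete line kernels below are our illustrative assumptions, not the paper's exact filters):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def direction_field(image, sigma=2.0):
    """Per-pixel orientation from the smoothed gradient (structure) tensor:
    the angle of the dominant eigenvector of [[Jxx, Jxy], [Jxy, Jyy]]."""
    gy, gx = np.gradient(image.astype(np.float64))
    jxx = gaussian_filter(gx * gx, sigma)
    jyy = gaussian_filter(gy * gy, sigma)
    jxy = gaussian_filter(gx * gy, sigma)
    # closed-form orientation of the dominant eigenvector of the 2x2 tensor
    return 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)

def line_filter(size, theta):
    """Illustrative directional filter: unit mass spread along the line
    through the kernel centre at angle theta, then normalised."""
    k = np.zeros((size, size))
    c = size // 2
    for t in np.linspace(-c, c, 4 * size):
        k[int(round(c + t * np.sin(theta))), int(round(c + t * np.cos(theta)))] = 1.0
    return k / k.sum()

# Proposition 2, numerically: averaging over uniformly distributed
# directions concentrates the mass at the centre (an approximate impulse).
mean_filter = np.mean([line_filter(15, th)
                       for th in np.linspace(0, np.pi, 64, endpoint=False)], axis=0)
```

The centre pixel belongs to every oriented line, while any off-centre pixel belongs only to the few filters whose line passes through it, which is why the average concentrates at the centre.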
Nonlinear Post-processing: Anisotropic filtering prevents the expected fractal dimension from changing globally. Furthermore, we propose a transformation to preserve the local fractal dimensions of the filtering result $I_f$. In particular, although the local fractal dimension $D_f(x)$ under the measurement $\mu_f$ is not equal to the original $D(x)$ under $\mu$, we can apply a transformation $t(\cdot)$ to $\mu_f$ such that the fractal dimension under the new measurement $t(\mu_f)$, denoted as $\hat{D}(x)$, is equal to $D(x)$. According to the definition of the fractal dimension in (1) and the relationship given by Algorithm 1, it is easy to find that the proposed transformation should be $t(z) = z^{\gamma(x)}$, where $\gamma(x) = D(x)/D_f(x)$. In this situation, we have $\log t(\mu_f(B(x, r))) = \gamma(x) \log \mu_f(B(x, r)) \approx D(x) \log r + \gamma(x) \log L_f(x)$.
In other words, the local fractal dimension $\hat{D}(x) = \gamma(x) D_f(x) = D(x)$. Then we apply the transformation directly to the filtering result such that the local fractal dimension is preserved under the new measurement. At each $x$, we have
$\hat{I}(x) = s(x) \Big(\dfrac{I_f(x)}{s(x)}\Big)^{\gamma(x)}, \quad s(x) = \dfrac{1}{|N(x)|} \sum_{y \in N(x)} I_f(y)$.  (4)
Here the normalization term $s(x)$ preserves the energy of the filtering result; the transformation merely changes the fractal length.
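A sketch of this fractal-dimension-preserving post-processing (the local-mean normalization, window size, and epsilon guard are our implementation assumptions):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fd_preserving_transform(i_f, gamma, size=5):
    """Raise the filtered response to the power gamma = D(x)/D_f(x),
    normalised by a local mean s(x) so the overall energy of the result
    is preserved (only the fractal length changes)."""
    s = uniform_filter(i_f.astype(np.float64), size) + 1e-8  # s(x), guarded
    return s * np.power(np.maximum(i_f, 0.0) / s, gamma)
```

With gamma equal to 1 the transform is the identity on non-negative inputs; an exponent above 1 amplifies responses above the local mean and suppresses those below it, which matches the sharpening behaviour described in the text.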
Iterative Framework: Combining the anisotropic filtering with the post-processing, we obtain the proposed FDIF method. As Fig. 1(a) shows, FDIF can be applied iteratively in order to extract structures hidden in images.
Take curve detection as an example. Fig. 9 illustrates the enlarged output of an AFM image in each iteration and compares the iterative filtering process with the traditional path operator [22]. We can see that the pixels corresponding to curves become more and more discriminative. When the labels of curves are available, we learn the curve detector as a binary classifier with the help of logistic regression: sampling the final filtering result into patches with overlaps, we learn the parameters of the sigmoid function. If the labels are unavailable, we simply apply a thresholding method [24] to convert the filtering result into a binary image. By contrast, traditional morphological filtering methods, e.g., the path operator [37, 22], also aim at detecting curves and tubes, but they are sensitive to noise in the image. These two detection methods are shown in the last layer in Fig. 1(a). The iterative FDIF-based curve detector is physically interpretable. The fractal dimension of a patch reflects its sharpness: a patch containing a curve has higher sharpness than a patch of a smooth region, whose fractal dimension tends to 2. The filters we use achieve an anisotropic smoothing of the image, so that the measurement of the fractal dimension is smoothed as well. Essentially, preserving fractal dimensions under a smoother measurement, as (4) does, enhances the sharpness of curves and suppresses the sharpness of the remaining regions, which provides a better representation of curves.
4 FraCNN: Implementing FDIF via CNN
In this section, we show that FDIF can be reinstantiated via a CNN, as described in Fig. 1(b). In particular, the convolution layer can be explained as an anisotropic filter bank, and the nonlinear layer approximately performs the post-processing function.
4.1 The Architecture of the CNN
Convolution Layer: The anisotropic filtering can be approximately implemented via a filter bank. At each pixel $x$, the process can be rewritten as
$I_f(x) = \max_{1 \le m \le M} (f_m * I)(x)$,  (5)
where $\{f_m\}_{m=1}^{M}$ is the bank of anisotropic filters. The $\max$ operator preserves only the filtering result having the maximum response.
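This max-over-bank layer can be sketched as follows (the boundary mode and the helper name are our assumptions; the filter bank itself is supplied by the caller):

```python
import numpy as np
from scipy.ndimage import convolve

def filter_bank_layer(image, bank):
    """Convolve the image with every anisotropic filter in the bank and
    keep, at each pixel, only the maximum response."""
    responses = np.stack([convolve(image.astype(np.float64), f, mode='reflect')
                          for f in bank])
    return responses.max(axis=0)
```

Keeping only the per-pixel maximum acts as the selection of the best-aligned direction, replacing the per-pixel eigen-decomposition of the original FDIF.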
Nonlinear Layer: The proposed post-processing can also be approximated via the following nonlinear layer:
$\hat{I}(x) = s(x) \Big(\dfrac{\max(0, I_f(x))}{s(x)}\Big)^{\gamma}$.  (6)
Here the normalization term $s(x)$ is implemented via a convolution $s = f_{\mathrm{mean}} * I_f$, where $f_{\mathrm{mean}}$ is a mean filter, which sums the intensities in the neighborhood of each $x$. Different from neuroscience, we explain the rectified linear unit (ReLU, $\max(0, \cdot)$) based on fractal analysis. The ReLU ensures that the filtering result is a valid measurement (like the measurement used in the box-counting method [10, 45]): a valid measurement satisfies non-negativity, countable additivity, and the null empty set simultaneously. The null-empty-set condition is satisfied by our filtering result naturally, while the ReLU operator guarantees non-negativity and countable additivity.
Note that the exponent $\gamma$ of the transformation can be fixed approximately as a constant. This approximation is reasonable for the problem of curve detection. On the one hand, we model the coordinates of the image as a set of fractals whose fractal dimensions lie strictly below 2, the topological dimension of 2D geometry, because the fractal dimension of a fractal generated from a 2D geometry via a 2D transformation cannot reach 2. On the other hand, after filtering, the curves are also modeled as a set of fractals with fractal dimensions in $[1, 2)$, where 1 is the topological dimension of a curve (1D geometry). Based on the fractal-based model, $\gamma = D(x)/D_f(x)$ is therefore bounded, and when the variations of $D(x)$ and $D_f(x)$ are small, we can estimate $\gamma$ as a single constant for all $x$'s.
4.2 FraCNN-based Curve Detection
The iterative FDIF framework can be achieved via stacking the layers above; the architecture of the proposed CNN is shown in Fig. 1(b). For convenience, we call it FraCNN. Similar to the iterative FDIF framework, we can either add a sigmoid layer to the end of the CNN and train the model via the traditional back-propagation algorithm, or apply a thresholding layer for the final output. In contrast to many CNN models, which are known for their ravenous appetite for labeled training data, our method adapts to unlabeled data, perhaps because we instantiate our tailored CNN from a fractal-based geometry perspective. Focusing on the task of curve detection, we propose the detection algorithm shown in Algorithm 2.
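For illustration, an unsupervised forward pass of such a network might be sketched as follows (the oriented-Gaussian kernels, the fixed exponent, the depth, and the mean-based threshold are all our illustrative assumptions rather than the paper's exact choices):

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def oriented_kernel(size, theta):
    """Hypothetical anisotropic kernel: a Gaussian elongated along theta."""
    c = size // 2
    y, x = np.mgrid[-c:c + 1, -c:c + 1]
    u = x * np.cos(theta) + y * np.sin(theta)    # along the direction
    v = -x * np.sin(theta) + y * np.cos(theta)   # across the direction
    k = np.exp(-u ** 2 / (2.0 * (0.6 * size) ** 2) - v ** 2 / (2.0 * 0.5 ** 2))
    return k / k.sum()

def fracnn_forward(image, n_dirs=8, size=9, gamma=1.2, depth=2):
    """Stacked (max-over-bank convolution, FD-preserving nonlinearity)
    layers, followed by a simple thresholding layer (unsupervised case)."""
    bank = [oriented_kernel(size, t)
            for t in np.linspace(0, np.pi, n_dirs, endpoint=False)]
    out = image.astype(np.float64)
    for _ in range(depth):
        out = np.stack([convolve(out, f, mode='reflect') for f in bank]).max(axis=0)
        s = uniform_filter(out, size) + 1e-8                  # local normalisation
        out = s * np.power(np.maximum(out, 0.0) / s, gamma)   # enhance curve responses
    return out > out.mean()                                   # thresholding layer
```

On a synthetic image containing a single bright line, this sketch responds strongly along the line and weakly elsewhere, matching the qualitative behaviour described for the unsupervised detector.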
We present further comparisons and analysis as follows.
FraCNN vs. FDIF: The proposed CNN model can be viewed as a fast implementation of FDIF. Firstly, the adaptive anisotropic filtering is approximately achieved by an anisotropic filter bank. The direction of the filter is no longer computed from the eigenvector of the local gradient matrix, but sampled uniformly from the interval $[0, \pi)$ (as Fig. 3 shows). Although such an approximation reduces the accuracy of the direction description, it avoids an eigen-decomposition for each pixel and thus accelerates the filtering process notably. Secondly, the ratio $\gamma$ between the filtered fractal dimension and the original one is replaced by a fixed value, so we do not need to apply Algorithm 1 to estimate fractal dimensions. As a result, the computational complexity of the original FDIF is dominated by two terms, one for the adaptive filtering and one for the local fractal dimension estimation (which grows with the number of scales in Algorithm 1), while the complexity of the proposed CNN is at most proportional to the number of filters in the filter bank.
FraCNN vs. Scattering Convolution Network: The work most related to our CNN model might be the scattering convolution network (SCN) in [4, 25]. Both our fractal-based CNN and the SCN can apply predefined filters and are suitable for unsupervised settings where labels are not available. However, there are several important differences between our model and SCNs. First, SCNs aim at extracting discriminative features for image recognition and classification, while our fractal-based CNN model focuses on low- and middle-level vision problems, i.e., curve detection. Second, the nonlinear layer of an SCN applies multiple nonlinear operators to enhance the invariance of features to geometric transformations; for example, the absolute-value operator is applied to achieve translation invariance. In our work, the nonlinear layer aims to preserve local fractal dimensions such that the local structural information of the image is enhanced; geometric invariance of the features is not our goal. Finally, different from wavelet transformations, our fractal-based CNN does not downsample the filtering result (i.e., there is no pooling operation).
5 Experiments
5.1 The AFM Image Benchmark and Protocols
We apply our fractal dimension invariant filtering method to a challenging real-world task: detecting structural curves in AFM images of materials. The demo code and partial data are available at https://sites.google.com/site/htxu313/resources/software. The images in this study are atomic force microscopy (AFM) phase images of nanofibers, each taken in tapping mode at a fixed scan size and resolution. The fibrillar structure of the material has a huge influence on its electronic properties and is represented via the complex salient curves in the image, as Fig. 1 shows. Detecting curves from the AFM images is challenging. First, the AFM images often suffer from heavy noise and low contrast, which has a negative influence on curve detection. Second, the curves in these scenes are very complicated: dense curves (i.e., nanofibers) with different shapes and directions are distributed in the image randomly and overlap with each other. The ground truth of the curves is extracted manually with a semi-automatic tool called FiberApp [38].
We compare our FraCNN-based curve detector with the original FDIF-based detector in both unsupervised and supervised cases. Specifically, we consider these two detectors with thresholding-based binary processing (BP) and logistic regression (LR) as the last layer, respectively. The anisotropic filters used in FDIF and FraCNN are shown in Fig. 3. To investigate the influence of the model's iteration number (depth) on learning results, we set the iteration number of FDIF to a relatively shallow or a relatively deep value; the depth of FraCNN is set accordingly. In the supervised case (note, only for the last layer), we split the AFM images into a training set and a testing set. Patches sampled from the output images of FDIF or FraCNN are used to train the parameters of the sigmoid layer. The half of the training patches whose central pixels correspond to curves are labeled as positive samples, while the remaining patches are negative ones.
To further demonstrate the superiority of our method, we consider the following competitors: the curve and line segment detector (ELSD) in [28]; the traditional Frangi filtering-based curve detector [11]; a simple logistic regression (LR) using patches as features directly; the classical CNN LeNet [18]; and the state-of-the-art holistically-nested edge detector (HED) [42]. Although HED is originally designed to detect edges, it should also be suitable for detecting curves because both curves and edges satisfy the assumption of multi-scale consistency. Therefore, we use the training images to fine-tune the pre-trained HED model and learn a curve detector accordingly (the training code and pre-trained model are from https://github.com/s9xie/hed). Following the instructions in [42], a post-processing step is applied to the output of the CNNs, achieving the shrinkage and binarization of detected curves. The logistic regression is trained on patches sampled randomly from the training images. The training samples of LeNet are also image patches; the only difference is the patch size. In the testing phase, each patch of a testing image is classified, and its label is used as the value of the corresponding pixel in the final binary map.
Similar to contour detection [1], we use the standard metrics for curve detection, including the optimal F-score with a fixed dataset-wide threshold (ODS), the optimal F-score with a per-image best threshold (OIS), and average precision (AP).
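These metrics can be sketched as follows (pixel-wise matching only; benchmark implementations such as [1] additionally allow a small spatial tolerance when matching detected curves to ground truth):

```python
import numpy as np

def f_score(pred, gt, threshold):
    """F-measure of a thresholded soft curve map against binary ground truth."""
    b = pred >= threshold
    tp = np.logical_and(b, gt).sum()
    prec = tp / max(b.sum(), 1)
    rec = tp / max(gt.sum(), 1)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)

def ods_ois(preds, gts, thresholds=np.linspace(0.05, 0.95, 19)):
    """ODS: best F over a single dataset-wide threshold;
    OIS: mean of the per-image best F."""
    per_img = np.array([[f_score(p, g, t) for t in thresholds]
                        for p, g in zip(preds, gts)])
    ods = per_img.mean(axis=0).max()
    ois = per_img.max(axis=1).mean()
    return ods, ois
```

A perfect soft prediction attains ODS = OIS = 1 under this sketch, while an empty prediction scores 0 at every threshold.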
5.2 Experimental Results
Table 1 gives comparison results for the various methods, and Figs. 5 and 6 visualize some typical results. More experimental results are provided in the appendix.
The traditional image processing methods, the ELSD and the Frangi filter, seem unsuitable for detecting complicated curves in our case. The ELSD method aims at detecting line segments and elliptical curves of rigid bodies in natural images. The Frangi filter was originally designed for detecting vessels in medical images. Both of these methods can only detect sparse curves from a relatively smooth background. In our case, however, the curves of nanofibers are very dense and complex, and the AFM images are generally noisy. As a result, the ELSD cannot detect complete curves and obtains very low ODS, OIS, and AP, while the Frangi filtering method is not robust to noise and contrast changes and can only obtain chaotic results.
The learning-based approaches, including LR, LeNet, and HED, achieve much better results (i.e., higher ODS, OIS, and AP) than the basic image processing methods. However, their results are still very noisy. In Fig. 5, LR's results contain many non-curve pixels and many broken curves. LeNet offers some improvement: long curves are detected correctly, but there are still many non-curve pixels. HED is superior to LR and LeNet: long curves are detected with more confidence, and fewer incorrect isolated pixels appear in the results. Table 1 shows the superiority of HED.
FDIF and FraCNN both achieve encouraging results. Specifically, our unsupervised methods, FDIF+BP and FraCNN+BP, outperform the other non-learning methods (ELSD and Frangi) notably, with better performance in Table 1 and better visual results in Fig. 5. Additionally, FDIF+BP and FraCNN+BP are also better than some learning-based methods: they obtain higher ODS, OIS, and AP than LR and LeNet. These comparison results demonstrate that the fractal-based image model is suitable for the problem of curve detection and that our methods can extract representative features of curves. In the supervised case, our FDIF+LR and FraCNN+LR methods outperform all the competitors in ODS and OIS while getting slightly worse AP than HED. Moreover, from the enlarged comparison results in Fig. 6, we can see that HED's result is still very coarse, while our method obtains thin curves. These results demonstrate that the proposed methods are at least comparable to the state of the art for curve detection. Note that our method is superior to HED in terms of computational complexity: in each layer, our FraCNN applies only 2D convolutions to the image itself, while HED applies convolutions to image tensors with many channels, so its per-layer complexity is much higher.
One important observation is that although FraCNN can be viewed as an implementation of FDIF, it sometimes outperforms FDIF in Table 1. A potential explanation for this phenomenon is that FDIF is more sensitive to noise in the image. Specifically, the flexibility of FDIF in selecting directions might be a "double-edged sword": heavy noise in the image leads to bad estimates of the filters' directions and has a negative influence on the filtering results. FraCNN, however, uses a predefined anisotropic filter bank, and the limited choice of directions might help to suppress the influence of noise. Additionally, experimental results show that as the iteration number (depth) increases, the performance of our methods degrades slightly. From the viewpoint of numerical analysis, too many iterations or too deep an architecture might lead to underflow of pixel values. In the unsupervised case, instead of fine-tuning the threshold case by case, we uniformly fix the threshold for fair comparison. Note that the threshold can have a direct impact on the final results: some underflow points might appear on curves, and the thresholding operation might then break a complete curve into several short segments. In the supervised case, the underflow points in patches also hurt the representation of curves, which has a negative influence on training the sigmoid layer.
Furthermore, we select texture images containing curves from the public Brodatz texture data set [3], label them manually, and test our method on them. Typical visual and numerical results are shown in Fig. 7, further verifying the performance of our method.



5.3 Robustness to Missing Labels
Compared with state-of-the-art learning-based detectors, an important advantage of the proposed method is its ability to detect unlabeled curves. The ground truth of curves is labeled manually, and for texture-like complex image samples, human labelers are likely to miss some subtle or short curves, as exemplified in Fig. 6(a). As a result, learning-based methods (e.g., HED) tend to ignore many existing curves or merge them together, because in the training phase they have been “taught” to pay less attention to such unlabeled curves (see Fig. 6(b)). On the contrary, our method (e.g., FDIF+BP) is more robust to unlabeled curves (see Fig. 6(c)). We attribute this partially to its intrinsically unsupervised nature: the representation of a curve aims at preserving the local fractal dimension rather than matching manual labels. As long as the response of a patch after anisotropic filtering is large enough, it is preserved to represent curves. From this viewpoint, our method can serve as a robust feature extraction method with the potential to label salient curves automatically.
5.4 Other Possible Applications
Besides curve detection, our fractal dimension invariant filtering method can also be used to create painting-style images from natural images. Since the objects in most paintings are drawn as a series of curved strokes, we can treat paintings as unions of curves (fractals). Therefore, we can apply the iterative FDIF method to a natural image, enhancing its strokes and suppressing its textures. Fig. 14 gives a typical example, and more visual results are given in the appendix. Similar to the neural algorithm in [13], our FDIF method has the potential to generate diverse artistic styles via designing or learning different anisotropic filters.
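The stroke-enhancing idea can be sketched in miniature. The following is a highly simplified, hypothetical stand-in for one iteration (not the paper's actual FDIF): filter with a small bank of oriented kernels, keep the maximum response over orientations, then apply a pointwise nonlinearity. The 3x3 kernels and the power-law nonlinearity are illustrative substitutes, not the authors' filters or their fractal-dimension-preserving step:

```python
# Hypothetical one-iteration sketch in the spirit of "enhance strokes,
# suppress textures": oriented filtering, max over orientations, then a
# pointwise nonlinearity. Kernels and nonlinearity are illustrative only.

KERNELS = [
    [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]],   # horizontal line detector
    [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]],   # vertical line detector
    [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]],   # diagonal line detector
    [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]],   # anti-diagonal line detector
]

def oriented_max_response(img):
    """Max filter response over the oriented bank (valid 3x3 correlation)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for i in range(h - 2):
        for j in range(w - 2):
            out[i][j] = max(
                sum(k[u][v] * img[i + u][j + v]
                    for u in range(3) for v in range(3))
                for k in KERNELS)
    return out

def enhance(img, gamma=0.5):
    """One iteration: oriented filtering, then a pointwise power law as a
    stand-in for the fractal-dimension-preserving nonlinearity."""
    resp = oriented_max_response(img)
    return [[max(r, 0.0) ** gamma for r in row] for row in resp]

# A vertical stroke responds strongly; flat regions give zero response.
img = [[1.0 if c == 4 else 0.0 for c in range(8)] for r in range(8)]
out = enhance(img)
print(out[0][3] > out[0][0])  # prints True: the stroke column dominates
```

Iterating such a step would progressively sharpen elongated structures relative to isotropic texture, which is the qualitative behavior the painting-style application relies on.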
6 Conclusion and Outlook
Taking an image as a union of local fractals, this paper presents a model combining anisotropic filtering with fractal dimension preservation. The model is also reimplemented from a CNN perspective. This work is the first attempt to bridge fractal-based image models with neural networks.
One notable characteristic of our method is its unsupervised feature extraction, which does not rely on manually labeled data. This can be of interest to the community: manual labeling in low-level vision problems is tedious and error-prone, which hurts the practical use of supervised learning approaches, while our method obtains competitive performance on these tasks against a supervised learning method (i.e., HED). From the feature learning perspective, we believe our fractal dimension invariant filtering can be further integrated with supervised learning techniques. Additionally, we will further explore potential applications of our method, e.g., the artistic style generation problem mentioned above.
Acknowledgment: The work is supported in part via NSF IIS-1639792, NSF DMS-1317424, NSF DMS-1620345, NSF 1258425, NSFC 61471235, the NSF FLAMEL IGERT Traineeship program, IGERT-CIF21, the Key Program of Shanghai Science and Technology Commission under Grant 15JC1401700, and the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Information under Grant U1609220.
7 Appendix
7.1 Proof of Theorem 1
Proof.
The mapping is a bi-Lipschitz transform if and only if is invertible and there exists so that holds for all . According to the definition, for arbitrary two points , we have
Recall the relationship that . The following condition holds for all ’s:
which implies and . ∎
7.2 Proof of Proposition 2
Proof.
According to the assumption, the expectation of the filters in (3) is , . For each element , when , we have
When , we have
When , . In summary, is the proposed impulse function. ∎
7.3 More Enlarged Experimental Results
Figs. 9–13 show enlarged experimental results of detecting nanofiber curves from AFM images. Figs. 14–19 show painting-style generation results of several famous portraits. The contrast of each image in Figs. 14–19 is adjusted so that the average intensity of FDIF's result equals the average intensity of the original image.
References
 [1] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. TPAMI, 33(5):898–916, 2011.
 [2] G. Bertasius, J. Shi, and L. Torresani. DeepEdge: A multi-scale bifurcated deep network for top-down contour detection. In CVPR, 2015.
 [3] P. Brodatz. Textures: a photographic album for artists and designers. Dover Pubns, 1966.
 [4] J. Bruna and S. Mallat. Invariant scattering convolution networks. TPAMI, 35(8):1872–1886, 2013.
 [5] H. C. Burger, C. J. Schuler, and S. Harmeling. Image denoising: Can plain neural networks compete with BM3D? In CVPR, 2012.
 [6] A. Calway and R. Wilson. Curve extraction in images using the multiresolution fourier transform. In ICASSP, 1990.
 [7] P. Dollár and C. L. Zitnick. Fast edge detection using structured forests. TPAMI, 37(8):1558–1570, 2015.
 [8] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image super-resolution. In ECCV, 2014.
 [9] R. O. Duda and P. E. Hart. Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15(1):11–15, 1972.
 [10] K. Falconer. Fractal geometry: mathematical foundations and applications. John Wiley & Sons, 2004.
 [11] A. F. Frangi, W. J. Niessen, K. L. Vincken, and M. A. Viergever. Multiscale vessel enhancement filtering. In Medical Image Computing and ComputerAssisted Interventation, pages 130–137. 1998.
 [12] G. Freedman and R. Fattal. Image and video upscaling from local self-examples. TOG, 30(2):12, 2011.
 [13] L. Gatys, A. Ecker, and M. Bethge. A neural algorithm of artistic style. Nature Communications, 2015.
 [14] R. Girshick, F. Iandola, T. Darrell, and J. Malik. Deformable part models are convolutional neural networks. In CVPR, 2015.
 [15] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV, 2009.
 [16] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.
 [17] S. Jordens, L. Isa, I. Usov, and R. Mezzenga. Non-equilibrium nature of two-dimensional isotropic and nematic coexistence in amyloid fibrils at liquid interfaces. Nature Communications, 4:1917, 2013.
 [18] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
 [19] Q. Ma, D. S. Goshi, Y.-C. Shih, and M.-T. Sun. An algorithm for power line detection and warning based on a millimeter-wave radar video. TIP, 20(12):3534–3543, 2011.
 [20] S. Mallat. Group invariant scattering. Communications on Pure and Applied Mathematics, 65(10):1331–1398, 2012.
 [21] B. B. Mandelbrot. The fractal geometry of nature, volume 173. Macmillan, 1983.
 [22] O. Merveille, H. Talbot, L. Najman, and N. Passat. Tubular structure filtering by ranking orientation responses of path operators. In ECCV, 2014.
 [23] P. Milanfar. A tour of modern image filtering: New insights and methods, both practical and theoretical. Signal Processing Magazine, 30(1):106–128, 2013.
 [24] N. Otsu. A threshold selection method from gray-level histograms. Automatica, 11(285–296):23–27, 1975.
 [25] E. Oyallon and S. Mallat. Deep roto-translation scattering for object classification. In CVPR, 2015.
 [26] J. W. Park, J. W. Lee, and K. Y. Jhang. A lane-curve detection based on an LCF. Pattern Recognition Letters, 24(14):2301–2313, 2003.
 [27] A. B. Patel, T. Nguyen, and R. G. Baraniuk. A probabilistic theory of deep learning. arXiv preprint arXiv:1504.00641, 2015.
 [28] V. Pătrăucean, P. Gurdjos, and R. G. Von Gioi. A parameterless line segment and elliptical arc detector with enhanced ellipse fitting. In ECCV. 2012.
 [29] A. P. Pentland. Fractal-based description of natural scenes. TPAMI, (6):661–674, 1984.
 [30] G. Peyré. Texture synthesis with grouplets. TPAMI, 32(4):733–746, 2010.
 [31] Y. Quan, Y. Xu, Y. Sun, and Y. Luo. Lacunarity analysis on image patterns for texture classification. In CVPR, 2014.
 [32] L. Shen, T. Wee Chua, and K. Leman. Shadow optimization from structured deep edge detection. In CVPR, 2015.
 [33] W. Shen, X. Wang, Y. Wang, X. Bai, and Z. Zhang. DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection. In CVPR, 2015.
 [34] A. Sironi, V. Lepetit, and P. Fua. Multiscale centerline detection by learning a scale-space distance transform. In CVPR, 2014.
 [35] J.L. Starck, E. J. Candès, and D. L. Donoho. The curvelet transform for image denoising. TIP, 11(6):670–684, 2002.
 [36] C. J. Takacs, N. D. Treat, S. Krämer, Z. Chen, A. Facchetti, M. L. Chabinyc, and A. J. Heeger. Remarkable order of a high-performance polymer. Nano Letters, 13(6):2522–2527, 2013.
 [37] H. Talbot and B. Appleton. Efficient complete and incomplete path openings and closings. Image and Vision Computing, 25(4):416–425, 2007.
 [38] I. Usov and R. Mezzenga. FiberApp: an open-source software for tracking and analyzing polymers, filaments, biomacromolecules, and fibrous objects. Macromolecules, 48(5):1269–1280, 2015.
 [39] M. Varma and R. Garg. Locally invariant fractal features for statistical texture classification. In ICCV, 2007.
 [40] C. Wang, E. Subashi, F.-F. Yin, and Z. Chang. Dynamic fractal signature dissimilarity analysis for therapeutic response assessment using dynamic contrast-enhanced MRI. Medical Physics, 43(3):1335–1347, 2016.
 [41] J. Xie, L. Xu, and E. Chen. Image denoising and inpainting with deep neural networks. In NIPS, 2012.
 [42] S. Xie and Z. Tu. Holistically-nested edge detection. In ICCV, 2015.
 [43] H. Xu, G. Zhai, L. Chen, and X. Yang. Automatic movie restoration based on wave atom transform and nonparametric model. EURASIP Journal on Advances in Signal Processing, (1):1–19, 2012.
 [44] H. Xu, G. Zhai, and X. Yang. Single image super-resolution with detail enhancement based on local fractal analysis of gradient. TCSVT, 23(10):1740–1754, 2013.
 [45] Y. Xu, H. Ji, and C. Fermüller. Viewpoint invariant texture description using fractal analysis. IJCV, 83(1):85–100, 2009.
 [46] Y. Xu, D. Liu, Y. Quan, and P. Le Callet. Fractal analysis for reduced reference image quality assessment. TIP, 24(7):2098–2109, 2015.
 [47] X. Yan, H. Chang, S. Shan, and X. Chen. Modeling video dynamics with deep dynencoder. In ECCV. 2014.
 [48] H. Yang and W. B. Lindquist. Three-dimensional image analysis of fibrous materials. In International Symposium on Optical Science and Technology, pages 275–282, 2000.
 [49] J. Yang, B. Price, S. Cohen, H. Lee, and M.H. Yang. Object contour detection with a fully convolutional encoderdecoder network. arXiv preprint arXiv:1603.04530, 2016.
 [50] L. Yu, Y. Xu, H. Xu, and X. Yang. Self-example based super-resolution with fractal-based gradient enhancement. In ICME Workshops, pages 1–6, 2013.
 [51] C. Zhang, X. Ruan, Y. Zhao, and M.H. Yang. Contour detection via random forest. In ICPR, 2012.
 [52] Q. Zhang and Y. Xu. Block-based selection random forest for texture classification using multifractal spectrum feature. Neural Computing and Applications, 27(3):593–602, 2016.