Let’s take a Walk on Superpixels Graphs: Deformable Linear Objects Segmentation and Model Estimation

Abstract

While robotic manipulation of rigid objects is quite straightforward, coping with deformable objects is an open issue. More specifically, tasks like tying a knot, wiring a connector or even surgical suturing deal with the domain of Deformable Linear Objects (DLOs). In particular, the detection of a DLO is a non-trivial problem, especially under clutter and occlusions (as well as self-occlusions). The pose estimation of a DLO amounts to identifying the parameters of a chosen model, e.g. a basis spline. It follows that the stand-alone segmentation of a DLO might not be sufficient to conduct a full manipulation task. This is why we propose a novel framework able to perform both semantic segmentation and b-spline modeling of multiple deformable linear objects simultaneously, without strict requirements on the environment (i.e. the background). The core algorithm is based on biased random walks over the Region Adjacency Graph built on a superpixel oversegmentation of the source image. The algorithm is initialized by a Convolutional Neural Network that detects the DLO's endcaps. An open source implementation of the proposed approach is also provided to ease the reproduction of the whole detection pipeline, along with a novel cables dataset to encourage further experiments.

Figure 1: Application of our algorithm to an image featuring a complex background (left). The first kind of output (center) is a Walk (white dots) over the Region Adjacency Graph (RAG) of the superpixel segmentation, which allows for computing a b-spline model of each object. Yellow boxes highlight the detections of cable terminals. The second output (right) consists in a segmentation of the whole image.

1 Introduction

Plenty of manipulation tasks deal with objects that can be modeled as non-rigid linear (or, more generally, tubular) structures. When the task has to be executed by a robot in an unstructured environment, particular effort must be devoted to the effectiveness, reliability and efficiency of the automated perception sub-system. Tying knots, for example, is a common though hard to automate activity. In particular, in surgical operations like suturing, grasping and knot-tying are very important and repetitive sub-tasks ([javdani2011modeling][jackson2015automatic][saha2007manipulation]). Similar knot-tying and path planning procedures, e.g. knot untangling, are also relevant to contexts like service, collaborative and rescue robotics ([lui2013tangled][nair2017combining][schulman2013tracking][hopcroft1991case]). As for industrial scenarios, one of the hardest tasks involving flexible linear objects is wire routing in assembly processes ([remde1999picking][yue2002manipulating][alvarez2016approach][koo2008development]). This paper focuses primarily on the industrial field and extends the original concepts introduced in the WIRES project (see footnote 1) and described in [degregorio2018]. As the project aims at automating the switchgear wiring procedure, cable modeling by perception is key to address sub-tasks like cable grasping, terminal insertion, wireway routing and, not least in importance, simulation and validation.

In this paper we propose a novel computer vision algorithm for generic Deformable Linear Object (DLO) detection. As highlighted in Fig. 1, the proposed algorithm yields a twofold representation of detected objects, namely a b-spline model for each target alongside a segmentation of the whole image. This twofold representation helps address both relatively simple application settings dealing with cable detection and more complex endeavours calling for estimation of cable bend. The proposed algorithm consists of two distinct modules. The first, which may be considered a pre-processing stage, detects the end-cap regions of DLOs by exploiting off-the-shelf Convolutional Neural Networks ([redmon2017yolo9000, huang2016speed]). The second module is the core of this work and identifies DLOs based on the coarse position of their endpoints in images featuring complex backgrounds as well as occlusions (e.g. cables crossing other cables) and self-occlusions (e.g. a cable crossing itself, possibly several times).

The algorithm exploits an over-segmentation of the source image into superpixels to build a Region Adjacency Graph (RAG [tremeau2000regions]). This representation makes it possible to detect the area enclosing each target object by efficiently analyzing only meaningful regions (i.e. superpixels) rather than the whole pixel set. The task is accomplished by an iterative procedure capable of finding the best path (or walk) through the RAG between two seed points by analyzing several local and global features (e.g. visual similarity, overall curvature etc.). This iterative procedure yields a directed graph of superpixels conducive to vectorization and b-spline approximation. As better explained in the remainder of the paper, our approach is mostly unsupervised (i.e. only the endpoint detection CNN is trained with supervision) and relies on just a few parameters that can be easily tuned manually (or estimated based on the characteristics of the object's material, e.g. its elasticity or plasticity). As we shall see in Sec. 4, our algorithm outperforms other known approaches that one may apply to try solving the addressed task.

2 Related Work

Object Type      Curvature        Intersections  Bifurcations
Cables           Spline Model     A              N
Fingerprints     Low and Bounded  N              A
Guidewires       Spline Model     N              N
Pavement Cracks  Random           A              A
Power Lines      Spline Model     A              N
Roads            Model Based      A              A
Ropes            Spline Model     A              N
Surgery Threads  Spline Model     A              N
Vessels          Low and Bounded  N              A
Table 1: Curvilinear objects alongside their key features. Curvature expresses the high-level shape representation for that object. Intersections indicates whether a crossing between two objects is allowed or not (A = Allowed, N = Not allowed); Bifurcations denotes whether an object may bifurcate into multiple parts (A) or not (N).

Although the literature concerning DLOs is more focused on manipulation than on perception (see e.g. [saha2007manipulation]), we may refer to the broader topic of Curvilinear Objects Segmentation to highlight related works as well as suitable alternatives to evaluate our proposal comparatively. As described in [bibiloni2016survey], the aforementioned topic pertains to several kinds of objects, as summarized in Table 1. Our target objects are Cables, which, however, share similar properties in terms of Intersections and Bifurcations with the other categories highlighted in bold in Table 1.

As far as Cables are concerned, visual perception is typically addressed in fairly simple settings. In [jiang2011robotized] Augmented Reality markers are deployed to track end-points. In other works, like [remde1999picking] and [camarillo2008vision], detection relies on background removal (see footnote 2).

Moving to the domain of knot-tying with Ropes, the basic approach still turns out to be background removal, like in [hopcroft1991case], or its 3D counterpart, plane removal, like in [lui2013tangled] and [schulman2016learning]. All these methods produce a raw set of points on which a region growing algorithm is run to attain a vectorization of the target object. A different approach is used in [schulman2013tracking], where the model of the object is registered to the 3D point cloud in real-time in order to avoid the segmentation step. As described in [nair2017combining], deep learning can also be used to track a deformable object: deep features are associated with rope configurations so as to establish a direct mapping toward energy configurations without any explicit modeling step. This approach, however, is unlikely to work effectively in the presence of complex and/or unknown backgrounds.

In the medical field, Surgery Threads detection is the same kind of problem, albeit at a smaller scale. The literature dealing with this domain is also more focused on manipulation than on detection issues, either assuming the latter as solved [moll2006path] or addressing it by hand-labeled markers [javdani2011modeling]. A more scalable approach is proposed in [padoy20113d] and [jackson2015automatic], where the authors borrow the popular Frangi Filter [frangi1998multiscale] from the field of vessels segmentation in order to enhance the curvilinear structure of suture threads and produce a binary segmentation from which a spline model can be estimated.

Although Table 1 suggests that Ropes, Guidewires, Power Lines and Surgery Threads exhibit more commonalities with Cables, application domains like Vessels Segmentation or Road detection can provide interesting insights and solutions for curvilinear object detection. Akin to most object detection problems, the most successful approaches to Vessels Segmentation leverage deep learning. In [liskowski2016segmenting] the authors trained a Convolutional Neural Network (CNN) with hundreds of thousands of labeled images in order to obtain a very effective detector. This supervised approach, however, mandates the availability of a huge training set and lots of man-hours, a combination hardly compatible with real-world industrial settings. Similar considerations apply to other remarkably effective vessels segmentation approaches based on supervised learning, like [melinvsvcak2015retinal] and [li2016cross]. Yet, in this realm several methods exploiting 2D filtering procedures are very popular and may be applied within a cable detection pipeline for industrial applications. The first interesting approach was developed by Frangi et al. [frangi1998multiscale] (hereinafter Frangi algorithm, see footnote 3) and consists in a multi-scale filtering procedure capable of enhancing tubular structures. Another method, described in [staal2004ridge], deploys a pre-processing stage based on ridge detection (hereinafter Ridge algorithm, see footnote 4) to detect vessels. Although the authors use the outcome of this stage to feed a pixel-wise classifier, we found that the stage itself is very useful to highlight generic tubular structures and hence chose to compare this algorithm with ours in the experiments. Given that both the Frangi and Ridge algorithms deal with detection/enhancement of 2D curvilinear structures, we decided to include in the evaluation also ELSD [puatruaucean2012parameterless], which is a popular parameter-less algorithm aimed at the detection of line segments and elliptical arcs.

3 Algorithm Description

Figure 2: A graphical representation of the whole pipeline described in detail in Sec. 3

The basic idea underlying our algorithm is to detect DLOs as suitable walks within an adjacency graph built on superpixels. We provide first an overview of the approach with the help of Fig. 2, which illustrates the following main steps of the whole pipeline.

  0. Endpoints Detection: The first step consists in detecting endpoints. This is numbered as zero because it may be thought of as an external process not tightly linked to the rest of the algorithm. Indeed, any external algorithm capable of producing detections around targets may be deployed in this step.

  1. Superpixel Segmentation: The source image is segmented into adjacent sub-regions (superpixels) in order to build a set of segments exhibiting a far smaller cardinality than the whole pixel set. Moreover, an adjacency graph is created on top of this segmentation in order to keep track of the neighborhood of each superpixel. Finally, those superpixels containing the 2D detections obtained in the previous step are marked as seeds.

  2. Start Walks: From each seed we can start an arbitrary number of walks by moving into adjacent superpixels as defined by the adjacency graph.

  3. Extend Walks: For each walk we move forward along the adjacency graph by iteratively choosing the best next superpixel (see Fig. 2-(3)) within the neighbourhood of the current one.

  4. Terminate Walks: When a walk reaches another seed (or lies in its neighborhood) it is marked as closed. Due to the iterative nature of the computation, a maximum number of extending steps is allowed for each search to ensure a bounded time complexity.

  5. Discard Unlikely Walks: As a set of random walks is started in Step 2, we keep only the most likely ones and mark the others as outliers.

(1) Example of crossing and self-crossing wires (2) Visual likelihood (3) Curvature likelihood (4) Distance likelihood
Figure 3: (1) shows a complex configuration with crossing and self-crossing wires. The zoom highlights the last node of the current walk alongside the candidates for the next node, i.e. first and second order neighbours (green and blue dots, respectively). (2), (3) and (4) plot the Visual, Curvature and Distance likelihoods, respectively. The node that extends the current walk is selected based on the contribution of all three likelihood terms.

In the remainder of this Section we will describe in detail the different concepts, methods and computations needed to realize the whole pipeline. In particular, in Sec. 3.1 we address Superpixels together with the Region Adjacency Graph; in Sec. 3.2 we define walks and how they can be built iteratively by analyzing local and global features; in Sec. 3.3 we propose a method to start the random walks by exploiting an external Object Detector based on a CNN and, to conclude, in Sec. 3.4 we describe how to deploy walks to attain a semantic segmentation of the image as depicted in Fig. 1.

3.1 Superpixel Segmentation and Adjacency Graph

The main aim of Superpixel Segmentation is to replace the rigid structure of the pixel grid with a higher-level subdivision into more meaningful primitives called superpixels. These primitives group regions of perceptually similar raw pixels, thereby potentially reducing the computational complexity of further processing steps. As regards the Cable Detection and Segmentation problem (i.e. DLOs with a thickness larger than a single pixel, in the majority of applications), our assumption is that the target wire can be represented as a subset of similar adjacent superpixels. Thus, the overall problem can be seen as a simple iterative search through the superpixel set subject to model-driven constraints (e.g. avoiding solutions with implausible curvature or non-uniform visual appearance).

Superpixel Segmentation algorithms can be categorized into graph-based approaches (e.g. the method proposed by Felzenszwalb et al. [felzenszwalb2004efficient]) and gradient-ascent methods (e.g. Quick Shift [vedaldi2008quick]). In our experiments we found the state-of-the-art algorithm referred to as SLIC [achanta2012slic] to perform particularly well in terms of both speed and accuracy. Accordingly, we deploy SLIC in our Superpixel Segmentation stage. SLIC is an adaptation of the k-means clustering algorithm to pixels represented as 5D vectors $[l, a, b, x, y]$, with $(l, a, b)$ denoting color channels in the CIELAB space and $(x, y)$ image coordinates. During the clustering process the compactness of each cluster can be either increased or reduced to the detriment of visual similarity. In other words, we can easily choose to assign more importance to the visual consistency of superpixels or to their spatial uniformity. Fig. 4 (b),(c) show two segmentations provided by SLIC according to different settings for the visual consistency vs. spatial uniformity trade-off.
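The compactness trade-off can be made concrete with the combined distance that SLIC [achanta2012slic] uses when assigning a pixel to a cluster centre, $D = \sqrt{d_c^2 + (d_s / S)^2 m^2}$, where $d_c$ is the colour distance, $d_s$ the spatial distance, $S$ the grid step and $m$ the compactness. The snippet below is only an illustrative sketch: the function name, the sample pixel and centres, and the values chosen for $S$ and $m$ are assumptions, not values reported in this work.

```python
import math

def slic_distance(p, q, grid_step, compactness):
    """Combined SLIC distance between two (l, a, b, x, y) vectors."""
    d_colour = math.dist(p[:3], q[:3])   # distance in CIELAB space
    d_space = math.dist(p[3:], q[3:])    # distance in image coordinates
    return math.sqrt(d_colour ** 2 + (d_space / grid_step) ** 2 * compactness ** 2)

# Hypothetical pixel and two candidate cluster centres.
pixel = (50.0, 0.0, 0.0, 10.0, 10.0)
same_colour_far = (50.0, 0.0, 0.0, 30.0, 10.0)   # identical colour, 20 px away
other_colour_near = (60.0, 0.0, 0.0, 12.0, 10.0)  # different colour, 2 px away

def closest_centre(compactness, grid_step=20.0):
    """Return the name of the centre the pixel would be assigned to."""
    d_far = slic_distance(pixel, same_colour_far, grid_step, compactness)
    d_near = slic_distance(pixel, other_colour_near, grid_step, compactness)
    return "same_colour_far" if d_far < d_near else "other_colour_near"

low = closest_centre(compactness=1.0)    # colour dominates
high = closest_centre(compactness=40.0)  # spatial proximity dominates
```

With a low compactness the pixel joins the centre with the same colour even though it is farther away, whereas a high compactness favours the spatially closer centre, which mirrors the difference between Fig. 4 (b) and (c).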

Superpixel Segmentation then allows us to build a Region Adjacency Graph (RAG in short) according to the method described in [tremeau2000regions]. A generic image $I$ can be partitioned into disjoint non-empty regions $R_i$ (i.e. the superpixels) such that $I = \bigcup_i R_i$. Accordingly, an undirected weighted graph is given by $G = (V, E)$, where $V$ is the set of nodes $v_i$, each corresponding to a region $R_i$, and $E$ is the set of edges $e_{i,j}$ such that $e_{i,j} \in E$ if $R_i$ and $R_j$ are adjacent. A graphical representation of this kind of graph is shown in Fig. 4-(d), where black dots represent nodes $v_i$ and black lines represent edges $e_{i,j}$. It is quite straightforward to observe that in Fig. 4-(d) there exists a walk through the graph $G$, highlighted by white dots, which covers a target DLO (i.e. the red cable in the middle).
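As a minimal sketch of how such a graph can be derived from a superpixel label image, the snippet below scans a small label grid and connects the labels of neighbouring pixels. It is not the Ariadne implementation: the function name, the use of 4-connectivity and the omission of edge weights are simplifying assumptions made for illustration.

```python
from collections import defaultdict

def build_rag(labels):
    """Return {label: set of adjacent labels} for a 2D grid of superpixel labels."""
    rows, cols = len(labels), len(labels[0])
    graph = defaultdict(set)
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph[a]  # touch the key so that isolated regions still become nodes
            # Checking only the right and bottom neighbours visits each pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        graph[a].add(b)
                        graph[b].add(a)
    return dict(graph)

# Toy label image with four superpixels.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
]
rag = build_rag(labels)
```

In this toy grid regions 0 and 3 touch only diagonally, so under 4-connectivity they are not linked by an edge.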

It is worth pointing out that our approach is similar to a Region Growing algorithm, with a seed point corresponding to the cable's tip and the search space bounded by the RAG. The main difference with a classical Region Growing approach is that we restrict the search along a walk by applying several model-based constraints rather than relying on visual similarity only. In particular, the shape of the walk is considered by assigning geometric primitives to the elements of the adjacency graph, i.e. 2D points for our nodes $v_i$ and 2D segments for our edges $e_{i,j}$, as further described in Sec. 3.2. In simple terms, the geometric consistency of the curve superimposed on a walk is analyzed to choose the next node during the iterative search, and all unlikely configurations are discarded.

Figure 4: (a) Original input image. (b) SLIC superpixels segmentation with low compactness. (c) Segmentation with high compactness. (d) The Region Adjacency Graph (RAG) built on the (c) segmentation.

3.2 Walking on the Adjacency Graph

Formally, a walk over a graph $G = (V, E)$ is a sequence of alternating vertices and edges $(v_0, e_{0,1}, v_1, \dots, e_{l-1,l}, v_l)$, where an edge $e_{i,i+1}$ connects nodes $v_i$ and $v_{i+1}$, and $l$ is the length of the walk. The definition of walk is more general than that of a path or trail over the graph because it admits repeated vertices, a common situation when dealing with self-crossing cables. It is important to notice that the Region Adjacency Graph shown in Fig. 4(d) is a simple-connectivity relationship graph or, equivalently, its connectivity is of order $1$, meaning that only directly connected regions are mapped into the graph. We can also build RAGs with order $k > 1$, thereby allowing, for example, second or third order connectivity. All this translates into the possibility to jump, during the walk, to a vertex not directly connected to the current region. This turns out to be very useful, for example, to deal with intersections like that depicted in Fig. 3, where the green vertices are of order $1$ and the blue vertices of order $2$.
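Higher-order neighbourhoods can be obtained from a first-order adjacency graph with a bounded breadth-first search. The following sketch assumes the graph is stored as a dictionary mapping each vertex to the set of its direct neighbours; the function name and the toy chain graph are illustrative assumptions.

```python
from collections import deque

def neighbours_up_to_order(graph, start, max_order):
    """Map every vertex within `max_order` hops of `start` to its hop count."""
    order = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if order[v] == max_order:
            continue  # do not expand beyond the requested order
        for w in graph[v]:
            if w not in order:
                order[w] = order[v] + 1
                queue.append(w)
    del order[start]  # the start vertex is not its own neighbour
    return order

# Toy chain of four regions: 0 - 1 - 2 - 3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
first_order = neighbours_up_to_order(chain, 0, 1)
second_order = neighbours_up_to_order(chain, 0, 2)
```

Here `second_order` also contains region 2, which a walk standing on region 0 could reach by jumping over region 1, for instance when region 1 belongs to a crossing cable.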

For the sake of simplicity, we can define a generic walk as $\pi_n = (v_0, v_1, \dots, v_n)$, i.e. an ordered subset of vertices, without considering edges. Under the hypothesis that the target walk is superimposed on a portion of the object, the problem is to extend the walk in such a way that the next node does belong to the sought DLO. An exemplar situation is illustrated in Fig. 3, where we have a current path $\pi_n$ which ends with $v_n$ and we wish to choose, among the 8 candidate vertices, the best one to extend the walk. Considering the new path $\pi_{n+1}$, i.e. the path $\pi_n$ with the addition of a candidate vertex $v_{n+1}$, we cast the problem as the estimation of the likelihood of the new path given the current one, which we denote as $P(\pi_{n+1} \mid \pi_n)$. Moreover, we estimate this likelihood based on visual similarity, curvature smoothness and spatial distance features and assume these features to be independent:

$$P(\pi_{n+1} \mid \pi_n) = P_V(\pi_{n+1} \mid \pi_n) \, P_C(\pi_{n+1} \mid \pi_n) \, P_D(\pi_{n+1} \mid \pi_n) \tag{1}$$

The three terms $P_V$, $P_C$ and $P_D$ are referred to as Visual, Curvature and Distance likelihood, respectively, and are computed as follows.

Visual Likelihood

$P_V$ measures the visual similarity between the previous path $\pi_n$ and the path $\pi_{n+1}$ achievable by adding node $v_{n+1}$. Assuming an evenly coloured DLO, we can compute this similarity by matching only the last node of $\pi_n$, $v_n$, with $v_{n+1}$. Although, in principle, any visual matching function may be used, we found the Color Histogram of the superpixels associated with the vertices to be a good feature to compare two image regions, as also highlighted in [ning2010interactive]. Denoting as $H_n$ and $H_{n+1}$ the normalized color histograms (in the HSV color space) of the two regions associated with $v_n$ and $v_{n+1}$, respectively, we can compute their distance with the intersection equation: $d(H_n, H_{n+1}) = \sum_i \min(H_n^i, H_{n+1}^i)$. Then we normalize this distance in the range $[0, 1]$ using the Bradford normal distribution:

(2)

where the shape parameter of the distribution controls the weight assigned to the visual similarity information in the overall computation of the likelihood (Eq. 1). Fig. 3(2) plots the visual likelihoods computed for the different neighbours of the last node of the walk and shows which nodes represent the most likely superpixels to extend the walk.
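To illustrate the idea, the sketch below computes a histogram intersection and maps it to a score with the standard Bradford density on $[0, 1]$, $f(x; c) = c / (\log(1 + c)(1 + c x))$. Several points are assumptions made only for this example and are not taken from the paper: the intersection is turned into a distance as one minus its value, the shape parameter is set to `c = 5.0`, and all function names are hypothetical.

```python
import math

def histogram_intersection(h1, h2):
    """Intersection of two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def bradford_pdf(x, c):
    """Standard Bradford density on [0, 1] with shape parameter c > 0."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return c / (math.log1p(c) * (1.0 + c * x))

def visual_likelihood(h1, h2, c=5.0):
    """Score two regions; larger values mean more similar colours."""
    # Assumption: distance = 1 - intersection, clamped against rounding errors.
    distance = min(max(1.0 - histogram_intersection(h1, h2), 0.0), 1.0)
    return bradford_pdf(distance, c)

same_colour = visual_likelihood([0.5, 0.5], [0.5, 0.5])
different_colour = visual_likelihood([1.0, 0.0], [0.0, 1.0])
```

Because the Bradford density decreases monotonically on $[0, 1]$, a larger shape parameter penalizes colour differences more sharply, while a value close to zero makes the term almost uniform.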

Curvature Likelihood

$P_C$ is concerned with estimating the most likely configuration of a DLO's curvature. Following the intuitions of Predoehl et al. [predoehl2013statistical], for each new node $v_{n+1}$ we can assume that the object's curvature changes smoothly along the walk. To quantify this smoothness criterion we exploit the product of the von Mises distributions of the angles between two successive vertices. As introduced in Sec. 3.1, by extending the model of our adjacency graph with geometric primitives we can assign to each vertex $v_i$ a 2D point $p_i$ corresponding to the centroid of the associated superpixel, as well as a unit vector $\mathbf{u}_i$ to each edge by considering the segment joining two consecutive centroids $p_i$, $p_{i+1}$. As shown in Fig. 5, this allows for measuring the angle difference between two consecutive edges. By denoting as $\theta_i$ the angle difference between two consecutive edges $\mathbf{u}_{i-1}$, $\mathbf{u}_i$, the overall von Mises distribution used to assess the smoothness of the curvature of a target DLO is given by:

$$P_C(\pi_{n+1} \mid \pi_n) = \prod_{i=1}^{n} \mathcal{M}(\theta_i) \tag{3}$$

where $\mathcal{M}(\theta_i)$ is the von Mises distribution at each vertex. An exemplar estimation is shown in Fig. 3(3), where the most likely candidates to extend the walk are those that minimize the curvature changes of the target walk.

Figure 5: A generic unit vector $\mathbf{u}_i$ can be assigned to each edge in the adjacency graph so as to compute the angle difference $\theta_i$ between consecutive edges.
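The curvature term can be sketched in a few lines using only superpixel centroids. The code below multiplies zero-mean von Mises densities evaluated at the angle differences between consecutive edges. The concentration value `kappa = 4.0`, the function names and the toy point lists are assumptions chosen for illustration, and the Bessel function is approximated with its power series to keep the example free of external dependencies.

```python
import math

def bessel_i0(x, terms=30):
    """Modified Bessel function of order zero via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def von_mises_pdf(angle, kappa):
    """Zero-mean von Mises density evaluated at `angle` (radians)."""
    return math.exp(kappa * math.cos(angle)) / (2.0 * math.pi * bessel_i0(kappa))

def curvature_likelihood(points, kappa=4.0):
    """Product of von Mises densities over the turns of a polyline of centroids."""
    directions = [math.atan2(y2 - y1, x2 - x1)
                  for (x1, y1), (x2, y2) in zip(points, points[1:])]
    likelihood = 1.0
    for a, b in zip(directions, directions[1:]):
        turn = math.atan2(math.sin(b - a), math.cos(b - a))  # wrap to (-pi, pi]
        likelihood *= von_mises_pdf(turn, kappa)
    return likelihood

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
bent = [(0, 0), (1, 0), (1, 1), (0, 1)]   # two right-angle turns
smooth_score = curvature_likelihood(straight)
sharp_score = curvature_likelihood(bent)
```

A straight polyline receives the highest possible score for a given concentration, while each sharp turn multiplies the score by a small factor, so walks that zig-zag are quickly penalized.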

Distance Likelihood

$P_D$ is the term concerned with the spatial distance of the next vertex in the walk. This term is mainly introduced to push the iterative procedure towards the nearest available vertex without ruling out the chance to pick a farther vertex instead, for example when dealing with an intersection (see Fig. 3(1)). Thus, similarly to the Visual Likelihood, we normalize the distance in pixels between two nodes according to the Bradford normal distribution:

(4)

with the shape parameter tuned such that the distribution decays enough to prefer the nearest vertex, but not so much as to discard the furthest ones. Fig. 3(4) highlights how, thanks to the normalization in Eq. 4, second order neighbours are not excessively penalized with respect to first order ones and hence have the chance to be picked if they exhibit a high visual similarity and/or yield a particularly smooth walk.

Estimation of the most likely walk

$P(\pi_{n+1} \mid \pi_n)$ can therefore be computed for all considered neighbours in order to pick the most likely vertex, $\hat{v}_{n+1}$, to extend the walk, with:

$$\hat{v}_{n+1} = \arg\max_{v_{n+1}} P(\pi_{n+1} \mid \pi_n) \tag{5}$$

Considering again the example in Fig. 3, the vertex selected to extend the walk is the farthest from the last node of the current walk, yet it shows a high visual likelihood as well as a high curvature likelihood.
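The selection step can be summarized as a product of the three terms followed by an argmax. In the sketch below the candidate names and the likelihood values are made up for illustration and do not come from Fig. 3; the function name is likewise hypothetical.

```python
def pick_next_vertex(candidates):
    """candidates: {vertex: (visual, curvature, distance)} -> (best vertex, score)."""
    # Independence assumption: the overall likelihood is the product of the terms.
    scores = {v: p_vis * p_curv * p_dist
              for v, (p_vis, p_curv, p_dist) in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical likelihoods for three neighbours of the last node of a walk.
candidates = {
    "near_wrong_colour": (0.2, 0.9, 1.0),
    "near_sharp_turn": (0.9, 0.1, 1.0),
    "far_but_consistent": (0.9, 0.9, 0.7),
}
best_vertex, best_score = pick_next_vertex(candidates)
```

The farther candidate wins in this example because its lower distance term is more than compensated by its colour and smoothness terms, which is the behaviour needed to jump across an intersection.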

3.3 Starting and Terminating Walks

As described in Sec. 3, walks need to be initialized with seed superpixels located at the DLOs' endpoints. To this end, we deployed a Convolutional Neural Network to detect endpoints. In particular, we took the publicly available YOLO v2 model [redmon2017yolo9000] pre-trained on ImageNet and fine-tuned it on the images from our Electrical Cable Dataset (see Sec. 4.1), performing several data augmentations. As already mentioned, the endpoint detection module may be seen as an external process with respect to our core algorithm and, as such, in the comparative experimental evaluation we will use the same set of endpoints obtained by YOLO v2 to initialize all considered methods.

As illustrated in Fig. 2(1), the endpoint detection step predicts a set of bounding boxes around the actual endpoints. For each such prediction we find the superpixel containing the central point of the box. The graph vertex corresponding to this superpixel is marked as a seed to start a new walk. As no prior information concerning the direction of the best walk across the target DLO is available, multiple walks are actually started, in particular along each possible direction (see Fig. 2(3)). It is worth pointing out that the considered directions are those defined by the seed vertex and all its neighbours in the graph, which, as discussed in Sec. 3.2, can be first order as well as higher order neighbours. Each started walk is then iteratively extended according to the procedure described in Sec. 3.2.
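A minimal sketch of this initialization is given below. It assumes bounding boxes expressed as `(x_min, y_min, x_max, y_max)` in pixel coordinates and a label grid indexed as `labels[row][column]`; both conventions, as well as the function names and toy data, are assumptions made for the example.

```python
def seed_superpixels(labels, boxes):
    """Return the superpixel label under the centre of each detection box."""
    seeds = []
    for x_min, y_min, x_max, y_max in boxes:
        cx = int((x_min + x_max) / 2)
        cy = int((y_min + y_max) / 2)
        label = labels[cy][cx]
        if label not in seeds:  # two boxes may fall on the same superpixel
            seeds.append(label)
    return seeds

def start_walks(graph, seed):
    """Start one walk per neighbour of the seed, i.e. one per possible direction."""
    return [[seed, neighbour] for neighbour in sorted(graph[seed])]

# Toy 4x4 label grid with four superpixels and its first-order adjacency.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
graph = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
boxes = [(0, 0, 1, 1), (2, 2, 3, 3)]  # hypothetical endpoint detections

seeds = seed_superpixels(labels, boxes)
walks = start_walks(graph, seeds[0])
```

Passing a higher-order adjacency dictionary to `start_walks` would start additional walks towards second or third order neighbours in exactly the same way.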

As for the criteria to terminate walks, first of all we set a maximum number of iterations to extend a walk. We also terminate a walk if it reaches another seed vertex in the adjacency graph. More precisely, as depicted in Fig. 2(5), we terminate a walk if the distance between the current vertex and a seed is smaller than a radius threshold. Thus, given a seed, all walks started from that seed will terminate and we will have to pick only the optimal one. To this end, we exploit again the Curvature analysis described in Sec. 3.2 and use a formulation similar to Eq. 3 to pick the smoothest path (i.e. the walk with the highest value of the curvature likelihood).
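The two termination criteria can be sketched as a bounded loop around an arbitrary extension rule. In the snippet below `step` stands for whatever procedure selects the next vertex; the toy rule `next_in_chain`, the default values `max_steps=50` and `radius=10.0`, and the centroid layout are all hypothetical.

```python
import math

def run_walk(start, step, seeds, centroids, max_steps=50, radius=10.0):
    """Extend a walk until it gets within `radius` of another seed or runs out of steps.

    Returns (walk, closed), where `closed` tells whether another seed was reached.
    """
    walk = [start]
    for _ in range(max_steps):
        nxt = step(walk)
        if nxt is None:  # the extension rule found no candidate
            break
        walk.append(nxt)
        for seed in seeds:
            if seed != start and math.dist(centroids[nxt], centroids[seed]) < radius:
                return walk, True
    return walk, False

# Toy layout: six superpixel centroids on a line, 10 px apart; seeds at both ends.
centroids = {i: (10.0 * i, 0.0) for i in range(6)}
seeds = [0, 5]

def next_in_chain(walk):
    """Toy extension rule (hypothetical): always move to the next index."""
    nxt = walk[-1] + 1
    return nxt if nxt in centroids else None

closed_walk, closed = run_walk(0, next_in_chain, seeds, centroids)
short_walk, short_closed = run_walk(0, next_in_chain, seeds, centroids, max_steps=3)
```

Among the walks that end up closed, the smoothest one can then be retained by comparing their curvature likelihoods, while walks that exhaust the step budget are natural candidates to be discarded as outliers.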

3.4 Segmentation and Model Estimation

The simplest technique to segment the image is to assign different labels to the superpixels belonging to the different walks, alongside a background label for those superpixels not included in any walk. Besides, we estimate a B-Spline approximation (with the algorithm described in [dierckx1995curve]) for each walk based on the centroids of its superpixels. Then, given the B-Spline model, we can refine the segmentation by building a pixel mask. In particular, by evaluating the smoothing polynomial we can densely sample the points belonging to each parametrised curve and then adjust the thickness of the segmented output based on the mean size of the superpixels belonging to the walk. The accurate segmentation provided by this procedure is shown in Fig. 1, where the right image contains the colored B-Splines built over the walks represented in the middle image (the color is estimated by averaging the color of the corresponding superpixels).
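The superpixel-level labeling described at the beginning of this section can be sketched as follows. The B-Spline refinement is not covered here. The function name, the choice of 0 as background label and the rule that the first walk keeps a superpixel shared by two crossing walks are assumptions made for the example.

```python
def label_superpixels(labels, walks, background=0):
    """Relabel a superpixel grid: walk k gets label k + 1, the rest is background."""
    owner = {}
    for k, walk in enumerate(walks, start=1):
        for superpixel in walk:
            # Assumption: where walks cross, the first walk keeps the superpixel.
            owner.setdefault(superpixel, k)
    return [[owner.get(sp, background) for sp in row] for row in labels]

# Toy grid with six superpixels and two walks crossing at superpixel 1.
labels = [
    [0, 1, 2],
    [3, 4, 5],
]
walks = [[0, 1, 2], [5, 4, 1]]
segmentation = label_superpixels(labels, walks)
```

Superpixel 3 belongs to no walk and is therefore marked as background in the resulting label image.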

4 Experimental results

4.1 Software and Dataset

We provide an open source software framework called Ariadne, available online (see footnote 5). The name Ariadne is inspired by the name of Minos's daughter, who, in Greek mythology, used a thread to lead Theseus outside the Minotaur's maze. The software is written in Python and implements the described approach. We also created a novel dataset (see footnote 6) which, to the best of our knowledge, is the first Electrical Cable dataset for detection and segmentation. We used this dataset to perform quantitative and qualitative evaluations of our approach with respect to other curvilinear structure detectors.

The Dataset is made up of two separate parts: the first consists of 60 cable images with homogeneous backgrounds (white, wood, colored papers, etc.); the second includes 10 cable images with complex backgrounds. In Fig. 7, the first 3 rows deal with homogeneous backgrounds whilst the others deal with complex ones.

For each image in the dataset we provide a hand-labeled binary mask superimposed over each target cable separately and an overall mask that is the union of them. Furthermore, we provide a discretization of the B-Spline for each cable in every image, which consists in a set of 2D points in pixel coordinates useful to have a lighter model of the cable and to easily track its endings. Further details can be found on the dataset website (see footnote 7).

4.2 Segmentation results

Figure 6: Segmentation timings (complex background).

To test our approach we compared it to the popular Curvilinear Object Detectors discussed in Sec. 2: the Frangi 2D Filter [frangi1998multiscale], the Ridge Algorithm [staal2004ridge] and the more generic ELSD [puatruaucean2012parameterless]. For each algorithm we produce a mask associated with the detected curvilinear structures and compare it, by means of the Intersection Over Union (IoU), to the ground truth provided by our dataset. Table 2 reports the IoU averaged over the images with weights, where the weight of each image is proportional to the number of cables present in it. The first row refers to images featuring a homogeneous background (60 images with a total of 395 cables), the second to those having a complex background (10 images and a total of 40 cables). We tuned the hyperparameters of the three competing methods trying to choose the best configuration in order to cope with both the simpler and harder dataset images. As for our algorithm, we used: a 3D Color Histogram with 8 bins for each channel as Visual similarity feature; a von Mises distribution to compute the Curvature likelihood; and a degree of 3 for the adjacency graph (i.e. we search in a neighborhood of level 3 during our walk construction, as described in Sec. 3.2). For both our approach and the competitors we exploit the information about the endpoints provided by the initialization step: for Ariadne we use this information to start random walks, for the competitors to discard many outliers.
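As a sketch of how such a weighted score can be computed, the snippet below evaluates the IoU of binary masks stored as nested lists and averages per-image scores with weights equal to the number of cables, i.e. $\sum_i n_i \, IoU_i / \sum_i n_i$. This exact normalization is an assumption consistent with a weight "proportional to the number of cables"; the function names and the toy masks are hypothetical and unrelated to the dataset.

```python
def iou(pred, truth):
    """Intersection Over Union of two binary masks given as nested lists of 0/1."""
    pairs = [(p, t) for p_row, t_row in zip(pred, truth)
             for p, t in zip(p_row, t_row)]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return intersection / union if union else 1.0  # two empty masks agree

def weighted_mean_iou(ious, cable_counts):
    """Average per-image IoU values, weighting each image by its number of cables."""
    return sum(i * n for i, n in zip(ious, cable_counts)) / sum(cable_counts)

# Toy masks: 2 overlapping pixels out of 4 covered in total.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
single = iou(pred, truth)

# One image with 1 cable and IoU 0.5, another with 3 cables and IoU 1.0.
overall = weighted_mean_iou([single, 1.0], [1, 3])
```

With this weighting, images crowded with cables contribute more to the final score than images containing a single cable.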

The results reported in Table 2 show how our approach outperforms the other methods by a large margin, although it is fair to point out that the competitors are generic curvilinear detectors and not specific DLO detectors. It is also worth highlighting that our method is remarkably robust with respect to complex backgrounds and that this is achieved without any training or fitting procedure that would hinder applicability to unknown scenarios. Thus, our approach could be used in any real industrial application without requiring prior knowledge of the environment.

Finally, Fig. 7 presents some qualitative results obtained by our approach. Moreover, an additional qualitative evaluation is available in the supplementary material, where interactive examples of the Ariadne software are shown. In that material, as a naive proof-of-concept, we also tested Ariadne in similarly challenging contexts like Roads and Rivers segmentation.

                        Ariadne  Frangi [frangi1998multiscale]  Ridge [staal2004ridge]  ELSD [puatruaucean2012parameterless]
Homogeneous Background  0.754    0.406                          0.293                   0.225
Complex Background      0.583    0.063                          0.023                   0.147
Table 2: The Intersection Over Union of the cable segmentations obtained with our approach compared with the three major curvilinear structure detectors.
Figure 7: Qualitative evaluation of the different algorithms. The first column represents the input image, whereas the other columns show, in order, the results yielded by Ariadne, Frangi [frangi1998multiscale], Ridge [staal2004ridge] and ELSD [puatruaucean2012parameterless]. The difficulty increases from one row to the next: the first row represents a very simple example with a white background, while the last example can be considered very hard. Our approach is very robust despite background complexity.

4.3 Timings and failure cases

Ariadne is an iterative approach, with the number of iterations depending on the length of the DLO. Accordingly, Fig. 6 reports the timings from which an average iteration time and an average segmentation time can be estimated. We also point out that these measurements were obtained with the current Python implementation. Moreover, Fig. 8(a),(b) shows the two main failure cases that we found for Ariadne. In (a), as two DLOs (blue and green) are adjacent and exhibit very similar colour and curvature, the walk may jump onto the wrong cable. In (b), as the distance between the DLO and the camera varies greatly, the density of superpixels is not constant and the walk can cover only a portion of the sought object or even fail completely.

5 Concluding Remarks

We presented an effective unsupervised approach to segment DLOs in images. This segmentation method may be deployed in industrial applications involving wire detection and manipulation. Our approach requires an external detector to localize cable terminals, as otherwise we would have to start walks at every superpixel, which would be almost unworkable, although not impossible. So far we deploy an external detector which provides only the approximate position of the endpoints. We are currently working on a smarter endpoint detector capable of inferring also the orientation of the cable terminal in order to dramatically shrink the number of initial walks. Another future development concerns building a much larger Electrical Cable dataset equipped with ground truth information suitable to train a specific CNN aimed at cable segmentation, and comparing this supervised approach to Ariadne. It is worth pointing out that, in specific settings known in advance, a supervised approach may turn out to be particularly effective: in such circumstances Ariadne could be used to vastly ease and speed up the manual labeling procedure required to obtain the training images, by replacing the initial object detector with the interactive intervention of the user.

(a) Failure due to adjacent cables.
(b) Failure due to perspective.
Figure 8: Failure Cases.

References

Footnotes

  1. This work was supported by the European Commission's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 601116.
  2. In [camarillo2008vision] the authors deal with a thin flexible manipulator which may be described as a cable due to the high number of degrees of freedom.
  3. https://github.com/ntnu-bioopt/libfrangi
  4. https://github.com/kapcom01/Curviliniar_Detector
  5. https://github.com/m4nh/ariadne
  6. https://github.com/m4nh/cables_dataset
  7. https://github.com/m4nh/cables_dataset