Simple Interactive Image Segmentation using Label Propagation through kNN graphs


Abstract

Many interactive image segmentation techniques are based on semi-supervised learning (SSL). The user labels some pixels from each object and the SSL algorithm propagates the labels from the labeled to the unlabeled pixels, finding object boundaries. This paper proposes a new graph-based SSL approach to interactive image segmentation, using undirected and unweighted kNN graphs, in which each unlabeled node receives contributions from its neighbors (either labeled or unlabeled). It is simpler than many other techniques, yet it still achieves significant classification accuracy in the image segmentation task. Computer simulations are performed on real-world images extracted from the Microsoft GrabCut dataset, and the segmentation results show the effectiveness of the proposed approach.

Instituto de Geociências e Ciências Exatas
Universidade Estadual Paulista “Júlio de Mesquita Filho” (UNESP)
Avenida 24A, 1515 – 13.506-900 – Rio Claro – SP – Brazil
fabricio@rc.unesp.br

1 Introduction

Image segmentation is considered one of the most difficult tasks in image processing [16]. It is the process of dividing an image into parts, identifying objects or other relevant information [24]. Fully automatic segmentation is still very difficult to accomplish, and the existing techniques are usually domain-dependent. Therefore, interactive image segmentation, in which the segmentation process is partially supervised, has attracted increasing interest in recent decades [7, 17, 22, 5, 14, 13, 23, 21, 19, 1, 2, 27, 8, 10].

Semi-supervised learning (SSL) is an important field in machine learning, usually applied when unlabeled data is abundant but the process of labeling is expensive, time consuming, and/or requires intensive work by human specialists [30, 11]. These characteristics make SSL an interesting approach to interactive image segmentation, which may be seen as a pixel classification process. In this scenario, there are often many unlabeled pixels to be classified. A human specialist can easily label some of them, which are away from the borders, but the process of defining the borders manually is difficult and time consuming.

Many interactive image segmentation techniques are, in fact, based on semi-supervised learning. The user may label some pixels from each object, away from the boundaries where the task is easier. Then, the SSL algorithm iteratively propagates the labels from the labeled pixels to the unlabeled pixels, finding the boundaries. This paper proposes a different SSL-based interactive image segmentation approach. It is simpler than many other techniques, but it still achieves significant classification accuracy in the image segmentation task. In particular, it was applied to real-world images, including images extracted from the Microsoft GrabCut dataset [23]. The segmentation results show the effectiveness of the proposed approach.

1.1 Related work

The approach proposed in this paper may be classified in the category of graph-based semi-supervised learning. Algorithms in this category rely on the idea of building a graph in which nodes are data items (both labeled and unlabeled) and edges represent similarities between them. Label information from the labeled nodes is propagated through the graph to classify all the nodes [11]. Many graph-based methods [6, 29, 28, 3, 4, 18] are similar and share the same regularization framework [30]. They usually employ weighted graphs and spread labels globally, differently from the proposed approach, where the label spreading is limited to neighboring nodes and the graph is undirected and unweighted.

Another graph-based method, known as Label Propagation through Linear Neighborhoods [26], also uses a k-nearest neighbors graph to propagate labels. However, its edges carry weights, whose computation requires solving quadratic programming problems prior to the iterative label propagation process. The proposed approach, on the other hand, uses only unweighted edges.

1.2 Technique overview

In the proposed method, an unweighted and undirected graph is generated by connecting each node (data item) to its k-nearest neighbors. Then, in an iterative process, each unlabeled node receives contributions from all its neighbors (either labeled or unlabeled) to define its own label. The algorithm usually converges quickly, and each unlabeled node is assigned to the class from which it received the most contributions. Differently from many other graph-based methods, no calculation of edge weights or Laplacian matrices is required.

2 The Proposed Model

In this section, the proposed technique is detailed. Given a bidimensional digital image, the set of pixels is reorganized as $\mathcal{X} = \{x_1, x_2, \dots, x_l, x_{l+1}, \dots, x_n\}$, such that $\mathcal{X}_L = \{x_1, \dots, x_l\}$ is the labeled pixel subset and $\mathcal{X}_U = \{x_{l+1}, \dots, x_n\}$ is the unlabeled pixel subset. $\mathcal{C} = \{1, \dots, C\}$ is the set containing the labels. $y: \mathcal{X} \to \mathcal{C}$ is the function associating each $x_i \in \mathcal{X}$ to its label as the algorithm output. The algorithm will estimate $y(x_i)$ for each unlabeled pixel $x_i \in \mathcal{X}_U$.

2.1 k-NN Graph Generation

A large number of features may be extracted from each pixel to build the graph. In this paper, 23 features are used; they are shown on Table 1. These are the same features used in [9].

# Feature Description
1 Pixel row location
2 Pixel column location
3 Red (R) component of the pixel
4 Green (G) component of the pixel
5 Blue (B) component of the pixel
6 Hue (H) component of the pixel
7 Saturation (S) component of the pixel
8 Value (V) component of the pixel
9 ExR component of the pixel
10 ExG component of the pixel
11 ExB component of the pixel
12 Average of R on the pixel and its neighbors (MR)
13 Average of G on the pixel and its neighbors (MG)
14 Average of B on the pixel and its neighbors (MB)
15 Standard deviation of R on the pixel and its neighbors (SDR)
16 Standard deviation of G on the pixel and its neighbors (SDG)
17 Standard deviation of B on the pixel and its neighbors (SDB)
18 Average of H on the pixel and its neighbors (MH)
19 Average of S on the pixel and its neighbors (MS)
20 Average of V on the pixel and its neighbors (MV)
21 Standard deviation of H on the pixel and its neighbors (SDH)
22 Standard deviation of S on the pixel and its neighbors (SDS)
23 Standard deviation of V on the pixel and its neighbors (SDV)
Table 1: List of features extracted from each image to be segmented

For measures 12 to 23, the pixel neighbors are the 8-connected neighborhood, except at the borders, where no wraparound is applied. All components are normalized to have mean $0$ and standard deviation $1$. They are also scaled by a vector of weights $\boldsymbol{\lambda}$ in order to emphasize/deemphasize each feature during the graph generation. The ExR, ExG, and ExB components are obtained from the RGB components using the method described in [20]. The HSV components are obtained from the RGB components using the method described in [25].
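To make the preprocessing concrete, the sketch below builds a per-pixel feature matrix, normalizes each feature to zero mean and unit standard deviation, and scales it by the weight vector. Only a few of the 23 features from Table 1 are computed (row, column, R, G, B) to keep it short; the function name and structure are illustrative, not the paper's implementation.

```python
import numpy as np

def pixel_features(img_rgb, weights=None):
    """Build a per-pixel feature matrix, z-score normalize, scale by weights.

    Only a subset of the features from Table 1 is computed here (row, col,
    R, G, B); the HSV, excess-color, and neighborhood statistics would be
    appended as extra columns in the same way.
    """
    h, w, _ = img_rgb.shape
    rows, cols = np.mgrid[0:h, 0:w]
    feats = np.stack([rows.ravel(), cols.ravel(),
                      img_rgb[..., 0].ravel(),
                      img_rgb[..., 1].ravel(),
                      img_rgb[..., 2].ravel()], axis=1).astype(float)
    # Normalize each feature to mean 0 and standard deviation 1
    feats -= feats.mean(axis=0)
    std = feats.std(axis=0)
    std[std == 0] = 1.0          # guard against constant features
    feats /= std
    if weights is not None:
        feats *= weights         # emphasize/deemphasize each feature
    return feats
```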

The undirected and unweighted graph is defined as $G = (V, E)$, where $V = \{v_1, v_2, \dots, v_n\}$ is the set of nodes and $E$ is the set of edges $(v_i, v_j)$. Each node $v_i$ corresponds to a pixel $x_i$. Two nodes $v_i$ and $v_j$ are connected if $v_j$ is among the $k$-nearest neighbors of $v_i$, or vice-versa, considering the Euclidean distance between the feature vectors of $x_i$ and $x_j$. Otherwise, $v_i$ and $v_j$ are disconnected.
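The symmetric connection rule ("among the $k$-nearest neighbors, or vice-versa") can be sketched as follows, using a k-d tree for the neighbor queries. The paper prescribes only the rule; this particular code and its function name are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_graph(feats, k):
    """Undirected, unweighted kNN graph as an adjacency list.

    Nodes i and j are connected if j is among the k nearest neighbors
    of i (Euclidean distance in feature space), or vice-versa.
    """
    tree = cKDTree(feats)
    # query k+1 neighbors because each point is its own nearest neighbor
    _, idx = tree.query(feats, k=k + 1)
    neighbors = [set() for _ in range(len(feats))]
    for i, nn in enumerate(idx):
        for j in nn[1:]:           # skip the point itself
            neighbors[i].add(j)
            neighbors[j].add(i)    # symmetrize: "or vice-versa"
    return [sorted(s) for s in neighbors]
```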

2.2 Label Propagation

For each node $v_i$, a domination vector $\mathbf{v}_i = \{v_i^{\omega_1}, v_i^{\omega_2}, \dots, v_i^{\omega_C}\}$ is created. Each element $v_i^{\omega_c}$ corresponds to the domination level of the class $c$ over the node $v_i$. The sum of the domination vector elements in each node is always constant: $\sum_{c=1}^{C} v_i^{\omega_c} = 1$.

The domination levels are constant in nodes corresponding to labeled pixels, with full domination by the corresponding class. On the other hand, domination levels are variable in nodes corresponding to unlabeled pixels, and they are initially set equally among the classes. Therefore, for each node $v_i$, the domination vector $\mathbf{v}_i$ is set as follows:

$$v_i^{\omega_c}(0) = \begin{cases} 1 & \text{if } x_i \in \mathcal{X}_L \text{ and } y(x_i) = c \\ 0 & \text{if } x_i \in \mathcal{X}_L \text{ and } y(x_i) \neq c \\ \frac{1}{C} & \text{if } x_i \in \mathcal{X}_U \end{cases} \qquad (1)$$

In the iterative phase, at each iteration, each unlabeled node gets contributions from all its neighbors to calculate its new domination levels. Thus, for each unlabeled node $v_i$, the domination levels are updated as follows:

$$v_i^{\omega_c}(t+1) = \frac{1}{|N_i|} \sum_{j \in N_i} v_j^{\omega_c}(t) \qquad (2)$$

where $N_i$ is the set of neighbors of $v_i$ and $|N_i|$ is its size. In this way, the new domination vector is the arithmetic mean of the domination vectors of all its neighbors, no matter whether they are labeled or unlabeled.

The average maximum domination level $\langle v^m \rangle$ is defined as follows:

$$\langle v^m \rangle = \frac{1}{|\mathcal{X}_U|} \sum_{i \,\mid\, x_i \in \mathcal{X}_U} \max_c v_i^{\omega_c} \qquad (3)$$

considering all nodes $v_i$ representing unlabeled pixels. $\langle v^m \rangle$ is checked at fixed iteration intervals, and the algorithm stops when its increase between checkpoints falls below a given threshold.

At the end of the iterative process, each unlabeled pixel is assigned to the class that has the highest domination level on it:

$$y(x_i) = \underset{c}{\arg\max}\ v_i^{\omega_c} \qquad (4)$$

2.3 The Algorithm

Overall, the proposed algorithm can be outlined as follows:

1 Build the k-NN graph, as described in Subsection 2.1;
2 Set the nodes’ domination levels using Eq. (1);
3 repeat
4     for each unlabeled node do
5         Update the node’s domination levels using Eq. (2);
6     end for
7 until the stopping criterion is satisfied;
8 Label each unlabeled pixel using Eq. (4);
Algorithm 1: The proposed method
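A minimal sketch of Algorithm 1, assuming the graph is given as adjacency lists. The checkpoint interval and tolerance are illustrative placeholders, not values prescribed by the paper.

```python
import numpy as np

def label_propagation(neighbors, labels, n_classes,
                      check_every=10, tol=1e-3, max_iter=10000):
    """Sketch of Algorithm 1 (Eqs. 1, 2, 4).

    labels[i] is the class of node i (0..C-1), or -1 if unlabeled;
    neighbors[i] is the list of node i's neighbors.
    """
    n = len(labels)
    v = np.full((n, n_classes), 1.0 / n_classes)   # Eq. (1): unlabeled nodes
    labeled = labels >= 0
    v[labeled] = 0.0
    v[labeled, labels[labeled]] = 1.0              # Eq. (1): labeled nodes
    unlabeled = np.where(~labeled)[0]
    prev_avg = 0.0
    for t in range(max_iter):
        new_v = v.copy()
        for i in unlabeled:                        # Eq. (2): neighbor mean
            new_v[i] = v[neighbors[i]].mean(axis=0)
        v = new_v
        if (t + 1) % check_every == 0:             # Eq. (3): stop criterion
            avg = v[unlabeled].max(axis=1).mean()
            if avg - prev_avg < tol:
                break
            prev_avg = avg
    return v.argmax(axis=1)                        # Eq. (4)
```

On a simple path graph with the two endpoints labeled, the labels meet in the middle, as expected from the neighbor-averaging update.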

3 Implementation

In order to reduce the computational resources required by the proposed method, the following implementation strategy is applied.

The iterative step of the algorithm is very fast in comparison with the graph generation step, i.e., the graph generation dominates the execution time. Therefore, the graph is generated using the k-d trees method [15], so this step runs in linearithmic time ($O(n \log n)$).

In the iterative step, each iteration runs in $O(u\bar{k})$, where $u$ is the number of unlabeled nodes and $\bar{k}$ is the average number of neighbors per node, which is usually proportional to $k$ (not equal, because the graph is undirected). $u$ is usually a fraction of $n$ in practical problems, and often $k \ll n$. By increasing $k$, one also increases each iteration's execution time. On the other hand, the number of iterations required to converge decreases, as the graph becomes more connected and the labels propagate faster, as empirically observed in computer simulations.

The iterative steps are synchronous, i.e., the contributions a node receives to produce its domination vector at time $t+1$ refer to the domination levels its neighbors had at time $t$. Therefore, this step, corresponding to the inner for loop of Algorithm 1, can be parallelized: nodes can calculate their new domination vectors in parallel without running into race conditions. Synchronization is only required between iterations of the outer repeat loop.
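The synchronous update of Eq. (2) can also be written as a single sparse matrix product, which makes the absence of race conditions explicit: every row of the result depends only on the previous iteration's values. A sketch assuming SciPy sparse matrices (not the author's implementation):

```python
import numpy as np
import scipy.sparse as sp

def synchronous_step(adj, v, unlabeled_mask):
    """One synchronous iteration of Eq. (2) as a sparse matrix product.

    adj is the symmetric, binary adjacency matrix; row-normalizing it
    turns adj_norm @ v into the mean of each node's neighbors'
    domination vectors, computed from the previous iteration only.
    """
    deg = np.asarray(adj.sum(axis=1)).ravel()
    inv_deg = sp.diags(1.0 / np.maximum(deg, 1))
    new_v = inv_deg @ adj @ v                  # all rows use the old v
    v = v.copy()
    v[unlabeled_mask] = new_v[unlabeled_mask]  # labeled nodes stay clamped
    return v
```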

4 Experiments

The proposed technique's efficacy is first tested using the real-world image shown in Fig. 1(a), extracted from [8]. A trimap providing seed regions is presented in Fig. 1(b). Black (0) represents the background, ignored by the algorithm; dark gray (64) is the labeled background; light gray (128) is the unlabeled region, whose labels will be estimated by the proposed method; and white (255) is the labeled foreground.

Figure 1: (a) Original image; (b) trimap providing seed regions for the segmentation of (a); (c) close-up foreground segmentation results by the proposed method.

The proposed technique's efficacy is then verified in a series of computational experiments using nine images selected from the Microsoft GrabCut database [23]¹. The selected images are shown in Fig. 2. The corresponding trimaps providing seed regions are shown in Fig. 3. Finally, the ground-truth images are shown in Fig. 4.

Figure 2: Original images from the GrabCut dataset: (a) 21077; (b) 124084; (c) 271008; (d) 208001; (e) llama; (f) doll; (g) person7; (h) sheep; (i) teddy.
Figure 3: The trimaps providing seed regions from the GrabCut dataset: (a) 21077; (b) 124084; (c) 271008; (d) 208001; (e) llama; (f) doll; (g) person7; (h) sheep; (i) teddy.
Figure 4: The ground-truth images from the GrabCut dataset: (a) 21077; (b) 124084; (c) 271008; (d) 208001; (e) llama; (f) doll; (g) person7; (h) sheep; (i) teddy.

For each image, $k$ and the vector of weights $\boldsymbol{\lambda}$ were optimized using the genetic algorithm available in the Global Optimization Toolbox of MATLAB, with its default parameters.
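As a rough stand-in for the GA-based tuning, the sketch below shows the shape of such a parameter search using plain random search over $k$ and the weight vector; it is only illustrative and does not reproduce MATLAB's genetic algorithm. The `evaluate` callback, which is assumed to return the segmentation error for a given setting, is hypothetical.

```python
import numpy as np

def random_search(evaluate, n_features, n_trials=50, k_range=(1, 50), seed=0):
    """Toy stand-in for GA-based tuning of k and the weight vector:
    random search keeping the best-scoring (k, weights) setting.

    evaluate(k, weights) is assumed to return the segmentation error.
    """
    rng = np.random.default_rng(seed)
    best = (None, None, np.inf)
    for _ in range(n_trials):
        k = int(rng.integers(k_range[0], k_range[1] + 1))
        w = rng.uniform(0.0, 1.0, n_features)   # candidate weight vector
        err = evaluate(k, w)
        if err < best[2]:
            best = (k, w, err)
    return best
```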

5 Results and Discussion

First, the proposed method was applied to the image shown in Fig. 1(a). The best segmentation result is shown in Fig. 1(c). By comparing this output with the segmentation result achieved in [8] for the same image, one can notice that the proposed method achieved slightly better results, eliminating some misclassified pixels and better defining the borders.

Then, the proposed method was applied to the nine images shown in Fig. 2, as described in Section 4. The best segmentation results achieved with the proposed method are shown in Fig. 5. Error rates are computed as the ratio between the number of incorrectly classified pixels and the total number of unlabeled pixels (light gray in the trimap images shown in Fig. 3). Notice that the ground-truth images (Fig. 4) have a thin contour of gray pixels, which corresponds to uncertainty, i.e., pixels that received different labels from the different persons who did the manual classification. These pixels are not used in the classification error calculation.
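The error-rate computation described above follows directly from the trimap and ground-truth conventions (128 marks the unlabeled region; gray ground-truth pixels are the excluded uncertainty contour). A sketch, with an illustrative function name:

```python
import numpy as np

def error_rate(pred, truth, trimap):
    """Error rate over unlabeled pixels only, as in the experiments.

    Pixels marked 128 in the trimap form the unlabeled region; pixels
    whose ground-truth value is neither 0 (background) nor 255
    (foreground) are the uncertainty contour and are excluded.
    """
    unlabeled = trimap == 128
    certain = (truth == 0) | (truth == 255)
    mask = unlabeled & certain
    wrong = (pred[mask] != truth[mask]).sum()
    return wrong / mask.sum()
```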

Figure 5: Close-up foreground segmentation results by the proposed method, with error rates: (a) 21077 – 5.13%; (b) 124084 – 0.57%; (c) 271008 – 3.09%; (d) 208001 – 3.88%; (e) llama – 6.83%; (f) doll – 0.64%; (g) person7 – 1.09%; (h) sheep – 1.41%; (i) teddy – 1.63%.

Segmentation error rates are also summarized in Table 2. Some results from other methods [14, 12, 8] are included for reference; they were extracted from the respective references. By analyzing them, one can notice that the proposed method achieves comparable results.

Table 2: Segmentation error rates achieved by Learning on Hypergraphs model (ISLH) [12], Directed Image Neighborhood Hypergraph model (DINH) [14], Particle Competition and Cooperation (PCC) [8] and the proposed model (LPKNN).

It is also important to notice that the proposed method is deterministic. Given the same parameters, it will always output the same segmentation result on different executions. Other methods, like Particle Competition and Cooperation [8], are stochastic. Therefore, they may output different segmentation results on each execution.

The optimized parameter $k$ and feature weights ($\boldsymbol{\lambda}$) are shown in Table 3. Considering the images evaluated in this paper, pixel location features (Row and Col) are the most important, followed by the ExB component, intensity (V), and the mean of green (MG). The least important features were hue (H), saturation (S), and all those related to standard deviation. However, no single feature received a high weight in all images. The optimal weights $\boldsymbol{\lambda}$ and parameter $k$ seem to be highly dependent on image characteristics.

Table 3: Parameter $k$ and feature weights $\boldsymbol{\lambda}$ optimized for each segmented image.

6 Conclusion

In this paper, a new graph-based SSL approach is proposed to perform interactive image segmentation. It employs undirected and unweighted kNN graphs to propagate labels from nodes representing labeled pixels to nodes representing unlabeled pixels. Computer simulations with real-world images show that the proposed approach is effective, achieving segmentation accuracy similar to that of some state-of-the-art methods.

As future work, the method will be applied to more images, and more features may be extracted. Methods to automatically define the parameters $k$ and $\boldsymbol{\lambda}$ may also be explored. Graph generation may also be improved to further increase segmentation accuracy.

Moreover, the proposed method handles multiple classes simultaneously at no extra cost, an interesting property not often exhibited by other interactive image segmentation methods. This feature will also be explored in future work.

Acknowledgment

The author would like to thank the São Paulo Research Foundation – FAPESP (grant #2016/05669-4) and the National Council for Scientific and Technological Development – CNPq (grant #475717/2013-9) for the financial support.

Footnotes

  1. Available at http://web.archive.org/web/20161203110733/research.microsoft.com/en-us/um/cambridge/projects/visionimagevideoediting/segmentation/grabcut.htm

References

  1. Y. Artan and I.S. Yetik (2010-05) Improved random walker algorithm for image segmentation. In Image Analysis Interpretation (SSIAI), 2010 IEEE Southwest Symposium on, pp. 89–92. External Links: Document Cited by: §1.
  2. Y. Artan (2011-05) Interactive image segmentation using machine learning techniques. In Computer and Robot Vision (CRV), 2011 Canadian Conference on, pp. 264–269. External Links: Document Cited by: §1.
  3. M. Belkin, I. Matveeva and P. Niyogi (2004) Regularization and semisupervised learning on large graphs. In Conference on Learning Theory, pp. 624–638. Cited by: §1.1.
  4. M. Belkin, P. Niyogi and V. Sindhwani (2005) On manifold regularization. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), New Jersey, pp. 17–24. Cited by: §1.1.
  5. A. Blake, C. Rother, M. Brown, P. Perez and P. Torr (2004) Interactive image segmentation using an adaptive gmmrf model. In Computer Vision - ECCV 2004, T. Pajdla and J. Matas (Eds.), Lecture Notes in Computer Science, Vol. 3021, pp. 428–441 (English). External Links: ISBN 978-3-540-21984-2, Document, Link Cited by: §1.
  6. A. Blum and S. Chawla (2001) Learning from labeled and unlabeled data using graph mincuts. In Proceedings of the Eighteenth International Conference on Machine Learning, San Francisco, pp. 19–26. Cited by: §1.1.
  7. Y.Y. Boykov and M.-P. Jolly (2001) Interactive graph cuts for optimal boundary & region segmentation of objects in n-d images. In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, Vol. 1, pp. 105–112. External Links: Document Cited by: §1.
  8. F. Breve, M. G. Quiles and L. Zhao (2015-07) Interactive image segmentation using particle competition and cooperation. In 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. External Links: Document, ISSN 2161-4393 Cited by: §1, §4, Table 2, §5, §5, §5.
  9. F. A. Breve (2015) Auto feature weight for interactive image segmentation using particle competition and cooperation. In Proceedings - XI Workshop de Visão Computacional WVC’2015, pp. 164–169. Cited by: §2.1.
  10. F. Breve, M. G. Quiles and L. Zhao (2015) Interactive image segmentation of non-contiguous classes using particle competition and cooperation. In Computational Science and Its Applications – ICCSA 2015, O. Gervasi, B. Murgante, S. Misra, M. L. Gavrilova, A. M. A. C. Rocha, C. Torre, D. Taniar and B. O. Apduhan (Eds.), Lecture Notes in Computer Science, Vol. 9155, pp. 203–216 (English). External Links: ISBN 978-3-319-21403-0, Document, Link Cited by: §1.
  11. O. Chapelle, B. Schölkopf and A. Zien (Eds.) (2006) Semi-Supervised Learning. Adaptive Computation and Machine Learning, The MIT Press, Cambridge, MA. Cited by: §1.1, §1.
  12. L. Ding and A. Yilmaz (2008-12) Image segmentation as learning on hypergraphs. In Machine Learning and Applications, 2008. ICMLA ’08. Seventh International Conference on, pp. 247–252. External Links: Document Cited by: Table 2, §5.
  13. L. Ding and A. Yilmaz (2010) Interactive image segmentation using probabilistic hypergraphs. Pattern Recognition 43 (5), pp. 1863 – 1873. Note: External Links: ISSN 0031-3203, Document, Link Cited by: §1.
  14. A. Ducournau and A. Bretto (2014) Random walks in directed hypergraphs and application to semi-supervised image segmentation. Computer Vision and Image Understanding 120 (0), pp. 91 – 102. Note: External Links: ISSN 1077-3142, Document, Link Cited by: §1, Table 2, §5.
  15. J. H. Friedman, J. L. Bentley and R. A. Finkel (1977-09) An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Softw. 3 (3), pp. 209–226. External Links: ISSN 0098-3500, Link, Document Cited by: §3.
  16. R. C. Gonzalez and R. E. Woods (2008) Digital image processing (3rd edition). Prentice-Hall, Inc., Upper Saddle River, NJ, USA. External Links: ISBN 013168728X Cited by: §1.
  17. L. Grady (2006-11) Random walks for image segmentation. Pattern Analysis and Machine Intelligence, IEEE Transactions on 28 (11), pp. 1768–1783. External Links: Document, ISSN 0162-8828 Cited by: §1.
  18. T. Joachims (2003) Transductive learning via spectral graph partitioning. In Proceedings of International Conference on Machine Learning, pp. 290–297. Cited by: §1.1.
  19. J. Li, J.M. Bioucas-Dias and A. Plaza (2010-11) Semisupervised hyperspectral image segmentation using multinomial logistic regression with active learning. Geoscience and Remote Sensing, IEEE Transactions on 48 (11), pp. 4085–4098. External Links: Document, ISSN 0196-2892 Cited by: §1.
  20. M. Lichman (2013) UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences. External Links: Link Cited by: §2.1.
  21. A.R.C. Paiva and T. Tasdizen (2010-03) Fast semi-supervised image segmentation by novelty selection. In Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pp. 1054–1057. External Links: Document, ISSN 1520-6149 Cited by: §1.
  22. A. Protiere and G. Sapiro (2007-04) Interactive image segmentation via adaptive weighted distances. Image Processing, IEEE Transactions on 16 (4), pp. 1046–1057. External Links: Document, ISSN 1057-7149 Cited by: §1.
  23. C. Rother, V. Kolmogorov and A. Blake (2004-08) “GrabCut”: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23 (3), pp. 309–314. External Links: ISSN 0730-0301, Link, Document Cited by: §1, §1, §4.
  24. L.G. Shapiro and G.C. Stockman (2001) Computer vision. Prentice Hall. External Links: ISBN 9780130307965, LCCN 00066556 Cited by: §1.
  25. A. R. Smith (1978) Color gamut transform pairs. In ACM Siggraph Computer Graphics, Vol. 12, pp. 12–19. Cited by: §2.1.
  26. F. Wang and C. Zhang (2008-01) Label propagation through linear neighborhoods. IEEE Transactions on Knowledge and Data Engineering 20 (1), pp. 55–67. Cited by: §1.1.
  27. J. Xu, X. Chen and X. Huang (2008-12) Interactive image segmentation by semi-supervised learning ensemble. In Knowledge Acquisition and Modeling, 2008. KAM ’08. International Symposium on, pp. 645–648. External Links: Document Cited by: §1.
  28. D. Zhou, O. Bousquet, T. N. Lal, J. Weston and B. Schölkopf (2004) Learning with local and global consistency. In Advances in Neural Information Processing Systems, Vol. 16, pp. 321–328. External Links: Link Cited by: §1.1.
  29. X. Zhu, Z. Ghahramani and J. Lafferty (2003) Semi-supervised learning using gaussian fields and harmonic functions. In Proceedings of the Twentieth International Conference on Machine Learning, pp. 912–919. Cited by: §1.1.
  30. X. Zhu (2005) Semi-supervised learning literature survey. Technical report Technical Report 1530, Computer Sciences, University of Wisconsin-Madison. Cited by: §1.1, §1.