The Effect of Color Space Selection on Detectability and Discriminability of Colored Objects
In this paper, we investigate the effect of color space selection on detectability and discriminability of colored objects under various conditions. 20 color spaces from the literature are evaluated on a large dataset of simulated and real images. We measure the suitability of color spaces from two different perspectives: detectability and discriminability of various color groups.
Through experimental evaluation, we found that there is no single optimal color space suitable for all color groups. The color spaces have different levels of sensitivity to different color groups and they are useful depending on the color of the sought object. Overall, the best results were achieved in both simulated and real images using color spaces C1C2C3, UVW and XYZ.
In addition, using a simulated environment, we show a practical application of color space selection in the context of top-down control in active visual search. The results indicate that on average color space C1C2C3 followed by HSI and XYZ achieve the best time in searching for objects of various colors. Here, the right choice of color space can improve time of search on average by 20%. As part of our contribution we also introduce a large dataset of simulated 3D objects.
The choice of color space is an important task in various computer vision applications such as image compression, image annotation, object detection or object tracking [4, 5]. However, it is hard to define a universal color space as color can be modeled in numerous ways, e.g., Luv, Lab, HSV, etc.
The computer vision community has proposed a large number of solutions for color space selection for different applications. In one of the early works, Ohta et al. introduce a new color space, I1I2I3, and show that using this color space can result in improved object segmentation. Meas-Yedid et al. use automatic color space selection for biological image segmentation; they use the Liu and Borsotti segmentation evaluation method to determine which color space provides the best segmentation. Similar adaptive approaches have also been applied to applications such as sky/cloud or skin segmentation.
Using the right color space is also important in object detection and recognition applications. Vezhnevets et al. perform a comparative study on various color spaces for skin detection. The authors highlight that changes in color luminance have little effect on separating skin from non-skin regions. In the context of cast shadow detection, Benedek and Sziranyi evaluate numerous color spaces such as HSV, RGB and Luv, and show that the Luv color space is the most efficient for color-based clustering of pixels and foreground-background-shadow segmentation. Van de Sande et al. evaluate different color spaces in conjunction with SIFT features for object detection. They argue that, depending on the nature of the dataset, different color spaces such as opponent axis or RGB can result in the best performance. Scandaliaris et al. combine three color spaces, including C1C2C3, opponent axis and RGB, to generate shadow-invariant features for detecting objects by finding their contours. Moreover, a number of works investigate the suitability of color spaces for texture classification of objects such as textile and tree classification.
In robotics, color spaces are also studied for various applications. Song et al. use a genetic algorithm to generate a color space suitable for recognizing colored objects in the context of robotic soccer. They show that using the resulting color space, rSc2, the highest recognition rate can be achieved compared to spaces such as YUV or HSI. In a similar study, Song et al. propose the uSb color space, based on an iterative feature selection procedure, for recognizing colored objects in underwater robotics. Duan et al. investigate the optimal color space for segmentation of aerial images captured by UAVs. The authors report that for Bayer true color images, the I1I2I3 color space is the optimal choice for the segmentation of buildings compared to YCbCr, YIQ or Lab.
In the context of robotic visual search, color features can be used to optimize the search process. In that work, the authors use the sought object's color in a top-down control manner to bias the search. If the sought object is outside the detection range, its color features are used to guide the search. The similarity between detected colors and the color of the object is measured using a backprojection technique. If a similarity is observed, the importance of the corresponding regions is increased, so the robot searches those regions first. The authors show that using this method the time of search can be significantly reduced. However, the search only uses the normalized RGB color space and is only performed for a single red and a single green toy.
In this paper we first investigate the impact of color space selection on detection of objects with different colors (detectability) and then using a cluster scoring technique identify how well different color groups can be separated (discriminability) using each color space. We perform this evaluation for the 20 most common color spaces in the literature using both synthetic and real images. Finally, we use active visual search as an application to examine the role of color space selection in practice.
II Color space and robustness
II-A Color space
We evaluated 20 color spaces (see Table I) including the RGB space.
|Color spaces|Reference|
|XYZ, I1I2I3, HSI, YIQ|Guo and Lyu|
|Lab, YCrCb, rg, HSV|Danelljan et al.|
|C1C2C3|Salvador et al.|
|Opp, Nopp, Copp|Liang et al.|
|Luv, xyz|Moroney and Fairchild|
|YES|Saber et al.|
|CMY, YUV|Tkalcic and Tasic|
|HSL|Weeks et al.|
|UVW|Ohta et al.|
|xyY|Lucchese and Mitra|
As mentioned earlier, our objective is to evaluate the suitability of color spaces for various applications, in particular, detecting and distinguishing colored objects. In this sense, two factors have to be considered. First, the color should be represented in a way that it is easily detectable (Detectability) under various conditions such as in the presence of shadow, illumination changes or various reflecting surfaces. Second, different color groups should be easily separable (Discriminability) and not confused with one another. Measuring the Detectability and Discriminability of colors in different color spaces accounts for how robust a color representation can be.
II-B Measuring robustness
We measure the detectability of different colors using the histogram backprojection (BP) technique. The BP algorithm generates a probability map in which each pixel value indicates how likely the presence of a given color is at that location. The BP map is computed as follows. Let $h$ be the histogram function that maps a pixel's value in color space $C$, where $C^i$ is the $i$-th channel of $C$, to a bin of the histogram computed from the object's template $T$. The backprojection of the object's color over an image is then given by

$B(x, y) = h\big(C^1(x, y), C^2(x, y), C^3(x, y)\big),$

where $B$ is the grayscale backprojection image.
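As a concrete illustration, the backprojection step can be sketched in NumPy as follows; the function name, the per-channel quantization and the max-normalization of the histogram are our illustrative choices, not the exact implementation used here:

```python
import numpy as np

def backproject(image, template, bins=32):
    """Histogram backprojection sketch (Swain & Ballard style).

    image, template: H x W x 3 arrays with channel values in [0, 255].
    Returns a grayscale map where each pixel holds the normalized
    template-histogram count of that pixel's color bin.
    """
    # Quantize each channel into `bins` levels.
    q_img = (image.astype(np.int64) * bins) // 256
    q_tpl = (template.astype(np.int64) * bins) // 256

    # 3-D histogram of the template's quantized colors.
    hist = np.zeros((bins, bins, bins), dtype=np.float64)
    np.add.at(hist,
              (q_tpl[..., 0].ravel(),
               q_tpl[..., 1].ravel(),
               q_tpl[..., 2].ravel()), 1)
    hist /= hist.max()  # scale the map to [0, 1]

    # Look up every image pixel's bin in the template histogram.
    return hist[q_img[..., 0], q_img[..., 1], q_img[..., 2]]
```

Pixels whose color falls in a bin frequent in the template receive values close to 1, others close to 0.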
Choosing the right bin size for the histograms is vital in the BP algorithm. The larger the bins, the more tolerant BP is to illumination changes; at the same time, different colors are more likely to be detected together. To eliminate bias in our detection, we use histograms with bin sizes of 16, 32, 64 and 128 and average the results over all configurations.
We compare the detection results against the ground truth data and use three measures of performance: $recall = \frac{TP}{TP + FN}$, $precision = \frac{TP}{TP + FP}$ and $F\text{-}measure = \frac{2 \cdot precision \cdot recall}{precision + recall}$, where $TP$, $FN$ and $FP$ stand for true positives, false negatives and false positives respectively.
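These three measures can be computed from the raw counts as follows (a minimal helper; the function name is ours):

```python
def detection_scores(tp, fn, fp):
    """Recall, precision and F-measure from TP/FN/FP counts.

    Returns 0.0 for a measure whose denominator is zero.
    """
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return recall, precision, f_measure
```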
One way to measure the discriminability (separability) of colors in different spaces is to perform clustering and measure how well the color data is grouped into different clusters. For this purpose we use K-means clustering with Expectation Maximization (EM) to find the maximum number of clusters that best represent the color distributions in each image. The number of clusters depends on the number of colors (out of the 12 colors of the traditional color wheel, see Figure 1) present in the input image.
We employ silhouette analysis to measure the consistency of each cluster against the ground truth. Let $p$ be a single pixel in the image and $C_k$ be the cluster that $p$ belongs to. We define $a(p)$ as the average dissimilarity of pixel $p$ to all other pixels in cluster $C_k$, and $b(p)$ as the lowest average dissimilarity of $p$ to all pixels in any other cluster $C_j$, $j \neq k$. Based on these definitions, the silhouette measure of pixel $p$ is given by

$s(p) = \frac{b(p) - a(p)}{\max\{a(p), b(p)\}},$

where $s(p) \in [-1, 1]$. A value of 1 means pixel $p$ is well clustered, whereas $-1$ implies that it does not belong to the allocated cluster.
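The silhouette measure can be sketched for one-dimensional features as follows; the function name and the use of absolute difference as the dissimilarity are illustrative assumptions (in practice the dissimilarity would be computed over all color channels):

```python
import numpy as np

def silhouette(values, labels):
    """Per-sample silhouette scores for 1-D features (sketch).

    a(p): mean distance to the other points in p's cluster.
    b(p): smallest mean distance to the points of any other cluster.
    s(p) = (b - a) / max(a, b), in [-1, 1].
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    dist = np.abs(values[:, None] - values[None, :])
    scores = np.empty(len(values))
    for i, lab in enumerate(labels):
        same = labels == lab
        same[i] = False  # exclude the point itself
        a = dist[i, same].mean() if same.any() else 0.0
        b = min(dist[i, labels == other].mean()
                for other in set(labels.tolist()) if other != lab)
        scores[i] = (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return scores
```

Well-separated clusters yield scores near 1; mixed-up assignments yield negative scores.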
III Test Images
III-A Synthetic images
We set up a simulated environment to capture the robustness of the color spaces to various illumination conditions. The following setups were considered for our simulation.
Objects - To model different surfaces, we used objects with three different shapes, namely sphere, cylinder and cube (see Figure 15).
Colors - We used 12 colors of the traditional color wheel depicted in Figure 1. The traditional color wheel is chosen because it represents three basic color schemes including primary, secondary and tertiary colors.
Configuration - The objects were placed on a circle with a radius of 2 meters from the center, equally distant from one another (see Figure 3).
Camera - We used a camera model identical to the Zed stereo camera, with FOV and a 1280×1024 resolution, placed on a rod at a height of 1 meter. The camera was rotated 12 times along a circle with a radius of 3 meters, each time placed 1 meter away from the closest object while facing towards it.
Light - We used two types of light sources, directional and point, placed 10 meters above the scene. To vary illumination conditions, the directional light source was rotated along the y-axis at a fixed interval, forming a total of 12 orientations. The point light source was also rotated along an arc in the x-z plane within a fixed interval. The arc was in turn rotated along the z-axis 6 times, equally spaced (see Figure 4).
Using the above setups, a total of 4752 simulated images were generated. Figure 5 shows some sample synthetic images generated under various lighting conditions.
III-B Real images
A total of 60 real images were collected from the web, representing different materials, object shapes and lighting conditions. Unlike the synthetic images, each real image contains only a subset of the 12 colors; we therefore selected the images so that all colors are reasonably represented across the set. Figure 6 illustrates some of the images used in the experiments.
We evaluated our samples both with and without normalization (pixel-wise normalization of each channel). The normalization, used to reduce the effect of illumination changes, took place on the original RGB images and was computed by dividing each pixel value in each channel by the sum of the values in all three channels.
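A minimal sketch of this pixel-wise normalization (the function name is ours; we assume the channels lie along the last axis):

```python
import numpy as np

def normalize_rgb(image):
    """Pixel-wise chromaticity normalization: each channel is divided
    by the sum of the three channels (all-zero pixels stay zero)."""
    image = image.astype(np.float64)
    s = image.sum(axis=-1, keepdims=True)
    return np.divide(image, s, out=np.zeros_like(image), where=s > 0)
```

Note that scaling all channels of a pixel by the same factor (e.g., a brightness change) leaves the normalized values unchanged, which is exactly the illumination invariance being sought.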
In the following subsections we use the following abbreviations for each color category: blue-green (bg), blue (b), green (g), green-yellow (gy), orange (o), orange-red (or), red (r), red-violet (rv), violet-blue (vb), violet (v), yellow-orange (yo), and yellow (y).
IV-A Color spaces in the simulated environment
Performing backprojection (BP) on the original images without pixel-wise normalization, the scores in the majority of the color spaces are fairly low. This is due to the high degree of illumination changes in the images. The only color spaces that perform well are the photometric color invariants (C1C2C3, NOPP and rg), which describe the color configuration of an image while discounting the effect of shadows or highlights.
To compensate for changes in illumination, the spaces NOPP and rg normalize the pixel values of each channel by dividing by the sum of the pixel values in all channels. The C1C2C3 color space achieves its illumination-invariant behavior by computing each channel as $C_1 = \arctan\left(\frac{R}{\max(G, B)}\right)$, $C_2 = \arctan\left(\frac{G}{\max(R, B)}\right)$ and $C_3 = \arctan\left(\frac{B}{\max(R, G)}\right)$.
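Assuming the standard definition of C1C2C3 from the literature, $C_i = \arctan(\text{channel}_i / \max(\text{other two channels}))$, the transform can be sketched as follows; the function name and the use of `arctan2` for zero-safety are our own choices:

```python
import numpy as np

def c1c2c3(image):
    """C1C2C3 photometric-invariant color space (sketch).

    Channels are assumed ordered R, G, B along the last axis.
    """
    rgb = image.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # arctan2 equals arctan(x / y) for positive inputs and is
    # well-defined when both arguments are zero.
    c1 = np.arctan2(r, np.maximum(g, b))
    c2 = np.arctan2(g, np.maximum(r, b))
    c3 = np.arctan2(b, np.maximum(r, g))
    return np.stack([c1, c2, c3], axis=-1)
```

Because each channel depends only on ratios of the RGB values, uniformly darkening a pixel (as a shadow does) leaves its C1C2C3 representation unchanged.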
After pixel-wise normalization, the results change significantly. Notably, the scores of the top three color spaces before normalization deteriorate due to loss of information caused by double normalization. In contrast, all other methods perform dramatically better because the effect of illumination changes is reduced.
Figure 7 shows the scores after normalization. The best performance is achieved in color spaces HSI, UVW and XYZ. The prime sign (′) indicates that the color space is applied after pixel-wise normalization of the input image. In addition, despite the drop in its performance, C1C2C3 still remains one of the top 5 color spaces.
After combining the results of both experiments, as shown in Figure 8, BP performs best in the C1C2C3 color space followed by UVW and HSI.
Although BP has the best performance on average in the spaces in Figure 8, it does not necessarily perform best in detecting each color group using those color spaces. In fact, using the C1C2C3 color space, the BP algorithm does not have the best hit rate in any of the color categories.
To highlight the performance of BP in detecting each color group using different color spaces, Table II lists the top 3 color spaces along with the highest score for the corresponding color group.
[Table II: the top 3 color spaces (#1, #2, #3) and the best score for each color group, reported for each of the three performance measures.]
The best hit rate overall (recall) is achieved using HSI, UVW and XYZ. HSI is most suitable when the color is concentrated only in a single channel. On the other hand, UVW and XYZ are better for colors containing violet (red-blue) and yellow (green-red) respectively.
Using the C1C2C3 color space, the best overall precision and F-measure are obtained. As for recall, in four cases the second best performance is achieved using this color space; in the cases where the colors yellow or orange are present, using C1C2C3 is not advantageous.
We performed the EM clustering in both original and normalized images. The maximum number of clusters was set to the maximum number of colors present in the image (in this case 12) and the silhouette score was measured for each color space.
Figure 9 shows the results of the clustering. As expected, after normalization, the clustering is improved. Considering the results in both scenarios, the top three color spaces are YIQ (0.6977), C1C2C3 (0.6741) and rg (0.6701).
IV-B Color spaces in the real images
We followed the same procedure as for the simulated images and ran the BP algorithm in both original and pixel-wise normalized real images. The templates for BP are generated using a color checker.
Once again without normalization the performance using the majority of color spaces is poor. After normalization, however, the best performance is achieved using color spaces C1C2C3 and UVW followed by XYZ as shown in Figure 11. It should be noted that the performance in C1C2C3 is also slightly improved after normalizing the image.
The performance of BP in detecting different color groups in different color spaces is reflected in Table III.
[Table III: the top 3 color spaces (#1, #2, #3) and the best score for each color group in the real images, reported for each of the three performance measures.]
The best performance results from the C1C2C3 color space. The only cases in which the performance was poor were the color groups y and gy, similar to the simulated images.
The runner-up color spaces are UVW and XYZ. The performance was more or less similar in both synthetic and real images, with some exceptions. For instance, in the case of the color v, the best performance in the synthetic images belongs to UVW, whereas XYZ is not even among the top three spaces. In the real images, the opposite was observed: the best space is XYZ whereas UVW is not in the top three.
The clustering was done on both the original and normalized images. The maximum number of clusters was set to the number of colors present in the image, which ranged from 4 to 12 depending on the image. The silhouette scores are reported in Figure 12.
Similar to simulated images, normalization improves the overall results. The best performance was achieved using the color spaces COPP, rg and C1C2C3 in original images and UVW, C1C2C3 and rg after normalization.
IV-C Active visual search
In this section we put our findings into practice and evaluate the effect of color space choice on the performance of a mobile robot searching for an object. To perform the search, we used the same greedy algorithm introduced in our previous work, except that we omitted the bottom-up saliency to eliminate any bias other than the color values.
The experiments were conducted in the Gazebo simulation environment. For this purpose, we generated a large number of objects (see Figure 13; the dataset is available at http://data.nvision2.eecs.yorku.ca/3DGEMS/) to create a typical office environment (see Figure 14). The search robot is a simulated Pioneer 3 platform equipped with a Zed camera for visual processing and a Hokuyo lidar for mapping. In addition, communication and navigation are handled using the ROS nav package.
We used 6 target objects with various colors (see Figure 15) placed randomly in the scene. In each iteration the locations of objects were rotated, so that objects were equally represented in the environment. In each configuration, the robot was placed in four different locations to begin the search.
As for the color spaces, we used the top 6 performing spaces found in Section IV-A, namely C1C2C3, UVW, XYZ, YCrCb, Luv and HSI. In total over 1000 experiments were conducted.
Table IV lists the results of the experiments for each object. The outcomes are consistent with the results in Table II: using the top color spaces resulted in lower search times. There are, however, a few exceptions that show the importance of discriminability. For instance, UVW places in the top 2 for detecting 5 color groups but in search is at best placed in the 3rd or 4th position. Given this color space's low silhouette score, the detection algorithm can be distracted by other colors in the environment, which reduces the efficiency of the color space. In contrast, better performance is achieved using C1C2C3, consistent with its high silhouette score, which means this space is more robust against distraction.
Figure 16 shows the average results of the search for all objects. Here, the best performance overall is achieved using color spaces (in descending order) C1C2C3, HSI and XYZ.
In this paper we evaluated a large number of color spaces to measure their suitability for detecting and discriminating differently colored objects. Using empirical evaluations on both synthetic and real images, we showed that there is no single optimal color space for detecting and discriminating all color groups. On average, however, the best performance was achieved using the color spaces C1C2C3, UVW and XYZ.
The color spaces were also put to the test in the context of visual search in a simulated environment. A combination of a high detection rate and robustness to distractors resulted in the lowest search times using the C1C2C3 and XYZ color spaces.
We only measured robustness to distraction using a clustering technique; it would be beneficial to also measure the sensitivity of each color space to different color groups. In addition, the visual search experiments were conducted only in a simulated environment. In the future, we intend to perform a similar study on a physical platform to confirm our evaluation on real images.
We acknowledge the financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC), the NSERC Canadian Field Robotic Network (NCFRN), and the Canada Research Chairs Program through grants to JKT.
-  N. Moroney and D. F. Fairchild, “Color space selection for jpeg image compression,” Journal of Electronic Imaging, vol. 4, no. 4, pp. 373–381, 1995.
-  E. Saber, A. Tekalp, R. Eschbach, and K. Knox, “Automatic image annotation using adaptive color classification,” Graphical Models and Image Processing, vol. 58, no. 2, pp. 115–126, March 1996.
-  A. Kumar and S. Malhotra, “Pixel-Based Skin Color Classifier : A Review,” International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 8, no. 7, pp. 283–290, 2015.
-  M. Danelljan, F. S. Khan, M. Felsberg, and J. V. de Weijer, “Adaptive color attributes for real-time visual tracking,” in the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1090–1097.
-  P. Liang, E. Blasch, and H. Ling, “Encoding color information for visual tracking: algorithms and benchmark,” IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5630–5644, 2015.
-  H. Stokman and T. Gevers, “Selection and fusion of color models for feature detection,” in CVPR, 2005.
-  Y.-I. Ohta, T. Kanade, and T. Sakai, “Color information for region segmentation,” Computer graphics and image processing, vol. 13, no. 3, pp. 222–241, 1980.
-  V. Meas-Yedid, E. Glory, E. Morelon, C. Pinset, G. Stamon, and J. C. Olivo-Marin, “Automatic color space selection for biological image segmentation,” Proceedings - International Conference on Pattern Recognition, vol. 3, pp. 514–517, 2004.
-  S. Dev, Y. H. Lee, and S. Winkler, “Systematic study of color spaces and components for the segmentation of sky/cloud images,” 2014 IEEE International Conference on Image Processing, ICIP 2014, pp. 5102–5106, 2014.
-  A. Gupta and A. Chaudhary, “Robust skin segmentation using color space switching,” Pattern Recognition and Image Analysis, vol. 26, no. 1, pp. 61–68, 2016.
-  V. Vezhnevets, “A survey on pixel-based skin color detection techniques,” pp. 85–92, 2003.
-  C. Benedek and T. Szirányi, “Study on color space selection for detecting cast shadows in video surveillance,” International Journal of Imaging Systems and Technology, vol. 17, no. 3, pp. 190–201, 2007.
-  K. Van De Sande, T. Gevers, and C. Snoek, “Evaluating color descriptors for object and scene recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1582–1596, 2010.
-  J. Scandaliaris, M. Villamizar, J. Andrade-Cetto, and A. Sanfeliu, “Robust color contour object detection invariant to shadows,” Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications, pp. 301–310, 2007.
-  G. Paschos, “Perceptually uniform color spaces for color texture analysis: an empirical evaluation,” IEEE Transactions on Image Processing, vol. 10, no. 6, pp. 932–937, 2001.
-  A. Porebski and N. Vandenbroucke, “Iterative feature selection for color texture classification,” in Image Processing, 2007. ICIP 2007. IEEE International Conference on, pp. 509–512, 2007.
-  D.-L. Song, L.-H. Ge, W.-W. Qi, and M. Chen, “Illumination invariant color model selection based on genetic algorithm in robot soccer,” Information Science and Engineering (ICISE), 2010 2nd International Conference on, no. 3, pp. 1–4, 2010.
-  D. Song, W. Sun, Z. Ji, G. Hou, X. Li, and L. Liu, “Color model selection for underwater object recognition,” International Conference on Information Science, Electronics and Electrical Engineering, pp. 1339–1342, 2014.
-  G. Duan, F. Duan, Y. Xu, H. Gong, and X. Qu, “Investigation of Optimal Segmentation Color Space of Bayer True Color Images with Multi-Objective Optimization Methods,” Journal of the Indian Society of Remote Sensing, vol. 43, no. 3, pp. 487–499, 2015.
-  A. Rasouli and J. K. Tsotsos, “Attention in autonomous robotic visual search,” in i-SAIRAS, Montreal, June 2014.
-  P. Guo and M. R. Lyu, “A study on color space selection for determining image segmentation region number,” in the 2000 International Conference on Artificial Intelligence (IC-AI 2000), Las Vegas, 2000, pp. 1127–1132.
-  E. Salvador, A. Cavallaro, and T. Ebrahimi, “Cast shadow segmentation using invariant color features,” Computer vision and image understanding, vol. 95, no. 2, pp. 238–259, 2004.
-  M. Tkalcic and J. F. Tasic, “Colour spaces: perceptual, historical and applicational background,” in Eurocon, 2013.
-  A. R. Weeks, C. E. Felix, and H. R. Myler, “Edge detection of color images using the hsl color space,” in In IS&T/SPIE’s Symposium on Electronic Imaging: Science & Technology, 1995, pp. 291–301.
-  Y. I. Ohta, T. Kanade, and T. Sakai, “Color information for region segmentation,” Computer graphics and image processing, vol. 13, no. 3, pp. 222–241, 1980.
-  L. Lucchese and S. K. Mitra, “Filtering color images in the xyy color space,” in ICIP, 2000, pp. 500–503.
-  M. J. Swain and D. H. Ballard, “Color indexing,” Computer Vision, vol. 7, no. 1, pp. 11–32, 1991.
-  P. J. Rousseeuw, “Silhouettes: a graphical aid to the interpretation and validation of cluster analysis,” Journal of computational and applied mathematics, vol. 20, pp. 53–65, 1987.
-  A. Rasouli and J. K. Tsotsos, “Sensor planning for 3d visual search with task constraints,” in Computer and Robot Vision (CRV), 2016 13th Conference on. IEEE, 2016, pp. 37–44.