Multiple Light Source Dataset for Colour Research

Anna Smagina (1), Egor Ershov (1), Anton Grigoryev (1, 2)
(1) Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences
(2) Moscow Institute of Physics and Technology (State University)

We present a collection of 24 multiple-object scenes, each recorded under 18 multiple-light-source illumination scenarios. The illuminants vary in dominant spectral colour, intensity and distance from the scene. We mainly address realistic scenarios for the evaluation of computational colour constancy algorithms, but have also aimed to make the data as general as possible for computational colour science and computer vision. Along with the images of the scenes, we provide spectral characteristics of the camera, the light sources and the objects, and include pixel-by-pixel ground truth annotation of uniformly coloured object surfaces, which makes the dataset useful for benchmarking colour-based image segmentation algorithms. The dataset is freely available at

Keywords: computer vision, colour constancy, illuminant estimation, multiple light source

1 Introduction

In this paper we describe a new laboratory dataset for assessing the performance of colour constancy and colour-based segmentation algorithms. It is also useful for evaluating the representativeness of spectral colour models and for other colour research. Thirty-nine years have passed since the invention of the GrayWorld method [buchsbaum1980spatial], one of the first and most popular colour constancy approaches. Colour constancy algorithms are useful, for example, in face identification and detection systems [sengupta2018sfsnet]. Research activity on this topic is not declining even now, largely owing to the increase in available computing power and the development of better optical sensors. The history of colour constancy research is reviewed in [gijsenij2011computational].

The relevance of colour constancy to modern science is indicated by the fact that, in recent years, one colour constancy dataset has been published per year on average. Such datasets can essentially be divided into two types: those collected in the laboratory and those collected under uncontrolled conditions. While the latter are usually created to evaluate existing algorithms under real-life conditions, the former are better suited to assessing the quality of existing solutions accurately, as well as to determining the degree of solvability of tasks that have not yet been solved. Such unsolved tasks include estimating the colours of multiple light sources (about ten works have already been written on this topic, in particular [bianco2017single]), recovering the position of light sources in the camera coordinate system from an image, creating low-parametric spectral models for describing colour transformations [gusamutdinova2017verification], and others.

Datasets collected under natural, uncontrolled conditions, for example REC [hemrit2018rehabilitating], CubePlus [banic2017unsupervised], GrayBall [ciurea2003large] and NUS [cheng2014illuminant], are poorly suited for assessing the accuracy of solutions to the problems listed above: the source colour is estimated inaccurately from calibration objects, because the calibration object (or objects) is usually illuminated by several sources simultaneously in unknown proportions. The same is true for datasets of multichannel optical images [nascimento2002statistics, foster2006frequency, parraga1998color], although these compare favourably in that they allow not only a more accurate assessment of colorimetric algorithms but also augmentation of the lighting. For example, [gijsenij2011color] proposed a method for synthesising images in which different parts are illuminated by different sources.

The illumination can be assessed more precisely either by increasing the number of calibration objects in the scene, as is done in [funt2010rehabilitation], or, more accurately but at greater cost in time, by measuring the emission spectra of the illuminants (our method). While forming this dataset, we focused precisely on the accuracy of measuring all characteristics of the scene, both the spectra and the location of objects. In summary, in the context of the tasks mentioned above, the following requirements were defined for the dataset:

  • The presence of different light sources (different chromaticities) with known spectral characteristics;

  • The presence of scenes illuminated by one or several light sources of various types (spot and diffuse);

  • Different types of objects—metals (preferably chromatic), dielectrics, flat / non-flat (important for algorithms based on a linear model of image formation and for evaluating the representativeness of spectral colour models);

  • Known spectral reflectance characteristics of the objects;

  • Known camera characteristics and linear images.

Among the published laboratory datasets [barnard2002comparison, geusebroek2005amsterdam, bleier2011color, rizzi2013yaccd2], none simultaneously satisfies all the items on this list, which motivated the construction of the dataset described in this paper.

2 Scene setup

The setup used to acquire the image collection consists of a lightbox (FALCON EYES PBF-60AB), 4 different light sources and a camera (see Fig. 1). The halogen lights are mounted at equal distances (1.5 m) from two opposite sides of the softbox to produce diffuse lighting. The desk lamp and the LED lamps are placed in the immediate vicinity of the softbox to produce close lighting. Objects are positioned manually inside the softbox. The camera points at the scene through the open side of the softbox.

Figure 1: Schematic view of the experimental setup for image collection.

The halogen lamps are 35 W each and have similar light spectra. The whole experimental procedure (setting up, calibrating and recording the database) took approximately 100 hours, only a fraction of the average lifetime of the lamps (about 1000 hours); hence ageing effects on colour temperature can be considered minimal. The brightness of the incandescent lamp (60 W) was adjustable and was set to 30% of the maximum. The RGB LED strip is controlled by computer via an in-house USB controller, which allows the emitting power of the red, green and blue LEDs to be tuned individually.

Images were recorded with a Canon 5D Mark III camera equipped with a Canon EF 24–70 mm f/2.8L II USM lens. Shooting was performed in RAW format at 5760 x 3840 resolution in manual mode. The camera settings were adjusted under the brightest illumination used in the experiment so as to avoid over-exposure. Zoom was switched off. The focal length was set to 25 mm and the ISO speed to 320. The aperture was closed to f/16 to achieve sharpness across most of the scene. The exposure time was set to 1 second, and a 2-second delay was used for shooting. The in-camera white balance was set to 6500 K. The spectral responses of the Canon 5D Mark III were measured by Baek et al. [baek2017compact].

3 Objects

To compose the scenes we used objects without colour-textured surfaces or mirroring effects, so as to provide clear colouring annotation (section 4) and spectral characteristics (section 5). The objects represent 122 surfaces of various materials and shapes. These include 88 dielectric surfaces (both matte and glossy), 6 achromatic and 10 chromatic metals. We also recorded a scene with the DGK Color Tools WDKK Color Chart, consisting of 18 patches with known colours (fig. 2).

4 Image data

The scenes are composed of multiple objects placed on a coloured background. The illumination of each scene is varied over 18 configurations, shown in figure 2, which comprise the following combinations of light sources:

  • 2HAL – two halogen lights only,

  • 2HAL_DESK – two halogen lights with the desktop lamp,

  • 2HAL_DESK_LED-{R, RG, BG, B}{025, 050, 075, 100} – two halogen lights with the desktop lamp and the red (R), red and green (RG), blue and green (BG) or blue (B) lights of the LED strip turned on at 25%, 50%, 75% and 100% of the emitting power, respectively.
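
The naming scheme above expands to 2 + 4 x 4 = 18 labels. The following minimal sketch enumerates them; it assumes that the brace notation is expanded by plain concatenation (for example 2HAL_DESK_LED-RG050), which is our reading of the notation and not a confirmed file-naming convention of the dataset. The function name is hypothetical.

```python
from itertools import product


def illumination_labels():
    """Enumerate the 18 illumination scenario labels.

    Assumption: the brace notation expands by concatenating one LED
    colour code with one power code, e.g. "2HAL_DESK_LED-RG050".
    """
    labels = ["2HAL", "2HAL_DESK"]
    led_colours = ["R", "RG", "BG", "B"]
    led_levels = ["025", "050", "075", "100"]
    labels += [
        f"2HAL_DESK_LED-{colour}{level}"
        for colour, level in product(led_colours, led_levels)
    ]
    return labels


labels = illumination_labels()
print(len(labels))
```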

Figure 2: Example scene viewed under 18 different illumination scenarios.

Each image is provided in 3 formats: a full-resolution raw image (single-channel RGGB Bayer array), cropped and converted to 16-bit PNG from the camera-specific CR2 raw file; a half-resolution colour PNG image (in linear sensor-specific RGB coordinates) obtained by a naive demosaicing algorithm (the camera sensor has a built-in optical low-pass filter, so no moiré artefacts arise); and a quarter-resolution colour JPEG image for preview. The raw and 16-bit colour images are presented without any autocontrast or range stretching and may appear very dark when viewed in ordinary software; however, they contain all the information needed to reconstruct the full-colour images shown in the previews.
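
As an illustration of how a half-resolution colour image can be derived from an RGGB mosaic, the sketch below collapses each 2 x 2 Bayer cell into one RGB pixel. The exact naive algorithm used for the dataset is not specified here, so averaging the two green samples is an assumption, as are the function names. The second helper is a hypothetical display stretch for viewing the dark linear data; it applies no gamma or white balance.

```python
import numpy as np


def demosaic_half(bayer):
    """Naive RGGB demosaicing: each 2x2 Bayer cell becomes one RGB pixel.

    Assumption: the two green samples of a cell are averaged.
    """
    bayer = np.asarray(bayer, dtype=np.float64)
    h, w = bayer.shape
    bayer = bayer[: h - h % 2, : w - w % 2]  # drop an odd trailing row/column
    r = bayer[0::2, 0::2]
    g = 0.5 * (bayer[0::2, 1::2] + bayer[1::2, 0::2])
    b = bayer[1::2, 1::2]
    return np.stack([r, g, b], axis=-1)


def stretch_for_display(rgb, percentile=99.0):
    """Scale linear data so the given percentile maps to 1.0, then clip to [0, 1]."""
    scale = np.percentile(rgb, percentile)
    if scale <= 0:
        return np.zeros_like(rgb)
    return np.clip(rgb / scale, 0.0, 1.0)


# Synthetic 4x4 RGGB mosaic standing in for a decoded raw PNG.
mosaic = np.array([[10, 20, 10, 20],
                   [40, 30, 40, 30],
                   [10, 20, 10, 20],
                   [40, 30, 40, 30]])
rgb = demosaic_half(mosaic)
preview = stretch_for_display(rgb)
```

Any algorithm evaluation should use the unstretched linear values; the stretch is only a viewing aid.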

Each scene is accompanied by pixelwise annotation masks of uniformly coloured object surfaces. The masks are given in an 8-bit PNG file in which all pixels belonging to one uniformly coloured object surface share the same colour, unique within the mask. For annotation purposes, images were first automatically split into small regions (of no fewer than 100 pixels) with guaranteed colour constancy using the algorithm given in [smagina2019linear]. The regions were then manually merged in accordance with the uniform colouring of the scene. Each such region is associated with a reflectance spectrum (section 5). As with the scene images, the colouring annotation is provided at full, half and quarter resolution.
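
One way such masks might be used is to average the image colour over each annotated surface. The sketch below is a minimal example that assumes the mask has already been decoded into an array of shape (H, W, 3) or (H, W) at the same resolution as the image; whether a particular mask colour is reserved for unannotated pixels is not stated here, so every distinct mask colour is treated as a region. The function name is hypothetical.

```python
import numpy as np


def region_mean_colours(image, mask):
    """Map each distinct mask colour to the mean image colour of its region.

    Assumption: every distinct mask colour marks one annotated surface.
    """
    image = np.asarray(image, dtype=np.float64)
    mask = np.asarray(mask)
    if mask.ndim == 2:  # single-channel label mask
        mask = mask[..., None]
    flat_mask = mask.reshape(-1, mask.shape[-1])
    flat_image = image.reshape(-1, image.shape[-1])
    colours, inverse = np.unique(flat_mask, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # guard against NumPy versions returning a 2-D inverse
    means = {}
    for idx, colour in enumerate(colours):
        key = tuple(int(v) for v in colour)
        means[key] = flat_image[inverse == idx].mean(axis=0)
    return means


# Tiny synthetic example: two regions of two pixels each.
image = np.array([[[1, 1, 1], [3, 3, 3]],
                  [[5, 5, 5], [7, 7, 7]]])
mask = np.array([[[255, 0, 0], [255, 0, 0]],
                 [[0, 255, 0], [0, 255, 0]]], dtype=np.uint8)
means = region_mean_colours(image, mask)
```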

Figure 3: Overview of all recorded MLS scenes.

5 Spectral data

In addition to colour images under various lighting conditions, we also collected a set of spectral measurements, both for illuminant emission and for object reflectance. The measurements were conducted using an OceanOptics FLAME-S-VIS-NIR-ES miniature spectrometer with a corresponding set of OceanOptics accessories, including two light sources (HL-2000 for reflectance measurement and HL-3P-CAM for radiometric calibration in illuminant characterisation) and a WS-1 reflectance standard.

The source spectra were collected for all individual sources (the two halogen lamps, the desk lamp and the 3 components of the RGB LED lamp), measured through the lightbox to characterise the actual spectra illuminating the objects. In addition, the spectrum of both halogen lamps was measured near the camera lens to obtain some information on relative spectral intensities. The dataset is accompanied by photos illustrating the spectral measurement conditions.

The measurements were taken with the built-in nonlinearity correction and calibrated for uniform sensitivity across all wavelengths using the radiometrically calibrated light source. The dark-room background was taken into account and subtracted; however, some small nonuniformities (e.g. from imperfectly blocked indicator lights that were on) may be visible in the resulting spectra. The chromaticities of the illuminants relative to our camera are shown in figure 4a.
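
A camera-relative chromaticity of an illuminant can be obtained by integrating its spectrum against the camera's spectral sensitivities and normalising the resulting RGB triple. The sketch below is written under stated assumptions: the spectrum and the three sensitivity curves are sampled on the same wavelength grid (np.interp could be used to resample them), and chromaticity is taken as RGB divided by R + G + B, which may differ from the coordinates used in figure 4. All names and the sample values are hypothetical.

```python
import numpy as np


def camera_response(wavelengths, spectrum, sensitivities):
    """Trapezoidal integral of spectrum * sensitivity for each camera channel.

    sensitivities has shape (N, 3), sampled on the same grid as spectrum.
    """
    wavelengths = np.asarray(wavelengths, dtype=np.float64)
    spectrum = np.asarray(spectrum, dtype=np.float64)
    sensitivities = np.asarray(sensitivities, dtype=np.float64)
    integrand = spectrum[:, None] * sensitivities
    steps = np.diff(wavelengths)[:, None]
    return (0.5 * (integrand[1:] + integrand[:-1]) * steps).sum(axis=0)


def chromaticity(rgb):
    """Normalise an RGB triple so that its components sum to 1."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb / rgb.sum()


# Toy data: flat illuminant and idealised narrow-band sensitivities.
wavelengths = [400.0, 500.0, 600.0, 700.0]
spectrum = [1.0, 1.0, 1.0, 1.0]
sensitivities = [[0.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0]]
response = camera_response(wavelengths, spectrum, sensitivities)
chroma = chromaticity(response)
```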

The reflectance spectra were measured relative to the polytetrafluoroethylene standard (Ocean Optics WS-1). A reflection probe holder was used to control the measurement angle. The same measurement angle was used for all materials; the specular component of metallic or metallised surfaces was additionally measured with the probe perpendicular to the surface. For each object, spectra were measured for all major constituent materials visible in the scene. The chromaticities of all surfaces taken at this measurement angle are shown in figure 4b. Each material is specified by a unique string (e.g. “red_plastic_ball” or “brown_metallic_door_knob”) containing the colour, the material and the object identification.
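
Measuring reflectance relative to a white standard is commonly expressed as the ratio of dark-corrected sample counts to dark-corrected standard counts. The sketch below shows that generic calculation; it is not a description of the processing actually applied to the dataset, and it treats the standard as an ideal reflector, which is an assumption. The function and argument names are hypothetical.

```python
import numpy as np


def relative_reflectance(sample, white, dark, eps=1e-12):
    """Reflectance relative to a white standard: (sample - dark) / (white - dark).

    Wavelengths where the standard gives no signal above dark are set to NaN.
    Assumption: the white standard is treated as an ideal reflector.
    """
    sample, white, dark = (
        np.asarray(a, dtype=np.float64) for a in (sample, white, dark)
    )
    denom = white - dark
    reflectance = np.full(sample.shape, np.nan)
    valid = np.abs(denom) > eps
    reflectance[valid] = (sample[valid] - dark[valid]) / denom[valid]
    return reflectance


# Toy spectrometer counts at three wavelengths.
refl = relative_reflectance(
    sample=[60.0, 110.0, 10.0],
    white=[110.0, 210.0, 10.0],
    dark=[10.0, 10.0, 10.0],
)
```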

Figure 4: Chromaticity distributions of (a) illuminants and (b) surfaces in camera-specific colour coordinates. The star marker denotes the white point.

6 Conclusion

We present a collection of images of 24 scenes recorded under 18 different multiple-light-source illuminations. The main feature of the dataset is its completeness: it contains spectral data for both the illuminants and the object surfaces. The dataset is mainly intended for the evaluation of computational colour constancy, but it can also be used in a variety of computer vision tasks involving colour- or material-based image segmentation, photometric invariant extraction and others.


This work is partially supported by the Russian Foundation for Basic Research (projects 17-29-03514 and 17-29-03370).

We are grateful to Alexander Belokopytov for his help in constructing the acquisition setup. We thank Dmitry Nikolaev for his valuable comments on the experimental design and result verification. We thank Alexey Glikin for technical help. For contributing the objects, we acknowledge Artem Sereda, Sergey Gladilin and Dmitry Bocharov.

