




Shi's Re-processing of Gehler's Raw Dataset



The Gehler dataset contains 568 images covering a variety of indoor and outdoor scenes, taken with two high-quality DSLR cameras (Canon 5D and Canon 1D) with all settings in auto mode. Each image contains a MacBeth colorchecker for reference. All images were saved in Canon RAW format. Gehler also provides TIFF versions created from the RAW images using the automatic mode of the Canon Digital Photo Professional program. The image coordinates (measured by hand) of each colorchecker's squares are provided with the dataset.

Because the TIFF images in the Gehler dataset were produced automatically, they contain clipped pixels, are non-linear (i.e., have gamma or tone curve correction applied), are demosaiced, and include the effect of the camera's white balancing. To avoid these problems we reprocessed the raw data, decoding the Canon RAW files with dcraw (using the Windows executable dcrawMS.exe) to create almost-raw 12-bit Portable Network Graphics (PNG) images, a format with lossless compression. To preserve the original digital counts for each of the RGB channels, demosaicing was not enabled. Both cameras output 12-bit data per channel, so the range of possible digital counts is 0 to 4095. The raw images contain 4082 x 2718 (Canon 1D) and 4386 x 2920 (Canon 5D) 12-bit values in an RGGB pattern. To create a color image, the two G values in each 2 x 2 RGGB cell were averaged, but no further demosaicing was done. This results in a 2041 x 1359 (Canon 1D) or 2193 x 1460 (Canon 5D) linear image (gamma=1) in camera RGB space, i.e., half the raw resolution in each dimension. This processing takes into account that the two camera models have slightly different sensor mosaics (same pattern but different starting offset).
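The reduction from an RGGB mosaic to a half-resolution RGB image can be illustrated with a minimal sketch. This is not the code used to produce the PNG files; it only mirrors the description above on a small nested list. The function name and the `row_offset` / `col_offset` parameters are hypothetical, introduced here to model the "different starting offset" between the two cameras, whose actual values are not stated on this page.

```python
def bayer_to_half_res_rgb(mosaic, row_offset=0, col_offset=0):
    """Collapse each 2x2 RGGB cell of a Bayer mosaic into one RGB pixel.

    mosaic: list of rows, each a list of digital counts (0..4095).
    row_offset, col_offset: hypothetical parameters (0 or 1) giving the
    position of the first R sample; the real per-camera offsets are not
    documented here.
    """
    height = len(mosaic)
    width = len(mosaic[0])
    rgb = []
    for r in range(row_offset, height - 1, 2):
        out_row = []
        for c in range(col_offset, width - 1, 2):
            red = mosaic[r][c]
            # Average the two G samples of the cell; no other interpolation.
            green = (mosaic[r][c + 1] + mosaic[r + 1][c]) / 2
            blue = mosaic[r + 1][c + 1]
            out_row.append((red, green, blue))
        rgb.append(out_row)
    return rgb


# A 2-row, 4-column mosaic holds two RGGB cells side by side.
example = bayer_to_half_res_rgb([[10, 20, 11, 21],
                                 [30, 40, 31, 41]])
# example == [[(10, 25.0, 40), (11, 26.0, 41)]]
```

With zero offsets the output has exactly half as many rows and columns as the mosaic, which matches the 4082 x 2718 to 2041 x 1359 and 4386 x 2920 to 2193 x 1460 sizes quoted above. For full-size images an array library would be far faster than nested lists.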

We provide the least processed data possible. Note that for most applications (e.g., testing colour constancy methods) the black level offset will still need to be subtracted from the images. A Matlab template for loading the images and removing the offset is available here. The black level of each camera was estimated by finding the minimum pixel value across the whole dataset. For the Canon 5D the black level is 129 and for the Canon 1D it is zero.
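As a rough illustration of the offset removal (separate from the Matlab template mentioned above, whose contents are not reproduced here), the following Python sketch subtracts the per-camera black level from an already-loaded image. The function name, the dictionary keys, and the nested-list image layout are assumptions made for the example. Clamping at zero is also an assumption; since the black level was estimated as the dataset minimum, it should never actually trigger on these images.

```python
# Black levels as stated in the text above.
BLACK_LEVEL = {"Canon 5D": 129, "Canon 1D": 0}


def subtract_black_level(image, camera):
    """Return a copy of `image` with the camera's black level removed.

    image: list of rows, each a list of (R, G, B) tuples of digital counts.
    camera: "Canon 5D" or "Canon 1D" (hypothetical key names).
    """
    offset = BLACK_LEVEL[camera]
    return [
        # max(..., 0) is a safeguard only; see the note in the lead-in.
        [tuple(max(value - offset, 0) for value in pixel) for pixel in row]
        for row in image
    ]


corrected = subtract_black_level([[(129, 200, 4095)]], "Canon 5D")
# corrected == [[(0, 71, 3966)]]
```

Note that after subtraction the maximum possible count for the Canon 5D is 4095 - 129 = 3966 rather than 4095.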

Measuring the Scene Illumination

The colorchecker in each image has six achromatic squares. As the ground truth measure of the illumination's RGB color, we used the median of the RGB digital counts (i.e., median R, median G, median B) from the brightest achromatic square (ranked by the average of each square) containing no RGB digital count > 3300. The threshold eliminates any clipping and the effects of any possible non-linearity in sensor response that might occur as intensities approach the maximum of 4095. The median is used instead of the mean because the median automatically excludes any of the black pixels surrounding each square that might have been incorrectly included in the square due to inexact hand labeling of the colorchecker's position.
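The selection rule above can be sketched in Python as follows. This is an interpretation, not the code used to generate the measured illumination file: it assumes "average of each square" means the mean over all pixels and all three channels, and that a square is rejected if any single channel value of any pixel exceeds 3300. The function name and input layout are hypothetical, and the sketch applies the threshold to whatever counts it is given without handling the black level.

```python
from statistics import median

THRESHOLD = 3300  # digital-count limit stated in the text


def estimate_illuminant(squares, threshold=THRESHOLD):
    """Return (median R, median G, median B) of the brightest usable square.

    squares: list of achromatic squares, each a non-empty list of
    (R, G, B) tuples of digital counts. Returns None if every square
    contains a count above the threshold.
    """
    def mean_count(pixels):
        # Assumed ranking measure: mean over all pixels and channels.
        return sum(sum(px) for px in pixels) / (3 * len(pixels))

    usable = [
        sq for sq in squares
        if sq and all(value <= threshold for px in sq for value in px)
    ]
    if not usable:
        return None
    brightest = max(usable, key=mean_count)
    # zip(*brightest) regroups the pixels into R, G and B sequences.
    return tuple(median(channel) for channel in zip(*brightest))


squares = [
    [(3500, 3400, 3000), (3450, 3390, 2990), (3460, 3395, 2995)],  # too bright
    [(2000, 2500, 1500), (2010, 2520, 1490), (0, 0, 0)],  # one stray black pixel
    [(1000, 1200, 800), (1000, 1200, 800), (1000, 1200, 800)],
]
illuminant = estimate_illuminant(squares)
# illuminant == (2000, 2500, 1490)
```

In the example the first square is rejected because it contains counts above 3300, and the stray black pixel in the second square does not shift the per-channel medians, which is the behaviour the text attributes to using the median rather than the mean.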



PNG Images (converted from raw images; divided into 4 parts, 1 to 2 GB each):
Canon 1D, Canon 5D(1), Canon 5D(2), Canon 5D(3)

Measured Illumination: (based on linear images) download

Publications That Use This Data

If you use this version of the data set please cite it as

Lilong Shi and Brian Funt, "Re-processed Version of the Gehler Color Constancy Dataset of 568 Images,"
accessed from

Please also include a citation to the original source:

Peter Gehler and Carsten Rother and Andrew Blake and Tom Minka and Toby Sharp, "Bayesian Color Constancy Revisited,"
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2008.

Computational Vision Lab
Computing Science,
Simon Fraser University,
Burnaby, BC, Canada,
V5A 1S6

Fax: (778) 782-3045
Tel: (778) 782-4717
Office: ASB 10865, SFU

Site by the Centre for Systems Science.
Last Updated: September 18, 2000