Datasets & Benchmarks

This page provides datasets for several of our publications as well as publicly available benchmarks. Unless otherwise noted, the data is for personal, research, and teaching use only. Please contact us if you wish to use the data for commercial purposes.

A synthetic vision dataset for a part-based analysis of explainable AI methods.
Relevant citation
(please cite this paper if you are using the dataset)
R. Hesse, S. Schaub-Meyer, S. Roth, “FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods,” in International Conference on Computer Vision (ICCV), October 2023.
Data separate page
Contact Robin Hesse
A multilingual evaluation benchmark for the visual question answering task.
Relevant citation
(please cite this paper if you are using the dataset)
J. Pfeiffer, G. Geigle, A. Kamath, J.-M. O. Steitz, S. Roth, I. Vulić, and I. Gurevych, “xGQA: Cross-Lingual Visual Question Answering,” in Findings of the Association for Computational Linguistics (ACL), May 2022.
Data separate page
Contact visit website
Dataset for evaluating image denoising methods on images with real sensor noise.
Relevant citation
(please cite this paper if you are using the dataset)
T. Plötz and S. Roth, “Benchmarking denoising algorithms with real photographs,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, July 2017.
Data separate page
Contact Tobias Plötz
Synthetic data from computer games with labels for semantic segmentation.
Relevant citation
(please cite this paper if you are using the dataset)
S. R. Richter, V. Vineet, S. Roth, and V. Koltun, “Playing for data: Ground truth from computer games,” in Proc. of the European Conference on Computer Vision (ECCV), J. Matas, B. Leibe, M. Welling and N. Sebe, Eds., ser. LNCS, Springer, 2016.
Data separate page
Contact Stephan Richter
Synthetic and real data for stereo video deblurring.
Relevant citation
(please cite this paper if you are using the dataset)
A. Sellent, C. Rother, and S. Roth, “Stereo video deblurring,” in Proc. of the European Conference on Computer Vision (ECCV), B. Leibe, J. Matas, N. Sebe and M. Welling, Eds., ser. LNCS, vol. 9906, Springer, 2016, pp. 558–575.
Data synthetic data, motorized rail data
Contact Anita Sellent
This large scale dataset provides training and test data for semantic segmentation in urban street scenes.
Relevant citation
(please cite this paper if you are using the dataset)
M. Cordts, M. Omran, S. Ramos, T. Scharwächter, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, “The Cityscapes dataset for semantic urban scene understanding,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, June 2016.
Data cityscapes-dataset.net
Contact
Multiple Object Tracking Benchmark – Comprehensive evaluation of multi-target tracking algorithms across a variety of public datasets.
Relevant citations
(please cite one of these papers if you are using the dataset)
L. Leal-Taixé, A. Milan, I. Reid, S. Roth, and K. Schindler, “MOTChallenge 2015: Towards a benchmark for multi-target tracking,” arXiv:1504.01942 [cs.CV], Apr. 2015.

L. Leal-Taixé, A. Milan, I. Reid, S. Roth, and K. Schindler, “MOT16: A Benchmark for Multi Object Tracking,” arXiv:1603.00831 [cs.CV], Mar. 2016.
Benchmark website
Contact visit website
Test dataset for semantic segmentation in urban street scenes.
Relevant citations
(please cite one of these papers if you are using the dataset)
T. Scharwächter, M. Enzweiler, S. Roth, and U. Franke, “Efficient multi-cue scene segmentation,” in Proc. of the German Conference on Pattern Recognition (GCPR), J. Weickert, M. Hein, and B. Schiele, Eds., ser. LNCS, vol. 8142, Springer, 2013, pp. 435–445.

T. Scharwächter, M. Enzweiler, U. Franke, and S. Roth, “Stixmantics: A medium-level model for real-time semantic scene understanding,” in Proc. of the European Conference on Computer Vision (ECCV), D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, Eds., ser. LNCS, vol. 8693, Springer, 2014, pp. 533–548.
Dataset website
Contact Timo Scharwächter
Benchmark for optical flow with real and synthetic scenes.
Relevant citation
(please cite this paper if you are using the dataset)
S. Baker, D. Scharstein, J. Lewis, S. Roth, M. J. Black, and R. Szeliski, “A database and evaluation methodology for optical flow,” International Journal of Computer Vision (IJCV), vol. 92, no. 1, pp. 1–31, Mar. 2011.
Other relevant citations S. Baker, D. Scharstein, J. Lewis, S. Roth, M. J. Black, and R. Szeliski, “A database and evaluation methodology for optical flow,” in Proc. of the IEEE International Conference on Computer Vision (ICCV), Rio de Janeiro, Brazil, Oct. 2007.
Dataset website
Contact visit website
Database of 800 synthetically generated optical flow fields (from range images and camera motion) used to analyze their spatial statistics.
Relevant citations
(please cite these papers if you are using the dataset)
S. Roth and M. J. Black, “On the spatial statistics of optical flow,” International Journal of Computer Vision (IJCV), vol. 74, no. 1, pp. 33–50, Aug. 2007.

J. Huang, A. B. Lee, and D. Mumford, “Statistics of range images,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, p. 1324ff, June 2000.
Other relevant citations S. Roth and M. J. Black, “On the spatial statistics of optical flow,” in Proc. of the IEEE International Conference on Computer Vision (ICCV), vol. 1, Beijing, China, Oct. 2005, pp. 42–49.
Dataset separate page
Contact Stefan Roth
Set of 68 images for image denoising (subset of the Berkeley segmentation dataset), originally used with Fields of Experts.
Relevant citations
(please cite these papers if you are using the dataset)
S. Roth and M. J. Black, “Fields of experts,” International Journal of Computer Vision (IJCV), vol. 82, no. 2, pp. 205–229, Apr. 2009.

D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics,” in Proc. of the International Conference on Computer Vision (ICCV), vol. 2, pp. 416–423, July 2001.
Other relevant citations S. Roth and M. J. Black, “Fields of experts: A framework for learning image priors,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, San Diego, California, Jun. 2005, pp. 860–867.
Dataset separate page
Contact Stefan Roth