Social Event Detection (SED) 2012 dataset
Overview: This page makes available, for research purposes, the dataset, challenge definitions, ground-truth challenge results and corresponding evaluation script that were created and used in the 2012 edition of the Social Event Detection (SED) task of the MediaEval international benchmarking activity.
The Social Event Detection (SED) task of MediaEval 2012 requires participants to discover social events and detect related media items in a collection of images accompanied by metadata typically found on the social web (time-stamps, tags and, for a small subset of the images, geotags). By social events, we mean events that are planned by people and attended by people, and for which the illustrating media are captured by people. Finding the events, in this task, means finding a set of photo clusters, where each cluster comprises only photos associated with a single event (thus, each cluster defines a retrieved event).
For more information on the SED 2012 dataset, challenges and evaluation, please see the following two publications. If you use the dataset for your research, please cite one or both of them (the first describes the challenge and the second the dataset):
- S. Papadopoulos, E. Schinas, V. Mezaris, R. Troncy, I. Kompatsiaris, "Social Event Detection at MediaEval 2012: Challenges, Dataset and Evaluation", Proc. MediaEval 2012 Workshop, Pisa, Italy, October 2012.
- S. Papadopoulos, E. Schinas, V. Mezaris, R. Troncy, I. Kompatsiaris, "The SED2012 Dataset", Proc. MMSys 2013 conference, Oslo, Norway, 2013.
How to access: The data being released through this page include:
- The SED 2012 test kit (which includes the definitions of the three SED 2012 challenges and the XML file with the image metadata that could be used for addressing these challenges).
- The images of the collection (167,332 images that were captured between the beginning of 2009 and the end of 2011 by 4,422 unique Flickr users, and were posted to Flickr by their respective owners under a Creative Commons license). The images are made available in the form of 4 compressed image archives, plus an image license file, as follows:
  - sed2012_photos_part1.tar.gz (~1.2GB)
  - sed2012_photos_part2.tar.gz (~1.6GB)
  - sed2012_photos_part3.tar.gz (~2.2GB)
  - sed2012_photos_part4.tar.gz (~0.8GB)
  - sed2012_photos_license.zip (~4MB)
- The ground truth results for the three defined challenges (queries) on the provided dataset, together with a script for evaluating any social event detection results against this ground truth.
  - sed2012_evaluation_kit.zip (~1MB)
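Once the four image archives have been downloaded, they can be unpacked into a single directory. The following is a minimal sketch in Python, not an official tool: the archive names are the ones listed above, while the function name `extract_archives` and the directory arguments are hypothetical, and the internal layout of the archives is not assumed.

```python
import tarfile
from pathlib import Path

# Archive names as listed on this page.
ARCHIVES = [
    "sed2012_photos_part1.tar.gz",
    "sed2012_photos_part2.tar.gz",
    "sed2012_photos_part3.tar.gz",
    "sed2012_photos_part4.tar.gz",
]


def extract_archives(archive_dir, dest_dir):
    """Extract every listed archive found in archive_dir into dest_dir.

    Returns the names of the archives that were extracted. Archives that
    are not present in archive_dir are skipped rather than treated as errors.
    (Hypothetical helper; both directory arguments are the caller's choice.)
    """
    archive_dir = Path(archive_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    extracted = []
    for name in ARCHIVES:
        path = archive_dir / name
        if not path.is_file():
            continue  # this part has not been downloaded yet
        # "r:gz" opens a gzip-compressed tar archive for reading.
        with tarfile.open(path, "r:gz") as tar:
            tar.extractall(dest_dir)
        extracted.append(name)
    return extracted
```

For example, `extract_archives(".", "sed2012_photos")` would unpack whichever parts are present in the current directory into a `sed2012_photos` folder (a name chosen here only for illustration). The image license file is a zip archive and would need to be unpacked separately.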
Copyright notice: The images distributed as part of the Social Event Detection 2012 (SED 2012) dataset were collected from Flickr, where they were posted by their respective owners under a Creative Commons license. The Creative Commons attribution licenses allow for image use as long as the photographer is credited for the original creation. Some licenses may impose additional restrictions, but none of these preclude the use of the images for benchmarking purposes. While compiling the SED 2012 dataset, we collected only Creative Commons images, and also collected as much information as possible about the creators of each image. The creator information, the exact license type and other relevant information are included in the image license file, which is distributed together with the images. We would like to take this opportunity to express our gratitude to the photographers for allowing us to use their pictures: we greatly appreciate this and gladly acknowledge your work. Your names and license details are listed in the image license file. Please let us know if you have special wishes on how you would like to be credited, or if there are additional details that must be incorporated.