EGO-CH: Dataset and Fundamental Tasks for Visitors Behavioral Understanding using Egocentric Vision


F. Ragusa1,3, A. Furnari1, S. Battiato1, G. Signorello2, G. M. Farinella1,2

1IPLab, Department of Mathematics and Computer Science - University of Catania, IT
2CUTGANA - University of Catania, IT
3Xenia Gestione Documentale s.r.l. - Xenia Progetti s.r.l., Acicastello, Catania, IT




We propose EGO-CH, a dataset of egocentric videos for visitors’ behavior understanding. The dataset has been collected in two different cultural sites and includes more than 27 hours of video acquired by 70 subjects, including volunteers and 60 real visitors. The overall dataset includes labels for 26 environments and over 200 Points of Interest (POIs). Specifically, each video of EGO-CH has been annotated with 1) temporal labels specifying the current location of the visitor and the observed POI, and 2) bounding box annotations around POIs. A large subset of the dataset, consisting of 60 videos, is also associated with surveys filled out by the visitors at the end of each visit. To encourage research on the topic, we propose 4 challenging tasks useful to understand visitors’ behavior and report baseline results on the dataset.



Dataset



The dataset has been acquired using a head-mounted Microsoft HoloLens device in two cultural sites located in Sicily, Italy: 1) “Palazzo Bellomo”, located in Siracusa, and 2) “Monastero dei Benedettini”, located in Catania.

EGO-CH: Palazzo Bellomo

  • 22 environments and 191 Points of Interest (POIs)
  • Video Acquisition: 1280x720 at 29.97 fps
  • 57 training videos and 10 validation/test videos
  • 191 reference images related to the considered POIs
  • Temporal labels to indicate the environment in which the visitor is located and the currently observed POI
  • 70088 frames annotated with bounding boxes
  • 23727 image patches extracted from the bounding box annotations

EGO-CH: Monastero dei Benedettini
  • 4 environments and 35 Points of Interest (POIs)
  • Video Acquisition: training/validation videos 1216x684 at 24 fps; test videos 1408x792 at 30.03 fps
  • 48 training videos and 5 validation videos
  • 60 real visits
  • 35 reference images related to the considered POIs
  • Temporal labels to indicate the environment in which the visitor is located and the currently observed POI
  • 106911 frames annotated with bounding boxes
  • 45048 image patches extracted from the bounding box annotations
  • 60 surveys related to the 60 real visits



You can download the whole dataset and annotations at this link.




Tasks


Room-based Localization

The task consists in determining the room in which the visitor of a cultural site is located from egocentric images.
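As an illustration of how per-frame room predictions relate to the temporal labels described above, the following minimal sketch groups consecutive identical frame labels into time segments. It is not the authors' baseline: the function name `frames_to_segments`, the room names, and the frame rate are hypothetical and chosen only for the example.

```python
from itertools import groupby


def frames_to_segments(frame_labels, fps):
    """Group consecutive identical per-frame labels into
    (label, start_seconds, end_seconds) segments."""
    segments = []
    start = 0  # index of the first frame of the current run
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((label, start / fps, (start + length) / fps))
        start += length
    return segments


# Hypothetical per-frame room predictions for a short clip.
predictions = ["Room1"] * 3 + ["Room2"] * 2 + ["Room1"]
segments = frames_to_segments(predictions, fps=1.0)
for label, begin, end in segments:
    print(f"{label}: {begin:.1f}s - {end:.1f}s")
```

In practice, raw per-frame predictions are often noisy, so some form of temporal smoothing would typically be applied before grouping; that step is left out here.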



Points of Interest Recognition

The task consists in recognizing the points of interest which the user is looking at.



Object Retrieval

Given a query image containing an object, the task consists in retrieving an image of the same object from a database.
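One common way to approach retrieval of this kind is nearest-neighbor search over image feature vectors. The sketch below assumes that each image has already been mapped to a non-zero feature vector by some descriptor (not shown) and ranks database entries by cosine similarity to the query. The identifiers `poi_a`, `poi_b`, `poi_c` and the toy vectors are hypothetical, and this is not presented as the method used for the reported baselines.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, database, top_k=1):
    """Return the ids of the top_k database entries most similar to the query."""
    ranked = sorted(
        database,
        key=lambda item_id: cosine_similarity(query, database[item_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical feature vectors standing in for reference images of three POIs.
database = {
    "poi_a": [1.0, 0.0, 0.0],
    "poi_b": [0.0, 1.0, 0.0],
    "poi_c": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]  # hypothetical feature vector of the query patch
best = retrieve(query, database, top_k=2)
print(best)
```

With real data, the query would be an image patch and the database would hold the reference images of the POIs, each passed through the same feature extractor.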


Survey Generation

This task consists in predicting the content of a survey from the analysis of the related egocentric video.
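The survey format is not described on this page, so the following is only a sketch of one quantity that could plausibly feed such a prediction: the time a visitor spent observing each POI, computed from per-frame POI labels. The function name `seconds_per_poi`, the use of `None` for frames with no observed POI, and the sample values are all assumptions made for illustration.

```python
from collections import Counter


def seconds_per_poi(frame_pois, fps):
    """Total observation time in seconds for each POI, given one label per frame.

    Frames labeled None (no POI observed) are ignored.
    """
    counts = Counter(poi for poi in frame_pois if poi is not None)
    return {poi: n / fps for poi, n in counts.items()}


# Hypothetical per-frame POI labels for a short clip recorded at 2 fps.
frame_pois = ["poi_a", "poi_a", None, "poi_b", "poi_a", None]
durations = seconds_per_poi(frame_pois, fps=2.0)
print(durations)
```

How such durations map to survey answers is a modeling choice that this sketch does not address.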




Paper

F. Ragusa, A. Furnari, S. Battiato, G. Signorello, G. M. Farinella. EGO-CH: Dataset and Fundamental Tasks for Visitors Behavioral Understanding using Egocentric Vision. Pattern Recognition Letters - Special Issue on Pattern Recognition and Artificial Intelligence Techniques for Cultural Heritage, 2020. Download the paper.



Code

We provide references and code related to our baselines:



Supplementary Material

More details on the dataset and additional information about the experiments can be found in the supplementary material associated with the publication.


Acknowledgement

This research is part of the project VALUE - Visual Analysis for Localization and Understanding of Environments (N. 08CT6209090207) supported by PO FESR 2014/2020 - Azione 1.1.5. - “Sostegno all’avanzamento tecnologico delle imprese attraverso il finanziamento di linee pilota e azioni di validazione precoce dei prodotti e di dimostrazioni su larga scala” del PO FESR Sicilia 2014/2020, and Piano della Ricerca 2016-2018 linea di Intervento 2 of DMI, University of Catania. The authors would like to thank Regione Siciliana Assessorato dei Beni Culturali dell'Identità Siciliana - Dipartimento dei Beni Culturali e dell'Identità Siciliana and Polo regionale di Siracusa per i siti culturali - Galleria Regionale di Palazzo Bellomo.




People