Collecting and labeling large amounts of data is expensive in terms of both time and cost. Motivated by this, we investigated how synthetic data can be used to train first-person vision models, reducing the need for labeled real domain-specific data. We propose a pipeline for generating and labeling synthetic human-object interactions from a first-person point of view using 3D models of the target environment and objects, which can be acquired cheaply with commercial scanners.
We present EgoISM-HOI, a new multimodal synthetic-real dataset of Egocentric Human-Object Interactions, which contains a total of 39,304 RGB images, 23,356 depth maps and instance segmentation masks, 59,860 hand annotations, 237,985 object instances across 19 object categories, and 35,416 egocentric human-object interactions.
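The sketch below shows how one multimodal sample (RGB image, depth map, segmentation mask, and interaction annotations) could be loaded in Python. The directory layout, file names, and annotation field names used here are assumptions for illustration only; the actual organization of the EgoISM-HOI release is documented in the download package.

```python
import json
from pathlib import Path

from PIL import Image

# Hypothetical directory layout; the released EgoISM-HOI package may differ.
DATASET_ROOT = Path("EgoISM-HOI")


def load_sample(split: str, frame_id: str) -> dict:
    """Load one multimodal sample: RGB image, depth map and instance mask
    (when available), and the per-frame interaction annotations."""
    rgb = Image.open(DATASET_ROOT / split / "rgb" / f"{frame_id}.jpg")

    # Depth maps and instance segmentation masks are provided for the
    # synthetic portion of the dataset, so they may be absent for real frames.
    depth_path = DATASET_ROOT / split / "depth" / f"{frame_id}.png"
    depth = Image.open(depth_path) if depth_path.exists() else None

    mask_path = DATASET_ROOT / split / "masks" / f"{frame_id}.png"
    mask = Image.open(mask_path) if mask_path.exists() else None

    # Hypothetical annotation schema: hand boxes, object boxes with one of the
    # 19 object categories, and hand-object interaction links between them.
    with open(DATASET_ROOT / split / "annotations" / f"{frame_id}.json") as f:
        annotations = json.load(f)

    return {"rgb": rgb, "depth": depth, "mask": mask, "annotations": annotations}


if __name__ == "__main__":
    sample = load_sample("train", "000001")
    print(sample["rgb"].size, len(sample["annotations"].get("objects", [])))
```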
Download
If you find the code, pre-trained models, or the EgoISM-HOI dataset useful for your research, please cite the following paper:
@article{leonardi2024synthdata,
title = {Exploiting multimodal synthetic data for egocentric human-object interaction detection in an industrial scenario},
journal = {Computer Vision and Image Understanding},
volume = {242},
pages = {103984},
year = {2024},
issn = {1077-3142},
doi = {10.1016/j.cviu.2024.103984},
author = {Rosario Leonardi and Francesco Ragusa and Antonino Furnari and Giovanni Maria Farinella},
}
Additionally, consider citing the original paper:
@inproceedings{leonardi2022egocentric,
title={Egocentric Human-Object Interaction Detection Exploiting Synthetic Data},
author={Leonardi, Rosario and Ragusa, Francesco and Furnari, Antonino and Farinella, Giovanni Maria},
booktitle={Image Analysis and Processing -- ICIAP 2022},
pages={237--248},
year={2022}
}
Visit our page dedicated to First Person Vision Research for other related publications.