ICVSS: From Perception to Action

An Egocentric Approach to Social AI

Jim Rehg

University of Illinois at Urbana-Champaign, USA

Abstract

While computer vision and NLP have made tremendous progress in extracting semantics from images, video, and language, our computational understanding of human social behavior is still in its infancy. Face-to-face communication, using a rich repertoire of visual, acoustic, and linguistic channels, is the foundation for all social interactions and relationships, as well as for all other means of communication. Moreover, the acquisition of social skill in infancy is a critical developmental milestone, and its disruption by conditions such as autism has life-long consequences. The current state of the art in AI consists largely of surface-level analyses, e.g., action detection, recognition, and prediction from video, or inferring the sentiment of an utterance using NLP. A major challenge is to leverage this recent progress and mount an attack on the core constructs of social interaction, such as joint attention, theory of mind, and social appraisals. A key hypothesis is that computational social understanding is enabled by an egocentric perspective, i.e., the capture of social interactions from the perspective of each social partner via head- and body-worn sensors. This perspective is well-aligned with existing commercial efforts in Augmented and Virtual Reality, and with the literature on child development. In this lecture, I will provide background on egocentric computer vision and summarize current and future progress toward the development of an egocentric social AI.