Object Recognition and Reconstruction in the era of LLMs
Georgia Gkioxari
California Institute of Technology, USA
Abstract
In this talk, I will cover modern developments in visual perception in 2D and 3D. I will discuss the current state of methods for recognizing and localizing objects in images and perceiving them in 3D space, including predicting their size, pose, and distance from the camera. I will then cover how to learn general representations that reconstruct their geometry, all from a single image. Over the 2 hours of my lecture, I hope that students gain a comprehensive understanding of how to design modern object recognition systems, leveraging ideas from large language models and guiding image-centric representations for the task of 2D and 3D recognition.