Computer Vision for Spatial and Physical Intelligence

LECTURES

Speakers	Syllabus	Titles & Abstracts
Daniel Cremers Technical University of Munich, DE	3D Computer Vision	3D Computer Vision in the Age of Deep Learning
Dima Damen University of Bristol, UK	Video Understanding, Multi-modal Learning, Egocentric Videos, Hand-Object Interactions	Video Understanding Out of the Frame - an Egocentric Perspective
Andrew Davison Imperial College, UK	Representations for real-time reconstruction and mapping; new computing architectures and sensors; distributed computation	From SLAM to Spatial AI
Christoph Feichtenhofer FAIR, Meta, USA	Visual recognition, instance segmentation, vision-language models, video understanding	Demystifying the impact of data for image and video understanding
Vittorio Ferrari Synthesia, UK	generative models, diffusion, transformers, controllable generation	Generating images and videos with diffusion models
Leonidas Guibas Stanford University, USA	Foundation Models, Scene Understanding	Foundation Models for 3D / 4D Scene Understanding and Content Creation
Ranjay Krishna University of Washington, USA	coming soon	coming soon
Ishan Misra GenAI, Meta, USA	video generation, flow matching, diffusion, foundation models, multimodal learning	Foundation models for video generation, editing, and personalization
Matthias Niessner Technical University of Munich, DE	AI Avatars	Photo-realistic AI Avatars
Gerard Pons-Moll University of Tübingen, DE	3D humans, neural implicit fields, 3D Gaussian Splats, generative models, embodiment, LLMs, multi-view diffusion, human-object interaction, humans and 3D scenes	Real Virtual Humans: The Path from Statistical Models to Neural Avatars that Act and Behave
Fatih Porikli Australian National University, Qualcomm, AUS & USA	coming soon	coming soon
Stefano Soatto Amazon and University of California Los Angeles, USA	Reasoning, LLM, World Models	Emergence of Reasoning in Language and World Models
Gul Varol École des Ponts ParisTech, FR	humans, generative models, 3D and language, human motion, hands, sign language	Dynamic Humans: Generating 3D Human Motion with Language
Andrea Vedaldi University of Oxford, UK	Spatial Intelligence, Visual Geometry, 3D Generative AI	Spatial Intelligence: The New Frontier of Computer Vision

READING GROUP

Speakers	Syllabus	Rules of Engagement
Stefano Soatto Amazon and University of California Los Angeles, USA	Reading Group: Meeting with Mentors	Reading Group: Meeting with Mentors

ESSAY COMPETITION (WITH PRIZE!)

Speakers	Syllabus	Rules of Engagement
Fabio Galasso Sapienza University of Rome, Italy	Essay Competition	Essay Competition

INDUSTRY MEETS STUDENTS

Industrial Panel

More industry participants will be announced soon!
