Sergio Casas

Sr Staff Researcher and Tech Lead Manager

Waymo

About me

I am a researcher and engineering leader driven by the mission of bringing AI from the lab into the real world. I am currently a Sr Staff Researcher and Tech Lead Manager at Waymo, developing Foundation Models for Self-Driving Cars. My work builds on a deep background in autonomous technology, including my time leading the Perception and Behavior Reasoning team at Waabi, my role as a Research Scientist at Uber ATG, and my PhD in end-to-end autonomy from the University of Toronto, where I was advised by Raquel Urtasun.

Interests

  • Artificial Intelligence
  • Machine Learning
  • Computer Vision
  • Generative Models
  • Autonomous Driving
  • Robotics

Education

  • PhD in Computer Science, 2020 - 2024

    University of Toronto

  • MSc in Computer Science, 2018 - 2020

    University of Toronto

  • BSc in Computer Science, 2013 - 2017

    Universitat Politècnica de Catalunya

  • BSc in Industrial Tech. Engineering, 2012 - 2017

    Universitat Politècnica de Catalunya

Selected Publications

For a complete and up-to-date list of publications, visit my Google Scholar.

(* denotes equal contribution)

Scaling Laws of Motion Forecasting and Planning

Technical Report 2025
Studying how motion forecasting and planning models scale with compute and data, in both open-loop and closed-loop settings.

MAD: Memory-Augmented Detection of 3D Objects

CVPR 2025
Pushing the boundaries of memory-based perception.

DIO: Decomposable Implicit 4D Occupancy-Flow World Model

CVPR 2025
An object-centric occupancy foundation model.

DeTra: A Unified Model for Object Detection and Trajectory Forecasting

ECCV 2024
Unified object detection and trajectory prediction as trajectory refinement.

UnO: Unsupervised Occupancy Fields for Perception and Forecasting

CVPR 2024 (Oral)
An unsupervised occupancy foundation model.

Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion

ICLR 2024
A LiDAR world model learned via discrete diffusion.

ImplicitO: Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving

CVPR 2023 (Highlight)
An efficient implicit occupancy perception and forecasting model.

MP3: A Unified Model to Map, Perceive, Predict and Plan

CVPR 2021 (Best Paper Candidate, Oral)
Interpretable end-to-end neural motion planning without high-definition maps.

TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors

CVPR 2021
Realistic long-term vehicle behavior simulation learned from imitation and common sense.

Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

ECCV 2020
ILVM characterizes the joint distribution over multiple actors' future trajectories.

End-to-end Interpretable Neural Motion Planner

CVPR 2019 (Oral)
A neural motion planner from LiDAR and HD maps.

IntentNet: Learning to Predict Intention from Raw Sensor Data

CoRL 2018 (Spotlight)
Joint perception and prediction from LiDAR point clouds and HD maps.