Sergio Casas

Sr Staff TLM @ Waymo Research

Ph.D. @ UofT

About me

I am a Sr Staff Tech Lead Manager at Waymo Research working on Behavior Modeling for Self-Driving Cars. Previously, I led the Perception and Behavior Reasoning team at Waabi, and before that I was a Research Scientist at Uber ATG. I completed my PhD at the University of Toronto CS department, supervised by Professor Raquel Urtasun (thesis here). I am excited about bringing AI to the physical world.

Interests

  • Artificial Intelligence
  • Machine Learning
  • Computer Vision
  • Robotics - Autonomous Driving
  • Generative Models
  • Imitation Learning

Education

  • PhD in Computer Science, 2020 - 2024

    University of Toronto

  • MSc in Computer Science, 2018 - 2020

    University of Toronto

  • BSc in Computer Science, 2013 - 2017

    Universitat Politècnica de Catalunya

  • BSc in Industrial Tech. Engineering, 2012 - 2017

    Universitat Politècnica de Catalunya

Selected Publications

For a complete and up-to-date list of publications, visit my Google Scholar.

(* denotes equal contribution)

DeTra: A Unified Model for Object Detection and Trajectory Forecasting

ECCV 2024
Unified object detection and trajectory prediction as trajectory refinement.

UnO: Unsupervised Occupancy Fields for Perception and Forecasting

CVPR 2024 (Oral)
Occupancy Foundation Model (Unsupervised)

QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving

ICRA 2024
End-to-end autonomy leveraging implicit occupancy.

Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion

ICLR 2024
LiDAR World Model.

ImplicitO: Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving

CVPR 2023 (Highlight)
Efficient implicit occupancy perception and forecasting model.

MP3: A Unified Model to Map, Perceive, Predict and Plan

CVPR 2021 (Best Paper Candidate, Oral)
Interpretable end-to-end neural motion planning without high-definition maps

LookOut: Diverse Multi-Future Prediction and Planning for Self-Driving

ICCV 2021
Contingency planning from diverse joint trajectory samples for all actors in the scene

TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors

CVPR 2021
Realistic long-term vehicle behavior simulation learned from imitation and common sense

Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

ECCV 2020
ILVM characterizes the joint distribution over multiple actors’ future trajectories

End-to-end Interpretable Neural Motion Planner

CVPR 2019 (Oral)
Neural motion planner from LiDAR and HD maps

IntentNet: Learning to Predict Intention from Raw Sensor Data

CoRL 2018 (Spotlight)
Joint perception and prediction from LiDAR point clouds and HD maps