Yale University · Ph.D., Computer Science

Researching 3D computer vision and multimodal AI systems at the Yale Vision Lab.

Caltech · B.S., Computer Science; minor in Applied Mathematics

Awarded in recognition of the Best Academic Record in Computer Science.


Google · Research Intern

Building real-time streaming 3D reconstruction and end-to-end transformer-based SLAM, integrating open-vocabulary semantic distillation from Gemini to power robotic manipulation, navigation, and AR/VR applications.

NVIDIA Research · Research Scientist Intern

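Led the creation of Spatial-IQ, a novel hierarchical framework that decomposes spatial reasoning in multimodal LLMs. Post-training on Spatial-IQ with chained supervised fine-tuning on chain-of-thought data (SFT-CoT) followed by reinforcement learning with verifiable rewards (RLVR) improves the models' spatial reasoning.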

Meta Reality Labs · Research Scientist Intern

Led the creation of SHOW3D, the first hand-object interaction dataset captured in the wild. It comprises 4.6 million frames of egocentric data and is open-sourced for the community.

What I Research and Why

My work centers on building embodied AI agents capable of adaptive, efficient, and robust physical perception. It bridges the gap between digital reasoning and the physical world by integrating multimodal capabilities across vision, language, and range sensing.

What coding agents did for white-collar workflows, physical AI will do for all of humankind.

Recent Publications

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild
Patrick Rim, Kevin Harris, Braden Copple, Shangchen Han, Xu Xie, Ivan Shugurov, Sizhe An, He Wen, Alex Wong, Tomas Hodan, Kun He
CVPR 2026

Radar-Guided Polynomial Fitting for Metric Depth Estimation
Patrick Rim, Hyoungseob Park, Vadim Ezhov, Jeffrey Moon, Alex Wong
CVPR 2026

Iris: Integrating Language into Diffusion-based Monocular Depth Estimation
Ziyao Zeng, Jingcheng Ni, Daniel Wang, Patrick Rim, Younjoon Chung, Fengyu Yang, Byung-Woo Hong, Alex Wong
CVPR 2026

ODE-GS: Latent ODEs for Dynamic Scene Extrapolation with 3D Gaussian Splatting
Daniel Wang, Patrick Rim, Tian Tian, Alex Wong, Ganesh Sundaramoorthi
ICLR 2026

Unsupervised Depth Completion via Occluded Region Completion as Supervision
Hyoungseob Park, Runjian Chen, Patrick Rim, Dong Lao, Alex Wong
ICLR 2026

ProtoDepth: Unsupervised Continual Depth Completion with Prototypes
Patrick Rim, Hyoungseob Park, S. Gangopadhyay, Ziyao Zeng, Younjoon Chung, Alex Wong
CVPR 2025

ETA: Energy-based Test-time Adaptation for Depth Completion
Younjoon Chung*, Hyoungseob Park*, Patrick Rim*, Xiaoran Zhang, Jihe He, Ziyao Zeng, Safa Cicek, Byung-Woo Hong, James S. Duncan, Alex Wong
ICCV 2025

Extending Foundational Monocular Depth Estimators to Fisheye Cameras with Calibration Tokens
S. Gangopadhyay*, Jung-Hee Kim*, Xien Chen*, Patrick Rim, Hyoungseob Park, Alex Wong
ICCV 2025

SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
Yichen Xie, Chenfeng Xu, MJ Rakotosaona, Patrick Rim, Federico Tombari, Kurt Keutzer, Masayoshi Tomizuka, Wei Zhan
ICCV 2023

Quadric Representations for LiDAR Odometry, Mapping and Localization
Chao Xia*, Chenfeng Xu*, Patrick Rim, Mingyu Ding, Nanning Zheng, Kurt Keutzer, Masayoshi Tomizuka, Wei Zhan
RA-L 2023

* denotes equal contribution