Robotics · Vision · Learning
Ran Cheng
Foundation models for robots · VLA · embodied AI · robot learning
I lead robotics foundation-model research at the intersection of Vision-Language-Action (VLA), embodied AI, and robot learning — and write deep, interactive explainers on how these systems work.
About
At Ant Group, I lead foundational VLA research for robotics across pretraining, post-training, reinforcement learning, and universal reward modeling. Previously, I led an R&D department at Midea and scaled robotics products to over one million production units; before that, at Huawei Noah's Ark Lab, I built scene reconstruction and world-model systems for autonomous driving. I hold an M.Sc. from McGill University's Centre for Intelligent Machines (advised by Gregory Dudek and David Meger) and a B.S. from Tongji University.
Writing
all posts →
- 2026-06-10 Q-Guided Flow, From the Ground Up: Guiding a Flow Policy with a Value Function at Test Time
- 2026-04-02 Modern Hopfield Networks, Geometrically: From Wide Memory Basins to Attention
- 2026-02-08 Resonant Manifold Network - A Physics-Inspired Approach to Continual Learning
- 2026-02-05 Generative Modeling via Drifting: One-Step Generation Through Training-Time Evolution
- 2026-02-05 Understanding Forward and Reverse KL Divergence
Research
publications →
Selected papers in LiDAR perception, semantic scene completion, visual odometry, and RL for autonomous driving; see the publications page.