Physics-based models have been crucial for manipulation, enabling sim-to-real learning, model-predictive control, manipulation planning, and model-based design and verification. However, they typically require extensive manual effort and often fail to capture real-world complexity. Advances in generative modeling—particularly video models—offer a data-driven alternative but struggle with physical plausibility, consistency, and action conditioning. A promising direction is to integrate structured priors with scalable data-driven methods to improve dynamics prediction and generalization across diverse scenarios.
This workshop will explore timely key topics, including state-action representations, supervision sources, generalizable inductive biases, the role of (generative) simulation and video models, and trade-offs in downstream planning, control, policy learning, and evaluation.
We will bring together researchers from robotics, computer vision, and cognitive science. The workshop targets audiences in manipulation, world modeling, reinforcement learning, and sim-to-real learning. Posters, panels, and live polls will foster debate and cross-disciplinary dialogue, enabling attendees to contribute actively to the discussion.
We cordially invite paper submissions relevant to these and related (non-exhaustive) topics.
| Time | Session |
|---|---|
| 8:00 - 8:15 | Opening Remarks |
| 8:15 - 8:30 | Poster Teasers |
| 8:30 - 8:50 | Speaker 1 |
| 8:50 - 9:10 | Speaker 2 |
| 9:10 - 9:30 | Speaker 3 |
| 9:30 - 9:50 | Speaker 4 |
| 9:55 - 10:35 | Coffee Break & Poster Sessions |
| 10:35 - 10:55 | Speaker 5 |
| 10:55 - 11:15 | Speaker 6 |
| 11:15 - 11:35 | Speaker 7 |
| 11:35 - 11:45 | Short Break |
| 11:45 - 12:30 | Panel Discussion & Debate |