Physics-based models have been crucial for manipulation, enabling sim-to-real learning, model-predictive control, manipulation planning, and model-based design and verification. However, they typically require extensive manual effort and often fail to capture real-world complexity. Advances in generative modeling—particularly video models—offer a data-driven alternative but struggle with physical plausibility, consistency, and action conditioning. A promising direction is to integrate structured priors with scalable data-driven methods to improve dynamics prediction and generalization across diverse scenarios.
This workshop will explore timely key topics, including state-action representations, sources of supervision, generalizable inductive biases, the role of (generative) simulation and video models, and trade-offs in downstream planning, control, policy learning, and evaluation.
We will bring together researchers from robotics, machine learning, and computer vision, targeting audiences working on manipulation, world modeling, reinforcement learning, and sim-to-real learning. Posters, panels, and live polls will foster debate and dialogue across career stages, allowing attendees to actively contribute to the discussions.
We cordially invite paper submissions relevant to the following (non-exhaustive) topics:
| Time | Session |
| --- | --- |
| 8:00 - 8:05 | Opening Remarks |
| 8:05 - 8:25 | Paper Oral Presentations |
| 8:25 - 8:50 | Speaker 1 |
| 8:50 - 9:15 | Speaker 2 |
| 9:15 - 9:40 | Speaker 3 |
| 9:40 - 10:05 | Speaker 4 |
| 10:05 - 10:30 | Speaker 5 |
| 10:30 - 11:30 | Coffee Break & Poster Sessions |
| 11:30 - 11:55 | Speaker 6 |
| 11:55 - 12:25 | Panel Discussion & Debate |
| 12:25 - 12:30 | Awards & Closing Remarks |