Explaining the World to Machines That Move

As autonomous driving shifts from modular pipelines to end-to-end learning, a paradox emerges: models that learn more from data need more human guidance, not less. This talk draws on learning theory — the No Free Lunch theorem, the gap between correlation and causation, and the role of inductive bias in sample efficiency — to explain why. We examine what “end-to-end” actually means in practice (gradients flowing through everything, not the absence of structure), why training collapses without human-injected inductive bias, and how the rise of vision-language-action models is transforming annotation from drawing bounding boxes to explaining causality. The better these systems get, the more they need us to explain the world to them.