There is a real map from (data, architecture, recipe) to (real-robot performance), but no closed-form formula for evaluating it. The analytic shortcut (top) hits a wall: no equation tells you, in advance, how good a policy will be. The only reliable evaluator (bottom) is to actually train the model and run it on the real robot. That isn't a failure of cleverness; it's an intrinsic property of the map. Admitting it is step one.