Wasserstein distance W₁ = the cheapest way to shovel one pile of sand (μ, blue) into the other (ν, teal). The dual says: it also equals the best score a "gentle ruler" can give — any function g whose slope never exceeds 1 (Lipschitz), scored as Eμ[g]−Eν[g]. Drag the amber ruler up and down (its slope auto-clamps to ±1); the dual value can never beat the true W₁. Hit "snap to best ruler" and it touches it exactly. This is why a Lipschitz constant is a literal exchange rate between a distance in distribution space and a number you can read off.