Skip to content

Transport_intro

Bruno Levy edited this page Jul 26, 2022 · 2 revisions

Introduction

Optimal Transport is a mathematical theory that studies ways of deforming shape A (here an "armadillo") into shape B (here a sphere) while minimizing some definition of "effort".

It has some close connections with the "principle of least action" in physics. This principle says that "electricity takes the shortest path", and also much more general things. It resembles "minimizing the effort" in Optimal Transport.

In some cases, Optimal Transport and the principle of least action are equivalent ! In the beginning of the 2000's, astrophysicists and mathematicians used it to "turn back time in the Universe" (see this article published in Nature).

At that time, their methods worked like that: Suppose you have a map of the Universe, with the positions of all the galaxies. What you want to do is "going back in time", and find the trajectories of all the galaxies, from the origin to now.

It is supposed that at earlier times, the Universe was much densier, hotter and homogeneous. So we are going to suppose (for now) that all galaxies were intially arranged regularly, for instance on a regular grid, like in the image above.

Now the question is knowing which galaxy went where, or which "point on the grid" (blue points) corresponds to which "data point" (red point), how can we do that ?

So we want to know which blue point goes with which red point. What is a physical way of knowing that ? By minimizing a physical quantity, the integrated kinetic energy, also called the action.

And through a nice theorem due to Jean-David Benamoun and Yann Brenier, minimizing the action is equivalent to minimizing the sum of the squared lengths of all black segments.

One possible way of doing that is testing all configurations. There are n! of them, not feasible in practice. Fortunately, there is an algorithm that works in n^2log(n) Bertsekas's auctions. It is what was used in the 2000's.

Then you can reconstruct the following motion for your galaxies. Nice ! But the initial location of the galaxies on a grid is not super realistic... How can we gain more precision ?

We could use 4 points at the initial condition for 1 data point, then we will model more accurately the small patch of matter that collapsed to form each galaxy.

Why stopping there ? Let us use 16 points per data point for more precision !

Even more precise ! But n^2 log(n) starts to weight something (it will be even worse in 3D). But take a close look at it, don't you see a structure ?

  • To have more precision, what if we used an infinite number of points in the initial condition ?
  • But wait a minute, won't it cost an infinite amount of time ?

In fact, the structure that you have seen more and more clearly is a well-defined geometric object, called a Laguerre diagram (white lines). And with GEOGRAM, it is possible to compute this geometric object exactly (see GEOGRAM tutorials here and here). Not only it is more precise, but also it is much much faster !

  • A Laguerre diagram is a bit like a Voronoi diagram
  • OK but what's a Voronoi diagram ?
  • Well, let's say it is what you obtain in a Petri dish when bacterium colonies grow from some locates (and stop growing as soon as they meet a bacterium from another species).

A Voronoi diagram can also be defined as what one sees when looking at a family of cones from below.

You'll see the same diagram if you replace "distance" with "squared distance" in the formula of the cones (because "square" is a monotonous function).

Now let us see what happens when we shift one of the paraboloids along the Z axis...

Now if you look at that from above, and keep playing with the Z offset of the paraboloids, what you see is like a Voronoi diagram that has "tunable" cells (the "tuning buttons" are the Z offsets). It is called a Laguerre diagram.

With these "tuning buttons", you can change the size of the cells. But how can you tune them in such a way that all the cells have prescribed volumes ? The good news is that there is a theorem (Yann Brenier's polar factorization) that tells you that it is possible

... and this theorem can be turned into an efficient algorithm ! When you "translate" the math into physics, then you figure out that the "tuning buttons" associated with each cell correspond to the gravitational potential.

By playing with a similar problem (Lloyd relaxation), we could prove that the objective function to be optimized in order to find the value of the "tuning buttons" is very regular (it has continuous second order derivatives), hence a Newton-like algorithm is possible !

Jun Kitagawa, Quentin Merigot and Boris Thibert found a way of finding the descent parameter, with proven convergence.

Here is a 2D visualization. In 2D, one can visualize also the reconstructed gravitational potential (that corresponds to the Legendre transform of the "tuning buttons" attached to the points). The displacement corresponds to the gradient of this potential.

Next: How to program your own optimal transport

Clone this wiki locally