I'm very amazed by these courses thanks a lot professor you did a really great job in these videos Salutations
this is now my favorite area of math
Really good lectures! I have a question @57:00: how can we extract a convergent subsequence \pi_{n_k} from \pi_n? It seems that you use the compactness of X and Y, and some properties of probability measures. Thanks!
Yes, the space of probability measures (with the topology of weak convergence) on a compact metric space will be compact. You may also want to check out Prokhorov's Theorem.
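A quick numerical sketch of what this compactness buys (my own illustration, not from the lecture, using Dirac measures \pi_n = \delta_{x_n} on [0,1] as an assumed example):

```python
import numpy as np

# Probability measures pi_n = delta_{x_n} supported in the compact set
# [0, 1].  Compactness guarantees the atoms x_n have a convergent
# subsequence x_{n_k} -> x*, and then pi_{n_k} = delta_{x_{n_k}}
# converges weakly to delta_{x*}: for any continuous h,
#   int h d(pi_{n_k}) = h(x_{n_k}) -> h(x*).
n = np.arange(1, 2001)
x = (np.sin(n) + 1) / 2                 # atoms x_n, all inside [0, 1]

# Bolzano-Weierstrass in practice: keep the indices whose atoms land
# near an accumulation point x* (here approximated by the first atom).
x_star = x[0]
mask = np.abs(x - x_star) < 1e-2        # selects a subsequence n_k
subseq_atoms = x[mask]

# Weak-convergence check against the continuous test function h(t) = t^2.
h = lambda t: t**2
max_gap = np.abs(h(subseq_atoms) - h(x_star)).max()
print(len(subseq_atoms), max_gap)       # many indices, gap near 0
```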
@@brittanyhamfeldt I see. Really thank you!
I think we could simplify the proof of the existence of a minimiser for the Kantorovich problem a little. Note that the Kantorovich functional is a convex functional of the joint measure on a compact convex set, and every convex function on a compact convex set is Lipschitz continuous. Then we are done.
Yes, I believe that would work just fine too!
Thank you for sharing, what a nice lecture!
Hi Brittany, thank you for your excellent video lectures. One question: at @19:24, you said "from the measure conservation property, we get \int_X T^2(x) dx = \int_X \tilde{T}^2(x) dx". I am still not quite clear on this. Is there a more concrete example?
This comes from the characterisation of a measure-preserving map T as one that satisfies
\int_X h(T(x)) f(x) dx = \int_Y h(y) g(y) dy for all continuous h.
We can apply this with h(y) = y^2 to get
\int_X T(x)^2 f(x) dx = \int_Y y^2 g(y) dy = \int_X \tilde{T}(x)^2 f(x) dx
since both T and \tilde{T} are measure preserving.
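Since a more concrete example was requested, here is a small numerical check (my own example, taking f and g both to be the uniform density on [0,1], so that T(x) = x and \tilde{T}(x) = 1 - x are both measure preserving):

```python
import numpy as np

# Both T(x) = x and tilde_T(x) = 1 - x push the uniform density on
# [0, 1] forward to itself, so the identity with h(y) = y^2 forces
#   int_0^1 T(x)^2 dx = int_0^1 tilde_T(x)^2 dx = int_0^1 y^2 dy = 1/3.
x = np.random.default_rng(0).uniform(0, 1, 10**6)   # samples from f

T = x                       # identity map: pushes uniform to uniform
tilde_T = 1 - x             # reflection: also pushes uniform to uniform

lhs = np.mean(T**2)         # Monte Carlo estimate of ∫ T(x)^2 f(x) dx
rhs = np.mean(tilde_T**2)   # Monte Carlo estimate of ∫ tilde_T(x)^2 f(x) dx
print(lhs, rhs)             # both ≈ 1/3 even though T ≠ tilde_T
```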
@@brittanyhamfeldt I get it. Thanks again.
Thanks for such an amazing course! I have a question: is cyclical monotonicity a sufficient condition for an optimal mapping? If so, why?
Hello Brittany, thank you for your great lecture. I know that sequential compactness is equivalent to compactness for metric spaces. Which metric are you using when proving the compactness of the space of joint probability measures?
I didn't actually have a specific metric in mind here. But it's good enough to know that the space I'm working with is metrizable. Here I'm working with the space of probability measures with the topology of weak convergence (that's how I defined convergence when I checked for sequential compactness), and this is indeed metrizable.
Hi, thank you so much for the lectures! I have a question: why is it true that \pi shows up in a linear way in the Kantorovich problem? Thank you so much!
This formulation requires computing the cost c(x,y) between ALL possible pairs of points x and y. There are no unknowns in this step. Then the unknown pi(x,y) (the transport plan) essentially multiplies this and determines how much weight is assigned to each pair (i.e. how much mass is moved from x to y).
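A tiny discrete sketch of this linearity (my own example; the points, cost, and plans below are all made up for illustration):

```python
import numpy as np

# With the points x_i, y_j fixed, the cost matrix C[i, j] = c(x_i, y_j)
# is a known constant.  The Kantorovich objective sum_ij C[i,j]*pi[i,j]
# is then linear in the unknown plan pi.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
C = (x[:, None] - y[None, :])**2      # quadratic cost, computed once

def kantorovich_cost(pi):
    return np.sum(C * pi)             # <C, pi>: linear in the plan pi

# Two feasible transport plans (rows/columns sum to the same marginals).
pi1 = np.array([[0.2, 0.1], [0.1, 0.2], [0.2, 0.2]])
pi2 = np.array([[0.3, 0.0], [0.0, 0.3], [0.1, 0.3]])

# Linearity check: cost(a*pi1 + b*pi2) = a*cost(pi1) + b*cost(pi2).
a, b = 0.4, 0.6
lhs = kantorovich_cost(a * pi1 + b * pi2)
rhs = a * kantorovich_cost(pi1) + b * kantorovich_cost(pi2)
print(abs(lhs - rhs))   # ≈ 0 up to floating-point error
```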
Hello professor, thanks for the great video! I have a question regarding cyclical monotonicity. Is cyclical monotonicity closed under functional composition (supposing all involved measures are absolutely continuous)? This is true in R^1, since the composition of two non-decreasing maps is still non-decreasing. I am wondering whether we have a similar property for cyclical monotonicity in general R^d. Thanks!
Great question! This actually does not work in R^d.
For example, let's consider 2 linear maps T(x) = Ax, S(y) = By with A and B both symmetric positive definite. These are cyclically monotone since they can be written as the gradient of a convex function (1/2 * x^TAx and 1/2 * y^TBy respectively).
Consider the composition of the maps:
R(x) = S(T(x)) = BAx.
In general, the matrix product BA need not be symmetric and the map is not the gradient of any function (much less a convex one), thus not cyclically monotone.
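This counterexample is easy to check numerically (the specific matrices A and B below are my own choice):

```python
import numpy as np

# A and B are symmetric positive definite, so T(x) = Ax and S(y) = By
# are gradients of the convex quadratics (1/2) x^T A x and (1/2) y^T B y,
# and hence cyclically monotone.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
print(np.linalg.eigvalsh(A), np.linalg.eigvalsh(B))  # all positive: SPD

# The composition R(x) = S(T(x)) = (BA) x.  Here BA is not symmetric,
# so R is not the gradient of any function (much less a convex one),
# and therefore not cyclically monotone.
BA = B @ A
print(BA)
print(np.allclose(BA, BA.T))   # False: BA fails to be symmetric
```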
@@brittanyhamfeldt Thanks for the clarification, it makes sense! I have a follow-up question. In that case, is the composition of two optimal transport maps still optimal (assuming all involved measures are absolutely continuous, supported on compact sets, with quadratic cost in the Monge formulation)? I ask because of the equivalence between cyclical monotonicity and optimality of the transport map, cf. the Brenier-McCann theorem.
@@Ethan-lz5rw Indeed, a consequence of this is that the composition of two optimal maps is NOT optimal in general.
@@brittanyhamfeldt Thank you!
Thanks for the course. I don't get why you work with X and Y compact and define weak convergence with respect to continuous functions. Why don't we immediately work on all of R^n, with weak convergence defined with respect to bounded continuous functions? It seems to me that with a truncation argument we could still show that the Kantorovich problem admits a minimum in this more general setting. Thanks again!
Indeed for the Kantorovich problem you can consider more general X and Y (including R^n). Arguing that \Pi(\mu, \nu) is compact is slightly more subtle, via Prokhorov's Theorem, but certainly doable. Trying to move from here to a transport map (i.e. the Monge problem) does require stronger assumptions on X and Y.
@@brittanyhamfeldt OK, thanks a lot for your answer.
Hidden treasure!!