Hyperbolic Information Geometry

  • Published: Oct 22, 2024

Comments • 17

  • @carterwoodson8818 • 9 months ago • +2

    @0:14 Clayton is a beast, one of the best profs I've had

    • @GabeKhan • 9 months ago • +1

      If you are interested in his research, there's a really cool video where he discusses random polygons. ruclips.net/video/wcHHRwAfwAo/видео.html&ab_channel=SciState

  • @jakubbartczuk3956 • 1 year ago • +6

    Extremely useful introduction. There is no shortage of written sources, but they can't compete with your visualizations.

    • @GabeKhan • 1 year ago

      Thanks for the kind remark. Glad you enjoyed it!

  • @matthewbroerman1098 • 11 months ago • +2

    Very helpful! Can you help me understand the quantity g? You say it is matrix-valued, and I think it has the dimension of the parameters (i.e., 2x2, and the slide says "in terms of (mu, sigma) parameters"), but the quick derivation looks like the known-sigma case, so both partials are with respect to mu. In particular, where does (dmu^2 + 2dsigma^2) come from?

    • @matthewbroerman1098 • 11 months ago

      Ah, the (sigma, sigma) component is at 16:07

    • @GabeKhan • 11 months ago

      Thanks! g is a Riemannian metric, which is a generalization of the dot product in Euclidean space. More formally, it is a smoothly varying inner product on the tangent space of a manifold. To calculate it, you specify the values of mu and sigma, and then figure out what the inner product of two vectors is using that formula. I calculated the (sigma, sigma) component at the end of the video, but to obtain the full metric, you would also need to compute the (mu, sigma) component (which is equal to the (sigma, mu) component by symmetry). However, when you work it out, these components turn out to be zero. Does that answer your question?
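
The components described in this reply can be checked symbolically. A minimal SymPy sketch (not from the video; the family N(mu, sigma^2) and all names here are illustrative):

```python
import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)

# Log-density of the normal family N(mu, sigma^2)
logp = -sp.log(sigma) - (x - mu)**2 / (2 * sigma**2) - sp.log(2 * sp.pi) / 2

def fisher(a, b):
    """Fisher metric component g_ab = E[(d_a log p)(d_b log p)]."""
    integrand = sp.diff(logp, a) * sp.diff(logp, b) * sp.exp(logp)
    return sp.simplify(sp.integrate(integrand, (x, -sp.oo, sp.oo)))

g_mumu = fisher(mu, mu)          # simplifies to 1/sigma**2  -> the dmu^2 term
g_sigsig = fisher(sigma, sigma)  # simplifies to 2/sigma**2  -> the 2*dsigma^2 term
g_musig = fisher(mu, sigma)      # simplifies to 0           -> off-diagonal vanishes
```

This reproduces the line element (dmu^2 + 2 dsigma^2)/sigma^2 asked about in the question, including the vanishing (mu, sigma) component.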

  • @HelloWorlds__JTS • 8 months ago • +1

    Thanks for posting. At 5:45 (and in general for Fisher information), what is required for a parameterized family to be "reasonable"? It seems one obvious assumption is that it must be unimodal, because otherwise Fisher information doesn't make sense [to me], and the Fisher metric is no longer PD or PSD.

    • @GabeKhan • 8 months ago

      Thanks for the question. All that is needed for the Fisher Information to define a Riemannian metric is for the family to be smooth and for the integrals involved to converge. So you don't need to assume unimodality or anything like that, simply that the family isn't extremely pathological.
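
The point that unimodality is not required can be checked numerically. A sketch (not from the thread; the bimodal mixture family and grid are illustrative assumptions), showing the Fisher information of a two-bump family is finite and strictly positive:

```python
import numpy as np

# One-parameter bimodal family: p(x; t) = t*N(-2, 1) + (1 - t)*N(2, 1)
x = np.linspace(-12.0, 12.0, 40001)
dx = x[1] - x[0]

def gauss(x, m):
    return np.exp(-0.5 * (x - m)**2) / np.sqrt(2.0 * np.pi)

def fisher_info(t):
    """Scalar Fisher information I(t) = integral of (d_t p)^2 / p over x."""
    p = t * gauss(x, -2.0) + (1.0 - t) * gauss(x, 2.0)
    dp_dt = gauss(x, -2.0) - gauss(x, 2.0)  # exact derivative of p in t
    return float(np.sum(dp_dt**2 / p) * dx)

# Finite and positive even though p(x; 0.5) is bimodal
info = fisher_info(0.5)
```

The integrals converge because each tail of the mixture is dominated by one Gaussian component, so nothing pathological happens despite the two modes.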

    • @HelloWorlds__JTS • 8 months ago

      @@GabeKhan Thanks for the reply; as I read it, I realized the underlying reason I had mentioned the "obvious assumption" of unimodality (please correct me if I'm wrong):
      I was imagining the [lack of] utility offered by a measure of information when seeking the optimal parameters of some distribution ("parameter estimation") if the optimal parameters themselves are needed in order to define the measure. I think that as long as either the distribution is unimodal, or the optimal parameters are already known (or a "provably" close guess is in hand), Fisher information is useful. Otherwise it seems useless [in parameter estimation] for multimodal distributions.

  • @markneumann381 • 7 months ago

    Great job. Thank you. Enjoyed this very much.

    • @GabeKhan • 6 months ago

      Glad you enjoyed it!

    • @markneumann381 • 6 months ago

      You're very welcome. You do outstanding work.

  • @GowthamaVenkata • 4 months ago

    Thanks for the video! In the 2nd counterexample @3:02 you said the Fisher-Rao (FR) metric will always induce the same geometry for location-scale families (LSFs), but the FR metric is not the same for different LSFs, right? For example, the FR metric of Gaussian(mu, b^2) is diag(1/b^2, 2/b^2), whereas Laplace(mu, b) is diag(1/b^2, 1/b^2), isn't it?

    • @GabeKhan • 4 months ago

      Your expressions are correct, but both of those metrics *are* the same from a geometric perspective. The reason for this is that it is possible to go from one metric to the other by changing the parameters (i.e., rescaling b). In general, we say that two metrics are the same if there is a change of coordinates which transforms one metric to another. The reason for this is that the geometric quantities that we care about (such as geodesics or curvature) are independent of which coordinates we choose. Does that make sense?
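
This equivalence can be illustrated with a curvature computation (a sketch, not from the thread). Both Fisher metrics quoted in the question are diagonal, so the Brioschi formula specialized to orthogonal coordinates applies; both come out with constant negative curvature, so each family's geometry is a hyperbolic plane up to an overall rescaling of the parameters:

```python
import sympy as sp

u = sp.symbols('u', real=True)       # location parameter (mu)
v = sp.symbols('v', positive=True)   # scale parameter (sigma or b)

def curvature(E, G):
    """Gaussian curvature of the orthogonal metric E*du^2 + G*dv^2
    (Brioschi formula specialized to orthogonal coordinates)."""
    W = sp.sqrt(E * G)
    K = -(sp.diff(sp.diff(G, u) / W, u) + sp.diff(sp.diff(E, v) / W, v)) / (2 * W)
    return sp.simplify(K)

K_gauss = curvature(1 / v**2, 2 / v**2)    # Gaussian(mu, v^2): constant K = -1/2
K_laplace = curvature(1 / v**2, 1 / v**2)  # Laplace(mu, v):    constant K = -1
```

Both curvatures are constant and negative; multiplying a metric by a constant rescales its curvature, so the factor-of-2 difference is exactly the overall rescaling absorbed by the change of parameters described in the reply.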

    • @GowthamaVenkata • 4 months ago

      @@GabeKhan I am thinking in terms of distances. Say we have 3 points in the PARAMETER space, (1,2), (2,3), and (3,5), and given the distance between each pair, our task is to decide between the 2 metrics, say Gaussian and Laplace. Then only one distribution would correspond to this set of points and given distances, right?

    • @GabeKhan • 4 months ago

      Given pairs of points in a parameter space, to compute the distance between them you should use the Fisher metric for the parametrized family of which they are elements. However, this is a separate question from whether it is possible to find a reparametrization of the original statistical model so that the Fisher metrics coincide.
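
For the distance comparison raised above, both metrics are rescaled Poincaré half-plane metrics, so closed-form distances exist. A sketch (not from the thread; the coordinate identification for the Gaussian case is an assumption spelled out in the docstring):

```python
from math import acosh, sqrt

def halfplane_dist(p, q):
    """Distance in the upper half-plane with metric (dx^2 + dy^2)/y^2."""
    (x1, y1), (x2, y2) = p, q
    return acosh(1 + ((x2 - x1)**2 + (y2 - y1)**2) / (2 * y1 * y2))

def laplace_fr(p, q):
    """Fisher-Rao distance for Laplace(mu, b), metric (dmu^2 + db^2)/b^2."""
    return halfplane_dist(p, q)

def gauss_fr(p, q):
    """Fisher-Rao distance for N(mu, sigma^2), metric (dmu^2 + 2 dsigma^2)/sigma^2.
    Substituting x = mu/sqrt(2) turns it into sqrt(2) times the half-plane metric."""
    (m1, s1), (m2, s2) = p, q
    return sqrt(2) * halfplane_dist((m1 / sqrt(2), s1), (m2 / sqrt(2), s2))

# The three parameter points from the comment above
pts = [(1, 2), (2, 3), (3, 5)]
pairs = [(pts[0], pts[1]), (pts[0], pts[2]), (pts[1], pts[2])]
gauss_d = [gauss_fr(p, q) for p, q in pairs]
lap_d = [laplace_fr(p, q) for p, q in pairs]
```

The two metrics assign different pairwise distances to the same parameter points, which illustrates the point in the reply: the distances depend on which family the points are taken to parametrize.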