Latent Space Visualisation: PCA, t-SNE, UMAP | Deep Learning Animated

  • Published: 23 Oct 2024

Comments • 132

  • @thorvaldspear
    @thorvaldspear 2 months ago +64

    Wow that mammoth 2D visualization using UMAP looked like it was opened up and flattened, you could tell it was a living thing of some sort. Incredible!

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +7

      Thank you ! Indeed it looks like a fossil in the ground :)

    • @charlescoult
      @charlescoult 2 months ago +3

      Super cool

  • @Ouuiea
    @Ouuiea 2 months ago +15

    My only comment on the video is that PCA's real advantage is not speed, it's interpretability. It's easy to read a principal component in terms of how it correlates with the original variables, something you cannot do with t-SNE or UMAP. The video is an excellent work!
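
To illustrate the interpretability point: in scikit-learn, each row of `pca.components_` holds the loadings of one principal component on the original variables, so you can read off what the component "means". A minimal sketch with made-up variable names:

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: 200 samples, 3 variables (hypothetical names for illustration).
rng = np.random.default_rng(0)
height = rng.normal(170, 10, 200)
weight = 0.9 * height + rng.normal(0, 5, 200)   # correlated with height
shoe = rng.normal(42, 2, 200)                   # independent
X = np.column_stack([height, weight, shoe])

pca = PCA(n_components=2).fit(X)

# Each row of components_ gives the loadings of one principal component
# on the original variables -- this is what makes PCA interpretable.
for i, comp in enumerate(pca.components_):
    print(f"PC{i + 1}:", dict(zip(["height", "weight", "shoe"], comp.round(2))))
```

Here the first component loads mostly on the correlated height/weight pair, which is exactly the kind of reading you cannot do with a t-SNE or UMAP embedding.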

  • @SudhanvaDixit
    @SudhanvaDixit 8 days ago

    Prediction - A channel that's going to explode.
    Watched multiple videos. Very crisp clear explanation with good animation. Thank you :)

  • @robotics_hub
    @robotics_hub 2 months ago +25

    Amazing visualization for a very difficult topic to grasp. Many thanks!

  • @stringtheory5892
    @stringtheory5892 15 days ago

    Absolutely clear and crisp visualization of PCA!!

  • @alin50248
    @alin50248 7 days ago

    Excellent and clear animations, graphs and explanation, keep it up!

  • @sharjeel_mazhar
    @sharjeel_mazhar 2 months ago +5

    I must say, the way you teach is just brilliant! Those visualizations and all, i mean even a 10yo could understand it if he focuses just a lil bit! Can't wait till you reach the level of teaching us the Transformer models!

  • @Grenoble7
    @Grenoble7 2 months ago +7

    Thank you for the insight without the fuss. I am a UMAP user and I am glad about your conclusion. Subscribed!

  • @williamz666
    @williamz666 2 months ago +4

    Wow, super cool! Love the visualizations! Very informative, much better than the PowerPoint presentations out there lol

  • @ganpangyen4444
    @ganpangyen4444 2 months ago

    Incredible visualization and simplification on the topic, especially with UMAP! The superiority of UMAP over t-SNE in terms of the final results and its lower sensitivity to hyperparameters really shows the power of math

  • @janjoecarcillar
    @janjoecarcillar 10 days ago

    Very clearly discussed. Thanks.

  • @kagan4926
    @kagan4926 2 months ago +2

    Awesome video! I was hoping for a bit more of a friendly, intuitive explanation of the equations in t-SNE. Instead of just jumping into the equations, it would be great to get a sense of why they work the way they do and how they fit into the whole picture.

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thanks for the feedback! I would have liked to spend more time on each method, but unfortunately the video got longer than I usually aim for. :/

  • @aadimator
    @aadimator 2 months ago +1

    Amazing visualization, pace, and aesthetics. Looking forward to seeing more from you. Best of luck 🤞

  • @laotzunami
    @laotzunami 2 months ago +1

    Your videos are such high quality! Thank you so much for putting this effort into them. I do data visualization, and I would love to start including more advanced machine learning models in what I make. I got a lot out of this video. I can't wait to see what is next :)

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thank you for the kind words, the next videos will be about VAEs and their variants !

  • @r3dkoala
    @r3dkoala 2 months ago +2

    Killing it man, loving these videos, I'm so glad I found your channel!

  • @andrer.6127
    @andrer.6127 1 month ago +1

    I forgot about t-SNE. In my own research I have been using UMAP. But, I haven't heard of TriMAP and PaCMAP before. I am going to dive in deeper!

  • @TheHHadouKen
    @TheHHadouKen 2 months ago

    Hey, I've been looking for a visual representation of feature selection and dimensionality reduction, and your video is just amazing. The tone and animation make me think of 3b1b, and that's a compliment! 😊

  • @Schnurpselhasen
    @Schnurpselhasen 1 month ago

    very good explanation! Cool 😀 I would love to see a video from you explaining how transformers work

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      Thank you ! It's written on a list somewhere but it's not a priority right now :)

  • @jj6741
    @jj6741 2 months ago

    Great video! Complex concept in simple language! Do please make more videos about this topic!

  • @jorcyd
    @jorcyd 2 months ago

    Whoa! AI, Data Science and Data Visualization in 3b1b Manim style. Awesome

  • @sadramedia719
    @sadramedia719 14 days ago

    amazing tutorial!

  • @ardhidattatreyavarma5337
    @ardhidattatreyavarma5337 2 months ago

    This was amazing! Gotta try this on a dataset.

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      @@ardhidattatreyavarma5337 thank you :)

  • @unclecode
    @unclecode 2 months ago +1

    I subscribed and will stay tuned for your next video!

  • @jabrikolo
    @jabrikolo 2 months ago

    this is really great, thank you very much!

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      @@jabrikolo Thanks for the comment

  • @thmcass8027
    @thmcass8027 2 months ago +1

    Big thumbs up for the awesome video! May I know how the video is animated?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Thank you! 99% of the video is animated using Manim, a Python library. Some 3D stuff is animated using Blender.
      I'll publish the code for each video soon.

    • @thmcass8027
      @thmcass8027 2 months ago

      @@Deepia-ls2fo That's cool! Your channel deserves millions of subscribers and I'm honoured to be one of the earliest who subscribed!

  • @Number_Cruncher
    @Number_Cruncher 2 months ago

    Just stunning!

  • @clickbaitking6770
    @clickbaitking6770 2 months ago

    Great video! Hopefully this ultimately leads to a video about the interpretability work by Anthropic using VAEs

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thank you, I've not heard of their work I'll look into it :)

  • @15tatt
    @15tatt 2 months ago +1

    this is suuuuuper helpful!!!!! thank you so much for the work!!!

  • @loicmurumba8493
    @loicmurumba8493 10 days ago

    Great video, very well explained. One question for UMAP; I understand the concept behind the loss function creating a lower level representation with similar distances between points, but how do we represent an autoencoder for MNIST as a vector space? Is there some kind of transformation we can perform on the weights to visualize them as such?

    • @Deepia-ls2fo
      @Deepia-ls2fo  9 days ago

      @@loicmurumba8493 Thanks, we don't project the network directly, but rather the inner representation it has of the MNIST data. For this, you encode several numbers, which results in vectors of 16 dimensions (for instance, but it could be any dimension). Then you apply UMAP to these vectors.
      We don't really visualize the weights directly.

    • @loicmurumba8493
      @loicmurumba8493 9 days ago +1

      @@Deepia-ls2fo Really appreciate the quick answer; I had been thinking about the encoder as a neural net classifying the images into their numbers for some reason. In fact, the autoencoder simply compresses the data as accurately as it can into some lower dimensional space (16 dimensions in this example), then we're able to visualize those 16-D representations by running the algorithms described here
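
The pipeline described in this exchange can be sketched in a few lines. This uses scikit-learn's small digits dataset as a stand-in for MNIST and a PCA projection as a stand-in for the trained encoder; t-SNE does the 2-D step here only because it ships with scikit-learn — with umap-learn installed, `umap.UMAP(n_components=2)` is called the same way:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# 8x8 digit images as a small stand-in for MNIST.
X, y = load_digits(return_X_y=True)              # X has shape (1797, 64)

# Stand-in "encoder": PCA down to 16 dimensions. In the video this role
# is played by the trained autoencoder's encoder network.
latents = PCA(n_components=16).fit_transform(X)  # (1797, 16)

# Project the 16-D latent vectors down to 2-D for plotting (subset for speed).
sub, sub_labels = latents[:500], y[:500]
emb = TSNE(n_components=2, init="pca", random_state=0).fit_transform(sub)
print(emb.shape)  # (500, 2) -- ready for a colored scatter plot
```

The point, as in the reply above, is that the projection is applied to the network's inner representation of the data, never to the weights themselves.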

  • @rukascool
    @rukascool 29 days ago

    masterpiece, also beautiful on OLED monitor. Easy sub from me

  • @lazur2006
    @lazur2006 2 months ago

    Keep it up! Really appreciate this work

  • @ProgrammingWithJulius
    @ProgrammingWithJulius 1 month ago

    Amazing video. I am using a short clip of it for my next project, hope you don't mind. With credit, of course.

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago +1

      Thanks! Can you send me an email so that we can discuss this? I usually don't allow my content to be reused on YouTube

  • @vijayaveluss9098
    @vijayaveluss9098 2 months ago

    Thank you so much! Straight to the point❤

  • @lifeinabubble9091
    @lifeinabubble9091 2 months ago

    Incredible work, I learned a lot. I will say the pacing was a bit fast at times in my opinion (I had to pause to take notes).

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thank you I'll try to adapt the pace

  • @ArnaldoSANTORO-b6w
    @ArnaldoSANTORO-b6w 1 month ago

    Great explanation and visualization! I'm looking at your code to learn how to perform these visualizations.
    How did you manage to visualize the rotating 3D mammoth on the right at 17:25? I'd like to do it with the T-rex data

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      Thank you! Unfortunately Manim is very bad at handling 3D, so I loaded the data into Blender using a script. If you're struggling you can reach out by email and I'll provide the code (YouTube does not like external links in the comments).

    • @arnaldosantoro6812
      @arnaldosantoro6812 1 month ago

      Thank you for your reply.
      I resorted to the same method instead of wasting time on manim documentation.

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      @@arnaldosantoro6812 You're welcome ! Manim is great, just not for 3D :)

  • @sinfinite7516
    @sinfinite7516 2 months ago

    WOW AMAZING THANK YOU SO MUCH This helped me a lot 🎉

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Thank you I'm glad it helped :)

  • @MutigerBriefkasten
    @MutigerBriefkasten 2 months ago

    Thank you, great video again 🎉 learned many new things 😊

  • @TrusePkay
    @TrusePkay 3 days ago

    Well done. But you did not mention LDA (linear discriminant analysis)

  • @nikilragav
    @nikilragav 2 months ago

    11:59 and elsewhere - when you are running PCA, are the classes not included in your dimensions? Eg. in the 3D spiral with color (actually 4 dimensions), are you only using 3 (x,y,z) for PCA (and SNE)?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thanks for the comment, indeed only the position information is used, not the class !

    • @nikilragav
      @nikilragav 2 months ago

      @@Deepia-ls2fo so if you added the class to PCA wouldn't it separate very cleanly?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Yes, it would indeed separate the representations very cleanly.
      We usually don't use them when testing a method because the idea is to see whether the method can show relationships that we know are true and exist in the high-dimensional space.
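
The setup discussed in this thread can be sketched as follows — the class labels exist in the dataset but are deliberately held out of the PCA fit, which sees only the x, y, z coordinates (the spiral construction here is hypothetical, for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical 3-D data: two interleaved spirals, one per class.
t = np.linspace(0, 4 * np.pi, 500)
spiral_a = np.column_stack([np.cos(t), np.sin(t), t])
spiral_b = np.column_stack([np.cos(t + np.pi), np.sin(t + np.pi), t])
X = np.vstack([spiral_a, spiral_b])        # (1000, 3): positions only
labels = np.array([0] * 500 + [1] * 500)   # used ONLY for coloring, never fitted

# PCA sees just the coordinates -- the class label is held out, so the
# projection has to reveal the class structure on its own.
proj = PCA(n_components=2).fit_transform(X)
print(proj.shape)  # (1000, 2)
# e.g. plt.scatter(proj[:, 0], proj[:, 1], c=labels) to color by class
```

If the labels were fed in as a fourth feature, the classes would separate trivially, which is exactly why they are excluded when benchmarking a method.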

  • @BharmaArc
    @BharmaArc 2 months ago

    loved the video and it came at a great time for me, thank you! one small detail: at 8:12 the distance of the y's in the denominator should also be squared, right?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Hi, thanks for the comment, indeed it should be squared; that's a mistake. Another detail I did not take time to mention in the video is that there is no sigma parameter in the low-dimensional representation.
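
For readers following along, the corrected low-dimensional similarity in t-SNE uses the Student-t kernel with the squared distance in both numerator and denominator, and, as noted in the reply, no per-point sigma:

```latex
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}
```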

  • @Francis-gg4rn
    @Francis-gg4rn 1 month ago

    unrecognized genius

  • @clickbaitking6770
    @clickbaitking6770 2 months ago

    Would love a video on mixture of a million experts 🙃

  • @Father_Son.
    @Father_Son. 2 months ago

    new sub! great video quality!

  • @yadavadvait
    @yadavadvait 2 months ago

    amazing video! please keep it up :)

  • @imadhamaidi
    @imadhamaidi 2 months ago

    Hello! Thanks for the superb explanation, but I'm wondering, do the datasets really have to be vector spaces for these methods to work? Wouldn't they work with only a metric space where we only have distance information? Like levenshtein word distances between files?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Hi, thanks for the comment, I don't know about this particular distance. Since t-SNE and UMAP both convert vector data into distances, I don't think you could directly work with the distance information. But what you could do is modify UMAP to include this distance in the first step (the one turning the dataset into a weighted graph). I think t-SNE works only with the standard Euclidean distance though. :)

    • @lelandmcinnes9501
      @lelandmcinnes9501 2 months ago

      @@Deepia-ls2fo In many UMAP implementations you can use metric="precomputed" and pass a distance matrix rather than input vectors. The Python implementation also supports using a precomputed knn-graph, so you need not compute all-pairs distances, just distances to nearest neighbors. These approaches would allow you to use UMAP with a (precomputed) Levenshtein distance.
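
A minimal sketch of that suggestion: build a Levenshtein distance matrix for a few sample words (chosen here purely for illustration) and hand it to UMAP as a precomputed metric. The UMAP step is guarded because it needs the third-party umap-learn package:

```python
import numpy as np

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[-1] + 1,            # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

words = ["kitten", "sitting", "mitten", "fitting", "written", "knitting"]
D = np.array([[levenshtein(a, b) for b in words] for a in words], dtype=float)

try:
    import umap  # pip install umap-learn
    # metric="precomputed" lets UMAP consume the distance matrix directly.
    emb = umap.UMAP(metric="precomputed", n_neighbors=2,
                    random_state=0).fit_transform(D)
    print(emb.shape)
except ImportError:
    print("umap-learn not installed; D is still a valid precomputed distance matrix")
```

As the reply notes, the Python implementation also accepts a precomputed knn-graph, so for large corpora you only need distances to nearest neighbors, not all pairs.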

  • @mourensun7775
    @mourensun7775 2 months ago

    The video is great, thank you. Btw, I have a small question, what is the name of the background music?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Thank you!
      It's a copyright-free track I found on Pixabay: Documentary - Coma-Media.

    • @mourensun7775
      @mourensun7775 2 months ago

      @@Deepia-ls2fo Many thanks!

  • @Latent_Explorations
    @Latent_Explorations 1 month ago

    I love this ♥

  • @i2c_jason
    @i2c_jason 2 months ago

    Is it safe to say that when "choosing your own adventure" in an agentic RAG workflow, if you can visualize the high dimensional latent space, you can make decisions on iterating your workflow that would be more conservative the closer they are clustered together? For example, if I'm adjusting the parameters of a 3D model and pulling from a dataset of known shapes and "widgets", working within the boundaries of a closely clustered embeddings might be safer than pulling a wacky far-off embedding as my decision outcome for my next iteration of whatever I'm doing in the workflow? If I'm understanding this correctly, the visualizations can help guide a generative workflow.

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Hi thanks for the comment, I'm not familiar with all the terms you used. But if I understand correctly yes, checking that the objects you are dealing with are within the known distribution that your model knows how to handle can be done using these visualizations.
      The broad topic would be out of distribution detection.

    • @i2c_jason
      @i2c_jason 2 months ago

      @@Deepia-ls2fo I'm still formulating my thoughts on how exactly to implement some of what I'm working on, but the idea is to allow users of my generative workflow to have some intuitive guidance as they progress through iterations. I think exposing the latent space as part of the user experience might not be a bad thing. Rather than it being inside the black box. Also, great videos, new subscriber here!

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Hey just to let you know that I think I saw a thing on linkedin related to what you were talking about. This seems to definitely be a use case.

  • @jakeaustria5445
    @jakeaustria5445 2 months ago

    Thank you

  • @superman39756
    @superman39756 2 months ago

    Such a soothing AI voice too!

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thanks! That's actually my voice cloned with ElevenLabs and slightly modified :)

  • @vinniepeterss
    @vinniepeterss 2 months ago

    great!!

  • @Jinom
    @Jinom 2 months ago

    great video

  • @jkzero
    @jkzero 2 months ago

    very nice, make sure to submit this to #SoMEpi (deadline is Aug18)

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Hi, thanks for your comment. Could you send me the info? I can't seem to find anything online except the tag on YouTube :/
      Edit: Never mind, I found it :)

    • @jkzero
      @jkzero 2 months ago

      @@Deepia-ls2fo just search for "Summer of Math Exposition," I cannot post the link because YouTube does not let viewers post links

    • @jkzero
      @jkzero 1 month ago

      @@Deepia-ls2fo great to see that your entry won one of the Honorable Mentions, congrats!

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      @@jkzero Thanks glad to see you got one too :)

    • @jkzero
      @jkzero 1 month ago

      @@Deepia-ls2fo thanks, I was not expecting this at all. I hope you got exposure to a greater audience and some constructive feedback from reviewers

  • @deror007
    @deror007 2 months ago

    In the UMAP case, how does the high-dimensional graph representation have more points than the lower-dimensional graph representation? I am definitely missing something here. 16:10

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Oh no, that's my bad, they are supposed to have the same number of points of course!
      These are just illustrative examples though, not results of the actual algorithm.

    • @deror007
      @deror007 2 months ago

      @@Deepia-ls2fo oh okay, thanks!

  • @iamr0b0tx
    @iamr0b0tx 2 months ago

    Are there any dimensionality reduction algorithms that work online (i.e. one sample at a time)?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thanks for the comment, I don't know of any dimensionality reduction technique that would work with one sample only, except a simple projection. :/
      Edit: I think I misunderstood your question; the only online technique that comes to mind is deep learning models? Once trained you can feed them one sample at a time.

    • @iamr0b0tx
      @iamr0b0tx 2 months ago +1

      @@Deepia-ls2fo I was wondering about an algorithm that does not wait to get the entire dataset but learns the reduction as it gets the samples one at a time (or a small batch at a time), kinda like online learning. But yeah sounds like some trick with deep learning might be the closest possible solution at this time. Probably some pretrained VAE
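
For what it's worth, scikit-learn's `IncrementalPCA` is one standard answer to this question: it learns the projection from mini-batches via `partial_fit`, never holding the full dataset in memory. A minimal sketch on synthetic data:

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# IncrementalPCA updates its components one mini-batch at a time --
# an online-style linear dimensionality reducer.
rng = np.random.default_rng(0)
ipca = IncrementalPCA(n_components=2)

for _ in range(20):                        # a stream of 20 batches
    batch = rng.normal(size=(50, 10))      # 50 samples, 10 features each
    ipca.partial_fit(batch)                # incremental update, no full fit

# Once fitted, single new samples can be projected one at a time.
new_sample = rng.normal(size=(1, 10))
print(ipca.transform(new_sample).shape)    # (1, 2)
```

It only captures linear structure, so the "pretrained VAE" idea above remains the closest online analogue of t-SNE/UMAP-style nonlinear embeddings.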

  • @BooleanDisorder
    @BooleanDisorder 2 months ago

    What's it called when you turn a high dimensional representation into a single point to compare with other similar high dimensional representations?

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Hi, what you are referring to can be described using many words. Maybe "representation learning" is what you are looking for?
      Edit: maybe "embedding" is the word you're looking for.

    • @BooleanDisorder
      @BooleanDisorder 2 months ago

      @@Deepia-ls2fo Yes! Thanks.
      This area of research is SO fascinating. I feel we're grasping at something bigger.

  • @ravenecho2410
    @ravenecho2410 2 months ago

    Bro, when did UMAP turn into a paper about archaeology and mammoths, what in the actual black maths

  • @kellymoses8566
    @kellymoses8566 2 months ago +1

    ISOMAP and PaCMAP are two newer algorithms.

  • @Garfield_Minecraft
    @Garfield_Minecraft 2 months ago +1

    That data at the beginning looks like a country map

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thanks for the comment, well it was just MNIST lol

  • @vinniepeterss
    @vinniepeterss 2 months ago

    ❤❤

  • @waffles6555
    @waffles6555 2 months ago

    Awesome!

  • @orderandchaos_at_work
    @orderandchaos_at_work 23 days ago

    Assumed it was magic

  • @Sleight-l4y
    @Sleight-l4y 2 months ago

    I just realized that I am an absolute nerd because I cringe every time you say Gaussian with a hard s

    • @deltamico
      @deltamico 2 months ago

      isn't that correct though

    • @Sleight-l4y
      @Sleight-l4y 2 months ago

      @@deltamico I have never heard Gaussian with a hard s in academia, it's always with an sh. But "correct" is a soft condition when it comes to pronunciation.

  • @lalamax3d
    @lalamax3d 2 months ago

    sir. you are amazing..

  • @paratracker
    @paratracker 2 months ago

    It sounds like LEEbler, not LIEbler.

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thanks for your comment, as a Frenchman I always assumed he was German; it never occurred to me he was American!