Latent Space Visualisation: PCA, t-SNE, UMAP | Deep Learning Animated

  • Published: 24 Nov 2024

Comments • 138

  • @thorvaldspear
    @thorvaldspear 3 months ago +76

    Wow that mammoth 2D visualization using UMAP looked like it was opened up and flattened, you could tell it was a living thing of some sort. Incredible!

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +8

      Thank you! Indeed, it looks like a fossil in the ground :)

    • @charlescoult
      @charlescoult 3 months ago +3

      Super cool

  • @Ouuiea
    @Ouuiea 3 months ago +19

    My only comment on the video is that PCA's real advantage is not speed but interpretability. It's easy to read a principal component in terms of how it correlates with the original variables, something you cannot do with t-SNE or UMAP. The video is excellent work!
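
    [Editor's note] The interpretability point is easy to see in code. A minimal sketch with toy data (scikit-learn assumed): each row of `components_` gives the weight of every original variable in one principal component, which is exactly the reading t-SNE and UMAP cannot offer.

    ```python
    # Minimal sketch of reading PCA loadings, using toy data and scikit-learn.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))        # 200 samples, 5 original variables

    pca = PCA(n_components=2).fit(X)

    # components_ has shape (2, 5): one weight per component and original
    # variable, so each PC reads as a weighted mix of known variables.
    for i, loadings in enumerate(pca.components_):
        print(f"PC{i + 1} loadings:", np.round(loadings, 2))
    ```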

  • @stringtheory5892
    @stringtheory5892 1 month ago +2

    Absolutely clear and crisp visualization of PCA!!

  • @robotics_hub
    @robotics_hub 3 months ago +27

    Amazing visualization for a very difficult topic to grasp. Many thanks!

  • @SudhanvaDixit
    @SudhanvaDixit 1 month ago +2

    Prediction - A channel that's going to explode.
    Watched multiple videos. Very crisp clear explanation with good animation. Thank you :)

  • @sharjeel_mazhar
    @sharjeel_mazhar 3 months ago +6

    I must say, the way you teach is just brilliant! Those visualizations and all, I mean even a 10-year-old could understand it if he focuses just a little bit! Can't wait till you reach the level of teaching us the Transformer models!

  • @Grenoble7
    @Grenoble7 3 months ago +7

    Thank you for the insight without the fuss. I am a UMAP user and I am glad about your conclusion. Subscribed!

  • @nitroseeks
    @nitroseeks 27 days ago

    You are amazing. The visualisations in your lectures are top notch

  • @jchealyify
    @jchealyify 1 day ago

    A really solid explanation. Well done! You are a wonderful communicator and your visualizations are top notch.
    I do have one very small suggestion that might help. When sweeping through hyperparameters and showing their effect on the embedding, it can be helpful to correct a bit of the stochastic nature of the layout. When transitioning between your embeddings in low dimensions, running a Procrustes algorithm on the two embeddings can help: it will just flip, rotate, and scale the point clouds to be best aligned. It really helps viewers see consistent patterns as hyperparameters change, without altering the embedding in any meaningful way.
    Keep up the fantastic work. I'll definitely be following your channel.
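
    [Editor's note] The Procrustes alignment suggested here is a one-liner with SciPy. A sketch on synthetic 2-D point clouds (the rotation angle and scale below are made up for illustration):

    ```python
    # Sketch: aligning two 2-D embeddings with a Procrustes transform (SciPy).
    import numpy as np
    from scipy.spatial import procrustes

    rng = np.random.default_rng(42)
    A = rng.normal(size=(100, 2))                 # embedding from one hyperparameter run
    theta = np.pi / 3                             # an arbitrary rotation, for illustration
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    B = 2.5 * (A @ R.T)                           # same structure, rotated and scaled

    # procrustes centers, scales, and rotates B to best match A;
    # disparity is the remaining sum of squared differences.
    A_std, B_aligned, disparity = procrustes(A, B)
    print(f"disparity after alignment: {disparity:.2e}")
    ```

    For a pure similarity transform like the one above, the disparity comes out at essentially zero, which is exactly why the aligned transitions look stable to viewers.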

  • @alin50248
    @alin50248 1 month ago

    Excellent and clear animations, graphs and explanation, keep it up!

  • @ganpangyen4444
    @ganpangyen4444 3 months ago

    Incredible visualization and simplification on the topic, especially with UMAP! The superiority of UMAP over t-SNE in terms of the final results and its lower sensitivity to hyperparameters really shows the power of math

  • @r3dkoala
    @r3dkoala 3 months ago +2

    Killing it man, loving these videos, I'm so glad I found your channel!

  • @williamz666
    @williamz666 3 months ago +4

    Wow, super cool! Love the visualizations! Very informative, much better than the PowerPoint presentations out there lol

  • @janjoecarcillar
    @janjoecarcillar 1 month ago

    Very clearly discussed. Thanks.

  • @virgenalosveinte5915
    @virgenalosveinte5915 11 days ago

    Your channel is astounding brobro thank you

  • @laotzunami
    @laotzunami 3 months ago +1

    Your videos are such high quality! Thank you so much for putting this effort into them. I do data visualization, and I would love to start including more advanced machine learning models in what I make. I got a lot out of this video. I can't wait to see what is next :)

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thank you for the kind words, the next videos will be about VAEs and their variants!

  • @aadimator
    @aadimator 3 months ago +1

    Amazing visualization, pace, and aesthetics. Looking forward to seeing more from you. Best of luck 🤞

  • @andrer.6127
    @andrer.6127 2 months ago +1

    I forgot about t-SNE. In my own research I have been using UMAP. But, I haven't heard of TriMAP and PaCMAP before. I am going to dive in deeper!

  • @kagan4926
    @kagan4926 3 months ago +2

    Awesome video! I was hoping for a bit more of a friendly, intuitive explanation of the equations in t-SNE. Instead of just jumping into the equations, it would be great to get a sense of why they work the way they do and how they fit into the whole picture.

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thanks for the feedback! I would have liked to spend more time on each method, but unfortunately the video got longer than I usually aim for. :/

  • @TheHHadouKen
    @TheHHadouKen 3 months ago

    Hey, I’ve been looking for a visual representation of feature selection and dimensionality reduction, and your video is just amazing. The tone and animation make me think of 3b1b, and that’s a compliment! 😊

  • @AlëMontoya-b2w
    @AlëMontoya-b2w 27 days ago

    Very nice video, please continue with this wonderful work! Thanks a lot.

  • @ardhidattatreyavarma5337
    @ardhidattatreyavarma5337 3 months ago

    This was amazing! Gotta try this on a dataset.

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      @@ardhidattatreyavarma5337 thank you :)

  • @jj6741
    @jj6741 3 months ago

    Great video! Complex concept in simple language! Do please make more videos about this topic!

  • @15tatt
    @15tatt 3 months ago +1

    this is suuuuuper helpful!!!!! thank you so much for the work!!!

  • @Number_Cruncher
    @Number_Cruncher 3 months ago

    Just stunning!

  • @lazur2006
    @lazur2006 3 months ago

    Keep it up! Really appreciate this work

  • @sinfinite7516
    @sinfinite7516 3 months ago

    WOW AMAZING THANK YOU SO MUCH This helped me a lot 🎉

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Thank you, I'm glad it helped :)

  • @Tho_Fox
    @Tho_Fox 2 months ago

    Very good explanation! Cool 😀 I would love to see a video from you explaining how transformers work

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thank you! It's written on a list somewhere but it's not a priority right now :)

  • @vijayaveluss9098
    @vijayaveluss9098 3 months ago

    Thank you so much! Straight to the point❤

  • @sadramedia719
    @sadramedia719 1 month ago

    amazing tutorial!

  • @jorcyd
    @jorcyd 3 months ago

    Whoa! AI, Data Science and Data Visualization in 3b1b/Manim style. Awesome

  • @MutigerBriefkasten
    @MutigerBriefkasten 3 months ago

    Thank you, great video again 🎉 learned many new things 😊

  • @jabrikolo
    @jabrikolo 3 months ago

    this is really great, thank you very much!

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      @@jabrikolo Thanks for the comment

  • @clickbaitking6770
    @clickbaitking6770 3 months ago

    Great video! Hopefully this ultimately leads to a video about the interpretability work by Anthropic using VAEs

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thank you, I've not heard of their work, I'll look into it :)

  • @rukascool
    @rukascool 2 months ago

    masterpiece, also beautiful on OLED monitor. Easy sub from me

  • @loicmurumba8493
    @loicmurumba8493 1 month ago

    Great video, very well explained. One question for UMAP; I understand the concept behind the loss function creating a lower level representation with similar distances between points, but how do we represent an autoencoder for MNIST as a vector space? Is there some kind of transformation we can perform on the weights to visualize them as such?

    • @Deepia-ls2fo
      @Deepia-ls2fo  1 month ago

      @@loicmurumba8493 Thanks, we don't project the network directly, but rather the inner representation it has of the MNIST data. For this, you encode several numbers, which results in vectors of 16 dimensions (for instance, but it could be any dimension). Then you apply UMAP to these vectors.
      We don't really visualize the weights directly.

    • @loicmurumba8493
      @loicmurumba8493 1 month ago +1

      @@Deepia-ls2fo Really appreciate the quick answer; I had been thinking about the encoder as a neural net classifying the images into their numbers for some reason. In fact, the autoencoder simply compresses the data as accurately as it can into some lower dimensional space (16 dimensions in this example), then we're able to visualize those 16-D representations by running the algorithms described here
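
    [Editor's note] The encode-then-project pipeline described in this thread can be sketched in a few lines. The random linear "encoder" below is a stand-in so the sketch stays self-contained; a real setup would use a trained autoencoder's encoder and the umap-learn package.

    ```python
    # Sketch of the encode-then-project pipeline (stand-in encoder, toy data).
    import numpy as np

    rng = np.random.default_rng(0)
    images = rng.random((500, 784))   # stand-in for 500 flattened 28x28 MNIST digits
    W = rng.normal(size=(784, 16))    # stand-in for a trained encoder's mapping

    latents = images @ W              # each image becomes a 16-D latent vector
    print(latents.shape)              # (500, 16)

    # The visualization step would then be (with umap-learn installed):
    # import umap
    # embedding = umap.UMAP(n_components=2).fit_transform(latents)  # (500, 2)
    ```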

  • @unclecode
    @unclecode 3 months ago +1

    I subscribed and will stay tuned for your next video!

  • @lifeinabubble9091
    @lifeinabubble9091 3 months ago

    Incredible work, I learned a lot. I will say the pacing was a bit fast at times in my opinion (I had to pause to take notes).

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thank you, I'll try to adapt the pace

  • @thmcass8027
    @thmcass8027 3 months ago +1

    Big thumbs up for the awesome video! May I know how the video is animated?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Thank you! 99% of the video is animated using Manim, a Python library. Some 3D stuff is animated using Blender.
      I'll publish the code for each video soon.

    • @thmcass8027
      @thmcass8027 3 months ago

      @@Deepia-ls2fo That's cool! Your channel deserves millions of subscribers and I'm honoured to be one of the earliest who subscribed!

  • @Father_Son.
    @Father_Son. 3 months ago

    New sub! Great video quality!

  • @ProgrammingWithJulius
    @ProgrammingWithJulius 2 months ago

    Amazing video. I am using a short clip of it for my next project, hope you don't mind. With credit, of course.

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago +1

      Thanks! Can you send me an email so that we can discuss this? I usually don't allow my content to be reused on YouTube.

  • @Francis-gg4rn
    @Francis-gg4rn 2 months ago

    unrecognized genius

  • @yadavadvait
    @yadavadvait 3 months ago

    amazing video! please keep it up :)

  • @toninikoloski110
    @toninikoloski110 11 days ago

    Amazing wow!

  • @clickbaitking6770
    @clickbaitking6770 3 months ago

    Would love a video on mixture of a million experts 🙃

  • @BharmaArc
    @BharmaArc 3 months ago

    Loved the video, and it came at a great time for me, thank you! One small detail: at 8:12 the distance of the y's in the denominator should also be squared, right?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Hi, thanks for the comment, indeed it should be squared, that's a mistake. Another detail I did not take time to mention in the video is that there is no sigma parameter in the low-dimensional representation.
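
    [Editor's note] For reference, the low-dimensional similarity in t-SNE uses a Student-t kernel, with the squared distance discussed in this thread and no per-point sigma:

    ```latex
    q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
                  {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}
    ```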

  • @TrusePkay
    @TrusePkay 1 month ago

    Well done. But you did not mention LDA (linear discriminant analysis).

  • @ArnaldoSANTORO-b6w
    @ArnaldoSANTORO-b6w 2 months ago

    Great explanation and visualization! I'm looking at your code to learn how to perform these visualizations.
    How did you manage to visualize the rotating 3D mammoth on the right at 17:25? I'd like to do it with the t-rex data

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      Thank you! Unfortunately Manim is very bad at handling 3D, so I loaded the data into Blender using a script. If you're struggling you can reach out by email and I'll provide the code (YouTube does not like external links in the comments).

    • @arnaldosantoro6812
      @arnaldosantoro6812 2 months ago

      Thank you for your reply.
      I resorted to the same method instead of wasting time on the Manim documentation.

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      @@arnaldosantoro6812 You're welcome! Manim is great, just not for 3D :)

  • @vinniepeterss
    @vinniepeterss 3 months ago

    great!!

  • @Latent_Explorations
    @Latent_Explorations 2 months ago

    I love this ♥

  • @superman39756
    @superman39756 3 months ago

    Such a soothing AI voice too!

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thanks! That's actually my voice, cloned with ElevenLabs and slightly modified :)

  • @i2c_jason
    @i2c_jason 3 months ago

    Is it safe to say that when "choosing your own adventure" in an agentic RAG workflow, if you can visualize the high dimensional latent space, you can make decisions on iterating your workflow that would be more conservative the closer they are clustered together? For example, if I'm adjusting the parameters of a 3D model and pulling from a dataset of known shapes and "widgets", working within the boundaries of a closely clustered embeddings might be safer than pulling a wacky far-off embedding as my decision outcome for my next iteration of whatever I'm doing in the workflow? If I'm understanding this correctly, the visualizations can help guide a generative workflow.

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Hi, thanks for the comment, I'm not familiar with all the terms you used. But if I understand correctly, yes: checking that the objects you are dealing with are within the known distribution that your model knows how to handle can be done using these visualizations.
      The broad topic would be out-of-distribution detection.

    • @i2c_jason
      @i2c_jason 3 months ago

      @@Deepia-ls2fo I'm still formulating my thoughts on how exactly to implement some of what I'm working on, but the idea is to allow users of my generative workflow to have some intuitive guidance as they progress through iterations. I think exposing the latent space as part of the user experience might not be a bad thing. Rather than it being inside the black box. Also, great videos, new subscriber here!

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Hey, just to let you know that I think I saw a thing on LinkedIn related to what you were talking about. This seems to definitely be a use case.

  • @imadhamaidi
    @imadhamaidi 3 months ago

    Hello! Thanks for the superb explanation, but I'm wondering, do the datasets really have to be vector spaces for these methods to work? Wouldn't they work with only a metric space where we only have distance information? Like Levenshtein word distances between files?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Hi, thanks for the comment, I don't know about this particular distance. Since t-SNE and UMAP both convert vector data into distances, I don't think that you could directly work with the distance information. But what you could do is modify UMAP to include this distance in the first step (the one turning the dataset into a weighted graph). I think t-SNE works only with the standard Euclidean distance though. :)

    • @lelandmcinnes9501
      @lelandmcinnes9501 3 months ago +1

      @@Deepia-ls2fo In many UMAP implementations you can use metric="precomputed" and pass a distance matrix rather than input vectors. The Python implementation also supports using a precomputed knn-graph, so you need not compute all-pairs distances, just distances to nearest neighbors. These approaches would allow you to use UMAP with a (precomputed) Levenshtein distance.
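
    [Editor's note] A sketch of the precomputed-distance route described in the reply above. Euclidean distances stand in for e.g. Levenshtein distances so the sketch is self-contained; the UMAP call is shown commented because it assumes the umap-learn package is installed.

    ```python
    # Sketch: building an all-pairs distance matrix for metric="precomputed".
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((50, 8))

    # All-pairs distances; any metric could fill this matrix (for example
    # Levenshtein distances between strings), as long as it is symmetric
    # with a zero diagonal.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    print(D.shape)                   # (50, 50)

    # With umap-learn installed, the matrix is passed directly:
    # import umap
    # embedding = umap.UMAP(metric="precomputed", n_neighbors=10).fit_transform(D)
    ```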

  • @nikilragav
    @nikilragav 3 months ago

    11:59 and elsewhere - when you are running PCA, are the classes not included in your dimensions? E.g. in the 3D spiral with color (actually 4 dimensions), are you only using 3 (x, y, z) for PCA (and SNE)?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thanks for the comment, indeed only the position information is used, not the class!

    • @nikilragav
      @nikilragav 3 months ago

      @@Deepia-ls2fo so if you added the class to PCA wouldn't it separate very cleanly?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Yes, it would indeed separate the representations very cleanly.
      We usually don't use them when testing a method because the idea is to see if the method is able to show relationships that we know are true and exist in the high-dimensional space.

  • @jakeaustria5445
    @jakeaustria5445 3 months ago

    Thank you

  • @mourensun7775
    @mourensun7775 3 months ago

    The video is great, thank you. Btw, I have a small question, what is the name of the background music?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Thank you!
      It's a copyright-free track I found on Pixabay: Documentary - Coma-Media.

    • @mourensun7775
      @mourensun7775 3 months ago

      @@Deepia-ls2fo Many thanks!

  • @Jinom
    @Jinom 3 months ago

    great video

  • @deror007
    @deror007 3 months ago

    In the UMAP case, how does the high-dimensional graph representation have more points than the lower-dimensional graph representation? I am definitely missing something here. 16:10

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +1

      Oh no, that's my bad, they are supposed to have the same number of points of course!
      These are just illustrative examples though, not results of the actual algorithm.

    • @deror007
      @deror007 3 months ago

      @@Deepia-ls2fo oh okay, thanks!

  • @iamr0b0tx
    @iamr0b0tx 3 months ago

    Are there any dimensionality reduction algorithms that work online (i.e. one sample at a time)?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thanks for the comment, I don't know of any dimensionality reduction technique that would work with one sample only, except a simple projection. :/
      Edit: I think I misunderstood your question; the only online technique that comes to my mind is deep learning models? Once trained you can feed them one sample at a time.

    • @iamr0b0tx
      @iamr0b0tx 3 months ago +1

      @@Deepia-ls2fo I was wondering about an algorithm that does not wait to get the entire dataset but learns the reduction as it gets the samples one at a time (or a small batch at a time), kinda like online learning. But yeah sounds like some trick with deep learning might be the closest possible solution at this time. Probably some pretrained VAE
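
    [Editor's note] One concrete existing option for the batch-at-a-time setting discussed here is scikit-learn's IncrementalPCA, which updates its components with partial_fit so the full dataset never has to sit in memory. It is a linear method, so less flexible than the VAE idea above, but it is genuinely online:

    ```python
    # Sketch: fitting a linear reduction one small batch at a time.
    import numpy as np
    from sklearn.decomposition import IncrementalPCA

    rng = np.random.default_rng(0)
    ipca = IncrementalPCA(n_components=2)

    for _ in range(10):              # a stream of small batches
        batch = rng.normal(size=(20, 5))
        ipca.partial_fit(batch)      # update the projection incrementally

    # New samples can then be projected one at a time.
    new_sample = rng.normal(size=(1, 5))
    print(ipca.transform(new_sample).shape)   # (1, 2)
    ```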

  • @BooleanDisorder
    @BooleanDisorder 3 months ago

    What's it called when you turn a high dimensional representation into a single point to compare with other similar high dimensional representations?

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago +2

      Hi, what you are referring to can be described using many words. Maybe "representation learning" is what you are looking for?
      Edit: maybe "embedding" is the word you're looking for.

    • @BooleanDisorder
      @BooleanDisorder 3 months ago

      @@Deepia-ls2fo Yes! Thanks.
      This area of research is SO fascinating. I feel we're grasping at something bigger.

  • @jkzero
      @jkzero 3 months ago

    very nice, make sure to submit this to #SoMEpi (deadline is Aug18)

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Hi, thanks for your comment. Could you send me the info? I can't seem to find anything online except the tag on YouTube :/
      Edit: Never mind, I found it :)

    • @jkzero
      @jkzero 3 months ago

      @@Deepia-ls2fo just search for "Summer of Math Exposition," I cannot post the link because YouTube does not let viewers post links

    • @jkzero
      @jkzero 2 months ago

      @@Deepia-ls2fo great to see that your entry won one of the Honorable Mentions, congrats!

    • @Deepia-ls2fo
      @Deepia-ls2fo  2 months ago

      @@jkzero Thanks, glad to see you got one too :)

    • @jkzero
      @jkzero 2 months ago

      @@Deepia-ls2fo thanks, I was not expecting this at all. I hope you got exposure to a greater audience and some constructive feedback from reviewers

  • @ravenecho2410
    @ravenecho2410 3 months ago

    Bro, when did UMAP turn into a paper about archaeology and mammoths? What in the actual black maths

  • @waffles6555
    @waffles6555 3 months ago

    Awesome!

  • @vinniepeterss
    @vinniepeterss 3 months ago

    ❤❤

  • @kellymoses8566
    @kellymoses8566 3 months ago +1

    ISOMAP and PaCMAP are two newer algorithms.

  • @Garfield_Minecraft
    @Garfield_Minecraft 3 months ago +1

    That data at the beginning looks like a country map

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thanks for the comment, well it was just MNIST lol

  • @orderandchaos_at_work
    @orderandchaos_at_work 1 month ago

    Assumed it was magic

  • @Sleight-l4y
    @Sleight-l4y 3 months ago

    I just realized that I am an absolute nerd because I cringe every time you say Gaussian with a hard s

    • @deltamico
      @deltamico 3 months ago

      Isn't that correct though?

    • @Sleight-l4y
      @Sleight-l4y 3 months ago +1

      @@deltamico I have never heard gaussian with a hard s in academia, it’s always with an sh. But “correct” is a soft condition when it comes to pronunciation.

  • @lalamax3d
    @lalamax3d 3 months ago

    sir. you are amazing..

  • @paratracker
    @paratracker 3 months ago

    It sounds like LEEbler, not LIEbler.

    • @Deepia-ls2fo
      @Deepia-ls2fo  3 months ago

      Thanks for your comment, as a Frenchman I always assumed he was German; it never occurred to me he was American!