Word Embedding Explained and Visualized - word2vec and wevi

  • Published: Feb 10, 2025
  • This is a talk I gave at the Ann Arbor Deep Learning Event (a2-dlearn) hosted by Daniel Pressel et al. I gave an introduction to the working mechanism of the word2vec model and demonstrated wevi, a visual tool (or more accurately, a toy, for now) I created to support interactive exploration of the training process of word embedding. I am sharing this video because I think it might help people better understand the model and how to use the visual interface.
    The audience was a mixture of academia and industry people interested in general neural network and deep learning techniques. My talk was one of six talks in total. Thank you, Daniel, for organizing the amazing event! It was truly amazing to learn so much from other researchers in just one single afternoon.
    I apologize for not speaking as clearly as I could. I did not realize I was talking this fast... I had only two hours of sleep the night before, and apparently that created multiple short circuits in the neural networks in my brain... Please turn on the subtitles for best intelligibility.
    Links:
    slides: bit.ly/wevi-slides
    wevi demo: bit.ly/wevi-online
    wevi git repository: github.com/ron...
    my homepage: bit.ly/xinrong
    a2-dlearn event: midas.umich.edu....
    word2vec website: code.google.co...
    word2vec parameter learning explained: arxiv.org/abs/1...

Comments • 102

  • @tingxiwu3749 · 4 years ago · +25

    RIP bro. Really sad that we've lost a talent like you. Your paper is now really helping lots of us and thanks for your contribution.

  • @Iniquitous2009 · 6 years ago · +18

    RIP dear stranger, you've made it so much simpler for all of us.

  • @kencheligeer3448 · 5 years ago · +7

    Thanks mate! This is the best explanation of the original word2vec. R.I.P., have a peaceful journey.

  • @melaniebaybay7006 · 8 years ago · +32

    Amazingly well done! Your paper, this talk, and the wevi tool have made it MUCH easier for me to understand the word2vec model. You definitely succeeded in your goal of explaining this topic better than the original authors. Thank you!

  • @shivendraiitkgp · 8 years ago · +6

    Just 3 minutes into the lecture, it has already caught my attention and cleared off my sleepiness. :D

  • @rachel2046 · 2 years ago · +4

    Couldn't help imagining how much he would have been able to contribute to the world of NLP if he were still alive...

  • @jiehe9673 · 8 years ago · +1

    I have read your paper and after watching your presentation, I've pretty much understood this model. Thanks!

  • @kavitapandey964 · 8 years ago · +3

    Buddy, you are a saviour..this is all I needed to get started for my project! God Bless!

  • @plamenmanov2694 · 7 years ago · +6

    One of the most talented AI presentations I've seen. Peaceful flight, my friend!

  • @geoffreyanderson4719 · 8 years ago · +1

    Thank you for this revealing talk. Good takeaways!

  • @kaiyangzhou3503 · 8 years ago · +3

    Fantastic talk! You gave me a much clearer understanding of word embedding! Awesome!

    • @xinrong5074 · 8 years ago · +2

      Kevin Zhou Thanks. Glad it helped!

    • @tongwei3527 · 8 years ago · +1

      Hi, I am a master's student at Nanjing U. and I'm interested in word embedding and related NLP technologies. Can I have your WeChat or other social media accounts? Looking forward to knowing you. Thanks.

    • @xinrong5074 · 8 years ago

      Wei Tong Uh... just send me an email. ronxin@umich.edu

  • @rck.5726 · 8 years ago · +1

    Superb talk, also read your paper before watching this, thanks for helping people understand this great work.

  • @thetawy14 · 6 years ago · +2

    Oh my god... I came back to this video because of the great explanation... but now, after reading the comments, I realize the tragedy had already happened the first time I watched this :( RIP

  • @里里-x7r · 5 years ago · +2

    R.I.P, thank you for your contribution

  • @maoqiutong · 7 years ago

    Many thanks for your great presentation and your perfect website!

  • @inaqingww · 4 years ago · +2

    RIP, Thank you for your contribution

  • @pitbullvicious2505 · 8 years ago · +1

    Excellent presentation! I had kind of got the basics of w2v and applied them in a couple of problems and noticed how well they work, but never found a paper or presentation that would really explain what w2v does and how, so that I'd understand. This presentation did. Thank you!

  • @BrutalGames2013 · 7 years ago

    Awesome job! Really straightforward explanation. Thank you very much! :)

  • @JadePX · 8 years ago

    Most impressive... And excellent presentation.

  • @coffle1 · 8 years ago · +1

    Great talk! Also, I appreciate the time taken out to put in subtitles! The volume got pretty low at times, and was glad I could rely on the subtitles.

  • @martinkeller9562 · 8 years ago

    Really well done, such an improvement over the explanation in the original paper!

    • @xinrong5074 · 8 years ago

      thanks!

    • @xinrong5074 · 8 years ago · +2

      I would add that this is in no way a replacement for the explanation in the original paper... the original one(s) were written for researchers in the field. For people who have done neural nets, especially neural language modeling, for a while, the original paper was a joy to read and offers a lot more insight into the history and competitors of the model.

    • @martinkeller9562 · 8 years ago

      True. I'm not saying that it's a bad paper in any way, but I do feel that it could have benefitted from being more explicit or more detailed at some points. In particular, the negative sampling objective function could have been discussed more. Being familiar with neural networks, but not neural language modelling in particular, it took me quite a while to work out what's really happening in word2vec.

    • @xinrong5074 · 8 years ago · +1

      agreed.

  • @brucel9585 · 3 years ago · +3

    I know this channel will no longer be updated. RIP bro. Sad to get to know you this way, and to know you better through your contributions on YouTube. Thanks.

  • @jishada.v643 · 8 years ago

    Wow... that was a really great talk.

  • @SleeveBlade · 6 years ago

    Very well done!! Good explanation

  • @chetansaini1985 · 6 years ago

    Very nicely explained.....

  • @hungynu · 6 years ago

    Thank you so much for the clear explanation.

  • @kunliniu5883 · 6 years ago · +1

    Awesome talk! Thank you and RIP.

  • @suadrifrasan2726 · 3 years ago

    Wonderful job @Xin Rong

  • @ibrahimalshubaily9520 · 5 years ago

    Outstanding, thank you Xin.

  • @maryam_nn · 8 years ago

    Thank you so much for the video! :)

  • @girishk14 · 8 years ago

    Great video! By far the best explanation of Word Embeddings so far! Xin Rong - do one for GloVe too!

  • @SanghoOh · 7 years ago

    Nice tutorial. Thanks.

  • @dr_flunks · 8 years ago · +1

    Hey Xin, I've been studying deep learning for about 6 months. I think your slide's description of backprop is the best I've seen yet. I think you've summarized it as concisely as possible, and the math finally 'clicked.' Great job. Just for others: I don't believe you called it out in the video, but it's the chain rule that allows you to work backwards toward the input layer around 16:07, correct?

    • @joy2000cyber · 3 years ago

      Yes, chain rule is big in ANN
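
A minimal, self-contained Python sketch of the chain rule driving one backward pass through a single sigmoid neuron; the variable names and toy numbers are illustrative only, not taken from the talk or the slides.

import numpy as np

# Toy forward pass: one neuron with weights w, input x, sigmoid activation.
x = np.array([0.5, -1.0, 2.0])   # input vector
w = np.array([0.1, 0.2, -0.3])   # weights
t = 1.0                          # target output

u = np.dot(w, x)                 # net input  u = w . x
y = 1.0 / (1.0 + np.exp(-u))     # prediction y = sigmoid(u)
E = 0.5 * (y - t) ** 2           # squared error

# Backward pass via the chain rule: dE/dw_i = dE/dy * dy/du * du/dw_i
dE_dy = y - t
dy_du = y * (1.0 - y)
du_dw = x
grad_w = dE_dy * dy_du * du_dw   # gradient with respect to each weight

eta = 0.1                        # learning rate
w = w - eta * grad_w             # one gradient-descent update
print(E, grad_w)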

  • @pankaj6663 · 7 years ago

    Interesting talk... Thanks :)

  • @RedShipsofSpainAgain · 7 years ago

    Great presentation Xin. Very informative. Next time, I'd suggest ensuring the volume is adequate. I've got my volume turned up to 100% and it's barely audible.

  • @高亚红-n8m · 8 years ago · +1

    It would feel friendlier to communicate in Mandarin! Thank you so much, this is really great work!

  • @yoojongheun9328 · 8 years ago · +4

    Is that a typo at 22:15 (the 2nd chain rule part)? Or am I not following the derivation?
    - in the video and the paper: dE/dw'_ij = (dE/du_j)(u_j/dw'_ij)
    - should it be: dE/dw'_ij = (dE/du_j)(du_j/dw'_ij)?

    • @xinrong5074 · 8 years ago · +2

      +Yoo Jongheun You are absolutely correct. I will correct this in the paper. Thanks.
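
For reference, the corrected step written out in LaTeX, assuming the paper's notation in which h is the hidden-layer vector, u_j is the net input to output unit j, and e_j denotes the prediction-error term on output unit j:

\frac{\partial E}{\partial w'_{ij}}
  = \frac{\partial E}{\partial u_j} \cdot \frac{\partial u_j}{\partial w'_{ij}}
  = e_j \, h_i ,
\qquad \text{since } u_j = \sum_{i} w'_{ij}\, h_i
\quad \text{and} \quad e_j := \frac{\partial E}{\partial u_j} .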

  • @martin9669 · 3 years ago · +1

    RIP Xin Rong aka Saginaw John Doe

  • @silentsnooc · 8 years ago

    Thank you for this video and especially for this awesome paper. What I don't fully understand, though, is why and/or whether the input words have to be one-hot encoded. What if I used a different representation? Let's go crazy and say I use a pre-trained word2vec model with an arbitrary embedding size. What if I used those vectors as inputs in order to learn the weights?
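
On the question above about the one-hot inputs: a toy NumPy sketch (my own illustration, not from the talk) showing that a one-hot input vector multiplied by the input weight matrix simply selects one row of that matrix, so the first layer behaves like a lookup table of word vectors:

import numpy as np

V, N = 5, 3                  # toy vocabulary size and embedding size
W = np.random.randn(V, N)    # input-to-hidden weight matrix

x = np.zeros(V)              # one-hot vector for the word with index 2
x[2] = 1.0

h = x @ W                    # hidden-layer activation
assert np.allclose(h, W[2])  # identical to looking up row 2 of W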

  • @henrylee19840301 · 5 years ago · +1

    R.I.P. Thank you bro.

  • @cbetlana7733 · 8 years ago

    Awesome talk! I'm just starting to learn about this stuff and was wondering if the talk you refer to (during the "Training a Single Neuron" slide) could be found online somewhere?

  • @anyu8109 · 8 years ago

    Good questions!!

  • @yingli2681 · 8 years ago

    Awesome video! Just one question about the PCA graph: do you look at how much of the variance the first two PCs explain? My concern is that if the first two PCs explain little of the variance, the graph no longer makes much sense, right?

    • @xinrong5074 · 8 years ago

      I think that is a great point! For inputs like
      a|b,a|c,c|a,c|b,b|a,b|c,d|e,e|d
      the PCA would make little sense.
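
Following up on the point above, one quick way to check whether a 2-D PCA plot is trustworthy is to inspect the explained-variance ratio of the first two components. A minimal sketch using scikit-learn, where vectors is a placeholder standing in for the learned word-vector matrix (rows = words):

import numpy as np
from sklearn.decomposition import PCA

vectors = np.random.randn(100, 50)   # placeholder for learned word vectors

pca = PCA(n_components=2)
projected = pca.fit_transform(vectors)

# If this sum is small, a 2-D scatter plot hides most of the structure.
print("variance explained by first two PCs:",
      pca.explained_variance_ratio_.sum())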

  • @qijinliu4024 · 7 years ago · +4

    Rest in peace on your final journey. RIP.

    • @MrChaos987 · 6 years ago

      Qijin Liu I hope he is at peace. But what exactly caused the plane accident?

  • @georgeigwegbe7258 · 6 years ago

    Thank you.

  • @quynguyen9867 · 8 years ago

    Thank you so much!

  • @homerquan · 8 years ago

    Will your work be extended to sentences (sen2vec?), e.g., input a sentence and get its intent?
    I tried to connect with you on LinkedIn. Glad to know more about each other.

  • @zilinlee3742 · 6 years ago · +1

    Big thank you & RIP

  • @vespermurtagh6547 · 7 years ago · +1

    I strongly suggest this brain training game”nonu amazing only” (Google it) for anyone who would like to increase and sharpen their brain. So I have been making use of this game a lot for brain training and it works I`ve been checking more things I remember where I left most of my things.

  • @RajarsheeMitra · 8 years ago

    You said the vector of 'drink' will be similar to that of 'milk' after training. That means the vectors of context and target will become similar. Then what about the similarity of target words that share similar contexts?

    • @anyu8109 · 8 years ago

      I think they would be similar, since we have similar vectors to predict the targets.

  • @Amapramaadhy · 7 years ago · +10

    RIP. Gone too soon

    • @abaybektursun · 7 years ago

      Wait, WTF?

    • @geraq0 · 7 years ago

      Dear Lord. I didn't understand what you meant until I googled it. That's terrible.

    • @arsalan2780 · 7 years ago

      What really happened?

    • @Iniquitous2009 · 6 years ago

      www.dailymail.co.uk/news/article-4900486/Wife-long-missing-PhD-student-wants-declared-dead.html

    • @josephrussell6786 · 6 years ago

      Sad.

  • @sunnyu3041 · 5 days ago

    RIP, I can't imagine getting to know you this way

  • @blackblather · 10 months ago

    Thank you and RIP

  • @afrizalfirdaus2285 · 8 years ago · +1

    Thank you so much, but I have a question: if I have 10K words, do I have to train all 10K words one by one?
    And what is the meaning of "context"? I don't understand what the context is. Is it a document, or what?

    • @xinrong5074 · 8 years ago · +1

      Do you mean you have 10K tokens in the corpus? Yes, you will have to train them one by one, and maybe run multiple iterations for better performance. The context is also a word, or in CBOW a bag of words.

    • @afrizalfirdaus2285 · 8 years ago

      Wow, okay, I see. Actually I want to use this method for Bahasa Indonesia, but there is no published pretrained data on the internet, so I must create it myself.
      Do I have to create the 10K training items in the form "Training data (context|target):" manually, one by one, like in your wevi demo? Is there any method to create the list of training data?
      I've read your paper and there is a chapter on "multi-word context". Can you give me an example of a context with multiple words? Is it like the words "clever" and "smart" in one context?

    • @xinrong5074 · 8 years ago

      No, you don't have to. My demo is just for illustration purposes. The word2vec package comes with preprocessing functionality to create context|target pairs from a plain text file.
      Multi-word context means considering multiple words in the same sentence as a single context. E.g., using the CBOW model on the sentence "The quick brown fox jumps over the lazy dog.", for the word "jumps", assuming the window size is 3, the context is quick+brown+fox+over+the+lazy... i.e., a multi-word context.

    • @afrizalfirdaus2285 · 8 years ago

      Oh okay, I understand what the context is.
      Where can I get the package? Hmm, but if I want to code word2vec myself without the package, how do I create the 10K context|target data pairs?
      I'm so sorry for asking you so many questions :D

    • @xinrong5074 · 8 years ago · +1

      www.tensorflow.org/versions/r0.10/tutorials/word2vec/index.html and github.com/dav/word2vec
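
To make the thread above concrete: a minimal Python sketch of generating (context | target) training pairs from plain text with a sliding window, in the spirit of the "quick brown fox" example. This is my own illustration under simple assumptions (whitespace tokenization, symmetric window), not the preprocessing code shipped with the word2vec package:

def context_target_pairs(text, window=2):
    # Yield (context_words, target_word) pairs, CBOW-style.
    tokens = text.lower().split()
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            yield context, target

sentence = "the quick brown fox jumps over the lazy dog"
for context, target in context_target_pairs(sentence, window=2):
    print("+".join(context), "|", target)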

  • @Optimus_Tesla · 5 years ago

    RIP brother

  • @michaelyin8550 · 5 years ago · +2

    RIP and big thanks!

  • @shg7709 · 3 years ago

    RIP bro

  • @Skandawin78 · 5 years ago · +1

    Why RIP? What happened to him?

    • @bertmagz8845 · 5 years ago · +1

      m.huffingtonpost.ca/2017/03/23/xin-rong-plane-crash_n_15567112.html?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAABEjoMpeN7b9fsotLlieE5ozPCYsNlKJwGUd2_KK8Gw0w9lCE3owMkmmqunR_E-033vq8FbU3CmIaOuDdnzJjaLRV_nktW5ZCyqagEbuefYWPfm2OenSZTYgGi5nPslGolgiy3qHBLdLIi-DT4pecXRKW-S777TsCRb-EEuGjk40

    • @hsun7997 · 4 years ago

      He jumped out of his own plane

    • @Skandawin78 · 4 years ago

      @@bertmagz8845 hmm the report says how and when he exited is a mystery. Did they ever find his body?

    • @dianezatanna9676 · 3 years ago

      @@Skandawin78 No his body has never been found.

  • @taoxu798 · 5 years ago · +1

    Thanks and RIP.

  • @jihu9522 · 7 years ago · +1

    RIP.

  • @Alex_Eagle1 · 3 years ago

    R.I.P

  • @bettys7298 · 5 months ago

    RIP!

  • @BrutalGames2013 · 7 years ago · +1

    RIP

  • @amantandon2802 · 5 years ago

    The volume is too low; even my speaker didn't help.

  • @anyu8109 · 8 years ago

    haha, you are so funny.

  • @sinnathambymahesan1268 · 5 years ago

    Very poor sound quality???

  • @刘云辉-p7h · 7 years ago

    Awesome!

  • @WahranRai · 3 years ago

    Poor video quality (sound)

  • @samuel-xr4bi · 6 years ago · +1

    RIP