12.1: What is word2vec? - Programming with Text

  • Published: 30 Sep 2024
  • In this new playlist, I explain word embeddings and the machine learning model word2vec with an eye towards creating JavaScript examples with ml5.js.
    🎥 Next Video: • 12.2: Color Vectors - ...
    🎥 Playlist: • Session 12: word2vec -...
    🔗 Understanding Word Vectors by Allison Parrish: gist.github.co...
    🎥 "Experimental Creative Writing with the Vectorized Word" by Allison Parrish: • "Experimental Creative...
    🎥 What is a Vector: • 1.1: Vectors - The Nat...
    🚂 Website: thecodingtrain....
    💖 Patreon: / codingtrain
    🛒 Store: www.designbyhu...
    📚 Books: www.amazon.com...
    🎥 Coding Challenges: • Coding Challenges
    🔗 p5.js: p5js.org
    🔗 Processing: processing.org
    📄 Code of Conduct: github.com/Cod...

Comments • 143

  • @bodenseeboys
    @bodenseeboys 4 years ago +124

    'man' + 'caffeine' + 'overdose' = 'this guy'

    • @pixusru
      @pixusru 2 years ago

      More like 40mg of Adderall

  • @AmitYadav-ig8yt
    @AmitYadav-ig8yt 4 years ago +2

    This guy is overdramatic. Moving his hands like he is doing YOGA before starting his videos. It irritates me.

  • @HM-qe8vl
    @HM-qe8vl 6 years ago +131

    The best example I ever heard is:
    "king" - "man" + "woman" = "queen"

    • @SimonK91
      @SimonK91 5 years ago +18

      Probably also because that's the example Mikolov et al. wrote in their published work.

    • @DmitryRomanov
      @DmitryRomanov 5 years ago +13

      "Berlin" - "Germany" + "France" = "Paris"

    • @AmitYadav-ig8yt
      @AmitYadav-ig8yt 4 years ago

      @@SimonK91 Yeah, same example

    • @kae4881
      @kae4881 3 years ago +1

      That's actually from Grokking Deep Learning, the best deep learning book out there
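
The famous analogy in this thread can be sketched with plain vector arithmetic. The 3-dimensional vectors below are hand-made toys for illustration only (real word2vec embeddings are learned from text and have hundreds of dimensions); the point is that "king" − "man" + "woman" lands closest to "queen" by cosine similarity.

```javascript
// Toy 3-dimensional "embeddings", invented for illustration.
// Real word2vec vectors are learned and have 100-300 dimensions.
const vectors = {
  king:  [0.9, 0.8, 0.1],
  man:   [0.5, 0.9, 0.1],
  woman: [0.5, 0.1, 0.9],
  queen: [0.9, 0.0, 0.9],
  apple: [0.1, 0.2, 0.1],
};

const add = (a, b) => a.map((x, i) => x + b[i]);
const sub = (a, b) => a.map((x, i) => x - b[i]);
const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);
const cosine = (a, b) =>
  dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));

// Find the word whose vector is most similar to v, skipping the query words
// (standard practice, since the query words themselves are usually closest).
function nearest(v, exclude = []) {
  let best = null, bestSim = -Infinity;
  for (const [word, u] of Object.entries(vectors)) {
    if (exclude.includes(word)) continue;
    const sim = cosine(v, u);
    if (sim > bestSim) { bestSim = sim; best = word; }
  }
  return best;
}

const result = nearest(
  add(sub(vectors.king, vectors.man), vectors.woman),
  ['king', 'man', 'woman']
);
console.log(result); // "queen" with these toy vectors
```

With a real trained model the same arithmetic is done over learned vectors, but the nearest-neighbor lookup is exactly this shape.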

  • @hendrik3553
    @hendrik3553 6 years ago +78

    4:02
    was hoping for "apple + pen"

    • @Danny1986il
      @Danny1986il 6 years ago

      My thoughts exactly

    • @wasupwithuman
      @wasupwithuman 5 years ago +6

      why? everyone knows that apple + pen = penpineappleapplepineapplepen

    • @dspower7017
      @dspower7017 4 years ago

      Lol, that tune started playing in my head 🤣🤣

  • @mrdbourke
    @mrdbourke 6 years ago +28

    One of the best explanations of Word2vec I've ever seen! Can't wait for the rest of the series

  • @marceloboemeke
    @marceloboemeke 4 years ago +11

    Thank you! Man... You're a human! I'm so tired of those mechanical, super logical developers. We see them working, and it's like "Yeah, I'm just a piece of sh*t". It always looks so far away from where we are (speaking by me). You don't pretend to be perfect, you don't even try to hide when you forget something, or get confused. I watch you coding and it's like "Look at this super cool dev, and he is a human like all of us". You gave me the courage and the inspiration to work on my personal projects, the things I dream to accomplish. Man, I can't say "Thanks" enough. Thanks!

  • @deeptigupta518
    @deeptigupta518 4 years ago +12

    You are too cute...I love how you teach...with that smile on your face...feels like learning is fun

  • @942255835
    @942255835 2 years ago +1

    can't focus on the content, too many distracting things in the videos.

  • @96cabero
    @96cabero 3 years ago +1

    This is an explanation of word embeddings, not word2vec

  • @BrianBeanBag
    @BrianBeanBag 2 years ago +1

    the constant laughter after every sentence is lowkey cringe ngl

  • @Heckerschee
    @Heckerschee 3 years ago +1

    Can’t find the playlist. Could someone reply with a link please?

  • @DaveBerendhuysen
    @DaveBerendhuysen 5 years ago +12

    Man, even just your intro brings a smile to my face. Every. Time.

  • @magelauditore333
    @magelauditore333 4 years ago +1

    You are so great and humble, I could watch your tutorials all day, but the only problem is you use JavaScript, not Python. Why, sir, why? That's the only reason I have to skip the coding videos

  • @shahrukh1514
    @shahrukh1514 5 years ago +16

    Very hilarious. Kept me tuned without blinking throughout. Explaining in simple, plain English without complex formulas actually made me understand the concept in a concise manner. Kudos to the tutor.

  • @Geek74
    @Geek74 3 years ago

    Can anybody share skip-gram Python code for 100 dimensions, using each line of text?

  • @cimmik
    @cimmik 8 months ago

    Now I wonder if it's possible to vectorize DNA sequences to show how different animals are genetically related to each other.

  • @BeautifulQuotes94
    @BeautifulQuotes94 3 years ago

    Hey mate! I am working on a research topic and I have to apply the top2vec algorithm to sentiments, but I am stuck. Can you help me please?

  • @tridunghuynh5573
    @tridunghuynh5573 3 years ago +1

    Thank you, but I'm more focused on the knowledge, so I'd prefer less emotional expression

  • @lancezhang892
    @lancezhang892 9 months ago

    Thanks for your video, sir. But one question: are these words encoded with the word2vec tool?

  • @orbinya2885
    @orbinya2885 6 years ago +8

    Really excited to watch part 2 :)

  • @abdullahwaris1275
    @abdullahwaris1275 2 years ago

    One problem: how the fuck is a mosquito a 1 on a cuteness scale that begins at 0?

  • @penguinmonk7661
    @penguinmonk7661 1 year ago +3

    This man singlehandedly carried me through college and he keeps giving stellar explanations, bless

    • @andrew1haddad
      @andrew1haddad 1 year ago

      what program and what course was this?

  • @paulhetherington3854
    @paulhetherington3854 4 months ago

    How to - Activate note pad || photo shop - I did below!

  • @makwelewishbert555
    @makwelewishbert555 2 years ago

    one of those quirky guys...he is good and nice to watch

  • @sakibabrar5321
    @sakibabrar5321 4 years ago +4

    "cat" + "cute" = "kitten"
    XD

  • @deezy437
    @deezy437 8 months ago

    Cheche

  • @hanniffydinn6019
    @hanniffydinn6019 6 years ago +4

    Could an AI be trained to take things like a reverse dictionary which links meanings to words, and create a vector space of all words and meaning ???

  • @TheOffi
    @TheOffi 6 years ago +2

    I wonder if that's useful for the Digital Humanities. Looking forward to part 2!

  • @vincepale
    @vincepale 1 year ago

    My 2-year-and-3-month-old daughter just got me to play the Coding Train song four times in a row.
    That being said, she did look at a book while I watched all of your videos on word2vec

  • @leonard99
    @leonard99 5 years ago +1

    You could represent each word in a multidimensional space of relatable properties, for example rated on human-understandable properties such as size, color, softness, usage, etc. If you could 'design' that space, the word2vec calculation would more or less work in line with human semantic understanding. Except that determining the position of each word (each word's vector) by hand is quite a tedious process when designing (setting each property's value), if not impossible, since each person doing so would do it slightly differently.
    Rather, we want this vector space to be constructed, instead of having to design it. But how do you determine the values for a series of dimensions/properties? If the properties and aspects of each word (or cluster of words) cannot be determined easily, you can build a space from the 'context' of a word across lots of sentences, many corpora. That is, you relate a word by proximity (context, physical position) to the other words that appear close to it. You can then represent each word with a vector of how it relates to other words: the vocabulary itself becomes the property space. Once you have this, relative to the corpora used, you can calculate between words using linear algebra. In a way you immediately over-fit the words to the context in which they were used. Better models would include usability properties, non-linearity, visual and ontological relationships, etc.
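
The "construct from context" idea in the comment above can be sketched as simple co-occurrence counting: each word's vector records how often every other vocabulary word appears within a small window around it. This is a count-based stand-in for illustration; word2vec learns denser, lower-dimensional vectors with a neural network, but the intuition is the same.

```javascript
// Build count-based context vectors: vectors[w][i] = how often vocabulary
// word i appears within `windowSize` words of w across all sentences.
function contextVectors(sentences, windowSize = 2) {
  const vocab = [...new Set(sentences.flatMap(s => s.split(' ')))];
  const index = Object.fromEntries(vocab.map((w, i) => [w, i]));
  const vectors = Object.fromEntries(
    vocab.map(w => [w, new Array(vocab.length).fill(0)])
  );
  for (const s of sentences) {
    const words = s.split(' ');
    words.forEach((w, i) => {
      const lo = Math.max(0, i - windowSize);
      const hi = Math.min(words.length - 1, i + windowSize);
      for (let j = lo; j <= hi; j++) {
        if (j !== i) vectors[w][index[words[j]]] += 1;
      }
    });
  }
  return vectors;
}

const vecs = contextVectors([
  'the quick brown fox jumped',
  'the lazy brown dog slept',
]);
// "fox" and "dog" end up with overlapping vectors because they share
// the context word "brown" -- similarity emerges from shared contexts.
```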

  • @DustinGunnells
    @DustinGunnells 1 year ago

    FACTS! Sometimes 1D is weirder than 2D!

  • @philipp_llj5988
    @philipp_llj5988 6 years ago +1

    I have a question:
    void setup() {
      size(800, 600);
      background(200);
      strokeWeight(5);
      // drawing area
      fill(255, 255, 255);
      rect(180, 20, 560, 560);
      // colors
      fill(0, 0, 0);
      rect(20, 20, 90, 90, 255);
      fill(255, 0, 0);
      rect(20, 135, 90, 90, 255);
      fill(0, 255, 0);
      rect(20, 250, 90, 90, 255);
      fill(0, 0, 255);
      rect(20, 365, 90, 90, 255);
      fill(255, 255, 0);
      rect(20, 480, 90, 90, 255);
    }

    void draw() {
      if (mousePressed && mouseX < 740 && mouseX > 180 && mouseY < 580 && mouseY > 20) {
        line(pmouseX, pmouseY, mouseX, mouseY);
      }
      if (mousePressed && mouseX > 20 && mouseX < 90 && mouseY > 135 && mouseY < 90) {
        fill(255, 0, 0);
        line(pmouseX, pmouseY, mouseX, mouseY);
      }
      if (mousePressed && mouseX > 20 && mouseX < 90 && mouseY > 20 && mouseY < 90) {
        stroke(0);
      }
    }
    but if I click in my red circle (rect) then the line isn't red. help!

    • @TheCodingTrain
      @TheCodingTrain 6 years ago

      Would you mind asking at discourse.processing.org? It's a better platform for Processing and p5.js related code questions. You can share code there easily! Feel free to link from here to your post.

  • @saurjayanbhattacharjee9742
    @saurjayanbhattacharjee9742 3 years ago

    But how are embeddings actually done?

  • @driyagon
    @driyagon 4 years ago +28

    Too much energy, very little information

  • @11masud
    @11masud 5 years ago +2

    "The awesomeness and elegance of explaining something so complex in a funny and easy way"... I mean, I love this!!

  • @trilobyte3851
    @trilobyte3851 5 years ago +3

    Apple + City = New York

  • @Raghadtaleb
    @Raghadtaleb 2 years ago

    watching this before my 9AM class and can't stop laughing. thanks for the great vid

  • @gnanendraavvaru3906
    @gnanendraavvaru3906 4 years ago +1

    Are you Dr. Emmett Brown from "Back to the Future"?

  • @raizaborreo1071
    @raizaborreo1071 5 years ago +7

    Full of hand movements and facial expressions, making the discussion lively. Thank you for these video tutorials, can't help but subscribe!!!

  • @arjungoud3450
    @arjungoud3450 1 year ago

    Thanks a lot for vector inclusion 😢

  • @ShawnZ7x
    @ShawnZ7x 6 years ago +2

    Please use dark theme for YouTube 🙏🙏

  • @deezy437
    @deezy437 8 months ago

    Mucho annoying mate

  • @enermacabechacabuhan7189
    @enermacabechacabuhan7189 5 years ago +3

    I like how u teach

  • @ccuuttww
    @ccuuttww 4 years ago

    This topic is not that simple; you should go through some topics in Bayesian statistics before you do vectorization,
    and most cases need preprocessing. It's a really hard topic in machine learning, called NLP

  • @DesuTechHub
    @DesuTechHub 10 months ago

    The presentation mixing white board and monitor with human interaction is very nice.👍

  • @sousacanfly
    @sousacanfly 4 years ago

    Jesus Christ, how can someone not know this VERY BASIC information and still use the models?

  • @geterewlij
    @geterewlij 4 years ago +3

    Next time just jump into the content, jeez. YouTube videos are supposed to be different from class

  • @jupiterpie
    @jupiterpie 1 year ago

    Hey!! Really love this "crazy" style of explanation. Get so engaged in that!! Thxx!!

  • @carlwh123
    @carlwh123 6 years ago +2

    lol, you have very similar YouTube suggestions to me.

  • @reformed_attempt_1
    @reformed_attempt_1 3 years ago

    like for honesty, on to the notebook

  • @AhmedKhaliet
    @AhmedKhaliet 1 year ago

    Wow that’s really great I didn’t imagine I would get it as fast as you did thank you really ❤❤

  • @calebparks8318
    @calebparks8318 10 months ago

    So how is it trained?

  • @sureshbairwa1747
    @sureshbairwa1747 4 years ago +1

    video starts at 2:20

  • @AbdalaTaherProgrammer
    @AbdalaTaherProgrammer 2 years ago

    MANY THANKS

  • @TechVizTheDataScienceGuy
    @TechVizTheDataScienceGuy 4 years ago

    Niceee 👏

  • @forestsunrise26
    @forestsunrise26 3 years ago

    oh you are so cute and your explanation is amazing! Thank you very much!

  • @NONAME_G_R_I_D_
    @NONAME_G_R_I_D_ 3 years ago

    I love you and what you are doing!!!!!! Keep on shining please

  • @mishahappy1990
    @mishahappy1990 4 years ago

    You + me. Do the math

  • @WaelAbouEl-Wafa-vv5tf
    @WaelAbouEl-Wafa-vv5tf 4 years ago

    What are the programs you use for this nice video? I need to know them to make some videos

  • @RonLWilson
    @RonLWilson 3 years ago

    Great tutorial!
    It would seem that one might have three quite different word models: one linguistics-based, e.g.
    ruclips.net/video/OBGA9DZT6Ns/видео.html
    this being more tree-based or graph-based;
    another entity-based (the thing's properties and attributes);
    and another vector-based.
    It would seem that one might employ all of these methods and then use one to aid the others, or even help "train" them.
    For example, take an animal, say a fox. One could search for all sentences that contain the word fox and see what adjectives go with it, or what verbs describe what that fox is doing, as in "the quick brown fox jumped over the lazy dog's back". One could also see how fox is used as a direct object and build a model of what one can do to a fox, such as catch it, chase it, kill it, etc.
    This word model might be viewed as a knowledge graph or the like.
    The entity model could generalize those, such as: to be brown means to have a color, and one could link those to classes such as mammals that might have similar properties, so as to imply some form of inheritance.
    Thus one could create a word model of fox, an entity model, and then from those a vector model. Then one can use that vector model as an aid to refine the other two models, and so on.
    Thus the word model would record that foxes have been said to be quick, can be brown, and can jump. The entity model might generalize that by extending the model of what it means to be brown (it has the property of color, brown being one of its values), what it means to be quick (it has the property of speed or agility), that it has the property of motion such as jumping, etc.
    Then maybe from the statistics of those two models one could develop algorithms to create vector representations of the entity and the word. And this vector space might be what one might call an idea space: it is a vector space, which is of course also a metric space, and one can cluster it (using an aggregation algorithm) to create a topological space, with the clusters being subsets of that topological space, and then maybe map these clusters back to the entity model.
    The idea is to do this automatically, where the human directs it at a higher level but the computer does all the grunt work of building the three models.

    • @RonLWilson
      @RonLWilson 3 years ago

      And BTW, that last step might employ what one might call a quality graph that maps truth metrics (e.g. size, weight, age, height, etc.) into goodness metrics, tallness, heaviness, agility, etc. Those goodness metrics then might be used to construct the vectors.

  • @thelastone1643
    @thelastone1643 4 years ago

    Thank you very much. Can word2vec be used to predict the 10 most frequent words that come before a specific word, and the 10 most frequent words that come after it? And how?

  • @anilsharma2774
    @anilsharma2774 3 years ago

    As far as I know, hyperparameters in word2vec can be tuned experimentally. But I want to calculate optimal values of two hyperparameters, context window size and embedding size (vector dimensions), for word2vec skip-gram with negative sampling using the Grey Wolf Optimizer. (I want to use this model to find the top 25 similar words of a token.) How to do this? Any ideas?

  • @johnvaughan6229
    @johnvaughan6229 3 years ago

    you are so funny

  • @MrCk0212
    @MrCk0212 4 years ago

    cat+cute=cat !

  • @chandankumarmishra336
    @chandankumarmishra336 1 year ago

    awesome

  • @creativeworks98
    @creativeworks98 5 years ago

    Can I find the similarity between two e-commerce product names using this?

  • @leana8959
    @leana8959 2 years ago

    You're so passionate about all of this, I really like your videos!

  • @devlogs1785
    @devlogs1785 6 years ago

    Isn't this just a classification problem where your labels are words in a dictionary?

    • @SimonK91
      @SimonK91 5 years ago

      Well... sort of. word2vec is not only trying to map each word into a long array of numbers, but also to give these values some correlation with each other.
      The model can for example do this arithmetic: "king" - "man" + "woman", with the closest answer being "queen".
      It can also do "Paris" - "France" + "Italy" to get "Rome".
      The model manages to map millions of words into a vector of fewer than a thousand dimensions, where previously a "one-hot" vector had been used.
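
The contrast with one-hot vectors mentioned in the reply above can be made concrete: in a one-hot encoding, every pair of distinct words has dot product 0, so the representation carries no notion of similarity at all. Dense embeddings like word2vec exist precisely to fix this. A minimal sketch:

```javascript
// One-hot encoding: a vector as long as the vocabulary, with a single 1
// at the word's position. Every distinct pair is orthogonal.
function oneHot(word, vocab) {
  return vocab.map(w => (w === word ? 1 : 0));
}

const vocab = ['king', 'queen', 'apple'];
const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);

// "king" is exactly as (dis)similar to "queen" as to "apple":
const simKingQueen = dot(oneHot('king', vocab), oneHot('queen', vocab)); // 0
const simKingApple = dot(oneHot('king', vocab), oneHot('apple', vocab)); // 0
```

A dense embedding of a few hundred dimensions can instead give "king" and "queen" a high similarity while keeping "apple" far away, which is what makes the vector arithmetic above possible.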

  • @sheheryar89
    @sheheryar89 5 years ago +3

    stop dancing !

  • @PatPrice123
    @PatPrice123 3 years ago

    Easier to watch at 2x speed ;-)

  • @evailiza218
    @evailiza218 2 years ago

    you're so amazing to watch, so easy to understand, thanks a lot. I wish my children had a teacher like you!

  • @ekleanthony7997
    @ekleanthony7997 3 years ago

    making learning easy.. Nice presentation.. So entertaining ! Thanks!

  • @eddeveloper2425
    @eddeveloper2425 5 years ago +1

    you love computers as I do

  • @hanniffydinn6019
    @hanniffydinn6019 6 years ago

    So basically words and meanings are translated to a vector space, with the meanings of words linked.
    It seems that, as meanings are so complex, you need many higher dimensions.

  • @meghnatalari419
    @meghnatalari419 4 years ago

    Can you please do on code2vec

  • @dextr0met0rfan
    @dextr0met0rfan 4 years ago

    second time trying to watch this, can't get past the jokes...

  • @kabonker
    @kabonker 4 years ago

    you are so funny... thanks

  • @JoseCastillo-fl8jn
    @JoseCastillo-fl8jn 6 years ago

    Sounds good, doesn't work for low-resource languages

  • @dirkstark2870
    @dirkstark2870 4 years ago

    Real start is 2:20

  • @ashhasib54
    @ashhasib54 5 years ago

    Classic enthusiast....Good job bro

  • @ayush612
    @ayush612 5 years ago +3

    Harry Potter - Hollywood = Daniel Radcliffe

  • @DowzerWTP72
    @DowzerWTP72 6 years ago

    This is awesome! What a brilliant concept. I clicked assuming it was going to be about converting text into coordinates of the lines making up the characters. So then you could have particles seek those points along the text and form up.
    But this. This is so much cooler than that! Pt 3. Jarvis.

  • @chillz2024
    @chillz2024 4 years ago

    Dude you are perfect!

  • @zaccanter6465
    @zaccanter6465 5 years ago

    You are crucial.

  • @EngRiadAlmadani
    @EngRiadAlmadani 4 years ago

    Vectorization

  • @jrgantunes
    @jrgantunes 4 years ago

    Thank you :)

  • @kennedyjohnson3868
    @kennedyjohnson3868 4 years ago

    Very impressive tutorial - very lively and engaging. well done!

  • @تعليم-خ7ح
    @تعليم-خ7ح 5 years ago

    Oh my god hhahha you are so amazing ..keep going

  • @Codeonces
    @Codeonces 6 years ago

    The graph at 06:21 looks like clustering in data mining!

    • @SimonK91
      @SimonK91 5 years ago +1

      That's because it is the same. :)

  • @lima_mali
    @lima_mali 4 years ago +1

    Man why do you talk so much ! 🙆 But it's fun watching ya

  • @xmohd2011
    @xmohd2011 6 years ago

    thank you 😺😺

  • @jezutryingcodes1937
    @jezutryingcodes1937 6 years ago

    two

  • @aneet84
    @aneet84 2 years ago

    Thanks!

    • @aneet84
      @aneet84 2 years ago

      This is an energetic and focused crash-course on what word2vec does! thank you.

    • @TheCodingTrain
      @TheCodingTrain 2 years ago +1

      Thank you for your generosity! Apologies I never found my way towards completing this short series!

    • @aneet84
      @aneet84 2 years ago

      @@TheCodingTrain you've been far more generous than my little gesture, through sharing your knowledge with us. Thank you.

  • @earomc
    @earomc 6 years ago

    Now, that’s a cool idea! I like it :D

  • @MrLuke1106
    @MrLuke1106 6 years ago

    This is brilliant

  • @Xfacta12482
    @Xfacta12482 6 years ago

    I love all these kind of obscure-ish topics you cover...but have you ever thought about videos just focusing on plain old Node or React?
    You're such an amazing teacher I'd love to see what you'd be able to cook up

  • @Engineer9736
    @Engineer9736 6 years ago

    I don't believe this is ever going to work. Words are not made by mathematical means, so it's pointless to try to do mathematical things with them. It may still be fun to mess with, though, if you get any results.

    • @SimonK91
      @SimonK91 5 years ago +1

      This is working, and is the current state-of-the-art (highest performing) within text generation.

    • @tantarudragos
      @tantarudragos 1 year ago

      this aged pretty badly lol

  • @dnadhruv
    @dnadhruv 6 years ago

    You're like the Bob Ross for coding.

  • @hazemght4654
    @hazemght4654 6 years ago +1

    what does it mean when I comment first?

    • @madmadz1624
      @madmadz1624 6 years ago +1

      it means you were first to watch the video

  • @ailbae1004
    @ailbae1004 6 years ago +6

    the eart is flat

    • @earomc
      @earomc 6 years ago +2

      ailba e maybe, but earth is not.

    • @ailbae1004
      @ailbae1004 6 years ago

      gradle what r u talking about?

    • @Engineer9736
      @Engineer9736 6 years ago

      ailba e You made a typing error. It’s earth, not eart. And while we’re on it; it’s also flat, not flath.

    • @ailbae1004
      @ailbae1004 6 years ago

      Richard van Pukkem typing error what do you mean 🤔