Stemming and Lemmatization: NLP Tutorial For Beginners - S1 E10

  • Published: 25 Nov 2024

Comments • 42

  • @codebasics
    @codebasics  2 years ago

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

  • @Breaking_Bold
    @Breaking_Bold 1 year ago +1

    I love the way you explain other NLP concepts, customizing the pipeline for example!!!

  • @belfloretkoriciza5279
    @belfloretkoriciza5279 2 years ago +1

    You are my teacher and I am proud of you.

  • @pphantom5037
    @pphantom5037 3 months ago +1

    There is a quiz now!! Thank you for your awesome work♥♥♥

  • @ayushgupta80
    @ayushgupta80 8 months ago +2

    Stemming (stripping affixes by rule) vs lemmatization (mapping a word to its base form) 4:50
    Note: spaCy does not support stemming.
    Code: stemming
    import nltk
    import spacy
    from nltk.stem import PorterStemmer
    stemmer = PorterStemmer()
    words = ["eating","eats","eat","ate","adjustable","rafting","ability","meeting"]
    for word in words:
        print(word,"|",stemmer.stem(word))
    --------------------------------------------------------------------------------
    Code: lemmatization
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("eating eats eat ate adjustable rafting ability meeting better")
    for token in doc:
        # token.lemma_ is the lemma text; token.lemma is its integer hash ID
        print(token,"|",token.lemma_,"|",token.lemma)
    --------------------------------------------------------------------------------
    Custom lemmatization
    Code:
    # Map the slang tokens "Bro" and "Brah" to the lemma "Brother"
    ar = nlp.get_pipe('attribute_ruler')
    ar.add([[{"TEXT":"Bro"}],[{"TEXT":"Brah"}]],{"LEMMA":"Brother"})
    doc = nlp("Bro, you wanna go ? Brah , don't say no ! I am exhausted")
    for token in doc:
        print(token.text,"|",token.lemma_)
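The contrast in the comment above can be sketched without any NLP library. This is a toy illustration under stated assumptions, not the Porter algorithm and not spaCy's lemmatizer: `SUFFIXES`, `LEMMA_TABLE`, `toy_stem` and `toy_lemma` are hypothetical names invented for this example. It only tries to show why a suffix-chopping stemmer leaves "ate" alone while a lookup-based lemmatizer can map it to "eat".

```python
# Toy illustration only: not the Porter algorithm and not spaCy's lemmatizer.
SUFFIXES = ("ing", "able", "s")  # hypothetical suffix list, checked in order

# Hypothetical hand-written table standing in for a real lemma dictionary.
LEMMA_TABLE = {"ate": "eat", "eating": "eat", "eats": "eat", "better": "good"}


def toy_stem(word: str) -> str:
    """Chop the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemma(word: str) -> str:
    """Look the word up in the table; fall back to the word itself."""
    return LEMMA_TABLE.get(word, word)


if __name__ == "__main__":
    for word in ["eating", "eats", "ate", "adjustable", "better"]:
        print(word, "|", toy_stem(word), "|", toy_lemma(word))
```

The stemmer only looks at the characters of the word, so irregular forms such as "ate" and "better" pass through unchanged; the lookup needs linguistic knowledge (here a tiny hand-made table) to handle them.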

  • @Breaking_Bold
    @Breaking_Bold 1 year ago

    Fantastic... you make complex NLP topics simple!!!

  • @amandaahringer7466
    @amandaahringer7466 2 years ago +1

    Very helpful! Looking forward to the rest of the series! Thank you!

  • @aintgonhappen
    @aintgonhappen 2 years ago

    This is some quality content.
    Thank you!

  • @sandeepnaik6437
    @sandeepnaik6437 2 years ago +5

    What is Behavioural data science?

  • @amandaahringer7466
    @amandaahringer7466 2 years ago +1

    8:36 I noticed that the prebuilt language pipelines return an unexpected lemma for "ate". I assumed that the lg and trf pipelines would produce ate -> eat while the sm and md pipelines would produce ate -> ate, but that doesn't seem to be the case.
    import spacy
    def eat_lemma(lang_pipeline):
        nlp = spacy.load(lang_pipeline)
        doc = nlp("ate")
        # Print the lemma of every token as a list, matching the output below
        print(lang_pipeline, '|', [token.lemma_ for token in doc])
    lp = ["en_core_web_sm", "en_core_web_md", "en_core_web_lg", "en_core_web_trf"]
    for lang_pipeline in lp:
        eat_lemma(lang_pipeline)
    en_core_web_sm | ['eat']
    en_core_web_md | ['ate']
    en_core_web_lg | ['eat']
    en_core_web_trf | ['ate']
    Update: I see that when "ate" is used in the context of a sentence, each pipeline produces a lemma of "eat".
    doc = nlp("The person ate an apple.")
    en_core_web_sm | ['the', 'person', 'eat', 'an', 'apple', '.']
    en_core_web_md | ['the', 'person', 'eat', 'an', 'apple', '.']
    en_core_web_lg | ['the', 'person', 'eat', 'an', 'apple', '.']
    en_core_web_trf | ['the', 'person', 'eat', 'an', 'apple', '.']

  • @arnavverma8622
    @arnavverma8622 2 years ago

    Excellent Series👌👌🔥🔥

  • @rajiv7
    @rajiv7 5 months ago

    You are excellent. Full stop.

  • @codebasics
    @codebasics  2 years ago

    Do you want to learn technology from me? codebasics.io is my website for video courses. First course going live in the last week of May, 2022

  • @apurav363
    @apurav363 1 month ago

    Very helpful

  • @Kaafirpeado54-6ayesha
    @Kaafirpeado54-6ayesha 1 month ago

    Thanks a bunch ❤

  • @berkayates6254
    @berkayates6254 9 months ago

    Hey guys, when we use stemming and lemmatization before training, we just change the words in the data. After training, the model can still generate words that differ from the lemmatized forms. I mean, we teach the model `eat`, but the model also learns `ate`. How?

  • @aashishmalhotra
    @aashishmalhotra 2 years ago

    If possible, try to do live sessions; that would be helpful.

  • @MuhammadIBRAHIM-iy3rg
    @MuhammadIBRAHIM-iy3rg 7 months ago

    amazing videos

  • @raphayzia9214
    @raphayzia9214 2 years ago

    Sir, it would be very helpful if you made an NLP project, like a chatbot, at the end of the series. Thanks for making this series.

    • @codebasics
      @codebasics  2 years ago +1

      Yes, I will be making a few projects.

  • @jatinnandwani6678
    @jatinnandwani6678 11 months ago

    Thanks so much

  • @JayShah-m1v
    @JayShah-m1v 1 year ago

    Hey!
    Firstly, this is a very good series. But in the last part of the exercise, on lemmatization, some of my words such as "cooking" were converted to "cook" and "playing" to "play", while "running" stayed as it is. Do you know what the issue could be?
    Or do you have any explanation for this?
    Thank you.

    • @agastyabose1645
      @agastyabose1645 9 months ago

      It might just be how the specific NLP model you used performs. Maybe, I don't know.
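One possible explanation, offered as a sketch rather than a diagnosis of that exercise: spaCy's English lemmatizer is rule-based and consults the part-of-speech tag, so "running" tagged as a noun can keep its form while "running" tagged as a verb becomes "run". The table and the `lemma_for` helper below are hypothetical and exist only to illustrate that the same surface form can have different lemmas depending on its tag.

```python
# Hypothetical mini lemma table keyed by (lowercased word, coarse POS tag).
POS_LEMMAS = {
    ("running", "VERB"): "run",
    ("running", "NOUN"): "running",
    ("cooking", "VERB"): "cook",
    ("playing", "VERB"): "play",
}


def lemma_for(word: str, pos: str) -> str:
    """Return the lemma for a (word, POS) pair, or the lowercased word if unknown."""
    key = (word.lower(), pos)
    return POS_LEMMAS.get(key, word.lower())


if __name__ == "__main__":
    print(lemma_for("running", "VERB"))
    print(lemma_for("running", "NOUN"))
```

In spaCy itself, printing `token.pos_` next to `token.lemma_` shows which tag the pipeline assigned to each word, which is a quick way to check whether the tag is what drives the result.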

  • @zaytech528
    @zaytech528 2 years ago

    Hello sir, if I want to stem and lemmatize my string at the same time, how would I do that? spaCy doesn't allow stemming, and NLTK doesn't allow lemmatization. Please answer ASAP.

  • @firdospathan3700
    @firdospathan3700 1 year ago

    I am unable to install the Ai4bharat package on my PC.
    Is there a solution for that error?

  • @omarsalam7586
    @omarsalam7586 1 year ago

    Thank you, sir

  • @muzaffariqbalraja6464
    @muzaffariqbalraja6464 1 year ago

    very nice

  • @GAURAVRAUL95
    @GAURAVRAUL95 2 years ago +1

    Which one are you? Marc Spector or Steven Grant??

    • @codebasics
      @codebasics  2 years ago +6

      I am Dhaval, Marc and Steven are my alter egos 😎

  • @muradmammedzade2885
    @muradmammedzade2885 1 year ago

    How do you write a lemmatizer from scratch?
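A minimal sketch of one common approach, assuming a tiny hand-made lexicon: check an exception list for irregular forms first, then try suffix rules and accept a candidate only if it is a known word. `EXCEPTIONS`, `SUFFIX_RULES`, `VOCABULARY` and `lemmatize` are hypothetical names for this example; a usable lemmatizer would need a full dictionary and part-of-speech information.

```python
# Toy lemmatizer sketch: exception list first, then suffix rules checked against a vocabulary.
# All tables below are hypothetical and tiny; a real lemmatizer needs a full lexicon.
EXCEPTIONS = {"ate": "eat", "went": "go", "better": "good", "mice": "mouse"}

# (suffix to remove, replacement) pairs, tried in order.
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", ""),
    ("ing", "e"),
    ("ed", ""),
    ("ed", "e"),
    ("es", ""),
    ("s", ""),
]

VOCABULARY = {"eat", "go", "good", "mouse", "make", "walk", "study", "box", "cat"}


def lemmatize(word: str) -> str:
    """Return a base form for `word`, or the lowercased word if no rule applies."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    if word in VOCABULARY:
        return word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: -len(suffix)] + replacement
            # Accept a candidate only if it is a known word, which avoids stems like "mak".
            if candidate in VOCABULARY:
                return candidate
    return word


if __name__ == "__main__":
    for w in ["ate", "making", "studies", "walked", "boxes", "running"]:
        print(w, "|", lemmatize(w))
```

The vocabulary check is the design choice that separates this from a stemmer: a rule fires only when it produces a real word, so unknown forms such as "running" (not covered by this toy lexicon) are returned unchanged instead of being truncated.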

  • @Telugu-Tech-suport
    @Telugu-Tech-suport 2 years ago

    Sir, one year ago my PC was hacked by the .gujd ransomware. Please, how can I get my data back? 🙏 Please help me, some important data is there.

  • @anaschoudhari511
    @anaschoudhari511 2 years ago

    Hi sir, a request for you to make some videos on Python.

    • @codebasics
      @codebasics  2 years ago +1

      I have a Python tutorial playlist with more than 40 videos. Search YouTube for "codebasics python tutorial".

  • @Pride_Of_Ultras
    @Pride_Of_Ultras 2 years ago

    🤩

  • @leoxu1299
    @leoxu1299 2 years ago

    Hey, aren't you Moon Knight?

    • @codebasics
      @codebasics  2 years ago +1

      Ha ha, you are the third person to say this 🤣😎😎😎

  • @thoughtofme8263
    @thoughtofme8263 2 years ago

    Pleeeeeeeeeease try speaking in Hindi