Google PEGASUS Abstractive Text Summarization (+) HF Transformers python demo

  • Published: 10 Jan 2025

Comments • 59

  • @doric1111
    @doric1111 3 years ago +2

    To fix the issue, replace the batch assignment with:
    batch = tokenizer(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago

      Which issue are you talking about?

    • @doric1111
      @doric1111 3 years ago

      @@RitheshSreenivasan github.com/huggingface/transformers/issues/8691

    • @doric1111
      @doric1111 3 years ago

      @@RitheshSreenivasan I also sent you a question by email; I would appreciate an answer :)

    • @zainnaveed2196
      @zainnaveed2196 2 years ago

      @@doric1111 Could you please tell me how you resolved this issue?
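
The fix in this thread targets newer transformers releases, where the old prepare_seq2seq_batch helper was removed in favor of calling the tokenizer directly (see the linked issue). A minimal end-to-end sketch, assuming google/pegasus-large as mentioned elsewhere in these comments; the summarize wrapper is illustrative, not from the video:

```python
def summarize(src_text, model_name="google/pegasus-large"):
    """Sketch: summarize a list of documents with PEGASUS (downloads the model)."""
    import torch
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    torch_device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
    # The fix: call the tokenizer directly instead of prepare_seq2seq_batch
    batch = tokenizer(src_text, truncation=True, padding="longest",
                      return_tensors="pt").to(torch_device)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```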

  • @TechVizTheDataScienceGuy
    @TechVizTheDataScienceGuy 4 years ago +1

    Nice explanation!

  • @aqibfayyaz1619
    @aqibfayyaz1619 3 years ago

    Nice explanation

  • @waisyousofi9139
    @waisyousofi9139 2 years ago

    Could you show how to evaluate abstractive summarization?
    Thanks

    • @RitheshSreenivasan
      @RitheshSreenivasan  2 years ago +1

      Unless you have some ground-truth summary, evaluation is difficult. If there is a ground truth, then you can evaluate using BERTScore or ROUGE metrics.

    • @waisyousofi9139
      @waisyousofi9139 2 years ago

      @@RitheshSreenivasan
      Thanks!
      By ground-truth summary, do you mean a human-generated summary, which is also called the reference?

    • @RitheshSreenivasan
      @RitheshSreenivasan  2 years ago +1

      Yes

    • @waisyousofi9139
      @waisyousofi9139 2 years ago

      @@RitheshSreenivasan
      Thanks for your time.
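
On evaluating against a reference (ground-truth) summary: ROUGE compares n-gram overlap between the generated and reference texts. A minimal pure-Python sketch of ROUGE-1 F1 for intuition; real evaluations would use a library such as rouge-score or bert-score:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # per-token match counts, clipped
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, rouge1_f1("the cat", "the cat sat") is 0.8 (precision 1.0, recall 2/3).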

  • @UsmanNiazi
    @UsmanNiazi 3 years ago

    Great Content.

  • @zainnaveed2196
    @zainnaveed2196 2 years ago

    Please tell me how we can increase the length of the summary.

  • @nithyakalyani5720
    @nithyakalyani5720 3 years ago

    Does this support Indian languages?

  • @mateuszfijak5293
    @mateuszfijak5293 4 years ago

    Great video! Have you considered using GPT-2 or GPT-3 for this task?

    • @RitheshSreenivasan
      @RitheshSreenivasan  4 years ago +1

      I have not as of yet, but I would be interested in looking at GPT-2 or GPT-3.

    • @waisyousofi9139
      @waisyousofi9139 2 years ago

      Hey,
      can you help with how to evaluate an abstractive summary generated using GPT-3?

  • @gagankalra4603
    @gagankalra4603 3 years ago

    Is there a way to increase the length of the summary? I mean, is there a parameter that can be tuned to increase summary lengths?
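
Summary length is controlled through generation parameters rather than the model itself. A hedged sketch of the relevant arguments to Hugging Face's generate() API; the values below are illustrative, not tuned ones from the video:

```python
# Length-related generate() arguments; counts are in tokens, not words.
gen_kwargs = {
    "min_length": 60,        # force at least this many generated tokens
    "max_length": 256,       # stop generation after this many tokens
    "num_beams": 8,          # beam search width
    "length_penalty": 1.2,   # positive values favor longer outputs in beam search
}
# Usage (sketch): summary_ids = model.generate(**batch, **gen_kwargs)
```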

  • @computerscienceandengineer2400
    @computerscienceandengineer2400 2 years ago

    Is it possible to apply the model for other Indian languages?

  • @aminjahani5136
    @aminjahani5136 4 years ago +1

    Hey, thanks for the video.
    I have a problem with the code: why does 'tokenizer' have a None value (NoneType)? I checked the code on GitHub and saw the same problem there.

    • @RitheshSreenivasan
      @RitheshSreenivasan  4 years ago

      Could be an issue with how the transformers library is installed.

    • @gunandm7292
      @gunandm7292 3 years ago

      Right after running !pip install transformers, you need to run !pip install sentencepiece before importing PegasusTokenizer.
      I hope this helps :)

    • @gunandm7292
      @gunandm7292 3 years ago +1

      And use this code to tokenize:
      batch = tokenizer(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)

  • @Manideep.
    @Manideep. 3 years ago

    Why am I getting an error when I import the model?
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer
    >>> import torch
    AttributeError: 'Version' object has no attribute 'major'
    Please help.

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago

      Seems to be an issue with PyTorch installation.

    • @Manideep.
      @Manideep. 3 years ago

      Can you share a link for installing torch properly?

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago

      @@Manideep. You can go to the official PyTorch page; I followed the instructions there.

  • @ajay0221
    @ajay0221 3 years ago

    How can we extend this to meetings?

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago

      Can you elaborate on your use case?

    • @ajay0221
      @ajay0221 3 years ago

      @@RitheshSreenivasan I was trying to generate minutes for meetings with multiple participants.

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago

      You are better off with a custom NLP pipeline that first identifies participants and then detects the important points.

  • @zainnaveed2196
    @zainnaveed2196 2 years ago

    I got this error: 'NoneType' object is not callable
    in this code:
    model_name = 'google/pegasus-large'
    torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
    on this line: tokenizer = PegasusTokenizer.from_pretrained(model_name)
    Please help me.

  • @smtsdlkr
    @smtsdlkr 3 years ago

    Hi sir, can we set a max or min length for the summary? 🤔

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago +1

      Look at the Hugging Face PEGASUS documentation. I have not experimented with summary length.

    • @smtsdlkr
      @smtsdlkr 3 years ago

      @@RitheshSreenivasan thanks sir 😊

  • @karimfayed4517
    @karimfayed4517 3 years ago

    How can I fine-tune the PEGASUS large checkpoint?

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago +1

      gist.github.com/jiahao87/50cec29725824da7ff6dd9314b53c4b3

    • @karimfayed4517
      @karimfayed4517 3 years ago

      @@RitheshSreenivasan Thank you for the video and the script; they have been a great help. Just one question: if I have my own dataset (which is TFDS) on my local PC, do I just replace xsum with it in the script and adjust the batch size and epochs?

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago

      Should work

    • @karimfayed4517
      @karimfayed4517 3 years ago

      @@RitheshSreenivasan With your help and reference I was able to fine-tune the model, but I'm stuck on using it like you do in the video. I don't know how to turn it into a form where I can give it an input and get an output.
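
On using a fine-tuned checkpoint the same way as the pretrained model: if the fine-tuning script saved its output with save_pretrained(), the checkpoint directory can be passed to from_pretrained() in place of the hub name. A sketch under that assumption; the directory path is hypothetical:

```python
def load_finetuned(checkpoint_dir="./pegasus-finetuned"):  # hypothetical path
    """Sketch: load a checkpoint saved via save_pretrained() for inference."""
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer
    tokenizer = PegasusTokenizer.from_pretrained(checkpoint_dir)
    model = PegasusForConditionalGeneration.from_pretrained(checkpoint_dir)
    return tokenizer, model
```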

  • @hieuleinh180
    @hieuleinh180 4 years ago

    Do you have source code that uses the Hugging Face PEGASUS for the summarization task?

    • @RitheshSreenivasan
      @RitheshSreenivasan  4 years ago

      Do you mean code to fine-tune on your own documents? I have not explored that as of yet. The code I used in this example is available here: github.com/rsreetech/PegasusDemo

  • @rahulnambiar8890
    @rahulnambiar8890 3 years ago

    Aren't these results more like extractive summarization?

    • @RitheshSreenivasan
      @RitheshSreenivasan  3 years ago

      Not exactly, as here the summary text is generated. In extractive summarization you just extract relevant sentences as-is from the text. In some cases, yes, the results look very much like extractive output, and in other cases they look like they have been generated.

  • @patagonia4kvideodrone91
    @patagonia4kvideodrone91 7 months ago

    RUclips's automatic translation is very, very bad, and perhaps your pronunciation is too, although I understood more by listening than by reading the ramblings it generated as a translation. I encourage you to practice your pronunciation, or to transcribe the video with Whisper so you can see for yourself whether it makes sense or not; things like "transformers of the porn border" came out. So imagine.

    • @RitheshSreenivasan
      @RitheshSreenivasan  7 months ago +1

      Can't help it if RUclips's automatic translation is broken. If you can't understand my pronunciation, don't watch the videos.