Bert Score for Contextual Similarity for RAG Evaluation

  • Published: 27 Oct 2024

Comments • 12

  • @manishsharma2211
    @manishsharma2211 10 months ago +1

    Quick correction: ROUGE is used primarily for summarization and BLEU for translation [timestamp: 4:04]

    • @AIAnytime
      @AIAnytime  10 months ago

      My bad, I was in a rush. Thanks

  • @SonGoku-pc7jl
    @SonGoku-pc7jl 11 months ago

    Thanks for everything! :) Tomorrow I will watch your other videos on evaluating LLMs and RAG :)

  • @gotitgotya
    @gotitgotya 2 months ago

    Great work, man... thank you so much for uploading such informative videos ❤❤

  • @VenkatesanVenkat-fd4hg
    @VenkatesanVenkat-fd4hg 11 months ago

    Thanks for the valuable video, as usual. Waiting for one on multimodal and unstructured files / applying RAG

  • @soumilyade1057
    @soumilyade1057 6 months ago

    The library that you have used mentions use of custom models in point 3 of the README, but there's no parameter named "model" or "num_layers". I was wondering if you have figured out what's going on there.

  • @ShreyaSingh-wp9yk
    @ShreyaSingh-wp9yk 11 months ago

    Thanks for uploading the video. One quick question: how is it different from ROUGE, BLEU, and METEOR, since they are also recall- and precision-based? Can we use ROUGE, METEOR, and BERTScore if I am evaluating a chatbot, and why? Please excuse me if this question sounds naive; I am very new to this.
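    On the precision/recall question above: ROUGE, BLEU, and METEOR count exact (or stemmed) n-gram overlaps, whereas BERTScore computes token-level precision and recall by greedily matching contextual embeddings with cosine similarity, so paraphrases can still score well. A minimal numpy sketch of that matching step, using toy 2-d vectors in place of real BERT token embeddings (not the actual bert-score library):

    ```python
    import numpy as np

    def bertscore_prf(cand_emb, ref_emb):
        """Greedy-matching BERTScore-style P/R/F1 on precomputed token embeddings.

        cand_emb: (k, d) array, one row per candidate token
        ref_emb:  (m, d) array, one row per reference token
        """
        # Normalize rows so dot products become cosine similarities.
        c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
        r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
        sim = c @ r.T                       # (k, m) pairwise cosine similarities
        precision = sim.max(axis=1).mean()  # each candidate token -> best reference match
        recall = sim.max(axis=0).mean()     # each reference token -> best candidate match
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    # Toy embeddings: the first two reference tokens exactly match the two
    # candidate tokens; the third reference token is only partially covered.
    cand = np.array([[1.0, 0.0], [0.0, 1.0]])
    ref = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    p, r, f1 = bertscore_prf(cand, ref)  # p = 1.0, r ≈ 0.90, f1 ≈ 0.95
    ```

    Because matching is done in embedding space, a chatbot answer phrased differently from the reference can still earn high precision/recall, which exact n-gram metrics would penalize.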

  • @ArpitBhavsar-z2c
    @ArpitBhavsar-z2c 11 months ago +1

    Though, in the second example, it gave 85% similarity.
    Weird, right?

    • @CibeSridharanK
      @CibeSridharanK 5 months ago +2

      Exactly, that's why his reaction is strange

    • @encianhoratiu5301
      @encianhoratiu5301 11 days ago

      Yeah, not so good for evaluating LLM generation... I don't blame him; perhaps there isn't a better way.
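    The 85% result discussed above is a known property of raw BERTScore: cosine similarities between contextual embeddings cluster in a narrow high band, so even loosely related sentences often land around 0.8. The bert-score library addresses this with a `rescale_with_baseline=True` option that linearly maps an empirical baseline (the average score of random sentence pairs) to 0. A sketch of that rescaling arithmetic, with a hypothetical baseline value of 0.83:

    ```python
    def rescale(raw, baseline):
        # Linear rescaling used to spread raw BERTScore values over a wider range:
        # the baseline score maps to 0, while a perfect score of 1 stays at 1.
        return (raw - baseline) / (1.0 - baseline)

    # With a baseline of 0.83, a "weird" raw 0.85 becomes a small rescaled score.
    rescaled = rescale(0.85, 0.83)  # ≈ 0.118
    ```

    Rescaled scores can be negative or small for unrelated text, which makes them much easier to read than raw values; the ranking of candidates is unchanged because the transform is monotonic.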

  • @livelaughmotivate94
    @livelaughmotivate94 7 months ago

    For text-generation RAG, BERTScore won't work?