LLM Prompt Engineering with Random Sampling: Temperature, Top-k, Top-p

  • Published: 7 Sep 2024

Comments • 23

  • @datamlistic
    @datamlistic  7 months ago

    Wondering how you can fine-tune LLMs? Take a look here to see how this is done with LoRa, a popular fine-tuning mechanism: ruclips.net/video/CNmsM6JGJz0/видео.html
    Video mistakes:
    - At 2:30 the sum should be for j, not for i. Thanks @mriz for noticing this!
    - The probability distribution after selecting top-3 words at 4:10 is not accurate, and they should be sunny - 0.46, rainy - 0.38, the - 0.15. Thanks @koiRitwikHai for noticing this!
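The corrected numbers in the pinned comment come from keeping the top-3 words and renormalizing their probabilities so they sum to 1. A minimal sketch of that procedure (the raw probabilities below are hypothetical, chosen only to land near the corrected values):

```python
# Hypothetical raw next-word probabilities before truncation.
probs = {"sunny": 0.40, "rainy": 0.33, "the": 0.13, "good": 0.09, "cat": 0.05}

# Keep the k most probable words, then renormalize within that window.
k = 3
top_k = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
total = sum(top_k.values())
renormalized = {w: p / total for w, p in top_k.items()}
print(renormalized)  # roughly sunny ~0.46, rainy ~0.38, the ~0.15
```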

  • @user-jx5or8pk2m
    @user-jx5or8pk2m 3 months ago +1

    Thanks! Top p and Top k were easy to understand.

    • @datamlistic
      @datamlistic  3 months ago

      You're welcome! I'm glad to hear that those concepts were clear and easy to understand. If you have any more questions or need further clarification on this topic, feel free to ask! :)

  • @stev__8881
    @stev__8881 4 months ago

    Great introduction with a clear and simple explanation/illustration. Thanks!

    • @datamlistic
      @datamlistic  4 months ago

      Thanks! Glad you found it helpful! :)

  • @waiitwhaat
    @waiitwhaat 4 months ago

    This is a really clear explanation of this concept. Loved it. Thanks!

    • @datamlistic
      @datamlistic  4 months ago +1

      Thanks! Happy to hear that you liked the explanation! :)

  • @starsmaker9964
    @starsmaker9964 2 months ago

    video helped me a lot! thanks

  • @igordias8728
    @igordias8728 6 months ago +1

    Hello, in top-p, which of the 4 words will be chosen? Is it picked randomly between "sunny", "rainy", "the" and "good"?

    • @datamlistic
      @datamlistic  6 months ago +1

      Yes, it's random according to their distribution.

    • @Annaonawave
      @Annaonawave 5 months ago +1

      @@datamlistic so they are randomly selected, but higher probable values have higher chance of being selected?

    • @datamlistic
      @datamlistic  5 months ago +1

      @@Annaonawave exactly :)
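The exchange above can be made concrete with a short sketch of top-p (nucleus) sampling: keep the smallest set of words whose cumulative probability reaches p, then draw one word at random weighted by its probability, so likelier words win more often. The word list and probabilities below are hypothetical.

```python
import random

def top_p_sample(probs, p=0.9, rng=random):
    # Rank words by probability, highest first.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    # Keep the smallest prefix whose cumulative probability reaches p.
    nucleus, cumulative = [], 0.0
    for word, prob in ranked:
        nucleus.append((word, prob))
        cumulative += prob
        if cumulative >= p:
            break
    # Weighted random draw from the surviving words.
    words, weights = zip(*nucleus)
    return rng.choices(words, weights=weights, k=1)[0]

probs = {"sunny": 0.40, "rainy": 0.33, "the": 0.13, "good": 0.09, "cat": 0.05}
print(top_p_sample(probs, p=0.9))
```

With p=0.9 and these numbers, the nucleus is "sunny", "rainy", "the" and "good" (cumulative 0.95), and "cat" can never be chosen.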

  • @matthakimi3132
    @matthakimi3132 1 month ago

    Hi there, this was a great introduction. I am working on a recommendation query using Gemini; would you be able to help me fine-tune for the optimal topK and topP? I am looking for an expert in this to be an advisor to my team.

    • @datamlistic
      @datamlistic  1 month ago

      Unfortunately my time is very tight right now since I am working full time as well, so I can't commit to anything extra. I could however help you with some advice if you can provide more info.

  • @nizhsound
    @nizhsound 7 months ago

    Thank you for the video and the explanation of the three types of sampling for LLMs. When sampling with temperature, top-k, and top-p, are you using or enabling all three sampling methods at the same time?
    For example, if I chose to do top-k sampling for controlled diversity and reduced nonsense, does that mean that I will choose a low temperature as well?

    • @datamlistic
      @datamlistic  7 months ago

      Glad it was helpful! Yes, you can combine multiple sampling methods at the same time. :)
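One common way to chain the three methods, as the reply above suggests, is: temperature rescales the logits, top-k truncates the vocabulary, top-p truncates it further, and one token is sampled from what survives. A minimal sketch under those assumptions (the vocabulary and logit values are hypothetical):

```python
import math
import random

def sample(logits, temperature=0.8, top_k=4, top_p=0.95, rng=random):
    # 1. Temperature: divide logits before the softmax.
    scaled = {w: l / temperature for w, l in logits.items()}
    # 2. Softmax over all words.
    z = sum(math.exp(l) for l in scaled.values())
    probs = {w: math.exp(l) / z for w, l in scaled.items()}
    # 3. Top-k: keep only the k most probable words.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # 4. Top-p: keep the smallest prefix reaching cumulative probability p.
    nucleus, cumulative = [], 0.0
    for word, prob in ranked:
        nucleus.append((word, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    # 5. Weighted random draw from whatever survived both filters.
    words, weights = zip(*nucleus)
    return rng.choices(words, weights=weights, k=1)[0]

logits = {"sunny": 2.0, "rainy": 1.8, "the": 0.9, "good": 0.5, "cat": -0.3}
print(sample(logits))
```

Lowering the temperature sharpens the distribution before the top-k/top-p filters apply, so the three knobs interact but remain independent settings.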

  • @varadarajraghavendrasrujan3210
    @varadarajraghavendrasrujan3210 3 months ago

    Let's say I use top_k=4, does the model sample 1 word out of the 4 most probable words randomly? If not, what happens?

    • @datamlistic
      @datamlistic  3 months ago

      That's exactly what happens! The model samples 1 word out of the 4 most probable, according to their distribution (i.e. the higher the probability of a word, the more likely it is to be sampled).
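The behaviour described in that reply can be checked empirically: keep the 4 most probable words, draw many weighted samples, and observe that likelier words appear more often while words outside the top-4 never appear. The word list and probabilities here are hypothetical.

```python
import random
from collections import Counter

probs = {"sunny": 0.40, "rainy": 0.33, "the": 0.13, "good": 0.09, "cat": 0.05}

# top_k = 4: keep the four most probable words only.
top_4 = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:4]
words, weights = zip(*top_4)

# Draw 10,000 weighted samples and count the outcomes.
rng = random.Random(0)  # fixed seed so the demo is reproducible
counts = Counter(rng.choices(words, weights=weights, k=10_000))
print(counts.most_common())  # "sunny" dominates; "cat" never appears
```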

  • @koiRitwikHai
    @koiRitwikHai 7 months ago

    The probability distribution you get after selecting top-3 words at 4:10 is not accurate. The probabilities, after normalizing the 3-word-window, should be sunny-0.46, rainy-0.38, and the-0.15.

    • @datamlistic
      @datamlistic  7 months ago +1

      Yep, that's correct. Thanks for the feedback! I created/recorded the video over a longer period of time, and it seems I used two versions of the numbers in doing that (and forgot to make any updates). I'm sorry if this has caused any confusion. I will add some corrections about this issue in the description/pinned comment.
      p.s. Maybe it would be a good idea to round up one of the probabilities you enumerated, so they sum to 1.

  • @mriz
    @mriz 6 months ago

    2:30
    bro you're wrong, the sum is not over the input i, but over j

    • @datamlistic
      @datamlistic  6 months ago +1

      Yep, that's correct. Thanks for the feedback and sorry if this confused you! I will add a note about this mistake in the pinned comment. :)
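For readers following this correction: in softmax(x_i) = exp(x_i) / sum_j exp(x_j), the normalizing sum runs over the index j of every logit, not over the input index i. A minimal sketch with hypothetical logit values:

```python
import math

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)  # the sum over j: every entry, not just entry i
    return [e / total for e in exps]

print(softmax([2.0, 1.0, 0.1]))  # probabilities summing to 1
```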