Llama 3.2 3b Review Self Hosted Ai Testing on Ollama - Open Source LLM Review

  • Published: 29 Sep 2024

Comments • 26

  • @jasonabc
    @jasonabc 1 day ago +2

    Did one test: uploaded a PDF and asked it to summarize the document. It spit out gibberish, not even on the same topic as the paper. So obviously, how could I begin to trust anything from this if it fails something that simple?

  • @IlllIlllIlllIlll
    @IlllIlllIlllIlll 3 days ago +1

    16:00 synthetic as in synthetic data?

  • @michaelwpannekoek
    @michaelwpannekoek 2 days ago

    What *top program are you running under WSL2?

  • @zurgmuckerberg
    @zurgmuckerberg 23 hours ago

    Will you test the 11b one?

  • @shuntera
    @shuntera 1 day ago

    What are you running in the PowerShell window in the top right that shows your GPU status?

    • @DigitalSpaceport
      @DigitalSpaceport  1 day ago

      I'm actually SSH'd into the host computer (guide here: ruclips.net/video/TmNSDkjDTOs/видео.html) and running NVTOP.
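For anyone replicating that monitoring setup, it amounts to two commands (the user and hostname here are hypothetical, and NVTOP must be installed on the GPU host):

```shell
ssh user@gpu-host    # hypothetical address of the machine with the GPUs
nvtop                # terminal GPU monitor: utilization, VRAM, power draw
```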

  • @ugurinanc5177
    @ugurinanc5177 3 days ago +1

    How can you quantize that model? And also, can we fine-tune a model we downloaded from Ollama?

    • @DigitalSpaceport
      @DigitalSpaceport  3 days ago

      This is a good jumping-off point for fine-tuning. It answers your questions; in short, you can, and the -q flag in Ollama can create the specific quant you want. The Hugging Face fine-tune adapter framework and safetensors adapter support exist for Llama 3.1, so I would assume they do for 3.2 as well. That's a good starting point for you. github.com/ollama/ollama/blob/main/docs/import.md
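The import doc linked above covers that workflow; a minimal command sketch, assuming Ollama is installed locally and using hypothetical model names (a Modelfile pointing at your weights is required, and a command fragment like this has no runnable test without a local install):

```shell
# Import a model from a Modelfile (names here are hypothetical):
ollama create my-llama3.2 -f Modelfile
# Create a quantized variant with the quantize flag:
ollama create my-llama3.2-q4 -f Modelfile --quantize q4_K_M
ollama run my-llama3.2-q4
```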

  • @bbertram2
    @bbertram2 8 hours ago

    It got it right for me... (or so I thought, lol)
    If A is equal to number 0, what is the number of M, S and z?
    llama3.2:latest
    Based on the standard ordering of the alphabet, where A is indeed equal to 0:
    M = 13
    S = 19
    Z = 26
    Let me know if you have any further questions!

    • @DigitalSpaceport
      @DigitalSpaceport  7 hours ago

      If A=0, then B=1...M=12...S=18...z=25
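The zero-indexed mapping in this reply is easy to verify mechanically; a trivial sketch (`letter_value` is my own name for the helper, not from the thread):

```python
def letter_value(ch: str) -> int:
    """Value of a letter when A = 0 (case-insensitive)."""
    return ord(ch.upper()) - ord("A")

for ch in "MSZ":
    print(f"{ch} = {letter_value(ch)}")
# M = 12, S = 18, Z = 25
```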

    • @bbertram2
      @bbertram2 7 hours ago

      @@DigitalSpaceport hahahaha....yeah. Oh well.

    • @DigitalSpaceport
      @DigitalSpaceport  7 hours ago

      I was surprised that a few other models did get this right, and I've noticed it also tracks closely with the letter-counting question.

    • @bbertram2
      @bbertram2 7 hours ago +1

      I went back and asked it again: same answer. Tried 3.1, could not get it either. However, I gave it one clue, A=0 and B=1, and boom, it got it. Probably too easy of a clue, but I'm surprised it could not answer it. I asked Claude and ChatGPT... they couldn't get it either... very odd. Good question!

    • @DigitalSpaceport
      @DigitalSpaceport  7 hours ago

      Qwen 2.5 here - ruclips.net/video/dOrgIn2ztvY/видео.htmlsi=mb33EAbMjXk55YC3&t=555

  • @DeepThinker193
    @DeepThinker193 3 days ago +2

    This is the result of AI inbreeding, aka training on synthetic data. I have a prompt that gets counting etc. consistently accurate on Llama 3.1 8b. However, the 3.2 models get things wrong all the time.

    • @DigitalSpaceport
      @DigitalSpaceport  3 days ago

      Oh, that's a great term! AI inbreeding 😅

    • @mayankmaurya8631
      @mayankmaurya8631 2 days ago +2

      You sure it wasn't because of 3b vs 8b?

    • @DeepThinker193
      @DeepThinker193 2 days ago

      @@mayankmaurya8631 Nope, I tried my prompts on the 3.2 11b and 3.2 90b as well. They're just inferior and keep getting things wrong. I get consistently correct responses from Llama 3.1 8b using my special prompts.

  • @DavidVincentSSM
    @DavidVincentSSM 3 days ago

    I agree that the results don't seem to match the benchmarks in real-world performance... maybe it's something everyone else is missing?

    • @DigitalSpaceport
      @DigitalSpaceport  3 days ago

      @@DavidVincentSSM I'm not trying to LARP as a pro or anything, but I am interested in what makes for a good product. I'm thinking less and less that benchmarks make for a good product.

  • @mcunumberone613
    @mcunumberone613 3 days ago +1

    Is there any possibility of earning money with this model?

  • @_s.i.s.u.
    @_s.i.s.u. 3 days ago

    You're prompting the model wrong. The "strawberry" tests fail due to the _tokenization_ methods of the given model. Prompt it as if you wanted to place the sentence into an array, then ask your third word second letter. It won't fail.

    • @DigitalSpaceport
      @DigitalSpaceport  3 days ago +1

      @@_s.i.s.u. Other models I've tested, including Qwen 2.5, can and do nail that exact question, copy-pasted. If a question has to be asked in a specific way to elicit a correct response, that is a failure.
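The array-style prompting described above boils down to deterministic string operations that plain code gets right every time, which is why it sidesteps tokenization. A sketch (the example sentence and the letter-count check are mine, not from the thread):

```python
# Split the sentence into a word array first, then index into it,
# mirroring the "place the sentence into an array" framing.
sentence = "the quick brown fox"
words = sentence.split()        # ["the", "quick", "brown", "fox"]
third_word = words[2]           # "brown" (third word, 0-indexed list)
second_letter = third_word[1]   # "r"
print(second_letter)            # r

# The classic "strawberry" letter count, done deterministically:
print("strawberry".count("r"))  # 3
```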