Gemma-2 (2B) : Google's New SMALL Model is GOOD OR REALLY BAD? (Fully Tested)

  • Published: 8 Sep 2024
  • Science

Comments • 31

  • @marcograziano7308 A month ago +10

    Thank you for the very informative videos. Big fan.

    • @AICodeKing A month ago

      Thanks a lot for your support. It's amazing to see people like my content.

  • @Historypress-pq4ng A month ago +13

    I WISH CLAUDE WOULD MAKE AN UPGRADE, THAT WOULD BE WILD

  • @jackflash6377 A month ago +3

    Thanks for the tests, very revealing.

  • @8eck A month ago +2

    It's just a marketing issue. That score win is considered a success...

  • @jlyunior A month ago +1

    Thanks for the video! Totally needed!

  • @blisphul8084 A month ago

    For its size, it's an excellent model. For example, it can do multilingual translations very well and define each word used. It struggles if your prompt includes complex instructions, so you really need to give it simple, easy-to-follow prompts. For my use case, all local LLMs have struggled with a certain part of the task (furigana), so the fact that this model can't really do it doesn't put it at a disadvantage compared to something like Llama 8b, and I just needed to implement that lookup with a local dictionary instead. This lookup means the transliterations are now more reliable than with GPT-4o, since they aren't relying on the knowledge of an LLM, so my program is better for using the 2b model. And it runs fast and locally. This model is weaker and less flexible than 3.5 Turbo, but if you can play to its strengths, it can be extremely powerful.
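    The dictionary-lookup approach described above can be sketched roughly as follows. The `FURIGANA` table and `add_furigana` helper are hypothetical toy names; a real setup would load readings from a full local dictionary rather than a hard-coded table.

    ```python
    # Sketch: instead of asking the LLM for furigana (kana readings), look
    # them up in a local dictionary, so the transliteration does not depend
    # on the model's knowledge. The tiny table below is a toy example.
    FURIGANA = {
        "漢字": "かんじ",
        "日本語": "にほんご",
    }

    def add_furigana(word: str) -> str:
        reading = FURIGANA.get(word)
        # Fall back to the bare word if the dictionary has no entry.
        return f"{word}({reading})" if reading else word

    print(add_furigana("漢字"))   # 漢字(かんじ)
    print(add_furigana("未知"))   # no entry, returned unchanged
    ```

    Because the readings come from a fixed table rather than model output, the same input always produces the same transliteration.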

    • @blisphul8084 A month ago

      Additionally, a nice prompting trick to speed things up on slow hardware is to put the variable part at the very end, because LM Studio (and llama.cpp?) cache prompts up to the first part that changed. By placing the changed part of the prompt as late as possible, response times stay fast even when running on an 8th-gen laptop i7. Can't wait until we have this model in BitNet (the 1.58-bit quant without matrix multiplication). It'll probably run at 50+ t/s on CPU, based on early testing with a smaller model. Ultra-fast, broad-knowledge intelligence on all platforms is just around the corner.

    • @user-uo2io2mj3v 17 days ago

      This kind of model could be used as my online identity.

  • @brandon1902 A month ago

    Qwen2 1.5b is far worse than Gemma2 2b. For example, on my general knowledge test, which includes pop culture questions about movies, music, games, TV shows, sports and so on, it only scored 23.2 vs 43.1 for Gemma2 2b, compared to 74.8 for Mixtral 8x7b. And this pronounced inferiority exists across nearly every test category. For example, Qwen2 1.5b not only can't write prompted poems that rhyme; they're basically just bad stories with short sentences. And its stories can't even follow a small list of prompt directives and are filled with absurdities and contradictions. In contrast, Gemma2 2b can write poems and stories. They're not great, but vastly better than Qwen2 1.5b's. Even with simple math like 3333+777, Qwen2 1.5b returned 4100 instead of 4110.
    I of course agree that it's insane comparing Gemma2 2b to Mixtral or GPT-3.5, both of which have about 100x more unique information and are far better at coding, math, story writing, problem solving, and basically every other task. However, both Phi3 mini and Gemma2 2b are vastly better than Qwen2 1.5b.

  • @paulyflynn A month ago +1

    Trash
    Tree is not a number - FAIL
    IMO this misleading marketing is evil.

  • @AnugrahPrahasta A month ago

    I'm afraid Google should learn from others. They have the data, but...

  • @MeinDeutschkurs A month ago

    What purpose is the model trained for? Daily conversations?

  • @utvikler-no A month ago

    Thanks

  • @jekkleegrace A month ago

    Thanks!

  • @SpikyRoss A month ago +1

    Google is such a liar
    Well, even their Gemini is not that good, tbf

    • @user-uo2io2mj3v 17 days ago

      Why do you say that?

    • @SpikyRoss 17 days ago

      @@user-uo2io2mj3v Gemini is good for long context but reasoning is subpar

  • @anaskhan-lz2hk A month ago

    Gemma 2 2b is good at textual output like story writing, but it is bad at reasoning, maths, and programming.

    • @user-uo2io2mj3v 17 days ago

      That doesn't matter; there are some AI plugins that are quite useful too.

  • @mrouquin A month ago

    I just don't understand the purpose of this small model. It's not great anywhere. And it's not multilingual; for example, in my tests French is not supported in the answers.

    • @user-uo2io2mj3v 17 days ago

      Chatting with others, or combining it with speech recognition and generation models to chat with AI characters in games about stories of destroying humanity; that could become the script of the next movie about a new future.

  • @dung5990 A month ago

    Can you test the new AI image generator, Flux?

    • @user-uo2io2mj3v 17 days ago

      Flux is very good. A model that rivals MJ and SD like this is really impressive. But SD didn't move further in the direction of image generation this time; instead, it pivoted toward pure music generation.

  • @adriintoborf8116 A month ago

    These kinds of strategies from Google cost it credibility and are embarrassing.

    • @user-uo2io2mj3v 17 days ago

      Google's former vice president is the most impressive one.

  • @AbuBakr1 A month ago

    The worst model indeed