Qwen SmallThinker (3B) : This NEW Small Reasoning MODEL IS AMAZING! (Opensource & Local)

  • Published: 6 Jan 2025

Comments • 45

  • @theaerogr
    @theaerogr 1 day ago +24

    Having small local models that can think for as long as we want is amazing.

  • @TimeLordRaps
    @TimeLordRaps 1 day ago +12

    Your channel makes me think I don't need the other ones, tbh. Great quality, quick explanations, and the most up-to-date information, fastest.

  • @alankerrigan
    @alankerrigan 1 day ago +6

    Thanks for the excellent-quality videos, which consistently deliver cutting-edge AI coverage.

    • @AICodeKing
      @AICodeKing  1 day ago +2

      Thanks a lot for the support!

  • @OscarTheStrategist
    @OscarTheStrategist 1 day ago +1

    I appreciate you being transparent and also to the point when it comes to the testing. Not wasting anyone’s time, and delivering excellent info.
    Thanks 🎉

  • @erik....
    @erik.... 1 day ago +4

    I'd love to see a model that I can give a task and leave it for a few hours to think about it. It could surf the web to find info and so on, just like a human would.

    • @icakinser
      @icakinser 1 day ago

      Agent Zero AI will do that for you, lol.

  • @Kevencebazile
    @Kevencebazile 1 day ago +2

    Love your content. I review so many channels, but you're the best, hands down. Two Minute Papers is also a great one. Thank you, and I wish you a beautiful day.

  • @danield9368
    @danield9368 1 day ago +3

    Thanks, this great little model comes at just the right time. We mustn't leave the future of AI exclusively in the hands of the GAFAM, or we'll become their slaves.

  • @BBZ101
    @BBZ101 1 day ago

    What are the system requirements to run this?

    • @Ancient1341
      @Ancient1341 1 day ago +1

      3 GB of RAM if you use 8-bit quantization.

    • @BBZ101
      @BBZ101 1 day ago

      @Ancient1341 I have 8 GB of VRAM, an i7, and 16 GB of RAM; is that good enough?

    • @Ancient1341
      @Ancient1341 1 day ago

      @BBZ101 Any modern phone is enough, so yeah.

    • @PhuPhillipTrinh
      @PhuPhillipTrinh 1 day ago

      He said an M1 Mac with 8 GB will do for this.
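    The RAM figures in this thread follow from a standard back-of-the-envelope estimate: parameter count × bits per weight ÷ 8, plus some runtime overhead for the KV cache and activations. A minimal sketch (the 20% overhead factor is an assumption, not a measured value):

    ```python
    # Back-of-the-envelope memory estimate for a 3B-parameter model.
    # bytes per weight = bits / 8; the 20% overhead for KV cache and
    # activations is an assumed ballpark, not a measurement.

    def model_memory_gb(n_params: float, bits: int, overhead: float = 0.2) -> float:
        """Estimated RAM in GB to hold the weights plus runtime overhead."""
        weight_bytes = n_params * bits / 8
        return weight_bytes * (1 + overhead) / 1e9

    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit: ~{model_memory_gb(3e9, bits):.1f} GB")
    ```

    At 8 bits the weights alone are about 3 GB, which matches the figure quoted above; an 8 GB M1 Mac or a modern phone clears that bar comfortably.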

  • @DouhaveaBugatti
    @DouhaveaBugatti 1 day ago

    Hey King 👑! Can you tell me which AI tool you're using to create the super awesome animation intro you use these days?

  • @jtjames79
    @jtjames79 1 day ago +1

    I would like to see the AIs get a chance to correct themselves.
    Can they get it right after being told they got it wrong the first time?
    I don't expect people to get things right the first time, every time; it's very important that they can correct their mistakes, though.
    Small tangent: I really wish they would separate arithmetic from the rest of math in AI. Have a tiny model that can run a calculator, and a big model committed just to algorithms and such. It seems like AI is more likely to make a mistake doing the math than to come up with the correct math to do. /rant

  • @bungrudi
    @bungrudi 1 day ago

    Hi King, what interface are you using? Is that OpenWebUI?

  • @theaerogr
    @theaerogr 1 day ago +3

    I really believe that if companies trained 3B models with the same compute as the big models (meaning passing over the data multiple times), they would be very competitive with the big ones.
    For example, a 3B mixture-of-experts model with 6-8 experts would be great.
    We need mixture-of-experts scaling laws.

    • @Gamatoto2038
      @Gamatoto2038 1 day ago

      Idk, the bigger the model, the more it can learn.

    • @CODE7X
      @CODE7X 1 day ago +3

      Small models are good when you make them do one particular thing very well, but they aren't a very generalized version of everything in the world.

    • @theaerogr
      @theaerogr 1 day ago

      @@CODE7X I don't need an agent to do PhD physics or law. I need a fast, cheap agent that proposes code at minimum cost so it can do that automatically 24/7.

    • @theaerogr
      @theaerogr 1 day ago

      @@Gamatoto2038 MoE has solved this, as have scaling laws (which are for old-school LLMs).

  • @shopon-hossen
    @shopon-hossen 1 day ago

    Can anyone tell me what he uses for the local LLM GUI?

    • @AICodeKing
      @AICodeKing  1 day ago

      It's OpenWebUI

    • @shopon-hossen
      @shopon-hossen 1 day ago

      @@AICodeKing Thanks, bro! And I really like your video concept.

  • @NimishChaudhari
    @NimishChaudhari 1 day ago

    Any idea how we can make use of this with DeepSeek in Cline? The idea being that SmallThinker generates the (pseudo)algorithm and DeepSeek implements it.

    • @abelsun2191
      @abelsun2191 1 day ago

      aider can do that

    • @spol
      @spol 11 hours ago

      not useful for code

  • @nhtdmr
    @nhtdmr 1 day ago +1

    Please check the Hermes Llama 3.2 3B model and compare it with other small LMs.

  • @AB-cd5gd
    @AB-cd5gd 1 day ago +1

    3B is not enough. I'd prefer nestmind thinkflow, which has 11B; I think that's a great balance. Also, deepthink is not that bad.

  • @ultimategolfarchives4746
    @ultimategolfarchives4746 1 day ago

    Pass or fail doesn't mean much. Analyzing the chain-of-thought capabilities would be more interesting with these thinking models.

  • @kevinwebber1746
    @kevinwebber1746 1 day ago

    Winner🎉

  • @fawazyahya2425
    @fawazyahya2425 1 day ago

    Make a video about how to make an application for phones (iOS, Android) with no code, as an AI assistant based on my files as a knowledge base.

  • @algo2trade690
    @algo2trade690 1 day ago

    Can you create a video on how to run a small agentic SLM or LLM on a 4 GB RTX 3050, or on 40 GB of RAM with an i5 10th-gen CPU? I was trying to understand how one can create a small agentic code-generating model that knows only TypeScript, JavaScript, and Deno JS and can work with bolt.diy. All your videos only have testing and info about different models. Can we have something that can be practically implemented, to learn more and customize? I don't see such videos, e.g., how to choose or create datasets, train your own model, and use it locally.

    • @ashgtd
      @ashgtd 2 hours ago

      I'm rocking the same thing. 3B models are usually pretty great.

  • @Hypersniper05
    @Hypersniper05 1 day ago

    I have constructive feedback for your next video. Most model reviewers on here do not take into account what the model was designed for and try to benchmark it with general-purpose questions. I am assuming most of your audience uses these models for other useful things instead of chatting.
    Here is what I would love to see: test the model in a few benchmark categories like:
    -Creative writing
    -Instruction following
    -Needle in the haystack
    -Function calling
    -Censorship
    -Coding
    -Logic
    -Reasoning
    -Multi-turn chat
    -RAG
    In addition, no one talks about the hyperparameters, like temperature and top-p, to name a few. They can affect the output you are looking for. Usually the benchmarks bring the temperature down close to 0.
    A chart showing the model's performance with 1-8 bit quants would be amazing. Most people run a lower quant.
    I know it will require a lot of work, but it will really boost transparency; right now everyone is trying to hype every model without honest benchmarks.
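    The point above about temperature and top-p is easy to illustrate. Below is a minimal, self-contained sketch (plain Python, not tied to any particular inference library) of how the two parameters shape sampling from a model's output logits: temperature rescales the distribution, and top-p (nucleus sampling) restricts sampling to the smallest set of tokens covering that much probability mass.

    ```python
    import math
    import random

    def sample(logits, temperature=1.0, top_p=1.0):
        """Sample a token index from logits with temperature and top-p filtering."""
        # Temperature scaling: lower temperature sharpens the distribution.
        scaled = [l / max(temperature, 1e-8) for l in logits]
        m = max(scaled)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]

        # Top-p: keep the smallest set of tokens whose cumulative
        # probability (in descending order) reaches top_p.
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        kept, cum = [], 0.0
        for i in order:
            kept.append(i)
            cum += probs[i]
            if cum >= top_p:
                break

        # Renormalize over the kept tokens and draw one.
        norm = sum(probs[i] for i in kept)
        r, acc = random.random() * norm, 0.0
        for i in kept:
            acc += probs[i]
            if r <= acc:
                return i
        return kept[-1]

    # Near-zero temperature with a small top_p makes sampling effectively greedy:
    logits = [2.0, 1.0, 0.5, -1.0]
    print(sample(logits, temperature=0.01, top_p=0.1))  # index 0 dominates
    ```

    This is why benchmark runs pin the temperature near 0: it removes sampling randomness, so a pass/fail result reflects the model rather than the dice.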

  • @electrophy
    @electrophy 1 day ago

    Thinks

  • @danield9368
    @danield9368 1 day ago

    Thanks!

    • @AICodeKing
      @AICodeKing  1 day ago +1

      Thanks a lot for the support!

  • @ZidanNextGen
    @ZidanNextGen 1 day ago

    Hi

  • @다루루
    @다루루 1 day ago

    🤗

  • @ChefBrianCooks
    @ChefBrianCooks 19 hours ago

    These small models are useless and pointless.

    • @ashgtd
      @ashgtd 2 hours ago

      100% not true. Laptops with a 3050 and mobile devices can run these, and they are really smart for their size.