Microsoft's Phi-3 VISION: This NEW Opensource TINY Vision Model beats GPT-4O, Claude-3 & Gemini

  • Published: 18 Nov 2024
  • Science

Comments • 19

  • @messagefromgiri 6 months ago +4

    Thanks!

    • @AICodeKing 6 months ago +1

      Thanks for the support! This makes all my work worth it! Thanks for the appreciation.

    • @NathanChambers 5 months ago +2

      @AICodeKing How much of a donation is needed for the AI voice to speak at a normal speed and sound less like a `creepy stalker`? :)

    • @jekkleegrace 5 months ago

      @NathanChambers I like the voice. Perhaps a bidding war?

  • @richardadonnell 6 months ago +4

    🎯 Key Takeaways for quick navigation:
    00:00 *Introduction and mention of Microsoft's new open-source Vision model, Phi-3 Vision.*
    00:13 *Phi-3 Vision is part of Microsoft's Phi-3 model family, announced at the Microsoft Build 2024 conference.*
    00:38 *Phi-3 Vision is a 4.2 billion parameter multimodal model supporting a 128k context limit for long conversations.*
    02:24 *Trained on 500 billion tokens with 512 Nvidia H100 GPUs, focusing on high-quality reasoning data.*
    02:54 *Outperforms GPT-4 Vision in multiple benchmarks despite its smaller size.*
    03:46 *Beats larger models in the MMBench and ScienceQA benchmarks, demonstrating impressive performance.*
    04:55 *Consistently outperforms Claude 3 and Gemini models in various benchmarks.*
    06:37 *Available for use on Hugging Face and Azure AI Studio, with potential for on-device inference.*
    09:29 *Initial testing shows it excels at code generation and data conversion tasks but struggles with more complex reasoning questions.*
    13:44 *Overall, Phi-3 Vision is a promising model for lightweight AI applications, performing well even on mobile devices.*
    Made with HARPA AI

    • @MichaelDomer 5 months ago

      Learn to communicate, or is AI going to do that for you instead from now on?

  • @moulics 5 months ago

    There cannot be a better summary than this.

  • @techsostip 6 months ago +1

    Here's the rephrased sentence:
    Artificial intelligence is advancing rapidly, and I lack the resources to evaluate every available model (No Time)

  • @buckyzona 5 months ago

    Could I run inference with this model on every frame of a 200k-frame video using only my CPU? A CPU is currently fine for my own OCR model, which is trained on 3,000 images. Or do I need a GPU?

    • @AICodeKing 5 months ago +1

      I think you can do that.

  • @prathmeshvhatkar8801 6 months ago +1

    Which app are you using GPT-4o in?

  • @ndidiahiakwo7412 5 months ago

    Can one access Phi-3 online without running the software locally on one's own computer?

    • @AICodeKing 5 months ago

      Yes. You can access it for free on Azure AI Studio and Hugging Face chat.

  • @timothywcrane 5 months ago

    The image description function is too tightly wound ... almost all of my public domain fine art examples simply came back as "unable to describe as contains inappropriate material". I couldn't summarize an A&E afternoon special with this! Not trolling. I have yet to try the other Phi-3 use cases. I have heard that the LLM is good for function handling (not sure about direct calling) with the help of "Guidance" from MS. I am working on media production, and these are my two most useful use cases.

  • @NathanChambers 5 months ago +3

    I just did some testing and it seems this really sucks, because it always forces any software suggestions to be PAID AND BY MICROSOFT. Go figure... but it's total BS if you want to ask it actual questions, since it will force itself to give you pro-Microsoft answers only and block competitors. No matter how many times I said "free", it would suggest paid Microsoft products :/ Sticking with Llama and Mistral!

    • @timothywcrane 5 months ago

      I may have to hit the playground and check this out... for defense or exploit... every bug is a feature... this is and always has been the MS philosophy...

  • @Pennytechnews 6 months ago

    Great info👍🏿

  • @mittanuljak4747 5 months ago

    Maybe it is good at science because they optimized it for Khan AI, to be an assistant for US teachers?

  • @christianweyer74 5 months ago

    Nice video, thanks! Do you also provide the code for the demo you showed? @AICodeKing