AI Forum 2023 | The Small Models Revolution

  • Published: 30 May 2024
  • I will discuss a new method we are pioneering at Microsoft Research to build smaller language models that exhibit many of the properties of the largest language models, such as ChatGPT. The focus will be on our latest model, phi-1.5, a 1-billion-parameter model that can rival competitors with 10 billion or more parameters.
    Learn more about the AI Forum 2023 hosted by Microsoft Research Asia in collaboration with The University of Tokyo: www.microsoft.com/en-us/resea...
  • Science

Comments • 5

  • @bwhit7919 2 months ago +1

    This is brilliant. It’s tough to beat GPT-4, but if you make smaller, specialized models, I think it would be possible to beat GPT-4 on certain benchmarks. That’s what I hope the tech industry starts doing, especially since 90% of the time I use ChatGPT it’s to write computer code.

  • @khangvutien2538 4 months ago +2

    1. I love the relaxed but precise style of this presentation.
    2. What we are learning here reminds me of my engineering thesis on PCA in 1975: to get significant eigenvectors, it is better to filter the data for meaningful samples 😅 or else there’s plenty of noise that wastes computation time.
    3. Question: in view of the NYT's lawsuit against Microsoft and OpenAI, how can you make sure that the synthetic textbook-quality content generated automatically by GPT-4 to train Phi-2 doesn’t contain sentences that could invite litigation?

  • @que_93 5 months ago +2

    This is brilliant work and gives us so much to think about and work on. I guess "size doesn't always matter". Pun intended. And I am glad that you have made phi open-source. Thank you!

  • @ShubhamSinghYoutube 5 months ago +3

    How do you ensure that the textbook-quality scoring by GPT-4 and GPT-3.5 is reliable and accurate?

  • @jeetmajumdar7588 5 months ago +1

    SLMs are good for individual purposes, but why aren't you building a GPT-4-like LLM? Google just launched its GPT-4 killer, Gemini. I hope Microsoft will also come up with a multimodal language model.