Merge LLMs using Mergekit: Create your own Medical Mixture of Experts

  • Published: 11 Sep 2024
  • In this tutorial video titled "Merge LLMs using Mergekit: Create your own Medical Mixture of Experts," I'll guide you through the cutting-edge technique of model merging using the mergekit library. This method allows for the combination of multiple Large Language Models (LLMs) into a single, more powerful model without the need for expensive GPU resources.
    First, I'll walk you through installing mergekit and setting up your environment to ensure you're ready to start merging models right away. You'll learn how to use the mergekit-yaml script to define your merge operations through a YAML configuration file.
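    As a concrete reference, installation typically looks like the sketch below; the exact repository URL and branch may differ from what is used in the video (the video clones a specific branch rather than main), so treat the paths here as assumptions.

```bash
# Minimal setup sketch: install mergekit from source.
# Repo URL and branch are assumptions; adjust to match the branch used in the video.
git clone https://github.com/arcee-ai/mergekit.git
cd mergekit
pip install -e .
```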
    We'll delve into practical examples to demonstrate how you can utilize mergekit to assemble a bespoke model. Whether you're aiming for a simple linear merge of multiple models or a more complex recombination of model layers, I'll provide step-by-step instructions to achieve your objectives.
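    For the simple linear-merge case, a mergekit-yaml configuration might look like the following sketch. The model names, weights, and output path are illustrative placeholders, not necessarily the exact values used in the video.

```yaml
# linear_merge.yml -- weighted average of two 7B models (placeholder names)
models:
  - model: mistralai/Mistral-7B-Instruct-v0.2   # general-purpose model (assumption)
    parameters:
      weight: 0.5
  - model: BioMistral/BioMistral-7B             # medical model (assumption)
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```

```bash
# Run the merge; the merged weights are written to ./merged-model
mergekit-yaml linear_merge.yml ./merged-model
```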
    By the end of this tutorial, you'll be equipped with the knowledge to create your own Medical Mixture of Experts using mergekit. This innovative approach not only broadens the scope of possible applications for LLMs in medical and healthcare domains but also positions you at the forefront of AI model development.
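    For the Mixture-of-Experts build itself, mergekit's MoE script takes a config that names a shared base model plus a set of expert models, each with positive prompts that bias the router toward that expert. The models and prompts below are illustrative assumptions for a medical MoE, not the exact ones from the video.

```yaml
# medical_moe.yml -- illustrative Mixture-of-Experts config (placeholder models)
base_model: mistralai/Mistral-7B-Instruct-v0.2     # shared base model (assumption)
experts:
  - source_model: BioMistral/BioMistral-7B         # medical expert (assumption)
    positive_prompts:
      - "What are the symptoms of"
      - "clinical diagnosis and treatment"
  - source_model: mistralai/Mistral-7B-Instruct-v0.2
    positive_prompts:
      - "general conversation"
      - "summarize the following text"
```

```bash
# Build the sparse MoE model from the dense experts
mergekit-moe medical_moe.yml ./medical-moe
```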
    Remember, merging models is a powerful tool, and with great power comes great responsibility. Always ensure the ethical use of merged models, especially in sensitive fields like healthcare.
    And don't forget to like, comment, and subscribe to stay updated with more tutorials on leveraging AI and machine learning technologies to solve real-world problems.
    Join this channel to get access to perks:
    / @aianytime
    To further support the channel, you can contribute via the following methods:
    Bitcoin Address: 32zhmo5T9jvu8gJDGW3LTuKBM1KPMHoCsW
    UPI: sonu1000raw@ybl
    GitHub Repo: github.com/AIA...
    #llm #ai #generativeai

Comments • 27

  • @harshyadav1190 6 months ago +1

    Every video brings a fresh hairstyle, just like the knowledge you share! 😄 Thanks for keeping us informed

    • @AIAnytime 6 months ago

      Haha... Thank you sir 🙏

  • @voulieav 6 months ago +1

    Wow, what a fantastic repository you have there my man, well done and thanks for sharing the videos. I will build some of these to help me learn. Thanks!

  • @amanjain6687 3 months ago

    Nice work. Can you make a video on how multimodal models work internally?

  • @tintintintin576 6 months ago

    Bro, in your repo, HF credentials were present. I hope you've disabled your old credentials.
    Great work. Very helpful.

  • @mcmarvin7843 6 months ago

    Glad to see you sharing your knowledge.

  • @sushicommander 6 months ago +1

    Great video.... although I'm curious, why did you clone the mixtral branch instead of the main branch?

  • @jdoejdoe6161 6 months ago +1

    Great! Please do a video on collecting a custom dataset and fine-tuning multi-modal models.

    • @AIAnytime 6 months ago

      Thanks for the idea!

  • @tech_ocean777 6 months ago

    Great content. Thank you so much for the videos

    • @AIAnytime 6 months ago

      Glad you like them!

  • @sandeepsasikumar701 5 months ago

    Hi,
    If I merge three models of 7B parameters, what will be the parameter size of the final merged model?

  • @JokerJarvis-cy2sw 6 months ago +1

    Please do a tutorial on the Llava vision model with cv2 to get info about live camera objects, using Llava via the Replicate API. Please!

  • @Ankur-be7dz 5 months ago

    Can we use mergekit for models which are not in a Hugging Face repo?

  • @xspydazx 6 months ago

    The prompts used, could they be the same type of prompts that you use for agents? i.e. listing skills and expectations, step-by-step thinking, adding multiple prompts?

  • @jeffg4686 6 months ago

    I was just thinking about this the other day.
    Like, if a bunch of robots are trained to work on cars in different mechanic shops (using human feedback), and then all of the individual NNs from the robots are merged together.

  • @Canna_Science_and_Technology 6 months ago

    The routing LLM handles the mix of LLMs in Mixtral 8x7B.

  • @flyingsnow1357 4 months ago

    Can I run this script in normal Google Colab (without Pro)?

  • @SAVONASOTTERRANEASEGRETA 5 months ago

    How do you then quantize the file to GGUF?

  • @xspydazx 6 months ago

    OK, I completed making a model, LeroyDyer/Mistral_WhiteHatCoder_Base_Instruct_Moe_3x7b, but I could not get it to load in Colab after using your full script. Can you help with a load-model script?

  • @user-iu4id3eh1x 6 months ago

    Thanks for this video... Can I merge encoder-only models?

  • @siliconberry 6 months ago

    I like your new hairstyle! :) :)

    • @AIAnytime 6 months ago

      Haha... I liked it too

  • @flyingsnow1357 4 months ago

    The merged model is only generating garbage 😁😁