Session 8: Fine-Tuning Embedding Models for RAG Systems

  • Published: 3 Dec 2023
  • What you'll learn this session:
    - How to tune open-source embedding models to align with specialized language, like that used in research (see the sketch below)
    Speakers:
    Dr. Greg Loughnane, Founder & CEO, AI Makerspace
    / greglough
    Chris Alexiuk, CTO, AI Makerspace
    / csalexiuk
    Apply for one of our AI Engineering Courses today!
    www.aimakerspace.io/cohorts
  • Science
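As a taste of what the session covers, here is a minimal, hedged sketch of fine-tuning an open-source embedding model on domain-specific (question, passage) pairs with the sentence-transformers library; the model name and the two example pairs are placeholders, not the session's actual dataset:

```python
# Hedged sketch: fine-tune an open-source embedder on domain pairs.
# Model name and training pairs below are placeholders.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("BAAI/bge-small-en-v1.5")  # any open-source embedder

# Hypothetical research-domain pairs: a query and a passage that answers it.
train_examples = [
    InputExample(texts=[
        "What loss does the paper use for retrieval?",
        "We train with a contrastive objective over in-batch negatives...",
    ]),
    InputExample(texts=[
        "How are the embeddings pooled?",
        "Mean pooling over the final hidden states produces the sentence vector...",
    ]),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Other passages in the batch act as negatives: a common choice for
# retrieval-style (RAG) fine-tuning.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
model.save("finetuned-embedder")
```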

Comments • 6

  • @xspydazx
    @xspydazx 3 months ago +1

    Also, I have created some hybrid models based on some of your videos, plus new innovations that were not implemented in the original models. This often makes these models hard to load again, or requires special code to run (i.e., sending the new model with the LLM weights to the transformers loader via the config file, etc.).
    I have also implemented some multimodal models by combining the encoders and decoders from the vision encoder/decoder or the speech encoder/decoder, but again these architectures are rejected by the standard loading code, even though they can still be run and recognized.
    Hmm: is this the right direction, with models such as BakLLaVA and LLaVA (ASR / ViT, etc.)? Can they be incorporated into a single model? Should they be, or should they be kept as "expert" single architectures?
    The mixture of LoRA adapters (as experts) did not work well at all once created! Obviously, once you get over the hurdle of creating models from scratch, you will notice that it does not download a new model; it generates one from the given config file. You can generate a network from its config, so those who cannot download a model due to its size can generate one sized for their specific hardware, since it generates in memory. (This often takes all available memory, so you will need extra headroom to give it a first round of training before converting the model to fp16 in memory and saving it as pretrained. Without training, the tensors are essentially random; when adding new components to the model, it adds them with empty tensors, so they are open for training.)
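    For reference, a minimal sketch of this generate-from-config workflow, assuming the Hugging Face transformers API; the model name and layer count are placeholders:

    ```python
    # Generate a model from its config alone: weights are randomly
    # initialized in memory, and nothing is downloaded except the config.
    from transformers import AutoConfig, AutoModelForCausalLM

    config = AutoConfig.from_pretrained("gpt2")   # fetches only the config file
    config.n_layer = 6                            # shrink to fit your hardware
    model = AutoModelForCausalLM.from_config(config)  # random, untrained tensors

    # After a first round of training, cast down and save:
    model.half()                                  # fp16 in memory
    model.save_pretrained("my-generated-model")
    ```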
    Hmm... I would like to be able to update the LLM, i.e., by taking the documents in a folder, extracting the text, and fine-tuning it in.
    I.e., I suppose the best way would be to inject it as a text dump ~ HOW? (Please!)
    I.e., take the whole text and tune for a single epoch only!
    As well as saving my chat history as an input/response dump: single epoch only.
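    One possible shape of that "text dump, one epoch" idea, sketched under my own assumptions (folder path, model name, and hyperparameters are all placeholders):

    ```python
    # Read every .txt file in a folder and run a single epoch of
    # causal-LM fine-tuning over the dump.
    from pathlib import Path
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.train()
    opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

    texts = [p.read_text() for p in Path("docs/").glob("*.txt")]  # the "dump"
    for text in texts:                    # one pass over the data = one epoch
        enc = tok(text, return_tensors="pt", truncation=True, max_length=512)
        loss = model(**enc, labels=enc["input_ids"]).loss  # standard LM loss
        loss.backward()
        opt.step()
        opt.zero_grad()
    model.save_pretrained("tuned-on-dump")
    ```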
    Question: each time we fine-tune, does it take the last layer, make a copy, train the copy, and replace the last layer? Since the model weights are FROZEN, does this mean they don't get updated? If so, is the LoRA applied to this last layer, essentially replacing it? If we keep replacing the last layer, do we essentially wipe over the previous training?
    I have seen that you can target specific layers. How do you determine which layers to target, and then create the config to match those layers?
    Question: how do we create a strategy for regular tuning without destroying the previous training? Should we be targeting different layers with each fine-tune?
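    For what it's worth, a minimal sketch of how I understand layer targeting in the peft library: the base weights stay frozen, and small adapter matrices are added beside the targeted modules rather than replacing any layer. The model and hyperparameters below are placeholders:

    ```python
    # LoRA adds trainable low-rank adapters next to the targeted modules;
    # the frozen base weights are never copied or overwritten.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    print(model)  # inspect module names to decide which layers to target

    config = LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["c_attn"],  # GPT-2's fused attention projection
        task_type="CAUSAL_LM",
    )
    peft_model = get_peft_model(model, config)
    peft_model.print_trainable_parameters()  # only the adapters are trainable
    ```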
    Also, why can we not tune it LIVE, i.e., while we are talking to it? Or discuss with the model and adjust it whilst talking? Is adjusting the weights done by autograd in PyTorch with an optimizer, i.e., the Adam optimizer? With each turn we can produce the loss from the input by supplying the expected outputs and comparing by similarity, so if the output falls outside a specific threshold it would fine-tune according to the loss (optimize once), i.e., switching between training and evaluation modes (freezing a specific percentage of the model)? I.e., essentially working with a live brain???
    How can we update the LLM with conversation??? By giving it the function (function calling) to execute a single training optimization based on user feedback, i.e., positive and negative votes, and the current response chain? I.e., if RAG was used, then that content should be tuned in?
    Sorry for the long post, but it all connects to the same thing!
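    A speculative sketch of that "live" idea, purely as an illustration and not a recommended technique: one optimizer step per turn on a (prompt, expected response) pair, e.g. triggered by a negative vote. Model name and function are hypothetical:

    ```python
    # One autograd/optimizer step per conversational turn (the "optimize
    # once" idea). Illustration only; online updates like this can
    # destabilize a model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

    def live_update(prompt: str, expected: str) -> float:
        """Run a single training step on one turn, then return to eval mode."""
        model.train()
        enc = tok(prompt + expected, return_tensors="pt")
        loss = model(**enc, labels=enc["input_ids"]).loss
        loss.backward()
        opt.step()
        opt.zero_grad()
        model.eval()                 # back to inference between turns
        return loss.item()

    # e.g. triggered by a downvote:
    # live_update("Q: What is the capital of France?\nA: ", "Paris")
    ```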

  • @issambahri7711
    @issambahri7711 2 months ago +1

    Nice tutorial! I was able to generate the fine-tuned model. Do you have a guide on how to deploy it on Hugging Face Inference Endpoint?

    • @AI-Makerspace
      @AI-Makerspace  2 months ago

      You got it! ruclips.net/user/liveanIBtQNn1G0?si=ODNgv-REJYdj00yF
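      For reference, a hedged sketch of spinning an endpoint up programmatically with huggingface_hub's create_inference_endpoint; the repo name and instance values are placeholders, so check the Inference Endpoints docs for valid vendor/instance combinations for your account:

      ```python
      # Deploy a pushed model repo to a Hugging Face Inference Endpoint.
      # All names and instance values below are placeholders.
      from huggingface_hub import create_inference_endpoint

      endpoint = create_inference_endpoint(
          "my-finetuned-embedder",                    # endpoint name
          repository="your-username/finetuned-model", # the pushed model repo
          framework="pytorch",
          task="feature-extraction",                  # embedding models
          vendor="aws",
          region="us-east-1",
          accelerator="cpu",
          instance_size="x2",
          instance_type="intel-icl",
      )
      endpoint.wait()      # block until the endpoint is running
      print(endpoint.url)
      ```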

  • @AI-Makerspace
    @AI-Makerspace  8 months ago +2

    Colab: colab.research.google.com/drive/1TDiWZtb6gsM9wVXCLQrR-7OEPaQ2n-JA?usp=sharing
    Slides: canva.com/design/DAF13u5Vnys/YVTHe4mrt0Gb66glSYyVQg/edit?DAF13u5Vnys&

  • @ScriptureFirst
    @ScriptureFirst 5 months ago +2

    How about a foreign language? 😬 Will this work for adding a previously unsupported language?

    • @AI-Makerspace
      @AI-Makerspace  5 months ago +2

      This could help an embedding model better understand a new language - yes.