TRAIN YOUR OWN AI - For Beginners ! Finetune Any LLM for Free (No VRAM) ft. Llama 3.2

  • Published: 16 Nov 2024

Comments • 10

  • @TheZEN2011 · 1 month ago

    The best model training video currently out there!

  • @SteveRogers-q6c · 1 month ago · +2

    This is probably the best fine-tuning video I have come across so far. It is very detailed, but the only thing missing is a "custom dataset". Can you please make a video on how to make a custom dataset, I mean the correct format and everything we should follow to make our own "custom dataset", and also, if possible, do it for the latest Llama 3.2 3B model? Please also show us where and how to upload the dataset for fine-tuning after making it. Please make it very detailed.

  • 1 month ago

    🎉 thanks!

  • @deathfxu · 1 month ago · +1

    What if we want to add multiple datasets to the training? Do we run that code block multiple times with a different URL put in each time, or will that break it?

    • @xclbrxtra · 1 month ago

      What I would suggest is to run one dataset first and, once the fine-tuned model is saved in your Colab, use that location in place of the original Hugging Face model link to train on another dataset. This way you would have both LLMs, (1) trained on a single dataset and (2) trained on both datasets, and can compare them to see whether the model is being overtrained.
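
The chaining described in this reply can be sketched in plain Python as a plan of stages, where each stage loads from wherever the previous stage saved its model. This is only an illustration of the bookkeeping, not the notebook's training code: `Stage`, `plan_stages`, the `finetuned_stage_N` folder names, and the `"base-model-id"` placeholder are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    model_source: str  # Hub ID or local folder to load the model from
    dataset_id: str    # dataset to fine-tune on in this stage
    output_dir: str    # folder where this stage's model is saved


def plan_stages(base_model: str, dataset_ids: list[str]) -> list[Stage]:
    """Chain stages so each one starts from the previous stage's saved model."""
    stages = []
    source = base_model
    for i, dataset_id in enumerate(dataset_ids, start=1):
        output_dir = f"finetuned_stage_{i}"  # hypothetical folder name
        stages.append(Stage(source, dataset_id, output_dir))
        source = output_dir  # the next stage loads the model saved here
    return stages


stages = plan_stages(
    "base-model-id",  # placeholder for the Hugging Face model link used in the notebook
    ["gbharti/finance-alpaca", "vicgalle/alpaca-gpt4"],
)
for stage in stages:
    print(f"load {stage.model_source} -> train on {stage.dataset_id} -> save to {stage.output_dir}")
```

Keeping every stage's output folder is what makes the comparison possible: the first folder holds the single-dataset model and the second holds the model trained on both.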

    • @deathfxu · 1 month ago

      @xclbrxtra Thanks for the fast response. Alternatively, you can change the code at the end to this to combine any number of datasets:
      from datasets import load_dataset, concatenate_datasets

      # Load the train split of each Alpaca-style dataset.
      dataset1 = load_dataset("gbharti/finance-alpaca", split="train")
      dataset2 = load_dataset("practical-dreamer/RPGPT_PublicDomain-alpaca", split="train")
      dataset3 = load_dataset("vicgalle/alpaca-gpt4", split="train")
      dataset4 = load_dataset("iamtarun/python_code_instructions_18k_alpaca", split="train")
      # Merge them into one dataset; concatenate_datasets expects all of them to share the same columns.
      dataset = concatenate_datasets([dataset1, dataset2, dataset3, dataset4])
      # Apply the notebook's prompt formatter to every batch.
      dataset = dataset.map(formatting_prompts_func, batched=True)
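
The `formatting_prompts_func` used above is not shown anywhere in this thread. A minimal sketch of what such a batched formatter might look like follows, assuming Alpaca-style `instruction`, `input`, and `output` columns. The prompt template and the `EOS_TOKEN` placeholder are assumptions, and the notebook's own version may differ.

```python
# Assumed Alpaca-style template; the notebook's actual wording may differ.
ALPACA_PROMPT = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)
EOS_TOKEN = "</s>"  # placeholder; a real notebook would take this from its tokenizer


def formatting_prompts_func(examples):
    """Turn a batch (a dict of column lists) into one training string per row."""
    texts = []
    for instruction, input_text, output in zip(
        examples["instruction"], examples["input"], examples["output"]
    ):
        text = ALPACA_PROMPT.format(
            instruction=instruction, input=input_text, output=output
        )
        texts.append(text + EOS_TOKEN)  # end-of-sequence marker after each example
    return {"text": texts}


# A one-row batch in the shape that a batched map call passes to the function.
batch = {
    "instruction": ["Add the numbers."],
    "input": ["2 and 3"],
    "output": ["5"],
}
formatted = formatting_prompts_func(batch)
print(formatted["text"][0])
```

The sample batch also shows the row shape a custom Alpaca-style dataset would need: one `instruction`, one `input` (possibly empty), and one `output` per example.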

    • @deathfxu · 1 month ago · +1

      @xclbrxtra Or just delete my comment with a code workaround... I even made sure the example used non-overlapping datasets to prevent overtraining. But thanks for nothing...

  • @twofii9.Official · 18 days ago

    Can I do that on mobile? 😢

    • @xclbrxtra · 18 days ago

      Yes, it uses Google Colab, so it's online.

  • @AdventurousKing · 1 month ago

    I've wanted a video like this for a long time 🫡, thanks, sir ❤