ML Frameworks: Hugging Face Accelerate w/ Sylvain Gugger

  • Published: 7 Jan 2025

Comments • 8

  • @marcomagliulo2966
    @marcomagliulo2966 1 month ago

    Wonderful video IMHO.
    I still have to figure out how to use Accelerate to run multi-node/multi-GPU training and inference, though.
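
    For anyone else in the same spot, here is a minimal sketch of what that usually looks like: the same script runs on every node and `accelerate launch` wires up the processes. The toy model, node counts, and addresses below are placeholders, not anything from the video.

    ```
    # train.py -- launched on each machine; e.g. for 2 nodes with 4 GPUs each (placeholder values):
    #   accelerate launch --num_machines 2 --num_processes 8 --machine_rank <0 or 1> \
    #       --main_process_ip <node0-ip> --main_process_port 29500 train.py
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from accelerate import Accelerator

    accelerator = Accelerator()

    # Toy model and data so the sketch is self-contained
    model = torch.nn.Linear(10, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    dataset = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))
    dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

    # prepare() moves everything to the right device and wraps the model for distributed training
    model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

    model.train()
    for epoch in range(3):
        for inputs, targets in dataloader:
            optimizer.zero_grad()
            loss = torch.nn.functional.mse_loss(model(inputs), targets)
            accelerator.backward(loss)  # replaces loss.backward()
            optimizer.step()
        accelerator.print(f"epoch {epoch} done")  # prints only on the main process
    ```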

  • @raynardzhang4986
    @raynardzhang4986 2 years ago +1

    The video is great; the best part is around 8:27 to 22:30.

  • @bobsalita3417
    @bobsalita3417 3 years ago +6

    I marked this video to be watched later as I don't have enough time now. The thing is, I rarely have time later, as there's an endless stream of worthy ML videos each week. Your target audience is most likely people like me. This is the moment to learn about HF Accelerate, but it will pass because of the video's length. One solution is to post links to few-minute explainer videos. For example, Fireship's 100-second videos are hugely popular.

    • @rahulkadam3305
      @rahulkadam3305 3 years ago +1

      yes, that might be good

    • @wgabrys88
      @wgabrys88 2 months ago

      You have time; sit and learn, it's just one hour. Remember Elon's rocket catch? He also makes cars and dances (I've tried the short videos, but meh, they need explanation ❤)

  • @benak495
    @benak495 1 year ago

    Thanks for the great video. Does Accelerate work with Windows? I cannot find any information about that, and it doesn't work on my Windows PC.

  • @brandomiranda6703
    @brandomiranda6703 1 year ago

    I think it's important to clarify **explicitly** how the code changes if you use an HF Trainer/SFTTrainer. This is my best guess, assuming Trainer is its own special wrapper to train your model:
    ```
    from transformers import (
        GPT2LMHeadModel,
        GPT2TokenizerFast,
        TrainingArguments,
        Trainer,
        DataCollatorForLanguageModeling,
    )
    from accelerate import Accelerator
    from datasets import load_dataset

    # Initialize accelerator
    accelerator = Accelerator()

    # Load a dataset
    dataset = load_dataset('text', data_files={'train': 'train.txt', 'test': 'test.txt'})

    # Tokenization (GPT-2 has no pad token, so reuse the EOS token for padding)
    tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')
    tokenizer.pad_token = tokenizer.eos_token

    def tokenize_function(examples):
        # We are doing causal (unidirectional) language modelling
        return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=512)

    tokenized_datasets = dataset.map(tokenize_function, batched=True)

    # Set the columns to be used in training
    tokenized_datasets.set_format("torch", columns=["input_ids", "attention_mask"])

    # Split the dataset into train and test
    train_dataset = tokenized_datasets["train"]
    test_dataset = tokenized_datasets["test"]

    # Initialize model
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Prepare everything with our `accelerator`.
    model, train_dataset, test_dataset = accelerator.prepare(model, train_dataset, test_dataset)

    # Collator that builds `labels` from `input_ids` for causal LM training
    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    # Define training arguments
    training_args = TrainingArguments(
        output_dir='./results',
        num_train_epochs=3,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=64,
        warmup_steps=500,
        weight_decay=0.01,
        logging_dir='./logs',
        prediction_loss_only=True,  # In language modelling, we only care about the loss
    )

    # Create the trainer
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=test_dataset,
        data_collator=data_collator,
    )

    # Train the model
    trainer.train()
    ```