Accelerate Transformer inference on CPU with Optimum and ONNX

  • Published: 16 Sep 2024

Comments • 14

  • @anabildea9274 • 1 year ago +1

    Thank you for sharing! Great content!

  • @geekyprogrammer4831 • 1 year ago

    Thanks a lot for creating this video. It saved me a month of work!

  • @youssefbenhachem993 • 1 year ago

    To the point! Great explanation, thanks 😀

  • @Gerald-iz7mv • 4 months ago

    How do you export to ONNX using CUDA? It seems Optimum doesn't support it. Is there an alternative?

    • @juliensimonfr • 4 months ago

      huggingface.co/docs/optimum/onnxruntime/usage_guides/gpu
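
      Following that guide, a minimal sketch of GPU inference with Optimum's ONNX Runtime integration (the model name is only an example, and the onnxruntime-gpu package is assumed to be installed):

          # Export to ONNX on the fly and run on GPU via CUDAExecutionProvider
          from transformers import AutoTokenizer
          from optimum.onnxruntime import ORTModelForSequenceClassification
          from optimum.pipelines import pipeline

          model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # example model
          ort_model = ORTModelForSequenceClassification.from_pretrained(
              model_id,
              export=True,                       # convert to ONNX while loading
              provider="CUDAExecutionProvider",  # place the session on the GPU
          )
          tokenizer = AutoTokenizer.from_pretrained(model_id)
          pipe = pipeline("text-classification", model=ort_model, tokenizer=tokenizer, device="cuda:0")
          print(pipe("ONNX Runtime on GPU is fast."))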

  • @TheBontenbal • 6 months ago

    I am trying to follow along, but there have been many updates to the code, so unfortunately I'm hitting many errors.

    • @juliensimonfr • 6 months ago

      Docs and examples here: huggingface.co/docs/optimum/onnxruntime/overview
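
      For readers following along, a minimal sketch of the export-and-infer flow those docs describe, assuming a recent Optimum release (the model name is only an example):

          # Export a Hub model to ONNX and run CPU inference with ONNX Runtime
          from transformers import AutoTokenizer, pipeline
          from optimum.onnxruntime import ORTModelForSequenceClassification

          model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # example model
          model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
          tokenizer = AutoTokenizer.from_pretrained(model_id)
          clf = pipeline("text-classification", model=model, tokenizer=tokenizer)
          print(clf("Optimum makes CPU inference fast."))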

  • @ahlamhusni6258 • 1 year ago

    Are there any optimization methods that apply to the word2vec 2.0 model, and can I apply the methods from this video to word2vec 2.0?

    • @juliensimonfr • 1 year ago

      Hi, Word2Vec isn't based on the Transformer architecture. You should take a look at Sentence Transformers; they're a good way to get started with Transformer embeddings: huggingface.co/blog/getting-started-with-embeddings (a minimal sketch follows below).
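
      A minimal sketch of that getting-started path, assuming the sentence-transformers package (the model name is just an example):

          from sentence_transformers import SentenceTransformer

          # Any Sentence Transformers checkpoint from the Hub works here
          model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
          embeddings = model.encode(["Transformers are a great fit for embeddings."])
          print(embeddings.shape)  # (1, 384) for this particular model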

    • @ibrahimamin474 • 9 months ago

      @juliensimonfr I think he meant wav2vec 2.0.

  • @Gerald-xg3rq • 4 months ago

    What's the difference between setfit.exporters.onnx and optimum.onnxruntime (e.g. loading with ORTModelForFeatureExtraction.from_pretrained(...) and then optimizing with ORTOptimizer, as sketched below)?
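
    For context, the optimum.onnxruntime path mentioned above looks roughly like this (a sketch assuming the current Optimum API; the model name is an example):

        from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTOptimizer
        from optimum.onnxruntime.configuration import OptimizationConfig

        # Export to ONNX, then let ONNX Runtime apply graph optimizations
        model = ORTModelForFeatureExtraction.from_pretrained(
            "sentence-transformers/all-MiniLM-L6-v2", export=True
        )
        optimizer = ORTOptimizer.from_pretrained(model)
        optimizer.optimize(
            save_dir="onnx_optimized",
            optimization_config=OptimizationConfig(optimization_level=2),
        )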