Why FLAX Could Be Your New Favorite Deep Learning Library for NN

  • Published: 12 Sep 2024

Comments • 14

  • @erfanzarechavoshi909
    a year ago • +1

    thanks for the content

  • @quasimodo1914
    7 months ago

    I'm using ensembles of simple feedforward networks to approximate posterior distributions of a very noisy dataset. It feels like a knife digging into my leg trying to determine appropriate ensemble sizes with slower libraries, but this seems like a savior. Thanks!
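The ensemble-size sweep described above is exactly where JAX's `vmap` tends to shine: stack one set of parameters per member and vectorize a single forward pass over them. A minimal sketch (the names `init_mlp` and `forward` are illustrative, not from the video):

```python
# Hypothetical sketch: an ensemble of small MLPs evaluated in parallel with
# jax.vmap, so changing the ensemble size is just changing n_members.
import jax
import jax.numpy as jnp

def init_mlp(key, in_dim=4, hidden=16, out_dim=1):
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (in_dim, hidden)) * 0.1,
        "b1": jnp.zeros(hidden),
        "w2": jax.random.normal(k2, (hidden, out_dim)) * 0.1,
        "b2": jnp.zeros(out_dim),
    }

def forward(params, x):
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]

# One parameter set per ensemble member, stacked along a leading axis.
n_members = 8
keys = jax.random.split(jax.random.PRNGKey(0), n_members)
ensemble_params = jax.vmap(init_mlp)(keys)

# vmap over the parameter axis only: every member sees the same batch.
x = jnp.ones((32, 4))
preds = jax.vmap(forward, in_axes=(0, None))(ensemble_params, x)
print(preds.shape)  # (8, 32, 1)
```

From here, per-member predictive spread (for the posterior approximation) is just a reduction over the leading axis, e.g. `preds.std(axis=0)`.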

  • @99ag
    a year ago • +1

    Hi,
    Can you please make a video on physics-informed neural networks?

  • @nunoalexandre6408
    a year ago

    Love it!!!!!!!!!!!!!!

  • @hjups
    a year ago

    What are your thoughts on using FLAX for complicated architectures and dynamic architecture variations? I primarily use PyTorch for these tasks, and it seems like the functional programming aspect of JAX/FLAX may work well for very regular architectures but fail for anything too complex. And what about throwing AMP into the mix to improve training speed?
    Those are the main reasons I have avoided migrating workflows over to FLAX, though I do find PyTorch to be slower than it should be.
    Also, how would you expect FLAX to compare to something like TensorRT for transformers (a FLAX transformer on an NVIDIA GPU vs. a TensorRT transformer)?
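On the AMP point: JAX has no single automatic-mixed-precision switch the way `torch.cuda.amp` does; the usual pattern is explicit dtype casting (or a policy library on top of it). A hedged, illustrative sketch of the manual approach, assuming a plain dense layer:

```python
# Hypothetical sketch (not from the video): manual mixed precision in JAX.
# Compute the matmul in bfloat16, keep parameters and outputs in float32.
import jax.numpy as jnp

def mp_dense(w, b, x):
    # Cast inputs and weights down for the expensive op...
    y = x.astype(jnp.bfloat16) @ w.astype(jnp.bfloat16) + b.astype(jnp.bfloat16)
    # ...and cast the result back up so downstream code stays in float32.
    return y.astype(jnp.float32)

w = jnp.ones((4, 2))
b = jnp.zeros(2)
x = jnp.ones((3, 4))
out = mp_dense(w, b, x)
print(out.dtype)  # float32
```

Because casting is explicit, you choose per-layer what runs in low precision, rather than relying on an autocast context.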

    • @code4AI
      a year ago • +1

      A lot of the new AI models I have seen in the literature (from renowned sources that know how to code optimally) use JAX/FLAX, simply because of the costs and time frames associated with training. Second: JAX/FLAX is optimized for Google's TPU clusters, and the cooperation agreement with NVIDIA will give NVIDIA access to this new tech (IPR), which NVIDIA will use in its latest GPUs.

    • @hjups
      a year ago • +1

      @@code4AI Do you mean that they have the manpower to spend on optimizing JAX/FLAX? Or that they want to cost-optimize their training runs?
      I could understand the former, but the latter would be a bigger motivator (from the perspective of a poor researcher in academia with limited funding/compute). All of the recent architectures I have worked with are highly irregular, though, utilizing if-else blocks in the torch forward pass.
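On the if-else concern raised above: in JAX, a Python branch on a static value is resolved once at trace time, while a branch on runtime data goes through `jax.lax.cond`. A minimal sketch (the functions `block` and `gated` are hypothetical examples, not from the video):

```python
# Hedged sketch: the two kinds of branching available under jax.jit.
from functools import partial
import jax
import jax.numpy as jnp

@partial(jax.jit, static_argnames="use_residual")
def block(x, use_residual):
    # Static branch: use_residual is a compile-time flag, so plain Python
    # if/else is fine -- jit retraces per distinct flag value.
    h = jnp.tanh(x)
    return h + x if use_residual else h

@jax.jit
def gated(x):
    # Data-dependent branch: both sides are traced once and
    # jax.lax.cond selects between them at runtime.
    return jax.lax.cond(jnp.mean(x) > 0,
                        lambda v: v * 2.0,
                        lambda v: v * 0.5,
                        x)

x = jnp.ones(3)
print(block(x, use_residual=True))  # tanh(x) + x
print(gated(x))                     # mean(x) > 0, so x * 2.0
```

The practical caveat is that `lax.cond` branches must return matching shapes and dtypes, which is where very irregular architectures can become awkward compared to eager PyTorch.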

  • @user-wr4yl7tx3w
    a year ago • +1

    But doesn’t PyTorch now have something equivalent?

    • @code4AI
      a year ago • +1

      They try to replicate and learn from JAX, but there is a difference between retrofitting an old system and building one from scratch for performance.

  • @gileneusz
    a year ago

    Can you make a video about the Orca LLM? I'm very confused why it's not open source yet...

    • @code4AI
      a year ago

      Already online: ruclips.net/user/shortskrCY9-R_qkA?feature=share

    • @gileneusz
      a year ago

      @@code4AI "this model is so good, so we will never publish it"

    • @code4AI
      a year ago

      It is beneficial to reduce Microsoft's compute costs: if ORCA is trained on a subset of GPT-4 outputs, then ORCA will be capable enough (in intelligence and compute capacity) for a lot of people who just use GPT-4 on their phone for an email. And every million US$ you save (as Microsoft) goes directly to ..... new free services for the community ..... or was it MS corporate profit?

    • @gileneusz
      a year ago

      @@code4AI "Orca could be beneficial for all, which means for our competitors too, so maybe let's better keep it just for us," said someone at M$. Maybe. 😥