Simple reverse-mode Autodiff in Python

  • Published: 5 Oct 2024

Comments • 12

  • @sfdv1147
    @sfdv1147 1 year ago +2

    Very clear explanation! Thanks and hope you'll get more views

  • @houkensjtu
    @houkensjtu 1 year ago +1

    Thank you for the clear explanation! I wonder where the term "cotangent" comes from? A Google search shows it comes from differential geometry; do I need to learn differential geometry to understand it ...?

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago

      You're welcome 🤗
      Glad you liked it. The term cotangent is indeed borrowed from differential geometry. If you are using reverse-mode autodiff to compute the derivative of a scalar-valued loss, you can think of the cotangent associated with a node in the computational graph as the derivative of the loss w.r.t. that node. More abstractly, it is just the auxiliary quantity associated with each node.
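
      A minimal sketch of that idea on a simple chain (hypothetical code, not the video's exact implementation): the forward pass stores the intermediate values, and the backward pass propagates cotangents from the loss back to the input.

      import math

      def forward_and_backward(x):
          # Forward pass: compute and keep the intermediates
          a = x * x          # first primitive
          b = math.sin(a)    # second primitive
          y = 3.0 * b        # scalar "loss"

          # Backward pass: each cotangent is d(loss)/d(node),
          # seeded with d(loss)/d(loss) = 1.0
          y_cot = 1.0
          b_cot = y_cot * 3.0          # dy/db
          a_cot = b_cot * math.cos(a)  # dy/da = dy/db * db/da
          x_cot = a_cot * 2.0 * x      # dy/dx = dy/da * da/dx
          return y, x_cot

      value, grad = forward_and_backward(1.5)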

  • @zhenlanwang1760
    @zhenlanwang1760 1 year ago

    Thanks for the great content as always. One question and one comment. How would you handle it if the computational graph is a DAG instead of a chain? Any references (book/paper) that you can share? I note that for symbolic differentiation, you pay the price of redundant calculation (quadratic in the length of the chain) but with constant memory. On the other hand, autodiff caches the intermediate values and has linear calculation but also linear memory.

    • @MachineLearningSimulation
      @MachineLearningSimulation  8 months ago +2

      Hi,
      Thanks for the kind comment :), and big apologies for the delayed reply. I just started working my way through a longer backlog of comments; it's been a bit busy in my private life these past months.
      Regarding your question: for DAGs, the approach is to either record a separate Wengert list or to overload operations; see the small sketch after this reply. It's a bit hard to find a true taxonomy for this because different autodiff engines all have their own style (source transformation at various stages in the compiler/interpreter chain vs. pure operator overloading in the high-level language, restrictions to certain high-level linear algebra operations, etc.). I hope I can finish the video series with some examples of it in the coming months.
      These are some links that directly come to my mind: The "Autodidact" repo is one of the earlier tutorials on simple (NumPy-based) autodiff engines in Python, written by Matthew Johnson (co-author of the famous HIPS autograd package): github.com/mattjj/autodidact . The HIPS autograd authors are also involved in the modern JAX package (featured quite often on the channel). There is a similar tutorial called "Autodidax": jax.readthedocs.io/en/latest/autodidax.html
      The micrograd package by Andrej Karpathy is also very insightful: github.com/karpathy/micrograd . It is written from a "PyTorch-like" perspective. His video on "Becoming a Backprop Ninja" can also be helpful.
      In the Julia world: you might find the documentation of the "Yota.jl" package helpful: dfdx.github.io/Yota.jl/dev/design/
      Hope that gave some first resources. :)
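
      A minimal operator-overloading sketch of the DAG case (hypothetical code in the spirit of micrograd, not the API of any package linked above): each node records its parents together with the local derivatives, and a reverse topological sweep accumulates cotangents; fan-out is handled by summing (+=) into the parents' gradients.

      class Var:
          def __init__(self, value, parents=()):
              self.value = value
              self.parents = parents  # pairs of (parent node, local derivative)
              self.grad = 0.0

          def __mul__(self, other):
              return Var(self.value * other.value,
                         parents=((self, other.value), (other, self.value)))

          def __add__(self, other):
              return Var(self.value + other.value,
                         parents=((self, 1.0), (other, 1.0)))

          def backward(self):
              # Topologically order the DAG, then sweep it in reverse
              order, seen = [], set()

              def visit(node):
                  if node not in seen:
                      seen.add(node)
                      for parent, _ in node.parents:
                          visit(parent)
                      order.append(node)

              visit(self)
              self.grad = 1.0  # seed: d(output)/d(output)
              for node in reversed(order):
                  for parent, local in node.parents:
                      parent.grad += node.grad * local  # accumulate over fan-out

      x = Var(2.0)
      y = x * x + x      # x is used twice -> a DAG, not a chain
      y.backward()
      print(x.grad)      # dy/dx = 2*x + 1 = 5.0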

  • @harikrishnanb7273
    @harikrishnanb7273 1 year ago +1

    Can you please share your recommended resources for learning math? Or how did you learn math?

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago +1

      Hi,
      it's a great question, but very hard to answer. I can't pin it down to one approach, one textbook, etc.
      I have an engineering math background (I studied mechanical engineering for my bachelor's degree). Generally speaking, I prefer the approach taken in engineering math classes, which is more algorithmically focused than theorem-proof focused. Over the course of my undergrad, I used various YouTube resources (which also motivated me to start this channel). The majority were in German; some English-language ones include the vector calculus videos by Khan Academy (which were done by Grant Sanderson) and of course 3b1b.
      For my graduate education, I found that I really liked reading documentation and seeing the API interfaces of various numerical computer programs. JAX and TensorFlow have amazing docs. This is also helpful for PDE simulations. Usually, I guide myself with Google, forum posts, and a general sense of curiosity. 😊

    • @harikrishnanb7273
      @harikrishnanb7273 1 year ago +1

      @@MachineLearningSimulation thanks for the reply

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago +1

      You're welcome 😊
      Good luck with your learning journey

  • @oioisexymlaoy
    @oioisexymlaoy 1 year ago +1

    Hi, thanks for the video. At 4:20 you say you link to some videos in the top right, but I do not see them.

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago

      You're welcome 😊
      Thanks for catching that; I will add the link later today. It should have been this video: ruclips.net/video/Agr-ozXtsOU/видео.html
      More generally, there is also a playlist with a larger collection of rules (also for tensor-level autodiff): ruclips.net/p/PLISXH-iEM4Jn3SEi07q8MJmDD6BaMWlJE
      And I have started collecting them on a simple, accessible website (let me know if you spot an error there; that's still an early version): fkoehler.site/autodiff-table/
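
      As an illustration of the kind of rule collected in such a table (a standard matrix-multiplication pullback, written here as a sketch rather than in the table's exact notation): for C = A @ B with an incoming cotangent C_cot, the propagated cotangents are A_cot = C_cot @ B.T and B_cot = A.T @ C_cot. A quick NumPy check:

      import numpy as np

      rng = np.random.default_rng(0)
      A = rng.normal(size=(3, 4))
      B = rng.normal(size=(4, 2))
      C_cot = rng.normal(size=(3, 2))  # cotangent flowing in from the scalar loss

      A_cot = C_cot @ B.T              # pullback w.r.t. the left factor
      B_cot = A.T @ C_cot              # pullback w.r.t. the right factor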