Simple reverse-mode Autodiff in Python

  • Published: 5 Oct 2024

Comments • 12

  • @sfdv1147
    @sfdv1147 1 year ago +2

    Very clear explanation! Thanks and hope you'll get more views

  • @houkensjtu
    @houkensjtu 1 year ago +1

    Thank you for the clear explanation! I wonder where the term "cotangent" comes from? A Google search shows it comes from differential geometry; do I need to learn differential geometry to understand it ...?

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago

      You're welcome 🤗
      Glad you liked it. The term cotangent is indeed borrowed from differential geometry. If you are using reverse-mode autodiff to compute the derivative of a scalar-valued loss, you can think of the cotangent associated with a node in the computational graph as the derivative of the loss w.r.t. that node. More abstractly, it is just the auxiliary quantity associated with each node.
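
      A minimal sketch of that idea on a simple chain (hypothetical code, not the video's exact implementation): the forward pass stores the intermediate values, and the backward pass propagates cotangents from the loss back to the input.

      import math

      def forward_and_backward(x):
          # Forward pass: compute and keep the intermediates
          a = x * x          # first primitive
          b = math.sin(a)    # second primitive
          y = 3.0 * b        # scalar "loss"

          # Backward pass: each cotangent is d(loss)/d(node),
          # seeded with d(loss)/d(loss) = 1.0
          y_cot = 1.0
          b_cot = y_cot * 3.0          # dy/db
          a_cot = b_cot * math.cos(a)  # dy/da = dy/db * db/da
          x_cot = a_cot * 2.0 * x      # dy/dx = dy/da * da/dx
          return y, x_cot

      value, grad = forward_and_backward(1.5)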

  • @zhenlanwang1760
    @zhenlanwang1760 1 year ago

    Thanks for the great content as always. One question and one comment. How would you handle it if the computational graph is a DAG instead of a chain? Any references (book/paper) that you can share? I note that for symbolic differentiation, you pay the price of redundant calculation (quadratic in the length of the chain) but with constant memory. On the other hand, autodiff caches the intermediate values and has linear calculation but also linear memory.

    • @MachineLearningSimulation
      @MachineLearningSimulation  8 months ago +2

      Hi,
      Thanks for the kind comment :), and big apologies for the delayed reply. I just started working my way through a longer backlog of comments; it's been a bit busy in my private life these past months.
      Regarding your question: for DAGs, the approach is to either record a separate Wengert list or to overload operations; see the small sketch after this reply. It's a bit hard to find a true taxonomy for this because different autodiff engines all have their own style (source transformation at various stages in the compiler/interpreter chain vs. pure operator overloading in the high-level language, restrictions to certain high-level linear algebra operations, etc.). I hope I can finish the video series with some examples of it in the coming months.
      These are some links that directly come to my mind: The "Autodidact" repo is one of the earlier tutorials on simple (NumPy-based) autodiff engines in Python, written by Matthew Johnson (co-author of the famous HIPS autograd package): github.com/mattjj/autodidact . The HIPS autograd authors are also involved in the modern JAX package (featured quite often on the channel). There is a similar tutorial called "Autodidax": jax.readthedocs.io/en/latest/autodidax.html
      The micrograd package by Andrej Karpathy is also very insightful: github.com/karpathy/micrograd . It is written from a "PyTorch-like" perspective. His video on "Becoming a Backprop Ninja" can also be helpful.
      In the Julia world: you might find the documentation of the "Yota.jl" package helpful: dfdx.github.io/Yota.jl/dev/design/
      Hope that gave some first resources. :)
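
      A minimal operator-overloading sketch of the DAG case (hypothetical code in the spirit of micrograd, not the API of any package linked above): each node records its parents together with the local derivatives, and a reverse topological sweep accumulates cotangents; fan-out is handled by summing (+=) into the parents' gradients.

      class Var:
          def __init__(self, value, parents=()):
              self.value = value
              self.parents = parents  # pairs of (parent node, local derivative)
              self.grad = 0.0

          def __mul__(self, other):
              return Var(self.value * other.value,
                         parents=((self, other.value), (other, self.value)))

          def __add__(self, other):
              return Var(self.value + other.value,
                         parents=((self, 1.0), (other, 1.0)))

          def backward(self):
              # Topologically order the DAG, then sweep it in reverse
              order, seen = [], set()

              def visit(node):
                  if node not in seen:
                      seen.add(node)
                      for parent, _ in node.parents:
                          visit(parent)
                      order.append(node)

              visit(self)
              self.grad = 1.0  # seed: d(output)/d(output)
              for node in reversed(order):
                  for parent, local in node.parents:
                      parent.grad += node.grad * local  # accumulate over fan-out

      x = Var(2.0)
      y = x * x + x      # x is used twice -> a DAG, not a chain
      y.backward()
      print(x.grad)      # dy/dx = 2*x + 1 = 5.0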

  • @harikrishnanb7273
    @harikrishnanb7273 1 year ago +1

    Can you please share your recommended resources for learning math? Or how did you learn math?

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago +1

      Hi,
      it's a great question, but very hard to answer. I can't pin it down to one approach, one textbook, etc.
      I have an engineering math background (I studied mechanical engineering for my bachelor's degree). Generally speaking, I prefer the approach taken in engineering math classes, which is more algorithmically focused than theorem-proof focused. Over the course of my undergrad, I used various YouTube resources (which also motivated me to start this channel). The majority were in German; some English-language ones include the vector calculus videos by Khan Academy (which were done by Grant Sanderson) and of course 3b1b.
      For my graduate education, I found that I really liked reading documentation and seeing the API interfaces of various numerical computer programs. JAX and TensorFlow have amazing docs. This is also helpful for PDE simulations. Usually, I guide myself with Google, forum posts, and a general sense of curiosity. 😊

    • @harikrishnanb7273
      @harikrishnanb7273 1 year ago +1

      @@MachineLearningSimulation thanks for the reply

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago +1

      You're welcome 😊
      Good luck with your learning journey

  • @oioisexymlaoy
    @oioisexymlaoy 1 year ago +1

    Hi, thanks for the video. At 4:20 you say you link to some videos in the top right, but I do not see them.

    • @MachineLearningSimulation
      @MachineLearningSimulation  1 year ago

      You're welcome 😊
      Thanks for catching that; I will add the link later today. It should have been this video: ruclips.net/video/Agr-ozXtsOU/видео.html
      More generally, there is also a playlist with a larger collection of rules (also for tensor-level autodiff): ruclips.net/p/PLISXH-iEM4Jn3SEi07q8MJmDD6BaMWlJE
      And I have started collecting them on a simple, accessible website (let me know if you spot an error there; that's still an early version): fkoehler.site/autodiff-table/
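
      As an illustration of the kind of rule collected in such a table (a standard matrix-multiplication pullback, written here as a sketch rather than in the table's exact notation): for C = A @ B with an incoming cotangent C_cot, the propagated cotangents are A_cot = C_cot @ B.T and B_cot = A.T @ C_cot. A quick NumPy check:

      import numpy as np

      rng = np.random.default_rng(0)
      A = rng.normal(size=(3, 4))
      B = rng.normal(size=(4, 2))
      C_cot = rng.normal(size=(3, 2))  # cotangent flowing in from the scalar loss

      A_cot = C_cot @ B.T              # pullback w.r.t. the left factor
      B_cot = A.T @ C_cot              # pullback w.r.t. the right factor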