Dear Prof. Brunton, your classes are wonderful. Your didactics are perfect, and you convey complex information in a simple and practical way with incredible ease. Congratulations!
Great video! However, I noticed a potential mismatch between the figure and the loss function at 21:53. The figure you show seems to describe "equivariance", where the transformation \( g \) applied to both the input and output is preserved after passing through the neural network. But the loss function on the left (\( ||y - f(x)||_2 + ||y - f(-x)||_2 \)) looks more like it enforces "invariance", as it ensures that the output remains the same regardless of whether \( x \) or \( -x \) is used as input. Also, this is referred to as "equivariant loss" in the chapter, which could be confusing. It would be great if you could clarify this difference between "invariance" and "equivariance" to avoid potential confusion. Thanks again for the insightful content!
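To make the distinction raised in this comment concrete, here is a minimal sketch for the sign-flip symmetry g(x) = -x in the scalar case, where the 2-norm reduces to an absolute value. The toy functions `invariant_f` and `equivariant_f`, and the equivariant form of the loss, are illustrative assumptions and are not taken from the video.

```python
def invariant_f(x):
    # Even function: f(-x) == f(x), so the output ignores the sign flip.
    return x * x


def equivariant_f(x):
    # Odd function: f(-x) == -f(x), so the output flips along with the input.
    return x ** 3


def invariance_loss(f, x, y):
    # The loss quoted in the comment: x and -x must both map to the same y.
    return abs(y - f(x)) + abs(y - f(-x))


def equivariance_loss(f, x, y):
    # Assumed equivariant counterpart: flipping the input flips the target too.
    return abs(y - f(x)) + abs(-y - f(-x))


x = 2.0
inv_on_even = invariance_loss(invariant_f, x, invariant_f(x))
inv_on_odd = invariance_loss(equivariant_f, x, equivariant_f(x))
eq_on_odd = equivariance_loss(equivariant_f, x, equivariant_f(x))
eq_on_even = equivariance_loss(invariant_f, x, invariant_f(x))

print(inv_on_even, inv_on_odd, eq_on_odd, eq_on_even)
```

The quoted loss is zero only for the even (invariant) function, while the version that also flips the target is zero only for the odd (equivariant) one, which is the mismatch the comment points out.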
I guess fluid dynamics actually does offer a case where the Lagrangian constraint could produce observable errors between model and reality (where the model is trained from t0 to t1 and then fit from t1 to tf).
Thank you, Prof. I have seen the videos you cite here, and this new video was an effective review that helped me understand them. Your lectures are parsimonious! I promise only people with a mechanical engineering background have the ability and tendency to understand and describe complex math parsimoniously ;)
I am interested in the concepts of equivariance and invariance as they relate to neural network interpretability. Usually, to satisfy the physical constraints given by symmetry, we build neural networks that are equivariant. Why don't we build neural networks that are invariant instead? That way, it is not only the output of the network that satisfies the laws of physics, but the network itself, with its parameters. Basically, instead of letting SGD optimisation choose any parameters in the parameter landscape, can we constrain these parameters to a physically relevant submanifold? My idea would then be to build neural networks analogous to physical systems, where the parameters of the whole network have an analogue in a physical theory, and not just those in the autoencoder bottleneck. An application of these neural networks could be in the field of topological quantum field theory, but in general in any lattice gauge theory, where the neural network itself becomes a piece of graphene that spreads the input current over the output boundary state. It could also be a Potts model, a spin glass, or a Penrose spin network, which recognises physics because it is built in analogy with the physical model. Perhaps putting such a strong constraint on the parameter space would be counterproductive, making the neural network lose its ability to generalise. But this is a very interesting topic.
Hi Steve. I was recently diving into ML and AI applications for CFD, but I realized it's better to focus on robust solvers. I don't know much about this, but in another video you showed the adaptive wavelet method for CFD, and this seems to handle or solve a bunch of CFD problems, whereas ML can only tackle a specific problem. Am I right? By the way, I am a huge fan of your videos. Thanks.
Dear Prof. Brunton, your teaching is excellent. My path of learning is book-oriented. Could you recommend a book or paper where I can learn about physics-informed neural networks, so that I can apply them in my field of nuclear engineering (reactor design parameters, radiation transport, neutron transport, etc.)?
Another great video. It would be interesting to do a video about Physics Informed ML in the context of Sutton's Bitter Lesson, since this seems to be a case where adding extra knowledge into the architecture/loss of the network actually beats out more generalist approaches. This is probably due to the lack of training data in physics/engineering domains, but maybe building in physics knowledge helps in the large data regime as well.
Are there loss functions or optimization functions that can switch tactics (or be swapped out completely) depending on what stage of training a machine learning model is in? I was studying transformers and the self-attention architecture, and it made me curious whether “self-attention” could be used specifically on the loss or the optimizing functions to either tighten the focus on a more specific goal or broaden it, depending on where the training is. Does that make sense? This is the “multiple loss or optimizing functions that are activated at different times using different triggers” approach. Using self-attention to decide when to tune, modify, or replace an optimizing function would, I think, be spectacular to demonstrate! It is the key architecture used in OpenAI’s SORA, ChatGPT 4 (and 3 and 2), and many other successful machine learning tools. Also, an open question: in what ways can LLMs or image classifiers/generators be utilized in physics-informed machine learning? Could SORA figure out some physics on its own simply by studying video footage, using its ability to identify, label, and categorize objects in a video and to learn and generalize how these objects change over time (like objects falling)? I know a lot depends on the training data (watching leaves fall vs. watching rocks fall…). That’s my big question!!
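As a rough illustration of the "different loss terms activated at different stages" idea in this comment, here is a minimal sketch of a hand-written schedule that keeps a physics penalty switched off during a warm-up phase and then ramps it in linearly. It is a fixed schedule, not an attention-based mechanism, and the names `physics_weight`, `total_loss`, `warmup_steps`, and `ramp_steps` are hypothetical, not from the video.

```python
def physics_weight(step, warmup_steps=100, ramp_steps=100, max_weight=1.0):
    """Weight on the physics term as a function of the training step."""
    if step < warmup_steps:
        # Stage 1: fit the data only.
        return 0.0
    # Stage 2: ramp the physics term in linearly, then hold it at max_weight.
    progress = min(1.0, (step - warmup_steps) / ramp_steps)
    return max_weight * progress


def total_loss(data_loss, physics_loss, step):
    # Combined objective whose emphasis changes with the training stage.
    return data_loss + physics_weight(step) * physics_loss


for step in (0, 150, 500):
    print(step, physics_weight(step), total_loss(1.0, 2.0, step))
```

A learned trigger (for example, one that reacts to a plateau in validation loss) could replace the fixed step thresholds; the step-based version is used here only because it is the simplest to check by hand.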
Hi @steve, could you please add the papers you mentioned to the description?
Thank you Steve, I was waiting to see that video. I'm excited for the series, great work
same here haha
That's gold, I always can't wait for the next video, wish I could jump forward into future and just watch them all at once!
Thank you for your excellent panning and knowledge base. You are a learned machine!
Loving this series!
Part 3: ruclips.net/video/fiX8c-4K0-Q/видео.htmlsi=KB11RkHSlvQlzxxC
Thank you so much, Steve!!!!
Please list this video. I couldn't find this video on your channel directly.
Great video, this is the good stuff! Can't wait for the next one!
Super amazing, I'm waiting for Part 5.
I cannot find Parts 2 & 3, either.
I cannot find Part 3.
ruclips.net/video/fiX8c-4K0-Q/видео.htmlsi=M2l0Y5cc77Rki7m8
ruclips.net/video/fiX8c-4K0-Q/видео.htmlsi=1AkKVFx2IF1GQQ_i
God I'm such a dummy
Brilliant
Where are Part 2 and Part 3 of the lecture series?
Part 2 is available now.
Sir, kindly check your email, please.
I have been following your videos for the last two years.