From the book and from here... on the formula around the 5:00 mark. I don't get the argmin over A_j, with your regularizer being lambda * g(A_j). It seems like you are picking a single A_j from among all of the layers as a solution, but you need a method that determines *all* of the A_j layer weights. I'd understand it if you took the argmin over all of the layer weights A_j (j=1..M) of the nested function jointly, but I'm not understanding the argmin over individual A_j with the regularizer term. A bit more explanation would be very helpful, I think.
In section 4, they gave a formula similar to that -- formula 4.3a -- as an "overall mathematical framework when considering regression to nonlinear models"
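One reading that resolves the confusion (this is a reconstruction of the idea, not a quote of formula 4.3a from the book): the argmin is taken jointly over all layer weights at once, with the subscript j inside the objective running over every layer, so the optimization determines all of the A_j together:

```latex
% Hedged sketch, not the book's exact notation:
% joint minimization over all layer weights A_1, ..., A_M
% of the nested (composed) network, with a regularizer on each layer.
\[
  \{\hat{A}_j\}_{j=1}^{M}
  = \operatorname*{argmin}_{A_1,\dots,A_M}
    \Big\| f_M\big(A_M, \dots, f_2(A_2, f_1(A_1, x))\big) - y \Big\|
    + \lambda \sum_{j=1}^{M} g(A_j)
\]
```

Under this reading, writing "argmin over A_j" is shorthand for minimizing over the full collection of weights, with g(A_j) applied to each layer's weights and summed, rather than selecting one layer's A_j as the solution.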
A wonderful explanation for any beginner! Thank you, Prof. Brunton and Prof. Kutz, for making quality content freely available to all of us.
Doesn't AI see a cat's whiskers?