1 - Basic neural network theory | 8:30
2 - "Residual" neural network theory | 12:40
3 - Ordinary Differential Equations (ODEs) | 17:00
4 - ODE Networks | 22:20
5 - Euler's Method to Optimize an ODENet | 27:45
6 - Adjoint Method for ODENet Optimization | 29:15
7 - ODENets Applied to time series data | 30:50
8 - Future Applications of ODENets | 33:41
Thanks broo
Np !
Only PyTorch implementation as of now? rtqichen's torchdiffeq Github.
I have faith in humanity because of people like you 👏🙏
Is there a popular term that YouTube people use for a list of video bookmarks like this one?
The "inputs times weights, add a bias, activate" song is brilliant and should be used in elementary schools
I regularly watch Siraj’s videos and this is one of the best I’ve seen... got my adrenaline pumping when I saw that list of topics to be covered at 8:30!
Siraj dropped the most fire freestyle of 2019 in this video.
@@marketsmoto3180 wait 10 hours for my next video
Siraj Raval I cant eat or sleep until I get these new bars Siraj!
You've gotten way better than the last time I checked you out. That was 4 years ago, lol, so I guess that's just normal. But great, man! Loved it! Absolutely amazing content.
I love how you're always excited about what you're talking about. It's infectious.
I'm only half way through the video and I can already tell this is my favorite one of 2019, and possibly my favorite research paper ever! Thanks, Siraj!
This made me fall in love with AI and ML again. Thank you so much. I was going through a slump, but while watching this I couldn't stop smiling throughout the entire video.
This looks more and more to me like consciousness is simply a sophisticated set of mathematical operations. This neural network architecture is able to optimize its own structure, like how many layers it has, in order to best solve a given problem. The set of equations looks a lot like the equations used in optimal control theory, where an observed state is compared to a desired state to give an error, which is then multiplied by a gain and fed back into the system so as to move the system an order of magnitude closer to the desired state.
About a week back, I started working as a teaching assistant for an undergraduate differential equations course. While reading the text, I realized I had learned all this theory myself in my freshman year but very rarely used differential equations after the course, and I wondered whether I could use them in machine learning (my area of interest). I am really excited after watching your video.
Really interesting research; AI is moving so fast right now. There are so many doors going to be opened: modelling more complicated functions but still keeping the memory tied in. Amazing stuff, your videos are first class!
I am feeling happier and prouder now for having learned mathematics as my favourite subject.
Another interesting reason to explore AI more and more...
Thanks, Siraj :)
I agree, I'm studying maths at university and it is awesome to see differential equations pop up in AI.
Ramesh is that you?
I like that you said "I know that sounds complicated but don't go anywhere."
This could be interesting for me as someone who spent many years during his PhD looking at nonlinear ODEs. Now, as an ML guy, it would be great to relate this back to my original work. There is one caveat I was not clear on: ODEs come with stability conditions, and it was not clear from the paper how they treat those.
The code shown at the end of the video doesn't include the ODE definition block. I mean, where is the ODE actually specified, apart from the solver? Without defining the ODE, how is it possible to solve dx/dt or d2x/dt2?
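If it helps: as far as I can tell from the paper and rtqichen's torchdiffeq repo, the ODE is never written out as a formula anywhere. The right-hand side f in dx/dt = f(x, t) is itself the small neural net, and the solver integrates whatever that net currently computes. A minimal sketch of what I believe that looks like (the names here are mine, not from the video's code):

import torch
import torch.nn as nn
from torchdiffeq import odeint  # solver from rtqichen's repo

class Dynamics(nn.Module):
    # this *is* the ODE definition: dx/dt = f(x, t), where f is this network
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, x):
        # torchdiffeq hands us the current time t and state x; we return dx/dt
        return self.net(x)

f = Dynamics(dim=2)
x0 = torch.randn(1, 2)                  # initial state
t = torch.linspace(0.0, 1.0, steps=10)  # times at which we want the solution
xt = odeint(f, x0, t)                   # integrated trajectory, shape (10, 1, 2)

So there is no second derivative to specify; the network itself plays the role of the equation.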
"ODE block" is not really a block. Shameless plug, here is my explanation of this paper: ruclips.net/video/uPd0B0WhH5w/видео.html
Siraj... please tell me that you have travelled back in time to help us catch up with the future. I am just flabbergasted by the volume and intensity you handle!
I have no words to comment, just a dropped jaw in pure awe!!!😘
Thank you! I watched many videos on ODE with ResNet and yours is the best!!!
Thank you for the video. One thing I find kind of frustrating is when you try to solve a differential equation and you don't have any initial value for the function, because then you actually get a family of functions, not just one. Watching this video I realized you already have those initial values: they are simply the data you use to train the network!
Last night when I was going to sleep I had a great idea for a self-evolving non-parametric neural-network. I was wondering for the longest time how I can get the integral of a function of the learning rate with multiple variables. Today I saw this, thank you.
Excellent video. It may be self-evident, but it's important to conceptualize these improvements from both a mathematical and a programming understanding. You tackled a tough concept beautifully!!! Good job, mate.
Please keep posting such videos for new interesting papers. It feels like something is sitting right under our noses in the math, and we just need to notice it to completely solve AI in an unexpectedly simple way. A delicious thing to watch. WTG.
Ohh come on!
I needed this for my differential equations project last semester :/
such an interesting topic!
Awesome, Thanks Siraj! The physics community is going to love this! Looking forward to you making more videos on this when this research expands!
Even though I'm good at math, I would have never imagined myself using differential equations again after high school... and here I'm
This is awesome, you're killing it mate!
At ~11:00 "That, in essence, is how deep learning research goes. Let's be real, everybody."
You just won LeInternet for today ;-)
Hey mota bhai... I think in this video you really tried to make things simpler, oh yeah. Thanks for considering my suggestion.
Keep rocking bro, keep educating the people.
I like this style of video where you talk freely, just like your livestreams.
Awesome video! Hoping you cover more new research papers in this simple way. I really enjoyed it even though I'm not a mathematician.
Hey, siraj! Please make a video on Spiking Neural Networks!
thank you siraj for putting the effort to enclose a much larger, broader audience. Everyone benefits from this.
13:37 when Siraj is about to drop some hardcore ML knowledge
HAHAHAHA, shit is getting serious
Great video Siraj! Thanks and keep up the great work!!
I haven't finished watching yet, but this type of video is what makes Siraj shine in the world of AI teaching: the latest AI paper explained in a very exciting and motivational way. He is very right when he says that you cannot find this type of lecture anywhere else.
Can't thank you enough! Thank you very much man, your channel is the best!
Your videos are a continuous stream of super-high-quality learning about new computing mechanisms! Thank you!
Dear Siraj, you promised an explanation of the paper, but what we got is a (very enthusiastic) coverage of material that, frankly, anyone watching this video already knew. I feel the actual nuance and complexity of the paper at the end was rushed and missed. Thanks for the effort, but should I be concerned about my ability to understand this, or is it as easy as you say? Maybe it is, brb ;)
You’re right. I could’ve done better. It’s a difficult paper lol
Only channel on YouTube that motivates me to study maths...
Your vids are always of super high quality, often the topic is completely new to me yet you explain it in simple and easy to understand terms with clear examples. Well done!
30:29 You know shit is about to get serious when Siraj takes on a ninja posture
Thank you Siraj, I've been reading over this paper for the last two weeks, seeing how I can use it for my Forex predictions.
This is an incredible research paper.
I have been exploring differential equations and am so happy I found this video, it puts the calculus in a context that is really interesting and applicable!!
Thank you for the attempt; my suggestion is that you should use the time in the video more efficiently. This is a pretty advanced paper, and no one who doesn't know the basics of neural networks, or what a differential is, will attempt (let alone succeed) to understand it.
Very interesting~ The way the maths (derivative, integral, partial derivative) is illustrated is intuitive. I will spend time on Euler's method, which I am still not very clear on. Thank you for uploading such a great introduction which is both profound and intuitive.
Interesting that more and more abstract concepts are being added to the deep learning mix, once thought to be more of a bottom-up idea. Besides GANs, which I see as adding the higher-level concept of minimax on top of lower-level neural networks, there are also developments in structuring networks from the point of view of abstract algebra, and now via ODEs. It's good to get an overview of the developing flow...
+Siraj Raval
I tried (and failed) to implement ODE nets on a GNN just before the end of the year. It was difficult not only because of the data source structure (ML in graph DBs is still in its infancy) but also due to the relative dearth of info on this technique.
Your explanations were helpful and (maybe even more important) your enthusiasm inspired me to go back and tackle it again; I'd forgotten why ODEnets are so appealing in the first place. Thank you!
Programmer: This function has too many conditionals to write. Can't be done.
Data-scientist: Have you tried using Stochastic Gradient Descent to write them?
*DNNs are born*
Programmer: This function needs too many layers to generate. Can't be done.
Data-scientist: Have you tried Stochastic Gradient Descent to pick the right number of layers?
*ResNets are born*
Programmer: Each feature of this function needs a dynamic, potentially non-integer number of non-linearities added in order to be generated. Can't be done.
Data-scientist: Have you tried Differential Calculus to just generate the function?
*ODEs are born*
Programmer: This function is nowhere differentiable. Can't be done.
Data-scientist: Uh... *Pulls out box-counting*
Programmer: This function can't be described by its fractal dimension. Can't be done.
Data-scientist: Oh god... *Pulls out Neural Multifractal Analysis*
Programmer: This function can't be described by its singularity spectra. Can't be done.
Data-scientist: *Pulls out Neural Multifractal Analysis, but harder this time*
Programmer: This function can't be described by its singularity spectra. Can't be done.
Data-scientist: [Maximum Call-Stack Exceeded]
God-tier lulz lad, bravo
I have no fucking clue what this means...
...but it's fucking hilarious and I like it.
one day ill come back to this and understand...
Nice job, bruv. Keep making the diaspora proud!
Wow... at some parts I wondered whether I had accidentally enabled 1.5x mode. Slow down at the essential parts, Siraj. Anyway... will try this out right now. I always come to your channel for inspiration, and I get energised by the end of your videos.
I just wanna keep staring at the evolving convolutional layer output with this one. Must be fun! :)
When we're predicting timestep t+h, do we just forecast this in one step, or do we subdivide the gap (between t and t+h) into lots of sub-timesteps where the output is evaluated and passed into the algorithm again (almost like autoregression)?
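My understanding (not from the video, so take it with a grain of salt): the solver does subdivide the interval internally. With a fixed-step method like Euler you pick the number of sub-steps yourself; adaptive solvers choose them on the fly. A toy sketch of stepping from t to t + h in n Euler sub-steps, with made-up names:

def euler_integrate(f, x, t, h, n_steps=10):
    # advance state x from time t to t + h using n_steps Euler sub-steps
    dt = h / n_steps
    for _ in range(n_steps):
        x = x + dt * f(t, x)  # x_{k+1} = x_k + dt * f(t_k, x_k)
        t = t + dt
    return x

f = lambda t, x: -x                        # toy dynamics dx/dt = -x
print(euler_integrate(f, 1.0, 0.0, 1.0))   # approaches exp(-1) ≈ 0.368 as n_steps grows

So one forecast at t + h is built from many small evaluations of f, each output fed back in, which is pretty much the "almost like autoregression" intuition.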
Did Siraj post this paper anywhere? I know the original documentation is there for download, but I'm trying to find the one he is scripting off of.
thank you so much Siraj, I think you just opened my eyes on my next paper title.
To whoever is pointing out that he's speaking and going too fast: this video is not a course in deep learning, and you shouldn't expect to be able to actively apply notions starting from here; it's a (very good, imho) panoramic view of the subject just to give you a taste. If you're willing to get somewhere, you first need to study: some linear algebra, some probability theory, some multivariable calculus, deep learning dedicated libraries in whatever programming language you wanna use and, last but not least, study from some books about deep learning.
I've really appreciated this video, I come from "pure mathematics" (even if I don't really like this term), and I had just an intuitive idea of how deep learning is implemented, but now my understanding is a lot less fuzzy. Thank you very much.
fuzzy logic?
Thanks Siraj, you're doing a great job!
Awesome breakdown of very involved topics, Siraj. Keep it up!
Breakthroughs like this are why AGI is closer than we think !
Very interesting! Looking forward to seeing this applied in action with time series data. I still don't understand how this design would help with irregular time series prediction.
So this paper essentially makes vertical wormholes for marbles to skip specific air current layers, then digs valleys so the marble has more time to fall into the appropriate grouping.
When.... Cocaine meets anxiety.... .. 😄
you try explaining this shit.
@@CSryoh 1. Download ODE ML library. 2. Use library in code. 3. ??? 4. PROFIT! 🤔
You are such a clever brain. Great work man thanks.
That was cool. I had never heard of what a ResNet was or what an ODEnet was until I watched your video. Great educational value! The ODEnet presentation, however, did not cover the adjoint method sufficiently to form a basic understanding of it, unlike the other parts. I'd like to find out more about it.
great points
Freaking fucking awesome!! Stretched my brain quite a lot 😂 Thanks.
I would like to learn more about the code starting from 30:50 though.
But I love this video! Thanks for sharing.
This is amazing. You are amazing. Thank you.
Thank you for making these videos!
Awesome siraj. You made my day.
I'm so glad you don't stop rapping from time to time, man
Really glad I studied math and CS in college.
I'm an artificial intelligence enthusiast, please bring some more videos like this. It'll help a lot!
One of your best videos.
I have read the paper: arxiv.org/pdf/1806.07366.pdf. It seems that in a ResNet the parameters are not the same in each layer, while in the ODE fitting problem the parameters are the same. This clearly reduces the degrees of freedom in choosing the parameters. ODE parameter fitting is not new; there are even some limited references in the paper. It seems that now one can use standard machine learning libraries for it, too.
I am also confused by this, since every layer would have to have the same weights?
the movement of ur hands always inspire me ;p
You're a real triple og for doing this
that was really great, thanks a lot!
Thanks for this good intro into this topic!
I'm excited to see how this will merge with orthogonal polynomials
I love this! Thank you!! Why did it take so long to figure this out? None of the concepts presented here are out of reach of a Bachelor's grad. Not a mocking question, but I'm genuinely curious why something like this seems so obvious in hindsight, and yet we continue to spend so much time focused on biomimicry. Regardless, really excited! What were the other 3 papers that got first place?
Since the math is being appreciated increasingly throughout your videos, I wanna recommend a beautiful book that describes how math evolved through centuries of understanding, mostly inspired by nature: 'God Created the Integers' by Stephen Hawking. I enjoyed the video btw :)
Hi! I like your channel and find it unique, but there are moments when the explanations turn out a bit dry and abstract (I'm not a programmer, but I understand calculus, generally). It would be great if you used more examples with numbers, not just equations (e.g. in the part around 24:00). I would suggest techniques used on the Mathologer channel; they are masters of explaining complex math in an easy way. If you could use some of their approach at least on the most 'dry' parts of your vids, it'd be amazing.
Another thing is that you are trying to target both IT people and the general public, which I appreciate (as part of the general public ;)), but mixing the 2 types of content does not seem to work too well. Please consider splitting such videos into 2 parts, the 1st explaining the concepts in a general way and the 2nd showing the code. Despite the fact that I have coded in the past using simple languages like SQL and Visual Basic, the parts with the code are still pretty much meaningless for me. Anyway, keep up the great work, thanks for the interesting content! :)
There's something I don't quite understand: if Neural ODEs have a "continuum" of layers, meaning there are no discrete layers, why do you initialize the class ODEBlock with an ODEFunc, which has two layers?
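My take, based on the torchdiffeq examples rather than the video itself (so treat the class names as assumptions): the two layers inside ODEFunc are not two "layers of the network" in the ResNet sense; they just parameterize the derivative f(t, h) = dh/dt. The continuum of layers comes from the solver evaluating that same f at as many time points as it needs. Roughly:

import torch
import torch.nn as nn
from torchdiffeq import odeint

class ODEFunc(nn.Module):
    # these two Linear layers only define f(t, h) = dh/dt; they are not the depth
    def __init__(self, dim):
        super().__init__()
        self.l1 = nn.Linear(dim, 64)
        self.l2 = nn.Linear(64, dim)

    def forward(self, t, h):
        return self.l2(torch.tanh(self.l1(h)))

class ODEBlock(nn.Module):
    # the "depth" lives here: the solver calls func as often as its step size requires
    def __init__(self, func):
        super().__init__()
        self.func = func
        self.t = torch.tensor([0.0, 1.0])

    def forward(self, h0):
        return odeint(self.func, h0, self.t)[-1]  # state at t = 1 is the block's output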
Hi Siraj, excellent video. Very inspiring. Do you think that this is the next big thing or just another kind of machine learning???
Thanks for the video and short tutorial. This is probably the best channel on A.I. Thumbs up for you, Siraj. But one more thing: I wanted to start my journey into A.I. (beginning with M.L.) tomorrow, the 15th of January, and I intend to put in at least 7 hours a day. I already have a strong foundation in calculus and mathematics, but I will definitely do a full revision and learn new concepts. I am also a programmer. I've worked in teams (2, precisely) to build web apps, and I'm also good in C/C++ and Python. How long do you think it's going to take me? Once again, thanks a lot for the videos; they keep me on my toes and remind me that there's a lot out there to learn and improve on, which is actually fun to me. I have long taught myself the "conservative principle of life-long learning". It's an indispensable gift I gave to myself back in 2012 when I started learning programming with C. Thanks once more.
Check out this
3Blue1Brown - Deep Learning: ruclips.net/p/PLLMP7TazTxHrgVk7w1EKpLBIDoC50QrPS
Brother Skutnu thanks for the tip
I highly recommend Deep Learning with Python by Francois Chollet. Seriously, order it today! This book is very accessible and strikes a perfect balance between theory and implementation: Deep Learning 101 and 201. It will get you current with the mainstream machine learning techniques up through 2017, including GANs. It is all Python based, with TensorFlow and Keras. If you have a nice GPU you will have no trouble running the algos. Then it's on to studying papers and sharpening those math skills (calculus, linear algebra and statistics) so you can help us push the technology further.
Lifelong learning is truly a gift! I began learning Python in August of last year so that I could take this journey in machine learning. In October I ordered F. Chollet's book, and I finished coding through the last chapter on January 2nd. The advance that Siraj covered in this video is mind-blowing! Now it's time to sharpen up my math skills and dig into this paper. I'm considering heading down to the local college to get some help... Best regards, J.
You said: "How long do you think it's going to take me?"
What do you exactly mean? What exactly is your aim? What do you want to achieve?
@@einemailadressenbesitzerei8816 I want to learn ML and DL and become good at both. I think with a strong foundation in those two, things like GANs, ResNets, NLP and others can easily follow.
"infinitesimally small and infinity big" once again Leibniz's monad is still schooling the world.
Thank you for the great effort you put in
Math is awesome, I like that bru, and it's my first time ever hearing about reinforcement learning.
So to summarize, the ODEblock essentially takes all those (ODEfunc) layers and represents them as one large layer? Also, what is the need for the initial resblocks at the start of the model? It's definitely an interesting approach to NNs, and I'm curious about its applications to time-series data (or anything that has a sequential relationship).
Sir, you are a great teacher, math simplified. 👌👌
Could you post a video on using the adjoint method to solve ODEs? I would really appreciate a concise presentation; all of the material I have found on it is hard to digest.
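Until such a video exists, here is the compressed version as I read it from the paper (so please double-check against the original). Define the adjoint state a(t) = ∂L/∂z(t). It obeys its own ODE that is solved backwards in time, and the parameter gradient is an integral along the trajectory:

\frac{da(t)}{dt} = -\, a(t)^{\top} \frac{\partial f(z(t), t, \theta)}{\partial z},
\qquad
\frac{dL}{d\theta} = -\int_{t_1}^{t_0} a(t)^{\top} \frac{\partial f(z(t), t, \theta)}{\partial \theta} \, dt

So instead of storing every intermediate activation for backprop, you solve a second, backwards ODE and accumulate the gradient as you go, which is where the constant memory cost comes from.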
I remember something about propagation from maths for nonlinear systems: interval propagation. That might help AI neural networks handle more uncertain things and could advance AI. Personally, I still need a maths neurolink.
Here is a related maths video: ruclips.net/video/hghnNzF4PVM/видео.html. Basically it talks about having a simulated reality of some kind to predict many possible outcomes from one position, which could help with AI predictions for uncertainty in nonlinear systems. Interval propagation, not just forward and back propagation, I think.
I was also thinking about sub-interval propagation, like for when you are sort of multitasking. I think many AI systems could work together, but like humans you need a layer to decide what to focus computation on. I mean, do you want to drive safely or play Go in a given moment? I would prioritize driving safely over playing Go, unless I had enough processing power to do both. Interval propagation with simulated reality or game states is the way to go; it could see all the possibilities that humans never thought of. Also make a focus layer for more general AI systems to switch between specific learned skills like playing chess, or Go, or driving. Even if we just started with a simulated reality like Grand Theft Auto and pretended it was the world, we could learn techniques to apply to the real world. Start with a smaller number of simulated inputs and outputs and then expand: like wind speed, or ground stability, or heat.
Cool! Maybe we could predict earthquakes with it.
I was waiting for a video on this
Very interesting stuff! However, what I don't quite understand is how the ODEs fit in with gradient descent. If the layers of the network can be represented as an ODE at some time t, and algorithms like Euler's method can be used to solve such equations, why is gradient descent necessary?
Or, if I understood incorrectly and Euler's method is used for computing the gradients rather than the weights, what is the benefit of this compared to current methods? Does it allow for non-differentiable activation functions?
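The way I read it (I could be wrong): the solver only replaces the forward pass. Gradient descent is still what fits the weights of f to the data; the loss is taken on the solver's output, and gradients flow back either through the solver steps by ordinary autograd or via the adjoint method. A bare-bones sketch of that split, with made-up names and assuming the ODEFunc/torchdiffeq setup sketched above:

import torch
from torchdiffeq import odeint_adjoint as odeint  # gradients computed via the adjoint method

def train_step(func, x0, t, target, optimizer, loss_fn=torch.nn.MSELoss()):
    optimizer.zero_grad()
    pred = odeint(func, x0, t)[-1]  # forward pass = numerically solving the ODE
    loss = loss_fn(pred, target)    # loss measured on the solver's output
    loss.backward()                 # gradients w.r.t. the parameters of func
    optimizer.step()                # gradient descent still updates those parameters
    return loss.item()

Activation functions still need to be differentiable (or at least admit subgradients), since the partial derivative of f with respect to the state shows up in the adjoint equations.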
Siraj bhai, Happy Uttarayan.
Thank you Siraj!! Your videos are awesome
Wow, something more interesting than capsule networks
You shouldn't expect just anyone to watch or understand this video, because someone who knows nothing about AI or deep learning is not going to click on it to find out what a "neural ODE" is. You should just focus on those who know a bit about this area, which would make the video a lot better for those who really want to understand neural ODEs.
Correct me if I'm wrong, but the main ODE tool they used was Euler's method to approximate the solution. I was wondering if you can apply any of the other tools taught for solving ODEs to neural networks.
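As far as I can tell from the torchdiffeq README (so treat the method names as an assumption about that library, not a claim about the paper), Euler is just one option; the integrator is a pluggable argument, so the higher-order and adaptive methods from an ODE course drop straight in. Reusing f, x0 and t from the sketches above:

from torchdiffeq import odeint

# same model, same call; only the numerical integrator changes
x_euler = odeint(f, x0, t, method='euler')   # fixed-step Euler
x_rk4   = odeint(f, x0, t, method='rk4')     # classic fourth-order Runge-Kutta
x_dopri = odeint(f, x0, t, method='dopri5')  # adaptive Dormand-Prince (the library's default)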
Thank you so much for putting this together