Liquid Neural Networks

  • Published: 26 Sep 2024

Comments • 186

  • @hyperinfinity
    @hyperinfinity 3 years ago +127

    Most underrated talk. This is an actual game changer for ML.

    • @emmanuelameyaw9735
      @emmanuelameyaw9735 3 years ago +40

      :) most overrated comment on this video.

    • @arturtomasz575
      @arturtomasz575 3 years ago +5

      It might be! Let's see how it performs on specific tasks against state-of-the-art solutions, not against toy models of a specific architecture. Can't wait to try it myself, especially vs transformers or residual CNNs :)

    • @guopengli6705
      @guopengli6705 2 years ago +20

      I think that it is way too early to say this. A few mathematicians tried to improve DNNs' interpretability in similar ways. This comment seems perhaps over-optimistic from a theoretical viewpoint. We do need to test its performance in more CV tasks.

    • @KirkGravatt
      @KirkGravatt 2 years ago +1

      yeah. this got me to chime back in. holy shit.

    • @moormanjean5636
      @moormanjean5636 1 year ago +2

      @@guopengli6705 nah, u weren't paying attention; this revolutionizes causal learning, in my opinion, while improving on the state of the art

  • @isaacgutierrez5283
    @isaacgutierrez5283 2 years ago +92

    I prefer my Neural Networks solid, thank you very much

    • @rainbowseal69
      @rainbowseal69 3 months ago

      🥹🥹🥹😂😂😂😂😂

    • @anushkathakur5062
      @anushkathakur5062 3 months ago

      😂😂😂😂😂

    • @mystifoxtech
      @mystifoxtech 3 months ago +4

      If you don't like liquid neural networks then you really won't like gaseous neural networks.

    • @zappy9880
      @zappy9880 2 months ago +2

      @@mystifoxtech when are plasma neural networks gonna happen?

  • @adityamwagh
    @adityamwagh 1 year ago +8

    It just amazes me how the final few layers are so crucial to the objective of the neural network!

  • @FilippoMazza
    @FilippoMazza 2 years ago +41

    Fantastic work. The relative simplicity of the model proves that this methodology is truly a step towards artificial brains. Expressivity, better causality, and the many neuron-inspired improvements are inspiring.

    • @NickH-o5l
      @NickH-o5l 3 months ago

      It would make more sense if it made more sense. I'd love it if this was comprehensible

  • @martinsz441
    @martinsz441 3 years ago +71

    Sounds like an important and necessary evolution of ML. Let's see how much this can be generalized and scaled, but it sounds fascinating.

    • @David-rb9lh
      @David-rb9lh 3 years ago +5

      I will try to use it.
      I think that a lot of studies and reports will yield interesting results.

    • @maloxi1472
      @maloxi1472 1 year ago +2

      "Necessary" for which specific applications? Surely not "necessary" across the board.
      I'd like to see you elaborate.

    • @andrewferguson6901
      @andrewferguson6901 1 year ago +4

      @@maloxi1472 Necessary for not spending 50 million dollars on a 2-month training computation?

    • @maloxi1472
      @maloxi1472 1 year ago +2

      @@andrewferguson6901 You wrongly assume that the product of that training is necessary to begin with.

  • @marcc16
    @marcc16 1 year ago +29

    0:00: 🤖 The talk introduces the concept of liquid neural networks, which aim to bring insights from natural brains back to artificial intelligence.
    - 0:00: The speaker, Daniela Rus, is the director of CSAIL and has a curiosity to understand intelligence.
    - 2:33: The talk aims to build machine learned models that are more compact, sustainable, and explainable than deep neural networks.
    - 3:26: Ramin Hasani, a postdoc in Daniela Rus' group, presents the concept of liquid neural networks and their potential benefits.
    - 5:11: Natural brains interact with their environments to capture causality and go out of distribution, which is an area that can benefit artificial intelligence.
    - 5:34: Natural brains are more robust, flexible, and efficient compared to deep neural networks.
    - 6:03: A demonstration of a typical statistical end-to-end machine learning system is given.
    6:44: 🧠 This research explores the attention and decision-making capabilities of neural networks and compares them to biological systems.
    - 6:44: The CNN learned to attend to the sides of the road when making driving decisions.
    - 7:28: Adding noise to the image affected the reliability of the attention map.
    - 7:59: The researchers propose a framework that combines neuroscience and machine learning to understand and improve neural networks.
    - 8:23: The research explores neural circuits and neural mechanisms to understand the building blocks of intelligence.
    - 9:32: The models developed in the research are more expressive and capable of handling memory compared to deep learning models.
    - 10:09: The systems developed in the research can capture the true causal structure of data and are robust to perturbations.
    11:53: 🧠 The speaker discusses the incorporation of principles from neuroscience into machine learning models, specifically focusing on continuous time neural networks.
    - 11:53: Neural dynamics are described by differential equations and can incorporate complexity, nonlinearity, memory, and sparsity.
    - 14:19: Continuous time neural networks offer advantages such as a larger space of possible functions and the ability to model sequential behavior.
    - 16:00: Numerical ODE solvers can be used to implement continuous time neural networks.
    - 16:36: The choice of ODE solver and loss function can define the complexity and accuracy of the network.
    17:07: ✨ Neural ODEs combine the power of differential equations and neural networks to model biological processes.
    - 17:07: Neural ODEs use differential equations to model the dynamics of a system and neural networks to model the interactions between different components.
    - 17:35: The adjoint method is used to compute the gradients of the loss with respect to the state of the system and the parameters of the system.
    - 18:35: Backpropagating directly through the solver has high memory complexity but is more accurate than the adjoint method.
    - 19:17: Neural ODEs can be inspired by the dynamics of biological systems, such as the leaky integrator model and conductance-based synapse model.
    - 20:43: Neural ODEs can be reduced to an abstract form with sigmoid activation functions.
    - 21:33: The behavior of the neural ODE depends on the inputs of the system and the coupling between the state and the time constant of the differential equation.
    22:26: ⚙️ Liquid time-constant networks (LTCs) are a type of neural network that uses differential equations to control interactions between neurons, resulting in stable behavior and increased expressivity (a minimal code sketch of this update appears after the recap).
    - 22:26: LTCs have the same structure as traditional neural networks but use differential equations to control interactions between neurons.
    - 24:25: LTCs have stable behavior and their time constant can be bounded.
    - 25:26: The synaptic parameters in LTCs determine the impact on neuron activity.
    - 25:50: LTCs are a universal approximator and can approximate any given dynamics.
    - 26:23: Trajectory length measure can be used to measure the expressivity of LTCs.
    - 27:58: LTCs consistently produce longer and more complex trajectories compared to other neural network representations.
    28:46: 📊 The speaker presents an empirical analysis of different types of networks and their trajectory lengths, and evaluates their expressivity and performance in representation learning tasks.
    - 28:46: The trajectory length of LTC networks remains higher regardless of changes in network width or initialization.
    - 29:04: Theoretical evaluation reveals a lower bound for expressivity of these networks based on weighted scale, biases scale, width, depth, and number of discretization steps.
    - 30:38: In representation learning tasks, LTCs outperform other networks, except for tasks with longer-term dependencies, where LSTMs perform better.
    - 31:13: LTCs show better performance and robustness in real-world examples, such as autonomous driving, with significantly reduced parameters.
    - 33:09: LTC-based networks impose an inductive bias on convolutional networks, allowing them to learn a causal structure and exhibit better attention and robustness to perturbations.
    34:22: ⚙️ Different neural network models have varying abilities to learn representations and perform in a causal manner.
    - 34:22: The CNN consistently focuses on the outside of the road, which is undesirable.
    - 34:31: LSTM provides a good representation but is sensitive to lighting conditions.
    - 34:39: CTRNN or neural ODEs struggle to gain a nice representation in this task.
    - 36:07: Physical models described by ODEs can predict future evolution, account for interventions, and provide insights.
    - 38:36: Dynamic causal models use ODEs to create a graphical model with feedback.
    - 39:55: Liquid neural networks can have a unique solution under certain conditions and can compute coefficients for causal behavior.
    40:18: 🧠 Neural networks with ODE solvers can learn complex causal structures and perform tasks in closed loop environments.
    - 40:18: Dynamic causal models with parameters B and C control collaboration and external inputs in the system.
    - 41:12: Experiments with drone agents showed that the neural networks learned to focus on important targets.
    - 41:58: Attention and causal structure were captured in both single and multi-agent environments.
    - 43:05: The success rate of the networks in closed loop tasks demonstrated their understanding of the causal structure.
    - 43:46: Complexity of the networks is tied to the complexity of the ODE solver, leading to longer training and test times.
    - 44:53: The ODE-based networks may face vanishing gradient problems, which can be mitigated with gating mechanisms.
    45:41: 💡 Model-free inference and liquid networks have the potential to enhance decision-making and intelligence.
    - 45:41: Model-free inference captures temporal aspects of tasks and performs credit assignment better.
    - 45:53: Liquid networks with causal structure enable generative modeling and further inference.
    - 46:32: Compositionality and differentiability make these networks adaptable and interpretable.
    - 46:40: Adding CNN heads or perception modules can handle visual or video data.
    - 48:09: Working with objective functions and physics-informed learning processes can enhance learning.
    - 49:02: Certain structures in liquid networks can improve decision-making for complex tasks.
    Recap by Tammy AI
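
    To make the 22:26 section of the recap concrete: below is a minimal sketch of one liquid time-constant update step in plain NumPy. It assumes the fused semi-implicit discretization described in the LTC paper; the shapes, names, and single-layer wiring are illustrative only, not the authors' code.

      import numpy as np

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def ltc_step(x, u, W, Win, b, A, tau, dt=0.05):
          # One fused semi-implicit step of the liquid time-constant ODE
          #   dx/dt = -(1/tau + f(x, u)) * x + f(x, u) * A,
          # where f is a sigmoid of recurrent + input currents and the
          # state-dependent term (1/tau + f) acts as the "liquid" time constant.
          f = sigmoid(W @ x + Win @ u + b)
          return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

      # Toy usage: 8 neurons driven by a 2-dimensional input sequence.
      rng = np.random.default_rng(0)
      n, m = 8, 2
      W, Win = 0.3 * rng.normal(size=(n, n)), rng.normal(size=(n, m))
      b, A, tau = np.zeros(n), np.ones(n), np.ones(n)
      x = np.zeros(n)
      for t in range(100):
          u = np.array([np.sin(0.1 * t), np.cos(0.1 * t)])
          x = ltc_step(x, u, W, Win, b, A, tau)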

  • @agritech802
    @agritech802 1 year ago +15

    This is truly a game changer in AI, well done folks 👍

    • @ShpanMan
      @ShpanMan 3 months ago

      Where was the game changed? I fail to see it.

    • @LukasNitzsche
      @LukasNitzsche 3 months ago +1

      @@ShpanMan Yes, I'm thinking the same; why haven't LNNs been implemented more?

  • @raminkhoshbin9562
    @raminkhoshbin9562 2 years ago +15

    I got so happy finding out that the person who wrote this exciting paper is also a Ramin :)

  • @Dr.Z.Moravcik-inventor-of-AGI
    @Dr.Z.Moravcik-inventor-of-AGI 2 years ago +6

    There are so many smart people at MIT that America must already be a superintelligent nation. Please continue your work and this world will become a wonderful place to live in.

    • @edthoreum7625
      @edthoreum7625 1 year ago

      By now the entire human race should be at an incredible level of intelligence, even traveling out of our solar system with fusion-run space shuttles!

    • @AndyBarbosa96
      @AndyBarbosa96 1 year ago

      Yeah, America is so "intelligent", flying high on borrowed talent ....

    • @quonxinquonyi8570
      @quonxinquonyi8570 1 year ago

      @@AndyBarbosa96 That intelligence drops to a significant degree in the second generation of these first-generation geniuses... simple fact... therefore that borrowing approach is the single most important policy behind American technological might... as Hillary Clinton rightly said some years ago, "the power of America resides outside of America"

  • @scaramir45
    @scaramir45 2 years ago +13

    i hope that one day i'll be able to fully understand what he's talking about... but it sounds amazing and i want to play around with it!

  • @Seekerofknowledges
    @Seekerofknowledges 1 year ago +3

    I am moved beyond description.
    What an amazing privilege to be alive in this day and age.
    The future will be great for mankind.

  • @lorenzoa.ricciardi4264
    @lorenzoa.ricciardi4264 2 years ago +51

    The "discovery" that fixed time steps for ODEs work better in this case is very well known in the optimal control literature (and has been for at least a couple of decades).
    Basically, if your ODE solver has adaptive time steps, the exact mathematical operations performed for a given integration time interval dT can vary because a different number of internal steps is performed. This can have really bad consequences for the gradients of the final-time states (a toy sketch of this appears after this thread).
    There's plenty of theoretical and practical discussion in Betts' book Practical Methods for Optimal Control, chapter 3.9 Dynamic Systems Differentiation.

    • @abinaslimbu3057
      @abinaslimbu3057 1 year ago

      Lord siva
      Gass state State (liquid) Gass light

    • @abinaslimbu3057
      @abinaslimbu3057 1 year ago

      Humoid into human

    • @abinaslimbu3057
      @abinaslimbu3057 1 year ago

      State Gass powered

    • @DigitalTiger101
      @DigitalTiger101 1 year ago +13

      @@abinaslimbu3057 Schizo moment

    • @iamyouu
      @iamyouu 1 year ago

      @@abinaslimbu3057 why are you doing this? I know no one who's actually hindu would comment such stupid sht.
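
    The toy sketch promised in the parent comment, as an assumed illustration (not code from the talk or from Betts' book): with an adaptive-step integrator, the number of internal steps taken over a fixed interval depends on the tolerance and the dynamics, so the exact sequence of operations you would differentiate through changes, while a fixed-step integrator always performs the same operations.

      def f(x, k):
          return -k * x  # toy linear dynamics dx/dt = -k * x

      def integrate_fixed(x0, k, T, n_steps=50):
          # Always performs exactly n_steps Euler updates over the interval T,
          # so the computation graph is the same for every k.
          x, h = x0, T / n_steps
          for _ in range(n_steps):
              x = x + h * f(x, k)
          return x, n_steps

      def integrate_adaptive(x0, k, T, tol=1e-6, h=0.1):
          # Euler/Heun pair with step-size control: the number of accepted steps
          # (and hence the operation count) depends on k and on the tolerance.
          x, t, steps = x0, 0.0, 0
          while t < T:
              h = min(h, T - t)
              euler = x + h * f(x, k)                       # 1st-order estimate
              heun = x + 0.5 * h * (f(x, k) + f(euler, k))  # 2nd-order estimate
              err = abs(heun - euler)
              if err <= tol:                                # accept the step
                  x, t, steps = heun, t + h, steps + 1
              h *= 0.9 * (tol / (err + 1e-16)) ** 0.5       # adapt the step size
          return x, steps

      for k in (1.0, 4.0):  # different dynamics: adaptive step count changes, fixed count does not
          _, n_adaptive = integrate_adaptive(1.0, k, T=5.0)
          _, n_fixed = integrate_fixed(1.0, k, T=5.0)
          print(f"k={k}: adaptive steps={n_adaptive}, fixed steps={n_fixed}")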

  • @Ali-wf9ef
    @Ali-wf9ef 2 years ago +2

    The video showed up in my feed randomly and I clicked on it just because the lecturer was Iranian. But the content was so interesting that I watched it to the end. Sounds like a real evolutionary breakthrough in ML and DL. Especially with the computational power of computing systems growing every day, training/inferencing such complex network models becomes more possible. Great job

  • @peceed
    @peceed 1 year ago +1

    Causality is extremely important in building a Bayesian model of the world. It allows us to identify correlations between events that are useful for creating a priori statistics for reasoning, because we avoid double-counting. A single piece of evidence and its logical consequences is not seen as many independent confirmations of a hypothesis.

  • @wangnuny93
    @wangnuny93 2 years ago +9

    man, I don't work in the ML field, but this sure is fascinating!!!!

  • @alwadud9243
    @alwadud9243 2 years ago +17

    Thanks Ramin and team. That was the most interesting and well delivered presentation on neural nets that I have ever seen, certainly a lot new to learn in there. Most impressed by the return to learning from nature and the brain and how that significantly augmented 'standard' RNNs etc. Well, there's a new standard now, and it's liquid.

    • @mrf664
      @mrf664 1 year ago

      @alwadud9243 can you explain how this works?

    • @mrf664
      @mrf664 1 year ago

      I think it is interesting too, but I fail to grasp any intuition.
      The only other way I see is to spend hours with papers and equations, but I cannot afford the time for that at present, so I was curious if you were able to glean more insight than me :) 😊 thanks!

  • @KeviPegoraro
    @KeviPegoraro 3 years ago +8

    very good, the idea for that is simple; the problem lies in putting all of it to work together, that is the good stuff

  • @1238a8
    @1238a8 4 months ago +1

    That's an amazing concept. We should implement it out of spite.
    Too often we feel our brain to be mush. AI should suffer that way too.

    • @ShpanMan
      @ShpanMan 3 months ago

      Out of "Spike" maybe 😂

    • @tempname8263
      @tempname8263 3 months ago

      Your brain is just undertrained bro

  • @ibraheemmoosa
    @ibraheemmoosa 2 years ago +9

    The attention map at 7:00 looks fine to me. If you do not want to wander off the road, you should attend to the boundary of the road. And even after you add noise at 7:30, the attention still picks up the boundary, which is pretty good.

    • @vegnagunL
      @vegnagunL 2 years ago

      Yes, it still is a consistent pattern for the driving task.

    • @AsifShahriyarSushmit
      @AsifShahriyarSushmit 2 years ago +3

      This sounds kinda like the mesa-optimizer thing Robert Miles keeps talking about. ruclips.net/video/bJLcIBixGj8/видео.html
      A network can learn the same task in several ways with a totally different inner objective, which may or may not align with a biological agent doing the same task.

    • @ChrisJohnsonHome
      @ChrisJohnsonHome 2 years ago +1

      Because the LTC Network is uncovering the causal structure, it performs much better in noise (33:26), heavy rain/occlusions (42:54) and crashes less in the simulation.
      Since it pays attention to the causes, I wonder if it's also giving itself more time to steer correctly?

    • @moormanjean5636
      @moormanjean5636 1 year ago

      @@ChrisJohnsonHome I would guess that yes, the time constants of the network would learn to modulate in the face of uncertainty

    • @moormanjean5636
      @moormanjean5636 1 year ago +2

      Looking at the boundary of the road is not how humans drive. We assume that we know where the nearby boundary is already and so look at the horizon to update our mental maps. It is reasonable that neural networks should look to do the same, and it is evidence of LTC's causal behavior.

  • @samowarow
    @samowarow 2 years ago +47

    Feels like ML folks keep rediscovering things all over

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 years ago +11

      Yep. Basically if you have a good theoretical level of optimal control theory you can see that this approach is fusing state observation and control policy. There's literally no mention of it in the whole talk. I'll give the benefit of the doubt for the reason of this, but unfortunately I *very* often see, as you say, that ML people rediscover stuff and rebrand it as a ML invention (like backprop, which is literally just a discretized version of a standard technique in calculus of variations/optimal control).

    • @moormanjean5636
      @moormanjean5636 1 year ago +1

      @@lorenzoa.ricciardi4264 this is definitely not just "rediscovering" stuff. I'm not sure how you managed to watch the whole talk yet missed all the parts that this technique has outperformed previous techniques by leaps and bounds. You sound like a salty calculus teacher but I'll give you the benefit of the doubt for the reason of this lol.

    • @ShpanMan
      @ShpanMan 3 months ago

      @@moormanjean5636 Please show me the "Liquid" AI model that outperforms any modern LLMs, 2.5 years after this talk.

  • @Alexander_Sannikov
    @Alexander_Sannikov 2 years ago +48

    - let's make another attempt at implementing a biology-inspired neural network
    - proceeds to implement backprop

    • @fernbear3950
      @fernbear3950 2 years ago

      Direct MLE over direct data is still the best (ATM, AFAIK) in class for implicitly performing regression over a distribution density w.r.t. the internal states/activations/features of a network.
      Generally the rule of thumb is to limit "big steps" from the main trunk of development so that impact can be measured, etc. It also helps to vastly (i.e. orders of magnitude) increase the chance that something will succeed.
      Otherwise the chances of failure are much higher (and rarely get published, I would suspect from personal experience). I'm sure there is some nice interconnected minimally-required jump of feature subsets from this kind of research to a more Hebbian-kind-of-based approach, but then again there's nothing dictating we do it all at once (which can be exponentially expensive).
      Hopefully brain stuff comes in handy, but ATM the field is going towards massively nearly linear models instead of the opposite, since the prior affords better results (generally) for MLE-over-MCE.

    • @Gunth0r
      @Gunth0r 1 year ago

      @@fernbear3950 but linear models suck when there's big regime changes in the data

    • @fernbear3950
      @fernbear3950 1 year ago

      @@Gunth0r I'm not sure what you mean by 'regime' changes here. I wasn't talking about anything linear at all here. MLE over a linear model would be, uh, interesting to say the least, lol.

  • @0MVR_0
    @0MVR_0 3 months ago

    Naming this "liquid"
    analogizes the conventional network as solid or rigid.
    The analogous mechanism is the use of differential equations to represent multitudinous connections,
    just as water molecules can interact more freely with a greater number of neighbors.

    • @0MVR_0
      @0MVR_0 3 months ago

      Actually, this is likely a nonsense statement.
      The liquidity is provided through the architecture:
      the sensory neurons are convolutional, and the others seem to be recurrent,
      excluding the motor neurons.

  • @araneascience9607
    @araneascience9607 2 years ago +1

    Great work, I hope that they publish the paper on this model soon.

  • @johnniefujita
    @johnniefujita 1 year ago +4

    does anyone know about any sample code for a model like this?

  • @AA-gl1dr
    @AA-gl1dr 2 years ago +3

    amazing video. thank you for uploading.

  • @AndyBarbosa96
    @AndyBarbosa96 1 year ago +2

    What is the difference between these LNNs and coupled ODEs? Aren't we conflating these terms? If you drive a car with only 19 neurons, then what you have is an asynchronous network of coupled ODEs and not a neural network; the term is misleading.

  • @Eye_of_state
    @Eye_of_state 2 years ago +2

    Must share technology that saves lives.

  • @petevenuti7355
    @petevenuti7355 1 year ago

    I've had unarticulated thoughts resembling this concept for many decades; I never learned the math, so I never could express my ideas. I still need someone to explain the math at a high-school level!

  • @gameme-yb1jz
    @gameme-yb1jz 8 months ago

    this should be the next deep learning revolution.

  • @andreylebedenko1260
    @andreylebedenko1260 2 years ago +6

    Sounds interesting, but... The fundamental difference between biological and NN processing at the current state is time. While biological systems process input asynchronously, computers try to do the whole path in one tick. I believe this must be addressed first, leading to a completely new concept of NN, where input neurons first generate a set of signals (with some variation as the physical source signals change); those signals are then accumulated by the next layer of the NN, processed in the same fashion, and passed further. This way, signals that repeat over multiple sampling ticks of the first layer will be treated with a higher trust (importance) level by the next layer.

    • @terjeoseberg990
      @terjeoseberg990 2 years ago +1

      Our brains learn as we use them. Artificial neural networks are "trained" by using gradient descent to optimize an extremely complex function for a dataset during a training phase; then, as the network is used to predict answers, it learns nothing.
      We need continuous reinforcement learning.

    • @andreylebedenko1260
      @andreylebedenko1260 2 years ago

      @@terjeoseberg990 What about recurrent neural networks? Besides, the human brain also first learns how to see, grab, hold, walk, speak, etc. -- i.e. builds models -- and then it uses these models, improving them, but never reinventing them.

    • @terjeoseberg990
      @terjeoseberg990 2 years ago

      @@andreylebedenko1260, Recurrent neural networks are also trained using gradient descent. I don't believe our brains have a gradient descent mechanism. I have no clue how our brains learn. Gradient descent is pretty simple to understand. What our brains do is a complete mystery.

    • @hi-gf5yl
      @hi-gf5yl 2 years ago

      @@terjeoseberg990 ruclips.net/video/Q18ahll-mRE/видео.html

    • @moormanjean5636
      @moormanjean5636 1 year ago

      @@terjeoseberg990 actually look up backpropagation in the brain; it is plausible that we are doing something similar to backprop at the end of the day

  • @rickharold7884
    @rickharold7884 3 years ago +5

    Always interesting. Thx

  • @Tbone913
    @Tbone913 1 year ago +1

    But why do the other methods have smaller error bands? There is further improvement that can be done here

  • @manasasb536
    @manasasb536 2 years ago +3

    Can't wait to do a project on LNN and add it to my resume to stand out of the crowd.

  • @maxlee3838
    @maxlee3838 4 months ago

    This guy is a genius.

  • @michaelflynn6952
    @michaelflynn6952 2 years ago +12

    Why does no one in this video seem to have any plan for what they want to communicate and in what order? So hard to follow

    • @AM-ng8wc
      @AM-ng8wc 2 years ago +2

      They have engineering syndrome

  • @saeedrehman5085
    @saeedrehman5085 2 years ago +4

    Amazing!!

  • @paulcurry8383
    @paulcurry8383 2 years ago +16

    How does the attention map showing that the LNN was looking at the vanishing point mean it’s forming “better” representations?
    Shouldn't "better" representations only be understood as having better performance? If it's more explainable that's cool, but there are ways to train CNNs that make them more explainable while hurting performance.

    • @jellyboy00
      @jellyboy00 2 years ago +6

      Couldn't agree more. It would be more persuasive if Liquid Neural Networks were immune to some problem that previous architectures generally struggle with, such as adversarial examples.
      The fact that Liquid Neural Networks can't learn long-term dependencies as well as LSTM is sort of disappointing, as LSTM already underperforms compared with attention-only models.
      Not to mention that spiking neural networks are something that I myself (not an expert though) would say are designed according to biological brain mechanisms.

    • @rainmaker5199
      @rainmaker5199 2 years ago +2

      Isn't the point of looking at the attention map to understand how the network is understanding the current issue? When they showed the attention maps for all the models we could see that the LSTM was mostly paying attention to the road like 5-10 feet ahead, making it sensitive to immediate changes in lighting conditions. The LNN was paying attention to the vanishing point to understand the way the road evolves (at least it seemed like that's what they were getting at), and therefore not being sensitive to immediate changes in light level? It doesn't mean its forming 'better' representations, just that being able to distinguish what each representation is using as key information allows us to make more robust models that are less sensitive to common pitfalls one might fall into.

    • @jellyboy00
      @jellyboy00 2 years ago

      @@rainmaker5199 For me that is more of an interpretability issue. And for general autonomous driving, I think there is no definite answer about where the model should look; otherwise it becomes a soft handcrafted constraint or curriculum learning. It is still reasonable for the AI to look at the side of the road, as it also tells something about the curvature of the road. And in general autonomous driving, an obstacle or a pedestrian might pop up anywhere, so the claim that attending to the vanishing point of the road is better sounds less persuasive. Generally speaking, one does not even know in the first place what part of the input should be attended to.

    • @rainmaker5199
      @rainmaker5199 2 years ago +1

      @@jellyboy00 I think you misunderstood me. I'm not claiming that a model attending to the vanishing point is a better self-driving model for all circumstances, just that it's better at understanding that the road shape can be determined ahead of time rather than from the current step of points. This allows us to have the possibility of distributing responsibility between multiple models focused on more specific tasks. So basically, the fact that it's able to tell the road shape earlier and with less continuous information, alongside the fact that we know more specifically what task is being accomplished (rather than a mostly black box), is the valuable contribution here.

  •  1 year ago +3

    Great work!
    Does this mean all the training done for autonomous driving with the traditional NN goes to the toilet?

  • @MLDawn
    @MLDawn 2 years ago +2

    Could you please share the link to the original paper? Thanks

  • @krishnaaditya2086
    @krishnaaditya2086 3 years ago +3

    Awesome Thanks!

  • @d4rkn3s7
    @d4rkn3s7 2 years ago +17

    Ok, after half of the talk, I stopped and read the entire paper, which kind of left me disappointed. LNNs are promoted as a huge step forward, but where are the numbers to back this up? I couldn't find them in the paper, and I strongly doubt that this is the "game-changer" as some suggest.

    • @JordanMetroidManiac
      @JordanMetroidManiac 2 years ago +3

      Ramin makes a really good point about why LNNs could be better in a lot of situations, though. Time scale is continuous, allowing for the model to approximate any function with significantly fewer parameters. But I can imagine that the implementation might be so ridiculous that it will never replace DNNs.

    • @ChrisJohnsonHome
      @ChrisJohnsonHome 2 years ago +2

      He goes over performance numbers starting at around 29:44

    • @moormanjean5636
      @moormanjean5636 1 year ago

      They have a number of advantages, performance and efficiency being two of them. The problem I think a lot of people have is they expect ground-breaking results to be extremely obvious in terms of performance gains, as if the state-of-the-art wasn't extremely proficient to begin with. There are plenty of opportunities to scale the performance of LNNs, but it is their other theoretical properties that are what make them a game changer in my opinion.

    • @moormanjean5636
      @moormanjean5636 1 year ago +1

      For example, their stability, their time-continuous nature, their causal nature, these are very important yet subtle properties of effective models. Not to mention you only need a handful to pull off what otherwise would take millions of parameters... how is that not a gamechanger??

    • @Gunth0r
      @Gunth0r 1 year ago

      I couldn't even find the paper, where is it?

  • @Axl_K
    @Axl_K 3 years ago +6

    Fascinating... loved every minute.

  • @wadahadlan
    @wadahadlan 2 years ago +3

    this was a great talk, this could change everything

  • @KCM25NJL
    @KCM25NJL 3 months ago

    Makes me wonder if the research at OpenAI and such will shift towards multimodal language models + universal approximators. I can imagine that the world can be differentially modelled from the pretrained weights of the collective knowledge of humanity .......... eventually.

  • @zephyr1181
    @zephyr1181 1 year ago +2

    I would need a simpler version of the 22:49 diagram to understand this.
    Ramin says here that standard NN neurons have a recursive connection to themselves. I don't know a ton about ANNs, but I overhear from my coworkers, and I never heard of that recursive connection. Is that for RNNs?
    Is there a "Reaching 99% on MNIST"-simple explanation, or does this liquidity only work on time-series data?

  • @vdwaynev
    @vdwaynev 2 years ago +1

    How do these compare to neural ODEs?

  • @LarlemMagic
    @LarlemMagic 2 years ago +3

    Get this video to the FSD team.

  • @matthiaswiedemann3819
    @matthiaswiedemann3819 1 year ago +1

    To me it seems similar to variational inference ...

  • @phquanta
    @phquanta 2 years ago +9

    I'm curious: wouldn't a numerical ODE solver kill all the gradients, with error scaling exponentially as depth grows?

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 years ago +1

      You can differentiate through an ODE solver, either manually or automatically. Even numerical gradients may work if you're careful with your implementation

    • @phquanta
      @phquanta 2 years ago

      @@lorenzoa.ricciardi4264 You mean like an AdaGrad-type thing? Given that gradients can be computed exactly, i.e. the solution to the ODE exists in closed form, I would assume there would be no such problem. On the other hand, if there is no closed-form solution to the ODE presented, one is probably limited by the depth of the neural net, even with approaches like LSTM/GRU, "higher-order" ODE solvers, etc.

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 years ago

      @@phquanta I'm talking about automatic differentiation; there are several packages for it in many languages. And you can compute those automatic gradients without knowing the closed solution of the ODE. Of course, if you have the closed-form solution you can compute gradients manually, but that's not my point.

    • @phquanta
      @phquanta 2 years ago

      @@lorenzoa.ricciardi4264 All NNs have backprop and the chain rule, which basically unravels all derivatives exactly, as the nonlinearities are easily differentiable. In a liquid NN, along with all the other problems (vanishing/exploding gradients), you are adding a source of inherent numerical error on top of the existing ones, and even Runge-Kutta won't help. What I'm saying is, you are limited by the depth of the liquid NN. As a concept it might be cool, but I would assume it is not easily scalable.

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 2 years ago

      @@phquanta I'm not a particular expert in NNs. From what I see in the presentation there are only a few layers in this approach, not dozens. The "depth" mostly seems to come from the continuous process described by the ODEs of the synapses.
      You shouldn't have particular problems when differentiating through ODEs if you know how to do it properly. One part of the problem may be related to the duration of time you're integrating your ODE for (not mentioned in the talk) and the nonlinearity of the underlying dynamics. When you deal with very nonlinear optimal control problems with a naive approach like single shooting (which is probably related to the simplest kind of backpropagation), you'll end up having horrible sensitivity problems. That's why multiple shooting was invented. Otherwise you can use collocation methods, which work even better.
      As for numerical errors: with automatic differentiation your errors are down to machine epsilon by construction. They are as good as analytical ones, but most often they are way faster to execute, and one does not have to do the tedious job of computing them manually. If you combine a multiple shooting approach with automatic differentiation, you don't have numerical error explosion (or rather, you can control it very well). That's why we can compute very complicated optimal trajectories for space probes in a really precise way, even though the integration time spans years or even decades and the dynamics is extremely nonlinear.
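
    To make the point about differentiating through a solver concrete, here is a tiny assumed example (not from the talk): PyTorch autograd backpropagates through an unrolled fixed-step Euler solve of dx/dt = -theta*x, and the resulting gradient matches the analytic continuous-time value up to discretization error.

      import torch

      theta = torch.tensor(0.8, requires_grad=True)  # parameter of the dynamics
      x = torch.tensor(1.0)
      dt, n_steps = 0.01, 500

      for _ in range(n_steps):       # unrolled fixed-step Euler solve
          x = x + dt * (-theta * x)  # dx/dt = -theta * x

      loss = (x - 0.5) ** 2          # compare the final state to a target
      loss.backward()                # backprop through every solver step

      # Analytic check (continuous time): x(T) = exp(-theta*T), so
      # dloss/dtheta is approximately 2 * (x - 0.5) * (-T) * x.
      T = dt * n_steps
      xd = x.detach()
      print(theta.grad.item(), (2 * (xd - 0.5) * (-T) * xd).item())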

  • @jiananwang2681
    @jiananwang2681 11 months ago

    Hi, thanks for the great video! Can you share the idea of how to visualize the activated neurons as in 6:06 in this video? It's really cool and I'm curious about it!

  • @amanda.collaud
    @amanda.collaud 2 years ago +7

    What about backpropagation? Too bad he didn't finish his train of thought; this is more of an interview than a source of knowledge / a lesson.

    • @moormanjean5636
      @moormanjean5636 1 year ago

      He explained two different ways of calculating gradients for LTCs, each with their own pros and cons.

  • @aminabbasloo
    @aminabbasloo 2 years ago +3

    I am wondering how it does for RL scenarios!

    • @enricoshippole2409
      @enricoshippole2409 2 years ago

      As am I. I plan on testing out some concepts using their LTC Keras package. We'll see how it goes.

    • @moormanjean5636
      @moormanjean5636 1 year ago +1

      I have used it and it works well, just use a slightly larger learning rate than LSTM.
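
    A minimal sketch of what such a test setup might look like, assuming the authors' ncps package still exposes the LTC Keras layer and the AutoNCP wiring under the names below (the exact API should be checked against the repo), and using a slightly larger learning rate as suggested above:

      import tensorflow as tf
      from ncps.wirings import AutoNCP
      from ncps.tf import LTC

      obs_dim, act_dim = 8, 2
      wiring = AutoNCP(19, act_dim)  # 19 neurons in total, act_dim motor neurons

      model = tf.keras.Sequential([
          tf.keras.layers.InputLayer(input_shape=(None, obs_dim)),  # (time, features)
          LTC(wiring, return_sequences=True),  # liquid time-constant recurrent layer
      ])
      model.compile(optimizer=tf.keras.optimizers.Adam(1e-2),  # a bit larger than for an LSTM
                    loss="mse")
      model.summary()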

  • @lufiporndre7800
    @lufiporndre7800 7 months ago

    Does anyone have code for an autonomous car system? I would like to practice with it. If anyone knows, please share.

  • @shashidharkudari5613
    @shashidharkudari5613 2 years ago +1

    Amazing talk

  • @9assahrasoum3asahboou87
    @9assahrasoum3asahboou87 2 years ago

    fathi fes medos aziza 1 said Thank you so much

  • @ibissensei1856
    @ibissensei1856 3 months ago

    As a newbie, it is hard to grasp.

  • @dweb
    @dweb 2 years ago +1

    Wow!

  • @ian4692
    @ian4692 2 years ago

    Where to get the slides?

  • @imanshahmari4423
    @imanshahmari4423 2 years ago

    Where can I find the paper?

  • @Tbone913
    @Tbone913 1 year ago +1

    Can this be extended to a liquid transformer model?

    • @AndyBarbosa96
      @AndyBarbosa96 1 year ago +1

      No, this is not an ANN. This is coupled ODEs for control. The term is misleading.

    • @Tbone913
      @Tbone913 1 year ago

      @@AndyBarbosa96 ok thanks

  • @김화겸-y6e
    @김화겸-y6e 1 year ago +2

    1 year later?

  • @imolafodor4667
    @imolafodor4667 1 year ago

    Is it really reasonable to "just" model a CNN for autonomous driving? It would be better to compare liquid nets with policies trained in an RL system (where at least some underlying goal was followed), no?

  • @danielgordon9444
    @danielgordon9444 2 years ago +4

    ...it runs on water, man.

  • @sitrakaforler8696
    @sitrakaforler8696 2 years ago +2

    Dam...nice.

    • @alwadud9243
      @alwadud9243 2 years ago +1

      Yeah, I loved the part where he said '... and this is nice!'

  • @jos6982
    @jos6982 2 years ago

    good

  • @forheuristiclifeksh7836
    @forheuristiclifeksh7836 2 months ago

    7:00

  • @fbomb3930
    @fbomb3930 3 months ago

    For all intents and purposes doesn't a liquid network work like a KAN? Change my mind.

  • @VerifyTheTruth
    @VerifyTheTruth 2 years ago +2

    Does the brain distribute calculation loads to systemic subsets, initiating feedback loops from other cellular systems with different calculative specialties and capacities, based upon the types of contextual information it receives from extraneous environmental sources, which it then uses to construct or render the most relevant context to appropriate consciousness access to a meaningful response field trajectory?

  • @grimsk
    @grimsk 3 months ago

    The jewel of Romania

  • @zzmhs4
    @zzmhs4 2 years ago

    I'm not an expert, but this sounds to me like an implementation of different neurotransmitters, isn't it?

    • @moormanjean5636
      @moormanjean5636 1 year ago

      I don't see how it would be

    • @zzmhs4
      @zzmhs4 1 year ago

      @@moormanjean5636 I've watched the video again to answer you, and I still think the same.

    • @moormanjean5636
      @moormanjean5636 1 year ago +1

      @@zzmhs4 Let me try to explain my POV. Different neurotransmitters in the brain serve specific and multifaceted roles, some of which are similar but usually not. I think of distinct neurotransmitters as essentially being subcircuits that are coupled together on diverse timescales and in various combinations. Evolution allowed these to emerge naturally, but in my opinion, you would need something like a neuroevolutionary algorithm to actually implement an analogue of neurotransmitters in neural networks. What LTCs propose is something fundamentally different, and I think it has more to do with a model of the neurons/synapses that is more biologically accurate than with an attempted or indirect implementation of different neurotransmitters.

    • @zzmhs4
      @zzmhs4 1 year ago

      @@moormanjean5636 Ok, I see, thanks for answering my first question

  • @abinaslimbu3057
    @abinaslimbu3057 1 year ago

    John Venn 3 diagram class 10

  • @MaudWinston-t8n
    @MaudWinston-t8n 13 days ago

    Johnson John Hall Sandra Rodriguez Thomas

  • @kayaba_atributtion2156
    @kayaba_atributtion2156 2 years ago +1

    USA: I WILL TAKE YOUR ENTIRE MODEL

  • @ingenium7135
    @ingenium7135 2 years ago +1

    Soo, when AGI?

    • @shadowkiller0071
      @shadowkiller0071 2 years ago +5

      Gimme 5 minutes.

    • @egor.okhterov
      @egor.okhterov 2 years ago +2

      Why won't they break the mental barrier of having to stick to backprop and gradient descent...

    • @afbeavers
      @afbeavers 1 year ago

      @@egor.okhterov Exactly. That would seem to be the roadblock.

  • @ToddFarrell
    @ToddFarrell 2 years ago +4

    To be fair though, he isn't at Stanford, so he hasn't sold out completely yet. Let's give him a chance :)

  • @yes-vy6bn
    @yes-vy6bn 3 years ago +2

    @tesla 👀

  • @jonathanperreault4503
    @jonathanperreault4503 2 years ago

    At the end of the video he says these technologies are open-sourced, but there are no links in the video description. Can we gather the relevant code sources and GitHub repos?

    • @Hukkinen
      @Hukkinen 2 years ago +1

      Links are in the slides at the end.

  • @sahilpocker
    @sahilpocker 3 years ago

    😮

  • @egor.okhterov
    @egor.okhterov 2 years ago +7

    He failed at 2 things:
    1. He decided to solve differential equations.
    2. He didn't get rid of backpropagation.
    Probably he is required to do some good math in order to publish papers and be paid a salary. As long as we have such an incentive from the scientific community, we will be stuck with suboptimal narrow AI based on statistics and backpropagation.

    • @adrianhenle
      @adrianhenle 2 years ago +1

      The alternative being what, exactly?

    • @egor.okhterov
      @egor.okhterov 2 years ago +2

      @@adrianhenle Emulation of cortical columns, the way Numenta does it. For example, there is a video "Alternatives to Backpropagation in Neural Networks" if you're interested: ruclips.net/video/oXyQU0aScq0/видео.html

    • @zeb1820
      @zeb1820 2 years ago +4

      The differential equations were an example of how the continuous process of synaptic logic from neuroscience was used to enhance a standard RNN. He showed how he merged the two concepts mathematically to improve the expressivity of the model. I believe this was more for our educational benefit than to develop or test what he had already achieved.
      I do get your point about backpropagation, but that was not an aim of this exercise. No doubt when that is solved it may also, at some stage, be useful to merge it with the neuroscience-enhanced NN described here.

  • @MS-od7je
    @MS-od7je 2 years ago

    Why is the brain a Mandelbrot set?

  • @stc2828
    @stc2828 2 years ago +1

    Very sad to see AI development fall into another hole. The last 10 years were fun while they lasted. See you guys in 30 years!

  • @vikrantvijit1436
    @vikrantvijit1436 3 years ago +2

    Path breaking new ground forming revolutionary research work that will change the face of futures liberating Force focused On digital humanities SPINNED Technologies INNOVATIONS Spectrums.

    • @pouya685
      @pouya685 3 years ago +3

      My head hurts after reading this sentence

  • @tismanasou
    @tismanasou 1 year ago +2

    If liquid neural networks were a serious thing, they would have gained a lot more attention in proper ML/AI conferences, not just TEDx and the shit you are presenting here.

    • @DanielSanchez-jl2vf
      @DanielSanchez-jl2vf 1 year ago +2

      I don't know man, the transformer took 5 years for people to take it seriously; why wouldn't this?

  • @DarkRedman31
    @DarkRedman31 2 years ago

    Not clear at all.

  • @Goldenhordemilo
    @Goldenhordemilo 3 years ago +1

    μ Muon Spec

  • @tonyamyos
    @tonyamyos 2 years ago +3

    Sorry, but you make so many assumptions at almost every level. You are biased, and your interpretation of the functionality and eventual use of this 'computational' model has nothing to do with how true intelligence arises. Start again. And this time leave your biases where they belong... in your professors' heads.

  • @ToddFarrell
    @ToddFarrell 2 years ago +2

    Really it is just an interview because he wants to get a job at Google and make lots of money to serve ads :)

  • @StephenRoseDuo
    @StephenRoseDuo 10 months ago

    Can someone point to a simple LTC network implementation please?

  • @mishmohd
    @mishmohd 1 year ago

    Any association with Liquid Snake?

  • @niamcd6604
    @niamcd6604 1 year ago +1

    PLEASE.... Do you mind bothering to pronounce other languages correctly!!?
    (And before people jump up .. I speak multiple languages myself).