- Videos: 41
- Views: 40,217
Sanjiban Choudhury
Joined Jul 19, 2007
Assistant Professor at Cornell CS. I lead the Portal group, where we work on everyday robots for everyday users: youtube.com/@PortalCornell. Find out more at www.sanjibanchoudhury.com
5 Levels of Robot Learning
This is a sneak preview of the concepts covered in my course on Learning for Robot Decision Making, which I am teaching at Cornell in Fall 2022. Learn more at: www.cs.cornell.edu/courses/cs6756/2022fa/
What does learning mean to a robot? This video takes you on a journey through 5 increasingly rich levels of robot learning, from the simplest level (learning what the human wants you to do) through interactive no-regret learning, the Bellman equation, and building a value estimator, to a final unified game-theoretic framework (between a robot player and a value player). Much of what we know today in the fields of reinforcement learning, imitation learning, and mod...
Views: 11,480
Videos
Lecture 3: Generalized Weighted Majority -- The Most Versatile Algorithm
907 views · 2 years ago
In this third lecture, we discuss one of the most powerful algorithms in learning and decision making: Generalized Weighted Majority. Given a set of N options, how do you optimally hedge between these options? GWM not only answers this question, but it also provides an algorithmic template that shows up again and again in various fundamental problems in computer science: machine learning, optimi...
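Since the description stops short, here is a minimal sketch of the multiplicative-weights update at the heart of GWM; the exponential penalty, the learning rate eta, and the toy loss data are illustrative assumptions, not code from the course:

    import numpy as np

    def hedge(losses, eta=0.5):
        # Multiplicative weights over N options: play each option in proportion
        # to its weight, then exponentially downweight options with high loss.
        T, N = losses.shape
        w = np.ones(N)                      # start uniform over all options
        plays = []
        for t in range(T):
            plays.append(w / w.sum())       # probability of picking each option
            w *= np.exp(-eta * losses[t])   # penalize options by their loss
        return np.array(plays)

    # Toy run: option 0 is consistently cheaper, so weight concentrates on it.
    rng = np.random.default_rng(0)
    losses = rng.uniform(0, 1, size=(100, 3))
    losses[:, 0] *= 0.2
    print(hedge(losses)[-1])

With eta tuned on the order of sqrt(log N / T), this scheme is known to achieve O(sqrt(T log N)) regret against the best fixed option.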
Lecture 2: Prediction with Expert Advice
607 views · 2 years ago
In this second lecture, we look at a simple, fundamental setting of interactive learning - prediction with expert advice. You have a set of N experts that make predictions at each round. How do you combine their predictions so you do as well as the best expert? We discuss a general class of algorithms, weighted majority, that plays the majority vote. Either the majority is right, and no mistake...
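A minimal sketch of the weighted majority vote described above, assuming binary predictions and the textbook halving penalty (the data format is my own illustration):

    def weighted_majority(expert_predictions, outcomes, beta=0.5):
        # expert_predictions: per round, a list of N binary predictions (0/1).
        # outcomes: the true binary label for each round.
        n_experts = len(expert_predictions[0])
        w = [1.0] * n_experts
        mistakes = 0
        for preds, y in zip(expert_predictions, outcomes):
            # Predict with the weighted majority vote.
            vote_one = sum(wi for wi, p in zip(w, preds) if p == 1)
            vote_zero = sum(wi for wi, p in zip(w, preds) if p == 0)
            if (1 if vote_one >= vote_zero else 0) != y:
                mistakes += 1
            # Cut the weight of every expert that was wrong this round.
            w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
        return mistakes, w

    # Example: 3 experts over 4 rounds; expert 0 is always right.
    preds = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [1, 0, 0]]
    truth = [1, 0, 1, 1]
    print(weighted_majority(preds, truth))

The argument the lecture alludes to: whenever the algorithm errs, at least half the total weight gets multiplied by beta, which is what yields a mistake bound of the form O(log N + mistakes of the best expert).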
Lecture 1: Interactive Online Learning -- One Ring To Rule Them All
443 views · 2 years ago
In this series, we will try to understand the fundamental fabric that ties together all of robot learning: "How can a robot learn from online interactions?" Our quest is to build up a unified mathematical framework that we will wield to conquer recurring problems in reinforcement learning, imitation learning, model predictive control, and planning. Let's begin! For more information about me and my work, ...
Core Concepts: Interactive No-Regret Learning
1.1K views · 2 years ago
We explore the concept of interactive learning. Robots must interact with the world to gather data on which they learn. The principled way to learn when your data can be changing, possibly adversarially, is by striving to be "no regret", i.e., do as well as the best policy in hindsight. But greedily picking the best policy in hindsight fails, even on the simplest of examples! Join us as we jour...
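The failure of greedily picking the best policy in hindsight is easy to reproduce; below is a sketch of the standard two-expert counterexample (an assumed alternating loss sequence, not necessarily the example in the video):

    import numpy as np

    # Two experts whose losses alternate: (1,0), (0,1), (1,0), ...
    # Follow-the-Leader greedily picks the best expert in hindsight, so it
    # switches every round and pays ~T, while the best fixed expert pays ~T/2.
    T = 100
    losses = np.array([[1.0, 0.0] if t % 2 == 0 else [0.0, 1.0] for t in range(T)])

    cum = np.zeros(2)
    ftl_loss = 0.0
    for t in range(T):
        pick = int(np.argmin(cum))       # ties broken toward expert 0
        ftl_loss += losses[t][pick]
        cum += losses[t]

    regret = ftl_loss - cum.min()
    print(ftl_loss, cum.min(), regret)   # regret grows linearly with T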
Core Concepts: Linear Quadratic Regulators
3.5K views · 3 years ago
We explore the concept of control in robotics, notably Linear Quadratic Regulators (LQR). We see that a powerful way to think about control is as a dynamic optimization, where the goal is to compute a mapping from states to actions that minimizes a user-specified cost, e.g. land a rocket without exploding. Moreover, you can do this efficiently thanks to Bellman’s insight that the optimal value of...
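A minimal sketch of the Bellman (Riccati) backup behind finite-horizon LQR, with a made-up double-integrator system and cost weights for illustration:

    import numpy as np

    def lqr_gains(A, B, Q, R, T):
        # Finite-horizon LQR by backward Bellman (Riccati) recursion.
        # Dynamics x' = A x + B u, stage cost x^T Q x + u^T R u.
        P = Q.copy()                       # terminal value is just the state cost
        gains = []
        for _ in range(T):
            # One Bellman backup: minimize u^T R u + (Ax + Bu)^T P (Ax + Bu).
            K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
            P = Q + A.T @ P @ (A - B @ K)
            gains.append(K)
        return gains[::-1]                 # K_0, ..., K_{T-1}; optimal u_t = -K_t x_t

    # Double integrator (position + velocity, force input) as a toy system.
    dt = 0.1
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.0], [dt]])
    K = lqr_gains(A, B, Q=np.eye(2), R=np.array([[0.1]]), T=50)
    print(K[0])                            # first-step feedback gain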
Core Concepts: Imitation Learning
2.1K views · 3 years ago
We explore the concept of imitation learning in robotics. We see that imitation learning is a powerful way to implicitly program robots. Instead of tediously tinkering with rules or tuning reward functions, just demonstrate how you would like the robot to behave. But naively treating imitation learning as mere supervised learning, even on the simplest of examples, leads to very interesting fail...
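A minimal behavior-cloning sketch of that supervised reduction, assuming a linear ridge-regression policy purely for illustration (the lecture's treatment and failure analysis are richer):

    import numpy as np

    def behavior_cloning(states, expert_actions, lam=1e-3):
        # Fit a linear policy a = s @ W to expert data by ridge regression --
        # imitation reduced to plain supervised learning.
        S = np.asarray(states, dtype=float)
        A = np.asarray(expert_actions, dtype=float)
        W = np.linalg.solve(S.T @ S + lam * np.eye(S.shape[1]), S.T @ A)
        return lambda s: np.asarray(s, dtype=float) @ W

    # Synthetic "expert" demonstrations for a quick check.
    rng = np.random.default_rng(0)
    S = rng.normal(size=(200, 4))
    A = S @ np.array([[1.0], [0.5], [-0.3], [0.2]])
    policy = behavior_cloning(S, A)
    print(policy(S[0]), A[0])

    # The catch the video analyzes: the policy is only trained on states the
    # expert visits, so at test time its own small errors drift it into
    # unfamiliar states where mistakes compound (covariate shift).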
Lecture 1: What is Imitation Learning?
6K views · 3 years ago
In this series, we will journey into the depths of imitation learning. The purpose of our quest is to answer at a deep, mathematical level a single question: "What does it mean to imitate?" The answer, as we shall see, is simple and profound but there are many twists and turns along the way. Let's begin. For more information about me and my work, check out www.sanjibanchoudhury.com/ 1. Swamy et...
Lecture 10: Imitation Learning Finale: The Beginning
341 views · 3 years ago
Over the past 9 lectures, we have journeyed through the depths of imitation learning. We compressed all knowledge down to a single, game-theoretic framework. Armed with just this knowledge, in this series finale, we finally lift off and set our sights on new and distant frontiers. In a sense, we have only just begun. I hope you enjoy this preview of exciting ideas to come. For more information ...
Lecture 9: Imitation Learning -- It's Only A Game!
482 views · 3 years ago
In this ninth lecture, we finally look at imitation learning in its most fundamental form: as a game. This is a game between two players: a learner that generates a policy, and an adversary that discriminates between the values of the learner and the human expert. We'll see how this simple game-theoretic framework unifies all existing imitation learning algorithms, as well as giving us brand new algorith...
Lecture 8: Imitation Learning as Distribution Matching
725 views · 3 years ago
In this eighth lecture, we look at imitation learning as simply a distribution matching problem, i.e., generate trajectories that look like that of the expert. At the heart of the problem lies a question: "What does it mean for two distributions to be close, and how can we measure closeness?". We derive an estimator for an entire class of f-divergence and show that it ultimately reduces to solv...
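Here is a sketch of that classification reduction for one member of the f-divergence family, Jensen-Shannon; using scikit-learn's LogisticRegression as the discriminator is my illustrative choice, not the lecture's:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def js_lower_bound(expert_samples, learner_samples):
        # Train a classifier to tell expert states from learner states; plug
        # its probabilities into the variational bound on Jensen-Shannon:
        # JS(P, Q) >= log 2 + 0.5 E_P[log D] + 0.5 E_Q[log(1 - D)].
        X = np.vstack([expert_samples, learner_samples])
        y = np.concatenate([np.ones(len(expert_samples)),
                            np.zeros(len(learner_samples))])
        D = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]
        eps = 1e-12
        return (np.log(2)
                + 0.5 * np.mean(np.log(D[y == 1] + eps))
                + 0.5 * np.mean(np.log(1 - D[y == 0] + eps)))

The bound is tight at the Bayes-optimal discriminator, which is why a better classifier gives a sharper estimate of how far the learner's trajectories are from the expert's.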
Lecture 7: Imitation Learning Through a Bayesian Lens
551 views · 3 years ago
In this seventh lecture, we look at imitation learning in a Bayesian setting where we have a prior over possible cost functions the human may prefer. We show that the problem, fundamentally one of exploration vs. exploitation, is intractable, and explore a couple of remedies. The first is to simplify the problem down to Bayesian active learning, where we show efficient greedy algorithms can be near-op...
Lecture 6: Inverse Reinforcement Learning -- From Maximum Margin to Maximum Entropy
2.7K views · 3 years ago
In this sixth lecture, we look at the problem of recovering the underlying reward or cost function that explains human demonstrations. We show that there are two fundamentally different directions. The first is to view the human as an optimal planner and recover a cost function that they must be optimizing. The second is to view the human as a stochastic process and recover the underlying distr...
Lecture 5: Imitation as a Stairway to Self-Improvement
682 views · 3 years ago
In this fifth lecture, we look at the role of values in imitation. Not all imitation errors are equal; some have a greater impact on values than others. Providing imitation learning algorithms with values opens the door to algorithms that can actually outperform the human expert in terms of their own values. We climb the staircase of algorithms that bootstrap imitation learning to ultimately so...
Lecture 4: Imitation from Interventions
945 views · 3 years ago
In this fourth lecture, we look at a natural way by which humans teach and learn: interventions. We show that naively imitating interventions can go horribly wrong. Instead, our key insight is that interventions are laden with implicit information about the human's value function. We take a look at how one may recover the value function from both deterministic and probabilistic paradigms. For mo...
Lecture 3: Interaction in Imitation Learning
1.2K views · 3 years ago
Lecture 2: Feedback in Imitation Learning -- The Three Regimes of Covariate Shift
2.3K views · 3 years ago
Respecting helicopter performance charts for safe flight
77 views · 5 years ago
[RSS 2015] Theoretical limits of speed and planning for forest flight
35 views · 5 years ago
[ICRA 2015, AHS 2014] Guaranteed Safe Flight of a Full Scale Helicopter
26 views · 5 years ago
Guaranteed Safe Flight: Simulation of flying in the Grand Canyon
15 views · 5 years ago
[AHS 2013] Autonomous Emergency Landing of a Helicopter Pitch
48 views · 5 years ago
[JFR'19] High Performance and Safe Flight of Full-Scale Helicopters from Takeoff to Landing
56 views · 5 years ago
[ICRA'15] The Dynamics Projection Filter
22 views · 5 years ago
[ICRA'13] RRT*-AR: Sampling-Based Alternate Routes Planning
85 views · 5 years ago
[ICRA'13] SPARTAN: Flying in Robot City
36 views · 5 years ago
[ICRA'16] RABIT* : Interleaving local and global search
64 views · 5 years ago
Learning to Gather Information via Imitation of Clairvoyant Oracles
14 views · 5 years ago
What kind of shitty video series is this? Is it read from a ChatGPT script?
Sir, you teach so well... can you please teach us deep learning, transformers, and multimodal LLMs too... I really loved your videos.
Sir, I am a robotics software engineer and want to move into the field of imitation and reinforcement learning for manipulation robots. Do you recommend I start learning ML first and then RL and IL? Can you please suggest some resources too, if possible?
This is so far the best lecture I have ever heard on imitation learning.
Thank you so much, professor. It's a really good place for me to start. Thank you so much for sharing the knowledge.
Great
I really enjoyed these lectures. If you ever came back to it, I'd definitely watch the new content. One minor piece of feedback though: the speed of presentation is not really adapted to how much time is needed to digest the material under consideration. Spending a bit more time on the trickier concepts when they are introduced could be helpful.
Can you explain the formulas at 15:19? (That whole slide.)
Very, very well explained!
Helps me a lot. Thank you Sanjiban!
This helped me a lot in my research. Thank you, Sanjiban ji.
Thank you so much for this insightful explanation of LQR! :)
Great series of lectures and resources. Thanks a lot.
Hi Sanjiban, thank you greatly for the lecture! I have a question at 15:28. As for the first inequality, as long as all possible policies don't incur the same loss value, the equality wouldn't hold. Correct? Also, in the last inequality terms, isn't that simply showing that for any policy the regret is lower-bounded by 0? How can one conclude that at least one policy must be pretty good as written in the lecture note? Thanks.
Would a neural network-based policy trained with standard gradient descent, replacing the dataset at each batch rather than aggregating it, still be considered a no-regret learner?
Great question! So online gradient descent over a convex loss function is no-regret. Neural networks are, unfortunately, not convex so the theory doesn't hold for them. But the theory does hold for kernels (like RKHS) and there is work that shows deep networks are approximately equivalent to kernel machines (such as arxiv.org/pdf/2012.00152.pdf)
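For readers who want to see the no-regret learner referenced in this exchange, here is a minimal sketch of projected online gradient descent; the 1/sqrt(t) step size and unit-ball projection are standard assumptions behind the O(sqrt(T)) regret bound, not details from the reply:

    import numpy as np

    def online_gradient_descent(grad_fns, x0, radius=1.0):
        # Projected online gradient descent with eta_t = 1/sqrt(t); over convex
        # losses on a bounded set this achieves O(sqrt(T)) regret, i.e. no-regret.
        x = np.asarray(x0, dtype=float)
        iterates = [x.copy()]
        for t, grad in enumerate(grad_fns, start=1):
            x = x - (1.0 / np.sqrt(t)) * grad(x)
            norm = np.linalg.norm(x)
            if norm > radius:              # project back onto the feasible ball
                x = x * (radius / norm)
            iterates.append(x.copy())
        return iterates

    # Toy stream of convex losses f_t(x) = ||x - c_t||^2 with random targets.
    rng = np.random.default_rng(0)
    grads = [(lambda x, c=rng.normal(size=2) * 0.5: 2 * (x - c)) for _ in range(200)]
    print(online_gradient_descent(grads, x0=np.zeros(2))[-1])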
No minions 🤯🤯🤯
This is a great resource Sir! Your way of explanation with the animations is exemplary!
Can you make some videos on the coding part of imitation learning as well? Can't find anything online!! Thanks in advance.
Great suggestion, will definitely try!
I am sorry, but does your series include behavioral cloning somewhere?
It does! Lecture 2 talks about behavior cloning, where it works and where it fails.
@@sanjibanc Thank you so much, professor. I really appreciate your work! Thank you from Vietnam :D
@@tuongnguyen9391 Thank you! Of course! I'll put out more this semester as I am teaching www.cs.cornell.edu/courses/cs6756/2023fa/
@@sanjibanc Thank you professor! Looking forward to the lecture videos of this course!
Thank you so much, sir. We have only learnt eigenvalue placement in school; this is very intuitive to understand.
This is extremely informative, sir. Thank you.
5:40 Where do the upper bound equations come from? In all the notes and videos I see, people just bring them up like they're something super obvious.
You can get a lot of them by using the Taylor expansion and looking at what happens when you discard some of the terms
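A hedged guess at the specific bound, assuming 5:40 refers to the usual weighted-majority analysis: the inequality in question is typically 1 - x <= e^{-x}. From the Taylor expansion e^{-x} = 1 - x + x^2/2! - x^3/3! + ..., dropping the quadratic and higher terms suggests the bound, and it in fact holds for all x because e^{-x} is convex and 1 - x is its tangent line at x = 0. In the mistake-bound proof, this converts each multiplicative (1 - eta) penalty into an e^{-eta} factor, which telescopes cleanly across rounds.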
Wish the entire lecture series was available 😃
Great lecture!
amazing video! thanks
Nice lecture, Sanjiban! Thanks for presenting complicated ideas in an intuitive form, it's very helpful and inspiring!!
Brilliant video. Helped me a lot!
This channel is a hidden gem :)
Thanks!
Beautifully explained! Thanks!
Love the floating head thing you've got going on
Such a brilliant lecture
Amazing
Super
Well done Sanjiban, congratulations
These lectures are great ! Thank you Sanjiban !