Markov Decision Processes - Computerphile

  • Published: 24 Oct 2022
  • Deterministic route finding isn't enough for the real world - Nick Hawes of the Oxford Robotics Institute takes us through some problems featuring probabilities.
    Nick used an example from Mickael Randour: bit.ly/C_MickaelRandour
    This video was previously called "Robot Decision Making"
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 118

  • @Deathhead68 • 1 year ago +345

    This guy was my lecturer about 10 years ago. He was very down to earth and explained the concepts in a really friendly way. Glad to see he's still doing it.

    • @centcode • 1 year ago +2

      We might have crossed paths at uni of bham

    • @Deathhead68 • 1 year ago

      @@centcode was there 2012-2015

    • @Mounta1ngoat • 1 year ago +4

      Glad to see Nick here, he definitely provided some of the clearest and most interesting explanations throughout my degree. As well as setting us loose with a lot of Lego robots and watching chaos ensue.

    • @erazn9077 • 1 year ago

      @@Mounta1ngoat lol that sounds great

    • @symonkanulah3809 • 1 year ago

      @@Deathhead68 it sounds great 👍

  • @CalvinHikes • 1 year ago +171

    This channel makes me appreciate the human brain more. We do all that automatically with barely a moment's thought.

    • @Ceelvain • 1 year ago +29

      It also fails spectacularly from time to time.
      For instance, the so-called "sunk cost fallacy" might make you stay at the train station for much too long: you've already invested so much time into waiting for the train that you don't want that time to go to waste.
      The fallacy is that the time spent waiting is not an investment. It's a pure loss.

    • @raginald7mars408 • 1 year ago +2

      which causes ALL the Problems
      we create
      and we get ever more creative

    • @GizmoMaltese • 1 year ago

      The key is that we don't always make the best choice. For example, if you're choosing a route to work, as in this video, you may not make the best choice, but it doesn't matter.

    • @Ceelvain • 1 year ago

      @@real_mikkim and with all this computation, it still manages to fall for the most basic fallacies.
      I'm very much unimpressed.

  • @mateuszdziezok8631 • 1 year ago +44

    OMG as a Robotics student, I'm amazed how well explained that is. Love it

  • @tlxyxl8524 • 1 year ago +17

    Just took an RL course. The Bellman equation and Markovian assumptions are so familiar. Btw, for those who are interested, the algorithms to solve discrete MDPs (or model-based RL problems in general) are Value Iteration and Policy Iteration, which are both based on the Bellman equation.
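
    A minimal value-iteration sketch of that idea in Python (an illustrative toy MDP; the state names, probabilities, and costs are made up, not taken from the video):

      # Value iteration for a tiny discrete MDP.
      # P[s][a] is a list of (probability, next_state, cost) outcomes.
      GAMMA = 0.95   # discount factor
      THETA = 1e-6   # convergence threshold

      P = {
          "home": {"bike":  [(1.0, "work", 45)],
                   "drive": [(0.2, "work", 20), (0.8, "work", 70)]},
          "work": {},  # absorbing goal state: no actions, value stays 0
      }

      V = {s: 0.0 for s in P}
      while True:
          delta = 0.0
          for s, actions in P.items():
              if not actions:
                  continue
              # Bellman optimality backup: minimise expected cost
              best = min(sum(p * (c + GAMMA * V[s2]) for p, s2, c in outs)
                         for outs in actions.values())
              delta = max(delta, abs(best - V[s]))
              V[s] = best
          if delta < THETA:
              break

      print(V)  # expected cost-to-go from each state under the optimal policy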

  • @gasdive • 1 year ago +24

    I made these decisions for my real commute. The train was fastest, but occasionally much longer. The car was fast, but the cost of parking equalled 2 hours of work, so was effectively slowest. The latest I could leave and be sure of being on time was walking.

  • @blacklabelmansociety • 1 year ago +14

    Please, bring more from this guy

  • @SachinVerma-lx5bx • 1 year ago +8

    Where the formal definitions for concepts like MDPs can get overwhelming, it really helps to have these easy-to-understand explanations.

  • @pierreabbat6157 • 1 year ago +15

    There is a 3% chance that, somewhere along the route, there's a half-duplex roadblock because they're fixing the overhead wires or something. There's a 0.1% chance that a power line or tree fell across the road, forcing you to take an extremely long detour, but half of the time this happens, you could get past it on a bike.

  • @engineeringmadeasy • 1 year ago +4

    Nice one, I met Professor Nick at Pembroke College Oxford. It was an honour.

  • @tobiaswegener1234 • 1 year ago +7

    This was a fantastic simple explanation, very enlightening.

  • @cerealport2726 • 1 year ago +12

    I'd like an autonomous taxi system that would decide it's all too hard to take me to the office and would just take me back home, or, indeed, just refuse to take me to the office.
    "Sorry, I'm working from home today because the car refused to drive itself."

    • @IceMetalPunk • 1 year ago +4

      "My robot ate my transportation, boss, there was nothing I could do *except* put my comfy PJs back on."

    • @cerealport2726 • 1 year ago +3

      @@IceMetalPunk Sounds legit, take the rest of the week off.

  • @yvesamevoin8720 • 1 year ago +3

    You can hear the passion in every word he pronounces. Very good explanation.

  • @asfandiyar5829 • 1 year ago +14

    I literally had my final-year project use a Kalman filter to solve this problem. That's awesome!
    Edit: spelling

  • @Ceelvain • 1 year ago +2

    I've heard a lot about MDPs and policy functions in the context of reinforcement learning, but this is the best explanation I've ever heard.

  • @Ceelvain • 1 year ago +6

    I rarely put a like on a video, but this one deserves it.
    I definitely want to hear more about the algorithms to solve MDP problems.

  • @elwood.downey • 1 year ago

    The best explanation of this I've ever heard. Many thanks.

  • @phil9447 • 1 year ago +6

    MDPs are the topic of my bachelor's thesis, and the example really helped me understand everything a lot better; I think I'll be using it throughout the thesis to understand the theory I have to write about. It's a lot easier to understand than some states a, b, and c and actions 1, 2, 3 :D

  • @tristanlouthrobins • 2 months ago

    This is such a fascinating breakdown of Markov decision making. I love the mathematics that underpins Markov, but the creativity and imagination applied to the example and its host of solutions are delicious brain food.

  • @lucrainville4372 • 1 year ago

    Fascinating look into decision-making.

  • @BobWaist • 10 months ago

    Great video! Really well explained and interesting.

  • @Imevul • 1 year ago

    I've unconsciously done something similar with my commute to work. I can take the subway or the bus. The subway usually takes the same amount of time every time, but there's a longer walk, and occasionally there are signalling issues that force me to take the bus anyway. During winter, the bus can get stuck on the snowy hills, and then I'm forced to take a taxi. The bus also has a connection that I sometimes barely miss, so I may need to wait either ~1 minute or ~15 minutes for the next one. One upside is that if the connecting bus takes too long, or never comes, I'm pretty close to work already, so I could walk the rest of the way in a pinch.
    The biggest problem is that I have no idea how to assign the right probabilities to each of those events. There's just not enough data (that I have access to, at least). Usually I just take the bus to work (less walking, and no signalling issues to deal with) and the subway home (to avoid the connecting bus). If nothing goes wrong, they take pretty similar times.

  • @spyboyb321 • 1 year ago +1

    The timing of this video! I am currently trying to work on a project that uses this in my AI class

  • @TGUGCL • 1 year ago +1

    Very interesting video. What about adding multiple criteria to the model? For instance, time and money in the commuting model. Is there software that can help you create and solve these types of multiple-criteria stochastic decision-making problems? Something like Enterprise Dynamics, a discrete-event simulation software platform.

  • @LukaszWiklendt • 1 year ago +3

    16:17 If you're allowed to remember how many cycles you waited for the train, does this mean you lose the Markov property? Or does the Markov property relate to the environment rather than your decision?

    • @mgostIH • 1 year ago

      Looking it up on Wikipedia, it seems like they define the policy to take only the current state rather than the current state plus reward.
      Granted, you can always augment the state space to include each possible wait for the train at some specific time on the clock and make it Markovian, but the example they made does violate the Markov property if the nodes described are the states.
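
      A tiny Python sketch of that augmentation trick (the state and action names are hypothetical, purely for illustration):

        # Fold the wait count into the state so a policy that "remembers"
        # how long it has waited still depends only on the current state.
        from typing import NamedTuple

        class State(NamedTuple):
            location: str   # e.g. "station"
            waits: int      # cycles already spent waiting

        def policy(s: State) -> str:
            # A function of the current (augmented) state only,
            # so the Markov property is preserved.
            if s.location == "station" and s.waits >= 3:
                return "give_up_and_cycle"
            return "keep_waiting"

        print(policy(State("station", 0)))  # keep_waiting
        print(policy(State("station", 4)))  # give_up_and_cycle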

  • @Techmagus76 • 1 year ago +3

    Once the AI works well enough, it will put the bike in the car, and if it notices that traffic is heavy, it will take the bike out and travel the rest of the way by bike.
    Next option: use the bike to get to the train station, and if the train isn't coming, switch directly back to the bike.

  • @GBlunted • 1 year ago +2

    You shouldn't be afraid to ask the teacher, "Okay, explain that one more time...", so they get a chance to produce better, cleaner, more polished bits to put in the video.

  • @vsandu • 1 year ago

    Excellent!!! Cheers.

  • @SystemSh0cker • 1 year ago +1

    Another perfect video. Thanks for that! But I'm still asking myself... will this continuous printer paper ever run out??? :D

  • @khaledsrrr • 5 months ago

    Phenomenal
    All the respect

  • @Veptis • 1 year ago

    So is there a way to compute the solutions? I assume some matrices show up: one for the probabilities and one for the sums of times. Then you can multiply them and get a different time distribution for every strategy?
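
    That guess is close to how one fixed strategy is evaluated in practice. A sketch in Python with NumPy (the states, probabilities, and times below are made up, not from the video): collect the transition probabilities under the chosen strategy into a matrix P and the expected time spent per step into a vector r, then solve v = r + P v for the expected remaining travel time v.

      import numpy as np

      # Transition matrix under one fixed strategy; P[i][j] is the
      # probability of moving from state i to state j. State 2 is
      # absorbing (arrived at work), so its row is all zeros.
      P = np.array([[0.0, 0.7, 0.3],
                    [0.0, 0.0, 1.0],
                    [0.0, 0.0, 0.0]])

      # Expected minutes spent on the step taken from each state.
      r = np.array([10.0, 25.0, 0.0])

      # Solve (I - P) v = r, which is the same as v = r + P v.
      v = np.linalg.solve(np.eye(3) - P, r)
      print(v)  # expected remaining travel time from each state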

  • @Leon-pu3vm • 1 year ago

    Extremely nice

  • @firsttyrell6484 • 1 year ago +4

    Image stabilization would be nice.

  • @WalkerRacing • 1 year ago +4

    Brady, will you please find someone to interview about chess engines/chess programming/neural nets? That would be super interesting.

    • @ideallyyours • 1 year ago +2

      This interviewer isn't Brady. Says in the description: "This video was filmed and edited by Sean Riley."

  • @samt2226 • 1 year ago +2

    What sort of paper is being used for the diagrams?

  • @patrickbateman455 • 1 year ago +6

    Very nice.

    • @bigprovola • 1 year ago +6

      Let's see Paul Allen's Markov chain.

  • @rd42537 • 1 year ago

    That paper takes me back!

  • @chipsafan1 • 1 year ago

    Am I correct to assume that a first-order Markov system is similar to frequentist statistical models as a methodology?

  • @alphgeek • 1 year ago

    Are the policies analogous to a reward function in a neural network?

  • @opusdei1151 • 1 year ago

    How does the algorithm work with an imperfect-information game like poker? Can you apply it to poker?

  • @SozImaScrub • 1 year ago +1

    @MarkovBaj any thoughts?

  • @jasontrunk3353 • 1 year ago

    This is great.

  • @timng9104 • 1 year ago +1

    Wow, probabilistic computing is kinda interesting. Can you do a video on physical unclonable functions? I need an explainer like this XD

  • @avinier325 • 7 months ago

    Can anyone please tell me where he got his watch from?

  • @brettbreet • 1 year ago

    What's the watch model he's wearing?

  • @deep.space.12 • 1 year ago +1

    So... next video gonna be POMDP?

  • @jonr6680 • 1 year ago +3

    Fascinating and useful overview.
    I've watched a few machine learning lectures, and it intrigues me that the logic, theory, mechanics, etc. are (at this 101 level) identical to the decision theory that any human should, could, would use to live their lives efficiently... but never does! Because we were never taught how.
    So I bet even the scientists who program their AI for some (probably) corporate exploitative system ironically waste their lives making dumb decisions every day...
    And the example given of commuting to work is the classic First World Problem... Like gamblers, we all think we know how to game the system, but by playing it we have ALREADY LOST.
    Did I just invent computational philosophy??
    Per the reboot movie Tron - .

  • @ENI232 • 4 months ago

    More!

  • @IanKjos • 1 year ago +1

    There's no point in an edge going home from the railway station because having been at the railway station does not change the stochastic costs of the other options. Once you've decided the rail has the lowest stochastic cost, you're done. Now if we add a concept of traffic changing with time, then we have a higher-order model and the edge becomes pointful again.

  • @bongsurfer • 1 year ago

    Thanks

  • @chiboubamine5970 • 3 months ago

    I have a problem called the Facility Layout Problem which I am trying to solve using Reinforcement Learning. The initial state is a layout that has a cost, and the goal is to change the facility layout in order to minimize the cost. My question: should this problem be treated episodically or continuously, and what do I do in the case where there is no absorbing state?? I would be extremely happy if someone could help.

    • @ChristophTungersleben • 2 months ago

      Whether it's episodic or continuous depends on the starting state of the 'system': each action is an episode, but it is possible to hit the optimum by chance. Without a break condition, a loop might follow.

  • @ohsweetmystery • 1 year ago +1

    The bike can also take longer than 60 minutes. Flat tires, catastrophic mechanical failure, getting hit by another vehicle, etc.

    • @scottcox503 • 1 year ago

      True, but it's much more within your control.

  • @DanielkaElliott • 1 year ago

    It's like: if you are already late, just take the bus, but if you have time, take the fastest route according to Google Maps.
    Otherwise take the simplest route you have time for (with the fewest changes and the least walking).

  • @danielg9275 • 1 year ago

    Coo coo cachoo the probability depends on you!

  • @terencewinters2154 • 4 months ago

    Do robots queue up?

  • @geniusdavid • 1 year ago

    Things to have as a computer scientist: a marker and paper. 😮

  • @Bill0102 • 4 months ago

    Remarkable work! This content is fantastic. I found something similar, and it was beyond words. "Game Theory and the Pursuit of Algorithmic Fairness" by Jack Frostwell

  • @KibbleWhite • 1 year ago +1

    This is great, except you got the percentages for traffic probability wrong. Light traffic is 10%, medium traffic is 20% and heavy traffic is 70% of the time.

  • @TheThunderSpirit • 1 year ago

    I'm doing reinforcement learning now too.

  • @2k10clarky • 8 months ago

    You might also have a soft deadline for arriving at work, so that, for example, you're fine as long as you're late only 1% of the time.

  • @marklonergan3898 • 1 year ago +1

    I have to go to the bank and trust me I will be there in about the time of the year is starting to stir fry sauce instead of garlic on the way home now anyway I think I have a few things to do in the morning.
    There are predictive text models at work. Start with "I " and keep hammering the predicted word, and see what comes out. 😁.

  • @RayCase • 1 year ago

    2022. Still using tractor feed printer paper as scrap.

  • @odiseezall • 1 year ago

    This is exactly what AI assistants should allow us to do: apply mathematical analysis to real-world problems, in real time.

  • @deanmarktaylor • 1 year ago

    I watched the film "The Mist" (2007) last night, it seems like "David" could have used a little "help" with this kind of decision making in the end.

  • @Jkauppa • 1 year ago +1

    Make the difference/similarity between a strict algorithm and a fuzzy probabilistic selection algorithm clear.

    • @Jkauppa • 1 year ago +1

      In the end the Bayesian decision is the same as the strict algorithm, but the implementation is wildly different, and the cleanness/interpretation of the algorithm can be clear or fuzzy (same problem, different paths, partial results between steps, end result as logged).

    • @Jkauppa • 1 year ago +1

      Fuzzy probabilistic AI vs Dijkstra for shortest path.

    • @Jkauppa • 1 year ago

      All algorithms give the same kinds of answers for the same problem, but in different logical/mathematical ways.

    • @Jkauppa • 1 year ago

      Describe Dijkstra/A* as an infinite-memory probabilistic state algorithm.

    • @Jkauppa • 1 year ago

      An algorithm might decide on the fly, while training, whether it remembers previous states or not.

  • @jasonmcfarlane7243 • 1 year ago +2

    To all the people in the comments: no, he doesn't look 'weird' or 'wrong'; he has a lazy eye or a similar condition. These conditions are common and normal. Shame on you.

  • @OwenPrescott • 1 year ago

    It really bothers me that he's waving the pen around without the lid on

  • @Lion_McLionhead • 1 year ago

    These shortest path algorithms convinced lions that whoever designs these algorithms is a lot smarter than a lion, spent an entire career designing just 1 algorithm, & it's pointless to try to remember them all.

  • @Eagle3302PL • 1 year ago

    This video presents a problem, names a solution, doesn't present the named solution, then just ends. The whole video can be summed up as "in computer science sequential decisions with probable outcomes are made by using some approach, the approach requires some conditions to be determined for a desired outcome". IT NEVER SHOWS A SOLUTION, IT JUST SAYS THERE IS ONE. WHAT'S THE POINT?

  • @iwir3d • 1 year ago

    Let's go Skynet! ..... Let's go Skynet! Long live the robot overlords.

  • @6DAMMK9 • 1 year ago

    “How to guide AI to draw 5 fingers instead of forcing it”
    or use chopsticks to eat noodles
    or bake a cake

  • @gollolocura • 1 year ago

    Always take the bike

  • @liftingisfun2350 • 1 year ago

    What happened to him?

  • @BritishBeachcomber • 1 year ago

    *Self-driving car.* Bike swerves in front. Action? 1. Brake hard, but can you stop in time? 2. Swerve left, but what about that little kid? 3. Swerve right and hit oncoming traffic, maybe killing many more people?
    Humans are very bad when faced with uncertainties like that. Machines would be no better.

  • @hurktang • 1 year ago

    No one in this video understands how trains work. The infographic makes the train jitter along its route, and no one has ever heard of train schedules.
    We should also factor in cost: the risk of accident, the health benefits, the ability to read your email on the train...

    • @Computerphile • 1 year ago

      The graphic illustrates that the route goes via somewhere else... (Unrealistic route for the timings but inspiration taken from my route from Nottingham to Oxford to meet Nick) HTH -Sean

    • @hurktang • 1 year ago

      @@Computerphile Ah, sorry! That makes sense. You turned a 150-minute train ride into a 30-minute train ride, and I found the ride quite bumpy. Thanks for taking the time to reply to me.

    • @Computerphile • 1 year ago

      You're welcome :0)

  • @alexandrumacedon291 • 1 year ago

    There are no decisions; there are choices, and all are random if the parameters are obscure. Just like us: we are biological machines, we know the rules, but we choose as we please.

  • @ShadowGameAlchemy • 1 year ago +2

    I really love all your videos, but I can't stand the sound of the marker pen against the paper. That kind of hiss sound irritates me to my core. I might be the only one in the world, but my brain is programmed that way. Can you please remove that sound or use a different ballpoint or other pen? I have to hold my earphones far away when you start writing. Please consider this.

  • @michaelmueller9635 • 1 year ago

    My Sunday ...a chameleon is teaching me about robot decisions ...I'm trippin bro xD

  • @veeek8 • 1 year ago

    So there is a scientific theory behind why I prefer cycling 😂

  • @buraktekgul2079 • 11 months ago

    The paper's sound is so bad. Please use a whiteboard for the next videos.

  • @D1ndo • 1 year ago

    I was waiting 17 minutes for him to actually solve the problem using the algorithm, yet he never got to the point, only babbled about the same thing over and over again. Big dislike.

  • @johnsenchak1428 • 1 year ago

    REPORTED NOT COMPUTER RELATED