Markov Decision Processes - Computerphile

  • Published: 24 Oct 2022
  • Deterministic route finding isn't enough for the real world - Nick Hawes of the Oxford Robotics Institute takes us through some problems featuring probabilities.
    Nick used an example from Mickael Randour: bit.ly/C_MickaelRandour
    This video was previously called "Robot Decision Making"
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 118

  • @Deathhead68 • 1 year ago +345

    This guy was my lecturer about 10 years ago. He was very down to earth and explained the concepts in a really friendly way. Glad to see he's still doing it.

    • @centcode • 1 year ago +2

      We might have crossed paths at uni of bham

    • @Deathhead68 • 1 year ago

      @@centcode was there 2012-2015

    • @Mounta1ngoat • 1 year ago +4

      Glad to see Nick here, he definitely provided some of the clearest and most interesting explanations throughout my degree. As well as setting us loose with a lot of Lego robots and watching chaos ensue.

    • @erazn9077 • 1 year ago

      @@Mounta1ngoat lol that sounds great

    • @symonkanulah3809 • 1 year ago

      @@Deathhead68 it sounds great 👍

  • @CalvinHikes • 1 year ago +171

    This channel makes me appreciate the human brain more. We do all that automatically with barely a moment's thought.

    • @Ceelvain • 1 year ago +29

      It also fails spectacularly from time to time.
      For instance, the so-called "sunk cost fallacy" might make you stay at the train station for much too long: you've already invested so much time into waiting for the train that you don't want that time to go to waste.
      The fallacy is that the time spent waiting is not an investment. It's a pure loss.

    • @raginald7mars408 • 1 year ago +2

      which causes ALL the Problems
      we create
      and we get ever more creative

    • @GizmoMaltese • 1 year ago

      The key is that we don't always make the best choice. For example, if you're choosing a route to work, as in this video, you may not make the best choice, but it doesn't matter.

    • @Ceelvain • 1 year ago

      @@real_mikkim and with all this computation, it still manages to fall for the most basic fallacies.
      I'm very much unimpressed.

  • @mateuszdziezok8631 • 1 year ago +44

    OMG as a Robotics student, I'm amazed how well explained that is. Love it

  • @tlxyxl8524 • 1 year ago +17

    Just took an RL course. The Bellman equation and Markovian assumptions are so familiar. Btw, for those who are interested, the algorithms to solve discrete MDPs (or model-based RL problems in general) are Value Iteration and Policy Iteration, which are both based on the Bellman equation.
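
    A minimal value-iteration sketch of that idea in Python (an illustrative toy MDP; the state names, probabilities, and costs are made up, not taken from the video):

      # Value iteration for a tiny discrete MDP.
      # P[s][a] is a list of (probability, next_state, cost) outcomes.
      GAMMA = 0.95   # discount factor
      THETA = 1e-6   # convergence threshold

      P = {
          "home": {"bike":  [(1.0, "work", 45)],
                   "drive": [(0.2, "work", 20), (0.8, "work", 70)]},
          "work": {},  # absorbing goal state: no actions, value stays 0
      }

      V = {s: 0.0 for s in P}
      while True:
          delta = 0.0
          for s, actions in P.items():
              if not actions:
                  continue
              # Bellman optimality backup: minimise expected cost
              best = min(sum(p * (c + GAMMA * V[s2]) for p, s2, c in outs)
                         for outs in actions.values())
              delta = max(delta, abs(best - V[s]))
              V[s] = best
          if delta < THETA:
              break

      print(V)  # expected cost-to-go from each state under the optimal policy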

  • @gasdive • 1 year ago +24

    I made these decisions for my real commute. The train was fastest, but occasionally much longer. The car was fast, but the cost of parking equalled 2 hours of work, so was effectively slowest. The latest I could leave and be sure of being on time was walking.

  • @blacklabelmansociety • 1 year ago +14

    Please, bring more from this guy

  • @SachinVerma-lx5bx • 1 year ago +8

    Where the formal definitions for concepts like MDPs can get overwhelming, it really helps to have these easy-to-understand explanations.

  • @pierreabbat6157 • 1 year ago +15

    There is a 3% chance that, somewhere along the route, there's a half-duplex roadblock because they're fixing the overhead wires or something. There's a 0.1% chance that a power line or tree fell across the road, forcing you to take an extremely long detour, but half of the time this happens, you could get past it on a bike.

  • @engineeringmadeasy • 1 year ago +4

    Nice one, I met Professor Nick at Pembroke College Oxford. It was an honour.

  • @tobiaswegener1234 • 1 year ago +7

    This was a fantastic simple explanation, very enlightening.

  • @cerealport2726 • 1 year ago +12

    I'd like an autonomous taxi system that would decide it's all too hard to take me to the office and would just take me back home, or, indeed, just refuse to take me to the office.
    "Sorry, I'm working from home today because the car refused to drive itself."

    • @IceMetalPunk • 1 year ago +4

      "My robot ate my transportation, boss, there was nothing I could do *except* put my comfy PJs back on."

    • @cerealport2726 • 1 year ago +3

      @@IceMetalPunk Sounds legit, take the rest of the week off.

  • @yvesamevoin8720 • 1 year ago +3

    You can hear the passion in every word he pronounces. Very good explanation.

  • @asfandiyar5829 • 1 year ago +14

    I literally had my final-year project use a Kalman filter to solve this problem. That's awesome!
    Edit: spelling

  • @Ceelvain • 1 year ago +2

    I've heard a lot about MDPs and policy functions in the context of reinforcement learning, but this is the best explanation I've ever heard.

  • @Ceelvain • 1 year ago +6

    I rarely put a like on a video, but this one deserves it.
    I definitely want to hear more about the algorithms to solve MDP problems.

  • @elwood.downey • 1 year ago

    The best explanation of this I've ever heard. Many thanks.

  • @phil9447 • 1 year ago +6

    MDPs are the topic of my bachelor's thesis, and the example really helped me understand everything a lot better; I think I'll be using it throughout the thesis to understand the theory I have to write about. It's a lot easier to understand than some states a, b, and c and actions 1, 2, 3 :D

  • @tristanlouthrobins • 2 months ago

    This is such a fascinating breakdown of Markov decision making. I love the mathematics that underpins Markov, but the creativity and imagination applied to the example and its host of solutions are delicious brain food.

  • @lucrainville4372 • 1 year ago

    Fascinating look into decision-making.

  • @BobWaist • 10 months ago

    Great video! Really well explained and interesting.

  • @Imevul • 1 year ago

    I've unconsciously done something similar with my commute to work. I can take the subway or the bus. The subway usually takes the same amount of time every time, but there's a longer walk, and occasionally there are signalling issues that force me to take the bus anyway. During winter, the bus can get stuck on the snowy hills, and then I'm forced to take a taxi. The bus also has a connection that I sometimes barely miss, so I may need to wait either ~1 minute or ~15 minutes for the next one. One upside is that if the connecting bus takes too long, or never comes, I'm pretty close to work already, so I could walk the rest of the way in a pinch.
    The biggest problem is that I have no idea how to assign the right probabilities to each of those events. There's just not enough data (that I have access to, at least). Usually I just take the bus to work (less walking, and no signalling issues to deal with) and the subway home (to avoid the connecting bus). If nothing goes wrong, they take pretty similar times.

  • @spyboyb321 • 1 year ago +1

    The timing of this video! I am currently trying to work on a project that uses this in my AI class

  • @TGUGCL • 1 year ago +1

    Very interesting video. What about adding multiple criteria to the model? For instance, time and money in the commuting model. Is there software that can help you create and solve these types of multiple-criteria stochastic decision-making problems? Something like Enterprise Dynamics, a discrete-event simulation software platform.

  • @LukaszWiklendt • 1 year ago +3

    16:17 If you're allowed to remember how many cycles you waited for the train, does this mean you lose the Markov property? Or does the Markov property relate to the environment rather than your decision?

    • @mgostIH • 1 year ago

      Looking it up on Wikipedia, it seems like they define the policy to take only the current state rather than the current state plus reward.
      Granted, you can always augment the state space to include each possible wait for the train at some specific time on the clock and make it Markovian, but the example they made does violate the Markov property if the nodes described are the states.
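
      A tiny Python sketch of that augmentation trick (the state and action names are hypothetical, purely for illustration):

        # Fold the wait count into the state so a policy that "remembers"
        # how long it has waited still depends only on the current state.
        from typing import NamedTuple

        class State(NamedTuple):
            location: str   # e.g. "station"
            waits: int      # cycles already spent waiting

        def policy(s: State) -> str:
            # A function of the current (augmented) state only,
            # so the Markov property is preserved.
            if s.location == "station" and s.waits >= 3:
                return "give_up_and_cycle"
            return "keep_waiting"

        print(policy(State("station", 0)))  # keep_waiting
        print(policy(State("station", 4)))  # give_up_and_cycle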

  • @Techmagus76 • 1 year ago +3

    Once the AI works well enough, it will put the bike in the car, and if it notices that traffic is heavy, it will take the bike out and travel the rest of the way by bike.
    Next option: use the bike to get to the train station, and if the train isn't coming, switch directly back to the bike.

  • @GBlunted • 1 year ago +2

    You shouldn't be afraid to ask the teacher, "Okay, explain that one more time...", so they get a chance to produce better, cleaner, more polished bits to put in the video.

  • @vsandu • 1 year ago

    Excellent!!! Cheers.

  • @SystemSh0cker • 1 year ago +1

    Another perfect video. Thanks for that! But I'm still asking myself... will this continuous printer paper ever run out??? :D

  • @khaledsrrr • 5 months ago

    Phenomenal
    All the respect

  • @Veptis • 1 year ago

    So is there a way to compute the solutions? I assume some matrices show up: one for the probabilities and one for the sums of times. Then you can multiply them and get a different time distribution for every strategy?
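
    That guess is close to how one fixed strategy is evaluated in practice. A sketch in Python with NumPy (the states, probabilities, and times below are made up, not from the video): collect the transition probabilities under the chosen strategy into a matrix P and the expected time spent per step into a vector r, then solve v = r + P v for the expected remaining travel time v.

      import numpy as np

      # Transition matrix under one fixed strategy; P[i][j] is the
      # probability of moving from state i to state j. State 2 is
      # absorbing (arrived at work), so its row is all zeros.
      P = np.array([[0.0, 0.7, 0.3],
                    [0.0, 0.0, 1.0],
                    [0.0, 0.0, 0.0]])

      # Expected minutes spent on the step taken from each state.
      r = np.array([10.0, 25.0, 0.0])

      # Solve (I - P) v = r, which is the same as v = r + P v.
      v = np.linalg.solve(np.eye(3) - P, r)
      print(v)  # expected remaining travel time from each state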

  • @Leon-pu3vm • 1 year ago

    Extremely nice

  • @firsttyrell6484 • 1 year ago +4

    Image stabilization would be nice.

  • @WalkerRacing • 1 year ago +4

    Brady, will you please find someone to interview about chess engines/chess programming/neural nets? That would be super interesting.

    • @ideallyyours • 1 year ago +2

      This interviewer isn't Brady. Says in the description: "This video was filmed and edited by Sean Riley."

  • @samt2226 • 1 year ago +2

    What sort of paper is being used for the diagrams?

  • @patrickbateman455 • 1 year ago +6

    Very nice.

    • @bigprovola • 1 year ago +6

      Let's see Paul Allen's Markov chain.

  • @rd42537 • 1 year ago

    That paper takes me back!

  • @chipsafan1 • 1 year ago

    Am I correct to assume that a first-order Markov system is similar to frequentist statistical models as a methodology?

  • @alphgeek • 1 year ago

    Are the policies analogous to a reward function in a neural network?

  • @opusdei1151 • 1 year ago

    How does the algorithm work with an imperfect-information game like poker? Can you apply it to poker?

  • @SozImaScrub • 1 year ago +1

    @MarkovBaj any thoughts?

  • @jasontrunk3353 • 1 year ago

    This is great.

  • @timng9104 • 1 year ago +1

    Wow, probabilistic computing is kinda interesting. Can you do a video on physical unclonable functions? I need an explainer like this XD

  • @avinier325 • 7 months ago

    Can anyone please tell me where he got his watch from?

  • @brettbreet • 1 year ago

    What's the watch model he's wearing?

  • @deep.space.12 • 1 year ago +1

    So... next video gonna be POMDP?

  • @jonr6680 • 1 year ago +3

    Fascinating and useful overview.
    I've watched a few machine learning lectures, and it intrigues me that the logic, theory, mechanics, etc. are (at this 101 level) identical to the decision theory that any human should, could, would use to live their lives efficiently... but never does! Because we were never taught how.
    So I bet even the scientists who program their AI for some (probably) corporate exploitative system ironically waste their lives making dumb decisions every day...
    And the example given of commuting to work is the classic First World Problem... Like gamblers, we all think we know how to game the system, but by playing it we have ALREADY LOST.
    Did I just invent computational philosophy??
    Per the reboot movie Tron - .

  • @ENI232 • 4 months ago

    More!

  • @IanKjos • 1 year ago +1

    There's no point in an edge going home from the railway station because having been at the railway station does not change the stochastic costs of the other options. Once you've decided the rail has the lowest stochastic cost, you're done. Now if we add a concept of traffic changing with time, then we have a higher-order model and the edge becomes pointful again.

  • @bongsurfer • 1 year ago

    Thanks

  • @chiboubamine5970 • 3 months ago

    I have a problem called the Facility Layout Problem which I am trying to solve using Reinforcement Learning. The initial state is a layout that has a cost, and the goal is to change the facility layout in order to minimize the cost. My question: should this problem be treated episodically or continuously, and what do I do in the case where there is no absorbing state?? I would be extremely happy if someone could help.

    • @ChristophTungersleben • 2 months ago

      Whether it's episodic or continuous depends on the starting state of the 'system': each action is an episode, but it is possible to hit the optimum by chance. Without a break condition, a loop might follow.

  • @ohsweetmystery • 1 year ago +1

    The bike can also take longer than 60 minutes. Flat tires, catastrophic mechanical failure, getting hit by another vehicle, etc.

    • @scottcox503 • 1 year ago

      True, but it's much more within your control.

  • @DanielkaElliott • 1 year ago

    It's like: if you are already late, just take the bus, but if you have time, take the fastest route according to Google Maps.
    Otherwise take the simplest route you have time for (with the fewest changes and the least walking).

  • @danielg9275 • 1 year ago

    Coo coo cachoo the probability depends on you!

  • @terencewinters2154 • 4 months ago

    Do robots queue up?

  • @geniusdavid • 1 year ago

    Things to have as a computer scientist: a marker and paper. 😮

  • @Bill0102 • 4 months ago

    Remarkable work! This content is fantastic. I found something similar, and it was beyond words. "Game Theory and the Pursuit of Algorithmic Fairness" by Jack Frostwell

  • @KibbleWhite • 1 year ago +1

    This is great, except you got the percentages for traffic probability wrong. Light traffic is 10%, medium traffic is 20% and heavy traffic is 70% of the time.

  • @TheThunderSpirit • 1 year ago

    I'm doing reinforcement learning now too.

  • @2k10clarky • 8 months ago

    You might also have a soft deadline for arriving at work, so that, for example, you're fine as long as you're late only 1% of the time.

  • @marklonergan3898 • 1 year ago +1

    I have to go to the bank and trust me I will be there in about the time of the year is starting to stir fry sauce instead of garlic on the way home now anyway I think I have a few things to do in the morning.
    There are predictive text models at work. Start with "I " and keep hammering the predicted word, and see what comes out. 😁.

  • @RayCase • 1 year ago

    2022. Still using tractor feed printer paper as scrap.

  • @odiseezall • 1 year ago

    This is exactly what AI assistants should allow us to do: apply mathematical analysis to real-world problems, in real time.

  • @deanmarktaylor • 1 year ago

    I watched the film "The Mist" (2007) last night, it seems like "David" could have used a little "help" with this kind of decision making in the end.

  • @Jkauppa • 1 year ago +1

    Make the difference/similarity between a strict algorithm and a fuzzy probabilistic selection algorithm clear.

    • @Jkauppa • 1 year ago +1

      In the end the Bayesian decision is the same as the strict algorithm, but the implementation is wildly different, and the cleanness/interpretation of the algorithm can be clear or fuzzy (same problem, different paths, partial results between steps, end result as logged).

    • @Jkauppa • 1 year ago +1

      Fuzzy probabilistic AI vs Dijkstra for shortest path.

    • @Jkauppa • 1 year ago

      All algorithms give the same kinds of answers for the same problem, but in different logical/mathematical ways.

    • @Jkauppa • 1 year ago

      Describe Dijkstra/A* as an infinite-memory probabilistic state algorithm.

    • @Jkauppa • 1 year ago

      An algorithm might decide on the fly, while training, whether it remembers previous states or not.

  • @jasonmcfarlane7243 • 1 year ago +2

    To all the people in the comments: no, he doesn't look 'weird' or 'wrong'; he has a lazy eye or a similar condition. These conditions are common and normal. Shame on you.

  • @OwenPrescott • 1 year ago

    It really bothers me that he's waving the pen around without the lid on

  • @Lion_McLionhead • 1 year ago

    These shortest path algorithms convinced lions that whoever designs these algorithms is a lot smarter than a lion, spent an entire career designing just 1 algorithm, & it's pointless to try to remember them all.

  • @Eagle3302PL • 1 year ago

    This video presents a problem, names a solution, doesn't present the named solution, then just ends. The whole video can be summed up as "in computer science sequential decisions with probable outcomes are made by using some approach, the approach requires some conditions to be determined for a desired outcome". IT NEVER SHOWS A SOLUTION, IT JUST SAYS THERE IS ONE. WHAT'S THE POINT?

  • @iwir3d • 1 year ago

    Let's go Skynet! ..... Let's go Skynet! Long live the robot overlords.

  • @6DAMMK9 • 1 year ago

    “How to guide AI to draw 5 fingers instead of forcing it”
    or use chopsticks to eat noodles
    or bake a cake

  • @gollolocura • 1 year ago

    Always take the bike

  • @liftingisfun2350 • 1 year ago

    What happened to him?

  • @BritishBeachcomber • 1 year ago

    *Self-driving car.* Bike swerves in front. Action? 1. Brake hard, but can you stop in time? 2. Swerve left, but what about that little kid? 3. Swerve right and hit oncoming traffic, maybe killing many more people?
    Humans are very bad when faced with uncertainties like that. Machines would be no better.

  • @hurktang • 1 year ago

    No one in this video understands how trains work. The infographic makes the train jitter along its route, and no one has ever heard of train schedules.
    We should also factor in cost: the risk of accident, the health benefits, the ability to read your email on the train...

    • @Computerphile • 1 year ago

      The graphic illustrates that the route goes via somewhere else... (Unrealistic route for the timings but inspiration taken from my route from Nottingham to Oxford to meet Nick) HTH -Sean

    • @hurktang • 1 year ago

      @@Computerphile Ah, sorry! That makes sense. You turned a 150-minute train ride into a 30-minute train ride, and I found the ride quite bumpy. Thanks for taking the time to reply to me.

    • @Computerphile • 1 year ago

      You're welcome :0)

  • @alexandrumacedon291 • 1 year ago

    There are no decisions; there are choices, and all are random if the parameters are obscure. Just like us: we are biological machines, we know the rules, but we choose as we please.

  • @ShadowGameAlchemy • 1 year ago +2

    I really love all your videos, but I can't stand the sound of the marker pen against the paper. That kind of hiss sound irritates me to my core. I might be the only one in the world, but my brain is programmed that way. Can you please remove that sound or use a different ballpoint or other pen? I have to hold my earphones far away when you start writing. Please consider this.

  • @michaelmueller9635 • 1 year ago

    My Sunday ...a chameleon is teaching me about robot decisions ...I'm trippin bro xD

  • @veeek8 • 1 year ago

    So there is a scientific theory behind why I prefer cycling 😂

  • @buraktekgul2079 • 11 months ago

    The paper's sound is so bad. Please use a whiteboard for the next videos.

  • @D1ndo • 1 year ago

    I was waiting 17 minutes for him to actually solve the problem using the algorithm, yet he never got to the point, only babbled about the same thing over and over again. Big dislike.

  • @johnsenchak1428 • 1 year ago

    REPORTED NOT COMPUTER RELATED