Alborz Geramifard
  • Videos: 2
  • Views: 20,761
An Introduction to Markov Decision Processes and Reinforcement Learning
RLPy: rlpy.readthedocs.io/en/latest/
AI Gym: gym.openai.com/
Tutorial Paper: A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning (alborz-geramifard.com/Files/13FTML-RLTutorial.pdf)
Views: 19,009

Videos

MovieBot
1.8K views · 7 years ago
* The thinking time on both my side and Echo's side was clipped out. Echo's thinking time peaked at around a couple of seconds.
* MovieBot is available to the public as an Alexa skill. Just say "Alexa, enable moviebot" and then "Alexa, open moviebot".
* We are working on more features and would love to hear suggestions and comments.
* MovieBot was developed using Alexa ASK: developer.amazon.com/al...

Comments

  • @ehsanasadi7085
    @ehsanasadi7085 29 days ago

    amazing

  • @shivangitomar5557
    @shivangitomar5557 1 year ago

    The best video on this topic. Thanks a lot!

  • @SteveRobinson-mj8dp
    @SteveRobinson-mj8dp 1 year ago

    there are obvious cuts between questions and answers 😂😂😂😂 this should be really embarrassing 😂😂😂

  • @ORagnar
    @ORagnar 2 years ago

    15:47 I'm intrigued by the mathematical symbols. I've never seen an upside down "A" before. It's assumed we know these symbols. I've taken calculus, trig, etc., but I'm not familiar with all of these symbols.

    • @davidearnest1348
      @davidearnest1348 2 years ago

      The "upside down 'A' " means "for all." It is commonly used in proof-based math courses.

  • @quanghuykhuat3247
    @quanghuykhuat3247 2 years ago

    In the value iteration example, I don't understand what he means when he says that the transition model is valid for every direction. That would make the total transition probability 4 rather than 1. Can someone help explain it?

  • @maydin34
    @maydin34 3 years ago

    This is absolutely one of the best presentations I have ever seen on introductory RL on the internet. I don't understand why it hasn't been watched more! Could you please share a video about MC methods and policy gradient methods (A2C, DDPG, etc.) in RL as well? I really like the way you teach! Thanks a lot for this video!

  • @salehabod6740
    @salehabod6740 3 years ago

    Thanks a lot, Professor. Really a great lecture!

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 4 years ago

    The Q&A goes on for far too long. Better to leave such repeated questions till the end.

  • @alirezamogharabi8733
    @alirezamogharabi8733 4 years ago

    Well done, sir, that was excellent. Could I ask you to upload a video on MARL if you have one?

  • @NuncNuncNuncNunc
    @NuncNuncNuncNunc 5 years ago

    Interesting response to the ambiguous "rating" question. How would you ask to get the other type of rating - "movie rating", "MPAA rating"?

  • @dijvijaysingh5256
    @dijvijaysingh5256 5 years ago

    What is the formula for calculating the upper bound of an MDP?

  • @tistoni09
    @tistoni09 5 years ago

    nice conversation. Though it would be better if there was no cut before each answer by Alexa

  • @Awsome_watermelon
    @Awsome_watermelon 7 years ago

    I tried it. This is super cool! Great work, Alborz!

  • @alireza202
    @alireza202 7 years ago

    Awesome job!