Improving Deep Reinforcement Learning via Quality Diversity, Open-Ended and AI-Generating Algorithms

  • Published: May 7, 2023
  • Jeff Clune, Associate Professor, Computer Science, University of British Columbia; Canada CIFAR AI Chair and Faculty Member, Vector Institute; Senior Research Advisor, DeepMind
    Abstract: Quality Diversity (QD) algorithms are those that seek to produce a diverse set of high-performing solutions to problems. I will describe them and a number of their positive attributes. I will summarize how they enable robots, after being damaged, to adapt in 1-2 minutes in order to continue performing their mission. I will next describe our QD-based Go-Explore algorithm, which dramatically improves the ability of deep reinforcement learning algorithms to solve previously unsolvable problems in which reward signals are sparse, meaning that intelligent exploration is required. Go-Explore solved all unsolved Atari games, including Montezuma’s Revenge and Pitfall, considered by many to be grand challenges of AI research. I will next motivate research into open-ended algorithms, which seek to innovate endlessly, and introduce our POET algorithm, which generates its own training challenges while learning to solve them, automatically creating a curriculum for robots to learn an expanding set of diverse skills. Finally, I’ll argue that an alternate paradigm, AI-generating algorithms (AI-GAs), may be the fastest path to accomplishing our field’s grandest ambition of creating general AI, and describe how QD, open-ended, and unsupervised pre-training algorithms (e.g. our recent work on video pre-training, VPT) will likely be essential ingredients of AI-GAs.
    Bio: Jeff Clune is an Associate Professor of computer science at the University of British Columbia and a Canada CIFAR AI Chair at the Vector Institute. Jeff focuses on deep learning, including deep reinforcement learning. Previously he was a research manager at OpenAI, a Senior Research Manager and founding member of Uber AI Labs (formed after Uber acquired a startup he helped lead), the Harris Associate Professor in Computer Science at the University of Wyoming, and a Research Scientist at Cornell University. He received degrees from Michigan State University (PhD, master’s) and the University of Michigan (bachelor’s). More on Jeff’s research can be found at www.JeffClune.com or on Twitter (@jeffclune). Since 2015, he has won the Presidential Early Career Award for Scientists and Engineers from the White House, had two papers in Nature and one in PNAS, won an NSF CAREER award, received Outstanding Paper of the Decade and Distinguished Young Investigator awards, and had best paper awards, oral presentations, and invited talks at the top machine learning conferences (NeurIPS, CVPR, ICLR, and ICML). His research is regularly covered in the press, including the New York Times, NPR, NBC, Wired, the BBC, the Economist, Science, Nature, National Geographic, the Atlantic, and the New Scientist.
  • Science
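
For readers unfamiliar with the Quality Diversity idea mentioned in the abstract, here is a minimal sketch in the style of MAP-Elites, one well-known QD algorithm. It is an illustration only, not the speaker's implementation: the toy fitness function, the behavior descriptor, and all parameter names (bins, sigma, n_random) are assumptions chosen for brevity. The archive keeps the best solution found so far in each cell of a discretized behavior space, which is what yields a diverse set of high-performing solutions.

```python
import random


def fitness(x):
    # Toy objective (assumption): higher is better, maximum 0 at (0.5, 0.5).
    return -sum((xi - 0.5) ** 2 for xi in x)


def descriptor(x, bins):
    # Toy behavior descriptor (assumption): the first coordinate,
    # discretized into `bins` cells over [0, 1].
    return min(int(x[0] * bins), bins - 1)


def map_elites(iterations=2000, bins=10, dim=2, sigma=0.1, n_random=100, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)
    for i in range(iterations):
        if archive and i >= n_random:
            # Pick a random elite and mutate it with clipped Gaussian noise.
            _, parent = rng.choice(list(archive.values()))
            child = [min(1.0, max(0.0, xi + rng.gauss(0, sigma))) for xi in parent]
        else:
            # Bootstrap the archive with uniformly random solutions.
            child = [rng.random() for _ in range(dim)]
        cell = descriptor(child, bins)
        f = fitness(child)
        # Keep the child only if its cell is empty or it beats the current elite.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, child)
    return archive


if __name__ == "__main__":
    elites = map_elites()
    for cell in sorted(elites):
        f, x = elites[cell]
        print(cell, round(f, 4), [round(v, 3) for v in x])
```

Each printed row is the elite of one behavior cell, so the output is a set of solutions that differ in behavior but are each the best found locally, rather than a single global optimum.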

Comments • 6

  • @user-fc4me3je5k 7 months ago

    So the answers are right here :)

  • @Leo-we7jo 1 year ago

    Thanks, interesting study

  • @AlgoNudger 1 year ago

    Thanks.

  • @yungrecentadvancement 1 year ago +1

    persistent deep outlier sampling

  • @InquilineKea 1 year ago +1

    Is anyone else here a maximum-entropy reinforcement learner?

    • @srinjoyroy5488 9 months ago

      @InquilineKea Yes! Do you want to write a paper together?