Memory-Regret Tradeoff for Online Learning

  • Published: Oct 9, 2023
  • Binghui Peng (Columbia University)
    simons.berkeley.edu/talks/bin...
    Sketching and Algorithm Design
    In the experts problem, on each of T days, an agent needs to follow the advice of one of n experts. After each day, the loss associated with each expert’s advice is revealed. A fundamental result in learning theory says that the agent can achieve vanishing regret, i.e. their cumulative loss is within o(T) of the cumulative loss of the best-in-hindsight expert. Can the agent perform well without sufficient space to remember all the experts? We extend a nascent line of research on this question in two directions: (1) We give a new algorithm against the oblivious adversary, improving over the memory-regret tradeoff obtained by [Peng-Zhang’23], and nearly matching the lower bound of [Srinivas-Woodruff-Xu-Zhou'22]. (2) We also consider an adaptive adversary who can observe the past experts chosen by the agent. In this setting we give both a new algorithm and a novel lower bound, proving that roughly \sqrt{n} memory is both necessary and sufficient for obtaining o(T) regret. Based on joint work with Aviad Rubinstein.
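
    A minimal sketch, for intuition only: the classic full-memory Hedge (multiplicative weights) baseline keeps one weight per expert, i.e. Θ(n) memory, and achieves O(√(T log n)) regret. It is not the sublinear-memory algorithm from the talk; the function name `hedge`, the numpy dependency, and the synthetic random losses below are illustrative assumptions.

```python
import numpy as np

def hedge(losses, eta=None):
    """Full-memory Hedge baseline for the experts problem.

    losses: (T, n) array with losses[t, i] in [0, 1] = loss of expert i on day t.
    Keeps one weight per expert (Theta(n) memory); the talk studies what is
    achievable with far less memory, which this sketch does not attempt.
    Returns (algorithm's expected cumulative loss, best expert's cumulative loss).
    """
    T, n = losses.shape
    if eta is None:
        eta = np.sqrt(np.log(n) / T)  # standard tuning giving O(sqrt(T log n)) regret
    weights = np.ones(n)
    alg_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()       # distribution over experts to follow today
        alg_loss += probs @ losses[t]         # expected loss incurred on day t
        weights *= np.exp(-eta * losses[t])   # multiplicatively downweight bad experts
    best_loss = losses.sum(axis=0).min()      # best-in-hindsight expert's cumulative loss
    return alg_loss, best_loss

# Toy usage: regret = alg_loss - best_loss should be o(T), here O(sqrt(T log n)).
rng = np.random.default_rng(0)
L = rng.random((10_000, 50))                  # T = 10,000 days, n = 50 experts
alg_loss, best_loss = hedge(L)
print(f"regret ≈ {alg_loss - best_loss:.1f} over T = {L.shape[0]} days")
```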

Comments • 2

  • @sumitnarsale
    9 months ago

    can someone please explain it to me like I'm a 5 year old

    • @bcj
      9 months ago +4

      Ask everyone for a bunch of advice. When you fail/succeed, start trusting the advisor who would have served you best.