Tokenformer: The Next Generation of Transformers?

  • Published: 24 Nov 2024

Comments • 6

  • @ericg4379 17 days ago +3

    Thanks for the great video. In the graph at 6:26, for the incrementally scaled networks, is the “training cost” just considering the incremental cost relative to the previous iteration? Or the cumulative training cost inclusive of the compute expended on the preceding increments?

    • @aipapersacademy 17 days ago +4

      Thank you for the feedback! The training cost reported for the Tokenformer versions is cumulative, including both the compute spent on the preceding increments and the initial training of the 124M model, while the Transformer cost is reported for each version individually.
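
      To make that accounting concrete, here is a minimal Python sketch; the per-stage FLOP counts are hypothetical placeholders, not the paper's numbers.

          from itertools import accumulate

          # Hypothetical training cost (PFLOPs) of each Tokenformer stage:
          # the initial 124M model, then each incremental scaling step.
          tokenformer_stages = [10, 4, 6, 9]

          # Reported Tokenformer cost is cumulative: each version's figure
          # includes all compute spent on the preceding stages.
          tokenformer_reported = list(accumulate(tokenformer_stages))  # [10, 14, 20, 29]

          # Hypothetical cost of training each Transformer size from scratch;
          # these are reported individually, with no shared compute.
          transformer_reported = [10, 18, 30, 48]

          for tok, trans in zip(tokenformer_reported, transformer_reported):
              print(f"Tokenformer (cumulative): {tok} | Transformer (from scratch): {trans}")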

  • @KenCheungChannal 13 days ago +3

    Can you also explain the potential downsides of this new architecture?

    • @aipapersacademy 7 days ago +1

      A potential downside is that, for now, Tokenformer has been tested on a relatively small scale, so its effectiveness for large models is still unproven.

  • @kevon217 17 days ago

    Great channel!

  • @jeffg4686 9 days ago

    A Tokenformer forms tokens