Razvan Pascanu: Improving learning efficiency for deep neural networks (MLSP 2020 keynote)

  • Published: 8 Sep 2024

Comments • 2

  • @X_platform • 3 years ago • +1

    I could not agree more. The top-k gradient approach would greatly reduce the tug-of-war issue.
    Loving the speaker and the content 😊
    Thank you!
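
    A minimal Python sketch, assuming "top-k gradient" means keeping only the k largest-magnitude components of the gradient each step and zeroing the rest, so that fewer parameters receive conflicting updates at once; the function name top_k_gradient and the toy values are illustrative, not taken from the talk.

    import numpy as np

    def top_k_gradient(grad, k):
        """Return a copy of `grad` with all but the k largest-|value| entries zeroed."""
        flat = grad.ravel()
        sparse = np.zeros_like(flat)
        idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the k largest magnitudes
        sparse[idx] = flat[idx]
        return sparse.reshape(grad.shape)

    # Toy usage: apply the sparsified gradient in a plain SGD step.
    rng = np.random.default_rng(0)
    params = rng.normal(size=(4, 4))
    grad = rng.normal(size=(4, 4))
    params -= 0.1 * top_k_gradient(grad, k=3)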

  • @nguyenngocly1484 • 3 years ago

    You can turn artificial neural networks inside-out by using fixed dot products (weighted sums) and adjustable (parametric) activation functions. The fixed dot products can be computed very quickly with fast transforms like the FFT, and the overall number of parameters required is vastly reduced. The dot products of the transform act as statistical summary measures, ensuring good behaviour. See Fast Transform (fixed filter bank) neural networks.
    Since dot products are so statistical in nature, only weak optimisers are necessary for neural networks. You can use sparse mutations and evolution. The workload can then be split very easily between GPUs, with little data movement needed during training. See Continuous Gray Code Optimization.
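
    A minimal sketch of the fixed-transform idea described above, assuming the "fixed dot products" are a parameter-free fast transform (a Walsh–Hadamard transform here, to stay real-valued, in place of an FFT) and the only learnable parameters are per-unit two-slope activations; the class FixedTransformLayer and its parameterisation are illustrative, not the specific construction the commenter cites.

    import numpy as np

    def fwht(x):
        """Fast Walsh–Hadamard transform of a 1-D array whose length is a power of two."""
        x = x.astype(float).copy()
        h, n = 1, len(x)
        while h < n:
            for i in range(0, n, 2 * h):
                for j in range(i, i + h):
                    a, b = x[j], x[j + h]
                    x[j], x[j + h] = a + b, a - b
            h *= 2
        return x / np.sqrt(n)  # orthonormal scaling

    class FixedTransformLayer:
        """Fixed dot products (the transform) plus adjustable per-unit activations."""
        def __init__(self, n, rng):
            # Only these slopes would be trained; the transform itself has no parameters.
            self.pos_slope = rng.normal(1.0, 0.1, n)
            self.neg_slope = rng.normal(0.1, 0.05, n)

        def forward(self, x):
            t = fwht(x)  # fixed weighted sums, computed in O(n log n)
            return np.where(t >= 0, self.pos_slope * t, self.neg_slope * t)

    rng = np.random.default_rng(0)
    layer = FixedTransformLayer(8, rng)
    print(layer.forward(rng.normal(size=8)))

    With the transform fixed, training only adjusts O(n) activation parameters per layer, which is consistent with the comment's claim that weak, mutation-style optimisers can suffice.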