Kolmogorov-Arnold Networks: MLP vs KAN, Math, B-Splines, Universal Approximation Theorem

  • Published: 12 Jul 2024
  • In this video, I will explain Kolmogorov-Arnold Networks, a new type of neural network presented in the paper "KAN: Kolmogorov-Arnold Networks" by Liu et al.
    I will start the video by reviewing Multilayer Perceptrons, to show how the typical linear layer works in a neural network. I will then introduce the concept of data fitting, which is needed to understand Bézier curves and then B-splines (a small worked example follows the chapter list below).
    Before introducing Kolmogorov-Arnold Networks, I will also explain the Universal Approximation Theorem for neural networks and its counterpart for Kolmogorov-Arnold Networks, the Kolmogorov-Arnold Representation Theorem (the formula is reproduced below).
    In the final part of the video, I will explain the structure of this new type of network by deriving it step by step from the formula of the Kolmogorov-Arnold Representation Theorem, while comparing it with Multilayer Perceptrons (see the layer-by-layer sketch after the chapter list).
    We will also explore some properties of this type of network, such as its easy interpretability and its suitability for continual learning.
    Paper: arxiv.org/abs/2404.19756
    Slides PDF: github.com/hkproj/kan-notes
    Chapters
    00:00:00 - Introduction
    00:01:10 - Multilayer Perceptron
    00:11:08 - Introduction to data fitting
    00:15:36 - Bézier Curves
    00:28:12 - B-Splines
    00:40:42 - Universal Approximation Theorem
    00:45:10 - Kolmogorov-Arnold Representation Theorem
    00:46:17 - Kolmogorov-Arnold Networks
    00:51:55 - MLP vs KAN
    00:55:20 - Learnable functions
    00:58:06 - Parameters count
    01:00:44 - Grid extension
    01:03:37 - Interpretability
    01:10:42 - Continual learning
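
A minimal, illustrative sketch of the data-fitting ideas covered in the video: de Casteljau's algorithm for evaluating a Bézier curve from its control points. The control points below are made up for the example.

import numpy as np

def bezier(points, t):
    # De Casteljau's algorithm: repeatedly interpolate between adjacent
    # control points until a single point remains; it lies on the curve.
    pts = np.asarray(points, dtype=float)
    while len(pts) > 1:
        pts = (1 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# A cubic Bézier curve in 2D: 4 control points, evaluated at t = 0.5.
ctrl = [(0, 0), (1, 2), (3, 2), (4, 0)]
print(bezier(ctrl, 0.5))  # -> [2.  1.5]

B-splines generalize this idea by stitching several polynomial pieces together over a knot vector, and these are the learnable functions that KANs place on their edges.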
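
For reference, the Kolmogorov-Arnold Representation Theorem mentioned above states that any continuous function f : [0, 1]^n -> R can be written using only sums and continuous univariate functions:

    f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

where each inner function \phi_{q,p} : [0, 1] \to \mathbb{R} and each outer function \Phi_q : \mathbb{R} \to \mathbb{R} is continuous and univariate. A two-layer KAN of shape [n, 2n+1, 1] implements exactly this formula with learnable \phi's and \Phi's; deeper KANs stack more such layers.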
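
The MLP vs KAN contrast can be summarized in a few lines of code. This is a minimal sketch under stated assumptions, not the paper's implementation: it uses scipy's BSpline for the learnable edge functions, and all parameters are random (untrained) for illustration.

import numpy as np
from scipy.interpolate import BSpline

def mlp_layer(x, W, b):
    # MLP: learnable weights on the edges, a fixed activation on the nodes.
    return np.tanh(W @ x + b)

def make_spline(k=3, n_coef=8, lo=-1.0, hi=1.0):
    # One learnable univariate function: a B-spline over a fixed grid
    # (the grid that "grid extension" refines). The coefficients are the
    # trainable parameters; here they are random for illustration.
    t = np.concatenate(([lo] * k, np.linspace(lo, hi, n_coef - k + 1), [hi] * k))
    return BSpline(t, np.random.randn(n_coef), k, extrapolate=True)

def kan_layer(x, phis):
    # KAN: a learnable univariate function on each edge, plain sums on nodes.
    return np.array([sum(phi(xi) for phi, xi in zip(row, x)) for row in phis])

x = np.random.randn(3)                                        # 3 input features
W, b = np.random.randn(5, 3), np.random.randn(5)              # MLP: 5 outputs
phis = [[make_spline() for _ in range(3)] for _ in range(5)]  # KAN: 5 x 3 edges
print(mlp_layer(x, W, b).shape, kan_layer(x, phis).shape)     # (5,) (5,)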