Antonio Sclocchi | Hierarchical data structures through the lenses of diffusion models

  • Published: Oct 1, 2024
  • New Technologies in Mathematics Seminar 10/2/2024
    Speaker: Antonio Sclocchi, EPFL
    Title: Hierarchical data structures through the lenses of diffusion models
    Abstract: The success of deep learning with high-dimensional data relies on the fact that natural data are highly structured. A key aspect of this structure is hierarchical compositionality, yet quantifying it remains a challenge.
    In this talk, we explore how diffusion models can serve as a tool to probe the hierarchical structure of data. We consider a context-free generative model of hierarchical data and show the distinct behaviors of high- and low-level features during a noising-denoising process. Specifically, we find that high-level features undergo a sharp transition in reconstruction probability at a specific noise level, while low-level features recombine into new data from different classes. This behavior of latent features leads to correlated changes in real-space variables, resulting in a diverging correlation length at the transition.
    We validate these predictions in experiments with real data, using state-of-the-art diffusion models for both images and texts. Remarkably, both modalities exhibit a growing correlation length in changing features at the transition of the noising-denoising process.
    Overall, these results highlight the potential of hierarchical models in capturing non-trivial data structures and offer new theoretical insights for understanding generative AI.
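
    To make the abstract's central picture concrete, here is a minimal toy sketch of a context-free hierarchical generative model. It is not the speaker's model or code: the binary branching, the random production rules, and the names `make_rules`, `sample_levels`, and `resample_latent` are all assumptions made for illustration. It only shows that re-drawing one latent feature high in the tree changes a whole contiguous block of observed variables together, which is the kind of correlated change the abstract refers to. No diffusion model is involved.

```python
import random


def make_rules(vocab_size, num_rules, depth, rng):
    """Hypothetical grammar: at every level, each symbol gets `num_rules`
    random productions of the form symbol -> (left, right)."""
    return [
        {
            s: [
                (rng.randrange(vocab_size), rng.randrange(vocab_size))
                for _ in range(num_rules)
            ]
            for s in range(vocab_size)
        }
        for _ in range(depth)
    ]


def expand(symbols, level, rules, rng):
    """Expand a list of symbols sitting at `level` down to the leaves."""
    for l in range(level, len(rules)):
        nxt = []
        for s in symbols:
            nxt.extend(rng.choice(rules[l][s]))
        symbols = nxt
    return list(symbols)


def sample_levels(root, rules, rng):
    """Sample one datum and keep every latent level.
    levels[0] is [root] (the class); levels[-1] holds the observed leaves."""
    levels = [[root]]
    for l in range(len(rules)):
        levels.append(expand(levels[-1], l, rules[: l + 1], rng))
    return levels


def resample_latent(levels, level, index, rules, rng):
    """Re-expand the latent at (level, index) with fresh rule choices.
    Returns the new leaves and the (start, stop) block that was regenerated."""
    depth = len(rules)
    block = 2 ** (depth - level)
    start = index * block
    leaves = list(levels[-1])
    leaves[start:start + block] = expand([levels[level][index]], level, rules, rng)
    return leaves, (start, start + block)


rng = random.Random(0)
depth = 3
rules = make_rules(vocab_size=4, num_rules=2, depth=depth, rng=rng)
levels = sample_levels(root=0, rules=rules, rng=rng)
old_leaves = levels[-1]

# Re-draw the second latent at level 1: only the right half of the leaves can change.
new_leaves, (start, stop) = resample_latent(levels, level=1, index=1, rules=rules, rng=rng)
print(old_leaves)
print(new_leaves)
print("regenerated block:", (start, stop))
```

    In this toy, the higher the resampled latent sits in the tree, the wider the block of leaves that can change at once. The talk's claim about a growing correlation length concerns trained diffusion models and is not reproduced by this sketch.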
