ControlNet Paper Explained - Adding Conditional Control to Text-to-Image Diffusion Models

  • Published: 10 Jun 2024
  • ControlNet is among the first papers to enable precise spatial control over the outputs of image generation models. It won the best paper award (the Marr Prize) at the prestigious ICCV 2023 conference.
    This video covers the ControlNet architecture, the idea of classifier-free guidance, and how it has been modified into classifier-free guidance resolution weighting. It also covers the qualitative results and ablation studies.
    ⌚️ ⌚️ ⌚️ TIMESTAMPS ⌚️ ⌚️ ⌚️
    0:00 Introduction to ControlNet
    1:45 Neural Network Blocks
    2:04 ControlNet Architecture
    3:02 ControlNet with Stable Diffusion
    5:05 ControlNet Training
    6:39 Classifier-free Guidance Resolution Weighting
    6:56 Classifier Guidance
    8:58 Classifier-free Guidance
    9:46 Classifier-free Guidance Resolution Weighting
    11:08 Ablation Studies
    🛠 🛠 🛠 MY SOFTWARE TOOLS 🛠 🛠 🛠
    ✍️ Notion - affiliate.notion.so/aibites-yt
    ✍️ Notion AI - affiliate.notion.so/ys9rqzv2vdd8
    📹 OBS Studio for video editing - obsproject.com
    📼 Manim for some animations - www.manim.community
    🎵 My music - www.bensound.com and
    📚 📚 📚 BOOKS I HAVE READ, REFER TO AND RECOMMEND 📚 📚 📚
    📖 Deep Learning by Ian Goodfellow - amzn.to/3Wnyixv
    📙 Pattern Recognition and Machine Learning by Christopher M. Bishop - amzn.to/3ZVnQQA
    📗 Machine Learning: A Probabilistic Perspective by Kevin Murphy - amzn.to/3kAqThb
    📘 Multiple View Geometry in Computer Vision by R Hartley and A Zisserman - amzn.to/3XKVOWi
    MY KEY LINKS
    YouTube: / @aibites
    Twitter: / ai_bites
    Patreon: / ai_bites
    Github: github.com/ai-bites
    WHO AM I?
    I am a Machine Learning researcher/practitioner who has seen the grind of academia and start-ups equally. I started my career as a software engineer 15 years ago. Because of my love for Mathematics (coupled with a glimmer of luck), I graduated with a Master's in Computer Vision and Robotics in 2016, when the now-happening AI revolution had just started. Life has changed for the better ever since.
    #machinelearning #deeplearning #aibites
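
The classifier-free guidance and resolution weighting topics listed in the timestamps can be illustrated with a small sketch. It is not the paper's code: it assumes the usual guidance formula eps_uc + scale * (eps_c - eps_uc), and it assumes the resolution weights take the form w_i = 64 / h_i, where h_i is the spatial size of the i-th block. The function names `cfg_combine` and `resolution_weights` and all numbers are hypothetical, chosen only for illustration.

```python
from typing import List


def cfg_combine(eps_uncond: List[float], eps_cond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: eps_uc + scale * (eps_c - eps_uc), element-wise."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def resolution_weights(block_sizes: List[int], base: int = 64) -> List[float]:
    """Assumed weighting: one weight per ControlNet connection, w_i = base / h_i."""
    return [base / h for h in block_sizes]


# Toy noise predictions, not real model outputs.
eps_uc = [0.0, 1.0, -2.0]   # prediction without the prompt
eps_c = [1.0, 1.0, 0.0]     # prediction with the prompt

guided = cfg_combine(eps_uc, eps_c, scale=7.5)

# Smaller (coarser) blocks receive a larger weight under the assumed formula.
weights = resolution_weights([8, 16, 32, 64])
```

With `scale=1.0` the function returns the conditional prediction unchanged; larger scales push the result further away from the unconditional prediction.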

Comments • 6

  • @abcd45058 3 months ago +1

    Great work. Interesting paper read indeed.
    At 7:27, Bayes' theorem is stated incorrectly. It should be P(X|Y) = P(Y|X) P(X) / P(Y). The rest of the math that follows is fine.
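
For reference, here is a short worked derivation of how the corrected Bayes' theorem leads to the classifier guidance term. It is a sketch that assumes X is the (noisy) image and Y is the condition, such as a class label; the guidance scale s is an extra symbol introduced here for illustration.

```latex
\begin{align}
p(x \mid y) &= \frac{p(y \mid x)\, p(x)}{p(y)} \\
\log p(x \mid y) &= \log p(y \mid x) + \log p(x) - \log p(y) \\
\nabla_x \log p(x \mid y) &= \nabla_x \log p(x) + \nabla_x \log p(y \mid x)
\end{align}
```

The term \log p(y) drops out because it does not depend on x. Scaling the classifier term, \nabla_x \log p(x) + s\, \nabla_x \log p(y \mid x), gives the guided score, where s controls how strongly the condition is enforced.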

    • @AIBites 2 months ago

      Well spotted, thank you.
      I think I noticed it after the video was published. I left it as is, since YouTube doesn't allow uploading a corrected version of a video. I think I should start writing errata in the comments :)

  • @frazuppi4897 5 months ago +1

    Great video, but it is not clear how one trains it. One needs pairs of ControlNet input and output image, right?

    • @AIBites 3 months ago +1

      Yes, we need paired data such as depth or pose datasets. Computer vision already has several datasets for depth and pose. The problem is that these datasets are tiny compared to the scale at which LLMs or LVMs are trained. ControlNet is the solution: we keep the pretrained model frozen, add a few trainable layers, and train only those on these "small" datasets. As a result, we can control the spatial layout of the generated image during inference.
      Hope that clarifies :)
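
The idea of adding trainable layers around a frozen model can be sketched with plain Python lists. This is a toy illustration, not the paper's implementation: `scale_layer` is a hypothetical stand-in for a zero-initialized 1x1 convolution, `frozen_block` stands in for a locked pretrained block, and the trainable copy is simply assumed to start from the same function.

```python
from typing import Callable, List

Vector = List[float]


def scale_layer(weight: float) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for a 1x1 convolution: one shared weight, no bias."""
    return lambda v: [weight * x for x in v]


def frozen_block(x: Vector) -> Vector:
    """Stand-in for a pretrained block whose parameters stay locked."""
    return [2.0 * xi + 1.0 for xi in x]


def controlled_block(x: Vector, cond: Vector,
                     trainable_copy: Callable[[Vector], Vector],
                     zero_in: Callable[[Vector], Vector],
                     zero_out: Callable[[Vector], Vector]) -> Vector:
    """Frozen output plus a residual computed from the condition."""
    base = frozen_block(x)
    injected = [xi + ci for xi, ci in zip(x, zero_in(cond))]
    residual = zero_out(trainable_copy(injected))
    return [b + r for b, r in zip(base, residual)]


x = [1.0, 2.0]        # toy feature vector
cond = [0.5, -0.5]    # toy condition, e.g. a depth or pose feature

# At initialization both connecting layers have zero weight,
# so the controlled block behaves exactly like the frozen one.
at_init = controlled_block(x, cond, frozen_block, scale_layer(0.0), scale_layer(0.0))

# Pretend training has moved the connecting weights away from zero.
after_training = controlled_block(x, cond, frozen_block, scale_layer(1.0), scale_layer(0.1))
```

Because the connecting layers start at zero, training begins from the unmodified pretrained behavior, and the condition only starts to influence the output as those weights are learned.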

    • @frazuppi4897 3 months ago

      @AIBites yeah, but I guess ControlNet is around 50M

    • @AIBites 2 months ago

      That's the upper bound, I guess. Not sure what the lower bound for training is.