😕LoRA vs Dreambooth vs Textual Inversion vs Hypernetworks

  • Published: Jan 14, 2023
  • There are 5 methods for teaching specific concepts, objects or styles to your Stable Diffusion: Textual Inversion, Dreambooth, Hypernetworks, LoRA and Aesthetic Gradients. The question is: which one should you use?
    In this video we review 3 key research papers, look at the underlying mathematical mechanics behind each method, and analyze data from civitai to arrive at an informed and final conclusion.
    Discord: / discord
    Live Stream in 8 hours: • 😕LoRA vs Dreambooth vs...
    ======= Links =======
    Spreadsheet: docs.google.com/spreadsheets/...
    LoRA paper: arxiv.org/abs/2106.09685
    Dreambooth Paper: arxiv.org/abs/2208.12242
    Textual Inversion Paper: arxiv.org/abs/2208.01618
    Dreaming Tulpa: / dreamingtulpa
    Driving a machine insane with Dreambooth: • I drove a Machine Insane
    Good Tutorials:
    Dreambooth tutorial by OlivioSarikas: • DreamBooth for Automat...
    Hypernetworks tutorial by Aitrepreneur: • HYPERNETWORK: Train St...
    Textual Inversion tutorial by Aitrepreneur: • ULTIMATE FREE TEXTUAL ...
    Textual Inversion Paper Walkthrough by me: • Textual Inversion with...
    LoRA tutorial by me: • 7GB RAM Dreambooth wit...
    LoRA tutorial by Nerdy Rodent: • LORA for Stable Diffus...
    Aesthetic Embeddings tutorial: • How to use Aesthetic G...
    ======= Music =======
    From YouTube Audio Library:
    Escapism Yung Logos
    Music from freetousemusic.com
    ‘Late Morning’ by ‘LuKremBo’: • (no copyright music) c...
    ‘Marshmallow’ by ‘LuKremBo’: • lukrembo - marshmallow...
    ‘Rose’ by ‘LuKremBo’: • lukrembo - rose (royal...
    ‘Snow’ by LuKremBo: • lukrembo - snow (royal...
    ‘Sunset’ by ‘LuKremBo’: • (no copyright music) j...
    ‘Travel’ by ‘LuKremBo’: • lukrembo - travel (roy...
    ‘Branch’ by ‘LuKremBo’: • (no copyright music) c...
    #stablediffusion #aiart #ai #machinelearning #dreambooth #textual-inversion #hypernetworks #lora #aesthetic-gradients #tutorials #resarch #aesthetic-embeddings
  • Science

Comments • 341

  • @infocyde2024
    @infocyde2024 1 year ago +230

    The thing about textual inversions is that they create embeddings that are cross-compatible with the base models. A textual inversion trained with SD 1.5 will work with all 1.5-based models, and here is the kicker: you can combine them without having to do any model merging. That is HUGE.

    • @lewingtonn
      @lewingtonn 1 year ago +27

      yeah, the flexibility of textual inversion is a big factor, also it's really cool conceptually!!

    • @zyin
      @zyin 1 year ago +15

      The video really should have mentioned this, it's an incredible advantage for embeddings that was just left out.

    • @neilslater8223
      @neilslater8223 1 year ago +24

      Yes, combining two, three or more Dreambooth models is possible, but it takes time and generates yet another 2GB+ model that you need to save somewhere.
      Whilst textual inversions can be used flexibly within the prompts in any combination, including weighting them, using as negative prompts, all on the fly with no extra file management.
      However, textual inversion cannot learn to output things that the base model is not able to do at all. So depending on the base model, it may not be possible to train a textual inversion for a specific concept.
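
A minimal sketch of why embeddings stack so easily, assuming a toy "model" that is just a token-to-vector lookup table. The token names (`<my-cat>`, `<my-style>`) and all vector values are made up for illustration; real embeddings are much longer vectors fed to the text encoder.

```python
# Toy model of textual inversion: the base model is a frozen lookup table.
# Every token name and number below is hypothetical.

base_vocab = {
    "a": [0.1, 0.0, 0.2],
    "photo": [0.3, 0.1, 0.0],
    "of": [0.0, 0.2, 0.1],
}


def add_embeddings(vocab, learned):
    """Return a new lookup table with the learned tokens added.

    The base table is copied, never modified, which mirrors the fact that
    textual inversion leaves the model weights untouched.
    """
    extended = dict(vocab)
    extended.update(learned)
    return extended


def encode(prompt, vocab):
    """Look up one vector per whitespace-separated token."""
    return [vocab[token] for token in prompt.split()]


# Two embeddings trained separately (hypothetical values).
my_cat = {"<my-cat>": [0.9, 0.4, 0.7]}
my_style = {"<my-style>": [0.2, 0.8, 0.5]}

# Combining them needs no merging step: both are just extra rows.
vocab = add_embeddings(add_embeddings(base_vocab, my_cat), my_style)
vectors = encode("a photo of <my-cat> <my-style>", vocab)
```

Since each embedding is only an extra row, using two of them together takes no merge and produces no new multi-gigabyte file. The flip side is the limit mentioned above: the new rows can only steer what the frozen model can already draw.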

    • @infocyde2024
      @infocyde2024 1 year ago

      @@expodemita I do not think they are compatible between 1.4/1.5 and 2.0/2.1. 2.0 and 2.1 should be compatible.

    • @alexandrmalafeev7182
      @alexandrmalafeev7182 1 year ago +2

      @@infocyde2024 2.0 and 2.1 are for sure

  • @simonbronson
    @simonbronson 1 year ago +64

    Much appreciated, having someone clever distil all of this dense information down and explain it succinctly and with so much enthusiasm is so refreshing!

  • @KalebWyman
    @KalebWyman 1 year ago +36

    Thanks for explaining these so well, your visual diagrams are great!

  • @tomm5765
    @tomm5765 1 year ago +13

    Thanks for your hard work putting this together, very helpful to evolve my understanding of the different approaches. Much appreciated!

  • @takeuchi5760
    @takeuchi5760 1 year ago +8

    Thanks so much for this. Very underrated channel, literally was thinking something like this would be really helpful.

  • @Animes4ever1
    @Animes4ever1 1 year ago +2

    Awesome comparison mate, great addition with the statistics, thanks a lot

  • @m3dia_offline
    @m3dia_offline 11 months ago

    I love it, love your promises on what we are going to get from your video at the very starting few seconds of the video itself, keep it going man, love your channel and your energy.

  • @AleOnYouTube
    @AleOnYouTube 11 months ago +2

    you deserve more subscribers, only channel I found that actually delivers what you need to know

  • @anthonyaddo
    @anthonyaddo 1 year ago +1

    Such an EXCELLENT video. Very very well researched and perfectly presented. Thanks for sharing all your findings and appreciate the time it took.

  • @ParanoidAmerican
    @ParanoidAmerican 1 year ago

    This video is exactly what I needed, and you went about it in the best way possible. Thanks for this

  • @moneyjuice
    @moneyjuice 1 year ago +4

    I love your videos, always on point !

  • @TheTruthIsGonnaHurt
    @TheTruthIsGonnaHurt 11 months ago +1

    Liked and Subscribed, Thank you for all the hard work!

  • @NukerOfFace
    @NukerOfFace 11 months ago +2

    Superb video. I don't think I've ever seen a tutorial/explanation for anything that is this good.

  • @fun7704
    @fun7704 1 year ago

    This was a very informative video in fact, thank you! And I like your very dramatic delivery of the content! :)

  • @fredingham1855
    @fredingham1855 1 year ago

    Outstanding job explaining these concepts! Well done!

  • @metamon2704
    @metamon2704 1 year ago +9

    You explained that amazingly, very easy to understand - also things move fast because it seems like LoRA is now the most popular.

  • @Symbosa
    @Symbosa 1 year ago

    I greatly appreciate this video sir! It is really helpful for me to have context of how things actually work behind the scenes to make mental connections and improve how I interact with the external program.

  • @swannschilling474
    @swannschilling474 1 year ago +1

    Thanks for the input, good research!!

  • @GayanZmith-vy1ql
    @GayanZmith-vy1ql 1 year ago +11

    i'm a total beginner to AI, and i suck at math, but you somehow managed to clear a shit ton of confusion. I was hooked on Dreambooth tutorials and trust me, you don't want that. I literally thought i was not going to be able to get started simply because of the massive resources it required.
    Trust me, you are really good at explaining things :)
    Really appreciate the help

    • @glasco_
      @glasco_ 11 months ago +1

      I’ve been trying to install dream booth for 3 days now. No success. Ready to walk in front of a bus

  • @AB-wf8ek
    @AB-wf8ek 1 year ago +2

    Thanks a ton for this breakdown, I've been struggling with this same question for a few weeks now. I had already come to a similar conclusion myself, but this was very validating.
    Dreambooth is preferred, but the model sizes make it so cumbersome and challenging to test different versions. With textual inversion, the file sizes are insignificant, and you can stack them on top of each other, making them very flexible.
    I haven't actually evaluated embeddings (textual inversion) yet for quality because the animation notebook I use doesn't support them, but the developer just made it compatible, so I'm looking forward to testing it out more.

  • @ytchen6748
    @ytchen6748 1 year ago

    What a great video! Thanks for your academic sharing and empirical results❤

  • @kulusic1
    @kulusic1 1 year ago +45

    Textual inversion is far better on 2.1 than 1.5, and i think that's why they don't get the same love dreambooth receives. You can also speed up textual inversion training if you spend a few minutes getting the initializing text right so the vectors start in relatively close proximity to their final resting place. The best part imo, is you can combine many embeddings together, something which dreambooth doesn't really allow.
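
A rough illustration of the initialization idea, assuming the new embedding starts as the average of the vectors of the words in the initializing text. That averaging rule, the helper name `init_from_text`, and the two-dimensional vectors are assumptions for this sketch; actual trainers may copy token vectors one by one instead.

```python
def init_from_text(init_text, vocab):
    """Build a starting vector for a new embedding from known words.

    Averages the vectors of every word of init_text that the vocabulary
    knows, so training starts near a related concept instead of at random.
    """
    tokens = [t for t in init_text.split() if t in vocab]
    if not tokens:
        raise ValueError("no known tokens in the initializing text")
    dim = len(vocab[tokens[0]])
    return [sum(vocab[t][i] for t in tokens) / len(tokens) for i in range(dim)]


# Hypothetical two-dimensional vocabulary.
toy_vocab = {"fluffy": [1.0, 0.0], "cat": [0.0, 1.0]}
start_vector = init_from_text("fluffy cat", toy_vocab)
```

The closer `start_vector` already sits to the target concept, the less distance the optimizer has to cover, which is the speed-up described above.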

    • @leonardom862
      @leonardom862 1 year ago +8

      How can you get the initializing text right before the training?

    • @alefratat4018
      @alefratat4018 1 year ago +1

      @@leonardom862 By running image to text I suppose ?

    • @nathanbollman
      @nathanbollman 11 months ago

      Ironically I haven't been able to run Dreambooth yet. I switched to Linux for AI... something is broken with PyTorch 2.0 and CUDA 11.7, and the only thing affected is Dreambooth training. Turn on gradient checkpointing and it can't train; turn it off and I can't make it to the first epoch without running out of 24GB of VRAM? I hope this gets fixed soon.

    • @sub-jec-tiv
      @sub-jec-tiv 11 months ago +1

      Totally agree. Suuper crucial to be able to call multiple embeddings in a prompt!

  • @jeronimogauna7508
    @jeronimogauna7508 2 months ago

    Best video I ever seen.
    Best vibes!
    Thanks so much

  • @Philip8888888
    @Philip8888888 1 year ago

    Wow. Thanks for this video, esp. the first part which gave just enough detail to understand the trade-offs and underlying approaches.

  • @dv8silencermobile
    @dv8silencermobile 1 year ago

    You are really good at explaining this stuff. Thanks!

  • @KlavsKrumins
    @KlavsKrumins 11 months ago

    Thank You a lot. This has been a really good explanation that I felt missing.

  • @toastypanda2963
    @toastypanda2963 10 months ago

    Great explanation! I've learned more about how AI art works from this video alone than all my previous watched videos combined. Everyone tends to say how to configure things without explaining how it works.

  • @lionroot_tv
    @lionroot_tv 1 year ago

    This is great. Thank you for sharing your knowledge, and about Excalidraw.

  • @mariokotlar303
    @mariokotlar303 1 year ago

    Awesome explanation, thank you!

  • @ronenbecker1873
    @ronenbecker1873 6 months ago

    You're an absolute legend. Great video

  • @thedevo01
    @thedevo01 1 year ago

    Thank you so much for this video! 🙏

  • @MenoMusing
    @MenoMusing 1 year ago

    Absolutely incredible video, thank you!

  • @ksottam
    @ksottam 1 year ago

    Loved this breakdown. You need more followers!

  • @user-be5rk2hy3c
    @user-be5rk2hy3c 1 year ago

    Incredible explanation! Thanks a lot.

  • @yo252yo
    @yo252yo 1 year ago

    this is the best video about the topic ive ever seen, thanks so much

  • @suryaprasathramalingam2421
    @suryaprasathramalingam2421 29 days ago

    thanks for the short explanation. Loved it!

  • @takocain
    @takocain 1 year ago

    That was an insanely good explanation. Thank you!

  • @CameronRule
    @CameronRule 1 year ago +16

    One interesting piece of data is Lora has quite a high faves per download rating while only being out for a short period of time

    • @lewingtonn
      @lewingtonn 1 year ago +6

      yeah, I saw that too.... good sign!

  • @kyosukefukumoto9382
    @kyosukefukumoto9382 10 months ago

    This video is AMAZING! Thank you SO MUCH.

  • @jichenzhang4385
    @jichenzhang4385 1 year ago

    Very nice introduction! Thank you!

  • @nolanzor
    @nolanzor 1 year ago

    Thank you so much for this video! Amazing work

  • @takif8756
    @takif8756 11 months ago

    Great tutorial mate, thank you!

  • @TheAnna1101
    @TheAnna1101 1 year ago

    Thanks for making such great and informative video. Keep up the good work

  • @friendofai
    @friendofai 1 year ago +1

    Really great video, thanks for sharing all your research!

  • @jondargy
    @jondargy 10 months ago

    Very nice summary- thank you 🙏

  • @rickguzman9463
    @rickguzman9463 11 months ago

    THANK YOU THANK YOU THANK YOU!! Great video. Great insight.

  • @LuisPereira-bn8jq
    @LuisPereira-bn8jq 1 year ago +3

    That was a really helpful video that definitely saved me a bunch of time trying to understand these differences by myself :P

    • @lewingtonn
      @lewingtonn 1 year ago +1

      saving people time makes me super happy, thanks!

  • @martinchen9667
    @martinchen9667 10 months ago

    brilliant video, thank you for all the efforts!

  • @StephaneBusso
    @StephaneBusso 1 year ago

    thanks for making those complex concepts easy to understand!

  • @Funzelwicht
    @Funzelwicht 7 months ago

    Awesome explanation for everyone!

  • @huyked
    @huyked 1 year ago

    Thank you, sir, for this explanation!

  • @TurboSkibidiFun
    @TurboSkibidiFun 1 year ago

    This is so well taught man thank you so much

  • @danielaston6560
    @danielaston6560 5 months ago

    This video is dope. Super clear and informative. Thank you!!!

  • @kazimozden4010
    @kazimozden4010 11 months ago

    Thank you for an informative and engaging video!

  • @jackzhang891
    @jackzhang891 5 months ago +3

    Hey Koiboi. Great video. When you made this video, as you said yourself, LoRA was still very new and the stats are probably not accurate. Now that a good amount of time has passed, I would love to watch an updated analysis video on the effectiveness of LoRA compared to Dreambooth and Textual Inversion.
    Either way, this is the most informative video I've watched so far comparing these fine-tuning models. Liked and subbed 👍.

  • @jasonhemphill6980
    @jasonhemphill6980 1 year ago

    That's so much work! Thank you man

  • @austinliu9218
    @austinliu9218 1 year ago

    clearly explained, much appreciated!

  • @darmok072
    @darmok072 1 year ago

    thank you for the great explanation!

  • @mlcat
    @mlcat 1 year ago

    Very clear explanation, thank you!

  • @Apothis1
    @Apothis1 1 year ago

    Really appreciate this, so many videos showing how to do this stuff, but not how it works, and specially not how it works dumbed down to a level I can understand. Very cool, thankyou

  • @AC-zv3fx
    @AC-zv3fx 1 year ago +37

    LORA works only with an extension, and many people don't know how to use it yet, hence lower ratings. Great video btw! Visual comparison would have been great as well! As far as I can remember, there was one in LORA blogpost, showing how textual inversion may be less flexible than dreambooth or lora, and the latter two were showing comparatively similar results.

    • @Avenger222
      @Avenger222 1 year ago +5

      Auto added compatibility now! But it was only added recently. (I still use the extension, I find the drop-down much easier to use than how auto implemented it, plus it gives you the ability to tweak the weight of both U-Net and the Text Encoder -- super cool!)

    • @artavenuebln
      @artavenuebln 1 year ago

      i did everything i should do and i never get lora to run. it was no issue with the textual inversion, tho.

    • @glitter_fart
      @glitter_fart 1 year ago +1

      controlnet has almost made lora obsolete for anything other than oddities

  • @maggiezhuang3842
    @maggiezhuang3842 5 months ago

    This is awesome! thank you!

  • @BlancheNuit
    @BlancheNuit 1 year ago

    That is the type of quality content that I'm digging for.
    I want to understand Stable Diffusion and everything related. But my attention span/knowledge about programming is not enough that I can just read papers about it. So I need videos, with visuals, and easy explanations. And your video was Perfect. Liked + Subscribed :)

  • @tenghuili3711
    @tenghuili3711 1 year ago

    Very great job! Thank you!🥰

  • @mattecrystal6403
    @mattecrystal6403 1 year ago +25

    I've been messing with Loras and they seem to work really well. You can also do a good amount of mix and matching with loras whereas a full model checkpoint only allows you to use that one model at a time. if I had a fruits lora and a vegetables lora, then I could just turn them both on to get fruits and vegies in my random prompt that doesn't ask for fruits or vegies. If I later just want fruit then I could just remove the vegies lora.
    I think loras are going to be big going forward, most people just don't know about them yet.
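
The fruits-and-veggies mix and match can be sketched like this, assuming each LoRA boils down to a weight delta that is added on top of frozen base weights at load time. The function name `apply_loras`, the 2x2 matrices and the strengths are all hypothetical.

```python
def apply_loras(base, deltas, strengths):
    """Return base weights plus each LoRA delta scaled by its strength.

    The base matrix is copied, so switching a LoRA off is just a matter of
    calling this again without it (or with strength 0.0).
    """
    out = [row[:] for row in base]
    for delta, strength in zip(deltas, strengths):
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                out[i][j] += strength * value
    return out


# Hypothetical 2x2 "model" and two independently trained deltas.
base_weights = [[1.0, 0.0], [0.0, 1.0]]
fruits_delta = [[0.5, 0.0], [0.0, 0.0]]
veggies_delta = [[0.0, 0.25], [0.0, 0.0]]

both_on = apply_loras(base_weights, [fruits_delta, veggies_delta], [1.0, 1.0])
fruits_only = apply_loras(base_weights, [fruits_delta, veggies_delta], [1.0, 0.0])
```

A full checkpoint replaces `base_weights` wholesale, which is why only one can be active at a time, while deltas like these can be stacked or dropped per prompt.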

    • @treyslider6954
      @treyslider6954 1 year ago

      I get the feeling that Textual Inversion is the go-to for when you have a new idea you want to teach the model (like a specific character or subject), and Lora is great for when you have a concept you don't want to stop and explain to the model, or may have difficulty doing so. They're very similar things, but not quite the same.
      For example; loras are great for mimicking a specific art style, because instead of having to describe "I want a painted animation style like this specific style, but with eyes drawn just so", you can train a lora and then just say "" at the end of your prompt, and since it isn't actually part of the prompt, this clears up tokens for describing the actual thing you want depicted in that style.

    • @ArbJunkAgeG
      @ArbJunkAgeG 1 year ago

      This is exactly how i feel about lora. It’s disappointing that people don’t seem to grasp the same values of how beneficial loras can be.

    • @tbuk8350
      @tbuk8350 11 months ago

      @@treyslider6954 And also, as described in the Automatic1111 docs, Textual Inversion can't teach COMPLETELY new concepts.
      The example they gave is that if you trained a model that only knew how to make apples on images of bananas, it wouldn't learn what a banana is, it would just make long yellow apples (in the best-case scenario). Because it's not actually changing model weights, it's better for teaching a style than a new subject, because unless the subject is very similar to something it's seen, it can't learn it.
      LoRAs can teach a model something it's never seen before, because they are directly inserting weights into the model, meaning it's actually modifying the model and not the input going into it.
      Basically, Textual Inversion for simple styles, LoRA for anything complicated.
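
For the "inserting weights" part, here is a small sketch of the low-rank update from the LoRA paper linked in the description: a frozen matrix W gets an added term (alpha / r) * B·A, where B and A are the two small trained matrices. The helper names, the tiny example matrices, and the 768x768 layer with rank 4 are assumptions picked for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_update(W, B, A, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the rank (rows of A)."""
    r = len(A)
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]


def param_counts(d, k, r):
    """Trainable values for a full d x k update versus a rank-r LoRA."""
    return d * k, r * (d + k)


# Tiny rank-1 example: B is 2x1, A is 1x2, so B @ A is a full 2x2 delta.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
W_adapted = lora_update(W, B, A, alpha=1.0)

# Size comparison for an assumed 768 x 768 layer at rank 4.
full, low_rank = param_counts(768, 768, 4)
```

The size comparison is the reason LoRA files stay small: a rank-4 adapter for that assumed layer stores 6,144 numbers instead of 589,824, yet the product B·A still touches every entry of W, unlike an embedding, which only changes the input.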

  • @cinematic_monkey
    @cinematic_monkey 11 months ago

    What I was looking for in that video was the comparison of usability in different scenarios. Which model is good for faces which one for style transfer etc. I'm missing that, other than that quite comprehensive comparison. Good job!

  • @dreamingtulpa
    @dreamingtulpa 1 year ago

    Why am I only now seeing this? Great video and thanks for the feature ❤

  • @ThePixelkd
    @ThePixelkd 1 year ago

    Thanks koiboi! I absolutely love these in depth videos.
    Any plans to give ControlNet this sort of treatment?

  • @wecharg
    @wecharg 8 months ago

    Great work!

  • @doingtime20
    @doingtime20 1 year ago

    Amazing work thank you. New sub.

  • @ticosanjr
    @ticosanjr 11 months ago

    Great Video! Thank you very much!

  • @SimsiZocker
    @SimsiZocker 11 months ago

    wow that was great, thank you so much!

  • @errrorproduction
    @errrorproduction 1 year ago

    really great video! finally understand the differences. just the conclusion is already out of date, since we're moving so incredibly fast. lora is the most popular format on civitai now. understandable, since training is the quickest, even though ti's end result is much smaller.

  • @thanksfernuthin
    @thanksfernuthin 1 year ago

    Great info! And coincides with what I learned on Computerphile's channel. Slowly but surely my mind is able to wrap around what we're dealing with.

  • @sammcj2000
    @sammcj2000 7 months ago

    Brilliant explanation, thank you. By chance, are your diagrams available somewhere?

  • @keiralx
    @keiralx 4 months ago

    Great video, really helped me understand this

  • @daffertube
    @daffertube 11 months ago

    Great video. Big thanks

  • @somedudeonyoutubefrfr
    @somedudeonyoutubefrfr 1 year ago +1

    I just started with the whole txt2img models and therefore I really thank you for this video!
    It was great to watch and I got a lot of information. I really appreciate it.
    Also, would you mind sharing the excalidraw board? Either as an exported png or as a file, it would be a great resource for my own documentation, if it's not too much to ask.

  • @paulofalca0
    @paulofalca0 1 year ago

    Great video! Thanks!

  • @crustysoda
    @crustysoda 1 year ago +6

    Thank you for model explanation. Really loved your content so far.
    At the end of civitai comparison, I’m curious if we split data to use cases, object embedding vs style embedding would have different performance/preference.

    • @lewingtonn
      @lewingtonn 1 year ago +2

      that's a super hard question to answer :(

  • @jitgo
    @jitgo 1 year ago +4

    All different now! LoRA is by far the best all round method now and hugely gaining popularity... Great video by the way, excellent explanations!

  • @timalk2097
    @timalk2097 1 year ago

    amazing content, insta subbed !

  • @barryjones6479
    @barryjones6479 1 year ago +1

    Great video and explanation! I really want TI to be the future but I agree, the quality of dreambooth training is usually better.

  • @Ben_CY123
    @Ben_CY123 1 year ago

    Bro, this video really helpful!

  • @wendellkwang3724
    @wendellkwang3724 1 year ago

    what a great list of checkpoints you have, a man of culture 🤣

  • @maxschaeffer
    @maxschaeffer 1 year ago

    thanks a lot for your effort, great job

  • @KnightLenny
    @KnightLenny 1 year ago

    Amazing educational video!

  • @joaquinramos1181
    @joaquinramos1181 6 months ago

    Very good video! Thanks

  • @tljstewart
    @tljstewart 1 year ago

    ok you had me @00:27 , would be cool to see a video on civitai

  • @parasite34
    @parasite34 1 year ago

    insane work and attention here

  • @RemitheDreamfox
    @RemitheDreamfox 1 year ago

    You explained this so well. My smooth brain couldn't understand these different methods for the longest time \uwu/

  • @kirollosmalek1365
    @kirollosmalek1365 1 year ago

    man you're a hero

  • @Slider93
    @Slider93 1 year ago

    Amazing, thank you

  • @badradish2116
    @badradish2116 11 months ago +1

    could you please do a part 2 where you
    - explain aesthetic gradients for educational purposes, and maybe provide data on user feedback like you did at the end for the others.
    - explain lycoris, which from what i understand is lora + 4 random good ideas, but id love to see someone on your level break it down a bit better.
    - give us updated data on the other forms now that more feedback is available (you mentioned not having a big enough sample size to judge the newest tech).
    that would be insanely helpful. thanks!

  • @Xiripyu
    @Xiripyu 1 year ago

    Thank you for the really nice explanation

  • @metasamsara
    @metasamsara 1 year ago

    Great breakdown of the details, and pros and cons of each technology! Which model would you use if you plan to train a few characters, then re-use them a lot together to write a visual story frame by frame (as in colored manga)? On a low to mid end laptop GPU...

  • @Copyshinobi
    @Copyshinobi 1 year ago

    Much appreciated! Having this nodes of wisdom to operate with AI models is a huge contribution to society! Props to you.

  • @bobsmithy3103
    @bobsmithy3103 1 year ago

    Amazing explanation

  • @metalpuppy2188
    @metalpuppy2188 1 year ago +4

    What an insanely helpful video! I'm still holding out hope the quality of hypernetworks improves (I've had fantastic results with it, but updates often break it and nobody really knows what they're doing so guides are not great)
    It shares some of the same advantages as TI (smaller file size, can be transferred between models easily) and I really hate having giant checkpoints just to add single concepts.
    I was excited to learn about LoRA, but it looks like it can't be used without first adding it to a checkpoint, so it's lost some appeal for me. Can you train multiple concepts to a checkpoint with LoRA one at a time and have them all retain coherency?

  • @Qubot
    @Qubot 1 year ago +1

    Very interesting video, thanks.
    Finetuning is using Dreambooth with multiple text/image embedding right ?