New Easy VAE Workflow (Stable Diffusion)
- Published: 9 Jun 2024
- Using a custom VAE can improve Stable Diffusion images significantly. We walk through how to use a custom VAE with the AUTOMATIC1111 webui (a code sketch follows the links below) and also explain what the heck a VAE is and why it helps.
Discord: / discord
0:00 - Intro
1:00 - What is a VAE
8:32 - How to use a VAE
11:17 - Comparison
------- Links -------
Comparison Images: / vae_comparison
AMAZING Video on Variational Autoencoders: • Variational Autoencode...
Good generalist VAE by Stability.AI: huggingface.co/stabilityai/sd...
The waifudiffusion VAE I used: huggingface.co/hakurei/waifu-...
AUTOMATIC1111 Webui: github.com/AUTOMATIC1111/stab...
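For anyone who wants to try the same swap outside the webui, here is a minimal sketch using Hugging Face's diffusers library. The model IDs, prompt, and filenames are illustrative assumptions, not the exact files from the video:

```python
# Sketch: swapping a custom VAE into a Stable Diffusion pipeline with diffusers.
# The webui does the equivalent when you pick a VAE in its settings.
import torch
from diffusers import AutoencoderKL, StableDiffusionPipeline

# Load a standalone VAE (Stability AI's general-purpose fine-tune).
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

# Attach it in place of the checkpoint's built-in VAE.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", vae=vae
)
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

image = pipe("portrait photo, soft window lighting").images[0]
image.save("with_custom_vae.png")
```

Any VAE with matching latent dimensions can be dropped in this way, which is why the same VAE works across different fine-tuned checkpoints.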
------- Music -------
Music from freetousemusic.com
‘Branch’ by ‘LuKremBo’: • (no copyright music) c...
‘Butter’ by LuKremBo: • lukrembo - butter (roy...
‘Daily’ by ‘LuKremBo’: • (no copyright music) c...
‘Onion’ by LuKremBo: • (no copyright music) l...
‘Rose’ by ‘LuKremBo’: • lukrembo - rose (royal...
‘Sunset’ by ‘LuKremBo’: • (no copyright music) j...
Many thanks to LuKremBo
#stablediffusion #aiart #xformers #tutorials #techtutorials - Science
Very informative! I've been seeing a lot about VAEs but have been struggling to understand them. This video helped me out tremendously! Love the content, keep it up!
Is there anything you CAN'T explain? Amazing mate!!!!!
Yes lol, why aitrepreneur has so many more subs than me :'(
@@lewingtonn mate forget the subs, they will come, do what Mr Beast does and translate into multiple languages!!
@@TheCopernicus1 ............... huh
@@lewingtonn I joined your discord! Also what I meant regarding Mr Beast was the technique he uses for most of his video's is he translates them into multiple spoken languages as there are many ML enthusiasts around the world. He figured he had more non-english speaking friends watching his channel than originally anticipated!
I just really like the way you explain complex stuff. really appreciate it.
hawhahah really? I'll have to visit sometime
@@lewingtonn Sure, That would be fantastic! Let me know when you're thinking of coming 😀
Great video as always Koiboi, always looking forward to what you are creating.
Nice explanation, nice end cut too!
lol, what the heck, sorry!
This is very cool, I love the details you go into.
thx for the explanation man, very informative and easy to understand
Love your explanation!
Perfect explanation! You got me to read the actual paper and your video helped me get to the Aha moment!
thanks for commenting it out loud dude, literally so good to hear!
Thank you for this video.
Pretty much some of the very best content around. Could you do an intro to upscalers? There is a lot of controversy out there regarding those as well.
Fantastic :) Thank you!
very helpful, thank you
nanomachines?
thanks for the easy answer.
thank you, i recently picked up sd and was having a problem where everything i generate had washed-out colors. this actually solved it for me, thank you
i'm convinced that's what midjourney v4 is, just a new vae
[citation needed]
7:11 You mentioned the Decoder converts the latent back to the exact original image; it actually only returns a very close approximation of the original.
yeah, good point, I should have been a bit more clear about that hey. I should have said it TRIES to convert it back or something.
Thanks, funny hat man. BTW, is the VAE technically lossy? So with encoding ---> decoding, when it gets to the end result, is the image a good learning-based guess or a 1:1 copy of the original?
exactly, VAEs are very lossy, it's a good learning-based guess!
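To see that lossiness directly, here is a small sketch that round-trips an image through the VAE and measures the reconstruction error. It assumes diffusers and torchvision are installed; "input.png" is a placeholder for any image whose sides are divisible by 8:

```python
# Sketch: encode an image to latents, decode it back, and measure the loss.
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image
from torchvision.transforms.functional import to_tensor

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

# Scale pixels to the [-1, 1] range the VAE expects, add a batch dimension.
img = to_tensor(load_image("input.png")).unsqueeze(0) * 2.0 - 1.0

with torch.no_grad():
    latents = vae.encode(img).latent_dist.sample()  # 8x smaller per side
    recon = vae.decode(latents).sample

# A nonzero error: the decoder returns a close approximation, not a 1:1 copy.
print("mean abs error:", (recon - img).abs().mean().item())
```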
and if I set the VAE option to "Automatic", what will it use?
The zigzag just looks like normal raster pixels in a low resolution image. Most raster images have them in higher contrast areas. To make a diagonal you need a series of offset square pixels, after all.
that's exactly what I was trying to point out (I need to work on being clearer): how a diffusion model would have to learn how to offset square pixels to create a diagonal line visual effect, when really it shouldn't have to worry about such details
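A quick back-of-the-envelope for why the latent space spares the model those raster-level details (the numbers are for SD 1.x: 8x spatial downsampling into 4 latent channels):

```python
# Pixel space vs. SD 1.x latent space for a 512x512 RGB image.
pixels  = 512 * 512 * 3   # raw values a pixel-space model would have to handle
latents = 64 * 64 * 4     # the 8x-downsampled, 4-channel latent grid
print(pixels / latents)   # 48.0 -> the VAE absorbs the pixel-level detail
```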
Do you happen to know the "Quicksettings list"(for those who don't know, this is a thing in the settings that adds stuff at the top of the webUI) value for VAE and clip skip ?
Is it SD_VAE and SD_Clip_Skip ?
CLIP_stop_at_last_layers, sd_vae
It is sd_vae. I found it by looking at the web page source and searching for "vae". For those who are like "WTH, where is this folder and why don't I have a Stable Diffusion section" in your AUTOMATIC1111, don't forget to do a git pull.
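For reference, the Quicksettings list is a comma-separated field under Settings → User interface; combining the two keys mentioned above, the value would look like:

```
sd_vae, CLIP_stop_at_last_layers
```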
So, how do you actually make or extract a VAE from the full unpruned model?
Could a VAE be in .pt format?
One more question: do you know anything about Stable Warpfusion? Is it another AI or a version of SD, or is it a model, embedding or prompt?
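On the extraction question above: a minimal sketch of one common approach, assuming a standard LDM-style .ckpt where the VAE weights sit under the "first_stage_model." prefix (both filenames are placeholders):

```python
# Sketch: pull the VAE weights out of a full Stable Diffusion checkpoint.
import torch

ckpt = torch.load("full_model.ckpt", map_location="cpu")
sd = ckpt.get("state_dict", ckpt)

# In LDM-style checkpoints the VAE is stored under "first_stage_model.*".
prefix = "first_stage_model."
vae_sd = {k[len(prefix):]: v for k, v in sd.items() if k.startswith(prefix)}

# Saved as .pt -- which also touches the PT-format question: VAEs are
# commonly distributed as .pt / .vae.pt state dicts.
torch.save({"state_dict": vae_sd}, "extracted.vae.pt")
```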
I wonder if the Lorenz is related in any way to the guy who had a fractal model named after him?
Could you consider doing these virtual chalk board thingies on white background? I may not be majority, but my eyes can't take that black background...
that's weird, I can't staaaand white background. It could be a bit more visible though, I'll try thicker lines or something
Now you gotta do muscular Kamala, for equity, y'know.
could you do a video on embeddings? I have tested some but it seems they do nothing. Why do we have them?
Why not check Automatic1111's wiki? There's a whole page about textual inversion.
i literally did one!!!
@@lewingtonn great.
Do you mean it's already out, or is it coming next?
@@lithium534 it's this one: ruclips.net/video/9zYzuKaYfJw/видео.html&ab_channel=koiboi (aesthetic embeddings = aesthetic gradients), I assume that's what you mean by "embeddings"
@@lewingtonn Thanks.
I was searching for embeddings. So this is the other name for it.
Now I know. Thanks again, keep the great content coming.
Hey, I wonder why you use 1.3? Is that a better model in your opinion? Better than 1.4 and 1.5?
I used waifu diffusion 1.3, which is the most modern version of waifu diffusion (which is a specially finetuned version of stable diffusion 1.4)
@@lewingtonn ahh I see thanks!
I am a bit confused.
I see no significant differences between your before and after images.
Shouldn't the "waifu diffusion" model be used as your main model on the text-to-image page of the GUI?
You can see a more close-up comparison of the images linked in the description, I think the changes were significant in some cases. I did end up using waifu diffusion when I actually generated the images, but you can use any VAE with any diffusion model. Hope that cleared things up a little.
@@lewingtonn Thanks!
You're 100% wrong. Latent Diffusion is magic.
It's a Pandora's box, it was a gift from aliens.
man, you are hilarious
guys be honest, we all simp for Two Minute Papers here
especially me :'(
Dear fellow scholars, do you want a Two Minute Papers replacement here?
@@HB-kl5ik YES!
Big government got me 🥵🥵
hold on to your papers and beers.. cheers XD
Am I the only one who calls it Auto Eleven?
only the one-eyed pirates that don't see the other two ones...
You are not the only one. 😅
damn, that's way better!
@@lewingtonn "Automatic One-One-One-One" doesn't quite roll off the tongue.
A-Quad-1
Donald Trump would win that fight 😏