Stable Diffusion Deep Dive - CFG - Don't Accidentally Fry Your Images
- Published: 27 May 2024
- First, this video provides a light technical explanation of what CFG is doing behind the scenes in Stable Diffusion. Next, I cover the experiments I ran to characterize CFG's interactions with other parameters and the general findings of those experiments, and finally present charts showing which step and CFG values are likely to work for each sampler.
Introduction - 00:00
Explanation of CFG - 00:33
Methodology - 02:45
General Results - 03:59
DPM Adaptive - 05:57
Charts for Steps and CFG by Sampler - 06:36
Hires Fix and Emphasis - 10:25
Outro - 11:23
Link to Sampler Deep Dive - • Stable Diffusion Deep ...
#stablediffusion #aiart #automatic1111 #cfg
Link to Google spreadsheet with the figures shown in this video as well as some useful prompt constructors. It is read only, so make a copy for yourself if you want to use the constructors.
docs.google.com/spreadsheets/...
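For readers who want the CFG step in concrete terms, here is a minimal sketch of the standard classifier-free guidance combination. It is not taken from the video or from any particular UI's code; the function name `apply_cfg` and the toy numbers are made up for illustration, and real implementations work on tensors rather than Python lists.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Blend two noise predictions using classifier-free guidance.

    uncond: prediction made with the empty/negative prompt (hypothetical values)
    cond:   prediction made with the actual prompt (hypothetical values)
    cfg_scale: the CFG value; 1.0 returns the prompted prediction unchanged
    """
    # Start from the unconditional prediction and push it toward the
    # conditional one, scaled by cfg_scale.
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy numbers only: the same pair of predictions at three CFG values.
uncond = [0.10, -0.20, 0.05]
cond = [0.30, -0.10, 0.00]
for scale in (1.0, 7.0, 25.0):
    print(scale, apply_cfg(uncond, cond, scale))
```

The difference between the two predictions is multiplied by the CFG value, so large values push the result far outside the range of either input. That is one plausible reading of why very high CFG tends to produce the "fried" look.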
I'm loving these deep dives. You explain it in a way someone just starting out could easily understand.
@SiliconThaumaturgy Dude, the meme at 2:04 of 'Katie, teh penguin of doom!!! *holds up spork*' is certainly one of the best references I've ever seen in a youtube video. How many people did you honestly expect to get that lmao. I swear it was put in here for me alone, I'm in tears.
Wow, this must have been a huge amount of work but it shows the relation between CFG Scale and sampling steps beautifully. I'm glad RUclips recommended this, you got a new sub!
This is the kind of study that is extremely helpful. Nicely focused on a single factor in a plethora of factors. Thank you so much.
those thumbnails need work, but I'm really grateful for your channel/videos! very specific and to the point! thank you for your work friend. keep 'em coming! 😊
Glad you like them. Upgrading the thumbnails is on my list and I'm getting started today
Great videos. I dig that you keep things in the realm of science and not get any emotion involved as many others seem to do
The best channel on the topic. Only objective tests with x/y test matter, not some rando reddit flame war. Thank you!
Ps you had my sub at "the doggiest dog :)"
Really great content here. I hope you make more - I find the analytical approach fascinating while you weave a little subjectivity into it (which is the only way you can determine whether something is 'good' or 'not good' when creating images, since it's subjective outside of defined terms like sharpness, saturation, etc.). I would love to see you do a video on different types of subjects (animals, landscapes, mixed subject, logos, marketing related, etc.) and what samplers, steps and CFG produced what you would consider the most 'reliable' good results. Since it's all subjective, it fits right into your research already but then nails you down, finally, to give a top 3 or top 5 of samplers and some small ranges for the CFG and steps - based on subject. Thanks for all you do my man!
Wow. This video is crazy informative. Subscribed!
Your videos are always so on point and educational. Keep up the good work!
God i love your charts, I haven't seen anyone else make really good visual charts on these sorta things
amazing video ! Those deep dive videos are super interesting and informative
Finally I got a real answer! Tysm
super informative and helpful. you deserve way more subs
Very useful information. Thank you
Bro I really like your video, so informative and well prepared. Keep it up! ❤
Thank you for the great explanation 👍
This is amazing thank you!
great job,thanks!
thanks for these videos bro, saved me a lot of work haha
This is very helpful! Big FANX!
Wow, just invaluable, bro. Thanks.
Thanks! I appreciate the encouragement.
Are you still out there? :) You stopped all videos about 6 months ago...
Very interesting vid ty
Still feels like CFG is a messy subject. I mainly don't touch CFG, sampling method, or steps, because I have plenty of parameters to tweak already while simultaneously praying for decent results. Damn, the prompt alone is a major headache, considering that different models give you different outputs. Imagine taking into consideration sampling method, steps, CFG, and Dynamic Threshold, for example. This is crazy
in my experience, if your prompt/Negative prompt is not complicated and less than 100 words then 25 Sample Step + 7 CFG is good.
If your Prompt/Negative Prompt is very complex and contains 300 or more words, then 50 Sample Step + 25 CFG is fine.
thank you, your explanations with charts, ufff papa, very good
Thanks for watching
turn up the volume bro, and do more deep dives please!
I think this was my last video with low volume. More deep dives are definitely in the works
I continually refer back to this video specifically for the charts which begin at 6:36 timestamp. Maybe i should just save them as jpgs to my desktop... lol
sound is too low or is it just me
It's low. I had to put my speaker volume at 80 percent to hear clearly, when 50% on other videos is loud enough to get complaints from the neighbors. I suggest Silicon either normalize the audio or do a test upload until the audio settings are fine-tuned. Either way I am grateful for the video.
I'll have to boost audio for future videos. Unfortunately, I can't increase the volume for videos that have already been uploaded.
nice deep dive. would love it if you would make the spreadsheets available. :)
It was a mystery to me why SDXL models are generally so blurry. Not anymore. It seems to me that all SDXL models are very CFG sensitive. Most models that I have tried (about 30) start to look overexposed or start to fry after CFG 2.3 - 2.7 and get very blurry at about CFG 5.0. Sometimes that is even before the scene or the pose in the image gets its approximate final form.
This doesn't seem right to me, though I only use vanilla SDXL. I've only ever used the default CFG of 7.5 and get great results. People would be losing their minds if they had to stay within the range of 2.3 - 5.0. Also, the effects you describe seem to be backwards, as the overexposed look should be on the high CFG end and the blurry, low-contrast look should be on the low end. You might be running into a bug with your software configuration.
wow you could make a website with these direct matrix comparisons that could be super interesting as a cheat sheet. I only found some of those and they typically only deal with one or two samplers, or would always only use "a dog" which is kind of not a very complex cfg
Is there any difference between images generated by AMD GPUs and NVIDIA ones?
Hi - great video, but your audio level is very low!
Yes, this is fixed on my more recent videos, but alas, RUclips doesn't allow me to fix this in existing videos
Thank you for the video but it's very difficult to hear. I had to turn the volume up super high and then a commercial interrupted in the middle of the video and nearly blew my speakers out. It looks like you put a TON of time and effort into this video with all the experimentation with the CFG values and the data you present, so it's really unfortunate that the audio is so low.
Alas, I cannot increase the volume on my old videos. The issue has been fixed on more recent uploads
@siliconthaumaturgy7593 There's no way to fix it? Bummer. Thanks for the info. :)
I'm here after mixing Loras... oy.
your video is very hard to hear, frustratingly low volume. I am sorry.
Thanks for all of your videos! Can you please make one about prompt editing (word switching) and whether it works during the hi-res fix phase? Also about the BREAK keyword. Thanks in advance.
I have one that includes prompt editing/word switching. It is titled "Stable Diffusion Basics - Prompt Emphasis and Blending Concepts using your prompt." It doesn't include the BREAK keyword unfortunately.