Dude, I usually don't write comments, but this guide is awesome as hell. The explanation of the "a" in the name and the benchmarks PLUS your "good image" disclaimer were extremely well done and professional. (I am currently at minute 3, but had to post a comment first lol) I hope you get more subscribers and viewers. Thank you for your service!
Whoa! First explanation of all the samplers that goes in depth... usually it's "just try them all out and see what you like" :)
FINALLY someone explaining all these terms in a way that makes sense.
That was awesomely presented! Thanks for the good work!
All samplers can be used with SDXL in ComfyUI. The DDIM, PLMS, UniPC sampler limitation is an Automatic1111 problem.
Correction, they can now be used in Auto's new commit (release candidate) SDXL update.
A very useful video. It's rare these days that I see new info in an SD video, but I didn't know half the samplers took twice the time to generate. That could save a huge amount of time using Deforum. Thanks! ❤
this info is pure gold
Thank you! Very useful video about samplers!
Thanks, a very in-depth look at the samplers, and thanks for explaining all that; it's a lot to wrap your head around. 99% of the time I use 30 steps of DPM++ 2M Karras and just occasionally rerun the final output with high-step Heun if it is something really special, as I prefer the way it looks in all those grids and it has great quality.
Thank you VERY much for the time and effort you put into this! Most excellent!
Such an awesome video I even wrote your name on a sticky note to add to my collection of bookmarks!
Sorry, I just don't like YT deciding what I like for me, but I will speak about it positively when interacting with other humans face to face.
Fantastic video!!!
just wow! very in-depth tutorial on samplers!! thx a lot for your effort
17:06
THANKS for the video. You did a very good job here!
This is great! Are your infographics available as images somewhere? They'd make good ongoing references while building familiarity...
Best in-depth video.
Bro, I love your work, thanks for the video
Thanks! very useful video!
According to Wikipedia, Leonhard Euler and Karl Heun were German speakers; in their language EU reads as OY - OYLER and HOYN - which comes from the pronunciation of the Greek υ (ypsilon). It would make sense to pronounce the names on one consistent principle, the way their mom and dad said them.
Could you make a similar video about the additional samplers in ComfyUI? Samplers like dpmpp_3m_sde_gpu
Wow. They waited until right after I made a new video to release new samplers.
I saw a quick comparison on Reddit indicating dpmpp_3m_sde_gpu is in the same group as the rest of the SDE samplers. There is a conversion table for A1111 vs Comfy sampler names near the start of the vid.
Also, seeds for A1111 and Comfy are not the same (GPU vs CPU noise), so you won't get identical results even with the same seed. I think the GPU SDE samplers in Comfy might be an exception, but I didn't verify.
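For what it's worth, a minimal PyTorch sketch (my own illustration, not from the video) of why CPU-seeded and GPU-seeded noise differ even with the same seed:

```python
# Same seed, different devices -> different initial noise.
import torch

seed = 42
shape = (1, 4, 64, 64)  # SD1.5 latent shape for a 512x512 image

gen_cpu = torch.Generator(device="cpu").manual_seed(seed)
noise_cpu = torch.randn(shape, generator=gen_cpu, device="cpu")

if torch.cuda.is_available():
    gen_gpu = torch.Generator(device="cuda").manual_seed(seed)
    noise_gpu = torch.randn(shape, generator=gen_gpu, device="cuda")
    # CPU and CUDA use different RNG algorithms, so the tensors
    # almost certainly differ despite the identical seed.
    print(torch.allclose(noise_cpu, noise_gpu.cpu()))  # expected: False
```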
Your videos are great and informative, but it would be helpful if you added a check mark or arrow in the places you talk about. Then the video would be easier to watch.
Well, this taught me a bit but I'll probably just stick to DDIM for text to image, then DPM++ 2S Karras for image to image/inpainting.
Hi, Where can I find the document you showed in the video? Thanks a lot!
adding loras and embeddings will continue to change the image significantly
Great video. So much useful info. If you have a spare weekend you could repeat all this with the new LCM lora at 0.5, 0.75 and 1.0... You get great results in 3 steps on DPM SDE. Or maybe wait until the proper LCM scheduler is implemented in auto1111? Even before LCM, many samplers give great results at cfg 1, 1.5 or 2 even with just 5 steps, which I feel is often overlooked (see the sketch after the lists below).
Even without the LCM lora, try these on 4 steps:
DPM2 cfg 1.5
DPM2 a cfg 1.5
DPM++ 3M SDE cfg 1.5
DPM2 Karras cfg 2
DPM++ 2S a Karras cfg 2
DPM++ SDE Karras cfg 3
Then on 5 steps, more samplers become usable at these low cfgs:
DPM++ 3M SDE 1
DPM2 a 1.5
DPM2 2
DPM2 Karras 2
DPM++ 2M Karras 2
Euler a 2
Euler 2
LMS Karras 2
UniPC 2
DPM++ 2S a Karras 2.5
DPM++ SDE Karras 3
DPM++ SDE 3
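Not from the comment above, but a minimal diffusers sketch of the low-steps/low-CFG idea (the model id, prompt, and scheduler choice are my assumptions; DPMSolverSinglestepScheduler is roughly diffusers' analogue of the DPM++ 2S family):

```python
# A minimal sketch, assuming an SD1.5 checkpoint and the diffusers library.
import torch
from diffusers import DPMSolverSinglestepScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumption: any SD1.5 model id works
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config)

image = pipe(
    "a photo of a cat",     # placeholder prompt
    num_inference_steps=5,  # very few steps...
    guidance_scale=2.0,     # ...paired with a low CFG to avoid artifacts
).images[0]
image.save("low_steps_low_cfg.png")
```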
You could continue this process, producing a graph to illustrate the cfg vs steps trade-off... There's a usable range and optimal cfg for each step count, for each sampler. I've not seen it properly graphed in a paper or RUclips video, if you're looking for more inspiration!
And how did you get the wall clock computation time out? Is there a benchmark tool that spits out the data?
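One low-tech way, in case it helps: wrap the generation call in wall-clock timing and sweep the grid yourself. A generic sketch (the `generate` callable is a stand-in; keyword names follow the diffusers convention):

```python
# Time every (cfg, steps) combination with plain wall-clock timing.
import itertools
import time

def benchmark(generate, cfgs, step_counts):
    """`generate` is any callable accepting guidance_scale and num_inference_steps."""
    results = {}
    for cfg, steps in itertools.product(cfgs, step_counts):
        start = time.perf_counter()
        generate(guidance_scale=cfg, num_inference_steps=steps)
        results[(cfg, steps)] = time.perf_counter() - start
    return results

# Example usage with a diffusers pipeline `pipe` (see the earlier sketch):
# times = benchmark(lambda **kw: pipe("a photo of a cat", **kw),
#                   cfgs=[1, 1.5, 2, 3, 5, 7.5], step_counts=[4, 5, 8, 12, 20, 30])
```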
Your fantastic video on cfg is here, the legend at 6:50 is great, but perhaps more could be investigated in the bottom left corner below 12 steps :-) There's also the cfg fix in the form of dynamic thresholding... so many variables! ruclips.net/video/kuhO9zAzetk/видео.htmlsi=A61TTCeWHdetui6x
Good guide; too bad this channel stopped releasing new content 10 months ago, there's so much more to cover now with SD 2, 2.1, SDXL Lightning, Stable Cascade and SD3.
Question: when comparing the steps required to get an acceptable image between SD1.5 and SDXL, how were the SDXL steps divided between the base and refiner samplers?
For this testing, all base steps. Refiner vs base will affect subjective quality, and for the best subjective quality you will want to go beyond the minimum steps anyway.
My criterion for a decent image was all artifacts disappearing and the image not looking out of focus or burnt, not best subjective quality.
Hi, which folder do we put the sampler files in?
You forgot about one important thing: decreasing CFG to 3 will let you generate in 6 steps. The CFG value and the number of steps required are very dependent on each other; the step count must be about twice the CFG value or more, and if it's less you get artifacts... which makes this whole test very incomplete. From my testing, DPM++ 2M Karras, SDE and Euler a for 9 images on a 3090 all take about 7.7 sec at 12 steps, 512 res, but I don't like Euler a's soft texture most of the time unless I need untextured images, so for 9 images at 6 steps you get about 5 seconds to generate them.
That's not true at all.
Steps = How many times it refines the image.
CFG = How much it tries to force the noise to match the prompt rather than filling it with something else that it sees. Lower CFG = more realism since it uses more of the neural network's own knowledge. Higher CFG = More and more "cooked" image if you go too far. About 6.5 is the highest I ever do, and 3 is the lowest I ever do.
They have nothing to do with each other. They are completely independent.
This is the best video I saw about samplers.
However, I really wish for an in-depth comparison of them based on CFG value. I use huge prompts full of every word I can think of that is related to the subject. It is like an essay in the positive and another in the negative, with a bunch of negative embeddings. I usually need a very high CFG number (above 20) with hundreds of steps, but I get dark, saturated, contrasty images.
I'm still experimenting with the samplers for generating the images and also for scaling. It is very hard to compare when I have no idea what all of those samplers do, what they can't do, and what they can do with extra steps.
Any suggestions for an extreme case of a very large prompt with a very large CFG and an open number of steps?
I'm finding some models don't really take well to novel-length prompts. Usually the creator will state that, but sometimes... I generate a thousand images and maybe get a hundred that actually apply to my intended parameters.
@ForeverTemplar I used to do that, generate many photos and pick from them, but no matter how many I generate, I always find the images lacking so much from the prompt, so I end up adding more and more words eventually.
I made a workflow that injects new words into the prompt in 6 stages, where the old words are carried over in the same prompt but the new words are kept in front. The stages start with the main stage, then the composition, shaping, effects and detailing stages. The last stage is upscaling with the accumulated prompt.
The CFG and steps increase as the stages progress, while the denoise decreases. It is a manual convergent method that gives me control over the outcome in the early stages. I'm still refining it, but I'm getting pretty consistent results at this point.
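Not the commenter's actual code, but a rough sketch of how such a staged workflow could be wired up (the txt2img/img2img helpers, stage words and parameter values are all hypothetical stand-ins):

```python
# Hypothetical sketch of a staged prompt-injection workflow.
# txt2img() and img2img() stand in for whatever pipeline calls you use;
# stage words and parameter values are illustrative only.
stages = [
    ("main subject words", dict(cfg=4.0, steps=12, denoise=1.00)),
    ("composition words",  dict(cfg=5.0, steps=16, denoise=0.70)),
    ("shaping words",      dict(cfg=6.0, steps=20, denoise=0.55)),
    ("effects words",      dict(cfg=7.0, steps=24, denoise=0.45)),
    ("detailing words",    dict(cfg=8.0, steps=28, denoise=0.35)),
    ("upscaling pass",     dict(cfg=8.0, steps=30, denoise=0.25)),
]

prompt, image = "", None
for new_words, params in stages:
    # New words go in front; earlier words are carried over behind them.
    prompt = f"{new_words}, {prompt}" if prompt else new_words
    if image is None:
        image = txt2img(prompt, cfg=params["cfg"], steps=params["steps"])
    else:
        image = img2img(image, prompt, **params)
```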
@jasemali1987 This is interesting. How do you do this?
Big FANX for this Video! I have the strong feeling that with some samplers, long and very detailed prompts require more steps...(?)
Has anyone else noticed a connection between prompt complexity, Sampler and number of steps?
Happy colored Greetinx
new samplers now in the SD WebUI update
What about DPM++ 3M versions? Are they faster or better??
Hello, how are you? I really like your videos and your didactics. Lately I've been looking for knowledge on creating LoRAs; these last 2 days I read and watched several things about the subject, but I'm still lost because each place teaches or states something different. I would really appreciate it if you could make a video about this subject, if possible... I have no doubt that you would explain things clearly. I'm using Kohya and the biggest problem is the parameters tab: there are many settings and I don't know exactly what the main ones do. I've done dozens of tests, but I still get lost in options like Optimizer, LR Scheduler and, mainly, Network Rank (Dimension) and Network Alpha; the latter I think have a big weight on the result, just like Epochs x Steps... keep up with your beautiful didactic work
What do you use most of the time? Kindly share.
This is incredibly good, valuable content, just as always :)
Could you maybe do a video deep dive on the refiner? I rarely get better results with it. :( It's even more useless with 3rd-party models.
Was DPM++ 2M Karras also tested above 20 steps?
From my testing, DPM++ 2M Karras, SDE and Euler a for 9 images on a 3090 all take about 7.7 sec at 12 steps, 512 res, but I don't like Euler a's soft texture most of the time unless I need untextured images. Also, decreasing CFG to 3 will let you generate in 6 steps; the CFG value and the number of steps required are very dependent on each other. The step count must be about twice the CFG value or more, and if it's less you get artifacts.
@@krystiankrysti1396 have you tested with xformers and the slider option in settings? (forgot the name)
@Chris3s I turned off xformers and have only the v1 optimisation enabled in the bat file. The reason is I get a NaN exception with xformers that I can't fix by swapping to another model and back. I get the same speeds with and without xformers, so I don't really need them, and yes, I tested to compare whether I lose something by disabling them.
Interesting, I've never truly understood the samplers that well. In regard to DPM being CFG-focused: might it work better than other samplers when you require a high CFG value for whatever reason?
Apparently the latest automatic1111 added a bunch of new samplers. Are you howling in anguish? :P
Definitely poor timing on my part haha.
Though I'm used to videos going out of date within weeks at this point. Just how it is.
@siliconthaumaturgy7593 This is true. You tend to manifest changes into automatic1111 with your videos.
We'll have to see if we can find ways to use this special power to add handy features to the app. Make a video about mitigating memory leaks so they fix that! :)
The gpu and non-gpu versions produce different results
Somehow my results are unsatisfying with Stable Diffusion Automatic1111 when trying to generate locally on my PC with my old GTX 1080 Ti 11GB. I can't even get anything close to looking good or anything comparable. If I have to be honest, Automatic1111 local generation can't even compare to the simple, basic generation of Leonardo AI.
hahahah not obsessed with that "chinless dude" hahaha
interesting