This is getting confusing - what's the best between Stable Diffusion 1.5, Stable Diffusion XL, Stable Cascade, Midjourney, and DALL-E 3?
You are right, it really is hard to juggle all of these - I think the best model depends on your use case. If you want quality images with minimal effort, Midjourney and DALL-E 3 are terrific.
However, for more customization, commercial use, and no cost, SD 1.5 and XL are extremely well supported in the open-source community and can be run locally. Stable Cascade, which was covered in this video, is on par with XL but is non-commercial and requires more VRAM to run on local machines.
There are of course other considerations like cost, censorship, and APIs as well.
Thanks for the info, seems like it depends on what you want to use them for indeed @@PromptingPixels
Just gave it a try and yes, the repo is deprecated, but it has instructions pointing to the official support :)
Thank you for the heads up! Just uploaded a new workflow with the native support.
Thanks as always good sir
You bet!
Another excellent, short, and to-the-point video - thank you! Do you have any idea why the generation process is extremely slow? It takes about 10 minutes with my Nvidia 2070 with 8GB VRAM.
From my understanding, this cascading (Würstchen) architecture requires each of the models (Stage C, B, and A) to be loaded into memory, which requires a larger amount of VRAM.
That being said, Stages B and C are modular. Once we see proper support in apps like ComfyUI/Auto1111/Forge/SD.Next, etc., I imagine that less VRAM will be necessary to get these up and running when using the smaller models.
Additionally, Stability noted in their release the following:
"Thanks to Stable Cascade’s modular approach, the expected VRAM requirements for inference can be kept to approximately 20gb but can be further lowered by using the smaller variants (as mentioned before, this may also decrease the final output quality)."
Press the button, I want to see it render :D
Haha - touché! If memory serves, about 30 seconds or so. Curious to see how it performs once there's full support in ComfyUI and with the smaller Stage C/B models.
Thanks for the info. I'm guessing this will only work on Nvidia GPUs?
Yeah, a healthy amount of VRAM is required to get things rolling and the models loaded. Once running, you should see good results. Unsure if it's officially stated as I'm away from my computer at the moment, but I believe you generally want at least 8GB of VRAM.
@@PromptingPixels Yes, I've been running Fooocus with a 16GB AMD 6900 XT - albeit a bit slow, it puts out some amazing renders. But I'll definitely be upgrading to Nvidia with the 5xxx series next year, that is if AMD doesn't improve upon their AI gen.
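For anyone unsure what their card reports, here's a quick sketch for checking VRAM from PyTorch before loading the stages (ROCm builds of PyTorch expose AMD cards through the same torch.cuda API, so this isn't strictly Nvidia-only):

```python
import torch

# Print the first GPU's name and total memory in GB.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No GPU detected - generation would fall back to (very slow) CPU.")
```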
Sora says hi
Wasn't that announcement crazy!?
@@PromptingPixels waiting for you to drop a video about it haha
Just shot it! Now on to editing - look for it in about an hour or two!
Too bad the model first goes to the C: drive cache - if your C: drive is full, you can't install it.
Yeah, this is just a temporary solution to get it running in ComfyUI.
I have been checking the ComfyUI repo regularly waiting for the Stable Cascade support - will probably put together a video once that happens.
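On the C: drive point above: the downloads land in the default Hugging Face cache (C:\Users\<name>\.cache\huggingface on Windows), which can be redirected. A small sketch - the D:/hf-cache path is hypothetical, substitute any drive with free space:

```python
import os

# Redirect the Hugging Face cache before importing diffusers/transformers,
# so model downloads skip the default location on C:.
os.environ["HF_HOME"] = "D:/hf-cache"  # hypothetical path on a roomier drive

from diffusers import StableCascadePriorPipeline

# Alternatively, cache_dir overrides the download location per call.
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", cache_dir="D:/hf-cache"
)
```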