I listed all the diffusion model and VAE links in this blog post, since YT doesn't like HF links in the description.
thefuturethinker.org/nvidia-sana-in-comfyui-setup-tutorial-guide/
Please provide us a working VAE link. Everyone is having problems with grey, black, or blurry colorful images. Someone said the oldest version works; give it a chance and let us know.
@@KaganParlatan YouTube deletes comments with the HF links: Efficient-Large-Model/Sana_1600M_1024px_diffusers/tree/38ebe9b227c30cf6b35f2b7871375e9a28c0ccce/vae (add huggingface dot co in front).
The fastest at making a blank image! RIP ExtraVAELoader.
Damn, the pace of AI development right now is just ridiculous. Something I learned 3-6 months ago is already outdated.
@@dhanang Right
@dhanang Sometimes you wake up another day and a new thing has released.
3-6 Months?? I'd say 3-6 Weeks max 😅
@@maknien 3-6 days 😂
Don't even bother watching videos from a month ago 😂. Honestly, every time I open my YouTube a new major model is out, including 2x video models. New apps, new quants, new great workflows, new ControlNets 😂 I can't download fast enough.
If you get black images, you can get the oldest VAE from the commit history on Hugging Face. There is only one VAE file in that commit, and it works at the moment.
yup that worked for me too, thanks!
@@Pernicuz I downloaded what I thought would have been the oldest version, but without success; still a black image. Could you guys please leave a direct link here? Thanks a lot.
Thanks for the pointer, it worked. This should be the top comment.
Download the earliest VAE model and it works; looks like they broke something in the meantime.
@@kaiserscharrman Tried leaving a link but YT immediately deletes the comment.
The commit number is 38ebe9b227c30cf6b35f2b7871375e9a28c0ccce
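Since links keep getting deleted: a minimal sketch (assuming Python and the huggingface_hub package are available) of pulling just the vae folder pinned to that commit:

    from huggingface_hub import snapshot_download

    # Download only the vae/ folder of the Sana repo, pinned to the commit
    # that reportedly still has the working VAE.
    local_dir = snapshot_download(
        repo_id="Efficient-Large-Model/Sana_1600M_1024px_diffusers",
        revision="38ebe9b227c30cf6b35f2b7871375e9a28c0ccce",
        allow_patterns=["vae/*"],
    )
    print(local_dir)  # copy the vae files from here into your ComfyUI models folder

Pinning the revision is the whole point here: a default download follows the main branch, which is where the later (apparently broken) VAE lives.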
I don't think this is the VAE; the KSampler shows no preview at all, and all the samples I have seen show a KSampler preview. The extra models are simply broken.
Censored and sanitized for my protection?
Is there any news on Sana LoRA training?
I'm just getting a pixelated mess. 😥
I did everything as in the video. I downloaded all the model files. It generates without errors, but the end result is a black square ((
ComfyUI has been updated; it does not help. (
Config: Win 10 / 10400 / 64 GB / 3080 10 GB (latest drivers)
I'm also only getting black images
Same, I just get a grey square.
I got a black/gray image too... not sure what I missed... I did exactly as in the video.
same here
At least I don't feel like the only one hahaha
The error "Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)" appears. How can I fix this?
For your information, I am using an RTX 3080.
@@moviecartoonworld4459 I get the same error. Couldn't find a solution.
I got the same message with the video card in my device, an RTX 3050 4 GB.
How did you solve the problem, please?
Looking forward to a solution; I'm facing the same problem too.
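For context on what this error means (a general PyTorch illustration, not a ComfyUI-specific fix): the model weights and the input tensors are sitting on different devices. A minimal sketch of the failure and the fix:

    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"

    model = torch.nn.Linear(4, 4)   # stand-in for the model weights (on CPU)
    x = torch.randn(1, 4)           # stand-in for the input batch

    # Calling model(x.cuda()) while the model stays on CPU raises exactly this
    # "found at least two devices, cpu and cuda:0" error at addmm.
    model = model.to(device)        # move the weights...
    x = x.to(device)                # ...and the inputs to the same device
    y = model(x)                    # runs cleanly now

In ComfyUI this usually means one node (often a loader or text encoder) is set to CPU while the rest runs on GPU, or a VRAM-saving offload kicked in; checking the device/dtype options on the loader nodes is the first thing to try.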
Competitive Aquarium Design?!!!😱 I didn't know that was a thing!!!.... Oh... Sana is pretty cool too 😁👍
@UnclePapi_2024 Search for it: IAPLC 😉 You might get addicted to this hobby if you like nature, water, animals, and plants.
I am only getting black or grey images.
This error keeps appearing in the console:
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in .
Great video! Thanks! 🙏
Thank you for taking the time to explain it!
Are generated images allowed for commercial use? From what I read in the license files, they are not permitted.
No, the license clearly says "non-commercial use only", so I don't even bother with the black rectangle it generates due to a VAE issue any longer...
How is it at generating text in images?
I have this error:
GemmaLoader
The checkpoint you are trying to load has model type `gemma2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
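This particular message usually just means the Transformers library is too old; support for the `gemma2` architecture landed in transformers 4.42.0. A quick check, assuming you run it with the same Python that ComfyUI uses:

    import sys
    import transformers

    print(transformers.__version__)  # gemma2 needs >= 4.42.0
    # If it's older, upgrade inside the SAME environment ComfyUI runs in:
    print(f"{sys.executable} -m pip install -U transformers")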
Is there a way to have outputs at various CFG levels? I'd think it makes sense to have an array of the different variants.
Nerdy Rodent was required to remove his Hunyuan video since he is in the UK and there are EU and UK license issues with that model.
Is this one video too?
No, image
@@javi22022 ?
How much VRAM & CPU RAM please?
Like others, same issue on Windows 11: GemmaLoader, "No package metadata was found for bitsandbytes".
What about Sana vs Flux Dev? Can Sana win? Has anyone done tests already?
@@robertaopd2182 Flux for quality, Sana for speed.
Hi, Great video! I'm getting the following error: "Tokenizer class GemmaTokenizer does not exist or is not currently imported." Any ideas on a solve would be greatly appreciated.
Followed the video, but it's generating only solid colors: either grey, black, or yellow.
i LOVE that your hobby came through in this video!!!!
I am not just about tech and AI 😄😅
I wonder how inpainting and attention to detail work with Sana. As far as I've seen in this video, Sana mostly fails to correctly represent objects and ignores small details, like skin texture (the woman example). But if more high-quality, FLUX-like results are introduced, it'll be amazing to have such a fast model to play with.
Well, it's a base model and a very small one, so it happens that some types of images don't show detail.
@ Hopefully the infrastructure to fine-tune Sana and use it with other generation pipeline components will be developed soon :))
Getting a grey output image.
Hi, I don't know what happens; it always displays the error "No package metadata was found for bitsandbytes". I installed bitsandbytes and reinstalled torch and CUDA, and the error is still there.
Are you using Windows?
I have the same error
With Windows 11, 4060 Ti 16 GB
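"No package metadata was found for bitsandbytes" typically means bitsandbytes isn't installed in the exact Python environment ComfyUI runs in (the Windows portable build ships its own python_embeded interpreter). A small sketch to check from inside that environment:

    import importlib.metadata
    import sys

    try:
        print("bitsandbytes", importlib.metadata.version("bitsandbytes"))
    except importlib.metadata.PackageNotFoundError:
        # It was installed into a different interpreter; install into this one:
        print(f"{sys.executable} -m pip install bitsandbytes")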
Thank You for this video! :)
Hope you like it 😄
Hello, I'm new to all this, and I'm getting a bit of an error after following your steps: "Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same". I have no idea how to fix it. It seems to be telling me my input data is on the GPU while the model weights are on the CPU, and I'm not sure how to make them the same. Any ideas, or is there a GPU workflow for this? Any help for a newbie is greatly appreciated.
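A small sketch of what this specific message says (a general PyTorch illustration, not the exact ComfyUI node fix): torch.cuda.HalfTensor is the fp16 input on the GPU, torch.HalfTensor is the fp16 weights still on the CPU, so it's the model that needs moving, not the data:

    import torch

    # Convolutions raise exactly this input/weight type message.
    model = torch.nn.Conv2d(3, 8, 3).half()        # fp16 weights, on CPU
    x = torch.randn(1, 3, 16, 16).half().cuda()    # fp16 input, on GPU

    model = model.cuda()                           # move the weights to match
    y = model(x)                                   # mismatch resolved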
Does this work with Mac, considering it's Nvidia?
I got only a solid grey picture...
Same here. Everything is updated, but no image.
Make sure you got the right VAE for the model. I can't post the HF links in the description, since YT doesn't like them.
I listed it all in my blog post: thefuturethinker.org/nvidia-sana-in-comfyui-setup-tutorial-guide/
Getting the same result; did anyone fix it yet? BTW, I downloaded the VAE from the same link as the model.
@@TheFutureThinker Seems several people are having the same issue. It's still happening after verifying the VAE.
I don't know if it is related, but google/gemma-2-2b-it requires access approval to download.
Won't download the fetched files at the beginning... "Fetching 9 files" stops at 0%.
The next step is for some company to launch a common AI model format. For example... using resources from SD 1.5 in Pony, SDXL, or other models.
Thanks, great video. Shame these models probably don't have ControlNets yet?
I was literally looking at NVDA for a long lol. This is awesome.
Haha 😂😂
But this model is still in an early stage. Wait for the fine-tunes.
@@TheFutureThinker This is more exciting than a new car, bro. Cause it's freedom to create. That would include a car design with no limits. I want to make smoke paint. It's possible with AI. Thanks for the vids.
Thank you. Hopefully, we will be moving from Flux to other models.
I got the error "GemmaLoader: Can't use dtype 'torch.float16' with CPU! Set dtype to 'default'." I set it to default, but then I got the error "The checkpoint you are trying to load has model type `gemma2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date." Can you help me?
Maybe a TensorRT version next?
The output was grey photos, even though I followed your install instructions on Windows.
NSFW? I might try this for up-resing Flux fills.
yeah, thanks for the updates!!
Try this out😉
Do LoRAs work with this?
I keep getting this at the KSampler: "'int' object is not subscriptable".
I got the same error when I used the whole number 1... 1.1 works, so it must be some scripting error.
Great video, thank you! I'm getting blank images generated even though everything is set up properly. Any ideas why? Edit: I just noticed everyone else has already reported it... oh! And Nvidia's license is extremely non-user-friendly... shame.
Nice tutorial, but it only generates a solid gray image.
Nvidia Sana in Spanish sounds like "envidia sana": healthy envy.
Excellently explained, like every time! ❤
Thanks
Without finishing the video, how are the hands and realism in general?
Can it do hands well? Also, I guess LoRAs will come to Civitai soon enough.
Yes, I'm currently developing LoRAs for human realism. The Flux process seems hundreds of times longer than this new one from NVIDIA. I'm happy.
@@masterwillian7785 Nice 👍 Please keep us updated on your LoRA for Sana.
@@masterwillian7785 That sounds great. I will watch your video on the LoRA when you release it.
The released model doesn't output 4K images. It's a 1024-pixel model, meaning one million pixels (1024x1024); when you put in 4.00, that's an aspect ratio of 1 to 4, not a 4K resolution.
Check the research paper. And yes, 4.0 is a ratio, but the research paper does mention 4K resolution.
@@TheFutureThinker Yes, they do mention they've tested up to 4K, but in an unreleased version of the model.
I've read comments on other channels as well, and some users are annoyed after downloading and setting it up thinking they're getting 4K images out of the box, and that's not the case.
@@glenyoung1809 This is the most common mistake people make nowadays: see something, zombie brain mode on, then rush to download, without going through the steps or details. That's why many say they get errors or that something doesn't work, etc. In the end, only some are able to use AI and make it work.
And in the previous video, where I focused only on this AI model's research paper and testing the MIT demo page, I did mention the 4K res and its roadmap.
Also, this is a base model; a lot of people have forgotten that point.
@@TheFutureThinker Most people see 4K in the title and that's it; the majority couldn't care less about research papers, only that they can get their hands on the model and start pumping out images.
The problem is the majority of users forget this is the bleeding edge; all of this is experimental, and that comes with being a beta tester (which most users don't realize they are).
There have been reported issues with blank images, and then they conclude this model is crap, when in fact this is all experimental and changing rapidly.
To be fair to users, there was that mess with Stable Diffusion 3.5 Medium, which was crap and too hastily released, and it didn't help that Black Forest Labs released Flux 1.0, which only made SD3.5 Medium look even worse.
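For anyone unsure what the 4.00 setting actually buys you: a rough sketch of the arithmetic, assuming the released model's ~1-megapixel (1024x1024) budget and treating the value as a width-to-height ratio:

    budget = 1024 * 1024                      # pixel budget of the 1024px model
    ratio = 4.0                               # interpreted as width / height

    width = round((budget * ratio) ** 0.5)    # 2048
    height = round(width / ratio)             # 512
    print(width, height)                      # 2048 x 512: still ~1 MP, not 3840x2160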
Thanks, it works.
Did this work for anyone?
I am getting the same error twice, back to back: "Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)".
After 3 hours of googling and reinstalling, yes. But... this model feels like a proof of concept, not the real thing.
Yeah, so you lost me at the search box when you moved the mouse around.
Watched the whole video called "nVidia SANA in ComfyUI - is it worth it?", and at the end they said NOT READY FOR PRIMETIME. So NO, you won't be able to get it to work yet, as of 1/1/25.
Brooo, waiting on an aquascaping channel now :D
I know in your area they have beautiful wood 😁 and good water conditions for fish.
What's left is to make them read our thoughts xD Typing costs too much time haha
Nice video, but please, when you do these kinds of videos, explain how you do things. I was searching for a way to search inside ComfyUI and couldn't find any way to do it, so I was stuck almost at the beginning. Edit: OK, I found a way, and now I don't get how to do the rest, because you don't explain it. So it's not a comprehensive tutorial, as you call it; it's a tutorial for people who already know what to do.
Now we just need a ControlNet for it.
Hope it supports LoRA, ControlNet, and IPAdapter soon.
Haha, I remember our office with a 180 cm planted tank 😂 When will you set up a new layout again?
Maybe in 2025 I'll join the IAPLC again 😁🤫. And I need to buy some new stone. Too bad AI cannot generate that. LOL
It works, but I'm not sure it's better than SD 1.5 or SDXL :(
I have tried Sana, and I prefer Flux by far; something is just not right in the lighting and details.
@@petertremblay3725 Same with Redux and PuLID; Flux still rules.
The question every user wants to ask, and no developer wants to answer:
"Does it do hands?"
GemmaLoader problem
All I get is a blank picture on every run.
AI never sleeps, damn!
😂😂😂 yup
Mindblowing o_O
CPU wahh🤩
@@golddiggerprankz Yup, the text encoder does the same job as T5 but is very lightweight.
Not for generating the image, only for encoding the prompt.
The downside with ComfyUI is not primarily the cluttered UI, but the difficulty of gauging the settings for optimal output. Trial and error takes precious time, and there are no standardized, proven settings for the best results. Opinions are like rear ends.
6:55
It's fast, but the images look like they're from 2023.
As it's about 4 times smaller than Flux, it pretty much can't be better or as variable. As for VRAM, they should publish specialized checkpoints: people, cats, dogs, buildings, landscapes, etc. For sure it's faster. So is SDXL.
Better in quality? Nope. More variance? Nope.
I hate the hype around everything released, just to get clicks.
How about people?
Fastest black-image generator. Somehow the model doesn't do anything.
It is wonderful for creating landscapes, but for creating humans it is very bad, the worst.
Followed exactly, but I get "`.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`."
Updated ComfyUI today and it's fixed, but now I get the black image. Will try to find that old VAE.
But I will let Sana sit for a month while you guys work out the kinks. Too much, too fast. BTW, 16 GB cards will be the minimum for 2025. Buy a new GPU before those orange man tariffs kick in.
This Sana draws people just terribly :(( No miracle happened; I'm staying with Flux Schnell.
Well, what did you expect? It's just a small model of 0.6B parameters, while FLUX is 6B - 11B parameters. Of course the results will be terrible. =)
Censored.
I didn't see anything surprising, it's just a small model, so it requires less resources and produces terrible results.
1000 times slower, and the result is a black frame.
I need a better computer, sadly :/