You're an amazing teacher!! You speak really well. Thank you!!
Not sure what's going on, but I'm following the instructions exactly and I'm generating completely random images with the occasional blank beige image. Any ideas as to why that might be happening?
I really like your style of explanation.
At first I was intimidated by ComfyUI, but you've done such a good job explaining it that I can actually handle this beast of a UI now. So thanks! :)
Great to hear! Thank you so much!
Awesome tutorial! I like that you keep it simple and straight to the point while teaching advanced techniques. Will be sure to check out your next videos.
Thank you!
Just brilliant! Yesterday I spent about two hours, without success, trying to work out the links between nodes in another "tutorial" on YouTube in which the author shows nothing but placed modules, with nothing else visible. Thank you very much for your work on this and your other tutorials, Scott!
Glad it helped!
Duuuude! You're throwing out SDXL + Comfy heavy hitters like there's no tomorrow.
Very inspirational, more please!
Tomorrow I'm biting the bullet and getting it all installed thanks to you.
Welcome to the dark side!
Hello Scott, super awesome video and simple, cool explanations that make this workflow delicious to the eye and the viewer! Thanks a lot for exploring this in a more educative way so one can keep thinking and improvise on the workflow for individual needs! REVITOLOGY explained cool!
This is GENIUS! I have been lerping latents to try to accomplish this sort of thing, but this works so much better! Every one of your videos proves to be super valuable; you are an asset to the community! By the way, your little sidetrack about ComfyUI being intended as a backend would have been a great time to shout out the great work that your co-worker "McMonkey" is doing with Stable Swarm!
So true! I just messaged him today with some questions I had. I will do a video on that for sure!
This is absolutely genius. I was able to combine this workflow with ControlNet, prompting, and some fine-tuning, and I'm finally able to create the image in the pose I want, with the style I want! Thank you so much.
Hello, would you share your workflow with the ControlNet nodes added? I had a bit of a rough time trying to add them.
@@melondezign I've replied to your mail. Greets
Yes, thank you @@marjolein_pas !
Hi, I would appreciate it very much if you could share this workflow with ControlNet. Thank you.
How can I reach you?
@@SamBeera
What exactly do you download to get CLIP Vision to work? What file am I looking for to install? Thanks.
For people who want a nicer UI for ComfyUI, comfybox is pretty nice.
I've pretty much built the basic features of A1111 in it, and then some; I have an HR fix that uses the Tile ControlNet instead, and the workflows use AITemplate wherever feasible for a frankly ridiculous speedup in gen time.
The graph is a mess though. But it works.
Okay, I checked it out totally expecting to roll my eyes, but WOW dude! You have something pretty fantastic here! If you can add in the images as thumbnails for inputs where an imageload is involved, you have an amazing thing here! I left a comment as a FR on your git. Nice work, and I will do a video on this soon!
Is there a Google Colab folder for ComfyBox?
You are amazing, I'm learning so much from you.
I love things that challenge the creative mind!
Indeed!
Great video. Thank you!
Glad you liked it!
Worth a try. Thanks 😊
another great video and easy to follow, nice work!
Glad you liked it!
Thanks so much for your tutorials! I'm stuck at that episode while generating so many beautiful and funny pictures :D
I just got 2 thumbs up - I'm sure I can generate a picture with more than the 10 thumbs up you earn :D
Scott, is there a way to save custom nodes? We change the typical 512x512 parameters to 1024x1024 quite often... I'm sure there are even better examples, but it would be pretty sweet if you could save that as a custom node with 1024 as the default parameters so you don't have to repeat that step every time you work with SDXL.
Yes, there are methods to set defaults. That and some other node housecleaning tips are coming soon!
You can use Nested Node Builder to save them as a new node and then load them when you want, also unnest them if needed. Very convenient.
OK, built this twice and it's not working... not sure if it's the CLIP vision model or the checkpoint model messing it up. I have not changed any settings, but maybe the sampler is the wrench causing problems?! Can anyone post the right ones? Basically trying every combination now = (
This gets the kernels within the images, it's a kind of essence compression
Yes, that is a nice way to say it.
I have an important question. Why doesn't an empty positive prompt do the same thing as a ConditioningZeroOut???
This is really fantastic, thank you for this workflow. In the end, though, I'm missing the integration of ControlNet; I tried in different ways but ended up with a lot of errors... Would you have some hints for adding ControlNet to the process? Thanks again!
Great video. Now I need one to create depth maps so I can generate STL bas-relief 3D print files. Gonna subscribe just in case you decide to go there.
Just found your channel and the ComfyUI playlist, working my way through them now. How can I get a text version out of CLIP Vision to see what it gets out of the images? Many thanks.
Great video Scott! Thanks! In case anyone is listening though: when I set things up like in this video, everything works at first, but then, at some point all my images are just B&W. Any idea as to why this is happening?
Thank you so much! Everything went really smoothly until I deleted the ConditioningZeroOut and typed prompts into the CLIPTextEncode node; since then the images have been broken pixels. Can anyone kindly help me out? Did you encounter this problem too?
Great, thanks for this. Does this only work for SDXL models, though? When trying 1.5 models with this setup, I've had no luck so far getting it to work properly.
How would your node tree look for doing image sequences/batch images in ComfyUI?
Great content !!!!!!!!!!!
Thank you for the video. Where is your Patreon page? It's not in the description.
It is part of YouTube membership; I don't use Patreon. I might switch back, as this is a complete pain, but having it all on one site is nice. ruclips.net/channel/UC9kC4zCxE-i-g4GnB3KhWpAjoin
Awesome video. The file list was PyTorch models, not clip_vision, and I put them in ComfyUI\ComfyUI_windows_portable\ComfyUI\models\clip_vision. I'm still new, so just learning rn.
I know nothing about this stuff, but I'm trying to piece it all together with videos like these. I'm doing everything like you, but I'm getting an error: "Error occurred when executing KSampler:
The size of tensor a (768) must match the size of tensor b (1280) at non-singleton dimension 1". Can anybody help me with that?
Yeah, I got the same error.
@@blacktilebluewall My solution was to install clip_vision_g.safetensors from the Manager and choose it in Load CLIP Vision.
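For anyone hitting these tensor-size errors, here is a rough sketch (my own assumption, not something from the video) for checking which CLIP vision checkpoint you actually have: it prints the shapes of the position-embedding and projection tensors inside the file, so you can compare the model you downloaded against the one linked in the description. The path is just an example from a portable install; adjust it to your setup.

```python
# Rough sketch: print tensor shapes inside a clip_vision checkpoint so two
# downloads can be compared. The path below is an example; adjust as needed.
# Requires: pip install safetensors (and torch for .bin files).
from safetensors import safe_open

path = "ComfyUI/models/clip_vision/clip_vision_g.safetensors"  # example path

with safe_open(path, framework="pt") as f:
    for key in f.keys():
        # These keys show up in the size-mismatch errors quoted in this thread;
        # differing shapes here mean the two files are different architectures.
        if "position_embedding" in key or "projection" in key:
            print(key, tuple(f.get_tensor(key).shape))

# A .bin file (e.g. open_clip_pytorch_model.bin) can be inspected the same way
# with torch.load(path, map_location="cpu") and iterating over the state dict.
```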
I tried it with the exact same workflow, but it is showing a very different image. I don't know how to fix this.
This is good content with a superb explanation technique. Thanks.
Could you use XY grids with those 2 parameters? That would be cool.
Hmm, perhaps! I will have to play with it.
I'm not sure why, but for some reason when I do this it doesn't make any change to my image.
I am getting no result, just a blank solid color. What could be the reason? Attaching a screenshot below, please help.
Very well explained, thank you. By the way, you didn't use the special SDXLClipTextEncode, so it's not mandatory if you don't want to use the refiner?
You don't get the benefit of the full clip conditioning or the 2 clip models, but it's fine for demonstration.
Man, I wish I could get this to work like yours. Mine just produces garbage. I gave it an angel and a stained glass window and it produces a very pretty sunset. Ordinarily I get really good images, but reading another image seems to be too much for it.
Are you using an XL checkpoint? You might get garbage if you're trying this with 1.5
Hey Scott, thanks for this. I'm having an error: "Error occurred when executing KSampler:
The size of tensor a (1024) must match the size of tensor b (1280) at non-singleton dimension 1".
I did use an Image Resize node after my two Load Image nodes to make sure they are both 1024x1024, but no dice. What do you think that could be?
could you resolve this?
Hi, thanks for the awesome content. What is Save (API Format)? How does it work? May I have a link? 🙂
that is coming soon!
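Since a few people asked about Save (API Format), here is a minimal sketch of how I assume the saved file gets used, based on ComfyUI's HTTP API rather than anything shown in the video. The filename and host/port are examples; adjust them for your install.

```python
# Minimal sketch: queue a workflow saved with "Save (API Format)" against a
# locally running ComfyUI instance. Filename and address are examples.
# Requires: pip install requests
import json
import requests

with open("workflow_api.json") as f:   # the file written by Save (API Format)
    prompt = json.load(f)

# ComfyUI listens on 127.0.0.1:8188 by default; POST /prompt queues the graph.
resp = requests.post("http://127.0.0.1:8188/prompt", json={"prompt": prompt})
print(resp.json())  # returns a prompt_id you can use to check progress
```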
Hello, let me ask you a question.
The prompt doesn't seem to have any effect on this ReVision operation; is there any way the prompt can be involved in the image combination?
Not using this method. However, IPAdapter might be the tool you want. I did a video on that recently as well.
@@sedetweiler thank you so much. I love your voice.
Aww, thank you!
Is it possible to turn the conditioning back into a prompt?
Interesting question! I will have to see what is involved.
Is there a compelling reason to grab openclip and the clipvision-g model?
Yeah, we will use them anytime we want to "query" an image we are uploading into the workflow and using it as a prompt.
@@sedetweiler sorry I meant is there a reason to have both rather than just one?
Would you know where I can download the safetensors version of the CLIP-ViT-bigG-14-laion2B-39B-b160k model, like the one you have?
Also, is there a node that can output the actual caption generated by CLIP?
I don't believe there is a safetensors version.
@@sedetweiler really? it says clip_vision_g.safetensors in your node
Is it normal to get the error "RuntimeError: The size of tensor a (1024) must match the size of tensor b (1280) at non-singleton dimension 1" when using another clip vision model?
You need to resize the source clip
@@urekmazino2086 What is that?
Does anyone know a solution similar to this for SD1.5? Thanks. *cries in NVidia 1050*
Weird, started from scratch twice now, triple checked everything & yet I'm not getting the same sort of results. My images are generating, but they don't seem to be pulling from or blending the source images, just making something completely new. I'll keep tinkering...
I have the same issue. It doesn't use the source images, just the model and prompts.
It's been a minute since I looked at that workflow, but I know I figured it out eventually @@falk.e :D I'll have a look shortly and let you know if I can remember just what it was I did to solve it. I've learned a lot fiddling with it over the last week or so :)
Thanks for the lesson... Why not just save and share it as a workflow?
I did, but that is in the member posts for sponsors. Gotta feed those that help feed the family! ;-)
I have it set up exactly and I get garbage. I don't know what I'm doing wrong.
Where can I download the photos so I can load the graphs in Comfy?
They are posts on YouTube for channel members at the Sponsor level or higher.
Big like, thanks 🙂
Thanks for the visit
How do you use the Clip model? I downloaded it and it's just a bunch of data. Can't figure out what I'm supposed to do with the files. (Not really a programmer and new to this, btw)
CLIP was trained to recognize the relationship between photos and words (on a very basic level). You will never look at these files directly.
@@sedetweiler Right, but I don't have a single file in the download. It's just a folder with tons of files labeled "data". The other ones (the SDXL files) are just a single file with a .safetensors extension.
edit: I figured out what the issue was. I extracted the files from the .bin file instead of just leaving it alone... If I just paste the .bin file directly into the proper folder, it works fine. Hopefully this might be helpful for someone who makes the same mistake in the future!
Where is your CLIP vision model located? I'm in models/clip_vision and having no luck.
Got it working; it needs to be in the extension's models/clip_vision folder, not the main webui models folder, if anyone else has the same problem.
Just getting some random junk like most people in the comments say. Also, the CLIP file in the video is different from the one in the link. Maybe that is the problem, or I am doing something wrong, no idea. Anyway, it's a great video. Thanks for sharing.
So good ❤
Thanks!
Hi, could you please add the node maps for your videos 🥺
They are added for the Sponsors of the channel in the youtube posts.
Amazing and simple, thank you very much. Please make a tutorial on how I can take the output and put it in a sequence of images so I can animate them. 🥇🥇
Ah, like Photoshop AI. Hope my PC can handle it, at least with SD 1.5. Amazing tut like always.
Thank ya!
Unfortunately I'm getting an out-of-memory error because of that CLIP vision file :(
You can download a smaller version of it, but I am not sure it is official.
you could probably use the negative prompt to get the opposite of an image :D
Always worth a try!
Really great stuff Scott. A couple of questions for you. Is there a university level course or program (with exams etc) that allows some sort of certification for this stuff? I really want to switch to this full time but I think some formal training would be useful. And I guess that leads to a second question; where do you look for jobs in this field?
Naa, this is all new to the world! It's a bit bleeding edge for them.
@@sedetweiler lol, ok, thanks
You mentioned you use Revision, but nowhere in the video do I see Revision.
I get errors when using "clip_vision_g.safetensors", and "open_clip_pytorch_model.bin" generates mostly random junk, only picking up similar colors from the source images.
Not sure what I'm doing wrong; same results when using revision-image_mixing_example.json from HuggingFace.
Same here, only the manual prompt works. Also wondering what the difference between those two files is, considering the huge size difference.
Use the link to get the correct clip from 🤗. You probably have the wrong clip model.
I had the same issue with random junk. It's not the clip vision model, you need to update your version of ComfyUI
Yep! Thanks @@JLocust
@@JLocust I had the same issue; clip_vision_g loading errors or just outputting random nonsense images. Updating ComfyUI fixed it, thanks.
I'm sorry I'm completely lost... which of those files under the CLIP link do I need to download and where do I put them?
It's the link in the description for the clip model. Put it in the clip_vision model folder.
@@sedetweiler Yes, I've seen the link... but all files? I usually just download checkpoints, there I always just download a single file. So all bin files in this case?
@@gordonbrinkmann I'm wondering the same thing...
UPDATE... just put the whole bin file in the models/clip_vision folder, that worked here. When launching ComfyUI after that it loaded a lot of stuff. After re-creating the workflow Scott first shows us, the Queue Prompt button did not work. I saved the workflow and closed out of the browser and the CMD window and came back in and it works fab!
@@sedetweiler In the video you used the model "clip_vision_g.safetensors", though the description link points to "open_clip_pytorch_model.bin". I found the first one myself, and both seem to work. Is there a reason for the change?
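For anyone still unsure which file to grab: here is a small sketch of how the bin file could be fetched straight into the clip_vision folder with huggingface_hub. The repo id and filename match what is being discussed above, but double-check them against the link in the video description; the local_dir is an example path.

```python
# Sketch: download the CLIP vision file straight into ComfyUI's clip_vision
# folder. Repo id / filename taken from the discussion above; verify against
# the description link. local_dir is an example path, adjust to your install.
# Requires: pip install huggingface_hub
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="laion/CLIP-ViT-bigG-14-laion2B-39B-b160k",
    filename="open_clip_pytorch_model.bin",
    local_dir="ComfyUI/models/clip_vision",
)
```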
👋
:-)
This is random af for me and has NOTHING to do with the input images, so it feels really lame to use. Setup is like yours. It feels like XL is JUST for making the generic AI art that you see everywhere now.
*oh, but thank you for the tutorial and awesome video + pacing! subbed
** even the included workflows don't work at all. I start with something like a picture of a human and end up with a duck in a pond that has 0 connection to the initial pic.
*** it's the Fooocus KSampler that was breaking everything - a custom node that replaces certain functions even if it is not used.. wow
I'm also getting extremely random images from this. Exactly the same setup, completely updated on all fronts... just random stuff like a dog or cars when putting in portraits of people, and nothing close to the style of the images. Did you figure this out, or did you give up?
I would focus on the strength. Working closer to 0.6 and 0.7 seems to work better than aggressive values in there.
@@sedetweiler It had nothing to do with that. For some reason the Fooocus custom node doesn't mix with this, even if you're not using the node in the workflow. It has to be completely uninstalled for this to work.
مشتبا قطش کو
You talk too fast and click too fast, bad tutorial.
😴
Didn't like it?
Doesn't work at all.
Sadly I got an error message instead 😂 The error message said something about a size mismatch for vision_model.embeddings.position_embedding.weight, and it is very long too.
Where do you get "ConditioningZeroOut"?
It should be in the standard nodes. I search for it
@@sedetweiler I don't have one. Moreover, there is an Image Resize node that I also had at a certain point in time, but then it disappeared somewhere and I can't figure out how to get it back... Can you make a lesson on how to restore standard nodes?
I would just do a reinstall. That's probably a good idea if things are wacky.