I kind of got stuck here... I managed to install the IP Adapter models, but it wasn't very straightforward (and I may have overwritten the 1.5 image encoder folder with the XL image encoder one)... but then I got completely lost trying to get the CLIP Vision stuff. I know it can be tedious for the more advanced users, but I wish you had briefly touched on installing these.
Yup he totally breezed over that! I'm stuck on the install myself now. Last time I tried installing things and didn't really understand what I was doing, I borked ComfyUI totally and had to rebuild from scratch.
Yeah, it fucked me too, and the git links aren't user friendly. He shows us how to do the workflow but not how to install the base models.
Thanks for showing the actual process of setting all of this up, rather than just popping in the workflow! 😊
Glad it was helpful! I don't enjoy videos where they just graph surf, as you know they are just guessing. Cheers!
@@sedetweiler Weeellll, looking twice at the graph sometimes helps... now I understood... stupid me... 😄
Wow, explained everything in easy to follow way. Thankyou for sharing.
Glad you enjoyed it and thank you for the support!
Hi, I'd love to get the IPA Apply module but it doesn't show up in the search. Has it been deprecated? And if so, what can I use instead, please? Thank you... great video
So I used the IPAdapter, which requires the IPAdapter Unified Loader; no CLIP Vision is needed in your workflow (I believe the unified loader takes care of this). The other option is LoadImage -> IPAdapter Encoder -> IPAdapter Embeds, with an IPAdapter Unified Loader feeding the IPAdapter Encoder and IPAdapter Embeds. The bonus to this second workflow is that if you want to add another image, you just use another IPAdapter Encoder and combine them with an IPAdapter Combiner, and you can choose how much each image is used. One caveat is that I have found the IPAdapter does not work with all models; some of them will give an error at the sampler, "Expected query, key, and value to have the same dtype, but got query.dtype: struct c10::Half key.dtype: float and value.dtype: float instead". I am sure there is a workaround for this, but I have not researched it yet as I am just starting to play around with ComfyUI.
According to the troubleshooting guide on the IPAdapter GitHub page: "Can't find the IPAdapterApply node anymore
The IPAdapter Apply node is now replaced by IPAdapter Advanced. It's a drop in replacement, remove the old one and reconnect the pipelines to the new one."
According to the troubleshooting guide on the IPAdapter GitHub page:
"▶ Dtype mismatch
If you get errors like:
Expected query, key, and value to have the same dtype, but got query.dtype: struct c10::Half key.dtype: float and value.dtype: float instead.
Run ComfyUI with --force-fp16"
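For anyone wondering what that error actually means: it comes from PyTorch's attention call refusing mixed-precision inputs. A minimal plain-PyTorch sketch (not ComfyUI code, and the tensor shapes are made up) of the mismatch and the cast that running everything in fp16 effectively enforces:

```python
import torch
import torch.nn.functional as F

# Query in half precision, key/value in full precision: the same mix the error complains about.
q = torch.randn(1, 8, 77, 64, dtype=torch.float16)
k = torch.randn(1, 8, 77, 64, dtype=torch.float32)
v = torch.randn(1, 8, 77, 64, dtype=torch.float32)

# F.scaled_dot_product_attention(q, k, v)  # raises the "same dtype" RuntimeError

# Casting everything to one dtype (which an fp16-only pipeline guarantees) avoids it:
out = F.scaled_dot_product_attention(q, k.to(q.dtype), v.to(q.dtype))
print(out.dtype)  # torch.float16
```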
So glad you put this workflow together, I was trying for something similar myself but this is more efficient. Getting great results and the experimentation has just begun.
One small thing: for newbs the prerequisites are daunting, I think; simple things you take for granted catch us out, like saying you need a CLIP model and expecting us to know where to find one and then where to put it. ☺ A link and a sentence like "find it here and place it there" would polish the information. Cheers
There should be a linky for the CLIP model in the description. Hmm, did I miss it?
I have IPADAPTER and PLUS installed, yet I don't have those file options you have. And I don't have any CLIP files. Not sure where I missed getting those. Is there a link somewhere? A manager search I'm supposed to do, or what did I miss?
@@DerekShenk yes, I arrived at the same (or very similar) problem. I was searching for that "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"... in the files section of that repo at Hugging Face I found an "open_clip_pytorch_model.bin". I wonder if this is the same?
@@DerekShenk we have the link to the Hugging Face page for CLIP-ViT-H-14-laion2B-s32B-b79K, but I can't figure out where to download the .bin / or how to combine all the files into a bin. What do you think?
Wow, you really explain just like the pros. Well, you are a pro.
Oh dear. I am getting an error about size when I run this: Error occurred when executing IPAdapterApply:
Error(s) in loading state_dict for Resampler:
size mismatch for proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1664]).
Any idea what I have done wrong? thanks
Well, I think I broke the whole thing and am starting again.
Have you solved this? I get the same error
Yes @@MrPer4illo, it's got to do with matching the Load IPAdapter model and the Load CLIP Vision model; it takes some experimenting to get them to match. I find 1.5 matches are the easiest, not SDXL. I can get SDXL to work for still images but not for moving ones.
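If it helps with debugging: the 1280 vs. 1664 numbers in that error are the CLIP vision hidden sizes (ViT-H uses 1280, the bigger ViT-bigG uses 1664), so the adapter and the CLIP vision model you picked simply disagree. A rough sketch of how you could peek at what an adapter checkpoint expects; the path is hypothetical, and it assumes the usual layout where the .bin stores an "image_proj" sub-dict:

```python
import torch

# Hypothetical path: point this at whichever IPAdapter file you actually downloaded.
adapter_path = "ComfyUI/models/ipadapter/ip-adapter-plus_sd15.bin"

sd = torch.load(adapter_path, map_location="cpu")
image_proj = sd.get("image_proj", sd)  # "plus" checkpoints usually keep the Resampler weights here

if "proj_in.weight" in image_proj:
    # The second dimension is the CLIP vision hidden size the adapter expects:
    # 1280 pairs with ViT-H, 1664 pairs with ViT-bigG.
    print("adapter expects CLIP hidden size:", image_proj["proj_in.weight"].shape[1])
else:
    print("no Resampler found (probably a non-plus adapter); top-level keys:", list(sd.keys()))
```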
You really nail it when it comes to explaining stuff - way better than others out there! Plus, your workflow node order makes it super easy to follow along and get a clear picture of how things go down step by step. Keep up the awesome work!....... and "Yes" I have the same OCD :)))))
Dude, thank you so much! I often wonder if I am going too deep on these, but then I see others who are surfing a graph and you can tell they don't really get it. But there are others that go WAY deeper than I do, so I'm trying to find balance. Cheers my OCD friend! :-)
@@sedetweiler there are graph surfers without OCD, and that makes me cry: they just put all the nodes next to each other with no visibility of connections and logic, and I have a hard time following them even though I have good knowledge. And there are very few who build workflows like a Dostoyevsky novel :)
I think most of those other people just take the example graph and do an overview, which I also don't find satisfying. I use comfy because I want to know how things work, not because it is a sexy interface. :-)
I so agree with this! huge thank you!
Thank you for awesome video!
You are quite welcome!
I’m currently making the transition from Invokeai because I wanted more control and these videos have been very informative! Ty!
Great to hear!
One Cool Way to use it is to control your LoRa Results Better, in essence you help your LoRa adapt to exactly what you really want to see instead of rolling the dice, especially considering you can control it with both controlNET AND prompting and even control how much it will consider either etc etc.
Indeed! So many amazing things coming up, and all of these great nodes really help us expand our capabilities!
I wonder if there's potential to use this as part of a loop and feed back a generated image with the qualities of a desired result. It gets the context of the lora and some of the prompting you'd use.
Great tutorial choom. Helped me a lot.
Glad it helped!
This series has been great. Finally all caught up! Can't wait for more. Thanks!
Glad you like them!
Great video. Unfortunately you lost me from 15:00 on, because after the cut the noise image was suddenly there, but the other image was not. That was a bit confusing and I still don't understand why you don't need the other image for the workflow. But maybe I haven't yet understood something essential 🙂
Oh, that was an unintentional cut! I didn't notice until you pointed it out. All I did was load in a texture I had and used ImageBatch to combine them.
Fascinating and inspiring! Thank you! In your videos on upscaling, you didn't get into the tiling parameters. I'd love to see a follow up video that covers them. Thanks again!
Great suggestion!
Thanks as always your content is always 10/10
I appreciate that!
Perfect video. I didn't understand where to get the CLIP Vision model. Hugging Face shows me open_clip_pytorch_model.bin, and that's not the ViT model you have in the video. Any ideas?
That's the one.
However, when I put that open_clip_pytorch_model.bin in the clip_vision folder in Comfy, I don't get CLIP-ViT-H-14-laion2B-s32B-b79K but rather the PyTorch one, and indeed the flow doesn't work as intended @@sedetweiler
Also, where did you get the CLIP model? Sorry, just starting :)
Hi, thanks for the video! I'm not sure where to grab the clip vision model "Clip-vit-h-14-laion2b-s32b-b79k". How can I install it?
You explain super clear thank you
Thanks for the video, Scott. If you could provide an update on how to get both the models for the "Load IP Adapter Model" node and the "Load CLIP Vision" node, that would be perfect 🙂
Those should be linked on the official page. I prefer not to give a direct link outside of the one in the video description in case they change it and mess things up. All of the models are available from links on that page, as that is where I collected them. github.com/cubiq/ComfyUI_IPAdapter_plus and github.com/tencent-ailab/IP-Adapter/
@@sedetweiler It would be helpful for others if you gave the current model name that you renamed and a warning that it will change soon. Many don't know what it is exactly, as there are several files listed
I love this episode. I have already started using these image combining techniques learned here for a number of different purposes. It is very powerful because you can use real or generated people with real or generated backgrounds. It pretty much allows you to place anyone anywhere.
Glad it was helpful!
Many thanks for your tutorial. May I ask why you use Canny instead of OpenPose? What's the difference here? @@sedetweiler
It would be kind if you could link the best way to upscale after using a LoRA...
I have 2 different upscale videos, so you have about 890 different options depending on what you want. There is really no limit.
helpfully thorough as usual. many thanks
My pleasure!
That is a fantastic introduction to IP Adapter! Thank you for the ideas!
My pleasure!
Thanks for the great videos, I really like your bottom-up approach for explaining things and keeping it simple. I have a question on this technique:
Can the following be done with this approach (just want to know it's possible before I dive into it):
Generate an image of a living room with a sofa -> mask the sofa -> apply a ControlNet to the sofa (to get its shape) -> take an external image of another sofa (perhaps from another angle) -> use IP-Adapter together with the ControlNet to inpaint the new sofa at another angle in the living room.
Or have I misunderstood anything?
You would get something like the new sofa, but it might not be exactly the same thing. Give it a whirl! Thanks for the kind words as well! 🥂
I love this workflow! :)
I do as well!
You shared a really good tip in this. Thank you! Can’t wait to try it.
That's what's so great about it. There were a bunch in here, depending on where you're at right now learning the stuff.
You are so welcome!
I will keep them coming!
Hi Scott. Fantastic videos. Shot in the dark here. I'm trying to combine a video with an image, so the video (my daughter looking around) is in a 1950s Halifax high street (background). Do you happen to have any videos on that? Thanks so much. Dave
You could try using an IP adapter with a video workflow and that should help.
Thanks Scott. There are a gazillion settings. Let me check that one!
@@sedetweiler
And do I plug that into the ControlNet encoder? It's the only one I can see that says 'image'.
@@sedetweiler Hi Scott. So, with your help I was going great guns mixing a video with still images. And then the IPAdapter got updated and now my old nodes don't work. I don't suppose you're planning an IPAdapter update? All the ones on YouTube are image to image :-( Cheers Scott! Dave
How do I get ControlNets? My "LoadControlNetModule" shows nothing...
Be sure you load in the pre-processors. It should all install when you use the manager.
@@sedetweiler looks like it no longer does install with the manager. Also, the version of the package your description links to has no models in it either.
Actually, looks like I accidentally mixed up IPAdapter models and ControlNet ones. The former is the one that has no models at this point; the ControlNet one seems fine, sorry for the confusion.
(for the IPAdapter ones, I did find a bunch on huggingface under h94/IP-Adapter, some of which seem to at least be named the same as the ones you had in your dropdown in the video)
Thanks for the tutorial, this looks so cool and is what got me to try AI art! However I'd like to ask if it's possible to use an image with a/multiple LoRA (for example, a style LoRA) to generate a similar image in ComfyUi? I've googled but not many relevant results came up.
Yup! Just string them together
This was a great video and intro to IP Adapters, thanks so much! I do have one question after experimenting with it. In your setup with 3 images plus a positive prompt plus ControlNet Canny, what is the relation between the 3 images in how they affect the final result? Is there a way to weight each image to set a level of strength per image? And as you have it, does one image have more strength than the others?
The new version of the node allows weight adjustment, but in this example they are equal.
@@sedetweiler which node? The image combiner?
I believe the batch. I will have to check as I had heard about it but have yet to see it myself.
Where can I find the workflow that you made in this video? Thank you.
Nice. I wonder if the order in which you batch source images has an impact? ((A+B) + C) vs (A + (B + C)), for example.
Not that I have noticed.
I had the same thought before seeing this comment and tried various combinations to see if it would affect the result, and much like Scott replied I found very little change.
Thanks for everything. I just have to say I can't load "ip-adapter-etc..."; there is no list in my "Load IPAdapter Model" node. Can you help me please? Thanks for reading.
Would combining images in this way change the weight of images based on how they're grouped? e.g. image A & B are combined, then that combined node is combined with image C. Would Image A + B = 50% and C= 50%?
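Purely as an illustration of the arithmetic behind that question, and assuming each combine step simply averages its two inputs (which the thread doesn't confirm is what the node actually does):

```python
# Toy bookkeeping, not real embeddings: each "combine" step is modeled as a plain average.
def combine(x: dict, y: dict) -> dict:
    keys = set(x) | set(y)
    return {k: (x.get(k, 0.0) + y.get(k, 0.0)) / 2 for k in keys}

A, B, C = {"A": 1.0}, {"B": 1.0}, {"C": 1.0}
print(combine(combine(A, B), C))  # A and B end up at 0.25 each, C at 0.5
```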
Can we use OpenPose with this workflow?
If this isn't the right place to ask this, please advise. I've loved emulating/copying the workflows, but I have been stuck for a few days trying to get any sort of faceswap procedure working. I can't get the Reactor node to import successfully; it always gives me errors no matter which Windows machine at my home I try. I installed Visual Studio tools, uninstalled and reinstalled Python, and tried putting things on PATH. Still dead in the water. Anyone know where I could get help to solve this?
I would ask the developer for help via github.
Wondering how to install cuDNN in ComfyUI. It speeds up generation in Automatic, but I don't know how to do it in Comfy.
This looks great! Goodbye Automatic1111
I still use AUTO1111 from time to time, but it is getting to be longer and longer between visits.
I'm just starting your ComfyUI series. Can you tell me if masks like ControlNet's inpaint mask feature work in ComfyUI? @@sedetweiler
I had to install the SDXL Empty Latent image as a custom node (from Shingo), is that the right way to do it? It seemed like you had it as built into ComfyUI somehow?
Hey Scott this is amazing once I got it all working. Where can I send you some examples?
How do I get the Batch Images node?
Can you get consistent enough characters/images to use for animation? For example, use the same inputs, but just change the controlnet for each frame of an animation.
I don't see why not! That face model would probably work, and animation is just single images, 24 per second on good days. :-)
It would be interesting if IP Adapter could be tuned in a way to totally ignore subject matter while maintaining the style of a picture. And vice versa.
As it is, the only way to really get the style to come through is to set it to a high weight, but this usually ends up bringing the subject matter with it too.
I would probably just use negative prompts to mitigate whatever you needed to deflect.
does adding more images to Image Prompt Adapter = more VRAM use or more RAM use?
My Apply IPAdapter node is different; it has a mask input. I have downloaded the IPAdapter models and the CLIP Vision ones, but they always give me an error, and even the supplied example gives an error. Apparently the error is in the model connection, because if I leave that connection out it creates the image normally.
Is there any way you could share the ComfyUI folder as a ZIP, already with the included files (at least the minimum you use in the example), and upload it somewhere like Mediafire or 1fher? I was trying with several videos, and none of them work, which is disappointing, but I always see names changed from the originals, so it is very difficult to find what fails. I even did a clean installation of ComfyUI, but no luck.
There are so many versions of all the nodes that even downloading from the same links doesn't give the same thing. Both I and others would appreciate it very much if there were a chance to share the ComfyUI folder, and it is not out of laziness: I have spent 2 days downloading the models, the gits, etc., on top of my internet being a bit slow, which makes it worse. It's one thing to download a pack knowing what it is, and another to download all day and get nowhere.
I know that even copying the custom_nodes and clip_vision folders would be a few gigabytes, but at least the models you are using (perhaps mine are different?). I would likewise appreciate an image with the workflow or the JSON. I don't know if it happens to everyone, but it is disappointing. Greetings and thanks for your time, I will keep trying to find where the error is.
Nice, I tried it out but I keep getting the error "module 'torchvision.transforms' has no attribute 'ElasticTransform'" when reaching the Encode IPAdapter Image node (and I know nothing about torch...) :( Any idea how to solve it?
Hi, in Automatic the .bin was renamed to .pth to be recognized by the web UI. Is that not necessary in ComfyUI?
I am not sure. I have not run into that issue yet.
Huge thank you!!!
I missed it right at the end... where you add the noise, the sci-fi image is not on the left as an input, so how did it affect that?
Error(s) in loading state_dict for Resampler:
size mismatch for proj_in.weight: copying a param with shape torch.Size([768, 1280]) from checkpoint, the shape in current model is torch.Size([768, 1664])
My "Load IPAdapter Model" list is empty.... Any idea why?
You will need to download them from the link on the git page for the node suite.
Have you figured it out? Having the same issue here... I installed everything from the git but the list is still empty.
Does anyone have advice on why the IPAdapter returns "failed install"? Thanks
you should post this on the git for the node.
@@sedetweiler Thanks Scott, will do
Are there nodes for dropping in a scanned image from a camera, or can I only use a .png created from a previous SDXL render?
You can use a png or jpg from any source with the load image node.
Thanks! I thought it would be about making a 1-image LoRA of a person so that the face is consistent...
I really want to convert my photos into a painting and still be able to recognize the person in the portrait. I've tried Canny and that helps, but is there a way to use one 1024 image for the person and another 1024 cropped in close-up for the face, so it looks like the person? Or something like that... I've been trying all sorts of things but it never looks like them.
There are a few ways to do that. The easiest is roop, then LoRA, and then this method.
How do I install the Load IPAdapter Model? I'm new to this and can't seem to figure out how to install it, and also the Load CLIP =(
Mine freezes at the sampler node every time I try to run it. I tried 1.5 version as well and same thing
nice guide!
Thank you!
Great Tut - thanks!
Does anybody know how to generate 360, panoramic images in ComfyUI?
Not that I know of
I take it you know how to do this with Auto1111, but I'll mention it anyway: there is an extension called Asymmetric Tiling, and a LoRA for 360 images. Use both at the same time.
In ComfyUI it's the same: download the custom node called Asymmetric Tiling and load it at the same time as you load the 360 LoRA.
I haven't tried it yet in ComfyUI... but there's no reason for it not to work ^^.
Can anyone tell me where the CLIP Vision models are downloaded from?
Does your input image have to match the resolution chosen in SDXL Empty Latent Image? I keep getting:
Error occurred when executing IPAdapterApply:
Error(s) in loading state_dict for ImageProjModel:
size mismatch for proj.weight: copying a param with shape torch.Size([8192, 1024]) from checkpoint, the shape in current model is torch.Size([8192, 1280]).
Usually you want the IPAdapter input to be square, but that is just the ideal and it should not be throwing an error. I wonder if there is something else going on.
Not sure where my other comment went, so hopefully I'm not posting this twice! I'm unable to find the clipvision model CLIP-ViT-H-14-laion2B-s32B-b79K. I found a Hugging Face page for it but there's no .bin listed in the files. I'm open to being stupid and completely missing where it's located, but if someone could point me in the right direction I'd appreciate it! Awesome guide btw, I enjoy your videos a lot!
huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K/tree/main and just rename it so you remember it. There are a lot of different naming conventions, but I am lazy and just use the name on the repo. You will know if it is the wrong one, as it will error right away. Cheers!
I'm not sure what the correct file is either but I grabbed the "open_clip_pytorch_model.bin" from that hugging face page and renamed it and it seems to work fine.
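In case it saves someone a few clicks, here is a small sketch of that download-and-rename step, assuming the standard ComfyUI folder layout (ComfyUI/models/clip_vision); adjust the paths to your install:

```python
import shutil
from pathlib import Path

from huggingface_hub import hf_hub_download

# Grab the encoder mentioned above (it lands in the local Hugging Face cache first).
downloaded = hf_hub_download(
    repo_id="laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
    filename="open_clip_pytorch_model.bin",
)

# Copy it into ComfyUI's clip_vision folder under a name you will recognize in the dropdown.
target = Path("ComfyUI/models/clip_vision/CLIP-ViT-H-14-laion2B-s32B-b79K.bin")
target.parent.mkdir(parents=True, exist_ok=True)
shutil.copyfile(downloaded, target)
print("saved to", target)
```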
awesome video
Thank you!
The SDXL Empty Latent Image node doesn't come up on the list
That is a custom node in the ComfyMath node suite. You can tell which nodes come from expansions by the little badge over the node.
Hi Scott, I have another question I need to bother you with regarding the IPAdapter and ControlNets, hope it's ok :)
I have an issue where I have an image of a white sofa from the front, and I want to render this sofa at the same angle as an image of a black sofa from the side. I apply lineart on the black sofa and use IPAdapter, and I have tried different weights and also using a depth map instead of lineart, but the issue is that I get too much of the characteristics of the black sofa applied to the white sofa (the angle is correct though, so that is good). Do you have any advice on how I can render the white sofa at the same angle as the black sofa, but still have the characteristics of the white sofa preserved?
Oh, that's nice. I don't have a Manager button on my version... and I thought I had the latest version, too. 🙄
That is a customization. civitai.com/models/71980/comfyui-manager
Is that _SDXL Empty Latent Image_ node part of some custom pack?
EDIT: wow, my attention span sucks I guess :D you mentioned it right after mentioning the main custom nodes pack :D (so, it's shingo1228 version)
Yup, but I have used others in the past as well. I think this one is nice because it has the drop down. Cheers!
Thanks for explaining what things are doing.
Most 'tutorials' seem to only read out the name of the node input without explaining what it's doing - gets frustrating
Glad you enjoyed it! I wanted to try and make it approachable, as you are totally right in that it is easy to look like an expert in this field, as long as someone doesn't have to explain things.
IPA like the beer!!! 😁
🍺
Interesting. It would be great to have some kind of batch image folder that ComfyUI would look in, instead of manually batching each loaded image.
You can! The great thing about nodes is you can use them in creative ways. The primitive node for example can select random files.
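If anyone wants to pre-batch a whole folder themselves, here is a rough sketch of what an image batch is in ComfyUI terms, a float tensor shaped [batch, height, width, channels] in 0-1; the folder name is made up, and it assumes every image shares the same resolution:

```python
from pathlib import Path

import numpy as np
import torch
from PIL import Image

def load_folder_as_batch(folder: str) -> torch.Tensor:
    """Stack every PNG in a folder into a [batch, height, width, channels] float tensor in 0-1."""
    frames = []
    for path in sorted(Path(folder).glob("*.png")):
        img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
        frames.append(torch.from_numpy(img))
    return torch.stack(frames, dim=0)

batch = load_folder_as_batch("input/refs")  # hypothetical folder name
print(batch.shape)
```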
5:50 wrong. ViT-H is the small SD1.5 vision model. ViT-G is the large SDXL model. But that isn't even the biggest mistake. The biggest mistake is that the IPAdapter PLUS versions are the large "lots of tokens" models. The non-plus versions are the small "few tokens" models.
Anyway, please prepare a workflow before starting recording. Seeing 5 minutes of fiddling and moving nodes around was painful.
You also use the wrong way of including many images in IP-Adapter. You are supposed to use the IP-Adapter encoder, not batching. With the correct encoder, you can set the weights for every individual image, so 50% space station, 100% Victorian, for example. You can however batch many images into the encoder slots if you want to utilize the 4 different weight slots for many different images.
All that being said, thanks for making videos anyway, I will keep watching now and see if I learn something new. 😁
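To make the per-image weighting idea concrete, a toy sketch with invented shapes; the real encoder may combine the scaled embeddings differently (e.g. concatenation vs. averaging):

```python
import torch

# Pretend these are CLIP vision embeddings for two reference images (shapes invented).
space_station = torch.randn(1, 16, 1280)
victorian = torch.randn(1, 16, 1280)

# Per-image weights, as in the 50% space station / 100% Victorian example above.
scaled = [0.5 * space_station, 1.0 * victorian]

combined = torch.cat(scaled, dim=1)    # one option: keep all weighted tokens
averaged = sum(scaled) / len(scaled)   # another option: average them
print(combined.shape, averaged.shape)
```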
I can't see the manager in my ComfyUI
You need to install it. It is a custom node. I have a video on that as well.
... looking for Image Batch Node ... ?
I believe it is in the Impact Pack. However, there is an update for this as of a day or two ago, and now there is a new node to do some of this. I will get that new graph out for you folks today.
There are so many different nodes that you use that are not native. It is hard to follow your guides not knowing where they come from. Installing some of the packages breaks ComfyUI and requires a reinstall, so it is very confusing and time-consuming to follow. Could you please make a definitive guide on what to install and what is safe (in terms of not breaking Comfy)?
I think I showed at the start the 2 you needed. I also showed the ones I have loaded. Comfy isn't meant to be one gigantic application, it is small pieces and you load what you need so I tried to clearly show what was needed.
@@sedetweiler my bad. I forgot to restart Comfy, that's why I was missing nodes. Sorry and thank you!
You should always have a backup copy of your ComfyUI
@@Taim_allah how?
Dude, I can't find the nodes you're using. Apply IP adapter is nowhere to be found.
You using the manager? Be sure all of those are updated, as the manager depends on a database file, and if you have an old one, then new tech will not be present.
If updating all in the manager updates the manager, then everything is updated. Still can't find your handy nodes. @@sedetweiler
Hey sir, I was curious about whether I should join colab pro or do it on my laptop.
Laptop Specs:
RTX 3070ti 8GB VRAM
RAM: 16GB, 4800 MHZ
CPU: i9-1200H 14 cores, 2.4 Ghz to 5Ghz
SSD: 1TB Nvme Hynix
Really Appreciate the effort!
You have a 3070, this is fine.
0:53
Nice! Deep end inclusion is always good news to my ears
I think this has a lot more capability the more I mess with it. Collecting some great ideas here.
fkin awesome man. definitely will try after i finish downloading gigs from huggingface at .00002 kbps
Yeah, it is a bit of a slog, but totally worth it!
Damn... alright, alright, I'll learn ComfyUI. Clearly it's literally a million times more customizable.
ty vm
I went to use this and the whole thing got revamped 2 days ago 😂. No longer need specific CLIP Visions...
👋
That was fast! :-)
why not just use img2img? break this down! Think step by step! 😀
This was a good video, however, you didn't teach anything. Everything was "shown" not "taught". So that made it less helpful. Most videos are like this, do this, then this, then this, and nobody explains why!
This is like episode 17. It's best if you start earlier, as repeating the basics in every video would mean they get exponentially longer every time.
This video should be 3 minutes long lol, but it's 16
Want me to talk faster?
@@sedetweiler Maybe talk less and try to keep it more to the pertinent stuff.
Maybe it comes down to preference, but personally, I like all the hints and tips in this series.
AI pictures are so boring
I don't think you understand the full breadth and depth of what's possible. And neither does anyone, after just a year of having seen AI imaging. We're at the first baby steps of a new field, not the fully evolved one.
I don't think you are going to enjoy this channel much. Just a guess. 😂
I mean, they all look really similar when you look closely; it's like a perfect "Botoxed plastic woman".
Your eyes get oversaturated and ill after watching it too long.
It's like your favorite food: you ate too much and want to puke.
Another awesome video! Also, do you have a discord?
Yes I have! It should be in the description, but perhaps I removed it. I will get that back on there. Cheers!
@sedetweiler Scott, at 10:14 I see your KSampler shows a preview of the generation. I can't seem to find that option. Could you please tell me how to get that?
Thank youuu
It is in the manager. Change the preview to "slow TAESD" and it will show up. I show how to do this in the video I posted yesterday just as a quick reminder.
@@sedetweiler found it! Thaaanx
How can I add weight to each image before batching them?