I didn't get the add difference before, thanks a lot! So is add difference, using the same base model for C, better than weighted sum? I haven't seen the whole test but it seems like maybe that's the same thing in the end?
I was looking for a way to do that with Shiv's Colab but unfortunately it is much harder; with TheLastBen's it is more straightforward, but his colab is currently more volatile.
can you do a video on using a img2video script in the downloaded version of stable diffusion? I love being able to run it offline, but it seems the script is broken for some of the addons. Any help? Great work btw. I'm going to start chucking people your way when they ask me about using AI. Your videos are perfect.
Is there any way to run the new Dreambooth with prior preservation? The JoePenna Dreambooth doesn't support it. The new version gets better results, but I need a tutorial on how to install it on my pod.
Hi, so I am doing this same training on RunPod; it's still running right now, 50% done, but it's been around 50 minutes already since it started training. I am using the same number of instance images and regularization images as you, and even the GPU is the same. It says the ETA is around 30 more minutes. Any idea why this is happening? The difference from yours (less than 15 minutes) is really big. The only difference is I am running 3030 training steps and yours was 2020 steps. Is this the reason? But logically the difference seems too high. Any suggestions would be great.
This is excellent, I am experimenting with Dreambooth. Do you think there is a way to train multiple instances on one model without them interfering with each other? Let's say training two characters and one art style?
@@Aitrepreneur yeah, I checked about 12 seconds after I wrote this. Thank you for covering these latest upgrades in the scene. As far as I understand, that new colab is meant to train multiple faces or 'person' instances; do you think it would work with other kinds of instances, like places or art styles, as well?
Hey, how exactly do you train these models? Don't you need a huge amount of computational power, data, and overall just money to train a stable diffusion net?
Awesome work, thanks for sharing. I wonder if anyone has tried this with models that are overtrained... could this be a perfect use for those? Double the training, then cut it in half by merging? 🤔
Thx for the great video, I'll try that. Btw, why are you always downloading the model on RunPod? There is already a stable diffusion version installed, so you can just move the existing model.ckpt over to the Dreambooth folder.
Question: Will the final method work with the dreamlabs google colab? Since the runpod option isn't free, I'd have to run it on dreamlab colab then download and run a conversion script.
In the video I think you mentioned training on a trained ckpt? I've read that you need to convert the ckpt back to its original diffusers format via a Python script, which I don't know how to run. Can you maybe do a tutorial on this?
What if you used dreambooth to train a person, and then hypernetworks to train different styles? Or the opposite? I think applying a hypernetwork will "dilute" the trained model less than retraining it, no?
You can also train a style into a hypernetwork and apply it to any of the weighted models you have created. You can also use dreambooth on the stylized model with your training data, which gives the best results.
Hey Stable Diffusion Master. Can you make a video about updating stable diffusion? I have an older version and, if there is any way, I would like to be on the latest version. Thank you for your videos and good luck.
Just click the address bar of the folder where your SD is installed, type cmd and press Enter, and in the command prompt window type git pull (you need to have git installed on your PC). I show this in pretty much every video.
Could you make a video where you use your Rhaenyra model for the person and, for your midjourney style, the file you get when training with the Textual Inversion method? For the person, the .ckpt file in the models folder, and for the style, the .pt file you put in the embeddings folder.
I don't see any point in model merging, as you can use LoRA to do it on demand on any model, adding the amount of weight you want in one second. Unless there is some drawback to training LoRA, I would always choose that option.
People are recommending training using a ckpt file that was already trained with your style of choice, but tbh I don't think running that locally is possible, as it's not actually loading a ckpt file but some kind of weird ethereal cache or something from CompVis on Hugging Face... super annoying.
How do I download the midjourney images you used? Or should I create my own midjourney-style images in SD using magic prompts and then use those? Or should I use real midjourney images I made?
I keep getting "RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 19.71 GiB total capacity; 18.04 GiB already allocated; 38.19 MiB free; 18.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
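For anyone hitting the same error: the allocator hint the message mentions can be set before PyTorch touches the GPU. A minimal sketch, assuming it goes at the top of the launch script; the value 128 is just a starting point to tune, not a recommendation from the video:

```python
import os

# Must be set before torch initializes CUDA, so place it at the very
# top of the launch script (before any "import torch" runs anywhere).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Other common pressure relievers: smaller batch size, lower resolution,
# or half precision, e.g. torch_dtype=torch.float16 when loading a pipeline.
```

If the model simply doesn't fit, no allocator setting will save you; reducing resolution or batch size is the surer fix.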
You should try turning Rhaenyra into the style using img2img with weighted prompt editing, cross-attention control and such, where you can actually control it and cherry-pick the ones that retain the essential features; then feed those into the training dataset with fewer of the style images, and maybe leave out style images with faces.
I have a question. I have a GTX 1060 6GB, and when I run the automatic1111 webui locally, I follow all the steps and everything works, but when I run txt2img or img2img (or anything else, actually), it takes around 3 sec/it even at only 20 sampling steps. Am I missing some GPU activation step, or is it because my GPU is very old? Please, I need some advice on this; I have tried everything I could, and having no background in computer science doesn't help my situation at all. Also, I am running batch count and batch size both at 1.
I have a 1660ti and I get about 1-6 seconds/iteration. Make sure you close as many background programs as you can and disable hardware acceleration in your web browser (Google it, it's a bit different in each browser). If you are using preview images during rendering, you might want to turn those off, or way down to only show one every 10 iterations or so. I found showing each step can more than double the time it takes...
@@ArielTavori I have a laptop with a 1660ti and I get around 1.6 it/s (625 ms/it) on Euler a at 512x512, with xformers + automatic1111. I can also run it at 1024x1024 with the --medvram argument, but then it takes around 6-8 s/it.
When downloading the model after the training is done, I only get a download rate of about 200 kB/s, which means I have to rent the GPU for about 3 additional hours. Does anyone know if there is an alternative way of downloading the file?
Some services, like RunPod, allow you to filter out the ones with slow internet... sadly I got stuck just like you. My guess is that some of the time this is actually intentional, so they earn more money...
I want to train Dreambooth with two dogs. Right now, what's the best-looking and fastest (least training time) way to do it? Should I train it with images of both at the same time, or one first and then the other?
The best-looking results, without fail, probably come from training two different models, one with dog A and the other with dog B, and putting them together with inpainting.
Don't forget, in *general* in stable diffusion, if you ask for A, you get A; if you ask for B, you get B; but if you ask for both A and B you get a little less of both, as if the very act of asking for more things dilutes the pool of "thinginess" it can apply. Also, don't just use brackets, use brackets with numbers, i.e. (helicopter pad:1.2) will give helicopter pad a weight of 1.2 times everything else. I find when I try to combine scans of myself with other things, to get myself to still show up, I have to up it to 1.4-1.6 in weight.
Thanks zap
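As a side note, the `(text:number)` emphasis syntax is simple enough to illustrate with a toy parser. This is a hypothetical sketch, not the real AUTOMATIC1111 implementation, which also handles nesting, square brackets, and escapes:

```python
import re

def parse_weighted(prompt):
    """Split a prompt into (text, weight) pairs.

    Plain text gets weight 1.0; "(text:1.2)" gets weight 1.2.
    Simplified illustration only, not the real webui parser.
    """
    pattern = re.compile(r"\(([^():]+):([\d.]+)\)")
    parts = []
    pos = 0
    for m in pattern.finditer(prompt):
        if m.start() > pos:
            parts.append((prompt[pos:m.start()].strip(), 1.0))
        parts.append((m.group(1).strip(), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        parts.append((prompt[pos:].strip(), 1.0))
    return [(t, w) for t, w in parts if t]

# parse_weighted("a photo of a (helicopter pad:1.2) at dawn")
# → [('a photo of a', 1.0), ('helicopter pad', 1.2), ('at dawn', 1.0)]
```

The weight then scales how strongly that chunk's embedding influences the cross-attention, which is why 1.4-1.6 can be needed to keep a subject from being diluted.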
A step-by-step experience will probably save a lot of people's time so they don't make the same mistakes.
Thank you for putting so much effort into creating this video.
Awesome video and thank you for featuring my model! I was actually looking for a good way to merge models and this explains it perfectly!
Thanks for the model, really well done ;)
@@Aitrepreneur By the way, if you would like to redo your training with the arcane dataset, I'd be happy to share it. Might be useful for the viewers as well, when they want to try the initial concept.
That would be cool yes, could you send them to me on discord? I will put them on the board
everything I just throw at your model looks amazing ☺️
Really glad you enjoy it so much!
Thank you for your comment :)
Literally was just making a note to look into using checkpoint mergers tonight. Your timing is perfect with your content, as always. Thanks for all you do for the SD community!!!!
Glad I could help!
Also another observation!
One can literally merge two models twice, once at 0.2 strength and once at 0.8.
Create the init image with the 0.2 merge and then img2img using the 0.8 merge.
This will result in maximum coherence and style.
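For reference, the weighted-sum merge used above is just linear interpolation of every weight in the two checkpoints. A minimal sketch with toy numbers standing in for torch tensors (the key names are purely illustrative):

```python
def weighted_sum_merge(state_a, state_b, alpha):
    """Linearly interpolate two model state dicts.

    alpha = 0.0 returns model A unchanged, alpha = 1.0 returns model B.
    Toy floats here; a real merge iterates over torch tensors.
    """
    return {k: (1.0 - alpha) * state_a[k] + alpha * state_b[k]
            for k in state_a}

# Two "models" reduced to toy weights:
a = {"layer.weight": 1.0, "layer.bias": 0.0}
b = {"layer.weight": 3.0, "layer.bias": 1.0}
merged = weighted_sum_merge(a, b, 0.2)
# merged["layer.weight"] ≈ 1.4, merged["layer.bias"] ≈ 0.2
```

The 0.2/0.8 two-pass trick above amounts to generating structure with a mostly-A merge, then restyling with a mostly-B merge.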
Yesterday I read on reddit that one way to add weights in your prompts is by selecting the phrase you want to emphasize and pressing Ctrl + Up Arrow.
Old version: "a (fantasy) world"
New version: "a (fantasy:1.3) world"
I used it on the AUTOMATIC1111 version and it worked fine.
Yes it works pretty well
I pressed the arrow key accidentally the other day and found this exact feature you mentioned here :-) So cool that Automatic1111 keeps adding new things.
Thanks for all your efforts, you make great content and you enable people to become actively involved with minimal technical knowledge. Cheers
I appreciate that!
Dreambooth:
1. I would recommend using a dreambooth variant with prior preservation; I got much better results. Joe Penna's version doesn't use prior preservation.
2. Instead of imgur, just drag and drop your training (and class) images from your computer into the notebook folder. Much faster.
3. If you have a 3080 or better, with at least 10GB, I recommend running dreambooth locally with WSL2 ubuntu on your windows machine. Nerdy Rodent has a great tutorial on this, and you save the hassle of using runpod.
I never tried the variant one but it's on my list, yes. I use imgur links because I deleted the images from my computer but kept the links in a notepad txt document, so I just copy and paste them when I need them. I have a 1080, so I can't use dreambooth locally, unfortunately.
Honestly the things shown at the beginning of the vid were above my head, but towards the end it was totally interesting 😎. Awesome experiment and thank you so much for sharing ❤👏🏼👍🏻
Incredible. Thank you for sharing your results!
My pleasure!
Sigmoid and inverse sigmoid were basically a recalculation of the value used in weighted sum.
They just apply a curve to the value passed to the weighted sum, so you can replicate the curve.
Long story short: 0.2 sigmoid = 0.104 weighted sum; 0.2 inverse = 0.287 weighted sum; 0.8 sigmoid = 0.896; and 0.8 inverse = 0.71256.
Your hard work was not lost, you just needed to set the right values for the examples :)
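The quoted numbers are consistent with the hypothesis that the old "sigmoid" option remapped the slider through a smoothstep curve, a²(3 - 2a), and "inverse sigmoid" through its closed-form inverse. This is inferred from fitting the values above, not read from the source code:

```python
import math

def smoothstep(a):
    """Candidate formula for the old 'sigmoid' merge: a^2 * (3 - 2a)."""
    return a * a * (3.0 - 2.0 * a)

def inverse_smoothstep(a):
    """Closed-form inverse of smoothstep on [0, 1]."""
    return 0.5 - math.sin(math.asin(1.0 - 2.0 * a) / 3.0)

# Reproduces the values in the comment above:
# smoothstep(0.2) ≈ 0.104, smoothstep(0.8) ≈ 0.896
# inverse_smoothstep(0.2) ≈ 0.287, inverse_smoothstep(0.8) ≈ 0.713
```

So a sigmoid merge at slider x should equal a weighted-sum merge at smoothstep(x), which is why the two modes can produce identical checkpoints at the right settings.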
I had the same idea, but instead of training as 'person', train as 'style'. I never tried it, but seeing your results I'm really excited to see what will happen.
Also, the Google Drive thing works for me but it's a bit tricky; sometimes I just change the directory I'm in and then go back to the Dreambooth folder, and it works.
Let me know if that works better for you, really curious to know
A very articulate + concise video. Thank you so much. This is gonna be great!
Glad it was helpful!
waiting for this exact video, i really appreciate your efforts!
Big up for this video. So comprehensive and useful !! Many thanks
Thanks for these. In case no one has said it, uploading directly to Gdrive DOES work, but you have to keep hacking at it sometimes, constantly reloading the link they provide until it finally authorizes you.
Sometimes it does, sometimes it doesn't, can't really explain why that is...
Very nice!! My rigs have been down for days so thank you for the work.
awesome results, bro! I'm still just playing with it, looking for a way to integrate it into my production workflow. stable diffusion and whisper are massive game changers, they also made me fully shift my attention to AI, thank god that i already do code for a living 😅
cheers and please keep posting :D
Good luck!
wow you did so much work on this, amazing stuff!
Thank you! Cheers!
You can also try this: 1. generate a whole bunch of images from the style model, and then 2. train the two things together.
But I think, rather than doing this, you should follow the Shivam dreambooth implementation; he has given the option to add multiple subjects in the same training. I was able to do 4 characters quite easily with that... and that json file option is available in the UI as well.
Amazing results! And incredibly instructive video! Thanks again!
Glad you enjoyed it!
Great work
Merging is not training; it will always get something between the two models, like you are morphing one image into another.
You could try to generate some images in the arcane style and use them to train the style + face; it may give a good result even if they aren't the original images used to create the style.
Yes merging is not training, but the dreambooth part is
@@Aitrepreneur I know, that is what I said; that is why the dreambooth method has a better result. You won't get a better result than dreambooth using the merge tool.
@@jnjairo unless you're merging two similar models and selecting the best "offspring" based on those merges.
So if AI merged two dreambooth midjourney Rhaenyras and kept going a few generations down... he'd "breed" a better combination eventually :D
Interesting thanks for doing this. I've been thinking about combining a training of my g/f with a waifu model, so I'm going to give this a shot and hope there is some coherence.
Great as always 👍
Not sure why you still stick to the rather complicated Imgur way when you can just drag and drop your pics to the training-images folder and continue.
Because I deleted the images from my computer and kept the imgur links in a notepad text document that I just copy and paste each time
Thank you very much this is really interesting and useful.
Thank you. Your effort is appreciated.
I appreciate that!
Amazing results!!
One alternative approach that I've been thinking to experiment with (but I'm not sure is even possible) is to use prompt alternating, but alternating models. Similar to the [a|b] syntax of the Automatic1111 gui, but switching models instead of prompts for each step.
Sigmoid stuff was removed because in the end it was exactly the same as weighted sum. I don't recall the exact formula off the top of my head, but the slider at 0.3 with inverse sigmoid is exactly the same as setting it to 0.36 (or something like that) with weighted sum. I would just change the increments to 1 instead of 5 so you have more control when merging.
Where did you read that? Because the results I got were vastly different with Sigmoid and weighted sum
HELLO HUMANS! Thank you for watching & do NOT forget to LIKE and SUBSCRIBE For More Ai Updates. Thx
I give up, as I just got bitten by Dreambooth because the instance name I picked was already known, so I never get any of my work. I keep picking weird names for the instance, YET they are already known, so I never see my training. It sucks that there is no tool/program to change the instance name post-training, as each run takes me 1h.
This is amazing! With how quickly everything changes, is there another way to merge a character (or two perhaps?) and a style nowadays?
Excellent video, awesome!
Clever!! Good job! Merge it!
Well done sir. I'd try both methods just for the sake of it.
So you're saying I could get some issues with multiple styles and characters in one model? I hope this is not really an issue, because I'm planning to have multiple characters in one model for use by anybody... Is this an issue that might happen or not?
I don't want people disappointed by one character being lower quality as many more come pouring in and get supported...
The best approach will be training a style, then using it as a base to train your checkpoint. That way you can use the ckpt to generate multiple characters with that style, and also use the character without the style in the same ckpt.
Yes but you lose the style as you train a new character over it
@@Aitrepreneur not much at all. Merging checkpoints loses the style more than retraining does.
Wow. Thank you
You are now my higher power. With your videos I have been able to advance my use of stable diffusion beyond what I thought was possible. Thank you so much. I do have a question. When using checkpoints I have downloaded, can I rename them before putting them into the models folder? Some of the names are pretty cryptic and it becomes difficult to keep track of what's what. Thanks again.
Yes of course, it will not change the information inside
@@Aitrepreneur thank you.
My method: train subjects and style at the same time. First, put the characters in and train until they begin to look familiar, about 30 it per image; then add your style in and lower your learning rate to at least 4e-7, and make sure your style is doable for your TOTAL character count.
This is easiest with TheLastBen's fast dreambooth, but it can be done by choosing your finished ckpt from Shivam's method and then training that .ckpt.
I'm at 9:45 in this video and I think the add difference will really shine if you subtract the base SD 1.4 from the arcane model (only the style details will be passed on to your primary model).
Is that something that you've tried? Could be fantastic if that worked
@@Aitrepreneur No, I haven't tried it, but based on my understanding from training 15+ models so far, I think this is the way it will actually work. I don't use automatic1111, as my local PC doesn't support fast rendering.
I'd love for you to test this out.
P.s. If this does work, do make a video and maybe give me a shoutout :p
Cheers!
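The "add difference" mode discussed in this thread can be sketched as transferring only the fine-tune delta of model B (relative to base C) onto model A. Toy numbers stand in for torch tensors; the names are illustrative:

```python
def add_difference_merge(state_a, state_b, state_c, multiplier=1.0):
    """A + (B - C) * multiplier.

    A: primary model, B: style model, C: the base both were trained from.
    Subtracting C isolates what B's fine-tune added, so only the style
    delta lands on A. Toy floats stand in for torch tensors.
    """
    return {k: state_a[k] + (state_b[k] - state_c[k]) * multiplier
            for k in state_a}

base = {"w": 1.0}      # e.g. SD 1.4
person = {"w": 1.5}    # base fine-tuned on a person
style = {"w": 0.6}     # base fine-tuned on a style
merged = add_difference_merge(person, style, base)
# merged["w"] ≈ 1.1: person keeps its +0.5 delta and gains style's -0.4 delta
```

Unlike weighted sum, this doesn't average the person away toward the base; it only layers the style's changes on top.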
Pro tip: if you set up runpod with stable diffusion, you can copy the model into dreambooth right away from the stable-diff folder.
If I have 2 models trained for stable diffusion, one with my face trained, and one with an art style, isn't there a way to merge the two models at the prompt???
You can use checkpoint merger as I showed in the video
@@Aitrepreneur Thank you so much! :D
great tutorial man thank you!!
Glad it helped!
Me too, I was just checking the merge option when you uploaded this video. I tried another option that gives good results too: I used the add difference mode at 0.5, with A for the person, B for the style, and C for the original model. You can try it out. Thanks for making videos like this.
Thanks for sharing
@@Aitrepreneur it works very well on the waifu and novelai styles
There is a better way to train concepts together; I've trained multiple subjects and multiple styles at the same time. Use kanewallmann's fork of Dreambooth. There you don't need to specify a class or a token; it derives those from the folder structure and filenames.

So in your regularization folder you would have multiple folders for regularization images: a "person" folder with person images, perhaps "painting" for just random painting images, or "container" for the old container example, as an example.

Then in your training folder the structure could be like "training/example/person", and you could have multiple different people there; you just need to use descriptive filenames like you can with textual inversion. So you could train with filenames such as "Young Rhaenyra_1" and "Old_Rhaenyra_5" etc. to train multiple subjects at the same time. But in the same training session you can also have a folder "training/example/painting" with your midjourney style images there.

His fork then matches the regularization images based on which folders they are in, and it also uses the filenames instead of static tokens, so the end result will be a lot more editable and flexible. People kinda forgot about that fork, but it's hands down the most powerful way to use Dreambooth. Keep in mind you need at least 100 steps per training image, so if you're training multiple subjects and styles you will need to extend your training time accordingly.
I haven't used it yet, just discovered it recently but I will definitely do more testing with it!
@@Aitrepreneur awesome. In my experience it works incredibly well. But still not as well as training subjects on their own for example. But perhaps it's just my luck.
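If I understood the comment above correctly, the folder layout for that fork would look something like this (all names hypothetical, reconstructed from the examples given — check the fork's README for the exact convention):

```
training/
└── example/
    ├── person/
    │   ├── Young Rhaenyra_1.png
    │   └── Old_Rhaenyra_5.png
    └── painting/
        └── midjourney_landscape_3.png
regularization/
├── person/        (random person images)
└── painting/      (random painting images)
```

The class comes from the folder name, the token from the filename, and each training folder is matched against the regularization folder of the same name.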
It would be interesting to use prompt-to-prompt with cross-attention control with this methodology. Instead of using parentheses, pipes, brackets, etc. to try to adjust control in a sort of clunky way, PTPCAC allows you to have very fine-grained control over every word, weight each one carefully against the others, and dial them up and down in a simple way.
I simply used the normal parentheses here since they were easier to write and test
@@Aitrepreneur Totally understand, and it makes sense for this video. I was just wondering what the result would be if you took some of the same ideas and got deeper into the fine-grained abilities of cross-attention control. I have to wonder if the Google Colabs that have built a module for PTPCAC are just injecting all of that into a prompt with a similar formatting. You've shown how (((artstation))) brings a higher level of attention to that part of the prompt, but is PTPCAC just injecting information into the prompt to do something like "(fantasy, 8) painting of a (woman, 5) sitting on a (barstool,-6) in a train station, trending on (artstation,-10)", or something? I am just really curious about the syntax to create a prompt that has much finer control than fewer or more parentheses.
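On the parentheses question above: in AUTOMATIC1111, each layer of parentheses multiplies the token's attention by 1.1, and `(word:x)` sets the multiplier directly, so the nesting is really just shorthand for a numeric weight. A quick sketch (my understanding of the web UI's behavior; worth verifying against its wiki):

```python
def paren_weight(depth):
    # Effective attention weight of a token wrapped in `depth`
    # nested parentheses in the AUTOMATIC1111 prompt syntax:
    # each layer multiplies attention by 1.1.
    return 1.1 ** depth

# (((artstation))) is roughly equivalent to (artstation:1.331)
print(round(paren_weight(3), 3))  # → 1.331
```

So the explicit `(word:1.3)` form the other comments recommend is strictly more expressive — you can hit any weight, including values below 1 to de-emphasize, without stacking brackets.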
Thank you very much!
Very good tutorial. We would also like to know how to train more than one character in one session, or train multiple things at the same time, like a person and a dog.
There is apparently a Dreambooth variant that can do that but I haven't tried it yet
Great tutorial as usual ;). Could you make a video with the same technique but using the web UI with SD 2.1 locally? That would be very cool
Great video
Well this was complicated. Why did your Dreambooth lose its arcane style? I trained a face with the Midjourney style and it worked perfectly; the face is pretty spot on. Don't know why you had to do all of that. I wonder why your arcane style disappeared. One interesting workaround: you could have taken all your training images, put them through ArcaneGAN, and then added them back as the training images.
Good find!😄
Thanks!
Hi, I really enjoy all your tutorial videos. I just wonder, is the voice in the video AI generated? Thank you
Well I am an AI overlord after all so...
I didn't get the Add Difference part before, thanks a lot! So is it better to use Add Difference with the original model as C than Weighted Sum? I haven't seen the whole test, but it seems like maybe that's the same thing in the end?
This is incredible, thanks! Question: if I use the RunPod service, I can recreate what you did on my MacBook Pro, right?
Yes, since it's online, but you can't really run AUTOMATIC1111 SD on a Mac as far as I know
@@Aitrepreneur I see, and the one you’re running is automatic1111, right ?
yes
Hi and thanks, did you try to subtract the style model from original SD instead of subtracting it from the person model in your merge?
I was looking for a way to do that with Shiv's Colab, but unfortunately it is much harder; with TheLastBen's it is more straightforward, but his Colab is currently more volatile.
Can you do a video on using an img2video script in the downloaded version of stable diffusion? I love being able to run it offline, but it seems the script is broken for some of the addons. Any help?
Great work btw. I'm going to start chucking people your way when they ask me about using AI. Your videos are perfect.
Is there any way I can run the new Dreambooth with prior preservation? The JoePenna Dreambooth doesn't support it. The new version gets better results, but I need a tutorial on how to install it on my pod.
Hi, so I am doing this same training on RunPod; it's still running right now, 50% done, but it's been around 50 minutes already since it started training. I am using the same number of instance images and regularization images as you, and even the GPU is the same. It says the ETA is around 30 minutes more. Any idea why this is happening? The difference from yours (less than 15 minutes) is really big. The only difference is I am running 3030 training steps and yours was 2020. Is this the reason? But logically the difference seems too high, I think. Any suggestions would be great.
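For what it's worth, a quick back-of-the-envelope check suggests the step count alone doesn't explain the gap (numbers taken from the comment above; this assumes training time scales roughly linearly with total steps):

```python
# Training time scales roughly linearly with total steps,
# so 3030 steps vs 2020 steps is only a ~1.5x increase.
steps_video, steps_mine = 2020, 3030
time_video_min = 15  # roughly what the video reports

expected_min = time_video_min * steps_mine / steps_video
print(round(expected_min, 1))  # → 22.5
```

So the extra steps would account for roughly 22–23 minutes, not 80+. The rest is more likely a slower pod, slow disk I/O, or a different batch size.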
This is excellent! I am experimenting with Dreambooth. Do you think there is a way to train multiple instances on one model without the instances interfering with each other? Let's say training two characters and one art style?
Check out my latest dreambooth video
@@Aitrepreneur Yeah, I checked about 12 seconds after I wrote this. Thank you for covering these latest upgrades in the scene. As far as I understand, that new Colab is meant to train multiple faces or 'person' instances; do you think it would work with other kinds of instances, like places or art styles, as well?
Hey, how exactly do you train these models? Don't you need a huge amount of computational power, data, and overall just money to train a stable diffusion net?
Awesome work, thanks for sharing. I wonder if anyone has tried this with models that are overtrained... could this be a perfect use for those? Double the training, then cut it in half by merging? 🤔
No idea, but I'm not sure that's how it works; to me an overtrained model is basically corrupted, no matter what
May I also ask what software you use to draw the tree graph at the beginning of the video?
miro.com
Thx for the great video, I'll try that. Btw, why are you always downloading the model on RunPod? There is already a Stable Diffusion version installed, so you could just move the existing model.ckpt over to the Dreambooth folder.
Because you are not using stable diffusion but a Dreambooth notebook
Btw, can we actually train with 30 images? I thought 20 was the max, and beyond that you get a VRAM out-of-memory error
Not really, there's no max; you can even use way more
Question: Will the final method work with the dreamlabs google colab? Since the runpod option isn't free, I'd have to run it on dreamlab colab then download and run a conversion script.
Maybe you're gonna have different results with the Colab one; check out my comparison video between the Colab one and the RunPod one
The Midjourney pictures you uploaded might have embedded tags
Is it possible to use custom models in the colab version?
In the video I think you mentioned training on a trained ckpt? I've read that you need to convert the ckpt to the original diffusers format via a Python script, which I don't know how to run. Can you maybe do a tutorial on this?
I already did every tutorial on these ;)
You don't need to convert anything: it's a model. You can just use it like any other. Worked for me.
Thank you, but what about the style fields under the Generate button?! 😏
These are just presets that you can save and reuse
Like ArcaneGAN, is there a way to keep the image consistent and only change the style?
What if you used Dreambooth to train a person, and then hypernetworks to train different styles? Or the opposite? I think applying a hypernetwork will "dilute" the trained model less than retraining it would, no?
There is definitely room for new experiments but I have no idea if that will work well or not
You can also train a style into a hypernetwork and apply it to any of your weighted models you have created.
You can also run Dreambooth on the stylized model with your training data; this gives the best results.
Yes there are so many options it's almost impossible to test them all now
Hey Stable Diffusion Master, can you make a video about updating stable diffusion? I have an older version, and if there is any way, I would like to be on the latest version. Thank you for your videos and good luck
Just click on the address bar of the folder where your SD is installed, type cmd, press Enter, and in the command prompt window type git pull (you need to have git installed on your PC). I show this in pretty much every video
thank you🥰
Could you make a video where you use your Rhaenyra model for the person, and for your Midjourney style you choose the file you get when training with the textual inversion method? For the person the .ckpt file in the models folder, and for the style the .pt file you put in the embeddings folder.
If you removed the portrait images from your MJ set and left only the non-portrait ones, you would get an even better result.
Is it possible that SD already understands the Midjourney style even before you trained it?
No
Nice
I don't see any point in model merging, as you can use LoRA to do it on demand on any model, adding the amount of weight you want in one second. Unless there is some drawback to training a LoRA, I would always choose that option.
People are recommending training from a ckpt file that was already trained with your style of choice, but tbh I don't think running that locally is possible, as it's not actually loading a ckpt file but some kind of weird ethereal cache or something from CompVis on Hugging Face... super annoying.
How do I download the Midjourney images you used? Or should I create my own Midjourney-style images in SD using magic prompts, or use real Midjourney pictures I made?
You can download them from the miro board
How do we do all of this locally, on a 3090? I spent way too much on the gear during the GPU shortage — I'd like to use it.
Wish I could show you but I don't have a powerful GPU to do that myself
I'm getting CUDA out of memory errors even though I'm running a 24GB GPU. Any idea why?
Hi, where do we get arcane and midjourney models (if possible)?
Thx
Arcane is in the description and midjourney is a style I created in my how to train a style video that you will find in the description of that video
@@Aitrepreneur Thx for the reply, I see the miro board, but not the arcane link, did I miss it? Thx!
It's in the miro board at the top
@@Aitrepreneur Thank you for this one!
Is it possible to run dreambooth locally with a 3080?
Is arcane style pruned when running through DB?
Great video and comparison
and amazing results, will definitely try this out
Glad you liked it
AI: "dude, I'm smarter than you, just give me all the data at once and I'll do all the work."
I keep getting "RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 19.71 GiB total capacity; 18.04 GiB already allocated; 38.19 MiB free; 18.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
You should try turning Rhaenyra into the style using img2img and weighted prompt editing, cross-attention control and such, where you can actually control it and cherry-pick the images that retain the essential features, then feed those into the training dataset with fewer of the style images, and maybe leave out style images with faces
hey whats the best method to train your own face?
Dreambooth imo
Training start at 18:18
Hello. Is it possible to do training on a 3070 8 GB video card?
Not unless you have 25GB of RAM, and even then I wouldn't recommend it, so I guess the answer is no
I know of Elizabeth Olsen but who is Elisabeth Olsen as used in this video?
Just me not being able to spell her name correctly
@@Aitrepreneur Oh OK. Wasn't actually sure if it was to give some kind of alternative result.
I have a question. I have a GTX 1060 6GB, and when I run the AUTOMATIC1111 web UI locally, I follow all the steps and everything works, but when I run txt2img or img2img (or anything else, actually) it takes around 3 sec/it, and that's with only 20 sampling steps. Am I missing some GPU activation step, or is it because my GPU is very old? Please, I need some advice on this. I have tried everything I could, and having no background in computer science doesn't help my situation at all. Also, I am running both batch count and batch size at 1.
It can be because your GPU is old, my GPU is worse than yours and I get 10 secs/it.
I have a 1660 Ti and I get about 1-6 seconds/iteration. Make sure you close as many background programs as you can and disable hardware acceleration in your web browser (Google it; it's a bit different for each browser)
If you are using preview images during rendering, you might want to turn those off, or way down to only show one every 10 iterations or so. I found showing each step can more than double the time it takes...
I use my CPU for stable diffusion and I get 20 sec/iteration :D Buy a new GPU if you want very fast generation :D
@@ArielTavori I have a laptop with a 1660 Ti and I get around 1.6 it/s (625 ms/it) on Euler a at 512x512, with xformers + AUTOMATIC1111
I can also run it at 1024x1024 with the --medvram argument, but it takes around 6-8s/it if I do
When downloading the model after the training is done, I only get a download rate of about 200 KB/s, which means I have to rent the GPU for about 3 additional hours. Does anyone know if there is an alternative way to download the file?
Some services, like RunPod, allow you to filter out the pods with slow internet... sadly I got hit by this just like you. My guess is that sometimes this is actually intentional, so they earn more money...
I want to train Dreambooth with two dogs.
Right now, what's the best-looking and fastest (least training time) way to do it?
Should I train it with images of both at the same time or one first and then the other one?
The best-looking result without fail is probably to train 2 different models, one with dog A and the other with dog B, and put them together with inpainting
@@Aitrepreneur Thank you!
What's cheaper, RunPod or Colab Pro, if I wanted to train lots of people or animals?
Is there a way to run dreambooth locally on windows? :’(
Yes if your GPU is powerful enough
@@Aitrepreneur siiick
Is there a way to run Dreambooth on an M1 MacBook Pro?
Not that I know of