LORA Training - for HYPER Realistic Results

  • Published: 10 Feb 2025

Comments • 144

  • @numbnut098
    @numbnut098 1 year ago +7

    I think this is the best, most detailed tutorial on the subject of training a character LoRA that I have seen. The information you have given has changed my LoRAs from training nightmare juice to training actual people. Thank you so much for this.

  • @pixeladdikt
    @pixeladdikt 1 year ago +22

    Thank you Olivio! I've had to train new faces, and since Dreambooth isn't the "go-to" these days I've been looking for a new LoRA tutorial. Those last 2 mins where you explain putting the LoRA into ADetailer really hit home - such an amazing workflow 👊💯

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      Thank you, yes, that helped me a lot too :)

    • @BunnySwallows-qx7ek
      @BunnySwallows-qx7ek 1 year ago +1

      Do you know what the "go-to" method is these days? I am having trouble running Dreambooth on Colab, so I'm looking for a new method to make a LoRA or model.

    • @chrisrosch4731
      @chrisrosch4731 1 year ago

      Wouldn't you say Dreambooth is still king?

  • @JavierPortillo1
    @JavierPortillo1 1 year ago +12

    Realistic Vision just updated a couple of days ago and it looks fantastic!

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +4

      Cool, I will look into that and maybe make a video about it

    • @yudhamarhatn3006
      @yudhamarhatn3006 1 year ago +1

      Agreed, I'm already training a LoRA with RV 4.0 and it gives me goosebumps

  • @ArnoSelhorst
    @ArnoSelhorst 1 year ago +6

    Olivio, absolute professional advice here that is really appreciated. I have been following you for quite some time now, and it really shows the earnestness with which you follow your passion and teach it to others. Bravo! Keep it up!

  • @WifeWantsAWizard
    @WifeWantsAWizard 9 months ago +6

    (16:40) First, he meant "ALT+F4". Second, you can "ALT+TAB" to swap to the alert popup. Also, BooruDatasetTagManager has hotkeys you can set yourself (under "settings" => "hotkeys"). The default hotkey for hiding the preview window is "CTRL+P".
    (25:30) He forgot to mention that you can't set "max resolution" to 768x768 if your input images are less than that--say 512x512. A lot of times we'll create LoRAs specifically for use in image-to-image. That means we want those LoRAs to output at a low resolution so that it is quick and then you can "upscale" in img2img using the low-res as a base. You can also use 128x128 for pixel art.

    • @germanfa1770
      @germanfa1770 8 months ago +2

      Could you please explain one thing to me? I'm not sure if I understand this correctly. Can I use a resolution of 2040x2040 or 1600x1600 pixels to train SD 1.5 LoRAs and set 768x768 pixels in the Kohya settings, with the "Enable buckets" option turned off, since all the photos in the dataset have the same 2040x2040 resolution? If so, is it advisable to do this? After all, the SD 1.5 model only understands 512x512 pixels, so will the model understand my 2040px dataset? But if I prepare my dataset at 512x512 or 768x768, the quality of the original photos will be noticeably reduced. Thank you.

    • @WifeWantsAWizard
      @WifeWantsAWizard 8 months ago

      @@germanfa1770 So, two things. To your question: "buckets" is for sorting input images into different groups. The system presorts, let's say, all the 1080x480s into one group and does them together, then presorts all your 768x512s into another group, and so on. Checking the "buckets" toggle means you are telling the machine, "look through this pile and sort it before you get started". If you only have one size, when it "buckets" everything it will only find the one size and group them all together. Technically that's a waste of 20 seconds, so you can turn it off if you know everything you're ever putting in will be 1:1 or whatever.
      Second, Stable Diffusion "remembers" powers of two. So, you said "understands 512x512". True, but use the word "remembers". SD and SDXL also remember 1024x1024, 2048x2048 (if you want your graphics card to catch fire), and even 128x128 (for pixel art). It stores the training data at a 1:1 aspect that is a power of 2 but can "see" (train from) any resolution/aspect ratio. It may seem counter-intuitive that a 1:3 aspect ratio can somehow result in 1:1 training data, but that's how the math works.
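
      To make the bucketing idea concrete, here is a minimal Python sketch of how images get grouped by aspect ratio. The bucket list and grouping rule are illustrative assumptions, not kohya's exact algorithm:

      ```python
      # Minimal sketch of aspect-ratio bucketing. The candidate bucket
      # list is a made-up example; kohya derives buckets dynamically.
      from PIL import Image

      BUCKETS = [(768, 768), (1024, 576), (576, 1024), (896, 640), (640, 896)]

      def nearest_bucket(width, height):
          """Pick the bucket whose aspect ratio is closest to the image's."""
          aspect = width / height
          return min(BUCKETS, key=lambda b: abs(b[0] / b[1] - aspect))

      def assign_buckets(paths):
          groups = {}  # bucket size -> images batched together in training
          for path in paths:
              with Image.open(path) as img:
                  groups.setdefault(nearest_bucket(*img.size), []).append(path)
          return groups

      # With a single-size dataset, every image lands in one bucket,
      # which is why the toggle buys nothing in that case.
      ```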

  • @lennylein
    @lennylein 1 year ago +6

    I cannot stress enough how important the quality of the source images is for training LoRAs. This is one of the few tutorials that actually gives useful advice on how to create and prepare a high-quality training dataset.
    Thank you for this outstanding video ❤

  • @OlivioSarikas
    @OlivioSarikas  1 year ago +4

    #### Links from the Video ####
    Install Kohya ss Guide: ruclips.net/video/9MT1n97ITaE/видео.html
    Photon Model Video: ruclips.net/video/0tDFCZr5cA8/видео.html
    Photon Download: civitai.com/models/84728/photon
    v1-5-pruned.safetensors huggingface.co/runwayml/stable-diffusion-v1-5/tree/main
    github.com/starik222/BooruDatasetTagManager

    • @op12studio
      @op12studio 1 year ago

      You can just press Enter to get past the save popup if you accidentally forgot to move the preview image. Great video btw

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      @@op12studio oh, cool, thank you!

    • @gohan2091
      @gohan2091 1 year ago

      Typo in your description
      it's Photon not photo :D

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +1

      @@gohan2091 thank you

  • @JohnSmith-vk4vq
    @JohnSmith-vk4vq 1 year ago

    Wow thank you for explaining the right way to set up samples… you are correct 👍 Sir!

  • @maxfahl
    @maxfahl 1 year ago

    BIG thank you! This was exactly the video I was missing in my LORA expeditions.

  • @ozerune
    @ozerune 1 year ago +32

    I made a LoRA of my dead grandma last night to create images of her for my mom, and she was very happy with it, but it was so blurry, and unfortunately it isn't really possible to give it a better dataset anymore

    • @testales
      @testales 1 year ago +12

      You could try upscaling your training dataset images with Gigapixel AI beforehand. This will also fix quite a bit of noise. There are also AI sharpeners which can yield impressive results at times, so it depends how much time you want to invest in your training set.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +12

      Try to upscale and sharpen the images, then try using the very best images from your AI creations in the training as well and see if that improves anything. Also try rendering the images with different models, or render first with the model that works best for you and then use img2img with a different model that creates more realistic results

    • @spoonikle
      @spoonikle 1 year ago +4

      I am with the other comments: lots of AIs specialize in upscaling old photos. If you're using a model trained only on modern digital data, then you won't get much help, but if you start with high-res scans of photographs in a model trained on upscaling photo scans, you will get amazing results.
      Maybe we can fill a database to contribute to a LoRA by taking the same photos with different cameras and scans of film.

    • @Chilangosta
      @Chilangosta 1 year ago

      Agree with the other comments - I'd just echo the advice to not give up if you really want this! The first few times are almost never representative of the results you can get! Keep tweaking and trying new things, and research it more, and you'll probably end up with a much better result than you thought possible!

    • @Lexie-bq1kk
      @Lexie-bq1kk 5 months ago

      People are recommending you upscale and sharpen, but there is a lot of work you can do before getting to that point. I would encourage you to take the photos you have, go through and remove any unwanted objects, and simplify the background as much as possible. I use Topaz Photo AI and Photoshop to remove objects and create new backgrounds. The images might be low quality, but if you can simplify each image down to a very simple background and just the subject, it will help. Also, you can take your training photos into a CLIP interrogator to see how the AI will recognize certain things about the image; ultimately you may be able to use that information for captioning or for future use in negative prompts.
      Also, I would see if you can accomplish the "sharpening" with an actual very low-strength denoise, like 0.1 in Topaz. Since what you are hoping to achieve is more clarity, denoise might be a better alternative to sharpening. I find that as long as the image isn't overly noisy, sharpness doesn't matter as much.

  • @jrfoto981
    @jrfoto981 1 year ago

    Thank you Olivio, this is a good process for getting a desired result. I used a similar process of image preparation for making custom embeddings.

  • @LeonvanBokhorst
    @LeonvanBokhorst 1 year ago +3

    Thanks again. Very helpful, like always 🙏🚀

  • @0AThijs
    @0AThijs 1 year ago

    Thank you for this very informative guide. Definitely one of, if not the, best out there :)

  • @axelesch9271
    @axelesch9271 1 year ago +3

    The new Kohya SS master release uses different tabs from your video: Dreambooth, LoRA, Textual Inversion... nothing like Dreambooth TI or Dreambooth LoRA. How do you figure out what the tabs do, since the Dreambooth tab doesn't include anything like network rank, and the LoRA tab includes nothing related to the Dreambooth/LoRA technique?
    Nobody is talking about it, but the dev of the GUI just changed the whole UI without providing any documentation on how to interpret all the changes to the GUI he has made.

  • @AeroviewMenorca
    @AeroviewMenorca 1 year ago +1

    As always, excellent work Olivio. I've been following you from Spain. My English is a bit limited, so I use an AI to translate your voice into Spanish based on the subtitles. It might be funny, but it's a lifesaver for me. You provide very detailed explanations in your videos. Greetings and thank you very much! 👏👏

  • @tomschuelke7955
    @tomschuelke7955 1 year ago +2

    Many thanks for this. Two questions... no, three.
    What about those extra images some other YouTubers suggest for the class of the object? I don't remember the name... calibration images?
    When should you make a LoRA and when should you use Dreambooth?
    Third:
    When I want to train, for example, the typical style an architectural company has for, let's say, office facades seen from the street (those certainly differ often, but you still want to find the essence of the style): LoRA or Dreambooth? How many images? How to caption?

  • @Inugamiz
    @Inugamiz 1 year ago +6

    Olivio, mind doing an updated tutorial on making those dancing AI videos? I've been trying, but either the face is messed up or it just stops doing the poses.

  • @lostinoc3528
    @lostinoc3528 3 months ago +1

    Been searching for weeks and still can't find: what are the exact image resolutions that should be used when training a LoRA for Flux? Or SDXL? I know 1024x1024 is the AI's resolution, but what size should the images be for training the LoRA? Should I crop them all to 1024x1024? What about full-body photos? What size should they be? Should I stick to just one or two sizes, or are several okay? I will mostly be using images from social media, so their resolutions are mostly 1440x1440 and 1440x1800. Are these okay resolutions, or should I enlarge them? Reduce them? Crop them? Any advice will be super appreciated!

  • @HanSolocambo
    @HanSolocambo 10 months ago

    21:03 "That number defines the steps or repetition [...]"
    This number represents repetitions only (repeats).
    steps is something else = nb.images x repeats.
    21:09 "I mostly use 10 for my LoRA but others use 5 [...]"
    Nothing's random in training a Lora ;) Number of repeats should more be about "how many images do I have for that specific LoRA", rather than about "how many epochs am I going to need now" or "I am used to that number".
    Images found (let's say 100) x repeats (8) = 800 steps.
    steps x gradient accumulate steps x epochs x regularization factor (if one uses properly made reg. images + reg captions for each trained image) = Max Train Steps.
    800 x 1 x 2 x 2 = 3200 steps (which is often enough).
    This being said I'm still confused about why or how one should balance repeats and/or epochs to reach the sweet spot between about 3K~4K Max Train Steps. Especially since we can save checkpoints and samples every N samples, run more epochs or resume from a precedent trained weight.
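
    As a sketch of the arithmetic described above (assuming this commenter's formula; exact step counting varies by kohya version):

    ```python
    # Step arithmetic per the comment above; not an official formula.
    def max_train_steps(num_images, repeats, grad_accum_steps, epochs,
                        reg_factor=1):
        """reg_factor is 2 when regularization images are used, else 1."""
        return num_images * repeats * grad_accum_steps * epochs * reg_factor

    # The comment's example: 100 images x 8 repeats = 800 steps per epoch,
    # then x 1 grad accum x 2 epochs x 2 regularization = 3200.
    print(max_train_steps(100, 8, 1, 2, reg_factor=2))  # 3200
    ```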

  • @TheTornado73
    @TheTornado73 1 year ago +1

    Hello! There is also a psychology factor: women do not like too-detailed photos :)
    That is, you do not need to see all the wrinkles, acne, pigmentation, etc. Detailing is important for the large features: the shape of the eyes, the eyebrows, the eyelashes, the lips. If you focus on super-detailing every wrinkle, they will tell you "it doesn't look like me!" No wonder the beauty industry works :)))
    You can reduce the number of epochs by increasing the dataset: a dataset of 50-60 photos at 80-90 steps per photo and one epoch gives quite normal results, with the LoRA weight in the prompt at 0.7-0.8, plus a variety of clothes and a variety of backgrounds.
    If the set is all on the same background, that background will pop up in the most unexpected places; if the subject is only in a white T-shirt, that T-shirt will be everywhere. It's better to cut out the background altogether. From a set of 10 photos with the same type of background, I cut out the background on 8. And a variety of clothes will keep SD from getting hung up on a certain color or style.

  • @TaylorWay-e8z
    @TaylorWay-e8z 1 year ago

    great video! thank you!

  • @Aviator-ce1hl
    @Aviator-ce1hl 1 year ago

    Instead of creating the training folders manually, you can do it automatically using the Tools tab in Kohya's Dreambooth LoRA (see the sketch below for the layout this produces).
    About using "Restore Faces": if I remember well, in one of your videos you suggested not using it with a LoRA model because it may modify the actual face, and I found that it may be true. When you use the Tools tab in Dreambooth, you set the keyword for the LoRA and you also give a category (class) to the model, which I believe is important for the training.
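
    A minimal sketch of the folder layout this produces, assuming the common kohya "<repeats>_<instance> <class>" naming convention; "betka" and "woman" are placeholder names:

    ```python
    # Build the kohya training folder layout by hand; kohya reads the
    # repeat count and prompts from the image subfolder's name itself.
    from pathlib import Path

    root = Path("training/betka_lora")
    repeats, instance, cls = 10, "betka", "woman"

    for sub in ("img", "log", "model"):
        (root / sub).mkdir(parents=True, exist_ok=True)

    (root / "img" / f"{repeats}_{instance} {cls}").mkdir(exist_ok=True)
    ```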

  • @mr_pip_
    @mr_pip_ 1 year ago

    Wow .. really very well explained, thank you!

  • @haydenmartin5866
    @haydenmartin5866 1 year ago +2

    Hey man, we need to see an SDXL LoRA tut 🙏

  • @camar078
    @camar078 1 year ago +3

    All well and good, but you did not cover the really relevant points and problems that you actually run into during training. Points that would have been genuinely informative: What did you change after a LoRA failed to reproduce the head shape or hair correctly? What difference do the resolution and aspect ratio of the source images make, both in training time and in the results? How do you set up the buckets correctly, and how do they relate to the training resolution and the source images? How do you train both portrait and full-body shots? How many of each configured perspective worked for you? One part full-body, three parts close-up? Something else? How can I improve the consistency of the results across different aspect ratios, and is there anything to watch for in the source images so that the LoRAs work well here?
    The "explanations" of mixed precision, network rank dims (file size, consistency of results), and LoRA resolution are at best dangerous half-knowledge and should urgently be labeled as such. Statements like "I have seen many good results from LoRAs trained with fp16, but also some with bf16" help no one and carry no weight for anyone if the underlying properties are not at least briefly touched on. My suggestion is therefore to either mark these points as "personal impressions" or to say outright that nothing objectively verifiable is known here and no further research was done. The net is by now full of "tutorials" that are 90% identical in content and lead to half-baked results at best. More honesty and/or real research would be refreshingly helpful.

    • @YouTubeInspiration-l9i
      @YouTubeInspiration-l9i 2 months ago

      Finally, someone who is actually building LoRAs and addressing actual issues! Thank you for the comment. Watching Olivio's tutorial gives me the impression that he never actually made any LoRA at all and just makes videos to make money on YouTube.

  • @annansm4293
    @annansm4293 2 months ago

    If my dataset contains images with different kinds of makeup, should I caption those too? Like natural makeup or full makeup?

  • @ArjenJongeling
    @ArjenJongeling 1 year ago

    15:26 I can't figure out what the name of the tool is you mention. Is there a link to it, or how do you spell it?

  • @fatenathan
    @fatenathan 11 months ago

    Thanks Olivio! Can you tell me one thing? I want to train a LoRA for SD 1.5 but also for XL. When I take a few pictures, how big should the canvas size be? Is it okay to go very high quality, like 2K resolution, and let the LoRA make the details out of it? Or is a higher resolution like 2140x1600 px worse?

  • @Dalroc
    @Dalroc 1 year ago +1

    Just hit Enter if the popup in the Booru tagger is blocked by the preview.
    You should've shown the full process of tagging one or two images, like what tags you removed and what tags you added.

  • @gohan2091
    @gohan2091 1 year ago +2

    I'm currently using low-resolution, low-quality images from Facebook of work colleagues (with their permission, of course) and making a LoRA. The results are pretty poor, but I'm using Roop with the 4x ultra upscaler and getting okay results, though nothing amazing like these with high-quality, high-resolution photos. A tip: you can see the metadata of a LoRA inside A1111, which tells you the settings used during training.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      Really? How do you see the metadata of the LoRA in A1111?

    • @gohan2091
      @gohan2091 1 year ago

      @@OlivioSarikas I'm using Vlad's A1111. When you view your list of LoRA thumbnails (where you click them to insert into the prompt box), there are various buttons at the bottom of each thumbnail, such as adding the LoRA to your favourites. One of the buttons reads the metadata, which shows you the training settings you used. I was looking at this last night. It may be exclusive to Vlad's. I'm not home at the moment to give you exact instructions for locating it, but it's definitely possible.

    • @Dalroc
      @Dalroc 1 year ago

      @@OlivioSarikas Just click the ( i ) in the top right corner of the LoRA in the A1111 GUI (or read it straight from the file; see the sketch at the end of this thread).

    • @Strangepaper
      @Strangepaper 1 year ago

      Use the roop-ed photos as additions to the training data!
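
      If you would rather read that training metadata outside the UI, here is a sketch using the safetensors library (the file path is a placeholder); kohya stores its settings as string metadata in the file header:

      ```python
      # Read the training metadata kohya embeds in a LoRA .safetensors
      # file; the path is a placeholder for your own LoRA.
      import json
      from safetensors import safe_open

      with safe_open("models/Lora/my_lora.safetensors", framework="pt") as f:
          meta = f.metadata() or {}

      # kohya uses "ss_*" keys, e.g. ss_network_dim, ss_network_alpha.
      print(json.dumps(meta, indent=2))
      ```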

  • @ProjectOfTheWeek
    @ProjectOfTheWeek 1 year ago

    Great video! We need a tutorial on training a brand logo and then being able to create illustrations with the logo. Thanks!

  • @4dee103
    @4dee103 1 year ago

    Thanks for a great video. I'm new to SD but wanted to give this a go... can I please ask what sizes all your photos for training are? Your square photos are 768x768? What about the full-body shots?

  • @simonevalle8369
    @simonevalle8369 1 year ago +1

    My Kohya GUI is totally different and I can't understand what I have to do to train my LoRA... every time I try to train it, I only get errors on the cmd page

  • @CrystalBreakfast
    @CrystalBreakfast 1 year ago +3

    With the tagging, people say to only tag the things that aren't intrinsic to the subject. So a lot of those (rather judgy) anime tags aren't appropriate because things like "thin lips" are intrinsic to what a "Betka" is. So by tagging "thin lips" what you're saying to the AI is it's a picture of "a Betka with thin lips," implying that a "Betka" normally doesn't have those lips. So you mainly tag things that aren't a permanent part of what makes a "Betka" a "Betka," then by process of elimination it learns what is a part of Betka and what isn't. Or at least, that's what I've come to understand from the advice I've read.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +3

      If that was the case, then it would not render anything that was tagged in the image afterwards, but that is not how it works. When you tag "short hair", the character does have short hair in the images unless you write a different hairstyle in the prompt; but if you don't write short hair, it will always have short hair. So a keyword or tag does not exclude things, it makes them a variable, as I said in the video. Tag the things you want to be able to change (see the caption example below).
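
      For illustration, a caption file following this advice might look like the line below (a plain-text file such as betka_01.txt sitting next to betka_01.png; the trigger word "betka" and the tags are hypothetical). Everything tagged becomes a variable you can change in the prompt, while untagged intrinsic features get absorbed into the trigger word:

      ```
      betka, short hair, white t-shirt, outdoors, smiling, looking at viewer
      ```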

  • @krystiankrysti1396
    @krystiankrysti1396 1 year ago +2

    I would advise against using a DSLR or a big-sensor camera: it will introduce bokeh, and you want the whole head in focus (shallow depth of field ruined my training because the ears and hair were out of focus, and it learned that). Phone photos are better, especially if you have multiple lenses and not just one wide angle. Try batch 4, 4 epochs, 27 repeats, 35 images, network at 200/200. For inference in the webui, use the LoRA at 0.8 and inpaint the face at 1.0 or 0.9 to bring out the likeness even more. With a LoRA it's hard to get likeness from one inference; you have to inpaint just the face at full power but without distortion from overtraining. It's better if 1.0 is overtrained and starts to distort.

    • @mattmarket5642
      @mattmarket5642 1 year ago +2

      Very bad advice. Larger sensor means more detail and higher quality. You’re right that you want the whole face in focus. Just don’t shoot the photos with too low of an f-stop. Smartphone cameras often distort faces.

    • @krystiankrysti1396
      @krystiankrysti1396 1 year ago

      @@Teo-iq4gi It's for the Kohya SS webui; 27 is the number of repeats. When you train a face, you should see overtraining at 1.0; at 0.6 there should be great stylization but weak likeness. So you generate at 0.6 and inpaint the face at 0.9 or 0.8 if you can; this way you get the best results. Get the ADetailer extension to inpaint and fix the face automatically

    • @PhilipRikoZen
      @PhilipRikoZen 1 year ago

      @@krystiankrysti1396 Sorry to ask you again: where is the "repeats" value in the Kohya SS webui? I can't find it. Thank you

    • @krystiankrysti1396
      @krystiankrysti1396 1 year ago

      @@PhilipRikoZen Dude, are you that lazy? So many days have passed and you still can't find it? It's in the Tools panel. Maybe spend like 10 minutes going over all the panels in the webui and read what they say. Do you want to learn it or be led by the hand?

    • @PhilipRikoZen
      @PhilipRikoZen 1 year ago

      @@krystiankrysti1396 So much anger. Two days ago was the first time I ever read your comment, so I have no idea what "so many days" you're talking about. Anyway, thank you; it was under "Deprecated", which my brain was ignoring because, well, deprecated

  • @CrynogarTM
    @CrynogarTM 7 days ago

    ESC or ALT+Y will do the trick on the save cancel request

  • @RolandT
    @RolandT 1 year ago

    And once again, many thanks for your great guide and the tips on Kohya and other models one could train on! I initially had trouble with the installation (wrong Python version and manual installation of accelerate), but after a few pip installs it finally worked. I went straight for a photo series with over 400 photos (with 5 epochs of 20 each) and am blown away by the result! It also never occurred to me before not to train it on SD 1.5. Until now I had only trained Hypernetworks with my own formula, which was relatively laborious, and the results were hit and miss. With Photon, the photos from my photographer now come out much more realistic. The tip about ADetailer is great too! Learned so much again, thanks for that! Now all that's missing is a tutorial on creating LoRA models for SDXL (I already tried, but I still get errors everywhere). 🙂

  • @randymonteith1660
    @randymonteith1660 1 year ago +1

    Is the " Booru Dataset Manager " a standalone program or an extension for A1111 or Kohya?

  • @markreiser4080
    @markreiser4080 1 year ago +1

    What about regularization images? Some say they are important, some don't even mention them.

  • @leolaxes
    @leolaxes 1 year ago

    How do I select from 4 epochs that look technically the same? They are so similar that distinguishing the difference is difficult. ADetailer essentially makes them all perfect

  • @JieTie
    @JieTie 1 year ago +1

    Maybe a vid on training a LoRA for XL?

  • @nayandhabarde
    @nayandhabarde 1 year ago +2

    @OlivioSarikas can you please create one detailed style-training guide with LoRA and Dreambooth? Which one is more suitable?

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +2

      I would maybe do that as an online course :)

  • @ChrlzMaraz
    @ChrlzMaraz 1 year ago

    Something that is never talked about regarding quality images is focal length. Vary your focal length! Some wide-angle close-ups, standard, and some telephoto portraits. In addition, vary your f-stop. Wide-angle images usually won't have bokeh; most telephoto photos will.

  • @Divamage
    @Divamage 1 year ago +1

    My Kohya SS has a missing module/library issue

  • @sigma_osprey
    @sigma_osprey 1 year ago +6

    Hi, Olivio. I'm new to this AI image generation thing. I want to learn how to do it. Do you have a structured tutorial or a set of videos that teaches how to go about this? Like from installing Stable Diffusion (not on a PC) to model installation and the best image prompts to produce realistic-looking human images. I hope I made sense there. Thanks.

  • @lordkhunlord9210
    @lordkhunlord9210 1 year ago +2

    Unless I'm missing something, how are we supposed to use BooruDatasetTagManager? Even the GitHub page doesn't say how we're supposed to install it

  • @temporaldeicide9558
    @temporaldeicide9558 1 year ago

    What about the regularisation folder? I hear that sometimes helps a lot, but I don't actually know what it is or what it does.

  • @PatchCornAdams723
    @PatchCornAdams723 1 year ago

    All I want is smutty images of Diana Burnwood from the 'Hitman' series. I have a high-end PC and I am computer savvy, I just trip up on the GitHub stuff.

  • @Foloex
    @Foloex 1 year ago +1

    I wish there was a tutorial out there for training things other than people, for example: different sports (martial arts, ping pong, dance moves, gymnastics, pole vaulting...), hugging, massage, meditation poses, playing cards, ... I tried to train simple concepts like holding: "boy dressed as a magician, holding a rabbit", "woman holding a baby", "girl holding a cat"... So far I can't get consistent results and the relation " " doesn't seem to be understood. If anyone could give me pointers on that matter, I would appreciate it.

    • @hcfgaming401
      @hcfgaming401 1 year ago

      Leaving a comment on the off chance this gets a reply eventually.

  • @alectriciti
    @alectriciti 1 year ago

    It would be neat if you did a video on the Dreambooth extension for A1111. Personally, I've had several issues getting Kohya working even after different types of installs; it ended up just being a big headache, whereas the Dreambooth extension just works. Though they share similar concepts, I think it would still be helpful for a lot of people. Anyway, thanks for the video m8!

    • @greatjensen
      @greatjensen 1 year ago +1

      Agree with this. I can't get Kohya to work either.

  • @fernando749845
    @fernando749845 1 year ago +1

    Is your audio out of sync or is it my system?

  • @pb3d
    @pb3d 1 year ago +1

    16:49 If you're stuck on this, just hit Enter; that will close the popup

  • @kayinsho
    @kayinsho 11 months ago

    Can you use this on a Mac M1?

  • @ZombieAben
    @ZombieAben 1 year ago

    Training will take multiple days on my Nvidia 1660 if done with the proposed repetitions and epochs. Is there anything I can do, or do I need a better graphics card like a 2080 Ti or above? Maybe my source images have too high a resolution and I need to crop, resize, or do both. Will that speed up the process?

    • @HollySwanson
      @HollySwanson 1 year ago

      I have a 4060 Ti 16GB and it takes 30-60 minutes per model if you want it hyper-realistic. I think there are some Christmas deals on Amazon for 3060s and 4060 Tis

  • @sikliztailbunch
    @sikliztailbunch 1 year ago

    Sadly, every tutorial is either about training on a specific person or a general art style. I want to train on scorpions, and they turn out bad.
    I also have trouble with Kohya. It won't train; it gives me an error in the console, and I haven't found any fix for that. So I train with Dreambooth, but my LoRAs don't even change the image. They do literally nothing. I've watched countless tutorials and read through all the docs, but I think I'm doing it wrong anyway.
    I found a small tool called NMKD which lets me train on SD 2.0. It works, kinda, but it only gets me dead, disfigured scorpions. Also, I want scorpions in SDXL; SD 2.0 is too weak overall.

  • @cleverestx
    @cleverestx 1 year ago

    Can someone help? I'm getting this when I click to train after following this video: "No data found. Please verify arguments (train_data_dir must be the parent of folders with images)"

  • @Dmitrii-q6p
    @Dmitrii-q6p 1 year ago

    Is it better to remove the background or not?
    Paint 3D can do it successfully in one click.

  • @Lenpreston2
    @Lenpreston2 1 year ago

    Fascinating video

  • @TheGarugc
    @TheGarugc 1 year ago

    Don't you use regularization images for training?

  • @walidflux
    @walidflux 1 year ago

    In Kohya SS there is an option to convert a model to a LoRA. Can you dive into that, please?

  • @TanvirsTechTalk
    @TanvirsTechTalk 1 year ago

    What is your Discord channel?

  • @Sbill92085
    @Sbill92085 1 year ago

    Does this process work the same with SDXL?

  • @pb3d
    @pb3d 1 year ago +1

    What are your thoughts on regularisation images?

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      That can certainly help, but I haven't experimented with it too much. It's good for putting keywords on things that don't work, so you can then put those keywords into your negative prompt

  • @matthewmounsey-wood5299
    @matthewmounsey-wood5299 1 year ago

    🙌❤

  • @testales
    @testales 1 year ago

    You put so much effort into details, but in the end you are training at 512px or 768px, which means these images will be downscaled accordingly right before the actual processing occurs. So it doesn't matter if you provide super nice 4K images or just 768px right from the start. In fact, it might be better to downscale the images to the exact training resolution yourself; that way you can at least choose the downscaling method and see what the LoRA will actually see and learn (see the sketch at the end of this thread). Btw, for the 1.5 models 1024px is no problem if your VRAM can handle it, though it may not work with just a few images.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +1

      I would have to try that, but you will still get a better 768 image from a very sharp image than from an image that lacks detail. Also, you will see in upscaling that LoRAs and checkpoints trained on larger images give a much better, more highly detailed result

    • @testales
      @testales 1 year ago

      @@Teo-iq4gi Recent implementations use buckets, where a bucket is a container for images of a specific size. In the first step, your dataset is analyzed and a number of buckets are created to suit it, with the biggest possible bucket being size x size, hence 1024x1024 if you set that as your training size. If you don't have square images, a smaller bucket will be used, such as 1024x768. All your images are put into the bucket they fit best. How the bucket sizes are calculated, and therefore how many buckets there will be, depends on the algorithm. Either way, every image that exceeds the training resolution will be scaled down to fit in one of the biggest buckets. So to my understanding, there is no advantage in providing images above the training resolution.
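
      A minimal sketch of that do-it-yourself downscale, assuming Pillow, a square 768px training target, and placeholder folder names; ImageOps.fit center-crops to the target aspect before resizing, and you can swap in whichever resampling filter you prefer:

      ```python
      # Downscale a dataset to the training resolution yourself, so you
      # control the resampling method and can inspect what the trainer sees.
      from pathlib import Path
      from PIL import Image, ImageOps

      SRC, DST, SIZE = Path("dataset_raw"), Path("dataset_768"), 768

      DST.mkdir(exist_ok=True)
      for path in SRC.glob("*.png"):
          with Image.open(path) as img:
              # LANCZOS tends to keep fine detail better than bicubic.
              ImageOps.fit(img, (SIZE, SIZE), Image.LANCZOS).save(DST / path.name)
      ```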

  • @camilovallejo5024
    @camilovallejo5024 1 year ago

    Somehow I don't get the white button you used to select the model... Like really? 😑

    • @Jaysunn
      @Jaysunn 1 year ago

      Which point in the video are you referring to?

  • @seifergunblade9857
    @seifergunblade9857 1 year ago

    Can a laptop RTX 3060 with 6GB VRAM be used for training a LoRA?

  • @nolimit7582
    @nolimit7582 1 year ago +1

    Why LoRA? What about LyCORIS?

  • @gohan2091
    @gohan2091 1 year ago

    The image folder is called "bf16, network 256 alpha 128", but in your training settings you are using fp16, network at 8, and alpha at 1, so I am very confused about why you named your folder like that

    • @PhilipRikoZen
      @PhilipRikoZen 1 year ago +1

      He chose the name of the folder by manually creating it in Windows and typing the text you see; in the tutorial he doesn't actually create that folder, nor does it come from Kohya. In the examples at the end, when generating pictures, he's actually using the LoRA he created before the video, which was trained with those values: bf16 and higher network and alpha. Long story short, if your computer can, try generating with a much higher network and alpha than the default 8 and 1.

    • @gohan2091
      @gohan2091 1 year ago +1

      @@PhilipRikoZen I have a 4090 and used alpha and network at like 64/64 and 128/64 with Kohya, and when it generates I get only black images, even when I lower the weight; but at 1 and 8 it's fine. Any idea?

    • @thebrokenglasskids5196
      @thebrokenglasskids5196 1 year ago +1

      @@gohan2091 Using settings that high for Network Alpha will often trigger the NaN error while training your LoRA. I would suggest lowering it to something like 16 and testing. If your LoRA works in SD, then increase it in Kohya and keep doing so until you get the NaN error. You'll know the error is happening by watching the "loss=x.x" value during your training. If at any point it changes to "loss=nan", then your setting is too high and you've errored your LoRA into one that will only render NaN in SD (that's why you get nothing but a black image).
      The Network Rank is fine to keep at 128. In fact you should, as lowering it lowers the file size and the quality of the resulting LoRA will diminish. The Network Alpha is what triggers the NaN error, so that's the one to lower to fix it.
      For reference, I have an RTX 3060 12GB and usually train with Network Rank 128 and Network Alpha 16 for character LoRAs, using a dataset of 50 images at 24 steps per image and 3 epochs (see the sketch after this thread for where those two settings live).
      Hope that helps.

    • @gohan2091
      @gohan2091 1 year ago +1

      @@thebrokenglasskids5196 Thanks. I think I used values 1 and 8 in my last LoRA. The results aren't great, but it kind of works. This seems like a guessing game: picking numbers at random and seeing what works without any real understanding of what's going on lol
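
      For reference, a sketch of where those two settings land if you call kohya's sd-scripts trainer directly; the paths are placeholders, and the GUI builds a similar command behind the scenes:

      ```python
      # How Network Rank / Network Alpha map onto kohya sd-scripts
      # arguments; run from the sd-scripts folder, paths are placeholders.
      import subprocess

      subprocess.run([
          "accelerate", "launch", "train_network.py",
          "--pretrained_model_name_or_path", "models/photon.safetensors",
          "--train_data_dir", "training/img",
          "--output_dir", "training/model",
          "--network_module", "networks.lora",
          "--network_dim", "128",   # Network Rank: keep high for quality
          "--network_alpha", "16",  # lower this first if loss goes to nan
          "--resolution", "768,768",
          "--mixed_precision", "fp16",
      ], check=True)
      ```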

  • @RenoRivsan
    @RenoRivsan 1 year ago

    This guy is a pro, BUT he makes everything so complicated! Dang!!

    • @RenoRivsan
      @RenoRivsan 1 year ago

      *meaning he doesn't get to the point and keeps adding more topics

  • @weatoris
    @weatoris 1 year ago

    Is hyper realistic more realistic than realistic? 🤔

  • @Aks15314
    @Aks15314 1 year ago

    Can anyone help? Can I train a LoRA on Colab Stable Diffusion?

  • @bilybob-c4p
    @bilybob-c4p 1 year ago

    Oh, I thought the opposite; I thought you wanted to get every angle, every lighting condition etc...

  • @sin2pie432
    @sin2pie432 11 months ago

    Why are you using LoRA if you don't need an 8-bit transformation? What pipeline? LoRA is for training scenarios that are too resource-intensive for your training environment. LoRA will not improve results in any scenario; rather the opposite, in many use cases it is not lossless. I spend too much time writing code and forget people are actually using this in the real world.

  • @360travels9
    @360travels9 1 year ago

    Is there something better than Roop for face swaps at higher resolution?

    • @ramn_
      @ramn_ 1 year ago

      You need to pay.

  • @dannous
    @dannous 1 year ago

    Basically you need to use the same pictures you use for your passport or the green card lottery :D

  • @metamon2704
    @metamon2704 1 year ago

    Unfortunately, there is already a new version of the UI and several things have changed; for example, the ability to click the icon to select another model is not there.

  • @gulfblue
    @gulfblue 1 year ago

    How do you find Stable Diffusion's trained faces based on your original input photos? (28:41, where are you getting this comparison?)

  • @sinayagubi8805
    @sinayagubi8805 1 year ago

    If this wasn't the right tutorial for you, watch my tutorial on how to make a LoRA with just one image of your back with your camera off 👍

  • @sizlax
    @sizlax 3 months ago

    If the model is bad at producing images of the character looking to the side, or up/down, shouldn't you be encouraging people to train their LoRAs with more images of the person looking in those directions? I mean, unless people only want to produce boring images of the character standing there and looking directly at the camera. But that kind of negates the point of image generation.

    • @OlivioSarikas
      @OlivioSarikas  3 months ago

      Try it. But this is 1.5, and the LoRA got confused by the face details if the face was visible from all angles; the face would then look a lot less like the person, especially in the small details

  • @mohegyux4072
    @mohegyux4072 11 months ago

    23:30 I think you're wrong here

  • @nwkproductions01
    @nwkproductions01 1 year ago +1

    Maybe I'm first for the first time in my life? 🥹

  • @willpulier
    @willpulier 1 year ago +1

    How about SDXL?

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +1

      There isn't even a good A1111 implementation for it yet to render images ;)

  • @adelalfusail1821
    @adelalfusail1821 1 year ago +1

    Man, I got disappointed hearing you say "experiment with that" throughout the video. Since you already did, why didn't you share it with us?

  • @SaxophoneChihuahua
    @SaxophoneChihuahua 1 year ago

    LyCORIS is better than LoRA

  • @pastuh
    @pastuh 1 year ago +2

    Currently, the trend is to create LoRA "sliders".
    It would improve your visitor numbers if you could create a tutorial on how to make them.