Let me know which Flux Dev model you are using the most and what GPU you have.
I am using FP16 since it just works, like the more recent SDXL and Pony models.
I have tried FP8 models but quality and consistency are a bit off. I may be doing something wrong.
Even with Forge this is a lot more complex than running SDXL models, and I am leaning towards using Flux models that have VAE/CLIP baked in to avoid error messages.
Honestly, I tried all the Flux models on ComfyUI and did not notice any significant difference in image generation speed. I have an RTX 4070 Ti with 12 GB VRAM and 32 GB of RAM, but any Flux model runs slowly on ComfyUI.
I have a different graphics card than most. It is an Nvidia Quadro RTX 4000 8GB, with 32GB system RAM. I get very similar speeds to those in the video. NF4 is the fastest, but I have been using Q4 with a built-in Hyper LoRA for 8 steps. Leads right into your next video. Models that have that Hyper LoRA cooked in just seem to work better in Forge than running the LoRA separately... at least for me.
@@jones77wrx If you can run FP16, it's definitely the better choice. Personally I like using FP8, but I've been using the Q8 GGUF more lately since it's closer to FP16.
For your GPU you may as well run the FP8 or Q8 GGUF. I think the NF4 model is better for people with 12GB cards or less.
I'm using Flux Dev FP8 with an Nvidia RTX 2070 8GB VRAM and Forge, and it works great.
Awesome! Curious what speeds you are getting for 1024x1024 at 20 steps?
@@MonzonMedia I just did the stopwatch test for you: 1:29 minutes (1024x1024, 20 Steps, Euler/Simple).
That's OK for me, I'm a very patient person. The main thing for me is that it runs very stably. 🙂
@@armauploads1034 That's not bad at all and the good thing with Flux is that it's more coherent for most things so if you have a good prompt you like, it doesn't take too many images to get it right.
Thanks for making everything so clear. I was VERY confused about those formats. You did a great job! 👍
Late reply but you're welcome and thank you!
Thank you for helping me find the GGUF text encoders!!
You're welcome! I wish there would just be one central place for everything but then again, the open source community is big so....it's a blessing and a curse. 😊
This is the comparison we needed. Thank you so much.🤩
You’re welcome 😊 Glad it was helpful!
Man, image examples on your comparison are so good!😍
Hey bro! Thanks man...more to come!
Bro damn that’s so helpful I didn’t know about the loras!! Thanks man
You're welcome! I just posted the vid earlier today! ruclips.net/video/L0pRdKQSNcM/видео.htmlsi=mggZOx6NSleCiMDM Also if you check the description there is a link to an 8 step NF4 Hyper Checkpoint. I just saw it after making the video on the loras....figures! hahaha!
Thank you, good and useful comparison
Glad it was helpful! And you're so welcome!
best explanation. thanks
You are welcome!
Thank you, I really appreciate your work, I’m not a big fan of comfyui
You're welcome! I'd always choose Forge over Comfyui but we're still waiting for controlnet support. Hopefully it comes soon.
I don't have the VAE/text encoder bar at the top. How do I add that to my interface?
Do you have the latest Forge installed? It should be on by default.
@@MonzonMedia I forgot to run the update file first lol
@@HDLEGSHOW so you’re good to go? There should be another update coming soon to use ControlNet for flux. Hopefully this coming week or next week.
@@MonzonMedia My only issue now is speed. I unfortunately only have 4GB of VRAM, so I think any settings I use will be very slow for Flux.
With Forge UI I can run Dev FP8 with only 6GB VRAM. It takes 1:50 minutes for 20 steps on Euler/Simple, and I can go as high as 1440x1080, which takes 3:30 minutes. WAY FASTER than Comfy!!! GO FORGE!!!! And in my experience, Q8 is NOT better than FP8 because it's slower with the SAME quality, so......
Great video, but how did you get those speeds on an 8GB card? I have a laptop 3070 with 8GB VRAM and 32GB RAM, and I generated one image with FP8 in over 4 minutes. I used the settings you showed in the video.
Yeah that's way too long for your card. The only thing I can think of is that I'm using SSD drives to store my models and also to run Forge. Also make sure you are not using CPU for the swap location.
@@MonzonMedia All of my models are in the same drive with Forge. I am using shared in the swap location. What should I use in the swap method? I have tried Flux NF4, NF4 v2, fp8 and even the 3 and 4 bit gguf models. The fastest generation I had was 4 minutes and 27 seconds with the NF4 v2 model. Anything else I could try?
I've recently come back to Forge, chose Dev-NF4, and it's okay. Just keep the CFG scale at 1 and sampling steps at 20 and it will make a coherent image. The downside is that the images always seem to have large skin pores or grain, even when you use an upscaler. I found that you can use the NF4 in combo with PNY models and it really speeds up my 1024x1024 generations. So I'm holding off on Flux until the grainy, large-pore image issues are ironed out.
I've noticed that as well with some images, even with FP8. You could use another model like SDXL when upscaling, but set the denoise really low so it doesn't change the details too much.
@@MonzonMedia True, and it only really works well on Euler right now. I tried DPM2 and the DPM++ variants and they create incomprehensible noise even with recommended settings. Plus, FLUX doesn't allow negative prompting. What's up with that? I'll just wait until a Pony-ish, highly refined model comes along. I'll say this about FLUX: it's good, accurate even, but a bit rudimentary IMO.
@@4thObserver Pony have announced that their next model will be based on AuraFlow. I don't know if Forge already supports that, but others do.
Nf4 is fine quality wise. You might want to try Loras for more photo realistic images. With SD so and prior it also took a while for realistic skin textures..
@@4thObserver You can get negative prompts enabled for FLUX in Forge by setting CFG to 1.1
Just getting into flux. I have flux schnell FP8 1024x1024 4 step running in 2.58 seconds on my 4090.
Nice! If you're running a 4090 you should be using the Full Dev model, which is the FP16, or at the very least FP8.
Hi! I still use Fooocus and learned it from your earlier videos some time ago, before you moved more to Suno and Udio. Is it possible to use Flux in Fooocus (not Ruined Fooocus) on a GTX 1080 8GB?
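For anyone comparing the numbers in this thread, here is a rough sketch that turns the reported times into seconds per step. The figures are the ones posted in the comments above (RTX 2070 8GB with Dev FP8, the 6GB card with Dev FP8 on Forge, and the 4090 with Schnell FP8). The helper names `to_seconds` and `seconds_per_step` are made up for illustration, and this is not a like-for-like benchmark since the models, resolutions and UIs differ.

```python
def to_seconds(elapsed: str) -> float:
    """Convert 'm:ss' (or plain seconds such as '2.58') to seconds."""
    if ":" in elapsed:
        minutes, seconds = elapsed.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(elapsed)


def seconds_per_step(elapsed: str, steps: int) -> float:
    """Average time per sampling step for one reported generation."""
    return to_seconds(elapsed) / steps


# Timings as reported by commenters in this thread (not re-measured here).
reports = [
    ("RTX 2070 8GB, Dev FP8, Forge", "1:29", 20),
    ("6GB VRAM, Dev FP8, Forge", "1:50", 20),
    ("RTX 4090, Schnell FP8", "2.58", 4),
]

for label, elapsed, steps in reports:
    print(f"{label}: {seconds_per_step(elapsed, steps):.2f} s/step")
```

Seconds per step is only a rough yardstick, but it makes it easier to spot a setup that is far slower than others with similar hardware.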
Unfortunately not and most likely it won't happen. Fooocus is coded for SDXL, it would take a lot of work to add Flux as the architecture is different. But who knows? Perhaps the developer (who is the same one for Forge) might consider making a version just for Flux.
@@MonzonMedia Thank you. Is WebUI the easiest interface to run it?
I see that WebUI is quite user friendly, also prepared like Fooocus by lllyasviel. I will give it a go! Would you consider making a film about setting up Flux in WebUI Forge?
Hi there. Just a quick question: do I need to download the encoder and safetensor files to run GGUF, or can the safetensor just be used standalone?
The GGUF files can go into your checkpoints folder, just like any other model; however, you have to use the T5 text encoder, Clip_l and VAE. The first generation will take some time to load but the next one will be faster. Hope that helps.
I just installed Pinokio and flux-webui for the first time. I was hoping I could download and run the Juggernaut XL model on 16GB of RAM and 6GB of VRAM. Is it possible? I can't find requirements for these models anywhere.
You should be fine, although 16GB of system RAM could be a bottleneck for some tasks. Juggernaut is an SDXL model which is under 7GB.
@@MonzonMedia Would you mind sharing which tasks those could be? Isn't this just image generation? Also, if a model is under 7GB in size, what does that mean exactly? That the GPU can run it? I'm new to all this so that's why I ask, thanks
Flux and PonyXL restart my PC. I am using an RTX 4060 with 8GB VRAM and 16GB RAM, using Forge. Any idea what's wrong?
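To make that concrete, here is a small sketch of a check for the three companion files mentioned in the reply above. The file names are hypothetical placeholders, not the actual names from the Google doc, and matching on keywords in the file name is only an assumption about how such files tend to be named.

```python
# Components a GGUF Flux model needs alongside it, per the reply above.
# The keywords are an assumed naming convention, not a Forge rule.
REQUIRED = {
    "T5 text encoder": "t5",
    "Clip_l": "clip_l",
    "VAE": "vae",
}


def missing_components(selected_files):
    """Return the required components with no matching file selected."""
    names = [name.lower() for name in selected_files]
    return [
        component
        for component, keyword in REQUIRED.items()
        if not any(keyword in name for name in names)
    ]


# Hypothetical file names, for illustration only.
selected = [
    "example-flux-q4.gguf",
    "example-clip_l.safetensors",
    "example-vae.safetensors",
]
print(missing_components(selected))
```

If the list comes back non-empty, those are the pieces still to download and select before generating.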
Hi, can we manipulate the GPU on Flux? I'm using an old mining GPU on A1111 and it runs,
but when I try Forge it detects my P106 as is, so it doesn't work and uses the CPU instead. Any solution other than buying a new one?
Best to ask in the discussion page on GitHub. Not sure to be honest. github.com/lllyasviel/stable-diffusion-webui-forge/discussions
I have an RTX 4060 Ti 16GB. Can I do ComfyUI Live Portrait with this GPU?
I'm still hoping to use Flux, but so far I just get errors. I have an M1 chip and I use Forge.
I can't seem to find FP16 anywhere
Original unet huggingface.co/black-forest-labs/FLUX.1-dev/tree/main, GGUF version huggingface.co/city96/FLUX.1-dev-gguf/tree/main. Both need to be used with text encoders and VAE.
All my renders are coming out blurry, no matter which checkpoint I use
@@b.radical can’t help you man if I have no details. Settings? Size? Context?
@@MonzonMedia Rebooting fixed it, I guess I had gunk in my VRAM
Can I run this on my Android phone?
I don't think so.
Why not? I actually run it on my pager.
Please share link for vram 4GB
All the files are listed in the Google doc in the description. For your card you can try the lower Q GGUF models. Also download the T5 text encoder that is a GGUF file as well.
@@MonzonMedia Which folder is the T5 stored in?
@@fullmatchfullhighlight see 6:05 👍
I need a PC! lolll
Yeah that would help! hahaha!
dev, schnell, nf4, fp8, gguf, vae, clip... this simpleton is confused
It's a lot to take in if you are new to all this. I'm preparing a beginners video very soon.
@@MonzonMedia Subbing based on this reply. I've been doing this since the first models were released and it's still confusing AF. Then if you ask the interweb why you get a CLIP error, I think the reply is in actual computer code LOL. Using Pinokio helps a lot but its simplicity actually introduces other difficulties (i.e. the CLIP error.) It would be cool if your tut included file folder management because it seems like every time a new model comes out, it sets up a new folder structure so who knows how many 10-25GB unused models we have sitting around on our hard drives. Maybe none? I don't know.
GTX 1060, mod q4.gguf, 720x720: 13 min (((