2X SPEED BOOST for SDUI | TensorRT/Stable Diffusion Full Guide | AUTOMATIC1111

  • Published: 19 Jun 2024
  • Nvidia's TensorRT is a brand new extension for Stable Diffusion that boosts the performance of RTX graphics cards in Automatic1111's Stable Diffusion WebUI by 100% or more. You can optimize models and LoRAs to get far more performance out of the same hardware: from tens of seconds down to just a few! This is insanely powerful and definitely worth the effort compared to what it was before.
    TensorRT Article: nvidia.custhelp.com/app/answe...
    Nvidia Driver: www.nvidia.com/download/index...
    TensorRT Install Path: github.com/NVIDIA/Stable-Diff...
    Fix No Entry error: github.com/NVIDIA/Stable-Diff...
    Install + Complete Crash Course for Stable Diffusion: • COMPLETE CRASH COURSE ...
    Timestamps:
    0:00 - Intro/Explanation
    0:33 - Requirements
    2:30 - Installing TensorRT for Automatic1111 SDUI
    4:30 - Errors? It should be okay
    5:50 - Preparing Default TensorRT Engine
    7:00 - Different sizes & settings for TensorRT Engine
    8:10 - Customizing TensorRT Engine models
    9:10 - Activate TensorRT unet
    10:15 - TensorRT Benchmark in SDUI (20 it/s)
    11:13 - TensorRT vs normal model (11 it/s)
    11:48 - Lora with TensorRT (HUGE improvement!)
    13:00 - Optimizing Lora with TensorRT
    15:15 - Is TensorRT worth it? YES!
    15:30 - File size & Drive space
    16:02 - Final notes
    #StableDiffusion #AI #TensorRT
    -----------------------------
    💸 Found this useful? Help me make more! Support me by becoming a member: / @troublechute
    -----------------------------
    💸 Support me on Patreon: / troublechute
    💸 Direct donations via Ko-Fi: ko-fi.com/TCNOco
    💬 Discuss the video & Suggest (Discord): s.tcno.co/Discord
    👉 Game guides & Simple tips: / troublechutebasics
    🌐 Website: tcno.co
    📧 Need voiceovers done? Business query? Contact my business email: TroubleChute (at) tcno.co
    -----------------------------
    🎨 My Themes & Windows Skins: hub.tcno.co/faq/my-windows/
    👨‍💻 Software I use: hub.tcno.co/faq/my-software/
    ➡️ My Setup: hub.tcno.co/faq/my-hardware/
    🖥️ My Current Hardware:
    Intel i9-13900k - amzn.to/42xQuI1
    GIGABYTE Z790 AORUS Master - amzn.to/3nHuBHx
    G.Skill RipJaws 2x(2x32G) [128GB] - amzn.to/42cilxN
    Corsair H150i 360mm AIO - amzn.to/42cznvP
    MSI 3080Ti Gaming X Trio - amzn.to/3pdnLdb
    Corsair 1000W RM1000i - amzn.to/42gOTGY
    Corsair MP600 PRO XT 2TB - amzn.to/3NSvwzx
    🎙️ My Current Mic/Recording Gear:
    Shure SM7B - amzn.to/3nDGYo1
    Audient iD14 - amzn.to/3pgf2XK
    dbx 286s - amzn.to/3VNaq7O
    Triton Audio FetHead - amzn.to/3pdjIgZ
    Everything in this video is my personal opinion and experience and should not be considered professional advice. Always do your own research and ensure what you're doing is safe.

Comments • 69

  • @vos72
    @vos72 4 months ago +3

    I just wanted to say that I got TensorRT installed in stablediffusion, and WOW WOW WOW what a difference it makes. Your instructions were crystal clear and I noticed a -significant- increase in it/s. I'm getting above 30it/s now on my 3090 Ti (w/ 24G RAM). Glad I can now better use that beast under the hood. WOW. Thanks!

  • @higon99
    @higon99 5 months ago

    Thank you for the clear instructions. In its current state, I just had to 'pip install polygraphy importlib_metadata' before installing the extension on the a1111 dev branch.
    It's working for me, with the caveat that it doesn't load any LoRA from the lycoris folder at all.
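
A quick way to check for the two modules named in the comment above before installing the extension is sketched below. It is only an illustration: the helper name `missing_modules` is made up here, and the script has to be run with the Python interpreter from the WebUI's venv to be meaningful.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the comment above; nothing else is assumed to be required.
    needed = ["polygraphy", "importlib_metadata"]
    missing = missing_modules(needed)
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try: pip install " + " ".join(missing))
    else:
        print("Both modules are importable.")
```

An empty list means both packages are already present in that environment.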

  • @DeViciousOfficial
    @DeViciousOfficial 7 months ago +11

    I don't want to be that guy, but I am going to be that guy... This works, your video is fantastic and you are doing a great job. However, TensorRT comes completely without safety guard rails for your card: it just keeps maxing out the card uncontrollably and causes it to overheat. People with RTX 2xxx cards won't run into issues, but if you have a 3090 or 4090 and have run into black screens / max fan speed before, you will almost certainly run into this issue. I reproduced it on 3 rigs with 3090s and 4090s, all 3 of which have masterful cooling systems. Maxing out these cards is no joke; this can cause serious damage. I'd sit out round one until this is fixed if you run an XX90 card. Image generation isn't slow for you anyway, upscaling is.

    • @3d_visuals__motion
      @3d_visuals__motion 7 months ago +2

      Yes, it did that to my 3090 initially. I have now dropped the GPU power to 70% and it's working without any serious overheating issues. I have run it constantly for more than an hour of image rerolls and my GPU never went above 65 degrees. Let me know if this helps.

    • @DeViciousOfficial
      @DeViciousOfficial 7 months ago

      @3d_visuals__motion Oh yeah, sure, I know how to prevent it, thanks. I actually went back up to maximum power and started cooling my case with a fan, which is the cheapest and most efficient cooling system I have ever had 😀

    • @valter987
      @valter987 7 months ago

      Should I be worried about my 3060?

    • @DeViciousOfficial
      @DeViciousOfficial 7 months ago

      @valter987 No need to worry if you have never run into overheating issues before, where the PC keeps running and the fans go to 100% but the screen goes black. Keep an eye on the temperature. You should be fine, that's mostly a 3090 problem.

    • @petec737
      @petec737 7 months ago

      Imagine thinking your card breaks just because you see that usage jump to 100% lol..
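
On the 70% power setting mentioned earlier in this thread: the commenter does not say which tool was used. Tuning utilities usually expose the limit as a percentage, while `nvidia-smi -pl` expects an absolute value in watts (and needs administrator rights). The sketch below only does the conversion; the 350 W default is a placeholder assumption, so read the real default for your card from `nvidia-smi -q -d POWER` first.

```python
def power_limit_watts(default_watts, fraction):
    """Convert a fractional power target (0.7 = 70%) into whole watts."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    return round(default_watts * fraction)


if __name__ == "__main__":
    default_watts = 350  # placeholder assumption, not read from the GPU
    target = power_limit_watts(default_watts, 0.70)
    # Print the command rather than running it, since it changes hardware settings.
    print(f"nvidia-smi -pl {target}")
```

The driver rejects values outside the card's supported range, so treat the result as a starting point rather than a recommended setting.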

  • @orianonicolau6253
    @orianonicolau6253 7 months ago +3

    Thank you for the tutorial! Maybe you can help me: I'm getting this message when generating, and render times are getting terribly slow: "CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization." How do I activate CUDA lazy loading? Thanks!
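
For the lazy loading question above: CUDA lazy loading is controlled by the `CUDA_MODULE_LOADING` environment variable (CUDA 11.7 and later), and it has to be set before CUDA is initialized in the process. A minimal sketch follows; the helper name is made up, and where exactly to set the variable in an Automatic1111 install (for example a `set CUDA_MODULE_LOADING=LAZY` line in the launch script on Windows) is an assumption rather than something the video confirms.

```python
import os


def enable_cuda_lazy_loading(env=None):
    """Set CUDA_MODULE_LOADING=LAZY unless a value was already chosen.

    Returns the value that is in effect afterwards.
    """
    env = os.environ if env is None else env
    env.setdefault("CUDA_MODULE_LOADING", "LAZY")
    return env["CUDA_MODULE_LOADING"]


if __name__ == "__main__":
    # Must run before anything initializes CUDA (e.g. before importing torch).
    print("CUDA_MODULE_LOADING =", enable_cuda_lazy_loading())
```

The warning by itself may not be the cause of the slow renders, so it is worth checking the rest of the terminal output as well.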

  • @ferluisch
    @ferluisch 6 months ago +1

    I got about x3.5 speed up with my 2080, from 2.1it/s to 7.5it/s. Such a huge boost!
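
The arithmetic behind figures like the ones above can be checked with a few lines. The it/s values are the ones quoted in the comment; the 20-step run is an illustrative assumption, and the estimate ignores fixed per-image overhead such as model loading and VAE decoding.

```python
def speedup(before_its, after_its):
    """Ratio of iteration rates (it/s) after vs. before optimization."""
    return after_its / before_its


def seconds_for_steps(steps, its):
    """Sampling time for a number of steps at a given it/s rate."""
    return steps / its


if __name__ == "__main__":
    before, after = 2.1, 7.5  # it/s, from the comment above
    steps = 20                # assumed step count, for illustration only
    print(f"speedup: {speedup(before, after):.2f}x")
    print(f"before:  {seconds_for_steps(steps, before):.2f} s for {steps} steps")
    print(f"after:   {seconds_for_steps(steps, after):.2f} s for {steps} steps")
```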

  • @YakaBita
    @YakaBita 8 months ago +6

    i wish we had upscaler presets for 2x, 4x with similar tensorRT speed boost

  • @christianblinde
    @christianblinde 8 months ago +1

    Very Nice, Thank you. Would be great if there will be something similar for ComfyUI

    • @Rimbo28
      @Rimbo28 6 months ago

      Hey men... do you have any multicontrolnet workflow that works ?

  • @imresomodi4961
    @imresomodi4961 8 months ago

    You used an SDXL LoRA for SD 1.5. ;) Good video, thx

  • @ICE0124
    @ICE0124 5 months ago +4

    If anyone else still gets errors after reinstall run these commands
    To run them go into the auto1111 stable diffusion root folder and then in the path bar type "cmd" with no quotation marks.
    Or go and copy the path and then open command prompt and then type "cd PathHere"
    then run these commands:
    venv\Scripts\activate
    python -m pip uninstall -y nvidia-cudnn-cu11
    then open the web ui again and hope that fixed it

    • @ThePolyakovv
      @ThePolyakovv 5 months ago

      When I try to create the default model I get "Failed to parse ONNX model." This is on a clean Automatic1111 install. What could it be? Everything was fine before, following this guide.
      UPD: remove the --medvram or --lowram args and it works!

    • @12uniflew
      @12uniflew 2 months ago +1

      God Bless you kind sir/ma'am!

  • @waltervolbers3443
    @waltervolbers3443 7 months ago

    great, thanks for explaining,
    is now faster

  • @wingofwinter888
    @wingofwinter888 7 months ago +3

    Sadly it doesn't work with ControlNet on my PC, and it also gives me an error with ReActor. It's a huge boost in speed; I'm praying NVIDIA keeps ironing out the errors and makes it more compatible with other modules.
    I'm OK with converting the checkpoint, it doesn't take too long.
    2 GB is less than a 4K movie, so I wouldn't call it a negative, because the speed boost is really huge.

  • @puzzles626
    @puzzles626 8 months ago

    Ey its Dimitri from csgo surf! Keep up the good work my dude

    • @TroubleChute
      @TroubleChute 8 months ago +1

      Wasn't expecting you here

  • @BrunoMartinho
    @BrunoMartinho 7 months ago

    Is it possible to train TensorRT at high resolutions? I get an error; I was going for 870x1305.

  • @Painjusu
    @Painjusu 8 months ago +1

    Can't wait for my 4090 next month, god.

  • @substandard649
    @substandard649 8 months ago +2

    Thanks for the tutorial, does this work with hires fix? What about controlnet?

    • @Painjusu
      @Painjusu 8 months ago +1

      This is for overall generation lol.

  • @ThatGuyNamedBender
    @ThatGuyNamedBender 3 months ago

    I built the default engine but when I render at anything other than 512 or if I go 512 then hires fix to a slightly higher res the rendering fails. With the highres fix it does the standard steps but fails when doing the highres fix steps. Any ideas?
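
One way to reason about failures like the one above is that a TensorRT engine is built for a fixed shape or a bounded range of shapes, so a hires fix pass at a larger size can fall outside what the engine accepts. The sketch below is a hypothetical model of that check, not the extension's actual data structure: the class name, its fields, and the multiple-of-64 rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineProfile:
    """Hypothetical description of the image sizes one engine was built for."""
    min_side: int
    max_side: int
    step: int = 64  # assumption: sides must be a multiple of this value

    def supports(self, width, height):
        """True if both sides are inside the range and aligned to the step."""
        return all(
            self.min_side <= side <= self.max_side and side % self.step == 0
            for side in (width, height)
        )


if __name__ == "__main__":
    static_512 = EngineProfile(min_side=512, max_side=512)
    dynamic = EngineProfile(min_side=512, max_side=1024)
    for size in [(512, 512), (768, 768), (870, 1305)]:
        print(size, "static:", static_512.supports(*size),
              "dynamic:", dynamic.supports(*size))
```

Under these assumptions a 512-only engine rejects a 768x768 hires pass, while an engine built with a wider range accepts it.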

  • @KratomSyndicate
    @KratomSyndicate 8 months ago

    Do you have to be on the dev branch in 1111 for this to work? Just getting cpu and cuda:0 errors.

    • @TroubleChute
      @TroubleChute 8 months ago

      No. You can use the normal release. Just make sure it's up to date. Some have reported better compatibility with dev.

  • @ksk5058
    @ksk5058 2 months ago

    What's this green extension in your prompt?

  • @yaruuvva
    @yaruuvva 6 months ago +1

    Man, your LoRA does not work in TensorRT. Why don't you see it?

  • @west1778
    @west1778 8 months ago +6

    Does this work with SDXL models as well?

    • @daemoniax3788
      @daemoniax3788 3 months ago

      Not for the last 2-4 weeks. Before, yes; now, no, unless you have a really strong GPU with a lot of VRAM, like 24 GB. With the new update the model tries to use more RAM, and if it isn't available it shows an "ONNX parse error".

  • @dhonta40david3
    @dhonta40david3 8 months ago +5

    Huge boost, but it doesn't work with ControlNet, unfortunately.

  • @danielhejira899
    @danielhejira899 7 months ago +6

    When I try to export the default engine it says "No ONNX file found. Exporting ONNX... Please check the progress in the terminal." Does anyone know a fix?

    • @Heldn100
      @Heldn100 3 months ago

      same, did you find any fix?

    • @nandoPluister
      @nandoPluister 2 months ago +1

      @Heldn100 I deleted the venv folder, restarted my PC, opened SD, and it worked.

  • @Duckers_McQuack
    @Duckers_McQuack 8 months ago +1

    With just 512x512 20 steps, i went from 7.16 iterations to 20, so 3x speed there with 3090 :D
    Downside is that you need a TRT model per resolution sadly.

    • @PhilippSeven
      @PhilippSeven 7 months ago +4

      But the 3090 should give about 17 it/s without this extension. 7 it/s is the 3060.

  • @___x__x_r___xa__x_____f______
    @___x__x_r___xa__x_____f______ 7 months ago

    Would have been perfect if you had converted SDXL. I was not able to install it for SDXL, unfortunately.

  • @pastuh
    @pastuh 8 months ago +1

    I hope that Apple will enter the gaming or AI industry..
    Just imagine a generation inside the headset, like an artist with a paintbrush :)

  • @user-zw5dw5kl5g
    @user-zw5dw5kl5g 3 months ago

    Does it support ControlNet?

  • @Heldn100
    @Heldn100 3 months ago +1

    i have this problem
    No ONNX file found

  • @ratside9485
    @ratside9485 8 months ago +1

    Thanks for the info. But this still looks pretty buggy. I'll wait a few more days until I test it.

  • @DinoFancellu
    @DinoFancellu 7 months ago +2

    Doesn't work for me, did all the steps then got
    "Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)"
    No problems at all without tensorrt (RTX 4090), using juggernautXL_version6Rundiffusion

    • @darkjanissary5718
      @darkjanissary5718 7 months ago +1

      I have the same error. It is so buggy, completely unusable atm.

  • @DSLDARTH
    @DSLDARTH 7 months ago

    I still get an error but can still launch Automatic1111. But when I go to TensorRT and click Export in the exporter, it says: "No ONNX file found. Exporting ONNX... Please check the progress in the terminal." It runs its script, but at the end nothing happens, and when I click Export again it tries to pull ONNX again but can't.

    • @Gwenyria
      @Gwenyria 6 months ago +1

      I had the same issue, but it was fixed for me when I deleted the --medvram command-line argument. Maybe you should try starting a1111 without it and see if it works. Also, I selected an automatic VAE, created 1 standard image (maybe to satisfy something I don't understand), and afterwards I started TensorRT with a model I liked and it worked (you have to wait a while until it starts after clicking Export Engine).

    • @DSLDARTH
      @DSLDARTH 6 months ago

      @Gwenyria unfortunately it doesn't work at all for me, downloaded and installed all the dependencies but always fails when trying to load tensorrt. This is on a 3090.
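
Several commenters on this page report that the ONNX export started working once `--medvram` or `--lowram` was removed from the launch arguments. A small sketch for spotting those flags in an argument string follows. The function name is made up, the flag list contains only what the comments mention, and the assumption is that you paste in whatever follows `COMMANDLINE_ARGS=` in your launch script.

```python
import shlex

# Only the flags that commenters on this page reported removing.
REPORTED_FLAGS = ("--medvram", "--lowram")


def reported_problem_flags(commandline_args):
    """Return the reported flags that appear in a launch-argument string."""
    tokens = shlex.split(commandline_args)
    return [flag for flag in REPORTED_FLAGS if flag in tokens]


if __name__ == "__main__":
    args = "--xformers --medvram"  # example value, not taken from the video
    found = reported_problem_flags(args)
    if found:
        print("Consider removing before exporting an engine:", " ".join(found))
    else:
        print("None of the reported flags are set.")
```

As the reply above shows, removing the flag did not help everyone, so this is a first thing to try rather than a guaranteed fix.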

  • @leandrozanardo1046
    @leandrozanardo1046 6 months ago

    It is really fast, but the results have nothing to do with the original model used. Sometimes can be nice, but in general if you are using loras it loses a lot of details...

  • @LFXMusicNoCopyright
    @LFXMusicNoCopyright 7 months ago

    How do you update the venv folder?! very critical thank you

    • @tsmakrakis32
      @tsmakrakis32 7 months ago

      I think you just delete the folder (or rename it) and run stable diffusion again (the .bat file). It will create a new venv folder and re-download whatever is needed.
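
The rename variant of the advice above can be scripted so that the old environment is kept as a fallback. This is a sketch under assumptions: the folder is called `venv` and sits in the WebUI root, as the comments on this page describe, and the backup naming scheme is made up here. Close the WebUI before running it.

```python
import time
from pathlib import Path


def set_aside_venv(webui_root, suffix=None):
    """Rename <webui_root>/venv to a backup name so the launcher rebuilds it.

    Returns the backup path, or None if there is no venv folder.
    """
    root = Path(webui_root)
    venv = root / "venv"
    if not venv.is_dir():
        return None
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    backup = root / f"venv.bak-{suffix}"  # hypothetical naming scheme
    venv.rename(backup)
    return backup


if __name__ == "__main__":
    # "." assumes the script is run from the WebUI root folder.
    result = set_aside_venv(".")
    print("Moved to:" if result else "No venv folder found.", result or "")
```

If the rebuilt environment misbehaves, as one commenter further down reports, renaming the backup back to `venv` restores the previous state.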

  • @scyence
    @scyence 7 months ago

    When installing it, I get the error "ModuleNotFoundError: No module named 'importlib_metadata'"

    • @scyence
      @scyence 7 months ago

      Also, deleting the venv folder broke a1111 for me. Just ended up reinstalling.

  • @dannywoods3928
    @dannywoods3928 4 months ago

    Shout out to all the SA youtubers!

  • @weirdscix
    @weirdscix 8 months ago

    I installed this but it was a pain to get working as the a1111 extension installer is bugged, so I had to do it manually.

    • @Jet_Set_Go
      @Jet_Set_Go 8 months ago

      2 or 3 days and it will for sure be fixed or in this case, even improved

    • @TroubleChute
      @TroubleChute 8 months ago +2

      And the errors and and.
      Followed an issue on Nvidia's GitHub to fix the errors, and it worked after that. Seems to work fine if you turn a blind eye, so hey. I'll take improvements where I can get 'em.

  • @andrejlopuchov7972
    @andrejlopuchov7972 6 months ago

    I wish this would work with animatediff

  • @liquidmind
    @liquidmind 7 months ago

    Any luck anyone with RTX of 6 GB VRAM?

  • @jamesclow108
    @jamesclow108 7 months ago +2

    Not sure I went wrong, but after creating an optimized model, then creating an image I get RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!

    • @Rambo....
      @Rambo.... 7 months ago

      It's a very new extension, it still has a lot of bugs, I get this error when using controlnet, currently it doesn't support controlnet.
      😥

    • @pascaltatipata
      @pascaltatipata 7 months ago

      Same here but only on XL models.

  • @procrastonationforever5521
    @procrastonationforever5521 6 months ago

    Yeah, yeah... But what about hires-fix? Upscaling? Compatibility? No? Oh boy...

  • @Knox420
    @Knox420 2 months ago

    amd users be like

  • @crazysteve8088
    @crazysteve8088 8 months ago

    you dont need to restart after deleting venv. this is a virtual environment.

    • @TroubleChute
      @TroubleChute 8 months ago

      Restarted after installing gpu drivers :)