Omost = Almost AI Image Generation from lllyasviel

  • Published: 28 Sep 2024

Comments • 86

  • @havemoney
    @havemoney 3 months ago +60

    lllyasviel, this guy came to us from the future. If only he would add Omost and IC Light to Fooocus

    • @marhensa
      @marhensa 3 months ago +7

      I also remember that this guy developed the very ControlNet we know and love, which has made Stable Diffusion much more advanced than MidJourney in recent years.

    • @АлександрБычков-к4н
      @АлександрБычков-к4н 3 months ago +8

      This man is more effective than billion-dollar corporations

    • @NotThatOlivia
      @NotThatOlivia 3 months ago

      @@АлександрБычков-к4н like Stability AI ??? ;)

  • @kernsanders3973
    @kernsanders3973 3 months ago +4

    AI agents, ToonCrafter AI, now an LLM auto-generating complex SD prompts from simple prompts... it's too much... I only have so much time... I love it

  • @zzzzzzz8473
    @zzzzzzz8473 3 months ago +1

    So cool. I think it's easier to understand this when seeing the debug conditional rendering images of the colored regions handling depth and overlapping correctly. Such a cool concept, and truly an evolution of ControlNet from the same creator.

  • @swannschilling474
    @swannschilling474 3 months ago +1

    Looks pretty awesome! Thanks for sharing Nerdy!! 😊

  • @jtjames79
    @jtjames79 3 months ago +10

    Apparently they used an agentic knowledge-graph LLM to make a robot dog walk on a ball.
    You can put an agent at every step in a workflow, even use agents to make workflows.
    The trick is to use retrieval-augmented generation to create a knowledge graph. For some reason this makes AI work like magic.

    • @MagusArtStudios
      @MagusArtStudios 3 months ago

      This is true! I use retrieval-augmented generation to dynamically change the system message and for the tools.
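
The pattern described in this thread, retrieving relevant snippets and splicing them into the system message, can be sketched as below. This is a minimal illustration with a toy word-overlap retriever and made-up corpus strings; real setups use embeddings, a vector store, and a graph layer on top:

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank snippets by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(corpus, key=lambda doc: -len(q_words & set(doc.lower().split())))
    return ranked[:k]

def build_system_message(query: str, corpus: list[str]) -> str:
    """RAG step: inject the retrieved facts into the system message."""
    facts = retrieve(query, corpus)
    return "You are a helpful assistant. Relevant facts:\n" + "\n".join(f"- {f}" for f in facts)

corpus = [
    "Omost turns short prompts into region-based canvas code for SDXL",
    "ControlNet conditions Stable Diffusion on edge, depth or pose maps",
    "Pinokio is a launcher that installs AI apps automatically",
]
print(build_system_message("how does omost prompt sdxl", corpus))
```

A knowledge graph adds typed links between such snippets, so retrieval can follow relations instead of just matching text.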

  • @dishcleaner2
    @dishcleaner2 3 months ago +1

    I made a very basic version of this using the ChatGPT API last year. This is way more impressive.

  • @jamesjonnes
    @jamesjonnes 3 months ago +2

    How come one can't edit the code? I want to edit the code.

  • @Pauluz_The_Web_Gnome
    @Pauluz_The_Web_Gnome 3 months ago +1

    I like it and all, but the slowness of generating the prompts, omg!

  • @jurandfantom
    @jurandfantom 3 months ago +1

    Love that Udio outro :D

  • @fontenbleau
    @fontenbleau 3 months ago

    This is more usable for something in architecture than art; I like the very precise and transparent descriptions. It would be great if it could be installed via the Pinokio launcher.

  • @drdca8263
    @drdca8263 3 months ago

    It seems like when changing the rodent into a kitten, it also changed the details of the house behind it, a bit more than I expected?
    I think if one wants to really have the “now change this aspect of it” dialogue thing work best, it would probably be best if the other things don’t change much? Idk.
    I mean, I imagine you could do something with masking?
    Future work I suppose
    5:39 : WOAH! Much more control than I anticipated.

  • @john_blues
    @john_blues 3 months ago +2

    It's now on Hugging Face as well.

  • @mr.entezaee
    @mr.entezaee 3 months ago

    Can you guide me? I get an error:
    __init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
    AssertionError: Torch not compiled with CUDA enabled
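
That AssertionError comes from a PyTorch wheel built without CUDA (the CPU-only build), not from Omost itself. A small diagnostic sketch of my own; it only classifies the situation, and the usual fix for the "cpu-only-build" case is reinstalling torch from a CUDA wheel index such as https://download.pytorch.org/whl/cu121:

```python
import importlib.util

def diagnose_torch_cuda() -> str:
    """Classify the usual causes of 'Torch not compiled with CUDA enabled'."""
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"  # torch is not installed at all
    import torch
    if not torch.backends.cuda.is_built():
        # CPU-only wheel: any torch.cuda call raises the AssertionError above
        return "cpu-only-build"
    if not torch.cuda.is_available():
        return "no-gpu-or-driver"  # CUDA build, but no usable GPU or driver
    return "ok"

print(diagnose_torch_cuda())
```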

  • @coloryvr
    @coloryvr 3 months ago +3

    Wow! Amazing! ...and a super video! Big FANX!

  • @GU-jt5fe
    @GU-jt5fe 3 months ago

    What file manager are you using? It looks like you're using Windows, but something other than the default File Explorer. What is that? I've tried many third party file managers and they all leave a lot to be desired.

  • @terbospeed
    @terbospeed 3 months ago

    This looks really promising, but I only got so-so results, and the loading/unloading of models was abnormally slow.

  • @fixelheimer3726
    @fixelheimer3726 3 months ago

    I wonder if it includes more than just prompting, like regional prompting etc. Looks like it does.

  • @TusharHossain
    @TusharHossain 3 months ago +1

    Great tools. Thanks!

  • @duytdl
    @duytdl 3 months ago

    I tried these examples with ChatGPT 4o, and they were pretty much the same as this. Not sure why SDXL is disappointing, maybe 4o has surpassed it?

  • @Shingo_AI_Art
    @Shingo_AI_Art 3 months ago

    It's a pain in the butt to try to change the model right now, it's not using the .safetensors version of the models but all the folders containing tokenizers and shit 😵‍💫

  • @ItsOk-mq9ex
    @ItsOk-mq9ex 3 months ago +3

    18 GB... yikes. *Watches from the window like a peasant* Maybe someday these could work in dual-card Nvidia/AMD setups; then I'd have enough VRAM.

    • @weirdscix
      @weirdscix 3 months ago +2

      Mine never went above 8

    • @thenetgamer2
      @thenetgamer2 3 months ago +1

      Just get more DRAM; it works, but is of course slower.

  • @MarcSpctr
    @MarcSpctr 3 months ago +29

    Saw it on Reddit yesterday.
    Just waiting for a ComfyUI implementation,
    so we can do even more, like ControlNet, IP-Adapters, LoRAs, etc.

  • @southcoastinventors6583
    @southcoastinventors6583 3 months ago +3

    From Dall-E to Most-Me, very impressive for something this new. Can't wait till you can easily swap out models and optimizations. A step closer to a locally run multimodal model. Thanks for the video!

  • @blackvx
    @blackvx 3 months ago +4

    Thank you, Mr. Rodent

  • @Slav4o911
    @Slav4o911 3 months ago

    They should make it easier to swap the models. There are better LLMs and better SDXL models...

  • @Xodroc
    @Xodroc 3 months ago +1

    Something like this that could understand what custom ComfyUI nodes are used for would be quite interesting.

  • @sephia4583
    @sephia4583 3 months ago +1

    It would be nice if it could analyze/reference or inpaint a specified area (by mask or prompt) for automating detailed edits.

  • @hamidmohamadzade1920
    @hamidmohamadzade1920 3 months ago +2

    Amazing

  • @Beauty.and.FashionPhotographer
    @Beauty.and.FashionPhotographer 3 months ago +1

    Are there any LLMs that are best at fixing Python code errors, etc.?

  • @shallowandpedantic2320
    @shallowandpedantic2320 3 months ago +2

    what an outro

  • @mik3lang3lo
    @mik3lang3lo 3 months ago

    Can I try this with Pinokio?

  • @Cingku
    @Cingku 3 months ago

    Please make it into ComfyUI so that I can use it with my current workflow and image style. Its prompt adherence is so good... I mean, probably 70% of SD3, but it's good enough and better than anything else open source right now.

  • @timeTegus
    @timeTegus 3 months ago +1

    First

  • @SandyGoneByeBye
    @SandyGoneByeBye 3 months ago +1

    New songs... hmmm.

  • @workingclassreptiles
    @workingclassreptiles 3 months ago

    Hey Nerdy Rodent, the Stable Audio open-source model just dropped. You should check it out and tell us how to run it!

    • @NerdyRodent
      @NerdyRodent 3 months ago

      Would love to, but you know… research only 🫤

  • @bradballew3037
    @bradballew3037 3 months ago +1

    How do we use another SDXL model, though? I don't see a models folder. Do you just change the name and it downloads it automatically? Do we drag it over from our ComfyUI models folder and place it somewhere?

    • @AltoidDealer
      @AltoidDealer 3 months ago

      Same boat… I tried adding models where the RealVis model can be found, editing the py file… it seems like the model data has to be formatted in a very specific way that I can't grasp and is not documented. Same for the LLM models: the repo has instructions to download other models, but no instructions on how to actually use them.

    • @taucalm
      @taucalm 3 months ago

      I would suggest trying a rename: change the default SDXL model's name to something different, and rename the SDXL model you want to use to that former name. It should work, and is easier than changing the code.

    • @stefanc205
      @stefanc205 3 months ago +2

      I think we need to find a model on Hugging Face that has an fp16 version, then replace the repo name?

  • @Blackkspot
    @Blackkspot 3 months ago

    OK, thank you, nice. But how does it actually work under the hood? What is that code? It doesn't look like Python, etc. Is it like Anynode?

  • @DreamingConcepts
    @DreamingConcepts 3 months ago

    rip prompt engineers

  • @synthoelectro
    @synthoelectro 3 months ago +1

    Or if you have 18 GB of virtual memory. Amazing how swap memory can help, even with a 4 GB VRAM card.

    • @DreamingConcepts
      @DreamingConcepts 3 months ago

      what do you mean by "swap memory"?

    • @Ginto_O
      @Ginto_O 3 months ago +1

      @@DreamingConcepts I'm not sure how he could make it more clear. Perhaps you should google what swap memory is.

    • @fontenbleau
      @fontenbleau 3 months ago +1

      It will wear out your SSD fast. I put 128 GB of RAM in all 4 slots; that's the minimum if you want to run any 70-billion-parameter Llama on a 14-core CPU (it reserves 90 gigs) in GGUF max quality, 8-bit. New merged models are going above 100 billion parameters, and my 128 is only manageable if shared with the GPU; better to have 256.

    • @DreamingConcepts
      @DreamingConcepts 3 months ago

      @@Ginto_O Wait, you mean you can use an LLM with your SSD? Isn't that extremely slow?

  • @luislozano2896
    @luislozano2896 3 months ago

    Be warned: it does not auto-save your images to a folder! So right-click the image in your browser!

  • @ronbere
    @ronbere 3 months ago

    I've installed it locally... But what is the practical use of such a program?

  • @stinthad
    @stinthad 3 months ago

    I'd like to know if it does well when you just throw in a bunch of booru tags

  • @Mopantsu
    @Mopantsu 3 months ago

    Tried it. Still can't connect an umbrella handle lol.

  • @TheColonelJJ
    @TheColonelJJ 3 months ago

    First! Sorry, couldn't help it.

  • @utsavchakraborty24
    @utsavchakraborty24 3 months ago +2

    HOLY MOTHER OF GOD!!!... I'm completely overwhelmed...

  • @LouisGedo
    @LouisGedo 3 months ago

    👋

  • @AnotherComment-rl6fv
    @AnotherComment-rl6fv 3 months ago +5

    Forget real art; people are even too lazy to write prompts now. 🤣🤣

    • @drdca8263
      @drdca8263 3 months ago +1

      This seems like it grants a much greater degree of control than just writing a prompt does? Which, if it does, seems like it could make a larger “portion” of the generated image be a result of human choices?
      Especially if one manually edits the json describing the image

  • @royjones5790
    @royjones5790 3 months ago +1

    ❓❓ CHANGING LLMs??? Trying to change LLMs between the 3 they have, but I just don't know which file(s) to download from the HF repository. When I go to the folder they describe, I see 5-6 different safetensors files labeled model0002-etc. of different GB sizes, but IDK if I'm supposed to choose one of those and rename it, choose them all, choose a set, or what. ❓❓

  • @EmmaFitzgerald-dp4re
    @EmmaFitzgerald-dp4re 3 months ago

    I don't get it lol... what's the big deal? And I mean that with tons of respect as well, what am I supposed to do with this?

    • @NerdyRodent
      @NerdyRodent 3 months ago +1

      There are quite a few things people use images for such as: t-shirts, mugs, games, greetings cards, to hang on the wall, etc - it’s all down to your imagination!

  • @nutzernutzeramstart2723
    @nutzernutzeramstart2723 3 months ago +2

    Does anyone have a source where I can learn how to install this locally, or can anyone give me instructions on what to do? I seem to be incapable of understanding how to use GitHub. Is there anything I need to install beforehand? I would appreciate any form of help, thanks :)

    • @fontenbleau
      @fontenbleau 3 months ago

      It's hard for novices. Usually people write some instructions on the project page, but GitHub is mostly a place to just drop code and experiment. The only launcher I know of that automatically installs what people drop there is Pinokio; yes, the user interface is a little scuffed and some of the offered apps require troubleshooting, but it works. After several attempts I managed to run the Stable Cascade art generator. Also, there's a bug in Pinokio: leftover cache from deleted apps needs to be cleaned manually, and it can take up gigabytes.

  • @theteknologist9574
    @theteknologist9574 3 months ago

    Videos like this should come with a disclaimer:
    ***RTX3090 or 4090 only!!! HIGH VRAM!!!***

    • @raininheart9967
      @raininheart9967 3 months ago

      No, I have a 4060 Ti 16 GB; it works well.

  • @Doctor_Random
    @Doctor_Random 3 months ago

    I'm getting (RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16') everytime I try :( :(

    • @Doctor_Random
      @Doctor_Random 3 months ago

      @@sazarod I seem to have mostly fixed it by reinstalling Anaconda and then following the GitHub instructions again.

  • @eod9910
    @eod9910 3 months ago +3

    It's so censored that it's useless.

    • @DreamingConcepts
      @DreamingConcepts 3 months ago

      What exactly did it censor?

    • @eod9910
      @eod9910 3 months ago

      @@DreamingConcepts Try typing in "car crash" or "bloody car crash"; it won't let you. You can't put in "woman in bikini". So no nudity, no violence, no blood.

    • @royjones5790
      @royjones5790 3 months ago +4

      They have 3 different LLMs; the Dolphin 2.9 one is uncensored. They have the link to its download there. I am having problems figuring out which of the files on Hugging Face is the right (singular) file to download and rename. Then, 9:14 in the vid above shows where to update the LLM.

    • @eod9910
      @eod9910 3 months ago

      @@royjones5790 This is the file that you want: lllyasviel/omost-dolphin-2.9-llama3-8b-4bits. I cloned it into Omost\hf_download\hub, then replaced the files in the models--lllyasviel--omost-llama-3-8b-4bits folder with the Dolphin uncensored model. Don't rename the original folder; just replace the files, then run it. It loads the uncensored Dolphin model.
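
The folder name in that reply follows the standard huggingface_hub cache convention, "models--{owner}--{name}". A tiny helper, my own sketch that only computes paths (nothing is downloaded or moved), showing where the source and target folders for the swap would live:

```python
from pathlib import Path

def hf_cache_folder_name(repo_id: str) -> str:
    """Map a Hugging Face repo id to its hub-cache folder name."""
    return "models--" + repo_id.replace("/", "--")

hub = Path("Omost") / "hf_download" / "hub"  # cache root mentioned in the comment above
src = hub / hf_cache_folder_name("lllyasviel/omost-dolphin-2.9-llama3-8b-4bits")
dst = hub / hf_cache_folder_name("lllyasviel/omost-llama-3-8b-4bits")
print(src)  # folder whose files you copy from
print(dst)  # folder whose files you replace
```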

    • @fontenbleau
      @fontenbleau 3 months ago

      @@eod9910 Thanks to the genius Eric Hartford for Dolphin 👏

  • @weirdscix
    @weirdscix 3 months ago

    I never noticed any memory issues; it levelled out around 30 GB RAM usage.

    • @royjones5790
      @royjones5790 3 months ago

      Mine is crawling. I gave it an initial image, then modified it once, and a 2nd time, and generation has become a 10+ minute process now. 16 GB VRAM + 16 GB RAM.

    • @weirdscix
      @weirdscix 3 months ago

      @@royjones5790 It is very RAM-heavy; that 30 GB was just for Omost, then there was another 16 GB used for the system.

    • @royjones5790
      @royjones5790 3 months ago +1

      @@weirdscix I actually used this as an excuse to go out and up my RAM to 64 from my 16, and you're right, it's moving so much smoother, consistently.

  • @brootalbap
    @brootalbap 3 months ago

    I don't understand. So this works with LLMs, not with Stable Diffusion models? No chance to insert specialised SD models? The image quality looks like a base SD model. The idea of localizing prompts is fantastic, but without a powerful model to create high-quality images the output won't be good.

    • @fontenbleau
      @fontenbleau 3 months ago

      As I see from the code, it's just a prompt generator, but a very precise and detailed one.