lllyasviel, this guy came to us from the future. If only he would add Omost and IC-Light to Fooocus.
I also remember this guy developed the very ControlNet we know and love, which has made Stable Diffusion far more capable than MidJourney in recent years.
This man is more effective than billion-dollar corporations
@@АлександрБычков-к4н like Stability AI ??? ;)
AI agents, ToonCrafter AI, and now an LLM auto-generating complex SD prompts from simple prompts... it's too much... I only have so much time... I love it
So cool. I think it's easier to understand this by seeing the debug renders of the colored regions handling depth and overlap correctly. Such a cool concept, and truly an evolution of ControlNet from the same creator.
Looks pretty awesome! Thanks for sharing Nerdy!! 😊
My pleasure 😊
Apparently they used an agentic knowledge-graphing LLM to make a robot dog walk on a ball.
You can put an agent at every step in a workflow, even use agents to make workflows.
The trick is to use retrieval augmented generation to create a knowledge graph. For some reason this makes AI work like magic.
This is true! I use retrieval augmented generation to dynamically change the system message and for the tools.
I made a very basic version of this using the ChatGPT API last year. This is way more impressive.
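To make the idea in the comment above a bit more concrete: "RAG into a knowledge graph" just means retrieved text gets distilled into (subject, relation, object) triples that agents can query. Everything below is invented for illustration — the tiny corpus, the whitespace-split "extraction", and the function name are all mine; a real system would use an LLM to extract triples from retrieved documents.

```python
# Toy sketch: populate a knowledge graph from retrieved snippets.
# In a real pipeline the corpus would come from a retriever and the
# triple extraction would be done by an LLM, not a naive split().
corpus = [
    "robot_dog balances_on ball",
    "agent controls robot_dog",
    "planner invokes agent",
]

def build_graph(snippets):
    """Collect (relation, object) edges per subject into a dict-based graph."""
    graph = {}
    for snippet in snippets:
        subj, rel, obj = snippet.split()
        graph.setdefault(subj, []).append((rel, obj))
    return graph

graph = build_graph(corpus)
print(graph["agent"])  # [('controls', 'robot_dog')]
```

An agent at each workflow step would then look up its node's edges instead of re-prompting from scratch, which is roughly why the graph makes the whole thing feel like magic.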
How come one can't edit the code? I want to edit the code.
I like it and all, but the slowness of generating the prompts omg!
Love that udio outro :D
This is more useful for something like architecture than art; I like the very precise and transparent descriptions. Would be great if it could be installed via the Pinokio launcher.
It seems like when changing the rodent into a kitten, it also changed the details of the house behind it, a bit more than I expected?
I think if one wants to really have the “now change this aspect of it” dialogue thing work best, it would probably be best if the other things don’t change much? Idk.
I mean, I imagine you could do something with masking?
Future work I suppose
5:39 : WOAH! Much more control than I anticipated.
It's now on Huggingface as well.
Can you guide me? An error:
__init__.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
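For anyone hitting the same error: it means a CPU-only PyTorch wheel is installed, not that the GPU is broken. A quick way to check is below — the helper name is mine, not from Omost, and `cu121` in the comment is an example index that you should match to your installed CUDA version.

```python
import importlib.util

def cuda_build_status():
    """Report whether the installed torch build can use CUDA.

    A "cpu-only build" is exactly what raises
    'Torch not compiled with CUDA enabled'. The usual fix is reinstalling
    from the CUDA wheel index, e.g. (cu121 is an example version):
        pip install --force-reinstall torch --index-url https://download.pytorch.org/whl/cu121
    """
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    if torch.version.cuda is None:       # None means a CPU-only wheel
        return "cpu-only build"
    return "cuda build" if torch.cuda.is_available() else "cuda build, but no usable GPU"

print(cuda_build_status())
```

If it prints "cpu-only build", reinstalling torch from the CUDA index should clear the AssertionError.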
Wow! Amazing! ...and a super video! Big FANX!
What file manager are you using? It looks like you're using Windows, but something other than the default File Explorer. What is that? I've tried many third party file managers and they all leave a lot to be desired.
This looks really promising but I only got so-so results and the loading/unloading of models was abnormally slow.
I wonder if it includes more than just prompting, like regional prompting etc. Looks like it does.
great tools. thanks
I tried these examples with ChatGPT 4o, and they were pretty much the same as this. Not sure why SDXL is disappointing, maybe 4o has surpassed it?
It's a pain in the butt to try to change the model right now, it's not using the .safetensors version of the models but all the folders containing tokenizers and shit 😵💫
18 GB... yikes. *Watches from the window like a peasant* Maybe someday these could work in dual-card NVIDIA/AMD setups; then I'd have enough VRAM.
Mine never went above 8
Just get more DRAM; it works, but is of course slower.
Saw it on Reddit yesterday.
Just waiting for ComfyUI implementation.
So we can do even more like controlnet, ipadapters, LoRAs, etc
From Dall-E to Most-Me very impressive for something this new. Can't wait till you can easily swap out models and optimization. A step closer to a locally run multimodal model. Thanks for the video
Thank you, Mr. Rodent
They should make it easier to swap the models. There are better LLMs and better SDXL models...
Something like this that could understand what custom ComfyUI nodes are used for would be quite interesting.
Would be nice if it could analyze/reference or inpaint specified area (by mask or prompt) for automating detailed edits
Amazing
LLMs that are best at fixing Python code errors etc... are there any?
what an outro
Can I try this with Pinokio?
Please make it into ComfyUI so that I can use it with my current workflow and image style... its prompt adherence is so good... I mean, probably 70% of SD3, but it's good enough and better than anything else open source right now.
First
New songs... hmmm.
Hey Nerdy Rodent, the stable audio open source model just dropped. You should check it out and tell us how to run it!
Would love to, but you know… research only 🫤
How do we use another SDXL model though? I don't see a models folder, do you just change the name and it downloads it automatically? Do we drag it over from our ComfyUI models folder and place it somewhere?
Same boat… I tried adding models where the RealVis model can be found, editing the py file… seems like the model data has to be formatted in a very specific way that I can’t grasp and is not documented. Same for the LLM models, the repo has instructions to download other models but no instructions how to actually use them
I would suggest you try changing the SDXL name to something different, and renaming the SDXL model you want to use to that former name. Should work, and it's easier than changing the code.
I think we need to find a model on Hugging Face that has an fp16 version, then replace the repo name?
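The rename trick suggested above can be scripted as a pair of folder renames. This is only a sketch: the cache location and the example folder names in the comment are assumptions, so check what actually sits in your hf_download directory before running anything like it.

```python
from pathlib import Path

def swap_model_dirs(cache: Path, expected_name: str, custom_name: str) -> None:
    """Park the folder the app expects under a .bak name, then rename the
    custom model's folder so the app loads it under the expected name."""
    expected_dir = cache / expected_name
    custom_dir = cache / custom_name
    expected_dir.rename(cache / (expected_name + ".bak"))  # keep the original around
    custom_dir.rename(expected_dir)  # custom model now answers to the expected name

# Hypothetical usage -- folder names here are illustrative, not Omost's actual ones:
# swap_model_dirs(Path("hf_download/hub"),
#                 "models--SG161222--RealVisXL_V4.0",
#                 "models--my--favourite-sdxl")
```

Note this only works if the custom model's folder has the same internal layout (tokenizer, config, shards) the loader expects, which matches what the earlier comment said about the data having to be formatted in a very specific way.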
Ok thank you, nice. But how does it actually work under the hood? What is that code? Doesn't look like Python etc. Is it like AnyNode?
rip prompt engineers
Or if you have 18 GB of Virtual memory. Amazing how swap memory can help even with a 4GB VRAM card.
what do you mean by "swap memory"?
@@DreamingConcepts i'm not sure how he could make it more clear. perhaps you should google what swap memory is
It will wear out your SSD fast. I put 128 GB of RAM in all 4 slots; that's the minimum if you want to run any 70-billion-parameter Llama on a 14-core CPU (it reserves 90 GB) in GGUF max quality at 8-bit. New merged models are going above 100 billion parameters, and my 128 GB is only manageable if shared with the GPU, but it's better to have 256.
@Ginto_O wait, you mean you can run an LLM off your SSD? Isn't that extremely slow?
Be warned, it does not auto-save your images to a folder! So right-click the image in your browser!
I've installed it locally... But what is the practical use of such a program?
I'd like to know if it does well when you just throw in a bunch of booru tags
Tried it. Still can't connect an umbrella handle lol.
First! Sorry, couldn't help it.
HOLY MOTHER OF GOD!!!!!!!!!!!!!!!!!!!!!!!.......m completely overwhelmed.....
👋
Forget real art people are even lazier to write prompts now.🤣🤣
This seems like it grants a much greater degree of control than just writing a prompt does? Which, if it does, seems like it could make a larger “portion” of the generated image be a result of human choices?
Especially if one manually edits the JSON describing the image
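As a toy example of that kind of hand edit — note the schema below is invented purely for illustration; Omost's real output is Python-style canvas code, not this JSON:

```python
import json

# A made-up per-region scene description, in the spirit of the comment above.
scene_json = """{
  "regions": [
    {"name": "kitten", "location": "center", "tags": ["fluffy", "grey"]},
    {"name": "house", "location": "background", "tags": ["wooden", "red roof"]}
  ]
}"""

scene = json.loads(scene_json)

# Change one aspect of one region, leaving everything else untouched --
# which is exactly the control a plain text prompt doesn't give you:
scene["regions"][0]["tags"] = ["fluffy", "orange"]

print(json.dumps(scene, indent=2))
```

Editing the structured description region by region is what makes a larger portion of the result a human choice: the house's description never changes, so only the kitten should re-render differently.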
❓❓ CHANGING LLMs??? -- Trying to change LLMs between the 3 they have, but I just don't know which file(s) to download from the HF repository. When I go to the folder they describe, I see 5-6 different safetensors files labeled model0002-etc... of different GB sizes, but IDK if I'm supposed to choose one of those and rename it, choose them all, choose a set, or what. ❓❓
I don't get it lol... what's the big deal? And I mean that with tons of respect as well, what am I supposed to do with this?
There are quite a few things people use images for such as: t-shirts, mugs, games, greetings cards, to hang on the wall, etc - it’s all down to your imagination!
Does anyone have a source where I can learn how to install this locally, or can give me instructions on what to do? I seem to be incapable of understanding how to use GitHub. Is there anything I need to install first? Would appreciate any form of help, thanks :)
It's hard for novices. Usually people write some instructions on the project page, but GitHub is mostly a place to just drop code and experiment. The only launcher I know of that automatically installs what people drop there is Pinokio. Yes, the user interface is a little scuffed and some of the offered apps require troubleshooting, but it works. After several attempts I managed to run the Stable Cascade art generator. Also, there's a bug in Pinokio: leftover cache from deleted apps needs to be cleaned manually, and it can take gigabytes.
Videos like this should come with a disclaimer:
***RTX3090 or 4090 only!!! HIGH VRAM!!!***
No, I have a 4060 Ti 16 GB, works well
I'm getting (RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16') everytime I try :( :(
@@sazarod I seem to have mostly fixed it by reinstalling anaconda and then following the GitHub instructions again.
it's so censored that it's useless
what exactly did it censor?
@DreamingConcepts try typing in "car crash" or "bloody car crash" -- it won't let you. You can't put in "woman in bikini" either. So no nudity, no violence, no blood.
They have 3 different LLMs; the Dolphin 2.9 one is uncensored. They have the link to its download there. I am having problems figuring out which of the files on Hugging Face is the right (singular) file to download and rename. Then, in the above vid, 9:14 shows where to update the LLM.
@royjones5790 This is the file that you want: lllyasviel/omost-dolphin-2.9-llama3-8b-4bits. I cloned it into Omost\hf_download\hub, then replaced the files in the folder models--lllyasviel--omost-llama-3-8b-4bits with the Dolphin uncensored model. Don't rename the original folder, just replace the files, then run it. It loads the uncensored Dolphin model.
@eod9910 thanks. Eric Hartford is a genius for Dolphin 👏
I never noticed any memory issues, it levelled out around 30GB RAM usage.
Mine is crawling. I gave it an initial image, then modified it once, and a 2nd time, and generation has become a 10+ minute process now. 16 GB VRAM + 16 GB RAM.
@royjones5790 it is very RAM-heavy; that 30 GB was just for Omost, then there was another 16 GB used by the system.
@weirdscix I actually used this as an excuse to go out and upgrade my RAM from 16 to 64, and you're right, it's moving so much smoother, consistently.
I don't understand. So this works with LLMs, not with Stable Diffusion models? No chance to insert specialized SD models? The image quality looks like a base SD model. The idea of localizing prompts is fantastic, but without a powerful model to create high-quality images, the output won't be good.
As I see from the code, it's just a prompt generator, but a very precise and detailed one.