Playground AI Tried to Break the Internet. NEW Stable Diffusion Base Model!
- Published: 4 Dec 2023
- UPDATE: You can now download the model for all interfaces here: huggingface.co/playgroundai/p...
Playground claims their new model outperforms SDXL by 250%. Let's have a look.
blog.playgroundai.com/playgro...
/ 1732102168029905309
huggingface.co/playgroundai/p...
Get early access to videos and help me, support me on Patreon / sebastiankamph
Prompt styles for Stable diffusion a1111 & Vlad/SD.Next: / sebs-hilis-79649068
ComfyUI workflow for 1.5 models: / comfyui-1-5-86145057
ComfyUI Workflow for SDXL: / comfyui-workflow-86104919
Chat with me in our community discord: / discord
My Weekly AI Art Challenges • Let's AI Paint - Weekl...
My Stable diffusion workflow to Perfect Images • Revealing my Workflow ...
ControlNet tutorial and install guide • NEW ControlNet for Sta...
Famous Scenes Remade by ControlNet AI • Famous Scenes Remade b...
UPDATE: You can now download the model for all interfaces here: huggingface.co/playgroundai/playground-v2-1024px-aesthetic/tree/main
That's strange... I downloaded this two days ago: 13 GB.
In your vid here it's only at 5 / 10 GB, yet at this link it's the same 13 GB file I already have 🤔
Link doesn't work for me.
Link doesn't work. Entry not found.
@creka2897 Thank you, fixed.
That model is in diffusers format on HF, so InvokeAI supports it natively. Just put "playgroundai/playground-v2-1024px-aesthetic" for the Model Location field of the Import Models area, and it will download all submodels automatically. To get the Unet, CLIPs, and VAE out separately in the node interface, you can use the normal SDXL Main Model node. Their VAE is broken for fp16, so make sure you run decodes at fp32 instead or use the fixed sdxl-vae to do it.
Edit: Just tested, and I get the exact same results using SDXL 1.0 CLIP vs the CLIP models that come from Playground. Seems that they reused the original CLIP and VAE from SDXL base, and only trained the Unet.
So can they really say it's trained from scratch?
@@dkemil yes. Worth noting that StabilityAI did not invent or train CLIP either when they released SD. The Unet is the big important part, and as far as we can tell from weight analysis they did indeed train it from scratch.
So using a base SDXL for the primary workflow and just adding Playground as the Unet is the way to use it?
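A minimal sketch of the loading approach described in this thread, assuming the `diffusers` library (Playground v2 is SDXL-architecture, so `StableDiffusionXLPipeline` applies); the `LoadPlan` helper is hypothetical, and the fp32-VAE detail comes from the comment above:

```python
# Sketch: load Playground v2 as an SDXL-architecture pipeline, keeping the
# VAE at fp32 because (per the thread) its VAE misbehaves under fp16.
from dataclasses import dataclass

PLAYGROUND_REPO = "playgroundai/playground-v2-1024px-aesthetic"

@dataclass
class LoadPlan:
    repo: str
    dtype: str      # dtype for the Unet and text encoders
    vae_dtype: str  # decode at fp32, as suggested above

def make_load_plan() -> LoadPlan:
    return LoadPlan(repo=PLAYGROUND_REPO, dtype="float16", vae_dtype="float32")

def load_pipeline(plan: LoadPlan):
    """Not executed here: requires `diffusers`/`torch` and a ~13 GB download."""
    import torch
    from diffusers import StableDiffusionXLPipeline
    pipe = StableDiffusionXLPipeline.from_pretrained(
        plan.repo, torch_dtype=getattr(torch, plan.dtype)
    )
    # Keep the VAE at full precision so decodes don't blow out.
    pipe.vae.to(dtype=getattr(torch, plan.vae_dtype))
    return pipe

print(make_load_plan().vae_dtype)  # float32
```

Since (per the thread) the CLIP and VAE appear identical to SDXL base, an alternative is to load SDXL base and swap in only Playground's Unet.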
PSA: your huge negative prompts are ruining everything. Many models don't need a negative prompt at all, and most of them will punish you for overusing them without thought. "Extra limbs" is not a thing that is labeled in datasets, guys; use common sense. Actual experts (not me) have restated these points repeatedly.
What about (((((extra limbs)))) ? 😂
At this point we don't even need to do feature requests -- just sit back and wait and someone will develop a new feature or better model tomorrow.
If you look back at the original cat image (4:59), you forgot to remove the negative prompt. It starts with "Tail, ugly, tail" etc.
You kept the previous negative prompt. Sorry! :) Great channel...
Yeah I noticed that too
Yes, I noticed that too.
There's a 13.9 GB safetensors file added to Hugging Face 6 hours ago that works like a normal checkpoint.
Considering the low default CFG, I'd assume a baked-in LCM, probably mixed with so-called SDXL-Turbo. Especially if you see that boring plastic skin in generated images.
I tried the fp16 version (I don't have the memory for the full model): images are overburned. I tried different VAEs (the Playground fp16 version too) and SDXL models for CLIP; same issue.
🤔 I can't shake the thought that negative prompts might affect Playground's generation differently than SDXL's. Sometimes negatives can really mess with the end result.
What do you guys think?
Negatives are definitely very underrated as an image shaper. I used a prompt to make a ghostly-faced surreal girl and nothing happened; it gave a regular girl. Then I added negatives that filtered out the realism, and it worked perfectly. So negatives are a BIG thing if you know what you want. Otherwise, you will be happy with any pretty picture the AI churns out.
Yup, different models react differently to the "seed/hash" of the negative you entered. Your negatives are essentially pointing to different pictures in each model, and similar pictures in similar models. And negatives can make a huge difference as they essentially filter your generation possibilities.
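For concreteness: the negative prompt feeds the unconditional branch of classifier-free guidance, so changing it shifts every denoising step. A toy sketch of that combination rule (plain Python; the numbers and helper name are made up for illustration):

```python
# Toy classifier-free guidance. The sampler's combined prediction is
#   guided = neg + scale * (pos - neg)
# so a different negative-prompt prediction moves the whole generation.
def cfg_combine(pos, neg, scale):
    """Element-wise CFG mix of positive- and negative-conditioned predictions."""
    return [n + scale * (p - n) for p, n in zip(pos, neg)]

pos = [1.0, 0.5]    # prediction conditioned on the positive prompt
neg_a = [0.0, 0.5]  # one negative prompt
neg_b = [0.5, 0.0]  # a different negative prompt

print(cfg_combine(pos, neg_a, 3.0))  # [3.0, 0.5]
print(cfg_combine(pos, neg_b, 3.0))  # [2.0, 1.5]
```

Same positive prompt, same scale, yet the two negatives produce different results, which matches the "negatives filter your generation possibilities" point above.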
SD.Next lets you load diffusers models from Hugging Face; it then auto-detects the diffusers pipeline and loads it. However, loading the Unet like that in Comfy was really neat, and I can't wait to do some combos with that.
One more amazing tutorial, bro, thanks!!
I feel comparing the base models is the proper thing to actually do.
Then when comparing derivatives of each, you compare ones made with the same goal or even with the same params.
I'm a lot more interested in accuracy, as someone else said. As well as performance. It still takes pretty cutting edge hardware in order to have a fast workflow. I have a 6gb gpu in a 10yr old pc. It's enough to participate and explore but slow enough to be an issue xD
AI was my excuse to buy a 4090. It wasn't justifiable any other way.
@jonathaningram8157 Maybe it can be my excuse to get a better job 😂
Still better than my Intel IGPU
Any chance of using this in A1111 yet?
great info!
Just a note, with the picture of the realistic woman, you were comparing it to the Juggernaut XL (it was set as a filter)
How is it that the SDXL VAE, which is tailored to SDXL's own latent space representation, is compatible with the latent output of the Playground model you're using? Isn't it an independently trained model? From my understanding, each model would have a different understanding of latent space and would thus need its own VAE decoder. Is it possible that these two independently trained models came up with basically the same latent space representation? Or is it more likely that this Playground model is actually based on SDXL?
as suggested by others, seems like a LCM merged SDXL turbo cocktail.
100% based on sdxl
This is the slowest model I have seen, by a big margin. I am using A1111 on my Mac mini M1 16 GB and I'm trying the Playground v2 (13.9 GB) version. I didn't get any result because of memory problems. I think it is not suitable for me. Am I wrong?
OK so here's something interesting: I'm using the playground VAE from this release in automatic 1111 with regular SD 1.5 512x512 models along with 'default negative', and it's actually enabling me to inference at 1024x1024 with zero doubling artifacts. Reliably. Pretty cool!
By “inference” do you mean hiresfix? Or starting resolution?
I can't get the workflow anywhere. could you or you simply used a previous one ?
@middleman-theory Starting resolution, before high-res fix.
@xilix details and process please.
Did you try this with the original 1.5 model too? Many fine-tuned versions of it are trained on higher-resolution images, so they can create 1024px photos out of the box. I'd be really surprised if you could generate 1024px art using the base 1.5 model just by swapping in another VAE, because getting 1024px art is especially difficult even for most fine-tuned 1.5 models.
How can I use it with automatic1111? When I try to load model I see error message "Connection errored out."
See updated model file in description
@sebastiankamph Thanks. There are a lot of files. What should I choose?
What UI are you using? That doesn't look like auto1111
That's Comfy UI
I really just want more control over what I create. I don’t care if AI produces beautiful pictures if I don’t have much control over the content that is being produced. The randomness of AI doesn’t help much. This is what I’m hoping gets improved.
Have you tried ControlNet?
Currently you need to have a prompt of what you want to see, AND the seed that produces what you need. This is why I'm not going to XL models; just generate a LOT of images and work on your seed. Mind you, a 600x1000 image will generate differently than a 600x800 image. Edit: some typos
i think you're looking for comfyui
Just want to mention that you should look at the prompts. This model is not SD or SDXL, so it doesn't interpret prompts exactly the same way; trying to compare prompt-to-prompt (copy/pasted from a website) is not really accurate. You should be writing your own prompts and comparing. (My two cents.) Thanks though, will test myself.
can it do hands?
Where did you get 250%? I don't see any difference from the standard SDXL model.
It is just a subjective opinion obtained from surveys done by the creators of the model.
A lot of images created by users in Playground AI are edited to include their signatures. As a result, the prompts often don't correspond accurately to the generated images, which is something I dislike.
Ironically, a lot of people just straight up post real bespoke art on their Playground profile, let the program change a handful of pixels, and try to pass it off as AI art.
For this you need to look into the history of Playground and how it works in general. Most of the images you see are not composed in one single prompt but in an img2img "Create Variations" process, where the prompt is changed and evolved between variations. That's why you can't recreate them without knowing the exact workflow. The signing at the end is just the tip of the iceberg. The only solution would be to record the prompt history across all those img2img steps, and even then you wouldn't be able to recreate the final image exactly, due to the nature of img2img.
I noticed a lot of the other images were using a filter (like Juggernaut), maybe that's why the images on their site weren't turning out right.
interesting, thought it should be using SDXL workflow,
Honestly, the only immediately meaningful benchmark for new models/checkpoints is fingers/toes. Good SD1.5 models + extensions handle everything else more than adequately.
what's internet breaking about this?
The clickbait. Pretty sure this whole model is a scam and this YT channel has no idea what it's talking about.
Lol, for functional applications I feel like SD1.5 outperforms SDXL models. I guess it overall depends what you're using them for, but SD1.5 is in my opinion so well rounded it's hard to move away from.
Half-done vid, clickbait achievement is yours lol.
What is the VAE doing exactly?
"Latent space" is where SD does its work. SD starts with an image, either one you pick or a random page of static, and passes it through a VAE (variational autoencoder) to convert it into latent space so it can perform operations on it. When the generative process is complete, it decodes the result with the VAE to render the final image. Latent space is a condensed, information-dense 'space', and the operations there are faster and more memory-efficient. Decoding with the VAE also brings the image back up to full pixel resolution.
I'm no expert, but that's my understanding at least. I may be wrong, but I've always thought of it as the thing that translates the image from human-viewable to computer-usable, and back again. There is a more detailed description by Louis Bouchard on his site. Not sure if links are allowed here, so I'll reply in another message with that.
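To put rough numbers on that compression: standard SD VAEs downscale each spatial dimension by 8x and use 4 latent channels. A tiny sketch (the helper name is mine, the constants are the commonly cited ones):

```python
def latent_shape(height, width, downscale=8, channels=4):
    """Shape of the latent tensor the Unet actually operates on
    for a given output image size (channels, height, width)."""
    return (channels, height // downscale, width // downscale)

print(latent_shape(512, 512))    # (4, 64, 64)   typical for SD 1.5
print(latent_shape(1024, 1024))  # (4, 128, 128) typical for SDXL-class models
```

So a 1024x1024x3 image becomes a 4x128x128 latent, i.e. roughly a 48x reduction in values to process, which is why working in latent space is so much faster and lighter on memory.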
The model works in Fooocus just like any other model.
I tried Stable Diffusion and sometimes it makes... well, most of the time this happens: the prompt is 2 men... and then there will be more men. And the faces are almost the same, so it looks like they have been cloned. Then there are lots of arms... and sometimes there are images with a lot of people spread all over the image, and they also might grow out of one of the bodies.
I think that may be due to the empty latent image size used. I was using an SD1.5 checkpoint in ComfyUI but with an empty latent image size of 1024x1024. It should be 512x512 for SD1.5, so everything was doubling up, e.g. two people merged into each other when I only asked for one person!
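A tiny sketch of that rule of thumb (native training resolutions per base model are the commonly cited values; the helper and threshold are hypothetical, just illustrating the heuristic):

```python
# Commonly cited native training resolutions for each base model.
NATIVE_RES = {"sd15": 512, "sd21": 768, "sdxl": 1024}

def doubling_risk(model, image_side):
    """Heuristic from the comment above: generating at or beyond twice the
    native resolution tends to duplicate subjects (merged people, extra limbs)."""
    return image_side >= 2 * NATIVE_RES[model]

print(doubling_risk("sd15", 1024))  # True: 1024px on an SD1.5 checkpoint
print(doubling_risk("sdxl", 1024))  # False: native size for SDXL
```

This is why the usual workflow is to generate at the model's native size and then upscale (e.g. with hires fix) rather than asking for a large canvas directly.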
Playground is a great interface, but could go next level if you could import other models. Playground v2 model still falls short on photo realistic people. Need to compare say Fooocus with an image then use Playground and the difference is obvious. Don't get me wrong, I love Playground on everything else.
Thanks for looking into this so I don't have to!
I got you!
but can it do hands
Agenda noted.
I get why you used their "model" but they also supplied a VAE and scheduler... Worth trying?
how do you install the scheduler?
@@BluezJustice I dont know. (but I did try to figure it out) VAE didnt work for me but std sdxl one worked good.
Looks like a decent model. The clip shouldn't matter unless its trained on a specific setting. Then you could use the CLIP Set Last Layer to adjust to whatever is recommended.
This dad-joke comes with a bottle of pepperspray to spice it up if needed.
Considering how many seeds there are, any "like for like" comparison is meaningless.
I will be impressed when you don't have to perform all sorts of acrobatics to get more than one figure in a picture, each doing its own thing.
Not really convinced it's 250% better after playing with it. I still think some other models are pretty good compared to it; even Juggernaut works fine with low CFG and few steps.
I agree. I don't think it's better than any custom SDXL model, tbh :( I would've liked to see user tests against custom SDXL models, not just the base model like they did.
@sebastiankamph In my test, landscapes look better in Playground than Juggernaut at 1024x1024, more photorealistic with 10 steps and CFG set to 3. Juggernaut's results are clumsy, but Playground is lost as soon as you use ControlNet on top of it. It looks like it doesn't even apply ControlNet. That's weird.
Better than SDXL? Please....it's better than 'old' playground, but that's about it.
All these different models are really taking up space, I have about 100 GB of them now in Comfy.
Only? 😂
Rookie numbers, I've got 460 GB of models, and I just cleaned out all my trash merges xD
@z1mt0n1x2 xD 1TO38Mb, but I really need to dump some old stuff that's been collecting dust since 1.4.
@z1mt0n1x2 I have four 1TB USB SSDs and a 1TB M.2 SSD, and I'm still getting low on space. I don't have that much in models like you do yet, though, just other random stuff like partition images and videos. Too bad there's only one M.2 socket on my mobo; they're very small. Wasn't easy to figure out how to plug it in there though: you have to push it in kind of sideways and then down. No "that's what she said" jokes please.
I have 1TB of them... and I wonder which ones to delete.... 100TB is too little for AI models. It seems you don't do merges at all if you use so little space....
👋
This or Juggernaut XL ? :)
SebastiansMerge.safetensors
Not another LCM model..
You forgot to change the negative prompts? Long neck, no bad hands, arms, fingers, face.. when generating a cat image? That's going to mess up the result big time
He literally said the CFG scale and steps "don't make much of a difference" lol. These guys are cringe; they only care about putting out as many videos as possible but have no clue what they are doing.
This ComfyUI looks a lot more clean than the boxy crap you had before.
At this point any new standard models are pointless now that DALL-E 3 has started with its GPT integration. Accuracy, interpretation capability and editing capabilities should be looked at more than just trying to pump out models that aren't really doing much more than we've already seen.
Imo Realistic Vision is still much better. This model does not look like anything special to me.
Vae shouldn’t be shared like that
New models are like new fitness YouTube channels. Same old song, just repackaged.
Well, new base models are at least only released like once or twice a year :)
@sebastiankamph Word, but if you hadn't said this was a new base model, I would have just thought it was a variation of the ones we're used to. Maybe I'm missing something, but it doesn't seem any different.
Ran a few tests in Comfy with the Unet and my usual benchmark prompts. No matter the settings, is this model any better than any custom SDXL I have? Not even close...
Even my Turbo+LCM models do better, 10x faster :(
But I still liked the video for the swiftness of a peregrine falcon ! lol
GG Sebastian !
I broke some YouTube world records getting it out so quick!
Doesn't seem better
Right, the compared images were completely different takes on the prompt; how do I compare which is better from that?
The fact that people preferred 2.5x vs SDXL can also mean that it is 2.5 times worse if you realize the horrible aesthetic taste that average people have. 😂😂😂😂😂
Yeah, I know, it's so strange too. You'd think people would have a more streamlined taste, but it's like completely the opposite.
Doesn't look very good, thanks for looking at it though
more art, less realism
Using A1111, I did a comparison with this model and several other xl models, using identical settings. This one gave the worst results. Not impressed.
Man, I've noticed that I watch your videos at X2 speed and you still speak slow haha
No A1111, no care
I just came here a second time to vomit on your clickbaits. But most of the time your videos are great, except the ones where you are promoting paid stuff like the Lemonade Seeds bullshit.
Think I’ll stick w MidJourney still
Great video though
Why didn’t you mention that you were using comfy UI so I could have avoided this video altogether?
It's not available in a1111 yet.
No one really cares about this aside from playground ai