The thing I'm waiting for is character reference that consistently keeps every aspect of the character, not just the likeness. For example, keeping the outfit exactly the same across different prompts.
How close can we get Midjourney to do that? I know there are certain character weight settings and some other stuff… I think it gets pretty close, correct? I haven't really pushed it that far…
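For reference, the "character weight" mentioned here is Midjourney's `--cw` parameter, used alongside a `--cref` character reference image (the URL below is just a placeholder):

```
/imagine prompt: a detective in a rain-soaked alley, cinematic lighting --cref https://example.com/character.png --cw 100
```

`--cw 100` (the default) tries to carry over face, hair, and clothing from the reference; dialing it down toward `--cw 0` keeps only the face. So for outfit consistency, higher values are what you'd want.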
Thanks for showing my clip, brotha! I'm cooking some stuff up. I already dropped that little teaser for the music video, and I just got some more clips that look fantastic. I did get some funky stuff too, but tbh probably 8 out of 10 generated properly without needing a reroll. That's not to say it's perfect, but I've had a ton of luck with the consistent character. No more generating images just to animate them; I just add the character image and can generate scenes from that. Anyhow, thanks again bud, and have a chill evenin', man.
I find the best option for consistent characters is Midjourney with character reference then image to video. This gives you the option to reroll your images till you get a consistency you're happy with. Of course having character reference in the video generator is great if you're doing text to video, but otherwise if you're already generating images, Midjourney is the way to go IMHO😀
AH! Vidu. Okay I've already been using this...or tried to. Like it got the face reference perfect on mine but kept changing the clothes. Wasn't super useful BUT it's SUCH a good start! Like...I'm glad it EXISTS, now they can improve it and others can learn from it.

...yeah I've been working on an AI Short Film and I'm using Runway Gen-3 Alpha/Gen-3 Alpha Turbo, Kling, Luma, and Vidu all together. It's kind of a hoot seeing how differently each one handles different things (with Luma giving me the most trouble on character consistency, though it has improved recently). Vidu is pretty cool. I like that I can pay half-price for a preview video and then pay the rest to clean it up. Wish I could top up my generations though or I would have used it more.
The only trick I’ve found with somewhat consistent clothing is just to prompt for really bland things. Like, AI-me worked because the reference was a black tee shirt, so super simple. But even if I were to prompt for me in a tuxedo, I’d still get the black tee shirt. I suppose you could get insane and take pictures of an actor in different clothes to use as reference? Haha. So stupid it might work!
Just tried Vidu and it's awesome! Much better than Runway or anything else I have tried. With just a simple prompt to get a female agent flying over a ruined city, looking down at her at a 45-degree angle, Vidu did it right, and beautifully, the first time. Runway Gen-3 has NEVER been able to do it; even after using all my credits (I have a subscription) and about 40 or 50 tries, Runway did not have a clue.
Thank you!! For sure too-- I need to play around with the idea of doing half image to video and half Text (character ref) to video. I think you can get something super dynamic out of that!
Right?! How much better a movie would that have been over that last Pirates movie? Which I actually called Dead Man's Chest. I was wrong about that: that was the second one, which is actually great. I meant Dead Men Tell No Tales. Which was, oooof. Not as bad as the 4th one, though! (To note: a while back I marathoned all the movies, so they've been stuck in my head.)
Another piece of the puzzle. I hope other platforms will also add the character reference feature and do it better. The filmmaking pieces are all coming together, but it feels a bit like slow motion right now.
It does. A lot of kitbashing to get something cooking! I should probably do a round up at some point and see how many different places it takes to pull off one scene-- from concept to animation, to lip sync, etc-- I'd be curious to see how spread out we are right now, and what we're still lacking.
@@TheoreticallyMedia That's a great idea. Maybe wait 1 or 2 months with that, I suspect new A.I. things and upgrades will be released after the summer season.
Watching AI make these leaps week to week on your channel, it occurred to me that, given the pace of innovation, it might be interesting for you to do a compilation video every few months. Compilations of niches, i.e. "text to video" or "image to video," would give a longer view of the progress in particular areas. This would be of particular interest for investors, and also historians who need to see what the pivotal events in the AI revolution were.
Oh, that is a cool idea! Like a "clip show" in a kind of way? It's funny, a lot of us were talking about doing a "year in review" video at the end of 2023-- but also, I think we all got slammed by the holidays. Plus, we all forgot what happened in January! ha This might be a good solution!
I just finally got to see the AI you HAHAHA so awesome. I had to run out for a few so I paused, and came back to you backin' yo thang up at the club, which may or may not have influenced a 50 Cent track
wow, that really sold me on Krea!!! What's the best image-to-video model that retains text? I have an image of a customer holding my supplement product I'd like to turn into a video but most models are warping the label on my bottle.
Krea’s upscaler is amazing, that said: experiment with those sliders! It’s super touchy! I think most of them are retaining text in image to video these days. Gen-3 for sure, and I’ve had some good luck with Luma and Kling on that front as well
Did you already test FotoR? They recently added video generation as well, but they don't give you many free credits... I think they give you 2 credits per day for logging in (not automatically), but a video creation needs at least 4 credits, or something like that.
Utilizing every AI tool at your disposal, plus editor software for post-editing, can you try to make a realistic cinematic short feature film? Telling a story through people, places and things?🙏🏼🥺
Consistency has been tough with AI, but a lot of AI generators have some form of face consistency, and then there is face swap. Also, in a series of generations you get mostly similar faces that face swap can fix. Nice to see it coming from video generators. Now if they did environmental consistency like LTX, that would be awesome. What I would really like to see is outfit consistency; for the character I use in my movies, that would really help. Right now getting a consistent outfit is nearly impossible. As far as length goes, 4 to 10 seconds works fine for me. If I want to lengthen, I usually change the speed in DaVinci.
Outfit consistency is a tough one. You can even see that w/ the Not David Beckham shots. The shots of AI Me worked, because I was ref'ing a guy in a black tee-shirt--which is bland enough to always work. Also-- Detectives wearing a white shirt and a black tie. It's a never miss. But for the most part: You can't be too fashionable as an AI Character!
"Vidu shouldn't be used as your primary generator" — I agree strongly. It also has very low fidelity on resolution; for likeness it's best to use it for close-ups only. It screws up all fine details unless they're visible at very low res. It honestly also asks for a bit of cash for someone who earns a non-dollar currency.
I'll say, at least currently as well, even after an HD upscale on Vidu, you still pretty much have to run it through Topaz (more $) to get it up to spec. Still, willing to give them some growing room, esp if they get that creative upscaler really kicking!
Do you know about ComfyUI? I think you can do local video generation and image generation with it if you own a high-end consumer graphics card like the RTX 4090. You can then run Flux Dev for free and Stable Video 2 on it. If you're planning to do an entire movie, this can be worth it over paying for lots of services.
Oh for sure! I do like Comfy, but I tend to stay away from it on the channel only because: A) It's fairly complex and requires a good amount of knowledge to get started, and B) it can be very temperamental based on your machine specs. Anytime I do a video or a tutorial on running locally, the comments section just turns into tech support. haha, I don't have time for that! For sure agree on the cost of local vs SaaS-- I think, at least? One video I really want to do at some point is a cost comparison of buying a capable machine and running locally vs. say Midjourney/Runway etc. The thing is, most would say investing in hardware is the way to go-- but I want to go deep on it and factor in things like learning Comfy (what does your time cost per hour?), building and maintaining the machine, and upgrading. I think the cost of running locally will still come out on top-- but, I also think the margin might be closer than we think.
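That comparison can be roughed out as a simple break-even calculation. All the numbers here are illustrative assumptions (hardware price, learning hours, hourly rate, and combined subscription cost), not real quotes:

```python
# Rough break-even sketch for local hardware vs. SaaS subscriptions.
# Every figure below is an illustrative assumption, not a real quote.

def breakeven_months(hardware_cost, learning_hours, hourly_rate, monthly_saas):
    """Months until a one-time local setup beats recurring SaaS fees."""
    # Up-front cost = the machine itself plus the time spent learning Comfy.
    upfront = hardware_cost + learning_hours * hourly_rate
    return upfront / monthly_saas

# Assumed: a $2,500 rig (e.g. an RTX 4090 build), 40 hours of learning valued
# at $25/hr, versus roughly $60/month across Midjourney + Runway tiers.
months = breakeven_months(2500, 40, 25, 60)
print(f"Break-even after about {months:.0f} months")  # -> about 58 months
```

With those assumptions the learning time alone adds $1,000 to the up-front cost, which is exactly the kind of hidden margin the comment above is getting at.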
@@TheoreticallyMedia great thoughts! I thought you would know about this. And yeah, it would probably lead to a lot of support, since nodes easily go missing and you have to know your way around asking AI tools for help solving some issues if you're not working as a programmer. Been using ComfyUI a lot; let me share my experiment, you might find it interesting.

I just finished a complete story-generation pipeline which I wrote in Python with the help of Sonnet (it wrote the code for me). The pipeline selects a random place on earth and a random time in history and sends a message to Claude Sonnet 3.5 to reflect on interesting occurrences or points of interest for that place. I then ask it to reflect on a realistic biography of a person who would live there (where they grew up, where they went to school, etc., their relations, and some conflicts or dilemmas in their life). It then creates a story pitch based on the real grounding from the initial step and is tasked with pitching something we've never heard before. In the next step I give it 16 principles for good storytelling and task it with generating each continuation of the story one part at a time. When it's finished, it reviews the story and replaces parts based on a don't-reveal-things-too-early approach and guidelines that make sure the story works well as a whole. I then do some final post-processing of the text to make the story better.

When the script is ready, the code splits the text into chunks of two sentences each and sends each part as a request to ElevenLabs to generate speech audio. Then I have it master everything with audio compression, using a library of background audio I generated with Suno (about 120 curated background-music and ambient loops), which I task Sonnet 3.5 with placing along the story as a JSON response with audio files and timings. The same goes for 50 sound effects I generated using ElevenLabs, which it puts into the timeline aligned with the text-to-speech.
In the next step it sends the final audio and a JSON describing all the points in time to ComfyUI via its API to generate images for it. Then it does a post-processing audio mix to make sure the speech is never drowned out by the sound effects and ambient sounds. The audio output works really well. Then I plan to have it upload the videos to YouTube as well. This is an experiment; I will gladly share the new channel with you when it's finished. All automated. Not sure it will be good, but the early results seem promising.
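As a rough illustration of two of the steps described above, here's a sketch of the two-sentence chunking and of queuing a workflow against ComfyUI's standard local API. The function names are made up, the TTS call is left out entirely, and the `/prompt` endpoint assumes a default ComfyUI install running on its usual port:

```python
import json
import re
from urllib import request

def two_sentence_chunks(script: str) -> list[str]:
    """Split a script into chunks of two sentences each, as in the pipeline above."""
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    return [" ".join(sentences[i:i + 2]) for i in range(0, len(sentences), 2)]

def queue_comfy_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> None:
    """Send a workflow graph to a locally running ComfyUI instance."""
    payload = json.dumps({"prompt": workflow}).encode()
    req = request.Request(f"http://{host}/prompt", data=payload,
                          headers={"Content-Type": "application/json"})
    request.urlopen(req)  # assumes ComfyUI is up; each chunk's scene would go here

script = "She woke at dawn. The harbor was empty. A single ship waited. She ran."
print(two_sentence_chunks(script))  # four sentences -> two chunks of two each
```

Each chunk would then map to one TTS request and one ComfyUI image, keyed by the JSON timings the pipeline generates.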
Ahhh, sorry-- Yeah, that compression is a killer. X is even worse, at least. Basically, Topaz will usually take your output and kind of smooth everything out, give a little more detail, but it's really more about giving everything a final polish. Krea is a bit of a creative upscaler. Think Magnific, but video. It can really work some wonders, BUT-- it can also go pretty nuts, so you've got to be careful with those sliders. But man, if you can find that sweet spot? It's a stunner!
I don't know if you mentioned it already, but what is Vidu a killer of? Thanks in advance! P.S. Really impressed by that Krea upscale of the girl pirate.
Haha, I think I mentioned it at the end of the video-- but, In my opinion, Vidu kills nothing, but does have a killer feature! I do think that now that they've nailed (or proven) character referencing can work, we'll see it iterated and implemented into the other video generators soon enough!
@@TheoreticallyMedia haha yeah I was only continuing the joke! It's quite exciting to see these tools develop. For video generation (for artistic purposes) I've only been using Kaiber so far. I do like that the output is more artistic / abstract but eventually I would want to also use something that gives me more control in order to tell stories in my music videos. Character reference is a must for that. I checked LTX studio after your short today and it seems really promising towards that direction but the pricing is basically a hard stopper for me, since in order for me to be able to post my creations as music videos it requires the "Business Plan" which is around 175 per month. So here's to hoping we will soon have affordable and reliable tools to tell stories with.
Generative AI video is the externalization of the human visual imagination to machines. Something we couldn't do until just recently. So many of the technologies that future generations will take for granted were impossibilities just a generation ago.
This feature should have both a start frame and a reference at the same time. That way you can start a shot from behind, where the face is not visible, and then continue to the same face… I hate it when my character turns around and then comes out with a new face.
It's still not up to my standard, but I would say, if you are going to use these videos, put them in professional video editing software like DaVinci Resolve, because the color balance is waaaaY off!! You can also do things like make the fire look like real fire with a bit of masking and replacement.
Oh, agreed. I don't have time to do those things for these videos, but I did a lot of that kind of stuff on my previous AI films. Those are the little details that really kick up AI videos to the next level.
Vidu's free plan is useless. "Task Limit Exceeded, the server load has reached its limit." That's with 64 credits on my account. I haven't been able to generate anything for weeks, no matter what time of day. They might as well remove the free option.
To be honest, I think the hug of death might have happened. I generated a lot over the weekend and it was super smooth, but just as I was finishing this video up, I noticed a lot of sluggishness and errors. I’d give it a few days and see if it stabilizes.
Ok, who else thinks that "pirate" scene was trained on "pirate" copies of PotC and associated titles? Seems to me there isn't the reference material in the public domain- esp. as those sorts of pirates didn't actually exist.
Shhhhh, that's how I get all my business. "Come to my office in Crime Alley...oh, now you need to hire me because of the crime that happened? How unfortunate!"
Wait--did I get that wrong? I meant the 5th one...the one with Javier Bardem. Shoot- I meant "Dead Men Tell No Tales" I actually like the Pirates trilogy!
I think you might be commenting on the wrong video, but haha- for sure. The Doom game was running on a TPU, so significantly more horsepower than…well, to be honest, I think a Roku can probably run the original Doom!
Sorry for a 3rd message, but WOW again for Vidu. Just tried image to video with a prompt: female with a jet backpack flying between buildings with a gun. (Same stuff I just tried on Runway Gen-3 = got what I'd describe as 'beautifully rendered rubbish': she flew backwards and had no legs. Kling I am still waiting on. Haiper just shredded her, which is what it mostly does to everything.) BUT Vidu: flying forward between buildings as in the image and prompt, AND Vidu added dodging and weaving as in a firefight, AND her firing her weapon with gun flashes and even gun recoil! No other platform does proper gun flashes. No other platform can do such dynamic and realistic maneuvering. Nothing comes close to Vidu for unusual and action videos. Nothing!
That's awesome to hear! I gotta jump back on Vidu soon! Have you played with Minimax yet? It's pretty impressive (and FREE!) although, only Text to Video, currently!
@@TheoreticallyMedia I uploaded the last video. It's all text to video. Now working to make it into a short movie, so it's just prompting, and it keeps the character details.
Wicked. I have a Luma subscription and it's more expensive than Netflix. I honestly can't give a straight answer as to why, either. But I have a bad feeling I'm going to subscribe to this too! Ready to go open source, though. Figure it's time to cancel the subs and go 4090.
I must have missed the announcement on this feature with the onslaught of Kling/Gen-3 news. And to be fair, I don't think I've seen any coverage on Vidu- so, on the plus side, at least this video is getting the word out.
@@TheoreticallyMedia Luma is garbage now. More than 24 hours for a generation, and I can't use my free generations 'cause I'm on a waitlist! As long as it's hit and miss (mostly miss), I won't pay for it!
4:45 It is painful how Tim uses art terms. "The color saturation looks completely off." Ugh, no, the exposure is off. Or you could say the contrast is too high.
Eh, there are about 150 videos here, spanning about 4 different mediums. Meaning there are around 149 videos with at least one terminology gaffe. Don't get hung up on semantics.
I honestly think Vidu's results are horrible. Runway right now is still the best image to video and text to video on the market. You can't beat Gen-3 Turbo.
Vidu has some work to do for sure. When it hits, it hits, but often you’ll end up with glitches and wonky hands. I think they’re about an update or two away from something pretty stellar though. That said: Gen3A’s speed?! That’s insane.
oh for sure-- You can probably re-enforce it back on the Krea side-- but to be honest, I'm a bigger fan of keeping famous faces out of AI generations. That said, plenty of folks out there doing deepfakes of actors and actresses.
I'm not impressed yet. Character references in video still have a lot of room to improve. I want multiple consistent characters as well, and it's a struggle 😢
For the price point, Vidu is cheaper than any of its video competitors, such as Kling and Luma. The only thing missing in Vidu is the ability to input a start frame as well as an end frame in one generation.
I think Kling is having a STUPID cheap sale right now-- so, if you catch that, it's a price that can't be beat. But, temporary. That said: Yeah, there's a real price war going on right now, and that is awesome for us! Vidu is still a little hit or miss for me with the morphing, but I'm totally willing to give them time. I think it'll grow to be a powerhouse!
I could see that-- which is weird, because I'd never think of me looking like him. For sure, there is some additional "pasting" onto various actors though-- hence Pirate Depp
Character reference in text to video. You could do it obviously in Image to Video, but this is the first I’ve seen that will take a character and put them into text to video.
Might have given it the hug of death. It was working pretty well over the weekend, but I have noticed some more slowdowns and errors. We might have pushed too much traffic to them. I’d say give it a day or two and let them stabilize out a bit.
That’s why I’d prefer a low res (less credits) preview before committing. Ideally that low res version would be free, but I understand it still costs something to generate. I hope Vidu (and the others) implement this idea. Kaiber did something similar way back, where you could see the first frame before you generated. Just something to let us know if the model is headed in the right direction.
I know it's customary to use the trick of telling people "That's coming up" or "We will discuss that in a little bit." This is all in the hope that people will watch the entire video, because that benefits you more. I get it, but you do it so much that it becomes very annoying, so much so that I stopped watching less than halfway through.
Good note! I'll try to keep an eye on that. I don't really script anymore, I'm just looking at outlines, so a lot of times I'm saying that out of habit-- and also to remind myself to talk about it later. But, yeah-- if you guys are noticing, then I'll work on it!
Eh, it’s my thing. If I stare directly into the camera I feel like a serial killer. Haha. I also think, since I externally process a lot (thinking out loud), the visual stimuli of looking around feeds the thought process. What can I say, it’s who I am.
You can check out this Video to Video workflow (mentioned in this video) here: ruclips.net/video/hjk4a9gxILc/видео.html
Estar in all of my content when I want to and I get consistent shots 90 to 100% of the time
Runway, Luma & Vidu are the ponds I'm fishing in for now... excellent tools. Great show, thanks brother!
Thanks so much! Those are good waters! Lots of bites!!
Runway and Vidu for me, at this time. I LUV 'em! Cheers
Hey man! Sorry I didn't get to work the fake merch in! haha-- I saw that too late unfortunately! Great work though!!
@@TheoreticallyMedia no worries, the fake merch one was made with Runway anyhow. That was grok and g3at
Always a pleasure to see what you are trying and showing us!
Very true, thanks! Great advice. I see some amazing AI artists doing this for a consistent style, but not yet for a consistent specific character(s).
I like Krea's photo creative upscaler. Video upscaler is OK sometimes.
It can be hit or miss. You've really got to play with those sliders. I've found that less is more with it, to be sure!
Thank you for the tips. Your videos are always very dynamic and entertaining. I love your sense of humor.
Looks promising. Smart to add something like character reference to start with, for sure.
Great vid as always, cheers!
The very near future looks like your own home made movie projects that could match all but the biggest blockbuster films. That's pretty 😎 to me.
Wish I could look as suave as you in those amazing generations! Great review. Thanks for sharing.
I wish I could look as suave as AI Me!! Haha- Did you see how good he was on the dance floor? I'd be falling all over the place!
Oh thank you, man. Theoretically Media, we love your channel; you always give us new and valuable information. We respect you.
You always have the best info. Thank you!
You just created the best female character for the Pirates franchise, and Hollywood should pay you for this! 😊
Immediately goes over to Vidu
I now have soooo many apps - it's like the video streaming issue... so many places to subscribe!!!
7:40 Topaz … thanks so much for
1000%! It’s great!
This feature was there for a few weeks, but thanks for sharing. It helps improve future models through increased usage.
Great strides are being made! The future is bright for AI video!!!
"Vidu shouldn't be sued as your primary generator" I agree strongly. also it has very low fidility on resolution. for likeness its best to sue it for closeups only. it screws all fine details unless visible at very low res."
it honestly also asks for a bit of cash as someone who earns a none $ currency.
I'll say, at least currently as well, even after an HD upscale on Vidu, you still pretty much have to run it through Topaz (more $) to get it up to spec.
Still, willing to give them some growing room, esp if they get that creative upscaler really kicking!
Do you know about comfy ui? I think you can do local video generation and image generaiton with it if you own a high end consumer graphics card like the rtx 4090. You can then run flux dev for free and stable video 2 on it. If planning to do an entire movie this can be worth it over paying for lots of services.
Oh for sure! I do like Comfy, but I tend to stay away from it on the channel only because: A) It's fairly complex and requires a good amount of knowledge to get started, and B) it can be very temperamental based on your machine specs.
Anytime I do a video or a tutorial on running locally, the comments section just turns into tech support. haha, I don't have time for that!
For sure agree on the cost of local vs SaS-- I think, at least? One video I really want to do at some point is a cost comparison of buying a capable machine and running locally vs. say Midjourney/Runway etc.
The thing is, most would say investing in hardware is the way to go-- but I want to go deep on it and factor in things like Learning Comfy (how much do you cost per hour?), building and maintaining the machine and upgrading.
I think the cost of running locally will still come out on top-- but, I also think the margin might be closer than we think.
@@TheoreticallyMedia great thoughts!
I thought you would know about this. And yeah it would prop lead to a lot of support since nodes easilly get missing and you have to know your way around asking ai tools for help solving some issues if you are not working as a programmer.
Been using ComfyUI a lot-- let me share my experiment, you might find it interesting.
I just finished a complete story-generation pipeline, which I wrote in Python with the help of Sonnet (it wrote the code for me). The pipeline selects a random place on Earth and a random time in history, then sends a message to Claude 3.5 Sonnet to reflect on interesting occurrences or points of interest for that place. I then ask it to reflect on a realistic biography of a person who would live there (where they grew up, where they went to school, their relationships, and some conflicts or dilemmas in their life). It then creates a story pitch based on the real grounding from the initial step and is tasked with pitching something we've never heard before. In the next step, I give it 16 principles of good storytelling and task it with generating each continuation of the story one part at a time. When it's finished, it reviews the story and replaces parts based on a "don't reveal things too early" approach, plus guidelines to make sure the story works well as a whole. I then do some final post-processing of the text to make the story better.
When the script is ready, the next step splits the text into chunks of two sentences each and sends each part as a request to ElevenLabs to generate speech audio. Then I have it master the result using audio compression, drawing from a library of background audio I generated with Suno (about 120 curated background music and ambient loops), which I task Sonnet 3.5 with placing along the story as a JSON response of audio files and timings. The same goes for 50 sound effects I generated with ElevenLabs, which it puts onto the timeline aligned with the text-to-speech. In the next step, it sends the final audio, plus a JSON describing all the points in time, to ComfyUI via its API to generate images. Then it does post-processing audio mixing to make sure the speech never drowns under the background music, ambient sounds, and SFX. The audio output works really well.
Then I plan to make it upload the videos to YouTube as well. This is an experiment-- I'll gladly share the new channel with you when it's finished. All automated. Not sure it will be good, but the early results seem promising.
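One concrete step from the pipeline described above is splitting the finished script into two-sentence chunks before sending each chunk as a separate TTS request. Here's a minimal sketch of how that step might look; the function name `chunk_script` and the naive regex sentence splitter are my own assumptions, not the commenter's actual code.

```python
import re


def chunk_script(script: str, sentences_per_chunk: int = 2) -> list[str]:
    """Split a script into small chunks (two sentences each by default)
    so each chunk can be sent as a separate text-to-speech request."""
    # Naive split: break after ., !, or ? followed by whitespace.
    # A real pipeline might use a proper sentence tokenizer instead.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", script.strip())
        if s.strip()
    ]
    # Group consecutive sentences into fixed-size chunks.
    return [
        " ".join(sentences[i : i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


if __name__ == "__main__":
    demo = (
        "The storm broke at dawn. Mara ran to the harbor. "
        "The boats were gone! She turned back toward the hills."
    )
    for part in chunk_script(demo):
        print(part)
```

Each returned chunk would then be posted to the TTS API in its own request, which keeps individual audio files short and makes it easy to align them on a timeline later.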
Thanks for all these updates!
1000%! Thank you for watching!
@@TheoreticallyMedia Learning so much from your good work!
Can you explain the difference between the Topaz upscale and Krea? Because with YouTube compression, I really can't tell the difference.
Ahhh, sorry-- Yeah, that compression is a killer. X is worse at least. Basically, Topaz will usually take your output and kind of smooth everything out, give a little more detail, but it's really more about giving everything a final polish.
Krea is a bit of a creative upscaler. Think Magnific, but video. It can really work some wonders, BUT-- it can also go pretty nuts, so you've got to be careful with those sliders. But man, if you can find that sweet spot? It's a stunner!
Consistent character refs will really be a "Game" Changer ... More info coming soon. ; )
Oooooohhhhh...I like where this is going!
I don't know if you mentioned it already, but what is Vidu a killer of?
Thanks in advance!
P.S. Really impressed by that Krea upscale of the girl pirate.
Haha, I think I mentioned it at the end of the video-- but, In my opinion, Vidu kills nothing, but does have a killer feature!
I do think that now that they've nailed (or proven) character referencing can work, we'll see it iterated and implemented into the other video generators soon enough!
@@TheoreticallyMedia haha yeah I was only continuing the joke! It's quite exciting to see these tools develop. For video generation (for artistic purposes) I've only been using Kaiber so far. I do like that the output is more artistic / abstract but eventually I would want to also use something that gives me more control in order to tell stories in my music videos. Character reference is a must for that. I checked LTX studio after your short today and it seems really promising towards that direction but the pricing is basically a hard stopper for me, since in order for me to be able to post my creations as music videos it requires the "Business Plan" which is around 175 per month. So here's to hoping we will soon have affordable and reliable tools to tell stories with.
Big fan of your channel. Thank you
Generative AI video is the externalization of the human visual imagination to machines. Something we couldn't do until just recently. So many of the technologies that future generations will take for granted were impossibilities just a generation ago.
Good stuff Tim!
We might be able to prompt an entire movie sooner than we think. Holy cow.
Getting there!!
Dude, your videos rock! :)
No, YOU ROCK!! haha, thank you!!
@@TheoreticallyMedia - I also salsa - Here's a AI music vid I made. Thanx for all the tips and tricks - ruclips.net/video/Cq-WisY8jwI/видео.html
Do you reckon Krea has the best video to video creative upscaler?
I love your videos. Can you please make a video on Odyssey AI, which is still under development and promises full elemental control over videos?
This feature should support both a start frame and a character reference at the same time. That way you can start a shot from behind, where the face isn't visible, and then continue to the same face... I hate it when my character turns around and comes out with a new face.
It's still not up to my standard, but I would say if you are going to use these videos, put them in professional video editing software like DaVinci Resolve, because the color balance is waaaaY off!! You can also do things like make the fire look like real fire with a bit of masking and replacement.
Oh, agreed. I don't have time to do those things for these videos, but I did a lot of that kind of stuff on my previous AI films.
Those are those little details that really kick up AI videos to the next level.
@@TheoreticallyMedia I just meant in general, wasn’t directed at you
@@ATLJB86 oh, 100% Got that!
Two very large differences with that pirate :D
👋 Looking forward to this 😊
It’s a pretty good one! AI me is living the life!! Haha
I want an API for Minimax and character consistency. Also, a complete movie machine that adds cloned voices from the script. One prompt to complete movie.
Vidu's free plan is useless. "Task Limit Exceeded, the server load has reached its limit." That's with 64 credits on my account. I haven't been able to generate anything for weeks, no matter what time of day. They might as well remove the free option.
To be honest, I think the hug of death might have happened. I generated a lot over the weekend and it was super smooth, but just as I was finishing this video up, I noticed a lot of sluggishness and errors. I’d give it a few days and see if it stabilizes.
Ok, who else thinks that "pirate" scene was trained on "pirate" copies of PotC and associated titles?
Seems to me there isn't the reference material in the public domain- esp. as those sorts of pirates didn't actually exist.
Isn't your office in Crime Alley?
Shhhhh, that's how I get all my business. "Come to my office in Crime Alley...oh, now you need to hire me because of the crime that happened? How unfortunate!"
...did you actually just crap on Pirates of the Caribbean 2? aint no way
Wait--did I get that wrong? I meant the 5th one...the one with Javier Bardem. Shoot- I meant "Dead Men Tell No Tales"
I actually like the Pirates trilogy!
I’m guessing this needs millions of times more processing power than the original Doom.
I think you might be commenting on the wrong video, but haha- for sure. The Doom game was running on a TPU, so significantly more horsepower than…well, to be honest, I think a Roku can probably run the original Doom!
Sorry for a 3rd message, but WOW again for Vidu. Just tried image to video with a prompt: a female with a jet backpack flying between buildings with a gun (same stuff I just tried on Runway Gen-3 = got what I'd describe as "beautifully rendered rubbish"; she flew backwards and had no legs). Kling I am still waiting on, Haiper just shredded her (which it mostly does to everything), BUT Vidu: flying forward between buildings as in the image and prompt, AND Vidu added dodging and weaving as in a firefight, AND her firing her weapon with gun flashes and even gun recoil! No other platform does proper gun flashes. No other platform can do such dynamic and realistic maneuvering. Nothing comes close to Vidu for unusual and action videos. Nothing!
That's awesome to hear! I gotta jump back on Vidu soon! Have you played with Minimax yet? It's pretty impressive (and FREE!), although only text to video, currently!
@@TheoreticallyMedia I will try it now. Thanks! And thanks for your work.
5:09 Joannie Depp? 😉
There it is!! Hay-oooooh!
I do text to video with Kling; it's pure prompting with my GPTs.
Same! I built one awhile back. I should probably update it soon.
@@TheoreticallyMedia I uploaded the last video. It's all text to video. Now working to make it into a short movie, so it's just prompting, and it keeps character details.
Brilliant!
Nice one , I like it…. Cheers
Wicked. I have a Luma subscription and it's more expensive than Netflix. I honestly can't give a straight answer as to why, either. But I have a bad feeling I'm going to subscribe to this too! Ready to go open source, though. Figure it's time to cancel the subs and go 4090.
Thank you
Yeah, this one's been out a couple weeks. I made a music video with me in a mansion, smoking a cigar and driving a Ferrari.
I must have missed the announcement on this feature with the onslaught of Kling/Gen-3 news. And to be fair, I don't think I've seen any coverage on Vidu-- so, on the plus side, at least this video is getting the word out.
Kling and Gen 3 are still the best.
I like all of them (Luma as well)-- but yeah, those are usually my first go-to's as well. I've been circling back more and more to Luma though!
@@TheoreticallyMedia Luma is garbage now. More than 24 hours for a generation and I can't use my free generations cause I'm on a waitlist! As long as it's a hit and miss (mostly miss) I won't pay for it!
Unfortunately, Vidu needs a queue-up system.
Yeah, we need V2 of Vidu. But I'm sure they're working on it.
That thumbnail looks like you... 50 years ago.
Oh, Vidu is very kind. I mention that. It also makes me into a marathon runner. Vidu knows how to butter you up.
4:45 It is painful how Tim uses art terms. "The color saturation looks completely off." Ugh, no, the exposure is off. Or you could say the contrast is too high.
Eh, there are about 150 videos here, spanning about 4 different mediums. Meaning there are around 149 videos with at least one terminology gaffe.
Don’t get hung up on semantics.
I honestly think Vidus results are horrible. Runway right now is still the best image to video and text to video on the market. You can't beat gen 3 turbo.
Vidu has some work to do for sure. When it hits, it hits, but often you’ll end up with glitches and wonky hands. I think they’re about an update or two away from something pretty stellar though.
That said: Gen3A’s speed?! That’s insane.
Tx again u da best 😜
Regarding Krea and Beckham, prompt his name and it will help……
Oh for sure-- you can probably reinforce it back on the Krea side-- but to be honest, I'm a bigger fan of keeping famous faces out of AI generations.
That said, plenty of folks out there doing deepfakes of actors and actresses.
Thank you. You're funny 😋
Some good things with Vidu but the quality looks terrible, even the enhanced version.
It's still too soon and a waste of money for any of these apps.
6:31 😂😂😂
I'm not impressed yet. Character references in video still have a lot of room to improve. I want multiple consistent characters as well, and it's a struggle 😢
Totally. That and consistent wardrobe. Slow and steady, though. We’ll be there soon enough!
Then hire an actor and a wardrobe coordinator.
On price point, Vidu is cheaper than any of its video competitors, such as Kling and Luma. The only thing missing in Vidu is the ability to input both a start frame and an end frame in one generation.
I think Kling is having a STUPID cheap sale right now-- so, if you catch that, it's a price that can't be beat. But, temporary. That said: Yeah, there's a real price war going on right now, and that is awesome for us!
Vidu is still a little hit or miss for me with the morphing, but I'm totally willing to give them time. I think it'll grow to be a powerhouse!
What free ai video maker does 10 seconds?!!😮
👍👍👍👍👍👍👍
6:03 Mark Wahlberg
I could see that-- which is weird, because I'd never think of me looking like him. For sure, there is some additional "pasting" onto various actors though-- hence Pirate Depp
❤
First! And I'm not a bot! XD (sorry I wanted to try this at least once here)
You win the chicken dinner!!! It's in the mail! I wouldn't recommend eating it. haha
Tim Johnson
Where is the "something new"?
Character reference in text to video. You could do it obviously in Image to Video, but this is the first I’ve seen that will take a character and put them into text to video.
Jeez, what a tool to fit people up for things they didn't do...
Hi, Vidu? This doesn't work.
Might have given it the hug of death. It was working pretty well over the weekend, but I have noticed some more slowdowns and errors. We might have pushed too much traffic to them.
I’d say give it a day or two and let them stabilize out a bit.
It doesn't make sense (nor does it seem entirely fair) to make you pay for each generation when you'll have to cherry-pick among many results.
That’s why I’d prefer a low res (less credits) preview before committing. Ideally that low res version would be free, but I understand it still costs something to generate.
I hope Vidu (and the others) implement this idea.
Kaiber did something similar way back, where you could see the first frame before you generated. Just something to let us know if the model is headed in the right direction.
Until they bring it to perfection, I won't throw my money in the garbage.😒
Better than the $100/month.
I know it's customary to use the trick of telling people "That's coming up " or " We will discuss that in a little bit." This is all in the hope that people will watch the entire video because that benefits you more. I get it, but you do it so much that it becomes very annoying, so much so that I stopped watching less than halfway through.
I like this guy, but I have to agree with you.
Yeah, it's a little annoying.
Good note! I'll try to keep an eye on that. I don't really script anymore, I'm just looking at outlines, so a lot of times I'm saying that out of habit-- and also to remind myself to talk about it later.
But, yeah-- if you guys are noticing, then I'll work on it!
@@TheoreticallyMedia you make excellent content btw 🙏🏽💎
His content is good tho
The way you keep constantly rolling up your eyes in a diagonal way is so distracting for me 😢
Eh, it’s my thing. If I stare directly into the camera I feel like a serial killer. Haha. I also think, since I externally process a lot (thinking out loud), the visual stimuli of looking around feeds the thought process.
What can I say, it’s who I am.
It's classic eye movement when accessing memory. NLP (neuro-linguistic programming).
bland archetype, such as Andrew Tate lol
Very old news….aren’t you supposed to be up to date with AI news??? This is at least 2 months old news……
Not the character reference in text to video. That’s brand new.