CogVideoX 5B AI Video Model Updated With Img2Vid In ComfyUI

  • Published: Dec 2, 2024

Comments • 55

  • @kalakala4803 2 months ago +6

    Yes, why is there no way to do motion control like ControlNet in AD...? All these companies spend so much on training and developing new AI video models, but there are no add-on features...

    • @TheFutureThinker 2 months ago +4

      Well, another conspiracy theory is that they want you to spend more on credits to keep generating 😂

  • @hugoalvarez923 2 months ago +4

    Maybe we can combine it sometimes with toonCrafter, livePortrait or other options to get more control

    • @TheFutureThinker 2 months ago +3

      Yes, I don't like that it has no control. Again, it feels like gambling with txt2vid and img2vid: wait for the next seed number and see what we get.

  • @SteveWarner 2 months ago +1

    I'd be curious to see how CogVideoX 5B performs when you feed it a very simple 3D blocking animation as a base. The Blender Palladium tool seems to be built for this, and the samples they're showing with CogVideoX seem solid and coherent.

    • @TheFutureThinker 2 months ago +1

      The CogVideoX 5B txt2vid performed well, and some generated results are coherent for objects. But that was text only, so we can't input any 3D object or image.
      For this one, img2vid, it doesn't seem to produce a high percentage of great image-to-video results.
      From my tests, about 3 out of 10 img2vid results are a pass, while the rest morph and deform...

    • @TheFutureThinker 2 months ago +1

      And for 3D animation, between Blender and AI video models, I am going to choose Blender at this moment.
      Maybe there will be a breakthrough in AI video models later.

    • @SteveWarner 2 months ago

      @@TheFutureThinker I'm more interested in vid2vid with the source vid being a rough animation from Blender. Using that with ControlNet and AnimateDiff should work. Just curious if this model would produce better results. Will give it a try once I get v2v with AnimateDiff working.

    • @TheFutureThinker 2 months ago

      @@SteveWarner Yes, v2v with AD just works. With ControlNet it can be controlled, and it's also very flexible depending on the user's creativity.

  • @stopgirlidk6611 1 month ago

    hey, need some help.
    I've been getting this error:
    'CogVideoXTransformer3DModel' object has no attribute 'single_transformer_blocks'
    any help?

  • @madrooky1398 2 months ago +2

    Hilarious how your voice becomes so sad when talking about high memory consumption... 😂
    I feel ya bro.

  • @arimiarts9233 2 months ago

    It's all working fine, but when I set the frame count above 49 I get some sort of mosaic tile output. Is there any way to make this work? I really need this to run longer.

  • @Ken_news5220 2 months ago

    So far the motion director is not compatible with this model, right? So you can't get better control through a pre-trained LoRA. 😟😟😟

    • @TheFutureThinker 2 months ago

      I assume you are talking about Motion Director for ComfyUI. That one is built for the AnimateDiff model; it does not work for DiT AI video models. So if we want a motion LoRA for CogVideoX, it would supposedly have to be trained on this model as a base.
      Again, all AI video models are a gamble: you click a few times, wait for the generation, and hope one of the results is what you are looking for.
      For Runway v2v or AnimateDiff v2v, yes, we can apply it to AI video, but it's not a direct one-flow generation like what we used to do with ControlNet.

  • @Geffers58 1 month ago

    Although this is working fine on my RTX 3090 at home on Windows, it's giving a weird error on RunPod's Linux servers, even though other things I've run there have been basically fine. It marks the model's clip file as corrupted, even though the MD5 hash is right.
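
If you want to double-check the download on the server side, the hash can be recomputed directly. A minimal sketch in Python; the file path is only a hypothetical example, and the result should be compared against the checksum published alongside the model:

```python
# Minimal sketch: recompute a model file's MD5 so it can be compared
# against the published checksum. The path below is hypothetical.
import hashlib

def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

print(md5sum("/workspace/ComfyUI/models/clip/model.safetensors"))
```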

  • @honyeechua9670 2 months ago +1

    I just wonder about the reason for the bad results, because another video showed hints from the GitHub project: the model is trained with long prompts, so short prompts will generate bad results.

    • @TheFutureThinker 2 months ago +1

      I think so. But an img2vid model shouldn't need a long prompt; no other AI video model on the market needs one, since it works from the input image as the visual reference. If a model needs a long prompt even when it's doing img2vid, that means it's not good enough. Why require users to put in extra work just to make one AI work like the others, right? :)

    • @honyeechua9670 2 months ago

      @@TheFutureThinker Appreciate you explaining this!

  • @crazyleafdesignweb 2 months ago

    But can we use Blender or some other 3D editor, export the 3D motion, and run it through an AI video model?

    • @TheFutureThinker 2 months ago

      Then that goes back to the older techniques some 3D artists on YouTube talk about: create the 3D in Blender, then bring it into AnimateDiff v2v, and they think it's awesome.
      For business, new AI models should be a breakthrough.

  • @LEONARDO-z-b2o 2 months ago

    Error occurred when executing CogVideoSampler:
    'list' object has no attribute 'shape'
    Any suggestions?!

    • @TheFutureThinker 2 months ago +1

      This happened to me before. I added an image resize node after the load image node, and then Comfy could run the process. That's how I solved it on my system.
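
Outside of ComfyUI, the resize fix amounts to making sure the input frame matches the resolution the CogVideoX 5B img2vid workflow expects (720x480) before it reaches the sampler. A minimal sketch assuming Pillow; the file names are hypothetical:

```python
# Minimal sketch: resize the source image to 720x480 so the img2vid
# pipeline receives an input with the expected shape.
from PIL import Image

def prepare_input(path: str, size: tuple[int, int] = (720, 480)) -> Image.Image:
    img = Image.open(path).convert("RGB")
    # LANCZOS preserves detail when downscaling; the filter choice is a preference.
    return img.resize(size, Image.LANCZOS)

prepare_input("input.png").save("input_resized.png")  # feed this to the Load Image node
```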

  • @cgraider 2 months ago

    Quick question: is it possible to train a Flux dev model (checkpoint) on a GTX 4080? How? What repo?

    • @TheFutureThinker 2 months ago +1

      Um, not sure if Flux has a checkpoint model trainer yet. But it has training scripts written in Python.

    • @cgraider 2 months ago

      @@TheFutureThinker So the ones on Civitai are using the main open source code?

    • @TheFutureThinker 2 months ago +1

      @@cgraider I guess so; you can ask the checkpoint creator. But the Flux Hugging Face page has the training code, and basically it can run in the cloud or locally.
      Of course, when training a checkpoint, it's better to run it in the cloud.

    • @cgraider 2 months ago

      @@TheFutureThinker Thanks.

  • @vidovitt 2 months ago

    What does this error mean? How do I fix it? Given groups=1, weight of size [3072, 16, 2, 2], expected input[26, 32, 60, 90] to have 16 channels, but got 32 channels instead

  • @MilesBellas 2 months ago

    Sequential image upscale ?
    Ideally .exr.
    Motion blur synthesis next ?

    • @TheFutureThinker 2 months ago +2

      Yes, I want to try this as well. Great one.

  • @Caged_Monuments-x6p 2 months ago

    Is this limited to 720x480 ratio/resolution?

  • @suprichiegt1439 2 months ago

    This takes forever on my 4080 Super and i9-14900K.
    Am I doing something wrong??

    • @Radarhacke 2 months ago

      No, this model is literally bullshit. If, for once, the process doesn't break down, you'll get a terribly bad video clip. This model is a complete waste of electricity.

    • @TheFutureThinker 2 months ago +2

      No, you are not; that's normal.
      This model does max out all the VRAM you have.

    • @Caged_Monuments-x6p 2 months ago +1

      4090 3950 is churning 1 cycle in 5 min

    • @lioncrud9096 2 months ago +2

      No, that sounds about right. It takes 10 minutes to generate a video with my RTX 3090 😑. I heard there is a "Fun" 2B version that takes about 2 minutes per generation.

  • @DeathMasterofhell15 2 months ago

    I like listening to your voice, it's calming. Make more videos, and I'll try my workflow for text to image to video to video.

  • @developmentmotion4765 2 months ago

    Why doesn't my I2V model appear, only 2b and 5b?

    • @TheFutureThinker 2 months ago

      Update to the latest version and it will appear in the drop-down menu.

  • @luisellagirasole7909 2 months ago

    Hi, as usual CogVideo doesn't like me :)
    This is the error I got:
    CogVideoXPatchEmbed( (proj): Conv2d(32, 3072, kernel_size=(2, 2), stride=(2, 2)) (text_proj): Linear(in_features=4096, out_features=3072, bias=True) ) does not have a parameter or a buffer named pos_embedding.

    • @TheFutureThinker 2 months ago +1

      Looks like it's killing your PC 😅

    • @TheFutureThinker 2 months ago +1

      Btw, have you updated diffusers? It's pinned in the requirements.txt.
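
A quick way to check is to print the installed diffusers version and compare it against whatever the node's requirements.txt pins. A minimal sketch; the 0.30 floor below is only an illustrative assumption, not the wrapper's actual requirement:

```python
# Minimal sketch: report the installed diffusers version so it can be
# compared against the version pinned in the custom node's requirements.txt.
from importlib.metadata import PackageNotFoundError, version

try:
    installed = version("diffusers")
    print("diffusers", installed)
    # Hypothetical floor for illustration only; check the real requirements.txt.
    if tuple(int(p) for p in installed.split(".")[:2]) < (0, 30):
        print("Consider upgrading, e.g.: pip install -U diffusers")
except PackageNotFoundError:
    print("diffusers is not installed in this environment")
```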

    • @luisellagirasole7909 2 months ago

      @@TheFutureThinker Ok, the error is gone, thank you, but as before it got stuck at the sampler. No more CogVideo for me :)

    • @TheFutureThinker 2 months ago

      @@luisellagirasole7909 You don't need CogVid for TikTok LOL.

  • @MilesBellas 2 months ago

    oh yes.....
    Benji the Leader

  • @felipealmeida5880 2 months ago +4

    It's cool to be local, have privacy, and be able to generate whatever you want, but what's the use of a model that generates low-quality video, with distortions, and only for 6 seconds?

    • @TheFutureThinker 2 months ago +9

      The problem is, when there's a model with a large parameter count that gives you high quality, you can't even run it locally on a consumer PC. Maybe do some research before commenting, and see what I mentioned about the Transformer architecture.

    • @wereldeconomie1233 2 months ago +2

      @@TheFutureThinker Well, I think not all artists, or people coming from a multimedia background, understand how AI runs. So that's why they comment like this.