Deepseek's cooked a Multimodal AI great!!! 💥 Janus 1.3B 💥

  • Published: Oct 27, 2024
  • Janus is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate pathways, while still using a single, unified transformer architecture for processing. The decoupling not only alleviates the conflict between the visual encoder’s roles in understanding and generation, but also makes the framework more flexible. Janus surpasses previous unified models and matches or exceeds the performance of task-specific models. The simplicity, high flexibility, and effectiveness of Janus make it a strong candidate for next-generation unified multimodal models.
    Janus: Decoupling Visual Encoding for Unified
    Multimodal Understanding and Generation
    arxiv.org/pdf/...
    Janus 1.3B demo - huggingface.co...
    ❤️ If you want to support the channel ❤️
    Support here:
    Patreon - / 1littlecoder
    Ko-Fi - ko-fi.com/1lit...
    🧭 Follow me on 🧭
    Twitter - / 1littlecoder
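
The decoupling idea in the description above can be sketched roughly as follows. This is a toy illustration only, not the Janus implementation: every function name, the 2-dimensional "embeddings", and the quantisation scheme are assumptions made up for the example. The only point it shows is that two separate visual pathways can feed one shared sequence model, provided both map into the same embedding space.

```python
from typing import List

Token = List[float]          # one embedding vector (toy: 2 dimensions)
Image = List[List[float]]    # toy grayscale image with values in [0, 1]

def understanding_path(image: Image) -> List[Token]:
    """Hypothetical understanding encoder: one continuous feature vector per row."""
    return [[sum(row) / len(row), max(row)] for row in image]

def generation_path(image: Image, levels: int = 4) -> List[Token]:
    """Hypothetical generation encoder: quantise each pixel to a discrete code,
    then embed that code (here the 'embedding' is just the code itself)."""
    codes = [min(int(p * levels), levels - 1) for row in image for p in row]
    return [[float(c), 1.0] for c in codes]

def text_path(prompt: str) -> List[Token]:
    """Hypothetical text embedding: one vector per whitespace-separated word."""
    return [[float(len(word)), 0.0] for word in prompt.split()]

def shared_transformer(sequence: List[Token]) -> int:
    """Stand-in for the single unified transformer: only reports sequence length."""
    return len(sequence)

def run(task: str, prompt: str, image: Image) -> int:
    # The visual pathway depends on the task; the sequence model does not.
    encode = understanding_path if task == "understand" else generation_path
    return shared_transformer(text_path(prompt) + encode(image))

toy_image = [[0.0, 0.5], [1.0, 0.25]]
understand_len = run("understand", "describe this", toy_image)
generate_len = run("generate", "draw this", toy_image)
```

Both calls go through the same `shared_transformer`; only the visual encoder differs, which is the separation the description refers to.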

Comments • 9

  • @mshonle 2 days ago +2

    Janus is an appropriate name for something that has two faces or looks in two different ways, like a multimodal model. (The month of January is named after the Roman god of beginnings and endings, duality.)

  • @msokokokokokok 1 day ago +1

    A task-specific tokeniser is a great idea. This is even true in language models. Imagine a task to count how many r’s are in "strawberry". Why should we tokenise it as "straw" and "berry"? Would it not be better to tokenise it as s,t,r,a,w,b,e,r,r,y?

    • @xlretard 1 day ago

      this guy gets it
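
The tokenisation point raised in this thread can be sketched with a toy example. The two-entry vocabulary and the greedy longest-match splitter below are assumptions for illustration, not how any real tokeniser works; the sketch only shows that letters hidden inside subword tokens become directly countable once the word is split into characters.

```python
from collections import Counter

def subword_tokenize(word, vocab):
    """Greedy longest-match split against a toy vocabulary (hypothetical)."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at i; fall back to a single character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens

def char_tokenize(word):
    """Character-level split: every letter is its own token."""
    return list(word)

TOY_VOCAB = {"straw", "berry"}  # assumed vocabulary, chosen to match the comment

subword_tokens = subword_tokenize("strawberry", TOY_VOCAB)
char_tokens = char_tokenize("strawberry")

# With subword tokens, a model only sees two opaque units.
tokens_with_r = sum("r" in token for token in subword_tokens)

# With character tokens, counting a letter is a direct lookup.
r_count = Counter(char_tokens)["r"]

print(subword_tokens, tokens_with_r)
print(char_tokens, r_count)
```

The trade-off is sequence length: the character split turns 2 tokens into 10, which is one reason a tokeniser might be chosen per task rather than shared across all of them.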

  • @brto 2 days ago +1

    Which is the top model in this category from your experience?

  • @paulyflynn 2 days ago

    Thanks! If I have time, I’ll try it on an iPhone 16 Pro

  • @remsee1608 2 days ago

    So will it be better if it is scaled?

  • @praveengowd 2 days ago

    Hi, nowadays the latest important updates, like IBM Granite models, LightRAG, and Open WebUI, are not coming from you?

    • @1littlecoder 2 days ago +3

      @praveengowd Open WebUI is not new; I have an old video from when it was called Ollama WebUI. They rebranded, I guess. I thought of making an IBM Granite video, but the model was quite mediocre, nothing special, so I didn't find value in it. LightRAG is a genuine miss, thanks for reminding me.

    • @praveengowd 2 days ago

      @1littlecoder Nowadays I'm trying Open WebUI; it seems OK for a general user like me. Can you explore the latest options, "pipelines" and "functions"?
      Also, is there any way to combine LightRAG with Open WebUI pipelines? Can you check that?
      If time permits, please try.