Deepseek's cooked a Multimodal AI great!!! 💥 Janus 1.3B 💥
- Published: 27 Oct 2024
- Janus is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate pathways, while still utilizing a single, unified transformer architecture for processing. The decoupling not only alleviates the conflict between the visual encoder’s roles in understanding and generation, but also enhances the framework’s flexibility. Janus surpasses previous unified models and matches or exceeds the performance of task-specific models. The simplicity, high flexibility, and effectiveness of Janus make it a strong candidate for next-generation unified multimodal models.
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
arxiv.org/pdf/...
Janus 1.3B demo - huggingface.co...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1lit...
🧭 Follow me on 🧭
Twitter - / 1littlecoder
Janus is an appropriate name for something that has two faces or looks in two different ways, like a multimodal model. (The month of January is named after the Roman god of beginnings and endings, duality.)
A task-specific tokeniser is a great idea. The same holds for language models. Imagine a task to count how many r’s are in "strawberry". Why should we tokenise it as "straw" and "berry"? Would it not be better to tokenise it as s, t, r, a, w, b, e, r, r, y?
this guy gets it
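The tokenisation point above can be illustrated with a small sketch. It assumes a toy two-entry vocabulary and a greedy longest-match splitter, both hypothetical and not taken from any real tokeniser, and compares what a letter count can see at the subword level versus the character level.

```python
def subword_tokens(word, vocab):
    """Greedy longest-match split against a tiny, hypothetical vocabulary.

    Falls back to a single character when no vocabulary entry matches.
    """
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def char_tokens(word):
    """Character-level tokenisation: one token per letter."""
    return list(word)


def count_letter_tokens(tokens, letter):
    """Count tokens that are exactly the given letter."""
    return sum(1 for t in tokens if t == letter)


TOY_VOCAB = {"straw", "berry"}  # assumption: illustrative vocabulary only
word = "strawberry"

sub = subword_tokens(word, TOY_VOCAB)
chars = char_tokens(word)

print(sub, count_letter_tokens(sub, "r"))
print(chars, count_letter_tokens(chars, "r"))
```

With the subword split, no token is the letter "r" on its own, so the letters are hidden inside larger units; with the character split, each "r" is its own token and can be counted directly.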
Which is the top model in this category from your experience?
Thanks! If I have time, I’ll try it on an iPhone 16 Pro
So will it be better if it is scaled?
Hi, nowadays you are not covering some of the latest important updates, like the IBM Granite models, LightRAG, and Open WebUI?
@praveengowd Open WebUI is not new. I have an old video from when it was called Ollama WebUI; they rebranded, I guess. I thought of covering IBM Granite, but the model was quite mediocre, nothing special, so I didn't find value in it. LightRAG is a genuine miss, thanks for reminding me.
@1littlecoder Nowadays I'm trying Open WebUI, and it seems fine for general users like me. Can you explore the latest options, "pipelines" and "functions"?
Also, is there any way to combine LightRAG with Open WebUI pipelines? Can you check that?
If time permits, please try.