MiniCPM-V 2.6 Training Guide
- Published: Feb 9, 2025
- MiniCPM-V is a series of end-side multimodal LLMs (MLLMs) designed for vision-language understanding. The models take image, video and text as inputs and provide high-quality text outputs.
MiniCPM-V 2.6: 🔥🔥🔥 The latest and most capable model in the MiniCPM-V series. With a total of 8B parameters, the model surpasses GPT-4V in single-image, multi-image and video understanding. It outperforms GPT-4o mini, Gemini 1.5 Pro and Claude 3.5 Sonnet in single-image understanding, and advances MiniCPM-Llama3-V 2.5's features such as strong OCR capability, trustworthy behavior, multilingual support, and end-side deployment. Due to its superior token density, MiniCPM-V 2.6 can, for the first time, support real-time video understanding on end-side devices such as the iPad.
📚:modelbest.feis...
Fantastic!
Could this model be trained to analyse sports techniques? This is a game changer!!!
Watching it for the 13th time. Love you guys, love the girl who explains it to me even the 13th time, lol. Will be here for the 130000th time too.
Thanks for the explanations.
Your model really performs very well. Kudos to all the hard work done.
Guys please hire me XD.
Good work