Apart from just testing, when I used it in my actual workflow for coding, maths, and other stuff I normally run through DeepSeek and Claude, I don't think this MiniMax is all that intelligent or powerful. But the whole idea here is the longer context: it's the first time we've got an open-source model with this tweaked architecture and a 4 million token context, so that part is really, really nice. I'm pretty sure that after nailing the context length, the company will work on the intelligence and accuracy it currently lacks. Still a very nice model for what it's trying to focus on. Waiting for fine-tuning methods from developers to improve the model's intelligence.
The crazy thing, Fahd, is that in a very short time this model will get way smaller, maybe 120B or even 70B. And when those Nvidia desktop mini supercomputers come out, we'll be able to run them locally. I can't wait till you do a vid and say, "from my house, on my Nvidia computron," or whatever they're going to call it.
Wow, a 4 million token context window. How many PDFs can it crunch in one go? I wonder whether it will be able to do multimodal understanding too. And what about needle-in-a-haystack performance? The first Gemini generation also had a huge context window but failed to understand/remember context given in the middle.
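For anyone curious, a needle-in-a-haystack test is easy to run yourself: bury one fact deep in a pile of filler text and ask the model to retrieve it. Here's a minimal Python sketch, assuming an OpenAI-compatible chat endpoint; the base URL, API key, and model name below are placeholders, not the real MiniMax values.

    # Minimal needle-in-a-haystack sketch: hide one fact mid-context,
    # then check whether the model can retrieve it.
    from openai import OpenAI

    # Placeholder endpoint and key; swap in whatever provider you use.
    client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_KEY")

    needle = "The secret launch code is 7-4-2-9."
    filler = "The quick brown fox jumps over the lazy dog. " * 20000

    # Place the needle roughly in the middle, the spot where early
    # long-context models tended to lose information.
    mid = len(filler) // 2
    haystack = filler[:mid] + needle + filler[mid:]

    resp = client.chat.completions.create(
        model="MiniMax-Text-01",  # placeholder model name
        messages=[{"role": "user",
                   "content": haystack + "\n\nWhat is the secret launch code?"}],
    )
    print(resp.choices[0].message.content)  # pass = it answers 7-4-2-9

If the model answers correctly no matter where you slide the needle (start, middle, end), that's the behavior the first Gemini generation reportedly struggled with at mid-context.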
I was hoping the larger context window and the token slider would mean it could output much longer-form writing; 1 million tokens is approximately 750,000 words. I wasn't expecting that much, but my experiments have been similar to yours: it is very difficult to get longer output.
Yes, I am also hoping for longer output.