Thank you so much to everyone who watches to the end and likes my videos. It really helps YouTube push them out to more people so I can grow my channel. 👌 I'm pretty shy in real life, and making these videos helps me come out of my shell and be the person I want to be, so thanks again and hope to see you next time!
Can't they see socialism yet, or do they just not want to announce it yet?
Excellent, and no nans were harmed in the making of this video
Cheers!
Nan status: DELIGHTED
😂 balance has hopefully been restored
Great video...and I remember the day you went to collect the Stargate guy!
Thanks! You were his chauffeur! I was a bodyguard to Brian Blessed too that weekend. Good times!!
Great video! I'd never heard of Groq with a Q! (Sounds like a band name)
Thank you! 🎤🎤ARE YOU GUYS READY TO GROQ?!! 🎤🎤 Damn, maybe that should be the title 😂
@@yottylaboratories 😆😆😆
I really like your content. Looking forward to more 👌
Thank you Papa Meow Meow 😊 Thanks for watching.
My tiny mind is blown! Awesome stuff.
There is a new ChatGPT model, GPT-4o, and it's faster than GPT-4. Also, Groq doesn't have any AI models of their own; they just provide inference hardware access.
Cheers, I'll be doing a video on GPT-4o shortly. Is there anything it's done that's really impressed you?
1:50 "get your priorities straight" from some guy that drinks a pint at 9am?
😂 airports are like international waters!
Well, I like Groq. It's amazingly fast. So amazing we all should run it. While I'm working with their playground daily, their API limit is hindering its popularity a lot, and there are no paid plans available yet. Also, I haven't seen a demo of a diffusion model running on Groq yet. Do you know of any?
I haven't seen any, no, but from what I can make out, they should be pretty hot at running diffusion models too. Perhaps they'll add them eventually?
@@yottylaboratories I tried the 500-word story task with Llama 3 on an H100 yesterday. I didn't have a stopwatch, but it was damn slow compared to Groq.
@@dyter07 That's really interesting to know, thanks for sharing 👍
Please provide more information about your software, and let us know if you have been able to buy Groq cards or servers in Europe. I would love an alternative to buying H100s from Nvidia, and Groq seems to be the best alternative if available.
Hi. My software is a plugin called Insights used inside the ConnectWise Manage business software (darklabs.ai if you're interested). At this point I'm just using APIs and not running my own hardware so I have no information about buying Groq chips unfortunately. Thanks for asking!
Impressed but not sure why since most of it went over my head
Thanks! I will try to better explain things in future videos. In hindsight I don't think it needed to be that technical 🤔
Yes for Stargate!
Stargate is an isekai, change my mind 😎
OK, so hands up I had to Google that but thanks for teaching me a new word 👍 So I guess something like Spirited Away is a true Isekai? Looking at the definition, I guess yes, Stargate does indeed go to other worlds and has that sense of exploration, but on the other hand, it seems to imply you'd get there magically, whereas Stargate uses technology. Also, Stargate people can come home whenever they want, I'm not sure if being stuck somewhere is part of Isekai? Thanks again for the new word 👊
@@yottylaboratories Magic and technology are the same thing; one is just unexplainable forces that affect the world. Think about it: invisible forces changing elements at nanoscopic scales, using quantum probability to teleport electrons through solid walls in a controllable way, holding charges as voltage values for stored data, switching millions to billions of times a second, just to take a photo of your mac and cheese 🤣 And yet people think throwing fireballs from your hand is magic. Magic is just technology we haven't made yet, or alternative physics. I kinda like the idea of just calling "magic" "alternative physics".
@@IM2awsme love it!
I wouldn't worry too much about getting a few things wrong; the average comment on these technical analyses has far more errors than you're likely to make through oversight. Just make a new video clarifying and extending your prior analysis and correcting any errors. There's no need to take a video down for having erred in precision while still being right in general. Fortunately, custom AI hardware is so many orders of magnitude superior to adaptations of hardware not expressly designed for the application that you can be wrong in degree while still being correct in your conclusions.
That said, LLMs are not "trained on parameters"; they are trained on data, individual instances of which may be referred to as "samples," "data points," or "examples." Parameters are the learned values inside the model itself: the weights and biases, whose count depends on the number of nodes and the connections between nodes, layers, modules, and input or output registers. So, more generally: weights, biases, and connections.
Also, not all neural-network AI models are LLMs; some process data other than language (although a case could be made that all data is expressed in some form of language). Most multi-modal systems are modular, but GPT-4o is an example of a natively multi-modal model, which processes various modes in the same network. Not all of its intended native capabilities are publicly available yet, however, so it can temporarily still be classified as modular in its public release.
I have yet to view a video on AI that has no misstatements. It's a very complex science that few non-specialists have a full comprehension of. You are doing a good job of demonstrating the relative performance of hardware, and you will learn by OJT, which is often the best way to learn. Let your curiosity guide your research, admit your errors, and move on. The internet has a very low bar for civility in comments sections, so you can't get too emotionally involved.
Thank you for the kind, encouraging words and explanation 👍 you have educated me! I appreciate it.
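The parameters-vs-data distinction above can be sketched in a few lines of Python. This is a toy illustration with made-up layer sizes, not any real model: the parameter count comes purely from the network's shape (weights plus biases), independent of how many training examples it sees.

```python
# Toy illustration: "parameters" are the learned weights and biases,
# counted from the layer shapes. They are not the training data.
def dense_layer_params(n_in: int, n_out: int) -> int:
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

def mlp_params(layer_sizes):
    """Total parameters of a fully connected net, e.g. [784, 128, 10]."""
    return sum(dense_layer_params(a, b)
               for a, b in zip(layer_sizes, layer_sizes[1:]))

# 784*128 + 128 = 100480, plus 128*10 + 10 = 1290, totalling 101770
print(mlp_params([784, 128, 10]))  # 101770
```

A "70B" model is the same idea scaled up: roughly 70 billion such weights and biases, however many data samples it was trained on.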
I am confused again. Why are you comparing Groq to the resources Meta provides to a user? If you're talking about Nvidia, wouldn't it be logical to compare Llama on an Nvidia GPU vs. Groq?
Doesn’t it matter that the Meta servers are simultaneously running inference for perhaps millions (at least 100K for sure) of simultaneous users vs some drastically smaller number for Groq? (It may not, it’s an honest question 😁)
You are exactly right, it will have an effect for sure, but I couldn't find usage stats for either unfortunately. Groq themselves are still confident their chips will outperform anything else out there ATM, but we can't know for sure! Thanks for commenting.
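The concurrency point above can be made concrete with back-of-the-envelope arithmetic. All numbers below are hypothetical, since (as noted) neither service publishes usage stats; the sketch just shows why per-user speed depends on how many users share the aggregate throughput.

```python
def per_user_tokens_per_sec(total_tps: float, concurrent_users: int) -> float:
    """Aggregate throughput divided evenly across active users
    (a simplification: real schedulers batch and interleave requests)."""
    return total_tps / max(concurrent_users, 1)

# Hypothetical: a big cluster serving 500k tokens/s shared by 100,000 users
# vs. a smaller deployment serving 30k tokens/s to only 50 users.
print(per_user_tokens_per_sec(500_000, 100_000))  # 5.0 tokens/s each
print(per_user_tokens_per_sec(30_000, 50))        # 600.0 tokens/s each
```

So a less loaded service can feel far faster per user even with much less total hardware, which is why raw side-by-side timings don't settle the chip comparison on their own.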
Jaffa kree!
But Groq runs a 70B model, while ChatGPT-4? I bet it's more than 70B, so until we can see how GPT-4, 4o, or anything like them runs on Groq, it's impossible to compare.
Tell me you didn't watch the video without telling me you didn't watch the video 😅
GPT-4 is estimated to be about 1.8 trillion parameters. No idea on 4o ATM. 👍
@@yottylaboratories Yes, I did watch it, but the evidence only counts when things are compared on equal terms; otherwise we're just guessing.
@@yottylaboratories Yeah, really cool... 4o is really insane, it's multimodal built in.
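The size gap in that sub-thread can be put in memory terms. A rough sketch, assuming fp16 weights (2 bytes per parameter) and taking the ~1.8T figure as the unconfirmed estimate it is:

```python
def model_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB at a given precision.
    params_billion * 1e9 params * bytes, divided by 1e9 bytes/GB."""
    return params_billion * bytes_per_param

print(model_memory_gb(70))    # 140.0 GB for a 70B model at fp16
print(model_memory_gb(1800))  # 3600.0 GB for a ~1.8T-parameter model
```

So even before any benchmarking, a GPT-4-class model would need on the order of 25x the memory footprint of Llama 3 70B, which is a big part of why the two can't be compared on the same hardware today.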
OK, see, now the comparisons are more accurate and easier to understand, and your video is now at a 99% grade. It's not 100% because of that Stargate nonsense.
(BTW, it would be awesome if someone could somehow get ChatGPT running on the Groq chips. Imagine the already speedy new GPT-4o with its responsive speech, now on Groq too!)
Thank you, high praise indeed! I wondered about GPT on Groq in the original video, and I decided a lot of the work on GPT was to optimise it for its current hardware, so switching to Groq would probably not have the immediate effect of speeding it up, if it ran at all. Also, I don't think there are anywhere near enough Groq chips in the world to support a fraction of GPT's requirements ATM. Saying that, 4o just ran the same car story prompt in 15 seconds, and I have no idea of the size of that model! Perhaps they don't need Groq anyway? Sorry for the essay 😂
@@yottylaboratories Stargate gets my vote, because I'm a fan of isekais 🫡
@@yottylaboratories Good points that I didn't consider: the issue of having to re-optimize on Groq, and the fact of needing to produce as many Groq chips as the army of GPUs GPT-4 currently requires just to function. At least Groq will help prop up the open-source sector. More power to everyone!