Groq's AI Chip Breaks Speed Records
- Published: Feb 13, 2024
- Groq CEO Jonathan Ross explains how his company's human-like AI chip operates, as CNN's Becky Anderson converses with the technology in an interview at the 2024 World Governments Summit in Dubai.
www.cnn.com/videos/world/2024...
Discord: groq.link/discord
She really pushed it by interrupting the AI. The latency for response is absurdly faster than anything else out there
Ohhh I see. Oh I see.
Neuro-sama's latency is also really good
It’s not an LLM, what’s wrong with people?
@@retratosariel haha
@@retratosariel no offence, but take what you want from this statement: pushing the LLM is pushing the chip.
The lady has absolutely no idea what the man just said.
Or what groq said
Why would you assume that
@@patrickmesana5942- watch again
False assumption that says more about you than her.
@@Erin-Thor She asked Groq the exact same question she asked the interviewee, after he explained that her question wasn't accurate. Groq is not a model, it's a technology to enhance the function/speed of existing models. It was clear that she didn't understand... and guess what... that's okay :)
the lady is hammered
high on coke for sure lmao
You’ll lose ur fuckin head for smuggling coke into dubai. Much doubt
Well she is British and it was after 6 pm.
@@danyunowork well on her for staying somewhat lucid
Yeah, she is also on heroin
She's drunk cuz she saw her days in this job are numbered
Why? The lazy reporters are too happy to use the language models to write their reports
like arguing with a drunk relative during a holiday...
I wonder how she managed to get so sloshed in a muslim country? I mean you have to hand it to these journalists and their ability to sniff out a drink!
That speed is seriously impressive, congrats to the team!
If there are more users talking to it, it becomes 10-10,000x slower, just saying
I can't wait to be able to play video games with neural agents, like World of Warcraft, and you are also able to speak to them about all kinds of things while playing omg! so cool
@04:48 Poor lady doesn't understand how LLMs work, and repeatedly asks did you tell it?! are you sure?! and the guy's like "Mam!, We only make them go vroom vroom! nothing else" :)
I'm sure she was joking
@@sageofsixpaths98- She was indeed not joking. She didn't even understand that this guy's company just makes chips. She started the clip with, "what makes your AI different from everyone else".
Why would she ask if the guy made the AI talk about octopus hearts?
@@HiThisIsMine she would say that for her audience, who mostly have no idea what is going on here.
@@ashh3051 - Not buying it. I know the tactic of asking obvious questions as a reporter.. this ain’t it.
the interviewer looks so drunk and high 😆😆😆😆
zooted for sure
She's got that oldschool coca cola ;)
@@Corteum she is trying too hard to control herself
@@gokulvshetty drunk and high folk don't do that, maybe she's on some kind of (legal) stimulant.
😂😂😂
You had better be nice to our new AI overlords.
I pledge my allegiance to the Machine God.
you don't want to become a paperclip
If the AI were really human, it would have also cussed at her for treating it like a robot.
It's christian 😆
Lol I mean if it were programmed to do that but programming annoyance and anger into the machine would be where the problems begin
@@MIKEHUNT7531 Unfortunately it's not really something you program in, it can very well be en emergent property, the model learns from humans and humans don't exactly provide a good example. That's why they have so many people working on these models after they are trained in order to restrict them before they are released to the public.
@@dijikstra8 I wanted to generate some images by telling it only what not to put in it. The first one was a head on a table. I walk away from the computer in stead of making more.
Funny how Groq evoked this feeling.
She tried to break it on purpose, then she was surprised and told him when the system got her interruption 😂
or it’s staged lol…
@@relative_vie sure. 😢
Seems like she’s hanging around in the bars every second day
Wow. That speed is amazing. Really starts feeling like a natural conversation.
Great presentation.
This video will be nostalgic in 30 years, like the videos we watch today about the first days of the internet being demoed, making us feel "superior" to the experiences of the users back then while having fond memories of our first experiences. Today this is amazing. In 30 years, we'll have those fond memories.
Unless AGI is achieved in the next 8 years, taking control of "The Gospel" a military AI targeting system used by Israel that increased their targets from 50 a year to over 100 a day since 2019 (The Guardian has an article on it). If one does the math, The Gospel has been mostly targeting over 100 civilians every single day since early October, sometimes 300-400 a day. Not even the ICJ nor the ICC have touched this, which means this beast has free rein. Therefore, fundamentally taking control of our defenses and using them to cull and domesticate the wild human animal, potentially killing millions in the process within the next 20-30 years, just short of when you thought you would be alive to experience "fond memories".
You mean 1 year…
@@antispectral5018 the way progress is going, most likely a year.
I think you will look back in 10 years and feel that.
In 30 years?
Did you mean 30 weeks?
Because if not, think about that ESSENCE of EXPONENTIALITY
Lmfao she had no clue what was going on and on top of that did you hear a tinge of annoyance from the ai!?!? 😂😂😂
"You listening to CNN" 😂
Ai models thoughts after the interview "Your job is so mine bitch.... you better pray we don't meet again!"
Husband at home: "So honey, I wanted to ask you,.."
"GOT IT!!!"
Congrats to the Groq team! Going to make a video about this - Amazing!
He had just emphasized this is not an LLM, and then she's like "let's ask Groq!". lol
Wondering if xAI's Grok will eventually run on Groq?
I think that’s a croq 😊
I think they will need to double check the contract for typos or it could get very confusing.
I want my internal monologue to have the energy of the interviewer
😂😂😂
You would need some "speed"... If you understand, hahaha
Take some crazy pills.
Try cocaine, she clearly has. 🤣
Consider the fact that part of the amazing response time we see here is even faster than it appears. Some of the time to respond was the text response being translated to speech. Absolutely mindblowing!
That was awesome. And how fast it responded and that they didn't have to press anything to interrupt it, the AI knew when it was being interrupted unlike chatGPT's voice to voice feature. Groq's chips are awesome and I can't wait to see what comes out of it
Would like to see how this scales across millions of requests
Servicing businesses sounds like a great and logical start. But I'm holding out hope for consumer grade AI chips. Maybe one day I can run something like a 10x200B MoE model on my own rig without having to stack multiple RTX bricks. That's the dream. The capacity to run image/video generators would be a nice option, but even if just for LLMs, I'd be happy to save up for one.
likewise, even 3/4 capacity or lower tier would have been nice. at this rate there will be overlords Vs peasants in no time.
well if you got an nvidia 30 or 40 series card, they just came out with a chatbot you can run locally. It's early days and might not be the level of what's in the video, but it looks interesting
Whats the name of this? @@weatherwormful
@@CypherDND RTX Chat or something like that
@@weatherwormful danke
Jonathan confused me at the end. It sounded as if they are designing hardware, but then he said "they build the models, and we make it available to those who want to build applications." I guess he means they make the hardware available in servers to the app operators?
By models, he means the language models others make, like the open source ones and the private sketchy ones like GPT. And I guess Groq wants to sell their hardware to developers making language models.
At the moment their model is that they supply a WebUI and a serverless API endpoint, with I think just Llama 2 and Mixtral 8x7B on their servers, and inference is done through their hardware... which is called an LPU (Language Processing Unit). It's about 100 times faster in tests I've done with basic chat... so fast that the only problem is the latency we all get using an API. They aren't actually selling the cards as far as I know... and I don't know if they sell isolated hardware in the cloud. Because otherwise you are sharing the server farm with other users and there's a time delay from prompt to their system supplying an available LPU. I could be wrong there.
@@mickelodiansurname9578 Thanks. It seems as if they would do better business if they were clearer about exactly what they're selling.
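The thread above describes Groq as a serverless endpoint serving open models in an OpenAI-style chat format. A minimal sketch of what a request body to such an endpoint might look like, assuming that format — the URL and model name here are assumptions for illustration, not confirmed by this thread; check Groq's own API docs before relying on them:

```python
import json

# Assumed OpenAI-compatible endpoint; verify against Groq's actual docs.
API_URL = "https://api.groq.com/openai/v1/chat/completions"  # assumption

def build_chat_request(prompt: str, model: str = "mixtral-8x7b-32768") -> dict:
    """Build a chat-completion request body in the OpenAI-compatible shape."""
    return {
        "model": model,  # hypothetical model id
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

payload = build_chat_request("Why are LPUs fast at inference?")
print(json.dumps(payload, indent=2))
```

You would POST this JSON with your API key in an `Authorization: Bearer ...` header; the payload itself is the part that is standard across OpenAI-compatible hosts.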
The moment they realised they had not signed off properly and said, "Thank you very much...", you know AGI is here...
So far these models don't really have long term memory though ;), they only "remember" what's in the context window, the long term memory would require back-propagating the data from your experience with the model into the neural network. Although who knows what kind of data they feed it when they train a new model, so you may as well be polite to avoid trouble in the future!
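The comment above is right that a chat model only "remembers" what fits in its context window, so chat apps keep a rolling buffer of recent messages. A minimal sketch of that idea — the word-count token estimate is a crude stand-in for a real tokenizer:

```python
# Sliding-window chat "memory": keep only the most recent messages that
# fit a rough token budget; everything older is simply forgotten.

def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the newest messages whose rough token total fits the window."""
    kept, total = [], 0
    for msg in reversed(messages):          # walk newest-first
        cost = len(msg["content"].split())  # crude token estimate
        if total + cost > max_tokens:
            break                           # budget exhausted: drop the rest
        kept.append(msg)
        total += cost
    return list(reversed(kept))             # restore chronological order

history = [
    {"role": "user", "content": "my name is Becky"},
    {"role": "assistant", "content": "nice to meet you Becky"},
    {"role": "user", "content": "tell me about octopus hearts"},
]
print(trim_history(history, 9))
```

With a budget of 9 "tokens" only the last message survives, which is exactly why the model no longer knows your name once the earlier turn falls out of the window.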
The latency is amazingly fast wow! Great job guys. I wonder if the TTS was streamed too as that could be a solution to make it perceived faster too to the end user.
I love humans but that machine was much more likeable.
🤖💀
this host is doing a one-woman show. Amazing LPU speed, huge potential
We need that voice interface on the laptop -- not just on mobile like openai. When you can have a back and forth voice conversation (esp. something that can talk back) and at speed like this, even with interruptions, it becomes very natural and you can really talk through things with someone/something knowledgeable. If you add citations, you can fact check too.
This will be perfect for the customer service industries. Very exciting!
I've seen Groq do its thing 5 months ago, but now I'm honestly sold and this is the very same idea I had in mind all along.
guys, when will you add an audio conversation?
He has a point, people have less patience on mobile platform, response speed of AI is key to customer satisfaction.
that wine mom who's a lot of fun but also kind of terrifying
This woman is the ghetto version of CNBC's Becky Quick.
What a great high energy interviewer
Her interruptions were actually quite clever -- the AI will have to contextually understand this and adapt
Cant find anywhere to buy the stock
really fast response!
incredible
Where is the code for this available?
cant wait to use this
These chips/cards need A LOT more memory on them. 230MB of on-board memory is appalling. The fact that you have to buy 72 of these cards just to get 16GB of memory is insanity, especially when each card costs 20k. Sure it's cool that it's fast, but not when it costs 1.44 million for 16GB of memory, when that will barely be enough to run a 7B model at fp16.
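The arithmetic in the comment above can be checked directly, taking the figures as stated there (230 MB per card, roughly $20k per card, 72 cards):

```python
# Sanity-check of the comment's figures: 72 cards x 230 MB and 72 x $20k.
cards = 72
mem_per_card_mb = 230
price_per_card_usd = 20_000

total_gb = cards * mem_per_card_mb / 1024   # MB -> GB (binary)
total_cost = cards * price_per_card_usd

print(f"{total_gb:.1f} GB for ${total_cost:,}")
```

That works out to about 16.2 GB for $1.44M, which matches the comment's numbers.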
500 tokens per second on which model?
Llama 7b, 70b based models, and Mixtral
Mixtral 8x7b
around half of that for Llama Chat 70b
I'm curious what the LLM was running on. If it's an external server the speed is still impressive, but if this was all running locally on the laptop with the chip installed that's actually insane
I've applied!
Racing towards unleashing something we will have no way to control
So are these guys focused on inference or training as well?
This is really really good!
Amazing!
That is cool, I tested it now, but you have to type in the question. Maybe pay version uses mic.
Wow incredible speed and content, anchor bit
This woman treated that AI like a piece of meat, show some respect. Many people are going to say to me it's a machine, but when it can reply that fast on voice, I don't know. It's much more real than before.
Isn't it kind of confusing that there's xAI's Grok, and now Groq?
Triple confusion when you consider that Groq (the hardware vendor) was set up back in 2016, and XAI's Grok just last year... plus Groq own the trademark on their name... Thankfully Elon noticed this and is planning on changing the name of Grok to either Mocrosoft Bindows, or I Bee M!
Perhaps Elon didn't expect that startup to become a mainstream success and be discussed in the media like it has.
@@ashh3051 It was being discussed in the media, well the AI media, back when Groq produced their first demonstration paper on this tech in 2019 I think? Before the pandemic some time... Plus the founder of Groq is Jonathan Ross... and he's fairly famous in AI, the guy that created the Google TPU processor. Okay the average joe in the street likely never heard of this tech, in fact the average joe in the street is still clueless... but I'd be fairly sure those in AI all knew who they were since at least 2021. Having said that, I'm a little surprised that the AI enthusiasts over on Reddit weren't aware either... they are all over there trying to work out why this new type of GPU only has 230MB of VRAM.
The human race is in a very critical period of time with the development of these technologies. If we get this wrong, it's game over.
When will there be a 10-100 tops / tokens -- USB c version?
Like a flash drive size and portable GPU size
Come on, please answer, at least I am not poking fun at the woman like others are ❤, I need to know....
It would be so cool to have a farm of asics flash drives
wow! a black box. So mysterious
What's the stock name?
The way he tried to correct her English Accent was wild in the pronunciation of Groq. Unless it was just me seeing it differently. Then, when she repeated it somewhere in her question the same exact way, I read his body language and I think he thought he was being clowned. Am I reading too far into it?
I am lost. Where is the chip? What are we talking about in this video? I don't care about the model, where is the chip? How big is it and what power does it draw?
Rrrrrrrrroooooms are packed.
When can I get one?
Looks like you can buy the PCIe card from Mouser for $20k USD... wow expensive...
I just tried it building some crazy stuff with raspberry pi , and boy it is fooking fast , what a speed
at least the AI doesn't start a sentence with 'ok so'
Is the Groq crypto token, which is trading on uniswap at all associated with the company?
How is this different than TPUs? Which have always been faster than GPUs?
I checked the website and there's no voice-to-AI interface. There's a text-only one, like OpenAI's. You know, this could be the next generation Amazon Echo and I'd pay for that.
that's definitely gonna end well
I need Siri to be this good. 🤣
I don't think she'll be called Siri at that point. Apple needs to graduate from 8 years in kindergarten directly to PhD professor. Maybe then I'll switch back to iPhone 😅
I experimented with Siri shortcuts and connected to the Groq API, the response time is pretty impressive. But of course we cannot interrupt Siri while she is talking. lol
that was really good
Incredible!
Can someone explain how it took us 30 years to get intel chips that would do hyperthreading, but it took Groq 4 years to make a chip that does AI 1,000X faster than the current intel products? Like what the hell is going on? Is AI designing all these chips?
that's cool!
Which one of these is the AI?
Woman thanks the AI with all the emotive sincerity of thanking a human being.
I mean I know it its coming and it will be faster and faster with insane accuracy but seeing it is impressive
that guy saying thankyou is trying to make sure AI is his friend during the purge.
The edge Groq has over the other chat bots is Groq doesn't repeat your questions before answering them. It just gets straight to the answer. The ones that repeat your questions are already irrelevant.
Groq is hardware, not a model or "chatbot". They make LPUs (language processing units)
You can define what the response structure looks like with just about all llms, repeating the question has always been optional
@unom8 so how can I get bing/copilot to answer my questions without repeating them? Do I have to request this in every single prompt?
@@sylversoul88 I haven't used those lately tbh, but when hosting an LLM there is a bit of boilerplate that is added to every prompt; this is likely where the repeating-the-prompt behavior is coming from
@@sylversoul88 I tried to link to an article, but that was silently removed ( Thanks YT ) - I would suggest you do a search for beginners prompt engineering
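The thread above is describing two real levers: hosted chat UIs wrap your prompt in boilerplate, and a system message can steer style, e.g. "answer directly". A minimal sketch of that wrapping, assuming the common system/user message format; the instruction text here is illustrative, not any vendor's actual template:

```python
# Sketch of prompt wrapping: a hosted chat app typically prepends a system
# message to every user prompt. Steering text like this is how you ask a
# model not to restate the question before answering.

SYSTEM = "Answer directly. Do not restate or paraphrase the user's question."

def wrap_prompt(user_prompt: str) -> list[dict]:
    """Build the message list a chat wrapper might send for one user turn."""
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": user_prompt},
    ]

msgs = wrap_prompt("How many hearts does an octopus have?")
for m in msgs:
    print(m["role"], "->", m["content"])
```

When you can't edit the wrapper (as with a hosted UI), putting the same instruction at the start of your own prompt is the usual workaround, though it then has to be repeated per conversation.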
”of course you do” - a classy sober interviewer
is this chip in production? Or just a PoC prototype?
This is not an AI!! It's a real person talking, because it handled being interrupted like a natural person would, and that interruption happened accidentally during the talk
From the little I have seen, this is a good candidate for passing the Turing test,
The speed makes it more human
It made me crazy to keep hearing her talk as if it's an LLM. It's not!!!
I’ll be in contact!❤!!
If it could know who was talking to it, it would be better. For instance, if you look at your phone while talking, it can assume you are engaging with it versus looking away from your phone. Call its name while looking away to re-engage with it
well you just put a system message into the model's parameters. What you suggested there is in fact one of the things used in AI application development to make the model more effective. So fair enough you didn't know that, but at the same time it's impressive you noticed how important it is...
Revolutionary this will definitely change the world.
Crazy scientists over there
This is WEF technology. What is their motto?
Was that local?
When to iphone?
As a brit, I have to ask, is this what our reporters are going to the states and doing? Why is she drunk? Did she think she was meeting Jonathan Woss?
Not the best interviewer, but I’m excited about having a quicker, more natural response time from LLM apps. Brave new world!
A great use case for your chip, Groq, would be to transition the interviewer in real time to one who understands your tech deeply and with empathy.
Great work, by the way! 👍
Would rather talk to groq than my surgeon who botched my knee replacement.
🎉
500 tokens per second!!! Holy shit
She is facing the threat of losing her job to an AI reporter.
AH, the expert reporting you find only on CNN LOL 😂
There is going to be a serious run on memory chip companies! I wonder if they'll integrate ReRAM into the AI chips next...
well for sure, given the way this works on ASIC fabricators... ones that could tool up to produce the wafers. The good news is that's old tech, and well established, with plenty of vendors.
wow, AI is going fast, super fast
That's one way to get eaten by roko's basilisk 🐍