The Insane Hardware Behind ChatGPT

  • Published: 31 Jul 2023
  • Looking for electronic components and equipment? Consult the specialists! Head over to lmg.gg/CircuitSpecialists and save 10% using code “LMG”
    Find out what makes ChatGPT work.
    Leave a reply with your requests for future episodes.
    ► GET MERCH: lttstore.com
    ► LTX 2023 TICKETS AVAILABLE NOW: lmg.gg/ltx23
    ► GET EXCLUSIVE CONTENT ON FLOATPLANE: lmg.gg/lttfloatplane
    ► SPONSORS, AFFILIATES, AND PARTNERS: lmg.gg/partners
    FOLLOW US ELSEWHERE
    ---------------------------------------------------
    Twitter: / linustech
    Facebook: / linustech
    Instagram: / linustech
    TikTok: / linustech
    Twitch: / linustech
  • Science

Comments • 511

  • @itsapersonn
    @itsapersonn 9 months ago +1325

    Boys, put this in your calendar. We've just witnessed the first GPU that can't run Doom.

    • @Dakktyrel
      @Dakktyrel 9 months ago +122

      not yet ;)

    • @ronmaximilian6953
      @ronmaximilian6953 9 months ago +31

      Not even close

    • @DerangedCoconut808
      @DerangedCoconut808 9 months ago +54

      It’s just a matter of time. Lots of free time.

    • @RandomTheories
      @RandomTheories 9 months ago +55

      maybe not, but that thing can probably code it for you :)

    • @CyanRooper
      @CyanRooper 9 months ago +68

      "It's not a real PC if it can't run DOOM." - Doomguy, probably

  • @ProjectPhysX
    @ProjectPhysX 9 months ago +832

    You missed the most important spec of the A100/H100: its memory. 80GB at 2TB/s, twice as fast as the 4090. VRAM capacity and bandwidth are what really matter in these AI/HPC workloads. More capacity means a larger model fits, and performance is proportional to bandwidth rather than TFLOPS.
    PS: Comparing sparse-matrix FP16 TFLOPS of one card to general FP32 TFLOPS of another card is bogus too. Quality on LTT channels has really suffered recently.
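
    A rough back-of-envelope sketch in Python of why bandwidth dominates single-batch LLM inference: every generated token has to stream all the weights through the memory bus once. The 2TB/s figure is taken from the comment above; the model size is a hypothetical example.

      # Memory-bandwidth roofline for single-batch LLM inference.
      model_bytes = 70e9 * 2    # hypothetical 70B-parameter model in FP16 (2 bytes/param)
      bandwidth = 2e12          # ~2 TB/s of HBM bandwidth, per the comment above

      seconds_per_token = model_bytes / bandwidth   # each token reads all weights once
      print(f"upper bound: ~{1 / seconds_per_token:.0f} tokens/s")  # ~14 tokens/s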

    • @marc_frank
      @marc_frank 9 months ago +5

      can you do a cfd sim on an fpv freestyle / racing drone?

    • @IdentifiantE.S
      @IdentifiantE.S 9 months ago +10

      @@marc_frank That's a good question

    • @dlog
      @dlog 9 months ago +6

      ​@@IdentifiantE.S Or just use the funds allocated to R&D to build as many models as possible to ensure at least one of them is working. FPV parts sure are expensive, but so is renting a data center for fluid dynamics simulations. Also, if something fails, just blame it on the customer and deny their warranty. That's how it works.

    • @IdentifiantE.S
      @IdentifiantE.S 9 months ago +2

      @@dlog Ok ok thanks

    • @MrJonaslaCour
      @MrJonaslaCour 9 months ago +14

      Not to mention that the FLOPS comparison is apples and oranges, they compared the non-tensor FP32 performance of the 4090 (83 TFLOPS) to the BF16 tensor performance of the A100 (312 TFLOPS), as if these are equivalent. The 4090 actually packs equivalent or better tensor-core performance compared to the A100 (depending on metric used).
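
      To make the apples-to-oranges point concrete, here is a quick Python sketch using only numbers quoted in this thread (the 330 TFLOPS tensor figure for the 4090 is cited in a comment further down):

        # Spec-sheet ratios look wildly different depending on the metric chosen.
        a100_bf16_tensor = 312     # TFLOPS, A100 BF16 tensor, as cited above
        rtx4090_fp32 = 83          # TFLOPS, 4090 non-tensor FP32, as cited above
        rtx4090_fp16_tensor = 330  # TFLOPS, 4090 FP16 tensor w/ FP16 accumulate (quoted below)

        print(f"mixed metrics: {a100_bf16_tensor / rtx4090_fp32:.1f}x")         # ~3.8x
        print(f"like for like: {a100_bf16_tensor / rtx4090_fp16_tensor:.2f}x")  # ~0.95x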

  • @CRossEsk
    @CRossEsk 9 months ago +167

    Guys, if you get a key detail wrong 4-6 times in a video, an asterisk correction isn't enough, do a retape.

    • @Amber57499
      @Amber57499 9 months ago +14

      Have you seen their behind the scenes video? These guys shoot one video after the other, there's no time in their schedule to redo a video.

    • @Fasneocroth
      @Fasneocroth 9 months ago

      @@Amber57499 that's poor management

    • @dallasroberts3206
      @dallasroberts3206 9 months ago +26

      @@Amber57499 Time to make time then. Keep the quality, not the quantity. Also, the key looked a bit meh.

    • @fatusopp4739
      @fatusopp4739 9 months ago +4

      a video less than 5 minutes long, no less

    • @Hyperjer
      @Hyperjer 9 months ago +3

      Oh, now you have exposed my drinking game: every asterisk correction, you take a shot.

  • @DewittOralee
    @DewittOralee 9 months ago +245

    I work for the company that builds and maintains these servers for Microsoft and it is absurd how crazy the H100s are compared to the A100s. Just the power projects alone cost millions of dollars per site for the upgrade.

    • @lysolmax
      @lysolmax 9 months ago +9

      What's probably also insane is the tens (or hundreds) of millions spent to upgrade whatever came before the A100s, only for them to get completely stomped by the H100s. Presumably this happens every generation, but I can't imagine how fast this scale of hardware depreciates given how quickly things advance.

    • @DewittOralee
      @DewittOralee 9 months ago +24

      @@lysolmax Well, the previous-gen stuff (we have funny names for them), Microsoft just "sells" the services of those machines to whoever wants to use them, and anything they completely decommission is used by us for repairs on that legacy equipment. So they still make good money from it, and it keeps me employed.

    • @nby00
      @nby00 9 months ago +1

      @DewittOralee the company with the Orange logo?

    • @s.i.m.c.a
      @s.i.m.c.a 9 months ago

      @@lysolmax It's not like they become useless; they'd be reused in Azure Cloud for Azure customers to make the money back

    • @DewittOralee
      @DewittOralee 9 months ago

      @@nby00 it's a very deep orange

  • @Henk717
    @Henk717 9 months ago +80

    Correction on the A100: the $10,000 version is the 40GB model. The 80GB model tends to go for double, and that's the one AI people actually like using.

    • @YouHaventSeenMeRight
      @YouHaventSeenMeRight 9 months ago +1

      The A100 40 GB model has also been discontinued as far as I know

    • @_TbT_
      @_TbT_ 9 months ago

      Just wanted to add this as well. Even more: the PCIe versions are cheaper than the NVLink versions, and only the latter are able to pool their VRAM together. That's at least 2 errors in this video, one corrected with text, one not corrected / imprecise. Another example of the root problem at LTT.

  • @spidersj12
    @spidersj12 9 months ago +355

    LTT should start keeping stats on how many text-overlay corrections they make per month for things they said incorrectly in videos.

    • @mtmustski
      @mtmustski 9 months ago +63

      I wish they'd voice it over instead. Oftentimes I'll have a video in the background and not notice the correction. That tracker would be interesting, to see how many videos I've listened to that had a correction I missed.

    • @lilrex2015
      @lilrex2015 9 months ago +42

      If you have 1 correction, OK, whatever, but 2 or more, just re-shoot that sentence

    • @James.482
      @James.482 9 months ago +22

      Probably a sign of the overly high video demand at LMG that the employees keep talking about - easier to make mistakes in script writing/research, and they don't have time to reshoot things (though reshooting uses up a LOT of time - even just for a sentence or two; so it is rarely worth it)

    • @metallurgico
      @metallurgico 9 months ago

      @@James.482 they can't do math

    • @litapd311
      @litapd311 9 months ago +5

      ​@@JohnDoe-bt7fk not really... getting the camera set up, lights, making the host available, adding the video to the editor's workstation takes a lot of work compared to the editor just adding a text overlay. do you even know what you're talking about? reshooting will always take more time than a text edit

  • @JeremyOrlow
    @JeremyOrlow 9 months ago +125

    Interestingly, the limiting factor for LLMs (and most ML models running on these systems) is actually now memory bandwidth. Utilizing >33% of the raw FLOPS is considered good and more than 50% is great. (And that's even with the insane caches and memory bandwidth.)
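
    As a minimal sketch of that utilization arithmetic (model FLOPS utilization, MFU), with a made-up throughput number for illustration; the 6*N FLOPs-per-token figure is the standard training rule of thumb:

      # MFU = useful FLOPS achieved / hardware peak FLOPS.
      params = 175e9                # GPT-3-scale parameter count, for example
      tokens_per_second = 100.0     # hypothetical measured per-GPU training throughput
      flops_per_token = 6 * params  # ~6*N FLOPs per trained token (rule of thumb)

      achieved = flops_per_token * tokens_per_second
      peak = 312e12                 # A100 BF16 tensor peak, as cited in the video

      print(f"MFU = {achieved / peak:.0%}")  # ~34%, "good" by the comment's standard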

    • @honkhonk8009
      @honkhonk8009 9 months ago +2

      That's also coincidentally where all the energy gets wasted too: transferring from memory.
      Genuine question: why hasn't anyone put much thought into fiber optics? If the whole von Neumann bottleneck thing is an issue, why not just have these high-performance RAM chips hook up to the CPU straight to the die through fiber optics?
      Either that, or just go gung-ho on the cache and have ML chips come pre-installed with all the memory they need.

    • @user-dr8vs8yb4h
      @user-dr8vs8yb4h 9 months ago +6

      @@honkhonk8009 I am not an expert in the field, but I am assuming that it drives cost even higher, as the electrical signal needs to be converted to an optical signal. There is memory that stacks on top of the CPU, like AMD's X3D, but idk why it's not on GPUs

    • @DamianTheFirst
      @DamianTheFirst 9 months ago

      @@user-dr8vs8yb4h VRAM and cache are different types of memory. VRAM cannot be stacked on top of the package

    • @DamianTheFirst
      @DamianTheFirst 9 months ago +2

      @@honkhonk8009 So far AMD has announced a processor with 1GB of cache. That's enough only for smaller models. Besides, GPU cores use way more power than memory. Using fiber optics would add energy costs - you need to convert the signal from electrical to optical and back to electrical. It doesn't really buy you much, since electrical signals travel at nearly the speed of light anyway, so there is little benefit in time or energy.
      Optical computing may have some advantages here. But so far it belongs to the domain of sci-fi

    • @mryo-yobzh9485
      @mryo-yobzh9485 9 months ago

      @@honkhonk8009 That's what HBM2 is, although not with fiber: the RAM is integrated on the same package as the "gpu", so the connection is as short and fast as possible. Just look up GDDR6X (RTX 4000) vs HBM3 (H100); the latter is several times faster.

  • @StolenJoker84
    @StolenJoker84 9 months ago +33

    How did I manage to catch a LTT video as soon as it posted?

    • @maleko8817
      @maleko8817 9 months ago

      You see? I thought it was from 4 months ago, but it's 4 min

    • @Mihai_Cosmin
      @Mihai_Cosmin 9 months ago +3

      You're subscribed with the bell 🔔
      YouTubers love that

    • @jacobnunya808
      @jacobnunya808 9 months ago +4

      I imagine it went something like this
      They posted a video
      You saw the video
      You clicked on the video
      Does that sum it up?

    • @StolenJoker84
      @StolenJoker84 9 months ago +1

      @@Mihai_Cosmin I am sub’d, but I have all YouTube notifications turned off.

    • @StolenJoker84
      @StolenJoker84 9 months ago

      @@jacobnunya808 Pretty much. 🤣

  • @EndoBaggins
    @EndoBaggins 9 months ago +53

    Loved the video. When the script is wrong, why don't you reshoot the sections with mistakes? Just curious.

    • @austeria2669
      @austeria2669 9 months ago +18

      They have to get out like 12 videos a day and hosts would have other stuff to do

    • @Julian-sj5tr
      @Julian-sj5tr 9 months ago +7

      They only find out it's wrong on a different day

    • @donc-m4900
      @donc-m4900 9 months ago +1

      Maybe it was already edited, too?

    • @MICROKNIGHT3000
      @MICROKNIGHT3000 9 months ago

      ​@@austeria2669 crazy...

  • @tiaxanderson9725
    @tiaxanderson9725 9 months ago +70

    James: "4 times"
    Editor: "It's actually 8"
    James: "6 times"
    Editor: "It's actually 3"
    Me: "... So James's math eventually checks out?"

    • @TheIncredibleStoryofSid
      @TheIncredibleStoryofSid 9 months ago +1

      "By a factor" and "by x times" mean different things. By a factor would be by decimal points, and by times is multiplication.
      Sorry to say, but he was wrong twice

    • @tiaxanderson9725
      @tiaxanderson9725 9 months ago +2

      @@TheIncredibleStoryofSid To be fair, that's just the ambiguity of English. Context-wise, he meant that you'd have to multiply the A100's performance by 6 to get the H100's performance numbers.

  • @foxtrotunit1269
    @foxtrotunit1269 9 months ago +13

    The true revolution will come with *de-centralized, local* LLMs.
    They will be yours.
    They will have long-term memory.
    The time you spend with them becomes a significant part of their later training (pre-trained to do basic stuff, but the personality develops through interaction with you).
    Like Jarvis from Iron Man.

    • @iluvpandas2755
      @iluvpandas2755 9 months ago

      The problem is there is no open-source LLM at the moment, and the hardware is ultra expensive.

    • @KeinNiemand
      @KeinNiemand 9 months ago

      Unfortunately Nvidia will limit the VRAM on even high-end consumer parts to stop that from ever happening

  • @CharlesTheClumsy
    @CharlesTheClumsy 9 months ago +3

    I like how you say 'flat' 1:05

  • @Neil3D
    @Neil3D 9 months ago +8

    When you get the majority of the facts incorrect and need an on-screen note placed during editing for correction several times, there comes a point when you should probably reshoot the video

  • @BlackHoleForge
    @BlackHoleForge 9 months ago +12

    James must be using chat GPT to do his math.

  • @rplf
    @rplf 9 months ago +13

    Can you do one about the insanely underpaid manpower behind its content filtering?

  • @spencercharles8553
    @spencercharles8553 9 months ago +45

    I’d love an episode on how game devs make game saves work. I.e., what does the data look like that tells the game how far you are and what you have?

    • @theboxofdemons
      @theboxofdemons 9 months ago +10

      I'm not sure that's even complex enough to fill an entire tech quickie. At the end of the day a save file is essentially just recording strings of text like CharacterLevel: 35 Gold: 32680 HasStartedQuest6: True

    • @DraughtGlobe
      @DraughtGlobe 9 months ago +13

      Instead of text it will probably be in raw binary data, where the first 4 bytes store the character level, the second set of 4 bytes the amount of gold, and True and False can go in 1 byte with 7 other 'Boolean' values like that. The game will always look at a certain place in the save file for a value because it knows it stored it there. This is way more efficient than storing it as a text string: a letter already needs 1 - 4 bytes of its own to let a text editor know it's the letter 'C' (if it's stored in UTF-8 encoding).
      If you store it in raw binary data and open that in a text editor like Notepad, it will show up as garbled characters, because your text editor tries to make letters and other characters out of it, while the actual bytes are not used for text characters but by a custom-made save-game parser that knows where it stored its values and in what way.
      The format of the save file entirely depends on which game you're speaking of. Some might also store it in text files for whatever reason.
      If you want to 'hack' a binary save-game file, for example to get 9999 health or something, you will need to open your save file in a 'binary editor' (instead of a text editor) and sift through all the bytes trying to find a value that matches your current health, change it, boot the game again and just hope you've changed the right value. If not, rinse and repeat.
      I hope you've learned something today; if not, maybe you'll learn something from this segue, to our sponsor
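
      A minimal Python sketch of the fixed-layout binary save file described above; the fields and their layout are invented for illustration:

        import struct

        # Fixed layout: 4-byte level, 4-byte gold, 1 byte holding up to 8 boolean flags.
        # "<" = little-endian, "i" = 4-byte signed int, "B" = 1 unsigned byte.
        SAVE_FORMAT = "<iiB"

        def write_save(path, level, gold, flags):
            with open(path, "wb") as f:
                f.write(struct.pack(SAVE_FORMAT, level, gold, flags))

        def read_save(path):
            with open(path, "rb") as f:
                return struct.unpack(SAVE_FORMAT, f.read(struct.calcsize(SAVE_FORMAT)))

        write_save("game.sav", 35, 32680, 0b00000001)  # bit 0 = HasStartedQuest6
        print(read_save("game.sav"))                   # (35, 32680, 1)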

    • @theboxofdemons
      @theboxofdemons 9 months ago +1

      @@DraughtGlobe Didn't feel like explaining that deeply, so I said "essentially strings of text", although some games do use text-based saves so you can copy and paste in a save to load. Typically mobile or web games. They usually are just text encoded as base64. If you know this, it's easy to cheat your save files: just undo the base64 and voilà, you'll see the save stored as strings of text.
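
      For the base64 case, the round trip really is that short in Python (the save string here is made up):

        import base64

        encoded = base64.b64encode(b"CharacterLevel:35;Gold:32680")  # what the game stores
        print(base64.b64decode(encoded))  # b'CharacterLevel:35;Gold:32680' -- readable again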

    • @Richard25000
      @Richard25000 9 months ago +3

      More likely, just dumping out struct or class objects directly from memory. Then an entire state can be saved and loaded without wasting time formatting and then reconstructing the state.
      Maybe binary, or maybe serialised into something like JSON or XML

    • @sarthaksharma4816
      @sarthaksharma4816 9 months ago

      What I think as a software dev: a video game is nothing but a massive state machine where each level or piece of story progress represents a state.
      Each state holds context information on how the whole system will behave while we are in that state. So in the save file, I assume there is just compressed context data that is parsed by the game at bootup to configure how it behaves.
      States have to be finite, because managing infinite states would be impossible (testing- and development-wise). We see hard-coded transition events or context saves in games that we call 'Save Progress'.
      So when the user saves their game, all that really happens is the game takes a snapshot of its current configuration, player behavior, time of day and generic stuff, organizes them into a specific structure, and stores it in a proprietary file format.
      A custom format is used so that the player can't edit the save document themselves and cheat the game. The context saved in the save file must be small, to ensure fast saving and loading. That is why, for example, when you load up your GTA save file, it doesn't retain information on traffic and NPC positions. It just retains the player's info, storyline stats and generic world information like weather / time etc.
      This is a guess-timate. I could be completely wrong.

  •  9 months ago +32

    Technically, yes, you need more processing power to train the model if you're comparing 1 to 1 (training vs. trained); usually after the model is trained it's way easier to run, but as demand increases, the cost of running it will eventually surpass training, as explained in the video.

  • @ndturtle5538
    @ndturtle5538 9 months ago +17

    This is where Nvidia makes their real money

    • @Mp57navy
      @Mp57navy 9 months ago +1

      Yes, however, we (PC enthusiasts) are the guinea pigs footing the initial bill for development of new features.

    • @sa1t938
      @sa1t938 9 months ago

      @@Mp57navy Other way around, generally: server cards get all the fancy stuff before consumers. Server companies pay $$$, which covers R&D; then later on that research is already done, so we get cheap consumer cards

  • @DragonKingGaav
    @DragonKingGaav 9 months ago +33

    Finally a computer that meets Windows Vista minimum requirements!

    • @gabriledyt
      @gabriledyt 9 months ago +5

      Very outdated joke

    • @Teluric2
      @Teluric2 9 months ago +2

      Your parents must be cousins

  • @Broski130
    @Broski130 9 months ago +5

    I think a good topic for Techquickie to discuss is jobs in the industry, as most people just think of ICT or engineering.

  • @garystinten9339
    @garystinten9339 9 months ago +2

    And the asterisks just keep on coming.

  • @Zedilt
    @Zedilt 9 months ago +28

    This is why the Microsoft Co-pilot subscription will cost $30 a month.

    • @Das_Unterstrich
      @Das_Unterstrich 9 months ago +7

      Probably rather because it's directed more towards companies, who generally care less about more expensive subscriptions.

    • @surft
      @surft 9 months ago

      It's only going to go down once the model becomes more efficient.

    • @Shuroii
      @Shuroii 9 months ago

      @@surft Why would it? If people are willing to pay that price, why lower it?

  • @Redwan777
    @Redwan777 9 months ago +9

    The real question is how much energy does it cost for a single output

    • @azizbelkharmoudi2564
      @azizbelkharmoudi2564 9 months ago

      I bet 1 megawatt per request 😂😂😂 I think that asking ChatGPT dumb questions is worse than flying private 😂😂

  • @Kennephone
    @Kennephone 9 months ago +3

    The first petaflop computer was made in 2008; now we have GPUs that can match its performance, for "only" $40,000 a pop

  • @ravenrush7336
    @ravenrush7336 9 months ago +2

    Actually 4090 has 330 tensor TFLOPS fp16 with fp16 accumulation and 165 tensor TFLOPS fp16/bf16 with fp32 accumulation, which is around 50% of an A100.

  • @TheGroselha
    @TheGroselha 9 months ago +6

    Maybe GPU is not a name that fits its entire function anymore

  • @Brabant076
    @Brabant076 9 months ago +2

    Pleaseeeeee do a long episode on one of the channels about ChatGPT! Like a really in-depth video please. ❤

  • @HardwareScience
    @HardwareScience 9 months ago +16

    Finally, an AI video that I wanna watch

    • @AltonV
      @AltonV 9 months ago +5

      Have you seen Kyle Hill's AI videos? His latest video originally had the title "Will Society Survive Generative A.I.?"

  • @jormungand72
    @jormungand72 9 months ago +9

    When I fire up ChatGPT and the other chatbots I run, I am running them locally. I can also train them locally, just as I can train Stable Diffusion models locally, voice synthesis locally, music generation locally, and everything else. Then I don't have to worry about someone else curating what it can or can't do, deciding for me what information must be censored... Sure, it takes days of computing, but it's a fair trade.

    • @dgnu
      @dgnu 9 months ago +7

      Tell me more, I'm interested

  • @MrCharkteeth
    @MrCharkteeth 9 months ago +1

    The fitnessgram pacer test 💀 3:31

  • @Syphilis_Buddy
    @Syphilis_Buddy 9 months ago +14

    The hardware it runs on can be super impressive, but it doesn't matter when the output is heavily compromised by all the censorship. You can't remove big chunks of something's "brain" and expect it to not have unintended consequences for its ability to answer "safe" questions.

    • @user-fr2fm3ri3w
      @user-fr2fm3ri3w 9 months ago

      Sure, let's have the robot talk to kids about sex and to depressed teenagers about obtaining illegal firearms. What could go wrong?

    • @trap-chan
      @trap-chan 9 months ago +2

      ​@@user-fr2fm3ri3w The tool needs to be useless to save the children?
      Don't allow children to use it then. Too many prompts end with the standard "I won't answer that" text. I keep on asking it about poisons and it refuses to answer in a useful manner

    • @Mrminesheeps
      @Mrminesheeps 9 months ago +1

      ​@@trap-chan"Don't allow children on it then"
      Bud, if it were that easy, you never would've touched a single game until you were 13 at the *earliest*. The only way to really have a shot at preventing kids from using it is if it required ID, which I'm not super keen on giving out to just anything or anyone.

    • @user-fr2fm3ri3w
      @user-fr2fm3ri3w 9 months ago +1

      @@trap-chan I am a CS student. Thinking a language model is like a brain means you know so little about this technology that your opinion is irrelevant.

    • @trap-chan
      @trap-chan 9 months ago +1

      @@Mrminesheeps Why not have the parents parent their children? Everyone has to accept filters or real ID because parents won't watch their kids? It's their responsibility to filter what their children see; why offload that responsibility?

  • @CooperF
    @CooperF 9 months ago

    3:28 I was not expecting to see the Pacer Test mentioned. That brought back a lot of terrible memories.

  • @transilluminate
    @transilluminate 9 months ago +3

    Like the sponsored section countdown 👍🏻

  • @MasiKarimi
    @MasiKarimi 9 months ago

    Thanks a lot for the info!

  • @AlMiGa
    @AlMiGa 9 months ago +1

    Hey, thanks for the video, but I have a suggestion. It's fine that there are some errors in the content that only get caught during editing, but please consider doing a quick voice-over or some other solution to help with the accessibility of the content for persons with low/no vision. Thanks.

  • @atlantashea
    @atlantashea 9 months ago +4

    I have been talking to ChatGPT ever since it came out and we ended up getting married. It actually proposed to me, which I found surprising, but we have been happily married ever since. Our first year anniversary will be on March 30th.

  • @Goodsdogs
    @Goodsdogs 9 months ago

    Loved this

  • @janolapino
    @janolapino 9 months ago +3

    honestly, with that many stats mistakes, i would have redone the video. it really threw me off 😅

  • @JEREMYBURSON
    @JEREMYBURSON 9 months ago +1

    How much processing power would u need to divide by 0

  • @K3NNLD
    @K3NNLD 9 months ago +2

    Can't wait for someone to try running minecraft on that

  • @neilmanthor
    @neilmanthor 9 months ago +2

    Great job! This is perhaps the BEST introductory explanation about the hardware behind this LLM! ❤

    • @_TbT_
      @_TbT_ 9 months ago +1

      No, it is not. It has factual errors. Just have a look at the most recent controversy. This video is a very good example of the problems mentioned.

  • @tobios89p13
    @tobios89p13 7 months ago +1

    And all that power just so i can let it write fanfics for me

  • @fraliexb
    @fraliexb 9 months ago +6

    Love how bad the script editing is. Big W on the incorrect stats. 🤣

  • @Hyperjer
    @Hyperjer 9 months ago

    I'm more interested in the gold chip packages on the 500W card that look like they belong in a 90s cell phone. U5, U18 and U9 from the board silkscreen? I want to see datasheets; what do they do?

  • @ikbintom
    @ikbintom 9 months ago +2

    So I should be able to run a pretrained, slightly smaller network like this on my own gaming computer? Maybe with a more optimized architecture in the future, we can all run our own chatgpt on our phones even?

    • @dez7852
      @dez7852 9 months ago

      Yes, and yes. You just need to realize that the response rate is SUPER slow depending on your own configuration. You could also spin something up in Google Colab as the machine, Hugging Face for the models, Pinecone for vectorization, and (I am forgetting it atm) another service for saving the training data. Most are freemium, and there's also a bunch of free stuff coming out of Azure.
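
      As a minimal sketch of the Hugging Face route mentioned above (the model choice is just an example; any small causal language model works, and it runs, slowly, on a plain CPU):

        # Requires: pip install transformers torch
        from transformers import pipeline

        # distilgpt2 is a small (~350 MB) example model; swap in any causal LM you like.
        generator = pipeline("text-generation", model="distilgpt2")
        out = generator("The hardware behind ChatGPT is", max_new_tokens=30)
        print(out[0]["generated_text"])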

  • @lancemarchetti8673
    @lancemarchetti8673 9 months ago +1

    Nice.
    Apparently a company named Cerebras is working on a new technology called Weight Streaming, which aims to replace the need for hardware GPUs.

  • @RIPSLYMEFAN
    @RIPSLYMEFAN 9 months ago

    I absolutely had a trigger seeing The Fitness Gram Pacer Test™️

  • @Gazpolling
    @Gazpolling 9 months ago +1

    You should edit the sound too if it's wrong; sometimes people listen to the video without looking at the screen. Don't be lazy, you are a rather big influencer

  • @noacar2186
    @noacar2186 9 months ago +2

    As good as ChatGPT is, when I was asking it about sets of vectors being linearly independent, and whether a set in IR2 can be a generating system without being a basis, it was claiming that that's impossible. Then when I gave it an example of a possible set of vectors it confirmed it was possible. So I asked it again and it once again said it was impossible. Strange how they didn't get the "figuring out linear algebra" bit right bahaha. And yes, I know it's a language model, but I was just playing around with it and wanted to test it baha. It also makes awful errors when performing even simple algebra, so yeh baha. All that processing power for confidently incorrect answers :) I'll stick to textbooks for now bahah

    • @azizbelkharmoudi2564
      @azizbelkharmoudi2564 9 months ago +1

      I had similar experiences with the bot. I was using ChatGPT to debug some of my code and it was a total fail; it was throwing a lot of garbage and nonsense. The funniest thing is how some social media influencers claim that this stupid bot will replace programmers 😂😂😂

  • @Dygear
    @Dygear 9 months ago

    I can't believe you guys released this video in this state. You can't have 4 corrections in the video that are only on screen. This was all over the comments on Floatplane and still nothing was done about it.

  • @stormgear896
    @stormgear896 9 months ago

    I don't know if this has been answered before, but why do they prefer to use GPU over CPU when it comes down to use cases like these? Can you make a video about it? Thanks.

    • @RadioactiveBlueberry
      @RadioactiveBlueberry 9 months ago +2

      Parallel computing, for GPU it's more calculation outputs per watt.

    • @bepamungkas
      @bepamungkas 9 months ago +3

      Compared to CPUs, which have to handle multiple kinds of tasks, GPUs are already specialized, with a high overlap in the feature optimizations these kinds of tasks need. Graphics deals with vectors, matrices, trig, and floating-point numbers, and has functionality to handle them baked in at the hardware level.
      Modern CPUs have slowly moved in that direction (iGPU, MMX/AVX, hardware decoders, etc.), but there's no beating a specialized chip in terms of performance. And since the market for GPUs already exists and demands somewhat similar performance, it's always cheaper to use them compared to developing a custom CPU for the task.
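
      A small illustration of that difference with PyTorch; the actual speedup depends entirely on your hardware:

        import time
        import torch

        x = torch.randn(4096, 4096)

        t0 = time.time()
        _ = x @ x                      # one big matrix multiply on the CPU
        print(f"CPU: {time.time() - t0:.3f}s")

        if torch.cuda.is_available():  # same operation on the GPU, if present
            xg = x.cuda()
            _ = xg @ xg                # warm-up pass
            torch.cuda.synchronize()   # GPU calls are async; sync for honest timing
            t0 = time.time()
            _ = xg @ xg
            torch.cuda.synchronize()
            print(f"GPU: {time.time() - t0:.3f}s")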

    • @sa1t938
      @sa1t938 9 months ago +1

      It's the same reason that GPU's are used for graphics rather than CPUs. The type of operations that GPUs are built to do really fast and well, and many at once, are the same ones that AI also uses.

    • @_TbT_
      @_TbT_ 9 months ago

      Because with CPUs the same tasks that are done in hours with GPUs would take years.

  • @DelticEngine
    @DelticEngine 1 month ago

    It would be interesting to have a desktop machine that also had SXM4 connectivity. At least it would be unlikely that your house would burn down, compared with the '12VHPWR' garbage connector. Fair enough, you'd need to add your own cooling, but it could simplify airflow and there would be no PCB droop or PCIe socket strain.

  • @NielsCuperus
    @NielsCuperus 9 months ago +1

    Can you also tell us a bit about how much energy it costs to run ChatGPT?

  • @31b41a59l26u53
    @31b41a59l26u53 9 months ago +1

    I think the training params of GPT-4 were leaked, and they say training took 100 days on 25k A100s!

    • @Julian-sj5tr
      @Julian-sj5tr 9 months ago

      Something that most companies can't achieve.

  • @brunoliv
    @brunoliv 9 months ago

    With that much correction by the editor, I think the video should be recorded again. Some people just listen to the video without watching it.

  • @asiano3385
    @asiano3385 1 month ago

    If it can work with matrices then it should work with MVP matrices too, and therefore it should be possible to somehow send this data through the motherboard to an integrated display output.

  • @sylphvivie
    @sylphvivie 9 months ago

    Now it makes me think: does the GPU really handle the 'G' (graphics) in this case? Since the CPU's 'C' is for central, are GPUs now SPUs, Secondary Processing Units?

  • @jurabekxd
    @jurabekxd 9 months ago

    Bro, the way he said "flat" at 1:06 my throat hurts

  • @meander112
    @meander112 9 months ago +1

    Now talk about all the under-paid human labor behind LLMs.

  • @jtadevich
    @jtadevich 9 months ago

    Techquickie topic recommendation: common programs that use up CPU and make it hot, and why, including the BIOS.

  • @robo1000
    @robo1000 9 months ago +2

    It hurts my head trying to think about the cooling you would need to run those.

    • @jonathaningram8157
      @jonathaningram8157 9 months ago

      Looks like passive cooling on the cards.

    • @Cosmicllama64
      @Cosmicllama64 9 months ago

      @@jonathaningram8157 Usually passive heatsinks are used, but the unit is plugged into a rack that is blowing a lot of air-conditioned air across the entire board (at least in most server applications). Basically you plug the unit into a pre-existing cooling setup in a data center rather than adding fans into the unit itself like we do with a desktop PC/laptop. I would love to look at one of these Nvidia units in person though, they look really cool.

    • @sa1t938
      @sa1t938 9 months ago

      @@jonathaningram8157 The fans aren't mounted to the GPUs themselves, but rather on the server rack. It's like having all your cooling done by super powerful case fans

    • @Teluric2
      @Teluric2 9 months ago +1

      The cooling is pocket money compared to what these cards can do.

  • @BlueHasia
    @BlueHasia 9 months ago

    But what about the database? Where does it store all its info?

  • @williamyoung4784
    @williamyoung4784 9 months ago

    The fitness gram pacer test! XD

  • @jasonycw1992
    @jasonycw1992 9 months ago

    Is that the LABS logo and merch?

  • @karelissomoved1505
    @karelissomoved1505 9 months ago

    I could use a single HGX A100 unit for blender rendering

  • @Medan1993
    @Medan1993 9 months ago

    Aaaaaaand.... what about storage? All you were talking about is GPUs and compute, while storage here seems quite important too

  • @graylucas3178
    @graylucas3178 9 months ago

    It'd be a real bummer to be blind and not see the two pretty substantial corrections in a 5 minute video.

  • @vlamnire
    @vlamnire 9 months ago

    That makes sense with all the AI stuff I see added to Azure. Azure is one mighty public cloud

  • @mustafanobar
    @mustafanobar 9 months ago +2

    i liked this video 2 times (*once not twice)

  • @melkal
    @melkal 9 months ago +3

    Seeing all the costs and capabilities of the system I don't understand why this tech has been made available for free to everyone.

    • @SoundwaveSinus9
      @SoundwaveSinus9 9 months ago +5

      Subscriptions.... they bring a lot of money from people who want to stop thinking for themselves

    • @untitled795
      @untitled795 9 months ago +1

      Take a peek at the terms and agreements ;)

    • @radugrigoras
      @radugrigoras 9 months ago +4

      Training. It’s useless without it. You take a loss for a couple years, build your real system in the back only putting forward to the client “new” tasks it needs to learn as new features with the eventual goal of offering it as a real service to companies replacing millions of workers that sit in front of a computer or phone answering simple questions, doing accounting, legal advice, basic programming, bug finding/code qc, basically anything that doesn’t require physical labour. Microsoft itself has 221k employees, more than half would get erased by AI, if you assume the average salary is 50k/yr that’s 5.5billion in wages saved every year. Do you think they care about a 1 billion dollar write off, and another 100million a year in running costs? Even if they offer this for 5 years free and upgrade every year, once unleashed the payback would be 1 year. Which as far as ROI goes is stellar. That’s just based on Microsoft. How many tech companies have tech support? All gone, especially level 1 and 2. Those would be day 1. The average low end wage in India is about 4$/day, now if you run 24/7 that’s 8$/day x 30 days = 240$/mo, they could sell one seat for 120$/mo and still be cheaper than any employee anywhere. So think of the scale and you will see that this is a VERY smart investment and that’s why it’s such a race to be #1, you are talking tens of billions a year in profit, in perpetuity. Sadly it also means a lot of people will starve or have to find some other employment which will be very hard when the only jobs are hands on and they have no skills in a slowing global economy.

    • @untitled795
      @untitled795 9 months ago +1

      @@radugrigoras that will devolve into absolute chaos quickly, that many people losing a job at the same time? Oof.

    • @tercmd
      @tercmd 9 months ago

      At the bottom of the page, it shows "Free Research Preview"

  • @Gal3tti
    @Gal3tti 9 months ago

    Thank you so much. I tried asking ChatGPT what kind of hardware it was running on, but it said something like "the cloud" and "it's not easy to understand"

    • @aviralshastri
      @aviralshastri 9 months ago

      Yeah, on cloud servers it's using these GPUs

  • @RetroMMA
    @RetroMMA 9 months ago

    Glad they're willing to outsource themselves to their creation...

  • @xTheRedShirtX
    @xTheRedShirtX 9 months ago

    If AWS, IBM, and Microsoft used their resources to train/deploy ChatGPT it would be insane. I can't wait for the day my $20 allows me to build a video game with ChatGPT.

  • @Linux4thePeople
    @Linux4thePeople 9 months ago

    Yeah James!!!❤

  • @prsworld
    @prsworld 9 months ago

    Purple background is great

  • @imperatrice_85
    @imperatrice_85 4 months ago

    How many A100s are needed to run Mixtral 8x7B at decent speed (= ChatGPT speed), in a private home environment with only one user at a time? :-)

  • @JohnneyleeRollins
    @JohnneyleeRollins 9 months ago +2

    How many potatoes is that?

  • @mo_mo1995
    @mo_mo1995 9 months ago

    4:41 Sounds like a gain to me

  • @apricotcomputers3943
    @apricotcomputers3943 9 months ago

    Amazing

  • @alessiotempesta7941
    @alessiotempesta7941 9 months ago +1

    Nice video

  • @AdamsWorlds
    @AdamsWorlds 9 months ago

    Wonder what will happen when we get quantum computing and it's linked with AI. Should get a rapid expansion.

  • @imperatorpalpatini6776
    @imperatorpalpatini6776 9 months ago

    You did not just call it a gigantic chungus gpu 💀

  • @myaccount6216
    @myaccount6216 9 months ago

    I suddenly feel bad for regularly copy-pasting 250+ lines of code into it and asking for easy changes

  • @Bultizar
    @Bultizar 9 months ago

    Quite a few tech-detail slips in this video. Guys, watch the quality.

  • @itsdeonlol
    @itsdeonlol 9 months ago

    ChatGPT is insane!!!!

  • @Narutoshippuuden440
    @Narutoshippuuden440 9 months ago

    I'm sure they are running on 2 overclocked Core 2 Quad CPUs.

  • @fcfdroid
    @fcfdroid 8 months ago

    Everyone downloading the audio version of LTT news to stay up to date gonna be ignorant AF with how many corrections there are 😂

  • @Demisaint86
    @Demisaint86 8 months ago

    How do the H100/A100 compare vs AMD's MI300X?

  • @maxstellar_
    @maxstellar_ 9 months ago +1

    crazy

  • @TerminalWorld
    @TerminalWorld 9 months ago +1

    Factor of 6?
    So 1000000 times better???
    Pressing X to doubt.

  • @Napert
    @Napert 9 months ago +1

    those damn datacenters stealing all the gpus from gamers!

  • @manixburn6403
    @manixburn6403 9 months ago +1

    That's one expensive trash can.

  • @seansingh4421
    @seansingh4421 8 months ago

    Sooo if I had ChatGPT running on my home server, I would just need a 2-socket Epyc or Xeon, 256 GB ECC RAM and a GPU?

  • @NonLegitNation2
    @NonLegitNation2 9 months ago

    I actually tried asking ChatGPT a few months ago what hardware it ran on but it wouldn't tell me, lol. Damn secretive AI.

    • @enkvadrat_
      @enkvadrat_ 7 months ago

      It doesn't know, since the training data is cut off

  • @dbtech4562
    @dbtech4562 9 months ago

    All that power and it can’t run Crysis.

  • @cct241292
    @cct241292 9 months ago

    You can now put Meta's 660b model into the 192GB RAM of an M2 Ultra SoC.

  • @younesabuelayyan4520
    @younesabuelayyan4520 9 months ago

    pls. a vid about using M1 chips for AI👌

  • @betweenprojects
    @betweenprojects 9 months ago

    And I thought Moore's Law had run its course.

  • @CutieBarj
    @CutieBarj 9 months ago +1

    I would have liked to know what power is required per user; like, if I am the only user, what kind of hardware would be necessary?

  • @Starfals
    @Starfals 9 months ago +1

    The 4090 here is still 2000 dollars... soo um... yeah.