What is Hyperthreading?
- Published: 8 Jan 2025
- Because of Hyperthreading (or should it be Hyper-threading), many modern CPUs have "threads" as well as "cores". Normally there are twice as many threads as cores. But what does it all mean? Does it improve performance? And is it better to have more cores than threads?
Like my t-shirt? Buy one here: teespring.com/...
Multitasking vs Multithreading vs Multiprocessing: • Multitasking vs Multit...
Introduction to Android app development. Everything you need to know to get started: www.dgitacadem...
Twitter: / garyexplains
Instagram: / garyexplains
#garyexplains
The airport part was the best hyperthreading explanation I've ever seen. Good job
Philip Rydén I agree. Very nice analogy
Yea it was. Helped me really understand better how threads work.
I agree
Completely agree.
I guess you can also compare CPU operation to cooking in a kitchen. Because there you have operations that can be done out of order and in parallel (such as prepping vegetables) and those which can not.
That example really made it easier to understand. Thumbs up from my side :)
This channel is arguably one of the best when it comes to explaining computer concepts. Gary you the best.
This was a really great explanation. Love the fact that you took the airport example and in the end compared the 6C/12T vs 8C/8T speed differences.
There has been a lot of talk around the next gen Intel CPU with "only" 8C with no HT vs the current 6C/12T CPU. For some it does not make sense, while for me it makes perfect sense. :-)
Keep them explanations coming!
Awesome explanation as always. This channel is so underrated. Hope it gets larger than other so-called "tech" channels that talk about things like how an animation on one phone is 10ms faster than on another phone. One suggestion: please don't use the handwriting animation (with the fake hand), it is a little distracting. Thanks!
Again .. what a great demo Gary .. really good job
Gary, as always, great explanation and information.
Truth!
Great video, like the drawings, amazing as always - thank you Gary 😇
Helpful as always, thank you Gary Sensei.
This channel is underrated. Subscribed.
How have I not heard of this channel. Actual useful, correct info on technology he deserves over a million subs.
The best hyperthreading explanation I've heard so far, thank you! Thumbs up!
Omg, the airport security checks example... How brilliant! You are gifted at this, Sir! No one has explained this better than you. I am preparing for my CompTIA A+, I have got to subscribe to your channel
Glad you enjoyed it
I just love your explanation 😊 Sir Gary
Wow, the best way to explain these CPUs ever. Thanks again Gary for the great work.
Very nicely explained and the examples are awesome.
Please explain to us the concept of SOCKETS.
And how cores are divided among SOCKETS and how it works. Please give us some ideas on it. Thanks a lot !!!
This channel should get a million subs ASAP. Such a thorough, good video. Good job man. :)
I agree!!! 😁
Shahed zaman Nice name. 😏
Agreed, he is the best explainer of how things actually work in the popular tech scene
Great video Gary. 🔥
Loved the Airport analogy - really brilliant observation and clean way to explain hyperthreading contention.
the example with the airport security check is just simply awesome. thank you very much :)
Thank you again Professor! Wonderful video... lesson learned!😎
As always, awesome explanation Gary. I would say you were born to teach tech. :)
You're helping lots of people. Thank you.
Thanks... superb explanation
Thanks again for unpacking this
Appreciate your work man ❤️..
My pleasure.
Hello Gary,
I like your analogy with the airport queue, it is a very good illustration of one of the uses of hyperthreading. However, you may have forgotten (or maybe it is too technical to be explained in the same video?) the case of superscalar architecture, where you have multiple ALUs in each core, which allow two threads to actually execute at the same time on the same core. It may be a subject for another video?
Keep on with your videos, I love them!
I have covered different aspects of superscalar architectures in various videos, most notably this one: ruclips.net/video/gLsdS0zQ82c/видео.html However, things are a little more complicated than you portray. Having instruction-level parallelism (with multiple ALUs, load/store units, etc.) does help, but remember that each thread has its own context (registers, etc.), whereas the "backend" parts are not split into two threads again: it isn't 2-1-2, but rather a shared resource. In that sense the performance gain from ILP doesn't increase when going from 1 thread to 2.
In short : *4C/4T* is better than *2C/4T*
4C/8T is better than 4C/4T
What a great analogy on the airport queue!
This is amazing thank you sir.
I am going to take a refresher course to remember what I have learned in the past. Now I remember again what multithreading and hyperthreading are all about. Old age can make one forgetful from time to time. Especially at age 93!
Very well explained
The best instructional video i've seen in a while! lots of love for this channel. God bless you more Gary ! :)
best explanation so far...thank you!
Glad it was helpful!
Wow, this is the best hyperthreading explainer video I have seen! I have some IT background but not Computer Science. Can you tell me, in a PC, what kind of problem would make a "queue" struggle to fetch instructions into the CPU? Also, are there cases where hyperthreading will cause a decrease in performance?
This is one of the best videos for me on RUclips, and the way you have explained it made me very happy :). Can you please also help explain how "Intel Virtualization Technology" works? We are grateful to see all your videos.
Great video, thank you
@6:01 That is the advantage of a Xeon E3 1231 V3 over an I7 4470, hyperthreading, in case you use it for gaming and it is cheaper if you buy it used. ;)
@6:48 Ah, so my example is a good one because the architecture is basically the same and the clock speed is the same.
*GARY!!!*
*Good Afternoon Professor!!!*
MARK!!!
Mark!
*ZAMAN!*
Hello mark !!! Very good morning class mate!!!!!
Zaman Siddiqui hiii hello ZAMAN very good morning class mate!!!
When I see a Gary Explains notification I know that quality content awaits.
Hyperthreading helps mitigate data dependencies within a thread. To illustrate:
Assume you have a superscalar processor that can execute up to 4 instructions per clock cycle. Unfortunately, most code can't take advantage of that due to data dependencies. An example would be this expression:
A = B + C + D + E
There are 3 additions there, so in theory only one clock cycle is required, since the processor can perform 4 instructions per clock. But that isn't so. The above could be compiled into something like this:
1. A = B + C
2. A = A + D
3. A = A + E
And if you look at the above 3 instructions, you'll see that #2 can't execute until the result of #1 is known, and #3 can't execute until the result of #2 is known. So the above code sequence would take 3 cycles with an average of only 1 instruction per clock. Now a clever compiler can mitigate that somewhat. For instance, look at the following code sequence:
1. T1 = B + C
2. T2 = D + E
3. A = T1 + T2
For the above, #1 and #2 can both execute at the same time since they don't depend upon each other. But #3 has to wait for the results of #1 and #2 to be completed. So the above sequence takes only 2 cycles for an average of 1.5 instructions per clock.
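The two sequences above can be checked with a small Python sketch. It assumes an idealized 4-wide core where every instruction takes exactly one cycle and the only constraint is that an instruction must wait for whatever produces its inputs. The function name `cycles_needed` and the tuple encoding `(destination, source, source)` are made up for this illustration, not taken from any real tool.

```python
from collections import Counter

def cycles_needed(instructions, width=4):
    """Cycles to run `instructions` on an idealized core.

    Each instruction is (dest, src1, src2). An instruction can run in the
    cycle after all of its sources have been produced, and at most `width`
    instructions run in the same cycle. Assumption: one cycle per
    instruction, no other hazards.
    """
    ready = {}           # register -> cycle in which its latest value is produced
    issued = Counter()   # cycle -> number of instructions placed in that cycle
    last_cycle = 0
    for dest, *sources in instructions:
        # Inputs that nobody has written yet (B, C, D, E) count as ready at cycle 0.
        cycle = 1 + max((ready.get(s, 0) for s in sources), default=0)
        while issued[cycle] >= width:   # that cycle is already full
            cycle += 1
        issued[cycle] += 1
        ready[dest] = cycle
        last_cycle = max(last_cycle, cycle)
    return last_cycle

# A = B + C; A = A + D; A = A + E  (each step depends on the previous one)
chain = [("A", "B", "C"), ("A", "A", "D"), ("A", "A", "E")]
# T1 = B + C; T2 = D + E; A = T1 + T2  (first two steps are independent)
tree = [("T1", "B", "C"), ("T2", "D", "E"), ("A", "T1", "T2")]

for name, code in (("chain", chain), ("tree", tree)):
    cycles = cycles_needed(code)
    print(name, "cycles:", cycles, "instructions per clock:", len(code) / cycles)
```

Under those assumptions the chained version needs 3 cycles and the rearranged version needs 2, matching the 1 and 1.5 instructions per clock quoted above.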
Now, let's make the following assumptions.
1. We have a superscalar processor capable of executing up to 4 instructions per clock.
2. Our code on average can only execute 1 instruction 50% of the time, 2 instructions 25% of the time, 3 instructions 12.5% of the time, 4 instructions 6.25% of the time, 5 instructions 3.125% .....
With the above assumptions, the processor will on average execute 1.875 instructions per clock, for a total utilization of 47%. Now the goal is to make it faster. We could add extra logic and make it capable of executing up to 5 instructions per clock, which would raise the average to 1.9375 instructions per clock. That is an improvement over 1.875, but a fairly small one. But if we add an entirely separate set of registers and have the core execute an independent thread which has no data dependencies with the first thread, we get an average of 3.25 instructions per clock, for an 81% utilization. Unfortunately, each thread's performance drops to 1.625 instructions per clock, but the overall system performance does increase significantly. Each individual thread slows down because it can no longer have 4 instructions execute at once, since the other thread is always executing at least 1 instruction. Some of the time when a thread would be capable of executing 3 instructions is also lost, because the other thread is executing 2 instructions. But the loss of going from 1.875 to 1.625 per thread is fairly modest, while the increase in system performance of going from 1.875 to 3.25 is quite significant.
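The numbers in this comment can be reproduced with a short Python sketch. It takes the stated assumption literally: each thread has k instructions ready with probability 1/2^k, the core retires at most `width` instructions per clock in total, and the threads are independent. The function names are invented for this example, and the model is only the one described in the comment, not a claim about how any real CPU behaves.

```python
from fractions import Fraction

def ready_distribution(width):
    """P(k instructions ready) = 1/2**k, with everything >= width lumped at width.

    Lumping is safe because the core can never issue more than `width` anyway.
    """
    dist = {k: Fraction(1, 2**k) for k in range(1, width)}
    dist[width] = Fraction(1, 2**(width - 1))   # remaining probability mass
    return dist

def average_ipc(width, threads=1):
    """Expected instructions per clock for `threads` independent threads."""
    per_thread = ready_distribution(width)
    totals = {0: Fraction(1)}                   # distribution of the combined count
    for _ in range(threads):
        combined = {}
        for have, p in totals.items():
            for ready, q in per_thread.items():
                issued = min(have + ready, width)   # the core caps the total
                combined[issued] = combined.get(issued, 0) + p * q
        totals = combined
    return sum(n * p for n, p in totals.items())

one_thread = average_ipc(4)             # 4-wide core, one thread
wider_core = average_ipc(5)             # 5-wide core, one thread
two_threads = average_ipc(4, threads=2) # 4-wide core, two hardware threads

print("4-wide, 1 thread :", float(one_thread))
print("5-wide, 1 thread :", float(wider_core))
print("4-wide, 2 threads:", float(two_threads), "total,",
      float(two_threads / 2), "per thread")
print("utilization:", float(one_thread / 4), "->", float(two_threads / 4))
```

Using `Fraction` keeps the arithmetic exact, so the results come out as 15/8, 31/16 and 13/4, which are the 1.875, 1.9375 and 3.25 quoted above.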
Very clear explanations! The airport example was the best♡💯
Very nice explanation
Gary, how does the CPU keep track of which instructions belong to which thread? Are the instructions tagged with a thread number in some manner? Do they use some kind of shadow registers? If, say, a load instruction of thread 1 puts a value in register A, but thread 2 actually needed that register for the next operation it is pushing through the core, how do we deal with that? Or can the thing picking instructions from the pipeline somehow sort or stall instructions based on whether they are touching registers already occupied?
Thank you so much
Excellent, as always! 👌
Well explained, without going into too much detail 👍
P.S. Hope you don't mind my asking - do you prefer Mr Sims, or Gary? :)
Cheers!
Gary is fine... my Dad is called Mr Sims!!! 😀
"Professor" is best!
Is it possible for you to make a series on:
1. Evolution of computer hardware, from the difference engine to 4th-gen computers.
2. Computer software - first operating system - current operating system.
3. Evolution of networking to Internet.
I tried to find the information but it's unstructured and I can't connect the pieces. But I am too interested to know about it.
So can you please explain.
Great explanation
this is one of the best tech channels...gj
I was wondering what the hell is hyperthreading when almost every tech youtuber was like intel i7 9700k is dropping hyperthreading. Now I have a better comprehension. thanks professor gary.
*RAHUL!*
Mark Keller YES!
Afternoon classmate!
Ummmmm. thanks i guess!
🤔😅
???
I think the process described in the video is more comparable to temporal multithreading than simultaneous multithreading (or hyperthreading)
No.
could you please explain CPU over commit with Hyper threading as well ?
I'll remember the Rule of thumb.
Thanks Gary.
Well done, I think even normal people will understand this 👍🏻
ok, ok, I get most of this, but I am confused about multi-core CPUs. It seems like they would bottleneck at the "scanner" somewhere as well, like the threads do, right? If you have 4 cores, aren't the results or outputs being sent out on the same "highway" or bus as the other cores? I am guessing memory clock speeds need to be faster than the CPU to keep things moving?
Outside of queuing, one of the things that hyperthreading does is save an expensive context switch. Unloading the registers and reloading them from RAM is expensive. With hyperthreading you have more hardware threads, which reduces the need for a context switch. One day CPUs will probably have the ability to hold all OS threads within their registers; it will probably coincide with ARM's rise.
Sir please make a video on how to enable hyperthreading
Sun's UltraSPARC architecture took this to extremes, with up to 8 threads per core and making cache misses almost a feature. Early versions even shared a single FPU among several cores. Needless to say, everything was sacrificed to maximise aggregate throughput with low power consumption and single thread performance was dire. I had lots of "interesting" meetings with senior management who had been over-sold UltraSPARC and demanded to know why we weren't using more of these apparently cheap machines for a lot of their projects, and there were some disastrous examples where the wrong choice was made.
This was nothing like the same issue with Intel hyper-threading as Intel did not sacrifice things like cache size, out of order execution and so on, but the non-linear throughput with (apparent) CPU utilisation could cause some confusion in capacity planning. If half the threads were idle, it did not mean there was twice the potential throughput when on a hyper-threaded machine.
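The capacity-planning point can be made concrete with a toy Python model. Everything in it is an assumption for illustration: two threads per core, a scheduler that fills empty cores before doubling up, and a made-up `smt_gain` of 1.3 (a core running two threads does 1.3 times the work of a core running one). The real gain depends heavily on the workload.

```python
def relative_throughput(busy_threads, cores, smt_gain=1.3):
    """Throughput in units of 'one core running one thread'.

    Assumptions (hypothetical): 2 hardware threads per core, threads are
    spread across idle cores first, and a core running two threads
    delivers `smt_gain` units of work in total.
    """
    busy_threads = min(busy_threads, 2 * cores)
    cores_in_use = min(busy_threads, cores)
    cores_doubled = busy_threads - cores_in_use      # cores running 2 threads
    cores_single = cores_in_use - cores_doubled      # cores running 1 thread
    return cores_single * 1.0 + cores_doubled * smt_gain

cores = 4                      # shows up as 8 logical CPUs
for busy in (2, 4, 6, 8):
    apparent = busy / (2 * cores)
    work = relative_throughput(busy, cores)
    print(f"{busy} busy threads: apparent utilisation {apparent:.0%}, "
          f"relative throughput {work:.1f}")
```

With these numbers, a machine that looks 50% utilised (4 of 8 logical CPUs busy) has only about 30% more throughput left, not 100%.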
*Multithreading vs. Hyperthreading:*
Multithreading refers to the general task of running more than one thread of execution within an operating system. Hyperthreading, on the other hand, refers to a very specific hardware technology created by Intel, which allows a single processor core to interleave multiple threads of execution more efficiently.
Ooh, I just harped on some terminology on another vid of yours and I hesitate to say it and sound like a broken record, but I really do need to jump on you for the title here, in honor of my former professor who actually came up with the idea herself and whom the conflation of this term rankled so much she actually flinched in class over it. What you are describing in such general terms is "Simultaneous Multi-Threading", whereas "Hyperthreading" is Intel's implementation of it. Yes, I'm proud to say that Susan Eggers was my professor and she was phenomenal.
Yes I know I am describing SMT, but most people haven't heard of SMT but they may have heard of Hyperthreading. It is the old photocopy/Xerox thing.
When you turn off HT on an i5 11400F, the CPU-Z benchmark shows more points in single thread. Some games behave the same way: strong single-thread performance is better for them
thank you your example helped me finally get the idea of it lol
Hey sir... what is the difference between a 32-bit processor and a 64-bit processor? And why are there 32-bit operating systems and 64-bit operating systems? I hope you make a video of it. Thanks
A 32-bit processor handles data internally in chunks of 32 bits, while a 64-bit processor handles its data in chunks of 64 bits. Also, 32-bit processors are generally (but not always) limited to addressing a maximum of 4GB of physical memory or 4GB of virtual memory. 64-bit processors don't have that limitation. You might find this video of mine useful: ruclips.net/video/04YvqoQMs3k/видео.html
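The 4GB figure falls straight out of the arithmetic, as this small Python sketch shows. It assumes byte-addressable memory and a flat address of the given width; the helper name `addressable_bytes` is just for this example. Real 64-bit CPUs typically wire up fewer than 64 address bits, so the second number is a theoretical ceiling only.

```python
GIB = 2**30   # bytes in one gibibyte
EIB = 2**60   # bytes in one exbibyte

def addressable_bytes(address_bits):
    """Number of distinct byte addresses with `address_bits` of address."""
    return 2**address_bits

print("32-bit:", addressable_bytes(32) // GIB, "GiB")
print("64-bit:", addressable_bytes(64) // EIB, "EiB (theoretical ceiling)")
```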
well i cant find hyper threading technology in bios, what should i do to turn it on??
Thank you, sir.
Good.What is quantum computer?
Thanks! Can you explain more about the instruction pipeline?
Did you see my "instructions per cycle" video over on the Android Authority channel?
Thanks a lot! very helpful.
Nice one
Hiii !!!! Hello !!!! Very good evening professor!!!!!!!! Very good video !!!!!! Thanks professor!!!!
Robby!!! Glad you liked it!!!
*ROBBY!*
Are there single-core CPUs with hyperthreading?
And are there hyperthreading CPUs with more than two threads per core?
Thank you.
I think the original Pentium 4 HT processors were 1 core, 2 threads.
Hey Gary, Can you make video on Difference between SELinux and Linux ?
Does a program need to be a multitasking/multithreading support program?
Yes. Only multithreaded programs will get a performance boost when running on a hyper threaded CPU.
Why don't ARM CPUs have hyperthreading?
Because while hyper-threading can offer a performance boost, it is very power hungry and the amount of power it takes vs how much performance you gain means it is not useful on mobile and other power sensitive applications. A much better solution is HMP, which is what we find in most high end smartphones, i.e. 4 big cores (like Cortex-A75) and 4 little cores (like Cortex-A55).
Gary Explains so 4 strong and 4 weak cores are the mobile equivalent of hyperthreading?
No, it isn't about equivalency, but most desktop processors are 2/4 or 4/8. Only recently have larger configurations become more mainstream. However, mobile has had octa-core processors for years, but they are 4 & 4 as I described. IMHO, HMP is superior to hyper-threading by a long way because those extra four cores are complete cores, not just "threads".
Gary Explains so if there are more real cores, does it mean that it's better than hyperthreading?
Yes, in the sense that rather than adding a thread you add an entire core. However, those cores don't have the same performance as the bigger cores, so it is always a trade-off. The point about HMP is that for mobile it is excellent, as the power-efficient cores can get used for all the low-priority tasks. Even watching RUclips uses the smaller cores, leaving the big cores shut down, and so it saves battery. But remember we are comparing two very different systems: desktops with mains power and big cooling fans, compared to mobile with a battery and no active cooling. The Asus Chromebook that I recently reviewed has HMP; it might be interesting for me to test the performance using my program from the Multitasking & Multithreading video to see how it uses HMP.
this is awesome
I thought you would also mention vCPUs in all this and how it relates to hyperthreading.
What are vCPUs?
@@GaryExplains vCPUs is a term created by cloud providers (probably AWS). And on AWS, when you request an EC2 instance with say, 2 cores (aka vCPUs), you don't know whether those 2 cores you got are hyperthreaded cores or actual, physical cores. Heck, your said EC2 instance can have 1 hyperthreaded core and 1 physical core, but Amazon will still charge you for 2 cores as if they were 2 physical cores.
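On a Linux guest you can at least look at what topology the kernel reports. The Python sketch below reads the sysfs file `thread_siblings_list` for a CPU, which lists the logical CPUs sharing one core (for example `0,4` or `0-1`). The helper names are invented here, and inside a VM the answer only reflects what the hypervisor chooses to expose, so treat it as a hint rather than proof.

```python
from pathlib import Path

def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,4' or '0-1' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus

def thread_siblings(cpu=0):
    """Logical CPUs sharing a core with `cpu`, or None if not available."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return None   # not Linux, or the file is not exposed

siblings = thread_siblings(0)
if siblings is None:
    print("Topology information not available on this system")
elif len(siblings) > 1:
    print("cpu0 shares its core with:", sorted(siblings - {0}))
else:
    print("cpu0 appears to have a core to itself")
```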
What are we gonna do if we can't use hyperthreading anymore? Intel's CPU flaw says to disable hyperthreading
True, but if we are thinking of the same vulnerability, it doesn't apply to AMD!!!
@@GaryExplains Intel is still better, the overclocking is way better with Intel, you will never reach the high overclocks with AMD
Thanks
wait, is a multi-core CPU machine then the same as a multi-CPU machine in that case?
Yes, from a software point of view they are almost the same. There is some stuff about wanting to keep processes on the same CPU (on a multi-CPU machine) so that the L2 and L3 caches are populated, but in general they are basically treated the same. Obviously, from a hardware point of view there are big differences.
@@GaryExplains that's interesting, are there advantages and disadvantages otherwise?
great video!
נהדר! (Great in Hebrew)
You taught me a lot with your videos and info. Thank you man so much ❤
All good and clear, but once the entire CPU is at max load hyperthreading becomes useless, right?
That is the wrong question, the right question is how to get the CPU to max load with multiple cores and multiple threads!
for me useful note is rule of thumb
Thanks.
You're welcome
I don't understand what exactly can cause a non-hyperthreading core to be idle when there is still processing to be done.
Let's say I want to perform a billion calculations.
Why would I care if those calculations are executed in one thread (without HT) or in two threads at half the speed each (with HT)????
A program that just has a billion calculations to run is rare. Most programs include lots of IO: from memory, from disk, from the network. These slower things mean the CPU has to wait, and reducing that wait is essential for high performance. That is why we have cache memory, branch prediction, hyperthreading, etc.
Actually, when the LGA1155 CPUs came out, computer enthusiasts in Taiwan ran a test of the i5 2500 against the i7 2600 and discovered that the i5 2500 could almost run head to head with the i7 2600, despite the i7 2600 having 4 extra threads and 2 MB more cache! In other words hyper-threading was practically useless b/c threads are threads and have no ability to process information like real CPU cores!
Excellent.
Airport nailed it.
Why don't you use the actual name, SMT, as opposed to the brand name from Intel?
Because no one has ever heard of SMT, but they have heard of hyperthreading.
@@GaryExplains You could have said the actual name in the video. Intel didn't invent it nor are they the only CPU company that use it (Zen uses it, and they advertise it as "SMT").
Feels like a bit of an oversight.
@@nextlifeonearth I am appealing to the masses. So sue me. 😜
@@GaryExplains I get that more people will find the video with that name, but I think that would be valuable information to consumers among others, if you say they're the same thing. They look at two CPUs, one has this Hyperthreading thingy they saw a video of and the other this SMT thingy they never heard of. If they knew it was basically the same thing, that would prepare them better.
Anyways now that I have your attention: video idea: a video about OOO execution. It's very relevant with the M1 SOC and I'd like to hear your take on it.
6 cores/6 threads and 4 cores/8 threads give almost the same level of performance if the rest of the things are the same.
Rumoured 4 threads per core.
A) A slight gain in performance sometimes
B) A marketing gimmick
Did you see the performance numbers I quote in the ThunderX3 video for SMT4?
@@GaryExplains I have now, thanks for explaining Gary!
This makes sense with the likes of the i7 8700K vs the i7 9700K
RIP Hyperthreading
One performance benefit of SMT only grows as the years go by and the memory wall increasingly shows its ugly face. And that is: what happens on a cache miss? Well, one hardware thread is blocked for an eternity while it waits on slow memory to eventually come back with some data. And as it waits, it uses NO INTERNAL RESOURCES. Well, happy day for the other thread on that core, which'll have absolutely NO CONTENTION.
I don't want to crash your party here, but he did not explain how Hyperthreading actually works under the hood.
No party crashing here. The title of the video is "what is hyperthreading" not "how does hyperthreading work under the hood."
That explanation was amazing. Now you need to explain how to get teenagers to keep their room clean. 👍
First...thank god
lol
yo.. second
lol
Intel fanboys in 3..2..1..
It's when you're Hyper about stitching threads.
😒 hyperthreading is fake cores. Depending on your applications it's sometimes useless and money wasted. Check and compare benchmarks on different CPUs 💻💻🤓
Somehow this is too close to Techquickie
eh?
Gary Explains Pretty sure you're familiar with that channel, just pointing out your content is getting similar to it.
Gary Explains Hahaha true but Linus doesn't go as deep as Gary does because his videos are shorter
So if a guy on RUclips has a video about a topic, then no one else is allowed to make videos on that topic? I don't understand?
Gary Explains You can literally do whatever the F you desire to, it's your channel. Just my input about you putting out already available content, for me your instructional videos and guides made the channel original and worthwhile to hang around with.