Really interesting tech demo/early version of this feature. Curious to see if it goes anywhere further. In the meantime, we'll be busy on some CPU benchmarks for a while! Guesses as to why? Check out our Steam Deck tear-down here! Really fun one - we learned a lot from Valve's engineers: ruclips.net/video/9jhRh11bTRA/видео.html
@GamerNexus , What cpu should i buy the 7800x3d , 13900k , 14900k ? I want to stream, gaming, and also have multiple tasks of google open ,spotify , and also maybe edit some videos . So what should i do
@@AlexManMichael Get the 7800X3D, uses a little more than 1/3 the power and has better gaming performance and you'll actually be able to upgrade using the same motherboard.
7800x3d would be best. It's slightly slower due to clock when rendering etc, but absolutely owns in gaming. Plus, it sips power (insanely efficient). Unless you're video editing like Steve, Linus, jayztwocents etc, 8 cores and 16 threads are more than enough. Plus, as mentioned, AM5 platform means future proofing for later CPU generations. I have 7800x3d and I'm so glad I chose AMD this time round.
I was relieved to see you guys reran Rainbow Six at low settings so we wouldn't have to go hunting for another couple percent, hahaha! Nice to share the workload on stuff like this.
I suspect they're even limiting to the 14700k and 147900k specifically to upsell from the "1x600k is enough for gaming" message reviewers have been repeating for several generations now.
Naa... They probably dumped this out because they know the "14th gen" CPUs are a joke, and this provides some more segmentation from the 13th gen CPUs. It'll get dropped with an excuse that they've worked with M$ to incorporate the key features into the windows scheduler.
@@meeponinthbit3466 most likely. reminds me of the 40 series and dlss 3.5. Why can't it run on the 30 series? Prob cuz they ain't selling the 40s as well
Yeah I had no idea this was a thing on the latest intel CPU gen (have a 8700K). Upgrading now. Ordered the DDR5 RAM, new CPU cooler, but haven't locked my self in yet (haven't bought the mobo or CPU). This is making me consider going with AMD. I was already having a very hard time deciding *sigh*
The power consumption reminds me a REALLY old test with those Atom processors. The Atom processor was SO SLOW that ended up consuming more electricity than the faster, modern processor despite have lower power load from the wall all because it took more time to complete the task.
Two things: 1. I'm missing an "e core disabled" benchmark, in addition to apo on/off. 2. CPU's are starting to become like the latest car engines: extremely complicated just to lower CPU consumption a little. And that's not something good. Now we need to add some kind of utility to re-schedule tasks.
Is not complicated, the i9s just are fundamentally flawed, 33% made of P cores and 66% e-cores kills the entire point of Big-Little. Apple doesn't do this for the M-series of chips.
The fact that this is just scheduling means that without "APO", the processors are performing incorrectly...making it additionally insulting that they won't enable these application-specific fixes for the processors from which these 14th generation chips were photocopied
Oddly enough you don't have these massive scheduling issues on Linux. Like the poster above me said: Microsoft doesn't care. They've not cared since windows 10 when they fired their QA team
@@Winnetou17 Now you know how AMD felt the whole time, especially during the FX era. Microsoft always geared Windows to work best with Intel. So it's a nice, but tiny change, for once.
In addition to Windows' infamously iffy thread scheduler, it also shows the inadequacy of Intel's Thread Director driver. You know, the thing that was supposed to help tell Windows what the "needs to be fast" threads are and which can be given slower billing. What APO finally delivered, but through this awful per-application hack?
This is why my Intel rig was built with an i5 12400. I never had much trust in the big/little thing. Same thing with my new AMD build - I went with a single CCD 7800X3D. I can't believe 7900X3D/7950X3D owners need to use the Xbox Game Bar to make sure Windows is specifically prioritizing the 3D V-Cache cores... Sometimes cheaper / simpler is better. Throwing money away doesn't always get you the best experience with all of these gimmicks in modern CPUs.
I got the same i5 CPU earlier this year cause I too wasn't sold on their big.little concept. It seems to be the future since it's been a thing forever on ARM chips and AMD has said they'll try their hand at it soon so it's bound to get better eventually but it's not quite there yet.
I agree, im pretty sure your desktop experience is far better with a 12400, my 12600k has one or two e cores pegged out doing mundane things on desktop causing hitching and lag, all while p cores are sitting there doing nothing.
Yep, I pretty much refuse to buy anything with mixed cores. I'll probably have no choice at some point, but I am sticking with an older CPU for now and basically hoping this idea goes away, haha. I hate it. Mainly because I know there's no way to make a magic scheduler that can somehow know which core to put which threads on; something APO seems to prove when they have to do this amount of work for each game, on each CPU SKU, for it to work properly. It's just ridiculous, and not something I want to buy into until there's no other choice.
@@LoudlevinChange your power plan to High Performance. This should allow tasks to be promoted to the P-cores more aggressively. There were also some BIOS patches during 13th gen that quietly addressed some weirdness of low E-core counts for chips like the 13400. Going up to the most recent might iron some of that out.
@@theviniso I am very sure AMD will face the same problem with Intel as well with their own big.little cores. AMD has only claimed feature parity with both Zen 4 and Zen 4c. However, I'm certain that the IPC of both types of cores are different. It is up to Windows to schedule threads for the correct type of cores.
My worry would be with such targeted optimisation you’re likely to see varying results with different hardware, and find significant changes or actual problems when games get updates and patches.
@@alexnulty8902 have they really rested affinity and not just disabling cores because disabling cores would remove the e-cores from taking over load from the rest of the system leaving more for the game?
You might get similar results if you carefully chose and set the affinity per _thread_ . Not just for the whole process. Of course, that requires pretty deep knowledge into how that game's engine works. Knowledge that would be extremely difficult to figure out from only empirical observation.
curious to know if simply forcing all processes except the foreground game to the E cores & put the foreground game only on P cores using process lasso would achieve similar results to this on 12th/13th gen or not
If you ever bring metro exodus to the test suite consider using the starting area from the sam's story DLC. Right before the first combat sequence there's a broken road overlooking a bridge that absolutely murders cpus.
I recently played that game at 4K maxed out. I recall my i7 9700k and RTX 3080 running that scene pretty well. Where my PC totally got wrecked in Sam's Story was the boss fight with the bat towards the end of the game. My framerate was dropping in the low 10s.
@@Mcnooblet Actually I had DLSS Quality on with full ray tracing. The entire game would've been unplayable from start to finish at native 4K with full ray tracing.
I've said this at HUB's video, as well: In the end, Intel basically admitted that Thread Director still sucks, at least for gaming. This is really hand-tuned scheduling, because they can't just disable TD altogether and fall back on Windows' standard scheduling for games, either. I wouldn't even call this a "tech demo", simply because it's not tech. Call it "craftsmanship" if you must - because that's more in line with people at Intel painstakingly pre-determining scheduling for a specific application. What a horribly unsustainable way of going about problems just because marketing told you so. Really reminds me of that oberclocked Xeon at Computex.
> fall back on Windows' standard scheduling for games I am very sure there are no fallback for scheduling. Because from the start, scheduling is OS job and not CPU job. This APO is basically because Microsoft not doing their job for YEARS to handle better scheduling, and then Intel taking matter into its own hand.
@@randomsomeguy156 LOL, please don't comment if you don't know what you are talking about. OS scheduler is a PROGRAM/SOFTWARE that an operating system uses to place running process in the right thread/workload. OS scheduler is not HARDWARE. It's not a transistor or logic gate that you put in the CPU. What we need, is for Microsoft and Intel to WORK TOGETHER to 'fix' the Windows scheduler. As the Windows scheduler has been proven again, again, and again, to not work properly with Intel P and E core models. It's basically the fault of both companies. Why don't they just work together to address this issue. Blaming just Intel for this is stupid and basically don't know anything, like you.
@@LaCroix05 Actually, Windows does the scheduling - the "OS' job" as you put it. However, we're way past the times when this could be done silicon agnostic. Take an i5-12400 or an R7-5800X for example: Even on a fairly uncomplicated CPU (when it comes to scheduling, that is) with only equal cores, on one CCD in AMD's case, there's still "better" and "worse" cores that clock higher and lower, respectively. Windows can and does take this into account, deciding based on fMax. (Simplified explanation for a very complicated problem!) Now, if you go AMD's way and add c-cores, which have the exact same capabilities as their regular brethren, just with lower clockspeeds, that works out fine. Intel's e-cores however are fundamentally different from p-cores, down to the very microarchitecture and instruction set. They also don't have HT. That would wreak havoc with the whole scheduling and thus Thread Director was created. Intel worked a lot with Microsoft and TD is implemented on the very silicon to "guide" the OS' scheduler. In the end, it's still the OS that decides where to assign things, but with help. AMD ran into a similar, albeit easier to work around problem with the R9-7950X3D. The OS would always assign threads to the higher clocking non-X3D chiplet first - which is fine, in general, but not for games and other very cache-dependant workloads. Hence the XBox Game Bar integration that tells the OS what's a game and when to prefer the low frequency/high cache CCD. However, the problem that p-cores and e-cores are two very different architectures remains. Think of it as basically having two completely different CPUs in one system. Intel will always run into the same problems and will always have to find better ways to manage thread assignments by the OS' scheduler. And no- that's neither Microsoft's nor any other OS provider's job. That responsibility lands squarely on Intel's shoulders. They decided to make heterogenous CPUs with fundamentally different cores, they have to figure out how scheduling for these can work. It's no wonder that they didn't implement the same principle for Xeon Scalable. No one would have put a CPU into their datacenter if they had to rework their whole scheduler just for that. The reliability issues would've been a nightmare at no tangible benefit whatsoever. Consumer PCs are different in that regard. Intel were faced with the problem that their core architecture was so overextended that it just didn't scale any further. Comet Lake was already pushing the limits and Rocket Lake even had to scale back to work with what 14nm could do. 10nm (Intel 7 these days) didn't give them the headroom to scale up, either, and AMD was literally clobbering them with high core count CPUs. Disaggregation was still far out on the horizon, so they did what they could. 8 enormous p-cores was the limit they could reasonably fit into a client CPU's die and then they filled up the rest with space-efficient e-cores - scheduling issues be damned. Now they are set firmly on that road and have to see it through. RLR is basically a bust, Arrow Lake still looks hopeful, but anything after that... i don't know. We'll see. Having manually optimized scheduling like they're doing now with APO still looks like a marketing stunt to me that's unsustainable long term. Maybe AI can fix that for them, wh knows.
@@theviniso Without looking into it, my instinct is that it's a RISC/CISC issue; with x86/x64 CISC nature making its threads far more complicated to efficiently schedule, especially with hybrid core architectures. It's a guess though; I could be entirely incorrect. Either way I too would be interested to know the answer!
Using big and little cores on x86 high performance desktop CPUs is not a good move. I just hope AMD sees this and doesn't follow the same idea. With laptops, perhaps the trade-off is acceptable
This APO thing could've been and still could be massive if it worked on all 12th, 13th gen and "14th gen" on all games. This +30% could've put intel back in the ring. Let's just hope they'll figure it out and either drop a software update so scheduling works as it should or make it work well with 15th gen.
This, to me, smells of a feature that the marketing side forced the dev side to release before it was ready for public consumption, as well as artificially limit to 14th gen. Not a good move, Intel.
@@clitmint Yeah, idk about that. If in a few months they've added support and provide significant benefit for a bunch of popular games, then maybe. If it's still like it is now, then everyone will forget about it. As Steve said, rn it's a tech demo and nothing more.
@@griffin1366 they kinda were, 250w down to 220w is still a far cry compared to 100w-180w on AMD especially when you realize relative performance is same for both sides and here intel uses shitload more cores + software optimizations to catch up to AMD while AMD just throws L3 cache onto CPU's because software optimization is way more financially and time demanding efficiency cores are nice but they require all parties to work on optimization for each application existing on the market while extra L3$ is redundant in design and fool proof in regards that it basically improves majority of workloads by existing while we still cheer for E cores to be a thing we can't ignore that E cores a software engineering nightmare and that it is time for microsoft to fix their god awful task scheduling
This is the reason I went with the 7800X3d . Big little requires to much gymnastics to get good performance out of it. This feels like intel desperately trying to squeeze more performance out of the chips. If we had a standard syscall that allowed programs to ask for a big or little core per thread maybe it could work, but that also requires the game developers to care.
To be fair, X3d hybrid CPUs have just as many problems with scheduling as E-Cores do. AMD's own answer was "if program is on this list, park non x3d cores. If on other list park performance cores". The scheduler isn't looking at something like cache misses and making a heuristic determination. Scheduling is hard.
@@arthurmoore9488 yea the issue is just hybrid cpus it seems, for example the 7800x3d works perfectly fine bc it ONLY has 3D vcache cores compared to the 7950x3d which has 8 3D V-cache cores and 8 standard cores which causes all these issues with scheduling
@@arthurmoore9488 Yup, 7900x3d and 7950x3d are actually much worse in that regard than any Intel cpu with e cores. 7800x3d is the only one that works...if it doesn't explode first that is, hehehe.
It exists ! It's called Process Lasso and complex cpu users are happy of the results. And the best is that you have complete control over what your cpu does. Counter strike benefits from using faster cores? Force the game to run only on those cores with a couple of clicks. This is a tool that should have been made either by Intel or Amd, it would have been much easier from the beginning
Oh but they did figure it out, and in their figuring, they figured the fanbois would yet again "upgrade" to the new thing, as they've always done... but when it didn't happen how they thought it would (because people are finally getting sick of Intel's bullshit) ... they roll this out to entice you all to "upgrade". Intel knows how to get the money out of your pockets and into theirs.
In fairness, deciding which tasks have a lot of work to do ASAP, and/or which threads will bottleneck responsiveness, is *HARD* , inherently so. In fact, it is probably impossible to solve exactly in the general case (no method that can work for all cases).* That said, good heuristics on "impossible to solve exactly" problems can still give a good approximation for most real world cases. And developers already have moderate ability to communicate intentions and needs of a task to the OS, that Thread Director seems to be underutilize. * Probably impossible because it is an "adjacent" problem to the halting problem (will a program finish or loop/freeze forever on some input?). The halting problem has been known to be, in the general case, impossible to answer exactly for decades.
@@TechSY730 a popular scheme is to run timers and kick processes out periodically. Done or not there's other things waiting. I'm running 230 threads on a 4 core CPU. Limited resources need to be shared. But yeah nothing is going to be ideal. The OS just has to do the best it can. If I want things done faster then I need to get a faster PC. Which I have been thinking about doing. I'm just not quite there yet.
Stuff like this is exactly why I went with AMD. The hybrid architecture feels like slapping a bandaid on a gunshot wound and for some reason the bullet keeps heating up and pulls 300W.
Nah, BIG.little would be absolutely better way of doing things IF the thread scheduling really worked. Hopefully it does in the future (and AMD is going to the same direction anyway).
This feels like a prime opportunity for some AI learning to kick in and make this slow, laborious optimization into something the machine does for you.
I think it's a good idea, but it needs a few month more development, and Intel has to add support for at least 13th gen. At the moment they lose their own customers because people are mad they bought a pretty expensive CPU this year and already out of the game. It's a really bad move and I hope there will be more pressure on them to add the feature to all 1700 cpus.
Bad news, Hardware Umboxed channel covered this including the fact that Intel has stated that they will not support older CPU's regarding APO, only 14th gen and onwards :(
Of course. That is Intel's m.o. The 7800x3d and 7950x3d are readily available and beats Intel's new 14th gen at virtually everything. Zen 5, roughly 6-months away, will add to the lead while reducing the price of the current Zen 4 champs. Intel is behind AMD in cpu, much like AMD is behind Nvidia in graphics.@@DrKriegsgrave
I mean Intel has been doing the same since the core lineup was created and they keep selling well, why would they give it to 13th gen when they are exactly the same as 14th gen? They need to sell those CPUs somehow, otherwise 13th gen users would have no reason to upgrade.
@@andersonfrancotabares3614 When exactly did Intel exclude prior CPUs from software solutions? They won't sell nothing when more and more Intel users switch to AMD because of the bad sevice. Intel doesn't win customers with this, they lose them.
I've had to tell people to use process lasso to get games off their e-cores when their performance is odd for a while and it's sort of nice to see even Intel struggling to manage scheduling for them. e-cores have their good, and turning them off isn't great when there are multiple ways to get the wrong programs from sitting on your e-cores.
I think it's more on Microsoft to do the work. Windows clearly recognizes which CPU is fitted to the motherboard, so it should have code in Windows to adjust the environment to work at it's best/most efficient per-vendor. However, Windows has always been geared to work better with Intel. For obvious reasons of course, like Intel just threw a ton of money around, and was the leading CPU for such a long time anyway. It was always AMD that got the bad end of the deal from Microsoft. Now AMD is in their consoles, they kinda have to play a little bit fairer toward AMD.
it's called Linux ... where you can bind specific applications to specific CPUs already via different means. This is basically Intel doing MS work. However to their credit. MS need to deal with a lot more CPUs so they cannot go all out on optimizations ...
@@saricubra2867 It is actually already doable, though not easily set. The easiest way is to use a program called *_Process Lasso_* ; it will detect heterogenous cores, and can automatically set "CPU Affinity" to a setting you determine to be the best.
These sorts of problems (that the APO tool is meant to address) is what I immediately assumed was going to happen when the P/E core thing was first unveiled. I'm still not convinced that's a better design than uniform cores. We saw similar problems with the above 7800 X3D chips from AMD, where the CCD with the 3d cache became the "P" cores.
Not to mention that Intel Thread Director was supposed to aid this kind of assignment among E/P cores from the start. So now it sort of feels like Intel is making developers do manually what Thread Director's job was supposed to be all along. Remember, APO is something that has to be tuned manually for each individual program; it is not automatic.
The split-cores on Intel's 12th-14 Gen CPUs is one thing, and largely are Intel's own fault for doing it that way.... but the problems you mentioned with AMD, aren't actually problems with AMD at all. The problems lay exclusively with Microsoft and even developers.
The hybrid core designs offer massive improvements for multi threaded workloads, at least ones tested here, but for these gaming workloads it is still causing issues. However, you can always either 1. Pin misbehaving games to P-Cores, or 2. Disable the E:-Cores if you're just gaming. Although then, the 7800X3D is the best option anyway. It's a shame that mixed core designs from Intel still have problems at the software level, because the processors otherwise are pretty impressive hardware. Maybe with this hardware scheduler people talk about for Meteor Lake and onwards they can overcome Microsoft's laziness in updating Windows core scheduling.
@@jamesbuckwas6575You're probably right. But my issue is that most people that buy it aren't going to know what a P or E core is, or what a bios setting is. They're just going to see a 20 core processor and think it's going to perform heaps better at gaming than an 8 core AMD processor. Something AMD was guilty of in the past with their "compute cores" marketing and the whole bulldozer core count disaster. There's no way that intel isn't using this as an excuse to inflate their core counts for marketing to the uninformed. I think it's perfect for the mobile/laptop scene but doesn't belong in their high end desktop cpus.
@@slimjim2321 I agree, the marketing around 14 core i5's should be improved. But that is a separate issue from the actual hardware, especially for people like us who are better informed about these products. Perhaps 6+8 cores could be better, or treat the E-cores as "half cores", or something better.
@@GamersNexus It would be good if they could. Presumably they'll want to automate it somehow, and at this stage they are not making lavish claims about how many games it will apply to and it seems that this is Intel driven NOT Developer driven, at least at the present. It's a sort of "We've got to get our house in order but it will be beneficial going forward".
Sounds like a scheduler bug fix / driver update that they're choosing to deny releasing on "older" (but identical) hardware, and instead package as a "new" feature. 👎
@@GamersNexus Its' also proof that Intel still thinks very little of their fanbase. Not allowing this to work for 12th and 13th Gen parts, is a spit in the face. I stopped buying Intel at the 10900K, so even though it doesn't personally affect me, it still annoys me. I'm just sick of them at this point.
@@clitmint Think you're missing the point there. They DON'T want it to be used on a vast number of platforms as it's in public development by them, It's sounds like it's more Alpha than even Beta. Once the bugs are ironed out then I can see it definitely filtering down to 12th & 13th Gen. They are restricting the "test" pool on purpose at this stage.
Great video, its interesting to talk about the potential benefits of better scheduling, both for Intel E-cores and the AMD X3D series with increased cache on 1 CCD. I saw some similar efforts from another small youtuber @Savitarax - who really put some effort into optimizing windows scheduling to run background processes on his 7950x3d on the high clockspeed / lower cache cores. I'm glad you guys are giving this area more exposure, as obviously if we get the right people aware of the problem it can be taken more seriously. It makes sense that this would be a bigger effort from companies like Microsoft, to work to automatically schedule the less important tasks on the 'background' cores. I'd love to see a follow-up video where you manually tuned a windows installation to push background tasks to the E-cores / AMD non X3D cores respectively, and see if you could pick up a noticeable framerate (or especially 1% lows) gain across multiple games - I know that would be a lot of work, so it probably won't happen, but it would be great to see. Either way, thanks for this look at APO.
@@saricubra2867 yeah, therefore games are singlecore only so you have cores left for a parked internet browser^^ Also why not use the extra performence when you don't multitask heavily (like most of the time)? We aren't gaming and rendering at the same time
@@emmata98 Ask that question to Hogwarts Legacy devs. Hogsmead can stutter even on the best Ryzen 9 on the planet. APO is the most disturbing thing ever because now game devs don't care about optimization anymore and you have to brute force it through AI driver stuff...
_This_ is the kinda GN content I live for - who else is gonna figure all this out and explain it to us? ...cuz I don't test sh*t, cuz _I ain't got time!_ For what it's worth, the e-cores in my 12700K have been good for me, but not in games. They're mainly good for like, non-real-time rendering nonsense.
This is really a good example as to why hybrid architectures like this are hard to do, and likely why AMD went the "just make the existing cores smaller" route
If this turns out like Intel’s route with performance vs cores where they had suddenly years of design to catch up to vs AMD it might solidify AMD’s position a bit more.
Not sure if using Process Lasso provide any benefits for CPUs with E-cores but it helped me a lot with game stability back when my system was CPU bound.
I love when companies make something exclusive to the windows store. and also when they launch a feature 2 gens late just to only support a couple games
Hopefully we got some kind of hardware scheduler on Arrow Lake and Zen 5. The ecores and extra ccd causing performance problems as well as having to micromanage programs is not fun at all. 7800X3D still god tier gaming.
Hopefully we'll see 10 or 12 cores on 1 CCD in the near future. A 10-12 core single CCD X3D CPU would be killer for gaming while also offering more cores. Some games are already optimized for 16 threads (just like the PS5) so even 2 extra cores would be nice to have on one CCD. Intel is extremely subpar for gaming. Not a good look when your CPU uses literally double the power for on average the same or less gaming performance, while costing more. The 5800X3D was a great success and the 7800X3D perfected it, consuming even less power than a 5800X3D while performing much better. Zen 5 X3D will destroy Intel if they don't come up with a solution fast. I'm curious what their next gen CPUs with L4 cache will be like, and if they can be air cooled lol. Let's be honest Intel is extremely lucky they have OEMs and brand recognition on their side during their own "Bulldozer era".
efficiency of cpu in gaming [Frames/Joule]= 13:53 [FPS] / 12:03 [W] , that is more correct way of talking about efficiency power is only related to the amount of work being done per unit of time, but we need to know if we are wasting energy or not by doing the work as efficeint as possible... +24% FPS needing -17% power means: +70% efficiency in F/J or -41% energy spent in doing the same work (less electric bill and less heat generated but also less energy spent in cooling if needed, last one not taken into acount as is an external energy use at room level).
An alternative might be "process lasso" a little bit technical tool to set application affinity to specific cores. Can be done with Windows task manager but process lasso remembers the settings after a reboot when you configure it that way.
They are playing the NVIDIA card, they release a new product that's effectively the same as the previous one but throw in software exclusivity to pretend it's an upgrade.
@@whiteren8749 No. Power budget. They're already in a less than ideal spot. Just giving people 24 P cores but limiting PL1 and PL2 to, say, 400W will give an average clock speed that is lower than what 8 P + 16 E cores offers because the voltage/frequency response is so different between the two. The E cores are mobil architecture derived. That 24 P cores would also blow out the die area to absurdity, while giving lower perf/watt and lower total perf is precisely the reason they had to start mucking about with a heterogeneous architecture in the first place.
I use to mine Monero, a good miner needed to optimize which cores could run. Often times we would gain over 50 percent extra performance for free. This, with my testing, I noticed disabled most threads, and allowed most cores to run uninterrupted. Btw, it worked on AMD and on Intel.
You mine monero on a CPU? Is it worth it at all? I remember mining it on a 6700XT and I got basically nothing, certainly not enough to pay the power bill. CPUs are much slower, no?
They're only detrimental under certain gaming scenarios, similar to how hyperthreading can sometimes hurt or help gaming performance. It's variable. The P/E core setup is actually pretty great otherwise. Most games function similarly whether E cores are enabled or disabled. And for multithreaded performance, it's insanely helpful because they've allowed 4 tiny cores in the space of what would've been maybe 1 extra big core. They shouldn't be mocked for trying to innovate in the desktop PC space. They should be mocked for not allowing this feature on 13th gen CPUs though.
That "example only" screenshot is interesting...Total Annihilation is an RTS from 1997 (and it's an incredible game with an extensive modding community active at TAUniverse)
If the secret sauce behind APO is really just a person or a team of people manually optimizing how the threads are scheduled for a specific application, that is not going to be sustainable as a product feature; unless you are talking about the kind of efforts nVidia made on releasing a new driver every time a new big-budget game came out (optimization being particularly important for DX 11 games).
One thing that does seem remarkable for the APO is that the 1%Lows get huge improvements as well. This is also true for Rainbow 6 Siege. The low fps occurency is much lower.
Seems similar to what can be done with process lasso. A lot of people have been getting good results moving everything to E cores on Intel and second CCD on AMD. And that works with anything. I wonder how the power levels compare.
I don't use process lasso. The problem with e-cores is that they have pretty bad singlethread perfomance, specially worse IPC than a 10 year old Haswell laptop. I let Thread Director do it's job and it works for me (i have the i7-12700K which is 8+4).
@@saricubra2867 I think you misunderstood my post. Or I did a poor job of explaining myself. I did not compare AMD to Intel. Just that people are doing things similar to what APO seems to be doing using Process Lasso. As far as making sure things run on specific cores. But, AMD CCD's are not always equal. In 5xxx chips, for example, the second CCD is usually weaker and clocks lower so it is a good idea to move everything to that one that is not mission critical that can not use more cores than one CCD has. Cross talk between CCD's is pretty slow relative to being on the same CCD. And with the new 7xxx 3D chips, you only have the 3D cache on one CCD. So, again, moving the OS and as much other stuff as possible to the second CCD is great for any games or other programs that can take advantage of the extra 3D cache on the first CCD. So, even on AMD it is advantageous in some situations to control what cores/CCD you are running various things on. You wont usually see big gains in average FPS, but 1% lows and stutters are dealt with very well. Similar to making sure your games are not using E cores on Intel. The gains just will be bigger on the Intel side since a P core is so much more powerful than the E cores.
I would like to see performance comparisons between APO and per-process E-core disablement via tools like Project Lasso. Since APO is making better use of the scheduler, it should outperform E-core disablement. I'd also like to hear more about what entities are responsible for optimizing scheduling. I was under the impression Microsoft was responsible for it in their OS. And why is this enhancement delivered via the Windows Store and not Windows Update? Is Intel circumventing a Microsoft process? Is Intel taking this upon themselves due to dissatisfaction with Microsoft's scheduling efforts?
@@OrjonZ define "best"? "best" can be fastest, but "best" can also mean less "total kWh per task". GamersNexus added this metric to their CPU tests a while ago and it is the metric that gets improved by efficiency cores, which is why the naming is proper. But that metric is unimportant for gaming.
They're optimized in many ways only privy to computer engineers. For example, context switching. A larger p core burns more power saving processor state and flushing the pipeline than an e core would. The e cores are better at certain tasks, such as frequently switching background threads in and out. Additionally, fewer execution units means less power is used per clock cycle when a background thread doesnt saturate the reservation stations.
@@KerbalLauncher Is the efficient context switching the result having fewer specialized registers, or is there a more interesting optimization involved?
@@solemnmagus Sort-of; the smaller size of the e-core means less state to save/restore on a context switch. I'm not sure I'd call this being more efficient though - it's just that there is literally less work to do.
@12:49: No, Steve! You can have lower priority tasks run on E-cores, there are many different tasks a game runs on the CPU and you can perfectly fine run sound or even AI subsystems on E-cores. Game developers generally have no clue about thread scheduling!
The OS is supposed to handle scheduling. A CPU is just hardware. If you want to see what a CPU does with no software running on it turn your PC off. It just kinda sits there like a lump, don't it? CPUs execute instructions. That's ALL they do! You need to feed it instructions for it to do anything at all. Well, that's not entirely true today but for the sake of discussion let's keep it simple. Your CPU actually runs Minix while it is powered up. If you have an Intel CPU, that is. But that just does internal housekeeping. It isn't externally exposed. I'm not sure if that executes on one of your counted cores or if there's extra logic? Kind of like a secret computer inside the chip.
@@RyanFennecthis seems to keep background stuff on the E cores to increase performance, turning them off is a hack at best Not saying that APO isn't a hack, but is a notch above crippling the CPU
future support will make or break this 'feature', the only silver lining i can think of is that the graphic driver team surely has tons of data on how individual games run because they've been hard at work polishing those arc drivers so if the cpu-team can make use of that to sort of streamline the process then it might be feasible in the long term
Would be interesting to see if restricting the affinity of tasks in Windows / Linux appropriately would also give similar results (i.e. restrict all OS / background tasks to E-cores, put game on P-cores).
@@jasonmitchell3307 I use W10Privacy with a preconfigured ini file combined with Shutup10 ( tools with many options) to remove/disable a lot of W10/11 bloat like unecessary background processes and preinstalled cr@p.
@@jasonmitchell3307 Fun fact: My PSU reports its power usage. I can quantify Windows background tasks as "10-15W of power at idle when compared to Linux".
One additional test that I think would be interesting would be running the games with APO disabled while setting the CPU affinity on the game process to only allow running on P-cores, either through Process Lasso or just manually changing via Task Manager. In theory that approach could provide a working optimization for 12th and 13th gen CPU owners who Intel is leaving out to dry with APO. I suspect it may still not be quite as good, as CPU Affiniity only affects the running game process but the GPU driver code may still run on E-cores, but even still may be an improvement over nothing at all.
My impression from Intels e-cores, is that they're primarily about efficiency in die size, not about the energy efficiency. Yes, they may also be more 'energy efficient' for some tasks where you'd either have to power a full core or just these, but we've already seen in the past that E-Cores are less efficient than P-Cores for the high performance workloads.
If you downclock a normal core, you get massive power efficiency improvements, due to how voltage and power scaling works. This is why laptops are still reasonably comparable to desktops. E-Cores aren't actually any more efficient than P-Cores, and I suspect they are a bit worse in many cases, due to Intel's horrible implementation. They work well for phones though.
Intel E-cores were about scraping narrow wins in benchmarks that a 10 P-core system could not do, as such instead of running at a power efficient frequency they've had to be pushed up the power wall.
One thing that HUB found that wasn't mentioned here is that the APO changes actually had a larger impact than just turning off e-cores. Looking at their e-core load distribution, it looks like 1 e-core per cluster was utilized. I wonder if they wanted to keep the extra 4MB cache per cluster without using too many e-cores. Given the hybrid scheduling oddities encountered with CS2, that would probably be the biggest title they could add to APO. Given that a solid chunk of the player base plays on 4:3 stretch with minimum settings, the CPU is probably the bottleneck for many people.
Problem with the HUB test AFAIK is that they disabled the e-cores instead of just setting the thread affinity of the game to the p-cores, by disabling the e-cores the rest of the system cannot use the e-cores either so it will cannibalize on the p-cores from the game.
@@Henrik_Holst I tried that. CS2 microstuttering stopped but also Battlefield 2042 lost 12% performance. So it depends on the game. One thing i do need to try is core parking though, I have C-States etc. disabled so the eCores were at 4Ghz
@@Henrik_HolstMoving non-game processes off the P cores is the obvious solution that should've been there on 12th gen, this is why I avoid Intel and dual-CCD CPU's, nobody can get basic shit right
@@griffin1366 interesting that BF2042 went worse by exluding the e-cores, I guess that game launches lots of threads then and thus benefits more from having access to lots and lots of cores.
@@Henrik_Holst Battlefield games have always been that way. They will take whatever you throw at them. You can still see gains in Battlefield 4 with new hardware, upwards of 600 FPS at low settings. Crazy how smooth that game feels to play.
Wait, wasn't this one of the things Windows 11 was supposed to be doing already? That was a big thing with Windows and Intel 12th gen and up; Windows 11 process scheduler was supposed to be able to delegate low priority and background processes to the E-cores and the CPU heavy processes to the P-cores. It was a big thing at the time of Windows 11 to get people with the latest Intel CPU to move away from Windows 10. Am I wrong, or is this even deeper than that? Is it delegating a specific operation within a process to the appropriate E/P core, instead of the whole process?
I believe this reallocates affinity for the in-game threads, like put renderer threads to p-cores and some misc threads (like ai scripts) to e-cores, something that windows scheduler have no way of knowing of how to do (all of them are game.exe threads) and that is why APO have to have custom profiles for each game.
@@marshallb5210 if I had to guess I'd say it was a research project that came up in discussion at a meeting and someone had the bright idea of releasing it. I can certainly see why Intel would. It does cast their latest hardware in a better light. Even if the scope is extremely limited. It also opens up a dialog Intel is keen on promoting too. Very few here seem to have gotten the point though. But they may come around eventually.
@@AlexeyDyachenko now it is beginning to sound like an API failing. Because with no way of labeling the threads, like you say, there's no way for the OS to know what to do with anything.
That's really interesting. I wonder if this is a Windows-only issue or is the scheduler on Linux also not great. Maybe that's why the application is Windows only?
100% this. Locking games to P cores only gives really good improvements in avg fps and less stutter/spikes on my 12700H laptop in some games. Like Factorio, where you'd expect running 100K logistic bots would be where the extra threads would help
This is exactly what I was thinking. Because they’re branded efficiency cores they’re not clocked as well so when they have a couple of 2.2ghz loads on a few of them apo brings up the wattage, since the cores aren’t chillin around 1ghz they’re heating up to act as if 2-3 e-cores are 1 full p-core
The opposite is happening, though, at least in Hardware Unboxed's coverage. There the total CPU power consumption goes down by about 10W while frame rates go up by up to 20%. E cores at 2.2GHz still consume a lot less power than P cores at 5.7Ghz. Exactly why is hard to say. It could be because of improved cache usage, but it could also be that the P cores run cooler and have more headroom to pull stored power (from capacitors for instance) when some processing step requires extra attention. If they can scale this out to most modern gaming titles from now on, they may have an answer to AMD's X3D tech.
More "efficient" cores reducing P core throughput and increasing power use sounds very much to me like it's causing cache thrashing Unfortunately this kind of stalling is invisible to consumer users, just shows up as CPU busy / 100% utilisation If the rumours of 8+32 Arrow Lake are true this problem may become worse without even more of these application specific bandaids. 🤮
@@greebj Hardware Unboxed's hypothesis was that using E cores for low priority/background tasks would free up level 3 cache for the P cores, which would then speed up high priority tasks. Hopefully, Intel will be able to turn APO into something that would either work relatively generically, or at least be built into compilers and/or frameworks like Unreal Engine. I say "hopefully", because Intel has become the underdog lately, and we as consumers are best off with competition.
What they should so to make it really good PR is to make it possible for the public to create their own custom profiles for game/CPU combinations. They'd effectively do a Bethesda and crowd source improvements for their tech.
From my testing it made zero difference anyway. GN came to the same conclusion. Windows 11 exists because Microsoft wants to make Windows about the Desktop PC again. They no longer have to rely on Windows to make profits and have since stopped chasing trends.
Yeah. Windows 11 uses some magical heuristics that are frequently wrong, while Windows 10 simply looks at thread priority, putting any subnormal thread on E-cores. Honestly, in my experience, Windows 10 gets it right more often than Windows 11. Good luck getting Windows 11 to put any service process on the P-cores (this causes a lot more problems than some may realize).
This sort of thing is why I'm not too comfortable with e cores yet, at least on X86. When all you have are P cores you don't have to worry about stuff like this, because all cores are equal except _maybe_ the boost clocks slightly.
So basically what I'm getting from this is that the only way that 14th gen could actually have a generational improvement is by fixing a "bug" for the new chips while leaving that issue intact on the older generations? Talk about slimy.
Nice vid! When is your Alienware Aurora R16 review going to be released? That is, if you are reviewing it. I just binged your other Alienware Aurora reviews. They are very entertaining to watch. I hope Alienware has made more drastic improvements to the new design. I haven't looked at any reviews for it yet, so I have no idea if they did.
I wonder if this will ever be supported on Linux, or even in Windows in a more general way. To me it sounds like a hack to fix scheduling issues by hand which should be dealt with on the OS-level.
I've added automatic thread affinity to Feral's Gamemode back in May (they have not merged it yet), question is though if APO does more magic than just thread affinity.
@@Henrik_Holst Intel posted some Thread Director patches for Linux some time ago which weren't merged yet. These work with a class and score system for tasks. My uneducated guess would be that APO is some kind of extension to that work on a per-application basis. But that seems very hackish and unmaintainable in the long run, I hope this work will be eventually morphed into something more generic and automatic, either on the OS-level or in hardware with a hardware-based scheduler.
@@seylaw AFAIK those patches are P-State patches, AFAIK ThreadDirector is something else that only works (or rather doesn't work) on Windows where the CPU does scheduling decisions something the P-State driver doesn't do on Linux since it's not really needed to how the Linux scheduler works. edit: my bad, apparently the real name for ThreadDirector is "Intel Hardware Feedback Interface" and is apparently only a performance metric for each core. Looks like it was mostly marketing when they talked about how it would handle the scheduling.
I’d forgotten just how far hardware has come since the release of R6 Siege. I remember playing this game on a 1050 and managing 1080p60 at mid-to-high settings and now we’re able to see over 600fps with it lmao
If they decide not to bring APO to all of 13th gen, I wish they would at least allow APO with the 13900KS since its basically the same as the 14900K. Also, would something like "Processor Lasso" be able to be used to optimize the thread scheduling on 13th gen the same way APO optimizes the 14900K? or is that not possible?
This is basically what Process Lasso does, theoretically. It would be great if someone with a 14700/14900 could test the difference between using APO vs just Process Lasso. I've been using it with my 12700k for a long time now, but it is really just to keep the e-cores free for background things like my browser while using the P-cores for heavier stuff.
Game with Big cores, let the Little corses take card of the background tasks - oh wait they don't actually do this for G12 or G13 and now only for G14 when supported. So why was these approach a good thing again?
At what point are we (Or Intel) willing to admit that instead of this 8 P cores and 16 E cores (14900k) malarkey, the better CPU would have 12-16 performance oriented cores??
Only for gaming. And desktop gaming is the smallest of all Intel's segments. Desktop is only 14% of the total x86 market (and AMD has taken a hefty chunk out of that) and gaming focused desktops are only a fraction of those 14%. So it makes no sense for Intel, from an RnD cost perspective, to launch a dedicated gaming SKU like the 7800X3D. AMD needed the 3D V-Cache technology for their contacts on scientific compute super computers and it just happened to also give them a gaming edge. That the 5800X3D beats the 5950X, and the 7800X3D beats the 7950X in gaming tells you all you need to know about the number of cores are needed for fast gaming. I have a 7950X3D myself, and use launch scripts to lock them to the best suited CCD, and only about 5-7% of my 250-ish games prefer the clock-speed one over the V-Cache one... and none of them improves by having all 16 cores available. That is not true in general, though, as city planners and games like that will chew any number of cores (and actually utilize intel's retard/efficiency cores perfectly well). But yes, if Intel wanted the gaming crown back they'd have to make an 8 P core, 0 E core die with, say, 72MB L3 cache and clock the absolute snot out of it. But the ROI in that would probably be in the negative.
Of course the real question is - once they've got enough games onstream with APO, do GN and everyone else use it for game benchmarks? It also seems to show that all the hype around Windows 11 thread scheduling and Intel Thread Director was just that - hype. Although I suspected something like APO would eventually come along when Intel first announced p-core/e-core architecture. But at the time I assumed it would be down to individual users to play around with the settings for each program. Rather than this top-down approach. Or the games would set APO themselves. Which may still be a long-term goal.
Guess I'm glad I have a 12400, no e cores lol. I care about efficiency a lot. I'm not a fan of how PC gaming usually ignores it. Or has misconceptions about how it works.
@@otto5423 Not to shit on you, just in response to your point. That's a lot of wattage, but outputs a really nice image. For me efficiency also invoves my standards. 200w GPU for my 1440p monitor feels like a bit much. Although maybe worth it to hit high framerates in certain games. Unfortunately I didn't pay much attention to efficiency when I built my new PC in January. The 12400 is plenty efficient, but the 6700XT is not from what I gather lol. Maybe if Intel puts out an efficient mid-range GPU I'll side-grade to that next year. For now though Intel's is the least efficient GPUs
This exact thing is why I went for R9 7950X. Both Intel and X3D have had "special" core scheduling issues, so I'll just sit over here like Thanos with my perfectly balanced dies
If Intel doesn't roll this back to 12/13th gen it might be the last time I buy Intel for a long time. Been using them since 1st gen i7 and before but I don't stand for shady tactics!
You don't stand for shady tactics? You're running Windows ain't you? You do realize the OS is supposed to handle all scheduling, don't you? So why are you blaming Intel? You think Intel writes Windows? Intel makes chips. What people do with those chips is their business. Shady that. What Intel is really doing here is showing you how bad your OS sucks. I bet Intel is actually pretty pissed that Microsoft aren't doing their job. It's making Intel look bad.
@@1pcfred Well Intel decided to go with a new hybrid architecture. I don't see how you can blame MS if all the cores are being utilized and scheduled based on performance and efficiency cores. All I'm saying is if Intel can squeeze out free performance from their CPU's why not do the same for others of the same design? To me that's shady considering it could be the only reason to consider 14th gen over 13th and they're gonna need some help selling them so why not make it exclusive? Sounds shady to me.
@@toddsimone7182 yes Intel makes hardware. It is up to programmers to utilize that hardware. That's the way it's always worked. Although if you're into vertical integration perhaps Apple is your thing? They make the hardware and the software. So one stop shopping. As far as why not Intel doesn't make software beyond tech demos. Intel makes chips. That's their thing. They invented chips. Well, Bobbie did before he founded Intel.
Ok, I have to ask the question, and I REALLY hope that someone with GN sees this and looks into it, but has anyone done any really in-depth testing using Process Lasso to better control the E-Cores? I personally have been using PL to tie most of my background tasks to the E-Cores only, leaving the P-Cores free to be used by my games as the games see fit. I have also used it to tie a game or 2 specifically to the P-Cores and make it so that those games aren't even aware that the E-Cores exist. I would love it if GN would do some digging to see what kind of effect (if any) they find using PL on it's own vs something like APO. Like I said, I have done some limited amount of testing on my own with decent results, but I don't have the free time or (more importantly) the know how to really come to any definitive conclusions on this.
Honestly I have a 12600k and this CPU had so many issues I disabled E cores all together. I still have some minor issues but I will be going back to AMD next generation.
Ryzen 7000 non 3D Vcache CPUs are underrated. Keep in mind, Threadripper 7000 doesn't have 3D Vcache and it certainly doesn't need it to dominate in both performance and power efficiency. Now if AMD releases a Zen 5 8995wx3D or a 9995wx3D with all core 3Dvcache, they will have less then nonexistent competition in the CPU market unless Intel 3D and Battlemage are groundbreaking!
Unless games are using all P-cores @ 100%, it would be interesting to see the benchmarks (99th frame time, etc) with E-cores completely disabled. (APO feels kinda like a band-aid to the fact that Intel thought it would be easier to schedule software between two different core types). There was an older Hardware Unboxed video that should that at least Rainbow 6 Siege performed better with them off. Edit: Nevermind, Hardwareunboxed did those tests. Very interesting. 🙂
APO is showing to be even better than no E-cores in R6, since there is some value in freeing up time on the P-cores if you can optimize routines for non-latency sensitive E-core scheduling. still seems like a bandaid for big/little challenges.
I think I'll heavily lean towards symmetric CPUs for peace of mind that software is not running in a weirdly gimped way. Unfortunately, it seems like Ryzen 7950X3D and 7900X3D do a similar thing to assign the 3D vcache CCD to games.
Great coverage GN. This is why Intel needs to put it's Ai RND energy hard into Ai acceleration of thread director. It's a win win for them, and as such would drive cpu sales and Ai sales as a showcase. Efficiency I think will be the first real boon of Ai, just imagine if intel had the first Ai DPU to feed all of a computers data draws, bottlenecks could be really known on a new level and responded to. Thread Directors real promise v.s. AMD's own big little apporach only shines when threads can be cascaded to by thread size, ability to be cache hit by L1 - L3 and timing latencies.
To me it's astonishing that little to no people, especially looking at the hardware reviewers here, have considered that the whole hybrid p-core + e-core architecture isn't going to cause problems in games. When i bought my 12900k new, right at launch, that was the first thing i was messing around with with this cpu : disabling e-cores. Boy oh boy do you leave some serious performance on the table in a lot of games, that aren't even that old, by running Intel 12/13/14th gen with their stock config (that being e-cores enabled).
12th Gen does have a general e-core consideration. 13th and 14th does not apply here. The issue is that the ring bus clock rate lowers when e-cores are active. A slower ring bus affects L3 cache performance and thread/thread communication. Chips and Cheese has an old article on this (alder lake ring bus) and they ran a couple of benchmarks and got 3-6% difference. Of course they special case rigged a worst case test to demo and test this 12th gen implementation. Fully loaded all p-cores and put a dummy load on an e-core to get the ring bus to declock. They benched apps where they had control over the thread count created for app work (aka to get a fully loaded p-core suite, and force affinity to said cores). Thread/thread communication is a reason for the history with scheduling and AMD CCXs. Mostly the older 4-core CCX designs.
@@AdrianOkay Yet it has been rarely covered, i mean properly covered. To me it seems like a lot of people still aren't aware of how much performance you leave on the table by just having the 12/13/14th gen hybrid CPUs in their stock config. That being when it comes to games at least. That being also not when it comes to all games. Some games do just fine with the stock config but a lot of them don't.
That would be the same as turning off e cores. You want control over individual game threads, putting the big ones on p cores while leaving small ones on e cores. I don't think you have that kind of control with process lasso.
I really like the idea behind this. But, with it being so specific, I'm wondering if Intel could develop this into a dev kit that software vendors can then use to make their own profiles. 'Cause a lot of people try to use Intel consumer desktop systems for music work, and multimedia in general; and the e-cores have been a problem, but on an app-by-app basis. Not only would a kit allow vendors to solve their own efficiency problems on their own, it would help secure adoption of this idea. Furthermore, I can only imagine how handy this could be for virtualization apps, too.
Can you imagine what this could do in a game studio's hands? Imagine a game detects a CPU SKU they were able to optimize for, like a 14700K or 14900K, and this little thing just goes "hey windows, I'll take it from here." Something similar could possibly be done for hybrid X3D CPUs or will need to be done for AMD's Zen5C hybrids, so maybe we end up with games or engines that have this slowly growing list of "optimized" CPUs.
@@DigitalJedi Well, the way it works now, an out-and-out profile would have to be built per application. And this is so early in development that Intel's crew are doing it by hand. However, I expect that if anyone would/could make this into a kit, it'd be them. And considering how far behind they are against AMD these days, they need any edge they can get. And it definitely would help with adoption, since there are still a lot of cases where the e-core design doesn't play correctly with deeply established apps (I'm not just talking games, obviously). That would be the ideal solution, though we obviously don't know how they're going to handle it, let alone what practical, working options they even have. Now, it's a cinch Intel isn't going to help AMD on this. So, never get your hopes up about X3D getting love if AMD isn't providing it themselves. But, you are definitely on the right track that AMD should sort out some kind of solution along similar lines as Intel.
Trusting software vendors doesn't work - we've seen this on mobile platforms, where when given the opportunity *every* developer labelled their application as needing high performance, because it made the application run better. And of course, if everyone says "give me high performance", this becomes a one step forward, two steps back situation.
@@artisan002 Eh? I didn't mention video games. I just stated what was learned from mobile devices. Android used to allow developers to specify the performance level of their application, and almost everyone said "max performance" - even things like not taking or email. That resulted in battery life issues, followed by the removal of that option. I'd think we'd probably see the same thing if PC developers were allowed to request P or E core - which would mean that all background tasks (e.g. Discord, Steam, Windows cruft etc) would request to run on P-cores, sending us back to square one.
This is a fantastic demonstration of the importance of scheduling and how much of a magic sauce it is But also it's clearly so much work there is absolutely 0% chance this gets maintained... It's completely a tech demo. It's incredibly ironic that this tech demo seems to show bigger scaling than the generational improvements.
I don't think e cores are meant to save power. Certain work requires certain energy - basic physics. All you could save is some (rather small) overhead. Also e cores are running lower frequencies. But then you could run performance cores at lower boost too. So this one doesn't really change anything. E cores are meant to fit a lot of cores in small space for faster completion of repetitive, data heavy tasks like (de)compression, OCR, encoding and alike.
I think if enough people care about this utility, then yes -- but they have to see that people care and Intel might have to expand beyond two applications to get that response.
@@GamersNexus Exactly this, so thank you for covering already. Can you or HUB maybe make a comprehensible tweet about it for the masses to share? I think the subject is a little too complex for some people to start complaining on their own. As a 13900KS owner I‘d very much appreciate that!
What i would love here is a comparison between windows and linux. New kernels seem to have a good scheduling approach there. Lets get people over to linux with games... worked for windows, right?
Why is there so much tuning involved? Can't they just have the programm check "is it that exe. that is running" and then force that to only use P Cores? And then give us a field where we can add our own exe's that will only run on P-Cores, why does it need so much work?
Wasn't thread director (and the "enhancements" in the win11 version of it) already supposed to address this stuff? And then they decide to release another feature for it. Needless segmentation on top of needless segmentation.
@@xomm Exactly my point. This is fix that was supposed to be sort of from beginning. If in 3 months it isn't available on many more CPU's then I'm going to be mad. For now I can give it a pass in the same way I can give AFMF a pass - it is a "tech demo". At least realistically. IDK about "officially", but that's how it feels.
Really interesting tech demo/early version of this feature. Curious to see if it goes anywhere further. In the meantime, we'll be busy on some CPU benchmarks for a while! Guesses as to why?
Check out our Steam Deck tear-down here! Really fun one - we learned a lot from Valve's engineers: ruclips.net/video/9jhRh11bTRA/видео.html
@GamerNexus , What cpu should i buy the 7800x3d , 13900k , 14900k ? I want to stream, gaming, and also have multiple tasks of google open ,spotify , and also maybe edit some videos . So what should i do
@@AlexManMichael Get the 7800X3D, uses a little more than 1/3 the power and has better gaming performance and you'll actually be able to upgrade using the same motherboard.
But for the tasks i want to do its more than enough ?
@@mugabugaYT
All of these are overkill if you consider 'spotify', 'having chrome tabs open' and 'maybe edit videos' as some of your tasks.
7800x3d would be best. It's slightly slower due to clock when rendering etc, but absolutely owns in gaming. Plus, it sips power (insanely efficient). Unless you're video editing like Steve, Linus, jayztwocents etc, 8 cores and 16 threads are more than enough. Plus, as mentioned, AM5 platform means future proofing for later CPU generations. I have 7800x3d and I'm so glad I chose AMD this time round.
I'm keen to check out your testing Steve and excited about more charts! :)
First comment to another youtuber’s comment
I was relieved to see you guys reran Rainbow Six at low settings so we wouldn't have to go hunting for another couple percent, hahaha! Nice to share the workload on stuff like this.
Thanks Steves!
How does one test a "Steve"?
@@thecivilizedgamer2533 The rare double thankssteves. You don't want to know what summons with a triple.
Artificially locking features to sell underwhelming current gen products. Sounds like they're learning from Jensen.
sounds like they're learning from apple
more like early tensor cores that didn't get frame gen@unholydonuts
I suspect they're even limiting to the 14700k and 147900k specifically to upsell from the "1x600k is enough for gaming" message reviewers have been repeating for several generations now.
An i3 is "enough" these days the 12100f is goated
an 3770k is enough@@BonusCrook
With this kind of hyper specific optimization, they will slowly dig a hole which they need to get out off very fast, before the hole gets too deep.
Naa... They probably dumped this out because they know the "14th gen" CPUs are a joke, and this provides some more segmentation from the 13th gen CPUs. It'll get dropped with an excuse that they've worked with M$ to incorporate the key features into the windows scheduler.
@@meeponinthbit3466 most likely. reminds me of the 40 series and dlss 3.5. Why can't it run on the 30 series? Prob cuz they ain't selling the 40s as well
@@meeponinthbit3466agreed - it feels very reminiscent of: "oh yeah... DLSS3 framegen tooootally only works on the 40-series..."
Yeah I had no idea this was a thing on the latest intel CPU gen (have a 8700K). Upgrading now. Ordered the DDR5 RAM, new CPU cooler, but haven't locked my self in yet (haven't bought the mobo or CPU). This is making me consider going with AMD. I was already having a very hard time deciding *sigh*
@@DaysieMandlss 3.5 does work on 30 series cards. I have Ray reconstruction running on cyberpunk on my 3090.
The power consumption reminds me a REALLY old test with those Atom processors. The Atom processor was SO SLOW that ended up consuming more electricity than the faster, modern processor despite have lower power load from the wall all because it took more time to complete the task.
So it's just basically moev the problem from one place to an other ?
Two things: 1. I'm missing an "e core disabled" benchmark, in addition to apo on/off.
2. CPU's are starting to become like the latest car engines: extremely complicated just to lower CPU consumption a little. And that's not something good. Now we need to add some kind of utility to re-schedule tasks.
Is not complicated, the i9s just are fundamentally flawed, 33% made of P cores and 66% e-cores kills the entire point of Big-Little.
Apple doesn't do this for the M-series of chips.
The fact that this is just scheduling means that without "APO", the processors are performing incorrectly...making it additionally insulting that they won't enable these application-specific fixes for the processors from which these 14th generation chips were photocopied
Traditionally scheduling is done by the OS, not by the CPU itself. This shows more how incompetent and uncaring Microsoft is, IMO.
Oddly enough you don't have these massive scheduling issues on Linux. Like the poster above me said: Microsoft doesn't care. They've not cared since windows 10 when they fired their QA team
@@Winnetou17
Now you know how AMD felt the whole time, especially during the FX era.
Microsoft always geared Windows to work best with Intel. So it's a nice, but tiny change, for once.
In addition to Windows' infamously iffy thread scheduler, it also shows the inadequacy of Intel's Thread Director driver. You know, the thing that was supposed to help tell Windows what the "needs to be fast" threads are and which can be given slower billing. What APO finally delivered, but through this awful per-application hack?
eCores disabled doesn't give massive gains like this.
It's a Windows problem that Intel is try to fix on its own.
This is why my Intel rig was built with an i5 12400. I never had much trust in the big/little thing. Same thing with my new AMD build - I went with a single CCD 7800X3D. I can't believe 7900X3D/7950X3D owners need to use the Xbox Game Bar to make sure Windows is specifically prioritizing the 3D V-Cache cores...
Sometimes cheaper / simpler is better. Throwing money away doesn't always get you the best experience with all of these gimmicks in modern CPUs.
I got the same i5 CPU earlier this year cause I too wasn't sold on their big.little concept. It seems to be the future since it's been a thing forever on ARM chips and AMD has said they'll try their hand at it soon so it's bound to get better eventually but it's not quite there yet.
I agree, im pretty sure your desktop experience is far better with a 12400, my 12600k has one or two e cores pegged out doing mundane things on desktop causing hitching and lag, all while p cores are sitting there doing nothing.
Yep, I pretty much refuse to buy anything with mixed cores. I'll probably have no choice at some point, but I am sticking with an older CPU for now and basically hoping this idea goes away, haha. I hate it. Mainly because I know there's no way to make a magic scheduler that can somehow know which core to put which threads on; something APO seems to prove when they have to do this amount of work for each game, on each CPU SKU, for it to work properly. It's just ridiculous, and not something I want to buy into until there's no other choice.
@@LoudlevinChange your power plan to High Performance. This should allow tasks to be promoted to the P-cores more aggressively. There were also some BIOS patches during 13th gen that quietly addressed some weirdness of low E-core counts for chips like the 13400. Going up to the most recent might iron some of that out.
@@theviniso I am very sure AMD will face the same problem with Intel as well with their own big.little cores. AMD has only claimed feature parity with both Zen 4 and Zen 4c. However, I'm certain that the IPC of both types of cores are different. It is up to Windows to schedule threads for the correct type of cores.
My worry would be with such targeted optimisation you’re likely to see varying results with different hardware, and find significant changes or actual problems when games get updates and patches.
I would just set processor affinity in task manager, and likely achieve the same result.
@@xerideapeople have tested this it doesn’t do the same thing, APO still performs better than setting affinity or disabling e cores
@@alexnulty8902 have they really rested affinity and not just disabling cores because disabling cores would remove the e-cores from taking over load from the rest of the system leaving more for the game?
@@Henrik_Holst Hardware Unboxed did.
You might get similar results if you carefully chose and set the affinity per _thread_ . Not just for the whole process.
Of course, that requires pretty deep knowledge into how that game's engine works. Knowledge that would be extremely difficult to figure out from only empirical observation.
curious to know if simply forcing all processes except the foreground game to the E cores & put the foreground game only on P cores using process lasso would achieve similar results to this on 12th/13th gen or not
Sounds like something Windows should have been doing since 12th gen
If you ever bring metro exodus to the test suite consider using the starting area from the sam's story DLC. Right before the first combat sequence there's a broken road overlooking a bridge that absolutely murders cpus.
I recently played that game at 4K maxed out. I recall my i7 9700k and RTX 3080 running that scene pretty well. Where my PC totally got wrecked in Sam's Story was the boss fight with the bat towards the end of the game. My framerate was dropping in the low 10s.
@@dwu9369 Does "4k maxed out" mean no DLSS upscaling and full ray tracing?
@@Mcnooblet Actually I had DLSS Quality on with full ray tracing. The entire game would've been unplayable from start to finish at native 4K with full ray tracing.
I've said this at HUB's video, as well:
In the end, Intel basically admitted that Thread Director still sucks, at least for gaming. This is really hand-tuned scheduling, because they can't just disable TD altogether and fall back on Windows' standard scheduling for games, either.
I wouldn't even call this a "tech demo", simply because it's not tech. Call it "craftsmanship" if you must - because that's more in line with people at Intel painstakingly pre-determining scheduling for a specific application. What a horribly unsustainable way of going about problems just because marketing told you so. Really reminds me of that oberclocked Xeon at Computex.
> fall back on Windows' standard scheduling for games
I am very sure there are no fallback for scheduling. Because from the start, scheduling is OS job and not CPU job.
This APO is basically because Microsoft not doing their job for YEARS to handle better scheduling, and then Intel taking matter into its own hand.
Also intel made this e core shit and wast ressources that can be help full in somting matter for marketing
@@LaCroix05 why not just make a product that isn't shit and comes out with a scheduler out of box
@@randomsomeguy156 LOL, please don't comment if you don't know what you are talking about.
OS scheduler is a PROGRAM/SOFTWARE that an operating system uses to place running process in the right thread/workload. OS scheduler is not HARDWARE. It's not a transistor or logic gate that you put in the CPU.
What we need, is for Microsoft and Intel to WORK TOGETHER to 'fix' the Windows scheduler. As the Windows scheduler has been proven again, again, and again, to not work properly with Intel P and E core models.
It's basically the fault of both companies. Why don't they just work together to address this issue.
Blaming just Intel for this is stupid and basically don't know anything, like you.
@@LaCroix05 Actually, Windows does the scheduling - the "OS' job" as you put it. However, we're way past the times when this could be done silicon agnostic.
Take an i5-12400 or an R7-5800X for example: Even on a fairly uncomplicated CPU (when it comes to scheduling, that is) with only equal cores, on one CCD in AMD's case, there's still "better" and "worse" cores that clock higher and lower, respectively. Windows can and does take this into account, deciding based on fMax. (Simplified explanation for a very complicated problem!)
Now, if you go AMD's way and add c-cores, which have the exact same capabilities as their regular brethren, just with lower clockspeeds, that works out fine. Intel's e-cores however are fundamentally different from p-cores, down to the very microarchitecture and instruction set. They also don't have HT. That would wreak havoc with the whole scheduling and thus Thread Director was created. Intel worked a lot with Microsoft and TD is implemented on the very silicon to "guide" the OS' scheduler. In the end, it's still the OS that decides where to assign things, but with help.
AMD ran into a similar, albeit easier to work around problem with the R9-7950X3D. The OS would always assign threads to the higher clocking non-X3D chiplet first - which is fine, in general, but not for games and other very cache-dependant workloads. Hence the XBox Game Bar integration that tells the OS what's a game and when to prefer the low frequency/high cache CCD.
However, the problem that p-cores and e-cores are two very different architectures remains. Think of it as basically having two completely different CPUs in one system. Intel will always run into the same problems and will always have to find better ways to manage thread assignments by the OS' scheduler.
And no- that's neither Microsoft's nor any other OS provider's job. That responsibility lands squarely on Intel's shoulders. They decided to make heterogenous CPUs with fundamentally different cores, they have to figure out how scheduling for these can work. It's no wonder that they didn't implement the same principle for Xeon Scalable. No one would have put a CPU into their datacenter if they had to rework their whole scheduler just for that. The reliability issues would've been a nightmare at no tangible benefit whatsoever.
Consumer PCs are different in that regard. Intel were faced with the problem that their core architecture was so overextended that it just didn't scale any further. Comet Lake was already pushing the limits and Rocket Lake even had to scale back to work with what 14nm could do. 10nm (Intel 7 these days) didn't give them the headroom to scale up, either, and AMD was literally clobbering them with high core count CPUs. Disaggregation was still far out on the horizon, so they did what they could. 8 enormous p-cores was the limit they could reasonably fit into a client CPU's die and then they filled up the rest with space-efficient e-cores - scheduling issues be damned.
Now they are set firmly on that road and have to see it through. RLR is basically a bust, Arrow Lake still looks hopeful, but anything after that... i don't know. We'll see. Having manually optimized scheduling like they're doing now with APO still looks like a marketing stunt to me that's unsustainable long term. Maybe AI can fix that for them, wh knows.
I've never liked the idea of e-cores because of these types of problems.
Don't they work just fine on ARM chips?
@@theviniso Without looking into it, my instinct is that it's a RISC/CISC issue; with x86/x64 CISC nature making its threads far more complicated to efficiently schedule, especially with hybrid core architectures.
It's a guess though; I could be entirely incorrect.
Either way I too would be interested to know the answer!
Using big and little cores on x86 high performance desktop CPUs is not a good move. I just hope AMD sees this and doesn't follow the same idea. With laptops, perhaps the trade-off is acceptable
@@theviniso
Does Windows run on ARM chips?
I've never bothered to look at ARM, so i really have no idea about it.
@@clitmint Apparently it does but not very well. I've never used it though.
This APO thing could've been and still could be massive if it worked on all 12th, 13th gen and "14th gen" on all games.
This +30% could've put intel back in the ring.
Let's just hope they'll figure it out and either drop a software update so scheduling works as it should or make it work well with 15th gen.
This, to me, smells of a feature that the marketing side forced the dev side to release before it was ready for public consumption, as well as artificially limit to 14th gen. Not a good move, Intel.
@@greggmacdonald9644
It's not a good move like you say, but then again, Intel knows their fanbois will still lap it up, as they always have done.
"Could've put them back in the ring"
They were never out though..?
@@clitmint Yeah, idk about that. If in a few months they've added support and provide significant benefit for a bunch of popular games, then maybe. If it's still like it is now, then everyone will forget about it. As Steve said, rn it's a tech demo and nothing more.
@@griffin1366 they kinda were, 250w down to 220w is still a far cry compared to 100w-180w on AMD especially when you realize relative performance is same for both sides
and here intel uses shitload more cores + software optimizations to catch up to AMD while AMD just throws L3 cache onto CPU's because software optimization is way more financially and time demanding
efficiency cores are nice but they require all parties to work on optimization for each application existing on the market while extra L3$ is redundant in design and fool proof in regards that it basically improves majority of workloads by existing
while we still cheer for E cores to be a thing we can't ignore that E cores a software engineering nightmare and that it is time for microsoft to fix their god awful task scheduling
damn the video production value of these newer gamers nexus videos is on point! great lighting and depth of field on the shots of Steve
Thank you! I'll pass that along to Vitalii on the video team! I love when the video team's camera work gets noticed like this, and I know they do too.
Came for the tech, stayed for the eye candy.
This is the reason I went with the 7800X3d . Big little requires to much gymnastics to get good performance out of it. This feels like intel desperately trying to squeeze more performance out of the chips. If we had a standard syscall that allowed programs to ask for a big or little core per thread maybe it could work, but that also requires the game developers to care.
To be fair, X3d hybrid CPUs have just as many problems with scheduling as E-Cores do. AMD's own answer was "if program is on this list, park non x3d cores. If on other list park performance cores". The scheduler isn't looking at something like cache misses and making a heuristic determination.
Scheduling is hard.
@@arthurmoore9488 yea the issue is just hybrid cpus it seems, for example the 7800x3d works perfectly fine bc it ONLY has 3D vcache cores compared to the 7950x3d which has 8 3D V-cache cores and 8 standard cores which causes all these issues with scheduling
@@arthurmoore9488 Yup, 7900x3d and 7950x3d are actually much worse in that regard than any Intel cpu with e cores. 7800x3d is the only one that works...if it doesn't explode first that is, hehehe.
It exists ! It's called Process Lasso and complex cpu users are happy of the results. And the best is that you have complete control over what your cpu does. Counter strike benefits from using faster cores? Force the game to run only on those cores with a couple of clicks.
This is a tool that should have been made either by Intel or Amd, it would have been much easier from the beginning
Zen 4C is a better idea.
Thanks Steve and team for this great explanation and video. Very interesting.
Seems like this is something intel should have figured out by now.
Oh but they did figure it out, and in their figuring, they figured the fanbois would yet again "upgrade" to the new thing, as they've always done... but when it didn't happen how they thought it would (because people are finally getting sick of Intel's bullshit) ... they roll this out to entice you all to "upgrade".
Intel knows how to get the money out of your pockets and into theirs.
They figured out that Microsoft sucks a long time ago now.
In fairness, deciding which tasks have a lot of work to do ASAP, and/or which threads will bottleneck responsiveness, is *HARD* , inherently so.
In fact, it is probably impossible to solve exactly in the general case (no method that can work for all cases).*
That said, good heuristics on "impossible to solve exactly" problems can still give a good approximation for most real world cases. And developers already have moderate ability to communicate intentions and needs of a task to the OS, that Thread Director seems to be underutilize.
* Probably impossible because it is an "adjacent" problem to the halting problem (will a program finish or loop/freeze forever on some input?). The halting problem has been known to be, in the general case, impossible to answer exactly for decades.
@@TechSY730 a popular scheme is to run timers and kick processes out periodically. Done or not there's other things waiting. I'm running 230 threads on a 4 core CPU. Limited resources need to be shared. But yeah nothing is going to be ideal. The OS just has to do the best it can. If I want things done faster then I need to get a faster PC. Which I have been thinking about doing. I'm just not quite there yet.
They did, when you have a chip my like i7-12700K. No one expected 16 e-cores and 8 p-cores
Nice to see Total Annihilation in the example list from Intel. A single core game released in 1997 will surely benefit from the tool.
Stuff like this is exactly why I went with AMD. The hybrid architecture feels like slapping a bandaid on a gunshot wound and for some reason the bullet keeps heating up and pulls 300W.
Nah, BIG.little would be absolutely better way of doing things IF the thread scheduling really worked. Hopefully it does in the future (and AMD is going to the same direction anyway).
Hybrid architecture works on a chip like the i7-12700K. That i9 with PBO is still worse perfomance per watt than my i7.
@@CuthaluThe i9s never were power efficient.
@roenie Big-Little is here to stay, the 7900X is not a monolithic chip and that by design hurts it's IPC and memory latency.
This feels like a prime opportunity for some AI learning to kick in and make this slow, laborious optimization into something the machine does for you.
BUT THEN THEY'D BE TAKING JOBS AWAY FROM PEOPLE WHO... don't actually want to be stuck doing this kind of grunt work. This is an excellent use of AI.
intel and innovation? good one
I mean it is about time we put ai to an actual use instead of using it for whatever some super rich people envision.
Or just program a classic algorythm
I think it's a good idea, but it needs a few month more development, and Intel has to add support for at least 13th gen. At the moment they lose their own customers because people are mad they bought a pretty expensive CPU this year and already out of the game. It's a really bad move and I hope there will be more pressure on them to add the feature to all 1700 cpus.
12gen too needs support... is a recent cpu.
Bad news, Hardware Umboxed channel covered this including the fact that Intel has stated that they will not support older CPU's regarding APO, only 14th gen and onwards :(
Of course. That is Intel's m.o. The 7800x3d and 7950x3d are readily available and beats Intel's new 14th gen at virtually everything. Zen 5, roughly 6-months away, will add to the lead while reducing the price of the current Zen 4 champs. Intel is behind AMD in cpu, much like AMD is behind Nvidia in graphics.@@DrKriegsgrave
I mean Intel has been doing the same since the core lineup was created and they keep selling well, why would they give it to 13th gen when they are exactly the same as 14th gen? They need to sell those CPUs somehow, otherwise 13th gen users would have no reason to upgrade.
@@andersonfrancotabares3614 When exactly did Intel exclude prior CPUs from software solutions?
They won't sell nothing when more and more Intel users switch to AMD because of the bad sevice. Intel doesn't win customers with this, they lose them.
I've had to tell people to use process lasso to get games off their e-cores when their performance is odd for a while and it's sort of nice to see even Intel struggling to manage scheduling for them. e-cores have their good, and turning them off isn't great when there are multiple ways to get the wrong programs from sitting on your e-cores.
I think both intel and AMD need a user accessible way to prioritize which cores are used on their split core designs on a per program basis.
I think it's more on Microsoft to do the work. Windows clearly recognizes which CPU is fitted to the motherboard, so it should have code in Windows to adjust the environment to work at it's best/most efficient per-vendor.
However, Windows has always been geared to work better with Intel. For obvious reasons of course, like Intel just threw a ton of money around, and was the leading CPU for such a long time anyway.
It was always AMD that got the bad end of the deal from Microsoft. Now AMD is in their consoles, they kinda have to play a little bit fairer toward AMD.
it's called Linux ... where you can bind specific applications to specific CPUs already via different means. This is basically Intel doing MS work. However to their credit. MS need to deal with a lot more CPUs so they cannot go all out on optimizations ...
I think Microsoft should add an option that says CPU preference, then you choose if you want multitasking or multithreading
@@saricubra2867 It is actually already doable, though not easily set. The easiest way is to use a program called *_Process Lasso_* ; it will detect heterogenous cores, and can automatically set "CPU Affinity" to a setting you determine to be the best.
These sorts of problems (that the APO tool is meant to address) is what I immediately assumed was going to happen when the P/E core thing was first unveiled. I'm still not convinced that's a better design than uniform cores. We saw similar problems with the above 7800 X3D chips from AMD, where the CCD with the 3d cache became the "P" cores.
Not to mention that Intel Thread Director was supposed to aid this kind of assignment among E/P cores from the start.
So now it sort of feels like Intel is making developers do manually what Thread Director's job was supposed to be all along. Remember, APO is something that has to be tuned manually for each individual program; it is not automatic.
The split-cores on Intel's 12th-14 Gen CPUs is one thing, and largely are Intel's own fault for doing it that way.... but the problems you mentioned with AMD, aren't actually problems with AMD at all. The problems lay exclusively with Microsoft and even developers.
The hybrid core designs offer massive improvements for multi threaded workloads, at least ones tested here, but for these gaming workloads it is still causing issues. However, you can always either 1. Pin misbehaving games to P-Cores, or 2. Disable the E:-Cores if you're just gaming. Although then, the 7800X3D is the best option anyway.
It's a shame that mixed core designs from Intel still have problems at the software level, because the processors otherwise are pretty impressive hardware. Maybe with this hardware scheduler people talk about for Meteor Lake and onwards they can overcome Microsoft's laziness in updating Windows core scheduling.
@@jamesbuckwas6575You're probably right. But my issue is that most people that buy it aren't going to know what a P or E core is, or what a bios setting is. They're just going to see a 20 core processor and think it's going to perform heaps better at gaming than an 8 core AMD processor. Something AMD was guilty of in the past with their "compute cores" marketing and the whole bulldozer core count disaster.
There's no way that intel isn't using this as an excuse to inflate their core counts for marketing to the uninformed.
I think it's perfect for the mobile/laptop scene but doesn't belong in their high end desktop cpus.
@@slimjim2321 I agree, the marketing around 14 core i5's should be improved. But that is a separate issue from the actual hardware, especially for people like us who are better informed about these products. Perhaps 6+8 cores could be better, or treat the E-cores as "half cores", or something better.
Thanks to the “Steve’s” for covering this new tech in great depth
Sounds more like a Proof of Concept scenario which can be rolled out bit by bit.
Agreed, it seems like a proof of concept. Hopefully they do more if they're going to stick with it.
@@GamersNexus It would be good if they could. Presumably they'll want to automate it somehow, and at this stage they are not making lavish claims about how many games it will apply to and it seems that this is Intel driven NOT Developer driven, at least at the present. It's a sort of "We've got to get our house in order but it will be beneficial going forward".
Sounds like a scheduler bug fix / driver update that they're choosing to deny releasing on "older" (but identical) hardware, and instead package as a "new" feature.
👎
@@GamersNexus
Its' also proof that Intel still thinks very little of their fanbase. Not allowing this to work for 12th and 13th Gen parts, is a spit in the face.
I stopped buying Intel at the 10900K, so even though it doesn't personally affect me, it still annoys me. I'm just sick of them at this point.
@@clitmint Think you're missing the point there. They DON'T want it to be used on a vast number of platforms as it's in public development by them, It's sounds like it's more Alpha than even Beta. Once the bugs are ironed out then I can see it definitely filtering down to 12th & 13th Gen. They are restricting the "test" pool on purpose at this stage.
Great video, its interesting to talk about the potential benefits of better scheduling, both for Intel E-cores and the AMD X3D series with increased cache on 1 CCD. I saw some similar efforts from another small youtuber @Savitarax - who really put some effort into optimizing windows scheduling to run background processes on his 7950x3d on the high clockspeed / lower cache cores.
I'm glad you guys are giving this area more exposure, as obviously if we get the right people aware of the problem it can be taken more seriously. It makes sense that this would be a bigger effort from companies like Microsoft, to work to automatically schedule the less important tasks on the 'background' cores.
I'd love to see a follow-up video where you manually tuned a windows installation to push background tasks to the E-cores / AMD non X3D cores respectively, and see if you could pick up a noticeable framerate (or especially 1% lows) gain across multiple games - I know that would be a lot of work, so it probably won't happen, but it would be great to see.
Either way, thanks for this look at APO.
Or the age old thing with using more cores in general
@@emmata98That would hurt multitasking.
@@saricubra2867 yeah, therefore games are singlecore only so you have cores left for a parked internet browser^^
Also why not use the extra performence when you don't multitask heavily (like most of the time)?
We aren't gaming and rendering at the same time
@@emmata98 Not every game out there scales with core count
@@emmata98 Ask that question to Hogwarts Legacy devs. Hogsmead can stutter even on the best Ryzen 9 on the planet.
APO is the most disturbing thing ever because now game devs don't care about optimization anymore and you have to brute force it through AI driver stuff...
_This_ is the kinda GN content I live for - who else is gonna figure all this out and explain it to us?
...cuz I don't test sh*t, cuz _I ain't got time!_
For what it's worth, the e-cores in my 12700K have been good for me, but not in games. They're mainly good for like, non-real-time rendering nonsense.
Man your camera quality is on point now!!
This is really a good example as to why hybrid architectures like this are hard to do, and likely why AMD went the "just make the existing cores smaller" route
If this turns out like Intel’s route with performance vs cores where they had suddenly years of design to catch up to vs AMD it might solidify AMD’s position a bit more.
5:30 in - excellent use of (sic) and very welcome. Thank you.
Not sure if using Process Lasso provide any benefits for CPUs with E-cores but it helped me a lot with game stability back when my system was CPU bound.
I live for videos like this, Thanks Steve!
I love when companies make something exclusive to the windows store. and also when they launch a feature 2 gens late just to only support a couple games
Good job pointing out a new marketing feature "inefficiency cores", thanks steve
Hopefully we got some kind of hardware scheduler on Arrow Lake and Zen 5. The ecores and extra ccd causing performance problems as well as having to micromanage programs is not fun at all. 7800X3D still god tier gaming.
Hopefully we'll see 10 or 12 cores on 1 CCD in the near future. A 10-12 core single CCD X3D CPU would be killer for gaming while also offering more cores. Some games are already optimized for 16 threads (just like the PS5) so even 2 extra cores would be nice to have on one CCD.
Intel is extremely subpar for gaming. Not a good look when your CPU uses literally double the power for on average the same or less gaming performance, while costing more.
The 5800X3D was a great success and the 7800X3D perfected it, consuming even less power than a 5800X3D while performing much better. Zen 5 X3D will destroy Intel if they don't come up with a solution fast. I'm curious what their next gen CPUs with L4 cache will be like, and if they can be air cooled lol.
Let's be honest Intel is extremely lucky they have OEMs and brand recognition on their side during their own "Bulldozer era".
efficiency of cpu in gaming [Frames/Joule]= 13:53 [FPS] / 12:03 [W] , that is more correct way of talking about efficiency power is only related to the amount of work being done per unit of time, but we need to know if we are wasting energy or not by doing the work as efficeint as possible... +24% FPS needing -17% power means: +70% efficiency in F/J or -41% energy spent in doing the same work (less electric bill and less heat generated but also less energy spent in cooling if needed, last one not taken into acount as is an external energy use at room level).
That 1.5 star rating lol
hahaha, noticed that also
An alternative might be "process lasso" a little bit technical tool to set application affinity to specific cores. Can be done with Windows task manager but process lasso remembers the settings after a reboot when you configure it that way.
Could we ask for one extra test and results for disabled E cores and enforcing Core Affinity to P cores please?
They are playing the NVIDIA card, they release a new product that's effectively the same as the previous one but throw in software exclusivity to pretend it's an upgrade.
E-cores are mainly die space efficient not power. Only their limited clock speed prevents them from wasting as much power as the P-cores
The only reason Intel uses "E-Cores" is because they can't have a CPU with all regular cores due to power draw.
literally not true. They can put 4 ecores in the place of 1 pcore that will scale better in any multicore app
Yes, it's to pack as much multithreaded performance in as they can without completely blowing out the thermal/power budget.
shhhh don't tell the truth
@@GamersNexus Die area budget, not power or thermals.
@@whiteren8749 No. Power budget. They're already in a less than ideal spot. Just giving people 24 P cores but limiting PL1 and PL2 to, say, 400W will give an average clock speed that is lower than what 8 P + 16 E cores offers because the voltage/frequency response is so different between the two. The E cores are mobil architecture derived. That 24 P cores would also blow out the die area to absurdity, while giving lower perf/watt and lower total perf is precisely the reason they had to start mucking about with a heterogeneous architecture in the first place.
I use to mine Monero, a good miner needed to optimize which cores could run. Often times we would gain over 50 percent extra performance for free. This, with my testing, I noticed disabled most threads, and allowed most cores to run uninterrupted. Btw, it worked on AMD and on Intel.
You mine monero on a CPU? Is it worth it at all? I remember mining it on a 6700XT and I got basically nothing, certainly not enough to pay the power bill. CPUs are much slower, no?
They should be called inefficient cores instead and performance cores instead
economy cores
walmart cores
They’re space efficient, that’s the point of them as far as I’m aware
@@Rogerkonijntje Temu Cores
They're only detrimental under certain gaming scenarios, similar to how hyperthreading can sometimes hurt or help gaming performance. It's variable. The P/E core setup is actually pretty great otherwise. Most games function similarly whether E cores are enabled or disabled. And for multithreaded performance, it's insanely helpful because they've allowed 4 tiny cores in the space of what would've been maybe 1 extra big core.
They shouldn't be mocked for trying to innovate in the desktop PC space. They should be mocked for not allowing this feature on 13th gen CPUs though.
That "example only" screenshot is interesting...Total Annihilation is an RTS from 1997 (and it's an incredible game with an extensive modding community active at TAUniverse)
Pretty sure any E-core should handle that game fine but who knows? lol. I love that game, btw.
If the secret sauce behind APO is really just a person or a team of people manually optimizing how the threads are scheduled for a specific application, that is not going to be sustainable as a product feature; unless you are talking about the kind of efforts nVidia made on releasing a new driver every time a new big-budget game came out (optimization being particularly important for DX 11 games).
One thing that does seem remarkable for the APO is that the 1%Lows get huge improvements as well. This is also true for Rainbow 6 Siege. The low fps occurency is much lower.
Seems similar to what can be done with process lasso. A lot of people have been getting good results moving everything to E cores on Intel and second CCD on AMD. And that works with anything. I wonder how the power levels compare.
I don't use process lasso. The problem with e-cores is that they have pretty bad singlethread perfomance, specially worse IPC than a 10 year old Haswell laptop. I let Thread Director do it's job and it works for me (i have the i7-12700K which is 8+4).
The benefit of CCDs on AMD is that they are equal, perfomance is very consistent, but they would use far more power for light workloads.
@@saricubra2867 I think you misunderstood my post. Or I did a poor job of explaining myself. I did not compare AMD to Intel. Just that people are doing things similar to what APO seems to be doing using Process Lasso. As far as making sure things run on specific cores.
But, AMD CCD's are not always equal. In 5xxx chips, for example, the second CCD is usually weaker and clocks lower so it is a good idea to move everything to that one that is not mission critical that can not use more cores than one CCD has. Cross talk between CCD's is pretty slow relative to being on the same CCD. And with the new 7xxx 3D chips, you only have the 3D cache on one CCD. So, again, moving the OS and as much other stuff as possible to the second CCD is great for any games or other programs that can take advantage of the extra 3D cache on the first CCD.
So, even on AMD it is advantageous in some situations to control what cores/CCD you are running various things on. You wont usually see big gains in average FPS, but 1% lows and stutters are dealt with very well. Similar to making sure your games are not using E cores on Intel. The gains just will be bigger on the Intel side since a P core is so much more powerful than the E cores.
@@cbremer83 "5xxx chips"
Zen 3 CCDs are equal, you mean 79xx X3D.
@@cbremer83 Fun fact, Cyberpunk 2077 has good scaling across core counts
I would like to see performance comparisons between APO and per-process E-core disablement via tools like Project Lasso. Since APO is making better use of the scheduler, it should outperform E-core disablement. I'd also like to hear more about what entities are responsible for optimizing scheduling. I was under the impression Microsoft was responsible for it in their OS.
And why is this enhancement delivered via the Windows Store and not Windows Update? Is Intel circumventing a Microsoft process? Is Intel taking this upon themselves due to dissatisfaction with Microsoft's scheduling efforts?
I always thought efficiency cores were space-efficient, not necessarily power-efficient.
economy cores not efficiency cores. efficiency comes when the best core does the task they are best at.
@@OrjonZ define "best"? "best" can be fastest, but "best" can also mean less "total kWh per task". GamersNexus added this metric to their CPU tests a while ago and it is the metric that gets improved by efficiency cores, which is why the naming is proper. But that metric is unimportant for gaming.
They're optimized in many ways only privy to computer engineers. For example, context switching. A larger p core burns more power saving processor state and flushing the pipeline than an e core would. The e cores are better at certain tasks, such as frequently switching background threads in and out. Additionally, fewer execution units means less power is used per clock cycle when a background thread doesnt saturate the reservation stations.
@@KerbalLauncher Is the efficient context switching the result having fewer specialized registers, or is there a more interesting optimization involved?
@@solemnmagus Sort-of; the smaller size of the e-core means less state to save/restore on a context switch. I'm not sure I'd call this being more efficient though - it's just that there is literally less work to do.
@12:49: No, Steve!
You can have lower priority tasks run on E-cores, there are many different tasks a game runs on the CPU and you can perfectly fine run sound or even AI subsystems on E-cores.
Game developers generally have no clue about thread scheduling!
Even preloader threads can run perfectly fine on E-cores!
This has kept me away from Intel CPUs despite the value, I don't trust the scheduling
The OS is supposed to handle scheduling. A CPU is just hardware. If you want to see what a CPU does with no software running on it turn your PC off. It just kinda sits there like a lump, don't it? CPUs execute instructions. That's ALL they do! You need to feed it instructions for it to do anything at all. Well, that's not entirely true today but for the sake of discussion let's keep it simple. Your CPU actually runs Minix while it is powered up. If you have an Intel CPU, that is. But that just does internal housekeeping. It isn't externally exposed. I'm not sure if that executes on one of your counted cores or if there's extra logic? Kind of like a secret computer inside the chip.
Turn off the e cores in the bios. Sorta a simple fix xd
@@RyanFennecobviously doesnt fix anything
@@RyanFennecthis seems to keep background stuff on the E cores to increase performance, turning them off is a hack at best
Not saying that APO isn't a hack, but is a notch above crippling the CPU
@@awemowe2830several of my friends have shown me uplifts in their preferred PvP games so it's not a doesn't do anything is it
future support will make or break this 'feature', the only silver lining i can think of is that the graphic driver team surely has tons of data on how individual games run because they've been hard at work polishing those arc drivers so if the cpu-team can make use of that to sort of streamline the process then it might be feasible in the long term
Would be interesting to see if restricting the affinity of tasks in Windows / Linux appropriately would also give similar results (i.e. restrict all OS / background tasks to E-cores, put game on P-cores).
Yeah exactly what I was thinking.
Would be nice to just cull the background tasks altogether. Windows bloat/spyware is out of control
@@jasonmitchell3307 I use W10Privacy with a preconfigured ini file combined with Shutup10 ( tools with many options) to remove/disable a lot of W10/11 bloat like unecessary background processes and preinstalled cr@p.
@@jasonmitchell3307It's gonna make them lose to Linux in performance if they keep piling it on
@@jasonmitchell3307 Fun fact: My PSU reports its power usage. I can quantify Windows background tasks as "10-15W of power at idle when compared to Linux".
One additional test that I think would be interesting would be running the games with APO disabled while setting the CPU affinity on the game process to only allow running on P-cores, either through Process Lasso or just manually changing via Task Manager. In theory that approach could provide a working optimization for 12th and 13th gen CPU owners who Intel is leaving out to dry with APO.
I suspect it may still not be quite as good, as CPU Affiniity only affects the running game process but the GPU driver code may still run on E-cores, but even still may be an improvement over nothing at all.
There's too many cores on an i9. If you have a simple chip like a i7-12700K, you don't need Process Lasso, it just works.
@@saricubra2867 You mean actually just works, or "just works" in jensen huang style? LOL
@@idontneedthis66 It's not Jensen Huang style lmao.
My impression from Intels e-cores, is that they're primarily about efficiency in die size, not about the energy efficiency.
Yes, they may also be more 'energy efficient' for some tasks where you'd either have to power a full core or just these, but we've already seen in the past that E-Cores are less efficient than P-Cores for the high performance workloads.
If you downclock a normal core, you get massive power efficiency improvements, due to how voltage and power scaling works. This is why laptops are still reasonably comparable to desktops. E-Cores aren't actually any more efficient than P-Cores, and I suspect they are a bit worse in many cases, due to Intel's horrible implementation. They work well for phones though.
Intel E-cores were about scraping narrow wins in benchmarks that a 10 P-core system could not do, as such instead of running at a power efficient frequency they've had to be pushed up the power wall.
One thing that HUB found that wasn't mentioned here is that the APO changes actually had a larger impact than just turning off e-cores. Looking at their e-core load distribution, it looks like 1 e-core per cluster was utilized. I wonder if they wanted to keep the extra 4MB cache per cluster without using too many e-cores.
Given the hybrid scheduling oddities encountered with CS2, that would probably be the biggest title they could add to APO. Given that a solid chunk of the player base plays on 4:3 stretch with minimum settings, the CPU is probably the bottleneck for many people.
Problem with the HUB test AFAIK is that they disabled the e-cores instead of just setting the thread affinity of the game to the p-cores, by disabling the e-cores the rest of the system cannot use the e-cores either so it will cannibalize on the p-cores from the game.
@@Henrik_Holst I tried that. CS2 microstuttering stopped but also Battlefield 2042 lost 12% performance. So it depends on the game. One thing i do need to try is core parking though, I have C-States etc. disabled so the eCores were at 4Ghz
@@Henrik_HolstMoving non-game processes off the P cores is the obvious solution that should've been there on 12th gen, this is why I avoid Intel and dual-CCD CPU's, nobody can get basic shit right
@@griffin1366 interesting that BF2042 went worse by exluding the e-cores, I guess that game launches lots of threads then and thus benefits more from having access to lots and lots of cores.
@@Henrik_Holst Battlefield games have always been that way. They will take whatever you throw at them. You can still see gains in Battlefield 4 with new hardware, upwards of 600 FPS at low settings. Crazy how smooth that game feels to play.
Wait, wasn't this one of the things Windows 11 was supposed to be doing already? That was a big thing with Windows and Intel 12th gen and up; Windows 11 process scheduler was supposed to be able to delegate low priority and background processes to the E-cores and the CPU heavy processes to the P-cores. It was a big thing at the time of Windows 11 to get people with the latest Intel CPU to move away from Windows 10. Am I wrong, or is this even deeper than that? Is it delegating a specific operation within a process to the appropriate E/P core, instead of the whole process?
Remember plug and pray? Microsoft has a habit of over promising and under delivering.
I suspect it was nothing more than a marketing bullet point completely divorced from the actual developer roadmap
I believe this reallocates affinity for the in-game threads, like put renderer threads to p-cores and some misc threads (like ai scripts) to e-cores, something that windows scheduler have no way of knowing of how to do (all of them are game.exe threads) and that is why APO have to have custom profiles for each game.
@@marshallb5210 if I had to guess I'd say it was a research project that came up in discussion at a meeting and someone had the bright idea of releasing it. I can certainly see why Intel would. It does cast their latest hardware in a better light. Even if the scope is extremely limited. It also opens up a dialog Intel is keen on promoting too. Very few here seem to have gotten the point though. But they may come around eventually.
@@AlexeyDyachenko now it is beginning to sound like an API failing. Because with no way of labeling the threads, like you say, there's no way for the OS to know what to do with anything.
Great video as always!
That's really interesting. I wonder if this is a Windows-only issue or is the scheduler on Linux also not great. Maybe that's why the application is Windows only?
Intel don't care about desktop Linux, only server Linux.
You gotta admit that the .1% low increasing by more than 50% at 1080p on the rainbox six siege is impressive when enabling APO.
Would be really interested in how this compares to manually setting CPU affinity to P cores only for games - have you tried that?
100% this. Locking games to P cores only gives really good improvements in avg fps and less stutter/spikes on my 12700H laptop in some games. Like Factorio, where you'd expect running 100K logistic bots would be where the extra threads would help
Depends on the game.
CS2 has microstutters unless I set an affinity to just the Pcores while Battlefield 2042 has about 12% more FPS.
How do you know which threads are the efficiency ones? I5 13400f owner here.
I agee.. Can you achive the same results by using say Procces Lasso ?
@@greebj You are power limited on your i7-12700H with clockspeed being the main bottleneck.
Seeing TA in the list of supported games warms my heart
This is exactly what I was thinking. Because they’re branded efficiency cores they’re not clocked as well so when they have a couple of 2.2ghz loads on a few of them apo brings up the wattage, since the cores aren’t chillin around 1ghz they’re heating up to act as if 2-3 e-cores are 1 full p-core
The opposite is happening, though, at least in Hardware Unboxed's coverage. There the total CPU power consumption goes down by about 10W while frame rates go up by up to 20%. E cores at 2.2GHz still consume a lot less power than P cores at 5.7Ghz. Exactly why is hard to say. It could be because of improved cache usage, but it could also be that the P cores run cooler and have more headroom to pull stored power (from capacitors for instance) when some processing step requires extra attention.
If they can scale this out to most modern gaming titles from now on, they may have an answer to AMD's X3D tech.
More "efficient" cores reducing P core throughput and increasing power use sounds very much to me like it's causing cache thrashing
Unfortunately this kind of stalling is invisible to consumer users, just shows up as CPU busy / 100% utilisation
If the rumours of 8+32 Arrow Lake are true this problem may become worse without even more of these application specific bandaids. 🤮
@@greebj Hardware Unboxed's hypothesis was that using E cores for low priority/background tasks would free up level 3 cache for the P cores, which would then speed up high priority tasks.
Hopefully, Intel will be able to turn APO into something that would either work relatively generically, or at least be built into compilers and/or frameworks like Unreal Engine.
I say "hopefully", because Intel has become the underdog lately, and we as consumers are best off with competition.
What they should so to make it really good PR is to make it possible for the public to create their own custom profiles for game/CPU combinations. They'd effectively do a Bethesda and crowd source improvements for their tech.
I thought one of the main reasons Windows 11 even exists was to properly assign threads to P and E cores
Nah, just give Intel more money and hope they release your game's performance from the shackles of this crappy architecture/software implementation.
From my testing it made zero difference anyway. GN came to the same conclusion.
Windows 11 exists because Microsoft wants to make Windows about the Desktop PC again. They no longer have to rely on Windows to make profits and have since stopped chasing trends.
and nowdays windows 10 has the same task scheduler as windows 11 lol. no need for windows 11
don't worry, windows 12 will fix /s
Yeah. Windows 11 uses some magical heuristics that are frequently wrong, while Windows 10 simply looks at thread priority, putting any subnormal thread on E-cores. Honestly, in my experience, Windows 10 gets it right more often than Windows 11. Good luck getting Windows 11 to put any service process on the P-cores (this causes a lot more problems than some may realize).
This sort of thing is why I'm not too comfortable with e cores yet, at least on X86. When all you have are P cores you don't have to worry about stuff like this, because all cores are equal except _maybe_ the boost clocks slightly.
All cores are equal, unless you overclock them near their limit, where even smallest and insignificant defects make it impossible to reach.
So basically what I'm getting from this is that the only way that 14th gen could actually have a generational improvement is by fixing a "bug" for the new chips while leaving that issue intact on the older generations? Talk about slimy.
I have a feeling if Intel had headphone jacks in their CPUs, they would remove them and charge for dongles.
Nice vid! When is your Alienware Aurora R16 review going to be released? That is, if you are reviewing it. I just binged your other Alienware Aurora reviews. They are very entertaining to watch. I hope Alienware has made more drastic improvements to the new design. I haven't looked at any reviews for it yet, so I have no idea if they did.
I wonder if this will ever be supported on Linux, or even in Windows in a more general way. To me it sounds like a hack to fix scheduling issues by hand which should be dealt with on the OS-level.
I've added automatic thread affinity to Feral's Gamemode back in May (they have not merged it yet), question is though if APO does more magic than just thread affinity.
@@Henrik_Holst Intel posted some Thread Director patches for Linux some time ago which weren't merged yet. These work with a class and score system for tasks. My uneducated guess would be that APO is some kind of extension to that work on a per-application basis. But that seems very hackish and unmaintainable in the long run, I hope this work will be eventually morphed into something more generic and automatic, either on the OS-level or in hardware with a hardware-based scheduler.
@@seylaw AFAIK those patches are P-State patches, AFAIK ThreadDirector is something else that only works (or rather doesn't work) on Windows where the CPU does scheduling decisions something the P-State driver doesn't do on Linux since it's not really needed to how the Linux scheduler works.
edit: my bad, apparently the real name for ThreadDirector is "Intel Hardware Feedback Interface" and is apparently only a performance metric for each core. Looks like it was mostly marketing when they talked about how it would handle the scheduling.
I’d forgotten just how far hardware has come since the release of R6 Siege. I remember playing this game on a 1050 and managing 1080p60 at mid-to-high settings and now we’re able to see over 600fps with it lmao
If they decide not to bring APO to all of 13th gen, I wish they would at least allow APO with the 13900KS since its basically the same as the 14900K. Also, would something like "Processor Lasso" be able to be used to optimize the thread scheduling on 13th gen the same way APO optimizes the 14900K? or is that not possible?
Disable all your E-cores? Or use Lasso?
This is basically what Process Lasso does, theoretically. It would be great if someone with a 14700/14900 could test the difference between using APO vs just Process Lasso.
I've been using it with my 12700k for a long time now, but it is really just to keep the e-cores free for background things like my browser while using the P-cores for heavier stuff.
Game with Big cores, let the Little corses take card of the background tasks - oh wait they don't actually do this for G12 or G13 and now only for G14 when supported. So why was these approach a good thing again?
At what point are we (Or Intel) willing to admit that instead of this 8 P cores and 16 E cores (14900k) malarkey, the better CPU would have 12-16 performance oriented cores??
Only for gaming. And desktop gaming is the smallest of all Intel's segments. Desktop is only 14% of the total x86 market (and AMD has taken a hefty chunk out of that) and gaming focused desktops are only a fraction of those 14%. So it makes no sense for Intel, from an RnD cost perspective, to launch a dedicated gaming SKU like the 7800X3D.
AMD needed the 3D V-Cache technology for their contacts on scientific compute super computers and it just happened to also give them a gaming edge. That the 5800X3D beats the 5950X, and the 7800X3D beats the 7950X in gaming tells you all you need to know about the number of cores are needed for fast gaming.
I have a 7950X3D myself, and use launch scripts to lock them to the best suited CCD, and only about 5-7% of my 250-ish games prefer the clock-speed one over the V-Cache one... and none of them improves by having all 16 cores available. That is not true in general, though, as city planners and games like that will chew any number of cores (and actually utilize intel's retard/efficiency cores perfectly well).
But yes, if Intel wanted the gaming crown back they'd have to make an 8 P core, 0 E core die with, say, 72MB L3 cache and clock the absolute snot out of it. But the ROI in that would probably be in the negative.
The cpus should do this by default. Ridiculous that you need a piece of software to gain more FPS on your CPU.
Intel throwing everything they have to try to sell more 14th gen
Of course the real question is - once they've got enough games onstream with APO, do GN and everyone else use it for game benchmarks?
It also seems to show that all the hype around Windows 11 thread scheduling and Intel Thread Director was just that - hype.
Although I suspected something like APO would eventually come along when Intel first announced p-core/e-core architecture. But at the time I assumed it would be down to individual users to play around with the settings for each program. Rather than this top-down approach.
Or the games would set APO themselves. Which may still be a long-term goal.
Guess I'm glad I have a 12400, no e cores lol.
I care about efficiency a lot. I'm not a fan of how PC gaming usually ignores it. Or has misconceptions about how it works.
Nooooo! You need a 450w CPU!!!
I have a 13700k and it pulls 50-90W depending on the game.
RTX 2080 that pulls 100-200W depending on the game.
My 4080 pulls like 170-230 watts depending on the game it is undervolted but not too much. Best card for efficiency!
@@otto5423 Yeah 4080 is such a nice sweet spot and undervolts well.
@@otto5423 Not to shit on you, just in response to your point. That's a lot of wattage, but outputs a really nice image. For me efficiency also invoves my standards. 200w GPU for my 1440p monitor feels like a bit much. Although maybe worth it to hit high framerates in certain games.
Unfortunately I didn't pay much attention to efficiency when I built my new PC in January. The 12400 is plenty efficient, but the 6700XT is not from what I gather lol.
Maybe if Intel puts out an efficient mid-range GPU I'll side-grade to that next year. For now though Intel's is the least efficient GPUs
This exact thing is why I went for R9 7950X. Both Intel and X3D have had "special" core scheduling issues, so I'll just sit over here like Thanos with my perfectly balanced dies
If Intel doesn't roll this back to 12/13th gen it might be the last time I buy Intel for a long time. Been using them since 1st gen i7 and before but I don't stand for shady tactics!
You don't stand for shady tactics? You're running Windows ain't you? You do realize the OS is supposed to handle all scheduling, don't you? So why are you blaming Intel? You think Intel writes Windows? Intel makes chips. What people do with those chips is their business. Shady that. What Intel is really doing here is showing you how bad your OS sucks. I bet Intel is actually pretty pissed that Microsoft aren't doing their job. It's making Intel look bad.
@@1pcfred Well Intel decided to go with a new hybrid architecture. I don't see how you can blame MS if all the cores are being utilized and scheduled based on performance and efficiency cores. All I'm saying is if Intel can squeeze out free performance from their CPU's why not do the same for others of the same design?
To me that's shady considering it could be the only reason to consider 14th gen over 13th and they're gonna need some help selling them so why not make it exclusive? Sounds shady to me.
@@toddsimone7182 yes Intel makes hardware. It is up to programmers to utilize that hardware. That's the way it's always worked. Although if you're into vertical integration perhaps Apple is your thing? They make the hardware and the software. So one stop shopping. As far as why not Intel doesn't make software beyond tech demos. Intel makes chips. That's their thing. They invented chips. Well, Bobbie did before he founded Intel.
Ok, I have to ask the question, and I REALLY hope that someone with GN sees this and looks into it, but has anyone done any really in-depth testing using Process Lasso to better control the E-Cores? I personally have been using PL to tie most of my background tasks to the E-Cores only, leaving the P-Cores free to be used by my games as the games see fit. I have also used it to tie a game or 2 specifically to the P-Cores and make it so that those games aren't even aware that the E-Cores exist.
I would love it if GN would do some digging to see what kind of effect (if any) they find using PL on it's own vs something like APO. Like I said, I have done some limited amount of testing on my own with decent results, but I don't have the free time or (more importantly) the know how to really come to any definitive conclusions on this.
Id love to see a video about this subject too.
Honestly I have a 12600k and this CPU had so many issues I disabled E cores all together. I still have some minor issues but I will be going back to AMD next generation.
Interesting, what kind of issues?
what issues? had 0 on my 12900k and 13900k with and without them on.
Intel saw hybrid architectures for mobile, and thought, that's a good idea. Unfortunately they completely bombed the implementation.
Ryzen 7000 non 3D Vcache CPUs are underrated. Keep in mind, Threadripper 7000 doesn't have 3D Vcache and it certainly doesn't need it to dominate in both performance and power efficiency.
Now if AMD releases a Zen 5 8995wx3D or a 9995wx3D with all core 3Dvcache, they will have less then nonexistent competition in the CPU market unless Intel 3D and Battlemage are groundbreaking!
Unless games are using all P-cores @ 100%, it would be interesting to see the benchmarks (99th frame time, etc) with E-cores completely disabled. (APO feels kinda like a band-aid to the fact that Intel thought it would be easier to schedule software between two different core types).
There was an older Hardware Unboxed video that should that at least Rainbow 6 Siege performed better with them off.
Edit: Nevermind, Hardwareunboxed did those tests. Very interesting. 🙂
APO is showing to be even better than no E-cores in R6, since there is some value in freeing up time on the P-cores if you can optimize routines for non-latency sensitive E-core scheduling. still seems like a bandaid for big/little challenges.
I think I'll heavily lean towards symmetric CPUs for peace of mind that software is not running in a weirdly gimped way. Unfortunately, it seems like Ryzen 7950X3D and 7900X3D do a similar thing to assign the 3D vcache CCD to games.
Great coverage GN. This is why Intel needs to put it's Ai RND energy hard into Ai acceleration of thread director. It's a win win for them, and as such would drive cpu sales and Ai sales as a showcase. Efficiency I think will be the first real boon of Ai, just imagine if intel had the first Ai DPU to feed all of a computers data draws, bottlenecks could be really known on a new level and responded to. Thread Directors real promise v.s. AMD's own big little apporach only shines when threads can be cascaded to by thread size, ability to be cache hit by L1 - L3 and timing latencies.
To me it's astonishing that little to no people, especially looking at the hardware reviewers here, have considered that the whole hybrid p-core + e-core architecture isn't going to cause problems in games. When i bought my 12900k new, right at launch, that was the first thing i was messing around with with this cpu : disabling e-cores. Boy oh boy do you leave some serious performance on the table in a lot of games, that aren't even that old, by running Intel 12/13/14th gen with their stock config (that being e-cores enabled).
That has been a main discussion point since they came out. What do you mean no one considers it?
@@GamersNexus Did you guys covered that topic with an dedicated video?
@@NoName-st6zc since the dual ccd conflicting with the scheduler people have known this kind of stuff was going to happen
12th Gen does have a general e-core consideration. 13th and 14th does not apply here. The issue is that the ring bus clock rate lowers when e-cores are active. A slower ring bus affects L3 cache performance and thread/thread communication. Chips and Cheese has an old article on this (alder lake ring bus) and they ran a couple of benchmarks and got 3-6% difference. Of course they special case rigged a worst case test to demo and test this 12th gen implementation. Fully loaded all p-cores and put a dummy load on an e-core to get the ring bus to declock. They benched apps where they had control over the thread count created for app work (aka to get a fully loaded p-core suite, and force affinity to said cores).
Thread/thread communication is a reason for the history with scheduling and AMD CCXs. Mostly the older 4-core CCX designs.
@@AdrianOkay Yet it has been rarely covered, i mean properly covered. To me it seems like a lot of people still aren't aware of how much performance you leave on the table by just having the 12/13/14th gen hybrid CPUs in their stock config. That being when it comes to games at least. That being also not when it comes to all games. Some games do just fine with the stock config but a lot of them don't.
So would you get a similar effect from using process lasso and setting the thread affinity for your games to the P cores?
That would be the same as turning off e cores.
You want control over individual game threads, putting the big ones on p cores while leaving small ones on e cores. I don't think you have that kind of control with process lasso.
Shoutout to the four people benefitting from this.
Claim your “here within an hour” ticket right here🏆
mine
no
how about a minute?
claimed
Welp
I really like the idea behind this. But, with it being so specific, I'm wondering if Intel could develop this into a dev kit that software vendors can then use to make their own profiles. 'Cause a lot of people try to use Intel consumer desktop systems for music work, and multimedia in general; and the e-cores have been a problem, but on an app-by-app basis. Not only would a kit allow vendors to solve their own efficiency problems on their own, it would help secure adoption of this idea. Furthermore, I can only imagine how handy this could be for virtualization apps, too.
Can you imagine what this could do in a game studio's hands? Imagine a game detects a CPU SKU they were able to optimize for, like a 14700K or 14900K, and this little thing just goes "hey windows, I'll take it from here." Something similar could possibly be done for hybrid X3D CPUs or will need to be done for AMD's Zen5C hybrids, so maybe we end up with games or engines that have this slowly growing list of "optimized" CPUs.
@@DigitalJedi Well, the way it works now, an out-and-out profile would have to be built per application. And this is so early in development that Intel's crew are doing it by hand. However, I expect that if anyone would/could make this into a kit, it'd be them. And considering how far behind they are against AMD these days, they need any edge they can get. And it definitely would help with adoption, since there are still a lot of cases where the e-core design doesn't play correctly with deeply established apps (I'm not just talking games, obviously). That would be the ideal solution, though we obviously don't know how they're going to handle it, let alone what practical, working options they even have.
Now, it's a cinch Intel isn't going to help AMD on this. So, never get your hopes up about X3D getting love if AMD isn't providing it themselves. But, you are definitely on the right track that AMD should sort out some kind of solution along similar lines as Intel.
Trusting software vendors doesn't work - we've seen this on mobile platforms, where when given the opportunity *every* developer labelled their application as needing high performance, because it made the application run better. And of course, if everyone says "give me high performance", this becomes a one step forward, two steps back situation.
@@habilain Man... You must _hate_ the evolution of video games, then.
@@artisan002 Eh? I didn't mention video games. I just stated what was learned from mobile devices. Android used to allow developers to specify the performance level of their application, and almost everyone said "max performance" - even things like not taking or email. That resulted in battery life issues, followed by the removal of that option.
I'd think we'd probably see the same thing if PC developers were allowed to request P or E core - which would mean that all background tasks (e.g. Discord, Steam, Windows cruft etc) would request to run on P-cores, sending us back to square one.
This is a fantastic demonstration of the importance of scheduling and how much of a magic sauce it is
But also it's clearly so much work there is absolutely 0% chance this gets maintained... It's completely a tech demo.
It's incredibly ironic that this tech demo seems to show bigger scaling than the generational improvements.
This looks really promising if they can extend it to more games and processors. 30% more FPS for free is crazy.
It's not in all uses though, as Steve from Hardware Unboxed showed, some results didn't do anything at all at higher resolutions LOL
I don't think e cores are meant to save power. Certain work requires certain energy - basic physics. All you could save is some (rather small) overhead.
Also e cores are running lower frequencies. But then you could run performance cores at lower boost too. So this one doesn't really change anything.
E cores are meant to fit a lot of cores in small space for faster completion of repetitive, data heavy tasks like (de)compression, OCR, encoding and alike.
Reckon we can get Intel to change its mind about being "14th" Gen only?
I think if enough people care about this utility, then yes -- but they have to see that people care and Intel might have to expand beyond two applications to get that response.
@@GamersNexus
Exactly this, so thank you for covering already. Can you or HUB maybe make a comprehensible tweet about it for the masses to share? I think the subject is a little too complex for some people to start complaining on their own. As a 13900KS owner I‘d very much appreciate that!
What i would love here is a comparison between windows and linux. New kernels seem to have a good scheduling approach there. Lets get people over to linux with games... worked for windows, right?
Intel taking a play out of Nvidia's playbook. "We could support the precious gwn but we need you to buy our new stuff.
Why is there so much tuning involved? Can't they just have the programm check "is it that exe. that is running" and then force that to only use P Cores?
And then give us a field where we can add our own exe's that will only run on P-Cores, why does it need so much work?
So fixing the issue, but only on 14th Gen?
Only on SOME 14th gen! Not even all of them!
Wasn't thread director (and the "enhancements" in the win11 version of it) already supposed to address this stuff? And then they decide to release another feature for it. Needless segmentation on top of needless segmentation.
@@GamersNexus Gotta love some special treatment :')
@@GamersNexus Sadly yes. I only noticed it after I posted. And should have added "14th Gen" with quotes ;)
@@xomm Exactly my point. This is fix that was supposed to be sort of from beginning. If in 3 months it isn't available on many more CPU's then I'm going to be mad. For now I can give it a pass in the same way I can give AFMF a pass - it is a "tech demo". At least realistically. IDK about "officially", but that's how it feels.