AMD K6-III+: Additional L2 Cache Performance and Cache Levels

  • Published: 11 Sep 2024
  • After modifying an AMD K6-2+ by applying a mod found on Twitter, I wanted to check if the additional 128KB of L2 cache improves the CPU's performance. Let's have a look at some benchmark results and learn more about the importance of cache levels and their impact on the overall CPU performance. Enjoy!
    You can support me on Patreon:
    / bitsundbolts

Comments • 74

  • @drCox12
    @drCox12 1 year ago +12

    For cache testing: One benefit of the K6-III with integrated L2 cache was the relatively large cacheable area (4 GB). Of course, this would only have an effect in configurations with very large memory capacities or if the motherboard restricts the cacheable area.
    On your P/I-P55T2P4 I can see that the motherboard's L2 (L3) cache configuration is 512 KB (JP5 hard-wired to pins 2+3). I can also see that the cacheable area of the L2 (L3) cache is set to 64 MB (JP4 is set to pins 1+2). It could also be more than 64 MB (up to 512 MB; JP4 must then be set to pins 2+3), but that requires Pipeline Burst SRAM (PBSRAM) and a TAG SRAM upgrade (the small socket next to the CPU socket that is empty on your motherboard).
    As you have a memory capacity of "only" 32 MB, this doesn't make a difference at all. The motherboard's cache can perfectly manage your 32 MB of RAM (up to 64 MB) and could manage up to 512 MB if you upgrade it with a TAG SRAM chip and use PBSRAM.
    In a nutshell: If you use more than 64 MB of memory, your motherboard isn't able to cache the whole memory capacity without a TAG SRAM upgrade and PBSRAM.
    Thus it would be interesting to see whether it makes a difference if you use more than 64 MB of memory with K6-2 vs. K6-III.

    • @bitsundbolts
      @bitsundbolts 1 year ago +3

      I have a video online where I upgrade the TAG RAM and use a COAST module. The images I use in those videos are older images I took of two revisions (2.3 and 3.1). It is funny that you say I should test the board with non-cacheable memory areas. I am planning a video where I'll do exactly that, plus use an older BIOS version that doesn't know the plus versions of the K6. I think I've read that you also get around 10% lower performance in that case.

    • @drCox12
      @drCox12 1 year ago +1

      @bitsundbolts Cool! Looking forward to that! :)
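
The cacheable-area point raised in this thread comes down to simple arithmetic. Here is a minimal sketch, assuming the limits quoted above (64 MB without a TAG SRAM upgrade, 512 MB with one); the function name `uncached_mb` is made up for illustration and does not model any real chipset register.

```python
def uncached_mb(installed_mb: int, cacheable_limit_mb: int) -> int:
    """Return how many MB of installed RAM lie above the cacheable limit."""
    return max(0, installed_mb - cacheable_limit_mb)


# Limits as quoted in the comment above (assumed, not verified against a datasheet).
LIMIT_WITHOUT_TAG_MB = 64
LIMIT_WITH_TAG_MB = 512

for installed in (32, 64, 128, 256):
    without_tag = uncached_mb(installed, LIMIT_WITHOUT_TAG_MB)
    with_tag = uncached_mb(installed, LIMIT_WITH_TAG_MB)
    print(f"{installed:>3} MB installed: {without_tag} MB uncached without TAG, "
          f"{with_tag} MB uncached with TAG")
```

With 32 MB installed, nothing falls outside either limit, which matches the remark that the TAG upgrade makes no difference for that test setup.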

  • @AlexBoneff
    @AlexBoneff 2 years ago +4

    I was probably in need of this video 20 years ago :)

    • @bitsundbolts
      @bitsundbolts 2 years ago +2

      A secret that was kept for many, many years! Interesting that nobody leaked it or found it earlier! Another one of those hacks was the Radeon 9500 Pro to 9700 Pro mod. I attempted the soft mod back then, but was unsuccessful. I got artifacts on screen after unlocking the other 4 render pipelines, or whatever they were called.

  • @biffzinker7994
    @biffzinker7994 2 years ago +9

    You're also observing no change due to the narrow on-die 64-bit bus between the L2 cache and the L1 cache/front-end decoders. The low associativity of the L2 cache hurts performance as well.

  • @Chriva
    @Chriva 2 years ago +12

    There is a huge benefit with more cache, but only if your workload needs it. It's kinda like RAM. You'll only notice performance differences when you have too little, but basically nothing happens once you're above that limit :)

    • @bitsundbolts
      @bitsundbolts 2 years ago +6

      I need to find a scenario that clearly shows a benefit between the two CPUs. Hopefully I will find one!

    • @JL-yi1fx
      @JL-yi1fx 2 years ago +2

      On my Asus motherboard I noticed that having the 128 KB on-die L2 cache improved the playback of MPG video files (Video CD MPG files), though not at full screen. It can almost play a DVD, but not quite without stuttering. The CPU load went from 98% down to 47% in Win 98 when using the on-die 128 KB L2 cache with no motherboard cache. AMD K6-2+E 570 clocked at 500 MHz, 32 MB Matrox G450 PCI video card, 256 MB of 60 ns system RAM.
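
The idea in this thread, that extra cache only helps once the workload outgrows the smaller cache, can be illustrated with a toy average-access-time model. This is a sketch under loud assumptions: accesses are spread uniformly over the working set, and the hit and miss latencies (10 ns and 100 ns) are invented placeholders, not K6 measurements.

```python
def hit_rate(cache_kb: float, working_set_kb: float) -> float:
    """Toy model: everything hits if the working set fits, else a proportional share."""
    if working_set_kb <= cache_kb:
        return 1.0
    return cache_kb / working_set_kb


def avg_access_ns(cache_kb: float, working_set_kb: float,
                  hit_ns: float = 10.0, miss_ns: float = 100.0) -> float:
    """Average access time = hit share * hit latency + miss share * miss latency."""
    h = hit_rate(cache_kb, working_set_kb)
    return h * hit_ns + (1.0 - h) * miss_ns


# Compare 128 KB and 256 KB of L2 for a few hypothetical working-set sizes.
for working_set in (64, 128, 200, 512):
    small = avg_access_ns(128, working_set)
    large = avg_access_ns(256, working_set)
    print(f"working set {working_set:>3} KB: 128 KB -> {small:.1f} ns, 256 KB -> {large:.1f} ns")
```

In this model the two cache sizes are indistinguishable until the working set exceeds 128 KB, which is one way to read benchmark results that show no difference.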

  • @MrKillswitch88
    @MrKillswitch88 2 years ago +4

    Generally, with anything after, say, the Pentium M era, additional cache usually comes with a bump in bandwidth, hence the bump in performance. The additional capacity itself might not provide much of an improvement, depending on the application, but it is always nice to have more regardless. Processors in general are sensitive to both bandwidth and latency, as high latency robs a lot of potential performance; modern consoles using high-latency memory are just one example.

  • @Neksus-M06
    @Neksus-M06 2 years ago +1

    You are going to make several videos on the subject, future is bright for us viewers :)
    Thanks for now.

    • @bitsundbolts
      @bitsundbolts 2 years ago +2

      That's the plan! Thank you for being here!

  • @scottyhehehe5367
    @scottyhehehe5367 4 months ago

    Nice testing. I heard long ago that the extra 128 KB of L2 on the K6-III+ did not really help at all. Your graphs just proved that.

  • @HabiburRahman-dh7oy
    @HabiburRahman-dh7oy 2 years ago

    The introduction was very helpful, thanks!

  • @angieandretti
    @angieandretti 2 years ago +5

    Can you test actual game benchmarks with your K6-3+ with and without the motherboard cache enabled, to see whether there's actually a performance regression with that L3 cache enabled? I remember this claim being made but I think I saw a video, PhilsComputerLab IIRC, that found it wasn't the case.
    Update: Just did a quick test myself w/ K6-2+ 618MHz (5.5x112.5) and I do see a massive 27% increase in memory write speed per a synthetic benchmark when disabling the 512KB motherboard cache, but I also tested a game (Unreal) and see nearly 3% reduction in FPS.

    • @bitsundbolts
      @bitsundbolts 2 years ago +3

      Yes, I will do this in the future. Any particular game(s) you have in mind?

    • @SkalabalaK6
      @SkalabalaK6 2 years ago +1

      You need to set your cache and memory timings to the fastest settings to benefit from it. You need to use WPCREDIT for this. I have beaten Socket 7 to death with a stick: level 3 cache helps, end of story.

  • @mattnordsell9760
    @mattnordsell9760 2 years ago +4

    Seeing these benchmarks, it seems like the L2 has yet to ever be a huge deal as even processors today have pretty much the same L2 cache levels per core that they had back then.

    • @bitsundbolts
      @bitsundbolts 2 years ago +6

      True, L1 and L2 cache have not changed significantly in size. They got a lot faster, but as I get more into understanding how cache works, it gets more confusing. 25 years ago, you cached 64 or 128 MB of memory; now systems have 1000x more memory! It's still a topic I have a lot of work and research planned for.

  • @Romerco77
    @Romerco77 1 year ago +2

    That L2 cache increase should translate into a noticeable benefit in games, if not something fishy is happening. Try Quake in DOS.

  • @FutureChaosTV
    @FutureChaosTV 1 year ago

    Insane:
    RAM size AND speed have improved a *thousand* fold since those days...

  • @yosemite-e2v
    @yosemite-e2v 2 years ago +3

    With my very similar Asus TXP4 I have a modified K6-2+ which I'm running at 6x83 with 256 megs of SDRAM. When I use Sisoft Sandra to measure the memory bandwidth it's always lower when the 512k motherboard cache is enabled.

    • @bitsundbolts
      @bitsundbolts 2 years ago +1

      That is similar to my findings then. But that was the only benchmark that suggested a decrease in performance. Will try to find more info on this topic.

    • @yosemite-e2v
      @yosemite-e2v 2 years ago +1

      @bitsundbolts I have an FX 5500 PCI card in the system using the Nvidia 45.23 driver, and both 3DMark 99 and 3DMark 2000 have slightly lower scores with the motherboard's cache enabled. I've done the tests many times; the gap is small but consistent. I thought it might be because the TX chipset can only cache 64 megabytes of memory and I have 256 megs installed, but I also ran the tests with only 64 megabytes and the results were the same.

    • @bitsundbolts
      @bitsundbolts 2 years ago +1

      I have a TAG module installed (15 ns) and all memory is cacheable. I still got memory bandwidth results that are lower when the motherboard cache is enabled. Maybe I will revisit this topic in the future.

  • @chrisducati26
    @chrisducati26 2 years ago +7

    Next time try with a Super Socket 7 motherboard, you will be amazed by the results

    • @bitsundbolts
      @bitsundbolts 2 years ago +4

      I am in the process of getting a Super Socket 7 board. There will be a follow-up for sure!

    • @ninedogs2418
      @ninedogs2418 2 years ago +1

      @bitsundbolts Try an 83 MHz FSB on the current motherboard. 83*6=500

    • @bitsundbolts
      @bitsundbolts 2 years ago +2

      Tested this today. Will make a follow-up soon, but don't want to spoil anything ;)
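
The clock settings mentioned in these comments follow from core clock = FSB × multiplier. A small sketch to check the quoted combinations; the function name is illustrative, and the nominal "83" setting is taken at face value here (an assumption, since boards often run that setting slightly above 83 MHz, which would explain the rounded "500").

```python
def core_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Core clock in MHz from front-side bus clock and CPU multiplier."""
    return fsb_mhz * multiplier


# Combinations quoted in the comments on this page.
print(core_mhz(83, 6))       # the "83*6" suggestion, just under 500 MHz
print(core_mhz(112.5, 5.5))  # the "618MHz (5.5x112.5)" setup
```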

  • @coryflammer1269
    @coryflammer1269 1 year ago +1

    I would assume, you wouldn't see a difference on an older socket 7 board. Need an SS7 ALI or MVP3 chipset to really see the difference. PhilsComputerLab did a video about it on a later SS7 board

  • @mochtard7709
    @mochtard7709 2 years ago

    coz you did great work on the video, thank u for your help o7

  • @cal2127
    @cal2127 1 month ago

    honestly the coolest thing about socket 7 boards is you can get a full 1 mb of l3 cache in 99

  • @BlueSkyYGO
    @BlueSkyYGO 1 year ago +1

    Imagine these same benchmarks but with a 100 MHz FSB instead of 66...
    L2 and/or external cache should have a higher impact on performance

    • @bitsundbolts
      @bitsundbolts 1 year ago +3

      Yes, I'll probably do something similar with the super socket 7 board.

  • @rebeccaschade3987
    @rebeccaschade3987 1 year ago

    Testing with Quake 3, Unreal Tournament etc, and you should see much more of a difference, even more so once you move to a motherboard with 100MHz FSB and higher clock frequencies.

  • @alongwithguide735
    @alongwithguide735 2 years ago +1

    Hello from Kazakhstan :) If it really improves performance, then I think I'll try this method on my AMD K6-2+ 550 MHz. Overclocking to 600 MHz with 256 KB of cache could make it the most powerful Socket 7 system. Can you show an example of this system running games like Gothic or Morrowind?

    • @bitsundbolts
      @bitsundbolts 2 years ago +2

      Hello! I am not sure yet if this mod improves performance by enough considering the risk you take during the delid! Also, there is a chance that the cache is defective and the mod doesn't work. I have a couple of K6-2+ 570 waiting. Next month I will do a follow-up to see if I can find a unit with defective cache.
      Regarding Gothic and Morrowind - based on their system requirements they are too demanding for this system (it is not a Super Socket 7 board). If I get a Super Socket 7 board with AGP support, 100MHz FSB, SDRAM, and better GPUs, I may have a look.

  • @bestshortsinsportshd
    @bestshortsinsportshd 2 years ago

    He had when he "pitched down the Nice tutorialgh hats at the end of the phrase. "

  • @Elkarlo77
    @Elkarlo77 1 year ago

    One question: Did you test setk6v3, the tool by Andreas Stiller from c't? It is still around. Some motherboards have a curious bug which would explain your results. I used the tool on my K6-3+ 500. Some motherboards don't recognize the L2 cache, so you use only the L1 and L3 cache. The tool by Stiller is a driver which corrects this issue. Some tools may use the L2 cache, but it is not commonly used without the driver. My K6-3+ got a solid boost thanks to the tool. 256 KB at 500 MHz is way faster than the 256 KB at 100 MHz the motherboard offered. Your results look like you are not running the L2 cache.

    • @bitsundbolts
      @bitsundbolts 1 year ago

      I have not used this tool, but maybe it is worth trying. The BIOS I am using is a modded BIOS that specifically states to have fixed the proper initialization of K6-III CPUs - therefore, I didn't consider using any third party tool. The speedsys benchmark shows however a clear drop of performance at each cache level - so, I think the CPU and all cache levels work properly. Maybe it would be a good test against a board that knowingly has this bug. I have to check what Socket 7 boards I have for that...

    • @Elkarlo77
      @Elkarlo77 1 year ago

      @bitsundbolts The tool lets you enable and disable the cache from the command line. If your BIOS is fixed, then I may be wrong. It is from a CPU guru at a German PC magazine and is widely used here. In Germany, the first "Aldi" PC (Aldi is a grocery chain) had an AMD K6-2; the BIOS was not fixed and needed the tool when you upgraded to a K6-3.
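
The check described in this thread, looking for a clear drop in throughput at each cache level, can be sketched as a scan over benchmark samples. The readings below are hypothetical placeholders, not numbers from the video or from speedsys, and `find_drops` is an illustrative helper, not part of any benchmark tool.

```python
def find_drops(samples, threshold=0.25):
    """Return block sizes (KB) where throughput fell by more than `threshold`
    (as a fraction) relative to the previous, smaller block size.

    `samples` is a list of (block_kb, mb_per_s) tuples sorted by block size.
    """
    drops = []
    for (_, prev_bw), (block_kb, bw) in zip(samples, samples[1:]):
        if bw < prev_bw * (1.0 - threshold):
            drops.append(block_kb)
    return drops


# Hypothetical readings for a CPU with L1, on-die L2, and motherboard cache.
samples = [
    (16, 900), (32, 880), (64, 400), (128, 390),
    (256, 380), (512, 120), (1024, 60), (2048, 58),
]
print(find_drops(samples))  # [64, 512, 1024]
```

Three distinct drops would suggest three working cache levels; if the on-die L2 were not active, one of the steps would be expected to disappear from the curve.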

  • @raven4k998
    @raven4k998 1 year ago

    now can he add another 256 k l2 cache to the cpu making it a k64+ cpu?

  • @andreewert6576
    @andreewert6576 1 year ago +1

    I have to add that at least on my PCchips M577, leaving the external cache enabled with a K6-3 or K6-2+ leads to memory corruption. Memtest fails. I pulled my hair out until I found it doesn’t happen with K6 or K6-2.

    • @bitsundbolts
      @bitsundbolts 1 year ago +1

      That is interesting. I don't understand how the external cache would work with K6 and K6-2 though and not with CPUs that have internal L2 Cache.

    • @andreewert6576
      @andreewert6576 1 year ago +1

      @bitsundbolts It might be a bug where the cache is not demoted to L3 as it should be but stays as a second L2 cache, giving conflicting outcomes in specific scenarios. I could boot Windows but it crashed at random. It always failed Memtest test 5 at 0 bytes, as far as I remember.

    • @andreewert6576
      @andreewert6576 1 year ago +1

      @bitsundbolts One more bit: it happens on two separate boards with different revisions. Yep, it drove me mad, so I bought another $100 M577 board to troubleshoot the issue.

    • @bitsundbolts
      @bitsundbolts 1 year ago +1

      Yeah, those situations can drive you crazy! This demotion to Level 3 seems to be flaky. I have noticed that some tools do not properly identify the different caches (but that might be due to the way the tools try to identify caches - I assume some use access time).
      But you could be right that something may have caused issues with correct allocation on those motherboards

  • @jozewsqwe435
    @jozewsqwe435 2 years ago

    Good video

  • @adg1355
    @adg1355 1 year ago

    And how about a Super7 system with 1M L3 cache?

    • @bitsundbolts
      @bitsundbolts 1 year ago

      I think you won't see much better performance on the Super7 platform and it may completely depend on the workload if the level 3 is properly utilized.

  • @MohammedIrfan-eq2ix
    @MohammedIrfan-eq2ix 2 years ago

    But that comment is 4 months old

  • @kokodin5895
    @kokodin5895 2 years ago

    Is it possible you hit a wall of storage media read speed instead of CPU limitations, at least in the boot speed benchmark?
    How about torturing it with XP?
    How much of the memory was cacheable in those tests?
    Would a TAG chip change anything?

    • @bitsundbolts
      @bitsundbolts 2 years ago +1

      I have a tag chip installed - 100% of memory is cacheable. During this test, I had 32 MB memory installed. But I am also using SD-Card to IDE devices.
      Maybe later with Super Socket 7 I can revisit this - maybe even try a small SSD in the future.

    • @kokodin5895
      @kokodin5895 2 years ago

      @bitsundbolts That is kind of interesting, actually. Since you already have a TAG chip, I would try tests that can handle more RAM and fill the board with the biggest, fastest SIMMs it can handle (or DIMMs, I don't really know the platform),
      because you are hitting some kind of performance wall, and unless you can say 100% that it's the CPU, you will see no difference, whether it is storage, RAM or cache, because the RAM is just too small to work efficiently with a bigger cache.
      You can also remove the TAG chip and check whether that decreases your results in a significant way, and at what amount of memory it starts to play a role.
      For storage I would probably just add a PCI SATA controller card and a small SATA SSD. It could be funny to see storage being almost as fast as RAM in that kind of system, with the PCI bus being the bottleneck.

    • @bitsundbolts
      @bitsundbolts 2 years ago +2

      You are probably right - I should have used at least 64 MB of RAM - this is the minimum the ASUS P55T2P4 can handle without Tag. But I had the Tag installed anyway - and once I tested the board with 256 MB (4x64MB + Tag) memory. There was no non-cacheable memory area.
      Cache is anyway an interesting topic. I am planning to do more in that area - but it requires a lot of research. That will be a future project!

  • @fernandolopes1114
    @fernandolopes1114 2 years ago

    missing my ti line sir.

  • @janwiederlechner9901
    @janwiederlechner9901 2 years ago +5

    Your conclusion is wrong because your system is unable to fully utilize anything faster than a (very slow) K6-2 without on-die L2 cache. To measure it correctly you need three things. First, get rid of any EDO RAM and use SDRAM only; even the 66 MHz 15 ns version is more than 3x faster than the fastest 50 ns EDO. EDO's performance is really horribly bad for such CPUs and it is a clear bottleneck for your system. It is good for a Pentium 120 MHz but too slow for anything beyond that. Second, you need an FSB of 100 MHz or more, ideally 112 MHz or more. And third, you need SDRAM that is as fast as possible, ideally with CL2 :-). In that case you will see a healthy +20% performance boost with 256 KB of L2 cache over 128 KB and data throughput >200 MB/s. In short: without a good Super Socket 7 motherboard there is no use for such fast CPUs, and it is a waste of time to run such tests, which give wrong results.

    • @bitsundbolts
      @bitsundbolts 2 years ago +1

      Right now, I don't have a Super Socket 7 board. Once I get one, I can compare and see what difference it makes using all those things you mentioned. This is not the end of it; I am determined to find a good argument for the larger cache.

    • @janwiederlechner9901
      @janwiederlechner9901 2 years ago +2

      @bitsundbolts Don't get me wrong, you made a nice video, but with too weak a motherboard and RAM. Right now it's like testing server CPUs with a single-threaded application. In the end you will see all such CPUs as the same, no matter what they can do :-).
      I have a fast Super Socket 7 board with CL2 RAM, and the difference between K6-2+ and K6-III+ at the same clock really is +20% in games, even more in 3D tests (in 3DMark 2000 I got 2775 vs 3560 points with 128/256 KB of on-die L2 cache). An ordinary K6-2 without on-die cache is naturally substantially slower than both the K6-2+ and the K6-III+.
      As for the speed difference between EDO and SDRAM: fast EDO can provide around 55 MB/s, slow SDRAM ~100 MB/s. PC100 CL3 gives around 160 MB/s, PC112 CL3 can give you 180 MB/s, and CL2 approx. 210 MB/s.

    • @bitsundbolts
      @bitsundbolts 2 years ago +2

      I am fine with what you are saying - and I agree. I am in the process of getting a proper motherboard to do such comparison in the future. Comments like yours will also help me to get ideas what to do in the future! So, thank you for the info. At the end I just want to create something that is enjoyable and informative. That does not imply that I will always do the best job and there may be a time when I will be completely wrong. I hope we are all here to learn ;)

    • @AlexBoneff
      @AlexBoneff 2 years ago

      I have not heard some of these terms since decades ago :D
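
The figures quoted in this thread can be put side by side with a little arithmetic. A minimal sketch using only the numbers given in the comments above; the dictionary labels and function names are illustrative, and the bus speed for the CL2 figure is not stated in the comment, so it is labelled as quoted.

```python
def percent_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# 3DMark 2000 scores quoted in the thread (128 KB vs 256 KB on-die L2).
score_gain = percent_gain(2775, 3560)

# Memory throughput figures quoted in the thread, in MB/s.
quoted_mb_s = {
    "fast EDO": 55,
    "slow SDRAM": 100,
    "PC100 CL3": 160,
    "PC112 CL3": 180,
    "CL2 (as quoted)": 210,
}
relative_to_edo = {
    name: mb_s / quoted_mb_s["fast EDO"] for name, mb_s in quoted_mb_s.items()
}

print(f"3DMark 2000 gain: {score_gain:.1f}%")
for name, ratio in relative_to_edo.items():
    print(f"{name}: {ratio:.2f}x fast EDO")
```

The quoted scores work out to a gain of roughly 28%, in line with the "even more in 3D tests" remark, and the quoted throughput figures put slow SDRAM at a bit under 2x fast EDO.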

  • @paveljelinek772
    @paveljelinek772 2 years ago +2

    I used to own k6-3+ 1.6V😭😭😭 damn i sold it in 2014 for just 3500CZK to a collector.. today it would be probably worth more than triple
    edit: bro, you were most probably very limited by that fukin slow EDO ram.. btw this is the first case i see a Ssocket7 with edo slots.. SDR would be way faster man

    • @bitsundbolts
      @bitsundbolts 2 years ago +1

      Yes, I know that this is not a good combination. I am waiting to get my hands on a Super Socket 7 Board. But then I want to know how much faster SD-RAM really is - more content for the future!
      Edit: Sorry for you selling that nice K6-3+! Would love to have one of those!

    • @paveljelinek772
      @paveljelinek772 2 years ago

      @bitsundbolts Yeah, they overclock like crazy. I swear I managed to get it rock stable at 700 MHz with 2.25 V, and it would probably hit 725-750 MHz

    • @bitsundbolts
      @bitsundbolts 2 years ago

      Once I get my hands on one of the better SS7 boards, I will try my luck! But I think it was very difficult to go past 600/625 MHz.

    • @paveljelinek772
      @paveljelinek772 2 years ago

      @bitsundbolts Maybe a silicon lottery? I did not crash test its stability though..

    • @Romerco77
      @Romerco77 1 year ago

      @bitsundbolts Indeed it is difficult. Right now I have a K6-III+ 400 (1.6 V) running at 600 MHz at 2.0 V. Beyond that, things get very unstable unless you crank up the voltage, and the practical performance increase is not worth it.

  • @cosmefulanito5933
    @cosmefulanito5933 1 year ago

    The problem is that you are using a 66Mhz bus. So you don't see any differences. The motherboard is artificially limiting the processor.
    Please, stop using that motherboard and use a consistent one.

    • @bitsundbolts
      @bitsundbolts 1 year ago +1

      I will retest this soon using a super socket 7 which I didn't have at the time of recording this video.
      But I don't think performance will increase by much using a super socket 7 board and the switch between 128 and 256 kb of Level 2 cache. I'm guessing around 5%.

  • @ricardobortolin6332
    @ricardobortolin6332 2 years ago

    sa bruh

  • @theALFEST
    @theALFEST 1 year ago

    quake is much better benchmark

  • @another3997
    @another3997 1 year ago

    The results are definitely interesting, if a little unexpected. But, please don't put that awful jazz "music" in the background... my ears will never forgive you for the pain inflicted! 😉

    • @bitsundbolts
      @bitsundbolts 1 year ago

      I hope you notice an improvement in my more recent videos 😉