Because they were basically using the same FPU 2 generations later..... Well, not quite. It was enhanced, but on their initial DX2 and DX4 CPUs the FPU was forced to run at bus clock. And on the Socket 3 5x86 it ran at half CPU clock. It was still really strong for its actual clock speed, but it was disadvantaged by its low clocks.
Intel basically just got the FPU running according to spec, whereas Cyrix hired mathematicians to optimize their FPU. Later on, FPU usage in FPS games was wildly underestimated by Cyrix as well as AMD. AMD was able to rectify this with the Athlon, which had a really strong FPU. Unfortunately, later they were betting on the APU and GPGPU for FP math rather than the integrated FPU in their Bulldozer design. We know how that turned out ...
@@stefanmisch5272 Bulldozer was counting on games going wide-threaded about 5 years before that really started to happen. It really had nothing to do with GPGPU heterogeneity. It also suffered from chronic lateness to market and a gross miscalculation about just how strong Sandy Bridge actually was. Bulldozer was intended to compete with Nehalem, not Sandy Bridge / Ivy Bridge. In that world it would have been more than competitive against Gulftown and Clarkdale. AMD saw the writing on the wall pretty much the day the 2600K was launched and decided to throw everything, including the kitchen sink, at a clean-room, proper, straight-up high-performance core in Zen, and dedicated only just enough resources to it to keep them from sinking. The interesting thing is that all 5 (or 6?) iterations of APUs did see IPC increases and refinements each generation. I would imagine that had AMD not thrown in the towel, and had it made use of more aggressive process nodes and such, the performance-class chips would have had the 10% IPC gains per year that AMD originally promised, as Excavator saw. Keep in mind that it was really only the second iteration, and the APUs went several after that. But in the end they really did make the right move. Lisa Su seems to know what she is doing. I am no Bulldozer apologist. It was a day late and a dollar short, and as a result it really did suck, if I'm honest. But I am fascinated by the vision they were going for. Bulldozer is actually still a decent gamer if you have an 8-thread 8350 with some clock applied; it had as long a useful life for those who bought it as the 2600K did. "It was ahead of its time".. lol
These chips were the Radeon Instinct MI 100 and Nvidia Tesla A100 from the late 80s / early 90s. Amazing how much this tech has progressed in 3 decades.
As a child I came across different systems which had those empty sockets for Math CoProcessors and I always wondered what magic could be in there. Now I know for certain! Thanks so much, I love this channel!
In early Pentium days, games started to require "486 or better", but they usually only needed a math coprocessor. By that time, 387s were very inexpensive and, most of the time, they were enough to allow 386s to run "486 or better" games.
If I remember correctly, the Cyrix also has 4x4 matrix functions built in. I don't know if any of the other brands have this feature; I think there were some libraries you could link in when compiling.
A great topic for those who don't know how we used to suffer back in the day! I appreciate this video and thank you so much for your hard work! Your brother, W.Bushnaq from Kingdom of Saudi Arabia
Love the video. I would also love to see these benchmarks run on other CPUs. If you can control for run-to-run variance, the fractal test could show even small differences, or maybe pinpoint some specific architectural traits.
You have an impressive collection of hardware there! Respect. Edit: in '90-'91 I was still running an Atari 1040 STe and an Amiga 500. The hardware you are showing was pretty high end/niche and indeed quite expensive at that time.
I had to get the 387 for AutoCAD 1.0. The difference really was impressive for my old i386. Drawing in 3D was so very slow without it... painfully so.
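On controlling for run-to-run variance: a minimal sketch of one way to do it, assuming you repeat the same workload several times and compare medians rather than single runs. The `mandelbrot_points` workload, the repeat count and the "spread" measure are illustrative choices of mine, not the benchmark used in the video.

```python
import statistics
import time


def mandelbrot_points(width=40, height=20, max_iter=30):
    """Count grid points that stay bounded: a small, deterministic FP workload."""
    inside = 0
    for row in range(height):
        for col in range(width):
            c = complex(-2.0 + 3.0 * col / width, -1.0 + 2.0 * row / height)
            z = 0j
            for _ in range(max_iter):
                z = z * z + c
                if abs(z) > 2.0:
                    break
            else:
                # The loop never escaped, so count the point as inside the set.
                inside += 1
    return inside


def time_runs(workload, repeats=5):
    """Run the workload several times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (median, relative spread) so noisy runs are easy to spot."""
    median = statistics.median(durations)
    spread = (max(durations) - min(durations)) / median if median > 0 else 0.0
    return median, spread


if __name__ == "__main__":
    median, spread = summarize(time_runs(mandelbrot_points))
    print(f"median {median:.4f} s, spread {spread:.1%}")
```

If the spread between runs is larger than the gap between two chips, the difference between them is probably not meaningful.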
I remember MicroProse Grand Prix (GP1) speeding up when I installed a 387 on my Cyrix 386SX-16 computer back in the day. Might be an option for your 387 testing.
I'd really like to see a comprehensive list of pre-Pentium era games that used FPU co-processors. And also including games on the Amiga/Apple Mac/Atari ST that actually used the Motorola 68881 & 68882 FPUs...
As a 386 enthusiast I can only like this kind of stuff. Very nice! Interesting and surprising results. I also have a selection of 387 FPUs in my collection, but I never bothered to actually test them like this. Just like you, I don't like empty sockets, so I just installed one of them... I have now learned that I chose one of the slow ones ;-) (the ULSI MathCo).
Integration is what makes technology cheap; what happened to FPUs happened to almost everything, steadily. It's been happening from the very beginning of integrated circuits. Before there were CPUs built from a single chip, you could build a CPU from many chips where each chip had some small number of functions, e.g. some registers, or an adder, or logic operations, or addressing into a core memory plane. L1 cache used to be on the motherboard, then it was put into the CPU and L2 cache was put on the motherboard. FPUs used to be done in software, then there were coprocessors, then the coprocessor was put into the CPU. Then L2 cache was put into the CPU. Multiple CPUs used to live in multiple sockets on the same motherboard, and then they were put into the CPU. RAM used to come as tiny DIP chips, dozens and dozens of them, then they were put into SIMMs and DIMMs and stuck on the motherboard. The memory controllers used to live in the north bridge on the motherboard, and then they were put inside the CPU. An early 8088 or 286 motherboard could do almost nothing and had an overwhelming number of chips on it. If you want a floppy drive, there's an ISA card for that. If you want a hard drive, there's an ISA card for that. If you want a scanner, there's an ISA card for that with a parallel port on it. If you want a mouse, there's an ISA card with a serial port on it. The 386DX-40 is when the PC really started to become affordable. There were maybe half a dozen chips plus some cache DIP modules; very highly integrated. If you wanted CD-ROM, hard drive, mouse and printer, there was a single ISA card for that, and it was fairly cheap. That's what made technology cheap; what made technology fast was Dennard scaling, not Moore's.
I remember contemplating getting a FasMath back in the early 90s when they started to get really cheap. I saw benchmarks of one being run at 16MHz on a 33MHz Intel 386 DX and almost keeping up with the Intel 387 at 33MHz, and I was very, very impressed. What I didn't understand at the time was that the first few Cyrix FPUs were limited to lower clocks, but when run at half the bus clock they were able to avoid async latency penalties, and their IPC increased a measurable amount relative to clock (vs. say running them at 20 or 25MHz on a 33MHz bus clock). So it made them look really, really good. Of course running them at bus clock will be much faster, but the ability to show it off at 16MHz was a great selling gimmick. Of course the Intel FPU would increase IPC in the same situation, but the salesman wouldn't ever bother divulging that. IIRC halving the clock of the FPU on a 33MHz bus would only slow it to about 60 or 65% and not to 50% like one would expect, perhaps due to a relatively higher amount of memory bandwidth. This is why there was usually a minimal difference when I ran mine at 20MHz over 16MHz in my computer at the time. Faster clocks were always better, but some dividers had fewer penalties than others. I saw no reason to run mine at 16 other than to impress people, but being a little bit OCD, it just seemed more rounded. I am unable to recall, however, whether the slower FPU would end up bottlenecking the 386 at all by potentially tying up the system bus a little longer. I didn't think it an issue at the time. Some integer ops ran slightly slower after putting in the FPU (like maybe 3-5%), but from what I understand that was normal no matter what speed the FPU was. For what I was doing, it was a welcome boost and well worth every penny.
@@CPUGalaxy lol good luck. I just got a very obscure job a few days back to extract some data from some really old QIC carts, and I happened to remember I had a special accelerator card for those floppy-port-based tape drives. I was up through the night trying to get it to work, only to realize the drive I was provided was partly dead. Such is life for us retro enthusiasts. The spare they had worked perfectly though, and I was able to turn a 4-hour-per-tape extraction into about 30 minutes with the card I had. It was the first time I ever used that card, which I picked up on a whim because I thought it would come in handy one day... That was 26 years ago. haha Thank you for all the time you put in. It brings back memories and keeps up my enthusiasm for playing around with my collections.
There was also, back in the day, a very unusual brand: Weitek. Their coprocessors were the fastest, but they didn't work as an x87 replacement and needed a special socket. In fact they were available for other CPU families as well. Also kudos to all the manufacturers that made the effort to implement IEEE 754 in hardware; it's quite a complex standard for 1985's technology. (EDIT: Oops, I have realized there is at least one other comment on the matter.)
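One way to picture the "half the clock, but 60 to 65% of the speed" recollection is a simple fixed-overhead model. This is only a sketch under an assumption of mine: that some share of each FPU operation's time (bus transfers, CPU-side handshaking) does not scale with the FPU clock. The 0.6 compute share below is a made-up value chosen to land near the quoted figure, not a measurement.

```python
def relative_speed(compute_fraction, clock_ratio):
    """Speed relative to full clock when only the compute share scales with FPU clock.

    compute_fraction: share of total time spent inside the FPU at full clock (assumed).
    clock_ratio: new FPU clock divided by the original FPU clock (0.5 = half clock).
    """
    if not 0.0 <= compute_fraction <= 1.0:
        raise ValueError("compute_fraction must be between 0 and 1")
    if clock_ratio <= 0.0:
        raise ValueError("clock_ratio must be positive")
    # The fixed part keeps its duration; the compute part stretches as the clock drops.
    new_time = (1.0 - compute_fraction) + compute_fraction / clock_ratio
    return 1.0 / new_time


if __name__ == "__main__":
    for share in (1.0, 0.8, 0.6):
        speed = relative_speed(share, 0.5)
        print(f"compute share {share:.0%}: half clock gives {speed:.1%} of full speed")
```

With a 60% compute share, halving the clock leaves 62.5% of the original speed, which sits in the range remembered above; other explanations (such as the async penalties mentioned) could give similar numbers.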
I seem to recall WinChip and some other manufacturers that Best Buy and CompUSA/ComputerCity used to stock for older CPUs late in the day in 1999, in addition to Cyrix and other companies. FPUs probably could be used to improve decoding of music files. I recall on the Atari Falcon030 there was one MOD - or was it MP3? - player that could use the Motorola 68882 FPU to improve performance years after that computer was discontinued. I'm sure there's players on the Amiga that did the same...
You really have to add the Weitek co-processors. They needed a different socket, but they were fast! I had done a similar (but more comprehensive) test about 18 years ago, comparing everything @ 20MHz, with what 387s I had available, but also with various 386 models. Apart from one graph (on some CPU forum which I cannot remember), I never published the results.
Very interesting video. I used to own the IIT DLC3 387 math chip in my old 66Mhz IBM Blue Lightning computer which I purchased back in late 1993.. First time I've seen that chip in years... quite nostalgic.
Fascinating comparison, I remember there was a DOS utility that would emulate a coprocessor for CPUs that didn't have it, I wish I could remember the name of it as it actually gave a slight boost to performance. I recall wanting to buy a 387 back when I had a 486SLC and then being flabbergasted at the price - far outside of my range back then as a student!
I messed around with POVRAY back in the DOS 5 days on a superb 486SX25. Renders would take hours to days to finish. When I upgraded to a DX2-50 it would take minutes to hours to render. I never did own a socketed co-pro.
I'm glad I found this channel. Love em processors. Also, I have a question: When this person was installing the co-processor, how did he manage to place it in the correct way? I didn't see any marks or guides.
Thank you very much. Regarding your question, I can recommend watching my video where I did a more detailed review of that board. And you are right, placing the chip the right way round is an issue. Here is the link: ruclips.net/video/qaGQxZEYby0/видео.html
@@Dj-Mccullough Not quite. Their 386 CPUs were on par with Intel's IPC and were clocked higher as well. At the time Intel, despite being the designer, basically had the lesser chip compared to almost all competitors. It was their later Socket 7 integrated FPUs that were feeble compared to AMD, Intel and IDT. Their integer units were actually quite a bit faster clock for clock than Intel's at that time (hence the 200+ rating at 150MHz), but sadly for them Quake came along, which used the FPU.
I was wondering if it makes sense to add a copro to my 486DLC40 CPU, as it is annoying to see an empty socket, as you say... In the end, it's only a matter of aesthetics, without any purpose. Ahhh, the best part of the best DOS demo, Crystal Dreams II! You have good taste 😉 Best regards from Styria!
Nice video as always :) I wanted to see this comparison. I was collecting 387s recently. I have 4 of them: the top models from Intel, IIT, ULSI, and Cyrix. Waiting for some 386 chips to test these. My 386 collection is very poor.
I have my old Am386-40 with an IIT 387 running a manufacturing machine. Still running strong every day after all these years, even with the old "cheap" VGA monitor.
If I remember correctly, Worms was a game that was unplayably slow on a 386SX-20... but got a huge boost with a 387. Would be cool if you could test that :)
I was messing around with some old computers a couple of years ago and I discovered that some of the more recently compiled open source DOS tools were compiled in such a way that they require floating point. That was frustrating, as one of the machines I was using was a passive backplane board with no accommodation for a 387 even if I had one. I recalled that back in the day there were FPU emulator TSRs for DOS, went looking, and found that you can still license them for a bunch of money. Oh well.
I remember a long time ago (mid 90s) I stumbled across a free 287-10 and man, it made a huge difference with Links 286. IIRC it was actually 2MHz slower than the main CPU, but they seemed to get along well enough.
@@5roundsrapid263 The 287 FPUs use a different internal clock modifier. If I remember correctly they run at 2/3rds the CPU clock, so his example was probably running underclocked at 8MHz when used with his 12MHz CPU.
@@5roundsrapid263 I wasn't really aware of it either until I went looking for a 287 to pair with my Harris 286-16 and couldn't find anything faster than 12MHz FPU so I did some research. Turns out I only needed a 10MHz part. :)
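The arithmetic behind those two examples, as a small sketch. It takes the 2/3 ratio as recalled in the comments above ("if I remember correctly") rather than as a verified datasheet value; `ASSUMED_FPU_RATIO` and `fpu_clock_mhz` are names made up for this illustration.

```python
from fractions import Fraction

# Assumption taken from the comment above: the 287 runs at 2/3 of the CPU clock.
ASSUMED_FPU_RATIO = Fraction(2, 3)


def fpu_clock_mhz(cpu_mhz, ratio=ASSUMED_FPU_RATIO):
    """Effective 287 clock in MHz for a given 286 clock, kept as an exact fraction."""
    return Fraction(cpu_mhz) * ratio


if __name__ == "__main__":
    for cpu in (12, 16):
        print(f"{cpu} MHz CPU -> about {float(fpu_clock_mhz(cpu)):.2f} MHz at the FPU")
```

Under that assumption a 12MHz CPU drives the FPU at 8MHz, and a 16MHz CPU at roughly 10.67MHz, which is close to the 10MHz part mentioned.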
This comparison would have been fantastic for me in 1994. Our primary family PC at the time had a 100MHz IBM Blue-lightning CPU and a 33MHz Cyrix Fasmath. I had always felt the Fasmath let us down for performance, but now I know it wasn't actually so bad and just the low clock speed that was an issue. The computer was upgraded with an Intel DX2 Overdrive which was noticeably slower for integer but much faster for floating point, and then later with an IBM 5x86 that was much faster in every way.
The only thing I found so far is that the Cyrix FasMath CX-83D87-33-GP (and KN) models are asynchronous, unlike the others, so they shut down unused parts of the FPU to save memory bandwidth on the data bus. Otherwise they were built the same: same 32-bit bandwidth and data bus, and same frequency. Will update as I find out more information.
I think software floating point is always used unless software is written to detect and take advantage of a hardware FPU, where it will use different instructions to do the calculations on the FPU... so there is no special Software emulated FPU -- as that is just what any program is doing already if there is no FPU.
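To illustrate the idea of doing non-integer math with integer instructions only, here is a minimal sketch. Note that it uses 16.16 fixed-point arithmetic, which is a simpler stand-in and not the software floating point routines a compiler's runtime library would actually supply; all names here are hypothetical.

```python
FRACTION_BITS = 16           # 16.16 format: 16 integer bits, 16 fractional bits
ONE = 1 << FRACTION_BITS     # the fixed-point representation of 1.0


def to_fixed(value):
    """Convert a Python float to a 16.16 fixed-point integer."""
    return int(round(value * ONE))


def to_float(fixed):
    """Convert a 16.16 fixed-point integer back to a float for display."""
    return fixed / ONE


def fixed_add(a, b):
    """Addition needs no adjustment: both operands share the same scale."""
    return a + b


def fixed_mul(a, b):
    """Multiply, then shift right to drop the extra scale factor.

    Python's >> floors, so negative results round toward minus infinity.
    """
    return (a * b) >> FRACTION_BITS


if __name__ == "__main__":
    x, y = to_fixed(1.5), to_fixed(2.25)
    print(to_float(fixed_mul(x, y)))  # integer multiply and shift only
    print(to_float(fixed_add(x, y)))
```

Everything inside `fixed_mul` and `fixed_add` is plain integer work, which is why this kind of code runs on a CPU with no FPU at all.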
Seeing that 487SX, I'm officially envious now. :) That's a damn rare find! I'm waiting for that separate video about it. Edit: Fractint was a great program. For hi-res pictures with more complicated math it could take a night to render. I still have a lot of those fractal images on CD. And some of them are printed in color and hung on the wall. :)
@@matthewday7565 I thought 80486 CPUs had an integrated FPU on them. I'm curious what the purpose of a stand-alone additional FPU was. Perhaps someone knowledgeable could explain?
7:48 What chip is supposed to be in the unpopulated location below the large 486 socket? Is it part of the OPTi chipset or an alternative location for a CPU?
Only a guess, but perhaps an option for on-board graphics in other models of the same board? It looks like there are two different pad options for different-sized ICs...
To clarify that: it's not for the chipset or onboard video. It is for a solder-on version of the 486 CPU. Check out another video where I show more details of this board. ruclips.net/video/qaGQxZEYby0/видео.html
These old floating point chips make me appreciate the FPUs that come standard in today's modern CPUs, especially given how expensive these chips were back in the day. As a regular home PC user back then, these chips were nothing more than curiosities.
I'm re-watching the video and I'm still waiting to receive my ULSI DX/DLC 40 copro to push my 386DX 20 over its limits! Thanks again for your amazing content, take care F
About games using the math co: there is the original Quake and a game called Retro City Rampage 486, and they both run on 386 hardware. Also, a lot of software, like some trackers, makes use of it.
Nice test, but could you, please, test and compare it to i486DX33 too? Would like to see if the FPU in 486 was any better than 387 FPU from INTEL. Maybe you can compare different 486 brands in FP math test like the one performed here, to see if all 486 variants had the comparable built-in math coprocessor. I do not think I've ever seen test like that before focused just on math part of 486 CPUs.
If I remember correctly, that i487SX is basically a full 486DX that checks that a 486SX is present, then disables it and takes over all processing by itself. Seemed like a bit of a scam, especially if your 486SX was just a DX with its FPU disabled. Looking forward to the video on that one.
You are correct. The i487 has a signal on one of the pins that disables the SX and takes over for it. I had a Dell 486 tower back in the 1990s that had a "487" socket. The good thing is that you could put any other 486 CPU in it and it would take over for the SX. So I installed a Kingston Turbochip with an AMD 5x86-133 that blew the doors off any silly i487SX.
The most interesting bit was the motherboard, which could fit both 486 and 386 chips (I did not know that was possible). Could you please repeat the same benchmark with a 486 CPU?
Good vid. Really great content from your channel. My only grumble would be out of the box thinking. Ie could you have stuck a 486dx 33 comparison at end. Ie the generation leap to 486 with integrated fpu. Edit typo
Yeah, you’re right. That would have been a great idea! 😓. Damn dude, why didn't I have this idea 💡. But anyhow, I have planned a video about the Intel Rapid Cad with that board, and there I will compare it with the 486DX33. Thanks for your input. This is what I call a great audience and community.
@@CPUGalaxy Hopefully with a back comparison, i.e. a sub-benchmark that has the 2 apps you used in this video alongside whatever apps you use for the Rapid Cad. Hopefully by Xmas you get the place you deserve on YouTube, half a million or 1 million subs, and growing! ❤️❤️❤️❤️❤️❤️❤️❤️❤️❤️❤️
Very interesting and useful information. Expected the IIT to score a little better than it did, for no particular reason, though Cyrix was always king in the FPU department back then. It's a shame they couldn't keep up in later years. Going to have to re-test my own Cyrix 387DX, purely as I have to wonder if they were the same as the 83D87 versions or not, but don't have one of those (or the mysterious KN version). It should now be possible to deduce this from looking at your results, figuring out how far off my testbed system is with known chips (Intel, IIT) and applying that 'offset' to whatever the Cyrix chip scores. So thanks for doing these tests, very informative.
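The "offset" idea above could be sketched roughly like this, assuming benchmark scores scale by about the same factor between the two testbeds. Every score and chip label in the example is a made-up placeholder, not a result from the video, and `calibration_factor` and `normalize` are hypothetical helper names.

```python
def calibration_factor(reference_scores, local_scores):
    """Average ratio between reference and local scores for chips both testbeds share."""
    shared = sorted(set(reference_scores) & set(local_scores))
    if not shared:
        raise ValueError("no chips in common between the two testbeds")
    ratios = [reference_scores[chip] / local_scores[chip] for chip in shared]
    return sum(ratios) / len(ratios)


def normalize(local_score, factor):
    """Project a score measured on the local testbed onto the reference testbed's scale."""
    return local_score * factor


if __name__ == "__main__":
    # Placeholder numbers only: known chips measured on both systems.
    reference = {"Intel 387": 100.0, "IIT 387": 110.0}
    local = {"Intel 387": 50.0, "IIT 387": 55.0}
    factor = calibration_factor(reference, local)
    # A chip measured only on the local system, projected onto the reference scale.
    print(normalize(60.0, factor))
```

If the ratios for the known chips differ a lot from each other, the single-factor assumption does not hold and the projected score should not be trusted.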
Thank you, great to see you here on my channel. ☺️. I'm eagerly waiting for a new video on your channel. Actually, you motivated me to make this 387 video. I recently watched a video of yours where you got so mad at the Landmark benchmark on your 386 setup. 😅. By the way, I love the intro you now do for your videos.
As far as I'm aware the 487 is just a 486DX that takes over all functions from the CPU. There are some early 386 boards which allowed you to use 287 coprocessors, it would be very interesting to see these in action and how they compare to the 387 setups! Would also love a follow up to see how much better the 40MHz parts perform at their rated speed with the AMD CPU. Makes me wonder what the top dog best setup is.
"But why are we collectors populating empty 387 sockets? Uh, you know the answer: yes, because we can and just hate to see empty cpu sockets." Made me laugh hard. But it's actually true. Even 287s if that matters. I have a few fpus and wanted to test them but decided not to because it would have taken time (sloppy excuse). After a few days I get this. Come on, coincidence in the galaxy? :)
My recollection is that the 486SX processor was a full 486DX processor with its floating point unit artificially disabled. The 487SX "math co-processor" was in reality another fully functional 486DX CPU, with one extra pin to prevent it from being used in regular 486 boards, and also to prevent a regular 486DX from being used in a 486SX board. Once you installed the "487SX", it took over completely from the 486SX, which simply sat idle in a halt state. The only upside to this whole ridiculous arrangement is that it later became possible to install more powerful "486" and "586" processors with clock multipliers to get a genuine performance boost.
I remember using some shareware that emulated the 387 co-processor in software, for the programs that absolutely needed the maths unit. While it was not as fast as the actual silicon, in most testing it came out slightly faster than the CPU by itself, more from optimised code for the functions than anything else, so it would probably have cut the CPU-only render times down to under 40 minutes.
I used the same with my 16MHz 386SX machine. Then, while rendering AutoCAD 3D plots of my sister's house that was to be built, I got frustrated and bought a co-processor.
@@stonent But only if the software is compiled to make use of the FPU. Non-integer math will still run on the CPU unless the software is specifically designed to take advantage of it.
We had 8087 in XT-machine, then 387sx when moving to 386sx machine, after which the FPU was built in main CPU. My father used it for CAD-stuff, I merely for fractal rendering and Falcon 3.0. Funny how for me it was ”normal” to have FPU, but none of my friends had those. We then sometimes compared how long time it took to render some images, and with FPU it was a lot faster with proper software.
One thing of interest might be how well the various FPUs complied with the IEEE 754 standard. There were some interesting "bugs" in various cores, and also in the emulation libraries. (For some values of interesting, of course.)
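A few of the IEEE 754 behaviours in question can be probed from software. The sketch below checks the double-precision arithmetic of whatever machine runs it (Python floats on a modern host), so it says nothing about a 387 by itself; it only illustrates the kind of checks one could port to a DOS-era test program. The function names are my own.

```python
import math
import sys


def has_gradual_underflow():
    """Halving the smallest normal number should give a nonzero subnormal, not zero."""
    smallest_normal = sys.float_info.min
    half = smallest_normal / 2
    return 0.0 < half < smallest_normal


def rounds_ties_to_even():
    """In the default rounding mode, exact halfway cases go to the even mantissa."""
    eps = sys.float_info.epsilon
    # 1 + eps/2 lies halfway between 1 and 1 + eps; the even neighbour is 1.
    tie_down = (1.0 + eps / 2) == 1.0
    # (1 + eps) + eps/2 lies halfway between 1 + eps and 1 + 2*eps; the even one is 1 + 2*eps.
    tie_up = ((1.0 + eps) + eps / 2) == (1.0 + 2 * eps)
    return tie_down and tie_up


def has_special_values():
    """Overflow in multiplication should produce infinity, and NaN should not equal itself."""
    overflowed = 1e308 * 10.0
    not_a_number = overflowed - overflowed
    return math.isinf(overflowed) and not_a_number != not_a_number


if __name__ == "__main__":
    print("gradual underflow:", has_gradual_underflow())
    print("round half to even:", rounds_ties_to_even())
    print("inf and NaN:", has_special_values())
```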
Glad I've been holding off for months on deciding what FPU to buy. I really want my 386 (on one of those FOREX boards w/ clear jumper instructions on the PCB) to look proper...plus I'm trying to make it a bit faster than it was back when I last used it (around 2006-2007). Unless I find a 486 DX/2 66 cheaper.
I'm curious about that motherboard you have used, would a 486 to Pentium Overdrive or equivalent chip work on that motherboard also? Yet another really interesting, detailed and well explained video. Great work. 👍🏻😃
We take floating point arithmetic in hardware for granted. My dad used to write these things in assembly language. In the second volume of Knuth's book "The Art of Computer Programming" there is an entire chapter on multiplication "How Fast Can We Multiply?"
@@CPUGalaxy The Weitek 3167s are pretty cool processors. Love to see more content on those! They're different beasts from the (Intel) x87 units and are way more flexible / versatile. They can be programmed independently as a memory-mapped device, claiming a bit of system memory and running their own dedicated code, separate from the 386 CPU, something a 387 cannot do as it's directly bound to the CPU. Data transfers to and from the 387 coprocessor are accomplished directly through I/O lines and are automatically generated by the 80386 CPU for direct coprocessor instructions, making it more of an 'addon' / extension for the 80386 CPU itself. The Weitek was more of a 'parallel' design in that regard, as it could run its own custom-developed calculation code and could output its results directly to a memory address in RAM without needing the main CPU to do that for it, freeing up valuable CPU cycles. Reminds me a bit of the Amiga's chipset, which also did true parallel processing with a very advanced DMA controller for its time. IBM PCs were actually quite limited in that regard with their 'simple' little DMA controllers.. The Weitek units were mostly used for dedicated industrial purposes and certain customized scientific calculations where the 387 wouldn't cut it. For single precision floating point calculations the 387 was faster than the Weitek though. I wonder if there were people back in the day who were extreme enough to use both Weiteks and 387s at the same time. I think that's possible, as my Compaq Deskpro 386 can take both a 387 and a Weitek and has 4 separate sockets (385, 386, 387 and Weitek), very cool stuff! 👍 Already got an IIT 387 in there, time to find me a Weitek 3167 and do some experiments 🤓
Do you plan on reviewing the i487 chip? That one was marketed as an upgrade for motherboards with embedded 486 SX chips, but in reality that "co-processor" contained a full DX CPU that entirely disabled the SX. It wasn't a real math co-processor. I've read in the documentation that the chip would only run if it detected an original SX CPU, but I tested it on a Matsonic M601 motherboard, just by itself, with no other chips, and it worked. It ran at 25MHz. That makes me wonder if the motherboard was designed to spoof the presence of an SX.
@@CPUGalaxy Will be looking forward to your review and hopefully benchmarks to compare against mine. It was substantially slower than my regular DX2-66 that I had the board with. I have another 486 motherboard that runs up to DX5-133 and POD83, but has no mention of i487 in the manual. I wouldn't know how to set jumpers correctly, and I don't want to take any risks.
I was never a fan of WinTel - my Amiga 1200 had a 68030 50Mhz card and I always lusted after a 68881 FPU but just for filling the missing space and SysInfo to give me great results. LOL.
Nice video. Does a co-processor work faster/slower with different main processors (brand, clock speed)? Will it change the ranking? How much faster is an integrated math processor compared to a separate co-processor? To be continued?
Good questions, and definitely TBC. 😉. Integrated math copros started with the 486, so of course they were faster. On the 387, if we increased the clock speed to 40 MHz, it would also get faster.
Twist: it was the same FPU just a few years later :)
If you think Quake mattered more than AutoCAD... They sure did something wrong, looking at it with 20/20 hindsight. But office people were always a good market. At work we try to squeeze the most out of our equipment because it is worth it. We keep an eye on how much time is lost waiting for machines to do their thing, and if there are time savings to be had, it is usually easy to justify either the effort or a hardware purchase so that people with quite high hourly wages do not sit idle waiting for their computers. Just calculate how much time (and money) could be saved if 10 of your employees lose 5% of their time because of slow machines or processes.
Yay, I had the IIT 3C87, for POV-Ray! (and demos + own coding ofc). It had a good price/performance ratio. I even called the c't magazine's hotline to help with my decision (I don't remember if Andreas Stiller was on the other end, but this is likely).
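That calculation, as a quick sketch. The 10 employees and 5% come from the comment above; the hourly wage, working hours per year and upgrade price are placeholder assumptions of mine, so plug in your own figures.

```python
def annual_waiting_cost(employees, lost_fraction, hourly_wage, hours_per_year=1800):
    """Yearly cost of staff time lost to waiting on slow machines or processes."""
    return employees * lost_fraction * hourly_wage * hours_per_year


def payback_months(upgrade_cost, annual_saving):
    """Months until an upgrade pays for itself, given the yearly saving it brings."""
    if annual_saving <= 0:
        raise ValueError("annual_saving must be positive")
    return 12.0 * upgrade_cost / annual_saving


if __name__ == "__main__":
    # 10 employees and 5% are from the comment; wage and hours are assumed values.
    cost = annual_waiting_cost(employees=10, lost_fraction=0.05, hourly_wage=50.0)
    print(f"time lost per year is worth about {cost:,.0f}")
    # Hypothetical upgrade price for all ten machines together.
    print(f"payback in about {payback_months(15000.0, cost):.1f} months")
```

With those placeholder numbers the lost time is worth 45,000 a year, so even a fairly expensive upgrade would pay for itself within months.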
Excellent video! I kinda figured Cyrix was going to be the fastest but I'm honestly a little surprised that Intel was the slowest. I wonder how do the various 487 coprocessors fare? Was Intel still the slowest of the lot? Because it seems like they didn't have a strong FPU until the Pentium CPU.
I have an i9 9700k and I don't know what the big caps and 2 small caps (or whatever they are) on the bottom are called, and I don't know where to buy them. Can I fix a damaged pad?
No, no it isn't. Literally, it isn't in any way like hyper-threading. Hyper-threading has no particular relevance to co-processors. Hyper-threading is a modification to the CPU scheduler allowing each physical core to process instructions in a way that attempts to keep all parts of the CPU's actual processing pipeline occupied. Think of it like an assembly line building parts, moving from station to station as each resource in the CPU is applied to each thread's needs. The effect is that multiple threads (between 2 and 4) can be processed 'at once' on a single core, incurring as small a performance hit as possible for 2 threads and an increasing performance penalty for 3 and up. An FPU in a co-processor configuration would take over all computation of specific math instructions, largely bypassing the CPU altogether.
I had to look it up on Wikipedia but I had always assumed the dx variants meant they had the math copro but apparently that was only for 486 CPUs. I remember running a 387 emulator for years when I used POV Ray. That would have been a good benchmark.
Do you have a RapidCAD? I'm curious where that would fall between the 386 and 486.
Yes, I have the Intel Rapid Cad as well. There will be a follow up video with benchmarks where I will compare the 386+387, Intel Rapid Cad and the 486 @ 33 MHz. 😀
@@CPUGalaxy Would you include other 386/486 hybrid chips such as 486dlc as well?
I've used a 386SX and a 486SX, both at 33MHz, and the 486 was twice as fast in regular use. You can find on the Internet the number of CPU cycles needed for floating point operations on the 386, 486, SX/DX. FP software emulation is slow on a 386, merely 50 CPU cycles and only 16 on the 486.
@@pascalmathieu9332 If I understand correctly, the RapidCAD was actually a 486DX wired up to the socket layout for a 386, more or less, with the FPU inside the die and the second chip merely a dummy for compatibility.
The second chip is a logic chip which redirects the math co-pro signals back to the CPU. This weekend, in my next video, I will review the RapidCAD.
Oh the nostalgia. I did my PhD designing maths co-processor hardware. My rose-tinted glasses filter out the hideousness of having to design these things and now I can just appreciate the beauty in them.
I love computer architecture. How many cycles did your designs take for a multiply? Was it pipelined? What was the radix of the adders used for multiply? Does it handle denormals and the different IEEE rounding modes or nobody cared? I got the impression those are the bane for many designers. Multiprecision multipliers are becoming popular these days for use on CNNs. I don't know if that made sense back then.
At some point I was involved in CPU designs, and one of them brought me to some FPU work. Shifters, shifters, shifters and CORDIC!
It's hypnotic to me to see the final result, how zillions of Verilog lines end up in those marvellous geometries on the die.
You must be a genius at that level... wonderful!
I remember in 1993ish when the AutoCAD ppl at work who were using older computers got upgraded with 387 co-pros. It was a glorious day.
I remember the day I got a 486 with a Matrox card for AutoCAD... Oh the joy of fast redraw!
who knew that I would need to wait 30-ish years to see benchmarks regarding something I always questioned myself about. Thank you !
Finally this channel gets the traction that it deserves. All this content is incredibly interesting, and I’m not at all into collecting stuff. It’s just a superb way of explaining the fundamentals of micro architecture. (edit: typo) (edit 2: and what in the world is that EDM-meets-yodeling song? Googling Austria hits, yodel techno, etc. is going nowhere ruclips.net/video/qaGQxZEYby0/видео.html )
thank you. lol, this yodeling song is called „bring me edelweiss“ from 1983. 😉
@@CPUGalaxy Thanks for the ear worms, guys 🤨
lol. now I got the earworm as well 😂😂
😂
Oh how I wanted a co-processor for my 386 SX25 all those years ago. Now I know that my desire was warranted 😄. Excellent video 👍✨
Hey Pauline, do you like vintage hardware?
Interesting to see that Cyrix nailed the fpu performance in the 387 era but got beaten up two generations later.
Good thing as they are cheap compared to some of the others be it for collecting or for a build.
Because they were basically using the same FPU 2 generations later... Well, not quite. It was enhanced, but on their initial DX2 and DX4 CPUs the FPU was forced to run at bus clock. And on the Socket 3 5x86 it ran at half CPU clock. It was still really strong for its actual clock speed but was disadvantaged by its low clocks.
Intel basically just got the FPU running according to spec whereas Cyrix hired mathematicians to optimize their FPU.
Later on, the FPU usage in FPS games was wildly underestimated by Cyrix as well as AMD. AMD was able to rectify this with the Athlon, which had a really strong FPU.
Unfortunately, later they were betting on the APU and GPGPU for FP math rather than the integrated FPU in their bulldozer design. We know how that turned out ...
@@stefanmisch5272 Bulldozer was counting on games going wide threaded about 5 years before it started to really happen. It really had nothing to do with the GPGPU heterogeneity. It also suffered from chronic lateness to market, and a gross miscalculation about just how strong Sandy Bridge actually was. Bulldozer was intended to compete with Nehalem and not Sandy Bridge / Ivy Bridge. In that world it would have been more than competitive against Gulftown and Clarkdale. AMD saw the writing on the wall pretty much the day the 2600K was launched and decided to throw everything, including the kitchen sink, at a clean-room, proper, straight-up high-performance core in Zen, and only dedicated just enough resources to Bulldozer to keep them from sinking.
The interesting thing was that all 5 (or 6?) iterations of APUs did see IPC increases and refinements each generation. I would imagine that had AMD not thrown in the towel, and had they made use of more aggressive process nodes and such, the performance-class chips would have had the 10% IPC gains per year that AMD originally promised, as Excavator saw. Keep in mind that it was really only the second iteration, and the APUs went several after that...
But in the end they really did make the right move. Lisa Su seems to know what she is doing.
I am no Bulldozer apologist. It was a day late and a dollar short, and as a result it really did suck if I am honest. But I am fascinated with the vision they were going for. Bulldozer is actually still a decent gamer if you have an 8-thread 8350 with some clock applied; it had as long a useful life for those who bought it as the 2600K did.
"It was ahead of its time".. lol
@@wishusknight3009 I still have a bulldozer machine built in 2016. Now it's 5yrs later but it's still slow as xxxk🤣
These chips were the Radeon Instinct MI 100 and Nvidia Tesla A100 from the late 80s / early 90s. Amazing how much this tech has progressed in 3 decades.
Awesome! I Nudged you for this maybe two months ago, thanks for doing it! It's a historically valuable video.
Hooray for Crystal Dreams! Loved that demo!
As a child I came across different systems which had those empty sockets for Math CoProcessors and I always wondered what magic could be in there.
Now I know for certain!
Thanks so much, I love this channel!
Thank you!
@@CPUGalaxy Keep up the good work and maybe follow up on the 287 or even earlier versions. ;-)
In early Pentium days, games started to require "486 or better", but they usually only needed a math coprocessor. By that time, 387s were very inexpensive and, most of the time, they were enough to allow 386s to run "486 or better" games.
Some of them did say "math co-processor required" but I imagine that was a more confusing sell than to just say "486 or better".
Very nice dude ! You make my day.
Now I understand that the ULSI FPU is a good choice (I just ordered one)
Thank you for your nice job :)
If I remember correctly, the Cyrix also has 4x4 matrix functions built in. I don't know if any of the other brands have this feature. I think there were some libraries to link in while compiling.
I remember all those advertisements for CPU with math coprocessor.
A great topic for those who don't know how we used to suffer back in the day!
I appreciate this video and thank you so much for your hard work!
Your brother,
W.Bushnaq
from Kingdom of Saudi Arabia
Love the video. I would also love to see these benchmarks run at the other CPU-s. If you can control for run-to-run variances, the fractal test could show even small differences, or maybe pinpoint some specific architectural traits.
You have an impressive collection of hardware there!
Respect.
Edit , in 90-91' I was still running an Atari 1040 STe and an Amiga 500.
The hardware you are showing was pretty high end/niche and indeed quite expensive at that time
I had to get the 387 for AutoCAD 1.0 . The difference really was impressive for my old i386. Drawing in 3D was so very slow without it....painfully so.
I remember MicroProse Grand Prix (GP1) speeding up when I installed a 387 on my Cyrix 386SX-16 computer back in the day. Might be an option for your 387 testing.
I'd really like to see a comprehensive list of pre-Pentium era games that used FPU co-processors. And also including games on the Amiga/Apple Mac/Atari ST that actually used the Motorola 68881 & 68882 FPUs...
As an 386-enthousiast I can only like this kind of stuff. Very nice!
Interesting and surprising results. I also have a selection of 387-FPU's in my collection; but I never bothered to actually test them like this. Just as you, I don't like empty sockets, so just installed one of them... I now learned that I did choose one of the slow one's ;-) (the ULSI MathCo).
I love vintage hardware! (And your channel)
I saw a familiar name in the list of FractInt contributors. Ken Shirriff.
Reverse engineering extraordinaire!
Master Ken.
The co-processors were very much a relic of their time, but they're still AWESOME!
You do still get co-processors, but for cheapness they are just reprogrammed CPUs; we just call them GPUs, NPUs and APUs.
Integration is what makes technology cheap; what happened to FPUs happened to almost everything, steadily. It's been happening steadily from the very beginning of integrated circuits. Before there were CPUs built from a single chip, you could build a CPU from many chips where each chip had some small number of functions; e.g. some registers, or an adder, or logic operations, or addressed into a core memory plane. L1 cache used to be on the motherboard, then it was put into the CPU and L2 cache was put on the motherboard. FPUs used to be done in software, then there were coprocessors, then the coprocessor was put into the CPU. Then L2 cache was put into the CPU. Multiple CPUs used to live in multiple sockets on the same motherboard and then they were put into the CPU. RAM used to come as tiny DIP chips, dozens and dozens of them, then they were put into SIMMs and DIMMs and stuck on the motherboard. The memory controllers used to live in the north bridge on the motherboard, and then they were put inside the CPU. An early 8088 or 286 motherboard could do almost nothing and had an overwhelming number of chips on the motherboard. If you want a floppy drive, there's an ISA card for that. If you want a hard drive, there's an ISA card for that. If you want a scanner, there's an ISA card for that with a parallel port on it. If you want a mouse, there's an ISA card with a serial port on it. The 386DX-40 is when the PC really started to become affordable. There were maybe half a dozen chips plus some cache DIP modules; very highly integrated. If you wanted CD-ROM, hard drive, mouse and printer there was a single ISA card for that, and it was fairly cheap. That's what made technology cheap; what made technology fast was Dennard scaling, not Moore's.
@@soylentgreenb 👌 thanx alot for your explanations , i really appreciate it 🙏
I remember contemplating getting a Fasmath back in the early 90s when they started to get really cheap. I saw benchmarks of one being run at 16mhz on a 33mhz intel 386 DX and almost keeping up with the intel 387 at 33mhz, and I was very very impressed. What I didn't understand at the time was that the first few Cyrix FPUs were limited to lower clocks, but when run at half bus clock they were able to avoid async latency penalties and their IPC was increased a measurable amount relative to clock (vs say running them at 20 or 25mhz on a 33mhz bus clock). So it made them look really really good. Of course running them at bus clock will be much faster but the ability to show it off at 16mhz was a great selling gimmick. Of course the Intel FPU would increase IPC in the same situation, but the salesman wouldn't ever bother divulging that. IIRC halving the clock of the FPU on a 33mhz bus would only slow it to about 60 or 65% and not to 50% like one would expect. Perhaps due to a relatively higher amount of memory bandwidth.
This is why there was usually a minimal difference when I ran mine at 20mhz over 16 mhz in my computer at the time. Faster clocks were always better, but some dividers had less penalties than others. I saw no reason to run mine at 16 other than to impress people, but being a little bit OCD it just seemed more rounded. I am unable to recall however if the slower FPU would end up bottle necking the 386 at all due to it potentially tying up the system bus a little longer? I didn't think it an issue at the time. Some integer ops ran slightly slower after putting in the FPU (like maybe 3-5%) but from what I understand that was normal no matter what speed the FPU was.
For what I was doing, it was a welcome boost and well worth every penny.
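The "60 or 65% rather than 50%" recollection above fits a simple model in which part of each FPU operation is overhead that does not scale with the FPU clock. What follows is only a minimal sketch of that arithmetic, not a measurement: the `fixed_fraction` parameter is a hypothetical knob standing in for whatever bus or transfer time stays constant, and both function names are made up for illustration.

```python
def relative_speed(fixed_fraction: float, clock_ratio: float) -> float:
    """Throughput relative to full FPU clock.

    fixed_fraction: share of the full-clock runtime that does NOT depend on
                    the FPU clock (hypothetical; e.g. bus transfers).
    clock_ratio:    new FPU clock / original FPU clock (0.5 = half clock).
    """
    if not 0.0 <= fixed_fraction <= 1.0:
        raise ValueError("fixed_fraction must be between 0 and 1")
    if clock_ratio <= 0.0:
        raise ValueError("clock_ratio must be positive")
    # Only the clock-dependent share of the runtime stretches when the clock drops.
    new_time = fixed_fraction + (1.0 - fixed_fraction) / clock_ratio
    return 1.0 / new_time


def fixed_fraction_for(speed: float, clock_ratio: float) -> float:
    """Invert the model: which fixed share would explain an observed relative speed?"""
    inv_ratio = 1.0 / clock_ratio
    return (1.0 / speed - inv_ratio) / (1.0 - inv_ratio)


if __name__ == "__main__":
    for observed in (0.50, 0.60, 0.65):
        share = fixed_fraction_for(observed, 0.5)
        print(f"half clock at {observed:.0%} speed -> fixed share of about {share:.2f}")
```

Under this model, a half-clocked FPU landing at 60 to 65% would mean roughly a third to just under half of the full-clock runtime was clock-independent overhead. The model says nothing about where that overhead actually comes from.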
Thank you. You put here some very valuable information and thoughts. Thank you, now I cant go to bed coz I need to test some things. 😂
@@CPUGalaxy lol good luck. I just got a very obscure job a few days back to extract some data from some really old QIC carts, and I happened to remember I had a special accelerator card for those floppy-port-based tape drives. I was up through the night trying to get it to work, only to realize the drive I was provided was partly dead. Such is life for us retro enthusiasts.
The spare they had worked perfectly though, and I was able to turn a 4 hour per tape extraction into about 30 minutes with the card I had. First time for me to ever use that card that I picked up on a whim cause I thought it would come in handy one day... That was 26 years ago. haha
Thankyou for all the time you put in. It brings back memories and keeps my enthusiasm for playin around with my collections.
Back in the day there was also a very unusual brand, Weitek. Their coprocessors were the fastest, but they didn't work as an x87 replacement and needed a special socket. In fact they were available for other CPU families as well. Also kudos to all those manufacturers that took the effort to implement IEEE 754 in hardware; it's quite a complex standard for 1985's technology. (EDIT: Oops, I have realized there is at least one other comment on the matter)
I seem to recall WinChip and some other manufacturers that Best Buy and CompUSA/ComputerCity used to stock for older CPUs late in the day in 1999, in addition to Cyrix and other companies. FPUs probably could be used to improve decoding of music files. I recall on the Atari Falcon030 there was one MOD - or was it MP3? - player that could use the Motorola 68882 FPU to improve performance years after that computer was discontinued. I'm sure there's players on the Amiga that did the same...
Our first 386SX mentioned a Weitek co-processor
You really have to add the Weitek co-processors. They needed a different socket, but they were fast! I had done a similar (but more comprehensive) test about 18 years ago, comparing everything @ 20MHz, with what 387s I had available, but also with various 386 models. Apart from one graph (on some CPU forum which I cannot remember), I never published the results.
It's never too late to publish those results.
Very interesting video. I used to own the IIT DLC3 387 math chip in my old 66Mhz IBM Blue Lightning computer which I purchased back in late 1993.. First time I've seen that chip in years... quite nostalgic.
Your channel is very addictive 😍
Love your videos, thanks 😊
Thank you!
Awesome for screensaver demos ... amazing stuff
Fascinating comparison, I remember there was a DOS utility that would emulate a coprocessor for CPUs that didn't have it, I wish I could remember the name of it as it actually gave a slight boost to performance. I recall wanting to buy a 387 back when I had a 486SLC and then being flabbergasted at the price - far outside of my range back then as a student!
Great video with a very nice comparison. I like this channel because there is a lot of very interesting information.
Thank you!
I messed around with POVRAY back in the DOS 5 days on a superb 486SX25. Renders would take hours to days to finish. When I upgraded to a DX2-50 it would take minutes to hours to render. I never did own a socketed co-pro.
I'm glad I found this channel. Love em processors.
Also, I have a question: When this person was installing the co-processor, how did he manage to place it in the correct way? I didn't see any marks or guides.
Thank you very much. Regarding your question I can recommend watching my video where I did a more detailed review on that board. And you are right, the right placing of the chip is an issue. here the link. ruclips.net/video/qaGQxZEYby0/видео.html
@@CPUGalaxy Yo, thank you. I'll investigate more and I'll watch that vid. Hope we get to see more of this content. Thanks.
This reminds me of when I started in IT. I remember so many of these things.
the cyrix really kicked ass
Cyrix Co-processors were pretty good. It was their CPU's that sucked.
@@Dj-Mccullough Not quite. Their 386 CPUs were on par with Intel's IPC and were clocked higher as well. At the time Intel, despite being the designer, basically had the lesser chip compared to almost all competitors. It was their later Socket 7 integrated FPUs that were feeble compared to AMD, Intel and IDT. Their integer units were actually quite a bit faster clock for clock than Intel's at that time (hence the 200+ rating at 150MHz), but sadly for them Quake came along, which used the FPU.
This old stock looks fresh new with the quality of your prod! Glad I found you few months ago :)
Thank you
This was a great video! I love fractals. I want to run some of these programs on my 10700k RTX2070 just to see how far along we've come.
I got a RTX2060 just for this purpose. The Optix API is amazing for playing with 3D raytraced fractals and procedural textures in realtime.
Great test and comparison. I myself have an IIT 4C87 40 MHz coprocessor in my 386DX 40 MHz.
Was wondering if it makes sense to add a copro to my 486DLC40 CPU, as it is annoying to see an empty socket as you say... At the end, it's only an esthetics matter, without any purpose.
Ahhh, the best part of the best DOS demo, Crystal Dreams II! You have good taste 😉
Greetings from Styria!
Nice video as always :) I wanted to see this comparison.
I was collecting 387s recently. I have 4 of them.
The top models from Intel, IIT, ULSI, and Cyrix.
Waiting for some 386 chips to test these. My 386 collection is very poor.
have my old Am386-40 with an IIT 387 running a manufacturing machine. Still running strong every day after all these years, even with the old "cheap" vga monitor.
If I remember correctly worms was a game that was unplayable slow on a 386-sx 20... But got a huge boost with a 387. Would be cool if you could test that :)
And Falcon 3.0
I was messing around with some old computers a couple years ago and i discovered that some of the more recently compiled open source DOS tools were compiled in such a way that they require floating point. That was frustrating as one of the machines i was using was a passive backplane board with no accommodation for a 387 even if i had one.
I recalled that back in the day there were fpu emulator TSRs for dos, and went looking, and found that you can still license them for a bunch of money. Oh well.
I remember a long time ago (mid 90s) I stumbled across a free 287-10, and man, it made a huge difference with Links 286. IIRC it was actually 2MHz slower than the main CPU, but they seemed to get along well enough.
It probably was running at 12MHz. Only a 20% overclock, definitely possible.
@@5roundsrapid263 The 287 FPUs use a different internal clock modifier. If I remember correctly they run at 2/3rds the CPU clock, so his example was probably running underclocked at 8MHz when used with his 12MHz CPU.
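The arithmetic in that reply is easy to check. A minimal sketch, assuming the 2/3 ratio recalled above holds for the board in question (the ratio constant and both function names are illustrative, not taken from any datasheet):

```python
from fractions import Fraction

# Assumption: the FPU runs at 2/3 of the CPU clock, as recalled in the comment above.
FPU_CLOCK_RATIO = Fraction(2, 3)


def fpu_clock_mhz(cpu_mhz) -> Fraction:
    """Effective 287 clock for a given CPU clock under the assumed ratio."""
    return Fraction(cpu_mhz) * FPU_CLOCK_RATIO


def within_rating(cpu_mhz, fpu_rating_mhz) -> bool:
    """True if the FPU would run at or below its rated speed."""
    return fpu_clock_mhz(cpu_mhz) <= Fraction(fpu_rating_mhz)


if __name__ == "__main__":
    clock = fpu_clock_mhz(12)
    print(f"12 MHz CPU -> FPU at {float(clock):.2f} MHz")
    print("287-10 within rating:", within_rating(12, 10))
```

With a 12 MHz CPU the FPU clock works out to 8 MHz, so under this assumption a 287-10 would be running below its rating instead of being overclocked.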
@@JeremyLevi I wasn’t aware of that. Still, a massive improvement over a stock 286.
@@5roundsrapid263 I wasn't really aware of it either until I went looking for a 287 to pair with my Harris 286-16 and couldn't find anything faster than 12MHz FPU so I did some research. Turns out I only needed a 10MHz part. :)
Who made Links 286? I can only find references to Links 386 Pro...
Even there is a demo, nice! I had couple years ago Cyrix FPUs, now only chips which I left are old 68040 cpus.
"super meth processor"
sounds like the official fpu of florida man
I heard that too!
@@5roundsrapid263 "special meth functions" like insanity and 3days without sleeping
Wonderful stuff!!!
This comparison would have been fantastic for me in 1994. Our primary family PC at the time had a 100MHz IBM Blue-lightning CPU and a 33MHz Cyrix Fasmath. I had always felt the Fasmath let us down for performance, but now I know it wasn't actually so bad and just the low clock speed that was an issue. The computer was upgraded with an Intel DX2 Overdrive which was noticeably slower for integer but much faster for floating point, and then later with an IBM 5x86 that was much faster in every way.
The only thing I found so far is that the Cyrix FasMath CX-83D87-33-GP (and KN) models are asynchronous unlike others so they shut down unused parts of the FPU to save memory bandwidth on the data bus. Otherwise they were made same, same 32 bit bandwidth and data bus and same frequency. Will update as I find out more information.
Would have loved to see a software emulator compared as well, man I spent some days in fractint.
Great vid man, really enjoyed it.
I think software floating point is always used unless software is written to detect and take advantage of a hardware FPU, where it will use different instructions to do the calculations on the FPU... so there is no special Software emulated FPU -- as that is just what any program is doing already if there is no FPU.
There were 387 emulators that could trick a 387 demanding program to run (at vastly reduced performance)
@@adriansdigitalbasement Here here! :) The software emulation part is shown at 9:09 - he even says it :)
Seeing that 487SX I'm officially envious now. :) That's a damn rare find! I'm waiting for that separate video about it.
Edit: Fractint was a great program. For hi-res pictures with more complicated math it could take a night to render. I still have a lot of those fractal images on CD. And some of them printed in color and hung on the wall. :)
soon. Just stay tuned on my channel. Thanks for watching! 👍🏻
2 nd edit: we populate those empty sockets, because they are there (freely after Sir Edmund Hillary)
The 487SX was a precursor to the ODP Overdrive (with the extra pin which turned off the existing 486 - oops, spoiler)
@@matthewday7565 I thought 80486 CPU's had an integrated FPU on them, I'm curious what was the purpose of a stand-alone additional FPU. Perhaps someone knowledgable could explain?
Nevermind. I just read about an i486SX (with FPU part disabled/removed) on Wikipedia, lol. Very cheeky of intel.
7:48 what chip is supposed to be in the unpopulated location below the large 486 socket. Is it part of the opti chipset or an alternative location for a CPU.
Only a guess, but perhaps an option for on-board graphics in other models of the same board? It looks like there are two different pad options for different sized ICs...
I guess this area is for soldered SMD CPUs: 386 for the smaller one or 486 for the bigger one. :)
To clarify that. Its not for chipset or onboard video. this is for a solder version of the 486 cpu. check out another video where I show more details of this board. ruclips.net/video/qaGQxZEYby0/видео.html
Comes to mind also the top tier and expensive Weitek coprocessors.
These old floating point chips make me appreciate the FPUs that come standard in today's modern CPUs, especially given how expensive these chips were back in the day. As a regular home PC user back then, these chips were nothing more than curiosities.
I'm re-watching the video, and I'm still waiting to receive my late ULSI DX/DLC 40 copro to push my 386DX 20 over its limits! Thanks again for your amazing content, take care F
I just bought a ulsi chip from china, and it was real! A perfect match for my ti486dlc. Pretty cool
about games using the math co, there is the original quake and a game called retro city rampage 486, they both run on 386 hardware. Also a lot of software like some trackers makes use of it.
Nice test, but could you, please, test and compare it to i486DX33 too? Would like to see if the FPU in 486 was any better than 387 FPU from INTEL. Maybe you can compare different 486 brands in FP math test like the one performed here, to see if all 486 variants had the comparable built-in math coprocessor. I do not think I've ever seen test like that before focused just on math part of 486 CPUs.
Nice comparison!
If I remember correctly, that i487SX is basically a full 486DX that checks that a 486SX is present, then disables it and takes over all processing by itself. Seemed like a bit of a scam, especially if your 486SX was just a DX with its FPU disabled.
Looking forward to the video on that one.
You are correct. The i487 has a signal on one of the pins that disables the SX and takes over for it.
I had a Dell 486 tower back in the 1990s that had a "487" socket. The good thing is that you could put any other 486 CPU in it and it would take over for the SX. So I installed a Kingston Turbochip with an AMD 5x86-133 that blew the doors off any silly i487SX.
Most interesting bit was the mother board which could fit 486 and 386 chips(I did not know it was possible), could you please repeat the same benchmark with 486 CPU?
yes, a follow up is definitely planned. several videos with that board. 😉
The Classic SimCity (first version for DOS) uses the x87. I've never heard about Sim City 2000 using the FPU.
Great benchmark ! It will be interesting to add the Rapidcad if possible !
Rapid Cad I have already planned to cover it in a separate video. 😉. So for sure you will see it soon here.
I had one of those IIT FPUs. It could do 4x4 matrix multiplication. Good for graphics. Unfortunately it was not used by any commercial software I got.
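For anyone wondering why a 4x4 matrix operation is "good for graphics": 3D transforms (translate, rotate, scale, project) are commonly written as a 4x4 matrix applied to a point in homogeneous coordinates. The sketch below only shows that math in plain Python. It does not reproduce how the IIT chip exposed the feature, and the helper names are made up for illustration.

```python
from typing import List, Sequence

Matrix4 = Sequence[Sequence[float]]


def transform_point(m: Matrix4, v: Sequence[float]) -> List[float]:
    """Multiply a 4x4 matrix by a 4-component column vector (16 multiplies, 12 adds)."""
    if len(m) != 4 or any(len(row) != 4 for row in m) or len(v) != 4:
        raise ValueError("expected a 4x4 matrix and a 4-component vector")
    return [sum(row[k] * v[k] for k in range(4)) for row in m]


def translation(tx: float, ty: float, tz: float) -> List[List[float]]:
    """Homogeneous translation matrix."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


if __name__ == "__main__":
    point = [1.0, 2.0, 3.0, 1.0]  # w = 1 marks a position instead of a direction
    print(transform_point(translation(10.0, 20.0, 30.0), point))
```

Every vertex in a scene goes through at least one such transform per frame, which is why doing the whole multiply in one hardware operation was attractive.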
Good vid. Really great content from your channel.
My only grumble would be out of the box thinking. Ie could you have stuck a 486dx 33 comparison at end. Ie the generation leap to 486 with integrated fpu.
Edit typo
Yeah, you're right. That would have been a great idea! 😓 Damn dude, why didn't I have this idea 💡. But anyhow, I have planned a video about the Intel Rapid Cad with that board. And there I will then compare it with the 486dx33. Thanks for your input. This is what I call a great audience and community.
@@CPUGalaxy hopefully with a back comparison ie a sub benchmark that has the 2 apps you used in this video with whatever apps you use for the rapid cad.
Hopefully by Xmas you get the place you deserve in RUclips half or 1 million subs, and growing.!❤️❤️❤️❤️❤️❤️❤️❤️❤️❤️❤️
Thank you very much buddy! and yes, I will use the same benchmark programs with the Rapid Cad to compare nicely.
"One of my personal favorite units."
Very interesting and useful information. Expected the IIT to score a little better than it did, for no particular reason, though Cyrix was always king in the FPU department back then. It's a shame they couldn't keep up in later years. Going to have to re-test my own Cyrix 387DX, purely as I have to wonder if they were the same as the 83D87 versions or not, but don't have one of those (or the mysterious KN version). It should now be possible to deduce this from looking at your results, figuring out how far off my testbed system is with known chips (Intel, IIT) and applying that 'offset' to whatever the Cyrix chip scores. So thanks for doing these tests, very informative.
Thank you, great to see you here on my channel. ☺️ I'm eagerly waiting for a new video on your channel. Actually, you motivated me to make this 387 video. I recently watched a video of yours where you got so mad at the Landmark benchmark on your 386 setup. 😅 By the way, I love the intro you did for your videos now.
Fractint ftw! One of my faves when I was a kid.
As far as I'm aware the 487 is just a 486DX that takes over all functions from the CPU.
There are some early 386 boards which allowed you to use 287 coprocessors, it would be very interesting to see these in action and how they compare to the 387 setups!
Would also love a follow up to see how much better the 40MHz parts perform at their rated speed with the AMD CPU. Makes me wonder what the top dog best setup is.
It's amazing how Cyrix performed well with the KN and some years later failed in the same subject with Cyrix 6x86
MII correct?
"But why are we collectors populating empty 387 sockets?
Uh, you know the answer: yes, because we can and just hate to see empty cpu sockets."
Made me laugh hard. But it's actually true. Even 287s if that matters.
I have a few fpus and wanted to test them but decided not to because it would have taken time (sloppy excuse).
After a few days I get this. Come on, coincidence in the galaxy? :)
😅👍🏻
My recollection is that the 486SX processor was a full 486DX processor with its floating point unit artificially disabled. The 487SX "math co-processor" was in reality another fully functional 486DX CPU, with one extra pin to prevent it from being used in regular 486 boards, and also to prevent a regular 486DX from being used in a 486SX board. Once you installed the "487SX", it took over completely from the 486SX, which simply sat idle in a halt state. The only upside to this whole ridiculous arrangement is that it later became possible to install more powerful "486" and "586" processors with clock-multipliers to get a genuine performance boost.
I remember using some shareware that emulated the 387 co-processor in software, for the programs that absolutely needed the maths unit in there. While it was not as fast as the actual silicon, it came out in most testing as slightly faster than the actual CPU itself in doing this, more from optimised code for the functions than anything else, so it would probably have cut the CPU-only render times down to under 40 minutes.
I used the same with my 16MHz 386SX machine. Then while rendering AutoCAD 3D plots of my sister's house to be build, I got frustrated and bought a co-processor.
Can the 40's beat the 33GP-KN if you pair them with the 40MHz - guessing there is no option to run the FPU at a different clock to the CPU
Interesting how they had identical scores, usually there's a little error marginal even when benchmarking many times with the same component :)
I have a 486SLC based system with a socket for an 80387, I'm very curious to see what installing one would do to that system.
It would make non-integer math faster.
@@stonent Exactly what he shows here, you'll get a significant boost to FPU computations but very few applications took advantage of it.
@@stonent But only if the software is compiled to make use of the FPU. Non-integer math will still run on the CPU unless the software is specifically designed to take advantage of it.
@@JeremyLevi Yes I know, I was just providing a simple answer.
@@stonent Right, but Michael might not which could lead to a very disappointed "huh, nothing changed" if he decides to add an FPU to his system.
We had 8087 in XT-machine, then 387sx when moving to 386sx machine, after which the FPU was built in main CPU. My father used it for CAD-stuff, I merely for fractal rendering and Falcon 3.0.
Funny how for me it was ”normal” to have FPU, but none of my friends had those. We then sometimes compared how long time it took to render some images, and with FPU it was a lot faster with proper software.
You can also use the math co to run quake on a 386 or many trackers and music software, even mp3 players
I found your channel to be very interesting
Thank you very much!
One thing of interest might be how well the various FPUs complied with the IEEE 754 standard. There were some interesting "bugs" in various cores, and also in the emulation libraries. (For some values of interesting, of course.)
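As a rough idea of what a compliance probe looks like, here is a minimal sketch that checks two IEEE 754 behaviours, gradual underflow (denormals) and round-to-nearest-even. It assumes Python floats are IEEE 754 binary64, which holds on essentially all current platforms. It exercises the host machine only, not a 387, and real conformance suites are far more thorough.

```python
import sys


def ieee754_spot_checks() -> dict:
    """A few binary64 spot checks; each value is True if the behaviour matches IEEE 754."""
    smallest_normal = sys.float_info.min   # 2**-1022
    smallest_subnormal = 5e-324            # 2**-1074
    ulp_at_one = sys.float_info.epsilon    # 2**-52
    half_ulp = 2.0 ** -53

    return {
        # Gradual underflow: halving the smallest normal gives a nonzero subnormal.
        "subnormals_supported": smallest_normal / 2 > 0.0,
        # Halving the smallest subnormal is an exact tie between 0 and 2**-1074: rounds to 0.
        "underflow_tie_to_zero": smallest_subnormal / 2 == 0.0,
        # 1 + half an ulp is an exact tie; the even neighbour is 1.0.
        "tie_rounds_down_to_even": 1.0 + half_ulp == 1.0,
        # (1 + ulp) + half an ulp is a tie; the even neighbour is 1 + 2*ulp.
        "tie_rounds_up_to_even": (1.0 + ulp_at_one) + half_ulp == 1.0 + 2 * ulp_at_one,
    }


if __name__ == "__main__":
    for name, ok in ieee754_spot_checks().items():
        print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

Denormals and tie-breaking are corner cases where hardware and emulation libraries can plausibly diverge, which is why they make useful probes.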
Excellent video! And of course now all the scalpers are going to start buying up Cyrix 33GP FPU's to resell on eBay for thousands lol ;)
Glad I've been holding off for months on deciding what FPU to buy. I really want my 386 (on one of those FOREX boards w/ clear jumper instructions on the PCB) to look proper...plus I'm trying to make it a bit faster than it was back when I last used it (around 2006-2007).
Unless I find a 486 DX/2 66 cheaper.
I'm curious about that motherboard you have used, would a 486 to Pentium Overdrive or equivalent chip work on that motherboard also? Yet another really interesting, detailed and well explained video. Great work. 👍🏻😃
Thank you. Yeah, indeed this is a very interesting motherboard. But it does not support a Pentium OverDrive.
We take floating point arithmetic in hardware for granted. My dad used to write these things in assembly language. In the second volume of Knuth's book "The Art of Computer Programming" there is an entire chapter on multiplication "How Fast Can We Multiply?"
I remember the Motorola 68881 math co-processor
Finally Cyrix won!! Weitek was missing though
Weitek will be covered in another video. 😉
@@CPUGalaxy The Weitek 3167s are pretty cool processors. Love to see more content on those!
They're different beasts than the (Intel) x87 units and are way more flexible / versatile. They can be programmed independently as a memory mapped device, claiming a bit of system memory and run their own dedicated code, separate from the 386 CPU, something a 387 cannot do as it's directly bound to the CPU. Data transfers to and from the 387 coprocessor are accomplished directly through I/O lines and are automatically generated by the 80386 CPU for direct coprocessor instructions, being more of an 'addon' / extension for the 80386 CPU itself.
The Weitek was more of a 'parallel' design in that regard, as it could run its own custom-developed calculation code and could output its results directly to a memory address in RAM without needing the main CPU to do that for it, freeing up valuable CPU cycles. Reminds me a bit of the Amiga's chipset, which also did true parallel processing with a very advanced DMA controller for its time. IBM PCs were actually quite limited in that regard with their 'simple' little DMA controllers.
The Weitek units were mostly used for dedicated industrial purposes and certain customized scientific calculations where the 387 wouldn't cut it. For single precision floating point calculations the 387 was faster than the Weitek though. I wonder if there were people back in the day who were extreme enough to use both Weiteks and 387s at the same time. I think that's possible as my Compaq Deskpro 386 can take both a 387 and a Weitek and has 4 separate sockets (385, 386, 387 and Weitek), very cool stuff! 👍 Already got an IIT 387 in there, time to find me a Weitek 3167 and do some experiments 🤓
... together with Cyrix EMC87, maybe? :-)
Do you plan on reviewing the i487 chip? That one was marketed as an upgrade for motherboards with embedded 486 SX chips, but in reality, that "co-processor" contained a full DX CPU that entirely disabled the SX. It wasn't a real math co-processor. I've read in the documentation that the chip would only run if it detected an original SX CPU, but I tested it on a Matsonic M601 motherboard, just by itself, with no other chips, and it worked. Ran at 25 MHz. That makes me wonder if the motherboard was designed to spoof the presence of an SX.
Yes, I plan to do a review of the i487 ☺️
@@CPUGalaxy Will be looking forward to your review and hopefully benchmarks to compare against mine. It was substantially slower than my regular DX2-66 that I had the board with. I have another 486 motherboard that runs up to DX5-133 and POD83, but has no mention of i487 in the manual. I wouldn't know how to set jumpers correctly, and I don't want to take any risks.
I was never a fan of WinTel - my Amiga 1200 had a 50 MHz 68030 card and I always lusted after a 68881 FPU, if only to fill the empty socket and have SysInfo give me great results. LOL.
Nice video. Does a co-processor work faster/slower with different main processors (brand, clock speed)? Will it change the ranking? How much faster is an integrated math processor compared to a separate co-processor? To be continued?
Good questions, and definitely tbc. 😉 Integrated math copros started with the 486, so of course they were faster. On the 387, if we increased the clock speed to 40 MHz, it would also get faster.
Cyrix: Best FPU when it didn't matter, worst FPU when it totally mattered.
Twist: it was the same FPU just a few years later :)
If you think Quake mattered more than AutoCAD...
They sure did something wrong, looking at it with 20/20 hindsight. But office people were always a good market. At work we are trying to squeeze the most out of our equipment because it is worth it. We keep an eye on how much time is lost waiting for machines to do their thing, and if there are time savings to be had, it is usually easy to justify either the effort or a hardware purchase, so that people with quite high hourly wages do not sit idle waiting for their computers.
Just calculate how much time (and money) is to be saved if 10 of your employees lose 5% of their time because of slow machines or processes.
Yay, I had the IIT 3C87, for POVRay! (and demos + own coding ofc). It had a good price/performance ratio. I even called the c't magazine's hotline to help with my decision (I don't remember if Andreas Stiller was on the other end, but this is likely).
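To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. Only the 10 employees and the 5% come from the comment above; the 2000 working hours per year and the hourly wage of 50 are purely assumed placeholder values, and the function name is made up for illustration.

```python
def idle_cost(employees, lost_fraction, hours_per_year, hourly_wage):
    """Return (hours lost per year, money lost per year) due to waiting."""
    lost_hours = employees * lost_fraction * hours_per_year
    return lost_hours, lost_hours * hourly_wage


# 10 employees and 5% are from the comment; the other two values are assumptions.
lost_hours, lost_money = idle_cost(
    employees=10,
    lost_fraction=0.05,
    hours_per_year=2000,   # assumed: roughly one full-time year
    hourly_wage=50.0,      # assumed placeholder wage
)
print(f"hours lost per year: {lost_hours:.0f}")
print(f"money lost per year: {lost_money:.0f}")
```

Swap in your own wage and working hours; the point is only that the lost hours scale linearly with head count and with the fraction of time spent waiting.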
Excellent video! I kinda figured Cyrix was going to be the fastest but I'm honestly a little surprised that Intel was the slowest. I wonder how do the various 487 coprocessors fare? Was Intel still the slowest of the lot? Because it seems like they didn't have a strong FPU until the Pentium CPU.
I think I remember the big effort Intel put in to speeding up FP started with the 486. But I could be wrong.
I have an i9 9700k and I don't know what the big caps and 2 small caps (or whatever they are) on the bottom are called, and I don't know where to buy them. Can I fix a damaged pad?
This FPU example is like the comparison with Hyper-Threading when it arrived in CPUs.
No, no it isn't. It literally isn't like hyper-threading in any way.
Hyper-threading has no particular relevance to coprocessors. Hyper-threading is a modification to the CPU scheduler allowing each physical core to process instructions in a way that attempts to keep all parts of the CPU's actual processing pipeline occupied. Think of it like an assembly line building parts, moving from station to station as each resource in the CPU is applied to each thread's needs. The effect is that multiple threads (between 2 and 4) can be processed 'at once' on a single core while incurring as small a performance hit as possible for 2 threads, and an increasing performance penalty for 3 and up.
An FPU in co-processor configuration would take over all computation of specific math instructions, largely bypassing the CPU altogether.
Mate, why not find the original Intel tool from the overdrive kits? I have one, it's perfect for the job of removing chips.
I had to look it up on Wikipedia but I had always assumed the dx variants meant they had the math copro but apparently that was only for 486 CPUs. I remember running a 387 emulator for years when I used POV Ray. That would have been a good benchmark.
Very interesting fact on Simcity 2000 using the old FPUs!
Did any of these external FPUs have known bugs like the integrated FPU in the first generation Pentium?
never heard about that.
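For reference, the first-generation Pentium bug asked about above is usually demonstrated with a single well-known division. A minimal sketch in Python, assuming the commonly cited operand pair 4195835 and 3145727; the function name is made up for illustration. On a correct FPU the residual is essentially zero, while affected Pentiums were reported to return 256.

```python
def fdiv_residual(x=4195835.0, y=3145727.0):
    """Return x - (x / y) * y, which should be (near) zero on a correct FPU."""
    return x - (x / y) * y


residual = fdiv_residual()
print("residual:", residual)
# A generous tolerance: anything far from zero points at a division problem.
print("FPU division looks", "OK" if abs(residual) < 1e-6 else "suspicious")
```

This only exercises the one famous operand pair, so a clean result says nothing about other kinds of FPU errata.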