Coding for the World's Trickiest Chip - SEGA's Saturn DSP (CODING SECRETS)

GameHut

Просмотров 123 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 23 окт 2024

Комментарии • 1 тыс.

@mrcrue13 5 лет назад ⁺³⁹⁷
Saturn homebrew dev here. The leaked English coding docs for the Sega Saturn are known to be full of errors as they were rushed through translation or something.
@segaunited3855 5 лет назад ⁺⁸⁵
Blame Sega of America. Wasted all of 1994 on 32X. And then in '95, gave Developers sloppy SDKs and failed to communicate with Saturn's Engineers.
@imdone8243 5 лет назад ⁺⁸
Where did the term Homebrew originally come from?
@andrewgwilliam4831 4 года назад ⁺⁴⁹
Beer?
@KuraIthys 4 года назад ⁺⁴⁰
Yeah... Most likely.
Homebrew kits for alcohol and beer especially have been on sale long before computer games were ever a thing. (Home BREW. after all. Brewing is how you make beer in particular.)
How this transitioned to use in game development I don't know.
It's also a bit weird that the term has gotten ambiguous enough that we start calling people working with old microcomputers 'homebrew' devs.
Console development without an official license and dev kit is one thing. That's not a normal thing to do, so 'homebrew' kinda makes sense.
But one of the core differences between a console and a microcomputer is that anyone can develop microcomputer programs, and has always been able to.
Me writing a SNES game is a bit unusual...
But writing a game for my atari 800XL is only unusual in terms of how old that system is.
Back in the 80's you had a lot of 'bedroom coders' who did just that, and then got a publishing deal and released their work commercially;
@REDSIX 4 года назад ⁺¹⁶
@@segaunited3855 there is more to it than that. SoJ kept projects like Saturn secret from SoA and is why they wasted time and resources making warts for the Genesis.
@adamsfusion 5 лет назад ⁺³⁰⁷
This reminds me of a quote Jon made in 1997: "while the PlayStation was easier to get started on ... you quickly reach [its] limits, whereas the Saturn's "complicated" hardware had the ability to improve the speed and look of a game when all used together correctly."
Judging from the mastery of the DSP, it seems so. I mean, having an entire 68K all to yourself just to produce sound alone is pretty incredible.
@DeathBaseTURBO 5 лет назад ⁺³⁸
Playstation was simple and hardware wasn't anything over the top.
Saturn had the bits to pull far above anyone then but was difficult to get everything talking at the right time but some did pull it off, look at the quake ports, they said it wasn't possible but they did it anyway because the company said they could, sadly it didn't go much further than that but like most consoles, games get better as years go on as people have a better understanding of it, playstation was to easy and max out in 2 years, saturn still hadn't reached it's limit, i think there was one game that came close but just needed optimization and it would have been better than anything released including PC games as they were on the rise
@jollyrogerxp 5 лет назад ⁺¹⁰
Very much so, and even the sound processor itself was really impressive for the time...
@PrinceSilvermane 5 лет назад ⁺⁶
I have to wonder what kind of game you could make if you actually utilized every part of the Saturn to it. It seemed very powerful for its time but no one ever used it right.
@DeathBaseTURBO 5 лет назад
@@PrinceSilvermane fortunately there is an engine out in public you can use but I'm not sure how well optimized and user friendly it is
@jollyrogerxp 5 лет назад ⁺¹⁵
@@PrinceSilvermane Just as Jon explains extremely well in his videos, it was a complex piece of hardware. Games that can use it best are ones where both VDP processors can be fully utilized, the DSP has a reasonable chance to get used, and both SH-2 CPUs have work to do that can be done without having to fight for the memory bus too much. All in all, either a game is designed specifically with its architecture in mind (e.g. Panzer Dragoon), or a game concept is slightly tweaked to use the system as well as possible (e.g. Sonic R). In Sonic R TT did a really tremendous effort and it is among the best one could hope to obtain on the system, considering a reasonable development schedule. I have the utmost respect for Jon and his team managed to do. With infinite time available one can always do something better, but it is not representative of a realistic commercial game development process...
@jollyrogerxp 5 лет назад ⁺¹⁹⁶
Very well explained indeed. As an oldtimer PSX and Saturn programmer too, I can appreciate how challenging it is to explain plainly the very concept of simultaneous operation in the different units of a DSP. Something interesting is that the DSP multiplier unit actually performs the multiplication operation between X and Y every cycle, and the instructions are used to tell the unit at any point in time whether one is interested in the result or not... At any rate, the TMS C6000 DSPs are tricky beasts too, as different operations take a different amount of time to complete (like many other pieces of hardware), but the system does not "wait" for an operation to complete before moving forward (pipelining), and one has to (when programming in assembly) take that delay into account to avoid using a result at the wrong time ;)
@kaldo_kaldo 5 лет назад ⁺¹⁵
Good ol' race conditions. This is the same reason making multi-threaded programs (and games) is difficult. It's like coding for the DSP, but you never know when things will finish, and you have as many threads to work with as your heart desires, but overuse will give performance penalty instead of performance benefit.
@littlebigcommentary 5 лет назад ⁺¹⁰
Jesus man I assumed at the very least the saturn would have pipelining! Also Thanks jollyrogger for your work on making sonic xtreme playable again. I felt like you never got your due for how much work you put in to getting all those different versions working. It seemed like people didn't appreciate it that much and i just wanted to let you know that as a long time programmer and sonic fan i appreciated the fuck out of it :)
@jollyrogerxp 5 лет назад ⁺²
@@kaldo_kaldo well said :)
@jollyrogerxp 5 лет назад ⁺⁶
@@littlebigcommentary Thank you so very much, if I ever have time I will go back to it, but now there are a few young developers who have time and talent, and can put the Saturn hardware to good use :)
@thebravegallade731 5 лет назад ⁺⁴
Which is EXACTLY why most games still don't use more than two cores even now lol. Single core preformance is generally still the best bar to see for benchmarking games then multithreading ones.
@GameSack 5 лет назад ⁺³⁷⁵
Very interesting. Thanks for presenting that in a somewhat understandable manner. I was able to follow but of course that doesn't mean my mind fully grasps it. Congrats on learning the Saturn so well. I had heard that Sega kept a lot of the more complex coding abilities for themselves so that their games looked better than 3rd party games. I don't know how true that is, but it's what I read in magazines back in the day when 3rd parties were complaining about programming and Sega not exactly being helpful in their documentation or programming tools.
@landoncore92 5 лет назад ⁺¹⁶
I knew I would see you here Joe! We're both fangirls of Gamehut haha
@dan_loup 5 лет назад ⁺⁶
What kind of trick you used to make the perspective without using the W coordinates?
@GameSack 5 лет назад ⁺²⁶
@Joaquín Nuñez That's just conjecture... the "no way they would do that" part. Keep in mind that SoJ was run by Nakayama who was a very strange cat indeed.
@jacklazzaro9820 5 лет назад ⁺¹²
Game Sack, you're not the only one who gives Nakayama that reputation. I'm still not forgiving him for his hostile takeover in plans for the American Saturn.
(Edit): Especially how he did it; as follows
(Not part of edit): Turning down Silicon Graphics, insulting Sony, rushing the 32X (which doesn't regard the Saturn, but still make a large confusing position for the Sega fanbase), and pushing the release date from 9/2/95 to 5/15/95.
@DukeDudeston 5 лет назад ⁺²
Ooooh this reminds me.. new Sack episode tomorrow?
@gaving.griffon2703 5 лет назад ⁺⁷⁶
The visual representations really help in making the code easier to understand. Thanks Jon!
@JonGon005 5 лет назад ⁺⁴⁶
As somebody who has only barely dabbled in the fundamentals of coding, seeing a breakdown of hardware on such a technical level like this is both fascinating and humbling. It captivates me with how much mastery of the material you and other programmers needed to have at the time for such a dizzying machine while also being awe-inspiring to the point where I want to try and learn even more about this. Thank you, Jon!
@segaunited3855 5 лет назад ⁺¹
Saturn is not a Dizzying Machine.
@diezgp 5 лет назад ⁺¹⁶⁷
I burst out laughing when he said "This is where it gets more complicated". I got lost 3 minutes before that.
@jmcd21182 4 года назад ⁺³
I got lost yesterday and didn't even start watching the video until tomorrow
@keselekbakiak 3 года назад
I give up when i saw sin cos stuff
@lcalvom 5 лет назад ⁺¹⁸⁰
After watching the video, now I know that I'll never have the expertise to make a simple game, even though I'm a developer myself... Coding games in the 80's and 90's were the real deal, with all those people (especially you Jon) digging into those pieces of hardware to get the maximum performance.
EDIT: I started programming at age 20, I got my first computer the year before and I was determined to become a good developer; I loved playing games and I was very interested in learning C language to understand the source codes I downloaded from the Internet (about 2000-2001) including some Net Yaroze games I enjoyed playing. I was very proud of myself when I completed a playable version of Tetris in my first year of C learning, but then I started working on a consulting firm and time passed... and now I feel that I have wasted all this time.
So when I see this kind of video, I can't help but thinking that I could have learnt those things instead of letting time go. It's too late now to revert those bad decisions....
@CrossoverGameReviews 5 лет назад ⁺³⁸
This comment made me appreciate today's tech and specs. Though making games today is still difficult, even with "easy" to use engines, we're spoiled compared what people had to go through back then to make a game.
@johnsimon8457 5 лет назад ⁺³¹
The jump from 2D to 3D created a split between ‘Engine Developer’ and ‘Gameplay Developer’, it’s all a matter of specialization.
But, heck, you can learn to build games for NES or GB without too much effort. If all you’ve done is JavaScript, and you haven’t taken any computer architecture classes, sure there’s a learning curve, but 80’s kids built games for micro computers without too much fuss, and these days an emulator will have debug tools undreamed of at the time
@Selicre 5 лет назад ⁺¹¹
@@johnsimon8457 I have almost entirely on my own created a SNES game engine from scratch, and although it isn't much of a game as of yet, it's still got perfect terrain collision and an infinite (well, as infinite as the cart space) world. It's all a matter of how much time and effort you wanna spend on it.
@AmoPizzas 5 лет назад ⁺¹¹
I think it's a good thing that it's easier to make games nowadays. Not just with the easy to use and free engines, but also pretty much every other software you might need can be free too. Not to mention the abundance of information, tutorials, even straight up lessons that anyone can get for free. A lot of creative minds can then express themselves without working for a big company.
@khhnator 5 лет назад ⁺⁵
nowadays coding games is still difficult.... but in all sorts of different ways.
first games are more complex, all the math the sonic R used... you will be using more for things like simple texture effects and so, and you gotta code every single one of them, is essentially just a lot more to do.
but the major technical difference nowadays is that you don't really have to worry about the processors having the horsepower to do anything, if you do things right, they do. the problem is the memory latency.
essentially it takes more time to pull a random number on memory than the math you gonna do with it. so you need to code your program in a way that the processor can predict what is the next thing you gonna need from memory, and that can strange
@superandroidtron 5 лет назад ⁺⁶⁴
Oh neat, I didn't realize there was a VLIW core in the Saturn. It seems Sega included every possible kind of CPU they could, 2 RISC (SH-2), 1 CISC (68K), and the VLIW DSP core. Truly crazy hardware design. Thanks for sharing!
@segaunited3855 5 лет назад ⁺³
SH-2 is Dual Cored. NOT Dual CPU'd. The Chips are exactly the same, they're separated on two Wafers due to the Rushed taping of Saturn, that was incompleted due to CSK's Owner Isao Okawa ordering its Taping and Beta design to be rushed out 4 months too early.
The SEGA Saturn is Not a 32-bit only console. Its ACTUALLY a 64-bit Console. An Unfinished one.
Saturn's design was only half finished. SEGA released an 80% compete product as Saturn's Documentation wasn't even fully finished.
@superandroidtron 5 лет назад ⁺¹⁰
@@segaunited3855 Yes, they're identical, but they operate independently. It's 2 separate 32-bit CPUs. Multicore/multi CPU effectively mean the same thing.
@segaunited3855 5 лет назад ⁺²
@@superandroidtron You are correct. They do operate independently, but the reason why the share the same DSP Bus is because the MIPS Instructions are all embedded it. Each side pushes Simultaneous Dual 32-bit Instructions as the SH-4 is QUAD threaded and Dual Register Coded.
We have a PDF Document of Hitachi SH-2 Aurora. That shows how to get 64-bit instructions on its ISA setup using 6 Instruction Cycles.
@superandroidtron 5 лет назад ⁺⁵
@@segaunited3855 I really don't understand what you're saying. None of the CPUs in the Saturn are MIPS cores and the SH-4 (which isn't even in the Saturn, it's in the Dreamcast) is a single-threaded dual-issue CPU. I also can't find any documentation for an SH-2 called "Aurora."
@segaunited3855 5 лет назад ⁺²
@@superandroidtron Didn't say Saturn's CPU is MIPS Cored. IT HANDLES MIPS. The DSP can stream up to 50 MIPS to the SH-2 Aurora. People seem to think that Saturn can't handle MIPS when it certainly does. It has Double the MIPS of PS1.
Aurora is the codename of the Saturn's chipset. The name comes from a "In between combination of JUPITER and Model 2". Its basically a Low End OTS built of Model 2 Hardware. Combining the Phase 2 of Jupiter's "System 32 with Model 2 3D" with a Full Fledged 64-bit Model 2 Powered project.
SEGA names the CPUs of its consoles after its codename chipsets. Aurora is the Codename of the SH-2 ISA RISC of Saturn.
BTW, RISC has evolved into MIPS. Today, Renasas uses MIPS for ISA.
@yaisetan 5 лет назад ⁺⁴²
I only understood about 40%, but when I first started watching your videos I barely understood 5%... Your videos have helped me learn a lot
@SianaGearz 5 лет назад ⁺¹
Well what are the other 60%, i'm sure if you ask pointed questions, someone down here myself included might be able to help out.
@dhkatz_ 5 лет назад ⁺²
@@SianaGearz That's the beauty of not knowing. You don't know what you don't know.
@SianaGearz 5 лет назад
@@dhkatz_ yeah but this video is a collection of statements. You can go through them one by one and single out which is the first one that doesn't seem meaningful, contains unknown terminology, or presents an apparently logical conclusion that you don't know how the author arrived at. You can request clarification for all such statements one by one and in turn all statements in replies that pose a similar issue and eventually you shall arrive at complete understanding.
@nifela 5 лет назад ⁺³⁶
Oooh, programming that DSP sounds like a fun challenge.
@narryg 5 лет назад ⁺⁸⁷
Impressive chip but good god I can't imagine dealing with this
@johnrickard8512 5 лет назад ⁺¹³
This DSP chip would require quite a different paradigm compared to most other chips. Now that I see how it works, I can see that execution path modeling could be used to fairly easily cook up code for this thing, but debugging would be an absolute nightmare for sure.
@HappyBeezerStudios 5 лет назад ⁺⁹
The PS2 and the PS3 chipset also got their fair share of coding trouble.
The PS2 runs a MIPS core with 128-bit SIMD functionality, it runs the MIPS III ISA with specially added extra instructions. the FPU is not IEEE compliant Than come two VLIW units which are simila, but get used for different functions. Oh, and an additional video decoder and an external RDRAM MMU. It got 2 audio processors, one in the main MIPS CPU and the other to emulate the chip found in the PS1
The PS3 runs a combination of one dual threaded, dual issue, in-order core running PowerPC ISA 2.02 and six additional dual-issue units with 6 execution units each and indipendent memory management but without branch predictor, they run their own ISA, have embedded SRAM. All of that connected via a ringbus.
Syncing all that stuff correctly must've been a nightmare.
@segaunited3855 5 лет назад ⁺¹
@@johnrickard8512 Debugging on Saturn can be done two ways: Either on Sophia SDKs, or on modded Pheobe.
@evilash570 5 лет назад ⁺⁴¹
SEGA and Yu Suzuki and the AM2 team were so elite with programming back then. SEGA built the best arcade experiences back then and tried to bring those experiences home. Sadly they built hardware that was so complex for programmers that only the very best could use it to its full potential.
@segaunited3855 5 лет назад ⁺³
@7MGTESupraTurboA You are exactly correct. Sega of America wasted ALL of 1994 on 32X.
@HappyBeezerStudios 5 лет назад ⁺¹
They went full out in a way only Sony tried later when they went with IBM's Cell for the PS3.
@queercommunist 5 лет назад
well it didn't really show
@maroon9273 5 лет назад
@7MGTESupraTurboA so true 32x not only took a lot of money and resources but a lot of launch and later titles from the Saturn. They should have cancelled the 32x in 94
@maroon9273 5 лет назад
@7MGTESupraTurboA Can't forget about the Sega CD lack of super scaler games and the push for more FMV games as well. SOA should have consider the Cart version of the Saturn if there were serious about the price point of the 5th gen console. I agree with you but however, Sega Japan made some bad decisions such as the Saturn surprise launch, presenting the Mars project to SOA and the lack of tools and utilization of the Sega CD ASIC chip.
@ijabbott63 5 лет назад ⁺³⁸
The main problem with the DSP was the amount of time it took to load the matrix and set up the DMA for transforming the points. For transforming a small number of points, it was faster just to do it in the main CPU. At least that's what I found when porting Assault Rigs from PlayStation to Saturn and trying to get it to run at a reasonable frame rate (which kind of failed on the busier levels!).
@segaunited3855 4 года назад
It would have been easier and smooth had both sides of the SH-2 been used.
@KuraIthys 4 года назад ⁺¹⁰
That's a common problem when you have dedicated hardware for a task;
Getting the GPU in a modern PC to do a transform on 500,000 vertices is very fast.
But getting it to do a transform on 3? The setup time and draw calls will eat you alive.
(actually draw call minimisation is one of the most important game engine optimisation skills of the last 20 or so years on PC - and presumably also modern consoles, since they're almost the same architecture as the PC...)
@JoeStuffz 4 года назад ⁺⁵
I heard this is an issue for even SIMD on the same CPU (example: SSE2). You can do things like wreck the CPU pipeline to where it's not worth using SIMD
@ArneChristianRosenfeldt Год назад
@@JoeStuffznext time you tell me that single instruction multiple logic operations aka bit operations have issues.
@TheDukeOfZill 4 года назад ⁺¹²
This system is so powerful, it could steal the SOUND CHIP to aid in 3D graphics.
@ArneChristianRosenfeldt Год назад ⁺³
As could the Atari Jaguar
@MaxAbramson3 Год назад ⁺²
@@ArneChristianRosenfeldt Except that the Jaguar was designed from scratch to use the custom RISC CPU as a DSP--usable for audio or even setup for the 3D graphics pipeline. On the Saturn, a DSP that looks like a Transport Triggered Architecture (rather than RISC) had a bit of a learning curve for the average programmer.
@luke_rs 5 лет назад ⁺³⁷
Wish this video existed about 2 months ago :( My students had to cover this topic (use of Matrices in 3D computer graphics) for an assessment task. I'll probably use this video as an example of application of the technique next year. Cheers :)
@MrSapps 5 лет назад ⁺²
You are teaching saturn coding?
@luke_rs 5 лет назад ⁺²
@@MrSapps no, just computer science principles, and use of Matrices in graphics applications is one of them. He only briefly touches on it at the start of this video, but it's good to see it in a practical application.
@MrSapps 5 лет назад ⁺²
@@luke_rs Was going to say - that could be quite a unique course ;)
@mzxrules 5 лет назад ⁺²
i don't think it's the best idea to use this as a reference, since it might scare them off. Modern CPU/GPU architecture uses SIMD (Single instruction, multiple data) operations instead, which I think means that you can for example, add/subtract/multiply/divide two vectors in one instruction, rather than multiply the individual components of that vector with different instructions that happen to run in parallel.
@imdone8243 5 лет назад
@@luke_rs hey random question but I wonder if you could help me a bit.
What are the basic things I need to learn/research to be able to,
Draw a 3d model in a screen?
I'm learning stuff alone and sometimes it's hard because you don't know the terms and how shit is called.
@Imgema 5 лет назад ⁺²⁰
There's something fascinating when you watch a video about stuff you can't even grasp.
@YouTubeSupportTeams 4 года назад ⁺⁴
lol. For me it's depressing. 'WHY DON'T I KNOW THIS TOO???' :(
@VelvetCondoms 5 лет назад ⁺⁹
I enjoy assembly programming, but I don't get to do it very often, so this video was very enjoyable.
@richarddale76 5 лет назад ⁺⁷
FUN FACT. a game on Saturn called MDK was named after the memory of the DSP chip as it was a continual game where the accumulated numbers changed the visuals on screen. Each level just kept going and going until a time clock ended. Each level was coded with a single start “code” and the Saturn itself filled in the level, much in the same way a procedural game is made
@jollyrogerxp 5 лет назад ⁺²
Never heard of this. Was MDK actually ever released on Saturn? Maybe it's not the MDK I am thinking of...
@SinaelDOverom 5 лет назад ⁺²
Maybe you got the name wrong? Can't find any reference to that. The only MDK game that i can find was not released on Saturn - only PC and PSX.
@retractingblinds 5 лет назад ⁺⁵
I can't wait for the follow up video. What I like most about the saturn is it gives developers so many obscure options to get the results they want. You're made to manage exactly what part of any processor is handling a specific segment of code while another part of that same chip is processing two entirely different things. They don't make hardware like it, a highly specialized hyper paralleled system that evolves 2D game play to the next level and pushes 3D in a way the inter-mingles with what it can do in 2D. Antiquated and advanced, simultaneously!
That's why I love learning about it. It's so highly specialized that it earns its own understanding.
@segaunited3855 5 лет назад
ALL 5th Generation Consoles did 3D by using 2D and 3D playing fields and Models together.
@retractingblinds 5 лет назад ⁺²
@@segaunited3855 the saturn worked significantly differently due to its reliance on quads and features like infinite planes.
@segaunited3855 5 лет назад ⁺¹
@@retractingblinds Correct. Saturn used Trapezoids and Reshaped Sprites(N64 also used Reshaped Sprites but relied on small Triangles for Polygons).
PS1 used LARGE,bloated Triangles and Simultaneous Renders. But it was far inferior to Saturn(due to it lacking Recalculation,) and N64(Lacked Perspective Correction and Z Buffering).
@Minty_Meeo 5 лет назад ⁺¹⁰⁸
Nowadays we have our fancy compilers do all that multi-threading nonsense for us. Really makes you appreciate it.
@fluffycritter 5 лет назад ⁺¹⁶
Yeah this reminds me a lot of the brief fad of VLIW CPUs like Itanium and Hitachi SH-4. Of course most modern CPUs are VLIW under the hood, with a CISC (or even RISC) instruction decode step that basically recompiles to VLIW on-the-fly.
I was always interested in the Transmeta Crusoe since it basically did that except with the decode/conversion step in software. It's a shame that never really went anywhere on the consumer market though, aside from being in a handful of early low-power subnotebooks like the Sony Picturebook.
Sun's MAJC architecture also had some interesting design choices and it's a shame that never even made it to market as far as I know.
@noop9k 5 лет назад ⁺¹⁶
Minty Meeo this is not multithreading
@noop9k 5 лет назад ⁺⁵
fluffy: Transmeta was basically impossible to use as a standalone CPU that runs a modern OS, and it wasn’t, unfortunately, that fast, despite x86 JIT being very good. In particular, it lacked good SIMD support with SSE only appearing on Efficeon, too late. But it would be great, of course, if they made it open so the people could run recompilers for other architectures.
Before Pentium M appeared, Transmeta was quite popular, especially in Japan. I have Fujitsu subnotebook and HP tc1000, both use Crusoe.
@dshcfh 5 лет назад ⁺³
At least back then you knew exactly what was going on in your code, so in a way it was simpler.
@fluffycritter 5 лет назад ⁺²
@@noop9k Oh, sure, they never made a standalone version (and having the software instruction decode layer was pretty much the entire point) but it'd have been cool if they'd gotten to the point of supporting multiple ISAs or whatever. Or even had a version with the VLIW ISA exposed directly.
I had a Picturebook C1VN for a while and while the performance wasn't great, it was good enough and had *amazing* battery life compared to similarly-performing machines of the time. Of course Atom totally fits in that niche now (and for non-x86-legacy stuff ARM is doing quite nicely), so it's not like we've lost out in that market or anything. It's just always interesting to think about what could have been, or where the technology might have gone if it didn't end up just becoming yet another pile of zombie IP.
@ilrompiscatole5414 5 лет назад ⁺¹
Your works have inspired me to become the senior 3D/VR artist that I am today.
I just wanted to thank you.
Seeing the face of the man behind so many of my favorite games back in the ‘90/‘00 was amazing, knowing that you are a nice person (all the efforts and the charity you are doing) was once again inspiring.
Kudos my distant and long unknown mentor, kudos.
@NeoKesha 5 лет назад ⁺³
Gosh, this is just a masterpiece! I wonder if modern games have something like that, even not exactly like that, but still just beautiful and complex.
Also, i just love how you title everything "Impossible %THING%"
@VelvetCondoms 5 лет назад ⁺³³
I might know of one piece of silicon that might be harder to code for: The iAPX432.
I think I might be the only one here thinking "DAMN! THAT CHIP IS AWESOME!".
@Redhotsmasher 5 лет назад
What about the Cell processor though?
@jollyrogerxp 5 лет назад ⁺⁵
@@Redhotsmasher Ah definitely the Cell was a tough partner, but the individual programmable units in the Cell weren't too bad after all, in fact not all that dissimilar from the VU units in the Playstation 2. DSP programming has always been tricky, from the early TMS and NI chips to the more modern ones. The saving grace in modern systems is the availability of good tools (mostly the compilers) that help with the instruction scheduling and pairing, filling up the pipelines for you rather than having to do it by hand. On the other hand, having intimate knowledge of the instruction set and programming in assembly can sometimes result in algorithms that are very difficult or impossible to express with a high-level programming language, and that therefore a compiler will not emit on its own...
@l3p3 5 лет назад ⁺¹
@@jollyrogerxp Cell architecture was actually used in PlayStation, that is the reason for some scientists using them for complex calculations!
@fluffycritter 5 лет назад ⁺²
@@Redhotsmasher Cell was basically a bunch of PPC750 (what Apple called the G3) cores with added SIMD vector instructions tied together with a high-speed memory bus accessible via DMA. It was painful to work with at a memory controller/task scheduling level but the cores themselves were incredibly straightforward to use.
Later on Sony released a scheduling library called SPURS which made that a lot easier to deal with, as well. You still had to worry about breaking your tasks down to fit into each individual core's workspace memory but that was more of a design thing than an implementation thing.
@perli216 5 лет назад ⁺⁵
/r/iamverysmart
@RETROGAMINGARTS 5 лет назад ⁺³
I cannot thank you enough for this type of information. Thank you for explaining it in a way where someone with no coding knowledge can somewhat understand.
@K0dAHeY 5 лет назад ⁺⁵
Early 3D modeling is very fascinating. Now it seems much easier, back then, probably most coding was done manually.
@NintendoTV64 5 лет назад ⁺³
Loving the new logo, Jon! The Sega Saturn really is an enigma when it comes to programming.
@MrEightThreeOne 5 лет назад ⁺²
I am an electrical engineer and digital hardware is my favorite area, so I found this especially fascinating! Even cooler is how you managed to take advantage of the insane and convoluted hardware. I think I followed a good bulk of it, though I'd probably have to watch it again to fully grasp all the math steps going on in each calculation.
This was a really neat insight into how you managed to program so much to work at once, all the while having to deal with such a complicated piece of equipment. Very well done!
@mogo9052 5 лет назад
Mr. Eight-Three-One, please tell us you're gonna delve into Saturn homebrew
@MrEightThreeOne 5 лет назад
@@mogo9052 Haha, I'm not sure I would count on that. While I can do at least some programming, it's not my strong suit, unless you're talking Verilog or VHDL firmware. I'd say I'm above average as far as EEs tend to go (I impressed a lot of people in my embedded systems class, to say the least), but still, I'm more interested in the design and functionality of the hardware rather than actually making the software to go with it.
I do appreciate the thought though!
@TheKrazyKat89 5 лет назад ⁺⁶
That's fascinating, I've never seen that form of paralellization. This kind of code really requires comments for every line to be maintained though, lol.
@SolKnightt Год назад
Hi Mr. Burton! After roughly 26 years, I've finally finished Sonic 3D Blast on Genesis. And with all emeralds! I wanted to let you know I enjoyed the game and that your videos helped me appreciate all the work that went into it. Thank you to you and the team for everything y'all did!!
@Ali_Alhakeem 5 лет назад ⁺⁵⁷
I Like how i watch the video and still didn't understand anything and still get impressived
@DlcEnergy 5 лет назад ⁺⁵
did you just mix impressive with impressed? "impressived" xD that's going in the dictionary "very impressed"
@eduardoalvarez2497 5 лет назад ⁺³
If you want to understand the basics of this video, you can check this tutorial: skilldrick.github.io/easy6502/
It will teach you the very basics (put a pixel on screen, add, subtract, etc) on the same CPU that was used in the original NES.
@DlcEnergy 5 лет назад
Eduardo Alvarez just read the intro, then played the snake game. lol imgur.com/a/7jOq6kg
i'ma have to learn this. how long does it take to learn?
@eduardoalvarez2497 5 лет назад ⁺²
@@DlcEnergy Well I started a just few weeks ago. There are a few good links in that same tutorial to learn every instruction code that the 6502 have. Then you can go to the specifics of the NES on this link: nintendoage.com/forum/messageview.cfm?catid=22&threadid=7155
@Kazuo1G 5 лет назад ⁺¹
Basically, normal CPUs at the time would take in ONE instruction at a time. (Like, put the number "2" in slot 1)
The DSP can do FIVE instructions at once. (Like, put the number "2" in slot 1, take the number in slot 3 out, etc.)
In essence, you could call it an octopus CPU, because of how many things it can do at once, like the phrase you hear someone say "I only have 2 arms!", when they are doing two things at once and can't do another.
@martijn3151 4 года назад ⁺¹
Oh the joy of parallel computing. Speaking from personal experience, it indeed is hard to get used to a multithreaded or even worse, a multicore environment especially if you are coming from a single core environment. When we started making games ourselves, everything was single core. Taking advantages of multicore, requires you to think completely different. You need to build your engine from the ground up with multicore in mind. And that’s difficult.
That said, we could still offload a lot of lower priority or specialized tasks to other cores. Such as network updates, audio rendering, difficult calculations, particle engine updates (syncing with rendering is tricky), and various game systems that had a lower priority or could be updated out of sync with the main thread. For instance we had a sensory system that was complex, but could run “one frame behind” the main update loop. No one would notice. Offloading that to another core, was a massive gain.
But the worst bugs arise when there is a race condition. I remember spending a week or two on an obscure audio bug on the Wii U that only occurred once every couple of thousand of samples. Reproducing the bug took hours; but always resulted in a massive crash. Nintendo obviously found the bug as well. So it was a big showstopper.
After making a simulator that spawned thousands of audio samples each second inside the game, I managed to get this repro time to down to 5-10 minutes. Still not ideal, but enough to finally find the bug: some core initialization order and a critical section that was thread safe, but not if those threads ran on different cores... oh the joy... I still have nightmares of that garbled audio that was produced by running thousands of samples a second 😊
@eddievhfan1984 5 лет назад ⁺⁴
VLIW in action, pretty much. Mad respect to the team for dealing with all that.
@grindFish 4 года назад ⁺²
That actually makes a lot of sense and is really cool when you think of the potential through put and modular usage of that pipeline
@brandon-butler 5 лет назад ⁺⁴
I just took a class on Assembly and computer organization. The fact that this makes sense to me makes me so happy! 🤓
@classicmail8239 5 лет назад ⁺²
This was fascinating, it would be great to see more in-depth videos like this just to understand more about how the Saturn hardware works.
@philrod1 5 лет назад ⁺¹⁵
Crikey, coding that must have been a real brain twister - what can and cannot be done in parallel, which result do you need before some other calculation uses it, etc. I only hope there was a repeating pattern you could grasp.
p.s. Extra thanks, considering what happened in Malibu 👍
@DarmaniLink 3 года назад
I like how you show the code/diagrams while also giving an oversimplified explanation.
Those who know what they're looking at can stay interested and (i think) those who don't really know can follow
@AFellowGentleman 5 лет назад ⁺³
DSPs are very common in mobile phone base stations. I know at Ericsson they had entire achitectures built with many many of these and c code complied into VeryLongInstructionWord assembly to program them.
So yeah, this is not ”acient technology”, it is used every day when you call your mother or whatever.
@AesculapiusPiranha 5 лет назад
lol
@jollyrogerxp 5 лет назад ⁺¹
Absolutely, DSPs have been and are very common in so many fields, medical equipment, communication gear, radar/sonar, plane avionics, and many more!
@johnrickard8512 5 лет назад ⁺²
DSPs aren't ancient tech by any means, though I'd be the first to point out that 3d polygon transform is not what most people would use a DSP for.
@MuffinHop 4 года назад ⁺¹
Currently looking into DSP coding on Saturn. Thank you so much for making my work much easier!
@eddyjay83 5 лет назад ⁺³³
I would call this process "manual hyperthreading" since you have to program it yourself. Pretty interesting concept for something made in the mid-90s
@storerestore 5 лет назад ⁺¹⁴
I'd call it VLIW
@televisionandcheese 5 лет назад ⁺⁷
Well it's not really hyperthreading, that's using the core for something else while it's waiting for data to come.
@mattiviljanen8109 5 лет назад ⁺¹²
I would just call it parallel instruction execution, aka. pie :)
@fanzyflani3576 5 лет назад
@@mattiviljanen8109 It may also be a position-independent executable too.
@AnonyDave 5 лет назад ⁺⁴
An exposed pipeline and Multiply-Accumulate units is pretty standard DSP architecture. Not a lot of revolution has gone on, mostly just evolution. Datasheets will still have a block diagram that looks very similar these days
@pppgggr 5 лет назад
I'm a third year university student, and I'm just now learning to program in assembly code. Your videos have inspired me to take up a project programming for the SNES. The technical details for things on the channel are always amazing!
@RallyDon82 5 лет назад ⁺³
Sega Saturn is the console i think of when i think of gaming it captured me in 96, just awesome. its the gift that keeps on giving.
@segaunited3855 5 лет назад ⁺³
Saturn and N64 have aged with Grace. Mario 64 still looks very colorful and vibrantly polished.
Also, the Duke Nukem 3D Saturn port absolutely DESTROYS the PS1. Has Superior Lighting Effects,too.
@RallyDon82 5 лет назад ⁺²
i played the hell out of Duke 3d on saturn, Lobotomy software were the 3rd party kings for the saturn who knows what they could have achieved.
@segaunited3855 5 лет назад ⁺¹
@@RallyDon82 IKR? Its a shame on how BADLY Sega of America dropped the ball on Saturn. They NEVER taught people how to properly program and code games for it.
@karlmeyer5383 4 года назад ⁺¹
Would LOVE a video on the DSP and the issue with the dev docs! Thank you so much for this!
@valentine_puppy 4 года назад
Are we related?
5 лет назад ⁺³
Nice implementation of a VLIW processor. Maybe they could've implemented Tomasulo's algorithm to simplify programming but still get a good performance. Anyone has any idea why they didn't use it? The algorithm was developed in 67, so it was already around by the time of the Saturn.
@SaulidSnake 5 лет назад ⁺¹
Wow great stuff Jon keep it coming, you explained the DSP so well. I’ve not touched Assembly Code since second year at Uni, it was great fun playing around with it.
@Slamy4096 5 лет назад ⁺³
Single Instruction Multiple Data (short SIMD) stuff was never really easy. Even today compilers sometimes struggle with it. For my job i needed to learn NEON assembly for our ARM processor. It's different as only one instruction is executed which handles a whole matrix of data instead of the DSP which handles multiple instructions at the same team.
But still the the amount of different instructions of NEON was so high that the current GCC was not capable of correctly compiling code for this thing. Also lot's of register magic happened as this thing got 64 Bit Registers and 128 Bit registers which weren't other registers but instead the 64 Bit ones concatenated together. Lot's of thought and documentation was needed to maintain that monster. Keep up the cool videos sir.
@ArneChristianRosenfeldt Год назад
But here it is multiple instructions. SIMD is PSX GTE and Intel MMX
@DaVince21 3 года назад
The way you visualized it made a difficult thing much easier to understand, so thank you for this great video.
@james2175 5 лет назад ⁺⁵
Thank you for this video. I have to say I'm surprised the Saturn requires such complex instructions, even though I know very little about computer programming. I seriously thought you guys had some sorts of tools made by Sega to help your work, but it looks like programmers had to start from scratch.
@mogo9052 5 лет назад ⁺²
Justin Fanite
From scratch...I wonder how tough , that was.
@segaunited3855 5 лет назад ⁺⁴
On ALL 5th Gen Consoles. Programmers had to start from scratch when writing Assembly for a new Game. It was just easier on PS1, but just because its easier doesn't make it a superior piece of hardware. PS1 is WAY inferior to Saturn and Nintendo 64 all it did was draw basic Polygons better by making them larger in size and with Simultaneous Renders.
@MrDmoney156 5 лет назад ⁺²
@@segaunited3855 Exactly!
@kend4845 5 лет назад
I don't comment often, but I absolutely loved this video. The technical aspects of programming video games truly fascinates me, and you've made it fun to watch!
@KuraIthys 5 лет назад ⁺⁷
Ouch. That gives me a headache. Managing That level of parallel code is not a pleasant experience. (It's bad enough when you're dealing with a full symmetric multiprocessing system - but this is much like hyperthreading - except a CPU with hyperthreading manages code balancing by itself, where here you have to code it directly.)
The n64 also had a bad reputation for being difficult, but looking at why that was, I don't think it's even remotely comparable.
The problems on the n64 are related to high level optimisation, and some frustrating bottlenecks in the architecture that required careful workarounds.
Also Nintendo absolutely refused to provide microcode documentation for the Graphics processor until quite late into the system's life, meaning you were entirely dependent on the handful of pre-written routines Nintendo provided to get any 3d acceleration at all. - easier, sure, but not conducive to getting the best out of the system.
In hindsight looking at what the system contained it's not at all in the same league of complexity.
You had a MIPS 4300i CPU with a floating point unit, and a Graphics chip that consisted of a DSP with fixed function 3d logic, and a second MIPS 4300i core. This core differed from the main CPU core only in terms of having a large dedicated cache memory, accessing memory differently in general, and having a vector co-processor in place of a floating point co-processor.
The instruction set was obviously much the same.
Leaving aside not being given any low level documentation at all, the difficulty seems to arise not from the complexity of any given part of the system, but the interaction.
The Main memory is RAMBUS ram - which is very fast, but has high access latency. (meaning working with RAM efficiently is tricky), the main CPU is incapable of accessing memory independently, so it has to get the RCP (aka the graphics chip) to do it on it's behalf.
The texturing unit has a 4 kilobyte 'cache'. Which is just about enough to store a single 64x64 texture at 8 bit colour depth. (only 32x32 if you have mip-mapping enabled). This wouldn't be so bad, but despite being called a 'cache' it's manually controlled by the programmer, and in standard microcode implementations isn't used very effectively. (a major factor in why so many games appeared to have such low resolution - since most developers lacked the documentation and skills to write custom micro-code; Many of the games that seemed to defy the odds later in the system's life did so by using custom microcode.)
The cpu core inside the graphics chip, which does such things as geometry transformation and the like cannot run code or work with data directly from main RAM, it has to be loaded into the local cache, which isn't that large, relatively speaking, so the size of code running on this core has to be considered.
The pixel fill rate is about 60 megapixels a second. Which sounds like a lot for a system from that era, but becomes a big problem when you consider the system does multi-texturing, bilinear filtering, perspective correct texturing, anti-aliasing and environment mapping, amongst other things, all of which eat up a LOT of fill rate compared to what the system has to work with. (other systems of that era didn't use such effects, and it's estimated because of this the n64 was typically doing something like 5-8 times as much work drawing a single onscreen pixel as say, a playstation was doing.)
Essentially, the n64 was frequently being used to render a level of graphical effects that objectively speaking were out of it's league, and probably shouldn't have been attempted on a system with such relatively low performance.
So, no individual element of the n64 was that complex, and each part individually was actually quite powerful, but the pieces don't work well together, and the system is full of performance choke points.
Thus, what makes it a pain to work with is that it has very high peak performance, but lots of bottlenecks that drag it's average performance into the gutter.
In other words, the system is a high level optimisation nightmare.
Still, it's pretty clear that it doesn't even come close to the Saturn, and the Saturn's reputation for complexity is well earnt.
Just optimising DSP code alone already looks like a nightmare comparable to optimising n64 code in it's entirety...
Yeah, I do think you're right there. May just be the trickiest chip seen in a mainstream product. (Can't say trickiest EVER, because I'm sure there's some obscure supercomputer chip or something that's worse. XD)
@TDRR_Gamez 5 лет назад ⁺¹
So that's why Quake 2 64 looked like your marine smeared vaseline all over his visor?
Jokes aside, despite the complexity, it all definitely worked out well. Just look at Perfect Dark and compare it with any game in the Saturn or PSX, pretty much no game comes even close to it.
@segaunited3855 5 лет назад
@@TDRR_Gamez Shenmue. VF2. Games that can compete against N64 fairly well.
@unregistredhypercam 5 лет назад ⁺²
Thanks for making the time to write that. I'd heard a bit about the N64's design but was missing a few details. It makes a lot more sense now. What an odd design, especially that memory addressing quirk.
@1e1001 5 лет назад
oh my god wow im never going to be able to make games for that
@n64ra 5 лет назад ⁺¹
Possibly your best video yet. Keep up the coding secrets!
@doragasu 4 года назад ⁺³
Programming for DSPs is fun! I never touched the Saturn DSP, but in the past spent some years writing code for Texas Instruments C5000 and C6000 DSPs. They have all kind of weird hacks not usually present in regular CPUs, like guard bits in the accumulator, modulo addressing (extremely useful to implement digital filters), specific loop instructions to avoid branching (causing the pipeline to get emptied), instruction buffers inside the DSP to avoid reloading looping code each iteration, duplicated data buses...
If you write C code, the compiler is unable to use many of these (especially when coding for fixed point DSPs), so you end up having to write the computation intensive code in assembly language, and you need a good grasp of the hardware details to take advantage of many of these weird details.
@Elkplaysandpaints 5 лет назад
I'm not a programmer, so I couldn't make sense of a lot of what you were showing, but it was still facinating to watch. I still hope your videos benefit the Saturn homebrew community somewhat.
@Zekium 5 лет назад ⁺⁵⁶
Parallel programming on Assembly ? What kind of sorcery is that ???
@segaunited3855 5 лет назад ⁺⁹
Anyone can do it. Is just that in the mid 90s, Dual Core CPUs were almost unheard of.
@annakudriavtsev2510 5 лет назад ⁺⁸
VLIW
@MrSapps 5 лет назад ⁺²
This was always done to some degree, for example on a PS1 you have "double buffering" where by the GPU renders the last "commands" while the CPU is feeding in the current commands. That is concurrency - you had to be careful not to do anything that would require the CPU to wait for the GPU to do something else bye bye performance
@segaunited3855 5 лет назад ⁺²
@@MrSapps PS1 doesn't have Double Buffering. It only has Basic Frame Buffering with heavy distortion. 48KBs Chache Max.
@MrSapps 5 лет назад ⁺²
@@segaunited3855 It absolutely does have double buffering. And the scratch pad cache is 1kb. Here is a hello world with double buffering using Gs lib: github.com/nicolas17/psxdemo/blob/master/main.c
@Eschelaun 5 лет назад
Either you take and precisely organize meticulous notes or you have a crystal clear memory. Either way, thank you for this incredible breakdown of a complicated process.
@CorporalDanLives 5 лет назад ⁺¹⁴
This is the hardware of my nightmares.
@segaunited3855 5 лет назад ⁺²
Nah.
@TalosPCR 5 лет назад
This is brilliant, finally i found someone who shows how console optimization actually works. And it's so great to see how each line of code is thought in conjunction with the hardware schemas, .because nowadays when we program, mainly in Pc we need to abstract so many things and pray for everything to work out in hardware, see how values go from one place to another is somewhat comforting.
@lordofduct 5 лет назад ⁺³
WOOOOOOOOOOOO!!!!!!!!
This is the kind of GameHut I love most!
@marcellosilva9286 2 года назад
Looking at this makes me appreciate all the effort that Saturn developers put into their games, especially 3D ones, I wish I was this good at math.
@nsatragno 5 лет назад ⁺³⁵
Wait so you can read from a register and write to it on the same cycle without the data becoming inconsistent? I wonder if the "impossible" bit from the manual is related to that.
@mogo9052 5 лет назад ⁺¹
Nina Satragno
Wait, what?
@tursilion 5 лет назад ⁺⁸
You activate the circuits at the same time, but the multiply (for instance) starts at the same time so works with the data that is already there. By the time the new data arrives the operation has already completed.
@nekononiaow 5 лет назад ⁺⁷
Good observation. I suspect that there is actually some form of pipelining involved: these instructions do indeed occur at the same time but each one works on the result computed on the previous line, not on the result gotten from the instruction on their left.
This is why the first line only contains instructions loading data into the registers and no actual computations.
These computations happen on the second line with the content loaded into registers while the new "MOV" instructions load the registers with data to be used for computation on the next line/cycle.
@Wren6991 5 лет назад ⁺¹⁵
That part is completely normal though. This is the magic part of flipflops. Write occurs at the clock edge, read data is available very shortly afterward, and you have the rest of the clock to compute your expression involving the flop output (which in this case is also the new input to the flop). No data races, totally safe.
@SianaGearz 5 лет назад ⁺¹⁷
It's called latching, the cycle can have a number of phases, for example if all units latch the inputs at the rising edge of the cycle pulse, then all data is consistent, and writing happens somewhere later in the cycle. You can also see a form of manual pipelining involved, as the data is pre-read into the corresponding input register of each computational unit. The computational unit itself would be protected from the change in the corresponding input register during its operation with a latch.
@pyromen321 5 лет назад
Awesome quick overview! I love the more technical videos you put out
@BlueSatoshi 5 лет назад ⁺⁵
6:49 >The Impossible DSP
Will it be a video on DSP's ability to somehow stay afloat despite his bad financial skills and general ineptitude?
@NuclearTopSpot 4 года назад
That's why I love but also hate acronyms. Once you've made that one link in your brain, it's never gonna go away.
You seen all the boogie2988 fuss? Heard he's the new dsp
@adricklynn8882 5 лет назад ⁺¹
Absolutely fantastic video can't wait for the next one!!
@Nikku4211 5 лет назад ⁺³
Dun-du-du-dunnnnnnnnnnnn! I got you.
Try programming for the Atari Jaguar. It was infamously hard to program back in the day, which is one of the reasons it got next to no 3rd party support. Try seeing why, at the very least.
@noop9k 5 лет назад
Nikku4211 Jeff Minter moved to VLIW NUON after the Jaguar BTW.
@austinreed7343 5 лет назад
Nikku4211
That and Wii U, apparently.
@Nikku4211 5 лет назад
@DejaVoodooDoll I'm just going by what I heard about developing for it back in the day.
@Damaniel3 5 лет назад
I actually understood what was going on when you explained it, but I'm pretty sure I would have had a really hard time figuring it out with just the documentation and some examples. Thanks for the really good explanation!
@Kj16V 5 лет назад ⁺³
Don't lie, people. You mentally switched off half way through and just clicked the Like button. Didn't you.
@frankjansson7563 4 месяца назад
Haha, nah. I'm playing with my Zilog so this is fun :D
@TheRealShedLife 4 года назад ⁺¹
These are absolutely fascinating. And I thought the PS2's two vector processing units were crazy!
@unknownoption7013 5 лет назад ⁺⁸
How does the DSP does the multiplication and move a new value into the input registers at the same time and maintain consistency between if the original or new value is used? I only takes one clock cycle but surely the operations can't be truly concurrent? Are they sequential between themselves?
@mathyoooo2 5 лет назад ⁺¹
I'd guess so. Having a multi phase clock gen isn't unusual
@RobertSzasz 5 лет назад ⁺¹
An Option cycle is almost certainly multiple clock cycles. (it's possible to run various parts of a chip asynchronously and just have the physics of the chip manage timing of certain operations, but it almost never makes sense to do so)
@RachelMant 5 лет назад ⁺⁷
The answer to this lies in the update order of a data flip-flop (building block of a register) and in delay.
The input registers are fronted to a data bus and, after the data is left to settle on the data inputs, is clocked in - however, that doesn't mean it has instantly shown itself on the outputs as that has a time delay associated with it.
At the same time, the data on the output has actually already been multiplied during the same settling period as the data inputs saw. The multiplied data is then clocked into the output register's flipflops. No data crossed streams with other data, and consistency is assured.
No sub-clocking (phased clocks), etc.. just pure logic gate delay and careful engineering. This was actually animated into the diagrams with the movement of the data on the wires (delay and settling time) and flashes (clocking points).
@RobertSzasz 5 лет назад ⁺¹
@@RachelMant the downside is that a lot of those operations become exquisitely sensitive to process variation and design changes. Setting various operations to trigger on rising edges and others on falling edges is more robust if possible and only two operation phases are necessary.
@RachelMant 5 лет назад ⁺⁷
@Robert Szasz That is what circuit analysis and process analysis is for - to determine safe max clock speeds so that process variation and physics don't kick your design into instability.
For example, when prototyping to an FPGA, the synthesis tools are able to determine from a known maximum (worst case) per-gate delay time, how long each clock cycle must be to guarantee correct operation of such a design.
For something as fast as this and for when it was designed and constructed, such tools existed and the process analysis had been performed. While either multiple clock cycles per operation for deep pipelining, or subdividing the clock up into segments where certain actions are clocked in different parts of the clock waveform - with how it was described here and with what is happening in this block diagram, I find it unlikely such elaborate schemes would be in use.
For a DSP such as this, they'd only serve to make a larger surface area chip that runs hotter. More likely I think the DSP designers took care to ensure the worst-case timings weren't violated, and clock the simpler design as hard as they safely could. Less silicon required, less heat, faster in a real-time environment.
@mateuspinesi 5 лет назад ⁺¹
Awesome, the animations really help to understand the explanation
@MegaManNeo 5 лет назад ⁺³
I'm always impressed by your videos and efforts to tell us how you guys programmed games for SEGA's consoles back in the 90's.
At the same time however, it makes me wonder what made the Saturn so much of a hassle for other studios to develop while there are things such as Sonic R, Tomb Raider and even early development versions of Shenmue and Sonic Adventure running on that thing.
@segaunited3855 5 лет назад
It was not a hassle. Non Japanese programmers were never properly trained.
@ischmidt 5 лет назад ⁺³
@@segaunited3855 It was very much a hassle. The manuals were poorly translated and riddled with errors, there was a new 68000 sound driver program every week with a different set of bugs, the development systems were flaky, the hardware itself had a lot of corner cases, and VDP1 was about 60% of the speed of the PS1's GPU at drawing polygons. Yes, the system had a certain personality that Sony's more sterile hardware lacked, but Sony's hardware also was more performant and simpler to deal with. We had to write extensive amounts of assembly on the Saturn to get performance even in the ballpark of what the PS1 gave us with straight C code.
@segaunited3855 5 лет назад ⁺¹
@@ischmidt You're referring to the Western SDKs. Sega of America never provided any proper Sophia SDKs for developers because they completely dropped the ball on Saturn.
Majority of your problems early on came from programmers only using one Core to code C Language Assembly. Many of the games developed for 3D using only one Core suffered from unbalanced and improper programming. It was if the Saturn was performing with one hand tied behind its back.
Another was that many Non Japanese Developers like you weren't schooled on how to code Saturn's graphics and resorted only to using the VDP1 instead of Both VDP1 and VDP2.
Saturn's early SDKs were pretty bad, especially due to Sega of America's utter laziness and the fact that they wasted ALL of 1994 on 32X.
@videogamemusicandfunstuff4873 5 лет назад ⁺²
That was really interesting, thanks for this! I'm looking forward to 'the impossible DSP' video too.
@nneeerrrd 5 лет назад ⁺³
Thanks Jon, please do follow-up.
@tdya1 4 года назад
This channel is a gold treasure
@JoakimKanon 5 лет назад ⁺³
5:50 If this is done in parallel, we have a race condition? Moving a value to the X register and at the same time trigger the multiplier sounds dangerous.
@nekononiaow 5 лет назад ⁺⁵
Good remark, the execution is actually pipelined and the move will be loaded only right in time for the instruction on the next line to take it into account but not before.
@1e1001 5 лет назад ⁺¹
im assuming the chip is just designed to write the result before writing in
@melvin6228 5 лет назад
Really enjoyed the animation that made the pipeline immediately clear!
@PuyoPuyoMan 5 лет назад ⁺³
I don't care what anyone says programming in assembly is a real blast lol
Anyways, I skimmed through the SCU documentation as well (because what else am I gonna do with my time?) and it seems that there is no issue with loading data from a bank to X&Y while also preforming the multiplication operation like some people seem to speculate; Sega themselves do it in some example code in the document "SCU DSP Assembler User's Manual". I have yet to find the actual "impossible" part of the code, maybe I'll keep trying to find it or I'll leave myself to be surprised in the future, we'll see.
@SianaGearz 5 лет назад ⁺¹
I think i've got it? But i only skimmed, so i might be a bit off.
MOV MC1,X and MOV MUL,P are both X-bus operations so they technically collide in their bit encoding, and i think someone (at least whoever wrote the assembler) knew that this combination is possible and is disambiguated by hardware by setting one additional bit to MOV MCx,X and the hardware disambiguates that to not also mean a MOV MCx,P or something like that. But this knowledge was lost on the way to SEGA's techdoc team.
Y and A registers exhibit a similar non-collision that also doesn't seem accounted for in the manual.
@PuyoPuyoMan 5 лет назад
I noted that as well, though looking at the instruction codes I don't think there's any issue. MOV MC1, X is encoded in the 6 bit X-bus control field as 100xxx (the three xs seem to determine which bank to load the data from) while MOV MUL,P is encoded 010xxx (here the xs are "don't cares"). So I don't think that's the problem...
@SianaGearz 5 лет назад
@@PuyoPuyoMan I mean it obviously works, and seeing that it works it's easy enough to surmise why and how it works; however it's not strictly possible according to the word of the manual. The manual is merely wrong or sloppily formulated, this looks like deliberate hardware design, underpinned by the fact that the tool chain supports this.
@PuyoPuyoMan 5 лет назад
"it's not strictly possible according to the word of the manual", did they say that it doesn't work in the documentation? I was looking for anything like that for a bit in the User's Manual but again I just skimmed it so I might've missed it if there was a part where it explicitly said "you can't use multiple x-bus commands at once" or something like that.
@SianaGearz 5 лет назад
@@PuyoPuyoMan no, it doesn't go into enough detail and doesn't say that you can't issue two commands at the same time, but it would be implied given that both commands are specified with explicit overlapping bits that are not described as irrelevant. It is what my reading of the bit charts and it being presented there as alternate commands would suggest if I didn't know that it works. If the docteam expected it to work, they should have left the bit that sets the MUL switch out of the bitmask of the other MOVs on the same bus that are combinable with it and vice versa, and similarity with bits that are written as colliding on the other bus too. I read processor datasheets just about daily and i would have never guessed that it's possible from the datasheet, which is a gross miscommunication on their part.
@uzimonkey 5 лет назад ⁺¹
Ohhhh, I see. I never really understood what "stream processing" was and why DSPs could be so fast. But now I see, instead of the "start and stop" of most CPUs where there's usually one operation at a time a DSP can read in more data at the same time and keep the data moving in a _stream_. And I can also see why this is so useful in digital audio applications where data (often from multiple sources) has to be read, mixed and output all at the same time. Using a DSP or multiple DSPs chained together a tiny, relatively slow DSP processor can handle much more data than a processor clocked much higher.
It does look very tricky to program, but for the types of programs it runs it's probably not too bad. I assume most DSP programs perform the same operations over and over, like transforming vertex data through a transformation matrix. That's just a bunch of multiplies in a row with all the data in contiguous memory. I don't see any branches in there, can they even branch?
@totaltotalmonkey 5 лет назад ⁺⁴
Have you ever worked with Broadcom DSPs?
@fuanka1724 5 лет назад
Very interesting. Thanks for taking the time to make videos like these.
@eformance 5 лет назад ⁺⁴
I assume that the DSP is decoding the 6 instructions simultaneously? In practice is works like a 6 stage pipeline, but in coding, you prime each stage of the pipeline, then issue instructions to perform an action for each stage, is this correct? What frequency did the DSP run at? I think you might get a kick out of the Parallax Propeller microcontroller, it has 8 individual cores with 512 longs per core. In practice, each instruction executes in 4 cycles, with shared memory instructions taking longer if you miss the access window. If written correctly, you can align code and memory accesses to ensure you hit the main memory on every access window, resulting in a true 20MIPS per core. The hub memory is 32KB with another 32KB of ROM containing a bytecode interpreter (SPIN language), SIN/COS/LOG tables, and a bitmap font. I recently got back into doing some coding on a DOS platform and writing inline asm to speed things up, programming the Propeller is a lot like coding for those old processors, but all of the goofy things like "why do I only have 5 general purpose registers when the 8051 has the first 1K as a register", and "why must I load a segment descriptor into a general purpose register first, before loading the segment register", are not present in the Propeller. All 512 longs of COG ram are registers and can be treated as instructions or data. There is no cycle penalty for byte or word memory access, and there are a quite a few non-conventional instructions. Best of all, each core (COG) has a video generator builtin, so you can easily generate high res tiled video output or low res bitmapped output, or high res low-color output (the HUB memory size is the constraint). You can do VGA or NTSC directly from the COG with just a few resistors for support components.
@noop9k 5 лет назад ⁺¹
EFormance Engineering Yes, Propeller is great.
@nekononiaow 5 лет назад ⁺¹
Yes, this is correct, this is why the first line only contain register loading instructions.
@jollyrogerxp 5 лет назад
Absolutely, it is clearly a pipelined design, with pipeline stages all taking a single cycle, contrary to other more complex DSP designs. And yes, the Propeller is certainly interesting :)
@noop9k 5 лет назад ⁺¹
@@jollyrogerxp It is very cool how they designed it specifically for realtime bit-banging with no unpredictability. No interrupts, no need to care about preemption priorities and saving context. Instead you have enough "cores" to poll many different inputs.
@jollyrogerxp 5 лет назад
@@noop9k absolutely, this is what one would do on an FPGA or any dedicated ASIC to process streams of data with guaranteed throughput and latency, which is what DSPs are good at for hard real-time constrained systems!
@cammelspit 5 лет назад
So glad to see you back at it. Good show as always.
@pig1800 5 лет назад ⁺³
This seems similar to what we called "FMA" today, but in integer only... funny to see today's generic CPUs mimicking DSPs like this in SIMD instructions to make better performance.
What seems most complicated to me is all ALUs in that era are all integer only. My programming skills raised in modern days which CPUs already provided nice FPU functionality, even with robust SIMD instructions. But back then you have to use fixed point numbers which is basically just integers with an imaginary decimal. All these bit shifting, number overflowing things, make me headache just for imagine them... Programmers in this era really had hard days.
@segaunited3855 5 лет назад
It's pretty simple. CPUs with more than One Core were unheard at the time.
@MrSapps 5 лет назад
Yeah 16:16 fixed point is rather limiting
@feosTAS 5 лет назад ⁺¹
Thank you! This is getting crazier!
@solarflare9078 5 лет назад
feos? What are you doing here?
-SEGA Saturn TAS's when-
@feosTAS 5 лет назад
Been following since the start. And I don't have any saturn games that interest me.
@stoicvampirepig6063 5 лет назад ⁺¹³
Oh I see!
Actually no, no I don't. :)
@AesculapiusPiranha 5 лет назад ⁺¹
I know this is a joke but I find rewatching videos on complex topics and sometimes going as far to pause and take notes helps if you really want to understand.
@solarflare9078 5 лет назад
I get it!
.... *I don't get it!*
@stoicvampirepig6063 5 лет назад
@@AesculapiusPiranha
Thanks for the advice...quite wasted on me though as I'm 42 and if I haven't got it by now, and I haven't I shouldn't think it'll ever click.
I spent most of the late eighties and early nineties trying to learn to code...but never really progressed past a competent grasp of BASIC.
I'm comfortable with the fact I'm not clever enough...it's okay.
I love video games so much I love listening to basically white noise (to me) even if I don't get it.
@JohnnyThousand605 5 лет назад
@@stoicvampirepig6063 Don't give up! I'm 43 and I'm really enjoying getting into c#. =)
@TDRR_Gamez 5 лет назад
@@stoicvampirepig6063 Tried developing little games in GZDoom? It's quite easy and fun.
@nickwallette6201 5 лет назад
I'm just branching out from CPUs to FPGAs (where EVERYTHING runs in parallel), and haven't looked at DSPs yet, but this totally makes sense.
I've always thought of DSP as those magic chips that are somehow really good at math-intensive stuff, but didn't really know what they were doing special. In fact, now some of the (CPU) instructions that are often labeled as "for DSP routines" make more sense too, because they're usually something like multiply, add, and accumulate all in one cycle.
Looking at it from the FPGA angle helps, since you can see it's just cascaded ALU blocks that can run to coherency in the time of one cycle.
So thanks for the useful intro! :-)
@jollyrogerxp 5 лет назад
DSPs and of course even more so FPGAs are devices that software-only people have a hard time to wrap their heads around, precisely due to the issue of many states changing simultaneously, rather than thinking about a single stream of instructions... :)
@segaunited3855 5 лет назад ⁺²
@@jollyrogerxp Our friends at Alethiea Games are planning to implement FPGA in The RAZOR.
@welcome.to.vatenkeist 5 лет назад ⁺³
1st 😎
Saturn was such a nice Machine and i like your Videos alot mr. Game Hut 😊
@stunningstubbs 5 лет назад
I love the technical explanation!! Thanks for sharing and keep up the good work.
@Nezuji 5 лет назад ⁺²
Since you've also programmed the PS2, I'm curious as to why you feel that the Saturn DSP is harder to program than the PS2 EE.
@MrSapps 5 лет назад ⁺¹
VU microcode is also hell... actually most of PS2 coding is hell for similar reasons to the saturn.. so many damn bits of hardware all operating independently
@akongas 5 лет назад ⁺¹
I LOVE THIS STUFF! Thank you bud! Has anyone told you how calming your voice is?
@retroboy-fh1ji 5 лет назад ⁺³
If this was difficult to learn, imagine the Atari Jaguar coders doing 3D with so archaic and rare design with its two chipsets: "Tom & Jerry" (yeah, like the animation lol). But anyway, this is simply fascinating! I'm not a programmer but i can understand how difficult is (or was) learn the Saturn infamous hardware in general. It was more powerful than the PSX, no doubt, but it was rarely see that (Radiant Silvergun, Panzer Dragoon Saga, Sonic R, and some more, but not much examples unfortunely).
@segaunited3855 5 лет назад ⁺⁴
The sad thing about Saturn is how its actually an unfinished 64-bit Console. And how if only it had those 4 Months it REALLY needed to complete taping and documentation, it would have MURDERED the 5th Gen race at the very start.
@maroon9273 2 года назад
Plus the mk68k which is a bottleneck cpu in the jaguar and never works well with 32 and beyond chips.
@ArneChristianRosenfeldt Год назад
@@maroon9273Saturn has 68k, too
@Justin-TPG 5 лет назад ⁺¹
So I’m interested, given that you seem to know the Saturn hardware well, what you think the most accomplished, commercially released use of the hardware, and therefore the DSP, actually was. You’ve probably answered this question a million times!
@mandymom2800 5 лет назад ⁺¹⁰
No wonder the Saturn failed: it was a nightmare to program!
@khhnator 5 лет назад ⁺⁵
treasure's CEO back in the 90's (someone who touched pretty much every console up to then) said that n64 was many times worse than the saturn
@khhnator 5 лет назад ⁺¹
also, in the 90's... if you wanted to have enough horse power to do math for a 3d game... a dsp was what you had to use, there was no other choice.
unless you fricking Ken Kutaragi.
the big difference between ps1 and all the other consoles from that era was the no dsp for the 3d math
@noop9k 5 лет назад ⁺²
khhnator I don’t quite understand you. PS1 had GTE - Geometry Transformation Engine which is a coprocessor integrated into the CPU.
@SianaGearz 5 лет назад ⁺³
Irrelevant. PS2 is a "nightmare" too, and it was an extraordinary success! For that matter i gladly believe the reports that N64 was essentially undebuggable, in spite of being, from the software perspective (not the hardware!) less convoluted.
Yeah it is a bit magical how the PS1 came together - integrating a very fast special purpose DSP as processor instruction flow obeying COP was a smooth move. As was the bucketed DMA. The system's a bit crude as far as what it could do, which it pretty much had to be given the time it came out at, but it was laser focused on making it easy and accessible.
@noop9k 5 лет назад ⁺²
@@SianaGearz PS2 was a success mostly due to Sony's marketing and the Playstation brand created by PS1. Also due to actually being fastest at the moment of release. Unlike Saturn which had inferior 3D capabilities vs PS1 straight from the start. Still, it had quite a few bad ports where devs obviously spent most of their attention on GameCube&XBox versions. DC ports with lower res, worse textures, worse audio..
@stefanmadsen5605 5 лет назад ⁺¹
Please do a full hour of this.... its awesome
@caracalkarting 5 лет назад ⁺¹⁵
But can do you do it without collecting a coin
@KlausWulfenbach 5 лет назад ⁺⁵
To answer that, we need to talk about parallel universes...
@joshstarr6400 5 лет назад ⁺²
@Hickory Mouse It's actually surprisingly simple.
@PennPal573 5 лет назад
Nicobbq sucks
@youuuuuuuuuuutube 4 года назад
That's some beautiful code, multiple assembler instructions on 1 line :)
@miasuke 5 лет назад ⁺³
The Saturn is really powerfull. The weak point is the complexty to program in this board.
@mogo9052 5 лет назад
miasuke
Please elaborate on its power potential. No one seems to mention it's overall potential.
@greatsageclok-roo9013 5 лет назад ⁺¹
I can seem why this was difficult to code for.
Strangely, though, I understood every bit of what you were saying and never got confused.
@seanmorris440 4 года назад
Your BG music is phenomenal.
@MrInside20 5 лет назад
Love this channel! Always something interesting about programming but with the context of retro games 👌🏻

Следующие

Автовоспроизведение

The Mystery of Sonic R's Impossible Code - Coding Secrets