Since this keeps being asked: the RAM add-on added around 1-2 FPS in the final result. Adding more RAM to a system that already has enough usually does not help. The only reason it helped here is the specifics of the RAMBUS and the manual RAM alignment.
You, sir, are a god. Could you also rewrite Ocarina of Time to allow an open-source modding archetype like Skyrim or Fallout has? It would be an insane community service tool, and you could expand your Patreon as well. No biggie, right?! xD Who knows, maybe you like the idea. Whatever you do, looking at this project, it's bound for success. Great job, my man!
I have to admit the only thing I was impressed about was how you tried to take out all the bottlenecks between the CPU, RAM & RCP. And you took things to the next level by asking this project: 'Is this all I am? Am I using the hardware to its fullest capacity & capability? Is the software doing this as well? Can this project work on every level?' I have actually made this argument to gamers about current games and most people never bother to reply, despite the fact I'm not after an argument but just a discussion in computer philosophy. Because I'm good with refining renovation projects until they reach a level of all-round perfection. But I am not a perfectionist. Don't you think the engineers who built the pyramids were thinking along the same lines as us? I enjoyed your video. I think you're the first to have done something like this on YouTube???
This shows what “standing on the shoulders of giants” means. Dozens of people barely pushed out M64 in the 90s. One man was able to triple the performance, but only because of the knowledge that has been developed and shared in the past 25 years. Amazing work, sir!
@Walid Fakhfakh Mario 64 is a really important game historically. It really figured out 3D camera control and controls in general. It's easy to take for granted now, but a lot of early 3D games were fixed-camera and had 2.5D controls or tank controls.
You can't blame the devs all that much either, since Super Mario 64 was a launch title. They only had so much time between when the system was available to begin developing for and when the console had to be released. 3D games were basically brand new so it's a very commendable effort.
It's also worth mentioning that this kind of insane micro-optimization is just a very, very, very hard skill to acquire. You rarely ever have the opportunity to acquire a skill like this naturally while making games. There would probably have been only a couple of engineers at Nintendo who could have done this, even if they had the extra time.
Also the hardware was odd -- SGI had been using the Reality Engine in systems for many years, so I assume the way to get performance out of the 3D chip itself was well known (but usually with more RAM). SGI had also used MIPS CPUs but had higher performance superscalar designs with pipelining, high performance FPU, etc. The N64's VR4300 was like a base MIPS with almost no floating point instructions even. So you either let the compiler do its thing and generate slow floating point code using many instructions, or hand optimize to use integer math with equivalent results.
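For anyone wondering what "integer math with equivalent results" usually looks like: the common trick is fixed point. Here's a minimal sketch of a 16.16 format in C. The names (fix16, fix16_mul and so on) are made up for illustration and are not taken from SM64 or any N64 SDK.

```c
#include <stdint.h>

/* 16.16 fixed point: upper 16 bits hold the integer part,
   lower 16 bits hold the fraction.
   All names here are hypothetical, for illustration only. */
typedef int32_t fix16;

#define FIX16_ONE ((fix16)1 << 16)

/* Convert a small integer to 16.16. */
static fix16 fix16_from_int(int n) {
    return (fix16)n * FIX16_ONE;
}

/* Multiply through a 64-bit intermediate so the product cannot
   overflow, then shift back down to 16.16. */
static fix16 fix16_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * (int64_t)b) >> 16);
}

/* Drop the fractional bits (intended for non-negative values). */
static int fix16_to_int(fix16 a) {
    return (int)(a >> 16);
}
```

With this, one half is `FIX16_ONE / 2`, and multiplying it by `fix16_from_int(10)` gives the same bits as `fix16_from_int(5)`, all without touching the FPU. Whether that is actually faster depends on the chip, so treat it as a sketch of the idea, not a recommendation.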
@@MMOSimca Not only that, it wasn't easy to share knowledge and information at that time. This video for example wouldn't have been possible. You only had the people around you to learn from unless you reached out. Also lots and lots of documentation. That's why games got better optimized later in the life of the platform.
Even with this knowledge we didn't really have the hardware to take full advantage of it. Now we do. Took years to even emulate these games to begin with.
@@gamesandplanes3984 so? He still made it happen, thats not bullshit. Like he said, it would be impossible without it. What he did was insane and awesome. What great achievement like this have you accomplished?
the sad thing about modern game optimisation is instead of utilizing new hardware to actually improve visuals AND framerates and resolutions, they use it as an excuse to not optimise at all because "the newest tech can handle it anyway"
@@lazulazu2467 Pretty much this. Used to allocate time in my projects to optimise, now I don't bother so much. Sure, my methods have improved, but hardware patches over the cracks nowadays. It's a shame, hardware improvements allowed software developers like myself to get lazy. Unless you're working with 3D gfx, why bother. If you do work with 3D graphics, call it a game for the future and move on. Real shame. I still remember marvelling at how they managed to fit some of those Pokemon games on those GB carts way back when. Those optimisations wouldn't happen now, you'd just be directed to download a patch on launch.
Kaze is just the best. My favorite thing is that he doesn’t give the original developers any grief. There was just not enough knowledge available to have done all of this for them. Really honorable thing to mention. Thanks Kaze.
If they’re not confident enough to ship the finished game using an optimized build that wouldn’t introduce any bugs under the time constraints they had, it’s EARLY DAYS. Miracle that this game works as well as it does.
Also gotta keep in mind he's doing this with no constraints. He doesn't have to meet a deadline, stay within a budget (no need for expensive SGI machines when we have Blender), worry about whether future console revisions will break his clever hacks... No limits on ROM size, twice as much RAM, and way more advanced tools...
@@johnsimon8457 You need to realize that this was during the 1990s. The game wasn't an early build. The game wasn't in "early days". The game worked fine. But, because of the lack of technology, there were bugs. Simple as that.
The cost of 4 MB of RAM in June 1996, the date of the N64 release, was about $34 retail. Within 5 months it was $21. Nintendo should have released the system with the 8 MB standard. The system's price might have been $40 to $50 more, but within a year the cost of that memory would have been $15... You can never have enough memory.
For years I wondered if anyone would ever do something like this. I've known rewriting a console game was possible, but I never actually expected anyone to be crazy enough to do it. Bravo, this is absolutely extraordinary.
@@lsdkjsdlkjpiosdcposdjfdjad1363 Not so much leaked; it was carefully reproduced bit by bit by the community. Yeah, it got leaked in the gigaleak too, but it's kind of illegal to use that one and Nintendo can actually sue you for using it. Same reason why the Dolphin developers don't even take a look at the Wii source code that got leaked.
@@carso1500 They recreated it yes with their own code, but having the original source code helps a ton to understand the game and recreate it without using any parts of it.
@@LouisThau And community support on games like these allows them to live on for SO much longer (*did you get this video randomly recommended like I did or ?*)
@@beejay99ah Exactly. This is how *everyone* learns. Passion drives framework, and understanding does not come without passion, even if one so does lack the framework to fill it in.
@@trendybistro In terms of best, I agree with you. But in terms of importance, you have no idea what you're talking about. Super Mario 64 was the title that outlined and defined the way every single 3D console game of its generation would be judged, set the stage for every 3D game that came after, and paved the way for hundreds of design principles that are still in wide use today. It is THE trailblazer and icon of the 3D renaissance, which is still the most influential and revolutionary period of gaming ever. Period. And that's not bias, either. I don't even like SM64 personally, but it'd be real dumb to call it anything other than one of the most important games of all time.
I am highly interested in seeing a definitive edition of the split-screen multiplayer using these optimizations. I would like to be able to play the multiplayer smoothly on real N64 hardware. I also find your videos about the work that goes into these optimizations fascinating.
@@theme7363 , then Nintendo would be having a purge of its current staff and it would be replaced by people that actually gave a damn for the franchises instead of treating them like Gollum treats the Ring of Sauron.
By optimizing the code, the parallel universes can more quickly be QPU aligned, allowing Mario to do faster hyper speed walking while holding the A press.
"if you just program well, you don't need to prevent crashes" is an absolutely insane (but not entirely wrong) thing to say and I absolutely love it. I can't get enough of your channel :D
It's absolutely right. For example, to normalize an unsupported value in a function to something it can work with:

int someProcess(int x) {
    // make sure x is not negative, otherwise reset to 0
    if (x < 0) {
        x = 0;
    }
    // do stuff with x
    return x;
}

These types of checks happen at runtime (using CPU) and are usually made so that each function is pretty safe to use, which is a common practice to make the code more maintainable in the long run and more friendly to use. In a game like this, the code is usually written under a tight deadline and will never be touched again once released, so you don't need maintainability. So in this case, these checks are only useful to be more forgiving to developers passing "wrong" values to it. But if the developers are careful enough to make sure only correct values are passed (even if they need to do that check exactly where needed), then the check doesn't need to happen every time that function runs. I think that's what he's talking about.
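To put the two approaches side by side, here's a small C sketch, assuming the "do stuff" part is just doubling the value. Both function names are invented for this example. The second version uses the standard assert() macro, which checks the contract in debug builds and compiles to nothing when NDEBUG is defined, so a release build pays no cost for it.

```c
#include <assert.h>

/* Defensive version: pays for the range check on every call. */
int some_process_checked(int x) {
    if (x < 0) {
        x = 0;
    }
    return x * 2;  /* stand-in for "do stuff with x" */
}

/* Trusting version: the caller guarantees x >= 0.
   assert() documents that contract and disappears when the
   code is compiled with NDEBUG defined. */
int some_process_unchecked(int x) {
    assert(x >= 0);
    return x * 2;  /* same stand-in work, no runtime normalization */
}
```

The trade-off is the one described above: the unchecked version is only safe if every caller really does pass a valid value, which is easier to promise in a finished game than in code other people will keep changing.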
@@dawidkotlinski Yes and those bugs are usually detected and fixed with actual gameplay testing. If it stops crashing and glitching, then the fixes are good and it's good enough for delivery.
"If I had more time, I would have written a shorter letter." I think this quote works well here. It takes time and skill to be able to optimize. The more optimization done the more astounding really. Sadly a lot of companies don't optimize as 'todays hardware can easily handle it." Which is why we get games that breach 100GB when they honestly shouldn't.
I work as a consultant optimizing games. It very often comes down to fact that you have a feature/performance trade-off. Meaning, you COULD spend time speeding up your game, but it would mean delaying feature X or tool Y by N weeks. Often, having something that works is much more valuable to your development process than having something that is fast, because things that WORK unblock other tasks, while things that are FASTER rarely unblock anything unless we are looking at issues that require 100x speedups. This is true for small companies - larger companies should have more dedicated people with strong profiling skills, but small companies are often too busy to even open a profiler.
Two great quotes that will stick with me from this video: "You either get to know about these things, or you get to go out and touch grass every once in a while. It's a strict either or, you can't have both." "RAMbus goes vroom vroom."
aaaahh, grass.... I was convinced he said "You either get to know about these things, or you get to go out and touch GIRLS every once in a while. It's a strict either or, you can't have both." It made me giggle
Hell yeah dude. I didn't even know this was more than an amazing fan feat until I saw who put out the video. The man himself. Holy Crap. Great sense of humor, I've only ever seen the finished SM64 content, never heard the man speak. I had a good time watching this.
I've been coding professionally for over 20 years, that is some super impressive work and entertaining with it. Code optimisation is one of my favourite tasks.
Well, as he said at 6:36 this RAM optimization wasn't actually possible on the original hardware... So Nintendo programmers probably did all they could already.
As someone who watched this video: does a master's degree in computer science not teach you how to clean up superfluous code for optimal performance? John Carmack's been showing the world how to do that since the 90s - he got raycasting* working on literally anything and got Quake to run in full 3D on pre-Pentium PCs with no dedicated graphical hardware :/ *don't make my dumbass mistake of confusing raycasting with raytracing or you're gonna confuse everybody :P
@@reloadpsi University teaches you the value of clean, maintainable (i.e. readable) code that is robust against any kind of problem. That's much more important than maximizing performance by throwing all of that out of the window. With today's machines performance isn't that much of a priority anymore. ;) (NOTE: I'm talking about games here, not general software) Also, when it comes to efficiency, there are some very different fields in computer science: (A) Optimizing algorithms, i.e. the general (macro) structure of the program (which is much more important) and (B) optimizing code for hardware usage (micro level), where you reduce the amount of calls / memory access, etc. So (B) means "what can we speed up, when the algorithm stays the same". And Kaze is doing (B) *hardcore style.*
If it were Sega, sure. But his hands would be tied. Better to keep this as a "patch" to the original ROM, which is probably going to be a complete replacement of every byte, but you know... can't go distributing whole ROMs without summoning the thunder.
I love how he not only improves the original code, explains it easily, but also doesn't shit on the original developers and pretend like they were incompetent.
Well, in this case, they purposely made the game run at 20-30 fps because the console's hardware couldn't run it any faster. There's no reason to say anything about the original developers because the game was meant to run slower.
Kaze is literally the giga chad at this point, some jacked dude just going "yes I have spent two years rewriting and optimizing the source code of a 20-year-old game, how could you tell?"
The rolled loops thing is funny. Obviously unrolled loops make the code much less maintainable, less readable, less terse, and more error prone. But back then programmers used to unroll loops as an optimization to save some CPU cycles (as the for loop part of the code does not need to run again and again ). But now we are seeing that rolling them actually improves performance. What a perfect cautionary tale about premature optimization, and also optimizing without profiling.
*Programmers* should not be the ones unrolling loops. Programmers should be writing code that allows the *compiler* to unroll the loop *if the compiler determines that an unrolled loop would improve efficiency*. This requires an intimate understanding of the hardware that the code is running on. If you have a shared bus and your instructions are cached, then yes, smaller code size optimization (i.e. rolled loops) will win over "faster" code speed optimization. That is unlikely to be the case for an in-order processor with no pipelining or cache. That said, you *do* need a good compiler and that compiler *does* need good knowledge of the hardware, or this all goes out the window. I've developed some code for the PIC and their C compiler is outright horrific. I can't tell you how many times I've had to twist myself into a pretzel to get the compiler to generate good code, and sometimes I just need to write it in assembly because the compiler is too stupid. And their compiler is made by the the same company that makes the microcontrollers!
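For readers who haven't seen the two forms next to each other, here's a minimal C sketch of a rolled loop and a hand-unrolled one. The function names are made up for this example and have nothing to do with the SM64 code. Both return the same result; the difference is code size versus branch count, and which one wins depends on the hardware.

```c
#include <stddef.h>

/* Rolled: small code footprint, one loop branch per element. */
int sum_rolled(const int *v, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Manually unrolled by 4: fewer loop branches, but the body
   takes up more instruction memory. */
int sum_unrolled4(const int *v, size_t n) {
    int total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; i++) {  /* leftover elements when n is not a multiple of 4 */
        total += v[i];
    }
    return total;
}
```

On a machine where fetching instructions is the expensive part, the smaller rolled version can come out ahead, which is the effect the video describes. The only way to know for a given target is to measure.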
@@alexbrown3603 Picking holes in something written over 25 years ago for not using modern techniques? We could apply that to literally everything on the planet. Oh, doctors in the 70's, 80's and 90's should have used medicine from the 21st century... Car manufacturers in the 80's should have made electric cars. Etc.
This level of optimization is absolutely mind blowing. If I were Shigeru Miyamoto, I would be so proud to see a passionate developer improving this game to this point so many years later. It proves how deeply his creations impacted our generation.
Shiggy hates anyone that takes something he had a hand in and "makes it better" in any way, so this would likely be seen as an affront to his work. Just as an example, he dislikes the DKC SNES titles because they're (literally) "Not his Donkey Kong", and tried to 1-up them with 'Returns' on the Wii (and arguably failed because the shoehorned motion controls directly conflict with the base gameplay the series is built on).
@@duffman18 Sources? You can't just claim it's a mistranslation of interviews without proof to back it up, particularly when it's been corroborated by multiple unrelated sources at many different times.
The unrolled loops are funny from a modern perspective - I'd write them as loops and then unroll later if i needed performance, even if compilers couldn't do it for me - but considering these devs were coming from the SNES? Makes total sense that they would just assume it's the right thing to do.
@@rotor13 , I disagree with the engine swap analogy What this man did was more equivalent of tweaking every single thing on the car so that it runs as efficiently as possible with the least possible amount of upgrades.
This is pure backend refactoring bliss! Writing good and optimized code is an art and should be valued more by all development companies. This is why I think all games after some years of its sales end should become open-source.
The sad reality is that development and production companies value money first and foremost, which means they are often incentivized to sell good enough software rather than great software. The N64's launch was delayed to make SM64 good enough. It went on to sell millions of copies and become one of the most beloved games of the late 90s. If an unstable 20FPS frame rate was good enough for that, why would they delay further? That said, death of author + 70 years is an insane copyright duration. The original maximum duration in the US was 28 years. 28 years after SM64's release is 2024. Under those terms, Nintendo would "only" have had time to sell the original release, the vibration support rerelease, the DS rerelease, the Wii Virtual Console rerelease, the Super Mario 3D All-Stars rerelease, and the Switch Online rerelease before the code went public domain.
Oh god just imagine if we lived in that world . So many more indie games I’d start with parasite eves code base and use that as a jumping off point for my game
Good code, fast code, easy to read/write code... Pick any 2. Engineering is all about tradeoffs and you need to understand the requirements. It's possible that if Nintendo wrote such super optimized code back in the day they would have missed a market window and it would never get released. Instead, it was "good enough" and was hugely successful for a flagship game.
Y'know, when I heard the source code was decompiled, I noticed Kaze went kinda quiet. Now I know that he used that time to REWRITE THE ENTIRETY OF SM64 TO A STATE THAT MAKES SHIGERU BLUSH!
It's funny that so much time was spent decompiling the game and then straight after the release we got the Nintendo giga leak which had the source code for it anyway.
@@Agret from what I heard, it was only some of the source code, basically what was necessary for IQue to make their own version of Mario 64, so the entire source code wasn't leaked as of now
7:23 This is very interesting. Usually I hear about compilers unrolling loops to optimize them, but here, the holdup is RAM, not the processor. Incredible work all around.
Modern computers are also massively being held back by the RAM being so slow. I think the difference is that now we have more mitigations for that, like larger caches and speculative execution. This probably shifts the place where code-size vs conditional jumps balance out (conditional jumps in non-unrolled loops are also bottlenecked by memory).
@@kebien6020 Also, compilers are getting better at removing conditional jumps altogether. In modern C(++) code you don't have to manually remove most ifs because the compiler does it for you. I realized this when I wrote OpenGL shaders, which I think aren't optimized at all, so you had to manually do all the tricks to avoid ifs. Just by removing a few ifs in the shaders I gained much more performance than by implementing batched rendering.
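As an illustration of what "removing an if" can mean, here's a small C sketch of a clamp-to-zero written with and without a branch. The function names are invented for the example, and the branchless version assumes two's complement integers. As the comment above says, a modern compiler may well turn the first version into the second on its own.

```c
#include <stdint.h>

/* Branchy version: a conditional jump decides the result. */
int32_t clamp_branchy(int32_t x) {
    if (x < 0) {
        return 0;
    }
    return x;
}

/* Branchless version: (x > 0) evaluates to 0 or 1, and negating
   it gives a mask of all zero bits or all one bits. ANDing with
   the mask keeps x when it is positive and yields 0 otherwise. */
int32_t clamp_branchless(int32_t x) {
    int32_t mask = -(int32_t)(x > 0);
    return x & mask;
}
```

Whether the branchless form is actually faster is hardware-dependent, so it's worth checking the generated code before doing this by hand.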
Did he say RAM or ROM at 7:23? 8 and 16 bit systems run directly out of ROM and don't copy any code to RAM for execution. A PC OS (IIRC) will load everything into RAM from disk barring external libraries. Eight cycles JUST to load the next instruction from ROM is INSANE. Removing instruction cache would make the N64 run 100x slower.
I remember playing N64 games as a kid and looking at the skyboxes thinking "Why can't I go there? Why didn't they make it? It would be so much fun." This is exactly what it feels like to see mods for this game all these years later. Thanks for making this! (to everyone involved)
I think this has become my favorite video on YouTube. I'm currently studying for a software engineering degree and I love covering memory optimisation in programs, so this was a great watch! Also, SM64 was one of the first games I played, so bonus points for that.
I wish this man was my programming teacher back in my youth. I couldn't help but smile throughout the entire video and at how passionate he is about all of this.
As a programmer and engineer I can say that this is a MAJOR achievement and a LOT of work. I can't congratulate him enough. Heck, the fluid framerate even looks like Super Mario Galaxy or Super Mario Odyssey. I just disagree when it was said that "the Nintendo programmers were not familiar with C, because it was a new language back then". C originated in the early 70s and was and is a well-known language. However, what is certain is that many 90s game programmers were making the transition from assembly to C / C++ code, I think that's what was implied in the video. It's a minor detail, like a 0.001% of what's in the video. KUDOS
Oh ok, I was wondering how C was a new language being already at that time 20 years old or so... It took that long to begin transition ? was it due to compiler refinements ? or just inertia ?
@@nlight8769 high level languages did not compile efficiently enough to be usable on older consoles, N64 was one of the first where the hardware power was capable of handling the unoptimizations to a degree
7:45 "Usually if you did this, you would get fired from your job" Completely understandable, won't even lie. I can't imagine even the most well-equipped programmer understanding something as insane as the Q_rsqrt function (for an example of course), especially during that time period. Ahh, classic C black magic.
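For anyone who hasn't seen the Q_rsqrt trick mentioned above, here's a sketch of the technique in C. It is not the Quake III source verbatim: the name fast_rsqrt is mine, and it uses memcpy instead of the original pointer cast to reinterpret the float's bits. It assumes float is a 32-bit IEEE 754 value and that x is positive.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive x.
   Assumes float is a 32-bit IEEE 754 value. */
float fast_rsqrt(float x) {
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);     /* read the float's raw bit pattern */
    bits = 0x5f3759df - (bits >> 1);    /* integer trick for a first guess */
    memcpy(&y, &bits, sizeof y);        /* turn the bits back into a float */

    y = y * (1.5f - 0.5f * x * y * y);  /* one Newton-Raphson refinement step */
    return y;
}
```

The result is an approximation, not an exact value: for an input of 4.0f it lands close to 0.5 rather than exactly on it. That kind of trade (readability and exactness for speed) is exactly why code like this tends to raise eyebrows in a normal job.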
A lot of this stuff would get you fired; the loop unrolling also comes to mind. I hope Kaze points out that although he's optimizing everything, optimization is only one facet of good programming, and in a professional setting it's generally not the most important one, unlike readability.
@@LizardLeliel It's all about context. He's programming for a never-changing target (dead platform), with so much hindsight. There's no need to worry about maintainability because the game (engine) is forever done, etc. I'm sure if this was an "actual job" he wouldn't be fired; he's worked according to requirements.
@@Radgerayden-ist I'm not saying he's doing bad for focusing on optimization here - he's the only programmer involved in this project. I'm just genuinely afraid new programmers may get the wrong idea that the standard of good code is just optimization.
@@LizardLeliel as someone who constantly messes up in Scratch, I can confirm readability is key to make a good game, even there you may get disoriented if you're being reckless with the block programming, let alone other guys looking at your code. Yes, even with a language made for kids, these things can happen.
What we expect from an Official "Remastered Classic" : This passion project. What we actually get : We lost the source code...so we just scrapped something similar together and hope y'all don't notice
@@lilaa3 Nintendo is an awful company from a consumer level and I'll actually never buy another product from them as long as I live. Even if I may want to.
@@lilaa3 Yeah, it's more like "We've already got a working emulator for these games, it'd be faster to just use that than spend the dev time to make a proper port" than anything about losing the source. Still, there's so many things they could've fixed and just decided not to. Didn't even take the time to change the goddamn button textures. I don't remember my Switch having a bean-shaped grey X button, or a bright red B. One of the only changes I can recall that made it into the initial release was Mario Sunshine's camera controls being inverted to align with the other games. Which made it feel completely wrong to anyone that's played Sunshine before. The only major change made the game worse for a lot of players.
The older a game gets, the more people are going to crack it open, either to make modifications or exploit what's there. I think that's a beautiful thing. Amazing work!
I've heard of loop unrolling and function inlining to improve performance. This is the first time I've heard of someone loop rolling and outlining functions for increased speed. I guess it goes to show you how important it is to actually understand the limitations of the hardware.
As a gamer/engineer who originally learned/programmed in C++, I appreciate this video. Some of the code improvements you found were super interesting, thanks for this!
Because of the tight constraints of early consoles, I always assumed, especially for a first party Nintendo game, that they were already very optimized. This blew my mind.
Well it was fairly optimized for what it was. The game did work and people played it, it almost never crashes, so thats quite a job already. At any stage of optimization there is another step to make it even better, so it never ends. The cost to squeeze out some % might just become increasingly larger and costly, especially on understanding the code and potentially re-using it for some other game (as having hard-coded shadows in the code is a pretty bad approach if you want to re-use it).
Hindsight is 20/20. More is known about the N64 hardware now than when this was developed. Optimization techniques always advance, like all technology. That doesn't mean this isn't an impressive accomplishment. This is just for everyone asking why Nintendo didn't figure this out back then.
Well, he said at some point that when he's learned enough through romhacking he will eventually make his own game. The question is, though, has his position changed on this, or will he ever consider something to be enough in that regard? xD
@@thomasrief well, regardless of his position on making a game, the man does an awesome job and clearly puts a lot of passion. I think it'd be awesome what Kaze could create without being constrained by SM64 (Though looks like he's working around a lot of the constraints)
What's kinda funny is I've thought about actually porting some N64 games to GameCube just so mods can take advantage of the extra resources and better emulation.
@@renakunisaki I suppose it would need a full rewrite though - IIRC the N64 doesn't even use the same processor as the GC, let alone a similar architecture for rendering
Zel, also known for some cool Zelda stuff, is actually working on a pretty late-gen Star Fox Adventures (Dinosaur Planet?)-like project that looks almost early-GameCube-ish, on a goddamn N64.
I hope that GDQ allows this game to be played and have its own category, calling the game "Super Mario 64 plus" or "Enhanced SM64" or something, so we can watch speedrunners tear into it. Very exciting.
Why not call this Expansion Pak support version the Kakuchou Edition (拡張 meaning expansion), like the Rumble Pak support version is called Shindou Edition (振動 meaning rumble)?
Damn, "if you just program well you don't need to prevent crashes". I wish I had this level of confidence. I just stumbled on your video, but awesome job!
7:00 Additionally, I would say that Nintendo had to work under really tight time constraints, with level design and ideas changing constantly. Still, they were able to deliver this amazing game that defined the rules of the 3D platformer genre for years to come.
I love how you don't throw shade at the original devs. You respect the work they did, but use all the tools at your disposal to optimize it into something beyond what they could have imagined.
@@OtakuPlayTM They did deserve a little shade at least. Some of these things are a bit bonkers, and C had already been popular for about a decade (it gained popularity during the 80s)
Note for those claiming C wasn't new in 1995: Kaze is inaccurate but not wrong, since C was a new thing in console development (32-bit consoles); before them, assembly was used for coding games on SNES or Megadrive/Genesis. This is even more true for Japanese devs: I own a book about the PS1 release where they mention the change from assembly to C was complicated for Japanese programmers.
Yeah, I think at the time the existing C programmers back then were mostly doing business apps or other high level projects, there weren't many C programmers with video game backgrounds.
Was I the only one who noticed that using instancing for the coins resulted in all coins spinning in sync, whereas in the original game they spun with a random offset to reduce repetitiveness?
Regarding compiler optimization, loop unrolling, etc. It's one of the things that makes targeting old CPUs with modern compilers difficult. Many modern optimization techniques that result in real improvements for modern CPUs result in much slower code on older CPUs. They didn't have the benefit of gargantuan cache sizes, multi-way instruction decoders, register renaming, etc. For most older CPUs, the general case is that fewer instructions means more performance, because it leaves more bus time for data and IO.
4:15 the code isn’t necessarily bad, it perfectly did what it needed to deliver in time at a given quality threshold. I observe this a lot at work where coding purists will sweat about a perfect solution while the product development moves on only for them to start from scratch with the next batch of changes. In a commercial setting we cannot expect perfection. Oftentimes devs have to polish a turd. The result isn’t always beautiful under the hood but as long as it functions it’s good enough.
@@jphataraki6764 That's the line of thinking being talked about here. When you're coding commercial products, it's basically guaranteed that you will not have time to perfect your code. I'm not even factoring in the possibility of a rushed cycle here, I'm simply saying that, when it comes to how long coding takes, there is an enormous difference between reasonable and pure.
100% agree, I just spent roughly 2 months worth of work programming a large industrial production line. I programmed it not to be the most efficient with memory or to preserve the scan time of the PLC... but to be modular, so I can add things on or take them away (comparatively) easily. And by fuck am I happy for that now. 1 week from installation I've been hit by the "Yeah so half the machine hasn't been built yet but you need to commission the stuff that has." The only issue is that half the missing equipment is "Safety rated" and therefore the PLC will throw a hissy when it doesn't exist. Sometimes it's not about speed, it's about safety and just making it work.
"you either learn about these things, or you go outside and touch grass. It's a strict either or, you cannot have both" - Single greatest quote of the whole vid. My life is complete, my life experience is validated. Thank you.
As a developer of 10+ years, it amazes me how much love and effort you put into this! Let me ask, how much time did it take you to understand most of the source code since the first time you read it? Amazing job!!!
@@ozordiprince9405 But that counts not even all optimizations mentioned in this video and also not the earlier work on mods. He was definitely already familiar with a lot of the codebase.
Great video (and great project)! One nitpick though: C was not a new programming language in 1995. C came around 1972 with 1978 being the year when the creators released the official guide book, extensively known and used by the 1990's. However, it was new for Nintendo developers as this was their first platform to use C for developing games.
I think what he meant is that C for game programming on consoles was relatively new. Nintendo up to that point only used assembly (NES, GB/C and SNES). I don't even know if any third parties besides Nintendo (HAL Laboratories comes to mind) ever toyed with other languages than assembly for producing games on Nintendo consoles, so it's not just that it wasn't done before, they were the first and couldn't even outsource or get help from third party studios that normally aided Nintendo when they were under time pressure.
@@gerardmarquinarubio9492 yup exactly what I said: new to Nintendo but not to the world. And yeah, maybe he meant that but it was not what he said. Someone new to programming watching this video might form their knowledge thinking that one of the most influential languages in history was invented in the mid 1990's. Thus I thought it useful to clarify.
@@gerardmarquinarubio9492 There are a few Sega Megadrive games that were written in C, such as Sonic Spinball and Ecco the Dolphin. This massively sped up development allowing Spinball to release before the end of 1993, but meant the game only ran at 30 fps. It's hard to compare the CPU in the SNES and Megadrive directly, but maybe the SNES would have performed even worse running a game compiled from C. I'm guessing the reason Spinball and Ecco were in the GBA Sega Smash Pack was because they could just compile the source code for the GBA instead of running them in a Megadrive emulator.
I'd love to see speedrunners using this and see what they notice. I feel like they are most likely to feel the difference and really appreciate the work that went into this.
@@Mate_Antal_Zoltan and speedrunners would probably be used to the lower FPS and use it to their advantage, so this wouldn't really work with speedrunning
Guys, it's not "this would be cool for speedrunning!", it's "I wanna see what people that have been optimizing their playing abilities on this game for the past 20 years think about this version"
I learned on C in university so this is particularly interesting to me. He's right about the touch grass part: you have to sell your soul to master pointer arithmetic.
@@osparav idk, all I know about it is that you add 1 and in the memory it adds 1*the number of bytes of the variable. Haven't searched too much to know what's difficult though.
@@osparav Using the same token (*) for both declaring and using a pointer, not having a mental picture of how pointers sit in the hardware, and not popularizing my_pointer[0] instead of *my_pointer
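For anyone following the pointer discussion above, here is a minimal sketch of the two points being made: adding 1 to a pointer moves it by the size of the pointed-to type, and my_pointer[0] is just another spelling of *my_pointer. The function names are made up for illustration and have nothing to do with the SM64 code.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte distance between p and p + 1 for an int32_t pointer.
   Pointer arithmetic counts elements, not bytes, so the result
   is sizeof(int32_t). */
size_t step_in_bytes(void) {
    int32_t values[2] = {10, 20};
    int32_t *p = values;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* p[i] is defined by the language as *(p + i), so both spellings
   read the same element. Returns 1 when they agree. */
int same_element(const int32_t *p, size_t i) {
    return p[i] == *(p + i);
}
```

Calling step_in_bytes() gives sizeof(int32_t) rather than 1, which is the "adds 1*the number of bytes" behaviour described above.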
I'm not even a dev and most of the code optimization things you outlined are fairly straightforward and basic things that even I am aware of. But some of the things you did, here...man oh man...that is the kind of thing you can accomplish ONLY when you put thousands of hours into a project like this in addition to the thousands of hours of coding experience required to even know you could optimize in such a manner. Also, the LLMs are great at optimizing code. Give it snippets, describe the context, and they can optimize your code or offer suggestions. Almost everything you mentioned you did is something these LLMs would suggest or even re-write for you. Even if you know how to code an optimization, just give the code block to the LLM, describe the optimization you want, bam, it will do it. Often better than you can because it doesn't usually make typos or errors. But I noticed these LLMs make errors when instantiating variables, or fail to declare them before using them. Anyway, this is an amazing project and I wish AAA studios would have an optimization team that did what you did.
Since this keeps being asked:
The RAM add-on added around 1-2 FPS in the final result.
Adding more RAM to a system that already has enough usually does not help. The only reason it helped here is the specifics of the RAMBUS and the manual RAM alignment.
@@Yveldi upload more graphics card for further boost
you sir are a god could you also re-write ocarina of time to allow an open source modding archetype like skyrim or fallout has? would be an insane community service tool you could expand your patreon as well. no biggy right?! xD who knows maybe you like the idea whatever you do looking at this project it's bound for success great job my man!
Can you please optimize Ocarina of time too?
But you said in the video it isn't doable without the RAM expansion. So, for most of the N64 lifespan, it was an impossible feature.
I have to admit the only thing I was impressed about was how you tried to take out all the bottlenecks between the CPU, RAM & RCP. And you took things to the next level by asking this project: 'is this all I am ? Am I using the hardware to its fullest capacity & capability. Is the software doing this as well ? Can this project work on every level ?' I have actually made this argument to gamers about current games and most people never bother a reply despite the fact I'm not after an argument but just a discussion in computer philosophy. Because I'm good with refining renovation projects until they reach a level of all round perfection. But I am not a perfectionist. Don't you think the engineers who built the pyramids were thinking along the same lines as us ? I enjoyed your video. I think you're the first to have done something like this on YouTube ???
he finally did it, he went insane.
The moment we all anticipated
This is the one where he's finally lost it.
What if he went so insane that he went full circle and returned to normal as an ascended being?
we have reached the endgame of sm64 modding
I'm just glad I was here to see it
This shows what “standing on the shoulders of giants” means. Dozens of people barely pushed out M64 in the 90s. One man was able to triple the performance, but only because of the knowledge that has been developed and shared in the past 25 years. Amazing work, sir!
@@walidfakhfakh3660 it's not about the quality of the video... It's what we grew up on.
@@walidfakhfakh3660 Why do you complain about age and graphics?
@@walidfakhfakh3660 because is fun ? lol
How is hating on a game fun
@Walid Fakhfakh Mario 64 is a really important game historically, it really figured out 3D camera control and controls in general, it's easy to take for granted now but a lot of early 3D games were fixed camera and had 2.5D controls or tank controls.
You can't blame the devs all that much either, since Super Mario 64 was a launch title. They only had so much time between when the system was available to begin developing for and when the console had to be released.
3D games were basically brand new so it's a very commendable effort.
This plus the fact that tech was still pretty new at that time makes it perfectly fine
It's also worth mentioning that this kind of insane micro-optimization is just a very, very, very hard skill to acquire. You rarely ever have the opportunity to acquire a skill like this naturally while making games. There would probably have been only a couple of engineers at Nintendo who could have done this, even if they had the extra time.
Also the hardware was odd -- SGI had been using the Reality Engine in systems for many years, so I assume the way to get performance out of the 3D chip itself was well known (but usually with more RAM). SGI had also used MIPS CPUs but had higher performance superscalar designs with pipelining, high performance FPU, etc.. The MIPS R4400 was like a base MIPS with almost no floating point instructions even. So you either let the compiler do its thing and generate slow floating point code using many instructions, or hand optimize to use integer math with equivalent results.
Never judge code that was written under an unrealistic deadline, that's my motto
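To make "integer math with equivalent results" a bit more concrete, here is a minimal sketch of 16.16 fixed-point arithmetic. It illustrates the general technique only. The type and function names (fix16, fix16_mul and so on) are hypothetical, nothing here is taken from the SM64 source, and it makes no claim about whether this actually beats the floating point path on real N64 hardware.

```c
#include <stdint.h>

/* Hypothetical 16.16 fixed-point type: the upper 16 bits hold the
   integer part, the lower 16 bits hold the fraction. */
typedef int32_t fix16;

#define FIX16_ONE 65536 /* the value 1.0 in 16.16 format */

/* Convert a small integer to fixed point. */
static inline fix16 fix16_from_int(int32_t n) {
    return (fix16)(n * FIX16_ONE);
}

/* Convert back to an integer, truncating toward zero. */
static inline int32_t fix16_to_int(fix16 x) {
    return x / FIX16_ONE;
}

/* Multiply two fixed-point values. The product is formed in 64 bits
   so it cannot overflow, then scaled back down by 1.0. */
static inline fix16 fix16_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * (int64_t)b) / FIX16_ONE);
}
```

With these helpers, 1.5 * 2 is written as fix16_mul(FIX16_ONE + FIX16_ONE / 2, fix16_from_int(2)) and uses only integer instructions.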
@@MMOSimca Not only that, it wasn't easy to share knowledge and information at that time. This video for example wouldn't have been possible. You only had the people around you to learn from unless you reached out. Also lots and lots of documentation. That's why games got better optimized later in the life of the platform.
This knowledge in the 90's...Imagine how things couldve been
I remember how things were in the 90s, things were much better then
1980's and 1990's were the best times for gaming.
@@V3ntilator early 2000s as well. (Mostly)
Even with this knowledge we didn't really have the hardware to take full advantage of it. Now we do. Took years to even emulate these games to begin with.
@@rsolsjo But I think having better software would make the push for better hardware faster.
That feel when you completely rewrite most of the code for one of the most significant video games in history to get a better framerate in your mod.
I feel like it was all an elaborate plan to advertise his mod, but I also feel flexed on if this is true.
Eh. His entire patch NEEDS the expansion pack. Therefore, it's bullshit.
@@gamesandplanes3984 so? He still made it happen, thats not bullshit. Like he said, it would be impossible without it. What he did was insane and awesome. What great achievement like this have you accomplished?
@@gamesandplanes3984 it's a massive improvement for the infinitesimally small price of 4mb of ram, L + touch grass idiot
@@gamesandplanes3984 There's several actual N64 games that need the expansion pak, are those bullshit too?
Imagine going back in time, infiltrating Nintendo's HQ overnight, and just replacing the code with this one, and leave a note like "L is real"
lol
2401
how about, a week after SNES launch of super mario world, doing a credits warp in front of the devs
@@mmoncure11 *day before launch
Some of these optimisations would seem like sorcery to the best devs in the world haha
"You can either learn these or go outside and touch grass." - The duality of programming.
I mean, optimizing a whole game seems way more incredible and worthy than touching grass
Or sacrifice video games, and watching Netflix in your daily schedule to touch grass.
Confirmed Kaze has not touched grass
He has some grass in his office😂
Everyone has to poop sometime...
If this was possible 20+ years ago, imagine how unoptimized games are today.
*laughs in halo infinite*
Don't have to imagine, lol.
the sad thing about modern game optimisation is instead of utilizing new hardware to actually improve visuals AND framerates and resolutions, they use it as an excuse to not optimise at all because "the newest tech can handle it anyway"
@@lazulazu2467 Pretty much this. Used to allocate time in my projects to optimise, now I don't bother so much. Sure my methods have improved, but hardware patches over the cracks nowadays.
It's a shame, hardware improvements allowed software developers like myself to get lazy. Unless you're working with 3D gfx, why bother. If you do work with 3D graphics call it a game for the future and move on.
Real shame. I still remember marvelling at how they managed to fit some of those Pokemon games on those GB carts way back when. Those optimisations wouldn't happen now, you'd just be directed to download a patch on launch.
Imagine a very old ring
Kaze is just the best. My favorite thing is that he doesn’t give the original developers any grief. There was just not enough knowledge available to have done all of this for them. Really honorable thing to mention. Thanks Kaze.
Same. :)
If they’re not confident enough to ship the finished game using an optimized build that wouldn’t introduce any bugs under the time constraints they had, it’s EARLY DAYS. Miracle that this game works as well as it does.
Developer solidarity
Also gotta keep in mind he's doing this with no constraints. He doesn't have to meet a deadline, stay within a budget (no need for expensive SGI machines when we have Blender), worry about whether future console revisions will break his clever hacks... No limits on ROM size, twice as much RAM, and way more advanced tools...
@@johnsimon8457 You need to realize that this was during the 1990s. The game wasn't an early build. The game wasn't in "early days". The game worked fine. But, because of the lack of technology, there were bugs. Simple as that.
All the background music is composed by Badub and will be part of my upcoming major mod Return to Yoshis Island 64.
Sir when are u going to finally bring ur talents to the sonic adventure modding community😭
Aw sweet!
I still say you should have put a small capybara in the thumbnail, but amazing video nonetheless!
Could you put this hack on the original mario 64 and on super mario 64 land?
Could you add some of these optimizations to your older romhacks?
The cost of 4 MB of RAM in June 1996, the date of the N64 release, was about $34 retail. Within 5 months it was $21. Nintendo should have released the system with the 8 MB standard. The system's price might have been $40 to $50 more, but within a year the cost of that memory would have been $15...
You can never have enough memory.
For years I wondered if anyone would ever do something like this. I've known rewriting a console game was possible, but I never actually expected anyone to be crazy enough to do it. Bravo, this is absolutely extraordinary.
Can someone do this for Inspector Gadget on SNES?
@@SeanWMODonnell if the source code is leaked and someone cares to do it, yes
@@lsdkjsdlkjpiosdcposdjfdjad1363 not so much leaked, it was carefully reproduced bit by bit by the community
yeah it got leaked in the gigaleak too but it's kind of illegal to use that one and Nintendo can actually sue you for using that one, same reason why the Dolphin developers don't even take a look at the Wii source code that got leaked
@@carso1500 They recreated it yes with their own code, but having the original source code helps a ton to understand the game and recreate it without using any parts of it.
If I had the skills, I would add new areas to animal crossing New horizons :)
Imagine if all games could get a loving optimization pass like this.
ocarina of time would be epic
*cough* yandere simulator *cough*
sorry, must slow everything down to a grinding halt by including DRM that will be circumvented after 3 weeks anyways
A lot of people are converting SNES games to SA-1, :v
@@beacondude5000 Will be forgotten in 20+ years, unlike Mario 64.
I love that I get to live in a world where people like this exist.
Word! Even though 99.99% of his technical breakdown goes into one ear and out the other it's nonetheless music to both.
Honestly ? Same.
@@LouisThau And community support on games like these allows them to live on for SO much longer
(*did you get this video randomly recommended like I did or ?*)
>Is one of the greatest programming minds alive
>Dedicates his genius to Super Mario 64
I often think that working on projects you really love is part of the reason why you can be as dedicated about it.
@@beejay99ah Exactly. This is how *everyone* learns. Passion drives framework, and understanding does not come without passion, even if one so does lack the framework to fill it in.
@@JDelvaMusic it ain't even top 50.
@@JDelvaMusic maybe of that console generation or year, but much more important games came before, and plenty of better games came after
@@trendybistro In terms of best, I agree with you. But in terms of importance, you have no idea what you're talking about.
Super Mario 64 was the title that outlined and defined the way every single 3D console game of its generation would be judged, set the stage for every 3D game that came after, and paved the way for hundreds of design principles that are still in wide use today. It is THE trailblazer and icon of the 3D renaissance, which is still the most influential and revolutionary period of gaming ever. Period.
And that's not bias, either. I don't even like SM64 personally, but it'd be real dumb to call it anything other than one of the most important games of all time.
I am highly interested in seeing a definitive edition of the split-screen multiplayer using these optimizations. I would like to be able to play the multiplayer smoothly on real N64 hardware.
I also find your videos regarding the work performed in order to make these optimizations fascinating.
He's not the man Nintendo hired, but the man they needed.
Nintendo doesn’t need to hire their community modders, they know that we do it for free.
they need to sue them to make sure they can protect their brand from being loved ;)
@@theme7363 , then Nintendo would be having a purge of its current staff and it would be replaced by people that actually gave a damn for the franchises instead of treating them like Gollum treats the Ring of Sauron.
@@paxhumana2015 i like that gollum part like a LOT
@@brandonkruse6412 if they could. they would sue him for all he got. block his videos and piss on his grave. nintendo is a horrible company.
By optimizing the code, the parallel universes can more quickly be QPU aligned, allowing Mario to do faster hyper speed walking while holding the A press.
The real progress is that half A presses are now reduced to quarter A presses.
@@cesardelgado1544 kek
@@cesardelgado1544 A half "A" press is a half "A" press. You can't say it's only a quarter.
@@sergio_henrique Watch me call it one-sixth of an A press
Which all translates to "vroom vroom"
"if you just program well, you don't need to prevent crashes" is an absolutely insane (but not entirely wrong) thing to say and I absolutely love it. I can't get enough of your channel :D
It's absolutely right, for example to normalize an unsupported value in a function to something it can work with:
int someProcess(int x) {
    // make sure x is non-negative, otherwise reset to 0
    if (x < 0) {
        x = 0;
    }
    // do stuff with x
    return x;
}
These types of checks happen at runtime (using CPU) and are usually made so that each function is pretty safe to use, which is a common practice to make the code more maintainable in the long run and more friendly to use. In a game like this, the code is usually written with a tight deadline and it will never be touched again once released, so you don't need maintainability. So in this case, these checks are only useful to be more forgiving to developers passing "wrong" values to it. But if the developers are careful enough to make sure only correct values are passed (even if they need to do that check exactly where needed), then the check doesn't need to happen every time that function runs.
I think that's what he's talking about.
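A minimal sketch of that idea, assuming made-up function names (scale_checked, scale_unchecked, run_many) that are not from the game's source: the defensive version clamps on every call, while the trusting version only documents its precondition with assert, which the compiler removes when NDEBUG is defined. The caller then validates once at the boundary instead of inside the hot loop.

```c
#include <assert.h>

/* Defensive version: pays for the range check on every single call. */
int scale_checked(int x) {
    if (x < 0) {
        x = 0;
    }
    return x * 2;
}

/* Trusting version: the caller promises x >= 0. The assert documents
   that promise in debug builds and compiles to nothing with NDEBUG. */
int scale_unchecked(int x) {
    assert(x >= 0);
    return x * 2;
}

/* The check is done once, where the value enters, rather than
   on each of the `times` calls. */
int run_many(int x, int times) {
    if (x < 0) {
        x = 0;
    }
    int total = 0;
    for (int i = 0; i < times; i++) {
        total += scale_unchecked(x);
    }
    return total;
}
```

The trade-off is the one described above: scale_unchecked is cheaper per call but only safe as long as every caller keeps its promise.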
Crashing the program, rather than letting it run in some unexpected state, is sometimes better than trying to "fix" it at runtime anyway.
@@dawidkotlinski Yes and those bugs are usually detected and fixed with actual gameplay testing. If it stops crashing and glitching, then the fixes are good and it's good enough for delivery.
"If I had more time, I would have written a shorter letter."
I think this quote works well here. It takes time and skill to be able to optimize. The more optimization done the more astounding really. Sadly a lot of companies don't optimize as "today's hardware can easily handle it." Which is why we get games that breach 100GB when they honestly shouldn't.
The size of the game is generally due to assets, not unoptimized code.
I work as a consultant optimizing games. It very often comes down to fact that you have a feature/performance trade-off. Meaning, you COULD spend time speeding up your game, but it would mean delaying feature X or tool Y by N weeks. Often, having something that works is much more valuable to your development process than having something that is fast, because things that WORK unblock other tasks, while things that are FASTER rarely unblock anything unless we are looking at issues that require 100x speedups. This is true for small companies - larger companies should have more dedicated people with strong profiling skills, but small companies are often too busy to even open a profiler.
@@levonschaftin3676 well, egg on my face and shows what I know about game coding.
@@MrIHaveASword easy to make that mistake
Asset sizes can basically be infinite and those take up majority of the data. Someone has to decide the final quality that gets shipped.
Reading and rewriting an entire game's source code just so your mod runs faster is such a power move. Vroom vroom.
a powerer move would be to make a shitty mod that slows down the game, then optimize the game but not the mod, so it runs at 100%
He tried to flex. He warped the fabric of reality around his arms.
@r33mote I don’t see a real man here, especially not you
Glory to the next person to like this comment, the 420th like.
@r33mote you probably need it more than that guy
"Safety Checks are useful to prevent crashes, but if you just program well you don't need to prevent crashes"
-Kaze Emanuar
As a programmer I live by this 😂
Well, in a game running on bare hardware on a console...
what an absolute flex
Ah of course, how did I never think of just programming better.
Just Don't Segfault 😏
Easter egg: I took that picture of the RDRAM at 1:37 . I know I'm late to the party, but this is amazing work.
Two great quotes that will stick with me from this video:
"You either get to know about these things, or you get to go out and touch grass every once in a while. It's a strict either or, you can't have both."
"RAMbus goes vroom vroom."
aaaahh, grass.... I was convinced he said "You either get to know about these things, or you get to go out and touch GIRLS every once in a while. It's a strict either or, you can't have both." It made me giggle
Came to comment exactly that first quote. I'm 100% stealing that and planning to use it absolutely as soon as I can shoehorn it into a conversation.
I want to touch grass…
@@kaneCVR Wait, but I don't understand all those stuff and don't have a chance to go out and touch....
that earned a sub from me, not just the words, but I felt it was a genuine statement.
I like how everything no matter how complicated it got it always ended with the RAMbus going vroom vroom
Hell yeah dude. I didn't even know this was more than an amazing fan feat until I saw who put out the video. The man himself. Holy Crap. Great sense of humor, I've only ever seen the finished SM64 content, never heard the man speak. I had a good time watching this.
As an idiot, I appreciated seeing it appear so I could clap my hands together and laugh for a moment in between feeling confused.
@@GoTeamScotch Same but I'm trying to understand the other things because I wanted to make a rom hack
9:37
“if you just program well, you don’t need to prevent crashes”
what a chad
I've been coding professionally for over 20 years, that is some super impressive work and entertaining with it. Code optimisation is one of my favourite tasks.
it would be really cool to hear what the original programmers think about this.
I'd be very happy to see support or honest reactions from devs instead of "don't touch our IP"
They're all retired probably.
Well, as he said at 6:36 this RAM optimization wasn't actually possible on the original hardware... So Nintendo programmers probably did all they could already.
@@glurp1er it was possible with the expansion card
@@OrchidAlloy You mean the one that released two full years after Mario 64?
As someone with a master's degree in computer science I can confirm: This is wizardry.
You need a masters degree in kaze science
As someone who reads YouTube comments: Okay
It's: FIXING LAZY JPN GAME DEV
As someone who watched this video: does a master's degree in computer science not teach you how to clean up superfluous code for optimal performance? John Carmack's been showing the world how to do that since the 90s - he got raycasting* working on literally anything and got Quake to run in full 3D on pre-Pentium PCs with no dedicated graphical hardware :/
*don't make my dumbass mistake of confusing raycasting with raytracing or you're gonna confuse everybody :P
@@reloadpsi University teaches you the value of clean, maintainable (i.e. readable) code that is robust against any kind of problem. That's much more important than maximizing performance by throwing all of that out of the window. With today's machines performance isn't that much of a priority anymore. ;) (NOTE: I'm talking about games here, not general software)
Also, when it comes to efficiency, there are some very different fields in computer science: (A) Optimizing algorithms, i.e. the general (macro) structure of the program (which is much more important) and (B) optimizing code for hardware usage (micro level), where you reduce the amount of calls / memory access, etc. So (B) means "what can we speed up, when the algorithm stays the same". And Kaze is doing (B) *hardcore style.*
I'd say "Nintendo, hire this man" but honestly Kaze just hired himself at this point
Nintendo fans: "Hire this man!"
Nintendo lawyers: "Take down all his videos, and threaten to sue him!"
Nintendo doesn't deserve him!
If it were Sega, sure. But his hands would be tied. Better to keep this as a "patch" to the original ROM, which is probably going to be a complete replacement of every byte, but you know... can't go distributing whole ROMs without summoning the thunder.
@@nickwallette6201 The secret is, do not make it work without an original copy, so, it's a mod
screw hiring him, for the love of god please don't CnD him
I love how he not only improves the original code, explains it easily, but also doesn't shit on the original developers and pretend like they were incompetent.
Well in this case, they purposely made the game run at 20-30 fps because the console's hardware couldn’t run it any faster. There’s no reason to say anything about the original developers because the game was meant to run slower.
@@cjdig86 that and time constraints pumping out games. they made these games from scratch within 2 years.
kaze is literally the giga chad at this point
some jacked dude just going "yes I have spent two years rewriting and optimizing the source code of a 20-year-old game, how could you tell?"
with him regularly working out, yeah pretty much!
@Abdullah day drinking
The rolled loops thing is funny. Obviously unrolled loops make the code much less maintainable, less readable, less terse, and more error prone. But back then programmers used to unroll loops as an optimization to save some CPU cycles (as the for loop part of the code does not need to run again and again ). But now we are seeing that rolling them actually improves performance. What a perfect cautionary tale about premature optimization, and also optimizing without profiling.
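For readers who have not seen the two forms side by side, here is a minimal sketch of a rolled loop and a hand-unrolled version of the same sum. The function names are invented for illustration and this is not code from the game. Both return the same result. The unrolled one executes fewer loop branches but takes up more instruction memory, which is the trade-off being discussed.

```c
#include <stddef.h>

/* Rolled: small code, one compare-and-branch per element. */
int sum_rolled(const int *v, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Unrolled by four: one compare-and-branch per four elements,
   at the cost of a larger loop body and a cleanup loop. */
int sum_unrolled4(const int *v, size_t n) {
    int total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    }
    /* Handle the 0 to 3 elements left over. */
    for (; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

Which one is faster depends on the machine, so the only reliable way to choose is to profile on the target hardware.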
we used to do this on C64 as well to get more cycles :-)
*Programmers* should not be the ones unrolling loops. Programmers should be writing code that allows the *compiler* to unroll the loop *if the compiler determines that an unrolled loop would improve efficiency*. This requires an intimate understanding of the hardware that the code is running on. If you have a shared bus and your instructions are cached, then yes, smaller code size optimization (i.e. rolled loops) will win over "faster" code speed optimization. That is unlikely to be the case for an in-order processor with no pipelining or cache.
That said, you *do* need a good compiler and that compiler *does* need good knowledge of the hardware, or this all goes out the window. I've developed some code for the PIC and their C compiler is outright horrific. I can't tell you how many times I've had to twist myself into a pretzel to get the compiler to generate good code, and sometimes I just need to write it in assembly because the compiler is too stupid. And their compiler is made by the same company that makes the microcontrollers!
@@DeadCatX2 you do know this game was written in the 90's on a retro console right
@@RussMichaels he's right tho
@@alexbrown3603 picking holes in something written over 25 years ago for not using modern techniques?
We couldn't apply that to literally everything on the planet.
Oh doctors in the 70's, 80's and 90's should have used medicine from the 21st century...
Car manufacturers in the 80's should have made electric cars.
Etc
This level of optimization is absolutely mind blowing. If I were Shigeru Miyamoto, I would be so proud to see a passionate developer improving this game to this point so many years later. It proves how deeply his creations impacted our generation.
Well said.
Except Nintendo HATES this stuff. Remember the Mario 64 remaster on Unreal Engine? It was fan-made, so Nintendo took it down. All traces of it
Shiggy hates anyone that takes something he had a hand in and "makes it better" in any way, so this would likely be seen as an affront to his work. Just as an example, he dislikes the DKC SNES titles because they're (literally) "Not his Donkey Kong", and tried to 1-up them with 'Returns' on the Wii (and arguably failed because the shoehorned motion controls directly conflict with the base gameplay the series is built on).
@@duffman18 Sources? You can't just claim it's a mistranslation of interviews without proof to back it up, particularly when it's been corroborated by multiple unrelated sources at many different times.
@@adamgreenhill110 I agree but miyamoto's opinion might be different from nintendo's (not like he could ever say his real opinion publicly though)
Making a rom hack is already abnormal, but rewriting a rom's whole source code is a lifetime amount of insanity
You should see the stuff they've done with Pokemon Fire Red's source, and then fit back in the space allowed by the original cartridge.
@@MisterBones2910 you learn something new every day
The unrolled loops are funny from a modern perspective - I'd write them as loops and then unroll later if I needed performance, even if compilers couldn't do it for me - but considering these devs were coming from the SNES? Makes total sense that they would just assume it's the right thing to do.
This is the software equivalent of restoring a 57 Chevy and then giving it ridiculous horsepower.
Just because he was rearranging what's already in there
Dude went through over 100,000 lines of code, this is like restoring 10 Chevy's or something, this guy is insane
To be more specific, building a Chevy Bel-Air with a Pro-touring chassis and the engine block for a C7 Corvette ZR1
Lol no, this is way harder
@@rotor13 , I disagree with the engine swap analogy
What this man did was more equivalent of tweaking every single thing on the car so that it runs as efficiently as possible with the least possible amount of upgrades.
This is pure backend refactoring bliss! Writing good and optimized code is an art and should be valued more by all development companies. This is why I think all games should become open-source some years after their sales end.
Daikatana is a good example of this. Poorly received at the time, it was later open sourced and mods and bug fixes created a much improved game.
who knows, maybe even all software..
The sad reality is that development and production companies value money first and foremost, which means they are often incentivized to sell good enough software rather than great software. The N64's launch was delayed to make SM64 good enough. It went on to sell millions of copies and become one of the most beloved games of the late 90s. If an unstable 20FPS frame rate was good enough for that, why would they delay further?
That said, death of author + 70 years is an insane copyright duration. The original maximum duration in the US was 28 years. 28 years after SM64's release is 2024. Under those terms, Nintendo would "only" have had time to sell the original release, the vibration support rerelease, the DS rerelease, the Wii Virtual Console rerelease, the Super Mario All-Stars 3D rerelease, and the Switch Online rerelease before the code went public domain.
Oh god just imagine if we lived in that world . So many more indie games
I’d start with parasite eves code base and use that as a jumping off point for my game
Good code, fast code, easy to read/write code... Pick any 2. Engineering is all about tradeoffs and you need to understand the requirements. It's possible that if Nintendo wrote such super optimized code back in the day they would have missed a market window and it would never get released. Instead, it was "good enough" and was hugely successful for a flagship game.
I did something similar for the PC game Carnivores 2, really glad to see someone else doing the same stuff and documenting it to boot. Great stuff!
Its insane seeing how pretty this mod looks after so many optimizations. I never knew N64 was capable of those graphics
Y'know, when I heard the source code was decompiled, I noticed Kaze went kinda quiet. Now I know that he used that time to REWRITE THE ENTIRETY OF SM64 TO A STATE THAT MAKES SHIGERU BLUSH!.
It's funny that so much time was spent decompiling the game and then straight after the release we got the Nintendo giga leak which had the source code for it anyway.
@@Agret from what I heard, it was only some of the source code, basically what was necessary for IQue to make their own version of Mario 64, so the entire source code wasn't leaked as of now
original c code looked like trash though
Miyamoto san wouldn't blush at this, he would take it down out of spite
Reminds me of ship of Theseus, with all parts changed ;)
7:23 This is very interesting. Usually I hear about compilers unrolling loops to optimize them, but here, the holdup is RAM, not the processor.
Incredible work all around.
Modern computers are also massively being held back by the RAM being so slow. I think the difference is that now we have more mitigations for that, like larger caches and speculative execution. This probably shifts the place where code-size vs conditional jumps balance out (conditional jumps in non-unrolled loops are also bottlenecked by memory).
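For anyone who wants to see the code-size tradeoff from this thread in actual C, here is a minimal sketch. It is not taken from the SM64 source; `sum_rolled` and `sum_unrolled4` are made-up names for illustration. Both functions compute the same sum, but the unrolled one has a loop body roughly four times as large, which is the part that hurts when instruction fetches from RAM are the bottleneck.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled: small loop body, one branch per element.
   The whole loop easily stays resident in a small instruction cache. */
float sum_rolled(const float *v, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Unrolled by 4: one branch per four elements, but about four times
   as much code in the loop body. Assumes n is a multiple of 4. */
float sum_unrolled4(const float *v, size_t n) {
    assert(n % 4 == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    return total;
}
```

Which version wins depends on the target: on a CPU with big caches the unrolled one is often faster, while on a machine where every extra instruction costs bus time the rolled one can come out ahead. Measuring on the real hardware is the only way to know.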
@@kebien6020 also compilers are getting better at removing conditional jumps altogether. In modern C(++) code you don't have to manually remove most ifs because the compiler does it for you. I realized this when I wrote OpenGL shaders that aren't optimized at all I think so you had to manually do all the tricks to avoid ifs. Just by removing a few ifs in the shaders I gained much more performance than by implementing batched rendering
Did he say RAM or ROM at 7:23? 8- and 16-bit systems run directly out of ROM and don’t copy any code to RAM for execution. A PC OS (IIRC) will load everything into RAM from disk barring external libraries.
Eight cycles JUST to load the next instruction from ROM is INSANE. Removing instruction cache would make the N64 run 100x slower
@@kebien6020 I seem to remember hearing one of the biggest advantages of M1 is how it can optimize the caches.
I'd love to see a rom patch for these improvements. It would be nice to see how much better M64 would run.
I remember playing N64 games as a kid and looking at the skyboxes thinking "Why can't I go there? Why didn't they make it? It would be so much fun." This is exactly what it feels like to see mods for this game all these years later. Thanks for making this! (to everyone involved)
Glad to see i was not the only one who thought this!
Love the way you document the process, glad to be along for the journey!
Pretty cool to see the king of syntax in the wild.
@@casperdewith why yall love super mario 64? n64 is too old and the game looks bad anyway
The fact that you were able to even read their code is an achievement. So much knowledge was needed to do this.
It's cause of the nintendo gigaleak that the source code was available
it was also reverse engineered
and documented
since code is linear@ltg4lyfe ... unironically yes.
You can do it too
I think this has become my favorite video on youtube. Im currently studying for a software engineering degree and I love covering memory optimisation in programs, so this was a great watch! Also sm 64 was one of the first games I played so bonus points for that
I wish this man was my programming teacher back in my youth. I couldn't help but smile throughout the entire video and at how passionate he is about all of this.
You mean vroom vroom
As a programmer and engineer I can say that this is a MAJOR achievement and a LOT of work. I can't congratulate him enough. Heck, the fluid framerate even looks like Super Mario Galaxy or Super Mario Odyssey. I just disagree when it was said that "the Nintendo programmers were not familiar with C, because it was a new language back then". C originated in the early 70s and was and is a well-known language. However, what is certain is that many 90s game programmers were making the transition from assembly to C / C++ code, I think that's what was implied in the video. It's a minor detail, like a 0.001% of what's in the video. KUDOS
Oh ok, I was wondering how C was a new language being already at that time 20 years old or so... It took that long to begin transition ? was it due to compiler refinements ? or just inertia ?
@@nlight8769 high level languages did not compile efficiently enough to be usable on older consoles, N64 was one of the first where the hardware power was capable of handling the unoptimizations to a degree
@@digiquo8143 Thanks :)
Just one last question, what was the language used for the previous gen (SNES, MegaDrive) ?
@@nlight8769 From what I've read, most were built with assembly. aka, assigning the hex values to their location in memory manually.
@@digiquo8143 Wow, asm is pretty hardcore for anything elaborate. Respect to the designers and programers from that era.
Thanks :)
PHD in mario 64 science
LMAO
He's got 2 PHDS
Ayyy lmao
that goes to pannen tbh
this is amazing. love optimization deep dives like this. I miss the days when we were that close to the hardware.. .Rock on!!
7:45 "Usually if you did this, you would get fired from your job"
Completely understandable, won't even lie. I can't imagine even the most well-equipped programmer understanding something as insane as the Q_rsqrt function (for an example of course), especially during that time period. Ahh, classic C black magic.
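For reference, a sketch of the fast inverse square root trick mentioned above. This is a variant, not the original Quake III source: the original reinterprets the float through a pointer cast, while this version uses `memcpy` to copy the bits, which avoids the aliasing problem. It assumes `float` is a 32-bit IEEE 754 value; the name `q_rsqrt` is just chosen for this example.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x.
   Assumes float is 32-bit IEEE 754 (an assumption of this sketch). */
float q_rsqrt(float x) {
    const float half_x = 0.5f * x;
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);    /* read the float's bit pattern as an integer */
    bits = 0x5f3759dfu - (bits >> 1);  /* magic-constant first guess */
    memcpy(&y, &bits, sizeof y);       /* turn the guess back into a float */
    y = y * (1.5f - half_x * y * y);   /* one Newton-Raphson refinement step */
    return y;
}
```

The result is only an approximation (the error is a fraction of a percent after one refinement step), which is exactly why code like this needs a comment explaining what it does and why it is there.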
A lot of this stuff would get you fired, the loop unrolling also comes to mind. I hope Kaze points out that although he's optimizing everything, optimization is only one facet of good programming and generally not the most important one in a professional setting like readability.
You might not be fired if you left comments explaining exactly why you're doing these silly things, and tests showing that they are actually helping.
@@LizardLeliel it's all about context. He's programming for a never changing target (dead platform), with so much hindsight. There's no need to worry about maintainability because the game (engine) is forever done, etc. I'm sure if this was an "actual job" he wouldn't be fired, he's worked according to requirements.
@@Radgerayden-ist I'm not saying he's doing bad for focusing on optimization here - he's the only programmer involved in this project. I'm just genuinely afraid new programmers may get the wrong idea that the standard of good code is just optimization.
@@LizardLeliel as someone who constantly messes up in Scratch, I can confirm readability is key to make a good game, even there you may get disoriented if you're being reckless with the block programming, let alone other guys looking at your code.
Yes, even with a language made for kids, these things can happen.
What we expect from an Official "Remastered Classic" : This passion project.
What we actually get : We lost the source code...so we just scrapped something similar together and hope y'all don't notice
@@lilaa3 Nintendo is an awful company from a consumer level and I'll actually never buy another product from them as long as I live. Even if I may want to.
Or they take the source code from the worst versions of the game.
Silent hill......
@@lilaa3 Yeah it's more like "We've already got a working emulator for these games, it'd be faster to just use that, than spend the dev time to make a proper port" than anything about losing the source
Still, there's so many things they could've fixed and just decided not to. Didn't even take the time to change the goddamn button textures. I don't remember my switch having a bean shaped grey X button, or a bright red B. One of the only changes I can recall that made it into the initial release was mario sunshine's camera controls being inverted to align with the other games. Which made it feel completely wrong to anyone that's played sunshine before. The only major change made the game worse for a lot of players
That's what happened with FF8 remaster
One interesting thing I noticed, in the new version all of the coins are synchronized.
"But first I need to explain.." *Mario 64 File Select Theme starts playing*
Never been more proud to support a Patreon. Kaze you do so much.
The older a game gets, the more people are going to crack it open, either to make modifications or exploit what's there. I think that's a beautiful thing. Amazing work!
Especially if the source code leaks
@@FunctionallyLiteratePerson from what I understood, the source code hasn't been leaked. It was reversed engineered.
@@yosyp5905 no, someone leaked the source code of the game :P
its kinda sad tbh
@@yosyp5905 it was both leaked(partially iirc) and reversed engineered
Kaze: Yeah I optimized the source code for Mario 64, took a couple weeks.
Nintendo: Source code?
Nintendo about to send their ninjas
I've heard of loop unrolling and function inlining to improve performance. This is the first time I've heard of someone loop rolling and outlining functions for increased speed. I guess it goes to show you how important it is to actually understand the limitations of the hardware.
As an gamer/engineer who originally learned/programmed in C++, I appreciate this video, some of the code improvements you found were super interesting, thanks for this!
Because of the tight constraints of early consoles, I always assumed, especially for a first party Nintendo game, that they were already very optimized. This blew my mind.
Well it was fairly optimized for what it was.
The game did work and people played it, it almost never crashes, so thats quite a job already.
At any stage of optimization there is another step to make it even better, so it never ends.
The cost to squeeze out some % might just become increasingly larger and costly, especially on understanding the code and potentially re-using it for some other game (as having hard-coded shadows in the code is a pretty bad approach if you want to re-use it).
It was pretty well optimized for the time
Hindsight 20/20. More is known about the N64 hardware now than when this was developed. Optimization techniques always advance. Like all technology.
Doesn't indicate this is not an impressive accomplishment. Just everyone asking why Nintendo didn't figure this out back then.
Like he mentioned in the video, almost none of this would've been possible without the RAM expansion pak.
It was already optimized.
Did you watch the video?
He explicitly says this isn't possible on the base n64 console.
This is INSANE. The map you demonstrated on looks like it could have been a GameCube launch title, and yet somehow runs on a real N64.
I would say Dreamcast
GameGear
Brilliant! Just brilliant!
I rarely comment on YouTube videos, but this is one of the videos that I enjoyed every second of
HOW are you still doing romhacks?! You pretty much rewrote the entire game for your mods. I'm amazed at your level of dedication to romhacking.
Well he said at some point when he learned enough through romhacking he will eventually make his own game. The question is tho has his position changed on this or will he ever considering something to be enough in that regard xD
@@thomasrief well, regardless of his position on making a game, the man does an awesome job and clearly puts a lot of passion. I think it'd be awesome what Kaze could create without being constrained by SM64 (Though looks like he's working around a lot of the constraints)
@@CrashCubeZeroOne constraints are good. they help you set goals
At this point Kaze is somehow gonna turn SM64 into an early GameCube game and that's just awesome.
What's kinda funny is I've thought about actually porting some N64 games to GameCube just so mods can take advantage of the extra resources and better emulation.
@@renakunisaki is that even possible? Because if so…
Super Mario Sunshine demake anyone?
@@renakunisaki I suppose it would need a full rewrite though - IIRC the N64 doesn't even use the same processor as the GC, let alone a similar architecture for rendering
Zel, known for also some cool Zelda stuff, is actually working on a pretty late-gen Starfox Adventure's (Dinosaur Planet?) like project that looks almost early gamecubeish, on a goddamn N64.
I hope that GDQ allows this game to be played and have its own category, calling the game "Super Mario 64 plus" or "Enhanced SM64" or something, so we can watch speedrunners tear into it. Very exciting.
It all depends on whether there's a community for it, probably; I remember a few (smw) rom hacks being played at gdq
Why not call this Expansion Pak support version the Kakuchou Edition (拡張 meaning expansion), like the Rumble Pak support version is called Shindou Edition (振動 meaning rumble)?
Super Mario 64 plus exists tho, it's a version of the PC port with a lot of features
@@cmyk8964 I personally want it to be called the "Haha, RAMbus goes Vroom Vroom" - Edition
Damn, "if you just program well you don't need to prevent crashes" I wish I had this level of confidence, I just stumbled on your video but awesome job!
Any plans on making a patch for the base game where it runs as fast and optimized as possible?
Yeah, I'd absolutely love to see this!
It would be interesting to see this implemented in libsm64 so we can see Mario run in 60 fps inside other games :o
I also want this on super mario 64 land
PLEASE
Seconded! I got an Everdrive64 a few months back and would love to play it at 60 fps on real hardware.
7:00 additionally, I would say that Nintendo had to work under really tight time constraints, with level design and ideas changing constantly. Still, they were able to deliver this amazing game that defined the rules of the 3D platformer genre for years to come.
Kaze: "Hey Nintendo, I literally made your game perfect."
Nintendo probably: "Die."
Capitalism in a nutshell
I have to say, it's amazing listening to this again after watching some of Casey Muratori's videos, gives a real appreciation for the underlying tech
I love how you don't throw shade at the original devs. You respect the work they did, but use all the tools at your disposal to optimize it into something beyond what they could have imagined.
6:45 sums it up best-- "...so i really dont fault Nintendo for not doing this; they just Couldnt"
he did throw shade at them a few times.
@@OtakuPlayTM 2 extra raycasts for shadows
@@OtakuPlayTM They did deserve a little shade at least. Some of these things are a bit bonkers, and C had already been popular for about a decade (it gained popularity during the 80s)
@@soulextracter Nintendo was lazy imo when it came to coding
Note for those claiming the C language wasn't new in 1995: Kaze is actually inaccurate but not wrong, since C was a new thing in console development (32-bit consoles); before them, Assembly was used for coding games on SNES or Megadrive/Genesis. This is even more true for Japanese devs; I own a book about the PS1 release where they mention the change from Assembly to C was complicated for Japanese programmers.
Yeah, I think at the time the existing C programmers back then were mostly doing business apps or other high level projects, there weren't many C programmers with video game backgrounds.
Book name please.
@@leslie5202 The book is in french sorry :/
How did C become a thing for video games so late?
@@akioasakura3624 probably has to do with developing compilers for console architectures.
This is extremely impressive as well as interesting!
this means vroom vroom
Well boys, Kaze did it again!
he sure did do it again. he completely lost his mind.
@@Zapdos0145 came here just to say Kaze is literally insane.
vroom vroom
Oh hey, Danntheman, pleasure to see you again.
again?
did he ever stop doing it?
Was I the only one noticing that using instancing for the coins resulted in all coins spinning equally, whereas in the original game, they spun with a random offset to reduce repetitiveness?
Janky hack mate. Coins all spin in sync.
Yeah but 2 ms :(
Also, zodiacs have zero parallax scrolling
🌎💨💨💨 66,600mph? Nope.
Someone probably demanded that
I think you can still use instancing and push a diff modelmatrix each iteration
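A rough C sketch of that idea: keep one shared coin model, but give every instance its own spin offset that is added to the shared rotation each frame. Everything here is hypothetical (the `CoinInstance` struct, `assign_phases`, `coin_angle`, and the 0x2000 step are invented for illustration) and says nothing about how the mod actually draws coins. Angles use 16-bit units where 0x10000 is one full turn, so they wrap around for free.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per coin; the mesh itself is shared by all of them. */
typedef struct {
    float x, y, z;
    uint16_t spin_phase; /* per-coin offset; 0x10000 units = one full turn */
} CoinInstance;

/* Give each coin a different starting phase (here simply derived
   from its index; a random value would work the same way). */
void assign_phases(CoinInstance *coins, size_t count) {
    for (size_t i = 0; i < count; i++) {
        coins[i].spin_phase = (uint16_t)(i * 0x2000u);
    }
}

/* Rotation for one coin this frame: shared spin plus its own offset.
   The cast to uint16_t wraps the angle around a full turn. */
uint16_t coin_angle(uint16_t shared_angle, const CoinInstance *coin) {
    return (uint16_t)(shared_angle + coin->spin_phase);
}
```

The catch is that a different angle per coin means a different model matrix per coin, so some of the time saved by sharing one matrix may be lost again. Whether that fits in the 2 ms budget mentioned above is something only a profile on real hardware could answer.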
Regarding compiler optimization, loop unrolling, etc. It's one of the things that makes targeting old CPUs with modern compilers difficult. Many modern optimization techniques that result in real improvements for modern CPUs result in much slower code on older CPUs. They didn't have the benefit of gargantuan cache sizes, multi-way instruction decoders, register renaming, etc. For most older CPUs, the general case is that less instructions means more performance, because it leaves more bus time for data and IO.
Nintendo devs: "I never said thank you"
Kaze: "And you'll never have to"
When I was a kid I'd have nightmares in SM64. Now I can have those nightmares at full FPS, thanks Kaze!
Piano monster at 60fps.
4:15 the code isn’t necessarily bad, it perfectly did what it needed to deliver in time at a given quality threshold. I observe this a lot at work where coding purists will sweat about a perfect solution while the product development moves on only for them to start from scratch with the next batch of changes. In a commercial setting we cannot expect perfection. Oftentimes devs have to polish a turd. The result isn’t always beautiful under the hood but as long as it functions it’s good enough.
For you
@@jphataraki6764 you see 😉 that thinking is the problem 😅
@@jphataraki6764 That's the line of thinking being talked about here. When you're coding commercial products, it's basically guaranteed that you will not have time to perfect your code. I'm not even factoring in the possibility of a rushed cycle here, I'm simply saying that, when it comes to how long coding takes, there is an enormous difference between reasonable and pure.
Not to mention that two experienced and highly skilled programmers can vehemently disagree about what's best
100% agree, I just spent roughly 2 months worth of work programming a large industrial production line. I programmed it not to be the most efficient with memory or to preserve the scan time of the PLC....but to be modular, I can add things on or take them away (comparatively) easily. And by fuck am i happy for that now, 1 week from installation I've been hit by the "Yeah so half the machine hasn't been built yet but you need to commission the stuff that has." The only issue is that half the missing equipment is "Safety rated" and therefore the PLC will throw a hissy when it doesn't exist.
Sometimes it's not about speed, it's about safety and just making it work.
"you either learn about these things, or you go outside and touch grass. It's a strict either or, you cannot have both" - Single greatest quote of the whole vid. My life is complete, my life experience is validated. Thank you.
This is wild my dude, amazing stuff. Keep that passion alive!
As a 10+ years developer, it amazes me how much love and effort you put it into this! Let me ask, how much time did you take to understand most the source code since the first time you read it? Amazing job!!!
He did say it took him a couple hundred hours and a few weeks of work total
@@ozordiprince9405 But that counts not even all optimizations mentioned in this video and also not the earlier work on mods. He was definitely already familiar with a lot of the codebase.
Leaving a comment. This really helped me during my workout, kept me so focused I didn't even realize my legs burning bless up
what¿
what?
the video was about vroom vroom bus and plumber man
@@marqimoth6987 two years late, but hearing something while I'm going keeps my mind busy enough to push farther :)
Me: I don't understand so much :(
Kaze: It's mean VROOM VROOM
Me: Oh, i understand now
Ayyy 2:02 you ain't gotta do Pannenkoek like that 😭😭
Great video (and great project)! One nitpick though: C was not a new programming language in 1995. C came around 1972 with 1978 being the year when the creators released the official guide book, extensively known and used by the 1990's. However, it was new for Nintendo developers as this was their first platform to use C for developing games.
I think what he meant is that C for game programming on consoles was relatively new. Nintendo up to that point only used assembly (NES, GB/C and SNES). I dont even know if any third parties besides Nintendo (HAL Laboratories comes to mind) ever toyed with other languages than assembly for producing games in Nintendo consoles, so its not that it wasn't done before, they were the first and couldn't even outsource or get help from third party studios that normally aided nintendo when they were under time pressure.
@@gerardmarquinarubio9492 yup exactly what I said: new to nintendo but not to the world.
And yeah, maybe he meant that but it was not what he said. Someone new to programming watching this video might form their knowledge thinking that one of the most influential languages in history was invented in the mid 1990's. Thus I thought it useful to clarify.
Glad u said this b4 me
@@gerardmarquinarubio9492 There are a few Sega Megadrive games that were written in C, such as Sonic Spinball and Ecco the Dolphin. This massively sped up development allowing Spinball to release before the end of 1993, but meant the game only ran at 30 fps. It's hard to compare the CPU in the SNES and Megadrive directly, but maybe the SNES would have performed even worse running a game compiled from C.
I'm guessing the reason Spinball and Ecco were in the GBA Sega Smash Pack was because they could just compile the source code for the GBA instead of running them in a Megadrive emulator.
I was coming to make that same comment.
I didn’t believe it’s possible at first but the vroom vroom joke really does get funnier the more it’s repeated
The Family Guy effect
Repetition legizimizes
@@Sebb_Music Repetition legitimizes
@@otesunki Repetition legitimizes.
Rule of 3 be damned
You're doing DRAM level optimizations.
You are deep enough in the rabbit hole you might be able to write an emulator.
You're a mad lad.
This is one of the most brilliant things I've seen. You are a genius, my friend.
I love how every SM64 YouTuber has agreed to play the pannenkoek music whenever they explain something
I'd love to see speedrunners using this and see what they notice. I feel like they are most likely to feel the difference and really appreciate the work that went into this.
speedrunners when the game runs 1 frame faster (tyler1 rage)
this would probably either be banned from speedrunning, or would get its own category due to how... y'know, all the code was rewritten
@@Mate_Antal_Zoltan and how speedrunner would probably be used to the lower FPS and use the low FPS to their advantage so this wouldn't really work with speedrunning
Guys, it's not "this would be cool for speedrunning!", it's "I wanna see what people that have been optimizing their playing abilities on this game for the past 20 years think about this version"
@@techwizsmith7963 thank you, that is exactly what I intended.
I learned on C in university so this is particularly interesting to me. He's right about the touch grass part: you have to sell your soul to master pointer arithmetic.
@@osparav there's a good video on fast inverse square root in Quake 3, check it out
@@ImperatorZed I believe it was made by a guy named Nemean, very informative and hard to believe someone actually achieved this
@@ImperatorZed Where does it use pointer arithmetic? It does use pointer aliasing, but that's a different thing.
@@osparav idk, all I know about it is that you add 1 and in the memory it adds 1*the number of bytes of the variable. Haven't searched to much to know what's difficult though.
@@osparav Using the same token for declare and use, not to having the hardware visualization about how pointers are, and not popularize the my_pointer[0] instead of *my_pointer
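To make the two points above concrete, a small C sketch (the helper names are invented for this example). Adding 1 to a pointer moves it by the size of the pointed-to type, not by one byte, and `p[i]` is by definition the same thing as `*(p + i)`, so `p[0]` and `*p` read the same value.

```c
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between p and p + 1: equals sizeof(int32_t), not 1. */
size_t step_in_bytes_i32(const int32_t *p) {
    return (size_t)((const char *)(p + 1) - (const char *)p);
}

/* Array-style access. */
int32_t read_indexed(const int32_t *p, size_t i) {
    return p[i];
}

/* The same access written with explicit pointer arithmetic. */
int32_t read_deref(const int32_t *p, size_t i) {
    return *(p + i);
}
```

For an array `int32_t a[3] = {10, 20, 30};`, both `read_indexed(a, 2)` and `read_deref(a, 2)` return 30, and `step_in_bytes_i32(a)` returns `sizeof(int32_t)`.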
2:01 I already was going to ask how this project would affect Mario's ability to traverse QPUs.
Insane stuff, and really well explained. As a coding noob, I actually have the feeling of understanding in some shape or form what you did
coding noob gang lesgoo
woooo
vroom vroom
I'm not even a dev and most of the code optimization things you outlined are fairly straightforward and basic things that even I am aware of. But some of the things you did, here...man oh man...that is the kind of thing you can accomplish ONLY when you put thousands of hours into a project like this in addition to the thousands of hours of coding experience required to even know you could optimize in such a manner.
Also, the LLMs are great at optimizing code. Give it snippets, describe the context, and they can optimize your code or offer suggestions. Almost everything you mentioned you did is something these LLMs would suggest or even re-write for you. Even if you know how to code an optimization, just give the code block to the LLM, describe the optimization you want, bam, it will do it. Often better than you can because it doesn't usually make typos or errors. But I noticed these LLMs do make errors with instantiating variables, or fail to declare them before invoking them.
Anyway, this is an amazing project and I wish AAA studios would have an optimization team that did what you did.
this shit slaps so hard. the lizard satisfaction i get seeing something just get the living christ optimized out of it is wonderful
hehehe lizard satisfaction
hehehehe
@@Eldoofus i like your pfp
im so excited to see all these improvements in mario maker 3