Bitchin'. Another banger expected. GM and God bless, my friends. Edit: hah! I recall SMB3 having a scanline bug too. I believe DahrkDaiz or bbitmaster found the fix years ago and shared it in IRC.
I can't tell you for sure, but different MMC3 revisions trigger the interrupt at slightly different points in time. Ideally, it would fire right as horizontal blanking starts, so that the PPU scroll register can be updated before the next scanline begins to draw. The glitch could be the interrupt not firing at quite the right time (too early or too late), or the way the interrupt handler was written (too much work before writing the scroll register, perhaps).

There can also be different PPU/CPU cycle alignments, and when an interrupt occurs, the CPU is still allowed to finish the instruction it was executing before servicing the pending interrupt. Different instructions take different amounts of time (the slower ones take 6 or 7 CPU cycles), so depending on what the CPU was doing, the interrupt is handled with varying amounts of delay, and the PPU will be at a later dot than you want by the time you write the scroll register. This is what causes that little "jitter" of glitched-looking pixels at the edge of a screen split. The best time to make these changes is while the jitter is hidden to the right of the image in H-blank, but that's not easy to do without carefully timed interrupt handlers.
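To put a rough number on that jitter: each CPU cycle is three PPU dots on NTSC, and the handler can start up to about six cycles late depending on where the IRQ lands inside the current instruction. A back-of-the-envelope sketch (the seven-cycle figure is the longest common 6502 instruction, per the comment above; this is a simplified model, not the exact hardware behavior):

```python
# Rough model: an IRQ can land anywhere inside the currently executing
# instruction, so the handler's start time wobbles by up to
# (longest instruction - 1) CPU cycles from frame to frame.
PPU_DOTS_PER_CPU_CYCLE = 3  # NTSC NES: the PPU ticks 3 times per CPU cycle

def irq_jitter_dots(longest_instruction_cycles=7):
    jitter_cpu_cycles = longest_instruction_cycles - 1
    return jitter_cpu_cycles * PPU_DOTS_PER_CPU_CYCLE

# Up to ~18 PPU dots of horizontal wobble at the split point
print(irq_jitter_dots())
```

That handful of dots is exactly the width of the shimmering pixels you see at a poorly hidden screen split.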
Hey DG, I've been a big fan of your channel for quite some time now. One of my favourite series of yours has got to be the one where you break down how the original Super Mario Brothers stores its level data, and how this causes the existence of the glitch levels. I was wondering if you're planning to do similar videos about SMB2 and 3, since those games have vastly more complicated levels, and I'm very curious as to how their data is stored and compressed.
@@DisplacedGamers It took me this long to realise that the video I was talking about was actually made by Retro Game Mechanics Explained, lol. I guess it kinda makes sense that I'd get confused, you're both great channels that explain super-complicated coding and hardware information in an in-depth and easy-to-understand way.
Quartz watches were the first use of crystal oscillators in consumer goods. Of course, those oscillators needed to produce a very precise frequency for accurate timekeeping. Engineers needed to produce a quartz crystal that would always resonate at a particular frequency, and whose resonant frequency could be controlled. Luckily, such a device has been in use for centuries; tuning forks, first invented in 1711, resonate at exactly one frequency, which was precisely set in manufacturing by grinding down the ends of the fork. To reduce the size and ensure it wouldn't be audible, quartz watch crystals are manufactured to resonate at exactly 32,768 hertz, using a laser to trim the tines down micrometers at a time until the desired frequency is reached. Sadly, this technique doesn't scale down indefinitely; crystal oscillators in the megahertz range are based on resonating discs and aren't nearly as exciting.
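A quick sketch of why 32,768 Hz in particular: it's a power of two, so a simple chain of divide-by-two flip-flops reduces it to the 1 Hz tick a watch needs.

```python
# 32,768 Hz = 2**15, so fifteen divide-by-two stages yield exactly 1 Hz.
freq = 32_768
stages = 0
while freq > 1:
    freq //= 2   # one flip-flop stage halves the frequency
    stages += 1
print(stages, freq)  # 15 stages down to 1 Hz
```

Any other starting frequency would need a more complicated divider chain, which is why the odd-looking number stuck.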
It feels like you've read my mind, as I am knee-deep in this exact rabbit hole. Interestingly enough, the video output appears to run at twice the crystal rate, giving us 12 voltage outputs per color cycle. At least, that's what it looked like on the NESdev NTSC page. Something about outputting on both rising and falling edges. I don't know what that means or how it's actually done in hardware. Maybe you have some thoughts or corrections :)
Work is performed on both the "rising" edge and the "falling" edge of the clock signal. This can be used to trigger components with more exact timing, etc. It would still be considered six cycles, however, since a rising edge and a falling edge are just the two halves of a single square pulse. There are components that can detect whether a signal went from low to high (0 to 1) or high to low (1 to 0), which in this case would indicate the start and end of a single clock pulse. Hope that helped!
Not only video: the 6502 also has rising- and falling-edge sub-cycles. I don't think the Z80 does, or at least it doesn't do anything extensive with them. This is how the 6502 is able to perform a memory read in one cycle: it asserts an address and configures the bus in the LOW phase of the cycle, then the memory, which needs some time to perform the read, asserts the data as soon as it can, and the 6502 consumes it in the HIGH phase of the cycle. Other comparable processors generally had multi-cycle memory operations, and were correspondingly used at higher nominal clock rates with similar grades of memory. This design trait of the 6502 was sometimes used to advantage, for example to sneak in memory accesses during the LOW phase and avoid cycle stealing, but in general the higher nominal frequency allowed more granular timing in other chips.
Correct, the NES uses the low and high phases of the 21.47727 MHz master clock to generate the smallest granularity of the video signal, which works out to 8 subpixels per pixel, or 12 subpixels per color cycle. And that's exactly how the NES palette gets its 12 different hues: by shifting the color pulses by anywhere from 0 to 11 subpixels. Whether the PPU specifically looks for the rising and falling edges for this purpose, or just uses the momentary state of the clock as a logic input, I'm not sure.
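In other words (a sketch of the idea, not the PPU's actual circuitry): twelve possible subpixel shifts across one color cycle place the twelve hues at evenly spaced 30-degree phase steps.

```python
SUBPIXELS_PER_COLOR_CYCLE = 12  # 12 half-clocks of the 21.477 MHz master per color cycle

# Shifting the square-wave chroma pulse by n subpixels rotates its phase
# by n/12 of a full cycle, and the TV decodes that phase as hue.
hue_phases = [shift * 360 / SUBPIXELS_PER_COLOR_CYCLE
              for shift in range(SUBPIXELS_PER_COLOR_CYCLE)]
print(hue_phases)  # 0, 30, 60, ... 330 degrees
```

That's why the NES palette has exactly 12 hue columns: there are only 12 distinct phase offsets available.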
I had absolutely no understanding of any of this stuff before discovering this channel. The breakdown and visual representation are just brilliant. As someone who grew up in a PAL region I've been seeing that printed on the instruction manuals of all my consoles without knowing what it meant. Now I do. Thanks and great video.
For a moment I was thinking: "Is he going to try and explain the whole clock cycle and render pipeline of the NES PPU? What a madman!" Then I saw the video is only 11 minutes long and relaxed. Great video, once again. Thanks for talking through all of these concepts. As someone who tinkers with HDMI signal generation, it seems a little weird to me that modern video standards still include extra time before, during, and after the signal. That was all originally to give the electron beam time to move in a CRT, right? What's all of that for now?
The VBlank and HBlank periods in modern setups are there because HDMI is a modification of DVI. And DVI had them because it was designed as a digital interface for CRT monitors, so they digitized all the fun picture tube control signals and delays. HDMI does in fact pack audio (and other auxiliary data) into the blanking periods, as "data islands," though I don't know the details of how it's encoded.
For a brief amount of time when I was a kid, they tried to advertise computers using the word "mertz," during the 386 and early 486 days. I specifically remember reading a description near one of these machines claiming that the scientific word "megahertz" is commonly just shortened to "mertz" for the common person. It felt like for a couple of months everyone was saying "mertz" when advertising computers.
Everything computer-related nowadays is all theoretical and abstract. It's absolutely mind-blowing to go back and see actual pins on a CPU connected to actual quartz crystals. 🤯
Programmers have long since forgotten about those things: multitasking operating systems, no real direct access to hardware anymore. I came from the days of programming C64s and DOS, and one of the optimizations we used to do was bit-shifting integers to use in place of floating-point math. In recent years I made a simulation system that required floating-point math on a modern multi-core computer, tens of millions of simulated operations. It was fast, but when I changed the code to use 64-bit integers and "fake floating point," the performance difference was unbelievable. Then I recoded it using GPU compute, and the performance was 100x my best "fake floating point," and as a bonus, it actually calculated real floating-point numbers. Today the closest I get to programming "old school" is microcontroller projects, like Rabbit, Arduino, and ESP32. People who REALLY understand building bare-metal systems are few and far between these days.
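For anyone curious what "fake floating point" looks like, here's a minimal 16.16 fixed-point sketch in Python (the names are mine, not from any particular library): the fractional part lives in the low 16 bits, and a multiply only needs an integer product plus a shift to renormalize.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # the value 1.0 in 16.16 fixed point

def to_fixed(x: float) -> int:
    return int(round(x * ONE))

def fixed_mul(a: int, b: int) -> int:
    # The raw product has 32 fractional bits; shift back down to 16.
    return (a * b) >> FRAC_BITS

def to_float(a: int) -> float:
    return a / ONE

product = fixed_mul(to_fixed(1.5), to_fixed(2.25))
print(to_float(product))  # 3.375, computed with integer ops only
```

On a CPU without an FPU (like the 6502, or early x86 without a coprocessor), shifts and integer multiplies like these were dramatically cheaper than software floating point.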
And just like that, I understand the flickering in that Mega Man game. I remember someone pointing it out in a review video, and I couldn't even begin to think of a logical reason for it.
Even though an image can appear to the human eye to be moving even if it's redrawn at a much lower framerate (e.g., the standard for theatrical movies was 24 FPS for a long time), it's interesting that 60 FPS remains the benchmark for video game performance. I guess nobody wants their modern game to play with a lower refresh rate than their NES could manage.
Though I can see why a lot of consoles doubled frames. Flicker is really annoying. Frame duplication not so much. Now give me a 100 Hz CRT and a matching console!
Wow what an awesome explanation! One part that I don't understand: what would happen if the CPU divided the crystal frequency by (for instance) 6 instead of 12? Would the CPU just run twice as fast? I assume not, or else why divide at all, but I'm not sure *why* it wouldn't work.
I think (and this is just off the top of my head, Matt) one reason for the selection of that CPU speed had to do with ROM speeds. You could accelerate the CPU (reduce its divider) and potentially exceed what the ROM can provide. That would be just one side effect of messing with an existing system design. The SNES still has to deal with this (FastROM vs. SlowROM) despite being so much younger than the Famicom. That is an interesting part of history in and of itself. Anyway, as for messing with dividers, there are some clone chips out there that don't match the dividers and cause the system timing to go bonkers. Some people have swapped them out just to see what would happen. I'll probably share some results soon in a community post.
@@DisplacedGamers That sounds like a plausible explanation. The PC Engine allows software to set the timing used by the VRAM, and it can also set the dot output to be faster than normal, which increases the horizontal resolution (up to a 512x240 image by outputting two nametables to the screen next to each other). But this causes the VRAM chip to be accessed at ~10 MHz, slightly higher than what it was rated for. If you don't set the VRAM timing correctly when doing this, the VDC (the PC Engine's equivalent of the PPU) will not be able to fetch all sprite and color entries for every scanline from VRAM; this lowers the sprites-per-scanline limit, and in some cases only 2 of the 4 bits of color data for each pixel will be read, reducing the color capabilities to something like the NES has. It's been theorized that there could be other visual glitches whenever the RAM doesn't put data on the bus fast enough, but nobody has seen any such thing happen so far.
The MMC3 interrupt counts scanlines and sends an IRQ when the specified one(s) are hit. But it's never quite perfect, hence the glitch. Most games make sure black pixels are there, but yeah, the Mega Man screen is all blue :)
There's something satisfying about hearing the specific reason the NES CPU runs at the speed it does, instead of just reading the speed and going "oh okay, that's a number." Like, I figured no matter what, faster is better, and the only reason it's not .00001 MHz faster is because Nintendo didn't want to spend more money to make it faster.
Excellent breakdown as always; you are very good at introducing daunting concepts like these to the viewer. If we divide the number of PPU cycles per frame by the three PPU cycles per CPU cycle, that gets us 29,780.5 CPU cycles per frame. I presume slowdown occurs when the CPU needs more cycles than fit within one frame render before it can update graphics?
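The arithmetic checks out, including the NTSC PPU's skipped dot on odd rendered frames, which is where the .5 comes from:

```python
DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262
PPU_DOTS_PER_CPU_CYCLE = 3

even_frame = DOTS_PER_SCANLINE * SCANLINES_PER_FRAME  # 89,342 dots
odd_frame = even_frame - 1  # NTSC PPU skips one dot on odd frames while rendering

avg_cpu_cycles_per_frame = (even_frame + odd_frame) / 2 / PPU_DOTS_PER_CPU_CYCLE
print(avg_cpu_cycles_per_frame)  # 29780.5
```

So a game's per-frame logic has to finish in roughly 29,780 CPU cycles (minus what the NMI handler and OAM DMA consume) or the frame gets dropped and you see slowdown.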
NTSC color subcarrier clock, multiplied by the M2 multiplier and then divided by the NTSC divider (39,375,000 × 6 / 11 ≈ 21,477,272.727...), sounds about right :)
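Both formulations land on the same number: the NES master clock is exactly 6x the NTSC color subcarrier (315/88 MHz), and 39,375,000 Hz is just 11x that subcarrier. Exact rational arithmetic makes the equivalence easy to verify:

```python
from fractions import Fraction

subcarrier = Fraction(315_000_000, 88)  # NTSC color subcarrier: exactly 315/88 MHz
master_clock = 6 * subcarrier           # NES master clock

# The comment's formulation: 39,375,000 * 6 / 11
alt = Fraction(39_375_000 * 6, 11)

print(float(master_clock))  # 21477272.7272...
```

Keeping everything tied to the subcarrier is the whole point: the PPU's color output has to stay phase-locked to what the TV expects.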
I've never heard that the PPU used a different clock rate before. Why has no one told me this before? Now I need to write an NES emulator because this must be explored.
@@DisplacedGamers I've always been curious how the game decides where Jason is at any given time as he rampages through the map (including when you stumble on him outside of an official attack sequence), as well as how it decides who he's attacking. The latter might just be RNG, but it always seemed like he concentrated on certain counselors.
Woo, first comment! Love seeing these mini-documentaries going over old games, their innards and architecture. Here's hoping you get into more games and consoles in the future!
You do great videos. I was wondering: could you do a Behind the Code on the Teenage Mutant Ninja Turtles 3 NES game, covering the turtles' movement and attacks, the code for the turtles' health bars, and the AI code for the Foot Soldiers and other enemies?
It always blows my mind how many chips I see on the NES that I have. Pretty much everything except the PPU and the Ricoh CPU. I DO have a 6502, though. It makes me think I could probably build an NES if I had the couple of missing chips, and a LOT of knowledge. I wouldn't do it, but it would be cool if somebody made an NES on a breadboard. I'm sure 1980s electronics dealt with more noise than the tiny parasitic inductance/capacitance of the breadboards would add. And everybody seems to be making computers on breadboards these days.
So what happens when you connect your NES (or any console from before the HDMI era) to a modern TV? Is there some kind of CRT emulation built into them, since LCD/LED panels don't care about electron beams or NTSC/PAL differences?
Hi. Nice video. I heard the TV frequency is derived/copied from the TMS9918 VDP (ColecoVision, Sega SG-1000, MSX). I found the 341 pixels per scanline on that VDP as well, BUT some documents give an honest 342 pixels per scanline. With 342 pixels per horizontal line, the refresh rate drops to 59.922 Hz. Do you have any information about this discrepancy?
Every VDP implementation was a little bit different in terms of line rate and field rate. But you are correct, the TMS9918 has the same pixel rate as the NES. The most critical rate is the chroma carrier frequency, which must match the TV very closely so colors are constant from left to right. TVs can tolerate more variation from the NTSC spec for the line rate and field rate.
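The 341-vs-342 difference is easy to check, assuming the usual dot clock of (master clock)/4 and 262 lines per field:

```python
master_clock = 315_000_000 / 88 * 6   # ~21.477 MHz NTSC master clock
dot_clock = master_clock / 4          # ~5.369 MHz pixel rate
LINES = 262

for dots_per_line in (341, 342):
    field_rate = dot_clock / (dots_per_line * LINES)
    print(dots_per_line, round(field_rate, 3))
# 341 dots -> ~60.098 Hz, 342 dots -> ~59.923 Hz
```

One extra dot per line stretches every line slightly, so the whole field takes longer and the refresh rate drops by about 0.18 Hz; either value is comfortably within what an analog TV will lock onto.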
But clocks are always off by some value; they have a tolerance, and they age. So what if your NES runs at, say, -50 ppm and mine at +50 ppm? That would give me a 1-second advantage in a 3-hour speedrun! ... I did some research, and the clock is actually +/-2 ppm. That's an incredibly precise oscillator that could cost some money! That could warrant a whole new video! :D
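The drift math is straightforward: two clocks that differ by some number of parts per million disagree by that fraction of the elapsed time. A worst-case ±50 ppm pair accumulates about a second over three hours, while ±2 ppm crystals stay within a few hundredths:

```python
def drift_seconds(run_seconds: float, ppm_difference: float) -> float:
    # Two clocks differing by `ppm_difference` parts per million drift
    # apart by that fraction of the elapsed wall-clock time.
    return run_seconds * ppm_difference * 1e-6

three_hours = 3 * 3600
print(drift_seconds(three_hours, 100))  # -50 ppm vs +50 ppm: ~1.08 s
print(drift_seconds(three_hours, 4))    # +/-2 ppm consoles: ~0.04 s
```

So with spec-grade crystals, console-to-console variation is far below what speedrun timing methods could ever resolve.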
What would happen if you replaced the crystal with one of a higher value, say 26 or 27 MHz from a CB radio? Could you overclock the NES? I know that if you changed the 3.57 MHz crystal to something like 5 or 7 MHz in a certain RadioShack phone dialer, you could turn it into a red box to make free calls on payphones, back when payphones did in-band signaling. It worked by doubling the frequencies of the * key to 1700 and 2200 Hz, which happen to be the frequencies used by the payphone electronics; timing was controlled by the totalizer.
Not uncommon for the era. Analog TVs can handle a fair bit of variation, and hardware designers often took advantage of that. It's the same reason "240p" signals work on a display that was very much only intended to do 480i: you give the screen a malformed signal and it does its best to paint it. For what it's worth, the NES PPU actually outputs a very nasty signal and is pretty far off of NTSC in a lot of regards.
It was very desirable to have an integral number of graphics bus cycles per character cell, scanline, and frame, which couldn't generally be done without varying from the NTSC spec. Just doing a progressive scan requires a variation from the NTSC spec.
I'd say NES is easier just because you can easily DIY anything that you want, including your game engine. I don't think many people would want to mess with that on a modern platform. The beauty of these old consoles today is that you can do anything, you don't have to depend on anyone else's code if you don't want to.
It has quite a bit of stability margin, but how would you get the image out of it? A TV or video capture device requires a horizontal frequency very close to 15.75 kHz.
Good question. The crystals are not perfect, and their speed varies as a function of temperature, but the machining of their dimensions and the tuning of their associated components are precise enough that the variations in speed are much smaller than the required precision. All NESes run at slightly different clock speeds in practice, but they all function within the tolerances of the NTSC/PAL TV specs.
They're very, very close; a usual good-quality grade is 20 ppm, so a 20 MHz crystal might deviate by up to 400 Hz. They are factory-trimmed by removing some material with a laser. There are also cheaper, lower-quality grades like 50 ppm. Over decades, they can drift a little.
So, if the crystal dictates clock speed for the CPU and the PPU, what would happen if we were to swap the crystal for a faster one? Would we be effectively overclocking the CPU and the PPU?
You effectively would, or you could use an external clock multiplier circuit, not unlike what's done with modern CPUs. However, the PPU *has* to be locked to the colorburst frequency of whatever TV system it's outputting for, so you can't just boost the PPU frequency for higher resolutions or frame rates; the graphics will rapidly turn into garbage. You could try to boost the CPU clock speed independently, but that would throw off the pitch of the audio (the audio hardware is literally on the same CPU die and relies on CPU timing), as well as screw up any game logic that uses CPU cycles instead of PPU-based timing. So while you could, the chances are quite high you won't get a usable system out of it.
Some emulators allow you to simulate "overclocking" chips in such a manner, reducing or eliminating slowdown that was present when playing certain games on real hardware. On real hardware, you'd presumably break the video output immediately if you swapped the crystal without some measure of extra work to keep it within NTSC standards. You might be able to overclock the CPU while leaving the PPU alone, but I'd guess that would help fewer games than you might think. On a related note, the difference between the PAL and NTSC TV standards resulted in the PAL NES using a ~26.6 MHz clock instead of ~21.5 MHz, but it also uses higher divisors for the CPU and PPU, resulting in those chips actually running at *slower* speeds. Yet the PAL NES CPU still sees more cycles per frame, because the PAL TV format was 50 Hz (50 fields per second) instead of 60 Hz.
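Using the published master clocks and divisors (NTSC: ÷12 for the CPU, ÷4 for the PPU; PAL: ÷16 and ÷5, with 312 scanlines per frame), the numbers bear this out; a quick sketch:

```python
NTSC_MASTER = 315_000_000 / 88 * 6  # ~21.477 MHz
PAL_MASTER = 26_601_712.5           # PAL NES master clock

ntsc_cpu_hz = NTSC_MASTER / 12  # ~1.79 MHz
pal_cpu_hz = PAL_MASTER / 16    # ~1.66 MHz: the PAL CPU really is slower

# Frame rates from the dot clocks: NTSC = 341 x 262 dots, PAL = 341 x 312 dots
ntsc_frame_hz = (NTSC_MASTER / 4) / (341 * 262)  # ~60.10 Hz
pal_frame_hz = (PAL_MASTER / 5) / (341 * 312)    # ~50.01 Hz

print(ntsc_cpu_hz / ntsc_frame_hz)  # ~29780.7 CPU cycles per NTSC frame
print(pal_cpu_hz / pal_frame_hz)    # ~33247.5 CPU cycles per PAL frame
```

A slower CPU that gets fewer, longer frames ends up with roughly 12% more cycles to spend on each one, which is why some PAL games suffer less slowdown despite the lower clock.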
You don't have much leeway before the TV can no longer decode the signal. You have some latitude on vertical refresh rate, though it shouldn't exceed 61 Hz. But say you took a PAL NES (different timings) and a PAL TV; you gain some extra leeway, since all PAL TVs are fine with a 60 Hz vertical refresh. The color burst is very sensitive, but you can drift off it pretty far and it won't be fatal; at first you get false color, and then you lose color altogether. That's OK, you can live without color. The most critical is the horizontal refresh rate. It's near 15.75 kHz (in both PAL and NTSC), and you can't deviate very far from it or the TV will lose sync. A very, very narrow working range.
Meanwhile, as computer systems and game consoles became more "modern" developers have strayed further and further away from optimizing code for performance for various reasons, very few of them actually valid.
I guess that is due to the deeper pipeline in the video chip. The CPU had no transistors left to deal with pipeline issues? Though Intel always had a higher clock rate (4.77 MHz in 1981). Then in 1985, with the Amiga, the CPU had the same clock. Can OCS do 640px?
I thought we needed a "Behind the Hardware" episode to talk a bit about cycles as they are very important. This topic is a bit of a prerequisite for talking about interrupts.
I hope those of you with experience in pulses, duty cycles, etc. didn't mind the oversimplification of the clock in order to make a point about speeds. We could certainly dive deeper into this topic, but this should be enough to proceed with other mappers.
ty. I've struggled to explain this to a lot of people.
I've also had difficulty explaining that the interlacing required for the image to fit on a 640 x 480 screen essentially reduces the framerate of games by half (from 60 to 30), and then there's the horizontal blank cycle.
Hey man, I don't know where to pitch you this request (and I'm sure you're SWAMPED with them). But I can't figure out for the life of me the answer to a very simple question, regarding Adventure Island III. Definitely need a hacker/tracker guy who knows what he's doing to trace this.
There are many secret bonus rooms you can find by touching keys in hidden eggs, and when the room loads, there is a smaller, inner box room inside the room with an invincibility crystal inside. Outside the inner room, there are lots of fruit and 3 dinosaur cards. If the room loads with you already inside the inner box, you can get the crystal, and walk through the walls to exit the box. I think you can even jump back in. But most of the time, you start at the bottom of the larger room, outside the box, and you can't get in at all. Nothing in a computer is truly random, there MUST be a way to cause a start inside the inner room. I loved your Punch-Out!! videos, btw, very related.
Yo! I've kidnapped the princess while you were explaining clock speed. Ha Ha ha...
... this is one of the best comments ever.
Dude that is a fucked up thing to do
@@DisplacedGamers This was your opportunity to say "you sly dog, you got me monologuing!"
By the way, many computers and consoles of that era used the same CPU (a variant of the 6502 chip) while having vastly different graphical and sound capabilities: this video gives a taste of why. In some cases, we have the same CPU in two different console generations (Atari 2600 and NES, picking at the opposites of the range). It's incredibly fascinating how the engineers managed to create incredibly different architectures while technically striving for similar purposes.
That's what the 6502 was designed to be: a low-cost, decent CPU for general usage in a variety of low-power devices. Attach it to a simple line output system, and you get the 2600. Attach it to a more complex tile-and-sprite renderer, and you get the NES.
Designing a CPU is hard. The 6502 and Z80 had large software libraries and a market of trained assembly coders. Same with the 68k and 8086.
Wonderful explanation. This is exactly why it's not possible to "overclock" the NES by changing the crystal. Many other '80s computers worked exactly the same way, like the C64.
As long as you install some sort of pre-scaler in front of the PPU to keep it running at its normal pace, it should work fine. Of course, things like the audio will still be higher-pitched, and any mapper that counts CPU cycles for its interrupt timer instead of scanlines will not work properly. That part can be fixed by patching the particular game's code instead.
These types of breakdowns always interest me... because even as "simple" as games were back then, there was a lotta stuff going on in those carts/chips. I can't even imagine what goes on nowadays.
I suggest you check out Ben Eater's videos to get a super low-level idea of how a 6502 system works. It's a big rabbit hole that results in you buying vintage components and reading a lot of datasheets, but it's the most fun project I've ever taken on.
Is easier now, you don't have as much limitations
The irony now is that developers will arbitrarily decide that you can't run it, or that the game will run at 30 fps, just because 😂
@@MrAsmontero "Is easier now, you don't have as much limitations"
Or it's harder now, you have much more complicated systems that interact in ways that can be very hard to understand and predict. Not having limitations is an illusion because player expectations are so much higher.
@@MrAsmontero Did you reply and then delete your comment? I've also programmed on both old and new, but would like to get your perspective, especially if you disagree with mine. But I can't see your reply except up in the bell menu, which doesn't give me the full text.
These are always so incredibly well edited and explained. I cannot express enough gratitude for your work
Thank you!
Yes, the CPU/PPU timing is very important for MMC3 interrupts. I once briefly had a PAL NES game called Drop Zone, which I ran on my (NTSC) toploader. It used a timer to draw a HUD in the bottom half of the screen. On my TV, about half of the font for information like the score and lives was cut off, because, being a PAL game, it was programmed for the timing of a PAL console and TV.
Nowadays you can be clever about this for homebrew and detect the system region (simply by counting how much time passes between two Vblanks) and from there you can take the right timing values for the MMC3.
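The classification step itself is tiny. Here's the idea in Python (the threshold is mine, just the midpoint between the two known frame lengths; on hardware you'd spin a counter between two consecutive NMIs to get the cycle count):

```python
NTSC_CYCLES_PER_FRAME = 29780  # ~1/60 s of NTSC CPU cycles
PAL_CYCLES_PER_FRAME = 33247   # ~1/50 s of PAL CPU cycles

def detect_region(cycles_between_vblanks: int) -> str:
    # Anything closer to the longer PAL frame is classified as PAL.
    midpoint = (NTSC_CYCLES_PER_FRAME + PAL_CYCLES_PER_FRAME) // 2
    return "PAL" if cycles_between_vblanks > midpoint else "NTSC"

print(detect_region(29780), detect_region(33247))
```

Because the two frame lengths differ by over 10%, even a crude cycle-counting loop gives an unambiguous answer, and the game can then load the matching MMC3 scanline values.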
Your videos are truly the ASMR of the programming world.
I rewatch his videos every night, they’re so incredibly soothing
This might be your best thumbnail to date
Thanks!
I must commend you for this wonderful explanation of timing, clocks and processing. As a software engineer and digital ASIC designer with experience in computer graphics, I understand all of this stuff, but I could have never explained so clearly as you did, very nice!
Thanks!
Nice explanation; an extra section on PAL would be awesome.
Sorry I didn't weave in something about PAL as it obviously involves different speeds. I remember thinking about it during production, but I was buried in Photoshop at the time.
always looking forward to your videos, usually insta-watch no matter when! Thanks for doing these.
As always you have taken a very complicated aspect of technology and made it easy to grasp and understand. The hard part is remembering everything lol
The thumbnail was absolutely perfect for the notification I got on my phone this morning. As usual awesome video. Too early for a taco, but I grabbed a bagel and coffee.
Fantastic video~! I was always wondering what caused that twitching row of pixels on the Mega Man 3 boss selection screen, as well as a similar wiggle occurring sometimes in SMB3 right above the HUD. It makes sense now that it is technically scrolling the screen back to the status bar and away from the stage tiles.
I never even NOTICED that effect in MM3 because it happens so fast. It just goes to show that at that specific point in the video it all came together in my head, because I remembered that glitchiness.
Hope your channel blows up. Super impressive stuff man.
This is incredibly interesting. It's amazing how everything used to work. Today there's no need to ensure the clock speed is a multiple of the number of dots the TV can render. But in the NES's days of bare-metal computing it was essential. I learned a lot in this video, thank you!
Very good explanation of what's going on inside such systems! In the age of non-CRT video devices, it is very helpful to be reminded of how a video signal is created in the first place. Thanks a lot 😊
This was an excellent presentation!
I just started my NES journey and I'm so thankful for your video :)
ah yes, humble deep dives
Just a moment of appreciation for that clever title card.
Excellent Video! Thank you!
It’s mind blowing how this all works.
Bitchin. Another banger expected. Gm and God bless my friends
Edit: hah! I recall smb3 having a scanline bug too.. I believe dahrkdaiz or bbitmaster had found the fix years ago and shared it in irc
I'd like to hear the story behind the mentioned blinking scanline glitch: what causes it? 🤔 Incorrectly done interrupts?
I can't tell you for sure, but different MMC3 revisions trigger the interrupt at slightly different points in time. Ideally, it would be triggered right as horizontal blanking starts so that the PPU scroll register can be updated before the next scanline starts to be drawn. It could either be that not happening at the right time (too early/late), or the way the interrupt handler was written (too much stuff happening before writing the scroll register?).
There can also be different PPU/CPU cycle alignments and when an interrupt occurs the CPU is still allowed to finish the instruction it was executing before answering the pending interrupt. Different instructions take different amounts of time (the slower ones can take 6 or 7 CPU cycles), so depending on what the CPU was doing, the interrupt will be handled with varying amounts of delay and the PPU will be at a later dot than you want when you get to writing the scroll register. This is responsible for that little "jitter" with glitched looking pixels at the edge of a screen split. The best time to do these changes is when the jitter is hidden to the right of the image in H-blank but that's not easy to do without carefully timed interrupt handlers.
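To put a rough number on that jitter, here's a back-of-the-envelope sketch, assuming NTSC's 3 PPU dots per CPU cycle and a 7-cycle worst-case instruction (both standard documented figures):

```python
# Worst-case IRQ latency jitter from instruction completion alone:
# the 6502 always finishes its current instruction before taking
# the interrupt, and the slowest common instructions take ~7 cycles.
PPU_DOTS_PER_CPU_CYCLE = 3   # NTSC ratio
WORST_CASE_INSTRUCTION = 7   # CPU cycles

max_jitter_dots = WORST_CASE_INSTRUCTION * PPU_DOTS_PER_CPU_CYCLE
print(max_jitter_dots)  # 21 -- up to ~21 PPU dots of slop before the handler
```

That's why the glitchy pixels wander around near the split instead of staying put.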
Hey DG, I've been a big fan of your channel for quite some time now. One of my favourite series of yours has got to be the one where you break down how the original Super Mario Brothers stores its level data, and how this causes the existence of the glitch levels. I was wondering if you're planning to do similar videos about SMB2 and 3, since those games have vastly more complicated levels, and I'm very curious as to how their data is stored and compressed.
I wouldn't rule out the possibility.
@@DisplacedGamers Awesome! I can't wait.
@@DisplacedGamers It took me this long to realise that the video I was talking about was actually made by Retro Game Mechanics Explained, lol. I guess it kinda makes sense that I'd get confused, you're both great channels that explain super-complicated coding and hardware information in an in-depth and easy-to-understand way.
Def wanna know more about that glob from MM3, The edge of the status area in Crystalis had a similar effect that was always there in your face too.
looks like the cpu is modifying the graphics data while the ppu is drawing that part of the screen
Fantastic explanation and visuals! Great work!
Quartz watches were the first use of crystal oscillators in consumer goods. Of course, those oscillators needed to produce a very precise frequency for accurate timekeeping. Engineers needed to produce a quartz crystal that would always resonate at a particular frequency, and whose resonant frequency could be controlled. Luckily, such a device has been in use for centuries; tuning forks, first invented in 1711, resonate at exactly one frequency, which was precisely set in manufacturing by grinding down the ends of the fork. To reduce the size and ensure it wouldn't be audible, quartz watch crystals are manufactured to resonate at exactly 32,768 hertz, using a laser to trim the tines down micrometers at a time until the desired frequency is reached.
Sadly, this technique doesn't scale down indefinitely; crystal oscillators in the megahertz range are based on resonating discs and aren't nearly as exciting.
I watch this channel and Retro Game Mechanics pretty consistently but still had no idea about this clock stuff.
Good morning! :D Thank you for posting. :]
Good morning!
Best video on your channel I love this stuff one of my fav you tubers. Always a great day when you upload
Thank you!
It feels like you've read my mind, as I am knee-deep in this exact rabbit hole. Interestingly enough, the video output appears to function at twice the crystal rate, giving us 12 voltage outputs per color cycle. At least, that's what it looked like on the NESdev NTSC page. Something about outputting on both rising and falling edges. I don't know what that means or how that is actually done in hardware. Maybe you have some thoughts or corrections :)
Work is performed on both the "rising" edge of the wave and the "falling" edge of the wave on the clock signal. This could be for triggering components at more exact timing, etc. This would still be considered six cycles, however, since both a rising edge and falling edge are just 2 parts of a single square pulse. There are components that can determine if a signal went from low-to-high (0 to 1) or high-to-low (1 to 0), which in this case would indicate the start and end to a single clock pulse. Hope that helped!
Not only video, 6502 also has rising and falling edge sub-cycles. I don't think Z80 does, or at least doesn't do anything extensive with them. This is how 6502 is able to perform a memory read in one cycle, it will assert an address on the bus and configure the bus in the LOW phase of the cycle, and then the memory needs some time to perform the read, it will assert the data as soon as it can, which the 6502 will consume in the HIGH phase of the cycle. Other comparable processors would generally have multiple-cycle memory operations, and would correspondingly be typically used at higher nominal clock rates with similar grades of memory. This design trait of 6502 was sometimes used to an advantage, for example to sneak in memory accesses in the LOW phase and avoid cycle stealing, but in general the higher nominal frequency allowed more granular timing in other chips.
Correct, the NES uses the low and high phases of the 21.47727 MHz master clock to generate the smallest granularity of the video signal, which works out to 8 subpixels per pixel, or 12 subpixels per color cycle. And that's exactly how the NES palette gets its 12 different hues, by shifting the color pulses by anywhere from 0 to 11 subpixels. Whether the PPU specifically looks for the rising and falling edges for this purpose, or just uses the momentary state of the clock as a logic input, I'm not sure.
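A quick sanity check of those numbers, assuming the standard NTSC subcarrier of 315/88 MHz and a 6x master clock (documented values, not from the video):

```python
# Every NES NTSC clock derives from the color subcarrier, 315/88 MHz.
COLORBURST_HZ_NUM = 315_000_000        # numerator in Hz (denominator is 88)
MASTER_HZ_NUM = COLORBURST_HZ_NUM * 6  # master clock is 6x the subcarrier

master_cycles_per_color_cycle = MASTER_HZ_NUM // COLORBURST_HZ_NUM  # 6
# Clocking on both the rising and falling edges doubles the granularity:
subpixels_per_color_cycle = 2 * master_cycles_per_color_cycle
print(subpixels_per_color_cycle)  # 12 -- one distinct phase step per hue
```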
I had absolutely no understanding of any of this stuff before discovering this channel. The breakdown and visual representation are just brilliant.
As someone who grew up in a PAL region I've been seeing that printed on the instruction manuals of all my consoles without knowing what it meant. Now I do. Thanks and great video.
For a moment I was thinking: “Is he going to try and explain the whole clock cycle and render pipeline of the NES PPU, what a madman!” Then I saw the video is only 11 minutes long and relaxed. Great video, once again. Thanks for talking through all of these concepts. As someone who tinkers with HDMI signal generation, it still seems a little weird to me that modern video standards still include extra time before, during, and after the signal. That was all originally to give the electron beam time to move in a CRT, right? What’s all of that for now?
The VBlank and HBlank periods in modern setups are there because HDMI is a modification of DVI. And DVI had them because it was designed as a digital interface for CRT monitors, so they digitized all the fun picture tube control signals and delays.
I would not be surprised to learn that HDMI packed audio into the VBlank period, but don't know for sure how it is encoded.
For a brief amount of time when I was a kid, they tried to advertise computers using the word Mertz, during the 386 and early 486 days. I specifically remember reading a description near one of these saying that the scientific word megahertz is commonly just shortened to Mertz for the common person. It felt like for a couple of months everyone was saying the word mertz when advertising computers.
Good day, Thanks for the video.
dude I love your videos well explained and detailed. You're one super smart cat!
Definitely looking forward to a Part 2! -- Seemed to end just a little too short!
I feel cattified 😮 impressive presentation 👏 👌 👍
Okay, I definitely clicked because of the thumbnail. lol
Everything computer-related nowadays is all theoretical and abstract. It's absolutely mind-blowing to go back and see actual pins on a CPU connected to actual quartz crystals. 🤯
Programmers have long forgotten about those things with multitasking operating systems; there's no real direct access to hardware anymore. I came from the days of programming C64s and DOS, and one of the optimizations we used to do was bit shifting integers to use instead of floating point math. I had made a simulation system in recent years that required calculating floating point math on a modern multi-core computer, tens of millions of simulated operations. It was fast, but when I changed the code to utilize 64-bit integers and "fake floating point", the performance difference was unbelievable. Then I recoded it using GPU compute instead and the performance was 100x what my best "fake floating point" was, and as a bonus, it actually calculated the real floating point numbers. Today the closest thing I get to programming "old school" is microcontroller projects, like Rabbit, Arduino and ESP32. And people who REALLY understand making bare metal systems are few and far between these days.
Huh? Isn’t there still a crystal and still two pins for it? Just now the crystal is slower than the CPU.
Fantastic explanation, my friend! HAHAHA
And just like that, I understand the flickering on that Megaman game.
I remember someone pointed it out in a review video and I couldn't even begin to think of a logical reason.
So good! 💛
You got a like just for the thumbnail.
Ha! Thank you. I had trouble coming up with a thumbnail, and the idea for this one hit really late last night.
@@DisplacedGamers It works! Also like the videos in general. Clear explanations and visuals with calm voiceovers.
Even though an image can appear to the human eye to be moving even if it's redrawn at a much lower framerate (e.g., the standard for theatrical movies was 24 FPS for a long time), it's interesting that 60 FPS remains the benchmark for video game performance. I guess nobody wants their modern game to play with a lower refresh rate than their NES could manage.
Though I can see why a lot of consoles doubled frames. Flicker is really annoying. Frame duplication not so much. Now give me a 100 Hz CRT and a matching console!
Wow what an awesome explanation! One part that I don't understand: what would happen if the CPU divided the crystal frequency by (for instance) 6 instead of 12? Would the CPU just run twice as fast? I assume not, or else why divide at all, but I'm not sure *why* it wouldn't work.
I think (and this is just off the top of my head, Matt) one reason for the selection of that CPU speed had to do with the ROM speeds. You could accelerate the CPU (reduce) divider and potentially exceed what the ROM can provide. That would be just one side effect of messing with an existing system design.
The SNES still has to deal with this (FastROM vs. SlowROM) despite being so much younger than the Famicom. That is an interesting part of history in and of itself.
Anyway - As for messing with dividers, there are some clone chips out there that don't match the dividers and cause the system timing to go bonkers. Some people have swapped them out just to see what would happen. I'll probably share some results soon in a community post.
@@DisplacedGamers That sounds like a plausible explanation. The PC Engine allows the software to set the timing used by the VRAM, and it can also set the dot output to be faster than normal, which causes the horizontal resolution to increase (up to a 512x240 image by outputting two nametables to the screen next to each other). But this causes the VRAM chip to be accessed at ~10MHz, which is slightly higher than what it was rated for. If you don't set the VRAM timing correctly to do this, the VDC (equivalent of the PPU for the PC Engine) will not be able to fetch all sprite and color entries for every scanline from VRAM. This will lower the sprites-per-scanline limit, and in some cases only 2 of the 4 bits of color data for each pixel will be read, reducing the color capabilities to something like the NES has. It's been theorized that there could be other visual glitches whenever the RAM doesn't put the data on the bus fast enough, but nobody has seen any such thing happening so far.
That explains why they'd pick a lower frequency, but what good did it do them to have a clock that's 4 times faster than necessary?
@@warmCabin Maybe they use the sharp edges for a square signal? Or off-the-shelf TV parts?
That's the best thumbnail. That's it.
The thumbnail really drew me in, btw.
Another excellent video.
If I wasn't already a subscriber, I would have become one from the title card alone.
The MMC3 interrupt counter counts scanlines and sends an IRQ when the specified one(s) are hit. But it's never quite perfect, hence the glitch. Most games ensure black pixels are there, but yeah, the Mega Man screen is all blue :)
there's something satisfying about hearing the specific reason the NES CPU runs at the speed it does instead of just reading the speed and going "oh okay that's a number"
like, I figured no matter what if the speed is faster then that's better and the only reason it's not .00001 Mhz faster is because Nintendo didn't want to spend more money to make it faster
This is interesting 'for science' stuff for sure.
Excellent breakdown as always, you are very good at introducing daunting concepts like these to the viewer.
If we divide the number of PPU cycles per frame by the three PPU cycles per CPU cycle, we get 29,780.5 CPU cycles per frame. I presume slowdown occurs when the CPU runs too many cycles to fit within one frame render before it can update graphics?
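Checking that 29,780.5 figure as a quick sketch (the skipped dot on odd frames is the standard documented PPU behavior, which is where the half-cycle average comes from):

```python
DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262

even_frame_dots = DOTS_PER_SCANLINE * SCANLINES_PER_FRAME  # 89,342
odd_frame_dots = even_frame_dots - 1  # the PPU skips one dot on odd frames

avg_ppu_dots = (even_frame_dots + odd_frame_dots) / 2  # 89,341.5
avg_cpu_cycles = avg_ppu_dots / 3  # 3 PPU cycles per CPU cycle
print(avg_cpu_cycles)  # 29780.5
```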
NTSC color subcarrier clock multiplied by the M2 multiplier and then divided by the NTSC divider (39,375,000 × 6 / 11 ≈ 21,477,272.727...) sounds about right :)
Fascinating!
This is interesting stuff and nothing I've ever known about before.
I've never heard that the PPU used a different clock rate before. Why has no one told me this before? Now I need to write an NES emulator because this must be explored.
Had a notion earlier- I'd be interested in seeing the mechanics behind the NES Friday the 13th, one of the oldest roguelikes to wind up on a console.
What part of Friday the 13th?
@@DisplacedGamers I've always been curious how the game decides where Jason is at any given time as he rampages through the map (including when you stumble on him outside of an official attack sequence), as well as how it decides who he's attacking. The latter might just be RNG, but it always seemed like he concentrated on certain counselors.
That shade on Megaman. haha.
Still a fun game!
Haven't started watching but I'm calling it now:
"let's imagine a bus"
Is that 1989 board the one with the anti-glitch circuit on the CIC?
They say if we're living in a simulation, the clock speed is the speed of light.
Woo, first comment!
Love seeing these mini-documentaries going over old games, their innards and architecture. Here's hoping you get into more games and consoles in the future!
You do great videos. I was wondering could you do a behind the code on the teenage mutant ninja turtles 3 nes game for the movement and attacks that the turtles do, and the code for the health bar for the turtles and code for AI for foot soldiers and enemies?
The video thumbnail is killer hahaha :D
It always blows my mind how many chips I see on the NES that I have. Pretty much everything except the PPU and the Ricoh CPU. I DO have a 6502, though. It makes me think I could probably build an NES if I had the couple of missing chips, and a LOT of knowledge. I wouldn't do it, but it would be cool if somebody made an NES on a breadboard. I'm sure 1980s electronics dealt with more noise than the bitty parasitic inductance/capacitance the breadboards would introduce. And everybody seems to be making computers on breadboards these days.
I would love to learn about the Atari 7800 in this same way! that would be interesting!
So what happens when you connect your NES (or any console before HDMI era) to modern TV? Is there some kind of CRT-emulation built inside them, since LEDs don't care about electron beams and NTSC/PAL differences.
HDMI has the same timing as analog. You can cut back on the idle time in hblank.
Here we go
I eat CPU chips for breakfast. My favourites are of course Motorola 68ks
Mario saving the princess is a metaphor for chasing light or chasing the beam
Time to do some hardware mods on the NES! What do you upgrade first?
No?
uh i would probably give it more RAM first, there's a substantial amount of the NES's address space that goes unused
Top!
Hi. Nice video. I heard the TV frequency is derived/copied from the TMS9918 VDP (ColecoVision, Sega SG-1000, MSX). I found the 341 pixels per scanline on that VDP as well, BUT in some documents I found an honest 342 pixels per scanline. With 342 pixels per horizontal line, the refresh rate drops to 59.922 Hz. Do you have some information about this discrepancy?
Every VDP implementation was a little bit different in terms of line rate and field rate. But you are correct, the TMS9918 has the same pixel rate as the NES. The most critical rate is the chroma carrier frequency, which must match the TV very closely so colors are constant from left to right. TVs can tolerate more variation from the NTSC spec for the line rate and field rate.
Very similar energy to the Atari's beam-chasing...
But clocks are always off by some value; they have a tolerance, and they age. So what if your NES is, like, -50 ppm, and mine is +50 ppm? That'll give me a 1-second advantage in a 3-hour speedrun!
... I did some research, and the clock is actually +/-2 ppm; that's an incredibly precise oscillator that could cost some money! That could warrant a whole new video! :D
What would happen if you replaced the crystal with one of higher value, say 26 or 27 MHz, like one from a CB radio?
Could you overclock the NES?
I know if you changed the 3.57 MHz crystal to something like 5 or 7 MHz in a certain RadioShack phone dialer, you could turn it into a red box to make free phone calls on payphones, back when payphones did in-band signaling.
It worked by doubling the frequencies of the * key to 1700 and 2200 Hz, which happen to be the frequencies the payphone electronics listened for; timing was controlled by the totalizer.
I was born in 1983 and I eat the stuff up 🍕🍔🌭!!!
Do you take requests for topics at all?
What's on your mind?
What happened next? Well in Whoville they say, that this viewer's brain grew three sizes that day.
"more comfortable with duty cycles or..." Translates as "yeah I know some of you are NERDS but I'm speaking to the rest of the class 👏"
This episode did not show up in my feed and was literally hidden from me for weeks... youtube is playing games with you...
Hm, so the NES actually outputs fields very slightly faster than a standard NTSC signal, which is ~59.94 Hz
Not uncommon for the era. Analog TVs can handle a fair bit of variation, and hardware designers often took advantage of that.
Same reason "240p" signals work on a display that was very much only intended to do 480i. You give the screen a malformed signal and it will do its best to paint it to the screen.
For what it is worth, the NES PPU actually outputs a very nasty signal and is pretty far off of NTSC in a lot of regards.
It was very desirable to have an integral number of graphics bus cycles per character cell, scanline, and frame, which couldn't generally be done without varying from the NTSC spec. Just doing a progressive scan requires a variation from the NTSC spec.
Coming next: How to overclock an NES
So whats the verdict? Is it harder to develop for the NES or modern consoles? Bare metal and assembly or SDKs and all-in-one dev kits like Unity?
I'd say NES is easier just because you can easily DIY anything that you want, including your game engine. I don't think many people would want to mess with that on a modern platform. The beauty of these old consoles today is that you can do anything, you don't have to depend on anyone else's code if you don't want to.
Can you analyse the SNES please 🙏😁
Always noticed that flickering on Mega Man 3! 😂
Such a passive aggressive tone when he pulled that one up.
YAAAAAAY
Not a fan of the music in the background, but good video!
I wonder what would happen if you swapped the clock crystal. Would it be stable?
It has quite a bit of stability margin, but how would you get the image out of it? A TV or video capture device requires something very, very close to a 15.75 kHz horizontal frequency.
So, all quartz crystals are perfect and run the clock at the expected rate?
Good question. The crystals are not perfect, and their speed varies as a function of their temperature, but the machining of their dimensions and the tuning of their associated components are precise enough that the variations in speed are much smaller than the required precision.
All NESes run at slightly different clock speeds in practice but they all function within the tolerances of the NTSC/PAL tv specs.
you can change the frequency by changing the shape of the crystal and so they get very close to the required frequency in the factory
They're very, very close, with the usual good quality grade being 20 ppm, so a 20 MHz crystal might deviate by up to 400 Hz. They are factory trimmed by removing some material with a laser.
There's also cheaper lower quality grades like 50ppm.
Over decades, they can drift a little.
So, if the crystal dictates clock speed for the CPU and the PPU, what would happen if we were to swap the crystal for a faster one? Would we be effectively overclocking the CPU and the PPU?
You effectively would, or one could also use an external clock multiplier circuit, not unlike what's done with modern CPUs today. However, the PPU *has* to be locked to the colorburst frequency of whatever TV system it's outputting for, so you can't just boost the frequency of the PPU for higher resolutions or frame rates; the graphics will rapidly turn into garbage as a result. You could try and boost the CPU clock speed independently, but that would throw off the pitch of the audio system (it relies on CPU timing as well, since the audio is literally on the same CPU die), as well as screwing up any game logic/timing that uses CPU cycles instead of PPU-based timing. So while you could, the chances are quite high you won't get a usable system out of it.
Some emulators allow you to simulate "overclocking" chips in such a manner, reducing or eliminating slowdown that was present when playing certain games on real hardware. On real hardware, you'd presumably immediately break the video output if you swapped the crystal without doing some measure of extra work to keep it within NTSC standards. You might be able to overclock the CPU while leaving the PPU alone, but I'd guess that would help fewer games than you might think.
On a related note, the difference in the PAL and NTSC TV standards resulted in the PAL NES using a 26MHz clock instead of 21MHz, but it also uses higher divisors for the CPU and PPU, resulting in those chips actually running at *slower* speeds. Yet the PAL NES CPU still sees more cycles per frame, because the PAL TV format was 50Hz (50 fields per second) instead of 60Hz.
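Plugging in the commonly cited figures as a sketch (the master clocks, dividers, and refresh rates below are the standard documented values, so treat them as assumptions rather than numbers from the video):

```python
# Standard documented NES clock figures (assumptions, not from the video):
NTSC_MASTER_HZ = 21_477_272
PAL_MASTER_HZ = 26_601_712

ntsc_cpu_hz = NTSC_MASTER_HZ / 12  # ~1.79 MHz
pal_cpu_hz = PAL_MASTER_HZ / 16    # ~1.66 MHz -- slower, as stated above

# But PAL refreshes only ~50 times a second, so per frame:
ntsc_cycles_per_frame = ntsc_cpu_hz / 60.0988
pal_cycles_per_frame = pal_cpu_hz / 50.007

# The slower PAL CPU still gets more cycles per frame:
print(pal_cycles_per_frame > ntsc_cycles_per_frame)  # True
```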
You don't have much leeway until the TV can no longer decode the signal.
You have some latitude on vertical refresh rate, though it shouldn't exceed 61Hz. But say you took a PAL NES (different timings) and PAL TV, you gain some extra leeway since all PAL TVs are fine with 60Hz vertical refresh.
The colour burst is very sensitive, but you can drift off it very far and it won't be fatal, just at first you get false colour and then you lose colour altogether. It's OK, you can live without colour.
The most critical is the horizontal refresh rate. It's near 15.75 kHz (in both PAL and NTSC), and you can't deviate very far from it before the TV loses sync. A very, very narrow working range.
If you put a faster crystal on the CPU only, it will make the game have less slowdown, but the sound will be at a different pitch.
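The pitch shift is directly proportional, since the APU's period registers count CPU cycles. A quick illustration using the documented APU pulse-channel frequency formula (the 25% overclock and the period value here are made up for the example):

```python
# NES APU pulse-channel frequency: f = cpu_clock / (16 * (period + 1))
def square_freq(cpu_hz: float, period: int) -> float:
    return cpu_hz / (16 * (period + 1))

STOCK_CPU_HZ = 1_789_773              # NTSC 2A03 CPU clock
OVERCLOCKED_HZ = STOCK_CPU_HZ * 1.25  # hypothetical 25% faster crystal

period = 253  # arbitrary example value for the period register
ratio = square_freq(OVERCLOCKED_HZ, period) / square_freq(STOCK_CPU_HZ, period)
print(ratio)  # 1.25 -- every note sounds 25% higher in frequency
```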
Meanwhile, as computer systems and game consoles became more "modern" developers have strayed further and further away from optimizing code for performance for various reasons, very few of them actually valid.
Why 60.0988 frames per second and not 60? What does marginally less than a tenth of a frame per second do?
9:41 A dot is a dot, you can't say it's only half
I never knew the NES PPU had a higher clock speed than the CPU.
That is typical for most computers and game consoles from the day.
I guess that is due to the deeper pipeline in the video chip. The CPU had no transistors left to deal with pipeline issues? Though Intel always had a higher clock rate (4.77 MHz in 1981). Then in 1985, with the Amiga, the CPU had the same clock. Can OCS do 640px??