0:50 Minor correction: bit depth and sample rate are two completely separate things. Sample rate only determines how many samples are made per second, bit depth determines the resolution of those samples. Sample rate directly determines the highest frequency that can be reproduced by a recording, that being half of the sample rate. This is why the low sample rate is so noticeable with the Snake Eater theme and F-Zero 99, as high pitch female vocals and drums are going to be the first things to get hurt significantly by a decrease in sample rate.
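To put numbers on the "half of the sample rate" point above, here is a minimal sketch. The function name and the list of rates are illustrative assumptions (11025 Hz stands in for the roughly 11 kHz figure mentioned elsewhere in the thread); only the halving rule comes from the comment.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a recording at this sample rate can represent."""
    return sample_rate_hz / 2


# Illustrative rates only; they are not measurements of any particular game.
for rate in (11025, 22050, 32000, 44100, 48000):
    print(f"{rate} Hz sampling -> content up to {nyquist_limit_hz(rate):.1f} Hz")
```

Anything in the source above that limit (cymbals, sibilance in vocals) has to be filtered out before downsampling, which is why those parts of a track suffer first.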
I like explaining it as a pipe with running water. Bit rate is how big the pipe is, and the sample rate is the water pressure. The best comparison is the SNES titles put on the GBA. You hear SNES games and they sound great, but on the GBA, with its lower bit rate, they sound much worse. Final Fantasy Advance games were redone to meet the GBA's hardware, but Super Mario Advance games don't sound very good at all.
@@segasdreamer, the Super Mario Advance games and many other GBA games don't sound that bad. I believe that because some of the former had to be released quickly (at least the first three games, to some extent), some of the music sounds off or is in a different key, which isn't because of limitations of the GBA's sound hardware. Fans have made patches for them to make them sound closer to their source material, showing that the GBA's sound doesn't have to be very limited. As for the latter, these are games that were intentionally made for the GBA and its capabilities. Games like Mother 3, Wario Land 4, Klonoa Heroes, Mario VS Donkey Kong, and Kirby & the Amazing Mirror have great sounding music and audio despite having low bit rates and being streamed through the speakers at a relatively lower sample rate (max possible sample rate of 32768 Hz and a depth of 8 bits; however, most GBA games use a lower sample rate).
...inappropriate waterpipe analogy aside, the above sample rate/frequency linkage is called the Nyquist Theorem. While the music is technically "more full" (of samples) as the rate increases, the returns diminish starting at 44.1 kHz, as its upper limit, 22,050 Hz, is very close to the absolute edge of human hearing. They chose that for the CD (audio format) for a reason. DVD/video uses 48 kHz as it's more easily divisible (by 24, for example). Bit depth is a description of the total volume range, and the returns similarly fall off around 16 bits, as it becomes difficult to find equipment, or an environment, able to support reproduction of that range. The stair-step depiction of music samples is also misleading, as the samples are really stored discrete points that your computer draws the line through when converting to analog. The video, though, does a good job of explaining the reason this is all so apparent and important with games lol
@@superthe great explanation. It is worth noting that 44.1kHz was actually chosen for CD out of convenience, as it's what pre-existing PCM adapters recorded at. It was chosen there as the highest sample rate with 16 bits compatible with both NTSC and PAL signals without doing anything like splitting samples across lines.
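The NTSC/PAL compatibility point can be checked with a few lines of arithmetic. This is a sketch using the commonly cited figures for video-based PCM adaptors (3 samples stored per video line, 245 usable lines per field at 60 fields/s, 294 usable lines per field at 50 fields/s); those numbers are assumptions brought in for illustration and are not stated in the comment itself.

```python
SAMPLES_PER_LINE = 3  # assumed: samples stored on each usable video line


def adaptor_sample_rate(usable_lines_per_field: int, fields_per_second: int) -> int:
    """Sample rate in Hz of a PCM adaptor that packs audio into video lines."""
    return usable_lines_per_field * fields_per_second * SAMPLES_PER_LINE


ntsc_rate = adaptor_sample_rate(245, 60)  # assumed monochrome 60 Hz field rate
pal_rate = adaptor_sample_rate(294, 50)

print("NTSC-style:", ntsc_rate, "Hz")
print("PAL-style: ", pal_rate, "Hz")
```

Both products land on the same whole number of samples per second, which is the "without splitting samples across lines" property described above.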
You can capture analog sound in CD and DVD resolutions. The resolution represents the detail between absolute zero and the loudest sound. I am not sure you are wrong, but something does not seem right.
Maybe they took the same files from MK8 without resampling? and for Booster Course they had to resample or create new music tracks entirely depending on the course?
Weird. Are the original Wii U files 48KHz? I've noticed Switch ports of games (the Assassin's Creed collection and Deadly Premonition for example) having atrocious audio quality compared to hardware as old as the Xbox 360
I noticed that on the demonstration here. I was noticing it on my studio audio interface and AKG's, so I hooked up a waterfall display, yep clearly clipped sample rates, so it wasn't our imagination, it was merely upsampled.
Technically speaking it _is_ 48 kHz, it's just not any better quality than the 32 kHz version because with any kind of digital media represented by integers, scaling is inherently lossy--you can't just create information out of nothing. Same reason upscaling a raster image doesn't magically enhance the quality and re-rendering a video at a higher framerate doesn't magically make it smoother. A choppy 6FPS video rerendered to 60FPS is still going to be choppy, the framerate's faster but the frames are just duplicated 10 times. Downscaling only works by destroying information, so it is lossy by definition; it should be expected that the reverse can't possibly improve quality. On that note, lossy compression works the same way: by destroying enough information to save space without human-noticeably impacting the quality. The information is destroyed, so the quality can't improve; it can only stay the same or diminish further. You don't see lossy compression when it comes to regular files since their literal meaning is more important than their semantic meaning. An example of lossy compression on plain text would be to take everything I've said in this comment, convert all lowercase letters to uppercase (or vice versa), and strip the punctuation, whitespace and vowels. Doing this would save a lot of space and make the text harder to read, and it would be impossible to programmatically restore the original text. The filler information that wasn't strictly needed for understanding the semantic meaning of the text is the information that was destroyed, or "lost", hence, lossy. I felt the need to explain this because I feel that people misunderstand what lossy compression means, and think that merely copying a lossy format file will somehow degrade its quality, as if they have a shelf life. 
TCHNCLLYSPKNGTS48KHZTSJSTNTNYBTTRQLTYTHNTH32KHZVRSNBCSWTHNYKNDFDGTLMDRPRSNTDBYNTGRSSCLNGSNHRNTLYLSSYYCNTJSTCRTNFRMTNTFNTHNGSMRSNPSCLNGRSTRMGDSNTMGCLLYNHNCTHQLTYNDRRNDRNGVDTHGHRFRMRTDSNTMGCLLYMKTSMTHRCHPPY6FPSVDRRNDRDT60FPSSSTLLGNGTBCHPPYTHFRMRTSFSTRBTTHFRMSRJSTDPLCTD10TMSDWNSCLNGNLYWRKSBYDSTRYNGNFRMTNSTSLSSYBYDFNTNTSHLDBXPCTDTHTTHRVRSCNTPSSBLYMPRVQLTYNTHTNTLSSYCMPRSSNWRKSTHSMWYBYDSTRYNGNGHNFRMTNTSVSPCWTHTHMNNTCBLYMPCTNGTHQLTYTHNFRMTNSDSTRYDSTHQLTYCNTMPRVTCNNLYSTYTHSMRDMNSHFRTHRYDNTSLSSYCMPRSSNWHNTCMSTRGLRFLSSNCTHRLTRLMNNGSMRMPRTNTTHNTHRSMNTCMNNGNXMPLFLSSYCMPRSSNNPLNTXTWLDBTTKVRYTHNGVSDNTHSCMMNTCNVRTLLLWRCSLTTRSTPPRCSRVCVRSNDSTRPTHPNCTTNWHTSPCVWLSNDRPTDLTTRSDNGTHSWLDSVLTFSPCNDMKTHTXTHRDRTRDNDTWLDBMPSSBLTPRGRMMTCLLYRSTRTHRGNLTXTTHFLLRNFRMTNTHTWSNTSTRCTLYNDDFRNDRSTNDNGTHSMNTCMNNGFTHTXTSTHNFRMTNTHTWSDSTRYDRLSTHNCLSSYFLTTHNDTXPLNTHSBCSFLTHTPPLMSNDRSTNDWHTLSSYCMPRSSNMNSNDTHNKTHTMRLYCPYNGLSSYFRMTFLWLLSMHWDGRDTSQLTYSFTHYHVSHLFLF Harder to read is an understatement. That is significantly harder to read. The core information is still there, though. If someone were to attempt to hand-transcribe this as-is and make errors (which is more likely since it appears less comprehensible), that would degrade the information more. Which is not necessarily a worry you'd have on a computer, but when people who think you can turn a jpeg into a png by merely changing the file extension get a hold of it, they can do incremental damage that adds up over time, leading to several copies of the same media floating around that all vary in quality. That's the point: by automated or algorithmic means, it can only stay the same or get worse, it can't improve. AI upscaling is a thing, but the upscale will never match the real original's quality and will have errors and artefacts, it's like sending the gibberish above to language model like chatgpt and asking it to transcribe it in English. It may very well get most of it right, but it'll also drop the ball. 
Here's what chatgpt responded with when I asked it to do just that: -- Technically speaking, 48 kHz is not necessarily better quality than the 32 kHz version because there are many factors that contribute to how an audio signal is represented by an integrated circuit. Currently, the less you can just create for maintenance, the less you can expect that the reverse can possibly improve quality. At the same time, 48 kHz does not make the audio smoother; it just means the frequency is higher, but at 60 fps, it's still going to be happy that the frames are adjusted and placed down 10 times. You should expect that the reverse can’t possibly improve quality unless you’re compressing in a way that’s not entirely like mp3. This is especially true for audio formats where the loss is significant. When it comes to smaller samples, it’s important to understand that the audio conversion process has to be taken into account, and this will definitely affect the quality, even with other sounds. For example, smaller compressed files in this manner do not hold the same quality when compared to larger files that are processed in this way. -- How closely does it match the original? Barely. Not only did it fail to transcribe by opting to paraphrase despite the fact that I didn't ask for that, it didn't even paraphrase it correctly. A human would do better. Sadly, we can't employ millions of humans in a sweatshop to manually upscale (redraw) and up-framerate (tween) our Naruto AMVs to 4K 60fps.
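The lossy text scheme described above (uppercase everything, then strip punctuation, whitespace and vowels) is simple enough to sketch. The function name is made up for illustration, and the sketch assumes plain ASCII English text.

```python
VOWELS = set("AEIOU")


def lossy_squash(text: str) -> str:
    """Uppercase the text, then drop vowels and anything that is not a letter or digit."""
    return "".join(
        ch for ch in text.upper()
        if ch.isalnum() and ch not in VOWELS
    )


opening = "Technically speaking it _is_ 48 kHz, it's just"
print(lossy_squash(opening))

# Different inputs collapse to the same output, so no program can know
# which original to restore: that is what makes the scheme lossy.
print(lossy_squash("bat"), lossy_squash("bit"), lossy_squash("beat"))
```

Copying the squashed string any number of times leaves it unchanged; only running a lossy step again (or guessing at the missing letters) changes the result.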
Considering Super Paper Mario was my main Paper Mario game that i played a lot, and considering I’m an audio nerd, the lower audio was very obvious to me, glad you brought it up in the video
The game is weird in other aspects that something bizarre like 20kHz audio seems par for the course. Music still sounds awesome anyways, probably helped by the distinct style that it has.
for f-zero if they just got the actual tracker file for it, it would have been thousands of times better and would have taken up so much less storage 😭
If I were developing F-Zero 99 I would've still used a pre-recorded version, but I would've used a higher sample rate. With the SNES you would have to emulate the SNES sound chip and its various quirks (such as how panning is handled) to use the original sample and sequence data.
@@PercyPanleo true, although i’m pretty sure they could convert the original format used for the snes’s music to a .mod or .it tracker file. idk i just care about trying to optimize everything to the max lol.
The F-Zero 99 one is really inexcusable. The switch has more than enough power to decode modern audio formats like OGG Vorbis, which would result in both higher quality and lower file size compared to 11khz PCM. Alternatively they could've just emulated it. The SNES sound subsystem (S-DSP / SPC700) is incredibly trivial and fast to emulate. Audio files are tiny at just 64KB for the entire soundtrack. I'm part of a game dev studio that actually has a release on the Switch with dynamic high quality audio. The town theme for that game is split into several instrument tracks. Each is a Q10 (500Kbit) 48khz stereo OGG Vorbis file. When you are near the end of the game and all instruments are playing, it's decoding and playing 9 of these files at once, in real-time, while using less than 10% of the CPU for it.
@@JabjabsNo excuse, it can handle the complex physics of Tears of The Kingdom, Crysis, and 8 bot matches in Smash Ultimate or AMIIBOS with crazy AI that destroy professional players.. And you are telling me it can't handle high quality mp3 files?
@@saricubra2867 oh no that wasn't what I was saying. What i mean is the Tegra X1 has dedicated media decoding thus it should have even less load on the overall system. But even without that, there is stacks of compute to do purely software based decode and mixing. There is no excuse for them to have such poor audio quality.
@@Shotblur That's why i specified OGG **Vorbis**. Which is the actual codec inside the container. Sure, Opus (which mostly uses the OGG container as well) would be even more efficient, but would be more processor intensive. The audio team usually has a very small CPU and memory budget on consoles, so saving resources is key. Vorbis is reasonably efficient (on par with AAC, better than MP3) while being open source and patent free. Also, because it's older and more widely used, there's much more tooling available for Vorbis. If you're using an established audio engine like FMOD, it's much more likely to support Vorbis than Opus.
It is certainly a widely believed myth that higher bit rates result in a more analog sound. High pitched sounds are lost if you use a lower bit rate, resulting in worse audio quality. That's the only difference.
@@igmnk low sample frequency also creates aliasing of high pitch sounds, which many associate with low quality digital audio. It can be avoided by running the signal through low pass filter, then even low sample rate sounds like low quality analog. Even 32khz sample rate would be enough to surpass quality of MC tape.
Nicer sounding versions of a few SPM songs were also put on a Year of Luigi CD (although I'm not 100% sure if it's actually the source files or just cleaned up versions of the 22.05Khz tracks) Similarly, the GameCube version of TTYD's soundtrack was stored at 32Khz, but in the Remake both the new and original soundtracks are stored at 48Khz, and I cannot tell whether the original tracks were actually rerecorded at a higher sample rate or just cleanly upsampled Edit: Okay after doing a direct comparisons I think the SPM ones are just upscales but the TTYD ones Im like 99% sure are source files
@@jeremyie definitely, if I recall correctly, the entire game takes less than 400 MiB of the 4.35 GiB you could have on a single layer Wii disc. Even considering the Speed Flower / Slow Flower items, it'd still have plenty of space to have speed up / slowed down duplicates of the tracks that play in areas where you could get those items...
@@mbc07 i wonder if it has to do with the game being a gamecube game originally and they were way too deep into development to be like. "well, lets rerender the music in higher quality"
@@meyadin5844 well, space wouldn't be an issue on GameCube either (1.35 GiB per disc), so it puzzles me they didn't render in higher quality from start...
Sample rate only affects the maximum frequency which can be captured by the audio, it has nothing to do with the smoothness or fullness of the audio. A 20khz audio recording can store signals up to 10khz. As long as the sample rate is twice the frequency of the sound you wish to record, it will be a perfect recreation of the sound because of the Nyquist-Shannon sampling theorem. A 32khz audio file would be indistinguishable from a 48khz file to most people as only the youngest most cared for ears can hear up to 20khz, also most instruments do not produce important sounds above 10khz, the biggest difference will be in cymbals or other high frequency sounds. Once you get lower than 20khz sample rate more important sounds will begin getting cut off (vocals, guitars, piano, etc etc) The reason higher sample rates than 48khz exist is because some digital audio effects will benefit from the higher sample to reduce 'reflections'
@@MaxLebled you need at least two samples to capture a waveform at a given frequency (hence the highest frequency you can sample is half the sample rate), frequencies above that will be sampled as lower frequency since multiple waveforms fit into the time it takes to capture two samples, the resulting samples will look identical to a lower frequency waveform. (e.g. if you sample at 40000hz, multiple 21000hz waves will get sampled as a 19000hz wave). This can create artefacts in the resulting audio, especially when using effects like distortion which introduce lots of frequencies that were not in the original audio. Increasing the sample rate moves the point at which reflection occurs up, reducing reflections (less high frequency content up there to reflect in the first place, and even if it is reflected the frequency will most likely still be over 20000hz and be inaudible to us).
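The 21000 Hz to 19000 Hz example above can be reproduced directly. This is a minimal sketch with made-up function names: one function folds a frequency back into the representable range, and the other samples a cosine so the two sets of samples can be compared.

```python
import math


def aliased_frequency_hz(signal_hz: float, sample_rate_hz: float) -> float:
    """Frequency a pure tone appears at after sampling with no anti-alias filter."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def sample_cosine(freq_hz: float, sample_rate_hz: float, count: int) -> list[float]:
    """Take `count` samples of a unit cosine at the given sample rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


high = sample_cosine(21000, 40000, 64)
low = sample_cosine(19000, 40000, 64)

print(aliased_frequency_hz(21000, 40000))
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low)))
```

The two sample lists match to within floating-point error, so once the signal is sampled there is no way to tell the 21000 Hz tone from a 19000 Hz one. Raising the sample rate pushes that fold point further above the audible range.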
This video reminded me of the fact that Madagascar for GameCube has its music stuck not only at 22050hz, but in mono as well, making all the music sound muffled and flat. It's a real shame too, since the bump to 44.1khz stereo is astounding. A few tracks got reused in the Madagascar 2 game on Wii, and they sound great. If only the composer had the rights to release the full game soundtrack.
for all these years I've wondered why 3DS OST recordings sound like *that* when there isn't a CD release to rip, little did I know it was the sample rate this whole time!
the human range is from 20 to 20,000 Hz and the sample rate should be twice that to cover all frequencies, so 40,000 Hz, which is the ceiling of quality of digital audio, so DVD and CD sound identical
You need a reconstruction lowpass filter when doing the digital to analog conversion. 48khz is preferable to 44.1khz because this low pass filter doesn't need to be as steep. Steep filters introduce phase shift and are more likely to be a source of noise (the steeper the filter the more stages are required)
Yes! 48khz audio is only 'better' than 44.1khz audio in very niche cases like speeding up or slowing down audio. Not to mention that most instruments don't even go to 15khz let alone 20khz
@@LorelaiLore As the other reply mentioned, the main benefit with 48kHz is with the steepness of the low-pass filter. With 44.1kHz it's easier to end up with a low-pass filter that affects audible frequencies, so 48kHz gives audio engineers a bit more leeway. Although I'm a little confused about the noise point they mentioned. In the situation of a filter not being steep enough for 44.1kHz, I'd expect to hear a roll-off in higher pitch frequencies (On top of the natural roll-off from your ears) rather than noise.
Also computers and programmers like working with nice even numbers, so I can see why some would go for 16 * 3000 over 44.1k even if it doesn't provide much in the way of tangible benefits to the audio quality.
@@angeldude101 This as well 44,1 was only the standard because it was the highest sample rate PCM recorders could do on VHS tape when the CD format was being developed IIRC
@@LoremIpsum1919 low res textures, but feels completely fine. very noticeable, and not a ton of 3d effects. plus since the 3ds is not a 16:9 screen the levels just feel off when playing. very playable, but definitely rayman origins on 3ds the vita version is nearly the same sprite resolution and sound quality as the 360/ps3 versions (obviously not in hd)
One overlooked thing also is the audible metallic/triangle-sounding distortion from mismatching samplerate of source audio and playback of the dac that isn't integers, once you know what that sounds like, you can't unhear it.
I would add that the lower sample rate in F-Zero 99 and Super Paper Mario may have been stylistic choices. Lower sample rates tend to sound tinnier, which gives a sort of "retro" feel to them. Since F-Zero 99 is obviously a retro-styled game, and Super Paper Mario was also trying to capture a retro feel at the time, it would make sense for their music to be intentionally lower quality. I can't help but wonder if the F-Zero 99 music being a lower sample rate than the SNES originals was just the result of beta tester feedback that it didn't sound "retro" enough.
1:08 "In laymen's terms, it's the quality of the audio" Not really. The sample rate ONLY dictates how high of a frequency can be correctly replicated. Bit depth (under 16 bits) is much more akin to "quality" if you want to simplify it in that way
0:28 Only a sparse few tracks from Melee were actual orchestral recordings. Others are just digital arrangements made to sound as much like a live orchestra as possible. I assume it is the same for Kirby Air Ride.
Agreed. For Kirby's Air Ride I think the only ones that were live recordings are the tracks from the animated series. Would have to listen to the full OST to say for certain.
It's worth noting that some tracks from Kirby Air Ride (such as Kirby Melee, Station Fire, and Checker Knights) were originally recorded for the Kirby anime.
more baffling is that the DSi hardware was upgraded to play at sample rates of 44.1khz but the 3DS was downgraded to ~32khz like the og DS. it wasn't very enjoyable to play project mirai.
how are you not gonna mention lossy vs lossless compression and differences in bitrate? Arguably way more important for this videos topic instead of just focusing on sample rate.
A few curiosities off the top of my head: Most music in the PC port of OutRun 2006: Coast 2 Coast is slightly panned to the left The 360 port of Hot Wheels: Beat That has the entire soundtrack downmixed to mono and slightly panned to the left, implying they still exist within a stereo file Different retail and prototype builds of Spyro 1 have different distributions of which songs in the soundtrack are stereo or mono. Reignited contains stereo versions of most of the soundtrack Ratchet & Clank 1's soundtrack is entirely in mono, though the prototypes and demo reel FMVs have short snippets of the original stereo tracks
10:47 oh this just hurts ; - ; the game rip quality of super paper mario is so bad and the ost is absolutely incredible. really hope we get the full quality one day
Finished the video, really good to see people talking about music quality in video games, I did think you were going to talk about older games and systems and their audio but I still found it damn interesting, especially the point about Tetris, man does that subpar quality music with the crisp sound effects bother me. Anyways, good video.
The "Nyquist-Shannon Sampling Theorem" states that a sampling rate of N samples per second produced is suitable to perfectly reproduce any audio wave up to N/2 frequency. People always think that higher sampling rates make audio samples "smoother," but that's not the case at all. A 20 kHz audio wave sampled at 44.1 kHz and played back at that frequency will look exactly the same as if it were sampled at 48 kHz, 96 kHz, or 192 kHz and played back at that frequency. If you look at all these analog waves after playback on an analog oscilloscope, you will see that the waves are exactly the same. Higher sampling frequencies don't make it smoother, that's just what people think because they think that a DAC produces the analog wave during playback, but it doesn't. After every DAC there is always a low pass filter, that's a capacitor and a resistor, and those two produce the analog wave. The DAC is just a variable power source to this circuit and it varies the output voltage. And because it is an analog low-pass filter, it can only produce perfectly smooth analog mixtures of sine waves, it cannot produce anything squeaky or sharp or steppy. The DAC can, but you don't connect your amplifier to the DAC, you connect it to the low pass filter. Today's DAC have such a low pass filter built in directly inside the chip. So all the sampling frequency does is limit the highest frequency that can be reproduced, because anything above N/2 is cut off and not reproduced at all. Bit depths are similarly confusing. People think that an 8-bit sample is more "steppy" in playback than a 16-bit sample, and a 16-bit sample is more steppy than a 24-bit sample, and so on. Again, this is not the case. The low pass filter cannot produce a steppy signal, it's physically impossible, it will always reproduce a smooth signal. The bit depth is simply what defines the signal-to-noise ratio (SNR) of the signal. The fewer bits, the more noise you will have in the final output signal. 
So is 16 bits a good number? Actually, it is. A music tape only has an SNR of about 6 bits. Studio tapes in the 70's and 80's had about 11-12 bits SNR. So 16 bit is above what was considered studio quality back then! Unfortunately, a lot of misinformation and myths have been spread about these topics for decades, but you don't have to take my word for it; you can listen to an audio expert explain it to you, and you can even see what I just said with your own eyes on an analog oscilloscope. Just search YouTube for "Digital Show & Tell xiph.org". The full video title is "D/A and A/D | Digital Show and Tell (Monty Montgomery @ xiph.org)"
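For a rough sense of scale, the textbook approximation for an ideal quantizer driven by a full-scale sine wave is SNR ≈ 6.02 × bits + 1.76 dB. The sketch below applies it to a few bit depths; the function name is illustrative, and the "about 6 bits" and "11-12 bits" tape equivalents are the commenter's figures, not something this code verifies.

```python
def ideal_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer with a full-scale sine input."""
    return 6.02 * bits + 1.76


# 6 and 12 correspond to the tape equivalents quoted in the comment above.
for bits in (6, 8, 12, 16, 24):
    print(f"{bits:>2} bits -> about {ideal_snr_db(bits):.1f} dB")
```

Each extra bit buys roughly 6 dB, so the gap between 8-bit and 16-bit audio is a difference in noise floor, not in "steppiness".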
Super Paper Mario's music being so low quality makes me eternally sad. And the fact that they have that 30 second snippet of the full quality thing... I NEED THE REST! WE DESERVE THE REST!
I knew that Super Paper Mario did something different with music years ago when I got into GCN/Wii emulation. I used Dolphin Emulator to extract the music and they played in VLC Media Player seemingly twice as fast. Two possibilities: VLC bug or that was the file 1:1 and the Wii slowed playback 50% or something. Then I noticed the tinny sounds more easily during gameplay.
the 2 official nintendo-provided audio microcodes for the gamecube and wii, generally referred to as the Zelda-ucode and the AX-ucode, are both designed to not allow sample rates above 32 khz for any sounds or music that goes through the audio processor, HOWEVER, these consoles can ALSO play audio directly from the disc without going through the audio subsystem, and THIS audio is *always* native 48 khz, however, because the audio decoding and playback in this case is being handled by the optical drive itself rather than the audio processor, there are a number of limitations around its use, and any games run from internal storage can't use them. there IS a single game that uses its own audio microcode, and thus has 48 khz audio at all times, it is called BMX-XXX, it is nsfw, it's also not good
@@Firepal3D and to be fair, limiting audio sample rates under normal circumstances to 32 khz really wasn't a bad decision. most adults can't tell the difference between super-wideband and fullband audio, most children don't care, and the reduction in memory and disc space consumption was probably worth it
@@MizoxNG Nintendo has been pretty careful in their audio quality decisions honestly. clever use of live sequencing and good mixing generally makes up for the slightly reduced sound fidelity
@@MizoxNG it is strange though because the GC had the most audio RAM that generation unless some dev decided to use an obscene amount of the unified memory in the Xbox. This choice was probably due to disc space limitations.
This isn't Nintendo related, but the original Japanese version of Sonic Adventure on the Dreamcast had its music streamed at 44.1khz. But when it was localized to the west, the English dub was added in addition to the Japanese audio, so the sample rate for the music was cut in half to 22.05khz in order for both audio tracks to fit. It's a neat instance of music sample rates being changed for localization purposes.
Super Paper Mario is about 400MB, 169MB of which is music. They could have had the sample rate three times higher and it would still fit on even a Playstation 1 CD.
I love my Snake Eater 3DS theme 💚 the trumpets blasting after opening the handheld is so iconic. There aren't many themes with that start that just make you pause.
It’s worth noting that higher Hz isn’t really necessary past a certain point. Human hearing range is 20-20000 Hz. Recordings can accurately record a sound if measured twice as many times a second, meaning past 40000 Hz it’s not super useful. The 96kHz, 192, etc. that you might see on some lossless media sites won’t truly make much of a difference. Plus, even then, most people can’t tell the difference between lossless music in those numbers and lossy media at like 320kbps MP3 files. This includes audiophiles like myself. Anyway, great video!
Past 40000 Hz is useful because of a problem in digital audio called aliasing and anti-aliasing filters. If you record at 40000Hz you will have muffled high frequencies at the limit of the hearing range. This is why when Sony and Philips defined the CD format they used 44.1KHz for headroom so the anti-alias filter does the work better to clean the 20Hz-20KHz without aliasing artifacts or filtering the highs.
Audiophiles *DEFINITELY* can tell a difference between lossless and 320Kbps MP3. MP3 is a flawed format and contains bugs that prevent it from achieving transparency. Meanwhile, modern codecs like Opus can achieve transparency at 160Kbps to 192Kbps
this is why im fine with just 24/48 and dont listen to ppl who fuss about anything beyond that or try to argue Tidal or Qobuz are somehow better than Apple Music. if you have 24/48 at that point the source doesnt matter.
I'd say the number of bits affect the sound as much as the sample rate. A 16 bit, 11 khz sounds mostly muffled, but 8 bit/11 khz you can hear the aliasing quite clear.
12:00 This would only be true if games stored their music in a raw PCM data stream. This is very uncommon because PCM is completely uncompressed and file sizes get very large very quick. A 44.1 kHz uncompressed CD WILL sound better than a 48 kHz compressed .OGG or .MP3 file.
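To see how quickly raw PCM grows, the bit rate is just sample rate × bit depth × channels. A minimal sketch follows; the function names and the three-minute track length are assumptions for illustration, and the 11025 Hz row stands in for the roughly 11 kHz figure mentioned earlier in the thread.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


def pcm_size_mib(sample_rate_hz: int, bit_depth: int, channels: int, seconds: float) -> float:
    """Size in MiB of an uncompressed PCM stream of the given length."""
    return pcm_bitrate_bps(sample_rate_hz, bit_depth, channels) * seconds / 8 / 2**20


TRACK_SECONDS = 180  # hypothetical three-minute track

for rate in (11025, 22050, 32000, 44100, 48000):
    kbps = pcm_bitrate_bps(rate, 16, 2) / 1000
    size = pcm_size_mib(rate, 16, 2, TRACK_SECONDS)
    print(f"{rate} Hz, 16-bit stereo: {kbps:.1f} kbit/s, {size:.1f} MiB per track")
```

For comparison, the 500 kbit/s Vorbis streams mentioned earlier in the thread sit at roughly a third of the CD-quality PCM rate.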
Some games do in fact use PCM audio. The original release of Skyrim did use .WAV files for example (then they ruined it in the special edition by switching to a poorly compressed audio format).
The problem with other compression types like ogg and mp3, is that while you get smaller files, the processing equipment is taxed more to play the files. That's because the processor has to recreate the samples that weren't in the file and that takes time. Streaming PCM takes more space, but there's no real processing needed. The speed of processing may seem trivial today. But in a game where every millisecond counts to the beloved framerate, and consoles often opting for the lowest spec, most reliable hardware they can get away with, I can understand if the choice goes to PCM for games. Especially as storage got cheaper than processing power available. Remember. A full quality 44.1KHz Stereophonic CD could be played back with no real latency with minimal hardware as early as the late 70s. And by the 90s that tech were in toy players for kids. :)
Some things to note: - All Nintendo games use a custom 4bit ADPCM codec which further degrades the audio quality and is not lossless wav so on top of low samplerate, you also got compression. So the audio quality is rarely any better than old MP3 128kbps files, more likely even lower. Anything below 32khz is in fact way worse and I find 22khz close to unlistenable. - I do not know why companies like Nintendo use such inefficient compression methods. 44khz OGG files between 128-256kbps would make the most sense since the audio quality will be indistinguishable from lossless to most and on top of that be SMALLER in filesize. Yet sometimes there are still games using awfully noisy ADPCM with low samplerates. - There is no reason whatsoever to compress audio anymore in our current gen. Games come in tremendous filesizes which for AAA games is often 100GB+ and they do not even bother compressing the textures anymore which would make so much sense... Yet the audio and music side still gets compressed for a measly GB of diskspace and at the cost of quality. - I suspect Nintendo and other companies just do not want you to be able to rip game music in high quality or they are incompetent. But at the same time I wonder why they can't be bothered to release purchasable OSTs. Even those few OSTs available often suffer from quality issues and sometimes they literally pressed the low quality 32khz ingame music onto CDs. Donkey Kong Country Returns coming to mind. - The human hearing tops out around 20khz and even then you need good ears, youth and hardware to hear such details. In fact older folks often only hear up to 16-18khz. - Let's just say I dabbled a lot with game music over the decade and have done the ins and outs of audio engineering, also often focusing on improving the audio quality of compressed VGM.
One additional note to your second last point about hearing. Yes, we can only hear up to 20Khz in ideal situations. But we have to record at double that (40Khz +) to avoid aliasing. Just a quirk of how analog audio is recreated from a digital signal. I do find it funny that some musicians deliberately add aliasing back into their tracks for the effect, something that people worked hard to remove in the first place.
It depends on the Nintendo game, i played Legends Arceus and that game 100% uses uncompressed audio. No difference between listening the music in-game vs the ripped ost (.FLAC). Literally anything before the Switch simply has bad audio quality minus Sony and Microsoft consoles (because they use Hard drives and SSDs).
@@saricubra2867 That is a bit of generalization simply because going back to Sega CD/3DO in the early 90's you could simply pump full 44.1Khz redbook audio in. Space intensive but it was all about the "wow factor" back then.
I have a few things I wanted to say about this subject here:
- In addition to compressing them, the formats Nintendo uses also usually include the looping data, _and_ include any dynamic layers or variants all within the same file, which likely makes for a much smoother workflow.
- When it comes to storing things like sound effects, these are usually at least packaged in a smaller number of files that each hold many sound effects, with many games having (nearly) every sound effect in one file. Having some experience extracting them, this generally does make a noteworthy difference in filesize.
- I'm going to personally disagree with the conclusion that audio shouldn't be compressed in the current gen. I think a lot of games are way too bloated, and having one game take over 100GB is egregious outside of a small handful of outliers. This is already bad on PC, where if you have 1-2TB of storage, these games take up massive amounts of space that add up pretty quickly. A few GB makes a world of difference when you're downloading multiple games. And even with higher overall storage, a lower filesize is still better for allowing more games, and also comes with the benefit of faster downloads. On a Switch, this is especially important. The console itself only has 32GB of storage. And your only extension option is a microSD card, where most people are likely going to get 128-256GB cards. Switch games can't really afford to be too big.
The reason games use ADPCM over better codecs like MP3 and OGG is really simple: much lower CPU use. This is not so much a problem with 1 stream but it becomes one with like 10... And keeping in mind that sound is the poor child and gets like 5~10% CPU budget. Though you could definitely do the sfx in adpcm and the music in mp3/ogg. The other reason why ADPCM is still a thing (aside from habit and being in xbox 360 hardware) is that it has a much, MUCH simpler decoder. The decoder loop holds in maybe half a page of code, which makes things like SIMD optimizing the decoder and interactive music doing all sorts of jumps and loops and exact crossfades much easier since the decoder state is, like, 8 bytes and stored every sample block, so you can do things like state rollbacks.
A few notes: While F-ZERO is sampled, the SNES only mixed samples at 32khz. This is why *good* Genesis soundtracks could sound cleaner than SNES. Emulation can improve that to 48khz; but I have no idea if NERD (the company Nintendo bought and rebranded to make the Wii U and NSO emulators) implemented sound upsampling for SNES games. The actual effect of lowering the sample rate is predominantly heard at higher frequencies. Frequencies below half the sample rate are reproduced perfectly - this is why digital audio works *at all*. Using a lower sample rate is perfectly acceptable for sound effects that don't have high-frequency content. I'm seeing people mention compression (e.g. MP3/AAC/OGG/etc, not "bring the low sounds louder" kind of compression) in the comments. To be clear, a *lot* of games do not compress their audio, because decoding that audio stream is going to eat CPU cycles. Especially on the Wii and 3DS where they really, *really* don't have a lot of CPU time to go around. Even if you do have spare cycles, you have to make sure the decoder and mixer threads aren't competing with each other for system resources and causing audio stutter. None of that excuses such a hilariously low sample size for F-Zero music though lol
9:26 man... nintendo... they could have just saved the original file as 96khz and halved that to get 48khz for the slowed down version. Yeah, 96khz is overkill and requires some extra processing power to be resampled, but i'm sure it would have fit in the performance and file size budget.
F-Zero 99 might actually be a conscious choice to give it a more lo-fi feel, similar to playing the original on a TV with speakers without good treble response.
@@saricubra2867 its because speakers have gotten better. a 32khz file in 91 would sound like a 16khz file in 2024 because of the difference in speaker quality, thats what op was saying
in 2024 i dont see any technical reason for it to be like this, so i think youre right. it really does sound like its coming out of those old crummy speakers
@@enthusiasticgeek7237 The SNES had a Gaussian filter applied to its audio, so even a 44.1 KHz file would still have too much frequency range represented. Earlier models of the Sega Genesis also had a similar filter, only it was a dirty low-pass instead.
one of the 3ds custom themes i made has its background music at 12000Hz fortunately the theme is a gangsta mario theme so the low quality music fits with the aesthetic but still
@@DeevDaRabbit That can't be it because modern texture rendering allows for Nearest-Neighbor sampling (Minecraft uses this texturing method to keep its tiny textures pixel perfect when enlarged). Distortion is usually prevented by the perspective correction built into modern rendering systems, but if that isn't enough, the mesh can be subdivided into smaller polygons.
i'm guessing for f-zero 99, it might've been to replicate how the game sounded on older speakers? i can't say for certain because i don't actually know how it would've sounded, or if older speakers would've sounded that much different at all, but if it would, then maybe? as for why they chose the lower quality version to use in tetris 99, i could think of a few reasons, but i'm not entirely sure on them. could be the same reason it's like that in f-zero 99, could be for consistency, could be just so you don't notice the lower quality in f-zero 99, idk.
Awesome video! Thats absolutely insane that super paper mario got done dirty like that. Its not a huge deal to make one extra track for the time flower!!!
I find it baffling that they're lowering the sample rate instead of the bitrate on whatever codec they're using. Audio codecs have much smarter ways of dropping bits than just chopping off higher frequencies, and when they have to, they will not have to for the entire track.
These games usually used some variation of ADPCM to encode audio, and this method isn’t like other audio codecs where the bitrate can be variable. Compressed ADPCM samples are of a fixed-size, so the only real option to reduce file size further is to reduce the number of samples via decreasing the sample rate.
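To put rough numbers on the fixed-size point above, here is a small Python sketch. It assumes a flat 4 bits per sample and ignores any block headers or per-frame overhead that a real ADPCM variant adds, so the figures are approximate. The function name `adpcm_bytes` is made up for this illustration and is not from any SDK.

```python
def adpcm_bytes(sample_rate_hz: int, seconds: int, channels: int = 2,
                bits_per_sample: int = 4) -> int:
    """Approximate payload size of fixed-rate ADPCM audio.

    Assumption: every sample costs exactly `bits_per_sample` bits and
    there is no header or per-block overhead.
    """
    total_bits = sample_rate_hz * seconds * channels * bits_per_sample
    return total_bits // 8


# A hypothetical 3-minute stereo track at two sample rates.
full_rate = adpcm_bytes(32_000, 180)
half_rate = adpcm_bytes(16_000, 180)

print(full_rate)  # 5760000
print(half_rate)  # 2880000
```

Since the bits per sample are fixed, size scales linearly with sample rate: halving the rate halves the file, which is the only lever left once the codec itself cannot spend fewer bits.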
@@WaluigiSoap I don't understand why they'd be using ADPCM on relatively modern consoles, heck, the DS would have been capable of using a fixed point decoder for vorbis or mp3 with little overhead, let alone the Wii U/3DS/Switch.
@@saricubra2867 depends on the game really. Super Circuit's was so low quality because it was designed to play at 60FPS, in both the main game and the single-pak multiplayer aspect (single-pak mode actually uses its own separate sound engine 💀). It even bitcrushes higher quality samples to play at that 8000hz quality. Most GBA games have their samples in the 12000hz to 16000hz range anyway.
This is fantastic! I've noticed this for years when examining Nintendo games but never knew some of the reasons why (3MB home menu size limit). I'm glad you made a video digging into this!
this was interesting to watch bc i think we can all appreciate how nintendo's sound design itself is just objectively good for the purposes they serve in game but are held back by the fidelity of the sound files themselves. good video man.
Honestly, when I watched footage of F-Zero 99 on RUclips, I knew something was off with the audio as it sounded worse than the SNES original, and this video confirms that it in fact is. Pretty pathetic that Nintendo put a SNES track in a Switch game with a sample rate worse than the original SNES version itself.
As others have already pointed out, the sample rate only impacts the highest frequency that can be recorded, rates of 48kHz topping off at 24 kHz sounds. Past a certain point all you're doing is spending more space, so why are games using 48 these days when 44.1 would be just as suitable? My guess is that the engine is expecting 48 kHz. The difference between the two is only 7.8 kilobytes per second per channel, but with a big enough soundtrack it does add up. The sound track for Tears of the Kingdom is a little over 11 hours long, assuming all this is stored in the game uncompressed, going with 44.1 instead of 48 could've shaved off a bit over 600 megabytes. (odds are they are stored as 256 kbps AAC, but I haven't really looked into the game files, no real reason too either as I have the retail OST)
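For anyone who wants to check the arithmetic in the comment above, here is a rough Python sketch. It assumes uncompressed 16-bit stereo PCM and an 11-hour soundtrack, as the comment does. The helper name `pcm_bytes` is invented for this example.

```python
def pcm_bytes(sample_rate_hz: int, seconds: int, channels: int = 2,
              bytes_per_sample: int = 2) -> int:
    """Size of raw PCM audio in bytes (no container overhead)."""
    return sample_rate_hz * seconds * channels * bytes_per_sample


# Extra bytes per second, per channel, for 48 kHz versus 44.1 kHz at 16 bits.
per_channel_per_second = (48_000 - 44_100) * 2
print(per_channel_per_second)  # 7800

# Savings over an assumed 11-hour stereo soundtrack.
soundtrack_seconds = 11 * 3600
saved = pcm_bytes(48_000, soundtrack_seconds) - pcm_bytes(44_100, soundtrack_seconds)
print(saved)  # 617760000, i.e. a bit over 600 MB
```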
Every engine I know of allows you to use whatever sample rate you want, but there are 2 reasons you would want to use 48kHz as a developer 1) Using 48kHz gives you a lot of flexibility with effects like distorting the audio or changing the speed. 2) Besides a few Linux distros specifically made for playing music on embedded hardware and *maybe* MacOS, pretty much every OS nowadays exclusively outputs audio at 24-bit 48kHz, with only a few allowing you to change it and even fewer allowing you to change it easily. You can always resample audio from a different sample rate to the output sample rate through the game engine, but high quality resampling can be harsh on the CPU, especially with low power embedded SoCs like the one in the Switch. Given this, it's often better to just have all of your audio match the output sample rate at the cost of at most maybe about half a gigabyte if you aren't using compressed audio for some reason instead of potentially harming the games performance
When I made myself a theme for my 2DS I had to crush it down to about 11 kHz as well, because of the required run time to match the original song's looping point. Granted, I specially selected a song that had less frequent details up the higher frequency range, which is what the lower sample rate kills off. Still sounds crappier of course but much less crappy than it would on other songs.
5:02 Fun fact, 16 khz is also the sample rate for almost all of the samples used in the OST of Pokemon Diamond, Pearl and Platinum (and close to it is the sample rate of most samples from Mother 3, at around 15,768 hz, that's actually why Mother 3 sounds so good, for a GBA game anyway).
This is a misconception on how digital audio works. Your speakers do not and cannot play the "stair-step" like signal encoded in digital audio, it first has to be translated into an analogue signal. CDs encode audio at 16 bits and 44.1 Khz because it allows them to reproduce every possible sound within the 0-20 Khz range after it has been translated into analogue audio, which was chosen because it's the ordinary range of human hearing. Increasing the sampling rate does not improve the audio quality if the reproduction range is limited to 20 Khz.
the stair-step visualization is accurate to what un-interpolated PCM samples are. of course, the DAC will produce a continuous signal from this. also you forgot to mention sample-rate upconversions at the software level due to several different sample rates, and nyquist.
One thing to note regarding Air Ride's music is a lot of the seemingly live-recorded music 9 times out of 10 likely originates from the Kirby anime. While we got Kirby: Right Back At Ya! in the west, all the music was replaced since it was a standard practice 4Kids did with the shows they got the license for. Watching the Japanese original features a lot of songs that sound pretty familiar, most infamously Checker Knights (which is probably why that song only appears in Brawl and never again in later Smash games).
Awesome video! honestly the lower sound quality of games like Super Paper Mario wasn't super noticeable by my dumb kid self, ofc I knew it didn't sound as good as something like Mario Galaxy's soundtrack but I never quite wondered why or otherwise gave it much thought. There was a game though, that even for my low standards as a kid was pushing the limit. I highly recommend taking a listen to Harvest Moon: Tree of Tranquility's soundtrack, then listening to basically any other Wii game's. The whiplash... oh man. I wonder what the sample rate for that game's music is, it's gotta be LOW. As a kid I used to joke with my younger sister that the game sounded like it was playing underwater. And don't even get me started on the sound effects oh my GOD. The way the sound came out of the Wii it was as if we were summoning demons from that thing. Lowkey I want to know what the sample rate of ToT's music is but I don't really know how to find that information. If anyone knows please do tell!
The janky NES triangle wave just hits so dang good 🥰🎉. Audio quirks help to make these older game systems charming and memorable. Gotta love those lo-fi vibes.
Slight correction, the Kirby Air Ride and Super Smash Bros. Melee soundtracks weren't actually live orchestral recordings, rather they were made with realistic digital MIDI instruments (although I've heard there were a few tracks from SSBM that were orchestral recordings)
7:42 Sometimes lower audio quality also gets used as an artistic choice, especially with new releases of old games. The BGM in Tetris 99 might be compressed to hell and back, but still fits this type of game very well...
Some extra information might be interesting here, in order to understand how digital audio works. The sampling rate needs to be a bit more than twice the highest frequency you want to record and play back. That is why 44.1kHz was chosen for the Audio CD back in the 1970s. It ends up with a highest usable frequency of 20kHz after processing the audio, though 22.05kHz is technically the highest possible frequency. More than 98% of all humans cannot hear anything higher than 20kHz, so CD audio is already as full sounding as it gets. 48kHz was chosen for digital video, to ensure compatibility with whatever broadcasting standards are used in any country worldwide. The bit depth regulates the noise floor of the recording. At 16 bits, the noise floor is 96dB below the loudest possible sound to record. If you consider that living rooms are usually as noisy as 30dB, even if it's super quiet, you end up with a sound pressure level above 126dB (30dB + 96dB) before you even have a chance of noticing a noise floor. And let me tell you: This is going to hurt. As for sound fidelity, 16 bit 44.1kHz (which is CD audio) is all you will ever need. For recording of course, we might want higher sampling rates and/or bit depths. But that is because of editing the audio, not for fidelity on playback.
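The noise floor figure can be sanity-checked with the usual rule of thumb of roughly 6 dB per bit. The Python sketch below uses the simple 20·log10(2^bits) formula and ignores dither and the small extra term that appears for full-scale sine waves. The function names and the 30 dB room figure are only illustrative assumptions taken from the comment above.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)


def peak_level_to_expose_noise_db(bits: int, room_noise_db: float) -> float:
    """Playback peak level at which the quantization noise floor
    would just reach the room's own noise level."""
    return room_noise_db + dynamic_range_db(bits)


print(round(dynamic_range_db(16), 2))                     # 96.33
print(round(peak_level_to_expose_noise_db(16, 30.0), 2))  # 126.33
print(round(dynamic_range_db(8), 2))                      # 48.16
```

The 8-bit line is there because several consoles mentioned in this thread mix at 8 bits, where the floor sits only about 48 dB below full scale.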
A lot of PC ports of games, when they use wav files instead of mp3 or ogg, use 4-bit ADPCM at 22.05 KHz, along with low quality video codecs at 15 fps. Nowadays games use open source codecs like vorbis and x264/x265.
Recently I compared the PS1 Mega Man X3 to the GameCube version included in the X Collection. The PS1 version is 44.1khz and the GameCube version is 32khz, and the difference is astounding, the PS1 version sounds a lot better, but I was also told this might be because of the ogg compression used in the X Collection.
The DS has some of my favorite oddities in terms of sound. Also, the 3DS bit explains why the soundtrack rips sound the way they do. Ps. Using a smaller sample size to save space for graphics is pretty smart, I mean, most average people would not be able to tell the difference.
@@saricubra2867 I grew up with it, so it bothers me less. I think the crunch adds some charm when the instruments aren't playing long notes. Whereas the Wii is so close to sounding perfect, it bothers me.
unsure if anyone has commented this, but if anyone is wondering why 44.1kHz for CD and 48kHz for DVD were chosen, it's due to the Shannon-Nyquist Theorem, which states that the sample rate must be at least twice the highest frequency you need to encode. since human hearing goes up to 20kHz, at least a 40kHz sample rate is needed, hence the 44.1 and 48kHz (they went a bit above 40kHz to leave room for the filter that removes frequencies higher than 20kHz, as those would otherwise cause aliasing, see below.) so the degradation in low sample rate audio is due to the fact that a sample rate that low can't represent high frequencies. if a higher frequency than supported is encoded, you'll hear a thing called "aliasing", where the frequencies above half the sample rate "cramp Nyquist" and begin to go backwards. so if you have a 20kHz sample rate and try to encode an 11kHz tone, to my understanding it will be folded back down to 9kHz.
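The fold-back described above can be sketched in a few lines of Python. This is only a model of ideal sampling with no anti-aliasing filter in front, and `alias_frequency` is a name made up for this illustration.

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency that a pure tone appears at after ideal sampling
    with no anti-aliasing filter."""
    nyquist = sample_rate_hz / 2
    folded = tone_hz % sample_rate_hz   # spectrum repeats every sample_rate_hz
    if folded > nyquist:                # upper half mirrors back down
        folded = sample_rate_hz - folded
    return folded


print(alias_frequency(11_000, 20_000))  # 9000 (folds back below Nyquist)
print(alias_frequency(5_000, 20_000))   # 5000 (already below Nyquist, unchanged)
print(alias_frequency(25_000, 20_000))  # 5000 (wraps around a full period)
```

In practice an encoder low-passes the signal before downsampling, so the out-of-range content is removed instead of folded, which is why low sample rate tracks usually sound muffled and not full of stray tones.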
Sample rate is not the only factor in music quality. Lossy compression algorithms are far more responsible for bad audio quality. For example, Smash Bros Ultimate uses the Opus codec at a very low bitrate that destroys the clarity of the audio.
@@superthe great explanation
It is worth noting that, with CD using 44.1kHz, it was actually chosen out of convenience, as it's what pre-existing PCM adapters recorded at.
44.1kHz was chosen there as it was the highest sample rate with 16 bits compatible with both NTSC and PAL signals without doing anything like splitting samples across lines.
You can capture analog sound in CD and DVD resolutions. The resolution represents the detail between absolute zero and the loudest sound. I am not sure you are wrong, but something does not seem right.
One fun oddity is that Mario Kart 8 Deluxe's base game music is in 32KHz but all the Booster Course Pass music is in 48KHz
Maybe they took the same files from MK8 without resampling? and for Booster Course they had to resample or create new music tracks entirely depending on the course?
Even the reused Waluigi Pinball/Wario Stadium?
@@kirby2809 No because both tracks just use the same audio file
@@DE23 thats what i thought, but your initial comment suggested otherwise. Thanks for clarifying!
Weird. Are the original Wii U files 48KHz? I've noticed Switch ports of games (the Assassin's Creed collection and Deadly Premonition for example) having atrocious audio quality compared to hardware as old as the Xbox 360
1:59 mario says spongebob
HE DOES LMFAO
classic video
@KingOfSpace the old video about that is very funny
*Discuss.*
Hey man you're right haha
On some SSB4 songs, they just upsampled them from 32 kHz to 48 kHz, so they technically still are 32 kHz despite taking a larger file size
oh god, so they just gained nothing and only lost storage space
Wow ok, rude
I noticed that on the demonstration here. I was noticing it on my studio audio interface and AKG's, so I hooked up a waterfall display, yep clearly clipped sample rates, so it wasn't our imagination, it was merely upsampled.
Technically speaking it _is_ 48 kHz, it's just not any better quality than the 32 kHz version because with any kind of digital media represented by integers, scaling is inherently lossy--you can't just create information out of nothing. Same reason upscaling a raster image doesn't magically enhance the quality and re-rendering a video at a higher framerate doesn't magically make it smoother. A choppy 6FPS video rerendered to 60FPS is still going to be choppy, the framerate's faster but the frames are just duplicated 10 times. Downscaling only works by destroying information, so it is lossy by definition; it should be expected that the reverse can't possibly improve quality.
On that note, lossy compression works the same way: by destroying enough information to save space without human-noticeably impacting the quality. The information is destroyed, so the quality can't improve; it can only stay the same or diminish further. You don't see lossy compression when it comes to regular files since their literal meaning is more important than their semantic meaning. An example of lossy compression on plain text would be to take everything I've said in this comment, convert all lowercase letters to uppercase (or vice versa), and strip the punctuation, whitespace and vowels. Doing this would save a lot of space and make the text harder to read, and it would be impossible to programmatically restore the original text. The filler information that wasn't strictly needed for understanding the semantic meaning of the text is the information that was destroyed, or "lost", hence, lossy. I felt the need to explain this because I feel that people misunderstand what lossy compression means, and think that merely copying a lossy format file will somehow degrade its quality, as if they have a shelf life.
TCHNCLLYSPKNGTS48KHZTSJSTNTNYBTTRQLTYTHNTH32KHZVRSNBCSWTHNYKNDFDGTLMDRPRSNTDBYNTGRSSCLNGSNHRNTLYLSSYYCNTJSTCRTNFRMTNTFNTHNGSMRSNPSCLNGRSTRMGDSNTMGCLLYNHNCTHQLTYNDRRNDRNGVDTHGHRFRMRTDSNTMGCLLYMKTSMTHRCHPPY6FPSVDRRNDRDT60FPSSSTLLGNGTBCHPPYTHFRMRTSFSTRBTTHFRMSRJSTDPLCTD10TMSDWNSCLNGNLYWRKSBYDSTRYNGNFRMTNSTSLSSYBYDFNTNTSHLDBXPCTDTHTTHRVRSCNTPSSBLYMPRVQLTYNTHTNTLSSYCMPRSSNWRKSTHSMWYBYDSTRYNGNGHNFRMTNTSVSPCWTHTHMNNTCBLYMPCTNGTHQLTYTHNFRMTNSDSTRYDSTHQLTYCNTMPRVTCNNLYSTYTHSMRDMNSHFRTHRYDNTSLSSYCMPRSSNWHNTCMSTRGLRFLSSNCTHRLTRLMNNGSMRMPRTNTTHNTHRSMNTCMNNGNXMPLFLSSYCMPRSSNNPLNTXTWLDBTTKVRYTHNGVSDNTHSCMMNTCNVRTLLLWRCSLTTRSTPPRCSRVCVRSNDSTRPTHPNCTTNWHTSPCVWLSNDRPTDLTTRSDNGTHSWLDSVLTFSPCNDMKTHTXTHRDRTRDNDTWLDBMPSSBLTPRGRMMTCLLYRSTRTHRGNLTXTTHFLLRNFRMTNTHTWSNTSTRCTLYNDDFRNDRSTNDNGTHSMNTCMNNGFTHTXTSTHNFRMTNTHTWSDSTRYDRLSTHNCLSSYFLTTHNDTXPLNTHSBCSFLTHTPPLMSNDRSTNDWHTLSSYCMPRSSNMNSNDTHNKTHTMRLYCPYNGLSSYFRMTFLWLLSMHWDGRDTSQLTYSFTHYHVSHLFLF
Harder to read is an understatement. That is significantly harder to read. The core information is still there, though. If someone were to attempt to hand-transcribe this as-is and make errors (which is more likely since it appears less comprehensible), that would degrade the information more. Which is not necessarily a worry you'd have on a computer, but when people who think you can turn a jpeg into a png by merely changing the file extension get a hold of it, they can do incremental damage that adds up over time, leading to several copies of the same media floating around that all vary in quality. That's the point: by automated or algorithmic means, it can only stay the same or get worse, it can't improve. AI upscaling is a thing, but the upscale will never match the real original's quality and will have errors and artefacts. It's like sending the gibberish above to a language model like chatgpt and asking it to transcribe it in English. It may very well get most of it right, but it'll also drop the ball. Here's what chatgpt responded with when I asked it to do just that:
--
Technically speaking, 48 kHz is not necessarily better quality than the 32 kHz version because there are many factors that contribute to how an audio signal is represented by an integrated circuit. Currently, the less you can just create for maintenance, the less you can expect that the reverse can possibly improve quality.
At the same time, 48 kHz does not make the audio smoother; it just means the frequency is higher, but at 60 fps, it's still going to be happy that the frames are adjusted and placed down 10 times.
You should expect that the reverse can’t possibly improve quality unless you’re compressing in a way that’s not entirely like mp3. This is especially true for audio formats where the loss is significant.
When it comes to smaller samples, it’s important to understand that the audio conversion process has to be taken into account, and this will definitely affect the quality, even with other sounds.
For example, smaller compressed files in this manner do not hold the same quality when compared to larger files that are processed in this way.
--
How closely does it match the original? Barely. Not only did it fail to transcribe by opting to paraphrase despite the fact that I didn't ask for that, it didn't even paraphrase it correctly. A human would do better. Sadly, we can't employ millions of humans in a sweatshop to manually upscale (redraw) and up-framerate (tween) our Naruto AMVs to 4K 60fps.
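The vowel-stripping scheme described earlier in this comment is easy to reproduce. The Python sketch below is one possible reading of it (uppercase everything, then drop vowels, punctuation and whitespace); `lossy_squash` is a made-up name, and the exact rules the commenter used are an assumption based on the description.

```python
VOWELS = set("AEIOU")


def lossy_squash(text: str) -> str:
    """Uppercase the text and keep only consonants and digits.

    Assumption: 'Y' counts as a consonant and is kept.
    """
    return "".join(
        ch for ch in text.upper()
        if ch.isalnum() and ch not in VOWELS
    )


original = "Technically speaking it _is_ 48 kHz"
squashed = lossy_squash(original)
print(squashed)  # TCHNCLLYSPKNGTS48KHZ

# Different inputs collapse to the same output, so no decoder can
# know which one was meant. That is what makes the scheme lossy.
print(lossy_squash("bit") == lossy_squash("bait") == lossy_squash("BT"))  # True
```

Running `lossy_squash` on its own output changes nothing further, which matches the point about copying: the loss happens once at encode time, and plain copies of the result do not degrade.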
this is the audiophile version of a scam
Considering Super Paper Mario was my main Paper Mario game that i played a lot, and considering I’m an audio nerd, the lower audio was very obvious to me, glad you brought it up in the video
i like how crunchy it sounds honestly
The game is weird in other aspects that something bizarre like 20kHz audio seems par for the course. Music still sounds awesome anyways, probably helped by the distinct style that it has.
@@angeldude101 my favorite paper mario ost and honestly among my favorite osts of all time
@@Temulgeh I have always figured that the crunchiness was intentional. It was trying to feel kind of retro when it came out, after all.
Love that game. I'm hoping when I'm 80 years old, I will have forgotten enough to do another play-through. Same with Chrono Trigger.
The fact they kneecapped the entire game's music fidelity just to make the speed flower work is just fascinating to me
for f-zero if they just got the actual tracker file for it, it would have been thousands of times better and would have taken up so much less storage 😭
If I were developing F-Zero 99 I would've still used a pre-recorded version, but I would've used a higher sample rate. With the SNES you would have to emulate the SNES sound chip and its various quirks (such as how panning is handled) to use the original sample and sequence data.
@@PercyPanleo true, although i’m pretty sure they could convert the original format used for the snes’s music to a .mod or .it tracker file. idk i just care about trying to optimize everything to the max lol.
tbh the overhead of actually synthesizing the music using an emulated snes audio chipset is probably similar to the overhead of a modern audio codec
@@Firepal3D lol
it's kinda wild how few games seem to actually do this tbh. i'm pretty sure the only AAA game i ever played that does that is unreal gold lmao
12:49 omg hi crunchy audio of mario in a bubble, I love you
The F-Zero 99 one is really inexcusable. The switch has more than enough power to decode modern audio formats like OGG Vorbis, which would result in both higher quality and lower file size compared to 11khz PCM. Alternatively they could've just emulated it. The SNES sound subsystem (S-DSP / SPC700) is incredibly trivial and fast to emulate. Audio files are tiny at just 64KB for the entire soundtrack.
I'm part of a game dev studio that actually has a release on the Switch with dynamic high quality audio. The town theme for that game is split into several instrument tracks. Each is a Q10 (500Kbit) 48khz stereo OGG Vorbis file. When you are near the end of the game and all instruments are playing, it's decoding and playing 9 of these files at once, in real-time, while using less than 10% of the CPU for it.
If anything the Tegra would have a dedicated media decoder that would make it even less resource intensive.
@@JabjabsNo excuse, it can handle the complex physics of Tears of The Kingdom, Crysis, and 8 bot matches in Smash Ultimate or AMIIBOS with crazy AI that destroy professional players..
And you are telling me it can't handle high quality mp3 files?
@@saricubra2867 oh no that wasn't what I was saying. What i mean is the Tegra X1 has dedicated media decoding thus it should have even less load on the overall system. But even without that, there is stacks of compute to do purely software based decode and mixing. There is no excuse for them to have such poor audio quality.
OGG's just a container, and a 20-year-old one at that; hardly modern. Opus is what you mean, I think.
@@Shotblur That's why i specified OGG **Vorbis**. Which is the actual codec inside the container. Sure, Opus (which mostly uses the OGG container as well) would be even more efficient, but would be more processor intensive. The audio team usually has a very small CPU and memory budget on consoles, so saving resources is key. Vorbis is reasonably efficient (on par with AAC, better than MP3) while being open source and patent free. Also, because it's older and more widely used, there's much more tooling available for Vorbis. If you're using an established audio engine like FMOD, it's much more likely to support Vorbis than Opus.
A much better way to describe the sample rate is as 2x the upper bound of audio frequency it can reconstruct, rather than audio quality.
gonna stop you right there, dont even bother. the take away from this video is that sample rate is a determining factor in sound quality.
It is certainly a widely believed myth that higher bit rates result in a more analog sound.
High pitched sounds are lost if you use a lower sample rate, resulting in worse audio quality. That's the only difference.
Yes, you're correct (Nyquist-Shannon theorem) but higher upper bound = better quality, lol.
@@igmnk low sample frequency also creates aliasing of high pitch sounds, which many associate with low quality digital audio. It can be avoided by running the signal through low pass filter, then even low sample rate sounds like low quality analog. Even 32khz sample rate would be enough to surpass quality of MC tape.
Nicer sounding versions of a few SPM songs were also put on a Year of Luigi CD (although I'm not 100% sure if it's actually the source files or just cleaned up versions of the 22.05Khz tracks)
Similarly, the GameCube version of TTYD's soundtrack was stored at 32Khz, but in the Remake both the new and original soundtracks are stored at 48Khz, and I cannot tell whether the original tracks were actually rerecorded at a higher sample rate or just cleanly upsampled
Edit: Okay, after doing direct comparisons I think the SPM ones are just upscales, but the TTYD ones I'm like 99% sure are source files
justice for the spm soundtrack
pretty sure the disc even had plenty of space for all that sweet music, but nope, we get low quality
@@jeremyie definitely, if I recall correctly, the entire game takes less than 400 MiB of the 4.35 GiB you could have on a single layer Wii disc. Even considering the Speed Flower / Slow Flower items, it'd still have plenty of space to have speed up / slowed down duplicates of the tracks that play in areas where you could get those items...
@@mbc07 i wonder if it has to do with the game being a gamecube game originally and they were way too deep into development to be like, "well, let's rerender the music in higher quality"
@@meyadin5844 well, space wouldn't be an issue on GameCube either (1.35 GiB per disc), so it puzzles me they didn't render in higher quality from start...
Sample rate only affects the maximum frequency which can be captured by the audio; it has nothing to do with the smoothness or fullness of the audio. A 20khz audio recording can store signals up to 10khz. As long as the sample rate is more than twice the highest frequency in the sound you wish to record, the sound can be recreated perfectly, per the Nyquist-Shannon sampling theorem.
A 32khz audio file would be indistinguishable from a 48khz file to most people as only the youngest most cared for ears can hear up to 20khz, also most instruments do not produce important sounds above 10khz, the biggest difference will be in cymbals or other high frequency sounds.
Once you get lower than 20khz sample rate more important sounds will begin getting cut off (vocals, guitars, piano, etc etc)
The reason higher sample rates than 48khz exist is because some digital audio effects will benefit from the higher sample to reduce 'reflections'
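To put numbers on the half-the-sample-rate point above, here is a minimal Python sketch. The function name `nyquist_limit_hz` and the list of rates are illustrative picks from rates mentioned in this thread, not values read from any game's files.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


# Sample rates that come up in this thread (illustrative, not exhaustive).
RATES_HZ = [11025, 20000, 22050, 32000, 44100, 48000]

for rate in RATES_HZ:
    limit = nyquist_limit_hz(rate)
    print(f"{rate:>6} Hz sample rate -> content up to {limit:>7.1f} Hz")
```

Running it shows why anything at 22050 Hz or below starts cutting into cymbals and vocal "air", while 32000 Hz and up mostly trims content near the edge of hearing.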
God to be a kid just discovering the treble knob
Could you please elaborate on the "avoiding reflections" bit at higher sample rates?
@@MaxLebled you need at least two samples to capture a waveform at a given frequency (hence the highest frequency you can sample is half the sample rate), frequencies above that will be sampled as lower frequency since multiple waveforms fit into the time it takes to capture two samples, the resulting samples will look identical to a lower frequency waveform. (e.g. if you sample at 40000hz, multiple 21000hz waves will get sampled as a 19000hz wave). This can create artefacts in the resulting audio, especially when using effects like distortion which introduce lots of frequencies that were not in the original audio.
Increasing the sample rate moves the point at which reflection occurs up, reducing reflections (less high frequency content up there to reflect in the first place, and even if it is reflected the frequency will most likely still be over 20000hz and be inaudible to us).
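The folding described above can be sketched in a few lines of Python. This assumes ideal sampling of a pure tone with no anti-aliasing filter in front of it; `alias_frequency_hz` is a hypothetical helper name, not part of any audio library.

```python
def alias_frequency_hz(tone_hz: int, sample_rate_hz: int) -> int:
    """Frequency that an unfiltered tone appears at after sampling.

    Tones at or below half the sample rate come back unchanged; anything
    above that folds ("reflects") back into the 0..sample_rate/2 range.
    """
    remainder = tone_hz % sample_rate_hz
    return min(remainder, sample_rate_hz - remainder)


print(alias_frequency_hz(21000, 40000))  # the 21000 Hz example from the comment
print(alias_frequency_hz(21000, 96000))  # same tone at a higher sample rate
```

At 40000 Hz the 21000 Hz tone lands on 19000 Hz, inside the audible range; at 96000 Hz it stays at 21000 Hz, which is the sense in which a higher rate pushes the reflection point up and out of the way.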
@@conifirouss thank you for the explanation!
I cannot like this comment enough. If only the “audiophile” community would listen
mario said spongebob
discuss
THANK YOU
"On the flipside, we have Super Paper Mario."
Aha! I see what you did there.
This video reminded me of the fact that Madagascar for GameCube has its music stuck not only at 22050hz, but in mono as well, making all the music sound muffled and flat. It's a real shame too, since the bump to 44.1khz stereo is astounding. A few tracks got reused in the Madagascar 2 game on Wii, and they sound great.
If only the composer had the rights to release the full game soundtrack.
As well as some of the songs having shorter 44kHz stereo samples on Madagascar Island Mania (a PC Exclusive)
@@stoyanstoyanov4552 That's true. That game has many key tracks in it.
@D0U8LE_H And I can confirm that the PC version of Madagascar 1 is still using 22050 Hz mono music
@@stoyanstoyanov4552 All console versions of the game do. It's a shame, really. Not even the Xbox version has a higher sample rate.
The Madagascar tie in game fanbase runs deeeep apparently 😂
Assassin’s Creed Valhalla Audio Quality: Hold my beer
And origins and odyssey
Ubisoft games in general tbh
for all these years I've wondered why 3DS OST recordings sound like *that* when there isn't a CD release to rip, little did I know it was the sample rate this whole time!
the human range is from 20 to 20,000 Hz
and the sample rate should be twice that to cover all frequencies, so 40,000 Hz, which is the ceiling of quality of digital audio
so DVD and CD sounds identical
You need a reconstruction lowpass filter when doing the digital to analog conversion. 48khz is preferable to 44.1khz because this low pass filter doesn't need to be as steep. Steep filters introduce phase shift and are more likely to be a source of noise (the steeper the filter the more stages are required)
Yes! 48 kHz audio is only 'better' than 44.1 kHz audio in very niche cases like speeding up or slowing down audio. Not to mention that most instruments don't even go to 15 kHz, let alone 20 kHz
@@LorelaiLore As the other reply mentioned, the main benefit with 48kHz is with the steepness of the low-pass filter. With 44.1kHz it's easier to end up with a low-pass filter that affects audible frequencies, so 48kHz gives audio engineers a bit more leeway. Although I'm a little confused about the noise point they mentioned. In the situation of a filter not being steep enough for 44.1kHz, I'd expect to hear a roll-off in higher pitch frequencies (On top of the natural roll-off from your ears) rather than noise.
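A rough way to see the "leeway" in the filter discussion above is to compute how much room the low-pass filter has between the top of hearing and half the sample rate. This is only a sketch: the 20 kHz hearing limit is an assumed round number, and `transition_band` is a made-up helper, not a filter design tool.

```python
import math

HEARING_LIMIT_HZ = 20_000  # assumption: nominal top of the audible range


def transition_band(sample_rate_hz: float) -> tuple[float, float]:
    """Room between the audible limit and Nyquist, in Hz and in octaves."""
    nyquist_hz = sample_rate_hz / 2
    width_hz = nyquist_hz - HEARING_LIMIT_HZ
    width_octaves = math.log2(nyquist_hz / HEARING_LIMIT_HZ)
    return width_hz, width_octaves


for rate in (44_100, 48_000):
    width_hz, width_oct = transition_band(rate)
    print(f"{rate} Hz: {width_hz:.0f} Hz ({width_oct:.2f} octaves) for the filter to roll off")
```

Under these assumptions 48 kHz gives the filter roughly twice the room that 44.1 kHz does, so it can be less steep for the same amount of rejection.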
Also computers and programmers like working with nice even numbers, so I can see why some would go for 16 * 3000 over 44.1k even if it doesn't provide much in the way of tangible benefits to the audio quality.
@@angeldude101 This as well
44.1 kHz was only the standard because it was the highest sample rate PCM recorders could do on VHS tape when the CD format was being developed IIRC
The Super Paper Mario fact crushes me so much
The low fidelity F-Zero 99 music makes it feel like you're playing some 2000s flash fan-game
11025 Hz was the default sample rate for audio on Flash, that's why.
Rayman Origins on the 3DS uses 11025Hz for both sound effects AND music. You can already imagine how I'd be like!
rayman origins on 3ds sure is rayman origins on 3ds lmao. what a weird port
@@genderender How is it weird.
@@LoremIpsum1919 low res textures, but feels completely fine. very noticeable, and not a ton of 3d effects. plus since the 3ds is not a 16:9 screen the levels just feel off when playing. very playable, but definitely rayman origins on 3ds
the vita version is nearly the same sprite resolution and sound quality as the 360/ps3 versions (obviously not in hd)
One overlooked thing is the audible metallic/triangle-sounding distortion you get when the source audio's sample rate and the DAC's playback rate don't match and the ratio between them isn't an integer. Once you know what that sounds like, you can't unhear it.
that’s what happens when you have no interpolation to fix the discrepancy.
I would add that the lower sample rate in F-Zero 99 and Super Paper Mario may have been stylistic choices. Lower sample rates tend to sound tinnier, which gives a sort of "retro" feel to them. Since F-Zero 99 is obviously a retro-styled game, and Super Paper Mario was also trying to capture a retro feel at the time, it would make sense for their music to be intentionally lower quality. I can't help but wonder if the F-Zero 99 music being a lower sample rate than the SNES originals was just the result of beta tester feedback that it didn't sound "retro" enough.
That comparison between 20KHz and 48KHz was really helpful!
Was going to sub already, but the transition from Snake Eater to F-Zero made it immediate
1:08 "In laymen's terms, it's the quality of the audio" Not really. The sample rate ONLY dictates how high of a frequency can be correctly replicated. Bit depth (under 16 bits) is much more akin to "quality" if you want to simplify it in that way
Exactly!
0:28 Only a sparse few tracks from Melee were actual orchestral recordings. Others are just digital arrangements made to sound as much like a live orchestra as possible. I assume it is the same for Kirby Air Ride.
Agreed. For Kirby's Air Ride I think the only ones that were live recordings are the tracks from the animated series. Would have to listen to the full OST to say for certain.
I believe nintendo's first fully orchestral game was Super Mario Galaxy
It's worth noting that some tracks from Kirby Air Ride (such as Kirby Melee, Station Fire, and Checker Knights) were originally recorded for the Kirby anime.
more baffling is that the DSi hardware was upgraded to play at sample rates of 44.1khz but the 3DS was downgraded to ~32khz like the og DS.
it wasn't very enjoyable to play project mirai.
very informative
i love the editing from 4:08
amazing after credit scene, too-
James-Money's Questionable Microphone Audio Quality
how are you not gonna mention lossy vs lossless compression and differences in bitrate?
Arguably way more important for this videos topic instead of just focusing on sample rate.
Lossy files have to be decoded, which increases buffer times. Maybe it's a concession they make to keep games running smoothly.
A few curiosities off the top of my head:
Most music in the PC port of OutRun 2006: Coast 2 Coast is slightly panned to the left
The 360 port of Hot Wheels: Beat That has the entire soundtrack downmixed to mono and slightly panned to the left, implying they still exist within a stereo file
Different retail and prototype builds of Spyro 1 have different distributions of which songs in the soundtrack are stereo or mono. Reignited contains stereo versions of most of the soundtrack
Ratchet & Clank 1's soundtrack is entirely in mono, though the prototypes and demo reel FMVs have short snippets of the original stereo tracks
10:47 oh this just hurts ; - ; the game rip quality of super paper mario is so bad and the ost is absolutely incredible. really hope we get the full quality one day
Metal Gear Solid: Bit Eater
Finished the video, really good to see people talking about music quality in video games, I did think you were going to talk about older games and systems and their audio but I still found it damn interesting, especially the point about Tetris, man does that subpar quality music with the crisp sound effects bother me. Anyways, good video.
The "Nyquist-Shannon Sampling Theorem" states that a sampling rate of N samples per second is sufficient to perfectly reproduce any audio wave up to a frequency of N/2. People always think that higher sampling rates make audio samples "smoother," but that's not the case at all. A 20 kHz audio wave sampled at 44.1 kHz and played back at that rate will look exactly the same as if it were sampled at 48 kHz, 96 kHz, or 192 kHz and played back at that rate.
If you look at all these analog waves after playback on an analog oscilloscope, you will see that the waves are exactly the same. Higher sampling frequencies don't make it smoother; that's just what people think because they think that a DAC produces the analog wave during playback, but it doesn't. After every DAC there is always a low pass filter, in its simplest form a capacitor and a resistor, and those two produce the analog wave. The DAC is just a variable power source to this circuit and it varies the output voltage. And because it is an analog low-pass filter, it can only produce perfectly smooth analog mixtures of sine waves; it cannot produce anything squeaky or sharp or steppy. The DAC can, but you don't connect your amplifier to the DAC, you connect it to the low pass filter. Today's DACs have such a low pass filter built in directly inside the chip.
So all the sampling frequency does is limit the highest frequency that can be reproduced, because anything above N/2 is cut off and not reproduced at all.
Bit depths are similarly confusing. People think that an 8-bit sample is more "steppy" in playback than a 16-bit sample, and a 16-bit sample is more steppy than a 24-bit sample, and so on. Again, this is not the case. The low pass filter cannot produce a steppy signal, it's physically impossible, it will always reproduce a smooth signal.
The bit depth is simply what defines the signal-to-noise ratio (SNR) of the signal. The fewer bits, the more noise you will have in the final output signal. So is 16 bits a good number? Actually, it is. A music tape only has an SNR of about 6 bits. Studio tapes in the 70's and 80's had about 11-12 bits of SNR. So 16 bit is above what was considered studio quality back then!
Unfortunately, a lot of misinformation and myths have been spread about these topics for decades, but you don't have to take my word for it; you can listen to an audio expert explain it to you, and you can even see what I just said with your own eyes on an analog oscilloscope. Just search YouTube for "Digital Show & Tell xiph.org". The full video title is "D/A and A/D | Digital Show and Tell (Monty Montgomery @ xiph.org)"
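For the bit depth part of the comment above, the usual textbook approximation is that an ideal N-bit quantizer gives about 6.02 x N + 1.76 dB of SNR for a full-scale sine wave. A minimal sketch, assuming that idealized formula (real converters and dithered audio will differ somewhat); `ideal_snr_db` is just an illustrative name:

```python
def ideal_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine wave."""
    return 6.02 * bits + 1.76


# Bit depths mentioned in the comment, plus 8 and 24 for comparison.
for bits in (6, 8, 12, 16, 24):
    print(f"{bits:>2} bits -> about {ideal_snr_db(bits):.1f} dB")
```

Each extra bit buys roughly 6 dB less noise, which is why the comment can compare tape formats and digital audio in terms of "equivalent bits".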
Super Paper Mario's music being so low quality makes me eternally sad. And the fact that they have that 30 second snippet of the full quality thing... I NEED THE REST! WE DESERVE THE REST!
I knew that Super Paper Mario did something different with music years ago when I got into GCN/Wii emulation. I used Dolphin Emulator to extract the music and they played in VLC Media Player seemingly twice as fast. Two possibilities: VLC bug or that was the file 1:1 and the Wii slowed playback 50% or something. Then I noticed the tinny sounds more easily during gameplay.
the 2 official nintendo-provided audio microcodes for the gamecube and wii, generally referred to as the Zelda-ucode and the AX-ucode, are both designed to not allow sample rates above 32 khz for any sounds or music that goes through the audio processor, HOWEVER, these consoles can ALSO play audio directly from the disc without going through the audio subsystem, and THIS audio is *always* native 48 khz, however, because the audio decoding and playback in this case is being handled by the optical drive itself rather than the audio processor, there are a number of limitations around its use, and any games run from internal storage can't use them.
there IS a single game that uses its own audio microcode, and thus has 48 khz audio at all times, it is called BMX-XXX, it is nsfw, it's also not good
oh huh, this is some "deep iceberg" info
@@Firepal3D and to be fair, limiting audio sample rates under normal circumstances to 32 khz really wasn't a bad decision. most adults can't tell the difference between super-wideband and fullband audio, most children don't care, and the reduction in memory and disc space consumption was probably worth it
@@MizoxNG Nintendo has been pretty careful in their audio quality decisions honestly. clever use of live sequencing and good mixing generally makes up for the slightly reduced sound fidelity
@@MizoxNG it is strange though because the GC had the most audio RAM that generation unless some dev decided to use an obscene amount of the unified memory in the Xbox. This choice was probably due to disc space limitations.
@@BurritoKingdom but its optical discs were only 1.4 gigabytes, a fraction of what the PS2 and Xbox had access to
This isn't Nintendo related, but the original Japanese version of Sonic Adventure on the Dreamcast had its music streamed at 44.1khz. But when it was localized to the west, the English dub was added in addition to the Japanese audio, so the sample rate for the music was cut in half to 22.05khz in order for both audio tracks to fit. It's a neat instance of music sample rates being changed for localization purposes.
Why do we have Nintendo OST CDs but not Nintendo OST on Spotify
dawg making a video about audio quality with that mic
I'm glad you made this video.
Super Paper Mario’s music could have been set at 2x speed as a method of saving space since it was originally meant to be a GameCube game.
Super Paper Mario is about 400MB, 169MB of which is music. They could have had the sample rate three times higher and it would still fit on even a Playstation 1 CD.
I dunno if I would want the music to be sped up though, you know
Kinda fine with the tempo it has and what not
@@Skawo Hey! You're the guy who tested stop n' swap on a real N64!
I love my Snake Eater 3DS theme 💚 the trumpets blasting after opening the handheld is so iconic. There aren't many themes with that start that just make you pause.
It’s worth noting that higher Hz isn’t really necessary past a certain point. Human hearing range is 20-20000 Hz. Recordings can accurately record a sound if measured twice as many times a second, meaning past 40000 Hz it’s not super useful. The 96kHz, 192, etc. that you might see on some lossless media sites won’t truly make much of a difference. Plus, even then, most people can’t tell the difference between lossless music in those numbers and lossy media at like 320kbps MP3 files.
This includes audiophiles like myself.
Anyway, great video!
Past 40000 Hz is useful because of a problem in digital audio called aliasing and anti-aliasing filters.
If you record at 40000Hz you will have muffled high frequencies at the limit of the hearing range.
This is why when Sony and Philips defined the CD format they used 44.1KHz for headroom so the anti-alias filter does the work better to clean the 20Hz-20KHz without aliasing artifacts or filtering the highs.
Audiophiles *DEFINITELY* can tell a difference between lossless and 320Kbps MP3. MP3 is a flawed format and contains bugs that prevent it from achieving transparency.
Meanwhile, modern codecs like Opus can achieve transparency at 160Kbps to 192Kbps
@@EmergedFromReddit They don't, unless they are young and have extremely good hearing.
this is why im fine with just 24/48 and dont listen to ppl who fuss about anything beyond that or try to argue Tidal or Qobuz are somehow better than Apple Music. if you have 24/48 at that point the source doesnt matter.
I'd say the number of bits affect the sound as much as the sample rate. A 16 bit, 11 khz sounds mostly muffled, but 8 bit/11 khz you can hear the aliasing quite clear.
That was phenomenal, thanks for posting this as an audio enthusiast, and going full force into another VGM enjoyer phase
I wondered for so long why Super Paper Mario had such garbage audio quality compared to other Wii games.
But good OST
12:00 This would only be true if games stored their music in a raw PCM data stream. This is very uncommon because PCM is completely uncompressed and file sizes get very large very quickly. A 44.1 kHz uncompressed CD WILL sound better than a 48 kHz compressed .OGG or .MP3 file.
Some games do in fact use PCM audio. The original release of Skyrim did use .WAV files for example (then they ruined it in the special edition by switching to a poorly compressed audio format).
The problem with other compression types like ogg and mp3, is that while you get smaller files, the processing equipment is taxed more to play the files. That's because the processor has to recreate the samples that weren't in the file and that takes time.
Streaming PCM takes more space, but there's no real processing needed.
The speed of processing may seem trivial today. But in a game where every millisecond counts to the beloved framerate, and consoles often opting for the lowest spec, most reliable hardware they can get away with, I can understand if the choice goes to PCM for games. Especially as storage got cheaper than processing power available.
Remember. A full quality 44.1KHz Stereophonic CD could be played back with no real latency with minimal hardware as early as the late 70s. And by the 90s that tech was in toy players for kids. :)
Some things to note:
- All Nintendo games use a custom 4bit ADPCM codec which further degrades the audio quality and is not lossless wav so on top of low samplerate, you also got compression. So the audio quality is rarely any better than old MP3 128kbps files, more likely even lower. Anything below 32khz is in fact way worse and I find 22khz close to unlistenable.
- I do not know why companies like Nintendo use such inefficient compression methods. 44khz OGG files between 128-256kbps would make the most sense, since the audio quality will be indistinguishable from lossless to most and on top of that be SMALLER in filesize. Yet sometimes there are still games using awfully noisy ADPCM with low samplerates.
- There is no reason whatsoever to compress audio anymore in our current gen. Games come in tremendous filesizes which for AAA games is often 100GB+ and they do not even bother compressing the textures anymore which would make so much sense... Yet the audio and music side still gets compressed for a measly GB of diskspace and at the cost of quality.
- I suspect Nintendo and other companies just do not want you to be able to rip game music in high quality, or they are incompetent. But at the same time I wonder why they can't be bothered to release purchasable OSTs. Even those few OSTs available often suffer from quality issues, and sometimes they literally pressed the low quality 32khz ingame music onto CDs. Donkey Kong Country Returns comes to mind.
- The human hearing tops out around 20khz and even then you need good ears, youth and hardware to hear such details. In fact older folks often only hear up to 16-18khz.
- Lets just say I dabbled a lot with game music over the decade and have done the ins and outs of audio engineering, also often focusing on improving the audio quality of compressed VGM.
One additional note to your second last point about hearing. Yes, we can only hear up to 20Khz in ideal situations. But we have to record at double that (40Khz +) to avoid aliasing. Just a quirk of how analog audio is recreated from a digital signal.
I do find it funny that some musicians deliberately add aliasing back into their tracks for the effect, something that people worked hard to remove in the first place.
It depends on the Nintendo game. I played Legends Arceus and that game 100% uses uncompressed audio. No difference between listening to the music in-game vs the ripped OST (.FLAC). Literally anything before the Switch simply has bad audio quality, minus Sony and Microsoft consoles (because they use hard drives and SSDs).
@@saricubra2867 That is a bit of generalization simply because going back to Sega CD/3DO in the early 90's you could simply pump full 44.1Khz redbook audio in. Space intensive but it was all about the "wow factor" back then.
I have a few things I wanted to say about this subject here:
- In addition to compressing them, the formats Nintendo uses also usually include the looping data, _and_ include any dynamic layers or variants all within the same file, which likely makes for a much smoother workflow.
- When it comes to storing things like sound effects, these are usually at least packaged in a smaller amount of files that each have many sound effects, with many games having (nearly) every sound effect in one file. Having some experience extracting them, this generally does make a noteworthy difference in filesize.
- I'm going to personally disagree with the conclusion that audio shouldn't be compressed in the current gen. I think a lot of games are way too bloated, and having one game take over 100GB is egregious outside of a small handful of outliers. This is already bad on PC, where if you have 1-2 TB of storage, these games take up massive amounts of space that add up pretty quickly. A few GB makes a world of difference when you're downloading multiple games. And even with higher overall storage, a lower filesize is still better for allowing more games, and also comes with the benefit of faster downloads. On a Switch, this is especially important. The console itself only has 32GB of storage. And your only extension option is a microSD card, where most people are likely going to get 128-256GB cards. Switch games can't really afford to be too big.
The reason games use ADPCM over better codecs like MP3 and OGG is really simple: much lower CPU use. This is not so much a problem with 1 stream but it becomes one with like 10... And keeping in mind that sound is the poor child and gets like 5~10% CPU budget. Though you could definitely do the sfx in adpcm and the music in mp3/ogg.
The other reason why ADPCM is still a thing (aside from habit and being in xbox 360 hardware) is that it has a much, MUCH simpler decoder. The decoder loop holds in maybe half a page of code, which makes things like SIMD optimizing the decoder and interactive music doing all sorts of jumps and loops and exact crossfades much easier since the decoder state is, like, 8 bytes and stored every sample block, so you can do things like state rollbacks.
A few notes:
While F-ZERO is sampled, the SNES only mixed samples at 32khz. This is why *good* Genesis soundtracks could sound cleaner than SNES. Emulation can improve that to 48khz; but I have no idea if NERD (the company Nintendo bought and rebranded to make the Wii U and NSO emulators) implemented sound upsampling for SNES games.
The actual effect of lowering the sample rate is predominantly heard at higher frequencies. Frequencies below half the sample rate are reproduced perfectly - this is why digital audio works *at all*. Using a lower sample rate is perfectly acceptable for sound effects that don't have high-frequency content.
I'm seeing people mention compression (e.g. MP3/AAC/OGG/etc, not "bring the low sounds louder" kind of compression) in the comments. To be clear, a *lot* of games do not compress their audio, because decoding that audio stream is going to eat CPU cycles. Especially on the Wii and 3DS where they really, *really* don't have a lot of CPU time to go around. Even if you do have spare cycles, you have to make sure the decoder and mixer threads aren't competing with each other for system resources and causing audio stutter.
None of that excuses such a hilariously low sample rate for F-Zero music though lol
9:26 man... nintendo... they could have just saved the original file as 96khz and halved that to get 48khz for the slowed down version. Yeah, 96khz is overkill, and requires some extra processing power to be resampled, but I'm sure it would have fit in the performance and file size budget.
F-Zero 99 might actually be a conscious choice to give it a more lo-fi feel, similar to playing the original on a TV with speakers without good treble response.
But the SNES outputs at 32KHz. I mean you have a game from 2023/2024 with worse audio quality than one from 1991.
@@saricubra2867 its because speakers have gotten better. a 32khz file in 91 would sound like a 16khz file in 2024 because of the difference in speaker quality, thats what op was saying
in 2024 i dont see any technical reason for it to be like this, so i think youre right. it really does sound like its coming out of those old crummy speakers
@@enthusiasticgeek7237 The SNES had a Gaussian filter applied to its audio, so even a 44.1 KHz file would still have too much frequency range represented.
Earlier models of the Sega Genesis also had a similar filter, only it was a dirty low-pass instead.
ruclips.net/video/5XAOS1wcWX0/видео.html
I'd do anything for high quality SPM soundtrack
one of the 3ds custom themes i made has its background music at 12000Hz fortunately the theme is a gangsta mario theme so the low quality music fits with the aesthetic but still
I want gangsta Mario on my modded 3DS please tell me how I can get gangsta Mario on my modded 3DS
How the heck is F-Zero99 1.4GB, that’s crazy…
Maybe the sprites and map textures were enlarged to prevent distortion.
@@DeevDaRabbit That shouldn't be the case since even SNES Emulators can make Mode 7 look super clear
@@anonanonymous9670 This isn't mode 7 anymore
@@DeevDaRabbit That can't be it because modern texture rendering allows for Nearest-Neighbor sampling (Minecraft uses this texturing method to keep its tiny textures pixel perfect when enlarged). Distortion is usually prevented by the perspective correction built into modern rendering systems, but if that isn't enough, the mesh can be subdivided into smaller polygons.
@@ConcavePgons Well that's the only thing I can think of that makes the game so big
Super interesting video! So glad this popped up in my recommended today, and can’t wait to watch more of your videos soon! :)
i'm guessing for f-zero 99, it might've been to replicate how the game sounded on older speakers? i can't say for certain because i don't actually know how it would've sounded, or if older speakers would've sounded that much different at all, but if it would, then maybe?
as for why they chose the lower quality version to use in tetris 99, i could think of a few reasons, but i'm not entirely sure on them. could be the same reason it's like that in f-zero 99, could be for consistency, could be just so you don't notice the lower quality in f-zero 99, idk.
Awesome video! Thats absolutely insane that super paper mario got done dirty like that. Its not a huge deal to make one extra track for the time flower!!!
I find it baffling that they're lowering the sample rate instead of the bitrate on whatever codec they're using. Audio codecs have much smarter ways of dropping bits than just chopping off higher frequencies, and when they have to, they will not have to for the entire track.
These games usually used some variation of ADPCM to encode audio, and this method isn’t like other audio codecs where the bitrate can be variable. Compressed ADPCM samples are of a fixed-size, so the only real option to reduce file size further is to reduce the number of samples via decreasing the sample rate.
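The fixed-size point above can be illustrated with some quick arithmetic. This sketch assumes a plain 4 bits per sample and ignores any per-block headers or loop metadata that real console formats add; `adpcm_size_bytes` and the three-minute track length are hypothetical.

```python
BITS_PER_ADPCM_SAMPLE = 4  # assumption: plain 4-bit ADPCM, block headers ignored


def adpcm_size_bytes(seconds: float, sample_rate_hz: int, channels: int = 2) -> int:
    """Approximate size of a fixed-rate ADPCM stream.

    With a fixed number of bits per sample, the only knobs left are
    duration, channel count, and sample rate.
    """
    total_samples = int(seconds * sample_rate_hz) * channels
    return total_samples * BITS_PER_ADPCM_SAMPLE // 8


TRACK_SECONDS = 180  # hypothetical three-minute stereo track
for rate in (48_000, 32_000, 22_050):
    size_mb = adpcm_size_bytes(TRACK_SECONDS, rate) / 1_000_000
    print(f"{rate} Hz: about {size_mb:.2f} MB")
```

Because size scales linearly with the sample rate here, halving the rate is the only way to halve the file, which is the trade-off the comment is describing.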
@@WaluigiSoap I don't understand why they'd be using ADPCM on relatively modern consoles, heck, the DS would have been capable of using a fixed point decoder for vorbis or mp3 with little overhead, let alone the Wii U/3DS/Switch.
Super Circuit's audio quality is 8000hz 😭
fitting name
Wtf
Not even stable and extremely aliased, the GBA sound quality always has been trash.
@@saricubra2867 depends on the game really.
Super Circuit's was so low quality because it was designed to play at 60FPS, in both the main game and the single-pak multiplayer aspect (single-pak mode actually uses its own separate sound engine 💀). It even bitcrushes higher quality samples to play at that 8000hz quality
Most GBA games have their samples in the 12000hz to 16000hz in terms of quality anyway.
@@MarioKartSuperCircuit SNES sounds way, way better than GBA.
Super interesting video! Looking forward to more!
thought you should know this appeared in my recommended and i chose to eat food to it
This is fantastic! I've noticed this for years when examining Nintendo games but never knew some of the reasons why (3MB home menu size limit). I'm glad you made a video digging into this!
Did that newscaster at the beginning imply 8bit chiptune was “annoying”!?!?
1:59 Mario says SpongeBob?!?!
Discuss.
Wonderful video! I'm so glad I found your channel
6:00 - I play f-zero 99 a whole bunch, and I love that they just ripped the direct tracks with no changes. Makes it feel more accurate!
this was interesting to watch bc i think we can all appreciate how nintendo's sound design itself is just objectively good for the purposes they serve in game but are held back by the fidelity of the sound files themselves. good video man.
It's insane how they will work on these insane soundtracks and just never release them ever
This didn't age well
Honestly, when I watched footage of F-Zero 99 on YouTube, I knew something was off with the audio as it sounded worse than the SNES original, and this video confirms that it in fact is. Pretty pathetic that Nintendo put a SNES track in a Switch game with a sample rate worse than the original SNES version itself.
As others have already pointed out, the sample rate only impacts the highest frequency that can be recorded, rates of 48kHz topping off at 24 kHz sounds. Past a certain point all you're doing is spending more space, so why are games using 48 these days when 44.1 would be just as suitable? My guess is that the engine is expecting 48 kHz.
The difference between the two is only 7.8 kilobytes per second per channel (at 16 bits), but with a big enough soundtrack it does add up. The soundtrack for Tears of the Kingdom is a little over 11 hours long. Assuming all of it is stored in the game as uncompressed stereo, going with 44.1 instead of 48 could've shaved off a bit over 600 megabytes. (Odds are the tracks are stored as 256 kbps AAC, but I haven't really looked into the game files, and there's no real reason to either as I have the retail OST.)
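For anyone who wants to check that arithmetic, here is a rough sketch. The 16-bit depth, stereo layout and flat 11-hour runtime are assumptions for illustration, not something confirmed from the game files.

```python
# Assumptions (not confirmed from the game files): uncompressed 16-bit
# stereo PCM and a flat 11-hour soundtrack.
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 2           # stereo
HOURS = 11


def pcm_bytes(sample_rate_hz, seconds,
              channels=CHANNELS, bytes_per_sample=BYTES_PER_SAMPLE):
    """Size in bytes of uncompressed PCM audio."""
    return sample_rate_hz * bytes_per_sample * channels * seconds


seconds = HOURS * 3600

# Extra bytes per second, per channel, when using 48 kHz instead of 44.1 kHz.
per_channel_diff = (48_000 - 44_100) * BYTES_PER_SAMPLE

# Total bytes saved over the whole soundtrack by storing it at 44.1 kHz.
saved = pcm_bytes(48_000, seconds) - pcm_bytes(44_100, seconds)

print(per_channel_diff)       # 7800
print(saved / 1_000_000)      # 617.76
```

So about 7.8 KB per second per channel, and roughly 618 MB over 11 hours of stereo, which matches the "bit over 600 megabytes" estimate.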
Every engine I know of allows you to use whatever sample rate you want, but there are 2 reasons you would want to use 48kHz as a developer
1) Using 48kHz gives you a lot of flexibility with effects like distorting the audio or changing the speed.
2) Besides a few Linux distros specifically made for playing music on embedded hardware and *maybe* macOS, pretty much every OS nowadays exclusively outputs audio at 24-bit 48kHz, with only a few allowing you to change it and even fewer allowing you to change it easily. You can always resample audio from a different sample rate to the output sample rate through the game engine, but high quality resampling can be harsh on the CPU, especially with low power embedded SoCs like the one in the Switch. Given this, it's often better to just have all of your audio match the output sample rate, at the cost of at most maybe half a gigabyte if you aren't using compressed audio for some reason, instead of potentially harming the game's performance.
When I made myself a theme for my 2DS I had to crush it down to about 11 kHz as well, because of the required run time to match the original song's looping point.
Granted, I specially selected a song that had less frequent details up the higher frequency range, which is what the lower sample rate kills off. Still sounds crappier of course but much less crappy than it would on other songs.
5:02 Fun fact, 16 khz is also the sample rate for almost all of the samples used in the OST of Pokemon Diamond, Pearl and Platinum (and close to it is the sample rate of most samples from Mother 3, at around 15,768 hz, that's actually why Mother 3 sounds so good, for a GBA game anyway).
Only talking about sample rate and not bit rate is absolutely wrong.
Also i wish you talked more about snes audio.
i am so glad someone finally made a video about this, particularly on SPM
This is a misconception on how digital audio works. Your speakers do not and cannot play the "stair-step" like signal encoded in digital audio; it first has to be translated into an analogue signal. CDs encode audio at 16 bits and 44.1 kHz because it allows them to reproduce every possible sound within the 0-20 kHz range after it has been translated into analogue audio, which was chosen because it's the ordinary range of human hearing. Increasing the sampling rate does not improve the audio quality if the reproduction range is limited to 20 kHz.
the stair-step visualization is accurate to what un-interpolated PCM samples are.
of course, the DAC will produce a continuous signal from this.
also you forgot to mention sample-rate upconversions at the software level due to several different sample rates, and nyquist.
One thing to note regarding Air Ride's music is a lot of the seemingly live-recorded music 9 times out of 10 likely originates from the Kirby anime. While we got Kirby: Right Back At Ya! in the west, all the music was replaced since it was a standard practice 4Kids did with the shows they got the license for. Watching the Japanese original features a lot of songs that sound pretty familiar, most infamously Checker Knights (which is probably why that song only appears in Brawl and never again in later Smash games).
This is a very good video, deserves more views!
Awesome video! honestly the lower sound quality of games like Super Paper Mario wasn't super noticeable by my dumb kid self, ofc I knew it didn't sound as good as something like Mario Galaxy's soundtrack but I never quite wondered why or otherwise gave it much thought. There was a game though, that even for my low standards as a kid was pushing the limit. I highly recommend taking a listen to Harvest Moon: Tree of Tranquility's soundtrack, then listening to basically any other Wii game's. The whiplash... oh man. I wonder what the sample rate for that game's music is, it's gotta be LOW. As a kid I used to joke with my younger sister that the game sounded like it was playing underwater. And don't even get me started on the sound effects oh my GOD. The way the sound came out of the Wii it was as if we were summoning demons from that thing. Lowkey I want to know what the sample rate of ToT's music is but I don't really know how to find that information. If anyone knows please do tell!
The janky NES triangle wave just hits so dang good 🥰🎉. Audio quirks help to make these older game systems charming and memorable. Gotta love those lo-fi vibes.
Slight correction, the Kirby Air Ride and Super Smash Bros. Melee soundtracks weren't actually live orchestral recordings, rather they were made with realistic digital MIDI instruments (although I've heard there were a few tracks from SSBM that were orchestral recordings)
7:42 Sometimes lower audio quality also gets used as an artistic choice, especially with new releases of old games. The BGM in Tetris 99 might be compressed to hell and back, but still fits this type of game very well...
+*plays sample*
-Oh this sounds so nice
+This is TRASH
-Oh
Some extra information might be interesting here, in order to understand how digital audio works. The sampling rate needs to be a bit more than twice the highest frequency you want to record and play back. That is why 44.1kHz was chosen for the Audio CD back in the 1970's. It ends up with a highest usable frequency of about 20kHz after filtering the audio, though 22.05 kHz is technically the highest possible frequency. More than 98% of all humans cannot hear anything higher than 20kHz, so CD audio is already as full sounding as it gets. 48kHz was chosen for digital video, to ensure compatibility with whatever broadcasting standards are used in any country worldwide.
The bit depth sets the noise floor of the recording. At 16 bits, the noise floor is 96dB below the loudest possible sound you can record. If you consider that living rooms are usually as noisy as 30dB, even when it's super quiet, you end up needing a sound pressure level above 126dB (30dB + 96dB) before you even have a chance of noticing the noise floor. And let me tell you: This is going to hurt. As for sound fidelity, 16 bit 44.1kHz (which is CD audio) is all you will ever need.
For recording of course, we might want higher sampling rates and/or bit depths. But that is because of editing the audio, not for fidelity on playback.
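A quick way to sanity-check the 96dB figure: the ratio between full scale and one quantization step at N bits is 2^N, which in decibels is 20·log10(2^N), or roughly 6dB per bit. A minimal sketch, assuming that simple model (no dither or noise shaping) and the 30dB room noise figure mentioned in this comment:

```python
import math


def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)


ROOM_NOISE_DB = 30  # assumed quiet living room, figure from this comment

for bits in (8, 16, 24):
    dr = dynamic_range_db(bits)
    # Playback level at which the quantization noise floor would just
    # reach the level of the room noise.
    print(bits, round(dr, 1), round(ROOM_NOISE_DB + dr, 1))
```

For 16 bits this comes out at about 96dB of range, so the noise floor only pokes above a 30dB room once the peaks are around 126dB.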
The day a CD Release of the Mario & Luigi Dream Team (Bros.) Soundtrack exists, I will cry of joy
A lot of PC ports of games that use WAV files instead of MP3 or Ogg use 4-bit ADPCM at 22.05 kHz, along with low quality video codecs at 15 fps. Nowadays games use open source codecs like Vorbis and x264/x265.
Recently I compared the PS1 Mega Man X3 to the GameCube version included in the X Collection. The PS1 version is 44.1 kHz and the GameCube version is 32 kHz, and the difference is astounding: the PS1 version sounds a lot better. I was also told this might be because of the Ogg compression used in the X Collection.
Well researched, great depth. Thanks!
The DS has some of my favorite oddities in terms of sound.
Also, the 3DS bit explains why the soundtrack rips sound the way they do.
P.S. Using a lower sample rate to save space for graphics is pretty smart, I mean, most average people would not be able to tell the difference.
It doesn't only affect file size but also the size in ram and the cpu usage to play back which can matter a lot for performance
Also on DS, it wasn't only the sample rate that mattered but also the interpolation: the hardware had no interpolation, which created quite a lot of aliasing
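To picture what "no interpolation" means when a low-rate sample is played back at a higher output rate, here is a toy sketch. It is not how the DS mixer is actually implemented, and the function names are made up for illustration. It just compares repeating each sample (hold) with drawing straight lines between samples (linear).

```python
def upsample_hold(samples, factor):
    """No interpolation: repeat each input sample `factor` times."""
    return [s for s in samples for _ in range(factor)]


def upsample_linear(samples, factor):
    """Linear interpolation between neighbours (the last sample is held)."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        for k in range(factor):
            out.append(s + (nxt - s) * k / factor)
    return out


low_rate = [0, 8, 0, -8]  # a tiny made-up waveform
print(upsample_hold(low_rate, 4))
print(upsample_linear(low_rate, 4))
```

The hold version jumps straight from 0 to 8 and back, and those hard edges are extra high-frequency content that wasn't in the original sample, which is the crunchy aliasing you hear. The linear version ramps between values, so the edges are softer.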
Also in Super Paper Mario's case I always thought the crunchy audio was a stylistic choice.
Thank God I'm not the only one who noticed/is bothered by Super Paper Mario's sound quality.
Now try listening to Game Boy Advance music. I can't stand how it sounds, it's even worse than a Megadrive/Genesis.
@@saricubra2867 I grew up with it, so it bothers me less. I think the crunch adds some charm when the instruments aren't playing long notes. Whereas the Wii is so close to sounding perfect, it bothers me.
Great video bro!
Am I the only one who thought that the quality of the F-Zero tracks' audio was to make it feel like it was coming from an old CRT?
unsure if anyone has commented this, but if anyone is wondering why 44.1kHz for CD and 48kHz for DVD were chosen, it’s due to the Shannon-Nyquist Theorem which states that the sample rate must be at least twice the highest frequency you need to encode. since human hearing goes up to 20kHz, at least a 40kHz sample rate is needed, hence the 44.1 and 48kHz (they went a bit above what humans can hear so there's room to filter out the frequencies higher than 20kHz, which would otherwise cause aliasing, see below.)
so the degradation in low sample rate audio is due to the fact that a sample rate that low can’t represent high frequencies. if a frequency higher than half the sample rate (the Nyquist frequency) is encoded, you’ll hear a thing called “aliasing” where those frequencies fold back around Nyquist and come out as lower ones. so if you have a 20kHz tone and try to encode it at an 11kHz sample rate, to my understanding it gets folded back down to 2kHz (20 - 11 = 9, which is still above the 5.5kHz Nyquist limit, so it folds again to 11 - 9 = 2).
Nintendo should just give us a music streaming app bundled with the NSO app so people don't need to use questionable means
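The folding can be worked out with a couple of lines. This is a minimal sketch for a single pure tone sampled with no anti-aliasing filter. The helper name is just for illustration.

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Frequency at which a pure tone of f_hz shows up after sampling
    at sample_rate_hz with no anti-aliasing filter."""
    folded = f_hz % sample_rate_hz
    # Anything above Nyquist (half the sample rate) mirrors back below it.
    return min(folded, sample_rate_hz - folded)


print(alias_frequency(20_000, 11_000))  # 2000
print(alias_frequency(4_000, 11_000))   # 4000 (below Nyquist, unchanged)
print(alias_frequency(7_000, 11_000))   # 4000 (mirrored around 5500)
```

Note that a 7kHz tone and a 4kHz tone become indistinguishable at an 11kHz sample rate, which is why the high end has to be filtered out before downsampling.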
Yes, by the way, Kirby Planet Robobot is on Spotify
Sample rate is not the only factor in music quality. Lossy compression algorithms are far more responsible for bad audio quality. For example, Smash Bros. Ultimate uses the Opus codec at a very low bitrate, which destroys the clarity of the audio.