Why Higher Bit Depth and Sample Rates Matter in Music Production
- Published: 7 Feb 2025
- What is the benefit of using higher bit depth and sample rate in a DAW session for recording music? Should you use 16-bit, 24-bit, or 32-bit floating point? Is it worth recording music in 96kHz or 192kHz, or is 48kHz sample rate good enough?
Watch Part 1 here: • Debunking the Digital ...
Watch this video to learn more about sample rates in music production (Dan Worrall and Fabfilter): • Samplerates: the highe...
Dan Worrall YouTube Channel: / @danworrall
Fabfilter YouTube Channel: / @fabfilter
This video includes excerpts from "Digital Show & Tell", a video that was originally created by Christopher "Monty" Montgomery and xiph.org. The video has been adapted to make the concepts more accessible to viewers by providing context and commentary throughout the video.
"Digital Show & Tell" is distributed under a Creative Commons Attribution-ShareAlike (BY-SA) license. Learn more here: creativecommon...
Watch the full video here: • Digital Show & Tell ("...
Original Video: xiph.org/video...
Learn More: people.xiph.or...
Book a one to one call:
audiouniversit...
Website: audiouniversit...
Facebook: / audiouniversityonline
Twitter: / audiouniversity
Instagram: / audiouniversity
Patreon: / audiouniversity
#AudioUniversity
Disclaimer: This description contains affiliate links, which means that if you click them, I will receive a small commission at no cost to you.
Fantastic breakdown. I went through a 192kHz phase about 15 years ago and suffice it to say... lack of hard drive space and the computing power of the day cured me of that pretty quickly. 🤣
@MF Nickster I don’t know about other DAWs, but REAPER lets you move audio clips on a sub-sample level
Hah. I have an old MOTU UltraLite mk3 kicking around that advertised 192kHz, and in the day it was the new hotness. I was just like O.O at ever wanting to record that.
Holy moly you’re brave. About a year ago I started mostly working at 88.2/96. I’ve been blown away by how far computing power has come as performance is super smooth these days and the results are quality but man a decade+ later I still feel you on that storage battle. Can’t even imagine what you went through. Whenever I break and print my hardware inserts in the session it’s basically a “lol -12gb” button for my C drive
96kHz is all you need even if you are a believer in this stuff. Not a fan of 192.
It's interesting. Shot a concert recently with ≈ 10 cameras and multicammed it in the NLE. We used Sony Venice across the board and shot X-OCN ST. That's about 5GB per minute. We recorded ≈ 100 channels of 24/48 on the audio side. If we had recorded at 192kHz we would have still only wound up with around 45% less data than the video. From a purely data storage and bitrate standpoint, PCM audio is not really all that terrible. Sony X-OCN is pretty efficient actually. ProRes is a real hog on the other hand, but notably, unlike PCM, neither X-OCN nor ProRes are "uncompressed". Undoubtedly the resource hog for audio is plugins, because unlike video, where VFX and color correction tend to be tertiary operations (i.e. entirely non-real time and done by someone else), audio engineers are typically focusing on editing and finishing within the same software environment and in real time.
Thumbs up for the Dan Worrall link. His and Audio University's videos are the top vids on YT to watch.
I love Dan’s videos! Thank you for the kind words, Sven!
I've slowly been learning the benefits of oversampling for the last few years, and before final mix export I'll spend an hour or so applying oversampling on every plug-in that offers the option on every mixer track.
This video really solidified my knowledge and affirmed that me spending that extra time has always been worth it!
The final mixes and masters do sound fucking cleaner by the end of it all because I do use a lot of compressions and saturation on most things.
Thank you for showing that 24-bit is not necessary for audio playback. For audio production, however, it makes a big difference in terms of the amount of buffer between clipping and not clipping the input audio that is being produced.
That video you mentioned last time absolutely blew my mind. I didn't have a clue that the aliasing around the Nyquist frequency issue was a thing at all. I had the feeling that higher sample rates were better for basic audio clarity, in the same way that a higher bit-depth helps with dynamic range. I just had no idea how or why.
There's a fun way to show off aliasing, because it actually results in NEGATIVE frequency.
Take a circle and put a single dot on it, let's say at top dead center. Now, spin it at 500 turns per second, but take a picture 1,000 times per second. You'll see the dot jump between the top and the bottom of the circle.
Now change this to 450 rotations per second. You'll see the dot move first 225/500 of the way around the wheel, or a bit less than half, then in the next frame it's gone 450/500 around, and so forth. Clearly, it's rotating clockwise.
Now increase this to 550 rotations per second.
In the first frame, these both start at 0. In the second, your 450 circle goes 225/500 forward, but your 550 circle shows up with the dot at a position of 275/500 rotations forward.
The third frame shows your first circle reaching 450/500 forward, while the second circle is at 50/500 forward.
Yes, that's right: the second one just moved backwards from +275 to +50. In fact, that first step already moved backwards, from +500 (i.e., 0) to +275; or we could say the 450 circle turned +225, then +450, while the 550 circle turned -225, then -450.
It's spinning at 450 rotations per second BACKWARDS!
This backwards spinning is just a sine wave shifted τ/2 out of phase, which still sounds the same. (Yes that's what a sine wave is, it's a circle.) No, you can't just intuit that it must be 550Hz because it's going backwards; that would be like silencing the first 1/1100 of a recording and suddenly it's at double the pitch.
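A quick numerical check of the same idea, as a minimal Python sketch (the 450/550/1000 numbers are simply the ones from the wheel example above): a 550 Hz sine sampled at 1 kHz produces exactly the samples of a 450 Hz sine with the sign flipped, i.e. the same tone "spinning backwards".

    import numpy as np

    fs = 1000                      # samples (pictures) per second
    n = np.arange(100)             # 100 sample instants
    t = n / fs

    x_450 = np.sin(2 * np.pi * 450 * t)
    x_550 = np.sin(2 * np.pi * 550 * t)

    # 550 Hz is above Nyquist (500 Hz): its samples are identical to those of a
    # 450 Hz sine with the sign flipped, i.e. a negative-frequency / "backwards" tone.
    print(np.allclose(x_550, -x_450))   # True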
@@johnmoser3594 That's an excellent way to visualize it. It's kinda like how a camera's shutter effects can make a helicopter's blades look like they're standing still or going backwards very slowly.
A fun fact: The exact same reasoning is used in professional video cameras. The Arri Alexa 35 - a camera often used in movie making - has a whopping 17 stops of dynamic range. So even if a scene is way under exposed or over exposed, the problems can be corrected in post-processing.
That's really interesting.
So why is everything on my TV so dark and hard to see!!!!!!
@@jackroutledge352 Maybe modern filmmakers think underexposed means "gritty" and "realistic". Lol.
@jackroutledge352 perhaps your TV doesn't have a good dynamic range.
@@jackroutledge352 Yeah, that sounds like an issue with your TV quality. Or maybe your settings are not optimized?
I have a 4K OLED Sony TV and it has HDR. Looks gorgeous with the right material.
@@RealHomeRecording It's a well-known problem/phenomenon. Lots of people are complaining about the darker TV/movie productions these days. It is much darker now.
I also have a 4k Oled TV (LG) but I can also see that some scenes are very dark in production.
For me 24 bit, 48khz digital recorder with analog desk and outboard gives all I need. You get the balance of pushing the levels through the analog to create the excitement and keeping lower digital levels to capture it with plenty of headroom.
It’s all you need. Your DAW might process audio as 32-bit float, but your ADC is more than likely capturing 24-bit. 48k gives a touch more high frequencies before Nyquist sets in, without essentially doubling the file size.
Love it. I worked for Sony Broadcast, including professional audio products; my team worked in Abbey Road and the like. This takes me back to those days when the analogue and digital battle lines were being drawn. I've always maintained digital offers a better sustainable quality, for the reasons you outline. Keep it up.
Thanks, Colin! Sounds like you’ve worked on some awesome projects!
There is no "battle"going on, Dude. BTW, did you work on the analog or digital side of (Abbey) Road?
@@frankfarago2825 I said battle lines, I did not say there was a battle. I worked for Sony Broadcast at the time when digital recording equipment like the PCM-3324 was being introduced, and I remember conversations with engineers who preferred analogue recorders because they could get a better sound by altering it, like bias levels, which to me always felt like they were distorting the original recordings. I ran a team of engineers who installed, maintained and supported these products (sometimes during recording sessions, sometimes overnight) at a time when the industry was starting to embrace the technology.
I remember this time when digital audio wasn't quite in the hands of the consumer yet, and a guy whose name escapes me from Sony's "Digital Audio Division", as he put it, brought a digital reel-to-reel deck into the studio of a radio station in San Francisco and played the theme to Star Trek: The Motion Picture / Wrath of Khan. It was awesome, but the station was not properly set up for it and there was heavy audible clipping. They stopped and came back to it later, and while the clipping was gone, the solution just sucked all the life out of the recording. I wish I remembered the guy's name. I think it was Richard something.
But how many albums out there have been recorded to tape? Most all of them! How many digital albums have I heard? Squat, and if I have, it was early ADAT!
Something to add about bit depth and floating point for audio processing is the phenomenon of rounding/truncation and accumulated error. If you are processing with 16- or 24-bit integers, then every time you do a math operation you are truncating to that length. Now that doesn't sound bad on the surface, particularly for 24-bit: what would the data 144 dB down matter? The problem is that the error in the result accumulates with repeated operations. So while just the least significant bit might be wrong at first, it can creep up and up as more and more math is done and could potentially become audible. It is a kind of quantization error.
The solution is to use floating point math, since it maintains a fixed precision over a large range. Thus errors are much slower to accumulate and the results are more accurate. So it ends up being important not only for things like avoiding clipping, but also to avoid potentially audible quantization errors if lots of processing is happening. In theory, with enough operations, you could still run into quantization error with 32-bit floating point, since it only has 24 bits of precision, though I'm not aware of it being an issue in practice. However, plenty of modern DAWs and plugins like to operate in 64-bit floating point, which has such a ridiculous amount of precision (from an audio standpoint) that you would never have any error wind up in the final product, even at 24-bit.
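A rough way to see the accumulation effect in Python (a minimal sketch; the -1 dB/+1 dB gains and the 500 passes are arbitrary choices just to force many operations): repeatedly apply gain changes, requantizing to 24-bit integer steps after every operation, and compare against a 64-bit float reference that gets the same processing.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-0.5, 0.5, 48000)        # one second of "audio"
    full_scale = 2 ** 23 - 1                  # 24-bit signed integer range

    def to_int24(v):
        # quantize to 24-bit integer steps, returned as float for the next operation
        return np.round(np.clip(v, -1, 1) * full_scale) / full_scale

    gain_down = 10 ** (-1.0 / 20)             # -1 dB
    gain_up = 10 ** (+1.0 / 20)               # +1 dB

    fixed = x.copy()
    ref = x.copy()                             # float64 reference, never requantized
    for _ in range(500):                       # 500 down/up pairs: unity gain overall
        fixed = to_int24(to_int24(fixed * gain_down) * gain_up)
        ref = ref * gain_down * gain_up

    err = fixed - ref
    print("accumulated error: %.1f dBFS RMS" % (20 * np.log10(np.sqrt(np.mean(err ** 2)))))

The single-step truncation error sits far below audibility, but the printed figure shows it creeping upward with the number of operations, which is the point being made above.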
Rounding and truncation should not be a problem as long as the levels are not wildly out of range, either way too high (digital clipping) or way too low (down in the noise floor). The latter would be almost impossible, since recording at 144 dB below digital full scale would be obviously ridiculous; even room sound and microphone preamplifier noise should be quite a ways above this level. However, there is one thing that needs to be watched, and that is proper dithering. Vanderkooy and Lipshitz did pioneering work on dither, and they recommend that triangular probability density dither should always be applied at 1/2 least significant bit whenever audio is sampled or resampled (sample rate converted, or gain shifted down, where the dither might be reduced below the current bit depth).
Vanderkooy and Lipshitz did say that dither might not be necessary when working with more than 24 bits, especially if the master is going to be converted to 16 bits for distribution. It can be dithered for 16 bits when transcribing to CD Audio or whatever. The dither provides a digital noise floor that spreads the quantization error power spectrum across the entire audio band, effectively making the resolution greater than the dynamic range implied by the actual bit depth.
This is the white paper, from AES, not free: aes2.org/publications/elibrary-page/?id=5482
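For reference, here is what TPDF dither at the target bit depth can look like in code, as a minimal numpy sketch (assuming the common construction of two uniform noise components of ±1/2 LSB each, giving a triangular noise of ±1 LSB peak, added before rounding to 16 bits; the -90 dBFS test tone is just an illustration):

    import numpy as np

    rng = np.random.default_rng(0)

    def dither_to_16bit(x):
        """Reduce float samples in [-1, 1] to 16-bit integers with TPDF dither."""
        lsb = 1.0 / 32767.0
        # Two uniform noises of +/-0.5 LSB sum to a triangular (TPDF) noise of +/-1 LSB.
        tpdf = (rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)) * lsb
        return np.clip(np.round((x + tpdf) * 32767), -32768, 32767).astype(np.int16)

    # A -90 dBFS sine comes out as a noisy but undistorted tone instead of
    # collapsing into correlated quantization distortion.
    t = np.arange(48000) / 48000
    quiet = 10 ** (-90 / 20) * np.sin(2 * np.pi * 1000 * t)
    print(dither_to_16bit(quiet)[:12])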
I find a higher sample rate most useful when stretching audio tracks is necessary. Especially on drums to avoid “stretch marks.” But it's enough to bounce up from 48 (my standard) to e.g. 96 and bounce back when done.
Here's the simpler way to get the same effect. Check the settings for your time stretch algorithm. The default is usually the highest pitch accuracy. What increasing the project sample rate does is decrease pitch accuracy in favor of time accuracy. The alternative way of doing this is to set the time stretching algorithm to a lower pitch accuracy.
@@simongunkel7457 Are you using REAPER? There is this setting to preserve lowest formants - is that what you mean?
@@PippPriss Sorry for the late reply, didn't get a notification for some reason. REAPER has multiple time stretch engines and for this particular application switching from Elastique Pro to Elastique Efficient is the way to go. You can more directly change the window size on "simple windowed", though Reaper actually goes with a time based setting, rather than a number of samples. Also note that stretch markers can be set to pitch-optimized, transient-optimized and balanced..
Haha, “Stretch Marks” never heard that before. I’m going to say that instead of ‘artifacting’ from now on
How exactly does upsampling from 48k make the stretching more transparent? Since it's not adding any new data, I'd imagine it would do nothing.
Well, shoot. I just watched your video on 16/44 and how it was theoretically perfect. I was so excited about all the hard drive space I was going to save recording at 16/44. Thanks for presenting the other side of the equation. You and Dan W. are treasures.
Yep, I think that it's quite analogous to photography, where 8-bit colour channels work "pretty well" for printed output, but really fall apart for original scene capture, and just get worse when any kind of DSP is applied to the "signal": 'banding' shows up in stretched tones, and softened edges can get banding or artifacts introduced during processing... Great discussion.
I like high sample rates and I cannot lie. You other engineers cannot deny....
Lol...itty bitty waist....round thing...face..😂🤘🎶
When the engineer walks in with some RME and a Large Nyquist in your face, you get sprung!
@@Mix3dbyMark😂😂😂
FWIW I'm running 32 bit float and 48k on my DAW. That's my max bit depth with Studio One 6.1 Artist, it goes to 64 bit float in Pro.
As for sample rates, it looks like S1 goes up to 768K. Good enough?
@@Mix3dbyMark nice!
I always thought about the sample rate problem being that if you wanted to slow down a piece of audio with the highest frequencies being 20kHz, you'd lose information proportional to the amount you slow it down. So you need the extra magical inaudible information beyond 20kHz for the slowed down audio to still fill the upper end of the audible spectrum. That is something every producer will have probably experienced.
yeah if its essential for your workflow to slow some audio down then yes by all means. otherwise im happy with 48 or 44.1 because it sounds good. I like to export any files before mastering as 32 bit files tho, saves you issues from downsampling (as most DAWs run a 32 bit master fader now)
This is an excellent and very clear explanation. Thank you so much! I've seen Dan Worrall's videos on this topic, and I agree they are also brilliant.
That was super clear. You’re a great instructor. Is it useless to record in 96k and then bounce stems down to 48k to give my logic session a break?
No. It’s not useless. You can even bounce out the multitrack instead of combining sections into stems.
Both this and the previous video are great! Thanks for the great work!
Thanks for clarifying these things! Really useful for deciding on project bitrates and sample rates. Cheers!
Would be cool to see the importance of audio resolution in resampling!
Yes please
It's like Ansel Adams 'zone system' for audio.
Adams would prefog his film with light, then over expose the film in camera, while under exposing the film in chemistry, so as to get rid of the noise floor (full blacks) and get rid of the digital clipping (full whites), both of which he maintained "contained no information".
This resulted in very pleasing photographs.
How would I do this process in Photoshop?
I'm surprised he didn't mention how higher sample rates decrease latency when live monitoring.
PS. I would love to see videos about the future of DANTE AV and Midi 2.0.
I guess because the digital buffers fill up sooner?
Good point, Zyon Baxter! It’s a balance in practice though, as it’s more processor intensive so using a higher sample rate might lead to needing a larger buffer size.
If anyone reading this is interested in learning more about this, check out this video: ruclips.net/video/zzM4yk3I8tc/видео.html
Actually, with a given computing power, and assuming you make full use of it, a higher sample rate means higher latency.
@@lolaa2200 Lower latency; even within the ADC/DAC, the feedback loop delay is reduced.
I think this is kind of a myth in all honesty.
In every other way, doubling the sample rate means doubling buffer sizes. You have a delay effect? You'll need twice as many samples in the buffer at 96k.
Same for the playback buffer: if you double the sample rate but keep the same number of samples in the buffer, you've actually halved the buffer length in time.
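The arithmetic behind that point, as a tiny sketch (the 128-sample buffer and 100 ms delay are just typical example values): latency in milliseconds is buffer samples divided by sample rate, so the same sample count buys half the time at 96 kHz, while anything defined in milliseconds needs twice as many samples.

    def buffer_latency_ms(samples, sample_rate):
        return 1000.0 * samples / sample_rate

    print(buffer_latency_ms(128, 48000))   # ~2.67 ms
    print(buffer_latency_ms(128, 96000))   # ~1.33 ms

    # A 100 ms delay line needs twice the samples at 96 kHz:
    print(0.100 * 48000, 0.100 * 96000)    # 4800.0 9600.0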
More samples are a great thing for denoising as well.
Temporal denoising is quite a resource-intensive task, but it works wonders in recordings of any type. Especially if you want to get rid of higher-frequency noise.
Awesome stuff. Keep 'em coming.
This stuff is pure gold. Thank you so much.
Same reason why graphic designers need high res and high bit depth. A 1080p jpg image is great for viewing, but will look terrible once you zoom or change the brightness. If your final image is composed of other images, they better be at a good resolution, or they'll look pixelated
It's not about resolution, it's about compression. Not going into the details, but how much you compress your JPEG, and the trade-off between image quality and file size, is exactly what is discussed here: a matter of sampling rate.
As an amateur Photo(GIMP)-shopper, I figured this out a few years ago. Always start with a higher resolution than you think you'll need. It's easy to reduce the final product but a pain to go back and redo it with higher resolution.
@@lolaa2200 Ehh no, audio sampling rate is directly analogous to image resolution. We're not talking about image compression nor audio compression here.
@@MatthijsvanDuin I reacted to a message talking about JPEG. The principle of JPEG compression is precisely to give a different sample rate to different parts of the image, so yes, it totally relates. JPEG compression IS a sampling matter.
Modern converters work as 1bit sigma-delta anyway and convert the data stream after the fact, using digital filters with a dithering noise beyond the audible range.
WRT noise floor and compression, when working with analog tape, it was (and I presume still is) much more common to compress and EQ on the way in to avoid adding noise by doing it later.
I guess having a high sample rate for when you need to e.g. slow recordings down is also useful because you still have that data. And to me that's pretty much the only reason to have things above 44kHz sampling rate
Great point, Zadar Leader!
Not something I think is necessary, unless you specifically want ultrasonic content and to make bat sounds audible. Now if you think time stretching sounds better with a higher sample rate, you might be right, but you are using the most computationally expensive hack I can think of. Time stretching and pitch shifting algorithms use windows of a particular size (e.g. 2048 samples). But their effect depends on how long those windows are in time. So a higher sample rate would just decrease the time window. All of these effects make a trade-off though: the longer the window, the more accurate they get in the frequency domain, but the shorter the window, the more accurate they get in the time domain. Most of them default to maximal window size and thus maximal accuracy in the frequency domain, but the errors in the time domain lead to some artefacts. So instead of increasing your project sample rate, which will make all processing more costly in computation, you could just opt for the second-highest frequency domain setting for your pitch shifting or time stretching algorithm. Which means window size is decreased, which actually reduces computational load for pitch shifting or time stretching.
@@simongunkel7457 I think he means the simpler resampling version where things get pitched down when playing slower, then the higher sample rate still has content to fill in the top of the spectrum when pitched down.
@@5ilver42 That's the ultrasonic content he referred to, wanting it to be captured so when you lower the rate it drops into the audible range.
@@5ilver42 It depends on the application, but if you're just slowing down for effect you actually want the top of the spectrum to remain vacant rather than shifting (previously inaudible) ultrasonic sounds into the audible part of the spectrum. Obviously if you want to record bat sounds you need to use an appropriate sample rate for that application, regardless of how you intend to subsequently process that record.
Compression you apply before any gain or od/ds is brought into the signalpath. It only might be applied again when mastering for different formats.
Got so excited when I saw you were using reaper
A very sophisticated way to say: the better quality you use to record, the better results you will have.
Succinctly: while human hearing has an upper frequency bound, targeting that limit when converting from analog to digital can (and usually does) result in a literally corrupted digital representation, because the contribution of the higher analog frequencies to the waveform doesn't just disappear; it gets aliased into the lower frequencies.
Excellent video Kyle.
Sometimes I miss the analog tape days, till it comes to signal to noise. At least tape saturation sounds much better than digital clipping, though I'm sure nobody goes that hot. 🙂
Great video, Kyle.
Gracias amigo, saludos desde México 🇲🇽❤
Rather than for musical purposes, I think it is valuable as data for profiling or natural phenomenon analysis in the future.
That was really great man! I didn't know this before! I was just using the standard settings because I didn't know what it would change. But now I understand it! 🤘
Glad to help, 3L3V3N DRUMS! I still use 48kHz most of the time because the processing power and storage I save outweigh the tiny bit of aliasing that might occur. (In my opinion)
@@AudioUniversity Great to know. That's actually the standard in Ardour while I'm recording my drums. So I'll leave it like that!
@@AudioUniversity Where would it occur though? Your converter on the hardware side always uses the maximum sample rate it can support, because that makes the analog filter design much easier. Then if you record at lower sampling rates it will apply a digital filter and then downsample; both are hardware-accelerated DSP that get controlled via the driver. If you set it to record at 48k, your converters don't switch to a different filter design and a physically different sample rate, they just perform filtering and downsampling before sending the digital signal to the box.
@MF Nickster I agree.
you have done it again. i would love to see a video on square waves
When I first started making digital audio recordings, the encoder allowed me to choose 14 bit or 16 bit. Times have changed! (F1 / SLF1 system, 1983...)
Thank you for all your great videos and subscribed.
Sample rate does more than help with anti-aliasing. Rupert Neve was convinced that capturing and processing the ultrasonic signal that came with the audible actually contributed to the perceived pleasantness of the sound, and the emotional state it communicates. So, even if you can't hear it, per se, it counts in the overall timbre and feel. You can easily argue that, in the analog domain, ultrasonic signal (for instance harmonics) actually changed the behavior of compressors, to say the least, and that, multiplied by x number of tracks. So higher sample rates also allow for a wider bandwidth into the ultrasonics, which seems to matter for the quality of the signal. The downside is the processing power, and storage space.
It's risky to record frequencies above 20 kHz, even when the original sample rate is above 88.2 kHz. Ultrasonic frequencies in this band are susceptible to being digitally folded down into the audio range, producing extremely unnatural-sounding aliasing distortion. While this hazard can be carefully avoided within a pure 96 kHz+ digital processing chain, any side trip to an external digital processor may involve resampling that can run afoul of ultrasonic frequencies. Why take such risks when the speculative benefits have never been shown to be audible?
I understand about oversampling and why it's used internally. However, sometimes I think of potential reasons to *record* at higher sample rates - but I'm no expert and I wonder whether this is ever justified. Two such reasons I can think of right now:
1. Field recordings that you might want to slow down later on to half or quarter speed.
2. Recordings made in adverse conditions that might need noise reduction processing (I've heard some people say that higher sample rates can help with NR quality).
Do you have any comments on either of these? I'd be interested to hear your advice. Thank you!
The two reasons you listed are indeed valid points. Pitch correction or pitch manipulation would be another.
@@RealHomeRecording Many thanks, that's helpful!
Here is the right level to render to for audio uploaded to YouTube:
The ideal volume limit level is -5.9 dB. (YouTube automatically normalizes volume to that level)
All instruments should be below this level, with the peak spikes reaching -5.9 dB.
Just put all instruments at around -18 dB and then increase accordingly between -18 and -5.9 dB.
Don't ignore clipping. Or it will sound like Golden hour by JVKE.
I already said this but I'll say it again: most plugins work better at high sample rates, since most plugins don't have internal oversampling, so it's good to work at a reasonably high sample rate like 96k or 192kHz. Though I say this, I'm still working at 44-48kHz 😂
Actually almost all plugins these days have internal oversampling.
My DAW (Reaper) has external oversampling per plugin or plugin chain, which means it takes care of the upsampling and after processing the filtering and downsampling. To the plugin it looks like the project runs at a higher sample rate, while for plugins where aliasing isn't an issue it can still run at the lower sample rate.
Most plugins already had oversampling built into them, like 10 years ago.
The demonstration about analog audio gain and noise floor is exactly how cameras work as well. I'm actually shocked by how similar they are. Capturing images with a camera is a constant battle between distortion (clipping the highs) and the image being too dark (blending in with the noise floor), and bringing it up in post then causes the noise to come up as well.
say whatever you want to ...believe whatever you want to .... the difference between DSD recordings and Lossless wav or flac on my Fiio DAP is NIGHT vs DAY
It's really about the recording just as much as the file type / resolution.
One thing often overlooked in the sample rate argument is digital mixers. The converters are often run in low-latency (high speed) mode in order to keep the round trip through the console low enough that it doesn't affect people's performances. This is done by simplifying the digital anti-aliasing filters to reduce processing time. This is not trivial stuff; I'm talking on the order of 40 dB attenuation at 0.6fs vs 100 dB. In other words, if your console runs at 48 kHz, an input of 28.8 kHz at full scale will come out the other side of your console as 19.2 kHz at -40 dB. That's enough to cause some issues, especially since a lot of manufacturers trying to meet a price point completely leave out the analogue anti-aliasing filters (Sony suggests 5-pole analogue filters in front of 48k ADCs). Running a digital console at 96 kHz effectively means around 90 dB stop-band attenuation even with the ADCs in low-latency mode. Of course, you also reduce aliasing caused by internal processing, as you say.
@mfnickster The issue isn't processing power so much as ADCs MUST have group delay in order to have linear-phase anti-aliasing. DACs must also have group delay for the reconstruction filters. The processing power within the console's DSP is fast, but nothing is instantaneous, so every place one can reduce the latency must be considered. Oversampling also requires group delay, so pick your poison. In a computer environment, the plug-in can report its internal latency so the DAW can compensate by pre-reading the track; not so in a mixer.
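The fold-down in that 28.8 kHz example is simply a reflection around Nyquist; a minimal helper for the first alias image (valid only for input frequencies between Nyquist and the sample rate, and using the numbers from the comment above):

    def first_alias(freq_hz, sample_rate_hz):
        """Alias frequency for a tone between Nyquist and the sample rate."""
        return sample_rate_hz - freq_hz

    print(first_alias(28800, 48000))   # 19200, i.e. the 19.2 kHz mentioned above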
Here are two other reasons to go with 96k:
The ad/da latency of your system will be much smaller, and (if for some reason) you record to a file played back in the wrong sample rate you will notice it right away 😁
!All points on point!
"...for any properly mastered recording." I long for properly mastered recordings. A thing of myth and beauty. Like a unicorn.
I record 24bit because of the noise floor. But I record on 96KHz because of the round trip time! My system actually has a lower latency when it's set to 24bit 96KHz.
Higher sample rates reproduce higher frequencies. There is no more info in the audible range. The clue is in the file size: double the frequency, double the size. More bits is a lower noise floor, which most DACs can't reproduce out the audio port. And yet it sometimes sounds better to me 🤷♂️
did you test it in an ABX-setting? Otherwise your perception just might have fooled you! 😀 #beenthere
@@paulhamacher773 a lot of the time its difficult to find a higher res version of the same mastering
@@crapmalls ...then just record your own stuff at different settings and let somebody else play it for you without telling you which is which. Also, use more than just a few examples. You will see, you got fooled. There isn't more info; you can't hear above 20kHz.
@@mb2776 Yeah, that's what I mean. I know there's literally no difference because the higher sample rate just goes into higher frequencies. The file size is the giveaway. Apparently it can help with timing in the DAC, but that's an oversampling issue and a DAC issue, IF the DAC is even good enough for it to matter.
Well, the comments around noise floor are a bit misleading. A 24-bit signal doesn't have a 144 dB noise floor (that would be nice), as this depends on the noise floor of the conversion. 144 dB (6 dB per bit) is theory only.
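For reference, the theoretical figure comes straight from the quantization step size; a one-liner to reproduce it (ignoring the extra ~1.76 dB term for a full-scale sine, and of course ignoring the analog noise floor of the converter itself):

    import numpy as np

    for bits in (16, 24, 32):
        print(bits, "bits:", round(20 * np.log10(2 ** bits), 1), "dB")
    # 16 bits: 96.3 dB, 24 bits: 144.5 dB, 32 bits: 192.7 dB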
Wow. Good to know.
It is good to see you back in action with your awesome videos!
Years ago I worked for a classy recording facility with an audiophile guy in the lead whose idée fixe was 192/32; he considered even 96/24 a downgrade. Even though top-of-the-line computers were used, they clipped on a regular basis due to the fact that data storage devices could not keep up with the load of a normal album session, that is, about 60+ processed tracks. Even M.2 drives could hiccup. I had to consolidate drum tracks during editing after every 5 seconds, otherwise DAE#90-something kept popping up and it crashed. PT hell. So audio production was restricted to box fighting with both arms tied back. Just for the full picture, when the same guy demonstrated the benefits of the high sample rates via an A/B test, without his glasses on, the actual track that he was A-ing was not labeled "192KHz". It was "192MP3". He just messed it up. The B was the 192kHz one that he was talking down. "Can you hear the degradation, it is clearly audible, yeah?"
Yeah, it is. Confidently. So if you convince yourself that, being a human with perception topping out around 20 kHz at age 18-25 and going down later, you can really hear the difference between, for example, 48/24, which is a reasonable data size to work with, and 192/32, which can clip the hell out of the system and storage with no practical production benefits, don't forget to provide a DNA test proving that you are a cross-species batman with 200 kHz ears.
02:02 It's worth noting that recording 24-bit audio doesn't mean you get 24 bits of dynamics: hardware has a noise level as well. But sure, digital interfaces are still way lower in noise than analog.
Nice to see that this stuff is understood properly by the younger engineers that didn't live through the evolution of analogue and digital recording. So much nonsense spoken about high bit rates. Well done.
Aliasing definitely isn't a problem unique to digital audio recording. I'm a geophysicist and a geophysical data processor; aliasing is also an issue in geophysical data in exactly the same way, except it ends up being a visual issue.
Great video. In audio production, another beneficial effect of using higher sample rates, apart from getting rid of aliasing, is that doubling the sample rate cuts latency in half...
It's odd that so many people bring this up. It tells me that many systems are poorly designed and don't adjust the sample buffer size to the sample rate, e.g. they are a fixed number of samples rather than a fixed amount of time. Or people just don't know how to adjust the buffer size to reduce latency (at a cost of higher chance of dropouts).
This is why the USMC Sony and others need to make DSD more available and not guard it so much. It's a lot easier to work with when the editing program supports it. Almost never have to worry about noise floor and you can do almost all processing on a core duo with ease.
You using Pyramix for this? If not, then what?
So, wrt sample rates (5:01), when a more gentle low pass filter to Nyquist of 48k is surely within the realms of DSP and CPU now, it begs the question: why do anti-aliasing filters remain at 24kHz for processing at 96k, or when over-sampled 2x-4x?
I wish there was a higher sample rate option for high-mid to higher frequencies that keeps a 48 kHz sample rate on the low-mid to low frequencies but targets a higher sample rate for the rest.
I'm a bit rusty on this, but there is an issue with the Nyquist frequency. Going from analog to digital, you want to "brick wall" low-pass the signal at half the sampling frequency. A brick wall is a perfect low-pass filter, which doesn't exist. There are very good low-pass filters. Going from digital to analog, you again want to brick-wall filter the signal to recover the analog signal from the sampled signal. Even more confusing, there are digital low-pass filters, but they have to obey Nyquist as well.
This video was SO HELPFUL but now I just wonder why anyone uses more than 2 compressors on a sound
96 is the sweet spot, think of sample rates as the display quality on your monitor, 1080p is going to look worse than 4k because it has less pixels, it’s the same thing in music, more samples equals more detail.
For digital SNR to be that important, you would need it to be higher than your hardware SNR, which is quite unlikely in the case of acoustic recordings. Maybe for electronic music it's more important.
In the 16-bit days, keeping the digital noise floor below analog meant going in hot and thus risking clipping. With 24 bits, you gain the headroom and your analog noise floor will be louder than the quantization noise. So it's a non-issue these days, but only because we moved to recording at 24 bits, where the bottleneck becomes the analog chain in front of the ADC.
@@simongunkel7457 Do you have the order of magnitude? I feel like 90dB of SNR is already extremely good for your whole analog chain.
@@jonasdaverio9369 If you wanted to make use of the 96dB provided by 16 bits, you'd have no headroom. I tend to be cautious and leave at least 12dB between the loudest peak I got during soundcheck and full scale. There are plenty of mics that can beat 90dB, and dynamic mics don't generate noise on their own, so you only get the preamp noise. 100dB SNR isn't that uncommon even at quite low price points, and I just measured 90dB on an old Behringer interface I have lying around.
@@simongunkel7457 Thanks for the details!
You get an increase in noise as you add effects and tracks in the digital domain, thus it's not just capture, but also editing that needs a lower noise floor. Even just adjusting gain in the digital domain adds noise.
I'm wondering if a human can differentiate a sine wave from a sawtooth wave at high frequencies, when the harmonics forming the sawtooth wave will be above 18-20 kHz.
So we can't hear these frequencies, but the pressure difference will be much steeper in the case of a sawtooth wave (much faster attack)
Maybe you can give me some opinions and sources on this topic?
P.s. I really like your videos, thank you!
That’s an interesting question; I'd be curious too, but I assume you’d be able to hear the difference. What do you think?
Check out this video: Digital Show & Tell ("Monty" Montgomery @ xiph.org)
ruclips.net/video/UqiBJbREUgU/видео.html
Monty runs a square wave through the system and illustrates something called the Gibbs Effect. Although, the frequencies that make a triangle wave or square wave perfectly triangular or perfectly square exceed 20 kHz. So the sound should be the same theoretically!
@@BenCaesar Above around 5k, no audible difference between a sine and square wave tones. Try it yourself!
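One way to try it yourself in code, as a minimal sketch for the sawtooth version of the question (the 7 kHz fundamental and 48 kHz rate are arbitrary choices): build a band-limited sawtooth by summing only the harmonics below 20 kHz. At a 7 kHz fundamental, only the first two harmonics survive, and the second one (14 kHz) is already at the edge of most adults' hearing, so what is left is very close to a plain sine.

    import numpy as np

    fs = 48000
    t = np.arange(fs) / fs          # one second
    f0 = 7000

    def bandlimited_saw(f0, t, f_limit=20000):
        """Additive sawtooth keeping only the harmonics below f_limit."""
        out = np.zeros_like(t)
        k = 1
        while k * f0 < f_limit:
            out += np.sin(2 * np.pi * k * f0 * t) / k
            k += 1
        return out

    saw = bandlimited_saw(f0, t)    # only the 7 kHz and 14 kHz components remain
    sine = np.sin(2 * np.pi * f0 * t)
    # Listening to (or plotting the spectra of) saw vs sine shows how little separates them.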
Bit depth matters, assuming you are going to change the dynamic range of what is on the file by a lot. Sample rate does not (assuming you only care about audible frequencies!). If you stretch a 20 kHz sine wave and make it a 19 kHz sine wave, the application doing this re-sampling is not taking the original samples and moving them; it is interpolating a position between the original samples and synthesising a new sample. It will be as good as the algorithm the software uses. The sample rate of the source (44.1/48/96 etc.) is irrelevant: if the software is good, it will do a good job; if the software is poor, it will do a poor job.

Luckily for us in 2023, this is a very solved problem, and things like REAPER, which still has a re-sampling mode on export, default to very good implementations for sample rate conversion, whereas in the olden days, when the original Pentium processor was crazy expensive, it would take forever to export whilst resampling. Any re-sampling algorithm that is used today does not simply draw a straight line between two samples and put a new dot the appropriate proportion along the line. The waveform represented by the original samples is effectively reconstructed, and the sample that is synthesised is placed on the reconstructed waveform, which is mathematically very precise relative to the original samples. This accuracy does not get better at high sample rates; these samples are temporal, not amplitudinal (i.e. the inaccuracy is in the bit depth, not jitter in when the sample was made - unless the ADC was bad, in which case it's bad at any rate!).

For those that are thinking about aliasing: again, the quality of the software you are using is far more profound than the sample rate you select. For example, a good piece of software may put a low-resonance, brick-wall filter at about 21 kHz to filter away higher frequencies so they don't cause aliasing. If your software does this, and many do, there is a good chance that the software developer has thought things through carefully. If you are dependent on sample rate to minimise aliasing, then there is a good chance that your software of choice has problems in many areas!
Yep, frequency and volume range is enough. But what about resolution? A 16-bit, 48 kHz signal is just so many data points, which will wipe any difference between very close, but slightly different, signals.
For example, digitizing short enough 15 kHz and 15.001 kHz sine signals would result in the same binary file. Moreover, the DAC is not looking at the whole file, only at a short part of it, meaning that we will likely have frequencies changing over time.
Compare this to image sensors or displays. Having a 1-inch HDR sensor gives enough size and depth. But we still want it to be 4K or 8K.
Good stuff to know. Thank you.
I would argue that increasing bit depth doesn’t give you a lower noise floor; it only increases dynamic range. If you want to say the noise floor is proportionally lower, that’s fine. The noise floor is essentially fixed, depending on input and output configuration, equipment, as well as power and other electrical interference.
@nrezmerski A lower noise floor only relative to overall dynamic range, as in signal to noise ratio. Changing 16,24, or 32 bit won’t change the system’s inherent noise floor.
Might be worth explaining what aliasing is, and why aliasing can occur only in digital processing, and never in analog processing
ruclips.net/video/O_me3NrPMh8/видео.html (most visible on the back wheel of the carriage going right). That's aliasing on a film from 1903 and that's not digital.
@@simongunkel7457 aliasing in the realm of analog Audio processing
@@eitantal726 Well BBD delays can alias and a classic piece of gear which can do this is the Moogerfooger MF-104M. But you could mod any BBD effect to do it - it's just the Moog has controls that allow you to go to aliasing mode without any further tinkering.
Aliasing is an issue whenever something is discretely sampled, which is why it also applies to motion film (with each "sample" being a frame of video).
It's sort of the same with visual production, with pictures and video files. The average Joe posts JPEGs in 8 bits and maybe a PNG with an alpha channel every Sunday. But in production we use 32-bit EXRs everywhere, because you can play with high dynamic range in comp, it's fast, it can store layers and metadata you haven't even heard about, and deep data, and ....
Real music is not single static sine waves but a whole spectrum that varies with time. I would like to see this mathematical argument extended to spectra, because the error on each frequency component would surely accumulate? Real music is very very processed, being encoded and decoded multiple times from various streaming services and codecs, so I think adding a bit of headroom in terms of frequency and bit depth is quite sensible to keep the artefacts down.
I use 192kHz multi tracking then master to DSD for amazing replay via a decent DAC
If your signal-to-noise ratio is below 96 dB (including not only mic and preamp but also the room), recording with 24 bits only makes sense for the manufacturer. ;-)
Unless you like to record 8 bits of noise...
Would you recommend mixing in 24-bit? I used it before and it's OK to me.
Yes.
@@AudioUniversity I make predominantly reggae music, and for some reason I normally create music in Ableton; now mixing in Luna at 24 has given me the desired sound I was looking for. All out API and Studer tape.
Clear explanations
I heard Spotify is rumored to offer higher-fidelity audio, probably with less compression or lossless audio using codecs like FLAC instead of MP3. My audio equipment probably isn't good enough to hear the difference though, but maybe it will be good for music producers.
Excellent
I'm watching all the videos on the channel, thank you for sharing your knowledge with us. One question: what is the setup of the sound equipment installed in your car? Is it a High-End system? I'm curious to know what system (equipment) you use in your car...
I just use the stock system, but I’d love to upgrade someday! Thanks.
Perhaps an Idea to consider (and make a video) that compares DSD to PCM and the differences between PURE DSD recording mastering output and the ones that use PCM in between ... Nevertheless, DSD128 or DSD256. PCM 24/96 vs DSD128 ... Is it really that close ? Or is there some "hidden difference" ;-) ...
AFAIK this method is built in to every audio interface nowadays. So obviously a sampling resolution higher than 44.1 is useful, but in principle you shouldn't record audio files at 96kHz or higher, because they just take up a lot of hard disk space and need more cpu power to play them, especially if you have a lot of tracks... but you don't gain quality.
Could you do a video on dsd, what it is, pros, cons etc
Record in 48 kHz / 32-bit float.
Bounce in 48 kHz / 24-bit.
Most mastering engineers and streaming services will knock it down to 44.1 / 24 anyway (some will keep it at 48). So there's no point in wasting processing power by using higher rates and depths.
Yeah, it has more information, but 1. no one cares and 2. it'll all be gone by the time it reaches your audience.
Most professional studios with million-dollar rigs don't use anything above 44.1 / 24, even for recording. Imagine that...
***Exception: Only use higher rates if you're recording a vocal that you think will need a lot of pitch correction. Record it in 96 kHz so you can get a cleaner vocal after it's been run upside down and put away wet in Melodyne.
If you plan to bounce in 44.1, then record it in 88.2***
4:00 This is wrong -- it has to be MORE than double the highest frequency. You cannot capture 24 kHz with 48k sampling. You can capture 23.999, though.
I understand the part that says we only need a couple of points to draw a smooth sine wave through, and that there are no stairs... but can anybody explain why this is IRL? Is it because of the hardware or physical properties of DA converters that convert this data back to an analog signal, or is it a mathematical thing?
Can we talk about whether EBU R128 and LUFS should apply on the YouTube platform?
I disagree that 44100 is adequate for playback. Waveshape details are important, because contrary to the common theoretical examples, most audio doesn't come in nice proper sine waves. Further, the more audio sources there are the more destructive interference becomes an issue at high frequency when the sample rate is inadequate. I've never heard a 44100 recording of an orchestra where it felt like I was there. When there are a lot of sound sources or reflections, higher sample rates become mandatory to preserve location data, to make it sound like the instrument is actually in front of me.
Weird how limiting was not discussed. It's one of the applications of high sample rates that actually makes sense in practice most of the time, at least in a mastering context. The sample peak level has a higher chance of matching the intersample peak level (true peak) when higher sample rates are used, even if high sample rates have no effect on the D/A. That's the main working principle behind true-peak limiters.
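A minimal sketch of the inter-sample peak idea (the test signal is the classic worst case, a sine at a quarter of the sample rate with a 45-degree phase offset; the 4x oversampling here just uses scipy's FFT resampler to approximate the reconstructed waveform, which is one way a true-peak meter can work):

    import numpy as np
    from scipy.signal import resample

    fs = 44100
    n = np.arange(4096)
    x = np.sin(2 * np.pi * (fs / 4) * n / fs + np.pi / 4)   # samples sit at about +/-0.707

    sample_peak = np.max(np.abs(x))
    true_peak = np.max(np.abs(resample(x, 4 * len(x))))      # ~1.0 after 4x oversampling

    print("sample peak: %.2f dBFS" % (20 * np.log10(sample_peak)))   # about -3 dBFS
    print("true peak:   %.2f dBFS" % (20 * np.log10(true_peak)))     # about  0 dBFS

So a limiter that only looks at the stored samples can read roughly 3 dB low on this signal, which is why true-peak detection oversamples before measuring.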
great video!
Is there any advantage to upsampling when applying parametric EQ, crossfeed, filters, volume leveling etc? Also do some DACs work better with higher sample rates if you are able to offload the conversion in the digital domain in a pc? I am a roon user and curious your take on this.
I can't hear any extension of high or low frequencies with greater bit depth or sampling rate. What I can hear is a more natural sound. It's very obvious to me when going from 16/44.1 to 24/48. Maybe slightly less so from 24/48 to 24/96. I really can't hear the difference between 24/96 and 24/192. It's an interesting experiment to run a high quality ADC directly to a high quality DAC and switch the sampling and bit depth. All of this adds gain stages but also the coloration of filters which might be the largest factor.
If a frequency in an audio signal exceeds 24 kHz when using a 48 kHz sample rate, it will cause lower-frequency artifacts that reflect back downward into the audible range.
One question... When I'm listening to a song on YouTube, how do I identify whether that song is an audio file without loss of quality or an audio file with loss of quality? Where can I see the specifications of the audio being played to know if it is, for example, a "WAVE" or "FLAC" format (without loss of quality) or an "MP3" type file (where there was compression and loss of quality)? Is there any extension for the Chrome browser that shows real-time specifications of the audio being played? I visited YouTube's audio file guidelines and it says the following... "[...] Supported file formats: (1) MP3 audio in MP3/WAV container, (2) PCM audio in WAV container, (3) AAC audio in MOV container and (4) FLAC audio. Minimum audio bitrate for lossy formats: 64 kbps. Minimum audible duration: 33 seconds, excluding silence and background noise. Maximum duration: none [...]". Therefore, YouTube accepts audio files without loss of quality and audio files with loss of quality.
I believe YouTube videos have an audio bitrate of 128 kbps.
Having watched the linked video mentioned here, I can just about get my head around the idea that the computer can reconstruct a sampled sine wave perfectly at any frequency below Nyquist, even when it has a very minimal number of samples per cycle of the wave - but I'm stuck at how this works with real sampled sound, i.e., not a perfect sine wave.
I'm familiar with the idea that all sounds can be understood as a combination of lots of sine waves, but given that real recorded audio just looks like an irregular and unpredictable wiggly line, I can't see why the sine wave example translates into real recording. My intuition would be that regular waveforms like a perfectly repeating sine would be able to be accurately sampled at a higher frequency than irregular waveforms for a given sample rate. How can the 'decoder' deduce the trajectory of the wave between sampled points when the waveform wiggles all over the place?! Surely there can't be only one mathematically correct solution to how to fill in the gaps in that case too?
There is only one solution in those cases too. So long as it’s below the Nyquist frequency. If the fastest “wiggle” that matters is the 20 kHz, it and all lower frequencies will be accurately reconstructed as shown in the video. Did you watch the last part about square waves in Monty’s full video?
Just a detail: no speaker can reproduce above 20 kHz. The square wave is composed of many sines above the capability of the speaker, so everything gets lost either way, and all amps have low-pass filters (also around 20 kHz) so as not to destroy speakers. If I remember well, a D/A converter also has a 20-22 kHz low-pass filter. The real limitation in frequency is our ears.
Look up Fast Fourier Transform Tutorial by Karl Sims. It lets you model an arbitrary waveform live and shows all the sine/cos components that produce it. I notice you get ringing if you have sharp edges, just as you would in analog when the bandwidth is limited.
@@Miisamm Indeed a DAC will have a low-pass filter and this is key to deducing the correct "wiggle" as @davidcooper8241 pondered. Any way for the signal to wiggle _aside_ from the desired one will contain >=Nyquist frequencies, so when you pass the signal through the low-pass filter, getting rid of those overly high frequencies, the resulting signal is mathemagically the desired one.
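A sketch of the "only one solution" point (the band-limited test signal and the 48 kHz rate are arbitrary choices): build an irregular-looking waveform from a handful of sines that are all below Nyquist, sample it, then rebuild it between the samples with Whittaker-Shannon (sinc) interpolation. Away from the edges of the block, the reconstruction matches the original wiggly line closely; the small residual comes only from using a finite block of samples, not from the sampling itself.

    import numpy as np

    fs = 48000
    freqs = np.array([440.0, 1330.0, 3700.0, 9100.0, 15800.0])   # all below 24 kHz
    phases = np.array([0.3, 1.1, 2.0, 0.7, 2.6])

    def wiggly(t):
        """An 'irregular' but band-limited signal: a sum of sines below Nyquist."""
        return sum(np.sin(2 * np.pi * f * t + p) for f, p in zip(freqs, phases))

    n = np.arange(2000)
    samples = wiggly(n / fs)                     # what the ADC would store

    # Rebuild the waveform at in-between instants in the middle of the block.
    t_fine = np.arange(800, 1200, 0.1) / fs
    recon = np.array([np.sum(samples * np.sinc(fs * t - n)) for t in t_fine])

    # Small compared to the signal; the residual is due to the finite block length.
    print(np.max(np.abs(recon - wiggly(t_fine))))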
Word of the day: Nyquist 😉💥
Finally I have strong arguments to argue with audiophiles! 😅
Sadly, most are like flat earthers. They just won't listen.