Sample Rate, Bit Depth, Bit Rate, and You(r Ears), Explained

  • Published: Aug 28, 2024

Comments • 39

  • @Hand-in-Shot_Productions
    @Hand-in-Shot_Productions 6 months ago +5

    I was assigned to watch this for my college audio-editing class, and I found this quite an informative video! I have learned several new terms today: sample rate, bit depth, and bit rate, as well as some other terms.
    Thanks for the video! I'll subscribe!

    • @DavidMacDonald
      @DavidMacDonald  6 months ago +1

      Thanks so much for the comment! Would you mind sharing what college? I'm just curious!

  • @Yusfi5150
    @Yusfi5150 1 year ago +5

    The best explanation on YouTube. Thanks bro

  • @Harindu101
    @Harindu101 7 months ago +1

    Thanks a ton for explaining things to me. Really appreciate it!

  • @dormin600
    @dormin600 4 months ago

    Very clear and concise, thank you for the wisdom

  • @stewartmoore5158
    @stewartmoore5158 1 year ago

    This deserves a lot more views.

  • @davidr00
    @davidr00 1 year ago +1

    Dude… thanks for the info. 🖤

  • @jmadrid4264
    @jmadrid4264 10 months ago +1

    great video, thank you so much!

  • @KevinWeed
    @KevinWeed 3 years ago +1

    Thanks.

  • @wblayney1992
    @wblayney1992 3 years ago +2

    This is a good video going over some stuff that's tough to explain, but the analogy of digital images is a little bit misleading. Because it's not possible for us to create a mathematical function that describes the image coming from the sensors in our camera, our only option for making the digital copy closer to the analog original is to reduce the amount of information lost at the sensor (increasing the resolution and bit depth).
    But audio is not like this: it actually is possible to create a mathematically perfect description of the analog signal such that it can be perfectly recreated (much like lossless compression), and that's what the sampling theorem is all about. As long as you stick to the rules (the signal's Fourier transform is zero above half the sample rate, the samples are evenly spaced, etc.), the digital signal perfectly describes the analog input wave, and the reconstructed wave will sound identical to the input wave (ignoring the aliasing filter's slope and any time-domain distortion it causes). This means that (again, with an ideal aliasing filter) there is literally zero audible difference in the audible range between 44.1 kHz, 48 kHz, 96 kHz, etc. Obviously this isn't true with photos, because you will always get something positive out of increasing the resolution!
    In this sense, a better analogy for audio is vector images, like SVG or PDF. These use mathematical descriptions of the images, so you can zoom in infinitely without degradation, and they perfectly represent the continuous nature of the shapes despite being stored as a long discrete string of numbers.

    • @DavidMacDonald
      @DavidMacDonald  3 years ago

      I disagree with your statement that vector graphics are a useful analogy. The ability to create a function that “perfectly” describes the signal is very similar to vector graphics, but that's not what's happening when audio is recorded. Using FFTs and related techniques to model a sound is like vectorizing a bitmap; it's a separate thing from the bitmap itself. Most audio is _never_ represented by a mathematical model. What you described is like a vector, but that's not what I described in the video.

    • @DavidMacDonald
      @DavidMacDonald  2 months ago

      @@lyntedrockley7295 It is very much not like vector graphics in the way the data is stored digitally. Vector graphics would be more analogous to basic synthesis from unit generators.
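
An aside on the "stick to the rules" condition discussed in this thread: the band-limiting requirement can be seen directly in code. The sketch below is my own illustration (assuming numpy; the 30 kHz tone and 44.1 kHz rate are arbitrary example numbers), showing that a tone above half the sample rate does not vanish, it folds ("aliases") to a different in-band frequency.

```python
import numpy as np

fs = 44_100                          # sample rate
f_in = 30_000                        # deliberately above fs/2 = 22,050 Hz
n = np.arange(4096)
samples = np.cos(2 * np.pi * f_in * n / fs)

# Windowed FFT just to locate the strongest frequency in the sampled data.
spectrum = np.abs(np.fft.rfft(samples * np.hanning(len(n))))
freqs = np.fft.rfftfreq(len(n), d=1 / fs)
print(freqs[np.argmax(spectrum)])    # ~14,100 Hz: the 30 kHz tone aliases to fs - f_in
```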

  • @GodmanchesterGoblin
    @GodmanchesterGoblin 2 months ago

    I would add that not all WAV files are uncompressed, although most of them are. The exceptions are WAV files using ADPCM compression.

    • @nicksterj
      @nicksterj 2 months ago +1

      Indeed, WAV can even store MP3-compressed audio!

    • @GodmanchesterGoblin
      @GodmanchesterGoblin 2 months ago

      @nicksterj Yes, although I had to look that one up. 😊 I have only worked with LPCM and ADPCM in real products.

    • @nicksterj
      @nicksterj 2 months ago +1

      @@GodmanchesterGoblin (Confession: I had to look it up too!) 🤓

    • @Jacques80120
      @Jacques80120 1 month ago

      Yes exactly, WAV is only a container and its "codec" (if you can call it that) is usually PCM.
      The WAV container can actually hold almost any other codec, but most devices/software won't know what to do with it 😂
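
Since this thread turns on WAV being a container with a declared codec, here is a minimal sketch (my own, not from the video) that reads the format tag out of a WAV file's "fmt " chunk. The path "example.wav" is a placeholder, and error handling is kept to a minimum.

```python
import struct

# A few common wFormatTag codes from the RIFF/WAVE spec.
FORMAT_TAGS = {0x0001: "PCM", 0x0002: "MS ADPCM", 0x0003: "IEEE float",
               0x0011: "IMA ADPCM", 0x0055: "MP3", 0xFFFE: "Extensible"}

def wav_format_tag(path):
    with open(path, "rb") as f:
        riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no fmt chunk found")
            chunk_id, size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                (tag,) = struct.unpack("<H", f.read(2))
                return FORMAT_TAGS.get(tag, hex(tag))
            f.seek(size + (size & 1), 1)   # chunks are padded to an even byte count

print(wav_format_tag("example.wav"))       # "PCM" for a typical uncompressed WAV
```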

  • @nicksterj
    @nicksterj 2 months ago

    Excellent explanation, but it perpetuates the myth that more samples make the recording "more accurate" or "truer to the original waveform." The analog output after conversion from the sample set at 5:35 will be exactly the same as the one at 5:05, because even though the sample rate is lower, it's still above the Nyquist limit for that frequency, i.e. more than 2 samples per cycle.
    Also, it is not necessary to capture the "highest and lowest" levels in the signal. You just need > 2 samples per cycle and it doesn't matter where in the phase they are taken. At exactly 2x the frequency, the samples could fall on the zero crossings and you'd get silence, which is why you need more than 2x.

    • @DavidMacDonald
      @DavidMacDonald  2 months ago

      Not really. More samples _does_ make the digital data more representative of the analog phenomenon. There is certainly a point of diminishing returns where more samples doesn't make a difference, but if you think the number of samples doesn't matter, you could record at a sample rate of 1025 Hz and save a lot of data! The only reason the analog output of those two moments would be identical is because the graphic is showing a sine wave. A real audio recording would have a more complex waveform. And the exact high and low points don't matter much, but yes, phase _does_ matter for frequencies that are very near 1/2 the sample rate. That's why we record at double the highest frequency plus a little bit. That little bit extra is (in part) to give some room to account for phase. Otherwise, a sine tone slightly out of phase at near half the sample rate would sound softer than one in phase.

    • @nicksterj
      @nicksterj 2 months ago

      @@DavidMacDonald I don't disagree with any of that except the statement "more samples does make the digital data more representative of the analog phenomenon." The reason is that when your sample rate passes 2x the frequency, you have enough information to reconstruct it perfectly according to Shannon-Nyquist. The analog output of a 20 kHz tone sampled at 44.1 kHz is exactly the same as if sampled at 48 kHz, 88.2 kHz, 96 kHz, etc. In fact it _cannot_ be any different because of the sinusoidal nature of sound waves. And if your sampling rate is sufficient for 20 kHz it is by definition sufficient for everything below that. Higher sampling rates don't gain you anything but a higher band limit; they don't make frequencies already within the audio band any more "accurate." A complex waveform is no different from a sine wave for the purposes of Nyquist.

    • @nicksterj
      @nicksterj 2 months ago +1

      P.S. if your signal contains nothing above 500 Hz, you certainly _can_ sample at 1025 Hz and save a lot of data. But you know that. :)

    • @GodmanchesterGoblin
      @GodmanchesterGoblin 2 months ago

      ​@DavidMacDonald The only case where 1025 samples per second would not be sufficient would be when the signal being sampled has frequency components above half the sampling frequency. The input being bandwidth limited is a fundamental requirement in sampling theory when applying Nyquist. As soon as the input signal has components above half the sample rate you get aliasing effects, which are unpleasant and unwanted in audio.

    • @nicksterj
      @nicksterj 2 months ago

      @@GodmanchesterGoblin Yes, I'm aware of that. The point still stands that having more samples per cycle _beyond_ the Nyquist frequency does nothing to improve the sound or make the output waveform "more accurate." It only gives you higher bandwidth.
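
For anyone who wants to test the claim being debated in this thread, here is a short numpy/scipy sketch (my own construction, not from the video). The test waveform, frequencies, and record length are arbitrary, and the record is chosen to be exactly periodic so that FFT-based resampling stands in for ideal band-limited reconstruction.

```python
import numpy as np
from scipy.signal import resample

def test_signal(t):
    # A "complex" waveform: several partials, all below 20 kHz.
    return (0.6 * np.sin(2 * np.pi * 1_000 * t)
            + 0.3 * np.sin(2 * np.pi * 5_000 * t + 0.7)
            + 0.1 * np.sin(2 * np.pi * 15_000 * t + 1.9))

duration = 0.01                          # 10 ms record
n_out = 4410                             # common dense grid (441 kHz) for comparison
t_out = np.arange(n_out) * duration / n_out
reference = test_signal(t_out)

for fs in (44_100, 96_000):
    n = int(round(fs * duration))        # 441 or 960 samples
    samples = test_signal(np.arange(n) / fs)
    reconstructed = resample(samples, n_out)   # band-limited (periodic) reconstruction
    print(fs, np.max(np.abs(reconstructed - reference)))

# Both rates reproduce the waveform to within floating-point error:
# the extra samples at 96 kHz add no in-band accuracy.
```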

  • @andrewsheehy2441
    @andrewsheehy2441 1 month ago

    It is a common misconception that the Nyquist Limit allows accurate reconstruction of an analog signal. It does not. The Nyquist Limit comes from two papers published in 1928 and 1948 (Nyquist and Shannon, respectively), and those papers are focused on digital communication. These papers do not deal with continuously varying signals at all. In fact, for a high-fidelity reconstruction you need to sample at at least 5x, and preferably 10x, the highest frequency present in the signal.

    • @nicksterj
      @nicksterj 1 month ago

      Of course it applies to continuously varying signals. If it didn't, you wouldn't hear anything remotely musical out of any digital system. The frequency response chart, THD and S/N figures on any digital recorder will tell you how good the fidelity is (spoiler alert: for hi-res, nearly perfect).
      The Theorem itself states: “If a function x(t) contains no frequencies higher than B hertz, then it can be completely determined from its ordinates at a sequence of points spaced less than 1/(2B) seconds apart.” Function _x_ is a continuously varying function, as _all_ audio signals are made up of continuous sinusoids.

    • @andrewsheehy2441
      @andrewsheehy2441 1 month ago

      ​@@nicksterj Thanks for your reply.
      If we look at that quote in more detail (p. 34 of Shannon's classic paper), we first see that we are assuming an infinite number of samples:
      X_n = [..., s_-2, s_-1, s_0, s_1, s_2, ...]
      We also see that the reconstruction is based on summing together an infinite set of sinc functions, with each one centred on a sample point and scaled using the value of f(t) at that point.
      This is the classic sinc reconstruction.
      We further see that f(t) is itself implicitly assumed to be equal to the sum of a set of sinusoids (which, by definition, means that f(t) is periodic with a period, T, defined by the LCM of the periods of the constituent sinusoids).
      As for the condition that the separation between adjacent samples must be less than half the period of the highest frequency component present in f(t), this arises from the basis functions chosen for the reconstruction, which are all zero at the sample points and have peak values halfway between the sample points: if f(t) contained sinusoids with a half-wave duration less than the sample separation, then the sinc functions (which do not have zeros between sample points) would not be usable.
      It's worth pointing out that the sinc functions chosen are not special and could be replaced by piecewise parabolic functions or even triangle functions: because the reconstruction is based on an infinite set, all that is required of that set is that all unique pairs of functions are orthogonal and each basis function is equal to zero at all sample points other than the one upon which it is centred.
      But in a practical case, where we have N samples, N reconstruction functions, and non-periodic signals, the Nyquist limit is a crude guide at best.
      If one conducts some precise simulations with a test function composed of a finite set of sinusoids of different amplitudes, frequencies, and phases (where we know for sure what the highest frequency component is), then sampling at the Nyquist limit, Sr = 2·f_max, will definitely result in substantial reconstruction errors between samples.

    • @nicksterj
      @nicksterj 1 month ago +1

      @@andrewsheehy2441 Well, you obviously know more about it than I do. I'll have to take your word for it, but with the caveat that theory and practice are two different animals. :)
      At any rate, the theorem will still tell you the minimum sampling rate you need to capture a real-world signal without loss of information. When I talk about "perfect" reconstruction, in practical terms that means only that further improvements in precision, while certainly possible, would yield no audible benefit!

    • @andrewsheehy2441
      @andrewsheehy2441 1 month ago

      ​@@nicksterj In the field electrical engineering probably the most feared topic is digital signal processing (DSP). This is because deeply understanding what’s really going on requires a pretty decent level of competence with linear algebra, basis functions, trigonometry, calculus, matrices, complex numbers, Fourier analysis, probability and statistics, differential equations - as well as, these days, competence with coding and computational methods. It gets harder if you’re trying to do fancy things on an embedded system.
      It’s probably not surprising that the field is replete with misconceptions and misunderstandings - for instance we are told that in order to perform a DFT you need to use complex numbers and matrices. This is wrong: you can manage perfectly well without using any complex numbers at all.
      Most practising engineers simply don’t have the time or energy to really go deep. But if you do then one will find many nuances and insights which serve to keep the subject endlessly fascinating.
      Returning to the point: I recall when the CD first came out, the analog crowd maintained that digitisation at 44.1k samples per second was somehow not a faithful way to represent music, which (very optimistically) contains frequency components up to 20 kHz. They were mostly unable to articulate why, but they were right all along. I find that pretty amusing!
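
The sinc reconstruction described earlier in this thread is short enough to write out directly. The sketch below is my own (the tone, rate, and record length are purely illustrative): it evaluates the truncated Whittaker-Shannon sum for a finite record and shows one consequence of working with finite sample sets, namely that reconstruction error near the edges of the record is much larger than in the middle.

```python
import numpy as np

def sinc_reconstruct(samples, fs, t):
    """x(t) ~= sum_n x[n] * sinc(fs*t - n), truncated to the samples we actually have."""
    n = np.arange(len(samples))
    return np.sinc(fs * t[:, None] - n) @ samples

fs = 48_000
record = np.sin(2 * np.pi * 3_000 * np.arange(fs // 100) / fs)   # 10 ms of a 3 kHz tone

t_mid = np.linspace(0.004, 0.006, 500)      # middle of the record
t_edge = np.linspace(0.0, 0.001, 500)       # start of the record

mid_err = np.max(np.abs(sinc_reconstruct(record, fs, t_mid) - np.sin(2 * np.pi * 3_000 * t_mid)))
edge_err = np.max(np.abs(sinc_reconstruct(record, fs, t_edge) - np.sin(2 * np.pi * 3_000 * t_edge)))
print(mid_err, edge_err)                    # the edge error is markedly larger: truncation matters
```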

  • @ben94_
    @ben94_ 2 years ago +1

    thank you

  • @elijahjflowers
    @elijahjflowers 8 months ago +1

    thanks, do you have any tips for understanding audio interpolation?

    • @DavidMacDonald
      @DavidMacDonald  7 months ago +1

      Interpolation happens any time the software has to make an educated guess about what is happening between the samples. This might happen if you are stretching or re-pitching audio, or just resampling a 44.1k clip into a 48k project.

    • @elijahjflowers
      @elijahjflowers 7 months ago

      @@DavidMacDonald thanks, but it's hard to find which sinc formula is better to make that “guess” with, and I've seen some linear interpolations that add samples to the file.

    • @DavidMacDonald
      @DavidMacDonald  7 months ago +1

      @@elijahjflowers interpolation always adds samples to the file. That’s its job. It isn’t ever going to be perfect and different algorithms will give different results in different circumstances. You just have to experiment.
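
To make the interpolation discussion above concrete, here is a rough sketch comparing naive linear interpolation with a band-limited polyphase resampler when converting 44.1 kHz material to 48 kHz. The 10 kHz test tone and scipy's resample_poly are my own illustrative choices, not the algorithm any particular DAW uses.

```python
import numpy as np
from scipy.signal import resample_poly

fs_in, fs_out = 44_100, 48_000          # 48000/44100 reduces to 160/147
f0 = 10_000                             # a high tone makes the difference obvious
t_in = np.arange(4410) / fs_in          # 100 ms at 44.1 kHz
x = np.sin(2 * np.pi * f0 * t_in)

t_out = np.arange(4800) / fs_out        # the same 100 ms on the 48 kHz grid
truth = np.sin(2 * np.pi * f0 * t_out)

linear = np.interp(t_out, t_in, x)              # straight lines between samples
bandlimited = resample_poly(x, 160, 147)        # low-pass (sinc-like) interpolation

mid = slice(400, 4400)                  # ignore the ends, where filter transients live
print("linear      max error:", np.max(np.abs(linear[mid] - truth[mid])))
print("bandlimited max error:", np.max(np.abs(bandlimited[mid] - truth[mid])))
# The band-limited resampler tracks the true waveform far more closely;
# linear interpolation visibly rounds off a 10 kHz tone sampled at 44.1 kHz.
```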