How Digital Audio Works - Computerphile

Computerphile

Просмотров 266 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 3 янв 2025

Комментарии • 493

@OneBigBug 9 лет назад ⁺⁴³⁹
Obligatory correction: The reason that 44.1 was chosen is not because they measured the threshold of human hearing at precisely 22.05, it's because it was "about 20kHz" and they needed to add a bit more on the high end as a transition band, which you might think of as...room for error in how they design the electronics. The reason it's 44,100 and not 44000 or 45000 (or 48,000) is related to how it was stored on old video recording systems. Today, if we didn't care about keeping to standards developed ages ago, it really just needs to be "40,000Hz + a bit more than that"
The "Humans can hear 20kHz" thing is just a general guideline, not a hard and fast rule. Biology is not precise enough to say "22.05kHz is exactly the right amount." People vary too much for that.
edit: Also, sampling a signal at twice the frequency is called the "Nyquist rate" and will give you alias-free sampling, which is why we use that. It's not just arbitrarily "Yeah, that seems like enough", it's a mathematical rule. That's not really super important to know, but it's a good term to Google if you want to know more.
@Timmsstuff 9 лет назад ⁺¹⁰
+OneBigBug Exactly, there needs to be room for the antialiasing filter
@xasdrubalex 9 лет назад ⁺²⁰
+OneBigBug +1 for Nyquist (aka Shannon-Nyquist sampling theorem), i was a little disappointed because this great fact was missing from the video
@FireBarrels 9 лет назад
+Joe Mills It sounds more like distortion from the speaker trying to play that loudly than a 24Khz tone though.
@dvamateur 9 лет назад ⁺¹
+OneBigBug 48kHz was used on DAT tape to prevent straight digital copying from CD which are 44.1kHz. Yes, 48kHz was used to combat piracy.
@anothergol 9 лет назад ⁺⁵
+Joe Mills most people won't even hear a 17khz tone (which isn't a problem, there's nothing interesting for us above that), and if you think you can, double-check your system.
And this whole video, while not saying huge BS, is technically rather vague.
@TheHoaxHotel 9 лет назад ⁺²⁸⁷
That's why I always record at 88kHz, so my dog can appreciate the fine notes of the super piccolo.
@pracheerdeka6737 4 года назад ⁺¹
hahaahha
@pracheerdeka6737 4 года назад ⁺¹
HE IS NOT SPEAKING FOR DOGS LOL HAHAHAH
@sonnenhafen5499 4 года назад
@donald trump why should a speaker not be able to output 20kHz+? besides that it isn't particularly flat in that band because it's not designed for that.
DAC makes sense, do you know why in detail?
@juniorcastiel952 3 года назад
i guess im asking the wrong place but does someone know a tool to get back into an Instagram account?
I stupidly forgot the password. I love any tips you can offer me
@samdustin6733 3 года назад
@Junior Castiel Instablaster :)
@FASTFASTmusic 9 лет назад ⁺⁶⁶
and at 7:20 the Square wave myth!!! This has been disproven countless times. The dots on a waveform although they look like square waves on the screen are actually just average values of where the real waveform is. When it goes through your DAC a perfect sine-wave will ALWAYS be reproduced even at 2-bit 22Khz tone. Furthermore a Square wave is physically impossible in nature (check out Fourier expansion and adding sine-waves to represent square waves)
@monkeyxx 7 лет назад ⁺⁹
That was the first "gotcha" I spotted in this video as well.
@francescodipalma9785 7 лет назад ⁺⁶
Glad to see that someone pointed this out. It's incredible how almost everybody got this wrong, even pros.
@FASTFASTmusic 9 лет назад ⁺⁴⁰
The very first statement was wrong. The air doesn't move across a room until it hits your eardrums, that would imply that the air in the room moves at the speed of sound across a room. The air on average stays in the same place, but as it is an elastic medium the information it carries moves across the room at the speed of sound, so not so much like billiard balls passing on energy, but rather billiard balls attached by springs that return to where they were initially.
@GuyWithAnAmazingHat 9 лет назад ⁺¹¹⁹
I'd love to see more videos on digital audio, sound recording and editing.
@schitlipz 9 лет назад ⁺³
[not a reply to your comment, seems I can't post a new comment but only "reply"]
Brainy guys:
1) What about the information theory (or whatever it's called) where it takes like 3 samples to reconstruct a sine wave?? So two samples of a 20khz wave at 44khz doesn't sound like enough. I dunno.
2) With regards to depth; why not crowd up the samples at lower volumes, in a logarithmic manner - the way we percieve sound. That way high frq, low volume sounds would have better reproduction. Again I dunno.
@Wasthere73 9 лет назад ⁺²
+schitlipz for the most part, many people are unable to hear up to 20k, and tbh, its not a huge loss. my own ears cut off around 18.5k. Unless you are gonna design software around audio, you wont have to worry if 44.1 is enough. For the most part, it is. just do your own hearing tests. If you are super curious though, look up the nyquist theory.
and about bit depth, that one is a lot more complicated. the guy in the video wasnt entirely right when he explained it. i cant explain it to you in simplified terms. just look it up on youtube
@pandaofdoom7684 8 лет назад ⁺²⁸
Correction (at 7:10): The explanation of the noise ("grain") of very low volume signals is wrong. The computer doesn't "connect the samples with lines". Instead, whenever the computer measures the signal, it has to be rounded to the next associated integer value. This causes the so called quantization error - basically the rounding error. So the signal you hear upon playback is the original one plus the quantization error, which causes distortion (unless dithering is used, but that's another story). No square waves here.
@sumit_2111 Год назад
Exactly, if we look at images and say we have only 2 bits to store intensity value of pixels so effectively 4 levels between highest and lowest intensity value, the lower values will be mapped to zero and the medium to higher all will be mapped to 1 and hence will create very bad looking photos as the detailing is gone
@tylerwmbass Год назад
Dither is a topic that could get its own video too
@pawepalczynski5621 9 лет назад ⁺⁷⁷
The hearing range is up to 16-20 kHz. The reason it's 44.1 kHz is because it allows to detect 22.05 kHz frequencies. The gap between 20 kHz and 22.05 kHz is because before the A/D converter there is a filter that cuts off anything above 22.05 kHz to avoid aliasing. That filter starts cutting off at 20 kHz and reaches -60dB at 22.05 kHz so that nothing audible is lost. If the filter cut-off at exactly 20 kHz (or much closer to it) it would introduce a lot of distortion in frequncies and phases.
@black_platypus 9 лет назад
+Paweł Palczyński That is some cool extra info, thanks for sharing!
@VickyBro 9 лет назад
+Paweł Palczyński Something like a badgap?
@FernieCanto 9 лет назад ⁺⁵
+Paweł Palczyński Well, the actual reason why 44.1 KHz was chosen is because, originally, digital audio for CD production was recorded in Sony U-Matic tapes (yes, the format for analogue video), and the technical specifications of the tape made 44.1KHz the most obvious choice. Having a roll-off filter at the top of the spectrum is useful to avoid distortion, but the exact frequency you choose is not really very important; it's not like a lot of people can hear anything above 18 KHz.
@pawepalczynski5621 9 лет назад ⁺¹
+Joe Mills If you use bones (skull, jaw) to input sound instead of eardrum you can go even much higher. There is a upper cutoff frequency for air transmitted sound however, because at some point sound would need to be painfully loud for you to hear. A frequency at which it is not yet painful but you can still hear it as loud as 1kHz at 10^(-12) W m^(-2) intensity would be a highest you can hear then.
@marcinsosinski766 9 лет назад ⁺¹
+Paweł Palczyński Actually 44,1 has nothing to do with any type of filter. You can always calculate and implement a slightly different one. There is one simple reason for 44,1 or 48 kHz was chosen. There were no HARD DISKS big nough at that time to perform all digital master. Remember it's 1970s. Only posibll method to make a master from witch you could make a matrice to press CD was to record digital signal on video tapes.This was pre BETA time. So only viable option was to use UMATIC (stanard video recorders in TV production of that period) I cant remember exact resolution in nowadays digital terms but it worked out you could record (in monochrome) 16bit/48kHz or 16bit/44.1kHz on tape running 30 frames per second (NTSC framerate) 44.1 was choosen for CD probably only because you could fit extra 8% running time on disk that way. As for argue that it is inferior to 48, that doesn't matter. First CD players from Phillips and Sony were equipped with 14 bit DAC's, and up for this day people value this players for pleasant sound.
@dlarge6502 5 лет назад ⁺³
44.1 kHz was chosen as it captured all desired frequencies and it also just happened to be the perfect rate to store the samples digitally on PAL and NTSC videotape. This was used as the storage method for transporting the audio between locations.
Later, when we didnt need to care about storing it on videotape we switched to 48 kHz as it helps with making filtering much simpler.
Also you dont get square waves, thats impossible with a band limited signal. You can only get the original signal, smooth and non-square. 16 bits determines where the noise floor is and for playback 16 bits is more than enough to capture the dying cymbal. You use more bits, like 32 bit float, when editing before finally exporting to 16 bits. Editing with 24 bits or 32 bit floats helps because it gives you so much headroom to apply filters etc without adding any more noise that will be noticeable.
@brianpacheco1927 7 лет назад ⁺⁴
Working in IT and being a huge music lover, this gives me a huge level of appreciation for artists and their studio engineers on how much work goes into putting together an album with lots of tiny microdetails never really ever heard. Definitely makes me want to go out and buy some higher end audio rather than all the streaming MP3's we do nowadays.
@patk2225 6 лет назад
Brian Pacheco yeah my sound system is pixel perfect and my sound system has microdetails that no other sound system has
@brianpacheco1927 6 лет назад
When I had the HD800s they were great for micro-detail but I realized that it just gets fatiguing after a while and doesn't sound natural.
@patk2225 6 лет назад
my tweeters and subwoofer has 21 watts but my mid range woofer has 190 watts why does my mid range woofer need so many watts ?
@dude157 9 лет назад ⁺¹⁶
Hi Computerphile, great video. I'll just add a minor correction for you. Humans can hear up to 20kHz , not 22kHz. In fact, by the time people reach adulthood, the top end of hearing is closer to 16kHz. The reason a sampling frequency of 44.1kHz was chosen as a standard was not because it is twice that of 22050Hz. It's to do with a problem called aliasing. Any frequency content contained within a signal, which is above half of the sampling frequency will introduce low frequency alias signals (See Nyquist Theorem). This is the exact same reason helicopter blades appear to spin backwards or slowly in video. For audio we would like to capture frequency information up to 20kHz, thus determining the sample frequency to be 40kHz. The only problem is, any high frequency information above 20kHz will ruin the audio due to aliasing. So in addition we add a low pass filter, called an anti-aliasing filter. Filters can't have a really steep cut off without causing all sorts of distortion, so we need to leave a little bit of space in the frequency spectrum to fit one in. Hence we oversample at 44.1kHz to allow for that.
@EgoShredder 9 лет назад
+Sam Smith I would add that 16KHz is IF you have looked after your hearing, e.g. wearing ear plugs at concerts and not blowing your ear drums out with awful dance music. I've done a few sample audio tests and found I can hear to around 17KHz and I am 44 years old. Certain sounds at specific frequencies cause me a significant amount of physical pain, however this had no effect on my mum who could not hear anything. So I wonder if the rate of decline in hearing is steady or suddenly drops off the cliff at a certain age?
@LinucNerd Год назад
Eight years later, and I see this wonderful comment... Even tho I still can't quite make sense of aliasing.
@heavymaskinen 8 лет назад ⁺⁴
Only partially correct about the bit-depth. It really only determines the noise floor (noise made from quantisation). Very important for recording. For consumer-playback - not so much.
The human hearing threshold is generally regarded as 20kHz, but sensitivity drops way before that, and both the sensitivity and threshold lowers with age.
@paulanderson79 8 лет назад ⁺¹
Yes. There's a lot of confusion about bit depth and resolution. For consumer playback there is no benefit as you say.
@dlarge6502 5 лет назад
I can hear up to about 15k at 38. And a 15k tone really isnt very interesting!
@tylerwatt12 9 лет назад ⁺⁸
Now I know why higher sample rates improve treble clarity, in a waveform high frequencies spike up and down a lot very quickly, a slow sample rate misses those spikes. Great video!
@oresteszoupanos 9 лет назад ⁺¹
+Tyler Watthanaphand Yup, that's part of the Sampling Theorem. There's a rule that says "if you wanna capture perfect audio up to X KHz, then you need to sample *at a minimum" of 2X KHz." Search for Nyquist Frequency for more info... :-)
@oliveirahugon 9 лет назад ⁺¹
+Orestes Zoupanos Better results are achieved when you double the sampling frequency (according to the maximum frequency of the signal) and add a little more sampling cycles. Hence the 44.1 KHz, which is 22 KHz * 2 + 100 Hz. The extra 100 Hz prevent the signal from containing aliasing artifacts.
@WingmanSR 9 лет назад ⁺¹
+Hugo Neves You're right, but he did basically say that. That's what he meant by _sample *at a minimum" of 2X KHz."_ . Granted, that doesn't really emphasise the benefit of the fudge factor.
I guess it depends on how you think of it, it's either _Nyquist >= 2X_ or _2X+100 = Nyquist_.
@RyanRenteria 9 лет назад
+Hugo Neves they picked 44.1 because it was compatible with both PAL and NTSC video equipment. Early digital audio was stored on video cassettes. They needed a minimum of 40hz, and then extra room for the antialiasing filter, and it had to be divisible by both PAL and NTSC standards.
@GoldSrc_ 9 лет назад
+Tyler Watthanaphand
Sample rates above 44.1KHz do not improve audio quality at all.
It doesn't matter if you go from 20Hz to 20KHz in an instant, at 44.1 it will all be perfect.
@omgimgfut 9 лет назад ⁺⁶⁶
a more in-depth explanation can be found by searching: Digital Show and Tell Monty Montgomery on youtube
@GoldSrc_ 9 лет назад ⁺⁷
+omgimgfut
Oh yes, that video explains it better.
@planetweed 9 лет назад ⁺²
+omgimgfut Was about to comment the exact same thing but you beat me to it :)
@elephantgrass631 8 лет назад ⁺¹
For some reason, I can only thumbs up your comment once. I was trying to thumbs up it a billion times.
@francescodipalma9785 7 лет назад ⁺⁵
It's not just more in depth. It's also the correct explanation.
@Hermiel 5 лет назад
Ha, I just posted the same thing.
@otakuribo 9 лет назад ⁺³
I'm loving the audio stuff, between this and Sixty Symbols, I've learned a lot.
@realraven2000 9 лет назад ⁺⁴
You also get "squares" at the top end when you compress to hard as the peak of the waves can be cut off. Any sharp edges are heard as a form of distortion.
@HandyAndyTechTips 9 лет назад ⁺²²
Great video. Just thought I'd add to the chorus, and comment on an irony in digital audio. Back when CDs were introduced, everyone was trumpeting how amazing their 90+db of dynamic range was. Now, we're lucky to see discs released from major record labels that use more than 10db :-) And, in fact, most albums I've bought recently go one step worse, and use heaps of heavy digital clipping on all of the drum hits. A bit sad, I suppose....
@JonnyInfinite 9 лет назад ⁺⁸
+HandyAndy Tech Tips that's a consequence of the Loudness War...
@johnberkley6942 9 лет назад ⁺³
Clipping on the drum tracks has a long history. Back in the day Motown got a great drum sound that way. But analog and digital clipping are different beasts. What sounds great on tape sounds really shitty in the digital domain.
@snickers10m 9 лет назад ⁺⁵
I'd love to hear more from this guy!
@vyli1 9 лет назад ⁺⁸
excellent video... I hope there's more coming from this guy; Amazing, thanks for the upload.
@dipi71 9 лет назад ⁺²
Nice illustrations, but this was just the tip of the tip of the iceberg.
It would be nice for further episodes about the topic of digital audio to mention Shannon, the logarithmic nature of the decibel scale, real-life annoyances like the noise floor and other concepts, e.g. bit rate, data compression (in FLAC, for example) vs data reduction (i.e. data loss, and that in more than one respect, like in MP3). It's a very interesting field of topics.
@NemGames 9 лет назад ⁺²
I actually followed this whole thing. great explanation
@TheBluMeeny 9 лет назад ⁺⁵
Ha! I remember asking for a video on this topic ages ago! Its awesome that we finally got it!
@blackboxlearning Год назад
Amazing work ! I had wanted to understand audio processing for a while now. Thank you for the lovely explanation and delivery.
@MrSparker95 9 лет назад ⁺¹
If sampling of a signal with frequency higher than half of the sampling frequency occurs then the signal will not be 'cut off', it will transform into a signal with another frequency.
@black_platypus 9 лет назад
+Sparker yes, nobody said otherwise... they were only talking about cutting off when adding waves past the bit depth, weren't they?
@MrSparker95 9 лет назад
Benjamin Philipp No, I'm talking about the 'cutting off' at 2:24.
@black_platypus 9 лет назад
+Sparker oh, pardon, I "overlooked" that (shame that you can't use the word "overheard" in the english language like that)
Yeah, I guess that can only ever be taken as an abstract concept where you "cut off" at an information threshold :/
@LiViro1 7 лет назад
I decided to learn something about digital audio, and ended up here (among other places). Great explanation, you should be a teacher the way you communicate.
@colinsmith6480 9 лет назад ⁺²
Really nice explanation, thank you so much, been looking for a decent explanation for ages
@ClearInstructionsOnly 9 лет назад ⁺²⁶
Instruction Clear. Successfully picked up acoustic waves through my red cat. Thank you.
@anzer789 6 лет назад ⁺¹
I saw on another channel that the 44.1 kHz sampling rate was because we kept the human limit of hearing as 20 kHz and added a 2.05 kHz extra limit because low pass filters weren't accurate enough to cut off exactly above 20 kHz, so we added a bit of leeway. Double that and you end up with 44.1 kHz
@esekion1 6 лет назад ⁺¹
and it's true !
@dlarge6502 5 лет назад
It was also the prefect rate for data storage methods used at the time.
@darrylwatkins2335 5 лет назад
Best explanation of how Soundwaves are converted to digital. Thank you much!
@yanwo2359 9 лет назад ⁺⁴³
Excellent succinct explanation! Ironic that the audio had a hum throughout.
@MacoveiVlad 9 лет назад
Yes, it is some kind of fan. But if the video editor heard it he would have removed it. That is why i suppose he did not hear it.
@seanski44 9 лет назад ⁺⁷
Dave's PC was by his knee, the fan was running at different speeds throughout the video so not easy to remove.
@MacoveiVlad 9 лет назад
So it was Murphy's law :)
@Matiburon04 9 лет назад
+Yan Wo It was on purpose, obviously
@EgoShredder 9 лет назад ⁺²
+Yan Wo They had a recording session of meditating monks in doing a new album. ;)
@davidweimers4471 9 лет назад
This is actually an important part of the Sound Engineering degree that I took. Understanding digital audio means that you need to understand the Nyquist Theorem to make the best decisions on how you're going to record a particular source for an end medium and really influences the quality of the final product. Digital audio processing has become an every day part of your average Sound Engineers job these days.
@adamhlj 9 лет назад ⁺¹
This is exactly what I have been wondering about lately. I just got a Zoom H5 and was confused about all the recording settings! This explained it.
@Sandromatic 9 лет назад ⁺³
Could you do sound with a logarithmic scale? such that the shorter waves can get more detail than the larger ones?
@deluksic 9 лет назад
Sandra Nicole Yeah, more like a float rather than integer :)
@WarrenGarabrandt 9 лет назад
+Sandra Nicole Interestingly, this is sort of what happens with audio compression techniques. Many audio codecs chop up the continuous sound signal into short sample frames (of a few milliseconds long each) and convert the audio signal into discrete frequencies (with a Fourier Transform of some kind). Then a psychoacoustic model is applied to the frequency spectrum to eliminate details our ears care less about, and apply emphasis to those remaining that it does care about. This is sometimes done with a MEL Frequency analysis, thought every codec has its own method. Lastly, the remaining frequencies are packed in a compressed data stream, such as a Huffman tree, or LZW, or whatever the codec calls for. Decoding and reconstituting the original sound perfectly is of course impossible at this point, so they call these kinds of codecs "Lossy", since they throw out a lot of information to fit the most important parts of the sound into the least amount of data possible.
@anothergol 9 лет назад
+Sandra Nicole as written, it does exist, but for 8bit audio. For 16bit it's simply not needed. And in fact, even 14bit is already enough for our perception, unless you like to crank up the volume a lot on quiet parts of a song.
@johnberkley6942 9 лет назад ⁺¹
Love it! More! For instance a video about the geeky side of compression. It would be nice to understand what I'm doing when I twiddle them knobs...
@MisterSofty 9 лет назад ⁺³⁴
I love the square wave! Stop hating on the square wave! :D
@vuurniacsquarewave5091 9 лет назад ⁺³
+Mister Softy Pulse waves with different duty cycles... they're good for everything!
@PeterWalkerHP16c 9 лет назад ⁺²
+Mister Softy
LOL, I'll have one of everything thanks Mr Fourier.
@J2897Tutorials 9 лет назад ⁺⁷
+Mister Softy Square waves can be dangerous depending on the size of your ship.
@RubixB0y 9 лет назад ⁺¹
After "Bitshift Variations in C minor" I have a special place in my heart for sawtooth waves.
@Scarabola 4 года назад
8 bit video games have some bomb music.
@WhyFi59 9 лет назад ⁺¹
Wow! This is just what I was waiting for! Thank you!
@nigelnigel9773 9 лет назад ⁺⁵
Whilst I agree this is a simplification, this is pretty much what A Level music technology teaches and it's explained well.
@RC-1290 9 лет назад
7:15 The computer doesn't have to draw lines between the samples. They can just remain individual samples. When you output the signal it's smoothed out anyway (I assume the momentum of the moving bits in your speakers help with that).
@hydrox24 9 лет назад ⁺²
+RC-1290 It's not smoothed out so much as a sin wave is fitted to the samples, and the DAC just outputs a perfect combination of sin waves.
@wesleyjin8071 3 года назад
This is great info for a producer trying to approach it from an engineering side.
@tsjoencinema 9 лет назад ⁺⁵
Awesome video. More like this, please.
@GoldSrc_ 9 лет назад ⁺⁵
Yeah, I will have to call BS about the part about the sound card outputting a square wave if the bit depth is too low.
All the bit depth affects is the noise floor.
@RWoody1995 9 лет назад
+Gordon Freeman he wasn't wrong, he was talking about in an audio editing situation, if you add two 24bit tracks together without halving the volume of each beforehand you will get clipping because you will likely have parts of the audio that are at lets say 40,000/65536 in each, add those together and you get 80,000 which is out of range of that 65536 so anything above 65536 will be cut off, if this is bad enough it would become a square wave, and the reason you wouldn't halve it beforehand even though that might seem simpler is because you will throw away some of the data in doing that, better to add them together in a higher bit depth and then convert that back down to 24bit afterwards or in the case of the final output 16bit. in terms of the final output file you are right but thats not what he was explaining.
@GoldSrc_ 9 лет назад
megaspeed2v2
Fair enough, but your average joe will never get into contact with 24bit audio.
@RWoody1995 9 лет назад
Gordon Freeman this isnt about your average joe though, this applies to the audio engineers producing the audio in the first place before it gets compressed down to 16bit and put on a CD DVD or game
@RWoody1995 9 лет назад
Gordon Freeman Yes its for the average joe, he was just explaining that anything higher than 16bit is only needed for audio that is going to be edited, not really rocket science.
@GoldSrc_ 9 лет назад
megaspeed2v2
Well, he said that the computer draws lines to connect the samples, that is not what happens, any audio engineer would know that's not how you would explain it to the average joe.
He also could have simplified it by saying that the bit depth affects the quietest sounds which don't end up sounding like square waves.
He got quite a few things wrong, he could have done better.
@yvonnevanderlaak8169 9 лет назад ⁺⁵
There is no need to go into floating point math and sampling theorem on the first video! This explanation is simple to follow and covers the basics very well. Maybe another video can go into more details on different aspects. Great video. Thanks :)
@geonerd 9 лет назад ⁺¹
+Yvonne Van Der Laak Yea, but... ;) He's saying some things that involve sampling theory that happen to be completely wrong.
@teharbitur7377 9 лет назад ⁺⁶
How come you didn't mention the Nyquist-Shannon sampling theorem?
I know you are trying to make it less 'technical' but sometimes it's nice to mention some of the more technical things.
@SkitchAle 8 лет назад
+Teh Arbitur I was wandering the same thing
@h7opolo 3 года назад
1:46 with this information, i was able to deduce the solution to my staticky PC audio; I had the sample rate for audio output set to max setting of "24 bit, 192,000 Hz (Studio quality)," and once I lowered the sample rate to "24 bits, 96000 Hz (Studio quality)", all the static noise magically vanished! YAY, thank you, this had been harming me for years.
@woosix7735 3 года назад
quiet signals don't sound grainy because they are square waves, it's because of quantization noise.
@wesofx8148 9 лет назад ⁺³
Very well said!
There's a lot of analogies you can draw to photography and Photoshop. Clipping is like over-exposing an image. Sampling frequency is like pixel resolution. Bit-depth is like color-depth. In photography, ideally you want you final image to span the entire 24bit colorspace, with no obvious pixelation, and no obvious posterization. In audio production, you want your final track to span the entire 16bit _wavespace_ with no obvious digital distortions.
@unvergebeneid 9 лет назад ⁺¹
I know he was simplifying but to be sure, if you sample frequencies higher than twice the sample rate, you're not simply "cutting them off." It's actually much worse: you're introducing kind of "phantom frequencies" that weren't in the signal but turn up in your digital signal. Those are mirrored at the highest frequencies you can represent, so the higher you go into "you didn't filter properly" territory, the lower the frequency gets (until it can't anymore, then it gets higher again), thus being more off and mostly more noticeable, too.
@danwalker77 9 лет назад ⁺¹
Very nice video guys!
@repker 9 лет назад
Yea man, would love more topics on digital audio
@MuztabaHasanat 9 лет назад ⁺¹
if the wave does not fit in the bit depth , why don't we just increase the bit depth like 32 bit or 48 bit ? Another question is why not bit depth is 16 bit or 32 bit ? why 24 bit ?
@fuckwadify 9 лет назад ⁺¹
+Muztaba Hasanat the reason you can't keep increasing the bit depth is because the file size get bigger, remember 1 cd is 700mb and that only 16bit
@MuztabaHasanat 9 лет назад
Thanks to all :)
@oliveirahugon 9 лет назад
+Alex Lee Most humans can't detect the difference between an analog audio signal and a digital audio signal with 16 bits/sample. Also, when you use higher and higher bit depths, you need better digitization equipment. The least significant bits tend to be mostly noise if you don't possess a good equipment.
@hydrox24 9 лет назад ⁺²
+Muztaba Hasanat Every wave *could* fit in 16bit. It's just that for mixing and recording they require overhead. This is because every time you manipulate a wave a little bit, the volume of every sample needs to be described using a number between -16,000 and +16,000 (or so). If you change the volume for a sample and the maths ends up saying 'give this sample a volume of 5,256.6'. But you can't have floating point numbers (no decimal places). So it rounds it up to 5,257. You can't hear this change in volume, but it over many thousands of changes to a sample this begins to create a fairly significant change. a combination of lots of differences in .5 from what the wave should be. Done to many samples in a song, this creates a quiet noise.
Larger bit depths make this noise much quieter. So that is why 24-bit (or 32bit) is generally used in mixing. The final master doesn't need this because no more changes will be made and the noise doesn't have a chance to get loud. 16-bit is more than enough for even hi-fi listening.
16-bit gives enough control over volume that the quietest sounds are around -120 dB, which is well out of any human's hearing range.
@yaghiyahbrenner8902 9 лет назад
7:10 yes.. but at that volume its still sampling at 2^16-bit samples.this is where subjective listening vs science dont correlate, as some people doesn't hear it and others do. but 24-bit theoretically is better.
@MovingThePicture 9 лет назад ⁺³⁹
3bit are more than enough for the average loudness war song.
@lotrbuilders5041 6 лет назад ⁺¹
MovingThePicture true, but 4-bit is much easier to process
@kathalave6678 9 лет назад ⁺²
it helps me to understand this because i have a subject digital audio..i am a MT student multimedia technology
@GoldSrc_ 9 лет назад ⁺¹
+Kath Alave
Go watch
D/A and A/D | Digital Show and Tell (Monty Montgomery @ xiph.org)
here in youtube, it gives you a better explanation than this video.
@balazsfekete 9 лет назад
Having more experience with computer graphics, it is interesting to see how the same concepts apply to audio processing.
@ricktrott760 9 лет назад ⁺³
I just synthesised a wave in Sound Forge, then I boosted it and it 100% squared the wave off, I also reduced the bit rate and it made the wave much more blocky. His explanation seems very good to me. They even show the squared off wave when he shouts in the video. I'm sure there are a lot more details to why and how it does this but you cannot deny that the waves are somewhat cut off and look square. Download audacity and try it yourself.
@DanLMH 9 лет назад
Working with audio is super interesting, I love it. Analogue and digital sounds are great fun to play with, i would recommend FL Studios demo for anyone interested in becoming a sound engineer :P
@Seegalgalguntijak 9 лет назад
I remember my first soundblaster card, it could not only sample at 44.1kHz, but also 22.05kHz, and I think even something lower than that (16kHz?) - it also had an option to record at a bitrate down to 8 bits. So basically, it sounded like talking through an old telephone. But then, this was an amount of data the PCs of back in the day could handle much better. It was an old 386, and the soundcard could not use a bitrate any higher than 16 bits.
@realraven2000 9 лет назад ⁺¹
Hey you got the same speakers as I. Gotta love the alesis!
@johannes914 9 лет назад ⁺²⁵
Please explain when dynamic compression comes in the process ...
@jacobh1995 9 лет назад ⁺¹¹
+johannes914 It comes into the process, way overused, once a record company is involved.
@jacobh1995 9 лет назад ⁺³
rhoyt15 Yeah, I'm stereotyping. But 9 times outta 10, the record companies don't use it correctly.
@DirtyRobot 9 лет назад
+johannes914
dynamic compression comes into the process after recording.
It is technically a post production tool but can be used in other places like live performance.
You would choose to use it when a sound source is very dynamic, in that the volume levels change a lot and the rate that they change is not predictable.
Think of it basically as an AI fader that can analyse the input and quickly make a decision of how much volume breaking or boosting it requires.
@FASTFASTmusic 9 лет назад
+johannes914 ln simple terms, compression makes the loudest parts of the signal quieter which then means you can turn the whole thing back up without distortion.
@DirtyRobot 9 лет назад
*****
If someone gives too much mic, enough to damage the recording or performance then you just lost your job.
@kilésengati 9 лет назад ⁺⁵
Hmm, I want more of such videos. Music is so interessting when corresponding with electronics and informatics.
@ajinzrathod Год назад
when we say 44.1 kHz is recorded. I means 44100 samples values are recorded in each second
But the bass typically ranges from about 20 Hz to 250 Hz or so
But the bass can be recorded for more than 300 seconds, right?
and 300 seconds would be 13230(300*44.1)
So how it is between 20 to 250 Hz only.
Correct me if in understood something wrong
@YingwuUsagiri 9 лет назад
Awesome to know how the technical part of my audio work works. Working with orchestral pieces and a lot of low and a lot of high and those 'lingering' cymbal notes I did work at the higher settings like 24 bit. Good to know why exactly I have to do so on a technical level.
@robin888official 9 лет назад ⁺¹⁴
I like the fact that 44100 = 2^2 * 3^2 * 5^2 * 7^2
(First four primes squared.)
I can't imagine that's a coincidence.
@whiteeyedshadow8423 5 лет назад ⁺²
yur being a math nerd(and i like it)
@djsunil6333 5 лет назад ⁺²
Every number is some product of primes
@deus_ex_machina_ 4 года назад
Illuminati confirmed?
@oisiaa 9 лет назад ⁺⁵
Fascinating and well explained.
@vuurniacsquarewave5091 9 лет назад ⁺¹
I'd love to see a video about different audio formats, not in the .wav or .mp3 file type sense, but rather the encoding methods, like PCM, ADPCM, DPCM, PWM, etc.
@andriypredmyrskyy7791 9 лет назад
If getting the small details for quiet notes is an issue, why not use a logarithmic scale, where values are further apart at higher intervals? Then you could keep detail at small volumes when you need it and lower resolution at high values where you don’t.
Also, why not use a relative scale, where you mention how much the wave changes each sample? That way you'd have no upper or lower end.
@OnixRose 9 лет назад ⁺³
My DAW has settings up to 192,000 Hz, are there any benefits or downsides of using a sample rate this high? Considering the "industry standard" is significantly lower, what applications make use of this sample rate ?
@RyanRenteria 9 лет назад
+OnixRose almost none. 192 will give you a huge file size and slow your sessions down. On top of that, there is scientific evidence that ultra high sample rates actually sound worse (intermodulation distortion). Plugins and DAWs like higher sample rates because it gives them more information to process. as a result, most plugins up sample to 88.2 or 96k when your session is running at 44.1/48k. I wouldn't bother using higher than 96k. 96 would be a good sample rate to run at if you think you're going to be doing a lot of time stretching or similar processing. Other than that, you cant really go wrong with 48k.
@teekanne15 9 лет назад ⁺²
maybe one on synths? Different waveforms, envelops, filters, overdrive and such?
@gothxx 9 лет назад ⁺¹
Interesting, would be nice with a video on how the data is stored/compressed in files.
@antivanti 9 лет назад
I guess floating point audio formats can help with two of the problems. The headroom before digital clipping occurs and the fidelity at very low volumes.
@PaulIstoan 9 лет назад ⁺¹
Great show!
@stephenkamenar 9 лет назад
5:25
Why use negative volumes? If you didn't, you'd get double the bit depth, and you wouldn't have to deal with phase cancellation.
I'm sure there's a good reason why, so tell me
@TheGregaM 9 лет назад ⁺¹⁰
Drawing lines between points? Sorry but that's just not true. That would produce a lot of aliasing and distortion (like square waves as he said) and that's certainly not what happens. DA converter draws smooth curve with reconstruction filter. Only thing you loose with lowering bit depth is data lost with quantizing, which makes noise floor louder.
I think you should correct that statement.
@domminney 9 лет назад ⁺³
Apologies, but for the sake of a short introduction video I simplified the whole subject somewhat.
@TCWordz 9 лет назад ⁺⁶
+David Domminney Fowler
Please don't do that, and if you feel it is necessary to oversimplify to such an extent pleast specify in the video "this is something of an oversimplification". Because this video is downright incorrect in some cases.
@Bencarelle 9 лет назад ⁺³
+David Domminney Fowler, I agree with +Tommy59375 completely. I watch the *phile videos precisely because of the way that masters of their craft are able to explain deeply complicated concepts without distortion or oversimplification. These aren't buzzfeed videos.
@DevinBigSeven 9 лет назад
Curious why they don't take the magnitude of the wave, say by using a full wave rectifier circuit. The input is sinusoidal and so when outputting, when there is a change from decreasing values to increasing and you're near 0, you know to take the negative of the value until you go past 0 again. Whether the wave should start as negative or positive could be determined by the amplitude of the initial sample, which might be possible in hardware; although, I'm not sure what difference it makes, starting low or high, producing the inverse of the wave.
@codediporpal 9 лет назад
Good intro and a subject dear to my heart. Hope you do more video about digital audio.
@chevalierdeloccident5949 5 лет назад
If you would like to know more about the algorithms for converting from a higher sampling rate to another ( _downsampling_ , e.g. recorded @48KHz then written to a CD @44.1KHz) look up the term _dithering_ in relation to digital audio.
@SvenGunderssen 8 лет назад
Can you do more videos on digital audio? Specifically how audio software applications / plugins work.
@philipstuckey4922 9 лет назад ⁺²
what kind of scale would one use for mapping input to bits? would a logarithmic scale work better for keeping the small and loud sounds?
@neromule 9 лет назад ⁺¹¹
+Philip Stuckey Well, actually, dB is already a logarithmic unit...
@TheWeepingCorpse 9 лет назад
yes that would work but i dont know of any log DACs.
@TheUglyGnome 9 лет назад
+TheWeepingCorpse And it would be a PITA for audio processing if we used logarithmic sample scales.
@eideticex 9 лет назад
+Philip Stuckey The ideal scale would depend on what kinds of sounds you are sampling. More often than not equipment we have access to is far simpler and falls into linear scale.
@drake1o232 9 лет назад
+Philip Stuckey More advanced methods for storing audio actually take this into account and actually changes the scale over time to avoid this problem
@Ty1350 6 лет назад
Very easy to understand. You are a good teacher
@djalphakay 9 лет назад ⁺¹
I hope someone can answer me on this:
From 2:10 he says that we only need a low sample rate (or sample frequency as he calls it) to represent the low frequency sine wave. But why doesn't that produce a sort of triangle shape? As he says later on in the video, the computer thinks it should go straight from point to point, and if it did that with the low sample rate, it would not produce a sine wave sound. I think I watched video explaining this once, but I can't find it now... I believe it has something to do with dithering? or something idk
@domminney 9 лет назад ⁺¹
We didn't really go into that in this video, it was very basic. Maybe it's a subject for another time.
@djalphakay 9 лет назад
I get that part, and that wasn't really my question. I can try and explain it differently:
If you know a bit about sound and its digital representation, you'd know how a sine wave sounds and how a triangle wave sounds. To me it seems like a triangle wave needs much fewer "points" to represent that wave, it actually only needs the maximum value and the minimum value. But the slope of the sine wave is constantly changing, and so to me it seems like it would need an infinite amount of points to represent that sound wave.
@hydrox24 9 лет назад ⁺³
+paulcmnt This is entirely incorrect. Photos are not equivalent to audio. Just like pixels, audio files contain a bunch of 'point samples' however unlike pixels, they are not represented as 'squares' or as a flat section of the audio waveform. Whenever you listen to a digital file, it goes through the DAC, which looks at all of the point samples and then finds the only possible combination of sin waves that will fit, and outputs that sin wave as a perfectly smooth voltage change (analogue signal).
@geonerd 9 лет назад
+Alpha Kay His explaination is essentially wrong - there is no "point to point" in the reconstruction. (And there are no "stair steps" either.) Any competent DAC will produce a lovely sinusoidal output, regardless of the bit depth or sample rate. Please watch the video I linked to earlier; it does a superb job explaining and demonstrating how all this works. :)
@geonerd 9 лет назад
+Alpha Kay The digital -> analog converter understands this and "fills in" the "missing" sine wave information in a smooth, intelligent, accurate manner.
@marshalrando6767 4 года назад
Great explanation!
@justcarcrazy 9 лет назад
5:07 It's not "volume level", it's amplitude.
@kylewhite2985 9 лет назад
Amazing video! It would be great if you pick up from here to talk a little about digital compression and the infamous Loudness Wars, which would make a great title! The War for your Ears! lol Great work. Cheers to you guys.
@crappymeal 9 лет назад ⁺²
nice bit of info for when, if ever i use my microphone
@TemporalOnline 9 лет назад
One thing I wonder is about the sounds we cannot hear but still have an effect on us, like when happened in a laboratory that at night people started to get visual artifacts that where traced to I guess infra sound/noise made by an air conditioner or something like that, how do you treat that, or would use in a scary film for instance, or if they are totally cut out.
Also there was a buzzing/fan/whatever noise in the background all the way through that was kinda... distracting.
@rafagd 9 лет назад ⁺²
10:00 There is another reason for softwares to use 32bits. Most computers can only allocate memory in lots of 8, 16, 32 and 64 bits, so 24 bits are awkward to use. So the software may use the larger 32bits because it is easier and faster, and just convert it to 24bit when you ask it to save/export the file.
@antivanti 9 лет назад
Will this be a series of videos? Will you go into audio compression and things like that like you did with pictures and JPEG compression?
@mrnarason 9 лет назад ⁺³⁴
makes me want to be a sound engineer
@DirtyRobot 9 лет назад ⁺⁵
+Victor P.
Do it, I did.
@morgogs 9 лет назад ⁺¹
+Dirty Robot What does that involve?
@KutAnimus 9 лет назад ⁺⁴
+Dirty Robot I wouldn't do that considering the sad economic state the audio engineering industry is right now. Lots of really good sound engineers go months upon end without gigs these days.
@thelol1759 9 лет назад ⁺¹
+morgogs Math.
@DirtyRobot 9 лет назад ⁺⁴
+morgogs Depends if you want to go study or you want to dive in.
When I made my choice there were no courses you could do but I had a fair bit of experience so I contacted all the recording studios in my area and took a low paid position then worked my way up.
@supernunb3128 5 лет назад
Very well explained; thank you very much!
@mewx 3 года назад
Awesome video! Thanks a lot!
@BOBOUDA 9 лет назад ⁺⁴
So sample frequency is the "fps" of sounds ?
@heheheheheeho 9 лет назад ⁺³
+BOBOUDA More like Vsync on a monitor. At least when converting from analog to digital =)
@bennyuoppd33 9 лет назад
+Patrick The Buried Then what happens when your cpu stalls for a bit?
@stoppi89 9 лет назад
+BOBOUDA Pretty much, yes (analog audio/ live audio has infinite sample frequency, just like RL). Bit depth is like Contrast of a monitor.
@stoppi89 9 лет назад
+BOBOUDA Pretty much, yes (analog audio/ live audio has infinite sample frequency, just like RL). Bit depth is like Contrast of a monitor.
@3snoW_ 9 лет назад
+Benny Kolesnikov That usually doesn't happen because the cpu speed is WAY higher than the sampling frequency and between 2 samples the cpu has more than enough time to do whatever it needs. However i've seen it happen, it sounded like slow-mo with stuttering, it was really weird. And just before my pc crashed lol.
@EgoShredder 9 лет назад
Dividing the audio up into the smallest samples at 96KHz or 192KHz, benefits the whole frequency range. It will not allow you to hear beyond your hearing capabilities, but it will make what you can hear much more detailed and smooth. Add to the fact that a 24-bit depth pushes the 'noise floor' further down, thus giving you more usable range to work with above the noise. So to summarise having higher bit depths and frequency ranges, does improve what you can achieve and hear. It is NOT only about higher frequencies to make your dog howl! Also the creator of this video did not mention the audiophile music formats such as Super Audio CD, which uses a bit depth of ONE, but has a really weird process which uses ultra high frequencies.
@marcinsosinski766 9 лет назад ⁺¹
+EgoShredder Actually this ONE bit has nothing in common with bith depth per se in conventional understanding. It's a different coding system. DSD has 2,84 something sampling rate and uses Pulse Width Modulation where "density" of ones and zeroes actualy defines analog value in given time. Something like so called D class amplifiers. I wonder if it could be nice to create PWM amplifier locked exactly with dsd signall with at least 10W per channel. No DAC in such system at all. Ouput stage would be a DAC :)
@maxgrrr2244 9 лет назад ⁺²
+EgoShredder 96KHz or 192KHz will not produce more detailed and smooth sound in the audible range. In fact, it can be damaging because inaudible ultrasonics can produce intermodulation distortion in the audible range.
@EgoShredder 9 лет назад ⁺¹
+Marcin Sosiński It's a complex and controversial system but somehow it seems to sound nicer than the other competing formats. Apparently the SACD system introduces a lot of high frequency noise during the process, plus DSD is very hard to edit; I think DXD was introduced as an editing format for DSD production.
@EgoShredder 9 лет назад
+Max Grrr Yes that can happen. I suppose what I am getting at is how can we get digital to match what analogue can do? Both have advantages but how can we get the best of both worlds?
@bensonthomas3155 9 лет назад
Great work, God bless you all
@maxgrrr2244 9 лет назад ⁺²
Rather inaccurate video imho.
1. It's not 44.1KHz because the limit of human hearing is 22.05KHz. It has to do with the upper limit of human hearing and the choice of anti-aliasing filter (transition band width). The assumed human hearing range is roughly 20Hz to 20KHz. The Nyquist sampling theorem tells us that the sampling rate should be at least twice the max frequency of the signal (so 40Khz).
The problem now is that our original signal contains frequency above 20KHz, if we try to sample at 40Khz aliasing will occurs (frequency above 20KHz fold into the hearing range). The signal must be low pass filtered (anti-aliasing filter). We can't perfectly cut frequency right at 20KHz, in practice a transition band is necessary. For practical and economic reasons, a 2.05KHz transition band was chosen. Now our signal contains frequency from 20Hz to 20+2.05 = 22.05KHz. Back to Niquist, we need to sample at 44.1KHz.
2. You DON'T end up with square wave. You could use 4 bits per sample and you still would not get square wave. You'll get a ton of quantization error (rounding error) and the signal will be drawn into noise (low SNR). Moreover, the computer doesn't draw line between point. It finds a continuous signal from the sequence of samples, there is a unique solution.
@KarlFFF 9 лет назад
Why not have a logarithmic bit depth? Seems to me that it would solve the problem of having high quality highs and lows. But ok I'm not a sound person so I might be missing something here with why this is not a good idea. But to me it seems similar to the thing with light levels in pictures.
@slamallama2385 9 лет назад
Could you explain how digital to analogue converters work and if they have any measurable effect on sound quality?
@FASTFASTmusic 9 лет назад
+Slama Llama all digital audio passes through a DAC from CD to Mp3... it just makes recreates the original signal out of sine waves. more expensive DACs cost more money.. that is all.
@ave4190 7 лет назад
Log bits? That'd be able to reproduce quite quiet tones while still producing reasonably small files . . .
@BenMcKenn 9 лет назад
Why not put the sound level on a logarithmic scale? Seems like that would allow you to get good detail for low-amplitude signals while still having plenty of headroom.
@IvanDSM 9 лет назад
+Ben McKenna Because that would cause you to lose dynamic range (and believe me, songs nowadays already have a serious lack of dynamic range, including those "remasters" of classic albums) and lose detail on high-amplitude signals.
@BenMcKenn 9 лет назад
OK, but what's dynamic range, exactly?
@tobiasverstege5165 9 лет назад
I suggest a followup about lossy audio compression.
@ShinobiEngineer 3 года назад
Great video! 😎👍
@PhattyMo 9 лет назад ⁺¹
The sampling rate is (at least) 2x the highest frequency you plan to sample..because,aliasing,Nyquist,something,something.

Следующие

Автовоспроизведение

Debunking the Digital Audio Myth: The Truth About the 'Stair-Step' Effect