I love your video and the explanations, amazing work. I'm new to Python and I'm working on a project of mine. I saw that you're coding in a notebook, so I dare to ask: is writing the code in PyCharm the same?
I know the doubt is pretty silly. ...How does it show the spectrum when I haven't even given any audio as input? Only if I input an audio data it can plot the spectrum, right?
Hi, I want to try your code, but I ran into something strange. It says: AttributeError: 'Window' object has no attribute 'setGeometry'. I don't know if something is missing, or something else... I'm working on a Raspberry Pi 3 with Ubuntu MATE...
good stuff, wasn't aware that this could be done so directly. sadly, i also get all 0s as my audio samples - so running this on a mac may require a few more setup steps. i have not been able to uncover the proper way to do this, despite a bit of research. would LOVE it if someone could provide details on what it takes to get the audio input from the default mic into pyaudio on a mac.
Have you got your answer what are the few more steps??? Coz I am using MacOS and i am getting 0s value in my tuples that mean it didn't take any data from my audio? pls help me out
So, I changed to "output=False", and some data started to show up. Then I got some overflow errors, but then got data again. Hopefully that will move you forward.
Please be aware there is a lot of misinformation in this tutorial. I don't think the author fully understands what the pyaudio.paInt16 data format is (a 2-byte signed integer), as everything after that is a muddy mess of bad data structure manipulation. Ask yourself why len(data) = 8192 when the CHUNK is set to 4096 (because each sample in the CHUNK is 2 bytes! That's what paInt16 IS). When using the proper data format, the data values will range from -32768 to +32767.
You might consider loading the numpy library, and then converting the 'data' variable to a properly structured variable with:
amplitude = np.fromstring(data, dtype='int16')
and then try to use that variable (amplitude) to plot the values, ignoring all that struct.unpack business.
By the way, if you do look at those bytes individually, you will discover they are stored as two's complement, but across 2 bytes, of course. But doing so is just bad tech: use an Int16 data type, since the data was created with an Int16 data type.
@@DaveCampbellKY thankyou for this important information it's really helpful. And can you tell me(or send me the link) from where you got these information and do this coding part as well (github link) if you have. Thankyou so much
@@priyankasrivastava6757 - it might be time to 'roll up your sleeves' and dig into some source material. This is actually a fairly straightforward exercise: take the inspiration from the original video and correct it using proper data structures. I haven't done that myself, but may, and if I do I will post it here. In the meantime, these are useful references:
pyaudio home - people.csail.mit.edu/hubert/pyaudio/
pyaudio documentation - people.csail.mit.edu/hubert/pyaudio/docs/
numpy.dtype docs - docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html
twos-complement details - en.wikipedia.org/wiki/Two%27s_complement
Great video! I am getting 00 values when returning data as well as data_int ('0\x00\x00\...' and '0, 0, 0...' respectively) Any advice would be much appreciated.
Chris Coletti sounds like a conversion problem from bytes to ints. You could try using np.frombuffer To do the conversion instead of struct. Someone made a comment in the github issues section which shows the correct syntax
Yeah, turns out for some reason printing the data on just one chunk doesn't work, but printing the data in a loop works fine. Thanks! BTW .frombuffer and .fromstring return much cleaner data anyway! your videos rock dude!!
Why, every time I hit run, it says:
AttributeError Traceback (most recent call last)
in ()
7
8 CHUNK = 1024 * 4
----> 9 FORMAT = pyaudio.PaInt16
10 CHANNELS = 1
11 RATE = 44100
AttributeError: module 'pyaudio' has no attribute 'PaInt16'
i just want to thank you both mark jay and adam rose, the code adam rose submitted fixes a problem when audio goes beyond 255 or under 0 (what i can guess is saturation) that the base code has, it yielded better results when i tried this method and it even worked in the part 3. please pin this comment so that everybody can know
After I run the code, the display is not really as good as yours. It only shows 3 numbers: 0, 1 and 255. The matplotlib window gives me long vertical lines. Even when my voice is detected, it does move, but with vertical lines that spread from the center to the left and right (no horizontal line). After I closed the window it gave me an error message:
TclError Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\matplotlib\backends\tkagg.py in blit(photoimage, aggimage, bbox, colormode)
21 "PyAggImagePhoto", photoimage,
---> 22 id(data), colormode, id(bbox_array))
23 except Tk.TclError:
TclError: this isn't a Tk application
During handling of the above exception, another exception occurred:
TclError Traceback (most recent call last)
in ()
20 data_int = np.array(struct.unpack(str(2 * CHUNK) + 'B', data), dtype='b')[::2] + 127
21 line.set_ydata(data_int)
---> 22 fig.canvas.draw()
23 fig.canvas.flush_events()
C:\Anaconda3\lib\site-packages\matplotlib\backends\backend_tkagg.py in draw(self)
350 def draw(self):
351 FigureCanvasAgg.draw(self)
--> 352 tkagg.blit(self._tkphoto, self.renderer._renderer, colormode=2)
353 self._master.update_idletasks()
354
C:\Anaconda3\lib\site-packages\matplotlib\backends\tkagg.py in blit(photoimage, aggimage, bbox, colormode)
28 _tkagg.tkinit(id(tk), 0)
29 tk.call("PyAggImagePhoto", photoimage,
---> 30 id(data), colormode, id(bbox_array))
31 except (ImportError, AttributeError, Tk.TclError):
32 raise
TclError: this isn't a Tk application
What happened....???
Hi Mark! Thanks for this awesome video. Wanted to ask you.. is there any way we can analyze mean frequency/fundamental frequency and save that data into a text file? time vs fundamental frequency data. ?
Hey Mark Jay, actually after running "import pyaudio" in jupyter notebook I've found this error:
ModuleNotFoundError Traceback (most recent call last)
in
----> 1 import pyaudio
ModuleNotFoundError: No module named 'pyaudio'
Ok. That means you need to install pyaudio. If you are on Windows, you can just do pip install pyaudio and everything should work. If you are on Linux, you also need PulseAudio installed, and you need to start an instance of PulseAudio before running pyaudio.
I have a problem when installing a module. For example, with "pip install pyaudio" I get an error saying:
File "", line 1
pip install pyaudio
^
SyntaxError: invalid syntax
I already know that the "install" part of the syntax is the problem. How do I fix it? I'm using Python 3.6.2 on Windows 8.
I am having a bit of trouble getting a pop up window like you do. I have matplot lib and I am working in Jupyter notebooks. I also looked through the comments and didn't find anything that worked. You mentioned that matplot lib uses tkinter? Do I need to have tkinter installed before it will work? I didn't see you do it but maybe I need to import it? Great video though! Any suggestions are welcome!
Tkinter is part of python standard library so it should work. If you check part3 of the series, I made a python file version. Maybe that will work better
Hey Mark Jay! Your tutorial is amazing and exactly what I was looking for. But there is a small problem i'm facing. I can't see the graph being plot. I am running Anaconda Shell and copied you code completely. The line which you added at first i.e %matplotlib tk is giving me an error so I removed out. Please help. Thank you
Saad Tahir thanks! Glad you like them. That line is an ipython magic function. Try opening a shell by typing "ipython" (instead of python). You can also start a jupyter notebook and work in there. Just type "jupyter notebook" in a cmd window
I also created a python file for the viewer (instead of a jupyter notebook). You can try running this script: github.com/markjay4k/Audio-Spectrum-Analyzer-in-Python/blob/master/audio_spectrum.py
Hi Mark! Thanks a lot for the video and code you uploaded. I just have a question about running the python file you wrote for the viewer (I use Sublime). After I run the code, the picture only shows up for a second, and then the program crashes with an "OSError: [Errno -9981] Input overflowed" error. I am wondering what goes wrong. Can you please help? Thanks so much!
Lynn Gu glad you like them! Are you using the notebook version? In the github repo theres a python file version that will work in a text editor like sublime
Excellent video Mark. Thank you. I'm getting the same error, even with indent...
OSError Traceback (most recent call last)
in ()
18
19 while True:
---> 20 data = stream.read(CHUNK)
Lynn Gu do you see anything that says --IOPubRate? It could be you're creating objects that are too big for your jupyter to plot. You can increase the iopubrate with a command when you start the jupyter server.
I think the error has to do with not enough memory size to read the audio data. Use a try and except statement when you read from your microphone. For example:
try:
    data = stream.read(CHUNK)
except OSError:
    continue
You can also try playing with the size of CHUNK.
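The try/except suggestion above can be sketched as a small helper. This is only a sketch under assumptions: `read_chunk` is a hypothetical helper name, and `OverflowingStream` is a made-up stand-in so the example runs without a microphone. The `exception_on_overflow` keyword is the one visible in the `pa.read_stream(self._stream, num_frames, exception_on_overflow)` traceback quoted in this thread.

```python
def read_chunk(stream, chunk):
    """Read one chunk of audio; return None instead of raising on overflow.

    `stream` is assumed to behave like a PyAudio input stream.
    """
    try:
        # Ask the stream not to raise when the input buffer overflowed.
        return stream.read(chunk, exception_on_overflow=False)
    except OSError:
        # Anything that still fails is skipped so a plotting loop can go on.
        return None


class OverflowingStream:
    """Hypothetical stand-in for a real stream: fails once, then returns silence."""

    def __init__(self):
        self.calls = 0

    def read(self, num_frames, exception_on_overflow=True):
        self.calls += 1
        if self.calls == 1:
            raise OSError(-9981, "Input overflowed")
        return b"\x00\x00" * num_frames  # 2 bytes per Int16 sample


stream = OverflowingStream()
first = read_chunk(stream, 4)   # overflow is swallowed
second = read_chunk(stream, 4)  # 4 samples of silence
print(first, len(second))
```

In a live loop, a `None` result would simply mean "skip this frame and read again".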
The data format over the audio channel you requested is a signed short value (Int16). But when you used struct you tried to convert it to unsigned chars (UInt8). You should be able to get the data with struct if you change it to unpack your sample size as signed short data.
data_int = struct.unpack( str( CHUNK ) + 'h', data ) # Signed Short - 16 bits
This will give you your sample set about the origin as you would typically see in a time plot with no extra manipulation.
how can we modify the plot ?
This is great. Thanks a lot for posting it. Setting data_np = np.frombuffer(data, dtype = np.int16) works a lot better (no need to subsample, since the stream datatype is int16). You'd need to readjust ax.set_ylim(-2**15, 2**15)
thanks! yes definitely that's a better option. makes the code cleaner and faster
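To make the comparison in this thread concrete, here is a minimal sketch showing that `struct.unpack` with the 'h' format and `np.frombuffer` with `np.int16` decode the same bytes to the same samples. The sample values and the tiny CHUNK are made up for illustration; a real `data` buffer would come from `stream.read(CHUNK)`.

```python
import struct

import numpy as np

CHUNK = 4                       # tiny chunk, for illustration only
samples = (0, 1, -1, 32767)     # made-up Int16 sample values

# Build a bytes buffer the way an Int16 stream would deliver it
# (2 bytes per sample, native byte order).
data = struct.pack(str(CHUNK) + "h", *samples)

via_struct = struct.unpack(str(CHUNK) + "h", data)   # tuple of Python ints
via_numpy = np.frombuffer(data, dtype=np.int16)      # NumPy array, no copy

# Plot limits that cover the full Int16 range.
y_limits = (np.iinfo(np.int16).min, np.iinfo(np.int16).max)

print(len(data), via_struct, via_numpy, y_limits)
```

Both routes give values centred on zero, so no `[::2]` subsampling or `+ 127` offset is needed.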
Great series!
We open the stream in Int16 format, so we can simply unpack it this way:
data_int = struct.unpack(str(CHUNK) + 'h', data)
TIME TO MAKE MY FLYING ARROW CONTROLLED BY WHISTLE
Your video is really helpful for getting started with pyaudio. Thank you so much.
Here is a tip to reduce the complexity of changing 2 bytes into an int... just one line has to be changed.
Instead of
>> data_int = np.array(struct.unpack(str(2*CHUNK)+'B', data), dtype='b')[::2] + 127
use
>> data_int = struct.unpack(str(CHUNK)+'h', data)
:)
Why don't you multiply by 2? Because I got an error that says "unpack requires a buffer of 2048 bytes"
I am using MacOS and i am getting 0s (zeros) value in my tuples that mean it didn't take any data from my audio? pls help me out.
Hi, I had this problem, try changing 'output = true' to 'output=False'. That worked for me. (running on macOS).
@@lazzapie Thanks, this worked for me, on MacOS :)
output=false because that returns 2000+ samples at zero.
Ah, makes sense.
Great post. Well presented. I was looking for a way to process a live audio stream. This video will help a lot. Thank Mark for this video.
if you are still getting weird display of your audio stream after adding 127, try adding 128 instead
thx bro it worked for me^^
The plot at 7:14 gets stuck on a white screen... just the window appears and that's it.
Does somebody know why it doesn't do anything?
🥲
Bytes are 8 bits and ints are 16 bits. Converting from bytes to ints adds a zero byte to the front, doubling the size of the data. Also with signed bytes everything from 128 -255 is considered negative.
yeah I definitely went about it wrong using struct. using numpy.frombuffer() really simplifies process going from bytes to ints
@@MarkJay I am so sorry. This doesn't run on my PC. I use Spyder to run the program. My code is the same as yours.
When I run the program, it cannot draw a figure and it throws at "return pa.read_stream(self._stream, num_frames, exception_on_overflow)". I cannot understand this. Can you teach me?
@@gaoxuewu5411 Try reducing the size of the chunk
I am not getting the figure in separate figure window even on using % matplotlib tk, what to do ?
So if you have discontinuity, then you apply a window function like hann, right? I dont know a lot but moving it instead felt weird. (7:50)
For me the window is not showing. I print data_int in the console and the numbers show up there, but the window does not appear.
do you have matplotlib installed?
I resolved it by putting the line 'plt.show(block=False)' before the while loop, thanks. Do you know a way to get the frequency of the audio?
Negocio Esquisito awesome! Check out pt2 in the series. We add a spectrum analyzer
@@negocioesquisito Thanks , I didn't know about the 'block = False' thing, that's good to know in general
@@negocioesquisito thanks!
My script is hanging. I followed step by step, checked my output data as binary and then as int, and watched the plot as it changed with all the changes. Then when I did the automated update with fig.canvas.draw(), the program hangs. I tried reducing CHUNK to just 2* instead of 4. Still hangs. I'm searching through the comments but I don't see anyone with this issue. I can only believe I'm doing something specifically odd. The only thing I see that's odd is that the args of np.arange are not highlighted as Mark's are. Everything else is highlighted just as his is. Help?
Why do you have 2 * CHUNK data? You showed that it is there but didn't explain why.. also, later you take every alternate data points(10:07) so maybe those extra data points are junk?
he explained it
Can it appear in gui for desktop voice assistant? Please help me
Why do we need the comma after line as in " line, = ax.plot(..)"
Why '%' is not recognized by pycharm? I am not able to see any graph after running this code.
Because it's an IPython magic function (it only works in IPython or Jupyter). You don't need it in PyCharm.
at 7:53 could the reason that the plot comes up like that be that the binary of negative number is taken as 1's or 2's complement?
Hi, I followed the code in this video exactly and wrote it in a Jupyter notebook. It doesn't raise any error, but when it runs, nothing comes out. Does anyone know what the problem is? It doesn't produce the final output that the video shows.
hi can i use sound lm393 module for microphone?
can you have a tutorial on how to visualize the pitch graph of sounds?
will consider this
After seeing the other comments,I tried to replace your original code by
data_int=np.frombuffer(data,dtype=np.int16)+127
and I could see the change in waveform but when there is no voice,unlike yours,I don't see clear background.I still see noise waveform.
I get exactly the same with 2 microphones
Hi, have you been able to fix it?
i'd really wanna see some DSP and more audio tutorials with python because i've tried to learn JUCE for C++ but it's way more syntax heavy than python so I started with Py
Great vid btw
Hello ,
I am getting this error while unpacking "struct.error: unpack requires a buffer of 8192 bytes" Help me to sort this thing
me too
i think i found the answer
use this: data_int = struct.unpack(str(len(data)) + 'b', data)
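The "unpack requires a buffer of N bytes" error above means the count in the format string does not match `len(data)`. Unpacking `len(data)` items as 'b' avoids the error but yields single bytes rather than 16-bit samples. A minimal sketch of an alternative, assuming the stream was opened as Int16: derive the sample count from the buffer length. `decode_int16` is a hypothetical helper name, and the packed values are made up.

```python
import struct

SAMPLE_WIDTH = struct.calcsize("h")  # bytes per signed short (2)


def decode_int16(data):
    """Unpack however many whole Int16 samples the buffer holds."""
    count = len(data) // SAMPLE_WIDTH
    # Slice off a trailing odd byte, if any, so the sizes always match.
    return struct.unpack(str(count) + "h", data[: count * SAMPLE_WIDTH])


data = struct.pack("4h", 100, -100, 3, -3)              # made-up buffer, 8 bytes
as_samples = decode_int16(data)                         # 4 samples
as_bytes = struct.unpack(str(len(data)) + "b", data)    # 8 single bytes

print(len(data), as_samples, len(as_bytes))
```

The byte count is always twice the sample count, which is why `str(2 * CHUNK) + 'B'` and `str(CHUNK) + 'h'` both match the same buffer.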
Hello, why do i run data_int then all my outputs are \x00?
Sorry I'm not sure.
The chart is choppy at 8:00 because you unpack two byte integers into array of single bytes.
Great work there! Thank you for your video. I try what you are explaining and when I print the data all the entries are 0 as if there is no input. I am using my laptop's built-in microphone and I am confused about what is going wrong.
I have the same problem with you, did you already fix it?
matplotlib tk has a syntax error, what should I do i am using IDLE python to code
Are you using jupyter?
no i am not
@@lynerscruxifyl3049 I see. That line is a magic function that only works with jupyter
so what should i replace it with? im quite new to coding haha
@@lynerscruxifyl3049 try jupyter.
Just do
pip install jupyter
Then type
jupyter notebook
In the cmd window
Hi. This is helpful for me since I’m new to Python. The problem you’re having with the data looking split - around 7:05 in the video - is pretty obviously because the data is coming back as 2’s-complement in the hex values and your conversion to integer doesn’t do that correctly. Please look up 2’s-complement to see how to do the conversion.
David Rogoff thanks, I will check it out. I notice this everytime I use struct.unpack
To convert from 2's complement, it's just a NOT and then +1, right?
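On the "NOT, then +1" question, a small sketch may help. Inverting the bits and adding one gives the magnitude of a negative value; the sign still has to be applied. Equivalently, subtract 2**16 whenever the top bit is set. The function names here are hypothetical, and the bit patterns are just examples.

```python
def from_twos_complement(raw, bits=16):
    """Interpret an unsigned bit pattern as a signed two's-complement value."""
    if raw & (1 << (bits - 1)):      # top bit set means negative
        return raw - (1 << bits)
    return raw


def invert_plus_one(raw, bits=16):
    """Bitwise NOT, then +1, kept within `bits` bits."""
    mask = (1 << bits) - 1
    return (~raw + 1) & mask


pattern = 0xFF9C                         # 16-bit pattern for -100
magnitude = invert_plus_one(pattern)     # 100: the size, not the sign
signed = from_twos_complement(pattern)   # -100

# The standard library performs the same conversion directly from bytes.
from_bytes = int.from_bytes(b"\x9c\xff", "little", signed=True)

print(magnitude, signed, from_bytes)
```

In practice there is no need to do this by hand: the 'h' format in struct and the int16 dtype in NumPy already interpret the two bytes as two's complement.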
As you use pyaudio.paInt16 as format (16bit wave), why not use "struct.unpack(str(CHUNK) + 'h', data)" decode stream in short(int16)?
Hello, how can I install pyaudio and portaudio (I am using Windows)?
richard yang you should be able to just do
pip install pyaudio
this is what i got when i type in "pip install pyaudio" in cmd
Installing collected packages: pyaudio
Running setup.py install for pyaudio ... error
Complete output from command c:\users\sry\appdata\local\programs\python\python37\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\sry\\AppData\\Local\\Temp\\pip-install-u93v30ro\\pyaudio\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\sry\AppData\Local\Temp\pip-record-sndx4fd0\install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.7
copying src\pyaudio.py -> build\lib.win-amd64-3.7
running build_ext
building '_portaudio' extension
creating build\temp.win-amd64-3.7
creating build\temp.win-amd64-3.7\Release
creating build\temp.win-amd64-3.7\Release\src
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.14.26428\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -DMS_WIN64=1 -Ic:\users\sry\appdata\local\programs\python\python37\include -Ic:\users\sry\appdata\local\programs\python\python37\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.14.26428\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.14.26428\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17134.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17134.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17134.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17134.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.17134.0\cppwinrt" /Tcsrc/_portaudiomodule.c /Fobuild\temp.win-amd64-3.7\Release\src/_portaudiomodule.obj
_portaudiomodule.c
c:\users\sry\appdata\local\programs\python\python37\include\pyconfig.h(117): warning C4005: 'MS_WIN64': macro redefinition
c:\users\sry\appdata\local\programs\python\python37\include\pyconfig.h(117): note: command-line arguments: see previous definition of 'MS_WIN64'
src/_portaudiomodule.c(29): fatal error C1083: Cannot open include file: 'portaudio.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.14.26428\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2
----------------------------------------
Command "c:\users\sry\appdata\local\programs\python\python37\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\sry\\AppData\\Local\\Temp\\pip-install-u93v30ro\\pyaudio\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\sry\AppData\Local\Temp\pip-record-sndx4fd0\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\sry\AppData\Local\Temp\pip-install-u93v30ro\pyaudio\
@@richardyang4204 you require the microsoft visual studio
Great tutorial, thanks for uploading! Do you know how it would be possible to read the output of the audio card, not the microphone?
Actually, I was thinking the same.
Ditto!
What I am thinking is that the audio card's output channels can be fed into the line-in on the motherboard, so the input channel will be Line In. Please correct me if I'm wrong. I am going to use this with the output of my Bluetooth IC.
How can we do it seriously from the audio card so we dont have to use a mic
is it possible to make this exact same thing, but use Pyserial instead of Pyaudio. Im trying to collect data from an arduino serial port and display its signal/Frequency spectrum in real time.
hello zach. seems like we are working on a similar project.I have the same problem here. Did you find a way to do it?
Hi Mark, this video series is really great! I use Python 3.6.4 and the '%matplotlib tk' line gives an error: 'End of statement expected.'
So the program runs neither in IntelliJ nor in Jupyter notebook. I could change over to the PyQt5 GUI, but I want to get to the root of this issue. Can you give me any idea?
I had the same problem in Spyder IDE ! I would like to know what may be causing this too, these compatibility issues are actually pretty common in python codes, unfortunately :/
I am so confused by the different types of Python IDE, and by this one which runs every In[ ] cell separately. Can someone help me with this basic step of why and what?
Hi it is running , plot is displayed - however I get random noise on the graph not related either to mic or other I/O sound devices .
Is there a simple way to select which device is acting as input? I want to analyze the music coming out of the default output device instead of the mic. Using the latest Anaconda distribution with Jupyter notebook on WIN10. Thanks
so beautiful
What does %matplotlib tk mean, and how can I fix my invalid syntax error in Python?
thanks
this is only needed if you're using jupyter. it creates a separate window for the plot. if you're using a normal python file, you can remove it
thanks
I have just 0 ,1 and 255 byte. is this a problem?
HI! Thanks for video!! I have one Question, can this code work without struct , if i take data from librosa ??
Thanks. Yes, I would recommend using numpy.frombuffer instead. Its much simpler to use and I think its faster
Hey I am getting an overflow error: File "/Users/nameofcomputer/PycharmProjects/untitled/venv/lib/python2.7/site-packages/pyaudio.py", line 608, in read
return pa.read_stream(self._stream, num_frames, exception_on_overflow)
IOError: [Errno -9981] Input overflowed
Is there a fix for this?
Are you using macOS or Linux? I'm not sure how to get pyaudio working with those operating systems
I keep getting syntax error on %matplotlib tk. Any idea where that could be from?
stackoverflow.com/questions/44225002/why-doesnt-matplotlib-inline-work-in-python-script
Hello. My graph is not that clean. It seems there is a huge amount of interference. How can I filter it out?
Yuri Mota do you mean the microphone? The best way is to get the cleanest signal into the computer. The normal headphone jack can be noisy. If you can get a digital audio interface, you will get much cleaner input
Hm, I have a really strange problem: if I leave out the + 127 my code works, otherwise the numpy array has some kind of modulo properties and the numbers stay inside the range from -127 to 127, so for example an 8 that should become a 132 is a -119 instead. Any idea why this can happen?
By "my code works" I mean I get a curve that lies between -127 and 127, but looks right
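The "modulo" behaviour described above is consistent with integer wraparound in an 8-bit signed array: the result of the addition is stored back into 8 bits, so anything above 127 wraps to a negative number. A minimal sketch with made-up values (the offset is held in an int8 array so the behaviour is the same across NumPy versions), plus one possible workaround of widening the dtype first:

```python
import numpy as np

signed_bytes = np.array([8, -100, 100], dtype=np.int8)   # made-up sample bytes
offset = np.array([127], dtype=np.int8)

# int8 + int8 stays int8, so 8 + 127 = 135 does not fit and wraps to -121.
wrapped = signed_bytes + offset

# Widening to int16 before adding leaves room for the larger values.
widened = signed_bytes.astype(np.int16) + 127

print(wrapped.dtype, wrapped)
print(widened.dtype, widened)
```

Decoding the buffer as int16 in the first place, as suggested elsewhere in this thread, avoids both the offset and the wraparound.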
I'm having an issue with "%matplotlib tk". I get an error about invalid syntax. I'm running python 2.7..
Brent Cavner try watching part 3 where we switch to a normal python file to run. The code link is in the description
Awesome! It works. Thank you! Is there a way to reduce the noise it is picking up?
Brent Cavner awesome! The noise is likely due to hardware, like your microphone, or the interface. I've noticed that sending audio in through the headphone jack can be noisy. If you can get a USB audio interface, it will likely help
Awesome.... Thanks for making this video Mark.
Why did he add 'B' in data_int?
I want to compare two audio files and want to find similarity score . How i have to do? I have one audio .wav file of 4 to 6 seconds and another audio .wav file of about 1 to 2 seconds.
Hey, I have a question.
Say I have a music sample which has two components, guitar and piano.
Is there any way we can split that sample into its components?
I have Python 3.6 and the % line gives invalid syntax???
Loving this!
Hello, how can I plot the graph in a external box as you've done?
Matheus Silva yes, check out part 3 of the series. I made a normal python file version
Thank you much. On OSX, may need to install `brew install portaudio` for `pip install pyaudio` to succeed.
Great script. I had to enter %matplotlib notebook to the cell to get it working properly in Jupyter.
Hi! :)
Thank you for the video :)
I can't understand one thing. Can't we do it in a much simpler way using:
data_int = struct.unpack(str(2 * CHUNK) + 'b', data)
(instead 'B') without any further conversion?
how to run project help me pls
Try
%matplotlib qt
if tk gives you error
that or %matplotlib qt5
what i want to know is how you choose where the program gets the audio from.
Since i only have 1 input device enabled, i believe pyaudio simply reads from that
you get twice the size in the data, because 2 bytes represent one 16 bit
signed integer, which is what you want to read.
you should unpack with the 'h' flag
data_int = np.array(struct.unpack(str(CHUNK)+'h', data))
and set limits from about -600 to 600
ax.set_ylim(-600,600)
actually the limits should be -32768 to 32767,
ax.set_ylim(-32768,32767)
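To see why the 'h' format and the full 16-bit limits go together, here is a small sketch that does not need a microphone. It packs a few made-up sample values into bytes the way a paInt16 stream would deliver them, then unpacks them again. The CHUNK of 5 and the sample values are purely illustrative.

```python
import struct

CHUNK = 5  # illustrative; the video uses 1024 * 4
samples = (-32768, -1, 0, 1, 32767)

# Stand-in for stream.read(CHUNK): CHUNK signed 16-bit samples, 2 bytes each.
data = struct.pack(str(CHUNK) + 'h', *samples)
print(len(data))  # 10, i.e. 2 * CHUNK bytes

# 'h' reads each 2-byte pair as one signed short, so no slicing or offset is needed.
data_int = struct.unpack(str(CHUNK) + 'h', data)
print(data_int)   # (-32768, -1, 0, 1, 32767)
```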
I dont know if im missing something, but where do we even do stuff like asigning mic and stuff like this....Does pyaudio just do all of that by itsself?
That's the exact same question I had in mind but I tried it and it still works
Can this same technique be applied to EEG, ECG, and EMG signal processing?
Hi, I'm very new to python, but am trying to use it for a school project. Is there any way that something similar to this could be used, but instead of using microphone input you analyzed the output coming from the speakers?
If its coming out a speaker, you likely already have the digital audio on the computer and can process. Or input that audio to a computer
im having this error
AttributeError: '_tkinter.tkapp' object has no attribute 'createfilehandler'
how can i resolve this?
Where I could read open Audio processing journals?
Instead of using matplotlib, I'm outputting my data to a txt file. Except I get a weird ellipsis with 3 numbers either side?
I am kinda new to this. When I tried printing data_int i get all 0s how is it that in the video you are getting different values?
Any help would be appreciated
If all the code is correct, it could be a mic problem
Hi chandana, got any fixes? I am facing same issue, even with random values, getting 0, 1 and 255 only
What do the x and y axes represent?
X is samples and y is amplitude
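If seconds are more convenient than sample numbers on the x axis, each sample index can be divided by the sampling rate. A minimal sketch, assuming the RATE of 44100 and CHUNK of 1024 * 4 used in the video's setup:

```python
RATE = 44100       # samples per second
CHUNK = 1024 * 4   # samples per read

# Time in seconds of each sample within one chunk.
times = [n / RATE for n in range(CHUNK)]

print(times[1])      # spacing between samples, 1 / RATE seconds
print(CHUNK / RATE)  # duration of one chunk, roughly 0.093 s
```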
Can you please make a video like the above which will have real time datetimes as x ticks instead of just number of samples?
l love your video and the explanations amazing work
I'm new to Python and working on a project of my own. I saw that you're coding in a notebook, so I'd like to ask: is writing the code in PyCharm the same?
I know the doubt is pretty silly. ...How does it show the spectrum when I haven't even given any audio as input? Only if I input an audio data it can plot the spectrum, right?
hi, i want to try your code, but i ran into a strange problem. it says:
AttributeError: 'Window' object has no attribute 'setGeometry'
i don't know if something is missing, or something else...
im working on raspberry 3 with ubuntuMate...
richardyo888 you can remove that line of code. I just used that to center the figure on my screen, but it's not necessary
hello @Mark Jay: can we do it in reverse as well? means from signals to audio?
good stuff, wasn't aware that this could be done so directly. sadly, i also get all 0s as my audio samples - so running this on a mac may require a few more setup steps. i have not been able to uncover the proper way to do this, despite a bit of research. would LOVE it if someone could provide details on what it takes to get the audio input from the default mic into pyaudio on a mac.
Have you got your answer what are the few more steps??? Coz I am using MacOS and i am getting 0s value in my tuples that mean it didn't take any data from my audio? pls help me out
so, i changed to "output=False", and some data started to show up. and then i got some overflow errors, but then got data again. hopefully that will move you forward.
Please be aware there is a lot of mis-information in this tutorial. i don't think the author fully understands what the pyaudio.paInt16 data format is - (a 2-byte signed integer), as everything after that is a muddy mess of bad data structure manipulation. ask yourself why len(data) = 8192 when the CHUNK is set to 4096. (because each piece of CHUNK is 2 bytes! that's what paInt16 IS).
when using the proper data format, the data values will range from -32768 to +32767.
you might consider loading the numpy library, and then converting 'data' variable to a properly structured variable with:
amplitude = np.fromstring(data, dtype='int16')
and then try to use that variable (amplitude) to plot the values, ignoring all that struct.unpack business.
by the way, if you do look at those bytes individually, you will discover they are stored as twos-complement - but across 2 bytes, of course. but doing so is just bad tech - use a Int16 data type since the data was created with an Int16 data type.
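The conversion described above can be sketched without any audio hardware. This example builds a fake buffer in place of stream.read(CHUNK) and uses np.frombuffer, which is the call NumPy recommends for binary data (np.fromstring is deprecated for this use). The buffer contents are invented for illustration.

```python
import numpy as np

CHUNK = 4096

# Stand-in for data = stream.read(CHUNK) with FORMAT = pyaudio.paInt16:
# CHUNK samples of silence except for the two extreme values.
fake = np.zeros(CHUNK, dtype=np.int16)
fake[0], fake[1] = -32768, 32767
data = fake.tobytes()

print(len(data))  # 8192 bytes, because each int16 sample takes 2 bytes

amplitude = np.frombuffer(data, dtype=np.int16)
print(amplitude.shape)                   # (4096,), one value per sample
print(amplitude.min(), amplitude.max())  # -32768 32767
```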
@@DaveCampbellKY thankyou for this important information it's really helpful.
And can you tell me(or send me the link) from where you got these information and do this coding part as well (github link) if you have. Thankyou so much
@@priyankasrivastava6757 - it might be time to 'roll up your sleeves' and dig into some source material - this is actually a fairly straightforward exercise, to take the inspiration from the original video and correct it using proper data structures. i haven't done that myself - but may, and if i do i will post it here. in the meantime, these are useful references:
pyaudio home - people.csail.mit.edu/hubert/pyaudio/
pyaudio documentation - people.csail.mit.edu/hubert/pyaudio/docs/
numpy.dtype docs - docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html
twos-complement details - en.wikipedia.org/wiki/Two%27s_complement
@@DaveCampbellKY Thankyou
Really great. Gonna test this out and throw a FFT on it if it works
Great video! I am getting 00 values when returning data as well as data_int ('0\x00\x00\...' and '0, 0, 0...' respectively) Any advice would be much appreciated.
Chris Coletti sounds like a conversion problem from bytes to ints. You could try using np.frombuffer to do the conversion instead of struct. Someone made a comment in the github issues section which shows the correct syntax
did you ever figure it out? im getting the same problem were the data is either 0 or 255, nothing in between...
Hassan Khalil try using np.frombuffer and I think the type is
Yeah, turns out for some reason printing the data on just one chunk doesn't work, but printing the data in a loop works fine. Thanks! BTW .frombuffer and .fromstring return much cleaner data anyway! your videos rock dude!!
awesome! thank you!
Thanks for the video! How do I extract features from .wav files using this library or other libraries? Thanks in advance!
use librosa library librosa.github.io/librosa/
Can I further improve on your code in order to AI interpret data?
If you're doing this in a normal python script via matplotlib.use('TkAgg'), you may have to enable interactive mode: plt.ion()
could you explain how that works ....?
@@mimanshumaheshwari Not really sure how it works behind the scenes. I just remember googling for solutions and this line fixed the issue 👍
Great video keep going dude!
why every time I hit run, it says:
AttributeError Traceback (most recent call last)
in ()
7
8 CHUNK = 1024 * 4
----> 9 FORMAT = pyaudio.PaInt16
10 CHANNELS = 1
11 RATE = 44100
AttributeError: module 'pyaudio' has no attribute 'PaInt16'
Mecca 0497 should be paInt16 (not PaInt16)
ah... thank you! I didn't see that, P instead p
I didn't know why but...
TclError Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\matplotlib\backends\tkagg.py in blit(photoimage, aggimage, bbox, colormode)
21 "PyAggImagePhoto", photoimage,
---> 22 id(data), colormode, id(bbox_array))
23 except Tk.TclError:
TclError: this isn't a Tk application
During handling of the above exception, another exception occurred:
TclError Traceback (most recent call last)
in ()
20 data_int = np.array(struct.unpack(str(2 * CHUNK) + 'B', data), dtype='b')[::2] + 127
21 line.set_ydata(data_int)
---> 22 fig.canvas.draw()
23 fig.canvas.flush_events()
C:\Anaconda3\lib\site-packages\matplotlib\backends\backend_tkagg.py in draw(self)
350 def draw(self):
351 FigureCanvasAgg.draw(self)
--> 352 tkagg.blit(self._tkphoto, self.renderer._renderer, colormode=2)
353 self._master.update_idletasks()
354
C:\Anaconda3\lib\site-packages\matplotlib\backends\tkagg.py in blit(photoimage, aggimage, bbox, colormode)
28 _tkagg.tkinit(id(tk), 0)
29 tk.call("PyAggImagePhoto", photoimage,
---> 30 id(data), colormode, id(bbox_array))
31 except (ImportError, AttributeError, Tk.TclError):
32 raise
TclError: this isn't a Tk application
You can do
data_int = np.fromstring(data, dtype=np.int16)
First Last thanks, I think that's a better method for converting the bytes than struct
i just want to thank you both mark jay and adam rose, the code adam rose submitted fixes a problem when audio goes beyond 255 or under 0 (what i can guess is saturation) that the base code has, it yielded better results when i tried this method and it even worked in the part 3. please pin this comment so that everybody can know
Very thanks for uploading it .. but could u pls just tell how to make it react on computers audio also.. with microphone ??
Needed to add plt.ion() to get an image
After I run the code, the display is not as good as yours. It only shows 3 numbers: 0, 1 and 255. The matplotlib window gives me long vertical lines. Even when my voice is detected, the plot does move, but as vertical lines that spread from the center to the left and right (no horizontal line).
after I closed the window it gave me an error message:
TclError Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\matplotlib\backends\tkagg.py in blit(photoimage, aggimage, bbox, colormode)
21 "PyAggImagePhoto", photoimage,
---> 22 id(data), colormode, id(bbox_array))
23 except Tk.TclError:
TclError: this isn't a Tk application
During handling of the above exception, another exception occurred:
TclError Traceback (most recent call last)
in ()
20 data_int = np.array(struct.unpack(str(2 * CHUNK) + 'B', data), dtype='b')[::2] + 127
21 line.set_ydata(data_int)
---> 22 fig.canvas.draw()
23 fig.canvas.flush_events()
C:\Anaconda3\lib\site-packages\matplotlib\backends\backend_tkagg.py in draw(self)
350 def draw(self):
351 FigureCanvasAgg.draw(self)
--> 352 tkagg.blit(self._tkphoto, self.renderer._renderer, colormode=2)
353 self._master.update_idletasks()
354
C:\Anaconda3\lib\site-packages\matplotlib\backends\tkagg.py in blit(photoimage, aggimage, bbox, colormode)
28 _tkagg.tkinit(id(tk), 0)
29 tk.call("PyAggImagePhoto", photoimage,
---> 30 id(data), colormode, id(bbox_array))
31 except (ImportError, AttributeError, Tk.TclError):
32 raise
TclError: this isn't a Tk application
what happened....???
Try using np.frombuffer. it's a much better way of converting the audio data from bytes to ints
Getting same issue brother, have you solved it now?
Hi Mark! Thanks for this awesome video. Wanted to ask you.. is there any way we can analyze mean frequency/fundamental frequency and save that data into a text file? time vs fundamental frequency data. ?
Hey Mark Jay , actually after running the : import pyaudio in jupyter notebook i've found this error : ModuleNotFoundError Traceback (most recent call last)
in
----> 1 import pyaudio
ModuleNotFoundError: No module named 'pyaudio'
Ok. That means you need to install pyaudio. If you are on windows, you can just do pip install pyaudio and everything should work. If you are on linux, you also need pulse audio installed, and you need to start an instance of pulse audio before running pyaudio
@@MarkJay I'm working on windows , I try it on prompt right ?
my spectrum window can't show up. I'm using python 3.7
same. any solutions?
I have a problem when installing a module. ex:
"pip install pyaudio"
I get an error saying :
File "", line 1
pip install pyaudio
^
SyntaxError: invalid syntax
I can already see that the syntax error points at "install".
How to fix it?
I'm using a python 3.6.2 running on a windows 8.
That's strange. You shouldn't get a syntax error if you're running the command from a cmd window
Mark Jay I guess i just have to reinstall the python and use some other version of it.
Anyhow.... Thanks for the reply!
I fix It!! I just read this article ----> stackoverflow.com/questions/8548030/why-does-pip-install-inside-python-raise-a-syntaxerror
Harie Amjari Thats awesome! Glad you figured it out! Its a good feeling
@@iordanissapidis3534 Oh, reviving a 2 year old comment. I stop coding in python. Now I code in C.
It looks like you should be interpreting the bytes as _Unsigned_ int16, then you shouldn't have to subtract 127 (or 128)
When instead of microphone I use a wave file, I get this error:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Maybe it is stereo audio, which means you basically should have 2 lines in the graphic.
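If the file is stereo, the samples for the two channels are interleaved (left, right, left, right, ...), so one read yields twice as many values as the x axis expects. A minimal sketch of splitting them, using invented sample values rather than a real .wav file:

```python
import struct

frames = 4  # illustrative number of stereo frames
# Interleaved 16-bit stereo data: L0, R0, L1, R1, ...
interleaved = (10, -10, 20, -20, 30, -30, 40, -40)
data = struct.pack(str(2 * frames) + 'h', *interleaved)

samples = struct.unpack(str(len(data) // 2) + 'h', data)
left = samples[0::2]   # every even-indexed value
right = samples[1::2]  # every odd-indexed value

print(left)   # (10, 20, 30, 40)
print(right)  # (-10, -20, -30, -40)
```

With the standard library's wave module, getnchannels() on the opened file reports how many channels it has, which tells you whether this split is needed.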
I am having a bit of trouble getting a pop up window like you do. I have matplot lib and I am working in Jupyter notebooks. I also looked through the comments and didn't find anything that worked. You mentioned that matplot lib uses tkinter? Do I need to have tkinter installed before it will work? I didn't see you do it but maybe I need to import it?
Great video though! Any suggestions are welcome!
Tkinter is part of python standard library so it should work. If you check part3 of the series, I made a python file version. Maybe that will work better
Hey Mark Jay! Your tutorial is amazing and exactly what I was looking for. But there is a small problem I'm facing. I can't see the graph being plotted. I am running Anaconda Shell and copied your code completely. The line which you added first, i.e. %matplotlib tk, is giving me an error, so I removed it. Please help. Thank you
Saad Tahir thanks! Glad you like them. That line is an ipython magic function. Try opening a shell by typing "ipython" (instead of python). You can also start a jupyter notebook and work in there. Just type "jupyter notebook" in a cmd window
I also created a python file for the viewer (instead of a jupyter notebook). You can try running this script.
github.com/markjay4k/Audio-Spectrum-Analyzer-in-Python/blob/master/audio_spectrum.py
Hi Mark! Thanks a lot for the video and code you uploaded. I just have a question about running the python file you wrote for the viewer (I use sublime). After I run the code, the picture only shows up for a second, and the program just kind of crashes with an "OSError: [Errno -9981] Input overflowed" error. I am wondering what goes wrong. Can you please help? Thanks so much!
Lynn Gu glad you like them! Are you using the notebook version? In the github repo there's a python file version that will work in a text editor like sublime
Thanks for the response!!! Yeah I used the github version, however I keep getting the input overflow error and I don't quite know why..
If the jupyter notebook cell running p = pyaudio.PyAudio() never terminates, try switching the backend from tk to qt: %matplotlib qt
Hi! Sir, I'm encountering error here:
OSError Traceback (most recent call last)
in ()
7 input=True,
8 output=True,
----> 9 frames_per_buffer=CHUNK
10 )
11
I'm using a jupyter notebook on a Rpi. How can I solve this? Thank you
Larenze Dimaandal not sure but it's probably the audio driver if you're using a pi. I've had trouble using pyaudio on linux
Thank you, so we can't implement this on pi? Do you have an alternative scripts with this issue?
I understand thank you so much :D
hi mark
i 'm getting this error " File "", line 12
data = stream.read(CHUNK)
^
IndentationError: unexpected indent"
Imene Ben Haddada you've made an indent error.
Imene Ben Haddada python is picky about indentation, so be mindful of it.
Excellent video Mark. Thank you.
I'm getting the same error, even with indent...
OSError Traceback (most recent call last)
in ()
18
19 while True:
---> 20 data = stream.read(CHUNK)
Which API in this code did you use to read audio from the microphone?
how to add autotune with realtime ?
When I run it I see only zeroes (0 0 0 0 0 0 0 0). Does anyone know why this happens?
Hi, I had this problem, try changing 'output=True' to 'output=False'. That worked for me. (running on macOS).
@@lazzapie this just save my life
Hi, when I am running the code on jupyter notebook, I kept getting OSError: [Errno -9981] Input overflowed error, do you know how can I solve it?
Lynn Gu do you see anything that says --IOPubRate? It could be you're creating objects that are too big for your jupyter to plot. You can increase the iopubrate with a command when you start the jupyter server.
I am trying to do this on a raspberry pi and i am getting the same error.
I think the error has to do with not enough memory size to read the audio data.
use a try and except statement when you read from your microphone.
for example
try:
    data = stream.read(CHUNK)
except OSError:
    continue
you can also try playing with the size of CHUNK.
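The try/except idea can be wrapped in a small helper. This is only a sketch: read_chunk and FakeStream are hypothetical names, and FakeStream merely imitates a PyAudio stream that raises OSError on its first read so the example runs without a microphone. Returning silence on overflow is one possible choice, not necessarily the right one for every application.

```python
def read_chunk(stream, chunk, sample_width=2):
    """Read one chunk; on an input overflow, return silence of the same size."""
    try:
        return stream.read(chunk)
    except OSError:
        # Dropped chunk: substitute zeros so the plot keeps its shape.
        return b'\x00' * (chunk * sample_width)


class FakeStream:
    """Hypothetical stand-in for a PyAudio stream; fails on the first read."""

    def __init__(self):
        self.calls = 0

    def read(self, num_frames):
        self.calls += 1
        if self.calls == 1:
            raise OSError(-9981, 'Input overflowed')
        return b'\x01\x00' * num_frames


stream = FakeStream()
first = read_chunk(stream, 4)   # overflow is swallowed, zeros come back
second = read_chunk(stream, 4)  # normal read
print(first)   # b'\x00\x00\x00\x00\x00\x00\x00\x00'
print(second)  # b'\x01\x00\x01\x00\x01\x00\x01\x00'
```

PyAudio's stream.read also accepts exception_on_overflow=False, which avoids raising the exception in the first place.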
I have just tried this and still have no success
although it does not give an error anymore, it just doesn't show the plot
Hey there dude... Good tutorial
I am getting "invalid output device" as an error... Don't know how to rectify it. I'd appreciate it if you can help :)