The Best Library to Augment Audio Data: Audiomentations

  • Published: 16 Jan 2025

Comments • 21

  • @SonGoku-rl9qf 8 months ago +1

    You're an awesome teacher. Thank you, Sir!

  • @linjuck1859 3 years ago +1

    Thanks for your selflessly made tutorials; they're really fantastic!

  • @mohamedhamada8330 1 year ago

    Very useful! Thanks for this great playlist Valerio 😍

  • @daniilkrapivin3707 3 years ago

    This channel is GOLD. Thanks for the content man :0

  • @santosotoso4287 1 year ago

    Thanks a lot. A very informative video, great job!

  • @rightcurve 2 years ago

    Great content. Please consider covering more voice-related topics, such as speaker recognition.

  • @janithdesilva7518 3 years ago +2

    Could you please make a video series on reducing background noise in data, then audio data preprocessing, audio feature extraction, and training a model on audio? Please!

    • @ValerioVelardoTheSoundofAI 3 years ago +1

      I already have a few series that one way or another address audio feature extraction and model training.
      For the future, I'm planning videos on Audio ML in Production, touching the whole pipeline from an MLOps perspective.
      Stay tuned!

  • @kakubasamuel7255 2 years ago

    Thanks a lot, Valerio. This channel has taught me a lot. Could you give us an example of applying augmentation, especially with audiomentations, to a whole dataset in Keras? Thanks a lot once again.

    • @nelbn 2 years ago

      I was wondering the same thing. What is the best way to implement audio augmentation in Keras? Is it possible to integrate it and create a flow, similar to what can be done with the ImageDataGenerator class (native to Keras), or do we have to apply audio augmentation to some files, store them, and then use the new dataset as a regular dataset in Keras?
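
A possible answer to the question above, as a minimal sketch rather than an official recipe: a plain Python generator can augment each batch as it is drawn, which gives a flow roughly comparable to ImageDataGenerator. It assumes the waveforms are already loaded into a NumPy array of equal-length clips. `add_noise` is a hypothetical stand-in for an audiomentations pipeline, which is called the same way (`augment(samples=..., sample_rate=...)`); `augmented_batches` and its parameters are illustrative names, not part of Keras or audiomentations.

```python
import numpy as np


def add_noise(samples, sample_rate):
    """Hypothetical stand-in for an audiomentations pipeline."""
    noise = np.random.normal(0.0, 0.005, size=samples.shape)
    return (samples + noise).astype(np.float32)


def augmented_batches(waveforms, labels, sample_rate, batch_size, augment, seed=0):
    """Yield (batch_x, batch_y) forever, augmenting each waveform on the fly."""
    rng = np.random.default_rng(seed)
    n = len(waveforms)
    while True:
        order = rng.permutation(n)  # reshuffle at the start of every pass
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch_x = np.stack([
                augment(samples=waveforms[i], sample_rate=sample_rate)
                for i in idx
            ])
            yield batch_x, labels[idx]


# Toy data: 10 silent one-second clips at 16 kHz with binary labels.
waveforms = np.zeros((10, 16000), dtype=np.float32)
labels = np.arange(10) % 2
batches = augmented_batches(waveforms, labels, 16000, batch_size=4, augment=add_noise)
batch_x, batch_y = next(batches)
```

Because the generator never ends, a training loop that consumes it has to be told how many batches make up one epoch. The alternative mentioned in the question, augmenting files once and storing them, trades disk space for less CPU work during training.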

  • @Tayzakoko 2 years ago +1

    1. Is it possible to hide a photo in an audio file?
    2. Convert audio to a spectrogram image and extract features. Can I convert a spectrogram image back to an audio file?

  • @jasminecheung1998 1 year ago

    Thank you for your video. I have a question regarding audio augmentation. In my project, the test speaker is not in the training data (2000 samples), so my model performs pretty badly on the test set: only 50% accuracy. I tried applying a pitch shift (shift of 2*2*(np.random.uniform())) to the training data, but it still doesn't work well. How should I use audio augmentation for this dataset?
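
A quick check of the expression quoted above, assuming the value is a shift in semitones: `np.random.uniform()` draws from [0, 1), so `2*2*np.random.uniform()` only ever shifts upward, within [0, 4). A symmetric range covers both directions. The ±2 bound below is an arbitrary illustrative choice, and this sketch says nothing about whether pitch shifting alone helps with unseen speakers.

```python
import numpy as np

rng = np.random.default_rng(0)

# The expression from the comment: uniform() lies in [0, 1), so this lies in [0, 4).
upward_only = 2 * 2 * rng.uniform(size=10_000)

# A symmetric alternative (the +/-2 semitone bound is an illustrative assumption).
symmetric = rng.uniform(-2.0, 2.0, size=10_000)

# True when no sampled shift lowers the pitch.
print(upward_only.min() >= 0.0)
# True when the sampled shifts go in both directions.
print(symmetric.min() < 0.0 < symmetric.max())
```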

  • @benwilliams1065 1 year ago

    Class, as always

  • @BrunoKramm 3 years ago +1

    Dear Valerio, thanks again for a great tutorial. Just one problem with pitch shifting of -8/+8: the whole spectrum gets shifted, but natural sounds at different pitches have formants that normally stay in place (which is why the pitch shift sounds like Mickey Mouse upwards and like Darth Vader downwards). In spectrograms of natural pitch shifts you can in fact clearly see these formants staying stable in some of the bands (sometimes with additional relations in between), while the rest of the non-formant spectrum is shifted. I therefore recommend a shifting algorithm with formant correction. For example, when working on voices, it makes sense to batch-process with Python using professional pitch-shifting algorithms with formant correction. Just my 5 cents.

    • @ValerioVelardoTheSoundofAI 3 years ago +1

      You're absolutely right, Bruno. I would suggest using a professional pitch-shifting solution like Rubber Band. However, if the pitch shift is limited, a less complex solution (e.g., librosa) would work just fine for DL audio applications.

  • @aysenur5961 3 years ago

    Hello, this video series is very informative and amazing, like the others. Thanks a lot. I have a question:
    if I want to apply data augmentation to the entire dataset, not a single audio file,
    (and if I want to save this augmented dataset), would a for loop be sufficient?
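
A for loop is one straightforward way to do this. The sketch below assumes a folder of mono 16-bit PCM WAV files and uses only NumPy and the standard-library `wave` module for I/O. `augment_dataset`, `read_wav`, `write_wav` and the `_aug{k}` file naming are illustrative choices, and `add_noise` is a hypothetical stand-in for an audiomentations pipeline, which is called with the same `samples=` and `sample_rate=` arguments.

```python
import wave
from pathlib import Path

import numpy as np


def read_wav(path):
    """Read a mono 16-bit PCM WAV file as float32 samples in [-1, 1]."""
    with wave.open(str(path), "rb") as f:
        sample_rate = f.getframerate()
        pcm = np.frombuffer(f.readframes(f.getnframes()), dtype="<i2")
    return pcm.astype(np.float32) / 32768.0, sample_rate


def write_wav(path, samples, sample_rate):
    """Write float samples in [-1, 1] as a mono 16-bit PCM WAV file."""
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


def augment_dataset(in_dir, out_dir, augment, copies=2):
    """Save `copies` augmented versions of every WAV file found in in_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(Path(in_dir).glob("*.wav")):
        samples, sample_rate = read_wav(path)
        for k in range(copies):
            augmented = augment(samples=samples, sample_rate=sample_rate)
            target = out_dir / f"{path.stem}_aug{k}.wav"
            write_wav(target, augmented, sample_rate)
            written.append(target)
    return written


def add_noise(samples, sample_rate):
    """Hypothetical stand-in for an audiomentations pipeline."""
    noise = np.random.normal(0.0, 0.005, size=samples.shape)
    return (samples + noise).astype(np.float32)
```

Calling `augment_dataset("data/raw", "data/augmented", add_noise)` (both paths are placeholders) would write two noisy copies of each input file and return their paths.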

  • @yangwang9688 3 years ago

    How can I use it in a PyTorch dataset and apply it on the fly instead of offline?

    • @ValerioVelardoTheSoundofAI 3 years ago

      You should look into torch-audiomentations which specifically addresses that need ;)

    • @jianquan9154 2 years ago

      @ValerioVelardoTheSoundofAI Is it possible to do it for a single item instead of each batch? The output of my dataset is a spectrogram.
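
One way to handle the per-item case raised above, sketched under assumptions: augment the raw waveform inside `__getitem__` and only then convert it to a spectrogram, so every request for an item gets a fresh random augmentation. The class below is kept free of a torch dependency so it runs with NumPy alone; in a real PyTorch project it would typically subclass `torch.utils.data.Dataset` and return tensors. `AugmentedSpectrogramDataset`, `magnitude_spectrogram` and the STFT settings are illustrative, and `add_noise` is a hypothetical stand-in for an audiomentations pipeline operating on NumPy waveforms.

```python
import numpy as np


def magnitude_spectrogram(samples, n_fft=512, hop=256):
    """Magnitude STFT with a Hann window, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    starts = range(0, len(samples) - n_fft + 1, hop)
    frames = np.stack([samples[s:s + n_fft] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T.astype(np.float32)


class AugmentedSpectrogramDataset:
    """Map-style dataset: augments one waveform per item, then converts it."""

    def __init__(self, waveforms, labels, sample_rate, augment=None):
        self.waveforms = waveforms
        self.labels = labels
        self.sample_rate = sample_rate
        self.augment = augment  # any callable(samples=..., sample_rate=...)

    def __len__(self):
        return len(self.waveforms)

    def __getitem__(self, index):
        samples = self.waveforms[index]
        if self.augment is not None:
            # A new random augmentation is drawn each time the item is requested.
            samples = self.augment(samples=samples, sample_rate=self.sample_rate)
        return magnitude_spectrogram(samples), self.labels[index]


def add_noise(samples, sample_rate):
    """Hypothetical stand-in for an audiomentations pipeline."""
    noise = np.random.normal(0.0, 0.01, size=samples.shape)
    return (samples + noise).astype(np.float32)


# Toy data: three silent one-second clips at 16 kHz.
waveforms = np.zeros((3, 16000), dtype=np.float32)
dataset = AugmentedSpectrogramDataset(waveforms, [0, 1, 0], 16000, augment=add_noise)
spec, label = dataset[0]
```

Augmenting before the spectrogram step keeps the augmentation in the waveform domain, where transforms such as pitch shifting or added noise are defined; passing `augment=None` gives an unaugmented dataset for validation.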