*NOTE*: You will now get the XAI course for free if you sign up (not the SHAP course)
SHAP course: adataodyssey.com/courses/shap-with-python/
XAI course: adataodyssey.com/courses/xai-with-python/
Newsletter signup: mailchi.mp/40909011987b/signup
Excellent video! I will be recommending this to friends in the future when they want to know about data augmentation.
Thank you, Michael!
Hi. Thanks for your video. I have a question. If the augmentation methods have to be realistic depending on the use case, why can't they be applied before splitting? Thank you.
If you augment an image, the result may look very similar to the original, for example if you only slightly adjust the brightness or flip the image.
So, after many augmentations, you can have many near-copies of the same image in your dataset. If you train on some of these images, it is likely that the model will be able to predict the others correctly. So, if augmented images from the *same* original end up in both the training and test sets, you can overestimate your model's accuracy.
Alternatively, you can split the images into the training/test sets first and then augment them. This way you will not have similar images in the two sets. It is also best practice to report the accuracy on a test set that is not augmented. Something like the sketch below:
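Here is a minimal Python sketch of that idea. The dummy arrays and the augment() function are just placeholders to illustrate the order of operations, not code from the video:

import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
images = rng.random((100, 32, 32, 3))   # 100 dummy RGB images
labels = rng.integers(0, 2, size=100)   # dummy binary labels

def augment(img):
    # A couple of simple, realistic augmentations
    yield np.flip(img, axis=1)          # horizontal flip
    yield np.clip(img * 1.1, 0, 1)      # slightly brighter copy

# 1. Split BEFORE augmenting, so no augmented copy of a test image
#    can leak into the training set.
X_train, X_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, random_state=42
)

# 2. Augment only the training images.
X_train_aug, y_train_aug = list(X_train), list(y_train)
for img, label in zip(X_train, y_train):
    for aug_img in augment(img):
        X_train_aug.append(aug_img)
        y_train_aug.append(label)

# 3. Train on (X_train_aug, y_train_aug) and report accuracy
#    on the untouched X_test / y_test.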
I hope that makes sense :)
@adataodyssey Thanks a lot for your reply. Yes, it does.