221 - Easy way to split data on your disk into train, test, and validation?

  • Published: 19 Nov 2024
  • Science

Comments • 71

  • @faaalsh8784
    @faaalsh8784 2 years ago +3

    Since I started dealing with machine learning with images, you are my teacher. Thank you for the awesome tutorials you are doing. Have you posted a video about splitting data for semantic segmentation?

  • @kev-dm5388
    @kev-dm5388 2 years ago

    Thank you so much, you saved my life for my college midterm exam.

  • @kibetwalter8528
    @kibetwalter8528 2 years ago

    This guy is a Lifesaver. Always. Thank you.

  • @saratesfamariam1176
    @saratesfamariam1176 1 year ago

    Thank you for all the tutorials!

  • @SabbirAhmed-nc5hh
    @SabbirAhmed-nc5hh 3 years ago

    Good demo, I was looking for something like this. I was facing bugs in splitfolders but didn't find an intuitive solution like this elsewhere. Thanks!

  • @rameshwarsingh5859
    @rameshwarsingh5859 3 years ago

    Excellent post from Sreeni sir 👌 It helps me distribute datasets easily, thank you.

  • @paulntalo1425
    @paulntalo1425 2 years ago

    Thank you for sharing. Please make another video showing how to split a large dataset of images with metadata in the train CSV file. And how to sort the train image folder into subfolders for each label category.
    Thank you

  • @osiris583
    @osiris583 3 years ago +1

    You made my day, bro! Ty so much

  • @jacobusstrydom7017
    @jacobusstrydom7017 3 years ago

    Oh man, this could have saved me so much time. Thanks!!

  • @vitzrd2076
    @vitzrd2076 2 years ago

    Your video made my day bruhh, Thank You very much dude!

  • @Darkev77
    @Darkev77 3 years ago +1

    This was really helpful!

  • @supriyasumanidrpshc0048
    @supriyasumanidrpshc0048 3 years ago

    It was really very helpful, thanks for sharing it.

  • @pravinpawar2206
    @pravinpawar2206 3 years ago

    # If you are getting errors, use this:
    import splitfolders
    input_folder = '/content/drive/MyDrive/dataset/Garbage dataset'
    splitfolders.ratio(input_folder, output='/content/drive/MyDrive/dataset/split_garbage_dataset',
                       seed=1337, ratio=(.7, .15, .15),
                       group_prefix=None)  # default values
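
For readers who cannot get the package to install, the same idea can be roughly sketched with only the standard library. This is an illustration, not how split-folders is implemented: the function name `split_by_ratio` and its rounding behavior (any remainder goes to the test set) are assumptions made for this example. It copies files from `input_dir/<class>/` into `output_dir/{train,val,test}/<class>/`.

```python
import random
import shutil
from pathlib import Path


def split_by_ratio(input_dir, output_dir, ratio=(0.7, 0.15, 0.15), seed=1337):
    """Copy each class subfolder's files into train/val/test subfolders.

    Hypothetical helper for illustration; not part of split-folders.
    Returns the total number of files copied into each split.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    rng = random.Random(seed)  # private generator so the split is repeatable
    counts = {"train": 0, "val": 0, "test": 0}

    for class_dir in sorted(p for p in input_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)

        n_train = int(len(files) * ratio[0])
        n_val = int(len(files) * ratio[1])
        parts = {
            "train": files[:n_train],
            "val": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],  # whatever is left over
        }
        for split, members in parts.items():
            dest = output_dir / split / class_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            for f in members:
                shutil.copy2(f, dest / f.name)
            counts[split] += len(members)
    return counts
```

Because every file lands in exactly one slice of the shuffled list, no image can appear in more than one of the three output folders.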

  • @lando2519
    @lando2519 2 years ago

    thank you for the help, you are much appreciated!

  • @wadhaalmattar2343
    @wadhaalmattar2343 1 year ago

    Thanks a lot, this is very helpful

  • @tanghsien
    @tanghsien 2 years ago

    Fantastic! This is really helpful!

  • @mihretdesta9153
    @mihretdesta9153 1 year ago

    You are such a fantastic man!! But I have one question: I don't understand how to handle imbalanced datasets for multi-class image classification in code. Should oversampling be done before or after splitting the data into train, val, and test?

  • @moussarais9052
    @moussarais9052 2 years ago

    Thank you very much. I have a question: for each jpg I have a corresponding json file (its labels). How can I also split these into the right folders? Thank you

  • @zakirshah7895
    @zakirshah7895 3 years ago

    Teacher, can you make a video regarding image cropping. For example, we have many images in a folder in which the area of focus is in different locations, so how to remove the unwanted black background.

  • @kibetwalter8528
    @kibetwalter8528 2 years ago +1

    you answer all my questions

  • @questless3033
    @questless3033 1 year ago

    How do you divide a time-series image dataset? For example, I have 800 images of a plant from week 0 to week 12. How do I divide them into train, val, and test?

  • @shivamwalia5634
    @shivamwalia5634 3 years ago

    Hi Sreeni, how can I do instance segmentation using Mask R-CNN for malaria cell segmentation?

  • @nitishsingla9057
    @nitishsingla9057 3 years ago +3

    How is the seed chosen, for example whether to take 42 or 1337?

    • @leonguyen7139
      @leonguyen7139 1 year ago

      It could be any number. It is just there to make sure you get the same result every time you split.

  • @limzisin26
    @limzisin26 2 years ago

    Good day Sir, I have an urgent question. After splitting the dataset into train, val, and test, how can I pass them to the model.fit() function? The model.fit() calls I have seen from others use x_train, y_train, and so on. Thanks.

  • @reemawangkheirakpam8165
    @reemawangkheirakpam8165 3 years ago +1

    sir, can you please make a video on instance segmentation using python

  • @surflaweb
    @surflaweb 3 years ago

    This is very useful. Thanks bro

  • @هبةحميد-ز8و
    @هبةحميد-ز8و 2 years ago

    Sir, I downloaded a dataset from Kaggle (flower recognition) and tried to work this way, but I get the message "Found 0 images belonging to 5 classes". It shows that it is reading the folders but not the images, even though they are inside the folders.

  • @paulntalo1425
    @paulntalo1425 2 years ago

    Thanks for sharing

  • @thanveerahamed660
    @thanveerahamed660 3 years ago

    This was really helpful, thank you for doing this video.

  • @ertanman
    @ertanman 2 years ago

    Thank you very much sir

  • @unamattina6023
    @unamattina6023 2 years ago

    Can I use splitfolders on only the jpg files? My dataset has both jpg and png files, but I only want to split the jpg files. How can I do this?

  • @yeening9844
    @yeening9844 3 years ago

    Not sure why I get (SyntaxError: positional argument follows keyword argument) at the ratio(.7,.2,.1) part.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      Change the following and see if that works....
      From:
      splitfolders.ratio(input_folder, output="cell_images2",
      seed=42, ratio=(.7, .2, .1),
      group_prefix=None)
      To:
      splitfolders.ratio(input_folder, "cell_images2",
      42, (.7, .2, .1),
      None)
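
The error message itself comes from Python's call syntax rather than from splitfolders: once one argument is passed by keyword, every argument after it must also be passed by keyword. The sketch below checks which call forms Python accepts. The function `ratio_stub` is a hypothetical stand-in that only mimics the argument order used in this thread; it does not split anything.

```python
def ratio_stub(input_folder, output="output", seed=1337,
               ratio=(.8, .1, .1), group_prefix=None):
    """Stand-in that only echoes its arguments; not the real library call."""
    return input_folder, output, seed, ratio, group_prefix


def compiles(source):
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Keywords for everything after the first positional argument: valid.
ok_keywords = 'ratio_stub("cells", output="out", seed=42, ratio=(.7, .2, .1))'
# Everything positional: valid.
ok_positional = 'ratio_stub("cells", "out", 42, (.7, .2, .1), None)'
# A positional value after a keyword: rejected before the code even runs.
bad_mixed = 'ratio_stub("cells", output="out", 42, (.7, .2, .1))'

print(compiles(ok_keywords), compiles(ok_positional), compiles(bad_mixed))
# prints: True True False
```

So a call that keeps `output=` but drops `seed=` or `ratio=` from the later arguments would trigger this error, while an all-keyword or all-positional call would not.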

  • @frieda1669
    @frieda1669 1 year ago

    After running, no new folders were created,
    but there are no signs of errors.

  • @sajansudhir1859
    @sajansudhir1859 2 years ago

    Thanks for the video. Do we have any similar quick strategy to split the COCO dataset?

    • @DigitalSreeni
      @DigitalSreeni 2 years ago +1

      I am not aware of any ready to use libraries for that task.

  • @shankarmahadevan7146
    @shankarmahadevan7146 3 years ago

    Hi sir! I'm using the Apeer platform for annotating my images, but I'm unable to export all my annotations at once... How can I do it, Sir? I couldn't find any resources on that...

  • @random-yu5hv
    @random-yu5hv 3 years ago

    Thank you for sharing. Can you upload a video on object detection in medical images?

  • @ajaysaikiranpenumareddy9809
    @ajaysaikiranpenumareddy9809 3 years ago

    Thank you sir

  • @kurniawankhaikal3433
    @kurniawankhaikal3433 3 years ago

    I have a problem with the 80, 19, 1 ratio. Can you solve that?

  • @burakemregundes7172
    @burakemregundes7172 2 years ago

    I split my dataset, but the image in the test folder is also in the validation folder. Is this correct?

  • @kasrakakaee3441
    @kasrakakaee3441 2 years ago

    god bless your soul

  • @Sahil-m3s4t
    @Sahil-m3s4t 10 months ago

    Thanks boss

  • @dianasoaresmagalhaes6901
    @dianasoaresmagalhaes6901 3 years ago

    You're amazing! can you make a video on instance segmentation using python?

  • @alicjaeckstein1628
    @alicjaeckstein1628 2 years ago +1

    Amazing! But my output folders are empty when I use the splitfolders code. Do you have an idea why?

    • @ritujangra00
      @ritujangra00 2 years ago

      same here....

    • @ritujangra00
      @ritujangra00 2 years ago

      Can somebody tell me the reason?

    • @vitzrd2076
      @vitzrd2076 2 years ago

      If you are working in Jupyter, enter the full path of that folder.
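
One way to check this before splitting is to resolve the folder to an absolute path and confirm that it contains class subfolders with files in them. The sketch below uses only the standard library. The helper name `describe_input_folder` is made up for this example, and it assumes the layout `input_folder/<class>/<images>`.

```python
from pathlib import Path


def describe_input_folder(folder):
    """Return (absolute_path, {class_name: file_count}) for a dataset folder.

    Hypothetical helper for illustration; raises if the folder is missing.
    """
    path = Path(folder).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"Not a directory: {path}")
    counts = {
        sub.name: sum(1 for f in sub.iterdir() if f.is_file())
        for sub in sorted(path.iterdir()) if sub.is_dir()
    }
    return path, counts
```

Printing the returned path from a notebook shows which directory a relative name actually points to. An empty dictionary or zero counts suggests the images sit one level deeper or in a different place than expected.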

  • @muhannedmtd22
    @muhannedmtd22 3 years ago

    How to split into train, val, and test with fixed numbers?

  • @matancadeporco
    @matancadeporco 3 years ago

    ty

  • @shristykashyap2983
    @shristykashyap2983 2 years ago

    What is the meaning of seed? And why did you take 42 as the value?

  • @johnmoisespaunlagui5026
    @johnmoisespaunlagui5026 2 years ago

    What does seed=42 do?

    • @DigitalSreeni
      @DigitalSreeni 2 years ago

      Random is not so random - understanding random in python
      ruclips.net/video/azFSGHGeawg/видео.html

  • @angelgabrielortiz-rodrigue2937
    @angelgabrielortiz-rodrigue2937 2 years ago

    This video is awesome. However, I couldn't understand the "seed" parameter. Could you elaborate?

    • @DigitalSreeni
      @DigitalSreeni 2 years ago +1

      'Seed' is used to pick images at 'random'. Without a fixed seed, a different random selection of images is made every time you run the split. This is not good if you want your experiments to be reproducible. In our example, fixing the seed to a number gives you the same split of your images every time. Changing the seed changes which images get picked.
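
A quick way to see this behavior is to shuffle a list of file names with Python's `random` module. This is only an illustration of seeding in general, using made-up file names; it does not show how splitfolders uses the seed internally.

```python
import random


def shuffled(names, seed):
    """Return a shuffled copy of `names` using a private, seeded generator."""
    rng = random.Random(seed)
    result = list(names)
    rng.shuffle(result)
    return result


# Made-up file names standing in for a folder of images.
names = [f"img_{i:03d}.png" for i in range(100)]

same_a = shuffled(names, seed=42)
same_b = shuffled(names, seed=42)
other = shuffled(names, seed=1337)

print(same_a == same_b)  # same seed gives the same order on every run
print(same_a == other)   # a different seed gives a different order
```

The particular number carries no meaning. Values like 42 or 1337 are just conventions, and any integer works as long as it stays the same between runs.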

  • @kalluriramakrishna5732
    @kalluriramakrishna5732 3 years ago

    Thank you sir

  • @SemSemOnTop
    @SemSemOnTop 2 years ago

    Thank you very much