TidyTuesday: Comparing TidyModels with Caret

  • Published: Sep 11, 2024
  • In this #TidyTuesday video, I go over the new TidyModels package and compare it with its predecessor Caret. I go over the typical modeling process using the abalone dataset and explore the intricacies of TidyModels. I also discuss the things I like and dislike about the new package. Finally, I go over what I think is the best way to approach #TidyModels if you are already familiar with #Caret.
    Code from the video: github.com/and...
    Tidymodels: www.tidymodels...
    Caret: topepo.github.i...
    PC Setup (Amazon Affiliates)
    Keyboard: amzn.to/3Bbbk3T
    Mouse: amzn.to/3BcRGVo
    Microphone: amzn.to/3ePo9JS
    Audio Interface: amzn.to/3qTAmjz
    Webcam: amzn.to/3L9Ql6j
    CPU: amzn.to/3qGa6Zu
    GPU: amzn.to/3DnhMHL
    RAM: amzn.to/3LdTxh7

Comments • 27

  • @HaiLeQuang 3 years ago +3

    This channel is a gem. I love every minute of your video. Keep up the good work mate. Greetings from Vietnam.

  • @robertoosvaldo472 4 years ago +1

    This video was a great introduction to tidymodels, especially as a caret user. Thank you!

  • @danielvalentins1756 4 years ago +4

    Amazing video, thank you very much! There is very little material on YouTube about tidymodels yet, so it would be very valuable if you kept posting videos about this package!

  • @geilin2394 4 years ago +2

    This is a great side by side!

  • @rrrprogram4704 4 years ago +1

    Awesome... it would be just brilliant if there were a complete playlist about the tidymodels package. Please make videos on it.

    • @AndrewCouch 4 years ago

      Check out my TidyTuesday playlist: ruclips.net/p/PLJfshcspBCYeJeO8YFT5e5HxuYOb5a_1W. I have a decent number of TidyModels videos and will continue creating them!
      -Andrew

  • @user-cx8br9qi7h 4 years ago +2

    it's impressive and helpful!

  • @a072826 4 years ago +1

    Very informative! Thanks.

  • @masterofzion1 3 years ago +1

    Thanks for this!

  • @lucasokwudishu605 3 years ago +1

    Hello Andrew. I just stumbled upon this tutorial and it is super helpful. I do have one question: how would you extract variable importance using tidymodels? Thanks.

    • @AndrewCouch 3 years ago

      The vip package is the main package used to calculate variable importance scores for Tidymodels. I also have a video on interpreting black-box models here: ruclips.net/video/eNvKnhMJd2o/видео.html. I'll probably make a video on vip in the future.

  • @jamespaz4333 3 years ago

    Who could even dislike this video...

  • @jiggunjer9686 4 years ago +1

    I'd like to know if tidymodels lets me tune without any resampling logic. I know caret rejects a tuneLength > 1 if the trainControl method is "none". Currently my fastest workaround is to just use 2-fold CV.
    Also, regarding the video: caret's preProcess doesn't always work with tibbles, so you'd need another as.data.frame step in the pipe after selecting the predictors.

    • @AndrewCouch 4 years ago

      According to the tune package documentation, tune_grid requires an rsample object, so I think 2-fold CV is probably the fastest option. There are also other ways to speed up the training process, such as parallel processing.
      -Andrew

  • @zegpi1821 4 years ago +1

    Really good video! Tidymodels stuff is appreciated, although I have a question: why are you doing cross-validation on the training data? That confuses me. My formal education taught me that cross-validation and hold-out partitions are separate approaches to choose from when you're splitting the data for modelling, but we are not supposed to use both, are we?

    • @AndrewCouch 4 years ago

      Cross-validation gives a more robust estimate of how well each hyperparameter setting performs. Essentially, it creates multiple train and test sets from the original training data and uses them to tune a model over the given hyperparameter grid. Here's a link that explains it better than I can: topepo.github.io/caret/model-training-and-tuning.html
      -Andrew

    • @zegpi1821 4 years ago

      @AndrewCouch Thank you, but that's exactly my point. In the video I saw that you applied cross-validation on the training set. Maybe it's my mistake.

    • @AndrewCouch 4 years ago +1

      @zegpi1821 One way to think about it is that the method combines cross-validation with the train/test approach. Once the hyperparameters with the best average metric are found, the model is refit on the entire training set using those hyperparameters.

    • @zegpi1821 4 years ago

      @AndrewCouch I was taught that cross-validation on the training set creates a huge risk of overfitting, but I've never tried it. I will try that for sure now! Thank you very much.

    • @AndrewCouch 4 years ago +1

      There's a risk; however, random resampling should help alleviate it. I believe the term is data snooping: it isn't exactly leakage, but it does look at the test data a little bit. Glad you enjoyed the video!
      -Andrew Couch
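
A minimal sketch of the scheme discussed in the thread above: hold out a test set, run k-fold cross-validation on the training data only to pick a hyperparameter, refit on the full training set, then score once on the test set. It is written in plain Python rather than tidymodels or caret so that it runs without any packages. The one-dimensional ridge model, the penalty grid, the synthetic data, and every function name are assumptions made for illustration; none of them come from the video.

```python
import random


def fit_ridge(xs, ys, penalty):
    """Closed-form 1-D ridge fit without intercept: w = sum(x*y) / (sum(x^2) + penalty)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


def rmse(w, xs, ys):
    """Root mean squared error of the prediction w * x."""
    return (sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def k_fold_indices(n, k):
    """Split range(n) into k disjoint folds of near-equal size."""
    return [list(range(i, n, k)) for i in range(k)]


def tune_with_cv(xs, ys, grid, k=5):
    """Return the penalty with the lowest mean validation RMSE, plus all scores."""
    folds = k_fold_indices(len(xs), k)
    scores = {}
    for penalty in grid:
        fold_errors = []
        for held_out in folds:
            held = set(held_out)
            # Fit on the other k-1 folds, validate on the held-out fold.
            fit_x = [x for i, x in enumerate(xs) if i not in held]
            fit_y = [y for i, y in enumerate(ys) if i not in held]
            val_x = [xs[i] for i in held_out]
            val_y = [ys[i] for i in held_out]
            w = fit_ridge(fit_x, fit_y, penalty)
            fold_errors.append(rmse(w, val_x, val_y))
        scores[penalty] = sum(fold_errors) / k
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data (assumption): y = 2x + Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(-3, 3) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]

# Hold-out split: the data are already in random order, so slice directly.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

# Cross-validation sees only the training data.
grid = [0.0, 1.0, 10.0, 100.0]
best_penalty, cv_scores = tune_with_cv(train_x, train_y, grid, k=5)

# Refit on the whole training set with the chosen penalty, then score once on test.
final_w = fit_ridge(train_x, train_y, best_penalty)
test_rmse = rmse(final_w, test_x, test_y)

print(f"best penalty: {best_penalty}")
print(f"mean CV RMSE: {cv_scores[best_penalty]:.3f}")
print(f"test RMSE:    {test_rmse:.3f}")
```

The test rows never enter `tune_with_cv`, so the final test score is not influenced by the hyperparameter search. That is the sense in which the two approaches are combined rather than used as alternatives.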

  • @arafat464 3 years ago

    I'm so used to caret that it's hard to move on to tidymodels, even though the creator of caret himself is working on tidymodels. Can you do a video on mlr3? That seems to be the other main ML package in R.

    • @AndrewCouch 3 years ago

      I am not making videos anymore; however, I recommend looking at www.tidymodels.org/learn/ and www.tmwr.org/

  • @rrrprogram4704 4 years ago

    Where can we learn tidymodels thoroughly? Its own website doesn't help much.

    • @AndrewCouch 4 years ago

      From what I understand, the RStudio team is working on the TidyModels website and is adding tutorials. However, a lot of the sections are missing or broken, which is why I have been focusing my weekly videos on the TidyModels package. If you want, you could look at each package's documentation.
      -Andrew

  • @artem.starchenko 4 years ago +2

    Thanks, but your screen is too big

    • @AndrewCouch 4 years ago

      Hey Артём,
      Another commenter has mentioned this before and I have made the fixes. For the future, the code will be easier to read. You can check out the link in the description for more clarification. Thanks for watching!
      -Andrew