Hands-on dplyr tutorial for faster data manipulation in R

  • Published: Jul 15, 2024
  • dplyr is a new R package for data manipulation. Using a series of examples on a dataset you can download, this tutorial covers the five basic dplyr "verbs" as well as a dozen other dplyr functions.
    Watch the follow-up tutorial: • Going deeper with dply...
    View the R Markdown document: rpubs.com/justmarkham/dplyr-tu...
    Download the source document: github.com/justmarkham/dplyr-...
    Read about why I love dplyr: www.dataschool.io/dplyr-tutor...
    Tutorial contents:
    1. Introduction to dplyr (starts at 0:00)
    2. Loading dplyr and the example dataset (starts at 2:29)
    3. Understanding "local data frames" (starts at 3:23)
    4. Verb #1: `filter` (starts at 5:17)
    5. Verb #2: `select`, plus `contains`, `starts_with`, `ends_with`, `matches` (starts at 7:54)
    6. Using chaining syntax for more readable code (starts at 9:34)
    7. Verb #3: `arrange` (starts at 12:53)
    8. Verb #4: `mutate` (starts at 13:55)
    9. Verb #5: `summarise`, plus `group_by`, `summarise_each`, `n`, `n_distinct`, `tally` (starts at 15:31)
    10. Window functions: `min_rank`, `top_n`, `lag` (starts at 26:47)
    11. Convenience functions: `sample_n`, `sample_frac`, `glimpse` (starts at 32:44)
    12. Connecting to databases (starts at 34:21)
    == RESOURCES ==
    Reference manual and vignettes: cran.r-project.org/web/package...
    July 2014 webinar: pages.rstudio.net/Webinar-Seri...
    July 2014 webinar code: github.com/rstudio/webinars/t...
    Tutorial by Hadley Wickham: www.dropbox.com/sh/i8qnluwmui...
    GitHub repo: github.com/hadley/dplyr
    List of releases: github.com/hadley/dplyr/releases
    == LET'S CONNECT! ==
    Newsletter: www.dataschool.io/subscribe/
    Twitter: / justmarkham
    Facebook: / datascienceschool
    LinkedIn: / justmarkham
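
The five verbs and the chaining syntax listed in the tutorial contents can be sketched in a few lines. This is a minimal illustration, not the code from the video: the `flights` data frame below is made-up toy data, and its column names (`UniqueCarrier`, `Distance`, `AirTime`, and so on) are assumptions chosen for the example.

```r
library(dplyr)

# Hypothetical toy data (values are invented for illustration only)
flights <- data.frame(
  UniqueCarrier = c("AA", "AA", "UA", "UA", "WN"),
  Month         = c(1, 1, 1, 2, 2),
  DayofMonth    = c(1, 2, 1, 1, 1),
  DepDelay      = c(5, 70, 90, NA, 10),
  Distance      = c(224, 224, 1400, 1400, 300),
  AirTime       = c(40, 45, 180, 175, 50)
)

# Chaining with %>% reads top to bottom instead of inside out
result <- flights %>%
  filter(Month == 1) %>%                                   # verb 1: keep rows
  select(UniqueCarrier, DepDelay, Distance, AirTime) %>%   # verb 2: keep columns
  mutate(Speed = Distance / AirTime * 60) %>%              # verb 4: add a column
  arrange(desc(DepDelay))                                  # verb 3: sort rows

# verb 5: summarise, applied per group via group_by
by_carrier <- flights %>%
  group_by(UniqueCarrier) %>%
  summarise(avg_delay = mean(DepDelay, na.rm = TRUE),
            n_flights = n())

print(result)
print(by_carrier)
```

The order of the verbs in the chain is a design choice: filtering and selecting first keeps the later steps working on a smaller table.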

Comments • 361

  • @douglaspiresmartins2955 4 years ago +31

    You rock dude, 6 year later and this vid is still really really helpful.

    • @RicoLamar987 3 years ago +1

      Willing to PAY anyone to do my business statistics R homework!!!! Someone please message me ASAP to negotiate a price

  • @bryanwu6973 7 years ago +35

    This is great! I really love the fact that you show the "base R" approach as a comparison. Looking forward to more vids.

    • @dataschool 7 years ago +3

      Thanks! Glad it was helpful to you!

  • @user-li5qi1hj5y 9 years ago +5

    I've been using dplyr for a while now and I still learned some new and useful stuff from this video. Thanks very much for sharing.

    • @dataschool 9 years ago +1

      You're welcome! Glad it was useful to you. I'm planning to create an "updates" video once dplyr 0.3 is released!

  • @johnsheehan2434 7 years ago +2

    Thank you for putting these tutorials together. They are FANTASTIC for the R newbie. And I particularly love that you have the R Markdown version that we can keep for reference.

    • @dataschool 7 years ago

      Awesome! You are very welcome! Here's a link to the GitHub repository, for anyone who needs it: github.com/justmarkham/dplyr-tutorial

  • @mago2007 5 years ago +8

    This is, without question, one of the best R tutorials I have seen! Thank you!

  • @1997nakul 7 years ago +9

    That was a well spent 40 minutes. Very neat, precise, and easy to understand. You have my additional gratitude for comparing dplyr to the base R functions, which helped in visualizing why dplyr is practical. Thank You!

    • @dataschool 7 years ago

      You're very welcome! Thanks so much for your incredibly nice comment! :)

  • @andrenana15 5 years ago

    I was doing the side by side comparison on a spreadsheet, then I found your video. This is gold. I subscribed to your channel immediately. So thankful to you.

    • @dataschool 5 years ago

      That's great to hear... thanks so much for sharing!

  • @yongyanchen8836 4 years ago +2

    Thanks a lot! As a greenhand to R, your presentation is really helpful to me! Great!

  • @dwadelson 9 years ago +51

    very clear, straightforward, succinct, and helpful. i particularly liked the way the dplyr approach was contrasted with the base R approach for each of the examples. excellent. thank you.

    • @dataschool 9 years ago +4

      Wow, thanks David Adelson, that's such a nice compliment! I appreciate you taking the time to share it with me!

    • @miztx2syuiip590 2 years ago

      yes thank u 4 sharing - Its what America is all about! and still the same & going strong in 2022 whew whew! Really grateful for showing the chaining option really cuts down on my workflow

  • @bir_dilim_bilim3510 5 years ago +4

    Thanks for this video! I really loved the way you explain things. Very clear and step-by-step approach. Will definitely watch more of your videos to get better in R!

    • @dataschool 5 years ago

      Thanks for your kind words! 🙌

  • @differe94nt 10 years ago +2

    I have learned more about how to neatly and efficiently apply more than one function to more than one column at once.
    Really appreciate your organized illustration!!!
    Thank you so much!!!!!
    Anticipating for more lessons from you:))))))))))))))))

    • @dataschool 10 years ago +2

      Awesome, I'm glad you learned a lot from the video!

  • @Papercraftfreak3 7 years ago

    This tutorial is the most helpful resource I've found thus far for my Data Science project. Thank you so much for posting!!

    • @dataschool 7 years ago

      You're very welcome! I'm so glad to hear that it has been helpful to you!

  • @keithtyrrell9198 4 years ago +1

    I'm fairly new to R and didn't really know where to start for my current project. I think I learned about 90% of what I need to know from this video. Thank you very much!

  • @faseehahmed4153 1 year ago +1

    8 years old, still the best tutorial on dplyr.

  • @prempasanha3 6 years ago

    The way you explained the commands by comparing them to base R helps us appreciate dplyr. Thanks. Good job

  • @smfry010 8 years ago

    Outstanding video, I never knew that dplyr was so powerful AND easy. Thank you for providing this video!

    • @dataschool 8 years ago

      +Michael Fry Glad it was helpful to you! Thanks for your kind words!

  • @victorls 10 years ago +3

    Thanks for this great video. I learnt more here in 1 hour than in several days surfing the Internet!

    • @dataschool 10 years ago +1

      Awesome, that was my goal! Thanks for your comment!

    • @antocdt1413 6 years ago

      me too!

  • @datguysam 9 years ago

    Very clear and useful tutorial on dplyr. Thanks a lot!

  • @itisdawit8298 4 years ago

    Wow, you're simply the best instructor in the YouTube world.
    You shortened the time I need to explore more about dplyr; very clear explanations. Many thanks

  • @macanbhaird1966 4 years ago

    I am teaching statistics and data manipulation and this is most useful! Thanks for this.

  • @fghj-zh6cv 6 years ago

    Your tutorial is meticulous, clear and useful for those who are used to the base R approach but feel a need to learn the dplyr package. This video is not lengthy, and it does help me write R code in an efficient and convenient manner. Thanks.

  • @thabitiyssoufa3232 4 years ago +1

    Best tutorial I have ever watched on YouTube. Thank you and keep moving forward.

  • @nurmohammad2979 7 months ago +1

    Watching this video was well worth my time. I am grateful for your nice work. Thanks a lot.

    • @dataschool 6 months ago

      You're welcome! Thanks for the kind comment!

  • @MortenBunesGustavsen 8 years ago +3

    Thanks a lot for this very clear and instructive tutorial !

    • @dataschool 8 years ago +1

      +Morten Bunes Gustavsen You're very welcome!

  • @pburet 7 years ago

    Thx for putting this together, it's amazingly clear. I find the comparison with the standard R code very useful.

    • @dataschool 7 years ago

      Great to hear! You are very welcome!

  • @aegystierone8505 4 years ago

    Thank you for doing this Kevin!

  • @meghashankar9544 5 years ago +1

    Thank you so much for this detailed and crisp tutorial!

  • @AstridLexical 10 years ago

    Thank you Kevin. You are helping so many people with your great explanations. I'm also doing the Data Science Specialization through Coursera.

    • @dataschool 10 years ago

      You're very welcome, Astrid! Good luck with the Specialization!

  • @hasankhalid 4 years ago

    You do such an excellent job at teaching. Everything is so thought through and organized that it makes learning easy.

    • @dataschool 4 years ago

      Thanks very much for your kind words!

  • @elpiopro 4 years ago +1

    2020 and still a very informative and wonderful video. Thanks!

  • @georgenjunge911 5 years ago

    Your explanations are second to none

  • @gregoryhorne2952 10 years ago +2

    The dplyr package should simplify exploratory data analysis when used in conjunction with the graphics packages (base, ggplot2, or lattice). Excellent introductory tutorial.

  • @MrDavisv 6 years ago

    Thank you!!! The Chaining example is awesome, and makes complete sense now.

  • @leechmaster21 5 years ago

    You're so clear man. Proud of you!

  • @shishu3986 8 years ago

    Very useful! Especially the part that shows how to connect to a database. Thank you!

    • @dataschool 8 years ago

      +Shi Shu You're very welcome! Glad it was helpful to you!

  • @mrmacmasterbigdaddyj8552 4 years ago

    Thanks, that's a great tutorial! And the structure is perfect!

  • @thompsonnucky2985 4 years ago +1

    Appreciating your great work! Thanks again.

  • @otroleonarbe 5 years ago +1

    Great video and well explained. Please keep posting these types of videos.

  • @saurabhsinha2026 3 years ago

    Really very informative and to the point. Thanks a lot for creating and sharing this video.

  • @hafianeyacine8872 5 years ago

    Very clear and helpful ! Thank you for your time and efforts.

  • @forestsunrise26 2 years ago

    Thank you so so much for your tutorials and materials!

  • @kanchanaramar 3 years ago

    Thanks a lot. Your explanation is clear and direct.

  • @jordanndetcho2789 7 years ago

    Very easy to understand, straight to the point and useful for beginners like me ^^
    thank you very much !

  • @noorakazanji7423 3 years ago +1

    This is my favorite R tutorial on YouTube. Thanks a lot.

  • @theforester_ 2 years ago

    thanks for the tutorial! big shout out from Brazil

  • @punitkaur3276 7 years ago

    38 mins well spent, thanks for an awesome tutorial!!!

    • @dataschool 7 years ago

      Great to hear! Thanks for watching :)

  • @fahadshery 8 years ago

    Saved the day! thank you ever so much. Keep them coming!

    • @dataschool 8 years ago

      +fahadshery Great to hear!

  • @julianonas 7 years ago

    excellent, thank you for your time to share the explanation!

  • @AlokPratapSingh4001 4 years ago

    Very precise, to-the-point lecture! Thank you very much

  • @shreyanshmantri1756 5 years ago +1

    Excellent video, cleared up all my basics... many thanks

  • @dheerajkura5193 7 years ago

    Explanation was awesome. It changed and improved my perception of RStudio

    • @dataschool 7 years ago

      Thanks! Glad it was helpful to you!

  • @samueljuma5905 1 year ago

    Thanks for this elegant presentation sir

  • @juliobrettas3911 9 years ago

    Awesome tutorial! I'll surely rework all my codes with this... Thanks!

    • @dataschool 9 years ago

      Thanks! dplyr makes code so much easier to read, right?

  • @bikashpokharel478 3 years ago +1

    I just started 38:56 mins ago, now I am an R expert. Thanks a lot

  • @wow_kavita.indulkar 4 years ago

    Really helpful..thanks for the efforts

  • @tuenguyen1225 7 years ago

    Excellent tutorial. Thank you very much, Sir.

  • @fet1612 4 years ago +2

    10. Window functions: `min_rank`, `top_n`, `lag` (starts at 26:47)
    11. Convenience functions: `sample_n`, `sample_frac`, `glimpse` (starts at 32:44)
    12. Connecting to databases (starts at 34:21)

  • @satyak5456 2 years ago +1

    Great Video, it helped me to learn more on dplyr.. Subscribed and searching for more videos from Data School's list :D Thank you!

  • @shivibhatia1613 9 years ago +9

    This is simply fantastic. I have gone through so many forums (R help, Stats exchange, etc.) and in most cases people bluntly say go and brush up your skills, and don't believe in answering easy questions posted by newbies. However, in the video above you have taken the pains to explain step by step how dplyr works. Really helped.
    At 11:31 we are filtering where DepDelay > 60; can we do a similar comparison between 2 columns? I tried, but received an error because they were factors. Thanks,

    • @dataschool 9 years ago

      Shivi Bhatia Thanks for your kind comments! Glad it's helpful to you! As for your question, you can filter on multiple conditions by listing them separated by commas. For example: filter(flights, Month==1, DayofMonth==1)
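
The reply above can be sketched as follows, together with one possible way to handle the factor problem raised in the question. This is a hedged illustration on made-up data: the `ArrDelay` column and all values are assumptions, and converting factors with `as.numeric(as.character(...))` is a common base R idiom rather than something shown in the thread.

```r
library(dplyr)

# Hypothetical toy data; the delay columns are stored as factors on purpose
flights <- data.frame(
  Month      = c(1, 1, 2),
  DayofMonth = c(1, 2, 1),
  DepDelay   = factor(c("70", "5", "90")),
  ArrDelay   = factor(c("60", "15", "95"))
)

# Conditions separated by commas are combined with AND
jan_first <- filter(flights, Month == 1, DayofMonth == 1)

# Equivalent form using an explicit & operator
same <- filter(flights, Month == 1 & DayofMonth == 1)

# Factors do not support > or <, so convert the labels to numbers first.
# as.character() is needed: as.numeric() alone returns the internal level codes.
flights_num <- flights %>%
  mutate(DepDelay = as.numeric(as.character(DepDelay)),
         ArrDelay = as.numeric(as.character(ArrDelay)))

# Compare two columns row by row inside filter()
late_departures <- filter(flights_num, DepDelay > ArrDelay)

print(jan_first)
print(late_departures)
```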

  • @Sampanacheify 8 years ago

    I found following your steps in the video, but applying them to my own data, was incredibly helpful. Thanks for the great video and my code as a result is more efficient and aesthetically pleasing :-)

  • @sethjchandler 10 years ago

    This is an extremely helpful and well done video. Thank you very much for making it.

    • @dataschool 10 years ago

      You're welcome! And I'll have a follow-up dplyr video coming soon! :)

  • @longphung7665 2 years ago +1

    Great tutorial. Thank you so much.

  • @cristiancapetilloconstela9923 6 months ago +1

    Perfect explanation!

  • @manjunathroyal2133 4 years ago

    Really helpful video.

  • @fediahlioui 6 years ago

    Thanks a lot for this very clear and instructive tutorial

  • @JoseRojas2 3 years ago

    Thanks for making this video! it was very useful

  • @JoeEdwards8D 9 years ago +2

    Thank you for posting this, I saw the link for this in the Data Scientists Toolbox class on Coursera.

  • @kgsc4139 5 years ago

    Simple and concise. Thank you :)

  • @afifkhaja 5 years ago

    Very well explained and helpful. Thanks

  • @oggyoggyoggyy 4 years ago +1

    Thank you very much for your interpretation !! it helps a lot

  • @dorotamarkowska5542 3 years ago

    Thanks Kevin, it was great!

  • @rogerwilcoshirley2270 4 years ago

    Nicely done 👍

  • @Joshua35070 8 years ago

    Great Video!
    (and fantastic package)

  • @bikrammaharjan5475 9 years ago

    Great video, thanx a lot!

  • @Asmaab1313 4 years ago

    It's very helpful!! Thank youu

  • @leidyssilvera3624 5 years ago

    Hi... Thanks for this video. I really learned many things.
    Greetings from Venezuela

  • @tribibpal 5 years ago

    you are truly awesome . Thank you so much for this .

    • @dataschool 5 years ago

      Thanks for your kind words!

  • @Christopher_G. 7 years ago

    Awesome tutorial man, very helpful and easy to understand!

  • @HiepPham007 8 years ago

    Thank you! Very clear and helpful.

  • @vinaychuri 7 years ago

    Very good video, clear and clean

  • @user-xc2sp1fg7s 5 years ago

    Very helpful video for learning dplyr. And it's very kind that he (the lecturer) provides a link where we can download all the R scripts and explanations. Thanks!

    • @dataschool 5 years ago

      Thanks for your kind words!

  • @PacoAstola 6 years ago

    Thank you for your time and effort in making the video!
    Congratulations!

  • @coffeeedobrien 8 years ago

    Great video! Thanks Kevin!

    • @dataschool 8 years ago

      Thanks Ed! Glad it was helpful to you :)

  • @MotariOngeta 3 years ago

    simple, well organized flow

  • @statisticstime4734 3 years ago

    Excellent!

  • @SayantanSenBony 6 years ago

    Thanks a lot, great presentation.

  • @debashisbanerjee260 4 years ago

    Very helpful. Thank you

  • @brendenmorley2643 8 years ago

    Thank you... very detailed and informative

    • @dataschool 8 years ago

      Glad it was helpful to you!

  • @iCorlitotv 3 years ago

    Super helpful! thank you

  • @dnyw0802 5 years ago +2

    you r awesome ...great clear learning steps .. 🙏👍👍👍👍

  • @nept4ne 5 years ago +1

    A lot of thanks, greetings from a beginner in rstudio.

  • @ucfj 8 years ago

    Great intro. Thanks a lot!

    • @dataschool 8 years ago

      +Juliusz Gonera You're welcome!

  • @thrinadhn 8 years ago

    Thanks a lot... very useful to all

    • @dataschool 8 years ago

      +Thrinadh Nagubadi You're welcome!

  • @williammendieta5427 3 years ago

    Thanks! Great video!

  • @anuragsharma5208 8 years ago

    Superb, liked it very much... easy, neat and clean explanation... super like

    • @dataschool 8 years ago

      +Anurag Sharma I super appreciate it :)

  • @AK-ff8cc 7 years ago

    Very good! Thank you.

  • @RobvanMechelen 8 years ago

    Excellent! Thank you

  • @pythontools3234 8 years ago

    nicely done!

  • @hyakushiki23 5 years ago

    Another great video by Mark. One thing I would add is name your data frame dfFlights so you don't get it mixed up with the column Flights.

  • @kvafsu225 3 years ago

    It is really nice. Great thank you very much