Going deeper with dplyr: New features in 0.3 and 0.4 (tutorial)

  • Published: Aug 22, 2024

Comments • 93

  • @JohnMKaya-lm1ry
    @JohnMKaya-lm1ry 4 years ago +2

    Thank you very much for the great tutorial!
    By the way, the slice() function has new features. This is from the dplyr version history:
    slice() gains a new set of helpers:
    slice_head() and slice_tail() select the first and last rows, like
    head() and tail(), but return n rows per group.
    slice_sample() randomly selects rows, taking over from sample_frac()
    and sample_n().
    slice_min() and slice_max() select the rows with the minimum or
    maximum values of a variable, taking over from the confusing top_n().
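
The helpers quoted above can be tried on a built-in dataset. This is a minimal sketch, assuming dplyr 1.0.0 or later is installed; the object names (`cars`, `first_two`, `best_mpg`, `one_random`) are made up for illustration.

```r
library(dplyr)

# Toy data: three groups of cars from the built-in mtcars dataset
cars <- mtcars %>%
  select(cyl, mpg, hp) %>%
  group_by(cyl)

# First two rows of each group (grouped counterpart of head())
first_two <- cars %>% slice_head(n = 2)

# Row(s) with the highest mpg in each group; ties are kept by default
best_mpg <- cars %>% slice_max(mpg, n = 1)

# One randomly chosen row per group (takes over from sample_n())
set.seed(1)
one_random <- cars %>% slice_sample(n = 1)
```

Because the data is grouped, every helper works per group, which is the main difference from plain head() and tail().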

  • @BmyofMe
    @BmyofMe 4 years ago

    Best data manipulation tutorial I've ever found on YouTube. Keep up the good work, you are a great tutor.

  • @RootedTango1212
    @RootedTango1212 8 years ago +6

    Thank you very much! Can't emphasise how much these two tutorials have helped me :)

    • @dataschool
      @dataschool 8 years ago

      +hima sagar You're very welcome! So glad to hear! :)

  • @anthonystaines
    @anthonystaines 9 years ago

    Really useful pair of tutorials - I'm an experienced R user trying to get my head around the dplyr way of thinking, and these were helpful.
    Thanks again,

    • @dataschool
      @dataschool 9 years ago

      Anthony Staines Awesome! Thanks for the kind compliment!

  • @MarcTelesha
    @MarcTelesha 9 years ago +3

    Great video on more advanced dplyr in R. This makes manipulating data so smooth, readable and fast. HIGHLY recommended.

    • @dataschool
      @dataschool 9 years ago

      Marc Telesha Thanks Marc, glad the video was helpful to you! I agree that readability is so important, which is one reason I'm a big fan of dplyr.

    • @MarcTelesha
      @MarcTelesha 9 years ago

      Data School Can I give a slight criticism???
      Please put a SPACE after every %>% and avoid lines over 80 characters with three or four %>% operators. It is driving me CRAZY! I know alt+enter is a nice shortcut, but ctl-shift-P also works.
      Smiles

    • @dataschool
      @dataschool 9 years ago +1

      Marc Telesha Ha! I used nicer formatting in my previous dplyr tutorial, but this time, I decided to write my tutorial code the way I write my real code :) However, I'll consider changing back for my next tutorial!

    • @MarcTelesha
      @MarcTelesha 9 years ago

      Data School Thank you if you do :) Even if you don't, it is a really good run-through of dplyr. It would be interesting to see whether Jupyter (AKA IPython) Notebook might work better for your tutorials.

    • @dataschool
      @dataschool 9 years ago

      Marc Telesha Definitely worth considering! I teach some of my data science classes using IPython notebook, but up to now have not used them for R code. Thanks for the idea!

  • @cherub6958
    @cherub6958 7 years ago

    Intense and to the point, but very informative. I have learned a lot from the pair of dplyr videos, and I now feel very comfortable with dplyr.

  • @mosesotieno1629
    @mosesotieno1629 4 years ago +1

    You go step by step for newbies like me! I love your teaching approach. You are a really great tutor!

    • @dataschool
      @dataschool 4 years ago

      Thanks!

  • @ishansgyan8665
    @ishansgyan8665 7 years ago +1

    Really great pair of dplyr videos. They solved 90% of my task of exploring how to perform data manipulation. It would be really helpful if you made similar videos on functions and loops in R with some good use cases.

    • @dataschool
      @dataschool 7 years ago

      Glad you liked the dplyr videos, and thanks for your suggestion!

  • @gksujay3465
    @gksujay3465 7 years ago

    Very clear and concise explanation in both parts. Thank you so much for making the learning awesome! Please continue to make more such videos on new concepts in R.

    • @dataschool
      @dataschool 6 years ago

      Glad it was helpful to you! Currently, I'm only making videos on Python, I'm sorry!

  • @deepak39754
    @deepak39754 7 years ago

    dplyr is easy to understand with your series of videos. Thanks for the lucid explanations.

  • @asneogy
    @asneogy 8 years ago

    Very nice pair of tutorials, love this package! Quick thing about matching of the columns 'color'= 'col' - you need to keep them in the order of the tables a and b in the join statement. Meaning, if you did 'col'= 'color' it would give an error, since a does not have 'col' and b does not have 'color'.
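
The point about argument order can be checked with two small tables. This is a sketch under stated assumptions: the tables `a` and `b` and their contents are hypothetical stand-ins for the ones in the video, and only the key names `color` and `col` come from the comment.

```r
library(dplyr)

# Hypothetical tables: the key is called "color" in a and "col" in b
a <- data.frame(color = c("green", "yellow", "red"), num = 1:3)
b <- data.frame(col = c("green", "yellow", "pink"), size = c("S", "M", "L"))

# Names on the left of "=" come from the first table, values from the second
joined <- inner_join(a, b, by = c("color" = "col"))

# Reversing the pair fails, because a has no "col" and b has no "color"
reversed <- tryCatch(
  inner_join(a, b, by = c("col" = "color")),
  error = function(e) "error"
)
```

The joined result keeps the key under the first table's name, `color`.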

  • @endalealtaye3147
    @endalealtaye3147 9 years ago

    Very precise, clear, and quite helpful. Thanks.

    • @dataschool
      @dataschool 9 years ago

      Endale Altaye You're very welcome! Thanks for your kind words.

  • @tribibpal
    @tribibpal 5 years ago

    Awesome, man. Now you own me officially.

    • @dataschool
      @dataschool 5 years ago

      Ha! Maybe you would like to support Data School on Patreon: www.patreon.com/dataschool

  • @dheerajkura5914
    @dheerajkura5914 7 years ago

    Excellent! Very good explanation.
    Really, really helpful for understanding RStudio and munging data.

    • @dataschool
      @dataschool 7 years ago

      Thanks for the kind comment!

  • @rogerwilcoshirley2270
    @rogerwilcoshirley2270 4 years ago

    Nice job, very helpful!

  • @BurningR
    @BurningR 9 years ago

    Great tutorial, thank you! Coming from Stata, this makes R data manipulation seem much less confusing.

    • @dataschool
      @dataschool 9 years ago

      Emil Begtrup-Bright Awesome, glad it was helpful to you! Welcome to the world of R :)

  • @KreshnikMorina
    @KreshnikMorina 4 years ago

    Thank you very much, wonderful explanation!

  • @luisvalesilva8931
    @luisvalesilva8931 9 years ago

    Great video! Lots of useful tricks.

  • @RockMonkeyLV
    @RockMonkeyLV 8 years ago +1

    You refer to objects created by data_frame() as local data frames and ones created by data.frame() as not local. What do you mean by local vs. not local?

    • @dataschool
      @dataschool 8 years ago +1

      +Glenn Z Great question! I just answered this on Stack Overflow: stackoverflow.com/a/35605110/1636598

  • @MR-eg6np
    @MR-eg6np 4 years ago

    This was a great video, thank you! Any chance you will make other R videos again?

    • @dataschool
      @dataschool 3 years ago

      Not any time soon, sorry!

  • @stewartli5395
    @stewartli5395 6 years ago

    Great tips. Thank you very much.

  • @picasso1334
    @picasso1334 7 years ago

    Great video. So helpful, thank you

  • @jasontarimo3997
    @jasontarimo3997 5 years ago

    Amazing videos. Are you going to do any more videos on R? It is such an amazing tool for getting stuff done (wrangling) very fast. I have a request: how could you do a map() with a data frame in R, like giving different names to values in a column? Is there any function in dplyr for this?

    • @dataschool
      @dataschool 5 years ago

      Glad you like the video! No, I'm not planning to do any more videos with R - I'm sorry! I work in Python now, and I haven't used R in years. I like both languages, but I prefer to get as good as possible in one language.
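
One possible reading of the request above is relabelling the values of a column. A minimal sketch of that reading, using mutate() with a named lookup vector; the lookup `cyl_labels` and the new column name `cyl_label` are invented for illustration.

```r
library(dplyr)

# Hypothetical lookup: map each cylinder count to a label
cyl_labels <- c("4" = "four", "6" = "six", "8" = "eight")

# Index the lookup by the column's values (as strings) inside mutate()
relabelled <- mtcars %>%
  mutate(cyl_label = unname(cyl_labels[as.character(cyl)]))
```

A lookup vector keeps the mapping in one place, so adding a new label does not require touching the pipeline itself.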

  • @rupeshmohanasundaram6718
    @rupeshmohanasundaram6718 5 years ago +1

    Hi, I like this video, which is also very useful to me. I need your help: if we pass column names of a data frame as arguments to a function, how do we use those variables in dplyr verbs like select, arrange, distinct, and summarise? Kindly reply as soon as possible.

    • @dataschool
      @dataschool 5 years ago

      Sorry, I don't quite understand your question. Good luck!
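
If the question is about writing a function that takes column names and forwards them to dplyr verbs, one common approach is sketched below. It assumes dplyr 1.0.0 or later (with an rlang version that supports `{{ }}`); the helper names `mean_by` and `mean_by_name` are hypothetical.

```r
library(dplyr)

# Hypothetical helper: column names passed unquoted, forwarded with {{ }}
mean_by <- function(df, group_col, value_col) {
  df %>%
    group_by({{ group_col }}) %>%
    summarise(avg = mean({{ value_col }}), .groups = "drop") %>%
    arrange(desc(avg))
}

# Hypothetical helper: column names passed as strings, looked up via .data
mean_by_name <- function(df, group_name, value_name) {
  df %>%
    group_by(.data[[group_name]]) %>%
    summarise(avg = mean(.data[[value_name]]), .groups = "drop")
}

res1 <- mean_by(mtcars, cyl, mpg)
res2 <- mean_by_name(mtcars, "cyl", "mpg")
```

The first form suits interactive use, while the string form is handy when the column names arrive from a config file or a loop.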

  • @j7andrew
    @j7andrew 5 years ago

    What if you're looking to find a combination of sequences that occurred? For example, I want to know how many times X then Y occurred

    • @dataschool
      @dataschool 5 years ago

      Sorry, I'm not sure I fully understand your question!
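
If the question means "how often is an X row immediately followed by a Y row", lead() gives one way to count it. This is only a sketch of that interpretation; the `events` table and its values are made up.

```r
library(dplyr)

# Hypothetical event log, already in time order
events <- data.frame(event = c("X", "Y", "Z", "X", "Y", "X", "X", "Y"))

# lead() looks one row ahead, so each row can be compared with its successor
n_xy <- events %>%
  mutate(next_event = lead(event)) %>%
  filter(event == "X", next_event == "Y") %>%
  nrow()
```

The last row has no successor, so its `next_event` is NA and filter() drops it.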

  • @asbcllc
    @asbcllc 9 years ago

    Spread the power of dplyr and R

    • @dataschool
      @dataschool 9 years ago

      Alex Bresler Ha, dplyr is indeed awesome!

  • @hplcdadong
    @hplcdadong 7 years ago

    Great tutorials. Thanks a lot.

  • @asdfghjkl12904
    @asdfghjkl12904 5 years ago

    Thank you for the nice tutorial! :)

  • @shivibhatia1613
    @shivibhatia1613 9 years ago

    Perfect video for data manipulation in R. The second video is also a great continuation of the first one. Please upload more tutorials on R.
    Just one question, though. When I run the same code:
    june %>% group_by(type, city) %>% top_n(3, frieght). Here, june is the name of my Excel file, and I am grouping by type and city to filter the top 3 freight values. This gives the correct output, but since I have 28 columns, the R console shows only 6 of them and lists the rest as variables not shown.
    Is it possible to show all (or as many as possible) columns in the console, so that I could take the result to the business user? Or is there an alternative you could suggest?
    Thanks again for the videos. ROCKS!!!!

    • @dataschool
      @dataschool 9 years ago

      Shivi Bhatia Glad you are enjoying the videos! To answer your question, you can indeed show more columns. Check out the "Viewing more output" section of this document: rpubs.com/justmarkham/dplyr-tutorial-part-2
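
For readers who want the gist without following the link, here is a minimal sketch of two ways to see every column. It uses mtcars as a stand-in for the 28-column data, and the names `wide` and `all_cols` are invented.

```r
library(dplyr)

# Stand-in for the grouped top_n() result from the question
wide <- mtcars %>%
  group_by(cyl) %>%
  top_n(3, mpg)

# width = Inf asks the print method to show every column
print(wide, width = Inf)

# Converting to a plain data frame also prints all columns
all_cols <- as.data.frame(wide)
```

For handing results to a business user, writing `all_cols` to a file with write.csv() is probably more practical than copying console output.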

  • @toddc1021
    @toddc1021 7 years ago

    Hi, thank you very much for the tutorial. This is very helpful. I have one question, though. With dplyr 0.5.0, I could not get the same result as yours at 10:40 of the video. The arrange function sorts all the rows from the highest dep_delay to the lowest, which messes up the order created by group_by(day, month). Is there an alternative? Thank you.

    • @dataschool
      @dataschool 7 years ago

      I'm sure there's an alternative, though I haven't used the latest version of dplyr and so I can't say for sure. Let me know if you figure it out!

    • @roshanmr2011
      @roshanmr2011 7 years ago +2

      Todd Cho use all 3 variables in arrange(month, day, desc(dep_delay))

    • @toddc1021
      @toddc1021 7 years ago

      Thank you Roshan that's the answer!
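
The accepted answer, plus a second option, can be sketched as follows. The table `flights_small` is a hypothetical stand-in for the flights data in the video, and the `.by_group` argument is assumed to be available (as far as I know it arrived in dplyr 0.7.0, so it would not work on 0.5.0).

```r
library(dplyr)

# Hypothetical stand-in for the flights data used in the video
flights_small <- data.frame(
  month = c(1, 1, 1, 2, 2),
  day = c(1, 1, 2, 1, 1),
  dep_delay = c(5, 30, 10, 0, 45)
)

# Option 1: list the grouping columns explicitly, as suggested above
explicit <- flights_small %>%
  arrange(month, day, desc(dep_delay))

# Option 2: keep the grouping and ask arrange() to respect it
by_group <- flights_small %>%
  group_by(month, day) %>%
  arrange(desc(dep_delay), .by_group = TRUE)
```

Both options give the same row order; the second avoids repeating the grouping columns.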

  • @IndianYJ
    @IndianYJ 9 years ago

    I have tweeted it! Awesome!!!

    • @dataschool
      @dataschool 9 years ago

      Amit Ugle Thanks for sharing! :)

    • @IndianYJ
      @IndianYJ 9 years ago

      Can you please create a tutorial on Shiny apps? You are an awesome teacher!

    • @dataschool
      @dataschool 9 years ago

      Amit Ugle I will definitely consider it! Shiny does have some great written tutorials: shiny.rstudio.com/tutorial/

  • @pranavsatbhai4489
    @pranavsatbhai4489 6 years ago

    my_db

    • @dataschool
      @dataschool 6 years ago

      Sorry, I'm not sure why you are getting this error!

  • @chengPin
    @chengPin 6 years ago

    Great!

  • @tonkouts
    @tonkouts 9 years ago

    Great video. Thanks for sharing Kevin!
    At the moment I'm interested in replacing "for" loops when possible, using dplyr package and the "do" command.
    I have the following script :
    ## split initial dataset based on a grouping variable/column
    ## and save each (new) dataset as a different .csv file
    data.frame(mtcars) %>%
    group_by(cyl) %>%
    do(d=data.frame(.)) %>%
    do(write.csv(.$d, paste0("data_cyl_",.$cyl,".csv")))
    Seems to work, as I can see the .csv files created in my workspace, but it also returns the following error:
    Error: Results are not data frames at positions: 1, 2, 3
    Any ideas or thoughts?

    • @dataschool
      @dataschool 9 years ago

      tonkouts I'm sorry, I wish I could help but I'm not that familiar with the do() function!

    • @tonkouts
      @tonkouts 9 years ago

      Data School No problem. The do() function is very helpful, especially when it's combined with the "broom" package to create tidy models (model outputs).

    • @dataschool
      @dataschool 9 years ago

      tonkouts Neat! I'll have to check out "broom", thanks!

    • @tonkouts
      @tonkouts 9 years ago +1

      Data School Really keeps (and influences) you thinking from a data frame point of view all the time... cran.r-project.org/web/packages/broom/vignettes/broom_and_dplyr.html

    • @dataschool
      @dataschool 9 years ago

      tonkouts Wow, great vignette!
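
On the error in the original question of this thread: an unnamed do() call expects each group's result to be a data frame, while write.csv() returns NULL, which would explain why the files appear and the error follows. One way around it is sketched below, assuming a dplyr version that provides group_walk() (0.8.0 or later, as far as I know); the `out_dir` location is a hypothetical choice.

```r
library(dplyr)

# Hypothetical output folder; tempdir() keeps the example self-contained
out_dir <- tempdir()

mtcars %>%
  group_by(cyl) %>%
  # .x is the data for one group (without the cyl column), .y holds its key
  group_walk(function(.x, .y) {
    path <- file.path(out_dir, paste0("data_cyl_", .y$cyl, ".csv"))
    write.csv(.x, path, row.names = FALSE)
  })
```

group_walk() is meant for side effects such as writing files, so nothing needs to be returned from the function. Unlike the do() version in the question, each file here omits the `cyl` column, since it is already encoded in the file name.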

  • @harishmehra5956
    @harishmehra5956 6 years ago

    Awesome

  • @13statistician13
    @13statistician13 5 years ago

    You speak waaaay too slow for my liking. I'd recommend speeding up the speech in your next videos. Other than that, the content is good! Thanks.

    • @dataschool
      @dataschool 5 years ago +1

      Glad you like the content!