Tidying Data in R with pivot_longer()

  • Published: Aug 21, 2024
  • Is one of your variables spread across multiple columns in a data frame? You've come to the right place!
    If this vid helps you, please help me a tiny bit by mashing that 'like' button. For more #rstats joy, crush that 'subscribe' button!
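
    A minimal sketch of the situation described above, assuming a made-up data frame (`temps`, with hypothetical column names such as `day_1_8am`) in which a single variable, temperature, is spread across four columns. It is not the dataset from the video.

    ```r
    library(tidyr)

    # Hypothetical data: one variable (temperature) spread across four columns.
    temps <- data.frame(
      subject    = 1:3,
      day_1_8am  = c(98.0, 97.0, 98.6),
      day_1_12am = c(98.0, 97.6, 98.8),
      day_2_8am  = c(98.0, 97.4, 98.0),
      day_2_12am = c(98.6, 98.0, 98.2)
    )

    # Stack the four measurement columns into a name column and a value column.
    temps_long <- pivot_longer(
      temps,
      cols      = starts_with("day_"),
      names_to  = "measurement",
      values_to = "temperature"
    )
    ```

    Each subject now occupies four rows, one per original measurement column.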

Comments • 29

  • @EquitableEquations
    @EquitableEquations  10 months ago

    You can find materials supporting this vid (and others) at github.com/equitable-equations/youtube.

  • @filipenunesvicente7872
    @filipenunesvicente7872 2 years ago +5

    Extremely helpful, it got me out of a pickle concerning a dataframe with multiple names on it! Thanks for the quality content.

  • @ignvzinho
    @ignvzinho 2 years ago

    Thanks for the simple and precise explanation.

  • @Kinglium
    @Kinglium 2 years ago +1

    thank you so much for your clear explanation!

  • @richardmusonda3404
    @richardmusonda3404 1 year ago

    Quality content and Quality Professor.

  • @danaetapia9286
    @danaetapia9286 1 year ago

    Thank you so much for taking the time to explain this. 😍😍

  • @cjspear
    @cjspear 1 year ago

    Fantastic video, thank you for your help.

  • @edwardvasquez4288
    @edwardvasquez4288 1 year ago +1

    thank you this was very straight-forward

  • @romanvasiura6705
    @romanvasiura6705 1 year ago

    Thank you!
    P.S. It's definitely hard to remember every feature, but at least I'll know where to find good tips and refresh my knowledge.
    You've done amazing work 😃

    • @EquitableEquations
      @EquitableEquations  1 year ago +1

      For sure! I'm constantly googling and checking help files for functions I don't use every day.

  • @j.knetsch3413
    @j.knetsch3413 4 months ago +1

    Thanks a lot! Good explanation!

  • @kevindave277
    @kevindave277 2 years ago +2

    Exceptional video. I would be very glad if you could provide the link for the dataset so I can work with it locally. Much thanks.

    • @EquitableEquations
      @EquitableEquations  2 years ago

      This is set #3 from Triola's Elementary Stats. You can download it from www.triolastats.com/es13-datasets

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago +1

    Or you could use names_pattern = "day_?(.*)_(.*)" to split your "DAY" column into day and time with this type of regex. I have not figured out how to get rid of the "am", but it should not be too hard; I just have to fiddle with the regex. By the way, I prefer snake_case, which you can get with janitor::clean_names(df). Nice presentation and great source of data.
    Thanks
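
    A hedged sketch of the suggestion above. The data frame and its snake_case column names (`day_1_8am` and so on) are made up for illustration; the real dataset's names may differ. The two capture groups in `names_pattern` feed the two names given in `names_to`.

    ```r
    library(tidyr)

    # Hypothetical data with snake_case column names (assumed, not from the video).
    temps <- data.frame(
      subject    = 1:2,
      day_1_8am  = c(98.0, 97.0),
      day_1_12am = c(98.0, 97.6),
      day_2_8am  = c(98.0, 97.4),
      day_2_12am = c(98.6, 98.0)
    )

    # First capture group becomes `day`, second becomes `time`.
    temps_split <- pivot_longer(
      temps,
      cols          = starts_with("day_"),
      names_to      = c("day", "time"),
      names_pattern = "day_?(.*)_(.*)",
      values_to     = "temperature"
    )
    ```

    With this pattern the `time` column still carries the "am" suffix (for example "8am").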

  • @MegaSesamStrasse
    @MegaSesamStrasse 3 years ago +2

    Thanks for the helpful introduction! What can I do if I face the following problem:
    - variables are spread across multiple columns
    and
    - observations are scattered across multiple rows

    • @EquitableEquations
      @EquitableEquations  3 years ago +1

      Hi! Pivot_longer is your basic tool for dealing with variables spread across multiple columns. The first tool I would consider if each observation used multiple rows would be pivot_wider.
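
      A small sketch of the second case, where one observation is scattered across several rows. The data frame `scattered` and its column names are hypothetical.

      ```r
      library(tidyr)

      # Hypothetical data: each subject's observation is split over two rows.
      scattered <- data.frame(
        subject = c(1, 1, 2, 2),
        measure = c("height", "weight", "height", "weight"),
        value   = c(170, 65, 182, 80)
      )

      # Give each measure its own column so every subject occupies one row.
      one_row_each <- pivot_wider(
        scattered,
        names_from  = measure,
        values_from = value
      )
      ```

      When both problems occur in the same data, the two functions can be chained, typically lengthening first and widening second.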

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago +1

    Following up on my last comment: names_pattern = "day_?(.*)_(\\d+)" works! We need the extra '\' so that \d+ works: a lone backslash is fine in ordinary regex, but inside an R string the backslash itself has to be escaped.
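
    The escaping point can be checked in base R without any packages. This is a sketch: the column names are hypothetical, and base `regexec()` stands in for the matching that `names_pattern` does.

    ```r
    # In R source, "\\d+" is the regex \d+ : the first backslash escapes the second.
    pattern <- "day_?(.*)_(\\d+)"

    # Raw strings (R >= 4.0.0) let you write the regex without doubling backslashes.
    raw_pattern <- r"[day_?(.*)_(\d+)]"

    # Hypothetical column names, as in the comment above.
    col_names <- c("day_1_8am", "day_2_12am")

    # Each match is c(full match, group 1, group 2).
    m <- regmatches(col_names, regexec(pattern, col_names, perl = TRUE))
    day  <- vapply(m, `[`, character(1), 2)
    time <- vapply(m, `[`, character(1), 3)
    ```

    Because `(\\d+)` captures digits only, the trailing "am" is left out of `time`.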

  • @dabinjeong9560
    @dabinjeong9560 1 year ago

    very useful video! thank you

  • @user-ro9ex5im2p
    @user-ro9ex5im2p 1 year ago

    This was very helpful

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago

    Andrew,
    Nice presentation. I could not find the FQA data no matter where I looked. Please provide a link to the data when you use external data. I would recommend using data from two sources in these exercises: first, built-in data (available in a package or in one of the common R data sets), and second, external data. The external data needs either to be referenced properly or to be turned into an RData file that can be downloaded from, for instance, GitHub.
    The blocks data does not have to be downloaded. It is already available in the "GLMsData" R package.
    Thanks and keep up the good work!
    P. S. I attended grad school in the Chicago area (that is a small university in Hyde Park).

    • @EquitableEquations
      @EquitableEquations  2 years ago

      Hi! The pre-loaded data sets are lovely but very well-explored elsewhere, especially the tidyr and dplyr sets, so I chose to avoid them here. You can find FQA data here: universalfqa.org/

    • @haraldurkarlsson1147
      @haraldurkarlsson1147 2 years ago

      @EquitableEquations Thanks.

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago

    P.S. If you are looking for "messy" data then the billboard data that comes with tidyr is perfect.

    • @EquitableEquations
      @EquitableEquations  2 years ago

      That's true! Anyone interested can see how to pivot this one with ?pivot_longer.
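
      For reference, a sketch along the lines of the example in the `pivot_longer()` help file. The `billboard` data ships with tidyr and stores weekly chart ranks in columns named `wk1`, `wk2`, and so on.

      ```r
      library(tidyr)

      # Stack the weekly rank columns; drop weeks where the song was not charted.
      billboard_long <- pivot_longer(
        billboard,
        cols           = starts_with("wk"),
        names_to       = "week",
        names_prefix   = "wk",
        values_to      = "rank",
        values_drop_na = TRUE
      )
      ```

      `names_prefix` strips the leading "wk", so `week` holds "1", "2", ... as character values.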

  • @ronvave2997
    @ronvave2997 6 months ago

    Thankful for this video. A question: in my data, I've used pivot_longer on columns 3:6, but I also need columns 7:8 of the same dataset pivoted into another variable/value pair of columns. How can I do this in the same code chunk?

  • @hugobarrera771
    @hugobarrera771 1 year ago

    Love u men!!!