What is data wrangling? Intro, Motivation, Outline, Setup -- Pt. 1 Data Wrangling Introduction

  • Published: Dec 5, 2024
  • Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. These videos introduce you to these tools. Keep your R code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
    Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup • What is data wrangling...
    01:44 Intro and what’s covered
    Ground Rules
    02:40 What’s a tibble
    04:50 Use View
    05:25 The Pipe operator
    07:20 What do I mean by data wrangling?
    Pt. 2: Tidy Data and tidyr • Tidy Data and tidyr --...
    00:48 Goal 1 Making your data suitable for R
    01:40 `tidyr` “Tidy” Data introduced and motivated
    08:15 `tidyr::gather`
    12:38 `tidyr::spread`
    15:30 `tidyr::unite`
    15:30 `tidyr::separate`
    Pt. 3: Data manipulation tools: `dplyr` • Data Manipulation Tool...
    00:40 Setup
    02:00 `dplyr::select`
    03:40 `dplyr::filter`
    05:05 `dplyr::mutate`
    07:05 `dplyr::summarise`
    08:30 `dplyr::arrange`
    09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
    11:45 `dplyr::group_by`
    15:00 `dplyr::group_by`
    Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins • Working with Two Datas...
    Combining two datasets together
    00:42 `dplyr::bind_cols`
    01:27 `dplyr::bind_rows`
    01:42 Set operations
    `dplyr::union`, `dplyr::intersect`, `dplyr::setdiff`
    02:15 Joining data
    `dplyr::left_join`, `dplyr::inner_join`, `dplyr::right_join`, `dplyr::full_join`
    ______________________________________________________________
    Cheatsheets: www.rstudio.co...
    Documentation:
    `tidyr` docs: tidyr.tidyverse.org/reference/
    `tidyr` vignette: cran.r-project...
    `dplyr` docs: dplyr.tidyverse...
    `dplyr` one-table vignette: cran.r-project...
    `dplyr` two-table (join operations) vignette: cran.r-project...
    ______________________________________________________________
    New York Times, “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights”, by Steve Lohr, Aug. 17, 2014 www.nytimes.co...
    ______________________________________________________________

Comments • 14

  • @davideterribile8906 5 years ago +1

    Very clear communication, one of the best around

  • @lyssasamuel1575 4 years ago +2

    Great video! I would really love to have the cheat sheet, however the one provided in this video is not posted on the link provided in this video. If anyone knows where I can find it, I would greatly appreciate it!

  • @joshuachan7761 4 years ago

    Thank you! Extremely helpful series

  • @anthonychariton9952 4 years ago

    A really great series of videos. Thank you for making these, it's truly appreciated.

  • @eliebordron5599 4 years ago

    Really super helpful. I already learned a lot.

  • @albowen6 2 years ago

    Great video, please update!

  • @LifeWithBrad 6 years ago

    this is great!

  • @musicspinner 3 years ago

    this is good

  • @riverland0072 6 years ago

    Thanks

  • @jideidowu4267 2 years ago

    What are the people who read this kind of data called?

  • @Virchov 5 months ago

    Los Aguacates!

  • @백영래-u3x 5 years ago

    Raw data is indispensable for data science. It's not mundane to gather it.

  • @devotionconceptual9389 5 years ago

    @..