What is data wrangling? Intro, Motivation, Outline, Setup -- Pt. 1 Data Wrangling Introduction
- Published: Dec 5, 2024
- Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. These videos introduce you to these tools. Keep your R code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup • What is data wrangling...
01:44 Intro and what’s covered
Ground Rules
02:40 What’s a tibble
04:50 Use View
05:25 The pipe operator
07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr • Tidy Data and tidyr --...
00:48 Goal 1: Making your data suitable for R
01:40 `tidyr`: “Tidy” data introduced and motivated
08:15 `tidyr::gather`
12:38 `tidyr::spread`
15:30 `tidyr::unite`
15:30 `tidyr::separate`
Pt. 3: Data manipulation tools: `dplyr` • Data Manipulation Tool...
00:40 Setup
02:00 `dplyr::select`
03:40 `dplyr::filter`
05:05 `dplyr::mutate`
07:05 `dplyr::summarise`
08:30 `dplyr::arrange`
09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
11:45 `dplyr::group_by`
15:00 `dplyr::group_by`
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins • Working with Two Datas...
Combining two datasets together
00:42 `dplyr::bind_cols`
01:27 `dplyr::bind_rows`
01:42 Set operations
`dplyr::union`, `dplyr::intersect`, `dplyr::setdiff`
02:15 Joining data
`dplyr::left_join`, `dplyr::inner_join`, `dplyr::right_join`, `dplyr::full_join`
______________________________________________________________
Cheatsheets: www.rstudio.co...
Documentation:
`tidyr` docs: tidyr.tidyverse.org/reference/
`tidyr` vignette: cran.r-project...
`dplyr` docs: dplyr.tidyverse...
`dplyr` one-table vignette: cran.r-project...
`dplyr` two-table (join operations) vignette: cran.r-project...
______________________________________________________________
New York Times, “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights,” by Steve Lohr, Aug. 17, 2014: www.nytimes.co...
______________________________________________________________
Very clear communication, one of the best around
Great video! I would really love to have the cheat sheet, but the one shown in this video is not posted at the link provided. If anyone knows where I can find it, I would greatly appreciate it!
Thank you! Extremely helpful series
A really great series of videos. Thank you for making these, it's truly appreciated.
Really super helpful. I already learned a lot.
Great video, please update!
this is great!
this is good
Thanks
What are the people who read this kind of data called?
Los Aguacates!
Raw data is indispensable for data science. It's not mundane to gather it.