Tidy Data and tidyr -- Pt 2 Intro to Data Wrangling with R and the Tidyverse
- Published: Dec 5, 2024
- Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, help make data manipulation tasks easier. Keep your code clean and clear and reduce the cognitive load required for common but often complex data science tasks.
tidyr.tidyverse...
----------------
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup • What is data wrangling...
01:44 Intro and what’s covered
Ground Rules
02:40 What’s a tibble
04:50 Use View
05:25 The Pipe operator
07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr • Tidy Data and tidyr --...
00:48 Goal 1 Making your data suitable for R
01:40 `tidyr` “Tidy” Data introduced and motivated
08:10 `tidyr::gather`
12:30 `tidyr::spread`
15:23 `tidyr::unite`
15:23 `tidyr::separate`
Pt. 3: Data manipulation tools: `dplyr` • Data Manipulation Tool...
00:40 Setup
02:00 `dplyr::select`
03:40 `dplyr::filter`
05:05 `dplyr::mutate`
07:05 `dplyr::summarise`
08:30 `dplyr::arrange`
09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
11:45 `dplyr::group_by`
15:00 `dplyr::group_by`
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins • Working with Two Datas...
Combining two datasets together
00:42 `dplyr::bind_cols`
01:27 `dplyr::bind_rows`
01:42 Set operations
`dplyr::union`, `dplyr::intersect`, `dplyr::setdiff`
02:15 Joining data
`dplyr::left_join`, `dplyr::inner_join`, `dplyr::right_join`, `dplyr::full_join`
______________________________________________________________
Cheatsheets: www.rstudio.co...
Documentation:
`tidyr` docs: tidyr.tidyverse.org/reference/
`tidyr` vignette: cran.r-project...
`dplyr` docs: dplyr.tidyverse...
`dplyr` one-table vignette: cran.r-project...
`dplyr` two-table (join operations) vignette: cran.r-project...
______________________________________________________________
Looking forward to an update on this for the new pivot_longer() and pivot_wider() grammar!
Great video! nicely explained and well delivered with graphics!
I learned just so much by watching this. I regret I wasn't able to download the datasets, I don't know if it's me or the venerable age of the video
Can't download EDAWR from github.
Error: Failed to install 'EDAWR' from GitHub:
(converted from warning) cannot remove prior installation of package ‘backports’
Me too.
Great Video - Well explained and Easy to understand
Nicely presented - short and succinct
@Garret: Please advise how to import the datasets. I have installed the "devtools" package, but I am unable to install the "EDAWR" package. Looking forward to your help. Thank you.
package ‘EDAWR’ is not available (for R version 3.4.1)
@williambiggs2308 Using anaconda, how does one create an environment with an older version of base-r (3.5.1)? Is base-r 3.4.1 needed to access the EDAWR package?
Hi, you can create the theme yourself.
country
@FancyTreer032 Thank you very much!
Hi, is it possible to use the separate() function to split more than one column, using the pipe operator or any other method? Thanks.
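One possible approach, as a minimal sketch: each `separate()` call splits a single column, so chaining two calls with the pipe splits two columns. The tibble `df` and its columns (`date`, `name`) are made up for illustration and do not come from the video.

```r
library(tidyr)
library(dplyr)  # loaded here for the %>% pipe and tibble()

# Hypothetical data: two columns that each pack several values together
df <- tibble(
  id   = 1:2,
  date = c("2011-03-15", "2012-07-04"),
  name = c("Ada Lovelace", "Alan Turing")
)

# Each separate() call handles one column; the pipe chains the calls
result <- df %>%
  separate(date, into = c("year", "month", "day"), sep = "-") %>%
  separate(name, into = c("first", "last"), sep = " ")

print(result)
```

The new columns come back as character vectors; passing `convert = TRUE` to `separate()` asks it to guess better types. Recent tidyr releases mark `separate()` as superseded in favour of `separate_wider_delim()` and related functions, but the chained form above still works.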
thank you very much for this helpful video
So are gather and spread replaced by pivot_longer and pivot_wider?
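For reference: since tidyr 1.0.0, `pivot_longer()` and `pivot_wider()` are the recommended functions for new code, while `gather()` and `spread()` remain available but are no longer developed. A small sketch of how the two grammars correspond, using a made-up table (`cases_demo` and its values are illustrative assumptions, not the video's dataset):

```r
library(tidyr)
library(dplyr)  # loaded here for the %>% pipe and tibble()

# Hypothetical wide table: one column per year
cases_demo <- tibble(
  country = c("FR", "DE", "US"),
  `2011`  = c(7000, 5800, 15000),
  `2012`  = c(6900, 6000, 14000),
  `2013`  = c(7000, 6200, 13000)
)

# Older grammar: gather() collapses the year columns into key/value pairs
long_old <- cases_demo %>%
  gather(key = "year", value = "n", `2011`:`2013`)

# Newer grammar: pivot_longer() produces the same rows
# (the row order differs: gather() stacks column by column,
#  pivot_longer() goes row by row)
long_new <- cases_demo %>%
  pivot_longer(cols = `2011`:`2013`, names_to = "year", values_to = "n")

# Going back to one column per year, in both grammars
wide_old <- long_new %>% spread(key = year, value = n)
wide_new <- long_new %>% pivot_wider(names_from = year, values_from = n)
```

Both long results hold the same nine country-year rows, so existing `gather()`/`spread()` code can usually be translated by renaming `key`/`value` to `names_to`/`values_to` (or `names_from`/`values_from`).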
Excellent presentation
12:23 you gave life to me!
Very easy to understand!
Do we really want to pivot_wider pollution?
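library(tidyverse)  # provides %>% and ggplot2
library(EDAWR)      # assumed source of the pollution dataset used in the video

pollution %>%
  ggplot(aes(city, amount, group = size)) +
  geom_bar(aes(fill = size), stat = "identity", position = "dodge")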
Out of date! Please post update with *pivot_* functions, scoped variables and something on non-standard evaluation pleeeeze....
I would prefer more coding examples. 8 minutes in before tidyr package is even introduced. Lets goooooooo
very helpful
Thanks
Nice presentation, but the audio is pretty poor. The concept of an observation is key.
He looks like Marty Mcfly Senior!!!
Looking at your first use of gather, it seems that you have not properly assessed what an observation is, have you? I would think an observation here would best be defined as a country. Then the columns should be country name, count for 2011, count for 2012, and count for 2013, shouldn't they? The way you have it (Country, Year, N), what are the observational units? A country-year? Why not make it a country, as I have suggested?