Pat, I think I have pointed this out before, but the fpp3 package will do most of the things that you are doing with much simpler code. fpp3 was, after all, designed for time series. I still love your code gymnastics and have watched some of your videos multiple times - each time I learn something new.
Thanks!
Hello from Scotland. Many thanks for this excellent video!
Your video is a gem!
Thanks!🤓
Wonderful video as usual. Thumbs up.
Thanks Timmy!
I like to save versions of my data after/as I clean it as .RDS files so I can see what I did/reproduce easily later.
People asking about organization: Usually I group my projects with /background_info /in_data /out_data /code as separate directories. I don't think there's anything special about those directories except that the layout is organized enough and general enough to be consistent, so I can easily program and reuse paths across projects. I originally organized things this way when reading about reproducible research and data sharing in neuroscience and psychology, so you might want to see if a group has suggested something for your field that you can work within (if you want to share data). If it's a big project I also have a README with dependencies/version info and an RProj with a source.R that auto-opens and runs everything.
Thanks Dr. Schloss! Learning a lot here :)
Awesome! My only caution against Rds files is that they limit you to R and they aren’t text files. I prefer to work with csv/tsv files as much as possible
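For anyone weighing the two formats discussed above, here is a minimal base R sketch of the trade-off. The data frame, its column names, and the use of tempdir() as a stand-in for an out_data directory are all made up for illustration.

```r
# Toy data; the column names are illustrative only
temps <- data.frame(
  year = c(1950L, 1951L, 1952L),
  t_avg = c(9.8, 10.1, NA)
)

out_dir <- tempdir()  # stand-in for a project's out_data directory
rds_path <- file.path(out_dir, "temps.rds")
csv_path <- file.path(out_dir, "temps.csv")

# RDS: binary and R-only, but the object comes back exactly as it was saved
saveRDS(temps, rds_path)
from_rds <- readRDS(rds_path)

# CSV: plain text that any tool can open; column types are re-inferred on read
write.csv(temps, csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path)

# Compare what came back with what was written
rds_matches <- identical(temps, from_rds)
csv_matches <- isTRUE(all.equal(temps, from_csv))
```

With simple numeric columns like these, both round trips should agree with the original. Richer types such as factors or dates are where a CSV read usually needs extra arguments to restore them.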
Beautiful analysis work.
Thanks! I’m glad people are enjoying it
Great stuff. You can sometimes make your life easier by using the %in% operator, e.g., normalized_range = year %in% 1951:1980 also gives you a TRUE/FALSE indicator with more concise code. The nice thing about the %in% operator is that it works on many data types (bools, integers, reals, chars) in both lists and vectors.
Thanks! It’s all a matter of what I remember when I’m under the spotlight of recording 😂
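A small sketch of the %in% suggestion, using a made-up vector of years. It compares the operator against an explicit pair of comparisons, which is only one guess at what the longer version might look like.

```r
# Made-up years straddling the 1951:1980 range
year <- c(1949, 1951, 1965, 1980, 1981)

# Explicit comparisons
by_comparison <- year >= 1951 & year <= 1980

# The same TRUE/FALSE indicator with %in%
by_in <- year %in% 1951:1980

same_result <- identical(by_comparison, by_in)

# %in% also works on character vectors
is_month <- c("Jan", "Foo") %in% month.abb
```

One caveat: %in% tests membership in the listed values, so the two approaches only agree for whole-number years. A value like 1951.5 would pass the comparisons but not %in%.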
Pat, you have tutorials for all levels, which is fabulous. You are so prolific that, unfortunately, I can't keep up with all you produce. You are amazing. Somehow I missed "riffomonas." Where does that word, "Riffomonas", come from?
Hah! It comes from the idea of riffing in music, but riffing on other people's code. My hope is that people can see how I riff on my own code and do the same for their own purposes. The “omonas” is a common ending for bacteria
Excellent job!
Thanks!
I did not see the same trend you see in your data - namely a gradual increase. The curve for my local station (near the southern tip of Lake Michigan) is essentially flat. What we may be looking at is the moderating effect of the lake. I do see the cold October of 1925 (the relative deviation is -7.3), but I am missing measurements for 1917. Interesting stuff - also a good lesson in how to deal with NAs.
Please bring more stuff like this with broad appeal and data that is easily and freely obtained.
Thanks.
Cool results and insights! 🤓
Pat,
So when you replace the empty spaces with zero was that a form of imputation? Basically replacing missing values?
Thank you for another great video. Quick Q: What does the ‘group’ argument do in the ggplot aesthetics, given that you also have color set to year? Thank you
The group aesthetic here links all the data from the same year together. You could use color=year, but then every year would be a different color. Instead I used color=is_this_year to get the two-colored figure.
@@Riffomonas thank you
I must have missed the last few minutes when I posted the earlier question. I meant that before you added the is_this_year column, you already had group = year and color = year. So, to rephrase my question: at that point of the tutorial, is the group parameter doing anything in addition to the color parameter, given that both are set to year?
Right - in this case they do the same thing. I tend to use group for line plots even if it’s redundant with color just to be safe
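A rough sketch of the final plot described above, using synthetic data rather than the video's. The column names month and t_diff and the choice of 2022 as "this year" are assumptions made for illustration; is_this_year is the column mentioned in the thread.

```r
library(ggplot2)

# Synthetic monthly values for three years
set.seed(1)
temps <- expand.grid(month = 1:12, year = 2020:2022)
temps$t_diff <- rnorm(nrow(temps))
temps$is_this_year <- temps$year == 2022

# group = year draws one line per year;
# color = is_this_year then needs only two colors
p <- ggplot(temps, aes(x = month, y = t_diff,
                       group = year, color = is_this_year)) +
  geom_line()
```

One reason to keep the explicit group "just to be safe": by default ggplot2 forms groups only from discrete variables, so if year is stored as a number, a color mapping alone may not split the data into one line per year.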
Totally awesome… man … you really explain things very well… you hit the spot … thanks so much please keep doing great work and help people
Thanks for the encouragement 🤓. I'm glad people are finding this thread of videos helpful
could you please explain how you manage your .R files (workflow wise)? And why setwd() is not your favorite?
Using paths in R and why you shouldn't be using setwd (CC179)
ruclips.net/video/StqDYjM6ULo/видео.html
@R.Hainez look for the here package; it's very useful
Hello Boss! Could you please elaborate on why you drop the groups after you group by and summarise? It was confusing when you said that group by and summarize will remove the grouping to the right. I did not see any change after you dropped the groups. The tibble size is 1558*3, which is exactly the same size as the tibble without dropping the groups. Thank you sir!
Thanks for watching and for your question! It doesn’t change the size of the tibble, only the grouping or structure of the tibble. I remove the groupings because they can mess with downstream processes. If I did another mutate with the data still grouped, there could be unintended results
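Here is a toy illustration of that answer, assuming dplyr and a made-up tibble rather than the data from the video. By default, summarize() peels off only the last grouping variable, so the result below is still grouped by year even though its size does not change.

```r
library(dplyr)

# Made-up data: two readings in January of each of two years
temps <- tibble(
  year  = c(2020, 2020, 2021, 2021),
  month = c(1, 1, 1, 1),
  t     = c(1, 3, 5, 7)
)

# "drop_last" is the default behavior, spelled out here to make it visible
monthly <- temps %>%
  group_by(year, month) %>%
  summarize(t = mean(t), .groups = "drop_last")

remaining_groups <- group_vars(monthly)  # still grouped by year

# Same mutate, different answers depending on the leftover grouping
still_grouped <- monthly %>%
  mutate(dev = t - mean(t))   # mean is taken within each year

ungrouped <- monthly %>%
  ungroup() %>%
  mutate(dev = t - mean(t))   # mean is taken over all rows
```

Both results have the same number of rows and columns, but the dev column differs, which is the kind of unintended result a leftover grouping can cause. Passing .groups = "drop" to summarize() is another way to get the ungrouped version.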
I still can't wrap my head around why you normalize the temps to the period between the '50s and '80s. Shouldn't you normalize across all the years?
Here’s a FAQ describing the idea of the temperature anomaly and why nasa does it this way… data.giss.nasa.gov/gistemp/faq/
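As a rough base R sketch of the anomaly idea: each value is expressed as a difference from the mean over a fixed reference period (1951:1980 is the range quoted earlier in this thread). The temperatures below are synthetic, and the column names are assumptions made for illustration.

```r
# Synthetic yearly averages with a slight upward drift
temps <- data.frame(year = 1941:1990)
temps$t_avg <- 10 + 0.02 * (temps$year - 1941)

# Mean over the fixed reference period only
baseline <- temps$year %in% 1951:1980
baseline_mean <- mean(temps$t_avg[baseline])

# Anomaly: how far each year sits from the reference-period mean
temps$anomaly <- temps$t_avg - baseline_mean

# By construction, anomalies average to about zero inside the reference period
baseline_anomaly_mean <- mean(temps$anomaly[baseline])
```

One practical consequence of a fixed reference period is that adding new years of data does not shift the anomalies already computed for earlier years, which would happen if the mean were taken over all years.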