Get my FREE cheat sheets for R programming and statistics (including transcripts of these lessons) here: www.learnmore365.com/courses/rprogramming-resource-library
I would like you to separate the statistical analysis from the business analysis.
I can't say this enough, but please keep making these. I'm hoping I can use your videos to supplement my own stats classes in the future, because they really are great at getting at the meat and potatoes of both the code and theory behind the stats.
Thank you for the feedback!
Thanks so much for making this. I've been watching dozens of videos, and none of them explain how to extract the values from the t-test for inline coding like here! MVP
You're very welcome!
as a total newbie in statistics and trying to learn by myself this video has helped a lot. Thanks for coupling up explanations of code and basic concepts in statistics. Very much appreciated!
As a PhD candidate, your videos helped a lot. Very efficient. Thanks a lot. Please continue with other statistical tests ;)
Thanks, will do!
@@RProgramming101 Yes!
Jesus Christ, this channel just keeps getting better and better. Thanks for the content!
Wow, thanks! So nice of you, Javier 😁
Great video, I'm just starting with R and your videos have been great. The explanations of the codes you provide is fantastic. Keep up the awesome work.
Awesome, thank you!
I hope you will never stop making videos on R... as long as R is in trend.
one of the best R programming teacher.....love you
You are too sweet Amit - thanks! More R videos to come of course! ☺
I love how excited he is to talk about R!!
So nice of you, Janine. Thank you for your feedback!
One of the best t-test and p-values in YT!
Thank you for the feedback!
Great as usual. I can't wait to see the next video. Thanks for the great work.
Thanks again!
Great video. Could you please also start sharing the code to produce the graphs you use to visualize the concepts you speak about?
I was able to figure out how to do the first graphic at 5:26
ggplot(data = ) +
geom_density(mapping = aes(lifeExp), fill = "red", alpha = 0.2) +
geom_vline(xintercept = mean($lifeExp), color = "red", alpha = 0.4, linetype = "dashed", size = 1.5)
I was able to figure out how to recreate the 2nd graph too! (not perfect, but pretty close)
ggplot(data = ) +
geom_density(aes(lifeExp, color = continent, fill = continent), alpha = 0.3) +
geom_vline(xintercept = 48.9, color = "red", alpha = 0.4,
linetype = "dashed", size = 1.2) +
geom_vline(xintercept = 71.9, color = "cyan4", alpha = 0.4,
linetype = "dashed", size = 1.2) +
labs(
title = "Density Plot of Life Expectancy in Africa and Europe",
y = "",
x = "Age in yrs") +
annotate(
"text", x = 43, y = 0.075, label = "48.9 yrs", color = "red",
alpha = 0.8, size = 4.5) +
annotate(
"text", x = 80, y = 0.075, label = "71.9 yrs", color = "cyan4",
alpha = 0.8, size = 4.5)
**NOTE**
The data frame I had loaded was gapminder filtered with filter(continent %in% c("Africa", "Europe"))
Thanks a lot! As usual, the explanation was amazing! I used to get confused in t-test but this complicated stuff you have made it so easy!
Great to hear!
Would it be possible to cover the following in future data analysis episodes?
1- How to decide whether to convert a numeric variable to a factor or leave it numeric, and how to handle categorical variables.
2- What is the best way to deal with data that has missing values (NA)?
As always, you made things enjoyable. Thanks a lot!
Will do!! I have a video on missing data using R on my Global Health channel (go to global health with greg martin and you'll find it there)
Thank you sir for providing this important and beautiful knowledge, I love the way you teach.
Thank you again
It's my pleasure
Another great one in the bag. Great job!
Thanks 👍🏻
Thanks for sharing this Greg. It's incredibly helpful
Glad it was helpful!
@@RProgramming101 Please remember to do a tutorial on difference in differences between a control and experimental group. Thanks
Learning so much from your videos!
I'm so glad! Thanks for the feedback, Hernan. Much appreciated.
Book shaka laka to you for making these great videos...please keep posting
Thank you for explaining in this beautiful manner
My pleasure 😊
Guys does anyone know why I get this error for the paired test: " Error in t.test.formula(lifeExp ~ year, data = ., paired = TRUE) :
cannot use 'paired' in formula method "
Did you figure out how to troubleshoot that issue I'm also experiencing the same issue?
Having this same error
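For anyone hitting this error: in recent versions of R (4.4.0 onwards, as far as I know) the formula method of t.test() no longer accepts paired = TRUE. One workaround is to pass the two years as separate vectors instead of using lifeExp ~ year. This is only a sketch, assuming the paired example compares African countries in 1957 and 2007 (the years mentioned elsewhere in this thread); the object names africa, life_1957 and life_2007 are made up for illustration.

```r
library(dplyr)
library(gapminder)

# Keep one continent and the two years being compared
# (Africa and 1957 vs 2007 are assumptions, not confirmed from the video)
africa <- gapminder %>%
  filter(continent == "Africa", year %in% c(1957, 2007)) %>%
  arrange(country, year)

# One vector per year; arrange() keeps the countries in the same order
# in both vectors, which is what makes the pairing valid
life_1957 <- africa$lifeExp[africa$year == 1957]
life_2007 <- africa$lifeExp[africa$year == 2007]

# The default (non-formula) method still accepts paired = TRUE
result <- t.test(life_1957, life_2007, paired = TRUE)
result
```

The important part is that both vectors list the countries in the same order, because a paired test matches the first value in one vector with the first value in the other.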
You are simply awesome Greg, I learn a lot from you... Thanks to my GURU (means teacher in Hindi)
Wow, thanks! So nice of you.
Great video! Helped me with my R lab homework, thank you!!
You're very welcome! Glad it was helpful!
I really appreciate you making these videos! They are really helpful!
Glad you like them! You are so welcome!
I really enjoyed your videos. As always. Thx for making this great video explanation.
Glad you enjoyed it!
Thank you for this great video!
It's so fun to see how easily you do it))
Hopefully I'll find more materials in your other resources!
Of course, more to come, Roman. Thank you for the amazing feedback!
Great video. But it will greatly be appreciated if you can share the codes for the graphs?
Great video, could you comment a little more about the example you used for the paired t-test? One could guess that the sample of individuals in 1957 is not the same as that in 2007, even if it is the same continent. The two samples could be considered independent of each other. In other words, we are not using repeated or matched measurements of life expectancy and could safely use a two-sample t-test instead of a paired one. Thank you so much.
Very interesting observation Mauro. I hadn't thought of that. I think however that in this case, what is being sampled are countries (not individuals). It is the countries that are matched. But it's an interesting thought and I'll muse over your comment a little.
Thanks for such a great and clear explanation. Would you mind making a video on 'how to visualize the T-test result'?
5:49 hypothesis test
10:51 test for difference of mean (two side test)
17:50 test for difference of mean (one side test)
work of a genius
Very kind of you to say.
Thanks for the video. Do you have one that discusses the density plots you created for life expectancy? This is an excellent series, btw.
you are great teacher
🤩🤩
Wow thank you. Cheers
Thank you for this. I have seen a lot of people turning to a KW test in R. Can you go over when that is appropriate?
will do :)
bonus learning objective, the dot for piping when it's not the first part of a formula - not what I came to learn but super important point (pun intended)
Great video. Thank you so much
You are so welcome! Glad you enjoyed it!
You are the best, as always. But you could have explained the code for the density plot too. Or have I missed it?
Please make videos about principal component analysis in R
Thanks for the suggestion!
Can you please share how to create those plots that u were showing during the start of the video
Hello, Greg! Thank you for sharing this video!
I have one question about the plots though - how did you display M on each density plot, and how did you manage to put the plots together?
ah - thanks for the questions (good ones). Hard to answer in the comments but I will make a video that explains. Watch this space. Happy day. Greg
@@RProgramming101 Wonderful, thank you! All the best:)
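Until that video is out, here is a rough sketch of one way it could be done. It is not the code from the video. It assumes the gapminder data and uses the patchwork package to put two ggplot objects side by side; the helper name density_with_mean and the "M = ..." label format are my own invention.

```r
library(dplyr)
library(ggplot2)
library(gapminder)
library(patchwork)

# Hypothetical helper: density plot with a dashed line and a label at the mean
density_with_mean <- function(data, title) {
  m <- mean(data$lifeExp)
  ggplot(data, aes(x = lifeExp)) +
    geom_density(fill = "red", alpha = 0.2) +
    geom_vline(xintercept = m, linetype = "dashed") +
    annotate("text", x = m, y = 0, vjust = -1,
             label = paste0("M = ", round(m, 1))) +
    labs(title = title, x = "Life expectancy (years)", y = NULL)
}

p_africa <- density_with_mean(filter(gapminder, continent == "Africa"), "Africa")
p_europe <- density_with_mean(filter(gapminder, continent == "Europe"), "Europe")

# With patchwork loaded, `+` places the two plots next to each other
combined <- p_africa + p_europe
combined
```

Using p_africa / p_europe instead of + stacks the plots vertically.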
Great video. A video on testing data for different distributions would be nice, such as normal, weibull etc.
Great suggestion!
Isn't the difference in life expectancy statistically significant when you compare Ireland to Switzerland because you don't compare it only for one year but for, say, the last 30 years?
Or maybe a general question: Does statistical significance change when you compare more than one event?
Can you explain how you did create plots for t.test?
Can you please do more videos about all tests, and linear and logistic regressions? Other than that, awesome videos, thank you very much!
Great suggestion! Thank you for the feedback.
I get you loud and clear in this remote place in Africa.
Good teaching
Please how did u draw the graphs?
What village are you in? Kikikikiki
@@mashfintech Yamumbi
@@mugomuiruri2313 nice one brother. I am in Pretoria South Africa hiding from dudula.
They are density plots, easy to make. I just haven't figured out the vertical line yet.
gapminder%>%
filter(continent %in% c("Africa", "Europe"))%>%
ggplot(aes(x=lifeExp,
color=continent,
fill=continent))+
geom_density(alpha = 0.2)
gives you the basic graph
@@sciencefliestothemoon2305
gapminder%>%
filter(continent %in% c("Africa", "Europe"))%>%
ggplot(aes(x=lifeExp,
color=continent,
fill=continent))+
geom_density(alpha = 0.2)+
geom_vline(xintercept = 50, linetype="dashed",
color = "red", size=0.8) # 50 (mu) is the hypothesised population mean in your test
Great video. Can you please work on a Two One-Sided T-test (TOST) video? It is widely used in the pharmaceutical industry. Also please address the issue of statistical significance versus practical significance. Thank you
Great suggestion! Thank you for the feedback.
This is great! I'm new to R and new to statistics and your channel is very helpful! A question, if using t-tests to determine significance in product price changes, would you use a double sided, single sided test, or paired test? Thank you!
Where do you get tidyverse, patchwork and gapminder? I'm using RStudio at the moment.
thanks! This is a great help!
Hello, I tried to follow but it does not find the gapminder. I have many issues with R not finding library vocabularies. Also error on %>% function.
I had that issue too and I installed the following packages for the %>% function. Maybe they work for you
install.packages("magrittr")
install.packages("dplyr")
Based on my experience, it is much better to perform inferential statistics in R than in Python...
great videos thank you - would you be able to do something on how to work with skewed data
Great suggestion!
Such a great video, can I have that R script?
Ty Greg for yet another great vdo. Two questions.
1) How do you get the two vlines representing the two countries means in the graph?
2) I tried Levene’s test. What is wrong with this code?
library(car)
gapminder %>%
filter(country %in% c("Ireland", "Switzerland")) %>%
leveneTest(lifeExp, country, center = mean)
Can’t figure out either of those 😊
will try to make a video that addresses this.
@@RProgramming101 Looking forward to that! Thank you very much
To answer your question 1, use (geom_vline) at the end of your ggplot code, this is an example using the GAPMINDER dataset:
gapminder %>%
select(continent, lifeExp) %>%
filter(continent == 'Europe') %>%
ggplot(aes(x = lifeExp))+
geom_density(fill ='orange', alpha = 0.5)+
geom_vline(aes(xintercept = mean(lifeExp)),linetype = 'dashed')
@@souhaibsebbane5623 Thank you very much for your helpful answer! I'm going to try this out right away.
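On question 2: inside the pipe, the data frame ends up as the first argument of leveneTest(), and bare column names such as lifeExp are not looked up in it. A sketch of one fix, assuming the car package is installed, is to use the formula interface and pass the data explicitly. The object names two_countries and levene are just for illustration.

```r
library(dplyr)
library(gapminder)
library(car)

# Keep only the two countries and drop the unused factor levels
two_countries <- gapminder %>%
  filter(country %in% c("Ireland", "Switzerland")) %>%
  droplevels()

# Formula interface: response ~ grouping factor, with the data passed explicitly
levene <- leveneTest(lifeExp ~ country, data = two_countries, center = mean)
levene
```

If you prefer to stay in the pipe, the same call should work as leveneTest(lifeExp ~ country, data = ., center = mean) with the magrittr %>% pipe.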
Thanks Greg, I can clearly see the use of the t-test to compare two means. But I still don't understand the first example. Why testing a population if we already know its mean? and also what is the use to "sample" it to make a test?
Hi Francois - great question. In the first example, we're comparing our sample mean to some hypothesised mean (which may come from previous studies or assumptions). We may, for example, all believe that Irish men are on average 6 foot tall. We can take a sample of men, measure the mean and ask if it is different from that assumption (the 6 feet).
Hello sir
Can't find the package 'gapminder ' in R
Thanks for sharing
My pleasure
Thank you sir!
You are welcome!
BEST VIDEO
wow - thanks for the feedback (much appreciated)
Life-saver ❤
I'm thrilled that my video helped you or provided you with useful information. Thanks for letting me know!
Hi there, I have followed your instructions for the examples:
gapminder %>%
filter(continent == "Africa") %>%
select(lifeExp) %>%
t.test(mu = 50)
as well as
my_ttest <- gapminder %>%
filter(continent == "Africa") %>%
select (lifeExp) %>%
t.test (mu = 50)
however both come up with:
Error in select(., lifeExp) : unused argument (lifeExp)
Would you know what could be the cause of this and how to fix it by any chance?
dplyr::select(lifeExp) %>%
# The MASS and dplyr packages both define select(), so the two clash; writing dplyr::select() tells R to use the dplyr version
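Putting that suggestion into the full pipeline, a minimal sketch would look like this. It assumes the error really does come from another loaded package masking select(); the name africa_test is arbitrary.

```r
library(dplyr)
library(gapminder)

# Prefixing with dplyr:: makes sure the dplyr versions are used,
# even if another package (for example MASS) is loaded afterwards
africa_test <- gapminder %>%
  dplyr::filter(continent == "Africa") %>%
  dplyr::select(lifeExp) %>%
  t.test(mu = 50)

africa_test
```

Running conflicts() in the console lists the function names that are defined by more than one attached package, which can help confirm whether select is one of them.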
First you need to load the data:
install.packages("patchwork")
library(patchwork)
and if you want, you can attach the data so you don't need to call the data all the time:
attach(gapminder)
and if you want to create a object to work with it:
data("gapminder")
name_your_data
Is it not simplest just to give the 95% confidence interval rather than the p-value, thus nipping the p-hacking business in the...? There seems to be far too much confusion around p-values.
Good question. The truth is that there are so many ways that people p hack (even without knowing it). It’s a big problem. The best way forward is to make people aware of it.
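For what it's worth, t.test() returns the confidence interval alongside the p-value, so reporting both is easy. A small sketch, assuming the one-sample example from this thread (African life expectancy against a hypothesised mean of 50); the names africa_life and res are mine.

```r
library(dplyr)
library(gapminder)

africa_life <- gapminder %>%
  filter(continent == "Africa") %>%
  pull(lifeExp)

res <- t.test(africa_life, mu = 50)

res$conf.int   # 95% confidence interval for the mean
res$estimate   # sample mean
res$p.value    # p-value of the test against mu = 50
```

These elements can also be used for inline reporting in R Markdown, for example by rounding res$conf.int inside an inline code chunk.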
Why can't I install the packages after typing install.packages(patchwork) and install.packages(gapminder)? :(
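A likely cause: install.packages() expects the package names as quoted strings, whereas library() also accepts them unquoted. A sketch of a setup step that installs only what is missing (the vector name pkgs and the variable to_install are hypothetical; the three packages are the ones mentioned in this thread):

```r
# Package names must be quoted strings here
pkgs <- c("tidyverse", "patchwork", "gapminder")

# Find which of them are not installed yet
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
to_install <- pkgs[!installed]

# Install only the missing ones
if (length(to_install) > 0) {
  install.packages(to_install)
}

# Then load them for the session
library(tidyverse)
library(patchwork)
library(gapminder)
```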
very clear
Glad you think so!
That might be a stupid question, but how to I create the vertical line for the mean in the graphs?
I'll make a video about that (hard to address in comments)
@@RProgramming101 thanks. 👍
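In the meantime, one way to draw a mean line per group is to compute the means first and hand them to geom_vline() as its own data. This is a sketch under the assumption that the plot compares Africa and Europe in the gapminder data; two_continents, group_means and mean_lifeExp are names I made up.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

two_continents <- gapminder %>%
  filter(continent %in% c("Africa", "Europe"))

# One row per continent with its mean life expectancy
group_means <- two_continents %>%
  group_by(continent) %>%
  summarise(mean_lifeExp = mean(lifeExp), .groups = "drop")

p <- ggplot(two_continents,
            aes(x = lifeExp, color = continent, fill = continent)) +
  geom_density(alpha = 0.2) +
  # geom_vline() gets its own data, so one dashed line is drawn per continent
  geom_vline(data = group_means,
             aes(xintercept = mean_lifeExp, color = continent),
             linetype = "dashed")
p
```

Because the lines come from the summarised data, they move automatically if you filter to different continents or countries.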
Thank you
You're welcome!
nice video thank u
Most welcome
Hi, what statistics packages you are using? Can you share your R file?
Will find a way to get the script for you.
One-sided test for two different means: I always get the error "grouping factor must have exactly 2 levels"?
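That error usually means the grouping column still contains more than two groups when it reaches t.test(). A sketch of the usual fix, filtering down to exactly two groups first (Africa and Europe are my assumption here, not necessarily the groups used in the video; one_sided is an arbitrary name):

```r
library(dplyr)
library(gapminder)

one_sided <- gapminder %>%
  # Keep exactly two groups before running the test
  filter(continent %in% c("Africa", "Europe")) %>%
  # The dot passes the piped data to the data argument;
  # "less" tests whether the first group's mean (Africa) is lower
  t.test(lifeExp ~ continent, data = ., alternative = "less")

one_sided
```

The direction of alternative refers to the first factor level minus the second, so it is worth checking the level order with levels() if the result looks back to front.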
boom shakalaka - we did this!
Amazing, Jamie! Thanks for your feedback.
May I know the source for the data?
Error: unexpected symbol in:
"gapminde %>%
filter(continent %in% c("Africa"."
>
This is the 'analyze' tutorial.
Please pump up loudness next time.
Thanks for the suggestion!
7:32 the P value is not a probability, it’s just a number
Chacalaca
Boom...
I don't understand