R demo | Kruskal-Wallis test + Post-Hoc | How to conduct, visualize, interpret & more 😉

yuzaR Data Science

9K views


Published: Apr 18, 2022
In this video, we'll:
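install.packages("ggstatsplot")  # run once
library(ggstatsplot)             # load in every session
# d is a data frame with a grouping column `education` and a numeric column `wage`
ggbetweenstats(
  data = d,
  x    = education,
  y    = wage,
  type = "nonparametric")  # Kruskal-Wallis test with Dunn post-hoc comparisons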
If you only want the code (or want to support me), consider joining the channel (join button below any of the videos), because I provide the code upon members' request.
Enjoy! 🥳

Comments • 74

@jwebbnature 11 months ago ⁺³
Very clear and calm directions, thank you for making these videos to help us :)
@yuzaR-Data-Science 11 months ago
Glad you like them! Thank you for watching!
@saygindiler3928 2 years ago ⁺¹
Perfect, your videos are amazing. Thanks
@yuzaR-Data-Science 2 years ago
Glad you like them!
@jackx7382 2 years ago ⁺¹
Every part is so well explained!
@yuzaR-Data-Science 2 years ago ⁺¹
Thanks! I am glad you liked it!
@emredunder9108 2 years ago
Excellent!
@yuzaR-Data-Science 2 years ago
Glad you liked it!
@juniorsouza4826 11 months ago
Amazing!
@yuzaR-Data-Science 11 months ago
Thanks! If you liked this one, you might enjoy the gtsummary or emmeans package reviews. I found them so useful that I could not resist making videos on them. I use them every day.
@MrNummularius 1 year ago
Amazing
@yuzaR-Data-Science 1 year ago
Thank you! Cheers!
@ogollafredrickotieno 11 months ago
easily explained
@yuzaR-Data-Science 11 months ago
Thanks 🙏
@kydaviddoyle1969 2 years ago
Great video!! Could you explain a little more about how to read and interpret the Dunn test results, so as to tell which is significant? For example it looks like
@yuzaR-Data-Science 2 years ago ⁺¹
Only significant Dunn tests are displayed. That means - between 0.05) relationship. And only p-values from Dunn tests are displayed, which means p-values are the only thing you need to interpret. Please just google "how to interpret p-values" and "why do we need p-value correction for multiple comparisons". Hmm, the eta-squared interpretation is actually part of the video, which is why I don't really know what is unclear. In RStudio, run the following: "?interpret_eta_squared()". You'll get the table with the interpretation. You might also check in general what an effect size is and why it is useful. Again, just google it and read a bit. That will help. Thanks for watching and cheers!
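On the question of why multiple comparisons need a p-value correction, a minimal base R sketch may help. The three raw p-values are made-up numbers for illustration, and `p.adjust()` from the stats package stands in for whatever correction the plot applies; Holm and Bonferroni are just two common choices.

```r
# Hypothetical raw p-values from three pairwise comparisons (made-up numbers).
raw_p <- c(0.01, 0.02, 0.03)

# Holm: step-down correction, less conservative than Bonferroni.
holm_p <- p.adjust(raw_p, method = "holm")

# Bonferroni: multiply every p-value by the number of tests (capped at 1).
bonf_p <- p.adjust(raw_p, method = "bonferroni")

# Side-by-side view: adjusted p-values are never smaller than the raw ones.
print(data.frame(raw_p, holm_p, bonf_p))
```

With these numbers all three comparisons stay below 0.05 after the Holm correction, while only the first does after Bonferroni, which is why the choice of correction matters when reading the plot.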
@NextGenAge 1 month ago
Great video! Is it possible to show only the pairwise comparisons between one group, e.g. 'Original', and Synthetic1, Synthetic2, Synthetic3 ... etc.? It also does pairwise comparisons between those synthetic groups, which I don't want to show and don't want to test at all, except against the original one. Having a separate one-by-one figure takes up a lot of space, so I'm wondering if this is possible?
@yuzaR-Data-Science 1 month ago ⁺¹
I think it's difficult with this function, although not impossible. But it's much more practical to model it, for example with quantile regression, and use the tab_model function from the sjPlot package 📦 I have videos on both if you need some assistance for a start
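For the tests themselves (not the plot), comparing every group against a single reference can be sketched in base R. This is an illustration under assumptions: the data are simulated, the group names are taken from the question, and Wilcoxon rank-sum tests with a Holm correction are swapped in here instead of the Dunn test or the quantile regression mentioned above.

```r
set.seed(1)

# Simulated data; group names follow the question, values are hypothetical.
d <- data.frame(
  group = rep(c("Original", "Synthetic1", "Synthetic2", "Synthetic3"), each = 20),
  value = c(rnorm(20, 0), rnorm(20, 0.2), rnorm(20, 1), rnorm(20, 2))
)

reference <- "Original"
others <- setdiff(unique(d$group), reference)

# One Wilcoxon rank-sum test per non-reference group, each against the reference only.
raw_p <- sapply(others, function(g) {
  wilcox.test(d$value[d$group == g], d$value[d$group == reference])$p.value
})

# Correct only for the comparisons that were actually run.
res <- data.frame(
  comparison = paste(reference, "vs", others),
  p_raw      = raw_p,
  p_holm     = p.adjust(raw_p, method = "holm"),
  row.names  = NULL
)
print(res)
```

Because only three tests are run instead of all six pairwise ones, the correction is applied to fewer p-values.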
@Alex-gw6pm 4 months ago
Thank you for the video! If you have 4 groups of 5 animals each (in general fewer than 10 animals per group) and the distribution is normal, which is better to use: parametric or nonparametric tests?
@yuzaR-Data-Science 4 months ago
Parametric tests are fine in my opinion. Try ANOVA. I also have similar videos on ANOVA and repeated-measures ANOVA.
@MsTenseiga 1 year ago
just found you when I'm desperate to get my statistics in order. I understand what you did, I just can't replicate it for my own data just yet for some reason...
I have like 18 rows of data, all named by species. Every species has a different amount of data. So, x would be the species, and y would be distance they travelled in my agarose gel. I know I need the Kruskal-Wallis test. I have managed to create a boxplot for my data, and to conduct the test for my data separately. Now I know which species is significantly different from which. I just can't figure out how to get that data visualized. I used to do it by hand, but that's borderline impossible with so much damn data. I'll try your approach with ggstats. Thanks so much for giving me a point to start
@yuzaR-Data-Science 1 year ago
You are welcome! Most mistakes are minor ... like the data being untidy somewhere, or a typing mistake ... happens to me all the time. It should work with no problem, unless you have only one observation per species.
@MsTenseiga 1 year ago
@yuzaR-Data-Science so cool of you to respond ^^ I don't know why, but R apparently doesn't like the = sign. It keeps telling me there's an unexpected one, but I typed it exactly as you did.
It's unbelievable us biologists are expected to use R just like that when we're doing our thesis... but no one actually teaches us. Frustrating. And every tutorial is completely different.
I'll just keep trying. Probably something I missed before you specified the x and y axis
@yuzaR-Data-Science 1 year ago
:) I am a biologist myself, cheers mate! I learned R autodidactically! Keep going, it's worth it! Now to your question: are you sure you need = and not ==? They differ in R, and in the beginning I confused them too.
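The difference between the two operators can be sketched in a few lines of base R. The variable names and values are hypothetical and only illustrate the point; this is not a diagnosis of the error in the question.

```r
# `=` assigns a value (inside a function call it names an argument instead).
x = 5

# `==` compares two values and returns TRUE or FALSE.
is_five <- x == 5
is_six  <- x == 6
print(c(is_five, is_six))

# `==` is what you use to pick rows; `=` is what you use to pass arguments.
species  <- c("A", "B", "A")
distance <- c(1.2, 3.4, 2.1)
mean_a <- mean(x = distance[species == "A"])
print(mean_a)
```

In a call like ggbetweenstats(data = d, x = education, ...) every = names an argument, so == would not be expected there.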
@yuzaR-Data-Science 1 year ago
Oh, by the way, here is a link to the blog post for this video, where you can copy and paste the R code: yuzar-blog.netlify.app/posts/2022-04-13-kw/
My blog generally could be a good start, because I teach my students with it, and so far, they progress.
@AuthenticMusicalInsight 9 months ago ⁺¹
OMGGGG THIS IS AMAZING. Thanks.... What alternative do you offer to perform a two-way ANOVA?
@yuzaR-Data-Science 9 months ago
Thanks! 🙏 For two-way ANOVA I would recommend the {emmeans} package. I have two videos on it. I use it every day and think it's a more general approach to any kind of model (not only linear regression or two-way ANOVA) with two (or more) predictors and an interaction between them. Hope you find emmeans useful too! Cheers
@sanjitchandradebnath4916 4 months ago
Great video, it really helped me to easily do my test and add the p-value to the plot. However, I have a small question. Instead of a violin plot, I want to make a boxplot without the points. How can I change the default violin plot into just a boxplot? Could you please advise? Thanks a lot.
@yuzaR-Data-Science 4 months ago
Sure, just check out the options of the function and you'll find almost everything you need to adjust, certainly including the type of plot.
@Maxwaener 1 year ago
Great video. When I try to change the y axis to log10, the multiple comparisons disappear from the graph. Any suggestions on how to change the y axis to log10 and keep the multiple comparisons? (Couldn't find it on SO)
@yuzaR-Data-Science 1 year ago
Hi, thanks! I am not sure about the axis, but when you change the data to log, the multiple comparisons might disappear naturally. There is an option in ggbetweenstats, pairwise.display = "all". Try that one.
@zane.walker 1 year ago
Have you had any experience with using the extract_stats function with purrr::map to extract the stats from the ggbetweenstats function for multiple data sets?
@yuzaR-Data-Science 1 year ago
Not yet, but it seems like a nice function. I used the report package for a while and did a review on it. I think it's a better option.
@bogdanandjelic2200 4 months ago
Thanks for the content once again! I've got an issue: is it possible to disable scientific notation for p-values? Much appreciated.
@yuzaR-Data-Science 4 months ago
Not that I know of. In fact, I hate these too. I already contacted the author about it once. But if more people tell him, he might solve this issue sooner. So please open a new issue on his GitHub, so that he sees that most folks don't like it. Cheers mate
@bogdanandjelic2200 3 months ago
Makes sense! Thanks, will do it. Cheers @yuzaR-Data-Science
@yuzaR-Data-Science 3 months ago
👍
@Maxwaener 1 year ago
Do you know a way to change the p-values in the plot so that they don't show in scientific notation? That would make the p-values more readable.
@yuzaR-Data-Science 1 year ago ⁺¹
Not possible to my current knowledge. However, the package develops quickly, and if you check the options, it might be possible now. If not, you can request a feature on the GitHub page of the package. If yes, please let me know in the comments. Thanks for watching!
@staedtler8479 5 months ago
If you don't mind, I have a small question concerning the applicability of the Kruskal-Wallis test. When measuring heavy metal concentrations across different strata (e.g. sediment, water, fish organs), the units used are different. In this case, does it affect the applicability of the test?
@yuzaR-Data-Science 5 months ago ⁺¹
It sounds like it would. We can't compare kg to grams, right? So, if the concentrations are so different that you need different units, why do you need the test at all? When you bring them to the same unit, you'll see the difference immediately. Hope that helps.
@staedtler8479 5 months ago
@yuzaR-Data-Science Yes, I first thought the same, but it's not simple to convert HM concentrations without having the density, which is an issue when conducting this kind of test. Surprisingly, when I asked different AIs about it, the explanation I got was only about how the assumptions should be met, and that it doesn't really affect the test, as it's robust enough to handle those differences.
@yuzaR-Data-Science 4 months ago ⁺¹
Well, in any case, we are supposed to compare apples to apples. You had better ask your scientific supervisor about it and read some papers that did similar research, so that you can see how they did that comparison and cite them directly.
@Rewuik 1 year ago
Amazing content, thank you! But when I try to use par(mfrow = ...), the plots are not working. I'm trying to put plots side by side with ggstatsplot.
@yuzaR-Data-Science 1 year ago ⁺¹
Thanks for the feedback, Rogério! There are a few ways to make it work:
1) grouped_ggbetweenstats()
indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggbetweenstats.html#grouped-analysis-with-grouped_ggbetweenstats
@yuzaR-Data-Science 1 year ago ⁺¹
2) ggarrange(a, b, c, ncol = 1, nrow = 3,
labels = "AUTO", common.legend = TRUE)
@yuzaR-Data-Science 1 year ago ⁺¹
3) and the last, patchwork R package: patchwork.data-imaginist.com/
@Rewuik 1 year ago
@yuzaR-Data-Science Thank you, the package patchwork is so simple :)
@yuzaR-Data-Science 1 year ago ⁺¹
you are welcome! hmmm, I might do a package review about patchwork one day ;)
@JibHyourinmaru 1 year ago
What if I have more than 2 variables to test, like 5 variables? Can I do it all together and see the result in one figure? Do you have a line of code for that? Thank you
@yuzaR-Data-Science 1 year ago
Yes you can! But it depends on your question. Check out the grouped_ggbetweenstats() function, or you have to do regression. One of my favorites is quantile regression; I have a tutorial on that on my channel too.
@yuliaegorova589 1 year ago
Is it possible to put effect sizes instead of p-values on the pairwise comparisons in a plot?
@yuzaR-Data-Science 1 year ago ⁺¹
That's actually a great idea! I think not, but I just asked the author of the package and will get back to you when it's possible.
@yuliaegorova589 1 year ago
Thanks a lot! That would be great, since we only show significant comparisons on the plot already (so I already know they are significant). What I am more interested in is whether the effect size between the groups is large or small.
@yuzaR-Data-Science 1 year ago
There is another package from the same guy: pairwise_comparisons. It should have all the effects
@yuliaegorova589 1 year ago
@yuzaR-Data-Science Yep, this one I use myself, but can it be combined with plots from this package? I also wonder if you know where to find functions that calculate the effect sizes.
@yuzaR-Data-Science 1 year ago ⁺¹
You can ask Indrajeet, the guy who created the ggstatsplot package. Hmmm, the effectsize package is a useful one. I might do a video on "effectsize" in the future.
@sergiom.querido9645 1 year ago
How do I solve the following error? I have installed all the dependency packages, but the error remains: Error: package or namespace load failed for ‘ggstatsplot’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
there is no package called ‘statsExpressions’
@yuzaR-Data-Science 1 year ago
The error message tells you how ;) install “statsExpressions” package 📦
@hamadalonazi723 1 year ago
Hi, yuzar!
Please help me with this error. I have been trying to do the same test and am getting this error message.
Error in `mutate()`:
ℹ In argument: `isanoutlier = (.) %$% ...`.
ℹ In group 1: `job.category = Doctor`.
Caused by an error in `x$terms`:
! $ operator is invalid for atomic vectors
Could you let me know how I can solve it?
Please help
@yuzaR-Data-Science 1 year ago
Well, I can't help without the data and code. But sometimes googling the error message helps. You are surely not the first person to get such an error ;) Thanks for watching!
@rubyanneolbinado95 4 months ago
Why does it say "cannot find function" when ggstatsplot has already been installed? So sad.
@yuzaR-Data-Science 4 months ago
Did you load it with library(ggstatsplot)? Installing is done once, but you have to load it every time you use it. Cheers
@rubyanneolbinado95 4 months ago
This is very helpful. Thank you so much. God bless. ❤
@yuzaR-Data-Science 4 months ago
you are very welcome! :)
@rubyanneolbinado95 4 months ago
Hi sir, can you give me tips on how to present the results of my GLM? I have made 3 models, but two of them are not significant. Should I present them all in my thesis?
@yuzaR-Data-Science 4 months ago
You can use the sjPlot package for visualisation, the gtsummary package for creating amazing tables, and the emmeans package for extracting the most results from your models. I reviewed all these packages on my YouTube channel. The rest depends on your research question, so ask your supervisors.
@kelleyknaak9051 2 years ago ⁺¹
p̾r̾o̾m̾o̾s̾m̾ ✨
@yuzaR-Data-Science 2 years ago
👍
@dinhluongnguyen3610 2 months ago
I couldn't visit "For more details and R code go to....."
@yuzaR-Data-Science 2 months ago
Sorry about that, man! Netlify shut down my blog since they want me to pay for increased traffic. I refuse to pay for doing something useful for the world (without earning anything), especially since R is open source. But I want to reopen it ASAP, as soon as I find an alternative to Netlify. It'll take some time though, because I am not an IT guy. FYI: my blog is actually the script for the video, word by word, code by code. Thanks for understanding!
Since you are a member, I created a community post with the R code for the whole video. Please let me know whether you could see it and get the code. If you wish, I could send you an HTML version of that video with both code and explanations, and if you like the other videos too, I'll create others until I fix the blog. Cheers and thank you for joining! Highly appreciate that!
