Using stat_summary from ggplot2 to add a statistics layer to plots in R (CC089)

  • Published: 4 Oct 2024

Comments • 40

  • @Riffomonas 3 years ago

    Is there a stat_* function that you'd like to learn more about?

    • @venkatpgi 3 years ago +1

      Hi, wonderful and very educational videos. Can you please let me know if there is any way to connect the points with lines within the stat_summary function?

    • @Riffomonas 3 years ago

      @venkatpgi try using geom="line" in stat_summary

    • @venkatpgi 3 years ago +1

      @Riffomonas My data show a time trend: the x-axis is time and the y-axis is a score measured on multiple individuals at each time point. Using stat_summary I could calculate the mean score at each time point and its CI. Now I wish to connect the points representing the mean at each time point so that the trend is clearer to the reader. Adding geom = "line" is not working in my case. Can you help?

    • @Riffomonas 3 years ago

      You’ll probably want two stat_summary functions: one to get the means and another to connect the means. You can also always create a separate data frame with the means and add those data with separate geoms
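
A minimal sketch of the two-layer approach suggested above, not code from the video. The scores data frame is hypothetical, mean_se with mult = 1.96 is assumed as a rough stand-in for a 95% CI, and the fun argument assumes ggplot2 3.3.0 or later. If the x variable is a factor rather than a number, the line layer will likely also need aes(group = 1).

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: 10 individuals scored at each of 4 time points
scores <- data.frame(
  time = rep(1:4, each = 10),
  score = rnorm(40, mean = rep(c(5, 6, 8, 7), each = 10))
)

p <- ggplot(scores, aes(x = time, y = score)) +
  # Layer 1: mean with an approximate 95% interval (mean +/- 1.96 SE)
  stat_summary(fun.data = mean_se, fun.args = list(mult = 1.96),
               geom = "pointrange") +
  # Layer 2: the same means, drawn as a connecting line
  stat_summary(fun = mean, geom = "line")
```

Printing p draws the plot. Each stat_summary call computes its own summary, so the two layers stay in sync as long as they use the same data and aesthetics.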

    • @leonardofwink 2 years ago

      Could you please make a video about changing the default ranges of boxplots?
      I haven't found any documentation understandable enough yet :(
      I've been looking for something like min/30%/median/80%/max
      Thank you so much for all your videos :D

  • @ratral 3 years ago +3

    Thanks, as always, each video teaches me something new and motivates me to explore more about it.

    • @Riffomonas 3 years ago

      Awesome! That’s the best compliment you can give me 😊

  • @ciroweinstein8627 1 year ago +1

    Thank you, I wish I had started looking at your videos earlier ...
    Really cool stuff
    It's been helping out a lot

    • @Riffomonas 1 year ago +1

      Glad to hear it! I'm glad you're finding them helpful 🤓

  • @danielvaulot5105 3 years ago +2

    Very nice, I did not know how powerful the stat_* functions were ....

    • @Riffomonas 3 years ago

      They're my new favorite thing :)

  • @niceday2015 2 years ago +1

    Wow, it's super great to learn that from you. Thanks a lot!

  • @yujiangsun5428 2 years ago +1

    thank you! very informative, love it

  • @Jcakiiiii 2 years ago +1

    This is great thank you so much!!!!

  • @guzman_uy9420 2 years ago +1

    Great! Thanks. But how could we get the letters of the Tukey test above the bars? I mean, publication-ready charts

    • @Riffomonas 2 years ago +1

      Keep watching 🤓 there’s a couple of episodes on that

  • @afonsoosorio2099 1 year ago

    Great tutorial, informative and high-quality. Could you kindly tell me what .groups="drop" is doing at the end of the code block where you are grouping (group_by) and computing the summary statistics? Are you ungrouping?
    Thank you.

    • @tomaszuspienski4833 1 year ago

      The answer is in the previous video (CC088). It drops the grouping from the result (so yes, it ungroups) and removes the message that otherwise appears above the table
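
A small sketch of what .groups = "drop" changes, using the built-in mtcars data rather than the data from the video. It assumes dplyr 1.0.0 or later, and the column name mean_mpg is made up for illustration.

```r
library(dplyr)

# Default behaviour: only the last grouping variable (gear) is dropped, so the
# result is still grouped by cyl, and summarize() usually prints a message
# saying so
still_grouped <- mtcars %>%
  group_by(cyl, gear) %>%
  summarize(mean_mpg = mean(mpg))

# .groups = "drop": no message, and the result is a plain ungrouped tibble
ungrouped <- mtcars %>%
  group_by(cyl, gear) %>%
  summarize(mean_mpg = mean(mpg), .groups = "drop")

group_vars(still_grouped)  # grouping variable(s) still attached
group_vars(ungrouped)      # none left
```

The summary values are identical either way. The difference only matters for whatever is piped in next, since later verbs respect any grouping that is still attached.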

  • @CristinaCampbell 2 years ago +1

    How do I get means of the max? I have hourly data from several dataloggers collecting temperature data. I'd like to get daily temperature maximums by month and then the means of those maxes. If I use group_by(day, type) %>% summarise(mean = mean(temp)), I realize I'm getting means (the average of all the temps). However, if I use summarise(max = max(temp)), I feel like I'm getting 1 observation (the maximum of the observations). So how do I average the maximums? Hope this is clear. Thx

    • @Riffomonas 2 years ago +1

      Hi Cristina - I'd probably do something like one summarize to get the max and then another summarize to get the mean. Of course, for each summarize you need to make sure you have the right grouping variables
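
One way the two-step summarize suggested above might look. The temps data frame and its column names (hour, day, month, type, temp) are hypothetical stand-ins for the logger data described in the question.

```r
library(dplyr)

set.seed(42)
# Hypothetical hourly data: 2 sensor types x 2 months x 3 days x 24 hours
temps <- expand.grid(
  hour = 0:23,
  day = 1:3,
  month = c("Jan", "Feb"),
  type = c("shade", "sun"),
  stringsAsFactors = FALSE
)
temps$temp <- rnorm(nrow(temps), mean = 15, sd = 5)

mean_of_maxes <- temps %>%
  # Step 1: one maximum per sensor type, month, and day
  group_by(type, month, day) %>%
  summarize(daily_max = max(temp), .groups = "drop") %>%
  # Step 2: average those daily maxima within each type and month
  group_by(type, month) %>%
  summarize(mean_daily_max = mean(daily_max), .groups = "drop")
```

The first group_by is the finer one (down to the day), and the second drops day so that the daily maxima are what gets averaged.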

    • @CristinaCampbell 2 years ago

      @Riffomonas Thanks for the reply, I'll try that. Yep, I've noticed I'm getting a little confused with all the ways I can group (day, sensors, months, habitat, time) and how that changes the outcome. Anyway, I'll mess around with my code.

  • @andyserowitz5370 2 years ago +1

    Why do you use fun.data as an argument in stat_summary when others use fun.y or fun.x? Please help me understand when and why you’d use either. Thanks.

    • @Riffomonas 2 years ago +1

      Hi Andy - thanks for watching! fun.data takes a function that outputs the y, ymin, and ymax values. fun.y takes a function that outputs only the y value. There are also fun.ymin and fun.ymax arguments that do the same for the lower and upper values. Which you use depends on the type of data you want out and which functions you’re using to summarize the data
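
A sketch contrasting the two styles described above, using the built-in mtcars data. Note that since ggplot2 3.3.0 the arguments fun.y, fun.ymin, and fun.ymax are deprecated in favor of fun, fun.min, and fun.max, so the newer names are used here. The helper median_iqr is a made-up example function, not something from the video.

```r
library(ggplot2)

# Hypothetical summary function for fun.data: it must return a data frame
# with columns y, ymin, and ymax
median_iqr <- function(x) {
  data.frame(
    y = median(x),
    ymin = unname(quantile(x, 0.25)),
    ymax = unname(quantile(x, 0.75))
  )
}

# Style 1: one function supplies all three values
p_data <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(fun.data = median_iqr, geom = "pointrange")

# Style 2: three separate functions, each returning a single number
p_funs <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(
    fun = median,
    fun.min = function(x) unname(quantile(x, 0.25)),
    fun.max = function(x) unname(quantile(x, 0.75)),
    geom = "pointrange"
  )
```

Both plots should show the same medians and interquartile ranges. fun.data is convenient when a ready-made function such as mean_se already returns all three values.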

    • @andyserowitz5370 2 years ago

      @Riffomonas I think I understand now. I’ll play with the code some more.

  • @santiagodevillalobos9654 2 years ago +1

    Is there a way to graph the difference between two bars (fun=diff) that is at the same time based on the absolute value of the difference?

    • @Riffomonas 2 years ago

      You can write your own function here using whatever functions you want.
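
A minimal sketch of passing a custom function to stat_summary, as suggested above. The paired data frame and the abs_diff helper are hypothetical, and the example assumes exactly two values per x group and ggplot2 3.3.0 or later for the fun argument.

```r
library(ggplot2)

# Hypothetical data: two measurements per sample
paired <- data.frame(
  sample = rep(c("A", "B", "C"), each = 2),
  value = c(3, 7, 9, 4, 5, 5)
)

# Custom summary: absolute difference between the two values in a group
abs_diff <- function(x) abs(diff(x))

p <- ggplot(paired, aes(x = sample, y = value)) +
  stat_summary(fun = abs_diff, geom = "col")
```

Any function that takes a numeric vector and returns a single number can be used for fun in the same way.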

  • @puttipongchunark7178 2 years ago +1

    How can I export the summary data from stat_summary to a data frame? Please help!!!

    • @Riffomonas 2 years ago +1

      You would need to do a separate group_by/summarize pipeline to get out the summary data you want
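
A sketch of such a separate group_by/summarize pipeline, using the built-in mtcars data. It reproduces the mean plus or minus one standard error that stat_summary(fun.data = mean_se) would draw. The column names mean_mpg, se, ymin, and ymax are illustrative choices.

```r
library(dplyr)

# The same numbers the plot layer would compute, as a regular data frame
summary_df <- mtcars %>%
  group_by(cyl) %>%
  summarize(
    mean_mpg = mean(mpg),
    se = sd(mpg) / sqrt(n()),
    ymin = mean_mpg - se,
    ymax = mean_mpg + se,
    .groups = "drop"
  )
```

From here summary_df can be written out with write.csv(summary_df, "summary.csv") or passed to ggplot as its own data argument.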

    • @puttipongchunark7178 2 years ago

      @Riffomonas Thank you

  • @rishikeshdash12 2 years ago

    Sir, I have plotted a slope plot with 4 time points. Each time point has 10 samples. I want to add a median line connecting the time points, and error bars at each time point showing the interquartile range of that point.
    Sir, I have tried fun.data and fun inside stat_summary but failed to add the error bars and median line.
    I think it is possible by manipulating the aesthetics inside stat_summary, but I don't know how to do it.
    If you help me I will always be grateful to you 😇.

    • @Riffomonas 2 years ago +1

      Sorry but I really don’t have the ability to do one on one help for free and it’ll be a while before I come back to this type of content

    • @rishikeshdash12 2 years ago

      @Riffomonas OK sir, but one of your videos gave me an idea for adding the median line and IQR error bars. I used the same method that was used for putting the centroid in the PCoA plot. 😁😍🤩

  • @sitendugoswami1990 3 years ago +1

    Are we supposed to ignore the Oscar?

    • @Riffomonas 3 years ago +1

      Hahaha! It's not *really* an Oscar :) A friend sent that to me as a token of our friendship. I include it in the video so he knows I appreciate our friendship

  • @puspitalestarikhanna8582 1 year ago

    Hi! Can I ask how to show the "sd" if the result is "NA"? I tried this syntax -> sd(df$x, na.rm = TRUE), but that didn't work. Please share if you know the solution :)