Chi-Square Test, Fisher’s Exact Test, & Cross Tabulations in R | R Tutorial 4.10| MarinStatsLectures

  • Published: Jan 5, 2025

Comments • 49

  • @marinstatlectures
    @marinstatlectures  5 years ago +8

    🤓In this #R video, we learn to conduct Pearson’s chi-square test and Fisher's exact test in R, as well as produce contingency tables with R. The chi-square test and Fisher’s exact test can be used to check whether two variables are independent. Cross tabs, or contingency tables, show the joint frequency distribution of the variables to help understand the association between them. To better learn the concept of the chi-square test, watch this video (ruclips.net/video/pfc9MUz03XA/видео.html ). Want to support us⁉️ You can Donate (bit.ly/2CWxnP2), Share Our Videos, Leave Comments or Give us a Like 👍🏼! May all your learnings be #statistically and #scientifically significant 🦄
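
As a rough illustration of the workflow described above, here is a minimal sketch. The data are made up for this example (they are not the dataset used in the video), and the variable names `gender` and `smoke` are assumptions.

```r
# Hypothetical data: 100 made-up observations, NOT the dataset from the video
gender <- c(rep("female", 50), rep("male", 50))
smoke  <- c(rep("no", 40), rep("yes", 10), rep("no", 30), rep("yes", 20))

# Cross tabulation (contingency table) of the two categorical variables
tab <- table(gender, smoke)
print(tab)

# Pearson's chi-square test of independence
# (for a 2x2 table, R applies Yates' continuity correction by default)
chi_res <- chisq.test(tab)
print(chi_res)

# Fisher's exact test on the same table
fisher_res <- fisher.test(tab)
print(fisher_res)
```

Both functions also accept the two raw vectors directly, e.g. `chisq.test(gender, smoke)`, and build the table internally.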

  • @malmulkalil4546
    @malmulkalil4546 8 years ago +10

    a program, a man ,a mission,
    you are a hero

  • @axelbiyiha618
    @axelbiyiha618 2 years ago +1

    Finally someone who correctly explains formulas for R, great job thank you very much!

  • @emreyldrm2006
    @emreyldrm2006 8 years ago +7

    God bless you, my brother. You're the man.

  • @RidgyULRASite
    @RidgyULRASite 3 years ago

    Great tutorial! Concise and super clear! Good job! Thank you very much!!!

  • @jeromedoe1884
    @jeromedoe1884 5 years ago +1

    Thanks very much!! Your videos are very very very helpful!

  • @larrythelocsta
    @larrythelocsta 6 years ago +3

    You just saved me from failing this shit, much appreciated.

    • @marinstatlectures
      @marinstatlectures  6 years ago +1

      good to hear...we've got 100+ videos, so feel free to check those out :)

  • @vigneshchellam9352
    @vigneshchellam9352 23 days ago

    Bro, can you explain the relationship between EOD (experience of discrimination) and DASS21? What analyses do we have to do for that? Can you explain?

  • @MuneerIslam
    @MuneerIslam 7 years ago +1

    I must thank you for making statistics & R so easy. I wonder if you could please explain a little about Yates' correction. When should it be used and when not?
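
For readers with the same question, a small sketch of how the correction is switched on and off in R. The counts are hypothetical. In `chisq.test`, the `correct` argument only has an effect on 2x2 tables; the correction shrinks each |observed - expected| difference by 0.5, so the test becomes more conservative. Whether to use it is a judgment call: some texts recommend it for small expected counts, while others consider it overly conservative.

```r
# Hypothetical 2x2 table of counts
tab <- matrix(c(40, 30, 10, 20), nrow = 2,
              dimnames = list(gender = c("female", "male"),
                              smoke  = c("no", "yes")))

# Default: Yates' continuity correction applied (2x2 tables only)
with_yates <- chisq.test(tab, correct = TRUE)

# Same test without the continuity correction
without_yates <- chisq.test(tab, correct = FALSE)

# The corrected statistic is smaller, so its p-value is larger
print(c(corrected   = unname(with_yates$statistic),
        uncorrected = unname(without_yates$statistic)))
print(c(corrected   = with_yates$p.value,
        uncorrected = without_yates$p.value))
```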

  • @RakeshSharma-yx6qv
    @RakeshSharma-yx6qv 4 years ago

    Can you please tell me what "Caesarean" means in this dataset?

  • @chiomaodozor2084
    @chiomaodozor2084 4 years ago

    What if I wanted my table ordered as Male above Female and Yes before No? How would I write that into R to make sure the table is setup the way I want it?
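
One possible way to control the ordering asked about here is to set the factor levels before building the table. This is a sketch with made-up data; the level labels ("male", "female", "yes", "no") are assumptions and must match the spelling used in your own data.

```r
# Hypothetical raw data
gender <- c("female", "male", "male", "female", "male")
smoke  <- c("no", "yes", "no", "yes", "yes")

# table() orders rows and columns by factor level,
# so state the desired order explicitly
gender_f <- factor(gender, levels = c("male", "female"))
smoke_f  <- factor(smoke,  levels = c("yes", "no"))

tab <- table(gender_f, smoke_f)
print(tab)   # rows: male, female; columns: yes, no
```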

  • @truecolors51
    @truecolors51 3 years ago

    What do the residuals of the test tell us? Also, what could I do as a post hoc test?
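
A short sketch of where the residuals live in the object returned by `chisq.test`, using hypothetical counts. The Pearson residuals are (observed - expected) / sqrt(expected), and their squares add up to the test statistic when no continuity correction is applied. A common rule of thumb, not a formal post hoc test, is to look at cells whose standardized residual is larger than about 2 in absolute value.

```r
# Hypothetical 2x3 table of counts
tab <- matrix(c(30, 20, 10,
                15, 25, 40), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"),
                              level = c("low", "mid", "high")))

res <- chisq.test(tab)

res$expected    # expected counts under independence
res$residuals   # Pearson residuals: (observed - expected) / sqrt(expected)
res$stdres      # standardized residuals

# Cells that contribute most to the statistic (rule-of-thumb cutoff of 2)
flagged <- abs(res$stdres) > 2
print(flagged)
```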

  • @jababnamgay6366
    @jababnamgay6366 3 years ago

    But how do you interpret the chi-squared result here?

  • @sabrineseddiki9757
    @sabrineseddiki9757 3 years ago

    How can I convert my chi-square results (X-squared, df, p-value) into a graph?

  • @thaisbarbosadeoliveira7962
    @thaisbarbosadeoliveira7962 6 years ago +1

    Thank you very much! You helped me!

  • @lpierich
    @lpierich 3 years ago

    I like your videos, but is there a way you can make the ending longer? The ads pop up towards the end and you can't see the last thing you did.

  • @akshaypunde7520
    @akshaypunde7520 7 years ago

    Hello Sir. What type of test do I use to compare the means of males and females for different factors like Security and Reliability, to know if there is a significant difference between the means of males and females for those factors?

  • @davidizquierdogomez
    @davidizquierdogomez 7 years ago

    Nice tutorial, it has solved many of the problems I had, but I still have a doubt... I run the chisq function and the outcome is that the p-values might not be reliable... then I run the Fisher test, but what if the R output after the Fisher test is the following:
    "LDKEY is too small for this problem. Try increasing the size of the workspace"
    Would it be all right to run chisq.test on the contingency table and use the Monte Carlo approach to calculate the p-value?
    Thank you very much!!!
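
The Monte Carlo option mentioned in this question can be sketched as follows, with a made-up sparse table. `chisq.test` has `simulate.p.value` and `B` (number of replicates) arguments; with simulation the p-value no longer relies on the large-sample chi-square approximation. As an alternative, `fisher.test` has a `workspace` argument that can be increased, and its own `simulate.p.value` option.

```r
set.seed(1)  # simulated p-values vary from run to run without a seed

# Hypothetical sparse 3x3 table where the chi-square approximation is doubtful
tab <- matrix(c(3, 1, 0,
                2, 4, 1,
                1, 0, 5), nrow = 3)

# Monte Carlo p-value based on B simulated tables with the same margins
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
print(mc)

# With simulation there are no degrees of freedom to report
print(mc$parameter)
```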

  • @juliettadina7032
    @juliettadina7032 4 years ago

    I have N = 1000000 and I don't know how to do the chisq test with my data. Help?

  • @TheCooPeer
    @TheCooPeer 6 years ago

    Quick question about Chi-Square Tests:
    I am supposed to find out if there is a statistically significant connection between two variables on a level of significance of 1%.
    My calculation of the Chi-Square-test has the following output:
    X-squared = 81.469, df = 1, p-value < 2.2e-16
    I would interpret it so, that there IS a "connection" between my variables on the 1% significance level because the p-value is

  • @murthyadivirk
    @murthyadivirk 8 years ago

    Hi Mike, what about the p-value in this case? The chi-squared value is below the critical value but the p-value is below the required significance level. Does this mean we reject the null hypothesis that there is no difference between men and women regarding smoking?

    • @marinstatlectures
      @marinstatlectures  8 years ago +1

      Hi +Murthy Adivi, here the test statistic is less than the critical value, and correspondingly the p-value is greater than the significance level of 5%, therefore we would FAIL TO REJECT the null hypothesis. we DO NOT have evidence to conclude that smoking is different between men and women (we do not have evidence to believe the alternative hypothesis is true). this of course does not mean we accept the null to be true, we're just not convinced that it is not true.
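
The link between the critical value and the p-value in this reply can be checked numerically. The sketch below uses hypothetical test statistics (the actual value from the video is not reproduced here) and shows that, for the same chi-square distribution, "statistic below the critical value" and "p-value above the significance level" always go together.

```r
alpha <- 0.05
df    <- 1   # a 2x2 table has (2 - 1) * (2 - 1) = 1 degree of freedom

# Critical value: the point with upper-tail probability alpha
crit <- qchisq(1 - alpha, df)
print(crit)   # about 3.84

# Hypothetical test statistics, for illustration only
stats <- c(0.5, 1.5, 3.0, 4.0, 10.0)
p_values <- pchisq(stats, df, lower.tail = FALSE)

# The two decision rules agree for every statistic
decisions <- data.frame(statistic       = stats,
                        p_value         = p_values,
                        reject_by_crit  = stats > crit,
                        reject_by_p     = p_values < alpha)
print(decisions)
```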

  • @im_karamo1907
    @im_karamo1907 5 years ago

    Hi prof, thanks for this great job. I have a question. I have two datasets, each with a binary outcome, and I want to compare the outcome between these two datasets to see which one has the poor or the good outcome. There are 10 independent variables in dataset 1 and 5 independent variables in dataset 2, but they both have the same outcome measure, which is coded as a dummy variable (0 and 1). How can I determine which group has the better outcome? Can I just use their proportions? I need an explanation, thanks.

    • @marinstatlectures
      @marinstatlectures  5 years ago

      Hi, i'm a bit unclear on the question...maybe you can try to elaborate and clarify a bit. when you say you want to compare the outcomes for the 2 different datasets, what do you mean by that? are the other independent variables being considered at all? if so, how? also, what do you mean by "which group has a better outcome"? how are you defining better outcome? if you can clarify the question you are trying to answer with this data, i may be able to offer some suggestions.

    • @im_karamo1907
      @im_karamo1907 5 years ago

      @@marinstatlectures Thank you for your reply to my message. Once again, thanks for these beautiful lectures on statistics.
      Here is what I mean: I have two datasets.
      1. tpa dataset: this dataset is about patients who receive tpa treatment (the outcome variable is binary, coded as 0 and 1). It has 16 independent predictors.
      2. notpa dataset: this dataset is about patients who do not receive tpa treatment (the outcome variable is binary, coded as 0 and 1). It has 12 independent predictors.
      • The goal of my project is to predict which of the two datasets has the better outcome (the outcome is defined as whether the patient, with or without tpa treatment, has a disability at discharge from the hospital).
      • A good outcome is defined as the patient not being disabled at discharge from hospital, and a poor outcome is defined as the patient having a disability after discharge.
      • Using statistical measures, how do you determine which one has the better outcome?
      I can explain further if my point is not clear. Thank you Mike.

  • @PsicometristasBrasil
    @PsicometristasBrasil 6 years ago

    Nice tutorial! Good job! Just for clarification, please replace "parametric" by "non-parametric".

    • @marinstatlectures
      @marinstatlectures  6 years ago +1

      that is correct. previously we added an "annotation" to the video to clarify that technically the Chi-Square test is classified as non-parametric, not parametric, but YouTube has discontinued the use of annotations and all of them have been deleted. i would add that while Chi-Square is technically classified as non-parametric, it is much more like a parametric approach, and i usually prefer to refer to it this way. it makes use of a theoretical probability distribution (chi-square distribution, the sum of squared Normals), and it relies on a large sample in order for the sampling distribution to be approximately chi-square distributed. it requires the same set of assumptions as parametric approaches like the 2-sample t-test, ANOVA,.., and it doesn't have as relaxed assumptions as the usual non-parametric approaches like the Wilcoxon signed-rank, or rank sum test,....and so i think it is more helpful to think of it this way....probably a better classification for me to use would be "large sample approach" and "non-large sample approach"
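
The "sum of squared Normals" remark can be illustrated with a quick simulation. This is only a sketch: the seed, the number of simulations and the choice of 3 degrees of freedom are arbitrary assumptions.

```r
set.seed(42)
df    <- 3        # number of independent standard normals being summed
n_sim <- 100000   # number of simulated sums

# Each row holds df independent standard normal draws
z <- matrix(rnorm(n_sim * df), ncol = df)

# Sum of squared standard normals, one value per row
sums <- rowSums(z^2)

# A chi-square distribution with df degrees of freedom has mean df
print(mean(sums))

# Simulated 95th percentile versus the theoretical chi-square quantile
sim_q95    <- unname(quantile(sums, 0.95))
theory_q95 <- qchisq(0.95, df)
print(c(simulated = sim_q95, theoretical = theory_q95))
```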

  • @timgiles8793
    @timgiles8793 11 years ago

    Hi Mike, great videos...really helpful! Just a quick question. I've done a chi-square test on the presence of disease in different seasons. There is a significant difference, but I don't know how to pinpoint this or do a pairwise comparison of the 4 seasons. Is there something in R that I can do to see the comparison between the seasons and get a p-value for each?
    Cheers!

    • @marinstatlectures
      @marinstatlectures  11 years ago +3

      Hi Tim Giles , thanks for the comment! Sure, you can do all pairwise tests, but you will have to create all the 2x2 tables to do so. You can either create them using the 'matrix' command, and manually enter the values for the table...
      *table1
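
The original code in the reply above is cut off, so the following is not a reconstruction of it. It is a separate sketch of the pairwise idea with hypothetical counts: build each 2x2 sub-table and test it. The Bonferroni adjustment at the end is an addition of this sketch, not something stated in the reply.

```r
# Hypothetical counts: disease status by season
tab <- matrix(c(12, 88,
                25, 75,
                30, 70,
                10, 90), nrow = 2,
              dimnames = list(disease = c("yes", "no"),
                              season  = c("winter", "spring",
                                          "summer", "autumn")))

# All pairs of seasons (4 seasons give 6 pairs)
season_pairs <- combn(colnames(tab), 2)

# Chi-square test on each 2x2 sub-table
p_raw <- apply(season_pairs, 2, function(pr) chisq.test(tab[, pr])$p.value)
names(p_raw) <- apply(season_pairs, 2, paste, collapse = " vs ")

# Adjust for running several tests (assumption: Bonferroni is acceptable here)
p_adj <- p.adjust(p_raw, method = "bonferroni")
print(round(cbind(raw = p_raw, bonferroni = p_adj), 4))
```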

  • @DazzleMeezy
    @DazzleMeezy 10 years ago +1

    Thank you!!

  • @miguebyte
    @miguebyte 7 years ago +1

    I am somewhat confused. The chi-square test only works for samples with a normal distribution, but when I search the internet for how to test normality of discrete variables, the most common answer is that there is no such thing as a normality test for categorical variables. So how is it possible to call chi-square a PARAMETRIC TEST? I am done with stats; every time I think I have taken a step forward, some BS like this slaps me in the face :/ Can anybody help?

    • @marinstatlectures
      @marinstatlectures  7 years ago +4

      Hi Migue byte, here is something that may help. first, the ChiSquare test does not need the variable to be normally distributed (it is impossible for something like "gender", or any sort of yes/no variable, to be normally distributed). what it needs is a "large sample"...and the guideline for this is expected cell counts of at least 5 (or some may say 10). this "large-sample" requirement means that the test statistic you calculate will approximately follow a chi-square distribution. (the chi-square is a sum of squared normals, but i'd say just leave that aside for now...but that is where normality comes in, in case you hear something about that). technically, the chi-square test is a non-parametric test (as it doesn't work with parameters like the mean/SD), BUT it assumes a distribution (the chi-square distribution) so it is a test that uses a distribution. the whole parametric/non-parametric thing becomes confusing for the chi-square test as the majority of parametric tests rely on large samples, use parameters, and assume some nice distribution...while the majority of non-parametric tests do not need large samples, do not use parameters, and do not assume a distribution. the chi-square kinda crosses into both camps as it is technically non-parametric...but it does still assume a distribution like a parametric test does. BUT, it is important to remember that these are just labels...the chi-square is much like all the other parametric tests in that it relies on a large sample and assumes some distribution...but technically it gets labelled as non-parametric.
      hope that helps your understanding of this!
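
The "expected cell counts of at least 5" guideline can be checked directly from the object returned by `chisq.test`. A minimal sketch with hypothetical tables follows; the fallback to Fisher's exact test when the guideline fails is the usual suggestion, not a hard rule.

```r
# Hypothetical 2x2 table with moderate counts
tab <- matrix(c(12, 5, 7, 9), nrow = 2)

# Expected counts under independence: (row total * column total) / grand total
expected <- chisq.test(tab)$expected
print(expected)
print(all(expected >= 5))   # guideline met for this table

# Hypothetical small table where the guideline fails
small <- matrix(c(3, 1, 1, 3), nrow = 2)
expected_small <- suppressWarnings(chisq.test(small))$expected
print(expected_small)

# When any expected count is below 5, Fisher's exact test is a common fallback
if (any(expected_small < 5)) {
  print(fisher.test(small)$p.value)
}
```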

    • @miguelsuarez475
      @miguelsuarez475 7 years ago

      Professor Marin, thank you very much for taking some of your time to reply to my comment. I really appreciate your help, and even more the fact that you're helping a lot of people with your online lectures for the love of teaching. I think I have advanced a lot in stats by myself, but that "PARAMETRIC" label was confusing me. Thank you again, I have taken another step forward. I wish you the best.

  • @finns23653
    @finns23653 7 years ago

    now how do i plot the probability density function?

  • @SrJaramango
    @SrJaramango 6 years ago

    Hello, Mike! I wondered if there is a way to obtain the expected contingency table for the fisher.test :/. Awesome video!

    • @marinstatlectures
      @marinstatlectures  6 years ago +2

      Hi, there’s not, and that’s because Fisher's test is different...it is a permutation test. The test involves calculating all possible permutations of the table, and the p-value is the proportion of all permutations that are as extreme as or more extreme than the one observed (there are actually some approximations done, but that’s the underlying concept, and why there is no expected table for this test)
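
For a 2x2 table, the idea in this reply can be made concrete: with the row and column totals held fixed, every possible table is determined by its top-left cell, whose probability under independence is hypergeometric. The sketch below uses a hypothetical table and sums the probabilities of all tables that are no more likely than the observed one, which matches the two-sided p-value reported by `fisher.test`.

```r
# Hypothetical 2x2 table
tab <- matrix(c(3, 1, 1, 3), nrow = 2)

m <- sum(tab[, 1])   # first column total
n <- sum(tab[, 2])   # second column total
k <- sum(tab[1, ])   # first row total
x <- tab[1, 1]       # observed top-left cell

# Every value the top-left cell can take with the margins held fixed
support <- max(0, k - n):min(k, m)

# Hypergeometric probability of each possible table
probs <- dhyper(support, m, n, k)

# Two-sided p-value: tables as likely as or less likely than the observed one
# (the small tolerance guards against floating-point ties)
p_manual <- sum(probs[probs <= dhyper(x, m, n, k) * (1 + 1e-7)])
p_fisher <- fisher.test(tab)$p.value

print(c(manual = p_manual, fisher.test = p_fisher))   # both 34/70, about 0.486
```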

  • @ignacio4786
    @ignacio4786 6 years ago +1

    you are a fucking master of rstudio, thanks bro for helping me to pass my subject!

  • @Terryno34
    @Terryno34 11 years ago

    Thanks for your video. I would like to ask if R is able to handle Fisher's exact test for contingency tables larger than 2x2?

    • @marinstatlectures
      @marinstatlectures  11 years ago +1

      Hi Terryno34, thanks for watching! Yes, Fisher's Exact test in R can handle tables of larger dimension than 2x2, but the test may not be able to be calculated for very large sample sizes (e.g., if the dimensions are too large, or n is too large, there may be too many possible permutations, and the test statistic/p-value cannot be calculated).
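
A small sketch of Fisher's exact test on a table larger than 2x2, using hypothetical counts. For larger or denser tables where the exact calculation becomes infeasible, `fisher.test` offers a `simulate.p.value` argument as a Monte Carlo alternative.

```r
# Hypothetical 2x3 table
tab <- matrix(c(4, 1,
                2, 3,
                1, 5), nrow = 2,
              dimnames = list(outcome = c("yes", "no"),
                              group   = c("A", "B", "C")))

# Exact test; for tables larger than 2x2 no odds ratio is reported
exact <- fisher.test(tab)
print(exact$p.value)

# Monte Carlo version, useful when the exact computation is too expensive
set.seed(7)
sim <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)
print(sim$p.value)
```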

  • @fprts1540
    @fprts1540 7 years ago +1

    @Mike_marin, you rock

  • @puikwancheung4441
    @puikwancheung4441 9 years ago

    Thanks for your videos!
    I came across a paper about Chi-square test and it says Chi-square test is a non-parametric test because it's distribution free:
    www.ncbi.nlm.nih.gov/pmc/articles/PMC3900058/
    Could you explain a bit more about this?
    Thank you!

    • @marinstatlectures
      @marinstatlectures  9 years ago +1

      Hi Pui Kwan Cheung, sure, i can elaborate on this a bit. first to note is that this test is often referred to as non-parametric, and i usually refer to it as parametric. technically, this is a non-parametric (and distribution free) method...since the variable(s) are categorical, there is not a summary parameter (like a mean/sd) and there is not an assumed distribution, hence why it gets called a non-parametric method. but, in order to conduct the test, a few assumptions need to be met (one of them being a large sample size, usually expressed as all expected cell counts being at least 5). this assumption is necessary in order to approximate the p-value for the test, using the Chi-Square distribution. if this assumption is not met, you should not use the chi-square distribution to estimate the p-value. since this test requires the same assumptions as a parametric test, and when the assumptions are met, the p-value is calculated using a statistical distribution (the chi-square distribution), i tend to refer to it as a parametric method (but technically, it is actually a non-parametric method). as i said, i group it into the set of parametric methods, as it has all the same assumptions/requirements as a parametric method. i find this easier for grouping sets of methods in the head. and really, this is just a definition we give to things, and the more important part is to understand what a test can be used for and what it actually tells us, when it is/isn't appropriate, and what conditions or assumptions are required in order to use it.

    • @puikwancheung4441
      @puikwancheung4441 9 years ago

      MarinStatsLectures I see! I totally understand now. Thank you very much for sharing those videos! They are indeed very helpful!

    • @marinstatlectures
      @marinstatlectures  9 years ago

      you're welcome Pui Kwan Cheung