P-values Broke Scientific Statistics-Can We Fix Them?

  • Published: 21 Aug 2024

Comments • 1.2K

  • @SciShow
    @SciShow  5 years ago +883

    There is a typo at 7:37! The P-value for 6 tea cups is 0.05, not 0.5. Thanks to everyone who pointed it out!

    •  5 years ago +8

      There is a typo at 7:37! The P-value for 6 tea cups is 0.05, not 0.5.

    • @RT-oy7mu
      @RT-oy7mu 5 years ago +5

      @@SirShades23 Nice try, @daniquasstudio
      was the one who corrected it.

    • @daniquasstudio
      @daniquasstudio 5 years ago +6

      It is an honor, thank you

    • @VariantAEC
      @VariantAEC 5 years ago +2

      That last option should be the way journals proceed. Results shouldn't matter; if they do, the science takes a back seat.

    • @davidalearmonth
      @davidalearmonth 5 years ago +3

      I feel like the stats approach on the tea with milk is wrong at 1 in 70. I would have figured each cup was a 50/50 chance, so picking 8 correctly would be 1 in 256?
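For what it's worth, the two numbers come from different assumptions, which a few lines of Python (a sketch, not from the video) make explicit:

```python
from math import comb

# Fisher's design: 8 cups, exactly 4 milk-first, and the taster knows
# there are 4 of each. She chooses which 4 cups are milk-first, so the
# chance of getting all 8 labels right by pure guessing is 1 / C(8, 4).
p_fisher = 1 / comb(8, 4)
print(p_fisher)  # 1/70, about 0.0143

# The 1-in-256 figure treats each cup as an independent 50/50 guess,
# which ignores the constraint that exactly 4 cups are milk-first.
p_independent = 0.5 ** 8
print(p_independent)  # 1/256, about 0.0039
```

Under the constrained design there are only 70 equally likely ways to pick 4 cups out of 8, which is where the video's 1 in 70 comes from.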

  • @mhaeric
    @mhaeric 5 years ago +726

    There's something both meta and ironic about a dead fish being used to poke holes in a methodology by a Fisher.

    • @HaloInverse
      @HaloInverse 5 years ago +47

      You could _also_ say that he was fishing for data that supported his hypothesis.

    • @reallyWyrd
      @reallyWyrd 4 years ago +11

      It reminds me of the famous robot and dead herring experiments carried out at the Maximegalon Institute For Slowly And Painfully Working Out The Surprisingly Obvious.
      Except that this result wasn't obvious.
      Except that, if we were better at actually doing stats and science, it would have been.

    • @KnakuanaRka
      @KnakuanaRka 3 years ago +1

      *ba dum tss* xD

    • @lc9245
      @lc9245 3 years ago +3

      No, it didn't. Fisher's methodology is just a set of theories; the practice of those theories is what's troublesome. His method is fine, but the considerations around the statistics, the metadata, weren't taken into account. Because the p-value is easy to calculate, researchers abuse it. It's not Fisher's fault, it's society's fault.

    • @leonorf2730
      @leonorf2730 3 years ago +1

      Looks like 😎 the Fisher became the fished.

  • @WeatherManToBe
    @WeatherManToBe 5 years ago +210

    Just a heads up for everyone: you can tell the difference between milk first vs tea first. If you do milk first, you temper the milk as you pour the tea in, stopping the proteins in the milk from denaturing and clumping together on the top as a skin or foam. (This only concerns freshly brewed tea held in a decent pot staying near boiling point.) If milk is added to a near-full cup of tea, the first bit of milk gets 'burnt' before the tea is eventually cooled down by the additional milk. If the tea is below 82 degrees, there is no difference.
    This is the same problem with gluten/eggs/other dairy in sauces. Always add hot stuff to cold stuff, the slower the better.

    • @raxleberne4562
      @raxleberne4562 4 years ago +11

      It's amazing, the subtleties there are to be overlooked when studying things. I feel as if I will think of this every time I encounter something with no apparent explanation.

    • @Achill101
      @Achill101 3 years ago +10

      @@raxleberne4562 - the point of statistical tests is to see if there's an effect at all, not yet to understand the causation. If it is nearly certain that there's an effect, people are more likely to look into the mechanism of how it works. We shouldn't criticize a statistical test for not doing what it's not supposed to do.

    • @sophierobinson2738
      @sophierobinson2738 3 years ago +1

      Works with coffee, too.

    • @laurelgardner
      @laurelgardner 3 years ago +9

      Yeah, I found it pretty GD annoying that they just assumed it was nonsense when making this video.

    • @ruairidhmcmillan2484
      @ruairidhmcmillan2484 3 years ago +4

      @@laurelgardner exactly, it's not scientific to dismiss any potential effects at the level of two experimental media interacting (milk and coffee) just because these effects are not immediately apparent. Science wouldn't be all that useful if everything which was apparent made for an accurate representation of everything which is not apparent.

  • @Paul-A01
    @Paul-A01 5 years ago +529

    DM: You encounter a feral null hypothesis.
    Researcher: I run a study on it!
    *rolls* Critical significant results!

    • @calamusgladiofortior2814
      @calamusgladiofortior2814 5 years ago +23

      I find this joke... (rolls d20, checks table) amusing.

    • @MrUtak
      @MrUtak 5 years ago +14

      *rolls a 20*
      Did the DM see it?
      *rolls again*

    • @mal2ksc
      @mal2ksc 5 years ago +13

      I cast Hellish Rebuke as a reaction to discredit the researcher!

    • @ValeriePallaoro
      @ValeriePallaoro 5 years ago +3

      f*ckin excellent!!

    • @dmarsub
      @dmarsub 4 years ago +4

      This is why in some pen and paper system critical rolls only happen with 2 rolls now.
      (And why study reproduction is so important)

  • @argentpuck
    @argentpuck 5 years ago +1173

    1 in 20 has always bothered me when I studied statistics in a scientific setting. Any D&D player can tell you just how often a 1 or 20 actually comes up and it's rather more often than 5% sounds like.
    Edit:
    This blew up a lot more than I expected and people are focusing on the wrong thing. I used D&D because I figure most people who watch these videos are familiar with rolling icosahedrons. The point, though, has nothing to do with dice probability or the cognitive biases around particular results (although, thinking about it, that does speak to p-hacking).
    The point I intended is that 5%, especially in a large sample, is quite a lot. If I flood the market with a placebo cure for the common cold and 5% of the 10,000,000 who used it report that it worked, that's half-a-million voices confirming pure nonsense. Cognitive biases being what they are, basically any confirmation can get people to draw the wrong conclusion (e.g., anti-vaxxers), certainly, but a 1-in-20 probability that something is pure chance is rather high odds and this video confirms that it is basically arbitrary.

    • @richardoteri356
      @richardoteri356 5 years ago +11

      Yes.

    • @joegillian314
      @joegillian314 5 years ago +55

      The reason it's 5% is because of the empirical rule.
      In a normal distribution we have the following properties:
      approximately 68% of all data lie within 1 standard deviation of the mean
      approximately 95% of all data lie within 2 standard deviations of the mean
      approximately 99.7% of all data lie within 3 standard deviations of the mean
      The second property is where the 5% comes from.
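The 68/95/99.7 figures can be checked directly with Python's standard library (a quick sketch, not from the video):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal distribution
for k in (1, 2, 3):
    inside = nd.cdf(k) - nd.cdf(-k)  # probability mass within k standard deviations
    print(f"within {k} SD: {inside:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Strictly speaking, a two-sided 0.05 cutoff sits at 1.96 standard deviations rather than exactly 2, so "2 SD" leaves about 4.6% in the tails, but the empirical rule is where the round 5% convention comes from.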

    • @jackielinde7568
      @jackielinde7568 5 years ago +17

      I was thinking about this very thing... with my dice bag a foot away from me on the desk.

    • @crovax1375
      @crovax1375 5 years ago +51

      There is a bias towards recalling a roll of a Nat 1 or 20 over any other failed or successful roll, because players get more excited about a critical failure or success

    • @interstellarsurfer
      @interstellarsurfer 5 years ago +22

      @@joegillian314 So, did the empirical rule lend itself to D&D, or does D&D adapt to the empirical rule?
      Further research is needed. 😅

  • @codysmit
    @codysmit 5 years ago +363

    So you could say that the p-value... was born from a tea-value.

    • @mdunkman
      @mdunkman 5 years ago +18

      Cody, it was a result of a Student’s Tea-test.

    • @microbe_guru
      @microbe_guru 5 years ago

      +

    • @Dornatum
      @Dornatum 5 years ago +1

      Oh my God that makes so much sense

    • @markdodd1152
      @markdodd1152 5 years ago +2

      They kind of tea-bagged the P value

    • @jonathankool1997
      @jonathankool1997 4 years ago

      Is it worse that there is such a thing as a t-value?

  • @MrDavidlfields
    @MrDavidlfields 5 years ago +176

    P-hacking has been a significant problem in recent years leading to repeatability problems with hundreds of published studies.

    • @ValeriePallaoro
      @ValeriePallaoro 5 years ago +2

      that's what she said ...

    • @TheRABIDdude
      @TheRABIDdude 5 years ago +1

      David Fields She gave the example of "collecting more data" for p-hacking, but I don't understand how doing more repeats is going to make the result less reliable...? Please can someone explain.

    • @MrDavidlfields
      @MrDavidlfields 5 years ago +5

      TheRABIDdude explaining here may not be too easy, but this video explains it very well. Basically it allows for more chances at getting significant results by chance.
      ruclips.net/video/Gx0fAjNHb1M/видео.html

    • @Merahki3863
      @Merahki3863 4 years ago +2

      @@TheRABIDdude one way is to do more studies and cherry-pick the ones that fit your agenda and only publish those results. Generally the point of a larger sample is to make it realistically representative of a population, but adding more biased data isn't going to make the study more credible. In fact it will become more misleading.
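One way to see why extra "chances" inflate false positives: if each of 20 independent null tests has a 5% false-positive rate, the odds that at least one comes up "significant" are far higher than 5%. A small simulation (illustrative only, no real data involved):

```python
import random

random.seed(42)  # reproducible illustration
ALPHA, TESTS, TRIALS = 0.05, 20, 100_000

# Count how often at least one of 20 null "tests" comes up significant.
hits = sum(
    any(random.random() < ALPHA for _ in range(TESTS))
    for _ in range(TRIALS)
)
print(f"simulated: {hits / TRIALS:.3f}")
print(f"analytic:  {1 - (1 - ALPHA) ** TESTS:.3f}")  # about 0.642
```

So running 20 looks at null data gives roughly a 64% chance of at least one spurious "discovery", which is the core of the multiple-comparisons problem the video describes.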

    • @TheRABIDdude
      @TheRABIDdude 4 years ago

      David Fields Thanks for the link :) I get the impression then that when she said "collect more data" she meant collect *more detailed* data so that you can do many comparisons/tests, and then choose not to correct for family-wise error in order to greatly raise the chance of finding any significant results to talk about?

  • @nitrogenfume9762
    @nitrogenfume9762 5 years ago +326

    All I remember from AP Stats:
    "If the p is small, reject the Ho."

    • @pizzas4breakfast
      @pizzas4breakfast 5 years ago +23

      Okay, let's reject da ho! Beeeeeeeech

    • @KILLRXNOEVIRUS
      @KILLRXNOEVIRUS 5 years ago +2

      @@pizzas4breakfast Yee

    • @arnabsarkar4735
      @arnabsarkar4735 5 years ago +2

      Alternatively "If the p is small, reject the H-nought/null/zero". Sorry I am no fun at parties.

    • @dickJohnsonpeter
      @dickJohnsonpeter 5 years ago

      @@arnabsarkar4735 Or reject the Haught.
      (H-aught) but I suppose you could argue that aught doesn't _technically_ mean zero...

    • @jakezepeda1267
      @jakezepeda1267 5 years ago +6

      Usually Hos reject me.
      I would ask why this Ho has a p, but it IS 2019 after all.

  • @8cordas381
    @8cordas381 5 years ago +20

    I am a medical doctor, and I will keep showing this video to the many colleagues who don't have that insight when using studies to make decisions. Loved it, thank you.

    • @frankschneider6156
      @frankschneider6156 5 years ago

      MDs aren't scientists (unless they do this full-time, and then they know it anyhow), so that's carrying owls to Athens.

    • @8cordas381
      @8cordas381 5 years ago

      @@frankschneider6156 No, but MDs get thrown at a lot of studies to guide our decisions, and yes, we do read them, being outdated is not allowed in our job. One current awful consequence of statistics misuse misguiding MDs is the opioid crisis, in plain sight.

    • @frankschneider6156
      @frankschneider6156 4 years ago

      8cordas
      Yes I agree, but a single study isn't worth the paper it's printed on. It's rather the ratio of the cumulative number of papers in favor of something vs. those negating it that's important. A single paper (even if absolutely thoroughly executed) is rarely sufficient to base decision-making upon. And that's of course far more true if the authors are biased and hell-bent on getting a certain result.

    • @8cordas381
      @8cordas381 4 years ago

      @@frankschneider6156 That is the right way, but that is exactly where the danger and manipulation lie: in the methodology of how meta-analyses choose which studies to use. Inclusion criteria can be tweaked so that studies which don't have the result you want don't qualify for the meta-analysis. I see your point, and in an honest world things should work the way you describe, but some people would do anything for extra cash, and those few people are enough to mess up a whole system.

    • @frankschneider6156
      @frankschneider6156 4 years ago +1

      8cordas
      I meant the cumulative amount of papers, not meta studies. In theory meta studies should be a great thing significantly increasing the data set and thus the accuracy of the result, but in practice every study has undocumented properties and boundaries that often the researcher himself isn't even aware of. So mixing data (possibly gathered for different purposes with different technologies, different levels of detail, different environments or populations) from lots of different studies typically just mixes apples and pumpkins and out comes ... well .. garbage (GIGO, garbage in, garbage out) and that's still assuming the team conducting the meta-analysis to be well meaning, honest and skilled. So I perfectly share your critical view of meta studies. I haven't seen a single one (at least as far as I can remember) that I would trust farther than I could throw a truck.

  • @brentrawlins6490
    @brentrawlins6490 5 years ago +489

    As a statistician, it is sad to see such a potentially powerful tool be misused so much.

    • @jackielinde7568
      @jackielinde7568 5 years ago +14

      As a statistician, do you have polyhedral dice and how often do you abuse statistics when playing D&D? ;)

    • @interstellarsurfer
      @interstellarsurfer 5 years ago +10

      Cooking the books is a problem as old as... books! 😋

    • @brentrawlins6490
      @brentrawlins6490 5 years ago +13

      @@jackielinde7568 Yes, and I roll in the open with witnesses. Also, what is the point of playing a game if you're going to cheat? In my experience, failing at something can be just as entertaining as succeeding.

    • @jackielinde7568
      @jackielinde7568 5 years ago +6

      @@brentrawlins6490 Oh, I wasn't saying you fudge your rolls. I was "suggesting" that you run the numbers for probabilities of success. I've seen players do that. Not saying Min Maxing is wrong when playing D&D, but it's just not my cup of tea. :D

    • @brentrawlins6490
      @brentrawlins6490 5 years ago +4

      @@user-jp1qt8ut3s If it's possible to switch the wording from "significantly different" to "fundamentally different", I might get you out of having to find the p-value.

  • @film9491
    @film9491 5 years ago +400

    I love how petty the origin of p value is. I never heard that story before

    • @sohopedeco
      @sohopedeco 5 years ago +9

      I still wonder how the woman sensed the order of pouring of her cup.

    • @marin0the0magus
      @marin0the0magus 5 years ago +18

      @@sohopedeco Eh, perhaps there is something in the way the different beverages mix, or how the sugar in the milk reacts with the tea, maybe? Some people can be very sensitive about their tea, from the type of leaves to the water type and temperature to the time the leaves were infused before serving... So it wouldn't surprise me if that was the case.

    • @marin0the0magus
      @marin0the0magus 5 years ago +16

      @@sohopedeco "Milk should be added before the tea, because denaturation (degradation) of milk proteins is liable to occur if milk encounters temperatures above 75°C."
      Huh. Would you look at that o:

    • @limiv5272
      @limiv5272 5 years ago +7

      @@marin0the0magus I was thinking it could be related to the cup's temperature. If the milk is added first the cup is still cold, but if the tea is added first the cup is very hot when the milk is added so it's surrounded by heat from all sides. This is obviously not a well formulated explanation. My dad loved to do these kinds of experiments with me when I was little because I'm a very picky eater and he didn't believe me that things were different and thought I was just being stubborn. Then, of course, I proved to him I can tell the difference between 3% and 5% white cheese and water from the tap and water that went through a filter (-:

    • @eagle3676
      @eagle3676 5 years ago

      @@marin0the0magus yes if you're a tea addict, you can notice small differences

  • @Greg5MC
    @Greg5MC 5 years ago +145

    This video should be mandatory viewing for every science class.

    • @ErroneousTheory
      @ErroneousTheory 5 years ago +7

      Every science class? Every human

    • @delphinidin
      @delphinidin 4 years ago +2

      Every scientific journal... and science graduate program... and university science department...

  • @SingularityasSublimity
    @SingularityasSublimity 5 years ago +86

    A very important topic that not enough people (including scientists) consider. The limitation of p-values focused on in this video is Type I errors (wrongly rejecting the null hypothesis). However, Type II errors (wrongly failing to reject the null hypothesis) are very problematic as well. Let's say you get a p-value of .25, which is well above the threshold set by Fisher. That just means data this extreme would arise by chance 25 percent of the time if there were no real effect; it doesn't rule a real effect out. Usually this outcome is the result of small sample sizes, but not necessarily, and it can lead researchers to stop considering a legitimate finding that just happened not to meet this p-value criterion, which would also be a shame if we are talking about a potential treatment that can help or save lives. Beyond Bayesian stats, effect size stats are also very helpful here.
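To make Type II errors concrete, here's a rough power sketch for a one-sided z-test; the function `type2_error` and the numbers fed to it are hypothetical illustrations, not anything from the video:

```python
from math import sqrt
from statistics import NormalDist

def type2_error(effect_d: float, n: int, alpha: float = 0.05) -> float:
    """Chance of missing a real effect (Type II error) in a one-sided z-test."""
    z_crit = NormalDist().inv_cdf(1 - alpha)  # significance cutoff
    # Under the alternative, the z statistic is centered at effect_d * sqrt(n).
    return NormalDist().cdf(z_crit - effect_d * sqrt(n))

# The same modest standardized effect (d = 0.3) is missed most of the
# time at n = 20 but almost never at n = 200.
print(round(type2_error(0.3, 20), 3))
print(round(type2_error(0.3, 200), 3))
```

This is exactly the small-sample problem the comment describes: a real effect plus an underpowered study routinely yields a non-significant p-value.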

    • @jeffreym68
      @jeffreym68 5 years ago +7

      I am always surprised to see how few fields are calculating and publishing effect sizes. I used to think that was the default, rather than the outlier.

    • @SingularityasSublimity
      @SingularityasSublimity 5 years ago +2

      It is completely shocking

    • @entropiCCycles
      @entropiCCycles 5 years ago +9

      I'm reminded of some summary research in Psychology as a field (it may have been the big replication attempt or some other bit of meta-research), where they found that, for studies that used the often-cited alpha of .05, the power of such tests was about .06.
      I'm also reminded of a professor's talk from back in graduate school where they showed that, with sample sizes common in Psychological research, Ordinary Least Squares regression was outperformed, not only by equal weights (i.e. every predictor had the same slope term), but by *random* weights.

    • @randylai-yt
      @randylai-yt 5 years ago +5

      The real difficulty is when multiple tests are involved: the interpretation of effect sizes is no longer calibrated. On the other hand, p-values at least could still be adjusted to account for the inflation of Type I error.

    • @piguyalamode164
      @piguyalamode164 4 years ago

      @@entropiCCycles Wow, your line of best fit being worse than random. Impressive!

  • @fermat2112
    @fermat2112 5 years ago +216

    That is either a very small machine or a very large fish.

    • @CanuckMonkey13
      @CanuckMonkey13 5 years ago +72

      She said MRI, but I think she MEANT fMRI, which as we all know is a fish MRI, sized especially for fish.

    • @emilchandran546
      @emilchandran546 5 years ago +8

      Linda some salmon can be pretty big, especially going back a bit.

    • @dontbotherreading
      @dontbotherreading 5 years ago +1

      It's both

    • @MrWombatty
      @MrWombatty 5 years ago +5

      As they weren't going to cook the salmon, it didn't need to be scaled!

    • @cillyhoney1892
      @cillyhoney1892 4 years ago +1

      Salmon can get huge. I've seen and got to partake in eating a five foot long salmon IRL. It fed over fifty people!

  • @johncarlton7289
    @johncarlton7289 5 years ago +30

    This is probably the best video you guys have done in more than a year.

  • @TechnoL33T
    @TechnoL33T 5 years ago +96

    9:20 is such an AMAZING idea! Kill the drive for success in publishing! Incentivizing skewing results for attention is so bad, and this is definitely the fix for it!

    • @drdca8263
      @drdca8263 5 years ago +8

      Just confirming that you aren’t being sarcastic

    • @TechnoL33T
      @TechnoL33T 5 years ago +11

      @@drdca8263 Not at all! I suppose this could look like exaggerated enthusiasm, but I find the idea to be legitimately exciting!

    • @drdca8263
      @drdca8263 5 years ago +2

      MangoTek Thank you for confirming! I largely agree. Well, I definitely agree that it is promising, less sure that it is the “One True Solution” in practice? Definitely agree that it is a theoretically really nice solution, by entirely bypassing the incentives there, and it would be really cool if it works out well in practice, and there is a good chance that it will.

    • @TechnoL33T
      @TechnoL33T 5 years ago +6

      @@drdca8263 it may not be perfect, but it's a whole world ahead of what we're doing now. I don't see any downsides that aren't already dramatically worse right now.

    • @drdca8263
      @drdca8263 5 years ago +2

      MangoTek I think it is likely to work, but let me spitball some potential (potential in the sense of “I can’t rule them out”, not “others can’t rule them out”) issues. This setup would result in a larger number of studies published with null results (and not just interesting null results). Therefore, in order to have the same number of studies with interesting results, this requires a greater total number of studies published.
      Reviewing the proposals takes time and effort. If we for some reason cannot afford to increase the amount of effort spent on reviewing papers before publication, and so can’t increase the rate of papers being published (this sounds unlikely? Like, probably not actually a problem), then this would result in a lower rate of papers with interesting and accurate results?
      Which, could very well be worth it in order to eliminate many of the false results, but exactly where the trade-off between “higher proportion of published results are correct” vs “higher number of correct published results” balances out, idk.
      But yes, I agree it sounds like very good idea, should be tried, hopes it works out.

  • @MyBiPolarBearMax
    @MyBiPolarBearMax 3 years ago +13

    Science: "Double-blind studies are the gold standard because they eliminate the bias of the researchers' preferred outcome!"
    Also science: "We don't need two-step publishing!"

  • @jeffreym68
    @jeffreym68 5 years ago +161

    I agree with the two-step process. I hate the idea of killing statistical significance just because some people use it incorrectly, either because they misunderstand it or, much worse but hopefully much rarer, because they are purposely misusing it. I'm boggled by the number of times I have to explain, even to scientists, that you have to set your p-value threshold FIRST, typically using similar studies as a guide, THEN analyze the data and interpret the results. Perhaps one solution is more and better teaching of the topic. Amazingly, some fields of graduate study do not require expertise in psychometrics.

    • @NeoAemaeth
      @NeoAemaeth 5 years ago +3

      I guess you mean α not p?

    • @jeffreym68
      @jeffreym68 5 years ago +2

      @@NeoAemaeth Actually, I used an abbreviation for the phrase "setting the probability that the results will be due to chance with which we are comfortable in this experiment" because I thought it was more understandable to the general reader. My apologies if it had the opposite effect.

    • @benderrodriguez142
      @benderrodriguez142 5 years ago +9

      The real issue is not setting the p-value threshold ahead of time but manipulating or eliminating data to push the value below 0.05. As a scientist who reports to a p-hacker at work, it is a major issue.

    • @jeffreym68
      @jeffreym68 5 years ago +2

      @@benderrodriguez142 I definitely agree that it's a huge problem, and have been employed by a person who did this (briefly, obviously). But I have more often been hired by people who honestly didn't know how the process SHOULD work. In my experience, making people commit to whether they will use .01, .05, etc. ahead of time fixes the problem with people reporting a mix of p-values because they don't know better. Short of professional shunning, reviewers asking pointed questions, or changes in ethics, I'm not sure what to do about p-hackers.

    • @benderrodriguez142
      @benderrodriguez142 5 years ago +1

      @@jeffreym68 that makes sense. Guess I haven't run into too many people that didn't understand the process. Although I know a few that act like they didn't understand what they were doing was wrong, knowing full well it was being misused. Can't wait to get a new job, as I feel dirty every time I leave work. My boss also tried to put ** and then label that as 0.1 to trick people that it is really 0.01, and whatnot. Some people lack ethics.

  • @insertfunnyhandlehere
    @insertfunnyhandlehere 5 years ago +127

    Heat changes the flavor of dairy products at relatively low temperatures; just the act of the tea being cooled by the cup before mixing can make a subtle change in your tea.

    • @MrTheWaterbear
      @MrTheWaterbear 5 years ago +4

      But it's by mere degrees difference. It's not impossible, but it's very strange if that were the reason... I mean, unless the cups are super cold.

    • @dejayrezme8617
      @dejayrezme8617 5 years ago +14

      Answering the real questions about this video haha.
      It makes sense, pouring hot tea into milk will lead to a different temperature difference. The milk will get into contact with far more hot water molecules when tea is poured last, not just because the cup isn't cooling it but because you mix the milk and hot tea constantly while pouring.
      It might also be that you end up with smaller suspended fatty milk droplets.

    • @MrDrakkus
      @MrDrakkus 5 years ago +8

      I was about to say something similar! If you start with the tea first, the heat of the tea will "cook" the dairy as you pour it faster than the dairy cools the tea. If you start with the dairy first, then it will cool the tea faster than the tea will cook the dairy. At least, up until you stop pouring and the temperature averages out. Starting temperature and ending temperature would probably be the same either way. The important bit though is that when starting with the dairy, that initial bit of cooling faster than heating will mean less cooked dairy overall, which will have a slightly different flavor and texture. I wouldn't be surprised at all if it was enough to be noticeable.

    • @insertfunnyhandlehere
      @insertfunnyhandlehere 5 years ago +7

      It's actually not so unusual, as it's the protein breakdown caused by the heat, and proteins in dairy products don't break down the same under 200 °F as they do over 200 °F, and the difference between tea in the pot versus tea in a room-temperature ceramic cup can be as much as 10 °F in the 195-205 °F range. I think Good Eats goes over this in more detail in the milk episode.

    • @interstellarsurfer
      @interstellarsurfer 5 years ago +7

      I believe it's the temperature sensitive chemical reactions between the tea and milk, that are responsible. They're more pronounced when adding milk to hot tea, than when adding tea to a chilled cup of milk.
      In the same way that adding acid to water is 👌, but adding water to acid can be ☠

  • @daniquasstudio
    @daniquasstudio 5 years ago +111

    7:39 I could be wrong, but don't you mean a p-value of 0.05 on the right side?

  • @agnosticdeity4687
    @agnosticdeity4687 5 years ago +16

    I would like to point out ( in my most pretentious British accent) that adding the milk to a hot or near boiling cup of tea "shocks" the milk because of the sudden change in temperature, whereas adding the milk first and then the tea raises the temperature slowly and this (according to my old boss) has an effect on the taste.
    Also I have to admire the intelligence of this scientist. That is a very smart way to get a free whole salmon ;-)

  • @mschrisfrank2420
    @mschrisfrank2420 5 years ago +55

    Also, statistically significant result or not, always ask what the effect size was.

    • @DAMfoxygrampa
      @DAMfoxygrampa 3 years ago

      The bigger the effect, the more significant the data?

    • @ihbarddx
      @ihbarddx 3 years ago

      @@DAMfoxygrampa A small effect size might make a significant result irrelevant. In BIG DATA(tm), for example, very many relationships are significant, but far fewer relationships are important.
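The point generalizes: with a big enough sample, even a trivial effect clears p < .05. A hypothetical z-test sketch (the effect size `d` and the sample sizes are invented for illustration):

```python
from math import sqrt
from statistics import NormalDist

d = 0.02  # tiny standardized effect, practically irrelevant
for n in (100, 10_000, 1_000_000):
    z = d * sqrt(n)                     # one-sample z statistic
    p = 2 * (1 - NormalDist().cdf(z))   # two-sided p-value
    print(f"n={n:>9,}  p={p:.4f}")
# The effect size never changes, but p drops below 0.05 purely
# because n grows.
```

This is why reporting the effect size alongside the p-value matters: significance alone says nothing about whether a relationship is large enough to care about.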

  • @crazykaspmovies
    @crazykaspmovies 5 years ago +61

    Ah yes, rolling a critical fail on your research and submitting a false positive

  • @SaltpeterTaffy
    @SaltpeterTaffy 5 years ago +63

    This is one of the best episodes of SciShow I've seen in a long time. No wasted time whatsoever. :D Reminds me of the SciShow of a few years ago when Hank was flying by the seat of his pants and talking slightly too fast.

    • @econgradstudent4069
      @econgradstudent4069 3 years ago

      No it is not! Their definition of a p-value is wrong, and the example of the lady tasting tea is also incorrect in its calculation of a p-value!

    • @SaltpeterTaffy
      @SaltpeterTaffy 3 years ago

      @@econgradstudent4069 Well, I was thinking more about the presentation than the accuracy of the information. It's well paced, doesn't spend much time with lame jokes.

    • @christy7955
      @christy7955 3 years ago

      @@econgradstudent4069 Fascinating, what are the correct definitions and p-values?

    • @realstatistics5376
      @realstatistics5376 3 years ago +1

      @@christy7955 The correct definition is: the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct.
      So they state the probability of exactly what was seen, but it should be "at least as extreme as what was seen".
      For their lady-tasting-tea example they get the right p-value by chance, since there is nothing more extreme than getting every cup right. But that is luck; under their definition any outcome would have the same "p-value".
      And for continuous tests their definition is meaningless, since it would always be zero (continuous distributions have infinitely many outcomes).
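That "at least as extreme" tail can be computed for the tea design itself. A sketch, assuming the standard 8-cup setup (4 milk-first cups, and a taster who must pick exactly 4), so the count of correct picks is hypergeometric under the null:

```python
from math import comb

def tea_p_value(k: int) -> float:
    """P(at least k of the 4 milk-first cups are correctly picked) under the null."""
    total = comb(8, 4)  # 70 equally likely ways to choose 4 cups out of 8
    # x = number of truly milk-first cups among the 4 chosen
    return sum(comb(4, x) * comb(4, 4 - x) for x in range(k, 5)) / total

print(tea_p_value(4))  # 1/70 ≈ 0.014: all correct, the case in the video
print(tea_p_value(3))  # 17/70 ≈ 0.243: the "3 or more correct" tail
```

As the comment notes, for k = 4 the point probability and the tail coincide, which is why the video's number happens to be right even though its stated definition skips the "at least as extreme" part.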

    • @christy7955
      @christy7955 3 years ago

      @@realstatistics5376 Thanks for taking the time to comment! I'm trying to get better at stats but my goodness it can be tricky.

  • @Overonator
    @Overonator 5 years ago +113

    Bayesian analysis, along with effect sizes, is the best alternative. This is why we have a replication crisis, why we have so many false positives, and why we have (edit: "statistically") significant results with tiny effect sizes.

    • @gardenhead92
      @gardenhead92 5 years ago +16

      If we started using Bayesian analysis we'd just have "prior hacking" :D

    • @SolarShado
      @SolarShado 5 years ago +11

      "significant results with tiny effect sizes"
      This has to be one of the worst cases of jargon being misunderstood by those unfamiliar with it that I've seen. To be fair, it's also one of the wider gulfs between the common meaning and the technical meaning. It really drives home the importance of actually understanding the terminology you're reading, or being sure you're getting your information from someone who does and can 'translate' for the layperson.

    • @jeffreym68
      @jeffreym68 5 years ago +8

      @@SolarShado So common that people misunderstand these terms and come away with the wrong picture of the research. Short of earlier or more widespread teaching of research methods & statistics, I'm not sure how to bridge that gap.

    • @SolarShado
      @SolarShado 5 лет назад +10

      @@jeffreym68 My first reaction is "more people should be taught research methods and statistics", but I know, practically, that even if we tried, it probably wouldn't stick. There's very little reason for the average person to need that knowledge in their daily lives. I think the solution is more/better science reporting, like what scishow does. Though I don't have much hope that they'll ever manage to drown out the more sensationalist voices...

    • @Overonator
      @Overonator 5 лет назад +1

      @@SolarShado Am I not understanding something?

  • @SuperCookieGaming_
    @SuperCookieGaming_ 5 лет назад +25

    I wish you could have made this years ago when I was taking statistics. You explained the concept so well. It took me a week to wrap my head around why we used it.

  • @corlisscrabtree3647
    @corlisscrabtree3647 5 лет назад +8

    Awesome video. Truly appreciate it. An excellent review of all the things my committee told me when I was doing my dissertation research! I hope you can find a sponsor to discuss sample size and power next.

  • @TheNewsDepot
    @TheNewsDepot 5 лет назад +36

    Just to be fair, if you pour a hot liquid into cool milk, you're less likely to curdle the milk.
    But if you pour cool milk into hot liquid, like freshly steeped tea, you are pretty likely to curdle the milk. Someone with a fine enough palate might actually be able to tell the difference.

    • @jacksonpercy8044
      @jacksonpercy8044 5 лет назад +2

      If the temperatures are the same, how is there any difference which one is added first?

    • @TheNewsDepot
      @TheNewsDepot 5 лет назад +8

      @@jacksonpercy8044 The temperatures aren't the same.
      If the tea is piping hot, as it should be and the milk is cool or room temperature, then you want to have the tea added slowly to the milk so it does not curdle.
      If you add the milk to the tea, the tea transfers way more heat to the relatively little milk being added in the first second of the pour and that milk will curdle.
      It's the same reason you temper eggs when adding egg whites to a heated mixture. You add some of the heated mix to the eggs first to slowly raise their temperature and then slowly add them to the rest of the mix. Just dumping them in makes them cook and harden.
      It's very possible this woman could tell the difference when the tea and milk were mixed improperly.

    • @jacksonpercy8044
      @jacksonpercy8044 5 лет назад +3

      Ahh, that makes sense. I should have realised it was about heat transfer with different volumes of liquid. I make my tea with a tea bag in the cup I intend to drink from, so I've never really thought about using milk first. Speaking of which, would it even be possible to brew tea in milk while slowly adding hot water?

    • @TheNewsDepot
      @TheNewsDepot 5 лет назад +2

      @@jacksonpercy8044 I think using a tea bag is what the British call a high crime. :D
      I make my tea with coffee, hold the tea.
      I don't think you could get the milk hot enough to diffuse the tea without curdling it.
      The reason you heat the water is to allow space between its particles for the particles of tea to get into. In milk that space already has fats, so it would have to be really hot to get the tea properly dispersed in the liquid.
      Maybe something like soy milk or almond milk would have enough of a heat tolerance, but everyone better be able to tell the difference in taste then.

    • @TheNewsDepot
      @TheNewsDepot 5 лет назад +2

      @@jacksonpercy8044 This is why I should look things up before hypothesizing. You could likely cold brew the tea in milk the same way cold brew coffee is made. Takes a lot longer for things to equalize, but should be possible.

  • @coolsebastian
    @coolsebastian 5 лет назад +12

    This was a very interesting episode, great job everyone.

  • @pharmdiddy5120
    @pharmdiddy5120 5 лет назад +11

    Soooo glad to see negative results published in major medical journals these days!

  • @AlexComments
    @AlexComments 3 года назад +2

    I took Business Statistics in college last semester, and it's wild how much more sense this makes than the intro lecture on hypothesis testing that I sat through months back.

  • @vice.nor.virtue
    @vice.nor.virtue Год назад +2

    That experiment with the cups of tea is literally the most British piece of science I’ve seen in my whole life

  • @NthMetalValorium
    @NthMetalValorium 5 лет назад +7

    Sounds like there's a bigger problem of scientists getting pressured to publish significant results.

    • @LimeyLassen
      @LimeyLassen 5 лет назад +1

      butterfly meme
      _is this capitalism_

  • @jamesmnguyen
    @jamesmnguyen 5 лет назад +15

    P-Values have basically become an example of reward-hacking.

    • @ValeriePallaoro
      @ValeriePallaoro 5 лет назад +2

      that's what she said ...

    • @jamesmnguyen
      @jamesmnguyen 5 лет назад

      @@ValeriePallaoro That literally does not apply to this comment.

    • @tonyrandall3146
      @tonyrandall3146 3 года назад

      @@jamesmnguyen *teleports behind you*

  • @frankschneider6156
    @frankschneider6156 5 лет назад +2

    The first video in a long time, that honors the name SciShow. Keep this level up.

  • @ThinkLikeaPhysicist
    @ThinkLikeaPhysicist 5 лет назад +2

    This is why, in particle physics, we use the 5-sigma criterion (a p-value of 3x10^-7) for discovery. A p-value is one of the most useful tools in reporting scientific results, as long as you use it correctly! If you want to know more, we've got some good statistics videos over at our channel Think Like a Physicist.

  • @masterimbecile
    @masterimbecile 5 лет назад +36

    Statistical significance doesn't necessarily mean actual significance.

    • @cablecar10
      @cablecar10 5 лет назад +1

      @@user-jp1qt8ut3s I'm curious, what's something you consider actually "significant" that can't be demonstrated statistically?

    • @robertt9342
      @robertt9342 5 лет назад

      masterimbecile, I have been in statistics courses where they actually cover that point.

    • @masterimbecile
      @masterimbecile 5 лет назад +1

      @@cablecar10 Sometimes it may be issues with the statistical analyses/ experimental design itself. For instance, a truly beneficial drug might not be shown to have statistically significant result simply due to biased/underpowered sample collection.

    • @masterimbecile
      @masterimbecile 5 лет назад +2

      @@cablecar10 Other times, maybe the statistical analysis might be looking at the wrong number/ something else could be significant but not accounted for by the researchers.

    • @masterimbecile
      @masterimbecile 5 лет назад +1

      @@cablecar10 Just remember: the p-value is simply one decision tool at the end of the day, and a damn elegant one at that. Something can be significant and worth pursuing regardless of what the p-value suggests.

  • @kirjakulov
    @kirjakulov 5 лет назад +42

    As my supervisor says: statistical significance does not mean biological significance.
    You always have to be very very careful interpreting the data and stats. 👍

  • @TesserId
    @TesserId 3 года назад +2

    This is great. I was actually wanting to see a double blind test of tea/milk order. I also want to see one about microwaving tea, and another on squeezing tea bags.

  • @nathanwestfall6950
    @nathanwestfall6950 3 года назад

    Great video! "Publish or Perish " is a mantra I have heard chanted in quite a few institutions. I have never heard "discover the truth" or "do something useful" said though. Maybe all that's needed is a catchy phrase to encourage more academic honesty/integrity.

  • @Alvarin_IL
    @Alvarin_IL 5 лет назад +5

    The recognition at the end is the true "P" value :)
    Love these "science-insider" episodes!

  • @Narokkurai
    @Narokkurai 5 лет назад +8

    Good god, that's a satisfying milk pour at 3:49

  • @gracecalis5421
    @gracecalis5421 4 года назад +2

    Me at 8pm: I should really go to sleep
    Me at 3am: _SaLmON iN aN fMrI_

  • @benedictifye
    @benedictifye 3 года назад +1

    I believe the point of pouring tea milk first is that the change in temperature of the cup is more sudden when you pour boiling water in it, so the cup is more likely to shatter if it’s not resistant to the temperature change. Putting the milk first and then warming it with tea protects the cup from such a drastic swing in temperature

  • @iKhanKing
    @iKhanKing 5 лет назад +4

    Here's the problem with multiple comparison corrections. If you have a survey with 20 different items and run the data, you have to do a correction, and your P-values are microscopic.
    On the other hand, if 20 different researchers each run a study with one different survey item each, those researchers only have to use the 0.05 P-value.
    There is an equal probability of one of those 20 studies rejecting the null and one of the 20 different items rejecting the null, but the threshold is vastly different.
    Doctors deal with this in medicine every day in another way. They use scientific judgment to interpret data. If you order 30 pieces of data, and one of them comes back mildly out of normal range, it's very possible it's a false positive. However, if that's a specific value you are worried about, then it may be worth investigating further. We don't alter the normal ranges.
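
    The 20-item scenario above can be sketched numerically (assuming independent tests; the Bonferroni correction shown is one standard fix, not the only one):

```python
alpha, n_tests = 0.05, 20

# Run 20 independent null tests at alpha = 0.05 and the family-wise
# chance of at least one false positive balloons:
family_wise = 1 - (1 - alpha) ** n_tests   # ~0.64

# Bonferroni shrinks the per-test threshold so the family-wise rate stays ~alpha:
per_test_threshold = alpha / n_tests       # 0.0025
print(round(family_wise, 2), per_test_threshold)  # 0.64 0.0025
```

    This is exactly the asymmetry the comment points out: one lab running 20 comparisons must clear 0.0025 per test, while 20 labs running one comparison each all use 0.05, even though the overall false-positive risk is the same.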

  • @snowyh2o
    @snowyh2o 5 лет назад +8

    Why couldn’t this come out when I was actually taking statistics? This is literally the last half of the second midterm XD

  • @contrarianduude3463
    @contrarianduude3463 5 лет назад +2

    The fish was making "eyes" at me the whole time during the MRI. How do you tell a dead fish I'm just not that in to you?

  • @jp4431
    @jp4431 5 лет назад +2

    I had an epidemiology prof keep telling us not to focus on p-values, but on confidence intervals and effect sizes (clinical significance).

  • @joegillian314
    @joegillian314 5 лет назад +13

    That's not a correct definition of a p-value. The meaning of a p-value is the probability of getting a result at least as extreme as your data, under the assumption that the null hypothesis is true. To say that it is the probability of the data occurring at random is not exactly right because you cannot forget the assumption of the null hypothesis being true.
    Additionally, the evaluation of a p-value is based on the level of significance (alpha) which is entirely determined by the experimenter(s). [There are some conventions when it comes to choosing a level of significance, but ultimately a person can choose whatever value for alpha they want].

    • @imranrashid8615
      @imranrashid8615 5 лет назад +6

      Joe Gillian .. we get it you took high school stats. They gave a good and concise summary in everyday language

    • @gardenhead92
      @gardenhead92 5 лет назад

      Moreover, since this is probability we're talking about, *ALL* data occur at "random", by definition.

    • @fujihita2500
      @fujihita2500 5 лет назад

      Keep using that word, I don't think the significance level means what you think it means

    • @npip99
      @npip99 5 лет назад +3

      Adding complex phrasing doesn't add content. She fully explained that they were calculating the odds "assuming she couldn't tell the difference between the two types of tea". Just because you decided to call that sentence a "null hypothesis" doesn't mean the original explanation was wrong, nor does it mean you're learning anything by memorizing more terminology as opposed to trying to learn the actual concept instead. This is the epitome of how the school system manages to supposedly teach "something", but in fact teaches nothing of real content at all. It's just memorization. 3:52 is the definition; again, "even if the effect they're testing for doesn't exist" is the logical, reasonable, and easily understood way to say "assuming the null hypothesis"

    • @Lucky10279
      @Lucky10279 5 лет назад

      They did say "in a nutshell."

  • @DharmaJannyter
    @DharmaJannyter 5 лет назад +3

    As a first test I would've just given her 8 cups of one type but told her it was 4 cups of each. :P
    That should lower the chances of her getting them all right by merely guessing, no?

  • @m0n0x
    @m0n0x 5 лет назад

    I had a hard time understanding why p-hacking is such a big deal, but now it's all crystal clear. Thank you!

  • @jmonteschio
    @jmonteschio 5 лет назад +1

    This video is easily the best recommendation RUclips has made to me for watching in a long time. Great video, and I really hope that all scientific journals completely switch over to the "decide whether or not to publish first" method.

  • @EricSartori
    @EricSartori 5 лет назад +3

    Great video. Will be sharing with my colleagues.

  • @trisstock9047
    @trisstock9047 5 лет назад +15

    The statistical probability that Earl Grey tea should be drunk with milk at all is vanishingly small.

    • @jeffreym68
      @jeffreym68 5 лет назад +2

      I'm British. That probability is, in fact, quite high, even for those of us who like Picard.

    • @xplinux22
      @xplinux22 5 лет назад +2

      Also ask anyone in southeast Asia or in the Indian peninsula, and you'll find all sorts of milk teas to be exceedingly popular.

    • @frankschneider6156
      @frankschneider6156 5 лет назад

      True, as we all know, the only way to properly drink tea is cold, mixed with red bull and ice cubes.

  • @persinitrix
    @persinitrix 4 года назад

    Coming from an "aspiring" industrial and systems engineer a few dots were connected that were left distant from the few statistics and probability classes i have taken at university. Hypothesis testing and Bayes Theorem have made a bit more sense to me. I praise You

  • @bryanandrews5214
    @bryanandrews5214 5 лет назад +1

    Okay, so the confound in the study was the temperature of the cup. When you start with the colder milk, it lowers the temperature of the cup which starts it at a lower point when the tea is poured into the cup.

  • @chadchucks6942
    @chadchucks6942 5 лет назад +6

    Man I clicked this hoping to learn about a fish

  • @MarvelX42
    @MarvelX42 5 лет назад +3

    "There are three kinds of lies: lies, damned lies, and statistics."

  • @rollinwithunclepete824
    @rollinwithunclepete824 5 лет назад

    A very good video. Thanks to Olivia and the SciShow Gang!

  • @QuantumPolagnus
    @QuantumPolagnus 5 лет назад +1

    Thank you, SR! I always get excited when I hear them gearing up for announcing the President of Space. You've done a lot for the show, and I think all of us long-time viewers appreciate it.

    • @SrFoxley
      @SrFoxley 5 лет назад +1

      Aaw, thanks! I'm glad you enjoy the show so much, eh! And, again, I just want to point out that the hard-working Sci-show crew are the real heroes here, eh-- without them, there'd be none of this excellent content for us to enjoy!

  • @chinobambino5252
    @chinobambino5252 5 лет назад +5

    As someone involved in research myself, yes the system has its flaws, mainly the triple payout publishing companies get by charging publicly funded researchers to publish their work behind a paywall. However, I don't see the 2-step manuscript submission being a good idea. What if a relatively mundane study is denied publishing, yet they established some kind of incredible, unpredictable results? Would this also not lead to a loss in sample size, as journals would stop publishing repeats of past experiments (more than they already do), yet these repeats make the data more reliable?

  • @SECONDQUEST
    @SECONDQUEST 5 лет назад +4

    Of course you can tell the difference right away. It's about mixing properly

  • @fish3977
    @fish3977 5 лет назад +1

    That 2 step publishing sounds like a great thing. I'd unironically love to see more papers being published that just go "yeah, nothing interesting happened"

  • @inthso362
    @inthso362 3 года назад +1

    Hey, here's an idea: Fisher makes 3/5, 1/7, or 8/0 milk first/last, doesn't tell Bristol how many there are of each, and sees what happens.
    There, fixed it.

  • @montycantsin8861
    @montycantsin8861 5 лет назад +11

    When giving a urine sample, you can hack your p-value with eye-drops.
    I'll see myself out.

    • @greenredblue
      @greenredblue 5 лет назад +7

      I did it wrong and now my eyes really hurt.

  • @francoislacombe9071
    @francoislacombe9071 5 лет назад +30

    There are three kinds of lies: lies, damned lies, and statistics.

    • @sdfkjgh
      @sdfkjgh 5 лет назад +9

      Francois Lacombe: I remember my college statistics class. One of the first things the wild-eyed, wild-haired, gangly, crazy-ass professor told us was that Mark Twain quote, then he explained that by the end of the semester, if he did his job right, we should be able to make the numbers *dance for us.* One of the best, most fun teachers I ever had, even if I remember almost none of the subject matter.

    • @Grim_Beard
      @Grim_Beard 5 лет назад +5

      Statistics never lie. People lie with statistics.

    • @gordonlawrence4749
      @gordonlawrence4749 5 лет назад +1

      Statistics do not ever lie. You just get complete idiots that, for example, fail to understand that correlation is not causation. Usually these are writers and often they work for newspapers. Anyone who knows how the maths works can see through people bending the numbers. The one exception is the MMR jab causing autism fiasco, where it was not the maths that was the problem but faked results.

  • @Jcewazhere
    @Jcewazhere 5 лет назад +1

    @SR Foxley Thanks buddy, you're supporting about half the channels I enjoy :)

    • @SrFoxley
      @SrFoxley 5 лет назад

      Yay! You have good taste in channels, then, eh!

  • @Roll587
    @Roll587 5 лет назад

    Researcher here - the pressure to publish is no joke.

  • @daviddavis4885
    @daviddavis4885 5 лет назад +6

    This would have been helpful two hours ago before my Stats quiz...

  • @rdreese84
    @rdreese84 5 лет назад +5

    Earl Grey, you say? Hot, I presume...

  • @Lucky10279
    @Lucky10279 5 лет назад

    "P-value, the probability that you'd get that result if chance is the _only_ factor.". This is the clearest, most straightforward definition of the term I've ever come across. I tutor basic statistics and I'm definitely borrowing this definition to tell students what the P-value means and why it's not quite the same thing as the probability that your hypothesis is true. That one phrase has made it far more clear to me why this is the case, which will help me explain it. The textbook the school uses emphasises that the P-value is NOT the probability that the hypothesis is correct, but it doesn't clearly why.

  • @ancbi
    @ancbi 5 лет назад +2

    After 1:48 "I guess all pictures of tea and tea cups are relavant now." --- The video editor, probably.

  • @Master_Therion
    @Master_Therion 5 лет назад +3

    I wonder if the fish's ghost thought, "They're going to bring me back to life!" when it saw its body being put into the MRI. "Yes! I get to live again. I guess I'm... off the hook."

  • @acetate909
    @acetate909 5 лет назад +8

    I wonder how strongly correlated P-hacking and the replication crisis are. I bet it's something like 0.05%

    • @Grim_Beard
      @Grim_Beard 5 лет назад

      Correlations are not measured in % and a correlation of 0.05 would be extremely low.

    • @acetate909
      @acetate909 5 лет назад

      @@Grim_Beard
      Ya, I was obviously joking. It wasn't meant to be taken seriously or literally. But thanks anyway Dr. Joke D. Stroyer

  • @im.empimp
    @im.empimp 5 лет назад

    That was a great explanation of P-values. I had always wondered why the standard was 0.05, and now I know the apocryphal version of its origins. Thank you!
    I also had never heard of the 2-step manuscript method, and I _LOVE_ it!
    Years ago I had a professor who insisted we "recalculate" a study's findings until we "identified" the significant results (i.e. any combination that led to

  • @CarstenGermer
    @CarstenGermer 5 лет назад +2

    Woohoo! I finally understood this is very important information that's relevant to my interests!
    I'd completely switch to the two-step method and would suggest that, when scientists submit the first part of their study, they must submit an abstract that explains what it's all about to a generally interested audience. Now _that_ would make science more accessible.

  • @duckgoesquack4514
    @duckgoesquack4514 5 лет назад +6

    It's hard to paint the world in black and white, with shades of grey.

  • @ShubhamBhushanCC
    @ShubhamBhushanCC 5 лет назад +4

    You don't put milk in Earl Grey.

    • @metamorphicorder
      @metamorphicorder 5 лет назад +2

      Of course not. Only a barbarian would do that. You always put the earl grey into the milk.

    • @molchmolchmolchmolch
      @molchmolchmolchmolch 5 лет назад

      Maybe you don't but I do

  • @clochard4074
    @clochard4074 5 лет назад

    This is such an important topic! I hope it gets the attention it deserves!

  • @alan58163
    @alan58163 3 года назад

    Wonderful video! For a humorous and in-depth exploration of this and more, I recommend John Oliver's bit called "Scientific Studies"

  • @xKuukkelix
    @xKuukkelix 5 лет назад +6

    The video's name and thumbnail were so weird that I had to click

    • @ryank1273
      @ryank1273 5 лет назад

      Welcome to my world!

  • @Mike504
    @Mike504 5 лет назад +3

    Olivia's p value for being so amazing is 0.00000438

  • @SeanPat1001
    @SeanPat1001 3 года назад +1

    Yes! I have found virtually all stat texts emphasize P-value. One thing to bear in mind is that a P-value is a random variable. Every random variable has a confidence interval and they never report that part.
    Bayesian statistics can help, as long as there is a way to measure the probability of the alternate hypothesis. This is not always possible.
    In industry, the usual method is to select alpha and beta values, based on the consequences of making a Type I or Type II error. It’s assumed you will not always be right, but things should work in the long run.
    In all fairness, the same happens in research. People try duplicating experiments and if they get similar results, they are more sure.
    We know nothing. Everything we think we know is an educated guess. Until 1962, every chemist knew xenon was an inert gas. But Neil Bartlett proved to the world that xenon was not inert by conducting a novel experiment. This led to a realization that we didn’t understand chemical bonding as well as we thought.
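
    The alpha/beta selection mentioned above can be sketched for a one-sided z-test; the effect size, sample size, and threshold below are made-up illustrative numbers, not anything from the comment:

```python
from math import erf, sqrt

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(x / sqrt(2)))

alpha = 0.05          # chosen Type I error rate (false positive)
z_crit = 1.645        # z cutoff whose upper-tail area is 0.05
d, n = 0.5, 30        # assumed standardized effect size and sample size

beta = phi(z_crit - d * sqrt(n))  # P(Type II error): missing a real effect
power = 1 - beta
print(round(beta, 2), round(power, 2))  # ≈ 0.14 0.86
```

    Raising n or accepting a larger alpha both shrink beta, which is the long-run trade-off industry practice makes explicit.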

  • @bcddd214
    @bcddd214 5 лет назад +1

    BEAUTIFUL!
    I've been yelling the same thing at scholars for years.

  • @thinkabout602
    @thinkabout602 5 лет назад +3

    Liars figure and figures lie. I always start questioning when I hear "there's a story".
    Junk in, junk out, or so I heard... She should not have been told there were 4 & 4.

  • @cyanidejunkie
    @cyanidejunkie 5 лет назад +3

    666th like... you mad bro?
    Btw, who puts that much milk in their tea anyway?

  • @fernandoaleman607
    @fernandoaleman607 5 лет назад

    Love it. One of the best videos in a while SciShow!

  • @Cruznick06
    @Cruznick06 5 лет назад +1

    Thank you. I've never understood why we kept using it.

  • @Zeldaschampion
    @Zeldaschampion 5 лет назад +2

    SR Foxley U rock. Keep up da good work!

    • @SrFoxley
      @SrFoxley 5 лет назад +1

      Thanks Link!

  • @wackohacko24
    @wackohacko24 4 года назад

    I forgot to mention, this is one of the most amazing videos I've seen on You Tube. Thank you for covering this subject.

  • @agnosticgo
    @agnosticgo 5 лет назад

    They weren't studying the mental abilities of dead fish they were studying statisticians… Tuh-may-to Tuh-mah-to

  • @justintime970
    @justintime970 4 года назад +1

    100% of surveys show that everybody takes surveys...

  • @blazeinhotwings
    @blazeinhotwings 5 лет назад

    One thing to keep in mind is that the "gold standard threshold" of .05 depends a lot on your field of study (social sciences use higher p values like .05 and things like cutting-edge physics use much smaller p values (

  • @hammadsheikh6032
    @hammadsheikh6032 4 года назад

    This is such a difficult topic to teach, and you did a marvelous job. I will use this video in my classes.

  • @0urmunchk1n
    @0urmunchk1n 5 лет назад

    I really like the idea of journals determining whether to publish papers before results are generated. Add in eliminating p-hacking and funding for replication studies and the reliability of scientific literature will take yet another leap forward.

  • @austinmckee2117
    @austinmckee2117 3 года назад

    I took a medical statistics class in college, and know what a p value is… but this gave me such a better understanding. So thankful for scishow

  • @sarahleonard7309
    @sarahleonard7309 5 лет назад

    This is what some of us have been shouting from the wilderness for years. If you want to see my father foam at the mouth, just mention the phrase "P value" and sit back to watch the fun. I am VERY glad that this has finally become a widespread topic of conversation!

  • @MsZeeZed
    @MsZeeZed 5 лет назад +1

    Another point of the dead fish in the MRI is to understand your experimental environment. Muriel Bristol’s leaf tea was drawn from an urn (no tea bags in the UK until after WWII). Tea in an 1920s UK academic common room would be poured into china cups that have a low thermal capacity. It was traditional to put the milk in first for boiling tea, as the cool milk prevents the china cup from cracking. With an urn the tea is already boiled and steeping at around 80C, so the order of mixing with milk has no real effect, but if you put the milk in first the exterior of a china cup will still be *initially* cooler. So if freshly mixed behind Bristol’s back & handed out 1-by-1 the temperature of the cup would be noticeable. Fisher focused on rejecting the null hypothesis, but that only proved Muriel could sense how the tea was made, it does not prove she could taste the difference, even if she thought it was her sense of taste that was determining that.

    • @eljanrimsa5843
      @eljanrimsa5843 5 лет назад

      Fanciful explanation! But the significance of your data shouldn't depend on whether you can come up with an explanation you like.

    • @MsZeeZed
      @MsZeeZed 5 лет назад

      Eljan Rimsa yes, it's as impossible to say if this explanation is true as it is to judge it by taste using the p-value alone. It's more likely than the hypothesis that mixing these 2 liquids in a different order creates a different taste, using sense organs that don't work optimally in *hot* or *cold* ranges. If the water was boiling the mixing order may make a difference & it could be a different recipe that formed the conviction that there is a difference in taste. Design your experiment to standardise the tea mixing & think of how to evaluate the human factor too; that is the real science.

    • @MsZeeZed
      @MsZeeZed 5 лет назад

      Eljan Rimsa also I find the milk-first-method argument strange, as it's a tradition formed for C19th practical reasons that no longer exist for 99% of C21st tea making.

  • @paulblaquiere2275
    @paulblaquiere2275 3 года назад

    I used to do fMRI research - this is one of the (many) reasons I left. I stopped believing I was doing good science. I was encouraged to look at the data in many different ways, i.e., I'd eventually get a significant P-value. I remain fundamentally convinced my hypothesis was not correct, but that was never published or even publicised, so for all I know, other poor students have repeated the same study with the same result (but perhaps more of a willingness to play with the data). If there are rigorous fMRI researchers here, I wish you the best of luck, and I hope the culture has changed since I was doing research. I love the 'decide to publish before results' idea (I think my hypothesis was interesting! It just wasn't true).
    One element missing here on fMRI bits is that, to aid comparability across multiple subjects, all the data is 'fitted' to a standardised brain. The issue being brains are very much not standardised, but most fMRI researchers are neurologists rather than mathematicians or technologists, so they don't really understand the process by which this is done or the implications of this warping of the data. If you put a fish in there, you'll get a picture of activations on a human brain if that's the programme you run it on...

  • @sophitsa79
    @sophitsa79 4 года назад

    The new publication approach sounds brilliant