How to Find Outliers with Excel

Поделиться
HTML-код
  • Опубликовано: 28 янв 2025

Комментарии • 128

  • @dinadawood
    @dinadawood 4 года назад +12

    I had to go through a set of over 300 data points, this helped me so much, I'm so glad I found this video just in time. Thank you!!

  • @quinndougan
    @quinndougan 4 года назад +5

    Writing an Exam right now using excel and this saved my life!! Thank you!!

  • @PayneDeathz
    @PayneDeathz 4 года назад +2

    I regret not opening this video first. Great video!!

  • @rynolin342
    @rynolin342 5 лет назад +8

    This is going to help me so much on my math test. Thank you.

  • @tjdeleon13
    @tjdeleon13 3 года назад +1

    This was awesome. I had a whole page 9 columns wide, saved me so much work. Thank you

  • @isabellaforguson
    @isabellaforguson 5 лет назад +13

    This helped me so much I can’t thank you enough!!

  • @nickbabulal2526
    @nickbabulal2526 4 года назад +1

    Thank you so much this is the best video to explain this you just earned yourself a subscriber

    • @absentdata
      @absentdata  4 года назад

      Happy that you subscribed😁

  • @nerdygabeeg
    @nerdygabeeg 4 года назад +1

    Thank for making this video, you are using my favorite PP template!

  • @jorge960621
    @jorge960621 4 года назад +2

    Thank you for such a nice and patient explanation!

  • @mimiallen7873
    @mimiallen7873 3 года назад

    Sir also thank you so much for this video, gosh this made it so much simpler to understand how to calculate!!!

  • @nigelhutchinson5332
    @nigelhutchinson5332 2 года назад +1

    Thanks for this great explanation! Curious what the 1.5 factor is as it relates to the upper and lower bounds? Is this some type of outlier exclusion factor? Or just a suitable one based on the data set?

    • @absentdata
      @absentdata  2 года назад +2

      1.5 is a constant that is used in this statistical formula. Think of it like PI

    • @damionc
      @damionc Год назад +1

      @@absentdata I found that I had to change that 1.5 to 1 for the dataset I was working with for it to pick up other values I considered outliers as well. Will do some more research on that factor and what it is exactly but I had to change it to get a better result.

  • @madoomadoo6382
    @madoomadoo6382 5 лет назад

    I really love your explanations.Thanks a bunch!

  • @zg5672
    @zg5672 2 года назад +1

    Thanks dude u saved a depressed student about to end it all

    • @absentdata
      @absentdata  2 года назад

      I am glad this brightened up your day!!!

    • @zg5672
      @zg5672 2 года назад +1

      @@absentdata no. it brightened up my life.

    • @tentacleprincess5772
      @tentacleprincess5772 2 года назад +1

      Keep going dude! Stay strong!

    • @zg5672
      @zg5672 2 года назад

      @@tentacleprincess5772 i am strong now because of absent data

  • @mslx9496
    @mslx9496 4 года назад +1

    If I have no negative values, I asume that I have to use the absolute value for the lower bound, right?

  • @ynn9173
    @ynn9173 4 года назад +2

    When can outliers be positive and when can they be negative for the data set?

    • @absentdata
      @absentdata  4 года назад +1

      It depends on the data set. For example even if you had a outlier boundaries that's negative, it doesn't always apply. For examples a child birth dataset might not use any negative outliers then you would just use 0 as your range.

  • @mariaalejandraramirezmarti7724
    @mariaalejandraramirezmarti7724 2 года назад

    You are the best! really really helpful!

  • @kadichidu4171
    @kadichidu4171 2 года назад

    Please make one complete tutorial video on Excel from basic to advanced

  • @tomasvesely6447
    @tomasvesely6447 6 лет назад +9

    Thank you for the video. Just a quick question why are we multiplying by 1.5?
    Thanks!

    • @datascienceds7965
      @datascienceds7965 6 лет назад +11

      Outliers live outside the inner quartile range. By statistical definition, they live 1.5 times below or above your inner quartile range.( copied from the link he ut in the descriotin box)

  • @igi-chan
    @igi-chan 4 года назад +1

    Thank you so much! this is so well made! it explains so well! you're the best!!! :D

  • @Tfol077
    @Tfol077 2 года назад

    This was super helpful brother thank you so much

  • @lBubbleable
    @lBubbleable 3 года назад

    I found this very helpful, thank you! I'm trying to compare different business with their respective values to find the most "different" one. Is this method ok to use in that case?

    • @absentdata
      @absentdata  3 года назад +2

      In a sense yes, you may be doing a univariate analysis of the three businesses. For example you may be looking at Revenue across all the three businesses in comparing their distributions. You can compare the shape of the distribution whether it's a normal distribution or a very skewed distribution. Skewed distributions are affected by outliers.

    • @lBubbleable
      @lBubbleable 3 года назад

      @@absentdata Thank you so much!!

  • @TheVanDamminator
    @TheVanDamminator Месяц назад

    Is there a reason we'd use quartiles instead of standard deviations and the mean?

    • @absentdata
      @absentdata  Месяц назад

      IQR works for on most data including skewed data or outlier resistance; use standard deviation for symmetric, normal distribution

  • @linspositivevibesonly1046
    @linspositivevibesonly1046 Год назад

    This video helped so much❤

  • @Window20230
    @Window20230 4 года назад

    Very helpful video, a quick question: Can I use this method to emit outliers in a degradation trendline? or this method is for data that fit normal distribution or some other distribution only?

    • @absentdata
      @absentdata  3 года назад +1

      Yes, absolutely. However then you are going to be dealing with standard deviation for normal distribution. This would be a different approach

    • @Window20230
      @Window20230 3 года назад

      @@absentdata Thanks!

  • @MikeSocayre
    @MikeSocayre 5 месяцев назад

    It is a very helpful video, but what is the name of this method?

  • @biankaolivier8207
    @biankaolivier8207 5 лет назад +2

    This helped me alot, thanks!

  • @catherineshepperdley7951
    @catherineshepperdley7951 4 года назад

    Thank you - This was really helpful!

  • @eddyrealon1105
    @eddyrealon1105 3 года назад

    What is the basis for the 1.5 multiplied to the IQR? What is the rationale for that? Thank you!

    • @absentdata
      @absentdata  3 года назад +4

      This is part of the accepted original formula. Essentially quartiles can be scaled to standard deviations. The math equates 1.5×IQR to under 3 standard deviations from the mean. So the 1.5 was designed in the formula to capture the majority of outliers assuming the data is normally distributed.

  • @ragexbabayega4058
    @ragexbabayega4058 2 года назад

    What if the lower quartile is negative in value, then our IQR turns in a really large value of Q3+Q1, will this affect the values of outliers?

    • @absentdata
      @absentdata  2 года назад +1

      No it should be fine.. However it depends on the data. Should you have negatives or not might be a better question

  • @deepankumarr4256
    @deepankumarr4256 3 года назад

    How this works on positive only data. I have list of numbers coming daily say. I want to know if the trends are inline.
    What is the best way to use that. If there is any dip or spike in the trend, how to t find out.

    • @absentdata
      @absentdata  3 года назад

      Works on negative outliers also

  • @mimiallen7873
    @mimiallen7873 3 года назад

    Can you tell me in general what the reason is why a cell won't calculate?

    • @absentdata
      @absentdata  3 года назад +1

      Its hard to say without having the actual file. There are a host of reasons..
      data is not a number
      incorrect formula

    • @mimiallen7873
      @mimiallen7873 3 года назад

      @@absentdata Ok thank you for that!

  • @aqsanaveed7976
    @aqsanaveed7976 3 года назад

    can we delete the outliers and then proceed by leaving the spaces blank or do we have to fill them?

    • @absentdata
      @absentdata  3 года назад

      You could do that by using an if function

  • @sirtopinhat9217
    @sirtopinhat9217 5 лет назад

    I like this video, very helpful thankyou

  • @dileep3809
    @dileep3809 3 года назад

    Can U please prepare a proper statistics series.i went through many videos but I like the way you teach

    • @absentdata
      @absentdata  3 года назад

      I am glad you enjoy the way I teach. Yes, I want to do a purely statistic playlist. Hopefully there is enough interest :)

  • @hienle-kc6qw
    @hienle-kc6qw 3 года назад

    Can i just plug in whatever the Q1 and Q3 I want? Thank you!

    • @absentdata
      @absentdata  3 года назад

      Welll not really because that invalidates the need for calculating the quartiles. However, I think the approach you are trying to take is more maual where you are setting the boundaries for the outliers yourself.

    • @hienle-kc6qw
      @hienle-kc6qw 3 года назад +1

      @@absentdata yep, I work with big listings at my work and we look for values that out of range based on unique base values. Thank you!

  • @akifrahman
    @akifrahman 4 года назад

    I tried to do this method but I would always get negative values for the lower bound so I can't tell if there's an outlier or not

    • @absentdata
      @absentdata  4 года назад

      Probably indicates that your data is very normally distributed with no outliers

    • @akifrahman
      @akifrahman 4 года назад

      @@absentdata would you mind if I showed you my data and say if you think I have correctly identified the outliers?

    • @akifrahman
      @akifrahman 4 года назад

      @@absentdata also my data are decimals so like 0.0347 etc. Would that affect the method at all?

  • @tvoyodelo2858
    @tvoyodelo2858 4 года назад

    Extremely helpful!

  • @tammykonarski6528
    @tammykonarski6528 5 лет назад +2

    Thank you, thank you, thank you!!

  • @iroha283
    @iroha283 2 года назад

    What does it mean when u say click F4?

    • @iroha283
      @iroha283 2 года назад

      Is it on the keyboard? What if I’m using excel on an iPad

    • @absentdata
      @absentdata  2 года назад

      ⌘⌥R or Command T in Mac I believe it will help you lock the position of the cells. BTW I've never had a Mac so I am hoping Google is correct :)

  • @Hertesler
    @Hertesler 6 лет назад

    Thanks for this great video!

  • @StellaNimas
    @StellaNimas 3 года назад

    Hii! can i use this for a likert scale data?

  • @pipertaylor3641
    @pipertaylor3641 4 года назад +1

    this is a lifesaver

  • @kelsichesnut
    @kelsichesnut 6 лет назад

    Super helpful, thank you!

  • @simonbreecher8542
    @simonbreecher8542 4 года назад +1

    why do we multiply by 1.5?

    • @absentdata
      @absentdata  4 года назад +1

      This is standard multiplier for outlier detection. However you can increase it

  • @zzz11221
    @zzz11221 Год назад

    BTW dont forget to sort the data first this will be crucial at bigger datasets as it will be problematic otherwise
    2

  • @adamatraore1883
    @adamatraore1883 Год назад

    Hello, I tried to use the function "OR", I couldn't get the right answer. Instead I used the Function "AND". Thanks for the Video

    • @absentdata
      @absentdata  Год назад +1

      Thanks I'm glad you solved the issue in the end

  • @williamtully6161
    @williamtully6161 5 лет назад

    Fantastic! Awesome Thank you!!

  • @kasy26x0x
    @kasy26x0x 6 лет назад

    Thanks for the video!

  • @evachristine2138
    @evachristine2138 5 лет назад

    I have over 20,000 columns of data, how do i do this without doing each one seperately?

    • @absentdata
      @absentdata  5 лет назад

      Well a column should be seen as a unique dimension in your data. so if you have 20,000 columns. By Excel only has 16,384. I would question the structure of your data. I would see big you can melt the columns down to rows

  • @albert276
    @albert276 3 года назад +1

    how do we know we want 1.5?

  • @manas806
    @manas806 3 года назад

    how to count the number of true and false after this for the outliers

  • @loverose4732
    @loverose4732 4 года назад

    Thank you so much, sir.

  • @adewaletijani7023
    @adewaletijani7023 3 года назад

    Is 1.5 a constant ?

    • @absentdata
      @absentdata  2 года назад

      Yes, its a best practice to use 1.5. However, you can alter it but then release that you will lose data. It conforms to a standard normal distribution which means that it's a reliable way of saying that most of your data falls under this about 97.5%

  • @ellereal8216
    @ellereal8216 4 года назад

    How to apply this to 3 data points only? Can you help?

    • @absentdata
      @absentdata  4 года назад

      Do you just 3 rows of data? That would be the best use of this.

    • @ellereal8216
      @ellereal8216 4 года назад

      @@absentdata No. I mean just three data. I conducted an analysis with 3 trials so I generated 3 data points. And I need to know whether I can average all those three and report it.
      Edited** What I meant with "analysis" is "experiment"

    • @absentdata
      @absentdata  4 года назад

      @@ellereal8216 Yes that could work for your specific application.

    • @ellereal8216
      @ellereal8216 4 года назад

      @@absentdata Is that the best option for my case to use to identify the outlier? Can you advise me some other stat tools I can easily apply on a calculator.

    • @absentdata
      @absentdata  4 года назад

      @@ellereal8216 Unfortunately I can't really prescribe what to do cause I would need to know your data and context. However I think you can just explain your logic when presenting your findings. You can also can also use percentiles.

  • @jedn1660
    @jedn1660 5 лет назад

    Very helpful, thank you

  • @leonamoliveira7034
    @leonamoliveira7034 6 лет назад

    Thank you for your help

  • @payammadani3781
    @payammadani3781 4 года назад

    liked and subscribed. a ton thanks

  • @Tubeytime
    @Tubeytime 4 года назад

    No fluff, thanks

  • @gabrielvaz7014
    @gabrielvaz7014 4 года назад

    amazing. thank you!

  • @JassiLive2810
    @JassiLive2810 5 лет назад +1

    BRO THERE IS SOMETHING WRONG IN THIS DATA....BECAUSE WHENEVER I JUST APPLIED A DIFFERENT FORMULA TO FIND FIRST AND 3RD QUARTILE I JUST GET DIFFERENT VALUES WHICH IS NOT POSSIBLE

  • @berto_osorio_fl
    @berto_osorio_fl 3 года назад

    thank you!

  • @TheGollo10
    @TheGollo10 5 лет назад

    What is the name of the test?

    • @mickeylandcaster3
      @mickeylandcaster3 5 лет назад

      I am not sure the test has a name. It's more of a technique to classify the borders of your data.

  • @Metia_9898
    @Metia_9898 3 года назад +2

    THANK YOU BRUHHHHH

  • @karanpatil8218
    @karanpatil8218 10 месяцев назад

    WHY HAVE YOU USE 1.5* IN UPPER BOUND

    • @absentdata
      @absentdata  10 месяцев назад

      1.5 is a constant value that I used in the equation

  • @kameshkumar1239
    @kameshkumar1239 4 года назад

    Thank you so much

  • @mahmoudfaouzichaoubi5855
    @mahmoudfaouzichaoubi5855 6 лет назад

    شكراً، Tank you

  • @ryujuano8414
    @ryujuano8414 6 лет назад

    F4 (absolute value) won't work in windows :(

    • @naijastylist5545
      @naijastylist5545 6 лет назад +2

      Ryujinne Juano press( fn ) (f4) it would work

  • @ghaidferas6146
    @ghaidferas6146 2 года назад

    why the 1.5?

    • @absentdata
      @absentdata  2 года назад

      Its a statistical formula. Its similar to using a constant like 3.14 for PI

  • @ask-wj2pf
    @ask-wj2pf Год назад

    Thankyou

  • @vivekkoli4479
    @vivekkoli4479 2 года назад

    Why 1.5?

  • @marissakachadoorian4772
    @marissakachadoorian4772 6 лет назад

    I have a mac and f4 also does not work

  • @prasannakumar-qv8nf
    @prasannakumar-qv8nf 5 лет назад

    why do you multiply 1.5 to upper bound

    • @fuad471
      @fuad471 4 года назад

      it is just a statistical rule

  • @wilsonbrown3521
    @wilsonbrown3521 5 лет назад +4

    How to calculate outliers by Kermit the Frog

  • @erniestremathic2290
    @erniestremathic2290 3 года назад

    Hello Sir. If the Result Show True. Meaning That we have to delete the data (true only) or the significance true data? (Im Using Panel Data).
    For eg:
    2013 TRUE apple.co
    2014 False apple.co
    2015 TRUE apple.co
    2016 False apple.co
    2017 False apple.co
    SHOULD I dELETE ALL.. or just the true ?

  • @VyTran-nu3on
    @VyTran-nu3on 3 года назад

    thank you!