5 Number Summary And How To handle Outliers Using IQR-Statistics

  • Published: 21 Aug 2024

Comments • 61

  • @krishnaik06
    @krishnaik06  3 years ago +6

    Just a quick update: the median is 5.5. If you remove the outlier, it is 5.

    • @sonamgupta-zp1sr
      @sonamgupta-zp1sr 3 years ago +1

      Sir, I have a question. For the lower bracket, are we considering the element 27 when finding the IQR (14+1*%)?

    • @gauravrajpal3101
      @gauravrajpal3101 3 years ago

      Can you post a video on the implementation of VIF along with IQR?

    • @suhagajakos1876
      @suhagajakos1876 2 years ago

      Very nicely explained. I understood it, thank you very much.

    • @socrateeschandrasekaran1989
      @socrateeschandrasekaran1989 2 years ago +1

      Yes Krish, good, but adding 27 separately will change the total number of elements. I hope you'll agree with this.

    • @socrateeschandrasekaran1989
      @socrateeschandrasekaran1989 2 years ago

      @sonamgupta-zp1sr Yes, correct, Sonam.

  • @rushabhkabra4985
    @rushabhkabra4985 3 years ago +5

    He told us in class this morning that he would make a video, and by evening we have it on YouTube. Insane level of dedication 💯

  • @roshankumargupta46
    @roshankumargupta46 3 years ago +8

    Sir, please correct me if I'm wrong.
    The median for an even number of observations would be the average of the (n/2)th and (n/2 + 1)th values, which should be 5.5.
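A quick check of the even-count median, as a minimal sketch. It assumes the list discussed in this thread with 27 included; the helper name `median_even` is only illustrative and does not come from the video.

```python
import statistics

def median_even(sorted_values):
    """Median of an even-length sorted list: mean of the two middle values."""
    n = len(sorted_values)
    # 1-based positions n/2 and n/2 + 1 are 0-based indices n//2 - 1 and n//2.
    return (sorted_values[n // 2 - 1] + sorted_values[n // 2]) / 2

data = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 27]
print(median_even(data))             # 5.5
print(statistics.median(data))       # 5.5
print(statistics.median(data[:-1]))  # 5 (odd count once 27 is dropped)
```

The standard-library `statistics.median` agrees with the hand-written version, and dropping 27 leaves 11 values, so the median becomes the single middle value.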

  • @0SIGMA
    @0SIGMA 3 years ago +2

    Sir, just a huge fan of yours. Wishing you even bigger success as a YouTuber!

  • @jinks3669
    @jinks3669 3 years ago +1

    This is one of your best videos I've seen.
    So very well explained. Naman to your efforts.

  • @arshiavashisht3481
    @arshiavashisht3481 1 year ago

    You never disappoint. Thanks for the lucid explanations.

  • @shansingh9858
    @shansingh9858 3 years ago +1

    Even I don't know why Krish sir keeps creating videos on topics he has already covered...
    Please sir, continue with some advanced topics in Deep Learning, Transfer Learning and Big Data using the blackboard.
    It will be really helpful for all of us!
    Please, sir.
    Love you, sir!

  • @abhishek_maity
    @abhishek_maity 3 years ago +1

    Amazing!! Thank you for this. It cleared up the box plot concept; I was always confused there!! 😀😀

  • @balramthakur9951
    @balramthakur9951 3 years ago +1

    The way you are explaining is outstanding ❤️

  • @AanyaSS
    @AanyaSS 6 months ago

    Wonderful explanation, extremely clear and concise, thank you so much!

  • @karthikr2186
    @karthikr2186 3 years ago

    Why do people dislike such amazing videos?

  • @divyankrawat8371
    @divyankrawat8371 3 years ago

    Given the content this channel has, it should have 1M subs!! Great work, sir.

  • @AmanSharma-tn3dy
    @AmanSharma-tn3dy 2 years ago

    It's an awesome video. Thanks to it I understand not only how a box plot works but also how we remove outliers.

  • @rahmankhan4994
    @rahmankhan4994 1 year ago

    I literally cannot thank you enough for what you are doing, but still, thank you so much.

  • @shrikantdeshmukh7951
    @shrikantdeshmukh7951 3 years ago +2

    Please do not round the position. The correct formula is:
    if the position is 3.25, then the correct value is 3rd + 0.25 * (4th - 3rd).
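The interpolation rule in this comment can be sketched as follows. This is an illustration under stated assumptions, not the video's own code: the position is taken as pct/100 * (N + 1) on a sorted list, and the function name `percentile_n_plus_1` is made up for this example.

```python
import math

def percentile_n_plus_1(sorted_values, pct):
    """Percentile at 1-based position pct/100 * (N + 1), with linear interpolation."""
    n = len(sorted_values)
    pos = pct / 100 * (n + 1)
    lower = math.floor(pos)
    frac = pos - lower
    # Positions outside 1..N fall back to the end values (an assumption of this sketch).
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    lo, hi = sorted_values[lower - 1], sorted_values[lower]
    return lo + frac * (hi - lo)

data = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 27]
print(percentile_n_plus_1(data, 25))  # 3.25 = 3rd + 0.25 * (4th - 3rd)
print(percentile_n_plus_1(data, 75))  # 8.75 = 9th + 0.75 * (10th - 9th)
```

With 12 values the 25th percentile sits at position 3.25, so rounding down to the 3rd value would give 3 instead of 3.25.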

  • @souravpanda2686
    @souravpanda2686 2 years ago +3

    You have used the IQR value from the previous calculation, without the addition of 27. In practice we can take the whole data set, and as I see it, it won't make any difference. Correct me if I am wrong.

  • @saumendrak
    @saumendrak 3 years ago

    Really, Krishna sir, nice example and explanation.

  • @bhartijoshi3231
    @bhartijoshi3231 3 years ago +3

    Hello Mr Naik, while calculating the 25th and 75th percentiles you did not include 27; you added 27 later. That will change the result. Please clarify.

    • @vigneshkr7391
      @vigneshkr7391 2 years ago +1

      It's just a buffer value, to demonstrate that the value doesn't fall within the given data set!
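To see how much including 27 changes the result, here is a small sketch that computes the quartiles and the 1.5 * IQR brackets both ways. It assumes the (N + 1) percentile positions with interpolation, which is what `statistics.quantiles(..., method="exclusive")` implements; the helper name `iqr_fences` is hypothetical.

```python
import statistics

def iqr_fences(values):
    """Return (q1, q3, lower_fence, upper_fence) using the (N + 1) percentile method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    return q1, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr

without_27 = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
with_27 = without_27 + [27]

for label, values in (("without 27", without_27), ("with 27", with_27)):
    q1, q3, low, high = iqr_fences(values)
    # Always test the full list, so 27 is checked against both sets of fences.
    flagged = [v for v in with_27 if v < low or v > high]
    print(label, q1, q3, low, high, flagged)
# without 27 3.0 8.0 -4.5 15.5 [27]
# with 27 3.25 8.75 -5.0 17.0 [27]
```

The quartiles and brackets do shift when 27 is included from the start, but for this particular list 27 is flagged as the only outlier either way.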

  • @TheMISBlog
    @TheMISBlog 3 years ago

    Another Great Video by a great teacher, thanks

  • @thepresistence5935
    @thepresistence5935 3 years ago

    intro love !

  • @krishnabhutada3983
    @krishnabhutada3983 3 years ago

    Cleared all my concepts... Thank you.

  • @prakanshusahu1577
    @prakanshusahu1577 2 years ago +1

    Sir, to find the 25th percentile when we have N observations, why do we consider N+1 observations in the formula
    25/100*(N+1)?

  • @shadiyapp5552
    @shadiyapp5552 1 year ago

    Thank you sir ♥️

  • @rambaldotra2221
    @rambaldotra2221 3 years ago

    Grateful Sir !!

  • @dehumanizer668
    @dehumanizer668 2 years ago +5

    Actually there is a mistake. You shouldn't add 27 to the list after calculating Q1, Q3 and IQR. It gives a different result if you compute these with 27 included from the beginning.
    import math
    # Basic explanation of the IQR technique
    list_1 = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 27]
    # Q1 is the 25th percentile and Q3 is the 75th percentile.
    # Note: Python lists are 0-indexed, so these lookups return the 4th and 10th values.
    l_q1 = list_1[math.floor((25 * (len(list_1) + 1)) / 100)]
    print('Q1:', l_q1)
    l_q3 = list_1[math.floor((75 * (len(list_1) + 1)) / 100)]
    print('Q3:', l_q3)
    l_iqr = l_q3 - l_q1
    print('IQR:', l_iqr)
    lb = l_q1 - 1.5 * l_iqr
    print('Lower Bracket:', lb)
    hb = l_q3 + 1.5 * l_iqr
    print('Higher Bracket:', hb)
    # Collect the outliers first: removing items while iterating the same list can skip elements.
    outliers = [i for i in list_1 if i < lb or i > hb]
    for i in outliers:
        print('Outlier:', i)
        list_1.remove(i)
    print('After removing outlier:', list_1)
    Output:
    Q1: 4
    Q3: 9
    IQR: 5
    Lower Bracket: -3.5
    Higher Bracket: 16.5
    Outlier: 27
    After removing outlier: [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10]

  • @meeturiajaykumar.2384
    @meeturiajaykumar.2384 1 year ago

    Hello sir,
    Can you please make a tutorial on outlier detection and removal using IQR on a dataset? I got stuck on how to proceed when multiple columns contain outliers.
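One possible way to handle several columns, sketched in plain Python under stated assumptions: the rows and column names below are made-up toy values, the helpers `fences` and `drop_outlier_rows` are hypothetical, and a row is dropped if any listed column falls outside that column's 1.5 * IQR brackets. This is one common convention, not the method shown in the video.

```python
import statistics

def fences(values):
    """Lower and upper 1.5 * IQR brackets for one column."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def drop_outlier_rows(rows, columns):
    """Keep only rows whose value lies within the brackets in every listed column."""
    # Brackets are computed once per column on the original data, before any row is dropped.
    bounds = {c: fences([r[c] for r in rows]) for c in columns}
    return [
        r for r in rows
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]

# Toy data invented for illustration only.
rows = [
    {"age": 22, "balance": 100}, {"age": 25, "balance": 120},
    {"age": 27, "balance": 130}, {"age": 30, "balance": 150},
    {"age": 31, "balance": 160}, {"age": 33, "balance": 170},
    {"age": 35, "balance": 5000}, {"age": 95, "balance": 180},
]
cleaned = drop_outlier_rows(rows, ["age", "balance"])
print(len(cleaned))  # 6
```

Computing all brackets up front avoids the result depending on the order in which columns are processed. Whether to drop a whole row because one column is extreme is a judgment call that depends on the dataset.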

  • @kiranbagadhi4053
    @kiranbagadhi4053 3 years ago +1

    Hi Krish, a small doubt about the median part. If I am not wrong, the median would be the middle number, right? Here we can't find an exact middle number because the data has an even count, so we have to take the two middle numbers and average them: (5+6)/2, i.e. 5.5 is the median (50th percentile). Correct me if I am wrong, Krish 😊

    • @krishnaik06
      @krishnaik06  3 years ago

      Yes, you are right.

    • @kiranbagadhi4053
      @kiranbagadhi4053 3 years ago +1

      Thanks a lot for the explanation Krish, a very clean and a neat explanation...

    • @vedantbhenia9085
      @vedantbhenia9085 2 years ago +1

      But to detect the outlier, it should be included in the dataset before the IQR is calculated. Correct me if I am wrong.

  • @AnimeFanClub786
    @AnimeFanClub786 3 years ago

    Your previous thumbnail was cool

  • @vigneshkr7391
    @vigneshkr7391 2 years ago

    Hi Krish,
    Should we replace outlier values with the IQR, which might give more accuracy?

  • @Mere_Shivjii
    @Mere_Shivjii 3 years ago

    Hi Krish, thanks for the detailed information. Is it always required to remove outliers? I think capping outliers at the 98th or 99th percentile value helps keep the entire dataset for further analysis.

  • @sathishk3494
    @sathishk3494 3 years ago

    Sir, in the video on finding outliers using z-score and IQR you said the lower bound is q1*1.5 and the upper bound is q3*1.5, but here the lower bound is q1-1.5*(IQR), and likewise for the upper bound. Which formula should we follow?
    And what is the use of the minimum, maximum and median here? We don't use them in any way to filter the outliers, right? Please comment on this, sir, and clear my doubts.

  • @nikhilverma1044
    @nikhilverma1044 3 years ago

    Hello Sir, could you kindly suggest how we can do relationship analysis among non-quantitative columns while doing EDA?

  • @adityaayaan8727
    @adityaayaan8727 2 years ago

    Why does a box plot always show outliers on the right side of the graph? We know that outliers can be on both sides, so why doesn't it show outliers on the left side?

  • @dilipgyawali1776
    @dilipgyawali1776 2 years ago +1

    Why didn't you consider 27 while calculating the percentiles?

    • @dilipgyawali1776
      @dilipgyawali1776 2 years ago

      Then the IQR may change and the outliers may be different...

  • @jaiswalmagic1
    @jaiswalmagic1 2 years ago

    My notes for this video: vishaljaiswal.notion.site/5-Number-Summary-And-How-To-handle-Outliers-Using-IQR-Statistics-3881d8c6de524572b68930b280172c78
    Thoughts and comments are welcome. 😀

  • @vageshyaggati8214
    @vageshyaggati8214 3 years ago

    Sir, in a data set there are an age column (17-98 years old) and a bank balance column. I used a box plot for age and balance. It shows outliers from 70 to 98 years. Can I ignore them, or should I use the IQR formula to remove those outliers?
    Sir, please help me 🙏🙏

  • @amarpratapsingh4806
    @amarpratapsingh4806 3 years ago

    I know it is unrelated to the video, but could anybody tell me whether I should learn backend programming basics as a data scientist?

  • @shanmukhadari4549
    @shanmukhadari4549 3 years ago

    Sir, why is 1.5 used in the IQR calculation?

  • @hamzahjamal6286
    @hamzahjamal6286 3 years ago

    Sir, instead of median why can't we use the mean?

  • @spandanswain2879
    @spandanswain2879 3 years ago

    Is it necessary to do outlier removal in every regression project? Please answer.

  • @tharunps8048
    @tharunps8048 3 years ago

    So the numbers should be sorted?

    • @praveenbhatt3127
      @praveenbhatt3127 3 years ago

      Yes, for the five-number summary the data should be sorted.
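A short sketch of the five-number summary that sorts first. It assumes the (N + 1) percentile positions, via `statistics.quantiles` with `method="exclusive"`; the function name `five_number_summary` is illustrative, and the input is the thread's list deliberately shuffled.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)  # percentile positions only make sense on sorted data
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, median, q3, ordered[-1]

print(five_number_summary([7, 1, 10, 5, 3, 27, 2, 9, 4, 6, 8, 5]))
# (1, 3.25, 5.5, 8.75, 27)
```

`statistics.quantiles` sorts its input internally, so the explicit `sorted` call matters mainly for reading off the minimum and maximum by position, and for anyone doing the percentile lookup by hand.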

  • @shibanarayansahoo1874
    @shibanarayansahoo1874 3 years ago +1

    Median will be (5+6)/2=5.5

    • @krishnaik06
      @krishnaik06  3 years ago

      Yes, you are right. I missed it.

  • @MaksoodAlam1986
    @MaksoodAlam1986 3 years ago

    Hi

  • @mohitrana2801
    @mohitrana2801 3 years ago

    Bro, your explanation is not clear; it is confusing.