Python: univariate statistics

  • Published: 4 Nov 2024

Comments • 29

  • @mutaivictor7607
    @mutaivictor7607 1 year ago +3

    This resource is quite helpful. This resource is ideal for those who possess prior knowledge of the language and need a condensed and intensive overview of the subject matter. I really like the conciseness and directness of this statement.

  • @vincentvalentine9417
    @vincentvalentine9417 2 years ago +14

    Man, this is so useful. This is perfect if you know the language and just need a crash course on stuff. I really like how to the point this is

  • @b_flieg7579
    @b_flieg7579 8 months ago +2

    Hi Mark, thanks for the video and the playlist. I don't know whether you mention it after 18:25, but it is a lot easier to use a for loop than to copy and paste the code into each cell. For example, here is what I did:
    for i in df.columns:
        print(f'{i}: {df[i].dtype}')

  • @Dilofi1712
    @Dilofi1712 1 year ago +1

    Thank you so much for doing this video. My school taught us how to do EDA and I was struggling to keep up. This helped me a lot.

  • @gamerforever9137
    @gamerforever9137 3 years ago +3

    Hi Mark,
    Amazing video, I was confused about the part of skewness and normality, and then I stumbled upon your video, which cleared all my doubts.
    Thank you

  • @CycleTheDark
    @CycleTheDark 1 year ago +1

    I like how real this tutorial is

  • @Farrukhw
    @Farrukhw 4 months ago +2

    Mark, you should start using a `for` loop instead of writing each line with `print()`. For example, try this code:
    ---
    for col in df.columns:
        print(f"{col}={df[col].nunique()}")
    ---
    and you will get the same output in just two lines.

    • @MarkKeith
      @MarkKeith  4 months ago +1

      Definitely 👍
      I use these videos in a book for students who are coding for the first time. At this point, I haven’t gotten to loops yet. But then, a bit later, I use your exact code to make the point that automation saves them a lot of time.
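A minimal sketch of the same automation idea taken one step further: instead of printing one value per column, it collects the data type, the number of unique values, and the missing count into a single table. The DataFrame and its column names are hypothetical stand-ins, not the dataset from the video.

```python
import pandas as pd

# Hypothetical toy data; the video's dataset is not reproduced here.
df = pd.DataFrame({
    "age": [23, 35, 41, 35],
    "city": ["Provo", "Orem", "Provo", "Lehi"],
})

# One row per column: dtype, number of distinct values, missing count.
summary = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "unique": df.nunique(),
    "missing": df.isna().sum(),
})
print(summary)
```

Each of the three inputs is a Series indexed by column name, so pandas aligns them into one table without an explicit loop.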

  • @Baron-digit
    @Baron-digit 3 years ago +1

    Hi Mark,
    thanks for sharing. Very helpful to find a good approach to start analysis!

  • @barulli87
    @barulli87 10 months ago +1

    Great content! Can you also talk about how to interpret these results? What can we do with all the concepts you discussed?

    • @MarkKeith
      @MarkKeith  10 months ago +1

      That’s a good question, but the answer is a bit long for the comments. Basically, those results give you an idea about the cleanliness and distributions of the features. For example, if the skewness is too high, then you know that you’ll either need to transform the feature or choose a modeling algorithm that doesn’t depend on linear assumptions. If categorical features have a large number of unique values, then you’ll need to check to make sure that every value is adequately represented or you’ll need to do some grouping.
      I talk about a lot of these issues in later videos when I get to the modeling phase.
      Thanks!
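The two checks described in the reply above can be sketched as follows. This is only an illustration under stated assumptions: the data are made up, the log transform is just one of several possible transforms, and the `MIN_COUNT` threshold is an arbitrary value chosen for the example, not a rule from the video.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a right-skewed numeric feature and a categorical
# feature with several rarely used labels.
df = pd.DataFrame({
    "income": [20, 22, 25, 27, 30, 31, 35, 40, 120, 400],
    "city": ["A", "A", "A", "A", "B", "B", "B", "C", "D", "E"],
})

# Skewness before and after a log transform (log1p computes log(1 + x)).
skew_raw = df["income"].skew()
skew_log = np.log1p(df["income"]).skew()
print(f"raw skew: {skew_raw:.2f}, log skew: {skew_log:.2f}")

# Group categories that appear fewer than MIN_COUNT times into "Other".
MIN_COUNT = 2  # illustrative threshold, not from the video
counts = df["city"].value_counts()
rare = counts[counts < MIN_COUNT].index
df["city_grouped"] = df["city"].where(~df["city"].isin(rare), "Other")
print(df["city_grouped"].value_counts())
```

On this toy data the log transform lowers the skewness but does not remove it, which is a reminder to re-check the statistic after transforming rather than assume the problem is solved.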

  • @zahraseyedghorban82
    @zahraseyedghorban82 1 year ago +1

    very clean programming! thank you.

  • @digvijayshekhawat5314
    @digvijayshekhawat5314 4 years ago +1

    It was good and cleared many of my doubts.

  • @majdfahadal-thopiti8369
    @majdfahadal-thopiti8369 9 months ago +1

    Thank you, bro, this is very helpful, but I have a question. I do not understand "univariate": in this data set we have both numerical and categorical features, so it does not seem univariate!
    Thank you again from Saudi Arabia.

    • @MarkKeith
      @MarkKeith  9 months ago

      Hey, thanks for the comment and question! Can you tell me a bit more about what you’re asking? Are you wondering about what to look for between numeric versus categorical features?

  • @karandeepsingh7900
    @karandeepsingh7900 1 year ago

    The kurtosis of a normal distribution is 3, but you say that kurtosis near ±1 is what should be considered?
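The apparent conflict in the question above is likely one of convention. pandas `Series.kurt()` returns excess (Fisher) kurtosis, which is 0 for a normal distribution, while the "3" figure is Pearson kurtosis; the two differ by exactly 3. Whether the video's ±1 guideline refers to the excess value is an assumption here. A small sketch on a simulated sample:

```python
import numpy as np
import pandas as pd

# Simulated standard-normal sample (seeded for repeatability).
rng = np.random.default_rng(0)
x = pd.Series(rng.normal(size=100_000))

excess = x.kurt()      # pandas reports excess (Fisher) kurtosis
pearson = excess + 3   # convert to the "normal = 3" convention
print(f"excess: {excess:.3f}, Pearson: {pearson:.3f}")
```

Under the excess convention, a rule of thumb such as "within ±1" describes distance from 0, which corresponds to a Pearson kurtosis between 2 and 4.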

  • @ApPillon
    @ApPillon 10 months ago +1

    thanks my dude

  • @11hamma
    @11hamma 4 years ago +1

    Hi Mark, nice stuff for beginners. Can you please provide the notebook(s) used?

    • @MarkKeith
      @MarkKeith  3 years ago +2

      I know I'm incredibly late on this, but I've added links in the description

    • @Kishor_D7
      @Kishor_D7 1 year ago

      @MarkKeith 😂😂

  • @aishwaryawuntkal3283
    @aishwaryawuntkal3283 4 years ago +2

    Where's the document from? It seems helpful. Could you please provide that?

    • @MarkKeith
      @MarkKeith  3 years ago

      I know I'm incredibly late on this, but I've added links in the description

  • @pavansingara9408
    @pavansingara9408 3 years ago +1

    Hi Mark, could you please provide the document showing how you handled categorical and numerical variables?

    • @MarkKeith
      @MarkKeith  3 years ago +1

      I know I'm incredibly late on this, but I've added links in the description

  • @melisaappaof944
    @melisaappaof944 3 years ago

    What about multivariate? How do we calculate it in pandas?
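One common starting point beyond univariate statistics is to look at features in pairs. The sketch below is an assumption about what the question is after, not the video author's method, and the data and column names are hypothetical. It shows a correlation matrix for two numeric features and a per-category mean for a categorical/numeric pair.

```python
import pandas as pd

# Hypothetical toy data, not the dataset from the video.
df = pd.DataFrame({
    "sqft": [800, 1200, 1500, 2000, 2400],
    "price": [150, 210, 260, 330, 400],
    "type": ["condo", "condo", "house", "house", "house"],
})

# Numeric vs numeric: Pearson correlation matrix.
corr = df[["sqft", "price"]].corr()
print(corr)

# Categorical vs numeric: mean of the numeric feature per category.
means = df.groupby("type")["price"].mean()
print(means)
```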

  • @bibekrauth3408
    @bibekrauth3408 4 years ago +1

    Can you provide the document?

    • @MarkKeith
      @MarkKeith  3 years ago

      I know I'm incredibly late on this, but I've added links in the description

  • @ahorasimipaco
    @ahorasimipaco 2 years ago

    awesome

  • @Nizzy001
    @Nizzy001 9 months ago

    I need subtitles in pt-BR :(