Not Linear Relationship Between Numeric Predictor and Binary Outcome in Logistic Regression (4K)

  • Published: 11 Sep 2024

Comments • 34

  • @RUJedi
    @RUJedi 23 days ago +3

    I work with stats regularly and I'm always learning something from your channel! Brilliant job!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 23 days ago

      Great to hear! Thanks man! And thank you for watching and commenting!

  • @mmdigital123
    @mmdigital123 5 days ago

    Great videos, perhaps the best in the R community I have ever seen.

  • @windkl
    @windkl 23 days ago +1

    Outstanding, clear and very informative video! Thank you!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 23 days ago +1

      Glad you enjoyed it! Hope you like the rest on my channel too ;)

  • @soylentpink7845
    @soylentpink7845 23 days ago

    Great video! It's something I have always wondered how best to do. I usually first fit a more complex model, like a tree ensemble, and then look at the partial dependence plot.
    Thanks!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 23 days ago

      That's a great idea too! Another one is just to use a GAM. It fits any numeric predictor very well, and you don't need to choose the polynomial degree. The GAM results do not work with some of my favorite packages though. That's the only drawback. But generally, a GAM may be the way to go, and I want to dive deeper into GAMs later. Thanks for watching and for the nice feedback!
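
The trade-off mentioned above (choosing a polynomial degree versus letting a GAM pick the smoothness) can be sketched in R roughly as follows. This is only an illustration on simulated data, not the code from the video: the variables `x` and `y`, the quadratic true effect, and the use of the `mgcv` package for the GAM are all assumptions made for the example.

```r
# Hypothetical simulated data: the log-odds of y depend on x quadratically,
# so a model that is linear in x should fit poorly.
set.seed(1)
n <- 500
x <- runif(n, -3, 3)
y <- rbinom(n, size = 1, prob = plogis(-1.5 + 0.8 * x^2))

# Logistic regression with a linear term only
m_lin <- glm(y ~ x, family = binomial)

# Logistic regression with a second-degree polynomial (degree chosen by hand)
m_poly <- glm(y ~ poly(x, 2), family = binomial)

# GAM with a smooth term: no polynomial degree has to be chosen.
# mgcv ships with most R installations, but that is an assumption here.
if (requireNamespace("mgcv", quietly = TRUE)) {
  m_gam <- mgcv::gam(y ~ s(x), family = binomial)
  print(AIC(m_lin, m_poly, m_gam))
} else {
  print(AIC(m_lin, m_poly))
}
```

A lower AIC for the polynomial or GAM fit than for the linear fit is one hint that the relationship on the log-odds scale is not linear.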

  • @marcoesteves4367
    @marcoesteves4367 23 days ago

    Very helpful and brilliantly explained

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 23 days ago

      Glad it was helpful! Thanks for the nice feedback and for watching!

  • @45tanviirahmed82
    @45tanviirahmed82 4 days ago

    I have a request! Can you please cover all the assumptions of the statistical tests commonly used in research in one single video (like the different t-tests, ANOVA, regressions and their non-parametric counterparts)? This would help me a lot to stop making mistakes when choosing a model.
    Before jumping into any kind of test, we need to meet the assumptions, right?

  • @hikeaway1596
    @hikeaway1596 24 days ago

    great video as usual! thanks and keep it up!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 24 days ago

      thanks for your continuous support! :) glad you enjoy my content!

  • @hoppybrewologist
    @hoppybrewologist 13 days ago

    Love your work - can you show PCA plots sometime please

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 13 days ago

      Thank you so much! :) I plan to do PCA for sure, but I don't know whether I can manage it this year. The to-do list is kind of long already, and I plan to cover lots of modelling stuff first, so, supervised methods. I will then come to unsupervised ones, like PCA etc.

  • @alijanbain2852
    @alijanbain2852 24 days ago

    Thanks for another incredible video! Your explanations are always top-notch. If possible, could you share the code used in this video? It would be super helpful for practicing the concepts. Keep up the amazing work!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 23 days ago

      Thanks for such great feedback! I am very happy it's useful! Please feel free to rewatch and pause the video to write down the code, since that is a good learning strategy. Better than copy-pasting. But if you wish to have the whole code, consider joining the channel (it's the Join button below every video); I send members the code in the Community tab. Kind regards! And thanks for watching!

  • @haraldniederstatter4068
    @haraldniederstatter4068 24 days ago

    Thank you!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 24 days ago

      You're very welcome, Harald! I hope the other videos are useful too! Thank you as well for the financial support! It motivates me to keep creating. Best regards, Yury

  • @ousmanelom6274
    @ousmanelom6274 23 days ago

    In R, why don't you encode and normalize the variables before building your machine learning model?

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 23 days ago

      :) First, please rephrase your question without jargon if you can; second, explain WHY one should do whatever you propose. Thanks!

  • @smartinssmart
    @smartinssmart 23 days ago

    😊👏👏👏👌

  • @viv-analytics
    @viv-analytics 23 days ago

    Once again, a great educational video using state-of-the-art R packages. Thx, @yuzaR-Data-Science

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 23 days ago

      Glad you enjoyed it again! Don't hesitate to let me know if the quality starts to decline. Or feel free to recommend how I can improve the quality of the content.

  • @mauriciomorales3165
    @mauriciomorales3165 21 days ago

    Hi Yury, AMAZING WORK WITH THIS! It explains a lot!!
    I'm performing some logistic regression analysis right now and would like to ask you a few questions. Is it possible to contact you by email, GitHub or Twitter? If not, I can add more context in this comment. Your help would be great and really appreciated!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 21 days ago +1

      Thanks a lot, Mauricio, for such nice feedback! I can try to answer your questions here in the YouTube comments to the best of my abilities and as time allows. Cheers

    • @mauriciomorales3165
      @mauriciomorales3165 20 days ago

      @yuzaR-Data-Science Thank you! I checked some of your papers, and they helped me a lot with some of my questions!
      Well, here is the context:
      I have a model that predicts severity (disease vs. non-disease) based on a genotype, for example AA vs AG|GG. That is the basic model. For my second model I add sex and age, then obesity, and the last one includes more comorbidities: allergies, arterial hypertension and so on...
      I perform a mixed selection process and select the best model based on AIC. All good so far; however, here are my questions:
      Some people report both the crude OR and the adjusted OR. By definition, I know that the adjusted OR is the one that is adjusted when you add more independent variables to the model. However, is there a way to get the crude and the adjusted OR all in one model? For example, with the functions glmulti::glmulti() and finalfit::fit2df() I can put the crude and the adjusted ORs for all my variables in one table, so it looks like there is a way to calculate and plot both. Could you provide more information about this?
      My other question: do you know if there is a way to check for confounders using code? For example, I read the book Applied Logistic Regression, where the author mentions that you can check this through interactions between variables. However, I'm not sure whether I can perform a test or make a plot that would tell me "this is a confounder in your model".
      Last question: in the stats::glm() function, you can define interactions with ":" and "*". Imagine that I need to set the interaction between genotype and obesity, because I know that in my data, obese people are more likely to have the disease when they have a specific genotype. So one option is to allow the interaction between these variables: glm(severity ~ genotype * obesity + other variables + ...), or to use ":" instead.
      My question is: is this interaction correct in the biological sense? What I mean is, does the interaction really represent the genotype and obesity conditions occurring together? I understand the meaning in the code, but I do not know whether I can extrapolate that to the biological meaning.
      Any comments are really appreciated here! Thank you for your time and help!!!!

    • @mauriciomorales3165
      @mauriciomorales3165 20 days ago

      I forgot to mention that most of my variables are binary; age is the only one with a numeric data type.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 14 days ago +1

      Well, first, there is no difference between odds ratios from any multivariable logistic model and adjusted odds ratios. All odds ratios from a multivariable regression are adjusted for the confounders you put in. So, don't worry about this terminology too much.
      Secondly, to control for confounders, you can use techniques such as:
      Stratification: Dividing the sample into groups based on the confounder and analyzing the relationship between the independent and dependent variables within each group.
      Statistical adjustment: Including the confounder as a covariate in the statistical model.
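      Finally, use "*" instead of ":" because "*" includes the ":" term anyway: genotype * obesity expands to genotype + obesity + genotype:obesity, i.e. both main effects plus their interaction.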
      Hope that helps!
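
The two points in this reply (crude vs. adjusted odds ratios, and "*" vs. ":") can be sketched in base R roughly as follows. This is only an illustration, not the analysis from the thread: the data are simulated, and the column names `severity`, `genotype`, `obesity` and `age`, as well as the effect sizes, are hypothetical.

```r
# Hypothetical simulated data loosely mirroring the variables in the question
set.seed(42)
n <- 1000
d <- data.frame(
  genotype = factor(sample(c("AA", "AG|GG"), n, replace = TRUE)),
  obesity  = factor(sample(c("no", "yes"), n, replace = TRUE)),
  age      = round(runif(n, 20, 80))
)
lp <- -3 + 0.5 * (d$genotype == "AG|GG") + 0.7 * (d$obesity == "yes") + 0.03 * d$age
d$severity <- rbinom(n, size = 1, prob = plogis(lp))

# Crude OR: genotype is the only predictor
crude <- glm(severity ~ genotype, data = d, family = binomial)

# Adjusted OR: same predictor, with the other covariates in the model
adjusted <- glm(severity ~ genotype + obesity + age, data = d, family = binomial)

exp(coef(crude))["genotypeAG|GG"]     # crude odds ratio for genotype
exp(coef(adjusted))["genotypeAG|GG"]  # odds ratio adjusted for obesity and age

# "*" is shorthand for both main effects plus the ":" interaction term
attr(terms(severity ~ genotype * obesity), "term.labels")

m_star  <- glm(severity ~ genotype * obesity, data = d, family = binomial)
m_colon <- glm(severity ~ genotype + obesity + genotype:obesity,
               data = d, family = binomial)
all.equal(coef(m_star), coef(m_colon))  # both formulas describe the same model
```

Using ":" alone, without the main effects, would fit a different and usually harder-to-interpret model, which is why "*" is the safer default.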

    • @yuzaR-Data-Science
      @yuzaR-Data-Science 14 days ago +1

      don't forget to make binaries a factor then ;)
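
One possible way to do that conversion in base R is sketched below. The data frame `dat`, its columns and the "no"/"yes" labels are hypothetical and only serve as an example of 0/1 columns being turned into factors.

```r
# Hypothetical data with binary variables stored as 0/1 numbers
dat <- data.frame(
  obesity      = c(0, 1, 1, 0, 1),
  hypertension = c(1, 0, 0, 1, 1),
  age          = c(34, 51, 47, 62, 29)
)

# Convert only the binary columns to factors; age stays numeric
binary_cols <- c("obesity", "hypertension")
dat[binary_cols] <- lapply(dat[binary_cols], factor,
                           levels = c(0, 1), labels = c("no", "yes"))

str(dat)
```

With explicit levels, the first level ("no" here) becomes the reference category in glm().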