How to do the Titanic Kaggle Competition

  • Published: Feb 10, 2021
  • This video is for those who want to get started doing #kaggle.
    ❤️ Support the channel ❤️
    / @aladdinpersson
    Paid Courses I recommend for learning (affiliate links, no extra cost for you):
    ⭐ Machine Learning Specialization bit.ly/3hjTBBt
    ⭐ Deep Learning Specialization bit.ly/3YcUkoI
    📘 MLOps Specialization bit.ly/3wibaWy
    📘 GAN Specialization bit.ly/3FmnZDl
    📘 NLP Specialization bit.ly/3GXoQuP
    ✨ Free Resources that are great:
    NLP: web.stanford.edu/class/cs224n/
    CV: cs231n.stanford.edu/
    Deployment: fullstackdeeplearning.com/
    FastAI: www.fast.ai/
    💻 My Deep Learning Setup and Recording Setup:
    www.amazon.com/shop/aladdinpe...
    GitHub Repository:
    github.com/aladdinpersson/Mac...
    ✅ One-Time Donations:
    Paypal: bit.ly/3buoRYH
    ▶️ You Can Connect with me on:
    Twitter - / aladdinpersson
    LinkedIn - / aladdin-persson-a95384153
    Github - github.com/aladdinpersson

Comments • 78

  • @grantfoster2663 2 years ago +8

    Really helpful to see someone work through a really simple solution, as someone moving to Python from R!

  • @OSShubho 3 years ago +20

    Thanks for sharing this simple, elegant, beginner-friendly code. Your approach is very clear and understandable.

  • @AbdulRehman-nu2pb 1 year ago

    Thank you so much for sharing this elegant, simple, beautifully written code. For a beginner, your code is a holy grail!!!

  • @aexairkeys 3 years ago +3

    awesome job! love the simplicity. keep going!

  • @jayz4581 2 years ago +1

    People get 100% because this dataset is so classic, and they are always finding the best features or maybe using ensemble methods. But your intro made it so straightforward for me to start on Kaggle. Thanks!

  • @fentazimohamedreadh5274 1 year ago

    Thank you so much!!!
    It was really helpful to get started in Kaggle competitions^^

  • @kyoujinko 2 years ago

    This made so much sense, thank you.

  • @deepudeepak1390 3 years ago +1

    I took the same approach when I started my Kaggle journey 😀. A request from my side: please make some videos on transfer learning in natural language processing. Thank you.

  • @gezahagnnegash9740 2 years ago

    Thanks a lot. As a beginner, it's helpful for me!

  • @jinks6887 2 years ago

    Thanks I've subscribed. Very simple yet informative content.

  • @SussyBaka-ci5xi 9 months ago

    helped a lot! thank you!

  • @MDEMANURRAHAMAN- 2 years ago

    That was really helpful.
    Thanks

  • @sena1663 2 years ago

    That was easy and helpful :) Thanx!!

  • @minhlong1920 3 years ago +1

    Thank you sm dude

  • @denisvoronov6571 2 years ago

    Perfect for the beginner!

  • @danasharon4752 2 years ago

    Thank you!

  • @juan.forero_ 2 years ago

    Thank you bro!!

  • @teamsonnyliston 1 year ago

    Thanks a lot man you helped me

  • @suhass6628 3 years ago +13

    Well done, mate! Thanks for this. Hopefully you will do more Kaggle stuff. Will follow everything.

    • @AladdinPersson 3 years ago +5

      Yeah it will for sure, got another video coming soon on a bit more advanced competition

  • @LameGamerYT 2 years ago

    GOD LEVEL VIDEO THANKS SO MUCH!

  • @sanskarram992 3 years ago

    Very helpful for beginners.
    Thanks for such content.

  • @newkamphora 1 year ago

    Thank you, very helpful ;)

  • @nabshieshty 7 months ago

    nice vid, did my assignment with this

  • @mtk-0_0 1 year ago

    appreciate good effort!

  • @maxvettel7337 1 year ago

    This is what I really need as a beginner

  • @kefahelhelou9418 1 year ago

    Thanks a lot

  • @abdoali-nl2yt 1 year ago

    thanks for you

  • @Borzacchinni 2 years ago

    Thanks for the video!
    Do you happen to be from Norway perhaps?

  • @arnelecleir4876 2 years ago

    In this case (using a regression), is it possible to just use Stata? I feel like most of the actions performed here would have been easier/quicker in Stata… I'm asking this since I know how to work with Stata and am currently learning data science via DataCamp/Kaggle and want to compare some tools :)

  • @mukundkrishna2789 2 years ago +1

    For logistic regression, isn't it necessary to do feature scaling before training? When I searched online, the advice was to do feature scaling for logistic regression.

  • @yuliusharjoseputro2069 3 years ago +1

    Hi, thanks for your tutorial.
    I've implemented your code, but why is the accuracy I got different from yours?

  • @ccuuttww 3 years ago +6

    100% means overfitting.
    Of course you can do more stuff to boost your performance:
    PCA, boost sampling, cross validation, even prior parameter

    • @AladdinPersson 3 years ago +4

      I agree, you can try a lot more to make it even better; for this one I tried to keep it minimal and simple.

    • @adilsonmedronha706 2 years ago

      Actually, it is not overfitting, because this accuracy was measured on the test set (unseen data), not the training set.

  • @ouhjnadmacabenta3054 2 years ago

    Hi bro, how did you set up the CSV file in Jupyter? My CSV file was not defined. Thanks.

  • @udbhavprasad3521 3 years ago

    Can you make a video about XGBoost? There are not many resources for that.

  • @LeonidasParigoris 9 months ago

    Thanks for this! I have a question, at 3:45 how are you able to avoid writing the whole directory of the file and just say "train.csv", instead of writing the whole snake of the directory e.g. "C:\\Users\\etcetc\\Python\\titanic\\train.csv"?
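
A likely explanation for the question above, assuming the notebook was started in the same folder as the CSV files: a bare filename is a relative path, and Python resolves it against the current working directory. The sketch below uses only the standard library; the file does not need to exist for the path arithmetic to work.

```python
from pathlib import Path

# A bare filename such as "train.csv" is a relative path: it is resolved
# against the current working directory, not against a fixed location.
relative = Path("train.csv")
absolute = Path.cwd() / relative

print(relative.is_absolute())   # False: no drive or root given
print(absolute.is_absolute())   # True: anchored at the working directory
print(relative.resolve() == absolute.resolve())
```

If the CSV lives somewhere else, either pass the full path or start the notebook from the folder that contains the file.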

  • @gauravms6681 3 years ago +4

    remember me when this channel is gonna go hit : )

    • @jose3538 3 years ago +1

      Remember me too!

  • @shaikhkashif9973 1 year ago +2

    Bro, for *Embarked* you should go for nominal encoding, not label encoding, because the values are names of ports.
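
The difference this comment points at can be sketched in plain Python. This is an illustration only, with a made-up four-row sample; it is not the code from the video. The port codes S, C and Q are the ones used in the Titanic dataset.

```python
# Embarked holds port codes: S (Southampton), C (Cherbourg), Q (Queenstown).
ports = ["S", "C", "Q", "S"]

# Label encoding: one integer per category, which implies an order C < Q < S.
categories = sorted(set(ports))
label_encoded = [categories.index(p) for p in ports]

# One-hot (nominal) encoding: one 0/1 column per category, no implied order.
one_hot = [[int(p == c) for c in categories] for p in ports]

print(categories)      # ['C', 'Q', 'S']
print(label_encoded)   # [2, 0, 1, 2]
print(one_hot)         # [[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

A linear model such as logistic regression treats the label-encoded integers as magnitudes, which is why one-hot columns are usually preferred for unordered categories.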

  • @Leopar525 2 years ago +1

    I really like your style of thinking and explaining. Could you please advise on any (free or not) courses/articles or anything you believe is good for beginners?

  • @pranjalsingh1389 1 year ago

    Why did we not use fit_transform on the test set?
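
The usual reasoning is that anything learned from data (means, category mappings) should come from the training set only, and the test set should merely be transformed with it. The sketch below uses a hypothetical `ToyScaler` class that mimics the scikit-learn `fit`/`transform` convention; the class and the age values are made up for illustration.

```python
from statistics import mean, pstdev


class ToyScaler:
    """Hypothetical stand-in for a scikit-learn style transformer."""

    def fit(self, values):
        # Learn the statistics from the data passed in (the training set).
        self.mean_ = mean(values)
        self.std_ = pstdev(values)
        return self

    def transform(self, values):
        # Apply the already-learned statistics; nothing is re-estimated here.
        return [(v - self.mean_) / self.std_ for v in values]


train_ages = [20.0, 30.0, 40.0]
test_ages = [30.0, 50.0]

scaler = ToyScaler().fit(train_ages)   # fit on train only
train_scaled = scaler.transform(train_ages)
test_scaled = scaler.transform(test_ages)   # transform, never fit, on test

print(train_scaled)
print(test_scaled)
```

Calling `fit` again on the test set would rescale it with different statistics, so the same raw age would map to different numbers in train and test.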

  • @adayinthelife5496 1 year ago

    I think your code is excellent, but it freaks me out how many data scientists only see their accuracy as a result. Understanding and presenting the results in a meaningful way is key to any science. So... who was likely to survive??

  • @classicemmaeasy2292 1 year ago

    Very short, simple and explanatory, but you use machine learning techniques all the way through; you don't really explore and visualize the data.
    This video is awesome by the way, and beginner-friendly.

  • @suhass6628 3 years ago +3

    And about the 100% people: rumour has it that some people got the info from the actual Titanic records, which are publicly available. So it would give 100%, obviously.

    • @AladdinPersson 3 years ago +1

      Makes sense!

    • @viralmedia.007 6 months ago

      so are kaggle competitions genuine??
      i always wonder how would people get 100% correct predictions
      or is this specific to this competition only?
      moreover they come with such huge prize pools

    • @suhass6628 6 months ago

      @viralmedia.007 yes, the actual competitions that have prize money are very genuine. The rigged ones are usually very basic, or ones for which the data is already publicly available.

  • @danilomontalvo5756 2 years ago

    Everything else works for me except the predictions. When I get to 14:43 it just says "AttributeError: 'function' object has no attribute 'predict'"
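
The commenter's code is not shown, so the cause can only be guessed. One way this exact message arises is when the name that should hold the fitted model refers to a plain function instead, for example because the parentheses of a call were left out. The sketch below reproduces that with a hypothetical `ToyModel` class and `make_model` helper; neither comes from the video.

```python
class ToyModel:
    """Hypothetical model with the usual fit/predict interface."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [0 for _ in X]


def make_model():
    return ToyModel()


clf = make_model   # missing parentheses: clf is the function itself
try:
    clf.predict([[1]])
except AttributeError as err:
    message = str(err)
print(message)   # 'function' object has no attribute 'predict'

clf = make_model().fit([[1]], [0])   # call it, then fit: clf is a model
print(clf.predict([[1]]))            # [0]
```

Printing `type(clf)` right before the failing line is a quick way to check what the variable actually holds.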

  • @AIPlayerrrr 3 years ago +1

    Planning to do more real ones in the future?

    • @AladdinPersson 3 years ago

      yes

    • @AladdinPersson 3 years ago

      Got any ideas of some you think would be useful?

    • @AIPlayerrrr 3 years ago

      I watched a lot of your videos and I think you are very likely to place high, as you are really knowledgeable. You explain things very well. I think you can try the recent human protein competition. It's a fun weakly supervised classification problem.

    • @talha_anwar 3 years ago

      Upvoted

  • @RpSKhaira 1 year ago

    Noob here, question: why did you clean your data through a function? Why not just run those exact commands outside of the function?

    • @timgen-iu1qo 11 months ago +1

      I think it's because he had 2 tables of input data, and it's easier to write one function and call it twice than to write the algorithm twice and change something for each table.
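
The point made in this reply can be sketched as follows. The `clean` function, its rules, and the rows are all hypothetical and deliberately simplified (plain dictionaries instead of DataFrames); they are not the cleaning steps from the video.

```python
def clean(rows):
    """Hypothetical cleaning step: drop the Name column, fill missing ages."""
    ages = sorted(r["Age"] for r in rows if r["Age"] is not None)
    fill = ages[len(ages) // 2]   # a simple middle value, chosen for illustration
    return [
        {"Sex": r["Sex"], "Age": fill if r["Age"] is None else r["Age"]}
        for r in rows
    ]


train = [
    {"Name": "A", "Sex": "male", "Age": 22.0},
    {"Name": "B", "Sex": "female", "Age": None},
    {"Name": "C", "Sex": "female", "Age": 30.0},
]
test = [
    {"Name": "D", "Sex": "male", "Age": None},
    {"Name": "E", "Sex": "male", "Age": 40.0},
]

# One definition, two calls: both tables get exactly the same treatment.
train_clean = clean(train)
test_clean = clean(test)
print(train_clean)
print(test_clean)
```

Keeping the steps in one function also means a later change to the cleaning logic cannot be applied to one table and forgotten on the other.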

  • @maitrijain7758 1 year ago

    Your code gives an error when we predict on x test.

  • @magikarp1743 1 year ago

    Can someone please help me out here? At 14:55, on running it shows "ValueError: X has 8 features per sample; expecting 7"

    • @timgen-iu1qo 11 months ago

      I have the same error. Have you solved it yet?

    • @magikarp1743 11 months ago

      @timgen-iu1qo yeah, I found my mistake... in the 2nd cell I wrote test = pd.read_csv("train.csv") instead of test = pd.read_csv("test.csv")... silly of me

    • @timgen-iu1qo 11 months ago

      @magikarp1743 IMAGINE, same mistake... Thanks 😂😂
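
Why that slip produces one feature too many: Kaggle's train.csv carries the Survived column and test.csv does not, so loading train.csv twice leaves an extra column in the "test" table. A quick header comparison catches this; the sketch below uses tiny in-memory stand-ins for the two files, with only a few of the real columns.

```python
import csv
import io

# Tiny stand-ins for the two Kaggle files; the real ones have more columns.
train_csv = "PassengerId,Survived,Pclass,Sex\n1,0,3,male\n"
test_csv = "PassengerId,Pclass,Sex\n892,3,male\n"


def columns(text):
    # Read only the header row of a CSV given as a string.
    return next(csv.reader(io.StringIO(text)))


train_cols = columns(train_csv)
test_cols = columns(test_csv)

extra = [c for c in train_cols if c not in test_cols]
print(extra)                              # ['Survived']
print(len(train_cols) - len(test_cols))   # 1
```

With pandas, comparing `train.columns` and `test.columns` right after loading gives the same check.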

  • @gurudevdatta3960 2 years ago

    I'm getting an error while splitting the data. Can you help me? Or, if you don't mind, can you send your number, please? I will send a screenshot to you.

  • @Honest_Reply900 2 years ago

    Well done. Thanks for your efforts! 100% accuracy? I am sure they have cheated :)

  • @Ajay_Pathak_ 3 years ago +1

    I'm having errors while fitting the model
    It says
    Float() must be str or .... Not method

  • @karlagonzalez6808 2 years ago

    Do you know how to find the most popular name among male Titanic passengers?

    • @krnl1304 2 years ago

      One with the maximum frequency should be the most. So use count() and max()
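
The count-then-take-the-maximum idea from this reply can be sketched with `collections.Counter`. The rows below are made up, written in the "Surname, Title. Given names" layout of the dataset's Name column, and `first_name` is a hypothetical helper that assumes every name follows that layout.

```python
from collections import Counter

# Made-up rows in the dataset's "Surname, Title. Given names" layout.
passengers = [
    ("Smith, Mr. John Henry", "male"),
    ("Jones, Mr. William", "male"),
    ("Brown, Mrs. Mary", "female"),
    ("Taylor, Mr. John", "male"),
]


def first_name(full_name):
    # Take the text after the title ("Mr. ", "Mrs. ", ...) and keep its first word.
    given = full_name.split(". ", 1)[1]
    return given.split()[0]


# Count first names among male passengers, then ask for the most frequent one.
counts = Counter(first_name(name) for name, sex in passengers if sex == "male")
print(counts.most_common(1))   # [('John', 2)]
```

On the real data, names with unusual punctuation may need extra handling before the split.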

  • @vishalgoklani 3 years ago

    LogisticRegression??? where's the neural network? :)

    • @AladdinPersson 3 years ago

      In the moment it felt like it would be overkill, in retrospect I regret it :3

  • @mehermanoj45 3 years ago

    Please do speedrunning of datasets, like games 😂

    • @AladdinPersson 3 years ago

      How do you mean? :P

    • @mehermanoj45 3 years ago

      @AladdinPersson pick a random dataset and see how fast you can go from downloading to inference.