How to do the Titanic Kaggle Competition
- Published: Feb 10, 2021
- This video is for those who want to get started doing #kaggle.
❤️ Support the channel ❤️
/ @aladdinpersson
Paid Courses I recommend for learning (affiliate links, no extra cost for you):
⭐ Machine Learning Specialization bit.ly/3hjTBBt
⭐ Deep Learning Specialization bit.ly/3YcUkoI
📘 MLOps Specialization bit.ly/3wibaWy
📘 GAN Specialization bit.ly/3FmnZDl
📘 NLP Specialization bit.ly/3GXoQuP
✨ Free Resources that are great:
NLP: web.stanford.edu/class/cs224n/
CV: cs231n.stanford.edu/
Deployment: fullstackdeeplearning.com/
FastAI: www.fast.ai/
💻 My Deep Learning Setup and Recording Setup:
www.amazon.com/shop/aladdinpe...
GitHub Repository:
github.com/aladdinpersson/Mac...
✅ One-Time Donations:
Paypal: bit.ly/3buoRYH
▶️ You Can Connect with me on:
Twitter - / aladdinpersson
LinkedIn - / aladdin-persson-a95384153
Github - github.com/aladdinpersson
Really helpful to see someone work through a really simple solution, as someone moving to Python from R!
Thanks for sharing this simple and elegant beginner-friendly code. Your approach is very clear and understandable.
Thank you so much for sharing this elegant, simple, and beautifully written code. As a beginner, your code is a holy grail!!!
awesome job! love the simplicity. keep going!
People get 100% because this dataset is so classic and they are always finding the best features or maybe using ensemble methods. But your intro is straightforward enough for me to get started on Kaggle. Thanks!
Thank you so much!!!
It was really helpful to get started in Kaggle competitions^^
This made so much sense, thank you.
I took the same approach when I started my Kaggle journey 😀 ... A request from my side: please make some videos on transfer learning in natural language processing. Thank you!
Thanks a lot. As a beginner, it's helpful for me!
Thanks I've subscribed. Very simple yet informative content.
helped a lot! thank you!
That was really helpful.
Thanks
That was easy and helpful :) Thanx!!
Thank you sm dude
Perfect for the beginner!
Thank you!
Thank you bro!!
Thanks a lot man you helped me
Well done, mate! Thanks for this. Hopefully you will do more Kaggle stuff. Will follow everything.
Yeah it will for sure, got another video coming soon on a bit more advanced competition
GOD LEVEL VIDEO THANKS SO MUCH!
Very helpful for beginners.
Thanks for such content.
Thank you, very helpful ;)
nice vid, did my assignment with this
appreciate good effort!
This is what I really need as a beginner
Thanks a lot
thanks for you
Thanks for the video!
Do you happen to be from Norway perhaps?
In this case (using a regression), is it possible to just use Stata? I feel like most of the actions performed here would have been easier/quicker in Stata… I'm asking this since I know how to work with Stata and am currently learning data science via DataCamp/Kaggle and want to compare some tools :)
For logistic regression, isn't it necessary to do feature scaling before training? When I searched online, the advice was that we should do feature scaling for logistic regression.
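On the scaling question: scikit-learn's LogisticRegression applies L2 regularization by default, so features on very different scales can influence both the fit and how quickly the solver converges. Scaling is not strictly mandatory, but it is a common precaution. A minimal sketch of one way to add it, assuming synthetic data (not the Titanic files) and a pipeline that is not taken from the video:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: two features on very different scales.
X = np.column_stack([
    rng.normal(30, 10, 200),      # age-like feature
    rng.normal(3000, 2000, 200),  # fare-like feature, much larger values
])
y = (X[:, 0] / 10 + X[:, 1] / 2000 + rng.normal(0, 1, 200) > 4.5).astype(int)

# The scaler is fitted on the training data inside the pipeline,
# then the logistic regression sees standardized features.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)
preds = model.predict(X)
```

Wrapping both steps in one pipeline means the same scaling is reused automatically when predicting on new data.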
Hi, thanks for your tutorial.
I've implemented your code, but why is the accuracy I got different from yours?
100% means overfitting
of course u can do more stuff to boost your performance
PCA, boost sampling, cross validation, even prior parameter
I agree, you can try/do a lot more to make it even better, for this one I tried to keep it minimal and simple
Actually it is not overfitting, because this accuracy was measured on the test set (unseen data), not the train set.
Hi bro how did you set up the CSV file on the jupyter because my CSV file was not defined thanks
Can you make a video about XGBoost? There aren't many resources for that.
Thanks for this! I have a question, at 3:45 how are you able to avoid writing the whole directory of the file and just say "train.csv", instead of writing the whole snake of the directory e.g. "C:\\Users\\etcetc\\Python\\titanic\\train.csv"?
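On the path question: a bare file name such as "train.csv" is a relative path, so Python looks for it in the current working directory, which for a Jupyter notebook is normally the folder the notebook was started from. A small sketch using only the standard library; the temporary folder and the one-row file are stand-ins, not the real competition data:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Create a tiny placeholder CSV inside the folder.
    csv_path = Path(folder) / "train.csv"
    csv_path.write_text("PassengerId,Survived\n1,0\n")

    previous = os.getcwd()
    os.chdir(folder)  # pretend the notebook lives in this folder
    try:
        # The short name works because it is resolved against the cwd.
        relative_ok = Path("train.csv").exists()
        resolved = Path("train.csv").resolve()
    finally:
        os.chdir(previous)  # restore the original working directory
```

If the file is reported as missing, printing os.getcwd() in the notebook shows which folder the short name is being resolved against.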
remember me when this channel is gonna go hit : )
Remember me too!
Bro, for *Embarked* you should go for nominal encoding, not label encoding, because the values are names of ports
Yeah you right
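For anyone who wants to try the nominal (one-hot) encoding suggested above, here is a minimal sketch with pandas. The four-row frame is a toy stand-in for the Embarked column, and this is one possible approach rather than the code from the video:

```python
import pandas as pd

# Toy stand-in for the Embarked column (port codes).
df = pd.DataFrame({"Embarked": ["S", "C", "Q", "S"]})

# One column per port, so no artificial ordering is implied between ports.
one_hot = pd.get_dummies(df, columns=["Embarked"], dtype=int)
```

Each row ends up with a 1 in exactly one of the new Embarked_* columns, whereas label encoding would assign the ports arbitrary integers that a linear model could read as an ordering.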
I really like your style of thinking and explaining. Could you please advise on any (free or not) courses/articles or anything you believe is good for beginners?
Why did we not use fit_transform on the test set?
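On the fit_transform question: fitting is the step where an encoder or scaler learns its mapping, and that mapping should come from the training data only. The test set is then passed through transform so it is encoded the same way. A minimal sketch with scikit-learn's LabelEncoder on toy values (the lists are illustrative, not the real columns):

```python
from sklearn.preprocessing import LabelEncoder

train_sex = ["male", "female", "female", "male"]
test_sex = ["female", "male"]

le = LabelEncoder()
train_codes = le.fit_transform(train_sex)  # learns the mapping from training data
test_codes = le.transform(test_sex)        # reuses that mapping, learns nothing new
```

Calling fit_transform again on the test set would learn a fresh mapping, which can differ from the training one whenever the test set contains a different set of categories.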
I think your code is excellent, but it freaks me out how many data scientists only see their accuracy as a result. Understanding and presenting the results in meaningful way is key to any science. So... who was likely to survive??
Very short, simple and explanatory, but you use machine learning techniques all through; you don't really explore and visualize the data.
This video is awesome by the way, and beginner-friendly
And the 100% people: rumour has it that some people got the passenger info from the actual Titanic records, which are publicly available. So that would obviously give 100%
Makes sense!
so are kaggle competitions genuine??
i always wonder how would people get 100% correct predictions
or is this specific to this competition only?
moreover they come with such huge prize pools
@@viralmedia.007 yes, the actual competitions which have prize money are very genuine. The rigged ones are usually very basic, or ones for which the data is already publicly available
everything else works for me except predictions when getting to 14:43 it just says "AttributeError: 'function' object has no attribute 'predict'"
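A note on that AttributeError: it usually means the variable holds a function rather than a fitted model, for example because a pair of parentheses is missing. The commenter's code is not shown, so the cause below is only a guess; TinyModel and build_model are hypothetical names used to reproduce the message without any external library:

```python
class TinyModel:
    """Hypothetical stand-in for a classifier with fit/predict."""

    def fit(self, X, y):
        # Remember the most common label in the training targets.
        self.majority_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


def build_model():
    return TinyModel()


clf = build_model  # bug: the function itself, not the object it returns
try:
    clf.predict([[1]])
except AttributeError as err:
    message = str(err)

clf = build_model().fit([[0], [1], [2]], [0, 1, 1])  # fix: call it, then fit
preds = clf.predict([[3]])
```

Running type(clf) in the notebook just before the failing line is a quick way to confirm whether the variable is a model or a function.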
Planning to do more real ones in the future?
yes
Got any ideas of some you think would be useful?
I watched a lot of your videos and I think you are very likely to place high, as you are really knowledgeable. You explain things very well. I think you can try the recent human protein competition. It's a fun weakly supervised classification problem.
Upvoted
Noob here, question: why did you clean your data through a function? Why not just run those exact commands outside of the function?
i think because he had 2 tables with input data and it's easier to write 1 function and call it 2 times than writing the algorithm 2 times and change something for each table
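The point about reusing one function for both tables can be sketched as follows. The clean function and the tiny frames here are hypothetical illustrations of the idea, not the exact steps from the video:

```python
import pandas as pd


def clean(data):
    # Hypothetical cleaning steps: drop some columns, fill missing ages.
    data = data.drop(columns=["Ticket", "Cabin"])
    data["Age"] = data["Age"].fillna(data["Age"].median())
    return data


# Toy frames standing in for train.csv and test.csv.
train = pd.DataFrame({
    "Survived": [0, 1, 1],
    "Age": [22.0, None, 26.0],
    "Ticket": ["A1", "B2", "C3"],
    "Cabin": [None, "C85", None],
})
test = pd.DataFrame({
    "Age": [None, 40.0],
    "Ticket": ["D4", "E5"],
    "Cabin": [None, None],
})

# The same steps are applied to both tables with one definition.
train_clean = clean(train)
test_clean = clean(test)
```

Keeping the steps in one place also makes it harder for the two tables to drift apart, which matters because the model expects the same columns at prediction time.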
Ur code gives error when we predict x test
can someone pls help me out here ? at 14:55 on running it shows "value error: X has 8 features per sample; expecting 7"
I have same error, haven't you solved it yet?
@@timgen-iu1qo yea i got my mistake... in the 2nd cell i wrote test = pd.read_csv("train.csv") instead of test = pd.read_csv("test.csv")... silly of me
@@magikarp1743 IMAGINE, same mistake... Thanks 😂😂
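For anyone else hitting the "8 features, expecting 7" error: loading train.csv twice presumably leaves the extra Survived column in the second table, which would account for the one additional feature. A quick column check before predicting catches this early. The frames below are toy stand-ins, and columns_match is a hypothetical helper:

```python
import pandas as pd

train = pd.DataFrame({"Survived": [0, 1], "Pclass": [3, 1], "Age": [22.0, 38.0]})
test = pd.DataFrame({"Pclass": [3], "Age": [30.0]})
wrong_test = train.copy()  # simulates reading train.csv into `test` by mistake

# The features the model was trained on: everything except the target.
feature_cols = list(train.drop(columns=["Survived"]).columns)


def columns_match(frame):
    # True only when the frame has exactly the training features, in order.
    return list(frame.columns) == feature_cols


ok = columns_match(test)
mistake_detected = not columns_match(wrong_test)
```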
I'm getting an error while splitting the data, can you help me? Or if you don't mind, can you send your number please? I will send a screenshot to you.
Well done. thanks for you efforts! 100% accuracy? I am sure they have cheated :)
I'm having errors while fitting the model
It says
Float() must be str or .... Not method
How did you handle this
I also get the same error
got solution for this????
I have no idea what I did after this error , I might have even left it entirely 😂, sorry guys !
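A note on the "Float() ... not method" error in this thread: a message of that shape typically appears when a method is passed around without being called, for example median written without parentheses when filling missing values. The failing code was never posted, so this is only a likely cause. A minimal sketch on a toy pandas Series:

```python
import pandas as pd

age = pd.Series([22.0, None, 26.0])  # toy stand-in for the Age column

try:
    float(age.median)  # bug: the method object itself, not its result
except TypeError as err:
    message = str(err)

# Fix: call the method so a number is used as the fill value.
filled = age.fillna(age.median())
```

Searching the cleaning code for median, mean, or mode written without a trailing () is a reasonable first check.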
Do u know how to find the most popular name among male Titanic passengers?
One with the maximum frequency should be the most. So use count() and max()
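The count-and-max idea can be sketched with pandas as below. This assumes the Name column follows a "Surname, Title. Given names" pattern and takes the first given name; the rows are made-up examples, not real passenger records:

```python
import pandas as pd

# Made-up rows in the assumed "Surname, Title. Given names" format.
df = pd.DataFrame({
    "Name": [
        "Smith, Mr. John Albert",
        "Jones, Mr. William",
        "Brown, Miss. Anna",
        "Taylor, Mr. John",
    ],
    "Sex": ["male", "male", "female", "male"],
})

males = df[df["Sex"] == "male"]

# Capture the first word after the title ("Mr.", "Master.", ...).
first_names = males["Name"].str.extract(r",\s*[^.]+\.\s*(\w+)")[0]

counts = first_names.value_counts()  # frequency of each first name
most_common = counts.idxmax()        # the name with the highest count
```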
LogisticRegression??? where's the neural network? :)
In the moment it felt like it would be overkill, in retrospect I regret it :3
Plz speedrun datasets like games 😂
How you mean? :P
@@AladdinPersson pick a random dataset and try how fast can u go from downloading to inference.