Beginner Data Science Portfolio Project Walkthrough (Kaggle Titanic)

Ryan & Matt Data Science

28K views

804

Published: Feb 9, 2025

Comments • 97

@RyanAndMattDataScience 6 months ago ⁺¹
Hey guys I hope you enjoyed the video! If you did please subscribe to the channel!
Join our Data Science Discord Here: discord.com/invite/F7dxbvHUhg
If you want to watch a full course on Machine Learning check out Datacamp: datacamp.pxf.io/XYD7Qg
Want to solve Python data interview questions: stratascratch.com/?via=ryan
I'm also open to freelance data projects. Hit me up at ryannolandata@gmail.com
*Both Datacamp and Stratascratch are affiliate links.
@KyaBroderick 7 months ago ⁺⁹
this needs more views. was so in depth and perfect for a beginner!
@collingreens7 7 months ago ⁺³
I loved the walkthrough, honestly for about the last 35 mins I had no idea what was going on but it's really cool that people like you are giving free tutorials on such complex work. Thanks!
@RyanAndMattDataScience 7 months ago ⁺¹
No problem. Everything I go over in this vid is covered in my ML and Python playlists. Check them out!
@RyanAndMattDataScience 1 year ago ⁺⁸
Hope you enjoyed this video, it took so long to produce. If you enjoyed it, please subscribe to the channel.
I just uploaded the 2nd part of this video where I improve the model (linked down below)
Below are a few links that you should check out:
Part 2: ruclips.net/video/KzK1pifa2Vk/видео.html&ab_channel=RyanNolanData
Kaggle Code: www.kaggle.com/code/ryannolan1/titanic-wip-9-12
Twitter: twitter.com/RyanNolanData
LinkedIn: www.linkedin.com/in/ryan-p-nolan/
SciKit-Learn Tutorials: ruclips.net/video/SjOfbbfI2qY/видео.html&ab_channel=RyanNolanData
Practice SQL & Python Interview Questions: stratascratch.com/?via=ryan
@mericcapar2447 1 month ago ⁺¹
Thank you! I learned machine learning algorithms and made my first Kaggle project with you! I am very grateful for that and I will watch your other videos. Thanks for the great content.
@RyanAndMattDataScience 1 month ago ⁺¹
Hey congrats man huge step forward! I appreciate you checking them out. Also join our discord
@mericcapar2447 1 month ago
@@RyanAndMattDataScience link is expired for this video. I will check other videos but maybe you should check other videos too.
@mericcapar2447 1 month ago
@@RyanAndMattDataScience also it could be about my country, because Discord is banned here :D
@GregThatcher 1 month ago
Thanks!
@RyanAndMattDataScience 1 month ago
Thanks Greg! Really appreciate it
@AmaRan31 1 year ago ⁺²
It was a super useful video and I am happy to have done my first Data Science project. Thank you very much.
@RyanAndMattDataScience 1 year ago
Congrats on completing your first project
@zacharygrant1829 2 months ago
Appreciate it man, graduated last April and this series has been a lifesaver to refresh my Python skills before I begin working.
@RyanAndMattDataScience 2 months ago
No problem, join our discord also! Plan on expanding out content to our website in 2025 for more details on vids and such
@yakosti 1 year ago ⁺⁵
Thank you for this, your videos are so helpful. Keep it up!
@RyanAndMattDataScience 1 year ago
Np new one tomorrow. New project this week
@janneskleinau6332 27 days ago ⁺¹
Why do we write
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, stratify=y, random_state=21)
and then never use X_valid and y_valid again? Isn't that a useless waste of data for the training?
@Searchingxpeace 26 days ago
They are meant for testing the model's performance on data it was not trained on.
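A minimal sketch of what the held-out split is for, assuming synthetic data in place of the prepared Titanic features and a `LogisticRegression` as a stand-in model (neither is taken from the video). The split only pays off if the model is actually scored on `X_valid` and `y_valid`:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared features and target (assumption, not real data).
rng = np.random.default_rng(21)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Same call as in the comment above: 80% for training, 20% held out.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=21
)

model = LogisticRegression()
model.fit(X_train, y_train)

# Score on rows the model never saw during fitting.
valid_accuracy = accuracy_score(y_valid, model.predict(X_valid))
print(f"validation accuracy: {valid_accuracy:.3f}")
```

If the validation rows are never scored, the question is fair: in that case the split really does just shrink the training set.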
@ritamchatterjee8785 1 year ago ⁺¹
yes man, much appreciated for your efforts
@RyanAndMattDataScience 1 year ago ⁺²
Thanks for working on this project! My housing one will be out in November
@mdaniels6311 26 days ago
I checked it out and those without ages were MORE likely to survive.
This is counterintuitive, as I thought that poorer people's ages may not have been entered, and thus would appear more in those who did not survive.
I then wondered if perhaps those without ages were infants or babies, and perhaps as children were first on boats, they survived at higher rates.
I will spend some time checking this data out to try and find an answer.
But it also made me realise that having a good social and political understanding of the world will help a data scientist and machine learning practitioner, as these understandings may enhance the ability to explain odd findings.
@RyanAndMattDataScience 25 days ago
This is quite interesting
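One way to check a claim like this is to flag rows with a missing `Age` and compare survival rates between the two groups. The sketch below uses made-up rows, so its numbers say nothing about the real dataset. `Age_Missing` is a hypothetical helper column.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for train.csv (values are invented for illustration).
df = pd.DataFrame({
    "Age": [22.0, np.nan, 38.0, np.nan, 4.0, np.nan],
    "Survived": [0, 1, 1, 0, 1, 0],
})

# Flag rows where Age was not recorded.
df["Age_Missing"] = df["Age"].isna()

# Mean of a 0/1 column is the survival rate within each group.
survival_by_missing = df.groupby("Age_Missing")["Survived"].mean()
print(survival_by_missing)
```

On the real data, grouping by both `Age_Missing` and `Pclass` would show whether any gap is really a passenger-class effect.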
@ChillWebDeveloper 1 year ago ⁺¹
At least you explain in detail what you are typing after you copy in your line of code. Nice video btw
@RyanAndMattDataScience 1 year ago
Thank you!
@robertbenson8554 7 months ago
Excellent video. So much in it, thought process, code tips etc.
@RyanAndMattDataScience 7 months ago
Thank you
@autiematic7224 6 months ago
So helpful!! Ideal demonstration for my first projects, going forward
@kotonelm069 4 days ago
Is it always better to cut age groups into more, smaller groups?
@idontevenwanttomakea 1 year ago ⁺¹
Hi, great video. One idea - instead of writing out so many loc statements, it might be easier to just use labels=False when using qcut.
@RyanAndMattDataScience 1 year ago
Thank you! And I’ll look into it for the next time I use it
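A short sketch of the `labels=False` suggestion, using invented ages and four bins rather than whatever the video used. With `labels=False`, `pd.qcut` returns the integer bin index directly, so no per-bin `.loc` assignments are needed.

```python
import pandas as pd

# Made-up ages; the real column comes from the Titanic training data.
ages = pd.Series([2, 9, 18, 21, 25, 30, 36, 42, 50, 63, 71, 80], name="Age")

# Default behaviour: each value is mapped to an Interval such as (20.25, 33.0].
age_interval = pd.qcut(ages, q=4)

# labels=False: each value is mapped to its bin number, 0 .. q-1.
age_bin = pd.qcut(ages, q=4, labels=False)

print(pd.DataFrame({"Age": ages, "interval": age_interval, "bin": age_bin}))
```

The integer codes can go straight into a model, while the interval version is handy for reading off where the cut points fell.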
@samallen598 10 months ago
Very cool video! Would love to see some of this type of content.
@RyanAndMattDataScience 10 months ago
Have a lot of other vids to check out!
@aviluminos8759 9 months ago
I got 78% result using forest. Thanks for the brilliant explanation!
@RyanAndMattDataScience 9 months ago
No problem, awesome job
@jessedostal3256 1 month ago
Your head is over the code that you are generating. It would help substantially if you could move the part with your camera to a different part of the screen (perhaps upper left hand corner?).
@RyanAndMattDataScience 1 month ago
Hey, I should have the notebook for this vid available on my Kaggle account. Also going to make this into an article
@mehdismaeili3743 2 months ago
Excellent.
@rakshitshukla4205 6 months ago
Thank you so much for the video!!
@alihajizadeh7749 1 year ago
It really helped a lot, thank you, keep going
@RyanAndMattDataScience 1 year ago
Thank you
@pasindugimhan5779 1 year ago
Great work brother! I have subscribed and am waiting for the next Kaggle projects too
@RyanAndMattDataScience 1 year ago ⁺¹
Thanks and I just uploaded one this week!
@pasindugimhan5779 1 year ago
@@RyanAndMattDataScience Also, at the final steps I ran into some errors. Is there any way to contact you, please?
@AbelGriffen 11 months ago
Hi @Ryan Thanks for making this amazing video. I just want to understand why you used "Plus one for yourself" @25:05? Thank you!
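The comment does not say which line is on screen at 25:05. If it is a family-size feature (an assumption), the usual reasoning is that `SibSp` and `Parch` count only the relatives travelling with a passenger, so adding 1 includes the passenger themselves. `Family_Size` below is a hypothetical column name.

```python
import pandas as pd

# Toy rows: SibSp = siblings/spouses aboard, Parch = parents/children aboard.
df = pd.DataFrame({"SibSp": [1, 0, 3], "Parch": [0, 0, 2]})

# Relatives aboard plus the passenger: someone travelling alone gets 1, not 0.
df["Family_Size"] = df["SibSp"] + df["Parch"] + 1
print(df)
```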
@codingcambodia 1 year ago ⁺¹
keep up your good work
@RyanAndMattDataScience 1 year ago ⁺¹
Thank you! Working on more videos this weekend!
@ayushijainrkt 9 months ago
1:17:00 I don't understand the usage of .transform('count'). Can someone explain with an example?
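A small example of the difference, using invented tickets. The column names `Ticket`, `PassengerId` and `TicketCount` are assumptions and may not match the ones used at 1:17:00. A plain `.count()` returns one row per group, while `.transform('count')` returns one value per original row, so it can be assigned back as a new column.

```python
import pandas as pd

# Made-up tickets: two passengers share "A1", three share "B2", one travels alone.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Ticket": ["A1", "A1", "B2", "B2", "B2", "C3"],
})

# Aggregation: collapses to one row per ticket (length 3).
per_group = df.groupby("Ticket")["PassengerId"].count()

# Transform: same group sizes, broadcast back to every original row (length 6).
df["TicketCount"] = df.groupby("Ticket")["PassengerId"].transform("count")

print(per_group)
print(df)
```

Each passenger row now carries the size of its own ticket group, which is the shape a model needs for a per-row feature.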
@elfincredible9002 10 months ago
Thanks... I really enjoyed it and you explain so well.... Bless you.
@RyanAndMattDataScience 10 months ago
Thank you
@japyh4 1 year ago ⁺¹
Thank you so much. Keep it up:)
@RyanAndMattDataScience 1 year ago
No problem
@chamudigamage7417 1 month ago
Can you show us the part where you get the input? I don't know if I'm doing it right
@RyanAndMattDataScience 1 month ago ⁺¹
Shoot me a question on discord with the link to your notebook I’ll answer it on January's Q/A
@Al-Ahdal 9 months ago
@Ryan Nolan Data: Excellent vdo.
@RyanAndMattDataScience 9 months ago
Much appreciated
@katorechaitanya 1 year ago ⁺¹
You explained it in a fantastic way. Just one request:
will you please provide a valid link for the notebook? The current one is not working.
@RyanAndMattDataScience 1 year ago
Hey, just checked the link it's working?
www.kaggle.com/code/ryannolan1/titanic-wip-9-12
@umarmusisi8853 8 months ago
Awesomely awesome...i had to sub
@RyanAndMattDataScience 8 months ago
Np
@onurdatascience 1 year ago ⁺¹
Amazing content!
@RyanAndMattDataScience 1 year ago ⁺¹
Thanks man, you helped a ton with this vid
@tosinwilliams9343 1 year ago
Just a suggestion: your next video should be on using ChatGPT for this project
@lecturesfromleeds614 1 month ago
If they had not called it "un-sinkable" I doubt it would have got half the attention that it did? I thought that its maiden voyage was Liverpool to New York and it sank en route? But the Dataset has other locations. Also didn't many of the survivors commit suicide not long afterwards?
@RyanAndMattDataScience 1 month ago
Not sure? It did have a sister ship also. I do collect titanic cards though. Have some from 1911 and a bit later
@sildistruttore 10 months ago
What's the point in splitting the dataset into train and validation if then at the end you are using only the training set to do the grid search with cross validation? Doesn't the grid search directly create the validation set from the training set you give it?
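A sketch of how the two roles can differ, assuming synthetic data and a small illustrative parameter grid (neither is taken from the video). `GridSearchCV` does build its own folds inside the training set, but those fold scores are used to pick the hyperparameters, so they tend to be slightly optimistic. A separate validation split gives a score on rows the search never touched.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the prepared features and target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=21
)

# The search splits X_train into 5 folds internally to compare settings.
search = GridSearchCV(
    RandomForestClassifier(random_state=21),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# cv_score drove the selection; valid_score is an independent check.
cv_score = search.best_score_
valid_score = search.score(X_valid, y_valid)
print(search.best_params_, round(cv_score, 3), round(valid_score, 3))
```

If the held-out score is never computed, the commenter's point stands: cross-validating on the full labelled set would make better use of the data.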
@ixcel87 1 year ago
great tutorial, can't wait to check out part 2!
question on correlation map; how did you use it to narrow down your parameters/features?
@RyanAndMattDataScience 1 year ago
Part 2 is out! And I did this project a long time ago will try to take a look at the code and see the reasoning
@ixcel87 1 year ago
@@RyanAndMattDataScience
Thanks again! I will view part 2 today. Also, definitely let me know about the correlation map and how it was used!
@Orokusaki1986 4 months ago
quick question: does Kaggle give you a rating based on speed/efficiency? I'm wondering specifically about just importing the whole libraries.
@tosinwilliams9343 1 year ago
This was really helpful 🥳🥳🥳🥳
@RyanAndMattDataScience 1 year ago
Thanks
@alanjohnstone8766 10 months ago
The length of the name is dominated by married ladies who have their married name AND their maiden names in brackets. Here are the top few:
'Penasco y Castellana, Mrs. Victor de Satode (Maria Josefa Perez de Soto y Vallejo)',
'Phillips, Miss. Kate Florence ("Mrs Kate Louise Phillips Marshall")',
'Duff Gordon, Lady. (Lucille Christiana Sutherland) ("Mrs Morgan")',
'Brown, Mrs. Thomas William Solomon (Elizabeth Catherine Ford)',
'Andersson, Mrs. Anders Johan (Alfrida Konstantia Brogren)'
So I think when noting that survival is related to name length you are actually picking up that name length is a predictor of being female, and women of course have a higher chance of surviving.
Analysing this dataset is addictive - I must give it up!
@RyanAndMattDataScience 10 months ago
There's so much to take a look at
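A quick way to probe this confounding idea is to compute name length and compare it across `Sex`. The sketch below uses a five-name toy sample in the dataset's name format, so it only illustrates the mechanics and says nothing about the full data. `Name_Length` is a hypothetical column name.

```python
import pandas as pd

# Tiny illustrative sample; two of the names are quoted in the comment above.
df = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Allen, Mr. William Henry",
        "Heikkinen, Miss. Laina",
        "Brown, Mrs. Thomas William Solomon (Elizabeth Catherine Ford)",
        "Andersson, Mrs. Anders Johan (Alfrida Konstantia Brogren)",
    ],
    "Sex": ["male", "male", "female", "female", "female"],
})

# Character count of the full name string, brackets included.
df["Name_Length"] = df["Name"].str.len()

# If length mostly tracks Sex, the group means will be far apart.
mean_len_by_sex = df.groupby("Sex")["Name_Length"].mean()
print(mean_len_by_sex)
```

On the real training data, comparing survival against `Name_Length` separately within each `Sex` group would show whether length carries any signal beyond sex.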
@jacksun8129 7 months ago ⁺¹
Hi Ryan, I am new to data science. I am a bit lost on what the point of analyzing the ticket number and passenger name is. What is the goal of doing that? Same with qcuts, are we doing them to help with a decision tree model? Do we need to do any of this if we just build a regression model?
@mdaniels6311 26 days ago
As richer people tend to have longer names, it is worth seeing if there are any correlations or patterns in the data.
@lenkapang-ek4fe 3 months ago
I don't understand pipelines :( How can I learn them?
@RyanAndMattDataScience 3 months ago ⁺¹
We have a video on them
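For readers stuck on the same point, here is a minimal scikit-learn pipeline sketch. The toy rows, the three columns and the `LogisticRegression` model are illustrative assumptions, not the setup from the video. A `Pipeline` chains preprocessing and a model into one object, so `fit` and `predict` apply every step in order.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Invented rows with a missing Age, to show imputation happening inside the pipeline.
X = pd.DataFrame({
    "Age": [22.0, 38.0, np.nan, 35.0, 54.0, 2.0, 27.0, np.nan],
    "Fare": [7.25, 71.28, 7.92, 53.1, 51.86, 21.08, 11.13, 8.05],
    "Sex": ["male", "female", "female", "female", "male", "male", "female", "male"],
})
y = [0, 1, 1, 1, 0, 0, 1, 0]

# Numeric columns: fill gaps with the median, then standardise.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Route each column group to its own preprocessing.
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Sex"]),
])

# Preprocessing and model behave as a single estimator.
clf = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression()),
])
clf.fit(X, y)
predictions = clf.predict(X)
print(predictions)
```

Because the imputer and scaler are fitted inside `clf.fit`, the same object can be passed to cross-validation without leaking statistics from held-out rows.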
@md.ishraquebinshafique1968 1 month ago
One question to the community.
Which one of the two is better:
1. df.describe()
2. print(df.describe())
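Both options compute the same summary, and the difference is only in how it is shown. The sketch below uses an invented three-row frame. In a notebook, a bare `df.describe()` is rendered as a table only when it is the last expression in a cell, whereas `print(...)` emits plain text anywhere, including in scripts.

```python
import pandas as pd

# Made-up numbers, just enough to produce a summary.
df = pd.DataFrame({"Age": [22, 38, 26], "Fare": [7.25, 71.28, 7.92]})

# describe() returns an ordinary DataFrame that can be stored or indexed.
summary = df.describe()

# print() shows its plain-text form; this works mid-cell and outside notebooks.
print(summary)

# Individual statistics can be read like any other DataFrame cell.
mean_age = summary.loc["mean", "Age"]
```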
@Ананімна 1 year ago
hello! thank you for your video, i am trying to follow you and repeat all the steps. i have found a better way to assign labels for age groups :
df['Age_Lebel'] = pd.qcut( df['Age'], 8, labels = np.arange(8) + 1 )
hope it can be helpful!
@RyanAndMattDataScience 1 year ago ⁺¹
Awesome! I may try to revisit this in the future.
@alanjohnstone8766 10 months ago ⁺¹
I do not think you meant to make the young French girls ‘noble’ which you did. I am just starting to learn pandas with your help but some of the complicated string editing you did would have been so much simpler and more understandable if done in an old fashioned ‘for loop’. I know it is frowned upon by ‘experts’ but the whole point of Python is that the code is readable.
@RyanAndMattDataScience 10 months ago
Probably a small mistake. Curious if it did better with not marking them as noble
@SYCS13YashGadhave 7 months ago
can we add this project to our resume ?
@RyanAndMattDataScience 7 months ago
Sure
@ChillWebDeveloper 1 year ago
nice video tutorial chief! What kind of extension do you use ?
@RyanAndMattDataScience 1 year ago
Wdym extension?
@ChillWebDeveloper 1 year ago
@@RyanAndMattDataScience for example, the pop-up autocorrect as you are typing, or something like that
@saadAhmed-co1si 9 months ago
Can you make more projects every month?
I want to learn more about projects.
@RyanAndMattDataScience 9 months ago ⁺¹
Eventually. I’m so busy though atm going through a backlog of videos to create
@J6rms 1 year ago
Am I the only one that can't see anything he types or clicks in the beginning of "Starting the Project" (from 9:00 for approx a minute)? :/
@RyanAndMattDataScience 1 year ago
Hey there was a small editing error but all the code is in the description through the Kaggle link
@mn4769 9 months ago
Why is my Age output 0 in every row at 40:57?
@DJAMOH1 3 months ago
Ever found this?
@tamtam8420 7 months ago
At 48:37 there is a shorter way of extracting Title: train_df['Title'] = train_df['Name'].str.extract(r' ([A-Za-z]+)\.', expand=False)
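A runnable version of that one-liner on three sample names in the dataset's format. The pattern looks for a space, then a run of letters captured as the title, then a literal dot. The raw string (`r'...'`) keeps Python from treating `\.` as an invalid string escape.

```python
import pandas as pd

# Names in the "Surname, Title. Given names" layout used by the dataset.
train_df = pd.DataFrame({"Name": [
    "Braund, Mr. Owen Harris",
    "Heikkinen, Miss. Laina",
    "Brown, Mrs. Thomas William Solomon (Elizabeth Catherine Ford)",
]})

# expand=False returns a Series (one capture group) instead of a one-column DataFrame.
train_df["Title"] = train_df["Name"].str.extract(r" ([A-Za-z]+)\.", expand=False)
print(train_df["Title"].tolist())
```

Only the first match per name is kept, so later dots in the name do not affect the result.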
