Data Science Beginner Project: Kaggle House Prices Regression Analysis (Full Walkthrough)

  • Published: Jul 9, 2024
  • Welcome to our latest data science project! In this YouTube tutorial, we'll work through an advanced regression analysis using Kaggle's House Prices dataset. The code from this project achieved a top 10% score!
    Kaggle Notebook: www.kaggle.com/code/ryannolan...
    Interested in discussing a Data or AI project? Feel free to reach out via email or simply complete the contact form on my website.
    📧 Email: ryannolandata@gmail.com
    🌐 Website & Blog: ryannolandata.com/
    🍿 WATCH NEXT
    Scikit-Learn and Machine Learning Playlist: • Scikit-Learn Tutorials...
    Optuna Hyperparameter Tuning: • Mastering Hyperparamet...
    Titanic Data Science Project: • Beginner Data Science ...
    Stacking Regressor: • Python Stacking Regres...
    MY OTHER SOCIALS:
    👨‍💻 LinkedIn: / ryan-p-nolan
    🐦 Twitter: / ryannolan_
    ⚙️ GitHub: github.com/RyanNolanData
    🖥️ Discord: / discord
    📚 *Practice SQL & Python Interview Questions: stratascratch.com/?via=ryan
    WHO AM I?
    As a full-time data analyst/scientist at a fintech company specializing in combating fraud within underwriting and risk, I've transitioned from my background in Electrical Engineering to pursue my true passion: data. In this dynamic field, I've discovered a profound interest in leveraging data analytics to address complex challenges in the financial sector.
    This YouTube channel serves as both a platform for sharing knowledge and a personal journey of continuous learning. With a commitment to growth, I aim to expand my skill set by publishing 2 to 3 new videos each week, delving into various aspects of data analytics/science and Artificial Intelligence. Join me on this exciting journey as we explore the endless possibilities of data together.
    *This is an affiliate program. I may receive a small portion of the final sale at no extra cost to you.
  • Science

Comments • 47

  • @RyanNolanData  8 months ago +5

    Hey guys, I hope you enjoyed the video! If you did, please subscribe to the channel!
    Here is the Kaggle Notebook: www.kaggle.com/code/ryannolan1/kaggle-housing-youtube-video
    I do plan on updating it + adding more notes/comments to it.
    Also practically everything I covered in this project is on the channel. You can find the videos in this playlist: ruclips.net/video/SjOfbbfI2qY/видео.html&ab_channel=RyanNolanData
    Up next I'm working on a Python Classes course and the start of a series on Deep Learning!

  • @TheErick211_ 2 months ago +3

    If it's relevant at all, I would recommend that when you zoom in on the screen, you move the zoom to the spot you are reading or talking about. Often in the video the zoomed area wasn't the relevant one.

  • @mgrahamization 2 months ago

    This is fantastic and I've subscribed to your channel. I'm new to this, but people like you who spend their time creating videos like this are commendable. I hope to give back like this one day. Also, you mentioned someone on Kaggle that you got some tips from. Who was that? I'm fascinated to know who has more knowledge than someone like you, who has heaps.

  • @kwizeralambert1316 8 months ago +3

    You are the best teacher. Keep it up. I once started on Kaggle but have not entered any competition, but this encourages me to consider it.

  • @elfincredible9002 3 months ago +1

    I just finished it. Dope... Thanks so much.

  • @japyh4 8 months ago +1

    Thanks for the video, it was awesome.

  • @pradipthij3552 1 month ago

    hey thank you for this amazing vid.

  • @wahyunanandika1679 18 days ago

    Thanks man, it helped me a lot

  • @mattadata 8 months ago

    Ok, dude... I haven't even watched the video yet. I'm just here to say that on my way home from work today I was thinking about doing this EXACT project and I completely forgot about it. All of a sudden your video pops up on my feed... Yo, data science out here reading minds!

    • @RyanNolanData  8 months ago

      Haha awesome! Hope you enjoy it

  • @pubgdoremongamer8823 7 days ago

    Sir, why don't you just use the R² score instead of MSE?
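
For readers weighing the same question, here is a minimal sketch of how the two metrics relate. The numbers are made up for illustration and are not taken from the video or the notebook. One possible reason to prefer an error-based metric here is that the Kaggle House Prices competition itself is scored on RMSE of the log prices, so an MSE-style number tracks the leaderboard more directly, while R² is a unitless rescaling of the same squared error.

```python
# Pure-Python versions of the two metrics, on made-up numbers.

def mean_squared_error(y_true, y_pred):
    """Average squared residual; expressed in squared target units."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 - SS_res / SS_tot; unitless, 1.0 is a perfect fit, can go negative."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [100.0, 200.0, 300.0, 400.0]  # hypothetical actual values
y_pred = [110.0, 190.0, 310.0, 390.0]  # hypothetical predictions

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(f"MSE = {mse:.1f}, R^2 = {r2:.3f}")  # MSE = 100.0, R^2 = 0.992
```

Both numbers are built from the same sum of squared residuals, so ranking models by one usually ranks them the same way by the other on a fixed dataset.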

  • @shivamsapru2246 8 months ago

    Your videos are great. I just love this channel. It's just that, kindly try to focus the recording on the code when you are typing. 🙂

  • @richardweston3554 4 months ago

    I'm a little bit over an hour in and good video so far! I think you could have saved a lot of time doing many things programmatically so far though.

    • @RyanNolanData  4 months ago

      I agree with you, it’s not the cleanest code

  • @mattysmirks 4 months ago

    Thank you for creating this video. Can you expand more on why you did not include both Lasso and ElasticNet at the 2:25:10 mark? I'm curious if it made the Stacking Regressor worse at the very end in your original notebook.

    • @RyanNolanData  4 months ago +1

      If I remember correctly, it made the score worse when I submitted the results. I had kept a spreadsheet with all my attempts.
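
A comparison like this can also be tried locally before submitting. The sketch below is only an illustration under stated assumptions: the data is synthetic (`make_regression`), and the base models, `alpha` values, and final estimator are stand-ins chosen for the example, not the stack used in the video. It cross-validates a `StackingRegressor` with and without the Lasso and ElasticNet pair.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data, not the Kaggle dataset.
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)


def build_stack(include_linear_pair):
    """Build a small stack; optionally add Lasso and ElasticNet as base models."""
    estimators = [
        ("ridge", Ridge(alpha=1.0)),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
    ]
    if include_linear_pair:
        estimators += [
            ("lasso", Lasso(alpha=0.1)),
            ("enet", ElasticNet(alpha=0.1)),
        ]
    return StackingRegressor(estimators=estimators, final_estimator=Ridge())


scores = {}
for label, flag in [("without pair", False), ("with lasso + enet", True)]:
    cv_scores = cross_val_score(
        build_stack(flag), X, y, cv=5, scoring="neg_root_mean_squared_error"
    )
    scores[label] = -cv_scores.mean()  # flip sign back to a positive RMSE
    print(f"{label}: CV RMSE = {scores[label]:.2f}")
```

A local cross-validation score and the leaderboard score can disagree, which is consistent with the reply above: the decision there was made from submission results.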

  • @vancouverrrr 6 months ago

    i knew u looked familiar and then saw the vintage cards in the back Lol, im subscribed to ur card channel too

    • @RyanNolanData  6 months ago

      No way haha first dual subscriber

  • @olinabin2004 6 months ago +1

    You earned a subscriber :)

  • @s.s.sdhyuthidhar2276 4 months ago

    Hey Nolan, do you have separate tutorials for every machine learning model you used in this tutorial?

  • @OrangeTomato474 8 months ago +1

    I'm trying to build something similar, but instead of prediction they have asked me to explain house prices:
    a data science model that explains how different factors (GDP, unemployment, interest rate, etc.) impacted home prices over the last 20 years.
    Any suggestions on what type of model I should use for this problem?

    • @RyanNolanData  8 months ago

      I would look at implementing Principal Component Analysis to see what has the biggest impact

  • @SophiaUmaru 10 days ago

    I try to run train_df.columns and test_df.columns but I get a NameError saying train and test are not defined. Please, what could be wrong?

    • @forfiverr3873 9 days ago

      You probably haven't initialised the variables. Read them from the provided CSV files using pd.read_csv().
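
A minimal sketch of that fix follows. To keep it runnable anywhere, it reads from in-memory text with made-up rows instead of the real files; in a notebook you would pass the paths to your own copies of the competition's train and test CSVs. The column names shown are assumptions for the example.

```python
import io

import pandas as pd

# In-memory stand-ins for the competition files (made-up rows). In a real
# notebook, replace these with the paths to your train and test CSV files.
train_csv = io.StringIO("Id,LotArea,SalePrice\n1,5000,150000\n2,7500,210000\n")
test_csv = io.StringIO("Id,LotArea\n101,6200\n102,8100\n")

# These assignments must run before any cell that uses train_df or test_df;
# otherwise Python raises a NameError because the names do not exist yet.
train_df = pd.read_csv(train_csv)
test_df = pd.read_csv(test_csv)

print(list(train_df.columns))
print(list(test_df.columns))
```

In a notebook, the same error also appears if the cell that defines the variables exists but was never executed, or if the kernel was restarted afterwards.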

  • @senthilkumars1061 6 months ago

    A doubt: when we have already been given separate train and test data, why do you perform an additional split using train_test_split?

    • @RyanNolanData  6 months ago

      So I can get a better model for my train set

    • @RyanNolanData  6 months ago

      Think of train, test, validation

    • @SamLaseter 1 day ago

      @RyanNolanData Shouldn't you do the imputation after you split your data into training and test sets to avoid data leakage?
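
The ordering raised in this thread can be sketched as follows: split first, fit the imputer on the training rows only, then reuse the learned fill values on the held-out rows. This is an illustration on random made-up data, not the notebook's code, and the median strategy is an assumption chosen for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Made-up feature matrix with roughly 10% missing values.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[rng.random(X.shape) < 0.1] = np.nan
y = rng.normal(size=100)

# 1) Split first, so the validation rows stay unseen.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 2) Learn the fill values (per-column medians) from the training rows only.
imputer = SimpleImputer(strategy="median")
X_train_filled = imputer.fit_transform(X_train)

# 3) Reuse those same training medians on the validation rows.
X_val_filled = imputer.transform(X_val)

print("fill values learned from train:", imputer.statistics_)
```

Fitting the imputer on the full dataset before splitting would let statistics from the validation rows influence the training data, which is the leakage the comment refers to. Wrapping the imputer and model in a scikit-learn `Pipeline` applies the same ordering automatically inside cross-validation.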

  • @itsmephougat 3 months ago

    With some tuning I got 0.018.
    Can you make more competition videos like this? I love them.

    • @RyanNolanData  3 months ago +1

      Nice job! And someday. I want to finish building out my OpenAI/LangChain playlist and then work on a dbt one first

    • @Prathamydvv 2 months ago

      Can I see your code, for learning purposes?

  • @forfiverr3873 9 days ago

    Bro, this is going to seem very dumb... why do you plot each parameter against the price on the y-axis? Why not use the id?

  • @olinabin2004 6 months ago

    Timestamp for personal purpose : 47:00

  • @ttien1612 2 months ago

    R² score = -4.0019e+19. I think something went wrong somewhere.

  • @coopernik 5 months ago

    Very instructive video but the zoom was a bit off

    • @RyanNolanData  5 months ago +1

      I have all the code in the description if anything is off

  • @vishnukp6470 8 months ago

    Can you do a time series project next time?

    • @RyanNolanData  8 months ago +1

      Next year for sure! I’m currently studying Deep Learning