Complete Machine Learning Project for Absolute Beginners (Tutorial)

  • Published: Aug 8, 2022
  • Machine Learning Project for Absolute Beginners: schoolofmachinelearning.com/a...
    Dataset: github.com/upgini/upgini/raw/...
    Machine learning projects are a crucial part of learning ML and, most importantly, a huge part of becoming a machine learning engineer. Doing projects builds your knowledge of ML and lets you showcase what you have learned.
    This is a complete tutorial for a beginner-level sales forecasting project using machine learning. The dataset we will use contains five years' worth of product sales data. Our goal is to forecast the sales of those products for the next three months. To achieve this goal we will use a state-of-the-art gradient boosting algorithm as well as a Python library called Upgini for data enrichment.
    By completing this project, you will learn:
    1. How to effectively use popular Python libraries like pandas
    2. How to use CatBoost
    3. How to enrich data with Upgini
    4. Why data enrichment is important
    5. What SHAP values are
    6. What SMAPE values are
    7. How to split time-series datasets into training and testing sets
    8. How to train and test models
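Items 6 and 7 in the list above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the code from the video: `smape` uses one common definition of SMAPE (the mean of |F - A| / ((|A| + |F|) / 2), in percent), `time_split` is a hypothetical helper that splits by a cutoff date so the test set lies strictly after the training set, and the sales numbers are made up.

```python
from datetime import date, timedelta


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        # Convention assumed here: a pair where both values are 0 counts as zero error.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return 100 * total / len(actual)


def time_split(rows, cutoff):
    """Split (date, value) rows chronologically: train before cutoff, test from cutoff on."""
    train = [r for r in rows if r[0] < cutoff]
    test = [r for r in rows if r[0] >= cutoff]
    return train, test


# Hypothetical daily sales: 10 days of made-up numbers.
start = date(2022, 1, 1)
rows = [(start + timedelta(days=i), 100 + i) for i in range(10)]
train, test = time_split(rows, cutoff=date(2022, 1, 8))

# Naive baseline: forecast every test day with the last training value.
last_value = train[-1][1]
forecast = [last_value] * len(test)
score = smape([value for _, value in test], forecast)
print(f"train={len(train)} rows, test={len(test)} rows, SMAPE={score:.2f}%")
```

Unlike a random split, a chronological split never lets the model see data from after the period it is evaluated on, which mirrors how a forecast is used in practice.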
    The dataset we will look at is
    Machine Learning Roadmap 2022 Website:
    bit.ly/LearnML2022
    Join me on #100DaysOfML and follow along to learn machine learning!
    Start from day 0: • 100 Days Of ML
    ------------------------------------------------------------------------------
    Discord Link for School Of Machine Learning:
    ------------------------------------------------------------------------------
    / discord
    -------------------------------------------------------------------------
    LINKS:
    --------------------------------------------------------------------------
    🛤️ NEW Machine Learning Roadmap 2024 Website:
    schoolofmachinelearning.com/2...
    --------------------------------------------------------------------------
    MORE VIDEOS:
    --------------------------------------------------------------------------
    📌I'm Starting My Machine Learning Company (Day 1)
    • I'm Starting My Machin...
    📌Top Machine Learning Certifications For 2021
    • Top Machine Learning C...
    📌Why You Should NOT Learn Machine Learning!
    • Why You Should NOT Lea...
    📌How I Learnt Machine Learning In 6 Steps (3 months)
    • How I Learnt Machine L...
    📌How To Learn Machine Learning For Free
    • How To Learn Machine L...
    --------------------------------------------------------------------------
    Follow me:
    --------------------------------------------------------------------------
    Subscribe: ruclips.net/user/smithakolan...
    LinkedIn: / smithakolan
    Instagram: / smithakolan
  • Science

Comments • 55

  • @janebirman5057
    @janebirman5057 1 year ago +5

    I’m so glad I found this channel! It’s very well organized, it has high quality video topics, and a level of expertise that I haven’t seen in other DS/ML YouTubers. Keep up the great work!

    • @SmithaKolan
      @SmithaKolan 1 year ago

      Thank you Jane, that makes me so happy to hear. 😊

  • @ShoaibKhan
    @ShoaibKhan 1 year ago +4

    This is an excellent video for absolute beginners! Looking forward to more videos!

  • @SaffatUllah
    @SaffatUllah 1 year ago +2

    This was such an amazing video. Could you please do more of these step-by-step machine-learning project tutorials? They're really helpful!

  • @idowuisaac2278
    @idowuisaac2278 1 year ago +1

    Hi Smitha, thank God I found you online. ML projects are basically all I need now. By the way, was that project complete? I thought you were going to predict something.

  • @gulistanibadat4565
    @gulistanibadat4565 1 year ago +2

    As always, a very useful video!

  • @upgini
    @upgini 1 year ago +1

    If you want to use Upgini in production, you can use the transform method. It enriches any dataset at the production step with up-to-date features for the present day.

  • @commercial3750
    @commercial3750 1 year ago

    Amazing video. I learned so much. Thank you for your help

  • @sriharib3641
    @sriharib3641 1 year ago +1

    I deferred it for 6 days, then I finally watched it. The 100 days challenge has no more videos.

  • @Death_User666
    @Death_User666 7 months ago

    LEGENDARY
    subscribed and liked

  • @michelchaghoury9629
    @michelchaghoury9629 1 year ago +13

    Please, we need more project-based tutorials. Keep going, this is really helpful.

    • @SmithaKolan
      @SmithaKolan 1 year ago +5

      Glad you found it helpful! Definitely will be making more!

  • @GeorgeVMorpheus
    @GeorgeVMorpheus 1 year ago

    Very nice approach. You can try normalizing and/or standardizing the feature values too. It might or might not give you a better score, but it would help with the computation and run time of the model.
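The two rescalings mentioned above can be sketched as follows. This is a minimal illustration in plain Python with hypothetical helper names (`fit_standardizer`, `standardize`, `min_max_normalize`) and made-up values; the statistics are learned from the training values only, so that nothing from the test period leaks into training. Tree-based gradient boosting models split on thresholds, so they are generally insensitive to this kind of monotonic rescaling, and any gain is more likely with linear or distance-based models.

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Learn the mean and population standard deviation from training data."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    # A constant column has sigma == 0; fall back to 1.0 to avoid dividing by zero.
    return mu, (sigma if sigma > 0 else 1.0)


def standardize(values, mu, sigma):
    """Z-score scaling: subtract the mean, divide by the standard deviation."""
    return [(x - mu) / sigma for x in values]


def min_max_normalize(values, low, high):
    """Min-max scaling using bounds learned from training data (assumes high > low)."""
    return [(x - low) / (high - low) for x in values]


# Made-up feature values for illustration.
train = [10.0, 20.0, 30.0]
test = [40.0]

mu, sigma = fit_standardizer(train)
train_z = standardize(train, mu, sigma)
test_z = standardize(test, mu, sigma)  # reuses the training statistics

train_mm = min_max_normalize(train, min(train), max(train))
```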

  • @YadavJii-pt2sx
    @YadavJii-pt2sx 5 months ago

    Wow this is amazing!!! Thank you so much 💯😊

  • @sundaresanm9854
    @sundaresanm9854 1 year ago +4

    Hi, that's a great tutorial! As a college student I'm new to AI and ML, and currently I'm doing a project on detecting impersonation in online examinations. As a beginner I find it hard, but I have to finish it. Can you do a tutorial on this? It would help me and inspire me to explore more in this domain 🥺💫

  • @alanpros6950
    @alanpros6950 1 year ago +1

    I hit the bell button, hope this will be cool.

  • @anounTT
    @anounTT 1 year ago

    You can press Alt+Down Arrow and it will duplicate the line underneath in Colab.

  • @YasmineHabchi-kp8op
    @YasmineHabchi-kp8op 1 year ago

    Thank you for the video. I have just finished the notebook and I want to ask how I can participate in a Kaggle competition. What is the next step? THANK YOU!

  • @FarhanHussain
    @FarhanHussain 1 year ago +4

    An excellent ML project tutorial for beginners!

  • @shikhargupta352
    @shikhargupta352 1 year ago +1

    I got an error at 20:30. Is there any solution for that?

  • @eirisarca6717
    @eirisarca6717 1 year ago

    Thank you so much Smitha! I’m really enjoying this project series on ML and it has helped me a lot in my ML learning journey. Hope you could continue it 🫶

  • @shahbozrazzoqov
    @shahbozrazzoqov 1 year ago

    Amazing!

  • @prajwalpai6442
    @prajwalpai6442 1 year ago +3

    Thanks a lot for this🙌

  • @merajulrahmanshipon3308
    @merajulrahmanshipon3308 1 year ago +5

    I am getting an error, please help:
    calculate_metrics() got an unexpected keyword argument 'eval_set'

    • @yanapr6095
      @yanapr6095 1 year ago

      Did you find the solution for this?

    • @maxim7454
      @maxim7454 1 year ago

      Hi! This is an Upgini developer. This method was deprecated, but specially for Smitha's viewers we brought it back yesterday. Try again and everything will work. :=)

    • @maxim7454
      @maxim7454 1 year ago

      @Raja Muhamed There is a new version of the code. You need to reinstall upgini to use it:
      %pip uninstall -y upgini
      %pip install -Uq upgini

  • @wastefellow5141
    @wastefellow5141 1 year ago

    Which is the best laptop for a machine learning engineer?

  • @tusharmall8427
    @tusharmall8427 1 year ago

    Hello ma'am, can you suggest an AI-related project for participating in a hackathon?

  • @baconian_road_construction
    @baconian_road_construction 1 year ago

    Wow, Upgini really is something else. It's so cool how it can find data that's actually relevant to your training set so seamlessly. This is the first time I've seen anything like it! Does it pick out the enrichment data purely based on the search key you provide and how well it correlates to the target?

    • @roma5482
      @roma5482 1 year ago

      It actually picks external features based on three components (all from the labeled training dataset). First, the search key, just to match the records from external data sources. Second, the label, to filter out irrelevant features and rank them; this is NOT done with correlation, as that's not going to be very useful for boosting ML model accuracy. Third, the features already existing in the labeled dataset, as you most probably don't need the same signals you already have 😉

  • @Timepass-zr7cy
    @Timepass-zr7cy 1 year ago

    What happened with 100 days of AI/ML?

  • @student7818
    @student7818 10 months ago

    I can't install upgini in my VS Code. Can you give a solution?

  • @techbinay
    @techbinay 11 months ago

    The dataset link is not opening, please update.

  • @sabz6074
    @sabz6074 1 month ago

    Hey,
    I tried to make the enriched dataset but this error happened:
    You are trying to launch enrichment for 15213 rows, which will exceed the rest limit 10000.
    what should I do??

  • @royalevictoria
    @royalevictoria 1 month ago

    I get an error - "You are trying to launch enrichment for 730500 rows, which will exceed the rest limit 10000."

  • @flosrv3194
    @flosrv3194 2 months ago

    Hi Smitha, it refused to install the dependencies, nothing works on my end

  • @AdrenalineAkash13
    @AdrenalineAkash13 1 year ago

    dataset?

  • @user-uy6wi5fu4p
    @user-uy6wi5fu4p 5 months ago

    If you're forecasting sales weeks or months in advance, wouldn't you need to know what your features are weeks or months in advance? Example: if I want to know the sales forecast for April (and it's February right now), I would need to know all the features for April (i.e. the items, the Dow Jones, the weather, or whatever features we trained on). So, shouldn't we check if we can even predict these features first?
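One common way to address the concern above is to train only on features that are already known when the forecast is made, such as lagged sales values that are at least one forecast horizon old. A minimal sketch, assuming a plain list of sales values; `make_lag_rows`, `horizon`, and `lags` are hypothetical names, the numbers are made up, and this is not the approach shown in the video.

```python
def make_lag_rows(series, horizon, lags):
    """Build (features, target) pairs for forecasting `horizon` steps ahead.

    Every feature is a past value that is at least `horizon` steps older
    than the target, so it is available at prediction time.
    """
    rows = []
    first_t = horizon + max(lags)  # earliest index with a full set of lags
    for t in range(first_t, len(series)):
        features = [series[t - horizon - lag] for lag in lags]
        rows.append((features, series[t]))
    return rows


# Made-up series: the value at step t is simply t.
series = list(range(20))

# Forecast 3 steps ahead from the latest known value and the one before it.
rows = make_lag_rows(series, horizon=3, lags=[0, 1])
first_features, first_target = rows[0]
```

External signals such as weather have the same constraint: only their lagged values, or calendar-type features that are fixed in advance, are safe to use for a true forward forecast.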

  • @subramanianchenniappan4059
    @subramanianchenniappan4059 6 months ago

    I have been a Java developer for more than a decade, with hands-on Python experience. Do you have a detailed tutorial on machine learning basics? 😊😊

  • @sriharib3641
    @sriharib3641 1 year ago

    Yes, what happened to the 100 days challenge?

  • @samvarthikac1996
    @samvarthikac1996 3 months ago

    I'm getting an error at enriched_train_features.head(), as it says the object is NoneType.

    • @GregoryJoseph-hc5hj
      @GregoryJoseph-hc5hj 3 months ago

      There is a cap on the free tier of Upgini: 10k rows as of this date.
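If the row cap mentioned above is the cause, one possible workaround is to reduce the training data to at most that many rows before requesting enrichment. The sketch below is an assumption, not a documented Upgini recommendation: `cap_rows` is a hypothetical helper, the 10,000 figure is taken from the error messages quoted in the comments and may change, and whether a sample is representative enough depends on the dataset.

```python
import random

ROW_LIMIT = 10_000  # figure quoted in the error messages above; may change over time


def cap_rows(rows, limit=ROW_LIMIT, seed=0):
    """Return at most `limit` rows, sampled uniformly and kept in original order."""
    if len(rows) <= limit:
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    keep = sorted(rng.sample(range(len(rows)), limit))
    return [rows[i] for i in keep]


# Stand-in for a 15,213-row dataset, the size from one of the reports above.
rows = list(range(15_213))
capped = cap_rows(rows)
print(len(capped))
```

Keeping the sampled rows in their original order preserves the chronology that a time-series split relies on.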

  • @datacamp3557
    @datacamp3557 1 year ago +1

    Please help me understand how such an implementation is deployed. If we use this in real life we will need to obtain day to day values of the features that are being incorporated into the model. From where do we get that data?

    • @CrazyFanaticMan
      @CrazyFanaticMan 1 year ago

      You need to source data from somewhere. You can either get it directly, for example, you own a company and you're storing all kinds of data such as transactions and user behavior, or you ask participants to fill out surveys or answer questions, or measure how they perform on certain physical/mental tests and so on.
      Or you can source data indirectly. For example, maybe you find a website that can give you sports data, or maybe you build a web scraper to scrape comments off of TikTok or Facebook. Sometimes companies such as YouTube and Reddit offer free APIs so you can get certain data from their websites. Some APIs you have to pay for, like those for historical stock price data. Or maybe you avoid paying for that and build your own web scraper to scrape publicly available data from other sources.
      You have to decide what data you are trying to gather, whether you can generate the data yourself or have to source it, and how you are going to source it. Sometimes it's easy and sometimes it's harder, it really depends.
      When you train and deploy a model, you can see how accurately it predicted an outcome by comparing its prediction with the actual outcome. For example, you might predict whether a stock price will go up or down tomorrow after the markets close. Just check tomorrow whether your model did an okay job or not. It will never be perfect because it's impossible to predict the future, but you can tweak it to a threshold you are happy with.

    • @roman5786
      @roman5786 1 year ago

      Call transform after fit with DATE in the search keys; it will enrich your dataset with up-to-date features for the present date. That's all 😉

  • @karthikbhandary879
    @karthikbhandary879 1 year ago +1

    I think you forgot to put the link for the dataset in the description.

  • @colinmaharaj
    @colinmaharaj 1 year ago

    I'm not into python.. :(

  • @mugomuiruri2313
    @mugomuiruri2313 4 months ago

    good girl.from africa

  • @anounTT
    @anounTT 1 year ago

    I know this video is 10 months old and you may be doing this by now, but the flow of your videos would go faster and smoother if you were talking and typing at the same time. I see when you type you are looking at another monitor. This creates a pause and ruins the flow of your presentation. Everything you say after you type you should say during. This was feedback that I got from teaching online coding students. I used to do the same thing.

  • @colinmaharaj
    @colinmaharaj 1 year ago

    Ok so I'm gonna say it, you don't have to, but I'm unsubscribing to other channels that's just gonna waste my time and affect my focus.