Tutorial 3-End To End ML Project With Deployment-Project Problem Statement,EDA And Model Training

  • Published: 16 Nov 2024

Comments • 227

  • @krishnaik06
    @krishnaik06  1 year ago +11

    Join this channel membership to get access to materials and connect with me:
    ruclips.net/channel/UCNU_lfiiWBdtULKOw6X0Digjoin

    • @Pravin33unique95
      @Pravin33unique95 1 year ago +1

      Sir, I have a problem: how do I get the student data CSV into VS Code?

    • @SylvanusNusetorAtiku
      @SylvanusNusetorAtiku 5 months ago

      @krish naik could you please help with a step-by-step guide on how to commit the project's large files to GitHub?

  • @arias2832
    @arias2832 1 year ago +13

    I was lost after finishing an IBM course in data science. Nobody would give me a job because I don't have experience. I think that with your videos I will get one. Thanks for your excellent work; it has really helped me a lot. Greetings from Colombia!

  • @samruddishetty147
    @samruddishetty147 18 days ago

    Thanks Krish, you recognized a gap in the market: very few people explain end-to-end projects this clearly!

  • @jaysamirhegshetye5660
    @jaysamirhegshetye5660 8 months ago +1

    I really wanted to do an ML project where I can apply many ML algorithms to a single dataset. I find this playlist the best fit for my project. Thanks a lot, Krish sir, for making this informative and instructive tutorial!!!

  • @javeedtech
    @javeedtech 1 year ago +8

    After this series I regained the interest in machine learning, thanks for timely series..👍

  • @ramkisundararaman3711
    @ramkisundararaman3711 9 months ago +1

    Krish - You are awesome as always. I was out of the market for more than a year and now getting back to Data Science and your videos are helping me refresh my skills. Thanks

  • @gourabguha3167
    @gourabguha3167 1 year ago +8

    Thanks a lot Sir for extending the series..with ci/cd pipeline and mlops..Very much looking forward to it.

  • @sudhirmalik100
    @sudhirmalik100 10 months ago +3

    I think one improvement is needed here: we should split the data first, then call fit_transform on the training data and transform on the test set.
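
The ordering this comment asks for can be sketched as follows. This is a minimal illustration on synthetic data, not the notebook from the video; the helper name `split_then_scale` and the random data are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def split_then_scale(X, y, test_size=0.2, random_state=42):
    """Split first, then learn scaling statistics from the training rows only."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    scaler = StandardScaler()
    # Mean and variance are estimated from the training split only.
    X_train_scaled = scaler.fit_transform(X_train)
    # The test split is transformed with the training statistics, never refit.
    X_test_scaled = scaler.transform(X_test)
    return X_train_scaled, X_test_scaled, y_train, y_test, scaler


# Synthetic stand-in for the real features and target (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=1.0, size=200)

X_train_scaled, X_test_scaled, y_train, y_test, scaler = split_then_scale(X, y)
print(X_train_scaled.shape, X_test_scaled.shape)
```

Because the scaler never sees the test rows while fitting, the test score is not influenced by test-set statistics.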

  • @talibdaryabi9434
    @talibdaryabi9434 1 year ago +21

    I wonder why most of those who watch this learn something well but don't press the like button. I think it is the least you can do. So please support those who try to teach world-class methods for free.

  • @elnazfathi
    @elnazfathi 5 months ago

    You are doing an amazing job presenting how to get this done in an industrial environment. Thanks for your effort!

  • @bilelkhelifi897
    @bilelkhelifi897 9 months ago +1

    Best playlist and channel I've ever seen. THANK YOU SO MUCH KRISH

  • @mahdiaspanani8004
    @mahdiaspanani8004 5 months ago

    I love the parts where you hit an error but don't stop recording. This teaches us that everyone runs into these problems.

  • @insaneclutchesyt948
    @insaneclutchesyt948 1 year ago +1

    Thank you very much. You are one of the very few people who show errors; it helps a lot. Keep going!!

  • @apurvtewari3779
    @apurvtewari3779 9 months ago

    How should this entire project be described on my resume as a fresher?
    Any help will be appreciated. :)

  • @rajeshvenaganti6797
    @rajeshvenaganti6797 1 year ago +2

    I have gone through your Python and ML playlists and it was a great learning experience. Thanks for this end-to-end ML project playlist, and thanks for your note that you will extend it to deep learning and NLP. I am eagerly waiting for your sessions on deep learning and NLP implementation in this playlist.

  • @beingbawe
    @beingbawe 4 months ago +1

    I enjoy your style of teaching. Thank you for all your hard work.

  • @mayankamble2588
    @mayankamble2588 6 months ago

    Amazing playlist. Learning a lot. One point on transformation: StandardScaler should be fit on the training data only, and that fitted scaler should then be used to transform the test data. This way there won't be any data leakage from the test set.

  • @ransinghray3688
    @ransinghray3688 9 months ago

    Krish you are really redefining the tech educational system, you are awesome!!

  • @sudeepmathur9654
    @sudeepmathur9654 1 year ago +1

    Excellent video . I retired recently & just thought to keep myself engaged by learning new things , saw your video & found it very useful. Keep it up & best wishes for all the hard work you are doing in spreading knowledge. -- Sudeep Mathur

  • @ashishsharma214
    @ashishsharma214 1 year ago +2

    Thank you for this amazing playlist Krish!
    God bless you.

  • @tatakae6666
    @tatakae6666 1 year ago +1

    This series is absolute gold

  • @opman5657
    @opman5657 1 year ago +1

    Very well explained, and a good job for learners like me. Thanks. God bless you.

  • @harishs-dm8mm
    @harishs-dm8mm 1 year ago +3

    Hii Krish thanks for extending the project. Please include Data and model versioning and mlops practices

  • @mahikhan5716
    @mahikhan5716 1 year ago

    Nothing could be better than this. He has poured in everything that data science needs. I owe it to him.

  • @sabinadhikari2643
    @sabinadhikari2643 1 year ago +3

    For sns.countplot() we have to pass the value as x, i.e. sns.countplot(x=data) will work; otherwise sns.countplot(data) will give an error.
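
A small sketch of the keyword form described above, assuming a recent seaborn release. The `gender` column and its values are illustrative stand-ins, not taken from the thread.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical miniature of a categorical column.
df = pd.DataFrame({"gender": ["female", "male", "female", "female", "male"]})

# Pass the column by keyword; x= names the variable to count.
ax = sns.countplot(data=df, x="gender")
print(ax.get_xlabel())
```

Passing a Series directly as `x=df["gender"]` is an equivalent keyword form.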

  • @mohitpansari6603
    @mohitpansari6603 3 months ago +1

    There is one issue: we applied standard scaling using the whole X; instead we should have used fit_transform on X_train only and transform on X_test.

  • @khirooo
    @khirooo 1 year ago +5

    Hi Sensei, I am following your projects and every detail, and I am very thankful for your valuable content. but I think I found a code mistake at EDA part.
    with this new code :
    works well .... regards

    • @yashpisat9267
      @yashpisat9267 11 months ago

      Thank you for the solution. I am facing a similar issue with this statement: "df.groupby('parental_level_of_education').agg('mean').plot(kind='barh',figsize=(10,10))"

    • @HimanshuBisht94
      @HimanshuBisht94 11 months ago

      @@yashpisat9267 Here is the solution.
      df.groupby('parental_level_of_education')[['math_score','reading_score','writing_score','average']].agg('mean').plot(kind='barh',figsize=(10,10))
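
A likely reason the fix above works (an assumption, since the thread does not show the error text): in pandas 2.x, group-wise `mean` no longer silently drops non-numeric columns, so text columns such as gender make the aggregation fail. The sketch below uses a made-up three-row frame with the column names from the comment; the `gender` column is added only to reproduce the situation.

```python
import pandas as pd

# Hypothetical miniature of the student dataset.
df = pd.DataFrame({
    "parental_level_of_education": ["high school", "high school", "master's degree"],
    "gender": ["female", "male", "female"],
    "math_score": [60, 70, 90],
    "reading_score": [65, 75, 95],
    "writing_score": [70, 80, 85],
})
score_cols = ["math_score", "reading_score", "writing_score", "average"]
df["average"] = df[score_cols[:3]].mean(axis=1)

# Option 1: select the numeric columns explicitly before aggregating.
means = df.groupby("parental_level_of_education")[score_cols].mean()

# Option 2: ask pandas to skip non-numeric columns.
means_numeric_only = df.groupby("parental_level_of_education").mean(numeric_only=True)

print(means)
```

Either result can then be chained with `.plot(kind='barh', figsize=(10, 10))` as in the original statement.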

    • @shivankvishwakarma2994
      @shivankvishwakarma2994 11 months ago

      Thanks man!!

    • @lokeshbapte-j9n
      @lokeshbapte-j9n 10 months ago

      How did you solve that error?
      Please tell me immediately @@shivankvishwakarma2994

    • @KAKAROT808
      @KAKAROT808 2 months ago

      @@yashpisat9267 Hey, I have a question... can you explain this specific code, please?

  • @karanbais1843
    @karanbais1843 8 months ago

    loving the playlist sir thank you for it

  • @sparshjain7542
    @sparshjain7542 1 year ago

    You are the reason why I love machine learning

  • @royalchallengersbangalore535
    @royalchallengersbangalore535 1 year ago

    your teaching is like ❤❤

  • @list10001
    @list10001 9 months ago

    Thank you for the valuable tutorials.

  • @kumaronlineplay
    @kumaronlineplay 1 year ago

    Excellent Video.. Thanks for sharing it.

  • @abhinavsaini1662
    @abhinavsaini1662 1 month ago

    Enjoying the videos

  • @shahilshrestha3700
    @shahilshrestha3700 1 year ago

    Thank you so much, sir, for your video. I am learning a lot of new concepts with better explanations, all because of you.

  • @suvarnapawar3186
    @suvarnapawar3186 1 year ago

    Very nice explanation, sir... thank you very much

  • @AnimeAficionado28
    @AnimeAficionado28 1 year ago

    We are thankful for your wonderful knowledge of ML. I have a wish: if you could make a deep learning project playlist from scratch, we would be very grateful.

  • @rahulnakka87
    @rahulnakka87 3 months ago

    # Remove duplicates
    df_no_duplicates = df.drop_duplicates()
    # Keep the last occurrence of each duplicate
    df_no_duplicates = df.drop_duplicates(keep='last')

  • @Pravin33unique95
    @Pravin33unique95 1 year ago +4

    How did you create the notebook folder with the EDA and model training notebooks in it? I don't understand.

  • @sharifdeenashshak4496
    @sharifdeenashshak4496 1 year ago

    looking forward for deep learning end to end series

  • @aravinda1595
    @aravinda1595 8 months ago

    Sir amazing video btw
    Boys have performed really well in MATHSSS

  • @utkarshapadhye7656
    @utkarshapadhye7656 1 year ago +3

    I am getting "package not found" errors in VS Code, though I followed all the steps from the start, and all the packages are also installed in my global environment.

  • @asieharati
    @asieharati 1 year ago +1

    You shouldn't do the preprocessing on the whole X. You should fit on X_train and only transform X_test.

  • @nanditagautam6310
    @nanditagautam6310 1 year ago

    Great ! Thanks Krish

  • @aslanali9977
    @aslanali9977 1 year ago +2

    Thanks!

  • @manar4944
    @manar4944 5 months ago

    That's super great work, it really helped me.
    But I'm surprised that you applied the StandardScaler column transformation before splitting into train/test sets. Most articles say this results in data leakage; can you please elaborate?

  • @shalakam1617
    @shalakam1617 1 year ago

    Thanks for this series

  • @matindram
    @matindram 1 year ago

    Thank you for the series

  • @rupindersingh1312
    @rupindersingh1312 1 year ago +1

    thanks for this video
    10-07-2023

  • @nikhildoye9671
    @nikhildoye9671 11 months ago +2

    Shouldn't we split the data first and then apply transformation? Won't this lead to data leakage?

  • @sajjaduddin8188
    @sajjaduddin8188 2 months ago

    Thank you sir

  • @AkilaDS-kz6yv
    @AkilaDS-kz6yv 1 year ago +2

    How can we export the data and the EDA code? Could you please explain that part?

  • @maximilianlossl226
    @maximilianlossl226 1 year ago +3

    Sorry, but I really don't understand how to get the data and jupyter files into my Visual Studio Code, can you help me?

  • @sibaprasadnaikbehera3442
    @sibaprasadnaikbehera3442 1 year ago

    Thank you for problem statement sir we are eagerly waiting for that

  • @gauravmishra7591
    @gauravmishra7591 1 year ago +2

    Please make a Pyspark end to end project like a real world

  • @nirbhaysedha8541
    @nirbhaysedha8541 1 year ago

    thanks sir it will help us a lot🙏

  • @sagarthacker5114
    @sagarthacker5114 1 year ago +3

    Hello Krish, I was hoping to ask for your opinion on a particular aspect of data preprocessing. Shouldn't we perform data splitting first to prevent data leakage, as standard scaling considers the mean and variance of the entire dataset? This may include the test set, leading to potential data leakage. Would you kindly share your thoughts on this topic? Thank you very much.

    • @krishnaik06
      @krishnaik06  1 year ago +1

      Yes i will take care of it while writing in a modular way...

  • @shalakam1617
    @shalakam1617 1 year ago

    Thank You for series

  • @pankajkumarbarman765
    @pankajkumarbarman765 1 year ago

    Thank you so much sir

  • @SaranyaDass-l9h
    @SaranyaDass-l9h 1 year ago +1

    Hi, How are we importing the data of the csv file into VS ?

  • @ncheymbamalu4013
    @ncheymbamalu4013 1 year ago

    Adjusted R² instead of R² for the evaluation metric.
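
For reference, adjusted R² penalizes R² for the number of predictors: adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of samples and p the number of features. A minimal helper might look like this; the function name and the example numbers are illustrative.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    denominator = n_samples - n_features - 1
    if denominator <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / denominator


# Example: an R^2 of 0.90 from 100 samples and 5 features.
print(adjusted_r2(0.90, n_samples=100, n_features=5))
```

The plain R² can come from `sklearn.metrics.r2_score` on the test predictions.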

  • @sanket_a2033
    @sanket_a2033 1 year ago

    Thank You Sir..

  • @pankajkumarbarman765
    @pankajkumarbarman765 1 year ago

    Thank You So much sir 💗

  • @Mery._.11111
    @Mery._.11111 1 year ago

    Thank you so much !

  • @shruthakeerthipurushothkum2724
    @shruthakeerthipurushothkum2724 6 months ago

    Hi @krishnaik06, can you kindly show how you worked with Jupyter in VS Code? I mean, did you do the EDA in a Jupyter notebook and then move it into VS Code?

  • @rohitbharti2882
    @rohitbharti2882 1 year ago

    So much happy sir ❤❤❤

  • @yashsoni1153
    @yashsoni1153 1 year ago +2

    Sir, I am not able to open the Jupyter notebook in VS Code. I think there is an error in the file.
    Please help me resolve this...

  • @robinchriqui2407
    @robinchriqui2407 1 year ago

    Thank you, Krish. It refreshed a lot of information and skills I'm looking forward to seeing the automation and deployment part of it. Will you integrate the ML Ops part in the future?

  • @DhaneshRamesh-p9b
    @DhaneshRamesh-p9b 1 year ago +3

    Hey Krish, I have all my libraries such as numpy installed, but when I try to run the code through the ipykernel it says numpy is not found

    • @prianshmadan
      @prianshmadan 1 year ago

      Bro, were you able to resolve this?

    • @riachoudhari7297
      @riachoudhari7297 1 year ago

      @@prianshmadan Kindly let me know the solution for the same please

    • @shitikanthabagh9859
      @shitikanthabagh9859 8 months ago

      Install ipykernel again; it may upgrade to the latest available Python rather than the 3.8 used in this video: conda install -p <environment path> ipykernel --update-deps --force-reinstall. Then select the correct Jupyter kernel in the interpreter picker, and it should work.
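
A quick way to check whether the notebook kernel is running in the environment where the packages were installed is to print the interpreter path from inside a cell. The helper below is a sketch; `describe_environment` is a made-up name, and the package list is only an example.

```python
import importlib.util
import sys


def describe_environment(packages):
    """Report which interpreter is running and which packages it can import."""
    return {
        "interpreter": sys.executable,
        "available": {
            name: importlib.util.find_spec(name) is not None for name in packages
        },
    }


report = describe_environment(["numpy", "pandas", "json"])
print(report["interpreter"])  # should point inside the project's environment
for name, found in report["available"].items():
    print(f"{name}: {'found' if found else 'NOT found'}")
```

If the printed interpreter is not the project's environment, the kernel selection is the likely culprit rather than the installation itself.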

  • @rahulsharma5693
    @rahulsharma5693 1 year ago +5

    Hi Krish, when I try to do from src.logger import logging, it gives error no module named src, but if i do from logger import logging then it works? any idea???

    • @karishmamehar4081
      @karishmamehar4081 1 year ago

      Because both .py files are in the same package, we can import it directly; if exception.py were outside src, then src.logger would work

    • @rahulsharma5693
      @rahulsharma5693 1 year ago +1

      @@karishmamehar4081 yep i was able to understand that but why did it work for krish in the video and me getting error

    • @avbendre
      @avbendre 1 year ago

      @@rahulsharma5693 yes same doubt i think it has something to do with magic

  • @ravulapallivenkatagurnadha9605

    Need more videos like this

  • @swL1941
    @swL1941 1 year ago

    @Krish Naik
    Hello Sir, why didn't you use Cross Validation instead of Train-Test-Split ?

    • @subrataassam
      @subrataassam 1 year ago

      Could have used CV or might be any other data splitting techniques but I guess the main aim of this tutorial was to build a framework for an ML project. Can improvise later as per new ideas or in-depth explorations.
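
For readers who want to try the cross-validation route raised in this thread, here is a minimal sketch with scikit-learn on synthetic data. The Ridge model, the five folds, and the generated features are assumptions for illustration, not the setup from the video.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (hypothetical stand-in for the student scores).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([3.0, -2.0, 1.0, 0.5]) + rng.normal(scale=0.5, size=200)

# Putting the scaler inside the pipeline means it is refit on each training fold,
# so no fold's validation rows influence the scaling statistics.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")

print(scores.mean(), scores.std())
```

Cross-validation gives a spread of scores rather than a single number, at the cost of fitting the model once per fold.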

  • @Ythandlenoobme
    @Ythandlenoobme 1 day ago

    How and when did he add the notebook folder? He didn't mention it at the start. Please help, I am stuck

  • @mayankporwal4858
    @mayankporwal4858 1 year ago +1

    Here we did not discuss the catboost_info folder that is present. Why is it there and what is its use?
    Please explain, Krish sir.

    • @AmbarGharat
      @AmbarGharat 7 months ago

      It will automatically come once you install catboost and IDK why.

  • @laxmanteja
    @laxmanteja 1 year ago

    I'm very interested Krish about your teaching techniques and in this end-to-end project, can I expect automation of the project with code

  • @javeedtech
    @javeedtech 1 year ago

    Thanks again

  • @MartinBurleston
    @MartinBurleston 3 months ago

    how did you get the dataset and import it on vscode
    i dont understand

  • @video89652
    @video89652 1 year ago

    Hello Krish, is there any projects that solved Direction of Arrival Problem in Audio Signal Processing. Can you do a tutorial on it

  • @shahirajlakade7921
    @shahirajlakade7921 1 year ago +1

    sir any upcoming data analyst batch missed 50% off offer😔

  • @sh__--
    @sh__-- 1 year ago

    Thanks 😊

  • @amazingplaytv2661
    @amazingplaytv2661 1 year ago +2

    where do I get the data files. I mean the contents of the notebook folder?? I am coding along the series

    • @santhoshkoppisetti5034
      @santhoshkoppisetti5034 11 months ago

      Bro, do you know how to get the contents of the notebook folder? Please tell me.

  • @PavanReddy-xl1uu
    @PavanReddy-xl1uu 1 year ago +3

    Hi everyone,
    I'm getting the below error when I'm trying to run "exception.py" file.
    (c:\Users\pavva\OneDrive\Documents\AI Project\venv) C:\Users\pavva\OneDrive\Documents\AI Project>python src/exception.py
    Traceback (most recent call last):
      File "src/exception.py", line 2, in <module>
        from src.logger import logging
    ModuleNotFoundError: No module named 'src'
    I did import this line "from src.logger import logging" in exception.py.
    All the files name are correct and it's in proper order.
    Can someone help me?
    Thank you.

    • @ramin.nourizade
      @ramin.nourizade 1 year ago

      Hi, remove src/ from import

    • @PavanKumar-ut2lo
      @PavanKumar-ut2lo 1 year ago

      @@ramin.nourizade if i only use "import logging" then it's not updating in the logging file.

    • @Pravin33unique95
      @Pravin33unique95 1 year ago

      I am also getting the same problem: invalid syntax at line 9

    • @imadsyed6417
      @imadsyed6417 1 year ago

      refresh your vscode

    • @saikrishna887
      @saikrishna887 1 year ago +2

      import os
      import sys
      sys.path.append(os.path.abspath('C:/Users/xxxx/MLProject/src'))
      # Now you can import 'logging' from the 'logger' module
      from logger import logging
      Try adding this code; it should resolve your issue.
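
One common cause of this error (an assumption here, since the thread does not show the full project setup): running `python src/exception.py` puts the `src` directory itself on `sys.path`, not the project root, so the package name `src` cannot be resolved. Running the file as a module from the project root with `python -m src.exception` avoids that, as does installing the project in editable mode with `pip install -e .` when it has a `setup.py`. The sketch below rebuilds a throwaway project in a temporary directory and runs it both ways; the file contents are minimal stand-ins.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run src/exception.py as a script and as a module from the project root."""
    with tempfile.TemporaryDirectory() as root:
        src = Path(root) / "src"
        src.mkdir()
        (src / "__init__.py").write_text("")
        (src / "logger.py").write_text("import logging\n")
        (src / "exception.py").write_text(
            "from src.logger import logging\nprint('import ok')\n"
        )
        # Script mode: sys.path[0] is the src directory, so 'src' is not importable.
        as_script = subprocess.run(
            [sys.executable, "src/exception.py"],
            cwd=root, capture_output=True, text=True,
        )
        # Module mode: the current directory (project root) is on sys.path.
        as_module = subprocess.run(
            [sys.executable, "-m", "src.exception"],
            cwd=root, capture_output=True, text=True,
        )
        return as_script, as_module


as_script, as_module = run_both_ways()
print("script mode exit code:", as_script.returncode)
print("module mode output:", as_module.stdout.strip())
```

Compared with appending a hard-coded path to `sys.path`, the module form keeps `from src.logger import logging` unchanged across machines.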

  • @shubhamkshirsagar4511
    @shubhamkshirsagar4511 5 months ago

    df.drop_duplicates(inplace=True)

  • @nishantverma2966
    @nishantverma2966 1 year ago +1

    How we can export the EDA, model training and data file to Visual Studio

  • @abhijeetrokade2349
    @abhijeetrokade2349 1 year ago

    Which kinds of projects should we choose when preparing for interviews?

  • @Gautam1108
    @Gautam1108 16 days ago

    Guys please help me. My computer restarted on its own and now even after reactivating the "venv", I'm unable to run the exception.py file. I'm getting an error saying that there is "ModuleNotFoundError: No module named 'src' ".

  • @mrityunjayupadhyay7332
    @mrityunjayupadhyay7332 1 year ago

    great

  • @MuhammadJunaid-i3y
    @MuhammadJunaid-i3y 1 year ago +1

    How can I add the notebook folder? You didn't explain the notebook and the CSV.

    • @AmbarGharat
      @AmbarGharat 7 months ago

      add it from vs code

    • @KAKAROT808
      @KAKAROT808 2 months ago +1

      Or go directly to the file system: add a notebook folder inside the ML project folder, then add a data folder inside notebook and paste the file there. VS Code will pick it up, and the file is added.
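
Once the CSV sits in a `notebook/data` folder as described above, loading it is a single `pandas.read_csv` call. The sketch below builds that folder layout in a temporary directory so it can run anywhere; the file name `students.csv`, the helper `load_dataset`, and the two sample rows are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_dataset(project_root, relative_path="notebook/data/students.csv"):
    """Read the CSV relative to the project root (file name is a placeholder)."""
    return pd.read_csv(Path(project_root) / relative_path)


with tempfile.TemporaryDirectory() as project_root:
    # Recreate the folder layout from the comment: <project>/notebook/data/<file>.csv
    data_dir = Path(project_root) / "notebook" / "data"
    data_dir.mkdir(parents=True)
    (data_dir / "students.csv").write_text("gender,math_score\nfemale,72\nmale,69\n")

    df = load_dataset(project_root)

print(df.head())
```

From a notebook stored in the `notebook` folder itself, the relative path would typically shorten to the part after `notebook/`.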

  • @ShivamPatel-yg3kd
    @ShivamPatel-yg3kd 1 year ago +1

    Hi Sir, I have a doubt, as you have created "total marks" and "average marks" as two separate independent features, and you are doing EDA for both the features, suggested to create 2 separate models for each of them as well. But my doubt is why do we need to do the same things for both of them separately as average marks is directly correlated with total marks(total marks/3). Am I missing something? Please clarify. Love your videos 😊

    • @uditthakkar8130
      @uditthakkar8130 1 year ago +1

      Hey Shivam,
      It kind of depends on the problem that we are trying to solve. Suppose, what we are doing here is all self learning but there must be a target decided by the stakeholders/clients.
      If you are trying to predict a student's eligibility for a scholarship, the total marks might be more important than the average marks since scholarships may be based on total marks.
      If the model indicates that the total marks are a strong predictor of a student's performance, it may be harder to understand how much the average marks contributed to that prediction if they are not considered separately.
      Also, elimination of noise and variability in the data is a factor here!
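
The relationship raised in the question can be checked numerically: when the average is the total divided by a constant, the two columns are perfectly correlated, so they carry the same information up to scale. A small sketch on random scores (the data is synthetic):

```python
import numpy as np

# Hypothetical scores for 100 students in three subjects.
rng = np.random.default_rng(0)
scores = rng.integers(0, 101, size=(100, 3))

total = scores.sum(axis=1)
average = total / 3  # a constant rescaling of the total

corr = np.corrcoef(total, average)[0, 1]
print(corr)
```

Which of the two to report is then mostly a question of what the stakeholders want to read, as the reply above suggests.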

  • @swL1941
    @swL1941 1 year ago

    If boosting and bagging methods are very powerful, then why does a simple Ridge regression have a higher R2 score?

  • @hmtbt4122
    @hmtbt4122 1 year ago

    thanks

  • @surajpandey-er4nt
    @surajpandey-er4nt 1 year ago +1

    Hi Krish, just a small doubt I am facing an issue with installing everything through requirements.txt instead I had to install everything separately. what could be the issue here?

  • @krislai7453
    @krislai7453 1 year ago +1

    Why is it that when I try to import a library it says "no module named 'numpy'"?

  • @ridj41
    @ridj41 1 year ago

    I am going mad from setting up the environment again and again: even after installing all of the libraries, running the code says that the library is not there.

  • @Laizin
    @Laizin 1 year ago +4

    anyone getting an error related to installing catboost??
    or is it just me?

    • @lionsinescanor405
      @lionsinescanor405 1 year ago +1

      same here. Did u find the solution?

    • @mhapich
      @mhapich 1 year ago

      Me too - Failed to build catboost
      ERROR: Could not build wheels for catboost, which is required to install pyproject.toml-based projects

  • @kumarsrajesh9078
    @kumarsrajesh9078 1 year ago

    Please make a end to end project of pyspark

  • @Svilco
    @Svilco 1 year ago

    What is the correct scikit-learn version for Python 3.8?

  • @r.ranjankumar2106
    @r.ranjankumar2106 1 year ago

    I'm facing a "module not found" error even after creating the environment. Can anyone help me fix this?

  • @Pravin33unique95
    @Pravin33unique95 1 year ago

    I have a doubt: while running the code I get a syntax error in the exception handling

  • @umerfarooq-ck3qn
    @umerfarooq-ck3qn 1 year ago

    Hello Sir, thank you so much for this playlist. I am getting an error here: "ValueError: A given column is not a column of the dataframe" while executing 'X = preprocessor.fit_transform(X)', even though I have done X = df.drop('reading_score',axis=1). Please help
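
This error usually means the column lists handed to the ColumnTransformer name a column that is no longer in X, for example when the lists were computed before a different column was dropped. That is a guess about this case, since the notebook state is not shown. A sketch that derives the lists from X itself, on a made-up four-row frame with `math_score` as an assumed target:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical miniature of the dataset.
df = pd.DataFrame({
    "gender": ["female", "male", "female", "male"],
    "math_score": [72, 69, 90, 47],
    "reading_score": [72, 90, 95, 57],
    "writing_score": [74, 88, 93, 44],
})

target = "math_score"  # assumed target; swap in whichever column is being predicted
X = df.drop(columns=[target])
y = df[target]

# Derive the column lists from X *after* dropping the target, so they always match.
num_features = X.select_dtypes(include="number").columns.tolist()
cat_features = X.select_dtypes(exclude="number").columns.tolist()

preprocessor = ColumnTransformer([
    ("onehot", OneHotEncoder(handle_unknown="ignore"), cat_features),
    ("scale", StandardScaler(), num_features),
])

# Fitted on all of X only to keep the sketch short; in practice fit on the training split.
X_transformed = preprocessor.fit_transform(X)
print(X_transformed.shape)
```

Re-running the cells that build the column lists after changing which column is dropped should clear the mismatch.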

  • @abubakarsaddiq4098
    @abubakarsaddiq4098 6 months ago

    Stunning