Hi Satyajit, thank you for your detailed steps to implement end-to-end ML with deployment. I have a few questions, can you please clarify? Please explain the usage of the following lines: df_1 = pd.read_csv("first_telc.csv") # how did you get this file "first_telc.csv"? df_2 = pd.concat([df_1, new_df], ignore_index = True) single = model.predict(new_df__dummies.tail(1)) # as per this line, you are considering only the input which we enter from the HTML form, so the df_1, df_2 steps may not be required. If I just process the input from HTML using pd.get_dummies and then do model.predict, will this be enough for prediction? Am I correct? Anyhow, thank you once again for your contribution.
Hi Satyajit, I found the reason for using 'first_telc.csv' and the answer to the clarifications I raised above. I have figured it out: you have to use some sample data and then append your input data, because only then are all combinations of categorical values covered, so pd.get_dummies converts them into the same number of columns as the analysis file. To match the 51 columns, you are loading that csv file. However, to predict, you are using only the last row (tail), which is the one entered by the user in the UI. Thanks a lot.
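That column-matching behaviour can be demonstrated on a toy example (a sketch: first_telc.csv and the 51-column count are from the project, but the tiny frames and the single "Contract" column here are made up for illustration):

```python
import pandas as pd

# Toy stand-in for first_telc.csv: sample rows covering all category levels
df_1 = pd.DataFrame({"Contract": ["Month-to-month", "One year", "Two year"]})

# A single user input only covers one level...
new_df = pd.DataFrame({"Contract": ["One year"]})
print(pd.get_dummies(new_df).shape[1])   # only 1 dummy column is produced

# ...but concatenated with the sample data, all levels are present,
# so get_dummies yields the full set of columns
df_2 = pd.concat([df_1, new_df], ignore_index=True)
dummies = pd.get_dummies(df_2)
print(dummies.shape[1])                  # 3 columns, matching "training"

# Only the last row (the user's input) is then used for prediction
row = dummies.tail(1)
```

This is exactly why encoding the raw user input alone produces a feature-count mismatch against the trained model.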
For deployment can you please explain what is in df_1 ? and why you concat with new_df to create df_2. I am working on a similar project with this data set.
Sir, as I am from a pharmaceutical sales background, is it possible to make a video on an end-to-end business analytics project on pharmaceutical company use cases, like your very nice telecom project? It would be very helpful for us.
Thanks a lot for this session, Satyajit.... I have a doubt: how do data scientists typically finalise a machine learning model for a given problem, since there are so many machine learning models available?
The model that predicts best on test data is fine-tuned and pushed into production to get tested on live data. Further, the results are evaluated by sales and various other teams; they give feedback on whether the model is actually doing well or not, and then further remodelling is done. It's a long process.
I noticed one thing: you applied the SMOTEENN function to both the train and test data. Applying SMOTEENN to the test set and later calculating F1 score, recall, etc. from predictions made on that test set does not give correct results. When I didn't apply SMOTEENN to the test set, my results were actually very similar to the decision tree classifier score calculated on data without SMOTEENN. What makes sense is applying SMOTEENN only to the train data, not the test data.
hi sir, I am Anu. I recently joined a data analyst course, and your videos are very helpful for me. I am facing an issue: after installing Mask R-CNN, I fail at importing "from mrcnn import model as modellib". Kindly help resolve the issue, thank you.
@@SatyajitPattnaik thanks for the reply.
AttributeError Traceback (most recent call last)
  1 from mrcnn.config import Config
  2 from mrcnn import utils
----> 3 import mrcnn.model as modellib
  4 from mrcnn import visualize
  5 from mrcnn.model import log
/usr/local/lib/python3.10/dist-packages/mrcnn/model.py
--> 255 class ProposalLayer(KE.Layer):
AttributeError: module 'keras.engine' has no attribute 'Layer'
I searched on Stack Overflow; it says all instances of keras.engine (KE) were changed to keras.layers (KL), but I don't understand how to fix it.
@@SatyajitPattnaik But I like your EDA approach. Kudos. I am a very new player in ML, doing a data science master's from Great Learning. There is so much to learn from you. I will keep watching more case studies shared by you. Thank you for your amazing work.
hey man, create a recommendation engine so that people don't churn: recommend something to win customers back, and make the ones you already have more interested
Awesome! I enjoyed practicing the project along with listening to your insights, thanks a lot. I am a statistician by profession and wish to connect with you; how can I reach you?
You can read more about eager and lazy learning. The algorithm used in this use case is probably DecisionTree, if I recall correctly, and a DT is basically an eager learner. How the learning happens is internal to the algorithm; to know more about it, you have to dig into the model's source code. Here we are just applying the algorithm and performing our predictions. I hope you get my point: for example, if you want to know why decision trees are called eager learners, you will have to check how a decision tree is implemented and what the code inside the algorithm is. A link which can help you: towardsdatascience.com/machine-learning-classifiers-a5cc4e1b0623 In case you still have doubts, we can probably connect over LinkedIn.
Yes, I'm experiencing some difficulties with understanding the pronunciation. It would be very kind of you to enable subtitles on this video. Thank you again for this great content!
After training the model, you have shown the accuracy on test data, i.e. 93%, but how can we get churn/not-churn predictions for the test dataset, not through the API but in Python only?
Yes, we can get it; just predict over the entire test set and append the predictions as a new column, for example: df["Prediction"] = model.predict(X_test), or model.predict_proba(X_test)[:, 1] if you want the churn probability. Just giving you ideas..
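A minimal sketch of that idea on toy data (the feature names and the DecisionTree model here are stand-ins, not the project's actual model): no loop is needed, you can predict on the whole test frame at once and attach the results as columns.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy dataset standing in for the encoded churn features
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(5)])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Append hard labels and the class-1 ("churn") probability as new columns
result = X_test.copy()
result["Prediction"] = model.predict(X_test)
result["Churn_Prob"] = model.predict_proba(X_test)[:, 1]
print(result[["Prediction", "Churn_Prob"]].head())
```

Note the `[:, 1]` slice: `predict_proba` returns one probability column per class, so assigning its full output to a single dataframe column would fail.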
Thanks for the clarification, I will try that. Two more questions:
1. We did the correlation part; where did we use it? Those variables which were not affecting the target variable could be deleted before model building. Please let me know if I'm thinking the right way, or correct me if I'm wrong. It would be really helpful.
2. max_depth=6, min_samples_leaf=8: how did we decide these values in this example?
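On question 2: one common way to choose values like max_depth and min_samples_leaf (a sketch of the general technique, not necessarily how the values were chosen in the video) is a cross-validated grid search:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Toy data standing in for the encoded churn features
X, y = make_classification(n_samples=300, random_state=0)

# Try a small grid of candidate values and keep the best by CV F1 score
param_grid = {"max_depth": [4, 6, 8], "min_samples_leaf": [4, 8, 16]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid, scoring="f1", cv=5)
search.fit(X, y)
print(search.best_params_)
```

Values like max_depth=6 often come out of exactly this kind of search, or simply from experience with similar datasets.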
Can we use the model to predict data in PBI, right? Then create a table of churned customers in PBI. I don't use an Excel file, but data from a database, to test. Can you help me?
Predicting in PBI isn't possible, but predictions can be integrated with PBI to showcase the results. In PBI you can do time-series forecasting, but not ML predictions.
@@SatyajitPattnaik I have finished building the customer churn prediction model and saved it with the pickle library. Now how can I use that model to predict data in PBI (can I use integration or some other way) and show convenient results for calculating customer churn? I don't know what to do :( Please help me. Please. Thank you very much.
@@SatyajitPattnaik I can't get the data out of PBI to predict externally, because the data is in a DB2 database :(. How do I connect VS Code with the DB2 database to flag churned customers in the data on the database? Then I can use that data in PBI, right?
@@VyĐặngThịTường-i3m use ChatGPT to learn how to connect to DB2 from your Python notebook; once you are able to create connections, start working on the predictive model, and then you can connect with Power BI and show the predictions in PBI.
@@akshajagarwal3519 well, if you are using exactly similar values then there's no way it will fail; it's tested and used by many. Please debug it, or else modify the input fields to be drop-down values.
Hello, thanks for making an interactive video. However, I'd advise reducing the size of the video section titles, such as the Model Deployment title. It's very distracting, and I can't see what you're doing because it's obscured by the section title🙂
Hi Satyajit, thank you well-done project. I had a data science interview this week can I take this end-to-end project to present in front of the interviewer?
In the telecom churn dataset, "Monthly Charges" refers to the amount charged to the customer for their telecom services each month, while "Total Charges" refers to the total amount charged to the customer over the entire duration of their service with the telecom company.
Your time-related parameters will play a vital role here. Let's say you want to test your existing customers, and the latest tenure of a customer is 27 months; if you want to test whether he stays after 6 more months or not, you need to pass the tenure as 33 for that customer, along with the other details..
sir, I have a problem in model deployment in Spyder. Can you help me? ("UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte") is the error. Anyway, thanks for the great and amazing video
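That particular UnicodeDecodeError usually means a pickled model file was opened in text mode (pickle files start with the 0x80 protocol byte, which is not valid UTF-8). A sketch of the fix; the filename model.sav is a guess at what the project uses, and the dict stands in for the trained classifier:

```python
import pickle

# Save in binary write mode
model = {"demo": True}            # stand-in for the trained classifier
with open("model.sav", "wb") as f:
    pickle.dump(model, f)

# Load in binary read mode: "rb" is required. Opening with plain "r"
# would raise UnicodeDecodeError on the pickle's leading 0x80 byte.
with open("model.sav", "rb") as f:
    loaded = pickle.load(f)
print(loaded)
```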
Customer segmentation is an unsupervised technique, where you need to create customer clusters based on their behaviour. When you have a variable telling you who is churning and who is not, you can build a classification model to predict whether a new customer will churn or not.
That's a separate problem: churn or not-churn is one (binary) classification problem. You can also train a multi-class classifier with classes like not churned, churned due to service, churned due to cost, etc., so that's a multi-class classification 👍
Code & Dataset for practice: github.com/pik1989/MLProject-ChurnPrediction
Also watch, Churn Analysis dashboard creation using Power BI:
Part 1: ruclips.net/video/Dryfm9VaEw4/видео.html
Part 2: ruclips.net/video/mAPcW_YDUKQ/видео.html
can I get the ppt of this?? Or can you tell us how we can make such a wonderful ppt on telecom customer churn like this??
Thanks a lot 🎉🎉
@@WOW-vn5jd Canva 😁
I didn't get the insights that you drew from the describe() method, like 75% of clients have 55 months tenure and 25% of clients pay some amount in USD
@@jahangagan5396 I think 25 is wrongly typed; it should be 75
This channel is one of the best I've come across for ML, Data Analysis and Python. The level of content uploaded here is so remarkable. Thank you so much Sir. ❤
😊
The best end-to-end machine learning project. I have learnt a lot from this video. Thank you, sir.
Thanks Rishab
I am at half the lecture and can not contain my gratitude: Totally awesome lecture and explanations! Please do other lectures related to ML- you have rare gift to impart understanding. Million Thanks! God Bless!
One of the best End to End ML project explanation and implementation I have seen till now....
Thank you @Satyajit for taking out your time and preparing such a wonderful video for us...
Thanks, that keeps me motivated.
Wow, you really have put in great effort to make such a detailed video. Thanks a ton! Keep sharing your knowledge 🙂
Thank you for providing such a detailed and clear explanation for this data science project. I am learning data science using Gemini and your videos, and I am so happy to have found you as a resource.
Great way of teaching.. you covered each and every minute detail in just one project.... 🙏thanks
Thanks a lot..
It felt so real-life... like I am a DA in a company and my manager is giving us a walkthrough of the project... kudos
Hello, I wanted to thank you. I got a job as a data analyst. In the technical interview, I was asked to do a dummy project and walk through the process. I used this data and the topic of churn analysis, and did a similar project to yours with the help of your code.
I used Power BI for data visualization and Python for data modeling on the same data, and was able to impress the interviewers. I also had the theoretical knowledge of what I was doing, but I am not good at the technical stuff, because all I learnt in school was how to calculate eigenvalues and PCA manually, the definition of a p-value, and regression analysis😂😂. I can understand what the code is doing, but if I have to code, I will spend all day on YouTube.
It's my second day on the job, but I don't have a proper understanding of relational databases. They are using Power BI and SSMS together for the daily reports and all. I need to sustain on the job😂😁 Please advise me with links to videos, if you have made any, on SQL, Power BI, and Python and how they work together.
Check my playlists, there are videos on power bi and sql too (end to end)
@srijanakhatri7854 can you please guide me on how to go about it, as I am someone who is starting from scratch
Thank you so much Satyajit for this wonderful end-to-end full tutorial. That was really helpful.
Thanks Anirban
Really appreciate your effort best project explanation so far for me ! Thank you !
Wonderful explanation. Love how you get into detail wrt business understanding and coding
Very great project and explanation! Thank you! ❤
thank you for the tutorial you are really amazing and really have a gift to teach. Keep the good work up
Thanks for your kind words 👍
Thank you Satyajit. The sessions was very informative and helpful!!
Thanks Sainath
Big fan of your efforts and explanation. I feel motivated by you.
What a Great project, thanks Sir
Hi Satyajit,
I believe performing resampling on the entire data might be responsible for data leakage, since some of the resampled examples will be part of both the training and testing datasets, contributing to a high F1 score.
Yes, that's a mistake I made here; you can do resampling on the training data instead of the full data. I already acknowledged it in another comment 👍
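The corrected flow, as a sketch: split first, resample only the training fold, and evaluate on the untouched test fold. Plain random oversampling via sklearn.utils.resample stands in here for SMOTEENN so the snippet has no imblearn dependency; with imblearn installed, you would call SMOTEENN().fit_resample(X_train, y_train) at step 2 instead.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
from sklearn.utils import resample

# Imbalanced toy data (~85% majority class)
X, y = make_classification(n_samples=500, weights=[0.85], random_state=0)

# 1) Split BEFORE any resampling, so no duplicated/synthetic rows leak into test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# 2) Resample only the training data (SMOTEENN would go here instead)
minority = X_train[y_train == 1]
deficit = (y_train == 0).sum() - len(minority)
extra = resample(minority, n_samples=deficit, random_state=0)  # with replacement
X_bal = np.vstack([X_train, extra])
y_bal = np.concatenate([y_train, np.ones(len(extra), dtype=int)])

# 3) Train on the balanced data, evaluate on the untouched test set
model = DecisionTreeClassifier(random_state=0).fit(X_bal, y_bal)
print(classification_report(y_test, model.predict(X_test)))
```

The metrics this reports are lower than resampling-before-split would show, but they are the honest ones.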
Noted @@SatyajitPattnaik, Make sense. Amazing work on churn modeling 🙌🏼
@@mohitpandey5190 But my performance metrics dropped significantly after this, coming down to around 80% on almost all metrics. Do you have suggestions for other techniques that could work better?
Thank you for this, I really learnt a lot
Great to hear this 😁
X_resampled, y_resampled = sm.fit_sample(x,y) become X_resampled, y_resampled = sm.fit_resample(x,y) thanks for the video
Finding telecom domain projects on YouTube is very difficult. Thank you for sharing. Please share more end-to-end projects focused on the telecom domain.
Definitely
Please bring projects which can be showcased on a CV, like customer segmentation, customer retention, customer call analysis. It would be very helpful. Thanks for your work!
@@riyatiwari4767 noted, pls wait for 2 weeks, you would see new Telecom projects 🔥🔥
Amazing work..clear explanation of EDA and explaining insights by analysing...Keep doing more end to end ML projects for us in different domains.. thank you
Thanks Srikanth.
@@SatyajitPattnaik thanks sir and please do one more end to end ml project like this in a different domain.
@@depeshkumarmohanty9522 Already done, pls search "end to end satyajit" you will get more projects
what an amazing video..thanks for it
Very very helpful!!
Awesome video!
thank you so much for this video
Thank you brother, You saved me from failing my assignment. You are really a Savior sent by god.
Thanks Jaideep ☺️😍
Lovely project..
Thanks for the amazing End To End Explanation...... Sir can you please make a video playlist on Speech To Text Machine learning Model building from Scratch?
Sure, will do 👍
Hey, I'm getting this error
What am I doing wrong?
ValueError: X has 65 features, but RandomForestClassifier is expecting 50 features as input.
Yeah, me too... could you please help? In the webpage I'm getting the error as Internal Server Error, and in the Anaconda prompt a ValueError: X has 51 features, but RandomForestClassifier is expecting 50 features as input...
I'm not getting the error always, but 3 times in 5... for the 2nd input I'm getting the output, but for the 1st I'm getting the error.
@@ittaboyinaharshithayadav7759 Did you manage to get output ?
@@gauravmore8578 were you able ro solve it..?
@@gauravmore8578 No, I'm not getting the output every time... for most of the cases I'm getting ValueError: X has 51 features, but RandomForestClassifier is expecting 50 features as input...
'SMOTEENN' object has no attribute 'fit_sample'
and
name 'X_resampled' is not defined
can you explain the error and the correct fix, sir, please?
Sir please reply
it was changed to "fit_resample" in newer versions of imbalanced-learn; please use ChatGPT for faster resolution :)
Hey there, love your teaching skills... I have a question: can the EDA steps used here serve as the base for different DS projects, or is there something different?
Same 😀
Having a problem while running it, could you please help me out? While running app.py an error occurs in the browser... Internal Server Error
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
Pls go through comments, many had this issue because of the input that they pass
This is a great project and I got to learn a lot from it. I wanted to ask will this be a good project for my final year dissertation?
Yes surely
Hi, I would like to know the procedure, i.e. what to tweak in the code, for the next-3-months churn prediction you were talking about
thank you sir.
welcome sir
Awesome content
Many thanks
Hi, actually I have a doubt about how to handle an imbalanced dataset, like having only two columns x and y, where y has both positive and negative values in it. How do I approach it? I tried transforming x to log x and many more things, but it didn't work.
You have label-encoded the data before splitting it into test and train; that is why you are getting such high precision and recall. There is data leakage here.
But isn't that normal to do?
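The leakage-safe pattern, regardless of which encoder you use, is to fit every encoder/scaler on the training split only and then transform the test split. A sketch using sklearn's OneHotEncoder on toy data (the "Contract" column is a made-up stand-in for the dataset's categoricals):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Toy categorical feature
df = pd.DataFrame({"Contract": ["Month-to-month", "One year", "Two year"] * 10})
X_train, X_test = train_test_split(df, random_state=0)

# Fit on train only; handle_unknown="ignore" protects against levels
# that appear only in the test set or in new inputs
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(X_train)
X_train_enc = enc.transform(X_train).toarray()
X_test_enc = enc.transform(X_test).toarray()  # test never influences the fit
print(X_train_enc.shape, X_test_enc.shape)
```

Both splits end up with the same column count, and nothing about the test distribution leaks into training.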
Hi,
every industry has the problem of predicting whether a person will churn or not. How about making this project more realistic with prescriptive analytics: converting a churning customer into a non-churning one, and if a person has already churned, offering him a good premium to convert him back. It would be helpful to the entire community. Please make a video on this.
Good suggestion, but the part you explained is really difficult to implement, as it needs cost analysis and ROI calculation, and it needs sales team involvement too, which is really difficult to show.
@@SatyajitPattnaik thanks for your reply. Please can you make a video on reinforcement learning on banking or insurance data? It would be helpful for aspiring data scientists. Hope to see that video soon 🎉
@@VenuGopal-dr8ln Reinforcement learning is a huge topic, let me know a specific topic that you want, and I can make a video on it.
@@SatyajitPattnaik I have an idea for customer retention in the financial industry, but I don't know how to implement reinforcement learning, although it requires huge data. My idea is to implement prescriptive analytics in churn prediction. It needs any normal ensemble algorithm to predict churn or non-churn customers; then, taking the churned customers' data, we implement some prescriptive analytics: before a customer churns, we recommend some premiums to them. LSTM models are needed because we need to focus on the customer's past behaviour with the company; time series is needed because it is the customer's journey, how loyal they are to the company; and finally reinforcement learning is needed because we need to train our model continuously with the data, so the model learns by itself on future data.
I’m working on a project titled "Predictive Analytics for Customer Churn in E-commerce Using Machine Learning and Sentiment Analysis on Customer Reviews" for my master’s. I’ve explored many datasets, but I’m struggling to find one that has both transactional data and customer reviews from the same source. Most datasets either have one or the other.
Could you please help me find a suitable dataset or guide me on how to approach this? Since my project integrates both traditional analysis and sentiment analysis, having data from a single source is crucial.
Thanks in advance for your help!
thank you for the tutorial. I have a question: when I'm running all your code, I get ValueError: X has 64 features, but DecisionTreeClassifier is expecting 45 features as input. Can you explain it? thank you
It shouldn't; can you check again? Many people are using this, and it works fine.
If you can't figure it out, upload your code to GitHub, share the link with me, and tell me which line it is failing on..
Ok, I will check it again; I hope it runs well like it does for others. Thank you very much.
@@SatyajitPattnaik Your code is awesome, it really works finally, thank you very much.
@@iqballatifable thanks Iqbal, do share my channel with your friends & colleagues 🔥🔥
@@iqballatifable I too got the same error; how did you resolve it?
thank you so so much sir............
Welcome 🔥🔥
ValueError: X has 52 features, but DecisionTreeClassifier is expecting 50 features as input. How can I fix this problem, please?
The error message is quite clear: you must be passing some value that is not present in the original data, so when you do the encoding a new feature gets generated, which is why you got 2 extra features.
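A robust guard against this kind of mismatch (a sketch, not code from the repo; the column names below are toy stand-ins): after encoding the incoming row, realign it to the exact training columns, dropping extras and zero-filling anything missing.

```python
import pandas as pd

# Columns the model was trained on (toy stand-ins for the real 50)
train_columns = ["tenure", "Contract_Month-to-month",
                 "Contract_One year", "Contract_Two year"]

# A new input that encodes to a different column set: the known levels
# are missing and an unseen level ("Quarterly") adds an extra column
new_input = pd.DataFrame({"tenure": [12], "Contract": ["Quarterly"]})
encoded = pd.get_dummies(new_input)

# reindex: keep only the training columns, in training order,
# filling any missing dummy columns with 0
aligned = encoded.reindex(columns=train_columns, fill_value=0)
print(aligned.shape[1])  # always matches the model's expected feature count
```

With this in the Flask route, the classifier always receives exactly the feature count it was fitted with, no matter what the user types.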
Hi, how was tenure_group one-hot encoded into dummies? And how did it generate the 0's and 1's?
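The general idea, as a sketch (the bin edges and labels here are illustrative, not necessarily the exact ones used in the video): tenure is first binned into labelled groups, and get_dummies then turns each group into a 0/1 indicator column.

```python
import pandas as pd

df = pd.DataFrame({"tenure": [3, 15, 30, 55, 70]})

# Bin the continuous tenure values into labelled ranges
labels = ["1-12", "13-24", "25-48", "49-60", "61-72"]
df["tenure_group"] = pd.cut(df["tenure"], bins=[0, 12, 24, 48, 60, 72],
                            labels=labels)

# Each row gets a 1 in the column for its group and 0 everywhere else
dummies = pd.get_dummies(df["tenure_group"])
print(dummies)
```

So the 0's and 1's simply record which bin each customer's tenure fell into.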
I am getting an error while implementing SMOTEENN... when I run it, it gives me the error 'NoneType' object has no attribute 'split'. Kindly help me solve this error, I am stuck.
The project explanation is great; just one problem: when you applied SMOTEENN before splitting the train and test data, you got 92% accuracy through the wrong approach. On correcting this, accuracy comes down below 80%. It would be great if you could fix this issue.
Very detailed explanation. Thanks a lot for the efforts in creating this. I have a query regarding selection of training dataset for a subscription based model. If I select historical data with churners in the last 3months, for example, would I need to restrict the subscription start date in the training dataset for creating the base data?
You can create tenure as a derived metric based on expiration date - subscription date
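As a sketch of that derivation (the column names are made up, and days // 30 is only an approximation of calendar months):

```python
import pandas as pd

df = pd.DataFrame({
    "subscription_date": ["2021-01-15", "2020-06-01"],
    "expiration_date":   ["2023-01-15", "2021-06-01"],
})

start = pd.to_datetime(df["subscription_date"])
end = pd.to_datetime(df["expiration_date"])

# Tenure in (approximate) whole months
df["tenure"] = ((end - start).dt.days // 30).astype(int)
print(df["tenure"].tolist())  # [24, 12]
```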
@@SatyajitPattnaik Thanks for replying, but my question was: should I filter the data by subscription start date too, like only taking subscriptions for a particular time period? I am taking churned customers for the last 3 months.
Hi, is this industry level project for data scientist?
@@sandeepthukral3018 yes, I have deployed similar applications in 2 of my previous companies; however, the complexity in real production-level projects is immense
Hi Satyajit. In addition to predicting whether a customer churned or not, can we also predict when the customer is going to churn, e.g. in 6 months / 9 months etc.? How can we accomplish that?
There's a column called tenure. So if you are testing the model on customer X and he has been in the system for 24 months, just pass tenure as 30 for that customer and you will get the result for whether he's going to churn in the next 6 months or not.
Hi Satyajit. Nice End-to-End project. Could you let us know when we need to do Hyper-parameter tuning, and why is it required.
Once you build your prototype model and want to take a further step in enhancing it, you might have to do hyperparameter optimization....
Hyperparameter tuning is required to reduce the variance.
By introducing a small amount of bias, we are able to reduce variance to a large extent.
@@sarfrazjaved330 Spot on 👍👍
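As a hedged illustration of that bias-variance trade-off (synthetic data and an illustrative parameter grid, not the video's exact values), scikit-learn's GridSearchCV can pick the depth and leaf-size caps automatically:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn data
X, y = make_classification(n_samples=300, random_state=42)

# Capping depth and leaf size adds a little bias but reduces variance
param_grid = {"max_depth": [4, 6, 8], "min_samples_leaf": [5, 8, 20]}
search = GridSearchCV(DecisionTreeClassifier(random_state=42),
                      param_grid, cv=5, scoring="recall")
search.fit(X, y)
print(search.best_params_)
```

The grid values are assumptions for the sketch; in practice you would score with the metric that matters for churn (recall on the churned class is a common choice).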
Very good explanantion but I have a doubt as in if we choose decision tree in this case is there a way to find out the root node? any code available to do so?
You can plot a tree, sklearn.tree.plot_tree
@@SatyajitPattnaik Thank you very much, but I got very small nodes. I used
fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(5, 5), dpi=300)
tree.plot_tree(model_dt_smote, filled=True)
and the nodes are not clearly visible. Is there any way to increase the size of the nodes and make the plot clear? That would be a great help!!
why is cross validation not performed while building the model. Is there any criteria that you have considered to not use CV.
A lot of things can be done with this model; this is not the most optimized model. Like you said, CV can be performed, and other feature engineering steps plus hyperparameter optimization can also be done.
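For reference, a minimal cross-validation sketch (synthetic imbalanced data and assumed hyperparameters) showing how CV would slot into this model:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced stand-in for the churn data
X, y = make_classification(n_samples=500, weights=[0.73], random_state=42)
model = DecisionTreeClassifier(max_depth=6, min_samples_leaf=8,
                               random_state=42)

# 5-fold CV yields a distribution of scores instead of one optimistic split
scores = cross_val_score(model, X, y, cv=5, scoring="recall")
print(round(scores.mean(), 3), round(scores.std(), 3))
```

The mean gives a less split-dependent estimate of performance, and the standard deviation shows how unstable the model is across folds.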
@Satyajit Pattnaik Thanks a lot for helping out and providing the video for my college project. But I'm getting an error at the end: after submitting the user values in the webpage, I get an internal server error, and in the Anaconda prompt I get ValueError: X has 51 features, but RandomForestClassifier is expecting 50 features as input. I don't get the error always, but mostly. I followed the same code given on GitHub. Could you please help?
Sorry, for the 2nd input in the CSV file I'm getting output, but for the 1st input I'm getting the ValueError and internal server error.
@@harshithayadavittaboyina7390 Hey, did you find the solution? I am facing the same kind of issue.
Hi Satyajit, thank you for your detailed steps to implement end-to-end ML with deployment. I have a few questions, can you please clarify? Please explain the usage of the following lines:
df_1=pd.read_csv("first_telc.csv") # how did you get this file "first_telc.csv?
df_2 = pd.concat([df_1, new_df], ignore_index = True)
single = model.predict(new_df__dummies.tail(1)) # as per this line, you are considering only the input we enter from the HTML form; so the df_1, df_2 steps may not be required. If I just process the HTML input using pd.get_dummies and then do model.predict, will that be enough for prediction? Am I correct?
Anyhow, thank you once again for your contribution.
Hi Satyajit, I found the reason for using 'first_telc.csv' and the answer to the clarifications I raised above. You have to use some sample data and then append your input data, because only then are all combinations of categorical values covered, so pd.get_dummies converts the input into the same number of columns as the analysis file. To match the 51 columns, you load that CSV file. However, to predict, you use only the last row (tail), which is the one entered by the user in the UI. Thanks a lot.
Okay.
@@DilipKumar-ww5si Very helpful, thanks.
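A small sketch of the trick described above (hypothetical category values, not the real first_telc.csv): concatenating sample rows before pd.get_dummies makes every category visible, and tail(1) then recovers the user's fully encoded row:

```python
import pandas as pd

# Hypothetical sample rows standing in for first_telc.csv: together
# they cover every category of the feature
base = pd.DataFrame({"Contract": ["Month-to-month", "One year", "Two year"]})

# A single user input from the UI yields only 1 dummy column on its own
user = pd.DataFrame({"Contract": ["One year"]})
print(pd.get_dummies(user).shape[1])  # 1

# Concatenating first lets get_dummies see every category; tail(1) then
# recovers just the user's encoded row at the full training width
dummies = pd.get_dummies(pd.concat([base, user], ignore_index=True))
row = dummies.tail(1)
print(dummies.shape[1])  # 3
```

Persisting the training column list and using `reindex` on the single row would achieve the same alignment without shipping a sample CSV.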
Dear sir, what is the first_telc.csv file which was in app.py? Please explain.
It's not used anywhere, pls ignore.
yes i also have the same doubt
For deployment, can you please explain what is in df_1, and why you concat it with new_df to create df_2? I am working on a similar project with this dataset.
When I check df_1 ('first_telc.csv'), it has 19 columns and 75 rows. Can you please explain where this comes from,
and why you concat it with the data coming from the user?
Sir, as I am from a pharmaceutical sales background, is it possible to make a video on an end-to-end business analytics project for pharmaceutical company use cases, like your very nice telecom project? It would be very helpful for us.
there are some healthcare case studies on my channel
Thanks a lot for this session, Satyajit.... I have a doubt: how do data scientists typically finalize a machine learning model for a given problem, since there are so many machine learning models available?
The model that predicts best on test data is fine-tuned and pushed into production to get tested on live data. Further, the results are evaluated by sales and various other teams; they give feedback on whether the model is actually doing well or not, and then further remodelling is done. It's a long process.
Sir, can you please tell me the use of concatenating df_1 with the input queries?
To test the model on new data, I am just capturing the input values in a new dataframe and then calling the model's predict on that dataframe.
I noticed one thing: you applied the SMOTEENN function on both the train and test data. Applying SMOTEENN on the test set and later calculating the F1 score, recall etc. from predictions made on that test set does not give correct results. When I didn't apply SMOTEENN to the test set, my results were actually very similar to the decision tree classifier score calculated on the data without SMOTEENN. What makes sense is applying SMOTEENN only to the train data and not the test data.
@@traeht Yes, that's a good observation. I made that mistake; balancing should be done on the training data only.
Sir, Can we get the PPT you have used to explain the model.
I don't share PPTs for free, but you can create the PPTs yourself by taking ideas from mine.
Hi sir, I am Anu.
I recently joined a data analyst course; your videos are very helpful for me.
I am facing an issue:
after installing Mask R-CNN, the import "from mrcnn import model as modellib" fails.
Kindly help me resolve the issue,
thank you.
Whats the error?
@@SatyajitPattnaik thanks for reply
AttributeError Traceback (most recent call last)
in ()
1 from mrcnn.config import Config
2 from mrcnn import utils
----> 3 import mrcnn.model as modellib
4 from mrcnn import visualize
5 from mrcnn.model import log
/usr/local/lib/python3.10/dist-packages/mrcnn/model.py in
253
254
--> 255 class ProposalLayer(KE.Layer):
256 """Receives anchor scores and selects a subset to pass as proposals
257 to the second stage. Filtering is done based on anchor scores and
AttributeError: module 'keras.engine' has no attribute 'Layer'
I searched on Stack Overflow; it says all the KE (keras.engine) instances should be changed to KL (keras.layers),
but I don't understand how to fix it.
Thank you very much
Welcome Chaman
may I know what ML algorithm is used for this churn prediction sir?
Classification algorithms
does this come under predictive analytics sir?
@@tamizhmalar9843 yes
Can you give the exact prerequisites for this project, so I can learn them and then directly do this project?
Python, stats, and machine learning are what you need to learn.
I ran the project and gave input in the fields, but it does not show any output.
did you solve the issue?
In my understanding, SMOTE should be applied on training data set...not on entire dataset.
@@saurabhagrawal3691 yes I already mentioned that in multiple separate comments, it was a mistake that i did
@@SatyajitPattnaik But I like your EDA approach. Kudos. I am a very new player in ML. Doing data science masters from Great learning. There are so much to learn from you. Will keep watching more case studies shared by you. Thank you for your amazing work.
@@SatyajitPattnaik Could you please share any detailed work of yours which include saving model and deployment using pickling and Flask. Many thanks.
Hey man, create a recommendation engine, so that people who are likely to churn get recommended something to win them back, and existing customers get even more engaged.
Sure, will build one
Hi, how can we compare two float numbers in Python with the greater-than operation? I am getting an error saying >= is not supported in NumPy.
Awesome! I enjoyed practicing the project along with listening to your insights, thanks a lot. I am a statistician by profession and wish to connect with you; how can I reach you?
All Social media links are provided in my video descriptions, you can reach me via those links :)
oooo okay
Hello sir,
Can I know which ML model you are using, e.g. linear regression, logistic regression, or random forest?
How can we apply artificial intelligence, eager and lazy learning, and finding common patterns in it?
You can read more about eager and lazy learning. The algorithm used in this use case is probably DecisionTree, if I recall correctly, and a DT is basically an eager learner.
How the learning happens is internal to the algorithm; to know more about it, you have to dig into the model's source code. Here we are just applying the algorithm and performing our predictions.
I hope you got my point. For example, if you want to know why decision trees are called eager learners, you will have to check how a decision tree is implemented and what the code inside the algorithm does.
A link which can help you: towardsdatascience.com/machine-learning-classifiers-a5cc4e1b0623
In case you still have the doubts, we can probably connect over LinkedIn.
Hi Satyajit, can you make a video on how a fresher can get a data scientist job in the industry? And another question: are you from Odisha?
Sure, I can make a video, and yes, I am from Odisha.
Hello sir can you tell me the source from where did you get the Dataset from?
It's a public dataset easily available on Kaggle and various other websites.
@@SatyajitPattnaik thankyou sir
Bro, how do I download your dataset?
The GitHub repository link is provided; please fork the project on GitHub and use the data..
First of all, thank you a lot for this video! And secondly, can you provide subtitles for it?
Well, it's in English already; let me know if you're facing any difficulties.
Yes, I'm experiencing some difficulties with understanding the pronunciation. You would be so kind if you could enable subtitles on this video. Thank you again for this great content!
After training the model, you have shown the accuracy on the test data, i.e. 93%, but how can we get churn/not-churn predictions for the test dataset, not through the API but in Python only?
Yes, we can get it: just predict on the entire test data and append the predictions in a new column, for example df["Prediction"] = model.predict(X_test), or the churn probability with model.predict_proba(X_test)[:, 1]
Just giving you ideas..
@@SatyajitPattnaik Thanks for the clarification. I will try that .
Two more questions:
1. We did the correlation part; where did we use it? The variables that don't affect the target variable could be deleted before model building. Please let me know if I am thinking the right way, or correct me if I am wrong; that would be really helpful.
2. max_depth=6, min_samples_leaf=8: how did we decide these values in this example?
Can we use the model to predict data in PBI? Then create a table of churned customers in PBI. I don't use an Excel file but data from a database to test. Can you help me?
Predicting inside PBI isn't possible, but predictions can be integrated with PBI to showcase the results. In PBI you can do time-series forecasting, but not ML predictions.
@@SatyajitPattnaik I have finished building the customer churn prediction model and saved it with pickle. Now how can I use that model to predict data for PBI (via integration or some other way) and show convenient results for calculating customer churn? I don't know what to do :( Please help me. Thank you very much.
@@VyĐặngThịTường-i3m Predictions can be done outside Power BI, Power BI is just a visualisation tool
@@SatyajitPattnaik I can't get the data from PBI to predict outside, because the data is in a DB2 database :(. How do I connect VS Code with the DB2 database to create churn predictions on the database data? Then I can use that data in PBI, right?
@@VyĐặngThịTường-i3m Use ChatGPT to learn how to connect to DB2 from your Python notebook. Once you are able to create the connection, start working on the predictive model, and then you can connect with Power BI and show the predictions in PBI.
Asking if you have any playlist for the basics of Excel, PowerPoint, or MS Word. If you have, then please provide me with the link.
How do I get the customer churn dataset?
Check video description
I am getting Internal server error when i click on submit on the webpage , kindly help ASAP
So your values aren't matching the expected values.
@@SatyajitPattnaik I am inputting the values directly from the dataset itself; can you provide the values to be entered?
@@akshajagarwal3519 Well, if you are using exactly similar values, then there's no way it will fail; it's tested and used by many. Please debug it, or else modify the input fields to be drop-down values.
@@SatyajitPattnaik Thank You
How did you fix this error??
Hi Satyajit, may I ask you to share the link for the Python code used in today's session?
I will upload it tomorrow in the first half for sure; it's a busy day for me today. Thanks for understanding!!
github.com/pik1989/MLProject-ChurnPrediction
Is it mandatory to be good enough to write code on our own if we are looking for data engineer jobs? .........from a final-year student
Not mandatory
Hello, thanks for making an interactive video. However, I advise reducing the size of the video section titles, such as the Model Deployment title. It's very distracting, and I can't see what you're doing because it's obscured by the section title 🙂
It's a very old video of mine; check out my channel for some of the latest end-to-end projects 😀
Hi Satyajit, thank you for the well-done project. I have a data science interview this week; can I present this end-to-end project in front of the interviewer?
Of course you can..
@@SatyajitPattnaik are you from odisha??
@@susantakumarsahoo1982 yes
Even I am planning for the same. Thank you for the in-depth explanation.
Hi, how do we create churn data in the retail industry?
You have to read some papers about it.
Can any of you please tell me what is the difference between monthly charges and total charges? Thanks in advance.
In the telecom churn dataset, "Monthly Charges" refers to the amount charged to the customer for their telecom services each month, while "Total Charges" refers to the total amount charged to the customer over the entire duration of their service with the telecom company.
@@SatyajitPattnaik Thanks a lot.
Can I know what are pre-requisites of this project?
Python and basics of ML should be enough
Could you please explain telco_data.TotalCharges... As we don't know in advance which columns contain the null data, how can we directly write it like that?
Can you point the time where you have this doubt
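For context, the usual handling of that column (a sketch with two fake rows mimicking the public Telco file, where TotalCharges arrives as strings and brand-new customers have a blank): coerce it to numeric, and then isnull() reveals which columns actually hold nulls:

```python
import pandas as pd

# Two fake rows mimicking the raw Telco file: TotalCharges arrives as
# strings, and brand-new customers (tenure 0) have a blank " "
telco_data = pd.DataFrame({"tenure": [0, 12],
                           "TotalCharges": [" ", "845.5"]})

# Coerce to numeric: blanks become NaN, which isnull() can then locate
telco_data["TotalCharges"] = pd.to_numeric(telco_data["TotalCharges"],
                                           errors="coerce")
print(telco_data.isnull().sum())  # per-column null counts
```

So you don't need to know the problem column up front: running `isnull().sum()` over the whole frame after the numeric conversion points you to it.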
How do we incorporate the 6-month or 3-month prediction into this?
Your time-related parameters will play a vital role here. Let's say you want to test your existing customers, and a customer's latest tenure is 27 months; if you want to test whether he stays after 6 more months or not, you need to pass the tenure as 33 for that customer, along with the other details..
There are other columns that would change as well, like total charges, age, etc.
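A hedged sketch of that idea (hypothetical customer values; column names follow the Telco dataset): shift tenure by the horizon and roll the time-dependent charges forward before scoring:

```python
import pandas as pd

# Hypothetical snapshot of one customer (columns follow the Telco dataset)
customer = pd.DataFrame({"tenure": [27], "MonthlyCharges": [70.0],
                         "TotalCharges": [1890.0]})

horizon = 6  # months ahead we want to score

future = customer.copy()
future["tenure"] += horizon
# Time-dependent fields must be rolled forward too
future["TotalCharges"] += horizon * future["MonthlyCharges"]

print(int(future["tenure"].iloc[0]))          # 33
print(float(future["TotalCharges"].iloc[0]))  # 2310.0
# `future` would then be encoded and passed to model.predict(...)
```

This is only a what-if scoring trick on a static model; a proper time-to-churn answer would call for a survival-analysis style model instead.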
Sir, I have a problem with model deployment in Spyder.
Can you help me? The error is "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte".
Anyway, thanks for the great and amazing video.
It's an encoding error. Spot the exact line where it occurs; it must be failing while you are reading the file.
The SMOTEENN function is showing an error
I am getting an internal server error when I run the project.
You need to see the error and fix it; it's a very basic project, and a lot of error handling is required.
Hi sir, can I use this project on a data analyst role resume?
You can, as predictive analytics is a part of analytics :)
Someone please help me. I am confused between customer segmentation and customer churn.
Customer segmentation is an unsupervised technique, where you create customer clusters based on their behaviour. When you have a variable that tells you who is churning and who is not, you can build a classification model to predict whether a new customer will churn or not.
@@SatyajitPattnaik thanks :)
@@anketsonawane6651 I can make a video on customer segmentation if you promise me 100 subs 😂
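To make the contrast above concrete, a minimal sketch on synthetic data: KMeans segments customers without using any label, while the churn classifier is trained on the known churn label:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=42)

# Segmentation: unsupervised, ignores the churn label, finds groups
segments = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# Churn prediction: supervised, trained on the known churn label y
clf = DecisionTreeClassifier(random_state=42).fit(X, y)
churn_pred = clf.predict(X[:5])

print(np.unique(segments))  # the three discovered segments
print(churn_pred.shape)     # one churn/no-churn prediction per customer
```

In practice the two are complementary: segments describe who your customers are, while the classifier predicts what each one will do.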
Can you send the code which you showed in the video?
Check description
Sir, how can we show the reason why a customer is churning?
That's a second problem. Churn or not-churn is one classification problem; you can also train a multi-class classifier with classes like: not churned, churned due to service, churned due to cost, etc. So it's a multi-class classification problem 👍