Project 10. Credit Card Fraud Detection using Machine Learning in Python | Machine Learning Projects
- Published: 8 Apr 2021
- Hi! I will be conducting one-on-one discussions with all channel members. Check out the perks and join the membership if interested: / @siddhardhan
In this video we have built a credit card fraud detection system using machine learning with Python. For this project, we have used the Logistic Regression model.
All presentation files for the Machine Learning course as PDF for as low as ₹200 (INR): Drop a mail to siddhardhans2317@gmail.com
Enroll at One Neuron to learn from 100 courses in one subscription with 5% discount: courses.ineuron.ai/neurons/Te...
Machine Learning Projects Playlist: • Machine Learning Projects
Machine Learning Course with Python Playlist: • Machine Learning Cours...
Hello everyone! I am setting up a donation campaign for my RUclips Channel. If you like my videos and wish to support me financially, you can donate through the following means:
From India 👉 UPI ID : siddhardhselvam2317@oksbi
Outside of India? 👉 Paypal id: siddhardhselvam2317@gmail.com
(No donation is small. Every penny counts)
Thanks in advance!
Hi guys! I am Siddhardhan. I work in the field of Data Science and Machine Learning. It all started with my curiosity to learn about Artificial Intelligence and the ability of AI to solve several Real Life Problems. I worked on several Machine Learning & Deep Learning projects involving Computer Vision.
I am on this journey to empower as many students & working professionals as possible with the knowledge of Machine Learning and Artificial Intelligence.
Let's build a Community of Machine Learning experts! Kindly Subscribe here👉 tinyurl.com/md0gjbis
I am making a "Hands-on Machine Learning Course with Python" in RUclips. I'll be posting 3 videos per week: Monday Evening; Wednesday Evening; Friday Evening.
Dataset file: www.kaggle.com/mlg-ulb/credit...
Colab File Link: colab.research.google.com/dri...
Download the Course Curriculum File from here: drive.google.com/file/d/17i0c...
LinkedIn: / siddhardhan-s-741652207
Telegram Group: t.me/siddhardhan
Facebook group: groups/49085... Instagram: / siddhardhan23
wow, this is such a fluent, smooth explainer, I am a mere beginner, and still able to understand almost everything. My teacher was not even able to explain a line of it correctly.
Hello, @Siddhardhan. Your presentation taught me more than I expected. Really awesome accuracy score and good EDA. Thank you for your video.
Many RUclips videos and even courses in Spanish do not explain very well and do not cover everything necessary to do a machine learning project, but you explain it very well. I'm not very good at English but I understood the procedure very well, excellent video
Thanks a lot 😇
Hi Siddhardhan. Your explanation is awesome. Keep up the good work. Nice comparison between overfitting and underfitting based on accuracy, and a nice example too.
Simple and nice explanation. I didn't know machine learning, just Python, but your way of explaining helps me a lot to understand machine learning. Thanks a lot.
Hi, Siddhardhan
"I just wanted to take a moment to express my sincere gratitude for the excellent tutorial on Credit Card Fraud Detection using Machine Learning in Python that you posted on RUclips. Your clear and concise explanations, combined with the practical examples, have helped me tremendously in understanding the fundamentals of this complex topic. Your dedication to providing high-quality content is evident, and I appreciate the time and effort you put into creating such an informative tutorial. Once again, thank you so much for sharing your knowledge with the world - you're making a positive impact on the lives of many, including mine."
Finished the coding practice. Feeling a lot more confident.
The way you explained was really amazing. I was able to clear all my doubts regarding this project. Keep up the good work and seriously thanks a lot for providing such a great content!!!!
thanks a lot for your positive words 😇
When I load the dataset and check for null values, I get some, but he does not. Is anyone else getting this error?
@@Siddhardhan What is the accuracy of this project?
Great work Siddhardhan, you really explained it in an amazing way
You really explain step by step in a very easy way. I want more videos on machine learning, such as COVID detection.
Very informative video...thanks for your community work....God bless 🙏
A great teacher ❤️❤️❤️❤️❤️❤️ i have ever seen in my life...who is explaining each and every line of a big project code🔥🔥🔥🔥🔥🔥🔥🔥
Thanks a ton 😇
@@Siddhardhan I can't explain how much you helped me clear a doubt that had been stuck in my brain for at least a week ❤️
Extraordinary content! I have watched all your videos from hands on ml course to this one.Everything was explained such that even a beginner would understand it. You have a really great gift in teaching complex stuff in a easy manner. My request to you is to keep teaching like this so that you will be able to change the life of lots of people like me. I am going to recommend this channel to all my juniors and friends.
Thanks a lot 😇
@@Siddhardhan Why did you call train_test_split after balancing the dataset? Won't it create a data leakage problem?
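A minimal sketch for the question above, assuming a synthetic stand-in for the data (the column names `V1`, `Amount`, `Class` mirror the Kaggle file, but the values here are random). With undersampling no row is duplicated, so splitting afterwards does not copy training rows into the test set, but the test set then has an artificial 50/50 class ratio. One way to avoid that is to split first and undersample only the training portion:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the real data (about 5% "fraud")
rng = np.random.default_rng(2)
n = 2000
data = pd.DataFrame({
    "V1": rng.normal(size=n),
    "Amount": rng.exponential(50.0, size=n),
    "Class": (rng.random(n) < 0.05).astype(int),
})

X = data.drop(columns="Class")
Y = data["Class"]

# 1. Split first, keeping the original class ratio in both parts
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, stratify=Y, random_state=2
)

# 2. Undersample the legitimate rows in the training portion only
train = X_train.assign(Class=Y_train)
fraud = train[train["Class"] == 1]
legit = train[train["Class"] == 0].sample(n=len(fraud), random_state=2)
balanced_train = pd.concat([legit, fraud], axis=0)

print(balanced_train["Class"].value_counts())
```

The test set (`X_test`, `Y_test`) stays imbalanced, so scores measured on it are closer to what the model would see on real traffic.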
can you suggest some key points to add this project in resume. Thanks
Absolutely , Stunning . Your way of explanation is too good 💥💥. Thanks for sharing !! Super excited from more projects videos ...
Thank you so much 😀
Hey, I have built the same project as you taught but with a different dataset, and while using logistic regression I am facing the error "ValueError: could not convert string to float". What should I do?
Your explanation is really wonderful and so easy to understand
Hi. I am working with the same dataset, but on my PC, when I check the info of the dataset, it shows a different amount of data than yours. I couldn't find out why.
Thanks a lot! You're doing a great work. Keep it up!!
my pleasure 😇
Hey, I got a "total no. of iterations reached limit" warning in the logistic regression model. What should I do to solve this?
bro from next video onwards please add some visualizations. any way ur explanation is excellent, thank u for sharing this content to public who needs like this content.
I still didn't get the part where groupby is used to check the mean. Why is it computed, and what does it tell us?
Kindly throw some light on this, please.
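A small illustration for the groupby question, using a made-up five-row frame (the values are invented; only the names `credit_card_data` and `Class` come from the thread). Grouping by `Class` and taking the mean gives one row of per-column averages for each class, which shows at a glance whether fraudulent and legitimate transactions differ on a feature:

```python
import pandas as pd

# Toy data: three legitimate rows (Class 0) and two fraudulent rows (Class 1)
credit_card_data = pd.DataFrame({
    "Amount": [10.0, 20.0, 30.0, 200.0, 100.0],
    "V1": [0.1, 0.3, 0.2, -4.0, -2.0],
    "Class": [0, 0, 0, 1, 1],
})

# One row per class, one column per feature, each cell is that class's mean
class_means = credit_card_data.groupby("Class").mean()
print(class_means)
```

In this toy frame the mean `Amount` is 20 for class 0 and 150 for class 1. A large gap like that suggests the feature carries signal the model can use; it is an exploratory check, not part of training.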
Hi 👨🎓, from Brazil/Teresina/PI. 👏👏👏
What if I use Random Forest or Isolation Forest instead of logistic regression? Or SVM, since the data is imbalanced and has some outliers?
Thank you for this very informative tutorial!
man you are doing a really great job!
Great, brother, you explained it really well.
Thank You for this entire project
Thanks a lot Sid sir. Love and support from Pakistan.
Dear sir, I am doing my MSc. I am thinking of doing my project dissertation on credit card fraud detection. Everybody is working on the Kaggle dataset. If I use the same dataset, the university will say "copied". Could you please suggest a dataset from another source? Thanks
Can you please tell me how can I manage to perform test of different algorithms on different dataset in single colab repository ?
very good content sir.
One question: I have a project on this. When I searched for null values, mine shows null values in every column from V8 to Class. How come yours shows no null values?
hi there,
there are missing values in the dataset from kaggle..
pls do check
Hello sir, I am getting an error like "'DataFrame' object is not callable" when I run legit_sample = legit.sample(). Could you please help me with this?
Sir, I have a question. Whenever I run credit_card_data['Class'].value_counts(), it doesn't show the exact number of fraudulent transactions. It is supposed to show 492, but in my case it shows 239. Why is that, sir?
Amazing Videos!! Thanks for sharing to get practices on ML Projects
My pleasure 😇
Can we do over sampling instead of under sampling in this case?
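A hedged sketch of random oversampling for the question above, using `sklearn.utils.resample` on a synthetic training frame (the data and the 950/50 split are invented). It repeats minority rows with replacement until both classes have the same size. It is meant to be applied to the training portion only, since duplicated rows that end up in both train and test would inflate the test score:

```python
import numpy as np
import pandas as pd
from sklearn.utils import resample

# Synthetic training frame: 950 legitimate rows, 50 fraudulent rows
rng = np.random.default_rng(0)
train = pd.DataFrame({
    "V1": rng.normal(size=1000),
    "Class": np.r_[np.zeros(950, dtype=int), np.ones(50, dtype=int)],
})

legit = train[train["Class"] == 0]
fraud = train[train["Class"] == 1]

# Draw fraud rows with replacement until they match the legitimate count
fraud_upsampled = resample(
    fraud, replace=True, n_samples=len(legit), random_state=0
)
oversampled_train = pd.concat([legit, fraud_upsampled], axis=0)

print(oversampled_train["Class"].value_counts())
```

SMOTE, mentioned elsewhere in the comments, synthesises new minority points instead of repeating existing ones; it lives in the separate imbalanced-learn package and is not shown here.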
Wow, very nice and detailed video with a smooth explanation. Please add how to use inbuilt sampling techniques in this video. I liked the video very much.
Hello sir, amazing lecture, but I have a doubt: why do you take those numbers instead of transaction IDs, and how do you get those numbers?
Thank you so much for making this video. It helped me a lot 😄
Thank You Sir. It was one of the best tutorials ever, and I loved the way you explain all the little things in the easiest way.
Thanks a lot for your positive words 😇 happy that you liked it!
@@Siddhardhan Sir, can you make a complete video on how to start ML, what are its pre-requisite and maths and mathematical intuition of all algorithm . Please Sir .Separate playlist . Because I'm getting confused what to do first ....
You can follow this playlist: ruclips.net/p/PLfFghEzKVmjsNtIRwErklMAN8nJmebB0I
hi Siddhardhan, thank you so much for this content, your explanation is really easy to understand.
listening to your step by step explanation and directly practiced it on my google colab really help me to understand it.
your content helps me a lot!
I just wanna know whether it gives the accuracy details only or detect whether card is fraud or not
Your teaching good sir your videos very useful my project sir thank you so much sir
What's the ultimate result out of it? What did we learn? Is there any way to find out fraudulent transactions in legit data set?
Great Work! but the collab link does not contain the required files like the dataset, it contains other files of sampling house
Have you also implemented this for neural network algorithm?
Sidhard ,you are awesome!!!!!
Hi Siddhardhan,
I have an error, kindly resolve it.
# training the Logistic Regression Model with Training Data
model.fit(X_train, Y_train)
In this part I got "STOP: TOTAL NO. of ITERATIONS REACHED LIMIT". How can I resolve it?
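A minimal sketch for the iteration-limit message, on synthetic data from `make_classification` (not the credit card file). The message is a convergence warning from the default solver, not a crash: the fit stopped at the default cap of 100 iterations. Raising `max_iter` is the simplest thing to try; scaling the features first usually helps the solver converge faster as well.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for X_train / Y_train
X_train, Y_train = make_classification(
    n_samples=500, n_features=10, random_state=2
)

# Allow the solver more iterations than the default of 100
model = LogisticRegression(max_iter=1000)
model.fit(X_train, Y_train)

# n_iter_ reports how many iterations the solver actually used
iterations_used = int(model.n_iter_[0])
print("iterations used:", iterations_used)
```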
Very good video and easy to understand code, thumbs up!!
After concat I should be getting 492 values for each class, right? But I'm getting 492 values for class 1 and 1 value for class 0. Please help!
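One likely cause of getting a single class-0 row, sketched on toy frames (the sizes 1000 and 5 are invented; `legit`, `fraud`, `legit_sample` and `new_dataset` follow the names used in the thread): `DataFrame.sample()` with no arguments returns exactly one row, so the count has to be passed as `n`.

```python
import pandas as pd

# Toy stand-ins: many legitimate rows, few fraudulent rows
legit = pd.DataFrame({"Amount": range(1000), "Class": 0})
fraud = pd.DataFrame({"Amount": range(5), "Class": 1})

# Without n, sample() draws a single row
one_row = legit.sample()

# Passing n draws as many legitimate rows as there are fraud rows
legit_sample = legit.sample(n=len(fraud), random_state=2)
new_dataset = pd.concat([legit_sample, fraud], axis=0)

print(len(one_row))
print(new_dataset["Class"].value_counts())
```

With the full dataset the same call would be `legit.sample(n=492)`, matching the 492 fraudulent transactions mentioned above.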
can anyone tell me where the data preprocessing is done in the video?
Can anyone tell me how to encode categorical features when some features have more than 10k categories?
I am working with a different dataset.
Hi, every time I upload your data there are missing values in it. Any suggestions?
I just want to suggest a few points to cover before you post a project:
1. Goal of the project
2. Its output
3. Ways it can be implemented
4. How it is checked against user inputs in the real world
I think these basic points must be mentioned for it to be called a meaningful project.
Anyway, thanks for your contribution @Siddhardhan
Do you have an idea of how this model could be used in a real-world, for instance, in a web application with a backend in python or node js?
Hey can you please give the answers to these questions
@@jeremycapital You will have to embed the model in the server. You will also have to include separate modules for data cleaning and preprocessing. That would add a separate data engineering vertical to your application; we use Kafka for stream processing.
Extremely nice explanation. It's explained in such a simple way. Thank you, brother! Really helped a lot. 😁
*Please tell me what topics of ML topics used in this project so that I can start this project after learning those topics .*
Awesome mate, finally my search ends with your content.
How do I process the missing values to make the data meaningful?
I am getting an error after creating the training and testing data.
At 36:17
After fitting with model.fit(x_train, y_train):
Found input variables with inconsistent numbers of samples: [197, 787]
Kindly help me in resolving this.
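A possible explanation for the 197 versus 787 mismatch, sketched with placeholder arrays of 984 rows (492 per class, as in the balanced dataset discussed in the thread). `train_test_split` returns its outputs in the order `X_train, X_test, Y_train, Y_test`; unpacking them in another order puts a 197-row array next to a 787-row one, which produces exactly this error in `fit`. This is an assumption about the cause, since the commenter's code is not shown.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features and labels with 984 rows
X = np.arange(984 * 2).reshape(984, 2)
Y = np.array([0, 1] * 492)

# Correct unpacking order: X_train, X_test, Y_train, Y_test
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, stratify=Y, random_state=2
)

# Row counts must match pairwise before calling fit or scoring
print(X_train.shape, Y_train.shape)  # training pair
print(X_test.shape, Y_test.shape)    # test pair
```

Printing the four shapes right after the split is a quick way to catch a swapped variable before it reaches `model.fit`.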
Very good explanation, thank you for the explanation.
Great.
I would like to add some remarks that might help in future projects:
- The way you dealt with the imbalanced data will certainly lead to bad predictions, because a lot of information was lost when taking a small sample from a large dataset.
- Use SMOTE instead of sampling.
- Use decision trees or random forests; they are better when dealing with imbalanced data.
- Use more evaluation metrics like the ROC curve, F1 score, recall...
When I load the dataset and check for null values, I get some, but he does not. Is anyone else getting this error?
@@nandinimadan6421 There are no missing values in the default dataset; maybe you want to download the data again.
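Following the reply above, unexpected nulls or a lower row count often point to a file that was cut off during download or upload. A small sketch, assuming a hypothetical helper `check_dataset` and a tiny inline CSV whose last line is truncated (the real file is not loaded here), shows how such a file looks once parsed:

```python
import io
import pandas as pd

def check_dataset(df: pd.DataFrame) -> dict:
    """Summarise signs of an incomplete file (hypothetical helper)."""
    null_counts = df.isnull().sum()
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "rows_with_nulls": int(df.isnull().any(axis=1).sum()),
        "columns_with_nulls": null_counts[null_counts > 0].index.tolist(),
    }

# Tiny stand-in for a CSV whose final line was cut off mid-transfer
csv_text = (
    "Time,V1,V2,Amount,Class\n"
    "0,1.2,-0.3,10.0,0\n"
    "1,0.4,0.8,5.5,1\n"
    "2,-1.1,\n"
)
df = pd.read_csv(io.StringIO(csv_text))

report = check_dataset(df)
print(report)

# Dropping the damaged row is a stopgap; re-downloading is the real fix
clean = df.dropna()
```

Nulls that start at some column and run through `Class` in the last row, as one commenter describes, match this pattern. The full Kaggle file has 284,807 rows and 492 fraudulent transactions, so comparing `len(df)` and `value_counts()` against those numbers is a quick completeness check.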
Can someone explain how the LogisticRegression function works? How are the V1, V2, V3... values classified as fraudulent or not?
Anyone knows what the numbers in the dataset denote. How do they relate to real life transactions?
Credit card fraud detection using machine learning: where are you writing the code for this? Please tell me a bit. Can I get the full code, and the PPT?
You explained so well.. 😍
Bro, while following the same code, it shows a NameError at X_train, X_test, Y_train, Y_test = train_test_split(......): "NameError: name 'train_test_split' is not defined". But I have written "from sklearn.model_selection import train_test_split" at the beginning. Could anyone say why this error is happening?
Hi, can you give to code to plot the dataframe , to visualise the comparison of fraudulent and legit means. thank you
Video is good, but I am getting the error "Found input variables with inconsistent numbers of samples: [787, 197]". Can anyone help me out?
Hi, what did we conclude in the end? What is the result?
Very nice explanation, and I really liked the video. The classification report is also a very good measure of the model. I think if we do cross-validation and use some boosting techniques, the score can be increased further. One more important thing: accuracy doesn't matter much here. Precision and recall are the main metrics, because we can't let a fraudulent transaction be classified as non-fraudulent. Thank you
nice insights. you can definitely try to do some optimizations.
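The suggestions in this thread (classification report, precision and recall, cross-validation) can be sketched as follows. The data comes from `make_classification`, so the numbers it prints say nothing about the real dataset; the label names "legit" and "fraud" are chosen here for readability.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, recall_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic balanced data standing in for the undersampled dataset
X, Y = make_classification(n_samples=1000, n_features=10, random_state=2)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, stratify=Y, random_state=2
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, Y_train)
pred = model.predict(X_test)

# Per-class precision, recall and F1 instead of a single accuracy number
report = classification_report(Y_test, pred, target_names=["legit", "fraud"])
print(report)

# Recall on the fraud class: the share of frauds the model catches
fraud_recall = recall_score(Y_test, pred)

# 5-fold cross-validated recall on the training data
cv_recall = cross_val_score(
    LogisticRegression(max_iter=1000), X_train, Y_train, cv=5, scoring="recall"
)
print("CV recall per fold:", cv_recall)
```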
Awesome video. Thank you.
Thank You! This was very helpful
Hi, I have a doubt. Can I work on this project side by side, do everything you did, and just put it in my resume? Or do I have to do something else? I'm a data science fresher who wants to start a career in data science, so could you please clarify how to present projects in my resume?
We can use the SMOTE method to handle unbalanced data; it is also very useful.
Bro, you could have tuned the model's hyperparameters, and you could use F1, recall, AUC/ROC, and precision to get a more accurate picture of its performance.
Thankyou so much sir it helped me lot.
Sir I have only one confusion which is why we create new dataset for checking accuracy
Which algorithm u used here?
Can you do it without undersampling?
If I take this tutorials, is the software used for learning free? Like what you exactly did in the video?
Can we do this project using random forest and SVM?
how did we detect fraud with help of accuracy?
Bro code is not running it shows error in first 3rd line
Hi, amazing video! I wanted to ask that when this is tested against user input, what all inputs are required to be taken from the user? Only the amount?
adi depend on cases
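On the question of user input: a model trained on this dataset expects every feature column it was trained on, not only the amount. In the Kaggle file that is `Time`, `V1` to `V28` (anonymised PCA components) and `Amount`, 30 values in total. The sketch below trains on random numbers purely to show the input shape and the predict call; the label rule and the sample transaction are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# The 30 feature columns of the Kaggle file, in order
feature_names = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount"]

# Random stand-in training data with an arbitrary labelling rule
rng = np.random.default_rng(2)
X_train = rng.normal(size=(200, len(feature_names)))
Y_train = (X_train[:, 1] > 0).astype(int)
model = LogisticRegression(max_iter=1000).fit(X_train, Y_train)

# A single new transaction must supply all 30 values, as one row
new_transaction = rng.normal(size=len(feature_names))
input_row = new_transaction.reshape(1, -1)

prediction = int(model.predict(input_row)[0])
label = "fraudulent" if prediction == 1 else "legitimate"
print("The transaction is predicted to be", label)
```

So the model does more than report accuracy: `predict` returns a class for each transaction passed in. Because `V1` to `V28` come from a transformation that is not published with the dataset, a real application would need its own feature pipeline to produce them.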
Can anyone send me a video link about how to deal with missing values from this channel?
Can i know the ML algorithm used here??
Why didn't we use a standard scaler in this situation?
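For reference, adding a scaler is a small change. This sketch wraps `StandardScaler` and `LogisticRegression` in one pipeline so the scaling statistics are learned from the data passed to `fit` and reused at prediction time. The data is synthetic, with one column stretched to imitate a wide-range feature such as `Time`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, Y = make_classification(n_samples=500, n_features=10, random_state=2)
X[:, 0] *= 10000  # imitate a column on a much larger scale

# Scaling and the classifier are fitted together as one estimator
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, Y)

# After scaling, every column has mean 0 and standard deviation 1
scaled = model.named_steps["standardscaler"].transform(X)
print(scaled.mean(axis=0).round(6))
print(scaled.std(axis=0).round(6))
```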
what is the algorithm he used in this code can anyone SAYYYYYY
Which algorithm did he use?
can i use this for CSE final year project???
Can you share existing system and disadvantages of that?
Awesome, very helpful
Thanks 😇
feature engineering parts at what minute?
you could have also tried SMOTE technique to better understand and predict.
Nice explanations
(Time-aware attention based gated network for credit card fraud detection by extracting transactional behaviors), Sir i choose this title , can i do ur project for this title ?
great job siddhardhan
thanks 😇
How to deal with "ValueError: could not convert string to float " if we are working with a dataset with various data types ? For example my data has columns such as actionType which includes "transfer", "withdrawal" etc.
I got the same error.. how to solve it?
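One common way around "could not convert string to float" is to one-hot encode the text columns before fitting. The sketch below uses `pandas.get_dummies` on a made-up frame: `actionType` is the column named in the question, while `amount`, `isFraud` and all values are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

data = pd.DataFrame({
    "amount": [100.0, 2500.0, 40.0, 9000.0, 15.0, 7000.0],
    "actionType": ["transfer", "withdrawal", "transfer",
                   "withdrawal", "payment", "transfer"],
    "isFraud": [0, 1, 0, 1, 0, 1],
})

# Replace the text column with one 0/1 column per category
X = pd.get_dummies(data.drop(columns="isFraud"),
                   columns=["actionType"], dtype=int)
Y = data["isFraud"]
print(X.columns.tolist())

# All columns are numeric now, so the fit no longer raises ValueError
model = LogisticRegression(max_iter=1000).fit(X, Y)
```

For columns with thousands of distinct categories, one-hot encoding creates thousands of columns, so other encodings (for example frequency counts) may be a better fit.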
ParserError: Error tokenizing data. C error: Expected 31 fields in line 1987, saw 42. This is the error I am getting while loading the dataset into a data frame. Any solution? ChatGPT told me to look at line 1987 in the CSV file. I looked, but I did not find any error. How do I fix this?
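A sketch of what this error means and one workaround, using a tiny inline CSV with one over-long line (not the real file). The parser expects every row to have as many fields as the header; a row with extra fields raises `ParserError`. The `on_bad_lines="skip"` option, available in pandas 1.3 and later, drops such rows. A damaged row usually means the file itself is corrupted, so downloading it again is the more reliable fix.

```python
import io
import pandas as pd

csv_text = (
    "Time,V1,Amount,Class\n"
    "0,1.2,10.0,0\n"
    "1,0.4,5.5,0,9,9\n"   # malformed: 6 fields instead of 4
    "2,-1.1,7.0,1\n"
)

# Default behaviour: the over-long row raises ParserError
try:
    pd.read_csv(io.StringIO(csv_text))
    strict_failed = False
except pd.errors.ParserError as err:
    strict_failed = True
    print("strict read failed:", err)

# Workaround: skip rows that have too many fields (pandas >= 1.3)
df = pd.read_csv(io.StringIO(csv_text), on_bad_lines="skip")
print(df)
```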
Hi Siddhardhan
The video is very informative and easy to implement.
However, undersampling is not the optimal way to approach this problem because we are discarding almost 95% of data and just training over
good insights. I'll research more about this.
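One alternative that keeps all rows, sketched on synthetic data with roughly 2% positives: `class_weight="balanced"` makes `LogisticRegression` weight errors on the rare class more heavily instead of discarding majority rows. This swaps in a different technique from the undersampling used in the video, and the printed recall applies only to this synthetic data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data: about 98% class 0, 2% class 1
X, Y = make_classification(
    n_samples=5000, n_features=10, weights=[0.98, 0.02], random_state=2
)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, stratify=Y, random_state=2
)

# Class weights are set inversely proportional to class frequencies
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, Y_train)

fraud_recall = recall_score(Y_test, model.predict(X_test))
print("recall on the rare class:", fraud_recall)
```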