Video Classification with a CNN-RNN Architecture | Human Activity Recognition
- Published: 12 Sep 2024
- Video Classification is the task of predicting a label that is relevant to the video.
Github: github.com/Aar...
Topics which I will cover in this Video Classification Tutorial are:
Overview of Video Classification
Steps to build our own Video Classification model
Exploring the Video Classification dataset
Training our Video Classification Model
Evaluating our Video Classification Model
#############################################
In case of any query, You can comment or you can contact me at aarohisingla1987@gmail.com
############################################
What are videos?
Videos are a collection of images (frames) arranged in a specific order.
In image classification, we take images, use feature extractors (like convolutional neural networks, or CNNs) to extract features from them, and then classify each image based on the extracted features. Video classification involves just one extra step.
While performing Video classification:
1- We first extract frames from the given video.
2- We use feature extractors (like convolutional neural networks, or CNNs) to extract features from all the frames.
3- We classify the video based on these extracted features.
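A minimal sketch of the first two steps above, assuming the video has already been decoded into a NumPy array of frames. The names `sample_frames`, `extract_features`, `pad_with_mask`, and `MAX_SEQ_LENGTH` are illustrative and not taken from the video, and the feature extractor is a stand-in (a per-channel mean) rather than a real CNN:

```python
import numpy as np

MAX_SEQ_LENGTH = 20  # assumed cap on the number of frames kept per video


def sample_frames(video, max_frames=MAX_SEQ_LENGTH):
    """Step 1: keep at most max_frames evenly spaced frames."""
    num_frames = len(video)
    if num_frames <= max_frames:
        return video
    idx = np.linspace(0, num_frames - 1, max_frames).astype(int)
    return video[idx]


def extract_features(frames):
    """Step 2 (stand-in): one feature vector per frame.

    A real pipeline would run a pretrained CNN here; this just averages
    each colour channel so the example stays dependency-free.
    """
    channels = frames.shape[-1]
    return frames.reshape(len(frames), -1, channels).mean(axis=1)


def pad_with_mask(features, max_frames=MAX_SEQ_LENGTH):
    """Zero-pad short videos and record which positions hold real frames."""
    padded = np.zeros((max_frames, features.shape[1]), dtype=np.float32)
    mask = np.zeros(max_frames, dtype=bool)
    padded[: len(features)] = features
    mask[: len(features)] = True
    return padded, mask


# Fake video: 50 RGB frames of 32x32 pixels.
video = np.random.default_rng(0).random((50, 32, 32, 3))
frames = sample_frames(video)
features, mask = pad_with_mask(extract_features(frames))
print(frames.shape, features.shape, int(mask.sum()))
```

The padded features and the mask are the kind of input a sequence classifier would consume in step 3.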
Before we talk about Video Classification, let us first understand what Human Activity Recognition is.
The task of classifying or predicting the activity/action performed by someone is called Activity recognition.
With the help of Video Classification models we can solve the problem of Human Activity Recognition.
#VideoClassifier #VideoClassification #HumanActivityRecognition #CNN #RNN #AI #ComputerVision #DeepLearning #ArtificialIntelligence
Join this channel to get access to perks:
/ @codewithaarohi
Glad to see your video after many days. Thanks for uploading it Aarohi ji
My pleasure 😊
Thank you so much for creating such a clear and detailed video! Your effort and dedication truly shine through, and your content made our day. The way you presented the information was not only informative but also engaging. We appreciate the time and hard work you put into making this video, and it has undoubtedly added value to our understanding. Keep up the fantastic work, and we look forward to more insightful content from you in the future! 🌟
Thank you for appreciating my work:) Glad my content is helpful! Keep watching 😊
Very informative mam...helpful video for video classification
Glad to hear that
Thank you for this amazing video, very helpful♥♥
Glad it was helpful!
nice explanation, will it work for deepfake videos ?
Yes, video classification algorithms can be employed as part of a deepfake detection system. While deepfake generation typically involves the synthesis of realistic-looking videos, deepfake detection aims to distinguish between authentic and manipulated videos. Video classification algorithms can play a crucial role in this process by analyzing various characteristics and features of videos to determine their authenticity.
@@CodeWithAarohi Thank you, I tried this, and it is giving better accuracy. Can you please help me get the accuracy and loss plots for the trained model? I am not able to get them.
Really Ma'am you are gem. Plz make video on real time pest detection.
It will be a great project definitely.
Noted
@@CodeWithAarohi Thanks Ma'am
Thank you for the video. To classify between sitting and walking, should we simply include videos within the respective classes?
Yes
great explanation! thanks from online community
You are welcome!
maam i know more from your YT than they taught me in 2years of CS
Glad to hear that. Keep watching and learning 😊
Aarohi, I am working on FFCResNet with LSTM for video classification. I extracted features from train and test using FFCResNet, but I am stuck at the LSTM: I am getting a tensor mismatch error between sequences and targets.
Your code is 100% functional. How can I predict a specific video instead of a random one, as you do in the last line of code, "test_video = np.random.choice(test_df["video_name"].values.tolist())"? Thanks
specific_video_name = "YOUR_SPECIFIC_VIDEO_NAME"
# Locate the specific video in the test_df dataframe
video_row = test_df.loc[test_df["video_name"] == specific_video_name]
# Extract the features for the specific video
video_features = video_row.drop(columns=["video_name"]).values
# Reshape the features (if needed) to match the model input shape
video_features = video_features.reshape(1, -1) # Assuming the model expects a 2D array
# Make predictions using the trained model
predictions = model.predict(video_features)
# Print or use the predictions as needed
print(predictions)
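A small sketch of the "specific video instead of a random one" part of the question, on its own. It assumes only that the test set exposes a list of video names; `choose_test_video` and the file names are hypothetical, and the chosen name would then be passed to whatever prediction routine the notebook defines (not shown here):

```python
import numpy as np


def choose_test_video(video_names, wanted=None, seed=None):
    """Return `wanted` if it is in the test set, else pick a random name."""
    names = list(video_names)
    if wanted is None:
        # Same behaviour as the random pick quoted in the question.
        rng = np.random.default_rng(seed)
        return str(rng.choice(names))
    if wanted not in names:
        raise ValueError(f"{wanted!r} is not in the test set")
    return wanted


# Made-up names standing in for test_df["video_name"].values.tolist().
video_names = ["yoga_01.mp4", "walk_07.mp4", "sit_03.mp4"]
test_video = choose_test_video(video_names, wanted="walk_07.mp4")
print(test_video)
```

Failing early on an unknown name avoids silently predicting on an empty selection.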
perfect and thanks for share this
My pleasure 😊
Ma'am, I am getting an error like "AttributeError: 'function' object has no attribute 'predict'".
Please suggest what I should do.
The "AttributeError: 'function' object has no attribute 'predict'" error typically occurs when you are trying to call the "predict" method on a Python function object, rather than on a machine learning model.
In Python, the "predict" method is a common method that is used by many machine learning models to make predictions based on input data. However, this method is not a built-in attribute of the Python function object, and it cannot be called on a function in the same way that it can be called on a machine learning model.
To fix this error, you will need to make sure that you are calling the "predict" method on a machine learning model, rather than on a Python function. For example, if you have defined a machine learning model using the scikit-learn library, you can call the "predict" method on the model object, like this:
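import numpy as np
from sklearn.linear_model import LinearRegression
# Toy data so the example runs on its own: y = 2 * x
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
X_new = np.array([[4.0]])
# Define the model
model = LinearRegression()
# Train the model on some data
model.fit(X, y)
# Make predictions using the model
predictions = model.predict(X_new)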
In this code, the "predict" method is called on the "model" object, which is an instance of the LinearRegression class. This is the correct way to make predictions using a machine learning model in scikit-learn.
If you are still getting the "AttributeError: 'function' object has no attribute 'predict'" error, it may be because you are trying to call the "predict" method on a Python function object, rather than on a machine learning model. In this case, you will need to make sure that you are calling the "predict" method on the correct object.
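One common way to end up with this error is assigning a model-building function to a variable without calling it. The sketch below reproduces that case with a stand-in class; `TinyModel` and `get_model` are made-up names, not part of the tutorial code:

```python
class TinyModel:
    """Stand-in for a trained model; doubles every input."""

    def predict(self, xs):
        return [2 * x for x in xs]


def get_model():
    """Stand-in for a function that builds and returns a model."""
    return TinyModel()


# Bug: the parentheses are missing, so `model` is the function itself.
model = get_model
try:
    model.predict([1, 2])
except AttributeError as err:
    message = str(err)
print(message)

# Fix: call the function so `model` is the object it returns.
model = get_model()
preds = model.predict([1, 2])
print(preds)
```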
Ma'am, can I classify vibrational data from non-vibrational data in video using deep learning?
Thank you ma'am, can you please tell me how to load sensor data and images as input for human activity recognition?
hello mam,
how can i increase the accuracy?
Great lesson!
Thanks a lot.
My pleasure!
thanks for this very important topic
Most welcome
Thanks ma'am for uploading 👏👏👏👏 It is really helpful. If possible, make a video on oversampling in deep learning. Best channel on YT.
Noted. Will try to cover in my upcoming videos.
Very Good
Thanks
Ma'am, what is the time performance of such a classifier? Suppose I want to predict dancing video segments in a 10-minute video (predicting time stamps from x seconds to y seconds as dancing). How long will the model take to predict?
Hey did you do this?
very nice explanation mam
Thanks a lot
hi, what is this filepath = "./tmp/video_classifier" looks like? Thank you
Madam how to identify particular video what code is to be changed in inference module
Amazing Mam,
can you share your video datasets?
or you using a open dataset ?
thanks before
I collected the videos from pixabay.com
Thanks
Welcome
Great
Thankyou
Hi! thank you very much! I was wondering where can I download the dataset you used for this?
You can prepare the dataset by downloading videos.
ma'am where can we find the dataset for it
Mam can we do video classification using the image dataset
Image datasets lack temporal information, which is crucial for video classification. Videos are more than just a collection of frames; the order and timing of frames convey important information.
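To illustrate why frame order matters, here is a small NumPy sketch under made-up assumptions: random vectors stand in for per-frame features, and a tiny hand-written recurrent update stands in for an RNN. An order-blind summary (the mean) cannot tell a clip from its reversed copy, while the recurrent summary can:

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 4))    # 5 frames, 4 features each (fake)
reversed_features = features[::-1]    # same frames, played backwards


def mean_pool(seq):
    """Order-blind summary: average the frame features."""
    return seq.mean(axis=0)


def simple_rnn(seq, w_in, w_rec):
    """Order-aware summary: h_t = tanh(x_t @ w_in + h_{t-1} @ w_rec)."""
    h = np.zeros(w_rec.shape[0])
    for x in seq:
        h = np.tanh(x @ w_in + h @ w_rec)
    return h


w_in = rng.normal(size=(4, 3))
w_rec = rng.normal(size=(3, 3))

same_mean = np.allclose(mean_pool(features), mean_pool(reversed_features))
same_rnn = np.allclose(
    simple_rnn(features, w_in, w_rec),
    simple_rnn(reversed_features, w_in, w_rec),
)
print(same_mean, same_rnn)
```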
@@CodeWithAarohi thanks you mam
Thank you so much for uploading this amazing video, ma'am. Could you please guide me on how to recognize emotions from videos? I also checked your website link but, unfortunately, the site has been blocked. Thank you so much, ma'am.
Thanks for the video, very intuitive explanation. I have a doubt that I hope you can help me with. In the first part with the CNN, the feature vector has a size of 2048 for each image, and I understood that this represents the output of the CNN. So does it mean that each element of this vector represents a class of some object, for example a dog, a cat, etc.? If it works that way, the size of the output should be 1000, because the InceptionV3 model is able to
No, we remove the classification layers of the CNN and feed the output of the pooling layer as features to the LSTM for training.
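A shape-only sketch of the difference between the 2048 pooled features and the 1000 class scores. It assumes a final convolutional feature map of 8 x 8 x 2048 per frame (the usual InceptionV3 size for 299 x 299 inputs); the random arrays are placeholders, not real network weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the last convolutional output of one frame.
feature_map = rng.random((8, 8, 2048))

# Global average pooling: average over the two spatial axes.
# This 2048-vector is what gets passed on as the frame's features.
frame_features = feature_map.mean(axis=(0, 1))

# The removed classification head would map 2048 features to 1000 classes.
head_weights = rng.normal(size=(2048, 1000))
class_scores = frame_features @ head_weights

print(frame_features.shape, class_scores.shape)
```

So the 2048 values are learned descriptors of the frame, and only the discarded head turns them into per-class scores.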
How do i provide custom weights in here??
Also where do you usually getting all the videos as well madam?
pixabay.com
Nice teaching. Could you tell me how you collect the dataset?
Collected videos from pixabay.com/
Hello mam, Thankyou for amazing video. Can you please guide me on how can I do activity detection for multiple person in a video. E.g. in a sports different players do different activities so how can I detect activity of each player's. Activity like defend, pass, save in football sports
Activity recognition for multiple people in a video is a challenging task in computer vision. One approach to achieve this goal is to use a combination of deep learning and computer vision techniques.
Here is a high-level overview of the steps you can take to implement activity recognition for multiple people in a video:
Data collection: Gather a large dataset of annotated videos that contain multiple people performing different activities. The annotations should include the activity labels and the bounding boxes for each person in the video.
Data preprocessing: Preprocess the data by extracting the frames from the videos, resizing them to a fixed size, and normalizing them.
Object detection: Use an object detection algorithm to detect the people in each frame of the video. This can be achieved using popular algorithms such as YOLO, Faster R-CNN, or SSD.
Person tracking: Use a person tracking algorithm to track each person across frames. This is necessary to keep track of each person's activities throughout the video.
Activity recognition: Use a deep learning algorithm to classify the activities performed by each person. You can use popular deep learning models such as CNNs or RNNs, or a combination of both.
Evaluation: Evaluate the performance of the model on a test dataset using metrics such as accuracy, precision, recall, and F1-score.
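The person tracking step above can be sketched with a greedy IoU matcher, assuming a detector already returns boxes as `(x1, y1, x2, y2)` tuples for each frame. This is a simplified illustration, not a production tracker; `iou`, `update_tracks`, and the threshold of 0.3 are assumptions made for the example:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def update_tracks(tracks, detections, next_id, threshold=0.3):
    """Match each existing track to its best-overlapping detection.

    tracks: dict mapping track id -> last known box.
    Returns the updated dict and the next unused track id.
    """
    updated = {}
    unmatched = list(detections)
    for track_id, box in tracks.items():
        if not unmatched:
            break
        best = max(unmatched, key=lambda det: iou(box, det))
        if iou(box, best) >= threshold:
            updated[track_id] = best
            unmatched.remove(best)
    # Any detection left over starts a new track.
    for det in unmatched:
        updated[next_id] = det
        next_id += 1
    return updated, next_id


frame1 = [(0, 0, 10, 10), (50, 50, 60, 60)]
frame2 = [(51, 50, 61, 60), (1, 1, 11, 11)]  # same people, listed in another order

tracks, next_id = update_tracks({}, frame1, next_id=0)
tracks, next_id = update_tracks(tracks, frame2, next_id)
print(tracks, next_id)
```

With stable ids per person, the crops for each id can be collected over time and passed to an activity classifier.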
Hi Arohi, could you please provide me the dataset link here?
I don't have it now.
Thank you Mam for such amazing videos. Is there any possibility that we can connect your 2 works in one? You taught: How to do custom keypoint detection using detectron2. Is there any possibility that I can detect the action of a person using the output of detectron2 keypoint detection and LSTM?
did you get it
@@mango8369 yes...do you have any suggestions?
❤❤
The model is unable to predict correct output with higher accuracy, it's always equal probabilities, that means the model is not really working. why? and how to fix?
I asked myself the same question. Even in the video if you notice at the end, probabilities are very close to each other.
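For context on what "equal probabilities" means numerically: a softmax output is nearly uniform whenever the scores (logits) feeding it are nearly equal, which is typical of a model that has learned little. The numbers below are invented for illustration and do not come from the video:

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D list of scores."""
    z = np.asarray(logits, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()


undertrained = softmax([0.05, 0.00, -0.03])  # scores barely differ
confident = softmax([4.0, 0.5, -1.0])        # one score clearly dominates

print(np.round(undertrained, 3))
print(np.round(confident, 3))
```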
TF is not getting defined. How should I resolve it?
Thanks you maam...😍😍
Welcome
hi, May I ask what is the effect of such low accuracy?
The dataset is small, and I trained it for very few epochs.
Thanks for this great video. I have learnt a lot from the videos but I have a problem. I can't download tensorflow, somehow. Is there a procedure I need to follow?
Create a separate environment of python and then install tensorflow
@@CodeWithAarohi Yes, I managed to get to a stage where I could see that the training was complete, and I saw the test accuracy, which showed me a 35% yoga prediction. I had downloaded only about 20-30 videos at that time, so this percentage must be reasonable. However, I started to get some errors like "data_train is not defined", etc. I played with some unimportant numbers and it somehow fixed itself. But now that I have increased the number of videos to 180, it says "did not improve" at each epoch. I deleted the videos but nothing has changed. Is it because it had already saved the results to the weights or something? I am sorry for my ignorance of Python. This is the first time I am using it, but I need to learn this activity recognition very soon.
Let me clearly state my question here once more. When I am in the sequence model part, it starts feeding (I suppose that is what it is doing). In epoch 1/30 it says val_loss improved from inf to 1.09728, but in 2/30 it did not improve. What is the possible reason for that?
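The "improved from inf" and "did not improve" lines look like the log of a checkpoint that keeps only the best weights seen so far. Assuming that is what the notebook uses, the bookkeeping behind those messages can be sketched in plain Python (`checkpoint_log` is a made-up helper, and the later loss values are invented):

```python
import math


def checkpoint_log(val_losses):
    """Track the best validation loss and describe each epoch."""
    best = math.inf  # nothing has been seen yet, so any loss is an improvement
    messages = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            messages.append(
                f"Epoch {epoch}: val_loss improved from {best:.5f} to {loss:.5f}"
            )
            best = loss
        else:
            messages.append(
                f"Epoch {epoch}: val_loss did not improve from {best:.5f}"
            )
    return best, messages


best, messages = checkpoint_log([1.09728, 1.2, 0.9])
for line in messages:
    print(line)
```

Under this reading, "did not improve" on a given epoch only means that epoch's validation loss was not lower than the best so far; it is not an error by itself.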
It is using body key points?
In this video we are not working on keypoints
Ma'am can you please give me link for videos from where you have downloaded it .... Yoga excercise
pixabay.com/
❤💫
thank you so much..can we somehow have your dataset, I mean the videos that you used because in the github are only the csv files? thank you in advance
No, this dataset is not available. You can download videos from internet and put them in a folder as I explained in a video.
@@CodeWithAarohi Thank you very much for the response
Thank you madam for such wonderful videos. I used the same code as yours from GitHub, but I have encountered an error at the sequence model step:
"OverflowError: cannot convert float infinity to integer". If possible, can you please help me solve it? Thanks
Thank you
Videos do not appear in the data set, how can I access them?
Download the videos from internet.
Can you tell me the links where I can download the videos?@@CodeWithAarohi
Thank you 👏👏👏
You're welcome 😊
Hi Mam, would you please share the link or give information about the dataset how could I find the dataset?
You can prepare your dataset by downloading small video clips related to your classes. I downloaded the clips from pixabay.com
Thankyou for the video. Can you please guide on how can we do activity detection for multiple people in a video. E.g. in a classroom different students do different activities so how can we detect activity of each student. This would be really helpful if one can help me here.
You need to prepare dataset for all the activities you want to detect and then can use this algorithm or you can use Object detection algorithms. For this you need to annotate the images of your dataset as per the chosen algorithm
@@CodeWithAarohi According to you which model will be better to go with. I want to do activity recognition on a video and for e.g. video can be of a classroom of students.
@@shaminemacwan1426 Have you done that? If you done that I need your help.
may I ask question ?
OpenCV: Couldn't read video stream from file "test/dataset/test/yoga/production_id_3760884 (2160p).mp4". how to over come this problem
If you are human, just read the error message then you will understand how to solve it.
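A message like this often comes down to a path that does not exist or a file that is empty. A small standard-library check can narrow that down before any video is opened; `find_unreadable` and the file names are made up for this sketch, and it does not test whether a file is a valid video:

```python
import tempfile
from pathlib import Path


def find_unreadable(paths):
    """Return (path, reason) pairs for files that are missing or empty."""
    problems = []
    for p in map(Path, paths):
        if not p.is_file():
            problems.append((str(p), "missing"))
        elif p.stat().st_size == 0:
            problems.append((str(p), "empty"))
    return problems


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "clip.mp4"
    good.write_bytes(b"\x00" * 16)       # placeholder bytes, not a real video
    empty = Path(tmp) / "empty.mp4"
    empty.touch()
    missing = Path(tmp) / "production_id (2160p).mp4"  # never created
    report = find_unreadable([good, empty, missing])

print(report)
```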
Ma'am in my dataset of 700 videos on train and 800 on test why is it taking more than 20 minutes to predict?
what is the size of your videos?
@@CodeWithAarohi ranging from 2 to 5 mB
Can i build this model with image data
Please anyone reply
I have a problem in the prepare_all_videos() function.
It shows:
ValueError: could not broadcast input array from shape (2048,) into shape (30,)
This error is related to your output classes. Check whether you have specified the number of classes correctly.
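For reference, NumPy raises this message whenever a vector is written into a slot of a different size. The sketch below shows one way that can happen, assuming that the 30 and the 2048 in the message are a sequence length and a per-frame feature size; it is an illustration of the mechanism and not a diagnosis of the commenter's code:

```python
import numpy as np

MAX_SEQ_LENGTH = 30   # assumed: frames kept per video
NUM_FEATURES = 2048   # assumed: size of one frame's feature vector
num_videos = 4

one_frame = np.ones(NUM_FEATURES, dtype=np.float32)

# Buffer without a feature axis: each video slot has shape (30,).
bad = np.zeros((num_videos, MAX_SEQ_LENGTH), dtype=np.float32)
try:
    bad[0] = one_frame  # a (2048,) vector cannot fit into a (30,) slot
except ValueError as err:
    error_message = str(err)
print(error_message)

# Buffer with a feature axis: each frame slot has shape (2048,).
good = np.zeros((num_videos, MAX_SEQ_LENGTH, NUM_FEATURES), dtype=np.float32)
good[0, 0] = one_frame
print(good.shape)
```

Printing the shapes of the buffer and of the value being assigned, just before the failing line, usually shows which dimension is off.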
@@CodeWithAarohi can you share this dataset?
thank you madam
Welcome 😊
Hello, can u help me for my research work implementation of yoga estimation model . I m looking for a Deep learning expert like u.
Please mail me at aarohisingla1987@gmail.com to discuss further
Would you help me in fake news detection project. i mailed to you.plz reply me ma'am. I am from Pakistan.
Can you make a same video on yolo + rnn please 🥺
Will try to do after finishing my pipelined videos
miss how can get the dataset
I have collected the videos from pixabay website
Hello, can I get your tensorflow, CUDA and cuDNN versions? I'm getting an error saying that I don't have a kernel for some reason which is "Graph execution error: No OpKernel was registered to support Op 'CudunRNNV3'"
My tensorflow version is 2.6.0, cuda 11.2 and cudnn 8.1
need slides
The more I learn about installing tensorflow, the more I understand it's a real pain in the ass.
Try to setup a new python environment and then install the required packages. Otherwise you will mess up your other codes working
Well, I solved this problem by working on a new computer. Somehow creating a new environment doesn't work. It always happens to me when I start to learn something new: an error that no one would ever see in their life, and I start to think I did something wrong, but no, it was because there was an impossible-to-solve error in an unrelated place you would never think of.
Imagine just reading out loud the simplest names of variables to define its action as its name and skip trough code reading out also some comments with no further explanations and throw some random not existing words to cast some confusion spells XD comedy park
Who are you and what you have done?!
JUST MAKE A ZIP OF UR VIDEOS THAT U HAVE AND GIVE IT IN THE DESCRIPTION IT IS NOT THE HARD LIKE I HAVE WASTED 5 HOURS FINDING A DATASET. LIKE WHENEVER SOME1 ASKS U UR LIKE: I DOWNLOADED IT FOR PEXA.WHATEVER JUST MAKE A ZIP
how can i improve accuracy madam?
There are various parameters to consider, e.g.: 1) your dataset quality, 2) the hyperparameters you are using, 3) the type of model, 4) dataset preprocessing, etc.
@@CodeWithAarohi madam can i use pure cnn instead of rnn?