Hey this channel is my fav, glad you're back
+1
Much appreciated. :)
Me too. Favorite of favorite
It is the only personal channel I subscribe to on youtube.
Glad to see you're back. This channel deserves more than 41k subs! Keep it up!
Thank you for tuning in again. Apologies for the wait. Hoping to make up for it :)
Your videos are really informative and entertaining.
Thank you very much for the explanation.
Thank you very much for the explanation. But I still don't understand how I can pretrain the similarity function, how should I organize its inputs, etc. Can you explain a little bit more about it?
How does zero-shot learning fit into this example?
Amazing explanation!
Thank you very much
Wow nice video
I learned new things
And your secondary voice makes it fun for me 😅😂🤣 bye bye
Glad it's entertaining :)
Amazing. If possible cover the coding part as well. Good luck.
So during training we still have plenty of data to train the model, including data from the same category, right?
I am a first-time learner; the name makes it sound as if even during training we have very little data, or one example per category.
Thanks for the video!
Truly great explanation
Hey, I've really enjoyed all your videos! Very nicely done at an appropriate technical level. I'd say the name of your channel is a bit misleading. It could also be affecting the number of your subscribers... Keep up the good work. Much appreciated!
Thanks for the compliments! I make all sorts of videos these days. In fact recently, I've been coding a lot. Just hoping people will find the channel eventually 🙂
Can you provide cosine similarity code using TensorFlow, please?
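In case anyone else lands here looking for the same thing: below is a minimal, dependency-free sketch of cosine similarity in plain Python, so it runs anywhere. (In TensorFlow the ready-made equivalent would be `tf.keras.losses.CosineSimilarity`, which returns the negated value so it can be minimized as a loss.)

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```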
Cute demo!!!!❤
Why thank you :)
8:55 - but if it’s the same network used to process images i and j, how exactly do you tune its parameters? Tuning it to make A’s embedding look closer to B’s embedding will also change B’s embedding values at the same time.
I guess it's simpler if you just look at the math. We are doing gradient descent on the loss function, so you just take the gradient of the loss with respect to parameters of network f. If you want more math details, you can go check out this video.
ruclips.net/video/4S-XDefSjTM/видео.html&ab_channel=ShusenWang
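To make the shared-weights point concrete, here's a toy sketch (the one-parameter "network" f(x) = w·x is made up for illustration): the loss depends on w through *both* branches, and the gradient simply sums both contributions, so both embeddings moving at once is exactly what gradient descent accounts for.

```python
# Toy Siamese setup: one shared parameter w, "network" f(x) = w * x.
# Loss pulls the two embeddings together: L(w) = (f(a) - f(b))**2.
# Because both branches share w, dL/dw gets a term from each branch:
#   dL/dw = 2*(f(a) - f(b))*a - 2*(f(a) - f(b))*b = 2*(w*a - w*b)*(a - b)

def loss(w, a, b):
    return (w * a - w * b) ** 2

def grad(w, a, b):
    # analytic gradient, summing both branches' contributions
    return 2 * (w * a - w * b) * (a - b)

w, a, b = 1.0, 3.0, 1.0
for _ in range(100):
    w -= 0.05 * grad(w, a, b)  # gradient descent on the single shared weight

print(round(loss(w, a, b), 6))  # 0.0 -- the loss is driven to zero
```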
Hey, can you please send me a link to the original GAN video?
So can someone help me: if the network can only tell whether two images are the same or not, what is the actual learning done here? Isn't it just an image-to-vector comparison? Also, how does this help with the original problem of deciding whether it's the same or not? Thanks in advance!
It doesn’t tell whether the images are the same or not; it’s trained to tell whether the same-looking person appears in two different images. It does that by converting an image into a 64-dimensional vector of values that somehow describes all the features important to us, and then comparing those vectors across images. It’s trained relatively easily: you feed it pairs of images containing the same or different people and compare its output with the expected result. When it’s wrong (most of the time during training), you backpropagate the error to tune the network weights. Eventually it learns to compare different people, not only those used for training.
@@KlimovArtem1 thank you very very much - that's a clear and concise explanation.
thank you!
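For other readers: the pair-labeling step described above can be sketched roughly like this (the toy placeholder data and the helper name `make_pairs` are hypothetical, not from the video):

```python
import itertools

# Hypothetical toy dataset: person id -> list of "images" (placeholder strings).
data = {"alice": ["a1", "a2"], "bob": ["b1", "b2"]}

def make_pairs(data):
    """Build (image_i, image_j, label) tuples: label 1 for same person, 0 otherwise."""
    pairs = []
    for (pid_i, imgs_i), (pid_j, imgs_j) in itertools.combinations_with_replacement(data.items(), 2):
        for img_i in imgs_i:
            for img_j in imgs_j:
                if img_i != img_j:  # never pair an image with itself
                    pairs.append((img_i, img_j, 1 if pid_i == pid_j else 0))
    return pairs

pairs = make_pairs(data)
print(len(pairs))  # 8 pairs total: 4 positive (same person), 4 negative
```

A real training loop would then feed each pair through the shared network and backpropagate the mismatch between its output and the label, exactly as the reply describes.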
I learned I either have half a brain or just face blindness
Sorry to break it to you
Do the final embeddings of the trained network make any human readable sense? Like, hair color, face roundness, etc.
well explained
Thanks!
Thanksss
who are these models you hired 😩
Get it, models?
I fell off my chair. Wait till the next one, I hired some pros that'll make you dizzy
What about prior knowledge? You did not go into it.
The prior knowledge it learns here is which important features (the embedding) of the images it needs to compare. For example, once it has learned to compare a few human faces, it can compare other humans it has never seen before (because it learned what makes them look different).
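As a rough illustration of that "compare people never seen in training" step (the embedding vectors and threshold below are made up for the example; a real trained network would produce the embeddings):

```python
import math

def l2_distance(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def same_person(emb_a, emb_b, threshold=0.5):
    """Verification: embeddings closer than the threshold count as the same person."""
    return l2_distance(emb_a, emb_b) < threshold

# Hypothetical embeddings for people the network never saw during training:
anna_1 = [0.9, 0.1, 0.0]
anna_2 = [0.85, 0.15, 0.05]
boris  = [0.1, 0.8, 0.3]

print(same_person(anna_1, anna_2))  # True: small embedding distance
print(same_person(anna_1, boris))   # False: large embedding distance
```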
Several amateur problems here:
1. All so-called "prior" knowledge must be handled at the preprocessing stage, like face detection, for example. First "cook" the data, then "eat".
2. There is a huge misunderstanding across the entire AI/ML community. Your professors didn't teach you that there is a huge difference between an array and a vector. Not every array is a vector! Performing "similarity" functions, or any vector function, on an array is useless, and you will always get an illusion of recognition. There will always be "weird" cases where you will not be able to explain the decision made by your model.
wow
Very wow
Dude, this loss function wouldn't work at all in practice. Think about it first before posting the video...
Consider just the positive case, for an actually similar pair:
say distance = 0, so sigmoid(0) = 0.5 and loss = -log(0.5) ≈ 0.69,
with similarity = inverse(distance) = 1/(1 + distance).
That's why folks use contrastive loss.
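For anyone following along, the arithmetic in this comment checks out. Here's a quick sketch comparing the sigmoid-of-distance loss at distance 0 with a margin-based contrastive loss (the contrastive form below follows the common Hadsell-style definition; the margin value is an assumption):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# The commenter's point: with distance d = 0 for a perfect positive pair,
# a sigmoid-of-distance loss still pays -log(sigmoid(0)) = -log(0.5) ~ 0.693.
d = 0.0
loss_at_zero = -math.log(sigmoid(d))
print(round(loss_at_zero, 3))  # 0.693 -- nonzero even for a perfect match

# Contrastive loss: zero loss for a perfect positive pair, and a
# margin-based penalty only for negative pairs that sit too close.
def contrastive_loss(d, same, margin=1.0):
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

print(contrastive_loss(0.0, same=True))            # 0.0 -- perfect match, no penalty
print(round(contrastive_loss(0.2, same=False), 2)) # 0.32 -- negative pair too close
```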