I have a PhD in computer science with a focus on deep learning, and I still learn something new from your videos. I'm grateful for all the neat insights I get from your teaching!
I've seen this same effect in some models I trained, and I think one thing that nearly gets rid of it completely is removing the model's influence from your data. In my case it was possible to calculate what the outcome would have been if the model hadn't influenced the process. In the case you explained, I would divide the watch counts by the percentages of recommendations to compensate for the effect of the recommendations. It's essentially Bayesian statistics, where you try to determine the probability of the film getting watched given that it has been recommended. Hope this makes sense to everyone!
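A minimal sketch of that correction, assuming we only have per-film watch counts and the fraction of sessions in which each film was recommended (the numbers and variable names here are purely illustrative):

```python
import numpy as np

# Hypothetical observed data: raw watch counts per film and the fraction of
# user sessions in which the current model recommended each film.
watch_counts = np.array([5000, 1200, 300])          # films A, B, C
recommendation_rate = np.array([0.50, 0.30, 0.05])  # share of sessions recommended

# Divide out the recommendation exposure to estimate how often each film is
# watched per recommendation, i.e. roughly P(watched | recommended) up to a constant.
adjusted_popularity = watch_counts / recommendation_rate

# Normalize so the adjusted scores are comparable across films.
adjusted_popularity /= adjusted_popularity.sum()
print(adjusted_popularity)  # film C looks much better once exposure is divided out
```

With these made-up numbers, film C jumps from the smallest raw count to 30% of the adjusted popularity, which is exactly the kind of de-biasing the comment describes.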
This phenomenon occurs in Deep Q-learning and SARSA, where you need the target Q-value in order to update the current Q-function, especially in problems with continuous state spaces where the target Q-value is typically estimated using the same model. So the algorithm essentially tries to predict a target and then learns from that prediction. One way to reduce this effect is to use an epsilon-greedy policy, which chooses a random action with probability epsilon; that is conceptually similar to the idea in the video of keeping a small amount of randomness in the model's actions.
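For anyone who hasn't seen it, a minimal epsilon-greedy sketch (the function name, Q-values, and epsilon here are illustrative, not from the video):

```python
import numpy as np

def epsilon_greedy_action(q_values: np.ndarray, epsilon: float,
                          rng: np.random.Generator) -> int:
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: uniform random action
    return int(np.argmax(q_values))              # exploit: best current estimate

# Example: with epsilon = 0.1 the agent keeps ~10% exploration, which is the
# "small amount of randomness" analogy from the video.
rng = np.random.default_rng(0)
action = epsilon_greedy_action(np.array([0.2, 1.5, -0.3]), epsilon=0.1, rng=rng)
print(action)
```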
Hi Ritvik, I've been watching a lot of your videos. You explain things very well. I have one request: can you do some videos purely on the math topics that are required for ML? Especially something like stats.
Probably should implement concepts from genetic algorithms to make sure the next iteration's training set includes elements that were excluded from the output of iteration i-1.
Question:
You start with N unique values. You sample N values with replacement (basically bootstrapping). Then you iterate this process, where the input vector for iteration i (always of length N) is the output of iteration i-1. How many iterations would you need until it converges on a single value?
Example:
N=5
[1,2,3,4,5]
i_1 = [1,2,2,4,5]
i_2 = [1,2,4,5,5]
.
.
.
i_n-1 = [4,4,4,4,5]
i_n = [4,4,4,4,4]
Vector size: 5
Converged value: 4 (doesn't mean much; it could just as well be colors)
# of iterations: n
Answer:
The larger N (the size of the vector), the closer the number of iterations gets to 2 * N (so for a vector of size 50 it will take, on average/in expectation, about 100 iterations).
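A quick simulation sketch of that claim (my own check, not part of the original comment), assuming uniform resampling with replacement at every iteration:

```python
import numpy as np

def iterations_to_fixation(n: int, rng: np.random.Generator) -> int:
    """Resample a length-n vector with replacement until only one value remains."""
    values = np.arange(n)  # start from n unique values
    steps = 0
    while len(np.unique(values)) > 1:
        values = rng.choice(values, size=n, replace=True)
        steps += 1
    return steps

rng = np.random.default_rng(0)
n = 50
trials = [iterations_to_fixation(n, rng) for _ in range(200)]
print(np.mean(trials))  # should land in the vicinity of 2 * n = 100
```

This is essentially the neutral Wright-Fisher drift process, whose expected fixation time is on the order of 2N generations, so the simulation should agree with the 2 * N rule of thumb.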
Your channel is a treasure! Thank you!
Would you also take the diversity directly into account when training the next model?
Like, say, if you can measure that the diversity had an effect different from the predicted one, that indicates something, right?
Really unique video! I loved it. Do similar feedback-loop issues occur with demand or price forecasting? And if so, how? I was thinking that high demand for an item on one day puts more bias on that item in the future?
Thank you.
Missed a good opportunity to plug a “like and subscribe to train the model” 😂
Haha good one!
insightful video, thank you!
Imagine you're paying out profit share to the authors of your content based on popularity, and your own recommendation model behaves like this. Sucks to be the content creator... The way the diversity is implemented seems to be absolutely key...
Yeah, you shouldn't ever be training a new model on the output of a previous model. And that's ignoring the fact that you shouldn't just be recommending whatever is popular, which is already wrong; you should recommend movies related to what the user likes, which avoids all of this.
Very unpleasant channel, starting with the loud background music.