Wow, wasn't expecting a new video. Love the way you explain things, keep it up!
Glad this was useful!
Excellent summary regarding randomness in NN training and generative AI. Very good illustrations as well.
Thanks for the kind words!
Amazing new video recapping areas of randomness in deep neural nets. I do have a question regarding top-k sampling: why do we have to renormalize the top-k choices in the vocabulary? Can't we just randomly choose between the top-k choices?
Good question. This is more for interpretability purposes, but you are right, you can skip the normalization step.
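A minimal sketch of what that looks like, assuming PyTorch and made-up logits over a tiny vocabulary (not the video's code). Option A renormalizes the top-k logits with softmax; option B skips the explicit renormalization, since torch.multinomial only needs non-negative weights and normalizes them internally, so both sample the same distribution:

```python
import torch

# Illustrative logits over a tiny vocabulary (placeholder values)
logits = torch.tensor([2.0, 1.0, 0.5, -1.0, -3.0])
k = 3

# Keep only the top-k logits and their vocabulary indices
top_logits, top_indices = torch.topk(logits, k)

# Option A: renormalize via softmax so the k probabilities sum to 1
probs = torch.softmax(top_logits, dim=-1)
next_token = top_indices[torch.multinomial(probs, num_samples=1)]

# Option B: skip the explicit renormalization; torch.multinomial
# normalizes the non-negative weights internally, so sampling in
# proportion to exp(logit) yields the same distribution
weights = torch.exp(top_logits)
next_token_b = top_indices[torch.multinomial(weights, num_samples=1)]
```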
Amazing video.
Only one thing I'm wondering: what if we start by sampling different seeds to initialize the weights and biases, feed forward once for each, and see which one results in the lowest loss? The seeds could be a range of numbers, e.g. 1-100, or a set of random numbers. Do you think this is useful in practice?
That's a good question. And yes, it can be useful. Actually, I use that for creating confidence intervals, for example. E.g., see section 4 here: github.com/rasbt/MachineLearning-QandAI-book/blob/main/supplementary/q25_confidence-intervals/1_four-methods.ipynb
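For the confidence-interval idea, here is a minimal sketch of the seed-based approach (not the notebook's exact method): train the same model with several random seeds, collect the test accuracies, and compute an approximate 95% interval from them. The accuracy values below are placeholders, not real results:

```python
import numpy as np

# Hypothetical test accuracies from the same model trained with
# different random seeds (placeholder values)
accuracies = np.array([0.912, 0.905, 0.918, 0.909, 0.915])

mean = accuracies.mean()
# Standard error of the mean across seeds
sem = accuracies.std(ddof=1) / np.sqrt(len(accuracies))

# Approximate 95% confidence interval (normal approximation)
lower, upper = mean - 1.96 * sem, mean + 1.96 * sem
print(f"accuracy: {mean:.3f} ({lower:.3f}, {upper:.3f})")
```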
Should we tune the seed for better results?😂
Haha, believe it or not, I once reviewed a paper where the seed was a hyperparameter.