I've just been scanning through your DS basics playlist and this DS concepts list. Thank you so much for all your effort; it has been incredibly useful and helpful. The best thing, I find, is that you're able to bring different models together under the same topic and talk about their pros/cons and where they do or don't apply, such as their loss functions or, like this one, how to do multiclass classification. This really helps me gain more perspectives on the same concept. Great videos and great illustrations!
Hey your kind words and support mean a lot to me. Thank you for taking the time out to comment :)
thanks, really clear explanation
very informative lecture
Excellent.
Great material. Thanks so very much. Can you please make one on one-class classification? Thanks.
Hi Ritvik! Great video :) I have a question about 10:25, where you mentioned that each model can predict a different class. My understanding of multi-class logistic regression is that, in all three of these models, the first shape is the positive class and the other 2 shapes combined are the negative class. So we find the probability of a data point belonging to the positive class in all 3 binary models, then apply argmax on these probabilities to predict the class. If my understanding is correct, we won't run into the problem you were mentioning, right? Worst case, all 3 probabilities could be equal (very rare) and we'd be indecisive.
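The argmax-over-probabilities idea in the comment above can be sketched in a few lines. This is a minimal illustration with hypothetical probabilities (the three values are made up, not from the video):

```python
import numpy as np

# Hypothetical P(positive class | x) from three one-vs-rest binary models,
# one per class: square, circle, triangle
ovr_probs = np.array([0.62, 0.55, 0.30])

classes = ["square", "circle", "triangle"]
# argmax resolves the case where more than one binary model claims the point
predicted = classes[int(np.argmax(ovr_probs))]
print(predicted)  # square
```

A genuine tie (all probabilities exactly equal) is the only truly indecisive case, as the comment notes.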
At 12:00 you pointed out in the universal extension section that those models don't perform well because they can't communicate with each other. Is there a way we can make them communicate?
Is what you have done in logistic regression the same as softmax, or is it different? I get confused.
Another nice video!!! I am wondering if you are going to talk about ordinal classification (i.e., multi-class classification with ordering).
Great suggestion!
@@ritvikmath thanks and cannot wait to see that!!
Great channel. Thanks. I do have a question: The One vs. One model (Universal Extension) seems to be conceptually very similar to using a pivot class & performing binary classification (Complex Extension). What's the difference? Thank you.
The extension, multinomial logistic regression, classifies based on probabilities, so the probability of a case x belonging to a class (where the class is sq, o, or ^) is by design likely to be higher for one particular class for a given x. Note that P(y=sq|x) + P(y=o|x) + P(y=^|x) = 1; the probabilities must sum to one. One vs. One will not, since each binary choice (even if that choice is based on a binary probability) is not informed by the other possible classes.
One (amateur) question: what about having a universal model that is a variation of One vs. All, in the sense that it gradually strips away classes along the way?
I mean, if you have classes X, Y, Z, first you ask: is this item X, or Y/Z? If it answers X, you go with it; if it answers Y/Z, you then run a classifier of just Y vs. Z. Shouldn't that be more accurate than the model presented in the video? It would probably require some thought about how to order the classifiers (possibly some classes are easier to separate from the rest, so those classifiers should run first, etc.).
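The cascade described in the comment above can be sketched as follows. The two binary classifiers are stand-ins here (simple hypothetical decision rules rather than trained models), just to show the control flow of "strip a class, then recurse on what's left":

```python
# Stage 1: X vs (Y or Z) -- hypothetical decision boundary on a 1-D feature
def is_x(point):
    return point < -1.5

# Stage 2: Y vs Z -- only reached if stage 1 says "not X"
def is_y(point):
    return point < 1.5

def predict(point):
    if is_x(point):
        return "X"
    return "Y" if is_y(point) else "Z"

print([predict(p) for p in (-3.0, 0.0, 3.0)])  # ['X', 'Y', 'Z']
```

In a real version each stage would be a trained binary classifier, with stage 2 trained only on the non-X examples; the ordering question the comment raises is which class to peel off first.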
That's an interesting idea! I do think it adds a "fix" to the universal methods, but like you said, you now have to think about how to order the classifiers, which seems doable but requires some thought. In all honesty, universal methods are not often used, both because of issues like these and because you might have to build a lot of binary classifiers.
Where does categorical fit in?
I think I need to see a numerical or coding example to clearly understand the steps.
cool
Did you ever tumble when doing the intro?
not yet!