Asking the model how confident it is in the label (6:40) isn't necessarily a valid way to assess its confidence, unless the language component has been given access to the logits of the detection model. Even if it was trained multimodally, it doesn't necessarily know this either and may basically be repeating what it learned from image-text pairings about a person's confidence, not its own (or maybe they were able to elicit this capability with RLHF?). Is there any calibration work out there showing that large multimodal language models can assess their own confidence on images in a way that lines up with their actual accuracy?
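For what it's worth, here is a minimal sketch of the kind of calibration check I mean, assuming you can get the model to report a numeric confidence for each image and you know whether each label was correct. The function and variable names are hypothetical, not tied to any particular model or API; it just bins the stated confidences and compares each bin's average confidence to its empirical accuracy (expected calibration error).

import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # Bin predictions by self-reported confidence and compare each bin's
    # mean confidence to its accuracy; 0.0 means perfectly calibrated.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        if lo == 0.0:
            in_bin = confidences <= hi          # include 0.0 in the first bin
        else:
            in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        bin_acc = correct[in_bin].mean()        # how often it was actually right
        bin_conf = confidences[in_bin].mean()   # how confident it claimed to be
        ece += in_bin.mean() * abs(bin_acc - bin_conf)
    return ece

# Made-up example: stated confidences vs. whether the label was actually correct
stated = [0.9, 0.8, 0.95, 0.6, 0.7, 0.85]
right  = [1,   1,   0,    1,   0,   1]
print(expected_calibration_error(stated, right))

If the stated confidences track accuracy across bins, the model's verbal confidence is at least usable; if not, it's probably just parroting confidence language from its training data.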
Until AI can autonomously flatten its input-to-output structure, creating new classifications and agents, it is not learning. The issue with most probabilistic models is at the boundary of conflating space and time (kurtosis inference). The shortest path may not be useful for modeling reality if the model gradient is greater than 9 state-space parameters.
Thanks.
o1 can do all that, so long as it's trained on the right data sets. In other words, it can reason about its world.
You have a very isolated, linear point of view about intelligence.
In reality, people or beings cooperate with one another or with their environment, so where are your models for that?