What I would like to see is a model trained on a set of synthetic 3D scans generated from CAD models, paired with the corresponding CAD models themselves, to create an LLM for automated reverse engineering. I am sure that has to be possible.
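To make the idea concrete, here is a minimal sketch of how such paired training data could be generated, assuming trimesh for mesh loading/sampling and numpy for synthetic scanner noise; the directory layout, noise level, and the choice of raw file bytes as the target are all illustrative, not a real pipeline:

```python
# Illustrative sketch: build (synthetic scan, CAD model) training pairs.
import glob
import numpy as np
import trimesh

def synthetic_scan(mesh, n_points=4096, noise_std=0.002):
    """Sample a noisy point cloud from a mesh surface to mimic a 3D scan."""
    points, _ = trimesh.sample.sample_surface(mesh, n_points)
    # Gaussian jitter stands in for real scanner noise (made-up magnitude).
    return points + np.random.normal(scale=noise_std, size=points.shape)

pairs = []
for path in glob.glob("cad_models/*.stl"):  # hypothetical dataset location
    mesh = trimesh.load(path, force="mesh")
    scan = synthetic_scan(mesh)
    # Target is the CAD source itself (here just the raw file bytes);
    # a real setup would pair the scan with a parametric/B-rep
    # representation or CAD script that the model learns to emit.
    with open(path, "rb") as f:
        pairs.append((scan, f.read()))
```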
Interesting, but as long as the 3D perception is a jittery mess and fails to capture straight lines as straight lines, the perception is probably very far from ideal. I'm also dubious about strict bounding-box thinking. Not everything can be bounding-boxed, and boxes fail to properly grasp relevant aspects. For instance, if you box a person, how do you perceive the arm, the collar, a stain, a wrinkle, a posture, etc., plus temporal aspects? An unsupervised paradigm forming its own model seems to be the way to go.
Reminds me of DeepMind's RT-2. These vision-language models (with actions added for robotics in RT-2's case) might just be the next big thing!
Indeed!
Can such a model generalize to 3D reconstructions of images unseen in the training data?
Isn't developing other models with ChatGPT output against their TOS? 🤔
Seems like that clause only applies to models that compete with it. I assume something like this doesn't count as competition per se?