Robotics in the Age of Generative AI with Vincent Vanhoucke, Google DeepMind | NVIDIA GTC 2024
- Published: 7 Feb 2025
- Generative AI is taking automated common-sense reasoning, task planning, and perception to a new level. It is also revolutionizing synthetic data generation, human-computer interaction, and multimodal understanding.
Collectively, these are some of the key capabilities required for robots to understand our world and provide humanity with accessible, versatile physical assistance for day-to-day tasks. The key missing ingredient is for generative AI to also understand physical interaction.
In this NVIDIA GTC 2024 session, Vincent Vanhoucke of Google sketches a future in which embodied AI is a natural extension of the revolution that large multimodal models are ushering in, and explores the implications for the future of collaborative robotics and human-centered AI at large.
Speaker: Vincent Vanhoucke, Distinguished Scientist and Senior Director of Robotics, Google DeepMind
Explore more GTC 2024 sessions like this on NVIDIA On-Demand: nvda.ws/3U33qo7
Read and subscribe to the NVIDIA Technical Blog: nvda.ws/3XHae9F
Original GTC 2024 Session: Robotics in the Age of Generative AI [S61182]
#GTC24 #NVIDIA #GTC #AI #GenAI #GenerativeAI #Robotics #Simulation #SyntheticData #RobotPerception
We might be living in science fiction today, but it's still not an easy task to set the sound volume correctly. I believe this will be true in the age of ASI as well.
Great presentation!
Sure, I can crank the volume up by about 24 dB, but then when the commercials come on I would blow my ears out...
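For scale: decibels are logarithmic, so a +24 dB boost corresponds to roughly a 15.8x increase in amplitude (10^(24/20)), which is why a normally mastered commercial would suddenly be painfully loud. A minimal sketch of the conversion in Python (the function name is just illustrative):

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude ratio (20*log10 convention)."""
    return 10 ** (db / 20)

# +24 dB is roughly a 15.8x amplitude increase.
print(f"+24 dB ~= {db_to_amplitude_ratio(24):.1f}x amplitude")
```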
Vincent Vanhoucke really knows his stuff.
Google DeepMind is one of the best contributors to the advancement of AI. Keep it up. Love you.
They're profit-driven...
Just take GNOME as an example since it's, I think, the best illustration of that fact.
Their AI should first recommend that they fix the audio volume.
Highly promising developments! 😃
Thank you.
A new era
awesome talk
PaLM-E shows transfer learning across different robots. That's super interesting.
English subs are available, guys.
Congratulations
Whouhaooooo 😊
Phenomenal!
sound is low
I wonder why the robots were so interested in picking up the laptops. There has to be a way to determine that in the code, right?
Can barely hear anything in all of your recent GTC videos.
Turn the volume up :)
@Rnjeazy yeah, it's not my first day on the internet, bro. I also used a volume boost extension, and even with that I can barely hear stuff.
@drupatel5131 The thing is, my friend, that I can hear everything just fine; perhaps it's your Windows settings?
Volume has been so much worse on YouTube recently. It's not just this video, it's affecting everybody. What happened?
AI has a LONG way to go on audio. Maybe in another 100 years, when I'm dead.
1. The method isn't sound.
2. The robots displayed aren't functional; they need two hands and fingers.
3. 3D understanding should be extrapolated per task before testing. The robots in the video can barely do anything; a different method should be used. Their neural network obviously isn't workable.
Bad idea to use an LLM for planning. LLMs are open-ended: one can do whatever it wants, and it can also hallucinate. We don't want robots making decisions based on biased data. I would not use LLMs at the planning level. They are good for speech input, but the tokens derived from speech should only pick from predetermined, safe plans. We want to know what a robot is capable of and stick to the plan. The robot is the actor; it should follow the script, and the owner is the human. It's just fashionable to throw LLMs at everything; just because you can doesn't mean you should.
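To make the commenter's proposal concrete, here is a minimal sketch of restricting a language model to selecting among predetermined, vetted plans rather than authoring free-form actions. All names (SAFE_PLANS, llm_rank_candidates, choose_plan) are hypothetical, not from the talk or any real API:

```python
# Hypothetical plan library: each entry is a pre-vetted, hand-authored script.
SAFE_PLANS = {
    "fetch_water": ["go_to_kitchen", "pick_up_cup", "fill_cup", "return_to_user"],
    "tidy_desk": ["scan_desk", "pick_up_item", "place_in_bin"],
}

def llm_rank_candidates(utterance: str, candidates: list[str]) -> str:
    """Placeholder for an LLM call that is ONLY asked to choose among the
    given candidate plan names (e.g. via constrained decoding)."""
    # Stub: a naive keyword match stands in for the model here.
    for name in candidates:
        if any(word in utterance.lower() for word in name.split("_")):
            return name
    return ""

def choose_plan(utterance: str) -> list[str]:
    choice = llm_rank_candidates(utterance, list(SAFE_PLANS))
    if choice not in SAFE_PLANS:
        raise ValueError("No safe plan matches; ask the user to rephrase.")
    # The robot executes only the pre-vetted script, never LLM-authored steps.
    return SAFE_PLANS[choice]

print(choose_plan("could you tidy up my desk?"))  # ['scan_desk', 'pick_up_item', 'place_in_bin']
```

The key design choice is that the model's output space is closed: anything outside the whitelist is rejected rather than executed, which bounds what the robot can do regardless of hallucination.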
AI does not generate. It is just a mapping from A to B. And the third AI winter has started.
What do you mean?
@Copa20777 Just look up the AI winter. en.wikipedia.org/wiki/AI_winter