A Fork in the Road: Tesla LANE PREDICTION Uses SENTENCES?! John Emmons at AI Day 2

Dr. Know-it-all Knows it all

17K views

Published: Jan 13, 2025

Comments • 103

@mjr7991 2 years ago ⁺⁷
John thank you for taking us through this. Whenever I watch your videos I think about all the people who don’t even know this is happening and think Tesla is just another car company. Please keep us up to date on all your findings.
@christopherrubicam4474 2 years ago ⁺¹⁶
This leaves me in awe of toddlers learning to walk independently.
@jonathannumer5415 2 years ago
A trillion tiny mistakes and successes
@flwi 2 years ago ⁺¹⁰
Mind=blown! What an elegant solution. Thanks for explaining it and getting all excited about it :)
@JohnEAvenson 2 years ago ⁺²¹
Fantastic breakdown! Thank you very much for putting it into easy-to-understand words and segments.
@WarrenLacefield 2 years ago ⁺¹
Funny, this "auto-regressive" language model has actually been around a very long time. In the 1980s, I worked with a colleague, Dr. Earl Rankin, a very well-known professor in reading and the "father" of what was known then (and still is) as a "cloze test." He actually invented, and we used in R&D, the "Degrees of Reading Power" test. That measures reading ability on the same scale as is often used to measure textbook (or magazine, etc.) reading difficulty. That was handy for use with college students and courses (and professors). If the students do not have sufficient reading ability to at least match the difficulty of the textbook, then there will be troubles ahead.
@Alexander_S18 2 years ago
I watch a lot of AI Day videos about how Tesla works internally and how it is improving day by day. And I'm just impressed by how brilliant these people are, my respect.
@kimollivier 2 years ago ⁺³
I got the sentences idea immediately. What a breakthrough! I try to do this by hand using lex on half-structured legal text. Lex has a Start State and then I move on to different states (nodes) as I find the words I am looking for, just like a lane, but much simpler because I am just parsing a string of words, sometimes abbreviated or in a different order, to understand what the rambling lawyer was really talking about.
Lex and Yacc are the original Unix tools that are still used in many compilers to this day as the first step in parsing source code into tokens and finally machine instructions.
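The lex-style approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token names (`LOT`, `DP`, `NUMBER`, `WORD`), the patterns, and the sample text are hypothetical and do not come from the comment or from lex itself. The scanner turns text into tokens, and a small state machine starts in a start state and moves to another state when it finds a keyword it is looking for.

```python
import re

# Hypothetical token rules, tried in order (first match wins).
RULES = [
    ("LOT", re.compile(r"(?:lot|lt)\b", re.I)),
    ("DP", re.compile(r"(?:deposited plan|dp)\b", re.I)),
    ("NUMBER", re.compile(r"\d+")),
    ("WORD", re.compile(r"[A-Za-z]+")),   # any other word, ignored later
    ("SKIP", re.compile(r"[\s,.;]+")),    # whitespace and punctuation
]

def tokenize(text):
    """Scan text left to right and return a list of (kind, value) tokens."""
    pos, tokens = 0, []
    while pos < len(text):
        for kind, pattern in RULES:
            match = pattern.match(text, pos)
            if match:
                if kind not in ("SKIP", "WORD"):
                    tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            pos += 1  # unknown character: step over it
    return tokens

def parse(tokens):
    """Tiny state machine: a keyword moves us out of START, a number returns us."""
    state, result = "START", {}
    for kind, value in tokens:
        if state == "START" and kind in ("LOT", "DP"):
            state = kind
        elif state in ("LOT", "DP") and kind == "NUMBER":
            result[state.lower()] = int(value)
            state = "START"
    return result
```

Because the state machine keys on the keyword rather than on position, `parse(tokenize("Lt 12, Deposited Plan 3456"))` and `parse(tokenize("being DP 3456 lot 12"))` recover the same fields even though the wording and order differ.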
@davidwilkie9551 2 years ago ⁺¹
It makes sense that a sentence, composed of "spelling" superimposed functions to focus on a transverse transcendental picture, will synthesise the processing of images via linearised point-line-circle conic-cyclonic coherence-cohesion GD&P in pure-math relative-timing terms.
The vector-values reference-framing was very interesting and relevant. Thank you.
@BongoWongoOG 2 years ago ⁺³
Hi Dr K. What are your thoughts on using the language model for all tasks? Teach Optimus to think in sentences, rather than tasks? Covers movement, verb and noun relationships, actions and direction. I think it has huge implications.
@timwildauer5063 2 years ago ⁺¹
I think I know what they’re doing for the auto-regressive part at 15:20. They mentioned using multiple trips through an intersection to build up a full view of everything, so they have that top-down view of the entire intersection. They use that top-down view to label everything, and then go back to the individual passes for training. They can do that either in the real world or in their CG world, where they don’t actually need to build the intersection first.
@explor794 2 years ago ⁺¹
I watched the FSD video 10 times to try and understand something, lol. Your video was a perfect explanation of many things, thank you so much.
@talkingaboutdisruption9216 2 years ago ⁺¹
Thanks John. Now that you have explained the use of language to describe the lanes, it makes a lot of sense.
@ernietam6202 2 years ago
Thanks a million. Your explanation helps me to understand a lot more.
@budgetaudiophilelife-long5461 2 years ago
🤗👍😎THANKS JOHN, FOR SHARING YOUR “TREAT” with us 💚💚💚
@richiehart7858 2 years ago
Great video! You've illuminated the depth of what Tesla is doing with one aspect of FSD, and by extension all of FSD, which is only hinted at in the AI day presentation to those of us not competent with this type of technology.
@chriscrosby3582 2 years ago ⁺¹
Very helpful! Thanks for breaking it down for us!
@JohnBrown-pw3bz 2 years ago
Wow, the description of an intersection with multiple paths that connect the lanes sounds like G-Code used by CNC routers to cut out a random shape in a piece of wood.
For example, you call out a circle and then you give the radius, and then you could have the router follow the circumference and end at some point to transition into a straight line.
@kstaxman2 2 years ago ⁺¹
This video will require some revisits. Lots to digest here but it's an example of how far ahead Tesla is on the FSD problem. I would bet that no other driving program is using a language model to help decode a vision data stream. Amazing use of different analysis systems to parse visual data. Well got to let this marinate while I grab something for my headache... LOL can't wait to hear James discuss this development. So many of the complex problems FSD presents will be solved using this new method. And it's interesting to hear them bring into the discussion Tesla bot. This solution could mean even more for the bot than for FSD. The ability of Tesla to integrate so many different fields of study into solving a problem is so unique. Most just don't see how that gives Tesla a massive advantage.
@MrDuncanBooth 2 years ago
Thanks John awesome
@johntrotter8678 2 years ago ⁺¹
Thanks for more stuff I don't understand. It's why I tune in!
@305dreamhonda 2 years ago
You are a great teacher, TY. Language models were the first whiff of AGI, being able to expand to fsd and Optimus further solidifies the evolution. So much creativity and state of the art comprehension have to merge in order for breakthroughs to emerge and I think this is a fine example. Perhaps this is the essence of the Elon ecosystem.
@mikebailey2970 2 years ago
Well done!
@jocehockings4192 2 years ago
Great breakdown, thanks
@ResolUloseR 2 years ago ⁺¹
Said it before...Elon needs to become the HIGHWAY/Interstate Czar in the United States...and redesign all the roads, signs and road markers to be more compatible with self-driving vehicles
@jimbobkentucky 2 years ago
Seoul, Korea has all kinds of crazy intersections like in this video. They really should do some neural net training here.
@danielcarlson8386 2 years ago ⁺³
Watching AI Day 2 was really cool even though I didn't understand most things. You helped me understand tons, thanks. You deserve 🥓🥓🥓🥓🥓 tonight on home-grown BLTs.
@davidnantel9438 2 years ago
Great video!
@GoatDirt 2 years ago
I initially thought this was going to be a kind of Lindenmayer system, since it's an efficient output method to the planner. Since only intermittent nodes would require storage, those vectors can be very powerful and tiny.
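For readers unfamiliar with the term, a Lindenmayer system (L-system) is a string-rewriting scheme in which every symbol is replaced in parallel according to a fixed rule table. The sketch below shows the idea using Lindenmayer's classic "algae" rules; it is not a description of what Tesla does, and the function name `expand` is an illustrative choice.

```python
def expand(axiom, rules, iterations):
    """Rewrite every symbol of the string in parallel, `iterations` times.

    Symbols without a rule are copied through unchanged.
    """
    current = axiom
    for _ in range(iterations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current

# Lindenmayer's algae example: A -> AB, B -> A
ALGAE_RULES = {"A": "AB", "B": "A"}

for n in range(5):
    print(n, expand("A", ALGAE_RULES, n))
```

The appeal the comment points at is compactness: a short axiom plus a small rule table can stand in for a much longer structure.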
@skipugh 2 years ago
Another great job. I now understand why FSD Beta frequently gets confused as it moves to a left or right turn only lane when it needs to go straight. I don’t see any tokens / descriptors for these lanes (left only, left or straight, left or straight or right, right or straight, right only). I’m not sure if the cameras are currently identifying lane markers painted on the road.
@charleskeller4288 2 years ago ⁺¹
14:30 I thought they were using mono color in FSD, not RGB… I don’t recall where I got that idea from though. Great overall explanations, fleshed out!!
@GoatDirt 2 years ago
They are using YCrCb color space, not RGB, iirc
@charleskeller4288 2 years ago
@@GoatDirt the answer seems to be at 01:15:00 of the AI Day video. Phil talks about photon counts and not RGB, but I have trouble comprehending his accent here when he breaks it down.
@GoatDirt 2 years ago
@@charleskeller4288 That timestamp is beyond the end. Can you post again? (Looks like it doesn't link to the original and is using this video's time.) I'll check it out.
@charleskeller4288 2 years ago
@@GoatDirt I am not fluent in the YouTube commenting interface... see the video at ruclips.net/video/ODSJsviD_SU/видео.html
@GoatDirt 2 years ago
@@charleskeller4288 No worries, thanks, you rock! When converting a link, that last "4s" is where they put in the timestamp in seconds (4500s in this case).
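The 4500s figure is just the 01:15:00 timestamp expressed in seconds (1 × 3600 + 15 × 60 + 0). A small helper makes the conversion explicit; the function name is an illustrative choice, not part of any YouTube tooling.

```python
def timestamp_to_seconds(stamp):
    """Convert "hh:mm:ss" or "mm:ss" into a total number of seconds."""
    seconds = 0
    for part in stamp.split(":"):
        # Each further field is worth 1/60 of the one before it.
        seconds = seconds * 60 + int(part)
    return seconds

print(timestamp_to_seconds("01:15:00"))  # 4500
```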
@cybertrk 2 years ago
AGI is all language defined in NN’s that interact with each other
@edvardmunch6344 2 years ago
What a contrast between your breakdown of AI Day going very deep into the approach of Tesla (which is somewhere between academia, research lab and real world testing) and other youtubers such as Thunderf00t dunking on Elon and Tesla saying stuff like "Oh this robot is impossible to make but it has already been done by Boston Dynamics"
Please continue your commentary on these fascinating topics at the edge of AI computing. Thx
@johnmonroe7749 2 years ago
What I don't see in the lane choices are U-turns. I have never seen an FSD test that includes an uncontrolled U-turn, meaning there is no designated lane or signal. Does FSD handle them yet?
@rb8049 2 years ago ⁺¹
The latest FSD had my car swerving back and forth with a lane merge. The latest releases are worse than 9 months ago. Hopefully they can fix this, but they are releasing FSD where driving has become substantially worse. Maybe it is becoming better for these corner cases of complex intersections, but it now wants to run into stoplight poles when I’m driving.
@MsAjax409 2 years ago
You will notice in the presentation that lane prediction and path planning at an intersection requires coarse map information about the lane count and topology at the intersection. If this map information is wrong, as it often is, the system performance degrades. This is why some experience excellent FSD performance while others experience poor performance.
@r.a.monigold9789 2 years ago
Using the data and methods presented - I was able to predict how this video would conclude. I was off by only one lane - not bad for a mere human...
Thanks for the video - it's a rewatch must.
@glenbuckner8244 2 years ago ⁺¹
Thanks for breaking this down! He spoke way too fast and was using jargon I didn’t understand. You really helped me understand how cool this is. Originally, when I watched the livestream, I thought it sounded complicated and Tesla is doing it…so it must be cool.
@wreckinball11 2 years ago ⁺¹
I wonder how many times FSD was on the verge of being impossible only to find a way to continue? Hopefully it’s almost solved so the road doesn’t come to a dead end.
@helper734 2 years ago
Hi
I have full self driving but still cannot use it in Toronto Ontario Canada because it's geo fenced due to our streetcars. Do you have any idea if Tesla will solve this problem?
Thanks
@joby6462 2 years ago
John - Thank you. One question… when leaving a freeway on an exit ramp, I need to make a left turn and often have 3 turn choices: left, left or right and right. To make a smooth left turn, ideally, I need to be in the middle lane. The far left lane will immediately turn into a lane that will return me to the freeway in the opposite direction, which I don’t want.
FSD always goes to the far left lane and, after turning, can’t easily get to the thru lane I want. Will what you’re discussing help solve this lane selection problem?
@MsAjax409 2 years ago ⁺¹
I believe what's needed is closer integration of the route planner and the drive path planner. That is, the lane end node at an intersection has to be selected by looking at the route data. This is currently not done.
@joby6462 2 years ago
@@MsAjax409 thx. Do you know if what was presented in this video will create this integration?
@MsAjax409 2 years ago ⁺¹
@@joby6462 It wasn't mentioned to my knowledge, but it will have to be done at some point. We human drivers will select a lane based on the route we are following. Navigation software will provide hints as to which lane to select by enunciating two maneuvers at a time, for example "Turn left on Veteran's Pkwy and then right on Market Drive". An instruction like that tells us to use the rightmost left turn lane to put us into the right lane for making an immediate right turn. The route information can also be used to trigger a turn signal before a maneuver into a turn lane is completed. Today, FSD Beta signals after the car has moved into the turn lane to the irritation of someone following.
@joby6462 2 years ago
@@MsAjax409 thx for your reply! One observation: my turn signal goes on and FSD immediately starts to move in that direction, almost always later than I would like. This causes me to go over the solid white lines that outline the turn lane, technically a violation in my state. One unrelated question. FSD continually races to red lights and then hard brakes to a stop. It clearly must see the red, but is determined to keep at the speed limit instead of decelerating earlier and coasting. It’s like a roller coaster sensation. Do you see this too?
@MsAjax409 2 years ago ⁺¹
@@joby6462 Same for me. Speed control needs massive improvement. I'm constantly adjusting the speed to fit the situation at hand. This will never do for a fully autonomous vehicle.
@yepyep266 2 years ago
All this combination of absolutely brilliant technologies working together creating maybe the most complex ai model in use today, yet fsd only performs a handful of miles per disengagement! Shows the madness of this project.
@markbrowning9363 2 years ago
Will Tesla's Semi vision system be able to detect trailer/bridge height & prevent the Semi/trailer from hitting a low bridge?
@andrasbiro3007 2 years ago
The occupancy network can do it easily. It just has to know the height of the trailer, which isn't hard as containers are standard size. With oversized loads likely humans will keep driving for a while.
@WhodatIzz 2 years ago
I need path language in my own life.. i feel like i'm constantly mis-predicting someone else's walking path or my brain purposely wants to step across their path so we end up doing this awkward dance to get around each other.
@Robert...Schrey 2 years ago
How about this: you get in your car and activate a learning mode. Then you drive to your preferred destination. After that, the car can do it on its own.
@lmcclymont 2 years ago
This doesn’t scale very well, and the route to your preferred destination will change. It would be good if it could recognise a couple of routes and solidify them with your driving, say work and home etc., as we probably drive the same routes a massive percentage of the time. Training also takes a very long time and a lot of compute power (Dojo), which the car’s computer can’t handle.
@Robert...Schrey 2 years ago
@@lmcclymont learning specific routes, that's what I mean.
@barryyoung3861 2 years ago ⁺¹
As usual, you’ve allowed me to BEGIN to at least appreciate the geniuses behind this presentation! Your input is so incredibly important and necessary for we mortals to grasp this new field. I appreciate that you only wished you could exist in their world so you could absorb and learn to better understand their thinking. As I said once before, I wish they’d hire you to be “the fly on the wall”. Nice work!
@rb8049 2 years ago
The real test is my local freeway entrance and exit. It changes lanes rapidly before and after the stoplight, can’t turn left with a cement wall ahead and the freeway exit is worse; it wants to drive into the stoplight pole. In the last 6 months every release of FSD has become worse. I’m losing confidence 😢 they can pull this off using the current approach.
@swissTom124 2 years ago ⁺¹
Do you have FSD Beta?
@rb8049 2 years ago
@@swissTom124 Yes - But not CA. Just waiting for the release which can do a drive without intervention to prevent a crash. Not asking for these complex intersections or unprotected left turns which should be much more difficult.
@davidpearn5925 2 years ago
I just want to know why basic AP in Australia hasn’t made any advances in fixing the problems of shaving parked cars and over dimension oncoming lowloaders etc or of panic braking for irrelevant turning traffic, pedestrians and cyclists etc let alone pothole avoidance…….all of which was due shortly 3+ years ago.
Even the basics of speeding now being possible on AP when entering a reduced speed limit area.
It seems to be simply a geofenced system that still ‘invents’ town limits in highway areas occasionally .
@MsAjax409 2 years ago
Is the coarse map data taken from Google, or MapBox? It's been said that the route planner is not Google's, but Valhalla. In either case, if the coarse map data has errors, the lane prediction trajectory planning function fails. This is what FSD Beta testers are finding in parts of the country outside of California where map data can be years out of date. It's a serious problem that Tesla will have to solve.
@dewittbo 2 years ago ⁺¹
While I don't really understand how this all works (even with your excellent explanations, John), what I do understand is that it is unbelievably complex in terms of processing data on the fly. How it can do it almost instantaneously while one is driving seems like witchcraft. The fact that all of those steps and all that processing are happening while I'm driving my car up to and through an intersection, and doing so continuously in real time, is just mind-boggling to me. I have been using FSD Beta for a few weeks now around town and while the system still makes some scary mistakes and has some close calls (so much so that my wife refuses to ride in my car if I use FSD Beta), I am in awe of the system and enjoy testing it out, but I don't see it being ready for general release by the end of the year, unless some fairly significant breakthroughs are made by the teams.
@explor794 2 years ago
It never occurred to me that the world was the thing that was moving
@larsnystrom6698 2 years ago ⁺¹
In my view, the lane language is needed because the AI technique isn't enough for creating the concepts needed for FSD.
It's sneaking ordinary human programming into AI learning!
This is actually, in my view, something which will become the main problem for teaching Optimus different tasks.
It obviously isn't enough to have just deep neural networks and a lot of data for creating the concepts needed to be as good as we want them to be.
This probably means that Optimus can't learn enough from just a lot of data. It needs help, i.e., manual intervention, from an AI team for each new task.
@Kitsisuri 2 years ago
unless we automate the development of these contextual languages somehow
@RandomGuyOnYoutube601 2 years ago
I would argue that "rat sat in a hat" is just as possible
@harry-eto 2 years ago
2:26 he sounds a little bit like Stephen Hawking used to.
@jull1234 2 years ago
dense dense revolution.
@CruiseWeek 2 years ago
so.. we have given up on Elon's want of driving without using maps?
@kimollivier 2 years ago
Everyone needs a map to navigate, including humans. Self driving must have a purpose to go somewhere. I see your point though. It is easier for a human to navigate through one of those intersections using Google Maps, so why not start with a bit of help just like we all do now. We always used to have paper maps and had to pull over to study them.
@nickmcconnell1291 2 years ago
How about Teslabot being used to charge Teslas that are either driving themselves home after being serviced OR for cleaning and charging Tesla’s robo-taxi fleet cars?
The reason I brought this up is because I wondered how Tesla could possibly instigate a robo-taxi cleaning and charging service fast enough if they bring a robo-taxi network up in the next five years. It would normally take years to establish sites and hire enough people to do this. They would have needed to already have started building this in preparation. However if Tesla can make their own workers…..???
I am now thinking that Teslabot will need to come first before Tesla will run its own nationwide robo-taxis….. they may run some in select markets.
In the meanwhile, owners of Teslas, willing to service their own cars, can get into the robo-taxi network whole hog.
Once Teslabot is mass-produced, however, Tesla will expand their own taxi fleet worldwide. Think of the labor cost savings! Tesla might reduce fares to 10c per mile and still make huge profits!
@wallykramer7566 2 years ago ⁺¹
I guess I don't see why all the raves about language. Isn't this exactly like a nodes and connections graph? What is the difference?
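One way to look at this question: the information content can indeed be the same as a nodes-and-connections graph, and what changes is the representation. Written as a flat sequence of tokens, the graph can be produced one token at a time by an auto-regressive model, the way a language model produces a sentence. The sketch below only shows that a graph and a "sentence" can carry the same information; the token names (`NODE`, `LINK`, `END`), the coordinates, and the node labels are hypothetical and are not Tesla's actual lane language.

```python
def graph_to_sentence(nodes, edges):
    """Serialize a directed lane graph into a flat list of string tokens."""
    tokens = []
    for name, (x, y) in nodes.items():
        tokens += ["NODE", name, str(x), str(y)]
    for source, target in edges:
        tokens += ["LINK", source, target]
    tokens.append("END")
    return tokens

def sentence_to_graph(tokens):
    """Rebuild the nodes and edges by reading the tokens left to right."""
    nodes, edges, i = {}, [], 0
    while tokens[i] != "END":
        if tokens[i] == "NODE":
            nodes[tokens[i + 1]] = (int(tokens[i + 2]), int(tokens[i + 3]))
            i += 4
        elif tokens[i] == "LINK":
            edges.append((tokens[i + 1], tokens[i + 2]))
            i += 3
        else:
            raise ValueError(f"unexpected token: {tokens[i]}")
    return nodes, edges

# A hypothetical fork: one entry lane that splits into two exits.
nodes = {"start": (0, 0), "fork": (0, 10), "left": (-5, 20), "right": (5, 20)}
edges = [("start", "fork"), ("fork", "left"), ("fork", "right")]

sentence = graph_to_sentence(nodes, edges)
print(" ".join(sentence))
```

Round-tripping through `sentence_to_graph(sentence)` returns the original nodes and edges, which is the sense in which the two views are equivalent.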
@OlMossBack 2 years ago
Hmm maybe you’re on track to explain why DWA (driving while Asian) is a real phenomenon
@jcjensenllc 2 years ago
The whole point of this video was to self-aggrandize
@noleftturns 2 years ago
Holographic memory - I worked on that 20 years ago, but other things got in the way, and I never got around to really getting into it.
What's missing from AI is holographic memory, like our human brains use.
A holographic image allows you to break off a small corner of the image, and you get a crude image of the whole image. Holographic memory would do the same: instead of looking at a huge chunk of memory, you just look at any smaller part of it to see if it generally contains what you need. If so, you look at more of the memory, with no need to ever look at all of it unless you require incredibly accurate information.
Anyway, at some point, AI will grind to a halt until they invent holographic memory as we have in our brains.
@andrasbiro3007 2 years ago
I highly doubt it works that way.
@yepyep266 2 years ago
Well, you can’t cut a piece of information without looking at it first and concluding it’s not important. However, I agree that not everything it sees should be given as much memory. Instead, I hope they do something like what Meta is working on for their next VR headset. They plan on focusing more compute power where the eyes are looking to enhance the resolution of that part. In the same way, the car could enhance its voxel resolution where more critical action is happening.
@vvattup 2 years ago
🤫
@necbranduc 2 years ago ⁺³
4:10 - I'm sorry Dr. Know-it-all with a master's in Artificial Intelligence, but the fact that you're saying that "Regnet's are more flexible than classic ImageNet" tells me that you haven't even read and understood the article you're quoting and displaying on the video and moreover, you don't even have the necessary background knowledge that ImageNet isn't a neural network, but an image database (dataset) used for benchmarking ML models (neural networks or otherwise).
Here's a fun thing you probably didn't know, but Andrej Karpathy was the human benchmark for ImageNet, after participating in one competition many years ago. Here's a video of him in talks with another great pioneer of ML (Andrew Ng), talking about how he became that human benchmark: ruclips.net/video/_au3yw46lcg/видео.html
@chickenp7038 2 years ago ⁺¹
i know right!!! i can’t watch his videos because he gets so many things so wrong it’s embarrassing.
@rioriggs3568 2 years ago
I'm curious, is Tesla revealing important secrets here or is this all common knowledge among the AI / ADAS community?
@chickenp7038 2 years ago ⁺²
@@rioriggs3568 so typically neural nets which are used in the real world consist of multiple different neural network building blocks. pretty much all of Tesla's building blocks are known to the ML community. the only interesting part is how Tesla plugs one block into another.
@nabormendonca5742 2 years ago
Yeah. And here you’re both commenting on his videos. 😂
@rioriggs3568 2 years ago
@@chickenp7038 Could Ford easily do this tomorrow morning? Are machine learning models now mainstream knowledge, or reserved for a few whiz kids?
@ibrahimozdenoglu8332 2 years ago
It is time to buyback Tesla Stocks.
@jambay4785 2 years ago
You can delete this: The grid graph had about 40 reference points (the white ones) and 4 of "interest" for intended action (right turn). Ignoring all the other possible options. To quote you "the cat sat in the hat", I'll add it shook its ass and licked it after that.
@NicholasShanks 2 years ago
I agree with Jambay, please delete his comment 🙂
@jambay4785 2 years ago
@@NicholasShanks thanks for the morning laugh.😄
@chickenp7038 2 years ago ⁺³
you clearly don’t know much about neural networks as ImageNet is a dataset and not a model architecture
@chickenp7038 2 years ago ⁺³
furthermore you say that image captioning is assigning a label to an image, which is wrong. it is creating a text which fits the given image.
@nabormendonca5742 2 years ago
A label could be anything, including an abstract description of the image’s contents. Funny that when trying to sound smarter than him you end up sounding dumber. 😏
@chickenp7038 2 years ago ⁺¹
@@nabormendonca5742 well but if you listen to how he describes it you would notice that he means exactly what i am describing
@GB-ob5zx 2 years ago ⁺²
You took up a front row seat instead of letting the young applicants be up front. Shame🤣
@skinnymoonbob 2 years ago
These things make me highly confident that Tesla will solve autonomy. 🫶🏼