REVEALED: What Tesla FSD OCCUPANCY NETWORKS Are--And How They'll Soon be MUCH BETTER! AI Day 2

  • Published: 12 Jan 2025

Comments • 172

  • @BongoWongoOG
    @BongoWongoOG 2 years ago +8

    Mimicking how the fovea sees: close-up detail is handled by the part of the eye with a denser packing of photoreceptors (higher detail) for close-up work, while farther away the outer retina is used, which has fewer receptors but still registers the photons from objects. By varying the voxel sizes, they're effectively creating a digital 'fovea' and reducing the compute necessary based on distance. This is a great solution IMHO. I'd love to work on this :)
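
    As an illustration of this "digital fovea" idea, here is a minimal sketch (hypothetical numbers, not Tesla's implementation) of voxel edge length growing with distance from the ego vehicle:

    ```python
    import numpy as np

    def voxel_edge_length(distance_m, base=0.1, growth_per_m=0.01, max_edge=1.0):
        """Voxel edge length (m) that coarsens with range from the ego vehicle.

        base:          edge length right next to the car (hypothetical 10 cm)
        growth_per_m:  how quickly cells coarsen with distance
        max_edge:      cap so far-field cells never exceed this size
        """
        return np.minimum(base + growth_per_m * np.asarray(distance_m), max_edge)

    # Cells needed to tile a 100 m strip, sampled 1 m at a time:
    distances = np.arange(0, 100, 1.0)
    uniform_cells = 100 / 0.1                            # fine grid everywhere: 1000
    foveated_cells = np.sum(1.0 / voxel_edge_length(distances))
    print(round(uniform_cells), round(foveated_cells))   # ~1000 vs ~245 cells
    ```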

  • @peterprocopio2192
    @peterprocopio2192 2 years ago +25

    Great video! Thank you for stopping the presentation and explaining the important and technical aspects. You did a great job for us non-engineers. I am so impressed by the Tesla team's ability to engineer new solutions so quickly. I don't believe other car companies will be able to compete with the FSD technology. This also gives me confidence that Tesla will solve FSD before anyone else and introduce a robotaxi service before anyone else. When is still the question, but I am more confident that they will.

  • @RjDavis000
    @RjDavis000 2 years ago +4

    Great video, John. I would have been in the dark for a while on the AI Day content without this.

  • @RajIndia31415
    @RajIndia31415 2 years ago +54

    Thanks for explaining it so well. I think you are the only one doing it for a technical audience while keeping it approachable.

  • @barnabydunning5424
    @barnabydunning5424 2 years ago +2

    You’re a fantastic communicator! Thanks for this vid.

  • @keitho9508
    @keitho9508 2 years ago +5

    Hey John, please keep this up, you are helping so many people understand what our great company is doing.

  • @mb345
    @mb345 2 years ago +5

    Really like this format using OBS. Your overlay on the video with pauses for explanation was great. If you had a pen tablet and some OBS add-on for screen markup, then you could annotate, draw attention to important parts of the screen, or switch over to a whiteboard view and explain a concept.

  • @loweryjk
    @loweryjk 2 years ago +3

    I've been waiting for news on applying variable resolution based on distance to the occupancy network. Thanks so much for finding and sharing this awesome information!

  • @williamjulien5858
    @williamjulien5858 2 years ago +2

    I live in a high-rise and park in a large parking deck. For fun I often experiment/play with Beta in the parking deck. It had always gone straight at solid concrete walls (or pillars that stick out) until 10.69. It will now turn and not head straight at them, 100% of the time.

  • @kunletim8464
    @kunletim8464 2 years ago +4

    Hey John, you are a great teacher; I'm sure your students always enjoy your classes😊
    Thanks for the beautiful explanation.

  • @WilliamQ20
    @WilliamQ20 2 years ago +22

    WOW John, what an amazing explanation of what is happening at Tesla. I felt since last Friday that they were on a good track, but after this video I am convinced that they are on the right track. What a pity it is that so many people do not get it and start complaining about the removal of those 12 stupid sensors. Thank you for explaining all this in human language.

    • @giovannidegeronimo8941
      @giovannidegeronimo8941 2 years ago +2

      I just don't understand how the occupancy network can detect things that the cameras can't see. For example, let's say the car is off, and I put a cone in front of it that is outside the field of view of the camera. It has no way of seeing the cone. How can that ever substitute for ultrasonic sensors without adding another camera at the front of the car that sees right in front of the bumper?

    • @thoughtfulrebel2246
      @thoughtfulrebel2246 2 years ago

      @@giovannidegeronimo8941 I agree. I wish someone could provide reasoning as to how they will deal with this. Maybe the cameras are recording even when the car is off, to see if something goes into the 'blind spot' in front of the car, and then it will not move until the cameras see it again outside of the blind spot?

  • @jkev1425
    @jkev1425 2 years ago +1

    This is why I initially subscribed to your channel 👍👍👍

  • @jbarvideo12
    @jbarvideo12 2 years ago +1

    Dr. KIA, thanks for stopping many times during the Tesla discussion and explaining in understandable terms (and communicating well to us non-gamers) what is happening. It is amazing to understand what is being accomplished in refining FSD.

  • @explor794
    @explor794 2 years ago +1

    Perfect analogies and simple explanations for complicated things. Excellent work.

  • @Martinit0
    @Martinit0 2 years ago +5

    Excellent summary and explainer. I remember voxels from the '90s helicopter flight simulator game Comanche.
    What surprised me here is that the Tesla voxels are not on a fixed grid, and they are not full (or empty) voxels; there are many partially filled voxels. So there is a complexity to each individual voxel.

  • @FredPauling
    @FredPauling 2 years ago +1

    It's mind-blowing that this can be done at 100 Hz. Will be following this topic very closely.

  • @CorwynGC
    @CorwynGC 2 years ago +1

    I think there is a misunderstanding about the various 'spaces', at least as mathematicians would describe them. "Raster space" is a 2D space based on a plane being viewed (like a video screen); every spot on the plane has a value. "Voxel space" is a division of a 3D space into small box shapes; every box has a value. "Vector space" is a representation of items using (basically) arrows pointing to the items in 2D, 3D, or even higher-dimensional volumes; fewer items means a smaller vector space. An oscilloscope works in vector space, with the beam being moved directly to the spot of interest rather than scanning back and forth over the whole screen.
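
    A toy illustration of those three representations (shapes and values chosen arbitrarily):

    ```python
    import numpy as np

    # Raster space: a dense 2-D grid, one value per location on the viewed plane.
    raster = np.zeros((480, 640))        # e.g. brightness per pixel

    # Voxel space: a dense 3-D grid, one value per small box of volume.
    voxels = np.zeros((200, 200, 16))    # e.g. occupancy probability per cell

    # Vector space (in the sense used above): a sparse list of items, each an
    # "arrow" to a location plus attributes; memory scales with the item count.
    vectors = [
        {"position": (12.3, -4.1, 0.0), "kind": "vehicle"},
        {"position": (30.0,  2.5, 0.0), "kind": "pedestrian"},
    ]
    ```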

  • @firescratcher.
    @firescratcher. 2 years ago +2

    Finally, a well-prepared video. It's been quite a few videos since John has prepped before presenting, e.g., yesterday when he tried to bring up an image of a bumper with a sensor and it failed. How hard would it have been to have that picture ready? This one was good!

  • @jessiejanson1528
    @jessiejanson1528 2 years ago +1

    At roughly 30:00...
    They only have so much they can process at once, so for city driving they ignore things above a certain height and focus more in front of them and to the sides, with a bit behind them. But for highway driving the sides almost don't matter, and the back also doesn't matter while at speed; they can focus much further ahead while driving, and while stopped they could focus more on what's behind them. I think it is probably relatively easy for them to switch from a voxel highway front-to-rear model to a voxel city model as needed.
    I'm not sure LODs would be possible, but given that it's a highway they might simply prefer to focus on the lane itself and what's in that lane, as well as what's to the sides if it's visible. Given the limited space they need to search, their normal approach seems better suited for tracking far-off objects, while the voxels are more suited for 'what's around me that I might hit'.

  • @robstewart2213
    @robstewart2213 2 years ago +3

    Loved this explanation. I guess I'm a geek too. Much appreciated your summary. Keep up the excellent work! Impressive what the team at Tesla has accomplished to date.

  • @richardgoldsmith7278
    @richardgoldsmith7278 2 years ago +1

    I loved the explanation of the variable size voxel map - implications are enormous and dawning on me like a wave …

  • @SoleLo
    @SoleLo 2 years ago +1

    Great video! Thank you for breaking it all the way down. Although I understood some of it, you really allowed me to understand more of the details.

  • @Mark-kt5mh
    @Mark-kt5mh 2 years ago

    Hey @19:39, that clip is indeed from New York! The chimney is there because of a district steam leak. The articulated bus has the black and yellow New York license plate.

  • @mhfs61
    @mhfs61 2 years ago +2

    Thank you for a great lecture, John.

  • @DanyPell
    @DanyPell 2 years ago +2

    So excited, can't wait for the next massive upgrades and AI Day 3!!

  • @northernouthouse
    @northernouthouse 2 years ago +1

    Great video. So, my interpretation is that Tesla used 8 cameras to create its own version of lidar without using a lidar sensor.

  • @bootiemacarthur9182
    @bootiemacarthur9182 2 years ago

    Rectifying speed instantaneously is amazing to me. During training as a railroad engineer (over 40 years ago) we made recorded trips with motion, sound, speed, etc., on a wrap-around screen in front of the engine (visual progress was regulated to train speed).

  • @andyhamilton5926
    @andyhamilton5926 2 years ago +1

    One thing I've not seen much discussion of is how Tesla determines the right speed for the vehicle if there's nothing in front. The posted "maximum" is not always "safe" due to corners, weather, and elevation changes. In the UK, our roads are narrower and twistier, and I usually find Autopilot going TOO FAST even for slight bends on big roads, resulting in centrifugal force making the car drift to the outside of a not-so-gentle turn. Ideally the vehicle would slow down or ANTICIPATE the outward drift and apply steering input early. Can you discuss this please? Thanks! Andy

  • @notspm9157
    @notspm9157 2 years ago

    13:48 MAJOR mistake. That is not YUV space; raw sensor data is not in YUV space! You can't capture YUV with an image sensor. I want to make that clear.
    What IS being sent is the raw camera sensor data for each colour in RGB, where there are two pixels representing green (typical sensors have two green photosites for every red/blue one, because the pattern needs to be square and we are more sensitive to green).
    The post-processing is where you convert the RGB data into YUV space. So yes, they are now working in RGB space, whereas previously they were likely in YUV space; pretty much the reverse of what you said.
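
    For reference, the usual direction of that pipeline is raw Bayer RGB samples off the sensor, with YUV derived from RGB afterwards; a generic sketch using the common BT.601 coefficients (not Tesla's actual pipeline):

    ```python
    import numpy as np

    # A 2x2 tile of a typical Bayer colour filter array: two green photosites
    # for every red and blue one, since our eyes are most sensitive to green.
    #   R G
    #   G B

    def rgb_to_yuv(rgb):
        """Convert an HxWx3 float RGB image to YUV using BT.601 coefficients."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b   # luma
        u = 0.492 * (b - y)                      # blue-difference chroma
        v = 0.877 * (r - y)                      # red-difference chroma
        return np.stack([y, u, v], axis=-1)
    ```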

  • @stevedowler2366
    @stevedowler2366 2 years ago +1

    That one is very fascinating. It peels back a lot of the unknown layers of how Tesla engineering builds a 3D recognizable image from raw pixel data. Thanks much, Dr. KIA (wait, that reads a little scary in military acronyms).

  • @oisiaa
    @oisiaa 2 years ago +1

    It's amazing how much they are able to do with the HW3 CPUs since none of this tech was invented when that hardware came out.

  • @hhal9000
    @hhal9000 2 years ago +8

    Great explanation as usual. This is really exciting stuff, and it looks like it could follow a similar trajectory to the history of 3D rendering, particularly the use of game engines to do real-time rendering. If they get enough compute power to get the granular detail down to 1 mm voxels at very close range, just imagine the accuracy of the occupancy space.

  • @gaborszollosy2153
    @gaborszollosy2153 2 years ago +1

    This episode was rather one for computer NeRFs :). OK, joke aside: great explanation. I like this format where we deep-dive into shorter parts of the extra-long AI Day.

  • @RandomGuyOnYoutube601
    @RandomGuyOnYoutube601 2 years ago

    To me it looks like brown objects are garage doors or gates. It is very impressive that the car can actually discern them.

  • @alanrickett2537
    @alanrickett2537 2 years ago +1

    So, asking this question again with this view: if you have a high-definition radar that can sense in blocks of the same size that vision is using at that distance, and you just lay the two over each other, then you would not have sensor-clashing issues. Is this a workable option?

  • @ashsilverwizard3275
    @ashsilverwizard3275 2 years ago +1

    The smallest LOD would probably vary with speed as well.

  • @BboySnake71
    @BboySnake71 2 years ago +1

    Another amazing video. Great job. I now ACTUALLY understand what is going on: why the late, hard braking on the highway at high speeds, and why they will be able to get rid of the ultrasonic sensors.

  • @miiihaaas
    @miiihaaas 2 years ago

    17:29 - interesting (off-topic) moment: you covered the mic with your hand and the only sound that could be heard was echo... :)
    I wonder how Tesla's AI networks could use sound in their navigation system.

  • @gdaddy9410
    @gdaddy9410 2 years ago

    John - fantastic video!!

  • @cgamiga
    @cgamiga 2 years ago +1

    Great explanations, thanks! My concern re dropping the ultrasonics is... lack of close-up camera placement.
    The AP cameras are fine for street/traffic vision, but Tesla can't do a 360° bird's-eye parking view:
    there are no cameras to provide full near-field perception, especially in front of the bumper, looking down by the front wheels, etc. The rear seems a bit better, with the fish-eye backup cam and the rear side repeaters (they can see wheels/curb at least), but I'm not sure the A-pillar cams see that far down? Certainly not in front of the hood.

  • @sandmehlig
    @sandmehlig 2 years ago +4

    Critical objects far away appear smaller; voxel size should not increase with distance but with invariance/insignificance. Even working only with outer boundaries, especially for near objects, should decrease complexity while keeping functionality.

  • @ericchild3363
    @ericchild3363 2 years ago

    Excellent John, thank you

  • @cmw3737
    @cmw3737 2 years ago +1

    Higher resolution of voxel space will really matter to Teslabot, which needs to focus on what is in its hands in order to manipulate it, while not needing to know much about its surroundings so long as it keeps a distance, unless it needs to interact.

  • @billl1768
    @billl1768 2 years ago

    Thanks for the explainer. Very helpful.

  • @sk.n.9302
    @sk.n.9302 2 years ago

    Great video, amazing explanation!

  • @skipugh
    @skipugh 2 years ago

    Thank you. Great presentation 👍👍

  • @ChrisFleck
    @ChrisFleck 2 years ago

    It will be great to avoid curb rash!! Are the cameras in the correct places? What about curbs in front, below the view of the camera?

  • @MostViewedTop40
    @MostViewedTop40 2 years ago +6

    Pretty awesome stuff. It would be cool to know how much of the available compute this takes up versus all the other things it is doing. As each new version of the FSD computer comes out, presumably they can have higher and higher numbers of tiny voxels. I always think they are limited by having to keep the hardware in the car the same for so long.

    • @Martinit0
      @Martinit0 2 years ago

      Given Tesla's foresight to use defined interfaces between car components, I am pretty sure they could field-replace the FSD computer. The price of FSD certainly has enough margin to do that.

    • @WarrenLacefield
      @WarrenLacefield 2 years ago +1

      Also, the new Samsung cameras are far higher resolution, so they can actually see the "up close" much better than the older cameras and see the "much farther away" in sufficient resolution. In the future, another dimension of the video stream might well be the (variable) focal depth of view of the cameras. That might be a way of implementing variable "voxel resolution" for the 2-3 second dynamic occupancy network.

  • @cpleng7
    @cpleng7 1 year ago

    Thank you very much; now I have a basic idea of FSD.

  • @TanujBolisetty
    @TanujBolisetty 2 years ago

    Great explanation. Keep going

  • @gressex7
    @gressex7 2 years ago

    Wow, awesome video. Thank you!!

  • @davidc3463
    @davidc3463 2 years ago +3

    Just a heads up, "Ashok" is pronounced "uh-sh-OH-k", not "uh-sh-oCK". A lot of YouTubers are mispronouncing his name ;)

  • @chrisd6716
    @chrisd6716 2 years ago +2

    I'd love to see you do a breakdown of George Hotz's breakdown. He didn't seem very impressed at all, and still thinks Tesla's approach to all of this is incorrect. Interested in your thoughts on his approach with Comma versus Tesla's.

    • @635574
      @635574 2 years ago +1

      I watched half of that, and he just comes from a smaller company that cannot even try to make these complex systems. It is far more useful to have a general system that recognizes 3D motion and knows how to avoid things; that matters more than understanding how to classify the objects in the way. As long as Comma AI doesn't have a solution for the general unexpected case, IMO they will be as dangerous as the old self-driving before 10.69 and crash into unrecognized obstacles.

  • @Nakatomi2010
    @Nakatomi2010 2 years ago

    Brown appears to be either an "openable object", like a garage door, or a gate.

  • @jubjub7406
    @jubjub7406 2 years ago

    Well done!

  • @TheElectricMan
    @TheElectricMan 2 years ago +1

    Thanks For Posting Quality #Tesla Content

  • @kipling1957
    @kipling1957 2 years ago +1

    Physical occupancy is a bit of a misnomer, in that it ignores gases and possibly transparent liquids. Air velocity and liquid state water volumes may be relevant in certain environments.

  • @lylestavast7652
    @lylestavast7652 2 years ago +1

    Great job breaking things down. What's the potential value of having increasingly dense imagery on all or some of the cameras for finer detail and distant-object detection? Obviously the image-processing pipeline would require a lot of upgrading. Given today's video resolutions (4K, 6K, 8K), say they bumped the forward-facing camera up to whatever the top is today, maybe 8K, while leaving the others as cheaper cameras: would it be straightforward to downsample that frame for the skeletal info, yet use the full-resolution version for distant-object determination (or accuracy of velocity changes)? And in that vein, if doable at all, how about an additional, non-statically positioned camera whose sensor moves in the direction of intended movement as a time-leading predictor element? Maybe that camera could be pointed in any direction of concern, like when backing up or when motion from one side seems sketchy. I don't know the space at all, just wondering where it goes down the timeline and what's possible.

  • @wholmes7177
    @wholmes7177 2 years ago +2

    Thank you for another excellent video. Do you think SpaceX will use FSD tech to land Starship and Super Heavy?

  • @pauli-714
    @pauli-714 2 years ago +2

    John, thanks for doing this video. I watched AI Day and wasn't that engaged, and didn't realize how important occupancy networks are. Thanks for breaking it down and making it much more enjoyable and understandable 😊.
    From your Intel friend you met at Fully Charged Live in San Diego.
    BTW: Hit me up if you are interested in learning about GPUs/AI Accelerators/IPUs/DPUs/SiPho/etc.

  • @eugeniustheodidactus8890
    @eugeniustheodidactus8890 2 years ago

    *Thank you Thank you Thank you Thank you Thank you Thank you!*

  • @yangni007
    @yangni007 8 months ago

    Thanks for this inspiring video! But is this occupancy map basically the output of a good lidar?

  • @bukurie6861
    @bukurie6861 1 year ago

    Thank you for your video

  • @longboardfella5306
    @longboardfella5306 2 years ago +1

    Thanks John. Great explainer for us. How far do you think they are going with the mid-period strategic planning you called out as an issue on a number of occasions? Seems they'd need 10 seconds or more of forward planning.

  • @partyboeller
    @partyboeller 2 years ago

    So the occupancy network is basically LIDAR without LIDAR?

  • @kevryfabroni
    @kevryfabroni 2 years ago

    great job!!!

  • @davidfellowes1628
    @davidfellowes1628 2 years ago

    Super interesting

  • @m_sedziwoj
    @m_sedziwoj 2 years ago +1

    About the future implementation: there is one important difference between games and driving. Some elements are not important, e.g. static buildings in the distance, and some are very important, e.g. a person lying on the ground. So in my opinion cube size should depend not only on distance but on content too, and on placement (something looking like a human on the side of the road is not as important as one in your lane; even at 100 m it could be less than 3 s before hitting it).
    EDIT: and one more thing, at distance you have limited information because of the resolution of the camera; with distance one pixel covers more and more physical space (look at Unreal Engine 5 and its Nanite technology, where each pixel is a polygon, but no more than one; it is the final form of an LOD system).
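
    The point about one pixel covering more and more physical space with distance can be made concrete with a small-angle estimate (the field of view and resolution below are illustrative assumptions, not Tesla's actual optics):

    ```python
    import math

    def pixel_footprint_m(distance_m, hfov_deg=50.0, h_pixels=1280):
        """Approximate width covered by one pixel at a given range, for an
        assumed horizontal field of view and horizontal resolution."""
        rad_per_pixel = math.radians(hfov_deg) / h_pixels
        return distance_m * rad_per_pixel      # small-angle approximation

    for d in (5, 25, 100):
        print(f"{d:>3} m -> ~{pixel_footprint_m(d) * 100:.1f} cm per pixel")
    # ~0.3 cm at 5 m, ~1.7 cm at 25 m, ~6.8 cm at 100 m: far-field voxels much
    # finer than the pixel footprint would add cost without adding information.
    ```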

  • @bujin5455
    @bujin5455 2 years ago

    30:07. You probably wouldn't want to use dynamic detail quite like that. The faster you go, the further ahead of yourself you want to plan, and the more you want to narrow your field of focus. You can't make fast maneuvers, so up-close detail isn't too important, but you definitely want to understand that something is in the road far enough out that you can plan and not hit it (even if it's a relatively small something). So the majority of the detail needs to shift out in front of you, and it needs to be focused however many seconds out it needs to be for you to be able to make meaningful/safe course corrections. Basically there is a judgment window: the faster you go, the farther out that window is, and the narrower that window becomes. (That is in fact how humans do it.) And you want to prioritize your resources around that judgment window, and shift that window according to the situational context.
    33:28. That is how the eye works, because it can't resolve detail at a distance, but that is not how human focus works. Cameras will have fewer and fewer pixels to work with as you go farther and farther off into the distance (just like an eye), but the AI's focus (the area where all the resources are spent) needs to be optimized around the ideal decision window, as I described.
    They may also be able to do dynamic voxels, sort of like how the new Unreal Engine does dynamic triangles, where the voxel becomes content-aware: using fewer voxels to represent less complex items, like how not many voxels are needed to map the occupancy space of a bus. That way you could increase your area of focus for the same resource expenditure.
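
    That "judgment window" intuition maps onto simple kinematics; a rough sketch (the reaction time and deceleration are generic assumed values):

    ```python
    def judgment_window_m(speed_mps, reaction_s=1.5, decel_mps2=6.0):
        """Rough distance band where planning attention matters most: from the
        point where the vehicle could realistically start reacting out to the
        distance needed to brake to a stop."""
        reaction_dist = speed_mps * reaction_s
        braking_dist = speed_mps ** 2 / (2 * decel_mps2)
        return reaction_dist, reaction_dist + braking_dist

    for kph in (30, 60, 120):
        near, far = judgment_window_m(kph / 3.6)
        print(f"{kph:>3} km/h -> focus roughly {near:.0f}-{far:.0f} m ahead")
    # The window's centre moves further out as speed rises, which is the
    # argument for shifting voxel detail forward (and laterally narrower)
    # on the highway.
    ```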

  • @richiehart7858
    @richiehart7858 2 years ago

    Very good video.

  • @stephm9261
    @stephm9261 1 year ago

    Great video even a year later!

  • @mjr7991
    @mjr7991 2 years ago

    Is vector space running in the latest version? My experience would suggest no. My Tesla does not see a parking arm that is down and will plow through it if I let it.

  • @TheOmega1971
    @TheOmega1971 2 years ago +1

    Maybe this is because there is Hardware 4 in the cars now...?

  • @kcx047
    @kcx047 2 years ago +1

    For Optimus, the LOD needs to come down to 1 mm or less. Fine motor skills will eventually need 500 microns or so. At 50 microns, Optimus can perform most surgeries. Very cool.

  • @shak7185
    @shak7185 2 years ago

    Thanks for your fantastic video! The dynamic level-of-detail problem they are currently facing would potentially be much simpler if LIDAR data were available, which has a much higher level of detail without the need for heavy (and slow) video processing. New LiDAR solutions are coming online now which cost about as much as the ultrasonics they just removed.

  • @jkev1425
    @jkev1425 2 years ago

    What I don't understand is how Tesla can replace the input from the ultrasonics in areas the cameras cannot see. Next to the bumper the ultrasonic sensor can "see" but the camera can't. If the scene is moving it is fine, but if you parked your car and now something is next to your bumper, you can't...

  • @barrydaugherty5528
    @barrydaugherty5528 2 years ago

    With all they're doing, it seems like it would be extremely easy to present a bird's-eye view of the car when parked or exiting a space. I wonder if they'll ever offer that to users?

  • @billgibson2520
    @billgibson2520 2 years ago

    🥶 Very well done... thanks!

  • @goingballisticmotion5455
    @goingballisticmotion5455 2 years ago

    They did that already without NeRFs? Wow, the combination will be remarkable.

  • @garretthoneycutt3432
    @garretthoneycutt3432 2 years ago

    Red means the object is potentially dynamic.

  • @murta1979
    @murta1979 2 years ago

    Why do I get the feeling that this occupancy network solution is the last piece of the puzzle Tesla needed? I have always questioned whether Tesla's AI could "see", for example, the pillars in the middle in Gali's "monorail" test.

  • @jangeiss8693
    @jangeiss8693 2 years ago

    You are amazing

  • @christophercobb6352
    @christophercobb6352 2 years ago

    Another way to say "thirty centimeters" is "one foot" (approximately).

  • @russadams3008
    @russadams3008 2 years ago

    The problem with the occupancy network AT THE PRESENT is that it avoids "objects" it should not. I had FSD attempt to avoid puddles recently. I'm not talking about flooded roads; I'm talking about a puddle that occupies one lane of a multi-lane road. What it did was attempt to change lanes every time it encountered a puddle. This resulted in weaving from lane to lane. It became unsafe and I took over. I captured this behavior by pushing the feedback button every time it happened. I wonder if Tesla even noticed my feedback.

  • @jacksonmatysik8007
    @jacksonmatysik8007 2 years ago

    I wonder when the occupancy network will recognise potholes.

  • @terrymatic
    @terrymatic 2 years ago

    DKIA, greetings from Jamaica!
    Could the occupancy network be used to translate 2D games into 3D VR worlds?
    Gaming and VR have the same problem as EVs and charging ports... everybody is doing their own thing.
    Could there be a way for Tesla to convert all the 2D objects in any game and translate them into a universal VR format? Since they could have the physics for every object, it would make things frictionless!

  • @andyfeimsternfei8408
    @andyfeimsternfei8408 2 years ago

    The problem with eliminating ultrasonics is that the cameras can't see close to the car all around. The occupancy network can infer objects moving into the blind spots if the cameras are on; however, if an object gets very close to the car while it is parked, it will be unknown to the network. For example, a small animal or a child's toy placed near the car while parked. This would not be a problem if the cameras had 100% coverage of the vehicle. In my opinion, if Tesla had upgraded the camera locations, FSD would be much simpler and farther along by now. The team is struggling with lots of elaborate measures that could be avoided if the fender repeaters were located at the front of the car, perhaps on the sides of the headlights.

  • @johnhanson6039
    @johnhanson6039 2 years ago

    Teddy bear looking thing is a Kerbal 🤣

  • @MGCAUSTIN
    @MGCAUSTIN 2 years ago

    This is a great research project, but it's not ready for public release. Can we just get the cruise control to work reliably without slamming on the brakes at high speed? Then we can move on to the fancy stuff.

  • @amusableackeem
    @amusableackeem 2 years ago

    So basically, they're training our cars to play a video game in the real world lol

  • @cheronecom
    @cheronecom 2 years ago

    If Tesla Vision is good enough to prevent me from smashing into my wall or trailer or curbs, then great. I sure hope this isn't a mistake.

  • @alexi7787
    @alexi7787 2 years ago

    Hmmm, thinking about the occupancy network, I still think a camera is needed on the front bumper: how can it guess there is a hydraulic retractable bollard if it can't see it beforehand? @elonmusk @Tesla In Europe, city roads are now being equipped with those bollards. If the car saw the bollard raised before we approached, but it then retracts inside the car's blind zone as we get close, will it still say something is in front?

  • @capslock9031
    @capslock9031 2 years ago +3

    I'm really looking forward to your video here, since I just watched some of what George Hotz had to say about this stuff: ruclips.net/video/lSXwIzww6Us/видео.html My impression is that there are lots of philosophical differences between what he likes to do and what Tesla does, but that his criticisms are at least very well informed and fair on a technical level. Two takeaways were that he views the occupancy nets as a reconstruction of lidar, and that he thinks Tesla still does quite a lot of hard coding, to their disadvantage.
    I'd really love it if you could bring on George Hotz along with James Douma and maybe even Scott Walter to discuss some key points of Tesla's hardware and software approaches. That would be an epic discussion, I believe.

  • @de-kat
    @de-kat 2 years ago

    It's time for Tesla to bring out a few robot vacuum cleaners; with this technology they could easily outperform all existing manufacturers in terms of detection and production costs.

  • @colinmaharaj50
    @colinmaharaj50 2 years ago

    Dr Know-it-all, I like that name. I'm new here.

  • @dareldavies2674
    @dareldavies2674 2 years ago

    Looks like my kids' Minecraft, yet it will be a major turning point.

  • @strejf
    @strejf 2 years ago

    Nice video, but they do compress the images, right? It's not RAW image data they send to their servers.

    • @strejf
      @strejf 2 years ago

      @Torr-Net I don't think so.

    • @nettlesoup
      @nettlesoup 2 years ago

      @@strejf Yes, the videos uploaded to the servers are each several seconds (>10) of losslessly compressed footage (i.e. effectively RAW). This is according to earlier conversations James Douma has had with Dave Lee. Video camera capability is 1280x960 at 12 bits per colour. I guess we don't know the frame rate any more since they're now using raw photon counts, but it used to be 36 Hz in the old days.
      Bear in mind that FSD Beta testers regularly talk about getting home and watching their Tesla upload several GB of data to the mothership when they check the home router. This is mostly losslessly compressed video, presumably from all 9 cameras (if we include the cabin camera).

    • @strejf
      @strejf 2 years ago

      @@nettlesoup When I store videos on my USB drive in the car they are compressed though, not lossless.

    • @nettlesoup
      @nettlesoup 2 years ago +1

      @@strejf That's correct, because the USB drive has to hold tens of minutes' worth of video, which would be impossible in most cases to store losslessly due to size requirements and the rate of data transfer to the USB device.
      These 10 or 15 second clips are compressed slightly, but they need to be lossless because Tesla needs to see exactly what the camera saw originally and be able to use this raw data to train the neural net at the server side.
      Ten seconds of lossless video is going to be about 1 GB (see the quick arithmetic after this list), so it's not a huge issue, and I guess it's worth it if Tesla can be strict about the conditions when they collect such data, e.g.:
      1) When an FSD Beta tester presses the data collection icon (limited number of presses since local storage is limited).
      2) When a crash occurs.
      3) When a campaign filter is triggered, meeting a very specific requirement where they want high-quality examples of some edge cases.
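
      A quick sanity check of that clip-size estimate, using only the numbers quoted in this thread (1280x960, 12-bit raw samples, ~36 fps); treat it as order-of-magnitude arithmetic rather than confirmed Tesla figures:

      ```python
      width, height = 1280, 960   # camera resolution quoted above
      bits_per_px   = 12          # raw sample depth quoted above
      fps, seconds  = 36, 10      # legacy frame rate, clip length

      bytes_per_frame = width * height * bits_per_px / 8      # ~1.8 MB
      clip_bytes      = bytes_per_frame * fps * seconds        # one camera, raw
      print(f"raw frame: {bytes_per_frame / 1e6:.2f} MB")
      print(f"raw 10 s clip, one camera: {clip_bytes / 1e9:.2f} GB")   # ~0.66 GB
      # Lossless compression shrinks this somewhat, so "about 1 GB" per clip is
      # the right order of magnitude, and several GB once multiple cameras or
      # several clips are uploaded.
      ```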

    • @strejf
      @strejf 2 years ago

      @@nettlesoup I think George Hotz said in his analysis video that the images are compressed, not lossless. Comma AI does it, and he claimed Tesla does too.

  • @marcusnichols5595
    @marcusnichols5595 2 years ago +2

    Anyone else wondering about defense industry applications? A tank equipped with vision-only perception would be stealthier than a platform that used radar or active IR emitters. The potential for active anti-ATGM measures using vision / passive IR / microphone sensing would create a sensory bubble that could identify and counter most direct threats.
    Of course, passive sensors can always be defeated with paint. ruclips.net/video/v-0pcfxFlRA/видео.html
    Indirect threat vectors require counter-battery solutions, either a separate companion platform or perhaps a slaved drone.
    I don't think Tesla / SpaceX have any ambitions in this space; to date, they have not expressed any interest in entering the defense sector. However, both companies are in some sense institutes of technology that induct the best of the best and then upskill them in the science of manufacturing. Tesla alumni know more than just how things work; they know how to make them at scale.
    Manufacturing-at-scale best practice appears to have morphed from a pick-'n'-mix catalogue engineering menu into a company-culture shift. Over time, Tesla's manufacturing system will overtake the Toyota Production System.

  • @richardgoldsmith7278
    @richardgoldsmith7278 2 years ago

    Be honest, the nodding was you nodding off 😜 😘

  • @tldrinfographics5769
    @tldrinfographics5769 2 years ago

    Can’t believe they just flipped like this on software