SLAM Robot Mapping - Computerphile
- Published: 27 Nov 2024
- Thanks to Jane Street for their support...
Check out internships here: bit.ly/compute...
More links & stuff in full description below ↓↓↓
This video features the Oxford Robotics Institute demonstrating their SLAM algorithm with their Frontier device and the Boston Dynamics Spot robot. Thanks to Marco Camurri & Michal Staniaszek for their time. Last time we met Spot: • Automating Boston Dyna... Joining Point Clouds with Iterative Closest Point: • Iterative Closest Poin...
JANE STREET...
Applications for Summer 2023 internships are open and filling up... Roles include: Quantitative Trading, Software Engineering, Quantitative Research and Business Development... and more.
Positions in New York, Hong Kong, and London.
bit.ly/compute...
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottsco...
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
i made a little robot that used slam mapping once. that is, it slammed into walls to find out where they were
😀😆😅🤣😂😂😂
Top of the line roomba robots do the same thing, so your robot was as cutting edge as they are.
"Simultaneous Localisation And Mutilation"
@@timseguine2 exactly. it didn't last long
I knew a vision impaired dog that navigated like this
“The IMU has physics inside”
I’m stealing this line for my next UAV LIDAR lecture
You're working on UAVs, so cool
I remember over a decade ago using a periodically rotating ultrasonic distance sensor on a small Lego robot to do a very basic SLAM, where (luckily) all of the relevant rooms were perfect 1m squares and all we needed to know was where we were in relation to the center of the room.
It's amazing how far tech has come and how incredibly diverse and useful it is! I love seeing the multi-camera SLAM systems like used on Skydio drones and inside-out tracked VR headsets
One of the coolest applications of the Kalman filter is SLAM modeling. Really gives you a sense of how flexible the Kalman filter is.
Kalman filters for SLAM are quite outdated; all the state-of-the-art methods do least squares on a graph.
@@oldcowbb Makes sense - saw it in a textbook teaching Kalman Filter.
@@oldcowbb Can you link some more information? It would be cool to get a more detailed overview after the video summary.
@@zombie_pigdragon Find Cyrill Stachniss's graph SLAM lectures on YouTube; he describes the math behind graph SLAM in detail. He also has a graph SLAM tutorial paper, basically a paper version of his lecture.
@@zombie_pigdragon There is also a MATLAB Tech Talk on graph SLAM, but that one is more about the intuition than the actual math.
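For anyone wanting a concrete feel for the graph-based least-squares formulation mentioned in this thread, here is a minimal sketch in Python. It uses 1D poses, made-up odometry numbers and SciPy's generic solver rather than a real SLAM back end, so treat it as an illustration of the idea only:

import numpy as np
from scipy.optimize import least_squares

# Measured relative motions (odometry) between consecutive poses, plus one
# loop-closure measurement saying pose 3 ended up ~0.2 m from pose 0.
odometry = [(0, 1, 1.0), (1, 2, 1.1), (2, 3, 0.9)]   # (i, j, measured x_j - x_i)
loop_closures = [(0, 3, 0.2)]

def residuals(x):
    res = [(x[j] - x[i]) - z for i, j, z in odometry + loop_closures]
    res.append(x[0])          # anchor the first pose at the origin
    return res

x0 = np.array([0.0, 1.0, 2.1, 3.0])   # initial guess from dead reckoning
solution = least_squares(residuals, x0)
print(solution.x)             # the loop closure "warps" all the poses in between

A real system does the same thing with full 3D poses (rotations included) and an information matrix per constraint, but the shape of the problem is this.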
Makes you appreciate the human brain. I go down the garden on the right and back on the other side, and I am quite happy that I have closed the loop. On the way, things will have moved: the wheelbarrow, the washing in the wind. I can see a programmer absolutely pulling their hair out trying to get a robot to assign permanence to the right landmarks.
You are absolutely right, dynamic environments are known to be hard, but it can be addressed by using computer vision to label objects and classify them as dynamic or static.
@@oldcowbb No, it can't be solved. The AI solution is even more noisy and easily fooled. A broken chair under a pile of wreckage cannot be labeled whatever you do. Also, labeling is done, obviously, by people; being realistic, you realize you cannot even label a small fraction of all normal objects. What you read in the news about all-knowing AI is sugar-coated, and it is very much more sensitive to dynamic conditions.
@@sergiureznicencu Yeah, dynamic environments are yet to be solved. I see some recent papers that attempt to use segmentation and show promise, and some that run a sort of dynamic landmark detection, but it's still far from solved.
The thing is, humans don't have a pinpoint accurate map of the world. We localise ourselves in the current room to within a foot or so and work things out as we go. A human wouldn't know if a building was out of square slightly, our maps are much more conceptual, like a graph of room connections.
@@mirr0rd Yes, agreed. Particularly on the error involved (hence the stubbed toe in the dark).
So we build a conceptual map/graph - is that easier or harder than a robot building a map with accurate coordinates? It's certainly a different approach, perhaps each should be jealous of the other's skill?
SLAM is something I've been studying and working on for some time, so it's very cool to see it discussed! It is so useful in so many cases for automation, as you aren't dependent on external data such as GPS. It is extremely useful both for the maps it produces and because you can use the results for path planning and obstacle avoidance.
Can anyone explain how people update the k-d tree where the point clouds are stored during loop closure? I understand the poses and landmarks are updated, but updating the landmarks should disturb the spatial ordering of the k-d tree. How do you resolve that?
@@swagatochatterjee7104 I haven't specifically worked with kd trees of point cloud data, but I have done some general SLAM implementations and can speak on how I would approach it. Later on in the video they mention factor graphs. I would link one sample of points (e.g., all the points from one revolution of the LiDAR) to an individual factor within the entire graph. That way, the point cloud is local only to whatever the position of said factor is. If the factor's position changes, then you simply rigidly transform all the points associated with it as well. That way you do not need to constantly rebuild the trees. Once you are ready for post-processing, then you could rebuild the filtered data once.
@@calvinkielas-jensen6665 Aah ok, so on loop closure the pose-landmark factor constraint only changes relative to the pose, and the visualisation we see is post-processing? However, if we are using landmarks local to the pose, how would one associate landmarks from pose to pose (say using ICP or DBoW)? I am talking in terms of factor-graph SLAM only.
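A rough sketch of the idea in the reply above: store each scan in its keyframe's own local frame, so a loop-closure correction only moves the keyframe poses and the points follow for free. The class names and data layout here are assumptions for illustration, not the ORI code:

import numpy as np

class Keyframe:
    def __init__(self, pose, local_points):
        self.pose = pose                  # 4x4 homogeneous transform, world <- keyframe
        self.local_points = local_points  # Nx3 scan points in the keyframe's own frame

    def world_points(self):
        R, t = self.pose[:3, :3], self.pose[:3, 3]
        return self.local_points @ R.T + t   # rigid transform into the world frame

def apply_loop_closure(keyframes, corrected_poses):
    # After the graph optimizer corrects the poses, the points follow for free;
    # any k-d tree over world coordinates can be rebuilt later, e.g. in post-processing.
    for kf, pose in zip(keyframes, corrected_poses):
        kf.pose = pose

kf = Keyframe(np.eye(4), np.random.rand(100, 3))
print(kf.world_points().shape)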
Amazing video, I love to see that ROS is used for all of these applications, it helps the research grow and the results are amazing. Every year new amazing things happen
Would be very interesting to have a deeper dive into this. Like, how do they extract features from the images, and how do they handle uncertainty in the data?
And then a Numberphile video on how it's solved!
Those are military secrets; I don't think the Pentagon will be happy to disclose how these work.
there's a whole field of research on that. basically a lot of math and statistics and machine learning
Also how they know that the system is near an older position based on the point clouds
@@mikachu69420 There isn't a lot of machine learning required; very basic visual features (ORB, SIFT) will do. Of course you can also spice things up with machine learning. Uncertainty is handled by Bayesian updates, usually Gaussian or Monte Carlo.
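As a small illustration of the "very basic visual features" mentioned above, this is roughly what ORB matching between two consecutive frames looks like with OpenCV (the image file names are placeholders, not anything from the video):

import cv2

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force Hamming matching with cross-checking to discard weak correspondences.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} tentative feature matches between consecutive frames")

The matched features are what the front end tracks; the uncertainty handling happens downstream in the estimator.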
I only took an introductory course in robotics in college, but I remember learning a bit about SLAM. I personally think it's one of the coolest problems in computer vision.
You should really go into depth regarding the mathematical intricacies of probabilistic robotics! It's an awesome field of math and engineering!
I've got a suggestion for map management: treat the robot as being in a 3x3xN cuboid, where cell 2x2x2 is the one the robot is ALWAYS in. It is declared as RxCxD from 0x0x0, which is always the first place it started recording. Cell 2x2x2 gets the most attention; the cells directly adjacent are for estimating the physics of any interaction (such as a ball knocked off an overhead walkway that it needs to evade). A change versus the stored recent state indicates objects and/or living things in the local area to be wary of; otherwise just keep checking with the dedicated threads. The cells from 4 onwards of dimension N are for vision-based threads; they default to empty space unless a map of the cell happens to be available. Whenever the robot reaches the normalised 1.0 edge of the cell it's in, the RxCxD it's in changes with it and its normalised position flips from 1.0 to -1.0 and vice versa. Keeping to a normalised space for analysing physics and reachable locations simplifies the math, since it only has to focus on what is in RAM rather than whatever map file(s) are on disk, which are loaded/unloaded by whatever thread moves the robot's position. The files just need to be stored something like this:
7FFFFFFF_FFFFFFFF_3FFFFFFF.map
The underscores separate the fields (each a float value, in this example anyway), so in the above case the three values would be read into integers, cast to floats via pointer arithmetic or unions, then treated as the absolute position of the map being used in relation to what is stored (which is faster than reading text, obviously). This, combined with the space of the cell in its non-normalised form, gives an exponential increase in absolute positioning while keeping RAM and CPU/GPU/TPU usage reasonable by not using doubles for the physics and mapping of the cells themselves, which no doubt translates to even faster processing of those cells. For the count of N in 3x3xN I would go with 10, so that it's not looking too far ahead but also not ignoring too much with a 3x3x3 arrangement; 10 is also easier for manual math and percentages. Using fixed-width map names also makes it easier to map in your own head roughly where the robot saw it in relation to when it first started mapping, since the column vs row vs depth fields are neatly aligned for a quick read by human eyes.
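If I've read the scheme right, each underscore-separated field is the hex bit pattern of a 32-bit float. A tiny Python sketch of decoding such a filename (using struct instead of the unions/pointer casts described, purely to show the idea):

import struct

def decode_map_name(name):
    stem = name.rsplit(".", 1)[0]        # drop the ".map" extension
    fields = stem.split("_")             # e.g. "7FFFFFFF", "FFFFFFFF", "3FFFFFFF"
    # Reinterpret each 32-bit hex pattern as a big-endian IEEE 754 float.
    return tuple(struct.unpack(">f", bytes.fromhex(h))[0] for h in fields)

# The example name above happens to decode to NaNs; real cell offsets would use
# ordinary float bit patterns.
print(decode_map_name("7FFFFFFF_FFFFFFFF_3FFFFFFF.map"))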
My robot vacuum cleaner has been doing this for the last five years. Clever little bugger. £200 to never have to push the hoover around again. 👍
First video that I saw where something about the IMU was said... Thank you :)
We recently programmed a small Mindstorms robot to steer through a course: it used lidar scans to map the environment and then found its path step by step by means of PRM (probabilistic roadmap method). That was fun.
2D only but you have to start somewhere I guess :)
Long time since I have seen Brady. That was nice.
I do so very much enjoy peeking under the hood into the robotics software through these videos.
Thank you Jane street
So you could in theory hive-mind these maps, so you have multiple robots and potentially drones that exchange map data with each other, right? That could be a very powerful way to quickly teach new robots the landmarks and rooms they might have to navigate in, or to more quickly complete loops as robots work together on the same loops?
Sounds like a great idea. The robots would just need to communicate their relative distance and orientation to each other
Distributed map fusion is kind of hard, especially since SLAM isn't weather and season independent yet.
Very much ongoing research. And to make it more fun, distributed algorithms open you to the problem of "what if one or more agents are lying?"
Actually, NASA has implemented one. It's a combo of a drone and a rover: the drone identifies a safe path for the rover to travel.
it looks like a videogame scanner or map. Metal Gear was ahead of its time lol
OMG.... did you actually respond to my plea to cover SLAM with this video?
Either way, thank you so much.
I promise never to call any of you stupid again.
(If. ... it's monocular slam, onboard, in browser and written in javascript. .... with no open3d.. or tensor flow. .... or ARcore or arkit)
You guys can do it
Thank you Sean.
this is actually quite cool
That thing is the perfect shape for a next-gen robovac. Give it a suction tool at the front and two telescopic feather dusters in the front limbs et voilà
Does all this SLAM algorithm work get saved in-memory or on disk? Very interested to see how the theory is implemented.
Amazing content either way - Spot will no-doubt be a part of all our futures...
It depends on the map's scale and the amount of data.
If it's a tiny warehouse and a 2D lidar, things are straightforward.
On a large scale though, such as self-driving with 3D lidar, you need to load "chunks" of the map based on the location; this is where Big Data comes in :)
One of the entry-level LiDARs will produce about 8 Mbit/s of data, that is about 1 MB/s. I guesstimate one round around the room by the robot to be 30-60 s. That is 30-60 MB of data per loop. Plus accelerometer and visible-light camera data, use a better lidar ... in the worst case ... let's say ... 5-10× as much, maybe? Still less than a gig of RAM, easily fits on the onboard computer. Processing the data, on the other hand? I have no idea.
@@MarekKnapek There is a lot of research on lidar data compression for obvious reasons. Most algorithms don't use raw pointclouds for the slam problem, instead the data is preprocessed into less dense structures, sometimes using standard methods, sometimes with the help of AI models. To recover a highly detailed map, some state-of-the-art methods again use AI for decompression.
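To make the "preprocessed into less dense structures" point concrete, the simplest such step is a voxel-grid downsample. A minimal NumPy sketch with a made-up voxel size and a fake cloud (real pipelines use PCL/Open3D or more elaborate structures):

import numpy as np

def voxel_downsample(points, voxel=0.1):
    # Keep one representative point (the centroid) per occupied voxel cell.
    keys = np.floor(points / voxel).astype(np.int64)          # integer voxel index per point
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    sums = np.zeros((inverse.max() + 1, 3))
    counts = np.zeros(inverse.max() + 1)
    np.add.at(sums, inverse, points)
    np.add.at(counts, inverse, 1)
    return sums / counts[:, None]

cloud = np.random.rand(100_000, 3) * 10.0                     # fake 10 m cube of scan points
print(voxel_downsample(cloud).shape)                          # far fewer points than 100k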
You guys should look at the robots at Ocado, Tom Scott did a video on them
this video is so cool
When will you cover the beer pissing attachment?
How does the software determine that point X50 is the same as point X2? How does it 'know' that it can close the loop?
Also, is there a long term memory aspect that enables the software to continually refine the map and sharpen the picture?
Even with the small IMU errors, once it gets close, the point clouds near X2 and X50 start matching. Once it realizes it's matching features, that *is* closing the loop. It just has to adjust the X's in between (what he called 'warping the map') to make them coincide exactly.
Different matching strategies for landmarks involve either RANSAC, JCBB or simple nearest-neighbour searches. For raw lidar data, ICP is the best way to align two scans and insert a corresponding edge into the pose graph.
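For anyone curious what the ICP alignment mentioned here boils down to, a bare-bones point-to-point version fits in a few lines (toy data, no outlier rejection, so a sketch rather than anything production-grade):

import numpy as np
from scipy.spatial import cKDTree

def icp_step(source, target):
    # One ICP iteration: match each source point to its nearest target point,
    # then compute the best-fit rigid transform (Kabsch / SVD).
    tree = cKDTree(target)
    _, idx = tree.query(source)                 # nearest-neighbour correspondences
    matched = target[idx]
    mu_s, mu_t = source.mean(0), matched.mean(0)
    H = (source - mu_s).T @ (matched - mu_t)    # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_t - R @ mu_s
    return source @ R.T + t, R, t

# Toy usage: align a rotated/translated copy of a random cloud back onto the original.
rng = np.random.default_rng(0)
target = rng.random((500, 3))
theta = 0.1
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
source = target @ Rz.T + np.array([0.05, -0.02, 0.0])
for _ in range(20):
    source, R, t = icp_step(source, target)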
really great video!
i love this channel but i always have to turn my volume off and turn on captions whenever you bust out the markers and paper 😅😅 the noise just turns my stomach ;-;
Can you use the loop closure to add some constant to the IMU guesswork so the errors are likely smaller next time?
That sensor fusion was solved by current $300 VR headsets like the Oculus Quest, no?
you better have radio base stations for localization as well
Usually you don't have to scan an area without first being around it; you can deploy base-station beacons to get accurate local positioning relative to the beacons, and mutually between the beacons.
breadcrumb beacons dropped along the scan path to be used as the mapping localization anchor beacons
Also, you could put those markers at the lidar scan positions in 3D space, to always align directly to those points; aligning the point cloud to a few such points is much less work than aligning to all the points.
maximum likelihood mapping model
most likely model based on the measurements
We have an IMU of sorts, and two cameras, and touch sensitive limbs. Should we not strive to make computers able to work just using that?
Can you confuse/disturb the robot by walking around the robot with a big mirror?
What would the 3D map look like?
Not sure about lidar, but mirrors are the bane of robotic vision.
I believe the issue is that the lidar needs some diffusion from the material it is detecting such that at least part of the laser will come back to the sensor to be detected. So with a mirror it will have the same effect as light, the robot won't see the mirror but will see the reflection inside the mirror.
👏👏
What is the single-board computer being used there?
👍
Makes one curious where Stone Aerospace is nowadays.
When you solve the optimization problem (I guess it is some variation of a least-squares algorithm), is there a good way to find outliers? (Say there is a large measurement error in the LIDAR and the robot now thinks it has moved a lot in the room, but the accelerations look as expected, as if the robot has not moved more than usual.)
Can you tell which measurements align with each other, or do they all look equally valid? Maybe something walks in front of the sensor and you have some measurements that do not line up with all the rest.
Not sure if this is what you are asking, but the least squares is scaled by the uncertainty: each measurement is assigned an information matrix, and the more certain you are about a measurement, the more it matters in the optimization.
Yes, current research is largely focussing on segmenting moving objects from the sensor data, while also solving the slam problem. It's a little more complex than least squares though.
Some new lidars can also measure Doppler shift and thus the velocity of a point in space. This makes the task of segmenting and tracking moving objects much simpler.
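A tiny sketch of the information-matrix weighting described a couple of replies up, with invented covariances: the same error costs a lot when it comes from a trusted sensor and almost nothing when it comes from a distrusted one, which is how suspect measurements get down-weighted:

import numpy as np

def weighted_squared_error(error, information):
    # Mahalanobis-style cost: the residual is scaled by the information (inverse covariance).
    return float(error.T @ information @ error)

error = np.array([0.3, -0.2])                        # same residual for both sensors
odom_info  = np.linalg.inv(np.diag([0.01, 0.01]))    # trusted odometry: high information
lidar_info = np.linalg.inv(np.diag([1.0, 1.0]))      # dubious lidar match: low information

print(weighted_squared_error(error, odom_info))      # large cost, dominates the optimization
print(weighted_squared_error(error, lidar_info))     # small cost, barely matters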
How do you distinguish landmarks from things that are moving?
Some form of RANSAC is running during feature matching to ignore outliers. The general assumption is that not everything is moving.
What is the name of the small stereo cameras he is using?
You know you're working at a dream job when your office's stairwell has railings that have lighting underneath. Who the heck over funds all this??? When all they do there is "think tank" and make prototypes no one asked for. They're not solving for existing problems. Someone is feeding them 'future things that would be cool' projects. I've never run across places like that. How do they turn a profit?
Everything is super clean.
I think their board of directors are lizard alien overlords. Mulder from "The X-Files" was right: "military-industrial-entertainment complex".
I'm guessing you haven't seen what robots like these are used for, then? They're absolutely solving for existing problems ~ at least, problems of safety and routine.
Robots like Spot are used commercially to do automated (and manual if needed) inspections of sites that are dangerous for humans, for instance, mines, radioactive sites, and places with other dangerous equipment.
Sure, they might not be "essential", but it is safer, easier, more cost effective, and potentially more reliable, to have a robot do those tasks rather than a human.
And in general, this kind of robotics technology is extremely important for any kind of robots that need to perform "open-world" tasks, and there's many reasons you might want that.
that's just all of tech: an overfunded bubble of hype with little to no real-world uses
Very didactic explanation
I just need a bottle of milk
Is it fixed-lag smoothing or full smoothing ?
Can it distinguish similar rooms opening off a looped corridor?
I'd expect so - it's tracking position via velocities and orientations as well as via the mapping, so it should be able to tell that it's in a different place that just looks the same
Can anyone explain how people update the k-d tree where the point clouds are stored during loop closure? I understand the poses and landmarks are updated, but updating the landmarks should disturb the spatial ordering of the k-d tree. How do you resolve that?
Search for ikd tree
@@sandipandas4420 What was the process before the ikd-tree was invented? Do they use a pointer to determine which changes happened in which cloud points?
How do you make a 3D environment? Doesn't lidar give us a 2D graph, just point data in one direction with its orientation?
There are lidars which are tilted onto their sides, and the entire LIDAR unit rotates, so that you get a series of 45-degree slices which, stitched together, can make a 3D map.
Those are expensive, though! Actually... all LIDARs are super pricey.
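A rough sketch of how those tilted, rotating 2D scans become 3D points (the 45-degree tilt and the mount geometry here are assumptions for illustration): each slice lives in the lidar's own plane, gets tilted, then rotated about the vertical axis by the mount angle, and the slices pile up into one cloud:

import numpy as np

def slice_to_3d(ranges, scan_angles, tilt, mount_yaw):
    # Points of one 2D scan in the lidar's own plane (x forward, y sideways).
    x = ranges * np.cos(scan_angles)
    y = ranges * np.sin(scan_angles)
    pts = np.stack([x, y, np.zeros_like(x)], axis=1)

    # Tilt the scan plane (rotation about x), then spin the whole unit (rotation about z).
    ct, st = np.cos(tilt), np.sin(tilt)
    Rx = np.array([[1, 0, 0], [0, ct, -st], [0, st, ct]])
    cy, sy = np.cos(mount_yaw), np.sin(mount_yaw)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return pts @ (Rz @ Rx).T

cloud = np.vstack([
    slice_to_3d(np.full(360, 2.0), np.radians(np.arange(360)), np.radians(45), yaw)
    for yaw in np.radians(np.arange(0, 360, 10))
])
print(cloud.shape)   # all the slices stitched into one 3D cloud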
So they assembled a slamhound
Let's see a dozen of them, on ice skates, programmed for survival, with self-learning code, hunting one another down, with oxyacetylene torches! 💥 - j q t -
This is the REAL dream of robotic engineers!
Is that Ouster lidar?
Fascinating insight into how we can stop these machines after the war mongers have strapped automatic weapons onto them.
Thanks.
I wanna see that in a house of mirror :)
Relationship THAT! :D
Have you considered using a Kalman filter to merge the sensor data?
That's standard practice.
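Since it comes up here and earlier in the thread: a toy 1D constant-velocity Kalman filter shows the predict/update shape of that sensor fusion (the noise values and measurements below are made up, not from any real IMU):

import numpy as np

dt = 0.1
F = np.array([[1, dt], [0, 1]])          # state transition for [position, velocity]
H = np.array([[1.0, 0.0]])               # we only measure position
Q = np.diag([1e-4, 1e-3])                # process noise (dead-reckoning drift)
R = np.array([[0.05]])                   # measurement noise (e.g. a scan match)

x = np.array([0.0, 1.0])                 # initial state: at 0, moving 1 m/s
P = np.eye(2)

for z in [0.11, 0.19, 0.32, 0.41]:       # fake position measurements each step
    # Predict (dead reckoning): uncertainty P grows here.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: blend in the measurement, weighted by the Kalman gain.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.array([z]) - H @ x)
    P = (np.eye(2) - K @ H) @ P

print(x)  # fused position/velocity estimate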
SLAM describes the sound of what I would do with a baseball bat if I saw one of these things in person
This explained nothing. 'We collect these landmarks and position data and then the algorithm solves it, except it's not really solved.'
Okay, cool, I get it now?
Weird ad video.
Is the code/library public? Where can I find more information about it?
Google "ROS SLAM"; there are thousands of posts and examples on that.
Again: The wrong way around!
Compute what the internal model should look like, compare with the "real" image, and refine the model until it matches.
That is how all animals work.
Interesting video. There is some really annoying high frequency noise at around 04:00.
No, there's not
First comment
no prize yet?
@@wisdon that guy didn't like my comment, ok so i give myself Oscars for doin this.
Second 😊
Hello dear viewer watching this about 1-4 years after the creation of this video :)
He sounds so annoyed the whole video haha
Please change the pen. It kills me every time; the noise is excruciating.
The "chicken and egg" description of SLAM needs to stop. "SLAM" is simply using perceptual information to correct for errors in dead reckoning (without a map to use as a reference). If you have a reference map, it's called "localization."
That definition applies to other types of non-metric SLAM not discussed here (topological and appearance-based SLAM).
Chicken and egg refers to the map and the localization. You are missing the part where the robot assembles the map; that is also a key output of SLAM, not just a way of correcting the odometry. The sensor information is useless if you don't build a map. Every solution to SLAM boils down to building a rough map from dead reckoning first, then simultaneously updating the map and the localization when the same part of the map is observed. It's overly clichéd, but not wrong.
It continues to amaze me how supposedly smart people keep helping the progress of what very clearly will be a huge downfall of humans. These people have narrowed their education far too much and need to broaden their understanding of history.
Humanity has a lot of much more imminent problems than a robot takeover. This kind of tech is super impressive and very useful for some things, but robots are nowhere even close to the generalist skills of humans, and AI is even less so.
Sure, (4-legged) bots can walk around rooms by themselves, and AI can draw close-to-realistic images, but each on their own can pretty much *only* do those things, nothing else.
Great video but ya gotta find another platform other than youTube. Move to rumble or odysee.
Nuclear decommissioning? You'll need something much more rad-hardened than a BD dog...
Scientists need to come up with better names.