This video is unavailable.
We're sorry about that.

Structure from Motion Octocopter - Computerphile

  • Published 4 Feb 2016
  • Thanks to Audible for supporting our channel. Get a free 30 day trial at www.audible.com...
    Creating 3D models with an Octocopter, a camera and some custom software. Christian Mostegel, Research Assistant at TU Graz in Austria explains some of the technology behind the 3D Pitoti Project.
    More about 3D Pitoti: bit.ly/computer...
    Finite State Automata: • Computers Without Memo...
    CPU vs GPU: • CPU vs GPU (What's the...
    3D Rock Art Scanner: • 3D Rock Art Scanner - ...
    Brain Scanner: • Brain Scanner - Comput...
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscom...
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 137

  • @mogami4869 · 8 years ago

    I like that you actually recommend a book at the end each time, compared to many other channels that are sponsored by Audible and just tell me to follow their link... thank you!

  • @IDontDoDrumCovers · 8 years ago +37

    the next google earth is gonna be crazy haha, just imagine like 10,000 of these flying around cities taking pictures of everything in swarms

    • @GtaRockt · 8 years ago +9

      +Social Experiment and when they play the Ride of the Valkyries all the time

    • @zoranhacker · 8 years ago

      that could happen lol

    • @BoJaN4464 · 8 years ago +1

      +Social Experiment Google already does this with satellite images, no drones required!

    • @jmac217x · 8 years ago

      +Social Experiment That's a scary cyberpunk future you just made me imagine. I'm kind of wishing it to happen...

    • @frollard · 8 years ago +1

      +Social Experiment Not sure if you've seen this with current gen google maps/earth; in many cities it's all scanned in 3d already. It's creepy since it can look below obstructed side views of houses (ie trees)

  • @hcblue · 8 years ago

    Such a cool device / technology. Thanks for covering both the theoretical / low-level subjects, e.g., algorithms, and more practical applications / projects, Sean!

  • @ZadakLeader · 8 years ago +23

    The sounds of forks and metal things in the background...

    • @seanski44 · 8 years ago +12

      Actually a lot of the noise was people doing demonstrations of rock carving...

    • @linkVIII · 8 years ago

      +Sean Riley sounds very restaurant-like

    • @seanski44 · 8 years ago +2

      +linkviii yeah people were clearing lunch but the tap tap tap is someone chipping away at a piece of rock

    • @NevaranUniverse · 8 years ago +1

      +Vlad Ţepeş hungry developers are hungry!

  • @GroovingPict · 8 years ago +29

    could you maybe film in a place with even more background noise next time? Thanks. I was almost able to focus on what he was saying in this one, and we can't have that now, can we.

    • @Computerphile · 8 years ago +52

      Yeah if the world was perfect I'd film all interviews on a sound stage with a full camera crew and separate sound crew and pay the contributors so they have the chance to set up their equipment...

    • @GroovingPict · 8 years ago +4

      +Computerphile Yep, cause perfect studio conditions is definitely what I meant.

    • @unverifiedbiotic · 8 years ago +11

      +GroovingPict Have you any idea how hard it is to find a quiet spot in a public space when doing an interview? I think that the microphone they gave the guy was doing a pretty good job all things considered. You can edit out some of the noise that stands out a lot and doesn't overlap with the audio you want to keep, but this is almost impossible if the entire audio is polluted and takes A LOT of time in each case and the end result may be even more distracting than the noise itself (audio artifacts). In the end, recording an interview and editing it is often more difficult than the viewer can imagine and you can really screw up your upload schedule if you don't teach yourself to ignore such minor issues.

    • @mindfulmike8612 · 8 years ago +6

      +GroovingPict People have to work, and if they're going to interview in the place where work is actually being done, there is going to be background noise. Quit being such a whiny little brat and appreciate the FREE CONTENT they're giving you.

    • @seanski44 · 8 years ago +6

      +GroovingPict ah so you weren't talking about a perfect world? Do you think I chose this location? Get real!

  • @titaniumdiveknife · 8 years ago +1

    Genius! Wish my German and programming were half as good as his English.

  • @JohnDoe-pn9ml · 8 years ago +2

    Videos like this make me wish there was an engineerphile.

  • @TheVladBlog · 8 years ago

    This is so good! We are slowly moving towards computer visual understanding. These guys have already developed a "basic" system for the AI to differentiate between different kinds of objects in its sight.

  • @JamesJansson · 8 years ago

    You should totally join computerphile and numberphile to cover 1) computer algebra systems and then 2) computational theorem provers.

  • @rockosigafredi · 8 years ago +25

    Who here is also from Austria? :-D

  • @jmac217x · 8 years ago +7

    Hey Sean would you consider using a monopod stand for your videos? I know it's like the trademark to have that shaky cam, but it's a bit unsettling at times, especially when the camera is pointing a single direction most of the time anyway. I don't want to interrupt your seemingly quick workflow, but something like that could be easily maneuvered to adjust for those close up shots you get, and something with only a single leg wouldn't be too cumbersome to rotate or quickly move. Just a suggestion cause your videos rock.

    • @Computerphile · 8 years ago +7

      Will consider, I try to use tripod most of time these days, only go handheld when situation calls for it or if I haven't got access to my tripod (it happens) >Sean

    • @jmac217x · 8 years ago +3

      Awesome. I knew that you must have used something for those videos with Professor Brailsford, but thought I'd mention it anyway. I really love those videos, for more than just their video quality :3

  • @Aragorn450 · 8 years ago

    This sounds a lot like what senseFly does, but at higher resolution and with more autonomy. Their system is used for mapping cities some, plus keeping track of agriculture growth and all sorts of other things. I could see them approaching these guys to buy their technology for sure.

  • @Adamantium9001 · 8 years ago +2

    How does the vehicle keep track of its own position in order to report where each picture was taken from?

    • @xell2k · 8 years ago

      +Adamantium9001 It uses GPS, basically. The 3D reconstruction and camera motion are then automatically estimated from the images using the structure-from-motion pipeline (without GPS), and then simply "moved" to real-world coordinates using the rough GPS coordinates or ground control points. While flying, the vehicle localizes itself using GPS only.

    • @aritakalo8011 · 8 years ago

      +Manuel Hofer GPS or reference markers. You can see the markers on the table and in the field video. GPS is probably a little bit inaccurate for this job. GPS-only UAVs have a tendency to drift around somewhat while trying to hover precisely, since GPS is not really pinpoint accurate.
      Also, since they fly under overhangs etc., which might block GPS, they probably use those geo-reference markers to get a reliable local reference.
      Some UAVs do this by themselves to some extent. They have downward-looking optical and/or laser scanners that scan the ground below to get an instant local reference for holding a hover. Of course that is relative and not absolute, hence the marker plates. From those you get an absolute reference with the optical scanner (aka camera).
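
The alignment step described in the replies above, taking the SfM model (which lives in an arbitrary coordinate frame) and "moving" it onto rough GPS or ground-control coordinates, amounts to fitting a similarity transform. A toy 2D sketch, representing points as complex numbers so scale and rotation collapse into one multiplier (function names are illustrative, not from the project's software):

```python
# Toy 2D georeferencing sketch: fit world = a * model + b by least squares.
# Complex a encodes scale and rotation; complex b is the translation.
def fit_similarity(model_pts, world_pts):
    """model_pts, world_pts: matched lists of complex 2D points."""
    n = len(model_pts)
    mp = sum(model_pts) / n                     # centroid of model points
    mw = sum(world_pts) / n                     # centroid of world points
    num = sum((w - mw) * (p - mp).conjugate()
              for p, w in zip(model_pts, world_pts))
    den = sum(abs(p - mp) ** 2 for p in model_pts)
    a = num / den                               # scale * e^(i * rotation)
    b = mw - a * mp                             # translation
    return a, b

def apply_similarity(a, b, pts):
    """Map model points into the world frame."""
    return [a * p + b for p in pts]
```

With three or more ground-control markers, the same idea lifted to 3D (with a proper rotation matrix) pins the whole reconstruction to absolute coordinates.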

  • @yomaze2009 · 8 years ago

    Also, on the hardware/software side, this could benefit greatly from the work being done on analysing the polarity of reflected light to determine the absolute colour, texture, reflectivity, etc. of individual objects.

  • @ozdergekko · 8 years ago +1

    Yeah, fellow Austrian *and* Austrian institute of technology (Technische Universität Graz)

  • @NikiDaDude · 8 years ago +4

    On a related topic I'd really like to see a video on the software Google use to generate the detailed topography in Maps and Earth.
    If you look at most major cities you'll see that all the buildings, trees and even small objects like cars have been turned into a polygon mesh and textured.

    • @klaxoncow · 8 years ago +1

      +Nick Yes, I'd love to know exactly how they've managed to do that.
      It covers the entire planet (correction: planets), so it had to be an automated process.
      But the textures show us the sides of objects - so it can't be coming from the satellite data alone, as satellites can only see things top down, they don't see the sides of things.
      My secondary thought was that maybe they've cross-referenced it with their Street View images. The satellites see from above and Street View sees it from the ground.
      But the thing is that Street View doesn't have full coverage of everywhere. Most of it is obtained by driving a camera car around, but that means it only covers what can be seen from the road. And not all roads are covered.
      As you can see when you try to drop the little yellow man on the map to trigger Street View, which highlights which roads have Street View coverage, there's plenty of places where the Google car never goes - because there are no roads, or the roads are private roads, or whatever.
      I thought "ah, they've cross-referenced it all with Street View and altitude data to automate a 3D map" - although, fair play, that's still one hell of a computational challenge unto itself not to be sniffed at, isn't it? - but it seems to have coverage well beyond what a satellite could ever see or Street View has coverage of.
      So, how the hell did they actually do it?
      And how did they do it so well? As I've not yet spotted a single mistake going around looking at this all over the planet. So it must be very good quality source data, as the automated process just ain't getting it wrong anywhere.
      So, yeah, if Computerphile could find a Google Maps engineer to explain that one to us all, I'd definitely be interested to know, as they have seemingly done the impossible there!

    • @manhaxor · 8 years ago

      +Nick Yasutaka Furukawa is a Google programmer who created an MVS (multi-view stereo) algorithm that's used to make the reconstructions in Maps and Earth. I would like to see a video that explains a few of the different methods of 3d reconstruction.

    • @unaliveeveryonenow · 8 years ago

      +KlaxonCow Wrong, satellites can see things at a slight angle, but they have fixed orbits. So at the least you would need images from multiple satellites. But how would they solve the problem of tall buildings covering the small objects at their bases?

    • @stheil · 8 years ago +1

      +cyberconsumer I seem to recall that they don't only use satellite images (both straight down and at an angle) but also aerial photography from planes (and/or helicopters). And those can cover a much shallower angle, obviously.

  • @kensmith5694 · 8 years ago

    Add some really good magnetic sensors and in a lot of cases you could image what is hidden by the outer surface.

  • @yomaze2009 · 8 years ago

    It would be interesting to see different forks in the learning algorithm that focus on detecting properties of "challenging objects" and see how quickly it "decides" to use alternate techniques to get full coverage of the object. I imagine crowdsourcing the determination of what areas of a 3D model "need more work" by offering the 3D model alongside the video taken by the device online. Also include the ability for the "crowd" to slice into the 3D model to isolate the "least defined" component of the feature. The system could then attempt a solution and provide it back to the crowd for further analysis. Could be made to be fun, as a game of sorts!

  • @jopaki · 8 years ago

    effin exciting stuff here man! wow.

  • @Encypruon · 8 years ago

    What about moving things? Like leaves, trees moving in the wind, doors, cars, animals, humans, wind turbines in the background, water and changing light conditions (clouds, the sun moving during the process, flickering synthetic light...). And what about reflective surfaces and refraction? Some surfaces don't look the same from different angles...
    Can any of these things be handled reliably? I imagine it to be very hard to construct meaningful models with things like these in the scene.

  • @devjock · 8 years ago

    So this octocopter is basically doing what people with complete blindness in one eye do? Rocking back and forth to see which background areas of a picture are being exposed/obscured behind objects in the foreground? Yeah, that takes a lot of computing power. I'd imagine the algorithms used to reconstruct 3D geometry are modelled on the way human brains accomplish the same task (in the case of humans, mostly based on mapping out walking surfaces, obstacle avoidance, and impact threat assessment). Is it done with dense neural networks? How would those be trained? Or is it a self-learning network? Something completely different?
    Also, given that it would be trivial to have the octocopter carry one more camera (for stereoscopic image acquisition), what was the reasoning for not implementing that? I'd imagine the acquisition phase would be way more streamlined if the octocopter had 3D imaging in place to begin with...
    So many questions!

  • @yourfilmindustry · 8 years ago

    They're not rocks, they're minerals!

  • @Systox25 · 8 years ago

    TU Graz? nice!

  • @EtrielDevyt · 8 years ago +1

    This is gonna be great for location building for games!

    • @manhaxor · 8 years ago +1

      +EtrielDevyt there are more efficient ways, like a 3D scanner that takes colour data as well as position data. Structure from motion is the best method for making a somewhat decently detailed reconstruction of a large area, but with flaws and distortions included.

  • @hanniffydinn6019 · 8 years ago +1

    How does this compare to a laser scanner attached to a drone ??????

  • @NeilRoy · 8 years ago

    Fascinating idea. I wonder if such a system could be used on planetary exploration, like say Mars. Or even underwater etc... kewl stuff anyhow.

  • @Durakken · 8 years ago

    Is the reason it has to be a static object due to processing power or some algorithmic problem? I don't see why a motion-prediction algorithm couldn't be incorporated into that, considering that a lot of animation is based on prediction algorithms now, other than the render time for those things, but usually that is due to the quality of the render and not the animation, I think.

    • @xell2k · 8 years ago

      +Durakken basic Structure-from-motion only works for static scenes. You can only solve the optimization problem behind it when the 3D points "do not move". Otherwise it would get much more complicated. However, if most of the scene is static and a few objects are moving it usually works too. Then the moving things are usually just cancelled out automatically. There is of course software to reconstruct a dynamic scene, but here you usually have the assumption of a static camera, and if not, the algorithms are not ready to be used outside of the safe lab environment (as far as I know)
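
The static-scene assumption described above shows up in the most basic SfM operation: triangulating one fixed 3D point from two viewing rays captured at different times. If the point moved between the two exposures, the rays no longer meet at it. A minimal midpoint-triangulation sketch, assuming known camera poses (all names are illustrative):

```python
# Midpoint triangulation: given rays o1 + t1*d1 and o2 + t2*d2,
# return the midpoint of their closest approach.
def dot(u, v): return sum(a * b for a, b in zip(u, v))
def sub(u, v): return tuple(a - b for a, b in zip(u, v))
def move(o, d, t): return tuple(a + t * b for a, b in zip(o, d))

def triangulate(o1, d1, o2, d2):
    w = sub(o2, o1)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    denom = a * c - b * b       # ~0 when rays are parallel (no parallax)
    t1 = (dot(w, d1) * c - dot(w, d2) * b) / denom
    t2 = (dot(w, d1) * b - dot(w, d2) * a) / denom
    p1, p2 = move(o1, d1, t1), move(o2, d2, t2)
    return tuple((x + y) / 2 for x, y in zip(p1, p2))
```

Real pipelines solve this jointly for millions of points and all camera poses at once (bundle adjustment), which is why a moving point breaks the optimization rather than just one measurement.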

  • @MrSabba81 · 8 years ago

    Hi, thanks for sharing this. I am wondering: changing the perspective, would it also be possible to use it specifically for vegetation instead of excluding it? Do you think we could estimate the biomass (volume) of trees, shrubs, etc. with some adaptations? Thanks
    Simone

  • @RAHUDAS · 4 years ago

    Can anyone tell me what ML algorithm is used along with the photogrammetry in this demonstration?

  • @illusivec · 8 years ago +2

    So photogrammetry and Agisoft PhotoScan?

    • @manhaxor · 8 years ago +9

      +flanker It seems like he's more selling the semi-autonomous process of the UAV deciding to take more pictures of harder-to-reconstruct areas.

    • @brummii · 8 years ago

      +flanker I think that the important function would be the UAVs sharing information between each other and "crowdsourcing" the creation of a 3d map, which all of them can use to navigate simultaneously.

    • @manhaxor · 8 years ago

      brummii That's an interesting idea. I've only ever seen realtime 3d reconstruction for single devices, like google's self driving car, and a few pieces of mining equipment. I'm sure there's more, but I have yet to see multiple devices use the same reconstruction.
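
The semi-autonomous behaviour discussed above, the UAV deciding to photograph hard-to-reconstruct areas again, is a view-planning problem. A deliberately simplified greedy sketch (all identifiers are hypothetical; a real planner would also weigh flight cost, viewing angle, and collision safety):

```python
# Hypothetical greedy next-best-view picker: score each candidate
# viewpoint by how many still-uncertain surface patches it would see.
def pick_next_view(candidates, uncertain_patches):
    """candidates: {view_name: set of patch ids visible from that view}.
    uncertain_patches: set of patch ids the model reconstructs poorly."""
    best_view, best_gain = None, -1
    for view, visible in candidates.items():
        gain = len(visible & uncertain_patches)
        if gain > best_gain:
            best_view, best_gain = view, gain
    return best_view

def plan(candidates, uncertain_patches):
    """Greedily order views until all reachable uncertain patches are covered."""
    remaining = set(uncertain_patches)
    order = []
    while remaining:
        view = pick_next_view(candidates, remaining)
        gained = candidates[view] & remaining
        if not gained:          # nothing left is visible from any view
            break
        order.append(view)
        remaining -= gained
    return order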

  • @LastofAvari · 8 years ago

    Cool stuff :)

  • @frigeragmady9625 · 8 years ago +1

    Blaming background noise for distorting the vocals of an intellectual while he speaks intelligent stuff (I am dumbing myself down to level with some of these commenters) is kinda like not wanting to blame yourself for not understanding the intelligent stuff.

  • @unvergebeneid · 8 years ago

    Automatic photogrammetry ... this might be great for indie game developers! Of course it might also save lives by predicting avalanches and landslides and boring stuff like that.

  • @NizarElZarif · 8 years ago

    I was wondering, does anybody know the name of the algorithm used? Like, is there a paper to read or a tutorial to watch?

    • @leopoldarkham7017 · 8 years ago +1

      +Nizar El-Zarif Searching for point clouds and Poisson surface reconstruction will get you going in the right direction.

    • @NizarElZarif · 8 years ago

      Leopold Arkham Thanks

  • @0MVR_0 · 8 years ago

    Who were the first to introduce photogrammetry to machine learning?

  • @TheKirkster96 · 8 years ago

    What if the intelligence could identify the position and dimensions of some vegetation (like a tree) and then just generate a model to fill in that space, giving viewers a representation of "hey, there is a tree there, but we can't scan each branch and every leaf into an accurate model."

  • @unaimb · 8 years ago

    That's photogrammetry with dense point-cloud Poisson mesh reconstruction; it's been used for matchmoving in the film and TV industry for about 6-7 years now. It seems interesting that they chose to make the whole software from scratch instead of just the drone-driving bit… I guess most matchmoving software is not really open when it comes to expanding its capabilities in such a way.

    • @xell2k · 8 years ago +1

      +Unai Martínez Barredo Exactly, there are a lot of SfM pipelines out there (freeware and non-freeware) that work basically the same way. However, many of them are closed-source, which makes them uncomfortable to extend. Since we are a research institution, we aim at developing new tools, such as the view planning you see in the video. Our basic 3D reconstruction software has been in development since about 2006 or 2007, and we are basically maintaining it and extending it with further apps and functionality. And since we have the full source code, this is ideal for research.

    • @unaimb · 8 years ago

      Nice :)

  • 8 years ago +1

    Photogrammetry is great for game development, too.

  • @NeatNit · 8 years ago +5

    Is it really impossible for you to drop the Audible sponsoring? It's getting really annoying.

    • @quakquak6141 · 8 years ago +20

      +NeatNit it's at the end of the video; less annoying than this is impossible, I don't see the problem

    • @NeatNit · 8 years ago +2

      +quak quak I guess you're right... It just kinda sickens me that they have to pretend to be excited and genuinely interested in Audible when it's obvious that they're paid to say it.

    • @trucid2 · 8 years ago +11

      +NeatNit It bothers you when others make money from their work? Should they be working for free for your sake?

    • @NeatNit · 8 years ago +2

      +trucid2 not what I meant, see my previous reply

    • @AustrianAnarchy · 8 years ago +3

      +NeatNit Maybe they are genuinely excited about the Audible anyway and the sponsorship made them positively giddy?

  • @gen157 · 8 years ago

    Early, but not first. Not that I care, just wanted to tell others about it.
    About the video: a non-English speaker speaking English makes it a little hard when he isn't fluent enough. I understood enough, but you need to untangle the sentence structure a little bit.

    • @cavalrycome · 8 years ago +7

      +Gen15
      I can see why some viewers might have trouble with the speaker's accent, but his grammar is very close to perfect.

    • @JapTut · 8 years ago +2

      +Gen15 and the background noise doesn't help either.

  • @chris24hdez · 8 years ago

    It's called 3D Photogrammetry

    • @serkantan2951 · 4 years ago

      3D reconstruction, I believe, would be a broader term. Aside from mere photographs, there are other ways to measure depth in an image; even though they don't really discuss those methods, they are probably taking cues into account such as shading, illumination, defocus, and texture.

  • @THEMATT222 · 10 months ago

    Noice 👌

  • @Hwyadylaw · 8 years ago +1

    This is just me being a language geek, and not relevant to the video, but I feel like the ones with four rotors should be called Tetracopter instead of Quadrocopter.

    • @Ptolemusa · 8 years ago

      +McDucky Indeed. ^^

    • @TheAllardP · 8 years ago +2

      +McDucky
      It's the same thing. Quad is Latin and tetra is Greek.

    • @Hwyadylaw · 8 years ago +1

      Philippe Allard
      I wouldn't call myself a "language geek" if I didn't know that.
      It's téttares (τέτταρες) or téssares (τέσσαρες) in Ancient Greek. It's quattuor in Latin.

    • @jazzpi · 8 years ago

      +McDucky Why?

    • @antivanti · 8 years ago +3

      +McDucky So your preference for Tetracopter stems not from knowledge of languages but for an arbitrary preference for Greek over Latin? =)

  • @TomMinnick · 8 years ago

    I've heard it called "Photogrammetry" before, but this is the first time I've heard it called "3d reconstruction"

  • @espalorp3286 · 8 years ago +2

    enemy uav spotted

    • @iseslc · 8 years ago

      +Proteus Battlefield player spotted

    • @tisimo123 · 8 years ago +1

      +iseslc or any call of duty game after 4

    • @iseslc · 8 years ago

      tisimo123
      That's right! I haven't played those, though... only BF games.

  • @Diggnuts · 8 years ago +9

    I hate the term "structure from motion". It makes no sense.

    • @manhaxor · 8 years ago

      +Diggnuts Well it's usually used to reconstruct a scene from video. I agree that it's strange to hear it used when the source material is still images.

    • @antivanti · 8 years ago

      +Diggnuts "Parallax 3D reconstruction" might be a more correct term? Or just photogrammetry.

    • @Diggnuts · 8 years ago +3

      Anders Öhlund I prefer photogrammetry as that is quite simply precisely what it is!
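
The "parallax" framing above is apt: for an idealized rectified image pair, the pixel shift (disparity) of a feature between two views is inversely proportional to its depth. A minimal sketch of that relationship, assuming a simple pinhole model rather than anything from the project's software:

```python
# Rectified two-view geometry: depth Z, focal length f (pixels),
# baseline B (metres) and disparity d (pixels) satisfy Z = f * B / d.
def depth_from_disparity(f_px, baseline_m, disparity_px):
    if disparity_px <= 0:
        raise ValueError("non-positive disparity: point at infinity or a bad match")
    return f_px * baseline_m / disparity_px

def disparity_for_depth(f_px, baseline_m, depth_m):
    return f_px * baseline_m / depth_m
```

A larger baseline between shots, or flying closer, both increase disparity, which is the quantitative reason the octocopter must move around an object rather than photograph it from one spot.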

  • @GroovingPict · 8 years ago +1

    "take this images"...

    • @IstasPumaNevada · 8 years ago +1

      +GroovingPict This comment coupled with your other one is pinging my troll-detection algorithms.

    • @Anvilshock · 8 years ago

      +GroovingPict If you do the equivalent of squinting with your ears really hard, you can almost hear that he actually says "these" really fast...

    • @Anvilshock · 8 years ago +1

      GroovingPict Yep, it's the dumbest thing, and - Congratulations - YOU got it! Here's your Golden Dunce Hat prize!

  • @ivesennightfall6779 · 8 years ago

    I saw Ubuntu /o/

  • @jmac217x · 8 years ago

    As soon as he said _intelligence_ I cringed a little. I don't want that to become the next buzzword. An algorithm like that does not equate to intelligence, in my opinion. Everything about this guy seems off to me.