You can give high quality content to this world and people like me. Please post such awesome videos regularly.
If possible please teach us the path through which we can understand & implement this knowledge in our life.
Thank you.
Great project and great educational skills !
The views of this vid will blow up very soon! 🙌
Great explanation on an advanced topic of AI. 🤓
Your communication skills are tremendous!
Loved every single bit.
Feedback: when you introduce the aliased image of the monkey and then the corrected version generated from the blurred input, it was impossible to tell the difference without scrubbing back on the timeline. Suggestion: show the aliased and corrected versions side by side for comparison.
Also, the final “thanks for listening” is not really necessary. _We_ are the ones thankful for your work, here.
Nicely explained video. Clear and to the point. Have you considered making educational videos covering computer vision/computational photography? :)
A practical question: how do people figure out the viewing angle and position for a scene that's been captured without that dome of cameras? The dome of cameras makes it easy to know the exact viewing angle and position, but what about just one person with a single camera walking around the scene, taking photos from arbitrary positions? How do you get theta and phi in practice?
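(In practice, people usually run a structure-from-motion tool such as COLMAP over the unordered photos; it estimates a camera-to-world pose matrix for each image, and the viewing angles fall out of that matrix. A minimal sketch of the last step, where the axis convention, i.e. the camera looking down its local -z axis, is an assumption for illustration:)

```python
import numpy as np

def pose_to_angles(c2w):
    """Given a 4x4 camera-to-world matrix, return the viewing
    direction as (theta, phi) in radians: theta is the azimuth and
    phi the elevation of the view direction in world coordinates.

    Assumes the camera looks down its local -z axis (a common but
    not universal convention)."""
    # The camera's local z axis, expressed in world coordinates,
    # is the third column of the rotation block.
    view_dir = -c2w[:3, 2]
    view_dir = view_dir / np.linalg.norm(view_dir)
    x, y, z = view_dir
    theta = np.arctan2(y, x)             # azimuth
    phi = np.arcsin(np.clip(z, -1, 1))   # elevation
    return theta, phi

# Example: a camera oriented so it looks straight along world +x.
R = np.array([[0.0, 0.0, -1.0],
              [-1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])  # columns are camera x, y, z axes in world
c2w = np.eye(4)
c2w[:3, :3] = R
print(pose_to_angles(c2w))  # (0.0, 0.0): looking along +x, level horizon
```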
Positional Encoding was just like black magic to me that just works. Now you've introduced Integrated Positional Encoding, which is black magic on top of black magic. How do you guys understand what is happening here?
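(One way to demystify it: integrated positional encoding is just the expected value of the ordinary sin/cos features when the input point is replaced by a Gaussian over a cone section; the expectation has a closed form that damps high frequencies by exp(-0.5 * freq^2 * var), which is exactly the anti-aliasing. A toy one-dimensional sketch, where the frequency count and the omission of the usual pi factor are simplifications:)

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """NeRF-style encoding of a scalar coordinate x:
    [sin(2^k x), cos(2^k x)] for k = 0 .. num_freqs-1."""
    freqs = 2.0 ** np.arange(num_freqs)  # 1, 2, 4, 8
    return np.concatenate([np.sin(x * freqs), np.cos(x * freqs)])

def integrated_positional_encoding(mu, var, num_freqs=4):
    """mip-NeRF-style encoding: the expected sin/cos features for
    x ~ N(mu, var). High frequencies with large variance are
    smoothly scaled toward zero, so a "blurry" (wide) Gaussian
    simply stops exciting the frequencies it cannot resolve."""
    freqs = 2.0 ** np.arange(num_freqs)
    damping = np.exp(-0.5 * (freqs ** 2) * var)
    return np.concatenate([damping * np.sin(mu * freqs),
                           damping * np.cos(mu * freqs)])

# With zero variance the Gaussian is a point, and IPE reduces
# exactly to the plain positional encoding:
print(np.allclose(integrated_positional_encoding(0.7, 0.0),
                  positional_encoding(0.7)))  # True
```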
Late to the party, but incredible work!
Does changing the activation function to SIREN help at all in larger NeRF networks? In my little Colab experiments it seems to train fast, but it might not scale. Also, I'm interested to see whether a modern Hopfield network could represent the NeRF data well too.
So far I haven't seen any results where SIREN improves NeRF's test-set performance. SIREN primarily targets quickly minimizing training loss (and does a great job at it!) but doesn't really focus on generalization, and performance in NeRF is largely determined by how well the model generalizes to new views.
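(For anyone wanting to reproduce those Colab experiments: the core change is tiny; replace the ReLU with a sine activation and use SIREN's initialization recipe so activations stay well-distributed through depth. A minimal NumPy sketch of one layer, where the layer sizes are arbitrary and w0 = 30 follows the SIREN paper's default:)

```python
import numpy as np

class SirenLayer:
    """One SIREN layer: y = sin(w0 * (W x + b)).

    Hidden layers use weights drawn from
    U(-sqrt(6/in_dim)/w0, sqrt(6/in_dim)/w0); the first layer uses
    U(-1/in_dim, 1/in_dim), per the SIREN initialization scheme.
    """
    def __init__(self, in_dim, out_dim, w0=30.0, is_first=False, seed=0):
        rng = np.random.default_rng(seed)
        bound = 1.0 / in_dim if is_first else np.sqrt(6.0 / in_dim) / w0
        self.W = rng.uniform(-bound, bound, size=(out_dim, in_dim))
        self.b = np.zeros(out_dim)
        self.w0 = w0

    def __call__(self, x):
        return np.sin(self.w0 * (self.W @ x + self.b))

# A 3D input (e.g. a spatial coordinate) through an 8-unit layer;
# outputs are bounded in [-1, 1] by the sine.
layer = SirenLayer(3, 8, is_first=True)
out = layer(np.array([0.1, -0.2, 0.3]))
print(out.shape)  # (8,)
```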
amazing
🧐🤟
I don't want to disparage this or any of the other work on NeRF, but I don't understand where the innovation is. I get the very strong impression it is just rehashing work that has already been done in graphics, and long ago at that. What is the compelling improvement that neural networks bring, other than novelty, over prior work? We've been able to generate new camera angles and volumetric reconstructions from still photography for literally decades.
I think an important distinction is that NeRF lets you construct models from images.
'We’ve been able to generate new camera angles, and volumetric reconstructions from still photography for literally decades.'
Yes, prior work exists in this area. Have you actually looked at any of the comparisons in the NeRF papers? Do you feel they make an unfair comparison? They seem like very compelling improvements over the state of the art to me.
zoom. enhance.
Zoom. Enhance.
ZOOM! ENHANCE!!!!