Neural Radiance Fields | NeRF in 100 lines of PyTorch code

  • Published: 9 Sep 2024

Comments • 51

  • @zskater1234
    @zskater1234 1 year ago +8

    Just bought your course!! Pretty cool to find someone talking/teaching NeRFs, since LLMs and Diffusion models stormed and got all the attention haha

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago

      Thank you so much! I am glad that you like the content, and I hope you will like the course. Great videos about NeRF will be released soon :)

  • @__karthikkaranth__
    @__karthikkaranth__ 2 months ago +2

    You skipped the coarse/fine logic from the paper. Were you able to get decent results without it?

    • @papersin100linesofcode
      @papersin100linesofcode  2 months ago +1

      Hi, thank you for your question. The results I show at the beginning of the video are without it. To me, these are decent results, although they would be better with the hierarchical volume sampling strategy. I think I will make a video about it in the near future :)
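
To make the coarse/fine discussion above concrete, here is a minimal sketch of the hierarchical volume sampling step from the NeRF paper, assuming a coarse pass has already produced per-sample weights along each ray. The function name and shapes are illustrative, not taken from the video's code:

```python
import torch

def sample_pdf(bins, weights, n_fine):
    """Inverse-transform sampling of fine points from the coarse weights."""
    # bins: [n_rays, n_coarse + 1] bin edges; weights: [n_rays, n_coarse]
    weights = weights + 1e-5                                    # avoid division by zero
    pdf = weights / weights.sum(-1, keepdim=True)               # piecewise-constant PDF
    cdf = torch.cumsum(pdf, -1)
    cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], -1)  # [n_rays, n_coarse + 1]

    u = torch.rand(cdf.shape[0], n_fine, device=cdf.device)     # uniform samples in [0, 1)
    idx = torch.searchsorted(cdf, u, right=True)                # invert the CDF
    below = (idx - 1).clamp(min=0)
    above = idx.clamp(max=cdf.shape[-1] - 1)

    cdf_lo, cdf_hi = torch.gather(cdf, 1, below), torch.gather(cdf, 1, above)
    bin_lo, bin_hi = torch.gather(bins, 1, below), torch.gather(bins, 1, above)
    t = (u - cdf_lo) / (cdf_hi - cdf_lo).clamp(min=1e-5)        # position inside the bin
    return bin_lo + t * (bin_hi - bin_lo)                       # fine sample locations
```

The fine network is then evaluated on the union of the coarse and fine samples, as in the original paper.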

  • @ArrayI0
    @ArrayI0 5 months ago +1

    Thank you very much! I learned the point of NeRF from your video

  • @er-wl9sy
    @er-wl9sy 1 year ago +3

    Awesome. Please keep working in this field

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +1

      Thank you! I have a few upcoming videos related to NeRF, and will produce more if people are interested.

  • @businessplaza6212
    @businessplaza6212 1 year ago +3

    My apologies, I mean the argument "dataset" used in lines 11 to 36 in the test function. Does it take the dataset in the LLFF format from the pkl file? I don't get it, thanks!!

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +1

      Hi, thank you for your question, and excuse me for the delayed answer. Yes, it uses the data from the pkl file, which I generated directly from the NeRF data to make things easier. It can be downloaded from the GitHub link

  • @businessplaza6212
    @businessplaza6212 1 year ago +2

    Does the code you've shared have the variable "dataset" defined? I don't see it. What is the output of the code, a PNG file with the rendered image? Then is it possible to get a mesh? Thanks for your assistance

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +1

      Hi @businessplaza6212, thank you for your question. The GitHub code has several variables named "dataset" in different functions, so I am not sure I understand your first question; could you please rephrase it? Yes, the output is a rendered 2D image. It is possible to get a mesh; I explain how to do it in my course. Otherwise, you may also be interested in this notebook from the initial NeRF paper github.com/bmild/nerf/blob/master/extract_mesh.ipynb.
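
For readers wondering what mesh extraction looks like in code, here is a minimal sketch in the spirit of the extract_mesh.ipynb notebook linked above: query the density on a regular 3D grid, then run marching cubes on it. The `model.density` call, the grid bounds, and the threshold are assumptions to adapt to your own network:

```python
import torch
from skimage import measure  # scikit-image provides marching cubes

@torch.no_grad()
def extract_mesh(model, resolution=128, bound=1.5, sigma_threshold=30.0):
    # Regular grid of 3D points covering the scene.
    xs = torch.linspace(-bound, bound, resolution)
    pts = torch.stack(torch.meshgrid(xs, xs, xs, indexing="ij"), -1).reshape(-1, 3)

    # Query densities in chunks to fit in memory.
    sigma = torch.cat([model.density(chunk).cpu()
                       for chunk in torch.split(pts, 65536)])
    sigma = sigma.reshape(resolution, resolution, resolution)

    # Marching cubes on the density field gives a (coarse, colourless) mesh.
    verts, faces, _, _ = measure.marching_cubes(
        sigma.numpy(), level=sigma_threshold,
        spacing=(2 * bound / (resolution - 1),) * 3)
    return verts - bound, faces  # shift vertices back to scene coordinates
```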

  • @jeffreyalidochair
    @jeffreyalidochair 1 year ago +3

    A practical question: how do people figure out the viewing angle and position for a scene that's been captured without that dome of cameras? The dome of cameras makes it easy to know the exact viewing angle and position, but what about just a dude with one camera walking around the scene taking photos of it from arbitrary positions? How do you get theta and phi in practice?

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +3

      Hi Jeffrey, thank you for your question. In practice, people use COLMAP (an open-source pipeline) to estimate the camera parameters.

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +1

      The camera parameters can also be learned (have a look at my video about NeRF-- if you are interested)

    • @jeffreyalidochair
      @jeffreyalidochair 1 year ago +2

      @@papersin100linesofcode thank you! do MIP-NeRF and Zip-NeRF also use COLMAP?

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +2

      @@jeffreyalidochair Mip-NeRF and Zip-NeRF can be seen as algorithms that take as input pictures together with their camera parameters, which can be estimated in several ways. But yes, for the real data in those papers the camera parameters are specifically estimated with COLMAP
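
As a follow-up to the COLMAP answers above, here is a minimal sketch of how a COLMAP pose turns into a camera position and viewing direction (and hence theta/phi). COLMAP's images.txt stores a world-to-camera rotation as a quaternion (qw, qx, qy, qz) plus a translation (tx, ty, tz); the numeric values below are placeholders:

```python
import numpy as np

def colmap_pose_to_c2w(qw, qx, qy, qz, tx, ty, tz):
    # Quaternion -> world-to-camera rotation matrix.
    R_wc = np.array([
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ])
    t = np.array([tx, ty, tz])
    R_c2w = R_wc.T        # camera-to-world rotation
    center = -R_wc.T @ t  # camera position in world coordinates
    return R_c2w, center

R, center = colmap_pose_to_c2w(1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0)  # placeholder pose
view_dir = R @ np.array([0.0, 0.0, 1.0])      # COLMAP cameras look along +z
theta = np.arctan2(view_dir[1], view_dir[0])  # azimuth
phi = np.arccos(view_dir[2])                  # polar angle
```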

  • @ankanbhattacharyya8805
    @ankanbhattacharyya8805 9 months ago +1

    I understand 10*6 for the pos enc, but why did you add 3 to it? Posencdim*6+3?

    • @papersin100linesofcode
      @papersin100linesofcode  9 months ago +1

      Hi, thank you for your question. This is because we concatenate the position to the positional encoding. This is not mentioned in the paper, but it is done in their implementation (see the sketch just after this thread).

    • @ankanbhattacharyya8805
      @ankanbhattacharyya8805 7 months ago +1

      @@papersin100linesofcode Oh, I understand. Thanks a lot
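
To make the +3 explicit, here is a minimal sketch of the positional encoding as discussed above: sin/cos at L frequencies on each of the 3 coordinates (3 * 2 * L values), with the raw position concatenated as well, giving input dimension 3 + 6L (63 for L = 10). Names are illustrative:

```python
import torch

def positional_encoding(x, L=10):
    # x: [batch, 3] positions (or directions)
    out = [x]                        # the extra "+3": the raw input is kept
    for j in range(L):
        out.append(torch.sin(2 ** j * x))
        out.append(torch.cos(2 ** j * x))
    return torch.cat(out, dim=-1)    # [batch, 3 + 3 * 2 * L]

x = torch.rand(4, 3)
print(positional_encoding(x, L=10).shape)  # torch.Size([4, 63])
```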

  • @aditya-bl5xh
    @aditya-bl5xh 11 months ago +1

    Hey, I have a small question. NeRF takes a 5D input, position and view direction; is there a way to get the view direction from a rotation matrix (3x3)?

    • @papersin100linesofcode
      @papersin100linesofcode  11 months ago +1

      Hi, thank you for your question. Do you mean the camera-to-world matrix (c2w)? If so, yes; in fact, the direction is usually already computed from it. The direction is computed from the camera, using its 3x3 c2w rotation matrix

    • @aditya-bl5xh
      @aditya-bl5xh 11 months ago +1

      @@papersin100linesofcode Yeah, can you please tell me the formula that is used to get them?

    • @papersin100linesofcode
      @papersin100linesofcode  11 months ago

      @@aditya-bl5xh you may be interested in this script github.com/kwea123/nerf_pl/blob/master/datasets/ray_utils.py (a small sketch follows this thread). I will soon make a video about it

    • @aditya-bl5xh
      @aditya-bl5xh 11 months ago +1

      @@papersin100linesofcode thanks! Appreciated
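
For reference, here is a minimal sketch of the usual formula, in the spirit of the ray_utils.py script linked above: per-pixel directions are built from the intrinsics (focal length, image size W x H) and rotated into the world frame by the 3x3 rotation part of the c2w matrix. An OpenGL-style camera (x right, y up, looking along -z) is assumed here:

```python
import torch

def get_rays(H, W, focal, c2w):
    # c2w: [3, 4] camera-to-world matrix (rotation | translation)
    # Pixel grid -> directions in the camera frame.
    j, i = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    dirs = torch.stack([(i - W * 0.5) / focal,
                        -(j - H * 0.5) / focal,
                        -torch.ones_like(i)], dim=-1)  # [H, W, 3]
    # Rotate into the world frame: only the 3x3 rotation is needed for directions.
    rays_d = dirs @ c2w[:3, :3].T                      # [H, W, 3]
    rays_o = c2w[:3, 3].expand(rays_d.shape)           # ray origins = camera center
    return rays_o, rays_d / rays_d.norm(dim=-1, keepdim=True)
```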

  • @BenignVishal
    @BenignVishal 5 months ago +1

    I am planning to buy your course, but will I be able to generate the mesh from the capture?

    • @papersin100linesofcode
      @papersin100linesofcode  4 months ago

      Hi, thank you for your question. Unfortunately, not in high quality. We discuss the ray marching algorithm and use it to extract a mesh from NeRF. However, the mesh is not high quality and does not have colours. If you want a coarse mesh, that is fine, but if you have high expectations for the quality of the mesh and need colours, then you would need more advanced algorithms than the ones used in the course.

  • @UncleChrisTs
    @UncleChrisTs 1 year ago +2

    Great video thank you very much!

  • @businessplaza6212
    @businessplaza6212 1 year ago +2

    Thank you for your fast reply! Your work is great! I'm wondering about the “dataset” variable that you use in line 106. But where is it defined? Could you clarify, please? I will buy your course, as I'm working on a NeRF thesis for my MSc in ML. Did you merge the transforms JSON file from COLMAP into a pkl file?

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +1

      Hi, I am so sorry I forgot to answer. Most questions are already answered in other comments. Do you still need clarifications?

  • @eliezershlomi3224
    @eliezershlomi3224 9 months ago +1

    How would you add the coarse and fine networks improvement?

    • @papersin100linesofcode
      @papersin100linesofcode  9 months ago

      Hi, thank you for your comment. I am planning to add a video about it. I hope I can release it in the near future

    • @eliezershlomi3224
      @eliezershlomi3224 9 months ago

      @@papersin100linesofcode I subscribed, thank you

  • @machinelearnernp4438
    @machinelearnernp4438 1 year ago +2

    Does it generate the sample in 16 epochs?

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +2

      Thank you for your question. The model is trained for 16 epochs, and then it can be used for rendering

    • @nettyyyys
      @nettyyyys 1 year ago +1

      @@papersin100linesofcode I have tried it, and is it normal that it generates white images at the beginning? Also, why do you set the last delta as almost infinite? Besides, I think that using this makes the weights always sum to 1, so the last regularization makes no sense... Correct me if I am wrong!

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +1

      @LearningEnglish Do the images remain white with more training? The deltas are the distances to the following samples, so for the last sample the distance to the next one is, in theory, infinity. We take the exponential of the negative of delta, which does not lead to exploding values.
      I hope this is clear. If not, do not hesitate to ask me questions
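
A minimal sketch of the accumulation being discussed may help: the last delta is set to a huge value, which pushes the last alpha toward 1 wherever there is density, so the weights along a ray sum to (at most) 1. Variable names are illustrative:

```python
import torch

def render_weights(sigma, t_vals):
    # sigma: [n_rays, n_samples] predicted densities
    # t_vals: [n_rays, n_samples] sample distances along each ray
    deltas = t_vals[..., 1:] - t_vals[..., :-1]
    # The "almost inf" last delta: exp(-sigma * 1e10) underflows to 0, it never explodes.
    deltas = torch.cat([deltas, 1e10 * torch.ones_like(deltas[..., :1])], -1)

    alpha = 1.0 - torch.exp(-sigma * deltas)   # opacity of each sample
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[..., :1]), trans[..., :-1]], -1)
    weights = alpha * trans                    # sums to <= 1 along each ray
    return weights  # colour = (weights[..., None] * rgb).sum(dim=-2)
```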

  • @thomascole2933
    @thomascole2933 9 months ago

    Absolutely great video! It really helped clear up the paper, seeing things implemented so straightforwardly. I have a few questions. What type of GPU did you use to train this model? When creating the encoding, you initialize your out variable with the position vector placed in it (making the output [batch, ((3 * 2) * embedding_pos_dim) + 3], adding that trailing +3). Was there a reason for doing that? I mean, adding it surely doesn't hurt. Batching the image creation is also a great idea for smaller GPUs. Thanks again for such a great video!

    • @papersin100linesofcode
      @papersin100linesofcode  9 months ago

      Hi, thank you for your great comment!
      1) I must have used a P5000 or an RTX5000.
      2) I am not sure I understand which line you are referring to?

  • @SeungLabFx
    @SeungLabFx 1 year ago +3

    This is so nice. Just bought your course

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +4

      Thank you so much! You can download it here drive.google.com/drive/folders/18bwm-RiHETRCS5yD9G00seFIcrJHIvD-?usp=sharing. You will understand in the course how it was generated :)

  • @aditya-bl5xh
    @aditya-bl5xh 1 year ago +2

    Can you explain the PyTorch implementation of Mip-NeRF or Zip-NeRF? The GitHub repos are very hard to understand

  • @rebellioussunshine1819
    @rebellioussunshine1819 8 months ago

    Great video! Could you tell me approximately how long it took for the model to train?

  • @vamsinadh100
    @vamsinadh100 1 year ago +1

    Can you share a link to the dataset?

  • @YuRan-un8yj
    @YuRan-un8yj 1 year ago +2

    Great video! Can I get the dataset?

    • @papersin100linesofcode
      @papersin100linesofcode  1 year ago +1

      Thank you for your comment! You should have access to the data now; excuse me for the delay. I have removed the authorization requirement, so anyone can access it directly from now on