A great review of the Jetson Orin, a supercomputer that amazes with its power and speed, which make it one of the best-performing computers. It is a product of the new generation of AI computers, among the world's most powerful, especially the high-performance, energy-efficient Nvidia Jetson AGX Orin Kit, which allows the development of advanced AI and robotics applications.
WOW, compact but very powerful. Here I am barely keeping up with the 6th generation of processors, and you already have this beauty of innovation. You have to fight to get your hands on one of these machines; if I hadn't come across your channel, I would never have imagined how far technology has come.
In the late '80s I was dabbling in AI and neural nets, until it became clear it was going to be quite some time before these became practical (due to CPU and memory resource constraints). And I've been investigating small computers for security on this Relentless Homestead, so I'm really looking forward to your videos. You are precise and do a great job of selecting what to discuss, and what not to, in a given episode. Thanks.
You're a very good teacher. I'm a noob and I understood everything and learned a lot. I went from not knowing what a Jetson Nano was to learning about parallel computing and building supercomputers.
This is the most relevant information that I have heard today. This computer is amazing, and I can see that it can handle a lot of tasks and applications. Congrats.
I bought a Nano when it came out, and I was thinking of an NX when it came out; then it got hard to find and went up in price, and I didn't think about Jetson for a year or so. Now I'm coming back to play with them and stumbled across this... this thing is a BEAST. $2k is a hard pill to swallow, but it looks so worth it.
A great review of the Jetson Orin, a supercomputer that surprises with its capacity and speed, which make it one of the best-performing computers. This is a product of the new generation of AI computers, the most powerful in the world, especially the high-performance, energy-efficient Nvidia Jetson AGX Orin kit, which allows the development of advanced AI and robotics applications.
One more comment. For viewers interested in parallel computing, I highly recommend Open MPI as the Message Passing Interface implementation to use, as it is open source, actively developed, and easy to work with.
Wow wow wow, this is a fantastic Jetson Nano. I think you should run a dedicated series of lessons on using the Jetson Nano in various robotics projects with motors, sensors, etc., the same way you did with Arduino.
Greetings from near Albuquerque, New Mexico, USA. Thanks for all you do to bring various computing concepts, hardware, and software to your viewers. I want to leave a few comments about this video on Build Your Own GPU Accelerated Supercomputer.
The information shared in the video is truly exciting; I just can't stop thinking about how far human intelligence will go, and there is no limit to invention. Great work, and I'm totally in awe of how machines have taken over our lives: we can't live without them anymore, or rather, they will only make us live better. Thanks, mate.
Maxwell has a throughput per SM per clock cycle of 128 FP32 multiplies/adds. So the required four FP32 multiplies per FIS1I operation means you get 32 quad-multiplies per clock cycle. The required single FP32 subtract per FIS1I means you get 128 subtracts per clock cycle.
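The arithmetic in that comment can be checked in a couple of lines; the 128-ops-per-SM-per-clock figure below is taken from the comment itself, not independently verified here.

```python
# Sanity check of the throughput arithmetic: Maxwell is taken (per the comment)
# to issue 128 FP32 multiply/add operations per SM per clock cycle.
FP32_OPS_PER_SM_PER_CLOCK = 128
MULS_PER_FIS1I = 4  # four FP32 multiplies per fast-inverse-square-root op
SUBS_PER_FIS1I = 1  # one FP32 subtract per op

quad_multiplies_per_clock = FP32_OPS_PER_SM_PER_CLOCK // MULS_PER_FIS1I
subtracts_per_clock = FP32_OPS_PER_SM_PER_CLOCK // SUBS_PER_FIS1I

print(quad_multiplies_per_clock)  # 32
print(subtracts_per_clock)        # 128
```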
Great video and I learned a lot even after using my Jetsons for a few years. The heat sink never got too warm but I tried a Noctua NF-A4x20 5V PWM 40x20mm premium fan with good results. Today I installed an ICE Tower Cooling fan that uses heat pipes that work effectively even without the silent fan running. Nice to see you too using an anti-static wrist strap, very professional!
The Jetson is very versatile. The fact that it has a breakout board, or no WiFi or Bluetooth, is neither here nor there. You can always get a dongle for either of those, and it can be (and usually is) used in much the same capacity. Being a little bulkier or needing a dongle doesn't detract from the GPU performance, which blows the Pi or Rockchip out of the water.
On a somewhat related note, the Ampere (Tesla) A100 does not fall under this rule; its CUDA count is the full CUDA count of the card, with an equal number of dedicated INT cores. Nvidia probably wanted to avoid lawsuits from supercomputer manufacturers if they ever found that performance halved for their customers when the GPU ran INT at the same time as FP32.
So in the 2 clock cycles it takes for the Quake FIS1I to perform 32 approximate inverse square roots, the SFUs will be able to perform 64 reciprocal square roots (with results which are also more accurate than the one-iteration Quake results).
Basically: love this device, don't love JetPack. If we can run other distros with little or no compromise, it's a winner. I suspect mainline support is not in current versions of mainstream distros, but perhaps it will be in the next few months; so if you could test that, and also check that the machine-learning stuff still works, it would be fantastic.
Jetson AGX Orin powers Nvidia’s Clara Holoscan, a new platform for the health care industry that allows developers to build software-defined medical devices that run low-latency streaming applications on the edge.
So we need 2 clock cycles to complete the 32 Quake FIS1I operations (since in actuality operations are completed in whole rather than fractional clock cycles). A double-iteration Quake version like you mentioned would be even slower, since it does the last line of code twice.
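For readers who haven't seen it, the Quake fast inverse square root being discussed can be sketched in Python; the bit-level trick and the 0x5F3759DF constant are the classic ones, and the `iterations` parameter shows how the double-iteration variant just repeats the final Newton step.

```python
import struct

def quake_invsqrt(x, iterations=1):
    """Classic Quake III fast inverse square root, reproduced in Python.
    `iterations` controls how many Newton refinement steps run (the
    'last line of code' of the original)."""
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    i = 0x5F3759DF - (i >> 1)            # the famous magic-constant guess
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)  # one Newton iteration: 4 muls, 1 sub
    return y
```

Note the Newton step is exactly the four multiplies and one subtract counted in the throughput comment above, and a second iteration tightens the result toward the true 1/sqrt(x).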
Hey Ahmad, I really loved this video of yours; it shows how computers are becoming tiny, powerful, and cheap. Since you are the only YouTuber I know who has tried to build a cluster of Nvidia Jetsons, I have become an instant fan of your channel. I am a student at a research institute and I do a lot of molecular dynamics simulation in biology, so could you make a video where you run a GROMACS simulation on an Nvidia Jetson cluster vs. a Raspberry Pi cluster? This would be very interesting to see, because mainstream computer hardware for such applications is costly, and students and researchers can't afford it. If it shows results similar to a $1000 computer, it brings hope of affordable hardware for many researchers in the future and could have a great impact on society. The reason I am asking you to make such a video is that you already have access to the hardware.
I think it is very interesting; especially if the operations are simple enough (like monitoring IO voltages for temperature sensors and the like), you could reduce downtime by segmenting the process over 128 cores instead of 4.
Also, why not use a bunch of single boards meshed together with fibre instead? Perhaps a redone version using Gen-Z or something close? I'd love to see your project results.
In other words, say you want your robot to be able to recognize your friends and pick them out by name. You're going to need to write the main program to do that, and just access the Jetson AI face recognition software to let your program know who the robot is looking at.
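That division of labor can be sketched as follows; every name here is a hypothetical placeholder, not a real Jetson API. The main program owns the control flow and simply queries a recognizer supplied by whatever face-recognition library you use.

```python
def greet_friend(frame, recognize):
    """Main-program side of the split: `recognize` is a callable supplied
    by your face-recognition library, returning a name string or None."""
    name = recognize(frame)
    if name is not None:
        return f"Hello, {name}!"
    return "I don't know this person yet."

# Usage with a stand-in recognizer in place of the real library:
print(greet_friend("camera_frame", lambda f: "Sam"))  # Hello, Sam!
```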
When you take your square root problem and divide it into smaller and smaller but more numerous parts, that is called 'strong scaling' of a numerical problem. This implies that the problem size on each compute node becomes smaller and smaller. Eventually, if the problem continues to be broken up into smaller and smaller pieces, what happens is the communication time from compute node to compute node imposed by the message passing interface (MPI) becomes dominant over the compute time on each node. When this happens, the efficiency of parallel computing can be really low. My point here is that your video shows that double the compute nodes and you halve the compute time. That scaling will happen at first but cannot be continued ad infinitum.
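The effect described above can be illustrated with a toy timing model (the cost constants below are made up purely for illustration): compute time shrinks as 1/N while per-node communication grows with N, so speedup saturates and then stalls.

```python
def parallel_time(total_flops, nodes, flop_time=1e-9, msg_cost=1e-3):
    """Toy strong-scaling model: compute shrinks with node count,
    MPI-style communication grows with it."""
    compute = total_flops * flop_time / nodes
    comm = msg_cost * (nodes - 1)  # e.g. a gather touching every node
    return compute + comm

serial = parallel_time(1e9, 1)            # 1.0 s of pure compute
print(serial / parallel_time(1e9, 2))     # ~2x speedup: scaling looks perfect
print(serial / parallel_time(1e9, 1000))  # ~1x: communication now dominates
```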
Great video. One question: can you elaborate a little more, on the GitHub, about commSize? I just didn't know how to set it as an argument. Thanks again for the video.
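A note that may help (based on how MPI normally works, not on the video's specific repository): `commSize` is usually not a program argument at all; it is the number of processes you launch, e.g. `mpirun -np 4 ./program`, and the code reads it via `MPI_Comm_size`. Each rank then derives its own slice of the work, which can be sketched in plain Python:

```python
def local_range(rank, comm_size, n):
    """Half-open [start, stop) slice of an n-element problem owned by `rank`,
    distributing any remainder over the lowest-numbered ranks."""
    base, extra = divmod(n, comm_size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

# With commSize = 4 and n = 10, the ranks cover the whole problem exactly once:
print([local_range(r, 4, 10) for r in range(4)])
# [(0, 3), (3, 6), (6, 8), (8, 10)]
```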
Best video on the Jetson Orin. Thoroughly explained. Well done.
The Jetson AGX Orin is extremely good, and well designed by NVIDIA. It is really an amazing and promising step.
The most powerful next-generation AI supercomputer I have ever seen. Nvidia has done great work on this extremely capable machine.
It is amazing that a computer that small has that much hardware power; this is a pretty nice video.
Thank you for the demonstration and showcasing, exactly what I was looking for..
Super amazed at how much technology is taking over!! The Jetson Orin is so powerful and literally the future.
Damn first time I see you in action 😍 Are these publicly available to the masses ?
Wow. Your channel is incredible. One of the best I've found so far.
I loved your office and the benchmark. It's super organized.
Awesome review!!! Insane how far technology has come; every day I am more and more amazed by it!! Price permitting, it's definitely going to be a hit!
Wow, it's an amazing video; you always come with a surprise. We need more videos of this type, thanks.
The Jetson AGX Orin is extremely good, and well designed by NVIDIA. It is really an amazing and promising step. Very good, Ahmad Bazzi.
So fascinating. Wow. Thank you all. And the producer.
I really like the way you explain; thanks for your effort.
Great job Ahmad! So fun to watch
Indeed it does. I demo one of the course modules during my review of the Jetson Nano 2GB.
Dang, you're always ahead of the curve!
Thanks for an excellent video on the Jetson Nano, Ahmad.
Excellent tutorial sir ...
Underrated AF ! 🔥
This is a really good introduction, thank you very much!
Ahmad, you should create a bunch of robot tutorials using the Jetson Nano. Keep up the fantastic work!
Hey Ahmad, thanks for this video. Awesome!
Thanks a lot! Perfectly installed and done. I installed a fan as you said; it really does get heated up! Thanks.
Excellent!! Can't wait to see how you use the Jetson Nano. I hope you discuss the battery configuration for the DB1 soon.
Can't wait for more Jetson Nano videos from you.
We can note the seriousness and the prudence with the anti-ESD wrist strap...
Nice video. Please make a video on object recognition using the Jetson Nano!
This was extremely helpful and informative. Your workbench is an inspiration, sir; I am definitely borrowing from your formula. I hope you don't mind.
It'd be even better if I could pay attention with better sound quality; I'd love to see your next videos with a new mic.
I recall the original IBM RISC System/6000 (POWER) used five chips linked by a 128-bit bus.
Well done. Thank you very much. I learned a lot.
Yes, a video on Amdahl's law, please!
Ahmad, thank you so much :)
Informative video
I was going to comment that I wish he had shown the Mandelbrot example from the list, especially instead of the 'random fog', which is visually unexciting.
Wow, this is super amazing. I don't really understand what it is, but it looks really cool.
Thank you for the awesome video!!
I need to implement this!
Wow, got nano here in India today!
Wow, this is an amazing video. It is really helpful. This is the best video on the Jetson Orin.
I hope they release another product in the Jetson Nano price class.
All of the above (except AV1 encoding), would be interesting. :)
Really, this video is amazing. Good job.
Excellent stuff!!
GOOD MORNING PROFESSOR!
Enjoyed this video. Any update on when another Nano video will be released?
Nvidia has some kick-ass courses on AI and machine learning too. Some are even free!
Wow so much power 💪🏽
What Ahmad fails to mention is that Jetson boards are 23x more expensive per CUDA core than RTX 3000-series GPU cards.
We just want this in the next Shield Pro...
I would be interested in running Proxmox on this, and perhaps Home Assistant.
The code needed to do this is most interesting. Can you make it available?
You say you are there to help people, but when I ask you for help you don't answer. Thank you from South Africa.
Nice video, thanks!
It can train models, but it will be slow. Ideally you want to train on a big GPU like a 3090, then convert to TensorRT for inference on a Jetson.
It would be nice to see this on Apple M1
Good show. For other explorations >>>===> Any and all!
once again thank you so much
Man's greatest achievement was working out how to do math faster than his mind would let him!!!
The power supply has to be at least 2A. It's also very important for the SD card to be rated at minimum UHS-1.
Maybe, with a Herculean effort, the timing issues could be resolved and you could run video games on it.
It would be interesting to see how this Orin SoC compares to the Apple M1.
Thanks for your information...
this guy's voice is insane
You can get the Nvidia Elroy as well which is even smaller.
Almost certainly you could encode a video in parallel, if you have the right software to do that.
Thanks for the excellent introduction! May I ask what brand of storage system you use for your screws and fixings?
Like number 3 from me! Hello, greetings from Istanbul, Turkey. I wish you success, mashallah!
Would there be a way to know the GPU/CPU clocks in the 15W mode?
Yes, yes, yes, please make a video about Amdahl's Law!
Set a threshold value (for the right face) in a variable, and write a small condition based on that value. This will solve your issue :)
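That suggestion can be sketched as a tiny helper; the 0.6 threshold is an arbitrary placeholder you would tune on your own data.

```python
def best_match(scores, threshold=0.6):
    """scores: mapping of name -> similarity in [0, 1].
    Return the best-scoring name if it clears the threshold, else None."""
    name, score = max(scores.items(), key=lambda kv: kv[1])
    return name if score >= threshold else None

print(best_match({"alice": 0.91, "bob": 0.42}))  # alice
print(best_match({"bob": 0.42}))                 # None
```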
Great job, keep going!
Perfect video
I’d like to see it handle a jupyterlab server!