This guy is like the HITMAN of NVIDIA, he just simply murders all its competitors 🥶
I feel like Cuda has been demystified. Very glad I found your series.
amazing and it is everything someone that wants to learn the basics ever needs. I am a true believer that the most important thing is to get a grasp of the intuition and then slowly try to dive deeper into any topic
now that I again watched it... I have no words to say more than FANTASTIC .. clarity, knowledge, and everything else ..
Subscribed ! Excellent work.
This had LTT / LMG levels of production value with one of the best / clearest explanations for what CUDA is and why it matters.
Anyhow, thank you for your comment! I'll definitley talk about it in due time 😉 ...it's just too soon, it's only the first episode of this parallel computing series
By the way, I'm premiering the next episode in this parallel computing series in 45 minutes - come say hi! 😁
I absolutely agree about the cooling! it's a key component, especially in overclocked systems. a good air circulation will ensure your hardware lasts for much longer!
Padding in Magical. Awesome explanation!!!
I use CUDA with HFSS - a numerically intensive electromagnetic solver (solves/satisfies Maxwell’s equations in 3D space).
Super clear explanation Ahmad, great video, thank you!
Unbelievably clear video. Thanks 🙏
Ahmad thank you, my 8yo daughter is very happy with your channel, the graphics help, and representation, it matters.
This would be two reads with no bank conflicts. Threads would all read from consecutive banks to get the first value, then they'd broadcast from bank 0. The two values would be read into registers; operations are not performed in any memory other than registers. The code looks like it's requesting two values from shared memory at once, but it's not. The PTX would show that the two reads are separate instructions.
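The bank mapping this comment describes can be sketched in a few lines. This is a minimal illustration, assuming the common configuration of 32 banks of 4-byte words and a warp of 32 threads; the access pattern (consecutive floats, then a shared word at offset 0) is the one the comment talks about, not code from the video:

```python
# Sketch of shared-memory bank mapping (assumption: 32 banks, 4-byte words).
NUM_BANKS = 32
WORD_BYTES = 4

def bank_of(byte_address: int) -> int:
    """Bank that the 4-byte word at this address maps to."""
    return (byte_address // WORD_BYTES) % NUM_BANKS

# First read: 32 threads each load a consecutive float -> 32 distinct banks,
# so the warp is serviced in one transaction with no conflicts.
first_read = [bank_of(tid * WORD_BYTES) for tid in range(32)]
assert sorted(first_read) == list(range(32))

# Second read: all 32 threads load the same word at offset 0 -> one bank,
# which the hardware services as a broadcast rather than a 32-way conflict.
second_read = {bank_of(0) for _ in range(32)}
assert second_read == {0}
```

A conflict only arises when different threads hit *different words* in the same bank; many threads reading the *same* word is the broadcast case.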
Great video! Please continue with this series.
It's very informative and a good intro to CUDA programming. Thanks very much!
Thanks for watching, waiting and commenting!
Great presentation on GPU architecture, performance tradeoffs and considerations.
This is really helpful for my computing. Thank you.
Thank you so much, glad you liked it!! 😃
I'm in love with your computer specs now: an i9-12900K with an RTX 3090, damn, that's an absolute beast of a PC.
Thank you so much! Glad you liked it! 😃
Thank you for the response!!! I am also a newcomer to robotics))) But I love OpenCV, and PCL looks promising (I'd been working on PCL, but without the MS library). ROS folks use Python a lot! So take a rest and look into the topic, please!!!
Very nice introduction. Using additional software like GPU-Z while processing data (e.g., training a neural network), you can check your GPU load and temperature. When processing big chunks of data for a long time (days), check that your GPU doesn't exceed its maximum junction temperature (in my case it was 100°C). Another thing: for PCs it is really important to have good coolers and notable spatial separation between the CPU and GPU (GPUs tolerate high temperatures better than CPUs).
The software works with CUDA. I think their competitor that makes CST also uses video card memory for ultra-fast calculations on insanely large matrices (matrix inversion).
Excellent detail! Thanks for the upload
As a graphics engineer, I recommend you use OptiX if you do not need any rasterization features. DirectX or Vulkan is way too verbose and complex for personal projects.
Just what I needed! Thanks!
Awesome video, I found it quite useful. There are two minor errors: first, when you initialize the max (d_max) to 0, it will return that zero if all the numbers in the array are less than 0; it should be initialized to some element of the array. Second, the kernel should also initialize temp with a value from the array (temp = array[0]);
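The bug this comment reports can be shown with a plain sequential sketch (Python stands in for the CUDA kernel here, and the sample array is made up for illustration):

```python
def max_buggy(array):
    # Initializing the running max to 0 silently fails
    # whenever every element is negative.
    m = 0
    for x in array:
        m = max(m, x)
    return m

def max_fixed(array):
    # Initialize with an element of the array instead,
    # as the comment suggests (temp = array[0]).
    m = array[0]
    for x in array:
        m = max(m, x)
    return m

data = [-7, -3, -12]          # all-negative example
assert max_buggy(data) == 0   # wrong: 0 isn't even in the array
assert max_fixed(data) == -3  # correct maximum
```

The same reasoning applies per thread in the kernel: any seed value that might not be dominated by a real element (like 0, instead of array[0] or -infinity) can leak into the result.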
Thank you for posting this, it helps a lot!
This was really good. Thanks for posting this!
This is great! Shall continue to support you on YouTube. It was simple and actually clearer than any other tutorial.
GPU and CPU are both good in their own way! Merry Christmas ☃️! The video is really good, you did well!
Perfect video! It was revealing to me to understand how it works. Thank you! I am a new subscriber of your channel. Regards from Buenos Aires, Argentina
Thank you so much!! Will do! 😃😃😃
And here it's 13 today, cold, a cold winter; the country is Cyprus, I've lived here for two years now. For a programmer you're very pretty, a real geek's dream :) Found your video about scraping :) Then I subscribed and kept watching :)
That is not a computer, that is a god-level beast. 3090 + i9, crazy!! Great video
Excellent video! thank you Mariya 🌷 ❤️
wanted to comment that the information in this presentation is very well structured and the flow is excellent.
Congrats on finishing the course!! 🥳🥳🥳 I hope you had lots of fun!!
Wow, that flicker is really really cool! :P Awesome tutorials, thank you for making these tutorials.
Ahmad Bazzi ! Thank so much! You're the best
Awesome explanation!! 👏🏼👏🏼👏🏼
Wow!!! Thank you for sharing.
Great tutorial! :)
Thank you Sarwar! I'll boost the volume on future tutorials, thanks for letting me know 😃
Thanks a lot really got me started .
Looking forward to CUDA 13.0 🚀😍
I might do an OpenCL vs CUDA speed test in the future, sounds like a fun project!
VERY helpful, thank you!!!!
Nice video for beginners, I will point my students to your channel.
All I ever wanted to know about CUDA striding and kernels.
Merry Christmas Ali!! You too! 😀😀😀
We will not only use CUDA, but we will also use something called TensorRT - which is accelerating the prediction process in particular.
11:00 you are watching a Master at work
Great tutorial as usual thanks!
Hyper-V + RemoteFX + CUDA server = perfection, u .u
But if you have a lot of tasks that depend on each other finishing in sequence, or complicated code with a lot of branching, or if you want the hardware to work out for you which parts of the code can run in parallel, then your code will run faster on a CPU than a GPU.
Amiga 1983 yes.
Hey, thanks for explanation! Very well done 👍 I am downloading CUDA 💪
Happy new year and see you soon in a brand new tutorial! 🥳🥂🎆
Very informative, and the visuals helped me comprehend it. I didn't purchase a GPU card last time due to fear of Ubuntu compatibility. I'll get a moderate card in the upcoming days; I think I got a good power source too, but gotta check.
so thank you beast ,, thank you M .... I'm following what you present .. as always..
Thanks for watching and have a great day!
You're right that with x86 we're looking at 1 or 2 threads per core.
You explain it very well
Thanks for letting me know! I got a bunch of folks reporting the same issue, I'll check if I can turn off YouTube's involvement in the comments :)
Thank you so much Ahmad! 😀
Could you share a video regarding the implementation of an image processing algorithm?
GPU-agnostic compute acceleration, i.e., writing software compute shaders that execute on the hardware shader units in GPUs.
I am also going to try to get this to work on my all AMD desktop computer.
Would you like to make a video on building or creating a Single node level task scheduling for deep learning based RLScheduler in spark cluster?
Happy new year and see you in 2022!!! 🥳🎆❄
I'll post an equivalent OpenCL codealong soon! It's very similar to CUDA but it works for all GPUs - not just Nvidias! 😉
It sounds complex, but it actually involves only a few lines of code! (as some very smart folks have already built this model for us and already trained it on the CIFAR dataset... we're just loading it and using it for our own stuff 😉)
What a fabulous girl and how smart you are !
(I might actually film a tutorial on it as well in the future... I want everyone to enjoy this series, not just the folks with Nvidia GPUs)
Great tutorial! 💪🏻
Well, I just built a new rig with a 980 Ti and a 4790K, so I'm gonna put that to the test. Thank you for your wonderful explanation :D
Excellent explanation, keep going with this content man ;)
Thanks, Ahmad! Very informative ))) Could I vote on a topic? Using Python in robotics (such as ROS) seems useful to dig into...
Thank you K.Ballaji Axe! 😀
Thanks for sharing this.
You need to put some soft material under the keyboard or relocate the mic. The noise when you type is similar to good drum and bass music.
P.S. You have inspired me to run my multiprocessing test on my new 5800XT with 12 cores. I haven't tried it over there yet.
thank you. good video!!! it was very helpful
You're absolutely welcome! 😃
Love nvidia jetson orin ❤️
It's true enough for this explanation.
Thanks for the great work!
nice to hear that!
Excellent stuff.
Thank you so much! 😃
That is one beefy computer. The GPU alone is like $2500 right now 1/1/22. Poggers.
Of course, in normal spaghetti code you can't really expect the CPU to be able to do particularly many of those instructions at once.
Thank you! 😃
Please make videos on image processing using CUDA!!
Very helpful, thank you.