Definitely, not many manage this level of clear and articulate presentation. Check out this guy too if you want more perfectly explained tech stuff: ruclips.net/video/Su9_i1UOz9U/видео.html
Excellent series. Very clear and accurate lectures. These are the best CUDA lectures I have ever seen. Academia, please keep up the good work. You are well qualified to be educators.
I don't usually leave comments, but in this case I can't help it. I followed the entire CUDA video series and I can only thank you for the simplicity and clarity with which you explain the various topics!
2:43 What does it mean by "because it is executed on many data elements, the memory access latency can be hidden with calculations instead of big cache memory"?
I believe it means that the overhead of transferring data (which is much slower than performing a calculation) is compensated for by performing many calculations with the same piece of data (e.g. for a matrix multiplication), as opposed to caching data to get good performance on a variety of instructions, like a CPU does. Here's a good video explaining the tradeoffs: ruclips.net/video/3l10o0DYJXg/видео.html
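To make the latency-hiding idea concrete, here is a minimal illustrative sketch (not from the video): a standard SAXPY kernel. The point is that the GPU launches far more threads than it has cores, so while some warps stall waiting on their global-memory loads, the scheduler runs other warps that are ready to compute, and the memory latency is hidden by thread parallelism rather than by a large cache.

```cuda
// SAXPY: y[i] = a * x[i] + y[i] over n elements.
// One thread per element. While a warp waits on its loads of
// x[i] and y[i] from global memory, the SM's warp scheduler
// switches to other warps whose data has already arrived --
// this is how "memory access latency is hidden with calculations".
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

// Launch with enough blocks to cover all n elements, e.g.:
//   int threads = 256;
//   int blocks  = (n + threads - 1) / threads;
//   saxpy<<<blocks, threads>>>(n, 2.0f, d_x, d_y);
```

Oversubscribing the hardware like this (many more threads than cores) is the normal pattern in CUDA; it only pays off when each loaded element feeds real computation, which is the tradeoff the quoted sentence is describing.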
Extremely good quality! wow
One of the best conceptual overviews of any ML-related topic, not just CUDA. 4 years later - still worth watching.
Props
Too hard to find high-quality content like this these days. Thank you so much
Thank you so much for this series! It's so clear and easy to follow
Sir, please make more detailed sessions on CUDA; your explanation is great.
Many thanks for the lucid explanation.
Extremely helpful! Thank you for the good lecture :)
Excellent, many thanks for your explanations
Great series. Thanks. Subscribed. Cheers
I needed this!!! Thanks a lot, Sir!!!!
Very good introduction! Thanks.
Great explanation! Fascinatingly clear
Amazing, clear, and understandable!
Should I be learning this if I'm 12?
I learned this at 6, so you're late.
@@marcusrosales3344 Best answer ever to this kind of comment
Obviously bro
@@thesakustory6504 I’m 15 now
Hope you did it!! DO NOT WASTE TIME
Typo at 3:41? The slide says that the device is the device plus the host; that should just be for heterogeneous, right?
gold on RUclips :)
Really great tutorials, thanks
Really enjoyed this video!
where can I get the slides? Thanks
I enjoyed it.
Thanks, that helped me a lot
Awesome