A noise reduction filter could help the sound. These videos are worthy of gold.
Best tutorial I watched!
Your tutorials are fantastic! Please keep them coming!
Love this series of tutorials! Would you mind uploading the slides for the tutorials after number 8 to the website? I am dying for them. Anyway, thanks very much!
Please make videos on image processing using CUDA!
Great tutorial. The only suggestion I would offer is to rely less on slides with text only and to include more graphs and diagrams as it makes the learning easier for your followers, particularly around topics as sensitive as the memory model.
I have a question.
Even without __syncthreads, wouldn't a total of 2 be added to the value of i by the first and second threads, regardless of the order?
Why can't we guarantee that the result will be 2?
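A possible answer, sketched without knowing the exact code from the video: "regardless of the order" only holds if each increment is one indivisible step. A plain increment is a read, an add, and a write, so both threads can read the same old value and both write old + 1, which loses one update. The kernel names and the `run` helper below are made up for illustration; `atomicAdd` is the standard CUDA intrinsic that makes the read-modify-write indivisible.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Racy version: a plain read-modify-write on one shared location.
__global__ void racy_increment(int *i) {
    int old = *i;   // two threads may both read the same value here
    *i = old + 1;   // and then both write old + 1, so one update is lost
}

// Atomic version: the read-modify-write happens as one indivisible step.
__global__ void atomic_increment(int *i) {
    atomicAdd(i, 1);
}

// Hypothetical host helper: start from 0, launch one block of num_threads
// threads running the given kernel, and return the final value.
int run(void (*kernel)(int *), int num_threads) {
    int *d_i = nullptr;
    int h_i = 0;
    cudaMalloc(&d_i, sizeof(int));
    cudaMemcpy(d_i, &h_i, sizeof(int), cudaMemcpyHostToDevice);
    kernel<<<1, num_threads>>>(d_i);
    cudaMemcpy(&h_i, d_i, sizeof(int), cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(d_i);
    return h_i;
}

int main() {
    // The racy result is not guaranteed; it may well be 1 instead of 2.
    printf("racy:   %d\n", run(racy_increment, 2));
    // The atomic result is guaranteed to equal the number of threads.
    printf("atomic: %d\n", run(atomic_increment, 2));
    return 0;
}
```

Note that `__syncthreads()` on its own does not make an increment atomic. It only guarantees that every thread in the block has reached the barrier before any of them continues.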
Could you share a video on implementing an image processing algorithm?
great tutorials indeed
Hey, I have a question. If I've dedicated 48 KB of shared memory, does that get divided among the blocks, or is that per block?
per block, 7:45
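To illustrate "per block" with a small sketch (the kernel, the `block_sum` helper, and the sizes are made up for this example): every block gets its own private copy of a `__shared__` array, so what one block writes is invisible to the others. The per-block limit on your own card can be read from `cudaDeviceProp::sharedMemPerBlock`.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kThreads = 256;

// Each block fills its own copy of `tile` with its block index and sums it.
__global__ void per_block_sum(int *out) {
    __shared__ int tile[kThreads];   // one private copy per block
    tile[threadIdx.x] = blockIdx.x;
    __syncthreads();                 // wait until the whole tile is written
    if (threadIdx.x == 0) {
        int sum = 0;
        for (int k = 0; k < kThreads; ++k) sum += tile[k];
        out[blockIdx.x] = sum;       // blockIdx.x * kThreads if the tile is private
    }
}

// Hypothetical host helper: launch num_blocks blocks and return one block's sum.
int block_sum(int num_blocks, int block) {
    int *d_out = nullptr;
    int h_sum = 0;
    cudaMalloc(&d_out, num_blocks * sizeof(int));
    per_block_sum<<<num_blocks, kThreads>>>(d_out);
    cudaMemcpy(&h_sum, d_out + block, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_sum;
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);   // device 0 is assumed here
    printf("shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
    printf("sum computed by block 3: %d\n", block_sum(4, 3));
    return 0;
}
```

The total shared memory on each multiprocessor is also limited, so blocks that use a lot of it reduce how many blocks can be resident on a multiprocessor at the same time.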
Question: In another video, someone said that GPUs implement control flow by pushing a mask for the other options, executing the true branch, then executing the false branch, and so on. So wouldn't this be slower using __syncthreads? If you just use a cascading set of ifs, the race condition would be solved and you wouldn't need the syncs, right? Not sure if the mask thing is still, or ever was, true. Thanks.
Yes, that's how they work within blocks. I can't remember what this vid was on about, but there's no reason to assume that syncthreads is the fastest method. Best thing to do is try both and time them. IMHO modern hardware is too complex to know these things without trying them. Good luck and thanks for watching!
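For anyone who wants to "try both and time them", here is a minimal timing sketch using CUDA events. The kernel `variant_a` is only a placeholder and `time_kernel_ms` is a hypothetical helper; swap in the two versions you actually want to compare. As far as I know, the masking described above is applied per warp (groups of 32 threads), which is one more reason to measure rather than guess.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel: stands in for one of the variants being compared.
__global__ void variant_a(float *data, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) data[idx] = data[idx] * 2.0f + 1.0f;
}

// Hypothetical helper: average time in milliseconds of one launch of `kernel`
// over `repeats` launches on n floats.
float time_kernel_ms(void (*kernel)(float *, int), int n, int repeats) {
    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));
    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    kernel<<<blocks, threads>>>(d_data, n);   // warm-up launch, not timed
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    for (int r = 0; r < repeats; ++r) {
        kernel<<<blocks, threads>>>(d_data, n);
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);               // wait for all timed launches

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return ms / repeats;
}

int main() {
    printf("variant_a: %f ms per launch\n", time_kernel_ms(variant_a, 1 << 20, 100));
    return 0;
}
```

Timing each variant over many launches after a warm-up gives steadier numbers than a single run.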
Hello friend, your videos are very good. I have a question: do the new GTX 750 and GTX 750 Ti video cards work with Adobe Premiere and After Effects?
Sorry, your comment was marked as spam for some reason. I don't see why a GTX 750 wouldn't work with Adobe Premiere. Even if it doesn't work now, Adobe is sure to release a patch. They'd be crazy to release a product that doesn't work with the biggest GPU manufacturer in the world!
Thanks for watching and have a great day!
Thanx :-)
Super
I was confusing this with the shared memory described in: stackoverflow.com/questions/5656530/how-to-use-shared-memory-with-linux-in-c