Among all the CUDA videos I've watched, this one made the most sense to me.
true
Amazing info! Love the way the data flow and execution is explained!
It is like impossible power of computation! Beautiful beast!
This is a very good video explanation of GPU computation.
Amazing lecture. Helped me a loooooot for my final exam. Thank u soooo much. ❤️❤️❤️
Great lecture, thanks for sharing! I also enjoyed the interesting piece of history on how the "bug" concept came to be.
Really enjoyed watching the vid. I've been learning computer architecture with nand2tetris and Digital Design and Computer Architecture by David Harris and Sarah Harris. I'm so happy to be able to understand the concepts he was talking about in this vid. Anyway, thank you for the excellent, beginner-friendly content.
Best CUDA explanation ever.
Great Lecture! Very helpful!
Cheers mate! Always love a good programming lecture. :)
Excellent introduction! Thanks!
Great tutorial. Thank you !
15:20 Single Instruction Multiple Threads
Thank you so much for the video! Quite helpful. Appreciate it :D
Very neat! Thank you!
Hi Tom, at 16:36, on line 19, you should change "float(i);" to "(float) i;". I'm assuming you're trying to cast the integer value to a floating-point data type.
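A note on the cast: assuming the file is compiled as CUDA C++ with nvcc (an assumption, since the build setup isn't visible in this thread), `float(i)` is a function-style cast and should compile as is; only plain C rejects it. A minimal sketch comparing the two spellings, where the helper names `functional_cast` and `c_style_cast` are made up for illustration:

```cuda
#include <cstdio>

// Function-style cast: valid in C++ and CUDA C++, a syntax error in plain C.
__host__ __device__ inline float functional_cast(int i) { return float(i); }

// C-style cast: valid in C, C++, and CUDA C++.
__host__ __device__ inline float c_style_cast(int i) { return (float)i; }

int main() {
    // Print both conversions side by side for a few small integers.
    for (int i = 0; i < 4; ++i) {
        printf("%d -> %f and %f\n", i, functional_cast(i), c_style_cast(i));
    }
    return 0;
}
```

Both helpers perform the same int-to-float conversion, so the choice between them is mostly a matter of style.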
Why did you need to use "float f" at time index 30:00? Why didn't you combine everything into one line: "d_out[idx] = d_in[threadIdx.x] * d_in[threadIdx.x]"? Is there a penalty for reading the thread index multiple times, or did you do it just for clarity and to explain how the code works?
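One way to check this is to run both forms and compare the results. The sketch below assumes the kernel looks roughly like what the question describes; the names `square_with_temp`, `square_one_line`, and `run_square` are hypothetical, and error checking is left out. I would expect the compiler to produce very similar code for both, since `threadIdx.x` is a cheap built-in value, so the temporary is most likely there for readability, but that is an expectation rather than something measured here.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Version with a temporary, in the style the question describes.
__global__ void square_with_temp(float *d_out, const float *d_in) {
    int idx = threadIdx.x;
    float f = d_in[idx];
    d_out[idx] = f * f;
}

// One-line version proposed in the question.
__global__ void square_one_line(float *d_out, const float *d_in) {
    d_out[threadIdx.x] = d_in[threadIdx.x] * d_in[threadIdx.x];
}

// Hypothetical host helper: copies `in` to the GPU, runs one of the kernels
// with a single block of in.size() threads (so in.size() must be between
// 1 and 1024), and copies the result back. Error checking is omitted.
std::vector<float> run_square(const std::vector<float> &in, bool use_temp) {
    const int n = static_cast<int>(in.size());
    const size_t bytes = n * sizeof(float);
    float *d_in = nullptr;
    float *d_out = nullptr;
    cudaMalloc((void **)&d_in, bytes);
    cudaMalloc((void **)&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);
    if (use_temp) {
        square_with_temp<<<1, n>>>(d_out, d_in);
    } else {
        square_one_line<<<1, n>>>(d_out, d_in);
    }
    std::vector<float> out(n);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}

int main() {
    std::vector<float> in(64);
    for (int i = 0; i < 64; ++i) in[i] = float(i);
    std::vector<float> a = run_square(in, true);
    std::vector<float> b = run_square(in, false);
    printf("results match: %s\n", a == b ? "yes" : "no");
    return 0;
}
```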
nice boy
Amazing !!
Could you have squared the d_in array in place? So: d_in[idx] = d_in[idx] * d_in[idx]?
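In-place squaring should work here, because each thread reads and writes only its own element, so no thread depends on a value another thread is overwriting. A minimal sketch under that assumption; `square_in_place` and `square_in_place_on_gpu` are hypothetical names, the launch uses one block with one thread per element, and error checking is omitted:

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread squares exactly one element and stores it back in the same slot.
__global__ void square_in_place(float *d_data) {
    int idx = threadIdx.x;
    d_data[idx] = d_data[idx] * d_data[idx];
}

// Hypothetical host helper: a single device buffer serves as input and output.
// Launches one block of data.size() threads, so the size must be 1..1024.
std::vector<float> square_in_place_on_gpu(std::vector<float> data) {
    const int n = static_cast<int>(data.size());
    const size_t bytes = n * sizeof(float);
    float *d_data = nullptr;
    cudaMalloc((void **)&d_data, bytes);
    cudaMemcpy(d_data, data.data(), bytes, cudaMemcpyHostToDevice);
    square_in_place<<<1, n>>>(d_data);
    cudaMemcpy(data.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return data;
}

int main() {
    std::vector<float> in(64);
    for (int i = 0; i < 64; ++i) in[i] = float(i);
    std::vector<float> out = square_in_place_on_gpu(in);
    for (int i = 0; i < 4; ++i) printf("%f\n", out[i]);
    return 0;
}
```

The trade-off is that the original input is gone after the kernel runs, which is one reason to keep separate input and output arrays.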
How do you ensure that the thread ID does not go out of bounds of the array? I could have 1000 threads, right? But only 60 elements in the array to square.
You pass the array size as the thread count when launching the kernel, e.g. square<<<1, arraySize>>>, which ensures only 64 threads are created.
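When the thread count can't be made to match the array size exactly, a common pattern is to pass the length to the kernel and have each thread check its index before touching memory. The sketch below is not the code from the video; `square_guarded`, `square_with_extra_threads`, and the parameter `n` are assumptions added for illustration. It launches 1000 threads for 60 elements, and the extra threads simply do nothing.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// `n` is the number of valid elements; threads with idx >= n skip the work.
__global__ void square_guarded(float *d_out, const float *d_in, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        float f = d_in[idx];
        d_out[idx] = f * f;
    }
}

// Hypothetical host helper: rounds the block count up so that at least
// in.size() threads exist, even if that creates more threads than elements.
// threads_per_block must be between 1 and 1024. Error checking is omitted.
std::vector<float> square_with_extra_threads(const std::vector<float> &in,
                                             int threads_per_block) {
    const int n = static_cast<int>(in.size());
    const size_t bytes = n * sizeof(float);
    const int blocks = (n + threads_per_block - 1) / threads_per_block;
    float *d_in = nullptr;
    float *d_out = nullptr;
    cudaMalloc((void **)&d_in, bytes);
    cudaMalloc((void **)&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);
    square_guarded<<<blocks, threads_per_block>>>(d_out, d_in, n);
    std::vector<float> out(n);
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}

int main() {
    std::vector<float> in(60);
    for (int i = 0; i < 60; ++i) in[i] = float(i);
    // One block of 1000 threads for 60 elements: 940 threads fail the guard.
    std::vector<float> out = square_with_extra_threads(in, 1000);
    printf("last element: %f\n", out[59]);
    return 0;
}
```

The same guard is what makes it safe to use several blocks, since the last block usually has more threads than remaining elements.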
Can you tell me what threads are? I'm new to the GPU world 😁
You could add timestamps
Great explanation! Thy
*thx not thy
Great tutorial! Thank you so much!