8:35 Hahaha, sure I remember learning Calculus in kindergarten
lol noticed after reading your comment. I was so lost in understanding things. :p
When higher secondary is so far back in your life, it's inseparable from kindergarten
Good old days of kindergarten : -})
At 7:14, the code shown calls a function "think" which I "think" is a typo.
At 4:45 he says the A array sits on the heap. However, since this is a statically sized array declared in a function (presumably main), shouldn't the array be put on the stack instead of the heap?
No. We have double A[1000] outside the structured block, so it will be shared by all the threads. If we had double A[1000] inside the structured block, then each thread would have its own private copy of A[1000].
I get where you are coming from, but if we write
int main() {
double A[1000];
}
A sits on the stack of the master thread (A is a stack-allocated array here). So when we write
int main() {
double A[1000];
#pragma omp parallel
{
// do stuff with A
}
}
how do all the threads share A, since A was created on the stack of the master thread before parallelization began? Maybe OpenMP itself copies all stack-allocated variables of the master thread to the heap without the programmer knowing?
Think of it this way: if anything is declared globally, that is, outside a function, then all the functions you have defined in your program have access to that SAME variable (not a copy of it). If you define anything inside a function, then only that function has access to it.
Similarly, if you define A[1000] outside the structured block, then all threads have access to the same A. But if you declare A[1000] inside the structured block, then each thread gets its own copy of A[1000]. So if some thread modifies A[1000], the change only affects its own copy of A[1000].
You are right. Whether an array is private or shared is independent of whether it sits on the stack or the heap. For the specific example in question, the shared variable A is clearly allocated on the stack before the parallel block is reached. No doubt about that. As for whether OpenMP moves shared stack variables to the heap, that is implementation dependent: how such variables are made visible to the other threads is up to the implementation. If you want to be sure, you can check the addresses of the shared variables, and most likely you will find them on the stack.
@J.Rahman I asked ChatGPT, which was not available in the olden days when you commented.
My guess, after speaking with Senhor GPT, is that's why he says "to a first order approximation" the data is on the heap. It's not actually on the heap, but it behaves as if it were, in that it's available to all threads.
Watching it twice helped me. I didn't get it exactly the first time.
We put a #pragma omp parallel in your #pragma omp parallel so you can fork threads while you fork threads...
Do all the threads need to join back at the same time or can some threads join the master thread at some point in the execution when they're done with their part and the remaining threads join later? Or does the fork and join only happen at the beginning and end of the structured block for which they're created?
This video is very helpful. Thank you !
Can I get something on how to parallelize nested while loops in C?
Does A sit on the heap or the stack? If A sits on the stack, then how can the threads in the parallel region access A?
So there's another question: what is the relation with the C++ thread library (thread.h)?
4:19
Very well explained. Thank you, Mattson.
Here is the code: github.com/rishiloyola/pi-openmp
in case someone wants to copy-paste it.
There are a lot of syntax errors in this code (forgetting the ; at the end of omp_set_num_threads, and accessing sum as an aggregate when you declared it as a scalar). Did you even compile it?
"thunk" seems to refer to the thread function.
hahaha "far back to kindergarten when you learnt Calculus".
Thank you for the tutorial Tim Mattson.
Very helpful and great video. Thanks.
What does pooh really mean? I could run my process pooh(ID,A). Can anybody explain? Thanks.
pooh() can be any function that takes ID and the array A. For each parallel thread, pooh() has a different value of ID in its argument list, so it can carry out a different operation for each thread. You can put a switch-case or if-else construct in pooh(ID,A) that allows different operations on the common array A in each thread.
I just learned this and may be wrong in my interpretation, so please cross-check.
Just to make sure: when it requests 4 threads, the team does not include the main thread, correct? So in total there are 5 threads running 5 thunks?
Dulantha Fernando No. The team gets 4 threads in total, with the main (also known as master) thread having ID = 0. The other IDs are 1, 2, 3.
Which kindergarten did you guys go to!!!!
Why i+0.5 before multiplying by step, as in x = (i+0.5)*step?
To average the lower-sum and upper-sum approximations to the actual integral: i+0.5 evaluates f at the midpoint of each interval (the midpoint rule).
Many thanks for the useful videos.
Why multiply the sum by step after the for loop in the pi code?
Doesn't sum store the sum of areas of all rectangles which itself is the value of the integral?
That's just the definition of the Riemann integral. What we do is: int_0^1 f(x) dx ≈ sum_{i=0}^{N-1} f(x_i) * step = step * sum_{i=0}^{N-1} f(x_i), with f(x) = 4/(1+x^2) and N = 100000.
Since the step size is constant, we can take it out of the sum: we first sum up all the f(x_i) evaluations and then multiply by the step size. Look up "Riemann sum" on Wikipedia.
Is it possible that 2 threads would try to read a shared variable at the exact same time? Or, given how fast processors run, is it super rare???
+Anant Mishra Read and write are operations that a processor carries out as part of a three-step scheme: fetch, decode, and execute an instruction. So it CAN happen that a certain value is affected by two different threads. Say thread A wants to add 1 to a value z and thread B wants to subtract 1. If you are not using some sort of synchronization, the output will be one of the possible interleaved combinations.
I would have had it on the first try, except I omitted the step*sum at the end. Dangit.
Does anybody know of an IRC channel where I can chat a bit about openMP? Ask questions, lurk, all that good stuff?
#cprogramming #programming #c++-libraries, maybe on Freenode.
You don't define the function "pooh". Am I supposed to know what it is?
It can be anything. You can put any code you want in it.
Thank you for using pooh and not foo!
lol 8:30, calculus in kindergarten.... yeah
He so looks like Benny Hill
Umm... So not only do you have programming mistakes in your code (like not including stdio.h in previous exercises), but you also encourage bad form by using a non-monospace font and bad coding style?
So give the gift horse a breath mint.
On gcc, the code works without stdio.h. It gives a warning, though, but it works (gcc falls back to an implicit declaration of printf, which modern C standards no longer allow).