If you get "error C3001: 'task' : expected an OpenMP directive name", it means you have OpenMP 2.0, so tasks are not available; they were only added in OpenMP 3.0.
The coloring at 2:40 is wrong: node* p; and the setup of the while loop should be part of black block 1, since they are done by "the" single thread.
This video saved my life.
Hi,
First of all, I would like to thank Tim for all his hard work in teaching us so nicely and patiently.
I am using VS2013, which seems to support only OpenMP 2.0, so I solved the problem using the critical clause: essentially forcing a single thread to assign a node to each respective thread, then putting a barrier, and once everyone reached that common point, letting them all process their nodes (a rough sketch of this kind of approach is below). With 4 cores enabled I got the result in 29.46 sec, while running single-threaded gives the same result in 62.38 sec.
Is that a good enough result? How does it compare with the task-based version?
Thanks,
Zeeshan
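For anyone else stuck on OpenMP 2.0, here is a rough sketch of a critical-based workaround along the lines Zeeshan describes (not his exact code; the node type, processwork(), and the shared list cursor are assumptions taken from the course example). Every thread repeatedly grabs the next unclaimed node inside a critical section and then processes it outside of it:

#include <omp.h>

typedef struct node { struct node *next; /* payload fields omitted */ } node;
void processwork(node *p);               /* assumed, as in the course code */

void traverse_critical(node *head)
{
    node *cursor = head;                 /* shared cursor into the list */
    #pragma omp parallel
    {
        node *p;
        while (1) {
            /* only one thread at a time may advance the shared cursor */
            #pragma omp critical
            {
                p = cursor;
                if (cursor != NULL)
                    cursor = cursor->next;
            }
            if (p == NULL)
                break;                   /* list exhausted */
            processwork(p);              /* done in parallel, outside the critical */
        }
    }                                    /* implied barrier at the end of the region */
}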
I got it and patted myself on the back. Awesome. I used master instead of single; it was easier for me to imagine the master thread looping over the list and spawning a child task for each node.
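A minimal sketch of that master-based variant might look like this (my reading, not the poster's exact code; node, head, and processwork() are the same assumptions as above). Unlike single, master has no implied barrier, but the implicit barrier at the end of the parallel region still guarantees that all queued tasks are finished before any thread leaves:

#pragma omp parallel
{
    #pragma omp master
    {
        for (node *p = head; p != NULL; p = p->next) {
            #pragma omp task firstprivate(p)
            processwork(p);
        }
    }
    /* the other threads fall through to the implicit barrier ending the
       parallel region and pick up queued tasks while waiting there */
}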
If we use single, only one thread from the team executes that block, so how does the computation get parallelized?
The single thread creates a task for each node, and each task is pushed onto a sort of queue; those tasks are then executed, in parallel, by whichever threads are free. So the tasks are created serially, but the created tasks are executed in parallel (there is parallel slack).
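In code, that is essentially the solution shown in the video (sketched here from memory, reusing the node / processwork() names from above): one thread walks the list and creates a task per node, and the whole team executes those tasks.

#pragma omp parallel
{
    #pragma omp single
    {
        node *p = head;
        while (p != NULL) {
            #pragma omp task firstprivate(p)
            processwork(p);              /* each task gets its own copy of p */
            p = p->next;
        }
    }
    /* implied barrier of single: all tasks are complete before threads move on */
}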
I came up with the same solution as in the video. However, I noticed that the task version was approx. 1 second slower than the array version.
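For reference, the "array version" mentioned here is, as I understand it, the usual OpenMP 2.0 workaround from the course: count the nodes, copy the pointers into an array, then run an ordinary parallel for over the array. A rough sketch (again reusing the node / processwork() assumptions from above):

#include <stdlib.h>

void traverse_with_array(node *head)
{
    int count = 0;
    for (node *p = head; p != NULL; p = p->next)
        count++;                          /* first pass: count the nodes */

    node **parr = malloc(count * sizeof(node *));
    int i = 0;
    for (node *p = head; p != NULL; p = p->next)
        parr[i++] = p;                    /* second pass: store the pointers */

    #pragma omp parallel for
    for (i = 0; i < count; i++)
        processwork(parr[i]);             /* now an ordinary parallel loop */

    free(parr);
}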
This doesn't work for me with gcc 4.2.4 on Ubuntu. It doesn't give me a compiler error, but it isn't any faster than the serial code. It may be that the compiler only implements OpenMP 2.0 or 2.5, so tasks aren't supported, but why doesn't the compiler complain?
But what if there is a for loop inside the while loop?
It would still work, as one thread would still go on creating tasks for the other threads to process.
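Purely as an illustration of one way to read that (the inner loop bound nchunks and process_chunk() are made-up names, not from the course code): the single thread still walks the list, and the inner for loop simply generates several tasks per node.

#pragma omp parallel
{
    #pragma omp single
    {
        node *p = head;
        while (p != NULL) {
            for (int i = 0; i < p->nchunks; i++) {
                #pragma omp task firstprivate(p, i)
                process_chunk(p, i);      /* hypothetical per-chunk work */
            }
            p = p->next;
        }
    }
}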
A different version with a for loop:

node *p;                                  /* list cursor; only the single thread updates it */
#pragma omp parallel
{
    #pragma omp single
    for (p = head; p != NULL; p = p->next)
    {
        #pragma omp task firstprivate(p)
        processwork(p);
    }
}