Re: Multi-GPU processing inconsistent.
Posted by Sven Gothel on Jan 26, 2014; 1:49am
URL: https://forum.jogamp.org/Multi-GPU-processing-inconsistent-tp4031306p4031355.html
On 01/26/2014 01:48 AM, The.Scotsman [via jogamp] wrote:
> Thanks for the quick response!
>
> Each comparison computation is completely independent.
> I'm not currently using a clFlush/clFinish for each individual CLTask, as that
> doesn't seem appropriate.
> However, I am doing a CLCommandQueuePool.flushQueues() & finishQueues(), but
> it doesn't appear to make a difference.
>
> The CLCommandQueuePool class acts like a threaded job queue, where each job
> (CLTask) is scheduled as soon as a device is available.
> (Benchmarks that I've done have shown the implementation to be very efficient!)
>
> The CLTask.execute() method passes a CLSimpleQueueContext argument, which
> varies for each device.
> So each CLTask has an associated CLContext & CLCommandQueue.
>
> I see two potential sources of error:
> 1) A CLTask is performed on a given device before the previous CLTask for that
> device is complete.
> 2) There is some issue with copying the same memory to multiple contexts
> simultaneously.
>
> However, recalling that the computations are performed perfectly when using a
> single device, the first seems unlikely.
> The second scenario would occur somewhat randomly, which is the observed
> behavior.
>
> Since there is no exception thrown, further troubleshooting will require a
> substantial amount of debugging...
Can you provide the smallest possible self-contained [unit] test
that reproduces this, and attach it to a new bug report?
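
A minimal sketch of the shape such a test could take: run the same independent tasks once on a single worker as a reference, once on several workers, and report the first task whose result differs. Everything here is an assumption for illustration. The class and method names are hypothetical, plain java.util.concurrent threads stand in for the devices, and compute() stands in for the real per-device work (buffer upload, kernel enqueue, read-back). No JOCL API is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MultiDeviceConsistencyCheck {

    // Hypothetical stand-in for one independent comparison computation.
    // A real test would replace this with the per-device OpenCL work.
    static long compute(int[] shared, int taskIndex) {
        long sum = 0;
        for (int v : shared) {
            sum += (long) v * (taskIndex + 1);
        }
        return sum;
    }

    // Runs taskCount independent tasks on a pool of `workers` threads
    // (one thread per simulated device) and returns results in task order.
    static long[] runOnWorkers(int[] shared, int taskCount, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int taskIndex = i;
                Callable<Long> job = () -> compute(shared, taskIndex);
                futures.add(pool.submit(job));
            }
            long[] results = new long[taskCount];
            for (int i = 0; i < taskCount; i++) {
                results[i] = futures.get(i).get(); // blocks until task i has finished
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    // Returns the index of the first differing result, or -1 if both arrays agree.
    static int firstMismatch(long[] expected, long[] actual) {
        int n = Math.min(expected.length, actual.length);
        for (int i = 0; i < n; i++) {
            if (expected[i] != actual[i]) {
                return i;
            }
        }
        return expected.length == actual.length ? -1 : n;
    }

    public static void main(String[] args) throws Exception {
        int[] shared = new int[1024];
        for (int i = 0; i < shared.length; i++) {
            shared[i] = i;
        }
        long[] reference = runOnWorkers(shared, 64, 1); // single-device baseline
        long[] parallel = runOnWorkers(shared, 64, 2);  // two simulated devices
        int bad = firstMismatch(reference, parallel);
        System.out.println(bad < 0 ? "results match" : "first mismatch at task " + bad);
    }
}
```

With this pure-Java stand-in the two runs always agree. The point is only the structure: the single-device run gives the reference, and a mismatch index from the multi-device run would show which task to look at first.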
I would like to validate this case within February.
Thank you!
~Sven