Re: Parallel computation on CPU faster than GPU!
by Giovanni Idili
I moved the loop into the kernel and am getting much better results. Whether I block on only one of the buffers or on all of them to read out the final values at the end does not seem to make a difference:

with 302 items --> GPU: 276ms / CPU: 228ms

Here's the code I am using to invoke the kernel:

One weird thing I've noticed: if I don't block on any buffer, the computation takes only 1 ms, which makes me think something is horribly wrong. I'm trying to find a way to verify this.
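
One possible explanation for the 1 ms figure, offered as a guess rather than a diagnosis: OpenCL enqueue calls return as soon as the command is queued, so without a blocking read (or an explicit wait on the queue) the timer may only be measuring the time to enqueue. The sketch below is a plain-Python analogy using a thread pool, not OpenCL code; `fake_kernel` and `time_submit` are made-up names for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_kernel(duration_s):
    """Hypothetical stand-in for device work: sleeps, then returns a value."""
    time.sleep(duration_s)
    return 42


def time_submit(wait_for_result, duration_s=0.1):
    """Time a submission, optionally blocking on the result before stopping the clock."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        start = time.perf_counter()
        future = pool.submit(fake_kernel, duration_s)  # returns immediately
        if wait_for_result:
            future.result()  # blocks until the work has finished
        elapsed = time.perf_counter() - start
    return elapsed


enqueue_only_s = time_submit(wait_for_result=False)
blocked_s = time_submit(wait_for_result=True)
print(f"no blocking: {enqueue_only_s * 1000:.1f} ms, "
      f"blocking: {blocked_s * 1000:.1f} ms")
```

If the same effect is at play in the real code, the timing taken with at least one blocking read would be the meaningful one.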

As mentioned in the previous post, ideally at this stage I would like to get a 2-dimensional array out at the end (for at least one of the buffers), with the values for each step of the loop I moved into the kernel, so that I can plot them and check that the computation is actually happening.
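
One way the per-step output could be laid out, sketched here in plain Python rather than in kernel code: a single flat buffer of `steps * n` values, where each work item writes its value for a given step at index `step * n + gid`, and the host slices the buffer into rows afterwards. The `update` rule and all function names are hypothetical placeholders, not the actual computation.

```python
def update(v):
    """Hypothetical per-step update rule, used only for illustration."""
    return 0.5 * v + 1.0


def run_with_history(initial, steps):
    """CPU stand-in for the kernel: the step loop runs inside each 'work item'."""
    n = len(initial)
    history = [0.0] * (steps * n)  # flat buffer, row-major: one row per step
    for gid in range(n):  # each iteration plays the role of one work item
        v = initial[gid]
        for step in range(steps):
            v = update(v)
            history[step * n + gid] = v  # record this item's value at this step
    return history


def to_rows(history, n):
    """Reshape the flat buffer into a list of per-step rows (a 2D array)."""
    return [history[i:i + n] for i in range(0, len(history), n)]


rows = to_rows(run_with_history([0.0, 2.0, 4.0], steps=3), n=3)
for step, row in enumerate(rows):
    print(step, row)
```

With this layout, column `gid` of `rows` is the time series for one item, which is the shape needed for plotting. Running a reference like this on the CPU and comparing it against the buffer read back from the device is one possible way to check the results.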

Any help with that would be appreciated!