Re: Parallel computation on CPU faster than GPU!
In Reply To
As mentioned in the previous post, ideally I would like at this stage to get a 2-dimensional array out at the end (at least for one of the buffers), with values for each step of the loop I moved into the kernel, so that I can do some plotting and check that the computation is actually happening.
I haven't looked closely, but it appears each kernel instance calculates its value independently of all the others, and the same instance (by iGid) always works on the value it calculated last time. If that is the case, you could just use the same memory for input and output, which simplifies memory management as a bonus (in this case the kernel would also have to dump out sample results as described in the next point). I noticed you have a bug anyway: after the first loop it's just using Vout for both Vin and Vout.
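To make both points concrete, here is a rough CPU-side sketch in plain C, not your actual kernel. Everything named here is an assumption for illustration: `step_value` is a made-up stand-in for whatever your kernel really computes, and `run_in_place`, `run_ping_pong`, `samples` and `sample_every` are hypothetical names. The outer loop over `gid` plays the role of `get_global_id(0)`. The first function updates one buffer in place and dumps every `sample_every`-th step into a row-major 2-D array you could plot; the second shows the two-buffer version with the swap that appears to be missing after the first loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element update; a stand-in for the real computation. */
static float step_value(float v) { return 0.5f * v + 1.0f; }

/* Single-buffer variant: each "work-item" gid only touches v[gid], so
 * input and output can share the same memory. Every sample_every steps
 * the current value is written to samples[row * n + gid] (row-major,
 * n columns). Returns the number of rows written. */
size_t run_in_place(float *v, size_t n, size_t steps,
                    size_t sample_every, float *samples)
{
    assert(sample_every > 0);
    size_t rows = 0;
    for (size_t gid = 0; gid < n; ++gid) {   /* emulates get_global_id(0) */
        float x = v[gid];
        size_t row = 0;
        for (size_t s = 0; s < steps; ++s) {
            x = step_value(x);
            if ((s + 1) % sample_every == 0) {
                samples[row * n + gid] = x;  /* one column per work-item */
                ++row;
            }
        }
        v[gid] = x;
        rows = row;
    }
    return rows;
}

/* Two-buffer variant: the roles of the buffers are swapped after every
 * step, so each step reads what the previous step wrote. Returns the
 * buffer that holds the final state. */
float *run_ping_pong(float *a, float *b, size_t n, size_t steps)
{
    float *in = a, *out = b;
    for (size_t s = 0; s < steps; ++s) {
        for (size_t gid = 0; gid < n; ++gid)
            out[gid] = step_value(in[gid]);
        float *tmp = in;                     /* swap roles for next step */
        in = out;
        out = tmp;
    }
    return in;
}
```

On the host side, `samples` needs room for `(steps / sample_every) * n` floats. With a real OpenCL kernel the equivalent would be one extra output buffer of that size, read back once at the end instead of after every step.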