I have a simple JOCL demo [http://goo.gl/aYBMi] adapted from the HelloJOCL sample [http://goo.gl/cVR6m] (I should mention that the HelloJOCL sample works OK on the same GPU).
I am having a weird problem where half of the elements in my queue are not being processed (or so it seems) when running on the GPU, but everything works OK when I switch to the CPU.
I tried changing the number of elements, and no matter how few I use, only the first half gets processed correctly, while the rest apparently keeps the default values of the output buffer. The init vectors are correctly initialized when I peek at them, yet it almost looks like the second half of the vector values (doubles) is being passed to the kernel as default values. I am not really sure that those items are being processed by the kernel at all.
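To make the symptom concrete, here is a minimal plain-Java sketch (no JOCL involved; the class and method names are hypothetical and not taken from the linked demo). It shows one way a "first half correct, second half default" pattern can arise: if the byte count for a buffer of doubles is computed with a 4-byte element size instead of 8, only the first half of the elements is transferred. This is only an assumption about a possible cause, not something confirmed in the linked code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class HalfCopyDemo {

    // Mimics a raw buffer transfer: copies n * bytesPerElement bytes of the
    // input (stored as 8-byte doubles) into a zero-filled destination of the
    // same element count, then reads the destination back as doubles.
    static double[] transfer(double[] input, int bytesPerElement) {
        int n = input.length;

        // Source buffer holding the doubles in native byte order.
        ByteBuffer src = ByteBuffer.allocate(n * Double.BYTES)
                                   .order(ByteOrder.nativeOrder());
        for (double v : input) {
            src.putDouble(v);
        }

        // Destination starts zero-filled, like a default-initialized output.
        ByteBuffer dst = ByteBuffer.allocate(n * Double.BYTES)
                                   .order(ByteOrder.nativeOrder());

        // The (possibly wrong) byte count used for the transfer.
        int byteCount = n * bytesPerElement;
        for (int i = 0; i < byteCount; i++) {
            dst.put(i, src.get(i)); // absolute get/put, positions untouched
        }

        double[] out = new double[n];
        dst.asDoubleBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        double[] in = {1, 2, 3, 4, 5, 6, 7, 8};
        // Hypothetical mistake: 4 bytes per element for double data.
        System.out.println(Arrays.toString(transfer(in, 4)));
        // Correct size: 8 bytes per element.
        System.out.println(Arrays.toString(transfer(in, Double.BYTES)));
    }
}
```

With a 4-byte element size the first four values come through and the last four stay at 0.0; with 8 bytes all eight values come through. If something like this is happening, it would be worth checking every place where a buffer size or kernel argument size is computed for the double arrays.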
The kernel is quite simple [http://goo.gl/KiyyK]: it is just a pass-through (I am only testing that I can invoke it OK) that assigns the value from one of the input buffers to the corresponding output buffer. I am inclined to think the kernel is not the problem, since it works fine on the CPU.
I know it's quite a bit of code to go through, so I am just asking if any of this rings a bell.
Any help appreciated!