
Re: Enigmatic benchmark results

Posted by Wade Walker on Jan 29, 2017; 11:33pm
URL: https://forum.jogamp.org/Enigmatic-benchmark-results-tp4037603p4037618.html

If you want to benchmark the actual compute happening on the card rather than just the data copying, you could run two benchmarks: your current one, and another where the kernel is the same but contains twice as much floating-point math (make sure all the math feeds into the kernel's results, or the compiler may optimize it away). The difference in execution time between the two kernels is then roughly the floating-point execution time of the original kernel alone, since the data copied is the same in both cases.
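Here's a rough sketch of that differencing idea in Java. Everything in it is made up for illustration (the class name, the method names, and the toy kernel body), and it deliberately leaves out the OpenCL calls that build and time the kernel, so plug in whatever timing code your benchmark already has. It assumes the kernel's math time scales linearly with the iteration count, which may not hold exactly on real hardware.

```java
public final class KernelTimingSketch {

    // Hypothetical helper: builds OpenCL C source for a toy kernel whose only
    // variable cost is the number of multiply-add iterations. The result of the
    // loop is written to out[i], so the compiler cannot drop the math as dead code.
    static String kernelSource(int iterations) {
        return "__kernel void work(__global const float* in, __global float* out) {\n"
             + "    int i = get_global_id(0);\n"
             + "    float x = in[i];\n"
             + "    for (int k = 0; k < " + iterations + "; k++) {\n"
             + "        x = x * 1.0001f + 0.5f;\n"
             + "    }\n"
             + "    out[i] = x;\n"
             + "}\n";
    }

    // base    = overhead + math
    // doubled = overhead + 2 * math
    // so doubled - base = math (the floating-point time of the base kernel).
    static long mathNanos(long baseNanos, long doubledNanos) {
        return doubledNanos - baseNanos;
    }

    // Whatever is left of the base run is copying, launch cost, and so on.
    static long overheadNanos(long baseNanos, long doubledNanos) {
        return baseNanos - mathNanos(baseNanos, doubledNanos);
    }
}
```

You would time one run built from kernelSource(n) and one built from kernelSource(2 * n) on the same input buffer, then pass both timings to mathNanos. Averaging several runs of each is probably a good idea, since a single timing can be noisy.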

Any programming in Java or C# (or any language that doesn't have pointers :)) will require the Buffer code you mention. This is simply because a language without pointers needs some workaround to interface with APIs like OpenCL that take raw memory buffers as input. For efficiency, though, any fractal or neural network code you write will probably pass only one huge buffer to the graphics card, so you shouldn't have to define many custom types.
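To give an idea of what the "one huge buffer" approach can look like, here's a minimal sketch using only standard java.nio. The class and method names are hypothetical, and the interleaved x/y layout is just an assumption for the example. The handoff to the OpenCL binding is left out, since that part depends on the library you use.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public final class PackedBufferSketch {

    // Packs two coordinate arrays into a single direct buffer, interleaved as
    // x0, y0, x1, y1, ... so one transfer carries all of the input data.
    static FloatBuffer packPoints(float[] xs, float[] ys) {
        if (xs.length != ys.length) {
            throw new IllegalArgumentException("xs and ys must have the same length");
        }
        // Direct buffers live outside the Java heap, which is what native APIs
        // expect; native byte order avoids byte swapping on the way to the device.
        FloatBuffer buf = ByteBuffer
                .allocateDirect(xs.length * 2 * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int i = 0; i < xs.length; i++) {
            buf.put(xs[i]);
            buf.put(ys[i]);
        }
        buf.flip(); // reset position to 0 so the buffer is ready to be read
        return buf;
    }
}
```

With a layout like this the kernel would read point i at indices 2*i and 2*i + 1, so no custom Java types need to cross the boundary.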