jogamp - Re: Enigmatic benchmark results

Posted by Arnold on Jan 29, 2017; 2:51pm
URL: https://forum.jogamp.org/Enigmatic-benchmark-results-tp4037603p4037617.html

Thanks, WadeWalker, for your explanation. I had the naive impression that the OpenCL library functions would always be faster. I now know better :-) Just for fun, here are the benchmark results:
Running benchmark for 3 devices.
Summary of computing vectors with 20,000,000 elements (time in ms)

Device                VectorAdd.cl  VectorMul.cl  VectorDiv.cl  VectorTri.cl
GeForce GTX 1060 6GB           160           157           156           182
Intel(R) Core(TM) i7           154           159           154           336
Ellesmere                      125           104            89           105
Plain Java                      41            41            80          2768

What I want to do is fractal computation and linear algebra, such as matrix multiplication, for neural networks. I parallelised the Mandelbrot computation in a "classic" way using threads, and that works. I found the OpenCL demo code and got it running, but I don't understand it. That's why I am now learning OpenCL in general; somewhere down the road to understanding OpenCL I should start to understand the Mandelbrot code :-) (after stripping all the GL stuff from it).

My understanding fails at the data types. The book I am using to learn OpenCL (OpenCL Parallel Programming Development Cookbook by Raymond Tay) explains buffers the C way. I cannot "translate" a struct with four ints as user data to a CLBuffer<type> without implementing it as an extended Buffer class. I could use an IntBuffer, but that requires manual indexing and would be hopeless for user data with mixed data types. Is there a simpler way to define one's own data structure in Java?
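To make the question concrete, here is a minimal sketch of one possible workaround that uses only java.nio: a thin wrapper that stores records of four ints in a direct ByteBuffer and hides the index arithmetic. The class name Int4Records and its methods are hypothetical, not part of JOCL. It assumes the kernel side declares a matching struct of four 32-bit ints (16 bytes, no padding) and that the resulting ByteBuffer can be handed to whatever buffer-creation call the OpenCL binding offers; that last step is not shown and should be checked against the binding's documentation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/**
 * Hypothetical helper (not part of JOCL): a fixed-layout view over a direct
 * ByteBuffer in which every record is four 32-bit ints, mirroring a C struct
 * such as struct { int a, b, c, d; }.
 */
public final class Int4Records {
    public static final int FIELDS = 4;
    public static final int BYTES_PER_RECORD = FIELDS * Integer.BYTES; // 16 bytes

    private final ByteBuffer data;

    public Int4Records(int count) {
        // Direct buffer in native byte order, so the raw bytes match the host layout.
        data = ByteBuffer.allocateDirect(count * BYTES_PER_RECORD)
                         .order(ByteOrder.nativeOrder());
    }

    /** Number of records the buffer can hold. */
    public int count() {
        return data.capacity() / BYTES_PER_RECORD;
    }

    /** Writes one field of one record; absolute put, so position is unchanged. */
    public void set(int record, int field, int value) {
        data.putInt(offset(record, field), value);
    }

    /** Reads one field of one record; absolute get, so position is unchanged. */
    public int get(int record, int field) {
        return data.getInt(offset(record, field));
    }

    /** The underlying buffer, e.g. for passing on to an OpenCL binding. */
    public ByteBuffer buffer() {
        return data;
    }

    private static int offset(int record, int field) {
        if (field < 0 || field >= FIELDS) {
            throw new IndexOutOfBoundsException("field " + field);
        }
        return record * BYTES_PER_RECORD + field * Integer.BYTES;
    }
}
```

The same idea might extend to mixed data types by using putFloat, putDouble and so on at hand-computed offsets, but then the offsets would have to reproduce the alignment and padding that the OpenCL compiler applies to the struct, which is where this approach gets fragile.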

I wrote a device lister that lists each device and its capabilities, and this benchmark program, which runs a benchmark on each device. I can contribute both to the demo package; just let me know.