Has anybody seen benchmarks that compare the performance of native-code applications using OpenCL and/or OpenGL with the performance of the Java bindings? I know that DLL calls cause some performance loss. Will an application written in C/C++ always be faster than the same one written in Java?
So as long as your program does most of its work inside the OpenGL driver (i.e. you make a few big drawing calls, instead of millions of small ones), your performance should mostly depend on your graphics hardware, not on the language you call OpenGL from.
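To illustrate the "few big drawing calls" point, here is a minimal sketch in plain Java that packs many triangles into one direct buffer, so they could be handed to the driver in a single call. The class name `VertexBatch` and the triangle layout (9 floats each) are my own assumptions for illustration. The actual draw call is left out because it depends on the binding you use.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper, not part of JOGL or any other binding.
final class VertexBatch {
    // Assumed layout: 3 vertices per triangle, each with x, y, z.
    static final int FLOATS_PER_TRIANGLE = 9;

    // Packs all triangles into one direct, native-order buffer so that
    // a single draw call can consume them instead of one call per triangle.
    static FloatBuffer pack(float[][] triangles) {
        FloatBuffer buf = ByteBuffer
                .allocateDirect(triangles.length * FLOATS_PER_TRIANGLE * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (float[] t : triangles) {
            if (t.length != FLOATS_PER_TRIANGLE) {
                throw new IllegalArgumentException("expected 9 floats per triangle");
            }
            buf.put(t);
        }
        buf.flip(); // switch from writing to reading
        return buf;
    }

    public static void main(String[] args) {
        float[][] tris = {
            {0, 0, 0, 1, 0, 0, 0, 1, 0},
            {1, 1, 0, 2, 1, 0, 1, 2, 0},
        };
        FloatBuffer batch = pack(tris);
        // The whole batch would cross the Java/native boundary once here.
        System.out.println("floats in batch: " + batch.remaining());
    }
}
```

With this layout the number of native calls per frame stays constant however many triangles you draw, which is the situation where the calling language should matter least.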
I also ran a few tests. Each test was a loop computing vector cross products. In pure Java the result was 38-39 ms. The same task using C++ SIMD (Eigen 3.0) called from Java took 50-52 ms, and C++ SIMD (Eigen 3.0) from C++ took 18-22 ms. The last test was OpenCL (JOCL) from Java, and the result was 25-30 ms. That last one was a slightly edited "Hello World" example from jogamp.
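For anyone who wants to reproduce the pure-Java case, this is a rough sketch of such a loop. It is not the original test code: the array layout, vector count, warm-up count and the name `CrossBench` are all my assumptions. The warm-up runs are there so the JIT has compiled the loop before timing starts.

```java
import java.util.Random;

// Hypothetical microbenchmark sketch, not the code behind the numbers above.
final class CrossBench {
    // Written after each run so the JIT cannot discard the computation.
    static volatile float sink;

    // Cross product of the 3-vectors stored at offset i in flat arrays.
    static void cross(float[] a, float[] b, float[] out, int i) {
        float ax = a[i], ay = a[i + 1], az = a[i + 2];
        float bx = b[i], by = b[i + 1], bz = b[i + 2];
        out[i]     = ay * bz - az * by;
        out[i + 1] = az * bx - ax * bz;
        out[i + 2] = ax * by - ay * bx;
    }

    static void crossAll(float[] a, float[] b, float[] out) {
        for (int i = 0; i + 2 < a.length; i += 3) {
            cross(a, b, out, i);
        }
    }

    // Returns the best time in milliseconds over `runs` timed runs,
    // after `warmups` untimed runs.
    static double timeMillis(int vectors, int warmups, int runs) {
        Random rnd = new Random(42);
        float[] a = new float[vectors * 3];
        float[] b = new float[vectors * 3];
        float[] out = new float[vectors * 3];
        for (int i = 0; i < a.length; i++) {
            a[i] = rnd.nextFloat();
            b[i] = rnd.nextFloat();
        }
        for (int w = 0; w < warmups; w++) {
            crossAll(a, b, out);
        }
        long best = Long.MAX_VALUE;
        for (int r = 0; r < runs; r++) {
            long t0 = System.nanoTime();
            crossAll(a, b, out);
            best = Math.min(best, System.nanoTime() - t0);
            sink = out[out.length - 1];
        }
        return best / 1e6;
    }

    public static void main(String[] args) {
        System.out.printf("best of 10 runs: %.3f ms%n", timeMillis(1_000_000, 20, 10));
    }
}
```

A hand-rolled loop like this is only a rough guide. For numbers you intend to compare across languages, a proper harness such as JMH is the safer choice.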
JOGL has given me a small boost compared to OpenGL + GLUT, somewhere between 5 and 15%. I only use retained mode. Actually, I agree with Wade: the cost of JNI calls is tiny (a few nanoseconds). It isn't a bottleneck unless your application is badly written, uses immediate mode, or does too much in the rendering callback.
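To put rough numbers on that, here is a back-of-the-envelope sketch. The 10 ns per call and the 60 fps frame rate are figures I am assuming for illustration, not measurements, and `JniBudget` is a made-up name.

```java
// Hypothetical arithmetic helper; all inputs are assumed, not measured.
final class JniBudget {
    // Fraction of one frame's time budget spent purely on call overhead.
    static double overheadFraction(long callsPerFrame, double nanosPerCall,
                                   double framesPerSecond) {
        double frameNanos = 1e9 / framesPerSecond;
        return callsPerFrame * nanosPerCall / frameNanos;
    }

    public static void main(String[] args) {
        double nanosPerCall = 10.0; // assumed order of magnitude
        double fps = 60.0;          // assumed target frame rate
        for (long calls : new long[] {1_000L, 1_000_000L}) {
            double pct = 100.0 * overheadFraction(calls, nanosPerCall, fps);
            System.out.println(calls + " calls/frame -> " + pct + " % of the frame");
        }
    }
}
```

Under those assumptions, a thousand calls per frame costs about 0.06% of the frame, while a million calls per frame (the immediate-mode style) would cost about 60%.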
JOGL is not magic. The boost comes rather from Java itself, which can sometimes be faster than C and C++. The JVM can perform some dynamic optimizations that are not possible in strictly ahead-of-time compiled languages, it can optimize memory use more easily thanks to the absence of pointers, and it has its own heap, which reduces the cost of allocations. According to Brian Goetz, method calls and allocations are between 2 and 4 times faster in Java.
OpenCL depends heavily on the target device and the algorithm, so without more details your numbers don't mean much. Porting a CPU algorithm directly to a GPU will almost always result in very poor code, often 1-3 orders of magnitude short of the potential. And if you're using a CPU backend, be aware that the CPU backends (at least the AMD one) are still fairly immature. Don't expect magic: it's still just a C compiler, and you must explicitly write the code with vector types to use the SIMD units (last time I looked).
If you want to compare what you say you want to compare, just port the same application to multiple languages. The HelloWorld is just based on the C examples, so that should be trivial, and presumably there's an equivalent for C++.
Although JNI and JOCL have some overhead, I think you'll find it's quite small compared to the overheads already present in OpenCL. Because of those, an efficient OpenCL application necessarily queues a relatively small number of compute-heavy tasks and spends most of its time sitting around waiting for results, in which case any binding overhead is hidden anyway.