MulJogamp Timings

Posted by gmseed on Oct 30, 2013; 12:44pm
URL: https://forum.jogamp.org/MulJogamp-Timings-tp4030420.html

Hi

On the Tutorial page is a link to a paper that compares jogamp-jocl, jocl and javacl:

http://jogamp.org/wiki/index.php/JOCL_Tutorial

When comparing against "normal" Java, the author includes the filling of the arrays in the CPU timings. I took the time to implement this test case (as I'm new to JOCL) and factored the array filling and the computation into separate methods:

       
        private void fillJavaArrays(float[] matA, float[] matB, int seedA, int seedB)
        {
                Random randA = new Random(seedA);
                Random randB = new Random(seedB);
                final int n = matA.length;
                for (int i=0; i<n; i++)
                {
                        matA[i] = randA.nextFloat();
                        matB[i] = randB.nextFloat();
                }
        }
       
        public void normalMatMulCalc(float[] matA, float[] matB, float[] C)
        {
                final int n = matA.length;
                for (int i=0; i<n; i++)
                {
                        C[i] = matA[i] * matB[i];
                }
        }

and now compare apples with apples by timing only the computation:

...
                // normal Java calculation
                float[] matA = new float[n];
                float[] matB = new float[n];
                float[] C = new float[n];
                fillJavaArrays(matA,matB,seedA,seedB);
               
                time = nanoTime();
                normalMatMulCalc(matA,matB,C);
                time = nanoTime() - time;
...

From the PDF I'm not sure whether n is 1444777 or 14447777; using the larger value, 14447777, my timing results are:

created: CLContext [id: 375806496, platform: NVIDIA CUDA, profile: FULL_PROFILE, devices: 1]
using CLDevice [id: 375806416 name: Quadro K1000M type: GPU profile: FULL_PROFILE]
local: 256
global: 14447872
used device memory: 173MB
A*B=C results snapshot:
0.29194298, 0.23210067, 0.6739147, 0.5184218, 0.53693414, 0.0102392025, 0.2038985, 0.10943726, 0.16293794, 0.018490046, ...; 14447862 more
computation on GPU took: 52 ms
0.29194298, 0.23210067, 0.6739147, 0.5184218, 0.53693414, 0.0102392025, 0.2038985, 0.10943726, 0.16293794, 0.018490046, ...; 14447862 more
computation on CPU took: 16 ms

so in this run the "normal" Java computation is 52/16 = 3.25 times faster than the GPU version.
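One caveat with the CPU figure: a single timed pass on the JVM can be affected by JIT warm-up, so the result may shift from run to run. Below is a minimal sketch of how the CPU side could be timed more robustly, using a few untimed warm-up passes and then keeping the fastest of several timed passes. The class name CpuBaselineTiming, the helper bestOfNanos, the seeds, and the warm-up and run counts are my own illustrative choices and are not taken from the paper or the tutorial; multiply does the same element-wise product as normalMatMulCalc above.

```java
import java.util.Random;

// Hypothetical helper class for illustration only; not part of the tutorial code.
final class CpuBaselineTiming {

    // Fill both inputs with seeded pseudo-random floats (kept outside the timed region).
    static void fill(float[] a, float[] b, long seedA, long seedB) {
        Random randA = new Random(seedA);
        Random randB = new Random(seedB);
        for (int i = 0; i < a.length; i++) {
            a[i] = randA.nextFloat();
            b[i] = randB.nextFloat();
        }
    }

    // Element-wise product, the same operation as normalMatMulCalc in the post.
    static void multiply(float[] a, float[] b, float[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] * b[i];
        }
    }

    // Runs untimed warm-up passes, then returns the fastest of `runs` timed passes (ns).
    static long bestOfNanos(float[] a, float[] b, float[] c, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            multiply(a, b, c);
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            multiply(a, b, c);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        int n = 14447777;               // the larger of the two sizes discussed above
        float[] a = new float[n];
        float[] b = new float[n];
        float[] c = new float[n];
        fill(a, b, 12345L, 67890L);     // assumed seeds, chosen arbitrarily
        long best = bestOfNanos(a, b, c, 5, 10); // assumed warm-up and run counts
        System.out.printf("best CPU pass: %.3f ms%n", best / 1e6);
    }
}
```

The same idea (warm up, then repeat) would apply to the GPU path as well, so that both numbers are measured in the same way before they are compared.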

I'm interested to hear if other people have run this test.

Thanks

Graham