Re: Negating some JNI call overhead by doing transformations in Java

Posted by GiGurra on Jul 23, 2011; 8:44am
URL: https://forum.jogamp.org/Negating-some-JNI-call-overhead-by-doing-transformations-in-Java-tp3188212p3193171.html

Cool! Thanks

So if I understand you right, PMVMatrix objects are handled entirely on the CPU? Or do some of its functions send data to the GPU, or do I handle that myself?

Basically, I create a PMVMatrix (so is this an on-CPU matrix stack?), transform it as I want, and then use the standard gl.glLoadMatrixf to load the finished PMVMatrix? Or does absolutely everything go through PMVMatrix, as it would in normal FFP programming, so I can just sit back and relax? :)

EDIT:
Got it. It is just an on-host matrix stack.

EDIT2: I've been trying the PMVMatrix class a bit, but it seems to be significantly slower than standard FFP calls, so either I make my own implementation or I stick with standard FFP calls. Regardless, I can use the PMVMatrix math for help should I need it :). So thanks.
(With -server and a JIT warmup period of a couple of thousand frames, the FFP calls are faster by a factor of 2-4. Initially they are faster by a factor of 20 or so, but that goes away after a few seconds, as should be expected.)

I will proceed to write my own, much simpler implementation, as I assume the PMVMatrix class was written more for robustness and compatibility than for pure speed.

EDIT3:
Using my own transformation class (currently only a matrix class, so it doesn't do push/pop), I am able to speed up the actual transformation times 20x compared to native FFP calls. I am also exploiting the fact that I'm only drawing in 2D to limit the number of operations: translate/rotate do not go through a full matrix multiply, but instead directly modify only specific elements.
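
To illustrate the 2D shortcut described above, here is a minimal sketch of what such a class could look like. It is not the actual class from this post: the name Transform2D and its methods are hypothetical, and it assumes a column-major 4x4 float array (the layout glLoadMatrixf expects) with post-multiplication semantics like glTranslatef/glRotatef. Translate touches two elements and rotate touches four, instead of running a full 4x4 multiply.

```java
import java.util.Arrays;

/** Hypothetical 2D-only transform; a sketch, not the class from the post. */
final class Transform2D {
    // Column-major 4x4 layout: element (row, col) lives at m[col * 4 + row].
    final float[] m = new float[16];

    Transform2D() {
        setIdentity();
    }

    void setIdentity() {
        Arrays.fill(m, 0f);
        m[0] = m[5] = m[10] = m[15] = 1f;
    }

    /** Post-multiplies by a translation, like glTranslatef(x, y, 0). */
    void translate(float x, float y) {
        // M * T only changes the fourth column; in 2D only two entries of it.
        m[12] += m[0] * x + m[4] * y;
        m[13] += m[1] * x + m[5] * y;
    }

    /** Post-multiplies by a rotation about the z axis, like glRotatef(deg, 0, 0, 1). */
    void rotate(float degrees) {
        double r = Math.toRadians(degrees);
        float c = (float) Math.cos(r);
        float s = (float) Math.sin(r);
        float a = m[0], b = m[1], d = m[4], e = m[5];
        // M * R: new col0 = c*col0 + s*col1, new col1 = -s*col0 + c*col1.
        m[0] = a * c + d * s;
        m[1] = b * c + e * s;
        m[4] = d * c - a * s;
        m[5] = e * c - b * s;
    }
}
```

With JOGL, the finished array would then be uploaded once per object with something like gl.glLoadMatrixf(t.m, 0), where t is a Transform2D instance.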

About native FFP calls: glLoadMatrix seems to take as much time as 2 or 3 glRotate calls, so it is only worthwhile to make your own stuff if you need more transformations than that. Unless I suddenly need 10 transformations per object, I won't gain much by using host transformations, because the glLoadMatrix call will cancel out most of the benefit anyway.
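
The trade-off can be put into a rough cost model. This is only a sketch built on the ratios quoted in this post, not on separate measurements: costs are in units of one native glRotate call, host transforms are assumed to be 20x cheaper than a native call, and one glLoadMatrix is assumed to cost about 2.5 native calls. The class and method names are made up for illustration.

```java
/** Hypothetical back-of-the-envelope model; units are "one native glRotate call". */
final class BreakEven {
    /** Cost of doing n transformations as native FFP calls. */
    static double nativeCost(int n) {
        return n;
    }

    /** Cost of doing n transformations on the host, plus one matrix upload. */
    static double hostCost(int n, double hostSpeedup, double loadCost) {
        return n / hostSpeedup + loadCost;
    }
}
```

Under those assumptions, 2 transformations per object cost 2 units natively versus about 2.6 on the host, so the host path loses. At 10 transformations the host path costs about 3 units versus 10, which is where it starts to pay off noticeably.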