VBO Performance Misunderstanding


VBO Performance Misunderstanding

alicana
Hi all,

I have a stream of data from an external source (an app), and I'm trying to draw a point for each item in the stream. Note: I need an expandable buffer. I adapted Wade Walker's code (thanks to him) as follows:

private int[] createAndFillVertexBuffer(GL2 gl, ArrayList<IData> listDataObjects) {

    int[] aiNumOfVertices = new int[] { listDataObjects.size() };

    // First call: create the buffer object and give it an initial data store.
    if (aiVertexBufferIndices[0] == -1) {
        if (!gl.isFunctionAvailable("glGenBuffers") || !gl.isFunctionAvailable("glBindBuffer")
                || !gl.isFunctionAvailable("glBufferData") || !gl.isFunctionAvailable("glDeleteBuffers")) {
            System.err.println("VBO unsupported!");
        }

        gl.glGenBuffers(1, aiVertexBufferIndices, 0);
        gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, aiVertexBufferIndices[0]);

        allocated = (listDataObjects.size() * 2 * Buffers.SIZEOF_DOUBLE * 3);
        gl.glBufferData(GL2.GL_ARRAY_BUFFER, allocated, null, GL2.GL_STREAM_DRAW);
    }

    gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, aiVertexBufferIndices[0]);

    // Re-create the data store when the required size exceeds the current allocation.
    int tempAllocated = next_power_of_two(listDataObjects.size() * 2 * Buffers.SIZEOF_DOUBLE * 3) * 2;

    if (allocated < tempAllocated) {
        allocated = tempAllocated;
        gl.glBufferData(GL2.GL_ARRAY_BUFFER, allocated, null, GL2.GL_STREAM_DRAW);
    }

    // Map the buffer and write the vertices and colors straight into it.
    ByteBuffer bytebuffer = gl.glMapBuffer(GL2.GL_ARRAY_BUFFER, GL2.GL_WRITE_ONLY);
    DoubleBuffer doublebuffer = bytebuffer.order(ByteOrder.nativeOrder()).asDoubleBuffer();

    for (IData dataobject : listDataObjects)
        storeVerticesAndColors(doublebuffer, dataobject);

    gl.glUnmapBuffer(GL2.GL_ARRAY_BUFFER);

    return (aiNumOfVertices);
}


As you can see above, I'm trying to change the buffer size. (Correct me if I'm wrong; I guess it's working, since it hasn't thrown an exception yet.)

if (allocated < tempAllocated) {

        allocated = tempAllocated;
        gl.glBufferData(GL2.GL_ARRAY_BUFFER, allocated, null, GL2.GL_STREAM_DRAW);

}
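
For illustration, the grow-only policy behind that check can be sketched in plain Java, without any GL calls. The post does not show next_power_of_two(), so the helper below is a hypothetical stand-in, and the class name VboCapacity is made up for this example. This variant compares the required size directly against the current allocation, which is slightly different from the snippet above.

```java
// Hypothetical sketch: tracks the size of a VBO data store and reports
// when it would have to be re-created with glBufferData.
final class VboCapacity {
    private int allocatedBytes = 0;

    /** Smallest power of two that is >= n, valid for 1 <= n <= 2^30. */
    static int nextPowerOfTwo(int n) {
        if (n <= 1) {
            return 1;
        }
        return Integer.highestOneBit(n - 1) << 1;
    }

    /**
     * Returns true when the store is too small and must be re-created;
     * the new capacity is recorded. Returns false when it is big enough.
     */
    boolean ensureCapacity(int requiredBytes) {
        if (requiredBytes <= allocatedBytes) {
            return false; // no glBufferData call needed
        }
        allocatedBytes = nextPowerOfTwo(requiredBytes);
        return true;
    }

    int allocatedBytes() {
        return allocatedBytes;
    }
}
```

With a policy like this, a steadily growing stream triggers a re-allocation only when the required size crosses a power of two, not on every frame.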


I used to draw the points in immediate mode, which drew 2M points in 400 milliseconds.

But somehow the VBO approach does not give me that performance :(

display time: 265 ms, number of point: 310000
display time: 314 ms, number of point: 375000
display time: 405 ms, number of point: 454724
display time: 450 ms, number of point: 550000
display time: 510 ms, number of point: 660000
display time: 745 ms, number of point: 790000
display time: 750 ms, number of point: 970000
display time: 970 ms, number of point: 1160000
display time: 1150 ms, number of point: 1391426

Re: VBO Performance Misunderstanding

Wade Walker
Administrator
My guess would be that the variable "allocated" is getting reset to 0 somewhere else in your code, so you're re-creating the buffer every time even if you don't need to. You might try tracing through the code in the debugger or adding some print statements to make sure you only create a new buffer when it's really necessary.

Re: VBO Performance Misunderstanding

alicana
I'm afraid "allocated" is set only where the buffer size is not sufficient.

Re: VBO Performance Misunderstanding

Wade Walker
Administrator
Well, this is the situation that profilers were made for :) I usually use jvisualvm, which comes with Oracle's JVM.

Re: VBO Performance Misunderstanding

gouessej
Administrator
In reply to this post by alicana
Hi

GL_STREAM_DRAW is the slowest draw mode of the VBOs. Moreover, you shouldn't pass arrays to JOGL methods. It's possible for convenience, but it's less efficient, especially in code that is called repeatedly; use fixed-size direct NIO buffers instead, even just to get the generated buffer identifiers. You should call glDeleteBuffers when you no longer need some identifiers. Another solution is to keep the same identifier but bind it to another direct NIO buffer. Keep in mind that OpenGL might wait for some time before destroying a data store.

Edit: Are you sure that you want to use this draw mode? Is it better with GL_DYNAMIC_DRAW?
https://www.opengl.org/sdk/docs/man/html/glBufferData.xhtml
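
As a rough sketch of the "fixed-size direct NIO buffer" suggestion, the class below allocates a one-element direct IntBuffer once and reuses it, using only java.nio. The class name BufferIds and its methods are made up for illustration; in JOGL code, the buffer returned by target() would be the kind of object passed to the IntBuffer overload of glGenBuffers in place of an int[].

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

// Hypothetical holder for one generated buffer identifier.
final class BufferIds {
    // Allocated once and reused; a direct buffer lives outside the Java heap.
    private final IntBuffer ids = ByteBuffer
            .allocateDirect(Integer.BYTES)
            .order(ByteOrder.nativeOrder())
            .asIntBuffer();

    /** Resets position and limit and returns the buffer, ready to receive one id. */
    IntBuffer target() {
        ids.clear();
        return ids;
    }

    /** Reads back the first identifier without moving the position. */
    int first() {
        return ids.get(0);
    }
}
```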
Julien Gouesse

Re: VBO Performance Misunderstanding

alicana
I still have problems with the code that I share below. First of all, my goal is to draw 1M points on the screen (by accumulating points in a VBO). I tried to add points over time by using glMapBufferRange(); I guess I have trouble with the use of the buffer offset, position, or something similar.

1) I allocated a buffer for 1M points. Is this OK? -> (1e6 * FLOAT_SIZE * 3), where 3 stands for XYZ
2) I want to increase the number of points drawn by 1000 per frame. But an exception is thrown which says "GL_INVALID_VALUE Out of range: offset 0, length 15378000, ..."
3) The second step works when I set the increase to 100.
4) Is my drawArrays() call correct, or am I drawing all the points again and again?
5) Performance really does matter to me.

EDIT:

- I found the mistake that I made in the second step. I used  Buffers.SIZEOF_FLOAT when allocating the buffer, but used  Buffers.SIZEOF_FLOAT when mapping the buffer. It works now, but I still have questions.
- My frame rate decreases dramatically.
FPS: 56
FPS: 63
FPS: 49
FPS: 38
FPS: 32
FPS: 28
FPS: 26
FPS: 23
FPS: 22
FPS: 21
FPS: 19
FPS: 19
FPS: 18
FPS: 17
FPS: 16

Is this normal?



import java.awt.Dimension;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Random;

import javax.media.opengl.GL2;
import javax.media.opengl.GLAutoDrawable;
import javax.media.opengl.GLEventListener;
import javax.media.opengl.awt.GLCanvas;
import javax.swing.JFrame;
import javax.swing.SwingUtilities;

import com.jogamp.common.nio.Buffers;
import com.jogamp.opengl.util.FPSAnimator;

@SuppressWarnings("serial")
public class Main extends GLCanvas implements GLEventListener {

	private Random random = new Random();

	// VBO variables

	int[] vbo = new int[] { -1 };

	int maxNumberOfPoints = 1000000; // GL_POINTS
	int pointVertexSize = 3; // X,Y,Z
	int numberOfPointsAdded = 0;

	int pointOffset = 0;
	int increaseRate = 1000;

	// ===================================================
	// VBO related functions
	// ===================================================

	private void initVBO(GL2 gl) {

		gl.glGenBuffers(1, vbo, 0);
		gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, vbo[0]);
		gl.glBufferData(GL2.GL_ARRAY_BUFFER, Buffers.SIZEOF_FLOAT * maxNumberOfPoints * pointVertexSize, null,
				GL2.GL_DYNAMIC_DRAW);

		gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, 0);

	}

	private void updateVBO(GL2 gl) {

		gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, vbo[0]);
		ByteBuffer byteBuffer = gl.glMapBufferRange(GL2.GL_ARRAY_BUFFER, pointOffset, increaseRate * Buffers.SIZEOF_FLOAT * 3,
				GL2.GL_MAP_WRITE_BIT | GL2.GL_MAP_UNSYNCHRONIZED_BIT);

		FloatBuffer floatBuffer = byteBuffer.order(ByteOrder.nativeOrder()).asFloatBuffer();

		for (int i = 0; i < increaseRate; i++) {

			floatBuffer.put(random.nextFloat() + random.nextInt(1000)); // X
																		// coordinate
			floatBuffer.put(random.nextFloat() + random.nextInt(1000)); // Y
																		// coordinate
			floatBuffer.put(0);

			pointOffset += 3;
			numberOfPointsAdded += 1;
		}

		System.out.println(numberOfPointsAdded);

		gl.glUnmapBuffer(GL2.GL_ARRAY_BUFFER);
		gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, 0);

	}

	private void renderVBO(GL2 gl) {
		gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, vbo[0]);

		gl.glEnableClientState(GL2.GL_VERTEX_ARRAY);
		gl.glVertexPointer(3, GL2.GL_FLOAT, 0, 0l);

		gl.glDrawArrays(GL2.GL_POINTS, 0, pointOffset);

		gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, 0);
	}

	private void deleteVBO(GL2 gl) {

		gl.glDeleteBuffers(1, vbo, 0);

	}

	// ===================================================
	// OpenGL Callback Functions
	// ===================================================
	@Override
	public void init(GLAutoDrawable drawable) {
		GL2 gl = drawable.getGL().getGL2();

		gl.glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
		gl.glClearDepth(1.0f);
		gl.glEnable(GL2.GL_DEPTH_TEST);
		gl.glDepthFunc(GL2.GL_LEQUAL);
		gl.glHint(GL2.GL_PERSPECTIVE_CORRECTION_HINT, GL2.GL_NICEST);
		gl.glShadeModel(GL2.GL_SMOOTH);

		initVBO(gl);
	}

	@Override
	public void reshape(GLAutoDrawable drawable, int x, int y, int width, int height) {
		GL2 gl = drawable.getGL().getGL2();

		gl.glViewport(0, 0, width, height);

		gl.glMatrixMode(GL2.GL_PROJECTION);
		gl.glLoadIdentity();
		gl.glOrthof(0, 1000, 0, 1000, -1, 1);

		gl.glMatrixMode(GL2.GL_MODELVIEW);
		gl.glLoadIdentity();
	}

	@Override
	public void display(GLAutoDrawable drawable) {
		GL2 gl = drawable.getGL().getGL2();
		gl.glClear(GL2.GL_COLOR_BUFFER_BIT | GL2.GL_DEPTH_BUFFER_BIT);
		gl.glLoadIdentity();

		updateVBO(gl);
		renderVBO(gl);

		if (System.currentTimeMillis() - lastTimeInMillis > 1000) {
			System.out.println("FPS: " + fpsCounter);

			fpsCounter = 0;
			lastTimeInMillis = System.currentTimeMillis();
		}
		fpsCounter++;

	}

	@Override
	public void dispose(GLAutoDrawable drawable) {
		GL2 gl = drawable.getGL().getGL2();
		deleteVBO(gl);
	}

	// ===================================================
	// Main
	// ===================================================

	private static String TITLE = "JOGL 2.0 Setup (GLCanvas)";
	private static final int CANVAS_WIDTH = 320;
	private static final int CANVAS_HEIGHT = 240;
	private static final int FPS = 60;

	private int fpsCounter = 0;
	private long lastTimeInMillis = System.currentTimeMillis();

	public static void main(String[] args) {
		SwingUtilities.invokeLater(new Runnable() {
			@Override
			public void run() {

				GLCanvas canvas = new Main();
				canvas.setPreferredSize(new Dimension(CANVAS_WIDTH, CANVAS_HEIGHT));

				final FPSAnimator animator = new FPSAnimator(canvas, FPS, true);

				final JFrame frame = new JFrame();
				frame.getContentPane().add(canvas);
				frame.addWindowListener(new WindowAdapter() {
					@Override
					public void windowClosing(WindowEvent e) {
						new Thread() {
							@Override
							public void run() {
								if (animator.isStarted())
									animator.stop();
								System.exit(0);
							}
						}.start();
					}
				});
				frame.setTitle(TITLE);
				frame.pack();
				frame.setVisible(true);
				animator.start();
			}
		});
	}

	public Main() {
		this.addGLEventListener(this);
	}
}

Re: VBO Performance Misunderstanding

Wade Walker
Administrator
No, FPS decrease like that isn't normal (if the number of points is staying the same) :) I'd still advise profiling -- if this was my program, that's the first thing I'd do. Otherwise, you're just trying random things and hoping to get lucky :)

Re: VBO Performance Misunderstanding

gouessej
Administrator
In reply to this post by alicana
Hi

First, please switch to JOGL 2.3.1. You're still using an old version. Upgrading won't be enough to solve your problem, but you might be running into a bug that has already been fixed. Using an old version is a waste of time. Don't forget to modify the imports (javax.media.opengl became com.jogamp.opengl):
https://jogamp.org/bugzilla/show_bug.cgi?id=682

Secondly, I advise you to follow Wade's advice, it would help to find the culprit.

Thirdly, look at GL2ES3.GL_MAX_ELEMENTS_VERTICES. If your VBOs are too big, it will slow down your rendering. Break them into smaller ones, for example store about 1000 vertices per VBO.
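
To make the "smaller VBOs" idea concrete, here is a minimal sketch in plain Java that splits a vertex count into chunks of a given maximum size. The class name VboChunks is hypothetical, and the chunk size of 1000 is only the figure suggested above, not a measured optimum; a real program would presumably query the limit from the driver.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes how to spread vertices over several VBOs.
final class VboChunks {
    /** Returns {firstVertex, vertexCount} pairs that together cover totalVertices. */
    static List<int[]> split(int totalVertices, int maxPerVbo) {
        if (maxPerVbo <= 0) {
            throw new IllegalArgumentException("maxPerVbo must be positive");
        }
        List<int[]> chunks = new ArrayList<>();
        for (int first = 0; first < totalVertices; first += maxPerVbo) {
            int count = Math.min(maxPerVbo, totalVertices - first);
            chunks.add(new int[] { first, count });
        }
        return chunks;
    }
}
```

Each pair would then correspond to one buffer object and one draw call.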

Fourthly, as you use a single VBO, why not bind it once and for all and never "unbind" it, at least for your test?

Finally, post the complete stack trace of the GL exception GL_INVALID_VALUE. Are you sure that you want to call the relative put() method? In my humble opinion, there is something wrong in updateVBO() but it doesn't impact the performance.
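
For readers unsure about the difference, this small java.nio sketch contrasts the relative and absolute put() methods of FloatBuffer. The class name PutModes is made up for the example; it uses a heap buffer, but the position rules are the same for a buffer obtained from a mapped VBO.

```java
import java.nio.FloatBuffer;

// Hypothetical demo of relative vs. absolute writes.
final class PutModes {
    /** Returns the buffer position after three relative puts and after one absolute put. */
    static int[] demo() {
        FloatBuffer fb = FloatBuffer.allocate(6);

        // Relative put: writes at the current position, then advances it.
        fb.put(1f).put(2f).put(3f);
        int afterRelative = fb.position();

        // Absolute put: writes at the given index, position is unchanged.
        fb.put(5, 9f);
        int afterAbsolute = fb.position();

        return new int[] { afterRelative, afterAbsolute };
    }
}
```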

Julien Gouesse

Re: VBO Performance Misunderstanding

gouessej
Administrator
In reply to this post by alicana
Use a plain animator (replace FPSAnimator with Animator) and disable v-sync (for example with gl.setSwapInterval(0)).

You can't write beyond the capacity of your buffer. You should stop updating the buffer in your loop when fewer than three floats remain, that is, when floatBuffer.remaining() < 3.
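
A minimal sketch of such a guarded loop, using only java.nio, might look like the following. The class name SafeFill and the placeholder coordinates are assumptions for illustration; the point is that the loop checks remaining() before each XYZ triple, so it never throws BufferOverflowException.

```java
import java.nio.FloatBuffer;

// Hypothetical helper: fills a buffer with XYZ triples without overflowing it.
final class SafeFill {
    /** Writes up to maxPoints triples and returns how many points were written. */
    static int fill(FloatBuffer target, int maxPoints) {
        int written = 0;
        while (written < maxPoints && target.remaining() >= 3) {
            target.put((float) written); // X (placeholder value)
            target.put((float) written); // Y (placeholder value)
            target.put(0f);              // Z
            written++;
        }
        return written;
    }
}
```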

I have rarely used glMapBufferRange. The documentation says:
"offset and length indicate the range of data in the buffer object that is to be mapped, in terms of basic machine units"
https://www.opengl.org/sdk/docs/man/html/glMapBufferRange.xhtml
Maybe your offset is wrong.
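
Since the documentation expresses offset and length in basic machine units (bytes), a quick way to check a range before mapping it is sketched below in plain Java. The class MapRange and its method names are hypothetical; the constant mirrors the value of Buffers.SIZEOF_FLOAT (4), and three floats per point is the XYZ layout used in the posted code.

```java
// Hypothetical helper: converts point counts to byte ranges and validates them.
final class MapRange {
    static final int SIZEOF_FLOAT = Float.BYTES; // 4 bytes
    static final int FLOATS_PER_POINT = 3;       // X, Y, Z

    /** Byte offset of the first free slot after the points already stored. */
    static long byteOffset(int pointsAlreadyStored) {
        return (long) pointsAlreadyStored * FLOATS_PER_POINT * SIZEOF_FLOAT;
    }

    /** Byte length needed for a batch of new points. */
    static long byteLength(int pointsToAdd) {
        return (long) pointsToAdd * FLOATS_PER_POINT * SIZEOF_FLOAT;
    }

    /** True when [offset, offset + length) lies inside a store of the given size. */
    static boolean fits(long offset, long length, long bufferSizeBytes) {
        return offset >= 0 && length >= 0 && offset + length <= bufferSizeBytes;
    }
}
```

A range that fails this check is the kind of request that OpenGL rejects with GL_INVALID_VALUE.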

Please call floatBuffer.rewind() before unmapping the buffer, to be safe, even though it might be unnecessary.
Julien Gouesse

Re: VBO Performance Misunderstanding

alicana
In reply to this post by Wade Walker
CPU Sample

Thank you for the advice; I'm profiling now. I hope I'll find a solution and share the results with you.

Re: VBO Performance Misunderstanding

alicana
In reply to this post by gouessej
Hi again, I made some improvements to my code and wanted to share them with you.

Main.java

Changes:
First, as gouessej advised, I upgraded the JOGL library to the newest version (2.3.1).

1) I used Animator instead of FPSAnimator.
2) Apparently, I made a mistake by passing the wrong count parameter to the glDrawArrays call.
3) The pointOffset variable, which tracks how many vertices have been written, was also wrong. When updating the offset value, it should be multiplied by SIZEOF_FLOAT.
4) I used glBufferSubData instead of glMapBufferRange. I believe this is why performance improved (just a hunch).
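
The bookkeeping behind points 2) and 3) can be sketched as follows. The updated Main.java is not shown in the thread, so the class PointCursor and its methods are hypothetical; the sketch only illustrates keeping a single point counter and deriving both the byte offset for the next upload and the vertex count for glDrawArrays from it.

```java
// Hypothetical bookkeeping for an append-only point VBO.
final class PointCursor {
    static final int FLOATS_PER_POINT = 3; // X, Y, Z
    private final int maxPoints;
    private int pointsAdded = 0;

    PointCursor(int maxPoints) {
        this.maxPoints = maxPoints;
    }

    /** Byte offset at which the next batch should be written. */
    long nextByteOffset() {
        return (long) pointsAdded * FLOATS_PER_POINT * Float.BYTES;
    }

    /** Reserves room for up to 'requested' points and returns how many actually fit. */
    int advance(int requested) {
        int accepted = Math.min(requested, maxPoints - pointsAdded);
        pointsAdded += accepted;
        return accepted;
    }

    /** Count argument for glDrawArrays(GL_POINTS, 0, count): vertices, not floats. */
    int drawCount() {
        return pointsAdded;
    }
}
```

Passing the number of floats instead of the number of vertices as the count would ask OpenGL to draw three times as many points as were written.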

Thanks to Wade and gouessej for their advice.

EDIT: FPS results
FPS: 48
FPS: 61
FPS: 61
FPS: 59
FPS: 58
FPS: 56
FPS: 56
FPS: 55
FPS: 56
FPS: 53
FPS: 51
FPS: 51
FPS: 51
FPS: 50
FPS: 45
FPS: 47
FPS: 47
FPS: 46
FPS: 41 <- About 10 M points.

Re: VBO Performance Misunderstanding

gouessej
Administrator
Hi

I'm glad to see that your frame rate is higher. glMapBuffer*/glUnmapBuffer is really useful to reduce the memory footprint on the CPU side because you don't have to keep a direct NIO buffer allocated in the native memory of the JVM for each VBO to make a transfer. The performance of glBufferSubData vs glMapBuffer*/glUnmapBuffer depends on the size of the updated data too.
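
To illustrate the memory trade-off mentioned here, the sketch below shows the kind of reusable direct staging buffer that a glBufferSubData-based upload typically keeps on the CPU side. The class StagingBuffer is hypothetical and uses only java.nio; in real code, the buffer returned by prepare() and the size from sizeInBytes() would be the data and size arguments of the upload call.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical CPU-side staging area, allocated once and reused for every batch.
final class StagingBuffer {
    private final FloatBuffer staging;

    StagingBuffer(int maxPointsPerBatch) {
        staging = ByteBuffer
                .allocateDirect(maxPointsPerBatch * 3 * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /**
     * Copies the XYZ floats in and returns the buffer flipped for reading.
     * Throws BufferOverflowException if xyz is larger than the staging area.
     */
    FloatBuffer prepare(float[] xyz) {
        staging.clear();
        staging.put(xyz);
        staging.flip();
        return staging;
    }

    /** Number of bytes between position and limit. */
    static long sizeInBytes(FloatBuffer buffer) {
        return (long) buffer.remaining() * Float.BYTES;
    }
}
```

With glMapBuffer*/glUnmapBuffer, this extra native allocation is not needed, because the data is written directly into the mapped range.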
Julien Gouesse