GLCLInteroperabilityDemo and float4

paolofuse
Hi all,
GLCLInteroperabilityDemo runs correctly for me, but I want to understand why it uses a float4 vertex and not a float3 vertex.
Why can't I use this kernel:

kernel void sineWave(global float3 * vertex, int size, float time) {

    unsigned int x = get_global_id(0);
    unsigned int y = get_global_id(1);

    // calculate uv coordinates
    float u = x / (float) size;
    float v = y / (float) size;

    u = u*2.0f - 1.0f;
    v = v*2.0f - 1.0f;

    // calculate simple sine wave pattern
    float freq = 4.0f;
    float w = sin(u*freq + time) * cos(v*freq + time) * 0.5f;

    // write output vertex
    vertex[y*size + x] = (float3)(u*10.0f, w*10.0f, v*10.0f);
}

and this in the display method:

gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, glObjects[VERTICES]);
gl.glVertexPointer(3, GL2.GL_FLOAT, 0, 0);
gl.glColor3f(1, 1, 1);
gl.glEnableClientState(GL2.GL_VERTEX_ARRAY);
gl.glDrawArrays(GL2.GL_POINTS, 0, MESH_SIZE * MESH_SIZE);
gl.glDisableClientState(GL2.GL_VERTEX_ARRAY);
gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, 0);

This doesn't work. Is there an explanation?
Thanks

Re: GLCLInteroperabilityDemo and float4

elect
Just a small hint: if you need GPGPU with OpenGL, you may also want to consider compute shaders; they have several advantages over OpenCL.

Re: GLCLInteroperabilityDemo and float4

Wade Walker
Administrator
In reply to this post by paolofuse
It could be that if you're not using 4-component vertices, then when they're multiplied by the transform matrix, they don't pick up the translation, so the object may be getting drawn behind the camera. You'd see an empty screen in that case :)
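The arithmetic behind that guess can be sketched in plain Java. This is only an illustration of how the w component interacts with a translation matrix; the class name HomogeneousW, the sample point, and the camera offset are all made up for the example, and it does not show what the demo's pipeline actually supplies for w.

```java
// Illustration only: a 4x4 translation applied to a point with w = 1 versus w = 0.
// All names and values here are hypothetical, not taken from the demo.
final class HomogeneousW {
    // Multiplies a row-major 4x4 matrix by a 4-component column vector.
    static float[] transform(float[][] m, float[] v) {
        float[] out = new float[4];
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                out[row] += m[row][col] * v[col];
            }
        }
        return out;
    }

    // Builds a translation matrix; the translation sits in the fourth column.
    static float[][] translation(float tx, float ty, float tz) {
        return new float[][] {
            {1, 0, 0, tx},
            {0, 1, 0, ty},
            {0, 0, 1, tz},
            {0, 0, 0, 1}
        };
    }

    public static void main(String[] args) {
        float[][] t = translation(0f, 0f, -5f); // hypothetical camera offset
        float[] withW1 = transform(t, new float[] {1f, 2f, 3f, 1f});
        float[] withW0 = transform(t, new float[] {1f, 2f, 3f, 0f});
        System.out.println(java.util.Arrays.toString(withW1)); // [1.0, 2.0, -2.0, 1.0]
        System.out.println(java.util.Arrays.toString(withW0)); // [1.0, 2.0, 3.0, 0.0]
    }
}
```

With w = 1 the point moves by the translation; with w = 0 the fourth column is multiplied by zero and the point stays where it was.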

Re: GLCLInteroperabilityDemo and float4

paolofuse
The problem was that float3 (and other type3 types) behave as float4 (and other type4 types) for the purposes of size and alignment, as mentioned here:

http://stackoverflow.com/a/16647333

So if I use float3, I still have to specify the w component of the vector, as if it were a 4-component vector.
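A minimal sketch of the consequence, in plain Java without any GL or CL context: it simulates a buffer in which every float3 occupies 16 bytes, then reads the second vertex back with a tightly packed 12-byte stride and with a 16-byte stride. The class Float3Layout and its helpers are hypothetical, and the padding float is written as 0 only for the simulation (its real contents are not something to rely on).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper that mimics the memory layout of a global float3 array.
final class Float3Layout {
    static final int FLOAT_BYTES = 4;
    // Assumption taken from the post above: float3 has the size and alignment of float4.
    static final int FLOAT3_STRIDE_BYTES = 4 * FLOAT_BYTES;

    // Lays out each point as x, y, z followed by one padding float.
    static FloatBuffer writeAsFloat3(float[][] points) {
        FloatBuffer buf = ByteBuffer.allocate(points.length * FLOAT3_STRIDE_BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int i = 0; i < points.length; i++) {
            int base = i * FLOAT3_STRIDE_BYTES / FLOAT_BYTES;
            buf.put(base, points[i][0]);
            buf.put(base + 1, points[i][1]);
            buf.put(base + 2, points[i][2]);
            // base + 3 is padding and stays 0 in this simulation
        }
        return buf;
    }

    // Reads three floats for vertex `index`, given the byte stride between vertices.
    static float[] readVertex(FloatBuffer buf, int index, int strideBytes) {
        int base = index * strideBytes / FLOAT_BYTES;
        return new float[] {buf.get(base), buf.get(base + 1), buf.get(base + 2)};
    }

    public static void main(String[] args) {
        FloatBuffer buf = writeAsFloat3(new float[][] {{1f, 2f, 3f}, {4f, 5f, 6f}});
        // Tightly packed read (12 bytes per vertex) lands on the padding float.
        System.out.println(java.util.Arrays.toString(readVertex(buf, 1, 12))); // [0.0, 4.0, 5.0]
        // Read with a 16-byte stride recovers the vertex.
        System.out.println(java.util.Arrays.toString(readVertex(buf, 1, 16))); // [4.0, 5.0, 6.0]
    }
}
```

In the display method, a stride of 0 in glVertexPointer means tightly packed, i.e. 12 bytes for three floats. So one option that should match this layout is gl.glVertexPointer(3, GL2.GL_FLOAT, 16, 0), with the buffer still sized for four floats per vertex. I have not tested that against the demo.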

Re: GLCLInteroperabilityDemo and float4

Wade Walker
Administrator
Aha, nice find! I wasn't aware that OpenCL float3 values are aligned like float4. Presumably it's to avoid split loads, where one float3 spans two cache lines (thereby taking two cycles in the load/store unit). Since cache lines are usually 64 bytes, there's really no point using an element size that isn't an integral divisor of 64: you're theoretically saving bandwidth, but the extra split transactions (and possibly the unaligned accesses) will kill the benefit. It might still be worth saving large data structures in the packed 3-component format, then converting to float4 in memory, though :)
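To put rough numbers on the split-load argument, here is a small Java sketch that counts how many elements of a given size straddle a cache-line boundary. It assumes 64-byte lines and an array that starts on a line boundary; the class CacheLineSplits is made up for the example, and real hardware may behave differently.

```java
// Hypothetical counter for elements that cross a cache-line boundary.
final class CacheLineSplits {
    static final int CACHE_LINE_BYTES = 64; // assumed typical size; actual hardware may differ

    // Counts elements whose first and last byte fall in different cache lines,
    // assuming the array starts at a cache-line-aligned address.
    static int countSplitElements(int elementBytes, int elementCount) {
        int splits = 0;
        for (int i = 0; i < elementCount; i++) {
            long first = (long) i * elementBytes;
            long last = first + elementBytes - 1;
            if (first / CACHE_LINE_BYTES != last / CACHE_LINE_BYTES) {
                splits++;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Packed 12-byte elements versus padded 16-byte elements.
        System.out.println(countSplitElements(12, 1024)); // 128
        System.out.println(countSplitElements(16, 1024)); // 0
    }
}
```

Under these assumptions, one packed 12-byte element in eight crosses a line boundary, while 16-byte elements never do.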