Re: GLCLInteroperabilityDemo and float4

Posted by Wade Walker on Mar 08, 2016; 2:10am
URL: https://forum.jogamp.org/GLCLInteroperabilityDemo-and-float4-tp4036419p4036450.html

Aha, nice find! I wasn't aware that an OpenCL float3 is aligned like a float4 (16 bytes). Presumably that's to avoid split loads, where one float3 spans two cache lines (and so needs two accesses in the load/store unit instead of one). Since cache lines are usually 64B, there's really no point using an element size that isn't an integral divisor of 64: a packed 12-byte float3 theoretically saves bandwidth, but the extra split transactions (and possibly the unaligned accesses) will kill the benefit. It might still be worth storing large data structures in the packed format (on disk, say), then converting to float4 once they're in memory, though :)
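
To put rough numbers on the split-load argument, here is a minimal host-side Java sketch. It assumes a 64-byte cache line and a buffer whose start is cache-line aligned, and it only models byte offsets, not what any particular GPU or driver actually does. The class and method names (Float3Padding, countSplitElements, padToFloat4) are made up for illustration and are not part of JOCL or OpenCL.

```java
public class Float3Padding {
    // Assumption: 64-byte cache lines and a buffer that starts on a line boundary.
    static final int CACHE_LINE_BYTES = 64;

    /** Counts elements of a tightly packed array that straddle a cache-line boundary. */
    static int countSplitElements(int elementBytes, int count) {
        int splits = 0;
        for (int i = 0; i < count; i++) {
            long first = (long) i * elementBytes;   // offset of the element's first byte
            long last = first + elementBytes - 1;   // offset of the element's last byte
            if (first / CACHE_LINE_BYTES != last / CACHE_LINE_BYTES) {
                splits++;
            }
        }
        return splits;
    }

    /** Expands packed xyz triples into xyzw quads, leaving w as 0.0f padding. */
    static float[] padToFloat4(float[] packed) {
        if (packed.length % 3 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 3");
        }
        int n = packed.length / 3;
        float[] padded = new float[n * 4];          // new arrays are zero-filled
        for (int i = 0; i < n; i++) {
            System.arraycopy(packed, i * 3, padded, i * 4, 3);
        }
        return padded;
    }

    public static void main(String[] args) {
        int count = 16;
        System.out.println("packed 12-byte elements split across lines: "
                + countSplitElements(12, count));
        System.out.println("padded 16-byte elements split across lines: "
                + countSplitElements(16, count));

        float[] packed = {1f, 2f, 3f, 4f, 5f, 6f};  // two xyz triples
        System.out.println(java.util.Arrays.toString(padToFloat4(packed)));
    }
}
```

The padded array is what you would copy into the buffer handed to a kernel that takes float3 or float4 arguments, while the packed array is the compact form you might keep on disk.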