Writing range of CLBuffer to GPU

bananafish

Aug 12, 2020; 1:40pm

8 posts

I'm having trouble writing a range of a Java array to an offset within a GPU-side buffer in OpenCL. In other words, I want to read a given number of elements from my Java array starting at one offset and write them to the CLBuffer on the GPU starting at a different offset.

The original C API makes this possible: it exposes the offset into the buffer object (GPU end) at which writing begins, along with an explicit length. Unfortunately, the source for CLCommandQueue.putWriteBuffer() hard-codes the offset to zero, closing that option off.

I tried calling CLBuffer.getBuffer().position(start).limit(end) and then invoking putWriteBuffer(...), but ended up with a segfault. I found that CLMemory.getNIOSize() uses Buffer.capacity(), and I believe this results in an overflow, since the full capacity would be transferred even though only part of the buffer was selected.

An older post on this forum suggested that getNIOSize() use limit() instead to determine the write length. I disagree: if position() determines the host offset when a pointer is calculated from the NIO buffer, then position() + limit() can exceed capacity() and again result in an overflow. I assert that the proper expression for the number of bytes to transfer is remaining() * getElementSize(), with perhaps position() * getElementSize() for the offset on the GPU. Alternatively, a method could be written that takes an explicit GPU offset (in elements), which would make the offsets on the host and the GPU independent.
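To make the arithmetic concrete, here is a minimal sketch using only java.nio, with no JOCL calls. The class name TransferRange and its two helper methods are hypothetical, and Float.BYTES stands in for getElementSize(). It compares the three candidate transfer lengths for a buffer whose position() has been advanced.

```java
import java.nio.FloatBuffer;

// Hypothetical helper, not part of JOCL: illustrates the byte arithmetic only.
public class TransferRange {

    /** Bytes between position() and limit(), i.e. the selected range. */
    public static long byteCount(FloatBuffer buf) {
        return (long) buf.remaining() * Float.BYTES;
    }

    /** Byte offset of position() from the start of the buffer. */
    public static long byteOffset(FloatBuffer buf) {
        return (long) buf.position() * Float.BYTES;
    }

    public static void main(String[] args) {
        FloatBuffer buf = FloatBuffer.allocate(16); // capacity: 16 floats = 64 bytes
        buf.position(6);
        buf.limit(12);

        long capacityBytes = (long) buf.capacity() * Float.BYTES;
        long limitBytes = (long) buf.limit() * Float.BYTES;
        long offset = byteOffset(buf);

        // Starting at the offset, a capacity()- or limit()-sized transfer runs
        // past the end of the storage; a remaining()-sized transfer does not.
        System.out.println("capacity-based end:  " + (offset + capacityBytes));
        System.out.println("limit-based end:     " + (offset + limitBytes));
        System.out.println("remaining-based end: " + (offset + byteCount(buf)));
        System.out.println("storage size:        " + capacityBytes);
    }
}
```

Only the remaining()-based length stays within the 64 bytes of storage once the starting point is moved to position().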

gouessej

Aug 12, 2020; 5:14pm

Re: Writing range of CLBuffer to GPU

Administrator

6044 posts

Please file a bug report. I'm not sure that a brand new method is needed for that; maybe you can create a CLBuffer that maps exactly onto your range.
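One way to get an NIO buffer that maps exactly onto a range is duplicate() followed by slice(), sketched below with plain java.nio. The class RangeView and the method rangeView are hypothetical names. Whether JOCL handles such a sliced view correctly when it is wrapped in a CLBuffer is an assumption that would need testing, and this does not by itself give control over the offset on the GPU side.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper, not part of JOCL.
public class RangeView {

    /**
     * Returns a view of elements [start, end) of source. The view shares
     * contents with source, starts at position 0, and has capacity end - start.
     */
    public static FloatBuffer rangeView(FloatBuffer source, int start, int end) {
        FloatBuffer dup = source.duplicate(); // own position/limit, shared contents
        dup.limit(end);
        dup.position(start);
        return dup.slice();
    }

    public static void main(String[] args) {
        // A direct buffer in native byte order, as native bindings usually expect.
        FloatBuffer host = ByteBuffer.allocateDirect(16 * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int i = 0; i < host.capacity(); i++) {
            host.put(i, i); // absolute put, leaves position() untouched
        }

        FloatBuffer view = rangeView(host, 6, 12);
        System.out.println("view capacity: " + view.capacity());
        System.out.println("first element: " + view.get(0));
        System.out.println("host position unchanged: " + host.position());
    }
}
```

Because the view's capacity() equals the length of the range, a size computed from capacity() would match the range rather than the whole backing array.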

Julien Gouesse | Personal blog | Website

Wade Walker

Aug 12, 2020; 11:25pm

Re: Writing range of CLBuffer to GPU

Administrator

857 posts

In reply to this post by bananafish

If you can give us a test that reproduces the bug and a patch that fixes it, I'd be happy to commit it for you. JOCL is a little tricky, since the different platform implementations all seem to behave a little differently, so a test that exposes something like this would definitely be valuable.