Posted by
notzed on
Aug 30, 2010; 2:44pm
URL: https://forum.jogamp.org/port-of-apple-s-fft-tp1379040p1389039.html
On 30 August 2010 22:03, Michael Bien [via jogamp] <[hidden email]> wrote:
>
> On 08/30/2010 05:31 AM, notzed [via jogamp] wrote:
> patch is ok. I will try to make sure that you are listed as author in the
> commit message... should work somehow. That's the main reason why we usually
> prefer to just pull from other git forks.
Well, if it becomes a regular thing I'll look into it - if it's just
for kudos I don't care, but if you want it for blame I guess that's
another matter. FWIW, regarding the demo code: it is just using
arrays at the moment, since that's what the FFT uses and I haven't had
time to play with images yet (already midnight ... again).
> BTW should put1drangekernel(kernel, 0, G, L);
> be the same as
> put2drangekernel(kernel, 0, 0, G, 1, L, 1);?
>
> I don't think so. The 2D range methods pass 2 as the dimension param of
> 'clEnqueueNDRangeKernel' and the 1D counterpart passes 1. So both cannot
> give the same result.
Hmm, I still don't see why not - if the size of the second dimension is 1,
won't it just iterate over that dimension "from 0 to 0", which will look
the same to the GPU code as a 1D call?
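To spell out the "from 0 to 0" reasoning, here is a small plain-Java sketch. It does not touch JOCL or OpenCL at all: the class NDRangeSketch and its ids1D/ids2D helpers are made-up names that only simulate which get_global_id(0) values a range would hand out, assuming the kernel reads dimension 0 only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: simulates index enumeration, not a real enqueue.
public class NDRangeSketch {

    /** Dimension-0 global IDs handed out by a 1D range of the given size. */
    public static List<Long> ids1D(long offset, long globalSize) {
        List<Long> ids = new ArrayList<>();
        for (long x = 0; x < globalSize; x++) {
            ids.add(offset + x);
        }
        return ids;
    }

    /** Dimension-0 global IDs handed out by a 2D range of sizeX by sizeY. */
    public static List<Long> ids2D(long offX, long offY, long sizeX, long sizeY) {
        List<Long> ids = new ArrayList<>();
        // With sizeY == 1 the outer loop runs once, i.e. y goes "from 0 to 0".
        for (long y = 0; y < sizeY; y++) {
            for (long x = 0; x < sizeX; x++) {
                ids.add(offX + x);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        long g = 64; // stand-in for gWorkItems
        // Compare the 1D range against the 2D range with a second dimension of 1.
        System.out.println(ids1D(0, g).equals(ids2D(0, 0, g, 1)));
    }
}
```

This only covers the indices. In real OpenCL, get_work_dim() would still report 2 rather than 1 for the 2D call, so a kernel that checks it could behave differently.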
But regardless - it's the original C that does a 1d call:
err |= clEnqueueNDRangeKernel(queue, kernelInfo->kernel, 1, NULL,
&gWorkItems, &lWorkItems, 0, NULL, NULL);
To make sure I didn't make it up or make a mistake at some ungodly
hour of the morning, I just checked again ...
this works no worries:
queue.put2DRangeKernel(kernelInfo.kernel, 0, 0, wd.gWorkItems, 1,
wd.lWorkItems, 1);
this definitely doesn't:
queue.put1DRangeKernel(kernelInfo.kernel, 0, wd.gWorkItems, wd.lWorkItems);
But I can't say why - the JOCL source I have looks correct (although
the different copy2NIO() implementations look strange). Maybe it was
something fixed a month or two ago: the version I'm actually using
isn't minty-fresh for various reasons, and my checkout is a few weeks
old at least.
(I've only been running it on 64-bit Linux, FWIW.)
Anyway, since just talking about weirdness is a bit of a pain, I got off
my lazy bum and put the code into jocl-demos. So I'll upload a
patch shortly, once I reset my bugzilla password and can work out how
to work that offensively named and natured version control system.
Not sure what to do about Apple's huge blurb - it says you only need
to keep it if you distribute it unmodified, which is clearly not the
case here. I left it in whole. I've only tested it on NVIDIA.
http://jogamp.org/bugzilla/show_bug.cgi?id=408
(FWIW, the single put1drange call commented out in CLFFTPlan.java is
the case it hits with this demo.)
Michael