Broken output from my algorithm on nVidia OpenCL implementation

Posted by Wibowit on Apr 20, 2011; 9:15pm
URL: https://forum.jogamp.org/Broken-output-from-my-algorithm-on-nVidia-OpenCL-implementation-tp2843828.html

Hi,

I've developed an OpenCL implementation of ST5 (Schindler's Sort Transform of order 5). On my system (Radeon HD 5770, APP SDK 2.4, CCC 11.3, Ubuntu 64-bit or Windows 7 64-bit) it behaves correctly and produces output identical to that of a verified CPU-based implementation. Unfortunately, it doesn't produce the same output on nVidia cards, as reported by inikep here: http://encode.ru/threads/1275-OpenCL-ST5-implementation
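
For anyone unfamiliar with the transform, here is a minimal CPU sketch of an order-k sort transform. It is an illustration only and makes some assumptions: it uses the common textbook definition (rotations sorted by their first k bytes, ties kept in original order, last byte of each rotation emitted), the name `sort_transform` is made up for this post, and it does not reproduce the exact byte layout written by bsc or by my OpenCL code.

```python
def sort_transform(data: bytes, k: int) -> bytes:
    """Naive order-k sort transform (illustrative, not optimized).

    Assumed definition: sort the cyclic rotations of `data` by their
    first k bytes only, keep ties in original position order, and emit
    the byte that precedes each rotation's start.
    """
    n = len(data)
    if n == 0:
        return b""

    def context(i: int) -> bytes:
        # First k bytes of the rotation starting at position i (cyclic).
        return bytes(data[(i + j) % n] for j in range(k))

    # Python's sorted() is stable, so equal contexts stay in position order.
    order = sorted(range(n), key=context)

    # data[i - 1] wraps around to the last byte when i == 0.
    return bytes(data[i - 1] for i in order)
```

With a small k, rotations that share a context are not ordered any further, so how ties are broken directly affects the output. That makes tie handling one plausible place to look when two implementations disagree.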

The program with test data is here: http://www12.zippyshare.com/v/44761190/file.html

enwik16MiB is the input data
enwik16MiB.st5.bak is the correct output
bsc_st5.exe is the "reference" encoder
StreamPacker.tar.gz is my OpenCL implementation - it takes two parameters: input file name and output file name
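
To check a run against the reference output, a small script like the one below can report where the two files first differ. This is only a sketch: the function name `first_mismatch` is hypothetical and it is not part of StreamPacker.

```python
def first_mismatch(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference between two files,
    or None if the files are identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the first differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Common prefix matches, so one file is simply shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(a)
```

For example, if the OpenCL output was written to a file named enwik16MiB.st5 (that output name is just an assumption), calling `first_mismatch("enwik16MiB.st5", "enwik16MiB.st5.bak")` returns `None` when the output is correct, and otherwise the offset where the corruption starts.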

Could someone test it on an nVidia card? I would be glad if someone could help me hunt down the bug. I'm counting on you, Michael :)