Hi! I just ran into a strange problem running the radix sort demo. It compiles cleanly, with no warnings from the CL kernels, but execution produces incorrect results. The results look as though they were affected by data races or incorrect bit operations. For example, with maxValue = 10, the program outputs the following:
The set looks only partially sorted, as happens when data races occur. Furthermore, I presume that no numbers greater than 9 should appear in the data set during normal execution. I tried -Werror and -cl-opt-disable for the CL builds, but this doesn't help: the problem does not disappear, and the builds do not fail with errors. Could someone please help me with this issue?
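In case it helps anyone reproduce this, here is a rough sketch of how the two symptoms above (values out of range, partial ordering) could be checked on the host side. It is not part of the demo: the function name `check_sort_output` and its parameters are made up for illustration, and it assumes the keys are plain integers drawn from [0, maxValue) and that the input and output have been copied back into ordinary Python lists.

```python
from collections import Counter


def check_sort_output(original, result, max_value):
    """Return a list of human-readable problems found in `result`.

    Hypothetical helper, not part of the radix sort demo. Assumes keys
    are integers in the half-open range [0, max_value).
    """
    problems = []

    # Every key should lie in [0, max_value).
    out_of_range = [x for x in result if not 0 <= x < max_value]
    if out_of_range:
        problems.append(f"{len(out_of_range)} value(s) outside [0, {max_value})")

    # Adjacent elements should be in non-decreasing order.
    inversions = sum(1 for a, b in zip(result, result[1:]) if a > b)
    if inversions:
        problems.append(f"{inversions} adjacent pair(s) out of order")

    # The output should contain exactly the same keys as the input.
    if Counter(original) != Counter(result):
        problems.append("output is not a permutation of the input")

    return problems


if __name__ == "__main__":
    data = [3, 1, 2]
    for issue in check_sort_output(data, [1, 12, 2], 10):
        print(issue)
```

An empty list means the output passed all three checks. The permutation check is the useful one for telling a race apart from a plain ordering bug: if keys are lost, duplicated, or invented, something is overwriting data rather than merely misplacing it.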
I've noticed in the past that this demo doesn't give correct results when run on AMD hardware. I worked on it for a while to try to isolate the problem, but I was unsuccessful. Apparently the code makes some assumption about the underlying architecture which is valid for Nvidia but not AMD. If you happen to find the problem, I'd gratefully accept a patch!