[SYCL][CUDA] Wrong global range when sycl::range is passed a prime

**Describe the bug**

I believe https://github.com/intel/llvm/pull/9787 introduced a bug: when a `sycl::range` dimension is prime, it gets rounded up, so the kernel is launched over a larger global range than the one requested.

**To Reproduce**
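``` c++
#include <sycl/sycl.hpp>

#include <cstdio>

int main() {
  sycl::queue q{};
  q.submit([=](sycl::handler &cgh) {
    // Requested global range is 256 x 5; the second dimension is prime.
    unsigned int x_max = 256;
    unsigned int y_max = 5;
    cgh.parallel_for(sycl::range<2>{x_max, y_max}, [=](sycl::item<2> item) {
      int x = item.get_id(0);
      int y = item.get_id(1);
      // No work-item should ever see an id outside the requested range.
      if (y >= y_max)
        printf("Oh no! y: %d\n", y);
    });
  });
  // Wait for the kernel so any device-side printf output appears before exit.
  q.wait();
}
```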
Compile & run:
```
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda mwe.cpp && ./a.out
```
On my system (RTX 3060) this produces the following grid & block size:
```
grid:  <<<3, 1, 1>>>
block: <<<2, 256, 1>>>
```
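
To make the mismatch explicit, the launch configuration above can be multiplied out. The sketch below is plain host-side C++ (no SYCL needed). `Dim3`, `launched_threads`, `matches_range`, and `check_reported_launch` are hypothetical helpers introduced only for illustration, and the sketch assumes that CUDA's `x` dimension corresponds to the last SYCL dimension (dimension 1 of the 2D range here).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a CUDA-style (x, y, z) launch dimension triple.
struct Dim3 {
  std::size_t x, y, z;
};

// Threads launched along each CUDA dimension: grid size times block size.
constexpr Dim3 launched_threads(Dim3 grid, Dim3 block) {
  return {grid.x * block.x, grid.y * block.y, grid.z * block.z};
}

// Assumption: for a 2D sycl::range{dim0, dim1}, CUDA x covers dim1 and
// CUDA y covers dim0. Returns true only if the launch matches exactly.
constexpr bool matches_range(Dim3 launched, std::size_t dim0,
                             std::size_t dim1) {
  return launched.y == dim0 && launched.x == dim1 && launched.z == 1;
}

// Checks the numbers reported in this issue: grid <<<3, 1, 1>>> and
// block <<<2, 256, 1>>> against the requested range {256, 5}.
inline void check_reported_launch() {
  constexpr Dim3 launched = launched_threads({3, 1, 1}, {2, 256, 1});
  assert(launched.x == 6);    // 3 * 2: one more than the requested 5
  assert(launched.y == 256);  // 1 * 256: matches the requested 256
  assert(!matches_range(launched, 256, 5));
}
```

Under that assumption, the reported configuration launches 256 x 6 threads rather than the requested 256 x 5, so work-items with `y == 5` would run unless something guards against them.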

