The num_simd_work_items argument must evenly divide the reqd_work_group_size arguments. Does that mean this code should diagnose?
struct TRIFuncObjGood8 {
[[intel::reqd_work_group_size(64, 64)]]
[[intel::num_simd_work_items(4)]] void
operator()() const {}
};
Note that reqd_work_group_size takes three arguments, two of which are optional and default to the value 1, which is not evenly divisible by 4.
According to https://www.intel.com/content/www/us/en/programmable/documentation/mwh1391807965224.html#mwh1391807939093 (section 5.2.13, Specifying Number of SIMD Work-Items):
Important: Introduce the num_simd_work_items attribute in conjunction with the reqd_work_group_size attribute. The num_simd_work_items attribute you specify must evenly divide the work-group size you specify for the reqd_work_group_size attribute.
I initially read this as having to evenly divide all of the work group size arguments (because a default value is still a value that is specified). But then I thought maybe the "you specify" means "you specify explicitly", so we'd only check the non-defaulted arguments. But then the example shown immediately below the text is:
__attribute__((num_simd_work_items(4)))
__attribute__((reqd_work_group_size(64,1,1)))
which explicitly specifies all three arguments! I started digging around to see if reqd_work_group_size has any documentation that suggests the first argument is the work group size (and the other arguments mean something different), but I didn't see any documentation on what those arguments mean individually (only collectively). So it's not clear what the diagnostic behavior should be.
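To make the three candidate readings concrete, here is a minimal sketch of each as a standalone predicate. Everything in it (WorkGroupSize, make_wg, and the divides_* functions) is a hypothetical name invented for illustration; none of it is compiler code or an existing API, and it does not claim to reflect what the implementation currently does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model of the attribute's arguments (illustration only).
// Omitted trailing arguments default to 1, as described above.
struct WorkGroupSize {
  std::array<unsigned, 3> dims;
  std::size_t num_explicit; // how many arguments were written in source
};

constexpr WorkGroupSize make_wg(unsigned x) {
  return WorkGroupSize{{x, 1u, 1u}, 1};
}
constexpr WorkGroupSize make_wg(unsigned x, unsigned y) {
  return WorkGroupSize{{x, y, 1u}, 2};
}
constexpr WorkGroupSize make_wg(unsigned x, unsigned y, unsigned z) {
  return WorkGroupSize{{x, y, z}, 3};
}

// Reading 1: the SIMD width must divide every argument, defaults included.
constexpr bool divides_all_dims(const WorkGroupSize &wg, unsigned simd) {
  if (simd == 0) return false;
  for (std::size_t i = 0; i < 3; ++i)
    if (wg.dims[i] % simd != 0) return false;
  return true;
}

// Reading 2: the SIMD width must divide only the explicitly written arguments.
constexpr bool divides_explicit_dims(const WorkGroupSize &wg, unsigned simd) {
  if (simd == 0) return false;
  for (std::size_t i = 0; i < wg.num_explicit; ++i)
    if (wg.dims[i] % simd != 0) return false;
  return true;
}

// Reading 3: only the first argument is "the work-group size" for this check.
constexpr bool divides_first_dim(const WorkGroupSize &wg, unsigned simd) {
  return simd != 0 && wg.dims[0] % simd == 0;
}

// The example from this issue: reqd_work_group_size(64, 64), SIMD width 4.
static_assert(!divides_all_dims(make_wg(64, 64), 4), "implicit 1 fails");
static_assert(divides_explicit_dims(make_wg(64, 64), 4), "64 and 64 pass");
static_assert(divides_first_dim(make_wg(64, 64), 4), "64 passes");

// The example from the Intel documentation: reqd_work_group_size(64, 1, 1).
static_assert(!divides_all_dims(make_wg(64, 1, 1), 4), "explicit 1s fail");
static_assert(!divides_explicit_dims(make_wg(64, 1, 1), 4), "explicit 1s fail");
static_assert(divides_first_dim(make_wg(64, 1, 1), 4), "64 passes");
```

Under this model, the documentation's own example is accepted only by the third reading, while the struct above is rejected only by the first. That supports the first-argument interpretation without confirming it.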