Closed
Description
Bugzilla Link | 46973 |
Version | unspecified |
OS | All |
Reporter | LLVM Bugzilla Contributor |
CC | @joker-eph,@bondhugula |
Extended Description
The size of the private memref is not computed properly after affine fusion in this example, where X mod 128 is used in the map of the consumer memory op:
func.func @test(%in: memref<128xf32>, %out: memref<20x512xf32>) {
  %tmp = memref.alloc() : memref<128xf32>
  affine.for %arg4 = 0 to 128 {
    %ld = affine.load %in[%arg4] : memref<128xf32>
    affine.store %ld, %tmp[%arg4] : memref<128xf32>
  }
  affine.for %arg3 = 0 to 20 {
    affine.for %arg4 = 0 to 512 {
      %ld = affine.load %tmp[%arg4 mod 128] : memref<128xf32>
      affine.store %ld, %out[%arg3, %arg4] : memref<20x512xf32>
    }
  }
  return
}
The private memref after fusion should be memref<1xf32>, but the pass produces memref<128xf32>:
func.func @test(%arg0: memref<128xf32>, %arg1: memref<20x512xf32>) {
  %0 = alloc() : memref<128xf32>
  affine.for %arg2 = 0 to 20 {
    affine.for %arg3 = 0 to 512 {
      %1 = affine.apply #map0(%arg3)
      %2 = affine.load %arg0[%1] : memref<128xf32>
      affine.store %2, %0[%arg3 mod 128] : memref<128xf32>
      %3 = affine.apply #map0(%arg3)
      %4 = affine.load %0[%arg3 mod 128] : memref<128xf32>
      affine.store %4, %arg1[%arg2, %arg3] : memref<20x512xf32>
    }
  }
  return
}
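As a rough sanity check that a one-element private buffer would be enough here, the two loop nests can be modelled in plain Python. This is only an illustrative sketch, not how the fusion pass computes the region: the function names reference and fused_private are hypothetical, and the Python lists stand in for the memrefs from the test case.

```python
def reference(inp):
    # Unfused program: copy %in into a 128-element temporary, then read it back.
    tmp = [0.0] * 128
    for k in range(128):
        tmp[k] = inp[k]
    return [[tmp[j % 128] for j in range(512)] for _ in range(20)]


def fused_private(inp):
    # Fused program using a one-element private buffer (modelling memref<1xf32>).
    priv = [0.0]
    out = [[0.0] * 512 for _ in range(20)]
    for i in range(20):
        for j in range(512):
            priv[0] = inp[j % 128]  # producer slice: load %in, store to private buffer
            out[i][j] = priv[0]     # consumer: load private buffer, store to %out
    return out


inp = [float(k) for k in range(128)]
# Both versions should fill %out with the same values.
print(reference(inp) == fused_private(inp))
```

Each fused iteration writes and then reads exactly one element of the temporary, which is why a single-element buffer suffices in this model.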
Interestingly, the size of the private memref is computed properly if we replace mod with floordiv.
To reproduce: mlir-opt test.mlir -affine-loop-fusion=fusion-maximal