[Affine fusion] Private memref size is not properly computed for 'mod' cases. #46317

@llvmbot

Bugzilla Link: 46973
Version: unspecified
OS: All
Reporter: LLVM Bugzilla Contributor
CC: @joker-eph, @bondhugula

Extended Description

The size of the private memref created by affine loop fusion is computed incorrectly in the following example, where the access map of the consumer's load uses an expression of the form X mod 128:

func.func @test(%in: memref<128xf32>, %out: memref<20x512xf32>) {
  %tmp = memref.alloc() : memref<128xf32>

  affine.for %arg4 = 0 to 128 {
    %ld = affine.load %in[%arg4] : memref<128xf32>
    affine.store %ld, %tmp[%arg4] : memref<128xf32>
  }

  affine.for %arg3 = 0 to 20 {
    affine.for %arg4 = 0 to 512 {
      %ld = affine.load %tmp[%arg4 mod 128] : memref<128xf32>
      affine.store %ld, %out[%arg3, %arg4] : memref<20x512xf32>
    }
  }

  return
}

After fusion, the private memref should be memref<1xf32>, but the pass produces memref<128xf32> instead. Actual output:

  func.func @test(%arg0: memref<128xf32>, %arg1: memref<20x512xf32>) {
    %0 = alloc() : memref<128xf32>
    affine.for %arg2 = 0 to 20 {
      affine.for %arg3 = 0 to 512 {
        %1 = affine.apply #map0(%arg3)
        %2 = affine.load %arg0[%1] : memref<128xf32>
        affine.store %2, %0[%arg3 mod 128] : memref<128xf32>
        %3 = affine.apply #map0(%arg3)
        %4 = affine.load %0[%arg3 mod 128] : memref<128xf32>
        affine.store %4, %arg1[%arg2, %arg3] : memref<20x512xf32>
      }
    }
    return
  }

Interestingly, the size of the private memref is computed properly if we replace mod with floordiv.
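
To illustrate why a single element is expected, here is a minimal Python sketch that brute-forces the footprint of %tmp for the loop bounds in the example. It only models the index arithmetic and does not reflect how the fusion pass computes the region. The helper names (slice_footprint, max_private_size, total_footprint) are hypothetical and chosen for this illustration.

```python
# Loop bounds taken from the example in this report.
PRODUCER_TRIP = 128   # affine.for %arg4 = 0 to 128 (producer)
CONSUMER_TRIP = 512   # affine.for %arg4 = 0 to 512 (consumer inner loop)


def slice_footprint(consumer_index, j):
    """Elements of %tmp that consumer iteration j needs the producer to write.

    Hypothetical helper: the producer iteration p stores to %tmp[p], so only
    the iterations whose store is read at iteration j are kept.
    """
    read = consumer_index(j)
    return {p for p in range(PRODUCER_TRIP) if p == read}


def max_private_size(consumer_index):
    """Largest per-iteration footprint, i.e. the private buffer size needed
    when the producer is fused into the innermost consumer loop."""
    return max(len(slice_footprint(consumer_index, j)) for j in range(CONSUMER_TRIP))


def total_footprint(consumer_index):
    """Distinct %tmp elements read over the whole consumer loop (no privatization)."""
    return len({consumer_index(j) for j in range(CONSUMER_TRIP)})


# For non-negative operands, Python's % and // agree with affine mod and floordiv.
mod_size = max_private_size(lambda j: j % 128)
floordiv_size = max_private_size(lambda j: j // 128)
mod_total = total_footprint(lambda j: j % 128)

print(mod_size, floordiv_size, mod_total)
```

Under these assumptions, each fused iteration touches exactly one element of %tmp for both the mod and the floordiv access, while the mod access covers all 128 elements only when taken over the entire loop. That matches the expectation of memref<1xf32> stated above.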

To reproduce: mlir-opt test.mlir -affine-loop-fusion=fusion-maximal
