[RFC] [AMDGPU] [SelectionDAG] [GlobalIsel] select with constant combine into binaryOp with zext/sext #121145

@vg0204

Description

In both instruction selectors (SelectionDAG and GlobalISel), the generic combines fold various selects of constants into binary ops with a zext/sext operand, such as:

select Cond, C1, C1-1 --> add (zext Cond), (C1-1)
select Cond, Pow2, 0  -->  shl  (zext Cond), log2(Pow2)  
select Cond, C1, C1+1 --> add (sext Cond), (C1+1)
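
The three identities above can be checked with a small model of i32 wrap-around arithmetic. This is only an illustrative sketch: the helper names (`zext`, `sext`, `select`, `check_folds`) are hypothetical and do not correspond to any LLVM API. It assumes an i1 condition, where zext of true is 1 and sext of true is -1 (all ones).

```python
MASK = 0xFFFFFFFF  # model i32 wrap-around arithmetic

def zext(cond: bool) -> int:
    # i1 true zero-extends to 1
    return 1 if cond else 0

def sext(cond: bool) -> int:
    # i1 true sign-extends to -1, i.e. all ones in 32 bits
    return MASK if cond else 0

def select(cond: bool, t: int, f: int) -> int:
    return (t if cond else f) & MASK

def check_folds(c1: int, log2_pow2: int) -> bool:
    """Return True if all three folds hold for both values of the condition."""
    pow2 = 1 << log2_pow2
    for cond in (False, True):
        # select Cond, C1, C1-1 --> add (zext Cond), (C1-1)
        if select(cond, c1, c1 - 1) != (zext(cond) + (c1 - 1)) & MASK:
            return False
        # select Cond, Pow2, 0 --> shl (zext Cond), log2(Pow2)
        if select(cond, pow2, 0) != (zext(cond) << log2_pow2) & MASK:
            return False
        # select Cond, C1, C1+1 --> add (sext Cond), (C1+1)
        if select(cond, c1, c1 + 1) != (sext(cond) + (c1 + 1)) & MASK:
            return False
    return True
```

For instance, `check_folds(7, 3)` exercises the `select Cond, 7, 6` case together with `Pow2 = 8`.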

On many architectures, materializing a zext/sext may be cheaper than materializing a select, so the combines above make sense as an optimization.

On AMDGPU, however, both zext/sext and select (for f32 with inline constants) materialize as v_cndmask_b32_e64. The combine therefore increases the cost by introducing an additional binary instruction.

Viewed from a different perspective: since on AMDGPU both zext/sext and select canonically lower to the same machine instruction, the combine effectively undoes the folding of the binary op into the select. For example:

select Cond, 7, 6 --> add (zext Cond), 6 materializes as:

v_cndmask_b32_e64 v0, 0, 1, vcc
v_add_u32_e32 v0, 6, v0

instead of

v_cndmask_b32_e64 v0, 6, 7, vcc

Here the fold of the binary op into the select is missed: the select is eliminated, but (zext Cond) still materializes the same way as (select Cond, 1, 0). So on AMDGPU, add (zext Cond), 6 <==> add (select Cond, 1, 0), 6 once instruction selection is done. In other words, introducing the zext (via the select combine) prevents the binary op from being folded back into the select, which leaves an additional binary instruction.
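
The equivalence of the two sequences, and the extra instruction in the combined form, can be sketched with a toy model of the two machine instructions. This is an illustration only: the Python functions are hypothetical stand-ins, assuming v_cndmask_b32 picks its second source when the condition bit is set and its first source otherwise, and that v_add_u32 is a 32-bit wrapping add.

```python
MASK = 0xFFFFFFFF  # 32-bit register width

def v_cndmask_b32(src0: int, src1: int, vcc: bool) -> int:
    # Assumed semantics: D = vcc ? src1 : src0
    return (src1 if vcc else src0) & MASK

def v_add_u32(src0: int, src1: int) -> int:
    # 32-bit wrapping add
    return (src0 + src1) & MASK

def combined_form(vcc: bool):
    """add (zext Cond), 6: returns (result, instruction count)."""
    v0 = v_cndmask_b32(0, 1, vcc)   # zext Cond
    v0 = v_add_u32(6, v0)           # extra binary instruction
    return v0, 2

def direct_form(vcc: bool):
    """select Cond, 7, 6: returns (result, instruction count)."""
    return v_cndmask_b32(6, 7, vcc), 1
```

Both forms compute the same value for either condition, but `combined_form` needs two instructions where `direct_form` needs one.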

This is the root cause of SWDEV-505394, as it increases the code length.
