WASM: Odd choice of selected instruction, and subsequent missed optimization #69938

@matthias-blume

Instruction selection for WASM takes "advantage" of the known byte alignment of the __stack_pointer global and prefers or instructions over add when it can. However, subsequent optimizations are unable to combine such or instructions with other additive operations, so the resulting code contains more instructions than necessary.
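To illustrate why the or is legal in the first place, here is a minimal sketch in C. It assumes the stack pointer is 16-byte aligned (an assumption made for this illustration, not something stated above), and the helper names are hypothetical. Because 800 is a multiple of 16, sp + 800 has its low four bits clear, so or-ing in 8 gives the same result as adding 8, and the two-step form could in principle be folded into a single add of 808.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: address computed as in the first listing,
   an add of 800 followed by an or of 8. */
uint32_t addr_with_or(uint32_t sp) { return (sp + 800u) | 8u; }

/* Hypothetical helper: the folded form, a single add of 808. */
uint32_t addr_folded(uint32_t sp) { return sp + 808u; }

/* Checks that both forms agree for every 16-byte-aligned sp in a
   small range. The 16-byte alignment is an assumption of this sketch. */
void check_or_equals_add(void) {
    for (uint32_t sp = 0; sp < (1u << 16); sp += 16u) {
        assert(addr_with_or(sp) == addr_folded(sp));
    }
}
```

The equivalence depends on the alignment: for sp = 8, which is not 16-byte aligned, the two helpers return different values, which is why the or form is only available when the alignment is known.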

Example:

int foo(int *p);
int bar(double *p);
int f(int y[100], double a[100]) {
    int x[100];
    double z[100];
    bar(z);
    x[1] = 0;
    int* xp = x + 2;
    foo(xp);
    return *(xp-1);
}

$ clang -O2 -fexceptions -DAT -fno-unroll-loops -stdlib=libc++

f(int*, double*):                               # @f(int*, double*)
        global.get      __stack_pointer
        i32.const       1200
        i32.sub 
        local.tee       2
        global.set      __stack_pointer
        local.get       2
        call    bar(double*)
        drop
        local.get       2
        i32.const       0
        i32.store       804
        local.get       2
        i32.const       800
        i32.add 
        i32.const       8
        i32.or                  # <-- or instead of add prevents folding
        call    foo(int*)
        drop
        local.get       2
        i32.load        804
        local.set       3
        local.get       2
        i32.const       1200
        i32.add 
        global.set      __stack_pointer
        local.get       3
        end_function

Modifying the code slightly, so that the compiler cannot use or, produces better code:

int foo(int *p);
int bar(double *p);
int f(int y[100], double a[100]) {
    int x[100];
    double z[100];
    bar(z);
    x[1] = 0;
    int* xp = x + 5;   // modified
    foo(xp);
    return *(xp-1);
}

$ clang -O2 -fexceptions -DAT -fno-unroll-loops -stdlib=libc++

f(int*, double*):                               # @f(int*, double*)
        global.get      __stack_pointer
        i32.const       1200
        i32.sub 
        local.tee       2
        global.set      __stack_pointer
        local.get       2
        call    bar(double*)
        drop
        local.get       2
        i32.const       0
        i32.store       804
        local.get       2
        i32.const       820
        i32.add                 # <-- add folded
        call    foo(int*)
        drop
        local.get       2
        i32.load        816
        local.set       3
        local.get       2
        i32.const       1200
        i32.add 
        global.set      __stack_pointer
        local.get       3
        end_function

https://godbolt.org/z/afvf5bvP3