auto vectorizer makes code run slower #28502

@llvmbot

Description

Bugzilla Link 28128
Version 3.9
OS All
Reporter LLVM Bugzilla Contributor
CC @hfinkel,@RKSimon

Extended Description

#include <stdlib.h>
#include <stdio.h>
#include <time.h>

void tik()
{
    static clock_t start, end;
    static int flag = 0;
    static float elapsed_time;
    if (flag == 0) {
        start = clock();
    } else {
        end = clock();
        elapsed_time = (float)(end - start) / (float)CLOCKS_PER_SEC;
        printf("Elapsed time: %f seconds\n", elapsed_time);
    }
    flag = 1 - flag;
}

typedef unsigned int uint;
static uint i_max = 64 * 1024 * 1024;

void test()
{
    static uint acc[2] = {1, 1};
    tik();
    for (uint i = 0; i < i_max; ++i) {
        acc[0] *= 3;
    }
    tik();
    acc[0] = acc[1] = 1;
    tik();
    for (uint i = 0; i < i_max; ++i) {
        acc[0] *= 3;
        acc[1] *= 3;
    }
    tik();
    acc[0] = acc[1] = 1;
    tik();
#pragma clang loop vectorize(disable)
    for (uint i = 0; i < i_max; ++i) {
        acc[0] *= 3;
        acc[1] *= 3;
    }
    tik();
    acc[0] = acc[1] = 1;
    tik();
    for (uint i = 0; i < i_max; ++i) {
        acc[0] *= 3;
    }
    for (uint i = 0; i < i_max; ++i) {
        acc[1] *= 3;
    }
    tik();
}

int main (int argc, char** argv) {
    test();
    return 0;
}

Output:

$ clang -O vector.c; ./a.out
Elapsed time: 0.003569 seconds
Elapsed time: 0.011365 seconds
Elapsed time: 0.008421 seconds
Elapsed time: 0.005950 seconds

I was expecting the second case to run ~2x slower than the first one, since it's doing twice as much work. The fourth case's runtime is more reasonable.

ENV:

clang: 3.8.1-svn271127-1~exp1 (branches/release_38)
OS: xubuntu 14.04
CPU: Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
