
For some bitcodes it can take 12 hours to read and compile #46739

@yowl

Bugzilla Link: 47395
Version: trunk
OS: Windows NT
CC: @dwblaikie, @efriedma-quic, @tlively, @yuanfang-chen

Extended Description

I create bitcode using libLLVM for the CoreRT compiler project (https://github.com/dotnet/corert). It uses the C# bindings over libLLVM from https://github.com/Microsoft/LLVMSharp.

I have two bitcode files generated from mostly the same source code, each around 240 MB in size. One compiles in 3 minutes, the other in 12 hours. I suspect the 12-hour compilation is either not optimal or doing something wrong. I compile with Emscripten, which ultimately calls:

E:/GitHub/llvm-project/build/release/bin/clang++.exe -target wasm32-unknown-emscripten -D__EMSCRIPTEN_major__=1 -D__EMSCRIPTEN_minor__=39 -D__EMSCRIPTEN_tiny__=19 -D_LIBCPP_ABI_VERSION=2 -Dunix -D__unix -D__unix__ -Werror=implicit-function-declaration -Xclang -nostdsysteminc -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\libcxx -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libcxxabi\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libunwind\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\compat -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\libc -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libc\musl\arch\emscripten -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\local\include -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\SSE -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\cache\wasm\include -DEMSCRIPTEN -fignore-exceptions -c -g E:\GitHub\UnoCoreRt\UnoCoreRt.Wasm\bin\Debug\netstandard2.0\UnoCoreRt.Wasm.bc -Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\SDL -c -o E:\GitHub\UnoCoreRt\UnoCoreRt.Wasm\bin\Debug\netstandard2.0\UnoCoreRt-release.o -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -g

What I've noticed is that, compared to the "fast" 3-minute compile, the "slow" compile makes around 1 million calls to

ResolveConstants.push_back(std::make_pair(PHC, Idx));
and hence the ResolveConstants variable ends up with that many entries. Resolving these constants is what then seems to take most of the time. I think the bitcode reader is identifying 1 million forward references, so the possible causes of the slowness that come to mind are:

  1. Incorrect identification of forward references
  2. Incorrect writing from libLLVM that creates forward references unnecessarily
  3. Slow algorithm to resolve correctly identified and written forward references.
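
To make the third theory concrete, here is a toy sketch of how a reader might queue forward references and later look them up. It is an illustration under stated assumptions, not LLVM's actual code: `ToyRecord`, `PendingRef`, `collectForwardRefs`, `resolveLinear` and `resolveSorted` are hypothetical names, and the only thing borrowed from the real reader is the shape of the (placeholder, index) pair seen in the `push_back` call above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy record: the i-th record defines value slot i, and
// `operands` lists the slots it refers to.
struct ToyRecord {
    std::vector<std::size_t> operands;
};

// (placeholder key, slot index), loosely shaped like the (PHC, Idx) pair.
using PendingRef = std::pair<std::size_t, std::size_t>;
const std::size_t kNone = static_cast<std::size_t>(-1);

// Walk the records in order. A slot that is referenced before it is defined
// gets one shared placeholder; when the slot is finally defined, the pair is
// queued for later resolution.
std::vector<PendingRef> collectForwardRefs(const std::vector<ToyRecord>& records) {
    std::vector<std::size_t> placeholderOf(records.size(), kNone);
    std::vector<PendingRef> pending;
    std::size_t nextKey = 0;
    for (std::size_t id = 0; id < records.size(); ++id) {
        for (std::size_t op : records[id].operands) {
            // Assumption: only operands pointing at a later slot are forward refs.
            if (op > id && op < records.size() && placeholderOf[op] == kNone)
                placeholderOf[op] = nextKey++;
        }
        if (placeholderOf[id] != kNone)
            pending.emplace_back(placeholderOf[id], id);
    }
    return pending;
}

// Linear scan: O(n) per lookup, so O(n^2) if repeated for n placeholders.
std::size_t resolveLinear(const std::vector<PendingRef>& pending, std::size_t key) {
    for (const PendingRef& p : pending)
        if (p.first == key) return p.second;
    return kNone;
}

// Binary search: O(log n) per lookup; the queue must be sorted by key first.
std::size_t resolveSorted(const std::vector<PendingRef>& sorted, std::size_t key) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    auto it = std::lower_bound(sorted.begin(), sorted.end(), PendingRef(key, 0));
    return (it != sorted.end() && it->first == key) ? it->second : kNone;
}
```

With around 1 million queued entries, the difference between a per-entry linear scan and a sorted lookup is the kind of gap that could turn minutes into hours, although whether that is what happens here is exactly what the report leaves open. Per-entry work other than the lookup (for example, rewriting every user of a placeholder) could dominate just as easily.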

A copy of the bitcode is at http://dev.hubse.com/UnoCoreRt.Wasm.bc.msi (it's not really an MSI; I just needed a binary extension that the web server would serve). The file is actually a .7z archive, so it needs renaming from .msi to .7z.

I privately messaged @tlively on Discord, and I believe he has confirmed that it takes a long time for him as well.

#46095 looks to be the same area of code, but not the same problem.

I did spend a bit of time with clang++ in the debugger, but I'm not very familiar with it, so I couldn't reach a conclusion about my three theories above.
