Rename argc/argv main to __main_argc_argv #134
Merged
merged 3 commits into from
Feb 27, 2020
43 changes: 43 additions & 0 deletions BasicCABI.md
@@ -154,3 +154,46 @@ is no way to traverse the linear stack. There is also no currently-specified way
is used as the frame pointer or base pointer. This functionality is not needed for backtracing or unwinding (since the
wasm VM must do this in any case); however it may still be desirable to allow this functionality for debugging
or in-field crash reporting. Future ABIs may designate a convention for determining frame size and local usage.

## Program startup

### User entrypoint

The *user entrypoint* is the function which runs the bulk of the program.
It is called `main` in C, C++, and other languages. Note that this may
not be the first function in the program to be called, as programs may
also have global constructors which run before it.

At the wasm C ABI level, the following symbol names are used:

C ABI Symbol name | C and C++ signature |
---------------------------- | -----------------------------------|
`main` | `int main(void)` or `int main()` |
`__main_argc_argv` | `int main(int argc, char *argv[])` |

These symbol names only apply at the ABI level; C and C++ source should
continue to use the standard `main` name, and compilers will handle the
details of conforming to the ABI.

Also note that C symbol names are distinct from WebAssembly export
names, which are outside the scope of the C ABI. Toolchains which export
the user entrypoint may choose to export it as the name `main`, even when
the C ABI symbol name is `__main_argc_argv`.
Member

But can the toolchain support exporting main? Won't this break a toolchain that wants to do something like --export=main? We need some way to export the user entry point, and it seems that with this change it becomes hard to name the user entry point because it has two possible names, no?

Member Author

I can confirm that __attribute__((export_name("main"))) does work with this patch, with and without argv/argc. I get an export named "main" regardless of the internal symbol name.

However, -Wl,--export=main indeed doesn't work with this patch in the case of main with argc/argv. One way to fix that would be to introduce a -mexport=main flag to clang and have people use that instead. Clang could translate that into inserting the appropriate export_name directive. Clang flags are nicer for end users than -Wl, flags anyway, and it would handle mangled names better in general. What do you think?
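As a rough illustration of the `export_name` approach mentioned above, the sketch below attaches the attribute through a small macro so the same file also builds on non-wasm targets. `EXPORT_AS` and `entry_with_args` are hypothetical names chosen for this example; in a real program the attribute would go on `main` itself, and the exact toolchain behavior is as reported in this thread rather than verified here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro: on wasm targets it expands to clang's
 * export_name attribute; elsewhere it expands to nothing. */
#if defined(__wasm__)
#define EXPORT_AS(name) __attribute__((export_name(name)))
#else
#define EXPORT_AS(name)
#endif

/* Hypothetical stand-in for a user entrypoint that takes argc/argv.
 * The wasm export is named "main" regardless of the C symbol name.
 * Returns the number of arguments after the program name. */
EXPORT_AS("main")
int entry_with_args(int argc, char *argv[]) {
    assert(argc >= 0);
    (void)argv; /* unused in this sketch */
    return argc > 0 ? argc - 1 : 0;
}
```

The point of the sketch is only that the export name is chosen independently of the internal symbol name, which is why the attribute keeps working when the symbol becomes `__main_argc_argv`.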

Member

For emscripten we can't ask users to decorate the main symbol in the source code.

A new -mexport=main cflag would have to be applied to every single compilation unit. That seems kind of heavyweight and would require more plumbing to land in llvm.

Actually I think we may be able to just pass -Wl,--export=main and -Wl,--export=__main_argc_argv, since this flag doesn't warn or error if the symbol is missing, and then we can perhaps have binaryen take care of exporting __main_argc_argv as main? Will have to see how it goes.


A symbol name other than `main` is needed because the usual trick of
having implementations pass arguments to `main` even when they aren't
needed doesn't work in wasm, which requires caller and callee signatures
to exactly match.
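
A minimal sketch of what the exact-match rule implies for startup code, written in plain C with hypothetical names (`user_entry`, `call_user_entry`, and the two sample functions are not part of any ABI): the caller has to know which of the two signatures it is dealing with and call through a function type that matches it, rather than always passing `argc` and `argv`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the two entrypoint shapes in the table above. */
enum entry_kind { ENTRY_NO_ARGS, ENTRY_ARGC_ARGV };

/* Hypothetical record pairing an entrypoint with its exact signature. */
struct user_entry {
    enum entry_kind kind;
    union {
        int (*no_args)(void);
        int (*argc_argv)(int, char **);
    } fn;
};

/* Calls the entrypoint through a pointer whose type matches the callee,
 * instead of passing arguments the callee does not declare. */
int call_user_entry(const struct user_entry *e, int argc, char **argv) {
    assert(e != NULL);
    switch (e->kind) {
    case ENTRY_NO_ARGS:
        return e->fn.no_args();
    case ENTRY_ARGC_ARGV:
        return e->fn.argc_argv(argc, argv);
    }
    return -1; /* unknown kind */
}

/* Sample entrypoints standing in for the two forms of main. */
int sample_main_void(void) { return 7; }
int sample_main_args(int argc, char **argv) { (void)argv; return argc; }
```

Giving the two forms distinct symbol names plays the same role as the `kind` tag here: it tells the startup code which signature to call.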

For the same reason, the wasm C ABI doesn't support an `envp` parameter.
Fortunately, `envp` is not required by C, POSIX, or any other relevant
standards, and is generally considered obsolete in favor of `getenv`.
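
For code that previously read the environment through `envp`, a lookup via the standard `getenv` function covers the common case. The helper name `env_or_default` below is hypothetical and only illustrates the pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns the value of the environment variable
 * `name`, or `fallback` when the variable is not set. */
const char *env_or_default(const char *name, const char *fallback) {
    assert(name != NULL);
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}
```

A caller might write `env_or_default("HOME", "/")` where it would otherwise have scanned `envp` for a `HOME=` entry.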

### Program entrypoint

The *program entrypoint* is the first function in the program to be called.
It is commonly called `_start` on other platforms, though this is a
low-level detail that most code doesn't interact with.

The program entrypoint is out of scope for the wasm C ABI. It may depend
on what environment the program will be run in.
Member

Is it out of scope? Where do we document _start and __wasm_call_ctors if not here?

Member Author

__wasm_call_ctors is a convention between the linker and crt1.o (or something else serving the purpose of crt1.o), making it appropriate for Linking.md, where it is. Anyone calling __wasm_call_ctors has to know that the module was generated from a linker following Linking.md.

I see _start as being a convention between a wasm module and the outside which doesn't know anything about the tools that produced it. It may not use BasicCABI.md or Linking.md. Eventually it'll be replaced by something we define in interface types, but for now it's just a simple "_start" convention.