File system is inconsistent when paths aren't valid utf8 #24690

@hoodmane

Version of emscripten/emsdk:
Tested against 4.0.8, but the problem is also present at tip of tree.

a.c

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>


char x1[] = "@test_42_tmp\xe7w\xf0";
char x2[] = "./@test_42_tmp\xe7w\xf0";

int main() {
    printf("x: %s\n", x1);
    int fd1 = open(x1, O_CREAT | O_WRONLY, 0777);
    if (fd1 == -1) {
        printf("open failed: %s\n", strerror(errno));
        return 1;
    }
    int w = write(fd1, "hi!", 4);
    if (w == -1) {
        printf("write failed: %s\n", strerror(errno));
        return 1;
    }
    printf("wrote: %d\n", w);
    if (close(fd1) == -1) {
        printf("close failed: %s\n", strerror(errno));
        return 1;
    }


    printf("x2: %s\n", x2);
    int fd2 = open(x2, O_RDONLY);
    if (fd2 == -1) {
        printf("open failed: %s\n", strerror(errno));
        return 1;
    }
    printf("fd2: %d\n", fd2);
    char out[10];
    int r = read(fd2, out, 10);
    if (r == -1) {
        printf("read failed: %s\n", strerror(errno));
        return 1;
    }
    printf("read: %d %s\n", r, out);
    close(fd2);
    unlink(x1);
    return 0;
}

Compile and run a.c with gcc:

$ gcc a.c && ./a.out
x: @test_42_tmp�w�
wrote: 4
x2: ./@test_42_tmp�w�
fd2: 3
read: 4 hi!

Compile and run with emcc:

$ emcc a.c && node a.out.js 
x: @test_42_tmp緰
wrote: 4
x2: ./@test_42_tmp緰
open failed: No such file or directory

The problem is that @test_42_tmp\xe7w\xf0 is 15 bytes long, whereas ./@test_42_tmp\xe7w\xf0 is 17 bytes long. The longer string hits this codepath:
https://github.com/emscripten-core/emscripten/blob/main/src/lib/libstrings.js#L58
and is decoded by TextDecoder() to ./@test_42_tmp�w�. The shorter string is decoded by the hand-written JS decoding loop to @test_42_tmp緰. The two decoded names don't match, so the file created through the first path is not found through the second.
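
To make the mismatch concrete, here is a minimal C sketch of the two decoding behaviours. It is not the Emscripten code: decode3_lenient, decode3_strict, takes_textdecoder_path and TEXTDECODER_THRESHOLD are hypothetical names, and the threshold value of 16 is my reading of the linked line, so treat it as an assumption. The lenient function masks the trailing bytes without validating them, which I believe is roughly what the JS loop does. The strict function is a simplified stand-in for TextDecoder, which substitutes U+FFFD for malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: strings longer than this many bytes are handed to TextDecoder.
   The value 16 is inferred from the linked libstrings.js line, not confirmed. */
#define TEXTDECODER_THRESHOLD 16

/* Hypothetical helper: which decoder would a path of `len` bytes get? */
int takes_textdecoder_path(size_t len) {
    return len > TEXTDECODER_THRESHOLD;
}

/* Lenient decode of one 3-byte sequence. The two trailing bytes are masked
   with 0x3F but never checked for the 10xxxxxx continuation form. */
uint32_t decode3_lenient(const unsigned char *s) {
    assert((s[0] & 0xF0) == 0xE0); /* caller must pass a 3-byte lead byte */
    return ((uint32_t)(s[0] & 0x0F) << 12) |
           ((uint32_t)(s[1] & 0x3F) << 6) |
           (uint32_t)(s[2] & 0x3F);
}

/* Simplified strict decode: yields U+FFFD when a trailing byte is not a
   continuation byte. A real decoder also rejects overlong forms and
   surrogates, and resumes decoding at the offending byte. */
uint32_t decode3_strict(const unsigned char *s) {
    if ((s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80) {
        return 0xFFFD;
    }
    return decode3_lenient(s);
}
```

For the bytes \xe7 w \xf0 from the reproducer, decode3_lenient returns 0x7DF0 (the single CJK character in the emcc output), while decode3_strict returns 0xFFFD. With the assumed threshold, takes_textdecoder_path(15) is false and takes_textdecoder_path(17) is true, so the two spellings of the same file name end up going through different decoders.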
