Pass advisory lock calls to filesystem #15070

@jlongster

Hello! I meant to file this issue a while back.

Context

I wrote a custom filesystem that conforms to emscripten's filesystem API, and it is specially built for SQLite (using the sql.js project). It takes reads/writes from SQLite and persists them into IndexedDB. The difference between this and most other IDB filesystems is that it actually writes the data down in blocks, so you don't have to load the whole thing in memory to use it. It basically makes IDB a real-ish filesystem (in the way that filesystems read/write in blocks).

My project is here: https://github.com/jlongster/absurd-sql. This research led to some very interesting results, particularly that using SQLite with this backend is way faster than using raw IDB directly. The reason is simple: it naturally batches reads/writes, so the number of times it calls out to IDB is way lower. Since IDB is super slow, this makes it really fast.

I wrote a post explaining the project and the results here: https://jlongster.com/future-sql-web

Problem

One of the critical things for this to work is locking. If you open the db in several different tabs, each tab has the potential to corrupt the database by overwriting other writes with out-of-date data. To solve this, SQLite uses POSIX advisory locking. Each connection locks the db before writing (the whole process is explained in Atomic Commit In SQLite).

We need to respect these locks on the web, otherwise dbs will get corrupted.

The way my project works is that it hooks into the filesystem calls. I found this to be an intuitive and easy way to hook SQLite up to IDB; more so than a custom SQLite vfs. Almost everything fell into place nicely (opening a db, handling short reads, etc).

There's one thing that didn't: locking. And the only reason it didn't is because emscripten doesn't expose the locking APIs for filesystems to handle. To solve this we had to extend the unix vfs and install our lock/unlock hooks, but this is painful because it requires users to call an initialization function and we need to track file pointers. That's all a bunch of implementation detail; basically we had to do a bunch of work to handle lock/unlock.

We shouldn't have to though! All we want to do is handle fcntl(fd, F_SETLK, pLock). If we could handle that in the filesystem, we don't have to do anything special.
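For readers less familiar with the call in question, here is a minimal sketch of a non-blocking POSIX advisory lock request of the shape `fcntl(fd, F_SETLK, pLock)`. The helper name `set_range_lock` is made up for illustration; only `fcntl`, `F_SETLK`, and `struct flock` come from POSIX.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from SQLite or emscripten): request a
 * non-blocking advisory lock on a byte range of an open file.
 * type is F_RDLCK, F_WRLCK, or F_UNLCK.
 * Returns 0 on success and -1 on failure (errno is set by fcntl). */
static int set_range_lock(int fd, short type, off_t start, off_t len) {
    struct flock lock = {0};
    lock.l_type = type;        /* read lock, write lock, or unlock */
    lock.l_whence = SEEK_SET;  /* l_start is relative to the start of the file */
    lock.l_start = start;      /* first byte of the locked range */
    lock.l_len = len;          /* number of bytes; 0 means "to end of file" */
    return fcntl(fd, F_SETLK, &lock);
}
```

On a native POSIX system the kernel arbitrates this request between processes. It is this same request that we would like the emscripten filesystem to receive.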

(Read my blog post for more information about how we implement locks. In short, we leverage IDB transaction semantics for locking semantics. Locking is super important for us!)

The problem is emscripten right now assumes locking is successful:

return 0; // Pretend that the locking is successful.
It would be very easy in that function to instead delegate locking to the stream (the same with unlocking). Let us handle F_SETLK and F_GETLK.
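To make the idea concrete, here is a rough model of the dispatch logic, written in C purely for illustration. It is not emscripten's code or API (the real syscall layer is JavaScript), and every name in it (`stream_ops`, `setlk`, `dispatch_setlk`, `demo_setlk`) is hypothetical. If the stream's filesystem provides a lock hook, the lock request goes to it; otherwise the current "pretend success" behavior is kept.

```c
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical model of a stream whose filesystem may supply a lock hook. */
struct stream;

struct stream_ops {
    /* Optional: NULL means the filesystem does not handle locks. */
    int (*setlk)(struct stream *s, const struct flock *lock);
};

struct stream {
    const struct stream_ops *ops;
    void *fs_data; /* filesystem-private state */
};

/* Delegate F_SETLK to the filesystem when it provides a hook. */
static int dispatch_setlk(struct stream *s, const struct flock *lock) {
    if (s->ops != NULL && s->ops->setlk != NULL) {
        return s->ops->setlk(s, lock);
    }
    return 0; /* no hook: keep pretending that the locking is successful */
}

/* Toy hook for illustration: at most one lock holder at a time.
 * fs_data points at an int flag shared by all streams on the same file.
 * A real filesystem would also set errno (e.g. EAGAIN) on refusal. */
static int demo_setlk(struct stream *s, const struct flock *lock) {
    int *held = s->fs_data;
    if (lock->l_type == F_UNLCK) {
        *held = 0;
        return 0;
    }
    if (*held) {
        return -1; /* someone else already holds the lock */
    }
    *held = 1;
    return 0;
}
```

Filesystems that don't care about locking would leave the hook unset and see no change in behavior, so the delegation would be opt-in.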

If we could have that it would simplify our SQLite use cases a bunch! Would you all be open to a PR?
