[1.x] Postgres support #64

Merged
15 commits merged into 1.x from postgres on Dec 5, 2023

jessarcher
Member

@jessarcher jessarcher commented Nov 30, 2023

This PR introduces Postgres support for Pulse.

Note that Postgres cannot handle an upsert statement that contains multiple entries that would update the same row, so the entries have been pre-aggregated in PHP prior to the upsert. This benefits the MySQL implementation as well by reducing the number of rows being upserted, which can be especially beneficial when running the pulse:work command with a large backlog in the Redis stream.
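To illustrate the pre-aggregation idea, here is a minimal sketch in plain PHP. The function name `preaggregate`, the field names, and the "sum the values" merge rule are assumptions made for this example; they are not taken from the Pulse source. The point is only that rows sharing the same unique key are collapsed before the upsert, so no single statement updates the same row twice.

```php
<?php

/**
 * Collapse entries that target the same unique key into one row each,
 * so a single upsert statement never touches the same row twice.
 *
 * Hypothetical sketch: field names and the summing rule are assumptions.
 *
 * @param  array<int, array<string, mixed>>  $entries
 * @return array<int, array<string, mixed>>
 */
function preaggregate(array $entries): array
{
    $rows = [];

    foreach ($entries as $entry) {
        // Identify the row by the columns assumed to form the unique index.
        $id = implode("\0", [
            $entry['bucket'],
            $entry['period'],
            $entry['type'],
            $entry['aggregate'],
            md5($entry['key']),
        ]);

        if (! isset($rows[$id])) {
            // First entry for this row: keep it as-is.
            $rows[$id] = $entry;

            continue;
        }

        // Later entries for the same row are merged in PHP.
        $rows[$id]['value'] += $entry['value'];
    }

    // Drop the temporary identifiers before handing rows to the upsert.
    return array_values($rows);
}
```

With a large backlog, many stream entries typically collapse into far fewer rows, which is where the reduction in upserted rows comes from.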

@jessarcher jessarcher force-pushed the postgres branch 3 times, most recently from 72c72a9 to 14a3ef4 Compare November 30, 2023 15:13
@websitevirtuoso

sounds good. hope pgsql will be supported soon

@tpetry

tpetry commented Dec 4, 2023

Hey Jess, all the differences in SQL syntax could be solved using the database expressions I developed for Laravel 11. Some of the ones you need are already provided by tpetry/laravel-query-expressions.

@tpetry

tpetry commented Dec 4, 2023

Is there a specific reason why you used an efficient binary column for MySQL but a string column in PostgreSQL, which is twice the size? The same approach as in MySQL can be used in PostgreSQL with decode(md5("key_hash"), 'hex').

@jessarcher
Member Author

> Is there a specific reason why you used an efficient binary column for MySQL but a string column in PostgreSQL, which is twice the size? The same approach as in MySQL can be used in PostgreSQL with decode(md5("key_hash"), 'hex').

@tpetry I couldn't find a fixed-length binary column type for Postgres. The closest I found is bytea but it only seems to be variable length which seemed worse than a fixed-length string column in this instance.

Is there a better type we can use? Or can I safely store the results of decode in a 16-byte string column?

@tpetry

tpetry commented Dec 5, 2023

There is no real measurable performance impact between variable and fixed-length columns (in PostgreSQL). That's why even all text columns are variable length in PG. And the type column is also variable-length, so there is no improvement. Just use the binary type, which is bytea.

Alternatively, you can use UUIDs in PG: The great part of md5 is that it is 128-bit long, exactly as long as a UUID. And the uuid type in PostgreSQL displays the value as a string but stores it internally as a fixed-size 128-bit binary. So you could transform the md5 to a uuid for PostgreSQL - which I always do:

$table->uuid('key_hash')->storedAs('md5("key")::uuid')

That would be the most PostgreSQL way of doing it.
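As a rough illustration of the sizes discussed above, the following plain-PHP sketch compares the three representations of the same MD5 digest. The helper `md5AsUuid` is hypothetical and exists only to show how PostgreSQL displays a 128-bit value of type uuid; it is not part of Laravel or Pulse.

```php
<?php

// Hypothetical helper for illustration: format an MD5 digest the way
// PostgreSQL displays a value of type uuid (8-4-4-4-12 hex groups).
function md5AsUuid(string $key): string
{
    $hex = md5($key); // 32 hex characters = 128 bits

    return sprintf(
        '%s-%s-%s-%s-%s',
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12)
    );
}

$key = 'example-key';

// Stored as text, the digest occupies 32 characters.
$asHexString = md5($key);

// Decoded to raw bytes it occupies 16 bytes; this mirrors
// unhex(md5(...)) in MySQL and decode(md5(...), 'hex') in PostgreSQL.
$asBinary = hex2bin($asHexString);

// The uuid form is only a display format for the same 128 bits.
$asUuid = md5AsUuid($key);
```

The string column is twice the size of the binary one because each byte of the digest is written as two hex characters.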

@jessarcher jessarcher changed the base branch from master to 1.x December 5, 2023 07:19
@tpetry

tpetry commented Dec 5, 2023

@jessarcher I've checked the source code of the upsert method again. You don't have to call the grammar to compile the insert and change it to make it an upsert manually. Custom update rules in case of duplicates are already implemented - but have never been documented:

$this->connection()->table('pulse_xyz')->upsert([
    'key' => '....',
    'value' => 44,
], ['key'], [
    'value' => match ($driver = $this->connection()->getDriverName()) {
        'mysql' => new Expression('`value` + values(`value`)'),
        'pgsql' => new Expression('"pulse_xyz"."value" + "excluded"."value"'),
        default => throw new RuntimeException("Unsupported database driver [{$driver}]."),
    },
]);

@jessarcher
Member Author

Thanks, @tpetry! That's much tidier!

@tpetry

tpetry commented Dec 5, 2023

Just one tiny thing left that I've seen: You can replace all wrap calls like 'sum('.$this->wrap('count').')' with "sum({$this->wrap('count')})", which makes the more complex statements much easier to read. But that's really a personal thing I do when extending the Laravel Grammar.
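The two styles can be compared side by side in a small standalone sketch. `SketchGrammar` and its `wrap()` method are stand-ins invented for this example (here `wrap()` simply double-quotes an identifier); they are not Laravel's actual Grammar implementation.

```php
<?php

// Stand-in for a grammar class. wrap() only double-quotes an identifier
// and is an assumption for this sketch, not Laravel's Grammar::wrap().
final class SketchGrammar
{
    public function wrap(string $identifier): string
    {
        return '"'.str_replace('"', '""', $identifier).'"';
    }

    // Concatenation style: quotes and dots interrupt the SQL fragment.
    public function concatenated(): string
    {
        return 'sum('.$this->wrap('count').')';
    }

    // Interpolation style: the SQL fragment reads as one string.
    public function interpolated(): string
    {
        return "sum({$this->wrap('count')})";
    }
}

$grammar = new SketchGrammar();
$concatenated = $grammar->concatenated();
$interpolated = $grammar->interpolated();
```

Both methods build the same string; the difference is purely in how readable the source is once several wrapped identifiers appear in one statement.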

@jessarcher
Member Author

@tpetry Agree - that's nicer! Will update them all now. Thanks for all your help!

@jessarcher jessarcher marked this pull request as ready for review December 5, 2023 08:20
@jessarcher jessarcher changed the title Postgres support [1.x] Postgres support Dec 5, 2023
@taylorotwell taylorotwell merged commit 23ce6d4 into 1.x Dec 5, 2023
@taylorotwell taylorotwell deleted the postgres branch December 5, 2023 14:47
@marcaparent

marcaparent commented Jul 24, 2024

Generated columns are not available in any PostgreSQL version lower than 12 (all of which are now unsupported). Here is a small workaround that fills the columns using a trigger instead:

Replacement migration file
<?php

use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;
use Laravel\Pulse\Support\PulseMigration;

return new class extends PulseMigration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        if (!$this->shouldRun()) {
            return;
        }

        Schema::create('pulse_values', function (Blueprint $table) {
            $table->id();
            $table->unsignedInteger('timestamp');
            $table->string('type');
            $table->mediumText('key');
            match ($this->driver()) {
                'mariadb', 'mysql' => $table->char('key_hash', 16)->charset('binary')->virtualAs('unhex(md5(`key`))'),
                /* Psql 11 support */
                /* 'pgsql' => $table->uuid('key_hash')->storedAs('md5("key")::uuid'), */
                'pgsql' => $table->uuid('key_hash')->nullable(),
                'sqlite' => $table->string('key_hash'),
            };
            $table->mediumText('value');

            $table->index('timestamp'); // For trimming...
            $table->index('type'); // For fast lookups and purging...
            $table->unique(['type', 'key_hash']); // For data integrity and upserts...
        });

        Schema::create('pulse_entries', function (Blueprint $table) {
            $table->id();
            $table->unsignedInteger('timestamp');
            $table->string('type');
            $table->mediumText('key');
            match ($this->driver()) {
                'mariadb', 'mysql' => $table->char('key_hash', 16)->charset('binary')->virtualAs('unhex(md5(`key`))'),
                /* Psql 11 support */
                /* 'pgsql' => $table->uuid('key_hash')->storedAs('md5("key")::uuid'), */
                'pgsql' => $table->uuid('key_hash')->nullable(),
                'sqlite' => $table->string('key_hash'),
            };
            $table->bigInteger('value')->nullable();

            $table->index('timestamp'); // For trimming...
            $table->index('type'); // For purging...
            $table->index('key_hash'); // For mapping...
            $table->index(['timestamp', 'type', 'key_hash', 'value']); // For aggregate queries...
        });

        Schema::create('pulse_aggregates', function (Blueprint $table) {
            $table->id();
            $table->unsignedInteger('bucket');
            $table->unsignedMediumInteger('period');
            $table->string('type');
            $table->mediumText('key');
            match ($this->driver()) {
                'mariadb', 'mysql' => $table->char('key_hash', 16)->charset('binary')->virtualAs('unhex(md5(`key`))'),
                /* Psql 11 support */
                /* 'pgsql' => $table->uuid('key_hash')->storedAs('md5("key")::uuid'), */
                'pgsql' => $table->uuid('key_hash')->nullable(),
                'sqlite' => $table->string('key_hash'),
            };
            $table->string('aggregate');
            $table->decimal('value', 20, 2);
            $table->unsignedInteger('count')->nullable();

            $table->unique(['bucket', 'period', 'type', 'aggregate', 'key_hash']); // Force "on duplicate update"...
            $table->index(['period', 'bucket']); // For trimming...
            $table->index('type'); // For purging...
            $table->index(['period', 'type', 'aggregate', 'bucket']); // For aggregate queries...
        });

        /**
         * As Psql 11 doesn't support generated columns, we have to create a trigger
         */
        if ($this->driver() == 'pgsql') {
            DB::statement('
                CREATE OR REPLACE FUNCTION generate_key_hash()
                RETURNS trigger
                LANGUAGE plpgsql
                SECURITY DEFINER
                AS $BODY$
                BEGIN
                    NEW."key_hash" = md5(NEW."key")::uuid;
                    RETURN NEW;
                END
                $BODY$;
            ');

            DB::statement('
                CREATE TRIGGER computed_key_hash_pulse_values
                BEFORE INSERT OR UPDATE
                ON pulse_values
                FOR EACH ROW
                EXECUTE PROCEDURE generate_key_hash();
            ');

            DB::statement('
                CREATE TRIGGER computed_key_hash_pulse_entries
                BEFORE INSERT OR UPDATE
                ON pulse_entries
                FOR EACH ROW
                EXECUTE PROCEDURE generate_key_hash();
            ');

            DB::statement('
                CREATE TRIGGER computed_key_hash_pulse_aggregates
                BEFORE INSERT OR UPDATE
                ON pulse_aggregates
                FOR EACH ROW
                EXECUTE PROCEDURE generate_key_hash();
            ');
        }
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        if ($this->driver() == 'pgsql') {
            DB::statement('DROP TRIGGER IF EXISTS computed_key_hash_pulse_aggregates ON pulse_aggregates;');
            DB::statement('DROP TRIGGER IF EXISTS computed_key_hash_pulse_entries ON pulse_entries;');
            DB::statement('DROP TRIGGER IF EXISTS computed_key_hash_pulse_values ON pulse_values;');
            DB::statement('DROP FUNCTION IF EXISTS generate_key_hash;');
        }

        Schema::dropIfExists('pulse_values');
        Schema::dropIfExists('pulse_entries');
        Schema::dropIfExists('pulse_aggregates');
    }
};

@muhaimenul

@marcaparent Thanks bro. I faced a similar issue and your modified migration worked for PostgreSQL 9.5.
