Laravel Version
10.13.1
PHP Version
8.2.6
Database Driver & Version
No response
Description
We encountered problems with job uniqueness. Jobs configured with a maximum of 1 attempt failed multiple times and could not be written to the failed_jobs table because of the failed_jobs_uuid_unique constraint.
The failed_jobs_uuid_unique violation itself may be a separate bug in the error handling of jobs that implement "ShouldBeUnique" and fail more than once. Following the stack trace, we found that the failures were caused by a MaxAttemptsExceededException:
[2023-06-05 13:59:10] production.ERROR: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'b0985078-42cd-4512-a70f-91fad1f69ca5' for key 'failed_jobs.failed_jobs_uuid_unique' (Connection: mysql, SQL: insert into `failed_jobs` (`uuid`, `connection`, `queue`, `payload`, `exception`, `failed_at`) values (b0985078-42cd-4512-a70f-91fad1f69ca5, redis, default, {"backoff":null,"uuid":"b0985078-42cd-4512-a70f-91fad1f69ca5","timeout":290,"tags":{},"id":"b0985078-42cd-4512-a70f-91fad1f69ca5","job":"Illuminate\\Queue\\CallQueuedHandler@call","pushedAt":"1685973367.0205","maxExceptions":null,"maxTries":1,"data":{"command":"O:47:\"App\\Domain\\Caching\\Jobs\\FullyCacheIdentifierJob\":1:{s:56:\"\u0000App\\Domain\\Caching\\Jobs\\FullyCacheIdentifierJob\u0000refUuid\";s:36:\"63f3cb9c-66f4-47af-a915-882d37e999d6\";}","commandName":"App\\Domain\\Caching\\Jobs\\FullyCacheIdentifierJob"},"failOnTimeout":true,"silenced":false,"retryUntil":null,"type":"job","displayName":"App\\Domain\\Caching\\Jobs\\FullyCacheIdentifierJob","attempts":2}, Illuminate\Queue\MaxAttemptsExceededException: App\Domain\Caching\Jobs\FullyCacheIdentifierJob has been attempted too many times or run too long. The job may have previously timed out. in /home/www-data/api/vendor/laravel/framework/src/Illuminate/Queue/Worker.php:780
Stack trace:
#0 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(514): Illuminate\Queue\Worker->maxAttemptsExceededException(Object(Illuminate\Queue\Jobs\RedisJob))
#1 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(423): Illuminate\Queue\Worker->markJobAsFailedIfAlreadyExceedsMaxAttempts('redis', Object(Illuminate\Queue\Jobs\RedisJob), 1)
#2 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(384): Illuminate\Queue\Worker->process('redis', Object(Illuminate\Queue\Jobs\RedisJob), Object(Illuminate\Queue\WorkerOptions))
#3 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Queue/Worker.php(175): Illuminate\Queue\Worker->runJob(Object(Illuminate\Queue\Jobs\RedisJob), 'redis', Object(Illuminate\Queue\WorkerOptions))
#4 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Queue/Console/WorkCommand.php(136): Illuminate\Queue\Worker->daemon('redis', 'default', Object(Illuminate\Queue\WorkerOptions))
#5 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Queue/Console/WorkCommand.php(119): Illuminate\Queue\Console\WorkCommand->runWorker('redis', 'default')
#6 /home/www-data/api/vendor/laravel/horizon/src/Console/WorkCommand.php(51): Illuminate\Queue\Console\WorkCommand->handle()
#7 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(36): Laravel\Horizon\Console\WorkCommand->handle()
#8 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Container/Util.php(41): Illuminate\Container\BoundMethod::Illuminate\Container\{closure}()
#9 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(93): Illuminate\Container\Util::unwrapIfClosure(Object(Closure))
#10 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(35): Illuminate\Container\BoundMethod::callBoundMethod(Object(Illuminate\Foundation\Application), Array, Object(Closure))
#11 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Container/Container.php(661): Illuminate\Container\BoundMethod::call(Object(Illuminate\Foundation\Application), Array, Array, NULL)
#12 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Console/Command.php(194): Illuminate\Container\Container->call(Array)
#13 /home/www-data/api/vendor/symfony/console/Command/Command.php(312): Illuminate\Console\Command->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
#14 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Console/Command.php(163): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
#15 /home/www-data/api/vendor/symfony/console/Application.php(1040): Illuminate\Console\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#16 /home/www-data/api/vendor/symfony/console/Application.php(314): Symfony\Component\Console\Application->doRunCommand(Object(Laravel\Horizon\Console\WorkCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#17 /home/www-data/api/vendor/symfony/console/Application.php(168): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#18 /home/www-data/api/vendor/laravel/framework/src/Illuminate/Foundation/Console/Kernel.php(200): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#19 /home/www-data/api/artisan(35): Illuminate\Foundation\Console\Kernel->handle(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#20 {main}, 2023-06-05 14:00:40)) {"exception":{"errorInfo":["23000",1062,"Duplicate entry '3ca6725e-ed61-45b1-8df2-109631e964bb' for key 'failed_jobs.failed_jobs_uuid_unique'"],"connectionName":"mysql"}}
After removing the unique constraint from the database (for easier debugging), we narrowed down the error to the following log line:
[2023-06-05 13:43:54] production.ERROR: App\Domain\Caching\Jobs\BuildCacheJob has been attempted too many times or run too long. The job may have previously timed out. {"exception":[]}
Digging deeper, we discovered that jobs were not being removed from the "reserved" queue correctly, so the workers picked them up and executed them again, resulting in the MaxAttemptsExceededException.
Further examination of the framework source revealed a failed removal of entries in \Illuminate\Queue\RedisQueue::deleteReserved: the zrem it issues always returned 0, indicating that the reserved entry was never removed.
After spending more hours experimenting with different Redis clients and configurations, we found that enabling serialization and/or compression in the Redis client configuration resulted in a faulty zrem request. As a workaround, we disabled both options for the default database used by the workers and did not investigate further.
Our hypothesis is that the serialization/compression applied by the client causes a mismatch between the member sent in the zrem request and the member stored in the reserved sorted set, so the entry is never deleted.
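This minimal sketch illustrates the hypothesis (it is not the framework code itself; it assumes a phpredis build with igbinary and zstd support, and the key name and payload are made up for illustration). A member written raw, the way the queue's Lua-based reserve step effectively stores it, can no longer be matched by a zRem issued with serializer/compression enabled:

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$payload = '{"uuid":"example","job":"..."}';

// Written with the default options, so the member is stored verbatim,
// like the raw payload the queue's Lua script puts into :reserved.
$redis->zAdd('queues:default:reserved', 1, $payload);

// Now enable the options from our global Redis configuration.
$redis->setOption(Redis::OPT_SERIALIZER, Redis::SERIALIZER_IGBINARY);
$redis->setOption(Redis::OPT_COMPRESSION, Redis::COMPRESSION_ZSTD);

// zRem serializes and compresses $payload before sending it, so the member
// it asks Redis to remove is not byte-identical to the stored one.
var_dump($redis->zRem('queues:default:reserved', $payload)); // int(0), nothing deleted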
The workaround was to disable the serializer and compression for the default connection used by the workers:
'redis' => [

    'client' => env('REDIS_CLIENT', 'phpredis'),

    'options' => [
        'cluster' => env('REDIS_CLUSTER', 'redis'),
        'prefix' => 'XXX:',
        'serializer' => 2, // Redis::SERIALIZER_IGBINARY
        'compression' => 2, // Redis::COMPRESSION_ZSTD
    ],

    'default' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD'),
        'port' => env('REDIS_PORT', '6379'),
        'database' => env('REDIS_DB', '0'),
        'scheme' => env('REDIS_SCHEME', 'tcp'),
        'options' => [
            'serializer' => 0, // disabled the global setting because jobs were otherwise not removed from the reserved queue
            'compression' => 0, // disabled the global setting because jobs were otherwise not removed from the reserved queue
        ],
    ],

    'cache' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD'),
        'port' => env('REDIS_PORT', '6379'),
        'database' => env('REDIS_CACHE_DB', '1'),
        'scheme' => env('REDIS_SCHEME', 'tcp'),
    ],
],
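As a quick sanity check after applying the workaround (a hypothetical snippet; the key name assumes the default queue), the reserved sorted set should be empty again once a worker has finished a job:

use Illuminate\Support\Facades\Redis;

// Hypothetical check: count the members left in the reserved sorted set
// after a job has run; with the workaround in place this drops back to 0.
var_dump(Redis::connection()->zcard('queues:default:reserved')); // int(0)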
Hope I haven't forgotten anything :)
Thanks
Steps To Reproduce
Configure the Redis database with:
phpredis as the client
the serializer option set to 2 (Redis::SERIALIZER_IGBINARY)
the compression option set to 2 (Redis::COMPRESSION_ZSTD)
'redis' => [

    'client' => env('REDIS_CLIENT', 'phpredis'),

    'options' => [
        'cluster' => env('REDIS_CLUSTER', 'redis'),
        'prefix' => 'XXX:',
        'serializer' => 2, // Redis::SERIALIZER_IGBINARY
        'compression' => 2, // Redis::COMPRESSION_ZSTD
    ],

    'default' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD'),
        'port' => env('REDIS_PORT', '6379'),
        'database' => env('REDIS_DB', '0'),
        'scheme' => env('REDIS_SCHEME', 'tcp'),
    ],

    'cache' => [
        'url' => env('REDIS_URL'),
        'host' => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD'),
        'port' => env('REDIS_PORT', '6379'),
        'database' => env('REDIS_CACHE_DB', '1'),
        'scheme' => env('REDIS_SCHEME', 'tcp'),
    ],
],
Create a simple job with tries set to 1 (I also added ShouldBeUnique to provoke the DB constraint error, which from my point of view is a separate issue):
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldBeUnique;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class SomeJob implements ShouldQueue, ShouldBeUnique
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $tries = 1;

    public function handle(): void
    {
        sleep(1);
    }

    public function uniqueId(): string
    {
        return '7411'; // returned as a string to match the declared return type
    }
}
Dispatch the job and wait for the second execution.
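For example, from a tinker session or any application code path (illustrative):

SomeJob::dispatch();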
As described above, this results in multiple executions: on the worker's second run the job is picked up again because it was never removed from the reserved queue.