PHP-FPM children die with SIGPIPE since 8.3.10 #1854

Open
driskell opened this issue Aug 28, 2024 · 12 comments
@driskell

driskell commented Aug 28, 2024

Description:

See: brefphp/aws-lambda-layers#192

We're seeing lots of random failures in 2.3.4 and 2.3.3 where the child dies of SIGPIPE at the start of a request, which suggests something raised a SIGPIPE before the request began.

Error communicating with PHP-FPM to read the HTTP response. Bref will restart PHP-FPM now. Original exception message: hollodotme\FastCGI\Exceptions\ReadFailedException Stream got blocked, or terminated.

Followed immediately by (two examples, showing it happens at different times):

[28-Aug-2024 11:03:08] WARNING: [pool default] child 11 exited on signal 13 (SIGPIPE) after 49.066900 seconds from start
[28-Aug-2024 11:03:16] WARNING: [pool default] child 71 exited on signal 13 (SIGPIPE) after 7.462262 seconds from start

Then finally:

{
    "errorType": "Bref\\FpmRuntime\\FastCgi\\FastCgiCommunicationFailed",
    "errorMessage": "",
    "stack": [
        "#0 /var/task/vendor/bref/bref/src/Event/Http/HttpHandler.php(25): Bref\\FpmRuntime\\FpmHandler->handleRequest(Object(Bref\\Event\\Http\\HttpRequestEvent), Object(Bref\\Context\\Context))",
        "#1 /var/task/vendor/bref/bref/src/Runtime/Invoker.php(24): Bref\\Event\\Http\\HttpHandler->handle(Array, Object(Bref\\Context\\Context))",
        "#2 /var/task/vendor/bref/bref/src/Runtime/LambdaRuntime.php(94): Bref\\Runtime\\Invoker->invoke(Object(Bref\\FpmRuntime\\FpmHandler), Array, Object(Bref\\Context\\Context))",
        "#3 /var/task/vendor/bref/bref/src/FpmRuntime/Main.php(46): Bref\\Runtime\\LambdaRuntime->processNextEvent(Object(Bref\\FpmRuntime\\FpmHandler))",
        "#4 /opt/bref/bootstrap.php(17): Bref\\FpmRuntime\\Main::run()",
        "#5 {main}"
    ]
}

Similar to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280775 - which seems to refer to libxml
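For triage, the FPM warnings quoted above can be pulled apart with a small script to see which children died, on which signal, and how long they had been running. This is only a sketch: `parseFpmSignalExit` is a hypothetical helper name, and the regular expression assumes the exact log wording shown in this report.

```php
<?php
// Hypothetical helper (not part of Bref or php-fpm): extracts the child id,
// signal number, signal name and uptime from an "exited on signal" warning.
function parseFpmSignalExit(string $line): ?array
{
    $pattern = '/child (\d+) exited on signal (\d+) \((\w+)\) after ([\d.]+) seconds from start/';
    if (preg_match($pattern, $line, $m) !== 1) {
        return null; // not a signal-exit line
    }
    return [
        'child'   => (int) $m[1],
        'signal'  => (int) $m[2],
        'name'    => $m[3],
        'seconds' => (float) $m[4],
    ];
}

// The two log lines quoted in this issue.
$lines = [
    '[28-Aug-2024 11:03:08] WARNING: [pool default] child 11 exited on signal 13 (SIGPIPE) after 49.066900 seconds from start',
    '[28-Aug-2024 11:03:16] WARNING: [pool default] child 71 exited on signal 13 (SIGPIPE) after 7.462262 seconds from start',
];

// Drop lines that did not match (null) and reindex.
$exits = array_values(array_filter(array_map('parseFpmSignalExit', $lines)));

foreach ($exits as $exit) {
    printf("child %d: %s after %.2fs\n", $exit['child'], $exit['name'], $exit['seconds']);
}
```

Run against a larger log export, the same loop would show whether the uptime before the crash clusters around a particular value or is effectively random.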

How to reproduce:

Install Bref 2.3.4 or 2.3.3, use php-fpm as the runtime, and send lots of requests.

I'm not entirely sure this will reproduce it, as I haven't tried, but it definitely all goes away when downgrading back to 2.3.1.

@driskell driskell added the bug label Aug 28, 2024
@driskell driskell changed the title from "PHP-FPM Children die with SIGPIPE since 8.3.10" to "PHP-FPM children die with SIGPIPE since 8.3.10" Aug 28, 2024
@mnapoli
Member

mnapoli commented Aug 28, 2024

Thanks for the report! Any idea what makes you think this is related to libxml specifically? I don't see how xml could play a role in the FPM connection.

Could it be a regression in the new PHP version?

@driskell
Author

driskell commented Aug 28, 2024

@mnapoli It could be. I only mention libxml because of https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280775.

Usually php-fpm would disable SIGPIPE, but I've found sources showing there have been cases in the past where libraries (such as curl or OpenSSL) re-enable it, and then the php-fpm signal handler gets confused and kills things. So it could be that something is re-enabling the signal, and I think libxml was involved before, based on the above Bugzilla report.

It could be worth me building PHP 8.3.10 separately against an older libxml to test, but I'm not sure where to start right now. We've stuck to 2.3.1 on 8.3.9 for now until I can find time to play around with it.
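To illustrate what "disabling SIGPIPE" means from the PHP side, here is a minimal sketch that sets the signal to be ignored and then confirms the process survives receiving it. It assumes the pcntl and posix extensions on Linux, and it only covers the disposition set from PHP userland: it cannot detect or prevent a C library changing the disposition afterwards, and it has not been tested against the failure described in this issue.

```php
<?php
// Sketch only: requires the pcntl and posix extensions (CLI, Linux).

// Explicitly (re)assert that SIGPIPE is ignored for this PHP process.
$ok = pcntl_signal(SIGPIPE, SIG_IGN);

// Send SIGPIPE to ourselves. With the default disposition the process
// would terminate here; with SIG_IGN the signal is discarded.
posix_kill(posix_getpid(), SIGPIPE);

// Reaching this line means the signal did not kill the process.
$survived = true;

echo ($ok && $survived) ? "SIGPIPE ignored\n" : "could not set SIG_IGN\n";
```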

@mnapoli
Member

mnapoli commented Aug 28, 2024

Interesting! Adding @GrahamCampbell to this

@GrahamCampbell
Contributor

Odd, though I don't think this is libxml related, because Bref's 8.3.9 is using 2.12.x, and that linked issue says that 2.11.x is good, but 2.12.x is not.

@GrahamCampbell
Contributor

We could try downgrading to 2.12.x, but we may need to re-evaluate if a security issue is later discovered in 2.12.x and not patched.

@GrahamCampbell
Contributor

Actually, no. That issue is specifically about PHP 8.1, and libxml 2.13 does not work on that version. We only use 2.13 on PHP 8.2+.

@mnapoli
Member

mnapoli commented Aug 28, 2024

So it could be the PHP version. We could try waiting for brefphp/aws-lambda-layers#199 and see if it fixes it?

@driskell
Author

I can try and test with brefphp/aws-lambda-layers#199 once it is released.

@mnapoli
Member

mnapoli commented Aug 31, 2024

@gmorel

gmorel commented Sep 2, 2024

Maybe this could help. I had the same issue with https://github.com/reactphp/http on AWS Lambda:

RequestId: 5ad1b639-980b-4f3c-baf6-7ac7d40971ee Error: Runtime exited with error: exit status 141
Runtime.ExitError
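The exit status 141 is consistent with the SIGPIPE reports earlier in this thread if one assumes the common shell convention of reporting a signal death as 128 plus the signal number (SIGPIPE is 13 on Linux). A small sketch of that arithmetic, with `signalFromExitStatus` as a hypothetical helper name:

```php
<?php
// Hypothetical helper: maps a shell-style exit status to a signal number,
// assuming the "128 + signal number" convention applies.
function signalFromExitStatus(int $status): ?int
{
    return $status > 128 ? $status - 128 : null;
}

$signal = signalFromExitStatus(141);

// 141 - 128 = 13, which is SIGPIPE on Linux.
echo $signal === 13 ? "exit 141 => signal 13 (SIGPIPE on Linux)\n" : "not SIGPIPE\n";
```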

It worked perfectly locally in my Docker container but crashed on the AWS infrastructure (Lambda as a Docker image).

I managed to fix it with:

        // Install a SIGPIPE handler so the signal no longer terminates the process
        \pcntl_async_signals(true);
        \pcntl_signal(SIGPIPE, function() {
            echo 'Caught SIGPIPE, closing socket.' . PHP_EOL; // Debug output only; remove this line in prod
        });

With \pcntl_async_signals(true), you enable asynchronous signal dispatch, so handlers run without declare(ticks=1) or manual pcntl_signal_dispatch() calls.
With \pcntl_signal(SIGPIPE, ...), you register a handler function for SIGPIPE, so the signal is caught instead of terminating the process.
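One way to check that such a handler really intercepts the signal is to send SIGPIPE to the current process and count how often the handler runs. This is a sketch under assumptions: it needs the pcntl and posix extensions on Linux, the `$caught` counter is introduced purely for illustration, and self-signalling only simulates the failure mode rather than reproducing the Lambda crash.

```php
<?php
// Sketch only: requires the pcntl and posix extensions (CLI, Linux).
pcntl_async_signals(true);

$caught = 0;
pcntl_signal(SIGPIPE, function (int $signo) use (&$caught): void {
    $caught++; // record the signal instead of letting it kill the process
});

// Simulate the failure mode by signalling ourselves.
posix_kill(posix_getpid(), SIGPIPE);

// Async mode normally runs the handler on its own; this call just flushes
// anything still pending so the check below is deterministic.
pcntl_signal_dispatch();

echo "SIGPIPE handler ran {$caught} time(s)\n";
```

A counter keeps the handler free of output, which matters under FPM where stray `echo` output would end up in the HTTP response.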

I don't know what you think, and I don't know the impact. But since the pcntl extension is included by default in the Bref layer, maybe we could add this directly to Bref.

@driskell
Author

driskell commented Sep 2, 2024

@mnapoli Interesting: no issues with that one. All working.

@driskell
Author

driskell commented Sep 19, 2024

@mnapoli I stand corrected - I am still seeing some issues with SIGPIPE. It's just more sporadic now, and I don't know whether that is related to 2.3.5 or not. I've had to roll back to 2.3.1 and it works smoothly again.


4 participants