Handle timeouts more gracefully by allowing the application to shutdown #895
base: master
Changes from all commits
e78d2c6
5a57044
b89aa04
52f8d65
37a5759
a2e46ca
0b6d7b2
9356e4a
2efd449
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
--- | ||
title: Timeouts | ||
current_menu: timeouts | ||
introduction: Configure and handle timeouts. | ||
--- | ||
|
||
When a Lambda function times out, it is as if the power to the computer were suddenly | ||
cut off. The application gets no chance to shut down properly, | ||
so you are left without any logs and the problem can be hard to diagnose. | ||
|
||
To allow your application to shut down properly and write logs, Bref can throw an exception just before the Lambda times out. | ||
|
||
> Note: this feature is experimental and available since Bref 1.3. | ||
|
||
To enable this feature **in `php-XX` layers**, set the environment variable `BREF_FEATURE_TIMEOUT`: | ||
|
||
```yaml | ||
provider: | ||
    environment: | ||
        BREF_FEATURE_TIMEOUT: 1 | ||
``` | ||
|
||
To enable this feature **in `php-XX-fpm` layers**, call `Timeout::enableInFpm()` in your application. | ||
For example in `index.php`: | ||
|
||
```php | ||
if (isset($_SERVER['LAMBDA_TASK_ROOT'])) { | ||
\Bref\Timeout\Timeout::enableInFpm(); | ||
} | ||
``` | ||
|
||
Whenever a timeout happens, a full stack trace will be logged, including the line that was executing. | ||
|
||
In most cases, it is an external call to a database, cache or API that is stuck waiting. | ||
If you are using an RDS database, [you are encouraged to read this section](database.md#accessing-the-internet). | ||
|
||
## Catching the exception | ||
|
||
You can catch the timeout exception to perform cleanup, write logs, or even display a proper error page. | ||
|
||
In `php-XX-fpm` layers, most frameworks will catch the `LambdaTimeout` exception automatically (like any other error). | ||
|
||
In `php-XX` layers, you can catch it in your handlers. For example: | ||
|
||
```php | ||
use Bref\Context\Context; | ||
use Bref\Timeout\LambdaTimeout; | ||
|
||
class Handler implements \Bref\Event\Handler | ||
{ | ||
public function handle($event, Context $context) | ||
{ | ||
try { | ||
// your code here | ||
// ... | ||
} catch (LambdaTimeout $e) { | ||
echo 'Oops, sorry. We spent too much time on this.'; | ||
} catch (\Throwable $e) { | ||
echo 'Some other unexpected error happened.'; | ||
} | ||
} | ||
} | ||
``` |
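The handler above only has a limited window in which to react. As a rough sketch of how long that window is, the function below mirrors the arithmetic in `Timeout::enable()` from this pull request: the remaining time is rounded down to whole seconds, a 2-second margin is subtracted, and the result never drops below 1 second. The name `timeoutDelayInSeconds` and the sample values are illustrative assumptions, not part of Bref.

```php
<?php declare(strict_types=1);

/**
 * Hypothetical helper (not part of Bref) that reproduces the delay
 * computation used by Timeout::enable() in this pull request.
 */
function timeoutDelayInSeconds(int $remainingTimeInMillis, int $marginInSeconds = 2): int
{
    // Round down to whole seconds because pcntl_alarm() only accepts seconds.
    $remainingTimeInSeconds = (int) floor($remainingTimeInMillis / 1000);

    // Never schedule the alarm sooner than 1 second from now.
    return max(1, $remainingTimeInSeconds - $marginInSeconds);
}

// With 28.5 s left, the timeout exception is scheduled 26 s from now.
echo timeoutDelayInSeconds(28500), "\n";
// With only 2.5 s left, the alarm is still scheduled 1 s from now.
echo timeoutDelayInSeconds(2500), "\n";
```

When fewer than 3 seconds remain, the 1-second floor takes over, so the effective margin before Lambda stops the function shrinks below 2 seconds.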
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
<?php declare(strict_types=1); | ||
|
||
namespace Bref\Timeout; | ||
|
||
/** | ||
* The application took too long to produce a response. This exception is thrown | ||
* to give the application a chance to flush logs and shut itself down before | ||
* the power to AWS Lambda is disconnected. | ||
*/ | ||
class LambdaTimeout extends \RuntimeException | ||
{ | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,108 @@ | ||
<?php declare(strict_types=1); | ||
|
||
namespace Bref\Timeout; | ||
|
||
/** | ||
* Helper class to trigger an exception just before the Lambda times out. This | ||
* will give the application a chance to shut down. | ||
*/ | ||
final class Timeout | ||
{ | ||
/** @var bool */ | ||
private static $initialized = false; | ||
|
||
/** @var string|null */ | ||
private static $stackTrace = null; | ||
|
||
/** | ||
* Automatically set up a timeout (based on the AWS Lambda timeout). | ||
* | ||
* This method can only be called when running in PHP-FPM, i.e. when using a `php-XX-fpm` layer. | ||
*/ | ||
public static function enableInFpm(): void | ||
{ | ||
if (! isset($_SERVER['LAMBDA_INVOCATION_CONTEXT'])) { | ||
throw new \LogicException('Could not find value for bref timeout. Are we running on Lambda?'); | ||
} | ||
|
||
$context = json_decode($_SERVER['LAMBDA_INVOCATION_CONTEXT'], true, 512, JSON_THROW_ON_ERROR); | ||
$deadlineMs = $context['deadlineMs']; | ||
$remainingTimeInMillis = $deadlineMs - intval(microtime(true) * 1000); | ||
|
||
self::enable($remainingTimeInMillis); | ||
} | ||
|
||
/** | ||
* @internal | ||
*/ | ||
public static function enable(int $remainingTimeInMillis): void | ||
{ | ||
self::init(); | ||
|
||
$remainingTimeInSeconds = (int) floor($remainingTimeInMillis / 1000); | ||
|
||
// The script will time out 2 seconds before the remaining time runs out | ||
// to allow some time for Bref/our app to recover and cleanup | ||
$margin = 2; | ||
|
||
$timeoutDelayInSeconds = max(1, $remainingTimeInSeconds - $margin); | ||
|
||
// Trigger SIGALRM in X seconds | ||
pcntl_alarm($timeoutDelayInSeconds); | ||
} | ||
|
||
/** | ||
* Set up a custom handler for SIGALRM. | ||
*/ | ||
private static function init(): void | ||
{ | ||
self::$stackTrace = null; | ||
|
||
if (self::$initialized) { | ||
return; | ||
} | ||
|
||
if (! function_exists('pcntl_async_signals')) { | ||
trigger_error('Could not enable timeout exceptions because pcntl extension is not enabled.'); | ||
return; | ||
} | ||
|
||
pcntl_async_signals(true); | ||
// Set up a handler for SIGALRM that throws an exception | ||
// This will interrupt any running PHP code, including `sleep()` or code stuck waiting for I/O. | ||
pcntl_signal(SIGALRM, function (): void { | ||
if (Timeout::$stackTrace !== null) { | ||
// We have already thrown an exception, so do a hard exit. | ||
error_log('Lambda timed out'); | ||
error_log((new LambdaTimeout)->getTraceAsString()); | ||
error_log('Original stack trace'); | ||
error_log(Timeout::$stackTrace); | ||
|
||
exit(1); | ||
Should we force a That would only be taken into account in the FPM layer, so that wouldn't really impact the

How?

http_response_code(500);

🤔 not sure about that, maybe. But in any case, the same would happen if your app returns a pretty 500 error page. Does that make sense? What I mean is that, when using FPM, there would be no difference between:

In both cases, API Gateway would behave the same way.

Oh, no that is not true.

This will show you a nice error page. API Gateway does nothing

No content (HTML) is provided. I would like the custom error page in API Gateway to be used. I'm not sure if

Also note: we are not sure it is an HTTP context. It could be any other event, so I'm not sure about sending an HTTP status code.
||
} | ||
|
||
$exception = new LambdaTimeout('Maximum AWS Lambda execution time reached'); | ||
Timeout::$stackTrace = $exception->getTraceAsString(); | ||
|
||
// Trigger another alarm after 1 second to do a hard exit. | ||
pcntl_alarm(1); | ||
|
||
throw $exception; | ||
}); | ||
|
||
self::$initialized = true; | ||
} | ||
|
||
/** | ||
* Cancel all current timeouts. | ||
* | ||
* @internal | ||
*/ | ||
public static function reset(): void | ||
{ | ||
if (self::$initialized) { | ||
pcntl_alarm(0); | ||
self::$stackTrace = null; | ||
} | ||
} | ||
} |
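To see the mechanism in isolation, here is a minimal sketch, assuming the `pcntl` extension is loaded and the script runs on the CLI, of how an alarm signal can interrupt blocking code by throwing an exception. `DemoTimeout` and `runWithAlarm()` are hypothetical names standing in for `LambdaTimeout` and the Bref internals; the second "hard exit" alarm used by the handler above is left out for brevity.

```php
<?php declare(strict_types=1);

// Hypothetical stand-in for Bref\Timeout\LambdaTimeout so the sketch runs without Bref.
final class DemoTimeout extends \RuntimeException
{
}

/**
 * Run $work, but throw DemoTimeout inside it if it is still running after $seconds.
 */
function runWithAlarm(int $seconds, callable $work): string
{
    // Run signal handlers automatically, without explicit pcntl_signal_dispatch() calls.
    pcntl_async_signals(true);
    pcntl_signal(SIGALRM, function (): void {
        throw new DemoTimeout('Time budget exhausted');
    });

    // Ask the kernel to send SIGALRM to this process in $seconds seconds.
    pcntl_alarm($seconds);

    try {
        $work();

        return 'finished';
    } catch (DemoTimeout $e) {
        return 'timed out';
    } finally {
        // Cancel any pending alarm and restore the default SIGALRM behavior.
        pcntl_alarm(0);
        pcntl_signal(SIGALRM, SIG_DFL);
    }
}
```

Calling `runWithAlarm(1, function (): void { sleep(3); })` should return `'timed out'` after about one second, because the signal interrupts `sleep()` and the handler throws as soon as control returns to PHP code. The `finally` block plays the same role as `Timeout::reset()`: it makes sure a leftover alarm cannot fire during a later invocation.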
For the record, I'm testing and notice that `error_log` adds noise in FPM:

I'm changing that to:

Which gives a cleaner log:

No need to update the PR.

Oh, I didn't know. Thank you for testing.