-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed feign-micrometer exception handling. #2644
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions could not be made:
- annotation-error-decoder/pom.xml
- lines 49-48
- apt-test-generator/pom.xml
- lines 62-61
- benchmark/pom.xml
- lines 70-69
- core/pom.xml
- lines 37-36
- googlehttpclient/pom.xml
- lines 54-53
- hc5/pom.xml
- lines 67-66
- httpclient/pom.xml
- lines 75-74
- hystrix/pom.xml
- lines 78-77
- jackson-jaxb/pom.xml
- lines 82-83
- java11/pom.xml
- lines 50-49
- jaxb-jakarta/pom.xml
- lines 34-35
- lines 55-57
- jaxb/pom.xml
- lines 45-49
- jaxrs2/pom.xml
- lines 103-102
- jaxrs3/pom.xml
- lines 89-88
- jaxrs4/pom.xml
- lines 83-82
- kotlin/pom.xml
- lines 71-70
- okhttp/pom.xml
- lines 59-58
- pom.xml
- lines 579-580
- reactive/pom.xml
- lines 98-97
- ribbon/pom.xml
- lines 83-82
- soap-jakarta/pom.xml
- lines 34-35
- lines 63-62
- lines 79-81
- soap/pom.xml
- lines 55-60
- lines 70-71
I've just added some fixes, as there were exceptions when some clients had defined a custom ErrorDecoder. The input stream is consumed, and the decoder throws an exception. This way, we are copying the input stream so it can be used multiple times. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions could not be made:
- annotation-error-decoder/pom.xml
- lines 49-48
- apt-test-generator/pom.xml
- lines 62-61
- benchmark/pom.xml
- lines 70-69
- core/pom.xml
- lines 37-36
- googlehttpclient/pom.xml
- lines 54-53
- hc5/pom.xml
- lines 67-66
- httpclient/pom.xml
- lines 75-74
- hystrix/pom.xml
- lines 78-77
- jackson-jaxb/pom.xml
- lines 82-83
- java11/pom.xml
- lines 50-49
- jaxb-jakarta/pom.xml
- lines 34-35
- lines 55-57
- jaxb/pom.xml
- lines 45-49
- jaxrs2/pom.xml
- lines 103-102
- jaxrs3/pom.xml
- lines 89-88
- jaxrs4/pom.xml
- lines 83-82
- kotlin/pom.xml
- lines 71-70
- okhttp/pom.xml
- lines 59-58
- pom.xml
- lines 579-580
- reactive/pom.xml
- lines 98-97
- ribbon/pom.xml
- lines 83-82
- soap-jakarta/pom.xml
- lines 34-35
- lines 63-62
- lines 79-81
- soap/pom.xml
- lines 55-60
- lines 70-71
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we discuss the need to read the request body into memory more?
if (response.body() != null) { | ||
response = | ||
response.toBuilder().body(Util.toByteArray(response.body().asInputStream())).build(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If your intent is to properly flag the request with the appropriate error based on status, reading the response into memory before checking the status code may be problematic in certain scenarios. Particularly in one where a large number of successful responses are monitored with large data sets.
By doing this, we impress upon our users that their applications must be able to stream all data from all tracked operations into memory in order to observe this.
Can you help me understand the exceptions observed when trying to observe a client with a custom ErrorDecoder
?
If this must be part of the solution, we should document this assumption and control this behavior with a configurable feature flag, initially off, and allow our users to make the decision whether to take this on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, @kdavisk6 , thank you for your comment. I've made the changes to cache exceptions (errors) only, so successful responses are no longer cached.
Regarding the custom ErrorDecoder
, the issue is that when the exception response message body is read, the stream is consumed in MicrometerObservationCapability.java
, making it unavailable in the ErrorDecoder
. This is problematic because, in most cases, we need to read the response again and map it to another object. When the response is cached, we can read the response as many times as needed. For example, this is useful if we have loggers, additional logic in filters, and so on.
To summarize:
We need to read the response body for exceptions (client/server error codes in the response) to mark the span with error details and exception information. If we don't cache the response, we cannot read it in another layer, such as the ErrorDecoder
, because it is already consumed for observability purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification.
Reviewing MicrometerObservabilityCapability
, I'm do not see it explicitly consuming the Response.body
, only wrapping it. Do you have an example where the Response
is consumed during metrics processing? I'd like to confirm that this is intentional or not. If not, then we should address that, which should remove the need for this workaround.
If it is intentional, I'm still leery of allowing this without some level of user control. I would like to see some way for users to control this behavior. There may be situations where even the error response may be large enough to cause memory pressure and we should not be pressing this onto our users without their consent.
One way is for us to check the Content-Length
on the response and if that is over a certain threshold, we should defer streaming the response into memory. Another is to have a configurable option as part of the MicrometerCapability
that explicitly enables response caching.
Head branch was pushed to by a user without write access
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comments
The current MR improves exception handling for feign-micrometer clients. Previously, exceptions such as 404, 400, 500, etc., were not marked with error=true in the Observation and thus were not reported in the trace as errors.
The exception declaration has been changed from (FeignException ex) to (Exception ex), as none of the supported clients (e.g., Apache HttpClient, OkHttp) throws FeignException, but rather client-specific exceptions.
The fix is currently in production in my organization, and it has been working effectively.
Fixes #2566