-
Notifications
You must be signed in to change notification settings - Fork 874
PooledConnectionRetryHandler #4230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/transfermanager
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,183 @@ | ||||||||||||||
| /* | ||||||||||||||
| * Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||||||||||||||
| * | ||||||||||||||
| * Licensed under the Apache License, Version 2.0 (the "License"). | ||||||||||||||
| * You may not use this file except in compliance with the License. | ||||||||||||||
| * A copy of the License is located at | ||||||||||||||
| * | ||||||||||||||
| * http://aws.amazon.com/apache2.0 | ||||||||||||||
| * | ||||||||||||||
| * or in the "license" file accompanying this file. This file is distributed | ||||||||||||||
| * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either | ||||||||||||||
| * express or implied. See the License for the specific language governing | ||||||||||||||
| * permissions and limitations under the License. | ||||||||||||||
| */ | ||||||||||||||
|
|
||||||||||||||
| using Amazon.Runtime.Internal.Util; | ||||||||||||||
| using System; | ||||||||||||||
| using System.IO; | ||||||||||||||
| using System.Net.Http; | ||||||||||||||
| using System.Net.Sockets; | ||||||||||||||
| using System.Threading; | ||||||||||||||
| using System.Threading.Tasks; | ||||||||||||||
|
|
||||||||||||||
| namespace Amazon.Runtime.Pipeline.HttpHandler | ||||||||||||||
| { | ||||||||||||||
| /// <summary> | ||||||||||||||
| /// A DelegatingHandler that automatically retries requests when they fail due to | ||||||||||||||
| /// stale connections from the HTTP connection pool. This prevents dead pooled | ||||||||||||||
| /// connections from counting against the SDK's retry limit. | ||||||||||||||
| /// </summary> | ||||||||||||||
| /// <remarks> | ||||||||||||||
| /// When HttpClient reuses a connection from its pool, it may not immediately know | ||||||||||||||
| /// if the server has closed that connection. The first write attempt will fail with | ||||||||||||||
| /// errors like "Broken pipe" or "Connection reset". This handler catches these | ||||||||||||||
| /// specific errors and retries the request once, allowing HttpClient to establish | ||||||||||||||
| /// a fresh connection without consuming a retry from the SDK's retry policy. | ||||||||||||||
| /// </remarks> | ||||||||||||||
| public class PooledConnectionRetryHandler : DelegatingHandler | ||||||||||||||
| { | ||||||||||||||
| // Key used to mark requests that have already been retried by this handler | ||||||||||||||
| private const string RetryAttemptedKey = "AmazonSDK_PooledConnectionRetryAttempted"; | ||||||||||||||
|
|
||||||||||||||
| private readonly Logger _logger = Logger.GetLogger(typeof(PooledConnectionRetryHandler)); | ||||||||||||||
|
|
||||||||||||||
| /// <summary> | ||||||||||||||
| /// Initializes a new instance of the PooledConnectionRetryHandler class. | ||||||||||||||
| /// </summary> | ||||||||||||||
| /// <param name="innerHandler">The inner handler to delegate requests to.</param> | ||||||||||||||
| public PooledConnectionRetryHandler(HttpMessageHandler innerHandler) | ||||||||||||||
| : base(innerHandler) | ||||||||||||||
| { | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| /// <summary> | ||||||||||||||
| /// Sends an HTTP request, automatically retrying once if the failure is due to | ||||||||||||||
| /// a stale pooled connection. | ||||||||||||||
| /// </summary> | ||||||||||||||
| protected override async Task<HttpResponseMessage> SendAsync( | ||||||||||||||
| HttpRequestMessage request, | ||||||||||||||
| CancellationToken cancellationToken) | ||||||||||||||
| { | ||||||||||||||
| try | ||||||||||||||
| { | ||||||||||||||
| return await base.SendAsync(request, cancellationToken).ConfigureAwait(false); | ||||||||||||||
| } | ||||||||||||||
| catch (Exception ex) | ||||||||||||||
| { | ||||||||||||||
| // Only retry if this is a stale connection error and we haven't already retried | ||||||||||||||
| if (IsStaleConnectionError(ex) && !HasRetryBeenAttempted(request)) | ||||||||||||||
| { | ||||||||||||||
| _logger.DebugFormat( | ||||||||||||||
| "Detected stale pooled connection error: {0}. Automatically retrying request to {1}", | ||||||||||||||
| GetErrorMessage(ex), | ||||||||||||||
| request.RequestUri); | ||||||||||||||
|
|
||||||||||||||
| // Mark that we've attempted a retry to prevent infinite loops | ||||||||||||||
| MarkRetryAttempted(request); | ||||||||||||||
|
|
||||||||||||||
| try | ||||||||||||||
| { | ||||||||||||||
| // Retry the request - HttpClient will use a fresh connection | ||||||||||||||
| var response = await base.SendAsync(request, cancellationToken).ConfigureAwait(false); | ||||||||||||||
|
|
||||||||||||||
| _logger.DebugFormat( | ||||||||||||||
| "Automatic retry succeeded for request to {0}", | ||||||||||||||
| request.RequestUri); | ||||||||||||||
|
|
||||||||||||||
| return response; | ||||||||||||||
| } | ||||||||||||||
| catch (Exception retryEx) | ||||||||||||||
| { | ||||||||||||||
| _logger.DebugFormat( | ||||||||||||||
| "Automatic retry failed for request to {0}: {1}", | ||||||||||||||
| request.RequestUri, | ||||||||||||||
| GetErrorMessage(retryEx)); | ||||||||||||||
|
|
||||||||||||||
| // Retry failed - throw the new exception | ||||||||||||||
| throw; | ||||||||||||||
|
Comment on lines
+97
to
+98
|
||||||||||||||
| // Retry failed - throw the new exception | |
| throw; | |
| // Retry failed - throw an AggregateException with both the original and retry exceptions | |
| throw new AggregateException( | |
| $"Automatic retry after stale pooled connection error failed for request to {request.RequestUri}. See InnerExceptions for details.", | |
| ex, retryEx); |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retrying an HttpRequestMessage with a consumed request body will fail. When an HttpRequestMessage is sent via SendAsync, its Content (if present) is consumed and cannot be retried without recreating the content. The retry logic here doesn't account for requests with bodies (POST, PUT, etc.).
In the AWS SDK, request bodies come from streams or byte arrays that are attached to the HttpRequestMessage.Content. After the first send attempt fails, attempting to retry the same HttpRequestMessage will fail because:
- StreamContent can only be read once - the underlying stream has been consumed
- ByteArrayContent may work, but only if it hasn't been disposed
The SDK's higher-level retry logic handles stream rewinding and content recreation, but this low-level handler bypasses that. This could cause unexpected failures for requests with bodies when a stale connection error occurs.
Consider either:
- Only retrying requests without content (checking if request.Content is null)
- Documenting this limitation clearly
- Implementing content buffering/recreation logic (complex and memory-intensive)
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error detection logic only checks for IOException and SocketException in the exception chain, but doesn't check for HttpRequestException. However, line 107 in the test file shows that HttpRequestException wrapping SocketException is a real scenario that needs to be handled.
While the current implementation correctly walks the exception chain (line 118-135) to find SocketException, the GetErrorMessage helper method (line 169-181) only looks for IOException or SocketException for logging, which means HttpRequestException messages won't be properly logged even though they're retried.
Consider updating GetErrorMessage to also check for HttpRequestException to ensure comprehensive error logging.
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -305,7 +305,11 @@ private static HttpClient CreateManagedHttpClient(IClientConfig clientConfig) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Logger.GetLogger(typeof(HttpRequestMessageFactory)).Debug(pns, $"The current runtime does not support modifying proxy settings of HttpClient."); | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| var httpClient = new HttpClient(httpMessageHandler); | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // Wrap the handler with pooled connection retry middleware to automatically | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // retry requests that fail due to stale connections from the connection pool. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // This prevents dead pooled connections from consuming SDK retry attempts. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Comment on lines
+308
to
+310
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| // Wrap the handler with pooled connection retry middleware to automatically | |
| // retry requests that fail due to stale connections from the connection pool. | |
| // This prevents dead pooled connections from consuming SDK retry attempts. | |
| /* | |
| * CHANGE: Introduced PooledConnectionRetryHandler to wrap the HTTP handler. | |
| * | |
| * Description: | |
| * Adds automatic retry behavior at the HTTP layer for requests that fail due to stale or dead connections | |
| * in the connection pool. The PooledConnectionRetryHandler intercepts failures caused by dead pooled | |
| * connections and transparently retries the request with a fresh connection, preventing such failures | |
| * from consuming SDK-level retry attempts. | |
| * | |
| * Motivation: | |
| * Without this handler, dead pooled connections can cause immediate request failures, which are then | |
| * retried at the SDK level, potentially exhausting the SDK's retry budget and resulting in failed | |
| * operations. By handling these failures at the HTTP layer, we improve reliability and reduce the | |
| * likelihood of transient network issues causing application errors. | |
| * | |
| * Testing: | |
| * - Unit tests were added/updated to verify that requests failing due to stale connections are retried | |
| * transparently and do not consume SDK retry attempts. | |
| * - Integration tests were performed to validate behavior under simulated connection pool failures. | |
| * | |
| * Type of Change: | |
| * New feature (backward compatible). | |
| * | |
| * Behavioral Impact: | |
| * - Improves reliability for applications using persistent HTTP connections. | |
| * - May result in additional low-level retries for certain network failures, but does not change | |
| * the public API or overall retry semantics at the SDK level. | |
| * - No breaking changes expected. | |
| */ |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR modifies the Core SDK runtime behavior and adds a new public class but is missing a required DevConfig file. According to the contribution guidelines at CONTRIBUTING.md, all PRs that modify code must include a DevConfig file in generator/.DevConfigs/ for proper versioning and changelog generation.
This change requires a core section in the DevConfig since it modifies Core SDK functionality. The DevConfig should specify:
- type: likely "minor" (new feature) or "patch" (bug fix)
- changeLogMessages: describing the pooled connection retry handler
- updateMinimum: likely true since this affects all services using HttpClient
See https://github.com/aws/aws-sdk-net/blob/main/CONTRIBUTING.md for detailed instructions on creating DevConfig files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retry logic doesn't check if the cancellation token has been cancelled before attempting a retry. If a request is cancelled (via CancellationToken) after the first attempt fails but before the retry executes, the retry will still proceed.
This could lead to unnecessary retries of already-cancelled operations. Consider checking
cancellationToken.IsCancellationRequestedbefore line 82 to avoid retrying operations that the caller has already cancelled.