Skip to content

Conversation

@GarrettBeatty
Copy link
Contributor

Description

Motivation and Context

Testing

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project
  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have read the README document
  • I have added tests to cover my changes
  • All new and existing tests passed

License

  • I confirm that this pull request can be released under the Apache 2 license

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a PooledConnectionRetryHandler that automatically retries HTTP requests when they fail due to stale pooled connections. The handler wraps the HttpClient's message handler chain to catch specific socket errors (Shutdown, ConnectionReset, ConnectionAborted) and retry the request once before it reaches the SDK's main retry logic.

Key Changes:

  • Added PooledConnectionRetryHandler class that extends DelegatingHandler to intercept and retry stale connection errors
  • Integrated the handler into the HttpClient creation pipeline in HttpRequestMessageFactory
  • Added comprehensive unit tests covering various retry scenarios

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

File Description
sdk/src/Core/Amazon.Runtime/Pipeline/HttpHandler/PooledConnectionRetryHandler.cs New handler class that detects stale connection errors and performs automatic single retry
sdk/src/Core/Amazon.Runtime/Pipeline/HttpHandler/_netstandard/HttpRequestMessageFactory.cs Integration point wrapping the HttpMessageHandler with PooledConnectionRetryHandler
sdk/test/UnitTests/Custom/Runtime/PooledConnectionRetryHandlerTests.cs Unit tests covering socket error detection, retry limits, and basic concurrency

Comment on lines +28 to +343
[TestClass]
public class PooledConnectionRetryHandlerTests
{

/// <summary>
/// Test that SocketException with Shutdown error is detected as a stale connection error
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SocketExceptionShutdown_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
// First attempt: throw socket shutdown error (Broken pipe on Unix)
throw new SocketException((int)SocketError.Shutdown);
}
// Second attempt: succeed
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that SocketException with ConnectionReset is detected as a stale connection error
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SocketExceptionConnectionReset_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
throw new SocketException((int)SocketError.ConnectionReset);
}
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that HttpRequestException wrapping SocketException is properly unwrapped and detected
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task HttpRequestExceptionWrappingSocketException_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
var socketException = new SocketException((int)SocketError.ConnectionReset);
throw new HttpRequestException("Error while copying content", socketException);
}
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that only one automatic retry is attempted
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task OnlyOneRetryAttempt_PreventsInfiniteLoop()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
// Always throw stale connection error
throw new SocketException((int)SocketError.Shutdown);
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<SocketException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(2, attemptCount, "Should have made exactly 2 attempts (initial + one retry)");
}

/// <summary>
/// Test that non-stale-connection errors are not retried
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task NonStaleConnectionError_IsNotRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
throw new TaskCanceledException("Request timed out");
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<TaskCanceledException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(1, attemptCount, "Should have made only 1 attempt (no retry for non-stale errors)");
}

/// <summary>
/// Test that IOException with non-matching message is not retried
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task IOExceptionWithNonMatchingMessage_IsNotRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
throw new IOException("Disk full");
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<IOException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(1, attemptCount, "Should have made only 1 attempt (message doesn't match stale connection patterns)");
}

/// <summary>
/// Test successful request on first attempt (no retry needed)
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SuccessfulRequest_NoRetry()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(1, attemptCount, "Should have made only 1 attempt (successful)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that retry throws the new exception if retry also fails
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task RetryFailsWithDifferentError_ThrowsNewException()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
throw new SocketException((int)SocketError.Shutdown);
}
// Second attempt throws different error
throw new InvalidOperationException("Different error");
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<InvalidOperationException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts");
}

/// <summary>
/// Test ConnectionAborted SocketException is detected and retried
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SocketExceptionConnectionAborted_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
throw new SocketException((int)SocketError.ConnectionAborted);
}
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test thread safety with concurrent requests
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task ConcurrentRequests_AreHandledSafely()
{
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
// Simulate some work
Thread.Sleep(10);
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);

// Send 10 concurrent requests
var tasks = new Task<HttpResponseMessage>[10];
for (int i = 0; i < 10; i++)
{
var request = new HttpRequestMessage(HttpMethod.Get, $"https://example.com/{i}");
tasks[i] = client.SendAsync(request);
}

var responses = await Task.WhenAll(tasks);

Assert.AreEqual(10, responses.Length);
foreach (var response in responses)
{
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}
}

/// <summary>
/// Mock HttpMessageHandler for testing
/// </summary>
private class MockHttpMessageHandler : HttpMessageHandler
{
private readonly Func<HttpRequestMessage, CancellationToken, Task<HttpResponseMessage>> _sendFunc;

public MockHttpMessageHandler(Func<HttpRequestMessage, CancellationToken, Task<HttpResponseMessage>> sendFunc)
{
_sendFunc = sendFunc;
}

protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
return _sendFunc(request, cancellationToken);
}
}
}
}
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for cancellation scenarios. The retry handler should properly handle cancellation token behavior.

Add tests for:

  1. Request cancelled before retry (verify retry doesn't execute)
  2. Request cancelled during retry (verify OperationCanceledException is thrown)
  3. Timeout scenarios where CancellationToken is triggered

These tests are important because the handler currently doesn't check for cancellation before retrying, which could lead to unexpected behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +62 to +104
try
{
return await base.SendAsync(request, cancellationToken).ConfigureAwait(false);
}
catch (Exception ex)
{
// Only retry if this is a stale connection error and we haven't already retried
if (IsStaleConnectionError(ex) && !HasRetryBeenAttempted(request))
{
_logger.DebugFormat(
"Detected stale pooled connection error: {0}. Automatically retrying request to {1}",
GetErrorMessage(ex),
request.RequestUri);

// Mark that we've attempted a retry to prevent infinite loops
MarkRetryAttempted(request);

try
{
// Retry the request - HttpClient will use a fresh connection
var response = await base.SendAsync(request, cancellationToken).ConfigureAwait(false);

_logger.DebugFormat(
"Automatic retry succeeded for request to {0}",
request.RequestUri);

return response;
}
catch (Exception retryEx)
{
_logger.DebugFormat(
"Automatic retry failed for request to {0}: {1}",
request.RequestUri,
GetErrorMessage(retryEx));

// Retry failed - throw the new exception
throw;
}
}

// Not a stale connection error, or already retried - rethrow original exception
throw;
}
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retrying an HttpRequestMessage with a consumed request body will fail. When an HttpRequestMessage is sent via SendAsync, its Content (if present) is consumed and cannot be retried without recreating the content. The retry logic here doesn't account for requests with bodies (POST, PUT, etc.).

In the AWS SDK, request bodies come from streams or byte arrays that are attached to the HttpRequestMessage.Content. After the first send attempt fails, attempting to retry the same HttpRequestMessage will fail because:

  1. StreamContent can only be read once - the underlying stream has been consumed
  2. ByteArrayContent may work, but only if it hasn't been disposed

The SDK's higher-level retry logic handles stream rewinding and content recreation, but this low-level handler bypasses that. This could cause unexpected failures for requests with bodies when a stale connection error occurs.

Consider either:

  1. Only retrying requests without content (checking if request.Content is null)
  2. Documenting this limitation clearly
  3. Implementing content buffering/recreation logic (complex and memory-intensive)

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +343
[TestClass]
public class PooledConnectionRetryHandlerTests
{

/// <summary>
/// Test that SocketException with Shutdown error is detected as a stale connection error
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SocketExceptionShutdown_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
// First attempt: throw socket shutdown error (Broken pipe on Unix)
throw new SocketException((int)SocketError.Shutdown);
}
// Second attempt: succeed
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that SocketException with ConnectionReset is detected as a stale connection error
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SocketExceptionConnectionReset_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
throw new SocketException((int)SocketError.ConnectionReset);
}
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that HttpRequestException wrapping SocketException is properly unwrapped and detected
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task HttpRequestExceptionWrappingSocketException_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
var socketException = new SocketException((int)SocketError.ConnectionReset);
throw new HttpRequestException("Error while copying content", socketException);
}
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that only one automatic retry is attempted
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task OnlyOneRetryAttempt_PreventsInfiniteLoop()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
// Always throw stale connection error
throw new SocketException((int)SocketError.Shutdown);
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<SocketException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(2, attemptCount, "Should have made exactly 2 attempts (initial + one retry)");
}

/// <summary>
/// Test that non-stale-connection errors are not retried
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task NonStaleConnectionError_IsNotRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
throw new TaskCanceledException("Request timed out");
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<TaskCanceledException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(1, attemptCount, "Should have made only 1 attempt (no retry for non-stale errors)");
}

/// <summary>
/// Test that IOException with non-matching message is not retried
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task IOExceptionWithNonMatchingMessage_IsNotRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
throw new IOException("Disk full");
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<IOException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(1, attemptCount, "Should have made only 1 attempt (message doesn't match stale connection patterns)");
}

/// <summary>
/// Test successful request on first attempt (no retry needed)
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SuccessfulRequest_NoRetry()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(1, attemptCount, "Should have made only 1 attempt (successful)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test that retry throws the new exception if retry also fails
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task RetryFailsWithDifferentError_ThrowsNewException()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
throw new SocketException((int)SocketError.Shutdown);
}
// Second attempt throws different error
throw new InvalidOperationException("Different error");
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

await Assert.ThrowsExceptionAsync<InvalidOperationException>(async () =>
{
await client.SendAsync(request);
});

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts");
}

/// <summary>
/// Test ConnectionAborted SocketException is detected and retried
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task SocketExceptionConnectionAborted_IsDetectedAndRetried()
{
var attemptCount = 0;
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
attemptCount++;
if (attemptCount == 1)
{
throw new SocketException((int)SocketError.ConnectionAborted);
}
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);
var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com");

var response = await client.SendAsync(request);

Assert.AreEqual(2, attemptCount, "Should have made 2 attempts (initial + retry)");
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}

/// <summary>
/// Test thread safety with concurrent requests
/// </summary>
[TestMethod]
[TestCategory("UnitTest")]
[TestCategory("Runtime")]
public async Task ConcurrentRequests_AreHandledSafely()
{
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
// Simulate some work
Thread.Sleep(10);
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);

// Send 10 concurrent requests
var tasks = new Task<HttpResponseMessage>[10];
for (int i = 0; i < 10; i++)
{
var request = new HttpRequestMessage(HttpMethod.Get, $"https://example.com/{i}");
tasks[i] = client.SendAsync(request);
}

var responses = await Task.WhenAll(tasks);

Assert.AreEqual(10, responses.Length);
foreach (var response in responses)
{
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}
}

/// <summary>
/// Mock HttpMessageHandler for testing
/// </summary>
private class MockHttpMessageHandler : HttpMessageHandler
{
private readonly Func<HttpRequestMessage, CancellationToken, Task<HttpResponseMessage>> _sendFunc;

public MockHttpMessageHandler(Func<HttpRequestMessage, CancellationToken, Task<HttpResponseMessage>> sendFunc)
{
_sendFunc = sendFunc;
}

protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
return _sendFunc(request, cancellationToken);
}
}
}
}
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests only cover GET requests with no request body. This doesn't validate behavior for POST/PUT requests with bodies, which is where the retry logic would likely fail (see the critical issue about consumed HttpContent).

Add test cases for:

  1. POST/PUT requests with StreamContent to verify retry behavior
  2. POST/PUT requests with ByteArrayContent
  3. Verifying that requests with non-seekable streams either work correctly or fail gracefully

These tests would reveal the content consumption issue and help determine the appropriate fix.

Copilot uses AI. Check for mistakes.
Comment on lines +169 to +181
private static string GetErrorMessage(Exception ex)
{
var currentException = ex;
while (currentException != null)
{
if (currentException is IOException || currentException is SocketException)
{
return $"{currentException.GetType().Name}: {currentException.Message}";
}
currentException = currentException.InnerException;
}
return ex.Message;
}
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error detection logic only checks for IOException and SocketException in the exception chain, but doesn't check for HttpRequestException. However, line 107 in the test file shows that HttpRequestException wrapping SocketException is a real scenario that needs to be handled.

While the current implementation correctly walks the exception chain (line 118-135) to find SocketException, the GetErrorMessage helper method (line 169-181) only looks for IOException or SocketException for logging, which means HttpRequestException messages won't be properly logged even though they're retried.

Consider updating GetErrorMessage to also check for HttpRequestException to ensure comprehensive error logging.

Copilot uses AI. Check for mistakes.
Comment on lines +308 to +310
// Wrap the handler with pooled connection retry middleware to automatically
// retry requests that fail due to stale connections from the connection pool.
// This prevents dead pooled connections from consuming SDK retry attempts.
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation in the PR description is completely empty - only the template is present with no actual description, motivation, testing details, or type of changes checked.

For a change of this significance (adding automatic retry behavior at the HTTP layer), the PR should include:

  • Description of what the change does
  • Motivation explaining the problem being solved
  • Testing details showing how the change was validated
  • Whether this is a bug fix or new feature
  • Any potential behavioral impacts on existing applications

This documentation is important for reviewers and for the historical record of why this change was made.

Suggested change
// Wrap the handler with pooled connection retry middleware to automatically
// retry requests that fail due to stale connections from the connection pool.
// This prevents dead pooled connections from consuming SDK retry attempts.
/*
* CHANGE: Introduced PooledConnectionRetryHandler to wrap the HTTP handler.
*
* Description:
* Adds automatic retry behavior at the HTTP layer for requests that fail due to stale or dead connections
* in the connection pool. The PooledConnectionRetryHandler intercepts failures caused by dead pooled
* connections and transparently retries the request with a fresh connection, preventing such failures
* from consuming SDK-level retry attempts.
*
* Motivation:
* Without this handler, dead pooled connections can cause immediate request failures, which are then
* retried at the SDK level, potentially exhausting the SDK's retry budget and resulting in failed
* operations. By handling these failures at the HTTP layer, we improve reliability and reduce the
* likelihood of transient network issues causing application errors.
*
* Testing:
* - Unit tests were added/updated to verify that requests failing due to stale connections are retried
* transparently and do not consume SDK retry attempts.
* - Integration tests were performed to validate behavior under simulated connection pool failures.
*
* Type of Change:
* New feature (backward compatible).
*
* Behavioral Impact:
* - Improves reliability for applications using persistent HTTP connections.
* - May result in additional low-level retries for certain network failures, but does not change
* the public API or overall retry semantics at the SDK level.
* - No breaking changes expected.
*/

Copilot uses AI. Check for mistakes.
MarkRetryAttempted(request);

try
{
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retry logic doesn't check if the cancellation token has been cancelled before attempting a retry. If a request is cancelled (via CancellationToken) after the first attempt fails but before the retry executes, the retry will still proceed.

This could lead to unnecessary retries of already-cancelled operations. Consider checking cancellationToken.IsCancellationRequested before line 82 to avoid retrying operations that the caller has already cancelled.

Suggested change
{
{
// Check for cancellation before retrying
if (cancellationToken.IsCancellationRequested)
{
_logger.DebugFormat(
"Cancellation requested before automatic retry for request to {0}",
request.RequestUri);
throw new OperationCanceledException(cancellationToken);
}

Copilot uses AI. Check for mistakes.
Comment on lines +308 to +312
// Wrap the handler with pooled connection retry middleware to automatically
// retry requests that fail due to stale connections from the connection pool.
// This prevents dead pooled connections from consuming SDK retry attempts.
var pooledConnectionRetryHandler = new PooledConnectionRetryHandler(httpMessageHandler);
var httpClient = new HttpClient(pooledConnectionRetryHandler);
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR modifies the Core SDK runtime behavior and adds a new public class but is missing a required DevConfig file. According to the contribution guidelines at CONTRIBUTING.md, all PRs that modify code must include a DevConfig file in generator/.DevConfigs/ for proper versioning and changelog generation.

This change requires a core section in the DevConfig since it modifies Core SDK functionality. The DevConfig should specify:

  • type: likely "minor" (new feature) or "patch" (bug fix)
  • changeLogMessages: describing the pooled connection retry handler
  • updateMinimum: likely true since this affects all services using HttpClient

See https://github.com/aws/aws-sdk-net/blob/main/CONTRIBUTING.md for detailed instructions on creating DevConfig files.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +97 to +98
// Retry failed - throw the new exception
throw;
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retry logic modifies state on the original HttpRequestMessage by marking it as retried. However, if the retry itself throws an exception (line 90-99), the original exception from line 66 is lost, and the request is left in a "retry attempted" state even though the retry failed.

This could cause confusion in debugging scenarios where developers would see the retry exception but not the original stale connection error. Consider logging both exceptions when the retry fails, or including the original exception as an inner exception in a wrapper exception.

Suggested change
// Retry failed - throw the new exception
throw;
// Retry failed - throw an AggregateException with both the original and retry exceptions
throw new AggregateException(
$"Automatic retry after stale pooled connection error failed for request to {request.RequestUri}. See InnerExceptions for details.",
ex, retryEx);

Copilot uses AI. Check for mistakes.
Comment on lines +296 to +323
public async Task ConcurrentRequests_AreHandledSafely()
{
var mockHandler = new MockHttpMessageHandler((request, ct) =>
{
// Simulate some work
Thread.Sleep(10);
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
});

var retryHandler = new PooledConnectionRetryHandler(mockHandler);
var client = new HttpClient(retryHandler);

// Send 10 concurrent requests
var tasks = new Task<HttpResponseMessage>[10];
for (int i = 0; i < 10; i++)
{
var request = new HttpRequestMessage(HttpMethod.Get, $"https://example.com/{i}");
tasks[i] = client.SendAsync(request);
}

var responses = await Task.WhenAll(tasks);

Assert.AreEqual(10, responses.Length);
foreach (var response in responses)
{
Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
}
}
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concurrency test doesn't validate thread safety of the retry mechanism itself - it only tests successful concurrent requests. This doesn't expose potential race conditions in the retry logic.

Consider adding a test that validates concurrent requests with stale connection errors to ensure:

  1. The retry mechanism is thread-safe when multiple requests fail simultaneously
  2. Each request's retry state is tracked independently
  3. No shared mutable state causes issues

For example, test 10 concurrent requests where all fail on first attempt with SocketError.Shutdown, then succeed on retry.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant