Skip to content

Conversation

@jmayclin
Copy link
Contributor

@jmayclin jmayclin commented Nov 23, 2025

Goal

Emit a "handshake event", allowing customers to observe multiple pieces of information about the handshake.

Why

Currently customers have to call large numbers of individual "getter" APIs to gather information about the handshake. We expect that with this event structure will eventually be used to aggregate metrics for e.g. emission to CloudWatch.

How

The handshake is emitted to the application through a callback.

Callouts

  1. all of these APIs are currently unstable, and are expected to remain so for the forseeable future. Accordingly everything in this API is a "two way door".
  2. s2n_event_handshake vs s2n_handshake_event? I went with the former because it felt more neatly namespaced.
  3. I wonder whether we want to have the handshake event emitted by value?

This is adding new code in the hotpath, but it has a minimal impact

handshake-secp256r1/s2n-tls
                        time:   [1.1705 ms 1.1711 ms 1.1718 ms]
                        change: [−0.1650% +0.6024% +1.3401%] (p = 0.08 > 0.05)

Testing

Unit tests, and self-talk tests in the rust bindings.

The tests for this cover a large API surface area, which is tedious to cover in C. Therefore I deliberately have most of the testing in Rust.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions github-actions bot added the s2n-core team label Nov 23, 2025
@jmayclin jmayclin marked this pull request as ready for review December 5, 2025 23:18
*
* An event is not emitted if the handshake fails.
*/
S2N_API extern int s2n_config_set_handshake_event(struct s2n_config *config, s2n_event_on_handshake_cb callback);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand, what's the purpose of the subscriber pointer? Is it just the void ctx pointer that we typically have for callbacks? Why wouldn't you just combine these two APIs so that the set_handshake_event takes a subscriber?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In rust world, we expect to have

Struct MetricAggregator;

impl MetricAggregator {
    fn emit_some_handshake_metrics(...);
}

The void ctx pointer is what will actually point to the MetricAggregator struct.

Why wouldn't you just combine these two APIs so that the set_handshake_event takes a subscriber?

Because that would make it more difficult to add events in the future. My current expectation for how we'd add a new event is to just add a new s2n_config_set_foo_event(...) function. If we passed the subscriber in along with the callback, I think it would be more difficult to add new events in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why adding new events may affect the subscriber/MetricAggregator 🤔 Could you elaborate why it makes adding events difficult in the future?

b"good enough bytes for test",
SystemTime::UNIX_EPOCH,
)?
.enable_session_tickets(true)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you technically don't need this if you added a session ticket key.

Suggested change
.enable_session_tickets(true)?;

* specific error for a null wall clock config just before the client hello callback.
* The test's assertions are weakened if this check is moved.
*/
POSIX_ENSURE(conn->config->wall_clock, S2N_ERR_CONFIG_NULL_BEFORE_CH_CALLBACK);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is needed because now we are access in the config before the client hello callback. This is because the monotonic clock is stored on the config, and we need the clock as soon as we hit s2n_negotiate.

EXPECT_NOT_NULL(default_config = s2n_fetch_default_config());

/* s2n_config_new() matches s2n_fetch_default_config() */
if (default_config->security_policy != config->security_policy) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why you need this code change anymore since you fixed the problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought is that someone might make the same mistake again in the future, and this would be really helpful to help them debug it.

E.g. something like this

- name: Test external build
# if this test is failing, make sure that api headers are appropriately
# included. For a symbol to be visible in a shared lib, the
# __attribute__((visibility("default"))) label must be on a declaration
# in the same unit of compilation as the definition. Generally this just
# means that if the linker can't resolve foo_method in tls/foo.c, you
# forgot to include api/unstable/foo.h in tls/foo.c

cipher: "ECDHE-RSA-AES128-GCM-SHA256",
// TODO: This is unexpected, and should be None, because TLS 1.2 session
// resumption does not include any ECDHE.
group: Some("secp256r1"),
Copy link
Contributor

@CarolYeh910 CarolYeh910 Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why the group is not overridden. Maybe session resumption doesn't change the original negotiated group? So we get the same value as the full handshake

*
* An event is not emitted if the handshake fails.
*/
S2N_API extern int s2n_config_set_handshake_event(struct s2n_config *config, s2n_event_on_handshake_cb callback);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why adding new events may affect the subscriber/MetricAggregator 🤔 Could you elaborate why it makes adding events difficult in the future?

DEFER_CLEANUP(struct s2n_connection *server_conn = s2n_connection_new(S2N_SERVER),
s2n_connection_ptr_free);
EXPECT_NOT_NULL(server_conn);
server_conn->config = NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't call s2n_negotiate(), so technically we can set server_conn->config to NULL. However, it will hit S2N_ERR_NULL with the new changes, so 🤷‍♀️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants