Skip to content

Conversation

@Danipulok
Copy link
Contributor

Fixes #3636

`run_stream()` behaves differently from `run()` and `run_sync()` when choosing the final result:

- **`run_stream()`**: The first called tool that **can** produce a final result (output or deferred) becomes the final result
- **`run()` / `run_sync()`**: The first **output** tool becomes the final result. If none are called, all **deferred** tools become the final result as `DeferredToolRequests`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If none are called

The wording is like this, since we get an UnexpectedModelBehavior if both output and deferred tools are called, but none output tools are validated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the deferred tools args are not validated

I guess it means we could check that if any deferred tools calls are present and tool_call_results is None, we can skip ignore invalid output tool calls, as we do in here:

yield _messages.FunctionToolCallEvent(call)
output_parts.append(e.tool_retry)
yield _messages.FunctionToolResultEvent(e.tool_retry)

But I'm not sure about it...
And IMHO, if do this at all, then it should be done in a separate PR

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the deferred tools args are not validated

That's true for external tools, but not for function tools that raise CallDeferred, which do have args validated before the function is called

I guess it means we could check that if any deferred tools calls are present and tool_call_results is None, we can skip ignore invalid output tool calls, as we do in here:

I don't quite understand what you mean

The `'exhaustive'` strategy is useful when tools have important side effects (like logging, sending notifications, or updating metrics) that should always execute.

!!! warning "Streaming vs Sync Behavior Difference"
`run_stream()` behaves differently from `run()` and `run_sync()` when choosing the final result:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please link to the API docs for run_stream
  • It's not just run and run_sync, also iter, run_stream_events, etc. So maybe link to agents.md#running-agents and just say "the other run methods"

!!! warning "Streaming vs Sync Behavior Difference"
`run_stream()` behaves differently from `run()` and `run_sync()` when choosing the final result:

- **`run_stream()`**: The first called tool that **can** produce a final result (output or deferred) becomes the final result
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's link to the deferred tool docs

`run_stream()` behaves differently from `run()` and `run_sync()` when choosing the final result:

- **`run_stream()`**: The first called tool that **can** produce a final result (output or deferred) becomes the final result
- **`run()` / `run_sync()`**: The first **output** tool becomes the final result. If none are called, all **deferred** tools become the final result as `DeferredToolRequests`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the deferred tools args are not validated

That's true for external tools, but not for function tools that raise CallDeferred, which do have args validated before the function is called

I guess it means we could check that if any deferred tools calls are present and tool_call_results is None, we can skip ignore invalid output tool calls, as we do in here:

I don't quite understand what you mean


The `'exhaustive'` strategy is useful when tools have important side effects (like logging, sending notifications, or updating metrics) that should always execute.

!!! warning "Streaming vs Sync Behavior Difference"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not exactly streaming vs sync; it's just run_stream that's odd. So maybe it makes sense to mention this under https://ai.pydantic.dev/agents/#streaming-events-and-final-output, either as well, or instead. I say "instead" because we already give context there on run_stream always preferring the first final result it sees, which is in line with what we're explaining here. So maybe that place should have the primary explanation, and then here we can link to there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistency in result.output for agent.run_sync and agent.run_stream when both OutputTool and DeferredTool are called

2 participants