Adding per tool usage limit #3691
base: main
Conversation
…er_tool_usage_limit
… if it should be somewhere else altogether to cover granular limit exceeded as well?
> If you want to limit tool calls but let the model decide how to proceed instead of raising an error, use the `max_tool_calls` parameter. This is a "soft" limit that returns a message to the model when exceeded, rather than raising a [`UsageLimitExceeded`][pydantic_ai.exceptions.UsageLimitExceeded] exception.
>
> ```` ```py {test="skip"} ````
We shouldn't skip tests unless absolutely necessary
> 1. Set the maximum number of tool calls allowed during runs. This can also be set per-run.
>
> When `max_tool_calls` is exceeded, instead of executing the tool, the agent returns a message to the model: `'Tool call limit reached for tool "{tool_name}".'`. The model then decides how to respond based on this information.
This seems like it can be combined with the sentence before the example
> - Usage limits are especially relevant if you've registered many tools. Use `request_limit` to bound the number of model turns, and `tool_calls_limit` to cap the number of successful tool executions within a run.
> - The `tool_calls_limit` is checked before executing tool calls. If the model returns parallel tool calls that would exceed the limit, no tools will be executed.
>
> ##### Soft Tool Call Limits with `max_tool_calls`
I think this belongs on tools-advanced.md, with a link from the usage limits section above.
And no need to mention the name of the field in the title, that'll make it too long for the ToC sidebar
> `result = agent.run_sync('Calculate something', max_tool_calls=1)`
>
> **When to use `max_tool_calls` vs `tool_calls_limit`:**
This sounds like something for the explanation in the previous section where we link to this feature. And for the primary docs of this feature where we mention how it's related to the UsageLimits hard limit.
> !!! note "Limiting tool executions"
>     You can cap tool executions within a run using [`UsageLimits(tool_calls_limit=...)`](agents.md#usage-limits). The counter increments only after a successful tool invocation. Output tools (used for [structured output](output.md)) are not counted in the `tool_calls` metric.
>     You can cap the total number of tool executions within a run using [`UsageLimits(tool_calls_limit=...)`](agents.md#usage-limits). For finer control, you can limit how many times a *specific* tool can be called by setting the `max_uses` parameter when registering the tool (e.g., `@agent.tool(max_uses=3)` or `Tool(func, max_uses=3)`). Once a tool reaches its `max_uses` limit, it is automatically removed from the available tools for subsequent steps in the run. The `tool_calls` counter increments only after a successful tool invocation. Output tools (used for [structured output](output.md)) are not counted in the `tool_calls` metric.
The bit about `max_uses` should be in the next paragraph that's about soft limits, so that we can leave this one about hard limits.
```py
retries: dict[str, int] = field(default_factory=dict)
"""Number of retries for each tool so far."""
tool_usage: dict[str, int] = field(default_factory=dict)
"""Number of calls for each tool so far."""
```
We have to be consistent/specific about calls vs uses and how this relates to retries; does this imply successful calls, or any call at all?
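The calls-vs-uses distinction the reviewer raises could be resolved by only counting successful invocations, as sketched below. This is a hypothetical illustration (the `run_tool` helper and the choice of `ValueError` for validation failures are assumptions, not the PR's code):

```python
from collections.abc import Callable

def run_tool(
    name: str,
    func: Callable[[], object],
    tool_usage: dict[str, int],
    retries: dict[str, int],
) -> object:
    """Run a tool, counting only successful invocations toward tool_usage.

    A failure (e.g. an argument validation error) is recorded as a retry
    instead, so the model isn't charged against max_uses for a recoverable
    mistake.
    """
    try:
        result = func()
    except ValueError:
        retries[name] = retries.get(name, 0) + 1
        raise
    tool_usage[name] = tool_usage.get(name, 0) + 1
    return result
```

Under this reading, `tool_usage` tracks "successful calls" and `retries` tracks failed attempts, so the two counters never double-count a single invocation.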
```py
for tool in self.tools.values():
    # Filter out tools that have reached their max_uses limit
    if tool.max_uses is not None:
        current_uses = self.ctx.tool_usage.get(tool.tool_def.name, 0)
```
`get_current_use_of_tool`
```py
        if current_uses >= tool.max_uses:
            continue
    result.append(tool.tool_def)
return result
```
I'd rather have this be a list comprehension
```py
    max_uses=tool.max_uses,
)
```

```py
self.ctx.tool_usage[name] = self.ctx.tool_usage.get(name, 0) + 1
```
To me, "uses" implied successful uses of a potentially costly thing, so excluding retries. Otherwise I think it should definitely be max_calls. But I think that only tracking successful uses probably makes sense, so we don't punish models unnecessarily for an arg validation error
If we take it to mean "successful calls", then this part from #3352 (comment) will require special care, especially if we do it in ToolManager, because we don't want to run a max_uses=1 tool twice in parallel; the >1 calls should fail:
> Another problem is when we are within one run step and the model does parallel function calling to the one-shot tool. How to stop the model technically (not by prompting or returning "do not call me again" - this works, but it is not real [and nice] control) from reusing this tool? Sequential=true appears too harsh since it makes other tool calls within the same step sequential as well (I think).
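One way to handle the parallel-call concern is to reserve the count per call within a step before anything executes, so later parallel calls to a capped tool are rejected. A hypothetical sketch (the `plan_step_calls` helper and its shape are assumptions, not the PR's design):

```python
def plan_step_calls(
    calls: list[str],
    tool_usage: dict[str, int],
    max_uses: dict[str, int],
) -> list[bool]:
    """Decide which parallel tool calls in one step may run.

    Counting earlier calls from the same step means that if the model
    issues two parallel calls to a max_uses=1 tool, only the first is
    allowed and the second fails, without forcing sequential execution
    of the other tools in the step.
    """
    step_counts: dict[str, int] = {}
    allowed: list[bool] = []
    for name in calls:
        used = tool_usage.get(name, 0) + step_counts.get(name, 0)
        ok = name not in max_uses or used < max_uses[name]
        if ok:
            step_counts[name] = step_counts.get(name, 0) + 1
        allowed.append(ok)
    return allowed
```

Because the reservation happens during planning rather than after execution, this avoids the "sequential=true" hammer the quoted comment worries about.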
```py
def _get_max_tool_calls(self) -> int | None:
    """Get the maximum number of tool calls allowed during this run, or `None` if unlimited."""
    if self.ctx is None:
        raise ValueError('ToolManager has not been prepared for a run step yet')  # pragma: no cover
```
Let's move this to a private helper method like `_assert_ctx() -> RunContext`.
```py
async def raise_on_limit(
    ctx: RunContext[Any], tool_def: ToolDefinition
) -> ToolDefinition | None:
    if ctx.max_uses and ctx.tool_usage.get(tool_def.name, 0) >= ctx.max_uses:
```
Another interesting example would be to check ctx.tool_use and do this from #3352 (comment):
> A better approach may be to have the tool output "This tool can only be called M more times." after its real result, so that it works like a counter that increases during the run of the conversation. But that'd definitely be something for the user to do, I think, not Pydantic AI itself.
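The user-side counter idea from that quote could be implemented as a wrapper around the tool function. A hypothetical sketch (the `with_remaining_counter` helper is an assumption, not part of the PR or Pydantic AI):

```python
from collections.abc import Callable

def with_remaining_counter(
    func: Callable[..., str],
    name: str,
    max_uses: int,
    tool_usage: dict[str, int],
) -> Callable[..., str]:
    """Wrap a tool so its output tells the model how many calls remain."""
    def wrapper(*args, **kwargs) -> str:
        result = func(*args, **kwargs)
        tool_usage[name] = tool_usage.get(name, 0) + 1
        remaining = max_uses - tool_usage[name]
        return f'{result}\nThis tool can only be called {remaining} more times.'
    return wrapper
```

Since the count appears in the tool's own output, the model sees the budget shrink turn by turn, which is the "counter during the conversation" behavior the comment describes, without any framework change.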
Closes #3352
PR Summary: Per-Tool Usage Limits & Soft Tool Call Limits

This PR adds two related features for controlling tool execution.

1. `max_uses` - Per-Tool Usage Limit

   Limit how many times a specific tool can be called during a run. Once a tool reaches its limit, it's removed from available tools and further calls return `'Tool call limit reached for tool "{name}".'`

2. `max_tool_calls` - Soft Total Tool Call Limit

   Set a global limit on total tool calls at the agent or run level.