-
Notifications
You must be signed in to change notification settings - Fork 568
docs(streaming): update streaming configuration documentation #1542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
feature: #1538 Revise streaming docs to clarify usage of stream_async(), remove outdated global streaming config, and add CLI usage instructions. Explain output rails streaming requirements and deprecation of StreamingHandler. Improve examples and guidance for token usage tracking.
Documentation preview |
Greptile OverviewGreptile SummaryThis PR updates the streaming documentation to reflect recent changes in how streaming works in NeMo Guardrails. The main changes are:
The documentation correctly explains that
|
| Filename | Score | Overview |
|---|---|---|
| docs/configure-rails/yaml-schema/streaming/global-streaming.md | 4/5 | Documentation refactor removes outdated global streaming config, clarifies usage with stream_async(), adds CLI instructions, and marks StreamingHandler as deprecated |
Sequence Diagram
sequenceDiagram
participant User
participant LLMRails
participant StreamingHandler
participant LLM as LLM Provider
participant OutputRails as Output Rails
Note over User,LLM: Basic Streaming Flow (stream_async)
User->>LLMRails: stream_async(messages)
LLMRails->>LLMRails: _validate_streaming_with_output_rails()
LLMRails->>StreamingHandler: Create StreamingHandler
LLMRails->>LLMRails: generate_async(streaming_handler)
LLMRails->>LLM: Request with streaming=True, stream_usage=True
loop For each token
LLM-->>StreamingHandler: Push token chunk
StreamingHandler-->>User: Yield token chunk
end
LLM-->>LLMRails: Complete with usage stats
Note over User,OutputRails: Streaming with Output Rails
User->>LLMRails: stream_async(messages) [with output rails configured]
LLMRails->>LLMRails: Check rails.output.streaming.enabled
alt Output rails streaming disabled
LLMRails-->>User: Raise InvalidRailsConfigurationError
else Output rails streaming enabled
LLMRails->>StreamingHandler: Create handler
LLMRails->>LLM: Request streaming
loop For each token
LLM-->>StreamingHandler: Push token
StreamingHandler->>OutputRails: Buffer for output rail check
OutputRails-->>User: Yield validated token
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 file reviewed, 1 comment
|
Thank you for starting the PR! We need to have only config.yml-related content in this chapter. I want to avoid mixing it with the CLI server API/Python API usages. I'll submit a PR to improve this. |
feature: #1538
Greptile Summary
This PR updates the streaming documentation to reflect recent changes in how streaming works in NeMo Guardrails. The main changes are:
stream_async()is the primary method for streaming (rather than requiring YAML configuration first)--streamingflagStreamingHandlerusage as deprecated in favor ofstream_async()The documentation correctly explains that
StreamingNotSupportedErroris raised when output rails are configured withoutrails.output.streaming.enabled: True. The examples are clear and follow Python best practices.