Open
Labels
bug: Something isn't working
Description
### Your current environment
In vLLM 0.11+, Chat Completions appears to have stopped honouring max_tokens with the GPT-OSS 120B model. The request payload below used to return output truncated at the max_tokens limit; with max_tokens set, it now returns blank content.
Interestingly, the usage block reports completion_tokens as 500, yet the output is blank.
{
"messages": [
{
"role": "user",
"content": "What is the role of AI in medicine?"
}
],
"model": "openai/gpt-oss-120b",
"max_tokens": 500,
"reasoning": {"effort": "low"},
"stream": false
}
The output is blank, even though the completion token count reported in usage matches max_tokens:
{
"id": "chatcmpl-c71e934ac0b74bd4b8f99fe9b5516ea3",
"object": "chat.completion",
"created": 1764300020,
"model": "openai/gpt-oss-120b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": null,
"refusal": null,
"annotations": null,
"audio": null,
"function_call": null,
"tool_calls": [],
"reasoning": "Need to answer.",
"reasoning_content": "Need to answer."
},
"logprobs": null,
"finish_reason": "length",
"stop_reason": null,
"token_ids": null
}
],
"service_tier": null,
"system_fingerprint": null,
"usage": {
"prompt_tokens": 78,
"total_tokens": 578,
"completion_tokens": 500,
"prompt_tokens_details": null
},
"prompt_logprobs": null,
"prompt_token_ids": null,
"kv_transfer_params": null
}
When max_tokens is removed, we get output, and usage reports completion_tokens of around 1600 (1665 in the response below).
It seems that starting with vLLM 0.11+, truncating the output at max_tokens has stopped working:
{
"id": "chatcmpl-61b60144d43147e2b007158712ad4920",
"object": "chat.completion",
"created": 1764300423,
"model": "openai/gpt-oss-120b",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "**The role of AI in medicine is expanding rapidly and touches virtually every aspect of healthcare—from the way doctors diagnose patients to how hospitals run their operations.** Below is a structured overview that covers the major domains, concrete examples, benefits, challenges, and future directions.\n\n---\n\n## 1. Clinical Care\n\n| Sub‑area | What AI Does | Real‑World Examples | Benefits |\n|----------|--------------|---------------------|----------|\n| **Diagnostics** | Image analysis, pattern recognition, risk stratification | • Radiology: Google DeepMind’s AI detects lung cancer on CT scans with >95% accuracy.<br>• Dermatology: FDA‑cleared apps (e.g., SkinVision) classify skin lesions from photos.<br>• Pathology: Paige.ai assists in detecting prostate cancer in biopsy slides. | Faster, more consistent readings; can catch subtle findings that human eyes miss. |\n| **Predictive Analytics** | Forecast disease onset, complications, readmission risk | • Sepsis prediction models (e.g., Epic Sepsis Model) trigger alerts hours before clinical signs.<br>• Cardiovascular risk calculators incorporating genomics and wearables. | Enables proactive interventions, reduces morbidity and cost. |\n| **Treatment Planning** | Decision support, dose optimisation, drug selection | • IBM Watson for Oncology (clinical trial matching).<br>• Radiation oncology: AI‑driven dose‑painting to spare healthy tissue.<br>• Pharmacogenomics: AI predicts drug‑gene interactions. | Personalises therapy, improves outcomes, reduces adverse events. |\n| **Robotics & Minimally Invasive Surgery** | Real‑time image guidance, autonomous suturing, task automation | • Da Vinci Surgical System (augmented with AI for instrument tracking).<br>• VERDICT AI for autonomous suturing in animal models. | Increases precision, reduces surgeon fatigue, shortens recovery. |\n\n---\n\n## 2. 
Patient‑Facing Applications\n\n| Application | Description | Example |\n|-------------|-------------|---------|\n| **Virtual Assistants & Chatbots** | Symptom triage, medication reminders, mental‑health chat | • Babylon Health (AI‑driven triage).<br>• Woebot (CBT‑based mental‑health chatbot). |\n| **Telemedicine Enhancements** | Real‑time vitals extraction from video, automated note‑taking | • KardiaMobile ECG integration with AI‑based arrhythmia detection. |\n| **Wearables & Remote Monitoring** | Continuous data streams analysed for early alerts | • Apple Watch ECG + AI arrhythmia detection; Fitbit heart‑rate trend alerts. |\n\n---\n\n## 3. Operational & Administrative Efficiency\n\n| Domain | AI Functions | Example |\n|--------|--------------|---------|\n| **Scheduling & Resource Allocation** | Predictive staffing, OR utilization optimisation | • Qventus AI platform reduces ER wait times by 30% in pilot sites. |\n| **Revenue Cycle Management** | Claim coding validation, fraud detection | • Change Healthcare’s AI coding assistant. |\n| **Supply Chain** | Demand forecasting for meds, PPE | • GE Healthcare’s AI‐driven inventory management. |\n\n---\n\n## 4. Research & Drug Development\n\n| Stage | AI Contribution | Notable Projects |\n|-------|----------------|------------------|\n| **Target Identification** | Deep learning on genomics & proteomics | • Insilico Medicine discovered a novel DDR1 inhibitor in 46 days. |\n| **Compound Screening** | Virtual screening of billions of molecules | • Atomwise’s AI screened 10M compounds for COVID‑19 antivirals. |\n| **Clinical Trial Design** | Patient‑centering enrollment, adaptive trial simulations | • Deep 6 AI matches patients to trials with 4× higher enrollment speed. |\n\n---\n\n## 5. Public Health & Population Health\n\n* **Epidemiology** – AI models (e.g., BlueDot) flagged COVID‑19 spread days before WHO alerts. \n* **Health Equity** – Bias‑aware algorithms identify underserved populations for targeted interventions. 
\n* **Surveillance** – AI parses social‑media, EMS calls, and wastewater data for outbreak detection.\n\n---\n\n## 6. Benefits at a Glance\n\n| Dimension | Impact |\n|-----------|--------|\n| **Speed** | Real‑time image and data processing → quicker diagnosis. |\n| **Accuracy** | Reduced inter‑observer variability; higher sensitivity/specificity. |\n| **Scalability** | Extends specialist expertise to remote or low‑resource settings. |\n| **Cost Savings** | Preventive alerts lower expensive complications; automation cuts labor costs. |\n| **Personalisation** | Tailors treatment to genetic, lifestyle, and environmental factors. |\n\n---\n\n## 7. Key Challenges & Risks\n\n1. **Data Quality & Bias** \n - Training data often lacks diversity → risk of health disparities. \n - Need rigorous bias‑mitigation pipelines (e.g., fairness metrics, adversarial debiasing).\n\n2. **Interpretability & Trust** \n - Black‑box models hinder clinician acceptance. \n - Emerging solutions: Explainable AI (XAI) dashboards, attention maps, counterfactual explanations.\n\n3. **Regulatory & Legal Landscape** \n - FDA’s “Software as a Medical Device (SaMD)” pathways, EU’s AI Act, and emerging global standards. \n - Liability unclear when AI recommendations lead to harm.\n\n4. **Integration with Clinical Workflow** \n - Alert fatigue, EMR incompatibility, and need for seamless UI/UX. \n - Human‑in‑the‑loop design is critical.\n\n5. **Data Privacy & Security** \n - HIPAA, GDPR, and emerging “AI‑specific” regulations require robust de‑identification and federated learning techniques.\n\n---\n\n## 8. Future Outlook (Next 5‑10 Years)\n\n| Trend | What to Expect |\n|-------|----------------|\n| **Federated & Edge AI** | Models trained on device (e.g., wearables) without moving PHI, preserving privacy. |\n| **Multimodal Foundation Models** | Large language/vision models (e.g., MedPaLM, ClinicalBERT‑2) that ingest notes, imaging, labs simultaneously for holistic suggestions. 
|\n| **AI‑driven Clinical Trials** | Real‑time adaptive designs powered by continuous data streams, shrinking development timelines. |\n| **Digital Twins of Patients** | Simulated virtual patients for therapy testing, surgical planning, and disease progression forecasting. |\n| **AI Governance Frameworks** | Standardized audit trails, certification bodies (e.g., ISO 82304‑2), and “AI ethics boards” embedded in hospitals. |\n\n---\n\n## 9. Practical Take‑aways for Stakeholders\n\n| Role | Actionable Steps |\n|------|------------------|\n| **Clinicians** | • Start with FDA‑cleared decision‑support tools.<br>• Participate in model validation studies.<br>• Keep a “human‑in‑the‑loop” mindset. |\n| **Hospital Administrators** | • Conduct ROI analyses for AI pilots.<br>• Build a cross‑functional AI governance committee.<br>• Invest in data infrastructure (FHIR, interoperable APIs). |\n| **Patients** | • Ask providers how AI influences their care.<br>• Review consent forms for data use.<br>• Use FDA‑approved consumer health apps and verify privacy policies. |\n| **Developers / Researchers** | • Prioritize diverse datasets and bias testing.<br>• Implement explainability from day one.<br>• Align with regulatory pathways early (e.g., pre‑submissions to FDA). |\n\n---\n\n### TL;DR\n\nAI is reshaping medicine across **diagnosis, treatment planning, surgery, patient engagement, operations, research, and public health**. It brings speed, accuracy, scalability, and personalization, but it also raises challenges around bias, interpretability, regulation, workflow integration, and privacy. Successful adoption will hinge on thoughtful governance, transparent models, and a collaborative “human‑AI partnership.”",
"refusal": null,
"annotations": null,
"audio": null,
"function_call": null,
"tool_calls": [],
"reasoning": "Need to answer.",
"reasoning_content": "Need to answer."
},
"logprobs": null,
"finish_reason": "stop",
"stop_reason": null,
"token_ids": null
}
],
"service_tier": null,
"system_fingerprint": null,
"usage": {
"prompt_tokens": 78,
"total_tokens": 1743,
"completion_tokens": 1665,
"prompt_tokens_details": null
},
"prompt_logprobs": null,
"prompt_token_ids": null,
"kv_transfer_params": null
}
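To make the difference between the two responses easier to see, here is a minimal sketch that pulls out only the fields relevant to this report. The helper name `summarize_completion` is hypothetical (it is not part of vLLM), and the two dictionaries are hand-trimmed copies of the responses above, with the long `content` string shortened.

```python
from typing import Optional


def summarize_completion(response: dict, max_tokens: Optional[int] = None) -> dict:
    """Extract the fields that matter for this report from a chat.completion dict.

    Hypothetical helper for illustration only; not a vLLM API.
    """
    choice = response["choices"][0]
    message = choice["message"]
    completion_tokens = response["usage"]["completion_tokens"]
    reasoning = message.get("reasoning_content") or message.get("reasoning")
    return {
        "has_content": bool(message.get("content")),
        "has_reasoning": bool(reasoning),
        "finish_reason": choice["finish_reason"],
        "completion_tokens": completion_tokens,
        # True when the whole max_tokens budget was consumed.
        "hit_limit": max_tokens is not None and completion_tokens >= max_tokens,
    }


# Trimmed copy of the response received with "max_tokens": 500.
with_limit = {
    "choices": [{
        "message": {"content": None, "reasoning_content": "Need to answer."},
        "finish_reason": "length",
    }],
    "usage": {"prompt_tokens": 78, "total_tokens": 578, "completion_tokens": 500},
}

# Trimmed copy of the response received without max_tokens (content shortened).
without_limit = {
    "choices": [{
        "message": {
            "content": "**The role of AI in medicine is expanding rapidly ...",
            "reasoning_content": "Need to answer.",
        },
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 78, "total_tokens": 1743, "completion_tokens": 1665},
}

print(summarize_completion(with_limit, max_tokens=500))
print(summarize_completion(without_limit))
```

The first summary shows the symptom in one place: the full 500-token budget is used and `finish_reason` is `length`, but `content` is empty while the short reasoning text is present.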
### 🐛 Describe the bug
from vllm import LLM, SamplingParams

prompts = ["What is the role of AI in medicine?"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
llm = LLM(model="openai/gpt-oss-120b")
outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
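The snippet above uses the offline `LLM` API and does not set max_tokens, whereas the behaviour reported here was seen through the Chat Completions endpoint. A rough sketch of a reproduction against a running server follows, using only the Python standard library. It assumes the server listens on `http://localhost:8000` (an assumption; adjust to your deployment), and the helper names `build_payload` and `post_chat_completion` are hypothetical. The payload fields are copied from the request in this report.

```python
import json
import urllib.request
from typing import Optional

# Assumption: address of the OpenAI-compatible server; change as needed.
BASE_URL = "http://localhost:8000"


def build_payload(max_tokens: Optional[int] = None) -> dict:
    """Build the request body from this report, with or without max_tokens."""
    payload = {
        "messages": [
            {"role": "user", "content": "What is the role of AI in medicine?"}
        ],
        "model": "openai/gpt-oss-120b",
        "reasoning": {"effort": "low"},
        "stream": False,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


def post_chat_completion(payload: dict, base_url: str = BASE_URL) -> dict:
    """POST the payload to /v1/chat/completions and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Usage (requires a running server, so it is left commented out):
# for limit in (500, None):
#     result = post_chat_completion(build_payload(limit))
#     message = result["choices"][0]["message"]
#     print(limit, result["choices"][0]["finish_reason"], repr(message["content"]))
```

Running the commented loop with and without the limit should show whether `content` comes back as `null` only when max_tokens is set.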
### Before submitting a new issue...
- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
sandeepjana