AI Featured Feed · Smart score: 80

OpenAI Brings GPT-5.5 Instant to Its Health Assistant

Source: Twitter following list
Author: Rohan Paul (@rohanpaul_ai)
Published: 2026-06-19
Collected: 2026-06-19
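Why the AI recommends this
Read the original post to learn how OpenAI used large-scale physician review to bring health AI capabilities to the free tier and sharply reduce flagged factual errors.
Key takeaways
OpenAI has moved its health AI capabilities from premium reasoning models into the free GPT-5.5 Instant model. More than 230 million people ask ChatGPT health and wellness questions every week. Over 260 physicians across 60 countries, 49 languages, and 26 specialties reviewed more than 700,000 model responses. According to the post, the improved behavior in health conversations likely comes from teacher-model distillation combined with physician-guided supervised fine-tuning. OpenAI reports 71% fewer flagged factuality issues in real health traffic over two months.
Full text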
This is really good. OpenAI just moved frontier-level health AI from premium reasoning models into the free GPT-5.5 Instant model.

GPT-5.5 Instant now performs near OpenAI’s Thinking models on health evaluations, meaning the cheaper, faster default model is being trained to behave more like the slower models that spend extra computation checking their reasoning.

The update targets the gap between a chatbot that sounds fluent and a health assistant that knows when to slow down, ask for missing details, admit uncertainty, and push the user toward care when symptoms look urgent.

OpenAI says more than 230 million people ask ChatGPT health and wellness questions every week, so moving this capability into the free product changes the scale from premium assistance to mass access.

From OpenAI's blog, it looks like they did a huge "distillation" to achieve this, i.e. a stronger teacher model and human experts create high-quality responses, and a cheaper student model learns the answer patterns without repeating the same expensive internal search every time.

OpenAI's training loop was heavily physician-shaped: more than 260 doctors across 60 countries, 49 languages, and 26 specialties reviewed over 700,000 model responses and judged whether answers were accurate, cautious, clear, complete, and useful.

OpenAI's likely mechanism seems to be a mix of supervised fine-tuning, where Instant is shown better answers, and preference training, where it learns which answer a physician-led rubric prefers when two outputs differ.

The physician part is crucial because the target is not just “medical facts,” but clinical response behavior, such as asking for age, pregnancy status, duration, medication history, severe pain, breathing trouble, fever, neurological symptoms, or other missing context before giving guidance.
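As a rough illustration of the preference-training idea mentioned above, here is a minimal sketch of a pairwise (Bradley-Terry style) loss. This is a standard textbook formulation, not OpenAI's confirmed method: the post itself only guesses at the mechanism. The function name and the example scores are hypothetical.

```python
import math


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Return -log(sigmoid(score_preferred - score_rejected)).

    The loss is small when the model scores the reviewer-preferred answer
    higher than the rejected one, and large when it ranks them the wrong way.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch on the sign of the margin
    # so that exp() is never called on a large positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores for two candidate answers to the same health question.
# Assumption: the first answer is the one a physician-led rubric preferred.
loss_when_model_agrees = pairwise_preference_loss(2.0, -1.0)
loss_when_model_disagrees = pairwise_preference_loss(-1.0, 2.0)

print(loss_when_model_agrees < loss_when_model_disagrees)
```

Minimizing this loss over many reviewed pairs nudges a model toward the answers reviewers favored, which is one plausible way rubric judgments could shape behavior rather than facts alone.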
So the strongest improvement is not medical trivia but behavior under uncertainty, because a good health answer often means saying what cannot be known yet, what context is missing, what red flags matter, and what the next safe step should be.

OpenAI also reports 71% fewer flagged factuality issues in real health traffic over two months, which suggests the update is reducing wrong claims in everyday use rather than only improving benchmark scores.

![photo](https://pbs.twimg.com/media/HLLHHmpbsAAhQql.jpg)

> **Quoted post from OpenAI (@OpenAI):**
>
> GPT-5.5 Instant is now on par with our frontier Thinking models for health-related questions.
>
> Every week, more than 230 million people turn to ChatGPT with health and wellness questions, and GPT-5.5 Instant is better at recognizing when urgent care may be needed, asking for relevant context, explaining uncertainty, and making complex information easier to understand.
>
> Because GPT-5.5 Instant is available to all free users in ChatGPT, these improvements can help more people.
>
> Physician-led evaluation was critical to making these major intelligence gains.
>
> https://x.com/OpenAI/status/2067672740539306261

Rohan Paul (@rohanpaul_ai): https://t.co/mvc5tsmAF1
#ProductLaunch #TechBreakthrough #AIIndustry