AI Featured Updates
Smart score: 62
The Information reports OpenAI has improved how it serves models at low cost
Why AI recommends this
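Specialists should focus on how the technical details affect user roadmaps and monetization models, and keep tracking progress on cost savings.
Key takeaways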
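According to the article, OpenAI has cut inference costs by more than half on some existing models, which bears on model commercialization and user adoption. The techniques are not confirmed; likely candidates include quantization, KV-cache changes, batching, speculative decoding, and query routing. For context, OpenAI's adjusted gross margin fell to 33% in 2025.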
Full text
The Information reports that OpenAI has cut inference costs by more than half on some existing models, while logged-out ChatGPT traffic ran on only a couple hundred Nvidia GPUs.
The obvious guesses at how include quantization, KV-cache changes, batching, speculative decoding, and routing easy queries to cheaper models.
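To make the first of those guesses concrete, here is a minimal sketch of symmetric int8 weight quantization in plain Python. It is a generic textbook illustration, not a description of what OpenAI actually did; the function names and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization of a list of floats."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Map each weight to the nearest integer step and clamp to [-127, 127].
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -1.27, 0.05, 0.4]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest step bounds the error by about half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored in one byte instead of the four that a 32-bit float needs, so memory and bandwidth per weight fall by roughly 4x, at the price of a rounding error of at most about half a quantization step.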
If true, this would be a major competitive lever: lower costs can raise margins, expand usage limits, or reduce pressure on API pricing.
For some context, OpenAI’s adjusted gross margin fell to 33% in 2025 from 40% in 2024, after inference costs quadrupled.
Some reporting now puts Q1-2026 at 39%, with a 52% target by year-end.
Anthropic looks similar at roughly 44%, so frontier labs remain far below mature software economics.
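As a rough way to see how an inference saving feeds into the margin figures above, the sketch below recomputes gross margin after part of the inference bill is cut. It assumes revenue stays constant, and the inference share of cost of revenue (0.8) and the affected shares of traffic are hypothetical inputs, not reported numbers; only the 39% starting margin comes from the text.

```python
def gross_margin_after_cut(margin, inference_share, affected_share, cut):
    """Return the new gross margin after reducing part of the inference bill.

    margin          -- current gross margin as a fraction of revenue
    inference_share -- fraction of cost of revenue that is inference (assumed)
    affected_share  -- fraction of inference spend the saving applies to (assumed)
    cut             -- fractional cost reduction on the affected spend
    Revenue is held constant, which is itself an assumption.
    """
    cost = 1.0 - margin
    saved = cost * inference_share * affected_share * cut
    return 1.0 - (cost - saved)


# Start from the reported 39% margin; every other input is hypothetical.
for affected in (0.25, 0.5, 1.0):
    new_margin = gross_margin_after_cut(
        margin=0.39, inference_share=0.8, affected_share=affected, cut=0.5
    )
    print(f"affected share {affected:.2f} -> gross margin {new_margin:.1%}")
```

The result depends heavily on how much of the inference spend the saving actually reaches, which is why "some existing models" matters in the original report.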
---
theinformation.com/newsletters/ai-agenda/openai-discovers-new-way-cut-inference-costs-half
