AI Curated Feed · Smart Score: 65

Quantized Reasoning Models Think They Need to Think Longer, but They Do Not

Source: Twitter following list

Author: Rohan Paul (@rohanpaul_ai)

Published: 2026-07-01

Collected: 2026-07-01

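Why AI Recommends This

The study identifies a concrete mechanism behind overthinking in quantized models and offers a simple, effective decoding fix, which makes it practically useful for deploying low-cost reasoning models.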

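Key Takeaways

Research from Meta finds that post-training quantization makes reasoning models second-guess answers they already got right, raising overthinking failures up to 52%. Penalizing 50 hesitation words cuts reasoning length by 12% to 23% while keeping or improving accuracy. The study was validated on 5 reasoning models, several quantization methods, and model sizes from 1.5B to 32B parameters.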

Full Text

A paper from Meta shows that quantized reasoning models often fail because they keep doubting a correct answer instead of finishing. Many of them reason well enough, but compression makes them hesitate at the wrong time.

The problem is that post-training quantization, a way to shrink models after training, makes reasoning models cheaper to run but can make them worse at finishing cleanly. The authors found that aggressive quantization does not simply make models less capable: in many failures the model had already reached the right answer and then second-guessed itself.

Their core idea is that quantization adds noise at uncertain word choices, so the model becomes more likely to pick words like “wait,” “but,” or “alternatively” that reopen the problem.

They tested this across math, coding, and science tasks using 5 reasoning models, several quantization methods, and model sizes from 1.5B to 32B.

The main result is that aggressive quantization raised overthinking failures up to 52%, while a small penalty on 50 hesitation words cut reasoning length by 12% to 23% and often kept or improved accuracy.

Because compressed models are widely used to save memory and cost, it is important to know that a very small decoding fix can stop many of them from wasting tokens and losing answers they already had.

----

Link: arxiv.org/abs/2606.00206

Title: "Quantized Reasoning Models Think They Need to Think Longer, but They Do Not"

![photo](https://pbs.twimg.com/media/HMK2m4YbsAAYOA6.png)
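The decoding fix described in the post, a small penalty on hesitation words, can be pictured with a toy example. The sketch below is not the paper's implementation: the word list contains only the three examples named in the post (the full list of 50 is not given here), and the penalty value, the four-token vocabulary, the logits, and the helper names `penalize_hesitation` and `greedy_pick` are all assumptions chosen for illustration.

```python
import math

# Hypothetical hesitation vocabulary: the post names only these three examples;
# the study's full list of 50 words is not reproduced here.
HESITATION_WORDS = {"wait", "but", "alternatively"}


def penalize_hesitation(logits, vocab, penalty=2.0):
    """Return a copy of `logits` with `penalty` subtracted from hesitation tokens.

    The penalty value is an assumption for this sketch, not a value from the paper.
    """
    return [
        logit - penalty if token.strip().lower() in HESITATION_WORDS else logit
        for token, logit in zip(vocab, logits)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits, vocab):
    """Return the token with the highest logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best]


# Toy next-token scores right after the model has stated an answer.
# "Wait" narrowly beats the tokens that would wrap up the reasoning.
vocab = ["Wait", "Therefore", "But", "</think>"]
logits = [2.1, 2.0, 1.2, 1.9]

before = greedy_pick(logits, vocab)
after = greedy_pick(penalize_hesitation(logits, vocab), vocab)
print(before, "->", after)  # Wait -> Therefore
```

In this toy case the hesitation token wins only by a narrow margin, which is the kind of uncertain choice the post says quantization noise can tip. A modest penalty is enough to let the model conclude, and it leaves every non-hesitation token's score unchanged.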

#Models #Technology #Research
