AI Curated Feed
Smart score: 60
RepFusion: Letting Pretrained Multimodal Priors Participate in the Denoising Process
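Why AI recommends this
The method proposes a new direction for architectural improvement; the follow-up experimental results and code release are worth watching.
Key takeaways
Xichen Pan proposes RepFusion, which addresses a mismatch in current text-to-image models: the LLM encodes the prompt only once, while a newly trained generative backbone handles the noisy latent states on its own. RepFusion lets pretrained multimodal priors take part in the denoising process. The paper and project page are publicly available.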
Full text
Saining Xie (@sainingxie) reposted a post by Xichen Pan (@xichen_pan):
Modern text-to-image models are increasingly powered by large pretrained LLMs.
But there is a curious mismatch: the LLM typically encodes the prompt only once, while the evolving noisy latent states are handled entirely by a newly trained generative backbone.
Can pretrained multimodal prior participate in the denoising process?
Introducing RepFusion. (1/12)
📄 https://t.co/WbkTtg5M79
🌐 https://t.co/iDHggosNJX
