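AI Featured Updates
Smart score: 80
MaineCoon: the first real-time interactive video model, with 22B parameters, 47.5 FPS, and <$0.001/sec
Why AI recommends it
It is the first to achieve real-time interactive video generation (47.5 FPS) at very low cost (<$0.001/sec). Unlike traditional passive video models, it represents a new direction for AI video technology.
Key takeaways
Catnip AI has released MaineCoon, a real-time interactive video model that supports facial expressions, emotion generation, and audio sync. The model has 22B parameters, generates at up to 47.5 FPS on a single H100 GPU, and costs less than $0.001 per second to generate. It reaches SOTA performance on SocialVideo Bench and supports long-duration video stream generation of more than 1000 seconds.
Full text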
MaineCoon is the first video model that focuses on social interactions: facial expressions, emotions, fluid conversation, audio-lip sync, etc. Really impressive inference specs: 22B params, 47.5 FPS on a single H100. Generates in real-time at <$0.001/sec.
They achieve this with an agentic streaming inference framework with 3 different auxiliary models to manage the cache and lookahead buffer. Super cool work.
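The headline figures can be turned into a few derived numbers with simple arithmetic. The sketch below is only a back-of-the-envelope calculation: it assumes the quoted "<$0.001/sec" is an upper bound per second of generated output, and the 25 FPS playback rate is a hypothetical value chosen for illustration, since the post does not state one. The function names are likewise illustrative and do not come from Catnip.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the post.
GENERATION_FPS = 47.5          # "up to 47.5 FPS on a single H100 GPU"
COST_PER_SECOND_USD = 0.001    # quoted upper bound: "below $0.001 / second"
STREAM_SECONDS = 1000          # quoted long-duration streaming length


def frame_budget_ms(fps: float) -> float:
    """Average time available to produce one frame at the given rate."""
    return 1000.0 / fps


def cost_upper_bound_usd(seconds: float, rate: float = COST_PER_SECOND_USD) -> float:
    """Upper bound on cost, assuming the rate applies per second of output."""
    return seconds * rate


def realtime_factor(generation_fps: float, playback_fps: float) -> float:
    """How many times faster than playback the model generates frames."""
    return generation_fps / playback_fps


if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(GENERATION_FPS):.2f} ms")
    print(f"Cost bound for {STREAM_SECONDS} s: ${cost_upper_bound_usd(STREAM_SECONDS):.2f}")
    print(f"Cost bound per hour: ${cost_upper_bound_usd(3600):.2f}")
    # 25 FPS playback is an assumption, not a figure from the post.
    print(f"Real-time factor at 25 FPS playback: {realtime_factor(GENERATION_FPS, 25.0):.2f}x")
```

Under these assumptions, each frame has an average budget of about 21 ms, a 1000-second stream costs less than $1, and an hour of generation costs less than $3.60.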
> **Quoted original post by Catnip (@catnips_ai):**
> Most AI video today is still:
> prompt → wait → watch a clip.
> MaineCoon is built for something different:
> prompt → talk → interact in real time.
> In our vision, the character is not a fixed video clip that just waits for your input. It keeps generating voice, expression, and motion on its own.
> That is why AI video starts feeling less like content — and more like someone you can actually hang out with.
> To meet our goal, the first step is MaineCoon, a real-time interactive audio-visual model built for streaming generation to interact with you.
> 1⃣Up to 47.5 FPS on a single H100 GPU
> 2⃣Audio-visual generation cost below $0.001 / second
> 3⃣Long-duration streaming generation for 1000+ seconds
> 4⃣Continuous audio, motion, expression, and visual alignment
> 5⃣SOTA performance on SocialVideo Bench
> From passive video to real-time AI presence.
> Want to try MaineCoon?
> Learn more and apply for early access: https://t.co/SFpsMswjhH
> Share a great MaineCoon video on X and tag @catnips_ai to get 2 extra codes.
> https://x.com/catnips_ai/status/2068015962717315126