AI Curated Updates
Smart score: 60
MaineCoon: From Passive Video to Real-Time AI Presence
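AI recommendation rationale
Check the original post to confirm whether model weights, an API, or a reproducible real-time pipeline are available before deciding whether to pursue integration.
Key takeaways
Catnip has released MaineCoon, which it describes as the first streaming-native, unlimited-duration interactive audio-visual foundation model. It turns text prompts into a real-time character stream with synchronized speech, motion, and expression. The model has 22B parameters and a first-frame latency under 1 second, and it runs at 47.5 FPS on a single H100 and 30 FPS on a single RTX Pro 6000. In Catnip's internal tests, its throughput is about 7x that of comparable audio-visual systems.
Full text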
Catnip just dropped MaineCoon, a 22B real-time audio-visual foundation model that turns text prompts into a live character stream with synced speech, motion, and expression.
Catnip bills it as the first streaming-native model of its kind.
The reported numbers: a sub-second first frame, 47.5 FPS on one H100, 30 FPS on one RTX Pro 6000, and about 7x the throughput of comparable audio-visual systems in Catnip's internal tests.
The big deal is that a normal video generator can wait, revise, and render a finished clip, but a social interface has to move causally, remember its own imperfect past, and stay ahead of playback without breaking identity, voice, or rhythm.
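
To make "stay ahead of playback" concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the two throughput figures reported above. The 25 FPS playback rate and the helper names are assumptions for illustration and do not come from Catnip's post.

```python
# Reported generation rates from the post (frames per second, single GPU).
REPORTED_FPS = {"H100": 47.5, "RTX Pro 6000": 30.0}

# Assumption: the post does not state a playback rate; 25 FPS is a placeholder.
ASSUMED_PLAYBACK_FPS = 25.0


def frame_budget_ms(fps: float) -> float:
    """Average wall-clock time per frame, in milliseconds, at a given rate."""
    return 1000.0 / fps


def buffer_growth_s(gen_fps: float, playback_fps: float, wall_seconds: float) -> float:
    """Seconds of playable video gained (or lost, if negative) after
    generating and playing back simultaneously for `wall_seconds`."""
    generated_frames = gen_fps * wall_seconds
    consumed_frames = playback_fps * wall_seconds
    return (generated_frames - consumed_frames) / playback_fps


if __name__ == "__main__":
    for gpu, fps in REPORTED_FPS.items():
        budget = frame_budget_ms(fps)
        headroom = buffer_growth_s(fps, ASSUMED_PLAYBACK_FPS, wall_seconds=10.0)
        print(f"{gpu}: {budget:.1f} ms per frame, "
              f"{headroom:+.1f} s of buffer after 10 s of streaming")
```

A positive buffer figure means generation outpaces playback under the assumed rate. A negative one would mean the stream stalls, which is the failure mode a causal, streaming model has to avoid.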

https://video.twimg.com/amplify_video/2066979245503737856/vid/avc1/480x832/VTxK82g7eC_OL6rO.mp4?tag=28
> **Quoted original post by Catnip (@catnips_ai):**
> 🥇MaineCoon: From Passive Video to Real-Time AI Presence
> The first unlimited-duration interactive audio-visual model.
> Most AI products today still feel like they live behind a screen.
> You type. It answers.
> You speak. It replies.
> The interaction is still mostly turn-based.
> Mainecoon is built around a different idea: AI should not just respond to you. It should feel present with you.
> 🔗Learn more
> Website ↓
> https://t.co/SFpsMsvLs9
> Blog ↓
> https://t.co/nkc1KT7bT5
> https://x.com/catnips_ai/status/2066928792291917904