AI Curated Updates · Smart Score: 60

MaineCoon: From Passive Video to Real-Time AI Presence

Source: Twitter following list

Author: Rohan Paul (@rohanpaul_ai)

Published: 2026-06-16

Collected: 2026-06-16

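Why the AI recommends this

Before deciding whether to pursue integration, open the original post and confirm whether model weights, an API, or a reproducible real-time pipeline are actually available.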

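Key takeaways

Catnip released MaineCoon, describing it as the first streaming-native, unlimited-duration interactive audio-visual foundation model, one that turns text prompts into a real-time character stream with synchronized speech, motion, and expression. The model has 22B parameters and a first-frame latency under 1 second. It runs at 47.5 FPS on a single H100 and 30 FPS on a single RTX Pro 6000, and in Catnip's internal tests its throughput is about 7x that of comparable audio-visual systems.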

Full text

Catnip just dropped MaineCoon, a 22B real-time audio-visual foundation model that turns text prompts into a live character stream with synced speech, motion, and expression.

The first streaming-native model of its kind: sub-second first frame, 47.5 FPS on one H100, 30 FPS on one RTX Pro 6000, and about 7x faster throughput than comparable audio-visual systems in its internal tests.

The big deal is that a normal video generator can wait, revise, and render a finished clip, but a social interface has to move causally, remember its own imperfect past, and stay ahead of playback without breaking identity, voice, or rhythm.

![photo](https://pbs.twimg.com/media/HK9he-Cb0AAnDal.jpg)

https://video.twimg.com/amplify_video/2066979245503737856/vid/avc1/480x832/VTxK82g7eC_OL6rO.mp4?tag=28

> **Quoted post from Catnip (@catnips_ai):**
>
> 🥇MaineCoon: From Passive Video to Real-Time AI Presence
>
> The first unlimited-duration interactive audio-visual model.
>
> Most AI products today still feel like they live behind a screen.
>
> You type. It answers.
>
> You speak. It replies.
>
> The interaction is still mostly turn-based.
>
> Mainecoon is built around a different idea: AI should not just respond to you. It should feel present with you.
>
> 🔗Learn more
>
> Website ↓
> https://t.co/SFpsMsvLs9
>
> Blog ↓
> https://t.co/nkc1KT7bT5
>
> https://x.com/catnips_ai/status/2066928792291917904
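To make the "stay ahead of playback" requirement concrete, the reported frame rates can be turned into a per-frame time budget and a real-time factor. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted in the post. The 25 FPS playback rate is an assumption (the post does not state the stream's output frame rate), and the function names are hypothetical, not part of any Catnip API.

```python
# Back-of-the-envelope arithmetic on the frame rates quoted in the post.
# ASSUMED_PLAYBACK_FPS is an assumption; the post does not give the playback rate.

def frame_budget_ms(generation_fps: float) -> float:
    """Average wall-clock time spent per generated frame, in milliseconds."""
    return 1000.0 / generation_fps


def realtime_factor(generation_fps: float, playback_fps: float) -> float:
    """Generation rate divided by playback rate; above 1.0 the generator outpaces playback."""
    return generation_fps / playback_fps


def buffered_seconds(generation_fps: float, playback_fps: float, wall_seconds: float) -> float:
    """Seconds of not-yet-played video accumulated after steady streaming (never negative)."""
    surplus_frames = (generation_fps - playback_fps) * wall_seconds
    return max(0.0, surplus_frames / playback_fps)


REPORTED_FPS = {"H100": 47.5, "RTX Pro 6000": 30.0}  # figures quoted in the post
ASSUMED_PLAYBACK_FPS = 25.0  # assumption for illustration only

if __name__ == "__main__":
    for gpu, fps in REPORTED_FPS.items():
        print(
            f"{gpu}: {frame_budget_ms(fps):.1f} ms per frame, "
            f"real-time factor {realtime_factor(fps, ASSUMED_PLAYBACK_FPS):.2f}, "
            f"buffer after 10 s: {buffered_seconds(fps, ASSUMED_PLAYBACK_FPS, 10.0):.1f} s"
        )
```

Under this assumption, a real-time factor above 1.0 means the generator can build a small playback buffer, while anything below 1.0 would eventually stall the stream. Because an interactive character has to react to new input, a large buffer is not necessarily desirable either, which is one reason the causal, streaming-native design matters.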

#ModelRelease #Multimodal #ProductLaunch
