AI Featured Updates · Smart Score: 60

MaineCoon: From Passive Video to Real-Time AI Presence

Source: Twitter following list
Author: Rohan Paul (@rohanpaul_ai)
Published: 2026-06-16
Collected: 2026-06-16
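Why the AI recommends this
Open the original post to confirm whether model weights, an API, or a reproducible real-time pipeline are available before deciding whether to pursue integration.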
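Key takeaways
Catnip has released MaineCoon, which it describes as the first streaming-native, unlimited-duration interactive audio-visual foundation model. It turns text prompts into a real-time character stream with synchronized speech, motion, and expression. The model has 22B parameters and a first-frame latency below 1 second. It runs at 47.5 FPS on a single H100 and 30 FPS on a single RTX Pro 6000, and in Catnip's internal tests its throughput is roughly 7x that of comparable audio-visual systems.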
Full text
Catnip just dropped MaineCoon, a 22B real-time audio-visual foundation model that turns text prompts into a live character stream with synced speech, motion, and expression.

The first streaming-native model of its kind. Sub-second first frame, 47.5FPS on one H100, 30FPS on one RTX Pro 6000, and about 7x faster throughput than comparable audio-visual systems in its internal tests.

The big deal is that a normal video generator can wait, revise, and render a finished clip, but a social interface has to move causally, remember its own imperfect past, and stay ahead of playback without breaking identity, voice, or rhythm.

![photo](https://pbs.twimg.com/media/HK9he-Cb0AAnDal.jpg)

https://video.twimg.com/amplify_video/2066979245503737856/vid/avc1/480x832/VTxK82g7eC_OL6rO.mp4?tag=28

> **Quoted original post, Catnip (@catnips_ai):**
>
> 🥇MaineCoon: From Passive Video to Real-Time AI Presence
>
> The first unlimited-duration interactive audio-visual model.
>
> Most AI products today still feel like they live behind a screen.
> You type. It answers.
> You speak. It replies.
> The interaction is still mostly turn-based.
>
> Mainecoon is built around a different idea: AI should not just respond to you. It should feel present with you.
>
> 🔗Learn more
>
> Website ↓ https://t.co/SFpsMsvLs9
>
> Blog ↓ https://t.co/nkc1KT7bT5
>
> https://x.com/catnips_ai/status/2066928792291917904
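
To put the reported frame rates in context, here is a rough back-of-the-envelope sketch of what "staying ahead of playback" means numerically. It converts the FPS figures quoted in the post into an average per-frame time budget and estimates how much generated-but-unplayed video would accumulate. The playback rate of 24 FPS is purely an assumption for illustration (the post does not state one), and the function names are hypothetical, not part of any Catnip API.

```python
def frame_budget_ms(fps: float) -> float:
    """Average wall-clock time available per generated frame, in milliseconds."""
    return 1000.0 / fps


def lead_seconds(gen_fps: float, playback_fps: float, wall_seconds: float) -> float:
    """Seconds of generated-but-unplayed video after streaming for wall_seconds.

    A positive value means generation is ahead of playback; a negative value
    means the viewer would catch up with the generator and the stream stalls.
    """
    frames_generated = gen_fps * wall_seconds
    frames_played = playback_fps * wall_seconds
    return (frames_generated - frames_played) / playback_fps


# Throughput figures as reported in the post (Catnip's own numbers).
REPORTED_FPS = {"H100": 47.5, "RTX Pro 6000": 30.0}

# ASSUMPTION: the post does not give a playback rate; 24 FPS is illustrative only.
ASSUMED_PLAYBACK_FPS = 24.0

if __name__ == "__main__":
    for gpu, fps in REPORTED_FPS.items():
        budget = frame_budget_ms(fps)
        lead = lead_seconds(fps, ASSUMED_PLAYBACK_FPS, wall_seconds=10.0)
        print(f"{gpu}: {budget:.1f} ms per frame, {lead:.2f} s ahead after 10 s")
```

Under these assumptions, the reported figures correspond to roughly 21 ms per frame on the H100 and roughly 33 ms per frame on the RTX Pro 6000. Any generation rate above the playback rate leaves headroom that grows over time, which is the property a causal, unlimited-duration stream needs.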
#ModelRelease #Multimodal #ProductLaunch