AI Featured Updates
Smart Score 85
Qwen-Robot Suite: Alibaba Cloud Releases a Three-Model Suite for Embodied Intelligence
Why AI Recommends This
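Introduces natural language as a universal action interface, enabling co-training on physical knowledge across domains. This is a significant technical breakthrough in embodied intelligence; the model architectures and training methods merit closer study.
Key Takeaways
Alibaba Cloud has released the Qwen-Robot Suite, comprising Qwen-RobotNav (unifies 5 navigation tasks in one model), Qwen-RobotManip (trained on a unified state-action space across robots, using a 38,100+ hour open-source corpus), and Qwen-RobotWorld (a single world model supporting 20+ embodiments, trained on 200M+ frames). Qwen-RobotWorld uses natural language as its action interface to co-train physical knowledge across domains, and is reported to perform strongly on EWMBench, DreamGen, WorldModelBench, and PBench.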
Full Text
📣 Introducing the Qwen-Robot Suite — Qwen-RobotNav, Qwen-RobotManip, Qwen-RobotWorld, three foundation models, a full stack for embodied intelligence.
🧭 Qwen-RobotNav — the gateway to mobility.
• Unifies 5 navigation tasks in one model: instruction following, point-goal, object-goal, target tracking, autonomous driving
• Controllable observation protocol
• Tool interface for agentic systems
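To make the "five tasks in one model" idea concrete, here is a minimal sketch of how a single request type could cover all five task families. Everything in it is an assumption for illustration: the names `NavTask`, `NavRequest`, `goal_xy`, `num_frames`, and the prompt layout are hypothetical and are not taken from the Qwen-RobotNav report. `num_frames` only stands in for some controllable observation setting; the announcement does not say what the protocol actually exposes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class NavTask(Enum):
    # The five task families named in the announcement.
    INSTRUCTION_FOLLOWING = "instruction_following"
    POINT_GOAL = "point_goal"
    OBJECT_GOAL = "object_goal"
    TARGET_TRACKING = "target_tracking"
    AUTONOMOUS_DRIVING = "autonomous_driving"


@dataclass
class NavRequest:
    """Hypothetical request type shared by all five tasks (not the real API)."""
    task: NavTask
    instruction: Optional[str] = None               # free-form text goal
    goal_xy: Optional[Tuple[float, float]] = None   # metric goal, point-goal only
    num_frames: int = 8                             # hypothetical observation budget

    def to_prompt(self) -> str:
        """Render the request as one text prompt for a single model."""
        if self.task is NavTask.POINT_GOAL:
            if self.goal_xy is None:
                raise ValueError("point-goal requests need goal_xy")
            goal = f"reach point ({self.goal_xy[0]:.1f}, {self.goal_xy[1]:.1f})"
        else:
            if not self.instruction:
                raise ValueError(f"{self.task.value} requests need an instruction")
            goal = self.instruction
        return f"[task={self.task.value}] [frames={self.num_frames}] {goal}"
```

An agent could treat such a request as the argument of one navigation tool, for example `NavRequest(NavTask.OBJECT_GOAL, instruction="find the nearest chair").to_prompt()`, rather than calling a separate model per task.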
🤖 Qwen-RobotManip — the foundation of interaction.
• Unified state-action space across heterogeneous robots
• Camera-frame delta poses for coherent cross-embodiment training
• Pretrained on a 38,100+ hour open-source corpus
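The "camera-frame delta poses" bullet can be illustrated with a small sketch. It assumes one plausible convention: end-effector poses are first re-expressed in the camera frame, and the action is the change in position plus the rotation that carries the old orientation to the new one, both in camera axes. The announcement does not spell out the exact definition used by Qwen-RobotManip, so the function names and the convention below are assumptions, written in plain Python without external libraries.

```python
from typing import List, Tuple

Mat = List[List[float]]   # 3x3 rotation matrix
Vec = List[float]         # 3-vector
Pose = Tuple[Mat, Vec]    # (rotation, position)


def matmul(a: Mat, b: Mat) -> Mat:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a: Mat, v: Vec) -> Vec:
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(a: Mat) -> Mat:
    return [[a[j][i] for j in range(3)] for i in range(3)]


def to_camera_frame(r_cam_base: Mat, t_cam_base: Vec, pose_base: Pose) -> Pose:
    """Re-express a base-frame end-effector pose in the camera frame."""
    r_base_ee, p_base_ee = pose_base
    r = matmul(r_cam_base, r_base_ee)
    p = [x + t for x, t in zip(matvec(r_cam_base, p_base_ee), t_cam_base)]
    return r, p


def camera_frame_delta(r_cam_base: Mat, t_cam_base: Vec,
                       pose_t: Pose, pose_next: Pose) -> Pose:
    """Delta pose between two timesteps, expressed in camera axes (assumed convention)."""
    r0, p0 = to_camera_frame(r_cam_base, t_cam_base, pose_t)
    r1, p1 = to_camera_frame(r_cam_base, t_cam_base, pose_next)
    d_pos = [b - a for a, b in zip(p0, p1)]
    d_rot = matmul(r1, transpose(r0))  # rotation taking r0 to r1, in camera axes
    return d_rot, d_pos
```

Under this convention the action no longer depends on where each robot's base frame happens to sit: two different arms performing the same motion as seen by the camera produce the same delta, which is one reason a camera-centric representation could help cross-embodiment training.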
🌍 Qwen-RobotWorld — infinite worlds for physical agents.
• Single world model, 20+ embodiments
• Natural-language action interface
• Predicts physically grounded futures across manipulation, driving, and navigation
Each model is independently useful, and can be composed as physical-world tools. Together, they form the low-level toolkit for general-purpose agentic systems that don't just see the world, but act in it.
📷 Blog:
https://t.co/olblKRpiBE
📖 Report:
Qwen-RobotNav: https://t.co/tySB8XRVEV
Qwen-RobotManip: https://t.co/uGnx6IpvJd
Qwen-RobotWorld: https://t.co/DSJZB2PtYm


Alibaba Cloud (@alibaba_cloud): The Qwen-Robot Suite takes AI from chatbot to physical action in the real world.
For more demos, please visit our blog: https://t.co/4UgEGiE52L https://t.co/l6Jt69TAm9
Alibaba Cloud (@alibaba_cloud): By treating natural language as a universal action interface, Qwen-RobotWorld bridges the gap between general video generation models and domain-specific embodied models. End-effector poses, steering commands, and navigation waypoints are all expressed through this single interface, so 20+ embodiment types and 500+ action categories can be co-trained on the Embodied World Knowledge corpus (8.6M video-text pairs, 200M+ frames), with each domain's physical knowledge reinforcing the others.
Qwen-RobotWorld performs strongly on the EWMBench, DreamGen, WorldModelBench, and PBench benchmarks.
Blog link: https://t.co/GTnOP3tVyu
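The idea of natural language as a universal action interface can be sketched as follows: three structurally different action types (end-effector deltas, steering commands, navigation waypoints) are rendered into plain text, so one text-conditioned model could in principle consume all of them. The class names, fields, units, and sentence templates here are hypothetical illustrations; the post does not describe the actual phrasing or schema used by Qwen-RobotWorld.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class EndEffectorDelta:
    dx: float             # metres
    dy: float
    dz: float
    gripper_closed: bool


@dataclass
class SteeringCommand:
    steering_deg: float   # positive = left (assumed sign convention)
    speed_mps: float


@dataclass
class Waypoints:
    points: List[Tuple[float, float]]   # metres, in some map frame


Action = Union[EndEffectorDelta, SteeringCommand, Waypoints]


def describe(action: Action) -> str:
    """Render any supported action as a natural-language string (hypothetical templates)."""
    if isinstance(action, EndEffectorDelta):
        grip = "close" if action.gripper_closed else "open"
        return (f"move the end effector by ({action.dx:+.2f}, {action.dy:+.2f}, "
                f"{action.dz:+.2f}) m and {grip} the gripper")
    if isinstance(action, SteeringCommand):
        side = "left" if action.steering_deg >= 0 else "right"
        return (f"steer {abs(action.steering_deg):.0f} degrees {side} "
                f"at {action.speed_mps:.1f} m/s")
    if isinstance(action, Waypoints):
        path = " -> ".join(f"({x:.1f}, {y:.1f})" for x, y in action.points)
        return f"navigate through waypoints {path}"
    raise TypeError(f"unsupported action type: {type(action).__name__}")
```

Because every domain ends up as text, adding a new embodiment in this sketch means adding one more rendering branch instead of a new model input head, which is the kind of sharing the post credits for letting different domains be co-trained.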