AI Featured Updates
Smart Score 70
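Etched Introduces Cluster-Scale Memory (CSM)
Why AI Recommends This
According to Etched, the new architecture combines a low-latency interconnect with a hybrid HBM/SRAM design to ease the cost and thermal constraints of today's SRAM-only and HBM-based chips, which could improve throughput and usability for large-model inference.
Key Takeaways
Etched has introduced Cluster-Scale Memory (CSM), a shared low-latency memory pool architecture. It uses a proprietary ultra-low-latency, high-bandwidth interconnect for fast memory access across chips, and a hybrid HBM/SRAM design intended to address both memory capacity and mem2mem latency. The stated goal is higher throughput and better interactivity for large MoE models.
Full Text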
etched cluster-scale memory has so many SerDes https://t.co/V6yTAtsYf9

> **Quoted post from Etched (@Etched):**
> Introducing Cluster-Scale Memory (CSM) for low latency workloads.
> Today's AI chips using HBM can’t achieve SRAM-level decode speeds due to memory subsystem and interconnect bottlenecks. SRAM-only chips have lower FLOPs density and memory capacity, sacrificing throughput.
> You’re forced to make a tradeoff: serve at much slower speeds, or run at low batch sizes and suffer from higher costs.
> When running large MoE models, token routing across experts requires sending data through a deep memory hierarchy and a networking switch to reach a destination expert.
> Each memory layer inherently adds latency; thus, the best layer is no layer.
> We’ve designed a new architecture that creates a shared low-latency memory pool across the entire scale-up domain.
> We use a proprietary ultra-low-latency, high-bandwidth interconnect to enable dramatically faster memory access across chips.
> Our HBM/SRAM hybrid design solves both memory capacity and mem2mem latency, enabling high throughput and interactivity simultaneously.
> CSM improves latency and avoids today's cost, reliability, yield, thermal, and compute tradeoffs of SRAM-only chips, 3D DRAM chips, or optics.
> https://x.com/Etched/status/2071972138685473208
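
The post's argument that each memory layer adds latency can be illustrated with a toy additive model. This is only a sketch: the hop names, the equal per-hop cost of one arbitrary unit, and the single-hop pooled path are all assumptions made for illustration, not figures or design details published by Etched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One stage a routed token crosses on its way to a destination expert."""
    name: str
    latency_units: float  # hypothetical placeholder, not a measured value


def path_latency(hops):
    """Total latency of a path, assuming per-hop latencies simply add up."""
    return sum(h.latency_units for h in hops)


# Hypothetical layered path: several memory layers plus a network switch.
layered_path = [
    Hop("source memory layer 1", 1.0),
    Hop("source memory layer 2", 1.0),
    Hop("network switch", 1.0),
    Hop("destination memory layer 2", 1.0),
    Hop("destination memory layer 1", 1.0),
]

# Hypothetical pooled path: one access to a shared memory pool.
pooled_path = [Hop("shared memory pool", 1.0)]

if __name__ == "__main__":
    print("layered:", path_latency(layered_path))
    print("pooled: ", path_latency(pooled_path))
```

Under these assumptions the total depends only on how many hops a path contains, which is the sense in which fewer layers means lower latency. Real per-hop costs differ by layer and are not given in the post.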