AI Curated Feed · Smart score: 70

Etched Launches Cluster-Scale Memory (CSM)

Source: Twitter following list
Author: SemiAnalysis (@SemiAnalysis_)
Published: 2026-06-30
Collected: 2026-06-30
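Why the AI recommends this
The new architecture combines a low-latency interconnect with an HBM/SRAM hybrid design. Etched says this eases the cost and thermal tradeoffs of today's SRAM-only and HBM-based chips, which could improve throughput and interactivity for large-model inference.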
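Key takeaways
Etched has introduced Cluster-Scale Memory (CSM), a shared low-latency memory pool architecture. It uses a proprietary ultra-low-latency, high-bandwidth interconnect for fast memory access across chips, together with an HBM/SRAM hybrid design that Etched says addresses both memory capacity and mem2mem latency, with the aim of raising throughput and interactivity for large MoE models.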
Full text
etched cluster-scale memory has so many SerDes https://t.co/V6yTAtsYf9

![photo](https://pbs.twimg.com/media/HMFlx5VbgAAwSto.jpg)

> **Quoted post, Etched (@Etched):**
>
> Introducing Cluster-Scale Memory (CSM) for low latency workloads.
>
> Today's AI chips using HBM can’t achieve SRAM-level decode speeds due to memory subsystem and interconnect bottlenecks. SRAM-only chips have lower FLOPs density and memory capacity, sacrificing throughput.
>
> You’re forced to make a tradeoff: serve at much slower speeds, or run at low batch sizes and suffer from higher costs.
>
> When running large MoE models, token routing across experts requires sending data through a deep memory hierarchy and a networking switch to reach a destination expert.
>
> Each memory layer inherently adds latency; thus, the best layer is no layer.
>
> We’ve designed a new architecture that creates a shared low-latency memory pool across the entire scale-up domain.
>
> We use a proprietary ultra-low-latency, high-bandwidth interconnect to enable dramatically faster memory access across chips.
>
> Our HBM/SRAM hybrid design solves both memory capacity and mem2mem latency, enabling high throughput and interactivity simultaneously.
>
> CSM improves latency and avoids today's cost, reliability, yield, thermal, and compute tradeoffs of SRAM-only chips, 3D DRAM chips, or optics.
>
> https://x.com/Etched/status/2071972138685473208
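
The quoted post's point that each memory layer adds latency can be illustrated with a toy additive model: the time for a token to reach its destination expert is treated as the sum of the stages it crosses. This is only a sketch. The stage names, the nanosecond figures, and the `path_latency_ns` helper are hypothetical placeholders chosen for illustration, not Etched data, and the model says nothing about how CSM is actually built.

```python
# Toy additive latency model for one token hop between experts.
# Every stage name and nanosecond value below is a hypothetical placeholder.

def path_latency_ns(stages):
    """Return the total latency of a list of (stage_name, latency_ns) pairs."""
    return sum(latency for _, latency in stages)

# A "deep" path: memory hierarchy on both ends plus a network switch in between.
deep_path = [
    ("source HBM read", 100),
    ("source on-chip buffer", 20),
    ("network switch", 300),
    ("destination on-chip buffer", 20),
    ("destination HBM write", 100),
]

# Dropping a stage removes its term from the sum; here the switch is removed.
flat_path = [stage for stage in deep_path if stage[0] != "network switch"]

print("deep path:", path_latency_ns(deep_path), "ns")
print("flat path:", path_latency_ns(flat_path), "ns")
```

Under these made-up numbers, the single largest stage dominates the total, which is the intuition behind "the best layer is no layer". Real systems also overlap and pipeline transfers, so a plain sum is a simplification.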
#ProductLaunch #Technology #LargeModels