Luis Ancajas is Director for Disaggregated Memory Systems at Micron Technology, where he leads strategy and ecosystem partnerships for next‑generation memory solutions supporting AI and data‑center workloads. With a background in computer architecture and semiconductors, he focuses on advancing CXL architectures with hyperscalers and infrastructure partners. Luis holds an M.S. in Electrical Engineering from Stanford University and a B.S. in EECS from UC Berkeley.
In this session, Luis will present practical demonstrations and performance analysis of CXL-based disaggregated memory used to accelerate large-scale AI inference workloads in HPC and data-center environments. The approach uses a CXL JBOM (Just a Bunch of Memory) as an offload target for the KV cache, connected to NVIDIA’s Dynamo inference stack through Micron’s FAMFS so that the JBOM operates as a warm memory file system. This architecture removes storage bottlenecks and significantly increases the effective memory bandwidth available to the KV cache. Preliminary results indicate a 5–10× speedup over traditional storage-backed KV cache implementations, highlighting the potential of CXL memory pooling for next-generation inference systems. The work offers a scalable, standards-aligned approach for deploying memory-intensive inference pipelines in modern HPC systems and data centers.