Minseong Kim | Distinguished Engineer
SK hynix

Minseong Kim is a Distinguished Engineer at SK hynix with extensive experience in system architecture and performance analysis of DRAM-based server platforms and emerging memory solutions. His work spans CXL memory (expansion and pooling), PIM, and MRDIMM, with the goal of enabling scalable data center systems. More recently, he has driven performance characterization of large-scale LLM inference and AI agent systems, focusing on memory bottlenecks, KV cache dynamics, and end-to-end system optimization.

Appearances:

Future of Memory and Storage - Day 3 @ 14:40

CXL Pooling/Sharing for Shared KV Cache: Opportunities and Practical Constraints in LLM Inference

This presentation explores how CXL pooling/sharing could enable KV cache sharing across a memory hierarchy that includes VRAM, local DRAM, local SSD, and an ICMS-like tier. We focus on latency-sensitive and memory-capacity-hungry inference patterns (e.g., multi-turn serving, multi-adapter workloads) where KV reuse and prefix overlap are prominent. The talk is concept-driven and grounded in published literature and public reports. We summarize expected benefits, outline deployment constraints (ecosystem maturity, correctness/coherence boundaries, software support), and discuss how to prioritize a deployable subset of CXL capabilities rather than assuming “full spec implementation” is always optimal.

Minseong Kim, Distinguished Engineer, SK hynix

last published: 19/May/26 18:25 GMT
