Kyung Soo Lee | Principal Engineer
SK hynix

Kyung Soo Lee is a senior Principal Engineer at SK hynix, a leading global semiconductor company. With over a decade of experience in memory and storage technologies, he is a visionary leader driving innovation at the forefront of the field. His expertise lies in developing emerging memory and storage solutions that shape the future of data storage and processing.

Appearances:

Future of Memory and Storage - Day 3 @ 08:55

Key-Value caching at the intersection of memory and hybrid storage accelerates LLM inference at scale

As LLMs scale to billions of parameters and handle complex, multi-turn workloads, inference efficiency is no longer determined solely by compute power, but also by how intelligently the KV cache is managed across memory and storage tiers. This talk explores a novel architecture that situates KV caching at the critical junction between GPU memory and hybrid storage. Using Linux volume groups and SPDK for NVMe over Fabrics, we treat SSD/HDD tiers as active memory extensions, not passive backends. Frequently accessed KV states remain in fast layers, while less active data moves to cost-efficient storage, eliminating redundant attention recomputation. Integrated with the Dynamo KV Block Manager and dynamic logical volumes, this reduces time-to-first-token and power consumption while easing GPU memory (HBM) pressure. Result: higher concurrency and more simultaneous users without sacrificing responsiveness. The system adapts to real-time workload patterns, improving throughput and lowering operational cost. A practical, scalable solution for production LLM deployment.
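The tiering idea in the abstract can be illustrated with a minimal Python sketch. It is not the speakers' implementation: the class name `TieredKVCache`, the LRU demotion policy, and the `fake_prefill` function are all assumptions made for illustration, and plain in-memory dictionaries stand in for GPU memory and the SSD/HDD volumes. It shows only the core behavior described above: KV state evicted from the fast tier is demoted to a slower tier rather than discarded, so a later request for the same prefix is served by promotion instead of recomputation.

```python
from collections import OrderedDict


class TieredKVCache:
    """Hypothetical two-tier cache: a small fast tier (stand-in for GPU or
    host memory) backed by a larger slow tier (stand-in for SSD/HDD)."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # ordered from least to most recently used
        self.slow = {}             # assumed unbounded for this sketch
        self.recomputes = 0        # how many times KV state had to be rebuilt

    def _put_fast(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        # Demote least recently used entries instead of dropping them.
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value

    def get(self, key, compute):
        if key in self.fast:           # fast-tier hit
            self.fast.move_to_end(key)
            return self.fast[key]
        if key in self.slow:           # slow-tier hit: promote, no recompute
            value = self.slow.pop(key)
            self._put_fast(key, value)
            return value
        self.recomputes += 1           # full miss: rebuild the KV state
        value = compute(key)
        self._put_fast(key, value)
        return value


def fake_prefill(prefix):
    # Placeholder for the attention prefill that would build real KV tensors.
    return f"kv({prefix})"


cache = TieredKVCache(fast_capacity=2)
for prefix in ["a", "b", "c", "a"]:
    cache.get(prefix, fake_prefill)

print(cache.recomputes, list(cache.fast), list(cache.slow))
```

In this run the second request for prefix "a" is served from the slow tier, so only the three distinct prefixes are ever computed. A real system would replace the dictionaries with block-level transfers between HBM, host memory, and NVMe-backed logical volumes, and would likely use a policy more workload-aware than plain LRU.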

Kyung Soo Lee, Principal Engineer, SK hynix

Mohamad El-Batal, Chief Technologist - CTO Office, Seagate