Wes Vaske is a Senior Member of Technical Staff at Micron Technology. As a Storage Solutions Architect with over 15 years of experience in data center storage systems, he is currently focused on developing high-performance NVMe solutions for AI workloads.
He is a lead contributor to the MLPerf Storage Working Group, where he helps define industry benchmarks for AI storage performance. Wes is a frequent presenter at Future of Memory and Storage (FMS) and the SNIA Developer Conference (SDC), and he chairs the SNIA AI Data Workloads Technical Working Group (TWG).
Prior to his current role, Wes spent more than a decade as a Systems Performance Engineer with the Data Center Workloads Engineering team, where he pioneered system observation, tracing, and analysis tools and developed automation frameworks that enable reproducible and insightful performance analysis across diverse environments. This work has become foundational to Micron’s workload-first product development strategy.
His earlier career includes performance engineering for Oracle RAC database systems at Dell Technologies.
Benchmarking and characterization of storage for AI continue to be a challenge across the industry. Tools for executing benchmarks are broadly available, as is a wide array of workload definitions. The problem we face is understanding which workloads are important to customers, integrators, and product teams.
To address some of these challenges, SNIA has launched a new Technical Working Group (TWG): the AI Data Workloads TWG. This TWG was formed to provide definitions of AI storage workloads and the associated SNIA software to run a workload synthetically. This will enable standardization of workload definitions for suppliers, developers, and architects who are designing the next generation of AI data centers.
Attendees will leave this session with an understanding of the AI Data Workloads charter, how they can use the content produced by the TWG, and how to become involved in the TWG.
The unrelenting pace of evolution in AI systems continues to present new memory bottlenecks. Various use cases are hitting different memory walls: KV caches require bandwidth and capacity while being latency tolerant; model offloading requires the lowest latencies at moderate queue depths (usable IOPS); GNN training requires saturating bandwidth at small IO sizes (maximum IOPS); checkpointing requires sustained write bandwidth. NAND provides multiple methods of addressing each of these challenges through various system architectures.
In this session we will explore the various use cases and their specific memory wall problems, the systems being designed and deployed to address the set of walls, and what levers can be used to optimize the Total Cost of Ownership (TCO) for the various solutions and use cases.