Sumit Gupta | Software Engineer
Meta Platforms

Sumit has been in the storage industry for 30 years and has been deeply involved in flash-based storage as part of industry efforts including FDP. He has been at Meta since 2020, where he works on the server side of the Tectonic stack to improve it for flash and AI. Previously, he worked at Sun Microsystems on the open-source COMSTAR framework, as well as at Google, VMware, and HPE.

Appearances:
Future of Memory and Storage - Day 2 @ 09:05

Improving QLC write efficiency using FDP

The AI-induced ramp in flash consumption has significantly increased the QLC footprint in data centers. While QLC provides much better read performance than HDDs, its write performance is significantly lower. Any write amplification, especially at high utilization, further degrades write I/O, bringing it very close to that of HDDs.

FDP has been proposed in the past for reducing flash write amplification. It has proven very useful for QLC media, as any saving in write amplification factor (WAF) translates directly into precious write bandwidth for applications. The presentation will cover recent data on the most effective ways to use FDP and how much WAF improvement to expect.

Future of Memory and Storage - Day 2 @ 10:20

Managing flash IO capacity at AI scale

As we build huge AI clusters spanning multiple cities and several exabytes of storage, managing IO capacity becomes an impossibly complex task. Workloads vary from hundreds of millions of small reads of a few KiB each to hundreds of thousands of huge write bursts of several megabytes each. Furthermore, media like QLC have imbalanced read-to-write ratios, which makes it even harder to represent I/O uniformly. Several AI teams actively share the same storage clusters, often pushing their limits on both space and IO. This requires the storage clusters to grow continuously, often leading to an ongoing imbalance between space and IO.

Meta has been operating at the forefront of AI research, leading innovations not just in AI but also in systems and storage design to serve growing AI research needs. Storage clusters at Meta have grown to operate at a scale of tens of exabytes, with heterogeneous hardware across both TLC and QLC flash. This presentation will dive into the details of uniform representation of IO capacity, capacity modeling, overload protection, and multi-tenancy at scale.

last published: 19/May/26 18:25 GMT
