Sarah Wooders


I'm a fourth-year PhD student at UC Berkeley advised by Ion Stoica and Joseph E. Gonzalez, working in the Sky Computing Lab and previously in the RISELab. I am broadly interested in systems for ML, with a focus on data systems and the cloud.

Before Berkeley, I founded Glisten AI, a Y Combinator W20 company that builds AI to categorize and tag product data. My undergraduate degree is from MIT, where I studied computer science and math and did research at CSAIL in the Supertech Group.

Open Source

MemGPT: Long-term memory for LLM agents [paper] [discord]
Skyplane: Blazing fast bulk data transfer for any cloud [paper] [blog]
ralf: Incremental maintenance for feature & embedding stores [paper]

Selected Publications

Cloudcast: High-Throughput, Cost-Aware Overlay Multicast in the Cloud

NSDI 2024 (to appear)

Sarah Wooders, Shu Liu, Paras Jain, Simon Mo, Joseph E. Gonzalez, Vincent Liu, Ion Stoica

RALF: Accuracy-Aware Scheduling for Feature Store Maintenance

VLDB 2024 (to appear)

Sarah Wooders, Simon Mo, Amit Narang, Kevin Lin, Ion Stoica, Joseph M. Hellerstein, Natacha Crooks, Joseph E. Gonzalez

Skyplane: Optimizing Transfer Cost and Throughput Using Cloud-Aware Overlays

NSDI 2023 [link]

Paras Jain, Sam Kumar, Sarah Wooders, Shishir G. Patil, Joseph E. Gonzalez, Ion Stoica

Press / Blogs

[RISELab] Feature Stores: The Data Side of ML Pipelines
[TechCrunch] Glisten uses computer vision to break products into their most important parts
[MIT News] Taking the lead in shaping the future of computing and artificial intelligence
[MemSQL] Faster Data Loading with Adaptive Compression
[Underscore] Fashion Brought to You by AI