Dropbox Engineering: Improving Storage Efficiency in Magic Pocket at Exabyte Scale
Dropbox has published a detailed engineering blog post about improving storage efficiency in Magic Pocket, their custom-built exabyte-scale immutable blob storage system that holds all user content.
The Challenge
Magic Pocket stores trillions of blobs and processes millions of deletes daily. Because it is an immutable blob store, data is never modified in place: updates and deletes write new data, and the old data remains on disk until it is reclaimed. Last year, a change to data placement reduced write amplification but had an unintended side effect. Fragmentation increased and pushed storage overhead higher, with a small number of severely under-filled volumes consuming a disproportionate share of raw capacity.
Key Technical Concepts
- Immutable Architecture: Once written, blobs are never modified. Deleted data stays on disk until it is compacted.
- Garbage Collection: Identifies unreferenced blobs but does not itself free any space.
- Compaction: Gathers live blobs from old volumes, writes them to new volumes, and retires the old ones.
- Erasure Coding: Splits data into fragments plus parity fragments for fault tolerance, using significantly less storage than replication.
- Fragmentation Problem: A volume that holds only 10% live data consumes 10x the raw capacity that data actually needs.
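The relationship between garbage collection, compaction, and fragmentation can be sketched in a few lines of Python. This is a toy model, not Magic Pocket's implementation: the `Volume` class, the `compact` function, and all sizes are hypothetical, chosen only to show that marking blobs dead frees nothing and that space comes back only when live blobs are rewritten into fresh volumes.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical fixed-capacity, append-only volume (illustrative only)."""
    capacity: int                               # bytes reserved on disk
    blobs: dict = field(default_factory=dict)   # blob_id -> size in bytes
    dead: set = field(default_factory=set)      # blob_ids that GC marked unreferenced

    def live_bytes(self) -> int:
        return sum(size for blob_id, size in self.blobs.items()
                   if blob_id not in self.dead)

    def overhead(self) -> float:
        """Raw capacity consumed per byte of live data (10% live gives 10x)."""
        live = self.live_bytes()
        return float("inf") if live == 0 else self.capacity / live


def compact(old_volumes, capacity):
    """Copy live blobs into fresh volumes; the old volumes can then be retired."""
    new_volumes = [Volume(capacity)]
    for vol in old_volumes:
        for blob_id, size in vol.blobs.items():
            if blob_id in vol.dead:
                continue  # dead data is simply never copied forward
            current = new_volumes[-1]
            if sum(current.blobs.values()) + size > capacity:
                current = Volume(capacity)      # current volume is full, open another
                new_volumes.append(current)
            current.blobs[blob_id] = size
    return new_volumes


# Five full volumes of ten 10-byte blobs each (made-up sizes).
old = [Volume(100, {f"v{i}-b{j}": 10 for j in range(10)}) for i in range(5)]
for vol in old:
    vol.dead = set(list(vol.blobs)[1:])   # GC marks 9 of 10 blobs; space is still held

print(old[0].overhead())                  # 10.0
new = compact(old, capacity=100)
print(len(old), "->", len(new))           # 5 -> 1
print(new[0].overhead())                  # 2.0
```

In this toy run, five volumes that are each 10% live collapse into a single volume that is 50% live, so four volumes' worth of raw capacity can be released.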
The Solution
Dropbox developed a multi-strategy approach to drive overhead back down, even below their previous baseline. The engineering challenge is particularly acute at exabyte scale, where even modest overhead increases translate into significant infrastructure costs.
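A rough calculation shows why a handful of sparse volumes matters at this scale. The fleet composition below (95 volumes at 90% live, 5 volumes at 10% live) and the 1 EB figure are assumptions made up for illustration, not numbers from Dropbox. The sketch also ignores erasure-coding overhead and looks only at how volume fill affects raw capacity.

```python
def fleet_overhead(fill_fractions):
    """Raw capacity divided by live data, assuming equal-sized volumes.

    fill_fractions: live fraction of each volume, e.g. 0.9 for 90% live.
    """
    return len(fill_fractions) / sum(fill_fractions)


# Hypothetical fleet: 95 well-filled volumes plus 5 severely under-filled ones.
healthy = [0.9] * 95
sparse = [0.1] * 5
fleet = healthy + sparse

before = fleet_overhead(healthy)   # overhead if the sparse volumes did not exist
after = fleet_overhead(fleet)      # overhead with the sparse volumes included

# Share of live data vs. share of raw capacity held by the sparse volumes.
sparse_live_share = sum(sparse) / sum(fleet)
sparse_raw_share = len(sparse) / len(fleet)

# Extra raw capacity needed to hold 1 EB of live data at the higher overhead.
EB = 10**18
PB = 10**15
extra_raw_pb = EB * (after - before) / PB

print(f"overhead: {before:.3f}x -> {after:.3f}x")
print(f"sparse volumes: {sparse_live_share:.2%} of live data, "
      f"{sparse_raw_share:.0%} of raw capacity")
print(f"extra raw capacity per EB of live data: {extra_raw_pb:.1f} PB")
```

With these made-up inputs, the sparse volumes hold under 1% of the live data yet occupy 5% of raw capacity, and the resulting overhead increase of about 0.05x costs roughly 52 PB of extra raw storage per exabyte of live data.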
Why It Matters
This is a rare, detailed look at the operational challenges of running storage infrastructure at exabyte scale. The immutable blob store architecture, which many large-scale systems use, presents compaction and fragmentation challenges that are not widely discussed in the literature. The post offers practical insights for anyone working on large-scale storage systems.
Source: dropbox.tech — via HN