Turbolite: SQLite VFS in Rust Serves Cold Queries from S3 in Under a Second
A developer has built Turbolite, an experimental SQLite Virtual File System (VFS) written in Rust. It can serve cold queries directly from Amazon S3 with sub-second latency, and often much less, without keeping a local copy of the database.
How It Works
Instead of naive page-at-a-time reads from a raw SQLite file, Turbolite:
- Introspects SQLite B-trees to understand data structure
- Stores related pages together in compressed page groups on S3
- Maintains a manifest as the source of truth for page locations
- Uses seekable zstd frames and S3 range GETs for cache misses
- Prefetches strategically based on query plans
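To make the manifest idea concrete, here is a minimal sketch of how a page number might be resolved to an S3 range GET on a cache miss. Every name in it (`FrameLocation`, `Manifest`, `range_header`, the example object key) is hypothetical and chosen for illustration. Turbolite's actual manifest format and API are not described in the source and may differ.

```rust
use std::collections::HashMap;

/// Where one compressed frame lives inside an immutable S3 object.
/// Hypothetical layout for illustration, not Turbolite's real schema.
#[derive(Debug, Clone, PartialEq)]
pub struct FrameLocation {
    pub object_key: String, // S3 key of the page-group object
    pub offset: u64,        // byte offset of the compressed frame
    pub length: u64,        // compressed length of the frame in bytes
    pub pages: Vec<u32>,    // SQLite page numbers stored in this frame
}

/// The manifest maps every page number to the frame that holds it.
#[derive(Debug, Default)]
pub struct Manifest {
    frames: Vec<FrameLocation>,
    page_to_frame: HashMap<u32, usize>,
}

impl Manifest {
    /// Register a frame and index each of its pages.
    pub fn add_frame(&mut self, frame: FrameLocation) {
        let index = self.frames.len();
        for &page in &frame.pages {
            self.page_to_frame.insert(page, index);
        }
        self.frames.push(frame);
    }

    /// Find the frame that must be fetched to read `page`, if any.
    pub fn locate(&self, page: u32) -> Option<&FrameLocation> {
        self.page_to_frame.get(&page).map(|&index| &self.frames[index])
    }
}

/// Build the value of an HTTP `Range` header for one frame.
/// HTTP byte ranges are inclusive on both ends; an empty frame has no range.
pub fn range_header(frame: &FrameLocation) -> Option<String> {
    if frame.length == 0 {
        return None;
    }
    Some(format!("bytes={}-{}", frame.offset, frame.offset + frame.length - 1))
}
```

In this sketch, a miss on one page triggers a single range request for the whole frame, so the pages stored alongside it arrive in the same transfer.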
The Design Philosophy
Turbolite is inspired by Turbopuffer's S3-native approach. The key insight is that S3 rewards fewer requests, bigger transfers, and immutable objects, which is the opposite of what traditional filesystems assume. Rather than fighting S3's characteristics, Turbolite designs around them.
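One way to picture "fewer requests, bigger transfers" is range coalescing: nearby byte ranges are merged into a single larger request even if that means downloading a few unneeded bytes. The sketch below is a generic illustration of that technique, not code from Turbolite. `ByteRange`, `coalesce`, and the `max_gap` parameter are assumed names.

```rust
/// A half-open byte range [start, end) within a single S3 object.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteRange {
    pub start: u64,
    pub end: u64,
}

/// Merge ranges that overlap or sit at most `max_gap` bytes apart, so that
/// several small reads become one larger range GET.
pub fn coalesce(mut ranges: Vec<ByteRange>, max_gap: u64) -> Vec<ByteRange> {
    // Sorting by start offset lets a single pass find every mergeable neighbor.
    ranges.sort_by_key(|r| r.start);
    let mut merged: Vec<ByteRange> = Vec::new();
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            if r.start <= last.end.saturating_add(max_gap) {
                // Close enough: extend the previous request instead of adding one.
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}
```

A larger `max_gap` trades some wasted bandwidth for fewer round trips, which is usually the better deal when each request carries a fixed latency cost.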
Target Use Case
The project is aimed at architectures with many mostly-cold SQLite databases:
- Database-per-tenant SaaS applications
- Session-based databases
- User-specific databases
In these scenarios, keeping a separate attached volume for each inactive database feels wasteful. Turbolite assumes a single write source and targets 'many databases with bursty cold reads' rather than 'one hot database.'
Performance Optimizations
- Pages grouped by type in S3 for efficient parallel access
- Tunable prefetching: conservative for point queries, aggressive for scans
- Query plan awareness: passes information about upcoming storage operations down to the VFS, so downloads can start before the reads that need them
- Compressed page groups reduce transfer volume
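The tunable prefetching described above could look roughly like the following sketch, where a hint derived from the query plan decides how many page groups to fetch ahead. The `AccessHint` enum, `PrefetchConfig` fields, and `groups_to_prefetch` function are all hypothetical. The source does not say how Turbolite exposes these knobs.

```rust
use std::ops::Range;

/// A coarse hint about the upcoming access pattern (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AccessHint {
    PointLookup, // e.g. a primary-key lookup touching a few pages
    RangeScan,   // e.g. a table or index scan touching many pages
}

/// How many page groups to fetch ahead for each kind of access.
#[derive(Debug, Clone, Copy)]
pub struct PrefetchConfig {
    pub point_lookahead: usize,
    pub scan_lookahead: usize,
}

/// Return the indices of the page groups to prefetch after `current_group`,
/// clamped so the result never runs past `total_groups`.
pub fn groups_to_prefetch(
    hint: AccessHint,
    current_group: usize,
    total_groups: usize,
    cfg: &PrefetchConfig,
) -> Range<usize> {
    let lookahead = match hint {
        AccessHint::PointLookup => cfg.point_lookahead,
        AccessHint::RangeScan => cfg.scan_lookahead,
    };
    let start = current_group.saturating_add(1).min(total_groups);
    let end = start.saturating_add(lookahead).min(total_groups);
    start..end
}
```

With a point lookahead of zero, a point query downloads only the group it needs, while a scan can keep several downloads in flight ahead of the reader.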
Status
The project is explicitly experimental: the author warns that it 'may corrupt data' and should not be trusted with anything important yet. It is an interesting exploration of whether object storage has become fast enough to back an embedded database directly.
Why It Matters
If it proves reliable, this approach could change how developers think about database architecture in serverless and edge computing environments, removing the need for persistent volumes in many scenarios.