🦆 Lance x DuckDB, 🚗 Uber-Scale Storage, ⚡ 1.5M IOPS
January Newsletter • February 9, 2026
Highlights
🦆 Lance x DuckDB: SQL for Retrieval on the Multimodal Lakehouse Format
The Lance extension for DuckDB turns DuckDB into a SQL compute engine over Lance datasets, exposing vector, full-text, and hybrid retrieval as SQL table functions. This enables fully composable retrieval workflows — joins with eval data, reproducible top-k slicing, SQL-based debugging, and materialization back into Lance.
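To make the "joins with eval data" idea concrete, here is a minimal sketch of computing recall@k with a single SQL join. It deliberately does not use the Lance extension: Python's built-in sqlite3 stands in for DuckDB, and a hand-filled `retrieved` table stands in for the output of a retrieval table function. All table names, column names, and values are hypothetical.

```python
import sqlite3

# Stand-in setup: `retrieved` plays the role of top-k retrieval output,
# `eval_labels` holds the documents judged relevant for each query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE retrieved (query_id INTEGER, doc_id INTEGER, pos INTEGER);
CREATE TABLE eval_labels (query_id INTEGER, doc_id INTEGER);
INSERT INTO retrieved VALUES
  (1, 10, 1), (1, 11, 2), (1, 12, 3),
  (2, 20, 1), (2, 21, 2), (2, 22, 3);
INSERT INTO eval_labels VALUES (1, 11), (1, 99), (2, 20), (2, 22);
""")

# Recall@k per query: the fraction of relevant documents that appear
# within the first k retrieved positions.
RECALL_AT_K = """
SELECT e.query_id,
       COUNT(r.doc_id) * 1.0 / COUNT(*) AS recall
FROM eval_labels AS e
LEFT JOIN retrieved AS r
  ON r.query_id = e.query_id
 AND r.doc_id = e.doc_id
 AND r.pos <= ?
GROUP BY e.query_id
ORDER BY e.query_id
"""

def recall_at_k(k):
    """Return a list of (query_id, recall) rows for the given cutoff k."""
    return conn.execute(RECALL_AT_K, (k,)).fetchall()

for k in (2, 3):
    print(k, recall_at_k(k))
```

Because the retrieval output is just a table, changing the cutoff or swapping in a different eval set is a one-line change to the query rather than new application code.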
🚗 Rethinking Table File Paths with Uber: Lance’s Multi-Base Layout
Working with Uber’s AI Infrastructure team, Lance introduced a multi-base layout to support production systems that need a single dataset to span multiple S3 buckets for parallel reads and writes.
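One way to picture a dataset that spans several buckets: the dataset keeps a small table of base locations, and each data file records which base it lives under. The sketch below illustrates that idea only. The `DataFile` class, the `resolve_uri` helper, and the bucket names are hypothetical and are not Lance's actual manifest format or API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataFile:
    """A data file entry: a path relative to one of the dataset's bases."""
    relative_path: str
    base_id: int

def resolve_uri(bases: dict[int, str], data_file: DataFile) -> str:
    """Join the file's base location with its relative path."""
    base = bases[data_file.base_id].rstrip("/")
    return f"{base}/{data_file.relative_path.lstrip('/')}"

# Hypothetical dataset whose files are spread over two S3 buckets.
bases = {
    0: "s3://example-bucket-a/dataset",
    1: "s3://example-bucket-b/dataset",
}
files = [
    DataFile("data/part-0.lance", base_id=0),
    DataFile("data/part-1.lance", base_id=1),
]

for f in files:
    print(resolve_uri(bases, f))
```

Since each file resolves independently, readers and writers can target different buckets in parallel while still seeing one logical dataset.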
📍 The Quest for One Million IOPS: Benchmarking Storage at Lance
Recent storage benchmarks in Lance reached up to 1.5 million IOPS by combining a scheduler rework with io_uring, showing that high random-access throughput depends more on reducing CPU overhead and context switching than on single-read latency.
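A back-of-the-envelope model helps show why CPU overhead can matter more than single-read latency. Assume throughput is capped by two things: how many reads can be in flight (queue depth divided by latency, per Little's law) and how many reads the CPU can issue and complete (cores divided by CPU time per read). The function and every number below are illustrative assumptions, not figures from the Lance benchmarks.

```python
def max_iops(queue_depth, read_latency_us, cores, cpu_us_per_io):
    """Upper bound on IOPS under a simple two-limit model.

    latency_bound: reads in flight / time each read takes (Little's law).
    cpu_bound:     CPU time available per second / CPU time per read.
    """
    latency_bound = queue_depth * 1_000_000 / read_latency_us
    cpu_bound = cores * 1_000_000 / cpu_us_per_io
    return min(latency_bound, cpu_bound)

# Illustrative baseline: deep queue, 100 us reads, 8 cores, 10 us of CPU per read.
baseline = max_iops(queue_depth=256, read_latency_us=100, cores=8, cpu_us_per_io=10)

# Halving device latency does not help while the CPU is the bottleneck.
faster_device = max_iops(queue_depth=256, read_latency_us=50, cores=8, cpu_us_per_io=10)

# Halving per-read CPU cost (fewer syscalls and context switches) doubles the bound.
cheaper_io = max_iops(queue_depth=256, read_latency_us=100, cores=8, cpu_us_per_io=5)

print(baseline, faster_device, cheaper_io)
```

In this model, once enough reads are in flight, the only way to raise the ceiling is to spend less CPU per read, which is the kind of saving a scheduler rework and io_uring aim for.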
We’re bringing together the Apache Iceberg, Lance, and Apache DataFusion communities at Cloudflare’s NYC office to talk about all things open lakehouse and data infrastructure.
🗂️ Multi-base storage layouts to span multiple buckets or regions with a single dataset
⚡ Faster query execution via tighter WAND bounds and reduced per-query overhead
🦆 DuckDB-native SQL retrieval for vector, FTS, and hybrid search
🧩 Expanded embedding support (VoyageAI v4, multimodal) and faster ingestion via parallel embedding computation
🧠 Versioned context store APIs for append, search, and checkout across Python and Rust
🗜️ Background compaction and reduced Python blocking for long-running systems
A huge thank you to the contributors from Uber, Netflix, Hugging Face, Bytedance, Huawei, Tencent, Alibaba, and more!
Read the full newsletter for more updates around lance-namespace, lance-duckdb, lance-ray, and lance-spark.
In January, we held two Lance Community Syncs focused on the upcoming Lance v2.0.0 release, growing ecosystem integrations with DuckDB, Polaris, and Hugging Face, and the formalization of lance-context and lance-graph as official sub-projects.
The next Lance Community Sync will take place on February 12, 2026.