DuckLake
DuckLake is a lakehouse extension for DuckDB that pairs a database-backed catalog with Parquet data files stored locally, on Amazon S3, or on Google Cloud Storage (GCS).
Overview
DuckLake separates catalog metadata, which lives in a database, from table data, which lives in Parquet files. Key features include:
- Flexible catalog backends: Use a local DuckDB/SQLite file or a PostgreSQL/MySQL server for multi-user catalog management
- Multiple storage options: Store Parquet data files locally, on Amazon S3, or on Google Cloud Storage (GCS)
- Full SQL support: Create, alter, and query tables using standard SQL through DuckDB
- ACID transactions: Transactional guarantees via the underlying catalog database
- Time travel: Query historical versions of your data
DuckLake is ideal for teams that want lakehouse capabilities without the complexity of Spark or Hive, while retaining the speed and simplicity of DuckDB.
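As a rough illustration of the features listed above, the following DuckDB SQL attaches a DuckLake catalog kept in a local DuckDB file, writes Parquet data to a local directory, and then runs DDL, a transaction, and a time-travel query. The file name `metadata.ducklake`, the directory `lake_data/`, the alias `my_lake`, and the `events` table are placeholders chosen for this sketch. The snapshot version in the last query is also illustrative, and the `ducklake_snapshots` function is assumed to be available in your DuckLake version.

```sql
-- Minimal sketch; every name and path below is a placeholder.
INSTALL ducklake;
LOAD ducklake;

-- Catalog metadata is stored in a local DuckDB file;
-- table data is written as Parquet files under lake_data/.
ATTACH 'ducklake:metadata.ducklake' AS my_lake (DATA_PATH 'lake_data/');
USE my_lake;

-- Standard SQL: create, populate, and alter a table.
CREATE TABLE events (id INTEGER, name VARCHAR);
INSERT INTO events VALUES (1, 'signup'), (2, 'login');
ALTER TABLE events ADD COLUMN source VARCHAR;

-- Changes made inside a transaction are committed together.
BEGIN TRANSACTION;
UPDATE events SET source = 'web' WHERE id = 1;
DELETE FROM events WHERE id = 2;
COMMIT;

-- List the snapshots recorded in the catalog.
SELECT * FROM ducklake_snapshots('my_lake');

-- Time travel: read the table as of an earlier snapshot.
-- Replace 2 with a snapshot_id returned by the query above.
SELECT * FROM events AT (VERSION => 2);
```

Each committed change produces a new snapshot in the catalog, which is what makes the final query possible. A timestamp can be used instead of a version with `AT (TIMESTAMP => ...)`, provided the table already existed at that time.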
Connecting
To connect to DuckLake in DBCode:
- Open the DBCode Extension: Launch Visual Studio Code and open the DBCode extension.
- Add a New Connection: Click on the “Add Connection” icon.
- Complete the connection form:
  - Select DuckLake as the database type
  - Choose your catalog type (local file or database server)
  - Configure the data storage location (local directory, S3 path, or GCS path)
  - For S3/GCS storage, configure an AWS authentication profile
- Connect: Click Save to establish your connection.
- Start Querying: Begin creating tables and querying your lakehouse data.
For detailed instructions on connecting to DuckLake, refer to the Connect article.
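For orientation, the choices in the connection form correspond roughly to the statements you would write by hand in DuckDB. The sketch below assumes a PostgreSQL catalog and S3 storage; how DBCode issues these statements internally is not documented here, so treat this as an approximation. The secret name `lake_s3`, the profile `my-profile`, the database `ducklake_catalog`, the host, and the bucket path are all hypothetical values.

```sql
-- Hand-written approximation of a server catalog plus S3 storage.
-- All names, hosts, profiles, and paths are hypothetical.
INSTALL ducklake;
INSTALL postgres;
LOAD ducklake;

-- Resolve S3 credentials from a named AWS profile
-- (uses DuckDB's aws extension and its credential_chain provider).
CREATE SECRET lake_s3 (
    TYPE s3,
    PROVIDER credential_chain,
    PROFILE 'my-profile'
);

-- Catalog metadata in PostgreSQL, Parquet data files on S3.
ATTACH 'ducklake:postgres:dbname=ducklake_catalog host=localhost'
    AS my_lake (DATA_PATH 's3://my-bucket/lake/');
USE my_lake;

-- Quick check that the connection works.
SHOW TABLES;
```

For a local setup, the `ATTACH` target would instead point at a catalog file, for example `'ducklake:metadata.ducklake'`, with `DATA_PATH` set to a local directory and no secret required.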
DuckLake Features in DBCode
DBCode provides full support for DuckLake’s lakehouse capabilities:
- Schema browsing: Explore tables, columns, and metadata in the object tree
- SQL autocomplete: Intelligent suggestions for DuckDB-specific functions and syntax
- Data editing: Insert, update, and delete rows directly in the data grid
- EXPLAIN visualization: Graphical view of query execution plans
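The grid edits and plan views listed above correspond to ordinary SQL statements. The sketch below shows what equivalent statements might look like when typed into a query editor; it assumes a locally attached DuckLake, and the `my_lake` alias, file paths, and `orders` table are placeholders. The exact statements DBCode generates for grid edits may differ.

```sql
-- Placeholders throughout; adjust names and paths to your setup.
INSTALL ducklake;
LOAD ducklake;
ATTACH 'ducklake:metadata.ducklake' AS my_lake (DATA_PATH 'lake_data/');
USE my_lake;

CREATE TABLE IF NOT EXISTS orders (id INTEGER, status VARCHAR, amount DOUBLE);

-- Inserting, updating, and deleting rows maps onto plain DML.
INSERT INTO orders VALUES (1, 'new', 19.99), (2, 'new', 5.00);
UPDATE orders SET status = 'shipped' WHERE id = 1;
DELETE FROM orders WHERE id = 2;

-- EXPLAIN returns the query plan without running the query.
EXPLAIN SELECT status, sum(amount) AS total FROM orders GROUP BY status;

-- EXPLAIN ANALYZE runs the query and adds runtime details to the plan.
EXPLAIN ANALYZE SELECT status, sum(amount) AS total FROM orders GROUP BY status;
```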
By using DuckLake with DBCode, you get a lightweight lakehouse directly within Visual Studio Code, with the full power of DuckDB’s analytical engine and flexible storage backends.
For more information about DuckLake, see the official DuckLake documentation.