Data-at-Rest Encryption in DuckDB
Key topics
DuckDB's new encryption-at-rest feature drew a lively discussion about database security and the trade-offs between encryption schemes. Some commenters weigh the potential performance impact and the challenges of key management, while others stress the importance of encryption for protecting sensitive data. As one commenter notes, the feature brings DuckDB in line with other major databases, and regulars discuss the implications for data security and compliance, timely given increasingly stringent data-protection regulations.
Snapshot generated from the HN discussion
Discussion Activity
- Engagement: moderate
- First comment: 48m after posting
- Peak period: 10 comments in 0-3h
- Average per period: 3.3
Based on 26 loaded comments
Key moments
- 01 Story posted: Nov 20, 2025 at 2:26 PM EST
- 02 First comment: Nov 20, 2025 at 3:14 PM EST (48m after posting)
- 03 Peak activity: 10 comments in 0-3h, the hottest window of the conversation
- 04 Latest activity: Nov 22, 2025 at 10:10 AM EST
DB encryption is useful if you have multiple things that need separate ACLs and encryption keys, but if it is one app and one DB, there is no need for it.
> This allows for some interesting new deployment models for DuckDB, for example, we could now put an encrypted DuckDB database file on a Content Delivery Network (CDN). A fleet of DuckDB instances could attach to this file read-only using the decryption key. This elegantly allows efficient distribution of private background data in a similar way like encrypted Parquet files, but of course with many more features like multi-table storage. When using DuckDB with encrypted storage, we can also simplify threat modeling when – for example – using DuckDB on cloud providers. While in the past access to DuckDB storage would have been enough to leak data, we can now relax paranoia regarding storage a little, especially since temporary files and WAL are also encrypted.
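The deployment model the quoted passage describes comes down to a pair of ATTACH options. Here is a minimal sketch in Python, assuming DuckDB 1.4+ where ATTACH accepts an ENCRYPTION_KEY option as described in the post; the file name and key are placeholders:

```python
import duckdb

con = duckdb.connect()

# Producer side: create an encrypted database file and fill it.
con.execute("ATTACH 'private.duckdb' AS enc (ENCRYPTION_KEY 'my-secret-key')")
con.execute("CREATE TABLE enc.facts AS SELECT 42 AS answer")
con.execute("DETACH enc")

# Consumer side: any instance holding the key can attach read-only,
# e.g. after fetching the file from a CDN.
con.execute(
    "ATTACH 'private.duckdb' AS shared "
    "(ENCRYPTION_KEY 'my-secret-key', READ_ONLY)"
)
print(con.execute("SELECT answer FROM shared.facts").fetchall())
```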
Comparing it to a naive approach (encrypting the entire database file in a single shot and loading it all into memory at once) will always make competent work look "amazing".
I say this not to shit on DuckDB (I see no reason to shit on them); rather, I think it's important that we as professionals hold realistic standards that we expect _ourselves_ to hit. Work we call "amazing" is work we give ourselves permission not to be able to replicate. This is not in that category, and therefore you should hold yourself to the same standard.
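To make that contrast concrete, here is a hypothetical sketch (not DuckDB's actual scheme) of whole-file versus per-page decryption; the page size, on-disk layout, and key handling are illustrative only:

```python
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

KEY = AESGCM.generate_key(bit_length=256)
PAGE = 4096  # illustrative page size

def naive_decrypt(path: str) -> bytes:
    # Whole-file scheme: the entire ciphertext must be read and
    # decrypted before a single byte can be used.
    blob = open(path, "rb").read()
    return AESGCM(KEY).decrypt(blob[:12], blob[12:], None)

def page_decrypt(path: str, page_no: int) -> bytes:
    # Per-page scheme: seek to one page and decrypt only that page.
    # Assumed layout: 12-byte nonce + ciphertext + 16-byte tag per page.
    stored = 12 + PAGE + 16
    with open(path, "rb") as f:
        f.seek(page_no * stored)
        blob = f.read(stored)
    return AESGCM(KEY).decrypt(blob[:12], blob[12:], None)
```

The per-page version does O(1) I/O per lookup instead of O(file size), which is the whole point of encrypting at the page level.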
I run a small company and needed to budget a solid chunk of time next year to dig into improving this component of our system. I respect your perspective on holding high standards, but I do think reliable, performant software that demonstrates consistent competence is worth getting excited about and celebrating.
i.e., running it like a normal database and getting to take advantage of all of its goodies.
Where you store the .duckdb file will make a big difference in performance (e.g. S3 vs. Elastic File System).
But I'd take a good look at DuckLake as a better multiplayer option. If you store `.parquet` files in blob storage, it will be slower than `.duckdb` on EFS, but if you have largish data, EFS gets expensive.
We[2] use DuckLake in our product and we've found a few ways to mitigate the performance hit. For example, we write all data into DuckLake on blob storage, then create analytics tables on faster storage (e.g. GCP Filestore). You can have multiple storage locations in the same DuckLake catalog, so this works nicely (see the sketch after the links below).
0 - https://www.definite.app/blog/duck-takes-flight
1 - https://github.com/Query-farm/httpserver
2 - https://www.definite.app/
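A rough sketch of the layout that comment describes, assuming the ducklake extension and its documented `ATTACH 'ducklake:...'` form; the bucket, paths, and table names are placeholders, cloud credential setup is omitted, and the "fast" copy here is just an ordinary DuckDB file on a faster mount:

```python
import duckdb

con = duckdb.connect()
con.execute("INSTALL ducklake")
con.execute("LOAD ducklake")

# Raw data lands in the lake, backed by cheap blob storage.
con.execute(
    "ATTACH 'ducklake:metadata.ducklake' AS lake "
    "(DATA_PATH 'gs://my-bucket/lake/')"
)

# Derived analytics tables live in a plain DuckDB file placed on
# faster storage (e.g. a Filestore or EFS mount).
con.execute("ATTACH '/mnt/fast/analytics.duckdb' AS hot")
con.execute(
    "CREATE OR REPLACE TABLE hot.daily_rollup AS "
    "SELECT date_trunc('day', ts) AS day, count(*) AS n "
    "FROM lake.events GROUP BY 1"
)
```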
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mountp...
The others I haven't tried.
It also has a growing list of adapters, including ODBC, JDBC, ADBC, dbt, SQLAlchemy, Metabase, Apache Superset, and more.
We also just introduced a PySpark drop-in adapter, letting you run your PySpark DataFrame workloads on GizmoSQL for dramatic savings compared to Databricks on sub-5TB workloads.
Check it out at: https://gizmodata.com/gizmosql
Repo: https://github.com/gizmodata/gizmosql
SQLite3MultipleCiphers has been around for ages and is free: https://utelle.github.io/SQLite3MultipleCiphers/
And Turso Database supports encryption out of the box: https://docs.turso.tech/tursodb/encryption
I'm confident that a scheme based on tweakable block ciphers (like Adiantum or AES-XTS) could be made into a decent runtime-loadable extension.
I implemented such schemes for my Go driver, but Go is not really suited to building a runtime-loadable extension (it would have to be ported to C/Rust/Zig).
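For the curious, the core of such a scheme is small. A sketch of per-page AES-XTS using Python's cryptography package, with the page number as the tweak; the page size and key handling are illustrative, and this is not DuckDB's or SQLite's actual on-disk format:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

PAGE = 4096           # illustrative page size
key = os.urandom(64)  # AES-256-XTS takes a double-length (64-byte) key

def xts_page(page_no: int, data: bytes, decrypt: bool = False) -> bytes:
    # The tweak binds the ciphertext to its position in the file, so
    # identical plaintext pages encrypt differently and each page can be
    # read or written independently without storing per-page nonces.
    tweak = page_no.to_bytes(16, "little")
    cipher = Cipher(algorithms.AES(key), modes.XTS(tweak))
    op = cipher.decryptor() if decrypt else cipher.encryptor()
    return op.update(data) + op.finalize()

page = os.urandom(PAGE)
ct = xts_page(7, page)
assert xts_page(7, ct, decrypt=True) == page
```

Note that XTS, unlike an AEAD mode, does not authenticate pages: it trades integrity protection for needing no extra per-page space, which is why it is popular for full-disk and in-place file encryption.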
https://news.ycombinator.com/item?id=40208800
https://github.com/sqlcipher/sqlcipher