Sedonadb: a New Geospatial Dataframe Library Written in Rust
Posted 3 months ago · Active 3 months ago
sedona.apache.org · Tech story
Key topics
Geospatial Data
Rust Programming
Data Analysis
SedonaDB, a new geospatial DataFrame library written in Rust, is introduced, sparking discussion about its necessity and advantages over existing solutions like PostGIS and DuckDB.
Snapshot generated from the HN discussion
Key moments
- Story posted: Sep 24, 2025 at 12:00 PM EDT
- First comment: Sep 24, 2025 at 12:29 PM EDT (28m after posting)
- Peak activity: 33 comments in the 0-6h window
- Latest activity: Sep 26, 2025 at 3:08 PM EDT
ID: 45362206 · Type: story · Last synced: 11/20/2025, 6:33:43 PM
I thought Apache Sedona is implemented in Java/Scala for distributed runtimes like Spark and Flink. Wouldn't Rust tooling for interactive use be built atop a completely different stack?
Surely they're the same? Two sedona projects is one thing, but two apache sedona projects is sheer madness?
What does it do besides being written in Rust?
There are other good alternatives, such as GeoPandas and DuckDB Spatial. SedonaDB has Python/SQL APIs and is very fast. New features like full raster support and compatibility with lakehouse formats are coming soon!
As someone who has had to use geopandas a lot, having something which is up to an order of magnitude faster is a real dream come true.
From the README:
> Update (August 2024): GeoPolars is blocked on Polars supporting Arrow extension types, which would allow GeoPolars to persist geometry type information and coordinate reference system (CRS) metadata. It's not feasible to create a geopolars.GeoDataFrame as a subclass of a polars.DataFrame (similar to how the geopandas.GeoDataFrame is a subclass of pandas.DataFrame) because polars explicitly does not support subclassing of core data types.
My bet is that most of the actually useful spatial ST_ functions are not implemented in this one, just as they are not in the DuckDB offering.
In terms of number of functions, PostGIS is still the leader, but for analytical work it is important to have the core functions (spatial relationships, distances, etc.) in place in these systems. DuckDB started this, but SedonaDB has a spatially focused engine. You can use the two together: PostGIS for transactional processing and queries, and SedonaDB for processing and data prep.
A combination of tools makes a lot of sense here especially as the data starts to grow.
Postgres has made gigantic leaps in recent years, both in performance and in feature set. I don't think comparing the new contenders with daddy is ever fair. But then there are the DuckDB advocates who claim it pioneered spatial, which is very much not true.
Postgres is an amazing system, and it is also available for free. We don't have too many of these, and not many that are aging that well.
What am I missing? The API even looks the same.
For example, what if I wanted to define a 4D region called (fish, towel, mouse, alien), with a float for each of fish/towel/mouse/alien?
{ "type": "EngineeringCRS", "name": "Fish, Towel, Mouse", "datum": {"name": "Wet Kitty + Mouse In Peril"}, "coordinate_system": { "subtype": "Cartesian", "axis": [ {"name": "Fish", "abbreviation": "F", "direction": "east"}, {"name": "Towel", "abbreviation": "T", "direction": "north"}, {"name": "Mouse", "abbreviation": "M", "direction": "up"} ] } }
(Subject to the limitations of PROJJSON, such as a 4D CRS having a temporal axis and a limited set of acceptable "direction" values)
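A minimal Python sketch of sanity-checking such a definition with only the standard library: it parses a PROJJSON-style document and reports any axis whose direction falls outside a small allow-list. The allow-list and the `check_axes` helper are illustrative assumptions, not the full set of directions from the PROJJSON schema and not part of any library.

```python
import json

# Assumed, illustrative subset of axis directions; the real schema defines more.
ASSUMED_DIRECTIONS = {"north", "south", "east", "west", "up", "down"}

crs_text = """
{
  "type": "EngineeringCRS",
  "name": "Fish, Towel, Mouse",
  "datum": {"name": "Wet Kitty + Mouse In Peril"},
  "coordinate_system": {
    "subtype": "Cartesian",
    "axis": [
      {"name": "Fish", "abbreviation": "F", "direction": "east"},
      {"name": "Towel", "abbreviation": "T", "direction": "north"},
      {"name": "Mouse", "abbreviation": "M", "direction": "up"}
    ]
  }
}
"""


def check_axes(text):
    """Return (axis names, names of axes with an unrecognised direction)."""
    crs = json.loads(text)  # raises json.JSONDecodeError on malformed input
    axes = crs["coordinate_system"]["axis"]
    names = [axis["name"] for axis in axes]
    rejected = [
        axis["name"] for axis in axes
        if axis["direction"] not in ASSUMED_DIRECTIONS
    ]
    return names, rejected


axis_names, rejected = check_axes(crs_text)
print(axis_names, rejected)
```

Running the same check on a variant with a made-up direction is a quick way to see where a custom axis would be rejected.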
1. The latitude/longitude ordering for points differs from PostGIS and most standard geospatial libraries, which creates friction due to muscle memory.
2. Anecdotal: spatial joins haven't matched PostGIS performance for similar operations, though this may vary by use case and data size.
3. The spatial extension has a backlog of long-standing GitHub issues.
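One way to reduce the axis-order friction from point 1, sketched in Python under stated assumptions: keep coordinates in a named structure and convert at the boundary, so the order is never implied by argument position. The `LonLat` type and the `from_lat_lon` and `to_wkt` helpers are hypothetical names for illustration. The sketch assumes the target expects x = longitude, y = latitude, and makes no claim about which order any particular engine uses.

```python
from typing import NamedTuple


class LonLat(NamedTuple):
    """A point stored with explicitly named axes (hypothetical helper type)."""
    lon: float
    lat: float


def from_lat_lon(lat, lon):
    """Build a point from (lat, lon) input, rejecting out-of-range values."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return LonLat(lon=lon, lat=lat)


def to_wkt(point):
    """Serialise as WKT in x y order, assuming x is longitude and y is latitude."""
    return f"POINT ({point.lon} {point.lat})"


paris = from_lat_lon(48.8566, 2.3522)
print(to_wkt(paris))
```

The range check only catches swaps where the mistaken latitude exceeds 90 degrees in magnitude, so it is a safety net rather than a guarantee.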
I don't get to do geospatial work as much anymore, but I would have killed for this just a year ago.
That is to say, if the issue is DuckDB running out of memory, it is most likely because the Rust implementation uses memory more efficiently for whatever query is crashing DuckDB, rather than because it handles memory allocation failure gracefully.
While it is possible in C++ to gracefully handle memory allocation failure, it is not really a thing in Rust. I'm not even sure whether it is possible to catch_unwind it. I say this as a Rust person who doesn't fancy C++ in the slightest...
I wouldn't wager a nickel on someone's life if it depended on embedded STL usage.
Here are the queries: https://github.com/apache/sedona-spatialbench/blob/main/prin...
They should be fairly easy to replicate!
It will be great to have more options in this space, especially if it offers a smooth transition from single-node/local interactions to multi-node scale-out.
SedonaDB currently supports SQL, Python, R, and Rust APIs. We can support APIs for other languages in the future. That's another nice part about Rust. There are lots of libraries to expose other language bindings to Rust projects.
It comes as a disappointment to me that SedonaDB hasn’t adopted a similar approach.
The Apache stack provides everything needed, but for small tasks I would rather not use SQL.
While PostGIS is often used for spatial analytics because of its rich spatial function coverage, it is fundamentally a transactional database. This design makes it less suited for analytical query performance, and including it directly in SpatialBench would risk claims of being an “apples-to-oranges” comparison. That’s why we exclude PostGIS from the published benchmark results.
That said, we do continuously validate against PostGIS. For every single function in SedonaDB, we maintain an automated PyTest benchmark framework (https://github.com/apache/sedona-db/tree/main/benchmarks) that compares both speed and correctness against DuckDB and PostGIS. This ensures we catch regressions early and guarantees correctness. You can even run these benchmarks yourself to see how SedonaDB performs. It is often extremely fast in practice.
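To illustrate the general idea of checking speed and correctness together, here is a minimal Python sketch. It is not the project's benchmark framework. The `compare_engines` helper and the engine names are hypothetical, and the two "engines" are stand-in pure-Python implementations of polygon area rather than real database calls.

```python
import time


def area_loop(ring):
    """Shoelace area of a closed ring, written as an explicit loop."""
    total = 0.0
    for i in range(len(ring) - 1):
        x1, y1 = ring[i]
        x2, y2 = ring[i + 1]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def area_zip(ring):
    """Same shoelace formula, written with zip over consecutive vertices."""
    pairs = zip(ring, ring[1:])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2


def compare_engines(engines, args, repeats=3, tolerance=1e-9):
    """Run every engine on the same input; return results, best timings, agreement."""
    results, timings = {}, {}
    for name, fn in engines.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            out = fn(*args)
            best = min(best, time.perf_counter() - start)
        results[name] = out
        timings[name] = best
    reference = next(iter(results.values()))
    agree = all(abs(value - reference) <= tolerance for value in results.values())
    return results, timings, agree


# Closed 4 x 3 rectangle: first and last vertex are the same.
square = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
engines = {"engine_a": area_loop, "engine_b": area_zip}  # hypothetical names
results, timings, agree = compare_engines(engines, (square,))
print(results, agree)
```

In a real setup each callable would wrap a query against one engine, and the agreement check would need a geometry-aware comparison rather than a float tolerance.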