Implementing a Kalman Filter in Postgres

Posted 3 months ago · Active 3 months ago

carlotasoto

71 points

12 comments

neon.com · Tech · story

Key topics: Kalman Filter, Postgres, GPS Data Smoothing, SQL

The article discusses implementing a Kalman Filter in Postgres to smooth GPS data, sparking a discussion on the challenges and trade-offs of using this technique in a database.

Key moments

Story posted: Sep 26, 2025 at 2:33 PM EDT
First comment: Sep 30, 2025 at 6:42 AM EDT (4d after posting)
Peak activity: 6 comments in 84-90h (hottest window of the conversation)
Latest activity: Sep 30, 2025 at 4:26 PM EDT

Discussion (12 comments)

TrackerFF

3 months ago

2 replies

Interestingly, in image 2, the filtered data seems to be worse than the actual noisy data?

Sure, the large spikes from sensor data were reduced, as seen with the blue line up in the north, which was considerably reduced, but seemingly at the cost of the more accurate tracks. We can see some "ground truth", namely the map roads. I think if the source of the tracks is someone moving on a road (in a car etc.), it is safe to assume that the roads will be the most likely place to find them. In that image, it seems like we're seeing the tracks of some object moving on the road.

EDIT: But nice work anyway, I work a lot with noisy GPS data for vessels, where there are no roads - only shipping routes / paths, and increased GPS jamming in some areas makes prediction models more useful.

n4r9

3 months ago

1 reply

Yeah, this sounds like a way to "smooth" the GPS trail to remove anomalies quickly, without paying attention to the road network.

The problem of snapping a noisy GPS trail to the road network is known as map-matching. Good map-matching algorithms tend to use hidden Markov models, which are sort of like discrete Kalman filters. The state of the model is something like "which road segment is the truck on", and the predictive step employs routing algorithms to calculate transition probabilities between states. This is a dynamic algorithm that can be done on the fly - i.e. as each GPS point comes in - but I'd be very reluctant to do it in postgres.
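To make the hidden-Markov-model idea concrete, here is a toy sketch, not a production map-matcher. It assumes two hypothetical parallel road segments reduced to a single offset each, a Gaussian GPS error, and a fixed "stay on the same segment" probability standing in for the routing-based transition probabilities described above. All names and numbers are illustrative assumptions, and this version decodes a whole batch rather than point by point.

```python
import math

# Hypothetical road segments, each reduced to one y-offset in metres
# (two parallel east-west roads). Purely illustrative.
SEGMENTS = {"main_road": 0.0, "side_road": 10.0}
GPS_SIGMA = 8.0    # assumed GPS noise standard deviation (metres)
STAY_PROB = 0.95   # assumed probability of staying on the same segment


def log_emission(segment, y):
    """Gaussian log-likelihood of offset y given the segment.

    The normalisation constant is dropped because it is the same for
    every segment and does not affect the ranking.
    """
    d = y - SEGMENTS[segment]
    return -0.5 * (d / GPS_SIGMA) ** 2


def log_transition(prev, cur):
    """Stand-in for a routing-derived transition probability."""
    switch_prob = (1.0 - STAY_PROB) / (len(SEGMENTS) - 1)
    return math.log(STAY_PROB if prev == cur else switch_prob)


def viterbi(observations):
    """Most likely segment sequence for a list of observed y-offsets."""
    names = list(SEGMENTS)
    score = {s: log_emission(s, observations[0]) for s in names}
    back = []
    for y in observations[1:]:
        new_score, pointers = {}, {}
        for cur in names:
            best_prev = max(names, key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (score[best_prev]
                              + log_transition(best_prev, cur)
                              + log_emission(cur, y))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Walk the back-pointers from the best final state.
    state = max(names, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def nearest(observations):
    """Naive snapping: closest segment for each point independently."""
    return [min(SEGMENTS, key=lambda s: abs(y - SEGMENTS[s])) for y in observations]
```

With the offsets `[1.0, -2.0, 6.0, 0.5, 1.5]`, `nearest` moves the third point to `side_road`, while `viterbi` keeps the whole trail on `main_road` because two segment switches are far less likely than one noisy reading.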

foota

3 months ago

1 reply

So apps like Google maps do this? I'm always surprised when it jumps between roads. Like... You knew I've been on this road for the last ten minutes, you think I'm going to teleport into the tunnel beneath me?

n4r9

3 months ago

I'm not sure about Maps to be honest, but that sort of glitch is a strong indicator that they're just snapping to the nearest current road rather than doing proper routing calculations.

My Toyota has a speed limit symbol on the dashboard which will occasionally show the speed of a slip-road going onto the motorway I'm already on. I'm guessing it's a similar phenomenon.

whilenot-dev

3 months ago

I share the confusion. It depends on the measuring intent I guess, and it'd have been nice to say something about that and include some kind of indicator for these outliers. Here's the thing in Google Maps: https://www.google.com/maps/@47.1745904,7.2745602,14z/data=!...

From looking at the company website[0] I'd assume the goal could've been to get a better estimate about the total distance travelled during tracking analysis? Keeping that goal in mind, the error from the outliers was reduced significantly without causing too much disturbance on the accurate data. Nonetheless, including further measurements from speedo- and odometer in the sensor fusion at certain intervals would make this goal redundant and provide an even better estimate.

[0]: https://traconiq.ch/
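As a rough illustration of why outliers matter for a distance estimate, the sketch below sums haversine distances along a track. The coordinates are made-up points near the map location linked above, and the spherical-Earth radius is the usual approximation; none of this comes from the article itself.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_length_m(points):
    """Total length of a track given as a list of (lat, lon) points."""
    return sum(haversine_m(a, b) for a, b in zip(points, points[1:]))


# Hypothetical tracks: the second has one GPS spike in the middle.
clean_track = [(47.170, 7.270), (47.171, 7.270), (47.172, 7.270)]
spiky_track = [(47.170, 7.270), (47.171, 7.290), (47.172, 7.270)]
```

A single spike of about 0.02 degrees of longitude makes `track_length_m(spiky_track)` more than ten times `track_length_m(clean_track)`, which is the kind of error that smoothing would reduce.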

em500

3 months ago

1 reply

This is unfortunately limited to 2-dimensional state/measurements. In this case the covariance matrix is only 3 numbers, so the required linear algebra can easily be done in a loop. The generic Kalman filter handles arbitrary dimensions, but requires general matrix multiplication and inversion, which are not easy to implement in Postgres.
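A minimal sketch of what "only 3 numbers" means in practice, assuming a random-walk motion model where both the state transition and the measurement matrix are the identity. The article's actual model may differ, and the noise variances `q` and `r` are placeholder values. The symmetric 2x2 covariance is stored as `pxx`, `pxy`, `pyy`, and the matrix inverse is the closed-form 2x2 formula, so only scalar arithmetic is needed.

```python
def kalman_step_2d(state, z, q=0.5, r=25.0):
    """One predict+update step for a 2D position filter.

    state = (x, y, pxx, pxy, pyy); z = (zx, zy).
    q and r are assumed process and measurement noise variances.
    """
    x, y, pxx, pxy, pyy = state
    zx, zy = z

    # Predict: random-walk model, so the estimate is unchanged and
    # only the uncertainty grows.
    pxx += q
    pyy += q

    # Innovation covariance S = P + R (symmetric 2x2 with entries a, b, d).
    a, b, d = pxx + r, pxy, pyy + r
    det = a * d - b * b

    # Kalman gain K = P * inverse(S), using the closed-form 2x2 inverse.
    k11 = (pxx * d - pxy * b) / det
    k12 = (pxy * a - pxx * b) / det
    k21 = (pxy * d - pyy * b) / det
    k22 = (pyy * a - pxy * b) / det

    # Correct the estimate with the innovation (measurement minus prediction).
    ix, iy = zx - x, zy - y
    x += k11 * ix + k12 * iy
    y += k21 * ix + k22 * iy

    # Covariance update P = (I - K) * P, keeping the three unique entries.
    new_pxx = (1 - k11) * pxx - k12 * pxy
    new_pxy = (1 - k11) * pxy - k12 * pyy
    new_pyy = (1 - k22) * pyy - k21 * pxy
    return (x, y, new_pxx, new_pxy, new_pyy)
```

Every line is an addition, multiplication, or division on scalars, which is why this special case translates to SQL without a matrix type.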

Still, 2d is a useful special case, and if it addresses the problem at hand, there's no need to overbuild. (Even the 1d Kalman filter, which often boils down to exponential smoothing, is a useful special case.)
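The link between the 1D filter and exponential smoothing can be checked numerically. In this sketch (scalar random-walk model, with assumed noise variances `q` and `r`), the Kalman gain settles to a constant, and a filter with constant gain `k` is exactly the exponential smoothing update `x += k * (z - x)`.

```python
import math


def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the estimates and the gain used at each step."""
    x, p = x0, p0
    estimates, gains = [], []
    for z in measurements:
        p += q                # predict: uncertainty grows by the process noise
        k = p / (p + r)       # Kalman gain
        x += k * (z - x)      # update the estimate
        p *= (1 - k)          # update the uncertainty
        estimates.append(x)
        gains.append(k)
    return estimates, gains


def steady_state_gain(q, r):
    """Limit of the gain, from the fixed point p_pred**2 - q*p_pred - q*r = 0."""
    p_pred = (q + math.sqrt(q * q + 4 * q * r)) / 2
    return p_pred / (p_pred + r)


def exponential_smoothing(measurements, alpha, x0=0.0):
    """Classic exponential smoothing with a fixed smoothing factor alpha."""
    x = x0
    out = []
    for z in measurements:
        x += alpha * (z - x)
        out.append(x)
    return out
```

After the first few steps the gains returned by `kalman_1d` match `steady_state_gain(q, r)`, so from then on the two filters apply the same update rule.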

fifilura

3 months ago

1 reply

I'd imagine 90% of the kalman filters out there are for 2 or maybe 3 dimensions, since the use case is mostly this, determining a position.

Where the filter fails is when there is not a single "true" answer to aim for, but many true answers. A position is clearly defined as long as it is not quantum physics.

thekoma

3 months ago

1 reply

Yeah. Using the Kalman filter just to determine the position from noisy position measurements really undercuts the capability of the filter to use system physics to estimate the true state.

In one of the most common applications of Kalman filters, autonomous robots (e.g., a robot vacuum or a commercial drone), the filters are around 9 to 12 dimensions.

em500

3 months ago

1 reply

Right, in addition to the position you usually want the velocity, and sometimes also the acceleration, in all dimensions. More ambitious (or optimistic) practitioners could add more sensor measurements, like gyroscopes.
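To illustrate adding velocity to the state, here is a sketch of a constant-velocity filter along a single axis. The state is `(pos, vel)` and only position is measured, so the filter has to infer velocity from the motion model. The process noise is simplified to a single variance `q` added to the velocity term, and `q`, `r`, and `dt` are assumed values; a fuller model would add acceleration and repeat this per axis.

```python
def cv_kalman_step(state, z, dt=1.0, q=0.01, r=1.0):
    """One step of a constant-velocity Kalman filter on one axis.

    state = (pos, vel, ppp, ppv, pvv), where the last three values are
    the unique entries of the symmetric 2x2 covariance. z is a measured position.
    """
    pos, vel, ppp, ppv, pvv = state

    # Predict with F = [[1, dt], [0, 1]]: position advances by vel * dt.
    pos += vel * dt
    ppp = ppp + 2 * dt * ppv + dt * dt * pvv
    ppv = ppv + dt * pvv
    pvv = pvv + q  # simplified process noise on velocity only

    # Update with H = [1, 0]: only position is observed.
    s = ppp + r
    k_pos, k_vel = ppp / s, ppv / s
    innovation = z - pos
    pos += k_pos * innovation
    vel += k_vel * innovation

    # Covariance update P = (I - K H) P, written out entry by entry.
    new_ppp = (1 - k_pos) * ppp
    new_ppv = (1 - k_pos) * ppv
    new_pvv = pvv - k_vel * ppv
    return (pos, vel, new_ppp, new_ppv, new_pvv)


def run_cv_filter(measurements, dt=1.0):
    """Run the filter from a deliberately vague initial state."""
    state = (0.0, 0.0, 1000.0, 0.0, 1000.0)
    for z in measurements:
        state = cv_kalman_step(state, z, dt=dt)
    return state
```

Fed positions from an object moving at a steady 2 units per step, the velocity estimate converges to about 2 even though velocity is never measured directly.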

fifilura

3 months ago

You are right of course and I was out of my depth. I wonder if the vector types now being added to databases for ML/AI stuff could help with this.

tech_ken

3 months ago

Wow this is extremely cool/impressive, but if my manager asked me to implement this I'd quit lol. The "state" headaches alone seem like a nightmare, never mind all the wacky linear algebra you're going to have to hand-roll (Like does Postgres even have a matrix type?? Did you have to implement matrix inversion in SQL from scratch?? I get nauseous just thinking about it.)

edit: I guess in 2D a lot of this becomes simpler than in general high-dimensions.

fifilura

3 months ago

I have done this with AWS Athena. At the end of the day a kalman filter is just a number of multiplications and divisions.

My version would calculate one step at a time, so it is a bit simplified (that was a requirement: processing one measurement of incoming data daily). It was also only in one dimension (here it is two).

For the offline version (calculating many steps in a chunk), I'd imagine I'd use the array functions in Athena. But it may very well be possible to recreate it using window functions. The state is just one or more extra columns, after all.
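One way to picture "the state is just more columns" is a recursive query that carries the estimate `x` and its variance `p` from row to row. The sketch below is an assumption-heavy illustration, not the commenter's Athena code or the article's Postgres code. It runs a 1D filter in SQLite through Python's standard `sqlite3` module, with a hypothetical `readings(step, z)` table and placeholder noise variances. Postgres also supports `WITH RECURSIVE`, so a similar query should be portable, though that is untested here.

```python
import sqlite3

# Each recursive row holds the filter state after one measurement.
# p_pred = p + q, gain = p_pred / (p_pred + r), p_post = p_pred * r / (p_pred + r).
KALMAN_SQL = """
WITH RECURSIVE kf(step, x, p) AS (
    SELECT 0, :x0, :p0
    UNION ALL
    SELECT m.step,
           kf.x + (kf.p + :q_var) / (kf.p + :q_var + :r_var) * (m.z - kf.x),
           (kf.p + :q_var) * :r_var / (kf.p + :q_var + :r_var)
    FROM kf JOIN readings AS m ON m.step = kf.step + 1
)
SELECT step, x, p FROM kf WHERE step > 0 ORDER BY step
"""


def run_filter(measurements, q, r, x0=0.0, p0=1000.0):
    """Load measurements into an in-memory table and filter them in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (step INTEGER PRIMARY KEY, z REAL)")
    conn.executemany(
        "INSERT INTO readings (step, z) VALUES (?, ?)",
        list(enumerate(measurements, start=1)),
    )
    params = {"x0": float(x0), "p0": float(p0), "q_var": float(q), "r_var": float(r)}
    rows = conn.execute(KALMAN_SQL, params).fetchall()
    conn.close()
    return rows  # list of (step, x, p) tuples
```

Calling `run_filter([10.0, 12.0, 11.0], q=0.01, r=4.0)` returns one `(step, x, p)` row per measurement. The one-step-at-a-time variant described above would instead read the latest stored `(x, p)` row and write a single new one.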

ID: 45389589 · Type: story · Last synced: 11/20/2025, 2:35:11 PM