Implementing a Kalman Filter in Postgres

neon.com

70 points by carlotasoto 5 days ago


TrackerFF - 2 days ago

Interestingly, in image 2, the filtered data seems to be worse than the actual noisy data?

Sure, the large spikes from sensor data were reduced, as seen with the blue line up in the north, which was considerably reduced, but seemingly at the cost of the more accurate tracks. We can see some "ground truth", namely the roads on the map. I think if the source of the tracks is someone moving on a road (in a car etc.), it is safe to assume that the roads will be the most likely place to find them. In that image, it seems like we're seeing the tracks of some object moving on the road.

EDIT: But nice work anyway. I work a lot with noisy GPS data for vessels, where there are no roads, only shipping routes / paths, and where increased GPS jamming in some areas makes prediction models more useful.

em500 - 2 days ago

This is unfortunately limited to 2-dimensional state/measurements. In this case the covariance matrix is a symmetric 2x2 matrix, so only 3 distinct numbers, and the required linear algebra can easily be done in a loop. The generic Kalman filter handles arbitrary dimensions, but requires general matrix multiplication and inversion, which are not easy to implement in Postgres.

Still, 2d is a useful special case, and if it addresses the problem at hand, there's no need to overbuild. (Even the 1d Kalman filter, which often boils down to exponential smoothing, is a useful special case.)
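
To illustrate the 1D remark, here is a minimal Python sketch, assuming a random-walk state model with process noise q and measurement noise r (my assumptions, not necessarily what the article uses). The names kalman_1d and exp_smooth are made up for this example. Once the gain k settles to a constant, the update x + k * (z - x) is the same recurrence as exponential smoothing with alpha = k.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state (assumed model)."""
    x, p = x0, p0
    estimates, gains = [], []
    for z in measurements:
        p = p + q              # predict: variance grows by process noise
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # update estimate toward the measurement
        p = (1.0 - k) * p      # update variance
        estimates.append(x)
        gains.append(k)
    return estimates, gains


def exp_smooth(measurements, alpha, x0=0.0):
    """Plain exponential smoothing with a fixed weight alpha."""
    x = x0
    out = []
    for z in measurements:
        x = x + alpha * (z - x)
        out.append(x)
    return out
```

The gain depends only on q, r and the starting variance, not on the data, so after a few steps kalman_1d and exp_smooth produce the same kind of output; they differ only while the gain is still settling.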

tech_ken - a day ago

Wow this is extremely cool/impressive, but if my manager asked me to implement this I'd quit lol. The "state" headaches alone seem like a nightmare, never mind all the wacky linear algebra you're going to have to hand-roll (Like does Postgres even have a matrix type?? Did you have to implement matrix inversion in SQL from scratch?? I get nauseous just thinking about it.)

edit: I guess in 2D a lot of this becomes simpler than in general high-dimensions.
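
For a sense of how much simpler 2D gets, here is a rough Python sketch of one filter step with no matrix type at all. It assumes a random-walk model, a direct measurement of both coordinates, and the same noise variance on each axis (q for process, r for measurement); those are simplifying assumptions of mine, not the article's model, and kalman_step_2d is a made-up name. The covariance is stored as three numbers (a, b, c) for the symmetric matrix [[a, b], [b, c]], and the 2x2 inverse is written out in closed form.

```python
def kalman_step_2d(x, P, z, q, r):
    """One predict/update step; x and z are (x1, x2), P is (a, b, c)."""
    a, b, c = P
    # Predict: random walk, so only the diagonal variances grow.
    a += q
    c += q
    # Innovation covariance S = P + r*I; invert it via the determinant.
    det = (a + r) * (c + r) - b * b
    # Gain K = P * inverse(S); under these assumptions K is symmetric.
    k11 = (a * (c + r) - b * b) / det
    k12 = b * r / det
    k22 = (c * (a + r) - b * b) / det
    # Update the state with the innovation (measurement minus prediction).
    y1 = z[0] - x[0]
    y2 = z[1] - x[1]
    x_new = (x[0] + k11 * y1 + k12 * y2,
             x[1] + k12 * y1 + k22 * y2)
    # Updated covariance (I - K) * P simplifies to r * K in this setup.
    P_new = (r * k11, r * k12, r * k22)
    return x_new, P_new
```

Everything here is additions, multiplications and one division by det, which is the kind of arithmetic that fits in plain SQL expressions. A model with velocity in the state or correlated measurement noise would need more terms than this.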

fifilura - 2 days ago

I have done this with AWS Athena. At the end of the day a Kalman filter is just a number of multiplications and divisions.

My version would calculate one step at a time, so it is a bit simplified (that was a requirement: processing one measurement of incoming data daily). It was also only in one dimension (here it is two).

For the offline version (calculating many steps in a chunk), I'd imagine I'd use the array functions in Athena. But it may very well be possible to recreate it using window functions. The state is just one or more extra columns, after all.
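
A small Python sketch of the "state is just extra columns" idea, using a running fold over the measurements. This is only an analogy for how a chunked version could be structured, not Athena or Postgres code; the 1D random-walk model, the default q and r values, and the names step and filter_rows are all assumptions for illustration.

```python
from itertools import accumulate


def step(row, z, q=0.01, r=1.0):
    """Take the previous output row and a new measurement, return the next row."""
    x = row["x"]
    p = row["p"] + q           # predict
    k = p / (p + r)            # gain
    return {"z": z, "x": x + k * (z - x), "p": (1.0 - k) * p}


def filter_rows(measurements, x0=0.0, p0=1.0):
    """Each output row is the measurement plus the state columns x and p."""
    seed = {"z": None, "x": x0, "p": p0}
    rows = accumulate(measurements, step, initial=seed)
    return list(rows)[1:]      # drop the seed row
```

Each row depends only on the previous row and the current measurement, so the daily one-step version and the chunked version share the same step function.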