Analysis of unexpected GraphHopper Directions API downtime


Today our API was down from 10:30 to 11:15 because of database issues. The trigger was incorrect file access permissions that were applied to the wrong directory. We apologize for this!


Such access limitations are normally not a big issue, as our API can still work with a read-only database and the permissions can be reverted immediately. The API even keeps working if our database crashes; in that case only the customer dashboard is offline.

The problem today was that the root cause was not clear at the time, so we used the restore functionality of our database. The restore did not work: all customer records were overwritten, and customers suddenly had no access to the API, neither through the dashboard nor through our API endpoint.

We had to fix the restore functionality before we could properly import the backed-up data, and this took roughly 30 minutes.

We already have restore tests, but not with the full database and recent data. We will investigate how we can learn from this problem, e.g. by adding a (daily) restore test, to make 100% sure that restoring works in the future within seconds rather than minutes or, even better, is automated.
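
To make the idea of a restore test more concrete, here is a minimal sketch of what such a check could look like. It is an illustration only, not our actual setup: it uses SQLite from the Python standard library as a stand-in database, and the function names (`backup_database`, `table_row_counts`, `restore_test`) are hypothetical. The check takes a backup of the full, current database, restores it into a scratch copy, and compares the row counts of every table against the live data.

```python
import os
import sqlite3
import tempfile


def backup_database(source_path, target_path):
    """Copy a SQLite database file using SQLite's online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(target_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def table_row_counts(db_path):
    """Return a dict mapping each user table name to its row count."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name")]
        return {
            table: conn.execute(
                f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
    finally:
        conn.close()


def restore_test(live_path):
    """Back up the live database, restore it into a scratch copy,
    and report whether the restored row counts match the live ones."""
    with tempfile.TemporaryDirectory() as scratch:
        backup_path = os.path.join(scratch, "backup.db")
        restored_path = os.path.join(scratch, "restored.db")
        backup_database(live_path, backup_path)
        # "Restoring" here means loading the backup into a fresh database,
        # never into the live one.
        backup_database(backup_path, restored_path)
        return table_row_counts(live_path) == table_row_counts(restored_path)
```

A scheduled job could call `restore_test` once a day and raise an alert if it returns `False` or throws. Comparing row counts is only a coarse check; a stricter variant might compare checksums of the table contents. The important design choice is that the restore always targets a scratch copy, so a broken restore can never overwrite live customer data.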