On Thursday, March 12, 2026, a failure related to server resource usage caused an outage in the fi-perko cluster. The root cause was a newly implemented security feature that increased resource usage. Monitoring failed to detect and pinpoint the increase in time to prevent the outage.
The outage was aggravated by an unrelated networking change, which delayed recovery.
The outage affected all sites in the fi-perko cluster. Our on-call monitoring processes noticed the problem immediately, and recovery measures were started right away.
The disruption began at 04:00 (UTC+2), escalated at 05:00, and ended at 08:19 (UTC+2). We apologise for any inconvenience this caused.
As a result of the incident, we at Seravo have identified the need for the following measures: