Availability issues in cluster fi-perko

Incident Report for Seravo

Postmortem

Notice of a Server Outage on March 12, 2026

On Thursday, March 12, 2026, excessive server resource usage caused an outage in the fi-perko cluster. The root cause was a newly implemented security feature that increased resource usage. Monitoring did not detect and pinpoint the increase in time to prevent the outage.

The outage was aggravated by an unrelated networking change, which delayed recovery.

All sites in the fi-perko cluster were affected. Our on-call monitoring processes detected the outage itself immediately, and recovery measures were started at once.

The disruption began as a partial outage at 03:47 (UTC+2), escalated at 05:00, and ended at 08:19 (UTC+2). We apologise for any inconvenience caused by the disruption.

Timeline (all timestamps in EET, UTC+2)

  • 12.3.2026 03:47 Partial outage
  • 12.3.2026 03:52 On-call performs initial recovery measures
  • 12.3.2026 04:00 Recovered from partial outage
  • 12.3.2026 05:00 Incident escalated again
  • 12.3.2026 05:10 On-call attempts recovery measures
  • 12.3.2026 05:15 Full outage
  • 12.3.2026 05:27 On-call performs recovery measures
  • 12.3.2026 06:17 Recovered from the outage; site recovery begins
  • 12.3.2026 08:19 All affected sites were back online
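
The durations implied by the timeline can be worked out with a short script. This is a minimal sketch that uses only the timestamps listed above; the event labels and helper names are illustrative choices and are not part of any Seravo tooling.

```python
from datetime import datetime, timedelta, timezone

# EET is UTC+2, as stated in the timeline heading.
EET = timezone(timedelta(hours=2))

def at(hhmm: str) -> datetime:
    """Build a timezone-aware timestamp on 12.3.2026 from an HH:MM string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2026, 3, 12, hour, minute, tzinfo=EET)

# Timestamps copied from the timeline; the dictionary keys are illustrative labels.
events = {
    "partial_outage": at("03:47"),
    "partial_recovered": at("04:00"),
    "full_outage": at("05:15"),
    "cluster_recovered": at("06:17"),
    "all_sites_online": at("08:19"),
}

def minutes_between(start: str, end: str) -> int:
    """Whole minutes elapsed between two named events."""
    return int((events[end] - events[start]).total_seconds() // 60)

partial_outage_minutes = minutes_between("partial_outage", "partial_recovered")
full_outage_minutes = minutes_between("full_outage", "cluster_recovered")
site_recovery_minutes = minutes_between("cluster_recovered", "all_sites_online")
total_minutes = minutes_between("partial_outage", "all_sites_online")

print(partial_outage_minutes, full_outage_minutes, site_recovery_minutes, total_minutes)

# Convert a timeline entry to UTC to compare it with the status update stamps.
print(at("05:10").astimezone(timezone.utc).strftime("%H:%M"))
```

Under these timestamps, the partial outage lasted 13 minutes, the full outage 62 minutes, and site recovery 122 minutes, for 272 minutes from first disruption to all sites online. The 05:10 entry corresponds to 03:10 UTC, the time of the first status update.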

Follow-Up Action

As a result of the incident, we at Seravo have identified the need for the following measures:

  • Testing new features with extended data sets to catch issues caused by the sheer volume of processed data
  • Improvements to system resource monitoring
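
One common way to catch a rise in resource usage before it becomes an outage is to alert only when usage stays above a threshold for several consecutive samples. The sketch below is purely illustrative: the function name, the 85 percent threshold, the three-sample window, and the sample readings are all assumptions, and it does not describe Seravo's actual monitoring.

```python
from typing import Iterable, Optional

def first_sustained_breach(
    samples: Iterable[float],
    threshold: float = 85.0,   # assumed alert level in percent, not from the report
    min_consecutive: int = 3,  # assumed number of consecutive samples required
) -> Optional[int]:
    """Return the index of the sample at which usage has been above
    `threshold` for `min_consecutive` samples in a row, or None."""
    streak = 0
    for index, value in enumerate(samples):
        # Extend the streak on a breach, reset it otherwise.
        streak = streak + 1 if value > threshold else 0
        if streak >= min_consecutive:
            return index
    return None

# Hypothetical CPU usage readings in percent, invented for illustration.
usage = [40.0, 55.0, 90.0, 70.0, 88.0, 91.0, 95.0, 99.0]
alert_index = first_sustained_breach(usage)
print(alert_index)
```

Requiring several consecutive breaches trades a slightly later alert for fewer false alarms from short spikes; a real deployment would tune both values against observed load.
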
Posted Mar 13, 2026 - 09:37 UTC

Resolved

This incident has been resolved.
Posted Mar 12, 2026 - 07:47 UTC

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Mar 12, 2026 - 06:21 UTC

Identified

The issue has been identified and a fix is being implemented.
Posted Mar 12, 2026 - 04:33 UTC

Investigating

We are currently investigating this issue.
Posted Mar 12, 2026 - 03:10 UTC
This incident affected: Finland (fi-perko cluster).