Friday evening, 6:30 p.m.: the central ERP server of a logistics company no longer starts after a failed firmware update. Dispatching is paralysed, truck drivers receive no route plans, and warehouse management falls back to handwritten lists. The system is only back up on Monday morning, after almost three days of downtime and a six-figure loss in revenue.
System failures often hit organisations when they are least prepared. The BSI lists the failure of devices or systems as elementary threat G 0.25. It is one of the most broadly linked threats in the entire IT-Grundschutz catalogue, with references to more than 200 modules.
What’s behind it?
Every technical device has a finite lifespan and can fail at any time — through wear, defects, misuse or external influences. For time-critical applications with no fallback, a single device failure quickly escalates into a company-wide problem.
Failure scenarios
- Hardware defects: Hard drives, power supplies, memory modules and mainboards are subject to physical wear. Once a system is past its planned service life, the probability of failure rises sharply.
- Faulty updates: Firmware intended for a different system type can leave a device unbootable. The same applies to operating system updates that are incompatible with existing drivers.
- Power supply problems: Voltage spikes, interruptions or faulty UPS systems can cause abrupt shutdowns. File system inconsistencies after a hard shutdown often prevent a fast restart.
- Environmental influences: Overheating due to failed air conditioning, humidity, dust or mechanical shock can damage sensitive components.
- Dependency chains: When a single storage controller that serves multiple virtual machines fails, the impact multiplies across the entire infrastructure.
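The dependency-chain scenario can be made concrete with a small graph traversal. The following is a minimal sketch, not a real inventory tool: the component names and the `DEPENDS_ON` map are hypothetical, and it assumes every dependency is a hard one with no redundant alternative.

```python
from collections import deque

# Hypothetical inventory: each component maps to the components it depends on.
DEPENDS_ON = {
    "vm-erp": ["storage-ctrl-1"],
    "vm-mail": ["storage-ctrl-1"],
    "vm-web": ["storage-ctrl-2"],
    "erp-app": ["vm-erp"],
    "dispatch-ui": ["erp-app"],
}


def affected_by(failed, depends_on):
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the map: component -> components that depend on it.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(component)

    # Breadth-first walk from the failed component through its dependents.
    affected, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


print(sorted(affected_by("storage-ctrl-1", DEPENDS_ON)))
```

In this made-up inventory, the failure of one storage controller takes down two virtual machines and, through them, the ERP application and the dispatching front end. Running such a check against a real asset list is one way to find single points of failure before they fail.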
Impact
The damage depends directly on how time-critical the affected application is and whether fallback options exist. When production control, point-of-sale systems, email servers or VoIP systems fail, economic damage can accrue by the second. On top of the direct costs (recovery, replacement procurement) come indirect consequences: missed delivery deadlines, contractual penalties and reputational loss.
Practical examples
Storage controller in the data centre. An internet service provider runs its web servers on a central storage system. A power supply fault shuts down the array. Although the actual defect is fixed within an hour, the servers cannot be restarted because of file system inconsistencies. Several customer systems remain unreachable for days.
Firmware update with wrong image. An administrator flashes a network switch with a firmware image intended for a different model. The switch no longer starts, and the entire floor loses network connectivity. Because no replacement device is in stock, resolution takes three days.
Air conditioning fails unnoticed. In a medium-sized company’s server room, the air conditioning fails over the weekend. Temperature rises gradually. By Monday, two servers have failed with hard drive errors, and several RAID arrays must be painstakingly reconstructed.
Relevant controls
The following ISO 27001 controls mitigate this threat. (You’ll find the complete list of 47 mapped controls below in the section ‘ISO 27001 Controls Covering This Threat’.)
Prevention:
- A.8.14 — Redundancy of information processing facilities: Redundant design of critical systems (clusters, mirrored storage, dual power supplies) prevents a single defect from interrupting the service.
- A.7.11 — Supporting utilities: Uninterruptible power supply, emergency generators and monitored climate control protect against environmental failures.
- A.7.12 — Cabling security: Protected cabling and redundant network paths avoid single points of failure in the physical infrastructure.
- A.8.6 — Capacity management: Monitoring and timely scaling prevent failures caused by resource exhaustion.
- A.5.29 — Information security during disruption: Pre-planned continuity measures ensure that critical processes continue to run during a failure.
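The benefit of redundancy can be estimated with the standard availability formula for parallel components, 1 - (1 - A)^n. The sketch below is a rough estimate only: the 99 % figure is a hypothetical example, and the formula assumes failures are independent, which shared power feeds or a shared storage controller would violate.

```python
def parallel_availability(single, n):
    """Availability of n redundant components where one survivor is enough.

    Assumes independent failures; shared power or shared storage breaks this.
    """
    if not 0.0 <= single <= 1.0 or n < 1:
        raise ValueError("single must be in [0, 1] and n must be >= 1")
    return 1.0 - (1.0 - single) ** n


def downtime_hours_per_year(availability):
    """Expected downtime in hours for a 365-day year."""
    return (1.0 - availability) * 365 * 24


# Hypothetical example: one server at 99 % availability versus a two-node cluster.
for nodes in (1, 2):
    a = parallel_availability(0.99, nodes)
    print(f"{nodes} node(s): {a:.4%} available, "
          f"~{downtime_hours_per_year(a):.1f} h/year down")
```

Under these assumptions, a second node reduces the expected downtime from roughly 88 hours per year to under one hour. The real gain is smaller whenever both nodes share a component that can fail.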
Detection:
- A.8.15 — Logging: Central logging captures hardware warnings, temperature alerts and error conditions before a total outage occurs.
- A.8.16 — Monitoring activities: Active monitoring (SNMP traps, health checks, heartbeats) detects impending failures early.
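A heartbeat check, one of the monitoring techniques named above, can be sketched in a few lines. This is an illustration under stated assumptions, not a replacement for a monitoring system: the host names, the 90-second timeout and the in-memory `last_seen` map are all hypothetical.

```python
import time

HEARTBEAT_TIMEOUT_S = 90  # hypothetical: three missed 30-second heartbeats


def stale_hosts(last_seen, now=None, timeout=HEARTBEAT_TIMEOUT_S):
    """Return hosts whose last heartbeat is older than `timeout` seconds.

    `last_seen` maps host name -> Unix timestamp of the most recent heartbeat.
    """
    if now is None:
        now = time.time()
    return sorted(host for host, ts in last_seen.items() if now - ts > timeout)


# Hypothetical state, as a monitoring loop might hold it in memory.
now = 1_000_000.0
last_seen = {
    "erp-01": now - 12,     # fresh
    "switch-3f": now - 95,  # just past the timeout
    "san-ctrl": now - 600,  # silent for ten minutes
}
print(stale_hosts(last_seen, now=now))
```

Passing `now` explicitly keeps the function testable. A production setup would also have to alert someone and would need to watch the monitoring host itself.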
Response:
- A.5.24 — Information security incident management planning and preparation: Documented incident response plan with clear escalation paths and recovery procedures.
- A.8.13 — Information backup: Regular, tested backups enable data recovery after a hardware failure.
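What "tested" can mean for a backup is illustrated by the sketch below, which compares the SHA-256 digest of an original file with that of its restored copy. The file names are hypothetical, and the check uses throwaway files in a temporary directory rather than a real backup tool.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Stream a file through SHA-256 so large backups need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original, restored):
    """True if the restored file is byte-identical to the original."""
    return sha256_of(original) == sha256_of(restored)


def self_check():
    """Exercise the comparison with throwaway files; returns (intact, corrupted)."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp, "orders.db")
        good = Path(tmp, "orders.restored.db")
        bad = Path(tmp, "orders.corrupt.db")
        original.write_bytes(b"order-data" * 1000)
        good.write_bytes(original.read_bytes())
        # Flip the last byte to simulate a damaged restore.
        bad.write_bytes(original.read_bytes()[:-1] + b"X")
        return restore_matches(original, good), restore_matches(original, bad)


print(self_check())
```

A matching digest only shows that the restored bytes are intact. A full restore test would also confirm that the application starts and works on the restored data.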
BSI IT-Grundschutz
The BSI IT-Grundschutz catalogue links G 0.25 to the following modules, among others:
- OPS.1.1.7 (Systems management) — Requirements for monitoring, capacity planning and incident handling.
- SYS.1.1 (General server) — Basic server protection, including redundancy and maintenance.
- INF.2 (Data centre and server room) — Physical protection measures such as climate control, fire protection and power supply.
- DER.4 (Emergency management) — Planning and execution of measures to maintain operations during failures.
Sources
- BSI: The State of IT Security in Germany — Annual report with statistics on IT disruptions and failures
- BSI IT-Grundschutz: Elementary Threats, G 0.25 — Original description of the elementary threat
- ISO/IEC 27002:2022 Section 8.14 — Implementation guidance on redundancy of information processing facilities