What is Redundancy?
Redundancy is the principle of duplicating critical systems, components or functions so that operation can continue in the event of a failure.
The aim of redundancy is to increase the reliability, availability and safety of systems.
In industrial automation and in IT and operational technology (OT) environments, redundancy is essential for continuous production, data availability and process safety.
🧠 Why is redundancy important?
- Preventing outages caused by hardware or software failures
- Preserving data and communication during network problems
- Increasing safety and supporting compliance (e.g. SIL targets under IEC 61508 and IEC 61511)
- Supporting business continuity and disaster recovery
🧱 Types of redundancy
| Type | Explanation |
|---|---|
| Hardware redundancy | Duplicate power supplies, CPUs, servers, PLCs |
| Network redundancy | Multiple network paths, separate switches, VLANs |
| Data redundancy | Backups, RAID, immutable backups, replication |
| Functional redundancy | Multiple systems with the same logic or role |
| Communication redundancy | Duplicate fieldbuses or protocols (e.g. redundant PROFIBUS) |
| Geographical redundancy | Systems spread across multiple sites |
⚙️ Redundancy architectures
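- Hot standby: a standby system runs alongside the active one and can take over immediately (e.g. a duplicate SCADA server)
- Cold standby: the reserve system is only started or activated after a failure, so recovery takes longer
- Load balancing: multiple systems work in parallel and share the load, so the remaining systems can carry on if one fails
- Failover clusters: automatic switchover to another node on fault detection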
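As a rough illustration of the hot standby and failover ideas above, the sketch below picks the active node based on heartbeat freshness. It is a minimal example under stated assumptions, not how any particular SCADA or cluster product works: the `Node` class, the `select_active` function, the node names and the 3-second timeout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time of the last heartbeat received, in seconds


def select_active(primary: Node, standby: Node, now: float, timeout: float = 3.0) -> Node:
    """Return the node that should be active at time `now`.

    The primary stays active while its heartbeat is fresh. Otherwise the
    standby takes over, provided its own heartbeat is fresh.
    The timeout value is an assumption chosen for illustration.
    """
    if now - primary.last_heartbeat <= timeout:
        return primary
    if now - standby.last_heartbeat <= timeout:
        return standby
    raise RuntimeError("no healthy node available")


# t = 101 s: both nodes reported recently, so the primary stays active
print(select_active(Node("scada-a", 100.0), Node("scada-b", 100.5), now=101.0).name)

# t = 105 s: the primary has been silent for 5 s, so the standby takes over
print(select_active(Node("scada-a", 100.0), Node("scada-b", 104.5), now=105.0).name)
```

A real failover cluster also has to prevent both nodes from becoming active at the same time (split brain), which this sketch does not attempt.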
🏭 Redundancy in OT environments
- SCADA systems with redundant servers or databases
- Redundant PLC CPUs or communication modules
- Historian with data links over duplicate networks
- Power supplies with dual feeds for continuous operation
- Safety instrumented systems (SIS) with duplicate sensors or actuators to meet a target SIL
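Redundant sensors in safety systems are commonly combined through M-out-of-N (MooN) voting, where a trip is raised only when at least M of the N channels demand it. The sketch below illustrates the idea. The function names, the pressure readings and the 12 bar limit are hypothetical values chosen for the example.

```python
HIGH_PRESSURE_LIMIT_BAR = 12.0  # hypothetical trip limit


def channel_demands(readings_bar: list[float]) -> list[bool]:
    """Each channel demands a trip when its reading exceeds the limit."""
    return [reading > HIGH_PRESSURE_LIMIT_BAR for reading in readings_bar]


def moon_vote(trip_demands: list[bool], m: int) -> bool:
    """Return True when at least `m` of the N channels demand a trip."""
    if not 1 <= m <= len(trip_demands):
        raise ValueError("m must be between 1 and the number of channels")
    return sum(trip_demands) >= m


# Three pressure transmitters, one of which reads high (possibly a faulty sensor)
demands = channel_demands([10.1, 10.3, 14.8])

print(moon_vote(demands, m=2))  # 2oo3: a single high channel does not trip
print(moon_vote(demands, m=1))  # 1oo3: any single high channel trips
```

The choice of M is a trade-off: a low M favours safety (fewer missed trips), while a higher M reduces spurious trips caused by one faulty sensor.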
✅ Benefits of redundancy
- Increased availability and reliability
- Less unplanned downtime
- Greater safety in critical processes
- Support for compliance with standards and guidelines such as IEC 61508 and IEC 61511 (SIL), GAMP and IEC 62443
- Quick recovery from incidents or faults
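The availability gain can be estimated with the standard formula for parallel redundancy, A = 1 - (1 - A1)(1 - A2)...(1 - An). It assumes that the units fail independently and that any single unit can carry the full load. The sketch below applies it. The 99% figure is an illustrative assumption, not a measured value.

```python
import math

HOURS_PER_YEAR = 8760


def parallel_availability(availabilities: list[float]) -> float:
    """Availability of redundant units in parallel, assuming independent failures.

    The combination is down only when every unit is down at the same time.
    """
    all_down = math.prod(1.0 - a for a in availabilities)
    return 1.0 - all_down


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


single = 0.99  # illustrative availability of one server
redundant = parallel_availability([single, single])

print(f"single:    {single:.4%}, {downtime_hours_per_year(single):.2f} h/year")
print(f"redundant: {redundant:.4%}, {downtime_hours_per_year(redundant):.2f} h/year")
```

With these assumed numbers, duplicating a 99% available unit reduces expected downtime from roughly 87.6 hours to under one hour per year. Common-cause failures, such as a shared power feed or a shared software fault, break the independence assumption and lower the real gain.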
📌 In summary
Redundancy is the deliberate duplication of systems or paths to safeguard continuity, safety and reliability in critical environments.
