Business Continuity Management (BCM) and Disaster Recovery Plan (DRP)

Although many measures can be taken to provide high availability, the availability of the IT infrastructure can never be guaranteed in all situations. In case of a disaster, the infrastructure could become unavailable, in some cases for a longer period of time. Business continuity is about identifying threats an organization faces and providing an effective response.

Business Continuity Management (BCM) and Disaster Recovery Planning (DRP) are processes to handle the effect of disasters. The Recovery Time Objective (RTO) and the Recovery Point Objective (RPO) determine the requirements for DRP.

Business Continuity Management (BCM)

BCM is not about IT alone. It includes managing business processes, and the availability of people and work places in disaster situations. It includes disaster recovery, business recovery, crisis management, incident management, emergency management, product recall, and contingency planning.

A Business Continuity Plan (BCP) describes the measures to be taken when a critical incident occurs in order to continue running critical operations, and to halt non-critical processes. The BS:25999 norm describes guidelines on how to implement BCM.

Disaster Recovery Planning (DRP)

Disaster recovery planning contains a set of measures to take in case of a disaster, when (parts of) the IT infrastructure must be accommodated in an alternative location. The IT disaster recovery standard BS:25777 can be used to implement DRP.

DRP assesses the risk of failing IT systems and provides solutions. A typical DRP solution is the use of fall-back facilities and having a Computer Emergency Response Team (CERT) in place. A CERT is usually a team of systems managers and senior management that decides how to handle a certain crisis once it becomes reality.

The steps that need to be taken to resolve a disaster highly depends on the type of disaster. It could be that the organization's building is damaged or destroyed (for instance in case of a fire), maybe even people got hurt or died.

One of the first worries is of course to save people. But after that, procedures must be followed to restore IT operations as soon as possible. A new (temporary) building might be needed, temporary staff might be needed, and new equipment must be installed or hired.

After that, steps must be taken to get the systems up and running again and to have the data restored. Connections to the outside world must be established (not only to the Internet, but also to business partners) and business processes must be initiated again.

RTO and RPO

Two important objectives of DRP are the Recovery Time Objective (RTO) and the Recovery Point Objective (RPO). The next figure shows the difference.

2013-05/rto-and-rpo.jpg

Figure: RTO and RPO

The RTO is the duration of time within which a business process must be restored after a disaster, in order to avoid unacceptable consequences (like bankruptcy). So the RTO determines the maximum time it may take for a failure to be repaired.

RTO is basically the same concept as MTTR, but for a complete business process instead of one component. Measures like failover and fall-back must be taken in order to fulfill the RTO requirements.

The RPO is the point in time to which data must be recovered considering some "acceptable loss" in a disaster situation. It describes the amount of data loss a business is willing to accept in case of a disaster, measured in time. For instance, when each day a backup is made of all data, and a disaster destroys all data, the maximum RPO is 24 hours – the maximum amount of data lost between the last backup and the occurrence of the disaster.

To lower the RPO, a different back-up regime should be implemented.


This entry was posted on Dinsdag 14 Juni 2011

Earlier articles

Quantum computing

My Book

Security bij cloudproviders wordt niet beter door overheidsregulering

Passend Europees cloudinitiatief nog ver weg

Data Nederlandse studenten in cloud niet grootschalig toegankelijk voor bedrijven VS

VS kan nog steeds Europese data Microsoft opeisen ondanks nieuwe regels

The cloud is as insecure as its configuration

Infrastructure as code

DevOps for infrastructure

Infrastructure as a Service (IaaS)

(Hyper) Converged Infrastructure

Object storage

Software Defined Networking (SDN) and Network Function Virtualization (NFV)

Software Defined Storage (SDS)

What's the point of using Docker containers?

Identity and Access Management

Using user profiles to determine infrastructure load

Public wireless networks

Supercomputer architecture

Desktop virtualization

Stakeholder management

x86 platform architecture

Midrange systems architecture

Mainframe Architecture

Software Defined Data Center - SDDC

The Virtualization Model

What are concurrent users?

Performance and availability monitoring in levels

UX/UI has no business rules

Technical debt: a time related issue

Solution shaping workshops

Architecture life cycle

Project managers and architects

Using ArchiMate for describing infrastructures

Kruchten’s 4+1 views for solution architecture

The SEI stack of solution architecture frameworks

TOGAF and infrastructure architecture

The Zachman framework

An introduction to architecture frameworks

How to handle a Distributed Denial of Service (DDoS) attack

Architecture Principles

Views and viewpoints explained

Stakeholders and their concerns

Skills of a solution architect architect

Solution architects versus enterprise architects

Definition of IT Architecture

What is Big Data?

How to make your IT "Greener"

What is Cloud computing and IaaS?

Purchasing of IT infrastructure technologies and services

IDS/IPS systems

IP Protocol (IPv4) classes and subnets

Introduction to Bring Your Own Device (BYOD)

IT Infrastructure Architecture model

Fire prevention in the datacenter

Where to build your datacenter

Availability - Fall-back, hot site, warm site

Reliabilty of infrastructure components

Human factors in availability of systems

Business Continuity Management (BCM) and Disaster Recovery Plan (DRP)

Performance - Design for use

Performance concepts - Load balancing

Performance concepts - Scaling

Performance concept - Caching

Perceived performance

Ethical hacking

Computer crime

Introduction to Cryptography

Introduction to Risk management

The history of UNIX and Linux

The history of Microsoft Windows

Engelse woorden in het Nederlands

Infosecurity beurs 2010

The history of Storage

The history of Networking

The first computers

Cloud: waar staat mijn data?

Tips voor het behalen van uw ITAC / Open CA certificaat

Ervaringen met het bestuderen van TOGAF

De beveiliging van uw data in de cloud

Proof of concept

Een consistente back-up? Nergens voor nodig.

Measuring Enterprise Architecture Maturity

The Long Tail

Open group ITAC /Open CA Certification

Human factors in security

Google outage

SAS 70

De Mythe van de Man-Maand

TOGAF 9 - wat is veranderd?

Landelijk Architectuur Congres LAC 2008

InfoSecurity beurs 2008

Spam is big business

De zeven eigenschappen van effectief leiderschap

Een ontmoeting met John Zachman

Persoonlijk Informatie Eigendom

Archivering data - more than backup

Sjaak Laan


Recommended links

Genootschap voor Informatie Architecten
Ruth Malan
Gaudi site
XR Magazine
Esther Barthel's site on virtualization
Eltjo Poort's site on architecture


Feeds

 
XML: RSS Feed 
XML: Atom Feed 


Disclaimer

The postings on this site are my opinions and do not necessarily represent CGI’s strategies, views or opinions.

 

Copyright Sjaak Laan