Business continuity is the uninterrupted availability of all key resources supporting essential business functions. Business Continuity Management (BCM) aims to keep processes and resources available following a disruption, to ensure the continued achievement of mission-critical objectives. It’s very dry. Writing this makes me question my choices in life.
Should I have a Business Continuity Plan?
A Business Continuity Plan (BCP) is an essential part of good business practice for any organisation, but especially for organisations dealing with the following;
- The requirement for availability during an extended working day, such as 365-day-a-year uptime.
- High dependence on certain key facilities, such as data centres or manufacturing facilities.
- Heavy reliance on IT, data, comms and telephony, which describes most companies.
- A high level of compliance, audit, legal or regulatory impact in the event of loss of a facility, as in sectors such as finance or healthcare.
- The potential for legal liability.
- Possible loss of the confidence and support of the workforce, where an incident could cause staff to look to move to new organisations; this can be a particular concern at startups.
- Potential loss of political and stakeholder support.
Ok, that describes me! What are the benefits?
A business continuity program will benefit your organisation by providing;
- A more resilient operational infrastructure
- Better compliance with regulatory and quality requirements
- Capability to continue to achieve your organisation’s mission
- Capability to maintain the business’s profitability
- Capability to maintain market share
- Improved morale for employees
- Protection of image, reputation and brand value
Nice! What are the impacts of business disruption?
Having Business Continuity Management in place can help with four main areas; Marketing, Finance, Statutory/Regulatory and Quality.
Marketing;
- Continued operation is crucial to maintaining customer confidence; with too much disruption, customer churn is a certainty!
- Very often strong advertising and messaging can work against us, especially where we have set a high bar of expectation.
- Typically, marketing may require triple its normally allocated annual budget in the aftermath of a disaster, as our skilled team tries to restore customer confidence and maintain or recover market share.
Finance;
- Many contracts contain damages or penalty clauses, often expressed as a percentage of the contract’s value, that may be invoked in the event of a service failure.
- Even force majeure clauses, which cover unforeseeable disasters, are being contested in court, on the argument that in our modern world most disasters should be foreseeable and planned for.
- There can be other financial losses too;
- Loss of interest on overnight balances(?)
- Cost of interest on cashflow – especially if you need an overdraft to cover cashflow disruptions.
- Delays in customer accounting and payments.
- Loss of control over debtors.
- Loss of credit control.(?)
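To make a couple of the financial impacts above concrete, here is a rough back-of-the-envelope sketch. The function names and every figure in it are made up for illustration, and it assumes a penalty expressed as a flat fraction of contract value and simple (non-compounding) overdraft interest; real contracts and bank terms will differ.

```python
def contract_penalty(contract_value, penalty_rate):
    """Penalty expressed as a fraction of the contract's value (assumed flat rate)."""
    return contract_value * penalty_rate


def overdraft_interest(shortfall, annual_rate, days):
    """Simple interest on an overdraft that covers a cashflow shortfall."""
    return shortfall * annual_rate * days / 365


# Hypothetical figures: a 200,000 contract with a 5% penalty clause, and a
# 50,000 overdraft at 10% a year needed for 30 days while payments are delayed.
penalty = contract_penalty(200_000, 0.05)
interest = overdraft_interest(50_000, 0.10, 30)

print(f"Penalty clause exposure: {penalty:,.2f}")
print(f"Overdraft interest:      {interest:,.2f}")
print(f"Combined:                {penalty + interest:,.2f}")
```

Even with small numbers like these, the penalty clause dwarfs the interest cost, which is why contract terms are worth checking early when estimating the cost of a disruption.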
Statutory or compliance requirements;
- Many organisations have to meet legal requirements
- Record keeping and audit trails
- Compliance requirements and industry regulations
- Health and safety, and environmental requirements
- Government agency requirements
- Tax and customs requirements
- Import and export regulations
- Data protection regulations
- Depending on your organisation, some or all of these will apply to you, and not having the capability to comply can result in severe penalties.
Quality;
- Many international standards have Disaster Recovery Planning (DRP) and BCM requirements
- ISO 9000 requires quality management system audits and surveillance visits
- ISO 27001 covers information security management
- ISO 22301 is for Business Continuity Management
- All these require a business continuity plan to be created and available to protect critical business processes from disruption.
- Loss of services aggravated by the lack of a BCP can cause those standards bodies to review and even withdraw your accreditation.
Survival from disruption
- We must implement BCM to ensure our survival following a disruption.
- Impacts can include;
- Existing customers leaving.
- Prospective customers looking elsewhere
- Loss of market share.
- Damaged image and credibility
- Reversed cashflow
- Costs could spiral out of control.
- Inventory costs rise and management is difficult – especially for grocery stores and retail.
- Share prices can drop
- Competitors may take advantage of the disruption.
- Key staff may leave
- Layoffs may be necessary.
That’s interesting but what are the causes?
Far too many to count, from local to regional and all the way up to national. Here are a few;
- Systems failure
- Data corruption
- Garda blocking access to the locality due to an emergency.
- Chemical contamination
- Theft of PCs with personal information – in some cases these PCs may be the only source of that information.
- Loss of a supplied service, anything from backups and data retention to AWS
- Loss of power
- Volcanic eruptions
- Snow, floods and windy weather
- Civil unrest
These don’t sound like IT though?
That’s right, Business Continuity is about more than IT. It has to cover all business operations. These can include manufacturing, retail, operations, and all front and back office activities deemed critical. BCM should be encouraged for organisations of all sizes, and US PS-PREP (?) has been a key driver here.
So what counts as a disaster then?
Disasters can be difficult to identify; really it’s any event where critical operations are impacted. Where there are controls in place, an event that would once have been a disaster may no longer be one.
One example is how organisations have migrated services to the cloud, so that a single physical equipment failure does not impact the organisation.
Another would be an organisation that has two telecoms providers. If one provider goes down it is not a disaster, as the second provider can be used.
A last example would be where a supply line fails but the organisation has built up a buffer supply.
In all these examples the critical operations of the organisation continue. Your organisation should take account of what these critical operations are to help define what a disaster is for you. This definition may prove vital for decision-making in determining when to activate the DRP.
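If it helps to see that definition as logic, here is a minimal sketch. It assumes a deliberately simplified model in which each critical operation has a set of interchangeable providers, and an event only counts as a disaster if some critical operation is left with none. The class, function and resource names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalOperation:
    """A critical operation and the interchangeable resources that can deliver it."""
    name: str
    providers: set[str] = field(default_factory=set)


def impacted_operations(operations, failed_resources):
    """Return the names of critical operations left with no working provider."""
    failed = set(failed_resources)
    return [op.name for op in operations if not (op.providers - failed)]


def is_disaster(operations, failed_resources):
    """Under this model, an event is a disaster only if a critical operation stops."""
    return bool(impacted_operations(operations, failed_resources))


# Hypothetical organisation: two telecoms providers, two cloud regions,
# but payroll still runs on a single on-site server.
operations = [
    CriticalOperation("telephony", {"telecom_a", "telecom_b"}),
    CriticalOperation("order_processing", {"cloud_region_1", "cloud_region_2"}),
    CriticalOperation("payroll", {"onsite_server"}),
]

print(is_disaster(operations, ["telecom_a"]))      # False: the second provider takes over
print(is_disaster(operations, ["onsite_server"]))  # True: payroll has no fallback
```

A real assessment would also weigh degraded capacity, time limits on buffers and so on, but listing critical operations against their fallbacks is a reasonable starting point for deciding when to activate the DRP.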
What should my Recovery Timescale be?
When planning for business continuity, downtime should be a central concern, and we should aim to reduce or eliminate it where possible. This is an even bigger concern in our modern world of online transactions: imagine Amazon’s lost sales for every second of downtime as frustrated customers try competitor websites. Aside from customer-facing applications, downtime can also impact other organisations due to our interconnected and interdependent economies. A good example of this is Japan’s 2011 tsunami: memory chip manufacturing was disrupted and, as a result, the cost of memory worldwide shot up.
The first objectives, and the road map for recovery, always differ depending on the individual organisation’s needs. An online retailer might prioritize restoring its front end and order processing, while a law firm’s priority may be to ensure record retention and access to backend services and file shares. A bank, meanwhile, might prioritize restoring the services that maintain the integrity of its databases.
Likewise, some organisations might consider a full recovery of mission-critical activities essential, while others might favor a partial recovery followed by phased restoration afterwards. There are a few measurements we use for this – http://www.bcmpedia.org/ is great for bits of info and clarification;
- Recovery Time Objective (RTO) – This is the targeted time from when service is disrupted to when full operations are regained.
- Maximum Tolerable Downtime (MTD) – The time after which the service disruption causes irreversible damage to the organisation.
- Maximum Tolerable Period of Disruption (MTPD) – The same as MTD.
- Recovery Point Objective (RPO) – Describes the interval of time that might pass during a disruption before the quantity of data lost during that period exceeds the Business Continuity Plan’s maximum allowable threshold or “tolerance.”
- Maximum Tolerable Data Loss (MTDL) – Closely related to RPO; it’s the maximum data loss you can suffer before catastrophic impact to your organisation. This can be more important than the period of time a service is down for, especially with the rise of FinTech organisations.
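As a rough illustration of how these measurements relate, the sketch below records a set of targets and checks an outage against them. It is only a sketch under stated assumptions: RPO is treated as a time window of lost data (the gap between the last good backup and the disruption), RTO is assumed never to exceed MTD, and the class, function and figures are all invented for the example.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTargets:
    rto: timedelta  # Recovery Time Objective: target time to regain operations
    mtd: timedelta  # Maximum Tolerable Downtime (same idea as MTPD)
    rpo: timedelta  # Recovery Point Objective, as a window of lost data

    def __post_init__(self):
        # Assumption: the recovery target should sit inside the tolerable limit.
        if self.rto > self.mtd:
            raise ValueError("RTO should not exceed MTD")


def assess_outage(targets, downtime, data_loss_window):
    """Compare an outage against the targets and list which ones were breached."""
    findings = []
    if downtime > targets.mtd:
        findings.append("MTD exceeded")
    elif downtime > targets.rto:
        findings.append("RTO missed")
    if data_loss_window > targets.rpo:
        findings.append("RPO exceeded")
    return findings or ["within targets"]


# Hypothetical targets for an online order system.
targets = RecoveryTargets(
    rto=timedelta(hours=4),
    mtd=timedelta(hours=24),
    rpo=timedelta(minutes=15),
)

print(assess_outage(targets, timedelta(hours=6), timedelta(minutes=5)))
print(assess_outage(targets, timedelta(hours=2), timedelta(hours=1)))
```

Note that the second outage recovers well inside the RTO and still breaches the RPO, which is the point made above: the amount of data lost can matter more than how long the service was down.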
Generally we have seen a trend of RTO and MTD requirements decreasing. There are a few reasons for this reduced tolerance for disruption. Many businesses are now very dependent on Enterprise Resource Planning and Customer Relationship Management tools, and downtime for these can have a paralyzing impact on operations. Companies may run call centers and offer 24/7/365 support, or have front ends or client integrations that are critical components clients rely on. Any outage can be fatal for the organisation.
With all these dependencies, recovery can be complex, involving different modules, backup schedules, data points and integration points. The speed of recovery is as important as the protection of data and transactions.
So I kind of get it – but what is Business Continuity; is it a plan, a project or a process?
Great question mini-me! Business continuity usually starts as a project: small scale and highly focused. However, once testing starts, management generally sees the benefit of it and extends it. This cycle repeats until the project becomes an ongoing program and then, through the standards it sets and follows, a management system. While initially viewed as a temporary project, it evolves into an essential Business As Usual activity.
The end result is a plan that is maintained, regularly reviewed, and that staff are trained in, so they know what to do should a disaster occur.
Sure but I want to have a structured plan, what should that look like?
For a proper BCP cycle there are a few different models, like the one shown above from Ebrary. It operates as a cycle for a few reasons.
At the outset it is necessary to understand certain business issues and gain executive buy-in. This can involve carrying out Risk Assessments and Business Impact Assessments. These allow us to understand our specific risks and what options are open to us for continuity and recovery. They also help us understand what we need to protect, from the perspective of having contingencies in place and managing risks.
Who do I need to involve in my BCP committee?
As said before, there has been a move away from IT-focused BCPs towards business-process BCPs. Data is now located across the enterprise, and critical business processes that rely on IT are carried out throughout the organisation. Because of this, any BCP has to focus on the wider organisational units rather than just IT. We need to involve all executives, managers and employees. It’s a big job, so it is essential we have a BC coordinator who is responsible for maintaining, updating, reviewing and distributing the BCP.
BC is growing in maturity; what are the drivers?
Typically companies start with basic incident management runbooks or plans. These cover how to respond to general hazards like fires, malware infections, bomb threats and dangerous weather. They operate at the lowest maturity level for a BCP, and the goal is to cover Health and Safety.
As organisations mature, these strategies evolve to include disaster recovery plans and procedures for loss of information and communications technology, equipment, applications and data. This was traditionally viewed as a siloed IT activity, with no consideration of the overall impact on business activities.
Since those times several groups have begun exerting influence to encourage a more holistic approach to DRPs, including DRII, Survive!, ACP and DRIE. At the same time physical security has become more of a consideration, especially in Europe where terrorist attacks have occurred more frequently.
What else do I need to know?
Since the 1990s regulatory requirements have come to the fore, pushing for a more holistic approach to BCM, with the goal of covering more areas of risk. The need for this was emphasized by high-visibility scandals such as Enron (funny story; most of the Arthur Andersen employees moved to EY!), WorldCom and Parmalat. So far Fyre Festival has not prompted DRP and BCM regulatory requirements for music festivals, but one can hope.
Some relevant legislation includes;
- SOX – US, with similar but less comprehensive legislation in the EU
- GLBA – US
- HIPAA – US
- GDPR – EU
- NISD – EU
- PSD2 – EU
There are also a number of independent and international standards that organisations can certify against to reassure clients and stakeholders that, should the worst happen, they are prepared! Initially everyone rushed to standardise BCM requirements, but this led to too many potential standards, causing a lot of confusion. Large multinationals then led a push to… wait for it… standardise the standards. These included;
- Singapore Standard SS 507
- BS 25999-1
- ISO 22301:2012 BC
- Lastly, one of the oldest: in 1991 the US National Fire Protection Association (NFPA) started to develop the NFPA 1600 standard for disaster/emergency management, released in 1995, which was later improved to include business continuity.
These standards, once mature, provided numerous benefits including;
- Reduced supply chain disruption, which had a positive impact on more organisations.
- Reduced costs for a company relating to unexpected disruption.
- Improved customer satisfaction by ensuring their needs were serviced as expected.
- Reduced barriers for accessing new markets.
- Reduced impacts to the environment.
- Improved market share.
So let’s zoom in on supply chains for a moment.
Early on, BCM advocates saw that their organisations were disrupted if links in their supply chain suffered disasters. The earlier example of memory manufacturing in Japan after the tsunami illustrates this, as does the hard drive shortage caused by flooding in Thailand in the same year, 2011. In both instances the disrupted manufacturing caused issues down the supply chain for PC manufacturers.
In a survey from the Aberdeen Group;
- over 50% of respondents had suffered supply chain failure in the previous year.
- 56% -> supplier capacity did not meet the demand
- 49% -> suffered raw materials price increases or shortages
- 45% -> experienced unexpected changes in customer demand
- 39% -> experienced shipment delays/damages/misdirects
- 35% -> suffered fuel price increases or shortages
What’s this holistic approach thing you mentioned?
Well, like my favorite word “Synergy”, “holistic” is mostly a buzzword consultants and academics use to pad out their work. What we mean by holistic is just taking an end-to-end view of business continuity, encompassing all elements of the organisation. For an enterprise that can include taking into account, among others;
- BC and DR
- Operational Risk Management
- Insurance aspects
- Security compliance and breaches (information, telco, and e-commerce)
- Regulatory compliance
- Business trading and financial risk management
- Asset protection
- Project development and production risk management
- Supply chain risk management
- Quality tracking, defect management, maintenance, and product recall
- Problem management and escalation from helpdesks
- Customer complaint issues
- Health and safety
- Environmental risks and safety management
- Marketing protection (image and reputation)
- Crisis management (branch attacks, hostage and kidnap, product recall, fraud)
Operational and Business Resilience
Business resiliency is simply the ability of the organisation to adapt and react to dynamic internal and external changes. These can be opportunities or threats, disruptions or disasters. An organisation’s business resiliency is assessed by how much disruption it can absorb before there is a significant impact on the business. Organisations never want to use their DRP; they want to avoid the crisis altogether.
But that’s enough of DRP for this week…