Shai Gindi, VP of Business Development at Y-tech
A Business Continuity Program is a broad program that aims to keep the organization working as continuously as possible, in line with goals set in advance.
A Disaster Recovery Plan is part of the more extensive process of building business continuity.
Computing is now an integral part of almost every business activity. Some businesses use computer systems but can manage without them for a period of time. For others, the entire business core depends on computer systems, and a few minutes, sometimes even seconds, of downtime can have a significant business impact.
Today’s prevailing approach is to build smart systems that include continuous protection and the ability to keep operating through, and recover from, a disaster. The ways in which computer systems can be damaged are many and varied.
There are two essential concepts associated with disaster recovery that are worth knowing:
RTO – Recovery Time Objective
The time required to restart business activity: how long it will take the organization to return to regular, complete work, or to the level of operation defined in the business continuity plan.
The target can be zero, meaning the organization cannot tolerate any downtime and the systems must be built accordingly, or it can be several hours or days, depending on business continuity needs and the budget available for the solution.
It is very important to examine these needs thoroughly at the characterization stage. The computer systems and the solution must then be adapted to the organization’s needs so that, when tested, the result meets the defined goal.
RPO – Recovery Point Objective
The time span of accumulated data that the organization can afford to lose in a disaster. This time frame varies and is defined according to the needs of the organization.
For example, one organization may set a recovery time of seven hours and consider it reasonable to lose three hours of work; another organization may tolerate a wider range of hours, or be unable to lose any data at all.
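To make the two objectives concrete, here is a minimal Python sketch that checks a single incident against the example targets above (a recovery time of seven hours and a tolerated loss of three hours of work). The class name, function names, and timestamps are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Timestamps of one disaster event (all values are illustrative)."""
    last_good_backup: datetime   # most recent recoverable copy of the data
    failure: datetime            # moment the systems went down
    service_restored: datetime   # moment regular work resumed


def data_loss(incident: Incident) -> timedelta:
    """Work lost: everything written between the last backup and the failure."""
    return incident.failure - incident.last_good_backup


def downtime(incident: Incident) -> timedelta:
    """Time during which the business was unable to work."""
    return incident.service_restored - incident.failure


def meets_objectives(incident: Incident, rto: timedelta, rpo: timedelta) -> bool:
    """True only if both the downtime and the data loss are within target."""
    return downtime(incident) <= rto and data_loss(incident) <= rpo


# Targets taken from the example in the text: RTO of 7 hours, RPO of 3 hours.
RTO = timedelta(hours=7)
RPO = timedelta(hours=3)

# Hypothetical incident used only to exercise the functions.
incident = Incident(
    last_good_backup=datetime(2024, 1, 10, 8, 0),
    failure=datetime(2024, 1, 10, 10, 30),
    service_restored=datetime(2024, 1, 10, 16, 0),
)

print(data_loss(incident))                   # 2:30:00
print(downtime(incident))                    # 5:30:00
print(meets_objectives(incident, RTO, RPO))  # True
```

The same check could be run against the results of a recovery drill, so that the targets are verified before a real disaster rather than after it.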
These metrics may affect:
- How the computer systems are built from day one.
- Whether the systems must be designed for zero downtime, when the organization cannot tolerate any interruption at all.
- Which backup systems should be adapted to the customer, including a cold or hot, active or passive DR (disaster recovery) solution.
- The level of service (SLA) required from the supporting systems, IT staff, and relevant vendors.
Example of a complete process:
- When characterizing the solution for the customer, the following questions are addressed:
  - What uptime level is required for the system?
  - How much total failure time can the organization endure?
  - How much information, measured in working time, can the organization afford to lose (RPO)?
  - How long can the organization take to restore from backup and resume work (RTO)?
Each of the above parameters may be relevant to all systems or determined individually for each system according to its importance level.
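As a rough illustration of setting targets per system, the sketch below keeps organization-wide defaults and overrides them only for systems with a higher importance level. The system names and all values are hypothetical assumptions, not figures from a real deployment.

```python
from dataclasses import dataclass, replace
from datetime import timedelta


@dataclass(frozen=True)
class Targets:
    rto: timedelta  # maximum tolerated time to resume work
    rpo: timedelta  # maximum tolerated window of lost data


# Organization-wide defaults (illustrative values only).
DEFAULT = Targets(rto=timedelta(hours=4), rpo=timedelta(hours=24))

# Per-system overrides for more important systems.
# The system names are hypothetical.
OVERRIDES = {
    "billing": replace(DEFAULT, rto=timedelta(minutes=30), rpo=timedelta(minutes=5)),
    "email": replace(DEFAULT, rpo=timedelta(hours=1)),
}


def targets_for(system: str) -> Targets:
    """Return the system's own targets, falling back to the defaults."""
    return OVERRIDES.get(system, DEFAULT)


print(targets_for("billing").rto)   # 0:30:00
print(targets_for("intranet").rpo)  # 1 day, 0:00:00
```

Keeping the defaults in one place means that only the systems that truly need stricter targets carry the extra cost of meeting them.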
Based on the answers, a solution is characterized to meet these needs. For example:
- Required uptime of 99.99% per year: make sure that the power systems meet the standard and can support the requirement.
- Total failure of up to 9 hours per year.
- Loss of information for one working day.
- Restore from backup within 4 hours.
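When working with uptime percentages, it can help to translate them into a yearly downtime budget. The sketch below does this conversion in both directions, assuming a 365-day year; the function names are hypothetical. Under that assumption, 99.99% uptime corresponds to roughly 52.6 minutes of downtime per year, while 9 hours of downtime per year corresponds to roughly 99.9% uptime.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; leap years are ignored in this sketch


def allowed_downtime_hours(availability_percent: float) -> float:
    """Maximum yearly downtime (in hours) permitted by an uptime target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def implied_availability_percent(downtime_hours: float) -> float:
    """Yearly uptime percentage implied by a given amount of downtime."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_hours(target) * 60
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")

print(f"9 hours of downtime per year is about {implied_availability_percent(9):.2f}% uptime")
```

Running the conversion during characterization makes it easier to confirm that the uptime target and the tolerated failure time agreed with the customer describe the same expectation.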
The data is passed to an engineer at the company, who recommends a cloud solution that fits the requirements. Solutions can be offered at any level, from a standard cloud solution that includes redundancy to a full hot DR solution running in an Active-Active configuration.
The characterization phase is critical because the solution must match the customer’s needs on the operational and technical side and, no less importantly, on the financial side.
Advice and guidance are an integral part of the process. It is vital to present the pros and cons of each type of solution to the client and to help the client reach the right decision, guided by the recommendations of the technical staff.
Keep in mind that the system is built for “doomsday”, the day a failure occurs. That day is the real test, when everything must work. It is highly recommended to run periodic tests together with the customer, simulating several extreme scenarios to see how the system responds. These tests give the customer and the technical staff confidence that everything will work as planned when disaster strikes.
No one can predict precisely when a disaster will occur and what it will affect in the existing reality.
It is certainly possible, and recommended, to prepare in advance in order to cope as well as possible when disaster “knocks on the door”. RTO and RPO values vary from organization to organization. Still, they will always be a compromise between the availability the business needs and the budget that must be invested in IT.
These values should be set jointly by management, who understand the business need, the required availability, and the possible harm in each situation, and by the IT experts, whose job is to present the technical risks and build the technical solution that addresses the business need. From there, the two sides converge on a suitable budget.
Eventually, a decision is made and handed over to the operating entity, whether internal or external to the company. From this moment onwards, the responsibility lies with the solution provider, whose job is to perform periodic inspections and report the findings to the customer.
Another angle for understanding the issue: the whole solution can be likened to insurance on a complex structure, say a large business tower in a commercial area. Once the agreement is signed, it fades from view until the day the insurance company is required to respond for one reason or another. If the characterization was accurate and correct, and was followed by proper maintenance and testing, the response will meet expectations.