Disaster Recovery
Downtime is not an option for modern organizations that must meet their customers’ needs and expectations. Many kinds of incidents can threaten your revenue or even your company’s existence. Whether it is a ransomware attack, a power outage, a flood, or simple human error, these events are unpredictable, and the best thing you can do is be prepared.
Being prepared means having a solid business continuity and disaster recovery (BCDR) plan, one that has been tested and can be put in motion smoothly.
Two of the important parameters that define a BCDR plan are the Recovery Point Objective (RPO) and Recovery Time Objective (RTO). For those of you who are not familiar with these terms, let me give you a brief description:
- RPO is the maximum acceptable amount of data loss, measured as the time between the last valid backup and the failure. It limits how far back in time you may have to roll back.
- RTO is the maximum acceptable downtime: the time from the incident until normal operations are available to users again.
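To make the two definitions concrete, here is a minimal sketch that measures the data-loss window and the downtime of a single incident and compares them with the objectives. The timestamps, the objective values, and the function names are all made up for illustration.

```python
from datetime import datetime, timedelta


def achieved_recovery_point(last_backup: datetime, failure: datetime) -> timedelta:
    """Data-loss window: time between the last valid backup and the failure."""
    return failure - last_backup


def achieved_recovery_time(failure: datetime, restored: datetime) -> timedelta:
    """Downtime: time between the failure and the return to normal operations."""
    return restored - failure


# Hypothetical incident timeline.
last_backup = datetime(2024, 1, 10, 2, 0)
failure = datetime(2024, 1, 10, 5, 30)
restored = datetime(2024, 1, 10, 7, 0)

# Assumed objectives for this workload.
rpo = timedelta(hours=4)
rto = timedelta(hours=2)

data_loss = achieved_recovery_point(last_backup, failure)
downtime = achieved_recovery_time(failure, restored)

print(f"Data-loss window: {data_loss} (within RPO: {data_loss <= rpo})")
print(f"Downtime: {downtime} (within RTO: {downtime <= rto})")
```

In this timeline, three and a half hours of data are lost and the service is down for an hour and a half, so both objectives are met.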
Define RTO and RPO values
The truth is that there is no one-size-fits-all business continuity plan or set of metrics. Companies differ from one vertical to another, have different needs, and therefore have different requirements for their recovery objectives. However, a common practice is to divide applications and services into tiers and set recovery time and point objective (RTPO) values according to the service-level agreements (SLAs) the organization has committed to.
Classifying data by the protection it needs helps you decide how to store, access, protect, recover, and update it more efficiently. It is essential to analyze your applications and determine which of them drive your business, generate revenue, and must stay operational. This analysis, a cornerstone of a good business continuity plan, is called a business impact analysis (BIA), and it establishes the protocols and actions for facing a disaster.
For example, you can use a three-tier model to design your business continuity plan:
- Tier-1: Mission-critical applications that require an RTPO of less than 15 minutes
- Tier-2: Business-critical applications that require an RTO of 2 hours and an RPO of 4 hours
- Tier-3: Non-critical applications that require an RTO of 4 hours and an RPO of 24 hours
It’s important to keep in mind that what counts as mission-critical, business-critical, or non-critical varies across industries, and each organization defines these tiers based on its own operations and requirements.
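The three-tier example can be written down as data and used to sanity-check a backup schedule. This is only a sketch: the application names and the `backup_interval_ok` helper are hypothetical, and the Tier-1 bound of "less than 15 minutes" is treated as an inclusive upper limit for simplicity.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Tier:
    name: str
    rto: timedelta  # maximum acceptable downtime
    rpo: timedelta  # maximum acceptable data-loss window


# Values taken from the three-tier example above.
TIERS = {
    1: Tier("mission-critical", timedelta(minutes=15), timedelta(minutes=15)),
    2: Tier("business-critical", timedelta(hours=2), timedelta(hours=4)),
    3: Tier("non-critical", timedelta(hours=4), timedelta(hours=24)),
}


def backup_interval_ok(tier: int, backup_interval: timedelta) -> bool:
    """The worst-case data loss equals the time between two backups,
    so the interval must not exceed the tier's RPO."""
    return backup_interval <= TIERS[tier].rpo


# Hypothetical application inventory mapped to tiers.
apps = {"payments-api": 1, "reporting": 2, "wiki": 3}

hourly = timedelta(hours=1)
for app, tier in apps.items():
    verdict = "meets" if backup_interval_ok(tier, hourly) else "violates"
    print(f"{app}: hourly backups {verdict} the {TIERS[tier].name} RPO")
```

With hourly backups, only the Tier-1 application falls short, which is a hint that mission-critical workloads usually need continuous replication or log shipping rather than periodic backups.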
Now that you have ranked your applications and services and you know what the impact will be in case of specific incidents, it’s time to find a solution that can help you protect your business data and operations.
RTO and RPO in practice at MineSec
Quick application recovery (RTO)
Database restore directly from a backup (RPO)
Multi-AZ: MineSec uses multiple Availability Zones (AZs). Every AWS Region consists of multiple AZs, and each AZ consists of one or more data centers in a separate and distinct geographic location. This significantly reduces the risk of a single event impacting more than one AZ. We therefore designed our DR strategy around a Multi-AZ deployment within a single AWS Region, which provides protection against power outages, flooding, and other localized disruptions.
Backup: Amazon RDS continuously uploads transaction logs for Multi-AZ DB clusters to Amazon S3, so we can restore to any point in time within the backup retention period.
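As a rough illustration of what "any point in time within the backup retention period" means, the sketch below checks whether a requested restore time falls inside the restorable window. It is a simplification under stated assumptions: the retention period, the timestamps, and the helper names are hypothetical, and in practice the earliest and latest restorable times would be reported by the database service rather than computed locally.

```python
from datetime import datetime, timedelta, timezone


def restore_window(now: datetime, retention_days: int, latest_restorable: datetime):
    """Approximate restorable window: from the start of the retention
    period up to the latest restorable time."""
    earliest = now - timedelta(days=retention_days)
    return earliest, latest_restorable


def can_restore_to(target: datetime, now: datetime, retention_days: int,
                   latest_restorable: datetime) -> bool:
    """True if the target time lies inside the restorable window."""
    earliest, latest = restore_window(now, retention_days, latest_restorable)
    return earliest <= target <= latest


# Hypothetical values for illustration only.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
latest = now - timedelta(minutes=5)  # assumed lag of the most recent log upload
retention_days = 7                   # assumed backup retention period

yesterday = now - timedelta(days=1)
too_old = now - timedelta(days=8)

print(can_restore_to(yesterday, now, retention_days, latest))  # inside the window
print(can_restore_to(too_old, now, retention_days, latest))    # before the window
```

The gap between `now` and the latest restorable time is the effective data-loss window for the database, which is why continuous log upload keeps it small.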
Conclusion
Nobody can predict a disaster, but you can respond in an organized way by following your business continuity plan. RPO and RTO values vary across companies, but they are always a compromise between the business need for availability and the required investment in IT. Estimating them should be a joint effort between your organization’s business and IT experts. Whatever values you settle on, you still need a reliable availability solution for virtual, physical, and cloud workloads to keep your business running.
To select the best strategy, you must analyze benefits and risks with the business owner of a workload, as informed by engineering/IT. Determine what RTO and RPO are needed for the workload, and what investment in money, time, and effort you are willing to make.