To prepare for the AWS Certified Solutions Architect – Professional exam, you must be able to design and build enterprise-grade solutions that are scalable and fault tolerant.  You must also be able to design and implement appropriate backup, restore, disaster recovery (DR) and business continuity solutions.  This document is Part 1 of the Domain 1 – High Availability and Business Continuity objective, which represents 15% of the exam weight.

  • 1 Demonstrate ability to architect the appropriate level of availability based on stakeholder requirements
  • 2 Demonstrate ability to implement DR for systems based on RPO and RTO
  • 3 Determine appropriate use of multi-Availability Zones vs. multi-Region architectures
  • 4 Demonstrate ability to implement self-healing capabilities
    • Content may include the following: High Availability vs. Fault Tolerance


Recovery Time Objective (RTO) – This is the maximum acceptable time between a disruption and the restoration of IT services to normal business operations.  Businesses set specific RTO requirements.  For example, for critical business applications, they might specify that the RTO cannot be more than 2 hours.  The solutions designed for the organization must then be able to restore services within 2 hours.

Recovery Point Objective (RPO) – This is the acceptable amount of data loss, measured in time.  For example, if you take backups once a day, the worst-case scenario is that you will lose 24 hours of data.  If this is not acceptable, you need to design solutions that enable you to take more frequent backups.
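
The arithmetic behind these two definitions can be sketched in a few lines of Python. This is a minimal illustration, not a planning tool: the function names and the sample figures (a 4-hour RPO and a 3-hour restore time) are assumptions made up for the example, and the only rule it encodes is the one above, that the worst-case data loss equals the interval between backups.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: the failure hits just before the next backup runs."""
    return backup_interval


def meets_objectives(backup_interval, estimated_restore_time, rpo, rto):
    """Return (rpo_ok, rto_ok) for a simple backup-and-restore plan."""
    rpo_ok = worst_case_data_loss(backup_interval) <= rpo
    rto_ok = estimated_restore_time <= rto
    return rpo_ok, rto_ok


# Daily backups and a 3-hour restore, checked against a 4-hour RPO
# and a 2-hour RTO (all figures are illustrative assumptions).
daily_plan = meets_objectives(
    backup_interval=timedelta(hours=24),
    estimated_restore_time=timedelta(hours=3),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=2),
)
print(daily_plan)  # (False, False)
```

In this example the daily backup plan misses both objectives, which is the signal to move to more frequent backups and a faster recovery design.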

AWS Features Designed to Build DR Solutions

As part of your DR strategy, you need to know which AWS services and resources are available to build DR solutions that not only meet your RTO and RPO levels but also enable you to manage cost effectively.  There is always a trade-off: hot standby solutions cost more than a simple backup and restore strategy, but a range of options is available to meet your business SLA requirements and budget.

You can also build DR solutions as a means to respond to on-site disasters, and here the use of networking technologies, EC2 instances, storage and DNS plays a vital role.

Core resources and services to build DR Solutions

  • Regions & Availability Zones – Regions are geographically isolated areas in which you can build out independent infrastructure solutions. Regions can be used to build DR solutions or to position your services closer to your users.
  • Each Region comes with a minimum of two Availability Zones.  Availability Zones can be used both for failover in DR scenarios and to offer High Availability for continuous operations in case of a single AZ failure.


Additional Architectural Concepts – DR with High Availability

DR with High Availability

  • You use multiple Availability Zones to design DR with High Availability. In this scenario, if you have a web server farm as per the diagram above, the servers can be spread across a minimum of two Availability Zones.
    • An Elastic Load Balancer sits in front of the web servers across the two Availability Zones
    • User requests are forwarded to a specific zone based on load
    • Auto Scaling handles deployment of EC2 servers based on triggers
    • The database is configured in a Multi-AZ setup with easy failover in case the primary zone fails
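
To see why spreading servers across zones matters, the following is a small plain-Python model of the layout described above. It makes no AWS calls. The zone names and the round-robin placement are assumptions for illustration only, and real load balancers and Auto Scaling groups apply their own placement logic.

```python
from collections import Counter


def spread_across_zones(instance_count, zones):
    """Assign instances to zones round-robin, as evenly as possible."""
    return [zones[i % len(zones)] for i in range(instance_count)]


def surviving_capacity(placement, failed_zone):
    """Fraction of instances still running if one zone is lost."""
    remaining = [zone for zone in placement if zone != failed_zone]
    return len(remaining) / len(placement)


zones = ["az-a", "az-b"]  # hypothetical zone names
placement = spread_across_zones(4, zones)
print(Counter(placement))                      # two instances in each zone
print(surviving_capacity(placement, "az-a"))   # 0.5
```

With two zones, losing one zone halves the running capacity until Auto Scaling launches replacements in the surviving zone, so the remaining servers must be able to carry the load in the meantime.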



AWS Services and Features Essential for Disaster Recovery

Storage Services

Storage services on AWS enable a variety of DR solutions in which backed-up data can be stored.  Storage solutions such as AWS Storage Gateway enable you to back up on-premises data using a variety of backup strategies, including traditional backup applications with the help of the Virtual Tape Library solution on Storage Gateway.  The following are core storage services that can be used to build your DR strategy:

  • Amazon Simple Storage Service
    • S3 Standard is designed for 99.999999999% (eleven 9s) durability
    • S3 offers versioning to further protect versions of your data and help prevent accidental deletion
    • S3 offers lifecycle management to manage costs and move older data to cheaper storage like S3 IA and Glacier
    • S3 Reduced Redundancy Storage (RRS) can be used to reduce storage costs in situations where loss of data is not a big issue
    • S3 Standard – Infrequent Access (S3 IA) is for data that is accessed infrequently
      • Objects must remain in S3 Standard for at least 30 days before a lifecycle rule can move them to S3 IA, and S3 IA has a further 30-day minimum storage duration
      • Remember that objects smaller than 128 KB are charged as 128 KB in S3 IA
  • Amazon Glacier
    • Ideal for long-term archive storage
    • Not ideal where RTO levels need to be under 3 hours, because standard retrievals typically take 3 to 5 hours
    • Works with S3 lifecycle management to archive older objects automatically
    • Storage Gateway Virtual Tape Shelf uses Amazon Glacier to store backed-up data
  • Amazon Elastic Block Store
    • Create point-in-time snapshots of your data volumes
    • Use snapshots to create new EBS volumes
    • Snapshots are stored in S3 and are therefore durable
    • EBS volumes can persist independently of the life of EC2 instances
    • EBS volumes are replicated across multiple servers within a single Availability Zone only
  • AWS Import/Export
    • The older approach, in which physical disks are shipped to Amazon to import and export data. AWS Import/Export is designed to work with 16 TB of data or less and enables you to import directly into:
      • S3
      • Glacier
      • EBS
    • Export is supported from S3
    • AWS Snowball
      • AWS Snowball (Snowball), a new feature of AWS Import/Export, is generally faster and cheaper to use than disks for importing data into Amazon Simple Storage Service (Amazon S3).  AWS Snowball is a physical storage appliance that comes in an enclosure designed for security. You can use Snowball to help accelerate petabyte-scale data transfers into and out of AWS.
      • Key features include:
        • Faster Data Transfers – The device comes with RJ-45 as well as SFP+ network interfaces (copper or optical).
        • Encryption – AWS Snowball automatically encrypts data with 256-bit encryption, with keys managed by AWS Key Management Service (KMS)
        • Tamper Resistant – The AWS Snowball appliance comes with a tamper-resistant seal and a built-in Trusted Platform Module that uses a dedicated processor to detect any unauthorised modification to hardware, firmware or software.
        • End to End Tracking – AWS Snowball uses an E Ink shipping label to ensure that the appliance is shipped to the correct AWS facility. Tracking is carried out via Amazon SNS, text messages and the AWS Console
        • Secure Erasure – Once transfer jobs are complete, AWS performs a software erasure of the Snowball appliance following National Institute of Standards and Technology (NIST) guidelines for media sanitization
        • Programmable – AWS Snowball supports APIs that enable customers and partners to integrate their applications with Snowball
    • AWS Snowball Edge
      • AWS Snowball Edge is a new service that provides a 100 TB data transfer device with onboard storage and compute capabilities. In addition to transferring data to AWS, Snowball Edge can be used to provide compute processing services.
    • AWS Snowmobile
    • Review full exam tips on AWS Snowball
  • Storage Gateway – A service that connects an on-premises software appliance with cloud-based storage to provide seamless and highly secure integration between your on-premises IT environment and the storage infrastructure of AWS. Understand the four Storage Gateway types (File, Volume Cached, Volume Stored and Tape):
    • File Gateway – The file interface enables you to store and retrieve objects in Amazon S3 using industry-standard file protocols. Files are stored as objects in your S3 buckets and accessed through a Network File System (NFS) mount point.
    • Volume Gateway – The volume interface presents your applications with disk volumes using the iSCSI block protocol. Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes and stored in the cloud as Amazon EBS snapshots.
      • Volume Cached – The bulk of the data is stored in S3, with frequently accessed data held on-premises in a cache
      • Volume Stored – The bulk of the data is stored on-premises, with point-in-time snapshots transferred to S3
    • Tape Gateway – Replaces physical tape infrastructure with virtual tape libraries and virtual tapes presented over the iSCSI protocol to a backup application such as Symantec Backup Exec and many others.
      • Virtual Tape Library – Stores backup data in Amazon S3
      • Virtual Tape Shelf – Stores backup data in Glacier. This storage option offers virtually unlimited capacity
        • Note: Retrieving data from the Virtual Tape Shelf can take about 24 hours
    • Review full exam tips on AWS Storage Gateway
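
The S3 lifecycle and S3 IA points above can be sketched as a lifecycle configuration built in Python. This is a hedged illustration only: the rule ID, the "backups/" prefix and the 60-day Glacier transition are assumptions invented for the example, and the validation simply encodes the 30-day and 128 KB rules from the list above.

```python
def build_lifecycle_configuration(ia_after_days=30, glacier_after_days=60):
    """Build a lifecycle configuration dict in the shape S3 expects."""
    if ia_after_days < 30:
        raise ValueError("objects must spend at least 30 days in Standard first")
    if glacier_after_days <= ia_after_days:
        raise ValueError("the Glacier transition must come after the IA transition")
    return {
        "Rules": [
            {
                "ID": "dr-backup-tiering",          # hypothetical rule name
                "Filter": {"Prefix": "backups/"},   # hypothetical key prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


def billable_size_in_ia(object_size_bytes):
    """S3 IA bills objects smaller than 128 KB as if they were 128 KB."""
    return max(object_size_bytes, 128 * 1024)


config = build_lifecycle_configuration()
print(config["Rules"][0]["Transitions"])
print(billable_size_in_ia(10 * 1024))  # 131072
```

If you use boto3, a dictionary of this shape is what the S3 client's put_bucket_lifecycle_configuration call takes as its LifecycleConfiguration argument; the bucket name would be your own.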


Compute

Amazon Elastic Compute Cloud (EC2) offers virtual servers in the cloud with both Windows and Linux based operating systems.  Key points to note in reference to DR:

  • Amazon Machine Images (AMIs) are preconfigured machine images that include an operating system. You can build your own AMIs and copy them to other Regions for quick deployment with your standard configuration
  • Configure fault tolerance and failover mechanisms by launching instances from your AMIs across multiple Availability Zones, so that the recovery strategy can handle the loss of an Availability Zone
  • Use the Amazon EC2 VM Import Connector to import virtual machines from your on-premises locations as Amazon EC2 instances
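
Copying an AMI to a DR Region can be sketched as follows. This is a minimal sketch under stated assumptions: ec2_client is assumed to be a boto3 EC2 client created for the destination Region, the helper name, AMI IDs and image name are made up for the example, and a small fake client stands in for AWS so the code runs without credentials.

```python
def copy_ami_to_dr_region(ec2_client, source_ami_id, source_region, name):
    """Copy an AMI into the Region that ec2_client is bound to.

    ec2_client is assumed to be a boto3 EC2 client for the destination
    (DR) Region, e.g. boto3.client("ec2", region_name="eu-west-1").
    """
    response = ec2_client.copy_image(
        Name=name,
        SourceImageId=source_ami_id,
        SourceRegion=source_region,
    )
    return response["ImageId"]


class FakeEc2Client:
    """Stand-in client so the sketch runs without AWS access."""

    def __init__(self):
        self.calls = []

    def copy_image(self, **kwargs):
        self.calls.append(kwargs)
        return {"ImageId": "ami-00000000000000000"}  # made-up ID


fake = FakeEc2Client()
new_ami = copy_ami_to_dr_region(
    fake, "ami-11111111111111111", "us-east-1", "web-golden-image"
)
print(new_ami)
```

Passing the client in as a parameter keeps the helper easy to test and makes it explicit that the copy is requested from the destination Region.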


Networking

Networking plays an integral role in designing for Disaster Recovery and Business Continuity.  Connectivity to secondary backup systems requires DNS configurations to be correctly defined.  Amazon services such as VPN connections into VPCs and Direct Connect enable customers to build an infrastructure that can fail over to the cloud from on-premises locations.

  • Route 53 – is more than just a DNS service. Its routing policies can direct users to applications globally and redirect traffic to another Region in the event of a regional disaster.  DNS endpoint health checks enable failover between multiple endpoints, including static websites hosted on Amazon S3
  • Elastic Load Balancing – offers the ability to distribute incoming traffic to multiple EC2 instances spread across multiple Availability Zones. This improves fault tolerance, especially when you use the Elastic Load Balancer DNS name, which allows seamless failover.
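
A Route 53 failover routing setup comes down to a pair of records, one primary and one secondary, with a health check on the primary. The sketch below builds that pair as plain Python data. The domain name, IP addresses (taken from documentation-only ranges) and health check ID are placeholders, and the helper name is hypothetical.

```python
def failover_record(name, ip_address, role, health_check_id=None, ttl=60):
    """Build one failover record change; role is "PRIMARY" or "SECONDARY"."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{name}-{role.lower()}",
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip_address}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


change_batch = {
    "Comment": "Primary site with DR failover",
    "Changes": [
        failover_record("app.example.com.", "192.0.2.10", "PRIMARY",
                        health_check_id="hypothetical-health-check-id"),
        failover_record("app.example.com.", "198.51.100.20", "SECONDARY"),
    ],
}
print(len(change_batch["Changes"]))  # 2
```

With boto3, a structure like this is what the Route 53 client's change_resource_record_sets call accepts as its ChangeBatch argument. A short TTL is chosen here so that clients pick up a failover quickly; the right value depends on your RTO.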


Databases

  • Amazon RDS – Amazon's managed relational database service
    • Design DR solutions with Multi-AZ to fail over to another Availability Zone
    • Scale with Read Replicas or use Read Replicas in other regions
    • Offer regional failover option by promoting a read replica if required or copy snapshots to other regions
  • DynamoDB
    • Highly scalable and fully managed database system
    • Simple and cost-effective to store and retrieve any amount of data and serve any request level
    • Reliable throughput and single-digit millisecond latency
    • Copy data to DynamoDB in another region or to Amazon S3 for DR preparation
    • Scale up seamlessly during recovery phase of DR
  • Amazon Redshift
    • Petabyte-scale data warehousing service
    • For DR preparation, take snapshots of your data warehouse and store them in Amazon S3, either in the same region or copied to another region
    • For recovery, restore your data warehouse to same region or another region
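
The regional failover option for Amazon RDS, promoting a read replica, can be sketched in the same style. Again this is an assumption-laden sketch rather than a runbook: rds_client is assumed to be a boto3 RDS client for the Region hosting the replica, the helper name and instance identifier are invented, and a fake client replaces AWS so the example runs locally.

```python
def promote_replica_for_failover(rds_client, replica_identifier):
    """Promote a read replica to a standalone, writable DB instance.

    rds_client is assumed to be a boto3 RDS client created for the
    Region that hosts the replica.
    """
    response = rds_client.promote_read_replica(
        DBInstanceIdentifier=replica_identifier,
    )
    return response["DBInstance"]["DBInstanceIdentifier"]


class FakeRdsClient:
    """Stand-in client so the sketch runs without AWS access."""

    def __init__(self):
        self.promoted = []

    def promote_read_replica(self, DBInstanceIdentifier):
        self.promoted.append(DBInstanceIdentifier)
        return {"DBInstance": {"DBInstanceIdentifier": DBInstanceIdentifier}}


fake_rds = FakeRdsClient()
promoted = promote_replica_for_failover(fake_rds, "orders-db-replica-dr")
print(promoted)
```

Promotion breaks replication from the original primary, so it is a one-way step: after recovery you would need to create new replicas rather than reattach the old primary.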

Deployment Tools

Deployment automation tools can be used as part of an overall DR design to automatically launch required infrastructure including EC2 Instances, ELBs, EBS, Databases and more.

  • AWS CloudFormation – enables you to create a collection of AWS resources in a planned orderly fashion to build infrastructure necessary to host your applications and associated data
  • AWS Elastic Beanstalk – enables you to upload your code and have AWS provide the operating environment needed to support your application
  • AWS OpsWorks – define your environment as a series of layers, and configure each layer as a tier of your application. AWS OpsWorks has automatic host replacement, so in the event of an instance failure, it will be automatically replaced. You can use AWS OpsWorks in the preparation phase to template your environment, and you can combine it with AWS CloudFormation in the recovery phase. You can quickly provision a new stack from the stored configuration that supports the defined RTO.
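
As an illustration of templating your environment ahead of a disaster, the following builds a very small CloudFormation template as a Python dictionary and serialises it to JSON. The resource and parameter names and the default instance type are assumptions for the example, and a real DR stack would also define networking, load balancing and databases.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal DR web server stack (illustrative)",
    "Parameters": {
        # The AMI copied to the DR Region ahead of time.
        "AmiId": {"Type": "AWS::EC2::Image::Id"},
        "InstanceType": {"Type": "String", "Default": "t3.micro"},
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "AmiId"},
                "InstanceType": {"Ref": "InstanceType"},
            },
        }
    },
    "Outputs": {
        "InstanceId": {"Value": {"Ref": "WebServer"}},
    },
}

template_body = json.dumps(template, indent=2)
print(template_body)
```

Keeping a template like this under version control during the preparation phase means the recovery phase is reduced to creating a stack from it, for example by passing template_body as the TemplateBody argument of the boto3 CloudFormation client's create_stack call.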


This brings us to the end of Part 1 of our revision article, Domain 1 – High Availability and Business Continuity in preparation for the AWS Certified Solutions Architect – Professional Exam.

