A video production company is planning to move some of its workloads to the AWS Cloud. The company will require around 5 TB of storage for video processing with the maximum possible I/O performance. They also require over 400 TB of extremely durable storage for storing video files and 800 TB of storage for long-term archival.
Which combination of services should a Solutions Architect use to meet these requirements?
1. Amazon EC2 instance store for maximum performance, Amazon S3 for durable data storage, and Amazon S3 Glacier for archival storage.
The best I/O performance can be achieved by using instance store volumes for the video processing. Instance store is safe to use when the data can be recreated from the source files, which is the case here.
For storing data durably, Amazon S3 is a good fit as it provides 99.999999999% (11 nines) durability. For archival, the video files can then be moved to Amazon S3 Glacier, a low-cost storage option that is ideal for long-term archival.
Save time with our AWS cheat sheets.
A company runs a dynamic website that is hosted on an on-premises server in the United States. The company is expanding to Europe and is investigating how they can optimize the performance of the website for European users. The website’s backend must remain in the United States. The company requires a solution that can be implemented within a few days.
What should a Solutions Architect recommend?
4. Use Amazon CloudFront with a custom origin pointing to the on-premises servers.
A custom origin can point to an on-premises server, and CloudFront is able to cache content for dynamic websites. CloudFront provides performance optimizations for custom origins even when they run on on-premises servers. These include persistent TCP connections to the origin and TLS enhancements such as session tickets and OCSP stapling.
Additionally, connections are routed from the nearest Edge Location to the user across the AWS global network. If the on-premises server is connected via a Direct Connect (DX) link this can further improve performance.
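As an illustration, the origin portion of such a distribution might look like the following sketch (the domain name, origin ID, and timeout values are hypothetical placeholders, not the company's actual configuration):

```python
# Sketch of the custom-origin portion of a CloudFront distribution
# configuration. "origin.example.com" is a hypothetical domain name
# that would resolve to the on-premises web server.
def custom_origin(origin_id: str, domain_name: str) -> dict:
    return {
        "Id": origin_id,
        "DomainName": domain_name,
        "CustomOriginConfig": {
            "HTTPPort": 80,
            "HTTPSPort": 443,
            # Always use HTTPS between CloudFront and the origin.
            "OriginProtocolPolicy": "https-only",
            # Keep-alive lets CloudFront reuse TCP connections to the origin.
            "OriginKeepaliveTimeout": 60,
            "OriginReadTimeout": 30,
        },
    }

origin = custom_origin("onprem-origin", "origin.example.com")
print(origin["CustomOriginConfig"]["OriginProtocolPolicy"])  # https-only
```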
Reference:
Amazon CloudFront Dynamic Content Delivery
A company runs an application in a factory that has a small rack of physical compute resources. The application stores data on a network attached storage (NAS) device using the NFS protocol. The company requires a daily offsite backup of the application data.
Which solution can a Solutions Architect recommend to meet this requirement?
1. Use an AWS Storage Gateway file gateway hardware appliance on premises to replicate the data to Amazon S3.
The AWS Storage Gateway Hardware Appliance is a physical, standalone, validated server configuration for on-premises deployments. It comes pre-loaded with Storage Gateway software, and provides all the required CPU, memory, network, and SSD cache resources for creating and configuring File Gateway, Volume Gateway, or Tape Gateway.
A file gateway is the correct type of appliance to use for this use case as it is suitable for mounting via the NFS and SMB protocols.
Reference:
AWS Storage Gateway | Amazon Web Services
A company is deploying a fleet of Amazon EC2 instances running Linux across multiple Availability Zones within an AWS Region. The application requires a data storage solution that can be accessed by all of the EC2 instances simultaneously. The solution must be highly scalable and easy to implement. The storage must be mounted using the NFS protocol.
Which solution meets these requirements?
2. Create an Amazon EFS file system with mount targets in each Availability Zone. Configure the application instances to mount the file system.
Amazon EFS provides scalable file storage for use with Amazon EC2. You can use an EFS file system as a common data source for workloads and applications running on multiple instances. The EC2 instances can run in multiple AZs within a Region and the NFS protocol is used to mount the file system.
With EFS you can create mount targets in each AZ for lower latency. The application instances in each AZ will mount the file system using the local mount target.
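As a sketch, each instance can mount the file system with the standard Linux NFS client; an /etc/fstab entry might look like the following (the file system ID, Region, and mount point are hypothetical placeholders):

```
# /etc/fstab entry (hypothetical file system ID) using the NFS v4.1
# options recommended for EFS. DNS resolves the endpoint to the mount
# target in the instance's own Availability Zone.
fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0
```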
Reference:
Use Amazon EFS with Amazon EC2 Linux instances - Amazon Elastic Compute Cloud
A scientific research institute stores experimental datasets in AWS. Some datasets are accessed daily for analysis, while others remain unused for weeks or months. The datasets are large and must be highly durable, but the institute wants to reduce costs without compromising availability for frequently accessed data.
The institute needs a cost-effective storage solution that adapts to these varying access patterns and ensures the highest durability.
Which storage solution meets these requirements?
1. Use Amazon S3 Intelligent-Tiering to automatically adjust storage costs based on the frequency of data access while maintaining high durability.
S3 Intelligent-Tiering dynamically transitions objects between storage tiers, optimizing costs for infrequent access while retaining the high durability and availability required for datasets.
A global manufacturing company uses AWS Outposts servers to manage IoT workloads in its factories across multiple continents. The company regularly updates factory IoT software, consisting of 50 files, from a central Amazon S3 bucket in the us-east-1 Region. Factories report significant delays when downloading and applying the updates, causing downtime. The company needs to minimize the latency for distributing software updates globally while reducing operational overhead.
Which solution will meet this requirement with the LEAST operational overhead?
2. Create an Amazon S3 bucket in the us-east-1 Region. Set up an Amazon CloudFront distribution with the S3 bucket as the origin. Use signed URLs to download the software updates.
CloudFront caches the software updates at edge locations around the world, significantly reducing latency. Signed URLs ensure secure access, and the solution requires minimal operational overhead.
An application upgrade caused some issues with stability. The application owner enabled logging and has generated a 5 GB log file in an Amazon S3 bucket. The log file must be securely shared with the application vendor to troubleshoot the issues.
What is the MOST secure way to share the log file?
4. Generate a presigned URL and ask the vendor to download the log file before the URL expires.
A presigned URL gives access to the object identified in the URL. When you create a presigned URL, you provide your security credentials and then specify a bucket name, an object key, an HTTP method (GET for downloading the object, in this case), and an expiration date and time. The presigned URL is valid only for the specified duration; that is, the action must be started before the expiration date and time.
Encryption alone does not help here, as the bucket would be public and anyone could access the object.
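As a rough illustration of the expiry mechanics, the sketch below parses the X-Amz-Date and X-Amz-Expires query parameters that a presigned URL carries and checks whether the URL is still within its validity window (the URL, signature, and timestamps are hypothetical example values):

```python
# A presigned URL embeds its issue time (X-Amz-Date) and lifetime in
# seconds (X-Amz-Expires) as query parameters. This sketch checks
# whether a given URL would still be accepted at a given moment.
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

def is_presigned_url_valid(url: str, now: datetime) -> bool:
    params = parse_qs(urlparse(url).query)
    issued = datetime.strptime(
        params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(params["X-Amz-Expires"][0]))
    return now < issued + lifetime

url = ("https://examplebucket.s3.amazonaws.com/logs/app.log"
       "?X-Amz-Date=20240101T000000Z&X-Amz-Expires=3600&X-Amz-Signature=abc123")
print(is_presigned_url_valid(url, datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc)))  # True: 30 minutes in
print(is_presigned_url_valid(url, datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)))   # False: past the 1-hour window
```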
Reference:
Sharing objects with presigned URLs
Objects uploaded to Amazon S3 are initially accessed frequently for a period of 30 days. Then, objects are infrequently accessed for up to 90 days. After that, the objects are no longer needed.
How should lifecycle management be configured?
3. Transition to ONEZONE_IA after 30 days. After 90 days, expire the objects.
In this scenario we need to keep the objects in the STANDARD storage class for 30 days as the objects are being frequently accessed. We can configure a lifecycle action that then transitions the objects to INTELLIGENT_TIERING, STANDARD_IA, or ONEZONE_IA. After that we don’t need the objects so they can be expired.
All other options do not meet the stated requirements or are not supported lifecycle transitions. For example:
* You cannot transition to REDUCED_REDUNDANCY from any storage class.
* Transitioning from STANDARD_IA to ONEZONE_IA is possible, but the objects are no longer needed after 90 days, so retaining them incurs unnecessary costs.
* Transitioning to GLACIER is possible but again incurs unnecessary costs.
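The chosen rule can be sketched as an S3 Lifecycle configuration; the "logs/" prefix filter is a hypothetical example:

```python
# Sketch of the S3 Lifecycle rules for this scenario: objects stay in
# STANDARD for the first 30 days, transition to ONEZONE_IA, and are
# expired at day 90 because they are no longer needed.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "onezone-ia-then-expire",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            # Frequently accessed for 30 days, then infrequently accessed.
            "Transitions": [{"Days": 30, "StorageClass": "ONEZONE_IA"}],
            # No longer needed after 90 days.
            "Expiration": {"Days": 90},
        }
    ]
}

rule = lifecycle_configuration["Rules"][0]
print(rule["Transitions"][0]["StorageClass"], rule["Expiration"]["Days"])  # ONEZONE_IA 90
```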
Reference:
Transitioning objects using Amazon S3 Lifecycle
A medical research institution generates large volumes of patient imaging data daily. These images are initially stored on on-premises block storage systems connected to medical devices. Due to limited local storage capacity, the institution needs to offload data to the cloud. The data must remain accessible to on-premises analysis applications with low latency for frequently accessed images. The institution requires a storage solution that integrates with its existing setup and minimizes operational management.
Which solution will meet these requirements with the MOST operational efficiency?
1. Use AWS Storage Gateway Volume Gateway in cached mode. Configure cached volumes as iSCSI targets to store the primary dataset in AWS and cache frequently accessed imaging data locally.
The cached mode stores the main dataset in AWS while caching frequently accessed data locally, ensuring low-latency access to imaging data for on-premises applications. It also addresses the storage limitations of the institution’s local environment.
Reference:
AWS Storage Gateway Documentation
A data analytics company is testing a Python-based application that processes customer data on an Amazon EC2 Linux instance. A single 1 TB Amazon Elastic Block Store (Amazon EBS) General Purpose SSD (gp3) volume is currently attached to the EC2 instance for data storage.
The company plans to deploy the application across multiple EC2 instances in an Auto Scaling group. All instances must access the same data that is currently stored on the EBS volume. The company needs a highly available and cost-effective solution that minimizes changes to the application code.
Which solution will meet these requirements?
2. Use Amazon Elastic File System (Amazon EFS) and configure it in General Purpose performance mode. Mount the EFS file system on all EC2 instances.
Amazon EFS is a highly available, scalable, and resilient shared storage solution. It allows multiple EC2 instances to access the same data concurrently without requiring changes to the application code. General Purpose performance mode ensures low latency for shared access workloads.
A pharmaceutical company is migrating its legacy inventory management system to AWS. The system runs on Microsoft Windows Server and uses shared block storage for data consistency and failover. The company requires a highly available solution that supports active-passive clustering across multiple Availability Zones. The storage solution must minimize operational overhead while ensuring low-latency access to data.
Which solution will meet these requirements with the LEAST implementation effort?
1. Deploy Amazon FSx for Windows File Server in Multi-AZ mode. Configure a Windows Server failover cluster across two Amazon EC2 instances in different Availability Zones, using FSx for Windows File Server as the shared storage.
FSx for Windows File Server provides fully managed, highly available shared storage designed specifically for Windows-based workloads. It integrates seamlessly with Windows failover clusters, minimizing operational complexity.
A company is using AWS DataSync to migrate millions of files from an on-premises system to AWS. The files are 10 KB in size on average. The company wants to use Amazon S3 for file storage. For the first year after the migration, the files will be accessed once or twice and must be immediately available. After 1 year, the files must be archived for at least 7 years.
Which solution will meet these requirements MOST cost-effectively?
4. Configure a DataSync task to transfer the files to S3 Standard-Infrequent Access (S3 Standard-IA). Use a lifecycle configuration to transition the files to S3 Glacier Deep Archive after 1 year with a retention period of 7 years.
S3 Standard-IA is cost-effective for infrequently accessed data that must remain immediately available. After 1 year, transitioning to S3 Glacier Deep Archive is the most cost-effective choice for long-term archival with rare access requirements.
A company operates a self-managed Microsoft SQL Server database hosted on Amazon EC2 instances with Amazon Elastic Block Store (Amazon EBS) volumes. The company uses daily EBS snapshots for backup. Recently, an issue arose when a snapshot cleanup script unintentionally deleted all the snapshots. The solutions architect must design a solution to prevent accidental deletions while avoiding indefinite retention of EBS snapshots.
Which solution will meet these requirements with the LEAST development effort?
4. Apply an EBS snapshot retention rule in Recycle Bin to retain snapshots for 7 days before permanent deletion.
Recycle Bin provides a simple way to recover snapshots deleted by mistake. By configuring a retention rule, snapshots are stored securely for the defined duration, preventing accidental deletions without requiring custom development.
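As a sketch, the retention rule might be expressed with parameters along these lines (following the shape of the Recycle Bin CreateRule API; the description text is illustrative):

```python
# Sketch of the parameters for a Recycle Bin retention rule covering
# EBS snapshots. Deleted snapshots are retained for 7 days, during
# which they can be recovered, before being permanently deleted.
recycle_bin_rule = {
    "ResourceType": "EBS_SNAPSHOT",
    "RetentionPeriod": {
        "RetentionPeriodValue": 7,
        "RetentionPeriodUnit": "DAYS",
    },
    "Description": "Retain deleted EBS snapshots for 7 days",
}
print(recycle_bin_rule["RetentionPeriod"]["RetentionPeriodValue"])  # 7
```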
A company runs its critical storage application in the AWS Cloud. The application uses Amazon S3 in two AWS Regions. The company wants the application to send remote user data to the nearest S3 bucket with no public network congestion. The company also wants the application to fail over with the least amount of management of Amazon S3.
Which solution will meet these requirements?
4. Set up Amazon S3 to use Multi-Region Access Points in an active-active configuration with a single global endpoint. Configure S3 Cross-Region Replication.
S3 Multi-Region Access Points allow the application to route user requests automatically to the nearest S3 bucket based on network conditions and proximity, minimizing latency and avoiding public network congestion. It also provides failover capabilities with minimal management effort.
A company requires a fully managed replacement for an on-premises storage service. The company’s employees often work remotely from various locations. The solution should also be easily accessible to systems connected to the on-premises environment.
Which solution meets these requirements?
2. Use Amazon FSx to create an SMB file share. Connect remote clients to the file share over a client VPN.
Amazon FSx for Windows File Server (Amazon FSx) is a fully managed, highly available, and scalable file storage solution built on Windows Server that uses the Server Message Block (SMB) protocol. It allows for Microsoft Active Directory integration, data deduplication, and fully managed backups, among other critical enterprise features.
An Amazon FSx file system can be created to host the file shares. Clients can then be connected to an AWS Client VPN endpoint and gateway to enable remote access. The protocol used in this solution will be SMB.
Reference:
Accessing SMB file shares remotely with Amazon FSx for Windows File Server
A Solutions Architect has been tasked with migrating 30 TB of data from an on-premises data center within 20 days. The company has an internet connection that is limited to 25 Mbps and the data transfer cannot use more than 50% of the connection speed.
What should a Solutions Architect do to meet these requirements?
3. Use AWS Snowball.
This is a simple case of working out roughly how long the migration would take using the 12.5 Mbps of bandwidth that is available for transfer (50% of the 25 Mbps connection) and seeing which options are feasible. Transferring 30 TB of data at 12.5 Mbps would take upwards of 200 days.
Therefore, we know that using the Internet connection will not meet the requirements and we can rule out any solution that will use the internet (all options except for Snowball). AWS Snowball is a physical device that is shipped to your office or data center. You can then load data onto it and ship it back to AWS where the data is uploaded to Amazon S3.
Snowball is the only solution that will achieve the data migration requirements within the 20-day period.
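The bandwidth arithmetic can be checked quickly:

```python
# Back-of-the-envelope check of the transfer time over the 12.5 Mbps
# that is available (50% of the 25 Mbps internet connection).
data_bits = 30 * 10**12 * 8          # 30 TB expressed in bits
available_bps = 12.5 * 10**6         # 12.5 Mbps in bits per second
transfer_days = data_bits / available_bps / 86_400  # seconds per day
print(round(transfer_days))  # roughly 222 days, far beyond the 20-day deadline
```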
Reference:
AWS Snowball
A Solutions Architect works for a company looking to centralize its Machine Learning Operations. Currently, the company has a large amount of existing cloud storage holding operational data that is used for machine learning analysis. Some data also exists within an Amazon RDS MySQL database, and they need a solution that can easily retrieve data from the database.
Which service can be used to build a centralized data repository to be used for Machine Learning purposes?
2. AWS Lake Formation
AWS Lake Formation is a service that makes it easy to set up a secure data lake in days. A data lake is a centralized, curated, and secured repository that stores all your data, both in its original form and prepared for analysis. With AWS Lake Formation, you can import data from MySQL, PostgreSQL, SQL Server, MariaDB, and Oracle databases running in Amazon Relational Database Service (RDS) or hosted in Amazon Elastic Compute Cloud (EC2). Both bulk and incremental data loading are supported.
Reference:
AWS Lake Formation Features
A Financial Services company currently stores data in Amazon S3. Each bucket contains items that have different access patterns. The Chief Financial Officer of the organization, having noticed a sharp increase in the company's S3 bill, wants to reduce the S3 spend as quickly as possible.
What is the quickest way to reduce the S3 spend with the LEAST operational overhead?
3. Transition the objects to the appropriate storage class by using an S3 Lifecycle configuration.
An S3 Lifecycle configuration can transition existing objects to lower-cost storage classes that match their access patterns. It is configured directly on the buckets, requires no application changes, and takes effect quickly, making it the option with the least operational overhead.
Reference:
Managing the lifecycle of objects
A financial services company runs a credit evaluation system in a private subnet behind an Application Load Balancer (ALB) in a VPC. The VPC includes a NAT gateway and an internet gateway. The system analyzes customer credit data and uploads the results to Amazon S3 for reporting.
The company has strict regulatory requirements stating that all data traffic must remain within AWS’s private network and must not traverse the public internet. Additionally, the company wants to implement a cost-effective solution while ensuring compliance.
Which solution will meet these requirements MOST cost-effectively?
2. Configure an S3 gateway endpoint. Update the route table of the private subnet to direct S3 traffic through the endpoint.
An S3 gateway endpoint enables private and secure communication between the VPC and Amazon S3, ensuring no traffic leaves the AWS network. It is also a cost-effective solution because gateway endpoints do not require additional infrastructure such as NAT gateways.
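As an illustration, the endpoint could be created with parameters along these lines (the VPC ID, Region, and route table ID are hypothetical placeholders):

```python
# Sketch of the parameters for creating an S3 gateway endpoint
# (the shape used by the EC2 CreateVpcEndpoint API).
s3_gateway_endpoint = {
    "VpcEndpointType": "Gateway",
    "VpcId": "vpc-0123456789abcdef0",
    "ServiceName": "com.amazonaws.us-east-1.s3",
    # Associating the private subnet's route table adds a route that
    # sends S3 traffic through the endpoint instead of the NAT gateway.
    "RouteTableIds": ["rtb-0123456789abcdef0"],
}
print(s3_gateway_endpoint["VpcEndpointType"])  # Gateway
```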
A scientific research organization runs an on-premises simulation application that processes large datasets. The organization has migrated all simulation data to Amazon S3 to reduce costs. The simulation application requires low-latency storage access for seamless performance during processing tasks.
The organization needs to design a storage solution that minimizes costs while maintaining the performance requirements of the application.
Which storage solution will meet these requirements in the MOST cost-effective way?
1. Use Amazon S3 File Gateway to provide low-latency storage for the on-premises application. The File Gateway will cache frequently accessed data locally.
S3 File Gateway provides a seamless and cost-effective way to access data stored in Amazon S3. It locally caches frequently accessed data, reducing latency while still leveraging the cost benefits of S3 storage.
A research organization is planning to migrate its simulation analysis platform to AWS. The platform stores simulation results and logs on an on-premises NFS server. The platform’s codebase is legacy and cannot be modified to use any protocol other than NFS to store and retrieve data. The organization needs a storage solution on AWS that supports NFS and is highly available and scalable.
Which storage solution should a solutions architect recommend for use after the migration?
2. Use Amazon Elastic File System (Amazon EFS) to provide an NFS-compatible shared file system that integrates with AWS services.
Amazon EFS is a fully managed, scalable, and highly available file storage service that supports NFS. It is designed to work seamlessly with applications requiring NFS without additional setup or modifications.
A company has on-premises file servers that include both Windows SMB and Linux NFS protocols. The company plans to migrate to AWS and consolidate these file servers into a managed cloud solution. The chosen solution must support both NFS and SMB access, provide protocol sharing, and offer redundancy at the Availability Zone level.
Which solution will meet these requirements?
1. Use Amazon FSx for NetApp ONTAP to consolidate storage and enable multi-protocol access for both SMB and NFS.
Amazon FSx for NetApp ONTAP supports both SMB and NFS protocols with multi-protocol sharing and redundancy across Availability Zones.
A company needs to implement a new data retention policy for regulatory compliance. As part of this policy, sensitive documents that are stored in an Amazon S3 bucket must be protected from deletion or modification for a fixed period of time.
Which solution will meet these requirements?
2. Enable S3 Object Lock on the required objects and set compliance mode.
Compliance mode ensures that no user, including the root user, can delete or modify objects during the retention period, making it suitable for regulatory requirements.
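As a sketch, compliance-mode retention might be applied per object with parameters along these lines (following the shape of the S3 PutObjectRetention API; the bucket, key, and retention date are hypothetical):

```python
# Sketch of the parameters for applying Object Lock compliance-mode
# retention to an object. Versioning and Object Lock must already be
# enabled on the bucket.
from datetime import datetime, timezone

object_retention = {
    "Bucket": "compliance-documents",
    "Key": "reports/2024/audit.pdf",
    "Retention": {
        # COMPLIANCE mode: no user, including the root user, can
        # shorten or remove the retention during this period.
        "Mode": "COMPLIANCE",
        "RetainUntilDate": datetime(2031, 1, 1, tzinfo=timezone.utc),
    },
}
print(object_retention["Retention"]["Mode"])  # COMPLIANCE
```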
A company recently performed a lift and shift migration of its on-premises Oracle database workload to run on an Amazon EC2 memory-optimized Linux instance. The EC2 Linux instance uses a 1 TB Provisioned IOPS SSD (io1) EBS volume with 64,000 IOPS. The database storage performance after the migration is slower than the performance of the on-premises database.
Which solution will improve storage performance?
1. Add more Provisioned IOPS SSD (io1) EBS volumes. Use OS commands to create a Logical Volume Management (LVM) stripe.
Creating a striped volume using multiple io1 EBS volumes allows you to aggregate performance and exceed the performance limits of a single volume, resulting in better storage performance.
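As a rough illustration of the aggregation: striping across N volumes multiplies the per-volume IOPS ceiling, bounded by the instance's own EBS performance limit (the 260,000 figure below is a hypothetical instance-level cap, which varies by instance type):

```python
# Striping with LVM spreads I/O across volumes, so aggregate IOPS is
# the sum of the member volumes' IOPS, capped by the EC2 instance's
# EBS-optimized limit.
def striped_iops(volumes: int, iops_per_volume: int, instance_limit: int) -> int:
    return min(volumes * iops_per_volume, instance_limit)

single = striped_iops(1, 64_000, 260_000)   # one io1 volume maxes out at 64,000
striped = striped_iops(4, 64_000, 260_000)  # four striped volumes aggregate IOPS
print(single, striped)  # 64000 256000
```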