Home › Forums › AWS › AWS Certified Solutions Architect Professional › Bonus exam incorrect answer
-
A retail company is using Amazon OpenSearch Service to analyze its sales and inventory data. Every week, new data from an Amazon S3 Standard bucket is indexed and loaded into a 20-data node Amazon OpenSearch cluster. Read-only queries are performed on this data to monitor recent trends. After 1 week, it’s occasionally accessed for identifying long-term patterns. After three months, the index containing the older data is deleted from the system. However, due to audit requirements, the company needs to keep a complete copy of all processed data.
The company is looking for strategies to reduce storage costs without abandoning Amazon OpenSearch. A slower query response time on infrequently accessed data is acceptable as long as it can be retrieved on demand.
Which solution fits the requirements while being the MOST cost-effective?
Answer marked “correct”:
Downsize the OpenSearch cluster by reducing the number of its data nodes. Add UltraWarm nodes to compensate for the read capacity. Create an Index State Management (ISM) policy that moves data to cold storage after 1 week. Use an S3 lifecycle policy to transition data older than 3 months to S3 Glacier Deep Archive.

S3 Glacier Deep Archive cannot be retrieved "on demand" — retrievals take a minimum of 12 hours. By no reasonable measure is "on demand" 12 hours later, nor can 12 hours reasonably be considered a "slower query response time". Given the text of the question, the correct answer is actually S3 Standard-IA, because it can be retrieved on demand.
Should be the correct answer:
Downsize the OpenSearch cluster by reducing the number of its data nodes. Add UltraWarm nodes to compensate for the read capacity. Create an Index State Management (ISM) policy that moves data to cold storage after 1 week. Use an S3 lifecycle policy to transition data older than 3 months to the S3 Standard-Infrequent Access (S3 Standard-IA) tier.
-
The question says that data older than 1 week is classified as "infrequently accessed". After 3 months the index is deleted altogether and the data no longer needs to be queryable in OpenSearch — it only needs to be stored in an archive somewhere. Therefore Glacier Deep Archive is a valid answer, and the correct one for MOST cost-effective. Data in OpenSearch cold storage will experience a slight delay in query responses, but per the question that's acceptable for data older than 1 week.
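The lifecycle transition described here can be sketched as an S3 lifecycle configuration. This is a minimal illustration, not the exam's reference implementation — the bucket name and `processed/` prefix are placeholders:

```python
# Lifecycle rule archiving objects older than 3 months (90 days) to
# S3 Glacier Deep Archive. The dict follows the shape accepted by
# boto3's put_bucket_lifecycle_configuration.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-processed-data",
            "Status": "Enabled",
            "Filter": {"Prefix": "processed/"},  # hypothetical prefix
            "Transitions": [
                {"Days": 90, "StorageClass": "DEEP_ARCHIVE"}
            ],
        }
    ]
}

# With real credentials, it would be applied like this:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-sales-bucket",  # placeholder bucket name
#     LifecycleConfiguration=lifecycle_config,
# )
```

Note that the objects in the S3 Standard bucket simply age out into Deep Archive; whether the OpenSearch index still exists is irrelevant to this rule.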
-
The answer is eliminated by the wording of the question: “A slower query response time on infrequently accessed data is acceptable as long as it can be retrieved on demand.”
Glacier Deep Archive cannot be retrieved on-demand, hence the answer is not correct.
-
Since the index in OpenSearch is deleted after 3 months, you can’t query the data there to retrieve it. Based on that, I don’t think this question considers data older than 3 months as “infrequently accessed” – that data is not accessed at all since it is no longer indexed in OpenSearch and therefore should be sent to Glacier Deep Archive.
Standard data: < 1 week
Infrequently accessed data: 1 week to 3 months
Archived, never-accessed data (since the index is no longer in OpenSearch): > 3 months
-
Hi dotcloud,
Thank you for sending this question over, and apologies for the late response. Let's further analyze the scenario you shared:
A retail company is using Amazon OpenSearch Service to analyze its sales and inventory data. Every week, new data from an Amazon S3 Standard bucket is indexed and loaded into a 20-data node Amazon OpenSearch cluster. Read-only queries are performed on this data to monitor recent trends. After 1 week, it’s occasionally accessed for identifying long-term patterns. After three months, the index containing the older data is deleted from the system. However, due to audit requirements, the company needs to keep a complete copy of all processed data.
The company is looking for strategies to reduce storage costs without abandoning Amazon OpenSearch. A slower query response time on infrequently accessed data is acceptable as long as it can be retrieved on demand.
Which solution fits the requirements while being the MOST cost-effective?
It is important to note that the company states that "a slower query response time on infrequently accessed data is acceptable as long as it can be retrieved on demand," and that it is looking for the MOST cost-effective solution. Hence the correct answer is: "Downsize the OpenSearch cluster by reducing the number of its data nodes. Add UltraWarm nodes to compensate for the read capacity. Create an Index State Management (ISM) policy that moves data to cold storage after 1 week. Use an S3 lifecycle policy to transition data older than 3 months to S3 Glacier Deep Archive."
Transitioning data older than 3 months to S3 Glacier Deep Archive meets the company's requirement to keep a complete copy of all processed data for audit purposes. Amazon S3 Glacier Deep Archive is a secure, durable, and extremely low-cost Amazon S3 storage class designed for long-term retention of archival data that is accessed once or twice a year. The S3 Standard-IA tier, on the other hand, is a more expensive storage tier than S3 Glacier Deep Archive, and its minimum storage duration is only 30 days — a poor fit given the company's requirement to store the data for at least three months and only occasionally access it.
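To make the cost difference concrete, here is a back-of-the-envelope comparison. The per-GB prices are illustrative us-east-1 figures at the time of writing (check the S3 pricing page for current numbers), and the archive size is a made-up example:

```python
# Illustrative USD per GB-month prices (us-east-1, subject to change):
PRICE_STANDARD_IA = 0.0125
PRICE_DEEP_ARCHIVE = 0.00099

data_gb = 10_000  # hypothetical size of the audit archive

cost_ia = data_gb * PRICE_STANDARD_IA
cost_da = data_gb * PRICE_DEEP_ARCHIVE

print(f"S3 Standard-IA:        ${cost_ia:,.2f}/month")
print(f"Glacier Deep Archive:  ${cost_da:,.2f}/month")
print(f"Deep Archive is roughly {cost_ia / cost_da:.0f}x cheaper at rest")
```

At these rates, Deep Archive storage is over an order of magnitude cheaper — the trade-off being the multi-hour restore time discussed above.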
For further reading, you can check the links below:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html
https://aws.amazon.com/s3/pricing/
Hope this clarifies any confusion. Please don’t hesitate to drop any message/question if you need further assistance.
Regards,
Nikee @ Tutorials Dojo