Member (August 18, 2023 at 5:48 am)
I have the following scenario and was hoping to find an explanation in the test, but unfortunately I didn't find one.
Below is the exam question:
A company wants to provide its data analysts with uninterrupted access to the data in its Amazon Redshift cluster. All data is streamed to an Amazon S3 bucket with Amazon Kinesis Data Firehose. An AWS Glue job that is scheduled to run every 5 minutes issues a COPY command to move the data into Amazon Redshift.
The amount of data delivered is uneven throughout the day, and cluster utilization is high during certain periods. The COPY command usually completes within a couple of seconds. However, when a load spike occurs, locks can occur and data can be missed. Currently, the AWS Glue job is configured to run without retries, with a timeout of 5 minutes, and with a concurrency of 1.
How should a data analytics specialist configure the AWS Glue job to optimize fault tolerance and improve data availability in the Amazon Redshift cluster?
- A. Increase the number of retries. Decrease the timeout value. Increase the job concurrency.
- B. Keep the number of retries at 0. Decrease the timeout value. Increase the job concurrency.
- C. Keep the number of retries at 0. Decrease the timeout value. Keep the job concurrency at 1.
- D. Keep the number of retries at 0. Increase the timeout value. Keep the job concurrency at 1.
I don't think increasing the job concurrency will help in this case; I would rather increase the timeout value and go with D. Could you provide an explanation for this scenario?
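For what it's worth, option D's reasoning can be sketched as a Glue job-settings payload in the style of boto3's `glue.update_job` call. This is a minimal illustration, not a complete call: the function name, the chosen timeout of 30 minutes, and the omission of the other fields the real UpdateJob API requires (such as the job's role and command) are all assumptions for the sake of the example.

```python
def build_job_update(timeout_minutes: int = 30) -> dict:
    """Build a partial JobUpdate payload reflecting option D.

    Assumed values for illustration only; a real update_job call
    needs the job's full definition (Role, Command, etc.).
    """
    return {
        # Keep retries at 0: re-running a COPY that failed due to a
        # lock would just add another competing load.
        "MaxRetries": 0,
        # Raise the timeout from 5 minutes so the job can wait out
        # temporary lock contention instead of being killed mid-COPY.
        "Timeout": timeout_minutes,
        # Keep concurrency at 1 so two COPY commands never contend
        # for the same table locks.
        "ExecutionProperty": {"MaxConcurrentRuns": 1},
    }

update = build_job_update(30)
print(update)
```

The idea is that the timeout is the only knob that addresses the failure mode described (short COPY commands occasionally blocked by locks), while extra retries or concurrent runs would add more competing COPYs during exactly the spikes that cause the problem.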
Thank you very much,
Administrator (August 21, 2023 at 8:35 pm)
I noticed the question you posted isn’t from our practice exams. Please note that we only handle queries related to our materials. For external questions, it might be most effective to reach out to the original author. They’ll likely be in the best position to assist you.
Thanks for understanding!
Carlo @ Tutorials Dojo