Q23 – Review mode Set 3
-
Q23. A company runs multiple Apache Spark jobs on Amazon EMR. Each job extracts and analyzes data from a Hadoop Distributed File System (HDFS) and then writes the results to an Amazon S3 bucket. However, some of the jobs fail with the following error:
HTTP 503 "Slow Down" AmazonS3Exception
Which actions could be taken to resolve the error? (SELECT TWO)
- A. Enable S3 Transfer Acceleration on the Amazon S3 bucket.
- B. Increase the number of retries allowed by the EMR File System (EMRFS).
- C. Modify the Spark job to write results to unique S3 prefixes per job.
- D. Shorten the retention period of job history files on the HDFS.
- E. Increase the number of Spark partitions in the EMR cluster.
Correct answers are B and C.
Why is B correct? The question says the jobs are using HDFS, not EMRFS, and I don’t understand that. Yes, I would select B and C, but the question specifically says HDFS…
Thank you.
- This discussion was modified 1 year, 2 months ago by ccalvo.
-
Hello ccalvo,
Thanks for the feedback.
In the scenario, HDFS is where Spark reads the data from. After processing this data, the jobs write the results to an Amazon S3 bucket.
While the question doesn’t say so explicitly, Amazon EMR uses the EMR File System (EMRFS) when interfacing with Amazon S3. The error message (AmazonS3Exception) also hints that the problem lies in how the jobs interact with S3, not with HDFS.
So, based on this information, we can assume that the issue arises when EMR writes the results of the Spark jobs to the S3 bucket: the jobs send requests to the bucket faster than S3 will accept them, and S3 responds with 503 Slow Down. That makes options B (Increase the number of retries allowed by EMRFS) and C (Modify the Spark job to write results to unique S3 prefixes per job) valid approaches to solve this problem. More retries give throttled requests another chance to succeed, and unique prefixes spread the load, since S3 applies its request rate limits per prefix.
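To make the two options more concrete, here is a rough Python sketch. It assumes the EMRFS retry limit is controlled by the `fs.s3.maxRetries` property in the `emrfs-site` configuration classification, which is how I understand the AWS documentation describes it; please verify against your EMR release. The retry value of 20, the bucket name, the job name, and both helper functions are made up for illustration and are not part of the original question.

```python
import json
import uuid
from typing import Optional


def emrfs_retry_configuration(max_retries: int) -> list:
    """Build an EMR configuration list that raises the EMRFS retry limit (option B).

    Assumption: the limit is set through fs.s3.maxRetries in emrfs-site.
    """
    if max_retries < 1:
        raise ValueError("max_retries must be a positive integer")
    return [
        {
            "Classification": "emrfs-site",
            # EMR configuration property values are passed as strings.
            "Properties": {"fs.s3.maxRetries": str(max_retries)},
        }
    ]


def job_output_path(bucket: str, job_name: str, run_id: Optional[str] = None) -> str:
    """Return an S3 URI with its own prefix for each job run (option C).

    A random run ID is generated when none is supplied, so concurrent
    jobs do not all write under the same prefix.
    """
    run_id = run_id or uuid.uuid4().hex
    return f"s3://{bucket}/{job_name}/{run_id}/"


if __name__ == "__main__":
    # Hypothetical values, chosen only for the example.
    print(json.dumps(emrfs_retry_configuration(20), indent=2))
    print(job_output_path("example-results-bucket", "daily-sales"))
```

The printed JSON could be supplied as the cluster configuration when launching EMR, and the returned path could be used as the Spark output location, for example `df.write.parquet(job_output_path(...))`.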
Let me know if this answers your question.
Regards,
Carlo @ Tutorials Dojo
-
Yes, after your explanation, I have understood it.
Thank you very much 👍
-
The forum ‘AWS Certified Data Analytics – Specialty’ is closed to new discussions and replies.