

  • Q23 – Review mode Set 3

  • ccalvo

    Member
    July 9, 2023 at 5:05 pm

Q23. A company runs multiple Apache Spark jobs using Amazon EMR. Each job extracts and analyzes data from a Hadoop Distributed File System (HDFS) and then writes the results to an Amazon S3 bucket. However, some of the jobs fail with an HTTP 503 "Slow Down" AmazonS3Exception error.

    Which actions should be taken to rectify the error? (SELECT TWO)

    • A. Enable S3 Transfer Acceleration on the Amazon S3 bucket.
    • B. Increase the number of retries allowed by the EMR File System (EMRFS).
    • C. Modify the Spark job to write results to unique S3 prefixes per job.
    • D. Shorten the retention period of job history files on the HDFS.
    • E. Increase the number of Spark partitions in the EMR cluster.

    Correct answers are B and C.

    Why is B correct? The question says they are using HDFS, not EMRFS. I don’t understand that. Yes, I would select B and C, but it specifically says they are using HDFS…

    Thank you.

  • Carlo-TutorialsDojo

    Administrator
    July 10, 2023 at 10:41 pm

    Hello ccalvo,

    Thanks for the feedback.

    In the scenario, HDFS is where the data is being read by Spark. After processing this data, the results are then written to an Amazon S3 bucket.

    While the question doesn’t say so explicitly, Amazon EMR uses EMRFS whenever it interfaces with Amazon S3. The error message mentioned (AmazonS3Exception) is also a hint that the problem relates to how the cluster interacts with S3 rather than with HDFS.

    So, based on this information, we can assume that the issue arises while EMR is writing the Spark job results to the S3 bucket. That makes option B (increase the number of retries allowed by EMRFS) and option C (modify the Spark job to write results to unique S3 prefixes per job) valid approaches. Option C works because Amazon S3 scales request rates per prefix (roughly 3,500 PUT/COPY/POST/DELETE requests per second per prefix), so spreading the writes across unique prefixes reduces throttling.
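    To make the two fixes concrete, here is a minimal, hypothetical sketch (the bucket name, job name, and retry value are placeholders, not from the question): the `emrfs-site` classification raises the EMRFS retry limit at cluster creation, and a small helper builds a unique S3 output prefix per job run.

    ```python
    # Sketch of fixes B and C for the 503 "Slow Down" error.
    # All names here are illustrative placeholders.

    import uuid

    # Fix B: raise the EMRFS retry limit via an EMR configuration
    # classification, supplied when creating the cluster, e.g.:
    #   aws emr create-cluster --configurations file://emrfs.json ...
    EMRFS_RETRY_CONFIG = [
        {
            "Classification": "emrfs-site",
            "Properties": {
                # fs.s3.maxRetries defaults to 15; a higher value lets
                # EMRFS back off and retry instead of failing the job.
                "fs.s3.maxRetries": "50",
            },
        }
    ]

    # Fix C: give each job run its own S3 prefix so writes spread
    # across prefixes instead of hammering a single one.
    def unique_output_prefix(bucket: str, job_name: str) -> str:
        """Return an s3:// path that is unique for each job run."""
        return f"s3://{bucket}/results/{job_name}/{uuid.uuid4().hex}/"

    # In the Spark job, the results would then be written with e.g.:
    #   df.write.parquet(unique_output_prefix("my-bucket", "daily-agg"))
    ```

    Applying either fix alone often helps; combining them both retries transient throttling and lowers the request rate any single prefix sees.
    
    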

    Let me know if this answers your question.

    Regards,

    Carlo @ Tutorials Dojo

    • ccalvo

      Member
      July 11, 2023 at 1:35 am

      Yes, after your explanation, I have understood it.

      Thank you very much 👍


The forum ‘AWS Certified Data Analytics – Specialty’ is closed to new discussions and replies.
