

  • Irene-TutorialsDojo

    Administrator
    March 19, 2026 at 12:40 pm

    Hi Mohiddin,

    Thank you for reaching out and for the great question! The correct answer is Step 1: Filter → Step 2: Transform (to columnar/Parquet) → Step 3: Compress.

    The key reason is that compression works best after transformation into a columnar format like Parquet. Columnar formats store each column’s values contiguously and compress them per column, with a codec suited to each column’s data type. This shrinks storage and reduces disk I/O during query processing, since engines can read and decompress only the columns a query needs. Compressing the CSV first applies a single generic codec to interleaved row data: a gzip-compressed CSV is not splittable, so downstream jobs cannot process it in parallel, and mixed row data yields noticeably worse compression ratios.
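    You can see the layout effect with nothing but the Python standard library. This small sketch (the column names and values are made up for illustration) gzip-compresses the same records twice: once interleaved as CSV rows, and once with each column’s values stored contiguously, the way a columnar format arranges data before compressing:

    ```python
    import csv
    import gzip
    import io

    # Synthetic records with low-cardinality string columns, similar to log data.
    rows = [(f"sensor-{i % 97}", f"status-{i % 101}") for i in range(20_000)]

    # Row-oriented layout: values from different columns interleaved, as in a CSV.
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    row_layout = buf.getvalue().encode()

    # Column-oriented layout: each column's values stored contiguously,
    # so the codec sees long runs of similar, same-type values.
    col_layout = "\n".join("\n".join(col) for col in zip(*rows)).encode()

    row_size = len(gzip.compress(row_layout))
    col_size = len(gzip.compress(col_layout))
    print(f"row-oriented: {row_size} bytes, column-oriented: {col_size} bytes")
    ```

    Even with the same generic codec (gzip) on both, the column-oriented layout compresses far smaller here; real columnar formats like Parquet go further by picking a codec and encoding per column.
    
    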

    AWS Prescriptive Guidance explicitly states: “When authoring ETL jobs, we recommend outputting transformed data in a column-based data format. Columnar data formats, such as Apache Parquet and ORC, are designed to minimize data movement and maximize compression. Compressing data also helps reduce the amount of data stored, and it improves read/write operation performance.”
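    As a rough standard-library sketch of that recommended order (the field names and data are hypothetical; a real Glue job would write actual Parquet rather than gzip-per-column):

    ```python
    import gzip

    # Hypothetical raw events: (event_type, region, payload_size)
    raw = [("click", f"region-{i % 5}", str(i % 300)) for i in range(5000)]
    raw += [("noise", "region-0", "0")] * 1000

    # Step 1: Filter — drop unneeded records before doing any further work.
    filtered = [r for r in raw if r[0] != "noise"]

    # Step 2: Transform — pivot rows into a column-oriented layout,
    # analogous to what a Parquet writer does internally.
    names = ("event_type", "region", "payload_size")
    columns = {name: [row[i] for row in filtered] for i, name in enumerate(names)}

    # Step 3: Compress — compress each column independently, so the codec
    # operates on homogeneous values of a single type and domain.
    compressed = {name: gzip.compress("\n".join(vals).encode())
                  for name, vals in columns.items()}

    for name, blob in compressed.items():
        print(name, len(blob), "bytes")
    ```

    Filtering first means the transform and compression steps only touch data you intend to keep, and compressing last means the codec works on the already-columnar layout.
    
    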

    Notice the deliberate order: transform to columnar first, then compress. The answer key in our practice exam reflects this AWS best practice, and we appreciate your diligence in verifying it! If you have further questions, don’t hesitate to ask.

    Cheers,

    Irene @ Tutorials Dojo
