  • Glue Crawler Classifier question has two right answers, I believe.

  • jonathan-crane

    Member
    March 12, 2021 at 9:05 am

    The question asks:

    A Data Analyst is using an Amazon DynamoDB table for keeping inventory and order management data. The Data Analyst adds a custom classifier to an AWS Glue crawler to extract data from the database. After running the crawler, AWS Glue returns a classification string of UNKNOWN.

    What is the most likely reason for the returned classification string?

    And the options are:

    AWS Glue has invoked a built-in classifier.

    AWS Glue has invoked a custom classifier with a certainty of -1.

    AWS Glue was unable to find a classifier with certainty greater than 0.0.

    AWS Glue has invoked a custom classifier that matches the schema of a built-in classifier.

    Two options are correct. First the obvious one:

    AWS Glue was unable to find a classifier with certainty greater than 0.0.

    However, for that to be true, the option

    AWS Glue has invoked a built-in classifier.

    must also be true. According to the documentation (https://docs.aws.amazon.com/glue/latest/dg/add-classifier.html):

    “If AWS Glue doesn’t find a custom classifier that fits the input data format with 100 percent certainty, it invokes the built-in classifiers in the order shown in the following table.”

    and

    “If no classifier returns certainty=1.0, AWS Glue uses the output of the classifier that has the highest certainty. If no classifier returns a certainty greater than 0.0, AWS Glue returns the default classification string of UNKNOWN.”

    Therefore, to end up with a final certainty of 0.0, Glue has to run every custom classifier and then ALL built-in classifiers, and every one of them must return a certainty of 0.0 before the UNKNOWN classification is returned.

  • Carlo-TutorialsDojo

    Administrator
    March 13, 2021 at 1:48 am

    Hello Jonathan,

    Thanks for your feedback.

    The answer “AWS Glue has invoked a built-in classifier” describes a step along the way rather than the cause.

    AWS Glue returns the UNKNOWN classification string because it wasn’t able to find a classifier with a certainty greater than 0.0. That is the root cause.

    Let me know your thoughts.

    Regards,

    Carlo

  • jonathan-crane

    Member
    March 13, 2021 at 2:16 am

    Yeah Carlo, now that I’ve slept on it, the key is the wording: “What is the most likely reason for the returned classification string?”

    The REASON is that it couldn’t find anything with a score above zero. But it did indeed invoke all the built-in classifiers before coming to that conclusion.

    I suppose it’s tricky on purpose, since the exam is tricky, too!

    Thanks for your help!
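
For anyone who wants to see the selection rule spelled out, here is a minimal Python sketch of the behavior described in the documentation passages quoted in the thread. It is only a model of the documented rule, not Glue's actual implementation: the `classify` function, the `Classifier` type, and the two sample classifiers are all hypothetical names made up for illustration.

```python
from typing import Callable, List, Tuple

# Assumption: a classifier is modeled as a function that takes raw bytes
# and returns (classification string, certainty between 0.0 and 1.0).
Classifier = Callable[[bytes], Tuple[str, float]]


def classify(data: bytes,
             custom: List[Classifier],
             built_in: List[Classifier]) -> str:
    """Model of the documented rule: custom classifiers first, then built-ins."""
    best_label, best_certainty = "UNKNOWN", 0.0
    for classifier in [*custom, *built_in]:
        label, certainty = classifier(data)
        if certainty == 1.0:
            # A classifier that is 100 percent certain wins immediately.
            return label
        if certainty > best_certainty:
            # Otherwise remember the highest certainty seen so far.
            best_label, best_certainty = label, certainty
    # If nothing scored above 0.0, best_label is still "UNKNOWN".
    return best_label


# Hypothetical classifiers, for illustration only.
def custom_never_matches(data: bytes) -> Tuple[str, float]:
    return ("my-custom-format", 0.0)


def built_in_json(data: bytes) -> Tuple[str, float]:
    return ("json", 1.0 if data.lstrip().startswith(b"{") else 0.0)


if __name__ == "__main__":
    # Every classifier (custom and built-in) runs and returns 0.0.
    print(classify(b"\x00\x01", [custom_never_matches], [built_in_json]))  # UNKNOWN
    # The custom classifier fails, but a built-in one is certain.
    print(classify(b'{"a": 1}', [custom_never_matches], [built_in_json]))  # json
```

In the first call the built-in classifier is invoked, yet the result is UNKNOWN only because no classifier returned a certainty above 0.0. In the second call a built-in classifier is also invoked and the result is not UNKNOWN, which is why invoking a built-in classifier cannot by itself be the reason for the UNKNOWN string.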
