Find answers, ask questions, and connect with our
community around the world.

Home Forums AWS AWS Certified Data Engineer Associate DEA-C01 Confusing question about Amazon Redshift

  • Confusing question about Amazon Redshift

  • Felipe Diego

    Member
    August 17, 2024 at 3:53 am

    The question:

    A company has a data warehouse solution running in an Amazon Redshift cluster in AWS. The company’s data engineer runs the following query in Redshift:

    select * from pg_table_def where schemaname = current_schema() limit 1;

    Upon running the query, there was an error in the output that says:

    INFO: Function "current_schema()" not supported

    What is the MOST likely root cause of this issue?

    • current_schema() is a compute node–only function in Redshift and can’t be run with a query that references a Redshift pg_table_def user table that resides on the leader nodes.
    • A query that references catalog tables such as pg_table_def or that does not reference any tables, runs exclusively on the compute node.
    • A query that references catalog tables such as pg_table_def and user-created functions like current_schema() can only exclusively run on the leader node.

    • The current_schema() is a leader node–only function in Amazon Redshift and can’t be run with a query that references a Redshift pg_table_def system table that resides on the compute nodes.

    The correct anwser given by the platform is the last one but actually the right one is “A query that references catalog tables such as pg_table_def and user-created functions like current_schema() can only exclusively run on the leader node.” because the query mentioned on the question is executed only on the leader node. The pg_table_def is a catalog table, then it is executed on the leader node and the function current_schema() is also executed on the leader node. Even on the review of this question is saying that.<div>

    The review of this question:

    Some Amazon Redshift queries are distributed and run on the compute nodes; other queries run exclusively on the leader node.

    The leader node distributes SQL to the compute nodes when a query references user-created tables or system tables (tables with an STL or STV prefix and system views with an SVL or SVV prefix). A query that references only catalog tables (tables with a PG prefix, such as PG_TABLE_DEF) or that does not reference any tables, runs exclusively on the leader node.

    Some Amazon Redshift SQL functions are supported only on the leader node and are not supported on the compute nodes. A query that uses a leader-node function must run exclusively on the leader node, not on the compute nodes, or it will return an error.

    </div>

  • Neil-TutorialsDojo

    Member
    August 20, 2024 at 9:38 am

    Hello Felipe Diego,

    Good day!

    Thank you for posting here.

    The options that says: “A query that references catalog tables such as pg_table_def and user-created functions like current_schema() can only exclusively run on the leader node.” was deemed incorrect because this option suggests that both catalog tables (like pg_table_def) and user-created functions (like current_schema()) must exclusively run on the leader node. However, this is not entirely correct. While the current_schema() function is indeed a leader node function, queries referencing catalog tables like pg_table_def can be executed across compute nodes as well, depending on the query.

    The last option is correct because this option specifically and accurately states that current_schema() is a leader node-only function. This means it can only be executed on the leader node. The key part of the correct option(last one given) is that it explains why this causes a problem: the query references a pg_table_def system table that resides on the compute nodes, leading to an error. This is the correct interpretation of the issue.
    I hope this helps. If you have more queries, pls let us know.

    I hope this helps. If you have more queries, pls let us know.


    Regards,
    Neil @ Tutorials Dojo

  • RyanDymek

    Member
    September 10, 2024 at 5:47 am

    This answer seems to be incorrect. According to Amazon PG tables exclusively reside on the leader node. In fact, the query specified in this question should not result in an error and is even the EXACT query used in the Redshift documentation showing that it works, and that both the PG table, along with current_schema() are both in fact run on the leader node. see this reference: https://docs.aws.amazon.com/redshift/latest/dg/c_sql-functions-leader-node.html

  • to22

    Member
    September 27, 2024 at 9:38 pm

    I agree with Ryan that the answer is not correct.

    See https://docs.aws.amazon.com/redshift/latest/dg/r_PG_TABLE_DEF.html :

    A query that references only catalog tables (tables with a PG prefix, such as PG_TABLE_DEF) or that does not reference any tables, runs exclusively on the leader node.

    So it does not make sense that PG_TABLE_DEF resides on compute nodes as stated in the answer currently marked as correct.

  • Neil-TutorialsDojo

    Member
    September 30, 2024 at 10:04 am

    Hello Felipe Diego, RyanDymek, and to22,

    Good day!

    Thank you all for the valuable input and for taking the time to share your insights and documentation. I apologize for the confusion caused by my previous explanation. Upon reviewing the Amazon Redshift documentation you provided, it is clear that my initial understanding was incorrect.

    The pg_table_def table as correctly pointed out, resides exclusively on the leader node in Redshift, and so does the current_schema() function. The query in question should not result in an error under normal circumstances, and the behavior described in the previous answer was inaccurate. Both the pg_table_def table and the current_schema() function are indeed leader node–only elements, and should operate without issue in the scenario mentioned.

    I appreciate your patience, and I would like to assure you that we will be updating this question and its answer to reflect the correct behavior of Redshift in such cases, ensuring alignment with the official AWS documentation. This correction will help avoid any confusion in the future when learners review this material.

    Once again, thank you for your thorough feedback, and should you have any further queries or suggestions, please don’t hesitate to reach out.

    Best regards,
    Neil @ Tutorials Dojo

Viewing 1 - 5 of 5 replies

Log in to reply.

Original Post
0 of 0 posts June 2018
Now
Skip to content