Ends in
00
days
00
hrs
00
mins
00
secs
SHOP NOW

Azure Sale - Get Up to 25% OFF All Azure Reviewers

Find answers, ask questions, and connect with our
community around the world.

Home Forums AWS AWS Certified Data Engineer Associate DEA-C01 Confusing question about Amazon Redshift

  • Confusing question about Amazon Redshift

  • Felipe Diego

    Member
    August 17, 2024 at 3:53 am

    The question:

    A company has a data warehouse solution running in an Amazon Redshift cluster in AWS. The company’s data engineer runs the following query in Redshift:

    select * from pg_table_def where schemaname = current_schema() limit 1;

    Upon running the query, there was an error in the output that says:

    INFO: Function "current_schema()" not supported

    What is the MOST likely root cause of this issue?

    • current_schema() is a compute node–only function in Redshift and can’t be run with a query that references a Redshift pg_table_def user table that resides on the leader nodes.
    • A query that references catalog tables such as pg_table_def or that does not reference any tables, runs exclusively on the compute node.
    • A query that references catalog tables such as pg_table_def and user-created functions like current_schema() can only exclusively run on the leader node.

    • The current_schema() is a leader node–only function in Amazon Redshift and can’t be run with a query that references a Redshift pg_table_def system table that resides on the compute nodes.

    The correct anwser given by the platform is the last one but actually the right one is “A query that references catalog tables such as pg_table_def and user-created functions like current_schema() can only exclusively run on the leader node.” because the query mentioned on the question is executed only on the leader node. The pg_table_def is a catalog table, then it is executed on the leader node and the function current_schema() is also executed on the leader node. Even on the review of this question is saying that.<div>

    The review of this question:

    Some Amazon Redshift queries are distributed and run on the compute nodes; other queries run exclusively on the leader node.

    The leader node distributes SQL to the compute nodes when a query references user-created tables or system tables (tables with an STL or STV prefix and system views with an SVL or SVV prefix). A query that references only catalog tables (tables with a PG prefix, such as PG_TABLE_DEF) or that does not reference any tables, runs exclusively on the leader node.

    Some Amazon Redshift SQL functions are supported only on the leader node and are not supported on the compute nodes. A query that uses a leader-node function must run exclusively on the leader node, not on the compute nodes, or it will return an error.

    </div>

  • Neil-TutorialsDojo

    Member
    August 20, 2024 at 9:38 am

    Hello Felipe Diego,

    Good day!

    Thank you for posting here.

    The options that says: “A query that references catalog tables such as pg_table_def and user-created functions like current_schema() can only exclusively run on the leader node.” was deemed incorrect because this option suggests that both catalog tables (like pg_table_def) and user-created functions (like current_schema()) must exclusively run on the leader node. However, this is not entirely correct. While the current_schema() function is indeed a leader node function, queries referencing catalog tables like pg_table_def can be executed across compute nodes as well, depending on the query.

    The last option is correct because this option specifically and accurately states that current_schema() is a leader node-only function. This means it can only be executed on the leader node. The key part of the correct option(last one given) is that it explains why this causes a problem: the query references a pg_table_def system table that resides on the compute nodes, leading to an error. This is the correct interpretation of the issue.
    I hope this helps. If you have more queries, pls let us know.

    I hope this helps. If you have more queries, pls let us know.


    Regards,
    Neil @ Tutorials Dojo

  • RyanDymek

    Member
    September 10, 2024 at 5:47 am

    This answer seems to be incorrect. According to Amazon PG tables exclusively reside on the leader node. In fact, the query specified in this question should not result in an error and is even the EXACT query used in the Redshift documentation showing that it works, and that both the PG table, along with current_schema() are both in fact run on the leader node. see this reference: https://docs.aws.amazon.com/redshift/latest/dg/c_sql-functions-leader-node.html

Viewing 1 - 3 of 3 replies

Log in to reply.

Original Post
0 of 0 posts June 2018
Now