Home › Forums › AWS › AWS Certified Data Engineer Associate DEA-C01 › Using regular expressions in the Redshift query
-
Using regular expressions in the Redshift query
JR-TutorialsDojo updated 4 months ago
3 Members
·
5
Posts
-
Question: A data engineering team at a company is using Amazon <strong style=”font-family: inherit; font-size: inherit;”>Redshift for their data warehousing needs. They have <strong style=”font-family: inherit; font-size: inherit;”>a table with a column named ‘category’. The team needs to create a query that returns a list of data where the ‘category’ field starts with either “TAD” or “AGO”. T<strong style=”font-family: inherit; font-size: inherit;”>hey are considering using regular expressions in their Redshift query to achieve this.
Which of the following Amazon Redshift SQL queries would correctly fulfill the team’s requirement?
Answer: SELECT * FROM data_table WHERE category REGEXP ‘^TAD|^AGO’;
I executed this query on Redshift, and it’s getting an error as the screenshot.
Please review and guide me to the correct answer. Thanks!
-
Hello anvo0000,
Thank you for bringing this to our attention, and we sincerely apologize for any confusion caused.
We’ve reviewed the issue and confirmed that the correct option needs to be updated. The necessary changes should be reflected on the portal soon.
Let us know if you need further assistance.
Best regards,
JR @ Tutorials Dojo-
Bonus Review Mode Set 3 Question 15 also has an incorrect answer.
You’re asked to return a LIST of data but the given answer returns a COUNT of data.
-
Hello johnsher,
Thank you for bringing this to our attention.
Could you please provide a snippet of the question so we can look it up?
Regards,
JR @ Tutorials Dojo -
Hello johnsher,
I think you are referring to the same scenario.
You’re absolutely right to question that. At first glance, REGEXP_COUNT does sound like it’s returning a count rather than a list. But in this case, it’s being used inside a WHERE clause, which means we’re not interested in the actual count value. We’re just checking whether it’s non-zero (i.e., whether the pattern matches). If it does, the row is included in the result set.
So even though the function name says “COUNT,” the query is still returning a list of rows where the category starts with either “TAD” or “AGO.” You may refer to the given explanation, especially the image attached, since it shows the actual output, which lists the rows where the category starts with either “TAD” or “AGO.” That confirms the query behaves as intended.
I hope this helps! Let us know if you need further assistance.
Best regards,
JR @ Tutorials Dojo
-
-
Log in to reply.