Member, February 5, 2022 at 11:35 am
CDA test 2, question 30, says that for deduplication to work the developer should ensure the messages are sent AT LEAST 5 minutes apart. I think message deduplication is NOT guaranteed to work if the messages are more than 5 minutes apart.
Administrator, February 8, 2022 at 5:56 am
Hello Earthsopha Gus,
I understand where you’re coming from. According to the question, there are times when the sensors send duplicate data. The most common cause of this is network interference/outage so let’s assume that’s the case.
The question is quite tricky, as there's actually no exact rate at which you can fire requests that guarantees no duplicates. There are other factors outside SQS to consider as well, the most important of which is how long a network outage lasts between SendMessage requests. Since it's not possible to determine how long a network outage may last, duplicates could still appear regardless of the send rate. So I agree that spacing messages at some fixed interval, whether more or less than 5 minutes, won't solve duplicates on its own. There are strategies that you can employ to solve the issue of duplicates due to network failures, but they're outside the scope of the CDA exam as they aren't really SQS-specific.
Generally, there are two parameters that you can configure on an SQS FIFO queue to prevent duplicates: the ContentBasedDeduplication queue attribute and the MessageDeduplicationId message parameter. The first one generates a message deduplication ID by calculating a SHA-256 hash of the message body (so two distinct messages with identical content are treated as duplicates, while a resend whose body differs even slightly is not). The second one allows you to set your own ID. Between the two, the second one is more effective in suppressing duplicates because you decide what counts as the same message, which is also why it's the correct answer. The intent of the question is to highlight the usage of MessageDeduplicationId, not the rate at which you must send messages. We will update this item.
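To make the difference between the two options concrete, here is a minimal local sketch in Python. It is a toy model, not the real SQS implementation: the class `FifoDedupSim`, its `send` method, the example IDs such as `sensor-1:reading-1`, and the exact boundary behavior at the 300-second mark are all assumptions made for illustration. It only mimics the documented idea that a deduplication ID is remembered for the 5-minute deduplication interval.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 300  # the 5-minute deduplication interval


class FifoDedupSim:
    """Toy local model of FIFO-queue deduplication (hypothetical, not SQS itself)."""

    def __init__(self, content_based=False):
        self.content_based = content_based
        self.seen = {}       # deduplication id -> time it was first accepted
        self.delivered = []  # message bodies that would reach consumers

    def send(self, body, now, dedup_id=None):
        """Return True if the message is delivered, False if it is suppressed."""
        if dedup_id is None:
            if not self.content_based:
                raise ValueError("a deduplication id is required")
            # Content-based mode: derive the id from a hash of the body.
            dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_seen = self.seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False  # same id inside the window: treated as a duplicate
        self.seen[dedup_id] = now
        self.delivered.append(body)
        return True


# Content-based: identical bodies collide, and the window eventually expires.
by_content = FifoDedupSim(content_based=True)
content_results = [
    by_content.send("temp=21", now=0),    # first copy is delivered
    by_content.send("temp=21", now=10),   # suppressed, even if it is a new reading
    by_content.send("temp=21", now=400),  # delivered again: the window has passed
]

# Explicit ids: the producer decides what counts as the same message.
by_id = FifoDedupSim()
id_results = [
    by_id.send("temp=21", now=0, dedup_id="sensor-1:reading-1"),
    by_id.send("temp=21", now=10, dedup_id="sensor-1:reading-1"),  # retry, suppressed
    by_id.send("temp=21", now=20, dedup_id="sensor-1:reading-2"),  # new reading, delivered
]

print(content_results, id_results)
```

In this sketch, the third content-based send shows the original poster's point: a resend that arrives after the window has passed is delivered again, so spacing messages further apart does not help deduplication.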
Let me know if you have further questions.
Carlo @ TutorialsDojo
Member, February 8, 2022 at 10:22 am
Thanks, that seems like an improvement.
It was from reading about this question that I learned about the deduplication interval. If that never comes up in the CDA test, I guess it's best to de-emphasize it. The wording in the AWS docs sounds like there's some variability in how long that interval can be. They guarantee 5 minutes, and I guess that guarantee has to hold even if you are maxing out the queue at 3K messages a second, in which case AWS would be holding on to 900K (presumably indexed) IDs in a moving window. I guess I will test how long deduplication works on a low-volume queue and see if it is a lot longer than 5 minutes.
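For reference, the 900K figure is just the send rate multiplied by the window length. A quick check in Python, taking the 3K messages per second above as the assumed peak rate (the variable names are only for illustration):

```python
MESSAGES_PER_SECOND = 3_000       # assumed peak send rate from the post above
DEDUP_INTERVAL_SECONDS = 5 * 60   # the 5-minute deduplication interval

# Number of deduplication ids that fall inside one moving window at that rate.
ids_in_window = MESSAGES_PER_SECOND * DEDUP_INTERVAL_SECONDS
print(ids_in_window)  # 900000
```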
Thanks for the thoughtful and helpful answer.