Administrator
January 14, 2024 at 2:46 pm
Thank you for sending this question over. Regarding what you wrote earlier, we don’t just set “assumptions” for our readers. We provide relevant keywords and key phrases so that users can select the answer that best fits the given scenario.
Let’s further analyze the scenario you shared:
A media company recently launched their newly created web application. Many users tried to visit the website, but they are receiving a 503 Service Unavailable Error. The system administrator tracked the EC2 instance status and saw the capacity is reaching its maximum limit and unable to process all the requests. To gain insights from the application’s data, they need to launch a real-time analytics service.
Which of the following allows you to read records in batches?
KEYWORDS / KEY PHRASES:
– launch a real-time analytics service
– read records in batches
The scenario clearly states that there is a need to launch a real-time analytics service. In AWS exam scenarios, a “real-time” requirement almost always points to Amazon Kinesis. This is also mentioned in the provided explanation and is supported by the included reference to the official AWS documentation.
Another key phrase here is “read records in batches”, which describes how AWS Lambda consumes records from a Kinesis data stream. This is also covered in the provided explanation, as well as the AWS docs:
“Lambda reads records from the data stream and invokes your function synchronously with an event that contains stream records. Lambda reads records in batches and invokes your function to process records from the batch. Each batch contains records from a single shard/data stream.”
Amazon Kinesis Data Streams (KDS) is a massively scalable and durable real-time data streaming service. KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources. You can use an AWS Lambda function to process records in Amazon KDS. By default, Lambda invokes your function as soon as records are available in the stream. Lambda can process up to 10 batches in each shard simultaneously. If you increase the number of concurrent batches per shard, Lambda still ensures in-order processing at the partition-key level.
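To make the "concurrent batches per shard" setting more concrete, here is a minimal Python sketch of how the Lambda event source mapping for a stream could be configured. The helper names, the stream ARN, and the function name are hypothetical and only for illustration. The parameter names (BatchSize, ParallelizationFactor, StartingPosition) are those of the Lambda CreateEventSourceMapping API, and the range checks reflect the documented limits for Kinesis sources as I understand them (batch size up to 10,000 and parallelization factor 1 to 10).

```python
def build_mapping_params(stream_arn, function_name, batch_size=100, parallelization_factor=1):
    """Build keyword arguments for Lambda's create_event_source_mapping call.

    Hypothetical helper: it only validates and assembles parameters.
    """
    # Assumption: Kinesis sources accept a batch size between 1 and 10,000.
    if not 1 <= batch_size <= 10000:
        raise ValueError("batch_size must be between 1 and 10000")
    # Lambda can process up to 10 batches per shard concurrently.
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("parallelization_factor must be between 1 and 10")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        "ParallelizationFactor": parallelization_factor,
    }


def create_mapping(params):
    """Create the mapping in AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the helper above works without AWS access

    return boto3.client("lambda").create_event_source_mapping(**params)
```

For example, build_mapping_params("arn:aws:kinesis:us-east-1:123456789012:stream/example-stream", "example-function", parallelization_factor=10) would ask Lambda to process up to 10 batches per shard at once. The ARN and function name here are placeholders, not real resources.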
The first time you invoke your function, AWS Lambda creates an instance of the function and runs its handler method to process the event. When the function returns a response, it stays active and waits to process additional events. If you invoke the function again while the first event is being processed, Lambda initializes another instance, and the function processes the two events concurrently. As more events come in, Lambda routes them to available instances and creates new instances as needed. When the number of requests decreases, Lambda stops unused instances to free up scaling capacity for other functions.
Since the media company needs a real-time analytics service, you can use Kinesis Data Streams to gain insights from your data. The data collected is available in milliseconds. Use AWS Lambda to read records in batches and invoke your function to process records from the batch. If the batch that Lambda reads from the stream only has one record in it, Lambda sends only one record to the function.
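As a rough illustration of what "reading records in batches" looks like from inside the function, here is a minimal Python handler sketch. It relies on the Kinesis event shape that Lambda delivers (a "Records" list whose payloads are base64-encoded under "kinesis" and "data"). The function names, the make_event helper, and the assumption that producers write UTF-8 JSON are mine and are not part of the exam question.

```python
import base64
import json


def kinesis_handler(event, context):
    """Process one batch of Kinesis records delivered by Lambda."""
    items = []
    for record in event["Records"]:
        # Kinesis payloads arrive base64-encoded under kinesis.data.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: each payload is a UTF-8 encoded JSON object.
        items.append(json.loads(payload))
    # A batch may contain a single record; the same loop handles that case.
    return {"batch_size": len(event["Records"]), "items": items}


def make_event(items):
    """Build a minimal fake Kinesis event for local testing (hypothetical helper)."""
    return {
        "Records": [
            {
                "eventSource": "aws:kinesis",
                "kinesis": {
                    "partitionKey": "demo",
                    "data": base64.b64encode(json.dumps(item).encode("utf-8")).decode("ascii"),
                },
            }
            for item in items
        ]
    }
```

Calling kinesis_handler(make_event([{"path": "/home"}, {"path": "/login"}]), None) locally processes a batch of two records in a single invocation, which is the behavior the key phrase refers to.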
Hence, the correct answer in this scenario is: Create a Kinesis Data Stream and use AWS Lambda to read records from the data stream.
The option that says: Create a Kinesis Data Firehose and use AWS Lambda to read records from the data stream is incorrect. Although Amazon Kinesis Data Firehose captures and loads data in near real-time, AWS Lambda can’t be set as its destination. You can write Lambda functions and integrate them with Kinesis Data Firehose to apply additional, customized processing to the data before it is sent downstream. However, this integration is meant for transforming records in flight and not for the actual consumption of the data stream. You have to use a Kinesis Data Stream in this scenario.
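To show how the Firehose integration differs from consuming a stream, here is a minimal Python sketch of a Firehose data transformation function. It follows the transformation contract as I understand it: Firehose passes a "records" list, and the function must return every record with the same recordId, a result status, and base64-encoded data so that Firehose can continue delivering it to the configured destination. The handler name and the uppercase transformation are placeholders for illustration only.

```python
import base64


def firehose_transform_handler(event, context):
    """Transform records in flight and hand them back to Firehose."""
    output = []
    for record in event["records"]:
        text = base64.b64decode(record["data"]).decode("utf-8")
        # Placeholder transformation (assumption): uppercase the payload.
        transformed = text.upper()
        output.append(
            {
                # The recordId must be echoed back unchanged.
                "recordId": record["recordId"],
                # "Ok" marks the record as successfully transformed.
                "result": "Ok",
                "data": base64.b64encode(transformed.encode("utf-8")).decode("ascii"),
            }
        )
    # The records go back to Firehose; the function is not the final consumer.
    return {"records": output}
```

Notice that the function returns the records instead of acting as the end consumer, which is why this option does not satisfy the "read records from the data stream" requirement.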
The options that say: Create an Amazon S3 bucket to store the captured data and use Amazon Athena to analyze the data and Create an Amazon S3 bucket to store the captured data and use Amazon Redshift Spectrum to analyze the data are both incorrect. As per the scenario, the company needs a real-time analytics service that can ingest and process data. You need to use Amazon Kinesis to process the data in real-time.
Regarding this statement:
And if the system load is going to 100% the first thing you do wouldn’t be to create a data stream on incoming data.
I understand your point here since there are many ways to debug, troubleshoot, and solve 503 errors in a web application. As someone who has worked in the industry for 17 years, I’ve seen several production cases where running analytics on your incoming requests makes sense.
For instance, a web application can be inundated with illegitimate requests from attacks that are not blocked by its existing Web Application Firewall. New cyber attacks coming from bots, the Dark Web (TOR requests), web crawlers, and other sources require real-time analysis in order to block the source IP address, rate limit certain endpoints, or blacklist the request’s user agent.
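As a simplified illustration of this kind of analysis, the Python sketch below counts requests per source IP within one batch of request records and flags the addresses that exceed a threshold. The record shape (a "source_ip" key), the function name, and the threshold are hypothetical. A production setup would typically aggregate over time windows and feed the result into a WAF or rate limiter.

```python
from collections import Counter


def find_noisy_ips(requests, threshold):
    """Return the source IPs that appear more than `threshold` times in a batch."""
    # Assumption: each request record is a dict with a "source_ip" key.
    counts = Counter(request["source_ip"] for request in requests)
    return sorted(ip for ip, count in counts.items() if count > threshold)
```

Such a function could run inside the Lambda consumer on each batch it reads from the stream, and the flagged addresses could then be reviewed or blocked.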
It’s important to understand that high CPU utilization is NOT always caused by legitimate user activity or a surge of requests due to mass web promotions. These spikes can also be caused by internal systems (e.g. unoptimized and CPU-intensive reporting modules) and the aforementioned external attacks. At Tutorials Dojo, our learning portal has similar real-time analytics for incoming requests, as well as analytics for user clickstream.
Thus, the provided scenario is not entirely misleading or ambiguous, since it includes the keywords needed to identify the correct answer.
Let us know if you need further assistance. The Tutorials Dojo team is dedicated to helping you pass your AWS exam on your first try!
Jon Bonso @ Tutorials Dojo