  • ECS and spot instances – Spot instance draining

  • juano1985

    Member
    March 14, 2024 at 5:17 am

    Hi JR,

    I have a question about ECS and Spot Instance draining. When we use Spot Instances for our ECS cluster, does our application run a risk of downtime? I ask because some time ago I read a practice question (I can’t remember where) whose scenario said the application on ECS should always be available, while the company wanted to reduce costs as much as possible. I believe the correct answer was to use Spot Instances with Spot Instance draining enabled, instead of On-Demand Instances. This confused me, because I assumed that with Spot Instances we couldn’t be sure our application would be available all the time. Is this different when the Spot Instances sit underneath an ECS cluster?

    Thanks!

    Juan

  • ricardojfdias

    Member
    March 15, 2024 at 11:50 pm

    Hello Juano!

    You have a rather detailed explanation here: EC2 Spot Interruption Handling in ECS :: EC2 Spot Workshops

    One possible risk I see is that no Spot capacity may be available in your region for the instance type in your configuration.

    Best,

    Ricardo

    • juano1985

      Member
      April 1, 2024 at 11:20 pm

      Hi Ricardo,

      Thanks for your response. I was reading a bit more on this and came across the article below, which explains the behaviour. It’s fairly outdated (2018), so there have probably been quite a few fixes and updates since then. As I understand it, and as you say, there could be a case where that instance type is not available in that AZ and new Spot Instances can’t spin up, leading to an outage. Let me know what you think:

      https://medium.com/@kevintruckenmiller/aws-spot-instances-and-ecs-b61c5802b375

      “Don’t Take an Outage

      One problem remains with Spot Fleets, when a spot instance gets outbid in a particular Availability Zone, you have to handle how the instance reacts to the underlying change. If you don’t, you could take an outage because the instance will be terminated and the container riding on top of the instance won’t be notified. On ECS you have the distinct advantage to set your instance to DRAINING. When you set your AWS instance to drain, you tell the host underlying the containers to send the containers a SIGTERM, and you also tell the load balancers target groups to drain the connection of the container so that new traffic doesn’t flow to the containers on that instance. This allows the scheduler to then spin up containers on the other instances within the cluster, and not take an outage on the actual instance that’s been outbid.”

      Thanks!

      Juan

  • JR-TutorialsDojo

    Administrator
    April 2, 2024 at 1:27 pm

    Hello juano1985,

    It would be better if you could provide a snippet of the question so we can look it up. But yes, you’re correct. Spot Instances are a cost-effective option if your applications can be interrupted and you are flexible about when they run. However, Spot capacity might be unavailable during periods of extremely high demand, which can delay Amazon EC2 Spot Instance launches.

    To mitigate the risk of downtime, Amazon ECS allows a container instance to be transitioned to DRAINING status. While an instance is DRAINING, Amazon ECS does not schedule new tasks on it. The service scheduler starts replacement tasks on other container instances if the cluster has enough capacity. If there is not enough container instance capacity, a service event message is sent indicating the issue.
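
    As a rough illustration of the DRAINING transition described above, here is a minimal Python sketch that sets container instances to DRAINING through the ECS UpdateContainerInstancesState API. It assumes a boto3 ECS client is passed in; the helper name drain_container_instances is made up for this example, and the cluster name and instance ID in the usage comment are placeholders.

```python
from typing import Any, Dict, List


def drain_container_instances(
    ecs_client: Any, cluster: str, instance_ids: List[str]
) -> Dict[str, Any]:
    """Ask ECS to move the given container instances to DRAINING.

    ecs_client is assumed to be a boto3 ECS client (boto3.client("ecs")).
    instance_ids may be container instance IDs or full ARNs.
    """
    if not instance_ids:
        raise ValueError("at least one container instance ID or ARN is required")

    # While DRAINING, ECS places no new tasks on these instances, and the
    # service scheduler starts replacement tasks elsewhere if capacity allows.
    return ecs_client.update_container_instances_state(
        cluster=cluster,
        containerInstances=instance_ids,
        status="DRAINING",
    )


# Usage (placeholder names, requires AWS credentials and boto3 installed):
#   import boto3
#   ecs = boto3.client("ecs")
#   drain_container_instances(ecs, "my-cluster", ["<container-instance-id>"])
```

    The client is passed in as a parameter so the helper can be exercised with a stub instead of a real AWS account. With Spot Instance draining enabled, the container agent performs this transition on its own, so a manual call like this is mainly useful for planned maintenance.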

    I hope this helps. Let us know if you have any further questions.

    Regards,
    JR @ Tutorials Dojo

  • juano1985

    Member
    April 2, 2024 at 8:22 pm

    Hi JR,

    Thanks for your response. I just found the question; it’s from another set of practice exams on Udemy, by another instructor. I answered correctly because the other options seemed obviously wrong, but I still found it confusing. Since it says “reducing the probability of service interruptions”, I think we can assume there could still be some service interruptions, hence we can use Spot Instances. For some reason I remembered it as saying there couldn’t be any service interruptions, but looking at it again it’s a bit clearer. Please see the question attached and let me know what you think.

    Thanks!

    Best,

    Juan

    • JR-TutorialsDojo

      Administrator
      April 4, 2024 at 1:35 pm

      Hello juano1985,

      Your understanding seems to be correct. The question is about reducing costs and minimizing the probability of service interruptions, not eliminating them entirely.

      Running Amazon ECS on Spot Instances can indeed be a cost-effective solution. However, Amazon EC2 can interrupt these instances with a two-minute notice when it needs the capacity back.
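
      To make the two-minute notice concrete: EC2 publishes it in the instance metadata under spot/instance-action as a small JSON document with an action and a time. The sketch below only parses such a document and computes how much time is left. The function name and the sample document in the usage note are made up for illustration, and fetching the metadata itself is left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(instance_action_json: str, now: datetime) -> float:
    """Return the seconds left before the Spot interruption takes effect.

    instance_action_json is the body of the spot/instance-action metadata
    document, for example:
        {"action": "terminate", "time": "2024-04-04T13:35:00Z"}
    now must be a timezone-aware datetime.
    """
    doc = json.loads(instance_action_json)
    # The notice carries a UTC timestamp with a trailing "Z".
    deadline = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return (deadline - now).total_seconds()
```

      For example, calling it with the sample document above and a now of 13:33:00 UTC on the same day gives 120.0 seconds, which is the window in which draining has to happen.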

      To handle these potential interruptions, you can enable Spot Instance draining by setting ECS_ENABLE_SPOT_INSTANCE_DRAINING=true in the container agent configuration. When a Spot Instance then receives an interruption notice, the Amazon ECS container agent automatically sets the container instance state to DRAINING, which prevents new tasks from being scheduled for placement on the container instance.

      So, in the context of the given question, using Amazon ECS with Spot Instances and configuring Spot Instance Draining would be a suitable solution. It allows for cost reduction (due to the use of Spot Instances) and reduces the probability of service interruptions (through the use of Spot Instance Draining).
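
      Draining only helps if the tasks themselves shut down cleanly. When ECS stops a task it sends the container a SIGTERM, followed by a SIGKILL once the stop timeout (30 seconds by default) runs out. Here is a minimal Python sketch of a worker loop that finishes its current unit of work and exits when SIGTERM arrives. All names in it (handle_sigterm, install_handler, run_worker) are hypothetical and only meant to show the pattern.

```python
import signal
import threading
from typing import Callable

# Set when the container has been asked to stop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame) -> None:
    """Signal handler: record the stop request, do no heavy work here."""
    shutdown_requested.set()


def install_handler() -> None:
    """Register the SIGTERM handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(process_one: Callable[[], None], poll_seconds: float = 1.0) -> int:
    """Run process_one repeatedly until a stop is requested.

    Returns how many units of work were completed before shutting down.
    """
    completed = 0
    while not shutdown_requested.is_set():
        process_one()  # the unit in progress is always allowed to finish
        completed += 1
        # Sleep between units, but wake up early if a stop is requested.
        shutdown_requested.wait(poll_seconds)
    return completed
```

      In a real task you would call install_handler() once at startup and keep each unit of work comfortably shorter than the stop timeout.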


      I hope this helps clarify the question!

      Regards,
      JR @ Tutorials Dojo
