
Automating Security Incident Response in AWS

Self-service resource isolation API for Amazon Web Services can save time and reduce the impact of breaches

August 26, 2021 | Deepjyoti Nath

Automating security in Amazon Web Services cloud

When security incidents occur, it’s vital to contain them quickly before the damage spreads. In case of a breach, every second matters, but time is already precious for overstretched IT and security teams. So, it makes sense to automate security wherever possible. This technical guide shows you how self-service resource isolation is an effective way to reduce the manual effort involved in incident response.  

What is resource isolation?


Resource isolation is a method of remediating a security incident in the AWS Cloud. Security incidents can occur in the service domain or the infrastructure domain. AWS provides services to log and monitor activity and to detect security-related events in the AWS Cloud environment. These include AWS CloudTrail, Amazon CloudWatch, Amazon S3 server access logs, VPC Flow Logs, Amazon GuardDuty, Amazon Detective, AWS Security Hub, and Amazon Macie.

Service domain incidents in AWS

The impact of a service domain incident can be painful: it can interrupt vital services and seriously inconvenience end users, which can lead to reputational damage. Service domain incidents can affect a customer's AWS account, IAM permissions, resource metadata, and billing. If threat actors gain access to IAM credentials, they can misuse the AWS APIs to disrupt the existing setup.

Infrastructure domain incidents in AWS 

The potential consequences of an infrastructure domain incident include operational downtime, data theft, and compliance breaches. Depending on the severity of the breach and the industry involved, such incidents may be reportable under the GDPR and lead to substantial fines imposed by the EU's Data Protection Authorities.

Incidents in the infrastructure domain include data or network-related activity, such as the traffic to Amazon EC2 instances within the VPC, processes and data on Amazon EC2 instances, and other areas such as containers or future services.


Investigating a security event

An architecture flow diagram of the API-based solution

During a security event investigation, we might need to isolate resources as part of the response to a security anomaly. The intention behind isolating resources is to mitigate the potential impact, prevent the incident from spreading beyond the affected resources, limit the unintended exposure of data, and prevent further unauthorized access.

Developing a security API and codifying the manual steps saves considerable human effort and time. Incident responders can then invoke the API to remediate the issue. Over time, we can automate more steps, implement other runbooks, and ultimately handle several classes of common incidents automatically.
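As an illustration of codifying one manual step, the sketch below isolates an EC2 instance by swapping its security groups for a single isolation group. The article does not say how the isolation is performed, so this technique, the function name, and the tag keys are assumptions. The `ec2` argument is expected to be a boto3 EC2 client, and the isolation security group is assumed to exist already with no inbound or outbound rules.

```python
from __future__ import annotations

from typing import Any


def isolate_ec2_instance(ec2: Any, instance_id: str, isolation_sg_id: str) -> list[str]:
    """Swap the instance's security groups for one isolation group.

    `ec2` is a boto3 EC2 client (or any object with the same methods).
    Returns the original security group IDs so they can be recorded.
    """
    response = ec2.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]
    original_groups = [sg["GroupId"] for sg in instance["SecurityGroups"]]

    # Replace every attached group with the isolation group
    # (assumed to have no inbound or outbound rules).
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[isolation_sg_id])

    # Tag the instance so responders can see that it was isolated
    # and which groups it had before (hypothetical tag keys).
    ec2.create_tags(
        Resources=[instance_id],
        Tags=[
            {"Key": "SecurityIsolation", "Value": "true"},
            {"Key": "OriginalSecurityGroups", "Value": ",".join(original_groups)},
        ],
    )
    return original_groups
```

With boto3 installed, a caller could pass `boto3.client("ec2")` as the first argument. Taking the client as a parameter keeps the function easy to test with a stub.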

How the API solution works 

The solution focuses on isolating IAM users and EC2 instances in AWS accounts. It exposes secured REST APIs and sends standard notifications to operations teams and to account and application owners. The infrastructure resources are managed as code and deployed through a standard workflow, with automation in mind. The architecture flow diagram illustrates the solution approach.
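For the IAM side, a minimal sketch of user isolation might attach an explicit deny-all inline policy and deactivate the user's active access keys. The article does not describe the actual isolation logic, so this approach, the function name, and the policy name are assumptions. The `iam` argument is expected to be a boto3 IAM client.

```python
from __future__ import annotations

import json
from typing import Any

# An explicit Deny overrides any Allow granted by the user's other policies.
DENY_ALL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Deny", "Action": "*", "Resource": "*"}],
}


def isolate_iam_user(
    iam: Any, user_name: str, policy_name: str = "SecurityIsolationDenyAll"
) -> list[str]:
    """Deny all actions for the user and deactivate active access keys.

    `iam` is a boto3 IAM client (or any object with the same methods).
    `policy_name` is a hypothetical name chosen for this sketch.
    Returns the IDs of the access keys that were deactivated.
    """
    iam.put_user_policy(
        UserName=user_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(DENY_ALL_POLICY),
    )

    deactivated = []
    for key in iam.list_access_keys(UserName=user_name)["AccessKeyMetadata"]:
        if key["Status"] == "Active":
            iam.update_access_key(
                UserName=user_name,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",
            )
            deactivated.append(key["AccessKeyId"])
    return deactivated
```

Deactivating keys rather than deleting them keeps the key IDs available for later investigation and makes the step reversible.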

Resource isolation: The steps

When an incident requires resource isolation:

  1. An authorized user in the security team logs in to the security account and requests isolation of an IAM user or EC2 instance by completing the required details in the input form.
  2. The REST API validates the user data and the authenticity of the request, then invokes the appropriate service securely.
  3. The isolation logic is implemented in the service layer as serverless AWS Lambda functions.
  4. The respective Lambda function assumes the permissions needed to access the target account so that it can isolate the specified vulnerable IAM user or EC2 instance.
  5. On the target account, the vulnerable IAM user or EC2 instance is isolated.
  6. The resource isolation details are stored for future reference.
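The steps above can be sketched as a small request handler. This is an illustration under stated assumptions, not the solution's actual code: the field names (`resource_type`, `account_id`, `resource_id`), the role name `SecurityIsolationRole`, and the `isolators` mapping are all hypothetical. In a real deployment each isolator would presumably call AWS STS `AssumeRole` with the role ARN before acting on the target account, and the returned record would be written to a data store.

```python
from __future__ import annotations

import json
import re
from datetime import datetime, timezone
from typing import Callable

ACCOUNT_ID = re.compile(r"^\d{12}$")
INSTANCE_ID = re.compile(r"^i-(?:[0-9a-f]{8}|[0-9a-f]{17})$")
RESOURCE_TYPES = {"iam_user", "ec2_instance"}


def validate_request(body: dict) -> list[str]:
    """Step 2: return a list of validation errors (empty means valid)."""
    errors = []
    if body.get("resource_type") not in RESOURCE_TYPES:
        errors.append("resource_type must be iam_user or ec2_instance")
    if not ACCOUNT_ID.match(str(body.get("account_id", ""))):
        errors.append("account_id must be a 12-digit AWS account ID")
    resource_id = str(body.get("resource_id", ""))
    if not resource_id:
        errors.append("resource_id is required")
    elif body.get("resource_type") == "ec2_instance" and not INSTANCE_ID.match(resource_id):
        errors.append("resource_id is not a valid EC2 instance ID")
    return errors


def target_role_arn(account_id: str, role_name: str = "SecurityIsolationRole") -> str:
    """Step 4: ARN of the (hypothetical) role to assume in the target account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def handle(body: dict, isolators: dict[str, Callable[[str, str], None]], requested_by: str) -> dict:
    """Validate the request, run the matching isolator, and build a record."""
    errors = validate_request(body)
    if errors:
        return {"statusCode": 400, "body": json.dumps({"errors": errors})}

    # Steps 3-5: dispatch to the isolation routine for this resource type.
    role_arn = target_role_arn(body["account_id"])
    isolators[body["resource_type"]](role_arn, body["resource_id"])

    # Step 6: the details to store for future reference (storage not shown).
    record = {
        "resource_type": body["resource_type"],
        "account_id": body["account_id"],
        "resource_id": body["resource_id"],
        "requested_by": requested_by,
        "isolated_at": datetime.now(timezone.utc).isoformat(),
    }
    return {"statusCode": 200, "body": json.dumps(record)}
```

Rejecting malformed input before any cross-account call keeps invalid requests from ever reaching the target account.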

Advice for DevOps and CI/CD


The architecture flow diagram shows that several AWS resources are used to set up the API-based solution. These resources are provisioned and managed with Terraform as Infrastructure as Code (IaC). We define a standard CI/CD approach with separate environments for development, testing, and production. Code developed and reviewed in the development environment is merged and deployed to the test environment, where functional test cases validate the solution. After successful testing, the solution is deployed to the production environment to enable the self-service security incident response API.

Wrapping up 

By codifying incident response runbooks and exposing them through the API to valid users or applications, we bring these benefits to incident response:

  • Preparedness – be ready for any eventuality, regardless of available human resources and expertise
  • Consistency – a uniform approach to incident response, which can also support later investigations or forensic analyses
  • Integrations – the API can also be used across non-AWS applications and systems 

Read on for a deeper dive on each of these benefits of the API.  

Preparedness for incident response 

The AWS Cloud Adoption Framework (AWS CAF), the AWS Security Incident Response Guide, and the AWS Well-Architected Framework recommend that customers formulate known procedures for incident response and test their runbooks before an incident. Testing processes before an event occurs decreases the time it takes to respond in a production environment.

Consistent incident response 

For artifact gathering, codifying the processes as code and infrastructure prepares us for data collection. Codifying turns collection into a repeatable, auditable sequence that records what information was collected, when, and how. This reduces the likelihood of missing data in future investigations.
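One way to picture such an auditable sequence is a small evidence log that records what was collected, how, and when, together with a digest of the collected bytes. The class and field names below are hypothetical and are not part of the solution described in this article.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class EvidenceEntry:
    artifact: str      # what was collected
    method: str        # how it was collected
    sha256: str        # digest of the collected bytes
    collected_at: str  # when it was collected (UTC, ISO 8601)


@dataclass
class EvidenceLog:
    incident_id: str
    entries: list[EvidenceEntry] = field(default_factory=list)

    def record(self, artifact: str, method: str, content: bytes) -> EvidenceEntry:
        """Append one collection step, in the order it happened."""
        entry = EvidenceEntry(
            artifact=artifact,
            method=method,
            sha256=hashlib.sha256(content).hexdigest(),
            collected_at=datetime.now(timezone.utc).isoformat(),
        )
        self.entries.append(entry)
        return entry

    def to_json(self) -> str:
        """Serialize the whole log for storage alongside the incident."""
        return json.dumps(asdict(self), indent=2)
```

Because every step goes through `record`, each collected artifact carries the same fields, which is what makes the sequence comparable across incidents.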

Easy and seamless integrations

So far, we have developed the isolation process for IAM users and EC2 instances. We can automate further runbooks and add each one as another feature.
Integrating the features in the same API keeps security configurations and development processes uniform. All the features are exposed as REST APIs and can be integrated easily with other applications or systems.
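To show what integration from another system could look like, the sketch below builds an HTTP request for the isolation API using only the Python standard library. The `/isolations` path, the payload fields, and the `x-api-key` header are assumptions made for illustration; the article does not specify the API's routes or its authentication scheme.

```python
import json
import urllib.request


def build_isolation_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to a hypothetical isolation endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/isolations",  # hypothetical route
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumed authentication header
        },
        method="POST",
    )
```

A calling system would pass the result to `urllib.request.urlopen(request, timeout=10)` and handle the HTTP status it receives. Separating request construction from sending keeps the integration easy to test without network access.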

About the author
Deepjyoti Nath, Technical Architect AWS Cloud Services, T-Systems International GmbH
