Over the past year I’ve seen a huge uptick in interest for concrete advice on handling security incidents inside the cloud, with cloud native techniques. As organizations move their production workloads to the cloud, it doesn’t take long for the security professionals to realize that the fundamentals, while conceptually similar, are quite different in practice. One of those core concepts is that of the kill chain, a term first coined by Lockheed Martin to describe the attacker’s process. Break any link and you break the attack, so this maps well to combining defense in depth with the active components of incident response.
In cloud deployments we have four major categories of attack, each with different kill chains:
- Attacks on the cloud platform itself. Ignoring a fundamental compromise of the cloud provider (outside of the cloud customer’s hands) these attacks typically focus on misconfigurations of cloud services. If you leave an S3 bucket public, fail to put an authorizer on an API Gateway, or expose your credentials for AWS on GitHub, it falls in this category.
- Attacks on customer-deployed resources and applications in the cloud. These traditional attacks are no different than those run against your data center. Common examples include SQL injection in a web application, and vulnerable servers with the wrong ports open to the Internet. These tend to be a bit more constrained than they would be against a data center, assuming you use accounts/subscriptions/projects and VPCs or virtual networks to limit blast radius.
- Attacks against your cloud administrators and developers. Next time you run a penetration test, make sure you let the testers try to phish your developers and admins. This is one of the best ways to pivot into the cloud, since it’s often a lot easier for an attacker to gain access to a developer system than to break the cloud application itself. We’ll cover this in the future, but let’s just start by saying “MFA is my friend”.
- Blended attacks. This is the category we are going to focus on today. In these attacks the threat actor breaks into something deployed in the cloud and then uses that to pivot into the cloud management plane. (Some would consider attacks against developers to be blended, but I like to break them out separately).
As a rule of thumb I always start by assuming any successful attack at any level can escalate or pivot into a blended attack, at which point your management plane security and incident response become your best defenses.
Today I’m going to focus on one of the most common blended attack processes and outline a mix of detective and preventative controls to help break the kill chain. Before I get into the details, do not take this post as an over-simplification of a complex problem. Managing what I’m about to discuss at scale is incredibly difficult even when you know what you are doing.
Over the next few weeks we will be rolling out our first Ops specifically designed for these issues to our early access customers and we should have them in production relatively soon after that.
The Blended AWS Attack: Extracting IAM Role Credentials
In a blended attack the threat actor breaks into something more traditional and then uses that to pivot into the cloud management plane. There are three primary ways this occurs. In each case the attacker extracts either static stored credentials or ephemeral IAM role credentials, which we will explain in a minute.
- Direct compromise of an instance or a container. For example, if you leave port 22 open and the attacker can hack in or otherwise gain shell access.
- Server Side Request Forgery (SSRF). The attacker takes advantage of (typically) a web server/service vulnerability to make the server issue requests on their behalf, without ever gaining shell access.
- Compromise of a Lambda function. Although you can’t gain a shell on a Lambda function, it is still subject to code injection and even arbitrary code execution if the application code is flawed. The exact implications can resemble SSRF.
In each case the attacker’s goal is to gain credentials to the AWS management plane and then leverage existing privileges or escalate privileges. We will discuss privilege escalation in a future post, and for now will focus on what those credentials are and how you can prevent their abuse.
Most people understand static credentials, which in AWS are an Access Key and a Secret Key. They are like a username and password but are used for AWS API calls. The current version uses a cryptographic process known as Signature Version 4 to sign the HTTP request when you make those API calls. You can and should treat them just like a username and password, and you should never store them within cloud resources such as instances and Lambda functions.
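To make the signing idea concrete, here is a minimal sketch of the publicly documented Signature Version 4 key derivation. It covers only the signing-key step, not building the canonical request, and the secret, date, region, and service values in the usage lines are made-up placeholders. The point it illustrates is that the secret key itself never travels with the request; only a signature derived from it does.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a SigV4 signing key scoped to one day, region, and service.

    date_stamp is YYYYMMDD. The chained HMACs mean a leaked signing key
    is only useful for that date/region/service combination.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that would go in the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Placeholder values for illustration only; not real credentials.
    key = derive_signing_key("EXAMPLE-SECRET-KEY", "20190801", "us-east-1", "s3")
    print(sign(key, "placeholder string to sign"))
```

In practice the AWS SDKs and CLI do all of this for you, which is exactly why anyone holding the access key and secret key can act as that identity.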
IAM roles are trickier when you first get started in AWS. They are both awesome and scary. An IAM Role in AWS is effectively a container for permissions that you use for a session. IAM Roles are great because they aren’t credentials per se… when you assume the role, AWS provides a set of credentials for a time-limited session. Roles are an “inside AWS only” thing. You can assign them to resources within AWS (like an instance or a Lambda function) and that resource can now make API calls without static, stored credentials! We use roles for federated identity connections, instances, Lambda functions, and every other service within AWS. Access keys are really only used when you create a user in an AWS account; we use roles for everything else.
Roles have four permission types associated with them:
- What the role can do within AWS. These are the straight up permission policies you attach to the role.
- Who or what can use the role (the trust policy). Creating a role doesn’t mean anything or anyone can use it. This policy restricts access to, for example, EC2 instances or a specific Lambda function.
- A permission boundary to limit the scope of the role. This is a bit more complex and not relevant to our discussion today so we will cover it later.
- When you assume a role for a session, you can also specify a subset of your existing permissions to use for that session. This is a cool feature for least privilege, but also not totally relevant for our discussion today.
It’s probably easier to explain how this works by walking through it. Suppose I have an application that needs to access an S3 bucket or a DynamoDB table. I create an IAM role for the instance and set the trust policy so the EC2 service can use the role. Then I launch an instance and assign the role. AWS runs the instance and has the instance assume the role. Assuming the role opens a session and issues an access key, a secret key, and a session token. AWS then rotates those credentials every 1-6 hours, and the instance can make the API calls authorized by the permission policies.
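As a rough illustration of the walkthrough above, the two policy documents for such a role might look like the sketch below. The helper function names, the bucket name, and the table ARN are hypothetical, and the action lists are just one plausible least-privilege choice, not a recommendation for any particular application.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy: WHO may assume the role (here, the EC2 service)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def app_permission_policy(bucket: str, table_arn: str) -> dict:
    """Permission policy: WHAT the role may do once it is assumed."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
                "Resource": table_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical resource names, for illustration only.
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(
        app_permission_policy(
            "example-app-bucket",
            "arn:aws:dynamodb:us-east-1:111122223333:table/example-table",
        ),
        indent=2,
    ))
```

Anything an attacker does with stolen role credentials is bounded by the second document, which is why scoping it tightly matters so much for everything that follows.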
While the credentials aren’t stored in the instance, they are still accessible to it. Any code running inside needs the credentials to make the actual API calls to S3 and DynamoDB, so something known as the metadata service provides them on demand. The metadata service is a special service in AWS for instances and containers that holds all the information about how the resource is configured. It’s pretty darn important for a server to be able to get its IP address, for example.
This is where the attack comes in.
The metadata service is simply a URL you can access that returns the requested information.
curl 169.254.169.254/latest/meta-data/ will list all the basic information that is available.
curl 169.254.169.254/latest/meta-data/iam/security-credentials/ will return the name of the attached role, and appending that role name to the path will provide the access key, secret key, and session token. (In the case of a Lambda-based attack this all looks different and you use SDK code instead of curl, but the same principles apply.)
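To show what actually comes back, here is a small sketch that parses the JSON document the metadata service returns from the documented path /latest/meta-data/iam/security-credentials/ followed by the role name. The sample response uses made-up values, the function names are my own, and the fetch helper assumes the simple request-only behavior described in this post; an instance configured to require metadata session tokens would reject it.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen

METADATA_BASE = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

# Made-up sample of the response shape; none of these values are real.
SAMPLE_RESPONSE = """{
  "Code": "Success",
  "LastUpdated": "2019-08-01T12:00:00Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIAEXAMPLEKEYID",
  "SecretAccessKey": "exampleSecretAccessKeyValue",
  "Token": "exampleSessionTokenValue",
  "Expiration": "2019-08-01T18:00:00Z"
}"""


def parse_role_credentials(body: str) -> dict:
    """Pull the three credential parts and the expiry out of the response."""
    doc = json.loads(body)
    expires_at = datetime.strptime(doc["Expiration"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "access_key": doc["AccessKeyId"],
        "secret_key": doc["SecretAccessKey"],
        "session_token": doc["Token"],
        "expires_at": expires_at.replace(tzinfo=timezone.utc),
    }


def seconds_remaining(creds: dict, now=None) -> float:
    """How long the extracted credentials stay valid."""
    now = now or datetime.now(timezone.utc)
    return (creds["expires_at"] - now).total_seconds()


def fetch_role_credentials(timeout: float = 2.0) -> dict:
    """Only works from inside an instance that has a role attached."""
    with urlopen(METADATA_BASE, timeout=timeout) as resp:
        role_name = resp.read().decode("utf-8").strip().splitlines()[0]
    with urlopen(METADATA_BASE + role_name, timeout=timeout) as resp:
        return parse_role_credentials(resp.read().decode("utf-8"))


if __name__ == "__main__":
    creds = parse_role_credentials(SAMPLE_RESPONSE)
    print(creds["access_key"], creds["expires_at"].isoformat())
```

Note that all three values are needed together: a session token accompanies the access key and secret key for role credentials, and the expiration tells an attacker exactly when to come back for a fresh set.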
The attacker can then copy those credentials and use them somewhere else, embedding them in their own tools instead of having to load and run code on the compromised server. Also, being URL based, the metadata service is exposed to a wider range of SSRF attacks since you don’t need full arbitrary code execution. The credentials will expire at some point, but depending on the attack, the attacker might just come back and get a new set when the current ones stop working.
Smart attackers these days will use the credentials in an AWS account they control since Amazon has some tooling to detect credentials extracted and used outside their known address ranges.
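One way to think about detection that does not depend on Amazon's address ranges is to compare each API call against where the instance itself is known to live. The sketch below assumes you already have CloudTrail-style records and your own inventory of instance networks; the function names and the `known_networks` mapping are hypothetical. It relies on the fact that for instance roles the session name (the part of `principalId` after the colon) is the instance ID.

```python
import ipaddress


def instance_id_from_principal(principal_id: str):
    """'AROAEXAMPLEID:i-0123456789abcdef0' -> 'i-0123456789abcdef0', else None."""
    _, _, session_name = principal_id.partition(":")
    return session_name if session_name.startswith("i-") else None


def is_suspicious(event: dict, known_networks: dict) -> bool:
    """Flag instance-role API calls that come from an unexpected address.

    known_networks maps instance ID -> list of CIDR strings the instance is
    expected to call from (an assumption: you maintain this inventory).
    """
    identity = event.get("userIdentity", {})
    if identity.get("type") != "AssumedRole":
        return False
    instance_id = instance_id_from_principal(identity.get("principalId", ""))
    if instance_id is None:
        return False  # not an instance role session
    try:
        source = ipaddress.ip_address(event.get("sourceIPAddress", ""))
    except ValueError:
        return False  # AWS services appear as names such as "ec2.amazonaws.com"
    # An instance missing from the inventory has no expected networks,
    # so any call made with its credentials is flagged.
    networks = known_networks.get(instance_id, [])
    return not any(source in ipaddress.ip_network(cidr) for cidr in networks)


if __name__ == "__main__":
    inventory = {"i-0123456789abcdef0": ["10.0.0.0/16"]}
    event = {
        "eventName": "ListBuckets",
        "sourceIPAddress": "203.0.113.50",
        "userIdentity": {
            "type": "AssumedRole",
            "principalId": "AROAEXAMPLEID:i-0123456789abcdef0",
        },
    }
    print(is_suspicious(event, inventory))
```

This is deliberately naive: calls that leave through a NAT gateway or VPC endpoint will show different source addresses, so a real inventory needs to account for those paths before the signal is usable.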
Here is what it looks like in a short video we use in the Securosis Advanced Cloud Security training class where the students learn the process in a hands-on lab:
Breaking the IAM Role Extraction Kill Chain
Let’s map out the kill chain. The attacker needs to do the following:
- Discover and exploit a vulnerability in an instance, container, or Lambda that allows them to access the role credentials. This is pretty much always a mistake on the customer side… such as failing to patch, opening up the wrong ports, or deploying vulnerable code.
- Extract the current role credentials.
- Successfully run allowed API calls in an environment under their control.
- Do something bad within the allowed IAM role’s permission policy scope. I mean, probably bad, it isn’t like most attackers patch your code for you.
The following techniques can break different links in the chain and include a mix of detective and preventative controls. Don’t feel bad if this looks overwhelming… very, VERY few of the organizations I work with implement these comprehensively, especially at scale.