How S3 Buckets Become Public, and the Fastest Way to Find Yours

October 9th, 2018 (updated October 26th, 2018)

In What Security Managers Need to Know About Amazon S3 Exposures we mentioned that one of the reasons finding your public S3 buckets is so darn difficult is that multiple, overlapping mechanisms determine the ultimate level of S3 access. To be honest, there’s a chance I don’t even know all the edge cases, but this list should cover the vast majority of situations.

S3 is one of the oldest services in AWS. So old that parts of it still support XML-based policies instead of the JSON you see everywhere else. S3 also has a lot of features that aren’t as commonly known or used anymore, such as allowing someone else to put objects in your bucket yet still maintain ownership of them. AWS actually has a wide range of use cases it needs to support, which is why we have all these mechanisms. Sharing publicly, within an account, between accounts, hosting websites, and so on all create complexity. The good news is S3 always defaults to secure and private. The bad news is AWS allows human beings to use it, and humans are really good at mucking things up.

Also keep in mind that there are slightly different options for S3 buckets (think of a bucket as a server that can have subdirectories and files) and objects, which are the actual files. I like to think of buckets as servers, not directories, since they have a different root URL (e.g. “dops.s3.amazonaws.com”) and have their own lifecycle and settings. Directories in S3 let you organize objects, but they don’t have any distinct settings of their own other than a name.

One last warning: S3 supports both read and write permissions (and list, for buckets). It’s possible to have no public read but public write permissions, which could lead to people putting… bad things… in your buckets.

Eight (Yes, Eight) Ways Amazon S3 Data Becomes Public

The interplay between these can be a little confusing so we’ll walk through the interactions after we list them out. Let’s start with the three primary bucket controls:

  • Bucket ACLs (Access Control Lists) — This is an XML document that defines the first layer of access. By default, only the account owner has access, but this can be opened up to other AWS accounts or the public at large. Amazon recommends against using these at all, but it’s also the easiest way to make something quickly public so we see it all the time.
  • Bucket Policies — These are super-flexible JSON policies that allow you to set things like IP-based and other conditional permissions on a bucket. While this should be the primary way to manage public access, ACLs are the first tab and about 3 clicks to make something public so, shrug. It’s this flexibility that leads to mistakes such as opening up to a wider range of IP addresses than intended, or using a negative to block access to some IPs but inadvertently granting access to everyone else.
  • IAM Policies — These are the normal IAM permissions you use to control access throughout your AWS account. You can’t make a bucket public with them since they only govern AWS users, but you can open up access to the bucket by authorizing another AWS service to access it, and then that service exposes the content. We’ll get back to this.
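To make the first two controls concrete, here is a minimal sketch of what "public" looks like in each. It assumes the ACL is a dictionary shaped like the response from boto3's `get_bucket_acl` and the policy is the JSON string returned by `get_bucket_policy`; the function names and the example policy are hypothetical, and the check is deliberately simple (it flags wildcard principals but does not evaluate conditions).

```python
import json

# Predefined S3 groups that extend a grant beyond named accounts.
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"


def public_acl_grants(acl):
    """Return (group URI, permission) pairs for grants made to a public group."""
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in (
            ALL_USERS,
            AUTHENTICATED_USERS,
        ):
            found.append((grantee["URI"], grant.get("Permission")))
    return found


def wildcard_allow_statements(policy_json):
    """Return Allow statements whose principal is the wildcard '*'."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may be a single object
        statements = [statements]
    hits = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        aws = principal.get("AWS") if isinstance(principal, dict) else principal
        values = aws if isinstance(aws, list) else [aws]
        if "*" in values:
            hits.append(stmt)
    return hits


# Hypothetical example of the "negative" mistake: the author meant to block
# one address range, but this Allow grants read access to everyone else.
example_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
    }],
})

example_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"}, "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group", "URI": ALL_USERS}, "Permission": "READ"},
    ]
}
```

Anything either function returns still needs a human look: a wildcard principal with a tight condition may be fine, while one with a negated condition like the example above is almost certainly not what was intended.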

Then come the object controls with one important point: objects do not necessarily inherit bucket ACLs. You can totally make an object public even if the bucket is private.

  • Object ACLs — These are the primary object-level controls. It’s just like a bucket ACL and uses XML to allow access from other accounts or the public.
  • Explicit IAM or Bucket Policy Statements — IAM policies and bucket policies can have explicit statements referencing the object, which will override the object ACL (if you own the object) since they are evaluated first.

That’s one of the nuances… IAM is evaluated first, then bucket policies and ACLs. If something is explicitly blocked in a higher policy (and by explicitly we mean the object’s unique identifier) then the request is blocked. Otherwise the object ACL is evaluated.
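The ordering described above can be sketched as a small function. This is only a model of the simplified rule in this post, not AWS's actual evaluation logic: it ignores conditions, principals, and IAM entirely, and it uses Python's `fnmatchcase` as a rough stand-in for policy wildcard matching. The function names and the bucket and key values are hypothetical.

```python
from fnmatch import fnmatchcase

ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"


def _as_list(value):
    """Policy fields may hold a single value or a list of values."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def has_explicit_deny(policy, bucket, key, action="s3:GetObject"):
    """True if a Deny statement covers this object's ARN for the given action."""
    arn = f"arn:aws:s3:::{bucket}/{key}"
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Deny":
            continue
        # Approximation: fnmatchcase is case-sensitive and also treats
        # '[' specially, which real policy matching does not.
        resource_hit = any(fnmatchcase(arn, p) for p in _as_list(stmt.get("Resource")))
        action_hit = any(fnmatchcase(action, p) for p in _as_list(stmt.get("Action")))
        if resource_hit and action_hit:
            return True
    return False


def object_is_public(policy, object_acl, bucket, key):
    """Apply the post's rule: an explicit deny wins, otherwise the object ACL decides."""
    if has_explicit_deny(policy, bucket, key):
        return False
    return any(
        grant.get("Grantee", {}).get("URI") == ALL_USERS
        and grant.get("Permission") in ("READ", "FULL_CONTROL")
        for grant in object_acl.get("Grants", [])
    )


# Hypothetical data: a bucket policy that denies reads under private/,
# and an object ACL that grants public read.
deny_private = {
    "Statement": [{
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/private/*",
    }]
}
public_read_acl = {
    "Grants": [{"Grantee": {"Type": "Group", "URI": ALL_USERS}, "Permission": "READ"}]
}
```

With this data, the same public-read object ACL leaves `public/logo.png` exposed but is blocked for `private/report.pdf`, which is the interaction the paragraph above describes.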

There are three last ways something can be public (not counting making the content public in your application/code outside of S3 itself):

  • Pre-Signed URLs — These object-level grants must be created using code (not the console) and provide temporary access for anyone with the URL. They’re used, for example, to share a file for a few minutes or an hour so someone using your app can download a media file.
  • CloudFront Origin Access Identity — CloudFront is the AWS content delivery network and can serve as the front end to S3. You can create something called a CloudFront origin access identity and write a policy that allows CloudFront to access the S3 content. If the CloudFront distribution allows public access, it can serve the S3 content to anyone, and because the grant names only the CloudFront identity, the bucket won’t look public in its bucket policy or ACL.
  • CORS Policies — Cross-Origin Resource Sharing is required if you use S3 in a web site and don’t want the browser to break due to its same-origin security policy. CORS in S3 won’t override an ACL or bucket policy, but could mask public access in limited situations where the data is exposed in the web code through the authorized site.
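For the pre-signed URL case, the "temporary" part is visible in the URL itself. The sketch below reads the expiry out of a URL's query string; it assumes the standard query parameters (`X-Amz-Date` and `X-Amz-Expires` for Signature Version 4, or an epoch `Expires` value for the older Signature Version 2 style). The function name and the sample URL are hypothetical, and the signature in the sample is a placeholder, not a working one.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url):
    """Return the UTC time at which a pre-signed S3 URL stops working."""
    params = parse_qs(urlsplit(url).query)

    if "X-Amz-Date" in params and "X-Amz-Expires" in params:
        # SigV4 style: signing time plus a lifetime in seconds.
        signed_at = datetime.strptime(
            params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
        ).replace(tzinfo=timezone.utc)
        return signed_at + timedelta(seconds=int(params["X-Amz-Expires"][0]))

    if "Expires" in params:
        # SigV2 style: absolute expiry as seconds since the Unix epoch.
        return datetime.fromtimestamp(int(params["Expires"][0]), tz=timezone.utc)

    raise ValueError("URL does not look like a pre-signed S3 URL")


# Placeholder URL for illustration only.
sample_url = (
    "https://example-bucket.s3.amazonaws.com/video.mp4"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20181009T120000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-Signature=placeholder"
)
```

If you generate these with boto3, the lifetime comes from the `ExpiresIn` argument to `generate_presigned_url`, so a quick audit of that value in your code tells you how long each link stays live.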

To put it all together, first AWS looks at IAM permissions. Within those, the only one that can make a bucket public over the web is the CloudFront Origin Access Identity. Then AWS looks at bucket ACLs and policies for any explicit deny statements. Then it looks at the object ACLs for public access. If you make an object public and the bucket policy doesn’t have an explicit deny, it is still public; otherwise a good bucket policy will block everything. Besides, a lot of the problems are open bucket ACLs, not object ACLs.

Finding and Protecting Public S3 Data

So what’s the best way to find public buckets? It’s relatively easy in the console and we will cover looking in code in a future post:

  1. Log into the console, click on S3, and look for the Public tag. AWS uses some advanced back end math to evaluate all the bucket policies to figure out if something is public, which catches most of the fringe cases, but it does not show if the bucket is private and objects in it are public.
  2. Check CloudFront to see if you use any origin access identities. It’s easier to look here than in all your bucket policies (usually). This also saves you from looking through object ACLs.
  3. Check any CORS policies, if you use them. This is the least common situation and I even hesitated to complicate this post by mentioning them.
  4. The bad news… there is no easy way to find public objects. Even finding them programmatically can be difficult if you have a large number. Also, changing a bucket ACL doesn’t cascade to the object ACLs so you need to run through and fix them one by one. However, if the bucket is locked down the attacker would need to know the full URL/name of the object to find it. Not perfect, but better than nothing.
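As a rough illustration of step 4, here is a sketch of what a programmatic sweep for public ACLs might look like. It assumes `s3` behaves like a boto3 S3 client (it relies only on `list_buckets`, `get_bucket_acl`, `list_objects_v2`, and `get_object_acl`); the function names are hypothetical. It checks ACLs only, not bucket policies, and it reads just the first page of keys per bucket, so a real scan of a large bucket would need pagination and a lot of patience.

```python
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
PUBLIC_GROUPS = {ALL_USERS, AUTHENTICATED_USERS}


def _public_permissions(acl):
    """Return the sorted permissions an ACL grants to a public group."""
    return sorted({
        grant["Permission"]
        for grant in acl.get("Grants", [])
        if grant.get("Grantee", {}).get("URI") in PUBLIC_GROUPS
    })


def find_public_acls(s3, max_keys=1000):
    """Yield (bucket, key or None, permissions) for each public ACL found.

    A key of None means the bucket ACL itself is public.
    """
    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]

        perms = _public_permissions(s3.get_bucket_acl(Bucket=name))
        if perms:
            yield name, None, perms

        # One API call per object: this is why the sweep is slow at scale.
        # Only the first page of results is examined in this sketch.
        page = s3.list_objects_v2(Bucket=name, MaxKeys=max_keys)
        for obj in page.get("Contents", []):
            perms = _public_permissions(s3.get_object_acl(Bucket=name, Key=obj["Key"]))
            if perms:
                yield name, obj["Key"], perms


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# for bucket, key, perms in find_public_acls(boto3.client("s3")):
#     print(bucket, key or "<bucket ACL>", perms)
```

Passing the client in as a parameter keeps the sweep testable against a stub, and it makes the cost obvious: the per-object ACL lookup is exactly the part that makes finding public objects in a private bucket so tedious.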

That’s it, although if you know of any other fringe cases let me know. When I’m out running assessments I find that well over 90% of the problems I see are just basic bucket ACL and policy mistakes. Object ACLs are also definitely an issue, but slightly lower risk since you need the exact URL to find them if your bucket is otherwise secured properly.

About the Author:

With twenty years of experience in information security, physical security, and risk management, Rich is one of the foremost experts on cloud security, having driven development of the Cloud Security Alliance’s V4 Guidance and the associated CCSK training curriculum. In addition to his role at D-OPS, Rich currently serves as Analyst & CEO of Securosis.