ALERT to FIX in a MINUTE
As Rich and I have been talking about for years, the ability to move to automated cloud security operations remains one of the most compelling opportunities for improving security in the cloud. The ability to have an alert trigger automated remediations will change your security operations motions. But that’s hard for some folks to comprehend.
Let’s point to a recent post by Expel (one of my favorite Security Ops blogs) to provide a more tangible example of this concept. It details how Expel went from Alert-to-fix in AWS. It’s a real-life description of how their SOC responded to a root key compromise (and an associated cryptocurrency mining activity). Working with the customer, they were able to remediate the issue in 37 minutes. Less than an hour is pretty good, especially given many issues of this type take days, if not weeks, to find and control.
First of all, this attack is no joke. The client is fortunate; the attacker only decided to spin up ten extra-large instances. Hmmm, that sounds familiar. But I digress, or more to the point, never miss an opportunity to give Rich a hard time about a mistake he made six years ago.
Now, where was I? Right, I was impressed the issue was detected, investigated, and remediated in 37 minutes. What if I told you that you could have detected and mitigated the issue in a minute? Yes, a minute. 60 seconds. Or even less. You’d probably snicker and try to figure out whether I spiked my coffee. I didn’t, but the day is still young.
You can achieve this by using a tool like DisruptOps to take a trigger and intelligently route the alert to the right operational team. Then by giving that team a set of options they could use immediately to remediate the issue, you can not just reduce but mostly eliminate the time it takes to respond to many cloud security issues.
Let’s play it out. In Expel’s attack, they get an alert about the generation of a bunch of SSH keys. That’s an Op you could set up in DisruptOps to pair that trigger (the alert) with a set of potential actions. The alert on the SSH keys gets sent to the SOC analyst on call. Or maybe the cloud Ops team responsible for that account handles the operational fixes. Either way, DisruptOps provides you with tremendous flexibility in how you generate and route the alerts. If the Ops team uses Slack, the alert shows up directly in the Slack channel for the on-call Ops person. That alert includes a set of remediations, such as disabling the SSH keys. Press a button right within Slack, and the keys are disabled. No going into the AWS console or any other security tool to deal with the issue.
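To make the trigger-plus-actions idea a bit more concrete, here is a minimal Python sketch of what such a pairing might look like. To be clear, this is not the DisruptOps API: the `Op` class, the `OPS` table, the `route` function, the trigger name, and the channel name are all hypothetical and exist only for illustration. The message layout follows Slack's Block Kit format (a section block followed by an actions block of buttons), and actually posting it to Slack is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Op:
    """Hypothetical pairing of a trigger with a route and remediation choices."""
    trigger: str
    channel: str
    actions: Dict[str, str]  # action_id -> button label


# Hypothetical routing table: which team channel gets which alert, with which options.
OPS = {
    "ssh_keys_created": Op(
        trigger="ssh_keys_created",
        channel="#cloud-ops-oncall",
        actions={"disable_ssh_keys": "Disable SSH keys", "ignore": "Ignore"},
    ),
}


def build_slack_message(op: Op, summary: str) -> dict:
    """Build a Block Kit style payload: the summary plus one button per action."""
    buttons = [
        {
            "type": "button",
            "text": {"type": "plain_text", "text": label},
            "action_id": action_id,
            "value": op.trigger,
        }
        for action_id, label in op.actions.items()
    ]
    return {
        "channel": op.channel,
        "text": summary,  # plain-text fallback for notifications
        "blocks": [
            {"type": "section", "text": {"type": "mrkdwn", "text": summary}},
            {"type": "actions", "elements": buttons},
        ],
    }


def route(alert: dict) -> Optional[dict]:
    """Look up the Op for an alert's trigger; return a message, or None if unrouted."""
    op = OPS.get(alert["trigger"])
    if op is None:
        return None
    return build_slack_message(op, alert["summary"])
```

Calling `route({"trigger": "ssh_keys_created", "summary": "Multiple SSH keys generated"})` returns a payload addressed to the on-call channel with a "Disable SSH keys" button. The point of the design is that the person who sees the alert also sees the fix, in the same place.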
You also get an alert from AWS GuardDuty about potential crypto-mining activity, which was detected when the extra-large instances spun up. Again, the alert comes with context on your options for taking action. In this case, we don’t want to ignore or delay the action since the AWS meter is running, and extra larges aren’t cheap, so we’ll just terminate those instances.
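For the curious, the "terminate those instances" action could be sketched roughly like this in Python. This is an illustration under assumptions, not how DisruptOps implements it: the function names are made up, the findings are assumed to be in the shape returned by the GuardDuty `GetFindings` API (`Type` and `Resource.InstanceDetails.InstanceId`), and `ec2_client` is assumed to be a boto3 EC2 client created elsewhere, for example with `boto3.client("ec2")`.

```python
def miner_instance_ids(findings):
    """Collect EC2 instance IDs from findings whose type starts with 'CryptoCurrency:'."""
    ids = []
    for finding in findings:
        if not finding.get("Type", "").startswith("CryptoCurrency:"):
            continue
        instance_id = (
            finding.get("Resource", {}).get("InstanceDetails", {}).get("InstanceId")
        )
        # Skip findings without an instance and avoid duplicate IDs.
        if instance_id and instance_id not in ids:
            ids.append(instance_id)
    return ids


def terminate_miners(ec2_client, findings):
    """Terminate the flagged instances; ec2_client is assumed to be a boto3 EC2 client."""
    ids = miner_instance_ids(findings)
    if ids:
        ec2_client.terminate_instances(InstanceIds=ids)
    return ids
```

In practice you would probably want this behind the button press rather than fully automatic, at least until you trust the finding types you are matching on. Termination is not reversible, so snapshotting a volume first for forensics may be worth the extra seconds.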
Revisiting the timeline, as long as it takes you to look at the Slack channel, digest the information, and make a decision, you’ve fixed the immediate problem. Taking action directly from within the alert dramatically reduces the time to get from alert to fix.
To be clear, your work is not done at this point, even though getting back to bed sounds nice. You still have an adversary in your root account, so you’ll need to start a more formal incident response to figure out if other accounts are compromised and what actions the attackers may have taken. Since DisruptOps monitors your environment, you’ll get an alert if the adversary creates new users, spins up new instances, creates access keys, turns off CloudTrail, or does anything else in the attacker playbook.
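If you want a feel for what watching for the attacker playbook means at the log level, here is a small Python sketch that scans CloudTrail records for the kinds of events listed above. It is a simplified illustration, not the DisruptOps detection logic: the `PLAYBOOK_EVENTS` table and `flag_playbook_events` function are hypothetical, and the event list is deliberately short. The field names (`eventSource`, `eventName`, `userIdentity.type`) are standard CloudTrail record fields.

```python
# Hypothetical watch list: (eventSource, eventName) -> why it matters.
PLAYBOOK_EVENTS = {
    ("iam.amazonaws.com", "CreateUser"): "new IAM user created",
    ("iam.amazonaws.com", "CreateAccessKey"): "new access key created",
    ("ec2.amazonaws.com", "RunInstances"): "new instances launched",
    ("cloudtrail.amazonaws.com", "StopLogging"): "CloudTrail logging stopped",
    ("cloudtrail.amazonaws.com", "DeleteTrail"): "CloudTrail trail deleted",
}


def flag_playbook_events(records):
    """Return one alert per CloudTrail record that matches the watch list."""
    alerts = []
    for record in records:
        key = (record.get("eventSource"), record.get("eventName"))
        reason = PLAYBOOK_EVENTS.get(key)
        if reason is None:
            continue
        alerts.append({
            "event": record["eventName"],
            "reason": reason,
            # Activity by the root user deserves extra attention in this scenario.
            "by_root": record.get("userIdentity", {}).get("type") == "Root",
        })
    return alerts
```

Each alert produced here could then be paired with its own set of remediation options, the same way the SSH key alert was.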
And if you are using AWS Security Hub to aggregate the findings, you could already have a set of pre-defined actions set up for each of those alerts and may have already fixed the issues before the responders had a chance to point them out. Can you imagine getting out ahead of the incident response team?
By the way, this doesn’t mean that companies like Expel and other MSPs aren’t still beneficial as you move towards automated cloud security operations. They can do triage and run the investigation to make sure you didn’t miss anything. If anything, the combination of DisruptOps and an MSP is complementary and very powerful. You take care of the immediate danger, and you have a team of experts to make sure that everything got cleaned up and there is no attacker residue.
That’s a big win in our book.