You’re on-call, it’s 3 am and you’re not really asleep or awake. The background anxiety always peaks during on-call weeks but this time you know there’s something that could ratchet the stress down significantly. Suddenly your phone vibrates, you become fully alert, and as you slowly and apprehensively reach for your phone, your heart sinks. It’s a P1, the adrenaline kicks in and your chest tightens.
But wait, it’s not a widespread outage or attack, it’s a SecOps (Security Operations) ‘white-glove’ executive incident. It’s been almost a year since the previous high-profile incident that few know about, fewer dare mention, and even less know why. You scan the current incident details and remember you have the option to initiate the SecOps IR (Incident Response) and containment playbook from right there inside PagerDuty.
A little smile creeps up the side of your face, you exhale slowly, and your chest becomes less tight. You ACK the notification and click the link to initiate the IR playbook. You get an immediate status update, and you reset the incident to a P3. “Phew, that’s pretty cool...”, you muse to yourself as you proceed to snooze the alert in PagerDuty for 4 hours until your normal 7 am wake-up call. This time you go straight to sleep but just before you doze off, you wonder what other teams could use this sort of automation for any incidents and to standardize their responses, be they new hires or old ‘hats..
But how is this possible? Let’s take a peek at what’s behind this automation.
What is your desired automation outcome?
At a high level, we map out what we want to achieve in a simple sequence. We then use modular stories to tell our integrations what we want them to do (and under what conditions). Let’s query a group of CrowdStrike EDR (Endpoint Detection and Response) actions installed on a group of executive's laptops. We look for high severity detections and then distill them down to truly actionable alerts right inside PagerDuty.
Simple and modular main Story flow
We not only generate the PagerDuty incidents and notifications themselves but add single-click actions to initiate any number of automated workflows from inside the PagerDuty UI (User Interface) itself.
Many teams still want to retain human oversight of certain decisions while automating others. It all depends on whether the scale, velocity and confidence required lends itself to human or machine steps. For human operators, it’s also satisfying and reassuring to remove uncertainty and know exactly what’s next.
Signal from noise
If we quickly dive into the first sub-Story we see a very linear and simple flow that retrieves hosts from a host group. It goes looking for detections within a specific timeframe and above a severity threshold. This flow is basic but can easily evolve and be reused in multiple workflows.
CrowdStrike searching detections by group, severity, and time
We can pass any group name and severity level into this sub-Story and it will build the results for us in a readily consumable manner. We then pass back a range of information including hostname, username, platform type, and device IDs which allow either the engineer or any subsequent