Exorcizing the ‘Ghost of CrowdStrike’: Q&A With LogicMonitor’s Ryan Worobel


In July 2024, CrowdStrike pushed out a faulty update to its Falcon Sensor that caused worldwide outages on Microsoft operating systems. Users experienced the Blue Screen of Death (BSOD) error, which caused their systems to shut down or restart unexpectedly.

The effects were immediate and wide-reaching: Doctors weren’t able to diagnose patients, airports abruptly paused travel, and emergency services and responses were disrupted. Government services, banks, and other businesses had to shut down.

The incident revealed that global tech infrastructure is still fragile despite robust teams and solutions. One small glitch brought operations to a standstill, leaving CIOs with a lingering sense of vulnerability.

Ryan Worobel, CIO at LogicMonitor, witnessed firsthand how the CrowdStrike outage unfolded and understands the immense pressure IT teams face when technology goes dark. He believes digital resiliency and business continuity are keys to being prepared during an outage.

Worobel has more than 25 years of experience in leadership positions across business, technology, and information security. He previously worked in the industrial space as EVP and CIO of Incora. Throughout his career, Worobel has specialized in leading strategic initiatives for global Fortune 100 brands to transform IT, with a focus on competitive differentiation through digital business strategies and transformational technology. Prior to Incora, Worobel held positions at brands that included Maersk Line, Zuora, Dell, and Salesforce.

LogicMonitor offers hybrid observability powered by AI. The company’s SaaS-based platform, LM Envision, enables observability across on-prem and multi-cloud environments.

What is the general consensus surrounding the CrowdStrike incident, and how did it change IT’s attitude toward an outage like this?

The general sentiment is frequency; we’re so reliant upon third-party vendors. In so many places, digital technology is so interwoven, creating so many risks. I think the main focus is for CIOs to be prepared, think through business continuity planning, be diligent, and understand the landscape and how it operates. We need to resurrect one of these lost skills, continuity planning, so we can still operate during a dark period. We’ve become too dependent on the cloud; we need to revisit how things work when they are done manually.

What are the lessons learned from this situation?

You can’t have blind faith in vendors; you must have your own checks and balances. While the vendors we work with are respected, you need to also know how to do your own work. This, again, is through business continuity planning. CIOs need to know where things touch, protect components, understand the landscape, know how tier 1 connects to tier 3, and if something goes down, get [it] up and running as quickly as possible.

Hybrid environments are definitely coming back. We must diversify architecture with platforms such as hybrid cloud. Resiliency there is critical.

How can LogicMonitor prepare IT/companies for outages like this?

LogicMonitor can help companies get people back up and running as fast as possible. The alerts we get come in quickly and early. We were able to deduce that the issue came from patches CrowdStrike issued. We narrowed that down, got in front of the problem, and repaired it. The earlier you get on it, the better.

What new innovations do you see coming for security in the future?

CIOs can only mitigate these types of outages to minimize the damage; we can’t completely eliminate them. The development and growth of AIOps [make up] a key component in this space. It has been around for a while, but it’s now in its 2.0 stage, where it can pinpoint the problem automatically.

AIOps can sort through the noise for you with capabilities in real time to deduce the alerts [that need] fixing. It’s going to get to the point of self-healing these types of outages.

The issue moving forward will also be sorting through the malicious versus accidental. The CrowdStrike outage was an accident. Cyberattack methods are changing, and malicious actors are coming through trusted vendors.

It’s about understanding environment continuity plans and having backups. These days it’s easy to have reroutes to get back up and running. CIOs also need to practice failover every year. Manual due diligence got lost in the cloud. We need that mindset from back in the days of on-prem systems; we need that sense of control.


