The Aftereffects of the CrowdStrike Outage: Healthcare IT Today
ClearDATA Chief Technology Officer (CTO) Jim Ducharme was featured in Healthcare IT Today, alongside insights from other healthcare IT leaders, on the CrowdStrike outage.
Jim Ducharme, CTO at ClearDATA:
Business resiliency can be impacted by many things. We typically think of outages as being caused by malicious attackers finding ways to penetrate the many security layers we have put in place to keep our applications and data secure and compliant. But in this case, the very technology meant to help keep us safe and secure caused a significant resiliency problem. Certainly, this wasn’t anything malicious, but much like someone tripping over a power cord in the old data centers and causing an outage, or the inadvertent cutting of fiber optic cables in the Red Sea back in March of 2024, just about anything imaginable can have an impact on the resiliency of our business.
There are many lessons here, even for those fortunate enough not to have been impacted by this event or the vendor at the center of it. First, for everyone, having business continuity and disaster recovery (BCDR) plans is essential, and when incidents like this occur we should test our BCDR plans against these scenarios, whether we were impacted or not, to understand any risk exposure and react accordingly. Second, for those deploying software or infrastructure, resilient change management processes are also essential. From the “Mr. Obvious” department, quality control is of course important, but there is always the unintended impact of change that we can’t anticipate or replicate in a testing lab. Companies should look at the maturity of their change processes and, for critical systems, implement progressive rollout strategies coupled with tight monitoring of the infrastructure to methodically introduce change into their environment and watch for any adverse reaction. These processes may not prevent issues from happening but can certainly minimize their impact.
Per the above, organizations should ensure they have BCDR plans and regularly test them against these real-world incidents when they occur, and should use progressive rollout strategies for any change to critical environments or infrastructure to help minimize the impact when change introduces unintended consequences.
Beyond implementing these best practices, I think in certain industries we will see heightened regulatory requirements around resiliency of infrastructure. Healthcare organizations have already had plenty of advance notice of issues like this due to the prominence of ransomware attacks that hold their systems hostage and stress even the most rigorous BCDR plans. Vendors will be under even more scrutiny around not only their security posture but now also their resiliency posture and change management processes. I’m sure third-party risk evaluations will now contain more requests for information on how changes to their services are rolled out to their customers.
Healthcare organizations, and really any business for that matter, should take a look at their business processes and the dependencies on IT infrastructure needed to execute them successfully. Something as simple as a scheduling application can shut down a business. In the case of Delta Air Lines in this incident, the system responsible for contacting their flight crews was impacted, so they couldn’t effectively communicate with their employees to carry out contingency plans and change their operating plans. Elective surgeries in many hospitals had to be canceled, not because the operating room wasn’t ready but because scheduling systems and EHR systems were not available. Many times there are systems we don’t think of as “business critical” when looked at in isolation, but when looked at in terms of the key business processes that depend on them, or even in the context of BCDR, these seemingly simple systems suddenly become business critical.