Operational risk is the primary challenge to aggressively mitigating cyber risk.

The Never-Ending Battle: Routine Patching vs. Operational Stability

This article was first published in CPO Magazine in October 2023.

There’s an ongoing battle between competing priorities being waged every day in enterprises globally, and it’s been going on for decades. Cyber security teams are concerned with unpatched vulnerabilities and the breaches they risk, while IT professionals are driven by operational availability, the lack of which jeopardizes the business’s ability to operate (and potentially their careers). In today’s world, operational stability is winning, to the delight of threat actors everywhere.

A never-ending firehose of vulnerabilities

IT and security teams must keep up with vulnerability disclosures (25,000 in 2022 alone, and 7,500 through the first 4 months of 2023), deliver patches, and understand which assets they need to keep secure. All of these challenges have intensified many times over since the start of the pandemic, making it clear that existing vulnerability and patch management solutions fail to make a meaningful impact on the modern threat landscape.

It’s unfortunately common knowledge in the cyber security community that ransomware is back in vogue (March 2023 saw an increase of 91% in the number of attacks vs February, and a 62% increase from the year before). Few debate that one of the most effective strategies against ransomware and other cyber attacks is patching vulnerabilities, yet the average time to patch is 215 days.

Efficient remediation remains elusive, even in the age of RBVM

Existing vulnerability management strategies and tools focus only on the prioritization of risk. While that helps organizations identify which vulnerabilities they should attend to first, the actual orchestration of remediation is often ignored entirely. Remediation efforts must be handled discretely by different tools, processes, and teams, often with little to no continuity between them.

Further, tangibly demonstrating the efficacy and progress of a vulnerability and patch management program is a massive undertaking. With each new vulnerability and patch, individual teams tackle discovery, correlation, and remediation in a one-off fashion, compounding existing inter-team frustration and increasingly blurring the definition of success.

Whether we want to admit it or not, risk-based vulnerability management (RBVM) strategies are not enough. And everyone suffers:

“I have insight into all of our devices and what software they are running!” said no IT Engineer, ever. A comprehensive inventory of the devices operators are responsible for is critical to accurately assess risk exposure, but it is only one of many responsibilities of IT teams. Such an inventory requires multiple disparate and complicated systems to deploy and manage, and is generally deemed a non-priority. IT and security teams waste countless hours with too many active and passive scanning tools trying to fill in the gaps, because they don’t have a simple turnkey solution to rely on. Even when they’re “done,” they continue to worry about new devices coming online and software changes on existing ones, and so they must constantly react to disruptive surprises.

Security teams are hamstrung with the responsibility for remediation efforts but have limited (if any) access to help identify and patch vulnerabilities. CISOs and their engineers want an easy way to manage vulnerabilities – from disclosure to remediation – with the least amount of burden to their counterparts responsible for IT and Infrastructure. Instead, security teams are forced into a working model where every request they make begets frustration. Their requests are relegated to the sidelines as distractions while IT deals with the countless other tickets demanding its “immediate attention” (password resets, jammed printers, coffee-logged laptops, etc.), and vulnerabilities live on.

IT Engineers are pressured to patch quickly, but lambasted when things go sideways, so they punt. Damned if they do and damned if they don’t, IT Engineers tend to be slow in applying the patches that address vulnerabilities. They introduce arbitrary “burn in” periods between rollout phases to reduce the likelihood of being stuck troubleshooting or rolling back a patch at 3 am and subsequently being scolded the following day. They want a repeatable yet flexible way to orchestrate the delivery of patches to their devices as quickly as possible, but more importantly, they want to be confident that doing so is safe.
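
To make the “burn in” pattern concrete, here is a minimal Python sketch of a phased rollout that pauses after each phase and halts as soon as a patched device looks unhealthy. Everything in it is hypothetical and for illustration only: the `Phase` and `RolloutResult` types, the `run_phased_rollout` function, the host names, and the burn-in durations are assumptions, and the `apply_patch`, `is_healthy`, and `wait` callables stand in for whatever patching, monitoring, and scheduling tools a team actually uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Phase:
    """One rollout phase (hypothetical structure, for illustration only)."""
    name: str
    hosts: List[str]
    burn_in_hours: int  # how long to observe before moving to the next phase


@dataclass
class RolloutResult:
    completed_phases: List[str] = field(default_factory=list)
    halted_at: Optional[str] = None  # name of the phase that failed, if any
    burn_in_hours_elapsed: int = 0


def run_phased_rollout(
    phases: List[Phase],
    apply_patch: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    wait: Callable[[int], None] = lambda hours: None,  # no-op by default
) -> RolloutResult:
    """Patch phase by phase, stopping at the first unhealthy phase."""
    result = RolloutResult()
    for phase in phases:
        for host in phase.hosts:
            apply_patch(host)
        # Burn in: give problems time to surface before widening the rollout.
        wait(phase.burn_in_hours)
        result.burn_in_hours_elapsed += phase.burn_in_hours
        if not all(is_healthy(host) for host in phase.hosts):
            # Halt here so later phases are never touched.
            result.halted_at = phase.name
            return result
        result.completed_phases.append(phase.name)
    return result


# Example with assumed host names: the patch breaks "app-02" in the pilot.
phases = [
    Phase("canary", ["test-01"], burn_in_hours=24),
    Phase("pilot", ["app-01", "app-02"], burn_in_hours=48),
    Phase("fleet", ["app-03", "app-04", "app-05"], burn_in_hours=0),
]
patched: List[str] = []
result = run_phased_rollout(
    phases,
    apply_patch=patched.append,
    is_healthy=lambda host: host != "app-02",
)
print(result.completed_phases, result.halted_at, patched)
```

In this example the rollout stops at the pilot phase and the fleet hosts are never patched, which is the protection the burn-in period buys. The cost is equally visible: every phase adds its burn-in hours to the time the rest of the fleet stays exposed.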

Patching paranoia is inconsistent with patching data…but still…

At its core, all the chaos, conflict, and angst described above is the result of one unassailable truth: those responsible for patching are hesitant to do so aggressively because they’re afraid applying patches will break their s*%t. If someone could wave a magic wand and guarantee that no patches will ever cause an operational disruption, unpatched vulnerabilities would be as obsolete as the floppy disk, and a breach originating from an unpatched vulnerability would be a man-bites-dog story.

And yet, fewer than 2% of patches are actually rolled back, so disruptions resulting from patching are much less common than they were years ago. Perception, however, is a powerful influencer of behavior, and fear of disruption is still the underlying ethos of remediation teams, something that, despite the data, still makes sense. Only 2% of patches may fail, but identifying which 2% are likely to cause problems has historically been something vulnerability management technology has ignored. And the heartache endured by IT teams when an operational disruption occurs has not lessened over time. It’s time for vulnerability management technology innovators to spend less time identifying and reporting on vulnerabilities, and more time building tools to help IT teams fix them more efficiently…and without their fingers constantly crossed.