The use of CPE data by vulnerability scanners is responsible for many of VM's false positives.

CPE Data and False Positives in Vulnerability Management

To borrow a common idiom, IT operators need another vulnerability to deal with like they need another hole in the head. With just under 3,000 new vulnerabilities reported each month, it’s inevitable that some of those will legitimately impact just about every network, so the last thing anyone in IT needs is a vulnerability scanner telling them they have vulnerabilities where, in fact, they don’t.

Correlation: A Critical Step in Vulnerability Management 

The problem of false positives in vulnerability management can largely be attributed to the use of CPE (Common Platform Enumeration) data in the correlation process, a critical first step in vulnerability management. Tens of thousands of vulnerabilities are identified and listed both by NIST in its NVD (National Vulnerability Database) and by the vendors that create and publish the affected software. Correlation is the process by which vulnerability scanners identify which of the published vulnerabilities are applicable to the software deployed on a given network. It effectively compares the software deployed on the network to the software impacted by each vulnerability. When there’s a match, the scanner reports that the network has a vulnerability.
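
As a rough illustration of correlation, the sketch below compares a software inventory against a list of advisories and reports every match. It is a minimal sketch, not how any particular scanner works: the Package and Advisory types, the correlate function, the "examplecorp" products, and the CVE identifier are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """One piece of software found on the network (hypothetical type)."""
    vendor: str
    product: str
    version: str


@dataclass(frozen=True)
class Advisory:
    """One published vulnerability and the versions it affects (hypothetical type)."""
    cve_id: str
    vendor: str
    product: str
    affected_versions: frozenset


def correlate(inventory, advisories):
    """Return (cve_id, package) pairs where an installed package matches an advisory."""
    findings = []
    for advisory in advisories:
        for package in inventory:
            same_software = (package.vendor, package.product) == (advisory.vendor, advisory.product)
            if same_software and package.version in advisory.affected_versions:
                findings.append((advisory.cve_id, package))
    return findings


# Made-up inventory and advisory data, for illustration only.
inventory = [
    Package("examplecorp", "widget", "2.0"),
    Package("examplecorp", "gadget", "1.1"),
]
advisories = [
    Advisory("CVE-0000-0001", "examplecorp", "widget", frozenset({"1.9", "2.0"})),
]

findings = correlate(inventory, advisories)
```

The quality of the result depends entirely on the advisory data: if the list of affected versions is too broad, every extra version it names becomes a false positive on any network running it.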

Vulnerability scanner vendors have two options to identify which vulnerabilities are applicable to which software: pull granular software package data directly from vendors, or use the CPE description from NIST for each software package. CPE is an attempt to describe every possible software package from every vendor using a common format, and although the attempt to do so is a noble undertaking, CPE data is notoriously inaccurate and incomplete. These flaws in CPE data ultimately manifest in false positives.

The Problem with CPE Data

In a nutshell, the unreliability of CPE data makes it impossible to trust. A 2022 whitepaper written by 15 highly distinguished members of the cyber security community, including multiple OWASP leaders, addresses the challenge of identifying a software package by a universally accepted and adequately detailed name, and enumerates some of the reasons why CPE data is problematic:

  • A CPE is typically not created for a software product until a CVE is determined to be applicable to the product.
  • There is no error checking when a new CPE name is entered in the NVD.
  • Products change names and CPEs are never updated.
  • Supplier and product names can be written in different ways, such as “Microsoft(™)” and “Microsoft(™) Inc.”, or “Microsoft(™) Word” and “Microsoft Office(™) Word”.
  • A single product will have many CPE names in the NVD because they have been entered by different people, each making a different mistake.
  • Often, a vulnerability will appear in one module of a library. Thus, if the vulnerable module is not installed in a software product being used – but other modules of the library are installed (meaning the library itself is listed as a component in a Software Bill of Materials) – the user may unnecessarily patch the vulnerability or perform other gratuitous mitigations.

In an anecdote that underscores the practical ramifications of these CPE deficiencies, Oracle Corporation estimates that they can identify CPEs for no more than 20% of the components in their software products.

Lack of CPE Data Updates – Anatomy of False Positives

One of the more disconcerting issues with CPE data identified previously is that CPE data isn’t updated. This is particularly problematic for Microsoft CVEs. When a CVE for a Microsoft package is initially published, it often encompasses versions of the software that are not vulnerable, largely because Microsoft doesn’t provide build number and platform details for extended periods of time after initial CVE discovery. Once that detail is released, however, it becomes clear that several versions of the ostensibly vulnerable software are not vulnerable, and scanners pulling data from Microsoft would reflect that in their vulnerability correlations. CPE data, however, is rarely updated to reflect that reality, so scanners relying on CPE data will continue to identify vulnerabilities in software packages from Microsoft that are, in fact, not there (including Microsoft’s own Defender). CVE-2023-38140 presents an example of this type of CPE-driven false positive: a comparison between the NVD CPE data and Microsoft’s native disclosure shows that the CPE data does not account for Microsoft’s build-number granularity.

Another example of poor or non-existent CPE data updates that result in false positives stems from the vendor practice of “backporting”. Some software publishers – Red Hat is one example – will apply a security fix to an existing software version rather than requiring an upgrade to a newer one. So, if, for example, version 1.23 of the software was deemed vulnerable, CPE data will indicate that all versions before 1.24 are vulnerable. When Red Hat backports a fix to version 1.23 and labels it 1.23.1, CPE data isn’t updated to reflect that not all versions prior to 1.24 are vulnerable. Thus, if the network was scanned with a scanner using CPE data for its correlations, it would identify version 1.23.1 as having a vulnerability, when in fact that vulnerability had been addressed by the software vendor in a backported release.
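
The backporting scenario can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any scanner’s actual logic: the function names are hypothetical, the version numbers are the ones from the example above, and versions are assumed to be simple dotted integers.

```python
def parse_version(text):
    """Turn a dotted version such as '1.23.1' into a comparable tuple (1, 23, 1)."""
    return tuple(int(part) for part in text.split("."))


def range_says_vulnerable(installed, fixed_in="1.24"):
    """Range-only check: everything before the fixed version is flagged."""
    return parse_version(installed) < parse_version(fixed_in)


def vendor_says_vulnerable(installed, fixed_in="1.24", backported_fixes=("1.23.1",)):
    """Same range check, but aware of releases that carry a backported fix."""
    if installed in backported_fixes:
        return False
    return parse_version(installed) < parse_version(fixed_in)


for version in ("1.23", "1.23.1", "1.24"):
    print(version, range_says_vulnerable(version), vendor_says_vulnerable(version))
```

The two checks agree on 1.23 and 1.24 and disagree only on 1.23.1, which the range-only check flags even though the fix is already present. That single disagreement is the false positive.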

CPE-Induced False Positive – Another Example

A May Vulners article provides a classic example of a series of false positives generated by the NVD’s attempt to catalog CVE-2024-23296. The CVE description clearly indicated it was relevant to only iOS 17.4, and yet the NVD incorrectly associated CVE-2024-23296 with all iOS operating systems in its CPE data. Consequently, if devices on a given network were running an iOS version other than 17.4, scanners relying on CPE data would identify all those devices as vulnerable to CVE-2024-23296, generating a raft of false positives and unnecessary remediations.
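
To see why an overly broad CPE entry flags every device, consider a simplified matcher for CPE 2.3 formatted strings, in which a "*" in any field matches any value. This is a minimal sketch: the helper names are hypothetical, the CPE strings are illustrative rather than copied from the NVD record, the device versions are invented, and the matcher ignores details of the real specification such as escaped characters and partial wildcards.

```python
def ios_cpe(version):
    """Build an illustrative CPE 2.3 string for an iOS version ('*' means any)."""
    # After part, vendor, product, and version come seven more attributes.
    return f"cpe:2.3:o:apple:iphone_os:{version}" + ":*" * 7


def cpe_fields(cpe):
    """Split a CPE 2.3 string into its 11 attributes, dropping the 'cpe:2.3' prefix."""
    return cpe.split(":")[2:]


def cpe_matches(pattern, installed):
    """True when every field of the pattern is '*' or equals the installed value."""
    return all(
        wanted == "*" or wanted == actual
        for wanted, actual in zip(cpe_fields(pattern), cpe_fields(installed))
    )


broad_entry = ios_cpe("*")       # associates the CVE with every iOS version
narrow_entry = ios_cpe("17.4")   # what the CVE description called for, per the article

devices = [ios_cpe("16.7"), ios_cpe("17.3"), ios_cpe("17.4")]
flagged_by_broad = [d for d in devices if cpe_matches(broad_entry, d)]
flagged_by_narrow = [d for d in devices if cpe_matches(narrow_entry, d)]
```

With the broad entry all three hypothetical devices are flagged; with the narrow entry only the device on 17.4 is. The difference between the two lists is the set of false positives.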

CPE Data and Severity Ratings

It’s common knowledge that each CVE is assigned a severity by the NVD, but those CVEs are also assigned a severity score by the vendor, and opinions on the appropriate seriousness of individual CVEs can vary significantly between vendor and NIST. Certainly, the NVD can claim the benefit of being an impartial third party, but the vendor is not constrained by the need to apply generic criteria across countless vendors, and can draw on insight that only the creator of the software may have. Returning to Red Hat as a useful example of a vendor that is skeptical of NVD CVE severity scores, their website offers this analysis:

“For open source software shipped by multiple vendors, the CVSS base scores may vary for each vendor’s version, depending on the version they ship, how they ship it, the platform, and even how the software is compiled. This makes scoring vulnerabilities difficult for third-party vulnerability databases, such as NVD, who can give only a single CVSS base score to each vulnerability, or when comparing to other vendors who score based on the characteristics of use in their own products. These differences can cause the scores to vary widely. Other reasons may include how the source code was compiled with certain compiler flags or hardening technologies that reduce or eliminate the security severity of a flaw, or how the software is used within the product. Software running standalone may suffer from a flaw, but its use within a product may preclude the vulnerable code from ever being used in a way that could be exploited.”

Amazon provides a similar explanation for the difference between their severity scores and those of the NVD:

“Amazon Linux evaluates each CVE for their applicability and impact on our products. Our CVSS scores may differ from NVD and other vendors because of the characteristics of our software, such as versions, infrastructure platforms, default configurations, or build environments. Other vendors target environments with unique characteristics (other Linux distributions) or provide generic evaluations that cannot consider the execution environment.”

A recent example of the challenge faced by software publishers was highlighted in a Bleeping Computer article about developer Fedor Indutny, author of one of the most popular utilities on GitHub (17 million weekly downloads). A CVE with an eye-popping score of 9.8 was identified in one of his utilities, a severity assessment he vehemently disagreed with and could justify convincingly. Navigating the process to challenge the CVSS score so frustrated him that he archived the project’s GitHub repository.

The Backlog

Even in a perfect world where CPE data were flawless, NIST can’t keep up with the task of assigning CPE designations to all vulnerable software packages, and is, in fact, about 10,000 CVEs behind. Thus, scanners relying on CPE data to run their correlations are blind to CVEs published after about mid-February.

Addressing CPE Deficiencies in VM Correlations

It’s inevitable that any vulnerability scanner relying on CPE data for its correlations will experience false positives, so some vendors choose to augment CPE data internally in an effort to reduce false positives. Although an improvement over using raw CPE data, that approach is still fraught with challenges. At trackd, we’ve chosen to take the road less traveled, and one significantly more technically difficult: we pull software package data directly from individual vendors, as well as the much-better-informed vendor severity scores for each CVE. It takes longer, requires senior development resources, and offers countless technical challenges and vendor idiosyncrasies, but it results in a vulnerability management and patching platform with zero false positives (barring vendor errors). We think the trade-off is more than worth it.