
A Study on the Classification of Common Vulnerabilities and Exposures using Naïve Bayes

Sarang Na, Taeeun Kim, and Hwankuk Kim


Security R&D Team 2
Korea Internet & Security Agency
Seoul, Republic of Korea
{no.1.nasa,tekim31,rinyfeel}@kisa.or.kr

Abstract. The National Vulnerability Database (NVD) provides publicly known
security vulnerabilities called Common Vulnerabilities and Exposures (CVE).
There are many CVE entries, but some of them do not provide sufficient
information, such as the vulnerability type. In this paper, we propose a
classification method that categorizes CVE entries by vulnerability type using
a naïve Bayes classifier. The classification ability of the method is evaluated
on a set of testing data. With this method, we can analyze CVE entries that are
not yet classified as well as uncategorized vulnerability documents.

Keywords: Vulnerability analysis, Common Vulnerabilities and Exposures
(CVE), Common Weakness Enumeration (CWE), naïve Bayes classifier,
document classification.

1 Introduction

Security vulnerabilities inherent in software packages can be easily exploited
for malicious purposes. Attackers can identify vulnerable Web services by using
an Internet-wide scanning tool and then conduct malicious behavior [1]. Thus,
security experts must be aware of known vulnerabilities and be able to respond
quickly to threats.
The National Vulnerability Database (NVD) provides Common Vulnerabilities and
Exposures (CVE) entries so that publicly known security vulnerabilities can be
shared easily [2]. The CVE system provides a reference method for the security
vulnerabilities of released software packages. A CVE entry is composed of a
vulnerability overview, Common Vulnerability Scoring System (CVSS), references,
Common Platform Enumeration (CPE), and Common Weakness Enumeration (CWE).
There are over 77,000 CVE entries, but many of them do not provide all of the
vulnerability information that is available in the vulnerability overview or on
reference sites. In particular, the CWEs that identify the types of
vulnerabilities are provided for only 57.6% of all CVE entries (Figure 1).
© Springer International Publishing AG 2017
L. Barolli et al. (eds.), Advances on Broad-Band Wireless Computing,
Communication and Applications, Lecture Notes on Data Engineering
and Communications Technologies 2, DOI 10.1007/978-3-319-49106-6_65

To find out which type of vulnerability a CVE entry describes, we can use the
vulnerability overview of each entry and thereby supplement the missing
information. The overview text follows a certain structure, but because the
structure is not exactly the same across entries, we need to convert the text
into an appropriate form through data preprocessing.
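
The text above does not detail the preprocessing steps at this point, so the following is only a minimal sketch of what converting an overview into "an appropriate form" might look like. The stop-word list, the tokenization rule, and the function name `preprocess` are assumptions made for illustration, and the example sentence is invented rather than taken from a real CVE entry.

```python
import re

# Hypothetical stop-word list (assumption; the paper does not specify one).
STOP_WORDS = {"a", "an", "the", "in", "of", "to", "and", "or", "via",
              "is", "that", "by", "on", "for", "with", "as"}

def preprocess(overview):
    """Lowercase the overview, split it into tokens, and drop
    stop words and tokens that consist only of digits."""
    tokens = re.findall(r"[a-z0-9][a-z0-9_\-]*", overview.lower())
    return [t for t in tokens if t not in STOP_WORDS and not t.isdigit()]

# Invented overview text in the usual CVE phrasing (illustration only).
example = ("SQL injection vulnerability in login.php allows remote attackers "
           "to execute arbitrary SQL commands via the id parameter.")
print(preprocess(example))
```

The output is a flat list of lowercase tokens, which is the bag-of-words form that a naïve Bayes text classifier typically consumes.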
In this paper, we propose a classification method that predicts the
vulnerability type described by the text of a CVE entry. We collect CVE entries
from NVD and generate a vulnerability classification model based on naïve
Bayes. By using this method, we are able to classify CVE entries into
vulnerability categories, i.e., CWEs.
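
To make the idea of a naïve Bayes classification model concrete, here is a small sketch of a multinomial naïve Bayes text classifier with add-one (Laplace) smoothing. It is not the authors' implementation: the class name `NaiveBayesText`, the smoothing choice, and the four toy training documents are assumptions for illustration. Only the labels CWE-79 and CWE-89 are taken from the text.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial naive Bayes over token lists with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total_docs = len(labels)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / self.total_docs)
            total = sum(self.word_counts[label].values())
            for t in tokens:
                if t in self.vocab:  # skip words never seen in training
                    count = self.word_counts[label][t]
                    score += math.log((count + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data (invented for illustration, not real CVE overviews).
train_docs = [
    "sql injection allows remote attackers to execute arbitrary sql commands".split(),
    "cross-site scripting xss allows remote attackers to inject arbitrary web script".split(),
    "sql injection in search parameter".split(),
    "xss in comment field allows script injection".split(),
]
train_labels = ["CWE-89", "CWE-79", "CWE-89", "CWE-79"]

model = NaiveBayesText().fit(train_docs, train_labels)
print(model.predict("remote attackers execute sql commands".split()))
print(model.predict("xss script in web page".split()))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps a single unseen word from driving a class probability to zero.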
The remainder of the paper is organized as follows. In Section 2, we review the
related work. In Section 3, we explain the proposed method. Experimental
results are described in Section 4, and the paper concludes in Section 5.

Fig. 1. Number of CVE entries and CWEs by year.

2 Related Work

Genge and Enăchescu [3] proposed a vulnerability assessment tool for
Internet-connected devices identified by Shodan [4]. This tool simply matched
CVE entries to the corresponding devices without additional processing of the
CVE entries.
Chang et al. [5] analyzed vulnerability trends using CVE entries from 2007 to
2010. They showed the trends through vulnerability frequency and severity by
using the CVEs and CVSS scores, respectively. Because vulnerabilities from
those years have continued to be discovered and registered since then, the
current data differ from the security trends that were analyzed at the time.
Neuhaus and Zimmermann [6] used topic models to analyze vulnerability trends,
such as the vulnerability types of CVE entries up to 2009. The authors found 28
topics in CVE entries by using Latent Dirichlet Allocation (LDA) and assigned
the LDA topics to CWEs. The precision and recall of LDA are good for some CWEs,
such as CWE-79 and CWE-89, but poor for other categories, such as CWE-310 and
CWE-94.
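
Since precision and recall are reported per CWE category, a short sketch of how these two measures can be computed per class may help. The function name `precision_recall` and the toy label lists are assumptions for illustration; they are not results from [6] or from this paper.

```python
from collections import Counter

def precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for every label seen in
    either list. A zero denominator yields 0.0 by convention here."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    result = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        result[label] = (tp[label] / p_den if p_den else 0.0,
                         tp[label] / r_den if r_den else 0.0)
    return result

# Toy labels (invented for illustration only).
truth = ["CWE-79", "CWE-79", "CWE-89", "CWE-89", "CWE-94"]
pred = ["CWE-79", "CWE-79", "CWE-89", "CWE-79", "CWE-89"]
scores = precision_recall(truth, pred)
for label, (p, r) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Reporting the measures per class, rather than a single overall accuracy, is what exposes the situation described above, where a model does well on frequent categories and poorly on others.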
Guo and Wang [7] modeled CVE vulnerabilities based on an ontology and used the
model to analyze similar vulnerabilities. We refer to their structure of CVE
vulnerabilities for the classification model in this research.
Li et al. [8] analyzed the characteristics of bugs and classified the bugs
through text classification and information retrieval techniques. In this
paper, we use a naïve Bayes classifier to categorize vulnerabilities.
