1 Introduction
structure is not perfectly the same, we need to convert this text into an appropriate
form through data preprocessing.
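As a minimal sketch of what such preprocessing might look like, the following Python
snippet lowercases a description, splits it into word tokens, and drops stop words. The
function name, the stop-word list, and the sample description are assumptions made for
illustration; they are not taken from the paper.

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "the", "in", "of", "to", "via", "and", "or", "is", "that", "allows"}

def preprocess(description):
    """Lowercase a description, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    return [t for t in tokens if t not in STOP_WORDS]

# Invented sample text written in the style of a CVE description.
sample = "SQL injection in login.php allows remote attackers to execute arbitrary SQL commands."
tokens = preprocess(sample)
print(tokens)
```

The resulting token list is the kind of uniform input a text classifier can count over.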
In this paper, we propose a classification method that predicts the vulnerability type
of a CVE entry from its text description. We collect CVE entries from NVD and build a
vulnerability classification model based on naïve Bayes. With this method, we are able
to classify CVE entries into vulnerability categories, i.e., CWEs.
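The idea can be illustrated with a small multinomial naïve Bayes classifier with
add-one (Laplace) smoothing. This is a sketch under stated assumptions, not the model
used in the paper: the class name, the tokenizer, the smoothing choice, and the four
training descriptions are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for w in tokenize(text):
                if w in self.vocab:                # ignore words never seen in training
                    score += math.log((counts[w] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training descriptions with assumed CWE labels.
train = [
    ("SQL injection in login form allows remote attackers to execute arbitrary SQL commands", "CWE-89"),
    ("SQL injection vulnerability in search parameter", "CWE-89"),
    ("Cross-site scripting XSS allows remote attackers to inject arbitrary web script or HTML", "CWE-79"),
    ("XSS vulnerability in comment field allows injection of HTML", "CWE-79"),
]
model = NaiveBayes().fit([t for t, _ in train], [c for _, c in train])
print(model.predict("remote attackers can inject web script via the title field"))
```

Working in log space avoids numeric underflow when many word probabilities are
multiplied, and the smoothing keeps a word unseen in one class from zeroing out that
class entirely.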
The remainder of the paper is organized as follows. In Section 2, we review the
related work. In Section 3, we explain the proposed method. Experimental results are
described in Section 4, and Section 5 concludes the paper.
2 Related Work
Genge and Enăchescu [3] proposed a vulnerability assessment tool for devices
connected to the Internet identified by Shodan [4]. This tool simply matched CVE
entries to the corresponding devices without additional processing of the CVE entries.
Chang et al. [5] analyzed vulnerability trends using CVE entries from 2007 to 2010.
They showed the vulnerability trends through vulnerability frequency and severity,
using the CVE entries and CVSS scores, respectively. However, vulnerabilities that
occurred in those years have continued to be discovered and registered since then, so
the current data differ from the security trends that were analyzed at that time.
Neuhaus and Zimmermann [6] used topic models to analyze vulnerability trends,
such as vulnerability types of CVE entries until 2009. The authors found 28 topics in
CVE entries by using Latent Dirichlet Allocation (LDA) and assigned LDA topics to
CWEs. The precision and recall of LDA are good for some CWEs, such as CWE-79 and
CWE-89, but poor for others, such as CWE-310 and CWE-94.
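To make the per-category comparison concrete, the sketch below computes precision and
recall for a single CWE label from lists of true and predicted labels. The function
name and the label lists are invented for illustration and do not reproduce any result
from [6].

```python
def precision_recall(y_true, y_pred, label):
    """Return (precision, recall) for one class label; 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Invented labels: the classifier never predicts CWE-310 in this toy example.
y_true = ["CWE-79", "CWE-79", "CWE-89", "CWE-310", "CWE-310"]
y_pred = ["CWE-79", "CWE-79", "CWE-89", "CWE-79", "CWE-89"]

for cwe in ("CWE-79", "CWE-89", "CWE-310"):
    print(cwe, precision_recall(y_true, y_pred, cwe))
```

Reporting the two measures per category, rather than a single overall accuracy, is
what exposes categories on which a model performs poorly.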
Guo and Wang [7] modeled CVE vulnerabilities based on ontology and used it to
analyze similar vulnerabilities. We refer to their structure of CVE vulnerabilities when
building the classification model in this research.
Li et al. [8] analyzed the characteristics of bugs and classified the bugs through text
classification and information retrieval techniques. In this paper, we use a naïve Bayes
classifier to categorize vulnerabilities.