TSINGHUA SCIENCE AND TECHNOLOGY

ISSN 1007-0214 09/12 pp270–280
DOI: 10.26599/TST.2019.9010003
Volume 25, Number 2, April 2020

Software Vulnerabilities Overview: A Descriptive Study

Mario Calín Sánchez, Juan Manuel Carrillo de Gea, José Luis Fernández-Alemán,
Jesús Garcerán, and Ambrosio Toval

Abstract: Computer security is a matter of great interest. In the last decade there have been numerous cases of
cybercrime based on the exploitation of software vulnerabilities. This has generated great social concern
and has raised the importance of computer security as a discipline. In this work, the most important vulnerabilities of
recent years are identified, classified, and categorized individually. A measure of the impact of each vulnerability is
used to carry out this classification, considering the number of products affected by each vulnerability, as well as
its severity. In addition, the categories of vulnerabilities that have the greatest presence are identified. Based on
the results obtained in this study, we can understand the consequences of the most common vulnerabilities, which
software products are affected, how to counteract these vulnerabilities, and what their current trend is.

Key words: descriptive study; software security; software vulnerabilities; vulnerability databases

Mario Calín-Sánchez, Juan Manuel Carrillo de Gea, José Luis Fernández-Alemán, Jesús Garcerán, and Ambrosio Toval are with the Department of Informatics and Systems, Faculty of Computer Science, University of Murcia, Murcia 30100, Spain. E-mail: fmario.calin, jmcdg1, aleman, jesus.garceran, [email protected].
To whom correspondence should be addressed.
Manuscript received: 2018-12-12; revised: 2019-02-23; accepted: 2019-03-11
© The author(s) 2020. The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

1 Introduction

Computers in general, and the Internet in particular, have a great social importance nowadays; the network of networks allows us to live interconnected in a relatively easy way. Currently, 50% of the world’s population uses the Internet, that is, more than 3.77 billion people are connected online[1]. From the point of view of companies, this technological globalization means both advantages and disadvantages. There has been a longstanding consensus on the idea that being connected to the Internet has its risks[2]; when the company’s data is no longer confined under full control, the likelihood of a third party being able to violate that data increases[3]. Despite the security measures currently used in any Internet service, information systems are frequently exposed to different threats and potential damages[4, 5].

A software vulnerability is a flaw in the system that allows an attacker to breach the security measures implemented[6]. Errors and bugs are not new in the software world; a large and complex software system could contain a large number of bugs[7], and security bugs can sometimes be exploited by malicious users to produce damage or obtain benefits. Security problems due to software vulnerabilities can become particularly worrisome in the case of compromising information of a private nature, such as, for example, information related to a patient’s health history[8–10].

In order to fight against software vulnerabilities, Internet databases have been created that record all these vulnerabilities to inform companies and programmers. A vulnerability database is a platform that stores, maintains, and disseminates information about vulnerabilities discovered in real computer systems[11]. These databases allow for security measurement and vulnerability management. In addition, these data can be listed and each vulnerability can be saved with a unique identifier, which in turn facilitates the sharing of information on those vulnerabilities.

Currently, one of the most extensive vulnerability databases is the National Vulnerability Database (NVD) from the U.S. government (https://nvd.nist.gov/).
This governmental repository stores the vulnerability management data. The body in charge of this database is the National Institute of Standards and Technology (NIST). This database provides the data according to the Security Content Automation Protocol (SCAP) specifications. SCAP is a set of NIST specifications for expressing and manipulating information related to failures and configurations in a standardized way[12]. SCAP has a number of components on which NVD relies, as in the case of the Common Vulnerabilities and Exposures (CVE) vulnerability dictionary. The CVE has related Common Vulnerability Scoring System (CVSS) scores that indicate the severity of a vulnerability[13].

CVE is the vulnerability dictionary that contains all the vulnerabilities with their respective identification (https://cve.mitre.org/about/). All these vulnerabilities are grouped into categories. Common Weakness Enumeration (CWE) offers that categorization and the required functionality to provide the security industry with a list of types of weaknesses. The agency responsible for the management of both CVE and CWE is MITRE Corporation, a non-profit company that offers information technology support to the United States Government.

This article is structured as follows: Section 2 introduces different studies related to the identification and categorization of software vulnerabilities. Section 3 explains in detail the methodology followed to carry out this work about software vulnerabilities, and Section 4 presents the tool that was developed to gather the data to perform the study. Section 5 shows the results and information obtained after analyzing the data. Section 6 discusses the results of this work. Finally, Section 7 presents our conclusion and future work.

2 Related Work

Li et al.[14] classified vulnerabilities according to the complexity of their identification, repair, and exploitation. The vulnerabilities are thus divided into those that are (1) easy to identify and exploit (Bohr-Vulnerability, BOV); (2) complex to identify and exploit (Non-aging-related Mandel Vulnerability, NMV); (3) exploited by attackers to degrade performance (Aging-Related Vulnerability, ARV); and (4) not classified in any of the other three categories (Unknown Vulnerability, UNK). Once the study was done, the results indicated that the most common type of vulnerabilities is NMV. In addition, the study concludes that the time to repair a vulnerability according to its type is 66.8 days in BOV, 70.5 in NMV, and 60.6 in ARV.

Venter et al.[15] focused on performing a categorization of vulnerabilities following this process: (1) acquiring data sources (for example, the CVE list); (2) data preprocessing to add important information and eliminate vulnerabilities that are not of interest; (3) making a data storage using a Self-Organizing Map (SOM); and (4) inspecting and labeling the clusters in order to categorize the CVE database. In this work, vulnerabilities are divided into seven categories: buffer overflow, Denial of Service (DoS), scripting metacharacters, privilege escalation, data corruption, information gathering, and configuration vulnerabilities. The results obtained show that out of a total of 167 vulnerabilities, the most common type was DoS (61) and the least common was the buffer overflow (4).

Alhazmi et al.[16] carried out measurements of the security of the systems according to their density of vulnerability. This metric is the number of vulnerabilities divided by code size, and it helps, for example, a provider to know when a product should be placed on the market. This work studies this metric in the five most used operating systems in 2005. Windows XP showed a lower density compared to Windows 95 and Windows 98 because it was more recent and not so many vulnerabilities had been discovered so far. This article concludes by stating the benefits of this metric and the possibility to expand it with additional data.

Alqahtani et al.[17] used a research methodology focused on a unified ontological representation that processes vulnerabilities and project information. The objective is to establish a bidirectional relationship between the vulnerability databases and traditional software repositories. This study focuses on high-level security vulnerabilities, creating its own ontology to relate products with vulnerabilities, and vulnerabilities with a series of properties such as date, weakness, author, source, etc. The product has a series of properties such as library, application, and operating system. With this ontological representation, the authors[17] expect to make their analysis accessible to developers.

Cruz et al.[18] established a categorization of existing vulnerabilities in the context of cloud computing. The
objective of this work is to study existing security research on cloud computing to analyze the state of the art and identify future directions in this field. Finally, security companies also release periodic reports on computer security threats and general information about the latest news in the security world. For example, McAfee[19] highlighted at the end of 2016 the increase of threats in Mac OS and mobile devices, the increase of malware based on macros, or the decrease of phishing URLs within the web.

In addition to these research articles, there are different classifications available on the Internet. The CVE Details database makes a series of interesting classifications on its website. For example, the top 50 products by total number of distinct vulnerabilities are reported (https://www.cvedetails.com/top-50-products.php). The same top 50 ranking is also presented but concerning vendors instead of products (https://www.cvedetails.com/top-50-vendors.php). In other databases, such as NVD, the percentage of vulnerabilities of each type found in their database is shown in a chart (https://nvd.nist.gov/vuln/visualizations/cwe-over-time). In addition, they offer a study of how the frequency of each type of vulnerability has changed over time, based on the CWE vulnerability classification.

3 Method

The method proposed in this work offers an approach for the study of software vulnerabilities. It is focused on the number of software products affected by each vulnerability in the NVD database (the presence metric), and the measure of severity of those vulnerabilities. Both parameters are related by means of a formula with which a metric called impact is calculated. Each year is investigated individually to later unify the results and conduct a general study.

3.1 Data collection

We relied on the NVD database website to obtain data, which provides a series of XML files that contain information about the emergence of software vulnerabilities following the SCAP specifications. Vulnerabilities from the year 2015 to 2017 were studied in order to analyze the current situation in this regard. The information used to carry out this study is as follows:

entry id (Identifier). It is a unique vulnerability identifier per file.
cvss:score (Severity). Each vulnerability has a degree of severity from 0 to 10 that is given according to the CVSS metric. Within this metric, the classification of a vulnerability is mild if the value is between 0.0–3.9, medium if the value is between 4.0–6.9, high if the value is between 7.0–8.9, and critical if the value is between 9.0–10.0[13].
vuln:cwe (Category). Categories are used to group the vulnerabilities. These categories are defined according to the SCAP component called CWE and there are about 1000 different categories of vulnerability types.

3.2 Obtaining the annual impact

We initially address each year separately. For each year, a process to obtain the impact of a vulnerability is carried out. In the context of this study, the impact is defined as a relation between the number of software products that are affected by a vulnerability in a specific year, and the severity that is assigned to that vulnerability. For example, if two vulnerabilities affect 100 different systems and one has a severity of 5 and another one has a severity of 7, the vulnerability with a severity of 7 will have more impact than the other. The process described below is followed to calculate the impact:

The software products that are affected by each of the vulnerabilities are identified, and their severity is obtained.

The percentage of software products that are affected by a vulnerability with respect to the total number of products of that year is obtained.

$$\mathrm{Presence}_{vul} = \frac{\mathrm{Products}_{vul}}{\mathrm{TotalProducts}}$$

Once the frequency of appearance is obtained (i.e., the presence), the severity of the vulnerability is also considered to compute the impact of the vulnerability. In this way, a value is obtained that allows us to relate presence and severity giving a similar importance to both values.

$$\mathrm{Impact}_{vul} = \mathrm{Presence}_{vul} \times \mathrm{Severity}_{vul}$$

3.3 Obtaining the annual category

The vulnerabilities are split up into categories once the presence of each vulnerability is obtained. Despite the large amount of information available in NVD, there are many vulnerabilities that do not have a category within this data source. Part of the desired information can thus be missing. Even so, all existing categories are studied and the most repeated ones are obtained.
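
The presence and impact formulas of Section 3.2, together with the severity bands listed in Section 3.1, can be illustrated with a minimal Java sketch. It is not taken from the authors' tool: the class name ImpactSketch, its method names, and the yearly total of 20 000 products are assumptions made only for this example.

```java
/** Illustrative sketch only; names and sample figures are hypothetical. */
public class ImpactSketch {

    /** Presence: fraction of the year's products affected by one vulnerability. */
    public static double presence(int affectedProducts, int totalProducts) {
        if (totalProducts <= 0 || affectedProducts < 0 || affectedProducts > totalProducts) {
            throw new IllegalArgumentException(
                "expected totalProducts > 0 and 0 <= affectedProducts <= totalProducts");
        }
        return (double) affectedProducts / totalProducts;
    }

    /** Impact: presence multiplied by the CVSS severity score (0.0 to 10.0). */
    public static double impact(int affectedProducts, int totalProducts, double severity) {
        if (severity < 0.0 || severity > 10.0) {
            throw new IllegalArgumentException("severity must lie between 0.0 and 10.0");
        }
        return presence(affectedProducts, totalProducts) * severity;
    }

    /** Severity band, following the thresholds given in Section 3.1. */
    public static String severityBand(double severity) {
        if (severity < 4.0) {
            return "mild";
        }
        if (severity < 7.0) {
            return "medium";
        }
        if (severity < 9.0) {
            return "high";
        }
        return "critical";
    }

    public static void main(String[] args) {
        int totalProducts = 20_000; // hypothetical yearly total, not a figure from the study

        // The example from the text: both vulnerabilities affect 100 products,
        // so only the severity separates their impact.
        double impactSeverity5 = impact(100, totalProducts, 5.0);
        double impactSeverity7 = impact(100, totalProducts, 7.0);

        System.out.println("severity 5 (" + severityBand(5.0) + "): impact = " + impactSeverity5);
        System.out.println("severity 7 (" + severityBand(7.0) + "): impact = " + impactSeverity7);
    }
}
```

Because presence is a ratio between 0 and 1 and severity ranges from 0 to 10, the product stays between 0 and 10, and two vulnerabilities with equal presence are ranked by severity alone, as in the example of Section 3.2.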

4 Data Collection Tool

A tool was developed using the Java programming language to work with the XML files, allowing for the extraction and presentation of data in the CSV format, which is compatible with spreadsheet solutions such as LibreOffice Calc and Microsoft Excel. The developed tool is also a contribution of this work, and it is open source, so it is possible to download, use, and modify it through a repository (https://github.com/mariocalin/nistAnalysis).

This tool is developer-oriented; neither graphical user interface nor command-line program is therefore provided. In order to use the tool, the repository must be downloaded or cloned, and imported into your preferred Java Integrated Development Environment (IDE). Then, the tool can be run, and its parameters can be changed as desired.

4.1 Source code description

The complete Java documentation of the tool can be found in the doc folder inside the mentioned repository. In this section, however, the most important aspects of the tool in terms of development are described.

Firstly, the package src/nist/functions is presented. There are three important files in this package:

INistDataAnalysis.java: Interface that defines the operations a NIST data analyzer must have. This interface is created to provide parsers to different file formats (e.g., XML or JSON).
INistDataResult.java: Interface that defines the elements a NistDataResult must have. Every NistDataAnalyzer must create NistDataResults.
XMLNistParser.java: It implements INistDataAnalysis. This parser processes the XML Data Feed. It will parse the XML into the tool model.

Secondly, the package src/nist/model includes the three main concepts that can be found in the model:

Entry represents a vulnerability entry.
Category represents a CWE vulnerability category.
Result represents a summary of a year in terms of vulnerabilities. It offers two different types of Result: (1) per categories, and (2) per entries.

Thirdly, in the package src/nist/utils, there is only one class. It contains some utilities in terms of readability.

Finally, the Main.java file is the entry point for the tool and contains the main function to be run with the IDE.

4.2 Usage of the tool

A usage example of the data collection tool is shown in Fig. 1.

    public static void main(String[] args) throws Exception {
        // Creates an analyzer instance with the year to analyze
        INistDataAnalysis analyzer = new XMLNistParser(XMLNistParser.XMLFiles.FULL_YEAR_2017);

        // Creates a result
        Result result = analyzer.createResult();

        // Prints the entries result to a CSV file or String
        result.entriesResult().toCSV("entries-2017.csv", true);
        result.entriesResult().toString();

        // Prints the categories result to a CSV file or String
        result.categoriesResult().toCSV("categories-2017.csv", true);
        result.categoriesResult().toString();

        System.out.println("END OF PROGRAM");
    }

Fig. 1 A usage example of the data collection tool.

A typical execution flow of the tool is illustrated in Fig. 2. As shown in the diagram, an XML file is injected into the XMLNistParser, which processes the file and creates a Result that includes yearly data, either focused on categories of vulnerabilities or vulnerabilities.

Fig. 2 Data collection tool diagram.

4.3 XML data feeds

In order to run the tool and get results, it is mandatory to download the data feeds that NIST provides in their official website (https://nvd.nist.gov/vuln/data-feeds), and place them into the corresponding folder with the same exact name that is initially defined. At the moment, only XML feeds are allowed and the folder is XML Data.

4.3.1 XML data feed structure

Each XML file contains multiple elements of the type entry. This element represents a vulnerability registered in the database. There are some elements appended as children of each entry; among other info elements (e.g., vuln:cve-id or vuln:published-datetime), the most interesting elements are as follows.

vuln:vulnerable-configuration. This element refers to the configuration in which the vulnerability
was found. NIST provides a list of configurations where they test the vulnerabilities.
vuln:vulnerable-software-list. It contains the specific software products that are affected by the vulnerability.
vuln:cvss. It includes the CVSS metrics of the vulnerability. Among them, of special interest is the score given by the cvss:score element.
vuln:cwe. This element refers to the category of the vulnerability in the CVE dictionary.

4.3.2 Parsing mechanism

The parsing mechanism implemented to process the XML files is described below:

(1) The XMLNistParser receives an XML file path that corresponds to the NIST XML year data feed. It loads the file and, by using the Java libraries of javax.xml.parsers and org.w3c.dom, it creates a Result object containing the information needed in an object-oriented way.

(2) With the Result object, it can be chosen whether to print the entries results or the categories results (via CSV or console). The only difference between them is the way of representing the information:

EntryResult is focused on vulnerability entries; it shows information about the entry code, the entry CVSS score, and the products affected by the vulnerability.
CategoryResult is focused on CWE categories; it shows information about the total number of vulnerability entries that a category has, and its average vulnerability CVSS score.

5 Results

The next step is to present the results. As defined in the method, the results are extracted on an annual basis (see Section 5.1) and after that, the global results are shown (see Section 5.2).

Some vulnerabilities have different identifiers but they can be analyzed together for the sake of simplicity. Since both their causes and their consequences are similar and differ little between them, we have grouped these vulnerabilities into one.

5.1 Annual analysis

Firstly, Fig. 3a shows the results regarding the impact of vulnerabilities in 2015. The vulnerability with the greatest impact (CVE-2015-1290) is a buffer overflow in the Google Chrome browser versions ranging from 0.1 through 43, which allows attackers to obtain operating system privileges.

The second, third, and fourth vulnerabilities with the greatest impact this year (CVE-2015-0569/70/71) are also buffer overflows, but this time in the Linux kernel, in versions 3.x and 4.x. These vulnerabilities allow attackers to obtain privileges through an application run on the mentioned platform.

The fifth (CVE-2015-0573) vulnerability also affects the Linux 3.x kernel through a driver that is used in the Qualcomm Innovation Center (QuIC), allowing guest users to obtain operating system privileges or produce a DoS.

Figure 3b shows a breakdown by categories of the vulnerabilities with greater presence detected in 2015. Among them, there is a category (CWE-119) that refers to Improper Restriction of Operations within the Bounds of a Memory Buffer that has been registered 1073 times, followed by CWE-79 (i.e., Improper Neutralization of Input During Web Page Generation (“Cross-site Scripting”)) with 773 records. The third most important category is CWE-200, which refers to Information Exposure, with a record of 690 times.

It should be noted that there are 1911 vulnerabilities not associated with a specific category whose average severity is 6.68, which represent 24% of the total number of vulnerabilities registered in 2015.

Secondly, the results of the year 2016 are presented in Fig. 3c. As shown in the chart, the vulnerability with the greatest impact (CVE-2016-6380), which refers to obtaining sensitive information or causing DoS, occurs in the DNS forwarder in Cisco IOS 12.0 through 12.4 and 15.0 through 15.6 and IOS XE 3.1 through 3.15.

The vulnerability with the second highest impact is CVE-2016-1409, referring to DoS. It affects the Neighbor Discovery (ND) protocol implementation in the IPv6 stack in Cisco IOS XE 2.1 through 3.17S, IOS XR 2.0.0 through 5.3.2. The occurrence of this vulnerability reaches 9795 cases, but a severity score of just 5 (i.e., medium) is assigned.

The third vulnerability with the greatest impact is CVE-2016-6393, referring to inadequate management of system resources. It mainly affects Cisco DNS forwarders from 4.1 through 7.2.

With respect to the categories, as shown in Fig. 3d, the most repeated category of vulnerabilities is again CWE-119, with 1322 cases out of 9431 records. The next category with the highest representativeness, with 823 cases, is CWE-200. The third category in this ranking is CWE-264, which refers to Permissions, Privileges, and Access Controls, with a total of 725
records.

A significant percentage of vulnerabilities are not associated with a category, with a total of 2292 cases and an average severity of 6.3. This represents 24.3% of the total registered vulnerabilities.

Fig. 3 Impact of vulnerabilities and presence of categories (annual breakdown): (a) vulnerabilities with the greatest impact of 2015; (b) categories with the greatest presence in 2015; (c) vulnerabilities with the greatest impact of 2016; (d) categories with the greatest presence in 2016; (e) vulnerabilities with the greatest impact of 2017; (f) categories with the greatest presence in 2017.

Finally, the results of the year 2017 are shown in Fig. 3e. The vulnerability with the greatest impact is CVE-2017-5055, which refers to out-of-bounds reads, and allows the attacker to access information from unauthorized memory areas or cause a failure.

The second vulnerability with the greatest impact is CVE-2017-12240, referring to a buffer overflow condition in the DHCP relay subsystem of Cisco IOS 12.2 through 15.6 and Cisco IOS XE Software. It could allow an attacker to execute arbitrary code and gain full control of the system, and also perform a DoS attack.

The third vulnerability with the greatest impact is CVE-2017-5130, which refers to an integer overflow that may allow an attacker to potentially exploit heap corruption. It affects Google Chrome versions prior to 62.0.3202.62, and the libxml2 library before version 2.9.5.

With respect to the vulnerability categories, which are shown in Fig. 3f, the most repeated category is again CWE-119 with 2219 cases out of 14 027. The next category with greatest presence is CWE-79. This category has 1183 cases, followed by the 1140 cases of the CWE-284 category, which refers to Improper Access Control.

The number of vulnerabilities not associated with specific categories is 1944, with an average severity of 5.42. This amount represents 13.8% of total vulnerabilities.

5.2 Interannual analysis

The total grouped results of this study are presented
below. Figure 4a shows the 20 vulnerabilities with the greatest impact of the three years under study, and Fig. 4b shows a breakdown of the categories with the largest number of vulnerabilities detected during that time.

Our results indicate that the vulnerability with the greatest impact in the period of time analyzed was CVE-2015-1290, referring to buffer overflow. The second and third vulnerabilities with the greatest impact occurred in 2016 (i.e., CVE-2016-6380 and CVE-2016-1409). The next vulnerability is the one with the greatest impact of the year 2017, CVE-2017-12240. Of the 20 vulnerabilities with the greatest general impact, 17 are from the year 2017, while there is only one from the year 2015 and two from the year 2016.

With respect to the categories, CWE-119 is the most repeated category throughout the different years. It includes 4614 vulnerabilities, which represents 14.68% of the total. Secondly, the CWE-79 category has 2636 vulnerabilities or 8.39%. The category CWE-200 has 2606 vulnerabilities, being thus the third with the largest global presence, 8.29%. Out of a total of 31 426 vulnerabilities, 6147 of them (or 19.56%) do not specify category.

Fig. 4 Impact of vulnerabilities and presence of categories (combined): (a) vulnerabilities with the greatest impact; (b) categories with the greatest presence.

6 Discussion

Once the global data of the three years were presented, some conclusions can be outlined about the most relevant vulnerabilities according to the metrics proposed in this study, their type, and their category.

The vulnerabilities with the greatest impact identified in this work have as a consequence the DoS, and are of the utmost importance for software products. This type of vulnerability causes a service or resource to be inaccessible to legitimate users.

Another consequence of the vulnerabilities that are among those with the greatest impact is the escalation of privileges. Vulnerabilities of this nature are among the most serious that exist today; there are, however, fewer cases of this type of vulnerabilities. When privilege management is inadequate or fails, an attacker can compromise the security of the software by unauthorized appropriation of permissions (e.g., reading, modification) on files and directories that could contain sensitive information.

Inferential statistical analysis has been performed to formally check whether an observed difference between the impact of vulnerabilities in different years could have happened by chance. The statistical software package IBM SPSS Statistics (https://www.ibm.com/products/spss-statistics) version 20 was used to carry out the data analysis.

The assumptions about the data that are entailed by statistical tests must be taken into consideration to apply the correct technique. In this regard, parametric tests require the variables to come from a normal distribution; when this requirement is not satisfied, a non-parametric test is recommended. In addition, the number of groups is a key factor to decide upon the technique. Typically, the one-way ANalysis Of VAriance (ANOVA) and the Kruskal-Wallis test (parametric and non-parametric techniques, respectively) are used to test for differences among at least three groups. Indeed, since our analysis encompasses three years, either the ANOVA or the Kruskal-Wallis test should be used.

When applying the Kruskal-Wallis test to compare the medians of the impact of the vulnerabilities between the years 2015 ($M = 0.00314710$), 2016 ($M = 0.00257254$), and 2017 ($M = 0.00477200$), statistically significant differences were observed ($\chi^2(2) = 2689.536$, $p < 0.001$). In the post-hoc contrasts, it can be seen that in 2017 the impact of the vulnerabilities was greater than in 2016 ($p < 0.001$) and 2015 ($p < 0.001$). Statistically significant differences were also found between the year 2015 and 2016 ($p < 0.001$) that show a greater impact of the
vulnerabilities in 2015.

With respect to the categories, it is noteworthy that despite the fact that the CWE organism has defined 125 categories, the 14 most common categories represent 66.13% of the total number of vulnerabilities. The rest of the categories represent only 14.31% of the total vulnerabilities. Likewise, in the three years under study (i.e., 2015, 2016, and 2017), 19.56% of the vulnerabilities have not been associated with a category. Figure 5 shows all these details graphically.

Within these categories, buffer overflow (CWE-119) is by far the most common problem, with almost twice as many cases as the second largest in the ranking. A buffer overflow is a read or write to a memory location outside the buffer limit[20]. This category, which represents 14.68% of the vulnerabilities, provides an indication of the type of weaknesses the attackers are taking advantage of, as well as where the most significant problems are in terms of security of the main software products. The typical consequences of vulnerabilities in this category are usually running unauthorized code, modifying or reading memory, DoS, and consuming resources, among others (http://cwe.mitre.org/data/definitions/119.html).

The categories placed next in the list also have a considerable presence. CWE-79 corresponds to failure to neutralize user input that is used as a web

disclosure of information to someone who does not have authorization to have access to the information; finally, CWE-264 is described as weaknesses related to access control (i.e., permissions, privileges, and other security features).

The four categories mentioned above (i.e., CWE-119, CWE-79, CWE-200, and CWE-264) represent 38.82% of the total. This is an important detail, since it could have been initially thought that the amount of information about categories would be too extensive to be studied in this way. However, in the end the categories of vulnerabilities are constantly repeated. Therefore, attackers frequently use the same strategies against different software. In other words, a large number of weaknesses are common to different software products.

To provide more information about the categories of vulnerabilities that stand out throughout the three-year period, Table 1 shows a relationship between these categories and the affected programming languages, paradigms, technologies, and platforms.

The C, C++, and assembler languages are, despite their age, the most affected programming languages by the CWE-119 category, which has the highest number of registered vulnerabilities. In addition, another of the categories with the most vulnerabilities (CWE-200) makes explicit mention of an information exhibition in
page that is served to others; CWE-200 refers to a mobile environment. This is in line with the drastic
increase in recorded attacks to the security of mobile
devices. According to Nokia[21] , more than 100 million
mobile devices were infected by malware in 2016,
including smartphones, laptops, and a wide range of
Internet of Things devices.
In summary, buffer overflow is currently the most
common vulnerability category; on the other hand, the
main consequence of vulnerabilities is DoS.
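
To make the CWE-119 weakness discussed in this section more concrete, the following minimal C sketch shows one way a write can be kept inside the buffer limit. It is not taken from the study: the helper name bounded_copy is hypothetical, and the sketch assumes only the standard C function snprintf, used here in place of an unbounded copy such as strcpy.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an illustration, not part of the study):
 * copies src into dst without ever writing more than dst_size bytes.
 * Returns 0 when the whole string fits, -1 otherwise. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1; /* nothing can be written safely */
    }

    /* snprintf writes at most dst_size bytes, terminator included,
     * and returns the length the complete output would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);

    if (needed < 0) {
        return -1; /* formatting error */
    }
    if ((size_t)needed >= dst_size) {
        return -1; /* src did not fit; dst holds a truncated, terminated copy */
    }
    return 0;
}
```

Under these assumptions, calling bounded_copy(buf, sizeof buf, input) with an 8-byte buf and a 10-character input returns -1 and leaves a terminated 7-character prefix in buf, whereas strcpy would have written past the end of the buffer. Reporting the truncation to the caller, rather than silently cutting the input, is a design choice of this sketch.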

Fig. 5 Percentage of presence of each category.

Table 1 Environments affected by the main categories of vulnerabilities.
| Category | Programming language | Paradigm | Technology | Platform |
|---|---|---|---|---|
| CWE-119 | C, C++, and assembler | Independent | Independent | Independent |
| CWE-79 | Independent | Web-based | Web technology | Independent |
| CWE-200 | Independent | Independent | Independent | Mobile environment |
| CWE-264 | Independent | Independent | Independent | Independent |

7 Conclusion and Further Work

This work can be of help to inform users, researchers, and security practitioners about the vulnerabilities with the greatest impact in recent years, and the software that is affected by them. This information can be useful, for example, to apply the corresponding security patches[22]. We believe that it is crucial to have a grasp on the most relevant vulnerabilities nowadays to protect ourselves against them. In addition, the study can serve as a guide to foresee the evolution of vulnerabilities in the coming years, as well as identify the most common categories of vulnerabilities.

The results of this study indicate that the vulnerabilities with the greatest impact are usually found in free and open source software. The most repeated software products in the ranking are different versions and products of CISCO (IOS, XE, etc.), versions 3.x and 4.x of the Linux kernel, and Mozilla software (Firefox and Thunderbird). To a lesser extent, there is also a presence of software products from NTP, ImageMagick, Moodle, Tryton, Django, etc.

It is worth mentioning the notable presence of vulnerabilities in the Linux kernel and Apple’s Mac OS X operating system, despite their good reputation in terms of security features. In this sense, the NVD database contains more vulnerabilities, and with greater impact, for the Linux kernel and Apple’s Mac OS X operating system than for Microsoft’s Windows operating system in the three years under study. According to Ref. [23], 384 vulnerabilities were detected in Mac OS X, 77 in the Linux kernel, and 53 in Microsoft Windows 10 in 2015. This fact may be motivated by the greater or lesser willingness to make public a vulnerability detected in the system, which the free and open source software community seems to do more frequently than Microsoft[24].

As shown in Section 6, buffer overflow is the most common software vulnerability. This vulnerability causes problems ranging from a DoS to the total appropriation of the control of the application by the attacker. It is mainly a problem of low-level languages, such as C or C++, while higher level languages such as Java or Visual Basic prohibit direct access to memory and avoid this problem. For this reason, when possible, it is better not to allow users to work with low-level code, and to work only with high-level code. It is also recommended that developers replace insecure functions such as strcpy, strcat, and so on[20].

Among other solutions, there is a series of well-known countermeasures to protect against buffer overflow[25]:

Dynamic linking of secure libraries. These libraries replace unsafe functions with other functions with the same purpose that incorporate measures that protect against this attack. An example of this countermeasure is the library libsafe (https://directory.fsf.org/wiki/Libsafe).

Compiler tools. The compilers insert instructions that allow verifying the integrity of the stack, as well as eliminate the conditions that an attacker needs to perform a buffer overflow attack. The best known solutions are StackShield (http://www.angelfire.com/sk/stackshield/info.html) and StackGuard[26].

Our future work includes the monitoring of vulnerabilities that can compromise the main operating systems for mobile devices (i.e., iOS and Android). Owing to the strong presence of this type of device in our daily life[27], we consider this topic of the utmost importance. Another possible line of future work is the study of the variation of the consequences of the vulnerabilities with the greatest impact. This kind of study would analyze the situation of a vulnerability or category of vulnerabilities in a given year, and compare it to the situation in the previous years. As presented in Section 5, DoS attacks in 2017 are not as frequent as in 2015 and 2016. This suggests that we could be witnessing a change of trend, which could be caused by a greater ability or interest of the cybercriminals to attack other software weaknesses that were less common in the past[28]. In this sense, the study of the evolution over time of vulnerabilities in software products is another interesting line of work for researchers in this field.

Acknowledgment

This research was part of the BIZDEVOPS-GLOBAL-UMU project (No. RTI2018-098309-B-C33) supported by the Spanish Ministry of Economy and Competitiveness and the European Fund for Regional Development (ERDF).

References

[1] We Are Social and Hootsuite, Digital in 2017: Global overview, https://wearesocial.com/special-reports/digital-in-2017-global-overview, 2017.
[2] S. Lichtenstein, Internet risks for companies, Comput. Secur., vol. 17, no. 2, pp. 143–150, 1998.
[3] M. P. Qi, J. Chen, and Y. Chen, A secure biometrics-based authentication key exchange protocol for multi-server TMIS using ECC, Comput. Methods Programs Biomed., vol. 164, pp. 101–109, 2018.
[4] M. Jouini, L. B. A. Rabai, and A. B. Aissa, Classification of security threats in information systems, Proced. Comput. Sci., vol. 32, pp. 489–496, 2014.
[5] A. N. Navaz, M. A. Serhani, N. Al-Qirim, and M. Gergely, Towards an efficient and energy-aware mobile big health
data architecture, Comput. Methods Programs Biomed., vol. 166, pp. 137–154, 2018.
[6] O. Alhazmi, Y. Malaiya, and I. Ray, Security vulnerabilities in software systems: A quantitative perspective, in Proc. 19th Ann. IFIP WG 11.3 Working Conf. on Data and Applications Security XIX, Storrs, CT, USA, 2005, pp. 281–294.
[7] J. T. Gong and H. Y. Zhang, BugMap: A topographic map of bugs, in Proc. 9th Joint Meeting on Foundations of Software Engineering, Saint Petersburg, Russia, 2013, pp. 647–650.
[8] J. L. Fernández-Alemán, I. C. Señor, P. Á. O. Lozoya, and A. Toval, Security and privacy in electronic health records: A systematic literature review, J. Biomed. Inform., vol. 46, no. 3, pp. 541–562, 2013.
[9] I. C. Señor, J. L. Fernández-Alemán, and A. Toval, Are personal health records safe? A review of free web-accessible personal health record privacy policies, J. Med. Internet Res., vol. 14, p. e114, 2012.
[10] C. T. Li, D. H. Shih, and C. C. Wang, Cloud-assisted mutual authentication and privacy preservation protocol for telecare medical information systems, Comput. Methods Programs Biomed., vol. 157, pp. 191–203, 2018.
[11] Y. H. Gu and P. Li, Design and research on vulnerability database, in Proc. 3rd Int. Conf. on Information and Computing, Wuxi, China, 2010, pp. 209–212.
[12] C. Schmidt, Technical introduction to SCAP, https://www.energy.gov/sites/prod/files/cioprod/documents/Technical Introduction to SCAP - Charles Schmidt.pdf, 2010.
[13] P. Mell, K. Scarfone, and S. Romanosky, Common vulnerability scoring system, IEEE Secur. Privacy, vol. 4, no. 6, pp. 85–89, 2006.
[14] X. D. Li, X. L. Chang, J. A. Board, and K. S. Trivedi, A novel approach for software vulnerability classification, in Proc. 2017 Ann. Reliability and Maintainability Symp., Orlando, FL, USA, 2017.
[15] H. Venter, J. H. P. Eloff, and Y. L. Li, Standardising vulnerability categories, Comput. Secur., vol. 27, nos. 3&4, pp. 71–83, 2008.
[16] O. H. Alhazmi, Y. K. Malaiya, and I. Ray, Measuring, analyzing and predicting security vulnerabilities in software systems, Comput. Secur., vol. 26, no. 3, pp. 219–228, 2007.
[17] S. S. Alqahtani, E. E. Eghan, and J. Rilling, Tracing known security vulnerabilities in software repositories – A semantic web enabled modeling approach, Sci. Comput. Programming, vol. 121, pp. 153–175, 2016.
[18] Z. B. Cruz, J. L. Fernández-Alemán, and A. Toval, Security in cloud computing: A mapping study, Comput. Sci. Inform. Syst., vol. 12, no. 1, pp. 161–184, 2015.
[19] McAfee, McAfee labs threats report, https://www.mcafee.com/enterprise/en-us/assets/reports/rpquarterly-threats-mar-2016.pdf, 2016.
[20] J. C. Foster, V. Osipov, N. Bhalla, N. Heinen, and D. Aitel, Buffer Overflow Attacks. Syngress Publishing, 2005.
[21] Nokia, Android & iOS infections rose by 400%. Windows Infections declined, https://nokiapoweruser.com/nokia-malware-report-smartphones-infections-rose-nearly-400-percent-2016/, 2016.
[22] A. V. Uzunov, E. B. Fernandez, and K. Falkner, Assessing and improving the quality of security methodologies for distributed systems, J. Softw.: Evol. Process, vol. 30, no. 11, p. e1980, 2018.
[23] C. Manes, 2015’s MVPs-the most vulnerable players, https://techtalk.gfi.com/2015s-mvps-the-most-vulnerable-players/, 2016.
[24] N. Metha and B. Leonard, Disclosing vulnerabilities to protect users, https://security.googleblog.com/2016/10/disclosing-vulnerabilities-to-protect.html, 2016.
[25] W. L. Du, Chapter 4: Buffer overflow attack, Computer Security: A Hands-on Approach, Syngress Publishing, 2017.
[26] C. Cowan, C. Pu, D. Maier, H. Hintony, J. Walpole, P. Bakke, S. Beattie, A. Grier, P. Wagle, and Q. Zhang, StackGuard: Automatic adaptive detection and prevention of buffer-overflow attacks, in Proc. 7th Conf. on USENIX Security Symp., San Antonio, TX, USA, 1998, p. 5.
[27] Newzoo, Newzoo global mobile market report 2018 – Light version, https://newzoo.com/insights/trend-reports/newzoo-global-mobile-market-report-2018-light-version/, 2018.
[28] F. Mercaldo, A. Di Sorbo, C. A. Visaggio, A. Cimitile, and F. Martinelli, An exploratory study on the evolution of android malware quality, J. Softw.: Evol. Process, vol. 30, no. 11, p. e1978, 2018.

Mario Calín Sánchez received the BS and MS degrees from University of Murcia, Murcia, Spain, in 2016 and 2018, respectively. He worked as a researcher in the software sustainability area for the Software Engineering Research Group of the University of Murcia, and now he is currently a software developer with the Polytechnic University of Cartagena, where he is a member of the INDIe project development team.

Juan Manuel Carrillo de Gea received the BS, MS, and PhD degrees from University of Murcia, Murcia, Spain, in 2000, 2009, and 2016, respectively. He is an assistant professor with the University of Murcia. He has published more than 30 articles on software engineering, requirements engineering, and applications in the e-health and e-learning domains in relevant journals and conferences. His current research interests include software engineering, sustainability, medical informatics, and education.

José Luis Fernández-Alemán received the BS and PhD degrees from University of Murcia, Murcia, Spain, in 1994 and 2002, respectively. He is currently an associate professor with University of Murcia, where he is a member of the Software Engineering Research Group. He has published more than 50 JCR papers in the areas of software engineering and requirements engineering and their application to the fields of e-health and e-learning. Currently, his main research interest is m-health and m-learning and their application to computer science, medicine, and nursing.

Jesús Garcerán received the BS degree from University of Murcia, Murcia, Spain, in 2018. He has worked for 8 months in HOP Ubiquitous S.L. as a software developer, where he was managing some apps, and he has experience with IoT and FIWARE. He has specialized in information systems, but he is interested in other areas inside computing, like front-end development with AngularJS, cybersecurity, and how to protect an enterprise and, finally, in the possibilities of IoT agents, especially from the side of software. He is pending to publish a research work about Semantically-Enhanced System for Pest Recognition in AISC Springer as an author.

Ambrosio Toval received the BS degree from University Complutense of Madrid, Madrid, Spain, in 1983, and the PhD degree from Technical University of Valencia, Valencia, Spain, in 1994. He is currently a full professor with University of Murcia, Spain, where he is the head of the Software Engineering Research Group. He has conducted a variety of research and technology transfer projects in the areas of requirements engineering processes and tools, privacy and security requirements, sustainable requirements, and applications in the e-health, e-learning, and mobile development domains. He has published in the same topics in international journals, such as IEEE Software, Information and Software Technology, Requirements Engineering, Computer Standards & Interfaces, IET Software, International Journal of Information Security, etc.
