International Journal of Computer Applications (0975 – 8887)

Volume 160 – No 7, February 2017

Machine Learning: A Review on Binary Classification


Roshan Kumari
M.Tech. Scholar
Department of Computer Science & Engineering
ABES Engineering College, Ghaziabad

Saurabh Kr. Srivastava
Sr. Asst. Professor
Department of Computer Science & Engineering
ABES Engineering College, Ghaziabad

ABSTRACT
In information extraction and retrieval, binary classification is the process of assigning a given document or account to one of two predefined classes. Sockpuppet detection is a binary classification task in which each account is labelled either sockpuppet or non-sockpuppet. Sockpuppets have become a significant issue because they let one person assume a fake identity for a specific purpose or for malicious use. Text categorization is also performed with binary classification. This paper reviews binary classification and discusses the various approaches that have been applied to it.

Keywords
Sockpuppets, Non sockpuppets, multiple identity deception, text categorization, NB, SVM, Random Forest, Ensemble methods and Binary Classification

1. INTRODUCTION
Sockpuppets are fake IDs or accounts created for some specific malicious use. Sockpuppet detection is therefore based on binary classification: each account is assigned one of two classes, sockpuppet or non-sockpuppet. Anyone can open a new account by supplying very little information, so a method is needed to find these sockpuppet or suspicious cases, because they violate privacy. Wikipedia does not provide any specific facility to detect such malicious accounts, so detection is currently done manually, which is time-consuming and costly. Identifying such accounts as sockpuppets in Wikipedia is therefore a significant issue. Multiple identity deception is another example of binary classification, in which one person holds more than one account for malicious use. With multiple accounts, one can try to alter a sender's content for some specific purpose, so multiple identity deception is also a big issue on social media. Text categorization is likewise done by performing binary classification, and it plays an important role in information retrieval for classifying different documents.

Sockpuppet detection has become a significant problem in the social media environment. Thamar Solorio et al. [1] contributed a small case study of automated sockpuppet detection based on authorship attribution, where the task consists of analyzing a written document to predict its true author. Features used in authorship attribution are collected and examined, and each user comment is treated as a "document". Classification takes two steps: first, the classifier's prediction is obtained for each comment; second, the per-comment predictions are combined in a majority voting scheme to assign a final decision to each account.

Michail Tsikerdekis et al. [2] proposed a novel approach that uses non-verbal behavior to detect multiple account identity deception on social media. The creation of new accounts by blocked users is called sockpuppetry. Identity deception is a major issue in social media. Nonverbal communication (user activity or movement) is more powerful than verbal communication (speech or text). Identity deception focuses on manipulating the sender's information and is divided into three categories: identity concealment, identity theft and identity forgery. The major issue with identity deception in social media is the presence of multiple identities held by one user. The authors used logs of blocked users on Wikipedia from February 2004 to October 2013 as the dataset and applied SVM, Random Forest and Adaptive Boosting (ADA) to classify accounts as sockpuppets or not. They found that Adaptive Boosting provides the best balance between recall and precision and achieved the highest accuracy among all the classification techniques used.

2. REVIEWED PAPERS IN THIS DIRECTION
Thamar Solorio et al. [3] described a corpus of sockpuppet cases from Wikipedia. The corpus provides a real-world dataset of short messages from malicious users. Sockpuppet investigation (SPI) cases in Wikipedia are identified using a support vector machine. The authors try to decide, by binary classification, whether the editors mentioned in an SPI case belong to the same person or not. The results are based on observations from the comments made by each user. The complete list of features used can be found at the following link: http://docsig.cis.uab.edu/media/2014/03/list-of-features.pdf.

Xueling Zheng et al. [4] proposed algorithms for detecting sockpuppet pairs within one forum and across two different forums. Two different online accounts that belong to the same person are referred to as a sockpuppet pair. The paper proposes two methods: the first is designed to detect sockpuppet pairs in the same discussion forum, while the second detects sockpuppet pairs that appear in two different forums. The authors used a dataset from Uwants and HK discuss covering March 2010 to May 2010. On the basis of a detection score, they tried to find similar keywords used by different people.

Sadia Afroz et al. [5] proposed a method to detect stylistic deception in written documents. The paper states that, with the help of a large feature set, it is possible to distinguish regular documents from deceptive documents. To detect adversarial writing, it is necessary to identify a set of discriminating features that distinguish deceptive writing from regular writing. Once these features are determined, supervised learning techniques are used to classify new writing styles. Three feature sets are used: the Writeprints feature set, the lying-detection feature set and the 9-feature set (authorship attribution features). SVM and DT techniques are used for analysis. The SVM classifier works best with the Writeprints features and DT performed well with the lying
detection features.

Dhanyasree P et al. [6] worked on detecting identity deception on social networking sites, where one person may create multiple accounts for malicious use, which has become a very big issue. The authors try to detect such accounts on the basis of verbal and non-verbal behavior. Their approach comprises calculation of non-verbal variables, model testing using the Random Forest method and identification of the time window using PSO. They found that detecting multiple accounts through nonverbal behavior is more accurate and that the automated system for detecting multiple accounts performs well. Verbal and nonverbal behavior can also be combined for sockpuppet detection, in which binary classification separates sockpuppet from non-sockpuppet cases.

M BalaaNand et al. [7] proposed a method to detect multiple accounts and fake identities on social media such as Wikipedia using non-verbal behavior (user activity and user movement). The authors worked with time-independent non-verbal behavior, using data from Wikipedia and the SVM, RF and ADA techniques for binary classification. They found that Adaptive Boosting gives the best balance between recall and precision with high accuracy.

Sheetal Antony et al. [8] proposed a system that uses verbal and non-verbal behavioral patterns to detect identity deception. An admin manages each user account. The details and activities of the user are analyzed to detect whether there is any deception, and the details are verified against a database. If deception is detected, the user is asked some security questions.

Zaher Yamak et al. [9] proposed a detection method with the following steps: first, data are crawled from Wikipedia; then sockpuppet accounts are detected; next, a set of non-verbal behavior features is created and the values of the proposed features are calculated; finally, machine learning algorithms are used for classification. SVM, RF, Naive Bayes, K nearest neighbor, Bayesian Network and Adaptive Boosting are compared. The best accuracy for sockpuppet detection is given by Random Forest (99.8%) and Bayesian Network (99.6%).

Malware detection is an important issue in protecting computer systems and communication infrastructure. Anti-virus technology is a key player in tackling malware files and is based on two methods: signature-based and heuristic-based. Asaf Shabtai et al. [10] addressed different challenges, i.e., file representation method, feature selection method and classification algorithm. The paper also mentions some additional issues: weighting algorithms (ensembles), the imbalance problem, active learning and chronological evaluation. The authors argue that a framework for detecting new malicious code in executable files can be designed to achieve very high accuracy while maintaining low false positives.

Antu Mary et al. [11] described a method for detecting identity deception by a single user on the basis of nonverbal behavior. Non-verbal behavior describes the activities of each user separately, such as the number of articles generated, the number of searches done for the same articles, the number of bytes added and removed, the number of times the same spelling mistakes are repeated and the time taken between revisions. Some Wikipedia users create multiple accounts and use them for various malicious purposes, such as creating fraudulent articles and damaging existing article text. Such deceptions cannot easily be detected by any authority. Numerous methods have been proposed that can help detect multiple accounts owned by the same person. Using the verbal and nonverbal behavior of a user, a sockpuppet can be detected easily within a limited amount of time.

Ashkan Sami et al. [12] provided a framework for analyzing and classifying PE files based on data mining techniques. Windows application programming interface (API) calls can be used to extract knowledge describing the behavior of executables, and each API call is used as a feature. Feature selection is based on the Fisher score; the top four categories by Fisher score are File Management, Process and Thread, Console and Registry. The dataset contains 34,820 Windows PE files, of which 31,869 were malicious and 2,951 were benign. RF, NB and DT techniques are used, and Random Forest gives good performance.

G. Ganesh Sundarkumar et al. [13] used text mining for feature selection and then Mutual Information to extract the most influential features. Data mining models such as decision tree, neural network, SVM, probabilistic neural network and group method of data handling (GMDH) are then applied. All five models (DT, SVM, PNN, NN and GMDH) are compared on the basis of accuracy, sensitivity and specificity. The dataset is then balanced using oversampling and the models are tested again; after balancing, sensitivity and accuracy improved.

Prasha Shrestha et al. [14] explained a malware family identification process using string information. Classifying malware into the correct family is an important task for antivirus vendors. Classification in this paper relies on term frequency-inverse document frequency (tf-idf) and on extraction of prominent strings. To check accuracy, n-way vendor agreement is compared with the accuracy achieved by the techniques used. The techniques are Exact match: Global vocabulary, Exact matches: Prominent strings, Prominent strings set and Absence of prominent string. The data come from a university's malware database (1504 malware files). These experiments show that malware family files can be detected easily, and Exact match: Global vocabulary gives the best result.

Michael Bailey et al. [15] explained that anti-virus systems are incomplete in that they fail to detect or provide labels for malware samples. When these systems do provide labels, the labels do not have consistent meaning across families and variants within a single naming convention or across multiple vendors. Finally, they demonstrated that these systems lack conciseness: they provide either too little or too much information about a specific piece of malware. The authors proposed a novel technique to overcome these problems, in which automated malware classification is based on behavioral fingerprints of the malware's activity. To compare and combine these fingerprints, a single-linkage hierarchical clustering approach is applied.

Gaston L’Huillier et al. [16] explained phishing email classification. Phishing email fraud is an attempt to gain personal or sensitive information such as usernames, passwords and credit card details. Algorithms such as support vector machines, naïve Bayes and random forest are used for classifying phishing emails, a task that is an extension of text mining. In this paper, the feature extraction methodology for phishing emails is enhanced with latent semantic analysis features and keyword extraction techniques. SVMs, the naïve Bayes model and the logistic regression method are used in the Weka tool to improve accuracy.

Rafiqul Islam et al. [17] classified malware using both static and dynamic features. Static techniques have some drawbacks for malware classification, so the paper also extracts dynamic features, which are very useful in the classification process. The static features require two kinds of information: function length frequency and printable string information. For dynamic features, API function names are used. SVM, DT, RF and Naive Bayes techniques are used in the WEKA tool with 10-fold cross
validation for classification. Random Forest gives the highest accuracy, with results reported as TP, FN and accuracy.

Ali Danesh et al. [18] proposed a classifier fusion method to improve text classification. The approach combines the Naive Bayes, K-NN and Rocchio methods using voting algorithms and achieves a better classification rate: the experimental results show that the classification error decreases by 15%. 2000 documents from 20 different newsgroups were used for the experiment.

Aytug Onan et al. [19] examined ensemble approaches such as AdaBoost, Bagging, Dagging, Random Subspace and majority voting, and conducted a two-way ANOVA test. The experimental analysis shows that the bagging ensemble of Random Forest with the most-frequent-based keyword extraction method yields promising results for text classification. The results also show that using a keyword-based representation of text documents in conjunction with ensemble learning can enhance the predictive performance and scalability of text classification schemes.

Baoxun Xu et al. [20] proposed an improved Random Forest classifier for text categorization. They considered a random forest with both feature weighting and tree selection (WTRF), Breiman's random forest (BRF) and a random forest with only tree selection (TRF). Comparisons are based on accuracy and F-measures. All three random forest methods are compared with other widely used text categorization methods, i.e., support vector machines (SVM), Naive Bayes (NB) and K-nearest neighbor (KNN).

M. Sivakumar et al. [21] proposed a hybrid text classification approach using KNN and SVM. The proposed SVM-KNN approach aims to reduce the impact of parameters on classification accuracy. The performance analysis shows that the accuracy of the SVM-KNN method remains optimal even for very large parameter values and is higher than that of the KNN method. Unlike the conventional KNN classification approach, the SVM-KNN approach is only weakly affected by the choice of parameters.

Sundus Hassan et al. [22] proposed a method for text categorization in which they compared Support Vector Machine (SVM) and Naive Bayes (NB) classifiers. The baseline for the experiment was set up by removing stopwords and stemming the dataset with the Porter stemmer. They used micro-averaged and macro-averaged F-measure. The experiments show an improvement in micro-averaged and macro-averaged F-measure for both methods, i.e., SVM and NB.
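
The voting-based combination that recurs in the reviewed work, whether across the per-comment predictions of one account as in [1] or across several base classifiers as in [18], reduces to picking the most frequent label. The following is a minimal Python sketch of that idea under stated assumptions: the function name `majority_vote`, the label strings and the tie-breaking rule are illustrative choices, not details taken from any of the reviewed papers.

```python
from collections import Counter


def majority_vote(predictions, tie_label="non-sockpuppet"):
    """Return the most frequent label among the given predictions.

    tie_label is a hypothetical tie-breaking choice; the reviewed
    papers do not specify one.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions).most_common()
    # A tie exists when the two most frequent labels have equal counts.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


# Per-comment predictions for one account, combined into one account-level label.
comment_predictions = ["sockpuppet", "non-sockpuppet", "sockpuppet"]
account_label = majority_vote(comment_predictions)
```

The same function could be applied to the outputs of several classifiers for a single document; weighted schemes would need per-voter weights, which this sketch deliberately leaves out.
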
Table 1: Summary of related work on sockpuppets detection, multiple identity deception detection and text categorization by performing binary classification

| Citation | Dataset used | Classifiers | Measures | Results |
|---|---|---|---|---|
| Thamar Solorio et al. [2013] | Data collected from Wikipedia | Support Vector Machine (SVM) | Precision, Recall, F-Measure and Accuracy | On the basis of authorship attributes, sockpuppet cases are detected. |
| Michail Tsikerdekis et al. [2014] | Logs of blocked users on Wikipedia during the period since February 2004 until October 2013 | SVM, Random Forest and Adaptive Boosting (ADA) | Precision, Recall, Accuracy, F-Measure, false positive rate and Matthews Correlation Coefficient (MCC) | Higher accuracy rate. |
| Thamar Solorio et al. [2014] | Wikipedia | Support Vector Machine in Weka tools | Precision, Recall, F-Measure and Accuracy | F-Measure gives the best result. |
| Xueling Zheng et al. [2011] | Uwants and HK discuss during the period of March 2010 to May 2010 | Detection score and keywords | | On the basis of similarity, sockpuppet pairs are detected. |
| Sadia Afroz et al. [2011] | (1) Extended Brennan-Greenstadt corpus (2) Hemingway-Faulkner Imitation corpus (3) Thomas-Amina Hoax corpus | SVM and DT | Precision, Recall and F-Measure | SVM classifier works best with the Writeprints features and DT performed well with the lying-detection features. |
| M BalaaNand et al. [2015] | Dataset used from Wikipedia | SVM, RF and ADA | Precision, Recall, F-Measure, Accuracy, MCC and false positive rate | Adaptive Boosting gives the best balance between recall and precision with high accuracy. |
| Sheetal Antony et al. [2016] | Not Used | Not Used | Not Used | On the basis of verbal and non-verbal behavioral patterns, identity deception is detected. |
| Zaher Yamak et al. [2016] | Dataset used from Wikipedia from Feb 2004 to April 2015 | SVM, RF, Naive Bayes, K nearest neighbor, Bayesian Network and Adaptive Boosting | TPR, FPR, F-Measure, Precision and MCC | Best accuracy given by Random Forest (99.8%) and Bayesian Network (99.6%). |
| Asaf Shabtai et al. [2009] | Not Used | ANN, DT, KNN, BN, SVM, OneR, Boosted Algorithm | TPR, FP and Accuracy | This paper includes aspects of different challenges for classifying new malicious code based on static features extracted. |
| Antu Mary et al. [2015] | Wikipedia | SVM | Recall, Precision and F-Measure | Identity deception detection is more accurate using non-verbal behavior in comparison to verbal behavior. |
| Ashkan Sami et al. [2010] | 34820 PE files where 31,869 were malicious and 2951 were benign Windows PE files | RF, NB and DT | Accuracy, Precision, Recall and FA rate (false alarm rate) | Random Forest gives good performance. |
| G. Ganesh Sundarkumar et al. [2013] | Dataset from CSMINING group is used | DT, SVM, PNN, NN and GMDH | Accuracy, Sensitivity and Specificity | On the basis of Accuracy, Sensitivity and Specificity all 5 models are compared. The dataset is then balanced using oversampling and the models are tested again. After balancing, sensitivity/accuracy improved. |
| Prasha Shrestha et al. [2014] | Data are used from University's malware database (1504 malware files) | (1) Exact match: Global vocabulary (2) Exact matches: Prominent strings (3) Prominent strings set (4) Absence of prominent string | Accuracy and correlation between n-way vendor agreement and accuracy | Exact match: Global vocabulary gives the best accuracy, around 91.02%. |
| Michael Bailey et al. [2007] | Data collected from data sources. There are 3 types of dataset used: legacy, small and large | Hierarchical clustering algorithm | Consistency, Completeness and Conciseness | Malware is easy to classify on the basis of behavioral fingerprints. |
| Gaston L’Huillier et al. [2013] | Not Used | SVMs, the naïve Bayes model and the logistic regression method were used in Weka tool | F-Measure | Accuracy improved. |
| Rafiqul Islam et al. [2013] | Data are used from antivirus vendors, time periods 2003-2007 and 2009-2010 | SVM, DT, RF and Naive Bayes in WEKA tool | TP, FN and Accuracy | Random Forest gives the highest accuracy. |
| Ali Danesh et al. [2007] | 100 articles are taken from 20 different newsgroups | NB, KNN, Rocchio, Voting, OWA and DT | Accuracy rate | Fusion of classifiers gives better results in comparison to base classifiers. |
| Aytug Onan et al. [2016] | Reuters-21578 dataset used | NB, SVM, LR, RF and ensemble methods of SVM and RF | F-Measure, Accuracy and AUC values | This paper presents an analysis of 5 statistical keyword extraction methods for text classification. |
| Baoxun Xu et al. [2012] | 20 different Usenet newsgroups containing 18772 documents divided into 20 different classes | SVM, KNN, NB, BRF, TRF and WTRF | Micro F-Measure and Macro F-Measure | The proposed WTRF method outperforms all other text categorization methods. |
| M. Sivakumar et al. [2014] | Reuters-21578 R8 dataset used | KNN and SVM | Accuracy | Proposed SVM-KNN method provides high accuracy. |
| Sundus Hassan et al. [2000] | Dataset from 20 Newsgroup with 1000 documents | NB and SVM | Macro F-Measure and Micro F-Measure | NB gives better performance over SVM. |
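
Most of the measures listed in Table 1 are derived from the four cells of a binary confusion matrix. The following is a minimal Python sketch of the standard definitions; the function name `binary_metrics`, the zero-denominator convention and the example counts are assumptions made for illustration and do not come from any of the reviewed papers.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary classification measures from confusion-matrix counts."""

    def safe_div(num, den):
        # Assumed convention: report 0.0 when a denominator is zero.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # also called sensitivity or TPR
    specificity = safe_div(tn, tn + fp)
    fpr = safe_div(fp, fp + tn)  # false positive rate
    accuracy = safe_div(tp + tn, tp + fp + fn + tn)
    f_measure = safe_div(2 * precision * recall, precision + recall)
    # Matthews Correlation Coefficient
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "fpr": fpr,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Made-up counts: 40 sockpuppets found, 10 false alarms, 5 missed, 45 correct rejections.
m = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

Unlike accuracy, MCC uses all four cells, which is one reason it appears alongside precision and recall in studies where the two classes are of unequal size.
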

3. CONCLUSION AND FUTURE WORK
This paper presented a taxonomy for binary classification. Sockpuppet detection is based on binary classification, in which two classes are predefined, i.e., sockpuppet or non-sockpuppet, and datasets are classified on the basis of these predefined classes. Multiple identity deception detection is also based on binary classification, in which the given datasets are classified into two groups, i.e., sockpuppet and non-sockpuppet. Text categorization is also done using binary classification. Binary classification therefore plays an important role in the machine learning process. To get better results, different feature sets should be analyzed for binary classification. With different feature sets, better results can be observed in terms of precision, recall, F-Measure and accuracy. Different datasets can be used for experiments with different text features. These feature sets can also be used for multilevel classification and multiclass classification.

4. REFERENCES
[1] Thamar Solorio, Ragib Hasan and Mainul Mizan, "A Case Study of Sockpuppet Detection in Wikipedia", Proceedings of the Workshop on Language in Social Media (LASM 2013), pages 59-68, Atlanta, Georgia, June 13 2013. ©2013 Association for Computational Linguistics.

[2] Michail Tsikerdekis and Sherali Zeadally, "Multiple Account Identity Deception Detection in Social Media Using Non Verbal Behavior", IEEE Transactions on Information Forensics and Security, Vol 9, No 8, August 2014.

[3] Thamar Solorio, Ragib Hasan and Mainul Mizan, "Sockpuppet Detection in Wikipedia: A Corpus of Real-World Deceptive Writing For Linking Writing", arXiv:1310.6772v1 [cs.CL], 24 Oct 2013.

[4] Xueling Zheng, Yiu Ming Lai, K.P. Chow, Lucas C.K. Hui and S.M. Yiu, "Detection of Sockpuppets in Online Discussion Forums", HKU CS Tech Report TR-2011-03.

[5] Sadia Afroz, Michael Brennan and Rachel Greenstadt, "Detecting Hoaxes Frauds and Deception in Writing Style Online", 2011.

[6] Dhanyasree P, Sajitha Krishnan and Ambikadevi Amma T, "Deception Detection in Social Media through Combined Verbal and Non-Verbal Behavior", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 5, Issue 4, 2015.

[7] M Balaanand, R Soumipriya, S Sivaranjani and S Sankari, "Identifying Fake Users in Social Networks Using Non-Verbal Behaviour", International Journal of Technology and Engineering System (IJTES), Vol 7, No. 2, 2015, pp. 157-161. ©gopalax Journals, Singapore.

[8] Sheetal Antony and Prof. B. S. Umashankar, "Identity Deception Detection and Security in Social Medium", IJCSMC, Vol. 5, Issue 4, April 2016, pp. 499-502.

[9] Zaher Yamak, Julien Saunier and Laurent Vercouter, "Detection of Multiple Identity Manipulation in Collaborative Projects", IW3C2, WWW'16 Companion, April 11-15, 2016, Montreal, Quebec, Canada. ACM 978-1-4503-4144-8/04.

[10] Asaf Shabtai, Robert Moskovitch, Yuval Elovici and Chanan Glezer, "Detection of malicious code by applying machine learning classifiers on static features: A state-of-the-art survey", Information Security Technical Report 14 (2009) 16-29, Elsevier.

[11] Antu Mary Kuruvilla and Saira Varghese, "A Survey on detecting Identity Deception in Social Media Applications", International Journal of Modern Trends in Engineering and Research (IJMTER), Volume 02, Issue 04, April 2015. ISSN (Online): 2349–9745; ISSN (Print): 2393-8161.

[12] Ashkan Sami, B. Yadegari, N. Peiravian, S. Hashemi and A. Hamze, "Malware detection based on mining API calls", SAC '10: Proceedings of the ACM Symposium on Applied Computing, pp. 1020-1025, 2010.

[13] G. Ganesh Sundarkumar and Vadlamani Ravi, "Malware Detection by Text and Data Mining", IEEE, 2013.

[14] Prasha Shrestha, Suraj Maharajan, Gabriela Ramirez de la Rosa, Alan Sprague, Thamar Solorio and Gracy Warner, "Using String Information for Malware Family Identification", in A.L.C. Bazzan and K. Pichara (Eds.): IBERAMIA 2014, LNAI 8864, pp. 686-697, 2014. ©Springer International Publishing Switzerland 2014. DOI: 10.1007/978-3-319-12027-0_55

[15] Michael Bailey, Jon Oberheide, Z. Morley Mao, Farnam Jahanian and Jose Nazario, "Automated Classification and Analysis of Internet Malware", April 26 2007.

[16] Gaston L’Huillier, Alejandro Hevia, Richard Weber and Sebastian Rios, "Latent Semantic Analysis and Keyword Extraction for Phishing Classification", IEEE, 2010.

[17] Rafiqul Islam, Ronghua Tian, Lynn M. Batten and Steve Versteeg, "Classification of malware based on integrated static and dynamic features", Journal of Network and Computer Applications 36 (2013) 646–656, Elsevier.

[18] Ali Danesh, Behzad Moshiri and Omid Fatemi, "Improve Text Classification Accuracy based on Classifier Fusion Methods", IEEE Xplore, 2007.

[19] Aytuğ Onan, Serdar Korukoğlu and Hasan Bulut, "Ensemble of keyword extraction methods and classifiers in text classification", Expert Systems With Applications 57 (2016) 232–247.

[20] Baoxun Xu, Xiufeng Guo, Yumming Ye and Jiefeng Cheng, "An Improved Random Forest Classifier for Text Categorization", Journal of Computers, Vol. 7, No. 12, December 2012.

[21] M. Sivakumar, C. Karthika and P. Renuga, "A Hybrid Text Classification Approach using KNN and SVM", IJIRSET, Volume 3, Special Issue 3, March 2014.

[22] Sundus Hassan, Muhammad Rafi and Muhammad Shahid Shaikh, "Comparing SVM and Naive Bayes Classifiers for Text categorization with Wikitology as knowledge enrichment", IEEE Xplore, 2012.

IJCATM : www.ijcaonline.org