International Journal of Computer Applications (0975 – 8887)

Volume 160 – No 7, February 2017

Machine Learning: A Review on Binary Classification

Roshan Kumari
M.Tech. Scholar
Department of Computer Science & Engineering
ABES Engineering College, Ghaziabad

Saurabh Kr. Srivastava
Sr. Asst. Professor
Department of Computer Science & Engineering
ABES Engineering College, Ghaziabad

ABSTRACT
In the field of information extraction and retrieval, binary classification is the process of assigning a given document or account to one of two predefined classes. Sockpuppet detection is a binary classification task in which each account is labeled either sockpuppet or non-sockpuppet. Sockpuppets have become a significant issue, since a user can adopt a fake identity for some specific purpose or for malicious use. Text categorization is also performed with binary classification. This paper surveys binary classification and discusses various approaches to it.

Keywords
Sockpuppets, non-sockpuppets, multiple identity deception, text categorization, NB, SVM, Random Forest, ensemble methods, binary classification

1. INTRODUCTION
Sockpuppets are fake IDs or accounts created for some specific malicious use. Sockpuppet detection is based on binary classification: each account is assigned one of two classes, sockpuppet or non-sockpuppet. Anyone can create a new account by supplying very little information, so a method is needed to find these sockpuppet or otherwise suspicious cases, because they violate privacy. Wikipedia does not provide any specific facility to detect such malicious accounts, so detection is currently done manually, which is time-consuming and costly. Identifying sockpuppet accounts in Wikipedia is therefore a significant issue. Multiple identity deception is another example of binary classification, in which one person holds more than one account for malicious use. With multiple accounts, a user can try to alter a sender's content for some specific purpose, so multiple identity deception is also a major issue on social media. Text categorization is likewise done by performing binary classification, and it plays an important role in information retrieval for classifying documents.

Sockpuppet detection has become a significant problem in the social media environment. Thamar Solorio et al. [1] contributed a small case study of automated sockpuppet detection based on authorship attribution, where the task consists of analyzing a written document to predict its true author. Some authorship attribution features are collected and examined, and each user comment is treated as a "document". Classification takes two steps: first, predictions are obtained from the classifier for each comment; second, the per-comment predictions are combined in a majority voting scheme to assign a final decision to each account. Michail Tsikerdekis et al. [2] proposed a novel approach that uses nonverbal behavior to detect multiple account identity deception on social media. The creation of new accounts by blocked users is called sockpuppetry. In social media, identity deception is a major issue. Nonverbal communication (user activity or movement) is more powerful than verbal communication (speech or text). Identity deception focuses on manipulating the sender's information and is divided into three categories: identity concealment, identity theft and identity forgery. The major issue with identity deception in social media is the presence of multiple identities held by one user. The authors used logs of blocked users on Wikipedia from February 2004 to October 2013 as the dataset and applied SVM, Random Forest and Adaptive Boosting (ADA) to classify accounts as sockpuppets or not. They found that Adaptive Boosting provides the best balance between recall and precision and achieved the highest accuracy among the classification techniques used.

2. REVIEWED PAPERS IN THIS DIRECTION
Thamar Solorio et al. [3] described a corpus of sockpuppet cases from Wikipedia. The corpus provides a real-world dataset of short messages from malicious users. Sockpuppet investigations in Wikipedia (SPI) are handled using a support vector machine. The authors try to resolve SPI cases, that is, to decide on the basis of binary classification whether the editors mentioned belong to the same person or not. The results are based on observations from the comments made by each user. The complete list of features used can be found at the following link: http://docsig.cis.uab.edu/media/2014/03/list-of-features.pdf.

Xueling Zheng et al. [4] proposed algorithms for detecting sockpuppet pairs within one forum and across two different forums. Two different online accounts that belong to the same person are referred to as a sockpuppet pair. The paper proposes two methods: the first detects sockpuppet pairs in the same discussion forum, while the second detects sockpuppet pairs that appear in two different forums. The authors used a dataset from Uwants and HK discuss covering March 2010 to May 2010. On the basis of a detection score, they try to find similar keywords used by different people.

Sadia Afroz et al. [5] proposed a method to detect stylistic deception in written documents. The paper states that, with a large feature set, it is possible to distinguish regular documents from deceptive ones. To detect adversarial writing, it is necessary to identify a set of discriminating features that distinguish deceptive writing from regular writing. Once these features are determined, supervised learning techniques are used to classify new writing samples. Three feature sets are used: the Writeprints feature set, the lying-detection feature set and the 9-feature set (authorship attribution features). SVM and DT techniques are used for analysis. The SVM classifier works best with the Writeprints features and DT performed well with the Lying
detection features.

Dhanyasree P et al. [6] worked on detecting identity deception on social networking sites, where one person creates multiple accounts for malicious use; this has become a major issue. The authors try to detect such accounts on the basis of verbal and nonverbal behavior. Their approach comprises calculation of nonverbal variables, model testing using the Random Forest method, and identification of the time window using PSO. They found that detecting multiple accounts through nonverbal behavior is more accurate, and that the automated system for detecting multiple accounts performs well. Verbal and nonverbal behavior can also be combined for sockpuppet detection, in which binary classification separates sockpuppet from non-sockpuppet cases.

M BalaaNand et al. [7] proposed a method to detect multiple accounts and fake identities on social media such as Wikipedia using nonverbal behavior (user activity and user movement). The authors focus on time-independent nonverbal behavior. They used data from Wikipedia and the SVM, RF and ADA techniques for binary classification, and found that Adaptive Boosting gives the best balance between recall and precision with high accuracy.

Sheetal Antony et al. [8] proposed a system that uses verbal and nonverbal behavioral patterns to detect identity deception. An admin manages each user account. The details and activities of the user are analyzed to detect whether there is some deception, and the details are verified against a database. If deception is detected, the user is asked a set of security questions.

Zaher Yamak et al. [9] proposed a detection method with the following steps: first, data are crawled from Wikipedia; then sockpuppet accounts are identified; next, a set of nonverbal behavior features is created and the values of the proposed features are calculated; finally, machine learning algorithms are used for classification. SVM, RF, Naive Bayes, k-nearest neighbor, Bayesian Network and Adaptive Boosting are compared. The best accuracy for sockpuppet detection was given by Random Forest (99.8%) and Bayesian Network (99.6%).

Malware detection is important for protecting computer systems and communication infrastructure. Anti-virus technology is a key player in tackling malware files and is based on two methods: signature-based and heuristic-based detection. Asaf Shabtai et al. [10] addressed several challenges, namely the file representation method, the feature selection method and the classification algorithm. The paper also mentions additional issues such as weighting algorithms (ensembles), the imbalance problem, active learning and chronological evaluation. The authors argue that a framework for detecting new malicious code in executable files can be designed to achieve very high accuracy while maintaining low false positives.

Antu Mary et al. [11] discussed a method for detecting identity deception by a single user based on nonverbal behavior. Nonverbal behavior describes the activities carried out by each user separately. Some Wikipedia users create multiple accounts and use them for various malicious purposes, such as creating fraudulent articles and damaging existing article text. Relevant nonverbal measures include the number of articles generated, the number of searches done for the same articles, the number of bytes added and removed, the number of times the same spelling mistakes are repeated, and the time taken between revisions. Such deception cannot easily be detected by any authority. Numerous methods have been proposed that can help detect multiple accounts owned by the same person. Using the verbal and nonverbal behavior of a user, a sockpuppet can be detected easily in a limited amount of time.

Ashkan Sami et al. [12] provided a framework for analyzing and classifying PE files based on data mining techniques. Windows application programming interface (API) calls can be used to extract knowledge describing the behavior of executables, and each API call is used as a feature. A Fisher score based feature selection process is used; the top four categories by Fisher score are File Management, Process and Thread, Console, and Registry. The dataset contains 34,820 PE files, of which 31,869 were malicious and 2,951 were benign Windows PE files. RF, NB and DT techniques are used, and Random Forest gives good performance.

G. Ganesh Sundarkumar et al. [13] used text mining for feature selection and then used mutual information to extract the most influential features. Data mining models, namely decision tree (DT), neural network (NN), SVM, probabilistic neural network (PNN) and group method of data handling (GMDH), are then applied. All five models are compared on the basis of accuracy, sensitivity and specificity. The dataset is then balanced using oversampling and the models are tested again; after balancing, sensitivity and accuracy improved.

Prasha Shrestha et al. [14] explained a malware family identification process using string information. Classifying malware into the correct family is an important task for antivirus vendors. In this paper, classification is done using term frequency-inverse document frequency (tf-idf) and prominent string extraction. To check accuracy, n-way vendor agreement is compared with the accuracy achieved by the algorithms used. The techniques used are Exact match: Global vocabulary; Exact match: Prominent strings; Prominent strings set; and Absence of prominent string. The data come from a university's malware database (1504 malware files). These experiments show that malware family files can be detected easily; Exact match: Global vocabulary gives the best result.

Michael Bailey et al. [15] explained that anti-virus is incomplete in that it fails to detect or provide labels for malware samples. The authors explained that when these systems do provide labels, the labels do not have consistent meaning across families and variants within a single naming convention, or across multiple vendors. Finally, they demonstrated that these systems lack conciseness, in that they provide either too little or too much information about a specific piece of malware. The authors proposed a novel technique to overcome these problems: automated malware classification based on behavioral fingerprints of malware activity. To compare and combine these fingerprints, a single-linkage hierarchical clustering approach is applied.

Gaston L’Huillier et al. [16] explained phishing email classification. Phishing email fraud is an attempt to gain personal or sensitive information such as usernames, passwords and credit card details. Algorithms such as support vector machines, naïve Bayes and random forest are used for classifying phishing emails, which is an extension of text mining. In this paper, the feature extraction methodology for phishing emails is enhanced by using latent semantic analysis features and keyword extraction techniques. SVMs, the naïve Bayes model and the logistic regression method are used in the Weka tool to improve accuracy.

Rafiqul Islam et al. [17] classified malware using static and dynamic features. Static techniques for malware classification have some drawbacks, so the paper also focuses on dynamic features, which are very useful in the classification process. Two kinds of static information are needed: function length frequency and printable string information. For dynamic features, API function names are used. SVM, DT, RF and Naive Bayes techniques are used in the WEKA tool with 10-fold cross
validation for classification. Random Forest gives the highest accuracy, with TP, FN and accuracy as the evaluation parameters.

Ali Danesh et al. [18] proposed a classifier fusion method to improve text classification. The proposed approach combines the Naive Bayes, k-NN and Rocchio methods using voting algorithms and achieves a better classification rate; experimental results show that the classification error decreases by 15%. 2000 documents from 20 different newsgroups were used for the experiment.

Aytug Onan et al. [19] studied ensemble approaches such as AdaBoost, Bagging, Dagging, Random Subspace and majority voting. A two-way ANOVA test was conducted. The experimental analysis shows that the bagging ensemble of Random Forest with the most-frequent based keyword extraction method yields promising results for text classification. The results also show that using a keyword-based representation of text documents in conjunction with ensemble learning can enhance the predictive performance and scalability of text classification schemes.

Baoxun Xu et al. [20] proposed an improved random forest classifier for text categorization. They consider the random forest with both feature weighting and tree selection (WTRF), Breiman's random forest (BRF) and the random forest with only tree selection (TRF). Comparisons are based on accuracy and F-measures. These three random forest methods are compared with other widely used text categorization methods, namely support vector machines (SVM), Naive Bayes (NB) and k-nearest neighbor (KNN).

M. Sivakumar et al. [21] proposed a hybrid text classification approach using KNN and SVM. The proposed SVM-KNN approach aims to reduce the impact of parameters on classification accuracy. The performance analysis shows that the accuracy of the SVM-KNN method remains optimal even for large parameter values, and that its accuracy is higher than that of the KNN method. Unlike the conventional KNN classification approach, the SVM-KNN approach is only weakly affected by the choice of parameters.

Sundus Hassan et al. [22] proposed a method for text categorization in which they compared Support Vector Machine (SVM) and Naive Bayes (NB) classifiers. The baseline for the experiment was set up by removing stopwords and stemming the dataset with the Porter stemmer. They used micro-averaged and macro-averaged F-measure. Experiments show an improvement in micro-averaged and macro-averaged F-measure for both methods, SVM and NB.

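Several of the reviewed approaches rely on voting: per-comment predictions are combined by majority vote to label an account [1], and base classifiers are fused by voting for text classification [18]. The following is a minimal sketch of plain majority (hard) voting only, not the implementation used in any of the reviewed papers. The function names, the "sci"/"rec" labels and the three prediction lists are hypothetical, and the tie-breaking rule (first label seen wins) is an assumption.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def fuse_predictions(per_classifier_predictions):
    """Combine the outputs of several classifiers, one document at a time."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Hypothetical outputs of three base classifiers on five documents.
nb_pred = ["sci", "rec", "sci", "rec", "rec"]
knn_pred = ["sci", "sci", "rec", "rec", "rec"]
rocchio_pred = ["rec", "sci", "sci", "rec", "sci"]

fused = fuse_predictions([nb_pred, knn_pred, rocchio_pred])
print(fused)  # ['sci', 'sci', 'sci', 'rec', 'rec']
```

The same `majority_vote` helper could be applied to the list of per-comment predictions for one account to obtain an account-level label. Weighted schemes, such as the OWA operator listed for [18] in Table 1, are not covered by this sketch.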
Table 1: Summary of related work on sockpuppets detection, multiple identity deception detection and text categorization by performing binary classification

| Citation | Dataset used | Classifiers | Measures | Results |
|---|---|---|---|---|
| Thamar Solorio et al. [2013] | Data collected from Wikipedia | Support Vector Machine (SVM) | Precision, Recall, F-Measure and Accuracy | On the basis of authorship attributes sockpuppet cases are detected. |
| Michail Tsikerdekis et al. [2014] | Logs of blocked users on Wikipedia during the period since February 2004 until October 2013 | SVM, Random Forest and Adaptive Boosting (ADA) | Precision, Recall, Accuracy, F-Measure, false positive rate and Matthews Correlation Coefficient (MCC) | Higher accuracy rate. |
| Thamar Solorio et al. [2014] | Wikipedia | Support Vector Machine in Weka tools | Precision, Recall, F-Measure and Accuracy | F-Measure gives the best result. |
| Xueling Zheng et al. [2011] | Uwants and HK discuss during the period of March 2010 to May 2010 | Detection score | On the basis of similarity and keywords | Sockpuppet pairs are detected. |
| Sadia Afroz et al. [2011] | (1) Extended Brennan-Greenstadt corpus (2) Hemingway-Faulkner Imitation corpus (3) Thomas-Amina Hoax corpus | SVM and DT | Precision, Recall and F-Measure | SVM classifier works best with the Writeprints features and DT performed well with the Lying detection features. |
| M BalaaNand et al. [2015] | Dataset used from Wikipedia | SVM, RF and ADA | Precision, Recall, F-Measure, Accuracy, MCC and False Positive rate | Adaptive Boosting gives the best balance between recall and precision with high accuracy. |
| Sheetal Antony et al. [2016] | Not Used | Not Used | Not Used | On the basis of verbal and non verbal behavioral patterns, detected identity deception. |
| Zaher Yamak et al. [2016] | Dataset used from Wikipedia from Feb 2004 to April 2015 | SVM, RF, Naive Bayes, K nearest neighbor, Bayesian Network and Adaptive Boosting | TPR, FPR, F-Measure, Precision and MCC | Best accuracy given by Random Forest (99.8%) and Bayesian Network (99.6%). |
| Asaf Shabtai et al. [2009] | Not Used | ANN, DT, KNN, BN, SVM, OneR, Boosted Algorithm | TPR, FP and Accuracy | This paper includes aspects of different challenges for classifying new malicious code based on static features extracted. |
| Antu Mary et al. [2015] | Wikipedia | SVM | Recall, Precision and F-Measure | Identity deception detection is more accurate using non verbal behavior in comparison to verbal behavior. |
| Ashkan Sami et al. [2010] | 34820 PE files where 31,869 were malicious and 2951 were benign Windows PE files | RF, NB and DT | Accuracy, Precision, Recall and FArate (False alarm rate) | Random Forest gives good performance. |
| G. Ganesh Sundarkumar et al. [2013] | Dataset from CSMINING group is used | DT, SVM, PNN, NN and GMDH | Accuracy, Sensitivity and Specificity | On the basis of Accuracy, Sensitivity and Specificity all 5 models are compared. Then again the dataset are balanced using Oversampling and again testing the model. After balancing sensitivity/accuracy improved. |
| Prasha Shrestha et al. [2014] | Data are used from University's malware database (1504 malware files) | (1) Exact match: Global vocabulary (2) Exact matches (3) Prominent strings set (4) Absence of prominent string | Accuracy and correlation between n-way vendor agreement with accuracy | Exact match: Global vocabulary gives the best accuracy, around 91.02%. |
| Michael Bailey et al. [2007] | Data collected from data sources. There are 3 types of dataset used: Legacy, small and large. | Hierarchical clustering algorithm | Consistency, Completeness and Conciseness | It's easy to classification of malware on the basis of behavior of fingerprints. |
| Gaston L’Huillier et al. [2013] | Not Used | SVMs, the naïve Bayes model and the logistic regression method were used in Weka tool | Accuracy | F-Measure improved |
| Rafiqul Islam et al. [2013] | Data are used from antivirus vendors, time periods 2003-2007 and 2009-2010 | SVM, DT, RF and Naive Bayes in WEKA tool | TP, FN and Accuracy | Random Forest gives the highest accuracy. |
| Ali Danesh et al. [2007] | 100 articles are taken from 20 different newsgroup | NB, KNN, Rocchio, Voting, OWA and DT | Accuracy Rate | Fusion of classifiers gives better results in comparison to base classifiers. |
| Aytug Onan et al. [2016] | Reuters-21578 dataset used | NB, SVM, LR, RF and ensemble methods of SVM and RF | F-Measure, Accuracy and AUC Values | This paper represents analysis of 5 statistical keywords methods for text classification. |
| Baoxun Xu et al. [2012] | 20 different Usenet newsgroups and contains 18772 documents divided into 20 different classes | SVM, KNN, NB, BRF, TRF and WTRF | Micro F-Measures and Macro F-Measures | It has been observed that proposed method WTRF method outperforms among all other text categorization methods |
| M. Sivakumar et al. [2014] | Reuters-21578 R8 dataset used | KNN and SVM | Accuracy | Proposed SVM-KNN method provides high accuracy. |
| Sundus Hassan et al. [2000] | Dataset from 20 Newsgroup with 1000 documents | NB and SVM | Macro F-Measures and Micro F-Measures | NB gives better performance over SVM. |
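Most of the measures listed in Table 1 (precision, recall, F-measure, accuracy, false positive rate and MCC) are functions of the four confusion counts of a binary classifier. The sketch below uses the standard textbook definitions. The label strings, the ten example predictions and the convention of returning 0.0 when a denominator is zero are illustrative assumptions and are not taken from any reviewed paper.

```python
import math


def confusion_counts(y_true, y_pred, positive="sockpuppet"):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero (assumed convention)."""
    return num / den if den else 0.0


def binary_measures(tp, fp, tn, fn):
    """Compute common binary classification measures from confusion counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # also called sensitivity or TPR
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "fpr": safe_div(fp, fp + tn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical labels for ten accounts: 4 sockpuppets, 6 non-sockpuppets.
y_true = ["sockpuppet"] * 4 + ["non-sockpuppet"] * 6
y_pred = (["sockpuppet"] * 3 + ["non-sockpuppet"]
          + ["sockpuppet"] + ["non-sockpuppet"] * 5)

measures = binary_measures(*confusion_counts(y_true, y_pred))
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

For these ten accounts the counts are TP = 3, FP = 1, TN = 5 and FN = 1, so precision, recall and F-measure are all 0.75, accuracy is 0.8, and MCC is 14/24, about 0.583.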


3. CONCLUSION AND FUTURE WORK
This paper presented a taxonomy of binary classification. Sockpuppet detection is based on binary classification, with two predefined classes: sockpuppet and non-sockpuppet. Datasets are classified on the basis of these predefined classes. Multiple identity deception detection is also based on binary classification, in which the given data are separated into two groups, sockpuppet and non-sockpuppet. Text categorization is also performed using binary classification. Binary classification therefore plays an important role in machine learning. To obtain better results, different feature sets should be analyzed for binary classification; with different feature sets, better results can be observed in terms of precision, recall, F-measure and accuracy. Different datasets can be used for experiments with different text features. These feature sets can also be used for multilevel and multiclass classification.

4. REFERENCES
[1] Thamar Solorio, Ragib Hasan and Mainul Mizan, "A Case Study of Sockpuppet Detection in Wikipedia", Proceedings of the Workshop on Language in Social Media (LASM 2013), pages 59-68, Atlanta, Georgia, June 13, 2013. Association for Computational Linguistics.

[2] Michail Tsikerdekis and Sherali Zeadally, "Multiple Account Identity Deception Detection in Social Media Using Non Verbal Behavior", IEEE Transactions on Information Forensics and Security, Vol 9, No 8, August 2014.

[3] Thamar Solorio, Ragib Hasan and Mainul Mizan, "Sockpuppet Detection in Wikipedia: A Corpus of Real-World Deceptive Writing For Linking Writing", arXiv:1310.6772v1 [cs.CL], 24 Oct 2013.

[4] Xueling Zheng, Yiu Ming Lai, K.P. Chow, Lucas C.K. Hui and S.M. Yiu, "Detection of Sockpuppets in Online Discussion Forums", HKU CS Tech Report TR-2011-03.

[5] Sadia Afroz, Michael Brennan and Rachel Greenstadt, "Detecting Hoaxes Frauds and Deception in Writing Style Online", 2011.

[6] Dhanyasree P, Sajitha Krishnan and Ambikadevi Amma T, "Deception Detection in Social Media through Combined Verbal and Non-Verbal Behavior", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 5, Issue 4, 2015.

[7] M Balaanand, R Soumipriya, S Sivaranjani and S Sankari, "Identifying Fake Users in Social Networks Using Non-Verbal Behaviour", International Journal of Technology and Engineering System (IJTES), Vol 7, No. 2, 2015, pp. 157-161, gopalax Journals, Singapore.

[8] Sheetal Antony and Prof. B. S. Umashankar, "Identity Deception Detection and Security in Social Medium", IJCSMC, Vol. 5, Issue 4, April 2016, pg. 499-502.

[9] Zaher Yamak, Julien Saunier and Laurent Vercouter, "Detection of Multiple Identity Manipulation in Collaborative Projects", IW3C2, WWW'16 Companion, April 11-15, 2016, Montreal, Quebec, Canada. ACM 978-1-4503-4144-8/04.

[10] Asaf Shabtai, Robert Moskovitch, Yuval Elovici and Chanan Glezer, "Detection of malicious code by applying machine learning classifiers on static features: A state-of-the-art survey", Information Security Technical Report 14 (2009) 16-29, Elsevier.

[11] Antu Mary Kuruvilla and Saira Varghese, "A Survey on detecting Identity Deception in Social Media Applications", International Journal of Modern Trends in Engineering and Research (IJMTER), Volume 02, Issue 04, April 2015, ISSN (Online): 2349-9745; ISSN (Print): 2393-8161.

[12] Ashkan Sami, B. Yadegari, N. Peiravian, S. Hashemi and A. Hamze, "Malware detection based on mining API calls", SAC '10: Proceedings of the ACM Symposium on Applied Computing, pp. 1020-1025, 2010.

[13] G. Ganesh Sundarkumar and Vadlamani Ravi, "Malware Detection by Text and Data Mining", IEEE, 2013.

[14] Prasha Shrestha, Suraj Maharajan, Gabriela Ramirez de la Rosa, Alan Sprague, Thamar Solorio and Gracy Warner, "Using String Information for Malware Family Identification", in A.L.C. Bazzan and K. Pichara (Eds.): IBERAMIA 2014, LNAI 8864, pp. 686-697, Springer International Publishing Switzerland, 2014. DOI: 10.1007/978-3-319-12027-0_55

[15] Michael Bailey, Jon Oberheide, Z. Morley Mao, Farnam Jahanian and Jose Nazario, "Automated Classification and Analysis of Internet Malware", April 26, 2007.

[16] Gaston L’Huillier, Alejandro Hevia, Richard Weber and Sebastian Rios, "Latent Semantic Analysis and Keyword Extraction for Phishing Classification", IEEE, 2010.

[17] Rafiqul Islam, Ronghua Tian, Lynn M. Batten and Steve Versteeg, "Classification of malware based on integrated static and dynamic features", Journal of Network and Computer Applications 36 (2013) 646–656, Elsevier.

[18] Ali Danesh, Behzad Moshiri and Omid Fatemi, "Improve Text Classification Accuracy based on Classifier Fusion Methods", IEEE Xplore, 2007.

[19] Aytuğ Onan, Serdar Korukoğlu and Hasan Bulut, "Ensemble of keyword extraction methods and classifiers in text classification", Expert Systems With Applications 57 (2016) 232–247.

[20] Baoxun Xu, Xiufeng Guo, Yumming Ye and Jiefeng Cheng, "An Improved Random Forest Classifier for Text Categorization", Journal of Computers, Vol. 7, No. 12, December 2012.

[21] M. Sivakumar, C. Karthika and P. Renuga, "A Hybrid Text Classification Approach using KNN and SVM", IJIRSET, Volume 3, Special Issue 3, March 2014.

[22] Sundus Hassan, Muhammad Rafi and Muhammad Shahid Shaikh, "Comparing SVM and Naive Classifiers for Text categorization with Wikitology as knowledge enrichment", IEEE Xplore, 2012.

IJCATM : www.ijcaonline.org

Megan Squire
No ratings yet
Essays In Personalizable Software
From Everand
Essays In Personalizable Software
Gerry Stahl
No ratings yet
Essays in Personalizable Software: Gerry Stahl's eLibrary, #8
From Everand
Essays in Personalizable Software: Gerry Stahl's eLibrary, #8
Gerry Stahl
No ratings yet
Metasploit Techniques and Workflows: Definitive Reference for Developers and Engineers
From Everand
Metasploit Techniques and Workflows: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Mastering Social Media Mining with Python
From Everand
Mastering Social Media Mining with Python
Marco Bonzanini
5/5 (1)
Prompt Perfect
From Everand
Prompt Perfect
Muni
No ratings yet
Prompt to Profit: AI Patterns That Give Solo Builders an Unfair Advantage
From Everand
Prompt to Profit: AI Patterns That Give Solo Builders an Unfair Advantage
Lucas Merritt
No ratings yet
Generative AI – An Overview: Software, #1
From Everand
Generative AI – An Overview: Software, #1
Editor IJSMI
No ratings yet
Statistics with Rust: 50+ Statistical Techniques Put into Action
From Everand
Statistics with Rust: 50+ Statistical Techniques Put into Action
Keiko Nakamura
No ratings yet
Mastering OpenCV Android Application Programming
From Everand
Mastering OpenCV Android Application Programming
Salil Kapur
No ratings yet
Trust between Cooperating Technical Systems: With an Application on Cognitive Vehicles
From Everand
Trust between Cooperating Technical Systems: With an Application on Cognitive Vehicles
Walter Bamberger
No ratings yet
Machine Learning Infrastructure and Best Practices for Software Engineers: Take your machine learning software from a prototype to a fully fledged software system
From Everand
Machine Learning Infrastructure and Best Practices for Software Engineers: Take your machine learning software from a prototype to a fully fledged software system
Miroslaw Staron
No ratings yet
MATLAB for Machine Learning: Unlock the power of deep learning for swift and enhanced results
From Everand
MATLAB for Machine Learning: Unlock the power of deep learning for swift and enhanced results
Giuseppe Ciaburro
No ratings yet
Hands-on Cloud Analytics with Microsoft Azure Stack
From Everand
Hands-on Cloud Analytics with Microsoft Azure Stack
Prashila Naik
No ratings yet
Practical Windows Forensics
From Everand
Practical Windows Forensics
Ayman Shaaban
4/5 (1)
Swift 3 Object-Oriented Programming - Second Edition
From Everand
Swift 3 Object-Oriented Programming - Second Edition
Gastón C. Hillar
No ratings yet
Artificial Inteligence: 1
From Everand
Artificial Inteligence: 1
OLUWASEUN ADENEYE
No ratings yet
Python for Developers: Learn to Develop Efficient Programs using Python
From Everand
Python for Developers: Learn to Develop Efficient Programs using Python
Mohit Raj
No ratings yet
Sussman Anomaly: Fundamentals and Applications
From Everand
Sussman Anomaly: Fundamentals and Applications
Fouad Sabry
No ratings yet
Data Science Fundamentals and Practical Approaches: Understand Why Data Science Is the Next (English Edition)
From Everand
Data Science Fundamentals and Practical Approaches: Understand Why Data Science Is the Next (English Edition)
Dr. Gypsy Nandi
No ratings yet
Artificial Intelligence Systems Integration: Fundamentals and Applications
From Everand
Artificial Intelligence Systems Integration: Fundamentals and Applications
Fouad Sabry
No ratings yet
Learn Penetration Testing with Python 3.x: Perform Offensive Pentesting and Prepare Red Teaming to Prevent Network Attacks and Web Vulnerabilities (English Edition)
From Everand
Learn Penetration Testing with Python 3.x: Perform Offensive Pentesting and Prepare Red Teaming to Prevent Network Attacks and Web Vulnerabilities (English Edition)
Yehia Elghaly
5/5 (1)
Mobile Agents in Networking and Distributed Computing
From Everand
Mobile Agents in Networking and Distributed Computing
Jiannong Cao
No ratings yet
Artificial Intelligence 2024 Book 2 of 2: AI, #2
From Everand
Artificial Intelligence 2024 Book 2 of 2: AI, #2
Yang Yen Thaw
No ratings yet
Applied Deep Learning: Design and implement your own Neural Networks to solve real-world problems (English Edition)
From Everand
Applied Deep Learning: Design and implement your own Neural Networks to solve real-world problems (English Edition)
Dr. Rajkumar Tekchandani
No ratings yet
Deep Learning for Beginners: A Comprehensive Introduction of Deep Learning Fundamentals for Beginners to Understanding Frameworks, Neural Networks, Large Datasets, and Creative Applications with Ease
From Everand
Deep Learning for Beginners: A Comprehensive Introduction of Deep Learning Fundamentals for Beginners to Understanding Frameworks, Neural Networks, Large Datasets, and Creative Applications with Ease
Steven Cooper
5/5 (1)
The Art of AI Scrum Master & Work
From Everand
The Art of AI Scrum Master & Work
Tom Henricksen
No ratings yet
Kali Linux, Ethical Hacking And Pen Testing For Beginners
From Everand
Kali Linux, Ethical Hacking And Pen Testing For Beginners
BHARAT NISHAD
No ratings yet
The Today and Future of WSN, AI, and IoT: A Compass and Torchbearer for the Technocrats
From Everand
The Today and Future of WSN, AI, and IoT: A Compass and Torchbearer for the Technocrats
Dr.Chandrakant
No ratings yet
Pandas in 7 Days: Utilize Python to Manipulate Data, Conduct Scientific Computing, Time Series Analysis, and Exploratory Data Analysis
From Everand
Pandas in 7 Days: Utilize Python to Manipulate Data, Conduct Scientific Computing, Time Series Analysis, and Exploratory Data Analysis
Fabio Nelli
No ratings yet
Activity Recognition: Fundamentals and Applications
From Everand
Activity Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
Machine Learning Algorithms for Data Scientists: An Overview
From Everand
Machine Learning Algorithms for Data Scientists: An Overview
Vinaitheerthan Renganathan
No ratings yet
Concept Mining: Fundamentals and Applications
From Everand
Concept Mining: Fundamentals and Applications
Fouad Sabry
No ratings yet
Image Retrieval: Unlocking the Power of Visual Data
From Everand
Image Retrieval: Unlocking the Power of Visual Data
Fouad Sabry
No ratings yet
Blockchain, Cryptocurrencies and NFTs : A Practical Introduction
From Everand
Blockchain, Cryptocurrencies and NFTs : A Practical Introduction
MAX EDITORIAL
No ratings yet
Image Retrieval: Fundamentals and Applications
From Everand
Image Retrieval: Fundamentals and Applications
Fouad Sabry
No ratings yet
Automatic Image Annotation: Enhancing Visual Understanding through Automated Tagging
From Everand
Automatic Image Annotation: Enhancing Visual Understanding through Automated Tagging
Fouad Sabry
No ratings yet
Pattern Recognition: Fundamentals and Applications
From Everand
Pattern Recognition: Fundamentals and Applications
Fouad Sabry
No ratings yet
Text Mining: Fundamentals and Applications
From Everand
Text Mining: Fundamentals and Applications
Fouad Sabry
No ratings yet
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
César Pérez López
No ratings yet
Automatic Image Annotation: Fundamentals and Applications
From Everand
Automatic Image Annotation: Fundamentals and Applications
Fouad Sabry
No ratings yet
Knowledge Reasoning: Fundamentals and Applications
From Everand
Knowledge Reasoning: Fundamentals and Applications
Fouad Sabry
No ratings yet
Conceptual Dependency Theory: Fundamentals and Applications
From Everand
Conceptual Dependency Theory: Fundamentals and Applications
Fouad Sabry
No ratings yet
Neural Networks: A Practical Guide for Understanding and Programming Neural Networks and Useful Insights for Inspiring Reinvention
From Everand
Neural Networks: A Practical Guide for Understanding and Programming Neural Networks and Useful Insights for Inspiring Reinvention
Steven Cooper
No ratings yet
Object Detection: Advances, Applications, and Algorithms
From Everand
Object Detection: Advances, Applications, and Algorithms
Fouad Sabry
No ratings yet
Hands-on Data Analysis and Visualization with Pandas: Engineer, Analyse and Visualize Data, Using Powerful Python Libraries
From Everand
Hands-on Data Analysis and Visualization with Pandas: Engineer, Analyse and Visualize Data, Using Powerful Python Libraries
PURNA CHANDER RAO. KATHULA
5/5 (1)
