Websites phishing detection using URLs tokens as a discriminating features
Journal
Journal of Engineering and Applied Sciences
ISSN
1816949X
Date Issued
2017-01-01
Author(s)
Daeef A.Y.
Ahmad R.B.
Yacob Y.
DOI
10.3923/jeasci.2017.513.519
Abstract
A phishing detector must be wide in scope to deal with the several strategies used to start a phishing campaign, and it must provide high-speed detection so that users are not dissatisfied by long delays. Consequently, this work presents a wide-scope, fast detection system that uses URL tokens as discriminating features, without using any external or content features. The method is based on analyzing the percentage of re-used tokens and the token overlap between phishing and legitimate URLs. This research differs from other research in that it analyzes URLs collected from different sources; according to this analysis, a statistical classifier is built and its performance is evaluated to measure the effectiveness of the technique. The results show that the dictionary of phishing tokens is smaller than the dictionary of legitimate tokens and that the token overlap between phishing and legitimate URLs is small. Also, the token overlap rate between different phishing sources is higher than the corresponding overlap percentage for legitimate tokens. An average accuracy of 77% is achieved by this technique.
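The two quantities the abstract analyzes, token re-use and token overlap between URL sets, can be illustrated with a minimal Python sketch. The abstract does not state how URLs are tokenized or how the rates are defined, so everything here is an assumption made for illustration: the split on non-alphanumeric characters, the function names `tokenize`, `build_dictionary`, `reuse_rate`, and `overlap_rate`, and the made-up example URLs. It is not the authors' implementation.

```python
import re


def tokenize(url):
    """Split a URL into lowercase tokens.

    Assumption: tokens are maximal runs of letters and digits; the
    abstract does not specify the tokenization rule.
    """
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]


def build_dictionary(urls):
    """Return the set of distinct tokens seen across a list of URLs."""
    tokens = set()
    for url in urls:
        tokens.update(tokenize(url))
    return tokens


def reuse_rate(urls):
    """Fraction of token occurrences that repeat an already-seen token.

    Assumption: this is one plausible reading of "percentage of
    re-used tokens"; the abstract gives no formal definition.
    """
    seen = set()
    total = 0
    reused = 0
    for url in urls:
        for token in tokenize(url):
            total += 1
            if token in seen:
                reused += 1
            else:
                seen.add(token)
    return reused / total if total else 0.0


def overlap_rate(dict_a, dict_b):
    """Share of tokens in dict_a that also occur in dict_b (assumed definition)."""
    if not dict_a:
        return 0.0
    return len(dict_a & dict_b) / len(dict_a)


# Hypothetical URLs for illustration only; not data from the paper.
phishing_urls = [
    "http://secure-login.example.test/paypal/update",
    "http://example.test/paypal/login/verify",
]
legitimate_urls = [
    "https://www.example.org/docs/index.html",
]

phishing_dict = build_dictionary(phishing_urls)
legitimate_dict = build_dictionary(legitimate_urls)

print("phishing dictionary size:", len(phishing_dict))
print("phishing token re-use rate:", reuse_rate(phishing_urls))
print("phishing/legitimate overlap:", overlap_rate(phishing_dict, legitimate_dict))
```

In this sketch the overlap is measured relative to the first dictionary, so `overlap_rate(a, b)` and `overlap_rate(b, a)` can differ; a symmetric measure such as the Jaccard index would be an equally reasonable choice given what the abstract says.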