Abstract
The need for cyber security grows every day as the amount of data available online continues to rise exponentially. Cyber security has become a field of prime importance in recent years and will continue to be so. The number of hackers and malicious actors is growing day by day, and they use varied methods and techniques to extract sensitive information from users. “Phishing” is one of the most common yet unique security concerns. It is unique in that, instead of targeting system vulnerabilities, it is a social engineering attack that targets human vulnerabilities. Users give up personal and sensitive data, viz. passwords, card details and bank details, by falling for scam emails or websites. The goal of this research is to create a tool that helps detect and differentiate a phishing website from a safe website, thus preventing users from opening risky URLs and keeping their personal data safe. Logistic Regression and MultinomialNB are used as the primary classification methods, alongside other techniques, viz. Random Forest, Artificial Neural Network and Support Vector Machine. Most common machine learning algorithms require intensive training on data, which makes them too slow to execute in real time. The aim of the research is to create a model that can work in real time. The designed pipelined model, using Logistic Regression, achieved an accuracy of around 98%.
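As a rough illustration of what a pipelined URL classifier built on Logistic Regression might look like, the sketch below chains a tokenizer and a classifier with scikit-learn. It is not the authors' implementation: the abstract does not describe the feature extraction, so the `CountVectorizer` step, its token pattern, the helper names `build_url_classifier` and `classify`, and the toy URLs are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up URLs for illustration only; this is NOT the paper's dataset.
urls = [
    "https://www.example.com/about",
    "https://docs.example.org/guide/install",
    "https://news.example.net/world/today",
    "http://secure-login.example-bank.verify-account.test/update/password",
    "http://free-gift.example.test/claim/login/confirm",
    "http://paypa1-verify.example.test/signin/account/update",
]
labels = [0, 0, 0, 1, 1, 1]  # 0 = safe, 1 = phishing


def build_url_classifier():
    """Chain tokenization and classification into one estimator (assumed design)."""
    return make_pipeline(
        # Split each URL into alphanumeric tokens and count them.
        CountVectorizer(token_pattern=r"[A-Za-z0-9]+"),
        LogisticRegression(max_iter=1000),
    )


model = build_url_classifier()
model.fit(urls, labels)


def classify(url):
    """Return a (label, phishing_probability) pair for a single URL."""
    prob = float(model.predict_proba([url])[0][1])
    return ("phishing" if prob >= 0.5 else "safe", prob)


label, prob = classify("http://example.test/login/verify")
print(label, round(prob, 3))
```

Because vectorization and classification sit in a single pipeline, a new URL is scored with one `predict_proba` call, which is the kind of property a real-time check would rely on. Swapping `LogisticRegression` for `sklearn.naive_bayes.MultinomialNB` would give a comparable sketch for the other primary method named in the abstract.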
Original language | English |
---|---|
Pages (from-to) | 29431-29456 |
Number of pages | 26 |
Journal | Multimedia Tools and Applications |
Volume | 82 |
Issue number | 19 |
State | Published - Aug 2023 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Keywords
- Classification
- Logistic regression
- Machine learning
- MultinomialNB
- Phishing websites