ProGRES

ARTICLE

Characterization of Malicious URLs Using Machine Learning and Feature Engineering

2024
In: Seeam, A., Ramsurrun, V., Juddoo, S., Phokeer, A. (eds) Innovations and Interdisciplinary Solutions for Underserved Areas. InterSol 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST), vol. 541, pp. 15-32. Springer, Cham

Article link: https://doi.org/10.1007/978-3-031-51849-2_2

Discipline: Computer and information sciences

Author(s): Nana, S.R., Bassolé, D., Dimitri Ouattara, J.S., Sié, O.

Tagged author(s): BASSOLE Didier

Entered by: BASSOLE Didier

Abstract

In this paper, we apply Machine Learning models, combined with Feature Engineering techniques, to malicious URL detection and classification. The models were implemented with scikit-learn using the Random Forest, Support Vector Machine, and XGBoost classifier algorithms. They were trained, tested, and then optimized on a dataset of 641,125 URLs (benign, defacement, malware, and phishing) drawn from several sources, including ISCX-URL2016 from the University of New Brunswick. Through iterative learning, we show that certain combinations of hyperparameters and features reduce the false positive rate. The resulting scores are close to 100%, with zero false positive rates for some types of URLs. Finally, we compare the performance of our models with that of models from related work.
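
To make the feature engineering step more concrete, here is a minimal sketch of how lexical features might be derived from a raw URL before it is passed to a classifier. The abstract does not list the features used in the paper, so the function name `extract_url_features` and every feature below are assumptions chosen for illustration only, using the Python standard library.

```python
import re
from urllib.parse import urlparse

# Hypothetical feature set for illustration; the paper's actual features
# are not given in the abstract.
IPV4_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")
HAS_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features for one URL."""
    # urlparse only fills in the host when the URL starts with a scheme,
    # so prepend a placeholder scheme to bare URLs such as "example.com/a".
    parsed = urlparse(url if HAS_SCHEME.match(url) else "http://" + url)
    host = parsed.hostname or ""
    path = parsed.path or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special": sum(url.count(ch) for ch in "@-_?=&%"),
        "path_depth": len([seg for seg in path.split("/") if seg]),
        "host_is_ipv4": int(bool(IPV4_HOST.match(host))),
        "uses_https": int(parsed.scheme == "https"),
    }


features = extract_url_features("http://192.168.0.1/admin/login.php?id=1")
```

Each URL becomes a fixed-length numeric record, which is the form expected by tabular classifiers such as Random Forest, Support Vector Machine, or XGBoost. Which features help, and which merely add noise, would have to be established experimentally, as the abstract describes.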

Keywords

Malicious URL, Characterization, Feature Engineering, Detection, Classification
