Abstract
This research develops an AI-powered Web Application Firewall (WAF) to detect SQL injection (SQLi) attacks, addressing the limitations of traditional signature-based systems. Using the Kaggle SQLi dataset (30,905 queries), the study applied TF-IDF over character-level n-grams and trained three machine learning models (XGBoost, Random Forest, and SVM), tuning hyperparameters with grid search and cross-validation. The SVM model performed best, achieving 99.48% accuracy, a 99.59% F1-score, and 99.90% AUC-ROC, with very low false positive and false negative rates and real-time detection at 1.52 ms latency and a throughput of 658 queries/second per CPU core. Character n-grams successfully captured common SQLi patterns such as UNION SELECT, OR operators, comments, and tautologies. A Flask-based web application and REST API demonstrated that the system is production-ready, highly scalable, and far cheaper than commercial WAFs. The research confirms that traditional machine learning with good feature engineering can match deep learning performance while remaining simpler and more efficient. Limitations include reliance on a single dataset, binary classification only, and reduced effectiveness against highly obfuscated or second-order attacks. Future work should involve multi-dataset testing, adversarial robustness, attack subtype classification, and exploration of contextual embeddings. Overall, the study shows that ensemble machine learning provides an accurate, fast, and cost-effective alternative for real-time SQL injection detection.
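The pipeline summarized in the abstract (character-level TF-IDF features, an SVM classifier, and grid search with cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the study's implementation: it assumes scikit-learn, uses a linear SVM (the abstract does not state the kernel), and trains on a handful of hand-written toy queries instead of the Kaggle dataset. The parameter grid and the helper name `is_sqli` are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy samples written for this sketch (NOT the Kaggle SQLi dataset).
# Label 1 = SQL injection attempt, 0 = benign query.
queries = [
    "SELECT name FROM users WHERE id = 5",
    "SELECT * FROM products WHERE category = 'books'",
    "UPDATE accounts SET email = 'a@b.com' WHERE id = 7",
    "INSERT INTO logs (msg) VALUES ('login ok')",
    "1' OR '1'='1",
    "admin' --",
    "1 UNION SELECT username, password FROM users",
    "' OR 1=1 #",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Character-level n-grams feed a linear SVM (kernel choice is an assumption).
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer="char", ngram_range=(1, 3))),
    ("svm", LinearSVC()),
])

# Hypothetical search space; the abstract does not list the grid that was used.
param_grid = {
    "tfidf__ngram_range": [(1, 3), (2, 4)],
    "svm__C": [0.1, 1.0, 10.0],
}

# 2-fold stratified cross-validation, only because the toy set is tiny.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="f1")
search.fit(queries, labels)


def is_sqli(query: str) -> bool:
    """Classify one query string with the best pipeline found by the search."""
    return bool(search.predict([query])[0] == 1)


print(search.best_params_)
print(is_sqli("1' OR 'a'='a"))
```

With so few samples the printed result is not meaningful; the sketch only shows how the three components fit together. A Flask endpoint like the one described in the abstract could call a function such as `is_sqli` on each incoming request parameter.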