Injection Web Attacks Using Ensemble Learners and Data Sampling PPT

Question Description

I’m working on a statistics presentation and need a sample draft to help me understand better.

I need a PowerPoint presentation. It should have enough slides for a talk of 20 to 30 minutes. The emphasis should be on the case studies/experiments and the statistical analysis used in the paper submitted below. You are presenting only the attached paper. Also address how these statistical techniques can be applied in a research project (like the attached paper).

Detecting SQL Injection Web Attacks Using Ensemble Learners and Data Sampling

Richard Zuech, John Hancock, Taghi M. Khoshgoftaar
College of Engineering and Computer Science, Florida Atlantic University, Boca Raton

Abstract: SQL Injection web attacks are a common choice among attackers to exploit web servers. We explore classification performance in detecting SQL Injection web attacks in the recent CSE-CIC-IDS2018 dataset with the Area Under the Receiver Operating Characteristic Curve (AUC) metric for the following seven classifiers: Random Forest (RF), CatBoost (CB), LightGBM (LGB), XGBoost (XGB), Decision Tree (DT), Naive Bayes (NB), and Logistic Regression (LR). The first four are ensemble learners; the last three are single learners included for comparison. Our unique data preparation of CSE-CIC-IDS2018 affords a harsh experimental testbed of class imbalance as encountered in the real world for cybersecurity attacks. To the best of our knowledge, we are the first to apply random undersampling techniques to web attacks from the CSE-CIC-IDS2018 dataset while exploring various sampling ratios.

We find the ensemble learners to be the most effective at detecting SQL Injection web attacks, but only after first applying massive data sampling.

Index Terms: SQL Injection, Web Attacks, Class Imbalance, Random Undersampling, Rarity, Ensemble Learners.

978-1-7281-5684-2/20/$31.00 © 2021 IEEE

I. INTRODUCTION

Cybersecurity is an important consideration for the modern Internet era, with consumers spending over $600 billion on e-commerce sales during 2019 in the United States [1]. Security practitioners struggle to properly defend this increasingly important cyberspace in a constant arms race against criminals and other adversaries. At the time of this writing, SQL Injection web attacks rank number one on the Open Web Application Security Project (OWASP) “Top 10 Web Application Security Risks” [2]. SQL Injection web attacks are a code injection technique [3] where attackers craft special sequences of characters and submit them to web page forms in an attempt to directly query the back-end database of that website. When successful, attackers might be able to exfiltrate sensitive data or even alter data to their liking. Some SQL Injection web attacks may even allow attackers to exploit the operating system of the database host, from which they can attempt to pivot to other targets on the victim’s network (possibly starting from a more favorable foothold).

In this study, we employ machine learning towards the detection of SQL Injection web attacks. When employing security analytics [4], one important aspect that defenders confront is the issue of class imbalance. Class imbalance occurs when one class label is disproportionately represented as compared to another class label. For example, in cybersecurity, it is not uncommon for a cyberattack to be lost in a sea of normal instances, similar to the proverbial “needle in a haystack” analogy. Amit et al. [5], at Palo Alto Networks and Shodan, state that in cybersecurity “imbalance ratios of 1 to 10,000 are common.” We agree with their assessment that very high imbalance ratios are common in cybersecurity, which is a motivation for this study to explore sampling ratios in detecting SQL Injection web attacks.

To evaluate SQL Injection web attacks, we utilize the CSE-CIC-IDS2018 dataset, which was created by Sharafaldin et al. [6] at the Canadian Institute for Cybersecurity. CSE-CIC-IDS2018 is a more recent intrusion detection dataset than the popular CIC-IDS2017 dataset [7], which was also created by Sharafaldin et al. The CSE-CIC-IDS2018 dataset includes over 16 million instances, comprising normal instances as well as the following families of attacks: web attack, Denial of Service (DoS), Distributed Denial of Service (DDoS), brute force, infiltration, and botnet. For additional details on the CSE-CIC-IDS2018 dataset [8], please refer to [9].

For illustrative purposes, Table I contains the breakdown for the entire CSE-CIC-IDS2018 dataset (although the entire dataset is not used in this experiment, and this table should only be used for reference purposes). The 928 web attack instances from Table I comprise the following three different web attack labels: “Brute Force-Web” (611 instances), “Brute Force-XSS” (230 instances), and “SQL Injection” (87 instances). To implement these three different web attacks, the authors of CSE-CIC-IDS2018 utilized the Damn Vulnerable Web App (DVWA) [10] and Selenium framework [11] tools.

Table II illustrates the instances used in this experiment. In this study, we only focus on SQL Injection web attacks along with all the normal traffic and discard the other attack instances (including discarding the “Brute Force-Web” and “Brute Force-XSS” labels). Through our unique data preparation process, we are able to evaluate SQL Injection web attacks from CSE-CIC-IDS2018 at a class ratio for normal to attack of 153,911:1 for SQL Injection web attacks.

Our work is unique in that existing works only evaluate class ratios as high as 2,896:1 for web attacks, and none of the existing works evaluate the effects of applying sampling techniques.

TABLE I
ENTIRE CSE-CIC-IDS2018 DATASET BY FILES/DAYS (FOR REFERENCE ONLY – THE FULL DATASET IS NOT USED IN OUR EXPERIMENT)

Day                           Normal Instances   Attack Instances
02/14 Wed – Brute Force                667,626            380,949
02/15 Thurs – DoS                      996,077             52,498
02/16 Fri – DoS                        446,772            601,802
02/20 Tues – DDoS                    7,372,557            576,191
02/21 Wed – DDoS                       360,833            687,742
02/22 Thu – Web                      1,048,213                362
02/23 Fri – Web                      1,048,009                566
02/28 Wed – Infiltration               544,200             68,871
03/01 Thurs – Infiltration             238,037             93,063
03/02 Fri – Bot                        762,384            286,191
Total Records                       13,484,708          2,748,235

TABLE II
INSTANCES USED IN THIS EXPERIMENT FROM CSE-CIC-IDS2018

Attack Type          PCC*   Normal Instances   Imbalance Ratio
SQL Injection Web      87         13,390,234         153,911:1

* Positive Class Count (PCC)

The CSE-CIC-IDS2018 dataset comprises ten different days of files, and we combine all ten days of normal traffic with the SQL Injection web attack instances.
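The imbalance ratio reported in Table II follows directly from the two instance counts. A minimal sketch of that arithmetic, assuming the ratio is simply the number of normal instances per attack instance rounded to the nearest integer (the helper name `imbalance_ratio` is hypothetical, not from the paper):

```python
# Counts taken from Table II of the paper.
NORMAL_INSTANCES = 13_390_234
POSITIVE_CLASS_COUNT = 87  # SQL Injection instances (PCC)


def imbalance_ratio(negatives: int, positives: int) -> int:
    """Return negatives per positive, rounded to the nearest integer.

    Hypothetical helper: the rounding rule is an assumption.
    """
    return round(negatives / positives)


ratio = imbalance_ratio(NORMAL_INSTANCES, POSITIVE_CLASS_COUNT)
ratio_text = f"{ratio:,}:1"
print(ratio_text)
```

Running the sketch reproduces the 153,911:1 figure from Table II.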

Other works only evaluate web attacks with one or two days of normal traffic. By combining all ten days of normal traffic, we obtain a higher imbalance ratio and have a richer backdrop of normal data as compared to other studies.

An additional challenge encountered in our experiment is class rarity, with the Positive Class Count (PCC) of the SQL Injection web attacks comprising only 87 positive instances. Class rarity is an extreme case of class imbalance, and rarity is not uncommon in cybersecurity, especially among more stealthy or sophisticated attacks [12]. Throughout this document, the term rarity will always refer to class rarity. Rarity occurs in machine learning when the PCC has fewer than a few hundred instances [13], as compared to many more negative instances. For example, 10,000,000 total instances with an imbalance level of 1% from the positive class would yield a PCC of 100,000, which is typically enough positive class instances for machine learning classifiers to discriminate class patterns (and this example would only be highly imbalanced). On the other hand, 1,000 total instances with that same imbalance level of 1% would only provide a PCC of 10, and this would constitute rarity, as machine learning classifiers may struggle with so few instances from the positive class.

To evaluate the effects of class imbalance with SQL Injection web attacks, we explore eight different levels of sampling ratios with random undersampling (RUS): no sampling, 999:1, 99:1, 95:5, 9:1, 3:1, 65:35, and 1:1. We also compare the following seven different classifiers: Random Forest (RF), CatBoost (CB), LightGBM (LGB), XGBoost (XGB), Decision Tree (DT), Naive Bayes (NB), and Logistic Regression (LR). To quantify classification performance, we utilize the Area Under the Receiver Operating Characteristic Curve (AUC) metric.

The remaining sections of this paper are organized as follows. The Related Work section studies existing literature for web attacks with CSE-CIC-IDS2018 data. In the Data Preparation section, we describe how the datasets used in our experiments were cleaned and prepared. Then, the Methodologies section describes the classifiers, performance metrics, and sampling techniques applied in our experiments. The Results and Discussion section provides our results and statistical analysis. Finally, the Conclusion section concludes our work.

II. RELATED WORK

None of the prior four studies [14], [15], [16], [17] for web attacks with CSE-CIC-IDS2018 provided any results for class imbalance analysis. No sampling techniques are applied to explore class imbalance issues for web attacks in CSE-CIC-IDS2018. None of these four studies combine the full normal traffic (all days) from CSE-CIC-IDS2018 with the individual web attacks for analysis; instead, they only use a single day of normal traffic when considering web attacks. No prior CSE-CIC-IDS2018 studies have explored class rarity.

Three of these four studies [14], [15], [16] utilized multi-class classification for the “Web” attacks, resulting in extremely poor classification performance for each of the three individual web attack labels (“Brute Force-Web”, “Brute Force-XSS”, and “SQL Injection”). In many cases, not even one instance could be correctly classified for an individual web attack. However, classification results for the aggregated web attacks in [17] are extremely high. This discrepancy in the literature between the three individual web attacks and those same web attacks combined (aggregated) motivated us to conduct this study. We were surprised to find our results to be so much better than those of the three other studies [14], [15], [16] analyzing these same SQL Injection web attacks through multi-class classification. Our random undersampling approach helped, although some of our classifiers still fared much better even when no sampling was applied, which was likely due to our rigorous data preparation approach.
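To make the random undersampling (RUS) levels explored in this study concrete, the following is a minimal sketch, not the authors' implementation. It assumes each ratio is read as majority:minority, that every positive instance is kept, and that negatives are drawn uniformly at random without replacement; the function names and the toy label vector are hypothetical.

```python
import random

# Majority:minority ratios listed in the paper (the "no sampling" level is omitted).
RATIOS = {
    "999:1": (999, 1), "99:1": (99, 1), "95:5": (95, 5), "9:1": (9, 1),
    "3:1": (3, 1), "65:35": (65, 35), "1:1": (1, 1),
}


def negatives_to_keep(n_pos, majority, minority):
    """Number of negatives needed so that negatives:positives is majority:minority."""
    return round(n_pos * majority / minority)


def random_undersample(labels, majority, minority, seed=0):
    """Return sorted indices keeping all positives (1) and a random subset of negatives (0)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep = min(len(neg), negatives_to_keep(len(pos), majority, minority))
    return sorted(pos + rng.sample(neg, keep))


# Toy data: 87 positives (the paper's PCC) against 100,000 synthetic negatives.
labels = [1] * 87 + [0] * 100_000
for name, (maj, mino) in RATIOS.items():
    print(name, negatives_to_keep(87, maj, mino))

sampled = random_undersample(labels, 9, 1)
```

With 87 positive instances, even the mildest level (999:1) keeps fewer than 87,000 of the roughly 13.4 million normal instances, which gives a sense of how aggressive the sampling is. Undersampling is commonly applied to training data only; the excerpt here does not say where the authors applied it.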

With the CSE-CIC-IDS2018 dataset, Basnet et al. [14] benchmark different deep learning frameworks: Keras-TensorFlow, Keras-Theano, and using 10-fold cross validation. However, full results are only produced for which is likely due to the computational constraints they frequently mention (where in some cases it took weeks to produce results). They achieve 99.9% accuracy for the aggregated web attacks with binary classification. However, the multi-class classification for those same three individual web attacks tells a completely different story: 53 of 121 “Brute Force-Web” classified correctly, 17 of 45 “Brute Force-XSS” classified correctly, and 0 of 16 “SQL Injection” classified correctly. Basnet et al. only provide classification results in terms of the Accuracy metric and confusion matrices (where only accuracy is provided for the aggregated web attacks).

Their 99.9% accuracy scores for the aggregated web attacks can be deceptive when dealing with such high levels of class imbalance, as such a high accuracy can still be attained even with zero instances from the positive class correctly classified. When dealing with high levels of class imbalance, performance metrics which are more sensitive to class imbalance should be utilized. For web attacks, only two separate days of traffic from CSE-CIC-IDS2018 are evaluated with imbalance levels of 2,880:1 (binary) and 30,665:7.32:2.32:1 (multi-class) for one day and 1,842:1 (binary) and 19,666:6.83:2.85:1 (multi-class) for the other day. Such high imbalance levels require metrics more sensitive to class imbalance.
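The point about deceptive accuracy can be checked with a small worked example. The sketch below is an illustration under stated assumptions, not code from any of the cited studies: it builds a toy test set at the 2,880:1 binary ratio mentioned above, scores a hypothetical classifier that labels everything as normal, and compares accuracy with a rank-based AUC (ties counted as one half).

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney) formulation, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores starting at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum = sum(r for r, t in zip(ranks, y_true) if t == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# One attack instance among 2,880 normal instances.
y_true = [1] + [0] * 2880
y_pred = [0] * len(y_true)    # trivial classifier: always predicts "normal"
scores = [0.0] * len(y_true)  # and assigns the same score to every instance

acc = accuracy(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"accuracy={acc:.4f}, auc={auc_value:.2f}")
```

The trivial classifier reaches an accuracy above 99.9% while detecting no attacks at all, whereas its AUC is 0.5, the value expected from random guessing.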

Better classification performance might also have been achieved in [14] by properly treating the class imbalance problem.

Atefinia and Ahmadi [15] propose a new “modular deep neural network model” and test it with CSE-CIC-IDS2018 data. Web attacks perform very poorly in their model, with multi-class classification results of: 56 of 122 “Brute Force-Web” classified correctly, 0 of 46 “Brute Force-XSS” classified correctly, and 0 of 18 “SQL Injection” classified correctly. For two of the three web attacks, their model does not correctly classify even one instance of the test data. They only produce results with their one custom learner, so benchmarking their approach is not easy.

The work of Atefinia and Ahmadi is unique compared to the other three CSE-CIC-IDS2018 studies considering web attacks in that Atefinia and Ahmadi combine the two web attack days together with the attack and normal traffic for only those two days, whereas the other three studies consider each of these two days separately for the web attack data (days: Thursday 02/22/2018 and Friday 02/23/2018). The classification results with their new model are very poor for the web attacks, and they do not explore treating the class imbalance problem.

Li et al. [16] create an unsupervised Auto-Encoder Intrusion Detection System (AE-IDS), which is based on an anomaly detection approach utilizing 85% of the normal instances as the training dataset, with the testing dataset consisting of the remaining 15% of the normal instances plus all the attack instances. They only analyze one day of the available two days of “Web” attack traffic from CSE-CIC-IDS2018, and they evaluate the three different web attacks separately (versus aggregating the “Web” category together). The three individual web attacks perform very poorly with AE-IDS, with multi-class classification results of: 147 of 362 “Brute Force-Web” classified correctly, 26 of 151 “Brute Force-XSS” classified correctly, and 6 of 53 “SQL Injection” classified correctly. Overall, fewer than half of the instances are classified correctly for each of the three different web attacks.

D’hooge et al. [17] evaluate each day of the CSE-CIC-IDS2018 dataset separately for binary classification with 12 different learners and stratified 5-fold cross validation. Importantly, their study does not evaluate individual SQL Injection web attacks like we do; instead, they use all three of the CSE-CIC-IDS2018 web attacks combined together.

Their F1 and AUC scores for the two different days with “Web” categories are generally very high, with some perfect F1 and AUC scores achieved with XGBoost. Other learners varied between 0.9 and 1.0 for both F1 and AUC scores, with the first day of “Web” usually having better performance than the second day of “Web”. The three other studies we evaluated all used multi-class classification for these same web attacks, but they all had extremely poor classification performance (many times with zero attack instances classified correctly). D’hooge et al. state overfitting might have been a problem for CIC-IDS2017 in this same study, and “further analysis is required to be more conclusive about this finding”. Given such extremely high classification scores, overfitting may have been a problem in their CSE-CIC-IDS2018 results as well (for example in their source code, we noticed the max depth hyperparameter set to a value of 35 for Decision Tree and Random Forest learners). In addition, their model validation approach is not clear.

They state they utilize two-thirds of each day’s data with stratified 5-fold cross validation for hyperparameter tuning, and then they utilize “single execution testing”. However, it is not clear how this single execution testing was performed and whether there is indeed a “gold standard” holdout test set.

In summary, extremely poor classification performance of individual SQL Injection web attacks from CSE-CIC-IDS2018 motivated us to further explore this phenomenon with class imbalance in mind. Two of the three studies evaluating individual SQL Injection web attacks could not correctly classify even one instance, while the third study could only classify 6 of 53 instances correctly. Additionally, we investigate severe class imbalance and rarity for the SQL Injection web attacks in CSE-CIC-IDS2018, which has not previously been done.

III. DATA PREPARATION

In this section, we describe how we prepared and cleaned the dataset files used in our experiment.

We dropped the “Protocol” and “Timestamp” fields from CSE-CIC-IDS2018 during our preprocessing steps. The “Protocol” field is somewhat redundant, as the “Dst Port” (Destination Port) field mostly contains equivalent “Protocol” values for each Destination Port value. We dropped the “Timestamp” field because we did not want the learners to discriminate attack predictions based on time, especially with more stealthy attacks in mind. In other words, the learners should be able to discriminate attacks regardless of whether the attacks are high volume or slow and stealthy. Additionally, a total of 59 records were dropped from CSE-CIC-IDS2018 due to header rows being repeated in files.

The fourth downloaded file, named “Thuesday-20-02-2018_TrafficForML_CICFlowMeter.csv”, differed from the other nine files from CSE-CIC-IDS2018. This file contained four extra columns: “Flow ID”, “Src IP”, “Src Port”, and “Dst IP”. We dropped these four additional fields. Also of note is that this one particular file contained nearly half of all the records for CSE-CIC-IDS2018: 7,948,748 of the dataset’s total 16,232,943 records.
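The file-level cleanup described above (skipping repeated header rows and dropping the unwanted columns) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' pipeline: the function name `clean_rows` is hypothetical, the sample rows are invented, and the column headers in the real CSV files may be spelled differently from the names used in the paper.

```python
import csv
import io

# Fields the paper reports dropping at this stage (names as written in the paper).
DROPPED_FIELDS = {"Protocol", "Timestamp", "Flow ID", "Src IP", "Src Port", "Dst IP"}


def clean_rows(lines):
    """Yield the header and data rows without dropped fields or repeated header rows."""
    reader = csv.reader(lines)
    header = next(reader)
    keep = [i for i, name in enumerate(header) if name not in DROPPED_FIELDS]
    yield [header[i] for i in keep]
    for row in reader:
        if row == header:  # header line repeated in the middle of the file
            continue
        yield [row[i] for i in keep]


# Invented sample standing in for one of the daily CSV files.
sample = io.StringIO(
    "Dst Port,Protocol,Timestamp,Flow Duration,Label\n"
    "80,6,22/02/2018 10:00:00,1500,Benign\n"
    "Dst Port,Protocol,Timestamp,Flow Duration,Label\n"
    "80,6,22/02/2018 10:00:05,2100,SQL Injection\n"
)
cleaned = list(clean_rows(sample))
for row in cleaned:
    print(row)
```

Because the set of fields to drop is matched by name, the same function handles both the nine regular files and the one file with four extra columns.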

Certain fields contained negative values which did not make sense, so we dropped those instances with negative values for the “Fwd Header Length”, “Flow Duration”, and “Flow IAT Min” fields (with a total of 15 records dropped from CSE-CIC-IDS2018 for these fields containing negative values). Negative values in these fields were causing extreme values that can skew classifiers which are sensitive to outliers. Eight fields in both datasets contained constant values of zero for every instance. In other words, these fields did not contain any value other than zero. We dropped these eight fields: “Bwd PSH Flags”, “Bwd URG Flags”, “Fwd Avg Bytes Bulk”, “Fwd Avg Packets Bulk”, “Fwd Avg Bulk Rate”, “Bwd Avg Bytes Bulk”, “Bwd Avg Packets Bulk”, and “Bwd Avg Bulk Rate”. We also excluded the “Init Win bytes forward” and “Init Win bytes backward” fields because they contained negative values.

These fields were excluded since about half of the total instances contained negative values for these two fields (so we would have removed a very large portion of the dataset by filtering all these instances out). Similarly, we did not use the “Flow Duration” field as some of those values were unreasonably low with zero values. The “Flow Bytes/s” and “Flow Packets/s” fields contained some “Infinity” and “NaN” values (with less than 0.6% of the records containing these values). We dropped these 95,760 instances where either “Flow Bytes/s” or “Flow Packets/s” contained “Infinity” or “NaN” values. We also excluded the Destination Port categorical feature which contains more than 64,000 distinct categorical values.
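The record-level filters described in this section (dropping instances with “Infinity” or “NaN” rates and instances with negative values in selected fields) can be illustrated with a short sketch. It is an assumption-laden example, not the authors' code: the helper `is_valid` is hypothetical, the records are invented, and the field names follow the paper's spelling rather than the exact CSV headers.

```python
import math

# Field names as written in the paper; actual CSV headers may differ.
FINITE_CHECK_FIELDS = ("Flow Bytes/s", "Flow Packets/s")
NEGATIVE_CHECK_FIELDS = ("Fwd Header Length", "Flow Duration", "Flow IAT Min")


def is_valid(record):
    """Return True if the record has finite rates and no negative checked fields."""
    for field in FINITE_CHECK_FIELDS:
        # float() parses the strings "Infinity" and "NaN" into inf and nan.
        if not math.isfinite(float(record[field])):
            return False
    return all(float(record[field]) >= 0 for field in NEGATIVE_CHECK_FIELDS)


def make_record(flow_bytes, flow_packets, header_len, duration, iat_min):
    """Build an invented record with string values, as read from a CSV file."""
    return {
        "Flow Bytes/s": flow_bytes, "Flow Packets/s": flow_packets,
        "Fwd Header Length": header_len, "Flow Duration": duration,
        "Flow IAT Min": iat_min,
    }


records = [
    make_record("1200.5", "10.0", "40", "1500", "3"),    # kept
    make_record("Infinity", "10.0", "40", "1500", "3"),  # dropped: infinite rate
    make_record("NaN", "NaN", "40", "1500", "3"),        # dropped: NaN rate
    make_record("1200.5", "10.0", "-20", "1500", "3"),   # dropped: negative value
]
kept = [r for r in records if is_valid(r)]
print(len(kept), "of", len(records), "records kept")
```

Filtering rows is only practical when few rows are affected; as the text notes, fields with negative values in about half of the instances were excluded as whole columns instead.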

Since Destination Port has so many values, we determined that finding an optimal encoding technique was out of scope for this study. Table II includes the final counts for the positive and negative instances used in this experiment.

IV. METHODOLOGIES

A. Classifiers

For all experiments in this study, stratified 5-fold cross validation [18] is used. Stratified means that each training and test fold is split so that every class is represented in the same proportion across all folds. Splitting in a stratified manner is especially important when dealing with high levels of class imbalance, as randomness can inadvertently skew the results between folds [19]. To account for randomness, each stratified 5-fold cross validation w…
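A minimal sketch of how stratified fold assignment can work is shown below. It is not the authors' implementation (which the excerpt does not show); it assumes a simple round-robin assignment of shuffled indices within each class, and the function name and toy label vector are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds so each class is spread as evenly as possible."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class out to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Toy data: 87 positives (the paper's PCC) and 8,700 synthetic negatives.
labels = [1] * 87 + [0] * 8700
folds = stratified_folds(labels)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(positives_per_fold)
```

With only 87 positive instances, each of the five test folds holds 17 or 18 of them. A purely random split could leave a fold with far fewer, which is the skew that stratification is meant to prevent.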
