Intrusion detection

Assignment

PYTHON

You have to apply a Long Short-Term Memory (LSTM) recurrent neural network classifier for intrusion detection to the attached CSV file, and evaluate its performance with metrics such as Detection Rate (DR) and False Alarm Rate (FAR), plus any others that apply. You also have to comment the code and explain how it works.
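For reference, the two metrics named above are usually computed from confusion counts: DR = TP / (TP + FN) and FAR = FP / (FP + TN), where an intrusion is the positive class. The following is a minimal sketch under that assumption. The function name `detection_metrics` and the default `positive=0` (abnormal records encoded as 0) are illustrative choices, and the labels are toy values, not results from the attached data.

```python
def detection_metrics(y_true, y_pred, positive=0):
    """Return (detection_rate, false_alarm_rate) for binary labels.

    `positive` is the label that marks an intrusion. The default of 0 is an
    assumption (abnormal records encoded as 0); change it if your encoding differs.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # intrusion correctly flagged
            else:
                fn += 1  # intrusion missed
        else:
            if pred == positive:
                fp += 1  # normal traffic flagged as an intrusion
            else:
                tn += 1  # normal traffic correctly passed
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, far


# Toy labels for illustration only (0 = abnormal, 1 = normal).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
dr, far = detection_metrics(y_true, y_pred)
print("Detection Rate:", dr, "False Alarm Rate:", far)
```

With a fitted classifier, the same function can be called on the test labels and the classifier's predictions for the test set.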

Solution 

Intrusion_Detection_Classification.ipynb 

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Intrusion Detection Classification\n",
    "\n",
    "### Nick Koutroumpinis, ML|mind Software Development\n",
    "\n",
    "In this project we build an intrusion detection classification model.\n",
    "Let's start with the dependencies and import the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": { "collapsed": true },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "    F1   F2  F3  F4  F5  F6   F7  F8   F9     Class\n",
       "0  744  156   5  30  63  99  114  32  122  Abnormal\n",
       "1  603   87   5  25  63  51   61  16  122  Abnormal\n",
       "2  674   87   5  25  63  51   61  16  122  Abnormal\n",
       "3  517   10   1  15  55   0    0   0  126  Abnormal\n",
       "4  648  128   5  33  67  81   99  19  122  Abnormal"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv(\"normal_and_abnormal_data.csv\")\n",
    "\n",
    "df[0:5]  # show what the data looks like"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Features Length: 60667 Labels Length: 60667\n"
     ]
    }
   ],
   "source": [
    "features = []  # one list of float feature values per row\n",
    "\n",
    "for ind, data in df.iterrows():  # build a list of lists from the dataframe\n",
    "    to_pass = [float(i) for i in data[:-1]]  # every column except the trailing Class label\n",
    "    features.append(to_pass)\n",
    "\n",
    "labels_pre = df['Class'].tolist()\n",
    "\n",
    "labels = []  # encode the class names: Abnormal -> 0, anything else -> 1\n",
    "for state in labels_pre:\n",
    "    if state == 'Abnormal':\n",
    "        labels.append(0)\n",
    "    else:\n",
    "        labels.append(1)\n",
    "\n",
    "print(\"Features Length: \" + str(len(features)) + \" Labels Length: \" + str(len(labels)))  # must be the same length\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": { "collapsed": true },
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split  # utilities and classifiers from scikit-learn\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.ensemble import RandomForestClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Decision Tree Accuracy: 0.999450579638\n"
     ]
    }
   ],
   "source": [
    "# hold out 30% of the rows for testing; the fixed seed makes the split reproducible\n",
    "train_x, test_x, train_y, test_y = train_test_split(features, labels, test_size=0.3, random_state=42)\n",
    "\n",
    "clf = DecisionTreeClassifier()\n",
    "clf.fit(train_x, train_y)\n",
    "\n",
    "score = clf.score(test_x, test_y)  # mean accuracy on the test split\n",
    "\n",
    "print(\"Decision Tree Accuracy: \" + str(score))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0, 0),\n",
       " (0, 0),\n",
       " (0, 0),\n",
       " (0, 0),\n",
       " (1, 1),\n",
       " (0, 0),\n",
       " (1, 1),\n",
       " (1, 1),\n",
       " (0, 0),\n",
       " (0, 0)]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Build a validation list of (true label, predicted label) pairs to inspect the results\n",
    "val_list = []\n",
    "\n",
    "for k in range(len(test_y)):\n",
    "    val_list.append((test_y[k], clf.predict([test_x[k]])[0]))\n",
    "\n",
    "val_list[0:10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The decision tree reaches about 99.9% accuracy on the held-out 30% test split. The trained classifier can be applied to any new record that provides the same nine features (F1 to F9)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": { "name": "ipython", "version": 3 },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
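The assignment asks for an LSTM classifier, whereas the notebook above fits a decision tree. The following is a minimal sketch of how an LSTM could be set up on the same nine features, not the author's method. It assumes TensorFlow with its bundled Keras is installed, treats each record as a sequence of length 1 (the data shows no explicit time ordering, so this is an assumption), and uses hypothetical names (`to_sequences`, `build_lstm`) and an arbitrary layer size.

```python
def to_sequences(rows, timesteps=1):
    """Reshape flat feature rows into nested lists of shape (timesteps, features_per_step).

    With timesteps=1 (an assumption) every record becomes a one-step sequence.
    """
    sequences = []
    for row in rows:
        if len(row) % timesteps != 0:
            raise ValueError("row length must be divisible by timesteps")
        step = len(row) // timesteps
        sequences.append([list(row[i * step:(i + 1) * step]) for i in range(timesteps)])
    return sequences


def build_lstm(n_features, timesteps=1, units=32):
    """Build a small binary LSTM classifier (layer size is an arbitrary choice)."""
    # Imported here so the reshaping helper above works without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        keras.layers.LSTM(units),
        keras.layers.Dense(1, activation="sigmoid"),  # probability of label 1
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# First record from the table above, reshaped to one sequence of one timestep.
sample = to_sequences([[744, 156, 5, 30, 63, 99, 114, 32, 122]])
print(len(sample), len(sample[0]), len(sample[0][0]))

# Hypothetical training call, reusing the notebook's train_x / train_y
# (np is numpy; the epoch and batch values are placeholders, not tuned):
# model = build_lstm(n_features=9)
# model.fit(np.array(to_sequences(train_x)), np.array(train_y), epochs=5, batch_size=128)
```

Neural networks are sensitive to feature scale, so standardizing F1 to F9 before training is usually worthwhile. The predictions can then be scored with Detection Rate and False Alarm Rate as the assignment requires.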