AI: Anomaly Detection in logfiles


 ➤ IMPORTANT: This page is still under construction.

Summary

This guide will create a basic AI model that performs binary classification in order to detect anomalies in logfiles. This AI model is also suitable for the Jetson AGX Xavier Development Kit.

Requirements

  • Packages: TensorFlow, Keras, Pandas, scikit-learn (sklearn), NumPy, Seaborn, Matplotlib (imported as sketched below)
  • Software: PyCharm or any other Python editor
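
All code snippets in this guide assume the following imports. This is only a minimal sketch: the alias sk for sklearn.model_selection is an assumption derived from its use in the dataset split below, since the original guide does not show its import statements.

# assumed imports for the snippets in this guide
import numpy as np
import pandas as pd
import tensorflow as tf
import sklearn.model_selection as sk
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense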

Description

Step 1 - Read the dataset

First we need to read the data. For that we can use the predefined pandas function 'read_csv':

logfile_features = pd.read_csv(path)
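
Here, path is expected to point to the CSV export of the logfile dataset. A self-contained example could look as follows; the filename is purely hypothetical:

path = "logfiles.csv"  # hypothetical filename, adjust to the location of your dataset
logfile_features = pd.read_csv(path)
print(logfile_features.head())  # peek at the first rows to confirm the file was read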

Afterwards we replace the infinite values with NaNs and then drop all rows that contain NaNs:

logfile_features.replace([np.inf, -np.inf], np.nan, inplace=True)
logfile_features.dropna(inplace=True)

Our dataset has labels which define whether an entry is an attack or not, so we replace them with numerical values (0 and 1):

logfile_features["Label"].replace({"Benign": 0, "DoS attacks-Slowloris": 1, "DoS attacks-GoldenEye": 1}, inplace=True)

Next we shuffle our dataset

logfile_features = logfile_features.sample(frac=1) 

Now we need to split our data into 3 parts: Training data (60%), Test data (20%) and Validation data (20%). To do that we use the following function calls:

train_dataset, temp_test_dataset = sk.train_test_split(logfile_features, test_size=0.4)
test_dataset, valid_dataset = sk.train_test_split(temp_test_dataset, test_size=0.5)
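
As a quick sanity check (illustrative only, not part of the original guide), the resulting proportions can be confirmed by printing the size of each split:

# should be roughly 60% / 20% / 20% of the cleaned dataset
print(len(train_dataset), len(test_dataset), len(valid_dataset))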

Step 2 - Create a model

First we need to create a sequential model, which can be trained later.

model = Sequential()

The next step is to create an input layer consisting of 63 nodes, one for every feature we have in our dataset.

model.add(Dense(63))

Next a hidden layer consisting of 128 nodes with the ReLU (Rectified Linear Unit) activation function.

model.add(Dense(128, activation='relu'))

And finally the output layer consisting of 1 node which represents 'attack' or 'no attack'

model.add(Dense(1))
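
To inspect the resulting architecture at this point (illustrative only, not part of the original guide), the model can be built for the expected 63 input features and summarised:

# build the model for 63 input features so the layer shapes are resolved
model.build(input_shape=(None, 63))
model.summary()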

Now we could change the learning rate to a specific value, but we just leave it at the default 0.001

learning_rate = 0.001

For the optimizer we just use the Adam Optimizer with the pre-defined learning rate.

optimizer = tf.optimizers.Adam(learning_rate)

Lastly we need to compile the model. For the loss function we use BinaryCrossentropy, for the optimizer our previously defined Adam optimizer, and as the metric we use the accuracy of the model:

model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=optimizer,
              metrics=['accuracy'])
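
With the model compiled, a short illustrative training run could look like the following. The separation of the "Label" column from the features, the type cast, and the chosen epochs and batch_size are assumptions for this sketch, not part of the original guide; it presumes that only the 63 numeric feature columns and the Label column remain in the datasets.

# separate labels from features in each split (assumes only the 63 numeric feature columns remain)
train_labels = train_dataset.pop("Label").astype("float32")
valid_labels = valid_dataset.pop("Label").astype("float32")

# illustrative training run; epochs and batch_size are arbitrary choices
model.fit(train_dataset, train_labels,
          validation_data=(valid_dataset, valid_labels),
          epochs=10, batch_size=256)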



Used Hardware