Anomaly_Detection

Once you have set all the parameters needed for clustering, including data processing, you are ready to start training your model.

Train and Assign anomalies

This section is for training models where the number of clusters is already defined.

Click Step 1 : Train to start the training process. You will get output of this type:

After your model is trained, click Step 2 : Assign anomalies to assign anomalies to the data. You will get the following output:

  • The model will be saved as Anomaly_Detection_Model_15.pkl

  • The assigned data will be saved under the file name predicted_data_Assigned_15.csv

Here, 15 is the session ID number.
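The two steps and the saved files can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the toy z-score detector stands in for whatever model the tool actually trains, `fraction` is taken to mean the expected share of outliers, the column names are hypothetical, and the files are written to a temporary directory rather than the tool's real output folder.

```python
import csv
import pickle
import statistics
import tempfile
from pathlib import Path


def score(model, row):
    """Anomaly score: largest absolute z-score across the columns."""
    return max(abs(v - m) / s
               for v, m, s in zip(row, model["means"], model["stds"]))


def train(rows, fraction=0.05):
    """Fit a toy detector: per-column mean/std plus a score threshold."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Guard against zero spread so the division in score() is always defined.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    model = {"means": means, "stds": stds}
    scores = sorted((score(model, r) for r in rows), reverse=True)
    n_outliers = max(1, round(fraction * len(rows)))
    # Rows scoring at or above the threshold are flagged as anomalies.
    model["threshold"] = scores[n_outliers - 1]
    return model


def assign(model, rows):
    """Return each row extended with an anomaly flag (0/1) and its score."""
    out = []
    for r in rows:
        s = score(model, r)
        out.append(list(r) + [int(s >= model["threshold"]), round(s, 4)])
    return out


session_id = 15  # hypothetical session ID, matching the file names above
rows = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0], [9.0, 9.0]]

out_dir = Path(tempfile.mkdtemp())
model_path = out_dir / f"Anomaly_Detection_Model_{session_id}.pkl"
data_path = out_dir / f"predicted_data_Assigned_{session_id}.csv"

model = train(rows, fraction=0.2)   # Step 1: train
assigned = assign(model, rows)      # Step 2: assign anomalies

with open(model_path, "wb") as fh:
    pickle.dump(model, fh)
with open(data_path, "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["x1", "x2", "Anomaly", "Anomaly_Score"])
    writer.writerows(assigned)
```

In this sketch only the last row, which lies far from the others, ends up with an anomaly flag of 1.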

Train and tune the fraction parameter with data containing a labeled target column

This section is for training a model when the data is already labeled, and for tuning the fraction parameter of the model.

Fill in the following multiselect boxes.

Select the target column containing labels : Name of the target column containing labels.

Select type of task (Automatically inferred when None): Choose from the list

If Classification:

  • Logistic Regression (Default)

  • K Nearest Neighbour

  • Naive Bayes

  • Decision Tree Classifier

  • SVM - Linear Kernel

  • SVM - Radial Kernel

  • Gaussian Process Classifier

  • Multi Level Perceptron

  • Ridge Classifier

  • Random Forest Classifier

  • Quadratic Discriminant Analysis

  • Ada Boost Classifier

  • Gradient Boosting Classifier

  • Linear Discriminant Analysis

  • Extra Trees Classifier

  • Extreme Gradient Boosting

  • Light Gradient Boosting

  • CatBoost Classifier

If Regression:

  • Linear Regression (Default)

  • Lasso Regression

  • Ridge Regression

  • Elastic Net

  • Least Angle Regression

  • Lasso Least Angle Regression

  • Orthogonal Matching Pursuit

  • Bayesian Ridge

  • Automatic Relevance Determ.

  • Passive Aggressive Regressor

  • Random Sample Consensus

  • TheilSen Regressor

  • Huber Regressor

  • Kernel Ridge

  • Support Vector Machine

  • K Neighbors Regressor

  • Decision Tree

  • Random Forest

  • Extra Trees Regressor

  • AdaBoost Regressor

  • Gradient Boosting

  • Multi Level Perceptron

  • Extreme Gradient Boosting

  • Light Gradient Boosting

  • CatBoost Regressor

Select the evaluation metric : For Classification tasks: Accuracy, AUC, Recall, Precision, F1, Kappa (default = ‘Accuracy’). For Regression tasks: MAE, MSE, RMSE, R2, RMSLE, MAPE (default = ‘R2’).
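As a reminder of what some of these metrics measure, here is a small sketch that computes a few of them from scratch. It is an illustration only, not the tool's own implementation; the function names are hypothetical, and it covers just Accuracy, Precision, Recall, and F1 for binary labels, plus MAE, MSE, RMSE, and R2.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "Accuracy": (tp + tn) / len(pairs),
        "Precision": precision,
        "Recall": recall,
        "F1": f1,
    }


def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE and R2 for numeric targets."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R2 compares the residual error with the spread of the true values.
    r2 = 1 - (mse * n) / ss_tot if ss_tot else 0.0
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": r2,
    }


clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Higher is better for Accuracy, Precision, Recall, F1, and R2, while lower is better for the error metrics MAE, MSE, and RMSE.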

Select the method of labeling outliers (default = drop) : When the method is set to drop, outliers are dropped from the training dataset. When it is set to surrogate, the decision function and the anomaly label are used as features during training.
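The difference between the two methods can be sketched as follows. This is a minimal illustration under assumptions: the function name and arguments are hypothetical, `anomaly_labels` is taken to be 1 for an outlier and 0 otherwise, and `anomaly_scores` stands for the value of the decision function for each row.

```python
def prepare_training_data(features, target, anomaly_labels, anomaly_scores,
                          method="drop"):
    """Build the supervised training set according to the chosen method."""
    if method == "drop":
        # Keep only the rows that were NOT flagged as outliers.
        kept = [i for i, lab in enumerate(anomaly_labels) if lab == 0]
        X = [list(features[i]) for i in kept]
        y = [target[i] for i in kept]
        return X, y
    if method == "surrogate":
        # Keep every row and append the label and score as extra features.
        X = [list(f) + [lab, sc]
             for f, lab, sc in zip(features, anomaly_labels, anomaly_scores)]
        return X, list(target)
    raise ValueError("method must be 'drop' or 'surrogate'")


features = [[1.0], [2.0], [50.0]]
target = [0, 0, 1]
anomaly_labels = [0, 0, 1]
anomaly_scores = [0.1, 0.2, 0.9]

X_drop, y_drop = prepare_training_data(
    features, target, anomaly_labels, anomaly_scores, method="drop")
X_sur, y_sur = prepare_training_data(
    features, target, anomaly_labels, anomaly_scores, method="surrogate")
```

With drop, the training set shrinks as the fraction grows; with surrogate, its size stays the same but each row gains two columns.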

Select the number of folds to be used in cross validation : Number of folds to be used in K-fold CV. Must be at least 2.
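To show how the fold count and the fraction parameter could interact, here is a rough sketch of K-fold splitting combined with a search over candidate fractions. It is an assumption about the general approach, not the tool's actual code: `kfold_indices`, `tune_fraction`, and the `evaluate` callback (which would train on one split and return the chosen metric on the other) are hypothetical names.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for each of the K folds."""
    if n_folds < 2:
        raise ValueError("n_folds must be at least 2")
    if n_folds > n_samples:
        raise ValueError("n_folds cannot exceed the number of samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def tune_fraction(candidates, n_samples, n_folds, evaluate):
    """Return the fraction with the best mean score across the folds.

    evaluate(fraction, train, test) is a hypothetical callback that returns
    a score where higher is better.
    """
    best_fraction, best_score = None, float("-inf")
    for fraction in candidates:
        scores = [evaluate(fraction, train, test)
                  for train, test in kfold_indices(n_samples, n_folds)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_fraction, best_score = fraction, mean_score
    return best_fraction, best_score


folds = list(kfold_indices(5, 2))

# Dummy scorer for demonstration only: it peaks at a fraction of 0.05.
best, best_score = tune_fraction(
    [0.01, 0.05, 0.1], n_samples=5, n_folds=2,
    evaluate=lambda fraction, train, test: -abs(fraction - 0.05))
```

A single fold would leave no data to train on once the test part is held out, which is why at least 2 folds are required.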

Click the Step 1 : Tune the fraction parameter and Evaluate button to get the following output:

Click Step 2 : Assign anomalies to assign anomalies to the data with the new tuned model. The output will look like this:

  • The model will be saved as Anomaly_Detection_Model_16_tuned.pkl

  • Assigned data will be saved under the file name predicted_data_16_tuned.csv

The session ID is 16 in this case.
