AutoML

After exploring your data, you can use the AutoML module to perform your machine learning task in the following three steps:

Step 1 : Build

Before you start training your model, set the execution time to either default or custom. With the default option, the app does all the processing needed and builds the best model for you, but this takes time depending on the size of your data. With the custom option, you specify how long the app may spend building your model (minimum 3 minutes). Note that with this option you may or may not get the best model, because the app returns the best model it built within the time you specified.

When you click the 'build your model' button, a progress bar appears and shows the processing steps of training the model. When training is done, the model is saved under the file name machine learning task_Model_sessionID.pkl (for example, Clustering_Model_1.pkl). You can now start evaluating it.

  • In custom mode, the app will go through a lot of processing, and it may take time depending on the size of your data, so be patient and wait until all the processing is complete to get the best possible model.

  • Change the session ID each time you want to retrain your model within the same experiment.
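The file naming pattern described in this step can be sketched as a small helper. This is only an illustration: the function `model_filename` and its parameters are hypothetical names and are not part of the app.

```python
def model_filename(task: str, session_id: int) -> str:
    """Build a file name following the pattern <task>_Model_<sessionID>.pkl.

    Hypothetical helper for illustration; the app's own code may differ.
    """
    return f"{task}_Model_{session_id}.pkl"


# A different session ID gives a different file name for the retrained model.
print(model_filename("Clustering", 1))  # Clustering_Model_1.pkl
print(model_filename("Clustering", 2))  # Clustering_Model_2.pkl
```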

Step 2 : Evaluate

In this step, you have a list of evaluation metrics to choose from.

Available metrics for Classification:

  • Accuracy

  • AUC (Area Under the Curve; shown as 0.0000 for estimators that do not support ‘predict_proba’)

  • Recall

  • Precision

  • F1

  • Kappa
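To make the classification metrics above more concrete, here is a minimal sketch that computes most of them by hand for binary labels. It uses the standard textbook definitions, not the app's internal code; the function name `binary_classification_metrics` is hypothetical. AUC is left out because it needs predicted probabilities rather than predicted labels.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Compute Accuracy, Recall, Precision, F1 and Kappa for binary labels.

    Hypothetical helper using textbook definitions; not the app's own code.
    """
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    # Confusion matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / n
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return {"Accuracy": accuracy, "Recall": recall, "Precision": precision,
            "F1": f1, "Kappa": kappa}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_classification_metrics(y_true, y_pred))
```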

Available metrics for Regression:

  • R2

  • MAE

  • MSE

  • RMSE

  • RMSLE

  • MAPE
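The regression metrics above can be sketched in the same way. The code below uses the standard definitions and is not taken from the app; the function name `regression_metrics` is hypothetical. It assumes non-negative values for RMSLE and no zeros among the true values for MAPE, and it returns MAPE as a fraction, which may differ from how the app displays it.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R2, MAE, MSE, RMSE, RMSLE and MAPE from two equal-length lists.

    Hypothetical helper using textbook definitions; not the app's own code.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # RMSLE compares log(1 + value); assumes all values are non-negative.
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2
            for t, p in zip(y_true, y_pred)) / n
    )

    # MAPE as a fraction; assumes no true value is zero.
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # R2 = 1 - residual sum of squares / total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"R2": r2, "MAE": mae, "MSE": mse,
            "RMSE": rmse, "RMSLE": rmsle, "MAPE": mape}


print(regression_metrics([3.0, 5.0, 2.0, 7.0], [2.0, 5.0, 4.0, 8.0]))
```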

For a better understanding of each metric and how it works, please refer to this tutorial.

Select the appropriate evaluation metric and click the 'Evaluate your model' button. You will get a dataframe table showing the score of each metric, with the value of the metric you chose highlighted in yellow.

You can also evaluate your model visually, using the following plots:

  • For Classification : auc, confusion matrix, threshold, precision_recall

  • For Regression : residuals, error, cooks, learning

  • For Clustering : cluster, tsne

  • For Anomaly_Detection : tsne, umap

Step 3 : Predict

After the evaluation step is done and you are satisfied with the outcome, you can proceed to step 3 to make predictions. The app lets you choose between the model you just built and a pre-existing model that was built earlier. For a pre-existing model, enter the name of the model without the .pkl extension; the model file must be saved in your working directory.
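The rule for naming a pre-existing model can be illustrated with a short sketch. The helper `resolve_model_path` is hypothetical and only mirrors the two conditions stated above (name entered without .pkl, file located in the working directory); how the app actually locates and loads the file is not documented here.

```python
import tempfile
from pathlib import Path


def resolve_model_path(name: str, workdir: str = ".") -> Path:
    """Return the path of <name>.pkl inside the working directory.

    Hypothetical helper for illustration; not the app's own code.
    """
    if name.lower().endswith(".pkl"):
        raise ValueError("Enter the model name without the .pkl extension")
    path = Path(workdir) / f"{name}.pkl"
    if not path.is_file():
        raise FileNotFoundError(f"{path} was not found in the working directory")
    return path


# Demonstration with a temporary directory standing in for the working directory.
with tempfile.TemporaryDirectory() as workdir:
    (Path(workdir) / "Clustering_Model_1.pkl").write_bytes(b"")
    found = resolve_model_path("Clustering_Model_1", workdir)
    print(found.name)  # Clustering_Model_1.pkl
```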

Next, choose whether to predict a single observation or multiple observations. For a single observation, you have one instance whose target you want to predict: fill in all the features of that instance manually, check that your entries are correct, and click predict. For multiple observations, drag and drop the file that contains the observations you want to predict; the file must be a CSV or Excel file.

  • In both cases, for Classification and Regression, the target column must be removed from your data before predicting.

  • Your entries must have the same dtypes as the original data.

  • You must respect the order of the 3 steps (step 1, then step 2, then step 3).
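Before uploading a file with multiple observations, the first two requirements above can be checked with a few lines of code. The sketch below is an assumption-laden example, not part of the app: the function `prepare_rows`, the column names, and the `feature_types` mapping are all hypothetical.

```python
import csv
import io


def prepare_rows(csv_text, target, feature_types):
    """Drop the target column and cast each feature to its original type.

    feature_types maps each feature name to a Python type (int, float, str).
    Hypothetical helper for illustration; not the app's own code.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    prepared = []
    for row in reader:
        # The target must not be present when predicting.
        row.pop(target, None)
        missing = set(feature_types) - set(row)
        if missing:
            raise ValueError(f"Missing feature columns: {sorted(missing)}")
        # Casting raises ValueError if an entry does not match the expected type.
        prepared.append({name: cast(row[name])
                         for name, cast in feature_types.items()})
    return prepared


# Hypothetical data: 'churn' is the target, 'age' and 'income' are features.
csv_text = "age,income,churn\n34,52000.5,1\n51,61000,0\n"
rows = prepare_rows(csv_text, target="churn",
                    feature_types={"age": int, "income": float})
print(rows)
```

Casting each value up front surfaces a mistyped entry before prediction rather than during it.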

When you click predict, the app processes the prediction and shows a dataframe table with the predicted label and the probability score.

Single observation

Multiple observations

When performing prediction for Anomaly_Detection on a single observation, the result is Yes if the observation is an anomaly and No if it is not.

When performing prediction for Clustering on a single observation, the prediction is the cluster the observation is assigned to, according to the number of clusters you specified or the number that was generated automatically.

In the case of multiple observations, you get the same outcome, except that you now have a dataframe table with a prediction for each instance.

Predicted data will be saved in your working directory with the name predicted_data_sessionID.csv.
