Continuing AutoML runs

Sometimes it is useful to do a smaller run first and inspect its results before committing to a more thorough optimization run. The computation spent on the first run need not be lost: the next run can continue from where the previous one stopped. Note: a cluster with 10 workers was started before running the steps below.

In [1]:
from techila_ml import find_best_model
from techila_ml.stats_vis_funs import plot_res_figs
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_breast_cancer

data = {}
data['X_train'], data['y_train'] = load_breast_cancer(return_X_y=True)

n_jobs = 10  # number of Techila jobs
n_iterations = 50
Techila Python module using JPyPe
In [2]:
res = find_best_model(
    n_jobs,
    n_iterations,
    data,
    task='classification',
    optimization={
        'score_f': "roc_auc",
    },
)
settingloglevel
configs evaluated: 100%|██████████| 50/50 [08:28], best=-0.9951 @iter 43
In [3]:
plot_res_figs(res, "run1")
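
The progress line above reports the best score as a negative number. One plausible reading (an assumption, not confirmed by this notebook) is that the search minimizes the negated ROC AUC. As a reminder of what the metric measures, here is a minimal pure-Python sketch of ROC AUC as the fraction of (positive, negative) pairs that are ranked correctly, with ties counting as half. The function name roc_auc and the toy labels and scores are illustrative only and are not part of techila_ml.

```python
def roc_auc(y_true, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count pairs where the positive sample scores higher; ties count as 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, not taken from the run above.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, scores)
loss = -auc  # a minimizer would see the negated score (assumption)
print(auc, loss)  # 0.75 -0.75
```

The pairwise form is quadratic in the number of samples, so it only suits small examples; in practice sklearn.metrics.roc_auc_score would normally be used.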

Based on the results, we decide to drop XGBoost and CatBoost (only so that the example finishes quickly). We continue with these new choices for another 200 iterations by passing the first result as the starting point in previous_results. Note: the iteration counter now starts from 50, because the history of the previous run is carried over.

In [4]:
n_iterations2 = 200
res2 = find_best_model(
    n_jobs,
    n_iterations2,
    data,
    task='classification',
    exclude_models=['xgboost', 'catboost'],
    optimization={
        'score_f': "roc_auc",
    },
    previous_results=res
)
settingloglevel
configs evaluated: 100%|██████████| 250/250 [01:34], best=-0.9977 @iter 217
In [5]:
plot_res_figs(res2, "run2")
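
To illustrate why the counter reached 250 rather than 200, the following toy sketch shows the general idea of resuming a search from an earlier history. It is not how techila_ml is implemented: random_search, its "history" and "best" keys, and the quadratic objective are all hypothetical names chosen for this illustration, and a plain random search stands in for the real optimizer.

```python
import random


def random_search(objective, n_iterations, previous_results=None, seed=0):
    """Toy resumable random search (illustration only, not techila_ml code)."""
    # Start from the earlier history, if any, so past evaluations are kept.
    history = list(previous_results["history"]) if previous_results else []
    start = len(history)
    rng = random.Random(seed + start)
    for i in range(start, start + n_iterations):
        x = rng.uniform(-5.0, 5.0)  # sample one candidate configuration
        history.append((i, x, objective(x)))
    # Lower objective values are better in this sketch.
    best = min(history, key=lambda record: record[2])
    return {"history": history, "best": best}


def objective(x):
    return (x - 1.0) ** 2


run1 = random_search(objective, 50)
run2 = random_search(objective, 200, previous_results=run1)

print(len(run1["history"]), len(run2["history"]))  # 50 250
print(run2["history"][50][0])  # 50: first iteration index of the second run
```

Because the second run's history contains every evaluation from the first, its best result can never be worse than the first run's best.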

We can then use the best model (or pipeline) found to generate predictions, just as with ordinary scikit-learn models:

In [6]:
res2['best_model'].predict(data['X_train'][0:100,:])
Out[6]:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0,
       1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0,
       1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0])
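
A natural next step is to compare such predictions with the known labels. The sketch below computes plain accuracy in pure Python. The helper name accuracy and the short label lists are illustrative and do not come from the run above. Note that the predictions above were made on the training data, so any score computed from them would be optimistic; a held-out set gives a fairer estimate.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels and predictions, not taken from the run above.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_pred = [0, 0, 1, 0, 0, 1, 1, 1]

acc = accuracy(y_true, y_pred)
print(acc)  # 0.75
```

With the real objects, the same helper could be called as accuracy(list(data['y_train'][:100]), list(predictions)), assuming predictions holds the array returned by predict.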