Using a custom scoring function

In this example we will use a user-defined scoring function in find_best_model. To set the scene, let's say we're working with an application that requires fast predictions (under about 50 ms each). To restrict the search space, we consider only three classifiers in this example. The example dataset is credit-g from OpenML.

First we'll do a "normal" run (using a standard scoring function) and investigate how fast the returned models are at predicting.

In [1]:
from techila_ml import find_best_model
from techila_ml.datasets import openml_dataset_loader
import matplotlib.pylab as plt
import numpy as np


# load OpenML credit-g dataset
data = openml_dataset_loader({'openml_dataset_id': 31, 'target_variable': 'class'})

n_jobs = 10  # number of Techila jobs
n_iterations = 50
Techila Python module using JPyPe

Note: A cluster with 10 workers was started before running the next steps.

In [2]:
res = find_best_model(
    n_jobs,
    n_iterations,
    data,
    task='classification',
    optimization={
        'score_f': "roc_auc",
        'study_auto_stopping': False,
        'auto_stopping': False,
        'model_burnin': 10,  # configs to sample randomly before optimizer starts to guide the process
    },
    models=['sgd', 'extratrees', 'randomforest'],
)
settingloglevel
configs evaluated: 100%|██████████| 50/50 [03:30], best=-0.7962 @iter 18

We can use the returned stats to check how long a single prediction takes for the model trained in each iteration:

In [3]:
pred_times = [j.perf_timers['t_pred'] for j in res['history']['jobs']]
plt.plot(sorted(pred_times))
Out[3]:
[<matplotlib.lines.Line2D at 0x7ff89058ff28>]
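
To put a number on the plot, one could also compute the share of evaluated models that already meet the latency budget. The sketch below is self-contained: `fraction_within_budget` is a hypothetical helper, and `example_times` holds made-up timings standing in for the `pred_times` list above.

```python
def fraction_within_budget(times, budget=0.05):
    """Return the fraction of prediction times (in seconds) at or below the budget."""
    if not times:
        raise ValueError("times must not be empty")
    return sum(1 for t in times if t <= budget) / len(times)


# made-up timings in seconds, standing in for pred_times from the run above
example_times = [0.004, 0.012, 0.03, 0.08, 0.11]
share = fraction_within_budget(example_times)
print(f"{share:.0%} of the example models predict within 50 ms")
```

With real results, passing `pred_times` instead of `example_times` would give the corresponding share for the run.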

We want a model that performs well enough and is fast in producing predictions (< 50 ms). We'll formulate a simple ad hoc curve, exp(-20t) with t the prediction time in seconds, and multiply the scores by it so that slower models and configs get scaled down.

In [4]:
t = np.arange(0.001, 1, step=0.01)
plt.plot(t, np.exp(-20*t))
plt.axvline(0.05, ls='--', c='k')
Out[4]:
<matplotlib.lines.Line2D at 0x7ff89061aa58>
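
To make the curve concrete, the multiplier can be evaluated at a few prediction times. This is only a worked illustration of the exp(-20t) curve plotted above; the helper name `time_penalty` is ours and is not part of techila_ml.

```python
import math


def time_penalty(t, rate=20.0):
    """Multiplier applied to the score for a prediction time t given in seconds."""
    return math.exp(-rate * t)


# evaluate the multiplier at 10 ms, 15 ms and the 50 ms limit
for t in (0.010, 0.015, 0.050):
    print(f"t = {t * 1000:.0f} ms -> multiplier {time_penalty(t):.3f}")
```

A model right at the 50 ms limit keeps about 37% of its score, while one at 10 ms keeps about 82%, so the curve penalizes gradually rather than acting as a hard cutoff.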

Then we write a simple custom scoring function that applies this prediction time "punisher" and run find_best_model with it. For monitoring, we also pass the "vanilla" 'roc_auc' as an extra evaluation function in eval_funs. This means its value is computed alongside the scoring function but has no effect on the optimization.

In [5]:
from sklearn.metrics import roc_auc_score
import time

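def score_with_time_punisher(clf, X, y):
    """Return ROC AUC scaled by exp(-20 * t), where t is the mean single-row prediction time (s)."""
    # time single-row predictions on at most 100 rows
    N = min(100, X.shape[0])
    # note: process_time counts CPU time of the process, not wall-clock time
    t0 = time.process_time()
    for i in range(N):
        _ = clf.predict(X.iloc[[i], :])  # uses .iloc, so X must be a pandas DataFrame
    t = (time.process_time() - t0) / N  # mean time per prediction
    # AUC from the predicted probability of the positive class
    ypred = clf.predict_proba(X)[:, 1]
    s = roc_auc_score(y, ypred)
    return s * np.exp(-20*t)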
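
Note that `time.process_time` counts the CPU time of the current process, which can differ from the wall-clock latency a caller experiences (for example, it excludes time spent sleeping). If wall-clock latency is what matters, a timing helper based on `time.perf_counter` is one alternative. The sketch below is an illustration only and not part of techila_ml: `mean_call_latency` and `dummy_predict` are hypothetical names, and the dummy function stands in for `clf.predict`.

```python
import time


def mean_call_latency(predict, rows, n_max=100):
    """Mean wall-clock time in seconds of calling predict on single rows."""
    n = min(n_max, len(rows))
    if n == 0:
        raise ValueError("rows must not be empty")
    t0 = time.perf_counter()
    for row in rows[:n]:
        predict(row)
    return (time.perf_counter() - t0) / n


def dummy_predict(row):
    """Hypothetical stand-in for clf.predict: sleeps about 1 ms per call."""
    time.sleep(0.001)
    return 0


latency = mean_call_latency(dummy_predict, [[0.0]] * 20)
print(f"mean latency: {latency * 1000:.2f} ms")
```

Here the sleep shows up in the measured latency, whereas a CPU-time clock would largely ignore it.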
In [6]:
n_iterations = 200

res2 = find_best_model(
    n_jobs,
    n_iterations,
    data,
    task='classification',
    optimization={
        'score_f': score_with_time_punisher,
        # the next line is not really needed with the given models but this ensures that each model can
        # produce probability estimates (has predict_proba method)
        'score_f_args': {'needs_proba': True},
        'study_auto_stopping': False,
        'auto_stopping': False,
        'model_burnin': 10,
        'eval_funs': ['roc_auc'],
    },
    models=['sgd', 'extratrees', 'randomforest'],
)
settingloglevel
configs evaluated: 100%|██████████| 200/200 [04:30], best=-0.7378 @iter 132

Checking the results shows that we are mostly exploring configs and models that give predictions in about 10-15 ms, well under the 50 ms limit, so our scoring function seems to have done quite a good job in that sense. Using the "monitored" roc_auc we can also see the effect of prediction time: for some slow cases the negated loss (the penalized score) ends up much lower than the AUC.

In [11]:
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(20, 15))

pred_times = [j.perf_timers['t_pred'] for j in res2['history']['jobs']]
ax1.plot(pred_times)
ax1.set_xlabel('iteration')
ax1.set_ylabel('prediction time (sec)')

losses = [j.loss for j in res2['history']['jobs']]
ax2.plot(losses)
ax2.set_xlabel('iteration')
ax2.set_ylabel('loss')

ax3.plot(pred_times, losses, '.')
ax3.set_xlabel('prediction time (sec)')
ax3.set_ylabel('loss')

roc_aucs = [j.extras['eval_funs']['roc_auc'] for j in res2['history']['jobs']]
ax4.plot(-np.array(losses), roc_aucs, '.')
ax4.set_xlabel('neg loss')
ax4.set_ylabel('AUC')

plt.show()
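
Since the custom score is the AUC multiplied by exp(-20t), the gap between the negated loss and the AUC in the last panel can be turned back into the time penalty that caused it. The sketch below assumes that the loss is exactly the negated custom score and that both values were computed on the same data; neither is confirmed by the run above, and `implied_penalty_time` is a hypothetical helper.

```python
import math


def implied_penalty_time(loss, auc, rate=20.0):
    """Solve -loss = auc * exp(-rate * t) for t, in seconds."""
    score = -loss
    if not 0 < score <= auc:
        raise ValueError("expected 0 < -loss <= auc")
    return -math.log(score / auc) / rate


# made-up values: an AUC of 0.80 penalized at t = 50 ms
example_loss = -0.80 * math.exp(-20.0 * 0.05)
implied = implied_penalty_time(example_loss, 0.80)
print(f"implied time: {implied * 1000:.1f} ms")
```

The time used inside the scoring function is measured separately from the `t_pred` timer, so the implied value need not match `pred_times` exactly.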