Custom loss functions
LightGBM custom loss examples:
https://www.cnblogs.com/fujian-code/p/9804129.html
https://github.com/manifoldai/mf-eng-public/blob/master/notebooks/custom_loss_lightgbm.ipynb
In case the links go dead, the content is reproduced below.
%load_ext autoreload
%autoreload 2
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from lightgbm import LGBMRegressor
import lightgbm
from sklearn.datasets import make_friedman2, make_friedman1, make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
import seaborn as sns; sns.set()
from sklearn.metrics import mean_absolute_error, mean_squared_error
sns.set_style("whitegrid", {'axes.grid' : False})
Simulating the Friedman dataset
About the dataset
Inputs X are independent features uniformly distributed on the interval [0, 1]. The output y is created according to the formula:
y(X) = 10 * sin(pi * X[:, 0] * X[:, 1]) + 20 * (X[:, 2] - 0.5) ** 2 + 10 * X[:, 3] + 5 * X[:, 4] + noise * N(0, 1).
Out of the n_features features, only 5 are actually used to compute y. The remaining features are independent of y.
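To make the formula concrete, here is a minimal NumPy sketch that evaluates it directly. The helper name friedman1_target is hypothetical (it is not part of scikit-learn), and the noise term is left out because the experiments below use noise=0.0.

```python
import numpy as np

def friedman1_target(X):
    """Noise-free Friedman #1 target; only the first five columns of X are used.

    Hypothetical helper for illustration, not a scikit-learn function.
    """
    return (10 * np.sin(np.pi * X[:, 0] * X[:, 1])
            + 20 * (X[:, 2] - 0.5) ** 2
            + 10 * X[:, 3]
            + 5 * X[:, 4])

rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(1000, 7))   # 7 features, uniform on [0, 1]
y_demo = friedman1_target(X_demo)

# Replacing the two unused columns should leave the target unchanged.
X_changed = X_demo.copy()
X_changed[:, 5:] = rng.uniform(size=(1000, 2))
y_changed = friedman1_target(X_changed)
print(np.allclose(y_demo, y_changed))
```

Each term of the noise-free formula is non-negative and the terms are bounded by 10, 5, 10 and 5 respectively, so the target always lies in [0, 30], which is consistent with the min and max reported below.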
# simulating 10,000 data points with 7 uniformly distributed features: 5 informative and 2 useless
X, y = make_friedman1(n_samples=10000, n_features=7, noise=0.0, random_state=11)
min(y), max(y)
(0.3545368892371061, 28.516918961287963)
# distribution of target variable
h = plt.hist(y)
# train-validation split
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.20, random_state=42)
# test set for generalization of scores
X_test, y_test = make_friedman1(n_samples=5000, n_features=7, noise=0.0, random_state=21)
### Plotting helper functions
def plot_residual_distribution(model):
    """
    Density plot of residuals (y_true - y_pred) on the test set for the given model
    """
    ax = sns.distplot(y_test - model.predict(X_test), hist=False, kde=True,
                      kde_kws={'shade': True, 'linewidth': 3}, axlabel="Residual")
    title = ax.set_title('Kernel density of residuals', size=15)
def plot_scatter_pred_actual(model):
    """
    Scatter plot of predictions from the given model vs. the true target variable from the test set
    """
    ax = sns.scatterplot(x=model.predict(X_test), y=y_test)
    ax.set_xlabel('Predictions')
    ax.set_ylabel('Actuals')
    title = ax.set_title('Actual vs Prediction scatter plot', size=15)
Random Forest
# basic random forest regressor with mse as criterion to measure the quality of split
rf = RandomForestRegressor(n_estimators=50, oob_score=True, random_state=33)
rf.fit(X_train, y_train)
plot_residual_distribution(rf)
plot_scatter_pred_actual(rf)
print(f"MSE is {mean_squared_error(y_test, rf.predict(X_test))}")
MSE is 1.0925877452294468
Default LightGBM
LightGBM default: MSE
# make new model on new value
gbm = lightgbm.LGBMRegressor(random_state=33)
gbm.fit(X_train,y_train)
print(f"MSE is {mean_squared_error(y_test, gbm.predict(X_test))}")
MSE is 0.2362458093307746
We see that LightGBM performs better than the random forest model on test MSE (0.236 vs. 1.093).
LightGBM default: MSE + early stopping
# make new model on new value
# 'regression' is actually also the default objective for LGBMRegressor
gbm2 = lightgbm.LGBMRegressor(objective='regression',
                              random_state=33,
                              early_stopping_rounds=10,
                              n_estimators=10000
                              )
gbm2.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric='l2',  # also the default
    verbose=False,
)
This is essentially the default model fitted in the previous section, except that we now pass eval_set. With a validation set available, we can use early_stopping_rounds: the model is allowed many more boosting iterations (n_estimators=10000), training stops once the validation metric has not improved for 10 consecutive rounds, and the best score on the eval set under the given metric is recorded. The extra iterations improve model performance.
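The stopping rule itself can be sketched in a few lines of plain Python. This is only an illustration of the idea of patience on a validation loss, not LightGBM's actual implementation; the function name find_stopping_point and the toy loss values are made up for the example.

```python
def find_stopping_point(validation_losses, patience):
    """Return (best_round, stop_round) for a sequence of validation losses.

    Lower is better. Training stops at the first round that is `patience`
    rounds past the best round seen so far; if that never happens, the last
    round is returned as the stop round. Illustrative only.
    """
    best_round = 0
    best_loss = float("inf")
    for round_idx, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss, best_round = loss, round_idx
        elif round_idx - best_round >= patience:
            return best_round, round_idx
    return best_round, len(validation_losses) - 1

# Toy validation curve: improves until round 2, then stalls.
toy_losses = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 2.9]
best_round, stop_round = find_stopping_point(toy_losses, patience=3)
print(best_round, stop_round)
```

Note that with a patience of 3 the toy run stops before it ever sees the later, better value 2.9. A larger patience trades extra training time for a lower risk of stopping too early.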
print(f"MSE is {mean_squared_error(y_test, gbm2.predict(X_test))}")
MSE is 0.13763903255048504
The score improves because the model is able to run for more boosting iterations.
Asymmetric Custom Loss
There are two kinds of parameters that define the training process in gradient-boosted tree models and that we might be interested in: the objective and the evaluation metric. LightGBM exposes them as follows.
In LightGBM training API:
fobj: customized objective function
feval: customized evaluation function; basically a way to use a custom metric for cv, used in addition to metric
metric: a metric to be monitored during cross-validation (hyperparameters are selected to minimize or maximize it); more than one can be given
In sklearn wrapper around LightGBM API:
objective: parameter of the model constructor, which has a default value
eval_metric: parameter of model.fit()
I am going to use the sklearn wrapper to set the objective and evaluation metric, but the two APIs are essentially equivalent.
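The main practical difference between the two APIs is the call signature of the custom functions: the sklearn wrapper calls an objective as (y_true, y_pred), whereas the training API passes (preds, train_data) and the labels are read from the dataset with get_label(). The sketch below shows one way to adapt the first form to the second. The names to_native_objective and FakeDataset are hypothetical; FakeDataset only stands in for lightgbm.Dataset so that the example runs without LightGBM installed. How the callable is handed to the training API (fobj or otherwise) depends on the LightGBM version, so check the documentation for the release you use.

```python
import numpy as np

def sklearn_style_objective(y_true, y_pred):
    """Plain squared error in the (y_true, y_pred) form used by the sklearn wrapper."""
    residual = y_true - y_pred
    grad = -2.0 * residual
    hess = np.full_like(residual, 2.0)
    return grad, hess

def to_native_objective(sk_objective):
    """Wrap a (y_true, y_pred) objective into the (preds, train_data) form.

    Hypothetical adapter for illustration.
    """
    def native_objective(preds, train_data):
        return sk_objective(train_data.get_label(), preds)
    return native_objective

class FakeDataset:
    """Stand-in for lightgbm.Dataset; it only provides get_label()."""
    def __init__(self, label):
        self._label = np.asarray(label, dtype=float)

    def get_label(self):
        return self._label

native_objective = to_native_objective(sklearn_style_objective)
demo_grad, demo_hess = native_objective(np.array([1.0, 2.0]), FakeDataset([1.5, 1.0]))
print(demo_grad, demo_hess)
```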
Let’s say that we don’t want our model to overpredict, but we are fine with underpredictions.
We can make a custom loss which gives 10 times more penalty when the true targets are less than predictions as compared to when true targets are more.
def custom_asymmetric_objective(y_true, y_pred):
    residual = (y_true - y_pred).astype("float")
    grad = np.where(residual<0, -2*10.0*residual, -2*residual)
    hess = np.where(residual<0, 2*10.0, 2.0)
    return grad, hess
def custom_asymmetric_eval(y_true, y_pred):
    residual = (y_true - y_pred).astype("float")
    loss = np.where(residual < 0, (residual**2)*10.0, residual**2)
    return "custom_asymmetric_eval", np.mean(loss), False
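A quick way to convince ourselves that the gradient and Hessian above match the loss is a finite-difference check. The sketch below restates the loss and its derivatives so that it runs on its own; the penalty argument is an addition for illustration (the original hard-codes 10.0), and the test points are chosen away from the kink at residual = 0, where the second derivative is not defined.

```python
import numpy as np

def asymmetric_loss(y_true, y_pred, penalty=10.0):
    """Per-sample loss: penalty * r**2 if r < 0 (overprediction), else r**2."""
    residual = y_true - y_pred
    return np.where(residual < 0, penalty * residual**2, residual**2)

def asymmetric_grad_hess(y_true, y_pred, penalty=10.0):
    """First and second derivatives of the loss with respect to y_pred."""
    residual = y_true - y_pred
    grad = np.where(residual < 0, -2 * penalty * residual, -2 * residual)
    hess = np.where(residual < 0, 2 * penalty, 2.0)
    return grad, hess

y_check = np.zeros(6)
pred_check = np.array([-3.0, -1.0, -0.5, 0.5, 1.0, 3.0])

# Central finite difference of the loss with respect to the prediction.
eps = 1e-6
numeric_grad = (asymmetric_loss(y_check, pred_check + eps)
                - asymmetric_loss(y_check, pred_check - eps)) / (2 * eps)

analytic_grad, analytic_hess = asymmetric_grad_hess(y_check, pred_check)
print(np.allclose(analytic_grad, numeric_grad, atol=1e-4))
```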
Exploring our custom loss function with some plots
# let's see how our custom loss function looks with respect to different prediction values
y_true = np.repeat(0,1000)
y_pred = np.linspace(-100,100,1000)
residual = (y_true - y_pred).astype("float")
custom_loss = np.where(residual < 0, (residual**2)*10.0, residual**2)
fig, ax = plt.subplots(1,1, figsize=(8,4))
sns.lineplot(x=y_pred, y=custom_loss, alpha=1, label="asymmetric mse")
sns.lineplot(x=y_pred, y=residual**2, alpha=0.5, label="symmetric mse", color="red")
ax.set_xlabel("Predictions")
ax.set_ylabel("Loss value")
fig.tight_layout()
grad, hess = custom_asymmetric_objective(y_true, y_pred)
fig, ax = plt.subplots(1,1, figsize=(8,4))
# ax.plot(y_hat, errors)
ax.plot(y_pred, grad)
ax.plot(y_pred, hess)
ax.legend(('gradient', 'hessian'))
ax.set_xlabel('Predictions')
ax.set_ylabel('first or second derivatives')
fig.tight_layout()
The gradient of the custom loss behaves as expected: its slope is 10 times steeper when the residual is negative (the model overpredicts) than when it is positive.
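One consequence of the steeper gradient on the overprediction side is that a model trained on this loss is pulled toward lower predictions. A small sketch with a toy target illustrates this: the best constant prediction under the asymmetric loss lies well below the mean, which is the best constant under ordinary MSE. The toy values, the grid search, and the helper name mean_asymmetric_loss are assumptions made for the illustration.

```python
import numpy as np

def mean_asymmetric_loss(y_true, constant, penalty=10.0):
    """Mean asymmetric squared error when predicting the same constant for every sample."""
    residual = y_true - constant
    return np.mean(np.where(residual < 0, penalty * residual**2, residual**2))

y_toy = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

# Brute-force search over candidate constants between min(y) and max(y).
candidates = np.linspace(0.0, 4.0, 4001)
losses = [mean_asymmetric_loss(y_toy, c) for c in candidates]
best_constant = candidates[int(np.argmin(losses))]

print(best_constant, y_toy.mean())
```

Setting the derivative of the loss to zero for a constant between 0 and 1 gives 10/14, which is about 0.71, compared with a mean of 2.0. The same downward shift should show up in the residual plots further below as residuals that are mostly positive.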
LightGBM custom objective
# make new model on new value
gbm3 = lightgbm.LGBMRegressor(random_state=33)
gbm3.set_params(**{'objective': custom_asymmetric_objective}, metrics = ["mse", 'mae'])
gbm3.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric='l2',
    verbose=False,
)
LightGBM with early stopping: custom eval_metric
# make new model on new value
gbm4 = lightgbm.LGBMRegressor(random_state=33,
                              early_stopping_rounds=10,
                              n_estimators=10000
                              )
gbm4.set_params(**{'objective': "regression"}, metrics = ["mse", 'mae'])
gbm4.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=custom_asymmetric_eval,
    verbose=False,
)
LightGBM with early stopping: custom objective
# make new model on new value
gbm5 = lightgbm.LGBMRegressor(random_state=33,
                              early_stopping_rounds=10,
                              n_estimators=10000
                              )
gbm5.set_params(**{'objective': custom_asymmetric_objective}, metrics = ["mse", 'mae'])
gbm5.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric="l2",
    verbose=False,
)
LightGBM with early stopping: custom eval_metric + objective
# make new model on new value
gbm6 = lightgbm.LGBMRegressor(random_state=33,
                              early_stopping_rounds=10,
                              n_estimators=10000
                              )
gbm6.set_params(**{'objective': custom_asymmetric_objective}, metrics = ["mse", 'mae'])
gbm6.fit(
    X_train,
    y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=custom_asymmetric_eval,
    verbose=False,
)
Reporting scores for different models
Scores table
# asymmetric mse scores
_,loss_rf,_ = custom_asymmetric_eval(y_test, rf.predict(X_test))
_,loss_gbm,_ = custom_asymmetric_eval(y_test, gbm.predict(X_test))
_,loss_gbm2,_ = custom_asymmetric_eval(y_test, gbm2.predict(X_test))
_,loss_gbm3,_ = custom_asymmetric_eval(y_test, gbm3.predict(X_test))
_,loss_gbm4,_ = custom_asymmetric_eval(y_test, gbm4.predict(X_test))
_,loss_gbm5,_ = custom_asymmetric_eval(y_test, gbm5.predict(X_test))
_,loss_gbm6,_ = custom_asymmetric_eval(y_test, gbm6.predict(X_test))
score_dict = {
    'Random Forest default':
        {'asymmetric custom mse (test)': loss_rf,
         'asymmetric custom mse (train)': custom_asymmetric_eval(y_train, rf.predict(X_train))[1],
         'symmetric mse': mean_squared_error(y_test, rf.predict(X_test)),
         '# boosting rounds': '-'},
    'LightGBM default':
        {'asymmetric custom mse (test)': loss_gbm,
         'asymmetric custom mse (train)': custom_asymmetric_eval(y_train, gbm.predict(X_train))[1],
         'symmetric mse': mean_squared_error(y_test, gbm.predict(X_test)),
         '# boosting rounds': gbm.booster_.current_iteration()},
    'LightGBM with custom training loss (no hyperparameter tuning)':
        {'asymmetric custom mse (test)': loss_gbm3,
         'asymmetric custom mse (train)': custom_asymmetric_eval(y_train, gbm3.predict(X_train))[1],
         'symmetric mse': mean_squared_error(y_test, gbm3.predict(X_test)),
         '# boosting rounds': gbm3.booster_.current_iteration()},
    'LightGBM with early stopping':
        {'asymmetric custom mse (test)': loss_gbm2,
         'asymmetric custom mse (train)': custom_asymmetric_eval(y_train, gbm2.predict(X_train))[1],
         'symmetric mse': mean_squared_error(y_test, gbm2.predict(X_test)),
         '# boosting rounds': gbm2.booster_.current_iteration()},
    'LightGBM with early_stopping and custom validation loss':
        {'asymmetric custom mse (test)': loss_gbm4,
         'asymmetric custom mse (train)': custom_asymmetric_eval(y_train, gbm4.predict(X_train))[1],
         'symmetric mse': mean_squared_error(y_test, gbm4.predict(X_test)),
         '# boosting rounds': gbm4.booster_.current_iteration()},
    'LightGBM with early_stopping and custom training loss':
        {'asymmetric custom mse (test)': loss_gbm5,
         'asymmetric custom mse (train)': custom_asymmetric_eval(y_train, gbm5.predict(X_train))[1],
         'symmetric mse': mean_squared_error(y_test, gbm5.predict(X_test)),
         '# boosting rounds': gbm5.booster_.current_iteration()},
    'LightGBM with early_stopping, custom training and custom validation loss':
        {'asymmetric custom mse (test)': loss_gbm6,
         'asymmetric custom mse (train)': custom_asymmetric_eval(y_train, gbm6.predict(X_train))[1],
         'symmetric mse': mean_squared_error(y_test, gbm6.predict(X_test)),
         '# boosting rounds': gbm6.booster_.current_iteration()}
}
pd.DataFrame(score_dict).T
Plots
Density plot to show comparison of LightGBM with symmetric and with asymmetric MSE functions
fig, ax = plt.subplots(figsize=(12,6))
ax = sns.distplot(y_test - gbm.predict(X_test), hist=False, kde=True,
                  kde_kws={'shade': True, 'linewidth': 3}, axlabel="Residual",
                  label="LightGBM with default mse")
ax = sns.distplot(y_test - gbm3.predict(X_test), hist=False, kde=True,
                  kde_kws={'shade': True, 'linewidth': 3}, axlabel="Residual",
                  label="LightGBM with asymmetric mse")
# control x and y limits
ax.set_xlim(-3, 3)
title = ax.set_title('Kernel density plot of residuals', size=15)
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(12,5))
ax1, ax2, ax3 = ax.flatten()
ax1.plot(rf.predict(X_test), y_test, 'o', color='#1c9099')
ax1.set_xlabel('Predictions')
ax1.set_ylabel('Actuals')
ax1.set_title('Random Forest')
ax2.plot(gbm.predict(X_test), y_test, 'o', color='#1c9099')
ax2.set_xlabel('Predictions')
ax2.set_ylabel('Actuals')
ax2.set_title('LightGBM default')
ax3.plot(gbm6.predict(X_test), y_test, 'o', color='#1c9099')
ax3.set_xlabel('Predictions')
ax3.set_ylabel('Actuals')
ax3.set_title('LightGBM with early_stopping, \n custom objective and custom evaluation')
fig.suptitle("Scatter plots of predictions vs. actual targets for different models", y = 1.05, fontsize=15)
fig.tight_layout()
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(12,5))
ax1, ax2, ax3 = ax.flatten()
ax1.hist(y_test - rf.predict(X_test), bins=50, color='#1c9099')
ax1.axvline(x=0, color='black', lw=1.2)  # ymin/ymax are axes fractions; the defaults span the full height
ax1.set_xlabel('Residuals')
ax1.set_title('Random Forest')
ax1.set_ylabel('# observations')
ax2.hist(y_test - gbm.predict(X_test), bins=50, color='#1c9099')
ax2.axvline(x=0, color='black', lw=1.2)  # ymin/ymax are axes fractions; the defaults span the full height
ax2.set_xlabel('Residuals')
ax2.set_ylabel('# observations')
ax2.set_title('LightGBM default')
ax3.hist(y_test - gbm6.predict(X_test), bins=50, color='#1c9099')
ax3.axvline(x=0, color='black', lw=1.2)  # ymin/ymax are axes fractions; the defaults span the full height
ax3.set_xlabel('Residuals')
ax3.set_ylabel('# observations')
ax3.set_title('LightGBM with early_stopping, \n custom objective and custom evaluation')
fig.suptitle("Error histograms of predictions from different models", y = 1.05, fontsize=15)
fig.tight_layout()