FIFA 20: explain default vs tuned model with dalex

imports

In [1]:
import dalex as dx 

import numpy as np
import pandas as pd

from lightgbm import LGBMRegressor
from sklearn.model_selection import train_test_split
from sklearn.model_selection import RandomizedSearchCV

import warnings
warnings.filterwarnings('ignore')

import plotly
plotly.offline.init_notebook_mode()
In [2]:
dx.__version__
Out[2]:
'1.7.0'

load data

Load fifa, the preprocessed players_20 dataset. It contains the 5000 best players by overall rating and 43 columns. These are:

  • short_name (index)
  • nationality of the player (not used in modeling)
  • overall, potential, value_eur, wage_eur (4 potential target variables)
  • age, height, weight, attacking skills, defending skills, goalkeeping skills (37 variables)

It is advised to leave only one target variable for modeling.

In [3]:
data = dx.datasets.load_fifa()
In [4]:
data.head(10)
Out[4]:
nationality overall potential wage_eur value_eur age height_cm weight_kg attacking_crossing attacking_finishing ... mentality_penalties mentality_composure defending_marking defending_standing_tackle defending_sliding_tackle goalkeeping_diving goalkeeping_handling goalkeeping_kicking goalkeeping_positioning goalkeeping_reflexes
short_name
L. Messi Argentina 94 94 565000 95500000 32 170 72 88 95 ... 75 96 33 37 26 6 11 15 14 8
Cristiano Ronaldo Portugal 93 93 405000 58500000 34 187 83 84 94 ... 85 95 28 32 24 7 11 15 14 11
Neymar Jr Brazil 92 92 290000 105500000 27 175 68 87 87 ... 90 94 27 26 29 9 9 15 15 11
J. Oblak Slovenia 91 93 125000 77500000 26 188 87 13 11 ... 11 68 27 12 18 87 92 78 90 89
E. Hazard Belgium 91 91 470000 90000000 28 175 74 81 84 ... 88 91 34 27 22 11 12 6 8 8
K. De Bruyne Belgium 91 91 370000 90000000 28 181 70 93 82 ... 79 91 68 58 51 15 13 5 10 13
M. ter Stegen Germany 90 93 250000 67500000 27 187 85 18 14 ... 25 70 25 13 10 88 85 88 88 90
V. van Dijk Netherlands 90 91 200000 78000000 27 193 92 53 52 ... 62 89 91 92 85 13 10 13 11 11
L. Modric Croatia 90 90 340000 45000000 33 172 66 86 72 ... 82 92 68 76 71 13 9 7 14 9
M. Salah Egypt 90 90 240000 80500000 27 175 71 79 90 ... 77 91 38 43 41 14 14 9 11 14

10 rows × 42 columns

Divide the data into explanatory variables X and a target variable y. Here we will predict the value of the best players.

In [5]:
X = data.drop(["nationality", "overall", "potential", "value_eur", "wage_eur"], axis = 1)
y = data['value_eur']

The target variable is skewed, so we log-transform it for a better fit.

In [6]:
ylog = np.log(y)

import matplotlib.pyplot as plt
plt.hist(ylog, bins='auto')
plt.title("ln(value_eur)")
plt.show()
[Histogram of ln(value_eur)]

Split the data into train and test.

In [7]:
X_train, X_test, ylog_train, ylog_test, y_train, y_test = \
    train_test_split(X, ylog, y, test_size=0.25, random_state=4)

create a default boosting model

In [8]:
gbm_default = LGBMRegressor()

gbm_default.fit(X_train, ylog_train)
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000364 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2596
[LightGBM] [Info] Number of data points in the train set: 3750, number of used features: 37
[LightGBM] [Info] Start training from score 15.433596
Out[8]:
LGBMRegressor()

create a tuned model

In [9]:
gbm_default._estimator_type
Out[9]:
'regressor'
In [10]:
#:# hp tuning
estimator = LGBMRegressor(n_jobs = -1)
param_test = {
    'n_estimators': list(range(201,1202,50)),
    'num_leaves': list(range(6, 42, 5)),
    'min_child_weight': [1e-3, 1e-2, 1e-1, 15e-2],
    'learning_rate': [1e-3, 1e-2, 1e-1, 15e-2]
}

rs = RandomizedSearchCV(
    estimator=estimator, 
    param_distributions=param_test, 
    n_iter=100,
    cv=4,
    random_state=1
)

# rs.fit(X, ylog)
# print('Best score reached: {} with params: {} '.format(rs.best_score_, rs.best_params_))
In [11]:
#:# best parameters after 100 iterations
best_params = {'num_leaves': 6,
               'n_estimators': 951,
               'min_child_weight': 0.1,
               'learning_rate': 0.15} 
In [12]:
gbm_tuned = LGBMRegressor(**best_params)
gbm_tuned.fit(X_train, ylog_train)
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000422 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 2596
[LightGBM] [Info] Number of data points in the train set: 3750, number of used features: 37
[LightGBM] [Info] Start training from score 15.433596
Out[12]:
LGBMRegressor(learning_rate=0.15, min_child_weight=0.1, n_estimators=951,
              num_leaves=6)

create explainers for the models

We want to see the real values of the target variable in the explanations (not the log). Therefore, we define a custom predict_function that inverts the log transform with np.exp.

In [13]:
def predict_function(model, data):
    return np.exp(model.predict(data))
In [14]:
exp_default = dx.Explainer(gbm_default, X_test, y_test,
                           predict_function=predict_function, label='default')
exp_tuned = dx.Explainer(gbm_tuned, X_test, y_test,
                         predict_function=predict_function, label='tuned')
Preparation of a new explainer is initiated

  -> data              : 1250 rows 37 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 1250 values
  -> model_class       : lightgbm.sklearn.LGBMRegressor (default)
  -> label             : default
  -> predict function  : <function predict_function at 0x29e725120> will be used
  -> predict function  : Accepts pandas.DataFrame and numpy.ndarray.
  -> predicted values  : min = 3.57e+05, mean = 7.12e+06, max = 8.12e+07
  -> model type        : regression will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         : min = -1e+07, mean = 2.12e+05, max = 2.43e+07
  -> model_info        : package lightgbm

A new explainer has been created!
Preparation of a new explainer is initiated

  -> data              : 1250 rows 37 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 1250 values
  -> model_class       : lightgbm.sklearn.LGBMRegressor (default)
  -> label             : tuned
  -> predict function  : <function predict_function at 0x29e725120> will be used
  -> predict function  : Accepts pandas.DataFrame and numpy.ndarray.
  -> predicted values  : min = 3.56e+05, mean = 7.12e+06, max = 9.51e+07
  -> model type        : regression will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         : min = -1.49e+07, mean = 2.11e+05, max = 2.41e+07
  -> model_info        : package lightgbm

A new explainer has been created!

The explanation functionalities are accessible through the methods of the Explainer object.

Model-level and predict-level methods return a new object that contains a result attribute (a pandas.DataFrame) and a plot method.
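
For example, here is a minimal sketch of this pattern (the observation passed to predict_parts is arbitrary and chosen only for illustration):

# model-level explanation: returns an object with a .result DataFrame and a .plot() method
mp = exp_default.model_performance("regression")
print(mp.result)

# predict-level explanation for a single observation from the test set
bd = exp_default.predict_parts(X_test.iloc[[0]], type='break_down')
bd.plot()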

In [15]:
mp_default = exp_default.model_performance("regression")
mp_default.result
Out[15]:
                  mse          rmse        r2           mae            mad
default  5.727888e+12  2.393301e+06  0.923228  1.209461e+06  651771.861674
In [16]:
mp_tuned = exp_tuned.model_performance("regression")
mp_tuned.result
Out[16]:
                  mse          rmse        r2           mae            mad
tuned    4.117085e+12  2.029060e+06  0.944818  1.092173e+06  595137.664400
In [17]:
mp_default.plot(mp_tuned)

These are very large values, so the differences may look subtle on paper.

What are the differences between these two models? Let's find out using permutation-based variable importance, computed with the model_parts method.

Customize the computation with parameters (a short sketch follows the list):

  • loss_function function to use for drop-out loss evaluation

  • B number of bootstrap rounds (e.g. 15 for slower computation but more stable results)

  • N number of observations to use (e.g. 500 for faster computation but less stable results)

  • variable_groups Dict of lists of variables. Each list is treated as one group. This is for testing joint variable importance
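
For instance, a minimal sketch combining these parameters (not executed in this notebook; 'mae' as the loss name and N=500 are illustrative choices):

vi_fast = exp_default.model_parts(
    loss_function='mae',  # drop-out loss measure ('rmse' is the regression default)
    B=15,                 # 15 permutation rounds for more stable results
    N=500,                # sample 500 observations for faster computation
    random_state=0
)
vi_fast.plot()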

In [18]:
X.columns
Out[18]:
Index(['age', 'height_cm', 'weight_kg', 'attacking_crossing',
       'attacking_finishing', 'attacking_heading_accuracy',
       'attacking_short_passing', 'attacking_volleys', 'skill_dribbling',
       'skill_curve', 'skill_fk_accuracy', 'skill_long_passing',
       'skill_ball_control', 'movement_acceleration', 'movement_sprint_speed',
       'movement_agility', 'movement_reactions', 'movement_balance',
       'power_shot_power', 'power_jumping', 'power_stamina', 'power_strength',
       'power_long_shots', 'mentality_aggression', 'mentality_interceptions',
       'mentality_positioning', 'mentality_vision', 'mentality_penalties',
       'mentality_composure', 'defending_marking', 'defending_standing_tackle',
       'defending_sliding_tackle', 'goalkeeping_diving',
       'goalkeeping_handling', 'goalkeeping_kicking',
       'goalkeeping_positioning', 'goalkeeping_reflexes'],
      dtype='object')
In [19]:
variable_groups = {
    'age': ['age'],
    'body': ['height_cm', 'weight_kg'],
    'attacking': ['attacking_crossing',
       'attacking_finishing', 'attacking_heading_accuracy',
       'attacking_short_passing', 'attacking_volleys'],
    'skill': ['skill_dribbling',
       'skill_curve', 'skill_fk_accuracy', 'skill_long_passing',
       'skill_ball_control'],
    'movement': ['movement_acceleration', 'movement_sprint_speed',
       'movement_agility', 'movement_reactions', 'movement_balance'],
    'power': ['power_shot_power', 'power_jumping', 'power_stamina', 'power_strength',
       'power_long_shots'],
    'mentality': ['mentality_aggression', 'mentality_interceptions',
       'mentality_positioning', 'mentality_vision', 'mentality_penalties',
       'mentality_composure'],
    'defending': ['defending_marking', 'defending_standing_tackle',
       'defending_sliding_tackle'],
    'goalkeeping' : ['goalkeeping_diving',
       'goalkeeping_handling', 'goalkeeping_kicking',
       'goalkeeping_positioning', 'goalkeeping_reflexes']
}
In [20]:
vi_default = exp_default.model_parts(variable_groups=variable_groups, B=15, random_state=0)
vi_tuned = exp_tuned.model_parts(variable_groups=variable_groups, B=15)

Customize the plot with parameters:

  • vertical_spacing value between 0.0 and 1.0 (e.g. 0.15 for more space between the plots)

  • rounding_function rounds the contributions (e.g. np.round, np.rint, np.ceil)

  • digits (e.g. 2 for np.round, None for np.rint)

In [21]:
vi_default.plot(vi_tuned,
                max_vars=6, rounding_function=np.rint, digits=None, vertical_spacing=0.15)

Variables connected with body and power aren't important for these models, and the same holds for goalkeeping, which might mean that the predictions for goalkeepers aren't accurate. The most important factors in predicting a player's value are skill, attacking and movement.

It seems that the default model focuses too much on the movement variables and underrates the others, especially skill, while the tuned model also finds mentality and defending quite important. Next, we will examine these variables more closely.

Aggregated Profiles

Choose a proper algorithm. The explanations can be calculated as a Partial Dependence Profile or an Accumulated Local Dependence Profile.

The key parameter is N, the number of observations to use (e.g. 800 for slower computation but more stable results).

Here we will use ALE plots, which work better when the explanatory variables are correlated; a sketch of the Partial Dependence alternative follows.
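
For comparison, a minimal sketch of the Partial Dependence alternative (not run here; the 'pdp-default' label is illustrative):

pdp_default = exp_default.model_profile(type='partial', N=800, label='pdp-default')
pdp_default.plot(variables=['age', 'movement_reactions'])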

In [22]:
ale_default = exp_default.model_profile(type = 'accumulated', N=800, label='ale-default')
Calculating ceteris paribus: 100%|██████████| 37/37 [00:02<00:00, 15.16it/s]
Calculating accumulated dependency: 100%|██████████| 37/37 [00:02<00:00, 14.57it/s]
In [23]:
ale_tuned = exp_tuned.model_profile(type = 'accumulated', N=800, label='ale-tuned')
Calculating ceteris paribus: 100%|██████████| 37/37 [00:06<00:00,  5.48it/s]
Calculating accumulated dependency: 100%|██████████| 37/37 [00:02<00:00, 14.22it/s]
In [24]:
ale_default.plot(ale_tuned, variables = ['goalkeeping_positioning', 'power_stamina',
                                           'mentality_vision', 'defending_marking',
                                           'attacking_finishing', 'attacking_heading_accuracy',
                                           'attacking_short_passing', 'skill_ball_control'])

Overall, we can see that the tuned model uses more variables, e.g. defending_marking, goalkeeping_positioning, mentality_vision, power_stamina, skill_ball_control and the attacking variables.

It also behaves differently on variables like age and movement_reactions.

In [25]:
ale_default.plot(ale_tuned, variables = ['age', 'movement_reactions'])

Variable Attribution

Choose a proper algorithm. The explanations can be calculated as Break Down, iBreakDown (Break Down with interactions) or Shapley Values.

For type='shap', the key parameter is B, the number of bootstrap rounds (e.g. 10 for faster computation but less stable results).

Let's find out what contributes to the value of the best players. A sketch of the plain Break Down variant is shown below; the next cells use the interaction and Shapley variants.
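
A minimal sketch of the plain Break Down variant for a single player (not run here; Lionel Messi's row is used only as an example):

bd_messi = exp_tuned.predict_parts(X.loc['L. Messi',], type='break_down', label='L. Messi')
bd_messi.plot(max_vars=10)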

In [26]:
va = {'ibd':[], 'sh':[]}

for name in data.index[0:3]:
    player = X.loc[name,]
    
    ibd = exp_tuned.predict_parts(player, type='break_down_interactions', label=name)
    sh = exp_tuned.predict_parts(player, type='shap', B=10, label=name)
    
    va['ibd'].append(ibd)
    va['sh'].append(sh)
In [27]:
va['ibd'][0].plot(va['ibd'][1:3],
                  rounding_function=lambda x, digits: np.rint(x, digits).astype(int),
                  digits=None, max_vars=10)
In [28]:
va['sh'][0].plot(va['sh'][1:3],
                 rounding_function=lambda x, digits: np.rint(x, digits).astype(int),
                 digits=None, max_vars=10)

Looking at the Break Down plots, the age and movement_reactions variables stand out. Let's focus on them more closely.

In [29]:
cp = exp_tuned.predict_profile(X.iloc[2:3,],
                               variables=['age', 'movement_reactions'],
                               label=X.index[2]) # variables to calculate 
Calculating ceteris paribus: 100%|██████████| 2/2 [00:00<00:00, 548.42it/s]
In [30]:
cp.plot(size=3, title="What If? Neymar Jr") # larger width of the line and dot size & change title

Here we see how the prediction would change if Neymar Jr were younger or older, or had a lower or higher movement_reactions value.

Hover over all of the above plots for tooltips with more information.

Plots

This package uses plotly to render the plots. Passing show=False to any plot method returns a plotly Figure object instead of displaying it, as sketched below.
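
A minimal sketch of working with the underlying figure (write_html is a plotly method, and the file name is illustrative):

fig = mp_default.plot(mp_tuned, show=False)  # show=False returns a plotly Figure instead of displaying it
fig.write_html("model_performance.html")     # save the interactive plot with plotly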

Resources - https://dalex.drwhy.ai/python