import dalex as dx
import pandas as pd
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
import xgboost as xgb
import warnings
warnings.filterwarnings('ignore')
dx.__version__
We will consider the most basic example first. Because the dataset contains categorical variables, which xgboost does not handle by default, we drop them in this first example.
data = dx.datasets.load_titanic()
X = data.drop(columns='survived').loc[:, ['age', 'fare', 'sibsp', 'parch']]
y = data.survived
params = {
"max_depth": 5,
"objective": "binary:logistic",
"eval_metric": "auc"
}
train = xgb.DMatrix(X, label=y)
classifier = xgb.train(params, train, verbose_eval=1)
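Dropping the categorical columns keeps this first example simple. As an alternative, categoricals could be one-hot encoded before building the DMatrix; a minimal sketch on a hypothetical toy frame (not the Titanic data) using pandas:

```python
import pandas as pd

# Hypothetical frame mixing a numeric and a categorical column,
# mimicking the style of the Titanic data.
df = pd.DataFrame({"age": [22.0, 38.0, 26.0, 35.0],
                   "gender": ["male", "female", "female", "male"]})

# One-hot encode the categorical column; numeric columns pass through.
encoded = pd.get_dummies(df, columns=["gender"])

print(encoded.shape)  # (4, 3): age plus two gender indicator columns
```

The resulting all-numeric frame can then be fed to `xgb.DMatrix` as above; `sklearn`'s `OneHotEncoder` inside a `ColumnTransformer` achieves the same thing in a pipeline.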
Note that although xgboost requires its own DMatrix format, you pass a pandas.DataFrame to the Explainer. Use a DataFrame in all interactions with the Explainer.
exp = dx.Explainer(classifier, X, y)
Again, note that X is just a pandas.DataFrame.
exp.predict(X)
exp.model_parts().plot()
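`model_parts()` computes permutation-based variable importance: each variable is shuffled in turn, and the resulting drop in model performance measures how much the model relies on it. A self-contained sketch of that idea on synthetic data, using a logistic regression stand-in rather than the fitted booster:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic data: one informative column, one pure-noise column.
rng = np.random.default_rng(42)
X = pd.DataFrame({"signal": rng.normal(size=300),
                  "noise": rng.normal(size=300)})
y = (X["signal"] + 0.1 * rng.normal(size=300) > 0).astype(int)

model = LogisticRegression().fit(X, y)
baseline = roc_auc_score(y, model.predict_proba(X)[:, 1])

# Permute one column at a time; the drop in AUC is that variable's importance.
importance = {}
for col in X.columns:
    Xp = X.copy()
    Xp[col] = rng.permutation(Xp[col].values)
    importance[col] = baseline - roc_auc_score(y, model.predict_proba(Xp)[:, 1])

print(importance)  # shuffling 'signal' hurts AUC far more than 'noise'
```

dalex does the same with the Explainer's loss function and averages over several permutation rounds before plotting.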