The current implementation of this selector is fragile. It is prone to fail when categorical variables have high cardinality or contain rare categories, and when numerical variables are highly skewed or the user requests a large number of bins.

When the transformer fails, the user gets very little information on how to troubleshoot, and we need to improve this. Ideally, the error should indicate which variables are causing the problem and include a few hints on how to mitigate it.
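As a rough illustration of the kind of diagnostics this could involve, the sketch below flags the failure modes listed above per variable. The helper name `diagnose_variables` and all of its thresholds (`max_cardinality`, `rare_freq`, `max_skew`) are hypothetical and not part of the existing selector; only standard pandas calls are used.

```python
import pandas as pd


def diagnose_variables(X, bins=5, max_cardinality=20, rare_freq=0.05, max_skew=2.0):
    """Hypothetical helper: return {variable: [hints]} for columns likely to cause trouble.

    All thresholds are illustrative assumptions, not values used by the selector.
    """
    problems = {}
    for name in X.columns:
        col = X[name]
        hints = []
        if pd.api.types.is_numeric_dtype(col):
            # Strong skew tends to leave some bins empty or nearly empty.
            if abs(col.skew()) > max_skew:
                hints.append("highly skewed: consider a transformation or fewer bins")
            # Fewer distinct values than bins cannot be discretised as requested.
            if col.nunique() < bins:
                hints.append("fewer unique values than bins: reduce the number of bins")
        else:
            if col.nunique() > max_cardinality:
                hints.append("high cardinality: consider grouping categories")
            # Relative frequency of each category, used to spot rare labels.
            freqs = col.value_counts(normalize=True)
            rare = freqs[freqs < rare_freq].index.tolist()
            if rare:
                hints.append(f"rare categories {rare}: consider grouping them")
        if hints:
            problems[name] = hints
    return problems


# Toy data: "city" has one rare category, "age" is uniform and unproblematic.
demo = pd.DataFrame({"city": ["a"] * 29 + ["z"], "age": list(range(30))})
report = diagnose_variables(demo)
```

A report like this could be folded into the error message raised when the transformer fails, so the user sees both the offending variables and a suggested mitigation.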

Also, instead of splitting the dataset into train and test sets, we would like to use cross_validate. This means we first need to create a predictor class that outputs the target mean as its prediction, and then pass this predictor to cross_validate.
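A minimal sketch of what such a predictor could look like is shown below, assuming a regression target and a single categorical variable. The class name `TargetMeanRegressor`, its `variable` parameter, and the fallback to the global mean for unseen categories are assumptions made for illustration; the final design may differ.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import cross_validate


class TargetMeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical predictor: predicts the mean target value per category of one variable."""

    def __init__(self, variable="x"):
        self.variable = variable

    def fit(self, X, y):
        # Align y with the rows of X so the groupby below matches row by row.
        y = pd.Series(np.asarray(y), index=X.index)
        # Fallback for categories not seen during fit (an assumption of this sketch).
        self.global_mean_ = float(y.mean())
        # Mean target value per category, learned on the training fold only.
        self.mapping_ = y.groupby(X[self.variable]).mean().to_dict()
        return self

    def predict(self, X):
        preds = X[self.variable].map(self.mapping_).fillna(self.global_mean_)
        return preds.to_numpy(dtype=float)


# Toy data in which the category fully determines the target.
X = pd.DataFrame({"colour": ["a", "b"] * 6})
y = np.array([1.0, 3.0] * 6)

# Each fold fits the mapping on the training part and scores it on the held-out part.
scores = cross_validate(TargetMeanRegressor(variable="colour"), X, y, cv=3, scoring="r2")
```

The selector could then run one such predictor per variable and keep the variables whose mean cross-validated score exceeds a threshold, instead of relying on a single train/test split.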

Finally, we need to expand the demos.

modify target mean performance selector

There are no open issues in this milestone
