
modify target mean performance selector

Open
No due date
Last updated Mar 26, 2022
100% complete

The current implementation of this selector is fragile. It is prone to fail when the categorical variables have high cardinality or contain rare categories, and when the numerical variables are highly skewed or the user requests a large number of bins.

When the transformer fails, the user gets very little information on how to troubleshoot. We need to improve this. Ideally, the error should indicate which variables are causing the problem and include a few hints on how to mitigate it.
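As a rough sketch of what such a diagnostic could look like, the helper below scans a DataFrame and reports, per variable, the conditions described above. The function name `find_problem_variables`, its thresholds, and the hint wording are all hypothetical and not part of the existing selector; it only illustrates the idea of naming the offending variables.

```python
import pandas as pd


def find_problem_variables(X, rare_threshold=0.05, max_cardinality=20, max_abs_skew=2.0):
    """Return {variable: [hints]} for variables likely to cause trouble.

    Hypothetical helper: the thresholds are illustrative defaults, not
    values taken from the selector.
    """
    problems = {}
    for name in X.columns:
        col = X[name]
        hints = []
        if pd.api.types.is_numeric_dtype(col):
            # Highly skewed numerical variables can produce empty or tiny bins.
            if abs(col.skew()) > max_abs_skew:
                hints.append("highly skewed: consider fewer bins or a transformation")
        else:
            freqs = col.value_counts(normalize=True)
            if len(freqs) > max_cardinality:
                hints.append("high cardinality: consider grouping categories")
            if (freqs < rare_threshold).any():
                hints.append("rare categories: consider grouping them")
        if hints:
            problems[name] = hints
    return problems


# Small illustrative dataset (made up for this example).
X_demo = pd.DataFrame(
    {
        "city": ["a"] * 50 + ["b"] * 49 + ["c"],  # "c" is a rare category
        "income": [1] * 99 + [1000],              # heavily right-skewed
        "age": list(range(100)),                  # evenly spread, no issue
    }
)
problems = find_problem_variables(X_demo)
for variable, hints in problems.items():
    print(f"{variable}: {'; '.join(hints)}")
```

A report like this could be attached to the exception message so the user sees which variables to look at first.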

Also, instead of splitting the dataset into train and test sets, we would like to use cross_validate. This requires first creating a predictor class that outputs the target mean as its prediction, and then passing that class to cross_validate.
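One possible shape for that predictor, as a minimal sketch: a scikit-learn compatible estimator that learns the mean target per category of a single variable and returns it as the prediction. The class name `TargetMeanPredictor`, the single `variable` parameter, the regression-only setup, and the fallback to the global mean for unseen categories are assumptions made for illustration, not the planned design.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import cross_validate


class TargetMeanPredictor(RegressorMixin, BaseEstimator):
    """Hypothetical predictor: predicts the mean target per category."""

    def __init__(self, variable):
        self.variable = variable

    def fit(self, X, y):
        # Align y with X so the groupby works on cross-validation subsets.
        y = pd.Series(np.asarray(y), index=X.index)
        self.global_mean_ = float(y.mean())
        self.means_ = y.groupby(X[self.variable]).mean().to_dict()
        return self

    def predict(self, X):
        # Categories not seen during fit fall back to the global mean
        # (an assumption of this sketch).
        mapped = X[self.variable].map(self.means_)
        return mapped.fillna(self.global_mean_).to_numpy(dtype=float)


# Synthetic data (made up for this example): the target depends on the category.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({"colour": ["a", "b", "c"] * 30})
y_demo = X_demo["colour"].map({"a": 1.0, "b": 2.0, "c": 3.0}) + rng.normal(0, 0.1, 90)

scores = cross_validate(TargetMeanPredictor("colour"), X_demo, y_demo, cv=3, scoring="r2")
print(scores["test_score"].mean())
```

The selector could then run this once per variable and keep the variables whose mean cross-validated score exceeds a threshold, which avoids the manual train and test split.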

Finally, we need to expand the demos.
