The current implementation of this selector is fragile. It is prone to fail when categorical variables have high cardinality or contain rare categories, and when numerical variables are highly skewed or the user requests a large number of bins.
When the transformer fails, the user gets very little information on how to troubleshoot. We need to improve this. Ideally, the error should indicate which variables are causing the problem and include a few hints on how to mitigate it.
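A minimal sketch of what such a diagnostic could look like is given here. The helper name `diagnose_variables`, its thresholds (`max_cardinality`, `rare_tol`), and the hint wording are all hypothetical and are not part of the existing selector. The sketch flags categorical variables with many or rare categories, and numerical variables whose quantile edges collapse for the requested number of bins.

```python
import numpy as np
import pandas as pd


def diagnose_variables(X, bins=5, max_cardinality=20, rare_tol=0.05):
    """Hypothetical helper: map each problematic variable to a list of hints.

    The thresholds are illustrative assumptions, not values used by the selector.
    """
    report = {}
    for name in X.columns:
        col = X[name]
        hints = []
        if pd.api.types.is_numeric_dtype(col):
            # Heavy skew or many tied values make quantile edges coincide,
            # so fewer distinct intervals remain than the user asked for.
            edges = col.quantile(np.linspace(0, 1, bins + 1)).unique()
            n_bins = len(edges) - 1
            if n_bins < bins:
                hints.append(
                    f"only {n_bins} distinct bin(s) out of {bins} requested; "
                    "try fewer bins or transform the variable first"
                )
        else:
            freq = col.value_counts(normalize=True)
            if len(freq) > max_cardinality:
                hints.append(
                    f"{len(freq)} categories; consider grouping categories"
                )
            n_rare = int((freq < rare_tol).sum())
            if n_rare > 0:
                hints.append(
                    f"{n_rare} categories below {rare_tol:.0%} frequency; "
                    "consider a rare-label encoder"
                )
        if hints:
            report[name] = hints
    return report


# Toy data: one skewed numerical, one high-cardinality categorical, one benign.
X = pd.DataFrame(
    {
        "skewed": [0] * 95 + [1, 2, 3, 4, 5],
        "city": [f"c{i}" for i in range(100)],
        "ok": range(100),
    }
)
report = diagnose_variables(X)
message = "\n".join(f"{var}: {'; '.join(h)}" for var, h in report.items())
```

The resulting `message` could be attached to the exception the transformer raises, so that the user sees the offending variables and a suggested fix in one place.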
Also, instead of splitting the dataset into train and test sets, we would like to use cross_validate. This means we first need to create a predictor class that outputs the target mean as its prediction, and then pass this predictor class to cross_validate.
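One possible shape for such a predictor is sketched here, assuming a scikit-learn compatible estimator. The class name `TargetMeanRegressor` and its `variable` parameter are hypothetical; the sketch handles a single categorical variable and a continuous target, and falls back to the global mean for categories not seen during fit.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import cross_validate


class TargetMeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical predictor: returns the mean target per category."""

    def __init__(self, variable):
        self.variable = variable

    def fit(self, X, y):
        # Align the target with the rows of X before grouping.
        y = pd.Series(np.asarray(y), index=X.index)
        self.global_mean_ = y.mean()
        self.mapping_ = y.groupby(X[self.variable]).mean().to_dict()
        return self

    def predict(self, X):
        # Categories unseen during fit fall back to the global target mean.
        preds = X[self.variable].map(self.mapping_)
        return preds.fillna(self.global_mean_).to_numpy()


# Toy data in which the category fully determines the target.
X = pd.DataFrame({"colour": ["a", "b"] * 30})
y = pd.Series([1.0, 3.0] * 30)

scores = cross_validate(
    TargetMeanRegressor(variable="colour"), X, y, cv=3, scoring="r2"
)
```

The selector could then run this once per variable and compare the mean of `scores["test_score"]` against its threshold, replacing the single train/test split.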
Finally, we need to expand the demos.