Describe alternatives you've considered, if relevant
There's no good alternative that stays compatible with sklearn's pipelines. I was following issue #11996, which proposed adding a handle_missing option to OneHotEncoder, but it was set aside in favor of using a "constant" strategy on the categorical columns. However, the constant strategy adds an unnecessary new column that could be dropped in this scenario.
Additional context
No response
There is a drop argument in OneHotEncoder to which you can pass an array (one category to drop for each feature). Can you use this for your use case? Adapting your snippet, something like this:
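A minimal sketch of that suggestion, assuming a single toy categorical column and a hypothetical fill value "missing" (neither comes from the original snippet). The imputer writes the constant into the NaN rows, and the encoder is told to drop exactly that category:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy data (assumed): one categorical feature with a missing entry.
X = np.array([["a"], ["b"], [np.nan], ["a"]], dtype=object)

# "missing" is a hypothetical placeholder category, not a library default.
pipe = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="missing"),
    # One entry per feature: the category to drop from that feature.
    OneHotEncoder(drop=["missing"]),
)

# The encoder returns a sparse matrix by default; densify for inspection.
encoded = pipe.fit_transform(X).toarray()
print(encoded)
```

With the placeholder category dropped, only the columns for "a" and "b" remain, so the row that originally held NaN is encoded as all zeros.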
Describe the workflow you want to enable
When using SimpleImputer + OneHotEncoder, I am able to add a new constant category for NaN values like the example below:

However, I wanted to have an argument like OneHotEncoder(drop='last') in order to have an output like:

This would allow all NaNs to be filled with zeros.
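To make the current behavior concrete, here is a minimal sketch, assuming a toy single-column input and a hypothetical fill value "missing" (both are illustrative, not taken from the report). It shows the extra column that constant imputation introduces; the requested drop='last' is not an existing option, so it is not used here:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy data (assumed): one categorical feature with a missing entry.
X = np.array([["a"], ["b"], [np.nan], ["a"]], dtype=object)

# Fill NaN with a hypothetical placeholder category, then one-hot encode.
pipe = make_pipeline(
    SimpleImputer(strategy="constant", fill_value="missing"),
    OneHotEncoder(),
)

# The encoder returns a sparse matrix by default; densify for inspection.
encoded = pipe.fit_transform(X).toarray()
categories = pipe[-1].categories_[0]

print(list(categories))  # learned categories, including the placeholder
print(encoded)           # one column per category
```

Here the placeholder becomes a third category with its own column, which is the column the request would like to drop so that NaN rows end up as all zeros.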
Describe your proposed solution