Add per feature "maximum category" counts to OrdinalEncoder
#26013
Comments
Dear @betatim, I'm reaching out to express my interest in contributing to the scikit-learn project, specifically the issue #26013 that you opened regarding adding per feature maximum category counts to OrdinalEncoder. I have experience with Python programming and machine learning, and I believe that I can make a meaningful contribution to this project. I'm excited about the idea of allowing users to specify the number of maximum categories per feature, and I'm eager to work on this feature. I would appreciate it if you could guide me on how to get started with this project. Should I read any specific documentation or study any relevant code before diving in? Thank you for your time and consideration. Best regards,
Hi @betatim, if this feature can be considered for inclusion, I would like to work on this issue.
If you want to work on this please do. Try and open a PR as soon as possible so that others can see that you are working on this and people can guide the work. You can mark the PR as "draft" if it isn't ready for reviewing yet.
Hi @betatim, I have opened a working PR. The failures stem from the lack of an updated changelog and the fact that many files not changed by this PR are not properly linted.
Describe the workflow you want to enable
This is a follow-up task for #25677.
It would be nice to allow users to specify a per-feature number of maximum categories instead of having a global limit as implemented in #25677.
More details in the linked comment.
Describe your proposed solution
Allow users to pass a list of shape (n_features,) or a dict mapping column name to max_categories values to specify the number of maximum categories per feature.
Describe alternatives you've considered, if relevant
No response
Additional context
No response
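For illustration only, here is a minimal sketch of how the three input forms mentioned in the proposed solution (a single global value, a list of shape (n_features,), or a dict keyed by column name) could be normalized into one limit per feature. The helper name `resolve_max_categories` and its signature are hypothetical and not part of scikit-learn. Treating a column that is missing from the dict as "no limit" is also an assumption, not something this issue specifies.

```python
from collections.abc import Mapping, Sequence


def resolve_max_categories(max_categories, feature_names):
    """Return one limit (int or None) per feature.

    Hypothetical helper, not a scikit-learn API.
    """
    n_features = len(feature_names)

    # A single int (or None) is the existing global behaviour:
    # the same limit applies to every feature.
    if max_categories is None or isinstance(max_categories, int):
        return [max_categories] * n_features

    # Dict form: map column name -> limit.
    # Assumption: columns absent from the dict get no limit (None).
    if isinstance(max_categories, Mapping):
        unknown = set(max_categories) - set(feature_names)
        if unknown:
            raise ValueError(f"Unknown feature names: {sorted(unknown)}")
        return [max_categories.get(name) for name in feature_names]

    # List form: one entry per feature, in column order.
    if isinstance(max_categories, Sequence) and not isinstance(max_categories, str):
        limits = list(max_categories)
        if len(limits) != n_features:
            raise ValueError(
                f"Expected {n_features} entries, got {len(limits)}"
            )
        return limits

    raise TypeError("max_categories must be an int, None, a list, or a dict")
```

With feature names `["city", "color"]`, passing `3` would apply the same limit to both columns, `[2, None]` would limit only `city`, and `{"color": 4}` would limit only `color`.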