Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upFix documentation of default values in all classes #15761
Comments
Hello @cgsavard, I would like to work on this. Can I start looking at the AgglomerativeClustering class? |
@vachanda Go for it! We can continue posting here which ones we work on so that others know. |
Thanks for coordinating this @cgsavard Note to contributors: please follow the guidelines under: https://scikit-learn.org/stable/developers/contributing.html#guidelines-for-writing-documentation |
@cgsavard, Is there a list of classes that have discrepancies or do we have to go through each of them and update them? |
@vachanda I do not have a list, unfortunately. I have just been going through the files and seeing what needs to be updated. |
I am working on AffinityPropagation, SpectralCoclustering, SpectralBiclustering, and Birch. |
I am working on FeatureAgglomeration, KMeans, and MiniBatchKMeans. |
Logically speaking, if a param is optional, shouldn't the default be If there is a default, this usually means that the literature has found this to be a sensible default value which also suggests that this parameter has an impact on performance and thus it shouldn't be optional, but should just mention what the default is. Those seem closer to required parameters by definition, we just happened to make a sensible choice for the user so they can change it or not. Or more practically speaking, are there currently any optional parameters that we've found which have numeric default values, but for which specifying Or maybe I've been confusing the meaning of |
@jmwoloso We were really inconsistent regarding the usage of |
i want to contribute as well. can i go ahead with this |
@glemaitre ok, that definitely makes sense. so then we're removing the should each of these that we find be opened as an issue separately or how are we staging all of this work that we're doing since multiple people are working on multiple things relating to this single issue? |
@cyrus303 @jmwoloso You can get a class (a module maximum) and correct it. The idea is to remove the optional and add a default when there is one (there is usually one). Since we are touching the documentation, we should make sure that the style on the line follow our new style guide: https://scikit-learn.org/dev/developers/contributing.html#guidelines-for-writing-documentation You can mention which of the class/module you are working, open a link a PR to avoid duplicate effort :). Looking forward to reviewing it. |
Hey! I will work on |
I will also fix this issue for the |
I'll take the |
@glemaitre any preference on EDIT: ditto for EDIT (again): also seeing docstrings with inconsistent values relative to the
the |
@jmwoloso Could you refer to https://scikit-learn.org/stable/developers/contributing.html#guidelines-for-writing-documentation. Basically you should default to the python type name (bool, str, int, float)
We should match the parameter in the function signature. This value default parameter has changed and the docstring was not updated. |
Hi @cgsavard , |
Hi, I am new, and I would like to start contributing. Do you still need some help on this issue? is there any file you still need help with? |
Hey @boricles! Have a look to #15761 (comment) for a list with the modules still to be fixed. |
@alfaro96 thanks. I did a quick look just now. I will select a module tonight, and work on it. |
I am working on sklearn/config_context |
Hey, thought I'd see if I could help with the docs. @alfaro96 I'd like to work on Also, I noticed that although |
Hi @alfaro96, after edits: I would like to work on |
Hey @madprogramer,
Edit: The
It is worth it updating, although this should be done in a separate PR. Thank you! |
Hey @haiatn,
I have updated the checklist.
That is the idea, although the classes in the Nevertheless, it would be nice if you could work in any of the other functions, classes and modules that are pending to be fixed. Thank you! |
I looked at the checklist. From what I saw the following can be checked from the checklist:
Can I work on |
Thank you @haiatn, I have updated the checklist.
The
I have had a look to the Nevertheless, there are several functions in the Thank you @haiatn, we really appreciate your help! |
I will now work on |
I will finish sklearn.utils. There are only 3 files I found that need fixing. |
Hey @haiatn! I have already have a look to your open PRs. Thank you! |
Now that we merged what's left off sklearn.utils and it was the last on the checklist, did we finish? |
There is one last open pull request #18025, then this issue could be eventually closed. |
Hello, |
Description
The documentation of default values in many classes is either not included, inconsistent in how it is written, or out-of-date. I would like to gather a few people to work on the default value documentation for every class as there are a ton of classes where these issues exist. I have been told that the default values should be documented as "default=<'value'>" and so I am creating this issue under that assumption.
Solution
Here are a few things that I have seen for the parameters which should be changed:
If a few people work on a few classes each, then this should be done in no time! These should all be fairly simple fixes.
Examples
https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
The link above is an example where default values are not indicated but the parameters say "optional", and where those with default values indicated are all inconsistently documented.