The Wayback Machine - https://web.archive.org/web/20230427184032/https://github.com/scikit-learn/scikit-learn/pull/23198
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC List estimators that supports missing data in user guide #23198

Merged

Conversation

MaxwellLZH
Copy link
Contributor

@MaxwellLZH MaxwellLZH commented Apr 23, 2022

Fixes #21382. This is based on the work of lisekleiber

What does this implement/fix? Explain your changes.

Code refactoring and adding formatting as suggested in the original PR #22096

@MaxwellLZH
Copy link
Contributor Author

The styling of the documentation is still not working as expected, I wasn't able to figure out where's the issue, could really need some help here
image

@cmarmo
Copy link
Member

cmarmo commented Apr 23, 2022

Hi @MaxwellLZH, thanks for your follow-up.
About the rendering, I would suggest to create a bullet list here.

  • List of estimators that allow NaN values for type regressor:
    • :class:sklearn.ensemble._hist_gradient_boosting.gradient_boosting.HistGradientBoostingRegressor
  • List of estimators that allow NaN values for type classifier:
    • :class:sklearn.ensemble._hist_gradient_boosting.gradient_boosting.HistGradientBoostingClassifier
  • etc...

Also, do you mind editing the description with "Fixes" or "Solves" and the issue number just after? This will link the issue to the PR and close the issue once the PR is merged. Thanks!

@MaxwellLZH
Copy link
Contributor Author

HI @cmarmo, I've edited the description as suggested.

I'm kind of stuck with the formatting here, I checked the output string (as shown below), which seems to have the right format, but it's still shown as plain text rather than markdown in the generated HTML :(, would you mind help take a look at the code here, thank you!

* List of estimators that allow NaN values for type *regressor*:
 * :class:`sklearn.ensemble._hist_gradient_boosting.gradient_boosting.HistGradientBoostingRegressor`

@cmarmo
Copy link
Member

cmarmo commented Apr 24, 2022

Hi @MaxwellLZH I'm not an expert in sphinx and docutils.... so, please, bear with me.
My understanding is that the the output text is written in a docutils paragraph node, translated as an html <p> tag.
So I guess you can write the paragraph in HTML not rst.... this will possibly work but it is not a clean solution....
A clean solution would probably be the use of docutils bullet_list node. Do you think you can give it a try?

I still need to figure out how the links to the estimators are represented.... sorry...

@MaxwellLZH
Copy link
Contributor Author

Hi @cmarmo, thank you for the instructions! I will take a look at using bullet_list node this week :)

@MaxwellLZH
Copy link
Contributor Author

MaxwellLZH commented May 1, 2022

Sorry, I tried to figure out how to work with docutils.nodes.bullet_list and the link to classes with :class: but still wasn't able to fix the formatting :(

If someone familiar with sphinx and docutils could help move on with this PR or give some instructions on how to make it work would be really appreciated !

@cmarmo
Copy link
Member

cmarmo commented May 2, 2022

@MaxwellLZH I have opened a pull request to your branch.
The code is translated to a docutils node structure.
Could you please check the url to the estimators? Thanks!

@MaxwellLZH
Copy link
Contributor Author

Hi @cmarmo, thank you so much for helping! I've updated the estimator links and checks it's working now :)

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you all. This looks good to me!

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

doc/sphinxext/allow_nan.py Outdated Show resolved Hide resolved
doc/sphinxext/allow_nan.py Outdated Show resolved Hide resolved
doc/modules/impute.rst Outdated Show resolved Hide resolved
try:
est = _construct_instance(est_class)
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this raises a SkipTest exception. In those cases, I think we can skip without warning, since warning is not that actionable here.

Suggested change
try:
est = _construct_instance(est_class)
except:
from contextlib import supress
...
with suppress(SkipTest):
est = _construct_instance(est_class)

I suspect the "Estimators that handle NaN values" section is for non meta-estimators anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense! I've updated code as suggested :)

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, otherwise LGTM

doc/conf.py Outdated Show resolved Hide resolved
@@ -0,0 +1,59 @@
from sklearn.utils import all_estimators
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent with the class name, can you rename this: doc/sphinxext/allow_nan_estimators.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated :)

MaxwellLZH and others added 2 commits May 4, 2022 16:03
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan thomasjpfan changed the title DOC Maintain a list of all estimators that supports missing data DOC List estimators that supports missing data in user guide May 4, 2022
@thomasjpfan thomasjpfan merged commit d3050e4 into scikit-learn:main May 4, 2022
28 checks passed
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request May 19, 2022
glemaitre pushed a commit that referenced this pull request May 19, 2022
Co-authored-by: Thomas J. Fan <[email protected]>
Co-authored-by: Lise <[email protected]>
Co-authored-by: Chiara Marmo <[email protected]>
mathijs02 pushed a commit to mathijs02/scikit-learn that referenced this pull request Dec 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOC maintain a list of all estimators that natively handle missing data
5 participants