The Wayback Machine - https://web.archive.org/web/20200822001248/https://github.com/scikit-learn/scikit-learn/issues/18175
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix shape parameter inconsistencies in developer's guide #18175

Open
lorentzenchr opened this issue Aug 17, 2020 · 11 comments
Open

Fix shape parameter inconsistencies in developer's guide #18175

lorentzenchr opened this issue Aug 17, 2020 · 11 comments

Comments

@lorentzenchr
Copy link
Contributor

@lorentzenchr lorentzenchr commented Aug 17, 2020

Describe the issue linked to the documentation

https://scikit-learn.org/stable/developers/contributing.html#guidelines-for-writing-documentation specifies how to write parameters with a shape attribute in a docstring, e.g.

array_parameter : {array-like, sparse matrix, dataframe} of shape (n_samples, n_features) or (n_samples,)
    This parameter accepts data in either of the mentioned forms, with one
    of the mentioned shapes. The default value is
    `np.ones(shape=(n_samples,))`.

list_param : list of int

typed_ndarray : ndarray of shape (n_samples,), dtype=np.int32

sample_weight : array-like of shape (n_samples,), default=None

There are a few instances in the developer's guide, examples and user guide that do still not follow that example:

  • examples/model_selection/plot_learning_curve.py
  • examples/miscellaneous/plot_johnson_lindenstrauss_bound.py
  • doc/developers/develop.rst (#18191)
  • doc/modules/clustering.rst
@OlehKSS
Copy link
Contributor

@OlehKSS OlehKSS commented Aug 17, 2020

Hello, I can take it provided no one else is working on it now.

@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Aug 17, 2020

You can go ahead @OlehKSS

Please submit a pull-request by file modified to ease the reviewing process.

@cmarmo I was wondering if this issue is not a duplicate?

@cmarmo
Copy link
Member

@cmarmo cmarmo commented Aug 17, 2020

@glemaitre @lorentzenchr this issue slightly overlaps with #14312 and #15761 but it is not a duplicate.
I think that those three issues will benefit of being unified in one and only for the sake of readability, for users and reviewers.

@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Aug 17, 2020

I think that those three issues will benefit of being unified in one and only for the sake of readability, for users and reviewers.

At least they are cross-referenced now :)
But I agree that a single issue would be best because all these little changes could be done at once, for a single file/class.

@lorentzenchr
Copy link
Contributor Author

@lorentzenchr lorentzenchr commented Aug 17, 2020

@glemaitre @cmarmo Thanks for cross-referencing. I'm not sure how to unify all in one issue. Note that #14312 and #15761 are about docstrings of classes and functions, not about examples and developer's guide (or user guide). I think one PR per file seems reasonable and might motivate some new contributors.
Therefore I updated this issue with a list of some files where I found shape of inconsistencies.

@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Aug 18, 2020

My bad, I omit the fact that you pointed out the UG and example only. So this issue should be fixed pretty quickly then.

@OlehKSS
Copy link
Contributor

@OlehKSS OlehKSS commented Aug 19, 2020

I edited one of the specified files, e.g. develop.rst, here is my PR #18191

@OlehKSS
Copy link
Contributor

@OlehKSS OlehKSS commented Aug 19, 2020

I noticed that in the documentation guidelines for “References” in docstrings it is suggested to look at the Silhouette Coefficient, which itself does not follow documentation writing guidelines for shapes and arrays. Is there a better example that can be provided in the documentation guidelines or should fix that docstring as a part of this issue?

image

@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Aug 19, 2020

The docstring of the silhoutte coefficient is not that good but the "References" section is fine:

 References
    ----------
    .. [1] `Peter J. Rousseeuw (1987). "Silhouettes: a Graphical Aid to the
       Interpretation and Validation of Cluster Analysis". Computational
       and Applied Mathematics 20: 53-65.
       <https://www.sciencedirect.com/science/article/pii/0377042787901257>`_
    .. [2] `Wikipedia entry on the Silhouette Coefficient
           <https://en.wikipedia.org/wiki/Silhouette_(clustering)>`_

I think this is what the documentation is referring too. The simplest would be to make a PR solving the issue in the class and we merge it quickly :)

@OlehKSS
Copy link
Contributor

@OlehKSS OlehKSS commented Aug 20, 2020

I added a new PR #18214 that fixes the issue with the Silhouette Coefficient.

@lorentzenchr
Copy link
Contributor Author

@lorentzenchr lorentzenchr commented Aug 20, 2020

I reopen for the 3 left files in examples and user guide not inlcuded - in my understanding - in the above mentioned issues.

@lorentzenchr lorentzenchr reopened this Aug 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

4 participants
You can’t perform that action at this time.