Added support for not normalising OneVsRestClassifier predict_proba result #10612
Conversation
I don't think we want predict_proba to ever return anything but normalized probabilities. Is there a reason decision_function doesn't suit your needs?

I was using it on random forests and decision trees as well, which do not have a decision_function.
well perhaps it makes sense to define a decision_function here that calls the estimator's predict_proba but leaves it unnormalized. Wdyt?
As far as I know, the decision function is negative for a confident -1 class and positive for a confident +1 class, is not necessarily in the [-1, 1] range, and represents the distance from the separating hyperplane. For random forest probabilities, would the decision function equal p(x) * 2 - 1, or would it make more sense to compute a distance from the forest's decision boundaries (i.e. for each tree node, how far the sample was from the node's split condition)? If it's documented that this is not necessarily equivalent to the SVM's decision_function, then it might be OK to define decision_function as you suggest.
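For intuition, here is a minimal numeric sketch of that mapping (plain arithmetic, not sklearn API): probabilities in [0, 1] map affinely onto SVM-style signed scores in [-1, 1].

```python
import numpy as np

# Positive-class probabilities from some binary estimator, in [0, 1].
p = np.array([0.0, 0.25, 0.5, 0.9, 1.0])

# The proposed mapping rescales them to [-1, 1]: -1 is a confident
# negative, +1 a confident positive, and 0 sits on the boundary.
scores = 2 * p - 1
print(scores)  # [-1.  -0.5  0.   0.8  1. ]
```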
In the multiclass case, the decision function argmax needs to be consistent with the discrete prediction. Otherwise I don't think it is constrained...
I changed some things so that decision_function returns predict_proba * 2 - 1 if decision_function is not defined. I added a test for this case.
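As a sketch of what such a test can check (illustrative code, not necessarily the PR's actual test): the fallback scores should keep their argmax consistent with the discrete prediction, per the constraint above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, n_classes=3, n_informative=6,
                           random_state=0)
# Decision trees have predict_proba but no decision_function.
ovr = OneVsRestClassifier(DecisionTreeClassifier(random_state=0)).fit(X, y)

# The proposed fallback: per-class positive probabilities rescaled to [-1, 1].
scores = np.array([e.predict_proba(X)[:, 1] * 2 - 1
                   for e in ovr.estimators_]).T

# The argmax of the fallback scores must match the discrete prediction.
assert np.array_equal(ovr.classes_[scores.argmax(axis=1)], ovr.predict(X))
```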
This pull request introduces 37 alerts and fixes 18 - view on lgtm.com

new alerts:
- 37 for Explicit export is not defined

fixed alerts:
- 6 for Mismatch between signature and use of an overridden method
- 3 for Missing call to `__init__` during object initialization
- 3 for Potentially uninitialized local variable
- 2 for Comparison using `is` when operands support `__eq__`
- 2 for Wrong number of arguments for format
- 1 for Conflicting attributes in base classes
- 1 for Mismatch between signature and use of an overriding method

Comment posted by lgtm.com
Never mind lgtm.com for now. It's a glitch on their side.
@jnothman any comments on the current state?
sorry I haven't had time yet. decision_function should indeed return only a 1d array in the case of binary classification (even if it's annoying)
I suppose you had better add a test for that binary case and note it in the docstring
```python
except AttributeError:
    # fall back to probabilities rescaled to [-1, 1] when the
    # underlying estimators provide no decision_function
    T = np.array([e.predict_proba(X)[:, 1] * 2 - 1
                  for e in self.estimators_]).T
if len(self.estimators_) == 1:
```
Let's simplify by just placing this logic below, to select the column instead

@jnothman, does it look better now?
I'm not altogether happy about making if_delegate_has_method less understandable
sklearn/utils/metaestimators.py (outdated)
```diff
@@ -87,10 +87,11 @@ class _IffHasAttrDescriptor(object):
     See https://docs.python.org/3/howto/descriptor.html for an explanation of
     descriptors.
     """
-    def __init__(self, fn, delegate_names, attribute_name):
+    def __init__(self, fn, delegate_names, attribute_name, backup_attribute):
```
Here attribute_name should become a list, like delegate_names.
However I wonder if we should just use a property rather than making magic.
sklearn/utils/metaestimators.py (outdated)
```diff
@@ -118,7 +124,7 @@ def __get__(self, obj, type=None):
         return out


-def if_delegate_has_method(delegate):
+def if_delegate_has_method(delegate, backup_method=None):
```
Please document the parameter below
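A sketch of how that parameter might be documented, in sklearn's numpydoc style (the wording is illustrative, not the PR's final text):

```python
def if_delegate_has_method(delegate, backup_method=None):
    """Create a decorator for methods that are delegated to a sub-estimator.

    Parameters
    ----------
    delegate : string, list of strings or tuple of strings
        Name of the sub-estimator that can be accessed as an attribute of
        the base object.
    backup_method : string, optional (default=None)
        Name of a method on the delegate to fall back on when the delegate
        does not provide the decorated method itself.
    """
```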
@jnothman, I'm open to making a new decorator, but couldn't think of one that is useful in multiple places because I'm not very familiar with the entire codebase. Do you have a specific design in mind? The remainder of the code is fixed.
no, not a new decorator, just an equivalent custom implementation with `@property` would suffice
@jnothman, I added a second decorator, if_delegate_has_methods.
I preferred the previous version to this extra decorator. The name is no clearer by making it plural; plural makes it seem like all must be present.
I don't want to complicate the existing decorator if we can help it either, but the solution is to use a property, not a new descriptor. Sigh, perhaps I'll live with the existing decorator being more complex.
@jnothman

```python
@if_delegate_has_methods(['_first_estimator', 'estimator'],
                         ['decision_function', 'predict_proba'])
def decision_function(self, X):
    ...
```

As far as I could understand, property is for defining class attributes that have getters and setters. Which of the variables would be a property?
Properties are descriptors, which will result in hasattr(..., ...) returning False if the property getter raises an AttributeError. The descriptor is more-or-less equivalent to:

```python
@property
def decision_function(self):
    if (not hasattr(self.estimator, 'decision_function')
            and not hasattr(self.estimator, 'predict_proba')):
        raise AttributeError
    def _func(X, y):
        # calculate the decision function
        return df
    return _func
```
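A small runnable illustration of that behaviour (toy classes, not sklearn code): hasattr reports False as soon as the property getter raises AttributeError.

```python
class Wrapper:
    def __init__(self, estimator):
        self.estimator = estimator

    @property
    def decision_function(self):
        # hasattr(wrapper, 'decision_function') evaluates this getter and
        # returns False if it raises AttributeError.
        if not hasattr(self.estimator, 'decision_function'):
            raise AttributeError('wrapped estimator has no decision_function')
        return self.estimator.decision_function


class NoDecisionFunction:
    pass


print(hasattr(Wrapper(NoDecisionFunction()), 'decision_function'))  # False
```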
@jnothman, I'm not sure if I understood correctly, so I made a sample: https://pastebin.com/6Y9XUwg0. Is this a better solution?
not quite. The property needs to return the function that calculates the ovr decision function from data.
Sigh. Maybe modifying the util is clearer and simpler
@jnothman I put it back together as a single decorator.
I guess this LGTM
Please add an entry to the change log at …
@jnothman I added it to what's new. Are pull requests also referenced as …?
yes
@jnothman, it's all set then.
you will need a second dev to review
I'm not entirely sure what the motivation for this is.
Which part? If you mean the normalization option, I think understanding the difference between the normalized and unnormalized probabilities is useful.
Why do you think it would be useful here and not in other cases? Adding kwargs to functions is kind of frowned upon, as it's hard to access those in compound estimators like pipelines. And anything that changes the …
Your goal is basically to implement a rejection threshold, right? So the only reason for this PR is to not do duck typing in your implementation of the rejection threshold? I think we'll add thresholding methods soon (hopefully) and I'm pretty sure they will contain the duck typing. I'm not sure why putting this logic in this particular place would be the right solution.
I'm not sure what other cases it would be useful in, but if you think it would, I can add it there as well. Yes, I was trying to implement a rejection threshold for OneVsRestClassifier. I can add an option to normalize or not to the constructor of OneVsRestClassifier instead.
My question was more why you didn't use decision_function.
I was comparing multiple algorithms, and not all of them have a decision_function. For the values, I was calibrating within the training set. Anyway, this was used for a publication which is already published, so feel free to close this PR without merging; it now seems not very necessary for other people, or for me, especially if there's going to be a rejection threshold implementation soon.
I would likewise find an optional normalization flag useful. In my case I would like to plot probability thresholds vs. recall/precision scores for the respective classes; however, since these are in a normalised form, I need to train independent classifiers (i.e. not use OneVsRestClassifier) to get the necessary probabilities. It would also be interesting from the perspective of considering alternate probability thresholds. Thanks.
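A sketch of that workaround under current sklearn (the dataset and estimator choices are illustrative): training one independent binary classifier per class yields probabilities that are never renormalized across classes, which can then feed precision_recall_curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

X, y = make_classification(n_samples=300, n_classes=3, n_informative=6,
                           random_state=0)
for cls in np.unique(y):
    # One independent one-vs-rest classifier per class, so its
    # predict_proba is not rescaled against the other classes.
    clf = LogisticRegression().fit(X, y == cls)
    proba = clf.predict_proba(X)[:, 1]
    precision, recall, thresholds = precision_recall_curve(y == cls, proba)
    print(cls, thresholds[:3], precision[:3], recall[:3])
```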
Reference Issues/PRs
None
What does this implement/fix? Explain your changes.
I added support for not normalizing the predict_proba result, returning it as-is even if the data is not multilabel. This is useful for implementing a rejection threshold.
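As an illustration of that use case, here is a minimal sketch of a rejection threshold built on the unnormalized per-class probabilities (predict_with_reject is a hypothetical helper, not part of this PR, and the 0.6 threshold is arbitrary):

```python
import numpy as np

def predict_with_reject(ovr, X, threshold=0.6):
    """Predict with a fitted OneVsRestClassifier, rejecting unsure samples."""
    # Positive-class probability from each underlying binary estimator,
    # i.e. without rescaling the rows to sum to one across classes.
    proba = np.array([e.predict_proba(X)[:, 1] for e in ovr.estimators_]).T
    labels = ovr.classes_[proba.argmax(axis=1)].astype(object)
    # Reject samples whose best unnormalized probability is below the
    # threshold; row-normalized probabilities would hide this information.
    labels[proba.max(axis=1) < threshold] = None
    return labels
```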
Any other comments?
I can also add thresholding in OneVsRestClassifier or somewhere else to reject classification results, but I didn't see anything like that in scikit-learn, so I don't know if it's a wanted feature.