The Wayback Machine - https://web.archive.org/web/20250614204615/https://github.com/scikit-learn/scikit-learn/pull/19498

[MRG] Fix Improvement in RANSAC: is_data_valid should receive subset_idxs #19498


Open
wants to merge 3 commits into base: main

Conversation

alex90r7

@alex90r7 alex90r7 commented Feb 19, 2021

Reference Issues/PRs

Fixes #19495

What does this implement/fix? Explain your changes.

Any other comments?

@alex90r7 alex90r7 changed the title is_data_valid can receive the subset_idxs. Backward compatible Fix Improvement in RANSAC: is_data_valid should receive subset_idxs Feb 22, 2021
@alex90r7 alex90r7 changed the title Fix Improvement in RANSAC: is_data_valid should receive subset_idxs [MRG] Fix Improvement in RANSAC: is_data_valid should receive subset_idxs Feb 22, 2021
@glemaitre
Member

This solution is buggy because your subset of indices is based on the X given to fit, not on the subset X_subset used in each RANSAC iteration.

One possibility is to propagate sample_weight so that it becomes an input of is_data_valid; a weight of zero could then be detected inside the callable, which would then not consider that specific data point.

However, I don't understand why these samples should be included in the original X.
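A minimal sketch of the sample_weight idea above (names and signature are illustrative, not the scikit-learn API): a validity callable that rejects any random subset containing a zero-weight sample.

```python
import numpy as np

# Sketch only: propagate per-sample weights into the validity check so that
# a weight of zero marks a point the callable should reject.
def make_is_data_valid():
    def is_data_valid(X_subset, y_subset, weight_subset):
        # Reject the random subset if it contains any zero-weight sample.
        if np.any(weight_subset == 0):
            return False
        return True
    return is_data_valid

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))
y = rng.normal(size=10)
w = np.ones(10)
w[3] = 0.0  # mark sample 3 as "do not use"

check = make_is_data_valid()
subset = np.array([0, 3, 5])                   # a hypothetical RANSAC draw
print(check(X[subset], y[subset], w[subset]))  # subset contains sample 3 → False
```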

@alex90r7
Author

Please check my comment in the issue.
Also check the paper I mentioned:
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6856449
section II C:
"In each RANSAC iteration, three targets, containing at least one target of each sensor are chosen randomly, providing a unique solution for the velocity profile. "

@glemaitre
Member

So in the end, it is not that you want to reject samples but you want to enforce some constraints on the resampling.
To me, it looks like what we do in cross-validation with the grouping strategies.

I would think that we could add a new parameter (e.g. sampling_strategy) that defaults to random sampling without replacement. However, we could pass another strategy based on some group sampling, where the groups could be provided at fit as an array.
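The group-sampling idea could look roughly like the sketch below: draw min_samples indices while guaranteeing at least one index from every group (e.g. one target per sensor, as in the radar paper). The function name and signature are hypothetical, not an agreed design.

```python
import numpy as np

# Hypothetical group-aware sampler: one mandatory index per group, then fill
# the remaining slots uniformly without replacement.
def sample_with_groups(groups, min_samples, rng):
    groups = np.asarray(groups)
    unique = np.unique(groups)
    if min_samples < len(unique):
        raise ValueError("min_samples must cover every group")
    # One mandatory index per group...
    chosen = [rng.choice(np.flatnonzero(groups == g)) for g in unique]
    # ...then fill the rest uniformly from the remaining samples.
    remaining = np.setdiff1d(np.arange(len(groups)), chosen)
    extra = rng.choice(remaining, size=min_samples - len(chosen), replace=False)
    return np.concatenate([chosen, extra]).astype(int)

rng = np.random.default_rng(42)
groups = np.array([0, 0, 1, 1, 2, 2])  # two samples from each of three sensors
idx = sample_with_groups(groups, min_samples=3, rng=rng)
print(sorted(groups[idx].tolist()))    # one sample per group → [0, 1, 2]
```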

@alex90r7
Author

alex90r7 commented Aug 2, 2021

Yeap, now you got it.
Let me sum up:
I) A new optional argument called "sampling_strategy" will be given to fit.
II) If sampling_strategy is None, then the current sampling will be used:

    # choose random sample set
    subset_idxs = sample_without_replacement(n_samples, min_samples, random_state=random_state)
III) If sampling_strategy is not None, you suggested that:
"we could pass another strategy based on some group sampling where the groups could be provided at fit as an array."
Not sure if that is generic enough for all sampling_strategies. Why not pass a function that will apply the strategy, like the "is_data_valid"?
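The callable version of sampling_strategy might have a shape like the sketch below, mirroring how is_data_valid is already a user-supplied callable. The signature is a guess for illustration only, not a proposed scikit-learn interface.

```python
import numpy as np

# Hypothetical callable sampling_strategy: given the problem sizes, return
# the indices of the next random subset.
def my_sampling_strategy(n_samples, min_samples, random_state):
    rng = np.random.default_rng(random_state)
    # Example policy: always include sample 0, fill the rest at random.
    rest = rng.choice(np.arange(1, n_samples), size=min_samples - 1, replace=False)
    return np.concatenate([[0], rest])

idx = my_sampling_strategy(n_samples=10, min_samples=3, random_state=0)
print(0 in idx)  # → True
```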

@glemaitre
Member

The grouping sampling could be an option and we could also accept some callables.

@alex90r7
Author

alex90r7 commented Aug 2, 2021

That sounds nice. If you think that this interface is better, then let's do that.
Could you please provide a one-liner with the signature of the fit method? Then I could take it from there.
For example:
default case: fit(self, X, y, sample_weight=None, sampling_strategy=None)
non-default case: fit(self, X, y, sample_weight=None, sampling_strategy=callable)?
non-default case: fit(self, X, y, sample_weight=None, sampling_strategy=dict)?

@glemaitre
Member

The contract of scikit-learn estimators is that fit should not take parameters apart from sample properties (such as groups or sample_weight).

@alex90r7
Author

alex90r7 commented Aug 2, 2021

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html#sklearn.model_selection.GridSearchCV.fit

I think that this is not similar. I would personally go for an extra attribute in the __init__ of the class that would define the sampling strategy.
The signature of the fit method would remain the same: fit(self, X, y, sample_weight=None)

@glemaitre
Member

I think that this is not similar. I would personally go for an extra attribute in the init of the class that would define sampling strategy. The signature of the fit method would remain the same fit(self, X, y, sample_weight=None)

For sure, you will need to add the parameter in __init__. However, you need to pass along the extra information required by the sampling. The issue with a callable is that you need to provide a **kwargs to __init__ as well if you want to be flexible. It might start to be YAGNI. Limiting sampling_strategy to a few named options could be user-friendly enough and could answer most of the needs.

Basically, what would be interesting is to know what other types of sampling one might need.
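The split discussed above — strategy in __init__, group labels passed to fit as a sample property next to sample_weight — can be sketched as follows. The class and parameter names are hypothetical, not scikit-learn code.

```python
# Illustrative sketch only: the strategy lives in __init__, while the
# per-sample group labels travel through fit as a sample property.
class RANSACSketch:
    def __init__(self, sampling_strategy=None):
        # None    -> plain random sampling without replacement,
        # "group" -> draw at least one sample from every group.
        self.sampling_strategy = sampling_strategy

    def fit(self, X, y, sample_weight=None, groups=None):
        if self.sampling_strategy == "group" and groups is None:
            raise ValueError("sampling_strategy='group' requires groups at fit")
        # ... the resampling loop would consult self.sampling_strategy here ...
        return self
```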


Successfully merging this pull request may close these issues.

Improvement in RANSAC: is_data_valid should receive subset_idxs
2 participants