
[MRG] Online implementation of non-negative matrix factorization #16948

Merged
merged 349 commits into scikit-learn:main from modified_nmf_for_minibatch on Apr 22, 2022

Conversation

Member

@cmarmo cmarmo commented Apr 17, 2020

Reference Issues/PRs

Continues #13386.
Aims to fix #13308, fix #13326.

What does this implement/fix? Explain your changes.

Implements online non-negative matrix factorization, following
"Online algorithms for nonnegative matrix factorization with the Itakura-Saito divergence", A. Lefevre, F. Bach, C. Févotte, 2011.

Member Author

@cmarmo cmarmo commented Aug 30, 2020

@GaelVaroquaux, @jeremiedbb this is still WIP, but I feel like its status needs an update.
I have rerun the benchmarks for commit 0020eb6; the result is below.
[benchmark plot: bench_topics_drago]
With respect to the previous implementation:

  • I am no longer hardcoding the number of iterations for the minibatch algorithm: the convergence time is still better than for the classical NMF;
  • I have introduced the forgetting factor: a 'weight' to be applied to old batches (hardcoded for now).

I have also checked that the minibatch algorithm gives the same results as the single batch one, when the batch size is set to the number of samples (I have added a test for that).
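
For reference, a minimal sketch of the kind of equivalence check described above, assuming the class name MiniBatchNMF and its batch_size parameter from the merged API; the comparison below is only illustrative, not the actual test:

```python
import numpy as np
from sklearn.decomposition import NMF, MiniBatchNMF

rng = np.random.RandomState(0)
X = np.abs(rng.randn(48, 5))  # small non-negative toy matrix

# With batch_size equal to n_samples, every "mini-batch" is the full dataset,
# so the online updates should behave like the classical multiplicative updates.
nmf = NMF(n_components=3, solver="mu", random_state=0, max_iter=500)
mbnmf = MiniBatchNMF(n_components=3, batch_size=X.shape[0], random_state=0,
                     max_iter=500)

nmf.fit(X)
mbnmf.fit(X)

# Compare the final losses rather than the factors themselves, since W and H
# are only identifiable up to scaling and permutation.
print(nmf.reconstruction_err_, mbnmf.reconstruction_err_)
```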

From the plot, the loss is greater in the minibatch implementation, but in some cases it seems to be comparable. I am planning to investigate the role of the forgetting factor on the loss: from Févotte et al. it seems that this factor, and therefore a good solution, depends on the number of samples and the batch size.
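
To make the forgetting factor concrete, here is a rough NumPy sketch of the general pattern in this family of online algorithms: the sufficient statistics accumulated from past batches are scaled down by the factor before the current batch's contribution is added. Everything below (names, the simplified Frobenius-style updates, the fixed number of inner iterations) is an illustrative assumption, not the PR's code.

```python
import numpy as np

def online_nmf_sketch(X, n_components=3, batch_size=16, forget_factor=0.7,
                      n_epochs=5, inner_iter=10, seed=0):
    """Toy online NMF with multiplicative updates and a forgetting factor.

    Illustrative only: squared (Frobenius) loss instead of Itakura-Saito,
    no convergence checks, no safeguards beyond a small epsilon.
    """
    rng = np.random.RandomState(seed)
    n_samples, n_features = X.shape
    eps = 1e-10
    H = np.abs(rng.randn(n_components, n_features))
    A = np.zeros((n_components, n_features))    # accumulated numerator stats
    B = np.zeros((n_components, n_components))  # accumulated denominator stats

    for _ in range(n_epochs):
        for start in range(0, n_samples, batch_size):
            Xb = X[start:start + batch_size]
            # Estimate the batch's W with H fixed (a few multiplicative updates).
            Wb = np.abs(rng.randn(Xb.shape[0], n_components))
            for _ in range(inner_iter):
                Wb *= (Xb @ H.T) / (Wb @ H @ H.T + eps)
            # Forgetting factor: down-weight statistics from older batches
            # before adding the current batch's contribution.
            A = forget_factor * A + Wb.T @ Xb
            B = forget_factor * B + Wb.T @ Wb
            # Multiplicative update of H from the accumulated statistics.
            H *= A / (B @ H + eps)
    return H
```

With a forgetting factor of 1 all batches are weighted equally; smaller values make the update track recent batches more closely.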

Here is what is still needed for this PR:

  • investigate the role of the forgetting factor in the loss estimation and improve it (it may turn out that the expected loss should be investigated instead);
  • make the forgetting factor a parameter and suggest possible optimizations depending on the dimensions of the problem
  • avoid code duplication (non_negative_factorization and non_negative_factorization_online are quite the same function right now)
  • improve tests (I am open to suggestions for testing the lines shown in the screenshot below)
    [screenshot: lines needing test coverage]
  • write documentation (!)

Thanks for listening, and let me know if you have any comments or suggestions.

jeremiedbb added 2 commits Mar 25, 2022
Co-authored-by: Patricio Cerda <pcerda>
Member

@adrinjalali adrinjalali left a comment

I did not check the math; the rest looks pretty good.

Resolved review threads on doc/whats_new/v1.1.rst and sklearn/decomposition/_nmf.py.
        return self

    def _solve_W(self, X, H, max_iter):
        """Minimize the objective function w.r.t W"""
Member

@adrinjalali adrinjalali Apr 7, 2022


It would make it easier for somebody like me to read/review if we could explain what these methods do in the docstrings.

Member

@jeremiedbb jeremiedbb Apr 8, 2022


I tried to make it more explicit. Let me know what you think.
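
For illustration, the kind of expanded docstring being asked for might look like the following; the wording and the claim about where the helper is used are a sketch, not the merged text:

```python
# Illustrative docstring sketch (not the merged wording):
def _solve_W(self, X, H, max_iter):
    """Minimize the objective function w.r.t W, keeping H fixed.

    Runs at most `max_iter` multiplicative updates of W. This is the kind
    of helper `transform` would rely on: for new data only W needs to be
    estimated, while the learned components H stay unchanged.
    """
```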

        return W

    def partial_fit(self, X, y=None, W=None, H=None):
        """Update the model using the data in X as a mini-batch.
Member

@adrinjalali adrinjalali Apr 7, 2022


Could we explain how a user could use partial_fit on a dataset to get the same result as running fit on its entirety? It would make it easier for people to decide how to use it.

Member

@jeremiedbb jeremiedbb Apr 8, 2022


I improved the docstring and linked to the doc of incremental learning. Maybe we could add a section there to explain in more detail how to use partial_fit, but I think it should be done in a separate PR since it concerns many estimators.
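
A minimal sketch of the partial_fit pattern being discussed, assuming the merged MiniBatchNMF API; the chunking scheme, number of passes, and parameters are illustrative:

```python
import numpy as np
from sklearn.decomposition import MiniBatchNMF

rng = np.random.RandomState(0)
X = np.abs(rng.randn(100, 20))  # stand-in for data arriving in chunks

nmf = MiniBatchNMF(n_components=5, random_state=0)

# Each call to partial_fit performs a single update on the given chunk, so
# several passes over the chunks may be needed to get close to fit(X).
for _ in range(10):
    for chunk in np.array_split(X, 10):
        nmf.partial_fit(chunk)

W = nmf.transform(X)  # project the full dataset onto the learned components
```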

Contributor

@glemaitre glemaitre left a comment

I just posted the comment that I had from before. Feel free to discard it if it is outdated now.

Resolved review thread on sklearn/decomposition/_nmf.py.
        self._check_params(X)

        if X.min() == 0 and self._beta_loss <= 0:
            raise ValueError(
Contributor

@glemaitre glemaitre Apr 20, 2022


Apparently, we don't check for this error in the tests either.

Member

@jeremiedbb jeremiedbb Apr 21, 2022


Added a test
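
The added test presumably looks something like the following sketch; the test name, data, and error-message fragment are assumptions rather than the merged code:

```python
import numpy as np
import pytest
from sklearn.decomposition import MiniBatchNMF

def test_minibatch_nmf_negative_beta_loss_with_zeros():
    # With beta_loss <= 0 (e.g. the Itakura-Saito divergence) the solver can
    # diverge when X contains exact zeros, so fit is expected to raise.
    rng = np.random.RandomState(0)
    X = np.abs(rng.randn(6, 5))
    X[0, 0] = 0  # introduce an exact zero
    nmf = MiniBatchNMF(n_components=2, beta_loss="itakura-saito")
    with pytest.raises(ValueError, match="beta_loss"):
        nmf.fit(X)
```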

Two resolved review threads on sklearn/decomposition/_nmf.py.
Contributor

@glemaitre glemaitre left a comment

A review mainly with formatting nitpicks on the documentation.

Two resolved review threads on doc/modules/decomposition.rst.
.. math::
    0.5 * ||X - WH||_{loss}^2
Contributor

@glemaitre glemaitre Apr 20, 2022


I don't know if . would be better than * for the multiplication.

Member

@jeremiedbb jeremiedbb Apr 21, 2022


We use * everywhere. Let's stay consistent.

Seven resolved review threads on sklearn/decomposition/_nmf.py.
Contributor

@glemaitre glemaitre left a comment

LGTM once the doc nitpicks are included.

batch_size = 3
max_iter = 1000

rng = np.random.mtrand.RandomState(42)
Contributor

@glemaitre glemaitre Apr 20, 2022


Just a question here: do we want to add support for the global random state fixture?

Member

@jeremiedbb jeremiedbb Apr 20, 2022


Given the mess it caused so far, I'd rather do that very carefully in a follow-up PR :)
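
For context, supporting the global random state fixture would roughly mean replacing the hardcoded seed with scikit-learn's global_random_seed pytest fixture, so the test has to hold for any seed chosen by the runner. A hedged sketch, with the test body and assertion being illustrative:

```python
import numpy as np
from sklearn.decomposition import MiniBatchNMF

def test_minibatch_nmf_with_global_seed(global_random_seed):
    # The fixture injects seeds chosen by the test runner, so the assertion
    # below must hold for any seed, not only the hardcoded 42.
    rng = np.random.RandomState(global_random_seed)
    X = np.abs(rng.randn(48, 5))
    nmf = MiniBatchNMF(n_components=3, batch_size=3, max_iter=1000,
                       random_state=global_random_seed).fit(X)
    assert nmf.reconstruction_err_ >= 0
```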

Contributor

@glemaitre glemaitre left a comment

LGTM.

@glemaitre glemaitre merged commit 69132eb into scikit-learn:main Apr 22, 2022
22 of 28 checks passed
Member

@jeremiedbb jeremiedbb commented Apr 22, 2022

Thanks @cmarmo !

Contributor

@glemaitre glemaitre commented Apr 22, 2022

Thanks everyone.

Member

@jjerphan jjerphan commented Apr 22, 2022

Thanks to everyone involved.

@cmarmo cmarmo deleted the modified_nmf_for_minibatch branch Apr 22, 2022