Store spectral embeddings in SpectralClustering #26061

matteo-bastico · 2023-04-03T08:47:54Z

Describe the workflow you want to enable

Save the spectral embeddings used for clustering in the SpectralClustering class and expose them through an attribute, e.g. maps_, to make post-processing of the clusters easier.

Describe your proposed solution

Optionally return the maps from the spectral_clustering function, controlled by a new parameter:

def spectral_clustering(
    affinity,
    *,
    n_clusters=8,
    n_components=None,
    eigen_solver=None,
    random_state=None,
    n_init=10,
    eigen_tol="auto",
    assign_labels="kmeans",
    verbose=False,
    return_maps=False,  # proposed new parameter
):
    ...

    if return_maps:
        return maps, labels
    return labels

Store the maps_ attribute in the fit method of the SpectralClustering class:

self.maps_, self.labels_ = spectral_clustering(
    self.affinity_matrix_,
    n_clusters=self.n_clusters,
    n_components=self.n_components,
    eigen_solver=self.eigen_solver,
    random_state=random_state,
    n_init=self.n_init,
    eigen_tol=self.eigen_tol,
    assign_labels=self.assign_labels,
    verbose=self.verbose,
    return_maps=True,
)

Describe alternatives you've considered, if relevant

No response

Additional context

No response

ogrisel · 2023-04-06T15:42:36Z

Save the spectral embeddings used for clustering in the SpectralClustering class and make them accessible through an attribute, e.g. maps_, to make easier post-processing on the clusters.

@matteo-bastico it would help us decide if you could explain how you would use this.

thomasjpfan · 2023-04-06T16:28:09Z

As a workaround, you may recompute the mapping after fitting SpectralClustering:

from sklearn.cluster import SpectralClustering
from sklearn.manifold import spectral_embedding
import numpy as np

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]])
clustering = SpectralClustering(n_clusters=2,
        assign_labels='discretize',
        random_state=0).fit(X)

maps = spectral_embedding(
    clustering.affinity_matrix_,
    n_components=clustering.n_clusters,
    eigen_solver=clustering.eigen_solver,
    random_state=0,
    eigen_tol=clustering.eigen_tol,
    drop_first=False,
)
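The workaround above passes n_clusters as n_components, which matches the estimator only when n_components was left at None. A minimal sketch of a more general helper follows; recompute_maps is a hypothetical name, not part of scikit-learn, and it assumes random_state was given as an integer (a RandomState instance would have advanced during fit, so the recomputed embedding could differ).

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.manifold import spectral_embedding


def recompute_maps(clustering):
    """Recompute the spectral embedding of a fitted SpectralClustering.

    Hypothetical helper: it repeats the eigendecomposition that fit()
    already performed, so it costs a second embedding computation.
    """
    # SpectralClustering documents that n_components defaults to n_clusters.
    n_components = clustering.n_components
    if n_components is None:
        n_components = clustering.n_clusters
    return spectral_embedding(
        clustering.affinity_matrix_,
        n_components=n_components,
        eigen_solver=clustering.eigen_solver,
        random_state=clustering.random_state,  # assumed to be an int or None
        eigen_tol=clustering.eigen_tol,
        drop_first=False,
    )


X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]])
clustering = SpectralClustering(n_clusters=2,
                                assign_labels='discretize',
                                random_state=0).fit(X)
maps = recompute_maps(clustering)  # one row per sample, one column per component
```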

matteo-bastico · 2023-04-11T11:42:38Z

Save the spectral embeddings used for clustering in the SpectralClustering class and make them accessible through an attribute, e.g. maps_, to make easier post-processing on the clusters.

@matteo-bastico it would help us decide if you could explain how you would use this.

In my case, I want to compute the medoids of the clusters using the distances in the spectral embedding space instead of the original Euclidean space. But there are other applications in which this feature can be useful.
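The medoid computation described here can be sketched as follows, assuming the embedding and the labels are already available as arrays. cluster_medoids is a hypothetical helper and the toy embedding is a stand-in for real spectral maps; the medoid of a cluster is taken to be the member with the smallest summed Euclidean distance to the other members in embedding space.

```python
import numpy as np
from sklearn.metrics import pairwise_distances


def cluster_medoids(embedding, labels):
    """Map each cluster label to the index of its medoid sample.

    Distances are measured between rows of `embedding`, i.e. in the
    embedding space rather than the original feature space.
    """
    labels = np.asarray(labels)
    medoids = {}
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        # Pairwise Euclidean distances among the members of this cluster.
        dist = pairwise_distances(embedding[members])
        # The medoid minimizes the total distance to the other members.
        medoids[int(label)] = int(members[dist.sum(axis=1).argmin()])
    return medoids


# Toy stand-in for an (n_samples, n_components) spectral embedding.
toy_embedding = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0],
                          [10.0, 0.0], [11.0, 0.0], [12.0, 0.0]])
toy_labels = np.array([0, 0, 0, 1, 1, 1])
medoids = cluster_medoids(toy_embedding, toy_labels)
```

The returned indices refer to rows of the original data, so the medoid samples can be looked up with X[list(medoids.values())].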

As a workaround, you may recompute the mapping after fitting SpectralClustering:

from sklearn.cluster import SpectralClustering
from sklearn.manifold import spectral_embedding
import numpy as np

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]])
clustering = SpectralClustering(n_clusters=2,
        assign_labels='discretize',
        random_state=0).fit(X)

maps = spectral_embedding(
    clustering.affinity_matrix_,
    n_components=clustering.n_clusters,
    eigen_solver=clustering.eigen_solver,
    random_state=0,
    eigen_tol=clustering.eigen_tol,
    drop_first=False,
)

Thank you. As a workaround it works, but the spectral embeddings are computed twice, which is time-consuming for large matrices.
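One way to avoid the second computation is to bypass SpectralClustering, call spectral_embedding once, and cluster the result directly. The sketch below is an illustration under stated assumptions, not the library's implementation: it assumes an RBF affinity with gamma=1.0 (the SpectralClustering defaults) and k-means label assignment, and the resulting labels are not guaranteed to be numbered the same way as those of SpectralClustering.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import spectral_embedding
from sklearn.metrics.pairwise import rbf_kernel

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]], dtype=float)
n_clusters = 2

# Affinity matrix; gamma=1.0 is an assumption matching the estimator default.
affinity = rbf_kernel(X, gamma=1.0)

# Compute the embedding once and keep it for later post-processing.
maps = spectral_embedding(
    affinity,
    n_components=n_clusters,
    random_state=0,
    drop_first=False,
)

# Cluster the embedded points; `maps` stays available, e.g. for medoids.
labels = KMeans(n_clusters=n_clusters, n_init=10,
                random_state=0).fit_predict(maps)
```

The trade-off is that the estimator conveniences (affinity construction options, the discretize and cluster_qr assignment strategies) have to be handled by hand.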

matteo-bastico added the Needs Triage and New Feature labels Apr 3, 2023

thomasjpfan added the Needs Info label and removed the Needs Triage label Apr 6, 2023

github-actions bot assigned enigdata Apr 10, 2023
