deepmol.unsupervised package

Submodules

deepmol.unsupervised.base_unsupervised module

class KMeans(**kwargs)

Bases: UnsupervisedLearn

Class to perform K-Means clustering.

Wrapper around scikit-learn K-Means. (https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans)

plot(x_new: ndarray, path: Optional[str] = None, **kwargs) → None

Plot the results of the clustering.

Parameters
  • x_new (np.ndarray) – Transformed dataset.

  • path (str) – Path to save the plot.

  • **kwargs – Additional arguments for the plot.
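For orientation, the clustering idea behind K-Means can be sketched in a few lines of standard-library Python. This is only an illustration of plain Lloyd iterations, not the code scikit-learn or this wrapper runs; the function name `kmeans`, the toy points, and the hand-picked starting centroids are assumptions made for the example.

```python
import math


def kmeans(points, centroids, n_iter=10):
    """Plain Lloyd iterations, starting from the given centroids."""
    centroids = list(centroids)  # work on a copy of the caller's list
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, centers = kmeans(points, [(0.0, 0.0), (5.0, 5.0)])
print(labels)   # [0, 0, 1, 1]
print(centers)  # [(0.0, 0.5), (5.0, 5.5)]
```

The cluster labels are the kind of information a cluster plot would typically use to color the points.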

class PCA(**kwargs)

Bases: UnsupervisedLearn

Class to perform principal component analysis (PCA).

Wrapper around scikit-learn PCA (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html#sklearn.decomposition.PCA)

Linear dimensionality reduction using Singular Value Decomposition of the data to project it to a lower dimensional space.

plot(x_new: ndarray, path: Optional[str] = None, **kwargs) → None

Plot the results of unsupervised learning (PCA).

Parameters
  • x_new (np.ndarray) – Transformed values, of shape (n_samples, n_components).

  • path (str) – Path to save the plot.

  • **kwargs – Additional arguments to pass to the plot method.

plot_explained_variance(path: Optional[str] = None, **kwargs) → None

Plot the explained variance.

Parameters
  • path (str) – Path to save the plot.

  • **kwargs – Additional arguments to pass to the plot method.
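To make the SVD-based projection and the notion of explained variance concrete, here is a minimal NumPy sketch. It is not the wrapper's or scikit-learn's implementation; the helper name `pca_svd` and the random input matrix are assumptions for illustration only.

```python
import numpy as np


def pca_svd(x: np.ndarray, n_components: int):
    """Project x onto its top principal components using SVD."""
    # Center each feature; PCA is defined on mean-centered data.
    x_centered = x - x.mean(axis=0)
    # Thin SVD: x_centered = u @ diag(s) @ vt; rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(x_centered, full_matrices=False)
    # Coordinates of each sample in the reduced space.
    x_new = x_centered @ vt[:n_components].T
    # Variance captured by each component, as a fraction of the total.
    explained_variance = s ** 2 / (x.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    return x_new, ratio[:n_components]


rng = np.random.default_rng(0)
x = rng.normal(size=(100, 5))  # hypothetical stand-in for a feature matrix
x_new, ratio = pca_svd(x, n_components=2)
print(x_new.shape)  # (100, 2)
print(ratio)        # fractions of variance, largest first
```

The per-component ratios are the quantities an explained-variance plot usually displays, often as a cumulative curve.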

class TSNE(**kwargs)

Bases: UnsupervisedLearn

Class to perform t-distributed Stochastic Neighbor Embedding (t-SNE).

Wrapper around scikit-learn TSNE (https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn.manifold.TSNE)

It converts similarities between data points to joint probabilities and tries to minimize the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and the high-dimensional data.

plot(x_new: ndarray, path: Optional[str] = None, **kwargs) → None

Plot the results of unsupervised learning.

Parameters
  • x_new (np.ndarray) – Transformed values.

  • path (str) – The path to save the plot.

  • **kwargs – Additional arguments to pass to the plot function.
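The objective mentioned in the class description, the Kullback-Leibler divergence between the joint probabilities of the two spaces, can be illustrated with a small standard-library sketch. The probability lists below are made-up values and `kl_divergence` is a hypothetical helper; real t-SNE derives these probabilities from pairwise distances and minimizes the divergence by gradient descent.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) for two discrete distributions given as equal-length lists."""
    # Terms with p_i == 0 contribute nothing (0 * log 0 is taken as 0).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical joint probabilities over three pairs of points.
p_high = [0.5, 0.3, 0.2]  # similarities in the original space
q_good = [0.5, 0.3, 0.2]  # embedding that preserves them exactly
q_poor = [0.2, 0.3, 0.5]  # embedding that distorts them

print(kl_divergence(p_high, q_good))  # 0.0
print(kl_divergence(p_high, q_poor))  # a positive value
```

A lower divergence means the low-dimensional similarities match the high-dimensional ones more closely.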

class UnsupervisedLearn

Bases: ABC

Class for unsupervised learning.

An UnsupervisedLearn object receives a Dataset object and performs unsupervised learning on it.

Subclasses need to implement a _run_unsupervised method to perform unsupervised learning.

abstract plot(x_new: ndarray, path: Optional[str] = None, **kwargs) → None

Plot the results of unsupervised learning.

Parameters
  • x_new (np.ndarray) – Transformed values.

  • path (str) – The path to save the plot.

  • **kwargs – Additional arguments to pass to the plot function.

run_unsupervised(dataset: Dataset, **kwargs) → SmilesDataset

Run unsupervised learning.

Parameters
  • dataset (Dataset) – The dataset to perform unsupervised learning.

  • **kwargs – Additional arguments to pass to the _run_unsupervised method.

Returns

df – The dataset with the unsupervised features in dataset.X.

Return type

SmilesDataset
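The relationship between the public run_unsupervised method, the subclass hook it delegates to, and the abstract plot method can be sketched with the standard abc module. This is a hypothetical stand-in, not DeepMol's source: the class names `SketchUnsupervised` and `MeanCentering` are invented, and plain nested lists replace the Dataset and SmilesDataset types.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class SketchUnsupervised(ABC):
    """Hypothetical stand-in for an UnsupervisedLearn-style base class."""

    def run_unsupervised(self, data: List[List[float]], **kwargs) -> List[List[float]]:
        # Public entry point: delegate to the hook each subclass provides.
        return self._run_unsupervised(data, **kwargs)

    @abstractmethod
    def _run_unsupervised(self, data, **kwargs):
        ...

    @abstractmethod
    def plot(self, x_new, path: Optional[str] = None, **kwargs) -> None:
        ...


class MeanCentering(SketchUnsupervised):
    """Toy subclass: subtracts the column means from every row."""

    def _run_unsupervised(self, data, **kwargs):
        n = len(data)
        means = [sum(col) / n for col in zip(*data)]
        return [[v - m for v, m in zip(row, means)] for row in data]

    def plot(self, x_new, path=None, **kwargs):
        # Placeholder only (assumption): a real subclass would draw a figure here.
        print(f"{len(x_new)} rows to plot; path={path}")


centered = MeanCentering().run_unsupervised([[1.0, 2.0], [3.0, 6.0]])
print(centered)  # [[-1.0, -2.0], [1.0, 2.0]]
```

Because both hooks are abstract, instantiating the base class directly raises a TypeError, so every concrete subclass must supply both methods.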

deepmol.unsupervised.umap module

class UMAP(parametric: bool = True, **kwargs)

Bases: UnsupervisedLearn

Class to perform Uniform Manifold Approximation and Projection (UMAP).

Wrapper around umap package. (https://github.com/lmcinnes/umap)

plot(x_new: ndarray, path: Optional[str] = None, **kwargs) → None

Plot the UMAP embedding.

Parameters
  • x_new (np.ndarray) – The new features.

  • path (str) – The path to save the plot.

  • **kwargs – Additional keyword arguments for the plot.

Module contents