deepmol.pipeline package

Submodules

deepmol.pipeline.pipeline module

class Pipeline(steps: List[Tuple[str, Transformer | Predictor]], path: str | None = None, hpo: HyperparameterOptimizer | None = None)

Bases: Transformer

Pipeline of transformers and predictors. It applies a list of transformers in sequence, optionally followed by a predictor. If a predictor is included, it must be the last step; all other steps must be transformers. Transformers must implement the fit() and transform() methods, and the predictor must implement the fit() and predict() methods.
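The chaining behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use DeepMol: `AddOne`, `MeanPredictor` and `ToyPipeline` are hypothetical stand-ins that operate on plain lists instead of Dataset objects, and the validation shown is only an assumption about how such a contract might be enforced.

```python
from typing import List, Tuple


class AddOne:
    """Hypothetical transformer: fit() learns nothing, transform() adds 1."""

    def fit(self, data):
        return self

    def transform(self, data):
        return [x + 1 for x in data]


class MeanPredictor:
    """Hypothetical predictor: always predicts the mean seen during fit()."""

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        return self

    def predict(self, data):
        return [self.mean_ for _ in data]


class ToyPipeline:
    """Stand-in for the chaining logic only (not DeepMol code)."""

    def __init__(self, steps: List[Tuple[str, object]]):
        # Every step except the last must expose transform().
        for name, step in steps[:-1]:
            if not hasattr(step, "transform"):
                raise ValueError(f"step {name!r} must be a transformer")
        self.steps = steps

    def is_prediction_pipeline(self) -> bool:
        # A prediction pipeline ends with a step that can predict().
        return hasattr(self.steps[-1][1], "predict")

    def fit(self, data):
        # Fit each step on the output of the previous transformers.
        for _, step in self.steps:
            step.fit(data)
            if hasattr(step, "transform"):
                data = step.transform(data)
        return self

    def predict(self, data):
        if not self.is_prediction_pipeline():
            raise ValueError("last step is not a predictor")
        for _, step in self.steps[:-1]:
            data = step.transform(data)
        return self.steps[-1][1].predict(data)


pipe = ToyPipeline([("add_one", AddOne()), ("mean", MeanPredictor())])
predictions = pipe.fit([1, 2, 3]).predict([10, 20])
print(predictions)
```

The predictor is fitted on the transformed training data, so the order of the steps matters.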

evaluate(dataset: Dataset, metrics: List[Metric], per_task_metrics: bool = False) → Tuple[Dict, None | Dict]

Evaluate the pipeline on a dataset based on the provided metrics.

Parameters:
  • dataset (Dataset) – Dataset to evaluate on.

  • metrics (List[Metric]) – List of metrics to compute.

  • per_task_metrics (bool) – Whether to return per-task metrics.

Returns:

  • multitask_scores (dict) – Dictionary mapping names of metrics to metric scores.

  • all_task_scores (dict or None) – Dictionary of scores for each task separately. Only returned if per_task_metrics is True; otherwise None.
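The shape of the returned tuple can be sketched without DeepMol. In the example below, `evaluate_sketch` and `mean_absolute_error` are hypothetical helpers, metrics are plain callables rather than Metric objects, and averaging the per-task scores to obtain the multitask score is an assumption made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Tuple


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    """Plain-Python MAE used as an example metric."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate_sketch(
    y_true_by_task: List[List[float]],
    y_pred_by_task: List[List[float]],
    metrics: Dict[str, Callable[[List[float], List[float]], float]],
    per_task_metrics: bool = False,
) -> Tuple[Dict[str, float], Optional[Dict[str, List[float]]]]:
    """Return (multitask_scores, all_task_scores or None)."""
    multitask_scores: Dict[str, float] = {}
    all_task_scores: Optional[Dict[str, List[float]]] = (
        {} if per_task_metrics else None
    )
    for name, metric in metrics.items():
        # One score per task, then an (assumed) average across tasks.
        task_scores = [
            metric(t, p) for t, p in zip(y_true_by_task, y_pred_by_task)
        ]
        multitask_scores[name] = sum(task_scores) / len(task_scores)
        if all_task_scores is not None:
            all_task_scores[name] = task_scores
    return multitask_scores, all_task_scores


# Two tasks with two samples each.
y_true = [[1.0, 2.0], [0.0, 4.0]]
y_pred = [[1.0, 3.0], [2.0, 4.0]]
metrics = {"mae": mean_absolute_error}

scores, per_task = evaluate_sketch(y_true, y_pred, metrics, per_task_metrics=True)
scores_only, nothing = evaluate_sketch(y_true, y_pred, metrics)
print(scores, per_task, nothing)
```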

fit(train_dataset: Dataset, validation_dataset: Dataset | None = None) → Pipeline

Fit the pipeline to the train data.

Parameters:
  • train_dataset (Dataset) – Dataset to fit the pipeline to.

  • validation_dataset (Dataset | None) – Dataset to validate the pipeline on. Only used when hpo is not None.

Returns:

self – Fitted pipeline.

Return type:

Pipeline

is_fitted() → bool

Whether the pipeline is fitted.

Returns:

is_fitted – Whether the pipeline is fitted.

Return type:

bool

is_prediction_pipeline() → bool

Whether the pipeline is a prediction pipeline.

Returns:

is_prediction_pipeline – Whether the pipeline is a prediction pipeline.

Return type:

bool

classmethod load(path: str) → Pipeline

Load the pipeline from disk. The sequence of transformers is loaded from a config file. The transformers and predictor are loaded from separate files. Transformers are loaded from pickle files, while the predictor is loaded using its own load method.

Parameters:

path (str) – Path to the directory where the pipeline is saved.

Returns:

pipeline – Loaded pipeline.

Return type:

Pipeline

predict(dataset: Dataset) → ndarray

Make predictions on a dataset using the pipeline predictor.

Parameters:

dataset (Dataset) – Dataset to make predictions on.

Returns:

y_pred – Predictions.

Return type:

np.ndarray

predict_proba(dataset: Dataset) → ndarray

Predict probabilities on a dataset using the pipeline predictor.

Parameters:

dataset (Dataset) – Dataset to make predictions on.

Returns:

y_pred – Predicted probabilities.

Return type:

np.ndarray

save()

Save the pipeline to disk (transformers and predictor). The sequence of transformers is saved in a config file. The transformers and predictor are saved in separate files. Transformers are saved as pickle files, while the predictor is saved using its own save method.
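The storage scheme described for save() and load() can be sketched as follows. This is not DeepMol's implementation: the file names (`config.json`, `<name>.pkl`, `predictor.json`), the helpers `save_steps` and `load_steps`, and the `ToyPredictor` class are all hypothetical. Transformers are represented by plain dicts of fitted parameters so that the pickles do not depend on custom classes.

```python
import json
import pickle
import tempfile
from pathlib import Path


class ToyPredictor:
    """Hypothetical predictor that knows how to save and load itself."""

    def __init__(self, bias: float):
        self.bias = bias

    def save(self, path: Path) -> None:
        path.write_text(json.dumps({"bias": self.bias}))

    @classmethod
    def load(cls, path: Path) -> "ToyPredictor":
        return cls(**json.loads(path.read_text()))


def save_steps(directory: Path, transformers, predictor) -> None:
    names = []
    for name, transformer in transformers:
        # Each transformer goes into its own pickle file.
        with open(directory / f"{name}.pkl", "wb") as handle:
            pickle.dump(transformer, handle)
        names.append(name)
    # The predictor uses its own save method.
    predictor.save(directory / "predictor.json")
    # The config file records the order of the transformers.
    (directory / "config.json").write_text(json.dumps({"transformers": names}))


def load_steps(directory: Path):
    config = json.loads((directory / "config.json").read_text())
    transformers = []
    for name in config["transformers"]:
        with open(directory / f"{name}.pkl", "rb") as handle:
            transformers.append((name, pickle.load(handle)))
    predictor = ToyPredictor.load(directory / "predictor.json")
    return transformers, predictor


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    save_steps(directory, [("scale", {"factor": 2.0})], ToyPredictor(0.5))
    loaded_transformers, loaded_predictor = load_steps(directory)

print(loaded_transformers, loaded_predictor.bias)
```

Keeping the step order in a separate config file means the individual files can be read back in the same sequence in which they were applied.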

Module contents