seqme.metrics.ConformityScore

class seqme.metrics.ConformityScore(reference, predictors, *, n_splits=5, kde_bandwidth='silverman', seed=0, name='Conformity score')

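Distributional conformity score [1]. Scores how well the evaluated sequences conform to the distribution of the reference sequences, using predictor-derived features and a cross-validated Gaussian KDE.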

References

[1] Frey et al., “Protein Discovery With Discrete Walk-jump Sampling” (2024). https://arxiv.org/abs/2306.12360

__init__(reference, predictors, *, n_splits=5, kde_bandwidth='silverman', seed=0, name='Conformity score')

Initialize the metric.

Parameters:
  • reference (list[str]) – Reference sequences assumed to represent the target distribution.

  • predictors (list[Callable[[list[str]], ndarray]]) – A list of predictor functions. Each should take a list of sequences and return a 1D NumPy array of features.

  • n_splits (int) – Number of cross-validation folds for KDE.

  • kde_bandwidth (Union[float, Literal['scott', 'silverman']]) – Bandwidth for the Gaussian KDE: either a float or the name of a rule-of-thumb estimator ('scott' or 'silverman').

  • seed (int) – Seed for deterministic k-fold shuffling.

  • name (str) – Metric name.
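
As an illustration of the predictors contract described above, here is a minimal sketch of two predictor functions. The names sequence_length and hydrophobic_fraction, and the residue set used, are hypothetical and not part of seqme; the sketch only assumes that a predictor maps a list of sequences to a 1D NumPy array.

```python
import numpy as np

# Hypothetical predictors for illustration only; neither is part of seqme.
HYDROPHOBIC = set("AILMFVWY")  # assumed residue set, chosen for the example


def sequence_length(sequences: list[str]) -> np.ndarray:
    """Return the length of each sequence as a 1D float array."""
    return np.array([len(s) for s in sequences], dtype=float)


def hydrophobic_fraction(sequences: list[str]) -> np.ndarray:
    """Return the fraction of hydrophobic residues in each sequence."""
    return np.array(
        [sum(c in HYDROPHOBIC for c in s) / max(len(s), 1) for s in sequences],
        dtype=float,
    )


predictors = [sequence_length, hydrophobic_fraction]

# Stack the predictor outputs into one feature row per sequence.
features = np.column_stack([p(["ACDK", "LLVA", "GG"]) for p in predictors])
```

A list like this could then be supplied as the predictors argument, together with a list of reference sequences.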

__call__(sequences)

Compute the conformity score for the given sequences.

Parameters:
  • sequences (list[str]) – Sequences to evaluate.

Returns:

Mean and standard error of the conformity scores across all folds.

Return type:

MetricResult
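
To make "mean and standard error across folds" concrete, the following is a rough sketch of one way a cross-validated KDE conformity score could be computed. It is not seqme's implementation: the function conformity_sketch, its rank-based scoring rule, and the synthetic features are all assumptions made for illustration. It uses scipy.stats.gaussian_kde for the density estimate.

```python
import numpy as np
from scipy.stats import gaussian_kde


def conformity_sketch(ref_feats, gen_feats, n_splits=5, bandwidth="silverman", seed=0):
    """Illustrative only (hypothetical helper, not the seqme API).

    ref_feats, gen_feats: arrays of shape (n_samples, n_features).
    Returns the mean and standard error of the per-fold scores.
    """
    rng = np.random.default_rng(seed)
    # Shuffle reference indices deterministically and split them into folds.
    folds = np.array_split(rng.permutation(len(ref_feats)), n_splits)
    scores = []
    for i, held_out in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # gaussian_kde expects data with shape (n_features, n_samples).
        kde = gaussian_kde(ref_feats[train].T, bw_method=bandwidth)
        ref_density = kde(ref_feats[held_out].T)
        gen_density = kde(gen_feats.T)
        # For each evaluated point: fraction of held-out reference points
        # that are no more likely than it under the fitted KDE.
        ranks = (ref_density[None, :] <= gen_density[:, None]).mean(axis=1)
        scores.append(ranks.mean())
    scores = np.asarray(scores)
    return scores.mean(), scores.std(ddof=1) / np.sqrt(n_splits)


# Synthetic features standing in for predictor outputs.
rng = np.random.default_rng(0)
ref = rng.normal(size=(200, 2))
mean_in, se_in = conformity_sketch(ref, rng.normal(size=(50, 2)))
mean_out, se_out = conformity_sketch(ref, rng.normal(loc=6.0, size=(50, 2)))
```

In this sketch, features drawn from the same distribution as the reference receive a higher mean score than features drawn far from it, and the standard error reflects variation between folds.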

Methods

  • __init__(reference, predictors, *[, ...]) – Initialize the metric.

  • __call__(sequences) – Compute the conformity score for the given sequences.

Attributes

  • name – Name of the metric.

  • objective – Whether lower or higher scores indicate better performance.