seqme.metrics.Recall
- class seqme.metrics.Recall(n_neighbors, reference, embedder, *, batch_size=256, device='cpu', strict=True, name='Recall')
Evaluates how well the reference data is covered by the generated sequences.
Computes the Improved Recall metric [1], which measures the fraction of reference embeddings that lie within the manifold estimated from the generated sequence embeddings. This metric quantifies coverage, i.e., the degree to which the generated samples represent the full reference distribution. Following [1], the manifold of the generated embeddings is approximated by a union of nearest-neighbor balls: one ball is centered on each embedding, with a radius equal to the distance to its n_neighbors-th nearest neighbor.
Its value ranges from 0 to 1, representing the fraction of reference embeddings that are effectively captured by the generated data.
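The computation described above can be sketched in a few lines of plain Python. This is a minimal illustration of the metric as defined in [1], not seqme's implementation: the helper names `knn_radii` and `improved_recall` and the toy 2-D points are hypothetical, and the sketch assumes that a point is not counted as its own neighbor and that there are more generated points than `k`.

```python
import math


def knn_radii(points, k):
    """Distance from each point to its k-th nearest neighbor among the other points."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def improved_recall(reference, generated, k):
    """Fraction of reference points that fall inside at least one generated k-NN ball."""
    radii = knn_radii(generated, k)
    covered = sum(
        any(math.dist(r, g) <= radius for g, radius in zip(generated, radii))
        for r in reference
    )
    return covered / len(reference)


# Toy 2-D "embeddings" (hypothetical data, not produced by a real embedder).
generated = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
reference = [(0.5, 0.5), (0.2, 0.1), (5.0, 5.0)]

score = improved_recall(reference, generated, k=1)
print(score)  # two of the three reference points are covered
```

Here the first two reference points lie inside a ball around a generated point, while the outlier at (5.0, 5.0) does not, so the score is 2/3. The brute-force distance loop is quadratic and only meant to make the definition concrete.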
References
- [1] Kynkäänniemi et al., “Improved precision and recall metric for assessing generative models”, NeurIPS 2019
- __init__(n_neighbors, reference, embedder, *, batch_size=256, device='cpu', strict=True, name='Recall')
Initialize the Improved Recall metric.
- Parameters:
n_neighbors (int) – Number of nearest neighbors used to define the radii of the nearest-neighbor balls. More neighbors result in larger radii.
reference (list[str]) – List of reference sequences used to build the reference manifold.
embedder (Callable[[list[str]], ndarray]) – Function mapping sequences to embeddings.
batch_size (int) – Number of samples per batch when computing distances.
device (str) – Compute device, e.g., "cpu" or "cuda".
strict (bool) – If True, enforces an equal number of evaluation and reference samples.
name (str) – Metric name.
- Raises:
ValueError – If n_neighbors < 1.
ValueError – If reference contains fewer than 1 sequence after embedding.
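The note that more neighbors result in larger radii can be checked with a small stand-alone example. This is only a sketch: `kth_neighbor_distance` is a hypothetical helper, the 1-D points are made up, and it assumes that a point is excluded from its own neighbor list, which may differ from how seqme counts neighbors.

```python
import math


def kth_neighbor_distance(points, index, k):
    """Distance from points[index] to its k-th nearest neighbor (self excluded)."""
    others = [q for j, q in enumerate(points) if j != index]
    return sorted(math.dist(points[index], q) for q in others)[k - 1]


# Toy 1-D "embeddings" (hypothetical values).
points = [(0.0,), (1.0,), (3.0,), (7.0,)]

# Radius of the ball around the first point for increasing neighbor counts.
radii_by_k = {k: kth_neighbor_distance(points, 0, k) for k in (1, 2, 3)}
print(radii_by_k)  # {1: 1.0, 2: 3.0, 3: 7.0}
```

Because the radius is the distance to the k-th nearest neighbor, it can never shrink as k grows. Larger balls cover more of the embedding space, so a larger n_neighbors generally yields a more permissive (higher) recall estimate.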
- __call__(sequences)
Compute the Improved Recall of the sequences.
- Parameters:
- Returns:
Improved Recall.
- Return type:
Methods
Attributes
Name of the metric.
Whether lower or higher scores indicate better performance.