seqme.metrics.Diversity

class seqme.metrics.Diversity(k=None, *, seed=0, name='Diversity')

Measures the diversity of synthetic sequences using normalized pairwise Levenshtein distance.

Evaluates how similar or different the synthetic sequences are relative to each other in the sequence space. Higher values indicate greater diversity, while lower values indicate more similarity or redundancy among sequences.
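To make the definition concrete, here is a minimal pure-Python sketch of a normalized pairwise Levenshtein diversity. It is not seqme's implementation: the helper names (`levenshtein`, `normalized_distance`, `pairwise_diversity`) are hypothetical, and both the normalization (dividing by the length of the longer sequence) and the aggregation (mean over all unordered pairs) are assumptions chosen for illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free on a match)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Assumption: divide by the longer length so the value lies in [0, 1]."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def pairwise_diversity(sequences: list[str]) -> float:
    """Assumption: aggregate by averaging over all unordered pairs."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, a set of identical sequences scores 0.0 and a pair of sequences with no positions in common scores 1.0, which matches the "higher means more diverse" reading above.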

__init__(k=None, *, seed=0, name='Diversity')

Initialize the metric.

Parameters:
  • k (Optional[int]) – Number of other sequences to randomly sample when computing diversity. If None, no sampling is performed.

  • seed (int) – Seed for deterministic sampling. Only used if k is not None.

  • name (str) – Metric name.

__call__(sequences)

Compute the diversity.

Note: For a large number of sequences, a small value for k (e.g., 10) usually provides a stable approximation of the diversity.

Parameters:

sequences (list[str]) – Sequences to evaluate.

Returns:

Diversity score.

Return type:

MetricResult
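The note about small k can be illustrated with a self-contained sketch of a sampled approximation. This is a guess at the general technique, not seqme's code: `sampled_diversity` and `levenshtein` are hypothetical helpers, and the choices to sample k others per sequence without replacement, to normalize by the longer length, and to average the per-sequence means are all assumptions.

```python
import random


def levenshtein(a: str, b: str) -> int:
    """Edit distance with a two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def sampled_diversity(sequences, k=None, seed=0):
    """Mean normalized distance from each sequence to k sampled others.

    Assumptions: k=None (or k too large) falls back to all other
    sequences; sampling is without replacement and seeded for
    reproducibility.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    rng = random.Random(seed)
    per_sequence = []
    for i, seq in enumerate(sequences):
        others = sequences[:i] + sequences[i + 1:]
        if k is not None and k < len(others):
            others = rng.sample(others, k)
        # max(..., 1) guards against dividing by zero for two empty strings.
        distances = [levenshtein(seq, o) / max(len(seq), len(o), 1)
                     for o in others]
        per_sequence.append(sum(distances) / len(distances))
    return sum(per_sequence) / len(per_sequence)
```

With n sequences, the exhaustive version performs on the order of n² distance computations, while the sampled version performs about n·k, which is why a small k is attractive for large sets. Repeating a call with the same seed reproduces the same sample and therefore the same score.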

Methods

__init__([k, seed, name])

Initialize the metric.

__call__(sequences)

Compute the diversity.

Attributes

name

Name of the metric.

objective

Whether lower or higher scores indicate better performance.