seqme.metrics.MMD


class seqme.metrics.MMD(reference, embedder, *, estimate='biased', sigma=10, scale=1000, device='cpu', name='MMD')

Maximum Mean Discrepancy (MMD) metric using a Gaussian kernel.

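This metric measures the discrepancy between the distribution of synthetic sequences and the distribution of reference sequences in the embedding space. A score near zero indicates that the two sets of embeddings are distributed similarly.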
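To make the quantity concrete, here is a minimal NumPy sketch of a scaled squared MMD with a Gaussian kernel. It is not the seqme implementation: the kernel form `exp(-||x - y||^2 / (2 * sigma^2))`, the way `scale` multiplies the result, and the helper names `gaussian_kernel` and `mmd2` are assumptions chosen for illustration, with defaults that mirror the signature above.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * sigma^2))."""
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    sq = np.maximum(sq, 0.0)  # clip tiny negatives caused by round-off
    return np.exp(-sq / (2.0 * sigma ** 2))


def mmd2(x, y, sigma=10.0, scale=1000.0, estimate="biased"):
    """Scaled squared MMD between embeddings x of shape (n, d) and y of shape (m, d)."""
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    n, m = len(x), len(y)
    if estimate == "biased":
        # V-statistic: plain means, diagonal terms k(x_i, x_i) included
        xx, yy = k_xx.mean(), k_yy.mean()
    elif estimate == "unbiased":
        # U-statistic: drop the diagonal, average over off-diagonal pairs
        xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
        yy = (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
    else:
        raise ValueError(f"unknown estimate: {estimate!r}")
    return scale * (xx + yy - 2.0 * k_xy.mean())


# Synthetic embeddings standing in for embedded sequences (illustrative data only)
rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, size=(200, 8))
same = rng.normal(0.0, 1.0, size=(200, 8))      # same distribution as ref
shifted = rng.normal(3.0, 1.0, size=(200, 8))   # mean shifted away from ref

score_same = mmd2(same, ref)
score_shifted = mmd2(shifted, ref)
```

In this sketch the shifted sample scores higher than the sample drawn from the reference distribution, which matches the reading of MMD as a distance between distributions.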

References

[1] Jayasumana, Sadeep, et al., “Rethinking FID: Towards a better evaluation metric for image generation,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024 (https://arxiv.org/pdf/2401.09603)

__init__(reference, embedder, *, estimate='biased', sigma=10, scale=1000, device='cpu', name='MMD')

Initialize the metric.

Parameters:
  • reference (list[str]) – List of reference sequences representing real data.

  • embedder (Callable[[list[str]], ndarray]) – Function that maps a list of sequences to their embeddings. Should return a 2D array of shape (num_sequences, embedding_dim).

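  • estimate (Literal['biased', 'unbiased']) – Which estimator of the kernel expectations to use. In the standard formulation, 'biased' includes each sample's kernel value with itself, while 'unbiased' excludes those terms.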

  • sigma (float) – Bandwidth parameter for the Gaussian RBF kernel.

  • scale (float) – Scaling factor for the MMD score.

  • device (str) – Compute device, e.g., "cpu" or "cuda".

  • name (str) – Metric name.
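The embedder contract above (a list of sequences in, a 2D array of shape (num_sequences, embedding_dim) out) can be illustrated with a toy function. The sketch below is hypothetical: `composition_embedder` and the 20-letter `AMINO_ACIDS` alphabet are assumptions made for the example, not part of seqme, and a learned embedding model would normally take their place.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet, for illustration only


def composition_embedder(sequences: list[str]) -> np.ndarray:
    """Hypothetical embedder: per-sequence residue frequencies."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    out = np.zeros((len(sequences), len(AMINO_ACIDS)), dtype=np.float64)
    for row, seq in enumerate(sequences):
        for ch in seq:
            col = index.get(ch)
            if col is not None:  # ignore symbols outside the alphabet
                out[row, col] += 1.0
        if seq:
            out[row] /= len(seq)  # convert counts to frequencies
    return out


# One row per input sequence, one column per alphabet symbol
embeddings = composition_embedder(["ACDA", "KKR"])
```

A function of this shape could be passed as `embedder` together with a `reference` list when constructing the metric. Because a Gaussian kernel compares distances against `sigma`, the bandwidth should suit the scale of the embeddings; frequency vectors like these are far smaller than the default of 10, so a smaller `sigma` would likely be needed.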

__call__(sequences)

Compute the MMD between embeddings of the input sequences and the reference.

Parameters:

sequences (list[str]) – Sequences to evaluate.

Returns:

MMD score.

Return type:

MetricResult

Methods

__init__(reference, embedder, *[, estimate, ...])

Initialize the metric.

__call__(sequences)

Compute the MMD between embeddings of the input sequences and the reference.

Attributes

name

Name of the metric.

objective

Whether lower or higher scores indicate better performance.