seqme.utils.subsample


seqme.utils.subsample(sequences, n_samples, *, return_indices=False, seed=0)

Sample a subset of the sequences without replacement.

Parameters:
  • sequences (list[str]) – Sequences to sample from.

  • n_samples (int) – Number of sequences to sample.

  • return_indices (bool) – If True, return a tuple of the sequence subset and the indices of the sampled sequences; otherwise, return only the sequence subset.

  • seed (int | None) – Local seed used for sampling. If None, no fixed local seed is used.

Return type:

list[str] | tuple[list[str], ndarray]

Returns:

A list of n_samples randomly chosen sequences. If return_indices is True, a tuple of that list and the indices of the chosen sequences.

Raises:

ValueError – If n_samples exceeds the number of available sequences.
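
Example:

The behaviour described above can be illustrated with a minimal stand-in written against the Python standard library. This is a sketch under stated assumptions, not seqme's implementation: subsample_sketch is a hypothetical name, it returns the indices as a plain list rather than an ndarray, and the random number generator seqme uses internally is not documented here, so the particular sequences drawn for a given seed may differ from what seqme.utils.subsample returns.

```python
import random


def subsample_sketch(sequences, n_samples, *, return_indices=False, seed=0):
    """Hypothetical stand-in mirroring the documented behaviour (not seqme's code)."""
    if n_samples > len(sequences):
        raise ValueError(
            f"n_samples ({n_samples}) exceeds the number of sequences ({len(sequences)})."
        )
    # A local generator leaves the global random state untouched.
    # With seed=None, Python seeds it from OS entropy, so results vary per call.
    rng = random.Random(seed)
    # random.sample draws without replacement, so no index is repeated.
    indices = rng.sample(range(len(sequences)), n_samples)
    subset = [sequences[i] for i in indices]
    return (subset, indices) if return_indices else subset


peptides = ["ACDE", "FGHI", "KLMN", "PQRS", "TVWY"]

# Request both the subset and the positions it was drawn from.
subset, indices = subsample_sketch(peptides, 3, return_indices=True, seed=0)

# The same seed reproduces the same subset.
repeat = subsample_sketch(peptides, 3, seed=0)
```

The returned indices map each sampled sequence back to its position in the input, which is useful for subsetting labels or embeddings stored in parallel with the sequences.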