seqme.models.Hyformer


class seqme.models.Hyformer(model_name, *, device=None, batch_size=256, cache_dir=None, verbose=False)

Wrapper for the Hyformer molecule/peptide embedding model.

Computes sequence-level embeddings by extracting the [CLS] token embedding.
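As a rough illustration of what "extracting the [CLS] token embedding" means, the sketch below pools per-token hidden states into one vector per sequence. It is not Hyformer's actual code: the helper name cls_pool is hypothetical, the hidden states are toy nested lists rather than model outputs, and placing [CLS] at position 0 is an assumption based on common convention.

```python
from typing import List

Vector = List[float]


def cls_pool(hidden_states: List[List[Vector]], cls_index: int = 0) -> List[Vector]:
    """Return one vector per sequence: the hidden state at the [CLS] position.

    hidden_states is indexed as [sequence][token][feature].
    cls_index=0 is an assumption, not something the Hyformer docs state.
    """
    return [tokens[cls_index] for tokens in hidden_states]


# Two toy sequences, three token positions each, hidden size 2.
hidden_states = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]],
]
embeddings = cls_pool(hidden_states)  # one vector per input sequence
```

The result has one row per sequence regardless of sequence length, which is what makes the [CLS] vector usable as a fixed-size sequence-level embedding.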

Installation for molecules: pip install "seqme[hyformer_molecules]" "hyformer @ git+https://github.com/szczurek-lab/hyformer.git@main"

Installation for peptides: pip install "seqme[hyformer]" "hyformer @ git+https://github.com/szczurek-lab/hyformer.git@v2.0"

Reference:

Izdebski et al., “Synergistic Benefits of Joint Molecule Generation and Property Prediction” (https://arxiv.org/abs/2504.16559)

__init__(model_name, *, device=None, batch_size=256, cache_dir=None, verbose=False)

Initialize Hyformer model.

Parameters:
  • model_name (HyformerCheckpoint | str) – Model checkpoint name or enum.

  • device (Optional[str]) – Device to run inference on, e.g., "cuda" or "cpu".

  • batch_size (int) – Number of sequences to process per batch.

  • cache_dir (Optional[str]) – Directory to cache the model.

  • verbose (bool) – Whether to display a progress bar.
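To make the role of batch_size concrete, here is a minimal sketch of how a wrapper might split its input into chunks of at most batch_size sequences and concatenate the per-batch results. This is an assumption about the general pattern, not seqme's implementation; iter_batches, embed_in_batches, and toy_embed_batch are hypothetical names, and the toy embedding is just the sequence length.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(sequences: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks holding at most batch_size sequences each."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(sequences), batch_size):
        yield list(sequences[start:start + batch_size])


def embed_in_batches(
    sequences: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 256,
) -> List[List[float]]:
    """Apply embed_batch to each chunk and keep results in input order."""
    embeddings: List[List[float]] = []
    for batch in iter_batches(sequences, batch_size):
        embeddings.extend(embed_batch(batch))
    return embeddings


def toy_embed_batch(batch: List[str]) -> List[List[float]]:
    """Stand-in for a real model: a 1-dimensional 'embedding' per sequence."""
    return [[float(len(sequence))] for sequence in batch]


toy_sequences = ["CCO", "c1ccccc1", "GLFDIVKKV", "N", "CC(=O)O"]
toy_embeddings = embed_in_batches(toy_sequences, toy_embed_batch, batch_size=2)
```

A smaller batch_size lowers peak memory use at the cost of more forward passes; the output order is unaffected either way.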

__call__(sequences)

Call self as a function.

Return type:

ndarray
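A hedged usage sketch based only on the signatures shown above: construct the wrapper, then call it on a list of sequences. The helper name embed_with_hyformer is hypothetical, and model_name is left as a parameter because valid checkpoint names are not listed here. The import is done lazily so the snippet can be loaded without seqme installed.

```python
from typing import Any, List, Optional


def embed_with_hyformer(
    model_name: Any,
    sequences: List[str],
    *,
    device: Optional[str] = None,
    batch_size: int = 256,
):
    """Build a Hyformer wrapper and call it on sequences.

    model_name must be a valid HyformerCheckpoint or checkpoint string;
    none is assumed here. Per the documented return type, the result is
    an ndarray.
    """
    # Imported inside the function so that defining it does not require seqme.
    from seqme.models import Hyformer

    model = Hyformer(model_name, device=device, batch_size=batch_size)
    return model(sequences)
```

Passing device="cpu" is a reasonable default on machines without a GPU, since the parameter description gives "cuda" and "cpu" as example values.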

Methods

  • __init__(model_name, *[, device, ...]) – Initialize Hyformer model.

  • __call__(sequences) – Call self as a function.

  • compute_perplexity(sequences) – Compute perplexity for a list of sequences.

  • embed(sequences) – Compute embeddings for a list of sequences.

  • generate(num_samples[, temperature, top_k, seed]) – Generate sequences de novo.

  • predict(sequences) – Compute predictions for a list of sequences.
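The method summaries mention perplexity and sampling parameters (temperature, top_k, seed) without defining them. The sketch below shows the standard textbook meaning of these terms in plain Python. It is not Hyformer's implementation: the function names perplexity and sample_token are hypothetical, and the library may normalize or sample differently.

```python
import math
import random
from typing import List, Optional


def perplexity(token_log_probs: List[float]) -> float:
    """Exponential of the mean negative log-likelihood per token (natural log)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def sample_token(
    logits: List[float],
    temperature: float = 1.0,
    top_k: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample one token id using temperature scaling and optional top-k filtering.

    temperature must be positive; top_k, if given, must be at least 1.
    """
    rng = rng or random.Random()
    # Rank token ids by logit, highest first, and keep the top_k best
    # (or all of them when top_k is None).
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k] if top_k is not None else ranked
    # Temperature-scaled softmax weights; subtracting the max avoids overflow.
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(kept, weights=weights, k=1)[0]


# A model that assigns probability 0.25 to every observed token.
uniform_ppl = perplexity([math.log(0.25)] * 4)

# A fixed seed makes sampling reproducible.
toy_logits = [0.1, 5.0, 0.2]
first = sample_token(toy_logits, temperature=0.8, top_k=2, rng=random.Random(0))
second = sample_token(toy_logits, temperature=0.8, top_k=2, rng=random.Random(0))
```

Lower temperatures concentrate probability on the highest-scoring tokens, and a smaller top_k restricts sampling to fewer candidates; both trade diversity for more conservative outputs.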