Third-party models


In this notebook, we show how to integrate “external” models into seqme.

from functools import partial

import seqme as sm

Some models of interest are not available through package indexes or model hubs such as PyPI or Hugging Face; only a Git repository may be available. Here we show how to run such models in seqme.

An external model is compatible with seqme if it is set up using uv (with a lockfile and a defined Python version) and defines an entry point (a function).

Set up a project using:

uv init --package hello-model
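To make the entry-point requirement concrete, here is a minimal sketch of a function such a package could expose. It is not the code in the toy repository used in this notebook: the file location (`src/hello_model/model.py`, the layout `uv init --package` typically creates), the signature, and the features are assumptions for illustration. A real embedder would more likely return a NumPy array; plain lists keep the sketch dependency-free.

```python
# Hypothetical file: src/hello_model/model.py
# Assumption: the entry point takes a list of sequences plus optional keyword
# arguments and returns one fixed-length feature vector per sequence.


def embed(sequences: list[str], batch_size: int = 32) -> list[list[float]]:
    """Toy embedder: sequence length, number of K/R residues, and their fraction."""
    embeddings: list[list[float]] = []
    # Process the input in batches, as a real model would do.
    for start in range(0, len(sequences), batch_size):
        batch = sequences[start : start + batch_size]
        for seq in batch:
            length = len(seq)
            n_basic = sum(seq.count(aa) for aa in "KR")
            fraction = n_basic / length if length else 0.0
            embeddings.append([float(length), float(n_basic), fraction])
    return embeddings


print(embed(["MKQW", "RKSPL"]))  # prints [[4.0, 1.0, 0.25], [5.0, 2.0, 0.4]]
```

With this layout, the function would be addressed as `hello_model.model:embed`.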

Let’s use a toy model in a GitHub repository that satisfies all three requirements. To do so, we need to define the entry point function, the repository URL, and the local path where the repository is stored.

hello_model = sm.models.ThirdPartyModel(
    entry_point="hello_model.model:embed",
    path="../thirdparty/hello-model",
    url="https://github.com/szczurek-lab/seqme-thirdparty",
    branch="main",
)
Cloning into '/Users/rasmus.larsen/work/hackathon-2025/seqme/docs/thirdparty/hello-model'...
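The `entry_point` string follows the common `module:function` convention. As a rough sketch of how such a string can be resolved to a callable (this is not seqme's implementation, and `load_entry_point` is a hypothetical helper name), using only the standard library:

```python
from importlib import import_module
from typing import Callable


def load_entry_point(entry_point: str) -> Callable:
    """Resolve a 'package.module:function' string to the function it names."""
    module_name, sep, attr = entry_point.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"Expected 'module:function', got {entry_point!r}")
    module = import_module(module_name)  # e.g. "hello_model.model"
    return getattr(module, attr)  # e.g. "embed"


# Demonstrated with a standard-library module so the sketch runs anywhere.
dumps = load_entry_point("json:dumps")
print(dumps({"a": 1}))  # prints {"a": 1}
```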

ThirdPartyModel clones the model repository and installs its dependencies the first time the model is used.
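The clone-on-first-use behavior can be pictured roughly as follows. This is a sketch under assumptions, not seqme's code: `build_clone_command` and `ensure_cloned` are hypothetical helpers, and the dependency installation step (for a uv project, something like `uv sync` run inside the cloned directory) is only indicated in a comment.

```python
import subprocess
from pathlib import Path
from typing import Optional


def build_clone_command(url: str, path: str, branch: Optional[str] = None) -> list[str]:
    """Build the `git clone` command for a repository, optionally pinning a branch."""
    cmd = ["git", "clone"]
    if branch is not None:
        cmd += ["--branch", branch]
    cmd += [url, str(path)]
    return cmd


def ensure_cloned(url: str, path: str, branch: Optional[str] = None) -> bool:
    """Clone the repository only if `path` does not exist yet.

    Returns True if a clone was performed and False if the path was already there.
    """
    if Path(path).exists():
        return False
    subprocess.run(build_clone_command(url, path, branch), check=True)
    # Assumption: dependencies would be installed here, e.g. with `uv sync`.
    return True


print(
    build_clone_command(
        "https://github.com/szczurek-lab/seqme-thirdparty",
        "../thirdparty/hello-model",
        branch="main",
    )
)
```

Checking for the directory first is what makes later runs fast: the clone and the installation only happen once per path.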

Assuming everything went well, let’s first call the embedding model directly and then compute a metric with it.

hello_model(["MKQW", "RKSPL"], batch_size=32)
array([[44.,  8., 12., 32.],
       [55., 10., 15., 40.]])
sequences = {
    "HydrAMP": ["MMRK", "RKSPL", "RRLSK", "RRLSK"],
    "hyformer": ["MKQW", "RKSPL"],
    "Random": ["KKKKK", "PLQ", "RKSPL"],
}

metrics = [sm.metrics.FBD(reference=sequences["Random"], embedder=hello_model)]
df = sm.evaluate(sequences, metrics)

sm.show(df)
100%|██████████| 3/3 [00:00<00:00, 11.06it/s, data=Random, metric=FBD]  
            FBD↓
HydrAMP   119.24
hyformer   45.17
Random      0.00

AMPlify

Let’s also use AMPlify, an antimicrobial peptide (AMP) classifier that outputs the probability that a peptide has antimicrobial properties.

Let’s set up the model.

amplify = sm.models.ThirdPartyModel(
    entry_point="amplify.predict:predict",
    path="../thirdparty/amplify",
    url="https://github.com/szczurek-lab/seqme-amplify",
)
Cloning into '/Users/rasmus.larsen/work/hackathon-2025/seqme/docs/thirdparty/amplify'...

Assuming everything went well, let’s first call the predictive model directly and then compute a metric with it.

amplify(["MKQW", "RKSPL"], model_type="imbalanced", batch_size=128, n_ensembles=2)
array([0.00635906, 0.49806994], dtype=float32)
sequences = {
    "HydrAMP": ["MMRK", "RKSPL", "RRLSK", "RRLSK"],
    "hyformer": ["MKQW", "RKSPL"],
    "Random": ["KKKKK", "PLQ", "RKSPL"],
}

metrics = [
    sm.metrics.ID(
        predictor=partial(amplify, model_type="balanced", n_ensembles=5, batch_size=128),
        name="p_AMP (AMPlify)",
        objective="maximize",
    )
]
df = sm.evaluate(sequences, metrics)

sm.show(df)
100%|██████████| 3/3 [00:16<00:00,  5.50s/it, data=Random, metric=p_AMP (AMPlify)]  
          p_AMP (AMPlify)↑
HydrAMP   0.26±0.10
hyformer  0.24±0.25
Random    0.40±0.22
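The metric above wraps the model in `functools.partial` so that the predictor can be called with the sequences alone while the model settings stay fixed. A small self-contained sketch of the same pattern follows; `toy_predict` is a hypothetical stand-in, not AMPlify, and its scoring rule is made up for illustration.

```python
from functools import partial


def toy_predict(sequences, model_type, n_ensembles=1, batch_size=32):
    """Hypothetical stand-in for a predictor that needs extra settings."""
    # Made-up rule: fraction of K/R residues, scaled by the model type.
    scale = 1.0 if model_type == "balanced" else 0.5
    return [scale * sum(aa in "KR" for aa in seq) / len(seq) for seq in sequences]


# Bind the settings once; the result only needs the sequences.
predictor = partial(toy_predict, model_type="balanced", n_ensembles=5, batch_size=128)

print(predictor(["MKQW", "RKSPL"]))  # prints [0.25, 0.4]
print(predictor.keywords)  # the bound keyword arguments
```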

amPEPpy

Let’s also use amPEPpy, another antimicrobial peptide (AMP) classifier that outputs the probability that a peptide has antimicrobial properties.

Let’s set up the model.

ampeppy = sm.models.ThirdPartyModel(
    entry_point="ampeppy.predict:predict",
    path="../thirdparty/ampeppy",
    url="https://github.com/szczurek-lab/seqme-amPEPpy",
)
Cloning into '/Users/rasmus.larsen/work/hackathon-2025/seqme/docs/thirdparty/ampeppy'...

Assuming everything went well, let’s first call the predictive model directly and then compute a metric with it.

ampeppy(["MKQW", "RKSPL"])
array([0.49427083, 0.28333333])
sequences = {
    "HydrAMP": ["MMRK", "RKSPL", "RRLSK", "RRLSK"],
    "hyformer": ["MKQW", "RKSPL"],
    "Random": ["KKKKK", "PLQ", "RKSPL"],
}

metrics = [sm.metrics.ID(predictor=ampeppy, name="p_AMP (amPEPpy)", objective="maximize")]
df = sm.evaluate(sequences, metrics)

sm.show(df)
100%|██████████| 3/3 [00:04<00:00,  1.59s/it, data=Random, metric=p_AMP (amPEPpy)]  
          p_AMP (amPEPpy)↑
HydrAMP   0.41±0.08
hyformer  0.39±0.11
Random    0.39±0.09
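As a sanity check on how to read the table, the hyformer row can be reproduced from the two amPEPpy predictions printed earlier for "MKQW" and "RKSPL", which are exactly the hyformer sequences. The sketch assumes the table reports the mean ± the population standard deviation of the per-sequence predictions; this is inferred from the numbers in this notebook, not taken from the seqme documentation.

```python
from statistics import fmean, pstdev

# Predictions copied from the direct amPEPpy call above.
predictions = [0.49427083, 0.28333333]

mean = fmean(predictions)
std = pstdev(predictions)  # population standard deviation (assumption)

summary = f"{mean:.2f}±{std:.2f}"
print(summary)  # prints 0.39±0.11
```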