BioLMTox Info#
Chance Challacombe
Dec 26, 2023
This page explains applications of BioLMTox and documents its usage for classification and embedding extraction on BioLM.
Description#
Toxin classification has important applications in both industry and research. It is a long-standing concern in biosecurity and in protein, DNA, and drug design. BioLMTox applies the pre-train and fine-tune paradigm: it fine-tunes the ESM-2 pre-trained protein language model for general toxin classification.
Model Background#
BioLMTox is a protein language model fine-tuned for general toxin classification, covering different domains of life and a range of sequence lengths. It was trained on a selection of sequences from UniProt, UniRef50, and comparable state-of-the-art (SOTA) datasets.
Applications of BioLMTox#
BioLMTox classification predictions and embeddings can be:
used to augment biosecurity screening, for example by incorporating BioLMTox predictions before wet-lab testing or alongside other computational screening software.
used to discriminate between toxin and non-toxin homologs that may bypass standard sequence-similarity methods.
incorporated into public-facing APIs, web apps, and chat agents to reduce dual-use risks.
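The screening use case above can be sketched as a simple gate that combines a toxin classifier's score with an existing similarity screen. This is only an illustration under stated assumptions, not BioLM's API: `screen_sequences`, `ScreeningResult`, the 0.5 threshold, and both `toy_*` functions are hypothetical stand-ins. In practice the score would come from a BioLMTox prediction and the similarity check from whatever screening software is already in place.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ScreeningResult:
    sequence: str
    toxin_score: float     # hypothetical classifier probability in [0, 1]
    similarity_hit: bool   # outcome of an existing similarity screen
    flagged: bool          # True if either screen raises a concern


def screen_sequences(
    sequences: Iterable[str],
    classifier: Callable[[str], float],
    similarity_screen: Callable[[str], bool],
    threshold: float = 0.5,  # assumed cutoff; would need tuning on real data
) -> List[ScreeningResult]:
    """Flag a sequence if the classifier OR the similarity screen flags it."""
    results = []
    for seq in sequences:
        score = classifier(seq)
        hit = similarity_screen(seq)
        results.append(ScreeningResult(seq, score, hit, hit or score >= threshold))
    return results


# Stand-ins for illustration only; these are NOT real toxin predictors.
def toy_classifier(seq: str) -> float:
    # Pretend score: cysteine fraction, scaled by 5 and capped at 1.0.
    return min(1.0, 5 * seq.count("C") / max(len(seq), 1))


def toy_similarity_screen(seq: str) -> bool:
    # Pretend blocklist lookup by exact match.
    return seq in {"MKTLLLTLVVVTIVCLDLGYT"}


results = screen_sequences(
    ["MKTLLLTLVVVTIVCLDLGYT", "GCCSDPRCAWRC", "MSTNPKPQRKTKRNT"],
    toy_classifier,
    toy_similarity_screen,
)
for r in results:
    print(r.sequence, round(r.toxin_score, 2), r.flagged)
```

Combining the two checks with a logical OR means a sequence that slips past the similarity screen can still be flagged by the classifier, which is the scenario the homolog use case describes. A stricter or more lenient policy would only change the `flagged` expression.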
BioLM Benefits#
Always-on, auto-scaling, GPU-backed APIs with highly scalable parallelization.
Save money on infrastructure, GPU costs, and development time.
Quickly integrate multiple embeddings into your workflows.
Interact with the endpoint using natural language and our Chat Agents.
Rapidly screen for biosecurity risks.
Get ahead of potential biosecurity regulations and laws.