Prot2Text-V2 Demo

Prot2Text-V2 treats a protein sequence as if it were another language and translates it into English. Supply a raw amino acid sequence and the model returns a clear, human-readable paragraph describing what the protein does.

The paper describing Prot2Text-V2 has been accepted to the NeurIPS 2025 main conference. The model itself pairs fast experimentation with explainability-minded outputs.

Input: a protein sequence written in IUPAC single-letter amino acid codes (the 20 canonical amino acids).
Output: a free-text description of predicted function, localization cues, and structural hints.
Why it matters: it accelerates protein characterization, lab annotation, and downstream hypothesis building.
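
Because the input is limited to the 20 canonical single-letter codes, it can help to normalize and check a sequence before submitting it. The following is a minimal sketch, not part of the demo's code: the helper name `clean_sequence` and its behavior (stripping whitespace, upper-casing, rejecting any other letter) are assumptions made for illustration, and the demo may handle such input differently.

```python
# Hypothetical pre-submission check; not taken from the Prot2Text-V2 code base.
CANONICAL_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")  # 20 IUPAC single-letter codes


def clean_sequence(raw: str) -> str:
    """Remove whitespace, upper-case the sequence, and reject non-canonical codes."""
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    invalid = sorted(set(seq) - CANONICAL_AMINO_ACIDS)
    if invalid:
        raise ValueError("non-canonical residue codes: " + ", ".join(invalid))
    return seq


if __name__ == "__main__":
    print(clean_sequence("mkt ay\nIAK"))  # prints MKTAYIAK
```

Ambiguity codes such as X or B are rejected here on purpose, since the page only mentions the canonical set.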

Model architecture at a glance

Protein language model encoder: facebook/esm2_t36_3B_UR50D.
Modality adapter: lightweight bridge aligning protein embeddings with the language model.
Natural language decoder: meta-llama/Llama-3.1-8B-Instruct for articulate descriptions.
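
The data flow through these three components can be sketched with toy dimensions. This is an illustration under stated assumptions, not the authors' implementation: the adapter is modeled as a single linear projection, the projected protein embeddings are placed before the prompt embeddings, and the names `make_linear_adapter` and `build_decoder_inputs` are hypothetical. The real encoder and decoder are far wider than the toy sizes used here.

```python
import random

ENCODER_DIM = 8   # toy stand-in for the protein encoder's hidden size
DECODER_DIM = 12  # toy stand-in for the language model's hidden size


def make_linear_adapter(in_dim, out_dim, seed=0):
    """Return a function projecting per-residue vectors from in_dim to out_dim.

    Assumption: one linear layer; the actual adapter design may differ.
    """
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim

    def adapter(residue_embeddings):
        return [
            [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]
            for vec in residue_embeddings
        ]

    return adapter


def build_decoder_inputs(projected_protein, prompt_embeddings):
    """Concatenate protein and prompt embeddings along the sequence axis.

    Assumption: protein embeddings come first, followed by the text prompt.
    """
    return projected_protein + prompt_embeddings


if __name__ == "__main__":
    rng = random.Random(1)
    # Five residues as the encoder would emit them, three prompt tokens.
    protein = [[rng.random() for _ in range(ENCODER_DIM)] for _ in range(5)]
    prompt = [[rng.random() for _ in range(DECODER_DIM)] for _ in range(3)]
    adapter = make_linear_adapter(ENCODER_DIM, DECODER_DIM)
    decoder_inputs = build_decoder_inputs(adapter(protein), prompt)
    print(len(decoder_inputs), len(decoder_inputs[0]))  # prints 8 12
```

The point of the sketch is the shape change: the adapter leaves the sequence length untouched and only changes the width of each vector so the decoder can consume it alongside ordinary token embeddings.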

Resources

Paper (NeurIPS 2025)
Code repository
Training data