Prot2Text-V2 Demo
Prot2Text-V2 treats a protein sequence as if it were another language and translates it into English. Supply a raw amino acid sequence and the model returns a clear, human-readable paragraph describing what the protein does.
The paper describing Prot2Text-V2 has been accepted to the NeurIPS 2025 main conference. This demo pairs fast experimentation with explainability-minded outputs.
- Input: protein sequence using IUPAC single-letter amino acid codes (20 canonical amino acids).
- Output: polished descriptions of predicted function, localization cues, and structural hints.
- Why it matters: accelerate protein characterization, lab annotations, or downstream hypothesis building.
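The input contract above (single-letter codes for the 20 canonical amino acids) can be checked before a sequence is sent to the model. The following is a minimal sketch, not the demo's actual preprocessing, which is not shown here. The helper name `clean_sequence`, the handling of FASTA header lines, and the choice to reject rather than strip unknown letters are all assumptions made for illustration.

```python
# Hypothetical pre-check; the demo's real input handling is not documented here.
CANONICAL_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Normalize a pasted sequence and reject non-canonical residue codes."""
    lines = [line.strip() for line in raw.splitlines()]
    # Assumption: lines starting with ">" are FASTA headers and carry no residues.
    body = "".join(line for line in lines if not line.startswith(">"))
    # Remove any remaining whitespace and normalize to upper case.
    sequence = "".join(body.split()).upper()
    if not sequence:
        raise ValueError("empty sequence")
    invalid = sorted(set(sequence) - CANONICAL_AMINO_ACIDS)
    if invalid:
        raise ValueError("non-canonical residue codes: " + ", ".join(invalid))
    return sequence
```

Calling `clean_sequence(">example\nmktay iak")` returns the normalized string, while a sequence containing letters such as `X` or `B` raises `ValueError` so the problem is visible before any inference is attempted.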
Model architecture at a glance
- Protein language model encoder: facebook/esm2_t36_3B_UR50D.
- Modality adapter: lightweight bridge aligning protein embeddings with the language model.
- Natural language decoder: meta-llama/Llama-3.1-8B-Instruct for articulate descriptions.
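To make the role of the modality adapter concrete, here is a toy sketch of the general idea: per-residue vectors from the encoder are mapped into the width the decoder expects. It assumes a single linear projection with random weights and tiny dimensions (4 in, 6 out). The actual adapter design, its trained weights, and the real embedding widths of the two models are not described in this document, and `make_linear_adapter` is a hypothetical name.

```python
import random


def make_linear_adapter(in_dim, out_dim, seed=0):
    """Build a toy linear map from encoder width to decoder width (illustrative only)."""
    rng = random.Random(seed)
    # Random weights stand in for trained parameters.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim

    def project(embeddings):
        projected = []
        for vector in embeddings:
            if len(vector) != in_dim:
                raise ValueError("expected vectors of width %d" % in_dim)
            # One output value per weight row: dot product plus bias.
            projected.append(
                [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, bias)]
            )
        return projected

    return project


# Three residues, each with a made-up encoder embedding of width 4.
residue_embeddings = [[0.1 * i] * 4 for i in range(3)]
adapter = make_linear_adapter(in_dim=4, out_dim=6)
decoder_inputs = adapter(residue_embeddings)
```

The sequence length is preserved (three residues in, three vectors out); only the width of each vector changes, which is what lets the decoder consume protein embeddings alongside its ordinary text embeddings.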
Resources
Sample sequences
- Model stack: Facebook ESM2 encoder + Llama 3.1 8B instruction-tuned decoder.
- Token budget: the generator truncates after the configured Max new tokens.
- Attribution: Outputs are predictions; validate experimentally before publication.
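The token budget note can be illustrated with a small generation loop that stops either at an end-of-sequence signal or once the cap is reached. This is a sketch of the general behavior only. The names `generate_with_budget` and `scripted_decoder` are hypothetical, whole words stand in for tokens, and the demo's real generation code is not shown in this document.

```python
from typing import Callable, List, Optional


def generate_with_budget(
    next_token: Callable[[List[str]], Optional[str]],
    max_new_tokens: int,
) -> List[str]:
    """Collect tokens until the decoder signals the end or the budget runs out."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    generated: List[str] = []
    while len(generated) < max_new_tokens:
        token = next_token(generated)
        if token is None:  # None stands in for an end-of-sequence token
            break
        generated.append(token)
    return generated


def scripted_decoder(words: List[str]) -> Callable[[List[str]], Optional[str]]:
    """Stand-in for a real decoder: replays a fixed list of words, then stops."""
    def next_token(generated: List[str]) -> Optional[str]:
        position = len(generated)
        return words[position] if position < len(words) else None
    return next_token


description = "Catalyzes the hydrolysis of ATP".split()
truncated = generate_with_budget(scripted_decoder(description), max_new_tokens=3)
complete = generate_with_budget(scripted_decoder(description), max_new_tokens=10)
```

With a budget of 3 the description is cut off mid-sentence, while a budget of 10 lets the scripted decoder finish. A description that ends abruptly is therefore a hint to raise Max new tokens rather than a sign that the model had nothing more to say.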