This model is a version of microsoft/deberta-v3-large fine-tuned for extractive question answering on the SQuAD 2.0 dataset.
Fine-tuning and evaluation took 15 hours on an NVIDIA Titan RTX (24 GB) GPU.
Results are from the ICLR 2023 paper "DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing" by Pengcheng He et al.
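
The card does not name the Hub repository for this checkpoint, so the snippet below is a minimal usage sketch with the Transformers question-answering pipeline; the model ID is a placeholder to replace with the actual repository name (DeBERTa-v3 also requires `sentencepiece` to be installed).

```python
from transformers import pipeline

# Placeholder repo ID -- substitute the actual Hub name of this checkpoint.
qa = pipeline("question-answering", model="<user>/deberta-v3-large-squad2")

result = qa(
    question="Which GPU was used for fine-tuning?",
    context="Fine-tuning and evaluation took 15 hours on an NVIDIA Titan RTX GPU.",
    # SQuAD 2.0 includes unanswerable questions; this flag lets the pipeline
    # return an empty answer instead of forcing a span prediction.
    handle_impossible_answer=True,
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'NVIDIA Titan RTX'}
```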