SetFit Aspect Model with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for Aspect Based Sentiment Analysis (ABSA). It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head. Within the ABSA pipeline, this model is responsible for filtering aspect span candidates.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
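
The contrastive step above can be illustrated with a minimal sketch of pair construction. The helper name `build_contrastive_pairs` and the toy sentences are assumptions made for illustration; SetFit's own sampler (including the oversampling strategy listed under Training Hyperparameters) is more involved. The idea shown is only that same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, which a loss such as CosineSimilarityLoss then regresses toward.

```python
from itertools import combinations


def build_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for every unordered pair.

    target is 1.0 when both texts share a label, otherwise 0.0.
    Hypothetical helper, not part of the SetFit API.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy few-shot data in the "span:sentence" style used by this model card.
examples = [
    "food:The food was great.",
    "service:The service was slow.",
    "sit:Come sit down.",
]
labels = ["aspect", "aspect", "no aspect"]

pairs = build_contrastive_pairs(examples, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Because pairs grow quadratically with the number of labeled examples, even a small training set yields many training signals for the embedding model.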

This model was trained as one component of a larger ABSA system, which works as follows:

1. Use a spaCy model to select possible aspect span candidates.
2. Use this SetFit model to filter these possible aspect span candidates.
3. Use a SetFit polarity model to classify the sentiment of the filtered aspect span candidates.
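
The control flow of these three stages can be sketched in plain Python. Everything below is a stand-in: `run_absa`, `toy_propose`, `toy_is_aspect`, and `toy_polarity` are hypothetical functions that replace the spaCy candidate extractor and the two SetFit models, so the sketch shows only how the stages connect, not how any stage is implemented. The `span:sentence` input string mirrors the examples shown under Model Labels.

```python
def run_absa(sentence, propose, is_aspect, polarity_of):
    """Chain candidate selection, aspect filtering, and polarity classification."""
    results = []
    for span in propose(sentence):                  # stage 1: candidate spans
        model_input = f"{span}:{sentence}"          # format seen in Model Labels
        if is_aspect(model_input):                  # stage 2: aspect filter
            polarity = polarity_of(span, sentence)  # stage 3: polarity
            results.append({"span": span, "polarity": polarity})
    return results


# Toy stand-ins (assumptions for illustration, not real models).
def toy_propose(sentence):
    words = [w.strip(".,").lower() for w in sentence.split()]
    return [w for w in words if w in {"food", "venue", "way"}]


def toy_is_aspect(model_input):
    span = model_input.split(":", 1)[0]
    return span in {"food", "venue"}


def toy_polarity(span, sentence):
    return "positive" if span == "food" else "negative"


sentence = "The food was great, but the venue is just way too busy."
results = run_absa(sentence, toy_propose, toy_is_aspect, toy_polarity)
print(results)
```

Here the candidate "way" is proposed in stage 1 and dropped in stage 2, which is the role this aspect model plays in the full system.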

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
spaCy Model: en_core_web_lg
SetFitABSA Aspect Model: NazmusAshrafi/mams-ds-setfit-MiniLM-mpnet-absa-tesla-tweet-aspect
SetFitABSA Polarity Model: NazmusAshrafi/mams-ds-setfit-MiniLM-mpnet-absa-tesla-tweet-polarity
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
aspect	"food:It might be the best sit down food I've had in the area, so if you are going to the upright citizen brigade, or the garden, it could be just the place for you." "place:It might be the best sit down food I've had in the area, so if you are going to the upright citizen brigade, or the garden, it could be just the place for you." 'service:Though the service might be a little slow, the waitresses are very friendly.'
no aspect	"sit:It might be the best sit down food I've had in the area, so if you are going to the upright citizen brigade, or the garden, it could be just the place for you." "area:It might be the best sit down food I've had in the area, so if you are going to the upright citizen brigade, or the garden, it could be just the place for you." "citizen brigade:It might be the best sit down food I've had in the area, so if you are going to the upright citizen brigade, or the garden, it could be just the place for you."

Evaluation

Metrics

Label	Accuracy
all	0.9681

Uses

Direct Use for Inference

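First install the SetFit library. ABSA models also depend on spaCy and on the en_core_web_lg pipeline listed under Model Description:

pip install "setfit[absa]"
python -m spacy download en_core_web_lg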

Then you can load this model and run inference.

from setfit import AbsaModel

# Download from the 🤗 Hub
model = AbsaModel.from_pretrained(
    "NazmusAshrafi/mams-ds-setfit-MiniLM-mpnet-absa-tesla-tweet-aspect",
    "NazmusAshrafi/mams-ds-setfit-MiniLM-mpnet-absa-tesla-tweet-polarity",
)
# Run inference
preds = model("The food was great, but the venue is just way too busy.")
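
A short sketch of post-processing the predictions follows. It assumes that the result for one sentence is a list of dictionaries with "span" and "polarity" keys, as described in the SetFit ABSA documentation; the `example_preds` values and the helper `group_spans_by_polarity` are hypothetical, and the actual polarity label names depend on the polarity model.

```python
from collections import defaultdict


def group_spans_by_polarity(sentence_preds):
    """Group predicted aspect spans by their polarity label.

    Expects a list of {"span": ..., "polarity": ...} dicts for one sentence
    (assumed output shape; hypothetical helper, not part of SetFit).
    """
    grouped = defaultdict(list)
    for pred in sentence_preds:
        grouped[pred["polarity"]].append(pred["span"])
    return dict(grouped)


# Hypothetical predictions for illustration only; real output depends on the model.
example_preds = [
    {"span": "food", "polarity": "positive"},
    {"span": "venue", "polarity": "negative"},
]

summary = group_spans_by_polarity(example_preds)
print(summary)
```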

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	8	26.6069	52

Label	Training Sample Count
no aspect	229
aspect	33
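
The label counts are imbalanced, which matters when reading the accuracy figure. The arithmetic below computes the share of the majority class in the training set; the class distribution of the test set is not stated in this card, so this is only a reference point under the assumption that it is similar.

```python
# Training sample counts copied from the table above.
counts = {"no aspect": 229, "aspect": 33}

total = sum(counts.values())
majority_label = max(counts, key=counts.get)
majority_share = counts[majority_label] / total

print(f"total samples: {total}")
print(f"majority class: {majority_label} ({majority_share:.4f})")
```

On data with this distribution, always predicting "no aspect" would already be right about 87% of the time, so per-class precision and recall would say more than accuracy alone.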

Training Hyperparameters

batch_size: (16, 2)
num_epochs: (1, 16)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
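
Several values above are tuples. In SetFit's TrainingArguments, a tuple is documented as (embedding fine-tuning phase, classifier training phase). The sketch below separates the listed values by phase; `split_phases` is a hypothetical helper, and only a subset of the hyperparameters is included. With end_to_end: False and a LogisticRegression head, the classifier-phase values probably have little effect because scikit-learn fits the head directly, but treat that as an assumption.

```python
def split_phases(hparams):
    """Split tuple-valued settings into per-phase dicts.

    Returns (embedding_phase, classifier_phase, shared), where shared holds
    scalar settings. Hypothetical helper, not part of the SetFit API.
    """
    embedding, classifier, shared = {}, {}, {}
    for name, value in hparams.items():
        if isinstance(value, tuple):
            embedding[name], classifier[name] = value
        else:
            shared[name] = value
    return embedding, classifier, shared


# Values copied from the hyperparameter list above.
hparams = {
    "batch_size": (16, 2),
    "num_epochs": (1, 16),
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
}

embedding_phase, classifier_phase, shared = split_phases(hparams)
print("embedding phase: ", embedding_phase)
print("classifier phase:", classifier_phase)
print("shared:          ", shared)
```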

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.2315	-
0.0149	50	0.2637	-
0.0297	100	0.1795	-
0.0446	150	0.1164	-
0.0595	200	0.0131	-
0.0744	250	0.0036	-
0.0892	300	0.0004	-
0.1041	350	0.0003	-
0.1190	400	0.0001	-
0.1338	450	0.0002	-
0.1487	500	0.0001	-
0.1636	550	0.0001	-
0.1785	600	0.0001	-
0.1933	650	0.0001	-
0.2082	700	0.0	-
0.2231	750	0.0001	-
0.2380	800	0.0001	-
0.2528	850	0.0	-
0.2677	900	0.0001	-
0.2826	950	0.0003	-
0.2974	1000	0.0008	-
0.3123	1050	0.0001	-
0.3272	1100	0.0	-
0.3421	1150	0.0	-
0.3569	1200	0.0	-
0.3718	1250	0.0	-
0.3867	1300	0.0	-
0.4015	1350	0.0	-
0.4164	1400	0.0	-
0.4313	1450	0.0	-
0.4462	1500	0.0	-
0.4610	1550	0.0	-
0.4759	1600	0.0	-
0.4908	1650	0.0	-
0.5057	1700	0.0	-
0.5205	1750	0.0	-
0.5354	1800	0.0	-
0.5503	1850	0.0	-
0.5651	1900	0.0	-
0.5800	1950	0.0	-
0.5949	2000	0.0	-
0.6098	2050	0.0	-
0.6246	2100	0.0	-
0.6395	2150	0.0	-
0.6544	2200	0.0	-
0.6692	2250	0.0	-
0.6841	2300	0.0	-
0.6990	2350	0.0	-
0.7139	2400	0.0	-
0.7287	2450	0.0	-
0.7436	2500	0.0	-
0.7585	2550	0.0	-
0.7733	2600	0.0	-
0.7882	2650	0.0	-
0.8031	2700	0.0	-
0.8180	2750	0.0	-
0.8328	2800	0.0	-
0.8477	2850	0.0	-
0.8626	2900	0.0	-
0.8775	2950	0.0	-
0.8923	3000	0.0	-
0.9072	3050	0.0	-
0.9221	3100	0.0	-
0.9369	3150	0.0	-
0.9518	3200	0.0	-
0.9667	3250	0.0	-
0.9816	3300	0.0	-
0.9964	3350	0.0	-
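
Two things can be read off the table with simple arithmetic: how quickly the training loss collapses, and roughly how many steps make up one epoch. The sketch below uses a subset of the rows above. Epoch values are rounded to four decimals, so the steps-per-epoch figure is an estimate, and because no validation loss was logged, the table says nothing about generalization.

```python
# (epoch, step, training_loss) rows copied from the table above (subset).
log = [
    (0.0003, 1, 0.2315),
    (0.0149, 50, 0.2637),
    (0.0297, 100, 0.1795),
    (0.0446, 150, 0.1164),
    (0.0595, 200, 0.0131),
    (0.0744, 250, 0.0036),
    (0.0892, 300, 0.0004),
    (0.1041, 350, 0.0003),
    (0.9964, 3350, 0.0),
]

threshold = 0.001

# First logged row whose training loss falls below the threshold.
first_below = next((epoch, step) for epoch, step, loss in log if loss < threshold)

# Estimate steps per epoch from the last logged row.
last_epoch, last_step, _ = log[-1]
steps_per_epoch = round(last_step / last_epoch)

print(f"loss < {threshold} first logged at epoch {first_below[0]}, step {first_below[1]}")
print(f"estimated steps per epoch: {steps_per_epoch}")
```

The training loss drops below 0.001 within roughly the first 9% of the single embedding epoch, which suggests that most of the remaining steps contribute little to the contrastive objective on this training set.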

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.4.0
spaCy: 3.7.4
Transformers: 4.37.2
PyTorch: 2.1.0+cu121
Datasets: 2.17.1
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
