Libraries

How to use Barakuga/me5-checkthat-task1 with sentence-transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Barakuga/me5-checkthat-task1")

sentences = [
    "query: @user Alright we would both see eye to eye that contraception is ethically better, correct? I would think the most hard‑line #catholics would agree with that even though they don't support it. So sentientce is your line? There is proof sentientce at roughly 18 to 25 weeks.",
    "passage: title: Aggressive or Moderate Fluid Resuscitation in Acute Pancreatitis abstract: Early aggressive hydration is widely recommended for the management of acute pancreatitis, but evidence for this practice is limited.",
    "passage: title: Imperfect Vaccination Can Enhance the Transmission of Highly Virulent Pathogens abstract: Could some vaccines drive the evolution of more virulent pathogens? Conventional wisdom is that natural selection will remove highly lethal pathogens if host death greatly reduces transmission. Vaccines that keep hosts alive but still allow transmission could thus allow very virulent strains to circulate in a population. Here we show experimentally that immunization of chickens against Marek's disease virus enhances the fitness of more virulent strains, making it possible for hyperpathogenic strains to transmit. Immunity elicited by direct vaccination or by maternal vaccination prolongs host survival but does not prevent infection, viral replication or transmission, thus extending the infectious periods of strains otherwise too lethal to persist. Our data show that anti-disease vaccines that do not prevent transmission can create conditions that promote the emergence of pathogen strains that cause more severe disease in unvaccinated hosts.",
    "passage: title: When is the Capacity for Sentience Acquired During Human Fetal Development? abstract: The question of when the human fetus develops the capacity for sentience is central to many contentious issues. The answer could and should influence attitudes toward IVF and embryo experimentation, abortion, and fetal and neonatal surgery. For the fetus to be described as sentient, the somatosensory pathways from the periphery to the primary somatosensory region of the cerebral cortex must be established and functional. Fetal behaviour is described and the development of the underlying anatomical substrate and the chemical and electrical pathways involved in the detection, transmission, and perception of somatosensory stimuli are reviewed.It is concluded that the basic neuronal substrate required to transmit somatosensory information develops by mid-gestation (18 to 25 weeks), however, the functional capacity of the neural circuitry is limited by the immaturity of the system. Thus, 18 to 25 weeks is considered the earliest stage at which the lower boundary of sentience could be placed. At this stage of development, however, there is little evidence for the central processing of somatosensory information. Before 30 weeks gestational age, EEG activity is extremely limited and somatosensory evoked potentials are immature, lacking components which correlate with information processing within the cerebral cortex. Thus, 30 weeks is considered a more plausible stage of fetal development at which the lower boundary for sentience could be placed."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])

SentenceTransformer based on intfloat/multilingual-e5-large

This is a sentence-transformers model finetuned from intfloat/multilingual-e5-large. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: intfloat/multilingual-e5-large
Maximum Sequence Length: 512 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
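The three modules above can be read as a pipeline: the transformer produces one vector per token, the pooling layer averages the token vectors (mean pooling is the only mode set to True), and the last layer scales the result to unit length. The following is a minimal sketch of the last two steps in plain Python, assuming toy 2-dimensional vectors instead of the model's 1024 dimensions. The helper names `mean_pool` and `l2_normalize` are illustrative and are not part of the sentence-transformers API.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention mask is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length, as a Normalize step would."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[3.0, 0.0], [0.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)              # [1.5, 2.0]
print(sentence_embedding)  # [0.6, 0.8]
```

Because the output has unit length, the cosine similarity between two embeddings equals their dot product.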

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Barakuga/me5-checkthat-task1")
# Run inference
sentences = [
    'query: @user It’s quite possibly the reverse (Pfizer @ 39%) “Our data show that anti-disease vaccines that do not prevent transmission can create conditions that promote the emergence of pathogen strains that cause more severe disease in unvaccinated hosts” Source:',
    "passage: title: Imperfect Vaccination Can Enhance the Transmission of Highly Virulent Pathogens abstract: Could some vaccines drive the evolution of more virulent pathogens? Conventional wisdom is that natural selection will remove highly lethal pathogens if host death greatly reduces transmission. Vaccines that keep hosts alive but still allow transmission could thus allow very virulent strains to circulate in a population. Here we show experimentally that immunization of chickens against Marek's disease virus enhances the fitness of more virulent strains, making it possible for hyperpathogenic strains to transmit. Immunity elicited by direct vaccination or by maternal vaccination prolongs host survival but does not prevent infection, viral replication or transmission, thus extending the infectious periods of strains otherwise too lethal to persist. Our data show that anti-disease vaccines that do not prevent transmission can create conditions that promote the emergence of pathogen strains that cause more severe disease in unvaccinated hosts.",
    'passage: title: Access to lifesaving medical resources for African countries: COVID-19 testing and response, ethics, and politics abstract: Coronavirus disease 2019 (COVID-19) has revealed how strikingly unprepared the world is for a pandemic and how easily viruses spread in our interconnected world. A governance crisis is unfolding alongside the pandemic as health officials around the world compete for access to scarce medical supplies. As governments of African countries, and those in low-income and middle-income countries around the world, seek to avoid potentially catastrophic epidemics and learn from what has worked in other countries, testing and other medical resources are of concern.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.7027, -0.0760],
#         [ 0.7027,  1.0000, -0.0821],
#         [-0.0760, -0.0821,  1.0000]])
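For retrieval, only the query row of the similarity matrix matters: the passages are sorted by their score against the query. The sketch below reuses the scores printed above (row 0 without the self-similarity entry) and ranks the two passages. The helper `rank_passages` is a hypothetical name used for illustration, not a library function.

```python
def rank_passages(query_scores, passages, top_k=1):
    """Return the top_k (passage, score) pairs, highest score first."""
    order = sorted(range(len(passages)), key=lambda i: query_scores[i], reverse=True)
    return [(passages[i], query_scores[i]) for i in order[:top_k]]

# Passage titles from the example above, in the same order as the input list.
passages = [
    "Imperfect Vaccination Can Enhance the Transmission of Highly Virulent Pathogens",
    "Access to lifesaving medical resources for African countries: "
    "COVID-19 testing and response, ethics, and politics",
]
# similarities[0][1:] from the printed tensor: query vs. each passage.
query_scores = [0.7027, -0.0760]

ranked = rank_passages(query_scores, passages, top_k=2)
for title, score in ranked:
    print(f"{score:+.4f}  {title}")
```

The tweet quotes the vaccination abstract directly, so that passage ranks first by a wide margin.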

Training Details

Training Dataset

Unnamed Dataset

Size: 19,244 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 1000 samples:
	sentence_0	sentence_1
type	string	string
details	min: 21 tokens, mean: 59.89 tokens, max: 134 tokens	min: 30 tokens, mean: 336.42 tokens, max: 512 tokens

Samples:

sentence_0	sentence_1
`query: In what way will Language Modelers such as ChatGPT Impact Jobs and Sectors? by Edward W. Felten, Manav Raj, Robert Seamans :: SSRN`	passage: title: How will Language Modelers like ChatGPT Affect Occupations and Industries? abstract: Recent dramatic increases in AI language modeling capabilities has led to many questions about the effect of these technologies on the economy. In this paper we present a methodology to systematically assess the extent to which occupations, industries and geographies are exposed to advances in AI language modeling capabilities. We find that the top occupations exposed to language modeling include telemarketers and a variety of post-secondary teachers such as English language and literature, foreign language and literature, and history teachers. We find the top industries exposed to advances in language modeling are legal services and securities, commodities, and investments. We also find a positive correlation between wages and exposure to AI language modeling.
`query: Spannende Studie zu #POTS. Sie verdeutlicht, was man ärztlich häufig wahrnimmt, nämlich dass die geistige Leistungsfähigkeit beim Sitzen und Stehen nachlässt. In diesem Fall waren Konzentration und Ausführungsfunktion gegen Kontrollen vermindert. 1/6`	passage: title: Cognitive functioning in postural orthostatic tachycardia syndrome among different body positions: a prospective pilot study (POTSKog study) abstract: Approximately 96% of patients with postural orthostatic tachycardia syndrome (PoTS) report cognitive complaints. We investigated whether cognitive function is impaired during sitting and active standing in 30 patients with PoTS compared with 30 healthy controls (HCs) and whether it will improve with the counter manoeuvre of leg crossing.In this prospective pilot study, patients with PoTS were compared to HCs matched for age, sex, and educational level. Baseline data included norepinephrine plasma levels, autonomic testing and baseline cognitive function in a seated position [the Montreal Cognitive Assessment, the Leistungsprüfsystem (LPS) subtests 1 and 2, and the Test of Attentional Performance (TAP)]. Cognitive functioning was examined in a randomized order in supine, upright and upright legs crossed position. The prima...
`query: We now know that Omicron is far from mild. In the unvaccinated it is equally lethal, while being more contagious, as other strains. Most children were and remain not vaccinated. We were aware that in winter 2021. And we know it now.`	passage: title: Intrinsic and effective severity of COVID-19 cases infected with the ancestral strain and Omicron BA.2 variant in Hong Kong abstract: ABSTRACT Background Understanding severity of infections with SARS-CoV-2 and its variants is crucial to inform public health measures. Here we used COVID-19 patient data from Hong Kong to characterise the severity profile of COVID-19 and to examine factors associated with fatality of infection. Methods Time-varying and age-specific effective severity measured by case-hospitalization risk and hospitalization risk was estimated with all individual COVID-19 case data collected in Hong Kong from 23 January 2020 through to 26 October 2022 over six epidemic waves, in comparison with estimates of influenza A(H1N1)pdm09 during the 2009 pandemic. The intrinsic severity of Omicron BA.2 was compared with the estimate for the ancestral strain with the data from unvaccinated patients without previous infections. Factors potentially associated with the...

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
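With these parameters, each query in a batch is scored against every passage in the same batch using cosine similarity multiplied by the scale of 20.0, and the loss is the cross-entropy of picking the query's own passage; the other passages act as in-batch negatives. The following is a simplified sketch of that idea in plain Python for the query_to_doc direction only. It is not the library's implementation, it ignores options such as gather_across_devices and the hardness settings, and the function names are illustrative.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(queries, passages, scale=20.0):
    """Mean cross-entropy where passage i is the positive for query i."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, passage) for passage in passages]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs (hypothetical 2-dimensional embeddings).
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(queries, [[1.0, 0.0], [0.0, 1.0]])  # positives match their queries
swapped = mnr_loss(queries, [[0.0, 1.0], [1.0, 0.0]])  # positives are mismatched
print(aligned, swapped)
```

The aligned batch gives a loss near zero and the swapped batch a loss near the scale value, which shows how the scale sharpens the softmax over in-batch candidates.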

Training Hyperparameters

Non-Default Hyperparameters

per_device_train_batch_size: 4
per_device_eval_batch_size: 4
num_train_epochs: 1
fp16: True
multi_dataset_batch_sampler: round_robin

All Hyperparameters

do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 4
per_device_eval_batch_size: 4
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: None
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: True
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}
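The values learning_rate: 5e-05, lr_scheduler_type: linear, and warmup_steps: 0 together describe a learning rate that starts at its peak and decays linearly to zero over training. A minimal sketch of that schedule follows, assuming 4,811 total optimizer steps (19,244 samples at batch size 4 on a single device for one epoch; the device count is an assumption, not stated in the card).

```python
PEAK_LR = 5e-05      # learning_rate from the list above
TOTAL_STEPS = 4811   # assumed: ceil(19244 / 4) steps, one epoch, one device

def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS):
    """Linear decay from peak_lr at step 0 to zero at total_steps (no warmup)."""
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / total_steps

for step in (0, 2500, 4811):
    print(step, linear_lr(step))
```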

Training Logs

Epoch	Step	Training Loss
0.1039	500	0.1880
0.2079	1000	0.1486
0.3118	1500	0.1368
0.4157	2000	0.1392
0.5196	2500	0.1169
0.6236	3000	0.1305
0.7275	3500	0.1070
0.8314	4000	0.1079
0.9354	4500	0.1064
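The Epoch column follows from the dataset size and batch size reported earlier: 19,244 samples at a per-device batch size of 4 give 4,811 steps per epoch, so step 500 corresponds to 500 / 4811 ≈ 0.1039 of an epoch. The short check below reproduces the logged values, assuming a single device (gradient_accumulation_steps is 1 and dataloader_drop_last is False, per the hyperparameters).

```python
import math

NUM_SAMPLES = 19244  # training set size from the card
BATCH_SIZE = 4       # per_device_train_batch_size, single device assumed

steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)

def epoch_at(step):
    """Fraction of the epoch completed at a given step, rounded as in the log."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch)  # 4811
for step in (500, 1000, 4500):
    print(step, epoch_at(step))
```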

Framework Versions

Python: 3.12.13
Sentence Transformers: 5.3.0
Transformers: 5.0.0
PyTorch: 2.10.0+cu128
Accelerate: 1.13.0
Datasets: 4.0.0
Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{oord2019representationlearningcontrastivepredictive,
      title={Representation Learning with Contrastive Predictive Coding},
      author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
      year={2019},
      eprint={1807.03748},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1807.03748},
}