metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:3400000
- loss:CosineSimilarityLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
- source_sentence: >-
གྲོང་དང་། གྲོང་ཁྱེར་དང་།
ཡུལ་རྣམས་སུ་ཡང་སྡིག་པ་དང་ལྟས་ངན་པ་ཐམས་ཅད་རབ་ཏུ་ཞི་བར་བགྱིད་དོ། །
sentences:
- >-
In the past you were unaware of the constant flow of your underlying
thoughts. Now, thanks to the work on your mind that you have undertaken,
you begin to be conscious of your thoughts. In his Ornament of Mahayana
Sutras, the protector Maitreya explains the four attentions and how to
achieve the nine mental states by means of the six forces. Accordingly,
among the six, it is with the force of hearing that you achieve the
first mental state, mental placement. At this point you simply focus
your mind on the object following the instructions on how to meditate
that you have only heard and have yet to apply.
- Destroy anger and conceit, And be endowed with humility.
- >-
If you ask why, Subhūti, it is because the emptiness of external and
internal phenomena that is not apprehended is neither apprehended, nor
is it not apprehended.
- source_sentence: >-
དེ་ལྟར་ན་ལུས་ལ་བརྟེན་པའི་བསམ་གཏན་གྱི་སྐབས་སུ་ལུས་ཀྱི་བསྒུལ་བསྐྱོད་མ་ཞིག་པ་གནད་ཡིན་ནོ།།
sentences:
- >-
The bodhisattva Vajra garbha replied, “The bodhisattva enters cessation
above the sixth bodhisattva bhūmi.
- >-
Answer: Here, although those two change moment by moment, they are not
used as examples from that viewpoint; rather, it is in consideration of
their unchanging continuum during states of having and not having
defilement.
- '{6.1.11} “In the spaces between the bones Are planted the five seeds.'
- source_sentence: >-
རབ་འབྱོར་འདི་ལྟ་སྟེ་དཔེར་ན།
སངས་རྒྱས་ཀྱི་ཆོས་མ་འདྲེས་པ་བཅྭོ་བརྒྱད་ནི་གནས་པ་ཡང་མ་ཡིན་མྱི་གནས་པ་ཡང་མ་ཡིན་ནོ།
།རབ་འབྱོར་དེ་བཞིན་དུ་ཐེག་པ་ཆེན་པོ་དེ་ཡང་གནས་པ་ཡང་མ་ཡིན་མྱི་གནས་པ་ཡང་མ་ཡིན་ནོ།
།དེ་ཅིའི་ཕྱིར་ཞེ་ན།
རབ་འབྱོར་དེ་བཞིན་གཤེགས་པའི་སྟོབས་བཅུའི་ངོ་བོ་ཉིད་ལ་ནི་གནས་པ་འམ་མྱི་གནས་པ་མྱེད་དོ།
།མྱི་འཇིགས་པ་རྣམས་ཀྱི་ངོ་བོ་ཉིད་ལ་ནི་གནས་པ་འམ་མྱི་གནས་པ་མྱེད་དོ།
།སོ་སོ་ཡང་དག་པར་རིག་པ་རྣམས་ཀྱི་ངོ་བོ་ཉིད་ལ་ནི་གནས་པ་འམ་མྱི་གནས་པ་མྱེད་དོ།
།
sentences:
- >-
Although this is the case, differences in the definition of conven
tional and ultimate truth can be found in the philosophical tenets of
the various schools.
- >-
The Conversion of the Pasandakas] Then, in a district lying not far to
the west of Magadha, he took up his abode in the place where 500
adherents of the Pasandaka teaching were residing.
- >-
And, Subhūti, just as the eighteen distinct qualities of the buddhas
neither rest nor do not rest, similarly, Subhūti, this Great Vehicle
does not rest, nor does it not rest.
- source_sentence: >-
གཉིས་པ་ནི་སྔར་མཐོང་སྒོམ་གྱི་སྤང་བྱ་མ་སྤངས་པས་འཆིང་བ་མཐའ་དག་དང་ལྡན་པ་ཞིག་དང་པོར་མཐོང་ལམ་གྱིས་མཐོང་སྤང་རྣམས་སྤངས་ནས་དེ་ནས་འཇིག་རྟེན་ལས་འདས་པའམ་འཇིག་རྟེན་པའི་སྒོམ་ལམ་ཡང་རུང་སྟེ་སྒོམ་སྤང་རྣམས་རིམ་གྱིས་སྤོང་བའོ།།
sentences:
- >-
If they have not abandoned one object to be abandoned by meditation,
they enter the first result. Since those who possesses all the ties
bandhana have not previously abandoned objects to be abandoned by the
paths of seeing or meditation, they first abandon objects to be
abandoned by the path of seeing. Then they serially abandon objects to
be abandoned by the path of meditation, through either the mundane path
or the transcendent path.
- >-
More extensively, that the qualities are immeasurable is expressed
through many forms such as “,heaps of attributes,” “inconceivable
attributes beyond the count of sands of the Ganges River,” “the buddha
basic element adorned with endless excellent marks and beauties,” and so
forth. Moreover, some of these have been indicated earlier. Similarly,
the conqueror Maitreya sets forth the qualities of the noumenal nature
body and the characteristics of attained qualities in his Ornament for
the Great Vehicle Sūtras: The nature body is asserted as the cause of
mastery over complete enjoyment Equality, subtlety, Relation with that,
And having mastery in displaying all enjoyments. and in his Ornament for
the Clear Realizations: A subduer’s nature body have attained
uncontaminated attributes entire purity the nature has the
characteristic. The eight aspects by way of thoroughly dividing The
sources overcome by magnificent splendor, Non afflictedness, the
knowledge from resolve, The clairvoyances, The analytical knowledges,
The four purities of all aspects, The ten sovereignties, the ten powers,
and Knowledge of all aspects, Is called the body of attributes.
- >-
For instance, when I see a friend from a distance, this constitutes a
mental episode which may appear as a single event but is in fact a
highly complex process.
- source_sentence: ཆོས་ཉིད་རྣམ་པར་མ་རྟོག་པའི་ཡེ་ཤེས་སུ་འདོད་པ་ལ།
sentences:
- >-
This is the meaning of perceiving the true reality. The first
Bhavanakrama continues: What does the perception of ultimate reality
signify? It signifies the non-cognition of any absolute self-nature of
all realities. The term "noncognition of all realities" should not be
construed to be the same as the dark void experienced by a blind man, a
person with his eyes shut, or someone lacking in mental application. As
the text states: The inconceivable nature of all phenomena, established
through analytical wisdom obtained in absorptive meditation, is the
ultimate reality beyond conception. Therefore, a meditator seeking the
perfect view must first settle the mind in absorptive equipoise and then
conduct meditational investigation through discerning wisdom. Once the
unerring awareness of perfect view is established, meditation with fixed
attentiveness alone, rather than alternating it with investigation,
should be the practice followed, until that view is mastered. This will
be illustrated through doctrinal expositions later. Elimination of
Doubts About the Essential View of Reality There are two sections:
Review of other Buddhist schools Establishing the meditational system of
our school.
- Make the Secret Mantra teachings prosper and flourish!
- >-
“ ‘Through further cultivation on just that path, they thoroughly
abandon attachment to sense objects and malice.
datasets:
- billingsmoore/mlotsawa_sim
- billingsmoore/mlotsawa_mt
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: ones
type: ones
metrics:
- type: pearson_cosine
value: .nan
name: Pearson Cosine
- type: spearman_cosine
value: .nan
name: Spearman Cosine
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: high
type: high
metrics:
- type: pearson_cosine
value: 0.4553051678079648
name: Pearson Cosine
- type: spearman_cosine
value: 0.44518236653271653
name: Spearman Cosine
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: low
type: low
metrics:
- type: pearson_cosine
value: 0.6362668920803624
name: Pearson Cosine
- type: spearman_cosine
value: 0.638849903249996
name: Spearman Cosine
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: neg
type: neg
metrics:
- type: pearson_cosine
value: 0.16953904102615214
name: Pearson Cosine
- type: spearman_cosine
value: 0.2036594857352673
name: Spearman Cosine
SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
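The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The sketch below illustrates masked mean pooling in plain Python. The function name mean_pool and the toy inputs are hypothetical, and it assumes that padding positions are excluded via the attention mask; it is an illustration, not the library's implementation.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    kept = max(kept, 1)
    return [total / kept for total in summed]


# Toy example: three token slots of dimension 2; the last slot is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))
```

Because the padded slot is masked out, its large values do not affect the pooled vector.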
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("khyentsevision/minilm-bo-en-sim")
# Run inference
sentences = [
'ཆོས་ཉིད་རྣམ་པར་མ་རྟོག་པའི་ཡེ་ཤེས་སུ་འདོད་པ་ལ།',
'This is the meaning of perceiving the true reality. The first Bhavanakrama continues: What does the perception of ultimate reality signify? It signifies the non-cognition of any absolute self-nature of all realities. The term "noncognition of all realities" should not be construed to be the same as the dark void experienced by a blind man, a person with his eyes shut, or someone lacking in mental application. As the text states: The inconceivable nature of all phenomena, established through analytical wisdom obtained in absorptive meditation, is the ultimate reality beyond conception. Therefore, a meditator seeking the perfect view must first settle the mind in absorptive equipoise and then conduct meditational investigation through discerning wisdom. Once the unerring awareness of perfect view is established, meditation with fixed attentiveness alone, rather than alternating it with investigation, should be the practice followed, until that view is mastered. This will be illustrated through doctrinal expositions later. Elimination of Doubts About the Essential View of Reality There are two sections: Review of other Buddhist schools Establishing the meditational system of our school.',
'Make the Secret Mantra teachings prosper and flourish!',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7118, 0.4199],
# [0.7118, 1.0000, 0.4087],
# [0.4199, 0.4087, 1.0000]])
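For cross-lingual retrieval, the same cosine scores can be used to rank English candidates against a Tibetan query. The sketch below shows the cosine computation and the ranking step in plain Python on toy vectors. The helpers cosine and rank_candidates are hypothetical names, and in practice the vectors would come from model.encode.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query: List[float], candidates: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for 384-dimensional embeddings.
query_vec = [1.0, 0.0, 1.0]
candidate_vecs = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_candidates(query_vec, candidate_vecs)
print([index for index, _ in ranking])
```

Cosine similarity ignores vector length, so the second candidate ranks first even though its norm differs from the query's.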
Evaluation
Metrics
Semantic Similarity
- Datasets: ones, high, low and neg
- Evaluated with EmbeddingSimilarityEvaluator
| Metric | ones | high | low | neg |
|---|---|---|---|---|
| pearson_cosine | nan | 0.4553 | 0.6363 | 0.1695 |
| spearman_cosine | nan | 0.4452 | 0.6388 | 0.2037 |
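The nan entries for ones are what one would expect when every gold score in a split is identical (the statistics for that split report min, mean, and max all equal to 1.0): correlation is undefined when one variable has zero variance. A minimal sketch in plain Python, using a hypothetical pearson helper and made-up predictions:

```python
import math
from typing import List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; returns nan when either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


predicted = [0.71, 0.64, 0.83]   # made-up cosine similarities
constant_gold = [1.0, 1.0, 1.0]  # every label identical
print(pearson(predicted, constant_gold))
```

For such a split, a statistic that does not rely on label variance (for example, the mean predicted similarity) may be more informative than a correlation.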
Training Details
Training Dataset
- Size: 3,400,000 training samples
- Columns: bo, en, and score
- Approximate statistics based on the first 1000 samples:

| | bo | en | score |
|---|---|---|---|
| type | string | string | float |
| details | min: 9 tokens, mean: 92.9 tokens, max: 512 tokens | min: 5 tokens, mean: 74.57 tokens, max: 512 tokens | min: 1.0, mean: 1.0, max: 1.0 |
- Samples:

| bo | en | score |
|---|---|---|
| ཁྱིམ་བདག་རྣམས་ཀྱིས་ལྷ་འབོད་ཅིང་འབྲུ་སྣ་འཐོར།། སྔགས་པས་མེ་ཏོག་འཐོར་ཞིང་བཀྲ་ཤིས་གང་ཤེས་བརྗོད། དགའ་སྟོན་བཀྲ་ཤིས་སྐྱིད་པའི་ལྷ་བྲོ་སོགས་རྩེ་བར་བྱའོ། མངྒ་ལཾ། | May the gods be victorious! SVĀSTI SVĀSTI BHRŪṂ BHRŪṂ SVĀHĀ The owners of the house call upon the deities and scatter the different kinds of grains. The tantric practitioner spreads flowers and utters whatever auspicious words they know. Then follows a feast, and auspicious and happy divine dances and so on should be performed. Maṅgalam. | 1.0 |
| ཡུམ་བརྒྱད་ཀྱིས་བོན་དངོས་སུ་ཡི་གེའི་རྣམ་པ་དང་། གཞན་ཡང་སངས་རྒྱས་ཀྱི་གསུང་ཆོས་ཀྱི་སྒྲ་དབྱངས་མཚོན་པར་བྱེད་པ་ཡིན་པས་སྐུ་གསུང་ཐུགས་རྟེན་གསུམ་ཀའི་མཚན་ཉིད་ཚང་བ་དང་། | The letters engraved on it are the eight seed-syllables of the eight consorts, and the bell itself symbolizes the Buddha's speech, the sound of the Dharma. So together, vajra and bell fulfil all the criteria of representations of the Buddha's body, speech and mind. | 1.0 |
| གཞོམ་གཞིག་ཡོངས་བྲལ་འཕོ་ཆེན་དྭངས་སྐུར་སྨོན། ། | May you remain in the pure kāya of great transference beyond dissolution or destruction. | 1.0 |

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
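CosineSimilarityLoss with an MSE loss_fct penalizes the squared difference between the cosine similarity of the two sentence embeddings and the gold score. The sketch below computes that objective for a toy batch in plain Python. The helper names and vectors are hypothetical, and the real loss operates on batched tensors rather than lists.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_mse_loss(bo_vecs: List[List[float]], en_vecs: List[List[float]], scores: List[float]) -> float:
    """Mean squared error between each pair's cosine similarity and its gold score."""
    errors = [(cosine(bo, en) - score) ** 2 for bo, en, score in zip(bo_vecs, en_vecs, scores)]
    return sum(errors) / len(errors)


# Toy batch: the first pair is identical, the second pair is orthogonal.
bo_batch = [[1.0, 0.0], [0.0, 1.0]]
en_batch = [[1.0, 0.0], [1.0, 0.0]]
gold = [1.0, 0.5]
print(cosine_mse_loss(bo_batch, en_batch, gold))
```

The first pair contributes no error because its similarity already matches the gold score; only the second pair is penalized.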
Evaluation Datasets
ones
- Size: 5,000 evaluation samples
- Columns: bo, en, and score
- Approximate statistics based on the first 1000 samples:

| | bo | en | score |
|---|---|---|---|
| type | string | string | float |
| details | min: 6 tokens, mean: 97.23 tokens, max: 512 tokens | min: 5 tokens, mean: 77.16 tokens, max: 439 tokens | min: 1.0, mean: 1.0, max: 1.0 |
- Samples:

| bo | en | score |
|---|---|---|
| གླེགས་བམ་ཐམས་ཅད་མེ་ལ་བསྲེགས།། | All the scriptures will be consumed in flames. | 1.0 |
| འས་འགགས་པ་མེད་པའི་ཐུགས་རྗེའི་དོན་མཚོན་ལ། | The macron" indicates the meaning of compassion's unceasing activity. | 1.0 |
| སེམས་ཅན་གྱི་དབང་པོའི་རིམ་པ་ཤེས་པ་ལ་ཉན་ཐོས་འདི་དག་ནི་ཁོ་བོས་དམུས་ལོང་དང་འདྲ་བ་སྙམ་བྱེད་དོ།། | The disciple-vehicle is not ultimately valid, and you disciples are like men blind from birth, in regard to recognition of the degrees of the spiritual faculties of living beings.’ “ | 1.0 |

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
high
- Size: 5,000 evaluation samples
- Columns: bo, en, and score
- Approximate statistics based on the first 1000 samples:

| | bo | en | score |
|---|---|---|---|
| type | string | string | float |
| details | min: 8 tokens, mean: 93.83 tokens, max: 512 tokens | min: 6 tokens, mean: 91.8 tokens, max: 512 tokens | min: 0.5, mean: 0.62, max: 0.92 |
- Samples:

| bo | en | score |
|---|---|---|
| འདིར་ངེས་པར་ནུས་པ་ཆེན་པོ་ཐོབ་པས། ཆུ་འབྱུང་མ་དང་། ཐིག་ལེ་མཆོག་མ་སོགས་ལྷ་མོ་འགུགས་པར་སླ་བ་རྣམས་དང་། གཞན་ཡང་ལྷ་མོ་ཀླུ་མོ་གནོད་སྦྱིན་མོ་མ་མོ་ཆེན་མོ་རྣམས་ཀྱང་དགུག་པར་བྱ་སྟེ། | In this context, the glorious Vajrasattva, being secondary in nature (gnyis pa nyid yin pas), is not unlike like a king empowered by a precious wish-fulfilling jewel, through which everything is made possible and impossible. | 0.5368528366088867 |
| གདན་ཞུས་ཕེབས་འབྱོར་དེ་ལྟ་བུ་ནི་མཆོད་ཡོན་འབྲེལ་བ་ཡི་བྱུང་བ་བརྒྱ་ལས་དཔེ་མཚོན་གཅིག་སྟེ། དེས་གོང་མ་དང་འོག་མ། སྟེང་མ་དང་འགབ་མའི་འབྲེལ་བ་མཚོན་པ་མ་གཏོགས། འདྲ་མཉམ་དང་རང་དབང་མཚོན་མེད་པ་མངོན་གསལ་རེད། | Treatise of the Refutation of the Person Moreover, the. | 0.5699281692504883 |
| དཔེར་ན་མིག་ནད་དག་པའི་གང་ཟག་ལ་སྐྲ་ཤད་དང་རབ་རིབ་མི་སྣང་བ་ལྟ་བུའོ། ། | This is demonstrated by the fact that while sensory appearances change from the very moment they manifest, ceasing and passing away in a succession of later moments following former ones, ordinary mind does not take on the essence of every passing phenomenon and thereby become itself nonexistent as mind. " | 0.5960831642150879 |

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
low
- Size: 5,000 evaluation samples
- Columns: bo, en, and score
- Approximate statistics based on the first 1000 samples:

| | bo | en | score |
|---|---|---|---|
| type | string | string | float |
| details | min: 6 tokens, mean: 89.07 tokens, max: 512 tokens | min: 5 tokens, mean: 72.53 tokens, max: 385 tokens | min: 0.0, mean: 0.25, max: 0.5 |
- Samples:

| bo | en | score |
|---|---|---|
| དེ་ཡིས་ང་ཚོ་ཤེས་ལྡན་དང་ཆིངས་རྒྱ་ལས་གྲོལ་བ་ཞིག་བཟོ་ངེས་རེད་ཅེས་པ་ངས་དྲན་གྱིས་ཡོད། བསམ་བློ་འདི་ཡིས་ང་རང་སོ་སོའི་སྤྱི་ཚོགས་ཀྱི་ནང་དུ་ཤེས་ཡོན་ཐོག་ལ་ལས་ཀ་བྱེད་རྒྱུའི་བློ་སྟོབས་སྦྱིན་སོང་། ཕྱི་ལོ་༢༠༠༦ ནས་གནས་སྟངས་ཇེ་བཟང་དུ་ཕྱིན། སྐྱབས་བཅོལ་སྡོད་སྒར་གྱི་མི་འབོར་གྱི་རྒྱ་ཁྱོན་དང་། ལྷག་པར་དུ་ཤེས་ཡོན་གྱི་ཐོག་ལ་རྟག་ཏུ་དཀའ་ངལ་ཆེན་པོ་ཞིག་རེད། གཞོན་ནུའི་དུས་སྐབས་ནི་མིརྣམས་ཀྱི་འཇོན་ཐང་སྟོན་ཐུབ་ས་ཞིག་དང་། སྤྱི་ཚོགས་ཀྱི་ནང་དུ་ཕན་ཐོགས་ཆེན་པོ་བསྒྲུབ་ཐུབ་རྒྱུའི་དུས་ཤིག་རེད། འོན་ཀྱང་ཁོང་ཚོའི་མི་ཚེ་དེ་བེད་མེད་ཞིག་ཏུ་འགྱུར་སྐབས། ཁོང་ཚོས་ཉེན་ཁ་ལ་མི་འཛེམ་པར་འཕྲོད་བསྟེན་ལ་གནོད་པའི་གོམས་གཤིས་སྡུག་ཅག་རིགས་བསྟེན་འགོ་བཙུགས། ཕྱི་ལོ་༢༠༠༦ ནས་བཟུང་མེ་ལ་སྐྱབས་བཅོལ་སྡོད་སྒར་ནང་དུ་གནས་སྟངས་གང་འཚམ་ལྷོད་ཡངས་སུ་འགྱུར་ནས། ང་ཚོ་རང་དབང་གི་ཐོག་ནས་སྡོད་སྒར་གྱི་ཕྱི་ལ་བསྐྱོད་ཆོག་ཅིང་། ཚོང་ཁང་དང་སློབ་གྲྭ་ཡང་མང་དུ་ཕྱིན། དེ་ཚོ་ནི་སྡོད་སྒར་གྱི་དབང་འཛིན་ནས་ལྟ་རྟོགས་དེ་ཙམ་མེད་པའི་ཐོག་ནས་འགོ་ཚུགས། མེ་ལའི་སྡོད་སྒར་ནང་དུ་མཐོ་རིམ་སློབ་གཉེར་ཁང་ཁ་ཤས་འགོ་འཛུགས་བྱས། འདི་ཡིས་གཞོན་སྐྱེས་མང་པོ་ཞིག་ལ་འཛིན་རིམ་བཅུ་གཉིས་ཐོན་རྗེས་མཐོ་རིམ་སློབ་སྦྱོང་བྱེད... | On the basis of this presentation of the two truths, made on that an intrinsically iddentifiable status does not even in the conventional exist, there are numerous exegetical procedures, such as the negation of a general basis and so forth, which are not shared by the Own-Continuum theory-system. | 0.2081005722284317 |
| ཁྲིམས་དང་། སྟངས་འཛིན་དང་། དབང་ཆ་རྣམས་མེད་ན། ཕྲུ་གུའི་བསམ་བློར་གོ་རིམ་ཞིག་སླེབས་པ་བྱ་རྒྱུ་དེ་ཧ་ཅང་ཁག་པོ་རེད་འདུག་ནར་སོན་རྣམས་ནི་ཕྲུ་གུ་དང་གནས་སྟངས་འདྲ་ཡི་མ་རེད།། | the ripened result, the result corresponding to the cause, and the dominant result. | 0.06497567892074585 |
| མོན་མོ་བཀྲ་ཤིས་འཁྱིལ་འདྲེན་གྱི་སྐྱེ་བར་ལུང་གིས་ཟིན་པ་དེ་གཉིས་ཀྱི་རིགས་ལ་གཟིགས་ཏེ། | Use compresses of kar chhu and black mud’ for treating nose bleeding. | 0.035365331918001175 |

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
neg
- Size: 5,000 evaluation samples
- Columns: bo, en, and score
- Approximate statistics based on the first 1000 samples:

| | bo | en | score |
|---|---|---|---|
| type | string | string | float |
| details | min: 6 tokens, mean: 91.78 tokens, max: 512 tokens | min: 5 tokens, mean: 37.09 tokens, max: 365 tokens | min: -0.3, mean: -0.13, max: -0.0 |
- Samples:

| bo | en | score |
|---|---|---|
| འདི་ལྟར་ཆོས་ཐམས་ཅད་ནི་མ་བྱས་པ། མཉམ་པ་མེད་པ། མི་མཉམ་པ་མེད་པ། མ་ཞི་བ། ཉེ་བར་མ་ཞི་བ།རབ་ཏུ་མ་ཞི་བའོ།། | In the third year after she became the chief queen, she came to dislike the king’s constantly going to pay respect to the Bodhi tree. She stuck the fangs of a venomous snake into the Bodhi tree and it decayed. | -0.09662525355815887 |
| འདི་ལ་རྔོག་ལུགས། | It is very difficult if it is someone who has no background or practice. | -0.26938170194625854 |
| ཉི་མ་ཕོག་དང་ཞོ་འཐུངས་སྟོན་དུས་ལྡང་།། | They maintain their vows perfectly, | -0.24520885944366455 |

- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- learning_rate: 2e-05
- warmup_ratio: 0.1
- fp16: True
- auto_find_batch_size: True
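With warmup_ratio: 0.1 and the linear scheduler listed under All Hyperparameters, the learning rate would ramp from 0 to 2e-05 over the first 10% of optimizer steps and then decay linearly back to 0. The sketch below reproduces that shape in plain Python. The helper linear_lr is hypothetical, and total_steps is an illustrative number rather than the run's actual step count, which depends on the batch size chosen by auto_find_batch_size.

```python
import math


def linear_lr(step: int, total_steps: int, peak_lr: float = 2e-05, warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to the peak.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from the peak down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 1000  # illustrative only
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, total_steps))
```

The peak is reached exactly at the end of warmup, so most of the run is spent on the decaying part of the schedule.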
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: True
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Framework Versions
- Python: 3.11.9
- Sentence Transformers: 5.2.0
- Transformers: 4.57.3
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 4.4.2
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}