bart-large-wiki-doc

This model is a fine-tuned version of facebook/bart-large; the training dataset is not documented. It achieves the following results on the evaluation set (these match the step 15000 row, epoch 11.19, of the training results table):

  • Loss: 5.1601
  • Sari: 50.8615
  • Sari Add: 13.6308
  • Sari Keep: 44.3195
  • Sari Del: 94.6343
  • Fkgl: 6.7741
  • Bleu: 25.3408
  • D Sari: 0.4652
  • D Sari Keep: 0.3942
  • D Sari Del: 0.809
  • D Sari Add: 0.1924
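
SARI is the arithmetic mean of its add, keep, and delete components, and the D Sari figures above satisfy the same relation. The short check below recomputes both aggregates from the reported components; the function name is illustrative only and is not part of any evaluation library.

```python
def mean_of_operations(add: float, keep: float, delete: float) -> float:
    """Average the three edit-operation scores into one aggregate score."""
    return (add + keep + delete) / 3


# Component values copied from the evaluation results above.
sari = mean_of_operations(add=13.6308, keep=44.3195, delete=94.6343)
d_sari = mean_of_operations(add=0.1924, keep=0.3942, delete=0.809)

print(round(sari, 4))    # 50.8615
print(round(d_sari, 4))  # 0.4652
```

The very high delete score combined with a low add score pulls the average up, so the aggregate is best read alongside its three components.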

Model description

More information needed

Intended uses & limitations

More information needed
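
Until this section is filled in, the following is a minimal loading sketch rather than documented usage. It assumes the checkpoint is published under the repository ID taiypeo/bart-large-wiki-doc with its tokenizer files, and that the model is meant for text simplification, which the SARI and FKGL metrics suggest but the card does not state. The function name `simplify` and the generation settings are illustrative choices, not values taken from training.

```python
MODEL_ID = "taiypeo/bart-large-wiki-doc"  # assumed repository ID


def simplify(text: str, max_new_tokens: int = 256, num_beams: int = 4) -> str:
    """Generate an output for `text` with the fine-tuned BART checkpoint."""
    # Imported lazily so this file can be read or imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # BART accepts at most 1024 input tokens, so longer documents are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # illustrative setting
        num_beams=num_beams,            # illustrative setting
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `simplify("some input text")` downloads the weights on first use (about 0.4B parameters stored in F32), so expect a large download.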

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 20
  • label_smoothing_factor: 0.3
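
The hyperparameters above imply an effective batch size and a learning-rate curve that can be sketched in a few lines. The example assumes a single training device and estimates the number of optimizer steps per epoch from the last logged row of the training results (step 25000 at epoch 18.6429); the resulting totals are approximations, not values reported by the trainer. The schedule follows the usual linear warmup and linear decay shape.

```python
import math

LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.06
NUM_EPOCHS = 20

# Examples consumed per optimizer step (single device assumed).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# ASSUMPTION: steps per epoch estimated from the logged step/epoch pair.
steps_per_epoch = 25000 / 18.6429
total_steps = math.ceil(NUM_EPOCHS * steps_per_epoch)
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size)       # 8
print(total_steps, warmup_steps)  # 26820 1610
```

Under these assumptions the learning rate peaks at 5e-05 after roughly 1,610 steps and decays to zero at about step 26,820.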

Training results

| Training Loss | Epoch | Step | Validation Loss | Sari | Sari Add | Sari Keep | Sari Del | Fkgl | Bleu | D Sari | D Sari Keep | D Sari Del | D Sari Add |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5.2573 | 0.7459 | 1000 | 5.1128 | 48.1497 | 8.6665 | 40.5591 | 95.2235 | 7.106 | 13.9528 | 0.3403 | 0.0261 | 0.9948 | 0.0 |
| 4.8677 | 1.4915 | 2000 | 5.0729 | 48.9106 | 10.0717 | 41.473 | 95.187 | 6.9782 | 15.9556 | 0.3448 | 0.0397 | 0.9947 | 0.0 |
| 4.7689 | 2.2372 | 3000 | 5.0350 | 49.3306 | 10.7042 | 41.9925 | 95.2951 | 6.9015 | 15.8936 | 0.3431 | 0.0344 | 0.9947 | 0.0 |
| 4.6608 | 2.9830 | 4000 | 5.0289 | 49.8733 | 11.7733 | 42.4512 | 95.3954 | 6.7831 | 15.8724 | 0.3397 | 0.0242 | 0.9949 | 0.0 |
| 4.5534 | 3.7287 | 5000 | 5.0233 | 49.9727 | 11.4246 | 43.2055 | 95.2881 | 6.8518 | 17.3442 | 0.3431 | 0.0344 | 0.9947 | 0.0 |
| 4.4703 | 4.4744 | 6000 | 5.0376 | 49.9166 | 11.9432 | 42.5405 | 95.266 | 6.8747 | 17.1222 | 0.4495 | 0.3716 | 0.8118 | 0.1652 |
| 4.4451 | 5.2200 | 7000 | 5.0560 | 50.1126 | 12.2016 | 42.8193 | 95.3169 | 6.7038 | 17.4191 | 0.4507 | 0.3723 | 0.8136 | 0.1663 |
| 4.396 | 5.9659 | 8000 | 5.0326 | 50.7081 | 12.7094 | 44.1816 | 95.2332 | 6.6043 | 20.0309 | 0.4599 | 0.3867 | 0.8153 | 0.1778 |
| 4.3371 | 6.7115 | 9000 | 5.0637 | 50.6303 | 12.9272 | 43.8282 | 95.1356 | 6.763 | 20.4947 | 0.4587 | 0.3834 | 0.8113 | 0.1814 |
| 4.3014 | 7.4572 | 10000 | 5.0950 | 50.4656 | 12.8875 | 43.2239 | 95.2854 | 6.6203 | 18.5182 | 0.4575 | 0.3834 | 0.8113 | 0.1779 |
| 4.2664 | 8.2029 | 11000 | 5.1100 | 50.173 | 12.8216 | 43.0231 | 94.6743 | 6.7837 | 23.116 | 0.4554 | 0.3843 | 0.7992 | 0.1829 |
| 4.2373 | 8.9487 | 12000 | 5.1019 | 50.4108 | 13.0119 | 43.6241 | 94.5963 | 6.9084 | 24.2771 | 0.4582 | 0.3914 | 0.7996 | 0.1836 |
| 4.2038 | 9.6944 | 13000 | 5.1239 | 50.4995 | 13.3065 | 43.4134 | 94.7785 | 6.742 | 23.2113 | 0.4589 | 0.3859 | 0.8035 | 0.1874 |
| 4.1759 | 10.4401 | 14000 | 5.1491 | 50.4754 | 13.0605 | 43.7134 | 94.6524 | 6.8311 | 23.9938 | 0.4583 | 0.3853 | 0.8074 | 0.1821 |
| 4.1618 | 11.1857 | 15000 | 5.1601 | 50.8615 | 13.6308 | 44.3195 | 94.6343 | 6.7741 | 25.3408 | 0.4652 | 0.3942 | 0.809 | 0.1924 |
| 4.1347 | 11.9316 | 16000 | 5.1643 | 50.5104 | 13.6341 | 43.3236 | 94.5734 | 6.8763 | 24.5365 | 0.4609 | 0.3836 | 0.8046 | 0.1944 |
| 4.1155 | 12.6772 | 17000 | 5.1838 | 50.2122 | 13.6879 | 42.6409 | 94.3079 | 6.8788 | 25.6935 | 0.4586 | 0.3886 | 0.7954 | 0.1918 |
| 4.1004 | 13.4229 | 18000 | 5.1975 | 50.3625 | 13.5012 | 43.2364 | 94.3499 | 6.8343 | 25.8639 | 0.4609 | 0.3921 | 0.7997 | 0.191 |
| 4.0916 | 14.1686 | 19000 | 5.2197 | 50.2762 | 13.5701 | 42.9087 | 94.3498 | 6.6728 | 25.7121 | 0.4623 | 0.3926 | 0.8021 | 0.1921 |
| 4.0773 | 14.9144 | 20000 | 5.2248 | 50.4351 | 13.6553 | 43.2455 | 94.4045 | 6.7491 | 25.8344 | 0.4628 | 0.3952 | 0.7992 | 0.1939 |
| 4.0639 | 15.6601 | 21000 | 5.2276 | 49.6913 | 13.7193 | 41.6269 | 93.7278 | 7.1849 | 27.3581 | 0.4588 | 0.3911 | 0.7903 | 0.1951 |
| 4.0558 | 16.4057 | 22000 | 5.2378 | 50.309 | 13.6643 | 43.0305 | 94.2322 | 6.691 | 26.5571 | 0.4615 | 0.3913 | 0.7983 | 0.195 |
| 4.0454 | 17.1514 | 23000 | 5.2447 | 49.4843 | 13.563 | 41.5506 | 93.3394 | 7.1792 | 28.7607 | 0.4556 | 0.3915 | 0.7817 | 0.1936 |
| 4.0411 | 17.8973 | 24000 | 5.2533 | 50.5396 | 13.7453 | 43.538 | 94.3355 | 6.8688 | 26.4366 | 0.4642 | 0.3977 | 0.7999 | 0.1949 |
| 4.0327 | 18.6429 | 25000 | 5.2619 | 49.9081 | 13.6194 | 42.1831 | 93.9219 | 6.9129 | 27.2276 | 0.4605 | 0.3929 | 0.793 | 0.1956 |
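
The log shows validation loss and SARI peaking at different points in training. The sketch below finds the best row under each criterion, using the Step, Validation Loss, and Sari values copied from the results above. That the published checkpoint was selected by SARI is an assumption; the card does not state the selection metric.

```python
# (step, validation_loss, sari) for every logged evaluation.
ROWS = [
    (1000, 5.1128, 48.1497), (2000, 5.0729, 48.9106), (3000, 5.0350, 49.3306),
    (4000, 5.0289, 49.8733), (5000, 5.0233, 49.9727), (6000, 5.0376, 49.9166),
    (7000, 5.0560, 50.1126), (8000, 5.0326, 50.7081), (9000, 5.0637, 50.6303),
    (10000, 5.0950, 50.4656), (11000, 5.1100, 50.173), (12000, 5.1019, 50.4108),
    (13000, 5.1239, 50.4995), (14000, 5.1491, 50.4754), (15000, 5.1601, 50.8615),
    (16000, 5.1643, 50.5104), (17000, 5.1838, 50.2122), (18000, 5.1975, 50.3625),
    (19000, 5.2197, 50.2762), (20000, 5.2248, 50.4351), (21000, 5.2276, 49.6913),
    (22000, 5.2378, 50.309), (23000, 5.2447, 49.4843), (24000, 5.2533, 50.5396),
    (25000, 5.2619, 49.9081),
]

best_by_sari = max(ROWS, key=lambda row: row[2])  # highest SARI
best_by_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss

print(best_by_sari)  # (15000, 5.1601, 50.8615)
print(best_by_loss)  # (5000, 5.0233, 49.9727)
```

The highest SARI falls at step 15000, while validation loss is lowest at step 5000 and rises afterwards, so the two criteria would pick different checkpoints.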

Framework versions

  • Transformers 4.57.3
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.1

Model size: 0.4B params (Safetensors, tensor type F32)
Model repository: taiypeo/bart-large-wiki-doc (fine-tuned from facebook/bart-large)