# b519e38e118d47b9896060f349afe555
This model is a fine-tuned version of google/umt5-base on the English–Swedish (`en-sv`) split of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

- Loss: 2.1393
- Data Size: 1.0
- Epoch Runtime: 20.4236
- BLEU: 9.7627
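The card does not yet include usage instructions, so here is a minimal inference sketch with the `transformers` API. The repo id is taken from the model tree at the bottom of this card; the generation settings are illustrative, not the settings used for the reported BLEU score, and whether a T5-style task prefix is needed depends on how the fine-tuning inputs were formatted.

```python
# Minimal sketch: English -> Swedish inference with this checkpoint.
# Repo id from the model tree; beam settings are illustrative only.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/b519e38e118d47b9896060f349afe555"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "The cat sat on the mat."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
```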
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
### Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size (fraction) | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.2365 | 0 | 2.2475 | 0.0327 |
| No log | 1 | 77 | 10.9971 | 0.0078 | 2.4500 | 0.0426 |
| No log | 2 | 154 | 10.6562 | 0.0156 | 3.3247 | 0.0677 |
| No log | 3 | 231 | 10.6224 | 0.0312 | 4.4267 | 0.0652 |
| No log | 4 | 308 | 10.2949 | 0.0625 | 5.3928 | 0.1007 |
| No log | 5 | 385 | 10.3939 | 0.125 | 7.5376 | 0.0614 |
| 1.6745 | 6 | 462 | 8.9687 | 0.25 | 10.6339 | 0.0558 |
| 6.6052 | 7 | 539 | 7.3565 | 0.5 | 14.3264 | 0.1602 |
| 11.4468 | 8 | 616 | 6.7779 | 1.0 | 23.0483 | 0.2634 |
| 9.202 | 9 | 693 | 4.5088 | 1.0 | 20.9401 | 2.1507 |
| 5.3202 | 10 | 770 | 3.1119 | 1.0 | 21.2839 | 11.0441 |
| 4.5071 | 11 | 847 | 2.6812 | 1.0 | 21.5924 | 6.0597 |
| 3.6237 | 12 | 924 | 2.5237 | 1.0 | 20.3424 | 6.8789 |
| 3.2205 | 13 | 1001 | 2.4199 | 1.0 | 20.0496 | 7.3632 |
| 3.1032 | 14 | 1078 | 2.3535 | 1.0 | 20.5490 | 7.7015 |
| 2.8917 | 15 | 1155 | 2.2919 | 1.0 | 20.7278 | 7.9977 |
| 2.7819 | 16 | 1232 | 2.2653 | 1.0 | 21.7191 | 8.1266 |
| 2.6539 | 17 | 1309 | 2.2321 | 1.0 | 20.2263 | 8.3158 |
| 2.5842 | 18 | 1386 | 2.2095 | 1.0 | 20.8697 | 8.4720 |
| 2.4796 | 19 | 1463 | 2.2000 | 1.0 | 21.1639 | 8.6951 |
| 2.4578 | 20 | 1540 | 2.1901 | 1.0 | 21.2464 | 8.7047 |
| 2.3617 | 21 | 1617 | 2.1707 | 1.0 | 20.0940 | 8.8880 |
| 2.3159 | 22 | 1694 | 2.1584 | 1.0 | 19.7261 | 8.8698 |
| 2.2381 | 23 | 1771 | 2.1523 | 1.0 | 19.9568 | 9.0408 |
| 2.214 | 24 | 1848 | 2.1462 | 1.0 | 20.9034 | 9.0797 |
| 2.1296 | 25 | 1925 | 2.1346 | 1.0 | 21.1286 | 9.0972 |
| 2.0865 | 26 | 2002 | 2.1356 | 1.0 | 19.7405 | 9.2238 |
| 2.0496 | 27 | 2079 | 2.1321 | 1.0 | 20.2183 | 9.2411 |
| 1.987 | 28 | 2156 | 2.1282 | 1.0 | 20.9950 | 9.3636 |
| 1.9553 | 29 | 2233 | 2.1312 | 1.0 | 22.2056 | 9.4380 |
| 1.8989 | 30 | 2310 | 2.1294 | 1.0 | 20.4057 | 9.4722 |
| 1.8778 | 31 | 2387 | 2.1312 | 1.0 | 20.6044 | 9.4259 |
| 1.8229 | 32 | 2464 | 2.1233 | 1.0 | 20.6957 | 9.4481 |
| 1.801 | 33 | 2541 | 2.1257 | 1.0 | 20.8279 | 9.6641 |
| 1.7688 | 34 | 2618 | 2.1413 | 1.0 | 22.2676 | 9.5680 |
| 1.7347 | 35 | 2695 | 2.1368 | 1.0 | 20.5484 | 9.7490 |
| 1.6756 | 36 | 2772 | 2.1393 | 1.0 | 20.4236 | 9.7627 |
### Framework versions

- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
### Model tree for contemmcm/b519e38e118d47b9896060f349afe555

- Base model: google/umt5-base