b519e38e118d47b9896060f349afe555

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-sv] dataset. It achieves the following results on the evaluation set:

Loss: 2.1393
Data Size: 1.0
Epoch Runtime: 20.4236
Bleu: 9.7627

Model description

More information needed

Intended uses & limitations

More information needed
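As a starting point, here is a minimal inference sketch using the Transformers seq2seq classes. It rests on several assumptions not confirmed by this card: that the checkpoint is published under the repository id `contemmcm/b519e38e118d47b9896060f349afe555`, that it ships with its tokenizer, and that plain English text with no task prefix matches the input format used during fine-tuning. The helper name `translate` and the `max_new_tokens` default are hypothetical.

```python
# Assumed repository id; adjust if the checkpoint lives elsewhere.
MODEL_ID = "contemmcm/b519e38e118d47b9896060f349afe555"


def translate(text, model_id=MODEL_ID, max_new_tokens=128):
    """Translate English text to Swedish with the fine-tuned checkpoint.

    Assumption: the model expects raw source text without a task prefix.
    """
    # Imported lazily so that defining the helper does not require
    # transformers to be installed or the weights to be downloaded.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the source sentence and generate the target sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) generated sequence.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("The book is on the table.")` downloads the weights on first use. Given the BLEU score reported above, output quality should be checked before relying on it.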

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
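The total batch sizes listed above follow from the per-device batch size and the number of devices. The short check below reproduces that arithmetic. The gradient accumulation factor is not listed on this card, so a value of 1 is assumed; the helper name `effective_batch_size` is hypothetical.

```python
# Values copied from the hyperparameter list on this card.
hparams = {
    "train_batch_size": 8,   # per device
    "eval_batch_size": 8,    # per device
    "num_devices": 4,
    "total_train_batch_size": 32,
    "total_eval_batch_size": 32,
}

# Assumption: gradient accumulation is not listed, so 1 is used here.
GRAD_ACCUM_STEPS = 1


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Number of examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


train_total = effective_batch_size(
    hparams["train_batch_size"], hparams["num_devices"], GRAD_ACCUM_STEPS
)
eval_total = effective_batch_size(
    hparams["eval_batch_size"], hparams["num_devices"]
)

print(train_total, eval_total)
```

Both values agree with the reported totals of 32, which is consistent with the assumed accumulation factor of 1.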

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	11.2365	0	2.2475	0.0327
No log	1	77	10.9971	0.0078	2.4500	0.0426
No log	2	154	10.6562	0.0156	3.3247	0.0677
No log	3	231	10.6224	0.0312	4.4267	0.0652
No log	4	308	10.2949	0.0625	5.3928	0.1007
No log	5	385	10.3939	0.125	7.5376	0.0614
1.6745	6	462	8.9687	0.25	10.6339	0.0558
6.6052	7	539	7.3565	0.5	14.3264	0.1602
11.4468	8.0	616	6.7779	1.0	23.0483	0.2634
9.202	9.0	693	4.5088	1.0	20.9401	2.1507
5.3202	10.0	770	3.1119	1.0	21.2839	11.0441
4.5071	11.0	847	2.6812	1.0	21.5924	6.0597
3.6237	12.0	924	2.5237	1.0	20.3424	6.8789
3.2205	13.0	1001	2.4199	1.0	20.0496	7.3632
3.1032	14.0	1078	2.3535	1.0	20.5490	7.7015
2.8917	15.0	1155	2.2919	1.0	20.7278	7.9977
2.7819	16.0	1232	2.2653	1.0	21.7191	8.1266
2.6539	17.0	1309	2.2321	1.0	20.2263	8.3158
2.5842	18.0	1386	2.2095	1.0	20.8697	8.4720
2.4796	19.0	1463	2.2000	1.0	21.1639	8.6951
2.4578	20.0	1540	2.1901	1.0	21.2464	8.7047
2.3617	21.0	1617	2.1707	1.0	20.0940	8.8880
2.3159	22.0	1694	2.1584	1.0	19.7261	8.8698
2.2381	23.0	1771	2.1523	1.0	19.9568	9.0408
2.214	24.0	1848	2.1462	1.0	20.9034	9.0797
2.1296	25.0	1925	2.1346	1.0	21.1286	9.0972
2.0865	26.0	2002	2.1356	1.0	19.7405	9.2238
2.0496	27.0	2079	2.1321	1.0	20.2183	9.2411
1.987	28.0	2156	2.1282	1.0	20.9950	9.3636
1.9553	29.0	2233	2.1312	1.0	22.2056	9.4380
1.8989	30.0	2310	2.1294	1.0	20.4057	9.4722
1.8778	31.0	2387	2.1312	1.0	20.6044	9.4259
1.8229	32.0	2464	2.1233	1.0	20.6957	9.4481
1.801	33.0	2541	2.1257	1.0	20.8279	9.6641
1.7688	34.0	2618	2.1413	1.0	22.2676	9.5680
1.7347	35.0	2695	2.1368	1.0	20.5484	9.7490
1.6756	36.0	2772	2.1393	1.0	20.4236	9.7627
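The reported evaluation loss and BLEU correspond to the last logged epoch, which is not necessarily the epoch with the lowest validation loss. The sketch below compares the two using the final nine rows of the table (epochs 28 to 36), copied by hand. The variable names are illustrative only.

```python
# (epoch, validation_loss, bleu) for epochs 28-36, copied from the table.
rows = [
    (28, 2.1282, 9.3636),
    (29, 2.1312, 9.4380),
    (30, 2.1294, 9.4722),
    (31, 2.1312, 9.4259),
    (32, 2.1233, 9.4481),
    (33, 2.1257, 9.6641),
    (34, 2.1413, 9.5680),
    (35, 2.1368, 9.7490),
    (36, 2.1393, 9.7627),
]

# Row with the lowest validation loss within this slice.
best_by_loss = min(rows, key=lambda row: row[1])

# Row with the highest BLEU within this slice.
best_by_bleu = max(rows, key=lambda row: row[2])

# Last logged row, which matches the headline numbers on this card.
final_row = rows[-1]

print("lowest validation loss:", best_by_loss)
print("highest BLEU:", best_by_bleu)
print("final epoch:", final_row)
```

Within this slice, validation loss bottoms out at epoch 32 while BLEU keeps rising through epoch 36, so the choice of checkpoint depends on which metric matters. The table also lists a BLEU of 11.0441 at epoch 10, which lies outside this slice.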

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 1.0B params
Tensor type: F32
