bart-large-wiki-doc

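This model is a fine-tuned version of facebook/bart-large. The card does not identify the training dataset. It achieves the following results on the evaluation set, which match the step 15000 row (epoch 11.1857) of the training results table, the row with the highest Sari score: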

Loss: 5.1601
Sari: 50.8615
Sari Add: 13.6308
Sari Keep: 44.3195
Sari Del: 94.6343
Fkgl: 6.7741
Bleu: 25.3408
D Sari: 0.4652
D Sari Keep: 0.3942
D Sari Del: 0.809
D Sari Add: 0.1924
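
The aggregate scores above are consistent with the usual convention that SARI is the arithmetic mean of its add, keep, and delete components, and the D Sari figures follow the same pattern. Note the different scales: the Sari values are on a 0 to 100 scale, the D Sari values on a 0 to 1 scale. A minimal sketch that re-derives both aggregates from the listed components (the helper name `mean_of_components` is illustrative and not part of any metrics library):

```python
def mean_of_components(add: float, keep: float, delete: float) -> float:
    """Average the three operation scores into one aggregate score."""
    return (add + keep + delete) / 3


# Component values copied from the evaluation results listed above.
sari = mean_of_components(add=13.6308, keep=44.3195, delete=94.6343)
d_sari = mean_of_components(add=0.1924, keep=0.3942, delete=0.809)

print(round(sari, 4))    # 50.8615
print(round(d_sari, 4))  # 0.4652
```

This only checks the arithmetic relationship between the reported numbers. It does not reproduce how the component scores themselves were computed.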

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 8
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.06
num_epochs: 20
label_smoothing_factor: 0.3
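
The hyperparameters above imply an effective batch size and a warmup length that can be worked out by hand. The sketch below does that arithmetic and models the linear schedule as a plain function. It rests on several assumptions: training ran on a single device, an epoch is about 1341 optimizer steps (estimated from the training log, where step 25000 falls at epoch 18.6429), and the warmup step count is rounded up. The function `linear_lr` is an illustrative re-implementation, not the trainer's own scheduler.

```python
import math

LEARNING_RATE = 5e-05
WARMUP_RATIO = 0.06
NUM_EPOCHS = 20
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4

# Assumption: optimizer steps per epoch, estimated from the training log
# (step 25000 is reached at epoch 18.6429).
STEPS_PER_EPOCH = round(25000 / 18.6429)

# Assumption: one device, so the effective batch is batch size x accumulation.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - warmup_steps)


print(effective_batch, total_steps, warmup_steps)  # 8 26820 1610
```

Under these assumptions the learning rate peaks at 5e-05 around step 1610 and would reach zero at step 26820. The exact step counts in the real run depend on the dataset size, which the card does not state.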

Training results

Training Loss	Epoch	Step	Validation Loss	Sari	Sari Add	Sari Keep	Sari Del	Fkgl	Bleu	D Sari	D Sari Keep	D Sari Del	D Sari Add
5.2573	0.7459	1000	5.1128	48.1497	8.6665	40.5591	95.2235	7.106	13.9528	0.3403	0.0261	0.9948	0.0
4.8677	1.4915	2000	5.0729	48.9106	10.0717	41.473	95.187	6.9782	15.9556	0.3448	0.0397	0.9947	0.0
4.7689	2.2372	3000	5.0350	49.3306	10.7042	41.9925	95.2951	6.9015	15.8936	0.3431	0.0344	0.9947	0.0
4.6608	2.9830	4000	5.0289	49.8733	11.7733	42.4512	95.3954	6.7831	15.8724	0.3397	0.0242	0.9949	0.0
4.5534	3.7287	5000	5.0233	49.9727	11.4246	43.2055	95.2881	6.8518	17.3442	0.3431	0.0344	0.9947	0.0
4.4703	4.4744	6000	5.0376	49.9166	11.9432	42.5405	95.266	6.8747	17.1222	0.4495	0.3716	0.8118	0.1652
4.4451	5.2200	7000	5.0560	50.1126	12.2016	42.8193	95.3169	6.7038	17.4191	0.4507	0.3723	0.8136	0.1663
4.396	5.9659	8000	5.0326	50.7081	12.7094	44.1816	95.2332	6.6043	20.0309	0.4599	0.3867	0.8153	0.1778
4.3371	6.7115	9000	5.0637	50.6303	12.9272	43.8282	95.1356	6.763	20.4947	0.4587	0.3834	0.8113	0.1814
4.3014	7.4572	10000	5.0950	50.4656	12.8875	43.2239	95.2854	6.6203	18.5182	0.4575	0.3834	0.8113	0.1779
4.2664	8.2029	11000	5.1100	50.173	12.8216	43.0231	94.6743	6.7837	23.116	0.4554	0.3843	0.7992	0.1829
4.2373	8.9487	12000	5.1019	50.4108	13.0119	43.6241	94.5963	6.9084	24.2771	0.4582	0.3914	0.7996	0.1836
4.2038	9.6944	13000	5.1239	50.4995	13.3065	43.4134	94.7785	6.742	23.2113	0.4589	0.3859	0.8035	0.1874
4.1759	10.4401	14000	5.1491	50.4754	13.0605	43.7134	94.6524	6.8311	23.9938	0.4583	0.3853	0.8074	0.1821
4.1618	11.1857	15000	5.1601	50.8615	13.6308	44.3195	94.6343	6.7741	25.3408	0.4652	0.3942	0.809	0.1924
4.1347	11.9316	16000	5.1643	50.5104	13.6341	43.3236	94.5734	6.8763	24.5365	0.4609	0.3836	0.8046	0.1944
4.1155	12.6772	17000	5.1838	50.2122	13.6879	42.6409	94.3079	6.8788	25.6935	0.4586	0.3886	0.7954	0.1918
4.1004	13.4229	18000	5.1975	50.3625	13.5012	43.2364	94.3499	6.8343	25.8639	0.4609	0.3921	0.7997	0.191
4.0916	14.1686	19000	5.2197	50.2762	13.5701	42.9087	94.3498	6.6728	25.7121	0.4623	0.3926	0.8021	0.1921
4.0773	14.9144	20000	5.2248	50.4351	13.6553	43.2455	94.4045	6.7491	25.8344	0.4628	0.3952	0.7992	0.1939
4.0639	15.6601	21000	5.2276	49.6913	13.7193	41.6269	93.7278	7.1849	27.3581	0.4588	0.3911	0.7903	0.1951
4.0558	16.4057	22000	5.2378	50.309	13.6643	43.0305	94.2322	6.691	26.5571	0.4615	0.3913	0.7983	0.195
4.0454	17.1514	23000	5.2447	49.4843	13.563	41.5506	93.3394	7.1792	28.7607	0.4556	0.3915	0.7817	0.1936
4.0411	17.8973	24000	5.2533	50.5396	13.7453	43.538	94.3355	6.8688	26.4366	0.4642	0.3977	0.7999	0.1949
4.0327	18.6429	25000	5.2619	49.9081	13.6194	42.1831	93.9219	6.9129	27.2276	0.4605	0.3929	0.793	0.1956

Framework versions

Transformers 4.57.3
Pytorch 2.9.1+cu128
Datasets 3.6.0
Tokenizers 0.22.1
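
With the library versions listed above, loading the checkpoint for generation could look like the following sketch. The repo id `taiypeo/bart-large-wiki-doc` comes from this model card. The function name `generate_text` and the generation settings (`max_new_tokens=256`, `num_beams=4`) are illustrative assumptions, since the card does not document recommended decoding parameters. The reported metrics (Sari, Fkgl) suggest a text simplification task, but the card does not confirm the intended use.

```python
MODEL_ID = "taiypeo/bart-large-wiki-doc"  # repo id taken from this model card


def generate_text(document: str, max_new_tokens: int = 256, num_beams: int = 4) -> str:
    """Run the fine-tuned BART checkpoint on one input document.

    The default decoding settings are assumptions, not values from the card.
    """
    # Imported lazily so the heavy dependency loads only when the function runs.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # BART-large accepts at most 1024 input tokens, so longer inputs are truncated.
    inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=num_beams)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Some input document.")` downloads the weights on first use and returns the decoded output string. For repeated calls it would be better to load the tokenizer and model once and reuse them.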

Safetensors

Model size: 0.4B params
Tensor type: F32

Model tree for taiypeo/bart-large-wiki-doc

Base model: facebook/bart-large
Finetuned: this model (one of 197 fine-tunes listed for the base model)