---
language:
- vi
license: mit
tags:
- summarization
- vietnamese
- bartpho
- seq2seq
datasets:
- news-dataset-vietnameses
metrics:
- rouge
model-index:
- name: bartpho-vietnamese-summarization
  results:
  - task:
      type: summarization
    dataset:
      name: Vietnamese News Dataset
      type: news-dataset-vietnameses
    metrics:
    - type: rouge
      value: TBD
---

# BARTpho Vietnamese Summarization Model

This model is a fine-tuned version of [vinai/bartpho-syllable](https://huggingface.co/vinai/bartpho-syllable) for Vietnamese text summarization.

## Model Details

- **Base Model**: vinai/bartpho-syllable
- **Task**: Text Summarization
- **Language**: Vietnamese
- **Training Dataset**: Vietnamese News Dataset

## Usage

```python
from transformers import BartForConditionalGeneration, AutoTokenizer

model_name = "YOUR_USERNAME/bartpho-vietnamese-summarization"
# Use AutoTokenizer for BARTpho (automatically loads BartphoTokenizer)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Example usage
text = "Your Vietnamese news article text here..."
# Truncate long articles to at most 1024 tokens
inputs = tokenizer(text, return_tensors="pt", max_length=1024, truncation=True)
# Pass the attention mask along with the input IDs so padding, if any, is ignored
summary_ids = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=128,
    num_beams=4,
    early_stopping=True,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```

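To summarize more than one article, the same calls can be applied in batches. The following is a minimal sketch, not part of the released code: `chunked` and `summarize_batch` are hypothetical helper names, the default batch size of 8 is an arbitrary choice, and the generation settings simply mirror the example above. It assumes the tokenizer returns `input_ids` and `attention_mask`, as in that example.

```python
from typing import Any, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def summarize_batch(
    texts: Sequence[str],
    tokenizer: Any,
    model: Any,
    batch_size: int = 8,           # assumption: arbitrary default
    max_input_length: int = 1024,  # same truncation limit as the example above
    max_summary_length: int = 128,
    num_beams: int = 4,
) -> List[str]:
    """Summarize `texts` batch by batch and return one summary per input."""
    summaries: List[str] = []
    for batch in chunked(texts, batch_size):
        # Pad to the longest article in the batch, truncate overly long ones
        inputs = tokenizer(
            list(batch),
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_input_length,
        )
        summary_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_length=max_summary_length,
            num_beams=num_beams,
            early_stopping=True,
        )
        summaries.extend(
            tokenizer.batch_decode(summary_ids, skip_special_tokens=True)
        )
    return summaries
```

With `tokenizer` and `model` loaded as in the example above, a call such as `summarize_batch(articles, tokenizer, model)` would return the summaries in the same order as the input list. Smaller batches may be needed if GPU memory is limited.
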
## Training Details

- **Training Framework**: Hugging Face Transformers
- **GPU**: NVIDIA P100 16GB
- **Batch Size**: 8 per device
- **Gradient Accumulation**: 2 steps
- **Learning Rate**: 2e-5
- **Epochs**: 3

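The batch size and gradient accumulation settings above combine into an effective batch size per optimizer update. The sketch below works through that arithmetic. It assumes a single GPU (the card lists one P100), and the dataset size of 10,000 examples is a made-up figure used only to illustrate the step count. The step count is approximate, since training loops differ in how they handle a final partial batch.

```python
import math

# Values taken from the training details above
PER_DEVICE_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
NUM_EPOCHS = 3

# Assumption: training ran on a single GPU
NUM_DEVICES = 1

# Hypothetical dataset size, for illustration only
EXAMPLE_NUM_TRAIN_EXAMPLES = 10_000


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * devices


def approx_optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Approximate total optimizer updates, rounding each epoch up."""
    return math.ceil(num_examples / batch_size) * epochs


batch = effective_batch_size(
    PER_DEVICE_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
steps = approx_optimizer_steps(EXAMPLE_NUM_TRAIN_EXAMPLES, batch, NUM_EPOCHS)

print(batch)  # 16
print(steps)  # 1875
```
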