DeepSeek-R1-Distill-Llama-8B-VN-History

Model Description

DeepSeek-R1-Distill-Llama-8B-VN-History is a Vietnamese history assistant built on DeepSeek-R1-Distill-Llama-8B. It was fine-tuned with LoRA on Vietnam-History-15k, a dataset of 15,000 Vietnamese history question-answer pairs.

Model Details

  • Model Name: DeepSeek-R1-Distill-Llama-8B-VN-History
  • Base Model: DeepSeek-R1-Distill-Llama-8B
  • Model Size: ~8B parameters
  • Context Length: 131,072 tokens (128K)
  • Language: Vietnamese
  • Domain: Vietnamese History
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Dataset: minhxthanh/Vietnam-History-15k

Key Features

  • 🇻🇳 Vietnamese History Expert: Specialized knowledge of Vietnamese historical events, figures, and periods
  • 📚 128K Context: Accepts up to 131,072 tokens, enough for long conversations and detailed historical analysis
  • 🎯 Accurate & Concise: Provides precise answers with relevant dates and key figures
  • 💬 Conversational: Natural Vietnamese dialogue capabilities
  • 📖 Comprehensive Coverage: Trained on 15,000 curated Vietnamese history Q&A pairs

Training Details

  • Dataset: minhxthanh/Vietnam-History-15k (15,000 Vietnamese history Q&A pairs)
  • Training Method: LoRA fine-tuning
  • LoRA Configuration:
    • Rank (r): 16
    • Alpha: 32
    • Dropout: 0.05
    • Target modules: ["o_proj", "v_proj", "gate_proj", "q_proj", "down_proj", "up_proj", "k_proj"]
  • Context Extension: Extended from 8K (8,192) to 128K (131,072) tokens via llama3-type RoPE scaling (factor: 8.0)
  • Training Platform: FPT AI Studio
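As a rough check on the size of the adapter, the sketch below counts the trainable parameters implied by the LoRA configuration above. It assumes the layer dimensions listed under Model Architecture in this card (hidden size 4,096, intermediate size 14,336, 32 attention heads, 8 key-value heads, 32 layers) and the standard LoRA layout of two low-rank matrices per adapted projection with no bias terms. The helper name lora_param_count is illustrative and does not come from the training code.

```python
# Rough count of trainable LoRA parameters for the configuration above.
# Assumption: each adapted linear layer (d_in -> d_out) gets two matrices,
# A (r x d_in) and B (d_out x r), i.e. r * (d_in + d_out) parameters.

HIDDEN = 4096          # hidden size
INTERMEDIATE = 14336   # MLP intermediate size
N_HEADS = 32           # attention heads
N_KV_HEADS = 8         # key-value heads (grouped-query attention)
N_LAYERS = 32          # decoder layers
RANK = 16              # LoRA rank r
ALPHA = 32             # LoRA alpha

HEAD_DIM = HIDDEN // N_HEADS      # 128
KV_DIM = N_KV_HEADS * HEAD_DIM    # 1024

# (input dim, output dim) of every target module in one decoder layer
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(shapes, rank, n_layers):
    """Sum r * (d_in + d_out) over all target modules and all layers."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * n_layers


trainable = lora_param_count(TARGET_SHAPES, RANK, N_LAYERS)
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumptions the adapter holds about 42 million trainable parameters, roughly 0.5% of the ~8B base weights, and the low-rank update is scaled by 2.0.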

Usage

With Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "minhxthanh/DeepSeek-R1-Distill-Llama-8B-VN-History"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

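# Example usage: "Analyze the significance of the Bạch Đằng victory of 1288"
prompt = "Phân tích ý nghĩa của chiến thắng Bạch Đằng năm 1288?"
# Move the input tensors to the device that holds the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # limit on generated tokens, excluding the prompt
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)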

Chat Template

The model uses the following chat template for Vietnamese history Q&A, written in Go template syntax:

<|begin▁of▁sentence|>{{ if .System }}{{ .System }}{{ end }}{{ range .Messages }}{{ if eq .Role "user" }}<|User|>{{ .Content }}<|Assistant|>{{ else if eq .Role "assistant" }}{{ .Content }}<|end▁of▁sentence|>{{ end }}{{ end }}

System Message

Bạn là trợ lý am hiểu lịch sử Việt Nam. Trả lời bằng tiếng Việt, chính xác, súc tích; nêu mốc thời gian và nhân vật then chốt khi phù hợp.
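To make the template concrete, here is a minimal sketch that renders a conversation into the prompt string it describes. The helper render_prompt and the sample question are illustrative assumptions, not part of the model repository; the special-token strings are copied from the template as shown above.

```python
# Minimal sketch: render a conversation into the prompt string described by
# the chat template above. Token strings are copied from the template as shown.

BOS = "<|begin▁of▁sentence|>"
EOS = "<|end▁of▁sentence|>"
USER_TAG = "<|User|>"
ASSISTANT_TAG = "<|Assistant|>"


def render_prompt(messages, system=""):
    """Hypothetical helper: system text first, then each turn in order."""
    parts = [BOS, system]
    for message in messages:
        if message["role"] == "user":
            # A user turn is followed by the assistant tag, which cues the reply
            parts.append(USER_TAG + message["content"] + ASSISTANT_TAG)
        elif message["role"] == "assistant":
            # A completed assistant turn is closed with the end-of-sentence token
            parts.append(message["content"] + EOS)
    return "".join(parts)


system_message = (
    "Bạn là trợ lý am hiểu lịch sử Việt Nam. Trả lời bằng tiếng Việt, "
    "chính xác, súc tích; nêu mốc thời gian và nhân vật then chốt khi phù hợp."
)

# Illustrative question ("Who was Lê Lợi?")
prompt = render_prompt(
    [{"role": "user", "content": "Lê Lợi là ai?"}],
    system=system_message,
)
print(prompt)
```

When loading the model with Transformers, tokenizer.apply_chat_template is the usual way to build this string, provided the repository's tokenizer ships a chat template. The manual version is only meant to show what the template produces.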

Example Conversations

User: Vì sao xảy ra sự kiện Đổi Mới (1986)? Nêu bối cảnh và nguyên nhân chính.

Assistant: Bối cảnh trước 1986: Cải cách kinh tế sang thị trường định hướng XHCN là phản ứng trước sức ép thời cuộc. Nguyên nhân chính gồm mâu thuẫn quyền lực, xâm lược/đe dọa bên ngoài và nhu cầu khẳng định chủ quyền. Kết cục: tăng trưởng, mở cửa hội nhập.


User: So sánh vai trò lịch sử của Lê Lợi và Nguyễn Huệ (Quang Trung).

Assistant: Cả hai đều tạo bước ngoặt: Lê Lợi (thế kỷ XV) với lãnh đạo khởi nghĩa Lam Sơn, giành độc lập 1428; Nguyễn Huệ (Quang Trung) (thế kỷ XVIII) với đại phá quân Thanh 1789, cải cách giáo dục và quân sự. Điểm khác nằm ở bối cảnh thời đại và phương thức hành động.

Model Architecture

Based on the Llama architecture with the following specifications:

  • Architecture: LlamaForCausalLM
  • Hidden size: 4,096
  • Intermediate size: 14,336
  • Number of attention heads: 32
  • Number of key-value heads: 8
  • Number of hidden layers: 32
  • Max position embeddings: 131,072 (128K context)
  • RoPE theta: 500,000.0
  • Vocabulary size: 128,256
  • Torch dtype: bfloat16
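These numbers also determine how much memory the key-value cache needs at long context. The estimate below is a back-of-the-envelope sketch, assuming an unquantized bfloat16 cache that stores one key and one value vector per key-value head in every layer. The function name kv_cache_bytes is illustrative.

```python
# Back-of-the-envelope KV-cache size for the architecture listed above.
# Assumption: bfloat16 cache, keys and values stored for every layer.

N_LAYERS = 32
N_KV_HEADS = 8
HEAD_DIM = 4096 // 32   # hidden size / attention heads = 128
BYTES_PER_VALUE = 2     # bfloat16


def kv_cache_bytes(n_tokens, batch_size=1):
    """Bytes needed to cache keys and values for n_tokens positions."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * n_tokens * batch_size


for n_tokens in (4096, 32768, 131072):
    gib = kv_cache_bytes(n_tokens) / 2**30
    print(f"{n_tokens:>7} tokens -> {gib:.1f} GiB")
```

Under these assumptions each token costs 128 KiB of cache, so a full 131,072-token context needs about 16 GiB on top of the model weights.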

Training Data Coverage

The Vietnam-History-15k dataset covers the following periods of Vietnamese history:

  • Ancient Period: Hùng Vương era, Âu Lạc kingdom
  • Chinese Domination: Bắc thuộc periods and resistance movements
  • Independent Dynasties: Đinh, Tiền Lê, Lý, Trần, Hậu Lê periods
  • Medieval Conflicts: Mongol invasions, Ming occupation, Lam Sơn uprising
  • Fragmentation Era: Mạc dynasty, Trịnh-Nguyễn conflict, Tây Sơn rebellion
  • Modern Period: Nguyễn dynasty, French colonization, independence movements
  • Contemporary: Vietnam War, reunification, Đổi Mới reforms

Performance & Capabilities

This model excels at:

  • Historical Facts: Accurate dates, events, and chronology
  • Key Figures: Detailed knowledge of Vietnamese historical personalities
  • Cause & Effect: Analysis of historical events and their consequences
  • Comparisons: Cross-period and cross-figure historical comparisons
  • Context: Understanding historical context and background
  • Vietnamese Language: Natural, fluent Vietnamese responses

Limitations

  • Domain Specific: Optimized for Vietnamese history; may underperform on other topics
  • Computational Requirements: The ~8B parameters occupy roughly 16 GB in bfloat16, before activations and the KV cache are counted
  • Training Cutoff: Knowledge is limited to the period covered by the training data
  • Cultural Context: Best suited for Vietnamese cultural and historical context
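To put the computational requirement in numbers, the sketch below estimates the memory taken by the weights alone at several precisions. It assumes a round 8 billion parameters (the card states "~8B"). Only bfloat16 is the dtype stated in this card; the other rows are included for comparison, and the card does not say how quantization affects this model's quality.

```python
# Approximate memory for the model weights alone (no activations, no KV cache).
# Assumption: a round 8e9 parameters, matching the "~8B" stated in the card.

PARAMS = 8e9

# Bytes per parameter; only bfloat16 is the dtype stated in this card
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(n_params, bytes_per_param):
    """Weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(PARAMS, size) for name, size in BYTES_PER_PARAM.items()
}
for name, gb in estimates.items():
    print(f"{name:>8}: ~{gb:.0f} GB")
```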

Technical Specifications

{
  "model_type": "llama",
  "architectures": ["LlamaForCausalLM"],
  "torch_dtype": "bfloat16",
  "max_position_embeddings": 131072,
  "rope_scaling": {
    "factor": 8.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "vocab_size": 128256
}
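The rope_scaling block rescales the rotary frequencies differently for each band. The sketch below follows the commonly used llama3 scaling rule. It assumes a head dimension of 128 (hidden size 4,096 divided by 32 heads) and the RoPE theta of 500,000 listed under Model Architecture. Treat it as an illustration of the rule rather than the library's exact code; the function name is hypothetical.

```python
import math

# Values from the configuration above; HEAD_DIM and ROPE_THETA are taken from
# the Model Architecture section (4096 / 32 heads, theta = 500,000).
ROPE_THETA = 500_000.0
HEAD_DIM = 128
FACTOR = 8.0
LOW_FREQ_FACTOR = 1.0
HIGH_FREQ_FACTOR = 4.0
ORIGINAL_MAX_POS = 8192


def llama3_scaled_inv_freq():
    """Return the rotary inverse frequencies after llama3-style scaling."""
    low_freq_wavelen = ORIGINAL_MAX_POS / LOW_FREQ_FACTOR     # 8192
    high_freq_wavelen = ORIGINAL_MAX_POS / HIGH_FREQ_FACTOR   # 2048
    scaled = []
    for i in range(0, HEAD_DIM, 2):
        inv_freq = 1.0 / (ROPE_THETA ** (i / HEAD_DIM))
        wavelen = 2 * math.pi / inv_freq
        if wavelen < high_freq_wavelen:
            # Short wavelengths (high frequencies) are left unchanged
            scaled.append(inv_freq)
        elif wavelen > low_freq_wavelen:
            # Long wavelengths (low frequencies) are slowed down by the factor
            scaled.append(inv_freq / FACTOR)
        else:
            # In between, interpolate smoothly between the two regimes
            smooth = (ORIGINAL_MAX_POS / wavelen - LOW_FREQ_FACTOR) / (
                HIGH_FREQ_FACTOR - LOW_FREQ_FACTOR
            )
            scaled.append((1 - smooth) * inv_freq / FACTOR + smooth * inv_freq)
    return scaled


freqs = llama3_scaled_inv_freq()
print(f"{len(freqs)} frequencies, highest {freqs[0]}, lowest {freqs[-1]:.3e}")
```

Only the low-frequency components are divided by the factor of 8.0; the high-frequency ones keep their original values, which is why the factor alone does not equal the ratio between the old and new context lengths.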

License

This model inherits the license from the base DeepSeek-R1-Distill-Llama-8B model. Please refer to the original model's licensing terms for usage guidelines.

Citation

@misc{deepseek-r1-distill-llama-8b-vn-history,
  title={DeepSeek-R1-Distill-Llama-8B-VN-History: A Vietnamese History Specialist},
  author={minhxthanh},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/minhxthanh/DeepSeek-R1-Distill-Llama-8B-VN-History}
}

Dataset Citation

@dataset{vietnam_history_15k,
  title={Vietnam-History-15k: Vietnamese History Question-Answer Dataset},
  author={minhxthanh},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/minhxthanh/Vietnam-History-15k}
}

Acknowledgments
