openPangu-7B LoRA (merged)
This repository contains LoRA-finetuned and merged weights based on
openPangu-Embedded-7B-V1.1. The LoRA adapters were merged into the
base model to produce full weights suitable for standard inference.
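To illustrate what "merging" means here, the sketch below applies the standard LoRA update W' = W + (alpha / r) * B A to a toy 2x2 weight matrix and checks that the merged matrix gives the same output as the base matrix plus the adapter path. The matrices, `alpha`, and `r` are made-up illustrative values, not taken from this model, and the actual merge is performed by LLaMA-Factory rather than by this code.

```python
# Toy illustration of merging a LoRA adapter into a base weight matrix.
# All values are hypothetical; the real export is done by llamafactory-cli.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the standard LoRA merge."""
    scale = alpha / r
    delta = matmul(b, a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


w = [[1.0, 0.0], [0.0, 1.0]]   # base weight, 2 x 2
a = [[1.0, 2.0]]               # LoRA A, r x in with r = 1
b = [[0.5], [-1.0]]            # LoRA B, out x r
alpha, r = 2.0, 1

merged = merge_lora(w, a, b, alpha, r)

# Unmerged forward pass: W x + (alpha / r) * B (A x)
x = [[1.0], [1.0]]
base_out = matmul(w, x)
adapter_out = matmul(b, matmul(a, x))
unmerged_out = [[bo[0] + (alpha / r) * ao[0]] for bo, ao in zip(base_out, adapter_out)]

# Merged forward pass: W' x
merged_out = matmul(merged, x)
```

Because the two forward passes agree, the merged weights can be loaded without any adapter-aware code at inference time.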
Base Model
- Base model: FreedomIntelligence/openPangu-Embedded-7B-V1.1
- License: OPENPANGU Model License Agreement v1.0 (see LICENSE)
Training Data
- Private dataset (not released).
Training Procedure
- Finetuning: LoRA using LLaMA-Factory.
- Export: merged full weights with llamafactory-cli export.
Example (paths are placeholders):
llamafactory-cli export \
  --model_name_or_path <base_model_dir> \
  --adapter_name_or_path <lora_adapter_dir> \
  --template default \
  --finetuning_type lora \
  --export_dir <export_dir> \
  --export_size 2 \
  --export_device cpu \
  --export_legacy_format False \
  --trust_remote_code True
Evaluation
Evaluated with lm-evaluation-harness using vLLM on 4x RTX 4090.
Date (UTC): 2026-01-04.
GSM8K (5-shot)
- exact_match (strict-match): 0.6171
- exact_match (flexible-extract): 0.5777
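The two GSM8K numbers differ only in how the final answer is pulled out of the generated text. The sketch below uses simplified stand-in patterns (assumptions for illustration, not the harness's exact filters): a strict extractor that requires the `#### <number>` answer format, and a flexible extractor that takes the last number anywhere in the output. It shows one way the flexible score can end up lower than the strict one, namely when the model keeps writing numbers after its final answer. Whether that is what happened for this model has not been verified.

```python
import re

# Simplified stand-ins for the two extraction styles (illustrative assumption).
STRICT = re.compile(r"#### (-?[0-9.,]+)")
FLEXIBLE = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def normalize(number_text):
    """Drop thousands separators and a trailing period."""
    return number_text.replace(",", "").rstrip(".")


def strict_match(text):
    """Return the number after '####', or None if the format is missing."""
    match = STRICT.search(text)
    return normalize(match.group(1)) if match else None


def flexible_extract(text):
    """Return the last number found anywhere in the text, or None."""
    found = FLEXIBLE.findall(text)
    return normalize(found[-1]) if found else None


well_formed = "9 * 2 = 18 dollars.\n#### 18"
no_marker = "The answer is 18 dollars."
trailing_work = "#### 18\nCheck: 18 / 2 = 9"

for sample in (well_formed, no_marker, trailing_work):
    print(repr(sample), strict_match(sample), flexible_extract(sample))
```

For `no_marker` only the flexible extractor recovers an answer, while for `trailing_work` the flexible extractor picks up the wrong number even though the strict one succeeds.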
C-Eval (valid, 5-shot)
- acc: 0.6241
- acc_norm: 0.6241
Example command (paths are placeholders):
lm_eval --model vllm \
  --model_args "pretrained=<model_dir>,tensor_parallel_size=4,dtype=auto,gpu_memory_utilization=0.8,max_model_len=4096,enforce_eager=True,trust_remote_code=True" \
  --tasks gsm8k \
  --num_fewshot 5 \
  --batch_size auto
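The same run can probably be driven from Python through the harness's `simple_evaluate` entry point. The following is a minimal sketch, not the script used to produce the numbers above: `VLLM_OPTIONS`, `build_model_args`, and `run_eval` are hypothetical names, `<model_dir>` remains a placeholder, and `lm_eval` is imported lazily so the helper can be inspected without the harness installed.

```python
# Hypothetical helper that mirrors the CLI command above.
VLLM_OPTIONS = {
    "pretrained": "<model_dir>",      # placeholder path, as in the CLI example
    "tensor_parallel_size": 4,
    "dtype": "auto",
    "gpu_memory_utilization": 0.8,
    "max_model_len": 4096,
    "enforce_eager": True,
    "trust_remote_code": True,
}


def build_model_args(options):
    """Join options into the comma-separated key=value string lm_eval expects."""
    return ",".join(f"{key}={value}" for key, value in options.items())


def run_eval(tasks, num_fewshot=5):
    """Run the given tasks with the vLLM backend and return the metrics dict."""
    import lm_eval  # requires lm-evaluation-harness and vLLM to be installed

    results = lm_eval.simple_evaluate(
        model="vllm",
        model_args=build_model_args(VLLM_OPTIONS),
        tasks=tasks,
        num_fewshot=num_fewshot,
        batch_size="auto",
    )
    return results["results"]


model_args = build_model_args(VLLM_OPTIONS)
print(model_args)
```

Calling `run_eval(["gsm8k"])` would then correspond to the command shown above, given a real model directory and four visible GPUs.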
Usage
This repo includes custom modeling code; trust_remote_code=True is required.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "killer66678/openpangu_7b_lora"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)
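Loading the model is only the first step. A minimal generation sketch follows, assuming the tokenizer ships a chat template (not verified here) and that enough GPU memory is available. `build_messages` and `generate_reply` are hypothetical helper names, and `max_new_tokens=256` is an arbitrary choice.

```python
MODEL_ID = "killer66678/openpangu_7b_lora"


def build_messages(user_prompt, system_prompt=None):
    """Build a chat-style message list, with an optional system message."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def generate_reply(user_prompt, max_new_tokens=256):
    """Load the model, run one chat turn, and return only the new text."""
    # Imported lazily so build_messages can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,
        torch_dtype="auto",
        device_map="auto",
    )

    # Assumes the tokenizer defines a chat template.
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Strip the prompt tokens so only the generated continuation is decoded.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0, prompt_length:], skip_special_tokens=True)


if __name__ == "__main__":
    print(build_messages("Who are you?"))
```

Calling `generate_reply("Who are you?")` downloads the weights on first use, so it is left out of the `__main__` block above.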
Limitations and License Notes
- The openPangu license restricts use within the European Union.
- If you distribute a product or service based on this model, the license requires specific attribution and trademark notices.
- As with any LLM, outputs may be incorrect or biased.
Acknowledgements
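The experiments and computation for this work were carried out on the Huawei Cloud Ascend AI cloud service platform. We thank the platform for its stable computing support.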