
Libraries

How to use Nondzu/Mistral-7B-Instruct-v0.2-code-ft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Nondzu/Mistral-7B-Instruct-v0.2-code-ft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result; a bare expression shows nothing outside a notebook or REPL
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Nondzu/Mistral-7B-Instruct-v0.2-code-ft")
model = AutoModelForCausalLM.from_pretrained("Nondzu/Mistral-7B-Instruct-v0.2-code-ft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
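# Render the chat template and tokenize it into model-ready tensors
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate up to 40 new tokens, then decode only the newly generated part
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))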

Local Apps

vLLM

How to use Nondzu/Mistral-7B-Instruct-v0.2-code-ft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Nondzu/Mistral-7B-Instruct-v0.2-code-ft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Nondzu/Mistral-7B-Instruct-v0.2-code-ft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
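
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style response shape (`choices[0].message.content`). The helper names `build_chat_request` and `ask` are hypothetical, chosen for illustration.

```python
import json
import urllib.request

MODEL_ID = "Nondzu/Mistral-7B-Instruct-v0.2-code-ft"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build (without sending) a POST request for the chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response body; requires a running server.
    """
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` returns the model's reply as a string. For the SGLang server, pass `base_url="http://localhost:30000"` instead.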

Use Docker

docker model run hf.co/Nondzu/Mistral-7B-Instruct-v0.2-code-ft

SGLang

How to use Nondzu/Mistral-7B-Instruct-v0.2-code-ft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Nondzu/Mistral-7B-Instruct-v0.2-code-ft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Nondzu/Mistral-7B-Instruct-v0.2-code-ft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Nondzu/Mistral-7B-Instruct-v0.2-code-ft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Nondzu/Mistral-7B-Instruct-v0.2-code-ft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Nondzu/Mistral-7B-Instruct-v0.2-code-ft with Docker Model Runner:
```
docker model run hf.co/Nondzu/Mistral-7B-Instruct-v0.2-code-ft
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Mistral-7B-Instruct-v0.2-code-ft

We're thrilled to introduce the latest iteration of our model, Mistral-7B-Instruct-v0.2-code-ft. This updated version is designed to improve coding assistance and co-pilot use cases. We're eager for developers and enthusiasts to try it out and provide feedback!

Additional Information

This version builds on mistralai/Mistral-7B-Instruct-v0.2 and is fine-tuned on the Code-74k-ShareGPT dataset for a more refined coding experience.

Prompt template: ChatML

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
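
When calling the model through a raw completion interface, the template above has to be filled in by hand. A minimal sketch follows; the function name `build_chatml_prompt` is hypothetical, and the trailing newline after `assistant` is an assumption based on common ChatML usage.

```python
def build_chatml_prompt(prompt, system_message=""):
    """Fill the ChatML template with a system message and one user turn.

    The string ends with an open assistant turn so the model continues
    from there. The final newline is an assumption, not stated in the card.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


text = build_chatml_prompt(
    "Write a function that reverses a string.",
    system_message="You are a helpful coding assistant.",
)
print(text)
```

When generating from such a prompt, it is usually sensible to stop at `<|im_end|>` so the output ends with the assistant's turn.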

Quantised Model Links:

Eval Plus Performance

For detailed performance metrics, visit the EvalPlus page: Mistral-7B-Instruct-v0.2-code-ft Eval Plus

Score: 0.421

Dataset:

The model has been trained on a new dataset to improve its performance and versatility:

path: ajibawa-2023/Code-74k-ShareGPT
type: sharegpt
conversation: chatml

Find more about the dataset here: Code-74k-ShareGPT Dataset
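
To illustrate what the `sharegpt` type with the `chatml` conversation setting implies, here is a rough sketch of rendering one ShareGPT-style record as ChatML text. It assumes the common ShareGPT layout (a `conversations` list of `from`/`value` turns with `system`, `human`, and `gpt` speakers); the dataset's exact schema was not checked here, and this is not the training tool's actual implementation. `ROLE_MAP` and `sharegpt_to_chatml` are hypothetical names.

```python
# Assumed mapping from ShareGPT speaker labels to ChatML roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def sharegpt_to_chatml(record):
    """Render every turn of a ShareGPT-style record as a closed ChatML turn."""
    parts = []
    for turn in record["conversations"]:
        role = ROLE_MAP[turn["from"]]
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n")
    return "".join(parts)


# Made-up record for illustration only; not taken from the dataset.
example = {
    "conversations": [
        {"from": "human", "value": "Print hello in Python."},
        {"from": "gpt", "value": "print('hello')"},
    ]
}
print(sharegpt_to_chatml(example))
```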

Model Architecture

Base Model: mistralai/Mistral-7B-Instruct-v0.2
Tokenizer Type: LlamaTokenizer
Model Type: MistralForCausalLM
Is Mistral Derived Model: true
Sequence Length: 16384 with sample packing
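
Sample packing concatenates several short tokenized samples into one training sequence so that less of the 16384-token window is wasted on padding. The sketch below shows the basic idea with a simple greedy strategy. It is an illustration under stated assumptions, not the packing algorithm used for this model: real implementations also track sample boundaries so attention does not cross between packed samples. `pack_samples` is a hypothetical helper.

```python
def pack_samples(samples, max_len=16384):
    """Greedily pack token-id lists into sequences of at most max_len tokens.

    Samples are kept in order. A sample that would overflow the current
    sequence starts a new one. Over-long samples are truncated (assumption).
    """
    packed = []
    current = []
    for tokens in samples:
        tokens = tokens[:max_len]
        if current and len(current) + len(tokens) > max_len:
            packed.append(current)
            current = []
        current.extend(tokens)
    if current:
        packed.append(current)
    return packed


# Toy example with a tiny window so the effect is visible.
toy = [[1, 1, 1], [2, 2], [3, 3, 3, 3]]
print(pack_samples(toy, max_len=5))
```

In the toy run the first two samples share one sequence and the third starts a new one, so two sequences are produced instead of three padded ones.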

Enhanced Features

Adapter: qlora
Learning Rate: 0.0002 with cosine lr scheduler
Optimizer: adamw_bnb_8bit
Training Enhancements: bf16 training, gradient checkpointing, and flash attention
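
For reference, a cosine schedule decays the learning rate from its peak along half a cosine wave. The sketch below uses the peak value of 0.0002 from the list above. The total step count, the optional linear warmup, and the decay to zero are assumptions, since the card does not state them. `cosine_lr` is a hypothetical helper, not the trainer's own scheduler.

```python
import math


def cosine_lr(step, total_steps, peak_lr=2e-4, warmup_steps=0):
    """Learning rate at `step` under linear warmup plus cosine decay to 0."""
    if warmup_steps and step < warmup_steps:
        # Linear ramp from 0 up to peak_lr (assumed; warmup is not in the card).
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, middle, and end of a hypothetical 1000-step run.
for s in (0, 500, 1000):
    print(s, cosine_lr(s, total_steps=1000))
```

Without warmup, the rate starts at the peak, passes through half the peak at the midpoint, and reaches zero at the final step.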

Download Information

You can download and explore this model through these links on Hugging Face.

Contributions and Feedback

We welcome contributions and feedback from the community. Please feel free to open issues or pull requests on the repository.


Safetensors
Model size: 7B params
Tensor type: F16

Model tree for Nondzu/Mistral-7B-Instruct-v0.2-code-ft
Merges: 8 models
Quantizations: 5 models