Phi-3.5 Mini Instruct (GGUF Quantized)

This repository contains the GGUF quantized version of the Microsoft Phi-3.5 Mini Instruct model. It is optimized for low-resource devices (like mobile phones and older laptops) while maintaining high reasoning capabilities.

Model Creator: Microsoft
Quantized By: Habibur Rahman (Aasif)
Quantization Format: GGUF (Q4_0)

🚀 Usage

You can run this model easily using the llama-cpp-python library.

1. Installation

First, install the required libraries. Note that the default pip wheel is CPU-only; for GPU-accelerated inference, llama-cpp-python must be built with the appropriate backend enabled.

pip install llama-cpp-python huggingface_hub
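For NVIDIA GPUs, the llama-cpp-python documentation describes rebuilding the wheel with the CUDA backend. A sketch of the install command (the GGML_CUDA flag reflects current upstream naming and may differ between versions; older releases used LLAMA_CUBLAS):

```shell
# Build llama-cpp-python with CUDA support enabled.
# Flag name per current llama.cpp docs; check your installed version.
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --no-cache-dir --force-reinstall
pip install huggingface_hub
```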
2. Python Code Example

Here is a simple script to download and run the model:

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the GGUF model
model_path = hf_hub_download(
    repo_id="Habibur2/Phi-3.5-mini-GGUF",
    filename="phi-3.5-mini-q4_0.gguf"
)

# Load the model
# Set n_gpu_layers=-1 for full GPU usage (Requires CUDA)
# Set n_gpu_layers=0 if you only want to use CPU
llm = Llama(
    model_path=model_path,
    n_ctx=2048,        # Context window
    n_threads=4,       # Number of CPU threads
    n_gpu_layers=-1    # Offload all layers to GPU
)

# Run Inference
output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Who is the founder of Microsoft?"}
    ],
    max_tokens=512,
    temperature=0.7
)

print(output['choices'][0]['message']['content'])
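Under the hood, create_chat_completion renders these messages into the Phi-3 prompt template (<|system|>, <|user|>, <|assistant|>, terminated by <|end|>) before tokenizing. If you use the lower-level completion API instead, you need to apply that template yourself. A minimal sketch, where build_phi3_prompt is a hypothetical helper written for illustration, not a library function:

```python
# Illustrative only: approximates what the library's "phi3" chat handler
# produces. build_phi3_prompt is a hypothetical helper, not part of
# llama-cpp-python's API.
def build_phi3_prompt(messages):
    """Render chat messages into the Phi-3 instruct prompt format."""
    parts = [f"<|{m['role']}|>\n{m['content']}<|end|>\n" for m in messages]
    parts.append("<|assistant|>\n")  # cue the model to start its reply
    return "".join(parts)

prompt = build_phi3_prompt([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Who is the founder of Microsoft?"},
])
print(prompt)
```

The rendered string can be passed directly to `llm(prompt, ...)` if you prefer raw completions over the chat API.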

βš™οΈ Model Details Feature,Details Original Model,Phi-3.5 Mini Instruct Parameters,3.8 Billion Quantization,Q4_0 (4-bit) File Size,~2.18 GB Recommended RAM,4 GB+

πŸ‘¨β€πŸ’» About the Author

Quantized and uploaded by Md Habibur Rahman. This model is intended for educational purposes and hackathon projects focusing on Edge AI and SLM (Small Language Models).
